Publications

Publications by Alípio Jorge

2007

A tool for interactive subgroup discovery using distribution rules

Authors
Lucas, JP; Jorge, AM; Pereira, F; Pernas, AM; Machado, AA;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS

Abstract
We describe an approach and a tool for the discovery of subgroups within the framework of distribution rule mining. Distribution rules are a kind of association rules particularly suited for the exploratory study of numerical variables of interest. Being an exploratory technique, the result of a distribution mining process is typically a very large number of patterns. Exploring such results is thus a complex task and limits the use of the technique. To overcome this shortcoming we developed a tool, written in Java, which supports subgroup discovery in a post-processing step. The tool engages the analyst in an interactive process of subgroup discovery by means of a graphical interface with well-defined statistical grounds, where domain knowledge can be used during the identification of such subgroups amid the population. We show a case study that analyzes the results of students in a large-scale university admission examination.

2007

Comparing rule measures for predictive association rules

Authors
Azevedo, PJ; Jorge, AM;

Publication
Machine Learning: ECML 2007, Proceedings

Abstract
We study the predictive ability of some association rule measures typically used to assess descriptive interest. Such measures, namely conviction, lift and χ², are compared with confidence, Laplace, mutual information, cosine, Jaccard and phi-coefficient. As prediction models, we use sets of association rules. Classification is done by selecting the best rule, or by weighted voting. We performed an evaluation on 17 datasets with different characteristics and conclude that conviction is on average the best predictive measure to use in this setting. We also provide some meta-analysis insights for explaining the results.

2003

Visualization and evaluation support of knowledge discovery through the predictive model markup language

Authors
Wettschereck, D; Jorge, A; Moyle, S;

Publication
KNOWLEDGE-BASED INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT 1, PROCEEDINGS

Abstract
The emerging standard for the platform- and system-independent representation of data mining models, PMML (Predictive Model Markup Language), is currently supported by a number of knowledge discovery support engines. The primary purpose of the PMML standard is to separate model generation from model storage in order to enable users to view, post-process, and utilize data mining models independently of the tool that generated the model. In this paper two systems, called VizWiz and PEAR, are described. These software packages allow for the visualization and evaluation of data mining models that are specified in PMML. They can be viewed as decision support systems, since they enable non-expert users of data mining results to interactively inspect and evaluate these results.

2004

Extreme adaptivity

Authors
Alves, MA; Jorge, A; Leal, JP;

Publication
ADAPTIVE HYPERMEDIA AND ADAPTIVE WEB-BASED SYSTEMS, PROCEEDINGS

Abstract
This Doctoral Consortium paper focuses on Extreme Adaptivity, a set of top level requirements for adaptive hypertext systems, which has resulted from one year of examining the adaptive hypertext landscape. The complete specification of a system, KnowledgeAtoms, is also given, mainly as an example of Extreme Adaptivity. Additional methodological elements are discussed.

2006

Visual interactive subgroup discovery with numerical properties of interest

Authors
Jorge, AM; Pereira, F; Azevedo, PJ;

Publication
DISCOVERY SCIENCE, PROCEEDINGS

Abstract
We propose an approach to subgroup discovery using distribution rules (a kind of association rules with a probability distribution on the consequent) for numerical properties of interest. The objective interest of the subgroups is measured through statistical goodness of fit tests. Their subjective interest can be assessed by the data analyst through a visual interactive subgroup browsing procedure.

2006

Improving SVM-linear predictions using CART for example selection

Authors
Moreira, JM; Jorge, AM; Soares, C; de Sousa, JF;

Publication
FOUNDATIONS OF INTELLIGENT SYSTEMS, PROCEEDINGS

Abstract
This paper describes a study on example selection in regression problems using linear mu-SVM (Support Vector Machine) as the prediction algorithm. The motivating case is a study done on real data for a problem of bus trip time prediction. In this study we use three different training sets: all the examples, examples from past days similar to the day for which prediction is needed, and examples selected by a CART regression tree. Then, we verify whether the CART-based example selection approach is appropriate on different regression data sets. The experimental results obtained are promising.
