Publications

Publications by Alípio Jorge

2009

Ensemble Learning: A Study on Different Variants of the Dynamic Selection Approach

Authors
Mendes Moreira, J; Jorge, AM; Soares, C; de Sousa, JF;

Publication
MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION

Abstract
Integration methods for ensemble learning can use two different approaches: combination or selection. The combination approach (also called fusion) consists of combining the predictions obtained by the different models in the ensemble to obtain the final ensemble prediction. The selection approach selects one (or more) models from the ensemble according to the prediction performance of these models on similar data from the validation set. Usually, similar data are selected using k-nearest neighbors with the Euclidean distance. In this paper we discuss other approaches to obtaining similar data for the regression problem. We show that using similarity measures according to the target values improves results. We also show that dynamically selecting several models for the prediction task increases prediction accuracy compared with selecting just one model.
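The baseline selection approach described in this abstract can be sketched as follows. This is only an illustrative sketch of the usual k-nearest-neighbors variant with the Euclidean distance, not the authors' implementation; the function names, the use of mean squared error, and the averaging of the selected models are assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dynamic_selection_predict(models, X_val, y_val, x_new, k=3, n_select=1):
    """Predict x_new with the models that do best on its validation neighbours.

    models   : list of callables mapping a feature vector to a number
    X_val    : validation feature vectors
    y_val    : validation target values
    n_select : how many models to select (1 = pick a single model)
    """
    # Find the k validation examples closest to the new example.
    order = sorted(range(len(X_val)), key=lambda i: euclidean(X_val[i], x_new))
    neighbours = order[:k]

    # Score each model by its mean squared error on those neighbours
    # (the choice of error measure is an assumption of this sketch).
    errors = []
    for model in models:
        mse = sum((model(X_val[i]) - y_val[i]) ** 2 for i in neighbours)
        errors.append(mse / len(neighbours))

    # Keep the n_select models with the lowest local error and average them.
    best = sorted(range(len(models)), key=lambda j: errors[j])[:n_select]
    return sum(models[j](x_new) for j in best) / len(best)
```

With n_select=1 the function picks a single model for each new example; larger values correspond to selecting several models dynamically, as discussed in the abstract.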

2011

Mining Association Rules for Label Ranking

Authors
de Sa, CR; Soares, C; Jorge, AM; Azevedo, P; Costa, J;

Publication
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT II: 15TH PACIFIC-ASIA CONFERENCE, PAKDD 2011

Abstract
Recently, a number of learning algorithms have been adapted for label ranking, including instance-based and tree-based methods. In this paper, we propose an adaptation of association rules for label ranking. The adaptation, which is illustrated in this work with the APRIORI algorithm, essentially consists of using variations of the support and confidence measures based on ranking similarity functions that are suitable for label ranking. We also adapt the method to make a prediction from the possibly conflicting consequents of the rules that apply to an example. Although our adaptation starts from a very simple variant of association rules for classification, the results clearly show that the method makes valid predictions. Additionally, they show that it competes well with state-of-the-art label ranking algorithms.
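One possible reading of similarity-based support and confidence is sketched below. The abstract does not give the exact definitions, so everything here is an assumption made for illustration: rankings are tuples of rank positions without ties, similarity is Spearman's rank correlation cut off at a threshold theta, and each covered example contributes its similarity to the rule's ranking instead of a 0/1 match.

```python
def spearman(r1, r2):
    """Spearman's rank correlation between two rankings without ties."""
    n = len(r1)
    d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * d2 / (n * (n * n - 1))


def similarity(r1, r2, theta=0.0):
    """Ranking similarity: the correlation, or 0 when it falls below theta."""
    rho = spearman(r1, r2)
    return rho if rho >= theta else 0.0


def rule_support_confidence(data, antecedent, ranking, theta=0.0):
    """Similarity-weighted support and confidence of `antecedent -> ranking`.

    data is a list of (items, ranking) pairs, where items is a set of
    descriptors and ranking is a tuple of rank positions.
    """
    # Rankings of the examples whose descriptors contain the antecedent.
    covered = [rk for items, rk in data if antecedent <= items]
    total_sim = sum(similarity(rk, ranking, theta) for rk in covered)
    support = total_sim / len(data)
    confidence = total_sim / len(covered) if covered else 0.0
    return support, confidence
```

Under these assumptions, a rule whose ranking is close to, but not identical with, the rankings of the covered examples still receives partial credit, which a strict equality test would not give.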

2000

Integrating rules and cases in learning via case explanation and paradigm shift

Authors
Lopes, AD; Jorge, A;

Publication
ADVANCES IN ARTIFICIAL INTELLIGENCE

Abstract
In this article we discuss in detail two techniques for rule and case integration. Case-based learning is used when the rule language is exhausted. Initially, all the examples are used to induce a set of rules with satisfactory quality. The examples that are not covered by these rules are then handled as cases. The case-based approach used also combines rules and cases internally. Instead of only storing the cases as provided, it has a learning phase where, for each case, it constructs and stores a set of explanations with support and confidence above given thresholds. These explanations have different levels of generality and the maximally specific one corresponds to the case itself. The same case may have different explanations representing different perspectives of the case. Therefore, to classify a new case, it looks for relevant stored explanations applicable to the new case. The different possible views of the case given by the explanations correspond to considering different sets of conditions/features to analyze the case. In other words, they lead to different ways to compute similarity between known cases/explanations and the new case to be classified (as opposed to the commonly used fixed metric).

2012

Combining usage and content in an online music recommendation system for music in the long-tail

Authors
Domingues, MA; Gouyon, F; Jorge, AM; Leal, JP; Vinagre, J; Lemos, L; Sordo, M;

Publication
WWW'12 - Proceedings of the 21st Annual Conference on World Wide Web Companion

Abstract
In this paper we propose a hybrid music recommender system, which combines usage and content data. We describe an online evaluation experiment performed in real time on a commercial music web site, specialised in content from the very long tail of music content. We compare it against two stand-alone recommenders, the first based on usage data and the second based on content data. The results show that the proposed hybrid recommender has advantages over the usage-based and content-based systems, namely a higher absolute user acceptance rate, a higher user activity rate and higher user loyalty. Copyright is held by the International World Wide Web Conference Committee (IW3C2).

2003

Automatic selection of table areas in documents for information extraction

Authors
Silva, ACE; Jorge, A; Torgo, L;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE

Abstract
The information contained in companies' financial statements is valuable to several users. Much of the relevant information in such documents is contained in tables and is currently mainly extracted by hand. We propose a method that accomplishes a prior step of the task of automatically extracting information from tables in documents: selecting the lines that are likely to belong to tables. Our method has been developed by empirically analyzing a set of Portuguese companies' financial statements using statistical and data mining techniques. Empirical evaluation indicates that more than 99% of table lines are selected after discarding at least 50% of all lines. The method can cope with the complexity of styles used in assembling information on paper and adapt its performance accordingly, thus maximizing its results.

2004

Hierarchical clustering for thematic browsing and summarization of large sets of association rules

Authors
Jorge, A;

Publication
Proceedings of the Fourth SIAM International Conference on Data Mining

Abstract
In this paper we propose a method for grouping and summarizing large sets of association rules according to the items contained in each rule. We use hierarchical clustering to partition the initial rule set into thematically coherent subsets. This enables the summarization of the rule set by adequately choosing a representative rule for each subset, and helps in the interactive exploration of the rule model by the user. We define the requirements of our approach, and formally show the adequacy of the chosen approach to our aims. Rule clusters can also be used to infer novel interest measures for the rules. Such measures are based on the lexicon of the rules and are complementary to measures based on statistical properties, such as confidence, lift and conviction. We show examples of the application of the proposed techniques.
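The idea of grouping rules by the items they contain can be sketched as follows. This is a hypothetical illustration, not the method of the paper: each rule is represented by the non-empty set of all items it mentions, the Jaccard distance and average linkage are assumed choices, and the function names are invented for the example.

```python
def jaccard_distance(a, b):
    """1 minus the share of items two non-empty rules have in common."""
    return 1 - len(a & b) / len(a | b)


def cluster_rules(rules, n_clusters):
    """Agglomerative clustering of rules, each given as a set of its items.

    Returns a list of clusters, each a list of indices into `rules`.
    """
    # Start with every rule in its own cluster.
    clusters = [[i] for i in range(len(rules))]

    def linkage(c1, c2):
        # Average linkage: mean pairwise distance between two clusters.
        total = sum(jaccard_distance(rules[i], rules[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    # Repeatedly merge the two closest clusters.
    while len(clusters) > n_clusters:
        pairs = [
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        ]
        _, a, b = min(pairs)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

A representative rule for each cluster could then be chosen, for example, as the rule with the smallest average distance to the other rules in its cluster; that selection criterion is likewise an assumption.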
