
Publications by LIAAD

2004

Hierarchical clustering for thematic browsing and summarization of large sets of association rules

Authors
Jorge, A;

Publication
Proceedings of the Fourth SIAM International Conference on Data Mining

Abstract
In this paper we propose a method for grouping and summarizing large sets of association rules according to the items contained in each rule. We use hierarchical clustering to partition the initial rule set into thematically coherent subsets. This enables the summarization of the rule set by adequately choosing a representative rule for each subset, and helps in the interactive exploration of the rule model by the user. We define the requirements of our approach, and formally show the adequacy of the chosen approach to our aims. Rule clusters can also be used to infer novel interest measures for the rules. Such measures are based on the lexicon of the rules and are complementary to measures based on statistical properties, such as confidence, lift and conviction. We show examples of the application of the proposed techniques.


2004

Extreme adaptivity

Authors
Alves, MA; Jorge, A; Leal, JP;

Publication
ADAPTIVE HYPERMEDIA AND ADAPTIVE WEB-BASED SYSTEMS, PROCEEDINGS

Abstract
This Doctoral Consortium paper focuses on Extreme Adaptivity, a set of top-level requirements for adaptive hypertext systems that resulted from one year of examining the adaptive hypertext landscape. The complete specification of a system, KnowledgeAtoms, is also given, mainly as an example of Extreme Adaptivity. Additional methodological elements are discussed.


2004

Model-based collaborative filtering for team building support

Authors
Veloso, M; Jorge, A; Azevedo, PJ;

Publication
ICEIS 2004 - Proceedings of the Sixth International Conference on Enterprise Information Systems

Abstract
In this paper we describe an application of recommender systems to team building in a company or organization. The recommender system uses a model-based collaborative filtering approach. Recommender models are sets of association rules extracted from the activity log of employees assigned to projects or tasks. Recommendation is performed at two levels: first, by recommending a single team element given a partially built team; and second, by recommending changes to a completed team. The methodology is applied to a case study with real data. The results are evaluated through experimental tests and a survey of potential users.


2004

A meta-learning method to select the kernel width in Support Vector Regression

Authors
Soares, C; Brazdil, PB; Kuba, P;

Publication
MACHINE LEARNING

Abstract
The Support Vector Machine algorithm is sensitive to the choice of parameter settings. If these are not set correctly, the algorithm may perform poorly. Suggesting a good setting is thus an important problem. We propose a meta-learning methodology for this purpose and exploit information about the past performance of different settings. The methodology is applied to set the width of the Gaussian kernel. We carry out an extensive empirical evaluation, including comparisons with other methods (a fixed default ranking, selection based on cross-validation, and a heuristic method commonly used to set the width of the SVM kernel). We show that our methodology can select settings with low error while providing significant savings in time. Further work should be carried out to see how the methodology could be adapted to different parameter setting tasks.


2004

Using Meta-Learning to Support Data Mining

Authors
Vilalta, R; Carrier, CGG; Brazdil, P; Soares, C;

Publication
IJCSA


2004

Functional trees

Authors
Gama, J;

Publication
MACHINE LEARNING

Abstract
In the context of classification problems, algorithms that generate multivariate trees are able to explore multiple representation languages by using decision tests based on a combination of attributes. In the regression setting, model tree algorithms also explore multiple representation languages, but by using linear models at leaf nodes. In this work we study the effects of using combinations of attributes at decision nodes, leaf nodes, or both nodes and leaves in regression and classification tree learning. In order to study the use of functional nodes at different places and for different types of modeling, we introduce a simple unifying framework for multivariate tree learning. This framework combines a univariate decision tree with a linear function by means of constructive induction. Decision trees derived from the framework are able to use decision nodes with multivariate tests, and leaf nodes that make predictions using linear functions. Multivariate decision nodes are built when growing the tree, while functional leaves are built when pruning the tree. We experimentally evaluate a univariate tree, a multivariate tree using linear combinations at inner and leaf nodes, and two simplified versions restricting linear combinations to inner nodes and leaves, respectively. The experimental evaluation shows that all functional tree variants exhibit similar performance, with advantages in different datasets. In this study there is a marginal advantage of the full model. These results lead us to study the role of functional leaves and nodes. We use the bias-variance decomposition of the error, cluster analysis, and learning curves as tools for analysis. We observe that in the datasets under study, and for both classification and regression, the use of multivariate decision nodes has more impact on the bias component of the error, while the use of multivariate decision leaves has more impact on the variance component.
