Publications

Publications by João Gama

1998

Combining Classifiers by Constructive Induction

Authors
Gama, J;

Publication
Machine Learning: ECML-98, 10th European Conference on Machine Learning, Chemnitz, Germany, April 21-23, 1998, Proceedings

Abstract
Using multiple classifiers for increasing learning accuracy is an active research area. In this paper we present a new general method for merging classifiers. The basic idea of Cascade Generalization is to sequentially run the set of classifiers, at each step performing an extension of the original data set by adding new attributes. The new attributes are derived from the probability class distribution given by a base classifier. This constructive step extends the representational language for the high level classifiers, relaxing their bias. Cascade Generalization produces a single but structured model for the data that combines the model class representation of the base classifiers. We have performed an empirical evaluation of Cascade composition of three well known classifiers: Naive Bayes, Linear Discriminant, and C4.5. Composite models show an increase of performance, sometimes impressive, when compared with the corresponding single models, with significant statistical confidence levels. © Springer-Verlag Berlin Heidelberg 1998.
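The constructive step described in this abstract (appending a base classifier's class probability distribution to each example as new attributes) can be illustrated with a minimal sketch. The base classifier here is a hypothetical nearest-centroid stand-in with an inverse-distance weighting, chosen only to keep the example short; it is not one of the classifiers evaluated in the paper, and all function names are assumptions.

```python
from math import dist

def fit_centroids(X, y):
    """Stand-in base classifier (an assumption): one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    # Classes are kept in sorted order so the new attributes have a fixed order.
    return {c: [s / counts[c] for s in sums[c]] for c in sorted(sums)}

def class_distribution(centroids, row):
    """Probability-like distribution over classes from inverse distances."""
    weights = [1.0 / (1e-9 + dist(row, c)) for c in centroids.values()]
    total = sum(weights)
    return [w / total for w in weights]

def cascade_extend(X, centroids):
    """Constructive step: original attributes plus one new attribute per class."""
    return [list(row) + class_distribution(centroids, row) for row in X]

X = [[0.0, 0.0], [0.0, 1.0], [4.0, 4.0], [4.0, 5.0]]
y = ["a", "a", "b", "b"]
centroids = fit_centroids(X, y)
X_ext = cascade_extend(X, centroids)  # each row now has 2 + 2 attributes
```

A higher-level classifier would then be trained on `X_ext` instead of `X`, so that it can test on the base classifier's outputs as well as on the original attributes.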

1997

Oblique linear tree

Authors
Gama, J;

Publication
ADVANCES IN INTELLIGENT DATA ANALYSIS: REASONING ABOUT DATA

Abstract
In this paper we present the system Ltree for propositional supervised learning. Ltree is able to define decision surfaces both orthogonal and oblique to the axes defined by the attributes of the input space. This is done by combining a decision tree with a linear discriminant by means of constructive induction. At each decision node Ltree defines a new instance space by insertion of new attributes that are projections of the examples that fall at this node over the hyper-planes given by a linear discriminant function. This new instance space is propagated down through the tree. Tests based on those new attributes are oblique with respect to the original input space. Ltree is a probabilistic tree in the sense that it outputs a class probability distribution for each query example. The class probability distribution is computed at learning time, taking into account the different class distributions on the path from the root to the actual node. We have carried out experiments on sixteen benchmark datasets and compared our system with other well known decision-tree systems (orthogonal and oblique) like C4.5, OC1 and LMDT. On these datasets we have observed that our system has advantages in accuracy and tree size at statistically significant confidence levels.
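As a rough illustration of why a test on a constructed attribute is oblique in the original input space, the sketch below appends one linear projection to each example and then applies an ordinary threshold test on it. The weights and bias are hypothetical fixed values standing in for a fitted linear discriminant, and `add_projection` is an assumed helper name, not part of Ltree.

```python
def add_projection(X, weights, bias):
    """Append the value w . x + b to each example as a new attribute."""
    return [
        list(row) + [sum(w * v for w, v in zip(weights, row)) + bias]
        for row in X
    ]

X = [[1.0, 3.0], [3.0, 1.0], [2.0, 2.5]]

# Hypothetical discriminant: x1 - x2 = 0 (a diagonal line in the input space).
X_new = add_projection(X, weights=[1.0, -1.0], bias=0.0)

# A univariate test on the new attribute is an oblique test on (x1, x2).
goes_left = [row[-1] <= 0.0 for row in X_new]
```

The threshold test looks axis-parallel in the extended space, yet it splits the original two attributes along the line `x1 = x2`.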

2001

Functional trees for classification

Authors
Gama, J;

Publication
2001 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS

Abstract
The design of algorithms that explore multiple representation languages and explore different search spaces has an intuitive appeal. In the context of classification problems, algorithms that generate multivariate trees are able to explore multiple representation languages by using decision tests based on a combination of attributes. The same applies to model tree algorithms, in regression domains, but using linear models at leaf nodes. In this paper we study where to use combinations of attributes in decision tree learning. We present an algorithm for multivariate tree learning that combines a univariate decision tree with a discriminant function by means of constructive induction. This algorithm is able to use decision nodes with multivariate tests, and leaf nodes that predict a class using a discriminant function. Multivariate decision nodes are built when growing the tree, while functional leaves are built when pruning the tree. Functional trees can be seen as a generalization of multivariate trees. Our algorithm was compared against its components and two simplified versions using 30 benchmark datasets. The experimental evaluation shows that our algorithm has clear advantages with respect to the generalization ability and model sizes at statistically significant confidence levels.

2010

Resource Aware Distributed Knowledge Discovery

Authors
Gama, J; Cornuéjols, A;

Publication
Ubiquitous Knowledge Discovery - Challenges, Techniques, Applications

Abstract
In the introduction it was argued that ubiquitous knowledge discovery systems have to be able to sense their environment and receive data from other devices, to adapt continuously to changing environmental conditions (including their own condition) and evolving user habits and needs, and to be capable of predictive self-diagnosis. In the last chapter, resource constraints arising from ubiquitous environments have been discussed in some detail. It has been argued that algorithms have to be resource-aware because of real-time constraints and of limited computing and battery power as well as communication resources. © 2010 Springer-Verlag.

1995

Characterization of Classification Algorithms

Authors
Gama, J; Brazdil, P;

Publication
Progress in Artificial Intelligence, 7th Portuguese Conference on Artificial Intelligence, EPIA '95, Funchal, Madeira Island, Portugal, October 3-6, 1995, Proceedings

Abstract
This paper is concerned with the problem of characterization of classification algorithms. The aim is to determine under what circumstances a particular classification algorithm is applicable. The method used involves generation of different kinds of models. These include regression and rule models, piecewise linear models (model trees) and instance based models. These are generated automatically on the basis of dataset characteristics and given test results. The lack of data is compensated for by various types of preprocessing. The models obtained are characterized by quantifying their predictive capability and the best models are identified. © Springer-Verlag Berlin Heidelberg 1995.

1997

Search-based class discretization

Authors
Torgo, L; Gama, J;

Publication
MACHINE LEARNING : ECML-97

Abstract
We present a methodology that enables the use of classification algorithms on regression tasks. We implement this method in the system RECLA, which transforms a regression problem into a classification one and then uses an existing classification system to solve this new problem. The transformation consists of mapping a continuous variable into an ordinal variable by grouping its values into an appropriate set of intervals. We use misclassification costs as a means to reflect the implicit ordering among the ordinal values of the new variable. We describe a set of alternative discretization methods and, based on our experimental results, justify the need for a search-based approach to choose the best method. Our experimental results confirm the validity of our search-based approach to class discretization, and reveal the accuracy benefits of adding misclassification costs.
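The transformation outlined in this abstract can be sketched in a few lines: a continuous target is mapped to interval indices, and a cost matrix penalises predictions that land further from the true interval. Equal-width binning and the `|i - j|` cost are assumptions made for illustration only; the abstract does not say which discretization methods or cost definitions RECLA uses, and the function names are hypothetical.

```python
def equal_width_edges(values, k):
    """Interior cut points that split [min, max] into k equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)]

def to_class(value, edges):
    """Index of the interval containing value (0 .. len(edges))."""
    return sum(value > e for e in edges)

def ordinal_costs(k):
    """Cost of predicting interval j when the true interval is i."""
    return [[abs(i - j) for j in range(k)] for i in range(k)]

targets = [1.0, 2.0, 3.0, 10.0, 11.0, 20.0]
edges = equal_width_edges(targets, 3)
labels = [to_class(t, edges) for t in targets]  # ordinal class per example
costs = ordinal_costs(3)
```

A cost-sensitive classifier trained on `labels` with `costs` would treat confusing the lowest and highest intervals as worse than confusing neighbouring ones, which is one way to reflect the ordering among the new classes.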
