Publications - INESC TEC

Publications

Publications by LIAAD

2011

New Results on Minimum Error Entropy Decision Trees

Authors
Marques de Sa, JPM; Sebastiao, R; Gama, J; Fontes, T

Publication
PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS

Abstract
We present new results on the performance of Minimum Error Entropy (MEE) decision trees, which use a novel node split criterion. The results were obtained in a comparative study with popular alternative algorithms on 42 real-world datasets. Careful validation and statistical methods were used. The evidence gathered from this body of results shows that the error performance of MEE trees compares well with that of alternative algorithms. An important aspect to emphasize is that MEE trees generalize better on average without sacrificing error performance.

2011

Correcting streaming predictions of an electricity load forecast system using a prediction reliability estimate

Authors
Bosnic, Z; Rodrigues, PP; Kononenko, I; Gama, J

Publication
Advances in Intelligent and Soft Computing

Abstract
Accurately predicting values for dynamic data streams is a challenging task in decision and expert systems, due to high data flow rates, limited storage and the requirement to quickly adapt a model to new data. We propose an approach for correcting predictions for data streams which is based on a reliability estimate for individual regression predictions. In our work, we implement the proposed technique and test it on a real-world problem: prediction of the electricity load for a selected European geographical region. For predicting the electricity load values we implement two regression models: a neural network and the k-nearest neighbors algorithm. The results show that our method performs better than the reference method (i.e., the Kalman filter), significantly improving the accuracy of the original streaming predictions. © 2011 Springer-Verlag Berlin Heidelberg.

2011

Constrained Sequential Pattern Knowledge in Multi-relational Learning

Authors
Ferreira, CA; Gama, J; Costa, VS

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE

Abstract
In this work we present XMuSer, a multi-relational framework suited to exploring temporal patterns in multi-relational databases. XMuSer's main idea consists of exploiting frequent sequence mining, using an efficient and direct method to learn temporal patterns in the form of sequences. Grounded on a coding methodology and on the efficiency of sequential miners, we find the most interesting sequential patterns available and then map these findings into a new table, which encodes the multi-relational timed data using sequential patterns. In the last step of our framework, we use an ILP algorithm to learn a theory on the enlarged relational database, which consists of the original multi-relational database and the new sequence relation. We evaluate our framework by addressing three classification problems. Moreover, we map each of three different types of sequential patterns: frequent sequences, closed sequences and maximal sequences.

2011

Adaptive windowing for online learning from multiple inter-related data streams

Authors
Ikonomovska, E; Driessens, K; Dzeroski, S; Gama, J

Publication
Proceedings - IEEE International Conference on Data Mining, ICDM

Abstract
Relational reinforcement learning is a promising branch of reinforcement learning research that deals with structured environments. In these environments, states and actions are differentiated by the presence of certain types of objects, the relations between them, and the objects that are involved in the actions. This makes it well suited for tasks that require the manipulation of multiple, interacting objects, such as tasks that a future household robot can be expected to perform, like clearing a dinner table or putting away washed dishes. However, the application of relational reinforcement learning to robotics has been hindered by assumptions such as discrete and atomic state observations. Typical robotic observation systems work in a streaming setup, where objects are discovered and recognized and their placement within their surroundings is determined in a quasi-continuous manner instead of a state-based one. The resulting information stream can be compared to a set of multiple inter-related data streams. In this paper, we propose an adaptive windowing strategy for generating a stream of learning examples and enabling relational learning from this kind of data. Our approach is independent of the learning algorithm and is based on a gradient search over the space of parameter values, i.e., window sizes, guided by the estimation of the testing error. The proposed algorithm runs online and is data-driven and flexible. To the best of our knowledge, this is the first work addressing this problem. Our ideas are empirically supported by an extensive experimental evaluation in a controlled setup using artificial data. © 2011 IEEE.
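
The idea of searching over window sizes guided by an estimate of the testing error can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a single integer window size, a discrete neighbour-by-neighbour descent in place of the paper's gradient search, and a hypothetical error estimator (`toy_error`) standing in for whatever error estimate a real system would compute on recent examples.

```python
from typing import Callable


def adapt_window_size(
    estimate_error: Callable[[int], float],
    start: int,
    step: int = 1,
    min_size: int = 1,
    max_size: int = 100,
    max_iters: int = 200,
) -> int:
    """Discrete descent over window sizes, guided by an error estimate.

    `estimate_error` is a hypothetical callback returning the estimated
    testing error obtained with a given window size.
    """
    size = max(min_size, min(max_size, start))
    for _ in range(max_iters):
        current = estimate_error(size)
        # Candidate moves: one step smaller or larger, kept inside the bounds.
        neighbours = [
            s for s in (size - step, size + step) if min_size <= s <= max_size
        ]
        if not neighbours:
            break
        best = min(neighbours, key=estimate_error)
        if estimate_error(best) >= current:
            break  # no neighbour lowers the estimate: local minimum reached
        size = best
    return size


def toy_error(window_size: int) -> float:
    # Hypothetical convex error curve with its minimum at window size 12.
    return (window_size - 12) ** 2 / 100.0 + 0.05


best_size = adapt_window_size(toy_error, start=3)
print(best_size)
```

In a streaming setting the estimator would be re-evaluated as new data arrives, so the selected window size could drift over time; the sketch only shows a single search pass.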

2011

Advances in data stream mining for mobile and ubiquitous environments

Authors
Krishnaswamy, S; Gama, J; Gaber, MM

Publication
International Conference on Information and Knowledge Management, Proceedings

Abstract
The tutorial presents the state-of-the-art in mobile and ubiquitous data stream mining and discusses open research problems, issues, and challenges in this area. © 2011 Authors.

2011

Incremental multi-target model trees for data streams

Authors
Ikonomovska, E; Gama, J; Dzeroski, S

Publication
Proceedings of the ACM Symposium on Applied Computing

Abstract
As in batch learning, one may identify a class of streaming real-world problems which require the modeling of several targets simultaneously. Due to the dependencies among the targets, simultaneous modeling can be more successful and informative than creating independent models for each target. As a result, one may obtain a smaller model able to simultaneously explain the relations between the input attributes and the targets. This problem has not been addressed previously in the streaming setting. We propose an algorithm for inducing multi-target model trees with low computational complexity, based on the principles of predictive clustering trees and probability bounds for supporting splitting decisions. Linear models are computed for each target separately, by incremental training of perceptrons in the leaves of the tree. Experiments are performed on synthetic and real-world datasets. The multi-target regression tree algorithm produces equally accurate and smaller models for simultaneous prediction of all the target attributes, as compared to a set of independent regression trees built separately for each target attribute. When the regression surface is smooth, the linear models computed in the leaves significantly improve the accuracy for all of the targets. © 2011 ACM.
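
Two ingredients named in the abstract, a probability bound that supports splitting decisions and per-target linear models trained incrementally in a leaf, can be sketched as follows. This is an illustration under stated assumptions, not the published algorithm: the abstract does not say which bound is used, so the Hoeffding bound is assumed here; the split rule (best merit must beat the runner-up by more than the bound) and the names `should_split` and `MultiTargetLeaf` are hypothetical; and the perceptrons are taken to be linear units trained with the delta rule.

```python
import math
from typing import List, Sequence


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: with probability 1 - delta, the mean of n observations
    of a variable with the given range lies within this distance of the true mean."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_merit: float, second_merit: float, n: int,
                 delta: float = 0.05, value_range: float = 1.0) -> bool:
    # Assumed rule: split once the best candidate beats the runner-up
    # by more than the bound (higher merit is better).
    return best_merit - second_merit > hoeffding_bound(value_range, delta, n)


class MultiTargetLeaf:
    """One linear model per target, each updated incrementally (delta rule)."""

    def __init__(self, n_inputs: int, n_targets: int, learning_rate: float = 0.05):
        # One weight vector per target; the last weight is the bias term.
        self.weights: List[List[float]] = [
            [0.0] * (n_inputs + 1) for _ in range(n_targets)
        ]
        self.learning_rate = learning_rate

    def predict(self, x: Sequence[float]) -> List[float]:
        xb = list(x) + [1.0]
        return [sum(w_i * x_i for w_i, x_i in zip(w, xb)) for w in self.weights]

    def update(self, x: Sequence[float], targets: Sequence[float]) -> None:
        xb = list(x) + [1.0]
        for w, y_hat, y in zip(self.weights, self.predict(x), targets):
            error = y - y_hat
            for i, x_i in enumerate(xb):
                w[i] += self.learning_rate * error * x_i


# Toy stream with two targets: y1 = 2x and y2 = 1 - x (made-up data).
leaf = MultiTargetLeaf(n_inputs=1, n_targets=2)
for _ in range(500):
    for k in range(11):
        x = k / 10.0
        leaf.update([x], [2.0 * x, 1.0 - x])
```

After training on the toy stream, `leaf.predict([0.5])` should be close to the true values of both targets. Each example is processed once and then discarded, which is what keeps the per-example cost low in a streaming setting.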
