Publications - INESC TEC

Publications

Publications by LIAAD

2012

Where are we going? Predicting the evolution of individuals

Authors
Siddiqui, ZF; Oliveira, M; Gama, J; Spiliopoulou, M;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
When searching for patterns on data streams, we come across perennial (dynamic) objects that evolve over time. These objects are encountered repeatedly, each time with a different definition and different values. Examples are (a) companies registered at a stock exchange that report their progress at the end of each year, and (b) students whose performance is evaluated at the end of each semester. On such data, domain experts also pose questions on how the individual objects will evolve: would it be beneficial to invest in a given company, given both the company's individual performance thus far and the drift experienced in the model? Or, how will a given student perform next year, given the performance variations observed thus far? While there is much research on how models evolve or change over time [Ntoutsi et al., 2011a], little has been done to predict the change of individual objects when the states are not known a priori. In this work, we propose a framework that learns the clusters to which the objects belong at each moment, uses them as ad hoc states in a state-transition graph, and then learns a mixture model of Markov chains, which predicts the next most likely state/cluster per object. We report on our evaluation on synthetic and real datasets. © Springer-Verlag Berlin Heidelberg 2012.
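
The idea of treating cluster labels as states and predicting the next one can be illustrated with a minimal sketch. It is a simplification and not the authors' method: it fits a single first-order Markov chain by counting transitions, whereas the abstract describes a mixture model of Markov chains. The cluster labels and the function names `fit_transitions` and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(trajectories):
    """Count cluster-to-cluster transitions over all object trajectories."""
    counts = defaultdict(Counter)
    for states in trajectories:
        # Pair each state with its successor in the same trajectory.
        for current, nxt in zip(states, states[1:]):
            counts[current][nxt] += 1
    return counts


def predict_next(counts, current):
    """Return the most frequent successor of `current`, or None if unseen."""
    if current not in counts or not counts[current]:
        return None
    return counts[current].most_common(1)[0][0]


# Hypothetical data: one list of cluster labels per object, one label per time point.
trajectories = [
    ["low", "mid", "high"],
    ["low", "mid", "mid"],
    ["mid", "high", "high"],
]
counts = fit_transitions(trajectories)
print(predict_next(counts, "low"))  # mid
print(predict_next(counts, "mid"))  # high
```

In this sketch the states are given; in the framework described above they would first be learned by clustering the objects at each time point.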

2012

Handling time changing data with adaptive very fast decision rules

Authors
Kosina, P; Gama, J;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
Data streams are usually characterized by changes in the underlying distribution generating data. Therefore, algorithms designed to work with data streams should be able to detect changes and quickly adapt the decision model. Rules are one of the most interpretable and flexible models for data mining prediction tasks. In this paper we present the Adaptive Very Fast Decision Rules (AVFDR), an on-line, any-time and one-pass algorithm for learning decision rules in the context of time changing data. AVFDR can learn ordered and unordered rule sets. It is able to adapt the decision model via incremental induction and specialization of rules. Detecting local drifts takes advantage of the modularity of rule sets. In AVFDR, each individual rule monitors the evolution of performance metrics to detect concept drift. AVFDR prunes rules that detect drift. This explicit change detection mechanism provides useful information about the dynamics of the process generating data, enables faster adaptation to changes, and yields compact rule sets. The experimental evaluation shows that, in comparison to alternative methods, this method is able to learn fast and compact rule sets from evolving streams. © 2012 Springer-Verlag.
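
The per-rule monitoring and pruning described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the AVFDR implementation: the abstract does not name the drift test, so the sketch assumes an error-rate test in the style of the DDM detector (drift when the error rate plus its standard deviation exceeds the best recorded level by three standard deviations). Rule induction and specialization are omitted, and `MonitoredRule` and `learn_one` are hypothetical names.

```python
import math


class MonitoredRule:
    """A fixed rule that tracks its own error rate on the examples it covers."""

    def __init__(self, condition, prediction, min_instances=30):
        self.condition = condition      # callable: example -> bool
        self.prediction = prediction    # class predicted when the rule fires
        self.min_instances = min_instances
        self.n = 0
        self.errors = 0
        self.p_min = math.inf           # error rate at the best point so far
        self.s_min = math.inf           # its standard deviation

    def covers(self, x):
        return self.condition(x)

    def update(self, y_true):
        """Record one covered example; return True if drift is signalled."""
        self.n += 1
        self.errors += int(y_true != self.prediction)
        p = self.errors / self.n
        s = math.sqrt(p * (1 - p) / self.n)
        if self.n < self.min_instances:
            return False                # too few examples to judge
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s
        # Assumed DDM-style threshold: three standard deviations above the best level.
        return p + s > self.p_min + 3 * self.s_min


def learn_one(rules, x, y):
    """Update every rule covering x and drop the rules that signal drift."""
    return [r for r in rules if not (r.covers(x) and r.update(y))]
```

Because each rule keeps its own statistics, a change that affects only part of the input space removes only the rules covering that region, which is the modularity argument made in the abstract.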

2012

Vehicular sensing: Emergence of a massive urban scanner

Authors
Ferreira, M; Fernandes, R; Conceicao, H; Gomes, P; D'Orey, PM; Moreira Matias, L; Gama, J; Lima, F; Damas, L;

Publication
Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering

Abstract
Vehicular sensing is emerging as a powerful means to collect information using the variety of sensors that equip modern vehicles. These sensors range from simple speedometers to complex video capturing systems capable of performing image recognition. The advent of connected vehicles makes such information accessible in near real time and creates a sensing network with a massive reach, amplified by the inherent mobility of vehicles. In this paper we discuss several applications that rely on vehicular sensing, using sensors such as the GPS receiver, windshield cameras, or specific sensors in special vehicles, such as a taximeter in taxi cabs. We further discuss connectivity issues related to the mobility and limited wireless range of an infrastructure-less network based only on vehicular nodes. © 2012 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering.

2012

A weightless neural network-based approach for stream data clustering

Authors
Cardoso, D; De Gregorio, M; Lima, P; Gama, J; Franca, F;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
One of the major data mining tasks is to cluster similar data, which is useful because it provides a means of summarizing large amounts of raw data into handy information. Clustering data streams is particularly challenging because of the constraints imposed when dealing with this kind of input. Here we report our work, in which we investigated the use of WiSARD discriminators as primary data synthesizing units. We present an analysis of StreamWiSARD, a new sliding-window stream data clustering system, the benefits and drawbacks of its use, and a comparison to other approaches. © 2012 Springer-Verlag.
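
For readers unfamiliar with WiSARD discriminators, the following is a minimal sketch of a generic one: a binary input is split into random tuples of bits, each tuple addresses one RAM node, training stores the addresses seen, and the response is the fraction of RAM nodes that recognize an input. It does not reproduce StreamWiSARD, whose sliding-window and clustering logic the abstract does not detail; the class name and parameters are illustrative assumptions.

```python
import random


class Discriminator:
    """A generic WiSARD-style discriminator over fixed-length binary inputs."""

    def __init__(self, input_bits, tuple_size, seed=0):
        if input_bits % tuple_size != 0:
            raise ValueError("input_bits must be a multiple of tuple_size")
        order = list(range(input_bits))
        random.Random(seed).shuffle(order)   # random input-to-RAM mapping
        self.tuples = [order[i:i + tuple_size]
                       for i in range(0, input_bits, tuple_size)]
        self.rams = [set() for _ in self.tuples]

    def _addresses(self, bits):
        # One address per RAM node, built from the bits mapped to that node.
        return [tuple(bits[i] for i in idx) for idx in self.tuples]

    def train(self, bits):
        """Store the address produced by `bits` in every RAM node."""
        for ram, addr in zip(self.rams, self._addresses(bits)):
            ram.add(addr)

    def response(self, bits):
        """Fraction of RAM nodes that have already seen their address."""
        hits = sum(addr in ram
                   for ram, addr in zip(self.rams, self._addresses(bits)))
        return hits / len(self.rams)


d = Discriminator(input_bits=8, tuple_size=2)
pattern = [1, 0, 1, 1, 0, 0, 1, 0]
d.train(pattern)
print(d.response(pattern))  # 1.0
```

In a clustering setting, one such discriminator could summarize one cluster, with a new observation assigned to the discriminator that responds most strongly; that assignment policy is an assumption here rather than something stated in the abstract.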

2012

Predictive sequence miner in ILP learning

Authors
Ferreira, CA; Gama, J; Santos Costa, V;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
This work presents an optimized version of XMuSer, an ILP-based framework suitable for exploring temporal patterns available in multi-relational databases. XMuSer's main idea consists of exploiting frequent sequence mining, an efficient method to learn temporal patterns in the form of sequences. The efficiency of the XMuSer framework is grounded on a new coding methodology for temporal data and on the use of a predictive sequence miner. The framework selects and maps the most interesting sequential patterns into a new table, the sequence relation. In the last step of our framework, we use an ILP algorithm to learn a classification theory on the enlarged relational database that consists of the original multi-relational database and the new sequence relation. We evaluate our framework by addressing three classification problems and mapping each of three different types of sequential patterns: frequent, closed, or maximal. The experiments show that our ILP-based framework gains both from the descriptive power of the ILP algorithms and the efficiency of the sequential miners. © 2012 Springer-Verlag Berlin Heidelberg.

2012

Editorial message: Special track on data streams

Authors
Rodrigues, PP; Bifet, A; Krishnaswamy, S; Gama, J;

Publication
Proceedings of the ACM Symposium on Applied Computing

Abstract