Publications

Publications by João Gama

2011

Constrained Sequential Pattern Knowledge in Multi-relational Learning

Authors
Ferreira, CA; Gama, J; Costa, VS;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE

Abstract
In this work we present XMuSer, a multi-relational framework suitable for exploring temporal patterns available in multi-relational databases. XMuSer's main idea consists of exploiting frequent sequence mining, using an efficient and direct method to learn temporal patterns in the form of sequences. Grounded on a coding methodology and on the efficiency of sequential miners, we find the most interesting sequential patterns available and then map these findings into a new table, which encodes the multi-relational timed data using sequential patterns. In the last step of our framework, we use an ILP algorithm to learn a theory on the enlarged relational database that consists of the original multi-relational database and the new sequence relation. We evaluate our framework by addressing three classification problems. Moreover, we map each one of three different types of sequential patterns: frequent sequences, closed sequences or maximal sequences.

2004

Forest trees for on-line data

Authors
Gama, J; Medas, P; Rocha, R;

Publication
Proceedings of the ACM Symposium on Applied Computing

Abstract
This paper presents a hybrid adaptive system for the induction of a forest of trees from data streams. The Ultra Fast Forest Tree system (UFFT) is an incremental algorithm, with constant time for processing each example, works online, and uses the Hoeffding bound to decide when to install a splitting test in a leaf leading to a decision node. Our system has been designed for continuous data. It uses analytical techniques to choose the splitting criteria, and the information gain to estimate the merit of each possible splitting-test. The number of examples required to evaluate the splitting criteria is sound, based on the Hoeffding bound. For multiclass problems, the algorithm builds a binary tree for each possible pair of classes, leading to a forest of trees. During the training phase the algorithm maintains a short term memory. Given a data stream, a fixed number of the most recent examples are maintained in a data-structure that supports constant time insertion and deletion. When a test is installed, a leaf is transformed into a decision node with two descendant leaves. The sufficient statistics of these leaves are initialized with the examples in the short term memory that will fall at these leaves. We study the behavior of UFFT in different problems. The experimental results show that UFFT is competitive against a batch decision tree learner in large and medium datasets.

2012

Next challenges for adaptive learning systems

Authors
Zliobaite, I; Bifet, A; Gaber, MM; Gabrys, B; Gama, J; Minku, LL; Musial, K;

Publication
SIGKDD Explorations

2010

Knowledge discovery from sensor data (SensorKDD)

Authors
Chandola, V; Omitaomu, OA; Ganguly, AR; Vatsavai, RR; Chawla, NV; Gama, J; Gaber, MM;

Publication
SIGKDD Explorations

2008

Knowledge discovery from sensor data (SensorKDD)

Authors
Vatsavai, RR; Omitaomu, OA; Gama, J; Chawla, NV; Gaber, MM; Ganguly, AR;

Publication
SIGKDD Explorations

2011

Adaptive windowing for online learning from multiple inter-related data streams

Authors
Ikonomovska, E; Driessens, K; Dzeroski, S; Gama, J;

Publication
Proceedings - IEEE International Conference on Data Mining, ICDM

Abstract
Relational reinforcement learning is a promising branch of reinforcement learning research that deals with structured environments. In these environments, states and actions are differentiated by the presence of certain types of objects and the relations between them and the objects that are involved in the actions. This makes it ultimately suited for tasks that require the manipulation of multiple, interacting objects, such as tasks that a future house-holding robot can be expected to perform like cleaning up a dinner table or storing away done dishes. However, the application of relational reinforcement learning to robotics has been hindered by assumptions such as discrete and atomic state observations. Typical robotic observation systems work in a streaming setup, where objects are discovered and recognized and their placement within their surroundings is determined in a quasi-continuous manner instead of a state-based one. The resulting information stream can be compared to a set of multiple inter-related data streams. In this paper, we propose an adaptive windowing strategy for generating a stream of learning examples and enabling relational learning from this kind of data. Our approach is independent of the learning algorithm and is based on a gradient search over the space of parameter values, i.e., window sizes, guided by the estimation of the testing error. The proposed algorithm performs online and is data driven and flexible. To the best of our knowledge, this is the first work addressing this problem. Our ideas are empirically supported by an extensive experimental evaluation in a controlled setup using artificial data. © 2011 IEEE.
