Publications

Publications by LIAAD

2007

OLINDDA: A cluster-based approach for detecting novelty and concept drift in data streams

Authors
Spinosa, EJ; de Carvalho, APDF; Gama, J;

Publication
APPLIED COMPUTING 2007, VOL 1 AND 2

Abstract
A machine learning approach that is capable of handling data streams presents new challenges and enables the analysis of a variety of real problems in which concepts change over time. In this scenario, the ability to identify novel concepts and the ability to deal with concept drift are two important attributes. This paper presents a technique based on the k-means clustering algorithm that addresses both situations in a single learning strategy. Experimental results obtained with data from various domains provide insight into how clustering algorithms can be used to discover new concepts in streams of data.
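
As a rough illustration of the general idea of cluster-based novelty detection (not the OLINDDA procedure itself, whose details the abstract does not give), the sketch below fits k-means to "normal" data, wraps each cluster in a hypersphere, and flags stream examples that fall outside every hypersphere as unknown. The function names, the max-distance radius, and the deterministic farthest-point initialisation are all assumptions made for this example.

```python
import math

def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))

def init_centroids(points, k):
    """Farthest-point initialisation (an assumption, chosen for reproducibility)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means over a list of equal-length tuples."""
    centroids = init_centroids(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids

def fit_normal_model(points, k):
    """Return (centroid, radius) pairs; radius is the farthest member's distance."""
    centroids = kmeans(points, k)
    model = []
    for i, c in enumerate(centroids):
        members = [p for p in points if nearest(p, centroids) == i]
        radius = max((math.dist(p, c) for p in members), default=0.0)
        model.append((c, radius))
    return model

def is_unknown(x, model):
    """True if x lies outside every hypersphere of the normal model."""
    return all(math.dist(x, c) > r for c, r in model)

normal = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
model = fit_normal_model(normal, k=2)
print(is_unknown((0.5, 0.5), model), is_unknown((5.0, 5.0), model))
```

In a streaming setting, the examples flagged as unknown could be buffered and clustered again to look for candidate new concepts; how such clusters are validated and labelled as novelty or drift is left open here.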

2007

Learning from data streams: Processing techniques in sensor networks

Authors
Gama, J; Gaber, MM;

Publication
Learning from Data Streams: Processing Techniques in Sensor Networks

Abstract
Sensor networks consist of distributed autonomous devices that cooperatively monitor an environment. Sensors are equipped with capacities to store information in memory, process this information and communicate with their neighbors. Processing data streams generated from wireless sensor networks has raised new research challenges over the last few years due to the huge numbers of data streams to be managed continuously and at a very high rate. The book provides the reader with a comprehensive overview of stream data processing, including well-known prototype implementations such as the Nile system and the TinyOS operating system. The chapters cover the state of the art in data stream mining approaches using clustering, predictive learning, and tensor analysis techniques, and apply them to problems in security, the natural sciences, and education. This research monograph delivers to researchers and graduate students the state of the art in data stream processing in sensor networks. The extensive bibliography offers an excellent starting point for further reading and future research. © Springer-Verlag Berlin Heidelberg 2007. All rights are reserved.

2007

Pursuing the best ECOC dimension for multiclass problems

Authors
Pimenta, E; Gama, J; Carvalho, A;

Publication
Proceedings of the Twentieth International Florida Artificial Intelligence Research Society Conference, FLAIRS 2007

Abstract
Recent work highlights advantages in decomposing multiclass decision problems into multiple binary problems. Several strategies have been proposed for this decomposition. The most frequently investigated are All-vs-All, One-vs-All and Error-Correcting Output Codes (ECOC). ECOC are binary words (codewords) that can be adapted for use in classification problems. They must, however, comply with some specific constraints. For a given number of classes, the codewords can have several possible dimensions. These dimensions grow exponentially with the number of classes of the multiclass problem. This paper proposes two methods for choosing the dimension of an ECOC that ensure a good trade-off between redundancy and error-correction capacity. The methods are evaluated on a set of benchmark classification problems. Experimental results show that they are competitive with conventional multiclass decomposition methods.
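
To make the codeword idea concrete, here is a minimal sketch of ECOC decoding by Hamming distance. It uses the standard exhaustive code for four classes, whose length 2^(k-1) - 1 illustrates the exponential growth mentioned above; this construction comes from the general ECOC literature and is not one of the two dimension-selection methods the paper proposes. The function names and the example bit vector are assumptions made for illustration.

```python
def exhaustive_code_length(num_classes):
    """Length of the exhaustive ECOC for k classes: 2^(k-1) - 1 columns."""
    return 2 ** (num_classes - 1) - 1

# Exhaustive code for 4 classes: one 7-bit codeword per class.
# Each column defines one binary problem (classes with 1 vs classes with 0).
CODEWORDS = [
    (1, 1, 1, 1, 1, 1, 1),
    (0, 0, 0, 0, 1, 1, 1),
    (0, 0, 1, 1, 0, 0, 1),
    (0, 1, 0, 1, 0, 1, 0),
]

def hamming(a, b):
    """Number of positions in which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def min_distance(codewords):
    """Smallest pairwise Hamming distance between codewords."""
    return min(
        hamming(a, b)
        for i, a in enumerate(codewords)
        for b in codewords[i + 1:]
    )

def decode(bits, codewords):
    """Predict the class whose codeword is closest to the binary outputs."""
    return min(range(len(codewords)), key=lambda i: hamming(bits, codewords[i]))

# Outputs of the 7 binary classifiers, with one bit wrong for class index 2.
noisy = (0, 0, 1, 1, 0, 1, 1)
print(exhaustive_code_length(4), min_distance(CODEWORDS), decode(noisy, CODEWORDS))
```

With a minimum distance of d between codewords, up to floor((d - 1) / 2) wrong binary outputs can be corrected, which is the redundancy versus error-correction trade-off that choosing a code dimension has to balance.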

2007

OLINDDA

Authors
Spinosa, EJ; de Leon F. de Carvalho, AP; Gama, J;

Publication
Proceedings of the 2007 ACM symposium on Applied computing - SAC '07

Abstract

2007

Knowledge discovery from data streams

Authors
Gama, J; Aguilar Ruiz, J;

Publication
INTELLIGENT DATA ANALYSIS

Abstract

2007

Change detection in learning histograms from data streams

Authors
Sebastiao, R; Gama, J;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS

Abstract
In this paper we study the problem of constructing histograms from high-speed time-changing data streams. Learning in this context requires the ability to process each example once, at the rate it arrives, maintaining a histogram consistent with the most recent data and forgetting outdated data whenever a change in the distribution is detected. To construct histograms from high-speed data streams we use the two-layer structure of the Partition Incremental Discretization (PiD) algorithm. Our contribution is a new method to detect when a change occurs in the distribution generating the examples. The basic idea consists of monitoring distributions from two different time windows: the reference time window, which reflects the distribution observed in the past, and the current time window, which reflects the distribution observed in the most recent data. We compare both distributions and signal a change whenever the measured difference between them exceeds a threshold value, using three different measures: the Entropy Absolute Difference, the Kullback-Leibler divergence and the Cosine Distance. The experimental results suggest that the Kullback-Leibler divergence exhibits a high probability of change detection and faster detection rates, with few false positive alarms.
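
The two-window comparison can be sketched as follows, using textbook definitions of the three measures on a pair of histograms with the same bins. The additive smoothing constant, the direction of the Kullback-Leibler divergence (current relative to reference), the threshold value, and all function names are assumptions for this example; the abstract does not specify them.

```python
import math

def normalize(counts, eps=1e-9):
    """Turn bin counts into probabilities, smoothing to avoid log(0)."""
    total = sum(counts) + eps * len(counts)
    return [(c + eps) / total for c in counts]

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p)

def entropy_abs_diff(p, q):
    """Absolute difference between the entropies of p and q."""
    return abs(entropy(p) - entropy(q))

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def cosine_distance(p, q):
    """One minus the cosine similarity of the two probability vectors."""
    dot = sum(pi * qi for pi, qi in zip(p, q))
    norm = math.sqrt(sum(pi * pi for pi in p)) * math.sqrt(sum(qi * qi for qi in q))
    return 1.0 - dot / norm

def change_detected(ref_counts, cur_counts, measure, threshold):
    """Signal a change when the chosen measure exceeds the threshold."""
    p = normalize(cur_counts)   # current time window
    q = normalize(ref_counts)   # reference time window
    return measure(p, q) > threshold

reference = [10, 20, 40, 20, 10]
current = [40, 20, 10, 10, 20]
for measure in (entropy_abs_diff, kl_divergence, cosine_distance):
    print(measure.__name__, change_detected(reference, current, measure, 0.1))
```

In this toy case the current histogram is a rearrangement of the reference bin masses, so the two entropies are equal and the entropy-based measure stays at zero, while the Kullback-Leibler divergence and the cosine distance both exceed the threshold.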
