Publications - INESC TEC

Publications

Publications by LIAAD

2016

Clustering data streams using a forgetful neural model

Authors
Cardoso, DdO; Galvão França, FM; Gama, J;

Publication
SAC

Abstract
Clustering a data stream is more challenging than its regular batch counterpart because it has stricter performance constraints. This paper presents an approach to the problem based on WiSARD, a memory-based artificial neural network (ANN) model. The functioning of this model was reviewed and improved to adapt it to this task. The experimental results support the use of this system for informative analysis of data streams.

2016

Message from the MDM 2016 general co-chairs

Authors
Gama, J; Kumar, V; Tan, KL;

Publication
Proceedings - IEEE International Conference on Mobile Data Management

2016

An Overview of Concept Drift Applications

Authors
Žliobaite I.; Pechenizkiy M.; Gama J.;

Publication
Studies in Big Data

Abstract
In most challenging data analysis applications, data evolve over time and must be analyzed in near real time. Patterns and relations in such data often change as well, so models built to analyze them quickly become obsolete. In machine learning and data mining this phenomenon is referred to as concept drift. The objective is to deploy models that diagnose themselves and adapt to changing data. This chapter provides an application-oriented view of concept drift research, with a focus on supervised learning tasks. First, we overview and categorize application tasks for which the problem of concept drift is particularly relevant. Then we construct a reference framework for positioning application tasks within a spectrum of problems related to concept drift. Finally, we discuss some promising research directions from the application perspective and present recommendations for application-driven concept drift research and development.

2016

Detecting Events in Evolving Social Networks through Node Centrality Analysis

Authors
Pereira, FSF; Amo, Sd; Gama, J;

Publication
STREAMEVOLV@ECML-PKDD

Abstract
Social networks evolve because of continuous interaction between users. Existing event detection tasks do not consider the analysis from a user-centric perspective. In this paper we propose detecting node centrality events, that is, the task of finding events based on the position and roles of the nodes. We present a naive algorithm for detecting such events in network streams. Moreover, we apply our proposal in a case study, showing how node centrality events can be used for tracking changes in user preferences.

2016

First Principle Models Based Dataset Generation for Multi-Target Regression and Multi-Label Classification Evaluation

Authors
Sousa, RT; Gama, J;

Publication
STREAMEVOLV@ECML-PKDD

Abstract
Machine Learning and Data Mining research strongly depends on the quality and quantity of real-world datasets for evaluating the methods under development. In the context of the emerging Online Multi-Target Regression and Multi-Label Classification methodologies, datasets present new characteristics that require specific testing and represent new challenges. The first difficulty in evaluation is the reduced number of examples caused by data damage, privacy preservation, or high acquisition cost. Second, data events of interest, such as data changes, are difficult to find in the datasets of specific domains, since these events are naturally scarce. For these reasons, this work suggests a method for producing synthetic datasets with desired properties (number of examples, data change events, ...) for the evaluation of Multi-Target Regression and Multi-Label Classification methods. These datasets are produced using First Principle Models, which give more realistic and representative properties, such as real-world meaning (physical, financial, ...) for the output and input variables. This type of dataset generation can be used to produce infinite streams and to evaluate incremental methods such as online anomaly and change detection. This paper illustrates the use of synthetic data generation through two showcases of data change evaluation.

2016

Preface

Authors
Gavaldà, R; Žliobaite, I; Gama, J;

Publication
CEUR Workshop Proceedings
