Publications

Publications by LIAAD

2025

RMIDDM: an unsupervised and interpretable concept drift detection method for data streams

Authors
Neto, R; Alencar, B; Gomes, HM; Bifet, A; Gama, J; Cassales, G; Rios, R;

Publication
DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Traditional machine learning techniques assume that data is drawn from a stationary source. This assumption is challenged in data stream settings, which present continuous and potentially infinite sequences whose distribution is prone to change over time. In these settings, detecting changes (a.k.a. concept drifts) is necessary to keep learning models up-to-date. Although state-of-the-art detection methods were designed to monitor the loss of predictive models, such monitoring falls short in many real-world scenarios where the true labels are not readily available. Therefore, there is increasing attention to unsupervised concept drift detection methods, such as the one proposed in this paper. In this work, we present an unsupervised and interpretable method based on Radial Basis Function Networks (RBFN) and Markov Chains (MC), referred to as RMIDDM (Radial Markov Interpretable Drift Detection Method). In our method, the RBFN performs, in its intermediate layer, an activation process that implicitly produces groups of observations collected over time. Simultaneously, the MC models the transitions between groups to support the detection of concept drifts, which happens when the active group changes and its probability exceeds a given threshold. A set of experiments with synthetic datasets and comparisons with state-of-the-art algorithms demonstrated that the proposed method can detect drifts at runtime in an efficient, interpretable, and label-independent way, presenting competitive results and behavior. Additionally, to show its applicability in a real-world scenario, we analyzed new COVID-19 cases, deaths, and vaccinations to identify new waves as concept drifts and generate Markov models that allow understanding of their interaction.

2025

Interpretable Predictive Maintenance: Combining Anomaly Detection with Quantitative Root Cause Analysis

Authors
Barbosa, I; Gama, J; Veloso, B;

Publication
EPIA (2)

Abstract
Predictive Maintenance (PdM) aims to prevent failures through early detection, yet lacks explainability to support decision-making. Current PdM models often identify failures, but fail to explain their root causes, especially in real-world scenarios, with complex and limited labeled data. This study proposes an interpretable framework that combines LSTM-based Anomaly Detection with a dual-layered Root Cause Analysis (RCA) based on SHAP attributions. Applied to a real-world dataset, the method detects degradation transitions, tracks failure patterns over time, and provides interpretable information without explicit root cause labels.

2025

Effect of AI on Innovation Capacity in the context of Industry 5.0: Findings from a Qualitative study

Authors
Bécue, A; Gama, J; Brito, PQ;

Publication
Strategic Business Research

Abstract

2025

A Systematic Literature Review on Multi-label Data Stream Classification

Authors
Oliveira, HF; de Faria, ER; Gama, J; Khan, L; Cerri, R;

Publication
CoRR

Abstract

2025

Salvador Urban Network Transportation (SUNT): A Landmark Spatiotemporal Dataset for Public Transportation

Authors
Ferreira, MV; Souza, M; Rios, TN; Fernandes, IFC; Nery, J; Gama, J; Bifet, A; Rios, RA;

Publication
SCIENTIFIC DATA

Abstract
Efficient public transportation management is essential for the development of large urban centers, providing several benefits such as comprehensive coverage of population mobility, reduction of transport costs, better control of traffic congestion, and significant reduction of environmental impact by limiting gas emissions and pollution. Realizing these benefits requires a deep understanding of the population and transit patterns and the adoption of approaches to model multiple relations and characteristics efficiently. This work addresses these challenges by providing a novel dataset that includes various public transportation components from three different systems: regular buses, subway, and BRT (Bus Rapid Transit). Our dataset comprises daily information from about 700,000 passengers in Salvador, one of Brazil's largest cities, and local public transportation data with approximately 2,000 vehicles operating across nearly 400 lines, connecting almost 3,000 stops and stations. With data collected from March 2024 to March 2025 at a frequency lower than one minute, SUNT stands as one of the largest, most comprehensive, and openly available urban datasets in the literature.

2025

Data Science: Foundations and Applications - 29th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2025, Sydney, Australia, June 10-13, 2025, Proceedings, Part VII

Authors
Wu, X; Spiliopoulou, M; Wang, C; Kumar, V; Cao, L; Zhou, X; Pang, G; Gama, J;

Publication
PAKDD (7)

Abstract
