Publications - INESC TEC

Publications

Publications by LIAAD

2026

Building of transformer-based RUL predictors supported by explainability techniques: Application on real industrial datasets

Authors
Dintén, R; Zorrilla, M; Veloso, B; Gama, J;

Publication
INFORMATION FUSION

Abstract
One of the key aspects of Industry 4.0 is using intelligent systems to optimize manufacturing processes by improving productivity and reducing costs. These systems have had a great impact in different areas, such as demand prediction and quality assessment. However, the prognostics and health management of industrial equipment is one of the areas with the greatest potential. This paper presents a comparative analysis of deep learning architectures applied to the prediction of the remaining useful life (RUL) on public real industrial datasets. The analysis includes some of the most commonly employed recurrent neural network variations and a novel approach based on a hybrid architecture using transformers. Moreover, we apply explainability techniques to provide comprehensive insights into the model's decision-making process. The contributions of the work are: (1) a novel transformer-based architecture for RUL prediction that outperforms traditional recurrent neural networks; (2) a detailed description of the design strategies used to construct the models on two under-explored datasets; (3) the use of explainability techniques to understand feature importance and to explain the model's predictions; and (4) making the models available to other researchers for reproducibility.


2026

Interpretable Predictive Maintenance: Combining Anomaly Detection with Quantitative Root Cause Analysis

Authors
Barbosa, I; Gama, J; Veloso, B;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2025, PT II

Abstract
Predictive Maintenance (PdM) aims to prevent failures through early detection, yet it often lacks the explainability needed to support decision-making. Current PdM models often identify failures but fail to explain their root causes, especially in real-world scenarios with complex data and limited labels. This study proposes an interpretable framework that combines LSTM-based Anomaly Detection with a dual-layered Root Cause Analysis (RCA) based on SHAP attributions. Applied to a real-world dataset, the method detects degradation transitions, tracks failure patterns over time, and provides interpretable information without explicit root cause labels.


2026

In-context Learning of Evolving Data Streams with Tabular Foundational Models

Authors
Lourenco, A; Gama, J; Xing, EP; Marreiros, G;

Publication
PROCEEDINGS OF THE 32ND ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING VOL 1, KDD 2026

Abstract
State-of-the-art data stream mining has long drawn from ensembles of the Very Fast Decision Tree, a seminal algorithm honored with the 2015 KDD Test-of-Time Award. However, the emergence of large tabular models, i.e., transformers designed for structured numerical data, marks a significant paradigm shift. These models move beyond traditional weight updates, instead employing in-context learning through prompt tuning. By using on-the-fly sketches to summarize unbounded streaming data, one can feed this information into a pre-trained model for efficient processing. This work bridges advancements from both areas, highlighting how transformers' implicit meta-learning abilities, pre-training on drifting natural data, and reliance on context optimization directly address the core challenges of adaptive learning in dynamic environments. Exploring real-time model adaptation, this research demonstrates that TabPFN, coupled with a simple sliding memory strategy, consistently outperforms ensembles of Hoeffding trees, such as Adaptive Random Forest and Streaming Random Patches, across all non-stationary benchmarks.
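
The sliding memory strategy mentioned in this abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `SlidingMemory`, `predict_in_context`, and `prequential_accuracy` are hypothetical, and a nearest-neighbour vote stands in for the pre-trained tabular model, since the abstract does not describe how TabPFN is invoked or how the memory is sized.

```python
from collections import Counter, deque


class SlidingMemory:
    """Fixed-size buffer holding the most recent labeled examples (the context)."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest example when full.
        self.buffer = deque(maxlen=capacity)

    def add(self, x, y):
        self.buffer.append((x, y))

    def context(self):
        return list(self.buffer)


def predict_in_context(context, x, k=3):
    """Stand-in for an in-context model (NOT TabPFN): k-nearest-neighbour vote.

    Returns None when the context is empty.
    """
    if not context:
        return None
    nearest = sorted(
        context,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def prequential_accuracy(stream, capacity=50, k=3):
    """Test-then-train loop: predict each example from memory, then store it."""
    memory = SlidingMemory(capacity)
    correct = total = 0
    for x, y in stream:
        y_hat = predict_in_context(memory.context(), x, k)
        if y_hat is not None:
            total += 1
            correct += int(y_hat == y)
        memory.add(x, y)  # no weight update: adaptation happens via the context
    return correct / total if total else 0.0
```

In this toy setting, adaptation to drift comes only from replacing old context examples with new ones. A small `capacity` forgets an outdated concept quickly, while a very large one keeps serving stale examples as context.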


2026

DFDT: Dynamic Fast Decision Tree for IoT Data Stream Mining on Edge Devices

Authors
Lourenço, A; Rodrigo, J; Gama, J; Marreiros, G;

Publication
AAAI

Abstract
The Internet of Things generates massive data streams, with edge computing emerging as a key enabler for online IoT applications and 5G networks. Edge solutions facilitate real-time machine learning inference, but also require continuous adaptation to concept drifts. While extensions of the Very Fast Decision Tree (VFDT) remain state-of-the-art for tabular stream mining, their unregulated growth limits efficiency, particularly in ensemble settings where post-pruning at the individual tree level is seldom applied. This paper presents DFDT, a novel memory-constrained algorithm for online learning. DFDT employs activity-aware pre-pruning, dynamically adjusting splitting criteria based on leaf node activity: low-activity nodes are deactivated to conserve resources, moderately active nodes split under stricter conditions, and highly active nodes leverage a skipping mechanism for accelerated growth. Additionally, adaptive grace periods and tie thresholds allow DFDT to modulate splitting decisions based on observed data variability, enhancing the accuracy–memory–runtime trade-off while minimizing the need for hyperparameter tuning. An ablation study reveals three DFDT variants suited to different resource profiles. Fully compatible with existing ensemble frameworks, DFDT provides a drop-in alternative to standard VFDT-based learners. © 2026, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
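
For context, the split decision that grace periods and tie thresholds act on in a standard VFDT can be sketched as follows. This is the textbook Hoeffding-bound test, not DFDT's adaptive rule, which the abstract does not specify. The function names and default values are assumptions chosen for illustration.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2n)).

    value_range (R) is the range of the split metric, delta the allowed
    error probability, and n the number of examples seen at the leaf.
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, n,
                 value_range=1.0, delta=1e-7, tie_threshold=0.05):
    """Standard VFDT split test (illustrative defaults, not DFDT's values).

    Split when the best attribute beats the runner-up by more than epsilon,
    or when epsilon has shrunk below the tie threshold, meaning the two
    candidates are too close to be worth waiting on.
    """
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold
```

In a standard VFDT this test is evaluated only every grace-period examples per leaf, and the tie threshold is a fixed hyperparameter. The abstract describes DFDT as adapting both to the observed data.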


2026

Interpretable rules for online failure prediction: a case study on Metro do Porto datasets

Authors
Jakobs, M; Veloso, B; Gama, J;

Publication
INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS

Abstract
Predictive maintenance applications have increasingly been approached with deep learning techniques in recent years due to their high predictive performance. However, as in other real-world application scenarios, the need for explainability is often stated but not sufficiently addressed, which can limit adoption in practice. In this study, we focus on predicting failures of trains operating in Porto, Portugal. While recent works have found high-performing deep neural network architectures that feature a parallel explainability pipeline, we find that the generated explanations can be hard to comprehend in practice due to their low support over the failure range. In this work, we propose a novel online rule-learning approach that is able to generate simple rules that cover the entirety of the detected failures. We evaluate our method against AMRules, a state-of-the-art online rule-learning approach, on two datasets gathered from trains operated by Metro do Porto. Our experiments show that our approach consistently generates rules with very high support that are simultaneously short and interpretable.


2026

Unveiling Group-Specific Distributed Concept Drift: A Fairness Imperative in Federated Learning

Authors
Salazar, T; Gama, J; Araújo, H; Abreu, PH;

Publication
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS

Abstract
In the evolving field of machine learning, ensuring group fairness has become a critical concern, prompting the development of algorithms designed to mitigate bias in decision-making processes. Group fairness refers to the principle that a model's decisions should be equitable across different groups defined by sensitive attributes such as gender or race, ensuring that individuals from privileged groups and unprivileged groups are treated fairly and receive similar outcomes. However, achieving fairness in the presence of group-specific concept drift remains an unexplored frontier, and our research represents pioneering efforts in this regard. Group-specific concept drift refers to situations where one group experiences concept drift over time, while another does not, leading to a decrease in fairness even if accuracy (ACC) remains fairly stable. Within the framework of federated learning (FL), where clients collaboratively train models, its distributed nature further amplifies these challenges since each client can experience group-specific concept drift independently while still sharing the same underlying concept, creating a complex and dynamic environment for maintaining fairness. The most significant contribution of our research is the formalization and introduction of the problem of group-specific concept drift and its distributed counterpart, shedding light on its critical importance in the field of fairness. In addition, leveraging insights from prior research, we adapt an existing distributed concept drift adaptation algorithm to tackle group-specific distributed concept drift, which uses a multimodel approach, a local group-specific drift detection mechanism, and continuous clustering of models over time. The findings from our experiments highlight the importance of addressing group-specific concept drift and its distributed counterpart to advance fairness in machine learning.
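
The notion of group fairness described in this abstract can be made concrete with one common metric, the statistical parity difference. The abstract does not say which fairness metric the paper uses, so this choice, together with the function names below, is an assumption made for illustration.

```python
def positive_rate(predictions, groups, group):
    """Share of positive (1) predictions among members of one group."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    if not selected:
        raise ValueError(f"no examples for group {group!r}")
    return sum(selected) / len(selected)


def statistical_parity_difference(predictions, groups, privileged, unprivileged):
    """Positive rate of the unprivileged group minus that of the privileged one.

    A value of 0 means both groups receive positive outcomes at the same
    rate; values further from 0 indicate a larger disparity.
    """
    return (positive_rate(predictions, groups, unprivileged)
            - positive_rate(predictions, groups, privileged))
```

Tracking such a metric per time window alongside accuracy is one way to observe the situation the abstract describes, where drift affecting a single group can widen the disparity while overall accuracy changes little.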
