Publications

Publications by João Gama

2024

Detecting and Explaining Anomalies in the Air Production Unit of a Train

Authors
Davari, N; Veloso, B; Ribeiro, RP; Gama, J;

Publication
39TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2024

Abstract
Predictive maintenance methods play a crucial role in the early detection of failures and errors in machinery, preventing them from reaching critical stages. This paper presents a comprehensive study on a real-world dataset called MetroPT3, with data from a Metro do Porto train's air production unit (APU) system. The dataset comprises data collected from various analogue and digital sensors installed on the APU system, enabling the analysis of behavioural changes and deviations from normal patterns. We propose a data-driven predictive maintenance framework based on a Long Short-Term Memory Autoencoder (LSTM-AE) network. The LSTM-AE efficiently identifies abnormal data instances, leading to a reduction in false alarm rates. We also implement a Sparse Autoencoder (SAE) approach for comparative analysis. The experimental results demonstrate that the LSTM-AE outperforms the SAE regarding F1 Score, Recall, and Precision. Furthermore, to gain insights into the reasons for anomaly detection, we apply the SHAP method to determine the importance of features in the predictive maintenance model. This approach enhances the interpretability of the model to better support the decision-making process.

2024

Where Do We Go From Here? Location Prediction from Time-Evolving Markov Models

Authors
Andrade, T; Gama, J;

Publication
39TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2024

Abstract
Various relevant aspects of our lives relate to the places we visit and our daily activities. The movement of individuals between regular places, such as work, school, or other important personal locations, is receiving increasing attention due to the pervasiveness of geolocation devices and the amount of data they generate. This work presents an approach for location prediction using a probabilistic model and data mining techniques over mobility data streams. We evaluate the method on five real-world datasets. The results show the usefulness of the proposal in comparison with other well-known approaches.
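
As a rough illustration of the general idea of predicting the next location from a Markov model that evolves with the stream, the sketch below keeps first-order transition counts and decays them exponentially so that recent movements weigh more. It is a minimal sketch under stated assumptions, not the method from the paper: the class name `DecayingMarkovPredictor`, the `decay` parameter, and the place labels are all hypothetical.

```python
from collections import defaultdict


class DecayingMarkovPredictor:
    """First-order Markov chain over places with exponentially decayed counts.

    Hypothetical illustration only; the decay scheme is an assumption.
    """

    def __init__(self, decay=0.9):
        self.decay = decay
        # counts[origin][destination] -> decayed transition weight
        self.counts = defaultdict(lambda: defaultdict(float))
        self.last_place = None

    def update(self, place):
        """Consume the next visited place from the stream."""
        if self.last_place is not None:
            row = self.counts[self.last_place]
            # Age the existing transitions out of the origin place,
            # then reinforce the transition that was just observed.
            for nxt in row:
                row[nxt] *= self.decay
            row[place] += 1.0
        self.last_place = place

    def predict(self, place):
        """Return the most likely next place, or None if the place is unseen."""
        row = self.counts.get(place)
        if not row:
            return None
        return max(row, key=row.get)


# Usage: a made-up stream in which the habit after "home" shifts from "work" to "gym".
model = DecayingMarkovPredictor(decay=0.9)
for visited in ["home", "work", "home", "work", "home", "gym",
                "home", "gym", "home", "gym"]:
    model.update(visited)
print(model.predict("home"))
```

Because older transitions are down-weighted at every update, the prediction for "home" follows the more recent habit once it has been observed often enough.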

2024

S+t-SNE - Bringing Dimensionality Reduction to Data Streams

Authors
Vieira, PC; Montrezol, JP; Vieira, JT; Gama, J;

Publication
ADVANCES IN INTELLIGENT DATA ANALYSIS XXII, PT II, IDA 2024

Abstract
We present S+t-SNE, an adaptation of the t-SNE algorithm designed to handle infinite data streams. The core idea behind S+t-SNE is to update the t-SNE embedding incrementally as new data arrives, ensuring scalability and adaptability to handle streaming scenarios. By selecting the most important points at each step, the algorithm ensures scalability while keeping informative visualisations. By employing a blind method for drift management, the algorithm adjusts the embedding space, which facilitates the visualisation of evolving data dynamics. Our experimental evaluations demonstrate the effectiveness and efficiency of S+t-SNE, whilst highlighting its ability to capture patterns in a streaming scenario. We hope our approach offers researchers and practitioners a real-time tool for understanding and interpreting high-dimensional data.

2024

Super-Resolution Analysis for Landfill Waste Classification

Authors
Molina, M; Ribeiro, RP; Veloso, B; Gama, J;

Publication
ADVANCES IN INTELLIGENT DATA ANALYSIS XXII, PT I, IDA 2024

Abstract
Illegal landfills are a critical issue due to their environmental, economic, and public health impacts. This study leverages aerial imagery for environmental crime monitoring. While advances in artificial intelligence and computer vision hold promise, the challenge lies in training models with high-resolution literature datasets and adapting them to open-access low-resolution images. Considering the substantial quality differences and limited annotation, this research explores the adaptability of models across these domains. Motivated by the necessity for a comprehensive evaluation of waste detection algorithms, it advocates cross-domain classification and super-resolution enhancement to analyze the impact of different image resolutions on waste classification as an evaluation to combat the proliferation of illegal landfills. We observed performance improvements by enhancing image quality but noted an influence on model sensitivity, necessitating careful threshold fine-tuning.

2019

Uma Análise sobre a Evolução das Preferências Musicais dos Usuários Utilizando Redes de Similaridade Temporal [An Analysis of the Evolution of Users' Musical Preferences Using Temporal Similarity Networks]

Authors
Fernandes Pereira, FS; Linhares, CDG; Ponciano, JR; Gama, J; Amo, Sd; Oliveira, GMB;

Publication
Braz. J. Inf. Syst.

Abstract

2026

Unveiling Group-Specific Distributed Concept Drift: A Fairness Imperative in Federated Learning

Authors
Salazar, T; Gama, J; Araújo, H; Abreu, PH;

Publication
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS

Abstract
In the evolving field of machine learning, ensuring group fairness has become a critical concern, prompting the development of algorithms designed to mitigate bias in decision-making processes. Group fairness refers to the principle that a model's decisions should be equitable across different groups defined by sensitive attributes such as gender or race, ensuring that individuals from privileged groups and unprivileged groups are treated fairly and receive similar outcomes. However, achieving fairness in the presence of group-specific concept drift remains an unexplored frontier, and our research represents pioneering efforts in this regard. Group-specific concept drift refers to situations where one group experiences concept drift over time, while another does not, leading to a decrease in fairness even if accuracy (ACC) remains fairly stable. Within the framework of federated learning (FL), where clients collaboratively train models, its distributed nature further amplifies these challenges since each client can experience group-specific concept drift independently while still sharing the same underlying concept, creating a complex and dynamic environment for maintaining fairness. The most significant contribution of our research is the formalization and introduction of the problem of group-specific concept drift and its distributed counterpart, shedding light on its critical importance in the field of fairness. In addition, leveraging insights from prior research, we adapt an existing distributed concept drift adaptation algorithm to tackle group-specific distributed concept drift, which uses a multimodel approach, a local group-specific drift detection mechanism, and continuous clustering of models over time. The findings from our experiments highlight the importance of addressing group-specific concept drift and its distributed counterpart to advance fairness in machine learning.
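
The claim that fairness can degrade while accuracy stays fairly stable can be made concrete with a small synthetic example. The sketch below is only an illustration under stated assumptions: it uses the gap between per-group accuracies as a stand-in fairness measure (the paper may use a different one), and the helper names `accuracy`, `group_accuracy_gap`, and `make_window`, as well as the group sizes, are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def group_accuracy_gap(y_true, y_pred, groups):
    """Return (largest minus smallest per-group accuracy, per-group accuracies)."""
    per_group = {}
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        per_group[g] = accuracy([y_true[i] for i in idx],
                                [y_pred[i] for i in idx])
    return max(per_group.values()) - min(per_group.values()), per_group


def make_window(n_correct_a, n_correct_b, n_a=80, n_b=20):
    """Synthetic window: every true label is 1; a wrong prediction is 0.

    Group "A" has n_a samples and group "B" has n_b samples (assumed sizes).
    """
    groups = ["A"] * n_a + ["B"] * n_b
    y_true = [1] * (n_a + n_b)
    y_pred = ([1] * n_correct_a + [0] * (n_a - n_correct_a)
              + [1] * n_correct_b + [0] * (n_b - n_correct_b))
    return y_true, y_pred, groups


# Before drift: both groups are served equally well (90% correct each).
before = make_window(n_correct_a=72, n_correct_b=18)
# After drift affecting only group "B": its accuracy drops to 60%.
after = make_window(n_correct_a=72, n_correct_b=12)

for name, (y_true, y_pred, groups) in [("before", before), ("after", after)]:
    gap, per_group = group_accuracy_gap(y_true, y_pred, groups)
    print(name, "overall accuracy:", accuracy(y_true, y_pred),
          "group gap:", round(gap, 3))
```

In this made-up setting the overall accuracy moves only from 0.90 to 0.84 because the drifting group is small, while the gap between groups grows from 0 to about 0.3. This is the kind of change a detector that monitors overall accuracy alone could miss.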
