2025
Authors
Teixeira, C; Gomes, I; Cunha, L; Soares, C; van Rijn, JN;
Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2024, PT II
Abstract
As machine learning technologies are increasingly adopted, the demand for responsible AI practices to ensure transparency and accountability grows. To better understand the decision-making processes of machine learning models, GASTeN was developed to generate realistic yet ambiguous synthetic data near a classifier's decision boundary. However, its results were inconsistent, producing few images in the low-confidence region and noisy outputs. Therefore, we propose a new version of GASTeN with a modified architecture and a novel loss function. This loss function incorporates a multi-objective measure with a Gaussian loss centered on the classifier probability, targeting the decision boundary. Our study found that while the original GASTeN architecture yields the highest Fréchet Inception Distance (FID) scores, the updated version achieves lower Average Confusion Distance (ACD) values and consistent performance across low-confidence regions. Both architectures produce realistic and ambiguous images, but the updated one is more reliable, with no instances of GAN mode collapse. Additionally, the introduction of the Gaussian loss enhanced this architecture by allowing for adjustable tolerance in image generation around the decision boundary.
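The core idea of a Gaussian loss centered on the classifier probability can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the parametrization, and the choice of `sigma` as the adjustable tolerance are assumptions based on the abstract's description.

```python
import math

def gaussian_boundary_loss(p, center=0.5, sigma=0.1):
    """Illustrative sketch: a loss that is minimal when the classifier's
    predicted probability `p` sits on the decision boundary (`center`,
    here 0.5) and grows as `p` moves away from it. `sigma` plays the
    role of the adjustable tolerance around the boundary mentioned in
    the abstract; all names and the exact functional form are
    hypothetical."""
    return 1.0 - math.exp(-((p - center) ** 2) / (2 * sigma ** 2))
```

With a small `sigma`, only samples whose classifier probability is very close to 0.5 receive a near-zero loss, which is one way to steer a generator toward ambiguous, low-confidence images.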
2025
Authors
Cerqueira, V; Moniz, N; Inacio, R; Soares, C;
Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2024, PT III
Abstract
Recent state-of-the-art forecasting methods are trained on collections of time series. These methods, often referred to as global models, can capture common patterns in different time series to improve their generalization performance. However, they require large amounts of data that might not be available. Moreover, global models may fail to capture relevant patterns unique to a particular time series. In these cases, data augmentation can be useful to increase the sample size of time series datasets. The main contribution of this work is a novel method for generating univariate time series synthetic samples. Our approach stems from the insight that the observations concerning a particular time series of interest represent only a small fraction of all observations. In this context, we frame the problem of training a forecasting model as an imbalanced learning task. Oversampling strategies are popular approaches used to handle the imbalance problem in machine learning. We use these techniques to create synthetic time series observations and improve the accuracy of forecasting models. We carried out experiments using 7 different databases that contain a total of 5502 univariate time series. We found that the proposed solution outperforms both a global and a local model, thus providing a better trade-off between these two approaches.
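The framing above, in which the target series' observations are a minority within a global pool and are oversampled, can be sketched as below. This is a hedged toy version: the pool/target representation as lists of windows, the multiplicative jitter, and all names are illustrative assumptions, not the paper's method.

```python
import random

def oversample_target_series(pool, target, k, noise=0.05, seed=0):
    """Toy sketch of treating forecasting as an imbalanced learning task:
    windows from the target series (`target`) are the minority class
    within a global training pool (`pool`). We create `k` synthetic
    windows by jittering randomly chosen target windows with small
    multiplicative noise. The jitter scheme is hypothetical."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(k):
        base = rng.choice(target)
        synthetic.append([x * (1 + rng.uniform(-noise, noise)) for x in base])
    return pool + target + synthetic
```

Training a global model on the returned pool gives the target series more weight than it would have natively, which is the trade-off between global and local models that the abstract describes.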
2025
Authors
Inácio, R; Cerqueira, V; Barandas, M; Soares, C;
Publication
MACHINE LEARNING
Abstract
Evaluating and documenting the robustness of forecasting models to different input conditions is important for their responsible deployment in real-world applications. Time series forecasting models often exhibit degraded performance in the form of unusually large errors, high uncertainty, or hubris (high errors coupled with low uncertainty). Traditional stress testing approaches rely on manually designed adverse scenarios that fail to systematically identify unknown stress factors, in which data characteristics indicate potential issues. To overcome this limitation, this paper introduces MAST (Meta-learning and data Augmentation for Stress Testing), a novel method for stress testing forecasting models. MAST leverages model outputs (error scores and prediction intervals) to automatically identify and characterize input conditions that induce stress. Specifically, MAST is a binary probabilistic classifier that predicts the likelihood of forecasting model stress based on time series features. An additional contribution is a novel time series data augmentation approach based on oversampling or synthetic time series generation, which improves the information about stress factors in the input space, resulting in increased stress classification performance. Experiments were conducted using 6 benchmark datasets containing a total of 97,829 time series. We demonstrate how MAST is able to identify and explain input conditions that lead to manifestations of stress, namely large errors, high uncertainty, or hubris.
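The shape of such a meta-learning stress classifier can be sketched as follows. This is an assumption-laden illustration: the three features, the logistic form, and the (untrained) weights stand in for MAST's actual feature set and learned classifier.

```python
import math

def extract_features(series):
    """Toy feature vector (mean, variance, linear trend) standing in
    for the time series features a meta-learner like MAST could use;
    the choice of features is illustrative only."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    trend = (series[-1] - series[0]) / (n - 1)
    return [mean, var, trend]

def stress_probability(series, weights, bias=0.0):
    """Hedged sketch of the abstract's idea: a binary probabilistic
    classifier mapping time series features to the likelihood that a
    forecasting model is stressed on this input. In MAST the weights
    would be learned from (features, stress-label) pairs built from
    the model's error scores and prediction intervals; here they are
    hypothetical inputs."""
    z = bias + sum(w * f for w, f in zip(weights, extract_features(series)))
    return 1 / (1 + math.exp(-z))
```

The stress labels themselves would come from the forecasting model's own outputs (large errors, wide intervals, or hubris), which is what makes this a meta-learning setup.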
2025
Authors
Teixeira, S; Nogueira, AR; Gama, J;
Publication
DSAA
Abstract
Data-driven decision models based on Artificial Intelligence (AI) are increasingly adopted across domains. However, these models are susceptible to bias that can result in unfair or discriminatory outcomes. Recent research has explored causal discovery methods as a promising way to understand and improve fairness in decision-making systems. In this work, we investigate how different conditional independence tests used in constraint-based causal discovery algorithms, specifically the PC algorithm, affect fairness and performance. We perform an empirical evaluation on several datasets, including Portuguese public contracts, COMPAS, and the German Credit dataset. Using seven conditional independence tests, we assess model behavior under fairness (demographic parity, accuracy parity, equalized odds, and predictive rate parity) and performance (accuracy, F1-score, AUC) metrics. Our findings reveal that some tests, due to their statistical properties, fail to expose unfairness detectable via causal structures, even when performance metrics appear acceptable. Furthermore, we highlight significant differences in computational efficiency among the tests, with χ²-Adf, sp-mi, and sp-χ² being the least efficient. This study underscores the need for careful selection of conditional independence tests in causal discovery to ensure both fairness and reliability in data-driven decision systems. © 2025 IEEE.
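To make concrete what one of these pluggable tests looks like, here is a minimal sketch of a chi-square independence test on a 2x2 contingency table, the simplest member of the family of tests the PC algorithm can use. It is illustrative only: the paper compares seven conditional independence tests, and a real PC implementation would also condition on separating sets. The critical value 3.841 corresponds to alpha = 0.05 with one degree of freedom.

```python
def chi2_independence(table):
    """Pearson chi-square test of independence on a 2x2 contingency
    table (list of two rows of counts). Returns the statistic and
    whether independence is rejected at alpha = 0.05 (df = 1,
    critical value 3.841). A toy stand-in for the CI tests that
    drive edge removal in constraint-based causal discovery."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    n = sum(row)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat, stat > 3.841
```

Because the PC algorithm deletes edges exactly when its CI test fails to reject independence, swapping in a test with different power directly changes the recovered causal structure, and with it any fairness conclusions drawn from that structure.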
2025
Authors
Veloso, B; Neto, HA; Buarque, F; Gama, J;
Publication
DATA MINING AND KNOWLEDGE DISCOVERY
Abstract
Hyper-parameter optimization in machine learning models is critical for achieving peak performance. Over the past few years, numerous researchers have worked on this optimization challenge. They primarily focused on batch learning tasks where data distributions remain relatively unchanged. However, addressing the properties of data streams poses a substantial challenge. With the rapid evolution of technology, the demand for sophisticated techniques to handle dynamic data streams is becoming increasingly urgent. This paper introduces a novel adaptation of the Fish School Search (FSS) Algorithm for online hyper-parameter optimization, the FSS-SPT, a solution designed explicitly for the dynamic context of data streams. One fundamental property of the FSS-SPT is that it can switch between exploration and exploitation modes to cope with concept drift and converge to reasonable solutions. Our experiments on different datasets provide compelling evidence of the superior performance of the proposed FSS-SPT, which outperformed existing algorithms in two machine learning tasks, demonstrating its potential for practical application.
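The exploration/exploitation switch described above can be sketched as a simple step-size controller: enlarge the search step when drift is detected (explore) and shrink it otherwise (exploit). This is a hedged illustration of the general mechanism, not FSS-SPT itself; all parameter names and values are assumptions.

```python
def adapt_step(step, drift_detected, shrink=0.9, expand=2.0,
               step_min=1e-3, step_max=1.0):
    """Toy controller for the explore/exploit switch attributed to
    FSS-SPT: on concept drift, grow the search step (bounded by
    `step_max`) to explore new hyper-parameter regions; otherwise
    decay it (bounded by `step_min`) to exploit the current optimum.
    The multiplicative scheme and bounds are illustrative."""
    if drift_detected:
        return min(step * expand, step_max)
    return max(step * shrink, step_min)
```

In a streaming loop, a drift detector's signal would feed `drift_detected` at each batch, so the optimizer automatically re-expands its search whenever the data distribution shifts.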
2025
Authors
Santos, J; Silva, N; Ferreira, C; Gama, J;
Publication
DISCOVERY SCIENCE, DS 2025
Abstract
Hierarchical document classification is essential for structuring large-scale textual corpora in domains such as digital libraries and academic repositories. While recent advances in large language models (LLMs) have opened new possibilities for text classification, their applicability to hierarchical settings under real-world constraints remains underexplored. This study investigates both generative and discriminative transformer-based models, evaluating their effectiveness across multiple inference strategies: a zero-shot baseline, local fine-tuning, and a global approach using category-specific models. Experiments on two real-world hierarchical datasets provide a comprehensive comparison of classification accuracy, F1-macro scores, and inference times. The results highlight that, although generative LLMs can deliver competitive (yet variable) performance at higher levels of the hierarchy, their high inference costs hinder their use in time-sensitive applications. In contrast, fine-tuned discriminative models, particularly BERT-based architectures, consistently offer a more favorable trade-off between performance and efficiency.
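The "local" strategy mentioned above, where a node-specific classifier routes each document down the category tree, can be sketched as follows. This is a generic top-down illustration under assumed interfaces (a dict mapping each node to a text-to-child function), not the paper's implementation.

```python
def predict_hierarchy(text, classifiers, root="root"):
    """Sketch of top-down local hierarchical classification: starting
    at the root category, a node-specific classifier picks the next
    child until no deeper decision is made. `classifiers` maps a node
    name to a function text -> child name (or None to stop); this
    interface is hypothetical. Returns the predicted category path."""
    path, node = [], root
    while node in classifiers:
        child = classifiers[node](text)
        if child is None:
            break
        path.append(child)
        node = child
    return path
```

Each per-node classifier here could be a fine-tuned BERT-style model or a prompted LLM, which is exactly the accuracy-versus-inference-cost trade-off the study measures.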