Cookies Policy
The website need some cookies and similar means to function. If you permit us, we will use those means to collect data on your visits for aggregated statistics to improve our service. Find out More
Accept Reject
  • Menu
Publications

Publications by LIAAD

2024

Fair-OBNC: Correcting Label Noise for Fairer Datasets

Authors
Silva, IOE; Jesus, S; Ferreira, H; Saleiro, P; Sousa, I; Bizarro, P; Soares, C;

Publication
ECAI 2024

Abstract
Data used by automated decision-making systems, such as Machine Learning models, often reflects discriminatory behavior that occurred in the past. These biases in the training data are sometimes related to label noise, such as in COMPAS, where more African-American offenders are wrongly labeled as having a higher risk of recidivism when compared to their White counterparts. Models trained on such biased data may perpetuate or even aggravate the biases with respect to sensitive information, such as gender, race, or age. However, while multiple label noise correction approaches are available in the literature, these focus on model performance exclusively. In this work, we propose Fair-OBNC, a label noise correction method with fairness considerations, to produce training datasets with measurable demographic parity. The presented method adapts Ordering-Based Noise Correction, with an adjusted criterion of ordering, based both on the margin of error of an ensemble, and the potential increase in the observed demographic parity of the dataset. We evaluate Fair-OBNC against other different pre-processing techniques, under different scenarios of controlled label noise. Our results show that the proposed method is the overall better alternative within the pool of label correction methods, being capable of attaining better reconstructions of the original labels. Models trained in the corrected data have an increase, on average, of 150% in demographic parity, when compared to models trained in data with noisy labels, across the considered levels of label noise.

2024

Finding Patterns in Ambiguity: Interpretable Stress Testing in the Decision~Boundary

Authors
Gomes, I; Teixeira, LF; van Rijn, JN; Soares, C; Restivo, A; Cunha, L; Santos, M;

Publication
CoRR

Abstract

2024

Lag Selection for Univariate Time Series Forecasting Using Deep Learning: An Empirical Study

Authors
Leites, J; Cerqueira, V; Soares, C;

Publication
Progress in Artificial Intelligence - 23rd EPIA Conference on Artificial Intelligence, EPIA 2024, Viana do Castelo, Portugal, September 3-6, 2024, Proceedings, Part III

Abstract
Most forecasting methods use recent past observations (lags) to model the future values of univariate time series. Selecting an adequate number of lags is important for training accurate forecasting models. Several approaches and heuristics have been devised to solve this task. However, there is no consensus about what the best approach is. Besides, lag selection procedures have been developed based on local models and classical forecasting techniques such as ARIMA. We bridge this gap in the literature by carrying out an extensive empirical analysis of different lag selection methods. We focus on deep learning methods trained in a global approach, i.e., on datasets comprising multiple univariate time series. Specifically, we use NHITS, a recently proposed architecture that has shown competitive forecasting performance. The experiments were carried out using three benchmark databases that contain a total of 2411 univariate time series. The results indicate that the lag size is a relevant parameter for accurate forecasts. In particular, excessively small or excessively large lag sizes have a considerable negative impact on forecasting performance. Cross-validation approaches show the best performance for lag selection, but this performance is comparable with simple heuristics. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

2024

An Empirical Evaluation of DeepAR for Univariate Time Series Forecasting

Authors
Gomes, RU; Soares, C; Reis, LP;

Publication
Progress in Artificial Intelligence - 23rd EPIA Conference on Artificial Intelligence, EPIA 2024, Viana do Castelo, Portugal, September 3-6, 2024, Proceedings, Part III

Abstract
DeepAR is a popular probabilistic time series forecasting algorithm. According to the authors, DeepAR is particularly suitable to build global models using hundreds of related time series. For this reason, it is a common expectation that DeepAR obtains poor results in univariate forecasting [10]. However, there are no empirical studies that clearly support this. Here, we compare the performance of DeepAR with standard forecasting models to assess its performance regarding 1 step-ahead forecasts. We use 100 time series from the M4 competition to compare univariate DeepAR with univariate LSTM and SARIMAX models, both for point and quantile forecasts. Results show that DeepAR obtains good results, which contradicts common perception. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

2024

RIFF: Inducing Rules for Fraud Detection from Decision Trees

Authors
Martins, L; Bravo, J; Gomes, AS; Soares, C; Bizarro, P;

Publication
RULES AND REASONING, RULEML+RR 2024

Abstract
Financial fraud is the cause of multi-billion dollar losses annually. Traditionally, fraud detection systems rely on rules due to their transparency and interpretability, key features in domains where decisions need to be explained. However, rule systems require significant input from domain experts to create and tune, an issue that rule induction algorithms attempt to mitigate by inferring rules directly from data. We explore the application of these algorithms to fraud detection, where rule systems are constrained to have a low false positive rate (FPR) or alert rate, by proposing RIFF, a rule induction algorithm that distills a low FPR rule set directly from decision trees. Our experiments show that the induced rules are often able to maintain or improve performance of the original models for low FPR tasks, while substantially reducing their complexity and outperforming rules hand-tuned by experts.

2024

RHiOTS: A Framework for Evaluating Hierarchical Time Series Forecasting Algorithms

Authors
Roque, L; Soares, C; Torgo, L;

Publication
PROCEEDINGS OF THE 30TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2024

Abstract
We introduce the Robustness of Hierarchically Organized Time Series (RHiOTS) framework, designed to assess the robustness of hierarchical time series forecasting models and algorithms on real-world datasets. Hierarchical time series, where lower-level forecasts must sum to upper-level ones, are prevalent in various contexts, such as retail sales across countries. Current empirical evaluations of forecasting methods are often limited to a small set of benchmark datasets, offering a narrow view of algorithm behavior. RHiOTS addresses this gap by systematically altering existing datasets and modifying the characteristics of individual series and their interrelations. It uses a set of parameterizable transformations to simulate those changes in the data distribution. Additionally, RHiOTS incorporates an innovative visualization component, turning complex, multidimensional robustness evaluation results into intuitive, easily interpretable visuals. This approach allows an in-depth analysis of algorithm and model behavior under diverse conditions. We illustrate the use of RHiOTS by analyzing the predictive performance of several algorithms. Our findings show that traditional statistical methods are more robust than state-of-the-art deep learning algorithms, except when the transformation effect is highly disruptive. Furthermore, we found no significant differences in the robustness of the algorithms when applying specific reconciliation methods, such as MinT. RHiOTS provides researchers with a comprehensive tool for understanding the nuanced behavior of forecasting algorithms, offering a more reliable basis for selecting the most appropriate method for a given problem.

  • 39
  • 515