Publications

Publications by LIAAD

2024

CNP-MLDM: Contract Net Protocol for Negotiation in Machine Learning Data Market

Authors
Baghcheband, H; Soares, C; Reis, LP;

Publication
DS (LB)

Abstract
The Machine Learning Data Market (MLDM), which relies on multi-agent systems, necessitates robust negotiation strategies to ensure efficient and fair transactions. The Contract Net Protocol (CNP), a well-established negotiation strategy within Multi-Agent Systems (MAS), offers a promising solution. This paper explores the integration of CNP into MLDM, proposing the CNP-MLDM model to facilitate data exchanges. Characterized by its task announcement and bidding process, CNP enhances negotiation efficiency in MLDM. This paper describes a CNP tailored for MLDM, detailing the proposed protocol and then presenting experimental results.
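The task announcement and bidding process mentioned in the abstract can be illustrated with a minimal sketch of one Contract Net round. This is not the CNP-MLDM model itself: the `Contractor` class, the `rows` field of the task, the `budget` parameter, and the cheapest-bid award rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bid:
    contractor: str
    price: float


class Contractor:
    """Hypothetical data-seller agent (an assumption, not from the paper)."""

    def __init__(self, name: str, asking_price: float, max_rows: int):
        self.name = name
        self.asking_price = asking_price
        self.max_rows = max_rows

    def propose(self, task: dict) -> Optional[Bid]:
        # A contractor refuses the call for proposals if it cannot
        # supply the requested amount of data.
        if task["rows"] > self.max_rows:
            return None
        return Bid(self.name, self.asking_price)


def run_contract_net(task: dict, contractors: list, budget: float) -> Optional[Bid]:
    """One round: announce the task, collect bids, award the contract."""
    # Steps 1 and 2: announce the task to every contractor and collect bids.
    bids = [bid for c in contractors if (bid := c.propose(task)) is not None]
    # Step 3: keep only bids the manager can afford.
    affordable = [bid for bid in bids if bid.price <= budget]
    if not affordable:
        return None  # no contract is awarded in this round
    # Step 4: award the contract to the cheapest affordable bid.
    return min(affordable, key=lambda bid: bid.price)


sellers = [
    Contractor("a", 10.0, 100),
    Contractor("b", 7.0, 50),
    Contractor("c", 5.0, 10),
]
winner = run_contract_net({"rows": 40}, sellers, budget=8.0)
print(winner)
```

In this toy run, seller "c" declines because it cannot supply 40 rows, and seller "a" exceeds the budget, so the contract goes to "b".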

2024

Tabular data generation with tensor contraction layers and transformers

Authors
Silva, A; Restivo, A; Santos, M; Soares, C;

Publication
CoRR

Abstract

2024

Meta-TadGAN: Time Series Anomaly Detection Using TadGAN with Meta-features

Authors
Silva, IOe; Soares, C; Cerqueira, V; Rodrigues, A; Bastardo, P;

Publication
EPIA (3)

Abstract
TadGAN is a recent algorithm with competitive performance on time series anomaly detection. The detection process of TadGAN works by comparing observed data with generated data. A challenge in anomaly detection is that there are anomalies which are not easy to detect by analyzing the original time series but have a clear effect on its higher-order characteristics. We propose Meta-TadGAN, an adaptation of TadGAN that analyzes meta-level representations of time series. That is, it analyzes a time series that represents the characteristics of the time series, rather than the original time series itself. Results on benchmark datasets as well as real-world data from fire detectors show that the new method is competitive with TadGAN.
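The idea of a "time series that represents the characteristics of the time series" can be sketched as a sliding window that emits one row of meta-features per position. The abstract does not say which meta-features Meta-TadGAN uses, so the window mean and population standard deviation below are assumptions, as are the function name and its parameters.

```python
from statistics import mean, pstdev


def meta_feature_series(series, window, step=1):
    """Turn a raw series into a series of (mean, std) pairs per window.

    The choice of meta-features is illustrative only.
    """
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    rows = []
    for start in range(0, len(series) - window + 1, step):
        chunk = series[start:start + window]
        rows.append((mean(chunk), pstdev(chunk)))
    return rows


raw = [1.0, 1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0]
meta = meta_feature_series(raw, window=4)
for row in meta:
    print(row)
```

A detector such as TadGAN would then be trained on the rows of `meta` instead of on `raw`; that step is outside the scope of this sketch.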

2024

Enhancing Algorithm Performance Understanding through tsMorph: Generating Semi-Synthetic Time Series for Robust Forecasting Evaluation

Authors
Santos, M; de Carvalho, ACPLF; Soares, C;

Publication
AEQUITAS@ECAI

Abstract
We have never produced as much data as today, and tomorrow we will probably produce even more. The increase is due not only to the larger number of data sources, but also to the fact that each source can continuously produce more recent data. The discovery of temporal patterns in continuously generated data is the main goal in many forecasting tasks, such as predicting the average value of a currency or the average temperature in a city on the next day. In these tasks, it is assumed that the time difference between two consecutive values produced by the same source is constant, and the sequence of values forms a time series. The importance, and the very large number, of time series forecasting tasks make them one of the most popular data analysis applications, which has been dealt with by a large number of different methods. Despite this popularity, there is a dearth of research aimed at comprehending the conditions under which these methods present high or poor forecasting performances. Empirical studies, although common, are challenged by the limited availability of time series datasets, restricting the extraction of reliable insights. To address this limitation, we present tsMorph, a tool for generating semi-synthetic time series through dataset morphing. tsMorph works by creating a sequence of datasets from two original datasets. The characteristics of the generated datasets progressively depart from those of one of the datasets and converge toward the attributes of the other dataset. This method provides a valuable alternative for obtaining substantial datasets. In this paper, we show the benefits of tsMorph by assessing the predictive performance of the Long Short-Term Memory Network and DeepAR forecasting algorithms. The time series used for the experiments come from the NN5 Competition. The experimental results provide important insights. Notably, the performances of the two algorithms improve proportionally with the frequency of the time series. These experiments confirm that tsMorph can be an effective tool for better understanding the behaviour of forecasting algorithms, delivering a pathway to overcoming the limitations posed by empirical studies and enabling more extensive and reliable experiments. Furthermore, tsMorph can promote Responsible Artificial Intelligence by emphasising characteristics of time series where forecasting algorithms may not perform well, thereby highlighting potential limitations.
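Dataset morphing, as described in the abstract, produces a sequence of series that moves from one original toward another. The abstract does not state the transformation used, so the sketch below assumes a simple pointwise linear interpolation between two equal-length series; `morph_series` and its parameters are hypothetical names.

```python
def morph_series(source, target, steps):
    """Return `steps` series that move linearly from source to target.

    Linear interpolation is an assumption made for illustration.
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")
    if steps < 2:
        raise ValueError("steps must be at least 2")
    morphed = []
    for i in range(steps):
        alpha = i / (steps - 1)  # 0.0 at the source, 1.0 at the target
        morphed.append(
            [(1 - alpha) * s + alpha * t for s, t in zip(source, target)]
        )
    return morphed


morphs = morph_series([0.0, 2.0], [4.0, 6.0], steps=3)
for series in morphs:
    print(series)
```

Evaluating a forecasting algorithm on each intermediate series would then show how its error changes as the data drifts from one original series to the other.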

2024

Fair-OBNC: Correcting Label Noise for Fairer Datasets

Authors
Silva, IOE; Jesus, S; Ferreira, H; Saleiro, P; Sousa, I; Bizarro, P; Soares, C;

Publication
ECAI 2024

Abstract
Data used by automated decision-making systems, such as Machine Learning models, often reflects discriminatory behavior that occurred in the past. These biases in the training data are sometimes related to label noise, such as in COMPAS, where more African-American offenders are wrongly labeled as having a higher risk of recidivism when compared to their White counterparts. Models trained on such biased data may perpetuate or even aggravate the biases with respect to sensitive information, such as gender, race, or age. However, while multiple label noise correction approaches are available in the literature, these focus on model performance exclusively. In this work, we propose Fair-OBNC, a label noise correction method with fairness considerations, to produce training datasets with measurable demographic parity. The presented method adapts Ordering-Based Noise Correction, with an adjusted ordering criterion based both on the margin of error of an ensemble and on the potential increase in the observed demographic parity of the dataset. We evaluate Fair-OBNC against other pre-processing techniques, under different scenarios of controlled label noise. Our results show that the proposed method is the best overall alternative within the pool of label correction methods, being capable of attaining better reconstructions of the original labels. Models trained on the corrected data have an increase, on average, of 150% in demographic parity, when compared to models trained on data with noisy labels, across the considered levels of label noise.
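To make "measurable demographic parity" concrete, the sketch below computes the ratio between the lowest and highest positive-label rates across groups, so that 1.0 means perfect parity. The abstract does not specify which formulation the paper uses; this ratio form, the function names, and the toy labels are assumptions for illustration only.

```python
def positive_rate(labels, groups, group):
    """Share of positive labels (1) among members of one group."""
    members = [y for y, g in zip(labels, groups) if g == group]
    if not members:
        raise ValueError(f"no samples for group {group!r}")
    return sum(members) / len(members)


def demographic_parity_ratio(labels, groups):
    """Lowest positive rate divided by the highest; 1.0 is perfect parity.

    This ratio form is an assumed definition, not taken from the paper.
    """
    rates = [positive_rate(labels, groups, g) for g in set(groups)]
    highest = max(rates)
    if highest == 0:
        return 1.0  # no group has positives, so the rates are equal
    return min(rates) / highest


groups = ["a"] * 4 + ["b"] * 4
noisy_labels = [1, 1, 1, 0, 1, 0, 0, 0]      # rates: a = 0.75, b = 0.25
corrected_labels = [1, 1, 0, 0, 1, 1, 0, 0]  # rates: a = 0.50, b = 0.50

print(demographic_parity_ratio(noisy_labels, groups))
print(demographic_parity_ratio(corrected_labels, groups))
```

A fairness-aware correction method would prefer label flips that raise this ratio; how Fair-OBNC combines that preference with the ensemble margin is described in the paper, not here.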

2024

Finding Patterns in Ambiguity: Interpretable Stress Testing in the Decision Boundary

Authors
Gomes, I; Teixeira, LF; van Rijn, JN; Soares, C; Restivo, A; Cunha, L; Santos, M;

Publication
CoRR

Abstract
