Carlos Manuel Soares

Details

Name
Carlos Manuel Soares
Role
External Research Collaborator
Since
1 January 2008

Nationality
Portugal
Centre
Artificial Intelligence and Decision Support Laboratory
Contacts
+351222094398
carlos.m.soares@inesctec.pt

Publications

2025

Meta-learning and Data Augmentation for Stress Testing Forecasting Models

Authors
Inácio, R; Cerqueira, V; Barandas, M; Soares, C;

Publication
Advances in Intelligent Data Analysis XXIII - 23rd International Symposium on Intelligent Data Analysis, IDA 2025, Konstanz, Germany, May 7-9, 2025, Proceedings

Abstract
The effectiveness of time series forecasting models can be hampered by conditions in the input space that lead them to underperform. When those are met, negative behaviours, such as higher-than-usual errors or increased uncertainty, are shown. Traditionally, stress testing is applied to assess how models respond to adverse but plausible scenarios, providing insights on how to improve their robustness and reliability. This paper builds upon this technique by contributing a novel framework called MAST (Meta-learning and data Augmentation for Stress Testing). In particular, MAST is a meta-learning approach that predicts the probability that a given model will perform poorly on a given time series based on a set of statistical features. This way, instead of designing new stress scenarios, this method uses the information provided by instances that led to decreases in forecasting performance. An additional contribution is a novel time series data augmentation technique based on oversampling, which enriches the information about stress factors in the input space and thereby improves the classification capabilities of the method. We conducted experiments using 6 benchmark datasets containing a total of 97,829 time series. The results suggest that MAST is able to identify conditions that lead to large errors effectively. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

2025

Forecasting with Deep Learning: Beyond Average of Average of Average Performance

Authors
Cerqueira, V; Roque, L; Soares, C;

Publication
DISCOVERY SCIENCE, DS 2024, PT I

Abstract
Accurate evaluation of forecasting models is essential for ensuring reliable predictions. Current practices for evaluating and comparing forecasting models focus on summarising performance into a single score, using metrics such as SMAPE. We hypothesize that averaging performance over all samples dilutes relevant information about the relative performance of models, particularly under conditions in which this relative performance differs from the overall accuracy. We address this limitation by proposing a novel framework for evaluating univariate time series forecasting models from multiple perspectives, such as one-step ahead forecasting versus multi-step ahead forecasting. We show the advantages of this framework by comparing a state-of-the-art deep learning approach with classical forecasting techniques. While classical methods (e.g. ARIMA) are long-standing approaches to forecasting, deep neural networks (e.g. NHITS) have recently shown state-of-the-art forecasting performance in benchmark datasets. We conducted extensive experiments that show NHITS generally performs best, but its superiority varies with forecasting conditions. For instance, concerning the forecasting horizon, NHITS only outperforms classical approaches for multi-step ahead forecasting. Another relevant insight is that, when dealing with anomalies, NHITS is outperformed by methods such as Theta. These findings highlight the importance of evaluating forecasts from multiple dimensions.

2025

PrivateCTGAN: Adapting GAN for Privacy-Aware Tabular Data Sharing

Authors
Lopes, F; Soares, C; Cortez, P;

Publication
MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2023, PT II

Abstract
This research addresses the challenge of generating synthetic data that resembles real-world data while preserving privacy. With privacy laws protecting sensitive information such as healthcare data, accessing sufficient training data becomes difficult, which makes Machine Learning models harder to train and worse overall. Recently, there has been an increased interest in the usage of Generative Adversarial Networks (GAN) to generate synthetic data since they enable researchers to generate more data to train their models. GANs, however, may not be suitable for privacy-sensitive data since they have no concern for the privacy of the generated data. We propose modifying the known Conditional Tabular GAN (CTGAN) model by incorporating a privacy-aware loss function, thus resulting in the Private CTGAN (PCTGAN) method. Several experiments were carried out using 10 public domain classification datasets and comparing PCTGAN with CTGAN and the state-of-the-art privacy-preserving model, the Differential Privacy CTGAN (DP-CTGAN). The results demonstrated that PCTGAN enables users to fine-tune the privacy-fidelity trade-off through its parameters and, if desired, to obtain a higher level of privacy.

2025

Cherry-Picking in Time Series Forecasting: How to Select Datasets to Make Your Model Shine

Authors
Roque, L; Cerqueira, V; Soares, C; Torgo, L;

Publication
THIRTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, AAAI-25, VOL 39 NO 19

Abstract
The importance of time series forecasting drives continuous research and the development of new approaches to tackle this problem. Typically, these methods are introduced through empirical studies that frequently claim superior accuracy for the proposed approaches. Nevertheless, concerns are rising about the reliability and generalizability of these results due to limitations in experimental setups. This paper addresses a critical limitation: the number and representativeness of the datasets used. We investigate the impact of dataset selection bias, particularly the practice of cherry-picking datasets, on the performance evaluation of forecasting methods. Through empirical analysis with a diverse set of benchmark datasets, our findings reveal that cherry-picking datasets can significantly distort the perceived performance of methods, often exaggerating their effectiveness. Furthermore, our results demonstrate that by selectively choosing just four datasets - what most studies report - 46% of methods could be deemed best in class, and 77% could rank within the top three. Additionally, recent deep learning-based approaches show high sensitivity to dataset selection, whereas classical methods exhibit greater robustness. Finally, our results indicate that, when empirically validating forecasting algorithms on a subset of the benchmarks, increasing the number of datasets tested from 3 to 6 reduces the risk of incorrectly identifying an algorithm as the best one by approximately 40%. Our study highlights the critical need for comprehensive evaluation frameworks that more accurately reflect real-world scenarios. Adopting such frameworks will ensure the development of robust and reliable forecasting methods.

2025

Estimating Completeness of Consensus Models: Geometrical and Distributional Approaches

Authors
Strecht, P; Mendes-Moreira, J; Soares, C;

Publication
MACHINE LEARNING, OPTIMIZATION, AND DATA SCIENCE, LOD 2024, PT I

Abstract
In many organizations with a distributed operation, not only is data collection distributed, but models are also developed and deployed separately. Understanding the combined knowledge of all the local models may be important and challenging, especially in the case of a large number of models. The automated development of consensus models, which aggregate multiple models into a single one, involves several challenges, including fidelity (ensuring that aggregation does not penalize the predictive performance severely) and completeness (ensuring that the consensus model covers the same space as the local models). In this paper, we address the latter, proposing two measures for geometrical and distributional completeness. The first quantifies the proportion of the decision space that is covered by a model, while the second takes into account the concentration of the data that is covered by the model. The use of these measures is illustrated in a real-world example of academic management, as well as four publicly available datasets. The results indicate that distributional completeness in the deployed models is consistently higher than geometrical completeness. Although consensus models tend to be geometrically incomplete, distributional completeness reveals that they cover the regions of the decision space with a higher concentration of data.

Supervised theses

2024

A Framework to Interpret Multiple Related Rule-based Models

Author
Pedro Rodrigo Caetano Strecht Ribeiro

Institution
UP-FEUP

2024

Enhancing Forecasting using Read & Write Recurrent Neural Networks

Author
Yassine Baghoussi

Institution
UP-FEUP

2019

Learning to Rank with Random Forest: A Case Study in Hostel Reservations

Author
Carolina Macedo Moreira

Institution
UP-FEUP

2019

Recommending Recommender Systems: tackling the Collaborative Filtering algorithm selection problem

Author
Tiago Daniel Sá Cunha

Institution
UP-FEUP

BI4UP

CMLDM

Chatbot_Intelligence

opti-MOVES

SSPM

PFAI4_3ed
