2018
Authors
Shekar, AK; de Sá, CR; Ferreira, H; Soares, C;
Publication
CoRR
2018
Authors
Lindert, Dt; de Sá, CR; Soares, C; Knobbe, AJ;
Publication
CoRR
2019
Authors
de Sá, CR; Azevedo, PJ; Soares, C; Jorge, AM; Knobbe, AJ;
Publication
CoRR
2023
Authors
Cunha, L; Soares, C; Restivo, A; Teixeira, LF;
Publication
ADVANCES IN INTELLIGENT DATA ANALYSIS XXI, IDA 2023
Abstract
Concerns about the interpretability of ML models are growing as the technology is used in increasingly sensitive domains (e.g., health and public administration). Synthetic data can be used to understand models better, for instance, if the examples are generated close to the frontier between classes. However, data augmentation techniques, such as Generative Adversarial Networks (GAN), have mostly been used to generate training data that leads to better models. We propose a variation of GANs that, given a classifier, generates realistic data that the classifier labels with low confidence. The generated examples can be used to gain insights into the frontier between classes. We empirically evaluate our approach on two well-known image classification benchmark datasets, MNIST and Fashion MNIST. Results show that the approach is able to generate images that are closer to the frontier than the original ones, but still realistic. Manual inspection confirms that some of those images are confusing even for humans.
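The abstract does not state the loss used, so the following is only a minimal sketch of one way a generator objective could reward low classifier confidence. The names `confidence_penalty`, `generator_loss`, and `weight` are hypothetical, and the penalty (gap between the top class probability and the uniform level 1/K) is an assumption, not the authors' formulation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def confidence_penalty(logits):
    """Assumed penalty: how far the top class probability sits above 1/K.

    It is 0 when the classifier is maximally unsure (uniform output)
    and grows as the classifier becomes more confident.
    """
    probs = softmax(logits)
    return max(probs) - 1.0 / len(probs)

def generator_loss(realism_loss, logits, weight=1.0):
    """Hypothetical combined objective: stay realistic, stay near the frontier.

    realism_loss stands in for the usual adversarial (discriminator) term.
    """
    return realism_loss + weight * confidence_penalty(logits)
```

With this choice, an example whose class probabilities are nearly equal adds almost nothing to the loss, while a confidently classified example is penalized in proportion to `weight`.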
2025
Authors
Inácio, R; Cerqueira, V; Barandas, M; Soares, C;
Publication
MACHINE LEARNING
Abstract
Evaluating and documenting the robustness of forecasting models to different input conditions is important for their responsible deployment in real-world applications. Time series forecasting models often exhibit degraded performance in the form of unusually large errors, high uncertainty, or hubris (high errors coupled with low uncertainty). Traditional stress testing approaches rely on manually designed adverse scenarios that fail to systematically identify unknown stress factors, in which data characteristics indicate potential issues. To overcome this limitation, this paper introduces MAST (Meta-learning and data Augmentation for Stress Testing), a novel method for stress testing forecasting models. MAST leverages model outputs (error scores and prediction intervals) to automatically identify and characterize input conditions that induce stress. Specifically, MAST is a binary probabilistic classifier that predicts the likelihood of forecasting model stress based on time series features. An additional contribution is a novel time series data augmentation approach based on oversampling or synthetic time series generation, which improves the information about stress factors in the input space, resulting in increased stress classification performance. Experiments were conducted using 6 benchmark datasets containing a total of 97,829 time series. We demonstrate how MAST is able to identify and explain input conditions that lead to manifestations of stress, namely large errors, high uncertainty, or hubris.
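As a rough illustration of the three stress manifestations named in the abstract, the sketch below derives binary flags for one time series from its error score and prediction interval width. The function name `stress_flags` and the explicit threshold parameters are hypothetical; the abstract does not say how the thresholds are chosen, and the classifier trained on time series features is not shown here.

```python
def stress_flags(error, interval_width, error_high, width_high, width_low):
    """Flag stress manifestations for one series (thresholds are assumptions).

    error          -- forecast error score for the series
    interval_width -- width of the prediction interval (uncertainty proxy)
    error_high     -- errors at or above this count as unusually large
    width_high     -- widths at or above this count as high uncertainty
    width_low      -- widths at or below this count as low uncertainty
    """
    large_error = error >= error_high
    high_uncertainty = interval_width >= width_high
    # Hubris: the model is badly wrong yet reports a narrow interval.
    hubris = large_error and interval_width <= width_low
    return {
        "large_error": large_error,
        "high_uncertainty": high_uncertainty,
        "hubris": hubris,
    }
```

Each flag could then serve as the binary target of a probabilistic classifier over time series features, which is the role the abstract assigns to MAST.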
2026
Authors
Amorim, L; Santos, M; Azevedo, PJ; Soares, C; Cerqueira, V;
Publication
IDA
Abstract
Data augmentation is a crucial tool in time series forecasting, especially for deep learning architectures that require a large training sample size to generalize effectively. However, extensive datasets are not always available in real-world scenarios. Although many data augmentation methods exist, their limitations include the use of transformations that do not adequately preserve data properties. This paper introduces Grasynda, a novel graph-based approach for synthetic time series generation that: (1) converts univariate time series into a network structure using a graph representation, where each state is a node and each transition is represented as a directed edge; and (2) encodes their temporal dynamics in a transition probability matrix. We performed an extensive evaluation of Grasynda as a data augmentation method for time series forecasting. We use three neural network variations on six benchmark datasets. The results indicate that Grasynda consistently outperforms other time series data augmentation methods, including ones used in state-of-the-art time series foundation models. The method and all experiments are publicly available.
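The two steps in the abstract (states as nodes with directed transitions, then a transition probability matrix) can be sketched as below. This is not the published Grasynda implementation: the equal-width binning used to define states, the random walk used to generate a new series, and the names `to_states`, `transition_matrix`, and `generate` are all assumptions made for illustration.

```python
import random

def to_states(series, n_states):
    """Map each value to an equal-width bin index (assumed state definition)."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / n_states
    if width == 0:
        width = 1.0  # constant series: everything falls in state 0
    return [min(int((x - lo) / width), n_states - 1) for x in series]

def transition_matrix(states, n_states):
    """Row-normalized counts of observed state-to-state transitions."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    # Fallback for states with no outgoing edge: empirical state frequencies,
    # so the walk never lands on a state that has no observed values.
    freq = [states.count(s) / len(states) for s in range(n_states)]
    return [[c / sum(row) for c in row] if sum(row) else freq for row in counts]

def generate(series, n_states=5, length=None, seed=0):
    """Random walk over the state graph, emitting an observed value per state."""
    rng = random.Random(seed)
    states = to_states(series, n_states)
    matrix = transition_matrix(states, n_states)
    members = [[] for _ in range(n_states)]
    for x, s in zip(series, states):
        members[s].append(x)
    current, synthetic = states[0], []
    for _ in range(length or len(series)):
        synthetic.append(rng.choice(members[current]))
        current = rng.choices(range(n_states), weights=matrix[current])[0]
    return synthetic
```

For example, `generate([1, 2, 3, 2, 1, 2, 3, 2, 1, 2], n_states=3)` returns a series of the same length built only from values seen in the input, following transitions that occurred in it.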