Details
Name
Carlos Manuel SoaresCluster
Computer ScienceRole
External Research CollaboratorSince
01st January 2008
Nationality
PortugalCentre
Artificial Intelligence and Decision SupportContacts
+351222094398
carlos.m.soares@inesctec.pt
2023
Authors
Cunha, L; Soares, C; Restivo, A; Teixeira, LF;
Publication
ADVANCES IN INTELLIGENT DATA ANALYSIS XXI, IDA 2023
Abstract
Concerns with the interpretability of ML models are growing as the technology is used in increasingly sensitive domains (e.g., health and public administration). Synthetic data can be used to understand models better, for instance, if the examples are generated close to the frontier between classes. However, data augmentation techniques, such as Generative Adversarial Networks (GAN), have been mostly used to generate training data that leads to better models. We propose a variation of GANs that, given a model, generates realistic data that is classified with low confidence by a given classifier. The generated examples can be used in order to gain insights on the frontier between classes. We empirically evaluate our approach on two well-known image classification benchmark datasets, MNIST and Fashion MNIST. Results show that the approach is able to generate images that are closer to the frontier when compared to the original ones, but still realistic. Manual inspection confirms that some of those images are confusing even for humans.
2023
Authors
Cerqueira, V; Torgo, L; Soares, C;
Publication
NEURAL PROCESSING LETTERS
Abstract
Evaluating predictive models is a crucial task in predictive analytics. This process is especially challenging with time series data because observations are not independent. Several studies have analyzed how different performance estimation methods compare with each other for approximating the true loss incurred by a given forecasting model. However, these studies do not address how the estimators behave for model selection: the ability to select the best solution among a set of alternatives. This paper addresses this issue. The goal of this work is to compare a set of estimation methods for model selection in time series forecasting tasks. This objective is split into two main questions: (i) analyze how often a given estimation method selects the best possible model; and (ii) analyze what is the performance loss when the best model is not selected. Experiments were carried out using a case study that contains 3111 time series. The accuracy of the estimators for selecting the best solution is low, despite being significantly better than random selection. Moreover, the overall forecasting performance loss associated with the model selection process ranges from 0.28 to 0.58%. Yet, no considerable differences between different approaches were found. Besides, the sample size of the time series is an important factor in the relative performance of the estimators.
2023
Authors
Cerqueira, V; Torgo, L; Soares, C;
Publication
MACHINE LEARNING
Abstract
The early detection of anomalous events in time series data is essential in many domains of application. In this paper we deal with critical health events, which represent a significant cause of mortality in intensive care units of hospitals. The timely prediction of these events is crucial for mitigating their consequences and improving healthcare. One of the most common approaches to tackle early anomaly detection problems is through standard classification methods. In this paper we propose a novel method that uses a layered learning architecture to address these tasks. One key contribution of our work is the idea of pre-conditional events, which denote arbitrary but computable relaxed versions of the event of interest. We leverage this idea to break the original problem into two hierarchical layers, which we hypothesize are easier to solve. The results suggest that the proposed approach leads to a better performance relative to state of the art approaches for critical health episode prediction.
2023
Authors
Freitas, F; Brazdil, P; Soares, C;
Publication
Discovery Science - 26th International Conference, DS 2023, Porto, Portugal, October 9-11, 2023, Proceedings
Abstract
Many current AutoML platforms include a very large space of alternatives (the configuration space) that make it difficult to identify the best alternative for a given dataset. In this paper we explore a method that can reduce a large configuration space to a significantly smaller one and so help to reduce the search time for the potentially best workflow. We empirically validate the method on a set of workflows that include four ML algorithms (SVM, RF, LogR and LD) with different sets of hyperparameters. Our results show that it is possible to reduce the given space by more than one order of magnitude, from a few thousands to tens of workflows, while the risk that the best workflow is eliminated is nearly zero. The system after reduction is about one order of magnitude faster than the original one, but still maintains the same predictive accuracy and loss. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
2023
Authors
Baptista, T; Soares, C; Oliveira, T; Soares, F;
Publication
Applied Sciences
Abstract
Supervised Thesis
2019
Author
José Francisco Cagigal da Silva Gomes
Institution
UP-FEUP
2019
Author
Diogo Miguel da Rocha Alves
Institution
UP-FEP
2019
Author
Ricardo Manuel da Rocha Melo e Castro
Institution
UP-FEUP
2019
Author
Alexandre Marques de Castro Ribeiro
Institution
UP-FEUP
2019
Author
Tiago Bernardes Almeida
Institution
UP-FEUP
The access to the final selection minute is only available to applicants.
Please check the confirmation e-mail of your application to obtain the access code.