Cookies Policy
The website need some cookies and similar means to function. If you permit us, we will use those means to collect data on your visits for aggregated statistics to improve our service. Find out More
Accept Reject
  • Menu
Publications

Publications by Vítor Manuel Cerqueira

2022

Exceedance Probability Forecasting via Regression for Significant Wave Height Forecasting

Authors
Cerqueira, V; Torgo, L;

Publication
CoRR

Abstract

2023

Automated imbalanced classification via layered learning

Authors
Cerqueira, V; Torgo, L; Branco, P; Bellinger, C;

Publication
Mach. Learn.

Abstract

2023

Model Selection for Time Series Forecasting An Empirical Analysis of Multiple Estimators

Authors
Cerqueira, V; Torgo, L; Soares, C;

Publication
NEURAL PROCESSING LETTERS

Abstract
Evaluating predictive models is a crucial task in predictive analytics. This process is especially challenging with time series data because observations are not independent. Several studies have analyzed how different performance estimation methods compare with each other for approximating the true loss incurred by a given forecasting model. However, these studies do not address how the estimators behave for model selection: the ability to select the best solution among a set of alternatives. This paper addresses this issue. The goal of this work is to compare a set of estimation methods for model selection in time series forecasting tasks. This objective is split into two main questions: (i) analyze how often a given estimation method selects the best possible model; and (ii) analyze what is the performance loss when the best model is not selected. Experiments were carried out using a case study that contains 3111 time series. The accuracy of the estimators for selecting the best solution is low, despite being significantly better than random selection. Moreover, the overall forecasting performance loss associated with the model selection process ranges from 0.28 to 0.58%. Yet, no considerable differences between different approaches were found. Besides, the sample size of the time series is an important factor in the relative performance of the estimators.

2023

Early anomaly detection in time series: a hierarchical approach for predicting critical health episodes

Authors
Cerqueira, V; Torgo, L; Soares, C;

Publication
MACHINE LEARNING

Abstract
The early detection of anomalous events in time series data is essential in many domains of application. In this paper we deal with critical health events, which represent a significant cause of mortality in intensive care units of hospitals. The timely prediction of these events is crucial for mitigating their consequences and improving healthcare. One of the most common approaches to tackle early anomaly detection problems is through standard classification methods. In this paper we propose a novel method that uses a layered learning architecture to address these tasks. One key contribution of our work is the idea of pre-conditional events, which denote arbitrary but computable relaxed versions of the event of interest. We leverage this idea to break the original problem into two hierarchical layers, which we hypothesize are easier to solve. The results suggest that the proposed approach leads to a better performance relative to state of the art approaches for critical health episode prediction.

2021

Model Compression for Dynamic Forecast Combination

Authors
Cerqueira, V; Torgo, L; Soares, C; Bifet, A;

Publication
CoRR

Abstract

2024

Time Series Data Augmentation as an Imbalanced Learning Problem

Authors
Cerqueira, V; Moniz, N; Inácio, R; Soares, C;

Publication
Progress in Artificial Intelligence - 23rd EPIA Conference on Artificial Intelligence, EPIA 2024, Viana do Castelo, Portugal, September 3-6, 2024, Proceedings, Part II

Abstract
Recent state-of-the-art forecasting methods are trained on collections of time series. These methods, often referred to as global models, can capture common patterns in different time series to improve their generalization performance. However, they require large amounts of data that might not be available. Moreover, global models may fail to capture relevant patterns unique to a particular time series. In these cases, data augmentation can be useful to increase the sample size of time series datasets. The main contribution of this work is a novel method for generating univariate time series synthetic samples. Our approach stems from the insight that the observations concerning a particular time series of interest represent only a small fraction of all observations. In this context, we frame the problem of training a forecasting model as an imbalanced learning task. Oversampling strategies are popular approaches used to handle the imbalance problem in machine learning. We use these techniques to create synthetic time series observations and improve the accuracy of forecasting models. We carried out experiments using 7 different databases that contain a total of 5502 univariate time series. We found that the proposed solution outperforms both a global and a local model, thus providing a better trade-off between these two approaches. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

  • 5
  • 8