Publications

Publications by Luís Torgo

2013

OpenML: networked science in machine learning

Authors
Vanschoren, J; Rijn, JNv; Bischl, B; Torgo, L;

Publication
SIGKDD Explorations

Abstract

2017

Dynamic and Heterogeneous Ensembles for Time Series Forecasting

Authors
Cerqueira, V; Torgo, L; Oliveira, M; Pfahringer, B;

Publication
2017 IEEE International Conference on Data Science and Advanced Analytics (DSAA)

Abstract
This paper addresses the issue of learning time series forecasting models in changing environments by leveraging the predictive power of ensemble methods. Concept drift adaptation is performed in an active manner, by dynamically combining base learners according to their recent performance using a non-linear function. Diversity in the ensembles is encouraged with several strategies that include heterogeneity among learners, sampling techniques and computation of summary statistics as extra predictors. Heterogeneity is used with the goal of better coping with different dynamic regimes of the time series. The driving hypotheses of this work are that (i) heterogeneous ensembles should better fit different dynamic regimes and (ii) dynamic aggregation should allow for fast detection and adaptation to regime changes. We extend some strategies typically used in classification tasks to time series forecasting. The proposed methods are validated using Monte Carlo simulations on 16 real-world univariate time series with numerical outcome as well as an artificial series with clear regime shifts. The results provide strong empirical evidence for our hypotheses. To encourage reproducibility the proposed method is publicly available as a software package.
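
The dynamic combination described in this abstract can be illustrated with a minimal sketch. The abstract says only that base learners are combined according to recent performance through a non-linear function; the softmax over negative recent errors used here, along with the names `dynamic_weights`, `combine`, `recent_mae`, and the `temperature` parameter, are assumptions made for illustration and are not taken from the paper or its software package.

```python
import math


def recent_mae(abs_errors, window):
    """Mean absolute error over the last `window` observations of one learner."""
    tail = abs_errors[-window:]
    return sum(tail) / len(tail)


def dynamic_weights(recent_errors, temperature=1.0):
    """Turn each learner's recent error into a weight (assumed softmax form).

    Lower recent error gives a higher weight; weights sum to 1.
    """
    scores = [-e / temperature for e in recent_errors]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def combine(predictions, recent_errors, temperature=1.0):
    """Weighted average of the base learners' current predictions."""
    weights = dynamic_weights(recent_errors, temperature)
    return sum(w * p for w, p in zip(weights, predictions))
```

Recomputing `recent_mae` at every time step lets the weights shift toward whichever learners have done best lately, which is one way an ensemble might adapt after a regime change.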

2017

A Comparative Study of Performance Estimation Methods for Time Series Forecasting

Authors
Cerqueira, V; Torgo, L; Smailovic, J; Mozetic, I;

Publication
2017 IEEE International Conference on Data Science and Advanced Analytics (DSAA)

Abstract
Performance estimation denotes the task of estimating the loss that a predictive model will incur on unseen data. These procedures are part of the pipeline in every machine learning task and are used for assessing the overall generalisation ability of models. In this paper we address the application of these methods to time series forecasting tasks. For independent and identically distributed data the most common approach is cross-validation. However, the dependency among observations in time series raises some caveats about the most appropriate way to estimate performance in these datasets, and currently there is no settled way to do so. We compare different variants of cross-validation and different variants of out-of-sample approaches using two case studies: one with 53 real-world time series and another with three synthetic time series. Results show noticeable differences in the performance estimation methods in the two scenarios. In particular, empirical experiments suggest that cross-validation approaches can be applied to stationary synthetic time series. However, in real-world scenarios the most accurate estimates are produced by the out-of-sample methods, which preserve the temporal order of observations.
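
The contrast between the two families of estimators can be sketched through the index splits they produce. This is a simplified illustration, not the experimental setup of the paper: the function names `holdout_split` and `kfold_splits` are hypothetical, and the paper compares several variants of each approach that are not reproduced here.

```python
def holdout_split(n, test_size):
    """Out-of-sample split: train on the first observations, test on the last.

    The temporal order is preserved, so no training index follows a test index.
    """
    cut = n - test_size
    return list(range(cut)), list(range(cut, n))


def kfold_splits(n, k):
    """Contiguous k-fold cross-validation over n time-ordered observations.

    Each fold serves as the test set once, so training data may come from
    after the test period.
    """
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    splits = []
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        splits.append((train, test))
        start = stop
    return splits
```

In the first fold returned by `kfold_splits`, every training index is later than every test index, which is the kind of temporal leakage the out-of-sample split avoids.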

2014

OpenML: networked science in machine learning

Authors
Vanschoren, J; Rijn, JNv; Bischl, B; Torgo, L;

Publication
CoRR

Abstract

2015

Forecasting the Correct Trading Actions

Authors
Baia, L; Torgo, L;

Publication
Progress in Artificial Intelligence

Abstract
This paper addresses the problem of decision making in the context of financial markets. More specifically, it addresses the problem of forecasting the correct trading action for a certain future horizon. We study and compare two alternative ways of addressing these forecasting tasks: (i) using standard numeric prediction models to forecast the variation in the price of the target asset and, in a second stage, transforming these numeric predictions into a decision according to pre-defined decision rules; and (ii) using models that directly forecast the right decision, thus ignoring the intermediate numeric forecasting task. The objective of our study is to determine if both strategies provide identical results or if there is any particular advantage worth being considered that may distinguish each alternative in the context of financial markets.
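
The second stage of the first strategy, mapping a numeric forecast to a trading action, can be sketched as a simple threshold rule. The abstract does not state the decision rules used, so the thresholds (2% up or down), the three action labels, and the function names below are assumptions chosen only to make the idea concrete.

```python
def action_from_forecast(predicted_variation, buy_threshold=0.02, sell_threshold=-0.02):
    """Map a forecast price variation to an action (assumed thresholds)."""
    if predicted_variation >= buy_threshold:
        return "buy"
    if predicted_variation <= sell_threshold:
        return "sell"
    return "hold"


def label_actions(observed_variations, buy_threshold=0.02, sell_threshold=-0.02):
    """Apply the same rule to observed variations to obtain target labels.

    Labels like these are what a model in the second strategy would be
    trained to predict directly, skipping the numeric forecast.
    """
    return [
        action_from_forecast(v, buy_threshold, sell_threshold)
        for v in observed_variations
    ]
```

Under these assumptions, the first strategy calls `action_from_forecast` on a regression model's output, while the second trains a classifier on the output of `label_actions`.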

2014

An Infra-Structure for Performance Estimation and Experimental Comparison of Predictive Models in R

Authors
Torgo, Luis;

Publication
CoRR

Abstract