Publications - INESC TEC

Publications by LIAAD

2021

A test to compare interval time series

Authors
Maharaj, EA; Brito, P; Teles, P

Publication
INTERNATIONAL JOURNAL OF APPROXIMATE REASONING

Abstract
We compare two interval time series (ITS) by testing whether their underlying distributions differ significantly. To perform hypothesis testing, we make use of the discrete wavelet transform (DWT), which decomposes a time series into a set of coefficients over a number of frequency bands or scales. We obtain the DWT of the radius and centre of each of the two ITS at different scales, and perform randomisation tests. A randomisation test requires uncorrelated observations; this condition is approximately satisfied because, at each scale, the DWT coefficients are approximately uncorrelated with each other. Our proposed test statistic is the ratio of the determinants of the covariance matrices of the radius and centre DWTs of the two ITS, at each scale. This test statistic ensures that the variability between the upper and lower bounds of each ITS is encompassed. Simulation studies conducted to evaluate the performance of the test show reasonably good estimates of size and power under most conditions, and applications to real interval time series reveal the practical usefulness of this test.
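The idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a single-level Haar transform (the paper works across several scales), an absolute log-ratio as the two-sided form of the statistic, and a simple pooling scheme for the randomisation step. All function names are hypothetical.

```python
import math
import random

def haar_detail(x):
    """Level-1 Haar detail coefficients (a trailing odd element is dropped)."""
    return [(x[i] - x[i + 1]) / math.sqrt(2) for i in range(0, len(x) - 1, 2)]

def coeff_pairs(lower, upper):
    """Pair the centre and radius detail coefficients of one interval series."""
    centre = [(l + u) / 2 for l, u in zip(lower, upper)]
    radius = [(u - l) / 2 for l, u in zip(lower, upper)]
    return list(zip(haar_detail(centre), haar_detail(radius)))

def cov_det(pairs):
    """Determinant of the 2x2 sample covariance matrix of (centre, radius) pairs."""
    n = len(pairs)
    mc = sum(c for c, _ in pairs) / n
    mr = sum(r for _, r in pairs) / n
    scc = sum((c - mc) ** 2 for c, _ in pairs) / (n - 1)
    srr = sum((r - mr) ** 2 for _, r in pairs) / (n - 1)
    scr = sum((c - mc) * (r - mr) for c, r in pairs) / (n - 1)
    return scc * srr - scr ** 2

def randomisation_test(a, b, n_perm=999, seed=0):
    """Approximate p-value for the ratio of covariance determinants.

    Assumption: coefficient pairs are pooled and reassigned at random,
    and |log(ratio)| is used so that the test is two-sided.
    """
    rng = random.Random(seed)
    observed = abs(math.log(cov_det(a) / cov_det(b)))
    pooled = a + b
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        ratio = cov_det(pooled[:len(a)]) / cov_det(pooled[len(a):])
        if abs(math.log(ratio)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Given lower and upper bound sequences for two series, `coeff_pairs` would be applied to each and the results passed to `randomisation_test`. The sketch assumes non-degenerate data, so that neither determinant is zero.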


2021

MAINT.Data: Modelling and Analysing Interval Data in R

Authors
Silva, APD; Brito, P; Filzmoser, P; Dias, JG

Publication
R JOURNAL

Abstract
We present the CRAN R package MAINT.Data for the modelling and analysis of multivariate interval data, i.e., where units are described by variables whose values are intervals of ℝ, representing intrinsic variability. Parametric inference methodologies based on probabilistic models for interval variables have been developed, where each interval is represented by its midpoint and log-range, for which multivariate Normal and Skew-Normal distributions are assumed. The intrinsic nature of the interval variables leads to special structures of the variance-covariance matrix, which are represented by four different possible configurations. MAINT.Data implements the proposed methodologies in the S4 object system, introducing a specific data class for representing interval data. It includes functions and methods for modelling and analysing interval data, in particular maximum likelihood estimation, statistical tests for the different configurations, (M)ANOVA and Discriminant Analysis. For the Gaussian model, Model-based Clustering, robust estimation, outlier detection and Robust Discriminant Analysis are also available.
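The midpoint and log-range representation mentioned in this abstract can be sketched as follows. This is an illustration in Python, not the MAINT.Data API (the package itself is written in R); the function names are hypothetical, and the sketch assumes non-degenerate intervals so that the logarithm is defined.

```python
import math

def to_mid_logrange(lower, upper):
    """Map an interval [lower, upper] to (midpoint, log of its range)."""
    if upper <= lower:
        raise ValueError("a non-degenerate interval needs upper > lower")
    return (lower + upper) / 2, math.log(upper - lower)

def from_mid_logrange(mid, log_range):
    """Recover the interval bounds from (midpoint, log-range)."""
    half = math.exp(log_range) / 2
    return mid - half, mid + half
```

Taking the logarithm of the range maps a strictly positive quantity onto the whole real line, which makes a Normal or Skew-Normal assumption on the transformed values plausible.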


2021

Novelty Detection in Physical Activity

Authors
Leite, B; Abdalrahman, A; Castro, J; Frade, J; Moreira, J; Soares, C

Publication
ICAART: PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 2

Abstract
Artificial Intelligence (AI) is continuously improving several aspects of our daily lives. Gadgets and monitoring devices are widely used for health and physical activity monitoring. By analyzing large amounts of data and applying Machine Learning (ML) techniques, we have been able to draw useful conclusions in various contexts. Activity Recognition is one of them: it makes it possible to recognize and monitor our daily actions. Traditional systems focus only on detecting pre-established activities according to previously configured parameters, and not on detecting novel ones. However, when applying activity recognizers in real-world applications, it is necessary to detect new activities that were not considered during the training of the model. We propose a method for Novelty Detection in the context of physical activity. Our solution is based on the establishment of a threshold confidence value, which determines whether an activity is novel or not. We built and trained our models by experimenting with three different algorithms and four threshold values. The best results were obtained by using the Random Forest algorithm with a threshold value of 0.8, resulting in 90.9% accuracy and 85.1% precision.
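The confidence-threshold rule described in this abstract can be sketched in a few lines of Python. This is an assumption about how such a rule might be applied, not the authors' code: it takes the class probabilities produced by any classifier and flags the sample as novel when the highest probability falls below the threshold. The function name, the "novel" label, and the activity names in the usage note are hypothetical; only the 0.8 default comes from the abstract.

```python
def predict_with_novelty(probabilities, labels, threshold=0.8):
    """Return the most probable label, or "novel" if confidence is too low.

    probabilities: class probabilities for one sample, aligned with labels.
    threshold: minimum confidence needed to accept a known activity.
    """
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    best = max(range(len(labels)), key=lambda i: probabilities[i])
    if probabilities[best] >= threshold:
        return labels[best]
    return "novel"
```

For example, with labels `["walk", "run", "sit"]`, the probabilities `[0.1, 0.85, 0.05]` yield `"run"`, while `[0.5, 0.3, 0.2]` yield `"novel"` because no class reaches 0.8.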


2021

Micro-MetaStream: Algorithm selection for time-changing data

Authors
Rossi, ALD; Soares, C; de Souza, BF; de Carvalho, ACPDF

Publication
INFORMATION SCIENCES

Abstract
Data stream mining needs to deal with scenarios where data distribution can change over time. As a result, different learning algorithms can be more suitable in different time periods. This paper proposes micro-MetaStream, a meta-learning based method to recommend the most suitable learning algorithm for each new example arriving in a data stream. It is an evolution of MetaStream, which recommends learning algorithms for batches of examples. By using a unitary granularity, micro-MetaStream is able to respond more efficiently to changes in data distribution than its predecessor. The meta-data combines meta-features, characteristics describing recent data, with base-level features, the original variables of the new example. In experiments on real-world regression data streams, micro-MetaStream outperformed MetaStream and a baseline method at the meta-level and frequently improved the predictive performance at the base-level.
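The way the meta-data combines meta-features with base-level features can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the micro-MetaStream implementation: the meta-features here are just the mean and standard deviation of recent target values, and the meta-target is taken to be the base algorithm with the lowest error on the example. All names are hypothetical.

```python
from statistics import mean, pstdev

def meta_features(recent_targets):
    """Summarise recent data (assumed here: mean and std of recent targets)."""
    return [mean(recent_targets), pstdev(recent_targets)]

def build_meta_example(recent_targets, new_example):
    """Concatenate meta-features with the new example's original variables."""
    return meta_features(recent_targets) + list(new_example)

def best_algorithm(errors):
    """Meta-target (assumed): name of the base algorithm with the lowest error."""
    return min(errors, key=errors.get)
```

A meta-learner would then be trained on pairs of `build_meta_example(...)` and `best_algorithm(...)`, and queried once for each arriving example rather than once per batch.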


2021

Discovery Science - 24th International Conference, DS 2021, Halifax, NS, Canada, October 11-13, 2021, Proceedings

Authors
Soares, C; Torgo, L

Publication
DS

Abstract

2021

Empirical Study on the Impact of Different Sets of Parameters of Gradient Boosting Algorithms for Time-Series Forecasting with LightGBM

Authors
Barros, F; Cerqueira, V; Soares, C

Publication
PRICAI 2021: Trends in Artificial Intelligence - 18th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2021, Hanoi, Vietnam, November 8-12, 2021, Proceedings, Part I

Abstract
LightGBM has proven to be an effective forecasting algorithm by winning the M5 forecasting competition. However, given the sensitivity of LightGBM to hyperparameters, it is likely that their default values are not optimal. This work aims to answer whether it is essential to tune the hyperparameters of LightGBM to obtain better accuracy in time series forecasting and whether it can be done efficiently. Our experiments consisted of data collection and processing, hyperparameter generation, and finally testing. We observed that, on the 58 time series tested, the mean squared error is reduced by a maximum of 17.45% when using randomly generated configurations instead of the default one. We also studied the performance of the individual hyperparameters. Based on the results obtained, we propose an alternative set of default LightGBM hyperparameter values to be used when forecasting with time series data. © 2021, Springer Nature Switzerland AG.
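Random generation of hyperparameter configurations, as described in this abstract, might look like the Python sketch below. The parameter names are those of LightGBM's scikit-learn interface, but the sampling ranges are assumptions chosen for illustration and are not the ranges used in the paper. The sketch deliberately stops short of training a model, so it runs without LightGBM installed.

```python
import random

def sample_config(rng):
    """Draw one random LightGBM configuration (ranges are illustrative)."""
    return {
        "num_leaves": rng.randint(8, 256),
        "learning_rate": 10 ** rng.uniform(-3, -0.5),  # log-uniform draw
        "n_estimators": rng.randint(50, 1000),
        "min_child_samples": rng.randint(5, 100),
        "colsample_bytree": rng.uniform(0.5, 1.0),
    }

def mse_reduction_percent(default_mse, tuned_mse):
    """Percentage reduction in MSE relative to the default configuration."""
    return 100.0 * (default_mse - tuned_mse) / default_mse

rng = random.Random(42)  # fixed seed so the draws are reproducible
configs = [sample_config(rng) for _ in range(20)]
```

Each dictionary could be passed as keyword arguments to a LightGBM regressor, with `mse_reduction_percent` comparing the resulting error against the one obtained with default settings.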
