Publications

Publications by LIAAD

2020

Reconciling Predictions in the Regression Setting: An Application to Bus Travel Time Prediction

Authors
Mendes Moreira, J; Baratchi, M;

Publication
ADVANCES IN INTELLIGENT DATA ANALYSIS XVIII, IDA 2020

Abstract
In different application areas, the prediction of values that are hierarchically related is required. As an example, consider predicting the revenue per month and per year of a company, where the prediction for the year should equal the sum of the predictions for the months of that year. The idea of reconciling predictions on grouped time-series has previously been proposed to provide optimal forecasts based on such data. This method, in effect, models the time-series collectively rather than providing a separate model for the time-series at each level. While the idea of reconciliation was originally applicable to data of a time-series nature, it is not clear whether such an approach can also be applied to regression settings where multi-attribute data is available. In this paper, we address this problem by proposing Reconciliation for Regression (R4R), a two-step approach for prediction and reconciliation. To evaluate this method, we test its applicability in the context of Travel Time Prediction (TTP) of bus trips, where two levels of values need to be calculated: (i) travel times of the links between consecutive bus-stops; and (ii) total trip travel time. The results show that R4R can improve the overall results in terms of both link TTP performance and reconciliation between the sum of the link TTPs and the total trip travel time. We compare these results with those obtained using group-based reconciliation methods and show that the proposed reconciliation approach in a regression setting can provide better results in some cases. This method can be generalized to other domains as well.


2020

UnFOOT: Unsupervised Football Analytics Tool

Authors
Coutinho, JC; Moreira, JM; de Sa, CR;

Publication
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT III

Abstract
Labelled football (soccer) data is hard to acquire, and it usually requires humans to annotate the match events. This process makes the data expensive for smaller clubs to obtain. UnFOOT (Unsupervised Football Analytics Tool) combines data mining techniques and basic statistics to measure the performance of players and teams from positional data. The tool's capabilities include preprocessing the match data, extracting features, and visualizing player and team performance. It also has built-in data mining techniques, such as association rule mining, subgroup discovery and a proposed approach to look for frequent distributions.


2020

Comparing State-of-the-Art Neural Network Ensemble Methods in Soccer Predictions

Authors
Neves, TM; Moreira, JM;

Publication
Foundations of Intelligent Systems - 25th International Symposium, ISMIS 2020, Graz, Austria, September 23-25, 2020, Proceedings

Abstract
For many reasons, including sports being one of the main forms of entertainment in the world, online gambling is growing, and growing markets create opportunities to exploit. In this paper, neural network ensemble approaches, such as bagging, random subspace sampling, negative correlation learning and the simple averaging of predictions, are compared. For each of these methods, several combinations of input parameters are evaluated. We used only the expected goals metric as a predictor, since it offers good predictive power while keeping the computational demands low. These models are compared in the soccer (also known as association football) betting context, where we have access to metrics, such as profitability, to analyze the results from multiple perspectives. The results show that the optimal solution is goal-dependent, with the ensemble methods able to increase accuracy by up to 3% over the best single model. The biggest improvement over the single model was obtained by averaging dropout networks. © 2020, Springer Nature Switzerland AG.


2020

kNN Prototyping Schemes for Embedded Human Activity Recognition with Online Learning

Authors
Ferreira, PJS; Cardoso, JMP; Moreira, JM;

Publication
Comput.

Abstract
The kNN machine learning method is widely used as a classifier in Human Activity Recognition (HAR) systems. Although the kNN algorithm works similarly in both online and offline modes, the use of all training instances is much more critical online than offline because of the time and memory restrictions of the online mode. Some methods propose decreasing the high computational costs of kNN by focusing, e.g., on approximate kNN solutions such as the ones relying on Locality-Sensitive Hashing (LSH). However, embedded kNN implementations also need to address the target device’s memory constraints, especially as online classification needs to cope with those constraints to be practical. This paper discusses online approaches to reduce the number of training instances stored in the kNN search space. To address practical implementations of HAR systems using kNN, this paper presents simple, energy- and computationally efficient, real-time feasible schemes that cap, at runtime, the number of training instances stored by kNN. The proposed schemes include policies for substituting training instances so that the search space stays within a maximum size. Experiments in the context of HAR datasets show the efficiency of our best schemes. © 2020 by the authors. Licensee MDPI, Basel, Switzerland.


2020

Hierarchical Qualitative Clustering - clustering mixed datasets with critical qualitative information

Authors
Seca, D; Moreira, JM; Neves, TM; Sousa, R;

Publication
CoRR

Abstract

2020

A Lagrangian Bound on the Clique Number and an Exact Algorithm for the Maximum Edge Weight Clique Problem

Authors
Hosseinian, S; Fontes, DBMM; Butenko, S;

Publication
INFORMS JOURNAL ON COMPUTING

Abstract
This paper explores the connections between the classical maximum clique problem and its edge-weighted generalization, the maximum edge weight clique (MEWC) problem. As a result, a new analytic upper bound on the clique number of a graph is obtained and an exact algorithm for solving the MEWC problem is developed. The bound on the clique number is derived using a Lagrangian relaxation of an integer (linear) programming formulation of the MEWC problem. Furthermore, coloring-based bounds on the clique number are used in a novel upper-bounding scheme for the MEWC problem. This scheme is employed within a combinatorial branch-and-bound framework, yielding an exact algorithm for the MEWC problem. Results of computational experiments demonstrate a superior performance of the proposed algorithm compared with existing approaches.
