Publications

Publications by João Gama

2021

CAUSAL DISCOVERY IN MACHINE LEARNING: THEORIES AND APPLICATIONS

Authors
Nogueira, AR; Gama, J; Ferreira, CA;

Publication
JOURNAL OF DYNAMICS AND GAMES

Abstract
Determining the cause of a particular event has been a subject of study for many researchers over the years. Finding out why an event happens (its cause) means that, for example, if we remove the cause from the equation, we can stop the effect from happening, or if we replicate it, we can create the subsequent effect. Causality can be seen as a means of predicting the future, based on information about past events, and with that, preventing or altering future outcomes. This temporal notion of past and future is often one of the critical points in discovering the causes of a given event. The purpose of this survey is to present a cross-sectional view of the causal discovery domain, with an emphasis on the machine learning/data mining area.

2021

Hyperparameter self-tuning for data streams

Authors
Veloso, B; Gama, J; Malheiro, B; Vinagre, J;

Publication
INFORMATION FUSION

Abstract
The number of Internet of Things devices generating data streams is expected to grow exponentially with the support of emergent technologies such as 5G networks. Therefore, the online processing of these data streams requires the design and development of suitable machine learning algorithms, able to learn online, as data is generated. Like their batch-learning counterparts, stream-based learning algorithms require careful hyperparameter settings. However, this problem is exacerbated in online learning settings, especially with the occurrence of concept drifts, which frequently require the reconfiguration of hyperparameters. In this article, we present SSPT, an extension of the Self Parameter Tuning (SPT) optimisation algorithm for data streams. We apply the Nelder-Mead algorithm to dynamically-sized samples, converging to optimal settings in a single pass over data while using a relatively small number of hyperparameter configurations. In addition, our proposal automatically readjusts hyperparameters when concept drift occurs. To assess the effectiveness of SSPT, the algorithm is evaluated with three different machine learning problems: recommendation, regression, and classification. Experiments with well-known data sets show that the proposed algorithm can outperform previous hyperparameter tuning efforts by human experts. Results also show that SSPT converges significantly faster and presents at least similar accuracy when compared with the previous double-pass version of the SPT algorithm.
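
The abstract above names the Nelder-Mead algorithm as the search procedure over hyperparameter configurations. As a rough illustration of that simplex search only, here is a minimal sketch in plain Python. It is not SSPT: it has no dynamically-sized samples, no single-pass stream handling and no drift detection. The function `toy_validation_error`, its two parameters and its optimum are hypothetical stand-ins for a real model's validation error.

```python
def nelder_mead(f, start, step=0.5, iters=200,
                alpha=1.0, gamma=2.0, rho=0.5, sigma=0.5):
    """Minimise f with a simplified Nelder-Mead simplex search.

    Assumption: a textbook variant with reflection, expansion,
    inside contraction and shrink; not the SSPT procedure.
    """
    n = len(start)
    # Initial simplex: the start point plus one offset point per dimension.
    simplex = [list(start)]
    for i in range(n):
        point = list(start)
        point[i] += step
        simplex.append(point)

    for _ in range(iters):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        # Centroid of every point except the worst one.
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]
        reflected = [c + alpha * (c - w) for c, w in zip(centroid, worst)]

        if f(best) <= f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        elif f(reflected) < f(best):
            expanded = [c + gamma * (r - c) for c, r in zip(centroid, reflected)]
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        else:
            contracted = [c + rho * (w - c) for c, w in zip(centroid, worst)]
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every point towards the current best point.
                simplex = [best] + [
                    [b + sigma * (x - b) for b, x in zip(best, p)]
                    for p in simplex[1:]
                ]
    return min(simplex, key=f)


def toy_validation_error(params):
    """Hypothetical error surface with its minimum at (0.1, 0.01)."""
    learning_rate, regularisation = params
    return (learning_rate - 0.1) ** 2 + (regularisation - 0.01) ** 2


best_params = nelder_mead(toy_validation_error, [0.5, 0.5])
```

In a streaming setting the objective would be re-estimated on incoming data rather than being a fixed function, which is the part this sketch leaves out.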

2021

Advances in Intelligent Data Analysis XIX - 19th International Symposium on Intelligent Data Analysis, IDA 2021, Porto, Portugal, April 26-28, 2021, Proceedings

Authors
Abreu, PH; Rodrigues, PP; Fernández, A; Gama, J;

Publication
IDA

Abstract

2020

Using Network Features for Credit Scoring in MicroFinance: Extended Abstract

Authors
Paraíso, P; Ruiz, S; Gomes, P; Rodrigues, L; Gama, J;

Publication
2020 IEEE 7TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA 2020)

Abstract
This paper uses non-traditional data from a MicroFinance Institution (MFI) in a credit scoring loan classification problem and addresses a common problem in emerging markets: the lack of verifiable customer credit histories. We perform a set of experiments to define a baseline model and prove the relevance of node embedding features in credit scoring models using a real-world dataset.

2021

Non-Intrusive Load Monitoring for Household Disaggregated Energy Sensing

Authors
Paulos, JP; Fidalgo, JN; Gama, J;

Publication
2021 IEEE MADRID POWERTECH

Abstract
The present work aims to compare several load disaggregation methods. While the supervised alternative was found to be the most competent, the semi-supervised alternative proved to be close in terms of potential, and the unsupervised alternative seems insufficient. By the same token, the tests with long-lasting data prove beneficial for confirming long-term performance, since no significant loss of performance is observed as the time horizon is extended. Finally, the combination of new parametrization and methodology fine-tuning also proves useful for improving global performance in several methods.

2021

Generalised Partial Association in Causal Rules Discovery

Authors
Nogueira, AR; Ferreira, C; Gama, J; Pinto, A;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE (EPIA 2021)

Abstract
One of the most significant challenges for machine learning nowadays is the discovery of causal relationships from data. This causal discovery is commonly performed using Bayesian-like algorithms. However, more recently, a growing number of causal discovery algorithms have appeared that do not fall into this category. In this paper, we present a new algorithm that explores global causal association rules with the Uncertainty Coefficient. Our algorithm, CRPA-UC, is a global structure discovery approach that combines the advantages of association mining with causal discovery and can be applied to binary and non-binary discrete data. This approach was compared to the PC algorithm on several well-known data sets, using several metrics.
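
The abstract does not say how CRPA-UC uses the Uncertainty Coefficient, so the following is only a sketch of the standard definition (also known as Theil's U) for two discrete variables: U(X|Y) = (H(X) - H(X|Y)) / H(X). The function names are illustrative and are not taken from the paper.

```python
from collections import Counter
from math import log


def entropy(values):
    """Shannon entropy H(X) of a sequence of discrete values (natural log)."""
    n = len(values)
    return -sum((c / n) * log(c / n) for c in Counter(values).values())


def conditional_entropy(x, y):
    """Conditional entropy H(X | Y) estimated from paired observations."""
    n = len(x)
    joint_counts = Counter(zip(x, y))
    y_counts = Counter(y)
    return -sum(
        (c / n) * log(c / y_counts[y_value])
        for (_, y_value), c in joint_counts.items()
    )


def uncertainty_coefficient(x, y):
    """U(X | Y): the fraction of the uncertainty in X explained by Y.

    Ranges from 0 (Y tells nothing about X) to 1 (Y determines X).
    Convention assumed here: return 1.0 when X is constant.
    """
    h_x = entropy(x)
    if h_x == 0:
        return 1.0
    return (h_x - conditional_entropy(x, y)) / h_x
```

The measure is asymmetric: U(X|Y) and U(Y|X) generally differ, and it applies to binary and non-binary discrete data alike.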
