Publications

Publications by João Gama

2023

Machine Learning and Principles and Practice of Knowledge Discovery in Databases - International Workshops of ECML PKDD 2022, Grenoble, France, September 19-23, 2022, Proceedings, Part II

Authors
Koprinska, I; Mignone, P; Guidotti, R; Jaroszewicz, S; Fröning, H; Gullo, F; Ferreira, PM; Roqueiro, D; Ceddia, G; Nowaczyk, S; Gama, J; Ribeiro, RP; Gavaldà, R; Masciari, E; Ras, ZW; Ritacco, E; Naretto, F; Theissler, A; Biecek, P; Verbeke, W; Schiele, G; Pernkopf, F; Blott, M; Bordino, I; Danesi, IL; Ponti, G; Severini, L; Appice, A; Andresini, G; Medeiros, I; Graça, G; Cooper, L; Ghazaleh, N; Richiardi, J; Miranda, DS; Sechidis, K; Canakoglu, A; Pidò, S; Pinoli, P; Bifet, A; Pashami, S;

Publication
PKDD/ECML Workshops (2)

2023

Machine Learning and Principles and Practice of Knowledge Discovery in Databases - International Workshops of ECML PKDD 2022, Grenoble, France, September 19-23, 2022, Proceedings, Part I

Authors
Koprinska, I; Mignone, P; Guidotti, R; Jaroszewicz, S; Fröning, H; Gullo, F; Ferreira, PM; Roqueiro, D; Ceddia, G; Nowaczyk, S; Gama, J; Ribeiro, RP; Gavaldà, R; Masciari, E; Ras, ZW; Ritacco, E; Naretto, F; Theissler, A; Biecek, P; Verbeke, W; Schiele, G; Pernkopf, F; Blott, M; Bordino, I; Danesi, IL; Ponti, G; Severini, L; Appice, A; Andresini, G; Medeiros, I; Graça, G; Cooper, L; Ghazaleh, N; Richiardi, J; Miranda, DS; Sechidis, K; Canakoglu, A; Pidò, S; Pinoli, P; Bifet, A; Pashami, S;

Publication
PKDD/ECML Workshops (1)

2022

Turning the Tables: Biased, Imbalanced, Dynamic Tabular Datasets for ML Evaluation

Authors
Jesus, S; Pombal, J; Alves, D; Cruz, AF; Saleiro, P; Ribeiro, RP; Gama, J; Bizarro, P;

Publication
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022

2022

The MetroPT dataset for predictive maintenance

Authors
Veloso, B; Gama, J; Ribeiro, RP; Pereira, PM;

Publication
SCIENTIFIC DATA

Abstract
The paper describes the MetroPT data set, an outcome of a Predictive Maintenance project with an urban metro public transportation service in Porto, Portugal. The data was collected in 2022 to develop machine learning methods for online anomaly detection and failure prediction. Several analog sensor signals (pressure, temperature, current consumption), digital signals (control signals, discrete signals), and GPS information (latitude, longitude, and speed) provide a framework that can be easily used and help the development of new machine learning methods. This dataset contains some interesting characteristics and can be a good benchmark for predictive maintenance models.

2025

Fed-VFDT: Federated Very Fast Decision Trees with Coordinated Splitting Over Data Streams

Authors
Silva, PR; Vinagre, J; Gama, J;

Publication
ICTAI

Abstract
We introduce Fed-VFDT, a federated adaptation of the Very Fast Decision Tree (VFDT) algorithm for classification over streaming data. While VFDT is a widely adopted online learning algorithm, its sequential and order-sensitive nature poses challenges in federated settings, marked by statistical heterogeneity and communication constraints. Fed-VFDT addresses these issues by having each client incrementally train a local VFDT and report split statistics to a central server when a leaf satisfies the Hoeffding criterion. The server selects a global splitting feature by aggregating clients' proposals according to a configurable strategy: quorum, merit-based selection, or majority voting. Once a feature is selected, it is broadcast to all clients, which apply the split at the corresponding tree path using their locally computed thresholds. We evaluate Fed-VFDT against its centralized counterpart using predictive and structural metrics, demonstrating that it maintains comparable performance while reducing communication and preserving synchronized tree growth.
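The split protocol summarized in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`hoeffding_bound`, `client_proposal`, `majority_vote`), the default `delta`, and the merit values are hypothetical, and only the majority-voting strategy is shown. The Hoeffding bound itself is the standard one used by VFDT, epsilon = sqrt(R^2 ln(1/delta) / (2n)).

```python
import math
from collections import Counter
from typing import Optional


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Standard Hoeffding bound: sqrt(R^2 * ln(1/delta) / (2n))."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def client_proposal(merits: dict, n: int,
                    value_range: float = 1.0,
                    delta: float = 1e-7) -> Optional[str]:
    """Hypothetical client-side check for one leaf.

    `merits` maps feature name -> split merit (e.g. information gain)
    computed from the n examples seen at the leaf. A feature is proposed
    only when the gap between the two best merits exceeds the bound.
    """
    if len(merits) < 2:
        return None
    ranked = sorted(merits.items(), key=lambda kv: kv[1], reverse=True)
    (best, g1), (_, g2) = ranked[0], ranked[1]
    if g1 - g2 > hoeffding_bound(value_range, delta, n):
        return best
    return None


def majority_vote(proposals: list) -> Optional[str]:
    """Hypothetical server-side aggregation: most frequently proposed feature."""
    votes = Counter(p for p in proposals if p is not None)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Three illustrative clients; the third has seen too few examples to propose.
proposals = [
    client_proposal({"pressure": 0.50, "speed": 0.30}, n=1000),
    client_proposal({"pressure": 0.45, "speed": 0.20}, n=1500),
    client_proposal({"pressure": 0.40, "speed": 0.35}, n=10),
]
global_feature = majority_vote(proposals)
```

In this sketch the server would broadcast `global_feature` and each client would split on it using its own locally computed threshold, as the abstract describes; the quorum and merit-based strategies would replace `majority_vote`.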

2025

Bridging Streaming Continual Learning via In-Context Large Tabular Models

Authors
Lourenço, A; Gama, J; Xing, EP; Marreiros, G;

Publication
CoRR

Abstract
In streaming scenarios, models must learn continuously, adapting to concept drifts without erasing previously acquired knowledge. However, existing research communities address these challenges in isolation. Continual Learning (CL) focuses on long-term retention and mitigating catastrophic forgetting, often without strict real-time constraints. Stream Learning (SL) emphasizes rapid, efficient adaptation to high-frequency data streams, but typically neglects forgetting. Recent efforts have tried to combine these paradigms, yet no clear algorithmic overlap exists. We argue that large in-context tabular models (LTMs) provide a natural bridge for Streaming Continual Learning (SCL). In our view, unbounded streams should be summarized on-the-fly into compact sketches that can be consumed by LTMs. This recovers the classical SL motivation of compressing massive streams with fixed-size guarantees, while simultaneously aligning with the experience-replay desiderata of CL. To clarify this bridge, we show how the SL and CL communities implicitly adopt a divide-to-conquer strategy to manage the tension between plasticity (performing well on the current distribution) and stability (retaining past knowledge), while also imposing a minimal complexity constraint that motivates diversification (avoiding redundancy in what is stored) and retrieval (re-prioritizing past information when needed). Within this perspective, we propose structuring SCL with LTMs around two core principles of data selection for in-context learning: (1) distribution matching, which balances plasticity and stability, and (2) distribution compression, which controls memory size through diversification and retrieval mechanisms. © 2026 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
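The abstract does not say which sketch is used to summarize the stream. As one possible illustration of a fixed-size summary, the following uses classic reservoir sampling (Algorithm R), a technique swapped in here for the example and not taken from the paper. The class name `ReservoirSketch` and its parameters are hypothetical.

```python
import random


class ReservoirSketch:
    """Fixed-size uniform sample of an unbounded stream (Algorithm R)."""

    def __init__(self, capacity: int, seed: int = 0):
        self.capacity = capacity
        self.items = []   # at most `capacity` stored examples
        self.seen = 0     # total number of examples observed so far
        self._rng = random.Random(seed)

    def update(self, example) -> None:
        """Observe one example; memory use never exceeds `capacity`."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Keep the new example with probability capacity / seen,
            # overwriting a uniformly chosen stored slot.
            j = self._rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example


sketch = ReservoirSketch(capacity=32)
for t in range(1000):
    sketch.update((t, t % 7))  # illustrative (timestamp, feature) rows
context_rows = sketch.items    # could serve as in-context examples for a tabular model
```

A uniform sample tends to favor stability, since old and new examples are kept with equal probability. The distribution-matching and retrieval principles named in the abstract would require a selection rule beyond this baseline.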
