2016
Authors
Branco, P; Ribeiro, RP; Torgo, L;
Publication
CoRR
Abstract
2025
Authors
Aminian, E; Ribeiro, RP; Gama, J;
Publication
MACHINE LEARNING
Abstract
Imbalanced domains pose a significant challenge in real-world predictive analytics, particularly in the context of regression. While existing research has primarily focused on batch learning from static datasets, limited attention has been given to imbalanced regression in online learning scenarios. Intending to address this gap, in prior work, we proposed sampling strategies based on Chebyshev's inequality as the first methodologies designed explicitly for data streams. However, these approaches operated under the restrictive assumption that rare instances exclusively reside at distribution extremes. This study introduces histogram-based sampling strategies to overcome this constraint, proposing flexible solutions for imbalanced regression in evolving data streams. The proposed techniques - Histogram-based Undersampling (HistUS) and Histogram-based Oversampling (HistOS) - employ incremental online histograms to dynamically detect and prioritize rare instances across arbitrary regions of the target distribution to improve predictions in the rare cases. Comprehensive experiments on synthetic and real-world benchmarks demonstrate that HistUS and HistOS substantially improve rare-case prediction accuracy, outperforming baseline models while maintaining competitiveness with Chebyshev-based approaches.
2025
Authors
Ribeiro, RP; Pfahringer, B; Japkowicz, N; Larrañaga, P; Jorge, AM; Soares, C; Abreu, PH; Gama, J;
Publication
Lecture Notes in Computer Science
Abstract
2023
Authors
Nogueira, B; Menezes, GM; Ribeiro, RP; Moniz, N;
Publication
Discover Data
Abstract