Details
Name
Rita Paula Ribeiro
Role
Senior Researcher
Since
01 January 2008
Nationality
Portugal
Centre
Laboratório de Inteligência Artificial e Apoio à Decisão
Contacts
+351220402963
rita.p.ribeiro@inesctec.pt
2026
Authors
Pfahringer, B; Japkowicz, N; Larrañaga, P; Ribeiro, RP; Dutra, I; Pechenizkiy, M; Cortez, P; Pashami, S; Jorge, AM; Soares, C; Abreu, PH; Gama, J;
Publication
ECML/PKDD (8)
Abstract
2026
Authors
Ribeiro, P; Japkowicz, N; Jorge, AM; Soares, C; Abreu, PH; Pfahringer, B; Gama, MP; Larrañaga, P; Dutra, I; Pechenizkiy, M; Pashami, S; Cortez, P;
Publication
Lecture Notes in Computer Science
Abstract
[No abstract available]
2026
Authors
Ribeiro, RP; Pfahringer, B; Japkowicz, N; Larrañaga, P; Jorge, AM; Soares, C; Abreu, PH; Gama, J;
Publication
ECML/PKDD (4)
Abstract
2026
Authors
Ribeiro, RP; Pfahringer, B; Japkowicz, N; Larrañaga, P; Jorge, AM; Soares, C; Abreu, PH; Gama, J;
Publication
ECML/PKDD (1)
Abstract
2026
Authors
Pinheiro, AP; Ribeiro, RP;
Publication
IDA
Abstract
Handling imbalanced target distributions in regression poses a persistent challenge, as the underrepresentation of relevant target values can significantly hinder model performance. Existing data-level solutions often adapt classification-oriented techniques, introducing arbitrary thresholds over the continuous target and leading to artificial and potentially misleading problem formulations. Deep generative models offer flexible sample synthesis but are computationally intensive and difficult to interpret. We propose a CART-based synthetic sampling method specifically designed for imbalanced regression on tabular data. The method integrates relevance- and density-guided sampling to address sparse target regions without thresholding, and employs a feature-driven tree structure to generate realistic tabular samples across heterogeneous features and non-linear interactions. Experiments on benchmark datasets for extreme-value prediction show that the proposed approach is competitive with state-of-the-art resampling and generative methods while offering faster execution and greater transparency. These results highlight its potential as a scalable and interpretable data-level strategy for improving regression models in imbalanced domains. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
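The abstract above gives no algorithmic details, so the following is only a minimal sketch of one way density-guided seed selection could work without a threshold on the target. It is not the authors' method. The Gaussian kernel density estimate is a stand-in assumption, the function names (`kde_density`, `rarity_weights`, `pick_seed_indices`) and the toy data are hypothetical, and the CART-based generation step is not shown.

```python
import math
import random


def kde_density(targets, bandwidth):
    """Gaussian kernel density estimate evaluated at every target value."""
    n = len(targets)
    norm = n * bandwidth * math.sqrt(2 * math.pi)
    return [
        sum(math.exp(-0.5 * ((y - other) / bandwidth) ** 2) for other in targets) / norm
        for y in targets
    ]


def rarity_weights(targets, bandwidth=1.0):
    """Sampling weights inversely proportional to target density, summing to 1.

    Assumption: rarity is measured by a plain KDE; no cut-off splits the
    target into 'rare' and 'common' classes.
    """
    inverse = [1.0 / d for d in kde_density(targets, bandwidth)]
    total = sum(inverse)
    return [w / total for w in inverse]


def pick_seed_indices(targets, k, bandwidth=1.0, seed=0):
    """Draw k seed indices with replacement, favouring sparse target regions."""
    rng = random.Random(seed)
    weights = rarity_weights(targets, bandwidth)
    return rng.choices(range(len(targets)), weights=weights, k=k)


# Toy target vector (hypothetical): seven values near 1.0 and one extreme value.
targets = [1.0, 1.1, 0.9, 1.2, 1.0, 0.8, 1.1, 9.5]
weights = rarity_weights(targets)
seeds = pick_seed_indices(targets, k=5)
```

In this sketch the extreme value receives the largest weight, so it is the most likely seed for any later synthesis step. A tree-based generator, such as the one the abstract describes, would then produce new feature vectors around the selected seeds.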