Details

  • Name

    Rita Paula Ribeiro
  • Position

    Senior Researcher
  • Since

    01 January 2008
011
Publications

2026

Machine Learning and Knowledge Discovery in Databases. Research Track and Applied Data Science Track - European Conference, ECML PKDD 2025, Porto, Portugal, September 15-19, 2025, Proceedings, Part VIII

Authors
Pfahringer, B; Japkowicz, N; Larrañaga, P; Ribeiro, RP; Dutra, I; Pechenizkiy, M; Cortez, P; Pashami, S; Jorge, AM; Soares, C; Abreu, PH; Gama, J;

Publication
ECML/PKDD (8)

Abstract

2026

Preface

Authors
Ribeiro, P; Japkowicz, N; Jorge, AM; Soares, C; Abreu, PH; Pfahringer, B; Gama, MP; Larrañaga, P; Dutra, I; Pechenizkiy, M; Pashami, S; Cortez, P;

Publication
Lecture Notes in Computer Science

Abstract
[No abstract available]

2026

Machine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2025, Porto, Portugal, September 15-19, 2025, Proceedings, Part IV

Authors
Ribeiro, RP; Pfahringer, B; Japkowicz, N; Larrañaga, P; Jorge, AM; Soares, C; Abreu, PH; Gama, J;

Publication
ECML/PKDD (4)

Abstract

2026

Machine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2025, Porto, Portugal, September 15-19, 2025, Proceedings, Part I

Authors
Ribeiro, RP; Pfahringer, B; Japkowicz, N; Larrañaga, P; Jorge, AM; Soares, C; Abreu, PH; Gama, J;

Publication
ECML/PKDD (1)

Abstract

2026

CARTGen-IR: Synthetic Tabular Data Generation for Imbalanced Regression

Authors
Pinheiro, AP; Ribeiro, RP;

Publication
IDA

Abstract
Handling imbalanced target distributions in regression poses a persistent challenge, as the underrepresentation of relevant target values can significantly hinder model performance. Existing data-level solutions often adapt classification-oriented techniques, introducing arbitrary thresholds over the continuous target and leading to artificial and potentially misleading problem formulations. Deep generative models offer flexible sample synthesis but are computationally intensive and difficult to interpret. We propose a CART-based synthetic sampling method specifically designed for imbalanced regression on tabular data. The method integrates relevance- and density-guided sampling to address sparse target regions without thresholding, and employs a feature-driven tree structure to generate realistic tabular samples across heterogeneous features and non-linear interactions. Experiments on benchmark datasets for extreme-value prediction show that the proposed approach is competitive with state-of-the-art resampling and generative methods while offering faster execution and greater transparency. These results highlight its potential as a scalable and interpretable data-level strategy for improving regression models in imbalanced domains. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
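
To make the idea of density-guided sampling without thresholding more concrete, the following is a minimal Python sketch. It is not the CARTGen-IR procedure: the abstract does not specify the relevance function, the density estimator, or how the tree generates samples, so everything here is an assumption made for illustration. The sketch weights each training example by the inverse of a Gaussian kernel density estimate of its target value, so examples in sparse target regions are picked more often as seeds for synthesis, with no cut-off separating "rare" from "common" values. The function names, the bandwidth value, and the toy targets are all hypothetical.

```python
import math
import random


def inverse_density_weights(targets, bandwidth):
    """Return normalized weights proportional to 1 / KDE(target).

    Assumption: a Gaussian kernel density estimate over the continuous
    target stands in for whatever density/relevance measure the paper uses.
    """
    n = len(targets)
    norm = n * bandwidth * math.sqrt(2.0 * math.pi)
    weights = []
    for y in targets:
        # The self term contributes exp(0) = 1, so the density is never zero.
        density = sum(
            math.exp(-0.5 * ((y - other) / bandwidth) ** 2) for other in targets
        ) / norm
        weights.append(1.0 / density)
    total = sum(weights)
    return [w / total for w in weights]


def sample_seed_indices(targets, n_samples, bandwidth=1.0, seed=0):
    """Draw indices of seed examples, favoring sparse target regions."""
    rng = random.Random(seed)
    weights = inverse_density_weights(targets, bandwidth)
    return rng.choices(range(len(targets)), weights=weights, k=n_samples)


# Toy target vector (hypothetical): six values near 1.0 and one extreme value.
targets = [1.0, 1.1, 0.9, 1.2, 1.05, 0.95, 8.0]
weights = inverse_density_weights(targets, bandwidth=0.5)
seeds = sample_seed_indices(targets, n_samples=1000, bandwidth=0.5)
```

In this toy setup the extreme value 8.0 receives the largest weight and is drawn as a seed far more often than any single value near 1.0. A generator in the spirit of the abstract would then synthesize new tabular rows around the selected seeds, for example from the leaves of a fitted regression tree; that step is left out because the abstract gives no details about it.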