Details

  • Name

    Carlos Manuel Soares
  • Cluster

    Computer Science
  • Position

    External Research Collaborator
  • Since

    1 January 2008
Publications

2022

Meta-features for meta-learning

Authors
Rivolli, A; Garcia, LPF; Soares, C; Vanschoren, J; de Carvalho, ACPLF;

Publication
KNOWLEDGE-BASED SYSTEMS

Abstract
Meta-learning is increasingly used to support the recommendation of machine learning algorithms and their configurations. These recommendations are made based on meta-data, consisting of performance evaluations of algorithms and characterizations on prior datasets. These characterizations, also called meta-features, describe properties of the data which are predictive for the performance of machine learning algorithms trained on them. Unfortunately, despite being used in many studies, meta-features are not uniformly described, organized and computed, making many empirical studies irreproducible and hard to compare. This paper aims to deal with this by systematizing and standardizing data characterization measures for classification datasets used in meta-learning. Moreover, it presents an extensive list of meta-features and characterization tools, which can be used as a guide for new practitioners. By identifying particularities and subtle issues related to the characterization measures, this survey points out possible future directions that the development of meta-features for meta-learning can assume. © 2022 Elsevier B.V.

2022

Multidimensional Subgroup Discovery on Event Logs

Authors
Ribeiro, J; Fontes, T; Soares, C; Borges, J;

Publication
SSRN Electronic Journal

Abstract

2022

On the joint-effect of class imbalance and overlap: a critical review

Authors
Santos, MS; Abreu, PH; Japkowicz, N; Fernandez, A; Soares, C; Wilk, S; Santos, J;

Publication
ARTIFICIAL INTELLIGENCE REVIEW

Abstract
Current research on imbalanced data recognises that class imbalance is aggravated by other data intrinsic characteristics, among which class overlap stands out as one of the most harmful. The combination of these two problems creates a new and difficult scenario for classification tasks and has been discussed in several research works over the past two decades. In this paper, we argue that despite some insightful information can be derived from related research, the joint-effect of class overlap and imbalance is still not fully understood, and advocate for the need to move towards a unified view of the class overlap problem in imbalanced domains. To that end, we start by performing a thorough analysis of existing literature on the joint-effect of class imbalance and overlap, elaborating on important details left undiscussed on the original papers, namely the impact of data domains with different characteristics and the behaviour of classifiers with distinct learning biases. This leads to the hypothesis that class overlap comprises multiple representations, which are important to accurately measure and analyse in order to provide a full characterisation of the problem. Accordingly, we devise two novel taxonomies, one for class overlap measures and the other for class overlap-based approaches, both resonating with the distinct representations of class overlap identified. This paper therefore presents a global and unique view on the joint-effect of class imbalance and overlap, from precursor work to recent developments in the field. It meticulously discusses some concepts taken as implicit in previous research, explores new perspectives in light of the limitations found, and presents new ideas that will hopefully inspire researchers to move towards a unified view on the problem and the development of suitable strategies for imbalanced and overlapped domains.

2022

On Usefulness of Outlier Elimination in Classification Tasks

Authors
Hetlerovic, D; Popelínský, L; Brazdil, P; Soares, C; Freitas, F;

Publication
Advances in Intelligent Data Analysis XX - 20th International Symposium on Intelligent Data Analysis, IDA 2022, Rennes, France, April 20-22, 2022, Proceedings

Abstract

2022

Density Estimation in High-Dimensional Spaces: A Multivariate Histogram Approach

Authors
Strecht, P; Mendes Moreira, J; Soares, C;

Publication
ADVANCED DATA MINING AND APPLICATIONS, ADMA 2022, PT II

Abstract

Supervised theses

2021

OLTPLakes - Suporte transacional para sistemas distribuídos analíticos [OLTPLakes - Transactional support for distributed analytical systems]

Author
Nelson José Dias Teixeira

Institution
UM

2021

Trustability in data-driven decision models for Public Policy

Author
Sónia Alexandra Carvalho Teixeira

Institution
UP-FEUP

2021

Estudo empírico da variabilidade em sistemas ROS [Empirical study of variability in ROS systems]

Author
Sara Maria Barreira Melo

Institution
UM

2021

Robot Localization and Mapping in Dynamic Underwater Environments

Author
António João Almeida Bernardo Ferreira

Institution
UP-FEUP

2021

Video Based tracking for 3D Scene Analysis

Author
Américo José Rodrigues Pereira

Institution
UP-FEUP