2019
Authors
Santos, MS; Pereira, RC; Costa, AF; Soares, JP; Santos, JAM; Abreu, PH;
Publication
IEEE Access
Abstract
2019
Authors
Frazão, I; Abreu, PH; Cruz, T; Araújo, H; Simões, P;
Publication
Abstract
2019
Authors
Pereira, RC; Abreu, PH; Polisciuc, E; Machado, P;
Publication
Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2019, Volume 3: IVAPP, Prague, Czech Republic, February 25-27, 2019.
Abstract
2019
Authors
Martins, N; Cruz, JM; Cruz, T; Abreu, PH;
Publication
Progress in Artificial Intelligence, 19th EPIA Conference on Artificial Intelligence, EPIA 2019, Vila Real, Portugal, September 3-6, 2019, Proceedings, Part II.
Abstract
2019
Authors
Marques, F; Duarte, H; Santos, J; Domingues, I; Amorim, JP; Abreu, PH;
Publication
SAC '19: Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing
Abstract
The field of machine learning has grown considerably in recent years. There are, however, some problems still to be solved. The characteristics of the training sets, for instance, are known to affect classifier performance. Here, inspired by medical applications, we are interested in studying datasets that are both ordinal and imbalanced. Ordinal datasets have labels for which only the relative ordering between values is significant. Imbalanced datasets have very different numbers of examples per class. Building upon our previous work, we make three new contributions: we (1) extend the number of classifiers, (2) evaluate two techniques to balance the intermediate training sets in binary decomposition methods (often used in multi-class contexts and ordinal ones in particular), and (3) propose a new, iterative, classifier-based oversampling algorithm that we name InCuBAtE. Experiments were conducted on 6 private datasets, concerning the assessment of response to treatment in oncologic diseases, and 15 public datasets widely used in the literature. Compared with our previous work, results improved (or remained the same) for 4 of the 6 private datasets and for 11 of the 15 public datasets.
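To make the idea of imbalanced intermediate training sets concrete, the following is a minimal sketch and not the method of the paper. It assumes one common binary decomposition for ordinal labels (one "label > threshold" problem per class boundary) and uses plain random oversampling as a stand-in balancing step; the abstract does not specify which decomposition or balancing techniques were evaluated, and InCuBAtE itself is not reproduced here. The class names, data, and function names are hypothetical.

```python
import random

def ordered_partitions(labels, classes):
    """Assumed decomposition: one binary target 'label > c' for each
    class c except the last (classes must be given in ordinal order)."""
    rank = {c: i for i, c in enumerate(classes)}
    return {c: [int(rank[y] > rank[c]) for y in labels] for c in classes[:-1]}

def imbalance_ratio(binary):
    """Majority size divided by minority size for a 0/1 target."""
    pos = sum(binary)
    neg = len(binary) - pos
    small, large = sorted((pos, neg))
    return float("inf") if small == 0 else large / small

def random_oversample(features, binary, rng):
    """Stand-in balancing step: duplicate random minority examples
    until both classes of the binary problem have the same size."""
    by_class = {0: [], 1: []}
    for x, y in zip(features, binary):
        by_class[y].append(x)
    minority, majority = sorted(by_class, key=lambda c: len(by_class[c]))
    if not by_class[minority]:
        return list(features), list(binary)  # nothing to duplicate
    deficit = len(by_class[majority]) - len(by_class[minority])
    extra = rng.choices(by_class[minority], k=deficit)
    return list(features) + extra, list(binary) + [minority] * deficit

# Hypothetical ordinal dataset: 12 "low", 6 "medium", 2 "high" examples.
classes = ["low", "medium", "high"]
labels = ["low"] * 12 + ["medium"] * 6 + ["high"] * 2
features = [[i] for i in range(len(labels))]  # placeholder feature vectors

targets = ordered_partitions(labels, classes)
for threshold, binary in targets.items():
    print(f"label > {threshold}: imbalance ratio {imbalance_ratio(binary)}")

balanced_x, balanced_y = random_oversample(
    features, targets["medium"], random.Random(0)
)
print("after oversampling:", imbalance_ratio(balanced_y))
```

The sketch shows why balancing is applied per intermediate problem: each binary target derived from the same ordinal labels can have a very different imbalance ratio.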
2019
Authors
Santos, MS; Pereira, RC; Costa, AF; Soares, JP; Santos, J; Abreu, PH;
Publication
IEEE Access
Abstract
The performance evaluation of imputation algorithms often involves the generation of missing values. Missing values can be inserted in only one feature (univariate configuration) or in several features (multivariate configuration) at different percentages (missing rates) and according to distinct missing mechanisms, namely, missing completely at random, missing at random, and missing not at random. Since the missing data generation process defines the basis for the imputation experiments (configuration, missing rate, and missing mechanism), it is essential that it is appropriately applied; otherwise, conclusions derived from ill-defined setups may be invalid. The goal of this paper is to review the different approaches to synthetic missing data generation found in the literature and discuss their practical details, elaborating on their strengths and weaknesses. Our analysis revealed that creating missing at random and missing not at random scenarios in datasets comprising qualitative features is the most challenging issue in the related work and, therefore, should be the focus of future work in the field.
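The three missing mechanisms named above can be illustrated with a minimal sketch for the univariate configuration. This is an assumption-laden illustration and not one of the approaches reviewed in the paper: it uses a simple rank-based rule (remove the target where a driving feature is smallest), which is only one of many possible ways to induce MAR and MNAR. The feature names, data, and function names are hypothetical.

```python
import random

def _blank(rows, target, chosen):
    """Return copies of the rows with `target` set to None at chosen indices."""
    return [
        {**row, target: None} if i in chosen else dict(row)
        for i, row in enumerate(rows)
    ]

def make_mcar(rows, target, rate, rng):
    """Missing completely at random: every row is equally likely to be hit."""
    n_missing = round(rate * len(rows))
    chosen = set(rng.sample(range(len(rows)), n_missing))
    return _blank(rows, target, chosen)

def make_mar(rows, target, driver, rate):
    """Missing at random: missingness in `target` depends on another,
    fully observed feature (`driver`). Assumed rule: lowest driver values."""
    n_missing = round(rate * len(rows))
    order = sorted(range(len(rows)), key=lambda i: rows[i][driver])
    return _blank(rows, target, set(order[:n_missing]))

def make_mnar(rows, target, rate):
    """Missing not at random: missingness depends on the target's own
    (soon to be unobserved) values. Assumed rule: lowest target values."""
    return make_mar(rows, target, driver=target, rate=rate)

# Hypothetical dataset with two quantitative features.
rng = random.Random(42)
rows = [{"age": 20 + i, "income": rng.randint(10, 99)} for i in range(10)]

mcar_rows = make_mcar(rows, "income", rate=0.3, rng=random.Random(0))
mar_rows = make_mar(rows, "income", driver="age", rate=0.3)
mnar_rows = make_mnar(rows, "income", rate=0.3)

for name, result in [("MCAR", mcar_rows), ("MAR", mar_rows), ("MNAR", mnar_rows)]:
    print(name, [r["income"] for r in result])
```

All three functions remove the same fraction of values (the missing rate); they differ only in which rows are selected. The rank-based rule relies on ordered values, which is one way to see why qualitative features make MAR and MNAR scenarios harder to set up.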