2019
Authors
Marques, F; Duarte, H; Santos, J; Domingues, I; Amorim, JP; Abreu, PH;
Publication
SAC '19: PROCEEDINGS OF THE 34TH ACM/SIGAPP SYMPOSIUM ON APPLIED COMPUTING
Abstract
The machine learning field has grown considerably in recent years. There are, however, some problems still to be solved. The characteristics of the training sets, for instance, are known to affect classifier performance. Here, inspired by medical applications, we are interested in studying datasets that are both ordinal and imbalanced. Ordinal datasets present labels where only the relative ordering between different values is significant. Imbalanced datasets have very different numbers of examples per class. Building upon our previous work, we make three new contributions: (1) extend the number of classifiers, (2) evaluate two techniques to balance intermediate training sets in binary decomposition methods (often used in multi-class contexts and ordinal ones in particular), and (3) propose a new, iterative, classifier-based oversampling algorithm that we name InCuBAtE. Experiments were run on 6 private datasets, concerning the assessment of response to treatment in oncologic diseases, and 15 public datasets widely used in the literature. When compared with our previous work, results improved (or remained the same) for 4 of the 6 private datasets and for 11 of the 15 public datasets.
2019
Authors
Santos, MS; Pereira, RC; Costa, AF; Soares, JP; Santos, J; Abreu, PH;
Publication
IEEE ACCESS
Abstract
The performance evaluation of imputation algorithms often involves the generation of missing values. Missing values can be inserted in only one feature (univariate configuration) or in several features (multivariate configuration) at different percentages (missing rates) and according to distinct missing mechanisms, namely, missing completely at random, missing at random, and missing not at random. Since the missing data generation process defines the basis for the imputation experiments (configuration, missing rate, and missing mechanism), it is essential that it is appropriately applied; otherwise, conclusions derived from ill-defined setups may be invalid. The goal of this paper is to review the different approaches to synthetic missing data generation found in the literature and discuss their practical details, elaborating on their strengths and weaknesses. Our analysis revealed that creating missing at random and missing not at random scenarios in datasets comprising qualitative features is the most challenging issue in the related work and, therefore, should be the focus of future work in the field.
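The three missing mechanisms named in the abstract above can be illustrated with a minimal sketch for the univariate configuration. This is not code from the paper: the function names (`mcar`, `mar`, `mnar`), the `driver` parameter, and the rank-based rule "remove the lowest values" are assumptions chosen for brevity, and the paper reviews other ways of implementing each mechanism.

```python
import random


def _mask(rows, target, chosen):
    # Return copies of the rows with `target` set to None at the chosen indices.
    return [
        {**row, target: None} if i in chosen else dict(row)
        for i, row in enumerate(rows)
    ]


def mcar(rows, target, rate, rng):
    # Missing completely at random: every row is equally likely to lose `target`.
    n_missing = round(rate * len(rows))
    chosen = set(rng.sample(range(len(rows)), n_missing))
    return _mask(rows, target, chosen)


def mar(rows, target, driver, rate):
    # Missing at random: missingness in `target` depends on another, observed
    # feature (`driver`). Assumed rule: rows with the lowest driver values lose it.
    n_missing = round(rate * len(rows))
    order = sorted(range(len(rows)), key=lambda i: rows[i][driver])
    return _mask(rows, target, set(order[:n_missing]))


def mnar(rows, target, rate):
    # Missing not at random: missingness depends on the value of `target` itself.
    # Assumed rule: the lowest target values are the ones removed.
    return mar(rows, target, target, rate)
```

Calling `mcar(rows, "income", 0.3, random.Random(0))` on a list of dictionaries with an `income` key would remove that feature from 30% of the rows, independently of any value. `mar(rows, "income", "age", 0.3)` and `mnar(rows, "income", 0.3)` remove the same proportion, but select the rows by the observed `age` or by the hidden `income` values respectively. Both rules presuppose an ordering of the driving feature, which is not available for qualitative features.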
2019
Authors
Domingues, I; Sampaio, IL; Duarte, H; Santos, JAM; Abreu, PH;
Publication
IEEE ACCESS
Abstract
Esophageal cancer is a disease with a high prevalence that can be evaluated by a variety of imaging modalities, including endoscopy, computed tomography, and positron emission tomography. Computer-aided techniques could provide valuable help in the analysis of these images, decreasing the medical workflow time and human errors. The goal of this paper is to review the existing literature on the application of computer vision techniques in the domain of esophageal cancer. After an initial phase where a set of keywords was chosen, the selected terms were used to retrieve papers from four well-known databases: Web of Science, Scopus, PubMed, and Springer. The results were screened by merging identical entries and eliminating out-of-scope works, resulting in 47 selected papers. These were organized according to the image modality. Major results were then summarized and compared, and main shortcomings were identified. It could be concluded that, even though the scientific community has already paid attention to the esophageal cancer problem, there are still several open issues. Two major findings of this review are the nonexistence of works on MRI data and the under-exploration of recent techniques using deep learning strategies, showing the need for further investigation.
2019
Authors
Abreu, PH; Silva, DC; Gomes, A;
Publication
ACM TRANSACTIONS ON COMPUTING EDUCATION
Abstract
Low performance of nontechnical engineering students in programming courses is a problem that remains unsolved. Over the years, many authors have tried to identify the multiple causes for that failure, but there is unanimity on the fact that motivation is a key factor for the acquisition of knowledge by students. To better understand motivation, a new evaluation strategy has been adopted in a second programming course of a nontechnical degree, consisting of 91 students. The goals of the study were to identify if those students felt more motivated to answer multiple-choice questions in comparison to development questions, and what type of question better allows for testing student knowledge acquisition. Possibilities around the motivational qualities of multiple-choice questions in programming courses will be discussed in light of the results. In conclusion, it seems clear that student performance varies according to the type of question. Our study points out that multiple-choice questions can be seen as a motivational factor for engineering students and it might also be a good way to test acquired programming concepts. Therefore, this type of question could be further explored in the evaluation points.
2019
Authors
Pereira, R; Abreu, P; Polisciuc, E; Machado, P;
Publication
PROCEEDINGS OF THE 14TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS - VOL 3: IVAPP
Abstract
Automatic Identification System data has been used in several studies with different aims, such as traffic forecasting, pollution control, or anomalous behavior detection in vessel trajectories. Regarding the last of these, intersections between vessels are often related to abnormal behaviors, but this topic has not yet been explored. In this paper, an approach based on data processing and visualization is introduced to assist domain experts in the task of analyzing these intersections. The approach was tested on a proprietary dataset that covers the Portuguese maritime zone, containing an average of 6460 intersections per day. The results show that several intersections were only noticeable with the visualization strategies proposed here.
2019
Authors
Montagna, S; Silva, DC; Abreu, PH; Ito, M; Schumacher, MI; Vargiu, E;
Publication
ARTIFICIAL INTELLIGENCE IN MEDICINE
Abstract