2020
Authors
Campos, R; Jorge, AM; Jatowt, A; Bhatia, S; Pasquali, A; Cordeiro, JP; Rocha, C; Mansouri, B; Santana, BS;
Publication
SIGIR Forum
Abstract
2020
Authors
Nunes, S; Little, S; Bhatia, S; Boratto, L; Cabanac, G; Campos, R; Couto, FM; Faralli, S; Frommholz, I; Jatowt, A; Jorge, A; Marras, M; Mayr, P; Stilo, G;
Publication
SIGIR Forum
Abstract
2020
Authors
Aminian, E; Ribeiro, RP; Gama, J;
Publication
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT II
Abstract
Data are growing fast in today's world and great portion of that is in the form of stream. In many situations, data streams are imbalanced making it difficult to use with classical data mining methods. However, mining these special kinds of streams is one of the most attractive research area. In this paper, we propose two algorithms for learning from imbalanced regression data streams. Both methods are based on Chebychev's inequality but in a different way. The first method, under-samples from the frequent target value examples while the second method over-samples the rare and extreme target value examples. This way, the learner will focus in the rare and more difficult cases. We applied our methods to train regression models using two benchmark datasets and two well-known regression algorithms: Perceptron and FIMT-DD. Our obtained results from the simulations indicate the usefulness of our proposed methods.
2020
Authors
Vasconcelos, PB; Ribeiro, RP;
Publication
First International Computer Programming Education Conference, ICPEC 2020, June 25-26, 2020, ESMAD, Vila do Conde, Portugal (Virtual Conference).
Abstract
This paper reports on the use of property-based testing for providing feedback to C programming exercises. Test cases are generated automatically from properties specified in a test script; this not only makes it possible to conduct many tests (thus potentially find more mistakes), but also allows simplifying failed tests cases automatically. We present some experimental validation gathered for an introductory C programming course during the fall semester of 2018 that show significant positive correlations between getting feedback during the semester and the student's results in the final exam. We also discuss some limitations regarding feedback for undefined behaviors in the C language. 2012 ACM Subject Classification Social and professional topics ! Student assessment; Software and its engineering ! Software testing and debugging; Software and its engineering ! Domain specific languages.
2020
Authors
Ribeiro, RP; Moniz, N;
Publication
MACHINE LEARNING
Abstract
Research in imbalanced domain learning has almost exclusively focused on solving classification tasks for accurate prediction of cases labelled with a rare class. Approaches for addressing such problems in regression tasks are still scarce due to two main factors. First, standard regression tasks assume each domain value as equally important. Second, standard evaluation metrics focus on assessing the performance of models on the most common values of data distributions. In this paper, we present an approach to tackle imbalanced regression tasks where the objective is to predict extreme (rare) values. We propose an approach to formalise such tasks and to optimise/evaluate predictive models, overcoming the factors mentioned and issues in related work. We present an automatic and non-parametric method to obtain relevance functions, building on the concept of relevance as the mapping of target values into non-uniform domain preferences. Then, we proposeSERA, a new evaluation metric capable of assessing the effectiveness and of optimising models towards the prediction of extreme values while penalising severe model bias. An experimental study demonstrates howSERAprovides valid and useful insights into the performance of models in imbalanced regression tasks.
2020
Authors
Barros, M; Veloso, B; Pereira, PM; Ribeiro, RP; Gama, J;
Publication
IoT Streams for Data-Driven Predictive Maintenance and IoT, Edge, and Mobile for Embedded Machine Learning - Second International Workshop, IoT Streams 2020, and First International Workshop, ITEM 2020, Co-located with ECML/PKDD 2020, Ghent, Belgium, September 14-18, 2020, Revised Selected Papers
Abstract
The transformation of industrial manufacturing with computers and automation with smart systems leads us to monitor and log of industrial equipment events. It is possible to apply analytic approaches, and to find interpretive results for strategic decision making, providing advantages such as failure detection and predictive maintenance. Over the last years, many researchers have been studying the application of machine learning techniques to improve such tasks. In this context, we develop a system capable of detect anomalies on an Air Production Unit (APU), taking into consideration the peak frequency of each sensor. The study started with the analysis of the sensors installed on the APU, defining its normal behavior and its failure mode. Using that information, we define rules, to monitor the APU, to detect anomalies on its components, and to predict possible failures. The definition of rules was based on the peak frequency analysis, which allowed the setting of boundaries of normality for the APU working modes and, thus, the identification of anomalies. © 2020, Springer Nature Switzerland AG.
The access to the final selection minute is only available to applicants.
Please check the confirmation e-mail of your application to obtain the access code.