2021
Authors
Davari, N; Veloso, B; Costa, GD; Pereira, PM; Ribeiro, RP; Gama, J;
Publication
SENSORS
Abstract
In the last few years, many works have addressed Predictive Maintenance (PdM) using Machine Learning (ML) and Deep Learning (DL) solutions, especially the latter. The monitoring and logging of industrial equipment events, such as temporal behavior and fault events (anomaly detection in time series), can be obtained from records generated by sensors installed in different parts of an industrial plant. However, such progress is incipient because many challenges remain, and the performance of applications depends on the appropriate choice of method. This article presents a survey of existing ML and DL techniques for handling PdM in the railway industry. The survey discusses the main approaches for this specific application within a taxonomy defined by the type of task, employed methods, evaluation metrics, the specific equipment or process, and datasets. Lastly, we conclude and outline some suggestions for future research.
2021
Authors
Davari, N; Veloso, B; Ribeiro, RP; Pereira, PM; Gama, J;
Publication
2021 IEEE 8TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA)
Abstract
Predictive maintenance methods assist early detection of failures and errors in machinery before they reach critical stages. This study proposes a data-driven predictive maintenance framework for the air production unit (APU) system of a Metro do Porto train, using deep learning based on a sparse autoencoder (SAE) network that efficiently detects abnormal data and considerably reduces the false alarm rate. Several analog and digital sensors installed on the APU system allow the detection of behavioral changes and deviations from the normal pattern by analyzing the collected data. We implemented two versions of the SAE network, one fed with analog sensor data and the other with digital sensor data. The experimental results show that failures due to air leakage problems are predicted from analog sensor data, while other types of failures are identified from digital sensor data. A low pass filter is applied to the output of the SAE network, and a sequence of abnormal data is used as an alarm for APU system failure. The performance indicators of the SAE network with digital sensor data, in terms of F1 Score, Recall, and Precision, are, respectively, about 33.6%, 42%, and 28% better than those of the SAE network with analog sensor data. For comparison purposes, we also implemented a variational autoencoder (VAE). The results show that SAE outperforms VAE by 14%, 77%, and 37%, respectively, for Recall, Precision, and F1 Score.
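The alarm logic described in this abstract (low pass filtering of the network output, then an alarm on a sequence of abnormal samples) can be illustrated with a minimal sketch. It assumes the SAE output is available as a list of per-sample reconstruction errors. The exponential smoothing filter, the function names `low_pass` and `alarm_indices`, and the parameters `threshold`, `alpha`, and `min_run` are illustrative assumptions, not details taken from the paper.

```python
def low_pass(values, alpha=0.3):
    """First-order low pass filter (exponential smoothing); alpha is an assumed parameter."""
    smoothed = []
    state = None
    for v in values:
        # The first sample initializes the filter state; later samples are blended in.
        state = v if state is None else alpha * v + (1 - alpha) * state
        smoothed.append(state)
    return smoothed


def alarm_indices(errors, threshold, alpha=0.3, min_run=3):
    """Return the indices at which a run of abnormal samples first reaches min_run.

    errors: per-sample reconstruction errors (assumed to come from the autoencoder).
    threshold: hypothetical cut-off above which a filtered error counts as abnormal.
    """
    alarms = []
    run = 0
    for i, value in enumerate(low_pass(errors, alpha)):
        # Count consecutive filtered values above the threshold; reset otherwise.
        run = run + 1 if value > threshold else 0
        if run == min_run:
            alarms.append(i)
    return alarms
```

With this design, an isolated spike in the reconstruction error is damped by the filter and does not trigger an alarm, while a sustained deviation does, which is one plausible way to lower the false alarm rate.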
2021
Authors
Gama, J; Veloso, B; Aminian, E; Ribeiro, RP;
Publication
9TH INTERNATIONAL CONFERENCE ON BIG DATA ANALYTICS, BDA 2021
Abstract
This article presents our recent work on the topic of learning from data streams. We focus on emerging topics, including fraud detection, learning from rare cases, and hyper-parameter tuning for streaming data.
2021
Authors
Aminian, E; Ribeiro, RP; Gama, J;
Publication
DATA MINING AND KNOWLEDGE DISCOVERY
Abstract
In recent years, data stream mining and learning from imbalanced data have been active research areas. Even though solutions exist to tackle these two problems, most of them are not designed to handle the challenges that arise when both occur together. As far as we are aware, the few approaches in the area of learning from imbalanced data streams fall in the context of classification, and no efforts in the regression domain have been reported yet. This paper proposes a technique that uses sampling strategies to cope with imbalanced data streams in a regression setting, where the most important cases have rare and extreme target values. Specifically, we employ under-sampling and over-sampling strategies that use the value of Chebyshev's inequality as a heuristic to disclose the type of incoming cases (i.e., frequent or rare). We evaluated our proposal by applying it to the training of models with four well-known regression algorithms over fourteen benchmark data sets. We conducted a series of experiments with different setups on both synthetic and real-world data sets. The experimental results confirm our approach's effectiveness: models trained with each of the sampling strategies outperform their baseline counterparts.
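As a rough illustration of how Chebyshev's inequality can flag rare target values in a stream, the sketch below keeps running statistics of the target and uses the bound P(|Y - mean| >= t * std) <= 1/t^2. The class `RunningStats`, the functions `chebyshev_bound`, `keep_for_training`, and `replication_count`, and the specific sampling rules are assumptions made for illustration. They are not claimed to match the exact procedure of the paper.

```python
import math
import random


class RunningStats:
    """Online mean and sample variance of the target (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0

    def update(self, y):
        self.n += 1
        delta = y - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (y - self.mean)

    @property
    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def chebyshev_bound(y, stats):
    """Chebyshev upper bound for a deviation as large as that of y, capped at 1."""
    if stats.std == 0.0:
        return 1.0
    t = abs(y - stats.mean) / stats.std
    return 1.0 if t <= 1.0 else 1.0 / (t * t)


def keep_for_training(y, stats, rng):
    """Under-sampling (assumed rule): keep a case with probability 1 - bound."""
    return rng.random() >= chebyshev_bound(y, stats)


def replication_count(y, stats):
    """Over-sampling (assumed rule): repeat a case ceil(t) times, at least once."""
    if stats.std == 0.0:
        return 1
    return max(1, math.ceil(abs(y - stats.mean) / stats.std))
```

Under these assumed rules, a target value within one standard deviation of the running mean is always discarded by the under-sampler and presented once by the over-sampler, while extreme values are kept with high probability or replicated several times.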
2021
Authors
Teixeira, S; Londres, G; Veloso, B; Ribeiro, RP; Gama, J;
Publication
MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, PT II
Abstract
The production and management of urban waste is a growing challenge and a consequence of our day-to-day resources and activities. According to the Portuguese Environment Agency, in 2019 Portugal produced 1% more tons of waste than in 2018. The proper management of this waste can be supported by existing policies, namely national legislation and the Strategic Plan for Urban Waste. Those policies assess and support the amount of waste processed, allowing the recovery of materials. Among the solutions for waste management is the selective collection of waste. We improve the management of the smart collection of Paper, Plastic, and Glass packaging waste from corporate customers who joined a recycling program. Our data were collected from 2017 to 2020. The main objective of this work is to increase the system's predictive performance, without any loss for citizens, while improving collection management. We analyze two types of problems: (i) the presence or absence of containers; and (ii) the prediction of the number of containers by type of waste. To carry out the analysis, we applied three machine learning algorithms: XGBoost, Random Forest, and Rpart. Additionally, we used AutoML for the XGBoost and Random Forest algorithms. The results show that, in general, AutoML yields better results both for classifying the presence or absence of containers by type of waste and for predicting the number of containers.
2021
Authors
Kamp, M; Koprinska, I; Bibal, A; Bouadi, T; Frénay, B; Galárraga, L; Oramas, J; Adilova, L; Krishnamurthy, Y; Kang, B; Largeron, C; Lijffijt, J; Viard, T; Welke, P; Ruocco, M; Aune, E; Gallicchio, C; Schiele, G; Pernkopf, F; Blott, M; Fröning, H; Schindler, G; Guidotti, R; Monreale, A; Rinzivillo, S; Biecek, P; Ntoutsi, E; Pechenizkiy, M; Rosenhahn, B; Buckley, CL; Cialfi, D; Lanillos, P; Ramstead, M; Verbelen, T; Ferreira, PM; Andresini, G; Malerba, D; Medeiros, I; Viger, PF; Nawaz, MS; Ventura, S; Sun, M; Zhou, M; Bitetta, V; Bordino, I; Ferretti, A; Gullo, F; Ponti, G; Severini, L; Ribeiro, RP; Gama, J; Gavaldà, R; Cooper, LAD; Ghazaleh, N; Richiardi, J; Roqueiro, D; Miranda, DS; Sechidis, K; Graça, G;
Publication
PKDD/ECML Workshops (1)
Abstract