2022
Authors
Martins, I; Resende, JS; Sousa, PR; Silva, S; Antunes, L; Gama, J;
Publication
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE
Abstract
The Internet of Things (IoT) envisions a smart environment powered by connectivity and heterogeneity, where ensuring reliable services and communications across multiple industries, from financial fields to healthcare and fault detection systems, is a top priority. In such fields, data is collected and broadcast at high speed on a continuous, real-time scale, which places IoT within the stream processing paradigm. Intrusion Detection Systems (IDS) rely on manually defined security policies and signatures, which fail to provide a real-time solution or to prevent zero-day attacks. Therefore, anomaly detection appears as a prominent solution capable of recognizing patterns, learning from experience, and detecting abnormal behavior. However, most approaches do not meet these urgent requirements and are often evaluated on deprecated datasets that are not representative of the working environment. Our contributions therefore comprise an overview of cybersecurity threats in IoT, important recommendations for a real-time IDS, and a real-time dataset setting to evaluate a security system covering multiple cyber threats. The dataset used to evaluate current host-based IDS approaches is publicly available and can be used as a benchmark by the community.
2022
Authors
Davari, N; Pashami, S; Veloso, B; Fan, YT; Pereira, PM; Ribeiro, RP; Gama, J; Nowaczyk, S;
Publication
ADVANCES IN INTELLIGENT DATA ANALYSIS XX, IDA 2022
Abstract
This study applies a data-driven anomaly detection framework based on a Long Short-Term Memory (LSTM) autoencoder network for several subsystems of a public transport bus. The proposed framework efficiently detects abnormal data, significantly reducing the false alarm rate compared to available alternatives. Using historical repair records, we demonstrate how detection of abnormal sequences in the signals can be used for predicting equipment failures. The deviations from normal operation patterns are detected by analysing the data collected from several on-board sensors (e.g., wet tank air pressure, engine speed, engine load) installed on the bus. The performance of the LSTM autoencoder (LSTM-AE) is compared against the multi-layer autoencoder (mlAE) network in the same anomaly detection framework. The experimental results show that the performance indicators of the LSTM-AE network, in terms of F1 Score, Recall, and Precision, are better than those of the mlAE network.
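As an illustration of how the indicators named in this abstract are typically computed, the following is a minimal sketch, not the authors' implementation. It assumes that each window already has a reconstruction error from some autoencoder and that a window is flagged when its error exceeds a fixed threshold; the error values, labels, threshold, and function names are all hypothetical.

```python
def flag_anomalies(errors, threshold):
    """Flag a window as anomalous when its reconstruction error exceeds the threshold."""
    return [e > threshold for e in errors]


def precision_recall_f1(predicted, actual):
    """Compute Precision, Recall, and F1 Score from boolean predictions and labels."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical reconstruction errors and ground-truth labels (True = abnormal window).
errors = [0.1, 0.2, 0.9, 0.15, 0.8, 0.4]
labels = [False, False, True, False, False, True]

predicted = flag_anomalies(errors, threshold=0.5)
precision, recall, f1 = precision_recall_f1(predicted, labels)
```

In practice the threshold would be chosen from errors observed during normal operation; the fixed value here is only for illustration.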
2021
Authors
Teixeira, S; Londres, G; Veloso, B; Ribeiro, RP; Gama, J;
Publication
MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, PT II
Abstract
The production and management of urban waste is a growing challenge and a consequence of our day-to-day resources and activities. According to the Portuguese Environment Agency, in 2019 Portugal produced 1% more tons of waste than in 2018. The proper management of this waste can be supported by existing policies, namely national legislation and the Strategic Plan for Urban Waste. Those policies assess and support the amount of waste processed, allowing the recovery of materials. Among the solutions for waste management is the selective collection of waste. We improve the management of the smart collection of Paper, Plastic, and Glass packaging waste from corporate customers who joined a recycling program. The data were collected from 2017 to 2020. The main objective of this work is to increase the system's predictive performance, without any loss for citizens, but with improvement in the collection management. We analyze two types of problems: (i) the presence or absence of containers; and (ii) the prediction of the number of containers by type of waste. To carry out the analysis, we applied three machine learning algorithms: XGBoost, Random Forest, and Rpart. Additionally, we use AutoML for the XGBoost and Random Forest algorithms. The results show that AutoML generally yields better results both for classifying the presence or absence of containers by type of waste and for predicting the number of containers.
2021
Authors
Kamp, M; Koprinska, I; Bibal, A; Bouadi, T; Frénay, B; Galárraga, L; Oramas, J; Adilova, L; Krishnamurthy, Y; Kang, B; Largeron, C; Lijffijt, J; Viard, T; Welke, P; Ruocco, M; Aune, E; Gallicchio, C; Schiele, G; Pernkopf, F; Blott, M; Fröning, H; Schindler, G; Guidotti, R; Monreale, A; Rinzivillo, S; Biecek, P; Ntoutsi, E; Pechenizkiy, M; Rosenhahn, B; Buckley, CL; Cialfi, D; Lanillos, P; Ramstead, M; Verbelen, T; Ferreira, PM; Andresini, G; Malerba, D; Medeiros, I; Viger, PF; Nawaz, MS; Ventura, S; Sun, M; Zhou, M; Bitetta, V; Bordino, I; Ferretti, A; Gullo, F; Ponti, G; Severini, L; Ribeiro, RP; Gama, J; Gavaldà, R; Cooper, LAD; Ghazaleh, N; Richiardi, J; Roqueiro, D; Miranda, DS; Sechidis, K; Graça, G;
Publication
PKDD/ECML Workshops (1)
Abstract
2021
Authors
Kamp, M; Koprinska, I; Bibal, A; Bouadi, T; Frénay, B; Galárraga, L; Oramas, J; Adilova, L; Krishnamurthy, Y; Kang, B; Largeron, C; Lijffijt, J; Viard, T; Welke, P; Ruocco, M; Aune, E; Gallicchio, C; Schiele, G; Pernkopf, F; Blott, M; Fröning, H; Schindler, G; Guidotti, R; Monreale, A; Rinzivillo, S; Biecek, P; Ntoutsi, E; Pechenizkiy, M; Rosenhahn, B; Buckley, CL; Cialfi, D; Lanillos, P; Ramstead, M; Verbelen, T; Ferreira, PM; Andresini, G; Malerba, D; Medeiros, I; Viger, PF; Nawaz, MS; Ventura, S; Sun, M; Zhou, M; Bitetta, V; Bordino, I; Ferretti, A; Gullo, F; Ponti, G; Severini, L; Ribeiro, RP; Gama, J; Gavaldà, R; Cooper, LAD; Ghazaleh, N; Richiardi, J; Roqueiro, D; Miranda, DS; Sechidis, K; Graça, G;
Publication
PKDD/ECML Workshops (2)
Abstract
2022
Authors
Shaji, N; Gama, J; Ribeiro, RP; Gomes, P;
Publication
ADVANCES IN INTELLIGENT DATA ANALYSIS XX, IDA 2022
Abstract
Non-traditional data, such as the applicant's bank statement, is a significant source for decision-making when granting loans. We find that methods from network science can be applied to the applicant's bank statements to convert inherent cash flow characteristics into predictors of default in a credit scoring or credit risk assessment model. First, the credit cash flow is extracted from a bank statement and then converted into a visibility graph, or network. Afterwards, we use this visibility network to find features that predict the borrowers' repayment behaviour. We see that feature selection methods select all five extracted features. Finally, SMOTE is used to balance the training data. The model that combines the network features with the standard features is shown to outperform the model that uses only the standard features, indicating the network features' predictive power.
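The conversion of a cash flow series into a visibility graph can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the natural visibility criterion (the abstract does not say which variant is used), in which two time points are linked when every point between them lies strictly below the straight line joining them. The series values and function names are hypothetical, and node degree is shown only as one example of a graph-derived feature; the five features used in the paper are not listed in the abstract.

```python
from itertools import combinations


def natural_visibility_graph(series):
    """Return the edges (i, j), i < j, of the natural visibility graph of a series."""
    edges = set()
    for i, j in combinations(range(len(series)), 2):
        yi, yj = series[i], series[j]
        # i and j "see" each other if every intermediate point lies
        # strictly below the line segment joining (i, yi) and (j, yj).
        visible = all(
            series[k] < yj + (yi - yj) * (j - k) / (j - i)
            for k in range(i + 1, j)
        )
        if visible:
            edges.add((i, j))
    return edges


def degree_sequence(edges, n):
    """Count how many edges touch each of the n nodes."""
    degrees = [0] * n
    for i, j in edges:
        degrees[i] += 1
        degrees[j] += 1
    return degrees


# Hypothetical credit amounts in chronological order.
cash_flow = [3.0, 1.0, 2.0, 5.0, 1.0]

edges = natural_visibility_graph(cash_flow)
degrees = degree_sequence(edges, len(cash_flow))
```

The pairwise check is quadratic in the number of pairs and is meant for short series; longer statements would call for a faster construction.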