Publications

Publications by João Gama

2023

MetroPT-3 Dataset

Authors
Davari, N; Veloso, B; Ribeiro, RP; Gama, J;

Publication

Abstract

2023

Novelty detection for multi-label stream classification under extreme verification latency

Authors
Costa, JD; Faria, ER; Andrade Silva, Jd; Gama, J; Cerri, R;

Publication
Appl. Soft Comput.

Abstract
Multi-Label Stream Classification (MLSC) is the classification of streaming examples into multiple classes simultaneously. Since new classes may emerge during the streaming process (concept evolution) and known classes may change over time (concept drift), it is a challenging task. In real situations, concept drift and concept evolution occur in scenarios where the actual labels of arriving examples are never available; hence it is impractical to update decision models in a supervised fashion. This is known as Extreme Verification Latency, a topic that has not been well investigated in the MLSC literature. This paper proposes a new method called MultI-label learNing Algorithm for Data Streams with Binary Relevance transformation (MINAS-BR), integrated with a Novelty Detection (ND) procedure for detecting concept evolution and concept drift, updating the model in an unsupervised fashion. Furthermore, since the label space is not static, we propose a new evaluation methodology for MLSC under extreme verification latency. Experiments over synthetic and real-world data sets with different concept drift and concept evolution scenarios confirmed the strategies employed in MINAS-BR and presented relevant advances for handling streaming multi-label data.
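The binary relevance transformation named in the abstract above is a standard way to turn one multi-label problem into one binary problem per label. The following is a minimal sketch of that general idea only, not the MINAS-BR implementation; the function name `binary_relevance`, the toy `stream`, and the labels "a" and "b" are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

# One example: a feature vector plus the set of labels attached to it.
Example = Tuple[Sequence[float], FrozenSet[str]]


def binary_relevance(
    examples: Sequence[Example], label_space: Sequence[str]
) -> Dict[str, List[Tuple[Sequence[float], int]]]:
    """Build one binary dataset per label (1 = label present, 0 = absent)."""
    problems: Dict[str, List[Tuple[Sequence[float], int]]] = {
        label: [] for label in label_space
    }
    for features, labels in examples:
        for label in label_space:
            problems[label].append((features, int(label in labels)))
    return problems


# Hypothetical toy data: three examples over a two-label space.
stream = [
    ((0.1, 0.9), frozenset({"a", "b"})),
    ((0.8, 0.2), frozenset({"b"})),
    ((0.5, 0.5), frozenset()),
]
problems = binary_relevance(stream, ["a", "b"])
```

Each entry of `problems` can then be handled by an independent binary model. Because the abstract stresses that the label space is not static, a streaming method would also need to add new per-label problems as novel classes appear, which this sketch does not attempt.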


2023

Online Influence Forest for Streaming Anomaly Detection

Authors
Martins, I; Resende, JS; Gama, J;

Publication
ADVANCES IN INTELLIGENT DATA ANALYSIS XXI, IDA 2023

Abstract
As the digital world grows, data is being collected at high speed on a continuous and real-time scale. Learning from streaming data therefore remains a challenge, because the resulting scenario is imbalanced and evolving. As the research field is still open to consistent strategies that assess continuous and evolving data properties, this paper proposes an unsupervised, online, and incremental anomaly detection ensemble of influence trees that implement adaptive mechanisms to deal with inactive or saturated leaves. This proposal uses the fourth standardized moment, also known as kurtosis, as the splitting criterion, and the isolation score, Shannon's information content, and the influence function of an instance as the anomaly score. In addition to improving interpretability, this proposal is also evaluated on publicly available datasets, providing a detailed discussion of the results.
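Two of the quantities named in the abstract above have standard textbook definitions: kurtosis is the fourth central moment divided by the squared variance, and Shannon's information content of an event with probability p is -log2(p). The sketch below only computes those two quantities; how the paper combines them inside its influence trees is not stated in the abstract, and the function names here are hypothetical.

```python
import math
from typing import Sequence


def kurtosis(values: Sequence[float]) -> float:
    """Fourth standardized moment (population form, no excess correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    return m4 / (m2 ** 2)


def information_content(probability: float) -> float:
    """Shannon information content, in bits, of an event with this probability."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)
```

For example, `kurtosis([1, 2, 3, 4, 5])` evaluates to 6.8 / 4 = 1.7, and an event with probability 0.25 carries 2 bits of information, so rarer events score higher.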


2023

Predictive Maintenance, Adversarial Autoencoders and Explainability

Authors
Silva, MEP; Veloso, B; Gama, J;

Publication
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: APPLIED DATA SCIENCE AND DEMO TRACK, ECML PKDD 2023, PT VII

Abstract
The transition to Industry 4.0 provoked a transformation of industrial manufacturing with a significant leap in automation and intelligent systems. This paradigm shift has brought about a mindset that emphasizes predictive maintenance: detecting future failures while the current behaviour of industrial processes and machines is thought to be normal. The constant monitoring of industrial equipment produces massive quantities of data that enable the application of machine learning approaches to this task. This study uses deep learning-based models to build a data-driven predictive maintenance framework for the air production unit (APU), a crucial system for the proper functioning of a Metro do Porto train. This public transport system moves thousands of people every day, and train failures lead to delays and loss of trust by clients. Therefore, it is essential not only to detect APU failures before they occur to minimize negative impacts, but also to provide explanations for the failure warnings that can aid in decision-making processes. We propose an autoencoder architecture trained with an adversarial loss, known as the Wasserstein Autoencoder with Generative Adversarial Network (WAE-GAN), designed to detect sensor failures in systems connected to the APU. Our model can detect APU failures up to two hours before they occur, allowing timely intervention of the maintenance teams. We further augment our model with an explainability layer, by providing explanations generated by a rule-based model that focuses on rare events. Results show that our model is able to detect APU failures without any false alarms, fulfilling the requirements of Metro do Porto for early detection of the failures.


2023

Identification of morphologically cryptic species with computer vision models: wall lizards (Squamata: Lacertidae: Podarcis) as a case study

Authors
Pinho, C; Kaliontzopoulou, A; Ferreira, CA; Gama, J;

Publication
ZOOLOGICAL JOURNAL OF THE LINNEAN SOCIETY

Abstract
Automated image classification is a thriving field of machine learning, and various successful applications dealing with biological images have recently emerged. In this work, we address the ability of these methods to identify species that are difficult to tell apart by humans due to their morphological similarity. We focus on distinguishing species of wall lizards, namely those belonging to the Podarcis hispanicus species complex, which constitutes a well-known example of cryptic morphological variation. We address two classification experiments: (1) assignment of images of the morphologically relatively distinct P. bocagei and P. lusitanicus; and (2) distinction between the overall more cryptic nine taxa that compose this complex. We used four datasets (two image perspectives and individuals of the two sexes) and three deep-learning models to address each problem. Our results suggest a high ability of the models to identify the correct species, especially when combining predictions from different perspectives and models (accuracy of 95.9% and 97.1% for females and males, respectively, in the two-class case; and of 91.2% and 93.5% for females and males, respectively, in the nine-class case). Overall, these results establish deep-learning models as an important tool for field identification and monitoring of cryptic species complexes, alleviating the burden of expert or genetic identification.


2024

Classification of Pulmonary Nodules in 2-[18F]FDG PET/CT Images with a 3D Convolutional Neural Network

Authors
Alves, VM; Cardoso, JD; Gama, J;

Publication
NUCLEAR MEDICINE AND MOLECULAR IMAGING

Abstract
Purpose: 2-[F-18]FDG PET/CT plays an important role in the management of pulmonary nodules. Convolutional neural networks (CNNs) automatically learn features from images and have the potential to improve the discrimination between malignant and benign pulmonary nodules. The purpose of this study was to develop and validate a CNN model for classification of pulmonary nodules from 2-[F-18]FDG PET images. Methods: One hundred thirteen participants were retrospectively selected, with one nodule per participant. The 2-[F-18]FDG PET images were preprocessed and annotated with the reference standard. The deep learning experiment entailed random data splitting into five sets. A test set was held out for evaluation of the final model. Four-fold cross-validation was performed on the remaining sets to train and evaluate a set of candidate models and to select the final model. Models of three types of 3D CNN architectures were trained from random weight initialization (Stacked 3D CNN, VGG-like and Inception-v2-like models) on both the original and augmented datasets. Transfer learning from ImageNet with ResNet-50 was also used. Results: The final model (Stacked 3D CNN model) obtained an area under the ROC curve of 0.8385 (95% CI: 0.6455-1.0000) in the test set. The model had a sensitivity of 80.00%, a specificity of 69.23% and an accuracy of 73.91% in the test set, for an optimised decision threshold that assigns a higher cost to false negatives. Conclusion: A 3D CNN model was effective at distinguishing benign from malignant pulmonary nodules in 2-[F-18]FDG PET images.
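The three test-set percentages reported above follow the usual confusion-matrix definitions. As a sketch of that arithmetic, the counts below (8 true positives, 2 false negatives, 9 true negatives, 4 false positives) are hypothetical: the abstract does not give them, and they were chosen only because they reproduce the reported figures. The function name is likewise an assumption.

```python
from typing import Dict


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of malignant nodules detected
        "specificity": tn / (tn + fp),  # share of benign nodules cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts, NOT taken from the paper; one combination that is
# consistent with the reported 80.00%, 69.23% and 73.91%.
metrics = classification_metrics(tp=8, fn=2, tn=9, fp=4)
percentages = {name: round(value * 100, 2) for name, value in metrics.items()}
```

With these assumed counts, sensitivity is 8/10, specificity is 9/13 and accuracy is 17/23. A threshold that penalises false negatives more heavily, as the abstract describes, would be expected to trade specificity for sensitivity in this way.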
