2023
Authors
Aguilar-Ruiz, JS; Bifet, A; Gama, J;
Publication
Analytics
Abstract
2022
Authors
Cerri, R; Faria, ER; Gama, J;
Publication
2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA
Abstract
Multi-label stream classification is the task of assigning instances to two or more classes simultaneously, with instances arriving continuously at high speed. This task imposes difficult challenges, such as the detection of concept drifts, where the distribution of the instances in the stream changes over time, and infinitely delayed labels, where the ground-truth labels of the instances are never available to help update the classifiers. To solve this task, methods in the literature use the problem transformation approach, which divides the multi-label problem into different sub-problems, associating one classification model with each class. In this paper, we propose a method based on self-organizing maps that, unlike the literature, uses only one model to deal with all classes simultaneously. By using the algorithm adaptation approach, our proposal better captures label dependencies, improving the results over its counterparts. Experiments using different synthetic and real-world datasets showed that our proposal obtained the overall best performance when compared to different methods from the literature.
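The algorithm adaptation idea above can be illustrated with a minimal sketch: a single self-organizing map whose neurons store both a feature prototype and a label prototype, so one model covers all classes at once. The class name, update rule, and thresholding below are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

class MultiLabelSOM:
    """Minimal single-model sketch for multi-label streams (hypothetical names)."""

    def __init__(self, n_neurons, n_features, n_labels, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.random((n_neurons, n_features))  # feature prototypes
        self.L = np.zeros((n_neurons, n_labels))      # label prototypes
        self.lr = lr

    def _bmu(self, x):
        # best-matching unit: neuron whose feature prototype is closest to x
        return int(np.argmin(np.linalg.norm(self.W - x, axis=1)))

    def partial_fit(self, x, y):
        # online update: pull the winner's prototypes towards the instance
        b = self._bmu(x)
        self.W[b] += self.lr * (x - self.W[b])
        self.L[b] += self.lr * (y - self.L[b])

    def predict(self, x, threshold=0.5):
        # predict the label set stored at the winning neuron
        return (self.L[self._bmu(x)] >= threshold).astype(int)
```

Because the label prototypes are updated jointly at the winning neuron, co-occurring labels are naturally stored together, which is one way a single model can capture label dependencies.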
2023
Authors
Andrade, T; Gama, J;
Publication
Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing, SAC 2023, Tallinn, Estonia, March 27-31, 2023
Abstract
2023
Authors
Silva, PR; Vinagre, J; Gama, J;
Publication
38TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2023
Abstract
Dynamic Time Warping (DTW) is a robust method to measure the similarity between two sequences. This paper proposes a method based on DTW to analyse high-speed data streams. The central idea is to decompose the network traffic into sequences of histograms of packet sizes and then calculate the distance between pairs of such sequences using DTW with the Kullback-Leibler (KL) distance. As a baseline, we also compute the Euclidean distance between the sequences of histograms. Since our preliminary experiments indicate that the distance between two sequences falls within a different range of values for distinct types of streams, we then exploit this distance information for stream classification using a Random Forest. The approach was evaluated on recent internet traffic data from a telecommunications company. To illustrate its application, we conducted a case study with encrypted Internet Protocol Television (IPTV) network traffic data. The goal was to use our DTW-based approach to detect the video codec used in the streams, as well as the IPTV channel. Results strongly suggest that the DTW distance between the data streams is highly informative for such classification tasks.
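The core computation described above can be sketched as standard dynamic-programming DTW with a KL-based local cost between packet-size histograms. Note this is an assumption-laden sketch: the paper does not specify whether the KL distance is symmetrised or how histograms are smoothed, so both choices below are illustrative.

```python
import numpy as np

def kl_distance(p, q, eps=1e-12):
    # Symmetrised KL divergence between two histograms (assumed variant);
    # eps-smoothing avoids log(0) on empty bins.
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def dtw(seq_a, seq_b, dist=kl_distance):
    # Classic O(n*m) DTW: each element of seq_a/seq_b is one histogram.
    n, m = len(seq_a), len(seq_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = dist(seq_a[i - 1], seq_b[j - 1])
            D[i, j] = c + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

The resulting scalar distances between stream pairs could then be fed as features to a Random Forest classifier, as the abstract describes.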
2023
Authors
Pinho, C; Kaliontzopoulou, A; Ferreira, CA; Gama, J;
Publication
ZOOLOGICAL JOURNAL OF THE LINNEAN SOCIETY
Abstract
Automated image classification is a thriving field of machine learning, and various successful applications dealing with biological images have recently emerged. In this work, we address the ability of these methods to identify species that are difficult to tell apart by humans due to their morphological similarity. We focus on distinguishing species of wall lizards, namely those belonging to the Podarcis hispanicus species complex, which constitutes a well-known example of cryptic morphological variation. We address two classification experiments: (1) assignment of images of the morphologically relatively distinct P. bocagei and P. lusitanicus; and (2) distinction between the overall more cryptic nine taxa that compose this complex. We used four datasets (two image perspectives and individuals of the two sexes) and three deep-learning models to address each problem. Our results suggest a high ability of the models to identify the correct species, especially when combining predictions from different perspectives and models (accuracy of 95.9% and 97.1% for females and males, respectively, in the two-class case; and of 91.2% to 93.5% for females and males, respectively, in the nine-class case). Overall, these results establish deep-learning models as an important tool for field identification and monitoring of cryptic species complexes, alleviating the burden of expert or genetic identification.
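The abstract notes that accuracy improves when combining predictions from different perspectives and models. One common way to do this, shown here as a hedged sketch (the paper's exact combination rule is not stated), is soft voting: average the per-class probabilities across models and take the argmax.

```python
import numpy as np

def combine_predictions(prob_list):
    # Soft voting: average class-probability vectors from several
    # models/views, then pick the highest-scoring class.
    return int(np.argmax(np.mean(np.asarray(prob_list, dtype=float), axis=0)))
```

For example, a dorsal-view model and a head-view model can each emit a probability vector per image, and the averaged vector decides the species.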
2024
Authors
Alves, VM; Cardoso, JD; Gama, J;
Publication
NUCLEAR MEDICINE AND MOLECULAR IMAGING
Abstract
Purpose: 2-[F-18]FDG PET/CT plays an important role in the management of pulmonary nodules. Convolutional neural networks (CNNs) automatically learn features from images and have the potential to improve the discrimination between malignant and benign pulmonary nodules. The purpose of this study was to develop and validate a CNN model for the classification of pulmonary nodules from 2-[F-18]FDG PET images.
Methods: One hundred and thirteen participants were retrospectively selected, with one nodule per participant. The 2-[F-18]FDG PET images were preprocessed and annotated with the reference standard. The deep learning experiment entailed randomly splitting the data into five sets. A test set was held out for the evaluation of the final model. Four-fold cross-validation was performed on the remaining sets to train and evaluate a set of candidate models and to select the final model. Models of three types of 3D CNN architectures were trained from random weight initialization (Stacked 3D CNN, VGG-like and Inception-v2-like models), on both the original and augmented datasets. Transfer learning from ImageNet with ResNet-50 was also used.
Results: The final model (Stacked 3D CNN model) obtained an area under the ROC curve of 0.8385 (95% CI: 0.6455-1.0000) on the test set. On the test set, the model had a sensitivity of 80.00%, a specificity of 69.23% and an accuracy of 73.91%, for an optimised decision threshold that assigns a higher cost to false negatives.
Conclusion: A 3D CNN model was effective at distinguishing benign from malignant pulmonary nodules in 2-[F-18]FDG PET images.
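The data-splitting scheme in the Methods section (random split into five sets, one held out for testing, four used for cross-validation) can be sketched as follows. Equal fold sizes and the function name are assumptions for illustration; the study's actual split proportions are not given.

```python
import numpy as np

def split_train_test_cv(n_samples, n_folds=4, seed=0):
    # Shuffle indices, hold out one fold-sized set for final testing,
    # and partition the remainder into n_folds cross-validation folds.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    fold_size = n_samples // (n_folds + 1)
    test = idx[:fold_size]
    folds = np.array_split(idx[fold_size:], n_folds)
    return test, folds
```

Each candidate model is then trained on three folds and validated on the fourth, rotating, while the held-out test set is touched only once, by the final selected model.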