2021
Authors
Castro, HF; Cardoso, JS; Andrade, MT;
Publication
DATA
Abstract
The ever-growing capabilities of computers have enabled pursuing Computer Vision through Machine Learning (i.e., MLCV). ML tools require large amounts of information to learn from (ML datasets). These are costly to produce but have received little attention regarding standardization. This prevents the cooperative production and exploitation of these resources, impedes countless synergies, and hinders ML research. No global view of the MLCV dataset tissue exists, and acquiring one is fundamental to enable standardization. We provide an extensive survey of the evolution and current state of MLCV datasets (1994 to 2019) for a set of specific CV areas, as well as a quantitative and qualitative analysis of the results. Data were gathered from online scientific databases (e.g., Google Scholar, CiteSeerX). We reveal the heterogeneous plethora of resources that comprises the MLCV dataset tissue; their continuous growth in volume and complexity; the specificities of the evolution of their media and metadata components regarding a range of aspects; and that MLCV progress requires the construction of a globally standardized (structuring, manipulating, and sharing) MLCV "library". Accordingly, we formulate a novel interpretation of this dataset collective as a global tissue of synthetic cognitive visual memories and define the immediately necessary steps to advance its standardization and integration.
2021
Authors
Pinto, JR; Cardoso, JS;
Publication
Encyclopedia of Cryptography, Security and Privacy
Abstract
2021
Authors
Saffari, M; Khodayar, M; Saadabadi, MSE; Sequeira, AF; Cardoso, JS;
Publication
SENSORS
Abstract
In recent years, deep neural networks have shown significant progress in computer vision due to their large generalization capacity; however, the overfitting problem ubiquitously threatens the learning process of these highly nonlinear architectures. Dropout is a recent solution to mitigate overfitting that has seen significant success in various classification applications. Recently, many efforts have been made to improve standard dropout using an unsupervised, merit-based semantic selection of neurons in the latent space. However, these studies do not consider the quality and quantity of task-relevant information or the diversity of the latent kernels. To address the challenge of dropping less informative neurons in deep learning, we propose an efficient end-to-end dropout algorithm that selects the most informative neurons, those with the highest correlation with the target output, while accounting for sparsity in its selection procedure. First, to promote activation diversity, we devise an approach that selects the most diverse set of neurons by making use of determinantal point process (DPP) sampling. Furthermore, to incorporate task specificity into deep latent features, a mutual information (MI)-based merit function is developed. Leveraging the proposed MI with DPP sampling, we introduce the novel DPPMI dropout, which adaptively adjusts the retention rate of neurons based on their contribution to the neural network task. Empirical studies on real-world classification benchmarks, including MNIST, SVHN, CIFAR10, and CIFAR100, demonstrate the superiority of our proposed method over recent state-of-the-art dropout algorithms in the literature.
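For intuition, below is a minimal Python sketch of the kind of diversity-aware neuron selection the abstract describes: each latent unit is scored by a simple relevance proxy (absolute correlation with the target, standing in for the paper's mutual-information merit), and a diverse, informative subset is kept via a greedy MAP approximation of a DPP. All names, the merit proxy, and the greedy approximation are illustrative assumptions, not the exact DPPMI formulation.

# Hypothetical sketch: score latent units by relevance, build a DPP-style kernel that
# trades relevance against redundancy, and keep a diverse, informative subset of units.
import numpy as np

def relevance_scores(activations, targets):
    """Absolute Pearson correlation of each unit's activation with the target."""
    a = (activations - activations.mean(0)) / (activations.std(0) + 1e-8)
    t = (targets - targets.mean()) / (targets.std() + 1e-8)
    return np.abs(a.T @ t) / len(t)

def greedy_dpp_select(activations, scores, k):
    """Greedy MAP approximation of a DPP with kernel L = diag(q) S diag(q)."""
    # Cosine similarity between unit activation patterns captures redundancy.
    norms = np.linalg.norm(activations, axis=0, keepdims=True) + 1e-8
    S = (activations / norms).T @ (activations / norms)
    L = scores[:, None] * S * scores[None, :]
    selected, remaining = [], list(range(len(scores)))
    for _ in range(k):
        best, best_gain = None, -np.inf
        for j in remaining:
            idx = selected + [j]
            gain = np.linalg.slogdet(L[np.ix_(idx, idx)] + 1e-6 * np.eye(len(idx)))[1]
            if gain > best_gain:
                best, best_gain = j, gain
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(0)
acts = rng.normal(size=(128, 32))            # batch of 128 samples, 32 latent units
y = acts[:, 0] + 0.1 * rng.normal(size=128)  # toy target correlated with unit 0
keep = greedy_dpp_select(acts, relevance_scores(acts, y), k=16)
mask = np.zeros(32)
mask[keep] = 1.0                             # retained units; the rest are dropped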
2020
Authors
Allahdadi, A; Morla, R; Cardoso, JS;
Publication
SIMULATION-TRANSACTIONS OF THE SOCIETY FOR MODELING AND SIMULATION INTERNATIONAL
Abstract
Despite the growing popularity of 802.11 wireless networks, users often suffer from connectivity problems and performance issues due to unstable radio conditions and dynamic user behavior, among other reasons. Anomaly detection and distinction are among the major challenges that network managers encounter. The difficulty of monitoring broad and complex Wireless Local Area Networks, which often requires heavy instrumentation of the user devices, makes anomaly detection analysis even harder. In this paper, we exploit 802.11 access point usage data and propose an anomaly detection technique based on the Hidden Markov Model (HMM) and the Universal Background Model (UBM), applied to data that are inexpensive to obtain. We then generate a number of anomalous network scenarios in the OMNeT++/INET network simulator and compare the detection outcomes with those of baseline approaches (RawData and Principal Component Analysis). The experimental results show the superiority of the HMM and HMM-UBM models in detection precision and sensitivity.
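As a rough illustration of likelihood-based anomaly flagging with an HMM (the UBM adaptation step is omitted), the following Python sketch trains a small Gaussian HMM on synthetic "normal" access-point usage and flags windows whose average log-likelihood falls well below the normal range. The hmmlearn usage, feature choice, and threshold are assumptions for illustration, not the authors' pipeline.

# Illustrative sketch: fit an HMM on known-normal AP usage counts and flag low-likelihood windows.
import numpy as np
from hmmlearn import hmm

rng = np.random.default_rng(1)
normal_usage = rng.poisson(lam=5, size=(500, 1)).astype(float)   # synthetic baseline traffic
test_window = rng.poisson(lam=30, size=(24, 1)).astype(float)    # suspiciously heavy usage

# Fit a small Gaussian HMM on normal usage; score new windows by average log-likelihood.
model = hmm.GaussianHMM(n_components=3, covariance_type="diag", n_iter=50, random_state=1)
model.fit(normal_usage)

threshold = model.score(normal_usage) / len(normal_usage) - 2.0   # crude margin below normal
score = model.score(test_window) / len(test_window)
print("anomalous window" if score < threshold else "normal window")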
2021
Authors
Matta, A; Pinto, JR; Cardoso, JS;
Publication
Trends and Applications in Information Systems and Technologies - Volume 3, WorldCIST 2021, Terceira Island, Azores, Portugal, 30 March - 2 April, 2021.
Abstract
Face Recognition (FR) is a challenging task, especially when dealing with unknown identities. While Open-Set Face Recognition (OSFR) assigns a single class to all unfamiliar subjects, Open-World Face Recognition (OWFR) employs an incremental approach, creating a new class for each unknown individual. Current OWFR approaches still present limitations, mainly regarding the accuracy gap to standard closed-set approaches and execution time. This paper proposes a fast and simple mixture-based OWFR algorithm that tackles the execution time issue while avoiding accuracy decay. The proposed method uses data curve representations and Universal Background Models based on Gaussian Mixture Models. Experimental results show that the proposed approach achieves competitive performance, considering accuracy and execution time, in both closed-set and open-world scenarios.
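The sketch below illustrates the general GMM-UBM open-world idea in Python: a Universal Background Model is fitted on pooled face embeddings, each enrolled identity gets its own small GMM, and a probe whose best likelihood ratio against the UBM stays below a threshold opens a new class. Dimensions, thresholds, and the use of raw embeddings (instead of the paper's curve representations) are illustrative assumptions.

# Hypothetical open-world enrollment loop with a GMM-based UBM and per-identity models.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
dim = 8
pool = rng.normal(size=(600, dim))                       # pooled embeddings from many faces
ubm = GaussianMixture(n_components=4, covariance_type="diag", random_state=2).fit(pool)

gallery = {}                                             # identity id -> per-identity GMM

def enroll(identity, embeddings):
    gallery[identity] = GaussianMixture(n_components=1, covariance_type="diag",
                                        random_state=2).fit(embeddings)

def identify(probe, threshold=0.5):
    """Return the best-matching identity, or open a new class when nothing scores high enough."""
    probe = probe.reshape(1, -1)
    scores = {i: g.score(probe) - ubm.score(probe) for i, g in gallery.items()}
    if scores and max(scores.values()) > threshold:
        return max(scores, key=scores.get)
    new_id = f"unknown_{len(gallery)}"                        # open-world step: new class
    enroll(new_id, probe + 0.01 * rng.normal(size=(5, dim)))  # bootstrap with jittered copies
    return new_id

enroll("alice", rng.normal(loc=2.0, size=(20, dim)))
print(identify(rng.normal(loc=2.0, size=dim)))           # likely "alice"
print(identify(rng.normal(loc=-3.0, size=dim)))          # likely a new "unknown_*" class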
2021
Authors
Costa, P; Smailagic, A; Cardoso, JS; Campilho, A;
Publication
U.Porto Journal of Engineering
Abstract
Current state-of-the-art medical image segmentation methods require high-quality datasets to obtain good performance. However, medical specialists often disagree on the diagnosis; hence, datasets contain contradictory annotations. This, in turn, leads to difficulties in the optimization process of Deep Learning models and hinders performance. We propose a method to estimate uncertainty in Convolutional Neural Network (CNN) segmentation models that makes the training of CNNs more robust to contradictory annotations. In this work, we model two types of uncertainty, heteroscedastic and epistemic, without adding any supervisory signal other than the ground-truth segmentation mask. As expected, the uncertainty is higher close to vessel boundaries and on top of thinner and less visible vessels, where it is more likely for medical specialists to disagree. Therefore, our method is more suitable for learning from datasets created with heterogeneous annotators. We show that there is a correlation between the uncertainty estimated by our method and the disagreement between the segmentations provided by two different medical specialists. Furthermore, by explicitly modeling the uncertainty, the Intersection over Union of the segmentation network improves by 5.7 percentage points.
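A condensed PyTorch sketch of the two uncertainty types on a toy pixel-wise segmenter follows: heteroscedastic (aleatoric) uncertainty via a predicted per-pixel log-variance that attenuates the loss through logit sampling, and epistemic uncertainty via Monte Carlo dropout at test time. The tiny network, sample counts, and binary-segmentation setup are assumptions for illustration, not the paper's architecture.

# Illustrative sketch: heteroscedastic loss attenuation plus MC-dropout epistemic uncertainty.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySegmenter(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.Dropout2d(0.2))
        self.logit_head = nn.Conv2d(8, 1, 1)      # per-pixel segmentation logit
        self.logvar_head = nn.Conv2d(8, 1, 1)     # per-pixel heteroscedastic log-variance

    def forward(self, x):
        h = self.body(x)
        return self.logit_head(h), self.logvar_head(h)

def heteroscedastic_bce(logit, log_var, target, samples=10):
    """Average BCE over logits corrupted by the predicted noise (aleatoric attenuation)."""
    std = torch.exp(0.5 * log_var)
    losses = [F.binary_cross_entropy_with_logits(logit + std * torch.randn_like(std), target)
              for _ in range(samples)]
    return torch.stack(losses).mean()

model = TinySegmenter()
x, y = torch.randn(2, 1, 32, 32), torch.randint(0, 2, (2, 1, 32, 32)).float()
logit, log_var = model(x)
loss = heteroscedastic_bce(logit, log_var, y)
loss.backward()

# Epistemic uncertainty: keep dropout active at inference and measure prediction spread.
model.train()                                      # keeps Dropout2d stochastic
with torch.no_grad():
    probs = torch.stack([torch.sigmoid(model(x)[0]) for _ in range(20)])
epistemic = probs.var(dim=0)                       # per-pixel variance across MC samples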