2023
Authors
Silva, DTE; Cruz, R; Goncalves, T; Carneiro, D;
Publication
FIFTEENTH INTERNATIONAL CONFERENCE ON MACHINE VISION, ICMV 2022
Abstract
Semantic segmentation consists of classifying each pixel according to a set of classes. This process is particularly slow for high-resolution images, which are present in many applications, ranging from biomedicine to the automotive industry. In this work, we propose a two-stage algorithm for segmenting high-resolution images. During stage 1, a lower-resolution interpolation of the image is the input of a first neural network, whose low-resolution output is resized to the original resolution. Then, in stage 2, the probabilities resulting from stage 1 are divided into contiguous patches, with the less confident ones being collected and refined by a second neural network. The main novelty of this algorithm is the aggregation of the low-resolution result from stage 1 with the high-resolution patches from stage 2. We use the U-Net architecture for segmentation and evaluate the method on six databases. Our method achieves Dice coefficients similar to the baseline while requiring fewer arithmetic operations.
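The stage 2 patch selection and aggregation described in this abstract can be illustrated with a minimal sketch. The abstract does not state how confidence is measured or how the two results are combined, so everything here is an assumption: confidence is taken as the mean of max(p, 1 - p) over a binary probability map, aggregation simply overwrites the selected patches, and the names `patch_confidence`, `select_uncertain_patches`, `aggregate` and the threshold value are hypothetical. The second network is replaced by a fixed stand-in patch.

```python
def patch_confidence(prob, top, left, size):
    """Mean of max(p, 1 - p) over one square patch (assumed confidence measure)."""
    vals = [max(prob[r][c], 1.0 - prob[r][c])
            for r in range(top, top + size)
            for c in range(left, left + size)]
    return sum(vals) / len(vals)


def select_uncertain_patches(prob, size, threshold):
    """Return (top, left) corners of patches whose mean confidence is below threshold.

    Assumes the map height and width are multiples of `size`.
    """
    height, width = len(prob), len(prob[0])
    corners = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            if patch_confidence(prob, top, left, size) < threshold:
                corners.append((top, left))
    return corners


def aggregate(prob, refined, size):
    """Copy the stage-1 map and overwrite each refined patch (assumed aggregation rule)."""
    out = [row[:] for row in prob]
    for (top, left), patch in refined.items():
        for r in range(size):
            out[top + r][left:left + size] = patch[r]
    return out


# Toy 4x4 stage-1 output, already resized to the original resolution.
stage1 = [
    [0.9, 0.9, 0.5, 0.6],
    [0.9, 0.9, 0.4, 0.5],
    [0.1, 0.1, 0.9, 0.9],
    [0.1, 0.1, 0.9, 0.9],
]
uncertain = select_uncertain_patches(stage1, size=2, threshold=0.75)

# Stand-in for the second network: a fixed, hypothetical refined patch.
refined = {corner: [[1.0, 1.0], [0.0, 1.0]] for corner in uncertain}
final = aggregate(stage1, refined, size=2)
```

In this toy input only the top-right patch falls below the threshold, so only that patch would be sent to the second network; the other three keep their stage-1 values.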
2023
Authors
Caldeira, E; Neto, PC; Gonçalves, T; Damer, N; Sequeira, AF; Cardoso, JS;
Publication
31st European Signal Processing Conference, EUSIPCO 2023, Helsinki, Finland, September 4-8, 2023
Abstract
Morphing attacks keep threatening biometric systems, especially face recognition systems. Over time they have become simpler to perform and more realistic; as such, the usage of deep learning systems to detect these attacks has grown. At the same time, there is a constant concern regarding the lack of interpretability of deep learning models. Balancing performance and interpretability has been a difficult task for scientists. However, by leveraging domain information and proving some constraints, we have been able to develop IDistill, an interpretable method with state-of-the-art performance that provides information on both the identity separation on morph samples and their contribution to the final prediction. The domain information is learnt by an autoencoder and distilled to a classifier system in order to teach it to separate identity information. When compared to other methods in the literature, it outperforms them in three out of five databases and is competitive in the remaining two.
2023
Authors
Serrano e Silva, P; Cruz, R; Shihavuddin, ASM; Gonçalves, T;
Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Abstract
2023
Authors
Castro, E; Ferreira, PM; Rebelo, A; Rio-Torto, I; Capozzi, L; Ferreira, MF; Goncalves, T; Albuquerque, T; Silva, W; Afonso, C; Sousa, RG; Cimarelli, C; Daoudi, N; Moreira, G; Yang, HY; Hrga, I; Ahmad, J; Keswani, M; Beco, S;
Publication
MACHINE VISION AND APPLICATIONS
Abstract
Every year, the VISion Understanding and Machine intelligence (VISUM) summer school runs a competition where participants can learn and share knowledge about Computer Vision and Machine Learning in a vibrant environment. VISUM 2021 focused on applying those methodologies to fashion. Recently, there has been an increase of interest within the scientific community in applying computer vision methodologies to the fashion domain. This interest is largely motivated by fashion being one of the world's largest industries, with e-commerce developing rapidly, mainly since the COVID-19 pandemic. Computer Vision for Fashion enables a wide range of innovations, from personalized recommendations to outfit matching. The competition enabled students to apply the knowledge acquired in the summer school to a real-world problem. The ambition was to foster research and development in fashion outfit complementary product retrieval by leveraging vast visual and textual data with domain knowledge. For this, a new fashion outfit dataset (acquired and curated by FARFETCH) for research and benchmark purposes is introduced. Additionally, a competitive baseline with an original negative sampling process for triplet mining was implemented and served as a starting point for participants. The top 3 performing methods are described in this paper since they constitute the reference state-of-the-art for this particular problem. To our knowledge, this is the first challenge in fashion outfit complementary product retrieval. Moreover, this joint project between academia and industry brings several relevant contributions to disseminating science and technology, promoting economic and social development, and helping to connect early-career researchers to real-world industry challenges.
2022
Authors
Silva, W; Goncalves, T; Härmä, K; Schroder, E; Obmann, VC; Barroso, MC; Poellinger, A; Reyes, M; Cardoso, JS;
Publication
SCIENTIFIC REPORTS
Abstract
Currently, radiologists face an excessive workload, which leads to high levels of fatigue and, consequently, to undesired diagnostic mistakes. Decision support systems can be used to prioritize cases and help radiologists make quicker decisions. In this sense, medical content-based image retrieval systems can be of extreme utility by providing well-curated similar examples. Nonetheless, most medical content-based image retrieval systems work by finding the most similar image, which is not equivalent to finding the most similar image in terms of disease and its severity. Here, we propose an interpretability-driven and an attention-driven medical image retrieval system. We conducted experiments on a large and publicly available dataset of chest radiographs with structured labels derived from free-text radiology reports (MIMIC-CXR-JPG). We evaluated the methods on two common conditions: pleural effusion and (potential) pneumonia. As ground truth for the evaluation, query/test and catalogue images were classified and ordered by an experienced board-certified radiologist. For a thorough evaluation, additional radiologists also provided their rankings, which allowed us to infer inter-rater variability and to derive qualitative performance levels. Based on our ground-truth ranking, we also quantitatively evaluated the proposed approaches by computing the normalized Discounted Cumulative Gain (nDCG). We found that the interpretability-guided approach outperforms the other state-of-the-art approaches and shows the best agreement with the most experienced radiologist. Furthermore, its performance lies within the observed inter-rater variability.
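For readers unfamiliar with the nDCG metric named in this abstract, the following is a minimal sketch of one common formulation. The abstract does not say which variant the authors used, so the linear gain with a log2 rank discount, the function names `dcg` and `ndcg`, and the relevance scores below are all assumptions made for illustration.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with linear gain and a log2(rank + 1) discount."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG of the given ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical relevance scores of retrieved images, in the order a system returned them.
perfect_order = ndcg([3, 2, 1])
reversed_order = ndcg([1, 2, 3])
```

A ranking that matches the ground-truth order scores 1.0, and any ranking that places less relevant items first scores lower, which is what makes the metric suitable for comparing a system's ranking against a radiologist's.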
2023
Authors
Cruz, R; Silva, DTE; Goncalves, T; Carneiro, D; Cardoso, JS;
Publication
SENSORS
Abstract
Semantic segmentation consists of classifying each pixel according to a set of classes. Conventional models spend as much effort classifying easy-to-segment pixels as they do classifying hard-to-segment pixels. This is inefficient, especially when deploying to situations with computational constraints. In this work, we propose a framework wherein the model first produces a rough segmentation of the image, and then patches of the image estimated as hard to segment are refined. The framework is evaluated on four datasets (autonomous driving and biomedical), across four state-of-the-art architectures. Our method speeds up inference by a factor of four, with additional gains in training time, at the cost of some output quality.