Publications

Publications by CTM

2025

Predicting Aesthetic Outcomes in Breast Cancer Surgery: A Multimodal Retrieval Approach

Authors
Zolfagharnasab, MH; Freitas, N; Gonçalves, T; Bonci, E; Mavioso, C; Cardoso, MJ; Oliveira, HP; Cardoso, JS;

Publication
ARTIFICIAL INTELLIGENCE AND IMAGING FOR DIAGNOSTIC AND TREATMENT CHALLENGES IN BREAST CARE, DEEP-BREATH 2024

Abstract
Breast cancer treatments often affect patients' body image, making aesthetic outcome predictions vital. This study introduces a Deep Learning (DL) multimodal retrieval pipeline using a dataset of 2,193 instances combining clinical attributes and RGB images of patients' upper torsos. We evaluate four retrieval techniques: Weighted Euclidean Distance (WED) with various configurations and shallow Artificial Neural Network (ANN) for tabular data, pre-trained and fine-tuned Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), and a multimodal approach combining both data types. The dataset, categorised into Excellent/Good and Fair/Poor outcomes, is organised into over 20K triplets for training and testing. Results show fine-tuned multimodal ViTs notably enhance performance, achieving up to 73.85% accuracy and 80.62% Adjusted Discounted Cumulative Gain (ADCG). This framework not only aids in managing patient expectations by retrieving the most relevant post-surgical images but also promises broad applications in medical image analysis and retrieval. The main contributions of this paper are the development of a multimodal retrieval system for breast cancer patients based on post-surgery aesthetic outcome and the evaluation of different models on a new dataset annotated by clinicians for image retrieval.
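As an illustration of the tabular retrieval baseline named in the abstract, the following is a minimal sketch of ranking a catalogue by Weighted Euclidean Distance. The function names, the toy feature vectors, and the weights are assumptions made for this example; the abstract does not state which clinical attributes or weight configurations the authors used.

```python
import math


def weighted_euclidean(a, b, weights):
    """Weighted Euclidean distance: sqrt(sum_i w_i * (a_i - b_i)^2)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def retrieve(query, catalogue, weights, k=3):
    """Return the indices of the k catalogue entries closest to the query."""
    ranked = sorted(
        range(len(catalogue)),
        key=lambda i: weighted_euclidean(query, catalogue[i], weights),
    )
    return ranked[:k]


# Hypothetical two-feature catalogue; the second feature is weighted 4x.
catalogue = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]
weights = [1.0, 4.0]
query = [0.0, 0.0]

top = retrieve(query, catalogue, weights, k=3)
print(top)
```

Raising a feature's weight pushes entries that differ on that feature further down the ranking, which is why the entry differing on the second feature ranks below the one differing on the first.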

2025

Endpoint Detection in Breast Images for Automatic Classification of Breast Cancer Aesthetic Results

Authors
Freitas, N; Veloso, C; Mavioso, C; Cardoso, MJ; Oliveira, HP; Cardoso, JS;

Publication
ARTIFICIAL INTELLIGENCE AND IMAGING FOR DIAGNOSTIC AND TREATMENT CHALLENGES IN BREAST CARE, DEEP-BREATH 2024

Abstract
Breast cancer is the most common type of cancer in women worldwide. Because of high survival rates, there has been an increased interest in patient Quality of Life after treatment. Aesthetic results play an important role in this aspect, as these treatments can leave a mark on a patient's self-image. Despite that, there are no standard ways of assessing aesthetic outcomes. Commonly used software tools such as BCCT.core or BAT require the manual annotation of keypoints, which makes them time-consuming for clinical use and can lead to result variability depending on the user. Recently, there have been attempts to leverage both traditional and Deep Learning algorithms to detect keypoints automatically. In this paper, we compare several methods for the detection of Breast Endpoints across two datasets. Furthermore, we present an extended evaluation of using these models as input for full contour prediction and aesthetic evaluation using the BCCT.core software. Overall, the YOLOv9 model, fine-tuned for this task, presents the best results considering both accuracy and usability, making this architecture the best choice for this application. The main contribution of this paper is the development of a pipeline for full breast contour prediction, which reduces clinician workload and user variability for automatic aesthetic assessment.

2025

Learning Ordinality in Semantic Segmentation

Authors
Cruz, RPM; Cristino, R; Cardoso, JS;

Publication
IEEE ACCESS

Abstract
Semantic segmentation consists of predicting a semantic label for each image pixel. While existing deep learning approaches achieve high accuracy, they often overlook the ordinal relationships between classes, which can provide critical domain knowledge (e.g., the pupil lies within the iris, and lane markings are part of the road). This paper introduces novel methods for spatial ordinal segmentation that explicitly incorporate these inter-class dependencies. By treating each pixel as part of a structured image space rather than as an independent observation, we propose two loss regularization terms and a new metric that enforce ordinal consistency between neighboring pixels by penalizing predictions of non-ordinal adjacent classes. Five biomedical datasets and multiple configurations of autonomous driving datasets demonstrate the efficacy of the proposed methods. Our approach achieves improvements in ordinal metrics and enhances generalization, with up to a 15.7% relative increase in the Dice coefficient. Importantly, these benefits come without additional inference time costs. This work highlights the significance of spatial ordinal relationships in semantic segmentation and provides a foundation for further exploration in structured image representations.
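To make the notion of "non-ordinal adjacent classes" concrete, here is a minimal sketch that counts neighboring pixel pairs whose class indices differ by more than one step. It is not the paper's loss terms or metric, whose exact definitions the abstract does not give; the function name, the 4-neighbor choice, and the assumption that classes form a single ordered chain (e.g. 0 = background, 1 = iris, 2 = pupil) are illustrative.

```python
def non_ordinal_adjacency_rate(labels):
    """Fraction of horizontally/vertically adjacent pixel pairs whose
    class indices differ by more than 1 (an ordinal 'jump')."""
    rows, cols = len(labels), len(labels[0])
    violations = 0
    pairs = 0
    for r in range(rows):
        for c in range(cols):
            # Look only right and down so each pair is counted once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    pairs += 1
                    if abs(labels[r][c] - labels[rr][cc]) > 1:
                        violations += 1
    return violations / pairs if pairs else 0.0


# Class 2 is always reached through class 1: no violations.
consistent = [[0, 0, 0],
              [0, 1, 1],
              [0, 1, 2]]

# Class 2 touches class 0 directly: two of the four pairs are violations.
inconsistent = [[0, 2],
                [0, 0]]

print(non_ordinal_adjacency_rate(consistent))
print(non_ordinal_adjacency_rate(inconsistent))
```

A differentiable training-time regularizer would operate on predicted class probabilities rather than hard labels; this hard-label count only shows what kind of prediction is being penalized.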

2025

Second FRCSyn-onGoing: Winning solutions and post-challenge analysis to improve face recognition with synthetic data

Authors
DeAndres Tame, I; Tolosana, R; Melzi, P; Vera Rodriguez, R; Kim, M; Rathgeb, C; Liu, XM; Gomez, LF; Morales, A; Fierrez, J; Ortega Garcia, J; Zhong, ZZ; Huang, YG; Mi, YX; Ding, SH; Zhou, SG; He, S; Fu, LZ; Cong, H; Zhang, RY; Xiao, ZH; Smirnov, E; Pimenov, A; Grigorev, A; Timoshenko, D; Asfaw, KM; Low, CY; Liu, H; Wang, CY; Zuo, Q; He, ZX; Shahreza, HO; George, A; Unnervik, A; Rahimi, P; Marcel, S; Neto, PC; Huber, M; Kolf, JN; Damer, N; Boutros, F; Cardoso, JS; Sequeira, AF; Atzori, A; Fenu, G; Marras, M; Struc, V; Yu, J; Li, ZJ; Li, JC; Zhao, WS; Lei, Z; Zhu, XY; Zhang, XY; Biesseck, B; Vidal, P; Coelho, L; Granada, R; Menotti, D;

Publication
INFORMATION FUSION

Abstract
Synthetic data is gaining increasing popularity for face recognition technologies, mainly due to the privacy concerns and challenges associated with obtaining real data, including diverse scenarios, quality, and demographic groups, among others. It also offers some advantages over real data, such as the large amount of data that can be generated or the ability to customize it to adapt to specific problem-solving needs. To effectively use such data, face recognition models should also be specifically designed to exploit synthetic data to its fullest potential. In order to promote the proposal of novel Generative AI methods and synthetic data, and investigate the application of synthetic data to better train face recognition systems, we introduce the 2nd FRCSyn-onGoing challenge, based on the 2nd Face Recognition Challenge in the Era of Synthetic Data (FRCSyn), originally launched at CVPR 2024. This is an ongoing challenge that provides researchers with an accessible platform to benchmark (i) the proposal of novel Generative AI methods and synthetic data, and (ii) novel face recognition systems that are specifically proposed to take advantage of synthetic data. We focus on exploring the use of synthetic data both individually and in combination with real data to solve current challenges in face recognition such as demographic bias, domain adaptation, and performance constraints in demanding situations, such as age disparities between training and testing, changes in the pose, or occlusions. Very interesting findings are obtained in this second edition, including a direct comparison with the first one, in which synthetic databases were restricted to DCFace and GANDiffFace.

2025

CNN explanation methods for ordinal regression tasks

Authors
Barbero Gómez, J; Cruz, RPM; Cardoso, JS; Gutiérrez, PA; Hervás Martínez, C;

Publication
NEUROCOMPUTING

Abstract
The use of Convolutional Neural Network (CNN) models for image classification tasks has gained significant popularity. However, the lack of interpretability in CNN models poses challenges for debugging and validation. To address this issue, various explanation methods have been developed to provide insights into CNN models. This paper focuses on the validity of these explanation methods for ordinal regression tasks, where the classes have a predefined order relationship. Different modifications are proposed for two explanation methods to exploit the ordinal relationships between classes: Grad-CAM based on Ordinal Binary Decomposition (GradOBD-CAM) and Ordinal Information Bottleneck Analysis (OIBA). The performance of these modified methods is compared to existing popular alternatives. Experimental results demonstrate that GradOBD-CAM outperforms other methods in terms of interpretability for three out of four datasets, while OIBA achieves superior performance compared to IBA.
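For readers unfamiliar with Ordinal Binary Decomposition, the sketch below shows the common encoding in which a K-class ordinal label becomes K-1 binary targets of the form "is the label greater than k?". The helper names and the 0.5 threshold are assumptions for this example; how GradOBD-CAM builds its explanation on top of these binary outputs is not described in the abstract and is not shown here.

```python
def obd_encode(label, num_classes):
    """Encode an ordinal label as num_classes - 1 binary targets,
    where target k answers 'is label > k?'."""
    return [1 if label > k else 0 for k in range(num_classes - 1)]


def obd_decode(probs, threshold=0.5):
    """Decode by counting how many 'label > k' outputs exceed the threshold."""
    return sum(1 for p in probs if p > threshold)


# Four ordered classes (0..3): label 2 is above thresholds 0 and 1 only.
targets = obd_encode(2, 4)
print(targets)

# Hypothetical model outputs for the three binary questions.
predicted = obd_decode([0.9, 0.7, 0.2])
print(predicted)
```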

2025

MST-KD: Multiple Specialized Teachers Knowledge Distillation for Fair Face Recognition

Authors
Caldeira, E; Cardoso, JS; Sequeira, AF; Neto, PC;

Publication
COMPUTER VISION-ECCV 2024 WORKSHOPS, PT XV

Abstract
As in school, a single teacher covering all subjects is insufficient to distill equally robust information to a student. Hence, each subject is taught by a highly specialized teacher. Following a similar philosophy, we propose a multiple specialized teacher framework to distill knowledge to a student network. In our approach, directed at face recognition use cases, we train four teachers, each on one specific ethnicity, leading to four highly specialized and biased teachers. Our strategy learns a projection of these four teachers into a common space and distills that information to a student network. Our results highlighted increased performance and reduced bias for all our experiments. In addition, we further show that having biased/specialized teachers is crucial by showing that our approach achieves better results than when knowledge is distilled from four teachers trained on balanced datasets. Our approach represents a step toward understanding the importance of ethnicity-specific features.
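The general shape of distilling from several teachers through a shared space can be sketched as follows. This is an illustration under stated assumptions, not the MST-KD implementation: concatenating the teacher embeddings, using a fixed linear projection matrix (in the paper the projection is learned), and using a mean squared error loss are all choices made for this example, and every name below is hypothetical.

```python
def project(teacher_embeddings, matrix):
    """Concatenate the teacher embeddings and apply a linear projection
    (one matrix row per output dimension) into a common space."""
    concat = [v for emb in teacher_embeddings for v in emb]
    return [sum(w * x for w, x in zip(row, concat)) for row in matrix]


def distillation_loss(student, target):
    """Mean squared error between the student embedding and the target."""
    return sum((s - t) ** 2 for s, t in zip(student, target)) / len(target)


# Four hypothetical 2-D teacher embeddings for the same face image.
teachers = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

# Fixed stand-in projection: average each dimension across the four teachers.
matrix = [
    [0.25, 0.0, 0.25, 0.0, 0.25, 0.0, 0.25, 0.0],
    [0.0, 0.25, 0.0, 0.25, 0.0, 0.25, 0.0, 0.25],
]

target = project(teachers, matrix)
loss = distillation_loss([1.5, 0.5], target)
print(target, loss)
```

In a real training loop the projection weights and the student would be optimized with a deep learning framework; the plain-Python version only shows how the four teacher outputs are combined into one distillation target.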