2024
Authors
Rio-Torto, I; Cardoso, JS; Teixeira, LF;
Publication
MEDICAL IMAGING WITH DEEP LEARNING
Abstract
The growing interest in, and importance of, explaining neural networks' predictions, especially in the medical community, combined with the known unreliability of saliency maps (the most common explainability method), has sparked research into other types of explanations. Natural Language Explanations (NLEs) emerge as an alternative, with the advantage of being inherently understandable by humans and the standard way in which radiologists explain their diagnoses. We extend previous work on NLE generation for multi-label chest X-ray diagnosis by replacing the traditional decoder-only NLE generator with an encoder-decoder architecture. This constitutes a first step towards Reinforcement Learning-free adversarial generation of NLEs when no (or few) ground-truth NLEs are available for training, since generation is performed in the continuous encoder latent space instead of the discrete decoder output space. However, in the current scenario, large numbers of annotated examples are still required, which are especially costly to obtain in the medical domain, given that they need to be provided by clinicians. Thus, we explore how recent developments in Parameter-Efficient Fine-Tuning (PEFT) can be leveraged for this use case. We compare different PEFT methods and find that integrating the visual information into the NLE generator layers, instead of only at the input, achieves the best results, even outperforming the fully fine-tuned encoder-decoder-based model while training only 12% of the model parameters. Additionally, we empirically demonstrate the viability of supervising the NLE generation process on the encoder latent space, thus laying the foundation for RL-free adversarial training in low ground-truth NLE availability regimes. The code is publicly available at https://github.com/icrto/peft-nles.
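As a rough illustration of the parameter-efficient fine-tuning idea described in the abstract, the sketch below wraps a generic encoder-decoder language model with LoRA adapters via the Hugging Face peft library, so that only a small fraction of the weights is trained. The backbone, hyperparameters, and adapted modules are illustrative assumptions; the paper's mechanism for injecting visual information into the generator layers is not reproduced here (see the linked repository for the actual implementation).

# Minimal PEFT sketch (hypothetical backbone and settings): only the LoRA
# adapter weights are trained, the rest of the model stays frozen.
from transformers import T5ForConditionalGeneration
from peft import LoraConfig, get_peft_model

base = T5ForConditionalGeneration.from_pretrained("t5-base")  # placeholder encoder-decoder

lora_cfg = LoraConfig(
    r=8,                        # rank of the low-rank weight updates
    lora_alpha=16,              # scaling factor for the LoRA updates
    target_modules=["q", "v"],  # attention projections to adapt
    lora_dropout=0.1,
    task_type="SEQ_2_SEQ_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # reports the trainable-parameter fraction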
2024
Authors
Montenegro, H; Cardoso, MJ; Cardoso, JS;
Publication
Computer Vision - ECCV 2024 Workshops - Milan, Italy, September 29-October 4, 2024, Proceedings, Part IX
Abstract
Breast cancer locoregional treatment can cause significant and long-lasting alterations to a patient's body. As various surgical options may be available to a patient, and considering the impact that the aesthetic outcome may have on the patient's self-esteem, it is critical for the patient to be adequately informed of the possible outcomes of each treatment when deciding on the treatment plan. To simulate how a patient may look after treatment, we propose a deep generative model that transfers asymmetries caused by treatment from images of post-operative patients into pre-operative images, taking advantage of the inherent symmetry of breast images. Furthermore, we disentangle asymmetries related to the breast shape from those related to the nipple within the latent space of the network, enabling finer control over the alterations to the breasts. Finally, we show the proposed model's wide applicability in medical imaging by applying it to generate counterfactual explanations for cardiomegaly and pleural effusion prediction in chest radiographs.
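A minimal sketch of the latent-space asymmetry transfer idea, assuming a trained encoder/decoder pair whose latent code is split into identity, breast-shape asymmetry, and nipple asymmetry components; the module names, latent partition, and forward pass are hypothetical and only illustrate the disentanglement-and-swap idea, not the paper's actual architecture.

# Hypothetical sketch: swap disentangled asymmetry components between a
# post-operative and a pre-operative image in latent space, then decode.
import torch
import torch.nn as nn

class AsymmetryTransfer(nn.Module):
    def __init__(self, encoder: nn.Module, decoder: nn.Module,
                 id_dim: int, shape_dim: int, nipple_dim: int):
        super().__init__()
        self.encoder, self.decoder = encoder, decoder
        self.splits = (id_dim, shape_dim, nipple_dim)

    def forward(self, pre_op: torch.Tensor, post_op: torch.Tensor) -> torch.Tensor:
        # Encode both images and split their latent codes into parts.
        z_pre = torch.split(self.encoder(pre_op), self.splits, dim=1)
        z_post = torch.split(self.encoder(post_op), self.splits, dim=1)
        # Keep the pre-operative identity; transfer the post-operative
        # breast-shape and nipple asymmetry components.
        z_mix = torch.cat([z_pre[0], z_post[1], z_post[2]], dim=1)
        return self.decoder(z_mix)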
2024
Authors
Patrício, C; Neves, JC; Teixeira, LF;
Publication
ACM COMPUTING SURVEYS
Abstract
The remarkable success of deep learning has prompted interest in its application to medical imaging diagnosis. Even though state-of-the-art deep learning models have achieved human-level accuracy on the classification of different types of medical data, these models are rarely adopted in clinical workflows, mainly due to their lack of interpretability. The black-box nature of deep learning models has raised the need for strategies to explain the decision process of these models, giving rise to the field of eXplainable Artificial Intelligence (XAI). In this context, we provide a thorough survey of XAI applied to medical imaging diagnosis, including visual, textual, example-based and concept-based explanation methods. Moreover, this work reviews the existing medical imaging datasets and the existing metrics for evaluating the quality of explanations. In addition, we include a performance comparison among a set of report-generation-based methods. Finally, we discuss the major challenges in applying XAI to medical imaging and future research directions on the topic.
2024
Authors
Patrício, C; Teixeira, LF; Neves, JC;
Publication
IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING, ISBI 2024
Abstract
Concept-based models naturally lend themselves to the development of inherently interpretable skin lesion diagnosis, as medical experts make decisions based on a set of visual patterns of the lesion. Nevertheless, the development of these models depends on the existence of concept-annotated datasets, whose availability is scarce due to the specialized knowledge and expertise required in the annotation process. In this work, we show that vision-language models can be used to alleviate the dependence on a large number of concept-annotated samples. In particular, we propose an embedding learning strategy to adapt CLIP to the downstream task of skin lesion classification using concept-based descriptions as textual embeddings. Our experiments reveal that vision-language models not only attain better accuracy when using concepts as textual embeddings, but also require a smaller number of concept-annotated samples to achieve comparable performance to approaches specifically devised for automatic concept generation.
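A hedged sketch of the general idea of using concept descriptions as textual embeddings for classification, assuming the CLIP model from Hugging Face transformers; the concept phrases, class names, and image path below are hypothetical placeholders, not the paper's prompts or its embedding learning strategy.

# Classify a lesion image by similarity to concept-based class descriptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical dermoscopic concept descriptions for two classes.
class_concepts = {
    "melanoma": "a skin lesion with asymmetry, irregular borders and atypical pigment network",
    "nevus": "a skin lesion with symmetry, regular borders and uniform pigmentation",
}

image = Image.open("lesion.jpg")  # placeholder dermoscopic image
inputs = processor(text=list(class_concepts.values()), images=image,
                   return_tensors="pt", padding=True)

with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds the image-text similarities; the highest one gives the class.
probs = outputs.logits_per_image.softmax(dim=-1)
predicted = list(class_concepts)[probs.argmax().item()]
print(predicted, probs.tolist())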
2024
Authors
Gomes, I; Teixeira, LF; van Rijn, JN; Soares, C; Restivo, A; Cunha, L; Santos, M;
Publication
CoRR
Abstract
2024
Authors
Patrício, C; Barbano, CA; Fiandrotti, A; Renzulli, R; Grangetto, M; Teixeira, LF; Neves, JC;
Publication
CoRR
Abstract