Publications

Publications by Tiago Filipe Gonçalves

2024

Abstract PO3-19-11: CINDERELLA Clinical Trial (NCT05196269): using artificial intelligence-driven healthcare to enhance breast cancer locoregional treatment decisions

Authors
Eduard-Alexandru Bonci; Orit Kaidar-Person; Marília Antunes; Oriana Ciani; Helena Cruz; Rosa Di Micco; Oreste Davide Gentilini; Nicole Rotmensz; Pedro Gouveia; Jörg Heil; Pawel Kabata; Nuno Freitas; Tiago Gonçalves; Miguel Romariz; Helena Montenegro; Hélder P. Oliveira; Jaime S. Cardoso; Henrique Martins; Daniela Lopes; Marta Martinho; Ludovica Borsoi; Elisabetta Listorti; Carlos Mavioso; Martin Mika; André Pfob; Timo Schinköthe; Giovani Silva; Maria-Joao Cardoso;

Publication
Cancer Research

Abstract
Background. Breast cancer treatment has improved overall survival rates, with different locoregional approaches offering patients similar locoregional control but variable aesthetic outcomes that may lead to disappointment and poor quality of life (QoL). There are no standardized methods for informing patients of the different therapies prior to intervention, nor validated tools for evaluation of aesthetics and patients' expectations. The CINDERELLA Project is based on years of research and developments of new healthcare technologies by various partners, aimed to provide an artificial intelligence (AI) tool to aid shared decision-making by showing breast cancer patients the predicted aesthetic outcomes of their locoregional treatment. The clinical trial will evaluate the use of this tool within an AI cloud-based platform approach (CINDERELLA App) versus a standard approach. We anticipate that the CINDERELLA App will lead to improved satisfaction, psychosocial well-being and health-related QoL while maintaining the quality of care and providing environmental and economic benefits.
Trial design. CINDERELLA is an international multicentric interventional randomized controlled open-label clinical trial. Using the CINDERELLA App, the AI and Digital Health arm will provide patients with complete information about the proposed types of locoregional treatments and photographs of similar patients previously treated with the same techniques. The Control arm will follow the standard approach of each clinical site. Randomization will be conducted online using the digital health platform CANKADO, ensuring a balanced distribution of participants between the two groups. CANKADO is the underlying platform through which physicians control the patients' app content and conduct all data collection. Privacy, data protection and ethical principles in AI usage were taken into account.
Eligibility criteria. Patients diagnosed with primary breast cancer without evidence of systemic disease. All patients must sign an informed consent and be able to use a web-based app autonomously or with home-based support.
Specific aims. Primary objective: to assess the levels of agreement among patients' expectations regarding the aesthetic outcome before and 12 months after locoregional treatment. The trial will also evaluate the aesthetic outcome level of agreement between the AI evaluation tool and self-evaluation. Secondary objectives: health-related QoL (EQ-5D-5L and BREAST-Q ICHOM questionnaires) and resource consumption (e.g., time spent in the hospital, out-of-pocket expenses). The questionnaires and photographs will be applied prior to any treatment, at wound healing, at 6 and 12 months following the completion of locoregional therapy.
Statistical methods. Wilcoxon signed rank test will be used to assess the intervention's impact on the agreement level between expectations and obtained results. Weighted Cohen's kappa will be calculated to measure the improvement in classifying aesthetic results with intervention. Statistical tests and/or bootstrap techniques will compare results between arms. A similarity measure will be calculated between self-evaluation and outcome obtained with the AI tool for each participant, and a beta regression model will be used to analyze the intervention's effect. Secondary objectives will be evaluated by scoring questionnaires based on provided guidelines.
Target accrual. The clinical trial, led by Champalimaud Clinical Centre, will enroll a minimum of 515 patients in each arm between July 2023 and January 2025. Recruitment is currently open at five study sites in Germany, Israel, Italy, Poland and Portugal. The clinical trial is still open for further international study sites.
Funding. European Union grant HORIZON-HLTH-2021-DISEASE-04-04 Agreement No. 101057389.
Citation Format: Eduard-Alexandru Bonci, Orit Kaidar-Person, Marília Antunes, Oriana Ciani, Helena Cruz, Rosa Di Micco, Oreste Davide Gentilini, Nicole Rotmensz, Pedro Gouveia, Jörg Heil, Pawel Kabata, Nuno Freitas, Tiago Gonçalves, Miguel Romariz, Helena Montenegro, Hélder P. Oliveira, Jaime S. Cardoso, Henrique Martins, Daniela Lopes, Marta Martinho, Ludovica Borsoi, Elisabetta Listorti, Carlos Mavioso, Martin Mika, André Pfob, Timo Schinköthe, Giovani Silva, Maria-Joao Cardoso. CINDERELLA Clinical Trial (NCT05196269): using artificial intelligence-driven healthcare to enhance breast cancer locoregional treatment decisions [abstract]. In: Proceedings of the 2023 San Antonio Breast Cancer Symposium; 2023 Dec 5-9; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2024;84(9 Suppl):Abstract nr PO3-19-11.
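The statistical methods in the abstract above mention weighted Cohen's kappa as an agreement measure. A minimal sketch of that statistic follows, assuming ordinal ratings coded as integers from 0 upward; the function name `weighted_kappa`, the four-level scale in the usage note, and the choice between linear and quadratic weights are illustrative assumptions, not details taken from the trial protocol.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, n_categories, weighting="linear"):
    """Weighted Cohen's kappa for two equal-length lists of ordinal labels.

    Labels are assumed to be integers in 0..n_categories-1 (hypothetical coding).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("at least two categories are required")
    n = len(rater_a)
    power = 1 if weighting == "linear" else 2

    # Disagreement weight: 0 on the diagonal, 1 for the two most distant categories.
    def weight(i, j):
        return (abs(i - j) / (n_categories - 1)) ** power

    observed = Counter(zip(rater_a, rater_b))
    marginal_a = Counter(rater_a)
    marginal_b = Counter(rater_b)

    # Weighted disagreement actually observed between the two raters.
    observed_disagreement = sum(
        weight(i, j) * count / n for (i, j), count in observed.items()
    )
    # Weighted disagreement expected if the raters were independent.
    expected_disagreement = sum(
        weight(i, j) * (marginal_a[i] / n) * (marginal_b[j] / n)
        for i in range(n_categories)
        for j in range(n_categories)
    )
    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1.0 - observed_disagreement / expected_disagreement
```

With a hypothetical four-level aesthetic scale, identical ratings give a kappa of 1, and fully reversed ratings such as `[0, 1, 2, 3]` against `[3, 2, 1, 0]` give a negative value, which indicates agreement worse than chance.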


2025

Evaluating the Impact of Pulse Oximetry Bias in Machine Learning Under Counterfactual Thinking

Authors
Martins, I; Matos, J; Goncalves, T; Celi, LA; Wong, AKI; Cardoso, JS;

Publication
APPLICATIONS OF MEDICAL ARTIFICIAL INTELLIGENCE, AMAI 2024

Abstract
Algorithmic bias in healthcare mirrors existing data biases. However, the factors driving unfairness are not always known. Medical devices capture significant amounts of data but are prone to errors; for instance, pulse oximeters overestimate the arterial oxygen saturation of darker-skinned individuals, leading to worse outcomes. The impact of this bias in machine learning (ML) models remains unclear. This study addresses the technical challenges of quantifying the impact of medical device bias in downstream ML. Our experiments compare a perfect world, without pulse oximetry bias, using SaO2 (blood-gas), to the actual world, with biased measurements, using SpO2 (pulse oximetry). Under this counterfactual design, two models are trained with identical data, features, and settings, except for the method of measuring oxygen saturation: models using SaO2 are a control and models using SpO2 a treatment. The blood-gas oximetry linked dataset was a suitable testbed, containing 163,396 nearly-simultaneous SpO2-SaO2 paired measurements, aligned with a wide array of clinical features and outcomes. We studied three classification tasks: in-hospital mortality, respiratory SOFA score in the next 24 h, and SOFA score increase by two points. Models using SaO2 instead of SpO2 generally showed better performance. Patients with overestimation of O2 by pulse oximetry of ≥ 3% had significant decreases in mortality prediction recall, from 0.63 to 0.59, P < 0.001. This mirrors clinical processes where biased pulse oximetry readings provide clinicians with false reassurance of patients' oxygen levels. A similar degradation happened in ML models, with pulse oximetry biases leading to more false negatives in predicting adverse outcomes.
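The link between false negatives and recall reported in the abstract above can be illustrated with a short sketch. It assumes binary labels (1 for the adverse outcome) and compares a hypothetical control model with a hypothetical treatment model on the same patients; the function name `recall` and the toy predictions in the usage note are invented for illustration and are not study data.

```python
def recall(y_true, y_pred):
    """Recall = TP / (TP + FN) for binary labels, with 1 as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_negatives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if true_positives + false_negatives == 0:
        raise ValueError("recall is undefined without positive cases")
    return true_positives / (true_positives + false_negatives)


def recall_gap(y_true, control_pred, treatment_pred):
    """Recall of the control model minus recall of the treatment model."""
    return recall(y_true, control_pred) - recall(y_true, treatment_pred)
```

For toy labels `[1, 1, 1, 1, 0, 0]`, control predictions `[1, 1, 1, 0, 0, 0]` miss one positive case and treatment predictions `[1, 1, 0, 0, 0, 1]` miss two, so the extra false negative is what lowers the treatment model's recall.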


2022

A survey on attention mechanisms for medical applications: are we moving towards better algorithms?

Authors
Gonçalves, T; Torto, IR; Teixeira, LF; Cardoso, JS;

Publication
CoRR

Abstract
The increasing popularity of attention mechanisms in deep learning algorithms for computer vision and natural language processing has made these models attractive to other research domains. In healthcare, there is a strong need for tools that may improve the routines of clinicians and patients. Naturally, attention-based algorithms were readily adopted for medical applications. However, since healthcare is a domain that depends on high-stakes decisions, the scientific community must ponder whether these high-performing algorithms fit the needs of medical applications. With this in mind, this paper extensively reviews the use of attention mechanisms in machine learning (including Transformers) for several medical applications. This work distinguishes itself from its predecessors by proposing a critical analysis of the claims and potentialities of attention mechanisms presented in the literature through an experimental case study on medical image classification with three different use cases. These experiments focus on the process of integrating attention mechanisms into established deep learning architectures, the analysis of their predictive power, and a visual assessment of their saliency maps generated by post-hoc explanation methods. This paper concludes with a critical analysis of the claims and potentialities presented in the literature about attention mechanisms and proposes future research lines in medical applications that may benefit from these frameworks.


2024

On the Suitability of B-cos Networks for the Medical Domain

Authors
Rio Torto, I; Gonçalves, T; Cardoso, JS; Teixeira, LF;

Publication
IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING, ISBI 2024

Abstract
In fields that rely on high-stakes decisions, such as medicine, interpretability plays a key role in promoting trust and facilitating the adoption of deep learning models by the clinical communities. In the medical image analysis domain, gradient-based class activation maps are the most widely used explanation methods, and the field lacks a more in-depth investigation into inherently interpretable models that focus on integrating knowledge that ensures the model is learning the correct rules. A new approach, B-cos networks, for increasing the interpretability of deep neural networks by inducing weight-input alignment during training showed promising results on natural image classification. In this work, we study the suitability of these B-cos networks to the medical domain by testing them on different use cases (skin lesions, diabetic retinopathy, cervical cytology, and chest X-rays) and conducting a thorough evaluation of several explanation quality assessment metrics. We find that, just like in natural image classification, B-cos explanations yield more localised maps, but it is not clear that they are better than other methods' explanations when considering more explanation properties.


2024

Massively Annotated Datasets for Assessment of Synthetic and Real Data in Face Recognition

Authors
Neto, PC; Mamede, RM; Albuquerque, C; Gonçalves, T; Sequeira, AF;

Publication
2024 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION, FG 2024

Abstract
Face recognition applications have grown in parallel with the size of datasets, complexity of deep learning models and computational power. However, while deep learning models evolve to become more capable and computational power keeps increasing, the datasets available are being retracted and removed from public access. Privacy and ethical concerns are relevant topics within these domains. Through generative artificial intelligence, researchers have put efforts into the development of completely synthetic datasets that can be used to train face recognition systems. Nonetheless, the recent advances have not been sufficient to achieve performance comparable to the state-of-the-art models trained on real data. To study the drift between the performance of models trained on real and synthetic datasets, we leverage a massive attribute classifier (MAC) to create annotations for four datasets: two real and two synthetic. From these annotations, we conduct studies on the distribution of each attribute within all four datasets. Additionally, we further inspect the differences between real and synthetic datasets on the attribute set. When comparing through the Kullback-Leibler divergence we have found differences between real and synthetic samples. Interestingly enough, we have verified that while real samples suffice to explain the synthetic distribution, the opposite could not be further from being true.
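The abstract above compares attribute distributions through the Kullback-Leibler divergence, which is asymmetric, so the order of the two distributions matters. A minimal sketch for discrete distributions follows; the function name `kl_divergence`, the `eps` floor used to avoid division by zero, and the toy distributions in the usage note are assumptions made for illustration and do not come from the paper.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions over the same categories.

    Entries of q are floored at eps (an assumed smoothing choice) so that a
    category present in p but absent from q yields a large, finite penalty.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of categories")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0:  # terms with p_i == 0 contribute nothing
            total += p_i * math.log(p_i / max(q_i, eps))
    return total
```

With toy attribute frequencies `real = [0.5, 0.3, 0.2]` and `synthetic = [0.9, 0.1, 0.0]`, `kl_divergence(real, synthetic)` is much larger than `kl_divergence(synthetic, real)`, because the third category appears in the first distribution but never in the second.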


2023

Evaluating Privacy on Synthetic Images Generated using GANs: Contributions of the VCMI Team to ImageCLEFmedical GANs 2023

Authors
Montenegro, H; Neto, PC; Patrício, C; Torto, IR; Gonçalves, T; Teixeira, LF;

Publication
CLEF (Working Notes)

Abstract
This paper presents the main contributions of the VCMI Team to the ImageCLEFmedical GANs 2023 task. This task aims to evaluate whether synthetic medical images generated using Generative Adversarial Networks (GANs) contain identifiable characteristics of the training data. We propose various approaches to classify a set of real images as having been used or not used in the training of the model that generated a set of synthetic images. We use similarity-based approaches to classify the real images based on their similarity to the generated ones. We develop autoencoders to classify the images through outlier detection techniques. Finally, we develop patch-based methods that operate on patches extracted from real and generated images to measure their similarity. On the development dataset, we attained an F1-score of 0.846 and an accuracy of 0.850 using an autoencoder-based method. On the test dataset, a similarity-based approach achieved the best results, with an F1-score of 0.801 and an accuracy of 0.810. The empirical results support the hypothesis that medical data generated using deep generative models trained without privacy constraints threatens the privacy of patients in the training data.
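The similarity-based idea described in the abstract above can be sketched as a nearest-neighbour test: a real image is flagged as likely used in training when it is very close to at least one generated image. The sketch below assumes that images have already been turned into feature vectors; cosine similarity, the 0.95 threshold, and the function names are hypothetical choices, not the team's actual method.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def flag_as_used(real_vectors, generated_vectors, threshold=0.95):
    """Return one boolean per real vector: True if its best match among the
    generated vectors reaches the (assumed) similarity threshold."""
    if not generated_vectors:
        raise ValueError("at least one generated vector is required")
    return [
        max(cosine_similarity(real, gen) for gen in generated_vectors) >= threshold
        for real in real_vectors
    ]
```

For example, `flag_as_used([[1, 0], [0, 1]], [[0.9, 0.1]])` flags the first vector and not the second. In practice the threshold would need to be tuned on a development set.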
