2023
Authors
Torto, IR; Patrício, C; Montenegro, H; Gonçalves, T; Cardoso, JS;
Publication
Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023), Thessaloniki, Greece, September 18th to 21st, 2023.
Abstract
This paper presents the main contributions of the VCMI Team to the ImageCLEFmedical Caption 2023 task. We addressed both the concept detection and caption prediction tasks. Regarding concept detection, our team employed different approaches to assign concepts to medical images: multi-label classification, adversarial training, autoregressive modelling, image retrieval, and concept retrieval. We also developed three model ensembles merging the results of some of the proposed methods. Our best submission obtained an F1-score of 0.4998, ranking 3rd among nine teams. Regarding the caption prediction task, our team explored two main approaches based on image retrieval and language generation. The language generation approaches, based on a vision model as the encoder and a language model as the decoder, yielded the best results, allowing us to rank 5th among thirteen teams, with a BERTScore of 0.6147. © 2023 Copyright for this paper by its authors.
2023
Authors
Matos, J; Struja, T; Gallifant, J; Nakayama, LF; Charpignon, M; Liu, X; Economou-Zavlanos, N; Cardoso, JS; Johnson, KS; Bhavsar, N; Gichoya, JW; Celi, LA; Wong, AI;
Publication
Abstract
2023
Authors
Barbero-Gómez, J; Cruz, R; Cardoso, JS; Gutiérrez, PA; Hervás-Martínez, C;
Publication
ADVANCES IN COMPUTATIONAL INTELLIGENCE, IWANN 2023, PT II
Abstract
This paper introduces an evaluation procedure to validate the efficacy of explanation methods for Convolutional Neural Network (CNN) models in ordinal regression tasks. Two ordinal methods are contrasted against a baseline using cross-entropy, across four datasets. A statistical analysis demonstrates that attribution methods, such as Grad-CAM and IBA, perform significantly better when used with ordinal regression CNN models compared to a baseline approach in most ordinal and nominal metrics. The study suggests that incorporating ordinal information into the attribution map construction process may improve the explanations further.
2024
Authors
Alves, VM; Cardoso, JD; Gama, J;
Publication
NUCLEAR MEDICINE AND MOLECULAR IMAGING
Abstract
Purpose: 2-[18F]FDG PET/CT plays an important role in the management of pulmonary nodules. Convolutional neural networks (CNNs) automatically learn features from images and have the potential to improve the discrimination between malignant and benign pulmonary nodules. The purpose of this study was to develop and validate a CNN model for the classification of pulmonary nodules from 2-[18F]FDG PET images. Methods: One hundred thirteen participants were retrospectively selected, with one nodule per participant. The 2-[18F]FDG PET images were preprocessed and annotated with the reference standard. The deep learning experiment entailed random data splitting into five sets. A test set was held out for the evaluation of the final model. Four-fold cross-validation was performed on the remaining sets to train and evaluate a set of candidate models and to select the final model. Models of three types of 3D CNN architectures were trained from random weight initialization (Stacked 3D CNN, VGG-like, and Inception-v2-like models), both on the original and on augmented datasets. Transfer learning from ImageNet with ResNet-50 was also used. Results: The final model (Stacked 3D CNN model) obtained an area under the ROC curve of 0.8385 (95% CI: 0.6455-1.0000) on the test set. The model had a sensitivity of 80.00%, a specificity of 69.23%, and an accuracy of 73.91% on the test set, for an optimised decision threshold that assigns a higher cost to false negatives. Conclusion: A 3D CNN model was effective at distinguishing benign from malignant pulmonary nodules in 2-[18F]FDG PET images.
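The splitting protocol described in the abstract (a held-out test set plus four cross-validation folds over the remaining data) can be sketched as follows. This is a minimal illustration, not the authors' code; the 20% test fraction and random seed are assumptions for the example.

```python
import numpy as np

def split_indices(n, test_frac=0.2, n_folds=4, seed=0):
    """Hold out a random test set, then partition the remaining
    indices into n_folds cross-validation folds.

    Returns (test_idx, folds) where folds is a list of index arrays.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)               # shuffle all sample indices
    n_test = int(round(n * test_frac))     # size of the held-out test set
    test_idx = idx[:n_test]
    rest = idx[n_test:]
    folds = np.array_split(rest, n_folds)  # four near-equal CV folds
    return test_idx, folds

# e.g. for the 113 participants in the study:
test_idx, folds = split_indices(113)
```

Each candidate model is then trained on three folds and validated on the fourth, rotating the validation fold, while `test_idx` is touched only once, for the final model.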
2023
Authors
Melo, T; Cardoso, J; Carneiro, A; Campilho, A; Mendonça, AM;
Publication
2023 IEEE 36TH INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS, CBMS
Abstract
The development of accurate methods for OCT image analysis is highly dependent on the availability of large annotated datasets. As such datasets are usually expensive and hard to obtain, novel approaches based on deep generative models have been proposed for data augmentation. In this work, a flow-based network (SRFlow) and a generative adversarial network (ESRGAN) are used for synthesizing high-resolution OCT B-scans from low-resolution versions of real OCT images. The quality of the images generated by the two models is assessed using two standard fidelity-oriented metrics and a learned perceptual quality metric. The performance of two classification models trained on real and synthetic images is also evaluated. The obtained results show that the images generated by SRFlow preserve higher fidelity to the ground truth, while the outputs of ESRGAN present, on average, better perceptual quality. Independently of the architecture of the network chosen to classify the OCT B-scans, the model's performance always improves when images generated by SRFlow are included in the training set.
2023
Authors
Matos, J; Struja, T; Gallifant, J; Charpignon, ML; Cardoso, JS; Celi, LA;
Publication
2023 IEEE 7TH PORTUGUESE MEETING ON BIOENGINEERING, ENBENG
Abstract
Pulse oximeters are medical devices used to assess peripheral arterial oxygen saturation (SpO2) noninvasively. In contrast, the gold standard requires arterial blood to be drawn to measure the arterial oxygen saturation (SaO2). Devices currently on the market measure SpO2 with lower accuracy in populations with darker skin tones. Pulse oximetry inaccuracies can yield episodes of hidden hypoxemia (HH), with SpO2 ≥ 88% but SaO2 < 88%. HH can result in less treatment and increased mortality. Despite being flawed, pulse oximeters remain ubiquitously used; debiasing models could alleviate the downstream repercussions of HH. To our knowledge, this is the first study to propose such models. Experiments were conducted using the MIMIC-IV dataset. The cohort includes patients admitted to the Intensive Care Unit with paired (SaO2, SpO2) measurements captured within 10 min of each other. We built an XGBoost regression model predicting SaO2 from SpO2, patient demographics, physiological data, and treatment information. We used an asymmetric mean squared error as the loss function to minimize falsely elevated predicted values. The model achieved R² = 67.6% among Black patients; the frequency of HH episodes was partially mitigated. Respiratory function was most predictive of SaO2; race-ethnicity was not a top predictor. This single-center study shows that SpO2 corrections can be achieved with machine learning. In future work, the model will be validated on additional patient cohorts featuring diverse settings.
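An asymmetric mean squared error of the kind the abstract describes can be supplied to XGBoost as a custom objective via its gradient and Hessian. The sketch below is illustrative only: the penalty ratio `alpha` is a hypothetical value (the abstract does not report the weighting used), and over-predictions (falsely elevated SaO2 estimates) are the side penalised more heavily.

```python
import numpy as np

def asymmetric_mse_objective(preds, labels, alpha=3.0):
    """Gradient and Hessian of a weighted squared error,
    L = w * (pred - y)^2, with w = alpha when pred > y (over-prediction)
    and w = 1 otherwise. Returned in the (grad, hess) form that
    XGBoost's custom-objective callback expects.
    """
    residual = preds - labels
    weight = np.where(residual > 0.0, alpha, 1.0)  # penalise over-prediction
    grad = 2.0 * weight * residual                 # dL/dpred
    hess = 2.0 * weight                            # d2L/dpred2
    return grad, hess
```

Wrapped as `lambda preds, dtrain: asymmetric_mse_objective(preds, dtrain.get_label())`, this can be passed as the `obj` argument to `xgboost.train`, making the booster systematically more cautious about predicting SaO2 values above the true reading.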