Publications - INESC TEC

Publications

Publications by Margarida Gonçalves Gouveia

2025

Domain-Specific Data Augmentation for Lung Nodule Malignancy Classification

Authors
Gouveia, M; Araújo, J; Oliveira, HP; Pereira, T

Publication
2025 47TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC)

Abstract
Lung cancer is one of the leading causes of cancer-related deaths worldwide, mainly due to late diagnosis. Screening programs can benefit from Computer-Aided Diagnosis (CAD) systems that detect and classify lung nodules using Computed Tomography (CT) scans. A large proportion of the literature proposes deep learning models based on single, private datasets, with no evaluation of their generalisation capability. The main goal of this work is to study and address the lack of generalisation to out-of-domain data (source domain different from the target domain). First, we propose using a ResNet architecture with 2.5D inputs capable of maintaining the spatial information of the nodules (3 input channels based on the anatomical planes). Second, we apply domain-specific data augmentation tailored for CT scans. Combined with data augmentation, using 2.5D inputs achieves the best results, both on in-domain data (LIDC-IDRI: N=1377 nodules; and LNDb: N=183 nodules) and on out-of-domain data (LUNGx: N=73 nodules). On in-domain data, an Area Under the Curve (AUC) of 0.914 was achieved on the internal test set and 0.746 on one of the external test sets. Notably, on out-of-domain data, where the ground-truth labels were confirmed by biopsy whereas the training data relied only on radiologist annotations of the likelihood of malignancy, the AUC improves from 0.576 to 0.695, approaching the performance of radiology experts. In the future, strategies should be applied to deal with the uncertainty of lung nodule annotations based solely on the observation of CT scans.

2025

Clinical Data-Driven Modeling of Disease-Specific Survival in Lung Cancer: Insights from the National Lung Screening Trial Dataset

Authors
Amaro, M; Sousa, JV; Gouveia, M; Oliveira, HP; Pereira, T

Publication
Measurement and Evaluations in Cancer Care

2025

Clinical Annotation and Medical Image Anonymization for AI Model Training in Lung Cancer Detection

Authors
Freire, AM; Rodrigues, EM; Sousa, JV; Gouveia, M; Ferreira-Santos, D; Pereira, T; Oliveira, HP; Sousa, P; Silva, AC; Fernandes, MS; Hespanhol, V; Araújo, J

Publication
UNIVERSAL ACCESS IN HUMAN-COMPUTER INTERACTION, UAHCI 2025, PT I

Abstract
Lung cancer remains one of the most common and lethal forms of cancer, with approximately 1.8 million deaths annually, often diagnosed at advanced stages. Early detection is crucial, but it depends on physicians' accurate interpretation of computed tomography (CT) scans, a process susceptible to human limitations and variability. ByMe has developed a medical image annotation and anonymization tool designed to address these challenges through a human-centered approach. The tool enables physicians to seamlessly add structured attribute-based annotations (e.g., size, location, morphology) directly within their established workflows, ensuring intuitive interaction. Integrated with Picture Archiving and Communication Systems (PACS), the tool streamlines the annotation process and enhances usability by offering a dedicated worklist for retrospective and prospective case analysis. Robust anonymization features ensure compliance with privacy regulations such as the General Data Protection Regulation (GDPR), enabling secure dataset sharing for research and for developing artificial intelligence (AI) models. Designed to empower AI integration, the tool not only facilitates the creation of high-quality datasets but also lays the foundation for incorporating AI-driven insights directly into clinical workflows. Focusing on usability, workflow integration, and privacy, this innovation bridges the gap between precision medicine and advanced technology. By providing the means to develop and train AI models for lung cancer detection, it holds the potential to significantly accelerate diagnosis as well as enhance its accuracy and consistency.

2025

Efficient-Proto-Caps: A Parameter-Efficient and Interpretable Capsule Network for Lung Nodule Characterization

Authors
Rodrigues, EM; Gouveia, M; Oliveira, HP; Pereira, T

Publication
IEEE ACCESS

Abstract
Deep learning techniques have demonstrated significant potential in computer-assisted diagnosis based on medical imaging. However, their integration into clinical workflows remains limited, largely due to concerns about interpretability. To address this challenge, we propose Efficient-Proto-Caps, a lightweight and inherently interpretable model that combines capsule networks with prototype learning for lung nodule characterization. Additionally, an innovative Davies-Bouldin Index with multiple centroids per cluster is employed as a loss function to promote clustering of lung nodule visual attribute representations. When evaluated on the LIDC-IDRI dataset, the most widely recognized benchmark for lung cancer prediction, our model achieved an overall accuracy of 89.7% in predicting lung nodule malignancy and associated visual attributes. This performance is statistically comparable to that of the baseline model, while utilizing a backbone with only approximately 2% of the parameters of the baseline model's backbone. State-of-the-art models achieved better performance in lung nodule malignancy prediction; however, our approach relies on multiclass malignancy predictions and provides a decision rationale aligned with globally accepted clinical guidelines. These results underscore the potential of our approach, as the integration of lightweight and less complex designs into accurate and inherently interpretable models represents a significant advancement toward more transparent and clinically viable computer-assisted diagnostic systems. Furthermore, these findings highlight the model's potential for broader applicability, extending beyond medicine to other domains where final classifications are grounded in concept-based or example-based attributes.

2025

Comparing 2D and 3D Feature Extraction Methods for Lung Adenocarcinoma Prediction Using CT Scans: A Cross-Cohort Study

Authors
Gouveia, M; Mendes, T; Rodrigues, EM; Oliveira, HP; Pereira, T

Publication
APPLIED SCIENCES-BASEL

Abstract
Lung cancer stands as the most prevalent and deadliest type of cancer, with adenocarcinoma being the most common subtype. Computed Tomography (CT) is widely used for detecting tumours and their phenotype characteristics, for an early and accurate diagnosis that impacts patient outcomes. Machine learning algorithms have already shown the potential to recognize patterns in CT scans to classify the cancer subtype. In this work, two distinct pipelines were employed to perform binary classification between adenocarcinoma and non-adenocarcinoma. Firstly, radiomic features were classified by Random Forest and eXtreme Gradient Boosting classifiers. Next, a deep learning approach, based on a Residual Neural Network and a Transformer-based architecture, was utilised. Both 2D and 3D CT data were initially explored, with the Lung-PET-CT-Dx dataset being employed for training and the NSCLC-Radiomics and NSCLC-Radiogenomics datasets used for external evaluation. Overall, the 3D models outperformed the 2D ones, with the best result being achieved by the Hybrid Vision Transformer, with an AUC of 0.869 and a balanced accuracy of 0.816 on the internal test set. However, a lack of generalization capability was observed across all models, with the performances decreasing on the external test sets, a limitation that should be studied and addressed in future work.

2023

Deep Minutiae Fingerprint Extraction Using Equivariance Priors

Authors
Gouveia, M; Castro, E; Rebelo, A; Cardoso, JS; Patrão, B

Publication
BIOSIGNALS

Abstract