Publications

Publications by Hélder Filipe Oliveira

2023

Lung CT image synthesis using GANs

Authors
Mendes, J; Pereira, T; Silva, F; Frade, J; Morgado, J; Freitas, C; Negrao, E; de Lima, BF; da Silva, MC; Madureira, AJ; Ramos, I; Costa, JL; Hespanhol, V; Cunha, A; Oliveira, HP;

Publication
EXPERT SYSTEMS WITH APPLICATIONS

Abstract
Biomedical engineering has been targeted as a potential research candidate for machine learning applications, with the purpose of detecting or diagnosing pathologies. However, acquiring relevant, high-quality, and heterogeneous medical datasets is challenging due to privacy and security issues and the effort required to annotate the data. Generative models have recently attracted growing interest in the computer vision field due to their ability to increase dataset size by generating new high-quality samples from the initial set, which can be used to augment a training dataset. This study aimed to synthesize artificial lung images from corresponding positional and semantic annotations using two generative adversarial networks and databases of real computed tomography scans: the Pix2Pix approach, which generates lung images from the lung segmentation maps; and the conditional generative adversarial network (cCGAN) approach, which was implemented with additional semantic labels in the generation process. To evaluate the quality of the generated images, two quantitative measures were used: the domain-specific Fréchet Inception Distance and the Structural Similarity Index. Additionally, an expert assessment was performed to measure the capability to distinguish between real and generated images. The quantitative assessment shows the high quality of the synthesized images, which was confirmed by the expert evaluation. This work represents an innovative application of GAN approaches to a medical problem, taking into consideration the pathological findings in the CT images and the clinical evaluation to assess the realism of these features in the generated images.
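As an illustration of the Structural Similarity Index named in the abstract above, the following is a minimal sketch, not the authors' evaluation code. It assumes a simplified single-window (global) variant computed over flattened grey-level images, whereas the usual SSIM averages the same expression over local windows. The function name `global_ssim` and the `data_range` parameter are hypothetical, chosen only for this example.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two flattened images of equal size.

    Assumption: one global window is used instead of the sliding local
    windows of the usual SSIM, so values will differ from library results.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and of equal size")

    n = len(x)
    # Stabilising constants with the conventional K1 = 0.01, K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    real = [0, 50, 100, 150]
    generated = [150, 100, 50, 0]
    print(global_ssim(real, real))       # identical images score (about) 1
    print(global_ssim(real, generated))  # reversed intensities score lower
```

Identical inputs give a score of 1 up to rounding, and the score drops as the mean, contrast, or structure of the two images diverge.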

2019

Computer Aided Detection of Deep Inferior Epigastric Perforators in Computed Tomography Angiography scans

Authors
Araújo, RJ; Garrido, V; Baraças, CA; Vasconcelos, MA; Mavioso, C; Anacleto, JC; Cardoso, MJ; Oliveira, HP;

Publication
CoRR

Abstract

2021

Topological Similarity Index and Loss Function for Blood Vessel Segmentation

Authors
Araújo, RJ; Cardoso, JS; Oliveira, HP;

Publication
CoRR

Abstract

2025

From Pixels to Pathways: AI-Based Approaches for Multimodal Lung Cancer Classification

Authors
Gonçalves, S; Sousa, JV; Gouveia, M; Amaro, M; Oliveira, HP; Pereira, T;

Publication
BIBM

Abstract
Lung cancer remains the leading cause of cancer-related deaths globally, responsible for approximately 1.8 million deaths each year. A key contributor to this high mortality rate is the late-stage diagnosis of the disease, underscoring the urgent need for effective early detection strategies. Low-dose computed tomography (CT) has shown great value in early screening, particularly when paired with clinical information. Clinical data, while valuable, lacks spatial and morphological insights essential for comprehensive evaluation. Combining both modalities offers a more holistic approach for lung cancer classification. This study presents AI-based methods for lung cancer classification using unimodal approaches (structured clinical data and chest CT imaging) alongside a novel multimodal deep learning framework that integrates both data types to classify lung nodules as malignant or benign. For the clinical modality, machine learning models including logistic regression, random forests, LightGBM, XGBoost, and multilayer perceptrons were evaluated with extensive hyperparameter tuning. In the imaging modality, ResNet18 and ResNet34 convolutional neural networks were used, with and without data augmentation. The study explored both intermediate and late fusion strategies to combine modality-specific representations. Results show that multimodal models consistently outperformed their unimodal counterparts, achieving a best-case area under the ROC curve (AUC) of 0.9138, with an accuracy of 0.8424 and an F1-score of 0.8422. These findings highlight the complementary strengths of imaging and clinical data and support the growing potential of multimodal deep learning in improving diagnostic accuracy in lung cancer classification. © 2025 IEEE.
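To make the idea of late fusion mentioned in the abstract above concrete, here is a minimal sketch under stated assumptions; it is not the framework described in the paper. It assumes each modality-specific model already outputs a malignancy probability and that the two are combined by a weighted average. The names `late_fusion`, `classify`, `w_clinical`, and `threshold` are hypothetical.

```python
def late_fusion(p_clinical, p_imaging, w_clinical=0.5):
    """Weighted average of two per-modality malignancy probabilities.

    Assumption: decision-level fusion by averaging; the paper may combine
    modality outputs differently.
    """
    for p in (p_clinical, p_imaging, w_clinical):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities and weight must lie in [0, 1]")
    return w_clinical * p_clinical + (1.0 - w_clinical) * p_imaging


def classify(p_malignant, threshold=0.5):
    """Map a fused probability to a nodule label (threshold is illustrative)."""
    return "malignant" if p_malignant >= threshold else "benign"


if __name__ == "__main__":
    # Hypothetical outputs of a clinical-data model and a CT imaging model.
    fused = late_fusion(p_clinical=0.2, p_imaging=0.8)
    print(fused, classify(fused))
```

Intermediate fusion, by contrast, would merge learned feature vectors from each modality before a shared classifier, rather than merging the final probabilities as done here.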

2025

Robust Visual Transformers for Medical Image Classification

Authors
Montrezol, J; Oliveira, HS; Araujo, J; Oliveira, HP;

Publication
2025 47TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC)

Abstract
The Vision Transformer (ViT) architecture has emerged as a potential game-changer in computer vision, offering scalability and global attention that have generated considerable interest in recent years. Its adaptability has fueled enthusiasm for its application. This work investigates the boundaries of the architecture, focusing on developing new techniques that explicitly target complex settings such as medical imaging datasets, which often exhibit high variability, class imbalance, and limited sample sizes. We propose a set of mixed regularisation and augmentation techniques to enhance the performance of models. These include a novel loss function and a smoothly differentiable activation function, leading to more stable training and model performance. The results show that incorporating these techniques improves model performance and training convergence.

2026

Decoding vision transformer variations for image classification: A guide to performance and usability

Authors
Montrezol, J; Oliveira, HS; Oliveira, HP;

Publication
MACHINE LEARNING WITH APPLICATIONS

Abstract
With the rise of Transformers, Vision Transformers (ViTs) have become a new standard in visual recognition. This has led to the development of numerous architectures with diverse designs and applications. This survey identifies 22 key ViT and hybrid CNN-ViT models, along with 5 top Convolutional Neural Network (CNN) models. These were selected based on the novelty of their architecture, relevance to benchmarks, and overall impact. The models are organised using a defined taxonomy comprising CNN-based, pure Transformer-based, and hybrid architectures. We analyse their main components, training methods, and computational features, while assessing performance using reported results on standard benchmarks such as ImageNet and CIFAR, along with our own training and fine-tuning evaluations on specific imaging datasets. In addition to accuracy, we look at real-world deployment issues by analysing the trade-offs between accuracy and efficiency in embedded, mobile, and clinical settings. The results indicate that modern CNNs are still very competitive in limited-resource environments, while advanced ViT variants perform well after large-scale pretraining, especially in areas with high variability. Hybrid CNN-ViT architectures, on the other hand, tend to offer the best balance between accuracy, data efficiency, and computational cost. This survey establishes a consolidated benchmark and reference framework for understanding the evolution, capabilities, and practical applicability of contemporary vision architectures.
