Publications

Publications by CTM

2026

Optimizing Medical Image Captioning with Conditional Prompt Encoding

Authors
Fernandes, RF; Oliveira, HS; Ribeiro, PP; Oliveira, HP;

Publication
PATTERN RECOGNITION AND IMAGE ANALYSIS, IBPRIA 2025, PT II

Abstract
Medical image captioning is an essential tool for producing descriptive text reports of medical images. One of its central problems is poor domain-specific description generation, because large pre-trained language models are trained primarily on non-medical text, whose semantics differ from those of medical text. To overcome this limitation, we explore improvements in contrastive learning for X-ray images, complemented with soft prompt engineering for medical image captioning and conditional text decoding for caption generation. The main objective is to develop a soft-prompt model that improves the accuracy and clinical relevance of automatically generated captions while preserving their linguistic accuracy and without degrading the models' performance. Experiments on the MIMIC-CXR and ROCO datasets showed that including tailored soft prompts improved accuracy and efficiency while ensuring a more cohesive medical context for captions, aiding medical diagnosis and encouraging more accurate reporting.

2026

Decoding vision transformer variations for image classification: A guide to performance and usability

Authors
Montrezol, J; Oliveira, HS; Oliveira, HP;

Publication
MACHINE LEARNING WITH APPLICATIONS

Abstract
With the rise of Transformers, Vision Transformers (ViTs) have become a new standard in visual recognition. This has led to the development of numerous architectures with diverse designs and applications. This survey identifies 22 key ViT and hybrid CNN-ViT models, along with 5 top Convolutional Neural Network (CNN) models. These were selected based on their new architecture, relevance to benchmarks, and overall impact. The models are organised using a defined taxonomy formed by CNN-based, pure Transformer-based, and hybrid architectures. We analyse their main components, training methods, and computational features, while assessing performance using reported results on standard benchmarks such as ImageNet and CIFAR, along with our training and fine-tuning evaluations on specific imaging datasets. In addition to accuracy, we look at real-world deployment issues by analysing the trade-offs between accuracy and efficiency in embedded, mobile, and clinical settings. The results indicate that modern CNNs are still very competitive in limited-resource environments, while advanced ViT variants perform well after large-scale pretraining, especially in areas with high variability. Hybrid CNN-ViT architectures, on the other hand, tend to offer the best balance between accuracy, data efficiency, and computational cost. This survey establishes a consolidated benchmark and reference framework for understanding the evolution, capabilities, and practical applicability of contemporary vision architectures.

2026

Pattern Recognition and Image Analysis

Authors
Gonçalves, N; Oliveira, HP; Sánchez, JA;

Publication
Lecture Notes in Computer Science

Abstract

2026

Pattern Recognition and Image Analysis - 12th Iberian Conference, IbPRIA 2025, Coimbra, Portugal, June 30 - July 3, 2025, Proceedings, Part II

Authors
Gonçalves, N; Oliveira, HP; Sánchez, JA;

Publication
IbPRIA (2)

Abstract

2026

Pattern Recognition and Image Analysis - 12th Iberian Conference, IbPRIA 2025, Coimbra, Portugal, June 30 - July 3, 2025, Proceedings, Part I

Authors
Gonçalves, N; Oliveira, HP; Sánchez, JA;

Publication
IbPRIA (1)

Abstract

2026

Assessment of Tartrazine Diffusion Properties in Skeletal Muscle

Authors
Guerra, AR; Oliveira, LR; Rodrigues, GO; Pinheiro, MR; Carvalho, MI; Tuchín, VV; Oliveira, LM;

Publication
IEEE JOURNAL OF SELECTED TOPICS IN QUANTUM ELECTRONICS

Abstract
Evaluating the diffusion properties of novel optical clearing (OC) agents is critical for advancing medical imaging. Tartrazine (TTZ), a strongly absorbing dye, has shown promise in enhancing tissue transparency, yet its diffusion properties remain uncharacterized. In this work, OC treatments with TTZ-water solutions of varying osmolarities were performed, and the diffusion times (τ) that characterize the tissue dehydration and refractive index (RI) matching mechanisms were estimated. From kinetic T_c measurements during treatment, the τ values of water and TTZ in muscle were estimated as 60.0 s and 416.0 s, respectively. The corresponding diffusion coefficients (D) were derived from sample thickness data measured during treatments where the unique fluxes of TTZ and water occur. The respective D values were calculated as 1.9 × 10⁻⁶ cm²/s for water and 3.6 × 10⁻⁷ cm²/s for TTZ. These findings provide key insights into TTZ diffusion in skeletal muscle and support its potential as an effective OC agent.
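
The abstract does not state how D was obtained from the diffusion time and the sample thickness. As a minimal sketch, assuming the slab-geometry relation D = d² / (π² τ) that is commonly used in optical clearing studies (an assumption, not confirmed by this listing), the link between the three quantities can be illustrated as follows. The function names are hypothetical, and the thickness values are back-computed from the reported τ and D under that assumption; they are not measurements from the paper.

```python
import math


def diffusion_coefficient(thickness_cm: float, tau_s: float) -> float:
    """Return D in cm^2/s from slab thickness d (cm) and diffusion time tau (s).

    Uses the ASSUMED slab relation D = d^2 / (pi^2 * tau); the abstract does
    not confirm that this exact expression was used.
    """
    if thickness_cm <= 0 or tau_s <= 0:
        raise ValueError("thickness and diffusion time must be positive")
    return thickness_cm ** 2 / (math.pi ** 2 * tau_s)


def implied_thickness(d_cm2_per_s: float, tau_s: float) -> float:
    """Invert the same assumed relation: d = pi * sqrt(D * tau), in cm."""
    if d_cm2_per_s <= 0 or tau_s <= 0:
        raise ValueError("D and diffusion time must be positive")
    return math.pi * math.sqrt(d_cm2_per_s * tau_s)


# Values reported in the abstract (tau in s, D in cm^2/s).
reported = {
    "water": (60.0, 1.9e-6),
    "TTZ": (416.0, 3.6e-7),
}

for name, (tau, d_coeff) in reported.items():
    # Thickness implied by the assumed relation, not a measured value.
    d_cm = implied_thickness(d_coeff, tau)
    print(f"{name}: implied thickness = {d_cm * 10:.3f} mm")
```

Under this assumption both species imply a sample thickness of a few tenths of a millimetre, which is at least consistent with the abstract's statement that D was derived from thickness data measured during treatment.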
