Publications

Publications by CTM

2025

Sonar-Based Deep Learning in Underwater Robotics: Overview, Robustness, and Challenges

Authors
Aubard, M; Madureira, A; Teixeira, L; Pinto, J;

Publication
IEEE JOURNAL OF OCEANIC ENGINEERING

Abstract
With the growing interest in underwater exploration and monitoring, autonomous underwater vehicles have become essential. The recent interest in onboard deep learning (DL) has advanced real-time environmental interaction capabilities relying on efficient and accurate vision-based DL models. However, the predominant use of sonar in underwater environments, characterized by limited training data and inherent noise, poses challenges to model robustness. This autonomy improvement raises safety concerns for deploying such models during underwater operations, potentially leading to hazardous situations. This article aims to provide the first comprehensive overview of sonar-based DL under the scope of robustness. It studies sonar-based DL perception task models, such as classification, object detection, segmentation, and simultaneous localization and mapping. Furthermore, this article systematizes sonar-based state-of-the-art data sets, simulators, and robustness methods, such as neural network verification, out-of-distribution, and adversarial attacks. This article highlights the lack of robustness in sonar-based DL research and suggests future research pathways, notably establishing a baseline sonar-based data set and bridging the simulation-to-reality gap.

2025

Enhancing Medical Image Analysis: A Pipeline Combining Synthetic Image Generation and Super-Resolution

Authors
Sousa, P; Campas, D; Andrade, J; Pereira, P; Gonçalves, T; Teixeira, LF; Pereira, T; Oliveira, HP;

Publication
Pattern Recognition and Image Analysis - 12th Iberian Conference, IbPRIA 2025, Coimbra, Portugal, June 30 - July 3, 2025, Proceedings, Part II

Abstract
Cancer is a leading cause of mortality worldwide, with breast and lung cancer being the most prevalent globally. Early and accurate diagnosis is crucial for successful treatment, and medical imaging techniques play a pivotal role in achieving this. This paper proposes a novel pipeline that leverages generative artificial intelligence to enhance medical images by combining synthetic image generation and super-resolution techniques. The framework is validated in two medical use cases (breast and lung cancers), demonstrating its potential to improve the quality and quantity of medical imaging data, ultimately contributing to more precise and effective cancer diagnosis and treatment. Overall, although some limitations do exist, this paper achieved satisfactory results for an image size which is conducive to specialist analysis, and further expands upon this field’s capabilities. © 2025 Elsevier B.V., All rights reserved.

2025

Abnormal Human Behaviour Detection Using Normalising Flows and Attention Mechanisms

Authors
Rodrigues Nogueira, AF; Oliveira, HP; Teixeira, LF;

Publication
Pattern Recognition and Image Analysis - 12th Iberian Conference, IbPRIA 2025, Coimbra, Portugal, June 30 - July 3, 2025, Proceedings, Part I

Abstract
The aim of this work is to explore normalising flows to detect anomalous behaviours, which is an essential task mainly for surveillance systems-related applications. To accomplish that, a series of ablation studies was performed by varying the parameters of the Spatio-Temporal Graph Normalising Flows (STG-NF) model [3] and combining it with attention mechanisms. Out of all these experiments, it was only possible to improve the state-of-the-art result for the UBnormal dataset by 3.4 percentage points (pp), for the Avenue by 4.7 pp and for the Avenue-HR by 3.2 pp. However, further research remains urgent to find a model that can give the best performance across different scenarios. The inaccuracies of the pose tracking and estimation algorithm seem to be the main factor limiting the models’ performance. The code is available at https://github.com/AnaFilipaNogueira/Abnormal-Human-Behaviour-Detection-using-Normalising-Flows-and-Attention-Mechanisms. © 2025 Elsevier B.V., All rights reserved.

2025

Expanding Relevance Judgments for Medical Case-based Retrieval Task with Multimodal LLMs

Authors
Pires, C; Nunes, S; Teixeira, LF;

Publication
CoRR

Abstract

2025

Evaluating Dense Model-based Approaches for Multimodal Medical Case Retrieval

Authors
Catarina Pires; Sérgio Nunes; Luís Filipe Teixeira;

Publication
Information Retrieval Research

Abstract
Medical case retrieval plays a crucial role in clinical decision-making by enabling healthcare professionals to find relevant cases based on patient records, diagnostic images, and textual descriptions. Given the inherently multimodal nature of medical data, effective retrieval requires models that can bridge the gap between different modalities. Traditional retrieval approaches often rely on unimodal representations, limiting their ability to capture cross-modal relationships. Recent advances in dense model-based techniques have shown promise in overcoming these limitations by encoding multimodal information into a shared latent space, facilitating retrieval based on semantic similarity. This paper investigates the potential of dense models to enhance multimodal search systems. We evaluate various dense model-based approaches to assess which model characteristics have the greatest impact on retrieval effectiveness, using the medical case-based retrieval task from ImageCLEFmed 2013 as a benchmark. Our findings indicate that different dense model approaches substantially impact retrieval effectiveness, and that applying the CombMAX fusion method to combine their output results further improves effectiveness. Extending context length, however, yielded mixed results depending on the input data. Additionally, domain-specific models (those trained on medical data) outperformed general models trained on broad, non-specialized datasets within their respective fields. Furthermore, when text is the dominant information source, text-only models surpassed multimodal models.
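As an illustration of the CombMAX fusion mentioned in the abstract above, the following is a minimal sketch and not the authors' implementation. CombMAX assigns each document the maximum score it receives across the runs being fused. The function name `comb_max`, the case identifiers, and the scores are hypothetical, and the sketch assumes the scores of each run have already been normalised to a comparable range.

```python
def comb_max(runs):
    """Fuse retrieval runs with CombMAX.

    runs: list of dicts mapping a document id to its (already normalised) score.
    Returns a list of (doc_id, fused_score), best first.
    """
    fused = {}
    for run in runs:
        for doc_id, score in run.items():
            # Keep the highest score seen for this document across all runs.
            if doc_id not in fused or score > fused[doc_id]:
                fused[doc_id] = score
    # Sort by descending score; break ties by document id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores from a text-based run and an image-based run.
text_run = {"case_12": 0.82, "case_07": 0.55, "case_31": 0.40}
image_run = {"case_07": 0.91, "case_31": 0.35, "case_44": 0.60}

ranking = comb_max([text_run, image_run])
for doc_id, score in ranking:
    print(doc_id, score)
```

A document retrieved by only one run keeps that run's score, so CombMAX favours documents that any single model rates highly rather than those with moderate agreement across models.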

2025

Exploring Motion Information in Homography Calculation for Football Matches With Moving Cameras

Authors
Gomes, C; Mastralexi, C; Carvalho, P;

Publication
IEEE ACCESS

Abstract
In football, where minor differences can significantly affect outcomes and performance, automatic video analysis has become a critical tool for analyzing and optimizing team strategies. However, many existing solutions require expensive and complex hardware comprising multiple cameras, sensors, or GPS devices, limiting accessibility for many clubs, particularly those with limited resources. Using images and video from a moving camera can help a wider audience benefit from video analysis, but it introduces new challenges related to motion. To address this, we explore an alternative homography estimation in moving camera scenarios. Homography plays a crucial role in video analysis, but presents challenges when keypoints are sparse, especially in dynamic environments. Existing techniques predominantly rely on visible keypoints and apply homography transformations on a frame-by-frame basis, often lacking temporal consistency and facing challenges in areas with sparse keypoints. This paper explores the use of estimated motion information for homography computation. Our experimental results reveal that integrating motion data directly into homography estimations leads to reduced errors in keypoint-sparse frames, surpassing state-of-the-art methods, filling a current gap in moving camera scenarios.
