Publications

Publications by Pedro Manuel Ribeiro

2026

Optimizing Medical Image Captioning with Conditional Prompt Encoding

Authors
Fernandes, RF; Oliveira, HS; Ribeiro, PP; Oliveira, HP;

Publication
PATTERN RECOGNITION AND IMAGE ANALYSIS, IBPRIA 2025, PT II

Abstract
Medical image captioning is an essential tool for producing descriptive text reports of medical images. A central problem in medical image captioning is poor domain-specific description generation, because large pre-trained language models are trained primarily on non-medical text whose semantics differ from those of medical text. To overcome this limitation, we explore improvements in contrastive learning for X-ray images, complemented with soft-prompt engineering for medical image captioning and conditional text decoding for caption generation. The main objective is to develop a soft-prompt model that improves the accuracy and clinical relevance of the automatically generated captions while guaranteeing their complete linguistic accuracy without degrading the models' performance. Experiments on the MIMIC-CXR and ROCO datasets showed that the inclusion of tailored soft prompts improved accuracy and efficiency while ensuring a more cohesive medical context for captions, aiding medical diagnosis and encouraging more accurate reporting.


2025

Next Higher Point: Two Novel Approaches for Computing Natural Visibility Graphs

Authors
Daniel, P; Silva, VF; Ribeiro, P;

Publication
COMPLEX NETWORKS & THEIR APPLICATIONS XIII, COMPLEX NETWORKS 2024, VOL 1

Abstract
With the huge amount of data that has been collected over time, many methods are being developed to allow better understanding and forecasting in several domains. Time series analysis is a powerful tool to achieve this goal. Despite being a well-established area, there are some gaps, and new methods, such as visibility graphs, are emerging to overcome these limitations. Visibility graphs allow the analysis of time series as complex networks and make possible the use of more advanced techniques from another well-established area, network science. In this paper, we present two new efficient approaches for computing natural visibility graphs from time series, one for online scenarios in O(n log n) and the other for offline scenarios in O(nm), the latter taking advantage of the number of different values in the time series (m).
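
For orientation, the natural visibility criterion can be sketched as below. This is a quadratic-time baseline, not either of the two approaches presented in the paper. The function name `natural_visibility_edges` is hypothetical, and the criterion used is the standard definition of natural visibility (two observations are connected when the straight line between them passes strictly above every observation in between), which the abstract itself does not spell out. Observations are assumed to be equally spaced in time.

```python
def natural_visibility_edges(series):
    """Return the edges (a, b), a < b, of the natural visibility graph.

    Baseline sketch in O(n^2): from each point a, scan to the right and
    track the steepest slope seen so far. Point b is visible from a exactly
    when the slope from a to b exceeds the slope from a to every point
    strictly between them.
    """
    n = len(series)
    edges = []
    for a in range(n):
        max_slope = float("-inf")  # steepest slope from a to any earlier candidate
        for b in range(a + 1, n):
            slope = (series[b] - series[a]) / (b - a)
            if slope > max_slope:
                edges.append((a, b))
                max_slope = slope
    return edges
```

A call such as `natural_visibility_edges([1, 3, 2])` links the first and second points and the second and third, but not the first and third, because the middle observation blocks the line of sight.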


2024

Deep-Graph-Sprints: Accelerated Representation Learning in Continuous-Time Dynamic Graphs

Authors
Eddin, AN; Bono, J; Aparício, DO; Ferreira, H; Pinto Ribeiro, PM; Bizarro, P;

Publication
Trans. Mach. Learn. Res.

Abstract
Continuous-time dynamic graphs (CTDGs) are essential for modeling interconnected, evolving systems. Traditional methods for extracting knowledge from these graphs often depend on feature engineering or deep learning. Feature engineering is limited by the manual and time-intensive nature of crafting features, while deep learning approaches suffer from high inference latency, making them impractical for real-time applications. This paper introduces Deep-Graph-Sprints (DGS), a novel deep learning architecture designed for efficient representation learning on CTDGs with low-latency inference requirements. We benchmark DGS against state-of-the-art (SOTA) feature engineering and graph neural network methods using five diverse datasets. The results indicate that DGS achieves competitive performance while inference speed improves between 4x and 12x compared to other deep learning approaches on our benchmark datasets. Our method effectively bridges the gap between deep representation learning and low-latency application requirements for CTDGs.


2024

Computing Motifs in Hypergraphs

Authors
Nóbrega, D; Ribeiro, P;

Publication
COMPLEX NETWORKS XV, COMPLENET 2024

Abstract
Motifs are overrepresented and statistically significant sub-patterns in a network, whose identification is relevant to uncovering the network's underlying functional units. Recently, their extraction has been performed on higher-order networks, but due to the complexity arising from polyadic interactions, and the similarity with known computationally hard problems, its practical application is limited. Our main contribution is a novel approach for hyper-subgraph census and higher-order motif discovery, allowing motifs of size 3 or 4 to be found efficiently in real-world scenarios. It is consistently an order of magnitude faster than a baseline state-of-the-art method, while using less memory and supporting a wider range of base algorithms.


2023

From random-walks to graph-sprints: a low-latency node embedding framework on continuous-time dynamic graphs

Authors
Eddin, AN; Bono, J; Aparício, D; Ferreira, H; Ascensao, J; Ribeiro, P; Bizarro, P;

Publication
PROCEEDINGS OF THE 4TH ACM INTERNATIONAL CONFERENCE ON AI IN FINANCE, ICAIF 2023

Abstract
Many real-world datasets have an underlying dynamic graph structure, where entities and their interactions evolve over time. Machine learning models should consider these dynamics in order to harness their full potential in downstream tasks. Previous approaches for graph representation learning have focused on either sampling k-hop neighborhoods, akin to breadth-first search, or random walks, akin to depth-first search. However, these methods are computationally expensive and unsuitable for real-time, low-latency inference on dynamic graphs. To overcome these limitations, we propose graph-sprints, a general-purpose feature extraction framework for continuous-time dynamic graphs (CTDGs) that has low latency and is competitive with state-of-the-art, higher-latency models. To achieve this, a streaming, low-latency approximation to the random-walk-based features is proposed. In our framework, time-aware node embeddings summarizing multi-hop information are computed using only single-hop operations on the incoming edges. We evaluate our proposed approach on three open-source datasets and two in-house datasets, and compare with three state-of-the-art algorithms (TGN-attn, TGN-ID, Jodie). We demonstrate that our graph-sprints features, combined with a machine learning classifier, achieve competitive performance (outperforming all baselines for the node classification tasks in five datasets). Simultaneously, graph-sprints significantly reduces inference latencies, achieving close to an order of magnitude speed-up in our experimental setting.


2025

Multilayer quantile graph for multivariate time series analysis and dimensionality reduction

Authors
Silva, VF; Silva, ME; Ribeiro, P; Silva, F;

Publication
INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS

Abstract
In recent years, there has been a surge in the prevalence of high- and multidimensional temporal data across various scientific disciplines. These datasets are characterized by their vast size and challenging potential for analysis. Such data typically exhibit serial and cross-dependency and possess high dimensionality, thereby introducing additional complexities to conventional time series analysis methods. To address these challenges, a recent and complementary approach has emerged, known as network-based analysis methods for multivariate time series. In univariate settings, quantile graphs have been employed to capture temporal transition properties and reduce data dimensionality by mapping observations to a smaller set of sample quantiles. To confront the increasingly prominent issue of high dimensionality, we propose an extension of quantile graphs into a multivariate variant, which we term Multilayer Quantile Graphs. In this innovative mapping, each time series is transformed into a quantile graph, and inter-layer connections are established to link contemporaneous quantiles of pairwise series. This enables the analysis of dynamic transitions across multiple dimensions. In this study, we demonstrate the effectiveness of this new mapping using synthetic and benchmark multivariate time series datasets. We delve into the resulting network's topological structures, extract network features, and employ these features for original dataset analysis. Furthermore, we compare our results with a recent method from the literature. The resulting multilayer network offers a significant reduction in the dimensionality of the original data while capturing serial and cross-dimensional transitions. This approach facilitates the characterization and analysis of large multivariate time series datasets through network analysis techniques.
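
The mapping described in this abstract can be illustrated with a minimal sketch. It is written under stated assumptions and is not the authors' implementation: the function names (`quantile_bins`, `transition_counts`, `multilayer_quantile_graph`) are hypothetical, bins are taken to be equal-frequency sample quantiles, edge weights are raw counts rather than normalized probabilities, and all series are assumed to have the same length.

```python
import numpy as np

def quantile_bins(series, n_quantiles):
    """Map each observation to the index (0..Q-1) of its sample-quantile bin."""
    series = np.asarray(series, dtype=float)
    # Interior cut points at the 1/Q, 2/Q, ..., (Q-1)/Q sample quantiles.
    cuts = np.quantile(series, np.linspace(0, 1, n_quantiles + 1)[1:-1])
    return np.searchsorted(cuts, series, side="right")

def transition_counts(src_bins, dst_bins, n_quantiles):
    """Count how often bin i (source) is paired with bin j (destination)."""
    counts = np.zeros((n_quantiles, n_quantiles), dtype=int)
    np.add.at(counts, (src_bins, dst_bins), 1)  # accumulates repeated pairs
    return counts

def multilayer_quantile_graph(series_list, n_quantiles):
    """Build one quantile graph per series plus pairwise inter-layer links.

    intra[k][i, j]      : transitions from bin i at time t to bin j at t+1
                          within series k (the layer's quantile graph).
    inter[(a, b)][i, j] : times at which series a is in bin i while
                          series b is in bin j (contemporaneous quantiles).
    """
    bins = [quantile_bins(s, n_quantiles) for s in series_list]
    intra = [transition_counts(b[:-1], b[1:], n_quantiles) for b in bins]
    inter = {
        (a, b): transition_counts(bins[a], bins[b], n_quantiles)
        for a in range(len(bins))
        for b in range(a + 1, len(bins))
    }
    return intra, inter
```

Each weight matrix is Q by Q regardless of the series length, which is where the reduction in dimensionality comes from in this sketch.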
