Publications

Publications by Alípio Jorge

2026

NLP for Local Governance Meeting Records: A Focus Article on Tasks, Datasets, Metrics and Benchmark

Authors
Campos, R; Evans, JP; Isidro, J; Marques, M; Cunha, LF; Jorge, A; Nunes, S; Guimarães, N;

Publication
CoRR

Abstract

2026

EPHG-CR: embedding propagation for heterogeneous graphs with class refinement

Authors
Dos Santos, BN; Marcacini, RM; Jorge, AM; Campos, R; Rezende, SO;

Publication
APPLIED INTELLIGENCE

Abstract
Heterogeneous graphs can represent real-world problems in a way close to reality, supporting diverse types of vertices and edges. However, their inherent heterogeneity poses challenges in interpreting problem semantics. To address this, heterogeneous graph embedding, aiming to map graph elements to low-dimensional vectors, simplifies subsequent machine learning analysis. This approach has gained prominence in machine learning, fueling classification, recommendation, and similarity search applications. Embedding diverse data is essential for efficient data processing. Incorporating language models, like BERT, into heterogeneous graphs enhances semantic context capture, which is particularly useful when one vertex type represents text. Language models stand out in contextual representation, enriching graph vertex embeddings for various tasks. This paper proposes a novel approach to enhancing heterogeneous graph embeddings by combining language models and task class data. Our approach increases vector quality, accounting for graph structure, semantic textual information, and task labels. We compared our proposal with a language model in the aspect-based sentiment analysis task, demonstrating competitive results and, in some cases, a slight superiority. Furthermore, we explore applications of embeddings from auxiliary vertices in another task, highlighting another advantage of the approach over the language model.

2025

The incremental process of building an annotation scheme for clinical narratives in portuguese: the contribution of human variation analysis

Authors
Ana Luisa Fernandes; Purificação Silvano; António Leal; Nuno Guimarães; Rita Rb-Silva; Luís Filipe Cunha; Alípio Jorge;

Publication
Proceedings of the 19th Linguistic Annotation Workshop (LAW-XIX-2025)

Abstract
The development of a robust annotation scheme and corresponding guidelines is crucial for producing annotated datasets that advance both linguistic and computational research. This paper presents a case study that outlines a methodology for designing an annotation scheme and its guidelines, specifically aimed at representing morphosyntactic and semantic information regarding temporal features, as well as medical information in medical reports written in Portuguese. We detail a multi-step process that includes reviewing existing frameworks, conducting an annotation experiment to determine the optimal approach, and designing a model based on these findings. We validated the approach through a pilot experiment where we assessed the reliability and applicability of the annotation scheme and guidelines. In this experiment, two annotators independently annotated a patient's medical report consisting of six documents using the proposed model, while a curator established the ground truth. The analysis of inter-annotator agreement and the annotation results enabled the identification of sources of human variation and provided insights for further refinement of the annotation scheme and guidelines.
