2026
Authors
Campos, R; Evans, JP; Isidro, J; Marques, M; Cunha, LF; Jorge, A; Nunes, S; Guimarães, N;
Publication
CoRR
Abstract
2026
Authors
Dos Santos, BN; Marcacini, RM; Jorge, AM; Campos, R; Rezende, SO;
Publication
APPLIED INTELLIGENCE
Abstract
Heterogeneous graphs can represent real-world problems in a way close to reality, supporting diverse types of vertices and edges. However, their inherent heterogeneity poses challenges in interpreting problem semantics. To address this, heterogeneous graph embedding, aiming to map graph elements to low-dimensional vectors, simplifies subsequent machine learning analysis. This approach has gained prominence in machine learning, fueling classification, recommendation, and similarity search applications. Embedding diverse data is essential for efficient data processing. Incorporating language models, like BERT, into heterogeneous graphs enhances semantic context capture, which is particularly useful when one vertex type represents text. Language models stand out in contextual representation, enriching graph vertex embeddings for various tasks. This paper proposes a novel approach to enhancing heterogeneous graph embeddings by combining language models and task class data. Our approach increases vector quality, accounting for graph structure, semantic textual information, and task labels. We compared our proposal with a language model in the aspect-based sentiment analysis task, demonstrating competitive results and, in some cases, a slight superiority. Furthermore, we explore applications of embeddings from auxiliary vertices in another task, highlighting another advantage of the approach over the language model.
2025
Authors
Ana Luisa Fernandes; Purificação Silvano; António Leal; Nuno Guimarães; Rita Rb-Silva; Luís Filipe Cunha; Alípio Jorge;
Publication
Proceedings of the 19th Linguistic Annotation Workshop (LAW-XIX-2025)
Abstract
The development of a robust annotation scheme and corresponding guidelines is crucial for producing annotated datasets that advance both linguistic and computational research. This paper presents a case study that outlines a methodology for designing an annotation scheme and its guidelines, specifically aimed at representing morphosyntactic and semantic information regarding temporal features, as well as medical information in medical reports written in Portuguese. We detail a multi-step process that includes reviewing existing frameworks, conducting an annotation experiment to determine the optimal approach, and designing a model based on these findings. We validated the approach through a pilot experiment where we assessed the reliability and applicability of the annotation scheme and guidelines. In this experiment, two annotators independently annotated a patient's medical report consisting of six documents using the proposed model, while a curator established the ground truth. The analysis of inter-annotator agreement and the annotation results enabled the identification of sources of human variation and provided insights for further refinement of the annotation scheme and guidelines.