Publications - INESC TEC

Publications

Publications by Alípio Jorge

2026

Can LLMs Reliably Label YouTube Videos? A Committee-Based Evaluation

Authors
Mourthé, A; Mello, CE; Jorge, A

Publication
SOCIAL NETWORKS ANALYSIS AND MINING, ASONAM 2025, PT I

Abstract
As recommender systems play an increasingly central role in shaping information exposure on platforms like YouTube, understanding the nature of the content they promote, especially in sensitive contexts, requires scalable and reliable labelling methods. This paper investigates the use of Large Language Models (LLMs) to label YouTube videos based solely on their metadata. We propose a committee-based approach that aggregates predictions from an ensemble of seven state-of-the-art LLMs through majority voting. Using a novel dataset collected via simulated user interactions on YouTube, we analyse model agreement, labelling behaviour, and the influence of model size. To assess label reliability, we also investigate the semantic coherence of label assignments. Our results show that LLM committees produce highly consistent labels in low-disagreement settings. These findings highlight both the promise and limitations of LLM-based annotation for auditing social networks.
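
The majority-voting step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label names, the function name `committee_label`, and the choice to return no label on a tie are all assumptions made for illustration, since the abstract does not describe these details.

```python
from collections import Counter


def committee_label(votes):
    """Aggregate one item's labels from several models by majority vote.

    Returns (label, agreement), where agreement is the share of votes
    received by the most frequent label. On a tie for first place the
    label is None (an assumed policy, not taken from the paper).
    """
    if not votes:
        raise ValueError("at least one vote is required")
    ranked = Counter(votes).most_common()
    top_label, top_count = ranked[0]
    agreement = top_count / len(votes)
    # A tie exists when the runner-up has as many votes as the leader.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None, agreement
    return top_label, agreement


# Hypothetical votes from a seven-model committee for a single video.
votes = ["news", "news", "entertainment", "news", "politics", "news", "news"]
label, agreement = committee_label(votes)
print(label, round(agreement, 2))
```

The agreement value gives a simple way to separate low-disagreement items from contested ones, which is the kind of distinction the abstract draws when discussing label consistency.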


2026

The 9th International Workshop on Narrative Extraction from Text: Text2Story 2026

Authors
Campos, R; Jorge, A; Jatowt, A; Bhatia, S; Litvak, M

Publication
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2026, PT III

Abstract
For eight years, the Text2Story Workshop series has fostered a vibrant research community dedicated to narrative understanding, advancing shared insights into the challenges of modelling narrative structure in text. While earlier approaches laid important foundations, recent progress in Transformers and Large Language Models (LLMs) has fundamentally reshaped the field. Building on the increasing prominence of LLM-based contributions in recent editions, the ninth edition of Text2Story expands the focus toward agentic AI, where systems plan, reason, and interact over time using narratives as internal representations. Recent advances, including long-context architectures, instruction and preference-tuned models, retrieval-augmented generation, and discourse-aware prompting, have broadened the applicability of LLMs to complex narrative tasks. Nevertheless, reliably capturing fine-grained narrative structures remains challenging, particularly for event chains, temporal and causal relations, character development, and perspective consistency. These challenges are amplified in interactive and agentic settings, where narrative coherence, controllability, and reliability are critical. This edition of Text2Story explores both the opportunities and limitations of LLMs and agentic systems for narrative understanding, including the analysis of narratives generated by LLMs themselves with respect to consistency, hallucination, bias, and control. Through a diverse program of research papers, works in progress, demos, resources, and keynote talks, the workshop continues to advance narrative understanding in the era of foundation and agentic models.


2026

NLP for Local Governance Meeting Records: A Focus Article on Tasks, Datasets, Metrics and Benchmark

Authors
Campos, R; Evans, JP; Isidro, J; Marques, M; Cunha, LF; Jorge, A; Nunes, S; Guimarães, N

Publication
CoRR

Abstract

2026

EPHG-CR: embedding propagation for heterogeneous graphs with class refinement

Authors
Dos Santos, BN; Marcacini, RM; Jorge, AM; Campos, R; Rezende, SO

Publication
APPLIED INTELLIGENCE

Abstract
Heterogeneous graphs can represent real-world problems in a way close to reality, supporting diverse types of vertices and edges. However, their inherent heterogeneity poses challenges in interpreting problem semantics. To address this, heterogeneous graph embedding, aiming to map graph elements to low-dimensional vectors, simplifies subsequent machine learning analysis. This approach has gained prominence in machine learning, fueling classification, recommendation, and similarity search applications. Embedding diverse data is essential for efficient data processing. Incorporating language models, like BERT, into heterogeneous graphs enhances semantic context capture, which is particularly useful when one vertex type represents text. Language models stand out in contextual representation, enriching graph vertex embeddings for various tasks. This paper proposes a novel approach to enhancing heterogeneous graph embeddings by combining language models and task class data. Our approach increases vector quality, accounting for graph structure, semantic textual information, and task labels. We compared our proposal with a language model in the aspect-based sentiment analysis task, demonstrating competitive results and, in some cases, a slight superiority. Furthermore, we explore applications of embeddings from auxiliary vertices in another task, highlighting another advantage of the approach over the language model.


2025

The incremental process of building an annotation scheme for clinical narratives in Portuguese: the contribution of human variation analysis

Authors
Ana Luisa Fernandes; Purificação Silvano; António Leal; Nuno Guimarães; Rita Rb-Silva; Luís Filipe Cunha; Alípio Jorge

Publication
Proceedings of the 19th Linguistic Annotation Workshop (LAW-XIX-2025)

Abstract
The development of a robust annotation scheme and corresponding guidelines is crucial for producing annotated datasets that advance both linguistic and computational research. This paper presents a case study that outlines a methodology for designing an annotation scheme and its guidelines, specifically aimed at representing morphosyntactic and semantic information regarding temporal features, as well as medical information in medical reports written in Portuguese. We detail a multi-step process that includes reviewing existing frameworks, conducting an annotation experiment to determine the optimal approach, and designing a model based on these findings. We validated the approach through a pilot experiment where we assessed the reliability and applicability of the annotation scheme and guidelines. In this experiment, two annotators independently annotated a patient's medical report consisting of six documents using the proposed model, while a curator established the ground truth. The analysis of inter-annotator agreement and the annotation results enabled the identification of sources of human variation and provided insights for further refinement of the annotation scheme and guidelines.
