2025
Authors
Campos, R; Jorge, AM; Jatowt, A; Bhatia, S; Litvak, M; Cordeiro, JP; Rocha, C; Sousa, HO; Cunha, LF; Mansouri, B;
Publication
SIGIR Forum
Abstract
2025
Authors
Sousa, H; Campos, R; Jorge, A;
Publication
Proceedings of the 34th ACM International Conference on Information and Knowledge Management, CIKM 2025
Abstract
In this paper, we demo the Temporal Game, a novel approach to temporal relation extraction that casts the task as an interactive game. Instead of directly annotating interval-level relations, our approach decomposes them into point-wise comparisons between the start and end points of temporal entities. At each step, players classify a single point relation, and the system applies temporal closure to infer additional relations and enforce consistency. This point-based strategy naturally supports both interval and instant entities, enabling more fine-grained and flexible annotation than any previous approach. The Temporal Game also lays the groundwork for training reinforcement learning agents by treating temporal annotation as a sequential decision-making task. To showcase this potential, the demo presented in this paper includes a Game mode, in which users annotate texts from the TempEval-3 dataset and receive feedback based on a scoring system, and an Annotation mode, which allows custom documents to be annotated and the resulting timeline to be exported. The demo therefore serves both as a research tool and as an annotation interface. The demo is publicly available at https://temporal-game.inesctec.pt, and the source code is open-sourced to foster further research and community-driven development in temporal reasoning and annotation.
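The point-wise decomposition and temporal closure described in this abstract can be illustrated with a minimal Python sketch. It is not the Temporal Game's actual implementation: the point encoding `(entity, "start" | "end")`, the three relation labels `<`, `=`, `>`, and the helper names `compose`, `add_relation`, and `temporal_closure` are all assumptions made for illustration. An instant entity could be modelled the same way with a single point.

```python
# Illustrative sketch only; names and encoding are hypothetical,
# not taken from the Temporal Game source code.
INVERSE = {"<": ">", ">": "<", "=": "="}


def compose(r1, r2):
    """Relation implied between a and c given (a r1 b) and (b r2 c), else None."""
    if r1 == "=":
        return r2
    if r2 == "=":
        return r1
    return r1 if r1 == r2 else None  # "<" then ">" tells us nothing


def add_relation(known, a, b, rel):
    """Store (a rel b) and its inverse; raise ValueError on a contradiction."""
    for x, y, r in ((a, b, rel), (b, a, INVERSE[rel])):
        if known.get((x, y), r) != r:
            raise ValueError(f"{x} {r} {y} conflicts with {known[(x, y)]}")
        known[(x, y)] = r


def temporal_closure(known):
    """Apply transitivity until no new point relation can be inferred."""
    changed = True
    while changed:
        changed = False
        for (a, b), r1 in list(known.items()):
            for (b2, c), r2 in list(known.items()):
                if b != b2 or a == c:
                    continue
                implied = compose(r1, r2)
                if implied is None:
                    continue
                before = len(known)
                add_relation(known, a, c, implied)  # also checks consistency
                changed = changed or len(known) != before
    return known


# Two interval entities, each decomposed into a start and an end point.
e1_start, e1_end = ("e1", "start"), ("e1", "end")
e2_start, e2_end = ("e2", "start"), ("e2", "end")

known = {}
add_relation(known, e1_start, e1_end, "<")  # an interval starts before it ends
add_relation(known, e2_start, e2_end, "<")

# A single annotated point relation: e1 ends before e2 starts.
add_relation(known, e1_end, e2_start, "<")
temporal_closure(known)

# The remaining three point relations between e1 and e2 are now inferred,
# which together correspond to the interval relation "e1 before e2".
for pair in ((e1_start, e2_start), (e1_start, e2_end), (e1_end, e2_end)):
    print(pair, known[pair])
```

In this sketch, one annotation is enough to fix all four point relations between the two intervals, and any later annotation that contradicts an inferred relation is rejected with a `ValueError`.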
2025
Authors
Campos, R; Jorge, AM; Jatowt, A; Bhatia, S; Litvak, M;
Publication
Text2Story@ECIR
Abstract
2025
Authors
Muratov, A; Shaikh, HF; Jani, V; Mahmoud, T; Xie, Z; Orel, D; Singh, A; Wang, Y; Joshi, A; Iqbal, H; Hee, MS; Sahnan, D; Nikolaidis, N; Silvano, P; Dimitrov, D; Yangarber, R; Campos, R; Jorge, A; Guimarães, N; Sartori, E; Stefanovitch, N; San Martino, GD; Piskorski, J; Nakov, P;
Publication
CoRR
Abstract
2025
Authors
Fernandes, AL; Silvano, P; Guimarães, N; Silva, RR; Munna, TA; Cunha, LF; Leal, A; Campos, R; Jorge, A;
Publication
Text2Story@ECIR
Abstract
Electronic Health Records (EHRs) contain vast amounts of unstructured narrative text, posing challenges for organization, curation, and automated information extraction in clinical and research settings. Developing effective annotation schemes is crucial for training extraction models, yet it remains complex for both human experts and Large Language Models (LLMs). This study compares human- and LLM-generated annotation schemes and guidelines through an experimental framework. In the first phase, both a human expert and an LLM created annotation schemes based on predefined criteria. In the second phase, experienced annotators applied these schemes following the guidelines. In both cases, the results were qualitatively evaluated using Likert scales. The findings indicate that the human-generated scheme is more comprehensive, coherent, and clear than the LLM-generated one. These results align with previous research suggesting that while LLMs show promising performance with respect to text annotation, the same does not apply to the development of annotation schemes, and human validation remains essential to ensure accuracy and reliability.
2025
Authors
Munna, TA; Fernandes, AL; Silvano, P; Guimarães, N; Jorge, A;
Publication
Text2Story@ECIR
Abstract
The relationship of a patient with a hospital from admission to discharge is often kept in a series of textual documents that describe the patient’s journey. These documents are important for analyzing the different steps of the clinical process and for conducting aggregated studies of the paths of patients in the hospital. In this paper, we explore the potential of Large Language Models (LLMs) to generate realistic and comprehensive patient journeys in European Portuguese, addressing the scarcity of medical data in this specific context. We employed Google’s Gemini 1.5 Flash model and used a dataset of 285 case reports in European Portuguese, published on the website of the Portuguese Society of Internal Medicine (SPMI), as references for generating synthetic medical reports. Our methodology involves a sequential approach to generating a synthetic patient journey. Initially, we generate an admission report, followed by a discharge report. Subsequently, we generate a comprehensive patient journey that integrates the admission, multiple daily progress reports, and the discharge into a cohesive narrative. This end-to-end process ensures a realistic and detailed representation of the patient’s clinical pathway. The generated reports were rigorously evaluated by medical and linguistic professionals, as well as with automatic metrics that measure the inclusion of key medical entities, similarity to the case report, and use of the correct Portuguese variant. Both qualitative and quantitative evaluations confirmed that the generated synthetic reports are predominantly written in European Portuguese without the loss of important medical information from the case reports. This work contributes to developing high-quality synthetic medical data for training LLMs and advancing AI-driven healthcare applications in under-resourced language settings.