Publications

Publications by Nuno Ricardo Guimarães

2025

Será o ChatGPT um bom divulgador científico em cosmetologia? Um estudo linguístico sobre textos de divulgação científica - Is ChatGPT a good popular science disseminator in cosmetology? A linguistic study on popular science texts

Authors
Pacheco, AF; Guimarães, N; Torres, A; Silvano, P; Almeida, I;

Publication
Revista da Associação Portuguesa de Linguística

Abstract
The popular science text genre is fundamental to disseminating scientific knowledge in an accessible and comprehensible way to non-specialist audiences, and its structure and characteristics differ from those of scientific articles (e.g., Garces-Conejos & Sanchez-Macarro, 1998; Zamboni, 1998). Studies on the linguistic properties of popular science texts in European Portuguese are scarce, the exception being the project Promoção da Literacia Científica (Gonçalves & Jorge, 2018). Meanwhile, in the area of content production, large language models (LLMs), namely OpenAI's GPT models, have quickly gained widespread public attention. Because they are recent, evaluation of the linguistic quality of the texts they produce is still very limited. With these premises in mind, the present study aims to evaluate the linguistic quality of the answers generated by ChatGPT (GPT-3.5) in the domain of cosmetology, with respect to the categories of cosmetic products, ingredients, safety and efficacy, and regulation, seeking to identify patterns that help explain the differences and/or similarities between the content generated by the LLM and that produced by human experts on the Portal infoCosméticos. To this end, twenty questions previously answered and published on the portal were selected, and four distinct prompts with different degrees of complexity were then created, giving rise to eighty answers generated by ChatGPT. The answers were then analyzed according to the results of a linguistic evaluation grid composed of 11 questions.
The analysis produced results of different kinds: overall, the answers written by the experts score slightly higher than those of ChatGPT; as for inter-sentence cohesion, the texts produced by experts use few connectors, in contrast with the recurrent use of discourse markers in the ChatGPT texts; the texts published on the portal show unexplained scientific jargon and a macrostructure lacking a concluding paragraph; the texts generated by ChatGPT show a high frequency of repetitions and/or tautologies.


2026

Knowledge-Aware Clinical Narrative Extraction Using Ontologies and Knowledge Graphs

Authors
Leite, M; Rb Silva, R; Guimaraes, N; Stork, L; Jorge, A;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2025, PT I

Abstract
Providing healthcare professionals with quick access to structured standardized information enables comprehensive analysis and improves clinical decision-making. However, an important part of the records in health institutions is in the form of free text. This paper proposes a pipeline that automatically extracts medical information from Electronic Medical Records (EMRs), based on large language models (LLMs) and a domain ontology defined and validated in collaboration with a medical expert. The output is a knowledge graph of clinical narratives that can be used to search through repositories of EMRs or discover new facts. We showcase our approach on a set of Portuguese clinical texts of cases of Acute Myeloid Leukemia (AML) guided by one medical expert. We evaluate the quality of the extraction and of the knowledge graph.


2026

LLM-Based Framework for Synthetic Data Generation in Portuguese Clinical NER

Authors
Henriques, L; Guimaraes, N; Jorge, A;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2025, PT I

Abstract
The ever-increasing volume of data produced in Healthcare demands solutions capable of automatically extracting the relevant elements of their narratives. However, given privacy regulations, bureaucratic procedures, and annotation efforts, the development of said solutions via Natural Language Processing (NLP) systems becomes hindered due to training data scarcity. Such scarcity increases when we consider languages and language varieties with lower resource availability, such as European and Brazilian Portuguese. To address this problem, we propose a Large Language Model (LLM)-based SDG (Synthetic Data Generation) framework to generate and annotate synthetic clinical texts for medical Named-Entity Recognition (NER). The SDG framework consists of a system/user prompt augmented with real examples, powered by GPT-4o. Our results show that, by feeding the framework few real clinical annotated texts, we can generate synthetic data capable of increasing the performance of NER models with respect to their non-augmented counterparts. In addition, the reduction of the BLEU scores in the generated texts indicates a decrease in the risk of privacy disclosure while ensuring greater lexical diversity. These results highlight the potential of synthetic data as a solution to overcome human annotation bottlenecks and privacy concerns, laying the groundwork for future research in clinical NLP across tasks, domains, and low-resource languages.


2025

FRaN-X: FRaming and Narratives-eXplorer

Authors
Muratov, A; Shaikh, HF; Jani, V; Mahmoud, T; Xie, Z; Orel, D; Singh, A; Wang, Y; Joshi, A; Iqbal, H; Hee, MS; Sahnan, D; Nikolaidis, N; Silvano, P; Dimitrov, D; Yangarber, R; Campos, R; Jorge, A; Guimarães, N; Sartori, E; Stefanovitch, N; San Martino, GD; Piskorski, J; Nakov, P;

Publication
CoRR


2025

PolyNarrative: A Multilingual, Multilabel, Multi-domain Dataset for Narrative Extraction from News Articles

Authors
Nikolaidis, N; Stefanovitch, N; Silvano, P; Dimitrov, D; Yangarber, R; Guimaraes, N; Sartori, E; Androutsopoulos, I; Nakov, P; Da San Martino, G; Piskorski, J;

Publication
PROCEEDINGS OF THE 63RD ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS

Abstract
We present PolyNarrative, a new multilingual dataset of news articles, annotated for narratives. Narratives are overt or implicit claims, recurring across articles and languages, promoting a specific interpretation or viewpoint on an ongoing topic, often propagating mis/disinformation. We developed two-level taxonomies with coarse- and fine-grained narrative labels for two domains: (i) climate change and (ii) the military conflict between Ukraine and Russia. We collected news articles in four languages (Bulgarian, English, Portuguese, and Russian) related to the two domains and manually annotated them at the paragraph level. We make the dataset publicly available, along with experimental results of several strong baselines that assign narrative labels to news articles at the paragraph or the document level. We believe that this dataset will foster research in narrative detection and enable new research directions towards more multi-domain and highly granular narrative-related tasks.


2025

Human Experts vs. Large Language Models: Evaluating Annotation Scheme and Guidelines Development for Clinical Narratives

Authors
Fernandes, AL; Silvano, P; Guimarães, N; Silva, RR; Munna, TA; Cunha, LF; Leal, A; Campos, R; Jorge, A;

Publication
Text2Story@ECIR

Abstract
Electronic Health Records (EHRs) contain vast amounts of unstructured narrative text, posing challenges for organization, curation, and automated information extraction in clinical and research settings. Developing effective annotation schemes is crucial for training extraction models, yet it remains complex for both human experts and Large Language Models (LLMs). This study compares human- and LLM-generated annotation schemes and guidelines through an experimental framework. In the first phase, both a human expert and an LLM created annotation schemes based on predefined criteria. In the second phase, experienced annotators applied these schemes following the guidelines. In both cases, the results were qualitatively evaluated using Likert scales. The findings indicate that the human-generated scheme is more comprehensive, coherent, and clear than the one produced by the LLM. These results align with previous research suggesting that while LLMs show promising performance with respect to text annotation, the same does not apply to the development of annotation schemes, and human validation remains essential to ensure accuracy and reliability.
