2017
Authors
Terra, Ana Lúcia; Batista, Ana Alice; Lopes, Carla Teixeira; Ribeiro, Cristina; Martins, Fernanda; David, Gabriel; Rodrigues, Irene; Borbinha, José; Borges, Maria Manuel; Pinto, Maria Manuela; Fialho, Paulo;
Publication
ECIL 2017, Fifth European Conference on Information Literacy
Abstract
This study presents the Portuguese contribution to an international survey on the data literacy of academics and researchers. The community contributed 943 completed questionnaires, covering key aspects related to the use and production of research data (e.g. file type and volume of data created and used, the choice of data storage devices, and the creation of metadata on research data, among others). Also considered were the use of Data Management Plans and data management practices (e.g. file naming, citation rules, use of unique identifiers and tags), as well as the sharing of research data. Based on the results, it is concluded that there is a need to formulate institutional policies for the management of scientific data and to design training initiatives to develop data literacy skills. Comparing these results with those of the overall international study is a next step.
2025
Authors
Rodrigues, JF; Cardoso, HL; Lopes, CT;
Publication
COMPANION PROCEEDINGS OF THE ACM WEB CONFERENCE 2025, WWW COMPANION 2025
Abstract
Text simplification converts complex text into simpler language, improving readability and comprehension. This study evaluates the effectiveness of open-source large language models for text simplification across various categories. We created a dataset of 66,620 lead section pairs from English and Simple English Wikipedia, spanning nine categories, and tested Llama 3 for text simplification. We assessed its output for readability, simplicity, and meaning preservation. Results show improved readability, with simplification varying by category. Texts on Time were the most shortened, while Leisure-related texts had the greatest reduction of words/characters and syllables per sentence. Meaning preservation was most effective for the Objects and Education categories.
2025
Authors
Dias, M; Lopes, CT;
Publication
Research Challenges in Information Science - 19th International Conference, RCIS 2025, Seville, Spain, May 20-23, 2025, Proceedings, Part II
Abstract
Entity linking is an important task in medical natural language processing (NLP) for converting unstructured text into structured data for clinical analysis and semantic interoperability. However, in lower-resource languages, this task is challenging due to the limited availability of domain-specific resources. This paper explores a translation-based cross-lingual entity linking approach using GPT models, GPT-3.5 and GPT-4o, for zero-shot machine translation and entity linking with in-context learning. We evaluate our approach using a Portuguese-English parallel dataset of radiology abstracts. Our results show that chunk-level machine translation outperforms sentence-level translation. Moreover, our translation-based approach to cross-lingual entity linking of UMLS concepts outperformed the multilingual encoder method baseline. However, the in-context learning entity linking approach did not outperform a translation-based approach with a dictionary-based entity linking method. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
2025
Authors
Giagnolini, L; Koch, I; Tomasi, F; Teixeira Lopes, C;
Publication
Journal of Documentation
Abstract
Purpose – This study aims to comparatively evaluate two semantic models, ArchOnto (CIDOC CRM based) and Records in Contexts Ontology (RiC-O), for archival representation within the Linked Open Data framework. The research seeks to critically analyse their ability to represent archival documents, events, activities, and provenance through the application on a case study of historical baptism records. Design/methodology/approach – The study adopted a comparative approach, utilising the two models to represent a dataset of baptism records from a Portuguese parish spanning several centuries. This involved information extraction and conversion processes, transforming XML EAD finding aids into RDF to facilitate more explicit semantic representation and analysis. Findings – The analysis revealed distinctive strengths and limitations of each semantic model, providing nuanced insights into their respective capacities for archival description. The findings guide cultural heritage institutions in selecting and implementing the most suitable semantic model for their needs and pave the way for semantic alignment between the two models. Research limitations/implications – Although the case study explored the representation of a wide range of features, potential limitations include the specific contextual constraints of parish records and the need for broader comparative studies across diverse archival contexts. Originality/value – This paper offers original insights into semantic modelling for archival representations by providing a detailed comparative analysis of two ontological approaches. It offers valuable perspectives for archivists, digital humanities researchers, and cultural heritage professionals seeking to enhance the semantic richness of archival descriptions. © 2025 Emerald Publishing Limited