Publications

2025

LLM-Based Framework for Synthetic Data Generation in Portuguese Clinical NER

Authors
Henriques, L; Guimarães, N; Jorge, A;

Publication
Progress in Artificial Intelligence - 24th EPIA Conference on Artificial Intelligence, EPIA 2025, Faro, Portugal, October 1-3, 2025, Proceedings, Part I

Abstract
The ever-increasing volume of data produced in Healthcare demands solutions capable of automatically extracting the relevant elements of their narratives. However, given privacy regulations, bureaucratic procedures, and annotation efforts, the development of such solutions via Natural Language Processing (NLP) systems is hindered by training data scarcity. This scarcity is more severe for languages and language varieties with lower resource availability, such as European and Brazilian Portuguese. To address this problem, we propose a Large Language Model (LLM)-based Synthetic Data Generation (SDG) framework to generate and annotate synthetic clinical texts for medical Named-Entity Recognition (NER). The SDG framework consists of a system/user prompt augmented with real examples, powered by GPT-4o. Our results show that, by feeding the framework a few real annotated clinical texts, we can generate synthetic data capable of increasing the performance of NER models with respect to their non-augmented counterparts. In addition, the reduction of the BLEU scores in the generated texts indicates a decrease in the risk of privacy disclosure while ensuring greater lexical diversity. These results highlight the potential of synthetic data as a solution to overcome human annotation bottlenecks and privacy concerns, laying the groundwork for future research in clinical NLP across tasks, domains, and low-resource languages. © 2025 Elsevier B.V., All rights reserved.

2025

Airborne Wind Energy Farms: Layout Optimization Combining NSGA-II and BRKGA

Authors
da Costa, RC; Roque, LAC; Paiva, LT; Fernandes, MCRM; Fontes, DBMM; Fontes, FACC;

Publication
DYNAMICS OF INFORMATION SYSTEMS, DIS 2024

Abstract
We address the layout optimization problem of deciding the number, the location, and the operational space of a set of Airborne Wind Energy (AWE) units, which together constitute an AWE farm. The layout optimization problem in conventional wind farms, with standard wind turbines, is a well-studied subject; however, in the case of AWE, there are several new characteristics and challenges. While in conventional wind farms the main concern is to guarantee a reduced aerodynamic wake effect from other units, in AWE the main concern is to avoid collision among units. The optimization problem addressed is the following: given a specific land dimension and local wind characteristics, we solve a bi-objective problem of maximizing power production while minimizing the number of units, by deciding the number of producing units, their locations, and their flight envelopes. The solution method uses a combination of metaheuristic methods, including elements from the Non-Dominated Sorting Genetic Algorithm-II (NSGA-II) and the Biased Random Key Genetic Algorithm (BRKGA). The results produce a custom Pareto set adapted to the local wind characteristics, allowing for a more accurate estimation of the key objectives, a better estimate of the annual power output of the AWE farm, and better-informed decisions regarding the optimal number of units to deploy in the farm.
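
The bi-objective trade-off described in this abstract (more power versus fewer units) can be illustrated with a minimal non-dominated filter, the basic idea behind a Pareto set. This is only a sketch under stated assumptions: the `Layout` tuple, the function names, and the candidate values are hypothetical and invented for illustration, and the code is not the authors' NSGA-II/BRKGA method.

```python
from typing import List, Tuple

# Hypothetical representation: (power output, number of units).
Layout = Tuple[float, int]


def dominates(a: Layout, b: Layout) -> bool:
    """Return True if layout a is no worse than b in both objectives
    (higher power, fewer units) and strictly better in at least one."""
    power_a, units_a = a
    power_b, units_b = b
    no_worse = power_a >= power_b and units_a <= units_b
    strictly_better = power_a > power_b or units_a < units_b
    return no_worse and strictly_better


def pareto_front(layouts: List[Layout]) -> List[Layout]:
    """Keep only the layouts that no other layout dominates."""
    return [c for c in layouts if not any(dominates(o, c) for o in layouts)]


# Made-up candidate layouts, for illustration only.
candidates = [(120.0, 4), (150.0, 6), (110.0, 5), (150.0, 7), (90.0, 3)]
front = pareto_front(candidates)
print(front)
```

In this toy data, `(110.0, 5)` is dominated by `(120.0, 4)` and `(150.0, 7)` by `(150.0, 6)`, so neither appears in the front.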

2025

A multi-criteria approach to support frequency setting and vehicle technology selection of bus transportation

Authors
Caetano, JA; De Sousa, JP; Marques, CM; Ribeiro, GM; Bahiense, L;

Publication
Transportation Research Procedia

Abstract
This research addresses the Frequency Setting Problem (FSP) together with vehicle technology selection for bus fleet sizing and management. A decision support tool was developed that combines a multi-criteria decision analysis, using the Analytic Hierarchy Process (AHP), and an enumeration procedure. The tool assists transportation operators in selecting optimal frequencies and vehicle technologies, considering economic, social, and environmental criteria. Computational experiments performed in the city of Niterói, Brazil, demonstrate the effectiveness of the tool. Scenarios with different criteria prioritizations highlight the flexibility of the approach and emphasize the need for a balance between all the sustainability dimensions. This approach positively impacts public transportation system performance, favouring higher-capacity vehicles while considering demand, and contributing to sustainable urban mobility. © 2024 The Authors.
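
As an illustration of how AHP turns pairwise comparisons into criterion weights, the sketch below uses the row geometric mean method, a common approximation of the principal eigenvector. The comparison values and the function name `ahp_weights` are assumptions made up for this example; they are not taken from the paper or its Niterói experiments.

```python
import math
from typing import List


def ahp_weights(pairwise: List[List[float]]) -> List[float]:
    """Approximate AHP priority weights with the row geometric mean method.

    pairwise[i][j] states how much more important criterion i is than j.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical judgments for (economic, social, environmental) criteria.
comparisons = [
    [1.0, 2.0, 6.0],      # economic vs. economic, social, environmental
    [1 / 2, 1.0, 3.0],    # social
    [1 / 6, 1 / 3, 1.0],  # environmental
]
weights = ahp_weights(comparisons)
print([round(w, 3) for w in weights])
```

Because this example matrix is perfectly consistent, the weights come out as roughly 0.6, 0.3, and 0.1; with inconsistent judgments a consistency check would normally follow.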

2025

Leveraging Large-language Models for Thematic Analysis of Children's Folk Lyrics: A comparative study of Iberian Traditions

Authors
Rodriguez, JF; Bernardes, G;

Publication
PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON DIGITAL LIBRARIES FOR MUSICOLOGY, DLFM 2025

Abstract
Folk music and particularly children's folk songs serve as vital repositories of cultural identity, emotional expression, and social values. This study presents a computational thematic analysis of Portuguese and Spanish children's folk songs using the I-Folk corpus, comprising 800 annotated entries in the Music Encoding Initiative (MEI) format. Despite shared historical influences on the Iberian Peninsula, the lyrical content of each tradition reveals distinct thematic orientations. Through a methodological framework that combines traditional text pre-processing, frequency analysis, and semantic embedding using large language models (LLMs), we uncover cross-cultural similarities and divergences in content, form, and emotional register. Spanish lyrics focus primarily on caregiving, emotional development, and moral-religious motifs, while Portuguese songs emphasize performative rhythm, localized identity, and folkloric references. Our results highlight the need for tailored analytical strategies when working with children's repertoire and demonstrate the utility of LLMs in capturing culturally embedded patterns that are often obscured in conventional analyses. This work contributes to digital folklore scholarship, corpus-based ethnomusicology, and the preservation of underrepresented cultural expressions in computational humanities.

2025

Social Compliance With NPIs, Mobility Patterns, and Reproduction Number: Lessons From COVID-19 in Europe

Authors
Baccega, D; Aguilar, J; Baquero, C; Anta, AF; Ramirez, JM;

Publication
IEEE ACCESS

Abstract
Non-pharmaceutical interventions (NPIs), such as lockdowns, travel restrictions, and social distancing mandates, play a critical role in controlling the spread of infectious diseases by shaping human mobility patterns. Using COVID-19 as a case study, this research investigates the relationships between NPIs, mobility, and the effective reproduction number (R_t) across 13 European countries. We employ XGBoost regression models to estimate missing mobility data from NPIs and missing R_t values from mobility, achieving high accuracy. Additionally, using clustering techniques, we uncover national distinctions in social compliance. Northern European countries demonstrate higher adherence to NPIs than Southern Europe, which exhibits more variability in response to restrictions. These differences highlight the influence of cultural and social norms on public health outcomes. In general, our analysis reveals a strong correlation between NPIs and mobility reductions, highlighting the immediate impact of restrictions on population movement. However, the relationship between mobility and R_t is weaker and more nuanced, reflecting the time delays involved, as changes in mobility take time to influence transmission rates. These results underscore the interdependence of restrictions, mobility, and disease spread while demonstrating the potential for data-driven approaches to guide policy decisions. Our approach offers valuable insights for optimizing public health strategies and tailoring interventions to diverse cultural contexts during future health crises.
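
The time-delay effect mentioned in this abstract can be illustrated with a lagged correlation: mobility on day t is compared with the reproduction number some days later. The sketch below is not the authors' XGBoost pipeline; the function names and both series are synthetic assumptions, constructed so that the second series follows the first with a two-day delay.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def lagged_correlation(mobility: Sequence[float], r_t: Sequence[float], lag: int) -> float:
    """Correlate mobility on day t with the reproduction number on day t + lag."""
    if lag == 0:
        return pearson(mobility, r_t)
    return pearson(mobility[:-lag], r_t[lag:])


# Synthetic series: r_t[t + 2] = 1 + mobility[t]; the first two values are filler.
mobility = [1.0, 0.8, 0.5, 0.4, 0.6, 0.9, 1.0, 0.7]
r_t = [1.2, 1.3, 2.0, 1.8, 1.5, 1.4, 1.6, 1.9]

best_lag = max(range(4), key=lambda lag: lagged_correlation(mobility, r_t, lag))
print(best_lag, lagged_correlation(mobility, r_t, best_lag))
```

With this made-up data the same-day correlation is weak, while the two-day lag recovers the built-in relationship, which is the kind of delay the abstract alludes to.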

2025

On the Definition of Robustness and Resilience of AI Agents for Real-time Congestion Management

Authors
Tjhay T.; Bessa R.J.; Paulos J.;

Publication
2025 IEEE Kiel PowerTech, PowerTech 2025

Abstract
The European Union's Artificial Intelligence (AI) Act defines robustness, resilience, and security requirements for high-risk sectors but lacks detailed methodologies for assessment. This paper introduces a novel framework for quantitatively evaluating the robustness and resilience of reinforcement learning agents in congestion management. Using the AI-friendly digital environment Grid2Op, perturbation agents simulate natural and adversarial disruptions by perturbing the input of AI systems without altering the actual state of the environment, enabling the assessment of AI performance under various scenarios. Robustness is measured through stability and reward impact metrics, while resilience quantifies recovery from performance degradation. The results demonstrate the framework's effectiveness in identifying vulnerabilities and improving AI robustness and resilience for critical applications.
