Publications

Publications by CTM

2025

Evaluation of Lyrics Extraction from Folk Music Sheets Using Vision Language Models (VLMs)

Authors
Sales Mendes, A; Lozano Murciego, Á; Silva, LA; Jiménez Bravo, M; Navarro Cáceres, M; Bernardes, G;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
Monodic folk music has traditionally been preserved in physical documents. It constitutes a vast archive that needs to be digitized to facilitate comprehensive analysis using AI techniques. A critical component of music score digitization is the transcription of lyrics, an extensively researched process in Optical Character Recognition (OCR) and document layout analysis. These fields typically require the development of specific models that operate in several stages: first, to detect the bounding boxes of specific texts, then to identify the language, and finally, to recognize the characters. Recent advances in vision language models (VLMs) have introduced multimodal capabilities, such as processing images and text, which are competitive with traditional OCR methods. This paper proposes an end-to-end system for extracting lyrics from images of handwritten musical scores. We aim to evaluate the performance of two state-of-the-art VLMs to determine whether they can eliminate the need to develop specialized text recognition and OCR models for this task. The results of the study, obtained from a dataset in a real-world application environment, are presented along with promising new research directions in the field. This progress contributes to preserving cultural heritage and opens up new possibilities for global analysis and research in folk music. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

2025

Exploring the Role of Sound Design in Serious Games: Impact on User Experience and Learning Outcomes

Authors
Cao, Z; Pinto, AS; Bernardes, G;

Publication
International Conference on Computer Supported Education, CSEDU - Proceedings

Abstract
Sound design plays an important role in serious games, influencing user experience and learning outcomes. However, deriving general principles and best practices remains challenging, as most literature relies on case-based studies in different application domains. Through a systematic review of the literature, 21 studies were analyzed to address two key questions: 1) what types of serious games and application domains incorporate sound design? and 2) what sound design strategies are implemented to enhance user experience and learning outcomes? The findings show that serious games have mainly focused on education, healthcare, and training, using sound to enhance motivation (50%), cognition (32%), and knowledge acquisition (18%). Furthermore, sound design strategies fulfill distinct roles: sound effects enhance feedback and engagement, background music influences motivation and cognitive processing, ambient sounds support navigation and emotional regulation, and dialogue facilitates knowledge acquisition. The findings highlight the need for further research to establish standardized sound design principles to optimize user experience and learning outcomes in serious games. Copyright © 2025 by SCITEPRESS - Science and Technology Publications, Lda.

2025

Sound Design for Electric Vehicles: Enhancing Safety and User Experience Through Acoustic Vehicle Alerting System (AVAS)

Authors
Ana Raquel Rodrigues Ferraz Esteves; Eduardo Miguel Campos Magalhães; Gilberto Bernardes de Almeida;

Publication
SAE Technical Paper Series

Abstract
Silent motors are an excellent strategy to combat noise pollution. Still, they can pose risks for pedestrians who rely on auditory cues for safety and reduce driver awareness due to the absence of the familiar sounds of combustion engines. Sound design for silent motors not only tackles the above issues but goes beyond safety standards towards a user-centered approach by considering how users perceive and interpret sounds. This paper examines the evolving field of sound design for electric vehicles (EVs), focusing on Acoustic Vehicle Alerting Systems (AVAS). The study analyzes existing AVAS, classifying them into different groups according to their design characteristics, from technical concerns and approaches to aesthetic properties. Based on the proposed classification, an (adaptive) sound design methodology, and concept for AVAS are proposed based on state-of-the-art technologies and tools (APIs), like Wwise Automotive, and integration through a functional prototype within a virtual environment. We validate our solution by conducting user tests focusing on EV sound perception and preferences in rural and urban environments. Results showed participants preferred nature-like and melodic sounds with a wide range of frequencies, emphasizing 1000Hz, in rural areas, for the AVAS. For the interior experience, melodic, reliable, and relaxing sounds with a frequency range from 200Hz to 500Hz. In urban areas, melodic, futuristic, but not overpowering sounds (80Hz to 700Hz) with balanced frequencies at high speeds were chosen for the car's exterior. In the interior, melodic, futuristic, and combustion engine-like sounds with a low frequencies background and higher frequencies at high speeds were also preferred.

2025

Dynamic Data Radio Bearer Management for O-RAN Slicing in 5G Standalone Networks

Authors
Silva, P; Dinis, R; Coelho, A; Ricardo, M;

Publication
Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST

Abstract
The rapid growth of data traffic and evolving service demands are driving a shift from traditional network architectures to advanced solutions. While 5G networks provide reduced latency and higher availability, they still face limitations due to reliance on integrated hardware, leading to configuration and interoperability challenges. The emerging Open Radio Access Network (O-RAN) paradigm addresses these issues by enabling remote configuration and management of virtualized components through open interfaces, promoting cost-effective, multi-vendor interoperability. Network slicing, a key 5G enabler, allows for tailored network configurations to meet heterogeneous performance requirements. The main contribution of this paper is a private Standalone 5G network based on O-RAN, featuring a dynamic Data Radio Bearer Management xApp (xDRBM) for real-time metric collection and traffic prioritization. xDRBM optimizes resource usage and ensures performance guarantees for specific applications. Validation was conducted in an emulated environment representative of real-world scenarios. © ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2025.

2025

Use Cases for Terahertz Communications: An Industrial Perspective

Authors
Zugno, T; Ciochina, C; Sambhwani, S; Svedman, P; Pessoa, LM; Chen, B; Lehne, PH; Boban, M; Kürner, T;

Publication
IEEE WIRELESS COMMUNICATIONS

Abstract
Thanks to the vast amount of available resources and unique propagation properties, terahertz (THz) frequency bands are viewed as a key enabler for achieving ultrahigh communication performance and precise sensing capabilities in future wireless systems. Recently, the European Telecommunications Standards Institute (ETSI) initiated an Industry Specification Group (ISG) on THz which aims at establishing the technical foundation for subsequent standardization of this technology, which is pivotal for its successful integration into future networks. Starting from the work recently finalized within this group, this article provides an industrial perspective on potential use cases and frequency bands of interest for THz communication systems. We first identify promising frequency bands in the 100 GHz-1 THz range, offering over 500 GHz of available spectrum that can be exploited to unlock the full potential of THz communications. Then, we present key use cases and application areas for THz communications, emphasizing the role of this technology and its advantages over other frequency bands. We discuss their target requirements and show that some applications demand multi-Tb/s data rates, latency below 0.5 ms, and sensing accuracy down to 0.5 cm. Additionally, we identify the main deployment scenarios and outline other enabling technologies crucial for overcoming the challenges faced by THz systems. Finally, we summarize past and ongoing standardization efforts focusing on THz communications, while also providing an outlook toward the inclusion of this technology as an integral part of the future sixth generation (6G) and beyond communication networks.

2025

Analysis of Reconfigurable Reflective Unit Cells in Waveguide Environment for Ka and D Band

Authors
Finich, S; Elsaid, M; Inacio, SI; Salgado, HM; Pessoa, LM;

Publication
2025 19th European Conference on Antennas and Propagation (EuCAP)

Abstract
