2025
Authors
Khatri, N; Bernardes, G;
Publication
PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON DIGITAL LIBRARIES FOR MUSICOLOGY, DLFM 2025
Abstract
We present an analysis of performance configurations in Portuguese traditional music, using computational methods to process field recordings from the A Musica Portuguesa A Gostar Dela Propria (MPAGDP) archive. Our approach employs YOLOv11s (You Only Look Once), a computer vision system that can detect and count performers in archival footage, allowing us to automatically classify performances into meaningful categories: solo, duo, small, and large ensembles. This computational classification method processed 8122 field recordings with 96% classification accuracy, enabling systematic examination of performance contexts that would be time-consuming through manual analysis. Our analysis shows relationships between performance configuration and musical practice across Portuguese traditions. Solo performers, comprising 48% of vocal recordings, predominantly appear in narrative and poetic traditions requiring individual expression. Large ensembles (21%) maintain collective practices like polyphonic singing traditions. The geographic distribution shows regional traits: Alentejo features large-ensemble singing traditions, while northern regions favor solo performances. The temporal analysis traces how traditional forms maintain continuity through specific performance configurations, while contemporary adaptations emerge primarily in small group formats, illuminating the social dimensions of musical transmission and adaptation in Portuguese traditional music.
2025
Authors
Carvalho, N; Sousa, J; Portovedo, H; Bernardes, G;
Publication
INTERNATIONAL JOURNAL OF PERFORMANCE ARTS AND DIGITAL MEDIA
Abstract
This article investigates sampling strategies in latent space navigation to enhance co-creative music systems, focusing on timbre latent spaces. Adopting Villa-Rojo's 'Lamento' for tenor saxophone and tape as a case study, we conducted two experiments. The first assessed traditional corpus-based concatenative synthesis (CBCS) sampling within the RAVE model's latent space, finding that sampling strategies gradually deviate from a given target sonority while still relating to the original morphology. The second experiment aimed to define sampling strategies for creating variations of an input signal, namely parallel, contrary, and oblique motions. The findings expose the need to explore individual model layers and the geometric transformation nature of the contrary and oblique motions, which tend to dilate the original shape. The findings highlight the potential of motion-aware sampling for more contextually aware and expressive control of music structures via CBCS.
2025
Authors
Rodrigues Ferraz Esteves, AR; Campos Magalhães, EM; Bernardes De Almeida, G;
Publication
SAE Technical Papers
Abstract
Silent motors are an excellent strategy to combat noise pollution. Still, they can pose risks for pedestrians who rely on auditory cues for safety and reduce driver awareness due to the absence of the familiar sounds of combustion engines. Sound design for silent motors not only tackles the above issues but goes beyond safety standards towards a user-centered approach by considering how users perceive and interpret sounds. This paper examines the evolving field of sound design for electric vehicles (EVs), focusing on Acoustic Vehicle Alerting Systems (AVAS). The study analyzes existing AVAS, classifying them into different groups according to their design characteristics, from technical concerns and approaches to aesthetic properties. Based on the proposed classification, we propose an (adaptive) sound design methodology and concept for AVAS, built on state-of-the-art technologies and tools (APIs), like Wwise Automotive, and integrated through a functional prototype within a virtual environment. We validate our solution by conducting user tests focusing on EV sound perception and preferences in rural and urban environments. Results showed that, in rural areas, participants preferred nature-like and melodic sounds with a wide range of frequencies, emphasizing 1000 Hz, for the AVAS, and melodic, reliable, and relaxing sounds with a frequency range from 200 Hz to 500 Hz for the interior experience. In urban areas, melodic, futuristic, but not overpowering sounds (80 Hz to 700 Hz) with balanced frequencies at high speeds were chosen for the car's exterior. In the interior, melodic, futuristic, and combustion engine-like sounds with a low-frequency background and higher frequencies at high speeds were also preferred. © 2025 SAE International. All Rights Reserved.
2025
Authors
Cao, Z; Pinto, AS; Bernardes, G;
Publication
International Conference on Computer Supported Education, CSEDU - Proceedings
Abstract
Sound design plays an important role in serious games, influencing user experience and learning outcomes. However, deriving general principles and best practices remains challenging, as most literature relies on case-based studies in different application domains. Through a systematic review of the literature, 21 studies were analyzed to address two key questions: 1) what types of serious games and application domains incorporate sound design? and 2) what sound design strategies are implemented to enhance user experience and learning outcomes? The findings show that serious games have mainly focused on education, healthcare, and training, using sound to enhance motivation (50%), cognition (32%), and knowledge acquisition (18%). Furthermore, sound design strategies fulfill distinct roles: sound effects enhance feedback and engagement, background music influences motivation and cognitive processing, ambient sounds support navigation and emotional regulation, and dialogue facilitates knowledge acquisition. The findings highlight the need for further research to establish standardized sound design principles to optimize user experience and learning outcomes in serious games. Copyright © 2025 by SCITEPRESS - Science and Technology Publications, Lda.
2025
Authors
Mendes, AS; Murciego, AL; Silva, LA; Jiménez-Bravo, DM; Navarro-Cáceres, M; Bernardes, G;
Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2024, PT I
Abstract
Monodic folk music has traditionally been preserved in physical documents. It constitutes a vast archive that needs to be digitized to facilitate comprehensive analysis using AI techniques. A critical component of music score digitization is the transcription of lyrics, an extensively researched process in Optical Character Recognition (OCR) and document layout analysis. These fields typically require the development of specific models that operate in several stages: first, to detect the bounding boxes of specific texts, then to identify the language, and finally, to recognize the characters. Recent advances in vision language models (VLMs) have introduced multimodal capabilities, such as processing images and text, which are competitive with traditional OCR methods. This paper proposes an end-to-end system for extracting lyrics from images of handwritten musical scores. We aim to evaluate the performance of two state-of-the-art VLMs to determine whether they can eliminate the need to develop specialized text recognition and OCR models for this task. The results of the study, obtained from a dataset in a real-world application environment, are presented along with promising new research directions in the field. This progress contributes to preserving cultural heritage and opens up new possibilities for global analysis and research in folk music.
2025
Authors
Ribeiro, P; Coelho, A; Campos, R;
Publication
2025 13th Wireless Days Conference (WD)
Abstract