Publications

Publications by Gilberto Bernardes Almeida

2025

Explicit Tonal Tension Conditioning via Dual-Level Beam Search for Symbolic Music Generation

Authors
Ebrahimzadeh, Maral; Bernardes, Gilberto; Stober, Sebastian;

Publication

Abstract
State-of-the-art symbolic music generation models have recently achieved remarkable output quality, yet explicit control over compositional features, such as tonal tension, remains challenging. We propose a novel approach that integrates a computational tonal tension model, based on tonal interval vector analysis, into a Transformer framework. Our method employs a two-level beam search strategy during inference. At the token level, generated candidates are re-ranked using model probability and diversity metrics to maintain overall quality. At the bar level, a tension-based re-ranking is applied to ensure that the generated music aligns with a desired tension curve. Objective evaluations indicate that our approach effectively modulates tonal tension, and subjective listening tests confirm that the system produces outputs that align with the target tension. These results demonstrate that explicit tension conditioning through a dual-level beam search provides a powerful and intuitive tool to guide AI-generated music. Furthermore, our experiments demonstrate that our method can generate multiple distinct musical interpretations under the same tension condition.
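As an illustration of the bar-level step described in this abstract, the following is a minimal sketch of re-ranking candidate bars by how close their tension is to a target value. It is not the authors' implementation: `rerank_bars` and `toy_tension` are hypothetical names, and `toy_tension` (the share of notes outside C major) is only a stand-in for the tonal interval vector tension model used in the paper.

```python
# Hypothetical sketch of bar-level, tension-based re-ranking.
# Assumption: each candidate bar is a list of MIDI pitch numbers.

C_MAJOR_PITCH_CLASSES = {0, 2, 4, 5, 7, 9, 11}


def toy_tension(bar):
    """Stand-in tension measure: fraction of notes outside C major.

    This is NOT the tonal interval vector model from the paper.
    """
    if not bar:
        return 0.0
    outside = sum(1 for pitch in bar if pitch % 12 not in C_MAJOR_PITCH_CLASSES)
    return outside / len(bar)


def rerank_bars(candidates, target_tension, tension_fn, keep):
    """Keep the `keep` candidates whose tension is closest to the target."""
    ranked = sorted(candidates, key=lambda bar: abs(tension_fn(bar) - target_tension))
    return ranked[:keep]


candidates = [[60, 64, 67], [60, 61, 66], [60, 62, 63]]
best = rerank_bars(candidates, target_tension=0.3, tension_fn=toy_tension, keep=2)
```

In a full system, a step like this would run once per generated bar, with the target value read from the desired tension curve at that bar.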

2025

Toward Musicologically-Informed Retrieval: Enhancing MEI with Computational Metadata

Authors
Carvalho, Nádia; Bernardes, Gilberto;

Publication

Abstract
We present a metadata enrichment framework for Music Encoding Initiative (MEI) files, featuring mid- to higher-level multimodal features to support content-driven (similarity) retrieval with semantic awareness across large collections. While traditional metadata captures basic bibliographic and structural elements, it often lacks the depth required for advanced retrieval tasks that rely on musical phrases, form, key or mode, idiosyncratic patterns, and textual topics. To address this, we propose a system that fosters the computational analysis and edition of MEI encodings at scale. Inserting extended metadata derived from computational analysis and heuristic rules lays the groundwork for more nuanced retrieval tools. A batch environment and a lightweight JavaScript web-based application propose a complementary workflow by offering large-scale annotations and an interactive environment for reviewing, validating, and refining MEI files' metadata. Development is informed by user-centered methodologies, including consultations with music editors and digital musicologists, and has been co-designed in the context of orally transmitted folk music traditions, ensuring that both the batch processes and interactive tools align with scholarly and domain-specific needs.

2025

Computational Phrase Segmentation of Iberian Folk Traditions: An Optimized LBDM Model

Authors
Orouji, Amir Abbas; Carvalho, Nadia; Sá Pinto, António; Bernardes, Gilberto;

Publication

Abstract
Phrase segmentation is a fundamental preprocessing step for computational folk music similarity, specifically in identifying tune families within digital corpora. Furthermore, recent literature increasingly recognizes the need for tradition-specific frameworks that accommodate the structural idiosyncrasies of each tradition. In this context, this study presents a culturally informed adaptation of the established rule-based Local Boundary Detection Model (LBDM) algorithm to underrepresented Iberian folk repertoires. Our methodological enhancement expands the LBDM baseline, which traditionally analyzes rests, pitch intervals, and inter-onset duration functions to identify potential segmentation boundaries, by integrating a sub-structure surface repetition function coupled with an optimized peak-selection algorithm. Furthermore, we implement a genetic algorithm to maximize segmentation accuracy by weighting coefficients for each function while calibrating the meta-parameters of the peak-selection process. Empirical evaluation on the I-Folk digital corpus, comprising 802 symbolically encoded folk melodies from Portuguese and Spanish traditions, demonstrates improvements in segmentation F-measure of six and sixteen percentage points (p.p.) relative to established baseline methodologies for Portuguese and Spanish repertoires, respectively.
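For readers unfamiliar with the LBDM baseline mentioned in this abstract, the sketch below shows a simplified version of its core idea: each parametric profile (pitch intervals, inter-onset intervals, rests) is turned into a boundary-strength sequence, the sequences are combined with weights, and local peaks are taken as phrase boundaries. The function names, the default weights (0.25, 0.5, 0.25) and the fixed threshold are assumptions for illustration. The repetition function, the optimized peak selection and the genetic-algorithm tuning described in the paper are not reproduced here.

```python
# Simplified LBDM-style boundary detection (illustrative sketch only).
# Index i refers to the interval between note i and note i + 1.


def change(a, b):
    """Degree of change between two adjacent non-negative interval values."""
    return abs(a - b) / (a + b) if (a + b) > 0 else 0.0


def strength(intervals):
    """Boundary strength per interval, normalized to the range [0, 1]."""
    n = len(intervals)
    raw = []
    for i, x in enumerate(intervals):
        left = change(intervals[i - 1], x) if i > 0 else 0.0
        right = change(x, intervals[i + 1]) if i < n - 1 else 0.0
        raw.append(x * (left + right))
    peak = max(raw, default=0.0)
    return [s / peak if peak > 0 else 0.0 for s in raw]


def boundary_profile(pitch_intervals, iois, rests, weights=(0.25, 0.5, 0.25)):
    """Weighted sum of the three strength profiles (weights are assumed)."""
    profiles = [
        strength([abs(p) for p in pitch_intervals]),
        strength(iois),
        strength(rests),
    ]
    return [sum(w * p[i] for w, p in zip(weights, profiles)) for i in range(len(iois))]


def pick_peaks(profile, threshold=0.5):
    """Indices of local maxima at or above a fixed threshold (assumed rule)."""
    return [
        i
        for i in range(1, len(profile) - 1)
        if profile[i] >= threshold
        and profile[i] > profile[i - 1]
        and profile[i] >= profile[i + 1]
    ]


# Toy melody: a long note followed by a rest and a leap at interval index 3.
pitch_intervals = [2, 2, 1, 7, 2, 2, 1]
iois = [1, 1, 1, 4, 1, 1, 1]
rests = [0, 0, 0, 1, 0, 0, 0]
boundaries = pick_peaks(boundary_profile(pitch_intervals, iois, rests))
```

In the approach described in the abstract, the weights and the peak-selection parameters are what the genetic algorithm would tune per tradition, rather than being fixed as they are here.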

2024

Assessing Musical Preferences of Children on the Autistic Spectrum: Implications for Therapy

Authors
Santos, N; Bernardes, G; Cotta, R; Coelho, N; Baganha, A;

Publication
Proceedings of the Sound and Music Computing Conferences

Abstract
Music-based therapies have been yielding favorable clinical outcomes in children with Autism Spectrum Disorder (ASD). However, there is a lack of guidelines for content selection in music-based interventions. In this context, we propose a methodology for conducting experimental studies on musical preferences in children diagnosed with ASD. It consists of a generative music system with seven manipulable musical parameters where participants are encouraged to create music content according to their preferences. We conducted a preliminary transversal study with 24 children in the state of Pará, Brazil. The results suggest preferences for fast tempo, higher pitch, consonance, high event density, and timbres with smooth attacks. Intriguingly, the results revealed inconsistency in the identified preferences across therapy sessions. The critical need for personalized regulation in music-based interventions for children with ASD highlights the unique nature of individual responses, emphasizing the imperative of tailoring therapeutic approaches accordingly. © 2024. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original.

2024

Statistical Analysis of Musical Features for Emotional Semantic Differentiation in Human and AI Databases

Authors
Braga, F; Forero, J; Bernardes, G;

Publication
Proceedings of the Sound and Music Computing Conferences

Abstract
Understanding the structural features of perceived musical emotions is crucial for various applications, including content generation and mood-driven playlists. This study performs a comparative statistical analysis to examine the association of a set of musical features with emotions, described using adjectives. The analysis uses two datasets containing rock and pop musical fragments, categorized as human-generated and AI-generated. Focusing on four emotional adjectives (happy, sad, angry, tender-gentle) representing each valence-arousal plane's quadrant, we analyzed semantic differential meanings reported as symmetric pairs for all possible combinations of quadrants through diagonals, vertical, and horizontal axes. The results obtained were discussed based on Livingstone's circular representation of emotional features in music. Our findings demonstrate that the human and AI-generated datasets could be considered equivalent for diagonal symmetries, while horizontal and vertical symmetries show discrepancies. Furthermore, we assessed significant separability for both happy-sad and angry-tender pairs in the human dataset. In contrast, the AI-generated music exhibits a strong differentiation mainly in the angry-gentle pair. © 2024. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original.

2024

Exploring Sampling Strategies in Latent Spaces for Music Generation

Authors
Carvalho, N; Bernardes, G;

Publication
Proceedings of the Sound and Music Computing Conferences

Abstract
This paper investigates sampling strategies within latent spaces for music generation, focusing on (chordified) J.S. Bach Chorales and utilizing MusicVAE as the generative model. We conduct an experiment comparing three sampling and interpolation strategies within the latent space to generate chord progressions, drawn from a discrete vocabulary of Bach's chords, against Bach's original chord sequences. Given a three-chord sequence from an original Bach chorale, we assess sampling strategies for replacing the middle chord. In detail, we adopt the following sampling strategies: (1) traditional linear interpolation, (2) k-nearest neighbors, and (3) k-nearest neighbors combined with angular alignment. The study evaluates their alignment with music theory principles of functional harmony embedding and voice-leading to mirror Bach's original chord sequences. Preliminary findings suggest that k-nearest neighbors and k-nearest neighbors combined with angular alignment closely align with the tonal function of the original chord, with k-nearest neighbors excelling in bass line interpolation and the combined strategy potentially enhancing voice-leading in upper voices. Linear interpolation maintains aspects of voice-leading but confines selections within defined tonal spaces, reflecting the nonlinear characteristics of the original sequences. Our study contributes to the dynamics of latent space sampling for music generation, offering potential avenues for enhancing explainable creative strategies. © 2024. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original.
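The three strategies named in this abstract can be sketched on toy vectors as follows. This is a hedged illustration, not the paper's code: the latent vectors here are two-dimensional placeholders rather than MusicVAE encodings, all function names are hypothetical, and "angular alignment" is interpreted, as an assumption, as preferring the neighbor whose direction from the first chord best matches the direction from the first to the last chord (by cosine similarity).

```python
import math


def lerp(a, b, t):
    """Linear interpolation between two latent vectors."""
    return [x + t * (y - x) for x, y in zip(a, b)]


def k_nearest(query, vocab, k):
    """Labels of the k vocabulary vectors closest to `query` (Euclidean)."""
    ranked = sorted(vocab, key=lambda label: math.dist(query, vocab[label]))
    return ranked[:k]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm = math.hypot(*u) * math.hypot(*v)
    return sum(x * y for x, y in zip(u, v)) / norm if norm > 0 else 0.0


def knn_angular(prev, nxt, vocab, k):
    """Among the k nearest neighbors of the midpoint, pick the one whose
    direction from `prev` best aligns with the direction prev -> nxt.
    (Assumed interpretation of 'angular alignment'.)"""
    midpoint = lerp(prev, nxt, 0.5)
    direction = [b - a for a, b in zip(prev, nxt)]
    candidates = k_nearest(midpoint, vocab, k)
    return max(
        candidates,
        key=lambda label: cosine(
            [c - p for c, p in zip(vocab[label], prev)], direction
        ),
    )


# Toy 2-D "latent space" with a hypothetical three-chord vocabulary.
vocab = {"A": [1.0, 0.1], "B": [1.0, -1.0], "C": [5.0, 5.0]}
prev_chord, next_chord = [0.0, 0.0], [2.0, 0.0]

midpoint = lerp(prev_chord, next_chord, 0.5)
neighbors = k_nearest(midpoint, vocab, k=2)
choice = knn_angular(prev_chord, next_chord, vocab, k=2)
```

Snapping the interpolated point to the nearest vocabulary entries keeps the replacement chord inside the discrete chord vocabulary, which plain linear interpolation alone does not guarantee.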
