
Publications by Gilberto Bernardes Almeida

2025

Toward Musicologically-Informed Retrieval: Enhancing MEI with Computational Metadata

Authors
Carvalho, Nádia; Bernardes, Gilberto

Publication

Abstract
We present a metadata enrichment framework for Music Encoding Initiative (MEI) files, adding mid- to higher-level multimodal features to support content-driven (similarity) retrieval with semantic awareness across large collections. While traditional metadata captures basic bibliographic and structural elements, it often lacks the depth required for advanced retrieval tasks that rely on musical phrases, form, key or mode, idiosyncratic patterns, and textual topics. To address this, we propose a system that fosters the computational analysis and editing of MEI encodings at scale. Inserting extended metadata derived from computational analysis and heuristic rules lays the groundwork for more nuanced retrieval tools. A batch environment and a lightweight JavaScript web-based application offer a complementary workflow: large-scale annotation alongside an interactive environment for reviewing, validating, and refining MEI files' metadata. Development is informed by user-centered methodologies, including consultations with music editors and digital musicologists, and has been co-designed in the context of orally transmitted folk music traditions, ensuring that both the batch processes and interactive tools align with scholarly and domain-specific needs.
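The batch enrichment step the abstract describes can be pictured with a minimal sketch. This is not the authors' tool: it only shows one way to inject computed descriptors into an MEI header using Python's standard library. The placement under `<classification>`/`<termList>`/`<term>` loosely follows the MEI header module (consult the MEI Guidelines for valid placement), and the descriptor names ("key", "form") are hypothetical examples.

```python
# Illustrative sketch: batch-insert computed descriptors into an MEI header.
import xml.etree.ElementTree as ET

MEI_NS = "http://www.music-encoding.org/ns/mei"
ET.register_namespace("", MEI_NS)  # serialize MEI as the default namespace

def enrich(mei_xml, descriptors):
    """Append each (name, value) descriptor as an MEI <term> under a new
    <classification>/<termList> block inside <meiHead>."""
    root = ET.fromstring(mei_xml)
    head = root.find(f"{{{MEI_NS}}}meiHead")
    cls = ET.SubElement(head, f"{{{MEI_NS}}}classification")
    terms = ET.SubElement(cls, f"{{{MEI_NS}}}termList")
    for name, value in descriptors.items():
        term = ET.SubElement(terms, f"{{{MEI_NS}}}term")
        term.set("label", name)
        term.text = value
    return ET.tostring(root, encoding="unicode")
```

In a batch workflow, `enrich` would be mapped over every file in a collection, while the interactive web application would surface the inserted `<term>` elements for review and correction.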

2025

Computational Phrase Segmentation of Iberian Folk Traditions: An Optimized LBDM Model

Authors
Orouji, Amir Abbas; Carvalho, Nádia; Sá Pinto, António; Bernardes, Gilberto

Publication

Abstract
Phrase segmentation is a fundamental preprocessing step for computational folk music similarity, specifically in identifying tune families within digital corpora. Furthermore, recent literature increasingly recognizes the need for tradition-specific frameworks that accommodate the structural idiosyncrasies of each tradition. In this context, this study presents a culturally informed adaptation of the established rule-based Local Boundary Detection Model (LBDM) algorithm to underrepresented Iberian folk repertoires. Our methodological enhancement expands the LBDM baseline, which traditionally analyzes rests, pitch intervals, and inter-onset duration functions to identify potential segmentation boundaries, by integrating a sub-structure surface repetition function coupled with an optimized peak-selection algorithm. Furthermore, we implement a genetic algorithm to maximize segmentation accuracy by weighting coefficients for each function while calibrating the meta-parameters of the peak-selection process. Empirical evaluation on the I-Folk digital corpus, comprising 802 symbolically encoded folk melodies from Portuguese and Spanish traditions, demonstrates improvements in segmentation F-measure of six and sixteen percentage points (p.p.) relative to established baseline methodologies for Portuguese and Spanish repertoires, respectively.
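The LBDM baseline that this work extends can be summarized in a short sketch. This is not the authors' optimized model: the degree-of-change function follows Cambouropoulos's published formulation, but the surface-repetition function, the genetic-algorithm weight calibration, and the peak-selection meta-parameters described above are omitted, and the note representation and default weights are illustrative.

```python
# Minimal LBDM-style boundary profile (sketch, not the paper's model).
# A note is (midi_pitch, onset, duration); the profile assigns a boundary
# strength to each gap between consecutive notes.

def change(a, b):
    """Degree-of-change function: |a - b| / (a + b), as in the LBDM literature."""
    return abs(a - b) / (a + b) if (a + b) else 0.0

def lbdm_profile(notes, w_pitch=0.25, w_ioi=0.5, w_rest=0.25):
    pitches = [n[0] for n in notes]
    onsets = [n[1] for n in notes]
    durs = [n[2] for n in notes]
    # Interval sequences between consecutive notes (+1 avoids zero values).
    pint = [abs(pitches[i + 1] - pitches[i]) + 1 for i in range(len(notes) - 1)]
    ioi = [onsets[i + 1] - onsets[i] for i in range(len(notes) - 1)]
    rest = [max(0.0, onsets[i + 1] - (onsets[i] + durs[i])) + 1
            for i in range(len(notes) - 1)]

    def strengths(seq):
        # Each interval's strength is its value scaled by the change to its
        # neighbors, normalized to [0, 1].
        out = []
        for i in range(len(seq)):
            left = change(seq[i - 1], seq[i]) if i > 0 else 0.0
            right = change(seq[i], seq[i + 1]) if i < len(seq) - 1 else 0.0
            out.append(seq[i] * (left + right))
        m = max(out) or 1.0
        return [v / m for v in out]

    sp, si, sr = strengths(pint), strengths(ioi), strengths(rest)
    # Weighted combination; the paper instead learns these weights with a
    # genetic algorithm and adds a surface-repetition function.
    return [w_pitch * sp[i] + w_ioi * si[i] + w_rest * sr[i]
            for i in range(len(sp))]
```

Peaks in the returned profile are candidate phrase boundaries; the optimized peak-selection stage described in the abstract then decides which peaks become segment breaks.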

2024

Assessing Musical Preferences of Children on the Autistic Spectrum: Implications for Therapy

Authors
Santos, N; Bernardes, G; Cotta, R; Coelho, N; Baganha, A

Publication
Proceedings of the Sound and Music Computing Conferences

Abstract
Music-based therapies have been yielding favorable clinical outcomes in children with Autism Spectrum Disorder (ASD). However, there is a lack of guidelines for content selection in music-based interventions. In this context, we propose a methodology for conducting experimental studies on musical preferences in children diagnosed with ASD. It consists of a generative music system with seven manipulable musical parameters, where participants are encouraged to create music content according to their preferences. We conducted a preliminary cross-sectional study with 24 children in the state of Pará, Brazil. The results suggest preferences for fast tempo, higher pitch, consonance, high event density, and timbres with smooth attacks. Intriguingly, the results revealed inconsistency in the identified preferences across therapy sessions. The critical need for personalized regulation in music-based interventions for children with ASD highlights the unique nature of individual responses, emphasizing the imperative of tailoring therapeutic approaches accordingly. © 2024. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

2024

Statistical Analysis of Musical Features for Emotional Semantic Differentiation in Human and AI Databases

Authors
Braga, F; Forero, J; Bernardes, G

Publication
Proceedings of the Sound and Music Computing Conferences

Abstract
Understanding the structural features of perceived musical emotions is crucial for various applications, including content generation and mood-driven playlists. This study performs a comparative statistical analysis to examine the association of a set of musical features with emotions, described using adjectives. The analysis uses two datasets containing rock and pop musical fragments, categorized as human-generated and AI-generated. Focusing on four emotional adjectives (happy, sad, angry, tender-gentle), each representing one quadrant of the valence-arousal plane, we analyzed semantic differential meanings reported as symmetric pairs for all possible combinations of quadrants through diagonals, vertical, and horizontal axes. The results obtained were discussed based on Livingstone's circular representation of emotional features in music. Our findings demonstrate that the human and AI-generated datasets could be considered equivalent for diagonal symmetries, while horizontal and vertical symmetries show discrepancies. Furthermore, we assessed significant separability for both happy-sad and angry-tender pairs in the human dataset. In contrast, the AI-generated music exhibits a strong differentiation mainly in the angry-gentle pair. © 2024. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

2024

Exploring Sampling Strategies in Latent Spaces for Music Generation

Authors
Carvalho, N; Bernardes, G

Publication
Proceedings of the Sound and Music Computing Conferences

Abstract
This paper investigates sampling strategies within latent spaces for music generation, focusing on (chordified) J.S. Bach Chorales and utilizing MusicVAE as the generative model. We conduct an experiment comparing three sampling and interpolation strategies within the latent space for generating chord progressions, drawn from a discrete vocabulary of Bach's chords, against Bach's original chord sequences. Given a three-chord sequence from an original Bach chorale, we assess sampling strategies for replacing the middle chord. In detail, we adopt the following sampling strategies: (1) traditional linear interpolation, (2) k-nearest neighbors, and (3) k-nearest neighbors combined with angular alignment. The study evaluates their alignment with music theory principles of functional harmony embedding and voice-leading to mirror Bach's original chord sequences. Preliminary findings suggest that k-nearest neighbors and k-nearest neighbors combined with angular alignment closely align with the tonal function of the original chord, with k-nearest neighbors excelling in bass line interpolation and the combined strategy potentially enhancing voice-leading in upper voices. Linear interpolation maintains aspects of voice-leading but confines selections within defined tonal spaces, reflecting the nonlinear characteristics of the original sequences. Our study contributes to the dynamics of latent space sampling for music generation, offering potential avenues for enhancing explainable creative strategies. © 2024. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
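The first two strategies the abstract lists can be illustrated with a toy sketch. This is not the paper's setup: a small 2-D array stands in for MusicVAE latent vectors, the vocabulary is hand-made, and the angular-alignment variant is omitted.

```python
# Toy sketch of latent-space middle-chord replacement (illustrative only).
import numpy as np

def midpoint_snap(vocab, z_prev, z_next):
    """Linear-interpolation strategy: take the midpoint between the latents of
    the surrounding chords, then snap to the nearest vocabulary vector."""
    mid = (z_prev + z_next) / 2.0
    d = np.linalg.norm(vocab - mid, axis=1)
    return int(np.argmin(d))

def knn_candidates(vocab, z_prev, z_next, k=3):
    """k-NN strategy: return the indices of the k vocabulary vectors closest
    to the interpolation midpoint, as candidate middle chords."""
    mid = (z_prev + z_next) / 2.0
    d = np.linalg.norm(vocab - mid, axis=1)
    return np.argsort(d)[:k].tolist()
```

In the paper's setting, `vocab` would hold the MusicVAE embeddings of Bach's discrete chord vocabulary, and the candidates returned by the k-NN strategy would then be ranked by criteria such as angular alignment or voice-leading quality.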

2024

BiSAID: Bipolar Semantic Adjectives Icons and Earcons Dataset

Authors
Cao, Z; Pinto, S; Bernardes, G

Publication
Proceedings of the Sound and Music Computing Conferences

Abstract
This paper presents BiSAID, a dataset for exploring bipolar semantic adjectives in non-speech auditory cues, including earcons and auditory icons, i.e., sounds used to signify specific events or relay information in auditory interfaces from recorded or synthetic sources, respectively. In total, our dataset includes 599 non-speech auditory cues with different semantic labels, covering temperature (cold vs. warm), brightness (bright vs. dark), sharpness (sharp vs. dull), shape (curved vs. flat), and accuracy (correct vs. incorrect). Furthermore, we advance a preliminary analysis of brightness and accuracy earcon pairs from the BiSAID dataset to infer idiosyncratic sonic structures of each semantic earcon label from 66 instantaneous low- and mid-level descriptors, covering temporal, spectral, rhythmic, and tonal descriptors. Ultimately, we aim to unveil the relationships among sonic parameters behind earcon design, thus systematizing their structural foundations and shedding light on the metaphorical semantic nature of their description. This exploration revealed that spectral characteristics (e.g., spectral flux and spectral complexity) serve as the most relevant acoustic correlates in differentiating earcons on the dimensions of brightness and accuracy, respectively. The methodology holds great promise for systematizing earcon design and generating hypotheses for in-depth perceptual studies. © 2024. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
