Publications

Publications by Marcelo Freitas Caetano

2016

Computer-aided musical orchestration using an artificial immune system

Authors
Abreu, J; Caetano, M; Penha, R;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
The aim of computer-aided musical orchestration is to find a combination of musical instrument sounds that approximates a target sound. The difficulty arises from the complexity of timbre perception and the combinatorial explosion of all possible instrument mixtures. The estimation of perceptual similarities between sounds requires a model capable of capturing the multidimensional perception of timbre, among other perceptual qualities of sounds. In this work, we use an artificial immune system (AIS) called opt-aiNet to search for combinations of musical instrument sounds that minimize the distance to a target sound encoded in a fitness function. Opt-aiNet is capable of finding multiple solutions in parallel while preserving diversity, proposing alternative orchestrations for the same target sound that are different among themselves. We performed a listening test to evaluate the subjective similarity and diversity of the orchestrations. © Springer International Publishing Switzerland 2016.

2015

earGram Actors: An Interactive Audiovisual System Based on Social Behavior

Authors
Beyls, P; Bernardes, G; Caetano, M;

Publication
JOURNAL OF SCIENCE AND TECHNOLOGY OF THE ARTS

Abstract
In multi-agent systems, local interactions among system components following relatively simple rules often result in complex overall systemic behavior. Complex behavioral and morphological patterns have been used to generate and organize audiovisual systems with artistic purposes. In this work, we propose to use the Actor model of social interactions to drive a concatenative synthesis engine called earGram in real time. The Actor model was originally developed to explore the emergence of complex visual patterns. In turn, earGram was originally developed to facilitate the creative exploration of concatenative sound synthesis. The integrated audiovisual system allows a human performer to interact with the system dynamics while receiving visual and auditory feedback. The interaction happens indirectly by disturbing the rules governing the social relationships amongst the actors, which results in a wide range of dynamic spatiotemporal patterns. A user-performer thus improvises within the behavioral scope of the system while evaluating the apparent connections between parameter values and actual complexity of the system output.

2015

Adaptive modeling of synthetic nonstationary sinusoids

Authors
Caetano, M; Kafentzis, AG; Mouchtaris, A;

Publication
DAFx 2015 - Proceedings of the 18th International Conference on Digital Audio Effects

Abstract
Nonstationary oscillations are ubiquitous in music and speech, ranging from the fast transients in the attack of musical instruments and consonants to amplitude and frequency modulations in expressive variations present in vibrato and prosodic contours. Modeling nonstationary oscillations with sinusoids remains one of the most challenging problems in signal processing because the fit also depends on the nature of the underlying sinusoidal model. For example, frequency modulated sinusoids are more appropriate to model vibrato than fast transitions. In this paper, we propose to model nonstationary oscillations with adaptive sinusoids from the extended adaptive quasi-harmonic model (eaQHM). We generated synthetic nonstationary sinusoids with different amplitude and frequency modulations and compared the modeling performance of adaptive sinusoids estimated with eaQHM, exponentially damped sinusoids estimated with ESPRIT, and log-linear-amplitude quadratic-phase sinusoids estimated with frequency reassignment. The adaptive sinusoids from eaQHM outperformed frequency reassignment for all nonstationary sinusoids tested and presented performance comparable to exponentially damped sinusoids.

2013

ADAPTIVE SINUSOIDAL MODELING OF PERCUSSIVE MUSICAL INSTRUMENT SOUNDS

Authors
Caetano, M; Kafentzis, GP; Mouchtaris, A; Stylianou, Y;

Publication
2013 PROCEEDINGS OF THE 21ST EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO)

Abstract
Percussive musical instrument sounds figure among the most challenging to model using sinusoids, particularly due to the characteristic attack that features a sharp onset and transients. Attack transients present a highly nonstationary inharmonic behaviour that is very difficult to model with traditional sinusoidal models, which use slowly varying sinusoids, commonly introducing an artifact known as pre-echo. In this work we use an adaptive sinusoidal model dubbed eaQHM to model percussive sounds from musical instruments such as plucked strings or percussion and investigate how eaQHM handles the sharp onsets and the nonstationary inharmonic nature of the attack transients. We show that adaptation renders a virtually perceptually identical sinusoidal representation of percussive sounds from different musical instruments, improving the Signal to Reconstruction Error Ratio (SRER) obtained with a traditional sinusoidal model. The result of a listening test revealed that the percussive sounds modeled with eaQHM were considered perceptually closer to the original sounds than their traditional-sinusoidal-modeled counterparts. Most listeners reported that they used the attack as a cue.

2013

EVALUATING HOW WELL FILTERED WHITE NOISE MODELS THE RESIDUAL FROM SINUSOIDAL MODELING OF MUSICAL INSTRUMENT SOUNDS

Authors
Caetano, M; Kafentzis, G; Degottex, G; Mouchtaris, A; Stylianou, Y;

Publication
2013 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA)

Abstract
Nowadays, sinusoidal modeling commonly includes a residual obtained by the subtraction of the sinusoidal model from the original sound. This residual signal is often further modeled as filtered white noise. In this work, we evaluate how well filtered white noise models the residual from sinusoidal modeling of musical instrument sounds for several sinusoidal algorithms. We compare how well each sinusoidal model captures the oscillatory behavior of the partials by looking into how "noisy" their residuals are. We performed a listening test to evaluate the perceptual similarity between the original residual and the modeled counterpart. Then we further investigate whether the result of the listening test can be explained by the fine structure of the residual magnitude spectrum. The results presented here have the potential to subsidize improvements on residual modeling.

2016

Full-Band Quasi-Harmonic Analysis and Synthesis of Musical Instrument Sounds with Adaptive Sinusoids

Authors
Caetano, M; Kafentzis, GP; Mouchtaris, A; Stylianou, Y;

Publication
APPLIED SCIENCES-BASEL

Abstract
Sinusoids are widely used to represent the oscillatory modes of musical instrument sounds in both analysis and synthesis. However, musical instrument sounds feature transients and instrumental noise that are poorly modeled with quasi-stationary sinusoids, requiring spectral decomposition and further dedicated modeling. In this work, we propose a full-band representation that fits sinusoids across the entire spectrum. We use the extended adaptive Quasi-Harmonic Model (eaQHM) to iteratively estimate amplitude- and frequency-modulated (AM-FM) sinusoids able to capture challenging features such as sharp attacks, transients, and instrumental noise. We use the signal-to-reconstruction-error ratio (SRER) as the objective measure for the analysis and synthesis of 89 musical instrument sounds from different instrumental families. We compare against quasi-stationary sinusoids and exponentially damped sinusoids. First, we show that the SRER increases with adaptation in eaQHM. Then, we show that full-band modeling with eaQHM captures partials at the higher frequency end of the spectrum that are neglected by spectral decomposition. Finally, we demonstrate that a frame size equal to three periods of the fundamental frequency results in the highest SRER with AM-FM sinusoids from eaQHM. A listening test confirmed that the musical instrument sounds resynthesized from full-band analysis with eaQHM are virtually perceptually indistinguishable from the original recordings.
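
Note on the SRER measure
The abstract above names the signal-to-reconstruction-error ratio (SRER) as its objective measure but does not define it. The sketch below assumes the definition commonly used with adaptive sinusoidal models: 20 log10 of the ratio between the standard deviation of the original signal and that of the residual, in dB. The function name srer_db and the toy signal are hypothetical and are not taken from the publication.

```python
import math
import statistics


def srer_db(original, reconstruction):
    """Signal-to-reconstruction-error ratio in dB (assumed definition).

    SRER = 20 * log10(std(x) / std(x - x_hat)), where x is the original
    signal and x_hat is the resynthesized signal.
    """
    if len(original) != len(reconstruction):
        raise ValueError("signals must have the same length")
    # Residual: what the model failed to capture.
    residual = [x - y for x, y in zip(original, reconstruction)]
    sigma_x = statistics.pstdev(original)
    sigma_e = statistics.pstdev(residual)
    if sigma_e == 0.0:
        return math.inf  # perfect reconstruction
    return 20.0 * math.log10(sigma_x / sigma_e)


# Hypothetical toy example: a 440 Hz sinusoid and a reconstruction
# whose amplitude is off by 1 percent.
fs = 8000
x = [math.sin(2.0 * math.pi * 440.0 * n / fs) for n in range(fs)]
x_hat = [0.99 * s for s in x]
print(srer_db(x, x_hat))  # about 40 dB, since the residual is 1% of the signal
```

Under this definition a higher SRER means a smaller residual, which is the sense in which the abstract reports that the SRER increases with adaptation.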
