Publications - INESC TEC

Publications by Aníbal Ferreira

2023

Identification of words in whispered speech: The role of cues to fricatives' place and voicing

Authors
Jesus, LMT; Ferreira, JFS; Ferreira, AJS;

Publication
JASA EXPRESS LETTERS

Abstract
The temporal distribution of acoustic cues in whispered speech was analyzed using the gating paradigm. Fifteen Portuguese participants listened to real disyllabic words produced by four Portuguese speakers. Lexical choices, confidence scores, isolation points (IPs), and recognition points (RPs) were analyzed. Mixed effects models predicted that the first syllable and 70% of the total duration of the second syllable were needed for lexical choices to be above chance level. Fricatives' place, not voicing, had a significant effect on the percentage of correctly identified words. IP and RP values of words with postalveolar voiced and voiceless fricatives were significantly different.

2010

Singing voice resynthesis using vocal sound libraries

Authors
Fonseca, N; Ferreira, A;

Publication
13th International Conference on Digital Audio Effects, DAFx 2010 Proceedings

Abstract
Although resynthesis may seem a simple analysis/synthesis process, it is a quite complex task, even more when it comes to recreating a singing voice. This paper presents a system whose goal is to start with an original audio stream of someone singing and recreate the same performance (melody, phonetics, dynamics) using an internal vocal sound library (choir or solo voice). By extracting dynamics and pitch information, and looking for phonetic similarities between the original audio frames and the frames of the sound library, a completely new audio stream is created. The obtained audio results, although not perfect (mainly due to the existence of audio artifacts), show that this technological approach may become an extremely powerful audio tool.

2010

Singing voice resynthesis using vocal sound libraries

Authors
Fonseca, N; Ferreira, A;

Publication
Proceedings of the International Conference on Digital Audio Effects, DAFx

1995

Effect of low bit-rate coding upon impaired audio material

Authors
Keyhl, M; Herre, J; Ferreira, A; Gilchrist, NHC;

Publication
IEE Conference Publication

Abstract
The quality of low bit-rate coding algorithms standardized under ISO/MPEG has been carefully verified by subjective listening tests using audio material of the highest technical quality. When using digital audio compression in the context of television sound systems, however, the question arises as to how the coders would perform with imperfect audio signals, such as those obtained from impaired film material. This paper describes studies into the effects of the more severe impairments present in film soundtracks upon MPEG-1 Layer-2 and Layer-3 audio coding, which have been carried out within the European RACE 'COUGAR' project programme R2122.

2005

Accurate and robust frequency estimation in the ODFT domain

Authors
Ferreira, A; Sinha, D;

Publication
2005 WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA)

Abstract
This paper presents new results improving by a factor of 10 the accuracy of an Odd-DFT based frequency estimation algorithm. These results are shown to be robust to the influence of additive noise and compare favorably to other non-iterative frequency domain estimation algorithms. A perspective is given on possible application areas, namely those involving real-time constraints.

2007

New enhancements to Immersive Sound field Rendition (ISR) system

Authors
Dubey, C; Annadana, R; Sinha, D; Ferreira, A;

Publication
Audio Engineering Society - 122nd Audio Engineering Society Convention 2007

Abstract
Consumer audio applications such as satellite radio broadcasts, multi-channel audio streaming and playback systems, coupled with the need to meet stringent bandwidth requirements, are eliciting newer challenges in parametric multichannel audio coding schemes. This paper describes the continuation of our research concerning the Immersive Soundfield Rendition (ISR) system and the different enhancements in various algorithmic components. The need to maintain a constant bit rate for many applications requires a rate control mechanism. The various strategies utilized in the rate control mechanism are presented. In addition, an innovative phase compensated down-mixing scheme has been incorporated in the ISR system so as to generate a high quality carrier signal. Enhancements have been made to the blind up-mixing scheme, and considerable gains have been made in terms of acoustic diversity. The various enhancements of the ISR system and its performance are detailed. Audio demonstrations are available at http://www.atc-labs.com/isr.
