Publications by Aníbal Ferreira

2008

New enhancements to the automatic noise removal (ANR) system utilizing improved noise statistics and multi-band processing

Authors
Saeed, S; Harinarayanan, EV; Sinha, D; Ferreira, A

Publication
Audio Engineering Society - 124th Audio Engineering Society Convention 2008

Abstract
We recently introduced a novel Automatic Noise Reduction (ANR) algorithm for the removal of wideband stationary/non-stationary noise from audio [1]. Current noise reduction techniques exhibit certain undesirable characteristics. Distortion and/or alteration of the audio characteristics is a common problem. User intervention in identifying the noise profile is sometimes necessary. ANR uses a novel framework employing dominant component subtraction and restoration, and it performs better than conventional techniques in subjective tests. Here we describe three enhancements to ANR. The first increases the level of noise removal for the special case of stationary background noise. The second is a new tool that improves the temporal envelope coherence and yields additional noise removal. The third is a multi-band processing tool that conditions the time-frequency envelope for reduced listener fatigue.

2008

Real-time recognition of isolated vowels

Authors
Carvalho, M; Ferreira, A

Publication
Perception in Multimodal Dialogue Systems, Proceedings

Abstract
In this paper we present a new approach to the problem of isolated vowel recognition in real time. Language learning and speech therapy are examples of application areas that require real-time biofeedback of acoustic features. As the performance of known approaches usually drops for child speakers, we evaluated alternative feature extraction and pattern recognition techniques, including PCA, LDA, ANN and Bayesian classification. In addition, we studied the explicit inclusion of pitch as a main parameter in both the simulation and the real-time feature extraction process. The best results on our dataset were obtained when MFCCs were mapped, using LDA, to a 4-dimensional sub-space followed by Bayesian classification. An interactive game was designed that implements the selected real-time vowel recognition technique.
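As a rough illustration of the final stage described in this abstract, the sketch below classifies 4-dimensional feature vectors with a Gaussian Bayes rule. It is not the authors' implementation: the diagonal-covariance assumption, the function names (`fit`, `log_posterior`, `classify`) and the toy training vectors are all hypothetical, and the MFCC extraction and LDA projection steps are assumed to have been done already.

```python
import math
from collections import defaultdict

# Hypothetical toy data: 4-D vectors standing in for LDA-projected MFCCs.
TRAIN = [
    ([1.0, 0.0, 0.0, 0.0], "a"), ([1.2, 0.1, -0.1, 0.0], "a"), ([0.8, -0.1, 0.1, 0.1], "a"),
    ([0.0, 1.0, 0.0, 0.0], "i"), ([0.1, 1.2, 0.1, -0.1], "i"), ([-0.1, 0.8, -0.1, 0.0], "i"),
    ([0.0, 0.0, 1.0, 0.0], "u"), ([0.1, -0.1, 1.2, 0.1], "u"), ([-0.1, 0.1, 0.8, 0.0], "u"),
]


def fit(samples):
    """Estimate a prior, mean vector and per-dimension variance for each class."""
    by_class = defaultdict(list)
    for features, label in samples:
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        n, dim = len(rows), len(rows[0])
        mean = [sum(r[d] for r in rows) / n for d in range(dim)]
        # Small floor keeps the variance strictly positive (assumption of this sketch).
        var = [sum((r[d] - mean[d]) ** 2 for r in rows) / n + 1e-6 for d in range(dim)]
        model[label] = (n / len(samples), mean, var)
    return model


def log_posterior(model, features, label):
    """Unnormalized log posterior: log prior plus diagonal-Gaussian log likelihood."""
    prior, mean, var = model[label]
    log_likelihood = sum(
        -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for x, m, v in zip(features, mean, var)
    )
    return math.log(prior) + log_likelihood


def classify(model, features):
    """Return the class with the highest posterior (Bayes decision rule)."""
    return max(model, key=lambda label: log_posterior(model, features, label))


model = fit(TRAIN)
print(classify(model, [0.9, 0.0, 0.0, 0.0]))
```

A diagonal covariance keeps the example dependency-free; a full-covariance Gaussian per vowel would be an equally plausible reading of "Bayesian classification".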

1998

A new frequency domain approach to time-scale expansion of audio signals

Authors
Ferreira, AJS

Publication
Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-6

Abstract
We present a new algorithm for time-scale expansion of audio signals that comprises time interpolation, frequency-scale expansion and modification of a spectral representation of the signal. The algorithm relies on an accurate model of signal analysis and synthesis, and was constrained to a non-iterative modification of the magnitudes and the wrapped phases of the relevant sinusoidal components of the signal. The structure of the algorithm is described and its performance is illustrated. A few examples of time-expanded wideband speech can be found on the Internet.

2012

Evolutionary Algorithms and Automatic Transcription of Music

Authors
Reis, G; Fernandez, F; Ferreira, A

Publication
Proceedings of the Fourteenth International Conference on Genetic and Evolutionary Computation Companion (GECCO'12)

Abstract
The main difficulty in Automatic Transcription (Multiple Fundamental Frequency, or F0, Estimation) lies in its complexity. Harmonic collision and partial overlapping create a frequency lattice that is almost impossible to deconstruct. Although traditional approaches to this problem rely mainly on Digital Signal Processing (DSP) techniques, evolutionary algorithms have recently been applied to it and have achieved competitive results. We describe all evolutionary approaches to automatic music transcription and how some were improved so that they could achieve competitive results. Finally, we show how the best evolutionary approach performs on piano transcription compared with the state of the art.

2011

Concatenative singing voice resynthesis

Authors
Fonseca, N; Ferreira, A; Rocha, AP

Publication
17th DSP 2011 International Conference on Digital Signal Processing, Proceedings

Abstract
The concept of capturing the sound of something for later replication is not new, and it is used in many synthesizers. Capturing sounds and using them as an audio effect, however, is less common. This paper presents an approach to the resynthesis of a singing voice, based on concatenative techniques, that uses pre-recorded audio material as a high-level semantic audio effect, replacing an original audio recording with the sound of a different singer while trying to keep the same musical/phonetic performance. © 2011 IEEE.

2008

A Genetic Algorithm Approach with Harmonic Structure Evolution for Polyphonic Music Transcription

Authors
Reis, G; Fonseca, N; Fernandez, F; Ferreira, A

Publication
ISSPIT: 8th IEEE International Symposium on Signal Processing and Information Technology

Abstract
This paper presents a Genetic Algorithm approach with Harmonic Structure Evolution for Polyphonic Music Transcription. Automatic Music Transcription is a very complex problem that still awaits solutions because of the harmonic complexity of musical sounds. More traditional approaches try to extract the information directly from the audio stream. However, since a polyphonic audio stream is no more than a combination of several musical notes, music transcription can be addressed as a search problem in which the goal is to find the sequence of notes that best models the audio signal. Taking advantage of the ability of genetic algorithms to explore large search spaces, we present a new approach to the music transcription problem. To reduce harmonic overfitting, several techniques were used, including encoding the harmonic structure of the internal synthesizer inside the individual's genotype as a way to evolve towards the instrument played in the original audio signal. The results obtained in polyphonic piano transcriptions show the feasibility of the approach.
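The idea of treating transcription as a search over note sets can be sketched with a toy genetic algorithm. This is an illustration under strong assumptions, not the method of the paper: the "spectrum" is a short list of bin energies, the harmonic weights are fixed (the paper instead encodes the harmonic structure in the genotype), and every name and parameter below (`synthesize`, `error`, `transcribe`, population size, mutation scheme) is hypothetical.

```python
import random

CANDIDATE_NOTES = list(range(1, 13))        # toy fundamentals, expressed as bin indices
MAX_BIN = 48                                # highest bin of the toy spectrum
HARMONIC_WEIGHTS = [1.0, 0.5, 0.25, 0.125]  # fixed toy timbre (an assumption of this sketch)


def synthesize(notes):
    """Render a set of notes into a toy magnitude spectrum; harmonics may collide."""
    spectrum = [0.0] * (MAX_BIN + 1)
    for f0 in notes:
        for harmonic, weight in enumerate(HARMONIC_WEIGHTS, start=1):
            if f0 * harmonic <= MAX_BIN:
                spectrum[f0 * harmonic] += weight
    return spectrum


def error(notes, target):
    """Sum of absolute spectral differences between a candidate and the target."""
    return sum(abs(a - b) for a, b in zip(synthesize(notes), target))


def crossover(a, b, rng):
    """Uniform crossover: each candidate note is inherited from a random parent."""
    return frozenset(n for n in CANDIDATE_NOTES if n in (a if rng.random() < 0.5 else b))


def mutate(notes, rng):
    """Toggle the presence of one randomly chosen note."""
    child = set(notes)
    child ^= {rng.choice(CANDIDATE_NOTES)}
    return frozenset(child)


def transcribe(target, pop_size=30, generations=60, seed=0):
    """Evolve note sets towards the target; returns the best set and the error history."""
    rng = random.Random(seed)
    population = [
        frozenset(n for n in CANDIDATE_NOTES if rng.random() < 0.3)
        for _ in range(pop_size)
    ]
    history = []
    for _ in range(generations):
        population.sort(key=lambda ind: error(ind, target))
        history.append(error(population[0], target))
        elite = population[: pop_size // 3]  # elitism: the best individuals survive unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    best = min(population, key=lambda ind: error(ind, target))
    history.append(error(best, target))
    return best, history


target_notes = frozenset({3, 5, 8})
target = synthesize(target_notes)
best, history = transcribe(target)
print(sorted(best), history[-1])
```

Because the elite is carried over unchanged, the best error never increases from one generation to the next; whether the search recovers the exact note set depends on the seed and parameters.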
