Publications

Publications by Aníbal Ferreira

1999

An odd-DFT based approach to time-scale expansion of audio signals

Authors
Ferreira, AJS;

Publication
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING

Abstract
A new time-scale expansion algorithm based on a frequency-scale modification approach combined with time interpolation is presented. The algorithm is noniterative and is constrained to a blind modification of the magnitudes and phases of the relevant spectral components of the signal, on a frame-by-frame basis. The resulting advantages and limitations are discussed. A few simplified models for signal analysis/synthesis are developed, the most critical of which concern phase and frequency estimation beyond the frequency resolution of the filterbank. The structure of the algorithm is described and its performance is illustrated with both synthetic and natural audio signals.

2007

Static features in real-time recognition of isolated vowels at high pitch

Authors
Ferreira, AJS;

Publication
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA

Abstract
This paper addresses the problem of automatic identification of vowels uttered in isolation by female and child speakers. In this case, the magnitude spectrum of voiced vowels is sparsely sampled since only frequencies at integer multiples of F0 are significant. This impacts negatively on the performance of vowel identification techniques that either ignore pitch or rely on global shape models. A new pitch-dependent approach to vowel identification is proposed that emerges from the concept of timbre and that defines perceptual spectral clusters (PSC) of harmonic partials. A representative set of static PSC-related features is estimated and their performance is evaluated in automatic classification tests using the Mahalanobis distance. Linear prediction features and Mel-frequency cepstral coefficients (MFCC) are used as a reference, and a database of five (Portuguese) natural vowel sounds uttered by 44 speakers (including 27 child speakers) is used for training and testing the Gaussian models. Results indicate that perceptual spectral cluster (PSC) features perform better than plain linear prediction features, but perform slightly worse than MFCC features. However, PSC features have the potential to take full advantage of the pitch structure of voiced vowels, namely in the analysis of concurrent voices, or by using pitch as a normalization parameter. © 2007 Acoustical Society of America.
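As a rough illustration of classification with per-class Gaussian models and the Mahalanobis distance, the sketch below fits one mean and covariance per vowel label and assigns a new feature vector to the nearest class. It is not the paper's implementation: the two-dimensional feature vectors, the labels, and the function names (`fit_gaussian_2d`, `mahalanobis_sq`, `classify`) are all assumptions made for this example, and the toy values are not PSC features or data from the study.

```python
from statistics import mean

def fit_gaussian_2d(points):
    """Return (mean, inverse covariance) for a list of 2-D feature pairs."""
    n = len(points)
    mx = mean(p[0] for p in points)
    my = mean(p[1] for p in points)
    # Unbiased sample covariance entries.
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    # Closed-form inverse of the 2x2 covariance matrix.
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inv

def mahalanobis_sq(x, model):
    """Squared Mahalanobis distance from x to one class model."""
    (mx, my), inv = model
    dx, dy = x[0] - mx, x[1] - my
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))

def classify(x, models):
    """Pick the label whose Gaussian model is closest to x."""
    return min(models, key=lambda label: mahalanobis_sq(x, models[label]))

# Hypothetical toy training data (illustration only, not from the paper).
training = {
    "a": [(1.0, 2.0), (1.2, 1.9), (0.8, 2.2), (1.1, 2.1)],
    "i": [(4.0, 0.5), (4.2, 0.7), (3.9, 0.4), (4.1, 0.8)],
}
models = {label: fit_gaussian_2d(pts) for label, pts in training.items()}
```

Calling `classify((1.0, 2.0), models)` selects the label whose model lies closest in the Mahalanobis sense. A real system would use higher-dimensional features and a general matrix inverse in place of the 2x2 closed form.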

1992

SUM-DIFFERENCE STEREO TRANSFORM CODING

Authors
JOHNSTON, JD; FERREIRA, AJ;

Publication
ICASSP-92 - 1992 INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5

Abstract

2007

A novel automatic noise removal technique for audio and speech signals

Authors
Harinarayanan, EV; Sinha, D; Saeed, S; Ferreira, A;

Publication
Audio Engineering Society - 123rd Audio Engineering Society Convention 2007

Abstract
This paper introduces new ideas on wideband stationary/non-stationary noise removal for audio signals. Current noise reduction techniques have generally proven to be effective, yet these typically exhibit certain undesirable characteristics. Distortion and/or alteration of the audio characteristics of the primary audio sound is a common problem. Also, user intervention in identifying the noise profile is sometimes necessary. The proposed technique is centered on the classical Kalman filtering technique for noise removal but uses a novel architecture whereby advanced signal processing techniques are used to identify and preserve the richness of the audio spectrum. The paper also includes conceptual and derivative results on parameter estimation, a description of a multi-parameter Signal Activity Detector (SAD), and our newfound improved results.

2008

New enhancements to the Audio Bandwidth Extension Toolkit (ABET)

Authors
Harinarayanan, EV; Annadana, R; Sinha, D; Ferreira, A;

Publication
Audio Engineering Society - 124th Audio Engineering Society Convention 2008

Abstract
Audio bandwidth extension has emerged as a key low bit rate coding tool. In continuation of our ongoing research on audio bandwidth extension, this paper presents new enhancements to the Audio Bandwidth Extension Toolkit (ABET). ABET consists of three primary tools: Accurate Spectral Replacement (ASR), Fractal Self Similarity Model (FSSM), and Multi-band Temporal Envelope Amplitude Coding (MBTAC) [1],[2],[3]. Additionally, we have introduced a blind bandwidth extension mode into ABET [4]. We discuss several new ideas and improvements to ABET. Specifically, enhancements to the blind bandwidth extension architecture that allow it to work with signals with only 3.5-4.0 kHz audio bandwidth are described. We also elaborate on a new tool for efficient coding of the time-frequency envelope, which cuts the overhead by 0.75-1.0 kbps/channel. Finally, we address a practical issue, namely computational complexity, and describe a new low decoder complexity mode of ABET.

2010

DFT-based frequency estimation under harmonic interference

Authors
Ferreira, A; Sousa, R;

Publication
Final Program and Abstract Book - 4th International Symposium on Communications, Control, and Signal Processing, ISCCSP 2010

Abstract
In this paper we address the accurate estimation of the frequency of sinusoids of natural signals such as singing, voice or music. These signals are intrinsically harmonic and are normally contaminated by noise. Taking the Cramér-Rao Lower Bound for unbiased frequency estimators as a reference, we compare the performance of several DFT-based frequency estimators that are non-iterative and that use the rectangular window or the Hanning window. Test conditions simulate harmonic interference and two new ArcTan-based frequency estimators are also included in the tests. Conclusions are presented on the relative performance of the different frequency estimators as a function of the SNR. ©2010 IEEE.
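To make the idea of a non-iterative, DFT-based frequency estimator with a Hanning window concrete, the sketch below uses a standard two-bin magnitude-ratio interpolator: for a Hann-windowed sinusoid, the ratio alpha between the larger neighbouring bin and the peak bin gives a fractional offset of approximately (2*alpha - 1)/(alpha + 1). This is a textbook-style estimator chosen for illustration. It is not claimed to be one of the estimators compared in the paper, and it is not the ArcTan-based estimators the abstract mentions. The function name `hann_dft_frequency` and the test signal parameters are assumptions.

```python
import cmath
import math

def hann_dft_frequency(x, fs):
    """Estimate the frequency (Hz) of the strongest sinusoid in x."""
    n = len(x)
    # Periodic Hann window applied to the frame.
    xw = [x[i] * (0.5 - 0.5 * math.cos(2 * math.pi * i / n)) for i in range(n)]
    # Magnitudes of bins 0..n/2 via a direct DFT (O(n^2), fine for a demo).
    mag = [abs(sum(xw[i] * cmath.exp(-2j * math.pi * k * i / n)
                   for i in range(n)))
           for k in range(n // 2 + 1)]
    # Peak bin, excluding the edges so that both neighbours exist.
    k = max(range(1, n // 2), key=lambda b: mag[b])
    # Interpolate towards the larger of the two neighbouring bins.
    if mag[k + 1] >= mag[k - 1]:
        alpha, sign = mag[k + 1] / mag[k], 1.0
    else:
        alpha, sign = mag[k - 1] / mag[k], -1.0
    delta = sign * (2.0 * alpha - 1.0) / (alpha + 1.0)
    return (k + delta) * fs / n

# Hypothetical noiseless test tone located between two DFT bins.
fs = 8000.0
n = 256
f0 = 40.3 * fs / n
signal = [math.cos(2 * math.pi * f0 * i / fs + 0.3) for i in range(n)]
estimate = hann_dft_frequency(signal, fs)
```

With a noiseless single tone, the estimate should land well within one bin width (`fs / n`) of `f0`. Added noise or nearby harmonics, the conditions the paper studies, would degrade it.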
