Publications

Publications by CTM

2008

New enhancements to the Audio Bandwidth Extension Toolkit (ABET)

Authors
Harinarayanan, EV; Annadana, R; Sinha, D; Ferreira, A;

Publication
Audio Engineering Society - 124th Audio Engineering Society Convention 2008

Abstract
Audio bandwidth extension has emerged as a key low bit rate coding tool. In continuation with our ongoing research on audio bandwidth extension, this paper presents new enhancements to the Audio Bandwidth Extension Toolkit (ABET). ABET consists of three primary tools: Accurate Spectral Replacement (ASR), Fractal Self Similarity Model (FSSM), and Multi-band Temporal Envelope Amplitude Coding (MBTAC) [1],[2],[3]. Additionally, we have introduced a blind bandwidth extension mode into ABET [4]. We discuss several new ideas and improvements to ABET. Specifically, enhancements to the blind bandwidth extension architecture that allow it to work with signals with only 3.5-4.0 kHz audio bandwidth are described. We also elaborate on a new tool for efficient coding of the time-frequency envelope, which cuts the overhead by 0.75-1.0 kbps/channel. Finally, we address a practical issue, namely computational complexity, and describe a new low decoder complexity mode of ABET.

2008

Hybrid Genetic Algorithm based on Gene Fragment Competition for Polyphonic Music Transcription

Authors
Reis, G; Fonseca, N; de Vega, FF; Ferreira, A;

Publication
APPLICATIONS OF EVOLUTIONARY COMPUTING, PROCEEDINGS

Abstract
This paper presents the Gene Fragment Competition concept, which can be used with Hybrid Genetic Algorithms, especially in signal and image processing. Memetic Algorithms have shown great success in real-life problems by adding local search operators to improve the quality of the already achieved "good" solutions during the evolutionary process. Nevertheless, these traditional local search operators do not perform well in highly demanding evaluation processes. This stresses the need for a new semi-local non-exhaustive method. Our proposed approach sits as a tradeoff between classical Genetic Algorithms and traditional Memetic Algorithms, performing a quasi-global/quasi-local search by means of gene fragment evaluation and selection. The applicability of this hybrid Genetic Algorithm to the signal processing problem of Polyphonic Music Transcription is shown. The results obtained show the feasibility of the approach.

2008

Evaluation of existing Harmonic-to-Noise Ratio methods for voice assessment

Authors
Sousa, R; Ferreira, A;

Publication
New Trends in Audio and Video - Signal Processing: Algorithms, Architectures, Arrangements, and Applications, NTAV / SPA 2008 - Conference Proceedings

Abstract
In this paper, several methods for estimating the Harmonic-to-Noise Ratio (HNR) of sustained vowels are evaluated. The HNR estimation methods are mainly based on time, spectral, and cepstral signal representations. An algorithm was implemented for each method and tested with synthesized voice sounds in order to evaluate its accuracy. Tests were also conducted with real pathological voice sounds in order to evaluate the behaviour of the different methods under real conditions.
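As an illustration of the time-domain family of HNR estimators mentioned in this abstract, the following is a minimal sketch, not the algorithm evaluated in the paper. It assumes the common autocorrelation formulation HNR = 10 log10(r / (1 - r)), where r is the normalized autocorrelation peak within a plausible pitch range. The function names, the pitch range defaults, and the sine-plus-noise test signal are all hypothetical choices made for the example.

```python
import math
import random


def autocorrelation(x, lag):
    """Mean lagged product of x with itself (unbiased: divides by the overlap length)."""
    n = len(x) - lag
    return sum(x[i] * x[i + lag] for i in range(n)) / n


def estimate_hnr_db(x, fs, f0_min=75.0, f0_max=500.0):
    """Estimate HNR in dB from the normalized autocorrelation peak.

    Assumption: the pitch period lies between fs/f0_max and fs/f0_min samples.
    """
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    r0 = autocorrelation(centered, 0)
    if r0 <= 0.0:
        raise ValueError("signal has no energy")
    lag_min = int(fs / f0_max)
    lag_max = int(fs / f0_min)
    r_peak = max(
        autocorrelation(centered, lag) / r0 for lag in range(lag_min, lag_max + 1)
    )
    # Clamp to keep the logarithm finite for near-perfectly periodic or aperiodic input.
    r_peak = min(max(r_peak, 1e-6), 1.0 - 1e-6)
    return 10.0 * math.log10(r_peak / (1.0 - r_peak))


def synth_vowel_like(f0, fs, n, hnr_db, seed=0):
    """Hypothetical test signal: a unit sine plus white Gaussian noise at a target HNR."""
    rng = random.Random(seed)
    noise_std = math.sqrt(0.5 / 10 ** (hnr_db / 10.0))  # sine power is 0.5
    return [
        math.sin(2.0 * math.pi * f0 * i / fs) + rng.gauss(0.0, noise_std)
        for i in range(n)
    ]


if __name__ == "__main__":
    signal = synth_vowel_like(f0=200.0, fs=8000, n=4000, hnr_db=20.0)
    print(round(estimate_hnr_db(signal, fs=8000), 1))
```

Testing an estimator against a synthetic signal with a known noise level mirrors the kind of accuracy check the abstract describes; a single sine is far simpler than a synthesized voice, so this only shows the principle.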

2008

Static features in isolated vowel recognition at high pitch

Authors
Ferreira, A;

Publication
SIGMAP 2008: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND MULTIMEDIA APPLICATIONS

Abstract
Vowel recognition is frequently based on Linear Prediction (LP) analysis and formant estimation techniques. However, the performance of these techniques decreases in the case of female or child speech because at high pitch frequencies (F0) the magnitude spectrum is sparsely sampled, making formant estimation unreliable. In this paper we describe the implementation of a perceptually motivated concept of vowel recognition that is based on Perceptual Spectral Clusters (PSC) of harmonic partials. PSC based features were evaluated in automatic recognition tests using the Mahalanobis distance and a database of five natural Portuguese vowel sounds uttered by 44 speakers, 27 of whom are child speakers. LP based features and Mel-Frequency Cepstral Coefficients (MFCC) were also included in the tests as a reference. Results show that while the recognition performance of PSC features falls between that of LP based features and that of MFCC coefficients, the normalization of PSC features by F0 increases the performance and approaches that of MFCC coefficients. PSC features are not only amenable to a psychophysical interpretation (as LP based features are) but also have the potential to compete with global shape features such as MFCCs.
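The recognition tests in this abstract rely on the Mahalanobis distance. The sketch below shows one plausible way a nearest-class Mahalanobis classifier can be set up; it is not the paper's implementation and does not compute PSC features. The two-dimensional feature vectors, the class labels, and all function names are invented for illustration.

```python
def mean_vector(rows):
    """Per-dimension mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance(rows, mu):
    """Sample covariance matrix (divides by n - 1)."""
    d = len(mu)
    cov = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [a - b for a, b in zip(row, mu)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j]
    return [[c / (len(rows) - 1) for c in line] for line in cov]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance (x - mu)^T cov^{-1} (x - mu)."""
    diff = [a - b for a, b in zip(x, mu)]
    return sum(d * v for d, v in zip(diff, solve(cov, diff)))


def train(samples_by_class):
    """Fit one (mean, covariance) model per class label."""
    models = {}
    for label, rows in samples_by_class.items():
        mu = mean_vector(rows)
        models[label] = (mu, covariance(rows, mu))
    return models


def classify(x, models):
    """Return the label whose class model is nearest in Mahalanobis distance."""
    return min(models, key=lambda label: mahalanobis_sq(x, *models[label]))


# Made-up toy features (not data from the paper), two classes in two dimensions.
TOY_DATA = {
    "a": [[700.0, 1200.0], [750.0, 1250.0], [680.0, 1300.0], [720.0, 1150.0]],
    "i": [[300.0, 2300.0], [320.0, 2250.0], [280.0, 2400.0], [310.0, 2200.0]],
}

if __name__ == "__main__":
    models = train(TOY_DATA)
    print(classify([710.0, 1220.0], models))
```

Using a per-class covariance lets the distance account for how each class spreads along each feature, which a plain Euclidean distance ignores.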

2008

Admission Control in IP Multicast over Heterogeneous Access Networks

Authors
Santos, P; Pinto, A; Ricardo, M; Almeida, T; Fontes, F;

Publication
NGMAST 2008: SECOND INTERNATIONAL CONFERENCE ON NEXT GENERATION MOBILE APPLICATIONS, SERVICES, AND TECHNOLOGIES, PROCEEDINGS

Abstract
Network operators have been reluctant to deploy IP multicast services mainly due to the lack of native control over multicast groups. This lack of control not only prevents operators from generating revenue from multicast-based services but also hinders regular network management. In this work we identify the network elements where admission control should be enforced for multicast session spawning over heterogeneous access networks. The proposed architecture uses existing AAA functionality to perform user identification and multicast session admission control. This control is performed at the network layer with no protocol modifications. Three access networks were considered: xDSL, WiMAX and UMTS.

2008

Multicast deflector - Secure video distribution system

Authors
Pinto, A; Ricardo, M;

Publication
TELECOMMUNICATION SYSTEMS

Abstract
Technological evolution is leading telecommunications toward all-IP scenarios, where multiple services are transported as IP packets. Among these services is the broadcast of video. A possible mechanism for broadcasting multiple video channels over IP is to use IP multicast and let each client decide whether to receive a channel. The secure IP multicast specified by the IETF MSEC working group is a candidate solution for securing these broadcast services. In this paper we propose a new solution for supporting the broadcast of multiple video channels that can be accessed only by authorized users. In addition, when a video channel is not being viewed in the last mile, its transmission is temporarily suspended, so that the cable can be used for other services such as standard Internet access.
