Publications - INESC TEC

Publications by Aníbal Ferreira

2014

Audio-Perceptual Evaluation of Portuguese Voice Disorders: An Inter- and Intrajudge Reliability Study

Authors
Freitas, SV; Pestana, PM; Almeida, V; Ferreira, A

Publication
JOURNAL OF VOICE

Abstract
Objectives/Hypothesis. The aim of this article was to describe the results of an audio-perceptual evaluation carried out by 10 judges on a database comprising 90 voice recordings plus 10 repeated samples, with the purpose of characterizing intra- and interrater reliability. Study Design. Exploratory, transversal. Methods. The classification of the GRBAS parameters was obtained from each of the 10 experts for the 90 voice samples. Interrater reliability was determined with the intraclass correlation coefficient. For the 10 repeated voices, intrarater reliability was assessed by means of a dispersion analysis. Results. The judges' average classification for each of the GRBAS parameters differs (P < 0.05). The values of the correlations, with 95% confidence intervals, between the average scores for all components of the GRBAS scale lie, in general, between 0.838 and 0.966. The first three parameters of the scale (G, R, and B) have the highest interrater reliability. Differences were statistically significant (P < 0.05) for experts 1, 6, 9, and 10, which indicates poor intrarater reliability for 40% of the judges. Conclusions. All the experts had similar evaluation criteria for the assessment of the five parameters of the GRBAS scale (the 95% confidence intervals of the experts' average ratings of G, R, and B were above 0.8). However, their quantification is not statistically similar. Asthenia and Strain have lower reliability. Most experts do not reveal statistically significant differences between the values assigned to the G, R, and B parameters (P > 0.05).
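
As an illustration of the interrater analysis mentioned in the abstract, the following is a minimal sketch of an intraclass correlation coefficient computed from a samples-by-judges rating matrix. The abstract does not say which ICC form was used, so the two-way random-effects, absolute-agreement form (often written ICC(2,1) and ICC(2,k)) is an assumption here, and the function name and toy ratings are hypothetical rather than taken from the study.

```python
from statistics import mean


def icc_two_way_random(ratings):
    """Return (single-rater ICC, average-of-raters ICC).

    ratings[i][j] is the score given to voice sample i by judge j.
    Assumed model: two-way random effects, absolute agreement.
    """
    n = len(ratings)      # number of rated samples
    k = len(ratings[0])   # number of judges
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    # Mean squares: between samples, between judges, residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    single = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    average = (msr - mse) / (msr + (msc - mse) / n)
    return single, average


# Toy ratings (6 samples, 4 judges); illustrative only, not study data.
TOY_RATINGS = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

if __name__ == "__main__":
    single, average = icc_two_way_random(TOY_RATINGS)
    print(f"single-rater ICC: {single:.3f}, average-rater ICC: {average:.3f}")
```

For a GRBAS study, one such matrix would be built per parameter (G, R, B, A, S), with one row per voice sample and one column per judge.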

2014

The harmonic and noise information of the glottal pulses in speech

Authors
Sousa, R; Ferreira, A; Alku, P

Publication
BIOMEDICAL SIGNAL PROCESSING AND CONTROL

Abstract
This paper presents an algorithm, in the context of speech analysis and the evaluation of pathologic/dysphonic voices, which splits the glottal excitation signal into harmonic and noise components. The algorithm combines a harmonic and noise splitter with glottal inverse filtering. The combination of these two functionalities leads to an improved estimation of the glottal excitation and its components. The results demonstrate this improvement in the estimates of the glottal excitation in comparison to a known inverse filtering method (IAIF). These results comprise performance tests with synthetic voices and an application to natural voices that shows the waveforms of the harmonic and noise components of the glottal excitation. This enhances the retrieval of glottal information, such as waveform patterns with physiological meaning.

2015

Acoustic Correlates of Compensatory Adjustments to the Glottic and Supraglottic Structures in Patients with Unilateral Vocal Fold Paralysis

Authors
Jesus, LMT; Martinez, J; Hall, A; Ferreira, A

Publication
BIOMED RESEARCH INTERNATIONAL

Abstract
The goal of this study was to analyse perceptually and acoustically the voices of patients with Unilateral Vocal Fold Paralysis (UVFP) and compare them to the voices of normal subjects. These voices were analysed perceptually with the GRBAS scale and acoustically using the following parameters: mean fundamental frequency (F0), standard deviation of F0, jitter (ppq5), shimmer (apq11), mean harmonics-to-noise ratio (HNR), mean first (F1) and second (F2) formant frequencies, and standard deviation of the F1 and F2 frequencies. Statistically significant differences were found in all of the perceptual parameters. In addition, jitter, shimmer, HNR, standard deviation of F0, and standard deviation of the F2 frequency were statistically different between groups, for both genders. In the male data, differences were also found in the F1 and F2 frequency values and in the standard deviation of the F1 frequency. This study documented the alterations resulting from UVFP and explored parameters for which limited information is available for this pathology.
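
To make the jitter (ppq5) and shimmer (apq11) measures named in the abstract more concrete, here is a minimal sketch of a generic perturbation quotient. It assumes the commonly used definition in which each value is compared with the mean of a window centred on it (5 glottal periods for ppq5, 11 peak amplitudes for apq11) and the average deviation is normalized by the overall mean. The abstract does not state the exact formula or software used, so this definition and the function name are assumptions.

```python
def perturbation_quotient(values, window):
    """Relative perturbation of a sequence over an odd-sized sliding window.

    values: glottal periods (for jitter) or peak amplitudes (for shimmer).
    window: 5 for a ppq5-style measure, 11 for an apq11-style measure.
    Returns a ratio; multiply by 100 for a percentage.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    if len(values) < window:
        raise ValueError("need at least `window` values")

    half = window // 2
    n = len(values)
    # Absolute deviation of each value from the mean of its centred window
    deviations = [
        abs(values[i] - sum(values[i - half:i + half + 1]) / window)
        for i in range(half, n - half)
    ]
    mean_deviation = sum(deviations) / len(deviations)
    mean_value = sum(values) / n
    return mean_deviation / mean_value


if __name__ == "__main__":
    # Hypothetical period lengths in milliseconds, for illustration only.
    periods_ms = [8.0, 8.1, 7.9, 8.2, 8.0, 7.8, 8.1, 8.0]
    print(f"ppq5-style jitter: {100 * perturbation_quotient(periods_ms, 5):.2f} %")
```

Because the window mean follows slow drifts in pitch or loudness, a steadily rising or falling sequence yields a quotient of zero, and only cycle-to-cycle irregularity contributes.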

2015

Integrating Voice Evaluation: Correlation Between Acoustic and Audio-Perceptual Measures

Authors
Freitas, SV; Pestana, PM; Almeida, V; Ferreira, A

Publication
JOURNAL OF VOICE

Abstract
Objectives/Hypothesis. This article aims to establish correlations between acoustic and audio-perceptual measures using the GRBAS scale with respect to four different voice analysis software programs. Study Design. Exploratory, transversal. Methods. A total of 90 voice records were collected and analyzed with the Dr. Speech (Tiger Electronics, Seattle, WA), Multidimensional Voice Program (Kay Elemetrics, NJ, USA), PRAAT (University of Amsterdam, The Netherlands), and Voice Studio (Seegnal, Oporto, Portugal) software programs. The acoustic measures were correlated with the audio-perceptual parameters of the GRBAS scale, as rated by 10 experts. Results. The predictive value of the acoustic measurements related to the audio-perceptual parameters exhibited magnitudes ranging from weak (R_a² = 0.17) to moderate (R_a² = 0.71). The parameter exhibiting the highest correlation magnitude is B (Breathiness), whereas the weakest correlation magnitudes were found for A (Asthenia) and S (Strain). The acoustic measures with the strongest predictive values were local shimmer, harmonics-to-noise ratio, APQ5 shimmer, and PPQ5 jitter, with different magnitudes for each of the studied software programs. Conclusions. Some acoustic measures are identified as significant predictors of GRBAS parameters, but they differ among software programs. B (Breathiness) was the parameter exhibiting the highest correlation magnitude.
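
The predictive values in the abstract are reported as R² with a subscript a. Assuming this denotes the adjusted coefficient of determination of a regression model (the abstract does not spell this out), the sketch below shows how such a value relates to the ordinary R². The function names and the numbers in the usage example are hypothetical and are not results from the study.

```python
def r_squared(observed, predicted):
    """Ordinary coefficient of determination for a fitted model."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R^2: penalizes R^2 for the number of predictors used.

    n_samples: number of observations (e.g. voice records).
    n_predictors: number of acoustic measures in the model,
                  not counting the intercept.
    """
    if n_samples - n_predictors - 1 <= 0:
        raise ValueError("need more samples than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


if __name__ == "__main__":
    # Hypothetical model: 90 records, 4 acoustic predictors, R^2 = 0.75.
    print(f"adjusted R^2: {adjusted_r_squared(0.75, 90, 4):.3f}")
```

The adjustment matters when models with different numbers of acoustic measures are compared, since ordinary R² never decreases as predictors are added.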

2014

ON THE POSSIBILITY OF SPEAKER DISCRIMINATION USING A GLOTTAL PULSE PHASE-RELATED FEATURE

Authors
Ferreira, A

Publication
2014 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT)

Abstract
Normalized Relative Delay (NRD) is a phase-related feature that can be extracted from the harmonic structure of a periodic sound, using accurate frequency and phase estimation. We present research results showing that NRD coefficients reflect the phase structure of glottal pulses and possess a speaker discrimination capability. We use both synthetic and natural voiced vowels uttered by children, adult males, and adult females to illustrate the shift-invariance property of NRDs as well as their speaker discrimination potential, using a Fisher-related criterion of data scattering.

2013

A HYBRID LF-ROSENBERG FREQUENCY-DOMAIN MODEL OF THE GLOTTAL PULSE

Authors
Dias, S; Ferreira, A

Publication
2013 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA)

Abstract
In this paper we describe innovative advances in the design of a new frequency-domain algorithm for glottal source estimation whose conceptual approach we have reported recently [1]. Those advances result from accurate sinusoidal/harmonic analysis and synthesis of two concomitant acoustic signals: the glottal source signal captured near the vocal folds, and the corresponding voiced signal captured outside the mouth. We describe the experimental procedure, which was performed by an ORL specialist using a rigid video-laryngoscope and two tiny, high-quality microphones. Six subjects participated in the tests, and recordings were made for the vowels /a/ and /i/. The data analysis allowed us to draw conclusions about the magnitude and the phase-related NRD features of the glottal source signal. In addition, a new frequency-domain glottal pulse model combining features of the Liljencrants-Fant and Rosenberg models has been devised that better matches the observed data. The derivatives of the three models are obtained using accurate frequency-domain processing. The paper concludes with the next research steps.