Publications

2019

Phonetic-oriented identification of twin speakers using 4-second vowel sounds and a combination of a shift-invariant phase feature (NRD), MFCCs and F0 information

Authors
Ferreira, AJ;

Publication
Proceedings of the AES International Conference

Abstract
Automatic speaker identification typically relies on sophisticated statistical modeling and classification which requires large amounts of data for good performance. However, in actual audio forensics casework, frequently only a few seconds of speech material are available. In this paper, we favor diversity in feature extraction, simple modeling and classification, and constructive combination of congruent classification scores. We use phase, spectral magnitude and F0-related features in speaker identification experiments on a database of 35 speakers most of whom are twins. Using only 4.4 sec. of vowel-like sounds per speaker, we characterize the performance that is reached with individual features and we characterize simple and yet effective ways of classification score fusion. Insights for further research are also presented.

2018

Acoustic analysis of voice signal: Comparison of four applications software

Authors
Vaz Freitas, S; Pestana, PM; Almeida, V; Ferreira, A;

Publication
Biomedical Signal Processing and Control

Abstract
Objectives: To describe the results of the acoustic analysis of a database of 90 voice samples with distinct dysphonia levels, using four different - commercial and open source - software programs. Study design: Exploratory, transversal. Methods: The samples were analyzed by four different types of software programs that perform acoustical evaluation - one open source software (Praat) and three commercial ones (Multi Dimensional Voice Program - MDVP by Kay Elemetrics; VoiceStudio by Seegnal; and Dr. Speech by Tiger Electronics) - for comparison among the most commonly used acoustic measures (frequency, perturbation and noise measures). Results: There is a moderate to strong, positive and statistically significant correlation among the software programs. The mean F0 is not statistically different among the applications used. The other acoustic measures revealed statistically significant differences. Conclusion: Even though it is easier to access software programs and there are numerous proposals for acoustic measures, not all of them are statistically representative nor have numeric semblance among the different applications.

2018

On the physiological validity of the group delay response of all-pole vocal tract modeling

Authors
Ferreira, AJ;

Publication
145th Audio Engineering Society International Convention, AES 2018

Abstract
Magnitude-oriented approaches dominate the voice analysis front-ends of most current technologies addressing e.g. speaker identification, speech coding/compression, voice reconstruction and re-synthesis. A popular technique is all-pole vocal tract modeling. The phase response of all-pole models is known to be non-linear and highly dependent on the magnitude frequency response. In this paper, we use a shift-invariant phase-related feature that is estimated from signal harmonics in order to study the impact of all-pole models on the phase structure of voiced sounds. We relate that impact to the phase structure that is found in natural voiced sounds to conclude on the physiological validity of the group delay of all-pole vocal tract modeling. Our findings emphasize that harmonic phase models are idiosyncratic, and this is important in speaker identification, and in fostering the quality and naturalness of synthetic and reconstructed speech. © 2018 KASHYAP.

2018

First experiments on speaker identification combining a new shift-invariant phase-related feature (NRD), MFCCs and F0 information

Authors
Ferreira, A;

Publication
ICETE 2018 - Proceedings of the 15th International Joint Conference on e-Business and Telecommunications

Abstract
In this paper we report on a number of speaker identification experiments that assume a phonetic-oriented segmentation scheme exists such as to motivate the extraction of psychoacoustically-motivated phase and pitch related features. MFCC features are also considered for benchmarking. An emphasis is given to an innovative shift-invariant phase-related feature that is closely linked to the glottal source. A very simple statistical modeling is proposed and adapted in order to highlight the relative discrimination capabilities of different feature types. Results are presented for individual features and a discussion is also developed regarding possibilities of fusing features at the speaker modeling stage, or fusing distances at the speaker identification stage.

2017

CARMIE: A conversational medication assistant for heart failure

Authors
Lobo, J; Ferreira, L; Ferreira, AJS;

Publication
International Journal of E-Health and Medical Communications

Abstract
The incidence of chronic diseases is increasing and monitoring patients in a home environment is recommended. Noncompliance with prescribed medication regimens is a concern, especially among older people. Heart failure is a chronic disease that requires patients to follow strict medication plans permanently. With the objective of helping these patients managing information about their medicines and increasing adherence, the personal medication advisor CARMIE was developed as a conversational agent capable of interacting, in Portuguese, with users through spoken natural language. The system architecture is based on a language parser, a dialog manager, and a language generator, integrated with already existing tools for speech recognition and synthesis. All modules work together and interact with the user through an Android application, supporting users to manage information about their prescribed medicines. The authors also present a preliminary usability study and further considerations on CARMIE. Copyright © 2017, IGI Global.

Supervised theses

2019

AutoSpeech: Automatic Speech Analysis of Verbal Fluency for Older Adults

Author
João António Fernandes da Costa

Institution
UP-FEUP

2019

Modelização harmónica precisa de sons vozeados por humanos [Accurate harmonic modeling of human voiced sounds]

Author
Francisca Vieira de Brito

Institution
UP-FEUP

2019

Adaptation of an Harp for MIDI Implementation and Sound Amplification

Author
João Miguel Almeida Beleza

Institution
UP-FEUP

2018

Rede de sensores sem fios para deteção de incêndios florestais [Wireless sensor network for forest fire detection]

Author
João Gilberto Fernandes Gonçalves Teixeira

Institution
UP-FEUP

2018

Screening device for knee osteoarthritis based on Vibroarthrography

Author
Filipa dos Santos Castro Pereira

Institution
UP-FEUP