Details
Name
Aníbal Ferreira
Cluster
Networked Intelligent Systems
Role
Affiliated Researcher
Since
22 November 1995
Nationality
Portugal
Centre
Centre for Telecommunications and Multimedia
Contacts
+351222094299
anibal.ferreira@inesctec.pt
2019
Authors
Ferreira, AJ;
Publication
Proceedings of the AES International Conference
Abstract
Automatic speaker identification typically relies on sophisticated statistical modeling and classification which requires large amounts of data for good performance. However, in actual audio forensics casework, frequently only a few seconds of speech material are available. In this paper, we favor diversity in feature extraction, simple modeling and classification, and constructive combination of congruent classification scores. We use phase, spectral magnitude and F0-related features in speaker identification experiments on a database of 35 speakers most of whom are twins. Using only 4.4 sec. of vowel-like sounds per speaker, we characterize the performance that is reached with individual features and we characterize simple and yet effective ways of classification score fusion. Insights for further research are also presented.
2018
Authors
Vaz Freitas, S; Pestana, PM; Almeida, V; Ferreira, A;
Publication
BIOMEDICAL SIGNAL PROCESSING AND CONTROL
Abstract
Objectives: To describe the results of the acoustic analysis of a database of 90 voice samples with distinct dysphonia levels, using four different software programs, both commercial and open source. Study design: Exploratory, transversal. Methods: The samples were analyzed by four software programs that perform acoustic evaluation: one open source program (Praat) and three commercial ones (Multi Dimensional Voice Program, MDVP, by Kay Elemetrics; VoiceStudio by Seegnal; and Dr. Speech by Tiger Electronics). The programs were compared on the most commonly used acoustic measures (frequency, perturbation and noise measures). Results: There is a moderate to strong, positive and statistically significant correlation among the software programs. The mean F0 is not statistically different among the applications used. The other acoustic measures revealed statistically significant differences. Conclusion: Even though it is easier to access software programs and there are numerous proposals for acoustic measures, not all of them are statistically representative or numerically similar across the different applications.
2018
Authors
Ferreira, AJ;
Publication
145th Audio Engineering Society International Convention, AES 2018
Abstract
Magnitude-oriented approaches dominate the voice analysis front-ends of most current technologies addressing e.g. speaker identification, speech coding/compression, voice reconstruction and re-synthesis. A popular technique is all-pole vocal tract modeling. The phase response of all-pole models is known to be non-linear and highly dependent on the magnitude frequency response. In this paper, we use a shift-invariant phase-related feature that is estimated from signal harmonics in order to study the impact of all-pole models on the phase structure of voiced sounds. We relate that impact to the phase structure that is found in natural voiced sounds to conclude on the physiological validity of the group delay of all-pole vocal tract modeling. Our findings emphasize that harmonic phase models are idiosyncratic, and this is important in speaker identification, and in fostering the quality and naturalness of synthetic and reconstructed speech.
2018
Authors
Ferreira, A;
Publication
ICETE 2018 - Proceedings of the 15th International Joint Conference on e-Business and Telecommunications
Abstract
In this paper we report on a number of speaker identification experiments that assume a phonetic-oriented segmentation scheme exists such as to motivate the extraction of psychoacoustically-motivated phase and pitch related features. MFCC features are also considered for benchmarking. An emphasis is given to an innovative shift-invariant phase-related feature that is closely linked to the glottal source. A very simple statistical modeling is proposed and adapted in order to highlight the relative discrimination capabilities of different feature types. Results are presented for individual features and a discussion is also developed regarding possibilities of fusing features at the speaker modeling stage, or fusing distances at the speaker identification stage.
2017
Authors
Lobo, J; Ferreira, L; Ferreira, AJS;
Publication
International Journal of E-Health and Medical Communications
Abstract
The incidence of chronic diseases is increasing and monitoring patients in a home environment is recommended. Noncompliance with prescribed medication regimens is a concern, especially among older people. Heart failure is a chronic disease that requires patients to follow strict medication plans permanently. With the objective of helping these patients manage information about their medicines and increasing adherence, the personal medication advisor CARMIE was developed as a conversational agent capable of interacting, in Portuguese, with users through spoken natural language. The system architecture is based on a language parser, a dialog manager, and a language generator, integrated with already existing tools for speech recognition and synthesis. All modules work together and interact with the user through an Android application, helping users manage information about their prescribed medicines. The authors also present a preliminary usability study and further considerations on CARMIE. Copyright © 2017, IGI Global.
Supervised theses
2019
Author
João Miguel Almeida Beleza
Institution
UP-FEUP
2019
Author
João António Fernandes da Costa
Institution
UP-FEUP
2019
Author
Francisca Vieira de Brito
Institution
UP-FEUP
2018
Author
João Gilberto Fernandes Gonçalves Teixeira
Institution
UP-FEUP
2018
Author
Filipa dos Santos Castro Pereira
Institution
UP-FEUP