Details
Name
Aníbal FerreiraCluster
Networked Intelligent SystemsRole
Affiliated ResearcherSince
22nd November 1995
Nationality
PortugalCentre
Telecommunications and MultimediaContacts
+351222094299
anibal.ferreira@inesctec.pt
2021
Authors
Silva, J; Oliveira, M; Ferreira, A;
Publication
28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020)
Abstract
Whispered-voice to normal-voice conversion is typically achieved using codec-based analysis and re-synthesis, using statistical conversion of important spectral and prosodic features, or using data-driven end-to-end signal conversion. These approaches are however highly constrained by the architecture of the codec, the statistical projection, or the size and quality of the training data. In this paper, we presume direct implantation of voiced phonemes in whispered speech and we focus on fully flexible parametric models that i) can be independently controlled, ii) synthesize natural and linguistically correct voiced phonemes, iii) preserve idiosyncratic characteristics of a given speaker, and iv) are amenable to co-articulation effects through simple model interpolation. We use natural spoken and sung vowels to illustrate these capabilities in a signal modeling and re-synthesis process where spectral magnitude, phase structure, F0 contour and sound morphing can be independently controlled in arbitrary ways.
2020
Authors
Ferreira, A; Silva, J; Brito, F; Sinha, D;
Publication
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING
Abstract
Harmonic representation models are widely used, notably in speech coding and synthesis. In this paper, we describe two fully parametric harmonic representation and signal reconstruction alternatives that rely on a shift-invariant harmonic phase model and that implement accurate frame-based synthesis in the frequency-domain, and accurate pitch pulse-based synthesis in the time-domain. We use natural spoken and sung voice signals in order to assess the objective and subjective quality of both alternatives when parameters are exact, and when they are replaced by compact and shift-invariant harmonic phase and magnitude approximation models. We highlight the flexibility of these models and present results indicating that not only does the compact shift-invariant phase model cause a smaller impact than that caused by harmonic magnitude modeling, but it also compares favorably to results presented in the literature. © 2020 IEEE
2020
Authors
Silva, JP; Oliveira, MA; Cardoso, CF; Ferreira, AJ;
Publication
IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation
Abstract
In this paper, we present a computationally efficient and fully parametric harmonic speech model that is suitable for real-time flexible frame-based analysis and synthesis implementation in the frequency domain. We carry out a performance comparison between this vocoder and similar ones, such as WORLD and HPMD. Then, a deliberate manipulation of the speaker's fundamental frequency micro-variations is performed in order to understand in which way it conveys prosodic and idiosyncratic information. We conclude our discussion by evaluating the impact of these manipulations through the realization of perceptual tests. © 2020 IEEE.
2019
Authors
Ferreira, AJ;
Publication
2019 AES INTERNATIONAL CONFERENCE ON AUDIO FORENSICS
Abstract
Automatic speaker identification typically relies on sophisticated statistical modeling and classification which requires large amounts of data for good performance. However, in actual audio forensics casework, frequently only a few seconds of speech material are available. In this paper, we favor diversity in feature extraction, simple modeling and classification, and constructive combination of congruent classification scores. We use phase, spectral magnitude and F0-related features in speaker identification experiments on a database of 35 speakers most of whom are twins. Using only 4.4 sec. of vowel-like sounds per speaker, we characterize the performance that is reached with individual features and we characterize simple and yet effective ways of classification score fusion. Insights for further research are also presented.
2018
Authors
Vaz Freitas, S; Pestana, PM; Almeida, V; Ferreira, A;
Publication
BIOMEDICAL SIGNAL PROCESSING AND CONTROL
Abstract
Objectives: To describe the results of the acoustic analysis of a database of 90 voice samples with distinct dysphonia levels, using four different - commercial and open source - software programs. Study design: Exploratory, transversal. Methods: The samples were analyzed by four different types of software programs that perform acoustical evaluation - one open source software (Praat) and three commercial ones (Multi Dimensional Voice Program - MDVP by Kay Elemetrics; VoiceStudio by Seegnal; and Dr. Speech by Tiger Electronics) - for comparison among the most commonly used acoustic measures (frequency, perturbation and noise measures). Results: There is a moderate to strong,correlation, positive and statistically significant among the software programs. The mean FO is not statistically different among the used applications. The other acoustic measures revealed statistically significant differences. Conclusion: Even though it is easier to access software programs and there are numerous proposals for acoustic measures, not all of them are statistically representative nor have numeric semblance among the different applications.
Supervised Thesis
2021
Author
Sandro Augusto Costa Magalhães
Institution
UP-FEUP
2021
Author
MIGUEL ALEXANDRE BORGES DA SILVA
Institution
IPP-ISEP
2021
Author
João Pedro Vasques Vieira da Silva
Institution
UP-FEUP
2021
Author
João Filipe Torres Costa
Institution
UP-FEUP
2021
Author
Sílvia da Conceição Neto Bessa
Institution
UP-FCUP
The access to the final selection minute is only available to applicants.
Please check the confirmation e-mail of your application to obtain the access code.