Aníbal Ferreira

Details

Name: Aníbal Ferreira
Position: Senior Researcher
Since: 22 November 1995
Nationality: Portugal
Centre: Telecommunications and Multimedia
Contacts: +351222094299, anibal.ferreira@inesctec.pt

Publications

2025

Accurate Analysis of the Pitch Pulse-Based Magnitude/Phase Structure of Natural Vowels and Assessment of Three Lightweight Time/Frequency Voicing Restoration Methods

Authors
Ferreira, JS; Jesus, MT; Leal, LM; Spratley, JEF

Publication
Journal of Voice

Abstract
This paper addresses two challenges that are intertwined and are key in informing signal processing methods restoring natural (voiced) speech from whispered speech. The first challenge involves characterizing and modeling the evolution of the harmonic phase/magnitude structure of a sequence of individual pitch periods in a voiced region of natural speech comprising sustained or co-articulated vowels. A novel algorithm segmenting individual pitch pulses is proposed, which is then used to obtain illustrative results highlighting important differences between sustained and co-articulated vowels, and suggesting practical synthetic voicing approaches. The second challenge involves model-based synthetic voicing restoration in real-time and on-the-fly. Three implementation alternatives are described that differ in their signal reconstruction approaches: frequency-domain, combined frequency- and time-domain, and physiologically inspired filtering of glottal excitation pulses individually generated. The three alternatives are compared objectively using illustrative examples, and subjectively using the results of listening tests involving synthetic voicing of sustained and co-articulated vowels in word context. © 2025 Elsevier B.V., All rights reserved.

2025

Neural network models for whisper to normal speech conversion

Authors
Yamamura, F; Scalassara, R; Oliveira, A; Ferreira, JS

Publication
U.Porto Journal of Engineering

Abstract
Whispers are common and essential for secondary communication. Nonetheless, individuals with aphonia, including laryngectomees, rely on whispers as their primary means of communication. Due to the distinct features between whispered and regular speech, debates have emerged in the field of speech recognition, highlighting the challenge of effectively converting between them. This study investigates the characteristics of whispered speech and proposes a system for converting whispered vowels into normal ones. The system is developed using multilayer perceptron networks and two types of generative adversarial networks. Three metrics are analyzed to evaluate the performance of the system: mel-cepstral distortion, root mean square error of the fundamental frequency, and accuracy with f1-score of a vowel classifier. Overall, the perceptron networks demonstrated better results, with no significant differences observed between male and female voices or the presence/absence of speech silence, except for improved accuracy in estimating the fundamental frequency during the conversion process. © 2025, Universidade do Porto - Faculdade de Engenharia. All rights reserved.

2025

A Review of Voicing Decision in Whispered Speech: From Rules to Machine Learning

Authors
da Silva, JMPP; Duarte Nunes, G; Ferreira, A

2024

On the mismatch between the phase structure of all-pole-based synthetic vowels and natural vowels

Authors
Ferreira, A; Santos, V; Oliveira, M

Publication
2024 IEEE Workshop on Signal Processing Systems (SiPS)

Abstract
The phase response of all-pole (AP) models is known to be non-linear and highly dependent on the frequency response magnitude. The objective and perceptual impact of the group delay of AP models in the synthesis of vowel sounds has not been thoroughly addressed in the literature. In this paper, we use a dedicated frequency-domain framework to i) synthesize a plausible glottal excitation that sets the ground truth for the harmonic phase structure and replicates the fundamental frequency contour of natural vowels, ii) synthesize realistic vowel sounds through all-zero (AZ) and all-pole (AP) models sharing the same frequency response magnitude, and iii) assess the objective and perceptual impact of the group delay of AP models, taking as a reference natural vowels and, in particular, the ground-truth harmonic phase structure of the glottal excitation. Our findings emphasize that the non-linear phase characteristics of AP models degrade the harmonic phase structure of synthetic vowels significantly beyond what is found in natural vowels; however, that degradation is not always clearly audible.

2024

Attributes Associated with Consonantal Place and Voicing in Whispered Speech

Authors
Luis Jesus; Sara Castilho; Aníbal JS Ferreira; Maria Conceição Costa

Publication
ISSP 2024 - 13th International Seminar on Speech Production