Details

  • Name

    João Pereira Silva
  • Role

    Researcher
  • Since

    01 June 2017
Publications

2023

One-Step Discrete Fourier Transform-Based Sinusoid Frequency Estimation under Full-Bandwidth Quasi-Harmonic Interference

Authors
Silva, JM; Oliveira, MA; Saraiva, AF; Ferreira, AJS

Publication
Acoustics

Abstract
The estimation of the frequency of sinusoids has been the object of intense research for more than 40 years. Its importance in classical fields such as telecommunications, instrumentation, and medicine has been extended to numerous specific signal processing applications involving, for example, speech, audio, and music processing. In many cases, these applications run in real time and thus require accurate, fast, and low-complexity algorithms. Taking the normalized Cramér-Rao lower bound as a reference, this paper evaluates the relative performance of nine non-iterative discrete Fourier transform (DFT)-based individual sinusoid frequency estimators when the target sinusoid is affected by full-bandwidth quasi-harmonic interference, in addition to stationary noise. Three levels of quasi-harmonic interference severity are considered: no harmonic interference, mild harmonic interference, and strong harmonic interference. Moreover, the harmonic interference is amplitude-modulated and frequency-modulated, reflecting real-world conditions such as those found in singing and musical chords. Results are presented for signal-to-noise ratios (SNR) between -10 dB and 70 dB. They reveal that the relative performance of the frequency estimators depends on the SNR and on the selectivity and leakage of the window used, and that it also changes drastically with the severity of the quasi-harmonic interference. In particular, when this interference is strong, the performance curves of the majority of the tested frequency estimators collapse to a few trends around and above 0.4% of the DFT bin width.
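
As an illustration of what a non-iterative DFT-based estimator does, the sketch below refines the DFT peak location by fitting a parabola through the log-magnitudes of the peak bin and its two neighbours. This is a generic textbook interpolator chosen for illustration; it is an assumption that it resembles any of the nine estimators evaluated in the paper, and the frame length, test frequency, and function names are hypothetical. No noise or quasi-harmonic interference is modelled.

```python
import cmath
import math


def hann(n_samples):
    """Periodic Hann window of length n_samples."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_samples)
            for n in range(n_samples)]


def dft_magnitudes(frame):
    """Magnitudes of the first N/2 DFT bins (naive O(N^2) DFT)."""
    n_samples = len(frame)
    mags = []
    for k in range(n_samples // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_samples)
                  for n, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def estimate_frequency_bins(frame):
    """Return the sinusoid frequency in (fractional) DFT bins."""
    window = hann(len(frame))
    mags = dft_magnitudes([x * w for x, w in zip(frame, window)])
    # Search interior bins only, so both neighbours of the peak exist.
    peak = max(range(1, len(mags) - 1), key=lambda k: mags[k])
    # Small floor avoids log(0) when a neighbour is exactly zero.
    left, centre, right = (math.log(mags[peak + o] + 1e-12)
                           for o in (-1, 0, 1))
    # Vertex of the parabola through the three log-magnitudes.
    offset = 0.5 * (left - right) / (left - 2 * centre + right)
    return peak + offset


# Hypothetical test signal: a clean sinusoid at 20.3 bins.
N = 256
TRUE_BIN = 20.3
signal = [math.cos(2 * math.pi * TRUE_BIN * n / N + 0.7) for n in range(N)]
estimate = estimate_frequency_bins(signal)
print(f"true = {TRUE_BIN} bins, estimate = {estimate:.3f} bins")
```

With a Hann window this interpolator is biased, so even for a clean sinusoid the estimate is only accurate to a few hundredths of a bin. That residual error is the kind of quantity the paper compares against the Cramér-Rao bound.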

2023

Time-Series Pattern Verification in CNC Machining Data

Authors
Silva, JM; Nogueira, AR; Pinto, J; Alves, AC; Sousa, R

Publication
Progress in Artificial Intelligence - 22nd EPIA Conference on Artificial Intelligence, EPIA 2023, Faial Island, Azores, September 5-8, 2023, Proceedings, Part I

Abstract
Effective quality control is essential for efficient and successful manufacturing processes in the era of Industry 4.0. Artificial Intelligence solutions are increasingly employed to enhance the accuracy and efficiency of quality control methods. In Computer Numerical Control (CNC) machining, one challenge is identifying and verifying specific patterns of interest or trends in a time-series dataset, which is difficult because the data are highly diverse. Therefore, this work aims to develop a methodology capable of verifying the presence of a specific pattern of interest in a given collection of time-series. This study mainly focuses on evaluating One-Class Classification techniques using Linear Frequency Cepstral Coefficients to describe the patterns in the time-series. A real-world dataset produced by turning machines was used, where a time-series with a certain pattern needed to be verified to monitor the wear offset. The initial findings reveal that the classifiers can accurately distinguish between the time-series’ target pattern and the remaining data. Specifically, the One-Class Support Vector Machine achieves a classification accuracy of 95.6 % ± 1.2 and an F1-score of 95.4 % ± 1.3.
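
The general idea of describing a time-series with Linear Frequency Cepstral Coefficients (LFCC) and then making a one-class decision can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the filterbank size, frame length, and synthetic sinusoidal "patterns" are hypothetical, and a simple distance-to-centroid threshold stands in for the One-Class Support Vector Machine named in the abstract.

```python
import cmath
import math


def power_spectrum(frame):
    """One-sided DFT power spectrum (naive O(N^2) DFT)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))) ** 2 / n
            for k in range(n // 2 + 1)]


def linear_filterbank(n_filters, n_bins):
    """Triangular filters whose centres are spaced linearly in frequency."""
    edges = [i * (n_bins - 1) / (n_filters + 1) for i in range(n_filters + 2)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        weights = []
        for k in range(n_bins):
            if lo < k <= mid:
                weights.append((k - lo) / (mid - lo))
            elif mid < k < hi:
                weights.append((hi - k) / (hi - mid))
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank


def lfcc(frame, n_filters=12, n_coeffs=6):
    """LFCC: log linear-filterbank energies followed by a DCT-II."""
    spec = power_spectrum(frame)
    bank = linear_filterbank(n_filters, len(spec))
    log_e = [math.log(sum(w * p for w, p in zip(f, spec)) + 1e-12)
             for f in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_e))
            for c in range(n_coeffs)]


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def make_frame(freq_bins, amplitude, n=128):
    """Hypothetical stand-in for a machining time-series segment."""
    return [amplitude * math.sin(2 * math.pi * freq_bins * i / n)
            for i in range(n)]


# "Train" on examples of the target pattern only (one-class setting).
train = [lfcc(make_frame(5.3, a)) for a in (0.9, 0.95, 1.0, 1.05, 1.1)]
centroid = [sum(col) / len(train) for col in zip(*train)]
# Assumed acceptance rule: 1.5x the largest training distance.
threshold = 1.5 * max(distance(f, centroid) for f in train)


def matches_target(frame):
    return distance(lfcc(frame), centroid) <= threshold


same_pattern = matches_target(make_frame(5.3, 1.02))   # unseen amplitude
other_pattern = matches_target(make_frame(25.3, 1.0))  # different pattern
print(same_pattern, other_pattern)
```

Only examples of the target pattern are used to set the decision boundary, which is what distinguishes one-class classification from ordinary two-class training. A One-Class SVM replaces the centroid-and-radius rule with a learned boundary in feature space.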

2023

Analysis and Re-Synthesis of Natural Cricket Sounds Assessing the Perceptual Relevance of Idiosyncratic Parameters

Authors
Oliveira, M; Almeida, V; Silva, J; Ferreira, A

Publication
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Abstract
Cricket sounds are usually regarded as pleasant and, thus, can be used as suitable test signals in psychoacoustic experiments assessing the human listening acuity to specific temporal and spectral features. In addition, the simple structure of cricket sounds makes them prone to reverse engineering such that they can be analyzed and re-synthesized with desired alterations in their defining parameters. This paper describes cricket sounds from a parametric point of view, characterizes their main temporal and spectral features, namely jitter, shimmer and frequency sweeps, and explains a re-synthesis process generating modified natural cricket sounds. These are subsequently used in listening tests helping to shed light on the sound identification and discrimination capabilities of humans that are important, for example, in voice recognition. © 2023 IEEE.

2021

Flexible parametric implantation of voicing in whispered speech under scarce training data

Authors
Silva, J; Oliveira, M; Ferreira, A

Publication
28th European Signal Processing Conference (EUSIPCO 2020)

Abstract
Whispered-voice to normal-voice conversion is typically achieved using codec-based analysis and re-synthesis, using statistical conversion of important spectral and prosodic features, or using data-driven end-to-end signal conversion. These approaches are, however, highly constrained by the architecture of the codec, the statistical projection, or the size and quality of the training data. In this paper, we presume direct implantation of voiced phonemes in whispered speech and we focus on fully flexible parametric models that i) can be independently controlled, ii) synthesize natural and linguistically correct voiced phonemes, iii) preserve idiosyncratic characteristics of a given speaker, and iv) are amenable to co-articulation effects through simple model interpolation. We use natural spoken and sung vowels to illustrate these capabilities in a signal modeling and re-synthesis process where spectral magnitude, phase structure, F0 contour and sound morphing can be independently controlled in arbitrary ways.

2020

Impact of a Shift-Invariant Harmonic Phase Model in Fully Parametric Harmonic Voice Representation and Time/Frequency Synthesis

Authors
Ferreira, A; Silva, J; Brito, F; Sinha, D

Publication
2020 IEEE International Conference on Acoustics, Speech, and Signal Processing

Abstract
Harmonic representation models are widely used, notably in speech coding and synthesis. In this paper, we describe two fully parametric harmonic representation and signal reconstruction alternatives that rely on a shift-invariant harmonic phase model and that implement accurate frame-based synthesis in the frequency-domain, and accurate pitch pulse-based synthesis in the time-domain. We use natural spoken and sung voice signals in order to assess the objective and subjective quality of both alternatives when parameters are exact, and when they are replaced by compact and shift-invariant harmonic phase and magnitude approximation models. We highlight the flexibility of these models and present results indicating that not only does the compact shift-invariant phase model cause a smaller impact than that caused by harmonic magnitude modeling, but it also compares favorably to results presented in the literature.