Fundamental Frequency Extraction of Noisy Speech Signals

2015 ◽  
Vol 43 ◽  
pp. 51-61
Author(s):  
Mirza A.F.M. Rashidul Hasan ◽  
Rubaiyat Yasmin ◽  
Dipankar Das ◽  
M. M. Hoque ◽  
M. I. Pramanik ◽  
...  

In this paper, we propose a correlation-based method for accurate fundamental frequency extraction in which the autocorrelation function is weighted by the reciprocal of YIN. Both the autocorrelation function and YIN are popular time-domain measures for estimating fundamental frequency. In the proposed method, the autocorrelation function is computed from the center-clipped signal rather than the original signal, and this function is then weighted by the reciprocal of YIN for fundamental frequency detection. Comparative results on female and male voices in white and exhibition noise show that the proposed method detects the fundamental frequency more accurately, in terms of gross pitch errors, than other related methods.
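The weighting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the clipping ratio, the search range, the small constant `eps`, and the choice to compute the YIN difference function on the unclipped frame are all assumptions made for illustration, and the function names are hypothetical.

```python
import math

def center_clip(x, ratio=0.3):
    """Zero out samples within +/- ratio*max|x| and shift the rest toward zero."""
    c = ratio * max(abs(v) for v in x)
    return [v - c if v > c else v + c if v < -c else 0.0 for v in x]

def autocorr(x, lag):
    """Unnormalized autocorrelation of x at the given lag."""
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))

def yin_difference(x, lag):
    """Squared-difference function used by YIN at the given lag."""
    return sum((x[n] - x[n + lag]) ** 2 for n in range(len(x) - lag))

def estimate_f0(frame, fs, fmin=60.0, fmax=400.0, eps=1e-6):
    """Pick the lag that maximizes ACF(clipped) / (YIN difference + eps)."""
    clipped = center_clip(frame)
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), len(frame) - 1)
    best_lag, best_score = lag_min, -math.inf
    for lag in range(lag_min, lag_max + 1):
        # Assumption: the YIN term is taken from the original (unclipped) frame.
        score = autocorr(clipped, lag) / (yin_difference(frame, lag) + eps)
        if score > best_score:
            best_lag, best_score = lag, score
    return fs / best_lag

# Usage on a synthetic 200 Hz tone with one overtone, sampled at 8 kHz.
fs = 8000
frame = [math.sin(2 * math.pi * 200 * n / fs)
         + 0.5 * math.sin(2 * math.pi * 400 * n / fs) for n in range(400)]
print(estimate_f0(frame, fs))
```

Dividing by the difference function sharpens the peak at the true period, because the difference is close to zero there while the autocorrelation is large.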

2021 ◽  
Vol 2021 ◽  
pp. 1-17
Author(s):  
Cevahir Parlak ◽  
Yusuf Altun

In this article, a novel pitch determination algorithm based on the harmonic differences method (HDM) is proposed. Most current algorithms rely on autocorrelation, cepstrum, or, more recently, convolutional neural networks, and they suffer from limitations (small datasets, wideband or narrowband, musical sounds, temporal smoothing, etc.) as well as accuracy and speed problems. Very few works exploit the spacing between the harmonics. HDM is designed for both wideband and exclusively narrowband (telephone) speech and tries to find the most frequently repeating difference between the harmonics of the speech signal. We use three vowel databases in our experiments: the Hillenbrand Vowel Database, the Texas Vowel Database, and vowels from the TIMIT corpus. We compare HDM with autocorrelation, cepstrum, YIN, YAAPT, CREPE, and FCN algorithms. The results show that harmonic differences are a reliable and fast choice for robust pitch detection, and that HDM is superior to the other algorithms in most cases.
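The core idea of taking the most frequently repeating spacing between harmonics can be sketched as follows. This is a simplified illustration, not the published HDM algorithm: the plain DFT, the relative peak threshold, and the function names are assumptions, and a real implementation would need windowing and tolerance for inexact peak positions.

```python
import cmath
import math
from collections import Counter

def magnitude_spectrum(x):
    """Magnitude of the DFT of x for bins 0 .. len(x)//2 - 1 (direct O(N^2) form)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2)]

def harmonic_difference_f0(x, fs, rel_threshold=0.1):
    """Estimate F0 as the most common spacing between adjacent spectral peaks."""
    mag = magnitude_spectrum(x)
    floor = rel_threshold * max(mag)
    # A peak is a local maximum that rises above the relative threshold.
    peaks = [k for k in range(1, len(mag) - 1)
             if mag[k] > floor and mag[k] > mag[k - 1] and mag[k] >= mag[k + 1]]
    if len(peaks) < 2:
        return None  # not enough harmonics to form a difference
    diffs = [b - a for a, b in zip(peaks, peaks[1:])]
    spacing, _ = Counter(diffs).most_common(1)[0]
    return spacing * fs / len(x)  # convert bin spacing to Hz

# Usage: harmonics of 200 Hz with the fundamental itself missing,
# loosely mimicking narrowband (telephone) speech.
fs, n = 8000, 400
partials = [(1.0, 600), (0.8, 800), (0.6, 1000), (0.5, 1200)]
x = [sum(a * math.sin(2 * math.pi * f * t / fs) for a, f in partials) for t in range(n)]
print(harmonic_difference_f0(x, fs))
```

Because the estimate depends only on the spacing between peaks, it still works when the fundamental is absent from the band, which is the situation the abstract highlights for telephone speech.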


ALQALAM ◽  
2015 ◽  
Vol 32 (2) ◽  
pp. 284
Author(s):  
Muhammad Subali ◽  
Miftah Andriansyah ◽  
Christanto Sinambela

This article examines the similarities and differences in fundamental frequency and formant frequencies, estimated with the autocorrelation function and the LPC function in a MATLAB 2012b GUI, for the sounds of hijaiyah letters produced by beginner and expert adult male speakers according to makhraj pronunciation. The matching distance between the two groups' sounds is then analyzed using the DTW method on the cepstrum. The beginner subjects were college students of Universitas Gunadarma and SITC, aged 22 years, and their speech was recorded using a MATLAB algorithm in the GUI. The expert speech was taken from previous research; those speakers were 20-30 years old at the time of data collection. The fundamental frequency and formant frequencies are extracted from each sound, the two sets of values are compared between beginner and expert speech, and the matching distance between the two is computed. The results show that beginner and expert speech have different fundamental frequency and formant frequency values in all cases. The DTW analysis gave matching distances between beginner and expert speech in the range of 28.9746 to 136.4. Keywords: fundamental frequency, formant frequency, hijaiyah letters, makhraj
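For readers unfamiliar with the matching distance mentioned here, the following is a minimal sketch of classic dynamic time warping between two sequences of feature vectors (for example, per-frame cepstral coefficients). The Euclidean local cost, the three-way step pattern, and the function name are assumptions; the abstract does not state which DTW variant or cepstral features were used, so the distances it reports are not reproducible from this sketch.

```python
import math

def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of equal-dimension vectors."""
    rows, cols = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(cost[i - 1][j],      # a advances alone
                                     cost[i][j - 1],      # b advances alone
                                     cost[i - 1][j - 1])  # both advance
    return cost[rows][cols]

# Usage: the second sequence is a time-stretched copy of the first,
# so the warping path can align them at no cost.
fast = [[0.0], [1.0], [2.0]]
slow = [[0.0], [0.0], [1.0], [2.0], [2.0]]
print(dtw_distance(fast, slow))
```

The warping step is what makes the measure suitable for comparing speakers who pronounce the same sound at different speeds.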


2011 ◽  
Vol 383-390 ◽  
pp. 4962-4966
Author(s):  
Ling Li ◽  
Guo Bin Jin ◽  
Shao Ping Huang ◽  
Xiao Peng

A novel frequency measurement method based on improved TLS-ESPRIT (total least squares estimation of signal parameters via rotational invariance techniques) is proposed in this paper for fundamental frequency measurement in power systems. TLS-ESPRIT belongs to the family of subspace estimation methods in modern signal processing. Because noise is included in the signal model, the method is insensitive to noise. However, identical multiple poles cannot be obtained with TLS-ESPRIT when the signal is noisy. Multiple-pole restoration is presented to recover the true poles accurately. Simulation results show that the fundamental frequency is detected accurately in the presence of harmonics, interharmonics, noise, and frequency fluctuations, with better noise immunity and, in particular, better adaptability to signals with time-varying amplitude.
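As background for this abstract, here is a minimal sketch of standard TLS-ESPRIT frequency estimation using NumPy. It does not include the multiple-pole restoration that the paper proposes; the Hankel matrix size, the model order `n_poles`, and the function name are assumptions chosen for illustration.

```python
import numpy as np

def tls_esprit_frequencies(x, fs, n_poles=2, rows=None):
    """Estimate positive frequencies (Hz) of sinusoids in x with TLS-ESPRIT."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    rows = rows or n // 2
    # Hankel data matrix: row i holds the samples x[i], x[i+1], ...
    hankel = np.array([x[i:i + n - rows + 1] for i in range(rows)])
    # Signal subspace: leading left singular vectors.
    u, _, _ = np.linalg.svd(hankel, full_matrices=False)
    us = u[:, :n_poles]
    u1, u2 = us[:-1], us[1:]  # shift-invariant pair of sub-blocks
    # Total least squares solution of u1 @ psi ~= u2.
    _, _, vh = np.linalg.svd(np.hstack([u1, u2]))
    v = vh.conj().T
    v12 = v[:n_poles, n_poles:]
    v22 = v[n_poles:, n_poles:]
    psi = -v12 @ np.linalg.inv(v22)
    # Eigenvalues of psi are the signal poles exp(j*2*pi*f/fs).
    poles = np.linalg.eigvals(psi)
    freqs = np.angle(poles) * fs / (2 * np.pi)
    return sorted(float(f) for f in freqs if f > 0)

# Usage: a slightly off-nominal 50.2 Hz tone sampled at 1 kHz.
fs = 1000
t = np.arange(200) / fs
x = np.cos(2 * np.pi * 50.2 * t + 0.3)
print(tls_esprit_frequencies(x, fs))
```

A real sinusoid contributes a conjugate pair of poles, which is why `n_poles=2` is used for a single tone; each additional harmonic or interharmonic would add two more.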


Revista CEFAC ◽  
2019 ◽  
Vol 21 (6) ◽  
Author(s):  
Flávia Viegas ◽  
Danieli Viegas ◽  
Glaucio Serra Guimarães ◽  
Margareth Maria Gomes de Souza ◽  
Ronir Raggio Luiz ◽  
...  

ABSTRACT Purpose: to compare the measurements of fundamental frequency (F0) and of the frequencies of the first two formants (F1 and F2) of the seven oral vowels of Brazilian Portuguese in two speech tasks, in adults without voice and speech disorders. Methods: eighty participants aged between 18 and 40 years, paired by gender, were selected after orofacial, orthodontic and auditory-perceptual assessments of voice and speech. The speech signals were obtained from carrier phrases and sustained vowels, and the values of F0 and the frequencies of F1 and F2 were estimated. The differences were verified through the t-test, and the effect size was calculated. Results: differences were found in the F0 measurements between the two speech tasks in two vowels in males and in five vowels in females. In the F1 frequencies, differences were noted in six vowels in men and in two in women. In the F2 frequencies, there was a difference in four vowels in men and in three in women. Conclusion: based on the differences found, it is concluded that the speech task used to evaluate fundamental frequency and formant frequencies in Brazilian Portuguese can yield distinct results in both glottal and supraglottal measures in the production of different oral vowels of this language. Thus, it is suggested that clinicians and researchers consider both forms of emission for a more accurate interpretation of the implications of these data in the evaluation of oral communication and therapeutic conduct.

