An Improved Bi-Level Thresholding Based Uncertainty Evaluation for Speech Enhancement in Non-Stationary Noises

2018 ◽  
Vol 7 (2.24) ◽  
pp. 436
Author(s):  
S Nageswara Rao ◽  
K Jaya Sankar ◽  
C D. Naidu

This paper proposes a new speech enhancement framework, based on speech presence uncertainty, to improve the quality of speech recorded in adverse acoustic environments. Since uncertainty evaluation gives a clearer discrimination between speech and noise, the paper proposes a new uncertainty evaluation mechanism as a preprocessing stage for noise suppression methods. This mechanism uses the energies of the noisy speech signal to classify speech segments and noise segments more accurately. In addition to enhancing quality, the approach reduces unnecessary computational burden on the speech processing system. Extensive simulations are carried out on speech signals corrupted by different types of non-stationary noise (babble, exhibition, restaurant and train station noise), and performance is measured with the Output SNR, AvgSegSNR, PESQ and COMP metrics. A comparative analysis shows that the proposed approach outperforms the conventional approaches in all environments.
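
As a rough illustration of energy-based speech/noise discrimination with two thresholds, the sketch below labels each frame as noise, uncertain, or speech from its short-time energy. The abstract does not state the actual decision rule, so the function names `frame_energies` and `classify_frames`, the frame length, and the threshold values are assumptions made for illustration and should not be read as the authors' method.

```python
import numpy as np

def frame_energies(signal, frame_len=256):
    """Mean energy of consecutive non-overlapping frames (trailing samples dropped)."""
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.mean(frames ** 2, axis=1)

def classify_frames(energies, low, high):
    """Two-threshold labelling: 0 = noise, 1 = uncertain, 2 = speech.

    `low` and `high` are hypothetical thresholds; the paper's values are not given.
    """
    energies = np.asarray(energies, dtype=float)
    labels = np.ones(energies.shape, dtype=int)  # default label: uncertain
    labels[energies < low] = 0                   # below the lower threshold: noise
    labels[energies >= high] = 2                 # at or above the upper threshold: speech
    return labels
```

Under this reading, a later noise suppression stage could skip frames already labelled as noise, which is one plausible way the preprocessing step might save computation.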

2021 ◽  
pp. 2150022
Author(s):  
Caio Cesar Enside de Abreu ◽  
Marco Aparecido Queiroz Duarte ◽  
Bruno Rodrigues de Oliveira ◽  
Jozue Vieira Filho ◽  
Francisco Villarreal

Speech processing systems are very important in applications involving speech and voice quality, such as automatic speech recognition, forensic phonetics and speech enhancement, among others. In most of them, acoustic environmental noise is added to the original signal, decreasing the signal-to-noise ratio (SNR) and, as a consequence, the speech quality. Therefore, noise estimation is one of the most important steps in speech processing, whether to reduce the noise before processing or to design robust algorithms. In this paper, a new approach to estimating noise from speech signals is presented and its effectiveness is tested in the speech enhancement context. For this purpose, partial least squares (PLS) regression is used to model the acoustic environment (AE), and a Wiener filter based on a priori SNR estimation is implemented to evaluate the proposed approach. Six noise types are used to create seven acoustically modeled noises. The basic idea is to use the AE model to identify the noise type and estimate its power for use in a speech processing system. Speech signals processed using the proposed method and classical noise estimators are evaluated through objective measures. Results show that the proposed method produces better speech quality than state-of-the-art noise estimators, enabling it to be used in real-time applications in the fields of robotics, telecommunications and acoustic analysis.
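
To make the role of the noise power estimate concrete, here is a minimal sketch of a Wiener gain driven by the a priori SNR. It uses the textbook gain G = ξ / (1 + ξ) and a simple power-subtraction estimate of ξ; the PLS-based noise estimator from the abstract is not reproduced, and the function names and the `noise_power` input are assumptions made for illustration.

```python
import numpy as np

def a_priori_snr_from_power(noisy_power, noise_power):
    """Simple estimate xi = max(|Y|^2 / noise_power - 1, 0) (illustrative, not the paper's)."""
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    return np.maximum(noisy_power / noise_power - 1.0, 0.0)

def wiener_gain(xi):
    """Wiener gain G = xi / (1 + xi) for a priori SNR xi on a linear scale (not dB)."""
    xi = np.maximum(np.asarray(xi, dtype=float), 0.0)
    return xi / (1.0 + xi)

def apply_wiener(noisy_spectrum, noise_power):
    """Scale each complex STFT bin by the Wiener gain; the noisy phase is kept."""
    noisy_spectrum = np.asarray(noisy_spectrum, dtype=complex)
    xi = a_priori_snr_from_power(np.abs(noisy_spectrum) ** 2, noise_power)
    return wiener_gain(xi) * noisy_spectrum
```

A better noise power estimate changes ξ, and through it the gain applied to every bin, which is why the quality of the noise estimator shows up directly in the enhanced speech.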


Author(s):  
Shifeng Ou ◽  
Peng Song ◽  
Ying Gao

The a priori signal-to-noise ratio (SNR) plays an essential role in many speech enhancement systems. Most existing approaches to estimating the a priori SNR exploit only the amplitude spectra and neglect the phase. Given that incorporating phase information into a speech processing system can significantly improve speech quality, this paper proposes a phase-sensitive decision-directed (DD) approach to a priori SNR estimation. By representing the short-time discrete Fourier transform (STFT) signal spectra geometrically in the complex plane, the proposed approach estimates the a priori SNR using both magnitude and phase information while making no assumptions about the phase difference between the clean speech and noise spectra. Objective evaluations in terms of spectrograms, segmental SNR, log-spectral distance (LSD) and short-time objective intelligibility (STOI) measures are presented to demonstrate the superiority of the proposed approach over several competitive methods under different noise conditions and input SNR levels.
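
For reference, the classical magnitude-only decision-directed estimate that such work builds on can be sketched as follows. This is the standard DD recursion, not the phase-sensitive variant proposed in the paper; the smoothing factor `alpha=0.98` is a commonly used value, and the floor `xi_min` and all function and parameter names are assumptions made for illustration.

```python
import numpy as np

def decision_directed_snr(noisy_power, noise_power, prev_clean_power,
                          alpha=0.98, xi_min=1e-3):
    """Classical (magnitude-only) decision-directed a priori SNR for one frame.

    noisy_power:      |Y(k, l)|^2 of the current frame, per frequency bin
    noise_power:      estimated noise power per bin
    prev_clean_power: |S_hat(k, l-1)|^2 of the previous enhanced frame
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    prev_clean_power = np.asarray(prev_clean_power, dtype=float)

    gamma = noisy_power / noise_power  # a posteriori SNR
    xi = (alpha * prev_clean_power / noise_power
          + (1.0 - alpha) * np.maximum(gamma - 1.0, 0.0))
    return np.maximum(xi, xi_min)      # hypothetical floor to keep xi positive
```

Only squared magnitudes enter this recursion, which is the limitation the abstract refers to when it says that existing estimators neglect the phase.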


Author(s):  
Margaret M. Kehoe ◽  
Emilie Cretton

Purpose: This study examines intraword variability in 40 typically developing French-speaking monolingual and bilingual children, aged 2;6–4;8 (years;months). Specifically, it measures the rate of intraword variability and investigates which factors best account for it. These include child-specific factors, such as age, expressive vocabulary, gender, bilingual status, and speech sound production ability, and word-specific factors, such as phonological complexity (including number of syllables), phonological neighborhood density (PND), and word frequency.
Method: A variability test was developed, consisting of 25 words that differed in phonological complexity, PND, and word frequency. Children produced three exemplars of each word during a single session, and productions of words were coded as variable or not variable. In addition, children were administered an expressive vocabulary test and two tests tapping speech motor ability (an oral motor assessment and a diadochokinetic test). Speech sound ability was also assessed by measuring percent consonants correct on all words produced by the children during the session. Data were entered into a binomial logistic regression.
Results: Average intraword variability was 29% across all children. Several factors were found to predict intraword variability, including age, gender, bilingual status, speech sound production ability, phonological complexity, and PND.
Conclusions: Intraword variability was found to be lower in French than what has been reported in English, consistent with phonological differences between French and English. Our findings support those of other investigators in indicating that the factors influencing intraword variability are multiple and reflect sources at various levels in the speech processing system.


1981 ◽  
Author(s):  
Steven F. Boll ◽  
James Kajiya ◽  
James Youngberg ◽  
Tracy L. Petersen ◽  
H. Ravindra

Author(s):  
Syed Akhter Hossain ◽  
M. Lutfar Rahman ◽  
Faruk Ahmed ◽  
M. Abdus Sobhan

The aim of this chapter is to understand the salient features of Bangla vowels and the sources of acoustic variability in them, and to suggest a classification of vowels based on normalized acoustic parameters. Possible applications in automatic speech recognition and speech enhancement have made the classification of vowels an important problem to study. However, Bangla vowels spoken by different native speakers show great variation in their formant values, and the different dialect and language backgrounds of the speakers further complicate the acoustic comparison of vowels. This variation necessitates the use of normalization procedures to remove the effect of non-linguistic factors. Although several researchers have found a number of acoustical and perceptual correlates of vowels, acoustic parameters that work well in a speaker-independent manner are yet to be found. A further problem area is the study of the acoustic features of Bangla dental consonants, in order to identify the spectral differences between consonants and to parameterize them for synthesis of the segments. The features extracted for both Bangla vowels and dental consonants are tested and found to yield good synthetic representations, which demonstrates the quality of the acoustic features.

