Phase Differences Based Coherence Filter for Dual-Microphone System

2020 ◽  
Vol 8 (5) ◽  
pp. 1632-1634

This paper proposes a dual-microphone coherence filter based on the phase difference between two noisy signals. The method passes the target speaker without distortion. In a real noisy environment, it is difficult to obtain exact information about the noise field, especially in complex environments. Therefore, a post-filter that is a function of the speech presence probability is suitable for a compact dual-microphone system and is adapted in the author's approach. The measured performance of the suggested algorithm demonstrated its stability and efficiency. This approach improves speech enhancement performance in terms of signal-to-noise ratio and the amount of noise reduction. The author intends to continue using phase differences as conditions for different speech quality improvement algorithms under various noise conditions.
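To make the idea of a phase-difference-driven gain concrete, here is a minimal sketch. It is not the author's filter: the raised-cosine gain rule, the `sharpness` parameter, and the function names are assumptions chosen for illustration, and the sketch assumes the target speaker arrives at both microphones in phase (broadside), with no speech presence probability post-filter.

```python
import numpy as np

def phase_difference_gain(x1, x2, sharpness=2.0):
    """Per-frequency gain in [0, 1] derived from the inter-microphone phase difference.

    Assumption: the target is in phase at both microphones, so bins with a
    phase difference near zero are kept and bins near +/- pi are suppressed.
    """
    X1 = np.fft.rfft(x1)
    X2 = np.fft.rfft(x2)
    # Phase of the cross-spectrum equals the phase difference between channels.
    phase_diff = np.angle(X1 * np.conj(X2))
    # Raised-cosine mapping (hypothetical choice): 1 at zero difference, 0 at pi.
    return (0.5 * (1.0 + np.cos(phase_diff))) ** sharpness

def apply_phase_filter(x1, x2, sharpness=2.0):
    """Average the two channels and weight each frequency bin by the gain."""
    gain = phase_difference_gain(x1, x2, sharpness)
    Y = 0.5 * (np.fft.rfft(x1) + np.fft.rfft(x2)) * gain
    return np.fft.irfft(Y, n=len(x1))
```

A practical system would apply this frame by frame on short overlapping windows rather than on the whole signal at once.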

2020 ◽  
Author(s):  
Chaofeng Lan ◽  
Yuanyuan Zhang ◽  
Hongyun Zhao

This paper draws on the training method of the Recurrent Neural Network (RNN). By increasing the number of hidden layers of the RNN, changing the activation function on the input layer from the traditional Sigmoid to Leaky ReLU, and zero-padding the first and last groups of data to make more effective use of the data, an improved Denoise Recurrent Neural Network (DRNN) model with high calculation speed and good convergence is constructed to address the low speaker recognition rate in noisy environments. Using this model, random semantic speech signals from the speech library, with a sampling rate of 16 kHz and a duration of 5 seconds, are studied. The experimental signal-to-noise ratios are −10 dB, −5 dB, 0 dB, 5 dB, 10 dB, 15 dB, 20 dB, and 25 dB. In the noisy environment, the improved model is used to denoise the Mel Frequency Cepstral Coefficients (MFCC) and the Gammatone Frequency Cepstral Coefficients (GFCC), and the impact of the traditional model and the improved model on the speech recognition rate is analyzed. The research shows that the improved model can effectively eliminate the noise in the feature parameters and improve the speech recognition rate. When the signal-to-noise ratio is low, the improvement in speaker recognition rate is more pronounced. Furthermore, when the signal-to-noise ratio is 0 dB, the speaker recognition rate is increased by 40%, which can be an 85% improvement compared with the traditional speech model. As the signal-to-noise ratio increases, the recognition rate gradually increases. When the signal-to-noise ratio is 15 dB, the speaker recognition rate is 93%.
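The activation change mentioned in the abstract can be illustrated with a small sketch. The `negative_slope` value of 0.01 is an assumption (the abstract does not state the slope used), and the function names are illustrative only. The SNR list is taken from the abstract.

```python
import numpy as np

# Signal-to-noise ratios (dB) listed in the abstract.
SNR_LEVELS_DB = [-10, -5, 0, 5, 10, 15, 20, 25]

def sigmoid(x):
    """Traditional sigmoid: squashes inputs into (0, 1) and saturates for large |x|."""
    return 1.0 / (1.0 + np.exp(-x))

def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: identity for x >= 0, small linear slope for x < 0.

    negative_slope=0.01 is an assumed default, not a value from the paper.
    """
    return np.where(x >= 0, x, negative_slope * x)
```

Unlike the sigmoid, Leaky ReLU does not saturate for large positive inputs and keeps a nonzero gradient for negative inputs, which is a commonly cited reason for faster convergence.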


2008 ◽  
Vol 20 (5) ◽  
pp. 488-501 ◽  
Author(s):  
Yu‐Cheng Lee ◽  
Tieh‐Min Yen ◽  
Chih‐Hung Tsai

PLoS ONE ◽  
2021 ◽  
Vol 16 (7) ◽  
pp. e0254119
Author(s):  
Jordan A. Drew ◽  
W. Owen Brimijoin

Those experiencing hearing loss face severe challenges in perceiving speech in noisy situations such as a busy restaurant or cafe. There are many factors contributing to this deficit including decreased audibility, reduced frequency resolution, and decline in temporal synchrony across the auditory system. Some hearing assistive devices implement beamforming in which multiple microphones are used in combination to attenuate surrounding noise while the target speaker is left unattenuated. In increasingly challenging auditory environments, more complex beamforming algorithms are required, which increases the processing time needed to provide a useful signal-to-noise ratio of the target speech. This study investigated whether the benefits from signal enhancement from beamforming are outweighed by the negative impacts on perception from an increase in latency between the direct acoustic signal and the digitally enhanced signal. The hypothesis for this study is that an increase in latency between the two identical speech signals would decrease intelligibility of the speech signal. Using 3 gain / latency pairs from a beamforming simulation previously completed in lab, perceptual thresholds of SNR from a simulated use case were obtained from normal hearing participants. No significant differences were detected between the 3 conditions. When presented with 2 copies of the same speech signal presented at varying gain / latency pairs in a noisy environment, any negative intelligibility effects from latency are masked by the noise. These results allow for more lenient restrictions for limiting processing delays in hearing assistive devices.
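The listening condition described above, a direct acoustic signal combined with a delayed, amplified copy of itself, can be sketched as follows. The function name and the gain and latency values in the usage comment are hypothetical; the study's actual gain / latency pairs are not given in the abstract.

```python
import numpy as np

def direct_plus_processed(signal, gain_db, latency_samples):
    """Sum a direct signal with a delayed copy scaled by gain_db decibels.

    The output is latency_samples longer than the input so the delayed
    copy is not truncated.
    """
    pad = np.zeros(latency_samples)
    direct = np.concatenate([signal, pad])       # direct acoustic path
    processed = np.concatenate([pad, signal])    # delayed, enhanced path
    return direct + 10 ** (gain_db / 20) * processed

# Hypothetical usage: 6 dB of gain with 5 ms latency at 16 kHz (80 samples).
# mixed = direct_plus_processed(speech, gain_db=6.0, latency_samples=80)
```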


2021 ◽  
Author(s):  
Willem Kleijn ◽  
RC Hendriks

We introduce a model of communication that includes noise inherent in the message production process as well as noise inherent in the message interpretation process. The production and interpretation noise processes have a fixed signal-to-noise ratio. The resulting system is a simple but effective model of human communication. The model naturally leads to a method to enhance the intelligibility of speech rendered in a noisy environment. State-of-the-art experimental results confirm the practical value of the model. © 2015 IEEE.


2018 ◽  
Vol 7 (2.7) ◽  
pp. 5
Author(s):  
V Gopi Tilak ◽  
S Koteswara Rao

Maintaining good quality and intelligibility of speech is the primary constraint in mobile communications. The present work addresses speech enhancement in additive white and colored noise environments using a Kalman filter. Dual and joint estimation techniques were applied, and the quality of speech was analyzed through the signal-to-noise ratio. The techniques were applied in both ideal and practical cases for two different speech samples.
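As a simplified illustration of Kalman filtering in additive white noise, the sketch below filters a scalar first-order autoregressive signal with known parameters. This is an assumption-laden toy: the paper's dual and joint estimation techniques also estimate the model parameters, which this sketch does not attempt, and the function and parameter names are hypothetical.

```python
import numpy as np

def kalman_ar1(observations, a, process_var, noise_var):
    """Scalar Kalman filter for x[k] = a*x[k-1] + w[k], y[k] = x[k] + v[k].

    a, process_var (variance of w) and noise_var (variance of v) are assumed
    known here; a real speech enhancer would have to estimate them.
    """
    x_est, p_est = 0.0, 1.0              # initial state estimate and its variance
    out = np.empty(len(observations))
    for k, y in enumerate(observations):
        # Predict step: propagate the estimate through the AR(1) model.
        x_pred = a * x_est
        p_pred = a * a * p_est + process_var
        # Update step: blend prediction and observation using the Kalman gain.
        gain = p_pred / (p_pred + noise_var)
        x_est = x_pred + gain * (y - x_pred)
        p_est = (1.0 - gain) * p_pred
        out[k] = x_est
    return out
```

Speech is usually modeled with a higher-order autoregressive process, in which case the state becomes a vector and the scalar operations above become matrix operations.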


2017 ◽  
Vol 2017 ◽  
pp. 1-9 ◽  
Author(s):  
Rodmonga Potapova ◽  
Maria Grigorieva

This paper discusses the results of pilot experimental research on speech recognition and perception of the semantic content of utterances in a noisy environment. The experiment included a comparative perceptual-auditory analysis of words and phrases in Russian and German in the same noisy environment: various types of noise (pink and white) at various signal-to-noise ratios. The statistical analysis showed that intelligibility and perception of speech in a noisy environment are influenced not only by the noise type and its signal-to-noise ratio, but also by linguistic and extralinguistic factors, such as the existing redundancy of a particular language at various levels of linguistic structure, changes in the acoustic characteristics of the speaker while switching from one language to another, the level of the speaker's and listener's proficiency in a specific language, and acoustic characteristics of the speaker's voice.
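Stimuli of this kind, speech mixed with white or pink noise at a chosen signal-to-noise ratio, can be generated along the following lines. This is a sketch under assumptions: the spectral-shaping method for pink noise and the function names are illustrative, and the abstract does not describe how the authors produced their noise.

```python
import numpy as np

def pink_noise(n, rng):
    """Approximate pink (1/f power) noise by shaping white noise in the frequency domain."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n)
    freqs[0] = freqs[1]                  # avoid division by zero at DC
    spectrum /= np.sqrt(freqs)           # amplitude ~ 1/sqrt(f) gives power ~ 1/f
    return np.fft.irfft(spectrum, n=n)

def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so that speech power / noise power equals snr_db, then add it."""
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise
```

White noise for the same comparison can be drawn directly with `rng.standard_normal(n)`, where `rng = np.random.default_rng()`.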


2020 ◽  
Vol 75 (1) ◽  
pp. 57-69
Author(s):  
Abigail Waldron ◽  
Ashley Allen ◽  
Arelis Colón ◽  
J. Chance Carter ◽  
S. Michael Angel

A monolithic spatial heterodyne Raman spectrometer (mSHRS) is described, where the optical components of the spectrometer are bonded to make a small, stable, one-piece structure. This builds on previous work, where we described bench top spatial heterodyne Raman spectrometers (SHRS), developed for planetary spacecraft and rovers. The SHRS is based on a fixed grating spatial heterodyne spectrometer (SHS) that offers high spectral resolution and high light throughput in a small footprint. The resolution of the SHS is not dependent on a slit, and high resolution can be realized without using long focal length dispersing optics since it is not a dispersive device. Thus, the SHS can be used as a component in a compact Raman spectrometer with high spectral resolution and a large spectral range using a standard 1024 element charge-coupled device. Since the resolution of the SHRS is not dependent on a long optical path, it is amenable to the use of monolithic construction techniques to make a compact and robust device. In this paper, we describe the use of two different monolithic SHSs (mSHSs), with Littrow wavelengths of 531.6 nm and 541.05 nm, each about 3.5 × 3.5 × 2.5 cm in size and weighing about 80 g, in a Raman spectrometer that provides ∼3500 cm−1 spectral range with 4–5 cm−1 and 8–9 cm−1 resolution, for 600 grooves/mm and 150 grooves/mm grating-based mSHS devices, respectively. In this proof of concept paper, the stability, spectral resolution, spectral range, and signal-to-noise ratio of the mSHRS spectrometers are compared to our bench top SHRS that uses free-standing optics, and signal to noise comparisons are also made to a Kaiser Holospec f/1.8 Raman spectrometer.

