Energy Based Dual-Microphone Electronic Speech Segregation

2013 ◽  
Vol 385-386 ◽  
pp. 1381-1384
Author(s):  
Yi Jiang ◽  
Hong Zhou ◽  
Yuan Yuan Zu ◽  
Xiao Chen

Energy-based speech segregation performs well in dual-microphone electronic speech signal processing. Applying a binary mask to an auditory mixture has been shown to yield substantial improvements in signal-to-noise ratio (SNR) and intelligibility. To evaluate the performance of a binary-mask-based dual-microphone speech enhancement algorithm, various spatial noise sources and reverberation test conditions are used. Two comparison dual-microphone systems, one based on energy difference and one on machine learning, are evaluated under the same conditions. Results for SNR and speech intelligibility show that the proposed algorithm achieves more robust performance than the two comparison systems.
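As a rough illustration of how a binary mask can be derived from the energy difference between two microphones, here is a minimal sketch. It is not the authors' algorithm: the decision rule (keep a time-frequency unit when the primary microphone is louder than the secondary by more than a threshold), the function names, and the 0 dB default threshold are all assumptions made for illustration.

```python
import math


def energy_difference_mask(primary_energy, secondary_energy,
                           threshold_db=0.0, eps=1e-12):
    """Build a binary mask from per-unit energies of two microphones.

    Both inputs are lists of rows (time frames) of non-negative energies,
    one value per frequency channel. A unit is kept (1) when the primary
    microphone exceeds the secondary by more than threshold_db.
    The rule and the threshold are illustrative assumptions.
    """
    mask = []
    for row_p, row_s in zip(primary_energy, secondary_energy):
        mask_row = []
        for e_p, e_s in zip(row_p, row_s):
            # Inter-microphone energy difference in dB; eps avoids log(0).
            diff_db = 10.0 * math.log10((e_p + eps) / (e_s + eps))
            mask_row.append(1 if diff_db > threshold_db else 0)
        mask.append(mask_row)
    return mask


def apply_mask(mixture, mask):
    """Zero out the time-frequency units of the mixture rejected by the mask."""
    return [[x * m for x, m in zip(row_x, row_m)]
            for row_x, row_m in zip(mixture, mask)]
```

Units where the two energies are equal fall exactly on a 0 dB threshold and are rejected in this sketch; a real system would tune the threshold to the microphone geometry and noise conditions.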

2019 ◽  
Vol 8 (3) ◽  
pp. 3509-3516

The primary aim of this paper is to examine the application of a binary mask to improve intelligibility in highly unfavorable conditions where hearing-impaired and normal-hearing listeners find it difficult to understand what is being said. Most existing noise reduction algorithms are known to improve speech quality but hardly improve speech intelligibility. The approach proposed by Gibak Kim and Philipos C. Loizou uses the Wiener gain function to improve speech intelligibility. In this paper we propose applying the same approach to the magnitude spectrum using the parametric Wiener filter in order to study its effects on overall speech intelligibility. Subjective and objective tests were conducted to evaluate the performance of the enhanced speech for various types of noise. The results indicate an improvement in average segmental signal-to-noise ratio for speech corrupted at -5 dB, 0 dB, 5 dB and 10 dB SNR by random noise, babble noise, car noise and helicopter noise. This technique can be used in real-time applications such as mobile devices, hearing aids and speech-activated machines.
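The abstract does not state the exact gain rule used, so the following is only a sketch of one common textbook form of the parametric Wiener gain, G = (ξ / (ξ + α))^β, where ξ is the a priori SNR and α, β are tuning parameters (α = β = 1 gives the classic Wiener gain). The function names and the crude spectral-subtraction SNR estimate are assumptions for illustration, not the paper's method.

```python
def parametric_wiener_gain(snr_prior, alpha=1.0, beta=1.0):
    """Parametric Wiener gain G = (xi / (xi + alpha)) ** beta.

    snr_prior is the linear (not dB) a priori SNR, xi >= 0.
    alpha and beta are illustrative tuning parameters.
    """
    if snr_prior < 0.0:
        raise ValueError("snr_prior must be non-negative")
    return (snr_prior / (snr_prior + alpha)) ** beta


def enhance_magnitudes(noisy_mag, noise_power, alpha=1.0, beta=1.0):
    """Apply the gain to each bin of one magnitude-spectrum frame.

    noisy_mag: magnitudes of the noisy speech, one per frequency bin.
    noise_power: estimated noise power per bin (must be > 0).
    The SNR estimate below (power subtraction, floored at zero) is a
    simplifying assumption; practical systems use smoother estimators.
    """
    enhanced = []
    for mag, n_pow in zip(noisy_mag, noise_power):
        snr = max(mag * mag - n_pow, 0.0) / n_pow
        enhanced.append(parametric_wiener_gain(snr, alpha, beta) * mag)
    return enhanced
```

Raising α or β attenuates low-SNR bins more aggressively, trading residual noise against speech distortion.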


2021 ◽  
Author(s):  
Tom Gajecki ◽  
Waldo Nogueira

Cochlear implant (CI) users struggle to understand speech in noisy conditions. In this work, we propose an end-to-end speech coding and denoising sound coding strategy that estimates the electrodograms from the raw audio captured by the microphone. We compared this approach to a classic Wiener filter and TasNet to assess its potential benefits in the context of electric hearing. The performance of the network is assessed by means of noise reduction performance (signal-to-noise ratio improvement) and objective speech intelligibility measures. Furthermore, speech intelligibility was measured in 5 CI users to assess the potential benefits of each of the investigated algorithms. Results suggest that speech performance in the tested group was equally good with our method and with the front-end speech enhancement algorithm.


Galaxies ◽  
2021 ◽  
Vol 9 (1) ◽  
pp. 14
Author(s):  
Tomohiro Ishikawa ◽  
Shoki Iwaguchi ◽  
Yuta Michimura ◽  
Masaki Ando ◽  
Rika Yamada ◽  
...  

The DECi-hertz Interferometer Gravitational-wave Observatory (DECIGO) is a future Japanese space-based gravitational-wave detector. We previously set the default design parameters to provide a good target sensitivity for detecting primordial gravitational waves (GWs). However, the updated upper limit on primordial GWs from the Planck observations motivated us to further optimize the target sensitivity. Previously, we had not considered optical diffraction loss due to the very long cavity length. In this paper, we optimize various DECIGO parameters by maximizing the signal-to-noise ratio (SNR) of the primordial GWs to quantum noise, including the effects of diffraction loss. We evaluated the power spectral density for one cluster in DECIGO using the quantum noise of one differential Fabry–Perot interferometer. Then we calculated the SNR by correlating two clusters in the same position. We performed the optimization for two cases: the constant mirror-thickness case and the constant mirror-mass case. As a result, we obtained the SNR dependence on the mirror radius, which also determines various DECIGO parameters. This result is the first step toward optimizing the DECIGO design by considering the practical constraints on the mirror dimensions and implementing other noise sources.


2020 ◽  
Vol 24 ◽  
pp. 233121652097034
Author(s):  
Florian Langner ◽  
Andreas Büchner ◽  
Waldo Nogueira

Cochlear implant (CI) sound processing typically uses a front-end automatic gain control (AGC), reducing the acoustic dynamic range (DR) to control the output level and protect the signal processing against large amplitude changes. It can also introduce distortions into the signal and does not allow a direct mapping between acoustic input and electric output. For speech in noise, a reduction in DR can result in lower speech intelligibility due to compressed modulations of speech. This study proposes to implement a CI signal processing scheme consisting of a full acoustic DR with adaptive properties to improve the signal-to-noise ratio and overall speech intelligibility. Measurements based on the Short-Time Objective Intelligibility measure and an electrodogram analysis, as well as behavioral tests in up to 10 CI users, were used to compare performance with a single-channel, dual-loop, front-end AGC and with an adaptive back-end multiband dynamic compensation system (Voice Guard [VG]). Speech intelligibility in quiet and at a +10 dB signal-to-noise ratio was assessed with the Hochmair–Schulz–Moser sentence test. A logatome discrimination task with different consonants was performed in quiet. Speech intelligibility was significantly higher in quiet for VG than for AGC, but intelligibility was similar in noise. Participants obtained significantly better scores with VG than AGC in the logatome discrimination task. The objective measurements predicted significantly better performance estimates for VG. Overall, a dynamic compensation system can outperform a single-stage compression (AGC + linear compression) for speech perception in quiet.


2021 ◽  
Author(s):  
Remi Dallmayr ◽  
Johannes Freitag ◽  
Maria Hörhold ◽  
Thomas Laepple ◽  
Johannes Lemburg ◽  
...  

The validity of any glaciological paleo proxy used to interpret climate records is based on the level of understanding of its transfer from the atmosphere into the ice sheet and its recording in the snowpack. Large spatial noise in snow properties is observed, as the wind constantly redistributes the deposited snow at the surface, routed by the local topography. To increase the signal-to-noise ratio and obtain a representative estimate of snow properties with respect to the high spatial variability, a large number of snow profiles is needed. However, the classical way of obtaining profiles via snow pits is time- and energy-consuming, and thus unfavourable for large surface sampling programs. In response, we present a dual-tube technique to sample the upper metre of the snowpack at a variable depth resolution with high efficiency. The developed device is robust and avoids contact with the samples by using two tubes attached alongside each other in order to (1) contain the snow core sample and (2) access the bottom of the sample, respectively. We demonstrate the performance of the technique through two case studies in East Antarctica where we analysed the variability of water isotopes at 100 m and 5 km spatial scales.


2012 ◽  
Vol 241-244 ◽  
pp. 2491-2495 ◽  
Author(s):  
Antonio Boscolo ◽  
Francesca Vatta ◽  
Francesco Armani ◽  
Emanuele Viviani ◽  
Daniele Salvalaggio

This paper presents a physical channel emulator solution for applications such as Bit Error Rate Testing of Error Correcting Codes. The solution relies on an analog White Gaussian Noise Generator coupled additively with an analog data signal to emulate the communication channel. This is interfaced to a computer through a USB connection, allowing the use of programs in different environments, such as Matlab and Labview. This solution can allow different types of channels to be emulated and with different noise sources. A software-based method to measure Signal to Noise Ratio and to characterize the channel is also presented. The system has been validated using a Matlab interface implementing multiple error correcting codes and showed good agreement with the theoretical model.
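The emulator described above is analog hardware, but the underlying check (measured bit error rate against the theoretical additive white Gaussian noise model) can be sketched in software. The example below assumes uncoded BPSK with unit-energy symbols, for which the theoretical BER is 0.5·erfc(√(Eb/N0)); the modulation, the function names, and the parameter values are assumptions for illustration and are not taken from the paper.

```python
import math
import random


def theoretical_bpsk_ber(ebn0_db):
    """Theoretical BER of uncoded BPSK over an AWGN channel."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)  # dB to linear
    return 0.5 * math.erfc(math.sqrt(ebn0))


def simulate_bpsk_ber(ebn0_db, n_bits=100_000, seed=0):
    """Monte Carlo BER estimate: add Gaussian noise to +/-1 symbols.

    With unit symbol energy, the noise variance is N0/2 = 1 / (2 * Eb/N0).
    The seed makes the run repeatable; n_bits controls the accuracy.
    """
    rng = random.Random(seed)
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    sigma = math.sqrt(1.0 / (2.0 * ebn0))
    errors = 0
    for _ in range(n_bits):
        bit = rng.getrandbits(1)
        symbol = 1.0 if bit else -1.0
        received = symbol + rng.gauss(0.0, sigma)  # additive noise
        decided = 1 if received > 0.0 else 0       # hard decision
        errors += decided != bit
    return errors / n_bits
```

Comparing `simulate_bpsk_ber(4.0)` with `theoretical_bpsk_ber(4.0)` mirrors, in miniature, the validation against a theoretical model that the abstract reports; with an error correcting code in the loop, the encoder and decoder would wrap the channel step.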


2018 ◽  
Vol 23 (1) ◽  
pp. 32-38 ◽  
Author(s):  
Jantien L. Vroegop ◽  
Nienke C. Homans ◽  
André Goedegebure ◽  
J. Gertjan Dingemanse ◽  
Teun van Immerzeel ◽  
...  

Although the benefit of bimodal listening in cochlear implant users has been agreed on, speech comprehension remains a challenge in acoustically complex real-life environments due to reverberation and disturbing background noises. One way to additionally improve bimodal auditory performance is the use of directional microphones. The objective of this study was to investigate the effect of a binaural beamformer for bimodal cochlear implant (CI) users. This prospective study measured speech reception thresholds (SRT) in noise in a repeated-measures design that varied in listening modality for static and dynamic listening conditions. A significant improvement in SRT of 4.7 dB was found with the binaural beamformer switched on in the bimodal static listening condition. No significant improvement was found in the dynamic listening condition. We conclude that there is a clear additional advantage of the binaural beamformer in bimodal CI users for predictable/static listening conditions with frontal target speech and spatially separated noise sources.


2020 ◽  
Vol 24 ◽  
pp. 233121652097563
Author(s):  
Christopher F. Hauth ◽  
Simon C. Berning ◽  
Birger Kollmeier ◽  
Thomas Brand

The equalization cancellation model is often used to predict the binaural masking level difference. Previously its application to speech in noise has required separate knowledge about the speech and noise signals to maximize the signal-to-noise ratio (SNR). Here, a novel, blind equalization cancellation model is introduced that can use the mixed signals. This approach does not require any assumptions about particular sound source directions. It uses different strategies for positive and negative SNRs, with the switching between the two steered by a blind decision stage utilizing modulation cues. The output of the model is a single-channel signal with enhanced SNR, which we analyzed using the speech intelligibility index to compare speech intelligibility predictions. In a first experiment, the model was tested on experimental data obtained in a scenario with spatially separated target and masker signals. Predicted speech recognition thresholds were in good agreement with measured speech recognition thresholds with a root mean square error less than 1 dB. A second experiment investigated signals at positive SNRs, which was achieved using time compressed and low-pass filtered speech. The results demonstrated that binaural unmasking of speech occurs at positive SNRs and that the modulation-based switching strategy can predict the experimental results.


1980 ◽  
Vol 89 (5_suppl) ◽  
pp. 79-83
Author(s):  
Richard Lippmann

Following the Harvard master hearing aid study in 1947 there was little research on linear amplification. Recently, however, there have been a number of studies designed to determine the relationship between the frequency-gain characteristic of a hearing aid and speech intelligibility for persons with sensorineural hearing loss. These studies have demonstrated that a frequency-gain characteristic that rises at a rate of 6 dB/octave, as suggested by the Harvard study, is not optimal. They have also demonstrated that high-frequency emphasis of 10–40 dB above 500–1000 Hz is beneficial. Most importantly, they have demonstrated that hearing aids as they are presently being fit do not provide maximum speech intelligibility. Percent word correct scores obtained with the best frequency-gain characteristics tested in various studies have been found to be 9 to 19 percentage points higher than scores obtained with commercial aids owned by subjects. This increase in scores is equivalent to an increase in signal-to-noise ratio of 10 to 20 dB. This is a significant increase which could allow impaired listeners to communicate in many situations where they presently cannot. These results demonstrate the need for further research on linear amplification aimed at developing practical suggestions for fitting hearing aids.

