Negative impacts from latency masked by noise in simulated beamforming

PLoS ONE ◽  
2021 ◽  
Vol 16 (7) ◽  
pp. e0254119
Author(s):  
Jordan A. Drew ◽  
W. Owen Brimijoin

Those experiencing hearing loss face severe challenges in perceiving speech in noisy situations such as a busy restaurant or cafe. There are many factors contributing to this deficit including decreased audibility, reduced frequency resolution, and decline in temporal synchrony across the auditory system. Some hearing assistive devices implement beamforming in which multiple microphones are used in combination to attenuate surrounding noise while the target speaker is left unattenuated. In increasingly challenging auditory environments, more complex beamforming algorithms are required, which increases the processing time needed to provide a useful signal-to-noise ratio of the target speech. This study investigated whether the benefits from signal enhancement from beamforming are outweighed by the negative impacts on perception from an increase in latency between the direct acoustic signal and the digitally enhanced signal. The hypothesis for this study is that an increase in latency between the two identical speech signals would decrease intelligibility of the speech signal. Using 3 gain / latency pairs from a beamforming simulation previously completed in lab, perceptual thresholds of SNR from a simulated use case were obtained from normal hearing participants. No significant differences were detected between the 3 conditions. When presented with 2 copies of the same speech signal presented at varying gain / latency pairs in a noisy environment, any negative intelligibility effects from latency are masked by the noise. These results allow for more lenient restrictions for limiting processing delays in hearing assistive devices.
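The listening situation described above, a direct acoustic signal combined with a delayed and amplified processed copy, can be illustrated with a minimal sketch. The function name, the linear (not dB) gain, and the integer sample delay are assumptions made for illustration; the study's actual simulation is not described in this abstract.

```python
def mix_direct_and_processed(direct, gain, delay_samples):
    """Sum a direct signal with a delayed, gain-scaled copy of itself.

    Hypothetical helper: `gain` is a linear factor and `delay_samples`
    is the processing latency expressed in whole samples.
    """
    if delay_samples < 0:
        raise ValueError("delay_samples must be non-negative")
    mixed = []
    for n, sample in enumerate(direct):
        # The processed path contributes only once the latency has elapsed.
        delayed = direct[n - delay_samples] if n >= delay_samples else 0.0
        mixed.append(sample + gain * delayed)
    return mixed


# A unit impulse makes the two arrivals easy to see.
impulse_response = mix_direct_and_processed([1.0, 0.0, 0.0, 0.0], gain=2.0, delay_samples=2)
```

With an impulse as input, the output shows the direct arrival at sample 0 and the amplified copy two samples later, which is the gain / latency pairing the abstract refers to.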

2020 ◽  
Vol 8 (5) ◽  
pp. 1632-1634

This paper proposes a dual-microphone coherence filter based on the phase difference between two noisy signals. The method passes the target speaker without distortion. In a real noisy environment, it is difficult to obtain exact information about the noise field, especially in complex environments. Therefore, a post-filter that is a function of the speech presence probability, and that is suitable for compact dual-microphone arrays, is adopted in the author's approach. The performance of the suggested algorithm demonstrated its stability and efficiency. This approach improves speech enhancement performance in terms of signal-to-noise ratio and the amount of noise reduction. The author intends to continue using phase differences as conditions for different speech quality improvement algorithms under various noise conditions.


2012 ◽  
Vol 433-440 ◽  
pp. 4675-4678
Author(s):  
Hong Yan Xing ◽  
Cui Hua Yu ◽  
Peng Li

Pitch detection in noisy environments plays an important role in speech analysis and recognition. In light of the properties of the Hilbert-Huang transform and the EMD soft-threshold de-noising method, an effective pitch detection method for noisy speech signals is proposed in this paper. First, the EMD soft-threshold de-noising method is applied to reduce the background noise; second, the Hilbert-Huang transform is used to detect the pitch period of the de-noised speech signal. The analysis presented in this paper shows that, compared with conventional pitch detection methods for noisy speech, this approach has higher accuracy, especially at low signal-to-noise ratio (SNR).


Author(s):  
Mourad Talbi ◽  
Med Salim Bouhlel

Background: In this paper, we propose a secure image watermarking technique that is applied to grayscale and color images. It consists of applying the SVD (Singular Value Decomposition) in the Lifting Wavelet Transform domain to embed a speech image (the watermark) into the host image. Methods: It also uses a signature in the embedding and extraction steps. Its performance is assessed by computing the PSNR (Peak Signal to Noise Ratio), SSIM (Structural Similarity), SNR (Signal to Noise Ratio), SegSNR (Segmental SNR) and PESQ (Perceptual Evaluation of Speech Quality). Results: The PSNR and SSIM are used for evaluating the perceptual quality of the watermarked image compared to the original image. The SNR, SegSNR and PESQ are used for evaluating the perceptual quality of the reconstructed or extracted speech signal compared to the original speech signal. Conclusion: The results obtained from the computation of PSNR, SSIM, SNR, SegSNR and PESQ show the performance of the proposed technique.
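As a point of reference for the image-quality metric named above, the standard PSNR definition, 10 · log10(MAX² / MSE), can be sketched as follows. The function name, the flattened pixel lists, and the 8-bit peak value of 255 are assumptions for illustration; the abstract does not state how the authors computed it.

```python
import math


def psnr(original, watermarked, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists.

    Assumes flattened pixel sequences and an 8-bit peak value by default.
    """
    if len(original) != len(watermarked) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the host image and the watermarked image.
    mse = sum((a - b) ** 2 for a, b in zip(original, watermarked)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no distortion to measure
    return 10.0 * math.log10(max_value ** 2 / mse)


identical = psnr([10, 20, 30], [10, 20, 30])
worst_case = psnr([0, 0], [255, 255])
```

A higher PSNR means the watermarked image is closer to the host image; identical images give an infinite value under this definition.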


Author(s):  
Achilles Vairis ◽  
Suzana Brown ◽  
Maurice Bess ◽  
Kyu Hyun Bae ◽  
Jonathan Boyack

Enhancing gait stability in people who use crutches is paramount for their health. Because their gait differs significantly from that of people who do not require an assistive device, using standard gait analysis tools to measure movement for temporary crutch users and physically disabled people proves to be more challenging. In this paper, a novel approach based on video analysis is proposed as a non-contact, low-cost solution compared with the more expensive alternatives. From the data collected from processed videos, two values are calculated: the Signal to Noise Ratio (SNR) of acceleration, and the Signal to Noise Ratio of the jerk (time derivative of acceleration), to assess the user’s stability while they walk with crutches. The adopted methodology has been tested on a total of 10 participants. Five are temporary users of assistive devices with one being a long-term user and the other four novice users, and five are disabled participants who use those assistive devices permanently. Preliminary results show differences between novice users, long-term users, and physically disabled users. The approach is promising and could improve the assessment of crutch user stability, allowing for the correction of gait for individuals while using an inexpensive non-contact setup and preventing unnecessary falls.


2013 ◽  
Vol 443 ◽  
pp. 392-396
Author(s):  
Peng Zhou ◽  
Chi Sheng Li

In this paper, we propose a new symbol rate estimation algorithm for phase shift keying (PSK) and quadrature amplitude modulation (QAM) signals in AWGN channels. First, we construct a delay-multiplied signal, from which we obtain the modulated information. Then we calculate the instantaneous autocorrelation of the delay-multiplied signal to pick out the phase jumps. To eliminate the frequency resolution restriction of the fast Fourier transform, we perform a Chirp-Z transform to find the exact spectral line that represents the symbol rate of the signal being analyzed. Compared with existing algorithms, it is a simple solution that has better performance and accuracy in low signal-to-noise-ratio channel conditions. Simulation results show that the probability of the relative estimation deviation being below 0.1% reaches 100%, and the average and standard variance of the absolute estimation deviation are on the order of 10^-2 when the SNR is over 2 dB.
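The delay-and-multiply idea can be illustrated with a simplified sketch. It is not the authors' algorithm: it assumes a noise-free, real-valued BPSK baseband signal with rectangular pulses, and it searches a plain DFT grid instead of using a Chirp-Z transform, so its resolution is limited to the bin spacing. All names and parameter values are hypothetical.

```python
import cmath
import random


def estimate_symbol_rate(samples, sample_rate, delay, n_fft):
    """Return the frequency of the strongest spectral line of x[n] * x[n - delay].

    Simplified stand-in: real-valued input and a direct DFT search
    (the abstract refines this step with a Chirp-Z transform).
    """
    # Delay-multiply: the product is constant inside a symbol and can flip
    # only within `delay` samples after a symbol boundary.
    product = [samples[n] * samples[n - delay] for n in range(delay, len(samples))]
    if len(product) < n_fft:
        raise ValueError("not enough samples for the requested DFT length")
    product = product[:n_fft]
    mean = sum(product) / n_fft
    centered = [p - mean for p in product]  # remove the DC component

    best_bin, best_mag = 0, 0.0
    for k in range(1, n_fft // 2):
        acc = sum(v * cmath.exp(-2j * cmath.pi * k * n / n_fft)
                  for n, v in enumerate(centered))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n_fft


random.seed(0)
samples_per_symbol = 8
symbols = [random.choice((-1.0, 1.0)) for _ in range(65)]
baseband = [s for s in symbols for _ in range(samples_per_symbol)]
estimate = estimate_symbol_rate(baseband, sample_rate=8000.0,
                                delay=samples_per_symbol // 2, n_fft=512)
```

In this toy setup the true rate is 8000 / 8 = 1000 symbols per second, which falls exactly on a DFT bin. A rate between bins is where a finer frequency search such as the Chirp-Z transform mentioned in the abstract would matter.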


1977 ◽  
Vol 21 (3) ◽  
pp. 241-243 ◽  
Author(s):  
Clanton E. Mancill

The maximum entropy spectrum (MES), a sampled data power spectrum estimator, is applied to the enhancement of imagery obtained by synthetic array radar (SAR) imaging systems. MES offers better frequency resolution than conventional Fourier transform methods for certain signal classes. Since azimuth ground resolution in SAR systems is obtained by Doppler frequency measurement of the radar return, the method is capable of enhancing the resolution of SAR maps. The principal signal requirement is adequate signal-to-noise ratio. The maximum entropy method has been tested using data obtained by the Hughes FLAMR radar system. The super-resolution capabilities of the method are demonstrated using FLAMR images of corner reflector arrays.


2021 ◽  
Author(s):  
S.V. Zimina

Setting up artificial neural networks using iterative algorithms is accompanied by fluctuations in the weight coefficients. When an artificial neural network solves the problem of extracting a useful signal against a background of interference, fluctuations in the weight vector lead to a deterioration of the useful signal extracted by the network and, in particular, to losses in the output signal-to-noise ratio. The goal of the research is to perform a statistical analysis of an artificial neural network that includes an analysis of the losses in the output signal-to-noise ratio associated with fluctuations in the weight coefficients. We considered artificial neural networks that are configured using discrete gradient algorithms, fast recurrent algorithms with restrictions, and the Hebb algorithm. It is shown that fluctuations lead to losses in the output signal-to-noise ratio, the level of which depends on the type of algorithm under consideration and the speed of setting up the artificial neural network. Taking the fluctuations of the weight vector into account in the analysis of the output signal-to-noise ratio allows us to relate the permissible level of loss in the output signal-to-noise ratio to the corresponding speed of network configuration.


2021 ◽  
pp. 1-15
Author(s):  
Poovarasan Selvaraj ◽  
E. Chandra

The most challenging task in recent Speech Enhancement (SE) systems is removing non-stationary noise and additive white Gaussian noise in real-time applications. Several proposed SE techniques were not successful at eliminating noise from speech signals in real-time scenarios because of their high resource utilization. So, a Sliding Window Empirical Mode Decomposition including a Variant of Variational Model Decomposition and Hurst (SWEMD-VVMDH) technique was developed to reduce this difficulty in real-time applications. However, this is a statistical framework that takes a long time to compute. Hence, in this article, the SWEMD-VVMDH technique is extended using a Deep Neural Network (DNN) that efficiently learns the speech signals decomposed via SWEMD-VVMDH to achieve SE. At first, the noisy speech signals are decomposed into Intrinsic Mode Functions (IMFs) by the SWEMD Hurst (SWEMDH) technique. Then, the Time-Delay Estimation (TDE)-based VVMD is performed on the IMFs to select the most relevant IMFs according to the Hurst exponent and to reduce the low- as well as high-frequency noise elements in the speech signal. For each signal frame, the target features are chosen and fed to the DNN, which learns these features to estimate the Ideal Ratio Mask (IRM) in a supervised manner. The abilities of the DNN are enhanced for the categories of background noise and the Signal-to-Noise Ratio (SNR) of the speech signals. Also, the noise category dimension and the SNR dimension are chosen for training and testing multiple DNNs since these dimensions are often taken into account in SE systems. Further, the IRM in each frequency channel for all noisy signal samples is concatenated to reconstruct the noiseless speech signal. Finally, the experimental outcomes exhibit considerable improvement in SE under different categories of noise.
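For readers unfamiliar with the training target mentioned above, a commonly used definition of the Ideal Ratio Mask is (S / (S + N))^β per time-frequency unit, with S and N the speech and noise powers and β often set to 0.5. The sketch below assumes that definition; the abstract does not say which variant the authors use, and the function and parameter names are hypothetical.

```python
def ideal_ratio_mask(speech_power, noise_power, beta=0.5):
    """Compute an IRM value for each time-frequency unit.

    Assumes the common definition (S / (S + N)) ** beta, where the inputs
    are non-negative power values for clean speech and noise.
    """
    if len(speech_power) != len(noise_power):
        raise ValueError("speech and noise inputs must have equal length")
    mask = []
    for s, n in zip(speech_power, noise_power):
        total = s + n
        # An empty unit (no speech, no noise) is given a mask of zero.
        mask.append((s / total) ** beta if total > 0 else 0.0)
    return mask


mask = ideal_ratio_mask([1.0, 0.0, 3.0], [1.0, 2.0, 1.0])
```

Mask values lie between 0 (noise only) and 1 (speech only); multiplying the noisy spectrum by the mask attenuates noise-dominated units.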

