Negative impacts from latency masked by noise in simulated beamforming

People with hearing loss face severe challenges in perceiving speech in noisy situations such as a busy restaurant or cafe. Many factors contribute to this deficit, including decreased audibility, reduced frequency resolution, and a decline in temporal synchrony across the auditory system. Some hearing assistive devices implement beamforming, in which multiple microphones are used in combination to attenuate surrounding noise while the target speaker is left unattenuated. In increasingly challenging auditory environments, more complex beamforming algorithms are required, which increases the processing time needed to provide a useful signal-to-noise ratio (SNR) for the target speech. This study investigated whether the benefits of beamforming signal enhancement are outweighed by the negative perceptual impact of increased latency between the direct acoustic signal and the digitally enhanced signal. The hypothesis was that an increase in latency between the two otherwise identical speech signals would decrease speech intelligibility. Using 3 gain/latency pairs from a beamforming simulation previously completed in the lab, perceptual SNR thresholds for a simulated use case were obtained from normal-hearing participants. No significant differences were detected between the 3 conditions. When 2 copies of the same speech signal were presented at varying gain/latency pairs in a noisy environment, any negative intelligibility effects from latency were masked by the noise. These results allow for more lenient limits on processing delays in hearing assistive devices.

Phase Differences Based Coherence Filter for Dual-Microphone System

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.e6150.018520 ◽

2020 ◽

Vol 8 (5) ◽

pp. 1632-1634

Keyword(s):

Quality Improvement ◽

Speech Enhancement ◽

Signal To Noise Ratio ◽

Noisy Environment ◽

Complex Environment ◽

Signal To Noise ◽

Noisy Signals ◽

The Stability ◽

Target Speaker ◽

Phase Differences

This paper proposes a dual-microphone coherence filter based on the phase difference of two noisy signals. The method passes the target speaker without distortion. In a real noisy environment, it is difficult to obtain exact information about the noise field, especially in a complex environment. Therefore, a post-filter that is a function of the speech presence probability, and that is suitable for a compact dual-microphone system, is adopted in the author's approach. The measured performance of the suggested algorithm demonstrated its stability and efficiency. This approach improves speech enhancement performance in terms of signal-to-noise ratio and the amount of noise reduction. The author intends to continue using phase differences as conditions for different speech quality improvement algorithms under various noise conditions.

Research on Pitch Extraction Algorithm of Noisy Speech

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.433-440.4675 ◽

2012 ◽

Vol 433-440 ◽

pp. 4675-4678

Author(s):

Hong Yan Xing ◽

Cui Hua Yu ◽

Peng Li

Keyword(s):

Speech Signal ◽

Signal To Noise Ratio ◽

Pitch Detection ◽

Noisy Environment ◽

Pitch Period ◽

Noisy Speech ◽

Hilbert Huang Transform ◽

Extraction Algorithm ◽

Soft Threshold ◽

Pitch Extraction

Pitch detection in noisy environments plays an important role in speech analysis and recognition. Drawing on the properties of the Hilbert-Huang transform and the empirical mode decomposition (EMD) soft-threshold de-noising method, this paper proposes an effective pitch detection method for noisy speech signals. First, the EMD soft-threshold de-noising method is applied to reduce the background noise; second, the Hilbert-Huang transform is used to detect the pitch period of the de-noised speech signal. The analysis in this paper shows that, compared with conventional pitch detection methods for noisy speech, this approach has higher accuracy, especially at low signal-to-noise ratio (SNR).
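
The soft-threshold step mentioned in this abstract can be illustrated with the standard soft-thresholding operator, which shrinks every sample toward zero by a threshold and zeroes anything smaller than it. The sketch below is an assumption about the general technique, not the authors' implementation: the function name `soft_threshold`, the threshold value, and the idea of applying it to one already-extracted intrinsic mode function are all hypothetical, and the EMD and Hilbert-Huang stages are not shown.

```python
def soft_threshold(samples, threshold):
    """Shrink each sample toward zero by `threshold` (soft thresholding).

    Samples whose magnitude is at most `threshold` are set to 0.0;
    larger samples keep their sign but lose `threshold` in magnitude.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    shrunk = []
    for x in samples:
        if x > threshold:
            shrunk.append(x - threshold)
        elif x < -threshold:
            shrunk.append(x + threshold)
        else:
            shrunk.append(0.0)
    return shrunk


# Hypothetical usage: `imf` stands in for one noisy intrinsic mode function.
imf = [3.0, -0.5, -2.0, 0.2]
denoised_imf = soft_threshold(imf, 1.0)
```

In an EMD-based de-noiser, an operator like this would typically be applied per mode with a mode-specific threshold before the modes are summed back together; how the paper chooses its thresholds is not stated in the abstract.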

Singular Values Decomposition and Lifting Wavelet Transform for Speech Signal Embedding into Digital Image

Recent Advances in Electrical & Electronic Engineering (Formerly Recent Patents on Electrical & Electronic Engineering) ◽

10.2174/2352096511666180511151646 ◽

2019 ◽

Vol 12 (2) ◽

pp. 138-151

Author(s):

Mourad Talbi ◽

Med Salim Bouhlel

Keyword(s):

Wavelet Transform ◽

Speech Signal ◽

Signal To Noise Ratio ◽

Perceptual Quality ◽

Lifting Wavelet Transform ◽

Signal To Noise ◽

Perceptual Evaluation ◽

Lifting Wavelet ◽

Noise Ratio

Background: In this paper, we propose a secure image watermarking technique that is applied to grayscale and color images. It consists of applying the SVD (Singular Value Decomposition) in the Lifting Wavelet Transform domain to embed a speech image (the watermark) into the host image. Methods: It also uses a signature in the embedding and extraction steps. Its performance is assessed by computing the PSNR (Peak Signal-to-Noise Ratio), SSIM (Structural Similarity), SNR (Signal-to-Noise Ratio), SegSNR (Segmental SNR) and PESQ (Perceptual Evaluation of Speech Quality). Results: The PSNR and SSIM are used to evaluate the perceptual quality of the watermarked image compared to the original image. The SNR, SegSNR and PESQ are used to evaluate the perceptual quality of the reconstructed or extracted speech signal compared to the original speech signal. Conclusion: The results obtained from the computation of PSNR, SSIM, SNR, SegSNR and PESQ show the performance of the proposed technique.
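
Two of the metrics named in this abstract have simple closed forms, so a small sketch may help make them concrete. The code below uses the textbook definitions of PSNR (peak power over mean squared error) and SNR (reference energy over error energy); the function names, the flattened-list inputs, and the default peak of 255 (8-bit pixels) are assumptions for illustration and are not taken from the paper. SSIM, SegSNR and PESQ are not shown.

```python
import math


def mean_squared_error(reference, test):
    """Mean squared error between two equal-length sequences."""
    if len(reference) != len(test):
        raise ValueError("sequences must have the same length")
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)


def psnr_db(original_pixels, watermarked_pixels, peak=255.0):
    """PSNR in dB for flattened pixel lists (peak=255 assumes 8-bit images)."""
    err = mean_squared_error(original_pixels, watermarked_pixels)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / err)


def snr_db(original_speech, extracted_speech):
    """SNR in dB: energy of the original over energy of the extraction error."""
    if len(original_speech) != len(extracted_speech):
        raise ValueError("sequences must have the same length")
    signal_energy = sum(s ** 2 for s in original_speech)
    error_energy = sum(
        (s - e) ** 2 for s, e in zip(original_speech, extracted_speech)
    )
    if error_energy == 0:
        return math.inf  # perfect extraction
    if signal_energy == 0:
        return -math.inf  # silent reference, non-zero error
    return 10.0 * math.log10(signal_energy / error_energy)
```

A higher PSNR here would indicate that the watermark disturbs the host image less, and a higher SNR that the extracted speech is closer to the original.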

Assessing Stability of Crutch Users by Non-Contact Methods

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18063001 ◽

2021 ◽

Vol 18 (6) ◽

pp. 3001

Author(s):

Achilles Vairis ◽

Suzana Brown ◽

Maurice Bess ◽

Kyu Hyun Bae ◽

Jonathan Boyack

Keyword(s):

Signal To Noise Ratio ◽

Time Derivative ◽

Assistive Devices ◽

Signal To Noise ◽

Physically Disabled ◽

Significant Difference ◽

Novice Users ◽

Noise Ratio ◽

Physically Disabled People

Enhancing gait stability in people who use crutches is paramount for their health. Because their gait differs significantly from that of people who do not require an assistive device, using standard gait analysis tools to measure the movement of temporary crutch users and physically disabled people proves to be more challenging. In this paper, a novel approach based on video analysis is proposed as a non-contact, low-cost solution to the more expensive alternatives. From the data collected from processed videos, two values are calculated to assess the user's stability while they walk with crutches: the Signal to Noise Ratio (SNR) of acceleration, and the SNR of the jerk (the time derivative of acceleration). The adopted methodology has been tested on a total of 10 participants. Five are temporary users of assistive devices, one of them a long-term user and the other four novice users, and five are disabled participants who use those assistive devices permanently. Preliminary results show differences between novice users, long-term users, and physically disabled users. The approach is promising and could improve the assessment of crutch user stability, allowing for the correction of gait for individuals while using an inexpensive non-contact setup and preventing unnecessary falls.
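
As a rough illustration of the two quantities named in this abstract, the sketch below derives jerk from an acceleration trace by finite differences and computes an SNR for each series. The abstract does not say how the paper defines SNR, so the definition used here (squared mean over variance, in dB) is purely an assumption, as are the function names, the sample values and the 30 frames-per-second sampling interval.

```python
import math
from statistics import fmean, pvariance


def time_derivative(samples, dt):
    """Forward-difference derivative of uniformly sampled data."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    return [(b - a) / dt for a, b in zip(samples, samples[1:])]


def snr_db(samples):
    """ASSUMED definition: 10*log10(mean^2 / variance) of the series."""
    mean_power = fmean(samples) ** 2
    variance = pvariance(samples)
    if variance == 0:
        return math.inf  # perfectly steady series
    if mean_power == 0:
        return -math.inf  # zero-mean series
    return 10.0 * math.log10(mean_power / variance)


# Hypothetical acceleration samples (m/s^2) taken from video at 30 frames/s.
acceleration = [0.8, 1.1, 0.9, 1.2, 1.0, 0.7]
jerk = time_derivative(acceleration, dt=1.0 / 30.0)
acceleration_snr = snr_db(acceleration)
jerk_snr = snr_db(jerk)
```

Under this assumed definition, a steadier movement trace gives a higher SNR; whether that matches the paper's stability measure would need to be checked against the full text.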

A Novel Symbol Rate Estimation Algorithm for Phase Modulating Signals in Wireless Communications

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.443.392 ◽

2013 ◽

Vol 443 ◽

pp. 392-396

Author(s):

Peng Zhou ◽

Chi Sheng Li

Keyword(s):

Signal To Noise Ratio ◽

Estimation Algorithm ◽

Frequency Resolution ◽

Phase Shift Keying ◽

Phase Jump ◽

Rate Estimation ◽

Awgn Channel ◽

Symbol Rate ◽

Shift Keying ◽

Simulation Results

In this paper, we propose a new symbol rate estimation algorithm for phase shift keying (PSK) and quadrature amplitude modulation (QAM) signals in AWGN channels. First, we construct a delay-multiplied signal, from which we obtain the modulated information. Then we calculate the instantaneous autocorrelation of the delay-multiplied signal to pick out the phase jumps. To eliminate the frequency resolution restriction of the fast Fourier transform, we perform a Chirp-Z transform to find the exact spectral line that represents the symbol rate of the signal being analyzed. Compared with existing algorithms, it is a simple solution that has better performance and accuracy in low signal-to-noise-ratio channel conditions. Simulation results show that the probability of the relative estimation deviation being below 0.1% reaches 100%, and that the average and standard variance of the absolute estimation deviation are on the order of 10^-2 when the SNR is over 2 dB.
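
The delay-and-multiply idea in this abstract can be sketched in a simplified form. The code below is not the paper's algorithm: it skips the instantaneous autocorrelation step, replaces the Chirp-Z transform with a plain DFT (so its resolution is limited to one DFT bin), and uses a noise-free rectangular-pulse BPSK test signal. The function names, the half-symbol delay and the 8000 Hz sample rate are all hypothetical choices for the example.

```python
import cmath
import math
import random


def make_bpsk(num_symbols, samples_per_symbol, seed=1):
    """Noise-free baseband BPSK with rectangular pulses (test signal only)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_symbols):
        symbol = 1.0 if rng.getrandbits(1) else -1.0
        samples.extend([symbol] * samples_per_symbol)
    return samples


def estimate_symbol_rate(samples, sample_rate, delay):
    """Delay-and-multiply estimate: strongest non-DC line of x[n+d]*conj(x[n])."""
    product = [
        samples[n + delay] * complex(samples[n]).conjugate()
        for n in range(len(samples) - delay)
    ]
    m = len(product)
    # Precompute the DFT twiddle factors once and index them modulo m.
    twiddle = [cmath.exp(-2j * math.pi * i / m) for i in range(m)]
    best_bin, best_magnitude = 0, 0.0
    for k in range(1, m // 2):  # skip DC, search up to Nyquist
        acc = sum(product[n] * twiddle[(k * n) % m] for n in range(m))
        if abs(acc) > best_magnitude:
            best_bin, best_magnitude = k, abs(acc)
    return best_bin * sample_rate / m


# 8 samples per symbol at 8000 Hz corresponds to 1000 symbols per second.
signal = make_bpsk(129, 8)[:1028]
estimated_rate = estimate_symbol_rate(signal, sample_rate=8000.0, delay=4)
```

The product is constant wherever both samples fall within one symbol and random where they straddle a symbol boundary, so it contains a component that repeats once per symbol; that periodic component is what produces the spectral line. With this test signal the estimate should land at or very near 1000 symbols per second. A Chirp-Z transform, as in the paper, would allow the spectrum around that line to be evaluated on a finer grid than the DFT permits.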

Speaker verification in a noisy environment by enhancing the speech signal using various approaches of spectral subtraction

2016 10th International Conference on Intelligent Systems and Control (ISCO) ◽

10.1109/isco.2016.7726904 ◽

2016 ◽

Cited By ~ 1

Author(s):

B Bharathi ◽

S Kavitha ◽

K Mohana Priya

Keyword(s):

Speech Signal ◽

Speaker Verification ◽

Spectral Subtraction ◽

Noisy Environment

Enhancement of Radar Imagery by Maximum Entropy Processing

Proceedings of the Human Factors Society Annual Meeting ◽

10.1177/107118137702100315 ◽

1977 ◽

Vol 21 (3) ◽

pp. 241-243 ◽

Cited By ~ 3

Author(s):

Clanton E. Mancill

Keyword(s):

Maximum Entropy ◽

Signal To Noise Ratio ◽

Super Resolution ◽

Doppler Frequency ◽

Frequency Measurement ◽

Entropy Method ◽

Frequency Resolution ◽

Sampled Data ◽

Maximum Entropy Spectrum ◽

Using Data

The maximum entropy spectrum (MES), a sampled data power spectrum estimator, is applied to the enhancement of imagery obtained by synthetic array radar (SAR) imaging systems. MES offers better frequency resolution than conventional Fourier transform methods for certain signal classes. Since azimuth ground resolution in SAR systems is obtained by Doppler frequency measurement of the radar return, the method is capable of enhancing the resolution of SAR maps. The principal signal requirement is adequate signal-to-noise ratio. The maximum entropy method has been tested using data obtained by the Hughes FLAMR radar system. The super-resolution capabilities of the method are demonstrated using FLAMR images of corner reflector arrays.

The Use of Speech Recognition Systems to Select a Useful Signal in Noisy Speech at a Low Signal-To-Noise Ratio

10.1109/dynamics52735.2021.9653711 ◽

2021 ◽

Author(s):

Sh. R. Salimov ◽

N. A. Volkov ◽

A. V. Ivanov

Keyword(s):

Speech Recognition ◽

Signal To Noise Ratio ◽

Signal To Noise ◽

Noisy Speech ◽

Useful Signal ◽

Recognition Systems ◽

Noise Ratio

Analysis of neural networks efficiency when accounting for weight coefficients jitter, leading to reduction in output signal/noise ratio

Neurocomputers ◽

10.18127/j19998554-202102-02 ◽

2021 ◽

Author(s):

S.V. Zimina

Keyword(s):

Neural Network ◽

Neural Networks ◽

Artificial Neural Network ◽

Output Signal ◽

Signal To Noise Ratio ◽

Signal To Noise ◽

Useful Signal ◽

Artificial Neural ◽

Noise Ratio ◽

Weight Coefficients

Setting up artificial neural networks using iterative algorithms is accompanied by fluctuations in weight coefficients. When an artificial neural network solves the problem of allocating a useful signal against the background of interference, fluctuations in the weight vector lead to a deterioration of the useful signal allocated by the network and, in particular, losses in the output signal-to-noise ratio. The goal of the research is to perform a statistical analysis of an artificial neural network, that includes analysis of losses in the output signal-to-noise ratio associated with fluctuations in the weight coefficients of an artificial neural network. We considered artificial neural networks that are configured using discrete gradient, fast recurrent algorithms with restrictions, and the Hebb algorithm. It is shown that fluctuations lead to losses in the output signal/noise ratio, the level of which depends on the type of algorithm under consideration and the speed of setting up an artificial neural network. Taking into account the fluctuations of the weight vector in the analysis of the output signal-to-noise ratio allows us to correlate the permissible level of loss in the output signal-to-noise ratio and the speed of network configuration corresponding to this level when working with an artificial neural network.

Ideal ratio mask estimation using supervised DNN approach for target speech signal enhancement

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-211236 ◽

2021 ◽

pp. 1-15

Author(s):

Poovarasan Selvaraj ◽

E. Chandra

Keyword(s):

Real Time ◽

Speech Signal ◽

Signal To Noise Ratio ◽

Additive White Gaussian Noise ◽

Time Delay Estimation ◽

Variational Model ◽

Speech Signals ◽

Frequency Noise ◽

Intrinsic Mode Functions ◽

Real Time Applications

The most challenging process in recent Speech Enhancement (SE) systems is to exclude non-stationary noise and additive white Gaussian noise in real-time applications. Several previously suggested SE techniques were not successful at eliminating noise from speech signals in real-time scenarios because of their high resource utilization. So, a Sliding Window Empirical Mode Decomposition including a Variant of Variational Model Decomposition and Hurst (SWEMD-VVMDH) technique was developed to reduce the difficulty in real-time applications. However, this is a statistical framework that takes a long time to compute. Hence, in this article, the SWEMD-VVMDH technique is extended using a Deep Neural Network (DNN) that efficiently learns the speech signals decomposed via SWEMD-VVMDH to achieve SE. First, the noisy speech signals are decomposed into Intrinsic Mode Functions (IMFs) by the SWEMD Hurst (SWEMDH) technique. Then, the Time-Delay Estimation (TDE)-based VVMD is performed on the IMFs to select the most relevant IMFs according to the Hurst exponent and to lessen the low- as well as high-frequency noise elements in the speech signal. For each signal frame, the target features are chosen and fed to the DNN, which learns these features to estimate the Ideal Ratio Mask (IRM) in a supervised manner. The abilities of the DNN are enhanced for the categories of background noise and the Signal-to-Noise Ratio (SNR) of the speech signals. Also, the noise category dimension and the SNR dimension are chosen for training and testing multiple DNNs, since these are dimensions often taken into account for SE systems. Further, the IRM in each frequency channel for all noisy signal samples is concatenated to reconstruct the noiseless speech signal. Finally, the experimental outcomes exhibit considerable improvement in SE under different categories of noise.
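
The Ideal Ratio Mask that the DNN in this abstract is trained to estimate is commonly defined per time-frequency bin as (S / (S + N)) raised to a power beta, where S and N are the speech and noise powers and beta is usually 0.5. The sketch below computes that target and applies a mask to noisy magnitudes. It is a minimal illustration of the common definition, assuming the clean speech and noise powers are known, as they would be when building training targets; the function names and nested-list layout (frames by frequency channels) are hypothetical, and the paper's SWEMD-VVMDH features and DNN are not shown.

```python
def ideal_ratio_mask(speech_power, noise_power, beta=0.5):
    """IRM per time-frequency bin: (S / (S + N)) ** beta.

    `speech_power` and `noise_power` are lists of frames, each frame a
    list of per-channel powers. Bins with no energy at all get a mask of 0.
    """
    mask = []
    for speech_frame, noise_frame in zip(speech_power, noise_power):
        row = []
        for s, n in zip(speech_frame, noise_frame):
            total = s + n
            row.append((s / total) ** beta if total > 0 else 0.0)
        mask.append(row)
    return mask


def apply_mask(noisy_magnitude, mask):
    """Scale each noisy time-frequency magnitude by its mask value."""
    return [
        [magnitude * gain for magnitude, gain in zip(mag_frame, gain_frame)]
        for mag_frame, gain_frame in zip(noisy_magnitude, mask)
    ]


# Hypothetical 2-frame, 2-channel example.
speech = [[1.0, 0.0], [9.0, 4.0]]
noise = [[1.0, 2.0], [7.0, 0.0]]
target_mask = ideal_ratio_mask(speech, noise)
enhanced = apply_mask([[2.0, 2.0], [4.0, 2.0]], target_mask)
```

Mask values lie between 0 (bin dominated by noise) and 1 (bin dominated by speech), which is why a network can be trained to regress them directly and the result applied as a gain to the noisy spectrum.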
