magnitude spectrum
Recently Published Documents

TOTAL DOCUMENTS: 51 (five years: 10)
H-INDEX: 7 (five years: 0)

Author(s):  
Devulapalli Shyam Prasad ◽  
Srinivasa Rao Chanamallu ◽  
Kodati Satya Prasad

The electroencephalogram (EEG) records the electrical field continuously produced by the brain. In this paper, first- and second-order derivatives of the magnitude response functions are proposed for EEG signal enhancement. Using this approach, the random noise present in EEG signals can be reduced. A simulation model is described that mixes random noise of varying frequency and magnitude with EEG signals and then removes the noise using the first- and second-order derivative filtering approach. The model can be used as a tool to estimate and remove random noise, as well as artifacts of multiple origins, from EEG signals. This work also presents the resulting magnitude spectrum and compares it with the Fourier transform (FT) magnitude spectrum. The filter characteristics are evaluated on the basis of parameters such as Root Mean Square Error (RMSE), SNR, PSNR, Mean Absolute Error (MAE) and Normalized Correlation Coefficient (NCC), and a good improvement is reported.
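The simulation setup described in this abstract (mixing random noise with an EEG signal and evaluating the result with RMSE and SNR) can be sketched as follows. The surrogate EEG, sampling rate and noise level are illustrative assumptions, and the paper's exact derivative-based filter is not specified in the abstract, so only the magnitude-spectrum derivatives and the evaluation metrics are shown.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 256                       # sampling rate in Hz (assumed)
t = np.arange(0, 2, 1 / fs)

# Surrogate "clean EEG": a mix of alpha (10 Hz) and theta (6 Hz) rhythms.
clean = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 6 * t)
noisy = clean + 0.8 * rng.standard_normal(t.size)   # add random noise

# Magnitude spectrum and its first/second-order discrete derivatives.
mag = np.abs(np.fft.rfft(noisy))
d1 = np.diff(mag, n=1)
d2 = np.diff(mag, n=2)

# Two of the evaluation metrics named in the abstract.
def rmse(x, y):
    return np.sqrt(np.mean((x - y) ** 2))

def snr_db(ref, est):
    return 10 * np.log10(np.sum(ref ** 2) / np.sum((ref - est) ** 2))
```

PSNR, MAE and NCC follow the same pattern: each compares the denoised estimate against the known clean signal of the simulation.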


2021 ◽  
Author(s):  
Aaron Nicolson ◽  
Kuldip K. Paliwal

Estimation of the clean speech short-time magnitude spectrum (MS) is key for speech enhancement and separation. Moreover, an automatic speech recognition (ASR) system that employs a front-end relies on clean speech MS estimation to remain robust. Training targets for deep learning approaches to clean speech MS estimation fall into three categories: computational auditory scene analysis (CASA), MS, and minimum mean-square error (MMSE) estimator training targets. The choice of training target can have a significant impact on speech enhancement/separation and robust ASR performance. Motivated by this, we find which training target produces enhanced/separated speech at the highest quality and intelligibility, and which is best for an ASR front-end. Three different deep neural network (DNN) types and two datasets that include real-world non-stationary and coloured noise sources at multiple SNR levels were used for evaluation. Ten objective measures were employed, including the word error rate (WER) of the Deep Speech ASR system. We find that training targets that estimate the a priori signal-to-noise ratio (SNR) for MMSE estimators produce the highest objective quality scores. Moreover, we find that the gain of MMSE estimators and the ideal amplitude mask (IAM) produce the highest objective intelligibility scores and are most suitable for an ASR front-end.
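The training targets compared in this abstract can be made concrete with a toy frame of spectra. This is a sketch under stated assumptions: the magnitudes are random toy values, magnitude additivity is a simplification, and the Wiener gain shown is only one common MMSE-style gain derived from the a priori SNR.

```python
import numpy as np

rng = np.random.default_rng(1)
S = np.abs(rng.standard_normal(257))   # clean speech magnitude (toy frame)
D = np.abs(rng.standard_normal(257))   # noise magnitude (toy frame)
X = S + D                              # noisy magnitude (additivity assumed)

# A priori SNR: the training target reported to give the best quality.
xi = S**2 / np.maximum(D**2, 1e-12)

# Wiener gain derived from xi -- one common MMSE-style gain function.
gain = xi / (1.0 + xi)

# Ideal amplitude mask (IAM): ratio of clean to noisy magnitude.
iam = S / np.maximum(X, 1e-12)

S_hat = gain * X                       # enhanced magnitude estimate
```

At training time a DNN regresses onto the chosen target (xi, the gain, or the mask); at inference the prediction scales the noisy magnitude as in the last line.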


2020 ◽  
Author(s):  
Aaron Nicolson ◽  
Kuldip K. Paliwal

The estimation of the clean speech short-time magnitude spectrum (MS) is key for speech enhancement and separation. Moreover, an automatic speech recognition (ASR) system that employs a front-end relies on clean speech MS estimation to remain robust. Training targets for deep learning approaches to clean speech MS estimation fall into three main categories: computational auditory scene analysis (CASA), MS, and minimum mean-square error (MMSE) training targets. In this study, we aim to determine which training target produces enhanced/separated speech at the highest quality and intelligibility, and which is most suitable as a front-end for robust ASR. The training targets were evaluated using a temporal convolutional network (TCN) on the DEMAND Voice Bank and Deep Xi datasets, which include real-world non-stationary and coloured noise sources at multiple SNR levels. Seven objective measures were used, including the word error rate (WER) of the Deep Speech ASR system. We find that MMSE training targets produce the highest objective quality scores. We also find that CASA training targets, in particular the ideal ratio mask (IRM), produce the highest intelligibility scores and perform best as a front-end for robust ASR.
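The ideal ratio mask (IRM) highlighted in this abstract is straightforward to compute when, as during training, the clean and noise power spectra are both available. A minimal sketch with toy values, using the common square-root Wiener form of the IRM:

```python
import numpy as np

rng = np.random.default_rng(2)
S2 = rng.random(257) + 1e-6    # clean speech power spectrum (toy frame)
D2 = rng.random(257) + 1e-6    # noise power spectrum (toy frame)

# Square-root Wiener form of the IRM: values lie between 0 and 1.
irm = np.sqrt(S2 / (S2 + D2))

# At inference, the network's predicted mask scales the noisy magnitude
# to give the separated speech estimate.
```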



Sensors ◽  
2020 ◽  
Vol 20 (8) ◽  
pp. 2392
Author(s):  
Claudia Ochoa-Diaz ◽  
Antônio Padilha L. Bó

The calculation of symmetry in amputee gait is a valuable tool to assess the functional aspects of lower limb prostheses and how they impact overall gait mechanics. This paper analyzes the vertical trajectory of the body center of mass (CoM) in a group of transfemoral amputees and non-amputees to quantitatively compare the symmetry level of this parameter for both cases. Each subject's vertical CoM trajectory is decomposed into discrete Fourier series (DFS) components to identify the main components of each pattern. A DFS-based index is then calculated to quantify the CoM symmetry level. The results show that the CoM displays a different pattern along the gait cycle for each amputee, which differs from the sine-wave shape obtained in the non-amputee case. The CoM magnitude spectrum also reveals more significant coefficients for the amputee waveforms. The different CoM trajectories found in the studied subjects can be thought of as the manifestation of developed compensatory mechanisms, which lead to gait asymmetries. The presence of odd components in the magnitude spectrum is related to the asymmetric behavior of the CoM trajectory, given that this signal is an even function for a non-amputee gait. The DFS-based index reflects this fact: the non-amputee reference yields a high value, in comparison to the low values obtained for each amputee.
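The link between odd DFS components and gait asymmetry can be sketched with a toy CoM trajectory. The abstract does not give the exact index definition, so the even-to-total harmonic energy ratio below is an assumption chosen to illustrate the idea: a symmetric stride (two equal steps) puts its energy in even harmonics of the stride frequency, and odd harmonics appear only when the two steps differ.

```python
import numpy as np

# One stride of a toy vertical CoM trajectory, sampled over a gait cycle.
N = 100
phase = np.linspace(0, 2 * np.pi, N, endpoint=False)
com = 0.02 * np.cos(2 * phase)            # symmetric double-bump pattern
com_asym = com + 0.005 * np.sin(phase)    # add an odd (asymmetric) component

def dfs_index(x):
    """Illustrative symmetry index: even-harmonic share of harmonic energy.

    This exact formula is an assumption; the paper's index is only
    described qualitatively in the abstract.
    """
    c = np.fft.rfft(x)[1:]                # DFS coefficients, DC term dropped
    power = np.abs(c) ** 2
    even = power[1::2].sum()              # harmonics 2, 4, ... (c[0] is k=1)
    return even / power.sum()
```

For the symmetric pattern the index is close to 1; the added odd component pulls it down, mirroring the low values the paper reports for amputees.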


2019 ◽  
Vol 8 (3) ◽  
pp. 3509-3516

The primary aim of this paper is to examine the application of a binary mask to improve intelligibility in highly unfavorable conditions, where hearing-impaired and normal-hearing listeners find it difficult to understand what is being said. Most existing noise reduction algorithms are known to improve speech quality, but they hardly improve speech intelligibility. The approach proposed by Gibak Kim and Philipos C. Loizou uses the Wiener gain function to improve speech intelligibility. In this paper, we apply the same approach in the magnitude spectrum using the parametric Wiener filter in order to study its effect on overall speech intelligibility. Subjective and objective tests were conducted to evaluate the performance of the enhanced speech for various types of noise. The results clearly indicate an improvement in average segmental signal-to-noise ratio for speech corrupted at -5 dB, 0 dB, 5 dB and 10 dB SNR for random noise, babble noise, car noise and helicopter noise. This technique can be used in real-time applications such as mobile phones, hearing aids and speech-activated machines.
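The two ingredients of this evaluation, a parametric Wiener gain applied to the magnitude spectrum and the average segmental SNR measure, can be sketched as below. The parameterisation (alpha, beta) and the frame length are assumptions; the abstract does not state the exact values used.

```python
import numpy as np

def parametric_wiener_gain(xi, alpha=1.0, beta=1.0):
    """G = (xi / (alpha + xi)) ** beta, with xi the a priori SNR.

    alpha = beta = 1 recovers the standard Wiener gain; the parameters
    here are illustrative assumptions.
    """
    return (xi / (alpha + xi)) ** beta

def segmental_snr_db(clean, enhanced, frame=256):
    """Average segmental SNR in dB, the objective measure reported."""
    vals = []
    for i in range(0, len(clean) - frame, frame):
        c = clean[i:i + frame]
        e = enhanced[i:i + frame]
        err = np.sum((c - e) ** 2) + 1e-12
        vals.append(10 * np.log10(np.sum(c ** 2) / err + 1e-12))
    return float(np.mean(vals))
```

The gain is applied per frequency bin to the noisy magnitude spectrum; increasing beta makes the suppression more aggressive at low SNR bins.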

