RETRACTED: Speech enhancement method using deep learning approach for hearing-impaired listeners

Abstract: In real environments, speech is easily corrupted by external interference, which results in the loss of important features. Deep learning has become a popular approach to speech enhancement because of its strong potential for solving nonlinear mapping problems over complex features. However, traditional deep learning methods are weak at learning important information from previous time steps and long-term dependencies in time-series data. To overcome this problem, we propose a novel speech enhancement method based on the fused features of deep neural networks (DNNs) and a gated recurrent unit (GRU). The proposed method uses the GRU to reduce the number of DNN parameters and to acquire the context information of the speech, which improves the quality and intelligibility of the enhanced speech. Firstly, a DNN with multiple hidden layers is used to learn the mapping between the logarithmic power spectrum (LPS) features of noisy speech and clean speech. Secondly, the LPS features produced by the DNN are fused with those of the noisy speech and used as the input of the GRU network to compensate for the missing context information. Finally, the GRU network learns the mapping between the fused LPS features and the LPS features of clean speech. The proposed model is experimentally compared with traditional speech enhancement models, including DNN, CNN, LSTM and GRU. Experimental results demonstrate that, under matched noise conditions, the PESQ, SSNR and STOI of the proposed algorithm are improved by 30.72%, 39.84% and 5.53%, respectively, compared with the noisy signal. Under unmatched noise conditions, the PESQ and STOI of the algorithm are improved by 23.8% and 37.36%, respectively. The advantage of the proposed method is that it uses the key information in the features to suppress noise in both matched and unmatched noise cases, and it outperforms other common speech enhancement methods.
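As an illustration of the features the abstract refers to, the following is a minimal sketch of how LPS features might be computed and fused. It is not the authors' implementation: the frame length, hop size, Hann window, and the choice of concatenation as the fusion step are all assumptions, and the function names `log_power_spectrum` and `fuse_features` are hypothetical.

```python
import numpy as np


def log_power_spectrum(signal, frame_len=512, hop=256, eps=1e-10):
    """Return LPS features with shape (num_frames, frame_len // 2 + 1).

    frame_len, hop and the Hann window are assumed values, not taken
    from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        # Zero-pad short signals so that at least one frame exists.
        signal = np.pad(signal, (0, frame_len - len(signal)))
    window = np.hanning(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(num_frames)
    ])
    # One-sided spectrum of each windowed frame.
    spectrum = np.fft.rfft(frames, axis=1)
    power = np.abs(spectrum) ** 2
    # eps avoids log(0) for silent frames.
    return np.log(power + eps)


def fuse_features(dnn_lps, noisy_lps):
    """Concatenate two LPS matrices along the feature axis.

    Concatenation is an assumed fusion strategy; the abstract only
    states that the features are fused.
    """
    return np.concatenate([dnn_lps, noisy_lps], axis=1)
```

With these assumed settings, a one-second signal sampled at 16 kHz yields 61 frames of 257 frequency bins each, and fusing two such matrices gives a 514-dimensional input per frame for a recurrent model.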

DeepLPC: A Deep Learning Approach to Augmented Kalman Filter-Based Single-Channel Speech Enhancement

IEEE Access ◽

10.1109/access.2021.3075209 ◽

2021 ◽

Vol 9 ◽

pp. 64524-64538

Author(s):

Sujan Kumar Roy ◽

Aaron Nicolson ◽

Kuldip K. Paliwal

Keyword(s):

Deep Learning ◽

Kalman Filter ◽

Speech Enhancement ◽

Single Channel ◽

Learning Approach

Adaptive Single-Channel Speech Enhancement Method for a Push-To-Talk Enabled Wireless Communication Device

IEICE Transactions on Communications ◽

10.1587/transcom.2015ccp0023 ◽

2016 ◽

Vol E99.B (8) ◽

pp. 1745-1753

Author(s):

Hyoung-Gook KIM ◽

Jin Young KIM

Keyword(s):

Wireless Communication ◽

Speech Enhancement ◽

Single Channel ◽

Communication Device ◽

Enhancement Method

Comparison of various Activation Functions: A Deep Learning Approach

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i3.122126 ◽

2018 ◽

Vol 6 (3) ◽

pp. 122-126

Author(s):

Mohammed Ibrahim Khan ◽

Akansha Singh ◽

Anand Handa ◽

...

Keyword(s):

Deep Learning ◽

Learning Approach ◽

Activation Functions

A Deep Learning based Arabic Script Recognition System: Benchmark on KHAT

The International Arab Journal of Information Technology ◽

10.34028/iajit/17/3/3 ◽

2020 ◽

Vol 17 (3) ◽

pp. 299-305 ◽

Cited By ~ 1

Author(s):

Riaz Ahmad ◽

Saeeda Naz ◽

Muhammad Afzal ◽

Sheikh Rashid ◽

Marcus Liwicki ◽

...

Keyword(s):

Deep Learning ◽

Character Recognition ◽

Data Augmentation ◽

Short Term Memory ◽

Recognition System ◽

Learning Approach ◽

Arabic Text ◽

Data Set ◽

Processing Step ◽

Handwritten Arabic

This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT dataset consists of complex patterns of handwritten Arabic text-lines. This paper contributes in three main aspects: (1) pre-processing, (2) a deep learning based approach, and (3) data augmentation. The pre-processing step includes pruning extra white space and de-skewing the skewed text-lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). The MDLSTM has the advantage of scanning the Arabic text-lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes and fine inflammation. Data augmentation combined with the deep learning approach improves the results, achieving 80.02% Character Recognition (CR) against a baseline of 75.08%.
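To make the CTC step concrete, here is a minimal sketch of greedy (best-path) CTC decoding, which collapses repeated labels and removes blanks from the per-time-step predictions. Whether the paper uses greedy decoding is not stated, so this is only one common option; the function name `ctc_greedy_decode` and the choice of index 0 for the blank label are assumptions.

```python
import numpy as np


def ctc_greedy_decode(scores, blank=0):
    """Best-path CTC decoding of a (time_steps, num_labels) score matrix.

    Picks the highest-scoring label at each time step, merges consecutive
    repeats, then drops the blank label (assumed to be index 0).
    """
    best_path = np.argmax(scores, axis=1)
    decoded = []
    previous = None
    for label in best_path:
        # Emit a label only when it differs from the previous step
        # and is not the blank symbol.
        if label != previous and label != blank:
            decoded.append(int(label))
        previous = label
    return decoded
```

A blank between two identical labels keeps them distinct, so a best path of `[0, 1, 1, 0, 1, 2, 2, 0]` decodes to `[1, 1, 2]`.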
