Speaker Recognition Based on Lightweight Neural Network for Smart Home Solutions

Author(s):  
Haojun Ai ◽  
Wuyang Xia ◽  
Quanxin Zhang
2021 ◽  
Author(s):  
Yu-Jia Zhang ◽  
Yih-Wen Wang ◽  
Chia-Ping Chen ◽  
Chung-Li Lu ◽  
Bo-Cheng Chan

2020 ◽  
Author(s):  
Chaofeng Lan ◽  
Yuanyuan Zhang ◽  
Hongyun Zhao

Abstract This paper draws on the training method of the Recurrent Neural Network (RNN). By increasing the number of hidden layers of the RNN, changing the layer activation function from the traditional Sigmoid to Leaky ReLU, and zero-padding the first and last groups of data on the input layer to make better use of the data, an improved noise-reduction model, the Denoise Recurrent Neural Network (DRNN), is constructed. The model has high calculation speed and good convergence, and is intended to address the low speaker recognition rate in noisy environments. Using this model, speech signals with random semantic content, a sampling rate of 16 kHz, and a duration of 5 seconds from the speech library are studied. The experimental signal-to-noise ratios are -10 dB, -5 dB, 0 dB, 5 dB, 10 dB, 15 dB, 20 dB, and 25 dB. In the noisy environment, the improved model is used to denoise the Mel Frequency Cepstral Coefficients (MFCC) and the Gammatone Frequency Cepstral Coefficients (GFCC), and the impact of the traditional model and the improved model on the speech recognition rate is analyzed. The research shows that the improved model effectively removes noise from the feature parameters and improves the speech recognition rate. The improvement in speaker recognition rate is more pronounced when the signal-to-noise ratio is low. When the signal-to-noise ratio is 0 dB, the speaker recognition rate is increased by 40%, reported as an 85% improvement over the traditional speech model. The recognition rate also rises gradually as the signal-to-noise ratio increases; at 15 dB, the speaker recognition rate is 93%.
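The abstract lists the signal-to-noise ratios used but not how the noisy signals were produced. The following is a minimal sketch of one common way to mix noise into a clean signal at a chosen SNR, assuming additive noise and average-power scaling. The function names `mix_at_snr` and `measured_snr_db`, the sine-wave stand-in for speech, and the white Gaussian noise are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the speech-to-noise power ratio is snr_db."""
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)[: len(speech)]
    p_speech = np.mean(speech ** 2)  # average power of the clean signal
    p_noise = np.mean(noise ** 2)    # average power of the unscaled noise
    # SNR_dB = 10 * log10(p_speech / (scale^2 * p_noise)), solved for scale
    scale = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + scale * noise


def measured_snr_db(clean, noisy):
    """Recover the SNR in dB from a clean signal and its noisy version."""
    residual = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))


# Hypothetical stand-ins: a 5-second tone at 16 kHz instead of real speech,
# and white Gaussian noise instead of a recorded noise source.
rng = np.random.default_rng(0)
fs = 16_000
t = np.arange(5 * fs) / fs
speech = np.sin(2 * np.pi * 220.0 * t)
noise = rng.standard_normal(len(t))

# The SNR conditions listed in the abstract.
snr_levels = [-10, -5, 0, 5, 10, 15, 20, 25]
noisy = {snr: mix_at_snr(speech, noise, snr) for snr in snr_levels}
```

Each entry of `noisy` could then be passed to a feature extractor (for example MFCC or GFCC) before denoising. That stage is outside the scope of this sketch.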


Author(s):  
Mridusmita Sharma ◽  
Rituraj Kaushik ◽  
Kandarpa Kumar Sarma

Speaker recognition is the task of identifying a person from the unique features or behavioural characteristics contained in his or her speech. It is a biometric modality that uses features of the speaker that are influenced by individual behaviour as well as by the characteristics of the vocal cords. The issue becomes more complex when regional languages are considered. Here, the authors report the design of a speaker recognition system, using normal and telephonic Assamese speech as their case study. They use i-vectors as features to generate an optimal feature set and a Feed Forward Neural Network for recognition, which gives a fairly high recognition rate.


Author(s):  
Yongliang Wang ◽  
Huimin Lv ◽  
Qi Zhang ◽  
Pengfei Wang ◽  
Daliang Yan ◽  
...  

Author(s):  
Puji Catur Siswipraptini ◽  
Rosida Nur Aziza ◽  
Iriansyah BM Sangadji ◽  
Indrianto Indrianto ◽  
Riki RuliA. Siregar
