Research on Identity Recognition Algorithm Based on Speech Signal

Author(s):  
Chengtao Cai ◽  
Fan Liu
2018 ◽  
Vol 173 ◽  
pp. 03038
Author(s):  
Kun Wang ◽  
Yiwu Qian ◽  
Jinmo Wu ◽  
Xiaoyong Liu

Currently, pattern recognition techniques are widely applied to human identity recognition, yet most of them still have shortcomings. To overcome these problems, a novel algorithm is proposed for identity recognition in a new field: static gait recognition. Two combined feature groups are used for static gait recognition: the distances between the center of pressure of each foot region and that of the overall foot, and the side lengths of the triangle outlining the human foot. Then, by comparing different classifiers, the best recognition scheme can be obtained.
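As an illustration of the classifier-comparison step described above, the following Python sketch cross-validates several candidate classifiers on a placeholder feature matrix. The feature dimensions (six center-of-pressure distances plus three triangle side lengths), the subject count, and the classifier choices are assumptions for demonstration, not values from the paper.

```python
# Sketch: compare classifiers on the combined static-gait features.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9))        # placeholder: 6 CoP distances + 3 triangle sides
y = np.repeat(np.arange(10), 20)     # placeholder: 10 subject identities, 20 samples each

classifiers = {
    "SVM": SVC(kernel="rbf"),
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "RandomForest": RandomForestClassifier(n_estimators=100),
}
for name, clf in classifiers.items():
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```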


2021 ◽  
Vol 4 (4) ◽  
pp. 223-232
Author(s):  
Changjie Wang ◽  
Zhihua Li ◽  
Benjamin Sarpong

Author(s):  
Hong Zhao ◽  
Lupeng Yue ◽  
Weijie Wang ◽  
Zeng Xiangyan

A speech signal is a time-varying signal that is strongly affected by the individual speaker and the environment. To improve the end-to-end voiceprint recognition rate, the original speech signal must be preprocessed to some extent. An end-to-end voiceprint recognition algorithm based on a convolutional neural network (CNN) is proposed. In this algorithm, the convolution and down-sampling operations of the CNN are used to preprocess the speech signals for end-to-end voiceprint recognition. One-dimensional and two-dimensional convolution operations were established to extract Mel-frequency cepstrum coefficient (MFCC) feature parameters from the preprocessed signals, and the classical universal background model was used to build the voiceprint recognition model. In this study, the principle of end-to-end voiceprint recognition was first analyzed, and the end-to-end voiceprint recognition process, its features, and the Res-FD-CNN network structure were studied. Then the CNN recognition model was constructed, the data were preprocessed to form the frequency-domain convolutional layer, and the algorithm was tested.
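The following PyTorch sketch illustrates the preprocessing idea described above: a 1-D convolution and down-sampling stage applied to the raw waveform, followed by 2-D convolutions over the resulting time-frequency map. The layer sizes and speaker count are assumptions; this is not the authors' exact Res-FD-CNN architecture.

```python
# Sketch: 1-D conv preprocessing of the waveform, then 2-D convs for speaker classification.
import torch
import torch.nn as nn

class VoiceprintCNN(nn.Module):
    def __init__(self, n_speakers=100):
        super().__init__()
        # 1-D convolution + down-sampling acts as a learned preprocessor
        self.pre = nn.Sequential(
            nn.Conv1d(1, 40, kernel_size=400, stride=160),  # ~25 ms frames, 10 ms hop
            nn.ReLU(),
        )
        # 2-D convolutions over the resulting 40-channel time-frequency map
        self.conv2d = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.fc = nn.Linear(32 * 4 * 4, n_speakers)

    def forward(self, wav):                  # wav: (batch, 1, samples)
        feat = self.pre(wav)                 # (batch, 40, frames)
        feat = feat.unsqueeze(1)             # (batch, 1, 40, frames)
        feat = self.conv2d(feat).flatten(1)
        return self.fc(feat)

logits = VoiceprintCNN()(torch.randn(2, 1, 16000))  # 1 s of 16 kHz audio
print(logits.shape)  # torch.Size([2, 100])
```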


Author(s):  
Chabib Arifin ◽  
Hartanto Junaedi

Speech is one of the biometric characteristics possessed by human beings, like fingerprints, DNA, and the retina of the eye; no two human beings have the same voice. Human emotion is often assumed to be readable only from a person's face, or from changes in facial expression, but it turns out that emotions can also be detected from the spoken voice. Emotions such as happiness, anger, neutrality, sadness, and surprise can be detected from the speech signal. The development of voice recognition systems is still ongoing, so this research analyzes a person's emotion from the speech signal. Related research on speech addresses identity recognition, gender recognition, and emotion recognition based on conversation. In this research, the authors classify speech into the five emotion classes happy, angry, neutral, sad, and surprise. The algorithm used is the SVM (Support Vector Machine), with the MFCC (Mel-frequency cepstral coefficient) algorithm for feature extraction, whose filtering is adapted to human hearing. Implementing both algorithms gives accuracies of happy = 68.54%, angry = 75.24%, neutral = 78.50%, sad = 74.22%, and surprise = 68.23%.
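A minimal sketch of the MFCC-plus-SVM pipeline the abstract describes might look as follows, here using librosa and scikit-learn with synthetic waveforms standing in for the labeled emotion recordings; the time-averaging of coefficients and the kernel choice are assumptions.

```python
# Sketch: time-averaged MFCC vectors fed to an SVM emotion classifier.
import numpy as np
import librosa
from sklearn.svm import SVC

def mfcc_vector(y, sr=16000, n_mfcc=13):
    """Fixed-length utterance descriptor: time-averaged MFCCs."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# Synthetic stand-ins for labeled utterances (5 emotion classes).
rng = np.random.default_rng(1)
X = np.stack([mfcc_vector(rng.normal(size=16000)) for _ in range(50)])
y = rng.integers(0, 5, 50)  # 0=happy, 1=angry, 2=neutral, 3=sad, 4=surprise

clf = SVC(kernel="rbf").fit(X, y)
print("training accuracy:", clf.score(X, y))
```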


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Yingdong Wang ◽  
Qingfeng Wu ◽  
Chen Wang ◽  
Qunsheng Ruan

In the past few decades, identity recognition based on electroencephalography (EEG) has received extensive attention as a way to resolve the security problems of conventional biometric systems. In the present study, a novel EEG-based identification system combining differential entropy features with a continuous convolutional neural network (CNN) classifier is proposed. The performance of the proposed method is experimentally evaluated on emotional EEG data. The conducted experiments show that the proposed method achieves an average accuracy (ACC) of 99.7% and can rapidly train and update the DE-CNN model. Then, the effects of different emotions and of different time intervals on identification performance are investigated. The results show that different emotions affect the identification accuracy, with EEG recorded in negative and neutral moods being more robust than that from positive emotions. For a video signal as the EEG stimulus, it is found that the proposed method using the full 0–75 Hz range is more robust than any single band, while the 15–32 Hz band exhibits overfitting and reduces the accuracy of the cross-emotion test. It is concluded that a longer time interval reduces accuracy and that the 15–32 Hz band has the best compatibility in terms of attenuation.
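For context, a common closed-form route to the differential entropy (DE) feature the abstract pairs with the CNN treats each band-passed EEG segment as approximately Gaussian, so DE = 0.5 ln(2 pi e sigma^2). The sketch below computes a per-channel, per-band DE map; the sampling rate, channel count, and band edges are assumptions, apart from the 0–75 Hz overall range mentioned above.

```python
# Sketch: differential entropy per channel and frequency band, as CNN input features.
import numpy as np
from scipy.signal import butter, filtfilt

def band_de(x, fs, lo, hi):
    """Differential entropy of one channel in the [lo, hi] Hz band."""
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    xf = filtfilt(b, a, x)
    return 0.5 * np.log(2 * np.pi * np.e * np.var(xf))

fs = 200                                                  # assumed sampling rate (Hz)
eeg = np.random.default_rng(2).normal(size=(32, fs * 2))  # 32 channels, 2 s segment
bands = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 14),
         "beta": (14, 31), "gamma": (31, 75)}
de = np.array([[band_de(ch, fs, lo, hi) for (lo, hi) in bands.values()]
               for ch in eeg])                            # (32 channels, 5 bands) DE map
print(de.shape)
```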


2021 ◽  
Vol 58 (2) ◽  
pp. 6497-6501
Author(s):  
N Mekebayev ◽  
O Mamyrbayev ◽  
M Turdalyuly ◽  
D Oralbekova ◽  
M Tasbolatov

Digital processing of the speech signal and the voice recognition algorithm are very important for fast and accurate automatic scoring by the recognition technology. A voice is a signal carrying a wealth of information, and direct analysis and synthesis of the complex speech signal is difficult because the information is embedded throughout the signal. Speech is the most natural way for people to communicate. The task of speech recognition is to convert speech into a sequence of words by means of a computer program. This article presents an algorithm for extracting MFCCs for speech recognition. The MFCC algorithm reduces the required processing power by 53% compared with the conventional algorithm. Automatic speech recognition was implemented in Matlab.
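The article implements recognition in Matlab; as a language-neutral illustration of the standard MFCC extraction steps it builds on (pre-emphasis, framing, windowing, power spectrum, mel filter bank, log, DCT), here is a compact NumPy sketch. The frame lengths and coefficient counts are common defaults, not values taken from the article.

```python
# Sketch: classic MFCC extraction pipeline in NumPy.
import numpy as np
from scipy.fftpack import dct

def mfcc(signal, fs=16000, n_filt=26, n_ceps=13, frame_ms=25, step_ms=10):
    signal = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])  # pre-emphasis
    flen, fstep = int(fs * frame_ms / 1000), int(fs * step_ms / 1000)
    n_frames = 1 + (len(signal) - flen) // fstep
    idx = np.arange(flen) + fstep * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(flen)                 # framing + windowing
    nfft = 512
    power = (np.abs(np.fft.rfft(frames, nfft)) ** 2) / nfft  # power spectrum
    # Mel-spaced triangular filter bank
    mel = np.linspace(0, 2595 * np.log10(1 + (fs / 2) / 700), n_filt + 2)
    hz = 700 * (10 ** (mel / 2595) - 1)
    bins = np.floor((nfft + 1) * hz / fs).astype(int)
    fbank = np.zeros((n_filt, nfft // 2 + 1))
    for i in range(1, n_filt + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    feat = np.log(power @ fbank.T + 1e-10)                  # log mel energies
    return dct(feat, type=2, axis=1, norm="ortho")[:, :n_ceps]

coeffs = mfcc(np.random.default_rng(3).normal(size=16000))
print(coeffs.shape)  # (n_frames, 13)
```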


2020 ◽  
Vol 13 (2) ◽  
pp. 207-221
Author(s):  
Minghua Wei

Purpose
In order to solve the problem that the performance of existing local feature descriptors in uncontrolled environments is greatly affected by illumination, background, occlusion, and other factors, we propose a novel face recognition algorithm for uncontrolled environments that combines the block center-symmetric local binary pattern (CS-LBP) with a deep residual network (DRN) model.

Design/methodology/approach
The algorithm first extracts the block CS-LBP features of the face image, then feeds the extracted features into the DRN model, and gives the face recognition results using a well-trained DRN model. The features obtained by the proposed algorithm combine the characteristics of local texture features and deep features that are robust to illumination.

Findings
Compared with directly using the original image, using the local texture features of the image as the input of the DRN model significantly improves computational efficiency. Experimental results on the FERET, YALE-B, and CMU-PIE face datasets show that the recognition rate of the proposed algorithm is significantly higher than that of the compared algorithms.

Originality/value
The proposed algorithm addresses the problem of face identity recognition in uncontrolled environments, and it is particularly robust to changes in illumination, which demonstrates its superiority.
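As a sketch of the CS-LBP descriptor the abstract builds on: each pixel's four center-symmetric neighbour pairs in its 3x3 ring are compared, producing a 4-bit code whose block-wise histograms would then feed the DRN. The comparison threshold and the single-block histogram below are illustrative assumptions.

```python
# Sketch: 4-bit center-symmetric LBP code map for a grayscale image.
import numpy as np

def cs_lbp(img, t=0.01):
    """CS-LBP codes for a 2-D grayscale image (float values in [0, 1])."""
    p = np.pad(img, 1, mode="edge")
    # The four center-symmetric neighbour pairs of the 3x3 ring
    pairs = [((0, 1), (2, 1)),   # N  vs S
             ((0, 2), (2, 0)),   # NE vs SW
             ((1, 2), (1, 0)),   # E  vs W
             ((2, 2), (0, 0))]   # SE vs NW
    h, w = img.shape
    code = np.zeros((h, w), dtype=np.uint8)
    for bit, ((r1, c1), (r2, c2)) in enumerate(pairs):
        a = p[r1:r1 + h, c1:c1 + w]
        b = p[r2:r2 + h, c2:c2 + w]
        code |= (a - b > t).astype(np.uint8) << bit
    return code

img = np.random.default_rng(4).random((64, 64))
hist = np.bincount(cs_lbp(img).ravel(), minlength=16)  # 16-bin histogram per block
print(hist)
```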


Author(s):  
Chin-Teng Lin ◽  
Hsi-Wen Nein ◽  
Wei-Fen Lin

In this paper, we propose a speech recognition algorithm that uses hidden Markov models (HMMs) and the Viterbi algorithm to segment the input speech sequence, so that the variable-dimensional speech signal is converted into a fixed-dimensional one, called the TN vector. We then use a fuzzy perceptron to generate hyperplanes that separate the patterns of each class from the others. The proposed speech recognition algorithm lends itself to speaker adaptation when the idea of "supporting patterns" is used. The supporting patterns are those patterns closest to the hyperplane. When a recognition error occurs, we take all the TN vectors of the input speech sequence, with respect to the segmentations of all HMM models, as the supporting patterns. The supporting patterns are then used by the fuzzy perceptron to tune the hyperplane that would yield the correct recognition, as well as the hyperplane that caused the wrong recognition. Since only two hyperplanes need to be tuned per recognition error, the proposed adaptation scheme is time-efficient and suitable for on-line adaptation. Although the adaptation scheme cannot guarantee that a wrong recognition is corrected immediately after adaptation, the hyperplanes are tuned iteratively in the direction of correct recognition, and the speed of adaptation can be adjusted by a "belief" parameter set by the user. Several examples demonstrate the performance of the proposed speech recognition algorithm and the speaker adaptation scheme.
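A schematic sketch of the adaptation step described above follows: on a recognition error, only the hyperplane of the correct class and that of the wrongly winning class are tuned, using the supporting TN vectors and a user-set belief rate. The linear hyperplane form and the update rule are illustrative assumptions, not the paper's exact fuzzy perceptron.

```python
# Sketch: tune only the two hyperplanes involved in a recognition error.
import numpy as np

def adapt(W, b, supports, true_cls, wrong_cls, belief=0.1):
    """Pull the correct hyperplane toward, and push the wrong one away from,
    the supporting TN vectors, scaled by the 'belief' rate."""
    for x in supports:
        W[true_cls] += belief * x
        b[true_cls] += belief
        W[wrong_cls] -= belief * x
        b[wrong_cls] -= belief
    return W, b

rng = np.random.default_rng(5)
W, b = rng.normal(size=(10, 24)), np.zeros(10)  # 10 classes, 24-dim TN vectors (assumed)
supports = rng.normal(size=(3, 24))             # TN vectors from all HMM segmentations
W, b = adapt(W, b, supports, true_cls=2, wrong_cls=7)
print(np.argmax(W @ supports[0] + b))           # predicted class after adaptation
```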

