Mandarin speech recognition using convolution neural network with augmented tone features

Author(s):  
Xinhui Hu ◽  
Xugang Lu ◽  
Chiori Hori
Author(s):  
Hunny Pahuja ◽  
Priya Ranjan ◽  
Amit Ujlayan ◽  
Ayush Goyal

Introduction: This paper introduces a novel and reliable approach that helps speech-impaired people communicate effectively in real time. A deep learning technique, the convolutional neural network (CNN), is used as the classifier. With this algorithm, words are recognized from visual speech input, irrespective of its audible or acoustic properties. Methods: The network extracts features from mouth stances and from different images. Non-audible mouth stances captured from a source are taken as input and then segregated into subsets to obtain the desired output. The complete data are then arranged to recognize the word as an affricate. Results: The convolutional neural network is one of the most effective algorithms for extracting features, performing classification, and producing the desired output from input images in a speech recognition system. Conclusion: The main objective of the proposed method is to recognize syllables in real time from visual mouth-stance input. When tested, the data accuracy and the quantity of training sets give satisfactory output. A small data set is used as a first step of learning; in future, a larger data set can be considered for analysis. Discussion: The proposed network is tested on different types of data to determine its precision. The network identifies syllables but fails when the syllables belong to the same set. Higher-end graphics processing units are required to reduce time consumption and increase the efficiency of the network.
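The abstract does not specify the network's layers, so the following is only a minimal sketch of how a convolutional classifier might map one mouth image to a word label: convolution, ReLU, max pooling, then a dense softmax layer. Everything concrete here is an assumption made for illustration, not the authors' design: the 8x8 frame size, the two 3x3 kernels, the three-word vocabulary in `WORDS`, and the random, untrained weights.

```python
import math
import random


def conv2d_valid(image, kernel):
    """Slide a square kernel over a 2-D image (stride 1, no padding)."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    """Zero out negative activations."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool_2x2(fmap):
    """Non-overlapping 2x2 max pooling; an odd trailing row/column is dropped."""
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, len(fmap[0]) - 1, 2)]
        for r in range(0, len(fmap) - 1, 2)
    ]


def softmax(scores):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_frame(image, kernels, weights, biases):
    """Forward pass: conv -> ReLU -> pool per kernel, then dense + softmax."""
    features = []
    for kernel in kernels:
        pooled = max_pool_2x2(relu(conv2d_valid(image, kernel)))
        features.extend(v for row in pooled for v in row)
    scores = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)


# Hypothetical setup: sizes, vocabulary, and weights are illustrative only.
rng = random.Random(0)
WORDS = ["yes", "no", "stop"]          # assumed vocabulary, not from the paper
IMAGE_SIZE, KERNEL_SIZE, NUM_KERNELS = 8, 3, 2

kernels = [[[rng.uniform(-1, 1) for _ in range(KERNEL_SIZE)]
            for _ in range(KERNEL_SIZE)] for _ in range(NUM_KERNELS)]
pooled_side = (IMAGE_SIZE - KERNEL_SIZE + 1) // 2
num_features = NUM_KERNELS * pooled_side * pooled_side
weights = [[rng.uniform(-0.1, 0.1) for _ in range(num_features)]
           for _ in WORDS]
biases = [0.0] * len(WORDS)

# A random array stands in for a grayscale mouth-region frame.
frame = [[rng.random() for _ in range(IMAGE_SIZE)] for _ in range(IMAGE_SIZE)]
probs = classify_frame(frame, kernels, weights, biases)
predicted = WORDS[probs.index(max(probs))]
```

Because the weights are untrained, `predicted` carries no meaning here; the sketch only shows the data flow from image to class probabilities. In practice a framework with GPU support would replace these pure-Python loops, which is consistent with the abstract's remark that higher-end graphics processing units are needed to reduce processing time.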

