Development of Consonant-Vowel Recognition Systems for Indian languages: Bengali and Odia

Author(s):  
K E Manjunath ◽  
S. B. Sunil Kumar ◽  
Debadatta Pati ◽  
Biswajit Satapathy ◽  
K. Sreenivasa Rao
Author(s):  
Manjunath K. E. ◽  
Srinivasa Raghavan K. M. ◽  
K. Sreenivasa Rao ◽  
Dinesh Babu Jayagopi ◽  
V. Ramasubramanian

In this study, we evaluate and compare two different approaches for multilingual phone recognition in code-switched and non-code-switched scenarios. The first approach is a front-end Language Identification (LID) system that switches to a monolingual phone recognizer (LID-Mono), trained individually on each of the languages present in the multilingual dataset. In the second approach, a common multilingual phone-set derived from the International Phonetic Alphabet (IPA) transcription of the multilingual dataset is used to develop a Multilingual Phone Recognition System (Multi-PRS). The bilingual code-switching experiments are conducted using the Kannada and Urdu languages. In the first approach, LID is performed using state-of-the-art i-vectors. Both monolingual and multilingual phone recognition systems are trained using Deep Neural Networks. The performances of the LID-Mono and Multi-PRS approaches are compared and analysed in detail. It is found that the Multi-PRS approach outperforms the more conventional LID-Mono approach in both code-switched and non-code-switched scenarios. For code-switched speech, the effect of the length of the segments used to perform LID on the performance of the LID-Mono system is studied by varying the window size from 500 ms to 5.0 s, and also using the full utterance. The LID-Mono approach depends heavily on the accuracy of the LID system, and LID errors cannot be recovered. The Multi-PRS system, by contrast, requires no front-end LID switching and is designed around the common multilingual phone-set derived from several languages. It is therefore not constrained by the accuracy of the LID system and performs effectively on both code-switched and non-code-switched speech, offering lower Phone Error Rates than the LID-Mono system.
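
The abstract compares systems by Phone Error Rate but does not define it. A minimal sketch follows, assuming the conventional definition: the Levenshtein (edit) distance between the reference and hypothesised phone sequences, divided by the reference length and expressed as a percentage. The function names and the example phone labels are illustrative assumptions, not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phone sequences (lists of labels)."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phone_error_rate(ref, hyp):
    """Assumed definition: 100 * (substitutions + deletions + insertions) / len(ref)."""
    if not ref:
        raise ValueError("reference phone sequence must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one substituted phone in a four-phone reference.
reference = ["k", "a", "n", "a"]
hypothesis = ["k", "a", "m", "a"]
per = phone_error_rate(reference, hypothesis)
```

Under this definition, an LID error in the LID-Mono pipeline routes a segment to the wrong monolingual recognizer, so every phone error in that segment counts against the system with no opportunity for recovery.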


Author(s):  
N. Shobha Rani ◽  
Sanjay Kumar Verma ◽  
Anitta Joseph

Achieving high accuracy and efficiency in South Indian character recognition systems is one of the principal goals to be pursued in order to promote the use of optical character recognition (OCR) for South Indian languages like Telugu. The process of character recognition comprises pre-processing, segmentation, feature extraction, classification and recognition. The feature extraction stage is meant to uniquely characterize each character image for the purpose of classifying it. The selection of a feature extraction algorithm is critical for any image processing application, and most of the time it depends directly on the type of image objects to be identified. For optical technologies like South Indian OCR, the feature extraction technique plays a vital role in recognition accuracy because of the huge character sets. In this work we mainly focus on evaluating the performance of various feature extraction techniques with respect to Telugu character recognition systems and analyze their efficiency and accuracy in recognizing the Telugu character set.


Author(s):  
R. SANJEEV KUNTE ◽  
R. D. SUDHAKER SAMUEL

Optical Character Recognition (OCR) systems have been effectively developed for the recognition of printed characters of non-Indian languages. Efforts are underway to develop efficient OCR systems for Indian languages, especially for Kannada, a popular South Indian language. We present in this paper an OCR system developed for the recognition of basic characters in printed Kannada text, which can handle different font sizes and font sets. Wavelets, which have been increasingly used in pattern recognition and on-line character recognition systems, are used in our system to extract the features of printed Kannada characters. Neural classifiers have been effectively used for the classification of characters based on wavelet features. The system methodology can be extended to the recognition of other South Indian languages, especially Telugu.


2019 ◽  
Vol 22 (1) ◽  
pp. 157-168 ◽  
Author(s):  
K. E. Manjunath ◽  
Dinesh Babu Jayagopi ◽  
K. Sreenivasa Rao ◽  
V. Ramasubramanian

1970 ◽  
Vol 13 (4) ◽  
pp. 715-724 ◽  
Author(s):  
Richard L. Powell ◽  
Oscar Tosi

Vowels were segmented into 15 different temporal segments taken from the middle of the vowel and ranging from 4 to 60 msec, then presented to 6 subjects with normal hearing. The mean temporal-segment recognition threshold was 15 msec, with a range from 9.3 msec for /u/ to 27.2 msec for /a/. Misidentified vowels were most often confused with the vowel adjacent to them on the vowel-hump diagram. There was no significant difference between the cardinal and noncardinal vowels.


1986 ◽  
Vol 29 (3) ◽  
pp. 420-424 ◽  
Author(s):  
Michael Dorman ◽  
Ingrid Cedar ◽  
Maureen Hannley ◽  
Marjorie Leek ◽  
Julie Mapes Lindholm

Computer synthesized vowels of 50- and 300-ms duration were presented to normal-hearing listeners at a moderate and high sound pressure level (SPL). Presentation at the high SPL resulted in poor recognition accuracy for vowels of a duration (50 ms) shorter than the latency of the acoustic stapedial reflex. Presentation level had no effect on recognition accuracy for vowels of sufficient duration (300 ms) to elicit the reflex. The poor recognition accuracy for the brief, high intensity vowels was significantly improved when the reflex was preactivated. These results demonstrate the importance of the acoustic reflex in extending the dynamic range of the auditory system for speech recognition.


Author(s):  
Ramandeep Kaur ◽  
Lakhvir Singh Garcha ◽  
Mohita Garag ◽  
Satinderpal Singh ◽  
...  

Author(s):  
V. Jagan Naveen ◽  
K. Krishna Kishore ◽  
P. Rajesh Kumar

In the modern world, human recognition systems play an important role in improving security by reducing the chances of evasion. The human ear can be used for person identification. In an empirical study of the human ear, 10,000 images were taken to establish the uniqueness of the ear. The ear-based system is one of the few biometric systems that provide characteristics that remain stable with age. In this paper, ear images are taken from the mathematical analysis of images (AMI) ear database, and ear pattern recognition is analysed using the expectation-maximization algorithm and the k-means algorithm. Patterns of ears affected by different types of noise are recognized using the principal component analysis (PCA) algorithm.

