Arabic Phoneme Identification Using Conventional and Concurrent Neural Networks in Non Native Speakers

Author(s):  
Mian M. Awais ◽  
Shahid Masud ◽  
Junaid Akhtar ◽  
Shafay Shamail
2019 ◽  
Vol 29 (2) ◽  
pp. 393-405 ◽  
Author(s):  
Magdalena Piotrowska ◽  
Gražina Korvel ◽  
Bożena Kostek ◽  
Tomasz Ciszewski ◽  
Andrzej Czyżewski

Abstract Automatic classification methods, such as artificial neural networks (ANNs), the k-nearest neighbor (kNN) and self-organizing maps (SOMs), are applied to allophone analysis based on recorded speech. A list of 650 words was created for that purpose, containing positionally and/or contextually conditioned allophones. For each word, a group of 16 native and non-native speakers were audio-video recorded, from which the speech of seven native speakers and phonology experts was selected for analysis. For the purpose of the present study, a sub-list of 103 words containing the English alveolar lateral phoneme /l/ was compiled. The list includes ‘dark’ (velarized) allophonic realizations (which occur before a consonant or at the end of the word before silence) and 52 ‘clear’ allophonic realizations (which occur before a vowel), as well as voicing variants. The recorded signals were segmented into allophones and parametrized using a set of descriptors originating from the MPEG-7 standard, plus dedicated time-based parameters as well as modified MFCC features proposed by the authors. Classification methods such as ANNs, the kNN and the SOM were employed to automatically detect the two types of allophones. Various sets of features were tested to achieve the best performance of the automatic methods. In the final experiment, a selected set of features was used for automatic evaluation of the pronunciation of dark /l/ by non-native speakers.
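The kNN classifier mentioned in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the two-dimensional toy vectors merely stand in for the MPEG-7/MFCC-derived descriptors, and the ‘dark’/‘clear’ labels mirror the two allophone classes under study.

```python
import numpy as np

def knn_classify(train_X, train_y, x, k=3):
    """Classify vector x by majority vote among its k nearest training vectors."""
    # Euclidean distance from the query to every training example
    dists = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Toy 2-D feature vectors standing in for the allophone descriptors
train_X = np.array([[0.0, 0.1], [0.1, 0.0], [1.0, 1.1], [1.1, 0.9]])
train_y = np.array(["dark", "dark", "clear", "clear"])

print(knn_classify(train_X, train_y, np.array([0.05, 0.05])))  # dark
```

In the real task each row of `train_X` would hold the full feature vector extracted from one segmented allophone, and `k` would be tuned on held-out data.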


Author(s):  
R Santhoshi

While many internet and mobile applications for learning a new language focus on teaching words and sentences, they do not address the user's pronunciation. Even speakers who are proficient in a language may have pronunciation influenced by their native language. The proposed system is intended for people who want to improve their pronunciation; it focuses on the pronunciation of English words and sentences by non-native speakers, i.e., those for whom English is a second language. For a given audio clip, we scale the audio, extract features, and feed them to the trained model, which outputs the phonemes spoken in the clip. Many phoneme-detection models and methods have been proposed, but the main reason for choosing deep learning is that features we tend to overlook are picked up by the model, provided the dataset is balanced and the model is built properly. The features worth considering vary with every speech-processing project; drawing on previous research and on trial and error, we choose the features that work best for this task. By comparing the detected phonemes with the phonemes actually expected, we can tell the speaker which part of their speech needs work, and feedback on how to improve pronunciation is given per mismatched phoneme.
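The final step above, locating which phonemes a speaker got wrong by comparing the recognized sequence against the expected one, amounts to a sequence alignment. A minimal sketch using the standard-library `difflib` (the phoneme symbols and the example word are hypothetical, not from the paper):

```python
from difflib import SequenceMatcher

def pronunciation_feedback(reference, spoken):
    """Align two phoneme sequences and return the mismatched spans
    as (expected, produced) pairs."""
    issues = []
    sm = SequenceMatcher(a=reference, b=spoken)
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag != "equal":
            issues.append((reference[i1:i2], spoken[j1:j2]))
    return issues

# Hypothetical ARPABET-style transcriptions for the word "three"
reference = ["TH", "R", "IY"]
spoken    = ["T", "R", "IY"]   # a common L1-influenced substitution
print(pronunciation_feedback(reference, spoken))  # [(['TH'], ['T'])]
```

Each returned pair pinpoints a substitution, insertion, or deletion, which is exactly the information needed to generate targeted feedback for the speaker.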


2016 ◽  
Vol 41 (4) ◽  
pp. 669-682 ◽  
Author(s):  
Gábor Gosztolya ◽  
András Beke ◽  
Tilda Neuberger ◽  
László Tóth

Abstract Laughter is one of the most important paralinguistic events, and it has specific roles in human conversation. The automatic detection of laughter occurrences in human speech can aid automatic speech recognition systems as well as some paralinguistic tasks such as emotion detection. In this study we apply Deep Neural Networks (DNN) for laughter detection, as this technology is nowadays considered state-of-the-art in similar tasks such as phoneme identification. We carry out our experiments using two corpora containing spontaneous speech in two languages (Hungarian and English). Furthermore, since we find it reasonable that not all frequency regions are required for efficient laughter detection, we also perform feature selection to find a sufficient feature subset.


Author(s):  
Sandra Godinho ◽  
Margarida V. Garrido ◽  
Oleksandr V. Horchak

Abstract. Words whose articulation resembles ingestion movements are preferred to words mimicking expectoration movements. This so-called in-out effect, suggesting that the oral movements caused by consonantal articulation automatically activate concordant motivational states, has already been replicated in languages belonging to the Germanic (e.g., German and English) and Italic (e.g., Portuguese) branches of the Indo-European family. However, it remains unknown whether such preference extends to the Indo-European branches whose writing system is based on the Cyrillic rather than the Latin alphabet (e.g., Ukrainian), or whether it occurs in languages not belonging to the Indo-European family (e.g., Turkish). We replicated the in-out effect in two high-powered experiments (N = 274), with Ukrainian and Turkish native speakers, further supporting an embodied explanation for this intriguing preference.

