Arabic Speech Recognition
Recently Published Documents


Total documents: 110 (five years: 28)
H-index: 12 (five years: 2)

2021, Vol 49 (1)
Author(s): Souad Larabi-Marie-Sainte, Betool S. Alnamlah, Norah F. Alkassim, Sara Y. Alshathry, ...

Automated recitation plays an important role in improving self-learning. It is based on speech and text recognition. Research in Arabic speech recognition is very limited, and the few existing applications are based only on the Holy Qur’an. This article proposes a new system (Samee’a) to facilitate memorizing any kind of text, such as poems, speeches, and the Holy Qur’an. The Samee’a system uses the Google Cloud Speech Recognition API to convert Arabic speech to text and the Jaro-Winkler distance algorithm to determine the similarity between the original and converted texts. The system was tested using 70 collected files ranging from 12 to 400 words and some chapters from the Holy Qur’an. The average similarity was 83.33% for the 70 files and 69% for the selected chapters of the Holy Qur’an. These results improved to 91.33% and 95.66% after applying preprocessing operations to the text files and the Holy Qur’an, respectively. To validate the results, two comparison studies were performed. The Jaro-Winkler distance compared favorably with the cosine and Euclidean distances. In addition, the proposed system outperformed related work, with a similarity improvement of up to 5% when using section 30 of the Holy Qur’an. Finally, user experience testing was carried out by 10 users of different ages (between 5 and 50 years old) using short texts and some short chapters of the Holy Qur’an. The proposed system proved efficient.
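
The similarity step described in this abstract can be illustrated with a minimal sketch of the Jaro-Winkler measure. This is not the Samee’a implementation: the function names, the prefix scale of 0.1, and the 4-character prefix cap are the commonly used defaults and are assumptions here, as is the sample recitation text.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this sliding window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1,
                 max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings sharing a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1.0 - base)


# Hypothetical example: an original text and a recognized text with one
# word dropped by the recognizer.
original = "بسم الله الرحمن الرحيم"
recognized = "بسم الله الرحيم"
print(f"similarity: {jaro_winkler(original, recognized):.2%}")
```

The measure rewards shared characters in roughly the same positions and adds a bonus for a shared prefix, so it tolerates small recognition errors better than an exact comparison would.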


2021, pp. 1-13
Author(s): Hamzah A. Alsayadi, Abdelaziz A. Abdelhamid, Islam Hegazy, Zaki T. Fayed

The Arabic language has a set of sound letters called diacritics, which play an essential role in the meaning of words and their articulation. Changing some diacritics changes the context of the sentence. However, the presence of these letters in the corpus transcription affects the accuracy of speech recognition. In this paper, we investigate the effect of diacritics on Arabic speech recognition based on end-to-end deep learning. The applied end-to-end approach combines CNN-LSTM with an attention-based technique, as provided in the state-of-the-art Espresso framework using PyTorch. In addition, to the best of our knowledge, CNN-LSTM with attention has not been used for Arabic automatic speech recognition (ASR). To fill this gap, this paper proposes a new approach based on CNN-LSTM with an attention-based method for Arabic ASR. The language model in this approach is trained using RNN-LM and LSTM-LM on the nondiacritized transcription of the speech corpus. The Standard Arabic Single Speaker Corpus (SASSC), after omitting the diacritics, is used to train and test the deep learning model. Experimental results show that removing diacritics decreased the out-of-vocabulary rate and the perplexity of the language model. In addition, the word error rate (WER) is significantly improved compared to diacritized data. The achieved average reduction in WER is 13.52%.
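
Two ideas in this abstract, omitting diacritics from a transcription and scoring recognition by WER, can be sketched in a few lines. This is an illustrative sketch only, not the paper's Espresso pipeline. The helper names are hypothetical, and the set of code points treated as diacritics (U+064B to U+0652 plus U+0670) is an assumption about what "diacritics" covers.

```python
import re

# Assumed diacritic set: tanween, fatha, damma, kasra, shadda, sukun
# (U+064B..U+0652) and the superscript alef (U+0670).
DIACRITICS = re.compile("[\u064B-\u0652\u0670]")


def strip_diacritics(text: str) -> str:
    """Return the text with the assumed Arabic diacritic marks removed."""
    return DIACRITICS.sub("", text)


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping one previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: a diacritized reference scored against a
# recognizer output that carries no diacritics.
reference = "كَتَبَ الوَلَدُ الدَّرْسَ"
hypothesis = "كتب الولد الدرس"
print(word_error_rate(reference, hypothesis))
print(word_error_rate(strip_diacritics(reference), hypothesis))
```

Because WER compares whole words, a word that differs only in one diacritic counts as a full substitution, which is one way the presence of diacritics in transcriptions can inflate the error rate.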


2021, Vol 2021, pp. 1-15
Author(s): Fatmah Abdulrahman Baothman

Artificial intelligence (AI) is progressively changing techniques of teaching and learning. In the past, the objective was to provide an intelligent tutoring system without intervention from a human teacher to enhance skills, control, knowledge construction, and intellectual engagement. This paper proposes a definition of AI focusing on enhancing the humanoid agent Nao’s learning capabilities and interactions. The aim is to increase Nao’s intelligence using big data by activating multisensory perceptions such as visual and auditory stimuli modules and speech-related stimuli, as well as various movements. The method is to develop a toolkit that enables Arabic speech recognition and implements the Haar algorithm for robust image recognition, improving Nao’s capabilities during interactions with a child in a mixed reality system using big data. The experiment design and testing processes were conducted by implementing an AI design principle, namely, the three-constituent principle. Four experiments were conducted to boost Nao’s intelligence level using 100 children and different environments (class, lab, home, and mixed reality with the Leap Motion Controller (LMC)). An objective function and an operational time cost function were developed to improve Nao’s learning experience in different environments, with a best result of 4.2 seconds for each number recognition. The experiments’ results showed an increase in Nao’s intelligence from 3 to 7 years old when compared with a child’s intelligence in learning simple mathematics, with the best communication at a kappa ratio value of 90.8%, a corpus that exceeded 390,000 segments, and a 93% success rate when activating both the auditory and vision modules of the agent Nao. The developed toolkit uses Arabic speech recognition and the Haar algorithm in a mixed reality system using big data, enabling Nao to achieve a 94% learning success rate at a distance of 0.09 m; when using LMC in mixed reality, hand sign gestures recorded the highest accuracy, 98.50%, using the Haar algorithm. The results show that this work enabled Nao to gradually achieve a higher learning success rate as the environment changes and multisensory perception increases. This paper also proposes a cutting-edge research direction for fostering child-robot education in real time.


2021, Vol 8 (1), pp. 164-170
Author(s): Mohammad Husam Alhumsi, Saleh Belhassen

Phonetic dictionaries are regarded as pivotal components of speech recognition systems. The aim of speech recognition research is to build a machine that accurately identifies and distinguishes normal human speech from any speaker. The literature affirms that Arabic phonetics is one of the major problems in Arabic speech recognition. Therefore, this paper reviews previous studies tackling the challenges of building an Arabic phonetic dictionary for Arabic speech recognition. It was found that speech recognition systems have investigated areas of difference concerning Arabic phonetics. In addition, an Arabic phonetic dictionary should be built in which the phonemes of Arabic vowels are considered a component of the consonants’ phonemes. Thus, the incorporation of developed machine translation systems may enhance the quality of the system. The paper concludes with the existing challenges facing Arabic phonetic dictionaries.

