QVR: Quranic Verses Recitation Recognition System using PocketSphinx

Hasan Ali Gamal Al-Kaf; Muhammad Suhaimi Sulong; Ariffuddin Joret; Nuramin Fitri Aminuddin; Che Adenan Mohammad

doi:10.30880/jqsr.2021.02.02.004

Journal of Quranic Sciences and Research ◽

2021 ◽

Vol 02 (02) ◽

Author(s):

Hasan Ali Gamal Al-Kaf ◽

Muhammad Suhaimi Sulong ◽

Ariffuddin Joret ◽

Nuramin Fitri Aminuddin ◽

...

Keyword(s):

Automatic Speech Recognition ◽

Graphical User Interface ◽

Visual Basic ◽

Recognition System ◽

Training Data ◽

Word Error Rate ◽

Application System ◽

Testing Data ◽

Engine System ◽

User Friendly

Reciting Quranic verses with correct tajweed is obligatory, and pronunciation must be accurate and precise. Recitation should therefore always be reviewed by an expert in Quranic recitation. With current technology, this review can be carried out through an application system, which is especially appropriate during the Covid-19 pandemic, when online systems are needed. In this empirical study, a recognition system called the Quranic Verse Recitation Recognition (QVR) system was developed; it uses PocketSphinx to convert a recited Quranic verse from Arabic speech to Roman text and to determine the accuracy of the reciter. The system's user-friendly Graphical User Interface (GUI) was designed using Microsoft Visual Basic 6 on an Ubuntu platform. A verse of surah al-Ikhlas was chosen for this study; 855 audio recordings by professional reciters were collected as training data, and another 105 recordings were collected as testing data to assess the accuracy of the system. The results indicate that the system achieved 100% accuracy, with a word error rate (WER) of 0.00%, on both the training and testing data when evaluated against the Quran Roman text. These results show that the system and its automatic speech recognition (ASR) engine were successfully designed and developed and that the work merits further extension. In future work, the system will be extended with other surahs of the Quran.
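The word error rate reported above is conventionally defined as the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the romanized example strings are illustrative assumptions and do not come from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match (cost 0) or substitution (cost 1)
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical romanized transcript, used only for illustration.
reference = "qul huwa allahu ahad"
print(word_error_rate(reference, "qul huwa allahu ahad"))  # identical transcripts
print(word_error_rate(reference, "qul huwa ahad"))         # one word dropped
```

Under this definition, a WER of 0.00% means every test transcript matched its reference word for word; dropping one word from a four-word reference gives a WER of 0.25.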

Deep Learning-based Facial Expression Recognition and Analysis for Filipino Gamers

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1027.078219 ◽

2019 ◽

Vol 8 (2) ◽

pp. 1822-1827 ◽

Cited By ~ 1

Keyword(s):

Deep Learning ◽

Short Term Memory ◽

Recognition System ◽

Training Data ◽

Test Accuracy ◽

Expression Recognition ◽

Basic Emotions ◽

Learning Techniques ◽

Testing Data ◽

Long Short Term Memory

This paper presents a computer-vision-based emotion recognition system for identifying six basic emotions among Filipino gamers using deep learning techniques. In particular, the proposed system applies deep learning through the Inception Network and Long Short-Term Memory (LSTM). The researchers gathered a database of Filipino facial expressions consisting of 74 gamers for the training data and 4 gamer subjects for the testing data. The system produced a maximum categorical validation accuracy of 0.9983 and a test accuracy of 0.9940 for the six basic emotions on the Filipino database. Cross-database analysis using the well-known Cohn-Kanade+ database showed that the proposed Inception-LSTM system has accuracy on a par with existing systems. The results demonstrate the feasibility of the proposed system and include sample computations of empathy and engagement based on the six basic emotions as a proof of concept.

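Predicting the Smoothness of Installment Payments by Prospective Debtors Using the K-Nearest Neighbor Method (PREDIKSI KELANCARAN PEMBAYARAN CICILAN CALON DEBITUR DENGAN METODE K-NEAREST NEIGHBOR)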

JURTEKSI ◽

10.33330/jurteksi.v7i2.1078 ◽

2021 ◽

Vol 7 (2) ◽

pp. 195-202

Author(s):

Sri Ayu Rizky ◽

Rolly Yesputra ◽

Santoso Santoso

Keyword(s):

Data Mining ◽

Nearest Neighbor ◽

Training Data ◽

Mining Method ◽

K Nearest Neighbor ◽

Application System ◽

K Value ◽

Testing Data ◽

Calculation Process ◽

K Nearest Neighbor Algorithm

Abstract: In this research, a prediction system was developed to predict whether or not a prospective borrower will make installment payments smoothly. Data on a prospective borrower that meet the criteria are entered by an office clerk into the interface of the prediction application, which processes them using a Data Mining method, namely the K-Nearest Neighbor algorithm, implemented with CodeIgniter 3. The Euclidean distances between the training data and the testing data, computed on predetermined criteria, are displayed in a table sorted from smallest to largest that contains the 9 nearest neighbors, in accordance with the chosen K value of 9. The dominant category among these nine neighbors is taken as the prediction. This dominant category can serve as a guideline that makes it easier for the manager to make a decision about the next prospective borrower. Keywords: Debtor; Data Mining; Euclidean; K-Nearest Neighbor; Prospective Borrowers
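The procedure described in the abstract (compute Euclidean distances, sort ascending, keep the K = 9 nearest neighbors, take the dominant category) can be sketched as follows. This is an illustrative Python sketch rather than the authors' CodeIgniter implementation; the function name `knn_predict`, the two numeric features, the labels `smooth` and `not_smooth`, and all data points are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def knn_predict(training, query, k=9):
    """Classify `query` by majority vote among its k nearest training points.

    `training` is a list of (feature_vector, label) pairs.
    Returns (predicted_label, neighbors), where neighbors is the list of
    (distance, label) pairs sorted from smallest to largest distance.
    """
    if k > len(training):
        raise ValueError("k must not exceed the number of training points")
    # Euclidean distance from the query to every training point, sorted ascending.
    distances = sorted(
        (math.dist(features, query), label) for features, label in training
    )
    neighbors = distances[:k]
    # The dominant category among the k neighbors is the prediction.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0], neighbors


# Hypothetical toy data: two made-up numeric criteria per borrower.
TRAINING = [
    ((8, 2), "smooth"), ((9, 1), "smooth"), ((7, 2), "smooth"),
    ((8, 3), "smooth"), ((9, 3), "smooth"), ((7, 1), "smooth"),
    ((2, 8), "not_smooth"), ((3, 9), "not_smooth"), ((1, 7), "not_smooth"),
    ((2, 6), "not_smooth"), ((3, 7), "not_smooth"), ((4, 6), "not_smooth"),
]

label, neighbors = knn_predict(TRAINING, (7, 3), k=9)
print(label)
for distance, neighbor_label in neighbors:
    print(f"{distance:.2f}  {neighbor_label}")
```

With an odd K such as 9 and two categories, the vote cannot tie, which may be one reason to prefer an odd value; the abstract itself does not state why 9 was chosen.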

Practical Steps for Building a Speech Recognition System with HTK (Langkah Praktis Membangun Sistem Pengenalan Suara dengan HTK)

JSAI (Journal Scientific and Applied Informatics) ◽

10.36085/jsai.v2i2.314 ◽

2019 ◽

Vol 2 (2) ◽

pp. 149-153

Author(s):

Zulkarnaen Hatala

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Error Rate ◽

Principal Component Analysis ◽

Hidden Markov ◽

Recognition System ◽

Speech Recognition System ◽

Automatic Speech Recognition System ◽

Word Error Rate ◽

Bahasa Indonesia

This paper presents a procedure for developing an Automatic Speech Recognition (ASR) system for the online recognition case. The procedure builds an ASR system quickly and efficiently using the Hidden Markov Model Toolkit (HTK). The practical steps are laid out clearly for implementing a small-vocabulary ASR system, using Indonesian (Bahasa Indonesia) digit recognition as the example case. Several techniques for improving performance are described, such as handling noise, handling multiple spellings, and applying Principal Component Analysis. The final result is reported as a Word Error Rate.

Development of End-to-End Encoder-Decoder Model Applying Voice Recognition System in Different Channels

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1267.0982s1119 ◽

2019 ◽

Vol 8 (2S11) ◽

pp. 2350-2352

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Error Rate ◽

Voice Recognition ◽

Ground Truth ◽

Recognition System ◽

Training Algorithms ◽

Word Error Rate ◽

End To End ◽

Evaluation Metric

The dissimilarity between a recognized word sequence and its ground truth in different channels can be captured by the Word Error Rate (WER), the standard evaluation metric for Automatic Speech Recognition (ASR). In the 1ch model, the track is trained without any preprocessing, and the study of multichannel end-to-end ASR suggests that this function can be integrated into a deep neural network (DNN)-based system, leading to multiple experimental results. Moreover, because WER is not directly differentiable, it is pertinent to adopt an Encoder-Decoder gradient objective function, as has been shown in the CHiME-4 system. In this study, we examine the sequence-level evaluation metric as a fair choice for optimizing the Encoder-Decoder model, for which many training algorithms are designed to reduce sequence-level error. The study incorporates the scoring of multiple hypotheses in the decoding stage to improve the decoding result. In this way, the mismatch between the objectives is brought into a feasible form. Hence, the study identifies the voice recognition result that is most effective for adaptation.

Making Evidence-Based Practice User Friendly: A Curriculum for Training "Data-Proficient" Clinicians

PsycEXTRA Dataset ◽

10.1037/e517292011-091 ◽

2009 ◽

Author(s):

Christopher Layne ◽

Virginia Strand ◽

Robert Abramovitz ◽

Glenn Saxe

Keyword(s):

Evidence Based Practice ◽

Training Data ◽

Evidence Based ◽

User Friendly

A Study on Utilization of Three-Dimensional Sensor Lip Image for Developing a Pronunciation Recognition System

Journal of Imaging Science and Technology ◽

10.2352/j.imagingsci.technol.2019.63.5.050402 ◽

2019 ◽

Vol 63 (5) ◽

pp. 50402-1-50402-9 ◽

Cited By ~ 1

Author(s):

Ing-Jr Ding ◽

Chong-Min Ruan

Keyword(s):

Principal Component Analysis ◽

Automatic Speech Recognition ◽

Feature Fusion ◽

Three Dimensional ◽

Principal Component ◽

Recognition System ◽

Geometrical Characteristics ◽

3D Geometry ◽

Different Types ◽

The Disabled

Acoustic-based automatic speech recognition (ASR) is a mature technique that is widely used in numerous applications. However, acoustic-based ASR does not maintain standard performance for disabled speakers with an abnormal face, that is, atypical eye or mouth geometrical characteristics. To address this problem, this article develops a three-dimensional (3D) sensor lip image based pronunciation recognition system, in which the 3D sensor is used to acquire the variations in lip shape as a speaker pronounces. Two different types of 3D lip features for pronunciation recognition are presented: the 3D-(x, y, z) coordinate lip feature and the 3D geometry lip feature parameters. For the 3D-(x, y, z) coordinate lip feature, 18 location points around the outer and inner lips, each with 3D coordinates, are defined. For the 3D geometry lip features, eight types of features that consider the geometrical space characteristics of the inner lip are developed. In addition, feature fusion that combines the 3D-(x, y, z) coordinate and 3D geometry lip features is considered. The performance and effectiveness of the presented 3D sensor lip image based features are evaluated using a principal component analysis based classification approach. Experimental results on pronunciation recognition for two different datasets, Mandarin syllables and Mandarin phrases, demonstrate the competitive performance of the presented 3D sensor lip image based pronunciation recognition system.

An Analog Circuit Fault Diagnosis Approach Based on Wavelet-based fractal analysis and Multiple Kernel SVM

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813666201207154641 ◽

2020 ◽

Vol 13 ◽

Author(s):

Jianfeng Jiang

Keyword(s):

Fault Diagnosis ◽

Fractal Analysis ◽

Analog Circuit ◽

Training Data ◽

Support Vector ◽

Pass Filter ◽

Multiple Kernel ◽

Testing Data ◽

Circuit Fault Diagnosis ◽

Diagnosis Approach

Objective: To diagnose analog circuit faults correctly, this paper presents an analog circuit fault diagnosis approach based on wavelet-based fractal analysis and a multiple kernel support vector machine (MKSVM). Methods: Time responses of the circuit under different faults are measured, and wavelet-based fractal analysis is then used to process the collected time responses in order to generate features for the signals. Kernel principal component analysis (KPCA) is applied to reduce the dimensionality of the features. Afterwards, the features are divided into training data and testing data. An MKSVM, with its multiple parameters optimized by the chaos particle swarm optimization (CPSO) algorithm, is used to construct an analog circuit fault diagnosis model based on the testing data. Results: The proposed analog diagnosis approach is demonstrated through a fault diagnosis simulation of a four-opamp biquad high-pass filter. Conclusion: The approach outperforms other commonly used methods in the comparisons.

“Spanish Políglota”: an automatic Speech Recognition system based on HMM

2021 Second International Conference on Information Systems and Software Technologies (ICI2ST) ◽

10.1109/ici2st51859.2021.00011 ◽

2021 ◽

Author(s):

Jonathan A. Zea ◽

Josafa Aguiar

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Recognition System ◽

Speech Recognition System ◽

Automatic Speech Recognition System

CrossGR

Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies ◽

10.1145/3448100 ◽

2021 ◽

Vol 5 (1) ◽

pp. 1-23

Author(s):

Xinyi Li ◽

Liqiong Chang ◽

Fangfang Song ◽

Ju Wang ◽

Xiaojiang Chen ◽

...

Keyword(s):

Gesture Recognition ◽

Low Cost ◽

User Involvement ◽

Recognition System ◽

Training Data ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Target User ◽

Order Of Magnitude ◽

Training Examples

This paper focuses on a fundamental question in Wi-Fi-based gesture recognition: "Can we use the knowledge learned from some users to perform gesture recognition for others?" This problem is also known as cross-target recognition. It arises in many practical deployments of Wi-Fi-based gesture recognition where it is prohibitively expensive to collect training data from every single user. We present CrossGR, a low-cost cross-target gesture recognition system. As a departure from existing approaches, CrossGR does not require prior knowledge of the target user (such as who is currently performing a gesture). Instead, CrossGR employs a deep neural network to extract user-agnostic but gesture-related Wi-Fi signal characteristics to perform gesture recognition. To provide sufficient training data to build an effective deep learning model, CrossGR employs a generative adversarial network to automatically generate a large amount of synthetic training data from a small set of real-world examples collected from a small number of users. This strategy allows CrossGR to minimize user involvement and the associated cost of collecting training examples for building an accurate gesture recognition system. We evaluate CrossGR by applying it to gesture recognition across 10 users and 15 gestures. Experimental results show that CrossGR achieves an accuracy of over 82.6% (up to 99.75%). We demonstrate that CrossGR delivers comparable recognition accuracy while using an order of magnitude fewer training samples collected from end users than state-of-the-art recognition systems.

Development of a Generalized Voice-Controlled Human-Robot Interface: One Automatic Speech Recognition System for All Robots

2020 3rd International Conference on Control and Robots (ICCR) ◽

10.1109/iccr51572.2020.9344123 ◽

2020 ◽

Author(s):

Warat Khaewratana ◽

Elizabeth S. Veinott ◽

S. Manian Ramkumar

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Recognition System ◽

Speech Recognition System ◽

Automatic Speech Recognition System
