Arabic Speech Classification Method Based on Padding and Deep Learning Neural Network

2021 ◽  
Vol 18 (2(Suppl.)) ◽  
pp. 0925
Author(s):  
Asroni Asroni ◽  
Ku Ruhana Ku-Mahamud ◽  
Cahya Damarjati ◽  
Hasan Basri Slamat

Deep learning convolutional neural networks (CNNs) have been widely used to recognize or classify voice. Various techniques have been used together with CNNs to prepare voice data before training the classification model. However, not all models produce good classification accuracy, because there are many types of voice or speech. Pronunciation of the Arabic alphabet is one such type, and accurate pronunciation is required when learning to read the Qur’an. Processing the pronunciation and training on the processed data therefore require a specific approach. To address this, a method based on padding and a deep learning convolutional neural network is proposed to evaluate the pronunciation of the Arabic alphabet. Voice data from six school children were recorded and used to test the performance of the proposed method. The padding technique was used to augment the voice data before feeding them to the CNN structure to develop the classification model. In addition, three other feature extraction techniques were introduced for comparison with the proposed padding-based method. The performance of the proposed method with padding is on par with the spectrogram and better than the mel-spectrogram and mel-frequency cepstral coefficients. The results also show that the proposed method was able to distinguish the Arabic letters that are difficult to pronounce. The proposed method with padding may be extended to assess pronunciation ability beyond the Arabic alphabet.
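The abstract does not say how the padding is applied, so the following is only a minimal sketch under one common assumption: each recording is right-padded with zeros (silence) to the length of the longest recording, so that a CNN receives fixed-size inputs. The function names `pad_to_length` and `pad_batch` and the toy sample values are hypothetical.

```python
from typing import List, Sequence


def pad_to_length(samples: Sequence[float], target_len: int,
                  pad_value: float = 0.0) -> List[float]:
    """Return a copy of `samples` with exactly `target_len` values.

    Longer signals are truncated; shorter ones are right-padded with
    `pad_value` (zero, i.e. silence, by default). This is an assumed
    padding scheme, not necessarily the one used in the paper.
    """
    if target_len < 0:
        raise ValueError("target_len must be non-negative")
    clipped = list(samples[:target_len])
    return clipped + [pad_value] * (target_len - len(clipped))


def pad_batch(recordings: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pad every recording to the length of the longest one in the batch."""
    target_len = max(len(r) for r in recordings)
    return [pad_to_length(r, target_len) for r in recordings]


# Hypothetical toy "recordings" of different durations.
batch = pad_batch([[0.1, 0.2, 0.3], [0.5], [0.4, 0.6]])
```

After this step every row of `batch` has the same length, which is the property a fixed-input CNN needs; other schemes (left padding, centered padding, repeating the signal) would satisfy it equally well.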

2017 ◽  
Vol 22 (4) ◽  
pp. 270-275
Author(s):  
A. A. Gorbunov ◽  
◽  
E. A. Isaev ◽  
V. A. Samodurov ◽  
◽  
...  

The rapid spread of the Android operating system in the smartphone market has resulted in an exponential growth of threats to mobile applications. Various studies have been carried out in academia and industry on identifying and classifying malicious applications using machine learning and deep learning algorithms. The convolutional neural network is a deep learning technique that has gained popularity in speech and image recognition. The conventional approach to identifying Android malware relies on learning from pre-extracted features to maintain high detection performance. To reduce the effort and domain expertise involved in hand-crafted feature engineering, we generated grayscale images of the AndroidManifest.xml and classes.dex files extracted from the Android package and applied a convolutional neural network to classify the images. The experiments were conducted on a recent dataset of 1747 malicious Android applications. The results indicate that the classes.dex file gives better results than AndroidManifest.xml, and that the model performs better as the image becomes larger.
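One common way to turn a binary file into a grayscale image is to treat each byte as a pixel intensity in the range 0 to 255 and wrap the byte stream into rows of a fixed width. The abstract does not give the exact procedure, so the sketch below is an assumption; the function name `bytes_to_grayscale`, the zero-filling of the last row, and the sample bytes are hypothetical.

```python
from typing import List


def bytes_to_grayscale(data: bytes, width: int) -> List[List[int]]:
    """Reshape raw bytes into rows of `width` pixels (values 0-255).

    The final row is zero-filled if the byte count is not a multiple
    of `width`. This layout is an assumption for illustration only.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])  # each byte becomes one pixel
        row += [0] * (width - len(row))        # pad the last, shorter row
        rows.append(row)
    return rows


# Hypothetical stand-in for the contents of a classes.dex file.
image = bytes_to_grayscale(b"\x00\xff\x10\x20\x30", width=2)
```

A larger `width` (and hence a larger image) keeps more of the file visible to the network at once, which is consistent with the reported trend that bigger images perform better.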


Author(s):  
D.A Janeera ◽  
P. Amudhavalli ◽  
P Sherubha ◽  
S.P Sasirekha ◽  
P. Anantha Christu Raj ◽  
...  

Author(s):  
Abdul Kholik ◽  
Agus Harjoko ◽  
Wahyono Wahyono

High vehicle density is a common problem in every city, and its main impact is congestion. Classifying vehicle density levels on particular roads is necessary because there are at least 7 vehicle density level conditions. Monitoring by the police, the Department of Transportation, and road operators currently relies on video-based surveillance such as CCTV, which is still watched manually by people. Deep learning is a machine learning approach based on artificial neural networks that has been actively developed and researched recently because it has delivered good results on various soft-computing problems. This research uses a convolutional neural network architecture and varies its supporting parameters to obtain the maximum accuracy. After the parameters were changed, the classification model was evaluated using K-fold cross-validation, a confusion matrix, and a model test on held-out test data. K-fold cross-validation with K = 5 gave an average result of 92.83%. In the model test on 100 test samples, the model classified 81 correctly.
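The evaluation described here combines K-fold splitting with a plain accuracy count. The sketch below shows both pieces in a generic form; it assumes contiguous, unshuffled folds, which the abstract does not specify, and the helper names `k_fold_indices` and `accuracy` are hypothetical.

```python
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_samples-1 into k (train, test) pairs.

    Folds are contiguous and as equal in size as possible; each index
    appears in exactly one test fold. No shuffling is done (assumption).
    """
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def accuracy(n_correct: int, n_total: int) -> float:
    """Fraction of correctly classified samples."""
    return n_correct / n_total


folds = k_fold_indices(n_samples=10, k=5)  # K = 5 as in the abstract
test_accuracy = accuracy(81, 100)          # 81 of 100 test samples correct
```

With the figures reported in the abstract, the held-out accuracy works out to 81%, noticeably below the 92.83% cross-validation average.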


MATEMATIKA ◽  
2018 ◽  
Vol 34 (3) ◽  
pp. 83-90
Author(s):  
Nita Cahyani ◽  
Kartika Fithriasari ◽  
Irhamah Irhamah ◽  
Nur Iriawan

Neural Network and Binary Logistic Regression are, respectively, modern and classical data mining tools that can be used to classify data on Bidikmisi scholarship acceptance in East Java Province, Indonesia. One form of Neural Network model available for various applications is the Resilient Backpropagation Neural Network (Resilient BPNN). This study compares the performance of the Resilient BPNN method, as a Deep Learning Neural Network, with the Binary Logistic Regression method in classifying Bidikmisi scholarship acceptance in East Java Province. After the data are preprocessed and divided into training and testing sets with a 10-fold cross-validation procedure, the Resilient BPNN and Binary Logistic Regression methods are implemented. The results show that Resilient BPNN with two hidden layers is the best network model. In terms of classification G-mean, Resilient BPNN with two hidden layers is more representative and performs better than Binary Logistic Regression. The Resilient BPNN is recommended for predicting the acceptance of Bidikmisi applicants each year.
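The abstract does not define G-mean. For binary classification it is usually taken as the geometric mean of sensitivity and specificity, and the sketch below assumes that definition. The function name `g_mean` and the confusion-matrix counts are hypothetical, not results from the study.

```python
import math


def g_mean(tp: int, fn: int, tn: int, fp: int) -> float:
    """Geometric mean of sensitivity and specificity (assumed definition).

    tp/fn: accepted applicants predicted correctly / incorrectly.
    tn/fp: rejected applicants predicted correctly / incorrectly.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must have at least one sample")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return math.sqrt(sensitivity * specificity)


# Hypothetical counts, chosen only to illustrate the calculation.
score = g_mean(tp=8, fn=2, tn=9, fp=1)
```

Because the two rates are multiplied, a classifier that ignores one class scores near zero, which makes G-mean a reasonable choice when accepted and rejected applicants are imbalanced.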


2021 ◽  
Vol 11 (15) ◽  
pp. 7149
Author(s):  
Ji-Yeoun Lee

This work is focused on deep learning methods, such as feedforward neural network (FNN) and convolutional neural network (CNN), for pathological voice detection using mel-frequency cepstral coefficients (MFCCs), linear prediction cepstrum coefficients (LPCCs), and higher-order statistics (HOSs) parameters. In total, 518 voice data samples were obtained from the publicly available Saarbruecken voice database (SVD), comprising recordings of 259 healthy and 259 pathological women and men, respectively, and using /a/, /i/, and /u/ vowels at normal pitch. Significant differences were observed between the normal and the pathological voice signals for normalized skewness (p = 0.000) and kurtosis (p = 0.000), except for normalized kurtosis (p = 0.051) that was estimated in the /u/ samples in women. These parameters are useful and meaningful for classifying pathological voice signals. The highest accuracy, 82.69%, was achieved by the CNN classifier with the LPCCs parameter in the /u/ vowel in men. The second-best performance, 80.77%, was obtained with a combination of the FNN classifier, MFCCs, and HOSs for the /i/ vowel samples in women. There was merit in combining the acoustic measures with HOS parameters for better characterization in terms of accuracy. The combination of various parameters and deep learning methods was also useful for distinguishing normal from pathological voices.
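The higher-order statistics mentioned here, normalized skewness and kurtosis, can be illustrated with the standard moment-based definitions. The paper's exact normalization is not given in the abstract, so the sketch below assumes population moments (third and fourth central moments divided by powers of the variance); the function name and the sample frame are hypothetical.

```python
from typing import Sequence, Tuple


def skewness_kurtosis(frame: Sequence[float]) -> Tuple[float, float]:
    """Return (normalized skewness, normalized kurtosis) of a signal frame.

    Uses population central moments: skewness = m3 / m2**1.5 and
    kurtosis = m4 / m2**2 (not excess kurtosis). Assumed definitions.
    """
    n = len(frame)
    if n == 0:
        raise ValueError("frame must not be empty")
    mean = sum(frame) / n
    m2 = sum((x - mean) ** 2 for x in frame) / n
    if m2 == 0:
        raise ValueError("frame has zero variance")
    m3 = sum((x - mean) ** 3 for x in frame) / n
    m4 = sum((x - mean) ** 4 for x in frame) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical symmetric frame: skewness should vanish.
skew, kurt = skewness_kurtosis([1.0, 2.0, 3.0, 4.0, 5.0])
```

In a voice-analysis pipeline these two values would be computed per frame and appended to the MFCC or LPCC feature vector before classification.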


Author(s):  
Vinit Kumar Gunjan ◽  
Rashmi Pathak ◽  
Omveer Singh

This article describes how to set up convolutional neural network (CNN) training for classifying images into various groups. It also presents initial classification results obtained with features learned by the CNN on images from different categories. To determine a suitable architecture, we explore a transfer learning technique, fine-tuning, on a dataset used to provide solutions for individually classified image classes.

