linear predictive coding
Recently Published Documents


Total documents: 243 (five years: 46)
H-index: 15 (five years: 3)

Author(s):  
Nsiri Benayad ◽  
Zayrit Soumaya ◽  
Belhoussine Drissi Taoufiq ◽  
Ammoumou Abdelkrim

Among the several approaches to detecting Parkinson's disease is one based on the speech signal, since speech impairment is a symptom of the disease. This paper focuses on signal analysis and uses a dataset of voice recordings in which patients were asked to utter the vowels “a”, “o”, and “u”. The discrete wavelet transform (DWT) is applied to the speech signal to obtain a variable-resolution representation that may contain the most important information about the patients. From the approximation a3 obtained by the Daubechies wavelet at scale 2, level 3, 21 features are extracted: linear predictive coding (LPC) coefficients, energy, zero-crossing rate (ZCR), mel frequency cepstral coefficients (MFCC), and wavelet Shannon entropy. The K-nearest neighbour (KNN) algorithm is then used for classification. KNN is a type of instance-based learning that makes decisions based on locally approximated functions, in addition to ensemble learning. However, the choice of training features can have a significant impact on the overall learning process. A genetic algorithm (GA) is therefore used to select the training features that give the most accurate classification.
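Two of the features named in the abstract above, energy and zero-crossing rate, are simple enough to sketch directly. The following is a minimal illustration in plain Python, not the authors' implementation: the function names are hypothetical, the input is assumed to be a list of samples (for example the a3 approximation coefficients), and ZCR is assumed to be normalised by the number of adjacent sample pairs.

```python
def signal_energy(samples):
    """Sum of squared samples (one common definition of frame energy)."""
    return sum(v * v for v in samples)


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ.

    Assumption: zero is treated as non-negative, and the count is
    normalised by the number of adjacent pairs.
    """
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)
```

A signal that alternates in sign on every sample has a rate of 1.0, while a signal that never changes sign has a rate of 0.0.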


2022 ◽  
pp. 828-847
Author(s):  
Gaurav Aggarwal ◽  
Latika Singh

Classification of intellectually disabled children through manual assessment of speech at an early age is inconsistent, subjective, time-consuming and prone to error. This study attempts to classify children with intellectual disabilities using two speech feature extraction techniques: Linear Predictive Coding (LPC) based cepstral parameters, and Mel-frequency cepstral coefficients (MFCC). Four classification models are employed: k-nearest neighbour (k-NN), support vector machine (SVM), linear discriminant analysis (LDA) and radial basis function neural network (RBFNN). 48 speech samples from each group are analysed, taken from subjects of similar age and socio-economic background. The effect of different frame lengths combined with the number of filterbanks in MFCC, and of different frame lengths combined with the order in LPC, is also examined to improve accuracy. The experimental outcomes show that the proposed technique can be used to help speech pathologists estimate intellectual disability at early ages.
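LPC-based cepstral parameters are commonly obtained from the LPC coefficients by a short recursion. The sketch below shows that standard recursion under a stated sign convention, A(z) = 1 + a1 z^-1 + ... + ap z^-p, with the cepstrum taken of the all-pole model 1/A(z) and the gain term omitted. The abstract does not say which variant the authors used, so the function name and convention here are assumptions.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients to cepstral coefficients c1..c_n_ceps.

    a: [1, a1, ..., ap], coefficients of A(z) = 1 + sum_k a_k z^-k
       (assumed sign convention).
    Recursion: n*c_n = -n*a_n - sum_{k=1}^{n-1} k*c_k*a_{n-k},
    with a_n taken as 0 for n > p.
    """
    p = len(a) - 1
    c = [0.0] * (n_ceps + 1)  # c[0] (gain term) is left unused
    for n in range(1, n_ceps + 1):
        acc = sum(k * c[k] * a[n - k] for k in range(max(1, n - p), n))
        a_n = a[n] if n <= p else 0.0
        c[n] = -a_n - acc / n
    return c[1:]
```

For a single pole at 0.5, that is `a = [1.0, -0.5]`, the recursion reproduces the series expansion of -ln(1 - 0.5 z^-1), whose n-th coefficient is 0.5^n / n.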


PLoS ONE ◽  
2021 ◽  
Vol 16 (12) ◽  
pp. e0259140
Author(s):  
Cihun-Siyong Alex Gong ◽  
Chih-Hui Simon Su ◽  
Kuo-Wei Chao ◽  
Yi-Chu Chao ◽  
Chin-Kai Su ◽  
...  

The research describes the recognition and classification of the acoustic characteristics of amphibians using deep learning with a deep neural network (DNN) and long short-term memory (LSTM) for biological applications. First, original data is collected from 32 species of frogs and 3 species of toads commonly found in Taiwan. Secondly, two digital filtering algorithms, linear predictive coding (LPC) and Mel-frequency cepstral coefficient (MFCC), are each used to extract amphibian bioacoustic features and construct the datasets. In addition, the principal component analysis (PCA) algorithm is applied to reduce the dimensionality of the training model datasets. Next, the amphibian bioacoustic features are classified using the DNN and LSTM. The PyTorch platform with a GPU processor (NVIDIA GeForce GTX 1050 Ti) performs the calculation and recognition of the acoustic feature classification results. Based on the two above-mentioned algorithms, the sound feature datasets are classified and summarized in several classification result tables and graphs. The results of the classification experiments on the different bioacoustic features are verified and discussed in detail. This research seeks to identify the best combination of recognition and classification algorithms across all experimental processes.


2021 ◽  
Author(s):  
Lawrence H. Kim ◽  
Rahul Goel ◽  
Jia Liang ◽  
Mert Pilanci ◽  
Pablo E. Paredes

2021 ◽  
Author(s):  
Akuwan Saleh

Technological developments in the world have no boundaries. One of them is speech recognition. Words spoken by humans cannot initially be recognized by computers; to be recognizable, a word must be processed using a specific method. Linear Predictive Coding (LPC) is the method used in this research to extract the characteristics of speech. The LPC method yields a set of LPC coefficients whose count is the LPC order plus 1. The LPC coefficients are processed using a 512-point Fast Fourier Transform (FFT) to simplify the speech recognition process. The results are then used to train a Backpropagation Neural Network (BPNN) to recognize the spoken word. Speech recognition in the program is implemented as a motion controller for an animated object on the computer. The end result of this research is that animated objects move in accordance with the spoken word. The optimal BPNN structure in this research uses the traingda training function, 3 nodes, a learning rate of 0.05, 1000 epochs, and a performance goal of 0.00001. This structure produces the smallest MSE value, 0.000009957. With it, the system recognizes words with 100% accuracy for trained data, 80% for new data from the same respondents as the trained data, and 67.5% for new respondents.
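The statement that an order-p analysis yields p + 1 coefficients can be illustrated with the autocorrelation method and the Levinson-Durbin recursion. This is a minimal sketch in plain Python, assuming the convention A(z) = 1 + a1 z^-1 + ... + ap z^-p, where the leading 1 is counted as the extra coefficient. The abstract does not state which LPC algorithm or convention was used, and the function names are hypothetical.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Return [1, a1, ..., a_order] via the Levinson-Durbin recursion.

    Assumed convention: the predictor is x_hat[n] = -sum_k a_k * x[n-k].
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or perfectly predictable frame: stop early
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this step
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a
```

For example, `lpc(frame, 10)` returns a list of 11 values. On a decaying exponential `0.9 ** n`, an order-1 analysis gives a1 close to -0.9, meaning each sample is predicted as 0.9 times the previous one.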


2021 ◽  
Vol 7 (2) ◽  
pp. 101-107
Author(s):  
Chondro seto Nur Suryawan ◽  
Marisa Premitasari

In general, people who use the Windows operating system on desktop devices install many applications according to their needs. The more applications are installed, the more shortcuts appear on the Windows desktop. A shortcut is an alternative object that stands in for an application so that the user can easily open it without navigating to the location where it is installed. The large number of installed applications clutters the desktop with shortcuts and makes it difficult for the user to find or open the desired application. An application is therefore needed that helps the user find and open applications easily. That application is a virtual assistant that helps the user find and open the desired application. The user provides voice input, features are extracted using the Linear Predictive Coding method, and the result is classified using the Hidden Markov Model Forward method. Once the voice is recognized, the assistant opens the application that matches the detected voice. This study uses 120 training samples covering 6 labels: whatsapp, linkedin, Tokopedia, gmail, powerpoint, and word, with 20 training samples per label. The test set contains 60 samples, 10 per label.
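Classification with the HMM forward method typically means scoring the observation sequence against one model per label and picking the label with the highest likelihood. The sketch below illustrates that idea for discrete observations. It assumes the LPC feature frames have already been mapped to integer symbols (for instance by vector quantization), which the abstract does not specify. The function names and the model layout are hypothetical.

```python
def forward_likelihood(obs, start_p, trans_p, emit_p):
    """P(obs | model) for a discrete HMM, computed with the forward algorithm.

    obs:     list of integer symbol indices
    start_p: start_p[s] = P(first state is s)
    trans_p: trans_p[r][s] = P(next state is s | current state is r)
    emit_p:  emit_p[s][o] = P(symbol o | state s)
    """
    n_states = len(start_p)
    alpha = [start_p[s] * emit_p[s][obs[0]] for s in range(n_states)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[r] * trans_p[r][s] for r in range(n_states))
            * emit_p[s][symbol]
            for s in range(n_states)
        ]
    return sum(alpha)


def classify(obs, models):
    """Return the label whose model gives obs the highest likelihood.

    models: dict mapping label -> (start_p, trans_p, emit_p).
    """
    return max(models, key=lambda label: forward_likelihood(obs, *models[label]))
```

For long sequences the raw probabilities underflow, so a practical version would rescale `alpha` at each step or work in the log domain. That detail is left out here for brevity.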


2021 ◽  
Vol 2 (2) ◽  
pp. 95-100
Author(s):  
Davita Nadia Fadhilah ◽  
Rita Magdalena ◽  
Sofia Sa’idah

Humans have a variety of characteristics that differ from one individual to another. These characteristics are inherent and can be used to distinguish one individual from another; one of them is the voice. Voice recognition is referred to here as speech recognition. This study develops an individual voice recognition system that combines Linear Predictive Coding (LPC) feature extraction with K-Nearest Neighbor (K-NN) classification. Testing varies several parameters: the LPC order, the number of frames, the value of K, and the distance method. The parameter combination test gave a fairly good percentage of 73.56321839% with LPC order 8, 480 frames, K = 5, and the Chebyshev distance.
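The classifier configuration reported above (K-NN with the Chebyshev distance) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, feature vectors are assumed to be equal-length lists of numbers such as LPC coefficients, and ties in the vote are resolved by whichever label `Counter.most_common` returns first.

```python
from collections import Counter


def chebyshev(u, v):
    """Chebyshev distance: the largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(u, v))


def knn_predict(train_x, train_y, query, k=5):
    """Majority vote among the k training vectors nearest to query."""
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: chebyshev(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

Unlike the Euclidean distance, the Chebyshev distance depends only on the single coordinate that differs most, so one strongly deviating coefficient dominates the comparison.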


2021 ◽  
Vol 39 (1B) ◽  
pp. 30-40
Author(s):  
Ahmed M. Ahmed ◽  
Aliaa K. Hassan

Speaker recognition is defined as the process of recognizing a person by his/her voice through specific features extracted from his/her voice signal. Automatic speaker recognition (ASP) is a biometric authentication system. In the last decade, many advances have been made in the speaker recognition field, along with many techniques in the feature extraction and modeling phases. In this paper, we present an overview of the most recent works in ASP technology. The study discusses several ASP modeling techniques such as the Gaussian Mixture Model (GMM), Vector Quantization (VQ), and clustering algorithms. Several feature extraction techniques such as Linear Predictive Coding (LPC) and Mel frequency cepstral coefficients (MFCC) are also examined. Finally, as a result of this study, we found that the MFCC and GMM methods can be considered the most successful techniques in the field of speaker recognition so far.

