VLSI arrays for speech processing with linear predictive coding

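Identifikasi Emosi Manusia Berdasarkan Ucapan Menggunakan Metode Ekstraksi Ciri LPC dan Metode Euclidean Distance [Identification of Human Emotion from Speech Using LPC Feature Extraction and the Euclidean Distance Method]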

Jurnal Teknologi Informasi dan Ilmu Komputer ◽

10.25126/jtiik.2020722693 ◽

2020 ◽

Vol 7 (6) ◽

pp. 1177

Author(s):

Siti Helmiyah ◽

Imam Riadi ◽

Rusydi Umar ◽

Abdullah Hanif ◽

Anton Yudhana ◽

...

Keyword(s):

Signal Processing ◽

Feature Extraction ◽

Speech Processing ◽

Euclidean Distance ◽

Predictive Coding ◽

Digital Signal ◽

Linear Predictive Coding ◽

Distance Method ◽

Average Accuracy ◽

Voice Data

Speech is a highly complex signal that carries several kinds of information. From speech a listener can recover the message, the identity of the speaker, the language, and even the speaker's emotion, which the speaker may convey without realizing it. Speech processing is a branch of digital signal processing that aims at natural interaction between humans and machines. Emotional characteristics are features of speech that carry the traits of the speaker's emotion. Linear Predictive Coding (LPC) is a feature extraction method in signal processing. This study uses LPC for feature extraction and the Euclidean Distance method to identify emotions from the LPC features. The emotions considered are anger, sadness, happiness, neutrality, and boredom. The data were taken from Berlin Emo DB, using three different sentences spoken by different actors. The resulting accuracy was 58.33% for sadness, 50% for neutral, 41.67% for anger, and 8.33% for happiness, while boredom could not be recognized at all. LPC as a feature extractor gave poor results in this study, with an average accuracy of only 31.67% across all emotions. Differences in sentence, actor, age, and accent in the voice data can affect emotion recognition, so feature extraction is a critical step in recognizing emotional speech patterns. The accuracy obtained here is still very low and could be improved by using other features such as prosodic, spectral, and voice quality features, together with statistical parameters such as max, min, mean, median, kurtosis, and skewness. The choice of classification method can also affect the recognition results.
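
The LPC-plus-Euclidean-distance pipeline described in this abstract can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: it assumes the autocorrelation method with the Levinson-Durbin recursion for LPC, a single reference vector per class, and synthetic decaying signals in place of recorded speech. The names `lpc`, `classify`, `decaying_frame`, `class_a`, and `class_b` are hypothetical.

```python
import math

def autocorrelation(frame, max_lag):
    """Return autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def lpc(frame, order):
    """Levinson-Durbin recursion on the frame's autocorrelation.

    Returns [a1..a_order] such that x[n] is approximated by
    a1*x[n-1] + ... + a_order*x[n-order].
    """
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)   # a[0] is unused
    err = r[0]                # prediction error energy
    for i in range(1, order + 1):
        if err <= 0.0:        # silent or perfectly predicted frame
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]

def classify(features, templates):
    """Return the label whose template is nearest in Euclidean distance."""
    return min(templates, key=lambda label: math.dist(features, templates[label]))

def decaying_frame(rate, length=200):
    """Synthetic stand-in for a speech frame (assumption, not real data)."""
    return [rate ** n for n in range(length)]

# One reference LPC vector per (hypothetical) class.
templates = {
    "class_a": lpc(decaying_frame(0.9), order=2),
    "class_b": lpc(decaying_frame(0.5), order=2),
}
query = lpc(decaying_frame(0.85), order=2)
print(classify(query, templates))
```

In a real system the templates would be averaged LPC vectors from labelled recordings, and the frame would usually be windowed before the autocorrelation is taken.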

Fast Computation of LSP Frequencies Using the Bairstow Method

Electronics ◽

10.3390/electronics9030387 ◽

2020 ◽

Vol 9 (3) ◽

pp. 387 ◽

Cited By ~ 1

Author(s):

Yuqun Xue ◽

Zhijiu Zhu ◽

Jianhua Jiang ◽

Yi Zhan ◽

Zenghui Yu ◽

...

Keyword(s):

Speech Processing ◽

Linear Prediction ◽

Predictive Coding ◽

Computation Time ◽

Fast Computation ◽

Linear Predictive Coding ◽

Polynomial Roots ◽

Alternative Representation ◽

Perceptual Evaluation ◽

Initial Method

Linear prediction is a core technology in speech processing. It has been widely applied in speech recognition, synthesis, and coding, and can represent the speech frequency spectrum efficiently and accurately with only a few parameters. Line Spectrum Pair (LSP) frequencies, an alternative representation of Linear Predictive Coding (LPC) coefficients, have the advantages of good quantization accuracy and low spectral sensitivity. However, computing the LSP frequencies takes a long time. To address this issue, this paper proposes a fast algorithm, based on the Bairstow method, for computing LSP frequencies from linear prediction coefficients. The algorithm first transforms the symmetric and antisymmetric polynomials into general polynomials, then extracts the polynomial roots. Exploiting the short-term stationarity of the speech signal, an adaptive initialization method reduces the average number of iterations by 26% compared with static initialization, with a Perceptual Evaluation of Speech Quality (PESQ) score reaching 3.46. Experimental results show that the proposed method extracts the polynomial roots efficiently and accurately with significantly reduced computational complexity. Compared with previous work, the proposed method is 17 times faster than the Tschirnhaus transform, and achieves a 22% PESQ improvement over the Birge-Vieta method at a nearly comparable computation time.
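
To make the symmetric and antisymmetric polynomials concrete, the sketch below builds P(z) = A(z) + z^-(p+1) A(1/z) and Q(z) = A(z) - z^-(p+1) A(1/z) from the predictor coefficients and locates their unit-circle roots. It does not use the Bairstow method proposed in the paper; it substitutes a plain grid search with bisection, which is slower but easy to check. The convention A(z) = 1 - sum a_k z^-k and the names `lsp_frequencies` and `bisect_root` are assumptions made for this illustration.

```python
import math

def bisect_root(f, lo, hi, tol=1e-12):
    """Refine a sign change of f in [lo, hi] by bisection."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def lsp_frequencies(a, grid=512):
    """LSP frequencies in radians, ascending, for A(z) = 1 - sum_k a[k-1] z^-k.

    Roots closer together than pi/grid within one polynomial can be missed.
    """
    p = len(a)
    c = [1.0] + [-x for x in a] + [0.0]                  # A(z), padded to degree p+1
    sym = [c[k] + c[p + 1 - k] for k in range(p + 2)]    # P(z), symmetric
    anti = [c[k] - c[p + 1 - k] for k in range(p + 2)]   # Q(z), antisymmetric
    half = (p + 1) / 2.0

    # With the common phase exp(-j*w*half) removed, each polynomial is a
    # real function of w on the unit circle.
    def f_sym(w):
        return sum(s * math.cos(w * (half - k)) for k, s in enumerate(sym))

    def f_anti(w):
        return sum(q * math.sin(w * (half - k)) for k, q in enumerate(anti))

    roots = []
    for f in (f_sym, f_anti):
        prev_w, prev = None, None
        for i in range(1, grid):             # open interval (0, pi): skips z = +1, -1
            w = math.pi * i / grid
            cur = f(w)
            if cur == 0.0:
                roots.append(w)
            elif prev is not None and prev * cur < 0.0:
                roots.append(bisect_root(f, prev_w, w))
            prev_w, prev = w, cur
    return sorted(roots)

print(lsp_frequencies([0.9, 0.0]))
```

For a stable predictor the roots of P(z) and Q(z) interleave on the unit circle, which is why the sorted list can be quantized as a single ordered vector.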

Perancangan Sistem Kontrol Otomatis Lampu Menggunakan Metode Pengenalan Suara Berbasis Arduino [Design of an Arduino-Based Automatic Lamp Control System Using Speech Recognition]

TELKA - Telekomunikasi Elektronika Komputasi dan Kontrol ◽

10.15575/telka.v2i2.31 ◽

2016 ◽

Vol 2 (2) ◽

pp. 106-117

Author(s):

Adam Faroqi ◽

Mada Sanjaya WS ◽

Riyan Nugraha

Keyword(s):

Real Time ◽

Speech Processing ◽

Predictive Coding ◽

Linear Predictive Coding ◽

Fuzzy Inference System ◽

Inference System ◽

Neuro Fuzzy

Today's technological developments benefit many people's lives. Every aspect of life can make use of technology suited to its needs, including home control. Earlier research has shown that speech signals can also be used to interact with computers, making that interaction more natural. Research that works with speech signal data is generally called speech processing. This study aims to build a system that recognizes speech in the form of sentences, so that it can later be used in electrical technology. Speech processing passes through several stages: sampling, feature extraction, and learning. Feature extraction reveals the characteristics of a speech signal. Several feature extraction methods are in common use; this study uses Linear Predictive Coding (LPC). LPC was chosen because its extraction scheme adopts the human hearing system as the filter for capturing information. Learning and recognition are then performed by an Adaptive Neuro-Fuzzy Inference System (ANFIS), chosen for its ability to carry out probabilistic analysis and then produce a response according to its parameters. Sentence recognition begins with recording 20 utterances to serve as training data. In testing, extraction with 4 features gave the lowest accuracy, 60% to 70%; 5 features gave 60% to 80%, and 6 features gave 70% to 80%. Real-time identification tested with two speakers gave 60% accuracy for the first speaker and 70% for the second when using 4 features. The response-time analysis shows that fewer features speed up the MATLAB response, while more features slow it down.

A Novel Meteosat Second Generation Image Compression Method Based on Radon Transform, Linear Predictive Coding with Filtering and Sorted Run Length Coding

International Review on Computers and Software (IRECOS) ◽

10.15866/irecos.v10i4.5704 ◽

2015 ◽

Vol 10 (4) ◽

pp. 438 ◽

Cited By ~ 1

Author(s):

Mehdi Cherifi ◽

Mourad Lahdir ◽

Soltane Ameur

Keyword(s):

Image Compression ◽

Radon Transform ◽

Second Generation ◽

Predictive Coding ◽

Compression Method ◽

Linear Predictive Coding ◽

Run Length ◽

Run Length Coding

Exploring vowel formant estimation through simulation-based techniques

Linguistics Vanguard ◽

10.1515/lingvan-2018-0060 ◽

2020 ◽

Vol 6 (s1) ◽

Cited By ~ 1

Author(s):

Tyler Kendall ◽

Charlotte Vaughn

Keyword(s):

Predictive Coding ◽

Linear Predictive Coding ◽

Fine Grained ◽

Vowel Formant ◽

Simulation Based ◽

Insight Into

This paper contributes insight into the sources of variability in vowel formant estimation, a major analytic activity in sociophonetics, by reviewing the outcomes of two simulations that manipulated the settings used for linear predictive coding (LPC)-based vowel formant estimation. Simulation 1 explores the range of frequency differences obtained when minor adjustments are made to LPC settings, and measurement timepoints around the settings used by trained analysts, in order to determine the range of variability that should be expected in sociophonetic vowel studies. Simulation 2 examines the variability that emerges when LPC settings are varied combinatorially around constant default settings, rather than settings set by trained analysts. The impacts of different LPC settings are discussed as a way of demonstrating the inherent properties of LPC-based formant estimation. This work suggests that differences more fine-grained than about 10 Hz in F1 and 15–20 Hz in F2 are within the range of LPC-based formant estimation variability.

Acoustic classification using linear predictive coding for wildlife detection systems

2017 International Symposium on Signals, Circuits and Systems (ISSCS) ◽

10.1109/isscs.2017.8034944 ◽

2017 ◽

Cited By ~ 2

Author(s):

Lacrimioara Grama ◽

Elena Roxana Buhus ◽

Corneliu Rusu

Keyword(s):

Predictive Coding ◽

Linear Predictive Coding ◽

Detection Systems ◽

Wildlife Detection ◽

Acoustic Classification

Warped Linear Predictive Coding of Speech Signal of Processing

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.34680 ◽

2021 ◽

Vol 9 (5) ◽

pp. 1819-1827

Author(s):

P. Choubey

Keyword(s):

Speech Signal ◽

Predictive Coding ◽

Linear Predictive Coding

Power Efficient Speaker Verification Using Linear Predictive Coding on FPGA

2018 International CET Conference on Control, Communication, and Computing (IC4) ◽

10.1109/cetic4.2018.8530925 ◽

2018 ◽

Author(s):

Amit Sravan Bora ◽

Rohit Reddy ◽

Srinath Satpathy ◽

H. Balachander ◽

V. Vijendra ◽

...

Keyword(s):

Speaker Verification ◽

Predictive Coding ◽

Linear Predictive Coding ◽

Power Efficient

Navigation Security Module with Real-Time Voice Command Recognition System

Polish Maritime Research ◽

10.1515/pomr-2017-0046 ◽

2017 ◽

Vol 24 (2) ◽

pp. 17-26

Author(s):

Mustafa Yagimli ◽

Huseyin Kursat Tezer

Keyword(s):

Real Time ◽

Situational Awareness ◽

Predictive Coding ◽

Recognition System ◽

Time Warping ◽

Linear Predictive Coding ◽

Mel Frequency Cepstral Coefficients ◽

Voice Command ◽

Security Module ◽

Dynamic Time

The real-time voice command recognition system used in this study aims to increase situational awareness, and therefore the safety of navigation, especially during close manoeuvres of warships and the passage of commercial vessels in narrow waters. With the developed system, navigation safety, which is especially important in precision manoeuvres, becomes controllable through voice command recognition software. The system was observed to work with 90.6% accuracy using Mel Frequency Cepstral Coefficients (MFCC) with Dynamic Time Warping (DTW), and with 85.5% accuracy using Linear Predictive Coding (LPC) with DTW.
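
The DTW matching step mentioned in this abstract can be illustrated with a short sketch. It assumes the textbook dynamic-programming formulation with a Euclidean local cost and no path constraints, which may differ from the system described in the paper. The feature vectors would be per-frame MFCC or LPC vectors in practice; here they are toy one-dimensional values, and the names `dtw_distance`, `recognize`, `cmd_x`, and `cmd_y` are hypothetical.

```python
import math

def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j]: best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # stretch seq_b
                                     cost[i][j - 1],      # stretch seq_a
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def recognize(utterance, templates):
    """Return the name of the template with the smallest DTW cost."""
    return min(templates, key=lambda name: dtw_distance(utterance, templates[name]))

# Toy one-dimensional "feature vectors" standing in for MFCC or LPC frames.
templates = {
    "cmd_x": [[0.0], [1.0], [2.0]],
    "cmd_y": [[5.0], [5.0], [5.0]],
}
spoken = [[0.0], [1.0], [1.0], [2.0]]   # cmd_x spoken more slowly
print(recognize(spoken, templates))
```

Because the warping path may repeat frames, an utterance spoken more slowly than its template can still align at zero cost, which is the property that makes DTW useful for isolated command recognition.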
