Implementation of Voice Recognition Via CNN and LSTM

Voice recognition systems rely heavily on CNNs because of their strong ability to recognize and classify targets. CNNs, however, have a drawback: the larger the target to be recognized, the higher the computational cost. In this paper, we address this problem through MFCC feature extraction and a model combining CNN and LSTM, demonstrating the possibility of performing voice recognition even on low-cost devices.
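As a rough illustration of this approach (not the authors' exact architecture), the sketch below extracts MFCCs with librosa and feeds them to a small CNN+LSTM stack in Keras; the layer sizes, sample rate, and number of classes are assumptions, not values from the paper.

```python
# Minimal sketch of MFCC feature extraction feeding a CNN+LSTM classifier.
# Layer sizes, the sample rate, and num_classes are illustrative assumptions.
import librosa
import numpy as np
from tensorflow.keras import layers, models

def extract_mfcc(path, n_mfcc=13):
    """Load an audio file and return its MFCC matrix (time_steps x coefficients)."""
    signal, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.T  # shape: (time_steps, n_mfcc)

def build_cnn_lstm(time_steps, n_mfcc, num_classes=10):
    """1-D convolutions summarize local spectral patterns cheaply; the LSTM
    then models the temporal sequence of those pooled features."""
    model = models.Sequential([
        layers.Input(shape=(time_steps, n_mfcc)),
        layers.Conv1D(32, kernel_size=3, activation="relu", padding="same"),
        layers.MaxPooling1D(2),   # halves the sequence length, cutting compute
        layers.Conv1D(64, kernel_size=3, activation="relu", padding="same"),
        layers.MaxPooling1D(2),
        layers.LSTM(64),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

The pooling layers are what make this pairing attractive on cheap hardware: the convolutions shrink the sequence before the comparatively expensive LSTM sees it.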

Author(s):  
Basavaraj N Hiremath ◽  
Malini M Patil

The voice recognition system is about cognizing signals through feature extraction and identification of related parameters; the whole process is referred to as voice analytics. The paper aims at analysing and synthesizing the phonetics of voice using a computer program called "PRAAT". The work also covers voice segmentation labelling, analysis of unique voice cues, and the physics of voice, and the process is further extended to recognize sarcasm. The unique features identified in the work are intensity, pitch, and formants, related to read, spoken, interactive, and declarative sentences, analysed using principal component analysis.
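Measurements of this kind are scriptable from Python through Parselmouth, a library that wraps Praat; the sketch below pulls the cue types the abstract lists (pitch, intensity, formants), which could then be fed to PCA as in the paper. The file name is a placeholder, and this is not the authors' script.

```python
# Sketch: extracting the voice cues named above (pitch, intensity, formants)
# with Parselmouth, a Python interface to Praat. "speech.wav" is a placeholder.
import parselmouth

snd = parselmouth.Sound("speech.wav")

pitch = snd.to_pitch()
f0 = pitch.selected_array["frequency"]    # Hz per analysis frame; 0 = unvoiced
mean_f0 = f0[f0 > 0].mean()

intensity = snd.to_intensity()
mean_intensity = intensity.values.mean()  # dB

formants = snd.to_formant_burg()
t_mid = snd.duration / 2
f1 = formants.get_value_at_time(1, t_mid)  # first formant at the midpoint
f2 = formants.get_value_at_time(2, t_mid)  # second formant at the midpoint

print(f"F0 {mean_f0:.1f} Hz, intensity {mean_intensity:.1f} dB, "
      f"F1 {f1:.0f} Hz, F2 {f2:.0f} Hz")
```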


2019 ◽  
Vol 12 (3) ◽  
pp. 22-28
Author(s):  
Jinan N. Shehab

Home automation has become important because it gives the user a convenient and easy way to operate home appliances. This paper aims to help people with special needs, physical disabilities, or paralysis injuries to control any device through infrared technology using voice commands. The system, built around a voice recognition unit (V3), can recognize voice commands, convert them to the desired data format, and transmit the data via an IR transmitter driven by a microcontroller (Arduino Uno). The signal is received by the IR sensor of a TV receiver, yielding a full remote control that works by voice commands. The microcontroller software is written in the Micro C language. The system is low-cost and flexible, supporting a growing variety of controllable devices.
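The paper's Micro C firmware is not reproduced here, but the control flow it describes (recognized command in, IR code out) can be sketched on a host machine. The Python/pySerial sketch below is purely illustrative: the port name, baud rate, and command-to-code table are all hypothetical, and the real system implements this mapping on the Arduino itself.

```python
# Illustrative sketch only: maps recognized voice commands to IR codes and
# forwards them over a serial link. The port name, baud rate, and code table
# are hypothetical; the paper implements this logic in Micro C on the Arduino.
import serial

# Hypothetical mapping of recognized commands to NEC-style IR codes.
IR_CODES = {
    "power":       0x20DF10EF,
    "volume up":   0x20DF40BF,
    "volume down": 0x20DFC03F,
    "channel up":  0x20DF00FF,
}

def send_command(port: serial.Serial, command: str) -> bool:
    """Look up the command and write its IR code as 4 bytes; False if unknown."""
    code = IR_CODES.get(command)
    if code is None:
        return False
    port.write(code.to_bytes(4, "big"))
    return True

if __name__ == "__main__":
    with serial.Serial("/dev/ttyUSB0", 9600, timeout=1) as arduino:
        send_command(arduino, "volume up")
```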


1986 ◽  
Vol 30 (7) ◽  
pp. 638-641
Author(s):  
John P. Zenyuh ◽  
John M. Reising

The objective of this study was to compare the relative effectiveness of three modes of subsystem control: a voice recognition system with visual feedback presented on the head-up display, a standard multifunction control device with tailored switching logic, and a remotely operated multifunction control with feedback presented on the head-up display. Comparisons were based on measures of interference with a loading task and overall speed and accuracy of the control operations performed. The working hypothesis was that the voice system and head-up multifunction control would manifest substantially lower interference with the primary task, while subsystem control operation times would remain unaffected by control mode. The results indicate that performance with the remote touch panel was significantly poorer than with the voice or standard multifunction control systems.


In this paper, text-dependent and text-independent speaker identification systems were considered. Feature extraction was performed using mel-frequency cepstral coefficients (MFCC). The vector quantization (VQ) method for automatic identification of a person by voice was investigated. Using the extracted features, a codebook for each speaker was built by clustering the feature vectors, and speakers were modeled by these codebooks; the codebooks of all speakers were collected in a database. From the results, it can be said that vector quantization over cepstral features produces good results for building a voice recognition system.
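A minimal version of this codebook pipeline can be assembled from standard tools: MFCC extraction, per-speaker k-means clustering, and identification by lowest average quantization distortion. The sketch below assumes librosa and scikit-learn; the codebook size and file paths are placeholders, not values from the paper.

```python
# Sketch of VQ-based speaker identification: one k-means codebook per speaker,
# identification by lowest average quantization distortion. The codebook size
# and file paths are illustrative assumptions.
import numpy as np
import librosa
from sklearn.cluster import KMeans

def mfcc_frames(path):
    """MFCC feature vectors for one utterance, shape (frames, 13)."""
    y, sr = librosa.load(path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T

def train_codebook(paths, codebook_size=32):
    """Cluster all of a speaker's MFCC frames; the centroids form the codebook."""
    frames = np.vstack([mfcc_frames(p) for p in paths])
    return KMeans(n_clusters=codebook_size, n_init=10).fit(frames).cluster_centers_

def distortion(frames, codebook):
    """Mean distance from each frame to its nearest codeword."""
    dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    return dists.min(axis=1).mean()

def identify(path, codebooks):
    """codebooks: dict speaker -> codebook. Lowest distortion wins."""
    frames = mfcc_frames(path)
    return min(codebooks, key=lambda spk: distortion(frames, codebooks[spk]))
```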


AVITEC ◽  
2019 ◽  
Vol 1 (1) ◽  
Author(s):  
Noor Fita Indri Prayoga

Voice is one way to communicate and express yourself. Speaker recognition is a process carried out by a device to recognize a speaker through the voice. This study designed a speaker recognition system, implemented in MATLAB, that identifies speakers based on what they say, using the dynamic time warping (DTW) method. The design begins with the processing of reference data and test data. Both follow the same steps: sound recording, preprocessing, and feature extraction. In this system, the Fast Fourier Transform (FFT) is used to extract the features. The feature sets from the two data are then compared using DTW, and the comparison producing the smallest value is taken as the output. Test results show that the system can identify voices with a best recognition accuracy of 90% and an average recognition accuracy of 80%. The results were obtained from 50 tests, carried out by 5 people (3 men and 2 women), each speaker saying a predetermined word.
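The matching stage can be sketched compactly: FFT-magnitude frames for the reference and test utterances are compared with the textbook DTW recurrence, and the smallest cumulative distance wins. The frame length, hop size, and sample rate below are assumptions; the paper's MATLAB implementation is not reproduced.

```python
# Sketch of the matching stage: FFT-magnitude frames compared with the
# textbook DTW recurrence; the reference with the smallest cumulative
# distance is the output. Frame length, hop, and sample rate are assumptions.
import numpy as np
import librosa

def fft_features(path, n_fft=512, hop=256):
    """Magnitude spectrogram frames, shape (frames, frequency_bins)."""
    y, _ = librosa.load(path, sr=16000)
    return np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop)).T

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping with Euclidean frame cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def recognize(test_path, references):
    """references: dict label -> reference wav path; smallest DTW value wins."""
    test = fft_features(test_path)
    return min(references,
               key=lambda lbl: dtw_distance(test, fft_features(references[lbl])))
```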


Author(s):  
Vishakha Patil

Elevators have over time become an important part of our day-to-day life, used as everyday transport devices for moving goods as well as people. In the modern world, cities and crowded areas require multi-storey buildings, and under wheelchair-access laws, elevators/lifts are a mandatory requirement in new multi-storey buildings. The main purpose of this project is to operate an elevator by voice command, which could help people with disabilities or of short stature travel from one floor to another without the help of any other person. A microcontroller is used to control the different devices and integrate each module, namely the voice module, the motor module, and an LCD; the LCD displays the present status of the lift. The leading edge of our project is the voice recognition system, which generates exceptional results when recognizing speech.


Human voice recognition by computers has been an ever-developing area since 1952. It is a challenging task for a computer to understand and act according to human voice rather than commands or programs. The reason is that no two humans' voices, styles, or pitches are alike, and not every word is pronounced by everyone in the same fashion. Background noises and disturbances may confuse the system, and the voice or accent of the same person may change with mood, situation, time, etc. Despite all these challenges, voice recognition and speech-to-text conversion have reached a successful stage, yet voice processing technology still deserves more research. As a tip of the iceberg of this research, we contribute our work in this area and propose a new method, VRSML (Voice Recognition System through Machine Learning), which mainly focuses on speech-to-text conversion and then on analysing the text extracted from speech, in the form of tokens, through machine learning. After analysing the derived text, reports are created in textual as well as graphical format to represent the vocabulary levels used in that speech. As a supervised machine learning algorithm is employed to classify the tokens derived from the text, the reports are more accurate and are generated faster.
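VRSML is not specified in enough detail to reproduce, but the shape of the pipeline (tokenize the transcript, classify tokens with a supervised model, report vocabulary levels) can be sketched with scikit-learn. Everything below, including the "basic"/"advanced" labels and the tiny training vocabulary, is invented for illustration.

```python
# Illustrative sketch of a VRSML-style pipeline: tokenize a transcript,
# classify each token's vocabulary level with a supervised model, and count
# levels for a textual report. Training words and labels are invented.
from collections import Counter
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labelled vocabulary for training the token classifier.
train_words  = ["walk", "eat", "house", "paradigm", "ubiquitous", "ephemeral"]
train_levels = ["basic", "basic", "basic", "advanced", "advanced", "advanced"]

# Character n-grams let the model generalize to words it has never seen.
clf = make_pipeline(
    CountVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    MultinomialNB(),
).fit(train_words, train_levels)

transcript = "the ubiquitous devices walk us toward a new paradigm"
tokens = transcript.split()
levels = Counter(clf.predict(tokens))
print(levels)  # vocabulary-level counts for the textual report
```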


1987 ◽  
Vol 31 (4) ◽  
pp. 424-427
Author(s):  
Christian P. Skriver

This report presents the results of an experiment that measured performance in a simulated ASW message entry task with two modes of data input—vocal and manual. The subjects (Ss) were 12 Naval enlisted men. The independent variable was message data entry mode—vocal or manual. The dependent variables were: time to enter 20 lines of text, data entry errors that were corrected by the Ss, and errors that remained undetected. All Ss were trained to use the voice recognition system with a 100 word vocabulary set. The task was for the S to read one line of message text from a display and then re-enter the text below the displayed text via either voice recognizer or keyboard until 20 lines of text had been entered. Keyboard entry was found to be slightly faster (11%) than voice recognition input. While the number of initial errors (corrected) in the vocal input mode was over three times greater than the number for manual input, the remaining input errors (uncorrected) were about the same.


Generally, in hospitals the dental chair is moved forward/backward or upward/downward according to the treatment of the patient, and this movement is operated by a human. Sometimes the chair does not function properly because of piston rust or an overweight patient, and the dentist may suffer leg pain from continuously operating the chair. To overcome these issues, we design a voice-recognition dental chair for doctors in hospitals. This project describes the design of a smart, motorized, voice-controlled dental chair. The voice command is given by the dentist; a sensor captures the voice and sends the command to an Arduino, where it is converted to a string that drives the movement of the chair. The intelligent dental chair is designed so that it can be controlled easily by the doctor, with the added advantage of a low-cost design. The system was designed and developed to save the doctor's energy and time.


Author(s):  
Mohammad Shahrul Izham Sharifuddin ◽  
Sharifalillah Nordin ◽  
Azliza Mohd Ali

In this paper, we develop an intelligent wheelchair using CNN and SVM voice recognition methods. The data were collected from Google, and some were self-recorded. Four commands are to be recognized: go, left, right, and stop. Voice data are extracted using the MFCC feature extraction technique, and CNNs and SVM are then used to classify and recognize the voice data. A motor driver attached to a Raspberry Pi 3B+ controls the movement of the wheelchair prototype. The CNN produced higher accuracy, 95.30%, compared to only 72.39% for the SVM. On the other hand, the SVM took only 8.21 seconds to execute, while the CNN took 250.03 seconds. The CNN therefore produces better results, because noise is filtered in the feature extraction layers before classification in the classification layer, but it takes longer because of the complexity of the network; the lower complexity of the SVM implementation gives a shorter processing time.
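The SVM side of this comparison can be sketched with librosa and scikit-learn; the file paths, sample rate, and train/test split below are placeholders, not the authors' dataset, and the CNN branch would follow the pattern of the first sketch in this listing.

```python
# Sketch of the SVM branch of the comparison: mean MFCC vectors for the four
# commands ("go", "left", "right", "stop") classified with an RBF-kernel SVM.
# File paths, the sample rate, and the split are placeholders.
import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

COMMANDS = ["go", "left", "right", "stop"]

def clip_features(path, n_mfcc=13):
    """Mean MFCC vector over the clip: a fixed-length summary an SVM can use."""
    y, sr = librosa.load(path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

def train_svm(paths, labels):
    """paths: wav files; labels: one of COMMANDS per file."""
    X = np.stack([clip_features(p) for p in paths])
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, labels, test_size=0.2, stratify=labels)
    svm = SVC(kernel="rbf").fit(X_tr, y_tr)
    print(f"hold-out accuracy: {svm.score(X_te, y_te):.2%}")
    return svm
```

Collapsing each clip to a single mean MFCC vector is what keeps the SVM fast, and it is also why it discards temporal detail the CNN can exploit, consistent with the accuracy/latency trade-off reported above.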

