A Smart Home System Based on Speech Recognition Technology

2015 ◽  
Vol 713-715 ◽  
pp. 2123-2125 ◽  
Author(s):  
Wei Li ◽  
Bai Hui Cui ◽  
Fa Wei Zhang ◽  
Xing Guo

In order to enhance the self-care ability of persons with disabilities and satisfy the demand for intelligent control of home appliances, a smart home system based on Microsoft speech synthesis and speech recognition technology is proposed. After initialization, the system receives voice commands sent by users; once it has distinguished the voice signal, it calls the voice feedback module and asks the user to confirm the instruction. After confirmation, the system records the voice command and converts it into an electrical control code that can be recognized by the control systems of ordinary household appliances. Recognition of one voice command took less than 30 ms and one complete speech interaction took less than 3 seconds, showing that appliances can be controlled simply and efficiently with speech recognition technology.
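As a rough illustration of the interaction loop described above, here is a minimal Python sketch of the confirm-then-execute flow; the recognize/speak/listen callables stand in for the Microsoft speech SDK calls, and the control codes are invented placeholders, not the paper's actual encoding.

```python
# Hedged sketch of the confirm-then-execute loop described in the abstract.
# recognize(), speak(), and listen() stand in for the Microsoft speech SDK;
# the control codes below are made-up placeholders.

CONTROL_CODES = {
    "light on":  b"\x01\x01",   # hypothetical code for the lamp controller
    "light off": b"\x01\x00",
    "fan on":    b"\x02\x01",
}

def handle_command(recognize, speak, listen, send_to_appliance):
    command = recognize(listen()).strip().lower()   # speech -> text
    speak(f"Did you say: {command}?")               # voice feedback module
    if recognize(listen()).strip().lower().startswith("yes"):
        code = CONTROL_CODES.get(command)
        if code is not None:
            send_to_appliance(code)                 # electrical control code
```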

2014 ◽  
Vol 596 ◽  
pp. 384-387 ◽
Author(s):  
Ge Liu ◽  
Hai Bing Zhang

This paper introduces the concept of the Voice Assistant, the providers of voice recognition services, and several typical Voice Assistant products. It then describes the basic working process of a Voice Assistant in detail and identifies the technical bottlenecks in the development of Voice Assistant software.


Author(s):  
A. SUBASH CHANDAR ◽  
S. SURIYANARAYANAN ◽  
M. MANIKANDAN

This paper proposes a method of speech recognition using Self-Organizing Maps (SOM) and actuation over a network in Matlab. The words spoken by the user at the client end are captured and filtered using the Least Mean Square (LMS) algorithm to remove acoustic noise. The FFT of the filtered voice signal is computed, the resulting spectrum is recognized by a trained SOM, and the corresponding label is sent to the server PC. Client-server communication is established using the User Datagram Protocol (UDP). A microcontroller (AT89S52) controls the speed of the actuator according to the input it receives from the client. Real-time operation of the prototype system has been verified with successful speech recognition, transmission, reception, and actuation over the network.
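The paper's implementation is in Matlab; the following Python sketch only illustrates the client-side pipeline it describes (LMS noise filtering, FFT, and shipping the recognized label over UDP). The SOM classifier is omitted, and `classify_spectrum`, the host address, and the port are assumptions.

```python
# Minimal sketch of the client-side pipeline: LMS noise filtering, FFT
# spectrum, then sending the recognized label over UDP. The trained SOM
# itself is omitted; classify_spectrum is a hypothetical stand-in.
import socket
import numpy as np

def lms_denoise(noisy, noise_ref, taps=32, mu=0.01):
    """Adaptive LMS filter: subtract the part of `noisy` correlated
    with the noise reference; the error signal is the cleaned speech."""
    w = np.zeros(taps)
    clean = np.zeros(len(noisy))
    for n in range(taps, len(noisy)):
        x = noise_ref[n - taps:n][::-1]   # most recent reference samples
        y = np.dot(w, x)                  # estimated noise component
        e = noisy[n] - y                  # cleaned sample
        w += 2 * mu * e * x               # LMS weight update
        clean[n] = e
    return clean

def send_label(label, host="192.168.1.10", port=5005):
    """Ship the recognized word label to the server PC over UDP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(label.encode(), (host, port))

# spectrum = np.abs(np.fft.rfft(lms_denoise(mic, ref)))
# send_label(classify_spectrum(spectrum))   # SOM lookup, not shown
```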


Author(s):  
Geetha V. ◽  
Gomathy C K ◽  
Manasa Sri Vardhan Kottamasu ◽  
Nukala Pavan Kumar

Personal assistants, also known as conversational interfaces or chatbots, reinvent the way individuals interact with computers. A personal virtual assistant allows a user to simply ask questions in the same manner they would address a human, and can even perform basic tasks such as opening apps, reading out news, or taking notes with just a voice command. Personal assistants like Google Assistant, Alexa, and Siri work through speech recognition (speech-to-text) and text-to-speech.
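To make the speech-to-text/text-to-speech loop concrete, here is a hedged sketch using the open-source SpeechRecognition and pyttsx3 Python packages; this is not how Google Assistant, Alexa, or Siri are actually implemented, just one way to reproduce the same flow.

```python
# Hedged sketch of an STT -> action -> TTS loop using the open-source
# SpeechRecognition and pyttsx3 packages (not the commercial assistants'
# internals). The "news" action is a placeholder.
import speech_recognition as sr
import pyttsx3

recognizer = sr.Recognizer()
tts = pyttsx3.init()

def listen_once():
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio)   # speech-to-text

def reply(text):
    tts.say(text)                               # text-to-speech
    tts.runAndWait()

command = listen_once()
if "news" in command.lower():
    reply("Here are today's headlines.")        # placeholder action
```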


Proceedings ◽  
2019 ◽  
Vol 31 (1) ◽  
pp. 54 ◽
Author(s):  
Benítez-Guijarro ◽  
Callejas ◽  
Noguera ◽  
Benghazi

Devices with oral interfaces are enabling interesting new interaction scenarios in ambient intelligence settings. The use of several such devices in the same environment opens up the possibility to compare the inputs gathered from each one of them and perform a more accurate recognition and processing of user speech. However, the combination of multiple devices presents coordination challenges, as the processing of one voice signal by different speech processing units may result in conflicting outputs, and it is necessary to decide which is the most reliable source. This paper presents an approach to rank several sources of spoken input in multi-device environments in order to give preference to the input with the highest estimated quality. The voice signals received by the multiple devices are assessed in terms of their calculated acoustic quality and the reliability of the speech recognition hypotheses produced. After this assessment, each input is assigned a unique score that allows the audio sources to be ranked so as to pick the best to be processed by the system. In order to validate this approach, we have performed an evaluation using a corpus of 4608 audios recorded in a two-room intelligent environment with 24 microphones. The experimental results show that our ranking approach makes it possible to successfully orchestrate an increasing number of acoustic inputs, obtaining better recognition rates than considering a single input, both in clear and noisy settings.
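A minimal sketch of the ranking idea, under assumptions: each device's input is scored by a crude SNR proxy for acoustic quality plus the recognizer's confidence, and the combination weights are illustrative, not the paper's exact formulation.

```python
# Hedged sketch of scoring and ranking multiple spoken inputs. The SNR
# proxy, noise floor, and weights are assumptions, not the paper's model.
import numpy as np

def snr_estimate(samples, noise_floor=1e-4):
    """Crude acoustic-quality proxy: signal power over a noise floor, in dB."""
    power = np.mean(np.square(samples))
    return 10 * np.log10(power / noise_floor)

def rank_inputs(inputs, w_quality=0.5, w_confidence=0.5):
    """inputs: list of (device_id, samples, asr_confidence).
    Returns the list sorted best-first by the combined score."""
    def score(item):
        _, samples, conf = item
        return w_quality * snr_estimate(samples) + w_confidence * conf
    return sorted(inputs, key=score, reverse=True)

# best_device, best_audio, _ = rank_inputs(captures)[0]  # pick the top source
```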


2015 ◽  
Vol 734 ◽  
pp. 369-374 ◽  
Author(s):  
Ping Qian ◽  
Ying Zhen Zhang ◽  
Yu Li

The application of embedded speech recognition technology in the smart home is studied and, combined with the Internet of Things, a voice control system for the smart home has been designed. The core processor is the high-performance Cortex-M4 MCU STM32F407VGT6 produced by STMicroelectronics. The system contains a hardware unit based on the LD3320 for speaker-independent speech recognition. RF wireless communication uses the ultra-low-power CC1101 chip, and GSM communication uses the SIM900A. The real-time operating system FreeRTOS handles multitask scheduling and the operation of household devices. Practical application verifies that this voice control system can identify voice commands quickly and accurately and complete the control actions reliably, and that it has wide application prospects.
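The following Python sketch illustrates only the control flow, not the embedded implementation: a keyword table of the kind a speaker-independent recognizer such as the LD3320 is loaded with, dispatched to device actions. The keywords and the rf_send/gsm_send stand-ins for the CC1101 and SIM900A drivers are assumptions.

```python
# Hedged sketch of the dispatch logic only; the real system runs as
# FreeRTOS tasks on the STM32. rf_send and gsm_send are hypothetical
# stand-ins for the CC1101 and SIM900A driver calls.

KEYWORDS = {
    "light on":  ("light", "on"),    # recognized keyword -> device, action
    "light off": ("light", "off"),
    "open door": ("door", "open"),
}

def on_recognized(keyword, rf_send, gsm_send):
    target = KEYWORDS.get(keyword)
    if target is None:
        return                              # unrecognized keyword: ignore
    device, action = target
    rf_send(device, action)                 # RF link to the appliance node
    gsm_send(f"{device} {action}")          # SMS notification to the user
```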


2021 ◽  
Author(s):  
Monika Gupta ◽  
R K Singh ◽  
Sachin Singh

The COVID-19 pandemic has moved much of daily life online. People tired of typing prefer to give voice commands, yet most voice-based applications and devices are not prepared to handle native languages. Moreover, in a party environment it is difficult to identify a voice command because many speakers talk at once. The proposed work addresses the cocktail party problem for the Indian language Gujarati. Voice response systems such as Siri, Alexa, and Google Assistant currently work on a single voice command; the proposed algorithm, G-Cocktail, would help these applications identify a command given in Gujarati even within a mixed voice signal. The benchmark dataset, comprising single words and phrases, is taken from Microsoft and the Linguistic Data Consortium for Indian Languages (LDC-IL). G-Cocktail uses the CatBoost algorithm to classify and identify the voice. A voiceprint of every sound file is created using pitch and Mel-Frequency Cepstral Coefficients (MFCC); seventy percent of the voiceprints are used to train the network and thirty percent for testing. The proposed work is tested and compared with K-means, Naïve Bayes, and LightGBM.
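A minimal sketch of the feature and classifier stage as the abstract describes it: pitch plus MFCC voiceprints with a 70/30 split fed to CatBoost. The librosa calls, sample rate, and pitch range are assumptions, and the corpus paths and labels are placeholders.

```python
# Hedged sketch of the voiceprint + CatBoost stage. Sample rate, pitch
# range, and feature sizes are assumptions, not the paper's values.
import numpy as np
import librosa
from catboost import CatBoostClassifier
from sklearn.model_selection import train_test_split

def voiceprint(path):
    """Fixed-length feature vector: 13 mean MFCCs plus mean pitch."""
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)   # frame-wise pitch
    return np.concatenate([mfcc, [f0.mean()]])

# wav_paths, labels = ...  # corpus file paths and their word labels
# X = np.array([voiceprint(p) for p in wav_paths])
# X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3)  # 70/30
# model = CatBoostClassifier(verbose=False).fit(X_tr, y_tr)
# print("test accuracy:", model.score(X_te, y_te))
```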


Speech control is now one of the most important features of a smart home. In this paper, we propose a voice command module that enables hands-free interaction between the user and the smart home. We present the three main components required for simple and efficient control of smart home devices: the wake-up-word component triggers the actual speech command processing; the voice recognition component maps the spoken command to text; and the Voice Control Interface converts that text into an appropriate JSON format for home automation, as sketched below. We evaluate the possibilities of using a voice control module in the smart home by analyzing each component of the module separately.
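As a concrete example of the last stage, here is a hedged sketch of mapping recognized text to a JSON payload for the home-automation layer; the JSON schema and the command grammar are assumptions, since the paper does not specify its exact format.

```python
# Hedged sketch of the Voice Control Interface stage: recognized text
# -> JSON payload. The schema and grammar here are assumptions.
import json
import re

def command_to_json(text):
    """Parse e.g. 'turn on the kitchen light' into a device/action payload."""
    m = re.search(r"turn (on|off) the (\w+) (\w+)", text.lower())
    if m is None:
        return None
    action, room, device = m.groups()
    return json.dumps({"device": device, "room": room, "state": action})

print(command_to_json("Turn on the kitchen light"))
# {"device": "light", "room": "kitchen", "state": "on"}
```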


Author(s):  
Sohan Singh ◽ Anupam Lakhanpal ◽ Shashwat Shukla ◽ Srishti Sinha

“Jarvis” was the intelligent life assistant of Tony Stark in the Iron Man movies. Unlike the original comics, in which Jarvis was Stark's human butler, the movie version of Jarvis is an intelligent computer that converses with Stark, monitors his household, and helps build and program his superhero suit. In this project, Jarvis is a digital life assistant that uses human communication channels such as Twitter, instant messaging, and voice to create a two-way connection between a human and his apartment: controlling lights and appliances, assisting in cooking, announcing breaking news and Facebook notifications, and much more. Since we mainly use voice as the means of communication, Jarvis is essentially a speech recognition application. Speech technology really encompasses two technologies: synthesis and recognition. A speech synthesizer takes text as input and produces an audio stream as output; a speech recognizer does the opposite, taking an audio stream as input and turning it into a text transcription. The voice is a signal carrying a great deal of information, and directly analyzing and synthesizing the complex voice signal is difficult precisely because of how much information it contains. Therefore, digital signal processing steps such as feature extraction and feature matching are introduced to represent the voice signal. In this project we use a speech engine whose feature extraction technique is the mel-scaled frequency cepstrum. The mel-scaled frequency cepstral coefficients (MFCCs), derived from the Fourier transform and filter bank analysis, are perhaps the most widely used front end in state-of-the-art speech recognition systems, as sketched below. Our aim is to create more and more functionality that helps people in their daily lives and reduces their effort. In our tests we checked that all of this functionality works properly, testing with 2 speakers (1 female and 1 male) for accuracy.
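To make the front end concrete, here is a compact sketch of how MFCCs are computed from the Fourier transform and a mel filter bank, as the abstract describes; the frame size, filter count, and coefficient count are common defaults, not values from the project.

```python
# Hedged sketch of an MFCC front end: windowed FFT power spectrum,
# triangular mel filter bank, log, then DCT. Parameter values are
# common defaults, not the project's settings.
import numpy as np
from scipy.fft import dct

def hz_to_mel(f):
    return 2595 * np.log10(1 + f / 700)

def mel_to_hz(m):
    return 700 * (10 ** (m / 2595) - 1)

def mfcc_frame(frame, sr=16000, n_filters=26, n_ceps=13):
    """MFCCs for one frame of audio samples."""
    spectrum = np.abs(np.fft.rfft(frame * np.hamming(len(frame)))) ** 2
    # Triangular mel filters spaced evenly in mel between 0 Hz and Nyquist
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((len(frame) + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, len(spectrum)))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)  # rising edge
        fbank[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)  # falling edge
    log_energy = np.log(fbank @ spectrum + 1e-10)
    return dct(log_energy, norm="ortho")[:n_ceps]   # cepstral coefficients

coeffs = mfcc_frame(np.random.randn(400))   # 25 ms frame at 16 kHz
```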

