Voice Input
Recently Published Documents

TOTAL DOCUMENTS: 197 (FIVE YEARS: 62)
H-INDEX: 11 (FIVE YEARS: 1)

Author(s): Dr. Pooja M R, Meghana M, Harshith Bhaskar, Anusha Hulatti, ...

Many people live with disabilities such as deafness, muteness, or blindness, and they face considerable challenges when trying to interact and communicate with others. This paper presents a new technique that offers a virtual solution without the use of any sensors. A Histogram of Oriented Gradients (HOG) feature extractor combined with an Artificial Neural Network (ANN) has been implemented. The user works with a web camera, which captures images of different gestures and passes them to the system for processing. The algorithm recognizes each image and identifies the corresponding voice input. The paper describes a two-way means of communication between impaired and non-impaired people, in that the proposed approach can convert sign language into text and voice.
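The paper does not publish its implementation; purely as an illustration of the HOG-plus-ANN pipeline it describes, the following minimal sketch assumes OpenCV, scikit-image and scikit-learn, plus a pre-collected set of labelled gesture images (X_train, y_train are hypothetical placeholders).

```python
# Illustrative sketch of a HOG + ANN gesture classifier (not the authors' code).
import cv2                                        # webcam capture and preprocessing
from skimage.feature import hog                   # Histogram of Oriented Gradients features
from sklearn.neural_network import MLPClassifier  # simple feed-forward ANN

def hog_features(gray_image):
    """Resize to a fixed size and extract a HOG descriptor."""
    resized = cv2.resize(gray_image, (64, 64))
    return hog(resized, orientations=9,
               pixels_per_cell=(8, 8), cells_per_block=(2, 2))

def train(X_train, y_train):
    """X_train: grayscale gesture images, y_train: their text labels (assumed prepared)."""
    clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500)
    clf.fit([hog_features(img) for img in X_train], y_train)
    return clf

def recognize_from_webcam(clf):
    """Grab one frame from the default web camera and predict its gesture label."""
    cap = cv2.VideoCapture(0)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        return None
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return clf.predict([hog_features(gray)])[0]   # text label; could then be spoken aloud
```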


Informatics, 2021, Vol 18 (4), pp. 40-52
Author(s): S. A. Hetsevich, Dz. A. Dzenisyk, Yu. S. Hetsevich, L. I. Kaigorodova, K. A. Nikalaenka

Objectives. The main goal of the work is research into natural language user interfaces and the development of a prototype of such an interface. The prototype is a bilingual Russian and Belarusian question-and-answer dialogue system. The research into natural language interfaces was conducted in terms of the use of natural language for interaction between a user and a computer system. The main problems here are the ambiguity of natural language and the difficulty of designing natural language interfaces that meet user expectations. Methods. The main principles of modelling natural language user interfaces are considered. As an intelligent system, such an interface consists of a database, a knowledge machine and a user interface. Speech recognition and speech synthesis components make natural language interfaces more convenient from the point of view of usability. Results. A description of the prototype of a natural language interface for a question-and-answer intelligent system is presented. The model of the prototype includes Belarusian and Russian speech-to-text and text-to-speech subsystems, as well as generation of responses in the form of natural language and formal text. An additional component is natural Belarusian and Russian voice input. Some of the data required for human voice recognition are stored as knowledge in the knowledge base or created on the basis of existing knowledge. Another important component is Belarusian and Russian voice output; this component is the main requirement for making the natural language interface more user-friendly. Conclusion. The article presents research into natural language user interfaces, the result of which is the development and description of a prototype natural language interface for an intelligent question-and-answer system.
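To make the described architecture concrete, here is a minimal sketch of a question-and-answer loop with speech input and output. It is not the authors' prototype: the toy keyword knowledge base, the use of the SpeechRecognition and gTTS libraries (Russian only; the paper's own Belarusian/Russian subsystems would take their place), and the file name "reply.mp3" are all assumptions.

```python
# Minimal question-and-answer dialogue loop with speech input/output (illustrative only).
import speech_recognition as sr   # speech-to-text front end (needs a microphone and PyAudio)
from gtts import gTTS             # text-to-speech back end

# Toy knowledge base: keyword -> answer (a real system would query a knowledge machine).
KNOWLEDGE = {
    "погода": "Сегодня ожидается переменная облачность.",
    "время": "Сейчас примерно полдень.",
}

def listen(language="ru-RU"):
    """Record one utterance and return its transcription."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio, language=language)

def answer(question):
    """Look the question up in the toy knowledge base by keyword."""
    for keyword, response in KNOWLEDGE.items():
        if keyword in question.lower():
            return response
    return "Извините, я не знаю ответа на этот вопрос."

def speak(text, lang="ru"):
    """Synthesize the answer to an audio file (play it with any audio player)."""
    gTTS(text=text, lang=lang).save("reply.mp3")

if __name__ == "__main__":
    speak(answer(listen()))
```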


Author(s): K. Satheeshkumar, S. Ayyanar, L. Srinivasaperumal, S. Susi

Today, the Internet is the best place to communicate and share information among people throughout the world, and it gives endless support for knowledge and entertainment. The main objective of Internet technology is to increase efficiency and decrease human effort. With the introduction of the Internet of Things (IoT) in the last decade, ubiquitous computing has been pushed into all spheres of life. Physically challenged people also use the Internet with the help of speech commands (SC). The main objective of this paper is to minimize effort and increase the efficiency of voice-recognition and IoT-based secured automation, and to design a multi-parameter monitoring system using a microcontroller that measures and controls various parameters over a wireless link. These processes are managed using an Arduino and Bluetooth. The parameters that can be tracked are temperature, humidity and fire. Here we design an automatic, voice-based industrial automation system: an Android app sends voice input to the microcontroller, where a Bluetooth module receives the transmitted data. Once the data is received, the code is executed according to the command the controller has received. We can monitor the temperature, humidity, gas and fire values of the industrial site, and preventive action is taken according to these values. The motor, DC fan and light are turned on and off according to the voice input, and the resulting actions are reported to the IoT platform.
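The controller firmware in the paper runs on an Arduino; the sketch below only illustrates, on the host side, how recognized voice commands could be mapped to device actions and how a sensor report could be read back over a serial (Bluetooth SPP) link using pyserial. The port name, baud rate, command strings and the comma-separated sensor line format are assumptions for illustration.

```python
# Host-side illustration of command-to-action mapping over a serial Bluetooth link.
import serial  # pyserial

COMMANDS = {
    "motor on": b"M1\n", "motor off": b"M0\n",
    "fan on":   b"F1\n", "fan off":   b"F0\n",
    "light on": b"L1\n", "light off": b"L0\n",
}

def send_command(spoken_text, port="/dev/rfcomm0", baud=9600):
    """Forward a recognized voice command to the microcontroller."""
    action = COMMANDS.get(spoken_text.strip().lower())
    if action is None:
        return False
    with serial.Serial(port, baud, timeout=1) as link:
        link.write(action)
    return True

def read_sensors(port="/dev/rfcomm0", baud=9600):
    """Read one 'temperature,humidity,gas,fire' line reported by the board."""
    with serial.Serial(port, baud, timeout=1) as link:
        line = link.readline().decode(errors="ignore").strip()
    temperature, humidity, gas, fire = (float(v) for v in line.split(","))
    return {"temperature": temperature, "humidity": humidity,
            "gas": gas, "fire": fire}
```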


2021
Author(s): Chin-Yuan Tsan, Meng-Chun Chen, Jia-Chang Wen, Yi-Chen Wang

We develop a mobile care application that includes tools such as voice input, image upload and image recognition. This procedure will be used in clinical care. The study is expected to undergo actual-use testing on the ward and a questionnaire survey three months after deployment. During use, the mobile phone connection data will be monitored continuously to analyze the number and time of connection records.
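The study does not specify how the connection records are analyzed; as one plausible illustration, the short sketch below aggregates a hypothetical log of (timestamp, duration-in-seconds) rows into the number and total time of connections per day.

```python
# Illustrative aggregation of connection records; the CSV schema is an assumption.
import csv
from collections import defaultdict
from datetime import datetime

def summarize(log_path):
    """Return {date: (connection_count, total_seconds)} from a timestamp,duration CSV."""
    counts = defaultdict(int)
    seconds = defaultdict(float)
    with open(log_path, newline="") as f:
        for timestamp, duration in csv.reader(f):
            day = datetime.fromisoformat(timestamp).date()
            counts[day] += 1
            seconds[day] += float(duration)
    return {day: (counts[day], seconds[day]) for day in counts}
```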


Author(s):  
Harsh Goyal ◽  
Piyush Piyush ◽  
Ravinder Ravinder ◽  
Pooja Gupta

Adverse effects of wrong medication are a major problem worldwide: thousands of people die every year because of incorrect prescriptions. Most of these mistakes are due to illegible handwriting, which leads to taking the wrong medicine or dosage. To address this issue, a voice-based prescription system is proposed in which the prescription is taken as voice input and a PDF file is generated, which is then emailed to the patient. This method can save money and lives throughout the world, particularly in developing countries where prescriptions are generally paper-based. The system proposed in this paper is intended for those doctors and hospitals that still use paper-based handwritten prescriptions. Keywords: Healthcare, Voice-based, Python, Natural Language Processing (NLP), Electronic Prescription, Text Processing, Electronic Health Record (EHR).
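As a sketch of the dictate-to-PDF-to-email pipeline the abstract describes (not the authors' implementation), the example below assumes the SpeechRecognition and fpdf2 libraries for transcription and PDF generation; the SMTP host, sender address and file name are placeholders.

```python
# Sketch: dictate a prescription, write it to a PDF, e-mail the PDF to the patient.
import smtplib
from email.message import EmailMessage
import speech_recognition as sr
from fpdf import FPDF  # fpdf2

def dictate_prescription():
    """Record the doctor's dictation and return it as plain text."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio)

def write_pdf(text, path="prescription.pdf"):
    """Render the prescription text into a simple PDF file."""
    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Helvetica", size=12)
    pdf.multi_cell(0, 10, text)
    pdf.output(path)
    return path

def email_pdf(path, patient_email, sender, password, host="smtp.example.com"):
    """Send the generated PDF to the patient as an attachment."""
    msg = EmailMessage()
    msg["Subject"] = "Your prescription"
    msg["From"], msg["To"] = sender, patient_email
    msg.set_content("Please find your prescription attached.")
    with open(path, "rb") as f:
        msg.add_attachment(f.read(), maintype="application",
                           subtype="pdf", filename="prescription.pdf")
    with smtplib.SMTP_SSL(host, 465) as server:
        server.login(sender, password)
        server.send_message(msg)
```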


Author(s):  
Shubham Jain ◽  
Shreya Joshi ◽  
Ruchi Parashar

This research paper gives a comprehensive view of a virtual assistant named "Alpha", developed using concepts of artificial intelligence to aid in education, the market, business and many other fields. The bot is programmed in Python and runs on a Raspberry Pi, providing a user-friendly environment by allowing the bot to move along with the user. A virtual assistant is an application that takes voice input, processes it and then gives output according to that input. Alpha is a real-time, interactive bot built on the latest technologies. It uses numerous Python libraries to perform various functions that enable the assistant to help its user in day-to-day activities. The assistant can convert text to speech and vice versa using the gTTS and SpeechRecognition libraries respectively. It is a multilingual digital employee that speaks, listens and comprehends over 26 languages. It has other interesting features such as face recognition and registration, smog sensing and alcohol sensing. It is also capable of making payments using its QR-scanner feature.
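The abstract names gTTS and SpeechRecognition as the assistant's speech layer; the minimal loop below shows how those two libraries fit together. The command handling is a toy example, not Alpha's actual skill set, and the output file name is an assumption.

```python
# Minimal voice-assistant loop: listen, interpret a few keywords, reply with speech.
from datetime import datetime
import speech_recognition as sr
from gtts import gTTS

def listen():
    """Capture one utterance from the microphone and transcribe it."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio).lower()

def respond(command):
    """Toy intent handling: a real assistant would dispatch to many skills."""
    if "time" in command:
        return datetime.now().strftime("It is %H:%M.")
    if "hello" in command:
        return "Hello, how can I help you?"
    return "Sorry, I did not understand that."

def speak(text):
    """Synthesize the reply to an audio file (then play it)."""
    gTTS(text=text, lang="en").save("response.mp3")

if __name__ == "__main__":
    speak(respond(listen()))
```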


2021, Vol 9 (1), pp. 11-28
Author(s): Hui Hui Wang

The most popular video website, YouTube, has about 2 billion users worldwide who speak and understand different languages. Subtitles are essential for users to get the message of a video. However, not all video owners provide subtitles for their videos, which makes it difficult for potential audiences to understand the video content. Thus, this study proposes a speech recorder and translator to solve this problem. The general concept of this study was to combine Automatic Speech Recognition (ASR) and translation technologies to recognize the video content and translate it into other languages. This paper compares and discusses three different ASR technologies: Google Cloud Speech-to-Text, Limecraft Transcriber, and VoxSigma. The proposed system uses Google Cloud Speech-to-Text because it supports more languages than Limecraft Transcriber and VoxSigma; it is also more flexible to use with Google Cloud Translation. The paper also includes a questionnaire about the crucial features of the speech recorder and translator, in which a total of 19 university students participated. Most of the respondents stated that high translation accuracy is vital for the proposed system. The paper also discusses a related work on speech recording and translation: a study that compared speech recognition between ordinary voices and speech-impaired voices and used a mobile application to record acoustic voice input. In contrast to that existing mobile app, this project proposes a web application, which makes it a different and new study, especially in terms of development and user experience. The proposed system was developed successfully. The results show that Google Cloud Speech-to-Text and Translation are reliable for video translation; however, the system could not recognize speech when the background music was too loud, and direct (literal) translation remained challenging, so future research may need a custom-trained model. In conclusion, the proposed system contributes a new idea of a web application to solve the language barrier on video-watching platforms.
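A minimal sketch of the recognize-then-translate pipeline the system is built on, using the Google Cloud client libraries. Credentials must be configured via GOOGLE_APPLICATION_CREDENTIALS, and the audio encoding, sample rate, language codes and file name below are example assumptions rather than the study's settings.

```python
# Transcribe an audio clip with Google Cloud Speech-to-Text, then translate the text.
from google.cloud import speech
from google.cloud import translate_v2 as translate

def transcribe(wav_bytes, language_code="en-US"):
    """Return the transcript of a short LINEAR16 WAV clip."""
    client = speech.SpeechClient()
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code=language_code,
    )
    audio = speech.RecognitionAudio(content=wav_bytes)
    response = client.recognize(config=config, audio=audio)
    return " ".join(r.alternatives[0].transcript for r in response.results)

def translate_text(text, target_language="ms"):
    """Translate the transcript into the target language."""
    client = translate.Client()
    return client.translate(text, target_language=target_language)["translatedText"]

# Example usage (hypothetical file name):
# subtitles = translate_text(transcribe(open("clip.wav", "rb").read()))
```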


2021, Vol 16 (22), pp. 189-207
Author(s): Talgat Sembayev, Zhanat Nurbekova, Gulmira Abildinova

A new trend in the development of immersive technologies is augmented reality (AR), which is in demand because of its ability to implement visual objects that enrich learning content. The paper is devoted to studying the applicability of AR technologies for evaluating learning activities, since there is a problem of inconsistency between teaching approaches and tools that leads to biased results. This led to the development of the "AR Quiz" application, which contains interaction types such as touch-based, voice, input-field, gaze and gesture interaction that stimulate activities. In combination with 10 other forms of assessment materials, its field of application has expanded and the tasks for students have diversified. The present study provides the calculation of validity and reliability coefficients for the assessment materials contained in the "AR Quiz" application, reflecting the suitability of the indicators for their purpose and the accuracy and stability of the measurements. The paper reveals positive attitudes of expert teachers and students towards the use of AR when evaluating learning activities. Along with an integration map matching AR interaction types with assessment materials, the paper provides recommendations for teachers on evaluating learning activities based on AR.
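The abstract does not state which reliability coefficient was used; Cronbach's alpha is one common choice, and the short example below shows how such a figure can be computed from a matrix of student scores. It is only an illustration of the general technique, not the paper's calculation.

```python
# Illustrative reliability computation (Cronbach's alpha) for assessment items.
import numpy as np

def cronbach_alpha(scores):
    """scores: 2-D array, rows = students, columns = assessment items."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                             # number of items
    item_variances = scores.var(axis=0, ddof=1)     # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1) # variance of total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
```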


Author(s):  
Hutami Septiana Raswaty ◽  
Nuryuliani Nuryuliani

People who speak different languages need a translator to establish communication between them. One technology developed to meet this communication need is the digital dictionary as a translation tool. Digital dictionaries can translate between languages, but they have a weakness in how input is provided. In this research, Optical Character Recognition (OCR) using the Tesseract library and voice recognition using Google Speech-to-Text are used to replace the previous input method. Based on the implementation and testing, OCR and voice recognition successfully recognized text and voice input, with a similarity of 92.72% for OCR and 95.46% for voice recognition. The implementation is expected to help groups of people with different languages communicate easily.
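For illustration, the sketch below shows the two input paths described (OCR via the pytesseract wrapper for Tesseract, and voice input via the SpeechRecognition library's Google recognizer) together with a string-similarity score of the kind reported. The choice of difflib's sequence-matcher ratio as the similarity metric is an assumption, not necessarily the measure used in the paper.

```python
# Sketch of OCR and voice input paths plus a percentage-similarity check.
from difflib import SequenceMatcher
import pytesseract              # Tesseract OCR wrapper
from PIL import Image
import speech_recognition as sr

def text_from_image(image_path, lang="eng"):
    """Recognize printed text in an image with Tesseract."""
    return pytesseract.image_to_string(Image.open(image_path), lang=lang)

def text_from_voice(language="en-US"):
    """Record one utterance and transcribe it with the Google recognizer."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio, language=language)

def similarity(recognized, reference):
    """Percentage similarity between recognized text and a reference text."""
    return 100 * SequenceMatcher(None, recognized, reference).ratio()
```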

