voice recognition
Recently Published Documents





Youssef Elfahm ◽  
Nesrine Abajaddi ◽  
Badia Mounir ◽  
Laila Elmaazouzi ◽  
Ilham Mounir ◽  

Many technology systems use voice recognition applications to transcribe a speaker’s speech into text that these systems can process. One of the most complex tasks in speech identification is determining which acoustic cues to use to classify sounds. This study presents an approach for characterizing Arabic fricative consonants in two groups (sibilant and non-sibilant). From an acoustic point of view, our approach is based on analyzing the energy distribution, in frequency bands, in a consonant-vowel syllable. From a practical point of view, our technique was implemented in MATLAB and tested on a corpus built in our laboratory. The results show that the percentage energy distribution in a speech signal is a very powerful parameter for classifying Arabic fricatives. We obtained an accuracy of 92% for the non-sibilant consonants /f, χ, ɣ, ʕ, ħ, and h/, 84% for the sibilants /s, sˤ, z, ʒ, and ʃ/, and an overall classification rate of 89%. Compared with other algorithms based on neural networks and support vector machines (SVM), our classification system provided a higher classification rate.
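The percentage energy distribution across frequency bands described in this abstract can be illustrated with a minimal sketch. The authors worked in MATLAB and the abstract does not give their band edges or framing, so the function name `band_energy_percentages`, the band edges, and the synthetic test frame below are all assumptions made for illustration only.

```python
import cmath
import math


def band_energy_percentages(signal, sample_rate, band_edges_hz):
    """Return the percentage of spectral energy that falls in each frequency band."""
    n = len(signal)
    half = n // 2  # keep only the bins below the Nyquist frequency
    # Naive DFT power spectrum (O(n^2)); adequate for one short illustrative frame.
    power = []
    for k in range(half):
        coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power.append(abs(coeff) ** 2)

    energies = [0.0] * (len(band_edges_hz) - 1)
    for k, p in enumerate(power):
        freq = k * sample_rate / n
        for b in range(len(energies)):
            if band_edges_hz[b] <= freq < band_edges_hz[b + 1]:
                energies[b] += p
                break
    total = sum(energies)
    return [100.0 * e / total if total else 0.0 for e in energies]


# Synthetic frame (assumption): a strong 3 kHz tone plus a weak 500 Hz tone at 8 kHz sampling.
sample_rate = 8000
frame = [
    math.sin(2 * math.pi * 3000 * t / sample_rate)
    + 0.1 * math.sin(2 * math.pi * 500 * t / sample_rate)
    for t in range(256)
]
# Hypothetical band edges in Hz; the abstract does not specify the bands used.
percentages = band_energy_percentages(frame, sample_rate, [0, 1000, 2000, 4000])
```

In this toy frame almost all of the energy lands in the highest band, which is the kind of contrast a band-energy feature would expose; a real classifier would compute such percentages on recorded consonant-vowel syllables.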

Song Li ◽  
Mustafa Ozkan Yerebakan ◽  
Yue Luo ◽  
Ben Amaba ◽  
William Swope ◽  

Voice recognition has become an integral part of our lives, commonly used in call centers and as part of virtual assistants. However, voice recognition is increasingly applied to industrial uses. Each of these use cases has unique characteristics that may affect the effectiveness of voice recognition, which in turn could affect industrial productivity, performance, or even safety. One of the most prominent is the background noise that dominates each industry, to which different machinery and different work layouts are the primary contributors. Another important characteristic is the type of communication present in these settings. Daily communication often involves longer sentences uttered under relatively quiet conditions, whereas communication in industrial settings is often short and conducted in loud conditions. In this study, we demonstrated the importance of taking these two elements into account by comparing the performance of two voice recognition algorithms under several background noise conditions: a regular Convolutional Neural Network (CNN)-based voice recognition algorithm and an Automatic Speech Recognition (ASR)-based model with a denoising module. Our results indicate a significant performance drop between the typical background noise used (white noise) and the other background noises. In addition, our custom ASR model with the denoising module outperformed the CNN-based model, with an overall performance increase of 14-35% across all background noises. Both results indicate that specialized voice recognition algorithms need to be developed for these environments before they can be reliably deployed as control mechanisms.
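The abstract does not say how the background noises were combined with speech. One common way to build such test conditions is to scale a noise recording to a chosen signal-to-noise ratio (SNR) before adding it to clean speech. The sketch below assumes that approach; `mix_at_snr`, `measured_snr_db`, and the synthetic signals are hypothetical names and data, not taken from the study.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it."""
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(x * x for x in noise) / len(noise)
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * x for s, x in zip(speech, noise)]


def measured_snr_db(clean, noisy):
    """Recover the SNR in dB from a clean signal and its noisy version."""
    clean_power = sum(c * c for c in clean) / len(clean)
    residual_power = sum((y - c) ** 2 for c, y in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(clean_power / residual_power)


# Stand-in signals (assumption): a 440 Hz tone as "speech" and Gaussian white noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
noise = [rng.gauss(0.0, 1.0) for _ in range(1600)]
noisy = mix_at_snr(speech, noise, snr_db=5.0)
```

Replacing the white noise with recordings of machinery at several SNR levels would give the kind of condition grid against which two recognizers could be compared.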

Yung Ming ◽  
Lily Yuan

Machine Learning (ML) and Artificial Intelligence (AI) methods are transforming many commercial and academic areas, including feature extraction, autonomous driving, computational linguistics, and voice recognition. These new technologies are now having a significant effect in radiography, forensics, and many other areas where the accessibility of automated systems may improve the precision and repeatability of essential job performance. In this systematic review, we begin by providing a short overview of the different methods that are currently being developed, with a particular emphasis on those utilized in biomedical studies.

Electronics ◽  
2022 ◽  
Vol 11 (1) ◽  
pp. 168
Mohsen Bakouri ◽  
Mohammed Alsehaimi ◽  
Husham Farouk Ismail ◽  
Khaled Alshareef ◽  
Ali Ganoun ◽  

Many wheelchair users depend on others to control the movement of their wheelchairs, which significantly affects their independence and quality of life. Smart wheelchairs offer users a degree of self-dependence and the freedom to drive their own vehicles. In this work, we designed and implemented a low-cost software and hardware method to steer a robotic wheelchair. Building on this method, we developed our own Android mobile app using Flutter. A convolutional neural network (CNN)-based network-in-network (NIN) structure integrated with a voice recognition model was also developed and configured to build the mobile app. The technique was implemented and configured using an offline Wi-Fi network hotspot between the software and hardware components. Five voice commands (yes, no, left, right, and stop) guided and controlled the wheelchair through the Raspberry Pi and DC motor drives. The overall system was evaluated on a trained and validated corpus of isolated English words spoken by native Arabic speakers to assess the performance of the Android OS application. The maneuverability of indoor and outdoor navigation was also evaluated in terms of accuracy. The results indicated an accuracy of approximately 87.2% in correctly predicting some of the five voice commands. Additionally, in the real-time performance test, the root-mean-square deviation (RMSD) values between the planned and actual nodes for indoor and outdoor maneuvering were 1.721 × 10⁻⁵ and 1.743 × 10⁻⁵, respectively.
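The root-mean-square deviation reported above is the square root of the mean squared distance between corresponding planned and actual nodes. The abstract does not state how the nodes are represented or in what units, so the sketch below assumes two-dimensional coordinate pairs; the function name `rmsd` and the sample points are illustrative only.

```python
import math


def rmsd(planned, actual):
    """Root-mean-square deviation between paired 2D nodes (assumed representation)."""
    if len(planned) != len(actual) or not planned:
        raise ValueError("planned and actual must be non-empty and of equal length")
    squared_distances = [
        (px - ax) ** 2 + (py - ay) ** 2
        for (px, py), (ax, ay) in zip(planned, actual)
    ]
    return math.sqrt(sum(squared_distances) / len(squared_distances))


# Hypothetical route: the actual path drifts slightly from the planned nodes.
planned_nodes = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
actual_nodes = [(0.0, 0.0), (1.0, 0.1), (0.9, 1.0)]
deviation = rmsd(planned_nodes, actual_nodes)
```

A value near zero means the wheelchair followed the planned nodes closely, which is how the very small indoor and outdoor figures in the abstract should be read.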

Khalid Majrashi

Voice User Interfaces (VUIs) are increasingly popular owing to improvements in automatic speech recognition. However, the understanding of user interaction with VUIs, particularly Arabic VUIs, remains limited. Hence, this research compared user performance, learnability, and satisfaction when using voice and keyboard-and-mouse input modalities for text creation on Arabic user interfaces. A Voice-enabled Email Interface (VEI) and a Traditional Email Interface (TEI) were developed. Forty participants attempted pre-prepared and self-generated message creation tasks using voice on the VEI and the keyboard-and-mouse modality on the TEI. The results showed that participants were faster (by 1.76 to 2.67 minutes) in pre-prepared message creation using voice than using the keyboard and mouse. Participants were also faster (by 1.72 to 2.49 minutes) in self-generated message creation using voice than using the keyboard and mouse. Although the learning curves were more efficient with the VEI, more participants were satisfied with the TEI. With the VEI, participants reported problems, such as misrecognitions and misspellings, but were satisfied with the visibility of possible executable commands and with the overall accuracy of voice recognition.

Disari Chattopadhyay

Abstract: This paper presents the development of an IoT-based automated system intended mainly for the home, although some features can also be implemented in offices, banks, or schools. The main purpose of this project is to save time and manpower while providing security and convenience, using a Raspberry Pi. The salient features of this automated system are gas leakage detection for safety, motion detection for security, and control of home appliances according to the user’s needs. The system takes commands by voice or by text, as the user prefers, through Google Assistant, which then sends a response to the Raspberry Pi via Firebase for the required action. A DHT22 sensor measures temperature and humidity, and the room temperature and humidity are reported through Google Assistant. The system uses Python, the default language provided with the Raspberry Pi, as its main programming language. The system detects human presence with a motion sensor: whenever a person enters the room, the motion is detected and an alert message is automatically sent to the user via Google Assistant. Keywords: IoT, Raspberry Pi, Google Assistant, Firebase, Python, Dialogflow, Voice Recognition.

Pathobiology ◽  
2021 ◽  
pp. 1-9
Emad A. Rakha ◽  
Konstantinos Vougas ◽  
Puay Hoon Tan

Digital technology has been used in the field of diagnostic breast pathology and immunohistochemistry (IHC) for decades. Examples include automated tissue processing and staining; digital data processing, storage, and management; voice recognition systems; and digital technology-based production of antibodies and other IHC reagents. However, the recent application of whole slide imaging technology and artificial intelligence (AI)-based tools has attracted a lot of attention. The use of AI tools in breast pathology is discussed only briefly, as it is covered in other reviews. Here, we present the main applications of digital technology in IHC. These include automation of IHC staining, the use of image analysis systems and computer vision technology to interpret IHC staining, and the use of AI-based tools to predict marker expression from haematoxylin and eosin-stained digitized images.

Олег Игоревич Денисенко ◽  
Никита Алексеевич Кубасов

The use of two-dimensional barcodes for transmitting information is gaining popularity on the premises of organizations, in airports, and in other areas, including correctional facilities of the Russian Federation and of foreign countries. Undoubtedly, applying this barcode to biometric identification has a considerable advantage over the same activity carried out directly by staff of the penal system of the Russian Federation: it strengthens access control and protects staff and convicts against entry by unauthorized persons. Biometric identification is performed by retinal scanning, fingerprint scanning, facial biometric scanning, body temperature measurement, and voice recognition. However, even such a modern system has certain shortcomings, identified by engineering and technical support specialists and addressed in a number of scientific works that are reviewed in this article. The authors also analyze the main varieties of 2D codes, such as stacked linear and matrix codes. It is noted that 2D coding is used in many industries: in manufacturing, transportation of goods, personal identification, encryption of data in documents and reports, and inventory taking.

2021 ◽  
G Giunti ◽  
M Isomursu ◽  
E Gabarron ◽  
Y Solad

Advances in voice recognition, natural language processing, and artificial intelligence have led to the increasing availability and use of conversational agents (chatbots) in different settings. Chatbots are systems that mimic human dialogue through text or voice. This paper describes a series of design considerations for integrating chatbot interfaces with health services. It is part of ongoing work that explores the overall implementation of chatbots in the healthcare context. The findings were produced using a research-through-design process that combined (1) a literature survey of the existing body of knowledge on designing chatbots, (2) an analysis of the state of practice in using chatbots as service interfaces, and (3) a generative process of designing a chatbot interface for depression screening. In this paper, we describe considerations that would be useful for the design of a chatbot for a healthcare context.

Mathematics ◽  
2021 ◽  
Vol 9 (23) ◽  
pp. 3130
Bharathwaj Suresh ◽  
Kamlesh Pillai ◽  
Gurpreet Singh Kalsi ◽  
Avishaii Abuhatzera ◽  
Sreenivas Subramoney

Deep Neural Networks (DNNs) have set state-of-the-art performance numbers in diverse fields of electronics (computer vision, voice recognition), biology, bioinformatics, etc. However, both learning from the data (training) and applying the learnt information (inference) require huge computational resources. Approximate computing is a common method to reduce computation cost, but it introduces a loss in task accuracy, which limits its application. Using an inherent property of the Rectified Linear Unit (ReLU), a popular activation function, we propose a mathematical model that performs multiply-accumulate (MAC) operations at reduced precision to predict negative values early. We also propose a hierarchical computation method that achieves the same results as IEEE 754 full-precision computation. Applying this method to ResNet50 and VGG16 shows that up to 80% of ReLU zeros (which is 50% of all ReLU outputs) can be predicted and detected early by using just 3 of the 23 mantissa bits. This method is equally applicable to other floating-point representations.
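The idea of detecting negative ReLU inputs from a low-precision MAC can be sketched as follows. This is not the authors' mathematical model or their hierarchical scheme, which the abstract does not spell out. It is a simplified illustration that truncates float32 operands to a few mantissa bits and uses a conservative error bound, chosen here as an assumption, to decide when the full-precision result is certain to be negative. The names `truncate_mantissa` and `relu_mac_early_exit` are hypothetical, and inputs are assumed to be normal (non-subnormal) float32 values.

```python
import struct


def truncate_mantissa(value, kept_bits):
    """Truncate a float32 toward zero, keeping `kept_bits` of its 23 mantissa bits."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    mask = (0xFFFFFFFF << (23 - kept_bits)) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


def relu_mac_early_exit(inputs, weights, kept_bits=3):
    """Return (relu_output, skipped_full_precision) for one dot product plus ReLU."""
    approx_terms = [
        truncate_mantissa(x, kept_bits) * truncate_mantissa(w, kept_bits)
        for x, w in zip(inputs, weights)
    ]
    approx_sum = sum(approx_terms)
    # Truncation scales each operand by a factor in (1 - 2**-kept_bits, 1], so every
    # exact product exceeds its approximation in magnitude by less than this fraction.
    slack = 1.0 / (1.0 - 2.0 ** -kept_bits) ** 2 - 1.0
    bound = slack * sum(abs(term) for term in approx_terms)
    if approx_sum + bound < 0.0:
        return 0.0, True  # the exact sum must be negative, so ReLU outputs zero
    exact = sum(x * w for x, w in zip(inputs, weights))
    return max(exact, 0.0), False


# Illustrative values (assumption): one clearly negative and one positive pre-activation.
negative_case = relu_mac_early_exit([1.5, 2.0, 0.5], [-2.0, -1.0, 0.25])
positive_case = relu_mac_early_exit([1.0, 2.0], [0.5, 0.25])
```

The error bound keeps the early exit safe at the cost of missing borderline negatives, which then fall back to the full-precision path.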
