Soft Missing-Feature Mask Generation for Robot Audition

2010 ◽  
Vol 1 (1) ◽  
Author(s):  
Toru Takahashi ◽  
Kazuhiro Nakadai ◽  
Kazunori Komatani ◽  
Tetsuya Ogata ◽  
Hiroshi G. Okuno

This paper describes an improvement in automatic speech recognition (ASR) for robot audition that introduces Missing Feature Theory (MFT) based on soft missing-feature masks (MFMs) to realize natural human-robot interaction. In an everyday environment, a robot’s microphones capture various sounds besides the user’s utterances. Although sound-source separation is an effective way to enhance the user’s utterances, it inevitably produces errors due to reflection and reverberation. MFT is able to cope with these errors. First, MFMs are generated based on the reliability of time-frequency components. Then ASR weights the time-frequency components according to the MFMs. We propose a new method to automatically generate soft MFMs, consisting of continuous values from 0 to 1 computed with a sigmoid function. The proposed MFM generation was implemented for the HRP-2 humanoid using HARK, our open-sourced robot audition software. Preliminary results show that the soft MFM outperformed a hard (binary) MFM in recognizing three simultaneous utterances. In a human-robot interaction task, soft MFMs reduced the minimum required interval between two adjacent loudspeakers from 60 degrees to 30 degrees.
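The abstract does not give the exact reliability measure or sigmoid parameters used in the paper; the sketch below is only a minimal illustration of the idea, with an assumed per-bin reliability input and illustrative threshold and slope values.

```python
import numpy as np

def soft_mfm(reliability, threshold=0.5, slope=10.0):
    """Map per-bin reliability scores to a soft missing-feature mask in [0, 1].

    reliability : 2D array (time x frequency) of reliability estimates,
                  e.g. a per-bin ratio of separated-speech energy to
                  estimated leakage energy (a hypothetical choice here).
    threshold, slope : sigmoid midpoint and steepness (illustrative values,
                       not taken from the paper).
    """
    return 1.0 / (1.0 + np.exp(-slope * (reliability - threshold)))

def hard_mfm(reliability, threshold=0.5):
    """Binary baseline mask for comparison: 1 if reliable, else 0."""
    return (reliability > threshold).astype(float)

# Example: MFT-based ASR weights each time-frequency feature by the mask.
rng = np.random.default_rng(0)
reliability = rng.uniform(0.0, 1.0, size=(100, 64))  # 100 frames x 64 bins
features = rng.normal(size=(100, 64))                # separated-speech features
weighted = soft_mfm(reliability) * features          # down-weights unreliable bins
```

Unlike the hard mask, the soft mask lets marginally reliable bins still contribute, which is consistent with the reported advantage on simultaneous utterances.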

2019 ◽  
Vol 39 (1) ◽  
pp. 73-99 ◽  
Author(s):  
Matt Webster ◽  
David Western ◽  
Dejanira Araiza-Illan ◽  
Clare Dixon ◽  
Kerstin Eder ◽  
...  

We present an approach for the verification and validation (V&V) of robot assistants in the context of human–robot interactions, to demonstrate their trustworthiness through corroborative evidence of their safety and functional correctness. Key challenges include the complex and unpredictable nature of the real world in which assistant and service robots operate, the limitations of available V&V techniques when used individually, and the consequent lack of confidence in the V&V results. Our approach, called corroborative V&V, addresses these challenges by combining several different V&V techniques; in this paper we use formal verification (model checking), simulation-based testing, and user validation in experiments with a real robot. This combination allows V&V of the human–robot interaction task at different levels of modeling detail and thoroughness of exploration, thus overcoming the individual limitations of each technique. We demonstrate our approach through a handover task, the most critical part of a complex cooperative manufacturing scenario, for which we propose safety and liveness requirements to verify and validate. Should the resulting V&V evidence present discrepancies, the assets (i.e., the system and requirement models) are iteratively refined and improved until the different V&V techniques corroborate one another, so that the models represent the human–robot interaction task more faithfully. Therefore, corroborative V&V affords a systematic approach to “meta-V&V,” in which different V&V techniques can be used to corroborate and check one another, increasing the level of certainty in the results of V&V.
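The paper’s concrete safety and liveness requirements are not reproduced in the abstract. As an illustration of the simulation-based-testing leg of corroborative V&V, here is a hypothetical bounded-liveness monitor over a simulated event trace; the event names and deadline are invented for the example and are not the paper’s actual requirements.

```python
from typing import Iterable, List

def check_bounded_liveness(trace: Iterable[str], trigger: str,
                           response: str, deadline: int) -> bool:
    """Check a bounded-liveness requirement over a finite event trace:
    every `trigger` event must be followed by a `response` event within
    `deadline` steps. Returns False on the first violation.
    """
    pending: List[int] = []  # steps remaining for each outstanding trigger
    for event in trace:
        pending = [t - 1 for t in pending]
        if any(t < 0 for t in pending):
            return False              # a trigger went unanswered too long
        if event == response:
            pending.clear()           # all outstanding requests satisfied
        if event == trigger:
            pending.append(deadline)
    return not pending                # unanswered triggers at trace end fail

# Illustrative handover requirement: a request must be answered by a
# completed handover within 50 simulation steps.
trace = ["idle", "handover_requested", "grasp", "extend_arm", "handover_done"]
assert check_bounded_liveness(trace, "handover_requested", "handover_done", 50)
```

In the corroborative workflow, a violation found by such a monitor in simulation would prompt re-examination of the formal model and requirements rather than standing alone as a verdict.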


Author(s):  
Marie D. Manner

We describe experiments performed with a large number of preschool children (ages 1.5 to 4 years) in a two-task study comprising an eye-tracking experiment and a human-robot interaction experiment. The resulting data, from mostly neurotypical children, forms a baseline against which to compare children with autism, allowing us to further characterize the autism phenotype. Eye-tracking results indicate a strong preference for a humanoid robot and a social being (a four-year-old girl) over other robot types. Results from the human-robot interaction task, a semi-structured play interaction between child and robot, showed that we can cluster participants based on social distances and other social-responsiveness metrics.
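The abstract does not name the clustering method or the exact metrics used; the following is a minimal sketch of how participants might be clustered from such measurements, using k-means over two hypothetical per-child features.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical per-child features: mean child-robot distance (meters) and a
# social-responsiveness score; the paper's actual metrics are not specified.
X = np.array([
    [0.4, 8.0],
    [0.5, 7.5],
    [1.8, 2.0],
    [2.0, 1.5],
    [0.6, 6.8],
])

X_scaled = StandardScaler().fit_transform(X)  # put metrics on a common scale
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled)
print(labels)  # e.g. groups approach-oriented vs. distance-keeping children
```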


2017 ◽  
Vol 12 (1) ◽  
pp. 55-64 ◽  
Author(s):  
Chien Van Dang ◽  
Tin Trung Tran ◽  
Trung Xuan Pham ◽  
Ki-Jong Gil ◽  
...  

2020 ◽  
Vol 16 (2) ◽  
pp. 1-12
Author(s):  
Ameer Badr ◽  
Alia Abdul-Hassan

With recent developments in technology and advances in artificial intelligence and machine learning, it has become possible for robots to understand and respond to voice as part of Human-Robot Interaction (HRI). A voice-based interface robot can recognize speech from humans, enabling it to interact more naturally with its human counterpart in different environments. This work presents a review of voice-based interfaces for HRI systems. The review examines voice-based perception in HRI systems from three facets: feature extraction, dimensionality reduction, and semantic understanding. For feature extraction, numerous types of features are reviewed across various domains, such as the time, frequency, cepstral (i.e., the inverse Fourier transform of the logarithm of the signal spectrum), and deep domains. For dimensionality reduction, subspace learning can eliminate the redundancies of high-dimensional features by further processing the extracted features to better reflect their semantic information. For semantic understanding, the aim is to infer objects or human behaviors from the extracted features. Numerous types of semantic understanding are reviewed, such as speech recognition, speaker recognition, speaker gender detection, speaker gender and age estimation, and speaker localization. Finally, some existing voice-based interface issues and recommendations for future work are outlined.
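As a concrete illustration of the cepstral domain as the abstract defines it (the inverse Fourier transform of the log spectrum), here is a minimal real-cepstrum front end; the frame length, window, and coefficient count are conventional choices, not values taken from the review.

```python
import numpy as np

def real_cepstrum(frame: np.ndarray, n_coeffs: int = 13) -> np.ndarray:
    """Real cepstrum of one windowed speech frame: the inverse Fourier
    transform of the log magnitude spectrum, as defined in the text.
    Returns the first `n_coeffs` coefficients as a compact feature vector.
    """
    spectrum = np.fft.rfft(frame)
    log_mag = np.log(np.abs(spectrum) + 1e-10)  # epsilon avoids log(0)
    cepstrum = np.fft.irfft(log_mag)
    return cepstrum[:n_coeffs]

# Example: a 25 ms frame at 16 kHz (400 samples) with a Hamming window,
# typical front-end settings (illustrative only).
sr = 16000
t = np.arange(400) / sr
frame = np.sin(2 * np.pi * 220 * t) * np.hamming(400)
feats = real_cepstrum(frame)
```

Low-order coefficients of this kind capture the spectral envelope, which is why cepstral features recur across the speech, speaker, and gender/age recognition tasks the review covers.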

