Pitch-Cluster-Map Based Daily Sound Recognition for Mobile Robot Audition

2010 ◽  
Vol 22 (3) ◽  
pp. 402-410 ◽  
Author(s):  
Yoko Sasaki ◽  
Masahito Kaneyoshi ◽  
Satoshi Kagami ◽  
Hiroshi Mizoguchi ◽  
...  

This paper presents a sound identification method for a mobile robot operating in home and office environments. We propose a short-term sound recognition method using a Pitch-Cluster-Map (PCM) sound database (DB) built with a Vector Quantization approach. A binarized frequency spectrum is used to generate the PCM codebook, which describes a wide variety of sound sources, not only voice, from short-term sound input. PCM-based sound identification requires only several tens of milliseconds of sound input, making it suitable for mobile robot applications in which conditions change continuously and dynamically. We implemented the method in a mobile robot audition system using a 32-channel microphone array. Robot noise reduction and sound source tracking based on the proposed method are applied to the audition system, and we evaluate daily sound recognition performance on sound sources separated while the robot is moving.
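
The abstract does not give implementation details, so the following is only a minimal sketch of the Vector Quantization idea behind PCMs: magnitude spectra of short frames are binarized, a per-class codebook is learned, and a frame is identified by its nearest codebook. The function names, the 20% binarization threshold, and the k-means codebook training are illustrative assumptions, not the authors' implementation.

```python
# Minimal PCM-style sketch: binarized short-term spectra are vector-quantized
# into one codebook per sound class; an incoming frame is assigned to the
# class whose codebook contains the nearest codeword.
# All names and parameters here are assumptions for illustration.
import numpy as np
from scipy.cluster.vq import kmeans2

def binarized_spectrum(frame, n_fft=512, keep=0.2):
    """FFT magnitude -> 0/1 vector: 1 for the strongest `keep` fraction of bins."""
    mag = np.abs(np.fft.rfft(frame * np.hanning(len(frame)), n_fft))
    thresh = np.quantile(mag, 1.0 - keep)
    return (mag >= thresh).astype(float)

def build_codebook(frames, n_codes=64):
    """Vector-quantize the binarized spectra of one sound class (k-means here)."""
    vecs = np.array([binarized_spectrum(f) for f in frames])
    codebook, _ = kmeans2(vecs, n_codes, minit='++')
    return codebook

def identify(frame, codebooks):
    """Return the class label whose codebook has the nearest codeword."""
    v = binarized_spectrum(frame)
    def dist(cb):
        return np.min(np.linalg.norm(cb - v, axis=1))
    return min(codebooks, key=lambda name: dist(codebooks[name]))
```

At 16 kHz, a 512-sample frame spans about 32 ms, which is consistent with the several-tens-of-milliseconds input the paper reports.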

2011 ◽  
Vol 25 (1-2) ◽  
pp. 135-152 ◽  
Author(s):  
Jwu-Sheng Hu ◽  
Chen-Yu Chan ◽  
Cheng-Kang Wang ◽  
Ming-Tang Lee ◽  
Ching-Yi Kuo

2007 ◽  
Vol 19 (3) ◽  
pp. 281-289 ◽  
Author(s):  
Yoko Sasaki ◽  
Saori Masunaga ◽  
Simon Thompson ◽  
Satoshi Kagami ◽  
...  

The paper describes a tele-operated mobile robot system that can perform multiple sound source localization and separation using a 32-channel tri-concentric microphone array. Tele-operated mobile robots require two main capabilities: 1) audio/visual presentation of the robot's environment to the operator, and 2) autonomy for mobility. This paper focuses on the auditory system of a tele-operated mobile robot, both to improve the presentation of sound sources to the operator and to facilitate autonomous robot actions. The auditory system is based on a 32-channel distributed microphone array with a highly efficient directional design for localizing and separating multiple moving sound sources. Experimental results demonstrate the feasibility of distant inter-person communication through the tele-operated robot system.
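
The abstract does not specify the localization algorithm; as a rough illustration of how a microphone array localizes a source, here is a conventional delay-and-sum (steered-response power) beamformer sketch. The planar far-field geometry and all names are assumptions and do not reproduce the paper's tri-concentric array design.

```python
# Illustrative delay-and-sum beamformer for sound source localization with a
# microphone array. This is a generic textbook technique, not the paper's
# method; mic_positions and the candidate directions are assumptions.
import numpy as np

C = 343.0  # speed of sound, m/s

def steered_response_power(signals, mic_positions, fs, azimuths):
    """Return delay-and-sum output power for each candidate azimuth.

    signals:       (n_mics, n_samples) time-domain frames
    mic_positions: (n_mics, 2) planar mic coordinates in meters
    azimuths:      candidate source directions in radians
    """
    n_mics, n_samples = signals.shape
    spectra = np.fft.rfft(signals, axis=1)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    powers = []
    for az in azimuths:
        direction = np.array([np.cos(az), np.sin(az)])
        # Far-field plane wave: a mic closer to the source leads the origin
        # by p.d / C seconds; undo that lead to phase-align the channels.
        delays = mic_positions @ direction / C
        shifts = np.exp(-2j * np.pi * freqs[None, :] * delays[:, None])
        beam = np.sum(spectra * shifts, axis=0)
        powers.append(np.sum(np.abs(beam) ** 2))
    return np.array(powers)

# The estimated direction is the azimuth that maximizes the output power:
#   az_hat = azimuths[np.argmax(steered_response_power(x, pos, fs, azimuths))]
```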


2021 ◽  
Vol 15 ◽  
Author(s):  
Dong Liu ◽  
Zhiyong Wang ◽  
Lifeng Wang ◽  
Longxi Chen

Redundant information and noise generated during single-modal feature extraction make it difficult for traditional learning algorithms to achieve ideal recognition performance. To address this, a deep-learning-based multimodal fusion method for emotion recognition from speech and facial expressions is proposed. First, a feature extractor is set up for each modality: a convolutional neural network-long short-term memory (CNN-LSTM) network for speech, and an Inception-ResNet-v2 network for facial expressions in video. Then, an LSTM is used to capture correlations both across and within modalities. After chi-square-test feature selection, the single-modality features are concatenated into a unified fusion feature. Finally, the fused features output by the LSTM are fed to a LIBSVM classifier for the final emotion recognition. Experimental results show recognition accuracies of 87.56% and 90.06% on the MOSI and MELD datasets, respectively, outperforming the comparison methods. This lays a theoretical foundation for applying multimodal fusion to emotion recognition.
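
As a rough sketch of the fusion stage only, the following applies chi-square feature selection per modality, concatenates the selected features, and classifies with an SVM (scikit-learn's SVC wraps LIBSVM). The CNN-LSTM and Inception-ResNet-v2 extractors and the cross-modal LSTM are abstracted away as precomputed feature matrices; the dimensions, k=64, and the RBF kernel are assumptions, not the authors' settings.

```python
# Fusion-stage sketch: per-modality features (assumed already extracted) are
# reduced by chi-square selection, concatenated, and classified with an SVM.
# Shapes, k=64, and the kernel choice are illustrative assumptions.
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

def fuse_and_classify(audio_feats, video_feats, labels):
    """audio_feats: (n_samples, d_a); video_feats: (n_samples, d_v)."""
    # chi2 requires non-negative inputs, hence the min-max scaling step.
    select_a = make_pipeline(MinMaxScaler(), SelectKBest(chi2, k=64))
    select_v = make_pipeline(MinMaxScaler(), SelectKBest(chi2, k=64))
    fa = select_a.fit_transform(audio_feats, labels)
    fv = select_v.fit_transform(video_feats, labels)
    fused = np.hstack([fa, fv])   # early fusion by concatenation
    clf = SVC(kernel='rbf')       # LIBSVM-backed classifier
    clf.fit(fused, labels)
    return select_a, select_v, clf
```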


2009 ◽  
Vol 2009 (0) ◽  
pp. 1P1-C13_1-1P1-C13_4 ◽
Author(s):  
Yoko Sasaki ◽
Masahito Kaneyoshi ◽
Satoshi Kagami ◽
Hiroshi Mizoguchi ◽
Tadashi Enomoto


2015 ◽  
Vol 4 (2) ◽  
pp. 1 ◽  
Author(s):  
Aurélien Reveleau ◽  
François Ferland ◽  
Mathieu Labbé ◽  
Dominic Létourneau ◽  
François Michaud
