1A1-W08 Investigation on Cooperation of Microphone Array with Depth Image Sensor : For the Speech Separation of the Multiple Humans that Positions are Unsettled

2015 ◽  
Vol 2015 (0) ◽  
pp. _1A1-W08_1-_1A1-W08_2
Author(s):  
Takahiro KIGAWA ◽  
Takeki OGITSU ◽  
Hiroshi TAKEMURA ◽  
Hiroshi MIZOGUCHI

Sensors ◽  
2020 ◽  
Vol 20 (12) ◽  
pp. 3527
Author(s):  
Ching-Feng Liu ◽  
Wei-Siang Ciou ◽  
Peng-Ting Chen ◽  
Yi-Chun Du

In the context of assisting human listeners, identifying and enhancing non-stationary target speech in various noise environments, such as a cocktail party, is an important issue for real-time speech separation. Previous studies mostly used microphone signal processing to perform target speech separation and analysis, such as feature recognition through large amounts of training data and supervised machine learning. These methods were suitable for stationary noise suppression, but were relatively limited against non-stationary noise and had difficulty meeting real-time processing requirements. In this study, we propose a real-time speech separation method based on an approach that combines an optical camera and a microphone array. The method is divided into two stages. Stage 1 uses computer vision with the camera to detect and identify targets of interest and to estimate source angles and distances. Stage 2 uses beamforming with the microphone array to enhance and separate the target speech. An asynchronous update function integrates the beamforming control and speech processing to reduce the effect of processing delay. The experimental results show noise reductions of 6.1 dB and 5.2 dB in stationary and non-stationary noise environments, respectively. The response time of the speech processing was less than 10 ms, which meets the requirements of a real-time system. The proposed method has high potential for application in auxiliary listening systems or machine language processing, such as intelligent personal assistants.
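The abstract does not state which beamformer is used; a minimal delay-and-sum sketch is shown below, assuming a linear microphone array and a far-field source whose angle comes from the camera stage (the function and parameter names are illustrative, not from the paper):

```python
import numpy as np

def delay_and_sum(signals, mic_x, angle_deg, fs, c=343.0):
    """Steer a linear microphone array toward angle_deg (0 = broadside).

    signals: (n_mics, n_samples) array of recorded channels
    mic_x:   mic positions along the array axis in metres
    fs:      sample rate in Hz; c: speed of sound in m/s
    """
    n_mics, n = signals.shape
    # Far-field arrival delay of each mic relative to the array centre
    delays = mic_x * np.sin(np.deg2rad(angle_deg)) / c
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    out = np.zeros(n)
    for ch in range(n_mics):
        spec = np.fft.rfft(signals[ch])
        # Undo each channel's delay with a phase shift, then average
        spec *= np.exp(2j * np.pi * freqs * delays[ch])
        out += np.fft.irfft(spec, n)
    return out / n_mics
```

Signals arriving from the steered direction add coherently while sounds from other directions partially cancel; in the two-stage scheme described above, each asynchronous update from the vision stage simply re-parameterises the steering angle.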


2016 ◽  
Vol 28 (2) ◽  
pp. 173-184 ◽  
Author(s):  
Takanobu Tanimoto ◽  
Ryo Fukano ◽  
Kei Shinohara ◽  
Keita Kurashiki ◽  
...  

[Figure: Superimposed terrain model in the operator's view image]
In recent years, unmanned construction based on the teleoperation of construction equipment has increasingly been used at disaster sites and mines. However, teleoperation relies on 2D images, whose lack of perspective results in considerably lower efficiency compared with on-board operation. Previous studies employed multi-viewpoint images or binocular stereo, which led to problems such as lower efficiency, caused by the operator's need to evaluate distances by shifting his or her line of sight, or eye fatigue due to binocular stereo. Thus, the present study aims to improve the work efficiency of teleoperation by superimposing a 3D model of the terrain on the on-board operator's view image. The surrounding terrain is measured by a depth image sensor and represented as a digital terrain model, which is generated and updated in real time. The terrain model is transformed into the on-board operator's view, on which an artificial shadow of the bucket tip and an evenly spaced grid projected onto the ground surface are superimposed. This allows the operator to visually evaluate the bucket tip position from the artificial shadow and the distance between the excavation point and the bucket tip from the terrain grid. An experiment was conducted investigating the positioning of the bucket tip by teleoperation using a miniature excavator and the superimposed terrain-model display. The results showed that the standard deviations of the positioning errors measured with the superimposed display were at least 30% lower than those obtained without it, while being approximately equal to those acquired using binocular stereo. We thus demonstrated the effectiveness of the superimposed display in improving work efficiency in teleoperation.
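The abstract does not give the projection equations; transforming terrain-model points into the operator's view is essentially a pinhole-camera projection, which might be sketched as follows (the intrinsic matrix `K` and the pose `R`, `t` are illustrative placeholders, not values from the paper):

```python
import numpy as np

def project_points(pts_world, K, R, t):
    """Project Nx3 world points into pixel coordinates of the
    operator's camera (pinhole model: u ~ K [R|t] X)."""
    pts_cam = pts_world @ R.T + t   # world frame -> camera frame
    uv = pts_cam @ K.T              # apply camera intrinsics
    return uv[:, :2] / uv[:, 2:3]   # perspective divide by depth

# Illustrative intrinsics: 500 px focal length, principal point (320, 240)
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
```

Under this model, the superimposed grid comes from projecting evenly spaced terrain-model vertices, and the bucket tip's artificial shadow from projecting the tip position dropped onto the terrain surface beneath it.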



2020 ◽  
Vol 71 (06) ◽  
pp. 530-537
Author(s):  
HAKAN YÜKSEL ◽  
MELIHA OKTAV BULUT

Sensors can capture and scan many objects in real time for military, security, health and industrial applications. Sensors can be made smaller, cheaper and more energy efficient due to rapid changes in technology, and low-cost sensors have become attractive alternatives to high-cost laser scanners in recent years. The Kinect sensor can measure depth data at low cost and high resolution by scanning the environment. In this study, this sensor collected data on users in front of a scanner, and the depth data results were tested. The process was repeated with four different body positions, and the results were analysed. The sensor data proved reliable against real measurements: when the depth data taken by the sensor were compared with the real measurements, the agreement was found to be significant. The difference between the depth image data of different users, positions and body measures and the real data is 0.35 to 1.15 cm, which shows that the sensor's results are close to the real data. When the accuracy of the sensor against real measurements is examined, the values lie between 98.46% and 99.6%. Thus, this depth image sensor is reliable and can be used as an alternative and cheaper way to take body measurements.
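The reported accuracies (98.46% to 99.6%) are consistent with expressing the sensor error as relative agreement with the reference measurement; one common way to compute such a figure (an assumption, since the abstract does not quote the exact formula, and the 75 cm example value is hypothetical) is:

```python
def measurement_accuracy(sensor_cm, reference_cm):
    """Percent agreement of a sensor reading with a reference
    measurement: 100 * (1 - |error| / reference)."""
    error = abs(sensor_cm - reference_cm)
    return 100.0 * (1.0 - error / reference_cm)

# e.g. a 0.35 cm error (the paper's smallest reported difference)
# on a hypothetical 75 cm body measurement
print(round(measurement_accuracy(74.65, 75.0), 2))  # -> 99.53
```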

