Model-based Head Orientation Estimation for Smart Devices

Author(s):  
Qiang Yang ◽  
Yuanqing Zheng

Voice interaction is friendly and convenient for users. Smart devices such as Amazon Echo allow users to interact with them through voice commands and have become increasingly popular in daily life. In recent years, research has focused on using the microphone array built into smart devices to localize the user's position, which adds context information to voice commands. In contrast, few works explore the user's head orientation, which also carries useful context. For example, when a user says "turn on the light", the head orientation can indicate which light the user is referring to. Existing model-based approaches require a large number of microphone arrays to form an array network, while machine learning-based approaches need laborious data collection and training. The high deployment and usage cost of these methods is unfriendly to users. In this paper, we propose HOE, a model-based system that enables Head Orientation Estimation for smart devices with only two microphone arrays and requires lower training overhead than previous approaches. HOE first estimates candidate head orientations by measuring the voice energy radiation pattern. It then leverages the voice frequency radiation pattern to obtain the final result. Real-world experiments show that HOE achieves a median estimation error of 23 degrees. To the best of our knowledge, HOE is the first model-based attempt to estimate head orientation with only two microphone arrays and without an arduous data training overhead.
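As a rough illustration of the energy-pattern step, the sketch below scores candidate head orientations by comparing the energy ratio measured at two microphone arrays against a simple cardioid-like speech radiation model. The array/user positions, the cardioid shape, and the path-loss exponent are illustrative assumptions, not the paper's actual model.

```python
# Minimal sketch (not HOE's exact algorithm): given a user's position and the voice
# energy received at two microphone arrays, score candidate head orientations against
# a simple cardioid-like speech radiation model. All model parameters (cardioid
# shape, path-loss exponent) are illustrative assumptions.
import numpy as np

def radiated_energy(theta_head, array_pos, user_pos, alpha=0.5, path_loss=2.0):
    """Modeled energy at an array for a given head orientation (radians)."""
    d = np.linalg.norm(array_pos - user_pos)
    phi = np.arctan2(array_pos[1] - user_pos[1], array_pos[0] - user_pos[0])
    # Cardioid-like directivity: strongest toward the facing direction.
    directivity = (1 - alpha) + alpha * np.cos(phi - theta_head)
    return directivity**2 / d**path_loss

def orientation_candidates(e1, e2, pos1, pos2, user_pos, tol=0.1):
    """Return head orientations (degrees) whose modeled energy ratio matches the measured one."""
    measured_ratio = e1 / e2
    candidates = []
    for theta in np.deg2rad(np.arange(0, 360, 1.0)):
        m1 = radiated_energy(theta, pos1, user_pos)
        m2 = radiated_energy(theta, pos2, user_pos)
        model_ratio = m1 / (m2 + 1e-12)
        if abs(model_ratio - measured_ratio) / measured_ratio < tol:
            candidates.append(np.rad2deg(theta))
    return candidates

# Example: user at the origin, two arrays placed on different walls.
print(orientation_candidates(e1=1.8e-4, e2=0.9e-4,
                             pos1=np.array([2.0, 0.0]),
                             pos2=np.array([0.0, 2.0]),
                             user_pos=np.array([0.0, 0.0])))
```

In the abstract's pipeline, the frequency radiation pattern would then be used to pick the final orientation from the surviving candidates.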

Author(s):  
Rebecca C. Felsheim ◽  
Andreas Brendel ◽  
Patrick A. Naylor ◽  
Walter Kellermann

2019 ◽  
Vol 283 ◽  
pp. 04001
Author(s):  
Boquan Yang ◽  
Shengguo Shi ◽  
Desen Yang

Recently, spherical microphone arrays (SMAs) have become increasingly important for three-dimensional source localization and identification due to their spherical symmetry. However, conventional spherical harmonic beamforming (SHB) based on an SMA has limitations such as poor resolution and high side-lobe levels in the resulting image maps. To overcome these limitations, this paper employs an iterative generalized inverse beamforming algorithm with a virtual extrapolated open spherical microphone array. The side lobes can be suppressed and the main lobe narrowed by introducing two iteration processes into the generalized inverse beamforming (GIB) algorithm. The instability caused by uncertainties in actual measurements, such as measurement noise and configuration errors, can be minimized by iteratively redefining the regularization matrix and the corresponding GIB localization results. In addition, the poor performance of microphone arrays in the low-frequency range, limited by the array aperture, can be improved by using a virtual extrapolated open spherical array (EA) with a larger aperture. The virtual array is obtained through a data preprocessing step based on the regularization matrix algorithm. Results from both simulations and experiments show the feasibility and accuracy of the method.
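A minimal sketch of the iterative reweighting idea behind GIB is shown below, assuming a known transfer matrix between candidate grid points and microphones. The weighting and regularization choices are illustrative and omit the virtual extrapolated array step.

```python
# Minimal sketch of iterative generalized inverse beamforming (GIB), assuming a known
# transfer matrix A (microphones x candidate grid points) and measured pressures p.
# The reweighting scheme and regularization choice are illustrative, not the paper's
# exact formulation (which also uses a virtual extrapolated spherical array).
import numpy as np

def iterative_gib(A, p, n_iter=10, reg=1e-2):
    """Iteratively reweighted regularized inverse: sharpens the source map by
    building the weighting matrix from the previous source estimate."""
    n_mics, n_grid = A.shape
    q = np.ones(n_grid, dtype=complex)          # initial source strengths
    for _ in range(n_iter):
        W = np.diag(np.abs(q))                  # weight grid points by current estimate
        AW = A @ W
        # Regularized (Tikhonov-like) generalized inverse in the weighted domain.
        G = AW.conj().T @ np.linalg.inv(AW @ AW.conj().T + reg * np.eye(n_mics))
        q = W @ (G @ p)
    return np.abs(q)

# Toy example: random transfer matrix, two active grid points.
rng = np.random.default_rng(0)
A = rng.standard_normal((32, 200)) + 1j * rng.standard_normal((32, 200))
q_true = np.zeros(200, dtype=complex)
q_true[[40, 120]] = [1.0, 0.5]
p = A @ q_true
print(np.argsort(iterative_gib(A, p))[-2:])    # indices of the strongest estimated sources
```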


2015 ◽  
Vol 23 (6) ◽  
pp. 8061 ◽  
Author(s):  
J. Mateo ◽  
M. A. Losada ◽  
A. López

2017 ◽  
Vol 29 (1) ◽  
pp. 83-93
Author(s):  
Kouhei Sekiguchi ◽  
Yoshiaki Bando ◽  
Katsutoshi Itoyama ◽  
Kazuyoshi Yoshii

[Figure: Optimizing robot positions for source separation]
The active audition method presented here improves source separation performance by moving multiple mobile robots to optimal positions. One advantage of using multiple mobile robots, each equipped with a microphone array, is that each robot can work independently or as part of a large reconfigurable array. To determine the optimal layout of the robots, we must be able to predict source separation performance from source position information, because the actual source signals are unknown and the actual separation performance cannot be calculated. Our method therefore simulates delay-and-sum beamforming for a possible layout to calculate the gain theoretically, i.e., the expected ratio of a target sound source to the other sound sources in the corresponding separated signal. The robots are moved into the layout with the highest average gain over the target sources. Experimental results showed that our method improved the harmonic mean of signal-to-distortion ratios (SDRs) by 5.5 dB in simulation and by 3.5 dB in a real environment.
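The gain prediction can be sketched as follows, assuming free-field propagation and a single frequency. The microphone/source positions and helper functions are hypothetical, intended only to show how the target-to-interference ratio of a candidate layout might be computed.

```python
# Minimal sketch, assuming free-field propagation and a single frequency: predict the
# delay-and-sum (DS) separation gain for a candidate layout of robot-mounted
# microphones, i.e., the expected power ratio of the target source to the other
# sources at the beamformer output. Names and parameters are illustrative.
import numpy as np

C = 343.0  # speed of sound (m/s)

def steering_vector(mic_pos, src_pos, freq):
    """Near-field steering vector derived from propagation delays."""
    delays = np.linalg.norm(mic_pos - src_pos, axis=1) / C
    return np.exp(-2j * np.pi * freq * delays)

def ds_gain(mic_pos, src_positions, target_idx, freq=1000.0):
    """Expected ratio of target power to interference power in the DS output."""
    w = steering_vector(mic_pos, src_positions[target_idx], freq) / len(mic_pos)
    target_power, interf_power = 0.0, 0.0
    for i, src in enumerate(src_positions):
        a = steering_vector(mic_pos, src, freq)
        power = np.abs(np.vdot(w, a))**2          # beamformer response to source i
        if i == target_idx:
            target_power = power
        else:
            interf_power += power
    return target_power / interf_power

def layout_score(mic_pos, src_positions):
    """Average gain over all target sources; higher is better for the layout."""
    gains = [ds_gain(mic_pos, src_positions, k) for k in range(len(src_positions))]
    return np.mean(gains)

# Example: two 4-mic robot arrays vs. three sources.
mics = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1],
                 [3, 2], [3.1, 2], [3, 2.1], [3.1, 2.1]], dtype=float)
sources = np.array([[1.0, 3.0], [2.5, 0.5], [4.0, 3.5]])
print(layout_score(mics, sources))
```

Evaluating this score for each candidate layout and moving the robots to the best-scoring one mirrors the layout-selection step described in the abstract.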

