Learning-based Gaussian Mixture Reduction for Distributed Bayesian Filter

Author(s):  
Feng Yang ◽  
Xinyi Tang ◽  
Tiancheng Li ◽  
Litao Zheng
2020 ◽  
Vol 64 (4) ◽  
pp. 40404-1-40404-16
Author(s):  
I.-J. Ding ◽  
C.-M. Ruan

Abstract With rapid developments in techniques related to the internet of things, smart service applications such as voice-command-based speech recognition and smart care applications such as context-aware-based emotion recognition will gain much attention and potentially be a requirement in smart home or office environments. In such intelligence applications, identity recognition of the specific member in indoor spaces will be a crucial issue. In this study, a combined audio-visual identity recognition approach was developed. In this approach, visual information obtained from face detection was incorporated into acoustic Gaussian likelihood calculations for constructing speaker classification trees to significantly enhance the Gaussian mixture model (GMM)-based speaker recognition method. This study considered the privacy of the monitored person and reduced the degree of surveillance. Moreover, the popular Kinect sensor device containing a microphone array was adopted to obtain acoustic voice data from the person. The proposed audio-visual identity recognition approach deploys only two cameras in a specific indoor space for conveniently performing face detection and quickly determining the total number of people in the specific space. Such information pertaining to the number of people in the indoor space obtained using face detection was utilized to effectively regulate the accurate GMM speaker classification tree design. Two face-detection-regulated speaker classification tree schemes are presented for the GMM speaker recognition method in this study—the binary speaker classification tree (GMM-BT) and the non-binary speaker classification tree (GMM-NBT). The proposed GMM-BT and GMM-NBT methods achieve excellent identity recognition rates of 84.28% and 83%, respectively; both values are higher than the rate of the conventional GMM approach (80.5%). Moreover, as the extremely complex calculations of face recognition in general audio-visual speaker recognition tasks are not required, the proposed approach is rapid and efficient with only a slight increment of 0.051 s in the average recognition time.


Author(s):  
J. M. C. Clark ◽  
P.A. Kountouriotis ◽  
R. B. Vinter
Keyword(s):  

2019 ◽  
Vol 70 (3) ◽  
pp. 214-224
Author(s):  
Bui Ngoc Dung ◽  
Manh Dzung Lai ◽  
Tran Vu Hieu ◽  
Nguyen Binh T. H.

Video surveillance is emerging research field of intelligent transport systems. This paper presents some techniques which use machine learning and computer vision in vehicles detection and tracking. Firstly the machine learning approaches using Haar-like features and Ada-Boost algorithm for vehicle detection are presented. Secondly approaches to detect vehicles using the background subtraction method based on Gaussian Mixture Model and to track vehicles using optical flow and multiple Kalman filters were given. The method takes advantages of distinguish and tracking multiple vehicles individually. The experimental results demonstrate high accurately of the method.


2018 ◽  
Vol 30 (4) ◽  
pp. 642
Author(s):  
Guichao Lin ◽  
Yunchao Tang ◽  
Xiangjun Zou ◽  
Qing Zhang ◽  
Xiaojie Shi ◽  
...  

Sign in / Sign up

Export Citation Format

Share Document