Expression recognition from 3D dynamic faces using robust spatio-temporal shape features

Author(s): Vuong Le, Hao Tang, Thomas S. Huang

2008 · Vol 41 (1) · pp. 204-216
Author(s): T. Xiang, M.K.H. Leung, S.Y. Cho

2013 · Vol 347-350 · pp. 3780-3785
Author(s): Jing Jie Yan, Ming Han Xin

Although spatio-temporal (ST) features have recently been developed and shown to be effective for facial expression recognition and behavior recognition in videos, they represent each cuboid by directly flattening it into a feature vector, which makes the resulting vector potentially sensitive to small cuboid perturbations or noise. To overcome this drawback, we propose a novel method, termed fused spatio-temporal features (FST), which uses separable linear filters to detect interest points and fuses two cuboid representations, a local histogrammed gradient descriptor and the flattened cuboid vector, into a single cuboid descriptor. The proposed FST method is robust to small cuboid perturbations and noise while preserving both spatial and temporal positional information. Experimental results on two video-based facial expression databases demonstrate the effectiveness of the proposed method.
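The fusion of the two cuboid representations described above can be sketched minimally as follows. This is an illustrative assumption of how such a descriptor might be composed (function name, bin count, and cuboid shape are hypothetical, not from the paper): the flattened cuboid preserves positional information, while a normalized gradient-orientation histogram adds robustness to small perturbations.

```python
import numpy as np

def fused_cuboid_descriptor(cuboid, n_bins=8):
    """Hypothetical sketch: concatenate the flattened cuboid with a
    magnitude-weighted histogram of its per-frame gradient orientations."""
    flat = cuboid.ravel().astype(float)  # preserves spatial/temporal positions
    # spatial gradients within each frame of the (frames, height, width) cuboid
    gy, gx = np.gradient(cuboid.astype(float), axis=(1, 2))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)  # orientations in (-pi, pi]
    hist, _ = np.histogram(ang, bins=n_bins, range=(-np.pi, np.pi), weights=mag)
    hist = hist / (hist.sum() + 1e-8)  # normalized: less sensitive to noise
    return np.concatenate([flat, hist])

cub = np.random.default_rng(0).random((5, 13, 13))  # toy 5-frame cuboid
desc = fused_cuboid_descriptor(cub)
print(desc.shape)  # (5*13*13 + 8,) = (853,)
```

In practice the flattened part could be reduced (e.g. by PCA) before classification; the sketch only shows the fusion idea itself.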


2003 · Vol 36 (5) · pp. 1131-1142
Author(s): Antonio Fernandez-Caballero, Miguel A. Fernandez, Jose Mira, Ana E. Delgado

2021 · Vol 2 (4) · pp. 1-26
Author(s): Peining Zhen, Hai-Bao Chen, Yuan Cheng, Zhigang Ji, Bin Liu, ...

Mobile devices usually suffer from limited computation and storage resources, which seriously hinders them from running deep neural network applications. In this article, we introduce a deeply tensor-compressed long short-term memory (LSTM) neural network for fast video-based facial expression recognition on mobile devices. First, a spatio-temporal facial expression recognition LSTM model is built by extracting time-series feature maps from facial clips. The LSTM-based spatio-temporal model is then deeply compressed by means of quantization and tensorization for mobile device implementation. On the Extended Cohn-Kanade (CK+), MMI, and Acted Facial Expressions in the Wild 7.0 datasets, experimental results show that the proposed method achieves 97.96%, 97.33%, and 55.60% classification accuracy, respectively, while compressing the network model size by up to 221× and reducing training time per epoch by 60%. Our work is further implemented on the RK3399Pro mobile device with a Neural Process Engine. The latency of the feature extractor and LSTM predictor can be reduced by 30.20× and 6.62×, respectively, on board with the leveraged compression methods. Furthermore, the spatio-temporal model costs only 57.19 MB of DRAM and 5.67 W of power when running on the board.
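The quantization step mentioned in the abstract can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the function name and matrix shape are hypothetical, the tensorization step is omitted, and only the storage effect of uniform 8-bit weight quantization is shown.

```python
import numpy as np

def quantize_int8(w):
    """Hypothetical uniform 8-bit quantization: int8 codes plus one float scale."""
    scale = np.abs(w).max() / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512)).astype(np.float32)  # e.g. one LSTM gate matrix
q, scale = quantize_int8(W)

# float32 (4 bytes) -> int8 (1 byte): 4x storage reduction from quantization alone;
# the reported 221x total compression additionally relies on tensor decomposition.
print(W.nbytes // q.nbytes)  # 4

# round-off error of the dequantized weights is bounded by half a quantization step
err = np.abs(W - q.astype(np.float32) * scale).max()
```

Uniform per-tensor quantization like this keeps inference simple (one multiply per tensor to dequantize), at the cost of coarser resolution when a few weights are much larger than the rest.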

