Off-TANet: A Lightweight Neural Micro-expression Recognizer with Optical Flow Features and Integrated Attention Mechanism

2021 ◽  
pp. 266-279
Author(s):  
Jiahao Zhang ◽  
Feng Liu ◽  
Aimin Zhou
2019 ◽  
Vol 29 (01) ◽  
pp. 2050006 ◽  
Author(s):  
Qiuyu Li ◽  
Jun Yu ◽  
Toru Kurihara ◽  
Haiyan Zhang ◽  
Shu Zhan

Micro-expression is a kind of brief facial movements which could not be controlled by the nervous system. Micro-expression indicates that a person is hiding his true emotion consciously. Micro-expression recognition has various potential applications in public security and clinical medicine. Researches are focused on the automatic micro-expression recognition, because it is hard to recognize the micro-expression by people themselves. This research proposed a novel algorithm for automatic micro-expression recognition which combined a deep multi-task convolutional network for detecting the facial landmarks and a fused deep convolutional network for estimating the optical flow features of the micro-expression. First, the deep multi-task convolutional network is employed to detect facial landmarks with the manifold-related tasks for dividing the facial region. Furthermore, a fused convolutional network is applied for extracting the optical flow features from the facial regions which contain the muscle changes when the micro-expression appears. Because each video clip has many frames, the original optical flow features of the whole video clip will have high number of dimensions and redundant information. This research revises the optical flow features for reducing the redundant dimensions. Finally, a revised optical flow feature is applied for refining the information of the features and a support vector machine classifier is adopted for recognizing the micro-expression. The main contribution of work is combining the deep multi-task learning neural network and the fusion optical flow network for micro-expression recognition and revising the optical flow features for reducing the redundant dimensions. The results of experiments on two spontaneous micro-expression databases prove that our method achieved competitive performance in micro-expression recognition.


Author(s):  
Yongyi Tang ◽  
Lin Ma ◽  
Lianqiang Zhou

Appearance and motion are two key components to depict and characterize the video content. Currently, the two-stream models have achieved state-of-the-art performances on video classification. However, extracting motion information, specifically in the form of optical flow features, is extremely computationally expensive, especially for large-scale video classification. In this paper, we propose a motion hallucination network, namely MoNet, to imagine the optical flow features from the appearance features, with no reliance on the optical flow computation. Specifically, MoNet models the temporal relationships of the appearance features and exploits the contextual relationships of the optical flow features with concurrent connections. Extensive experimental results demonstrate that the proposed MoNet can effectively and efficiently hallucinate the optical flow features, which together with the appearance features consistently improve the video classification performances. Moreover, MoNet can help cutting down almost a half of computational and data-storage burdens for the two-stream video classification. Our code is available at: https://github.com/YongyiTang92/MoNet-Features


2020 ◽  
Vol 14 (13) ◽  
pp. 1845-1854
Author(s):  
Yuan Hu ◽  
Hubert P. H. Shum ◽  
Edmond S. L. Ho

Author(s):  
Hongwei Ren ◽  
Chenghao Li ◽  
Xinyi Zhang ◽  
Chenchen Ding ◽  
Changhai Man ◽  
...  

Sensors ◽  
2019 ◽  
Vol 19 (23) ◽  
pp. 5142 ◽  
Author(s):  
Dong Liang ◽  
Jiaxing Pan ◽  
Han Sun ◽  
Huiyu Zhou

Foreground detection is an important theme in video surveillance. Conventional background modeling approaches build sophisticated temporal statistical model to detect foreground based on low-level features, while modern semantic/instance segmentation approaches generate high-level foreground annotation, but ignore the temporal relevance among consecutive frames. In this paper, we propose a Spatio-Temporal Attention Model (STAM) for cross-scene foreground detection. To fill the semantic gap between low and high level features, appearance and optical flow features are synthesized by attention modules via the feature learning procedure. Experimental results on CDnet 2014 benchmarks validate it and outperformed many state-of-the-art methods in seven evaluation metrics. With the attention modules and optical flow, its F-measure increased 9 % and 6 % respectively. The model without any tuning showed its cross-scene generalization on Wallflower and PETS datasets. The processing speed was 10.8 fps with the frame size 256 by 256.


Sign in / Sign up

Export Citation Format

Share Document