Neural network model for video-based facial expression recognition in-the-wild on mobile devices

Author(s):  
Polina Demochkina ◽  
Andrey V. Savchenko
2021 ◽  
Vol 2 (4) ◽  
pp. 1-26
Author(s):  
Peining Zhen ◽  
Hai-Bao Chen ◽  
Yuan Cheng ◽  
Zhigang Ji ◽  
Bin Liu ◽  
...  

Mobile devices usually suffer from limited computation and storage resources, which seriously hinders them from deep neural network applications. In this article, we introduce a deeply tensor-compressed long short-term memory (LSTM) neural network for fast video-based facial expression recognition on mobile devices. First, a spatio-temporal facial expression recognition LSTM model is built by extracting time-series feature maps from facial clips. The LSTM-based spatio-temporal model is further deeply compressed by means of quantization and tensorization for mobile device implementation. Based on datasets of Extended Cohn-Kanade (CK+), MMI, and Acted Facial Expression in Wild 7.0, experimental results show that the proposed method achieves 97.96%, 97.33%, and 55.60% classification accuracy and significantly compresses the size of network model up to 221× with reduced training time per epoch by 60%. Our work is further implemented on the RK3399Pro mobile device with a Neural Process Engine. The latency of the feature extractor and LSTM predictor can be reduced 30.20× and 6.62× , respectively, on board with the leveraged compression methods. Furthermore, the spatio-temporal model costs only 57.19 MB of DRAM and 5.67W of power when running on the board.


2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Seyed Muhammad Hossein Mousavi ◽  
S. Younes Mirinezhad

AbstractThis study presents a new color-depth based face database gathered from different genders and age ranges from Iranian subjects. Using suitable databases, it is possible to validate and assess available methods in different research fields. This database has application in different fields such as face recognition, age estimation and Facial Expression Recognition and Facial Micro Expressions Recognition. Image databases based on their size and resolution are mostly large. Color images usually consist of three channels namely Red, Green and Blue. But in the last decade, another aspect of image type has emerged, named “depth image”. Depth images are used in calculating range and distance between objects and the sensor. Depending on the depth sensor technology, it is possible to acquire range data differently. Kinect sensor version 2 is capable of acquiring color and depth data simultaneously. Facial expression recognition is an important field in image processing, which has multiple uses from animation to psychology. Currently, there is a few numbers of color-depth (RGB-D) facial micro expressions recognition databases existing. With adding depth data to color data, the accuracy of final recognition will be increased. Due to the shortage of color-depth based facial expression databases and some weakness in available ones, a new and almost perfect RGB-D face database is presented in this paper, covering Middle-Eastern face type. In the validation section, the database will be compared with some famous benchmark face databases. For evaluation, Histogram Oriented Gradients features are extracted, and classification algorithms such as Support Vector Machine, Multi-Layer Neural Network and a deep learning method, called Convolutional Neural Network or are employed. The results are so promising.


Sign in / Sign up

Export Citation Format

Share Document