Emotion Categorization from Video-Frame Images Using a Novel Sequential Voting Technique

2021 ◽  
Author(s):  
Harisu Abdullahi Shehu ◽  
William Browne ◽  
Hedwig Eisenbarth

Emotion categorization is the process of identifying different emotions in humans based on their facial expressions. It requires time, and human classifiers often find it hard to agree with each other about the emotion category of a facial expression. However, machine learning classifiers have performed well in classifying different emotions and have been widely used in recent years to facilitate the task of emotion categorization. Much research on emotion video databases uses only a few frames from when the emotion is expressed at its peak to classify emotion, which might not give good classification accuracy when predicting frames where the emotion is less intense. In this paper, using the CK+ emotion dataset as an example, we use more frames to analyze emotion from mid and peak frame images and compare our results to a method using fewer peak frames. Furthermore, we propose an approach based on sequential voting and apply it to more frames of the CK+ database. Our approach resulted in up to 85.9% accuracy for the mid frames and an overall accuracy of 96.5% for the CK+ database, compared with accuracies of 73.4% and 93.8% from existing techniques.
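
As a minimal sketch of the idea (the abstract does not spell out the exact voting rule, so the majority scheme and the function below are our assumptions), frame-level emotion predictions from a classifier could be combined sequentially so that the video-level label comes from accumulated votes rather than from a single peak frame:

```python
from collections import Counter

def sequential_vote(frame_predictions):
    """Illustrative sketch only: accumulate per-frame emotion labels across a
    video sequence and return the label with the most votes."""
    votes = Counter()
    for label in frame_predictions:   # per-frame classifier outputs, in sequence order
        votes[label] += 1
    # Counter.most_common breaks ties in favour of the first-encountered label.
    return votes.most_common(1)[0][0]

# Mid frames are less intense and noisier, yet the sequence still votes "happy".
print(sequential_vote(["neutral", "happy", "happy", "surprise", "happy"]))
```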


Author(s):  
Tim Oliver ◽  
Michelle Leonard ◽  
Juliet Lee ◽  
Akira Ishihara ◽  
Ken Jacobson

We are using video-enhanced light microscopy to investigate the pattern and magnitude of forces that fish keratocytes exert on flexible silicone rubber substrata. Our goal is a clearer understanding of the way molecular motors acting through the cytoskeleton co-ordinate their efforts into locomotion at cell velocities up to 1 μm/sec. Cell traction forces were previously observed as wrinkles (Fig. 1) in strong silicone rubber films by Harris (1). These forces are now measurable by two independent means. In the first of these assays, weakly crosslinked films are made into which latex beads have been embedded (Fig. 2). These films report local cell-mediated traction forces as bead displacements in the plane of the film (Fig. 3), which recover when the applied force is released. Calibrated flexible glass microneedles are then used to reproduce the translation of individual beads. We estimate the force required to distort these films to be 0.5 mdyne/μm of bead movement. Video-frame analysis of bead trajectories is providing data on the relative localisation, dissipation and kinetics of traction forces.
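
Given the calibration quoted above (roughly 0.5 mdyne of force per micrometre of bead movement), a bead displacement measured from video frames maps to a traction force by a simple linear scaling. The sketch below is ours and assumes that linearity holds over the displacements of interest:

```python
def traction_force_mdyne(displacement_um, stiffness_mdyne_per_um=0.5):
    """Illustrative sketch, not the authors' analysis code: convert a bead
    displacement (micrometres, from video-frame tracking) into an estimated
    traction force using the ~0.5 mdyne/um film calibration quoted above."""
    return stiffness_mdyne_per_um * displacement_um

# A bead translated 2.4 um in the plane of the film implies ~1.2 mdyne of traction.
print(traction_force_mdyne(2.4))
```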


Information ◽  
2020 ◽  
Vol 12 (1) ◽  
pp. 3
Author(s):  
Shuang Chen ◽  
Zengcai Wang ◽  
Wenxin Chen

The effective detection of driver drowsiness is an important measure to prevent traffic accidents. Most existing drowsiness detection methods only use a single facial feature to identify fatigue status, ignoring the complex correlation between fatigue features and the time information of fatigue features, which reduces recognition accuracy. To solve these problems, we propose a driver sleepiness estimation model based on factorized bilinear feature fusion and a long short-term recurrent convolutional network to detect driver drowsiness efficiently and accurately. The proposed framework includes three modules: fatigue feature extraction, fatigue feature fusion, and driver drowsiness detection. First, we used a convolutional neural network (CNN) to effectively extract the deep representation of eye- and mouth-related fatigue features from the face area detected in each video frame. Then, based on the factorized bilinear feature fusion model, we performed a nonlinear fusion of the deep feature representations of the eyes and mouth. Finally, we input a series of fused frame-level features into a long short-term memory (LSTM) unit to obtain the time information of the features and used a softmax classifier to detect sleepiness. The proposed framework was evaluated with the National Tsing Hua University drowsy driver detection (NTHU-DDD) video dataset. The experimental results showed that this method had better stability and robustness compared with other methods.
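
A compact sketch of the fusion-then-temporal-modelling pipeline described above follows; the layer sizes, the low-rank (factorized) bilinear fusion via element-wise product, and the class count are our assumptions, not details taken from the paper:

```python
import torch
import torch.nn as nn

class FatigueFusionNet(nn.Module):
    """Illustrative sketch (not the authors' code): factorized bilinear fusion
    of eye/mouth CNN features followed by an LSTM drowsiness classifier."""
    def __init__(self, feat_dim=256, factor_dim=64, hidden=128, n_classes=2):
        super().__init__()
        # Factorized bilinear pooling: project both modalities to a shared
        # low-rank space and fuse them by element-wise product.
        self.proj_eye = nn.Linear(feat_dim, factor_dim)
        self.proj_mouth = nn.Linear(feat_dim, factor_dim)
        self.lstm = nn.LSTM(factor_dim, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, eye_feats, mouth_feats):
        # eye_feats, mouth_feats: (batch, frames, feat_dim) from per-frame CNNs
        fused = self.proj_eye(eye_feats) * self.proj_mouth(mouth_feats)
        seq_out, _ = self.lstm(fused)              # temporal modelling over frames
        logits = self.classifier(seq_out[:, -1])   # last time step -> drowsy / alert
        return torch.softmax(logits, dim=-1)
```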


2020 ◽  
Vol 34 (07) ◽  
pp. 10607-10614 ◽  
Author(s):  
Xianhang Cheng ◽  
Zhenzhong Chen

Learning to synthesize non-existing frames from the original consecutive video frames is a challenging task. Recent kernel-based interpolation methods predict pixels with a single convolution process to replace the dependency on optical flow. However, when scene motion is larger than the pre-defined kernel size, these methods yield poor results even though they take thousands of neighboring pixels into account. To solve this problem, we propose deformable separable convolution (DSepConv) to adaptively estimate kernels, offsets and masks, allowing the network to obtain information from far fewer but more relevant pixels. In addition, we show that kernel-based methods and conventional flow-based methods are specific instances of the proposed DSepConv. Experimental results demonstrate that our method significantly outperforms other kernel-based interpolation methods and performs on par with or even better than state-of-the-art algorithms, both qualitatively and quantitatively.
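
The sketch below illustrates the deformable-kernel idea for a single output pixel; it is our own simplified rendering (nearest-neighbour sampling, single channel), not the DSepConv implementation:

```python
import numpy as np

def deformable_kernel_pixel(frame, y, x, kernel, offsets, mask):
    """Illustrative sketch of the deformable-kernel idea: synthesize one output
    pixel as a masked, weighted sum of K*K input pixels sampled at learned
    offsets around (y, x).

    frame:   (H, W) single-channel input frame
    kernel:  (K, K) learned per-pixel weights
    offsets: (K, K, 2) learned per-sample (dy, dx) displacements
    mask:    (K, K) learned per-sample modulation in [0, 1]
    """
    K = kernel.shape[0]
    r = K // 2
    value = 0.0
    for i in range(K):
        for j in range(K):
            dy, dx = offsets[i, j]
            # Nearest-neighbour sampling for brevity; bilinear sampling would be smoother.
            sy = int(round(np.clip(y + i - r + dy, 0, frame.shape[0] - 1)))
            sx = int(round(np.clip(x + j - r + dx, 0, frame.shape[1] - 1)))
            value += kernel[i, j] * mask[i, j] * frame[sy, sx]
    return value
```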


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Parsa Omidi ◽  
Mohamadreza Najiminaini ◽  
Mamadou Diop ◽  
Jeffrey J. L. Carson

Spatial resolution in three-dimensional fringe projection profilometry is determined in large part by the number and spacing of fringes projected onto an object. Due to the intensity-based nature of fringe projection profilometry, fringe patterns must be generated in succession, which is time-consuming. As a result, the surface features of highly dynamic objects are difficult to measure. Here, we introduce multispectral fringe projection profilometry, a novel method that utilizes multispectral illumination to project a multispectral fringe pattern onto an object combined with a multispectral camera to detect the deformation of the fringe patterns due to the object. The multispectral camera enables the detection of 8 unique monochrome fringe patterns representing 4 distinct directions in a single snapshot. Furthermore, for each direction, the camera detects two π-phase shifted fringe patterns. Each pair of fringe patterns can be differenced to generate a differential fringe pattern that corrects for illumination offsets and mitigates the effects of glare from highly reflective surfaces. The new multispectral method solves many practical problems related to conventional fringe projection profilometry and doubles the effective spatial resolution. The method is suitable for high-quality fast 3D profilometry at video frame rates.
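
Because the two patterns in each π-shifted pair share the same illumination offset, differencing them cancels that offset and doubles the fringe amplitude. A small sketch of that step is shown below; the channel ordering of the multispectral cube is an assumption on our part:

```python
import numpy as np

def differential_fringes(cube):
    """Illustrative sketch (assumed channel layout, not the authors' code):
    pair up the pi-phase-shifted fringe channels from a single multispectral
    snapshot and difference them to cancel the illumination offset.

    cube: (H, W, 8) array; channels 2k and 2k+1 are assumed to hold the
          0 and pi phase-shifted patterns for direction k (k = 0..3).
    Returns an (H, W, 4) array of differential fringe patterns, one per direction.
    """
    assert cube.shape[-1] == 8, "expected 8 fringe channels"
    # I0 = A + B*cos(phi), Ipi = A - B*cos(phi)  =>  I0 - Ipi = 2*B*cos(phi)
    return cube[..., 0::2].astype(float) - cube[..., 1::2].astype(float)
```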


2021 ◽  
Vol 2 (2) ◽  
Author(s):  
Alakananda Mitra ◽  
Saraju P. Mohanty ◽  
Peter Corcoran ◽  
Elias Kougianos

2021 ◽  
Vol 11 (11) ◽  
pp. 4726
Author(s):  
Muhammad Ayaz Shirazi ◽  
Riaz Uddin ◽  
Min-Young Kim

Video display content can be extended onto the living-room walls around the TV using projection. Generating appropriate projection content automatically is a hard problem, which we address with a deep neural network. We propose a peripheral vision system that provides an immersive visual experience by extending the video content with deep learning and projecting that content around the TV screen. A user could manually author suitable content for an existing TV screen, but doing so is too expensive. The PCE (pixel context encoder) network takes the center of the video frame as input and the surrounding area as output, learning to extend the content in a supervised manner. The proposed system is expected to pave a new road for the home appliance industry, transforming the living room into a new immersive experience platform.
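
As described, the PCE network is trained in a supervised fashion with the frame center as input and the surrounding region as the target. A minimal sketch of building such training pairs is shown below; the crop margin and function name are our assumptions, not details from the paper:

```python
def center_periphery_pair(frame, margin=0.25):
    """Illustrative sketch of supervised training pairs for a content-extension
    network: the center crop of a frame is the input, and the full frame
    (whose border region is the area to be extrapolated) is the target.

    frame:  (H, W, 3) image array
    margin: assumed fraction of height/width cropped from each side
    """
    h, w = frame.shape[:2]
    dy, dx = int(h * margin), int(w * margin)
    center = frame[dy:h - dy, dx:w - dx]   # network input: visible TV content
    target = frame                         # supervision: full frame incl. periphery
    return center, target
```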

