Expression Recognition Method Using Improved VGG16 Network Model in Robot Interaction

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Shengbin Wu

Aiming at the problems of poor representation ability and limited feature data that arise when traditional expression recognition methods are applied to intelligent applications, an expression recognition method based on an improved VGG16 network is proposed. First, the VGG16 network is improved by replacing small convolution kernels with large ones and removing some fully connected layers, reducing the complexity and parameter count of the model. Then, the high-dimensional abstract features output by the improved VGG16 are fed into a convolutional neural network (CNN) for training, which outputs the expression class with high accuracy. Finally, the expression recognition method combining the improved VGG16 and the CNN model is applied to human-computer interaction with the NAO robot, which performs different interactive actions according to the recognized expressions. Experimental results on the CK+ dataset show that the improved VGG16 network has strong supervised learning ability: it extracts discriminative features for different expression types, and its overall recognition accuracy is close to 90%. Repeated interaction tests show that the robot can stably recognize emotions and respond with the corresponding actions.
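The parameter saving from trimming fully connected layers can be illustrated with a quick parameter count. The layer sizes below are illustrative assumptions (a standard VGG16 head versus a hypothetical reduced head), not the paper's exact architecture:

```python
# Sketch: parameter counts for a VGG16-style classifier head, showing
# why removing fully connected layers shrinks the model so much.
# Layer sizes are assumptions for illustration, not the paper's design.

def fc_params(n_in, n_out):
    """Weights plus biases for one fully connected layer."""
    return n_in * n_out + n_out

# Standard VGG16 head: two 4096-unit FC layers before a 7-class output
# (512 x 7 x 7 is the flattened size of VGG16's last pooling map).
standard_head = (
    fc_params(512 * 7 * 7, 4096)
    + fc_params(4096, 4096)
    + fc_params(4096, 7)
)

# Reduced head: a single smaller FC layer, as the abstract suggests.
reduced_head = fc_params(512 * 7 * 7, 1024) + fc_params(1024, 7)

print(standard_head, reduced_head)
```

The fully connected head dominates VGG16's roughly 138 million parameters, so even a single-layer reduction cuts the model size substantially.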

Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6438
Author(s):  
Chiara Filippini ◽  
David Perpetuini ◽  
Daniela Cardone ◽  
Arcangelo Merla

An intriguing challenge in the human–robot interaction field is the prospect of endowing robots with emotional intelligence to make the interaction more genuine, intuitive, and natural. A crucial aspect in achieving this goal is the robot’s capability to infer and interpret human emotions. Thanks to its design and open programming platform, the NAO humanoid robot is one of the most widely used agents for human interaction. As with person-to-person communication, facial expressions are the privileged channel for recognizing the interlocutor’s emotional expressions. Although NAO is equipped with a facial expression recognition module, specific use cases may require additional features and affective computing capabilities that are not currently available. This study proposes a highly accurate convolutional-neural-network-based facial expression recognition model that further enhances the NAO robot’s awareness of human facial expressions and provides the robot with the capability to detect an interlocutor’s arousal level. Indeed, the model tested during human–robot interactions was 91% and 90% accurate in recognizing happy and sad facial expressions, respectively; 75% accurate in recognizing surprised and scared expressions; and less accurate in recognizing neutral and angry expressions. Finally, the model was successfully integrated into the NAO SDK, thus allowing for high-performing facial expression classification with an inference time of 0.34 ± 0.04 s.
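Per-expression accuracies like those reported are the per-class recalls of a confusion matrix. A minimal sketch of that computation, with hypothetical counts chosen to mirror the figures above:

```python
# Sketch: per-class recognition accuracy (recall) from a confusion
# matrix. The counts are made-up examples, not the study's data.

labels = ["happy", "sad", "surprised"]
# rows = true class, columns = predicted class (hypothetical counts)
confusion = [
    [91, 5, 4],    # happy
    [6, 90, 4],    # sad
    [15, 10, 75],  # surprised
]

per_class = {
    labels[i]: confusion[i][i] / sum(confusion[i])
    for i in range(len(labels))
}
print(per_class)
```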


Electronics ◽  
2020 ◽  
Vol 9 (12) ◽  
pp. 2056
Author(s):  
Junjie Wu ◽  
Jianfeng Xu ◽  
Deyu Lin ◽  
Min Tu

Micro-expression recognition remains understudied within the field of facial expressions, as current research methods focus mainly on feature extraction and classification. Based on optical flow and decision-theoretic thinking, we propose a novel micro-expression recognition method that filters out low-quality micro-expression video clips. Using preset thresholds, we develop two optical flow filtering mechanisms: one based on two-branch decisions (OFF2BD) and the other based on three-way decisions (OFF3WD). OFF2BD uses classical binary logic to classify images, dividing them into a positive or negative domain for further filtering. Unlike OFF2BD, OFF3WD adds a boundary domain that defers judgment of an image's motion quality. In this way, video clips with little morphological change can be eliminated, directly improving the quality of micro-expression features and the recognition rate. Experimental results verify recognition accuracies of 61.57% and 65.41% on the CASME II and SMIC datasets, respectively. Comparative analysis shows that the scheme effectively improves recognition performance.
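The core of the three-way mechanism can be sketched as a thresholded decision on a clip's motion statistic. The statistic (mean optical-flow magnitude) and the threshold values below are assumptions for illustration, not the paper's settings:

```python
# Sketch of the OFF3WD idea: clips with clearly strong motion are kept
# (positive domain), clearly weak motion is discarded (negative domain),
# and a boundary band defers the decision for further inspection.
# The statistic and thresholds are illustrative assumptions.

def three_way_filter(mean_flow_magnitude, low=0.2, high=0.5):
    if mean_flow_magnitude >= high:
        return "positive"   # keep: clear facial motion
    if mean_flow_magnitude <= low:
        return "negative"   # discard: low-quality clip
    return "boundary"       # defer judgment on motion quality

print(three_way_filter(0.7))  # positive
print(three_way_filter(0.1))  # negative
print(three_way_filter(0.3))  # boundary
```

OFF2BD corresponds to the same function with `low == high`, collapsing the boundary domain into a binary decision.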


2019 ◽  
Vol 8 (4) ◽  
pp. 9782-9787

Facial expression recognition is an important task for machines seeking to recognize expressive changes in individuals. Emotions have a strong relationship with our behavior: human emotions are discrete reactions to internal or external events that carry meaning. Automatic emotion detection is the process of understanding an individual's affective state and inferring his or her intentions from facial expressions, which are also a significant component of non-verbal communication. In this paper we propose a framework that combines discriminative features discovered using convolutional neural networks (CNNs) to enhance the performance and accuracy of facial expression recognition. We use the pre-trained Inception V3 architecture, concatenate the output of an intermediate layer with that of the final layer, and pass the combined features through a fully connected layer for classification. We evaluate on the JAFFE (Japanese Female Facial Expression) dataset, and experimental results show that the proposed method performs better and improves recognition accuracy.
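The concatenation step can be sketched numerically. The vectors below are random stand-ins for Inception V3 activations (2048 is the dimension of its final pooled features; the intermediate size and the 7-class output are assumptions):

```python
import numpy as np

# Sketch: fusing an intermediate-layer feature vector with the final
# pooled features before a fully connected classifier, as the framework
# describes. Random vectors stand in for real Inception V3 activations.

rng = np.random.default_rng(0)
intermediate = rng.standard_normal(768)   # assumed mid-network pooled map
final = rng.standard_normal(2048)         # Inception V3 final pooled features

combined = np.concatenate([intermediate, final])  # shape (2816,)

# A fully connected layer maps the combined vector to 7 expression classes.
W = rng.standard_normal((7, combined.size)) * 0.01
logits = W @ combined
pred = int(np.argmax(logits))
print(combined.shape, pred)
```

The design intuition is that intermediate layers retain localized texture cues (e.g., around eyes and mouth) that the final layer has already abstracted away.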


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1919
Author(s):  
Shuhua Liu ◽  
Huixin Xu ◽  
Qi Li ◽  
Fei Zhang ◽  
Kun Hou

To address the issues of robot object recognition in complex scenes, this paper proposes an object recognition method based on scene text reading. The proposed method simulates human-like behavior and accurately identifies objects bearing text by reading it carefully. First, deep learning models with high accuracy are adopted to detect and recognize text from multiple views. Second, datasets comprising 102,000 Chinese and English scene text images and their inverses are generated. Training the model on these two datasets improves the F-measure of text detection by 0.4% and the recognition accuracy by 1.26%. Finally, a robot object recognition method based on scene text reading is proposed. The robot detects and recognizes texts in the image and stores the recognition results in a text file. When the user gives the robot a fetching instruction, the robot searches the text files for the corresponding keywords and obtains confidence scores for multiple objects in the scene image; the object with the maximum confidence is selected as the target. The results show that the robot can accurately distinguish objects of arbitrary shape and category, effectively solving the problem of object recognition in home environments.
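The fetching step reduces to a keyword search over stored recognition results followed by a confidence maximum. A minimal sketch, with hypothetical object names, texts, and confidences:

```python
# Sketch of the target-selection step: recognized texts are stored per
# object with confidences; given an instruction keyword, the object
# whose matching text has the highest confidence is chosen.
# All data below is hypothetical.

detections = [
    {"object": "box_1", "text": "green tea", "confidence": 0.82},
    {"object": "box_2", "text": "black tea", "confidence": 0.91},
    {"object": "bottle_1", "text": "mineral water", "confidence": 0.77},
]

def find_target(keyword, detections):
    matches = [d for d in detections if keyword in d["text"]]
    if not matches:
        return None  # no object in the scene carries the keyword
    return max(matches, key=lambda d: d["confidence"])["object"]

print(find_target("tea", detections))  # box_2: highest-confidence match
```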


2011 ◽  
Vol 121-126 ◽  
pp. 2141-2145 ◽  
Author(s):  
Wei Gang Yan ◽  
Chang Jian Wang ◽  
Jin Guo

This paper proposes a new image segmentation algorithm to detect flame images in video from an enclosed compartment. To avoid the contamination of soot and water vapor, the method first employs the cubic root of four color channels to transform an RGB image into a pseudo-gray one. The latter is then divided into many small stripes (child images), and Otsu's method is employed to segment each child image. Lastly, the processed child images are reassembled into a whole image. A computer program using the OpenCV library is developed, and the new method is compared with other commonly used methods such as edge detection and the standard Otsu method. The new method is found to have better flame image recognition accuracy and can be used to extract flame shapes from noisy experimental video.
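The stripe-wise pipeline can be sketched with numpy alone. The pseudo-gray transform below (cube root of the product of the RGB channels) is an assumption standing in for the paper's four-channel formula, and the Otsu step is a minimal reimplementation rather than OpenCV's:

```python
import numpy as np

# Sketch: pseudo-gray conversion, horizontal striping, and per-stripe
# Otsu thresholding. The exact four-channel transform from the paper is
# not reproduced; np.cbrt of the channel product is an assumption.

def otsu_threshold(gray):
    """Minimal Otsu: maximize between-class variance over 256 bins."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    total = gray.size
    sum_all = np.dot(np.arange(256), hist)
    best_t, best_var, w0, sum0 = 0, 0.0, 0, 0.0
    for t in range(256):
        w0 += hist[t]
        if w0 == 0 or w0 == total:
            continue
        sum0 += t * hist[t]
        m0 = sum0 / w0                       # mean of class below t
        m1 = (sum_all - sum0) / (total - w0)  # mean of class above t
        var = w0 * (total - w0) * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def segment_in_stripes(rgb, n_stripes=4):
    pseudo_gray = np.cbrt(rgb.astype(float).prod(axis=2))
    pseudo_gray = (255 * pseudo_gray / pseudo_gray.max()).astype(np.uint8)
    stripes = np.array_split(pseudo_gray, n_stripes, axis=0)
    binarized = [s > otsu_threshold(s) for s in stripes]
    return np.vstack(binarized)  # reassemble child images into one mask

# Toy image: a bright "flame" block on a dark background.
img = np.zeros((8, 8, 3), dtype=np.uint8)
img[2:6, 2:6] = 200
mask = segment_in_stripes(img, n_stripes=2)
print(mask.sum())  # pixels classified as flame
```

Thresholding each stripe independently lets dim flame regions survive in stripes where they are locally the brightest content, which a single global threshold would suppress.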


2021 ◽  
Vol 39 (1B) ◽  
pp. 1-10
Author(s):  
Iman H. Hadi ◽  
Alia K. Abdul-Hassan

Speaker recognition depends on specific predefined steps, the most important of which are feature extraction and feature matching. In addition, the category of voice features used has an impact on the recognition process. The proposed speaker recognition method uses biometric (voice) attributes to recognize the identity of the speaker. Long-term features were used, namely maximum frequency, pitch, and zero-crossing rate (ZCR). In the feature matching step, the fuzzy inner product between feature vectors was used to compute the matching value between a claimed speaker's voice utterance and test utterances. The experiments were implemented using the ELSDSR dataset and showed a recognition accuracy of 100% for text-dependent speaker recognition.
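The matching step can be sketched as follows. The fuzzy inner product is taken here as the sum of element-wise minima over normalized features, which is one common definition; the paper's exact form may differ, and the feature values are hypothetical:

```python
# Sketch: long-term features (max frequency, pitch, ZCR) compared with
# a fuzzy inner product, here sum(min(a_i, b_i)) over normalized
# features -- an assumed definition. Feature values are hypothetical.

def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs that change sign."""
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(signal) - 1)

def fuzzy_inner_product(u, v):
    return sum(min(a, b) for a, b in zip(u, v))

# Hypothetical normalized feature vectors: [max frequency, pitch, ZCR].
claimed = [0.62, 0.48, zero_crossing_rate([1, -1, 1, -1, 1])]
test_utt = [0.60, 0.50, zero_crossing_rate([1, -1, 1, 1, -1])]

score = fuzzy_inner_product(claimed, test_utt)
print(round(score, 3))
```

The claimed identity would then be accepted when the matching value exceeds a decision threshold, or when it is the maximum over all enrolled speakers.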


2021 ◽  
Author(s):  
Qiuchen Wang ◽  
Xiaowei Xu ◽  
Ye Tao ◽  
Xiaodong Wang ◽  
Fangfang Chen ◽  
...  
