Research on a Microexpression Recognition Technology Based on Multimodal Fusion

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Jie Kang ◽  
Xiao Ying Chen ◽  
Qi Yuan Liu ◽  
Si Han Jin ◽  
Cheng Han Yang ◽  
...  

Microexpressions have extremely high value in national security, public safety, medical, and other fields. However, microexpressions differ markedly from macroexpressions: their duration is short and their changes are weak, which greatly increases the difficulty of microexpression recognition. In this paper, we propose a microexpression recognition method based on multimodal fusion, developed through a comparative study of the traditional LBP algorithm and the CNN and LSTM deep learning algorithms. The method couples microexpression image information with the corresponding body temperature information to establish a multimodal fusion microexpression database. The paper first describes how to build a multimodal fusion microexpression database in a laboratory environment, then compares the recognition accuracy of LBP, LSTM, and CNN + LSTM networks for microexpressions, and finally selects the better-performing CNN + LSTM network for model training and testing on both the image-only microexpression database and the multimodal fusion database. The experimental results show that the proposed method is more accurate after feature fusion than unimodal recognition, reaching a recognition rate of 75.1%, which demonstrates that the method is feasible and effective in improving the microexpression recognition rate and has good practical value.
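
To make the fusion idea concrete, here is a minimal PyTorch sketch of a CNN + LSTM pipeline of the kind the abstract describes, with per-frame image features concatenated with a body-temperature reading before sequence modeling. All layer sizes, names, and the late-stage fusion point are illustrative assumptions, not the authors' architecture.

```python
# Hypothetical sketch: per-frame CNN features are concatenated with a
# body-temperature value (feature-level multimodal fusion), then an LSTM
# aggregates the frame sequence for classification. Sizes are made up.
import torch
import torch.nn as nn

class CnnLstmFusion(nn.Module):
    def __init__(self, num_classes=5):
        super().__init__()
        self.cnn = nn.Sequential(              # per-frame feature extractor
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.lstm = nn.LSTM(32 + 1, 64, batch_first=True)  # +1 temperature channel
        self.head = nn.Linear(64, num_classes)

    def forward(self, frames, temps):
        # frames: (B, T, 1, H, W); temps: (B, T) body temperature per frame
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        fused = torch.cat([feats, temps.unsqueeze(-1)], dim=-1)  # fusion step
        out, _ = self.lstm(fused)
        return self.head(out[:, -1])           # classify from last hidden state

model = CnnLstmFusion()
logits = model(torch.randn(2, 8, 1, 64, 64), torch.rand(2, 8))
print(logits.shape)  # torch.Size([2, 5])
```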


Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4333
Author(s):  
Pengfei Zhao ◽  
Lijia Huang ◽  
Yu Xin ◽  
Jiayi Guo ◽  
Zongxu Pan

At present, synthetic aperture radar (SAR) automatic target recognition (ATR) is deeply researched and widely used in military and civilian fields. SAR images are very sensitive to the azimuth aspect of the imaging geometry; the same target differs greatly across aspects. A multi-aspect SAR image sequence therefore contains more information for classification and recognition, which calls for a reliable and robust multi-aspect target recognition method. Most current SAR target recognition methods are based on deep learning. However, SAR datasets are usually expensive to obtain, especially for a specific target, and it is difficult to collect enough samples to train a deep learning model. This paper proposes a multi-aspect SAR target recognition method based on a prototypical network. Furthermore, techniques such as multi-task learning and multi-level feature fusion are introduced to enhance recognition accuracy when only a small number of training samples are available. Experiments on the MSTAR dataset show that the recognition accuracy of our method approaches the accuracy achieved with all samples, and that the method can be applied to other feature extraction models to handle small-sample learning problems.
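
As a reference point for the few-shot mechanism, here is a small sketch of the prototypical-network classification rule: class prototypes are the mean embeddings of the few support samples, and a query is assigned to the nearest prototype. The embedding dimensions and label layout are placeholders standing in for any backbone's output.

```python
# Illustrative prototypical-network step, assuming embeddings already exist.
import torch

def prototypes(support_emb, support_lbl, num_classes):
    # support_emb: (N, D) embeddings; support_lbl: (N,) integer class labels
    return torch.stack([support_emb[support_lbl == c].mean(0)
                        for c in range(num_classes)])

def classify(query_emb, protos):
    # query_emb: (M, D); protos: (C, D) -> predicted class per query
    dists = torch.cdist(query_emb, protos)   # Euclidean distance to prototypes
    return dists.argmin(dim=1)

emb = torch.randn(10, 32)                            # 10 support embeddings
lbl = torch.tensor([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])   # 3 target classes
protos = prototypes(emb, lbl, num_classes=3)
print(classify(torch.randn(4, 32), protos))          # e.g. tensor([2, 0, 1, 0])
```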



2014 ◽  
Vol 989-994 ◽  
pp. 4187-4190 ◽  
Author(s):  
Lin Zhang

An adaptive gender recognition method is proposed in this paper. First, a multiwavelet transform is applied to the face image to obtain its low-frequency information; features are then extracted from the low-frequency information using compressive sensing (CS), and finally an extreme learning machine (ELM) performs the gender recognition. In the feature extraction stage, a genetic algorithm (GA) is used to choose the number of CS measurements that yields the highest recognition rate, so the method can adaptively reach optimal performance. Experimental results show that, compared with PDA and LDA, the new method improves the recognition accuracy substantially.
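
A rough numpy sketch of the CS + ELM stages the abstract outlines follows: features are compressed with a random Gaussian measurement matrix (the measurement count m is the quantity the GA would tune), and the ELM output weights are solved in closed form by least squares. All dimensions are made-up placeholders.

```python
# Hedged sketch of compressive-sensing feature reduction followed by an ELM.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 256))           # 100 low-frequency feature vectors
y = rng.integers(0, 2, 100)               # gender labels (0/1)

m = 64                                    # measurement count (GA-tuned in the paper)
Phi = rng.normal(size=(m, 256)) / np.sqrt(m)   # random CS measurement matrix
Z = X @ Phi.T                             # compressed features, shape (100, m)

# ELM: fixed random hidden layer, output weights via pseudo-inverse
W = rng.normal(size=(m, 128))
H = np.tanh(Z @ W)                        # hidden-layer activations
beta = np.linalg.pinv(H) @ np.eye(2)[y]   # closed-form output weights
pred = (H @ beta).argmax(1)
print("train accuracy:", (pred == y).mean())
```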



Entropy ◽  
2019 ◽  
Vol 21 (10) ◽  
pp. 999
Author(s):  
Yuting Pu ◽  
Honggeng Yang ◽  
Xiaoyang Ma ◽  
Xiangxun Sun

The recognition of voltage sag sources is the basis for formulating a voltage sag governance plan and clarifying responsibility for an incident. To address this recognition problem, a method based on phase space reconstruction and improved Visual Geometry Group (VGG) transfer learning is proposed from the perspective of image classification. First, phase space reconstruction is used to transform voltage sag signals into reconstruction images, from which the intuitive characteristics of different sag sources are analyzed. Second, the standard VGG 16 model is improved with an attention mechanism to extract features more completely and to prevent over-fitting. Finally, the VGG model is trained using transfer learning, which improves both training efficiency and the recognition accuracy of sag sources; the model is trained to minimize the cross-entropy loss function. Simulation analysis verifies the effectiveness and superiority of the proposed method.
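
To illustrate the first stage, here is a minimal sketch of phase-space reconstruction for a voltage waveform: a time-delay embedding turns the 1-D signal into a 2-D trajectory that can be rasterized into an image for a CNN. The delay, grid size, and sag depth are assumptions, not the paper's settings.

```python
# Hypothetical time-delay embedding of a voltage sag into a trajectory image.
import numpy as np

def phase_space_image(signal, delay=20, bins=64):
    # Pair each sample with its delayed copy: points (x(t), x(t + delay)).
    x, y = signal[:-delay], signal[delay:]
    img, _, _ = np.histogram2d(x, y, bins=bins,
                               range=[[-1.5, 1.5], [-1.5, 1.5]])
    return (img > 0).astype(np.float32)   # binary trajectory image

t = np.linspace(0, 0.2, 2000)
sag = np.sin(2 * np.pi * 50 * t)          # 50 Hz voltage waveform
sag[800:1200] *= 0.4                      # simulated sag in the middle window
print(phase_space_image(sag).shape)       # (64, 64)
```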



Electronics ◽  
2020 ◽  
Vol 9 (12) ◽  
pp. 2056
Author(s):  
Junjie Wu ◽  
Jianfeng Xu ◽  
Deyu Lin ◽  
Min Tu

The recognition accuracy of micro-expressions in the field of facial expressions is still understudied, as current research mainly focuses on feature extraction and classification. Based on optical flow and decision-theoretic thinking, we propose a novel micro-expression recognition method that can filter out low-quality micro-expression video clips. Using preset thresholds, we develop two optical flow filtering mechanisms: one based on two-branch decisions (OFF2BD) and the other based on three-way decisions (OFF3WD). OFF2BD uses classical binary logic to classify images, dividing them into a positive or negative domain for further filtering. Unlike OFF2BD, OFF3WD adds a boundary domain that defers judgment on the motion quality of an image. In this way, video clips with a low degree of morphological change can be eliminated, directly improving the quality of micro-expression features and the recognition rate. Experimentally, we verify recognition accuracies of 61.57% and 65.41% on the CASME II and SMIC datasets, respectively. Comparative analysis shows that the scheme can effectively improve recognition performance.
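
The three-way decision idea reduces to comparing a motion statistic against two thresholds. The sketch below assumes the statistic is the clip's mean optical-flow magnitude; the threshold values and that choice of statistic are illustrative, not taken from the paper.

```python
# Hedged sketch of OFF3WD-style filtering: a clip goes to the positive region
# (keep), negative region (discard), or boundary region (defer judgment).
import numpy as np

def three_way_filter(flow_mags, low=0.05, high=0.20):
    # flow_mags: per-frame mean optical-flow magnitudes of one clip
    score = float(np.mean(flow_mags))
    if score >= high:
        return "positive"    # clear motion: keep for recognition
    if score <= low:
        return "negative"    # too little motion: eliminate clip
    return "boundary"        # defer judgment (the extra region in OFF3WD)

print(three_way_filter([0.30, 0.25, 0.28]))  # positive
print(three_way_filter([0.10, 0.12, 0.09]))  # boundary
```

A two-branch (OFF2BD) variant would collapse this to a single threshold with only the positive and negative outcomes.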



2020 ◽  
Author(s):  
Dongsheng Ji ◽  
Yanzhong Zhao ◽  
Zhujun Zhang ◽  
Qianchuan Zhao

COVID-19 (novel coronavirus pneumonia) image recognition demands a large number of samples, and the recognition accuracy achievable with few samples is not ideal. In this paper, a COVID-19-positive image recognition method based on small-sample recognition is proposed. First, the CT images are preprocessed and converted into the formats required for transfer learning. Second, small-sample image augmentation and expansion are applied to the converted images, such as shear transformations, random rotations, and translations. Then, multiple transfer learning models are used to extract features, which are subsequently fused. Finally, the model is adjusted by fine-tuning and trained to obtain the experimental results. The experimental results show that our method achieves excellent recognition performance on COVID-19 images, even with only a small number of CT image samples.
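
For the augmentation step, here is a hedged torchvision sketch applying shear, random rotation, and translation to a CT slice to expand a small sample set. The transform parameters and the number of variants per image are illustrative assumptions, not the authors' settings.

```python
# Hypothetical small-sample augmentation pipeline for CT slices.
import torch
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomAffine(degrees=15,            # random rotation
                            translate=(0.1, 0.1),  # random translation
                            shear=10),             # shear transformation
    transforms.RandomHorizontalFlip(),
])

ct_slice = torch.rand(1, 224, 224)        # stand-in for a preprocessed CT image
expanded = [augment(ct_slice) for _ in range(20)]  # 20 variants per sample
print(len(expanded), expanded[0].shape)
```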



Electronics ◽  
2020 ◽  
Vol 10 (1) ◽  
pp. 20
Author(s):  
Linhui Sun ◽  
Yunyi Bu ◽  
Bo Zou ◽  
Sheng Fu ◽  
Pingan Li

Extracting a speaker’s personalized feature parameters is vital for speaker recognition, and a single kind of feature cannot fully reflect the speaker’s personality information. In order to represent the speaker’s identity more comprehensively and improve the speaker recognition rate, we propose a speaker recognition method based on the fused feature of a deep and shallow recombination Gaussian supervector. In this method, deep bottleneck features are first extracted by a Deep Neural Network (DNN) and used as input to a Gaussian Mixture Model (GMM) to obtain the deep Gaussian supervector. In parallel, the Mel-Frequency Cepstral Coefficients (MFCC) are input to the GMM directly to extract the traditional Gaussian supervector. Finally, the two categories of features are combined by horizontal dimension augmentation. In addition, to prevent the recognition rate from falling sharply as the number of speakers to be recognized increases, we introduce an optimization algorithm to find the optimal weight before feature fusion. The experimental results indicate that the speaker recognition rate based on the directly fused feature reaches 98.75%, which is 5% and 0.62% higher than the traditional feature and the deep bottleneck feature, respectively. When the number of speakers increases, the fusion feature based on optimized weight coefficients improves the recognition rate by a further 0.81%. This validates that our proposed fusion method effectively exploits the complementarity of the different feature types and improves the speaker recognition rate.
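
The fusion step itself is a concatenation along the feature axis. The sketch below shows that "horizontal dimension augmentation" with an optional scalar weight w of the kind the optimization step would produce; the supervector dimensions and the exact weighting scheme are assumptions.

```python
# Minimal sketch: concatenate deep and traditional Gaussian supervectors,
# optionally scaled by an optimized fusion weight w.
import numpy as np

def fuse_supervectors(deep_sv, mfcc_sv, w=0.5):
    # deep_sv, mfcc_sv: (D1,) and (D2,) supervectors for one utterance
    return np.concatenate([w * deep_sv, (1.0 - w) * mfcc_sv])

deep_sv = np.random.randn(1024)   # GMM supervector from DNN bottleneck features
mfcc_sv = np.random.randn(1024)   # GMM supervector from MFCCs
fused = fuse_supervectors(deep_sv, mfcc_sv, w=0.6)
print(fused.shape)                # (2048,)
```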



2021 ◽  
Vol 271 ◽  
pp. 01039
Author(s):  
Dongsheng Ji ◽  
Yanzhong Zhao ◽  
Zhujun Zhang ◽  
Qianchuan Zhao

COVID-19 (novel coronavirus pneumonia) image recognition demands a large number of samples, and the recognition accuracy achievable with few samples is not ideal. In this paper, a COVID-19-positive image recognition method based on small-sample recognition is proposed. First, the CT images are preprocessed and converted into the formats required for transfer learning. Second, small-sample image augmentation and extension are applied to the transformed images, such as shear transformations, random rotations, and translations. Then, multiple transfer learning models are used to extract features, which are subsequently fused. Finally, the model is adjusted by fine-tuning and trained to obtain the experimental results. The experimental results show that our method achieves excellent recognition performance on COVID-19 images, even with only a small number of CT image samples.
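
Since this entry repeats the method above, the sketch here covers its other step, multi-model feature fusion: features from two ImageNet-pretrained backbones are extracted and concatenated before any classification head. The backbone choices, the use of torchvision weights, and the resulting dimensions are illustrative assumptions.

```python
# Hedged sketch of feature extraction from multiple transfer-learning models
# followed by feature-level fusion (concatenation).
import torch
from torchvision import models

resnet = models.resnet18(weights="IMAGENET1K_V1")
resnet.fc = torch.nn.Identity()            # yields 512-dim features
vgg = models.vgg11(weights="IMAGENET1K_V1")
vgg.classifier = torch.nn.Identity()       # yields 25088-dim features

x = torch.rand(2, 3, 224, 224)             # batch of preprocessed CT images
with torch.no_grad():
    fused = torch.cat([resnet(x), vgg(x)], dim=1)  # feature fusion
print(fused.shape)                          # torch.Size([2, 25600])
```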



2018 ◽  
Vol 8 (10) ◽  
pp. 1857 ◽  
Author(s):  
Jing Yang ◽  
Shaobo Li ◽  
Zong Gao ◽  
Zheng Wang ◽  
Wei Liu

The complexity of the background, the similarities between different types of precision parts, and the diversity of illumination, especially with high-speed conveyor belts in complex industrial scenes, pose immense challenges to the object recognition of precision parts. This study presents a real-time object recognition method for 0.8 cm darning needles and KR22 bearing machine parts against a complex industrial background. First, we propose an image data augmentation algorithm based on directional flipping and establish two datasets: real data and augmented data. Focusing on increasing recognition accuracy and reducing computation time, we design a multilayer feature fusion network to obtain feature information. Subsequently, we propose an accurate method for classifying precision parts based on non-maximum suppression, forming an improved You Only Look Once (YOLO) V3 network. We implement this method and compare it with other models on our real-time industrial object detection experimental platform. Finally, experiments on the real and augmented datasets show that the proposed method outperforms the YOLO V3 algorithm in terms of recognition accuracy and robustness.
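
As background for the classification step, here is a plain numpy implementation of standard greedy non-maximum suppression: boxes are kept in descending score order, and any remaining box overlapping a kept box above an IoU threshold is discarded. This is the textbook algorithm, not the authors' exact variant.

```python
# Illustrative greedy NMS over axis-aligned boxes.
import numpy as np

def nms(boxes, scores, iou_thr=0.5):
    # boxes: (N, 4) as [x1, y1, x2, y2]; scores: (N,) confidences
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = scores.argsort()[::-1]          # indices by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # intersection of the top box with all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= iou_thr]        # drop heavy overlaps
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], float)
print(nms(boxes, np.array([0.9, 0.8, 0.7])))  # -> [0, 2]
```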



2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Xiaodong Liu ◽  
Songyang Li ◽  
Miao Wang

Context, such as scenes and objects, plays an important role in video emotion recognition, and recognition accuracy can be further improved when context information is incorporated. Although previous research has considered context information, the emotional clues contained in different images may differ, which is often ignored. To address the differences in emotion between modalities and between images, this paper proposes a hierarchical attention-based multimodal fusion network for video emotion recognition, consisting of a multimodal feature extraction module and a multimodal feature fusion module. The feature extraction module has three subnetworks that extract features from facial, scene, and global images. Each subnetwork consists of two branches: the first extracts the features of a modality, and the other generates an emotion score for each image. The features and emotion scores of all images in a modality are aggregated to generate the modality’s emotion feature. The fusion module takes the multimodal features as input and generates an emotion score for each modality. Finally, the features and emotion scores of the modalities are aggregated to produce the final emotion representation of the video. Experimental results show that the proposed method is effective on an emotion recognition dataset.
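
The score-weighted aggregation can be sketched as attention pooling: softmax-normalized per-image emotion scores weight the per-image features, and the weighted sum forms the modality's emotion feature. The same pattern would then be reused across modalities. Shapes and the softmax choice are assumptions.

```python
# Hedged sketch of one level of the hierarchical attention aggregation.
import torch
import torch.nn.functional as F

def aggregate(features, scores):
    # features: (N, D) per-image features; scores: (N,) per-image emotion scores
    weights = F.softmax(scores, dim=0)               # attention over images
    return (weights.unsqueeze(1) * features).sum(0)  # (D,) modality feature

face_feats = torch.randn(12, 128)   # e.g. 12 face crops from one video
face_scores = torch.randn(12)
modal_feature = aggregate(face_feats, face_scores)
print(modal_feature.shape)          # torch.Size([128])
```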



2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Junjun Huo

Based on deep learning and digital image processing algorithms, we design and implement an accurate automatic recognition system for bank note text, and we propose an improved ResNet-based recognition method to address the difficulty of extracting image text and the insufficient recognition accuracy. First, depthwise over-parameterized convolution (DO-Conv) is used in place of the traditional convolutions in the network, improving the recognition rate while reducing the model parameters. Then, the spatial attention module (SAM) and the squeeze-and-excitation block (SE-Block) are fused and applied to the modified ResNet to extract detailed features of bank note images in the channel and spatial domains. Finally, the label-smoothed cross-entropy (LSCE) loss function is used to train the model, automatically calibrating the network to prevent classification errors. The experimental results demonstrate that the improved model is not easily affected by image quality and performs well at text detection and recognition in specific business ticket scenarios.
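
The LSCE loss is standard label smoothing: the one-hot target is mixed with a uniform distribution so the network is not pushed toward overconfident predictions. A minimal sketch follows; the epsilon value is an assumed default, and recent PyTorch exposes the same idea directly via nn.CrossEntropyLoss(label_smoothing=...).

```python
# Minimal label-smoothed cross-entropy, written out explicitly.
import torch
import torch.nn.functional as F

def lsce(logits, targets, eps=0.1):
    # logits: (B, C); targets: (B,) integer class labels
    n = logits.size(1)
    log_p = F.log_softmax(logits, dim=1)
    smooth = torch.full_like(log_p, eps / (n - 1))       # uniform mass off-target
    smooth.scatter_(1, targets.unsqueeze(1), 1.0 - eps)  # smoothed target dist
    return -(smooth * log_p).sum(1).mean()

logits = torch.randn(4, 10)
targets = torch.randint(0, 10, (4,))
print(lsce(logits, targets))
```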


