Expression-EEG Bimodal Fusion Emotion Recognition Method Based on Deep Learning

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Yu Lu ◽  
Hua Zhang ◽  
Lei Shi ◽  
Fei Yang ◽  
Jing Li

As one of the key issues in the field of emotional computing, emotion recognition has rich application scenarios and important research value. However, recognition based on a single biometric modality suffers from low classification accuracy in real-world scenes because of the limitations of that modality. To address this problem, this paper proposes a deep learning-based expression-EEG bimodal fusion emotion recognition method. The method uses an improved VGG-FACE network model to extract facial expression features rapidly and to shorten the training time of the network model. A wavelet soft threshold algorithm removes artifacts from the EEG signals so that high-quality EEG signal features can be extracted. Then, based on a long short-term memory (LSTM) network model and a decision fusion method, the model is built and trained on the features extracted from the expression and EEG modalities to perform the final bimodal fusion emotion classification. Finally, the proposed method is verified on the MAHNOB-HCI data set. Experimental results show that the proposed model achieves a recognition accuracy of 0.89, an improvement of 8.51% over the traditional LSTM model. The running time of the proposed method is also about 20 s shorter than that of the traditional method.
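The wavelet soft threshold step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state which wavelet, decomposition depth, or threshold the authors used, so the one-level Haar transform, the function names, and the threshold value below are all assumptions chosen for brevity, not the paper's implementation.

```python
import math

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; values within [-t, t] become 0."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def haar_forward(signal):
    """One-level Haar transform: returns (approximation, detail) coefficients."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert the one-level Haar transform."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def denoise(signal, t):
    """Soft-threshold the detail coefficients only (assumes an even-length signal)."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, t))
```

With a large enough threshold, small sample-to-sample jitter is removed and each pair of samples collapses to its mean, while a threshold of zero reproduces the input signal.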

2021 ◽  
Vol 15 ◽  
Author(s):  
Dong Liu ◽  
Zhiyong Wang ◽  
Lifeng Wang ◽  
Longxi Chen

Because of the redundant information and noisy data generated during single-modal feature extraction, traditional learning algorithms struggle to achieve ideal recognition performance. A multi-modal fusion emotion recognition method for speech and facial expressions based on deep learning is proposed. First, a dedicated feature extraction method is set up for each single modality: the voice uses a convolutional neural network-long short-term memory (CNN-LSTM) network, and the facial expression in the video uses the Inception-ResNet-v2 network to extract the feature data. Then, long short-term memory (LSTM) is used to capture the correlation between different modalities and within each modality. After feature selection by the chi-square test, the single-modality features are concatenated to obtain a unified fusion feature. Finally, the fused features output by the LSTM are used as the input of the LIBSVM classifier to perform the final emotion recognition. The experimental results show that the recognition accuracies of the proposed method on the MOSI and MELD datasets are 87.56 and 90.06%, respectively, which are better than those of the comparison methods. The work lays a theoretical foundation for the application of multimodal fusion in emotion recognition.
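The chi-square selection and concatenation steps described above can be sketched as follows. This is only an illustration under stated assumptions: the features are treated as already discretized (the chi-square statistic is computed from a contingency table), each modality is given as a list of feature columns, and the names `chi2_score`, `select_top_k`, and `fuse` are hypothetical rather than taken from the paper.

```python
def chi2_score(feature, labels):
    """Chi-square statistic of one categorical feature column against class labels."""
    n = len(labels)
    score = 0.0
    for v in set(feature):
        for c in set(labels):
            observed = sum(1 for f, y in zip(feature, labels) if f == v and y == c)
            expected = feature.count(v) * labels.count(c) / n
            score += (observed - expected) ** 2 / expected
    return score

def select_top_k(columns, labels, k):
    """Indices of the k feature columns with the highest chi-square score."""
    ranked = sorted(range(len(columns)),
                    key=lambda j: chi2_score(columns[j], labels),
                    reverse=True)
    return sorted(ranked[:k])

def fuse(speech_cols, face_cols, labels, k):
    """Keep k columns per modality, then concatenate them into one row per sample."""
    kept = [speech_cols[j] for j in select_top_k(speech_cols, labels, k)]
    kept += [face_cols[j] for j in select_top_k(face_cols, labels, k)]
    return [list(row) for row in zip(*kept)]
```

In this sketch a column that is independent of the labels scores 0 and is dropped first, so the fused vector keeps only the columns most associated with the emotion class in each modality.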


2020 ◽  
Vol 17 (3) ◽  
pp. 299-305 ◽  
Author(s):  
Riaz Ahmad ◽  
Saeeda Naz ◽  
Muhammad Afzal ◽  
Sheikh Rashid ◽  
Marcus Liwicki ◽  
...  

This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT dataset consists of complex patterns of handwritten Arabic text-lines. This paper contributes mainly in three aspects: (1) pre-processing, (2) a deep learning based approach, and (3) data augmentation. The pre-processing step prunes extra white space and de-skews the skewed text-lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). The MDLSTM has the advantage of scanning the Arabic text-lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes and fine inflammation. Combining data augmentation with the deep learning approach improves the results, reaching 80.02% Character Recognition (CR) against a baseline of 75.08%.
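For readers unfamiliar with CTC, the rule that maps a per-frame label sequence to an output character sequence can be sketched in a few lines. This shows only the standard collapse rule used in greedy CTC decoding (merge repeated labels, then drop blanks). The blank index of 0 and the function name are assumptions, and nothing here reproduces the authors' MDLSTM pipeline.

```python
def ctc_collapse(frame_labels, blank=0):
    """Merge consecutive repeated labels, then drop the blank symbol."""
    out = []
    prev = None
    for lab in frame_labels:
        # Emit a label only when it differs from the previous frame and is not blank.
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return out
```

A blank between two identical labels keeps them as two separate characters, which is how CTC represents doubled letters.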


Author(s):  
Kyungkoo Jun

Background & Objective: This paper proposes a Fourier transform inspired method to classify human activities from time series sensor data. Methods: Our method begins by decomposing a 1D input signal into 2D patterns, a step motivated by the Fourier transform. The decomposition is performed with Long Short-Term Memory (LSTM), which captures the temporal dependency in the signal and produces encoded sequences. The sequences, once arranged into a 2D array, represent the fingerprints of the signals. The benefit of this transformation is that we can exploit recent advances in deep learning models for image classification, such as the Convolutional Neural Network (CNN). Results: The proposed model is therefore a combination of LSTM and CNN. We evaluate the model on two data sets. On the first data set, which is more standardized than the other, our model outperforms or at least equals previous work. For the second data set, we devise schemes to generate training and testing data by varying the window size, the sliding size, and the labeling scheme. Conclusion: The evaluation results show that the accuracy exceeds 95% in some cases. We also analyze the effect of the parameters on the performance.
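The window size, sliding size, and labeling scheme mentioned above can be made concrete with a small segmentation sketch. The abstract does not say which labeling scheme was used, so the majority-vote labeling, the function name, and the parameter names here are assumptions for illustration only.

```python
from collections import Counter

def sliding_windows(samples, labels, window_size, step):
    """Cut a time series into overlapping windows, each labeled by majority vote."""
    windows = []
    for start in range(0, len(samples) - window_size + 1, step):
        end = start + window_size
        # Label the window with the most frequent per-sample label inside it.
        majority = Counter(labels[start:end]).most_common(1)[0][0]
        windows.append((samples[start:end], majority))
    return windows
```

A smaller `step` produces more, heavily overlapping training windows, while a larger `window_size` gives each window more temporal context at the cost of fewer examples.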


Electronics ◽  
2021 ◽  
Vol 10 (13) ◽  
pp. 1514
Author(s):  
Seung-Ho Lim ◽  
WoonSik William Suh ◽  
Jin-Young Kim ◽  
Sang-Young Cho

Optimizing hardware processors and systems that perform deep learning operations, such as Convolutional Neural Networks (CNN), on resource-limited embedded devices is an active research area. To run an optimized deep neural network model using the limited computational units and memory of an embedded device, it is necessary to quickly apply various hardware module configurations to various deep neural network models and find the optimal combination. An Electronic System Level (ESL) simulator based on SystemC is very useful for rapid hardware modeling and verification. In this paper, we designed and implemented a Deep Learning Accelerator (DLA) that performs Deep Neural Network (DNN) operations on a RISC-V Virtual Platform implemented in SystemC, in order to enable rapid and diverse analysis of deep learning operations on embedded devices based on the RISC-V processor, a recently emerging embedded processor. The developed RISC-V based DLA prototype can analyze the hardware requirements for a given CNN data set through the configuration of the CNN DLA architecture. Because RISC-V compiled software can run on the platform, the prototype can execute a real neural network model such as Darknet. We ran the Darknet CNN model on the developed DLA prototype and confirmed that computational overhead and inference errors can be analyzed by examining the DLA architecture for various data sets.


Geofluids ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Dongsheng Wang ◽  
Jun Feng ◽  
Xinpeng Zhao ◽  
Yeping Bai ◽  
Yujie Wang ◽  
...  

It is difficult to form a method for recognizing the degree of infiltration of a tunnel lining. To solve this problem, we propose a recognition method by using a deep convolutional neural network. We carry out laboratory tests, prepare cement mortar specimens with different saturation levels, simulate different degrees of infiltration of tunnel concrete linings, and establish an infrared thermal image data set with different degrees of infiltration. Then, based on a deep learning method, the data set is trained using the Faster R-CNN+ResNet101 network, and a recognition model is established. The experiments show that the recognition model established by the deep learning method can be used to select cement mortar specimens with different degrees of infiltration by using an accurately minimized rectangular outer frame. This model shows that the classification recognition model for tunnel concrete lining infiltration established by the indoor experimental method has high recognition accuracy.


Author(s):  
Osama A. Osman ◽  
Hesham Rakha

Distracted driving (i.e., engaging in secondary tasks) is an epidemic that threatens the lives of thousands every year. Data collected from vehicular sensor technologies and through connectivity provide comprehensive information that, if used to detect driver engagement in secondary tasks, could save thousands of lives and millions of dollars. This study investigates the possibility of achieving this goal using promising deep learning tools. Specifically, two deep neural network models (a multilayer perceptron neural network model and a long short-term memory networks [LSTMN] model) were developed to identify three secondary tasks: cellphone calling, cellphone texting, and conversation with adjacent passengers. The Second Strategic Highway Research Program Naturalistic Driving Study (SHRP 2 NDS) time series data, collected using vehicle sensor technology, were used to train and test the model. The results show excellent performance for the developed models, with a slight improvement for the LSTMN model, with overall classification accuracies ranging between 95 and 96%. Specifically, the models are able to identify the different types of secondary tasks with high accuracies of 100% for calling, 96%–97% for texting, 90%–91% for conversation, and 95%–96% for the normal driving. Based on this performance, the developed models improve on the results of a previous model developed by the author to classify the same three secondary tasks, which had an accuracy of 82%. The model is promising for use in in-vehicle driving assistance technology to report engagement in unlawful tasks or alert drivers to take over control in level 1 and 2 automated vehicles.


Author(s):  
A. Saravanan ◽  
J. Jerald ◽  
A. Delphin Carolina Rani

Abstract. The objective of the paper is to develop a new method to model the manufacturing cost–tolerance relation and to optimize the tolerance values along with their manufacturing cost. The cost–tolerance relation is complex and nonlinear. A neural network is able to model this complex correlation, and the genetic algorithm (GA) is integrated with the best neural network model to optimize the tolerance values. The proposed method used three types of neural network models (multilayer perceptron, backpropagation network, and radial basis function). These network models were developed separately for prismatic and rotational parts. For the construction of network models, part size and tolerance values were used as input neurons. The reference manufacturing cost was assigned as the output neuron. The qualitative production data set was gathered in a workshop and partitioned into three files for training, testing, and validation, respectively. The architecture of the network model was identified based on the best regression coefficient and the root-mean-square-error value. The best network model was integrated into the GA, and the role of genetic operators was also studied. Finally, two case studies from the literature were used to validate the proposed method. The new methodology based on the neural network model enables design and process planning engineers to propose an intelligent decision irrespective of their experience.


2020 ◽  
Vol 12 (01) ◽  
pp. 2050001
Author(s):  
Yadigar N. Imamverdiyev ◽  
Fargana J. Abdullayeva

In this paper, a fault prediction method for oil well equipment based on the analysis of time series data obtained from multiple sensors is proposed. The proposed method is based on deep learning (DL). For this purpose, comparative analysis of single-layer long short-term memory (LSTM) with the convolutional neural network (CNN) and stacked LSTM methods is provided. To demonstrate the efficacy of the proposed method, some experiments are conducted on the real data set obtained from eight sensors installed in oil wells. In this paper, compared to the single-layer LSTM model, the CNN and stacked LSTM predicted the faulty time series with a minimal loss.


2020 ◽  
Author(s):  
Frederik Kratzert ◽  
Daniel Klotz ◽  
Sepp Hochreiter ◽  
Grey S. Nearing

Abstract. A deep learning rainfall-runoff model can take multiple meteorological forcing products as inputs and learn to combine them in spatially and temporally dynamic ways. This is demonstrated using Long Short Term Memory networks (LSTMs) trained over basins in the continental US using the CAMELS data set. Using multiple precipitation products (NLDAS, Maurer, DayMet) in a single LSTM significantly improved simulation accuracy relative to using only individual precipitation products. A sensitivity analysis showed that the LSTM learned to utilize different precipitation products in different ways in different basins and for simulating different parts of the hydrograph in individual basins.


2021 ◽  
Author(s):  
Tomochika Fujisawa ◽  
Victor Noguerales ◽  
Emmanouil Meramveliotakis ◽  
Anna Papadopoulou ◽  
Alfried P Vogler

Complex bulk samples of invertebrates from biodiversity surveys present a great challenge for taxonomic identification, especially if obtained from unexplored ecosystems. High-throughput imaging combined with machine learning for rapid classification could overcome this bottleneck. Developing such procedures requires that taxonomic labels from an existing source data set are used for model training and prediction of an unknown target sample. Yet the feasibility of transfer learning for the classification of unknown samples remains to be tested. Here, we assess the efficiency of deep learning and domain transfer algorithms for family-level classification of below-ground bulk samples of Coleoptera from understudied forests of Cyprus. We trained neural network models with images from local surveys versus global databases of above-ground samples from tropical forests and evaluated how prediction accuracy was affected by: (a) the quality and resolution of images, (b) the size and complexity of the training set and (c) the transferability of identifications across very disparate source-target pairs that do not share any species or genera. Within-dataset classification accuracy reached 98% and depended on the number and quality of training images and on dataset complexity. The accuracy of between-datasets predictions was reduced to a maximum of 82% and depended greatly on the standardisation of the imaging procedure. When the source and target images were of similar quality and resolution, albeit from different faunas, the reduction of accuracy was minimal. Application of algorithms for domain adaptation significantly improved the prediction performance of models trained by non-standardised, low-quality images. Our findings demonstrate that existing databases can be used to train models and successfully classify images from unexplored biota, when the imaging conditions and classification algorithms are carefully considered. Also, our results provide guidelines for data acquisition and algorithmic development for high-throughput image-based biodiversity surveys.

