Research on advertising content recognition based on convolutional neural network and recurrent neural network

Author(s):  
Xiaomei Liu ◽  
Fazhi Qi
Electronics ◽  
2021 ◽  
Vol 10 (1) ◽  
pp. 81
Author(s):  
Jianbin Xiong ◽  
Dezheng Yu ◽  
Shuangyin Liu ◽  
Lei Shu ◽  
Xiaochan Wang ◽  
...  

Plant phenotypic image recognition (PPIR) is an important branch of smart agriculture. In recent years, deep learning has achieved significant breakthroughs in image recognition. Consequently, PPIR technology that is based on deep learning is becoming increasingly popular. First, this paper introduces the development and application of PPIR technology, followed by its classification and analysis. Second, it presents the theory of four types of deep learning methods and their applications in PPIR. These methods include the convolutional neural network, deep belief network, recurrent neural network, and stacked autoencoder, and they are applied to identify plant species, diagnose plant diseases, etc. Finally, the difficulties and challenges of deep learning in PPIR are discussed.


Author(s):  
E. Yu. Shchetinin

The recognition of human emotions is one of the most relevant and rapidly developing areas of modern speech technology, and the recognition of emotions in speech (RER) is its most in-demand part. In this paper, we propose a computer model of emotion recognition based on an ensemble of a bidirectional recurrent neural network with LSTM memory cells and the deep convolutional neural network ResNet18. Computer experiments are carried out on the RAVDESS database of emotional human speech. RAVDESS is a data set containing 7356 files. Entries are labeled with the following emotions: 0 – neutral, 1 – calm, 2 – happiness, 3 – sadness, 4 – anger, 5 – fear, 6 – disgust, 7 – surprise. In total, the database contains 16 classes (8 emotions divided into male and female) for a total of 1440 samples (speech only). To train machine learning algorithms and deep neural networks to recognize emotions, the audio recordings must be pre-processed to extract the features characteristic of each emotion. This was done using Mel-frequency cepstral coefficients, chroma coefficients, and characteristics of the frequency spectrum of the recordings. Several neural network models for emotion recognition are studied on this data, and classical machine learning algorithms are used for comparative analysis. The following models were trained during the experiments: logistic regression (LR), a support vector machine (SVM) classifier, decision tree (DT), random forest (RF), gradient boosting over trees (XGBoost), a convolutional neural network CNN (ResNet18), a recurrent neural network RNN, and an ensemble of convolutional and recurrent networks (Stacked CNN-RNN). The results show that the neural networks achieved much higher accuracy in recognizing and classifying emotions than the classical machine learning algorithms. Of the three neural network models presented, the CNN + BLSTM ensemble showed the highest accuracy.
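
The 16-class labeling described in this abstract (8 emotions split by speaker gender) can be illustrated with a small sketch. It relies on the published RAVDESS file-naming convention, which is not stated in the abstract: seven dash-separated two-digit fields, where the third field is the emotion code (01 to 08) and the seventh is the actor ID (odd for male, even for female). The function names and the ordering of the 16 class IDs are assumptions made for illustration, not the author's encoding.

```python
# Illustrative only: the layout of the 16 class IDs is an assumption,
# not the encoding used in the paper.
EMOTIONS = ["neutral", "calm", "happiness", "sadness",
            "anger", "fear", "disgust", "surprise"]
GENDERS = ["male", "female"]


def class_id_from_filename(filename: str) -> int:
    """Map a RAVDESS-style file name to one of 16 (gender, emotion) classes."""
    stem = filename.rsplit(".", 1)[0]          # drop the extension if present
    fields = stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"expected 7 dash-separated fields, got {len(fields)}")
    emotion = int(fields[2]) - 1               # file names use 01..08; shift to 0..7
    if not 0 <= emotion < len(EMOTIONS):
        raise ValueError(f"unknown emotion code: {fields[2]}")
    actor = int(fields[6])
    gender = 0 if actor % 2 == 1 else 1        # odd actor IDs are male, even are female
    return gender * len(EMOTIONS) + emotion


def describe(class_id: int) -> tuple[str, str]:
    """Inverse mapping: class ID back to (gender, emotion)."""
    gender, emotion = divmod(class_id, len(EMOTIONS))
    return GENDERS[gender], EMOTIONS[emotion]
```

For example, `class_id_from_filename("03-01-06-01-02-01-12.wav")` yields class 13, which `describe` maps back to a female speaker expressing fear.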


2019 ◽  
Vol 15 (6) ◽  
pp. 155014771985649 ◽  
Author(s):  
Van Quan Nguyen ◽  
Tien Nguyen Anh ◽  
Hyung-Jeong Yang

We propose an approach for temporal event detection using deep learning and multi-embedding on text data from social media. First, a convolutional neural network augmented with multiple word-embedding architectures is used as a text classifier to pre-process the input text. Second, an event detection model based on a recurrent neural network learns time-series features by extracting temporal information. Convolutional neural networks have recently been applied to natural language processing problems and have obtained excellent results when operating on available embedding vectors. In this article, word-embedding features at the embedding layer are combined and fed to a convolutional neural network. The proposed method has no size limitation, supports more embeddings than standard multichannel-based approaches, and obtains similar performance (accuracy score) on some benchmark data sets, especially on an imbalanced data set. For event detection, a long short-term memory network is used as a predictor that learns higher-level temporal features in order to predict future values. An error distribution estimation model is built to calculate the anomaly score of each observation. Events are detected using a window-based method on the anomaly scores.
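
The last two steps of this abstract, scoring observations against an estimated error distribution and detecting events over windows of scores, might look roughly like the sketch below. Everything specific here is an assumption: the abstract does not give the form of the error model, so a single Gaussian fitted to held-out prediction errors stands in for it, the score is the squared standardized error, and a window is flagged when its mean score exceeds a threshold. The predictor itself (an LSTM in the paper) is out of scope; `predicted` is simply taken as given.

```python
from statistics import mean, stdev


def anomaly_scores(actual, predicted, fit_errors):
    """Score each observation by its squared standardized prediction error.

    fit_errors are prediction errors from data assumed to be event-free;
    a Gaussian fitted to them is a stand-in for the paper's error model.
    """
    mu = mean(fit_errors)
    sigma = stdev(fit_errors)
    if sigma == 0:
        raise ValueError("fit_errors must not all be identical")
    return [(((a - p) - mu) / sigma) ** 2 for a, p in zip(actual, predicted)]


def detect_events(scores, window=3, threshold=4.0):
    """Return start indices of sliding windows whose mean score exceeds threshold."""
    return [
        start
        for start in range(len(scores) - window + 1)
        if mean(scores[start:start + window]) > threshold
    ]


# Toy usage: a burst at positions 3 and 4 stands out against a flat forecast.
actual = [10, 10, 10, 20, 21, 10]
predicted = [10, 10, 10, 10, 10, 10]
scores = anomaly_scores(actual, predicted, fit_errors=[-1.0, 0.0, 1.0, 0.0])
events = detect_events(scores, window=2, threshold=50.0)
```

Averaging over a window rather than thresholding single points is one way to suppress isolated spikes; the window length and threshold are tuning parameters chosen here only for the toy data.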


2021 ◽  
Vol 11 (2) ◽  
pp. 1097-1108
Author(s):  
Bathaloori Reddy Prasad

Aim: To classify features of English-to-Telugu language translation in speech recognition using a recurrent neural network with long short-term memory (RNN-LSTM), in comparison with a convolutional neural network (CNN). Materials and Methods: Accuracy and precision were evaluated using the alexa and english-telugu dataset of 8166 sentences. Classification of language translation was performed with two techniques, the recurrent neural network (N=62 samples) and the convolutional neural network (N=62 samples); the RNN was applied to speech recognition and compared with the CNN as the second technique. Results and Discussion: On the speech recognition dataset, with feature Telugu_id, RNN-LSTM produced an accuracy of 93% and a precision of 68.04%, higher than the CNN's accuracy of 66.11% and precision of 61.90%. The difference shows a statistical significance of 0.007 in an independent-samples t-test. Conclusion: RNN-LSTM achieves better accuracy and precision than CNN.
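
Since this abstract reports accuracy and precision separately, a brief sketch of how the two metrics differ may help. The function name and the toy labels are hypothetical and unrelated to the paper's data: accuracy counts all correct predictions, while precision only asks how many predictions of the positive class were right.

```python
def accuracy_and_precision(y_true, y_pred, positive):
    """Compute accuracy over all samples and precision for one positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    true_positive = sum(t == positive and p == positive for t, p in pairs)
    predicted_positive = sum(p == positive for p in y_pred)
    accuracy = correct / len(y_true)
    # Precision is undefined when nothing is predicted positive; report 0.0 here.
    precision = true_positive / predicted_positive if predicted_positive else 0.0
    return accuracy, precision


# Toy labels (1 = positive class), not taken from the paper.
acc, prec = accuracy_and_precision([1, 1, 0, 0, 1], [1, 0, 0, 1, 1], positive=1)
```

On these toy labels three of five predictions are correct and two of the three positive predictions are right, so the two metrics differ even on the same predictions.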


Satellite images are important for developing and protecting environmental resources and can be used for flood detection. Satellite images taken before and after flooding are segmented, and features are extracted by integrating deep LRNN and CNN networks to achieve high accuracy. It is also important that the LRNN and CNN learn the features of flooded regions sufficiently, since this influences the effectiveness of flood relief. The data for the CNN and LRNN are divided into two sets, a training set and a testing set. Data patches are extracted and segmented from the before-flooding and after-flooding satellite images for the training and testing phases. All patches are trained with the LRNN so that changes, and flooded regions that would otherwise be misdetected, are extracted accurately and without delay. The proposed method achieves an accuracy of 99% in flooded-region detection.

