Automatic Image Captioning Based on ResNet50 and LSTM with Soft Attention

Automatically captioning images with proper descriptions has become an interesting and challenging problem. In this paper, we present a joint model, AICRL, which performs automatic image captioning based on ResNet50 and LSTM with soft attention. AICRL consists of one encoder and one decoder. The encoder adopts ResNet50, a convolutional neural network, which creates an extensive representation of the given image by embedding it into a fixed-length vector. The decoder is designed with LSTM, a recurrent neural network, and a soft attention mechanism that selectively focuses attention on certain parts of an image to predict the next sentence. We trained AICRL on the large MS COCO 2014 dataset to maximize the likelihood of the target description sentence given the training images, and evaluated it with several metrics, including BLEU, METEOR, and CIDEr. Our experimental results indicate that AICRL is effective in generating captions for images.
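To make the soft attention step concrete, the following is a minimal sketch of one decoding step, not the AICRL implementation. It assumes dot-product scoring between each image-region feature and the decoder state; the abstract does not specify the scoring function, and the names `regions`, `hidden`, and `soft_attention` are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_attention(regions, hidden):
    """Return (weights, context) for one decoding step.

    regions: one feature vector per image region (hypothetical stand-ins
             for CNN feature-map locations).
    hidden:  current decoder state, same length as each region vector.
    Scoring is a plain dot product, an assumption made for brevity.
    """
    scores = [sum(r_i * h_i for r_i, h_i in zip(r, hidden)) for r in regions]
    weights = softmax(scores)
    dim = len(regions[0])
    # The context vector is the attention-weighted sum of region features.
    context = [sum(w * r[d] for w, r in zip(weights, regions)) for d in range(dim)]
    return weights, context

# Toy data: three regions with two-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [2.0, 0.0]
weights, context = soft_attention(regions, hidden)
```

Because the weights are a softmax, every region contributes to the context vector; regions that align with the decoder state simply contribute more, which is what distinguishes soft attention from a hard selection of a single region.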


DAN: Deep Attention Neural Network for News Recommendation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33015973 ◽

2019 ◽

Vol 33 ◽

pp. 5973-5980 ◽

Cited By ~ 8

Author(s):

Qiannan Zhu ◽

Xiaofei Zhou ◽

Zeliang Song ◽

Jianlong Tan ◽

Li Guo

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Real World ◽

Attention Mechanism ◽

Experimental Results ◽

Data Sets ◽

Challenging Problem ◽

World News ◽

Sequential Information ◽

News Recommendation

With the rapid information explosion of news, making personalized news recommendations for users has become an increasingly challenging problem. Many existing recommendation methods, which regard the recommendation procedure as a static process, have achieved good recommendation performance. However, they usually fail to handle the dynamic diversity of news and users’ interests, or ignore the importance of the sequential information in users’ click selections. In this paper, taking full advantage of convolutional neural networks (CNN), recurrent neural networks (RNN), and the attention mechanism, we propose a deep attention neural network, DAN, for news recommendation. DAN uses an attention-based parallel CNN to aggregate users’ interest features and an attention-based RNN to capture richer hidden sequential features of users’ clicks, and combines these features for news recommendation. We conduct experiments on real-world news data sets, and the experimental results demonstrate the superiority and effectiveness of the proposed DAN model.


Learning Dynamic Factors to Improve the Accuracy of Bus Arrival Time Prediction via a Recurrent Neural Network

Future Internet ◽

10.3390/fi11120247 ◽

2019 ◽

Vol 11 (12) ◽

pp. 247

Author(s):

Xin Zhou ◽

Peixin Dong ◽

Jianping Xing ◽

Peijia Sun

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Public Transportation ◽

Prediction Accuracy ◽

Arrival Time ◽

Attention Mechanism ◽

Experimental Results ◽

Support Vector ◽

Data Set ◽

Dynamic Factors

Accurate prediction of bus arrival times is a challenging problem in the public transportation field. Previous studies have shown that more heterogeneous measurements improve prediction accuracy. So what other factors should be added to the prediction model? Traditional prediction methods mainly use the arrival time and the distance between stations, but do not make full use of dynamic factors such as passenger number, dwell time, and bus driving efficiency. We propose a novel approach that takes full advantage of dynamic factors. Our approach is based on a Recurrent Neural Network (RNN). The experimental results indicate that a variety of prediction algorithms (such as Support Vector Machine, Kalman filter, Multilayer Perceptron, and RNN) perform significantly better after using dynamic factors. Further, we introduce an RNN with an attention mechanism to adaptively select the most relevant input factors. Experiments demonstrate that the prediction accuracy of the RNN with an attention mechanism is better than that of the RNN without one when the input factors are heterogeneous. The experimental results show the superior performance of our approach on the data set provided by Jinan Public Transportation Corporation.


Radar Emitter Signal Recognition Based on One-Dimensional Convolutional Neural Network with Attention Mechanism

Sensors ◽

10.3390/s20216350 ◽

2020 ◽

Vol 20 (21) ◽

pp. 6350

Author(s):

Bin Wu ◽

Shibo Yuan ◽

Peng Li ◽

Zehuan Jing ◽

Shao Huang ◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Attention Mechanism ◽

Superior Performance ◽

Radar Signal ◽

Signal Recognition ◽

Electromagnetic Environment ◽

One Dimensional

As the real electromagnetic environment grows more complex and the quantity of radar signals becomes massive, traditional methods, which require a large amount of prior knowledge, are time-consuming and ineffective for radar emitter signal recognition. In recent years, the convolutional neural network (CNN) has shown its superiority in recognition, so experts have applied it to radar signal recognition. However, in the field of radar emitter signal recognition, the data are usually one-dimensional (1-D), and processing them with the original two-dimensional CNN model directly takes more time and storage space. Moreover, the features extracted from convolutional layers are redundant, so the recognition accuracy is low. To solve these problems, this paper proposes a novel one-dimensional convolutional neural network with an attention mechanism (CNN-1D-AM) to extract more discriminative features and recognize radar emitter signals. In this method, features of the given 1-D signal sequences are extracted directly by the 1-D convolutional layers and are weighted according to their importance to recognition by the attention unit. Experiments on seven different radar emitter signals indicate that the proposed CNN-1D-AM achieves high accuracy and superior performance in radar emitter signal recognition.
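The idea of extracting features with 1-D convolutions and then weighting them by importance can be sketched as follows. This is an illustration only, not the CNN-1D-AM architecture: the paper's attention unit is learned, whereas this sketch assumes a fixed heuristic (a softmax over each channel's mean absolute activation). The kernels, the toy signal, and all function names are hypothetical.

```python
import math

def conv1d(signal, kernel):
    """Valid-mode 1-D cross-correlation, the operation used in CNN layers."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def attention_weighted_channels(feature_maps):
    """Scale each channel by a softmax weight.

    Assumption: importance is approximated by mean absolute activation.
    A trained attention unit would learn these weights instead.
    """
    energies = [sum(abs(v) for v in fm) / len(fm) for fm in feature_maps]
    m = max(energies)
    exps = [math.exp(e - m) for e in energies]
    total = sum(exps)
    weights = [e / total for e in exps]
    weighted = [[w * v for v in fm] for w, fm in zip(weights, feature_maps)]
    return weights, weighted

# Toy 1-D signal and two hand-picked kernels (difference and moving sum).
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
kernels = [[1.0, 0.0, -1.0], [1.0, 1.0, 1.0]]
feature_maps = [conv1d(signal, k) for k in kernels]
weights, weighted = attention_weighted_channels(feature_maps)
```

Working on the 1-D sequence directly avoids first reshaping the signal into a 2-D representation, and the channel weights suppress feature maps that carry little signal.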


Malicious URL Detection Algorithm Based on Multi Neural Network Series

CONVERTER ◽

10.17762/converter.209 ◽

2021 ◽

pp. 579-590

Author(s):

Weirong Xiu

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Natural Language ◽

Convolutional Neural Network ◽

Language Processing ◽

Recurrent Neural Network ◽

Detection Algorithm ◽

Attention Mechanism ◽

Global Features ◽

Multi Neural Network

A tandem joint algorithm (CATIR) that combines a convolutional neural network based on the attention mechanism with a bidirectional independent recurrent neural network is proposed. Using natural language processing techniques, word vector features are extracted from URLs, and the extracted URL information features and host information features are merged. The proposed CATIR algorithm uses a CNN (Convolutional Neural Network) to obtain the deep local features in the data, uses the attention mechanism to adjust the weights, and uses an IndRNN (Independent Recurrent Neural Network) to obtain the global features in the data. The experimental results show that the CATIR algorithm significantly improves the accuracy of malicious URL detection over traditional algorithms, reaching 96.9%.


A CNN-ABiGRU method for Gearbox Fault Diagnosis

International Journal of Circuits, Systems and Signal Processing ◽

10.46300/9106.2022.16.54 ◽

2022 ◽

Vol 16 ◽

pp. 440-446

Author(s):

Xiaoyang Zheng ◽

Zeyu Ye ◽

Jinliang Wu

Keyword(s):

Neural Network ◽

Fault Diagnosis ◽

Convolutional Neural Network ◽

Prior Knowledge ◽

Time Sequence ◽

Attention Mechanism ◽

Experimental Results ◽

Generalization Performance ◽

Proposed Model ◽

Gated Recurrent Unit

The gearbox is a key part of modern industrial machinery, and many fault diagnosis methods have been developed for it. However, traditional fault diagnosis methods depend on prior knowledge. This paper proposes an end-to-end method based on a convolutional neural network (CNN), a bidirectional gated recurrent unit (BiGRU), and an attention mechanism. The BiGRU not only makes full use of the time sequence of the signal, but also saves more computing resources than networks of the same type because of its low computational cost. To verify the effectiveness and generalization performance of the proposed method, experiments are carried out on two datasets, and the accuracy is calculated by ten-fold cross-validation. The experimental results show that the proposed model has higher accuracy than existing fault diagnosis methods.
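Ten-fold cross-validation, the evaluation protocol mentioned above, can be sketched as follows. This is a generic illustration rather than the authors' evaluation code: the nearest-centroid classifier and the toy data are stand-ins for the CNN-ABiGRU model and the gearbox datasets, and every name here is hypothetical. A real experiment would normally shuffle the samples before splitting.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

def cross_validated_accuracy(samples, labels, fit, predict, k=10):
    """Train on k-1 folds, test on the held-out fold, and average the accuracies."""
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(samples), k):
        model = fit([samples[i] for i in train_idx], [labels[i] for i in train_idx])
        correct = sum(predict(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)

def fit_centroids(xs, ys):
    """Stand-in model: the mean feature value of each class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict_nearest(model, x):
    """Assign the class whose centroid is closest to x."""
    return min(model, key=lambda c: abs(x - model[c]))

# Toy data: two well-separated classes with alternating labels.
samples = [(10.0 if i % 2 else 0.0) + 0.01 * i for i in range(23)]
labels = [i % 2 for i in range(23)]
mean_acc = cross_validated_accuracy(samples, labels, fit_centroids, predict_nearest)
```

Averaging over ten held-out folds gives a less optimistic accuracy estimate than a single train/test split, since every sample is used for testing exactly once.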


Image Captioning using CNN and LSTM

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.37846 ◽

2021 ◽

Vol 9 (8) ◽

pp. 2666-2669

Author(s):

Anish Banda

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Language Processing ◽

Short Term Memory ◽

Image Description ◽

Image Captioning ◽

Training Images ◽

Long Short Term Memory ◽

Standard Calculation

Abstract: In the proposed model, we examine a deep neural network-based image caption generation technique. Given an image as input, the technique produces output in three forms: a sentence describing the image in three different languages, an mp3 audio file, and an image file. The model uses techniques from both computer vision and natural language processing. We aim to build a caption-generating model using a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM). The convolutional neural network compares the target image with a large dataset of training images. The model generates a decent description using the trained data. To extract features from images we need an encoder, and we use the CNN as the encoder. To decode the generated image description we use the LSTM. To evaluate the accuracy of the generated caption we use the BLEU metric, which grades the quality of the generated content. Performance is calculated with standard evaluation metrics. Keywords: CNN, RNN, LSTM, BLEU score, encoder, decoder, captions, image description.
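As an illustration of how a BLEU score grades a generated caption, here is a simplified single-reference sketch. It assumes whitespace-tokenized input, uniform n-gram weights, and no smoothing; it is not the scoring code used in the paper, and established toolkits handle multiple references and smoothing that this sketch omits.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it appears in the reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    clipped = sum(min(count, ref[g]) for g, count in cand.items())
    total = sum(cand.values())
    return clipped / total if total else 0.0

def bleu(candidate, reference, max_n=4):
    """Geometric mean of clipped precisions times a brevity penalty."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(log_mean)

reference = "a dog runs across the green field".split()
perfect = bleu(reference, reference)
repeated = "the the the the the the the".split()
clipped_p1 = modified_precision(repeated, "the cat is on the mat".split(), 1)
```

The clipping step is what stops a degenerate caption that repeats one common word from scoring well: the repeated candidate above is credited for "the" only twice out of seven tokens.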


An Implicit Preference-Aware Sequential Recommendation Method Based on Knowledge Graph

Wireless Communications and Mobile Computing ◽

10.1155/2021/5206228 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Haiyan Wang ◽

Kaiming Yao ◽

Jian Luo ◽

Yi Lin

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Recommendation System ◽

Attention Mechanism ◽

Experimental Results ◽

Knowledge Graph ◽

Data Overload ◽

Current Sequence

Sequential recommendation systems have received widespread attention due to their good performance in addressing data overload. However, most sequential recommendation methods assume that a user’s preferences depend only on specific items in the current sequence and do not consider the user’s implicit interests. In addition, most previous works focus mainly on exploiting relationships between items in the sequence and seldom consider quantifying the degree of preference for items implied by a user’s different behaviors. To address these two problems, we propose an implicit preference-aware sequential recommendation method based on knowledge graph (IPAKG). Firstly, this method introduces a knowledge graph to exploit the user’s implicit preference representations. Secondly, we integrate a recurrent neural network and an attention mechanism to capture the user’s evolving interests and the relationships between different items in the sequence. Thirdly, we introduce the concept of behavior intensity and design a behavior activation unit to exploit the degree of preference for items implied by a user’s different behaviors. Through the activation unit, the user’s preferences for different items are further quantified. Finally, we conduct experiments on an Amazon electronics dataset and a Tmall dataset to evaluate the performance of our method. Experimental results demonstrate that our proposed method performs better than the baseline methods.


Text Classification Using a Bidirectional Recurrent Neural Network with an Attention Mechanism

2020 International Conference on Culture-oriented Science & Technology (ICCST) ◽

10.1109/iccst50977.2020.00057 ◽

2020 ◽

Author(s):

Bingyuan Wang ◽

Fang Miao ◽

Xueting Wang ◽

Libiao Jin

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Text Classification ◽

Attention Mechanism


Protein Image Classification based on Convolutional Neural Network and Recurrent Neural Network

Proceedings of the 2019 3rd International Conference on Computer Science and Artificial Intelligence ◽

10.1145/3374587.3374617 ◽

2019 ◽

Author(s):

Yuanying Qu ◽

Haowei Song ◽

Hai-Ning Liang ◽

Jieming Ma ◽

Wei Wang

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Image Classification ◽

Recurrent Neural Network


A Review of Plant Phenotypic Image Recognition Technology Based on Deep Learning

Electronics ◽

10.3390/electronics10010081 ◽

2021 ◽

Vol 10 (1) ◽

pp. 81

Author(s):

Jianbin Xiong ◽

Dezheng Yu ◽

Shuangyin Liu ◽

Lei Shu ◽

Xiaochan Wang ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Plant Species ◽

Image Recognition ◽

Recurrent Neural Network ◽

Plant Diseases ◽

Learning Methods ◽

Smart Agriculture ◽

Important Branch

Plant phenotypic image recognition (PPIR) is an important branch of smart agriculture. In recent years, deep learning has achieved significant breakthroughs in image recognition. Consequently, PPIR technology that is based on deep learning is becoming increasingly popular. First, this paper introduces the development and application of PPIR technology, followed by its classification and analysis. Second, it presents the theory of four types of deep learning methods and their applications in PPIR. These methods include the convolutional neural network, deep belief network, recurrent neural network, and stacked autoencoder, and they are applied to identify plant species, diagnose plant diseases, etc. Finally, the difficulties and challenges of deep learning in PPIR are discussed.
