Image Caption Generator Using Neural Networks

In this paper, we focus on one of the visual recognition facets of computer vision, i.e. image captioning. This model’s goal is to come up with captions for an image. Using deep learning techniques, image captioning aims to generate captions for an image automatically. Initially, a Convolutional Neural Network is used to detect the objects in the image (InceptionV3). Recurrent Neural Networks (RNN) and Long Short Term Memory (LSTM) with attention mechanism are used to generate a syntactically and semantically correct caption for the image based on the detected objects. In our project, we're working with a traffic sign dataset that has been captioned using the process described above. This model is extremely useful for visually impaired people who need to cross roads safely.

Download Full-text

Image Caption Generator Using Deep Learning

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.38652 ◽

2021 ◽

Vol 9 (10) ◽

pp. 1554-1564

Author(s):

A. V. N. Kameswari

Keyword(s):

Deep Learning ◽

Visually Impaired ◽

Short Term Memory ◽

Image Understanding ◽

Research Area ◽

Short Term ◽

Visually Impaired People ◽

Learning Techniques ◽

Long Short Term Memory ◽

Image Caption

Abstract: When humans see an image, their brain can easily tell what the image is about, but a computer cannot do it easily. Computer vision researchers worked on this a lot and they considered it impossible until now! With the advancement in Deep learning techniques, availability of huge datasets and computer power, we can build models that can generate captions for an image. Image Caption Generator is a popular research area of Deep Learning that deals with image understanding and a language description for that image. Generating well-formed sentences requires both syntactic and semantic understanding of the language. Being able to describe the content of an image using accurately formed sentences is a very challenging task, but it could also have a great impact, by helping visually impaired people better understand the content of images. The biggest challenge is most definitely being able to create a description that must capture not only the objects contained in an image, but also express how these objects relate to each other. This paper uses Flickr_8K dataset and Flickr8k_text folder that contains Flickr8k.token which is the main file of our dataset that contains image name and their respective caption separated by newline(“\n”). CNN is used for extracting features from the image. We will use the pre-trained model Xception. LSTM will use the information from CNN to help generate a description of the image. In our Flickr8k_text folder, we have Flickr_8k.trainImages.txt file that contains a list of 6000 images names that we will use for training. After CNN-LSTM model is defined we give an image file as parameter through command prompt for testing image caption generator and it generates the caption of an image and its accuracy is observed by calculating bleu score for generated and reference captions. Keywords: Image Caption Generator, Convolutional Neural Network, Long Short-Term Memory, Bleu score, Flickr_8K

Download Full-text

Comparative Assessment of Image Captioning Models

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.8693 ◽

2020 ◽

Vol 17 (1) ◽

pp. 473-478

Author(s):

Mayank ◽

Naveen Kumar Gondhi

Keyword(s):

Neural Networks ◽

Language Processing ◽

English Language ◽

Short Term Memory ◽

Comparative Assessment ◽

Image Feature ◽

Image Captioning ◽

Short Term ◽

Long Short Term Memory ◽

Simple Sentences

Image Captioning is the combination of Computer Vision and Natural Language Processing (NLP) in which simple sentences have been automatically generated describing the content of the image. This paper presents the comparative analysis of different models used for the generation of descriptive English captions for a given image. Feature extractions of the images are done using Convolutional Neural Networks (CNN). These features are then, passed onto Recurrent Neural Networks (RNN) or Long Short-term Memory (LSTM) to generate captions in English language. The evaluation metrics used to appraise the conduct of the models are BLEU score, CIDEr and METEOR.

Download Full-text

Using Machine Learning Algorithms on Prediction of Stock Price

Journal of Modeling and Optimization ◽

10.32732/jmo.2020.12.2.84 ◽

2020 ◽

Vol 12 (2) ◽

pp. 84-99

Author(s):

Li-Pang Chen

Keyword(s):

Machine Learning ◽

Stock Price ◽

Short Term Memory ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Short Term ◽

Learning Techniques ◽

Historical Database ◽

Long Short Term Memory

In this paper, we investigate analysis and prediction of the time-dependent data. We focus our attention on four different stocks are selected from Yahoo Finance historical database. To build up models and predict the future stock price, we consider three different machine learning techniques including Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN) and Support Vector Regression (SVR). By treating close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors in machine learning methods, it can be shown that the prediction accuracy is improved.

Download Full-text