Image Caption Generator Using Deep Learning

Abstract: When humans see an image, they can easily tell what it is about, but a computer cannot. Computer vision researchers worked on this problem for a long time, and it was considered impossible until recently. With advances in deep learning techniques, the availability of huge datasets, and greater computing power, we can now build models that generate captions for an image. Image caption generation is a popular research area of deep learning that deals with image understanding and a language description of that image. Generating well-formed sentences requires both syntactic and semantic understanding of the language. Describing the content of an image in accurately formed sentences is a very challenging task, but it could also have a great impact by helping visually impaired people better understand the content of images. The biggest challenge is creating a description that captures not only the objects contained in an image but also how these objects relate to each other. This paper uses the Flickr_8K dataset and the Flickr8k_text folder, which contains Flickr8k.token, the main file of our dataset; it lists each image name with its respective caption, with entries separated by newlines (“\n”). A CNN is used to extract features from the image; we use the pre-trained Xception model. An LSTM uses the information from the CNN to help generate a description of the image. The Flickr8k_text folder also contains the Flickr_8k.trainImages.txt file, which lists the 6000 image names used for training. After the CNN-LSTM model is defined, we pass an image file as a parameter through the command prompt to test the image caption generator; it generates a caption for the image, and its accuracy is assessed by calculating the BLEU score between the generated and reference captions.

Keywords: Image Caption Generator, Convolutional Neural Network, Long Short-Term Memory, BLEU score, Flickr_8K
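
The BLEU evaluation mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which BLEU variant or n-gram order was used, so the function below is an assumption: a plain sentence-level BLEU with clipped n-gram precision, uniform weights, a brevity penalty, and no smoothing. The names `ngrams` and `bleu` and the sample captions are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=2):
    """Sentence-level BLEU with uniform weights over 1..max_n grams.

    candidate: list of tokens; references: list of token lists.
    No smoothing is applied, so any zero n-gram precision yields 0.0.
    """
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty uses the reference length closest to the candidate length.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(log_precisions) / max_n)


# Made-up captions, for illustration only.
generated = "a dog runs on the grass".split()
reference_captions = ["a dog runs on the grass".split(),
                      "a brown dog is running outside".split()]
score = bleu(generated, reference_captions)
```

A generated caption identical to one of the references scores 1.0, while a caption sharing no words with any reference scores 0.0. Published results usually rely on an established implementation with smoothing rather than a hand-written one.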

BO-LSTM: Classifying relations via long short-term memory networks along biomedical ontologies

10.1101/336719 ◽

2018 ◽

Author(s):

Andre Lamurias ◽

Luka A. Clarke ◽

Francisco M. Couto

Keyword(s):

Deep Learning ◽

Text Mining ◽

Drug Interactions ◽

Short Term Memory ◽

Biomedical Ontologies ◽

Short Term ◽

Term Memory ◽

Domain Specific ◽

Learning Techniques ◽

Long Short Term Memory

Abstract: Recent studies have proposed deep learning techniques, namely recurrent neural networks, to improve biomedical text mining tasks. However, these techniques rarely take advantage of existing domain-specific resources, such as ontologies. In Life and Health Sciences there is a vast and valuable set of such resources publicly available, which are continuously being updated. Biomedical ontologies are nowadays a mainstream approach to formalize existing knowledge about entities, such as genes, chemicals, phenotypes, and disorders. These resources contain supplementary information that may not yet be encoded in training data, particularly in domains with limited labeled data. We propose a new model, BO-LSTM, that takes advantage of domain-specific ontologies by representing each entity as the sequence of its ancestors in the ontology. We implemented BO-LSTM as a recurrent neural network with long short-term memory units, using an open biomedical ontology, which in our case study was Chemical Entities of Biological Interest (ChEBI). We assessed the performance of BO-LSTM on detecting and classifying drug-drug interactions in a publicly available corpus from an international challenge, composed of 792 drug descriptions and 233 scientific abstracts. By using the domain-specific ontology in addition to word embeddings and WordNet, BO-LSTM improved the F1-score of both the detection and the classification of drug-drug interactions, particularly in a document set with a limited number of annotations. Our findings demonstrate that, besides the high performance of current deep learning techniques, domain-specific ontologies can still be useful to mitigate the lack of labeled data.

Author summary: A high quantity of biomedical information is only available in documents such as scientific articles and patents. Due to the rate at which new documents are produced, we need automatic methods to extract useful information from them. Text mining is a subfield of information retrieval which aims at extracting relevant information from text. Scientific literature is a challenge to text mining because of the complexity and specificity of the topics approached. In recent years, deep learning has obtained promising results in various text mining tasks by exploring large datasets. On the other hand, ontologies provide a detailed and sound representation of a domain and have been developed for diverse biomedical domains. We propose a model that combines deep learning algorithms with biomedical ontologies to identify relations between concepts in text. We demonstrate the potential of this model to extract drug-drug interactions from abstracts and drug descriptions. This model can be applied to other biomedical domains using an annotated corpus of documents and an ontology related to that domain to train a new classifier.
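
The idea of representing an entity as the sequence of its ancestors can be sketched as follows. This is not the authors' implementation: the toy hierarchy, its labels, and the function name `ancestor_sequence` are hypothetical, and the sketch assumes each entity has a single parent, whereas real ontologies such as ChEBI allow several.

```python
# Toy is-a hierarchy (child -> parent). The labels are made up for
# illustration and are not taken from ChEBI.
TOY_PARENTS = {
    "toy_drug": "toy_drug_class",
    "toy_drug_class": "toy_compound",
    "toy_compound": "toy_root",
}


def ancestor_sequence(entity, parents):
    """Return [root, ..., entity] by following is-a links upward.

    Assumes a single parent per entity (a simplification).
    """
    path = [entity]
    seen = {entity}
    while path[-1] in parents:
        parent = parents[path[-1]]
        if parent in seen:  # guard against cycles in malformed input
            raise ValueError(f"cycle detected at {parent!r}")
        path.append(parent)
        seen.add(parent)
    # Reverse so the sequence runs from the most general concept down.
    return list(reversed(path))


sequence = ancestor_sequence("toy_drug", TOY_PARENTS)
```

A sequence like this could then be embedded and fed to a recurrent layer alongside the word sequence, which is the general arrangement the abstract describes.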

Deep Learning-Based Sentiment Analysis of COVID-19 Vaccination Responses from Twitter Data

Computational and Mathematical Methods in Medicine ◽

10.1155/2021/4321131 ◽

2021 ◽

Vol 2021 ◽

pp. 1-15

Author(s):

Kazi Nabiul Alam ◽

Md Shakib Khan ◽

Abdur Rab Dhruba ◽

Mohammad Monirujjaman Khan ◽

Jehad F. Al-Amri ◽

...

Keyword(s):

Deep Learning ◽

Language Processing ◽

Performance Metrics ◽

Short Term Memory ◽

Confusion Matrix ◽

Short Term ◽

Learning Techniques ◽

Long Short Term Memory ◽

Severe Anxiety

The COVID-19 pandemic has had a devastating effect on many people, creating severe anxiety, fear, and complicated feelings or emotions. After the initiation of vaccinations against coronavirus, people’s feelings have become more diverse and complex. Our aim is to understand and unravel their sentiments in this research using deep learning techniques. Social media is currently the best way to express feelings and emotions, and with the help of Twitter, one can have a better idea of what is trending and going on in people’s minds. Our motivation for this research was to understand the diverse sentiments of people regarding the vaccination process. In this research, the timeline of the collected tweets was from December 21 to July 21. The tweets contained information about the most common vaccines available recently from across the world. The sentiments of people regarding vaccines of all sorts were assessed using the natural language processing (NLP) tool Valence Aware Dictionary for sEntiment Reasoner (VADER). Grouping the polarities of the obtained sentiments into three classes (positive, negative, and neutral) helped us visualize the overall scenario; our findings included 33.96% positive, 17.55% negative, and 48.49% neutral responses. In addition, we included our analysis of the timeline of the tweets in this research, as sentiments fluctuated over time. A recurrent neural network- (RNN-) oriented architecture, including long short-term memory (LSTM) and bidirectional LSTM (Bi-LSTM), was used to assess the performance of the predictive models, with LSTM achieving an accuracy of 90.59% and Bi-LSTM achieving 90.83%. Other performance metrics such as precision, F1-score, and a confusion matrix were also used to validate our models and findings more effectively. This study improves understanding of the public’s opinion on COVID-19 vaccines and supports the aim of eradicating coronavirus from the world.
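
The three-way polarity grouping described above can be sketched in a few lines. VADER produces a compound score between -1 and 1, and the thresholds of ±0.05 used here are the ones conventionally recommended for it; whether the paper used exactly these cut-offs is an assumption. The function names and the sample scores are hypothetical.

```python
from collections import Counter


def polarity_label(compound, threshold=0.05):
    """Map a VADER-style compound score in [-1, 1] to a polarity label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def polarity_shares(compound_scores):
    """Return the percentage of scores falling into each polarity group."""
    if not compound_scores:
        raise ValueError("need at least one score")
    counts = Counter(polarity_label(s) for s in compound_scores)
    total = len(compound_scores)
    return {label: 100.0 * counts[label] / total
            for label in ("positive", "negative", "neutral")}


# Made-up compound scores, for illustration only.
shares = polarity_shares([0.6, -0.4, 0.0, 0.01])
```

Applied to the compound scores of a full tweet collection, the same tally would produce percentage breakdowns of the kind the abstract reports.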

Predictive Analysis of Cryptocurrency Price Using Deep Learning

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i3.27.17889 ◽

2018 ◽

Vol 7 (3.27) ◽

pp. 258 ◽

Cited By ~ 4

Author(s):

Yecheng Yao ◽

Jungho Yi ◽

Shengjun Zhai ◽

Yuwen Lin ◽

Taekseung Kim ◽

...

Keyword(s):

Deep Learning ◽

International Relations ◽

Short Term Memory ◽

Training Data ◽

Short Term ◽

Effective Learning ◽

Learning Techniques ◽

Benchmark Datasets ◽

Novel Method ◽

Long Short Term Memory

The decentralization of cryptocurrencies has greatly reduced the level of central control over them, impacting international relations and trade. Further, wide fluctuations in cryptocurrency price indicate an urgent need for an accurate way to forecast this price. This paper proposes a novel method to predict cryptocurrency price by considering various factors such as market cap, volume, circulating supply, and maximum supply, based on deep learning techniques such as the recurrent neural network (RNN) and the long short-term memory (LSTM), which are effective learning models for training data, with the LSTM being better at recognizing longer-term associations. The proposed approach is implemented in Python and validated on benchmark datasets. The results verify the applicability of the proposed approach for the accurate prediction of cryptocurrency price.
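
Recurrent models such as the RNN and LSTM named above are typically trained on fixed-length slices of a time series. The abstract does not describe its preprocessing, so the sliding-window sketch below is an assumption about one common setup; `make_windows` and the sample prices are hypothetical.

```python
def make_windows(series, window):
    """Split a sequence into (inputs, target) pairs.

    Each pair holds `window` consecutive values and the value that follows.
    """
    if window < 1 or window >= len(series):
        raise ValueError("window must be at least 1 and shorter than the series")
    pairs = []
    for start in range(len(series) - window):
        inputs = series[start:start + window]
        target = series[start + window]
        pairs.append((inputs, target))
    return pairs


# Made-up daily prices, for illustration only.
prices = [1, 2, 3, 4, 5]
training_pairs = make_windows(prices, window=3)
```

With several factors per day (market cap, volume, and so on), each element of the series would be a feature vector rather than a single number, and the windowing logic stays the same.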

COMPARATIVE ANALYSIS AND EVALUATION OF THE APPLICATION OF DEEP LEARNING TECHNIQUES TO CYBERSECURITY DATASETS

DYNA INGENIERIA E INDUSTRIA ◽

10.6036/10007 ◽

2021 ◽

Vol 96 (5) ◽

pp. 528-533

Author(s):

XAVIER LARRIVA NOVO ◽

MARIO VEGA BARBAS ◽

VICTOR VILLAGRA ◽

JULIO BERROCAL

Keyword(s):

Machine Learning ◽

Deep Learning ◽

High Performance ◽

New Technologies ◽

Short Term Memory ◽

Machine Learning Techniques ◽

Short Term ◽

Term Memory ◽

Learning Techniques ◽

Long Short Term Memory

Cybersecurity has stood out in recent years with the aim of protecting information systems. Different methods, techniques and tools have been used to make the most of the existing vulnerabilities in these systems. Therefore, it is essential to develop and improve new technologies, as well as intrusion detection systems that can detect possible threats. However, the use of these technologies requires highly qualified cybersecurity personnel to analyze the results and reduce the large number of false positives that these technologies produce. This creates the need to research and develop new high-performance cybersecurity systems that allow efficient analysis and resolution of these results. This research presents the application of machine learning techniques to classify real traffic in order to identify possible attacks. The study has been carried out using machine learning tools, applying deep learning algorithms such as the multi-layer perceptron and long short-term memory. Additionally, this document presents a comparison between the results obtained with these algorithms and those obtained with algorithms that are not deep learning, such as random forest and decision tree. Finally, the results are presented, showing that the long short-term memory algorithm provides the best results in terms of precision and logarithmic loss.
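
The two metrics named at the end of this abstract, precision and logarithmic loss, can be computed as in the sketch below. It assumes a binary attack/normal labelling (1 = attack), which the abstract does not confirm; the function names and sample values are hypothetical.

```python
import math


def precision(y_true, y_pred, positive=1):
    """Fraction of predicted positives that are truly positive."""
    predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not predicted_pos:
        return 0.0
    return sum(1 for t in predicted_pos if t == positive) / len(predicted_pos)


def log_loss(y_true, probs, eps=1e-15):
    """Binary cross-entropy; probs are predicted probabilities of class 1."""
    total = 0.0
    for t, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


# Made-up labels and predictions, for illustration only.
labels = [1, 0, 1, 1]
predicted = [1, 1, 0, 1]
prec = precision(labels, predicted)
loss = log_loss([1, 0], [0.5, 0.5])
```

Higher precision means fewer false positives among the flagged traffic, while lower logarithmic loss means the predicted probabilities are better calibrated against the true labels.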

Comparative Analysis of Deep Learning Techniques for the Classification of Hate Speech

NIGERIAN ANNALS OF PURE AND APPLIED SCIENCES ◽

10.46912/napas.227 ◽

2021 ◽

Vol 4 (1) ◽

pp. 121-128

Author(s):

A Iorliam ◽

S Agber ◽

MP Dzungwe ◽

DK Kwaghtyo ◽

S Bum

Keyword(s):

Neural Network ◽

Social Media ◽

Deep Learning ◽

Hate Speech ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

Learning Techniques ◽

Long Short Term Memory

Social media provides opportunities for individuals to anonymously communicate and express hateful feelings and opinions from the comfort of their rooms. This anonymity has become a shield for many individuals or groups who use social media to express deep hatred for other individuals or groups, tribes, races, religions, and genders, as well as belief systems. In this study, a comparative analysis is performed using Long Short-Term Memory and Convolutional Neural Network deep learning techniques for hate speech classification. This analysis demonstrates that the Long Short-Term Memory classifier achieved an accuracy of 92.47%, while the Convolutional Neural Network classifier achieved an accuracy of 92.74%. These results show that deep learning techniques can effectively distinguish hate speech from normal speech.

HUMAN ROBOT INTERACTIVE INTENTION PREDICTION USING DEEP LEARNING TECHNIQUES

Journal of Military Science and Technology ◽

10.54939/1859-1043.j.mst.72a.2021.1-12 ◽

2021 ◽

pp. 1-12

Author(s):

Thang

Keyword(s):

Neural Network ◽

Deep Learning ◽

Deep Neural Network ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

The Neural Network ◽

Learning Techniques ◽

Long Short Term Memory ◽

Deep Learning Neural Network

In this research, we propose a method for predicting human-robot interactive intention. The proposed algorithm makes use of the OpenPose library and a long short-term memory deep learning neural network. The neural network observes the human posture as a time series and then predicts the human interactive intention. We train the deep neural network using a dataset that we generated. The experimental results show that our proposed method is able to predict the human-robot interactive intention, achieving 92% accuracy on the testing set.
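
How a long short-term memory network consumes a posture time series can be illustrated with one forward pass through a single LSTM cell. This is a generic textbook formulation in plain Python, not the authors' network: the weight layout, the names `lstm_step` and `run_sequence`, and the two-number "posture frames" are all hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def lstm_step(x, h, c, W, b):
    """One LSTM step.

    x: input list (length d); h, c: hidden and cell state lists (length n).
    W: dict gate -> n rows of length d + n; b: dict gate -> n biases.
    Gates: "i" input, "f" forget, "o" output, "g" candidate values.
    """
    z = x + h  # concatenate current input and previous hidden state

    def affine(gate):
        return [sum(w * v for w, v in zip(row, z)) + bias
                for row, bias in zip(W[gate], b[gate])]

    i = [sigmoid(v) for v in affine("i")]
    f = [sigmoid(v) for v in affine("f")]
    o = [sigmoid(v) for v in affine("o")]
    g = [math.tanh(v) for v in affine("g")]
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def run_sequence(frames, W, b, n_hidden):
    """Feed a list of frames through the cell and return the final hidden state."""
    h = [0.0] * n_hidden
    c = [0.0] * n_hidden
    for x in frames:
        h, c = lstm_step(x, h, c, W, b)
    return h


# Hypothetical setup: 2 features per frame, 1 hidden unit, zero weights.
W = {gate: [[0.0, 0.0, 0.0]] for gate in "ifog"}
b = {gate: [0.0] for gate in "ifog"}
final_state = run_sequence([[0.1, 0.2], [0.3, 0.4]], W, b, n_hidden=1)
```

In a classifier, the final hidden state would be passed to an output layer that scores each candidate intention; trained weights would replace the zeros used here.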

Sentiment Analysis on Twitter Data by Using Convolutional Neural Network (CNN) and Long Short Term Memory (LSTM)

10.21203/rs.3.rs-247154/v1 ◽

2021 ◽

Author(s):

Usha Devi G ◽

Priyan M K ◽

Gokulnath Chandra Babu ◽

Gayathri Karthick

Keyword(s):

Neural Network ◽

Deep Learning ◽

Sentiment Analysis ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

Twitter Data ◽

Learning Techniques ◽

Stop Word ◽

Long Short Term Memory

Abstract: Twitter sentiment analysis is an automated process of analyzing text data to determine the opinion or feeling expressed in public tweets from various fields. For example, in the marketing and political fields, a huge number of tweets with hashtags are posted every moment and passed from one user to another over the internet. Sentiment analysis is a challenging task for researchers, mainly because context must be interpreted correctly: for certain tweet words it is difficult to evaluate what is truly a negative or a positive statement within the huge corpus of tweet data. This problem violates the integrity of the system, and user reliability can be significantly reduced. In this paper, we identify each tweet word and assign a meaning to it. The feature work is combined with tweet words, word2vec, and stop words and integrated into the deep learning techniques of a Convolutional Neural Network model and Long Short-Term Memory; these algorithms can identify the pattern of stop word counts with their own strategy. The two models are trained and applied to the IMDB dataset, which contains 50,000 movie reviews. A huge amount of Twitter data is also processed to predict and classify sentimental tweets. With the proposed methodology, samples collected experimentally from the real-time environment can be discriminated well, and the efficacy of the system is improved. The deep learning algorithms aim to rate the review tweets and are also able to identify movie reviews with testing accuracies of 87.74% and 88.02%.

Mapping Patient Trajectories using Longitudinal Extraction and Deep Learning in the MIMIC-III Critical Care Database

10.1101/177428 ◽

2017 ◽

Cited By ~ 3

Author(s):

Brett K. Beaulieu-Jones ◽

Patryk Orzechowski ◽

Jason H. Moore

Keyword(s):

Health Care ◽

Deep Learning ◽

Health Care Providers ◽

Short Term Memory ◽

Care Providers ◽

Short Term ◽

Learning Techniques ◽

Long Short Term Memory ◽

Mimic Iii ◽

Care Database

Electronic Health Records (EHRs) contain a wealth of patient data useful to biomedical researchers. At present, both the extraction of data and methods for analyses are frequently designed to work with a single snapshot of a patient’s record. Health care providers often perform and record actions in small batches over time. By extracting these care events, a sequence can be formed providing a trajectory for a patient’s interactions with the health care system. These care events also offer a basic heuristic for the level of attention a patient receives from health care providers. We show that it is possible to learn meaningful embeddings from these care events using two deep learning techniques, unsupervised autoencoders and long short-term memory networks. We compare these methods to traditional machine learning methods, which require a point-in-time snapshot to be extracted from an EHR.
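
Forming a trajectory from extracted care events amounts to grouping events by patient and ordering them in time. The sketch below shows that step under assumed inputs: the tuple layout `(patient_id, timestamp, event_code)`, the function name `build_trajectories`, and the sample events are hypothetical and do not reflect the MIMIC-III schema.

```python
from collections import defaultdict


def build_trajectories(events):
    """Group care events by patient and order each group by timestamp.

    events: iterable of (patient_id, timestamp, event_code) tuples.
    Returns a dict mapping patient_id to a time-ordered list of event codes.
    """
    by_patient = defaultdict(list)
    for patient_id, timestamp, code in events:
        by_patient[patient_id].append((timestamp, code))
    return {
        pid: [code for _, code in sorted(items, key=lambda item: item[0])]
        for pid, items in by_patient.items()
    }


# Made-up events, for illustration only.
events = [
    ("p1", 3, "lab"),
    ("p2", 1, "admit"),
    ("p1", 1, "admit"),
    ("p1", 2, "med"),
]
trajectories = build_trajectories(events)
```

Each resulting sequence could then be encoded and passed to a sequence model, whereas a snapshot-based method would collapse the same events into a single feature vector.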

Image Caption Generator Using Neural Networks

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit21736 ◽

2021 ◽

pp. 01-07

Author(s):

Sujeet Kumar Shukla ◽

Saurabh Dubey ◽

Aniket Kumar Pandey ◽

Vineet Mishra ◽

Mayank Awasthi ◽

...

Keyword(s):

Neural Networks ◽

Visual Recognition ◽

Short Term Memory ◽

Image Captioning ◽

Short Term ◽

Traffic Sign ◽

Visually Impaired People ◽

Learning Techniques ◽

Long Short Term Memory

In this paper, we focus on one of the visual recognition facets of computer vision, namely image captioning. Using deep learning techniques, image captioning aims to generate captions for an image automatically. First, a Convolutional Neural Network (InceptionV3) is used to detect the objects in the image. Recurrent Neural Networks (RNN) and Long Short Term Memory (LSTM) with an attention mechanism are then used to generate a syntactically and semantically correct caption for the image based on the detected objects. In our project, we work with a traffic sign dataset that has been captioned using the process described above. This model is extremely useful for visually impaired people who need to cross roads safely.
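
The attention mechanism mentioned above lets the caption decoder weight image regions differently at each step. The abstract does not say which form of attention was used, so the sketch below assumes simple dot-product attention in plain Python; `softmax`, `attend`, and the tiny feature vectors are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw scores to weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, features):
    """Dot-product attention over a list of region feature vectors.

    query: decoder state (list of floats); features: one list per image region.
    Returns (weights, context), where context is the weighted feature average.
    """
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [sum(w * feat[k] for w, feat in zip(weights, features))
               for k in range(dim)]
    return weights, context


# Made-up region features, for illustration only.
regions = [[1.0, 0.0], [0.0, 1.0]]
weights, context = attend([0.0, 0.0], regions)
```

A query that matches no region in particular spreads its weight evenly, while a query aligned with one region concentrates almost all of the weight there; the resulting context vector is what the LSTM would receive when producing the next word.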

Image Captioning using Deep Learning for the Visually Impaired

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.36267 ◽

2021 ◽

Vol 9 (VII) ◽

pp. 517-522

Author(s):

Dr. A. M. Chandrashekhar

Keyword(s):

Deep Learning ◽

Language Processing ◽

Visually Impaired ◽

Short Term Memory ◽

Fundamental Problem ◽

Text To Speech ◽

Image Captioning ◽

Short Term ◽

Long Short Term Memory ◽

Result Analysis

Describing the content of an image has been a fundamental problem of machine learning that connects computer vision and natural language processing. In recent years, the task of object recognition has advanced at an exceptional rate, which in turn has made image captioning better and easier. In this paper, we discuss the use of image captioning with deep learning for the visually impaired. We use Convolutional Neural Networks along with Long Short-Term Memory to train on and generate captions for images, together with a text-to-speech engine that makes the experience of visually impaired users who are browsing the internet much smoother. We discuss how the model was implemented and its different components and modules, along with a result analysis conducted on a set of outputs peer reviewed by our colleagues, friends and professors.
