Explorations into Deep Learning Text Architectures for Dense Image Captioning

Author(s): Martina Toshevska, Frosina Stojanovska, Eftim Zdravevski, Petre Lameski, Sonja Gievska



2021
Author(s): Karanjit Gill, Sriparna Saha, Santosh Kumar Mishra




Author(s): Toshiba Kamruzzaman, Soomanib Kamruzzaman, Abir Zaman






2021, Vol 3 (1)
Author(s): Faisal Muhammad Shah, Mayeesha Humaira, Md Abidur Rahman Khan Jim, Amit Saha Ami, Shimul Paul




Data, 2019, Vol 4 (4), pp. 139
Author(s): Changhoon Jeong, Sung-Eun Jang, Sanghyuck Na, Juntae Kim

Recently, deep learning-based methods for solving multi-modal tasks such as image captioning, multi-modal classification, and cross-modal retrieval have attracted much attention. Applying deep learning to such tasks requires large amounts of training data. However, although several Korean single-modal datasets exist, there are not enough Korean multi-modal datasets. In this paper, we introduce the KTS (Korean Tourist Spot) dataset for Korean multi-modal deep-learning research. The KTS dataset has four modalities (image, text, hashtags, and likes) and consists of 10 classes related to Korean tourist spots. All data were extracted from Instagram and preprocessed. We performed two experiments with the dataset, image classification and image captioning, both of which yielded reasonable results. We hope that many researchers will use this dataset for multi-modal deep-learning research.
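The four-modality record structure described in the abstract (image, text, hashtags, likes, plus a tourist-spot class label) can be sketched in code. This is a minimal illustration only: the field names, file paths, and class labels below are assumptions, not the dataset's actual schema.

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class KTSSample:
    """One hypothetical multi-modal record: image path, caption text,
    hashtag list, like count, and tourist-spot class label."""
    image_path: str
    text: str
    hashtags: list
    likes: int
    label: str

def group_by_class(samples):
    """Index samples by class label, e.g. for per-class classification splits."""
    groups = defaultdict(list)
    for s in samples:
        groups[s.label].append(s)
    return dict(groups)

# Illustrative records (paths, captions, and labels are invented):
samples = [
    KTSSample("img/0001.jpg", "Sunset over the palace", ["#seoul", "#travel"], 120, "Gyeongbokgung"),
    KTSSample("img/0002.jpg", "Beach day", ["#busan"], 87, "Haeundae"),
    KTSSample("img/0003.jpg", "Night view of the palace", ["#palace"], 45, "Gyeongbokgung"),
]

groups = group_by_class(samples)
print(len(groups))  # number of distinct classes in the toy sample
```

A structure like this makes it straightforward to feed the image/text pair to a captioning model while keeping the hashtag and like modalities available for multi-modal classification.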
