Evaluating Gender-Neutral Training Data for Automated Image Captioning

Author(s):  
Jack J Amend ◽  
Albatool Wazzan ◽  
Richard Souvenir
2020 ◽  
Vol 34 (03) ◽  
pp. 2693-2700
Author(s):  
Paul Hongsuck Seo ◽  
Piyush Sharma ◽  
Tomer Levinboim ◽  
Bohyung Han ◽  
Radu Soricut

Human ratings are currently the most accurate way to assess the quality of an image captioning model, yet most often the only outcome used from an expensive human rating evaluation is a few overall statistics over the evaluation dataset. In this paper, we show that the signal from instance-level human caption ratings can be leveraged to improve captioning models, even when the amount of caption ratings is several orders of magnitude smaller than the caption training data. We employ a policy gradient method to maximize the human ratings as rewards in an off-policy reinforcement learning setting, where policy gradients are estimated from samples drawn from a distribution that focuses on the captions in a caption ratings dataset. Our empirical evidence indicates that the proposed method learns to generalize the human raters' judgments to a previously unseen set of images, as judged by a different set of human judges, and additionally under a different, multi-dimensional side-by-side human evaluation procedure.
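The off-policy setup described above can be sketched as an importance-weighted REINFORCE estimator: samples come from a behavior distribution concentrated on the rated captions, while the gradient updates the learned caption policy. The following toy sketch (all names are illustrative; the paper's actual policy is a full captioning model, not a categorical distribution over a fixed candidate set) computes the exact importance-weighted gradient over a small candidate pool:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def offpolicy_pg_grad(logits, behavior_probs, ratings):
    """Importance-weighted REINFORCE gradient for a categorical policy.

    Samples are drawn from `behavior_probs` (concentrated on rated captions),
    while `logits` parameterize the learned caption policy pi. The estimator is
        E_b[ w * r * grad log pi ],   with importance weight w = pi / b.
    Here we sum the exact expectation over the small candidate set.
    """
    pi = softmax(logits)
    grads = [0.0] * len(logits)
    for i, (b, r) in enumerate(zip(behavior_probs, ratings)):
        w = pi[i] / b  # importance weight pi(i) / b(i)
        for j in range(len(logits)):
            # d/d logit_j of log pi_i is (1[i == j] - pi_j)
            indicator = 1.0 if i == j else 0.0
            grads[j] += b * w * r * (indicator - pi[j])
    return grads
```

With a uniform policy and a single positively rated caption, the gradient pushes that caption's logit up and the others down, as expected for a reward-maximizing update.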


Author(s):  
Xianyu Chen ◽  
Ming Jiang ◽  
Qi Zhao

Image captioning models depend on training with paired image-text corpora, which poses various challenges in describing images containing novel objects absent from the training data. While previous novel object captioning methods rely on external image taggers or object detectors to describe novel objects, we present the Attention-based Novel Object Captioner (ANOC), which complements novel object captioners with human attention features that characterize generally important, task-independent information. It introduces a gating mechanism that adaptively combines human attention with self-learned machine attention, together with a Constrained Self-Critical Sequence Training method that addresses exposure bias while maintaining the constraints of novel object descriptions. Extensive experiments conducted on the nocaps and Held-Out COCO datasets demonstrate that our method considerably outperforms state-of-the-art novel object captioners. Our source code is available at https://github.com/chenxy99/ANOC.
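The gating idea described above can be illustrated with a minimal sketch: a scalar sigmoid gate blends a machine attention map with a human attention prior and renormalizes the result. This is only an assumption-level illustration of the mechanism the abstract names; in ANOC itself the gate is learned from features rather than passed in as a logit:

```python
import math

def gated_attention(machine_att, human_att, gate_logit):
    """Blend self-learned machine attention with a human attention prior.

    `gate_logit` is squashed to a gate g in (0, 1); the fused map is
    g * machine + (1 - g) * human, renormalized to a distribution.
    """
    g = 1.0 / (1.0 + math.exp(-gate_logit))  # sigmoid gate
    fused = [g * m + (1.0 - g) * h for m, h in zip(machine_att, human_att)]
    s = sum(fused)
    return [f / s for f in fused]  # renormalize so weights sum to 1
```

A large positive gate logit recovers the machine attention almost exactly, while a large negative one defers to the human attention prior.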


Author(s):  
Kota Akshith Reddy ◽  
Satish C J ◽  
Jahnavi Polsani ◽  
Teja Naveen Chintapalli ◽  
...  

Automatic image caption generation is one of the core problems in the field of deep learning. Data augmentation is a technique for increasing the amount of training data at hand by transforming existing samples with operations such as flipping, rotating, zooming, and brightening. In this work, we create an image captioning model and check its robustness against the major types of image augmentation techniques. The results show the fuzziness of the model when working with the same image under different augmentations: a different caption is produced each time a different augmentation technique is employed. We also show the change in the model's performance after applying these augmentation techniques. The Flickr8k dataset is used for this study, with BLEU score as the evaluation metric for the image captioning model.
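The robustness check described above amounts to captioning each augmented view of an image and counting how often the caption matches the original. A minimal sketch, using nested lists as toy grayscale images and a caller-supplied captioner (all function names here are illustrative, not from the paper):

```python
def hflip(img):
    """Horizontal flip: reverse each row of a 2D grayscale image."""
    return [row[::-1] for row in img]

def brighten(img, delta=30):
    """Brighten: add `delta` to every pixel, clipped to 255."""
    return [[min(255, p + delta) for p in row] for row in img]

def caption_consistency(captioner, img, augmentations):
    """Caption each augmented view of `img` and count how many views
    receive the same caption as the unaugmented original."""
    base = captioner(img)
    return sum(1 for aug in augmentations if captioner(aug(img)) == base)
```

In practice the captioner would be a trained model and the comparison would use BLEU against reference captions rather than exact string equality, but the control flow is the same.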


2022 ◽  
Vol 4 ◽  
Author(s):  
Ziyan Yang ◽  
Leticia Pinto-Alva ◽  
Franck Dernoncourt ◽  
Vicente Ordonez

People are able to describe images using thousands of languages, but all languages share a single visual world. The aim of this work is to use the learned intermediate visual representations from a deep convolutional neural network to transfer information across languages for which paired data is not available in any form. Our work proposes backpropagation-based decoding coupled with transformer-based multilingual-multimodal language models in order to obtain translations between any pair of languages used during training. We particularly show the capabilities of this approach on the translation of German-Japanese and Japanese-German sentence pairs, given training data of images freely associated with text in English, German, and Japanese, where no single image carries annotations in both Japanese and German. Moreover, we demonstrate that our approach is also generally useful for the multilingual image captioning task when sentences in a second language are available at test time. Our method also compares favorably on the Multi30k dataset against recently proposed methods that likewise leverage images as an intermediate source of translations.
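The pivot-through-vision idea above can be caricatured in a few lines: optimize a continuous vector by gradient descent toward the source sentence's visual representation, then snap it to the nearest target-language embedding. This is only a toy stand-in, under stated assumptions, for the paper's backpropagation-based decoding through a multilingual-multimodal transformer; the embeddings and names below are invented for illustration:

```python
def backprop_decode(visual_vec, tgt_embeddings, steps=200, lr=0.1):
    """Gradient-based decoding sketch.

    Start from a zero vector, descend on the squared distance to the
    visual representation, then return the target-language entry whose
    embedding is closest to the optimized vector.
    """
    x = [0.0] * len(visual_vec)
    for _ in range(steps):
        # gradient of ||x - v||^2 with respect to x is 2 * (x - v)
        x = [xi - lr * 2.0 * (xi - vi) for xi, vi in zip(x, visual_vec)]

    def d2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(tgt_embeddings, key=lambda w: d2(x, tgt_embeddings[w]))
```

The essential point the sketch preserves is that no paired German-Japanese text is consulted: the source and target only meet in the shared visual space.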


2018 ◽  
Vol 17 (4) ◽  
pp. 193-203 ◽  
Author(s):  
Tanja Hentschel ◽  
Lisa Kristina Horvath ◽  
Claudia Peus ◽  
Sabine Sczesny

Abstract. Entrepreneurship programs often aim to increase women’s entrepreneurial activity, which remains lower than men’s. We investigate how advertisements for entrepreneurship programs can be designed to increase women’s application intentions. Results of an experiment with 156 women showed that women indicate (1) lower self-ascribed fit to and interest in the program after viewing a male-typed image (compared to a gender-neutral or female-typed image) in the advertisement; and (2) lower self-ascribed fit to and interest in the program as well as lower application intentions if the German masculine linguistic form of the term “entrepreneur” (compared to the gender-fair word pair “female and male entrepreneur”) is used in the recruitment advertisement. Women’s reactions are most negative when both a male-typed image and the masculine linguistic form appear in the advertisement. Self-ascribed fit and program interest mediate the effect of advertisement characteristics on application intentions.


2011 ◽  
Vol 131 (8) ◽  
pp. 1459-1466
Author(s):  
Yasunari Maeda ◽  
Hideki Yoshida ◽  
Masakiyo Suzuki ◽  
Toshiyasu Matsushima

2016 ◽  
Vol 136 (12) ◽  
pp. 898-907 ◽  
Author(s):  
Joao Gari da Silva Fonseca Junior ◽  
Hideaki Ohtake ◽  
Takashi Oozeki ◽  
Kazuhiko Ogimoto

2011 ◽  
Vol 9 (2) ◽  
pp. 99
Author(s):  
Alex J Auseon ◽  
Albert J Kolibash

Background: Educating trainees during cardiology fellowship is a process in constant evolution, with program directors regularly adapting to increasing demands and regulations as they strive to prepare graduates for practice in today’s healthcare environment. Methods and Results: In a 10-year follow-up to a previous manuscript regarding fellowship education, we reviewed the literature regarding the most topical issues facing training programs in 2010, describing our approach at The Ohio State University. Conclusion: In the midst of challenges posed by the increasing complexity of training requirements and documentation, work hour restrictions, and the new definitions of quality and safety, we propose methods of curricula revision and collaboration that may serve as an example to other medical centers.

