Improving stylized caption compatibility with image content by integrating region context

Author(s):  
Junlong Feng ◽  
Jianping Zhao
2008 ◽  
Vol 67 (19) ◽  
pp. 1777-1790 ◽  
Author(s):  
C. Cruz-Ramos ◽  
R. Reyes-Reyes ◽  
J. Mendoza-Noriega ◽  
Mariko Nakano-Miyatake ◽  
Hector Manuel Perez-Meana

2018 ◽  
Vol 9 (1) ◽  
pp. 24-31
Author(s):  
Rudianto Rudianto ◽  
Eko Budi Setiawan

The availability of Application Programming Interfaces (APIs) for third-party applications on Android devices makes it possible for Android devices to monitor one another. This capability is used to build an application that helps parents supervise their children through the Android devices they own. In this study, a feature was added for classifying image content on Android devices with respect to negative content; for this, the researchers used the Clarifai API. The result of this research is a system that reports the image files stored on the target smartphone and can delete them, receives browser-history reports whose entries can be opened directly from the application, and receives reports of the child's location so the child can be contacted directly through the application. The application works well on Android Lollipop (API Level 22). Index Terms— Application Programming Interface (API), Monitoring, Negative Content, Children, Parent.
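To make the image-classification step concrete, the sketch below shows how an image could be checked for negative content over the Clarifai REST API. It is a minimal illustration, not the authors' implementation: the model ID "moderation-recognition", the response fields, and the 0.8 threshold are assumptions that should be checked against the current Clarifai documentation.

```python
# Hedged sketch: flagging negative image content via the Clarifai v2 REST API.
import requests

CLARIFAI_API_KEY = "YOUR_API_KEY"        # placeholder credential
MODEL_ID = "moderation-recognition"      # assumed moderation model ID

def classify_image(image_url: str, threshold: float = 0.8) -> list[str]:
    """Return concept names scored above `threshold` for the given image URL."""
    resp = requests.post(
        f"https://api.clarifai.com/v2/models/{MODEL_ID}/outputs",
        headers={"Authorization": f"Key {CLARIFAI_API_KEY}"},
        json={"inputs": [{"data": {"image": {"url": image_url}}}]},
        timeout=30,
    )
    resp.raise_for_status()
    # Response layout assumed from the public v2 API: outputs -> data -> concepts.
    concepts = resp.json()["outputs"][0]["data"]["concepts"]
    return [c["name"] for c in concepts if c["value"] >= threshold]

if __name__ == "__main__":
    flagged = classify_image("https://example.com/photo.jpg")
    print("Flagged concepts:", flagged)
```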


2021 ◽  
Vol 12 (2) ◽  
pp. 204380872110075
Author(s):  
Ashley Slabbert ◽  
Penelope Hasking ◽  
Lies Notebaert ◽  
Mark Boyes

The Emotional Image Tolerance (EIT) task assesses tolerance of negative emotion induced by negatively valenced images. We made several minor modifications to the task (Study 1) and adapted it to include positive and neutral images in order to assess whether individuals respond to the valence or to the intensity of the image content (Study 2). In both studies, we assessed subjective distress, gender differences in task responses, and associations between behavioral and self-reported distress tolerance and related constructs. Across both studies, the EIT successfully induced distress, and gender differences were observed, with females generally reporting more distress than males. In Study 2, responses on the adapted EIT task were correlated with self-reported distress tolerance, rumination, and emotion reactivity. The EIT thus successfully induces distress, and the correlations in Study 2 provide promising evidence of its validity.


Entropy ◽  
2021 ◽  
Vol 23 (5) ◽  
pp. 510
Author(s):  
Taiyong Li ◽  
Duzhong Zhang

Image security is a hot topic in the era of the Internet and big data. Hyperchaotic image encryption, which can effectively prevent unauthorized users from accessing image content, has become more and more popular in the image-security community. In general, such approaches conduct encryption on pixel-level, bit-level, or DNA-level data, or their combinations, which lacks diversity in the processed data levels and limits security. This paper proposes a novel hyperchaotic image encryption scheme via multiple bit permutation and diffusion, namely MBPD, to cope with this issue. Specifically, a four-dimensional hyperchaotic system with three positive Lyapunov exponents is first proposed. Second, a hyperchaotic sequence is generated from the proposed system for the subsequent encryption operations. Third, multiple bit permutation and diffusion (permutation and/or diffusion can be conducted with 1–8 or more bits), determined by the hyperchaotic sequence, is designed. Finally, the proposed MBPD is applied to image encryption. We conduct extensive experiments on several public test images to validate the proposed MBPD. The results verify that MBPD can effectively resist different types of attacks and performs better than the compared popular encryption methods.
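The general idea of chaotic bit-level permutation followed by diffusion can be illustrated with the short sketch below. The paper's four-dimensional hyperchaotic system is not specified in the abstract, so a logistic map stands in purely for illustration; the block sizes and the XOR-chaining diffusion rule are likewise illustrative, not the authors' exact MBPD design.

```python
# Minimal sketch of bit-level permutation + diffusion driven by a chaotic sequence.
import numpy as np

def logistic_sequence(x0: float, n: int, r: float = 3.99) -> np.ndarray:
    """Generate n chaotic values in (0, 1) with the logistic map (stand-in system)."""
    seq = np.empty(n)
    x = x0
    for i in range(n):
        x = r * x * (1.0 - x)
        seq[i] = x
    return seq

def encrypt(image: np.ndarray, x0: float = 0.3141) -> np.ndarray:
    """Encrypt a uint8 image by permuting its bit stream, then diffusing the bytes."""
    flat = image.flatten()
    bits = np.unpackbits(flat)                       # bit-level representation

    # Permutation: sort indices by chaotic values to get a key-dependent shuffle.
    perm = np.argsort(logistic_sequence(x0, bits.size))
    bits = bits[perm]

    # Diffusion: XOR each byte with a chaotic keystream byte and the previous
    # ciphertext byte so a small plaintext change propagates through the image.
    cipher = np.packbits(bits)
    key = (logistic_sequence(x0 / 2, cipher.size) * 256).astype(np.uint8)
    prev = np.uint8(0)
    for i in range(cipher.size):
        cipher[i] = cipher[i] ^ key[i] ^ prev
        prev = cipher[i]
    return cipher.reshape(image.shape)

if __name__ == "__main__":
    img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
    enc = encrypt(img)
    print("plain mean:", img.mean(), "cipher mean:", enc.mean())
```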


Author(s):  
Huimin Lu ◽  
Rui Yang ◽  
Zhenrong Deng ◽  
Yonglin Zhang ◽  
Guangwei Gao ◽  
...  

Chinese image description generation tasks usually face challenges such as single-feature extraction, lack of global information, and lack of detailed description of the image content. To address these limitations, we propose a fuzzy attention-based DenseNet-BiLSTM Chinese image captioning method in this article. In the proposed method, we first improve the densely connected network to extract features of the image at different scales and to enhance the model's ability to capture weak features. At the same time, a bidirectional LSTM is used as the decoder to strengthen the use of context information. The introduction of an improved fuzzy attention mechanism effectively addresses the problem of correspondence between image features and contextual information. We conduct experiments on the AI Challenger dataset to evaluate the performance of the model. The results show that, compared with other models, our proposed model achieves higher scores on objective quantitative evaluation metrics, including BLEU, METEOR, ROUGE-L, and CIDEr. The generated description sentences accurately express the image content.
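A highly simplified PyTorch sketch of this kind of pipeline is shown below: DenseNet region features, an attention-weighted visual context, and a bidirectional LSTM decoder trained with teacher forcing. The authors' improved fuzzy attention is replaced here by plain softmax attention, and all layer sizes and the vocabulary size are assumptions, not values from the paper.

```python
# Simplified DenseNet + attention + BiLSTM captioning sketch (illustrative only).
import torch
import torch.nn as nn
import torchvision.models as models

class AttentiveCaptioner(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 256, hidden_dim: int = 512):
        super().__init__()
        self.encoder = models.densenet121(weights=None).features   # (B, 1024, H, W) maps
        self.attn = nn.Linear(1024, 1)                              # scores each region
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.decoder = nn.LSTM(embed_dim + 1024, hidden_dim,
                               batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, images, captions):
        feats = self.encoder(images)                                # (B, 1024, H, W)
        B, C, H, W = feats.shape
        regions = feats.view(B, C, H * W).permute(0, 2, 1)          # (B, R, 1024)

        # Soft attention over regions -> one visual context vector per image.
        weights = torch.softmax(self.attn(regions), dim=1)          # (B, R, 1)
        context = (weights * regions).sum(dim=1)                    # (B, 1024)

        # Concatenate the context with each word embedding and decode.
        emb = self.embed(captions)                                  # (B, T, embed_dim)
        ctx = context.unsqueeze(1).expand(-1, emb.size(1), -1)      # (B, T, 1024)
        out, _ = self.decoder(torch.cat([emb, ctx], dim=-1))        # (B, T, 2*hidden)
        return self.fc(out)                                         # vocabulary logits

if __name__ == "__main__":
    model = AttentiveCaptioner(vocab_size=5000)
    logits = model(torch.randn(2, 3, 224, 224), torch.randint(0, 5000, (2, 12)))
    print(logits.shape)   # torch.Size([2, 12, 5000])
```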


2012 ◽  
Vol 239-240 ◽  
pp. 1472-1475
Author(s):  
Dan Ai ◽  
Jing Li Shi ◽  
Jun Jun Cao ◽  
Hong Yan Zhong

Landmark correspondence plays a decisive role in landmark-based multi-modality image registration. We combine RPM (Robust Point Matching) and an improved Mean Shift to estimate the correspondence of landmarks in images. We improve the target mode and bandwidth used in Mean Shift, and we also run RPM to estimate the initial landmark correspondence. Next, we use the improved Mean Shift to adjust the corresponding relations between points. Our method helps make the correspondences between points more accurate and ties the convergence process of RPM to the image content. Experimental results show that our method can achieve accurate registration of multi-modal images.
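The "refine an initial correspondence with Mean Shift" idea can be sketched as below: starting from an RPM estimate, the point is shifted toward the local mode of a weight map. Plain image intensity stands in here for the similarity/feature weights, and the Gaussian bandwidth and stopping criteria are illustrative, not the authors' improved versions.

```python
# Hedged sketch: mean-shift refinement of a landmark position on a weight map.
import numpy as np

def mean_shift_refine(weight_map: np.ndarray, start: tuple[float, float],
                      bandwidth: float = 5.0, iters: int = 50, tol: float = 1e-3):
    """Move `start` (row, col) toward the weighted centroid inside a Gaussian window."""
    rows, cols = np.indices(weight_map.shape)
    y, x = float(start[0]), float(start[1])
    for _ in range(iters):
        # Gaussian kernel centred on the current estimate.
        k = np.exp(-((rows - y) ** 2 + (cols - x) ** 2) / (2 * bandwidth ** 2))
        w = k * weight_map
        total = w.sum()
        if total == 0:
            break
        ny, nx = (w * rows).sum() / total, (w * cols).sum() / total
        if np.hypot(ny - y, nx - x) < tol:        # converged to a local mode
            return ny, nx
        y, x = ny, nx
    return y, x

if __name__ == "__main__":
    img = np.zeros((64, 64)); img[30:35, 40:45] = 1.0    # bright blob = target mode
    print(mean_shift_refine(img, start=(28.0, 38.0)))     # rough initial estimate (e.g. from RPM)
```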


Author(s):  
Gangavarapu Venkata Satya Kumar ◽  
Pillutla Gopala Krishna Mohan

In diverse computer applications, the analysis of image content plays a key role. This image content might be either textual (like text appearing in the images) or visual (like shape, color, and texture). These two kinds of image content constitute an image's basic features and therefore become the major asset for any implementation. Many state-of-the-art models are based on visual search or annotated text for Content-Based Image Retrieval (CBIR). As the demand for multitasking grows, a new method combining both textual and visual features needs to be introduced. This paper develops an intelligent CBIR system for a collection of different benchmark texture datasets. Here, a new descriptor named Information Oriented Angle-based Local Tri-directional Weber Patterns (IOA-LTriWPs) is adopted. The pattern is computed not only from the tri-direction and eight neighborhood pixels but also from four angles [Formula: see text], [Formula: see text], [Formula: see text], and [Formula: see text]. Once the patterns concerning the tri-direction, eight neighborhood pixels, and four angles are obtained, the best patterns are selected based on maximum mutual information. Moreover, histogram computation of the patterns provides the final feature vector, from which a new weighted feature extraction is performed. As a new contribution, the novel weight function is optimized by the Improved MVO on random basis (IMVO-RB), in such a way that the precision and recall of the retrieved images are high. Further, the proposed model uses a logarithmic similarity, the Mean Square Logarithmic Error (MSLE), between the features of the query image and the trained images to retrieve the relevant images. Analyses on diverse texture image datasets have validated the accuracy and efficiency of the developed pattern over existing methods.
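The retrieval step described above can be illustrated with the short sketch below, which ranks database images by the Mean Square Logarithmic Error (MSLE) between histogram feature vectors. The IOA-LTriWP descriptor itself is not reproduced; any normalized local-pattern histogram could be plugged in as the feature, and the 256-bin size is an assumption.

```python
# Hedged sketch: MSLE-based ranking of pattern-histogram features for CBIR.
import numpy as np

def msle(a: np.ndarray, b: np.ndarray) -> float:
    """Mean squared difference of log(1 + x), a scale-tolerant dissimilarity."""
    return float(np.mean((np.log1p(a) - np.log1p(b)) ** 2))

def retrieve(query_feat: np.ndarray, db_feats: np.ndarray, top_k: int = 5):
    """Return indices of the top_k database images with the smallest MSLE."""
    dists = np.array([msle(query_feat, f) for f in db_feats])
    return np.argsort(dists)[:top_k]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    db = rng.random((100, 256))          # 100 images, 256-bin pattern histograms
    query = db[42] + 0.01 * rng.random(256)
    print(retrieve(query, db))           # index 42 should rank first
```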

