Image annotation and retrieval based on efficient learning of contextual latent space

Author(s):  
Tatsuya Harada ◽  
Hideki Nakayama ◽  
Yasuo Kuniyoshi
Author(s):  
Rui Zhang ◽  
Ling Guan

After nearly twenty years of intensive study, content-based image retrieval and annotation remains a difficult problem. By and large, the essential challenge lies in the limited ability of low-level visual features to characterize the semantic information of images, commonly known as the semantic gap. To bridge this gap, various approaches have been proposed that incorporate human knowledge and textual information, as well as learning techniques that exploit information from different modalities. At the same time, contextual information, which represents the relationships between different real-world or conceptual entities, has shown its significance for recognition tasks, not only in real-life experience but also in scientific studies. In this chapter, the authors first review the state of the art in image annotation and retrieval. They then elaborate a general Bayesian framework that integrates content and contextual information, together with its application to both image annotation and retrieval. Contextual information is taken to be the statistical relationship between different images for retrieval and between different semantic concepts for annotation. The framework has efficient learning and classification procedures, and experimental studies demonstrate its advantage over both content-based and context-based approaches.
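The idea of combining content and context in a Bayesian way can be illustrated with a minimal sketch. It is not the chapter's framework: the function name, the concept labels, and all probability values below are hypothetical, and the only assumption carried over from the abstract is that a content-based score and a context-based score are multiplied and renormalized, as in Bayes' rule.

```python
def fuse_content_and_context(content_likelihood, context_prior):
    """Return the normalized posterior P(concept | content, context).

    content_likelihood: dict mapping concept -> p(visual features | concept)
    context_prior:      dict mapping concept -> P(concept | context)
    Both inputs are hypothetical; a real system would estimate them from data.
    """
    # Bayes' rule up to a constant: posterior is likelihood times prior.
    unnormalized = {
        concept: content_likelihood[concept] * context_prior.get(concept, 0.0)
        for concept in content_likelihood
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("posterior is undefined: all products are zero")
    return {concept: value / total for concept, value in unnormalized.items()}


# Hypothetical scores: visual features alone slightly favor "desert",
# but concepts already assigned to the image make "beach" more plausible.
content = {"beach": 0.30, "desert": 0.35, "snow": 0.05}
context = {"beach": 0.70, "desert": 0.20, "snow": 0.10}

posterior = fuse_content_and_context(content, context)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

In this toy case the content score alone would pick "desert", while the fused posterior picks "beach", which is the kind of correction contextual information is meant to provide.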


2022 ◽  
Vol 40 (4) ◽  
pp. 1-27
Author(s):  
Zhongwei Xie ◽  
Ling Liu ◽  
Yanzhao Wu ◽  
Luo Zhong ◽  
Lin Li

This article introduces a two-phase deep feature engineering framework for efficient learning of semantics-enhanced joint embedding (SEJE), which clearly separates the deep feature engineering in data preprocessing from the training of the text-image joint embedding model. We use the Recipe1M dataset for the technical description and empirical validation. In preprocessing, we combine deep feature engineering with semantic context features derived from the raw text-image input data. We leverage LSTM to identify key terms, and deep NLP models from the BERT family, TextRank, or TF-IDF to produce ranking scores for those key terms, before generating the vector representation of each key term with Word2vec. We leverage Wide ResNet50 and Word2vec to extract and encode the image category semantics of food images, which helps the semantic alignment of the learned recipe and image embeddings in the joint latent space. In joint embedding learning, we perform deep feature engineering by optimizing the batch-hard triplet loss function with soft margin and double negative sampling, while also taking into account the category-based alignment loss and the discriminator-based alignment loss. Extensive experiments demonstrate that our SEJE approach with deep feature engineering significantly outperforms the state-of-the-art approaches.
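The batch-hard triplet loss with soft margin mentioned above can be sketched as follows. This is a simplified illustration and not the authors' implementation: it assumes that `anchors[i]` and `candidates[i]` form the only matching recipe-image pair in the batch, uses Euclidean distance, and leaves out double negative sampling and both alignment losses. All function names and embedding values are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def softplus(x):
    """Numerically stable log(1 + exp(x)), the soft-margin function."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def batch_hard_soft_margin(anchors, candidates):
    """Mean soft-margin loss using the hardest negative for each anchor.

    Assumes anchors[i] matches candidates[i]; every other candidate in the
    batch is treated as a negative for anchor i.
    """
    n = len(anchors)
    if n < 2 or n != len(candidates):
        raise ValueError("need two or more aligned anchor/candidate pairs")
    losses = []
    for i, anchor in enumerate(anchors):
        d_pos = euclidean(anchor, candidates[i])
        # Batch-hard mining: the closest non-matching candidate.
        d_neg = min(euclidean(anchor, candidates[j]) for j in range(n) if j != i)
        losses.append(softplus(d_pos - d_neg))
    return sum(losses) / n


# Hypothetical 2-D embeddings: each image lies close to its own recipe.
recipes = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]
images_aligned = [[0.1, 0.0], [10.0, 0.1], [0.0, 9.9]]
# Same images, but paired with the wrong recipes.
images_misaligned = [[10.0, 0.1], [0.0, 9.9], [0.1, 0.0]]

loss_aligned = batch_hard_soft_margin(recipes, images_aligned)
loss_misaligned = batch_hard_soft_margin(recipes, images_misaligned)
print(loss_aligned < loss_misaligned)
```

A cross-modal setup would typically apply the same function in both directions, with recipes as anchors and then with images as anchors, and sum the two terms.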


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Adam Gosztolai ◽  
Alexis Arnaudon

Abstract. Describing networks geometrically through low-dimensional latent metric spaces has helped design efficient learning algorithms, unveil network symmetries and study dynamical network processes. However, latent space embeddings are limited to specific classes of networks because incompatible metric spaces generally result in information loss. Here, we study arbitrary networks geometrically by defining a dynamic edge curvature measuring the similarity between pairs of dynamical network processes seeded at nearby nodes. We show that the evolution of the curvature distribution exhibits gaps at characteristic timescales indicating bottleneck-edges that limit information spreading. Importantly, curvature gaps are robust to large fluctuations in node degrees, encoding communities until the phase transition of detectability, where spectral and node-clustering methods fail. Using this insight, we derive geometric modularity to find multiscale communities based on deviations from constant network curvature in generative and real-world networks, significantly outperforming most previous methods. Our work suggests using network geometry for studying and controlling the structure of and information spreading on networks.


2015 ◽  
Vol 36 (4) ◽  
pp. 228-236 ◽  
Author(s):  
Janko Međedović ◽  
Boban Petrović

Abstract. Machiavellianism, narcissism, and psychopathy are personality traits understood to be dispositions toward amoral and antisocial behavior. Recent research has suggested that sadism should also be added to this set of traits. In the present study, we tested the hypothesis that these four traits are expressions of one superordinate construct: the Dark Tetrad. Exploration of the latent space of the four “dark” traits suggested that a single second-order factor representing the Dark Tetrad can be extracted. Analysis has shown that Dark Tetrad traits can be located in the space of basic personality traits, especially on the negative pole of the Honesty-Humility, Agreeableness, Conscientiousness, and Emotionality dimensions. We conclude that sadism behaves in a manner similar to the other dark traits, but it cannot be reduced to them. The results support the concept of the “Dark Tetrad.”


Methodology ◽  
2006 ◽  
Vol 2 (1) ◽  
pp. 24-33 ◽  
Author(s):  
Susan Shortreed ◽  
Mark S. Handcock ◽  
Peter Hoff

Recent advances in latent space and related random effects models hold much promise for representing network data. The inherent dependency between ties in a network makes modeling data of this type difficult. In this article we consider a recently developed latent space model that is particularly appropriate for the visualization of networks. We suggest a new estimator of the latent positions and perform two network analyses, comparing four alternative estimators. We demonstrate a method of checking the validity of the positional estimates. These estimators are implemented via a package in the freeware statistical language R. The package allows researchers to efficiently fit the latent space model to data and to visualize the results.
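For readers unfamiliar with latent space models of networks, the sketch below evaluates the log-likelihood of a simple distance model. It assumes the common form in which the log-odds of a tie between nodes i and j equal an intercept minus the Euclidean distance between their latent positions, with no covariates and an undirected network. The abstract does not state the model equation, so this form, the function names, and the toy data are assumptions for illustration only, and the sketch is unrelated to the R package the article describes.

```python
import math


def log1p_exp(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_likelihood(adjacency, positions, alpha):
    """Log-likelihood of an undirected network under a latent distance model.

    Assumed model: logit P(y_ij = 1) = alpha - ||z_i - z_j||.
    adjacency: symmetric 0/1 matrix as nested lists
    positions: list of latent coordinates z_i, one per node
    alpha:     intercept controlling overall tie density
    """
    n = len(adjacency)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):  # visit each dyad once
            eta = alpha - math.dist(positions[i], positions[j])
            # Bernoulli log-likelihood written in terms of the log-odds eta.
            total += adjacency[i][j] * eta - log1p_exp(eta)
    return total


# Toy network: ties 0-1 and 2-3 only.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
# Positions that place tied nodes close together ...
good_positions = [[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [5.1, 0.0]]
# ... and positions that place tied nodes far apart.
bad_positions = [[0.0, 0.0], [5.0, 0.0], [0.1, 0.0], [5.1, 0.0]]

ll_good = log_likelihood(adjacency, good_positions, alpha=1.0)
ll_bad = log_likelihood(adjacency, bad_positions, alpha=1.0)
print(ll_good > ll_bad)
```

Estimators of the latent positions, such as those the article compares, search for coordinates that make this kind of likelihood (or a posterior built on it) large, which is why connected nodes end up near each other in the resulting visualization.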


2013 ◽  
Vol 39 (10) ◽  
pp. 1674
Author(s):  
Dong YANG ◽  
Xiu-Ling ZHOU ◽  
Ping GUO

Author(s):  
Joseph P Davin ◽  
Sunil Gupta ◽  
Mikolaj Jan Piskorski
