semantic alignment
Recently Published Documents


TOTAL DOCUMENTS

88
(FIVE YEARS 37)

H-INDEX

8
(FIVE YEARS 4)

2022 ◽  
Vol 40 (4) ◽  
pp. 1-27
Author(s):  
Zhongwei Xie ◽  
Ling Liu ◽  
Yanzhao Wu ◽  
Luo Zhong ◽  
Lin Li

This article introduces a two-phase deep feature engineering framework for efficient learning of semantics enhanced joint embedding, which clearly separates the deep feature engineering in data preprocessing from training the text-image joint embedding model. We use the Recipe1M dataset for the technical description and empirical validation. In preprocessing, we perform deep feature engineering by combining deep feature engineering with semantic context features derived from raw text-image input data. We leverage LSTM to identify key terms, deep NLP models from the BERT family, TextRank, or TF-IDF to produce ranking scores for key terms before generating the vector representation for each key term by using Word2vec. We leverage Wide ResNet50 and Word2vec to extract and encode the image category semantics of food images to help semantic alignment of the learned recipe and image embeddings in the joint latent space. In joint embedding learning, we perform deep feature engineering by optimizing the batch-hard triplet loss function with soft-margin and double negative sampling, taking into account also the category-based alignment loss and discriminator-based alignment loss. Extensive experiments demonstrate that our SEJE approach with deep feature engineering significantly outperforms the state-of-the-art approaches.


2021 ◽  
Author(s):  
Zhijian Liu ◽  
Haotian Tang ◽  
Sibo Zhu ◽  
Song Han

Author(s):  
Zhong Ji ◽  
Kexin Chen ◽  
Haoran Wang

Image-text matching plays a central role in bridging the semantic gap between vision and language. The key point to achieve precise visual-semantic alignment lies in capturing the fine-grained cross-modal correspondence between image and text. Most previous methods rely on single-step reasoning to discover the visual-semantic interactions, which lacks the ability of exploiting the multi-level information to locate the hierarchical fine-grained relevance. Different from them, in this work, we propose a step-wise hierarchical alignment network (SHAN) that decomposes image-text matching into multi-step cross-modal reasoning process. Specifically, we first achieve local-to-local alignment at fragment level, following by performing global-to-local and global-to-global alignment at context level sequentially. This progressive alignment strategy supplies our model with more complementary and sufficient semantic clues to understand the hierarchical correlations between image and text. The experimental results on two benchmark datasets demonstrate the superiority of our proposed method.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Mingyong Li ◽  
Qiqi Li ◽  
Lirong Tang ◽  
Shuang Peng ◽  
Yan Ma ◽  
...  

Cross-modal hashing encodes heterogeneous multimedia data into compact binary code to achieve fast and flexible retrieval across different modalities. Due to its low storage cost and high retrieval efficiency, it has received widespread attention. Supervised deep hashing significantly improves search performance and usually yields more accurate results, but requires a lot of manual annotation of the data. In contrast, unsupervised deep hashing is difficult to achieve satisfactory performance due to the lack of reliable supervisory information. To solve this problem, inspired by knowledge distillation, we propose a novel unsupervised knowledge distillation cross-modal hashing method based on semantic alignment (SAKDH), which can reconstruct the similarity matrix using the hidden correlation information of the pretrained unsupervised teacher model, and the reconstructed similarity matrix can be used to guide the supervised student model. Specifically, firstly, the teacher model adopted an unsupervised semantic alignment hashing method, which can construct a modal fusion similarity matrix. Secondly, under the supervision of teacher model distillation information, the student model can generate more discriminative hash codes. Experimental results on two extensive benchmark datasets (MIRFLICKR-25K and NUS-WIDE) show that compared to several representative unsupervised cross-modal hashing methods, the mean average precision (MAP) of our proposed method has achieved a significant improvement. It fully reflects its effectiveness in large-scale cross-modal data retrieval.


2021 ◽  
Author(s):  
Xu Yang ◽  
Cheng Deng ◽  
Zhiyuan Dang ◽  
Kun Wei ◽  
Junchi Yan
Keyword(s):  

2021 ◽  
Vol 69 ◽  
pp. 73-80
Author(s):  
Xiao Liu ◽  
Jing Zhao ◽  
Shiliang Sun ◽  
Huawen Liu ◽  
Hao Yang

Author(s):  
Sangita De ◽  
Premek Brada ◽  
Juergen Mottok ◽  
Michael Niklas

Semantic alignment of application software components’ ontologies represents a great interest in vehicle application domains that manipulate heterogeneous overlapping knowledge application frameworks. In the past few years, with the growth in the novel vehicle service requirements such as autonomous driving, V2X (Vehicle-to-Vehicle communication) and many others, automotive application software component models are becoming increasingly collaborative with other qualified cross-enterprise industrial partners to accomplish these complex service requirements. The most daunting impediment to this cross-enterprise collaboration is semantic interoperability. For efficient services collaboration through cross-enterprise semantic interoperability between the vehicle application frameworks’ software components, aligning the interface ontologies of these components by identifying the depth of semantic alignment relationships between the concepts of the interface ontologies is the major focus of this paper. In contrast to several existing ontology structural metrics, this work defines, evaluates and validates ontology metrics to measure the depth of semantic alignment between the vehicle domain software component frameworks’ interface ontological models. To emphasize the substantial role of semantic alignment of software component frameworks’ interface ontologies in semantic interoperability, a typical vehicle domain case study involving vehicle applications is considered for demonstration.


Sign in / Sign up

Export Citation Format

Share Document