Discrete Two-Step Cross-Modal Hashing through the Exploitation of Pairwise Relations

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Shaohua Wang ◽  
Xiao Kang ◽  
Fasheng Liu ◽  
Xiushan Nie ◽  
Xingbo Liu

The cross-modal hashing method can map heterogeneous multimodal data into a compact binary code that preserves semantic similarity, which can significantly enhance the convenience of cross-modal retrieval. However, the currently available supervised cross-modal hashing methods generally only factorize the label matrix and do not fully exploit the supervised information. Furthermore, these methods often only use one-directional mapping, which results in an unstable hash learning process. To address these problems, we propose a new supervised cross-modal hash learning method called Discrete Two-step Cross-modal Hashing (DTCH) through the exploitation of pairwise relations. Specifically, this method fully exploits the pairwise similarity relations contained in the supervision information: for the label matrix, the hash learning process is stabilized by combining matrix factorization and label regression; for the pairwise similarity matrix, a semirelaxed and semidiscrete strategy is adopted to potentially reduce the cumulative quantization errors while improving the retrieval efficiency and accuracy. The approach further combines an exploration of fine-grained features in the objective function with a novel out-of-sample extension strategy to enable the implicit preservation of consistency between the different modal distributions of samples and the pairwise similarity relations. The superiority of our method was verified through extensive experiments using two widely used datasets.
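The quantization error mentioned in the abstract above can be illustrated with a small sketch. It is not the DTCH optimization itself: it assumes, purely for illustration, that each modality already has a real-valued (relaxed) representation, binarizes it with the sign function, and measures how far the binary code lies from the relaxed vector. The vectors and function names are hypothetical.

```python
def sign_quantize(vector):
    """Map a real-valued vector to a {-1, +1} code (zero maps to +1)."""
    return [1 if x >= 0 else -1 for x in vector]


def quantization_error(vector, code):
    """Squared Euclidean distance between a relaxed vector and its binary code."""
    return sum((x - b) ** 2 for x, b in zip(vector, code))


def hamming_distance(code_a, code_b):
    """Number of positions in which two equal-length codes differ."""
    return sum(1 for a, b in zip(code_a, code_b) if a != b)


# Hypothetical relaxed representations of one image and one paired text.
image_repr = [0.9, -0.2, 0.1, -1.3]
text_repr = [0.7, -0.6, -0.1, -0.8]

image_code = sign_quantize(image_repr)
text_code = sign_quantize(text_repr)

print(image_code, text_code)
print("quantization error (image):", quantization_error(image_repr, image_code))
print("cross-modal Hamming distance:", hamming_distance(image_code, text_code))
```

Entries of the relaxed vector that sit close to zero contribute most of the error and are the ones most likely to flip between modalities, which is one reason discrete or semidiscrete optimization is attractive.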

Author(s):  
Donglin Zhang ◽  
Xiao-Jun Wu ◽  
Jun Yu

Hashing methods have sparked a great revolution in large-scale cross-media search due to their effectiveness and efficiency. Most existing approaches learn a unified hash representation in a common Hamming space to represent all multimodal data. However, unified hash codes may not characterize cross-modal data discriminatively, because the data may vary greatly in dimensionality, physical properties, and statistical information. In addition, most existing supervised cross-modal algorithms preserve the similarity relationship by constructing an n × n pairwise similarity matrix, which requires a large amount of calculation and loses the category information. To mitigate these issues, a novel cross-media hashing approach is proposed in this article, dubbed label flexible matrix factorization hashing (LFMH). Specifically, LFMH jointly learns the modality-specific latent subspace with similar semantics through flexible matrix factorization. In addition, LFMH guides the hash learning by utilizing the semantic labels directly instead of the large n × n pairwise similarity matrix. LFMH transforms the heterogeneous data into modality-specific latent semantic representations. Therefore, we can obtain the hash codes by quantizing the representations, and the learned hash codes are consistent with the supervised labels of multimodal data. Then, we can obtain similar binary codes for the corresponding modality, and the binary codes can characterize such samples flexibly. Accordingly, the derived hash codes have more discriminative power for single-modal and cross-modal retrieval tasks. Extensive experiments on eight different databases demonstrate that our model outperforms some competitive approaches.
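To make the cost and information loss of the n × n pairwise similarity matrix concrete, the following sketch builds one from a small multi-label matrix. It assumes the common convention that two samples are similar when they share at least one label; the abstract does not state which convention the compared methods use, and the label data here is hypothetical.

```python
def pairwise_similarity(labels):
    """Return S with S[i][j] = 1 if samples i and j share at least one label, else 0."""
    n = len(labels)
    return [
        [1 if any(a and b for a, b in zip(labels[i], labels[j])) else 0
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical label matrix: 4 samples, 3 classes (one row per sample).
labels = [
    [1, 0, 0],  # sample 0: class 0
    [1, 1, 0],  # sample 1: classes 0 and 1
    [0, 0, 1],  # sample 2: class 2
    [0, 1, 0],  # sample 3: class 1
]

S = pairwise_similarity(labels)
for row in S:
    print(row)
print("entries stored:", len(S) * len(S[0]), "vs label entries:", len(labels) * len(labels[0]))
```

Here S[0][1] and S[1][3] are both 1, although the first pair shares class 0 and the second shares class 1: the matrix records that two samples are related but not through which category. Its size also grows quadratically with the number of samples, whereas the label matrix grows linearly.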


Author(s):  
J. Bernardino Lopes ◽  
Maria Clara Viegas ◽  
José Alexandre Pinto

It is acknowledged that to improve the value of the learning process and outcomes in areas such as science, technology, engineering and math, the teaching quality needs to be enhanced. Therefore, it is crucial to have access to real teaching practices. The multimodal narrative (MN) tool allows teaching practices to become public, sharable, and usable (open science perspective), preserving their holistic, complex, and ecological nature. This tool has characteristics and a structure that enable an in-depth study of teaching practices, in different contexts, with several purposes (e.g., teacher education, professional development, and research). This chapter presents MNs and the necessary steps involved in collecting multimodal data, structuring the narrative, and validating the document. MNs can be used by teachers and researchers, or other professionals, with multiple specific objectives, globally contributing to improving professional practices.


Cybersecurity ◽  
2019 ◽  
Vol 2 (1) ◽  
Author(s):  
Wenjie Li ◽  
Dongpeng Xu ◽  
Wei Wu ◽  
Xiaorui Gong ◽  
Xiaobo Xiang ◽  
...  

2019 ◽  
pp. 101203 ◽  
Author(s):  
Sanna Järvelä ◽  
Jonna Malmberg ◽  
Eetu Haataja ◽  
Marta Sobocinski ◽  
Paul A. Kirschner

Sensors ◽  
2021 ◽  
Vol 22 (1) ◽  
pp. 36
Author(s):  
Weiping Zheng ◽  
Zhenyao Mo ◽  
Gansen Zhao

Acoustic scene classification (ASC) tries to infer information about the environment from audio segments. Inter-class similarity is a significant issue in ASC, as acoustic scenes with different labels may sound quite similar. In this paper, the similarity relations amongst scenes are correlated with the classification error. A class hierarchy construction method based on classification error is then proposed and integrated into a multitask learning framework. The experiments show that the proposed multitask learning method improves the performance of ASC. On the TUT Acoustic Scene 2017 dataset, we obtain an ensemble fine-grained accuracy of 81.4%, which is better than the state-of-the-art. By using multitask learning, the basic Convolutional Neural Network (CNN) model can be improved by about 2.0 to 3.5 percent, depending on the spectrogram used. The coarse category accuracies (for two to six super-classes) range from 77.0% to 96.2% with single models. On the revised version of the LITIS Rouen dataset, we achieve an ensemble fine-grained accuracy of 83.9%. The multitask learning models obtain an improvement of 1.6% to 1.8% compared to their basic models. The coarse category accuracies range from 94.9% to 97.9% for two to six super-classes with single models.
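The idea of deriving a class hierarchy from classification error can be sketched as follows. The abstract does not describe the construction algorithm, so this is a stand-in: a greedy agglomerative merge that repeatedly joins the two groups of classes most often confused with each other. The confusion matrix and the function name are hypothetical.

```python
def build_super_classes(confusion, num_super):
    """Greedily merge the two class groups with the highest mutual confusion
    until only num_super groups remain. confusion[t][p] counts samples of
    true class t that were predicted as class p."""
    groups = [[c] for c in range(len(confusion))]

    def mutual_confusion(g1, g2):
        # Misclassifications in both directions between the two groups.
        return sum(confusion[a][b] + confusion[b][a] for a in g1 for b in g2)

    while len(groups) > num_super:
        i, j = max(
            ((i, j) for i in range(len(groups)) for j in range(i + 1, len(groups))),
            key=lambda pair: mutual_confusion(groups[pair[0]], groups[pair[1]]),
        )
        groups[i] = sorted(groups[i] + groups[j])
        del groups[j]
    return groups


# Hypothetical confusion matrix for four scene classes,
# e.g. 0 = bus, 1 = tram, 2 = park, 3 = forest path.
confusion = [
    [50, 8, 1, 1],
    [9, 48, 2, 1],
    [1, 1, 52, 6],
    [0, 2, 7, 51],
]

print(build_super_classes(confusion, 3))
print(build_super_classes(confusion, 2))
```

The resulting groups could serve as coarse labels for an auxiliary task in a multitask setup, with the original classes as the fine-grained task.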


Author(s):  
Wei Ji ◽  
Xi Li ◽  
Yueting Zhuang ◽  
Omar El Farouk Bourahla ◽  
Yixin Ji ◽  
...  

Clothing segmentation is a challenging vision problem typically implemented within a fine-grained semantic segmentation framework. Unlike conventional segmentation, clothing segmentation has some domain-specific properties such as texture richness, diverse appearance variations, non-rigid geometry deformations, and small sample learning. To deal with these points, we propose a semantic locality-aware segmentation model, which adaptively pairs an original clothing image with a semantically similar (e.g., in appearance or pose) auxiliary exemplar found by search. By considering the interactions between the clothing image and its exemplar, more intrinsic knowledge about the locality manifold structures of clothing images is discovered, making the learning process for the small-sample problem more stable and tractable. Furthermore, we present a CNN model based on deformable convolutions to extract non-rigid geometry-aware features for clothing images. Experimental results demonstrate the effectiveness of the proposed model against the state-of-the-art approaches.


2019 ◽  
Vol 28 (10) ◽  
pp. 4954-4969 ◽  
Author(s):  
Thanh-Toan Do ◽  
Khoa Le ◽  
Tuan Hoang ◽  
Huu Le ◽  
Tam V. Nguyen ◽  
...  

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Wenyan Pan ◽  
Meimin Wang ◽  
Jiaohua Qin ◽  
Zhili Zhou

As more and more image data are stored in encrypted form in the cloud computing environment, efficiently retrieving images in the encrypted domain has become an urgent problem. Recently, Convolutional Neural Network (CNN) features have achieved promising performance in the field of image retrieval, but the high dimension of CNN features causes low retrieval efficiency. Also, it is not suitable to apply them directly to image retrieval in the encrypted domain. To solve the above issues, this paper proposes an improved CNN-based hashing method for encrypted image retrieval. First, the image size is increased before the image is fed into the CNN to improve the representation ability. Then, a lightweight module is introduced to replace some of the modules in the CNN to reduce the parameters and computational cost. Finally, a hash layer is added to generate a compact binary hash code. In the retrieval process, the hash code is used for encrypted image retrieval, which greatly improves the retrieval efficiency. The experimental results show that the scheme allows an effective and efficient retrieval of encrypted images.
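The efficiency gain from compact binary codes can be seen in a minimal sketch of the retrieval step. It does not model the CNN or the encryption scheme; it assumes, for illustration only, that the hash layer emits activations in [0, 1] that are thresholded at 0.5, packed into an integer, and compared with XOR and a bit count. All names and values are hypothetical.

```python
def binarize(activations, threshold=0.5):
    """Threshold activations and pack the bits into one integer
    (the first activation becomes the most significant bit)."""
    code = 0
    for a in activations:
        code = (code << 1) | (1 if a >= threshold else 0)
    return code


def hamming(code_a, code_b):
    """Hamming distance between two packed codes: XOR, then count set bits."""
    return bin(code_a ^ code_b).count("1")


def retrieve(query_code, database, top_k=2):
    """Return the ids of the top_k entries closest to the query in Hamming distance."""
    ranked = sorted(
        database.items(),
        key=lambda item: (hamming(query_code, item[1]), item[0]),
    )
    return [image_id for image_id, _ in ranked[:top_k]]


# Hypothetical hash-layer outputs for three stored images and one query.
database = {
    "img_a": binarize([0.9, 0.1, 0.8, 0.2]),
    "img_b": binarize([0.2, 0.7, 0.1, 0.9]),
    "img_c": binarize([0.8, 0.3, 0.6, 0.7]),
}
query = binarize([0.95, 0.2, 0.7, 0.4])

print(retrieve(query, database))
```

Comparing packed codes takes one XOR and one bit count per database entry, independent of the dimensionality of the original CNN features.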

