Elegant Unsupervised Cross-Modal Hashing

2021 ◽  
Author(s):  
Yiu-ming Cheung ◽  
Zhikai Hu

Unsupervised cross-modal retrieval has received increasing attention recently because of the extreme difficulty of labeling explosively growing multimedia data. Its core challenge is how to measure the similarities between multi-modal data without label information. In previous works, various distance metrics have been used to measure the similarities and to predict whether samples belong to the same class. However, these predictions are not always right, and even a few wrong predictions can undermine the final retrieval performance. To address this problem, we categorize predictions as solid or soft based on their confidence, and we further categorize samples as solid or soft based on the predictions. We propose that these two kinds of predictions and samples should be treated differently. We also find that the absolute value of a similarity can represent not only the similarity but also the confidence of the prediction. Thus, we first design an elegant dot product fusion strategy to obtain effective inter-modal similarities. Subsequently, using these similarities, we propose a generalized and flexible weighted loss function in which larger weights are assigned to solid samples to increase retrieval performance, and smaller weights are assigned to soft samples to reduce the disturbance caused by wrong predictions. Although less information is used, empirical studies show that the proposed approach achieves state-of-the-art retrieval performance.
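The abstract does not give the exact fusion rule or loss, so the following is only a minimal sketch of the general idea, under stated assumptions: intra-modal similarities are cosine similarities, the "dot product fusion" is taken to be a normalised dot product between similarity rows of the two modalities, and the confidence weight is `abs(s) ** gamma`. The function names and the `gamma` parameter are hypothetical and are not taken from the paper.

```python
import math


def cosine_matrix(feats):
    """Pairwise cosine similarities between the rows of `feats`."""
    norms = [math.sqrt(sum(x * x for x in row)) or 1.0 for row in feats]
    return [[sum(a * b for a, b in zip(u, v)) / (nu * nv)
             for v, nv in zip(feats, norms)]
            for u, nu in zip(feats, norms)]


def fuse(sim_a, sim_b):
    """Assumed fusion rule: normalised dot product of similarity rows."""
    n = len(sim_a)
    return [[sum(sim_a[i][k] * sim_b[j][k] for k in range(n)) / n
             for j in range(n)]
            for i in range(n)]


def weighted_loss(sim, codes_a, codes_b, gamma=1.0):
    """Squared error between target and code similarity, weighted by |s|**gamma.

    Pairs with large |s| (treated here as "solid") get larger weights,
    pairs with |s| near zero (treated here as "soft") get smaller ones.
    """
    bits = len(codes_a[0])
    total = 0.0
    for i, row in enumerate(sim):
        for j, s in enumerate(row):
            code_sim = sum(x * y for x, y in zip(codes_a[i], codes_b[j])) / bits
            total += (abs(s) ** gamma) * (s - code_sim) ** 2
    return total / (len(sim) ** 2)


# Toy data: three image/text pairs with 2-d features and 2-bit codes in {-1, +1}.
image_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
text_feats = [[1.0, 0.1], [1.0, 0.0], [0.1, 1.0]]
fused = fuse(cosine_matrix(image_feats), cosine_matrix(text_feats))
codes = [[1, 1], [1, 1], [-1, -1]]
print(weighted_loss(fused, codes, codes, gamma=1.0))
```

Setting `gamma=0.0` makes every weight equal to 1, which gives an unweighted baseline for comparison with the confidence-weighted variant.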


Author(s):  
Yang Wang

With the development of web technology, multi-modal or multi-view data has surged as a major stream of big data, where each modality/view encodes an individual property of the data objects. Often, different modalities are complementary to each other. This fact has motivated a great deal of research on fusing multi-modal feature spaces to comprehensively characterize the data objects. Most existing state-of-the-art methods focus on how to fuse the energy or information from multi-modal spaces to deliver superior performance over their single-modal counterparts. Recently, deep neural networks have proven to be a powerful architecture for capturing the nonlinear distribution of high-dimensional multimedia data, and naturally of multi-modal data as well. Substantial empirical studies have demonstrated the advantages of deep multi-modal methods, which can essentially deepen the fusion across multi-modal deep feature spaces. In this article, we provide a substantial overview of the existing state of the art in multi-modal data analytics, from shallow to deep spaces. Throughout this survey, we further indicate that the critical components of this field are collaboration, adversarial competition, and fusion over multi-modal spaces. Finally, we share our viewpoints regarding some future directions in this field.


Author(s):  
Junjie Chen ◽  
William K. Cheung

Quantization has been widely adopted for large-scale multimedia retrieval due to its effectiveness in coding high-dimensional data. Deep quantization models have been demonstrated to achieve state-of-the-art retrieval accuracy. However, training deep models on a large-scale database is highly time-consuming because a large number of parameters are involved. Existing deep quantization methods often sample only a subset of the database for training, which may lead to unsatisfactory retrieval performance because a large portion of the label information is discarded. To alleviate this problem, we propose a novel model called Similarity Preserving Deep Asymmetric Quantization (SPDAQ), which can directly and efficiently learn the compact binary codes and quantization codebooks for all the items in the database. To do so, SPDAQ makes use of an image subset as well as the label information of all the database items, so that the image subset items and the database items are mapped to two different but correlated distributions in which the label similarity can be well preserved. An efficient optimization algorithm is proposed for the learning. Extensive experiments conducted on four widely used benchmark datasets demonstrate the superiority of our proposed SPDAQ model.


Author(s):  
Xuanwu Liu ◽  
Guoxian Yu ◽  
Carlotta Domeniconi ◽  
Jun Wang ◽  
Yazhou Ren ◽  
...  

Cross-modal hashing has been receiving increasing interest for its low storage cost and fast query speed in multi-modal data retrieval. However, most existing hashing methods are based on hand-crafted or raw-level features of objects, which may not be optimally compatible with the coding process. Besides, these hashing methods are mainly designed to handle simple pairwise similarity. The complex multilevel ranking semantic structure of instances associated with multiple labels has not yet been well explored. In this paper, we propose a ranking-based deep cross-modal hashing approach (RDCMH). RDCMH first uses the feature and label information of the data to derive a semi-supervised semantic ranking list. Next, to expand the semantic representation power of hand-crafted features, RDCMH integrates the semantic ranking information into deep cross-modal hashing and jointly optimizes the compatible parameters of the deep feature representations and of the hashing functions. Experiments on real multi-modal datasets show that RDCMH outperforms other competitive baselines and achieves state-of-the-art performance in cross-modal retrieval applications.


Algorithms ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 39
Author(s):  
Carlos Lassance ◽  
Vincent Gripon ◽  
Antonio Ortega

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are most of the time unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the following three problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved by enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods to solve the considered problems.
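As a rough illustration of what a similarity graph over a batch of intermediate representations might look like, the sketch below builds a symmetric k-nearest-neighbour graph with cosine-similarity edge weights and compares two such graphs with a squared difference. The choice of cosine similarity, the k-nearest-neighbour sparsification, and the names `similarity_graph` and `geometry_gap` are assumptions made for illustration and are not taken from the paper.

```python
import math


def similarity_graph(batch, k=2):
    """Symmetric k-nearest-neighbour graph with cosine-similarity weights.

    `batch` holds one latent vector per input; the result is an n x n
    adjacency matrix with a zero diagonal.
    """
    n = len(batch)
    norms = [math.sqrt(sum(x * x for x in v)) or 1.0 for v in batch]
    sim = [[0.0 if i == j else
            sum(a * b for a, b in zip(batch[i], batch[j])) / (norms[i] * norms[j])
            for j in range(n)]
           for i in range(n)]
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nearest = sorted((j for j in range(n) if j != i),
                         key=lambda j: sim[i][j], reverse=True)[:k]
        for j in nearest:
            # Keep the edge in both directions so the graph stays symmetric.
            adj[i][j] = adj[j][i] = sim[i][j]
    return adj


def geometry_gap(adj_a, adj_b):
    """Sum of squared differences between two adjacency matrices."""
    return sum((x - y) ** 2
               for row_a, row_b in zip(adj_a, adj_b)
               for x, y in zip(row_a, row_b))


# A student whose latent vectors are a rescaled copy of the teacher's has
# the same cosine geometry, so the gap is (numerically) zero.
teacher = [[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 3.0]]
student = [[5.0 * x for x in v] for v in teacher]
print(geometry_gap(similarity_graph(teacher, k=1), similarity_graph(student, k=1)))
```

A quantity like `geometry_gap` could serve as a penalty term when one network is trained to mimic the geometry of another, which is the spirit of problem (i) above.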


Open Theology ◽  
2016 ◽  
Vol 2 (1) ◽  
Author(s):  
Thomas G. Plante

Since the publication of Bergin’s classic 1980 paper “Psychotherapy and Religious Values” in the Journal of Clinical and Consulting Psychology, an enormous amount of quality research has been conducted on the integration of religious and spiritual values and perspectives into the psychotherapy endeavor. Numerous empirical studies, chapters, books, blogs, and specialty organizations have emerged in the past 35 years that have helped researchers and clinicians alike come to appreciate the value of religion and spirituality in the psychotherapeutic process. While so much has been accomplished in this area of integration, so much more needs to occur in order for the psychotherapeutic world to benefit from the wisdom of the great religious and spiritual traditions and values. While state-of-the-art quality research has demonstrated and continues to demonstrate how religious and spiritual practices and values can be used effectively to enhance the benefits of behavioral and psychological interventions, too often the field either gets overly focused on particular and perhaps trendy areas of interest (e.g., mindfulness) or fails to appreciate and incorporate the research evidence supporting (or not supporting) the use of certain religiously or spiritually informed assessments and interventions. The purpose of this article is to reflect on how the field integrating religion, spirituality and psychotherapy has evolved up to the present and where it still needs to go in the future. In doing so I hope to reflect on the call for integration that Bergin highlights in his classic 1980 paper.


Author(s):  
Antonio L. Alfeo ◽  
Mario G. C. A. Cimino ◽  
Gigliola Vaglini

In modern manufacturing, each technical assistance operation is digitally tracked. This results in a huge amount of textual data that can be exploited as a knowledge base to improve these operations. For instance, an ongoing problem can be addressed by retrieving potential solutions from those used to cope with similar problems during past operations. To be effective, most approaches to semantic textual similarity need to be supported by a structured semantic context (e.g. an industry-specific ontology), resulting in high development and management costs. We overcome this limitation with a textual similarity approach featuring three functional modules. The data preparation module provides punctuation and stop-word removal, and word lemmatization. The pre-processed sentences pass through the sentence embedding module, which is based on Sentence-BERT (Bidirectional Encoder Representations from Transformers) and transforms the sentences into fixed-length vectors. Their cosine similarity is processed by the scoring module to match the expected similarity between the two original sentences. Finally, this similarity measure is employed to retrieve the most suitable recorded solutions for the ongoing problem. The effectiveness of the proposed approach is tested (i) against a state-of-the-art competitor and two well-known textual similarity approaches, and (ii) with two case studies, i.e. private company technical assistance reports and a benchmark dataset for semantic textual similarity. With respect to the state of the art, the proposed approach results in comparable retrieval performance and significantly lower management cost: 30-min questionnaires are sufficient to obtain the semantic context knowledge to be injected into our textual search engine.
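The prepare, embed, score, and retrieve flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: a plain bag-of-words count vector stands in for the Sentence-BERT embedding, lemmatization and the learned scoring module are omitted, and the stop-word list, function names, and sample reports are all hypothetical.

```python
import math
import string
from collections import Counter

# Tiny hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "of", "to", "and", "in", "on", "not", "does"}


def prepare(sentence):
    """Lower-case, strip punctuation, and drop stop words (no lemmatization)."""
    table = str.maketrans("", "", string.punctuation)
    return [w for w in sentence.lower().translate(table).split()
            if w not in STOP_WORDS]


def embed(tokens):
    """Stand-in embedding: bag-of-words counts instead of Sentence-BERT."""
    return Counter(tokens)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(query, reports, top_n=1):
    """Rank recorded (problem, solution) pairs by similarity to the query."""
    q = embed(prepare(query))
    scored = [(cosine(q, embed(prepare(problem))), solution)
              for problem, solution in reports]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


reports = [
    ("Pump leaks oil near the valve.", "Replace the valve gasket"),
    ("Display does not turn on", "Check the power supply"),
]
print(retrieve("Oil is leaking from the pump valve", reports))
```

Replacing `embed` with a sentence-embedding model would leave the rest of the flow unchanged, since `cosine` and `retrieve` only need a vector per sentence (with `cosine` adapted to dense vectors).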


2008 ◽  
Vol 31 (2) ◽  
pp. 142-143 ◽  
Author(s):  
Brendan McGonigle ◽  
Margaret Chalmers

The “rational bubble” stance espoused in the target article confounds cultural symbolic achievements with individual cognitive competences. With no explicit role for learning, the core rationale for claiming a major functional discontinuity between humans and other species rests on a hybrid formal model LISA (Learning and Inference with Schemas and Analogies) now overtaken by new models of cognitive growth and new empirical studies within an embodied systems stance.


2016 ◽  
Vol 33 (2) ◽  
pp. 187-197 ◽  
Author(s):  
Feliciano Henriques VEIGA ◽  
Viorel ROBU ◽  
Joseph CONBOY ◽  
Adriana ORTIZ ◽  
Carolina CARVALHO ◽  
...  

"Students' engagement in school" is regarded in the literature as a current and valued construct despite the lack of empirical studies on its relationship with specific family variables. The present research aimed to survey studies on the correlation between students' engagement in school and family contexts, specifically in terms of the following variables: perceived parental support, socioeconomic and sociocultural levels, perceived rights, and parental educational styles. In order to describe the state of the art of "students' engagement in school" and "family variables", a narrative review was conducted. The studies reviewed highlight the role of the family as a significant context for students' engagement in school. However, further research is needed to deepen the knowledge of this topic, considering potential mediator variables, either personal or school-related. The review also found a need for psychosocial intervention aimed at supporting students from adverse family contexts who exhibit a low level of engagement associated with poor academic achievement and a higher probability of dropping out.

