Heterogeneous data alignment for cross-media computing

Author(s):  
Shikui Wei ◽  
Yunchao Wei ◽  
Lei Zhang ◽  
Zhenfeng Zhu ◽  
Yao Zhao
2021 ◽  
Author(s):  
KMA Solaiman ◽  
Tao Sun ◽  
Alina Nesen ◽  
Bharat Bhargava ◽  
Michael Stonebraker

We present a system for integrating multiple sources of data to find missing persons. The system can assist authorities in finding children during AMBER Alerts, mentally challenged persons who have wandered off, or persons of interest in an investigation. Authorities search for the person in question by reaching out to acquaintances, checking video feeds, or looking into previous histories relevant to the investigation. In the absence of any leads, authorities lean on public help from sources such as tweets or tip lines. A missing person investigation therefore requires combining information from multiple modalities and heterogeneous data sources.

Existing cross-modal fusion models use separate information models for each data modality and lack the compatibility to utilize pre-existing object properties in an application domain. We develop a framework for multimodal information retrieval, called Find-Them, which extracts features from the different modalities and maps them into a standard schema for context-based data fusion. Find-Them can integrate application domains with previously derived object properties and can deliver data relevant to the mission objective based on the context and needs of the user. Measurements on a novel open-world cross-media dataset show the efficacy of our model. The objective of this work is to assist authorities in missing person investigations.
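To illustrate the idea of mapping heterogeneous modalities into one standard schema for context-based fusion, here is a minimal sketch. The schema fields (`modality`, `entity`, `location`, `timestamp`, `embedding`) and the extractor functions are illustrative assumptions, not the actual Find-Them implementation.

```python
# Minimal sketch: map heterogeneous sources into one common schema,
# then fuse by query context. All field names and extractors are
# illustrative assumptions, not the Find-Them implementation.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Observation:
    modality: str                # "text", "image", or "video"
    source: str                  # e.g. "tweet", "cctv", "tipline"
    entity: Optional[str]        # person name/description, if extracted
    location: Optional[str]      # normalized place string
    timestamp: Optional[str]     # ISO-8601 time of the observation
    embedding: List[float] = field(default_factory=list)  # feature vector

def from_tweet(tweet: dict) -> Observation:
    # A real system would run NER and a text encoder here.
    return Observation(modality="text", source="tweet",
                       entity=tweet.get("person"), location=tweet.get("geo"),
                       timestamp=tweet.get("created_at"),
                       embedding=tweet.get("features", []))

def from_video_frame(frame: dict) -> Observation:
    # A real system would run detection/re-identification here.
    return Observation(modality="video", source="cctv",
                       entity=frame.get("track_id"),
                       location=frame.get("camera_location"),
                       timestamp=frame.get("time"),
                       embedding=frame.get("features", []))

def fuse(observations, location=None):
    """Context-based filtering: keep observations matching the query context."""
    return [o for o in observations if location is None or o.location == location]

obs = [from_tweet({"person": "J. Doe", "geo": "5th & Main",
                   "created_at": "2021-06-01T14:00:00Z"}),
       from_video_frame({"track_id": "cam7-t42", "camera_location": "5th & Main",
                         "time": "2021-06-01T14:05:00Z"})]
print(fuse(obs, location="5th & Main"))
```

Once every source lands in the same schema, fusion and ranking can operate on uniform records instead of per-modality pipelines.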


Author(s):  
Donglin Zhang ◽  
Xiao-Jun Wu ◽  
Jun Yu

Hashing methods have sparked a revolution in large-scale cross-media search due to their effectiveness and efficiency. Most existing approaches learn a unified hash representation in a common Hamming space to represent all multimodal data. However, unified hash codes may not characterize the cross-modal data discriminatively, because the modalities can vary greatly in dimensionality, physical properties, and statistical structure. In addition, most existing supervised cross-modal algorithms preserve the similarity relationship by constructing an n × n pairwise similarity matrix, which requires a large amount of computation and loses the category information. To mitigate these issues, a novel cross-media hashing approach, dubbed label flexible matrix factorization hashing (LFMH), is proposed in this article. Specifically, LFMH jointly learns modality-specific latent subspaces with similar semantics by flexible matrix factorization, and it guides the hash learning by utilizing the semantic labels directly instead of the large n × n pairwise similarity matrix. LFMH transforms the heterogeneous data into modality-specific latent semantic representations; the hash codes are then obtained by quantizing these representations, and the learned hash codes are consistent with the supervised labels of the multimodal data. The resulting binary codes characterize the samples of each modality flexibly, so the derived hash codes have more discriminative power for single-modal and cross-modal retrieval tasks. Extensive experiments on eight databases demonstrate that our model outperforms several competitive approaches.
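For intuition, here is a minimal sketch of label-guided matrix factorization hashing in the spirit of LFMH. It is a generic alternating least-squares variant, not the paper's exact objective or update rules; `alpha`, `lam`, and the initialization are illustrative.

```python
# Sketch of label-guided matrix factorization hashing: factorize features
# as X ~ U V, regress labels as Y ~ W V, binarize V into hash codes.
# Generic ALS variant, not LFMH's exact formulation.
import numpy as np

def mfh_train(X, Y, k=32, alpha=1.0, lam=1e-2, iters=20):
    """X: d x n features of one modality; Y: c x n 0/1 label matrix.
    Returns k-bit codes sign(V), one column per sample."""
    d, n = X.shape
    rng = np.random.default_rng(0)
    V = rng.standard_normal((k, n))
    I_k = np.eye(k)
    for _ in range(iters):
        # U-step: ridge regression of features X on latent codes V
        U = X @ V.T @ np.linalg.inv(V @ V.T + lam * I_k)
        # W-step: ridge regression of labels Y on V (the label guidance,
        # replacing any n x n pairwise similarity matrix)
        W = Y @ V.T @ np.linalg.inv(V @ V.T + lam * I_k)
        # V-step: (U'U + alpha W'W + lam I) V = U'X + alpha W'Y
        A = U.T @ U + alpha * (W.T @ W) + lam * I_k
        V = np.linalg.solve(A, U.T @ X + alpha * (W.T @ Y))
    return np.sign(V)
```

Running `mfh_train` separately on each modality's features with the shared label matrix yields modality-specific codes that are aligned through the labels, avoiding the O(n²) similarity matrix entirely.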


Author(s):  
Bin Zhang ◽  
Huaxiang Zhang ◽  
Jiande Sun ◽  
Zhenhua Wang ◽  
...  

Cross-media retrieval has attracted a great deal of research interest, and a significant number of works focus on mapping the heterogeneous data into a common subspace, using a pair of projection matrices (one per modality) before performing similarity comparison. In contrast, we reconstruct one modality's data (e.g., images) into the other (e.g., texts) using a sparse neural network pre-trained by Restricted Boltzmann Machines (MRCR-RSNN), so that one modality's data can be projected directly into the space of the other. The model takes low-level features of one modality as input and outputs the corresponding representation in the other, and cross-media retrieval is performed based on the similarities of these representations. Our model does not require any manual annotation, so it can be applied more widely. It is simple but effective. We evaluate the performance of our method on several benchmark datasets, and experimental results demonstrate its effectiveness in terms of Mean Average Precision (MAP) and Precision-Recall (PR).
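As a rough illustration of the reconstruction idea, the sketch below pre-trains a hidden layer on image features with an RBM and then learns a linear map into the text feature space. The paper's sparse RBM-pretrained network is approximated here by scikit-learn's BernoulliRBM plus ridge regression; the features and hyperparameters are stand-ins, not the authors' configuration.

```python
# Sketch of cross-modal reconstruction in the spirit of MRCR-RSNN:
# RBM pre-training on image features, then a linear map to text space.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X_img = rng.random((500, 128))  # stand-in image features, scaled to [0, 1]
X_txt = rng.random((500, 64))   # stand-in paired text features

# Unsupervised pre-training on image features (BernoulliRBM expects [0, 1]).
rbm = BernoulliRBM(n_components=100, learning_rate=0.05,
                   n_iter=20, random_state=0)
H = rbm.fit_transform(X_img)

# Learn a linear map from the RBM hidden layer into the text feature space.
reg = Ridge(alpha=1.0).fit(H, X_txt)

def retrieve(query_img, gallery_txt, top_k=5):
    """Project an image into text space; rank texts by cosine similarity."""
    q = reg.predict(rbm.transform(query_img.reshape(1, -1))).ravel()
    sims = gallery_txt @ q / (np.linalg.norm(gallery_txt, axis=1)
                              * np.linalg.norm(q) + 1e-12)
    return np.argsort(-sims)[:top_k]

print(retrieve(X_img[0], X_txt))
```

Because similarity is computed in a single modality's space, no paired projection matrices or annotated correspondences beyond the training pairs are needed at query time.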


2018 ◽  
Vol 7 (2.7) ◽  
pp. 257
Author(s):  
Monelli Ayyavaraiah ◽  
Dr Bondu Venkateswarlu

Heterogeneous data, consisting mostly of audio, video, text, and images, are growing rapidly on the internet, and searching for the required data in such large collections is difficult and time-consuming. Single-media retrieval finds the needed data in a large dataset, but it has the drawback of retrieving results in a single medium only: if the query is given as text, the results are returned as text. Users instead demand cross-media retrieval for their queries, which is much more useful because it helps them get more information related to a query. However, finding similarities between heterogeneous data is very complex. Much research has been done on cross-media retrieval with different methods, producing different results. The aim of this work is to analyze the different cross-media retrieval techniques based on joint graph regularization (JGR). Most of this research uses MAP, precision, and recall as evaluation metrics.
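Since MAP is the metric most of the surveyed work reports, a minimal sketch of how it is computed over ranked retrieval lists may be useful; the queries and rankings below are toy examples.

```python
# Minimal sketch of Mean Average Precision (MAP) for ranked retrieval.
import numpy as np

def average_precision(relevant, ranked):
    """relevant: set of relevant item ids; ranked: retrieved ids in rank order."""
    hits, precisions = 0, []
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / i)  # precision at each relevant hit
    return float(np.mean(precisions)) if precisions else 0.0

def mean_average_precision(queries):
    """queries: list of (relevant_set, ranked_list) pairs, one per query."""
    return float(np.mean([average_precision(r, k) for r, k in queries]))

# Example: two toy queries.
print(mean_average_precision([
    ({1, 3}, [3, 2, 1, 4]),   # AP = (1/1 + 2/3) / 2
    ({2},    [4, 2, 1, 3]),   # AP = 1/2
]))
```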


Neuróptica ◽  
2020 ◽  
pp. 249-252
Author(s):  
Julia Rigual Mur

Review of the book: HERNÁNDEZ PÉREZ, M., Manga, anime y videojuegos. Narrativa cross-media japonesa, Zaragoza, Prensas Universitarias de Zaragoza, 2017.

