text query Latest Research Papers

While the computer vision problem of searching for activities in videos is usually addressed with discriminative models, their decisions tend to be opaque and difficult for people to understand. We present a case study of a novel machine learning approach for generative searching and ranking of motion capture activities with visual explanation. Instead of directly ranking videos in the database given a text query, our approach uses a variant of Generative Adversarial Networks (GANs) to generate exemplars based on the query and uses them to search for the activity of interest in a large database. Our model achieves results comparable to those of its discriminative counterpart while dynamically generating visual explanations. In addition to our searching and ranking method, we present an explanation interface that enables the user to successfully explore the model’s explanations and its confidence by revealing query-based, model-generated motion capture clips that contributed to the model’s decision. Finally, we conducted a user study with 44 participants to show that, with our model and interface, participants benefit from a deeper understanding of the model’s conceptualization of the search query. We found that the XAI system yielded levels of efficiency, accuracy, and user-machine synchronization comparable to those of its black-box counterpart, provided that the user exhibited a high level of trust in AI explanations.
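The generate-then-search idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the generator is replaced by a hypothetical lookup table (`generate_exemplars`), the two-dimensional feature vectors are made up, and Euclidean distance is an assumed similarity measure. Each database clip is scored by its distance to the closest generated exemplar, and the index of that exemplar is returned so it could be shown to the user as the explanation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def generate_exemplars(query):
    """Hypothetical stand-in for a trained conditional generator.

    A real system would sample motion clips from a generative model;
    here a fixed table of made-up feature vectors is returned.
    """
    canned = {
        "walk": [[1.0, 0.0], [0.9, 0.1]],
        "jump": [[0.0, 1.0], [0.1, 0.9]],
    }
    return canned[query]


def search(query, database):
    """Rank clips by distance to their nearest generated exemplar.

    Returns (clip_id, exemplar_index) pairs, best match first. The
    exemplar index identifies which generated clip explains the match.
    """
    exemplars = generate_exemplars(query)
    results = []
    for clip_id, features in database.items():
        distances = [euclidean(features, ex) for ex in exemplars]
        best = min(range(len(exemplars)), key=lambda i: distances[i])
        results.append((distances[best], clip_id, best))
    results.sort()  # smallest distance first
    return [(clip_id, best) for _, clip_id, best in results]


# Toy database of made-up clip features (assumption for illustration).
database = {
    "clip_a": [0.95, 0.05],
    "clip_b": [0.0, 1.1],
    "clip_c": [0.5, 0.5],
}
results = search("jump", database)
print(results)
```

Returning the exemplar index alongside each clip is the design choice that makes the ranking inspectable: the user can view the generated clip that drove each match.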

Traffic Video Event Retrieval via Text Query using Vehicle Appearance and Motion Attributes

10.1109/cvprw53098.2021.00470 ◽

2021 ◽

Author(s):

Tien-Phat Nguyen ◽

Ba-Thinh Tran-Le ◽

Xuan-Dang Thai ◽

Tam V. Nguyen ◽

Minh N. Do ◽

...

Keyword(s):

Video Event ◽

Traffic Video ◽

Text Query

VSRNet: End-to-End Video Segment Retrieval with Text Query

Pattern Recognition ◽

10.1016/j.patcog.2021.108027 ◽

2021 ◽

pp. 108027

Author(s):

Xiao Sun ◽

Xiang Long ◽

Dongliang He ◽

Shilei Wen ◽

Zhouhui Lian

Keyword(s):

Video Segment ◽

End To End ◽

Text Query

Large-scale image search with text for information retrieval

Journal of Innovations in Engineering Education ◽

10.3126/jiee.v4i1.35390 ◽

2021 ◽

Vol 4 (1) ◽

pp. 87-89

Author(s):

Janardan Bhatta

Keyword(s):

Information Retrieval ◽

Language Processing ◽

Large Scale ◽

Image Feature ◽

Image Search ◽

Search Results ◽

Retrieval Systems ◽

Information Retrieval Systems ◽

Text Features ◽

Text Query

Searching images in a large database is a major requirement in information retrieval systems, and returning image search results for a text query is a challenging task. In this paper, we leverage computer vision and natural language processing on distributed machines to lower the latency of search results. Image pixel features are computed with a contrastive loss function for image search. Text features are computed with an attention mechanism for text search. These features are aligned while preserving the information in each text and image feature. Previously, the approach had been tested only on multilingual models. We tested it on an image-text dataset, where it enabled search with any form of text or images with high accuracy.
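Once text and image features are aligned in a common space, retrieval reduces to a nearest-neighbour ranking. The sketch below illustrates that step only, under stated assumptions: the three-dimensional embeddings are hand-made toy values rather than outputs of any trained model, cosine similarity is an assumed scoring function, and the names `rank_images` and `image_embeddings` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_images(text_embedding, image_embeddings):
    """Return image ids sorted by decreasing similarity to the text query."""
    scored = [
        (cosine_similarity(text_embedding, vec), image_id)
        for image_id, vec in image_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored]


# Toy embeddings in an assumed shared text-image space.
image_embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}
query_embedding = [0.8, 0.2, 0.1]  # stands in for an encoded text query

ranking = rank_images(query_embedding, image_embeddings)
print(ranking)
```

Because the same scoring function works for any pair of vectors in the shared space, the identical routine could rank texts against an image query.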

Multimodal semantic analysis with regularized semantic autoencoder

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189759 ◽

2021 ◽

pp. 1-9

Author(s):

Shaily Malik ◽

Poonam Bansal

Keyword(s):

Semantic Analysis ◽

Feature Space ◽

Machine Learning Algorithms ◽

Real World Data ◽

Common Space ◽

Locality Preservation ◽

Space Transformation ◽

Image Query ◽

Text Query ◽

Low Dimensional

Real-world data is multimodal, and to classify it with machine learning algorithms, features of both modalities must be transformed into a common latent space. When features are transformed into a high-dimensional common space, they lose their locality information and become susceptible to noise. This research article addresses this issue for the semantic autoencoder and presents a novel algorithm that maps distinct features into a common hidden space with locality preservation. We call it the discriminative regularized semantic autoencoder (DRSAE). It maintains the low-dimensional features in the manifold to manage the inter- and intra-modality of the data. The data is multi-label, and the labels are transformed into an aware feature space using conditional principal label space transformation (CPLST). With the proposed two-fold algorithm, we achieve a significant improvement in text retrieval from an image query and image retrieval from a text query.

Text query based summarized event searching interface system using deep learning over cloud

Multimedia Tools and Applications ◽

10.1007/s11042-020-10157-4 ◽

2021 ◽

Author(s):

Krishan Kumar

Keyword(s):

Deep Learning ◽

Interface System ◽

Text Query

Compositional Learning of Image-Text Query for Image Retrieval

2021 IEEE Winter Conference on Applications of Computer Vision (WACV) ◽

10.1109/wacv48630.2021.00118 ◽

2021 ◽

Author(s):

Muhammad Umer Anwaar ◽

Egor Labintcev ◽

Martin Kleinsteuber

Keyword(s):

Image Retrieval ◽

Text Query ◽

Compositional Learning

Text2Brain: Synthesis of Brain Activation Maps from Free-Form Text Query

10.1007/978-3-030-87234-2_57 ◽

2021 ◽

pp. 605-614

Author(s):

Gia H. Ngo ◽

Minh Nguyen ◽

Nancy F. Chen ◽

Mert R. Sabuncu

Keyword(s):

Brain Activation ◽

Free Form ◽

Text Query

Text-Based Image Retrieval Using Deep Learning

Encyclopedia of Information Science and Technology, Fifth Edition - Advances in Information Quality and Management ◽

10.4018/978-1-7998-3479-3.ch007 ◽

2021 ◽

pp. 87-97

Author(s):

Udit Singhania ◽

B. K. Tripathy

Keyword(s):

Information Retrieval ◽

Deep Learning ◽

Language Processing ◽

Neural Nets ◽

Image Search ◽

Restricted Boltzmann Machines ◽

Previous Version ◽

Text Query ◽

Learning Architectures ◽

Over Time

This chapter is an updated version of the earlier chapter “An Insight to Deep Learning Architectures” in the encyclopedia. It focuses on information retrieval after 2014, as earlier work was discussed in the previous version. Deep learning plays an important role today, and this chapter draws on deep learning architectures that have evolved over time and proved efficient for image search and retrieval. Various techniques for applying natural language processing to text queries are described. Recurrent neural nets, deep restricted Boltzmann machines, and generative adversarial nets are discussed, with attention to how they have revolutionized the field of information retrieval.

text query
Recently Published Documents

Cross-modal Dynamic Networks for Video Moment Retrieval with Text Query

Learn, Generate, Rank, Explain: A Case Study of Visual Explanation by Generative Machine Learning

Traffic Video Event Retrieval via Text Query using Vehicle Appearance and Motion Attributes

VSRNet: End-to-End Video Segment Retrieval with Text Query

Large-scale image search with text for information retrieval

Multimodal semantic analysis with regularized semantic autoencoder

Text query based summarized event searching interface system using deep learning over cloud

Compositional Learning of Image-Text Query for Image Retrieval

Text2Brain: Synthesis of Brain Activation Maps from Free-Form Text Query

Text-Based Image Retrieval Using Deep Learning
