Person search by natural language description in Vietnamese using pre-trained visual-textual attributes alignment model

We presented a learning model that generated natural language description of images. The model utilized the connections between natural language and visual data by produced text line based contents from a given image. Our Hybrid Recurrent Neural Network model is based on the intricacies of Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Bi-directional Recurrent Neural Network (BRNN) models. We conducted experiments on three benchmark datasets, e.g., Flickr8K, Flickr30K, and MS COCO. Our hybrid model utilized LSTM model to encode text line or sentences independent of the object location and BRNN for word representation, this reduced the computational complexities without compromising the accuracy of the descriptor. The model produced better accuracy in retrieving natural language based description on the dataset.

Download Full-text

On the Selection of Verbs for Natural Language Description of Traffic Scenes

Informatik-Fachberichte - GWAI-82 ◽

10.1007/978-3-642-68826-3_2 ◽

1982 ◽

pp. 22-31 ◽

Cited By ~ 1

Author(s):

Hans-Joachim Novak

Keyword(s):

Natural Language ◽

Selection Of ◽

Language Description

Download Full-text

Person Search by Queried Description in Vietnamese Natural Language

Communications in Computer and Information Science - Intelligent Information and Database Systems ◽

10.1007/978-981-15-3380-8_41 ◽

2020 ◽

pp. 469-480

Author(s):

Thi Thanh Thuy Pham ◽

Dinh-Duc Nguyen ◽

Ba Hoang Phuc Ta ◽

Thuy-Binh Nguyen ◽

Thi-Ngoc-Diep Do ◽

...

Keyword(s):

Natural Language ◽

Person Search

Download Full-text

Scene understanding using natural language description based on 3D semantic graph map

Intelligent Service Robotics ◽

10.1007/s11370-018-0257-x ◽

2018 ◽

Vol 11 (4) ◽

pp. 347-354 ◽

Cited By ~ 3

Author(s):

Jiyoun Moon ◽

Beomhee Lee

Keyword(s):

Natural Language ◽

Scene Understanding ◽

Semantic Graph ◽

Graph Map ◽

Language Description

Download Full-text

A Concept for Generating Business Process Models from Natural Language Description

Knowledge Science, Engineering and Management - Lecture Notes in Computer Science ◽

10.1007/978-3-319-99365-2_8 ◽

2018 ◽

pp. 91-103 ◽

Cited By ~ 1

Author(s):

Krzysztof Honkisz ◽

Krzysztof Kluza ◽

Piotr Wiśniewski

Keyword(s):

Natural Language ◽

Business Process ◽

Process Models ◽

Business Process Models ◽

Language Description

Download Full-text

Hybrid Attention Network for Language-Based Person Search

Sensors ◽

10.3390/s20185279 ◽

2020 ◽

Vol 20 (18) ◽

pp. 5279

Author(s):

Yang Li ◽

Huahu Xu ◽

Junsheng Xiao

Keyword(s):

Image Features ◽

Attention Mechanism ◽

Feature Representation ◽

Semantic Features ◽

Retrieval Task ◽

Attention Network ◽

Fine Grained ◽

Person Search ◽

High Level ◽

Language Description

Language-based person search retrieves images of a target person using natural language description and is a challenging fine-grained cross-modal retrieval task. A novel hybrid attention network is proposed for the task. The network includes the following three aspects: First, a cubic attention mechanism for person image, which combines cross-layer spatial attention and channel attention. It can fully excavate both important midlevel details and key high-level semantics to obtain better discriminative fine-grained feature representation of a person image. Second, a text attention network for language description, which is based on bidirectional LSTM (BiLSTM) and self-attention mechanism. It can better learn the bidirectional semantic dependency and capture the key words of sentences, so as to extract the context information and key semantic features of the language description more effectively and accurately. Third, a cross-modal attention mechanism and a joint loss function for cross-modal learning, which can pay more attention to the relevant parts between text and image features. It can better exploit both the cross-modal and intra-modal correlation and can better solve the problem of cross-modal heterogeneity. Extensive experiments have been conducted on the CUHK-PEDES dataset. Our approach obtains higher performance than state-of-the-art approaches, demonstrating the advantage of the approach we propose.

Download Full-text

Nerva: Automated application synthesis for humanoid robot from user natural language description

Communications in Information and Systems ◽

10.4310/cis.2017.v17.n1.a3 ◽

2017 ◽

Vol 17 (1) ◽

pp. 45-64

Author(s):

Hao Li ◽

Yu-Ping Wang ◽

Tai-Jiang Mu

Keyword(s):

Natural Language ◽

Humanoid Robot ◽

Language Description

Download Full-text