Scene understanding using natural language description based on 3D semantic graph map

2018 ◽  
Vol 11 (4) ◽  
pp. 347-354 ◽  
Author(s):  
Jiyoun Moon ◽  
Beomhee Lee


Author(s):  
Md. Asifuzzaman Jishan ◽  
Khan Raqib Mahmud ◽  
Abul Kalam Al Azad

We present a learning model that generates natural language descriptions of images. The model exploits the connections between natural language and visual data by producing text-line-based content from a given image. Our Hybrid Recurrent Neural Network model builds on the Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Bi-directional Recurrent Neural Network (BRNN) architectures. We conducted experiments on three benchmark datasets: Flickr8K, Flickr30K, and MS COCO. Our hybrid model uses the LSTM to encode text lines or sentences independently of object location and the BRNN for word representation; this reduces the computational complexity without compromising the accuracy of the descriptor. The model achieves better accuracy in retrieving natural-language-based descriptions on these datasets.
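For intuition about how such a hybrid can be assembled, the sketch below (in PyTorch) projects a pre-extracted CNN image feature into the initial LSTM state and runs a bidirectional RNN over the word states. The layer sizes, the fusion strategy, and the PyTorch framing are assumptions for illustration, not the authors' exact architecture.

# Hypothetical sketch of a CNN + LSTM + bidirectional-RNN captioner.
# Dimensions and fusion strategy are assumed, not the authors' design.
import torch
import torch.nn as nn

class HybridCaptioner(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512, feat_dim=2048):
        super().__init__()
        # Project pre-extracted CNN image features (e.g. pooled ResNet output).
        self.img_proj = nn.Linear(feat_dim, hidden_dim)
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # LSTM encodes the sentence independently of object location.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Bidirectional RNN refines the per-word representations.
        self.brnn = nn.RNN(hidden_dim, hidden_dim // 2, batch_first=True,
                           bidirectional=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, img_feats, captions):
        # img_feats: (B, feat_dim); captions: (B, T) token ids
        h0 = torch.tanh(self.img_proj(img_feats)).unsqueeze(0)  # (1, B, H)
        c0 = torch.zeros_like(h0)
        x = self.embed(captions)                                 # (B, T, E)
        seq, _ = self.lstm(x, (h0, c0))                          # (B, T, H)
        seq, _ = self.brnn(seq)                                  # (B, T, H)
        return self.out(seq)                                     # (B, T, V) logits

model = HybridCaptioner(vocab_size=10000)
logits = model(torch.randn(2, 2048), torch.randint(0, 10000, (2, 12)))
print(logits.shape)  # torch.Size([2, 12, 10000])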


2019 ◽  
Vol 9 (18) ◽  
pp. 3789 ◽  
Author(s):  
Jiyoun Moon ◽  
Beom-Hee Lee

Natural-language-based scene understanding can enable heterogeneous robots to cooperate efficiently in large, unstructured environments. However, studies on symbolic planning rarely consider the problem of acquiring semantic knowledge about the surrounding environment, even though recent deep learning methods show outstanding performance at semantic scene understanding from natural language. In this paper, a cooperation framework that connects deep learning techniques with a symbolic planner for heterogeneous robots is proposed. The framework is largely composed of a scene understanding engine, a planning agent, and a knowledge engine. We employ neural networks for natural-language-based scene understanding so that robots can share environmental information, and we then generate a sequence of actions for each robot using a Planning Domain Definition Language (PDDL) planner. JENA-TDB is used to store the acquired knowledge. The proposed method is validated with simulation results obtained from one unmanned aerial vehicle and three unmanned ground vehicles.
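A minimal sketch of how such a framework's pieces might hand off to one another is given below. All names, signatures, and the rescue scenario are hypothetical stand-ins invented for illustration; a real system would persist the triples in JENA-TDB and invoke an off-the-shelf PDDL planner rather than the toy planner shown here.

# Hypothetical pipeline sketch: a scene-understanding step emits semantic
# triples, a knowledge store keeps them, and a planner expands goals into
# per-robot actions. Everything here is an illustrative stand-in.
from dataclasses import dataclass

@dataclass
class Triple:
    subject: str
    predicate: str
    obj: str

def understand_scene(description: str) -> list[Triple]:
    # Stand-in for the neural scene-understanding engine, which maps a
    # natural-language description to semantic knowledge.
    if "victim" in description:
        return [Triple("victim_1", "locatedAt", "zone_a")]
    return []

def plan(goal: str, knowledge: list[Triple]) -> list[str]:
    # Stand-in for a PDDL planner call; a real system would write a PDDL
    # problem file from the triples and run an external planner.
    actions = []
    for t in knowledge:
        if t.predicate == "locatedAt":
            actions += [f"(fly uav_1 {t.obj})", f"(goto ugv_1 {t.obj})"]
    return actions

kb = understand_scene("a victim is lying near the collapsed wall")
print(plan("(rescued victim_1)", kb))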


This research deals with original algorithms for integrating a linguistic processor into the solution of planimetric problems. The linguistic processor translates the natural-language description of a problem into a semantic representation based on an ontology that supports the axiomatics of geometry, and it synthesizes natural-language comments on the solution and on drawing objects. A method for interactive visualization of the linguistic processor's operation is proposed; it provides step-by-step dialog control over the construction of a syntactic structure and its mapping into the semantic representation. In the experiments, several dozen standard syntactic structures that map correctly onto semantic structures of the subject area were obtained. Directions for further research on developing the proposed approach are outlined.
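To make the translation step concrete, here is a toy Python sketch mapping one natural-language geometry statement to ontology-backed relations. The mini-ontology and pattern rules are invented for illustration; the actual linguistic processor's grammar and ontology are far richer.

# Toy sketch: translating one planimetric statement into ontology relations.
# The mini-ontology and the two pattern rules are invented for illustration.
import re

ONTOLOGY = {"Triangle", "Segment", "Point"}  # concepts assumed by the rules

def parse_statement(text: str) -> list[tuple]:
    relations = []
    # Rule: "triangle ABC" -> a Triangle instance with three vertices.
    m = re.search(r"triangle\s+([A-Z])([A-Z])([A-Z])", text, re.I)
    if m:
        a, b, c = m.groups()
        relations.append(("instanceOf", a + b + c, "Triangle"))
        relations += [("hasVertex", a + b + c, p) for p in (a, b, c)]
    # Rule: "X is the midpoint of YZ" -> midpoint relation on a Segment.
    m = re.search(r"([A-Z]) is the midpoint of ([A-Z])([A-Z])", text)
    if m:
        p, y, z = m.groups()
        relations.append(("instanceOf", y + z, "Segment"))
        relations.append(("midpointOf", p, y + z))
    return relations

print(parse_statement("In triangle ABC, M is the midpoint of BC"))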


Author(s):  
Haonan Li ◽  
Ehsan Hamzei ◽  
Ivan Majic ◽  
Hua Hua ◽  
Jochen Renz ◽  
...  

Existing question answering systems struggle to answer factoid questions that involve geospatial information, largely because most systems cannot accurately detect the geospatial semantic elements in a natural language question or capture the semantic relationships between those elements. In this paper, we propose a geospatial semantic encoding schema and a semantic graph representation that captures the semantic relations and dependencies in geospatial questions. We demonstrate that the proposed graph representation aids the translation from natural language to a formal, executable expression in a query language. To remove the need for people to provide explanatory information as part of their questions and to make the translation fully automatic, we treat the semantic encoding of the question as a sequence tagging task and the graph generation of the query as a semantic dependency parsing task, and we apply neural network approaches to automatically encode geospatial questions into spatial semantic graph representations. Compared with current template-based approaches, our method generalises to a broader range of questions, including those with complex syntax and semantics, and achieves better results on GeoData201 than existing methods.
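As an illustration of the two-stage idea (sequence tagging, then graph construction), the sketch below tags the tokens of a geospatial question with invented role labels and links them into a small dependency-style graph. The label set, lexicon, and linking rule are placeholders, not the paper's schema; the paper learns both steps with neural sequence models.

# Toy sketch: tag tokens of a geospatial question with semantic roles,
# then link the tagged spans into a graph. Labels and rules are invented.
LEXICON = {
    "restaurants": "OBJECT",
    "near": "SPATIAL_RELATION",
    "river": "OBJECT",
    "in": "SPATIAL_RELATION",
    "melbourne": "PLACE_NAME",
}

def tag(tokens):
    return [(t, LEXICON.get(t.lower(), "O")) for t in tokens]

def build_graph(tagged):
    # Attach each spatial relation to its nearest tagged neighbours.
    edges = []
    for i, (tok, label) in enumerate(tagged):
        if label == "SPATIAL_RELATION":
            left = next((t for t, l in reversed(tagged[:i]) if l != "O"), None)
            right = next((t for t, l in tagged[i + 1:] if l != "O"), None)
            if left and right:
                edges.append((left, tok, right))
    return edges

q = "restaurants near the river in Melbourne".split()
print(build_graph(tag(q)))
# [('restaurants', 'near', 'river'), ('river', 'in', 'Melbourne')]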


Author(s):  
Jayant Krishnamurthy ◽  
Thomas Kollar

This paper introduces Logical Semantics with Perception (LSP), a model for grounded language acquisition that learns to map natural language statements to their referents in a physical environment. For example, given an image, LSP can map the statement “blue mug on the table” to the set of image segments showing blue mugs on tables. LSP learns physical representations for both categorical (“blue,” “mug”) and relational (“on”) language, and also learns to compose these representations to produce the referents of entire statements. We further introduce a weakly supervised training procedure that estimates LSP’s parameters using annotated referents for entire statements, without annotated referents for individual words or the parse structure of the statement. We perform experiments on two applications: scene understanding and geographical question answering. We find that LSP outperforms existing, less expressive models that cannot represent relational language. We further find that weakly supervised training is competitive with fully supervised training while requiring significantly less annotation effort.
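The compositional idea, categorical predicates over segments and relational predicates over pairs of segments, composed to ground a full statement such as "blue mug on the table", can be sketched as follows. The segment attributes and predicate definitions below are hand-coded for illustration; in LSP they are learned classifiers over visual features.

# Toy sketch of LSP-style composition: categorical predicates select image
# segments, a relational predicate filters pairs, and composing them grounds
# "blue mug on the table". Hand-coded here; learned from data in LSP.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Segment:
    sid: int
    category: str
    color: str
    supported_by: Optional[int]  # sid of the supporting segment, if any

scene = [
    Segment(0, "table", "brown", None),
    Segment(1, "mug", "blue", 0),
    Segment(2, "mug", "red", 0),
    Segment(3, "mug", "blue", None),
]

def cat(name):             # categorical predicate, e.g. "mug", "blue"
    return lambda s: name in (s.category, s.color)

def on(s, t):              # relational predicate, e.g. "on"
    return s.supported_by == t.sid

# Ground "blue mug on the table" by composing the predicates.
referents = [s for s in scene
             if cat("blue")(s) and cat("mug")(s)
             and any(on(s, t) and cat("table")(t) for t in scene)]
print([s.sid for s in referents])  # [1]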

