scene understanding
Recently Published Documents


Total documents: 552 (five years: 169)
H-index: 35 (five years: 8)

2022 ◽  
pp. 373-403
Author(s):  
Cornelia Fermüller ◽  
Michael Maynord

2022 ◽  
Vol 183 ◽  
pp. 470-481
Author(s):  
Ning Zhang ◽  
Francesco Nex ◽  
Norman Kerle ◽  
George Vosselman

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Yuezhong Wu ◽  
Xuehao Shen ◽  
Qiang Liu ◽  
Falong Xiao ◽  
Changyun Li

Garbage classification is a social issue that affects people’s livelihoods and sustainable development, so enabling service robots to classify garbage autonomously is of research significance. To address the data transmission delays and slow responses that arise between data sources and cloud service centers in complex systems, and to support the perception, storage, and analysis of massive multisource heterogeneous data, a garbage detection and classification method based on visual scene understanding is proposed. The method uses knowledge graphs to store and model the items in a scene across multiple modalities, including images, videos, and text. The ESA attention mechanism is added to the backbone of the YOLOv5 network to improve its feature extraction ability, and the network is combined with the constructed multimodal knowledge graph to form the YOLOv5-Attention-KG model, which is deployed to the service robot to perceive the items in the scene in real time. Finally, collaborative training is carried out on the cloud server, and the trained model is deployed to the edge device to reason over and analyze the data in real time. Test results show that, compared with the original YOLOv5 model, the proposed model achieves higher detection and classification accuracy, and its real-time performance meets practical requirements. The proposed model can support intelligent garbage classification decisions over large volumes of scene data in a complex system and shows potential for practical deployment.
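The pairing of a detector with a knowledge graph described in this abstract can be illustrated with a minimal sketch. It assumes the detector has already produced (label, confidence) pairs and that the knowledge graph is a small list of triples; the triples, labels, the `belongs_to` predicate, and the confidence cutoff are all hypothetical and are not taken from the paper, whose YOLOv5-Attention-KG model is considerably more involved.

```python
# Hypothetical knowledge-graph triples (subject, predicate, object).
# These are illustrative only and do not come from the paper.
TRIPLES = [
    ("banana_peel", "belongs_to", "kitchen_waste"),
    ("plastic_bottle", "belongs_to", "recyclable"),
    ("battery", "belongs_to", "hazardous_waste"),
]


def build_index(triples, predicate="belongs_to"):
    """Map each subject to its object for the given predicate."""
    return {s: o for s, p, o in triples if p == predicate}


def classify_detections(detections, index, min_confidence=0.5):
    """Keep confident detections and attach a category from the graph.

    Labels missing from the graph are reported as "unknown".
    """
    results = []
    for label, confidence in detections:
        if confidence < min_confidence:
            continue  # drop low-confidence detections
        results.append((label, index.get(label, "unknown")))
    return results


# Stand-in for detector output: (label, confidence) pairs.
detections = [("plastic_bottle", 0.91), ("battery", 0.42), ("shoe", 0.77)]
decisions = classify_detections(detections, build_index(TRIPLES))
print(decisions)
```

In this toy setup the graph lookup is a plain dictionary; a real multimodal knowledge graph would hold images, video, and text alongside such category relations.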


2021 ◽  
Vol 4 ◽  
Author(s):  
Ruwan Wickramarachchi ◽  
Cory Henson ◽  
Amit Sheth

Scene understanding is a key technical challenge within the autonomous driving domain. It requires a deep semantic understanding of the entities and relations found within complex physical and social environments that is both accurate and complete. In practice, this can be accomplished by representing entities in a scene and their relations as a knowledge graph (KG). This scene knowledge graph may then be utilized for the task of entity prediction, leading to improved scene understanding. In this paper, we define and formalize this problem as Knowledge-based Entity Prediction (KEP). KEP aims to improve scene understanding by predicting potentially unrecognized entities, leveraging heterogeneous, high-level semantic knowledge of driving scenes. An innovative neuro-symbolic solution for KEP is presented, based on knowledge-infused learning, which 1) introduces a dataset-agnostic ontology to describe driving scenes, 2) uses an expressive, holistic representation of scenes with knowledge graphs, and 3) proposes an effective, non-standard mapping of the KEP problem to the problem of link prediction (LP) using knowledge-graph embeddings (KGE). Using real, complex, and high-quality data from urban driving scenes, we demonstrate its effectiveness by showing that the missing entities may be predicted with high precision (0.87 Hits@1) while significantly outperforming the non-semantic/rule-based baselines.
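To make the link-prediction framing and the Hits@1 metric concrete, here is a minimal sketch. It assumes a TransE-style scoring function, which is one well-known KGE model and not necessarily the one used in the paper; the two-dimensional embeddings, the `includes` relation, and the entity names are hand-picked, hypothetical values rather than learned ones.

```python
import math


def transe_score(head, relation, tail):
    """TransE-style plausibility: negative L2 distance of head + relation to tail."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rank_tails(head, relation, candidates):
    """Return candidate entity names ordered from most to least plausible."""
    return sorted(
        candidates,
        key=lambda name: transe_score(head, relation, candidates[name]),
        reverse=True,
    )


def hits_at_k(rankings, gold, k=1):
    """Fraction of queries whose gold entity appears in the top k of its ranking."""
    hits = sum(1 for ranking, answer in zip(rankings, gold) if answer in ranking[:k])
    return hits / len(gold)


# Hypothetical toy embeddings (not learned, not from the paper).
includes = (1.0, 0.0)
scenes = {"scene_a": (0.0, 0.0), "scene_b": (0.0, 1.0)}
entities = {
    "pedestrian": (1.0, 0.1),
    "traffic_cone": (0.0, 1.0),
    "bicycle": (2.0, 2.0),
}

# Query (scene, includes, ?) for each scene and compare with the gold entity.
gold = ["pedestrian", "traffic_cone"]
rankings = [rank_tails(scenes[name], includes, entities) for name in ("scene_a", "scene_b")]

print(hits_at_k(rankings, gold, k=1))
print(hits_at_k(rankings, gold, k=2))
```

Hits@1 only rewards a query when the correct entity is ranked first, which is why it is a strict measure of precision for entity prediction.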


2021 ◽  
pp. 1-24
Author(s):  
Qiushuo Zheng ◽  
Hao Wen ◽  
Meng Wang ◽  
Guilin Qi

Existing visual scene understanding methods mainly focus on identifying coarse-grained concepts about visual objects and their relationships, largely neglecting fine-grained scene understanding. In fact, many data-driven applications on the web (e.g., news reading and e-shopping) need to accurately recognize finer-grained concepts as entities and properly link them to a knowledge graph, which can take their performance to the next level. In light of this, we identify a new research task: visual entity linking for fine-grained scene understanding. To accomplish the task, we first extract features of candidate entities from different modalities, i.e., visual features, textual features, and KG features. Then, we design a learning-to-rank method, based on a deep modal-attention neural network, that aggregates all features and maps visual objects to entities in the KG. Extensive experimental results on the newly constructed dataset show that the proposed method is effective: it improves accuracy from 66.46% to 83.16% compared with the baselines.
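The idea of weighting modalities with attention and then ranking candidate entities can be sketched in a few lines. This is a deliberately simplified illustration: it uses a softmax-weighted sum of per-modality scores with fixed attention logits, whereas the paper's network learns its attention from data. The candidate names, scores, and logits below are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(modality_scores, attention_logits):
    """Attention-weighted sum of per-modality scores (visual, textual, KG)."""
    weights = softmax(attention_logits)
    return sum(w * s for w, s in zip(weights, modality_scores))


def rank_entities(candidates, attention_logits):
    """Order candidate entities by fused score, best first."""
    return sorted(
        candidates,
        key=lambda name: fuse(candidates[name], attention_logits),
        reverse=True,
    )


# Hypothetical per-modality match scores: [visual, textual, KG].
candidates = {
    "entity_a": [0.9, 0.2, 0.1],
    "entity_b": [0.3, 0.8, 0.7],
}

uniform_ranking = rank_entities(candidates, [0.0, 0.0, 0.0])
visual_heavy_ranking = rank_entities(candidates, [3.0, 0.0, 0.0])
print(uniform_ranking, visual_heavy_ranking)
```

Shifting attention toward the visual modality changes which candidate wins, which is the behavior a learned modal-attention layer is meant to control per input.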


2021 ◽  
Vol 2066 (1) ◽  
pp. 012049
Author(s):  
Jianfeng Zhong

As a value-added service that improves the efficiency of online customer service, customer service robots have been well received by sellers in recent years. Such robots aim to free customer service staff from heavy consulting workloads, thereby reducing the seller’s operating costs and improving the quality of online service. The purpose of this article is to study scene understanding technology for intelligent customer service robots based on deep learning. It introduces commonly used deep learning models and training methods, together with the application fields of deep learning. It then analyzes the problems of the traditional Encoder-Decoder framework and, based on these problems, introduces the chat model designed in this paper: the intelligent chat robot model (T-DLLModel), obtained by combining a neural network topic model with a deep learning language model. Two experiments are conducted on dialogues between online shopping customer service staff and customers: an independent question understanding experiment based on question retelling, and a question understanding experiment that incorporates contextual information. The experimental results show that when the similarity threshold is 0.4, the method achieves better results, reaching an F value of 0.5. The semantic similarity calculation method proposed in this paper outperforms the traditional method based on keywords and semantic information; in particular, as the similarity threshold increases, its recall is significantly higher than that of the traditional method. On real customer service dialogue data, the method also ranks answers slightly better than the LDA-based method.
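The interplay between a similarity threshold and the reported F value can be illustrated with a short sketch. It assumes the F value is the balanced F1 score and that a question pair is predicted as matching when its similarity is at or above the threshold; both are assumptions, since the abstract does not define them. The scores and labels are made-up toy data, not the paper's results.

```python
def f1_at_threshold(scores, labels, threshold):
    """Return (precision, recall, F1) when pairs with score >= threshold are predicted as matches."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy similarity scores and ground-truth match labels (illustrative only).
scores = [0.9, 0.6, 0.45, 0.3, 0.2]
labels = [True, True, False, True, False]

for threshold in (0.4, 0.5):
    precision, recall, f1 = f1_at_threshold(scores, labels, threshold)
    print(threshold, round(precision, 3), round(recall, 3), round(f1, 3))
```

Raising the threshold generally trades recall for precision, which is why results of this kind are reported at a specific threshold.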


2021 ◽  
Author(s):  
Muraleekrishna Gopinathan ◽  
Giang Truong ◽  
Jumana Abu-Khalaf

2021 ◽  
Author(s):  
Hoai-Nhan Nguyen ◽  
Minh-Son Nguyen ◽  
Tri-Nhut Do
