Video Captioning Based on Both Egocentric and Exocentric Views of Robot Vision for Human-Robot Interaction

Author(s):  
Soo-Han Kang ◽  
Ji-Hyeong Han

Robot vision provides the most important information to robots so that they can read the context and interact with human partners successfully. Moreover, to allow humans to recognize the robot’s visual understanding during human-robot interaction (HRI), the best way is for the robot to provide an explanation of its understanding in natural language. In this paper, we propose a new approach to interpreting robot vision from an egocentric standpoint and generating descriptions that explain egocentric videos, particularly for HRI. Because robot vision is equivalent to egocentric video from the robot’s side, it contains as much egocentric view information as exocentric view information. Thus, we propose a new dataset, referred to as the global, action, and interaction (GAI) dataset, which consists of egocentric video clips and GAI descriptions in natural language to represent both egocentric and exocentric information. An encoder-decoder based deep learning model is trained on the GAI dataset, and its performance on description generation assessments is evaluated. We also conduct experiments in actual environments to verify whether the GAI dataset and the trained deep learning model can improve a robot vision system.

2019 ◽  
Vol 14 (1) ◽  
pp. 22-30
Author(s):  
Dongkeon Park ◽  
Kyeong-Min Kang ◽  
Jin-Woo Bae ◽  
Ji-Hyeong Han

2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Chinh Trong Nguyen ◽  
Dang Tuan Nguyen

Recently, many deep learning models have achieved high results on the question answering task, with overall F1 scores above 0.88 on the SQuAD datasets. However, many of these models have quite low F1 scores on why-questions, ranging from 0.57 to 0.7 on the SQuAD v1.1 development set. This means these models are better suited to extracting answers for factoid questions than for why-questions. Why-questions are asked when explanations are needed. These explanations may be arguments or simply subjective opinions. Therefore, we propose an approach to finding the answers to why-questions using discourse analysis and natural language inference. In our approach, natural language inference is applied to identify implicit arguments at the sentence level. It is also applied in sentence similarity calculation. Discourse analysis is applied to identify the explicit arguments and the opinions at the sentence level in documents. The results from these two methods are the answer candidates from which the final answer for each why-question is selected. We also implement a system based on our approach. Given a why-question and a document, our system can provide an answer, as in a reading comprehension test. We test our system with a Vietnamese-translated test set that contains all why-questions of the SQuAD v1.1 development set. The test results show that our system cannot beat a deep learning model in F1 score; however, our system can answer more questions (answer rate of 77.0%) than the deep learning model (answer rate of 61.0%).


2020 ◽  
Vol 13 (4) ◽  
pp. 627-640
Author(s):  
Avinash Chandra Pandey ◽  
Dharmveer Singh Rajpoot

Background: Sentiment analysis is the contextual mining of text to determine the viewpoint of users with respect to sentimental topics commonly present on social networking websites. Twitter is one of the social sites where people express their opinions about any topic in the form of tweets. These tweets can be examined using various sentiment classification methods to find the opinions of users. Traditional sentiment analysis methods use manually extracted features for opinion classification. Manual feature extraction is a complicated task since it requires predefined sentiment lexicons. Deep learning methods, on the other hand, automatically extract relevant features from data; hence, they provide better performance and richer representation competency than the traditional methods. Objective: The main aim of this paper is to enhance sentiment classification accuracy and to reduce the computational cost. Method: To achieve this objective, a hybrid deep learning model based on a convolutional neural network and a bi-directional long short-term memory neural network has been introduced. Results: The proposed sentiment classification method achieves the highest accuracy for most of the datasets. Further, the efficacy of the proposed method has been validated by statistical analysis. Conclusion: Sentiment classification accuracy can be improved by creating veracious hybrid models. Moreover, performance can also be enhanced by tuning the hyperparameters of deep learning models.


Author(s):  
Giorgio Metta

This chapter outlines a number of research lines that, starting from the observation of nature, attempt to mimic human behavior in humanoid robots. Humanoid robotics is one of the most exciting proving grounds for the development of biologically inspired hardware and software—machines that try to recreate billions of years of evolution with some of the abilities and characteristics of living beings. Humanoids could be especially useful for their ability to “live” in human-populated environments, occupying the same physical space as people and using tools that have been designed for people. Natural human–robot interaction is also an important facet of humanoid research. Finally, learning and adapting from experience, the hallmark of human intelligence, may require some approximation to the human body in order to attain similar capacities to humans. This chapter focuses particularly on compliant actuation, soft robotics, biomimetic robot vision, robot touch, and brain-inspired motor control in the context of the iCub humanoid robot.

