Robot Vision to Audio Description Based on Deep Learning for Effective Human-Robot Interaction

2019
Vol 14 (1)
pp. 22-30
Author(s):
Dongkeon Park,
Kyeong-Min Kang,
Jin-Woo Bae,
Ji-Hyeong Han
Author(s):
Soo-Han Kang,
Ji-Hyeong Han

Abstract: Robot vision provides the most important information to robots so that they can read the context and interact with human partners successfully. Moreover, the best way to let humans recognize the robot's visual understanding during human-robot interaction (HRI) is for the robot to explain its understanding in natural language. In this paper, we propose a new approach to interpret robot vision from an egocentric standpoint and to generate descriptions that explain egocentric videos, particularly for HRI. Because robot vision corresponds to egocentric video from the robot's side, it contains exocentric view information as well as egocentric view information. We therefore propose a new dataset, referred to as the global, action, and interaction (GAI) dataset, which consists of egocentric video clips paired with GAI descriptions in natural language representing both egocentric and exocentric information. An encoder-decoder based deep learning model is trained on the GAI dataset, and its description generation performance is evaluated. We also conduct experiments in real environments to verify whether the GAI dataset and the trained deep learning model can improve a robot vision system.
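The abstract names an encoder-decoder model for video description but does not specify its architecture. The following is a minimal PyTorch sketch of that general technique: a recurrent encoder summarizes per-frame features and a recurrent decoder emits a caption token by token. All names and dimensions (VideoCaptioner, feat_dim, vocab size, GRU layers) are illustrative assumptions, not the authors' implementation.

```python
# Minimal encoder-decoder video captioning sketch (PyTorch).
# Assumes per-frame features (e.g., from a pretrained CNN) are already extracted.
import torch
import torch.nn as nn

class VideoCaptioner(nn.Module):  # hypothetical name, not from the paper
    def __init__(self, feat_dim=2048, hidden=512, vocab_size=5000, embed=256):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hidden, batch_first=True)  # frames -> video state
        self.embed = nn.Embedding(vocab_size, embed)               # token ids -> vectors
        self.decoder = nn.GRU(embed, hidden, batch_first=True)     # state -> word sequence
        self.out = nn.Linear(hidden, vocab_size)                   # hidden -> vocab logits

    def forward(self, frame_feats, captions):
        # frame_feats: (B, T_frames, feat_dim); captions: (B, T_words) token ids
        _, h = self.encoder(frame_feats)      # h: (1, B, hidden) summary of the clip
        dec_in = self.embed(captions)         # teacher forcing with gold tokens
        dec_out, _ = self.decoder(dec_in, h)  # decoder conditioned on the video state
        return self.out(dec_out)              # (B, T_words, vocab_size) logits

model = VideoCaptioner()
feats = torch.randn(2, 30, 2048)              # 2 clips, 30 frames of features each
caps = torch.randint(0, 5000, (2, 12))        # 2 captions, 12 tokens each
print(model(feats, caps).shape)               # torch.Size([2, 12, 5000])
```

In practice such a model is trained with cross-entropy loss over the caption tokens; the GAI-specific detail here is only the data (egocentric clips paired with global/action/interaction descriptions), not the loss.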


Author(s):  
Giorgio Metta

This chapter outlines a number of research lines that, starting from the observation of nature, attempt to mimic human behavior in humanoid robots. Humanoid robotics is one of the most exciting proving grounds for the development of biologically inspired hardware and software—machines that try to recreate billions of years of evolution with some of the abilities and characteristics of living beings. Humanoids could be especially useful for their ability to “live” in human-populated environments, occupying the same physical space as people and using tools that have been designed for people. Natural human–robot interaction is also an important facet of humanoid research. Finally, learning and adapting from experience, the hallmark of human intelligence, may require some approximation to the human body in order to attain similar capacities to humans. This chapter focuses particularly on compliant actuation, soft robotics, biomimetic robot vision, robot touch, and brain-inspired motor control in the context of the iCub humanoid robot.


2013
pp. 257-280
Author(s):
Wenjie Yan,
Elena Torta,
David van der Pol,
Nils Meins,
Cornelius Weber,
...

This chapter presents an overview of a typical scenario of Ambient Assisted Living (AAL) in which a robot navigates to a person to convey information. Indoor robot navigation is a challenging task due to the complexity of real-home environments and the need for online learning abilities to adjust to dynamic conditions. A comparison between systems with different sensor types shows that vision-based systems promise good performance and a wide scope of usage at reasonable cost. Moreover, vision-based systems can perform different tasks simultaneously by applying different algorithms to the same input data stream, thus enhancing the flexibility of the system. The authors introduce the state of the art of several computer vision methods for realizing indoor robotic navigation to a person and human-robot interaction. A case study has been conducted in which a robot, as part of an AAL system, navigates to a person and interacts with her. The authors evaluate this test case and give an outlook on the potential of learning robot vision in ambient homes.
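The chapter's point that one camera stream can serve several algorithms at once can be illustrated with standard OpenCV components. The sketch below runs a person detector (for navigation toward a person) and a face detector (for starting interaction) on the same frames; the specific detectors and camera index are assumptions for illustration, not the chapter's actual system.

```python
# Two vision tasks on one input stream, as described for vision-based AAL robots.
import cv2

# HOG-based person detector (classic OpenCV pipeline).
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

# Haar-cascade face detector shipped with OpenCV.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)  # robot camera / webcam (index 0 is an assumption)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Task 1: locate the person the robot should navigate to.
    people, _ = hog.detectMultiScale(frame, winStride=(8, 8))
    for (x, y, w, h) in people:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

    # Task 2: detect a face, e.g., as a cue to begin interaction.
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)

    cv2.imshow("robot view", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```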


2020
Vol 53 (5)
pp. 750-755
Author(s):
Lei Shi,
Cosmin Copot,
Steve Vanlanduit

Author(s):
Katsushi Ikeuchi,
Yasuyuki Matsushita,
Ryusuke Sagawa,
Hiroshi Kawasaki,
Yasuhiro Mukaigawa,
...
