Coordinating Entrainment Phenomena: Robot Conversation Strategy for Object Recognition

2021 ◽  
Vol 11 (5) ◽  
pp. 2358
Author(s):  
Mitsuhiko Kimoto ◽  
Takamasa Iio ◽  
Masahiro Shiomi ◽  
Katsunori Shimohara

This study proposes a robot conversation strategy involving speech and gestures to improve a robot’s indicated object recognition, i.e., the recognition of an object indicated by a human. Research on improving indicated object recognition performance follows two main approaches: development and interactive. The development approach focuses on creating new devices or algorithms. The interactive approach improves performance through human–robot interaction, by reducing the variability and ambiguity of human references. Inspired by the findings of entrainment and entrainment inhibition, this study proposes a robot conversation strategy that follows the interactive approach. Entrainment is a phenomenon in which people unconsciously tend to mimic the words and/or gestures of their interlocutor; entrainment inhibition is the opposite phenomenon, in which people reduce the amount of information in their words and gestures when their interlocutor provides excess information. Based on these phenomena, we designed a robot conversation strategy that elicits clear references. We experimentally compared this strategy with another interactive strategy in which a robot explicitly requests clarification when a human refers to an object. We obtained the following findings: (1) the proposed strategy clarifies human references and improves indicated object recognition performance, and (2) the proposed strategy creates a better impression than the strategy that explicitly requests clarification when people refer to objects.

Robotics ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 68
Author(s):  
Lei Shi ◽  
Cosmin Copot ◽  
Steve Vanlanduit

In gaze-based Human-Robot Interaction (HRI), it is important to determine human visual intention when interacting with robots. One typical HRI scenario is that a human selects an object by gaze and a robotic manipulator picks up that object. In this work, we propose an approach, GazeEMD, that can be used to detect whether a human is looking at an object for HRI applications. We use Earth Mover’s Distance (EMD) to measure the similarity between the hypothetical gazes at objects and the actual gazes. The similarity score is then used to determine whether the human visual intention is on the object. We compare our approach with a fixation-based method and HitScan with a run length in the scenario of selecting daily objects by gaze. Our experimental results indicate that the GazeEMD approach has higher accuracy and is more robust to noise than the other approaches. Hence, users can reduce their cognitive load by using our approach in real-world HRI scenarios.
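As a rough illustration of the core idea, the sketch below compares actual gaze samples against a hypothetical gaze cloud centered on an object and thresholds the EMD score. The per-axis use of SciPy's one-dimensional `wasserstein_distance`, the Gaussian gaze model, and all parameter values are assumptions for illustration, not the paper's exact formulation.

```python
# A minimal sketch of EMD-based gaze-on-object detection (illustrative,
# not the GazeEMD implementation from the paper).
import numpy as np
from scipy.stats import wasserstein_distance

def gaze_emd(actual_gaze, object_center, spread=10.0, n_samples=200, rng=None):
    """Dissimilarity between actual gaze points and a hypothetical gaze
    cloud centered on the object (Gaussian model is an assumption)."""
    rng = rng or np.random.default_rng(0)
    actual = np.asarray(actual_gaze, dtype=float)   # shape (N, 2), pixel coords
    hypo = rng.normal(loc=object_center, scale=spread, size=(n_samples, 2))
    # scipy's wasserstein_distance is 1D, so average the per-axis EMDs.
    return np.mean([wasserstein_distance(actual[:, k], hypo[:, k]) for k in (0, 1)])

def looking_at(actual_gaze, object_center, threshold=25.0):
    """Decide gaze-on-object by thresholding the EMD score (threshold illustrative)."""
    return gaze_emd(actual_gaze, object_center) < threshold

gaze = np.array([[310.0, 242.0], [305.0, 238.0], [312.0, 240.0]])
print(looking_at(gaze, object_center=(308.0, 240.0)))   # True: gaze near object
```

A lower EMD means the observed gaze distribution resembles the hypothetical distribution over the object, which is what the similarity-score decision in the abstract relies on.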


Information ◽  
2020 ◽  
Vol 11 (2) ◽  
pp. 112
Author(s):  
Marit Hagens ◽  
Serge Thill

Perfect information about an environment allows a robot to plan its actions optimally, but often requires significant investment in sensors and possibly infrastructure. In applications relevant to human–robot interaction, the environment is by definition dynamic, and events close to the robot may be more relevant than distal ones. This suggests a non-trivial relationship between sensory sophistication on the one hand and task performance on the other. In this paper, we investigate this relationship in a simulated crowd navigation task. We use three different environments with unique characteristics that a crowd-navigating robot might encounter and explore how the robot’s sensor range correlates with performance in the navigation task. We find diminishing returns of increased range in our particular case, suggesting that task performance and sensory sophistication might follow non-trivial relationships and that increased sophistication on the sensor side does not necessarily yield a corresponding increase in performance. Although this result is a simple proof of concept, it illustrates the benefit of exploring the consequences of different hardware designs, rather than merely algorithmic choices, in simulation first. We also find surprisingly good performance in the navigation task, including a low number of collisions with simulated human agents, using a relatively simple A*/NavMesh-based navigation strategy, which suggests that navigation strategies for robots in crowds need not always be sophisticated.
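The sketch below illustrates the shape of such an experiment: a toy point robot heads to a goal and only avoids crowd agents it can sense, and the sensor range is swept to observe diminishing returns. This is not the authors' A*/NavMesh simulator; the dynamics, thresholds, and agent counts are all illustrative assumptions.

```python
# A toy sensor-range sweep for crowd navigation (illustrative only).
import numpy as np

def run_trial(sensor_range, n_agents=20, steps=300, rng=None):
    rng = rng or np.random.default_rng(1)
    robot, goal = np.array([0.0, 0.0]), np.array([20.0, 20.0])
    agents = rng.uniform(0, 20, size=(n_agents, 2))      # crowd positions
    vel = rng.normal(0, 0.05, size=(n_agents, 2))        # slow random drift
    collisions = 0
    for _ in range(steps):
        agents += vel
        d = np.linalg.norm(agents - robot, axis=1)
        # Attraction toward the goal.
        step = 0.1 * (goal - robot) / (np.linalg.norm(goal - robot) + 1e-9)
        # Repulsion only from agents inside the sensor range (the variable under study).
        for a in agents[d < sensor_range]:
            away = robot - a
            step += 0.05 * away / (np.linalg.norm(away) ** 2 + 1e-9)
        robot += step
        collisions += int((d < 0.3).any())               # count close encounters
        if np.linalg.norm(goal - robot) < 0.5:
            break
    return collisions

for r in (1.0, 2.0, 5.0, 10.0):   # sweep sensor range, watch returns flatten
    print(f"range={r}: collisions={run_trial(r)}")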


Author(s):  
Michael S. Brickner ◽  
Amir Zvuloni

Thermal imaging (TI) systems transform the distribution of relative temperatures in a scene into a visible TV image. TI images differ significantly from regular TV images. Most TI systems allow their operators to select a preferred polarity, which determines how gray shades represent different temperatures. Polarity may be set to either black hot (BH) or white hot (WH). The present experiments were designed to investigate the effects of polarity on object recognition performance in TI and to compare the object recognition performance of experts and novices. In the first experiment, twenty flight candidates were asked to recognize target objects in 60 dynamic TI recordings taken from two different TI systems. The targets included a variety of human-placed and natural objects. Each subject viewed half the targets in BH and the other half in WH polarity in a balanced experimental design. For 24 out of the 60 targets, one direction of polarity produced better performance than the other. Although the direction of superior polarity (BH or WH better) was not consistent, the preferred representation of the target object was very consistent. For example, vegetation was more readily recognized when presented as dark objects against a brighter background. The results are discussed in terms of the importance of surface determinants versus edge determinants in the recognition of TI objects. In the second experiment, the performance of 10 expert TI users was found to be significantly more accurate, but not much faster, than the performance of 20 novice subjects.
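At the image level, the BH/WH polarity choice amounts to inverting the grayscale mapping of relative temperatures. The sketch below shows that mapping on a normalized thermal frame; real TI systems apply calibrated transfer functions, so this is a simplified assumption, and the fake temperature field is a placeholder.

```python
# A minimal sketch of BH/WH polarity as grayscale inversion (illustrative).
import numpy as np

def to_polarity(thermal, polarity="WH"):
    """Map relative temperatures to 8-bit gray: WH = hot bright, BH = hot dark."""
    t = np.asarray(thermal, dtype=float)
    norm = (t - t.min()) / (t.max() - t.min() + 1e-9)   # normalize to 0..1
    if polarity == "BH":                                # black hot: invert the ramp
        norm = 1.0 - norm
    return (255 * norm).astype(np.uint8)

frame = np.random.rand(4, 4) * 30 + 280                 # fake temperature field
print(to_polarity(frame, "WH"))                         # hottest pixel -> 255
print(to_polarity(frame, "BH"))                         # hottest pixel -> 0
```

Under this view, the paper's finding is that the same scene content can be easier or harder to recognize depending solely on which end of this ramp the target occupies.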


2021 ◽  
Vol 7 (4) ◽  
pp. 65
Author(s):  
Daniel Silva ◽  
Armando Sousa ◽  
Valter Costa

Object recognition refers to the ability of a system to identify objects, humans or animals in images. Within this domain, this work presents a comparative analysis of different classification methods for Tactode tile recognition. The covered methods include: (i) machine learning with HOG and SVM; (ii) deep learning with CNNs such as VGG16, VGG19, ResNet152, MobileNetV2, SSD and YOLOv4; (iii) matching of handcrafted features with SIFT, SURF, BRISK and ORB; and (iv) template matching. A dataset was created to train the learning-based methods (i and ii); the remaining methods (iii and iv) used a template dataset. To evaluate the performance of the recognition methods, two test datasets were built: tactode_small and tactode_big, which consisted of 288 and 12,000 images, holding 2784 and 96,000 regions of interest for classification, respectively. SSD and YOLOv4 were the worst methods in their domain, whereas ResNet152 and MobileNetV2 proved to be strong recognition methods. SURF, ORB and BRISK demonstrated great recognition performance, while SIFT was the weakest of the handcrafted-feature methods. The methods based on template matching attained reasonable recognition results but fell behind most other methods. The top three methods of this study were: VGG16, with an accuracy of 99.96% and 99.95% for tactode_small and tactode_big, respectively; VGG19, with an accuracy of 99.96% and 99.68% for the same datasets; and HOG and SVM, which reached an accuracy of 99.93% for tactode_small and 99.86% for tactode_big while presenting average execution times of 0.323 s and 0.232 s on the respective datasets, making it the fastest method overall. This work demonstrated that VGG16 was the best choice for this case study, since it minimised the misclassifications for both test datasets.
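For orientation, the sketch below shows the general shape of method (i), a HOG + SVM pipeline, using scikit-image and scikit-learn. The crop size, HOG parameters, and placeholder data are assumptions; the paper's actual dataset and tuning are not reproduced here.

```python
# A minimal sketch of a HOG + SVM tile classifier (method i, illustrative).
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def hog_features(images):
    """Extract HOG descriptors from an iterable of 2D grayscale arrays."""
    return np.array([hog(im, orientations=9, pixels_per_cell=(8, 8),
                         cells_per_block=(2, 2)) for im in images])

# Placeholder data standing in for Tactode tile crops and class labels.
rng = np.random.default_rng(0)
X_train, y_train = rng.random((40, 64, 64)), rng.integers(0, 4, 40)
X_test, y_test = rng.random((10, 64, 64)), rng.integers(0, 4, 10)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(hog_features(X_train), y_train)
print("accuracy:", clf.score(hog_features(X_test), y_test))
```

Pipelines of this kind have no GPU requirement, which is consistent with HOG + SVM being the fastest method in the study while remaining competitive in accuracy.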


Author(s):  
Youdi Li ◽  
Wei Fen Hsieh ◽  
Eri Sato-Shimokawara ◽  
Toru Yamaguchi

In daily life, we inevitably encounter situations in which we feel confident or unconfident, and our expressions and responses differ accordingly; the same holds when a human communicates with a robot. Robots need to behave in various styles to express an adaptive degree of confidence: in previous work, for example, when a robot made mistakes during an interaction, different certainty expression styles influenced humans' perceived trustworthiness and acceptance. Conversely, when a human feels uncertain about the robot's utterance, how the robot recognizes that uncertainty is crucial. However, related research is still scarce and tends to ignore individual characteristics. In the current study, we designed an experiment to collect human verbal and non-verbal features under certain and uncertain conditions. From the certain/uncertain answer experiment, we extracted head movement and voice factors as features and investigated whether they can be classified correctly. The results showed that different people exhibit distinct features for different certainty degrees, although some participants shared similar patterns, consistent with their relatively close psychological feature values. We aim to explore individuals' certainty expression patterns because doing so not only facilitates the detection of a human's confidence status but can also be used on the robot side to give proper responses adaptively and thus enrich Human-Robot Interaction.
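A classification step of this kind could look like the sketch below, which trains a classifier on per-answer head-movement and voice features. The feature set, labels, and random data are purely illustrative assumptions; the paper's actual features and classifier are not specified here.

```python
# A minimal sketch of certain/uncertain classification from head-movement
# and voice features (illustrative assumptions throughout).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Hypothetical features per answer: [head pitch variance, head yaw variance,
# mean voice pitch, response latency]; labels: 1 = certain, 0 = uncertain.
X = rng.random((60, 4))
y = rng.integers(0, 2, 60)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```

Since the study reports individual differences in expression patterns, per-participant models (or participant-conditioned features) would likely fit this setting better than a single pooled classifier.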


Sensors ◽  
2018 ◽  
Vol 18 (3) ◽  
pp. 692
Author(s):  
Juan Gandarias ◽  
Jesús Gómez-de-Gabriel ◽  
Alfonso García-Cerezo

2018 ◽  
Vol 161 ◽  
pp. 01001 ◽  
Author(s):  
Karsten Berns ◽  
Zuhair Zafar

Human-machine interaction is a major challenge in the development of complex humanoid robots. In addition to verbal communication, the use of non-verbal cues such as hand, arm and body gestures or facial expressions can improve the understanding of the robot's intention. Conversely, by perceiving such cues from a human in a typical interaction scenario, the humanoid robot can better adapt its interaction skills. In this work, the perception systems of two social robots, ROMAN and ROBIN from the RRLAB of TU Kaiserslautern, are presented in the context of human-robot interaction.

