A robot vision navigation method using deep learning in edge computing environment

2021 ◽ 
Author(s):  
Jing Li ◽  
Jialin Yin ◽  
Lin Deng

Abstract: In the development of modern agriculture, the intelligent use of mechanical equipment is one of the main indicators of agricultural modernization. Navigation is the key technology that allows agricultural machinery to operate autonomously in its working environment, and it is a research hotspot in intelligent agricultural machinery. To meet the accuracy requirements of autonomous navigation for intelligent agricultural robots, this paper proposes a visual navigation algorithm for agricultural robots based on deep-learning image understanding. The method first processes images collected by the vision system using a cascaded deep convolutional network combined with hybrid dilated convolution fusion. It then extracts the navigation route from the processed images with an improved Hough transform algorithm and adjusts the robot's posture accordingly to realize autonomous navigation. Finally, the proposed method is verified in both interference-free and noisy experimental scenes. Experimental results show that the method can navigate autonomously in complex, noisy environments and has good practicability and applicability.
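
The route-extraction step pairs the image-processing stage with an improved Hough transform. The sketch below illustrates only that second stage in a generic form: it applies OpenCV's probabilistic Hough transform to a binary crop-row mask and fits a single guidance line whose angle could drive the posture adjustment. The mask format, parameter values, and the fitting/steering helpers are illustrative assumptions, not details from the paper.

```python
import cv2
import numpy as np

def extract_guidance_line(row_mask):
    """Fit one guidance line to a binary crop-row mask (255 = crop-row pixels).

    Returns (vx, vy, x0, y0): a unit direction vector and a point on the line,
    or None when no line segments are detected.
    """
    edges = cv2.Canny(row_mask, 50, 150)
    segments = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=60,
                               minLineLength=40, maxLineGap=15)
    if segments is None:
        return None
    # Pool the endpoints of all detected segments and fit a single line to them.
    pts = segments.reshape(-1, 4)
    points = np.vstack([pts[:, :2], pts[:, 2:]]).astype(np.float32)
    vx, vy, x0, y0 = cv2.fitLine(points, cv2.DIST_L2, 0, 0.01, 0.01).ravel()
    return float(vx), float(vy), float(x0), float(y0)

def heading_error(line):
    """Angle (radians) between the fitted row line and the vertical image axis;
    a proportional controller on this angle would adjust the robot's posture."""
    vx, vy, _, _ = line
    return float(np.arctan2(vx, vy))
```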


Author(s):  
Satoshi Hoshino ◽  
Kyohei Niimura

Mobile robots equipped with camera sensors are required to perceive humans and their actions for safe autonomous navigation. For simultaneous human detection and action recognition, the real-time performance of the robot vision system is an important issue. In this paper, we propose a robot vision system in which the original images captured by a camera sensor are described by optical flow. These images are then used as inputs for human and action classification, for which two classifiers based on convolutional neural networks are developed. Moreover, we describe a novel detector, a local search window, for clipping partial images around the target human from the original image. Since the camera sensor moves together with the robot, the camera movement influences the optical flow computed in the image; we address this by further modifying the optical flow to compensate for changes caused by the camera movement. Through experiments, we show that the robot vision system can detect humans and recognize their actions in real time, and that a moving robot can achieve human detection and action recognition thanks to the modified optical flow.
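
The compensation for camera-induced flow is described only qualitatively in the abstract. The sketch below is a rough, assumed approximation of that idea: dense Farneback flow is computed with OpenCV, the global median flow is treated as the camera-induced component and subtracted, and the residual flow is encoded as an image that a CNN classifier could take as input.

```python
import cv2
import numpy as np

def ego_compensated_flow(prev_gray, curr_gray):
    """Dense optical flow with a crude compensation for camera ego-motion.

    The static background dominates most frames, so the median flow vector is
    taken as the apparent motion caused by the moving camera and subtracted;
    the residual approximates the motion of independently moving humans.
    """
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5, poly_sigma=1.2,
                                        flags=0)
    ego = np.median(flow.reshape(-1, 2), axis=0)   # assumed camera-induced flow
    return flow - ego

def flow_to_image(flow):
    """One common way to turn a flow field into a CNN input image:
    hue encodes direction, value encodes magnitude."""
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv = np.zeros((*flow.shape[:2], 3), dtype=np.uint8)
    hsv[..., 0] = (ang * 180 / np.pi / 2).astype(np.uint8)
    hsv[..., 1] = 255
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
```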


Author(s):  
Soo-Han Kang ◽  
Ji-Hyeong Han

Abstract: Robot vision provides the most important information to robots so that they can read the context and interact with human partners successfully. Moreover, the best way to let humans recognize the robot's visual understanding during human-robot interaction (HRI) is for the robot to explain its understanding in natural language. In this paper, we propose a new approach to interpret robot vision from an egocentric standpoint and generate descriptions that explain egocentric videos, particularly for HRI. Because robot vision is essentially egocentric video from the robot's side, it contains exocentric-view information as well as egocentric-view information. We therefore propose a new dataset, referred to as the global, action, and interaction (GAI) dataset, which consists of egocentric video clips and natural-language GAI descriptions representing both egocentric and exocentric information. An encoder-decoder based deep learning model is trained on the GAI dataset and its performance on description generation is evaluated. We also conduct experiments in real environments to verify whether the GAI dataset and the trained deep learning model can improve a robot vision system.
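
The abstract identifies the model only as encoder-decoder based. The following minimal PyTorch sketch shows one plausible shape of such a captioner, with mean-pooled frame features feeding an LSTM decoder over GAI-style descriptions; the feature dimensions, vocabulary size, and layer choices are all assumed for illustration rather than taken from the paper.

```python
import torch
import torch.nn as nn

class VideoCaptioner(nn.Module):
    """Minimal encoder-decoder captioner: pooled frame features -> LSTM decoder."""

    def __init__(self, feat_dim=2048, hidden_dim=512, vocab_size=5000, embed_dim=256):
        super().__init__()
        self.encode = nn.Linear(feat_dim, hidden_dim)     # project clip features
        self.embed = nn.Embedding(vocab_size, embed_dim)  # word embeddings
        self.decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)      # next-word logits

    def forward(self, frame_feats, captions):
        # frame_feats: (B, T, feat_dim) per-frame CNN features of an egocentric clip
        # captions:    (B, L) token ids of the ground-truth GAI description
        clip = torch.tanh(self.encode(frame_feats.mean(dim=1)))
        h0 = clip.unsqueeze(0)                  # decoder state initialised from the clip
        c0 = torch.zeros_like(h0)
        emb = self.embed(captions[:, :-1])      # teacher forcing: shift right
        hidden, _ = self.decoder(emb, (h0, c0))
        return self.out(hidden)                 # (B, L-1, vocab) next-word scores

# Toy usage: two 16-frame clips with 12-token descriptions.
model = VideoCaptioner()
caps = torch.randint(0, 5000, (2, 12))
logits = model(torch.randn(2, 16, 2048), caps)
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 5000), caps[:, 1:].reshape(-1))
```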


Robotica ◽  
1994 ◽  
Vol 12 (1) ◽  
pp. 77-89 ◽  
Author(s):  
M. Elarbi Boudihir ◽  
M. Dufaut ◽  
R. Husson

A new vision system architecture has been developed to support the visual navigation of an autonomous mobile robot. This robot is primarily intended for urban park inspection, so it should be able to move in a complex, unstructured environment. The system consists of various modules, each ensuring a specific task involved in autonomous navigation. Task coordination is handled by a central module, the supervisor, which triggers each module at the time appropriate to the robot's current situation. Most of the processing time is spent in the scene exploration module, which uses the Hough transform to extract the dominant straight features. This module operates in two modes: an initial phase, applied to the first image acquired in order to initiate navigation, and a continuous following mode, which processes subsequent images taken at the end of the blind distance. To rely less on visual data, a detailed map of the environment has been established, and an algorithm makes a scene prediction based on the robot position provided by the localization system. The predicted scene is used to validate the objects detected by the knowledge base, which combines the acquired and predicted data to construct a scene model, the main element of the vision system.
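
The map-based prediction step, anticipating from the stored map and the localisation estimate which features the next image should contain, can be sketched roughly as below. The 2-D landmark map, field of view, range limit, and matching tolerance are assumptions used for illustration, not values from the article.

```python
import numpy as np

def predict_visible_landmarks(landmarks, robot_pose, fov_deg=60.0, max_range=15.0):
    """Predict which mapped landmarks should appear in the next image.

    landmarks  : (N, 2) landmark positions in world coordinates (metres).
    robot_pose : (x, y, heading) from the localisation system, heading in radians.
    Returns the landmarks, expressed in the robot frame, that fall inside an
    assumed camera field of view; these predictions would later be matched
    against the straight features extracted by the Hough-based exploration module.
    """
    x, y, heading = robot_pose
    delta = landmarks - np.array([x, y])
    c, s = np.cos(-heading), np.sin(-heading)
    local = delta @ np.array([[c, -s], [s, c]]).T   # rotate into the robot frame
    rng = np.linalg.norm(local, axis=1)
    bearing = np.arctan2(local[:, 1], local[:, 0])
    visible = (rng < max_range) & (np.abs(bearing) < np.radians(fov_deg) / 2)
    return local[visible]

def validate_detections(predicted, detected, tol=0.5):
    """Keep only detections lying within `tol` metres of a predicted landmark."""
    keep = [d for d in detected
            if np.any(np.linalg.norm(predicted - d, axis=1) < tol)]
    return np.array(keep)
```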


Author(s):  
Satoshi Hoshino ◽  
Kyohei Niimura

Mobile robots equipped with camera sensors are required to perceive surrounding humans and their actions for safe autonomous navigation. In this work, moving humans are the target objects. For robot vision, real-time performance is an important requirement. We therefore propose a robot vision system in which the original images captured by a camera sensor are described by optical flow. These images are then used as inputs to a classifier. To classify the images as human or not-human and to recognize the actions, we use a convolutional neural network (CNN) rather than hand-coded invariant features. Moreover, we present a local search window as a novel detector for clipping partial images around target objects in an original image. Through experiments, we ultimately show that the robot vision system is able to detect moving humans and recognize their actions in real time.
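
The local search window detector is only named in the abstract. As an assumed approximation, the sketch below proposes a clipping window directly from the flow magnitude: the flow is thresholded, the largest connected moving blob is boxed, and the padded box is clipped out for the CNN classifier.

```python
import cv2
import numpy as np

def local_search_window(flow, mag_thresh=2.0, pad=16):
    """Propose a clipping window around the largest moving region.

    flow : (H, W, 2) ego-compensated optical flow.
    Returns (x, y, w, h) of the window, or None if nothing is moving.
    """
    mag = np.linalg.norm(flow, axis=2)
    moving = (mag > mag_thresh).astype(np.uint8)
    # Connected components over the thresholded flow-magnitude image.
    num, labels, stats, _ = cv2.connectedComponentsWithStats(moving, connectivity=8)
    if num < 2:                      # label 0 is the background
        return None
    largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])
    x, y, w, h = stats[largest, :4]
    # Pad the box so the clipped image keeps some context for the classifier.
    H, W = mag.shape
    x0, y0 = max(0, x - pad), max(0, y - pad)
    x1, y1 = min(W, x + w + pad), min(H, y + h + pad)
    return x0, y0, x1 - x0, y1 - y0

def clip_window(image, window):
    """Cut out the partial image that would be passed to the human/action CNN."""
    x, y, w, h = window
    return image[y:y + h, x:x + w]
```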


Author(s):  
Ruting Yao ◽  
Yili Zheng ◽  
Fengjun Chen ◽  
Jian Wu ◽  
Hui Wang

Forestry mobile robots can effectively solve the problems of low efficiency and poor safety in forestry operations. To realize the autonomous navigation of forestry mobile robots, a vision system consisting of a monocular camera and a two-dimensional LiDAR, together with its calibration method, is investigated. First, an adaptive algorithm is used to synchronize the data captured by the two sensors in time. Second, a calibration board with a convex checkerboard is designed for the spatial calibration of the devices, and a nonlinear least-squares algorithm is employed to solve for and optimize the external parameters. The experimental results show that the time synchronization precision of this calibration method is 0.0082 s, the communication rate is 23 Hz, and the gradient tolerance of the spatial calibration is 8.55e−07. The calibration results satisfy the real-time and accuracy requirements of the forestry mobile robot vision system. Furthermore, engineering applications of the vision system are discussed. This study lays the foundation for further research on forestry mobile robots and is relevant to intelligent forest machines.
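
The external parameters are reported to be solved with nonlinear least squares. The sketch below illustrates that step loosely with SciPy, fitting a rigid camera-LiDAR transform to matched calibration-board points, with a nearest-timestamp pairing standing in for the adaptive time synchronization; the residual, the parameterisation, and the use of point-to-point matches are assumptions, not the paper's exact formulation.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def fit_extrinsics(lidar_pts, cam_pts):
    """Estimate the rigid transform taking LiDAR-frame points to camera-frame points.

    lidar_pts, cam_pts : (N, 3) matched calibration-board points in each frame
    (for a 2-D LiDAR, scan points can be given z = 0 in the LiDAR frame).
    Returns (R, t) minimising the point-to-point residual.
    """
    def residual(params):
        rotvec, t = params[:3], params[3:]
        R = Rotation.from_rotvec(rotvec).as_matrix()
        return (lidar_pts @ R.T + t - cam_pts).ravel()

    sol = least_squares(residual, x0=np.zeros(6), method="lm")
    return Rotation.from_rotvec(sol.x[:3]).as_matrix(), sol.x[3:]

def synchronise(lidar_stamps, cam_stamps, tol=0.05):
    """Pair each LiDAR scan with the nearest camera frame within `tol` seconds,
    a simple stand-in for the adaptive time-synchronisation step."""
    cam_stamps = np.asarray(cam_stamps)
    pairs = []
    for i, ts in enumerate(lidar_stamps):
        j = int(np.argmin(np.abs(cam_stamps - ts)))
        if abs(cam_stamps[j] - ts) < tol:
            pairs.append((i, j))
    return pairs
```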


Sensors ◽  
2019 ◽  
Vol 19 (8) ◽  
pp. 1873 ◽  
Author(s):  
Hui Xu ◽  
Guodong Chen ◽  
Zhenhua Wang ◽  
Lining Sun ◽  
Fan Su

As an important part of a factory’s automated production line, industrial robots can perform a variety of tasks by integrating external sensors. Among these tasks, grasping scattered workpieces on the industrial assembly line has always been a prominent and difficult point in robot manipulation research. Using RGB-D (color and depth) information, we propose an efficient and practical solution that fuses semantic segmentation and point cloud registration to perform object recognition and pose estimation. Unlike objects in an indoor environment, the characteristics of the workpieces are relatively simple; we therefore create and label an RGB image dataset from a variety of industrial scenarios and train a modified FCN (Fully Convolutional Network) on this homemade dataset to infer the semantic segmentation of the input images. We then obtain the point cloud of the workpieces by incorporating the depth information and use it to estimate their pose in real time. To evaluate the accuracy of the solution, we propose a novel pose error evaluation method based on the robot vision system; it does not rely on expensive measuring equipment yet still yields accurate evaluation results. In an industrial scenario, our solution achieves a rotation error of less than two degrees and a translation error of less than 10 mm.
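
The pose-estimation stage back-projects the segmented pixels into a point cloud and registers a workpiece model against it. The sketch below shows that pipeline in a stripped-down form, pinhole back-projection followed by a few nearest-neighbour ICP iterations; the camera intrinsics and the use of plain point-to-point ICP are assumptions rather than the authors' exact registration method.

```python
import numpy as np
from scipy.spatial import cKDTree

def backproject(depth, mask, fx, fy, cx, cy):
    """Turn masked depth pixels into a 3-D point cloud (pinhole camera model)."""
    v, u = np.nonzero(mask)
    z = depth[v, u]
    return np.stack([(u - cx) * z / fx, (v - cy) * z / fy, z], axis=1)

def icp(model, scene, iterations=30):
    """Point-to-point ICP aligning the workpiece model cloud to the scene cloud.
    Returns a 4x4 pose of the model in the camera frame."""
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(scene)
    src = model.copy()
    for _ in range(iterations):
        _, idx = tree.query(src)                 # nearest scene point per model point
        dst = scene[idx]
        # Kabsch step: best rigid transform between src and its current matches.
        mu_s, mu_d = src.mean(0), dst.mean(0)
        U, _, Vt = np.linalg.svd((src - mu_s).T @ (dst - mu_d))
        d = np.sign(np.linalg.det(Vt.T @ U.T))
        Rk = Vt.T @ np.diag([1, 1, d]) @ U.T
        tk = mu_d - Rk @ mu_s
        src = src @ Rk.T + tk
        R, t = Rk @ R, Rk @ t + tk               # accumulate the incremental update
    pose = np.eye(4)
    pose[:3, :3], pose[:3, 3] = R, t
    return pose
```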


2008 ◽  
Vol 05 (01) ◽  
pp. 51-63 ◽  
Author(s):  
JOAQUIN SITTE ◽  
PETRA WINZER

In this paper we use the design of an innovative on-board vision system for a small commercial minirobot to demonstrate the application of the demand compliant design (DeCoDe) method. Vision systems are among the most complex sensor systems in both nature and engineering, and thus provide an excellent arena for testing design methods. A review of current design methods for mechatronic systems shows that none supports or requires a complete description of the product system; the DeCoDe method is a step towards overcoming this deficiency. The minirobot vision system design is carried from the generic vision-system level down to a first refinement for visual navigation.


Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3936
Author(s):  
Yannis Spyridis ◽  
Thomas Lagkas ◽  
Panagiotis Sarigiannidis ◽  
Vasileios Argyriou ◽  
Antonios Sarigiannidis ◽  
...  

Unmanned aerial vehicles (UAVs) in the role of flying anchor nodes have been proposed to assist the localisation of terrestrial Internet of Things (IoT) sensors and provide relay services in the context of the upcoming 6G networks. This paper considered the objective of tracing a mobile IoT device of unknown location, using a group of UAVs that were equipped with received signal strength indicator (RSSI) sensors. The UAVs employed measurements of the target’s radio frequency (RF) signal power to approach the target as quickly as possible. A deep learning model performed clustering in the UAV network at regular intervals, based on a graph convolutional network (GCN) architecture, which utilised information about the RSSI and the UAV positions. The number of clusters was determined dynamically at each instant using a heuristic method, and the partitions were determined by optimising an RSSI loss function. The proposed algorithm retained the clusters that approached the RF source more effectively, removing the rest of the UAVs, which returned to the base. Simulation experiments demonstrated the improvement of this method compared to a previous deterministic approach, in terms of the time required to reach the target and the total distance covered by the UAVs.
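
The clustering model follows a GCN design over the UAV network. The bare-bones PyTorch sketch below shows one such layer with symmetric adjacency normalisation, where the graph is built from UAV proximity and the node features are positions plus RSSI; the feature choice, the proximity threshold, and the omission of the paper's heuristic cluster-count selection and RSSI-loss partitioning head are all assumptions.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """Single graph convolution: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, feats, adj):
        a_hat = adj + torch.eye(adj.size(0))         # add self-loops
        deg = a_hat.sum(dim=1)
        d_inv_sqrt = torch.diag(deg.pow(-0.5))
        norm_adj = d_inv_sqrt @ a_hat @ d_inv_sqrt   # symmetric normalisation
        return torch.relu(norm_adj @ self.lin(feats))

def uav_graph(positions, rssi, radius=100.0):
    """Node features = [x, y, z, RSSI]; edges connect UAVs closer than `radius` metres."""
    feats = torch.cat([positions, rssi.unsqueeze(1)], dim=1)
    dists = torch.cdist(positions, positions)
    adj = ((dists < radius) & (dists > 0)).float()
    return feats, adj

# Toy usage: 8 UAVs -> 2 GCN layers -> node embeddings used for cluster assignment.
pos, rssi = torch.rand(8, 3) * 200, torch.rand(8)
feats, adj = uav_graph(pos, rssi)
emb = GCNLayer(4, 16)(feats, adj)
emb = GCNLayer(16, 8)(emb, adj)
```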


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3046
Author(s):  
Shervin Minaee ◽  
Mehdi Minaei ◽  
Amirali Abdolrashidi

Facial expression recognition has been an active area of research over the past few decades, and it is still challenging due to high intra-class variation. Traditional approaches to this problem rely on hand-crafted features such as SIFT, HOG, and LBP, followed by a classifier trained on a database of images or videos. Most of these works perform reasonably well on datasets of images captured under controlled conditions but fail to perform as well on more challenging datasets with greater image variation and partial faces. In recent years, several works have proposed end-to-end frameworks for facial expression recognition using deep learning models. Despite their better performance, there is still much room for improvement. In this work, we propose a deep learning approach based on an attentional convolutional network that is able to focus on important parts of the face and achieves significant improvement over previous models on multiple datasets, including FER-2013, CK+, FERG, and JAFFE. We also use a visualization technique to find the important facial regions for detecting different emotions, based on the classifier’s output. Through experimental results, we show that different emotions are sensitive to different parts of the face.
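
A minimal PyTorch sketch of the attentional idea, a spatial attention map gating convolutional features before classification and doubling as a visualisation of which facial regions drive each emotion, is given below; the layer sizes and 48x48 grayscale input are illustrative assumptions, not the architecture evaluated in the paper.

```python
import torch
import torch.nn as nn

class AttentionalCNN(nn.Module):
    """Tiny CNN with a spatial attention branch for expression classification."""

    def __init__(self, num_classes=7):
        super().__init__()
        self.features = nn.Sequential(                 # shared feature extractor
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.attention = nn.Sequential(                # per-location weight in [0, 1]
            nn.Conv2d(64, 1, 1), nn.Sigmoid(),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        feats = self.features(x)                       # (B, 64, H/4, W/4)
        attn = self.attention(feats)                   # (B, 1, H/4, W/4)
        gated = feats * attn                           # focus on salient face regions
        pooled = gated.mean(dim=(2, 3))                # global average pooling
        return self.classifier(pooled), attn           # logits + map for visualisation

# Toy usage: a batch of 48x48 grayscale faces (FER-2013-style input is assumed).
logits, attn_map = AttentionalCNN()(torch.randn(4, 1, 48, 48))
```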

