Visual–tactile object recognition of a soft gripper based on Faster Region-based Convolutional Neural Network and machine learning algorithm

2020, Vol 17 (5), pp. 172988142094872
Author(s): Chenlei Jiao, Binbin Lian, Zhe Wang, Yimin Song, Tao Sun

Object recognition is a prerequisite for a soft gripper to successfully grasp an unknown object. Visual and tactile recognition are two methods commonly used in a grasping system. Visual recognition is limited when the size and weight of objects must be estimated, whereas tactile recognition suffers from low efficiency. A visual–tactile recognition method is proposed in this article to overcome the disadvantages of both. The design and fabrication of a soft gripper incorporating visual and tactile sensors are presented: a Kinect v2 provides visual information, while bending and pressure sensors embedded in the soft fingers provide tactile information. The proposed method proceeds in three steps: initial recognition by vision, detailed recognition by touch, and decision making by data fusion. Experiments show that the visual–tactile method achieves the best results, with the highest average recognition accuracy on daily objects, verifying its feasibility.
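The final fusion step can be illustrated with a minimal sketch. The weighted-average rule, the class names, and the confidence values below are assumptions for illustration, not the paper's exact fusion scheme.

```python
# Decision-level fusion of visual and tactile recognition results.
# Hypothetical illustration: the weighting rule and all class names
# and scores are invented, not taken from the paper.

def fuse(visual_scores, tactile_scores, w_visual=0.5):
    """Combine per-class confidence dicts by weighted averaging."""
    classes = set(visual_scores) | set(tactile_scores)
    fused = {c: w_visual * visual_scores.get(c, 0.0)
                + (1.0 - w_visual) * tactile_scores.get(c, 0.0)
             for c in classes}
    best = max(fused, key=fused.get)  # class with highest fused confidence
    return best, fused

label, scores = fuse({"cup": 0.6, "ball": 0.4}, {"cup": 0.2, "ball": 0.8})
# With equal weights, tactile evidence tips the decision to "ball".
```

A real system would learn the weight from validation data rather than fixing it at 0.5.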

Sensors, 2021, Vol 21 (5), pp. 1919
Author(s): Shuhua Liu, Huixin Xu, Qi Li, Fei Zhang, Kun Hou

To address robot object recognition in complex scenes, this paper proposes an object recognition method based on scene text reading. The proposed method simulates human-like behavior, accurately identifying objects that bear text by reading it. First, deep learning models with high accuracy are adopted to detect and recognize text from multiple views. Second, datasets comprising 102,000 Chinese and English scene text images and their inverses are generated; training on these two datasets improves the F-measure of text detection by 0.4% and the recognition accuracy by 1.26%. Finally, a robot object recognition method based on scene text reading is proposed. The robot detects and recognizes texts in the image and stores the results in a text file. When the user issues a fetching instruction, the robot searches the text files for the corresponding keywords, obtains confidence scores for the objects in the scene image, and selects the object with the maximum confidence as the target. The results show that the robot can accurately distinguish objects of arbitrary shape and category and can effectively solve the problem of object recognition in home environments.
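The fetch-by-keyword step described above can be sketched as follows. The data layout (object id, recognized text, confidence) and all example strings are assumptions for illustration; the paper's storage format is not specified in the abstract.

```python
# Sketch of keyword search over stored OCR results: match the user's
# keyword against recognized texts and pick the highest-confidence
# object. All object ids and texts here are invented examples.

def find_target(keyword, detections):
    """detections: list of (object_id, recognized_text, confidence)."""
    matches = [(oid, conf) for oid, text, conf in detections
               if keyword.lower() in text.lower()]
    if not matches:
        return None  # no object carries the requested keyword
    return max(matches, key=lambda m: m[1])[0]

target = find_target("cola", [
    ("obj1", "Coca-Cola 330ml", 0.91),
    ("obj2", "Pepsi Cola", 0.87),
    ("obj3", "Green Tea", 0.95),
])
# Both cola bottles match; the higher-confidence one wins.
```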


Author(s): Wu Jianxing, Zeng Dexin, Ju Qiaodan, Chang Zixuan, Yu Hai

Background: Motivated by the ability of deep learning algorithms to identify objects and by the detection technology of security inspection equipment, this paper proposes a progressive object recognition method that considers local information of objects.
Methods: First, we construct an X-Base model by cascading multiple convolution and pooling layers to obtain the feature mapping image. We then apply a "segmented convolution, unified recognition" strategy to detect the size of the objects.
Results: Experimental results show that this method can effectively identify the specifications of bags passing through security inspection equipment. Compared with the traditional VGG and progressive VGG recognition methods, the proposed method offers advantages in efficiency and concurrency.
Conclusion: This study provides a method for gradually recognizing objects and can potentially assist operators in identifying prohibited objects.
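A "segmented convolution, unified recognition" style pipeline can be sketched minimally: split the input into segments, extract features per segment, then unify the per-segment results. The toy mean/std extractor below is a stand-in, not the paper's X-Base network.

```python
import numpy as np

# Minimal sketch of segment-then-unify processing: the image is cut
# into horizontal segments, a (stand-in) feature extractor runs on
# each, and the per-segment features are unified by averaging.
# The extractor is a toy, not the cascaded conv/pool X-Base model.

def extract_features(segment):
    return np.array([segment.mean(), segment.std()])

def segmented_recognize(image, n_segments=4):
    segments = np.array_split(image, n_segments, axis=0)
    feats = np.stack([extract_features(s) for s in segments])
    return feats.mean(axis=0)  # unified feature vector

img = np.arange(64, dtype=float).reshape(8, 8)
vec = segmented_recognize(img)
```

Processing segments independently is what enables the concurrency advantage the abstract mentions: each stripe can be handled by a separate worker before the unification step.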


2001, Vol 13 (1), pp. 88-95
Author(s): Kazunori Terada, Takayuki Nakamura, Hideaki Takeda, Tsukasa Ogasawara

In this paper, we propose a new architecture for object recognition based on the concept of "embodiment" as a primitive function for a cognitive robot. We define "embodiment" as the extent of the agent itself, its locomotive ability, and its sensors. Based on this concept, an object is represented by reaching action paths, each of which is a sequence of movements the agent performs to reach the object. Such behavior is acquired by trial and error using visual and tactile information. Visual information is used to obtain a sensorimotor mapping, which represents the relationship between changes in an object's appearance and the agent's movement. Tactile information is used to evaluate the change in the object's physical condition caused by such movement. In this way, the agent can recognize an object regardless of its position and orientation in the environment. To demonstrate the feasibility of our method, we present experimental results from computer simulation.
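The core representational idea, an object identified by the set of action sequences that reach it, can be illustrated in a few lines. The object names, action alphabet, and stored paths below are invented; the paper learns these paths by trial and error rather than hand-coding them.

```python
# Toy illustration of object-as-reaching-paths: each object is a set
# of action sequences that reach it, and an observed path is matched
# against the stored sets. All names and actions here are invented.

OBJECT_PATHS = {
    "cup":  {("fwd", "fwd", "grasp"), ("left", "fwd", "grasp")},
    "book": {("fwd", "right", "grasp")},
}

def recognize_by_path(observed_path):
    """Return the object whose stored reaching paths contain the
    observed action sequence, or None if nothing matches."""
    for obj, paths in OBJECT_PATHS.items():
        if tuple(observed_path) in paths:
            return obj
    return None
```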


2018, Vol 8 (10), pp. 1857
Author(s): Jing Yang, Shaobo Li, Zong Gao, Zheng Wang, Wei Liu

The complexity of the background and the similarities between different types of precision parts, especially under the diverse illumination and high-speed conveyor movement of complex industrial scenes, pose immense challenges for the object recognition of precision parts. This study presents a real-time object recognition method for 0.8 cm darning needles and KR22 bearing machine parts against a complex industrial background. First, we propose an image data augmentation algorithm based on directional flipping and establish two datasets: real data and augmented data. Focusing on increasing recognition accuracy and reducing computation time, we design a multilayer feature fusion network to obtain feature information. We then propose an accurate method for classifying precision parts based on non-maximum suppression, forming an improved You Only Look Once (YOLO) V3 network. We implement this method and compare it with existing models on our real-time industrial object detection platform. Experiments on the real and augmented datasets show that the proposed method outperforms the original YOLO V3 algorithm in recognition accuracy and robustness.
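Non-maximum suppression, the post-processing step the improved YOLO V3 pipeline builds on, has a standard form worth sketching. This is the generic greedy algorithm, not the paper's exact variant, and the boxes and scores are invented examples.

```python
import numpy as np

# Standard greedy non-maximum suppression: keep the highest-scoring
# box, drop any remaining box that overlaps it too much, repeat.
# This generic version illustrates the technique the paper adapts.

def iou(a, b):
    """Intersection-over-union of boxes [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    order = np.argsort(scores)[::-1]  # indices, best score first
    keep = []
    while len(order):
        i = order[0]
        keep.append(int(i))
        # retain only boxes whose overlap with the kept box is small
        order = order[1:][[iou(boxes[i], boxes[j]) <= thresh
                           for j in order[1:]]]
    return keep
```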


2019, Vol 39 (1), pp. 17-25
Author(s): Lin Feng, Yang Liu, Zan Li, Meng Zhang, Feilong Wang, ...

Purpose – The purpose of this paper is to promote the efficiency of RGB-depth (RGB-D)-based object recognition in robot vision and to find discriminative binary representations for RGB-D-based objects.
Design/methodology/approach – To promote the efficiency of RGB-D-based object recognition in robot vision, this paper applies hashing methods to RGB-D-based object recognition, utilizing approximate nearest neighbors (ANN) to vote for the final result. To improve recognition accuracy, an "Encoding+Selection" binary representation generation pattern is proposed, which generates more discriminative binary representations for RGB-D-based objects. Moreover, label information is utilized to enhance the discrimination of each bit, guaranteeing that the most discriminative bits are selected.
Findings – The experimental results validate that the ANN-based voting recognition method is more efficient and effective than traditional recognition methods for RGB-D-based object recognition in robot vision. The effectiveness of the proposed bit selection method is also validated.
Originality/value – Hashing learning is applied to RGB-D-based object recognition, which significantly improves recognition efficiency for robot vision while maintaining high recognition accuracy. Moreover, the "Encoding+Selection" pattern is utilized in the process of binary encoding, effectively enhancing the discrimination of binary representations for objects.
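The ANN-based voting step can be sketched concretely: classify a query by majority vote among its nearest neighbors in Hamming distance over the binary codes. The codes and labels below are invented; the "Encoding+Selection" step that actually produces discriminative bits is not reproduced here.

```python
from collections import Counter

# Sketch of ANN voting over binary hash codes: find the k codes
# closest to the query in Hamming distance, then take a majority
# vote over their labels. All codes and labels are invented.

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def ann_vote(query, database, k=3):
    """database: list of (binary_code, label) pairs."""
    nearest = sorted(database, key=lambda e: hamming(query, e[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

db = [("0001", "mug"), ("0011", "mug"), ("1110", "bowl"), ("1100", "bowl")]
pred = ann_vote("0000", db)
```

Short binary codes make the distance computation cheap, which is where the efficiency gain over exhaustive real-valued matching comes from.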


2020, Vol 64 (4), pp. 40404-1-40404-16
Author(s): I.-J. Ding, C.-M. Ruan

With rapid developments in internet-of-things techniques, smart service applications such as voice-command-based speech recognition and smart care applications such as context-aware emotion recognition will gain much attention and potentially become a requirement in smart home or office environments. In such intelligent applications, identity recognition of a specific member in an indoor space is a crucial issue. In this study, a combined audio-visual identity recognition approach was developed, in which visual information obtained from face detection is incorporated into acoustic Gaussian likelihood calculations to construct speaker classification trees, significantly enhancing the Gaussian mixture model (GMM)-based speaker recognition method. The study considered the privacy of the monitored person and reduced the degree of surveillance; the popular Kinect sensor, which contains a microphone array, was adopted to obtain acoustic voice data. The proposed approach deploys only two cameras in a specific indoor space to conveniently perform face detection and quickly determine the total number of people in the space, and this count is used to regulate the design of an accurate GMM speaker classification tree. Two face-detection-regulated speaker classification tree schemes are presented for the GMM speaker recognition method: the binary speaker classification tree (GMM-BT) and the non-binary speaker classification tree (GMM-NBT). The proposed GMM-BT and GMM-NBT methods achieve identity recognition rates of 84.28% and 83%, respectively; both values are higher than that of the conventional GMM approach (80.5%). Moreover, as the extremely complex face recognition calculations of general audio-visual speaker recognition tasks are not required, the proposed approach is rapid and efficient, with only a slight increment of 0.051 s in average recognition time.
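The underlying scoring step, pick the speaker whose Gaussian model best explains the acoustic features, with face detection narrowing the candidate set, can be sketched minimally. A single diagonal Gaussian stands in for a full GMM here, and all speaker names and parameters are invented.

```python
import math

# Minimal sketch of GMM-style speaker identification: score each
# candidate's Gaussian log-likelihood and pick the best. Face
# detection is assumed to have already narrowed the candidate set.
# One diagonal Gaussian per speaker stands in for a full GMM;
# all names and parameters are invented.

def log_gauss(x, mean, var):
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def identify(features, models, candidates):
    """models: {speaker: (mean, var)}; candidates: speakers the
    cameras confirmed are present in the room."""
    return max(candidates, key=lambda s: log_gauss(features, *models[s]))

models = {
    "alice": ([0.0, 0.0], [1.0, 1.0]),
    "bob":   ([3.0, 3.0], [1.0, 1.0]),
    "carol": ([6.0, 6.0], [1.0, 1.0]),
}
who = identify([2.8, 3.1], models, candidates=["alice", "bob"])
```

Restricting `candidates` to the people the cameras actually see is the face-detection regulation idea: fewer models to score and fewer chances to confuse absent speakers.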


2020
Author(s): Bahareh Jozranjbar, Arni Kristjansson, Heida Maria Sigurdardottir

While dyslexia is typically described as a phonological deficit, recent evidence suggests that ventral stream regions, important for visual categorization and object recognition, are hypoactive in dyslexic readers who might accordingly show visual recognition deficits. By manipulating featural and configural information of faces and houses, we investigated whether dyslexic readers are disadvantaged at recognizing certain object classes or utilizing particular visual processing mechanisms. Dyslexic readers found it harder to recognize objects (houses), suggesting that visual problems in dyslexia are not completely domain-specific. Mean accuracy for faces was equivalent in the two groups, compatible with domain-specificity in face processing. While face recognition abilities correlated with reading ability, lower house accuracy was nonetheless related to reading difficulties even when accuracy for faces was kept constant, suggesting a specific relationship between visual word recognition and the recognition of non-face objects. Representational similarity analyses (RSA) revealed that featural and configural processes were clearly separable in typical readers, while dyslexic readers appeared to rely on a single process. This occurred for both faces and houses and was not restricted to particular visual categories. We speculate that reading deficits in some dyslexic readers reflect their reliance on a single process for object recognition.
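The representational similarity analysis (RSA) logic used here can be sketched in its generic form: build a dissimilarity matrix over conditions from response patterns, then correlate the off-diagonal entries of two such matrices. The random data below are stand-ins, not the study's measurements.

```python
import numpy as np

# Generic RSA sketch: a representational dissimilarity matrix (RDM)
# is built from response patterns, and two RDMs are compared by
# correlating their upper-triangle entries. Data are random stand-ins.

def rdm(patterns):
    """Pairwise correlation-distance matrix for rows of `patterns`."""
    return 1.0 - np.corrcoef(patterns)

def rsa_similarity(rdm_a, rdm_b):
    iu = np.triu_indices_from(rdm_a, k=1)  # off-diagonal upper triangle
    return np.corrcoef(rdm_a[iu], rdm_b[iu])[0, 1]

rng = np.random.default_rng(0)
patterns = rng.normal(size=(5, 20))  # 5 conditions x 20 responses
sim = rsa_similarity(rdm(patterns), rdm(patterns))
# Comparing an RDM with itself yields a similarity of 1.
```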


2011, Vol 121-126, pp. 2141-2145
Author(s): Wei Gang Yan, Chang Jian Wang, Jin Guo

This paper proposes a new image segmentation algorithm to detect flame images in video recorded in an enclosed compartment. To avoid contamination by soot and water vapor, the method first employs the cubic root of four color channels to transform an RGB image into a pseudo-gray one. The latter is then divided into many small stripes (child images), and Otsu's method is employed to segment each child image. Lastly, the processed child images are reassembled into a whole image. A computer program using the OpenCV library was developed, and the new method was compared with commonly used methods such as edge detection and the standard Otsu method. The new method achieves better flame recognition accuracy and can be used to extract flame shapes from experimental video with considerable noise.
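The stripe-wise thresholding idea can be sketched as follows. The Otsu routine is a compact NumPy version (the paper uses OpenCV), the input is a synthetic grayscale image, and the paper's pseudo-gray color transform is not reproduced here.

```python
import numpy as np

# Sketch of stripe-wise Otsu segmentation: split a grayscale image
# into horizontal stripes, threshold each stripe independently with
# Otsu's method, and reassemble the binary masks. Per-stripe
# thresholds adapt to local intensity variation.

def otsu_threshold(gray):
    """Otsu's threshold for a uint8 image: maximize the
    between-class variance over all possible cut points."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    levels = np.arange(256)
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = hist[:t].sum(), hist[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (levels[:t] * hist[:t]).sum() / w0
        mu1 = (levels[t:] * hist[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def stripe_segment(gray, n_stripes=4):
    stripes = np.array_split(gray, n_stripes, axis=0)
    return np.vstack([(s >= otsu_threshold(s)).astype(np.uint8)
                      for s in stripes])
```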

