Changing within-trial array location and target object position enhances rats’ (Rattus norvegicus) missing object recognition accuracy
2012, Vol 15 (5), pp. 771-782
Author(s): Marium Arain, Varakini Parameswaran, Jerome Cohen
Sensors, 2021, Vol 21 (5), pp. 1919
Author(s): Shuhua Liu, Huixin Xu, Qi Li, Fei Zhang, Kun Hou

To address the problem of robot object recognition in complex scenes, this paper proposes an object recognition method based on scene text reading. The proposed method simulates human behavior, identifying objects that bear text by reading that text carefully. First, high-accuracy deep learning models are adopted to detect and recognize text from multiple views. Second, datasets comprising 102,000 Chinese and English scene-text images and their inverses are generated. Training on these two datasets improves the F-measure of text detection by 0.4% and the recognition accuracy by 1.26%. Finally, a robot object recognition method based on scene text reading is proposed: the robot detects and recognizes texts in the image and stores the recognition results in a text file. When the user issues a fetching instruction, the robot searches the text files for the corresponding keywords, obtains a confidence score for each candidate object in the scene image, and selects the object with the maximum confidence as the target. The results show that the robot can accurately distinguish objects of arbitrary shape and category, effectively solving the object recognition problem in home environments.
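As a brief illustration of the fetching step described above, the following is a minimal Python sketch of matching a user's instruction keyword against stored text-recognition results and selecting the object with the maximum confidence. The detections structure, the string-similarity scoring, and the example labels are illustrative assumptions, not the authors' implementation.

from difflib import SequenceMatcher

def select_target(instruction_keyword, detections):
    """detections: list of (recognized_text, ocr_confidence) pairs
    gathered from the scene image, e.g. loaded from the robot's text file."""
    best_object, best_score = None, 0.0
    for text, ocr_conf in detections:
        # Combine string similarity with the OCR confidence (assumed scheme).
        similarity = SequenceMatcher(None, instruction_keyword.lower(),
                                     text.lower()).ratio()
        score = similarity * ocr_conf
        if score > best_score:
            best_object, best_score = text, score
    return best_object, best_score

# Example: the instruction "fetch the milk" against three recognized labels.
print(select_target("milk", [("Milk", 0.92), ("Salt", 0.88), ("Mild soap", 0.75)]))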


Electronics, 2020, Vol 9 (2), pp. 210
Author(s): Yi-Chun Du, Muslikhin Muslikhin, Tsung-Han Hsieh, Ming-Shyan Wang

This paper develops a hybrid algorithm combining an adaptive network-based fuzzy inference system (ANFIS) and regions with convolutional neural network features (R-CNN) for stereo vision-based object recognition and manipulation. A stereo camera in an eye-to-hand configuration first captures the image of the target object. Then, the shape, features, and centroid of the object are estimated. Similar pixels are grouped by image segmentation, and similar regions are merged through selective search. The eye-to-hand calibration is based on ANFIS to reduce the computational burden. Experiments with a six-degree-of-freedom (6-DOF) robot arm and gripper demonstrate the effectiveness of the proposed system.
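The segment-and-merge step (grouping similar pixels, then merging similar regions via selective search) can be sketched with OpenCV's built-in selective search. This is a generic stand-in for the paper's pipeline, not its exact implementation, and it assumes the opencv-contrib-python package and a captured frame named scene.png.

import cv2

img = cv2.imread("scene.png")  # assumed frame from the stereo camera

ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
ss.setBaseImage(img)
ss.switchToSelectiveSearchFast()  # trades some recall for speed
rects = ss.process()  # candidate regions as (x, y, w, h)

# Keep the largest proposals as likely object regions; a full system would
# pass these to the R-CNN for classification and to the arm for grasping.
for (x, y, w, h) in sorted(rects, key=lambda r: r[2] * r[3], reverse=True)[:10]:
    cx, cy = x + w / 2, y + h / 2  # per-region centroid estimate
    print(f"region at ({x},{y}), size {w}x{h}, centroid ({cx:.0f},{cy:.0f})")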


2014, Vol 875-877, pp. 1994-1999
Author(s): James Aaron Debono, Gu Fang

For robot applications to proliferate in industry and in unregulated environments, a simple means of programming is required. This paper describes methods for robot Learning from Demonstration (LfD). These methods used an RGB-D sensor to observe demonstrations and finite state machines (FSMs) to derive policies. In particular, an object recognition method was developed that required only a single frame of data for training and was able to perform recognition in real time. A planning method for object grasping was also developed. Experiments with a pick-and-place robot show that the developed methods achieved object recognition accuracy greater than 99% in cluttered scenes and manipulation accuracy below 3 mm in linear motion and 2° in rotation.
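To make the FSM-based policy concrete, here is a minimal Python sketch of a pick-and-place state machine of the kind such methods derive; the states and trigger conditions are illustrative assumptions, not the policy learned in the paper.

from enum import Enum, auto

class State(Enum):
    SEARCH = auto()    # look for the target object in the RGB-D frame
    APPROACH = auto()  # move the gripper toward the detected object
    GRASP = auto()
    PLACE = auto()
    DONE = auto()

def step(state, obs):
    """Advance the FSM one tick given a dict of boolean observations."""
    if state is State.SEARCH and obs["object_visible"]:
        return State.APPROACH
    if state is State.APPROACH and obs["within_grasp_range"]:
        return State.GRASP
    if state is State.GRASP and obs["object_held"]:
        return State.PLACE
    if state is State.PLACE and obs["object_released"]:
        return State.DONE
    return state  # otherwise remain in the current state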


Author(s): Michael S. Brickner, Amir Zvuloni

Thermal imaging (TI) systems transform the distribution of relative temperatures in a scene into a visible TV image. TI images differ significantly from regular TV images. Most TI systems allow their operators to select a preferred polarity, which determines how gray shades represent different temperatures. Polarity may be set to either black hot (BH) or white hot (WH). The present experiments were designed to investigate the effects of polarity on object recognition performance in TI and to compare the object recognition performance of experts and novices. In the first experiment, twenty flight candidates were asked to recognize target objects in 60 dynamic TI recordings taken from two different TI systems. The targets included a variety of human-placed and natural objects. Each subject viewed half the targets in BH and the other half in WH polarity in a balanced experimental design. For 24 of the 60 targets, one direction of polarity produced better performance than the other. Although the direction of superior polarity (BH or WH better) was not consistent across targets, the preferred representation of a given target object was very consistent. For example, vegetation was more readily recognized when presented as dark objects on a brighter background. The results are discussed in terms of the importance of surface determinants versus edge determinants in the recognition of TI objects. In the second experiment, the performance of 10 expert TI users was found to be significantly more accurate, but not much faster, than that of 20 novice subjects.
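Since the two polarities are complementary mappings of temperature to gray level, one can be obtained from the other by photometric inversion. The short Python sketch below illustrates the BH/WH relationship on an 8-bit frame; the file names are assumptions.

import cv2

white_hot = cv2.imread("ti_frame.png", cv2.IMREAD_GRAYSCALE)  # hot = bright
black_hot = 255 - white_hot  # hot = dark: the same scene in opposite polarity
cv2.imwrite("ti_frame_black_hot.png", black_hot)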


2008, Vol 20 (3), pp. 371-388
Author(s): Nurit Gronau, Maital Neta, Moshe Bar

Visual context plays a prominent role in everyday perception. Contextual information can facilitate recognition of objects within scenes by providing predictions about objects that are most likely to appear in a specific setting, along with the locations that are most likely to contain objects in the scene. Is such identity-related (“semantic”) and location-related (“spatial”) contextual knowledge represented separately or jointly as a bound representation? We conducted a functional magnetic resonance imaging (fMRI) priming experiment in which semantic and spatial contextual relations between prime and target object pictures were independently manipulated. This method allowed us to determine whether the two contextual factors affect object recognition with or without interacting, supporting unified versus independent representations, respectively. Results revealed a Semantic × Spatial interaction in reaction times for target object recognition. Namely, significant semantic priming was obtained when targets were positioned in expected (congruent), but not in unexpected (incongruent), locations. fMRI results showed corresponding interactive effects in brain regions associated with semantic processing (inferior prefrontal cortex), visual contextual processing (parahippocampal cortex), and object-related processing (lateral occipital complex). In addition, activation in fronto-parietal areas suggests that attention and memory-related processes might also contribute to the contextual effects observed. These findings indicate that object recognition benefits from associative representations that integrate information about objects' identities and their locations, and directly modulate activation in object-processing cortical regions. Such context frames are useful in maintaining a coherent and meaningful representation of the visual world, and in providing a platform from which predictions can be generated to facilitate perception and action.


Author(s): Shih-Wei Liu, Jen-Yuan (James) Chang

Abstract: With the development of the automation industry, a new industrial model has emerged, and traditional human labor is gradually being replaced by machines. The World Economic Forum (WEF) pointed out in “The Future of Jobs Report 2018” that the world is experiencing a “workplace revolution”, meaning that machines will play an even more important role in the future. In response to this situation, this paper presents techniques for object recognition and tracking on a conveyor using an eye-in-hand gripper, which are useful on production lines for automatic object classification. The eye-in-hand configuration is the most suitable for combined camera and gripper applications because the camera and gripper share the same coordinate frame. Its main advantages are as follows: (1) occlusion avoidance; (2) intuitive teleoperation; (3) images from different angles; and (4) simple calibration. Its main drawback relative to a fixed (eye-to-hand) camera is that the object may fall out of the field of view when the camera is too close to it. In the experiments, the eye-in-hand robotic gripper is used to establish a tracking system that chases the target object. Preliminary results show that the speed of the conveyor can be calculated and that the distance between the robot and the object becomes very small after a period of time, indicating that the tracking system is successful.
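The conveyor-speed calculation mentioned in the results can be illustrated with a minimal Python sketch: speed follows from the tracked object's image displacement between consecutive frames. The pixel-to-millimetre scale and frame rate are assumed calibration values, not figures from the paper.

PIXELS_PER_MM = 4.0   # assumed camera calibration scale
FRAME_RATE_HZ = 30.0  # assumed camera frame rate

def conveyor_speed_mm_s(x_prev_px, x_curr_px):
    """Speed from the object's horizontal displacement over one frame."""
    displacement_mm = (x_curr_px - x_prev_px) / PIXELS_PER_MM
    return displacement_mm * FRAME_RATE_HZ

# Example: the object centroid moved 12 px between consecutive frames.
print(conveyor_speed_mm_s(640, 652))  # -> 90.0 mm/s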


Electronics, 2020, Vol 9 (3), pp. 380
Author(s): Agnese Chiatti, Gianluca Bardaro, Emanuele Bastianelli, Ilaria Tiddi, Prasenjit Mitra, ...

To assist humans with their daily tasks, mobile robots are expected to navigate complex and dynamic environments, presenting unpredictable combinations of known and unknown objects. Most state-of-the-art object recognition methods are unsuitable for this scenario because they require that: (i) all target object classes are known beforehand, and (ii) a vast number of training examples is provided for each class. This evidence calls for novel methods to handle unknown object classes, for which fewer images are initially available (few-shot recognition). One way of tackling the problem is learning how to match novel objects to their most similar supporting example. Here, we compare different (shallow and deep) approaches to few-shot image matching on a novel data set, consisting of 2D views of common object types drawn from a combination of ShapeNet and Google. First, we assess whether the similarity of objects learned from a combination of ShapeNet and Google can scale up to new object classes, i.e., categories unseen at training time. Furthermore, we show how normalising the learned embeddings can impact the generalisation abilities of the tested methods, in the context of two novel configurations: (i) where the weights of a Convolutional two-branch Network are imprinted and (ii) where the embeddings of a Convolutional Siamese Network are L2-normalised.
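The effect of L2 normalisation on embedding matching can be sketched briefly: once embeddings are unit-length, the dot product equals cosine similarity, and a query is matched to its most similar supporting example. The Python sketch below uses random vectors in place of the trained network, so it illustrates the matching rule only, under assumed embedding dimensions.

import numpy as np

rng = np.random.default_rng(0)

def l2_normalise(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Assumed: one support embedding per novel class, plus one query embedding.
support = l2_normalise(rng.normal(size=(5, 128)))  # 5 classes, 128-d
query = l2_normalise(rng.normal(size=(128,)))

# With unit-norm embeddings, the dot product is cosine similarity; the query
# is assigned to its most similar supporting example (few-shot matching).
similarities = support @ query
print("predicted class:", int(np.argmax(similarities)))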

