A Real-Time Apple Targets Detection Method for Picking Robot Based on Improved YOLOv5

2021 ◽  
Vol 13 (9) ◽  
pp. 1619
Author(s):  
Bin Yan ◽  
Pan Fan ◽  
Xiaoyan Lei ◽  
Zhijie Liu ◽  
Fuzeng Yang

The apple target recognition algorithm is one of the core technologies of the apple picking robot. However, most existing apple detection algorithms cannot distinguish between apples occluded by tree branches and apples occluded by other apples. If such an algorithm is applied directly to a picking robot, the apples, the grasping end-effector, and the mechanical picking arm are all likely to be damaged. To address this practical problem and automatically recognize graspable and ungraspable apples in an apple tree image, a lightweight apple target detection method based on an improved YOLOv5s was proposed for the picking robot. Firstly, the BottleneckCSP module was redesigned as a BottleneckCSP-2 module, which replaced the BottleneckCSP module in the backbone of the original YOLOv5s network. Secondly, an SE module, a form of visual attention mechanism, was inserted into the improved backbone network. Thirdly, the fusion mode of the feature maps fed into the medium-size target detection layer of the original YOLOv5s network was improved. Finally, the initial anchor box sizes of the original network were improved. The experimental results indicated that the proposed improved network model could effectively identify graspable apples, which were unoccluded or occluded only by leaves, and ungraspable apples, which were occluded by branches or by other fruits. Specifically, the recognition recall, precision, mAP, and F1 were 91.48%, 83.83%, 86.75%, and 87.49%, respectively, and the average recognition time was 0.015 s per image. Compared with the original YOLOv5s, YOLOv3, YOLOv4, and EfficientDet-D0 models, the mAP of the proposed improved YOLOv5s model increased by 5.05%, 14.95%, 4.74%, and 6.75%, respectively, and the model size was compressed by 9.29%, 94.6%, 94.8%, and 15.3%, respectively. The average recognition speed per image of the proposed improved YOLOv5s model was 2.53, 1.13, and 3.53 times that of EfficientDet-D0, YOLOv4, and YOLOv3, respectively. The proposed method can provide technical support for the real-time, accurate detection of multiple fruit targets by the apple picking robot.
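As an illustration of the kind of channel-attention block referred to above, the following is a minimal PyTorch sketch of a squeeze-and-excitation (SE) module attached to a backbone feature map; the layer names, the reduction ratio, and the example tensor shapes are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal SE block sketch (assumed structure, not the paper's exact module).
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # squeeze: global spatial average
        self.fc = nn.Sequential(                     # excitation: per-channel weights
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                 # reweight feature maps channel-wise

# Example: attach SE to a hypothetical backbone feature map with 128 channels.
feat = torch.randn(1, 128, 40, 40)
print(SEBlock(128)(feat).shape)  # torch.Size([1, 128, 40, 40])
```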

2020 ◽  
Vol 17 (3) ◽  
pp. 172988142093271
Author(s):  
Xiali Li ◽  
Manjun Tian ◽  
Shihan Kong ◽  
Licheng Wu ◽  
Junzhi Yu

To tackle the water surface pollution problem, a vision-based water surface garbage capture robot has been developed in our lab. In this article, we present a modified you only look once v3 (YOLOv3)-based garbage detection method, allowing real-time and high-precision object detection in dynamic aquatic environments. More specifically, to improve real-time detection performance, the detection scales of YOLOv3 are simplified from 3 to 2. In addition, to guarantee detection accuracy, the anchor boxes of our training data set are re-clustered to replace those original YOLOv3 prior anchor boxes that are not appropriate for our data set. By virtue of the proposed detection method, the capture robot has the capability of cleaning floating garbage in the field. Experimental results demonstrate that both the detection speed and accuracy of the modified YOLOv3 are better than those of other object detection algorithms. The obtained results provide valuable insight into the high-speed detection and grasping of dynamic objects in complex aquatic environments, autonomously and intelligently.
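As a sketch of the anchor re-clustering step mentioned above, the snippet below runs k-means on box widths and heights using the common 1 − IoU distance; the toy data, the number of anchors, and the mean-based cluster update are assumptions for illustration rather than the authors' exact settings.

```python
# Assumed k-means anchor re-clustering sketch using IoU between (w, h) pairs.
import numpy as np

def iou_wh(boxes: np.ndarray, anchors: np.ndarray) -> np.ndarray:
    """IoU between boxes and anchors, comparing widths/heights only."""
    inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    union = boxes[:, 0:1] * boxes[:, 1:2] + \
            anchors[None, :, 0] * anchors[None, :, 1] - inter
    return inter / union

def kmeans_anchors(boxes: np.ndarray, k: int, iters: int = 100, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, anchors), axis=1)      # nearest = highest IoU
        new = np.array([boxes[assign == i].mean(axis=0) if np.any(assign == i)
                        else anchors[i] for i in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors[np.argsort(anchors.prod(axis=1))]            # sort by area

# Toy usage: 200 random (width, height) pairs in pixels, 6 anchors.
wh = np.random.default_rng(1).uniform(10, 200, size=(200, 2))
print(kmeans_anchors(wh, k=6))
```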


Sensors ◽  
2019 ◽  
Vol 19 (18) ◽  
pp. 3958
Author(s):  
Seongkyun Han ◽  
Jisang Yoo ◽  
Soonchul Kwon

Vehicle detection is an important research area that provides background information for a diversity of unmanned-aerial-vehicle (UAV) applications. In this paper, we propose a vehicle-detection method using a convolutional-neural-network (CNN)-based object detector. We design our method, DRFBNet300, with a Deeper Receptive Field Block (DRFB) module that enhances the expressiveness of feature maps to detect small objects in UAV imagery. We also propose the UAV-cars dataset, which reflects the composition and angular distortion of vehicles in UAV imagery, to train DRFBNet300. Lastly, we propose a Split Image Processing (SIP) method to improve the accuracy of the detection model. DRFBNet300 achieves 21 mAP at 45 FPS under the MS COCO metric, the highest score among lightweight single-stage methods running in real time. In addition, DRFBNet300, trained on the UAV-cars dataset, obtains the highest AP score at altitudes of 20–50 m. The accuracy gain from applying the SIP method grows as the altitude increases. DRFBNet300 trained on the UAV-cars dataset with the SIP method operates at 33 FPS, enabling real-time vehicle detection.
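The split-image idea lends itself to a simple illustration: run the detector on tiles of the frame and map tile-local boxes back to full-image coordinates. The 2×2 split, the `detect` callable, and the dummy detector below are assumptions for illustration; the paper's SIP procedure may differ in its tiling and merging details.

```python
# Assumed split-image inference sketch: detect per tile, shift boxes back.
import numpy as np

def split_image_detect(image: np.ndarray, detect, rows: int = 2, cols: int = 2):
    """Run `detect(tile) -> [(x1, y1, x2, y2, score), ...]` on each tile and
    return all detections in full-image coordinates."""
    h, w = image.shape[:2]
    th, tw = h // rows, w // cols
    results = []
    for r in range(rows):
        for c in range(cols):
            y0, x0 = r * th, c * tw
            tile = image[y0:y0 + th, x0:x0 + tw]
            for (x1, y1, x2, y2, score) in detect(tile):
                results.append((x1 + x0, y1 + y0, x2 + x0, y2 + y0, score))
    return results

# Toy usage with a dummy detector that "finds" one box per tile.
dummy = lambda tile: [(5, 5, 20, 20, 0.9)]
frame = np.zeros((480, 640, 3), dtype=np.uint8)
print(split_image_detect(frame, dummy))
```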


Author(s):  
Ping Jiang ◽  
Tao Gao

In this paper, an improved paper defect detection method based on a visual attention computation model is presented. First, multi-scale feature maps are extracted by linear filtering. Second, comparative maps are obtained by applying a center-surround difference operator. Third, the saliency map is obtained by combining conspicuity maps, which are in turn gained by combining the multi-scale comparative maps. Last, the seed point for watershed segmentation is determined by competition among salient points in the saliency map, and the defect regions are segmented from the background. Experimental results show the effectiveness of the approach for paper defect detection.
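A compact sketch in the spirit of this pipeline is given below: an intensity Gaussian pyramid stands in for the multi-scale feature maps, center-surround differences are fused into a saliency map, and the strongest saliency peak seeds an OpenCV watershed segmentation. The scale choices, the single intensity channel, and reducing the "competition among salient points" to a single argmax are simplifying assumptions.

```python
# Assumed saliency-plus-watershed sketch for paper defect segmentation.
import cv2
import numpy as np

def saliency_map(gray: np.ndarray, levels: int = 5) -> np.ndarray:
    pyr = [gray.astype(np.float32)]
    for _ in range(levels):
        pyr.append(cv2.pyrDown(pyr[-1]))
    h, w = gray.shape
    sal = np.zeros((h, w), np.float32)
    for c in (1, 2):                                  # "center" scales
        for s in (c + 2, c + 3):                      # coarser "surround" scales
            center = cv2.resize(pyr[c], (w, h))
            surround = cv2.resize(pyr[s], (w, h))
            sal += np.abs(center - surround)          # center-surround difference
    return cv2.normalize(sal, None, 0, 1, cv2.NORM_MINMAX)

def segment_defect(gray: np.ndarray) -> np.ndarray:
    sal = saliency_map(gray)
    markers = np.zeros(gray.shape, np.int32)
    y, x = np.unravel_index(np.argmax(sal), sal.shape)
    markers[y, x] = 2                                 # seed inside most salient region
    markers[0, 0] = 1                                 # background seed at a corner
    bgr = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)
    cv2.watershed(bgr, markers)
    return (markers == 2).astype(np.uint8) * 255

# Toy usage: a flat "paper" image with one dark blotch as the defect.
paper = np.full((200, 200), 230, np.uint8)
cv2.circle(paper, (120, 80), 12, 60, -1)
print(segment_defect(paper).sum() > 0)
```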


2019 ◽  
Vol 2 (5) ◽  
Author(s):  
Tong Wang

The compaction quality of the subgrade is directly related to the service life of the road, and effective control of the subgrade construction process is the key to ensuring compaction quality. Therefore, real-time, comprehensive, rapid, and accurate prediction of compaction quality through an informatized detection method is an important guarantee for speeding up construction progress and ensuring subgrade compaction quality. Starting from the functions required of such a system, this paper puts forward the principles and the development mode used in building the system, and presents the developed system, which operates in real time to achieve whole-process control of subgrade construction quality.


2010 ◽  
Vol 130 (11) ◽  
pp. 2039-2046
Author(s):  
Munetoshi Numada ◽  
Masaru Shimizu ◽  
Takuma Funahashi ◽  
Hiroyasu Koshimizu

IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Jinkang Wang ◽  
Xiaohui He ◽  
Shao Faming ◽  
Guanlin Lu ◽  
Hu Cong ◽  
...  

2021 ◽  
Vol 11 (11) ◽  
pp. 4940
Author(s):  
Jinsoo Kim ◽  
Jeongho Cho

Research on video data faces the difficulty of extracting not only spatial but also temporal features, and human action recognition (HAR) is a representative research field that applies convolutional neural networks (CNNs) to video data. Action recognition performance has improved, but owing to model complexity, limitations on real-time operation still persist. Therefore, a lightweight CNN-based single-stream HAR model that can operate in real time is proposed. The proposed model extracts spatial feature maps by applying a CNN to the frames that compose the video and uses the frame change rate of sequential images as temporal information. The spatial feature maps are weighted-averaged by the frame change rate, transformed into spatiotemporal features, and input into a multilayer perceptron, which has a relatively lower complexity than other HAR models; thus, our method has high utility in a single embedded system connected to CCTV. Evaluation of action recognition accuracy and data processing speed on the challenging UCF-101 action recognition benchmark showed higher accuracy than an HAR model using long short-term memory when given a small number of video frames, and the fast data processing speed confirmed the possibility of real-time operation. In addition, the performance of the proposed weighted-mean-based HAR model was verified by testing it on a Jetson Nano to confirm the possibility of using it in low-cost GPU-based embedded systems.
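To make the weighted-averaging step concrete, here is a minimal PyTorch sketch in which per-frame features are pooled with weights derived from the frame-to-frame change rate and passed to a small MLP; the backbone that would produce the features, the feature dimension, and the exact change-rate definition are illustrative assumptions.

```python
# Assumed change-rate-weighted pooling of per-frame features followed by an MLP.
import torch
import torch.nn as nn

class WeightedMeanHAR(nn.Module):
    def __init__(self, feat_dim: int = 512, num_classes: int = 101):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(inplace=True),
            nn.Linear(256, num_classes),
        )

    def forward(self, frames: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        # frames: (B, T, C, H, W) raw frames; feats: (B, T, D) per-frame CNN features.
        diff = (frames[:, 1:] - frames[:, :-1]).abs().mean(dim=(2, 3, 4))  # (B, T-1)
        change = torch.cat([diff[:, :1], diff], dim=1)      # pad first frame's rate
        w = change / change.sum(dim=1, keepdim=True).clamp_min(1e-8)
        pooled = (feats * w.unsqueeze(-1)).sum(dim=1)       # change-weighted mean
        return self.mlp(pooled)

# Toy usage: batch of 2 clips, 8 frames each, 512-d feature per frame.
model = WeightedMeanHAR()
frames = torch.rand(2, 8, 3, 112, 112)
feats = torch.rand(2, 8, 512)
print(model(frames, feats).shape)   # torch.Size([2, 101])
```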


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1240
Author(s):  
Yang Liu ◽  
Hailong Su ◽  
Cao Zeng ◽  
Xiaoli Li

In complex scenes, it is a huge challenge to accurately detect motion-blurred, tiny, and dense objects in thermal infrared images. To solve this problem, a robust thermal infrared vehicle and pedestrian detection method is proposed in this paper. An important weight parameter β is first proposed to reconstruct the loss function of the feature selective anchor-free (FSAF) module in its online feature selection process, and the FSAF module is optimized to enhance the detection performance for motion-blurred objects. The proposed parameter β provides an effective solution to the challenge of motion-blurred object detection. Then, the optimized anchor-free branches of the FSAF module are plugged into the YOLOv3 single-shot detector and work jointly with the anchor-based branches of the YOLOv3 detector in both training and inference, which efficiently improves the detection precision of the detector for tiny and dense objects. Experimental results show that the proposed method is superior to other typical thermal infrared vehicle and pedestrian detection algorithms, achieving a mean average precision (mAP) of 72.2%.
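One plausible reading of the β-weighted loss is sketched below: for each instance, the classification and regression losses of every pyramid level are combined with weight β, and the level with the smallest combined loss is selected online. The specific combination rule, the stand-in loss values, and the value of β are assumptions, not the paper's exact formulation.

```python
# Assumed sketch of beta-weighted online level selection (FSAF-style).
import torch

def level_selection_loss(cls_losses: torch.Tensor,
                         reg_losses: torch.Tensor,
                         beta: float = 0.5) -> int:
    """cls_losses, reg_losses: per-level losses for one instance, shape (L,).
    Returns the index of the pyramid level minimizing the beta-weighted sum."""
    total = beta * cls_losses + (1.0 - beta) * reg_losses
    return int(torch.argmin(total).item())

# Toy usage with three pyramid levels.
cls = torch.tensor([0.9, 0.6, 0.7])
reg = torch.tensor([0.4, 0.8, 0.3])
print(level_selection_loss(cls, reg, beta=0.5))  # selects level 2 here
```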

