An improved YOLO network for unopened cotton boll detection in the field

2021 ◽  
pp. 1-14
Author(s):  
Yan Zhang ◽  
Gongping Yang ◽  
Yikun Liu ◽  
Chong Wang ◽  
Yilong Yin

Detection of cotton bolls in field environments is a crucial technique for many precision agriculture applications, including yield estimation, disease and pest recognition, and automatic harvesting. Because of complex conditions, such as different growth periods and occlusion among leaves and bolls, detection in the field is a considerably challenging task. Despite this, the development of deep learning technologies has shown great potential to solve it effectively. In this work, we propose an improved YOLOv5 network that combines DenseNet, an attention mechanism and Bi-FPN to detect unopened cotton bolls in the field accurately and at lower cost. In addition, because cotton bolls are generally small, we modify the network architecture to obtain larger feature maps from shallower layers, enhancing the ability to detect them. We collected image data of cotton at Aodu Farm in Xinjiang Province, China, and established a dataset containing 616 high-resolution images. The experimental results show that the proposed method is superior to the original YOLOv5 model and to other methods such as YOLOv3, SSD and Faster R-CNN when detection accuracy, computational cost, model size and speed are considered jointly. Cotton boll detection can be further applied for purposes such as yield prediction and early identification of diseases and pests, which can help farmers take timely countermeasures, reduce crop losses and thereby increase production.
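
The abstract does not include implementation details; as one concrete piece, the following is a minimal PyTorch sketch of the learnable weighted feature fusion popularized by Bi-FPN, one of the components combined with YOLOv5 above. The module name, channel count and feature-map sizes are illustrative assumptions, not the authors' code.

```python
# A minimal sketch of Bi-FPN-style weighted fusion of a shallow
# high-resolution map with a deeper low-resolution map (assumed shapes).
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFusion(nn.Module):
    """Fuse two feature maps with learnable, normalized positive weights."""
    def __init__(self, channels, eps=1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(2))   # one scalar weight per input
        self.eps = eps
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, shallow, deep):
        # Upsample the deeper (smaller) map to the shallow map's resolution,
        # preserving the detail that matters for small objects like bolls.
        deep = F.interpolate(deep, size=shallow.shape[-2:], mode="nearest")
        w = F.relu(self.w)
        w = w / (w.sum() + self.eps)           # fast normalized fusion
        return self.conv(w[0] * shallow + w[1] * deep)

fuse = WeightedFusion(channels=128)
p3 = torch.randn(1, 128, 80, 80)   # shallow, high-resolution level
p4 = torch.randn(1, 128, 40, 40)   # deeper, lower-resolution level
print(fuse(p3, p4).shape)          # torch.Size([1, 128, 80, 80])
```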

Author(s):  
Lifang Zhou ◽  
Guang Deng ◽  
Weisheng Li ◽  
Jianxun Mi ◽  
Bangjun Lei

Current state-of-the-art detectors achieve impressive detection accuracy through deep learning. However, most such detectors cannot detect objects in real time due to their heavy computational cost, which limits their wide application. Although some one-stage detectors are designed to accelerate detection, their speed is still unsatisfactory for tasks on high-resolution remote sensing images. To address this problem, a lightweight one-stage approach based on YOLOv3, named Squeeze-and-Excitation YOLOv3 (SE-YOLOv3), is proposed in this paper. The proposed algorithm maintains high efficiency and effectiveness simultaneously. To reduce the number of parameters and strengthen feature description, two customized modules, lightweight feature extraction and attention-aware feature augmentation, are embedded; they exploit global information and suppress redundant features, respectively. To achieve scale invariance, a spatial pyramid pooling method is used to aggregate local features. Evaluation experiments on two remote sensing image datasets, DOTA and NWPU VHR-10, reveal that the proposed approach achieves more competitive detection with less computational consumption.
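
As an illustration of the attention idea behind SE-YOLOv3, here is a minimal PyTorch sketch of a standard Squeeze-and-Excitation block, which recalibrates channels using global information; the reduction ratio and names are assumptions, and the paper's customized modules may differ from this textbook form.

```python
# A minimal Squeeze-and-Excitation block (standard formulation).
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))           # squeeze: global average pooling
        s = self.fc(s).view(b, c, 1, 1)  # excitation: per-channel weights
        return x * s                     # recalibrate the feature maps

se = SEBlock(64)
x = torch.randn(2, 64, 32, 32)
print(se(x).shape)   # torch.Size([2, 64, 32, 32])
```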


Agriculture ◽  
2021 ◽  
Vol 11 (12) ◽  
pp. 1190
Author(s):  
Lifa Fang ◽  
Yanqiang Wu ◽  
Yuhua Li ◽  
Hongen Guo ◽  
Hua Zhang ◽  
...  

Consistent ginger shoot orientation helps to ensure consistent ginger emergence and meet shading requirements. YOLO v3 can be used to recognize ginger shoots and seeds in images, addressing the difficulty current ginger seeders have in meeting these agronomic requirements; however, it is not suitable for direct deployment on edge computing devices due to its high computational cost. To make the network more compact and to address the problems of low detection accuracy and long inference time, this study proposes an improved YOLO v3 model in which redundant channels and network layers are pruned to achieve real-time detection of ginger shoots and seeds. The test results showed that pruning reduced the model size by 87.2% and improved detection speed by 85%, while the mean average precision (mAP) for ginger shoots and seeds reached 98.0%, only 0.1% lower than before pruning. Moreover, after the model was deployed to a Jetson Nano, its mAP was 97.94%, its recognition accuracy reached 96.7%, and its detection speed reached 20 frames·s⁻¹. These results show that the proposed method is feasible for real-time, accurate detection of ginger images, providing a solid foundation for automatic and accurate ginger seeding.
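
The abstract does not state the pruning criterion. A common choice for compacting YOLO-family networks is channel selection by batch-norm scale factors ("network slimming"); the sketch below illustrates that idea under the assumption that a similar criterion is used, with an illustrative keep ratio.

```python
# A minimal sketch of channel pruning driven by batch-norm gamma magnitudes
# (network-slimming style; the paper's exact criterion is an assumption).
import torch
import torch.nn as nn

def select_channels(bn: nn.BatchNorm2d, keep_ratio: float) -> torch.Tensor:
    """Return indices of the channels whose |gamma| is largest."""
    gamma = bn.weight.detach().abs()
    k = max(1, int(gamma.numel() * keep_ratio))
    return torch.argsort(gamma, descending=True)[:k].sort().values

bn = nn.BatchNorm2d(64)
idx = select_channels(bn, keep_ratio=0.25)   # keep 16 of 64 channels
print(idx.numel())
# The conv layers feeding and consuming this BN are then rebuilt with only
# the kept channels, followed by fine-tuning to recover accuracy.
```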


2019 ◽  
Vol 11 (24) ◽  
pp. 2939 ◽  
Author(s):  
Lonesome Malambo ◽  
Sorin Popescu ◽  
Nian-Wei Ku ◽  
William Rooney ◽  
Tan Zhou ◽  
...  

Small unmanned aerial systems (UAS) have emerged as high-throughput platforms for collecting high-resolution image data over large crop fields to support precision agriculture and plant breeding research. At the same time, the improved efficiency of image capture is producing massive datasets, which pose analysis challenges in providing the needed phenotypic data. To complement these high-throughput platforms, there is an increasing need in crop improvement for robust image analysis methods that can handle large amounts of image data. Approaches based on deep learning models are currently the most promising and show unparalleled performance in analyzing large image datasets. This study developed and applied an image analysis approach based on a SegNet deep learning semantic segmentation model to estimate sorghum panicle counts, critical phenotypic data in sorghum crop improvement, from UAS images over selected sorghum experimental plots. The SegNet model was trained to semantically segment UAS images into sorghum panicles, foliage and exposed ground using 462 labeled images of 250 × 250 pixels; the trained model was then applied to a field orthomosaic to generate a field-level semantic segmentation. Individual panicle locations were obtained after post-processing the segmentation output to remove small objects and split merged panicles. A comparison between model panicle count estimates and manually digitized panicle locations in 60 randomly selected plots showed an overall detection accuracy of 94%. A per-plot panicle count comparison also showed high agreement between estimated and reference counts (Spearman correlation ρ = 0.88, mean bias = 0.65). Panicle detection errors stemmed mainly from misclassifications during the semantic segmentation step and mosaicking errors in the field orthomosaic. Overall, the deep learning semantic segmentation approach shows good promise; with a larger labeled dataset and extensive hyper-parameter tuning, it should provide even more robust and effective characterization of sorghum panicle counts.
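
The post-processing step (remove small objects, then count) can be made concrete with a short sketch using connected-component labeling; the area threshold below is an illustrative assumption, and the splitting of merged panicles is omitted.

```python
# A minimal sketch of post-processing a panicle-class mask: drop small
# blobs and count the remaining connected components.
import numpy as np
from scipy import ndimage

def count_panicles(panicle_mask: np.ndarray, min_area: int = 50) -> int:
    """panicle_mask: boolean array, True where a pixel is labeled panicle."""
    labels, n = ndimage.label(panicle_mask)          # connected components
    areas = ndimage.sum(panicle_mask, labels, index=range(1, n + 1))
    return int((areas >= min_area).sum())            # keep only large blobs

mask = np.zeros((100, 100), dtype=bool)
mask[10:20, 10:20] = True    # one panicle-sized blob (100 px)
mask[50:52, 50:52] = True    # small noise blob (4 px), filtered out
print(count_panicles(mask))  # 1
```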


2019 ◽  
Vol 11 (10) ◽  
pp. 1157 ◽  
Author(s):  
Jorge Fuentes-Pacheco ◽  
Juan Torres-Olivares ◽  
Edgar Roman-Rangel ◽  
Salvador Cervantes ◽  
Porfirio Juarez-Lopez ◽  
...  

Crop segmentation is an important task in precision agriculture, where the use of aerial robots with an on-board camera has contributed to the development of new solution alternatives. We address the problem of fig plant segmentation in top-view RGB (Red-Green-Blue) images of a crop grown under difficult open-field circumstances: complex lighting conditions and non-ideal crop maintenance practices defined by local farmers. We present a Convolutional Neural Network (CNN) with an encoder-decoder architecture that classifies each pixel as crop or non-crop using only raw colour images as input. Our approach achieves a mean accuracy of 93.85% despite the complexity of the background and the highly variable visual appearance of the leaves. We make our CNN code available to the research community, as well as the aerial image dataset and a hand-made ground-truth segmentation with pixel precision, to facilitate comparison among different algorithms.
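
To make the encoder-decoder, per-pixel classification idea concrete, here is a deliberately tiny PyTorch sketch; depths, channel counts and layer choices are illustrative assumptions and do not reproduce the paper's released network.

```python
# A minimal encoder-decoder that emits one crop/non-crop logit per pixel.
import torch
import torch.nn as nn

class TinyEncoderDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(                       # downsample twice
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.dec = nn.Sequential(                       # upsample back
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(16, 1, 2, stride=2),     # 1 logit per pixel
        )

    def forward(self, x):
        return self.dec(self.enc(x))    # (B, 1, H, W) crop/non-crop logits

net = TinyEncoderDecoder()
rgb = torch.randn(1, 3, 128, 128)       # raw colour image as sole input
print(torch.sigmoid(net(rgb)).shape)    # torch.Size([1, 1, 128, 128])
```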


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2611
Author(s):  
Andrew Shepley ◽  
Greg Falzon ◽  
Christopher Lawson ◽  
Paul Meek ◽  
Paul Kwan

Image data is one of the primary sources of ecological data used in biodiversity conservation and management worldwide. However, classifying and interpreting large numbers of images is expensive in time and resources, particularly in the context of camera trapping. Deep learning models have been used for this task but are often not suited to specific applications because of inconsistent performance and an inability to generalise to new environments. Models need to be developed for specific species cohorts and environments, but the technical skills required to do so are a key barrier to the accessibility of this technology for ecologists. Thus, there is a strong need to democratise access to deep learning by providing an easy-to-use software application that allows non-technical users to train custom object detectors. U-Infuse addresses this issue by enabling ecologists to train customised models using publicly available images and/or their own images without specific technical expertise. Auto-annotation and annotation-editing functionalities minimise the burden of manually annotating and pre-processing large numbers of images. U-Infuse is a free and open-source software solution that supports both multi-class and single-class training and object detection, allowing ecologists to access deep learning technologies usually available only to computer scientists, on their own device, customised for their application, without sharing intellectual property or sensitive data. It provides ecological practitioners with the ability to (i) easily perform object detection within a user-friendly GUI, generating a species distribution report and other useful statistics; (ii) custom-train deep learning models using publicly available and custom training data; and (iii) perform supervised auto-annotation of images for further training, with the option of editing annotations to ensure quality datasets. Broad adoption of U-Infuse by ecological practitioners will improve ecological image analysis and processing by allowing significantly more image data to be processed with minimal expenditure of time and resources, particularly for camera trap images. Ease of training and the use of transfer learning mean that domain-specific models can be trained rapidly and updated frequently without computer science expertise or data sharing, protecting intellectual property and privacy.


Author(s):  
Xuewu Zhang ◽  
Yansheng Gong ◽  
Chen Qiao ◽  
Wenfeng Jing

This article focuses on the most common types of malfunctions in the overhead contact systems of high-speed railways, namely unstressed droppers, foreign-body invasions, and pole number-plate malfunctions, and establishes a deep-network detection model for them. By fusing the feature maps of the shallow and deep layers of the pretraining network, global and local features of the malfunction area are combined to enhance the network's ability to identify small objects. Further, to share the fully connected layers of the pretraining network and reduce model complexity, Tucker tensor decomposition is used to extract features from the fused feature map, which greatly reduces training time. In experiments on images collected on the Lanxin railway line, the proposed multiview Faster R-CNN based on tensor decomposition showed a lower miss probability and higher detection accuracy for the three fault types. Compared with the object-detection methods YOLOv3, SSD, and the original Faster R-CNN, the average miss probability of the improved Faster R-CNN model is decreased by 37.83%, 51.27%, and 43.79%, respectively, and the average detection accuracy is increased by 3.6%, 9.75%, and 5.9%, respectively.
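
As background on the decomposition step, the sketch below compresses a feature tensor Tucker-style via truncated HOSVD, using NumPy only; the ranks and solver are illustrative assumptions, since the abstract does not specify them.

```python
# A minimal Tucker-style compression of a C x H x W feature map:
# per-mode bases from truncated SVDs, then a small core tensor.
import numpy as np

def mode_unfold(t, mode):
    """Matricize tensor t along the given mode."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def tucker_factors(t, ranks):
    """Leading left singular vectors of each mode unfolding."""
    return [np.linalg.svd(mode_unfold(t, m), full_matrices=False)[0][:, :r]
            for m, r in enumerate(ranks)]

def tucker_core(t, factors):
    """Project the tensor onto the per-mode bases (mode products)."""
    core = t
    for m, U in enumerate(factors):
        core = np.moveaxis(
            np.tensordot(U.T, np.moveaxis(core, m, 0), axes=1), 0, m)
    return core

feat = np.random.randn(256, 14, 14)         # fused C x H x W feature map
U = tucker_factors(feat, ranks=(64, 7, 7))  # per-mode projection bases
core = tucker_core(feat, U)
print(core.shape)                           # (64, 7, 7) compact features
```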


2021 ◽  
Vol 13 (14) ◽  
pp. 2656
Author(s):  
Furong Shi ◽  
Tong Zhang

Deep-learning technologies, especially convolutional neural networks (CNNs), have achieved great success in building extraction from aerial images. However, shape details are often lost during down-sampling, which results in discontinuous segmentation or inaccurate segmentation boundaries. To compensate for this loss of shape information, two shape-related auxiliary tasks (boundary prediction and distance estimation) are learned jointly with the building segmentation task in our proposed network. Meanwhile, two consistency-constraint losses were designed on top of the multi-task network to exploit the duality between the mask prediction and the two shape-related predictions. Specifically, an atrous spatial pyramid pooling (ASPP) module was appended to the top of the encoder of a U-shaped network to obtain multi-scale features. Based on these features, one regression loss and two classification losses were used for predicting the distance-transform map, the segmentation mask, and the boundary. Two inter-task consistency-loss functions were constructed to ensure consistency between distance maps and masks, and between masks and boundary maps. Experimental results on three public aerial image datasets showed that our method achieves superior performance over recent state-of-the-art models.
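
To illustrate what a mask-to-boundary consistency term can look like, here is a minimal PyTorch sketch that derives a soft boundary from the predicted mask (morphological gradient via pooling) and penalizes disagreement with the boundary head. The paper's exact formulation is not given in the abstract, so this is an assumed stand-in.

```python
# A minimal inter-task consistency loss between mask and boundary heads.
import torch
import torch.nn.functional as F

def mask_to_boundary(mask_prob, kernel=3):
    """Soft morphological gradient: dilation minus erosion of the mask."""
    pad = kernel // 2
    dil = F.max_pool2d(mask_prob, kernel, stride=1, padding=pad)
    ero = -F.max_pool2d(-mask_prob, kernel, stride=1, padding=pad)
    return dil - ero

def consistency_loss(mask_logits, boundary_logits):
    mask_prob = torch.sigmoid(mask_logits)
    boundary_prob = torch.sigmoid(boundary_logits)
    # Boundaries implied by the mask should match the boundary prediction.
    return F.mse_loss(mask_to_boundary(mask_prob), boundary_prob)

m = torch.randn(2, 1, 64, 64, requires_grad=True)   # mask head output
b = torch.randn(2, 1, 64, 64, requires_grad=True)   # boundary head output
print(consistency_loss(m, b).item())
```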


2021 ◽  
Vol 13 (11) ◽  
pp. 2171
Author(s):  
Yuhao Qing ◽  
Wenyi Liu ◽  
Liuyan Feng ◽  
Wanjia Gao

Despite significant progress in object detection, target detection in remote sensing images remains challenging owing to complex backgrounds, large differences in target sizes, and the uneven distribution of rotated objects. In this study, we consider model accuracy, inference speed, and detection of objects at any angle together. We propose a RepVGG-YOLO network that uses an improved RepVGG model as the backbone feature extraction network, performing the initial feature extraction from the input image while balancing training accuracy and inference speed. We use an improved feature pyramid network (FPN) and path aggregation network (PANet) to reprocess the features output by the backbone. The FPN and PANet modules integrate feature maps of different layers, combine context information on multiple scales, accumulate multiple features, and strengthen feature information extraction. Finally, to maximize detection accuracy for objects of all sizes, we use four detection scales at the network output, enhancing feature extraction for small remote sensing targets. To handle objects at arbitrary angles, we improve the classification loss using circular smooth label technology, turning the angle regression problem into a classification problem and increasing the detection accuracy for rotated objects. We conducted experiments on two public datasets, DOTA and HRSC2016; the results show that the proposed method performs better than previous methods.
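
The circular smooth label idea can be shown in a few lines: the target angle is discretized into bins and softened with a circular window so that neighbouring (and wrap-around) bins receive partial credit, making the classification loss tolerant of small angular errors. The bin count and window width below are illustrative assumptions.

```python
# A minimal circular smooth label (CSL) encoding with a Gaussian window.
import numpy as np

def circular_smooth_label(angle_deg, num_bins=180, sigma=4.0):
    centers = np.arange(num_bins) * (360.0 / num_bins)
    # Circular distance between each bin center and the target angle.
    d = np.abs(centers - angle_deg % 360.0)
    d = np.minimum(d, 360.0 - d)
    label = np.exp(-d ** 2 / (2 * sigma ** 2))   # smooth window
    return label / label.max()                   # peak of 1 at true bin

lbl = circular_smooth_label(358.0)
# Peak lands near 358 deg; bins just past 0 deg still get credit because
# the distance wraps around the circle.
print(lbl.argmax() * 2, np.round(lbl[:3], 3))
```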


2020 ◽  
Vol 13 (1) ◽  
pp. 23
Author(s):  
Wei Zhao ◽  
William Yamada ◽  
Tianxin Li ◽  
Matthew Digman ◽  
Troy Runge

In recent years, precision agriculture has been researched as a promising means of increasing crop production with fewer inputs to meet the growing demand for agricultural products. Computer vision-based crop detection with unmanned aerial vehicle (UAV)-acquired images is a critical tool for precision agriculture. However, object detection using deep learning algorithms relies on a significant amount of manually prelabeled training data as ground truth. Field object detection, such as bale detection, is especially difficult because of (1) long-period image acquisition under different illumination conditions and seasons; (2) limited existing prelabeled data; and (3) few pretrained models and little prior research to use as references. This work increases bale detection accuracy under limited data collection and labeling by building an innovative algorithm pipeline. First, an object detection model is trained using 243 images captured under good illumination conditions in fall from croplands. Then, domain adaptation (DA), a kind of transfer learning, is applied to synthesize training data under diverse environmental conditions with automatic labels. Finally, the object detection model is fine-tuned with the synthesized datasets. The case study shows that the proposed method improves bale detection performance, raising the recall, mean average precision (mAP), and F measure (F1 score) from averages of 0.59, 0.7, and 0.7 (object detection alone) to averages of 0.93, 0.94, and 0.89 (object detection + DA), respectively. This approach can be easily scaled to many other crop-field objects and will contribute significantly to precision agriculture.
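
As a simple stand-in for the synthesis step (the abstract does not detail the DA method), the sketch below perturbs illumination and colour balance of a labeled source image to mimic different seasons and times of day; because the geometry is untouched, the original bounding boxes carry over as automatic labels. All parameter ranges are illustrative assumptions.

```python
# A minimal sketch of label-preserving synthesis of illumination variants.
import numpy as np

def synthesize_variant(image, rng):
    """image: HxWx3 uint8. Random global brightness and per-channel tint."""
    img = image.astype(np.float32)
    brightness = rng.uniform(0.5, 1.4)         # dawn/dusk vs. midday
    tint = rng.uniform(0.85, 1.15, size=3)     # seasonal colour shift
    img = img * brightness * tint
    return np.clip(img, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
src = rng.integers(0, 255, size=(256, 256, 3), dtype=np.uint8)
variants = [synthesize_variant(src, rng) for _ in range(5)]
# Each variant inherits the source image's bale boxes unchanged.
print(len(variants), variants[0].shape)
```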


2020 ◽  
Vol 34 (14n16) ◽  
pp. 2040121 ◽  
Author(s):  
Zhi-Xian Ye ◽  
Qian Chen ◽  
Bing-Hua Li ◽  
Jian-Feng Zou ◽  
Yao Zheng

Vortex identification is important for understanding the physical mechanisms of turbulent flow. Common vortex identification techniques based on the velocity gradient tensor, such as the [Formula: see text] criterion, consume substantial computing resources when processing large quantities of experimental data. To improve vortex identification efficiency and achieve real-time recognition, we present a novel vortex identification method using segmentation with a convolutional neural network (CNN) based on flow field image data, named "Butterfly-CNN". Considering that the view of the flow field is small, it is necessary to integrate both local and global feature maps to achieve higher precision. The architecture consists of an encoder-decoder path, similar to U-Net but with a different superimposed network part. In the Butterfly-CNN, cross-expanding paths are designed to carry global information and enable precise localization: the feature maps after each convolution are treated as original pictures, convolved to the size of the last feature map, and upsampled to the original size again. Finally, the decoded and cross-expanding networks are added up. The Butterfly-CNN can be trained end-to-end from a few images and is useful and efficient for vortex identification.
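
The "added up" wording suggests additive rather than concatenated skip paths; the PyTorch sketch below shows that idea in miniature, with encoder features projected, upsampled to output resolution, and summed with the decoder output. Channel counts, depths and the input field are illustrative assumptions, not the Butterfly-CNN itself.

```python
# A minimal segmentation net whose skip path is summed with the decoder.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdditiveSkipSeg(nn.Module):
    def __init__(self, classes=2):
        super().__init__()
        self.e1 = nn.Conv2d(1, 16, 3, stride=2, padding=1)     # encoder 1
        self.e2 = nn.Conv2d(16, 32, 3, stride=2, padding=1)    # encoder 2
        self.d = nn.ConvTranspose2d(32, classes, 4, stride=4)  # decoder
        self.s1 = nn.Conv2d(16, classes, 1)                    # cross path

    def forward(self, x):
        f1 = F.relu(self.e1(x))
        f2 = F.relu(self.e2(f1))
        out = self.d(f2)                                  # decoded logits
        skip = F.interpolate(self.s1(f1), size=out.shape[-2:])
        return out + skip                                 # paths added up

net = AdditiveSkipSeg()
field = torch.randn(1, 1, 64, 64)   # e.g. a flow-field image channel
print(net(field).shape)             # torch.Size([1, 2, 64, 64])
```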

