YOLO-Tomato: A Robust Algorithm for Tomato Detection Based on YOLOv3

Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 2145 ◽  
Author(s):  
Guoxu Liu ◽  
Joseph Christian Nouaze ◽  
Philippe Lyonel Touko Mbouembe ◽  
Jae Ho Kim

Automatic fruit detection is an essential capability for harvesting robots. However, complicated environmental conditions, such as illumination variation, branch and leaf occlusion, and tomato overlap, make fruit detection very challenging. In this study, an improved tomato detection model called YOLO-Tomato, based on YOLOv3, is proposed to deal with these problems. A dense architecture is incorporated into YOLOv3 to facilitate the reuse of features and help learn a more compact and accurate model. Moreover, the model replaces the traditional rectangular bounding box (R-Bbox) with a circular bounding box (C-Bbox) for tomato localization. The new bounding boxes match the tomatoes more precisely, improving the Intersection-over-Union (IoU) calculation used in Non-Maximum Suppression (NMS), and they require fewer prediction coordinates. An ablation study demonstrated the efficacy of these modifications. YOLO-Tomato was compared to several state-of-the-art detection methods and achieved the best detection performance.
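As an illustration (not the paper's code), the IoU of two circular bounding boxes, each given as a center and radius, can be computed in closed form from the circle-circle intersection area:

```python
import math

def circle_iou(c1, c2):
    """IoU of two circular boxes given as (x, y, r)."""
    x1, y1, r1 = c1
    x2, y2, r2 = c2
    d = math.hypot(x2 - x1, y2 - y1)
    a1, a2 = math.pi * r1 ** 2, math.pi * r2 ** 2
    if d >= r1 + r2:                 # disjoint circles
        inter = 0.0
    elif d <= abs(r1 - r2):          # one circle contains the other
        inter = min(a1, a2)
    else:                            # overlap region is a lens of two segments
        inter = (r1 ** 2 * math.acos((d ** 2 + r1 ** 2 - r2 ** 2) / (2 * d * r1))
                 + r2 ** 2 * math.acos((d ** 2 + r2 ** 2 - r1 ** 2) / (2 * d * r2))
                 - 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2)
                                   * (d - r1 + r2) * (d + r1 + r2)))
    return inter / (a1 + a2 - inter) if inter else 0.0
```

A C-Bbox needs only three coordinates (x, y, r) instead of the four of an R-Bbox, which is the "fewer prediction coordinates" point made in the abstract.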

Author(s):  
Yuliang Liu ◽  
Sheng Zhang ◽  
Lianwen Jin ◽  
Lele Xie ◽  
Yaqiang Wu ◽  
...  

Scene text in the wild commonly exhibits highly variable characteristics. Using a quadrilateral bounding box to localize a text instance is nearly indispensable for detection methods. However, recent research reveals that introducing quadrilateral bounding boxes for scene text detection brings an easily overlooked label-confusion issue that may significantly undermine detection performance. To address this issue, in this paper, we propose a novel method called Sequential-free Box Discretization (SBD), which discretizes the bounding box into key edges (KE) and can further derive more effective methods to improve detection performance. Experiments showed that the proposed method can outperform state-of-the-art methods on many popular scene text benchmarks, including ICDAR 2015, MLT, and MSRA-TD500. An ablation study also showed that simply integrating SBD into the Mask R-CNN framework substantially improves detection performance. Furthermore, an experiment on the general object dataset HRSC2016 (multi-oriented ships) showed that our method can outperform recent state-of-the-art methods by a large margin, demonstrating its powerful generalization ability.


2018 ◽  
Vol 2018 ◽  
pp. 1-10 ◽  
Author(s):  
Zhongmin Liu ◽  
Zhicai Chen ◽  
Zhanming Li ◽  
Wenjin Hu

In recent years, techniques based on deep detection models have achieved overwhelming improvements in detection accuracy, which makes them well suited for applications such as pedestrian detection. However, speed and accuracy are contradictory goals that have long puzzled researchers, and achieving a good trade-off between them is a problem that must be considered when designing detectors. To this end, we employ the general detector YOLOv2, a state-of-the-art method for general detection tasks, for pedestrian detection. We then modify the network parameters and structure according to the characteristics of pedestrians, making the method more suitable for detecting them. Experimental results on the INRIA pedestrian detection dataset show that it achieves a fairly high detection speed with a small precision gap compared with state-of-the-art pedestrian detection methods. Furthermore, we add weak semantic segmentation networks after the shared convolution layers to highlight pedestrians, and employ a scale-aware structure in our model to handle the wide range of pedestrian sizes in the Caltech pedestrian detection dataset, which yields further gains over the original improvements.


2018 ◽  
Vol 232 ◽  
pp. 04036
Author(s):  
Jun Yin ◽  
Huadong Pan ◽  
Hui Su ◽  
Zhonggeng Liu ◽  
Zhirong Peng

We propose an object detection method that predicts oriented bounding boxes (OBB) to estimate objects' locations, scales, and orientations, based on YOLO (You Only Look Once), one of the top detection algorithms performing well in both accuracy and speed. Existing object detection methods use horizontal bounding boxes (HBB), which are not robust to orientation variation. The proposed orientation-invariant YOLO (OIYOLO) detector can effectively handle bird's-eye-view images in which the orientation angles of objects are arbitrary. To estimate the rotation angle of objects, we design a new angle loss function; training OIYOLO therefore forces the network to learn the annotated orientation angle of objects, making OIYOLO orientation invariant. The proposed approach of predicting OBB can be applied in other detection frameworks. In addition, to evaluate the proposed OIYOLO detector, we create a UAV-DAHUA dataset accurately annotated with objects' locations, scales, and orientation angles. Extensive experiments conducted on the UAV-DAHUA and DOTA datasets demonstrate that OIYOLO achieves state-of-the-art detection performance with high efficiency compared with the baseline YOLO algorithms.
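The abstract does not give the exact form of the new angle loss. As a hedged sketch of the general idea, one common periodic formulation that penalizes angular error smoothly and treats angles differing by a full turn as identical is:

```python
import math

def angle_loss(theta_pred, theta_gt):
    """Periodic angle regression loss: 1 - cos(delta).

    Zero when predicted and ground-truth angles coincide (modulo 2*pi),
    maximal (2.0) when they differ by pi. This is an illustrative choice,
    not the loss defined in the paper.
    """
    return 1.0 - math.cos(theta_pred - theta_gt)
```

Losses of this form are differentiable everywhere, which makes them easy to add to a YOLO-style multi-task objective alongside the box and confidence terms.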


2020 ◽  
Vol 34 (03) ◽  
pp. 2561-2568
Author(s):  
Morgane Ayle ◽  
Jimmy Tekli ◽  
Julia El-Zini ◽  
Boulos El-Asmar ◽  
Mariette Awad

Research has shown that deep neural networks can assist human workers throughout the industrial sector via different computer vision applications. However, such data-driven learning approaches require a very large number of labeled training images in order to generalize well and achieve the high accuracies that meet industry standards. Gathering and labeling large amounts of images is both expensive and time consuming, especially for industrial use-cases. In this work, we introduce BAR (Bounding-box Automated Refinement), a reinforcement learning agent that learns to correct inaccurate bounding-boxes that are weakly generated by certain detection methods or wrongly annotated by a human, using either an offline training method with Deep Reinforcement Learning (BAR-DRL) or an online one using Contextual Bandits (BAR-CB). Our agent limits human intervention to correcting or verifying a subset of bounding-boxes instead of re-drawing new ones. Results on a car industry-related dataset and on the PASCAL VOC dataset show a consistent increase of up to 0.28 in the Intersection-over-Union of bounding-boxes with their desired ground-truths, while saving 30%-82% of the human intervention time needed to correct or re-draw inaccurate proposals.


Author(s):  
Shinfeng D. Lin ◽  
Tingyu Chang ◽  
Wensheng Chen

In computer vision, multiple object tracking (MOT) plays a crucial role in solving many important issues. A common approach to MOT is tracking by detection, which involves handling occlusions, motion prediction, and object re-identification. From the video frames, a set of detections is extracted to drive the tracking process. These detections are then associated so that bounding boxes holding the same target receive the same identity. In this article, MOT using a YOLO-based detector is proposed. The method comprises object detection, bounding box regression, and bounding box association. First, YOLOv3 is exploited as the object detector. Bounding box regression and association are then used to forecast each object's position. To validate the method, two open object tracking benchmarks, 2D MOT2015 and MOT16, were used. Experimental results demonstrate that the method is comparable to several state-of-the-art tracking methods, especially in terms of MOT accuracy and correctly identified detections.
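The association step in tracking-by-detection can be illustrated with a minimal greedy IoU matcher. This is a generic sketch of the idea, not the authors' exact association procedure:

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def associate(tracks, detections, iou_threshold=0.3):
    """Greedily match track boxes to detections by descending IoU.

    Returns a list of (track_index, detection_index) pairs; each track
    and each detection is used at most once.
    """
    pairs = sorted(((iou(t, d), ti, di)
                    for ti, t in enumerate(tracks)
                    for di, d in enumerate(detections)), reverse=True)
    matches, used_t, used_d = [], set(), set()
    for score, ti, di in pairs:
        if score < iou_threshold:
            break                      # remaining pairs overlap too little
        if ti in used_t or di in used_d:
            continue
        matches.append((ti, di))
        used_t.add(ti)
        used_d.add(di)
    return matches
```

Production trackers typically replace the greedy loop with optimal Hungarian assignment and combine IoU with appearance cues, but the data flow is the same.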


2019 ◽  
Vol 9 (19) ◽  
pp. 4062 ◽  
Author(s):  
Heejung Jwa ◽  
Dongsuk Oh ◽  
Kinam Park ◽  
Jang Kang ◽  
Hueiseok Lim

News currently spreads rapidly through the internet. Because fake news stories are designed to attract readers, they tend to spread faster. For most readers, detecting fake news can be challenging, and such readers often end up believing that the fake news story is fact. Because fake news can be socially problematic, a model that automatically detects it is required. In this paper, we focus on data-driven automatic fake news detection methods. We first apply the Bidirectional Encoder Representations from Transformers (BERT) model to detect fake news by analyzing the relationship between the headline and the body text of a news story. To further improve performance, additional news data are gathered and used to pre-train this model. We determine that the deep-contextualizing nature of BERT is best suited for this task and improves the F-score by 0.14 over older state-of-the-art models.


2021 ◽  
Vol 13 (15) ◽  
pp. 2881
Author(s):  
Azam Karami ◽  
Karoll Quijano ◽  
Melba Crawford

Tassel counts provide valuable information related to flowering and yield prediction in maize, but are expensive and time-consuming to acquire via traditional manual approaches. High-resolution RGB imagery acquired by unmanned aerial vehicles (UAVs), coupled with advanced machine learning approaches, including deep learning (DL), provides a new capability for monitoring flowering. In this article, three state-of-the-art DL techniques, CenterNet based on point annotation, task-aware spatial disentanglement (TSD), and detecting objects with recursive feature pyramids and switchable atrous convolution (DetectoRS) based on bounding box annotation, are modified to improve their performance for this application and evaluated for tassel detection relative to Tasselnetv2+. The dataset for the experiments comprises RGB images of maize tassels from plant breeding experiments, which vary in size, complexity, and overlap. Results show that the point annotations are more accurate and simpler to acquire than the bounding boxes, and bounding box-based approaches are more sensitive to the size of the bounding boxes and background than point-based approaches. Overall, CenterNet has high accuracy in comparison to the other techniques, but DetectoRS can better detect early-stage tassels. The results for these experiments were more robust than those of Tasselnetv2+, which is sensitive to the number of tassels in the image.


2020 ◽  
Vol 12 (1) ◽  
pp. 143 ◽  
Author(s):  
Xiaoliang Qian ◽  
Sheng Lin ◽  
Gong Cheng ◽  
Xiwen Yao ◽  
Hangli Ren ◽  
...  

The objective of detection in remote sensing images is to determine the location and category of all targets in these images. Anchor-based methods are the most prevalent deep learning-based methods, but they still have some problems that need to be addressed. First, the existing metric (i.e., intersection over union (IoU)) cannot measure the distance between two bounding boxes when they do not overlap. Second, the existing bounding box regression loss cannot directly optimize the metric in the training process. Third, existing methods that adopt a hierarchical deep network choose only a single feature level for the feature extraction of region proposals, and thus do not make full use of the advantage of multi-level features. To resolve the above problems, a novel object detection method for remote sensing images based on improved bounding box regression and multi-level features fusion is proposed in this paper. First, a new metric named generalized IoU is applied, which can quantify the distance between two bounding boxes regardless of whether they overlap. Second, a novel bounding box regression loss is proposed, which can not only optimize the new metric (i.e., generalized IoU) directly but also overcome the problem that an existing bounding box regression loss based on the new metric cannot adaptively change the gradient based on the metric value. Finally, a multi-level features fusion module is proposed and incorporated into the existing hierarchical deep network, making full use of the multi-level features for each region proposal. Quantitative comparisons between the proposed method and the baseline method on the large-scale DIOR dataset demonstrate that incorporating the proposed bounding box regression loss, the multi-level features fusion module, and their combination into the baseline method yields absolute gains of approximately 0.7%, 1.4%, and 2.2% in mAP, respectively. Comparison with state-of-the-art methods demonstrates that the proposed method achieves state-of-the-art performance. The curves of average precision at different thresholds show that the advantage of the proposed method is more evident when the threshold of generalized IoU (or IoU) is relatively high, which means that the proposed method can improve the precision of object localization. Similar conclusions can be obtained on the NWPU VHR-10 dataset.
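Generalized IoU has a standard closed form for axis-aligned boxes: the usual IoU minus the fraction of the smallest enclosing box not covered by the union. A minimal sketch:

```python
def generalized_iou(a, b):
    """GIoU for axis-aligned boxes (x1, y1, x2, y2).

    Ranges over [-1, 1]; unlike IoU, it stays informative for disjoint
    boxes, since the enclosing-box penalty grows with their separation.
    """
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    iou = inter / union
    # smallest axis-aligned box enclosing both a and b
    c = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return iou - (c - union) / c
```

For two unit boxes separated by a one-unit gap, IoU is 0 but GIoU is -1/3, so a GIoU-based loss still provides a gradient pulling the boxes together.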


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Wenjing Lu

This paper proposes a deep learning-based method for mitosis detection in breast histopathology images. A main problem in mitosis detection is that most datasets carry only weak labels, i.e., only the coordinates indicating the center of the mitosis region. This makes most existing powerful object detection methods difficult to apply to mitosis detection. To solve this problem, this paper first applies a CNN-based algorithm to segment the mitosis regions pixel-wise, based on which bounding boxes of mitoses are generated as strong labels. Based on the generated bounding boxes, an object detection network is trained to accomplish mitosis detection. Experimental results show that the proposed method is effective in detecting mitosis, and its accuracy outperforms the state of the art reported in the literature.
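The step of turning a segmented region into a strong box label can be illustrated with a minimal sketch. This assumes the segmentation output is a binary mask (here a list of rows) and is not the paper's code:

```python
def mask_to_box(mask):
    """Tight bounding box around the foreground (nonzero) pixels of a
    binary mask given as a list of rows.

    Returns (x1, y1, x2, y2) with inclusive pixel coordinates, or None
    if the mask contains no foreground.
    """
    coords = [(x, y) for y, row in enumerate(mask)
                     for x, v in enumerate(row) if v]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))
```

In practice one box is extracted per connected component of the mask, and the resulting boxes serve as the strong labels for training the detector.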


2021 ◽  
Vol 12 (1) ◽  
pp. 31-39
Author(s):  
D. D. Rukhovich

Deep learning-based detectors usually produce a redundant set of object bounding boxes, including many duplicate detections of the same object. These boxes are then filtered using non-maximum suppression (NMS) in order to select exactly one bounding box per object of interest. This greedy scheme is simple and provides sufficient accuracy for isolated objects, but often fails in crowded environments, since one needs to both preserve boxes for different objects and suppress duplicate detections. In this work we develop an alternative iterative scheme, in which a new subset of objects is detected at each iteration. Detected boxes from previous iterations are passed to the network at the following iterations to ensure that the same object is not detected twice. This iterative scheme can be applied to both one-stage and two-stage object detectors with only minor modifications of the training and inference procedures. We perform extensive experiments with two different baseline detectors on four datasets and show significant improvement over the baselines, leading to state-of-the-art performance on the CrowdHuman and WiderPerson datasets.
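For reference, the greedy NMS scheme that the iterative approach replaces can be sketched as follows; its weakness in crowds is visible in the code: any box overlapping a higher-scoring box beyond the threshold is discarded, even if it covers a different person.

```python
def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression.

    Repeatedly keeps the highest-scoring box and removes all remaining
    boxes whose IoU with it exceeds the threshold. Returns kept indices.
    """
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union else 0.0

    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)               # best remaining box survives
        keep.append(i)
        order = [j for j in order      # drop its near-duplicates
                 if iou(boxes[i], boxes[j]) < iou_threshold]
    return keep
```

The iterative scheme sidesteps this failure mode by letting the network itself, conditioned on already-detected boxes, decide which objects remain to be found.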

