Real-Time Water Surface Object Detection Based on Improved Faster R-CNN

Sensors ◽  
2019 ◽  
Vol 19 (16) ◽  
pp. 3523 ◽  
Author(s):  
Lili Zhang ◽  
Yi Zhang ◽  
Zhen Zhang ◽  
Jie Shen ◽  
Huibin Wang

In this paper, we consider water surface object detection in natural scenes. Background subtraction and image segmentation are the classical object detection methods. The former is highly susceptible to variable scenes, so its accuracy drops sharply when detecting water surface objects because of changing sunlight and waves. The latter is more sensitive to the selection of object features, which leads to poor generalization, so it cannot be applied widely. Consequently, methods based on deep learning have recently been proposed. The River Chief System has been implemented in China recently, and one of its important requirements is to detect and deal with water surface floats in a timely fashion. In response, we propose a real-time water surface object detection method based on the Faster R-CNN. The proposed network model includes two modules and integrates low-level features with high-level features to improve detection accuracy. Moreover, we propose to set the scales and aspect ratios of anchors by analyzing the distribution of object scales in our dataset, so our method has good robustness and high detection accuracy for multi-scale objects in complex natural scenes. We used the proposed method to detect floats on the water surface in a three-day video surveillance stream of the North Canal in Beijing, and validated its performance. The experiments show that the mean average precision (mAP) of the proposed method was 83.7%, and the detection speed was 13 frames per second. Therefore, our method can be applied in complex natural scenes and mostly meets the accuracy and speed requirements of online water surface object detection.
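The abstract above says that anchor scales and aspect ratios are set by analyzing the distribution of object sizes in the dataset, but it does not say how. One possible reading, sketched below purely as an assumption, is to take representative quantiles of the box scale (square root of the area) and of the width-to-height ratio. The function name `suggest_anchors` and the quantile-based choice are hypothetical and are not taken from the paper.

```python
import math
from statistics import quantiles


def suggest_anchors(boxes, n_scales=3, n_ratios=3):
    """Suggest anchor scales and aspect ratios from ground-truth box sizes.

    boxes: list of (width, height) tuples in pixels (at least two boxes).
    Returns (scales, ratios), each sorted in ascending order.
    Assumption: quantile midpoints stand in for the paper's unspecified analysis.
    """
    scales = [math.sqrt(w * h) for w, h in boxes]   # side of an equal-area square
    ratios = [w / h for w, h in boxes]              # width-to-height ratio

    def representatives(values, k):
        # Split the data into k equal-population bins and return the
        # midpoint quantile of each bin (the 1/(2k), 3/(2k), ... quantiles).
        cuts = quantiles(values, n=2 * k)           # 2k - 1 cut points
        return [cuts[i] for i in range(0, 2 * k - 1, 2)]

    return representatives(scales, n_scales), representatives(ratios, n_ratios)


if __name__ == "__main__":
    # Hypothetical box sizes, for illustration only.
    sample = [(20, 10), (35, 30), (64, 60), (120, 50), (200, 180), (16, 40)]
    print(suggest_anchors(sample))
```

A clustering approach such as k-means over box sizes would be another reasonable way to do the same analysis; the quantile version is used here only because it needs nothing beyond the standard library.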

2019 ◽  
Vol 11 (7) ◽  
pp. 786 ◽  
Author(s):  
Yang-Lang Chang ◽  
Amare Anagaw ◽  
Lena Chang ◽  
Yi Wang ◽  
Chih-Yu Hsiao ◽  
...  

Synthetic aperture radar (SAR) imagery has been used as a promising data source for monitoring maritime activities, and its application to oil and ship detection has been the focus of many previous research studies. Many object detection methods, ranging from traditional to deep learning approaches, have been proposed. However, the majority of them are computationally intensive and have accuracy problems. The huge volume of remote sensing data also poses a challenge for real-time object detection. To mitigate this problem, a high performance computing (HPC) method has been proposed to accelerate SAR imagery analysis using GPU-based computing methods. In this paper, we propose an enhanced GPU-based deep learning method to detect ships in SAR images. The You Only Look Once version 2 (YOLOv2) deep learning framework is used to model the architecture and train the model. YOLOv2 is a state-of-the-art real-time object detection system, which outperforms Faster Region-Based Convolutional Network (Faster R-CNN) and Single Shot Multibox Detector (SSD) methods. Additionally, in order to reduce computational time while keeping relatively competitive detection accuracy, we develop a new architecture with fewer layers, called YOLOv2-reduced. In the experiment, we use two datasets: the SAR Ship Detection Dataset (SSDD) and a Diversified SAR Ship Detection Dataset (DSSDD). These two datasets were used for training and testing purposes. YOLOv2 test results showed an increase in ship detection accuracy as well as a noticeable reduction in computational time compared to Faster R-CNN. From the experimental results, the proposed YOLOv2 architecture achieves an accuracy of 90.05% and 89.13% on the SSDD and DSSDD datasets, respectively. The proposed YOLOv2-reduced architecture has a similarly competent detection performance to YOLOv2, but with less computational time on an NVIDIA TITAN X GPU. The experimental results show that deep learning can make a big leap forward in improving the performance of SAR image ship detection.


Sensors ◽  
2021 ◽  
Vol 21 (10) ◽  
pp. 3374
Author(s):  
Hansen Liu ◽  
Kuangang Fan ◽  
Qinghua Ouyang ◽  
Na Li

To address the threat of drones intruding into high-security areas, the real-time detection of drones is urgently required to protect these areas. There are two main difficulties in real-time detection of drones. One is that drones move quickly, which calls for faster detectors. The other is that small drones are difficult to detect. In this paper, firstly, we achieve high detection accuracy by evaluating four state-of-the-art object detection methods: RetinaNet, FCOS, YOLOv3 and YOLOv4. Then, to address the first problem, we prune the convolutional channel and shortcut layer of YOLOv4 to develop thinner and shallower models. Furthermore, to improve the accuracy of small drone detection, we implement a special augmentation for small object detection by copying and pasting small drones. Experimental results verify that compared to YOLOv4, our pruned-YOLOv4 model, with a 0.8 channel prune rate and 24 layers pruned, achieves 90.5% mAP and its processing speed is increased by 60.4%. Additionally, after small object augmentation, the precision and recall of the pruned-YOLOv4 increase by almost 22.8% and 12.7%, respectively. Experimental results verify that our pruned-YOLOv4 is an effective and accurate approach for drone detection.
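The copy-and-paste augmentation for small drones mentioned above can be illustrated with a minimal sketch. The details below are assumptions rather than the authors' procedure: the image is a plain list of pixel rows, boxes are `(x, y, w, h)` tuples, the `max_side` threshold for "small" is hypothetical, and paste locations are chosen uniformly at random without checking for overlap with existing objects.

```python
import random


def copy_paste_small_objects(image, boxes, max_side=32, copies=1, rng=None):
    """Duplicate small objects at random positions in the same image.

    image: list of rows, each row a list of pixel values.
    boxes: list of (x, y, w, h) boxes in pixel coordinates.
    Returns (augmented_image, augmented_boxes); the inputs are not modified.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]          # copy rows so the input stays intact
    new_boxes = list(boxes)

    for (x, y, w, h) in boxes:
        if max(w, h) > max_side:             # only small objects are duplicated
            continue
        patch = [row[x:x + w] for row in image[y:y + h]]
        for _ in range(copies):
            # Pick a top-left corner so the patch fits inside the image.
            nx = rng.randint(0, width - w)
            ny = rng.randint(0, height - h)
            for dy in range(h):
                out[ny + dy][nx:nx + w] = patch[dy]
            new_boxes.append((nx, ny, w, h))  # the pasted copy gets its own label
    return out, new_boxes


if __name__ == "__main__":
    img = [[0] * 8 for _ in range(8)]
    for r in (1, 2):
        for c in (1, 2):
            img[r][c] = 9                     # a 2x2 "drone"
    aug, aug_boxes = copy_paste_small_objects(img, [(1, 1, 2, 2)], rng=random.Random(0))
    print(aug_boxes)
```

A production version would usually also avoid pasting on top of existing objects and might blend patch borders; both are left out here to keep the sketch short.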


2021 ◽  
Vol 15 ◽  
Author(s):  
Zhiguo Zhou ◽  
Jiaen Sun ◽  
Jiabao Yu ◽  
Kaiyuan Liu ◽  
Junwei Duan ◽  
...  

Water surface object detection is one of the most significant tasks in autonomous driving and water surface vision applications. To date, existing public large-scale datasets collected from websites do not focus on specific scenarios. As a characteristic of these datasets, the quantity of the images and instances is also still at a low level. To accelerate the development of water surface autonomous driving, this paper proposes a large-scale, high-quality annotated benchmark dataset, named Water Surface Object Detection Dataset (WSODD), to benchmark different water surface object detection algorithms. The proposed dataset consists of 7,467 water surface images in different water environments, climate conditions, and shooting times. In addition, the dataset comprises a total of 14 common object categories and 21,911 instances. Simultaneously, more specific scenarios are focused on in WSODD. In order to find a straightforward architecture to provide good performance on WSODD, a new object detector, named CRB-Net, is proposed to serve as a baseline. In experiments, CRB-Net was compared with 16 state-of-the-art object detection methods and outperformed all of them in terms of detection precision. In this paper, we further discuss the effect of the dataset diversity (e.g., instance size, lighting conditions), training set size, and dataset details (e.g., method of categorization). Cross-dataset validation shows that WSODD significantly outperforms other relevant datasets and that the adaptability of CRB-Net is excellent.


2021 ◽  
Vol 104 (2) ◽  
pp. 003685042110113
Author(s):  
Xianghua Ma ◽  
Zhenkun Yang

Real-time object detection on mobile platforms is a crucial but challenging computer vision task. However, it is widely recognized that although lightweight object detectors have a high detection speed, their detection accuracy is relatively low. In order to improve detection accuracy, it is beneficial to extract complete multi-scale image features in visual cognitive tasks. Asymmetric convolutions have a useful quality, that is, they have different aspect ratios, which can be used to extract image features of objects, especially objects with multi-scale characteristics. In this paper, we exploit three different asymmetric convolutions in parallel and propose a new multi-scale asymmetric convolution unit, namely the MAC block, to enhance the multi-scale representation ability of CNNs. In addition, the MAC block can adaptively merge features with different scales by allocating learnable weighted parameters to the three asymmetric convolution branches. The proposed MAC blocks can be inserted into a state-of-the-art backbone such as ResNet-50 to form a new multi-scale backbone network for object detectors. To evaluate the performance of the MAC block, we conduct experiments on the CIFAR-100, PASCAL VOC 2007, PASCAL VOC 2012 and MS COCO 2014 datasets. Experimental results show that the detection precision can be greatly improved while a fast detection speed is maintained.


2021 ◽  
Vol 11 (13) ◽  
pp. 6006
Author(s):  
Huy Le ◽  
Minh Nguyen ◽  
Wei Qi Yan ◽  
Hoa Nguyen

Augmented reality is one of the fastest growing fields, receiving increased funding for the last few years as people realise the potential benefits of rendering virtual information in the real world. Most of today’s augmented reality marker-based applications use local feature detection and tracking techniques. The disadvantage of applying these techniques is that the markers must be modified to match the unique classified algorithms or they suffer from low detection accuracy. Machine learning is an ideal solution to overcome the current drawbacks of image processing in augmented reality applications. However, traditional data annotation requires extensive time and labour, as it is usually done manually. This study incorporates machine learning to detect and track augmented reality marker targets in an application using deep neural networks. We firstly implement the auto-generated dataset tool, which is used for the machine learning dataset preparation. The final iOS prototype application incorporates object detection, object tracking and augmented reality. The machine learning model is trained to recognise the differences between targets using one of YOLO’s most well-known object detection methods. The final product makes use of a valuable toolkit for developing augmented reality applications called ARKit.


2019 ◽  
Vol 8 (2S8) ◽  
pp. 1311-1313

With the increasing awareness of environmental protection, people are paying more and more attention to the protection of wild animals. Their survival is closely related to human beings. As progress in target detection has achieved unprecedented success in computer vision, we can more easily target animals. Animal detection based on computer vision is an important branch of object recognition, which is applied to intelligent monitoring, smart driving, and environmental protection. At present, many animal detection methods have been proposed. However, animal detection is still a challenge due to the complexity of the background, the diversity of animal poses, and the obstruction of objects. An accurate algorithm is needed. In this paper, the Faster Region-based Convolutional Neural Network (Faster R-CNN) is used. The proposed method was tested using the CAMERA_TRAP DATASET. The results show that the proposed animal detection method based on Faster R-CNN achieves better detection accuracy than conventional schemes.


Sensors ◽  
2018 ◽  
Vol 18 (9) ◽  
pp. 3016 ◽  
Author(s):  
Yeşeren Saylan ◽  
Adil Denizli

Hemoglobin is an iron-carrying protein in erythrocytes and an essential element for transferring oxygen from the lungs to the tissues. Abnormalities in hemoglobin concentration are closely correlated with health status and many diseases, including thalassemia, anemia, leukemia, heart disease, and excessive loss of blood. Particularly in resource-constrained settings, existing blood analyzers are not readily applicable because they require high-level instrumentation and skilled personnel; therefore, inexpensive, easy-to-use, and reliable detection methods are needed. Herein, molecular fingerprints of hemoglobin on a nanofilm chip were obtained for real-time, sensitive, and selective hemoglobin detection using a surface plasmon resonance system. Briefly, through the photopolymerization technique, a template (hemoglobin) was imprinted on a monomeric (acrylamide) nanofilm on-chip using a cross-linker (methylenebisacrylamide) and an initiator-activator pair (ammonium persulfate-tetramethylethylenediamine). The molecularly imprinted nanofilm on-chip was characterized by atomic force microscopy and ellipsometry, followed by benchmarking detection performance of hemoglobin concentrations from 0.0005 mg mL−1 to 1.0 mg mL−1. Theoretical calculations and real-time detection implied that the molecularly imprinted nanofilm on-chip was able to detect as little as 0.00035 mg mL−1 of hemoglobin. In addition, the experimental results of hemoglobin detection on the chip fitted well with the Langmuir adsorption isotherm model, with a high correlation coefficient (0.99) and association and dissociation coefficients (39.1 mL mg−1 and 0.03 mg mL−1), suggesting a monolayer binding characteristic. Assessments of selectivity, reusability and storage stability indicated that the presented chip is an alternative approach to current hemoglobin-targeted assays in low-resource regions, as well as to antibody-based detection procedures in the field. In the future, this molecularly imprinted nanofilm on-chip can easily be integrated with portable plasmonic detectors, improving its access to these regions, and it can be tailored to detect other proteins and biomarkers.


Author(s):  
Seung-Hwan Bae

Region-based object detection infers object regions for one or more categories in an image. Due to recent advances in deep learning and region proposal methods, object detectors based on convolutional neural networks (CNNs) have been flourishing and have provided promising detection results. However, the detection accuracy is often degraded because of the low discriminability of object CNN features caused by occlusions and inaccurate region proposals. In this paper, we therefore propose a region decomposition and assembly detector (R-DAD) for more accurate object detection. In the proposed R-DAD, we first decompose an object region into multiple small regions. To jointly capture the entire appearance and the part details of the object, we extract CNN features within the whole object region and the decomposed regions. We then learn the semantic relations between the object and its parts by combining the multi-region features stage by stage with region assembly blocks, and use the combined high-level semantic features for object classification and localization. In addition, for more accurate region proposals, we propose a multi-scale proposal layer that can generate object proposals of various scales. We integrate the R-DAD into several feature extractors, and demonstrate a distinct performance improvement on PASCAL07/12 and MSCOCO18 compared to recent convolutional detectors.
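As a rough illustration of the region decomposition step described above, the sketch below splits a box into the whole region plus four half-regions. The abstract only says "multiple small regions", so the left/right/top/bottom split and the function name `decompose_region` are assumptions made for this example. The feature extraction and region assembly blocks are not modeled.

```python
def decompose_region(box):
    """Split a box (x1, y1, x2, y2) into the whole region and four halves.

    Assumption: a half-split along each axis; the actual part layout
    used by the detector is not given in the abstract.
    """
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2      # vertical split line
    cy = (y1 + y2) / 2      # horizontal split line
    return {
        "whole": (x1, y1, x2, y2),
        "left": (x1, y1, cx, y2),
        "right": (cx, y1, x2, y2),
        "top": (x1, y1, x2, cy),
        "bottom": (x1, cy, x2, y2),
    }


if __name__ == "__main__":
    for name, part in decompose_region((0, 0, 10, 20)).items():
        print(name, part)
```

Each part would then be pooled into its own feature map, so that an occluded half can be down-weighted when the parts are combined.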


2019 ◽  
Vol 2019 ◽  
pp. 1-9 ◽  
Author(s):  
Hai Wang ◽  
Xinyu Lou ◽  
Yingfeng Cai ◽  
Yicheng Li ◽  
Long Chen

Vehicle detection is one of the most important environment perception tasks for autonomous vehicles. Traditional vision-based vehicle detection methods are not accurate enough, especially for small and occluded targets, while light detection and ranging (lidar) based methods are good at detecting obstacles but are time-consuming and have a low classification rate for different target types. To address these shortcomings and make full use of the depth information of lidar and the obstacle classification ability of vision, this work proposes a real-time vehicle detection algorithm which fuses vision and lidar point cloud information. Firstly, the obstacles are detected by the grid projection method using the lidar point cloud information. Then, the obstacles are mapped to the image to get several separated regions of interest (ROIs). After that, the ROIs are expanded based on the dynamic threshold and merged to generate the final ROI. Finally, a deep learning method named You Only Look Once (YOLO) is applied on the ROI to detect vehicles. The experimental results on the KITTI dataset demonstrate that the proposed algorithm has high detection accuracy and good real-time performance. Compared with the detection method based only on YOLO, the mean average precision (mAP) is increased by 17%.
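The ROI expansion and merging steps in the pipeline above can be sketched as follows. The abstract does not define the dynamic threshold, so this example assumes a margin proportional to each ROI's own size (the `ratio` parameter is hypothetical) and merges any overlapping ROIs into their bounding union. The lidar grid projection and the YOLO stage are outside the scope of the sketch.

```python
def expand_roi(roi, ratio, img_w, img_h):
    """Grow an ROI (x1, y1, x2, y2) by `ratio` of its size on every side,
    clipped to the image. The proportional margin is an assumed stand-in
    for the paper's dynamic threshold."""
    x1, y1, x2, y2 = roi
    dx, dy = (x2 - x1) * ratio, (y2 - y1) * ratio
    return (max(0, x1 - dx), max(0, y1 - dy),
            min(img_w, x2 + dx), min(img_h, y2 + dy))


def overlaps(a, b):
    """True if two axis-aligned boxes share a region of positive area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def merge_rois(rois):
    """Repeatedly replace overlapping ROIs with their bounding union."""
    merged = []
    for roi in rois:
        roi = tuple(roi)
        absorbed = True
        while absorbed:                      # re-scan after every union
            absorbed = False
            for other in merged:
                if overlaps(roi, other):
                    merged.remove(other)
                    roi = (min(roi[0], other[0]), min(roi[1], other[1]),
                           max(roi[2], other[2]), max(roi[3], other[3]))
                    absorbed = True
                    break
        merged.append(roi)
    return merged


if __name__ == "__main__":
    raw = [(10, 10, 20, 20), (22, 10, 30, 20), (100, 100, 110, 110)]
    grown = [expand_roi(r, 0.2, 200, 200) for r in raw]
    print(merge_rois(grown))
```

In this example the first two ROIs do not touch before expansion but overlap afterwards, so they end up as a single region, while the distant third ROI stays separate.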


Author(s):  
Aofeng Li ◽  
Xufang Zhu ◽  
Shuo He ◽  
Jiawei Xia

Traditional visual water surface object detection suffers from non-detection zones and a failure to acquire global information, and the single-shot multibox detector (SSD) object detection algorithm has low detection precision for remote and small objects. To address these deficiencies, this study proposes a water surface object detection algorithm for panoramic vision based on an improved SSD. We reconstruct the backbone network of the SSD algorithm, replace VGG16 with a ResNet-50 network, and add five layers of feature extraction. More abundant semantic information for the shallow feature maps is obtained through a feature pyramid network structure with deconvolution. An experiment is conducted by building a water surface object dataset. The results show that the mean Average Precision (mAP) of the improved algorithm is increased by 4.03% compared with the existing SSD detection algorithm. The improved algorithm can effectively improve the overall detection precision of water surface objects and enhance the detection of remote objects.

