Survey on Image Object Detection Algorithms Based on Deep Learning

Author(s):  
Wei Fang ◽  
Liang Shen ◽  
Yupeng Chen
Sensors ◽  
2018 ◽  
Vol 18 (10) ◽  
pp. 3415 ◽  
Author(s):  
Jinpeng Zhang ◽  
Jinming Zhang ◽  
Shan Yu

In the image object detection task, a huge number of candidate boxes are generated to match with a relatively very small amount of ground-truth boxes, and through this method the learning samples can be created. But in fact the vast majority of the candidate boxes do not contain valid object instances and should be recognized and rejected during the training and evaluation of the network. This leads to extra high computation burden and a serious imbalance problem between object and none-object samples, thereby impeding the algorithm’s performance. Here we propose a new heuristic sampling method to generate candidate boxes for two-stage detection algorithms. It is generally applicable to the current two-stage detection algorithms to improve their detection performance. Experiments on COCO dataset showed that, relative to the baseline model, this new method could significantly increase the detection accuracy and efficiency.


Author(s):  
Vibhavari B Rao

The crime rates today can inevitably put a civilian's life in danger. While consistent efforts are being made to alleviate crime, there is also a dire need to create a smart and proactive surveillance system. Our project implements a smart surveillance system that would alert the authorities in real-time when a crime is being committed. During armed robberies and hostage situations, most often, the police cannot reach the place on time to prevent it from happening, owing to the lag in communication between the informants of the crime scene and the police. We propose an object detection model that implements deep learning algorithms to detect objects of violence such as pistols, knives, rifles from video surveillance footage, and in turn send real-time alerts to the authorities. There are a number of object detection algorithms being developed, each being evaluated under the performance metric mAP. On implementing Faster R-CNN with ResNet 101 architecture we found the mAP score to be about 91%. However, the downside to this is the excessive training and inferencing time it incurs. On the other hand, YOLOv5 architecture resulted in a model that performed very well in terms of speed. Its training speed was found to be 0.012 s / image during training but naturally, the accuracy was not as high as Faster R-CNN. With good computer architecture, it can run at about 40 fps. Thus, there is a tradeoff between speed and accuracy and it's important to strike a balance. We use transfer learning to improve accuracy by training the model on our custom dataset. This project can be deployed on any generic CCTV camera by setting up a live RTSP (real-time streaming protocol) and streaming the footage on a laptop or desktop where the deep learning model is being run.


2019 ◽  
Vol 32 (7) ◽  
pp. 1949-1958 ◽  
Author(s):  
Laigang Zhang ◽  
Zhou Sheng ◽  
Yibin Li ◽  
Qun Sun ◽  
Ying Zhao ◽  
...  

2021 ◽  
Vol 66 (3) ◽  
pp. 2493-2507
Author(s):  
Hyun Kyu Shin ◽  
Si Woon Lee ◽  
Goo Pyo Hong ◽  
Sael Lee ◽  
Sang Hyo Lee ◽  
...  

2021 ◽  
Vol 38 (2) ◽  
pp. 481-494
Author(s):  
Yurong Guan ◽  
Muhammad Aamir ◽  
Zhihua Hu ◽  
Waheed Ahmed Abro ◽  
Ziaur Rahman ◽  
...  

Object detection in images is an important task in image processing and computer vision. Many approaches are available for object detection. For example, there are numerous algorithms for object positioning and classification in images. However, the current methods perform poorly and lack experimental verification. Thus, it is a fascinating and challenging issue to position and classify image objects. Drawing on the recent advances in image object detection, this paper develops a region-baed efficient network for accurate object detection in images. To improve the overall detection performance, image object detection was treated as a twofold problem, involving object proposal generation and object classification. First, a framework was designed to generate high-quality, class-independent, accurate proposals. Then, these proposals, together with their input images, were imported to our network to learn convolutional features. To boost detection efficiency, the number of proposals was reduced by a network refinement module, leaving only a few eligible candidate proposals. After that, the refined candidate proposals were loaded into the detection module to classify the objects. The proposed model was tested on the test set of the famous PASCAL Visual Object Classes Challenge 2007 (VOC2007). The results clearly demonstrate that our model achieved robust overall detection efficiency over existing approaches using fewer or more proposals, in terms of recall, mean average best overlap (MABO), and mean average precision (mAP).


2020 ◽  
Vol 28 (S2) ◽  
Author(s):  
Asmida Ismail ◽  
Siti Anom Ahmad ◽  
Azura Che Soh ◽  
Mohd Khair Hassan ◽  
Hazreen Haizi Harith

The object detection system is a computer technology related to image processing and computer vision that detects instances of semantic objects of a certain class in digital images and videos. The system consists of two main processes, which are classification and detection. Once an object instance has been classified and detected, it is possible to obtain further information, including recognizes the specific instance, track the object over an image sequence and extract further information about the object and the scene. This paper presented an analysis performance of deep learning object detector by combining a deep learning Convolutional Neural Network (CNN) for object classification and applies classic object detection algorithms to devise our own deep learning object detector. MiniVGGNet is an architecture network used to train an object classification, and the data used for this purpose was collected from specific indoor environment building. For object detection, sliding windows and image pyramids were used to localize and detect objects at different locations, and non-maxima suppression (NMS) was used to obtain the final bounding box to localize the object location. Based on the experiment result, the percentage of classification accuracy of the network is 80% to 90% and the time for the system to detect the object is less than 15sec/frame. Experimental results show that there are reasonable and efficient to combine classic object detection method with a deep learning classification approach. The performance of this method can work in some specific use cases and effectively solving the problem of the inaccurate classification and detection of typical features.


Sign in / Sign up

Export Citation Format

Share Document