Analyses of Time Efficiency and Speed-ups in Inference Process of Two-Stage Object Detection Algorithms

Author(s): Jiang Wu, Yifeng Sun, Guangming Tang, Xiaoyu Xu
Sensors, 2018, Vol 18 (10), pp. 3415
Author(s): Jinpeng Zhang, Jinming Zhang, Shan Yu

In the image object detection task, a huge number of candidate boxes are generated and matched against a relatively small number of ground-truth boxes to create training samples. In practice, however, the vast majority of candidate boxes contain no valid object instance and must be recognized and rejected during training and evaluation of the network. This incurs a high computational burden and a serious imbalance between object and non-object samples, impeding the algorithm's performance. Here we propose a new heuristic sampling method for generating candidate boxes in two-stage detection algorithms. It is generally applicable to current two-stage detectors and improves their detection performance. Experiments on the COCO dataset showed that, relative to the baseline model, the new method significantly increases both detection accuracy and efficiency.
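
The abstract does not spell out the sampling rule itself, so the sketch below only illustrates the imbalance it targets: a generic fixed-ratio sampler (Python/NumPy) that keeps every positive candidate box and randomly subsamples the background boxes. The 1:3 ratio, function name, and array sizes are assumptions for illustration, not the authors' heuristic.

```python
import numpy as np

def subsample_candidates(labels, neg_pos_ratio=3, rng=None):
    """Generic fixed-ratio sampling of candidate boxes.

    labels: 1-D array, 1 for boxes matched to a ground-truth object,
            0 for background candidates.
    Returns the indices of the boxes kept for training.
    Illustrative baseline only, not the paper's heuristic sampler.
    """
    rng = rng or np.random.default_rng(0)
    pos_idx = np.flatnonzero(labels == 1)
    neg_idx = np.flatnonzero(labels == 0)

    # Keep every positive; cap negatives at neg_pos_ratio x positives.
    n_neg = min(len(neg_idx), max(1, neg_pos_ratio * len(pos_idx)))
    neg_keep = rng.choice(neg_idx, size=n_neg, replace=False)
    return np.concatenate([pos_idx, neg_keep])

# Example: 10,000 candidate boxes, only 50 of them positive.
labels = np.zeros(10_000, dtype=int)
labels[:50] = 1
kept = subsample_candidates(labels)
print(len(kept))  # 50 positives + 150 negatives
```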


2020, Vol 1544, pp. 012033
Author(s): Lixuan Du, Rongyu Zhang, Xiaotian Wang

Author(s): Samuel Humphries, Trevor Parker, Bryan Jonas, Bryan Adams, Nicholas J Clark

Quick identification of buildings and roads is critical for the execution of tactical US military operations in an urban environment. To this end, a gridded, referenced satellite image of an objective, often referred to as a gridded reference graphic or GRG, has become a standard product developed during intelligence preparation of the environment. At present, operational units identify key infrastructure by hand through the work of individual intelligence officers. Recent advances in Convolutional Neural Networks, however, allow this process to be streamlined through the use of object detection algorithms. In this paper, we describe an object detection algorithm designed to quickly identify and label both buildings and road intersections present in an image. Our work leverages both the U-Net architecture and the SpaceNet data corpus to produce an algorithm that accurately identifies a broad range of buildings and different types of roads. In addition to predicting buildings and roads, our model numerically labels each building by means of a contour-finding algorithm. Most importantly, the dual U-Net model is capable of predicting buildings and roads on a diverse set of test images and using these predictions to produce clean GRGs.
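
As a rough illustration of the contour-finding step used to numerically label buildings, the sketch below runs OpenCV's findContours over a binary building mask (such as a thresholded U-Net output) and stamps an index at each footprint's centroid. The synthetic mask, output file name, and drawing parameters are illustrative assumptions, not the authors' pipeline.

```python
import cv2
import numpy as np

# Stand-in for a thresholded U-Net building mask (two synthetic footprints).
mask = np.zeros((200, 200), dtype=np.uint8)
cv2.rectangle(mask, (20, 20), (60, 80), 255, -1)
cv2.rectangle(mask, (120, 100), (180, 160), 255, -1)

# One contour per connected building footprint.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

annotated = cv2.cvtColor(mask, cv2.COLOR_GRAY2BGR)
for i, cnt in enumerate(contours, start=1):
    # Place a numeric label at the contour centroid.
    m = cv2.moments(cnt)
    cx, cy = int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])
    cv2.drawContours(annotated, [cnt], -1, (0, 255, 0), 1)
    cv2.putText(annotated, str(i), (cx, cy),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 1)

cv2.imwrite("labeled_buildings.png", annotated)
```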


Sensors, 2021, Vol 21 (9), pp. 2894
Author(s): Minh-Quan Dao, Vincent Frémont

Multi-Object Tracking (MOT) is an integral part of any autonomous driving pipeline because it produces trajectories of the other moving objects in the scene and predicts their future motion. Thanks to recent advances in 3D object detection enabled by deep learning, track-by-detection has become the dominant paradigm in 3D MOT. In this paradigm, a MOT system is essentially made of an object detector and a data association algorithm that establishes track-to-detection correspondence. While 3D object detection has been actively researched, association algorithms for 3D MOT have settled on bipartite matching formulated as a Linear Assignment Problem (LAP) and solved by the Hungarian algorithm. In this paper, we adapt a two-stage data association method, previously applied successfully to image-based tracking, to the 3D setting, thus providing an alternative data association approach for 3D MOT. Our method outperforms the baseline using one-stage bipartite matching for data association, achieving 0.587 Average Multi-Object Tracking Accuracy (AMOTA) on the NuScenes validation set and 0.365 AMOTA (at level 2) on the Waymo test set.
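
For reference, the one-stage bipartite-matching baseline mentioned above can be sketched with SciPy's Hungarian solver: build a track-to-detection cost matrix and solve it as a Linear Assignment Problem. The Euclidean-distance cost and gating threshold here are illustrative assumptions, not the costs used in the paper.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_centers, det_centers, max_dist=2.0):
    """Baseline one-stage association: match tracks to detections by
    solving a Linear Assignment Problem on Euclidean distance with the
    Hungarian algorithm. Cost and threshold are illustrative only."""
    cost = np.linalg.norm(
        track_centers[:, None, :] - det_centers[None, :, :], axis=-1
    )
    rows, cols = linear_sum_assignment(cost)
    matches = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]
    unmatched_tracks = set(range(len(track_centers))) - {r for r, _ in matches}
    unmatched_dets = set(range(len(det_centers))) - {c for _, c in matches}
    return matches, unmatched_tracks, unmatched_dets

tracks = np.array([[0.0, 0.0], [5.0, 5.0]])
dets = np.array([[0.3, -0.2], [9.0, 9.0]])
print(associate(tracks, dets))  # one gated match, one unmatched pair each
```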


2021
Author(s): Alexis Koulidis, Mohamed Abdullatif, Ahmed Galal Abdel-Kader, Mohammed-ilies Ayachi, Shehab Ahmed, ...

Surface data measurement and analysis are an established means of detecting drillstring low-frequency torsional vibration, or stick-slip. The industry has also developed models that link surface torque to downhole drill bit rotational speed. Cameras provide an alternative, noninvasive approach to the existing wired/wireless sensors used to gather such surface data. The results of a preliminary field assessment of drilling dynamics using camera-based drillstring monitoring are presented in this work. Detection and timing of events from the video are performed using computer vision techniques and object detection algorithms. A real-time interest point tracker using homography estimation and sparse optical flow point tracking is deployed. We use a fully convolutional deep neural network trained to detect interest points and compute their accompanying descriptors. The detected points and descriptors are matched across video sequences and used for drillstring rotation detection and speed estimation. When the drillstring's vibration is invisible to the naked eye, the point tracking algorithm is preceded by a motion amplification function based on another deep convolutional neural network. We have clearly demonstrated the potential of camera-based noninvasive approaches to surface drillstring dynamics data acquisition and analysis. Through the application of real-time object detection algorithms to the rig video feed, surface events were detected and timed. We were also able to estimate drillstring rotary speed and motion profile. Torsional drillstring modes can be identified and correlated with drilling parameters and bottomhole assembly design. A novel vibration array sensing approach based on a multi-point tracking algorithm is also proposed. A vibration threshold setting was used to trigger an additional motion amplification function, providing seamless assessment for multi-scale vibration measurement. Cameras have typically been devices for acquiring images/videos, whether for offline automated assessment (recently) or online manual monitoring (mainly); this work shows how fog/edge computing allows these cameras to become "conscious" and "intelligent," and hence to play a critical role in the automation/digitalization of drilling rigs. We showcase their preliminary application as drilling dynamics and rig operations sensors in this work. Cameras are an ideal sensor for the drilling environment since they can be installed anywhere on a rig to perform large-scale live video analytics on drilling processes.
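
A minimal sketch of the sparse point-tracking idea follows, assuming OpenCV is available and that the camera looks roughly along the rotation axis. It substitutes a classical corner detector for the paper's learned interest-point network, tracks points with Lucas-Kanade optical flow, and converts the mean angular displacement about the image centre into a rough rotary-speed estimate; the video file name, parameters, and RPM conversion are assumptions for illustration.

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("rig_feed.mp4")  # hypothetical rig video file
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

# The paper uses a learned interest-point detector/descriptor; a classical
# corner detector stands in here purely for illustration.
pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                              qualityLevel=0.01, minDistance=7)

fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
center = np.array([prev_gray.shape[1] / 2.0, prev_gray.shape[0] / 2.0])

def angles(points):
    d = points - center
    return np.arctan2(d[:, 1], d[:, 0])

while True:
    ok, frame = cap.read()
    if not ok or pts is None or len(pts) == 0:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    good_old = pts[status.ravel() == 1].reshape(-1, 2)
    good_new = nxt[status.ravel() == 1].reshape(-1, 2)

    # Mean angular displacement of tracked points about the image centre,
    # wrapped to [-pi, pi] and scaled by the frame rate to an RPM estimate.
    d_theta = angles(good_new) - angles(good_old)
    d_theta = (d_theta + np.pi) % (2 * np.pi) - np.pi
    rpm = np.mean(d_theta) * fps / (2 * np.pi) * 60.0
    print(f"estimated rotary speed ~ {rpm:.1f} RPM")

    prev_gray, pts = gray, good_new.reshape(-1, 1, 2)
```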


2020, Vol 219 (10)
Author(s): Dominic Waithe, Jill M. Brown, Katharina Reglinski, Isabel Diez-Sevilla, David Roberts, ...

Object detection networks are high-performance algorithms famously applied to the task of identifying and localizing objects in photographic images. We demonstrate their application to the classification and localization of cells in fluorescence microscopy by benchmarking four leading object detection algorithms across multiple challenging 2D microscopy datasets. We also develop and demonstrate an algorithm that can localize and image cells in 3D, in close to real time, at the microscope using widely available and inexpensive hardware. Furthermore, we exploit the fast processing of these networks to develop a simple and effective augmented reality (AR) system for fluorescence microscopy systems using a display screen and back-projection onto the eyepiece. We show that it is possible to achieve very high classification accuracy with datasets containing as few as 26 images. Using our approach, relatively unskilled users can automate detection of cell classes with a variety of appearances, enabling new avenues for automation of fluorescence microscopy acquisition pipelines.
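
As a generic starting point for this kind of benchmarking (not the authors' code), the sketch below adapts a pretrained torchvision Faster R-CNN to a small number of cell classes and runs it on a stand-in image tensor. The class count, weights flag, and tensor size are assumptions, and a recent torchvision release is presumed.

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Hypothetical setup: adapt a pretrained detector to a small microscopy
# dataset (a few dozen annotated images), before fine-tuning on it.
num_cell_classes = 3  # assumed number of cell classes (background added below)

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_cell_classes + 1)

model.eval()
with torch.no_grad():
    image = torch.rand(3, 512, 512)  # stand-in for a fluorescence image tensor
    pred = model([image])[0]
print(pred["boxes"].shape, pred["labels"].shape, pred["scores"].shape)
```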


2021, Vol 23 (11), pp. 159-165
Author(s): Jayanth Dwijesh H P, Sandeep S V, Rashmi S, ...

In today’s world, accurate and fast information is vital for safe aircraft landings. The purpose of an EMAS (Engineered Materials Arresting System) is to prevent an aeroplane from overrunning the runway with no human injury and minimal damage to the aircraft. Although various algorithms for object detection analysis have been developed, only a few researchers have examined image analysis as a landing aid. One system employs image intensity edges to detect the sides of a runway in an image sequence, allowing the runway’s 3-dimensional position and orientation to be approximated. A fuzzy network system has been used to improve object detection and extraction from aerial images. In another system, multi-scale, multi-platform imagery is used to combine physiologically and geometrically inspired algorithms for recognizing objects from hyperspectral and/or multispectral (HS/MS) imagery. However, the similarity between the top views of runways, buildings, highways, and other objects is a disadvantage of these methods. We propose a new method for detecting and tracking the runway based on pattern matching and texture analysis of digital images captured by aircraft cameras. Edge detection techniques are used to recognize runways in aerial images. The algorithms employed in this paper are the Hough Transform, Canny filter, and Sobel filter, which result in efficient detection.
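
A minimal sketch of the named edge-based steps, assuming OpenCV and an aerial frame on disk: Gaussian smoothing, Canny edges, a Sobel gradient map, and a probabilistic Hough transform to extract straight runway-edge candidates. File names and thresholds are illustrative assumptions, not the paper's tuned values.

```python
import cv2
import numpy as np

frame = cv2.imread("aerial_frame.png", cv2.IMREAD_GRAYSCALE)  # assumed input

# Smooth, then extract edges with the Canny filter.
blurred = cv2.GaussianBlur(frame, (5, 5), 0)
edges = cv2.Canny(blurred, 50, 150)

# Horizontal Sobel gradient map, saved alongside for inspection.
sobel = cv2.Sobel(blurred, cv2.CV_64F, 1, 0, ksize=3)
cv2.imwrite("sobel_gradient.png", np.uint8(np.clip(np.abs(sobel), 0, 255)))

# Probabilistic Hough transform to find long straight runway-edge candidates.
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                        minLineLength=100, maxLineGap=10)

overlay = cv2.cvtColor(frame, cv2.COLOR_GRAY2BGR)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(overlay, (x1, y1), (x2, y2), (0, 0, 255), 2)
cv2.imwrite("runway_candidates.png", overlay)
```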


2021, Vol 11 (23), pp. 11241
Author(s): Ling Li, Fei Xue, Dong Liang, Xiaofei Chen

Concealed object detection in terahertz imaging is an urgent need for public security and counter-terrorism. So far, there has been no public terahertz imaging dataset for the evaluation of object detection algorithms. This paper provides a public dataset for evaluating multi-object detection algorithms in active terahertz imaging. Due to high sample similarity and poor imaging quality, object detection on this dataset is much more difficult than on the public object detection datasets commonly used in the computer vision field. Since the traditional hard example mining approach is designed around the two-stage detector and cannot be directly applied to one-stage detectors, this paper designs an image-based Hard Example Mining (HEM) scheme based on RetinaNet. Several state-of-the-art detectors, including YOLOv3, YOLOv4, FRCN-OHEM, and RetinaNet, are evaluated on this dataset. Experimental results show that RetinaNet achieves the best mAP and that HEM further enhances the model's performance. The parameters affecting the detection metrics of individual images are summarized and analyzed in the experiments.
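
The paper's image-based HEM scheme is tied to its RetinaNet setup; the sketch below only shows the general image-level idea: rank training images by their most recent detection loss and oversample the hardest fraction in the next epoch. The ratios and helper names are assumptions for illustration, not the paper's scheme.

```python
import numpy as np

def mine_hard_images(image_losses, keep_ratio=0.25):
    """Image-level hard example mining (generic sketch): rank images by
    their most recent detection loss and return the indices of the
    hardest fraction, to be oversampled in the next epoch."""
    order = np.argsort(image_losses)[::-1]          # hardest first
    n_hard = max(1, int(len(order) * keep_ratio))
    return order[:n_hard]

def build_epoch_sampler(n_images, hard_idx, oversample=2):
    """Repeat hard images `oversample` times, keep the rest once, shuffle."""
    weights = np.ones(n_images, dtype=int)
    weights[hard_idx] = oversample
    epoch = np.repeat(np.arange(n_images), weights)
    np.random.default_rng(0).shuffle(epoch)
    return epoch

losses = np.random.default_rng(1).random(1000)      # per-image loss placeholder
hard = mine_hard_images(losses)
print(len(build_epoch_sampler(1000, hard)))         # 1000 + 250 duplicates
```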

