scholarly journals Research on Multiscene Vehicle Dataset Based on Improved FCOS Detection Algorithms

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Fei Yan ◽  
Hui Zhang ◽  
Tianyang Zhou ◽  
Zhiyong Fan ◽  
Jia Liu

Whether in intelligent transportation or autonomous driving, vehicle detection is an important part. Vehicle detection still faces many problems, such as inaccurate vehicle detection positioning and low detection accuracy in complex scenes. FCOS as a representative of anchor-free detection algorithms was once a sensation, but now it seems to be slightly insufficient. Based on this situation, we propose an improved FCOS algorithm. The improvements are as follows: (1) we introduce a deformable convolution into the backbone to solve the problem that the receptive field cannot cover the overall goal; (2) we add a bottom-up information path after the FPN of the neck module to reduce the loss of information in the propagation process; (3) we introduce the balance module according to the balance principle, which reduces inconsistent detection of the bbox head caused by the mismatch of variance of different feature maps. To enhance the comparative experiment, we have extracted some of the most recent datasets from UA-DETRAC, COCO, and Pascal VOC. The experimental results show that our method has achieved good results on its dataset.

Sensors ◽  
2019 ◽  
Vol 19 (5) ◽  
pp. 1089 ◽  
Author(s):  
Ye Wang ◽  
Zhenyi Liu ◽  
Weiwen Deng

Region proposal network (RPN) based object detection, such as Faster Regions with CNN (Faster R-CNN), has gained considerable attention due to its high accuracy and fast speed. However, it has room for improvements when used in special application situations, such as the on-board vehicle detection. Original RPN locates multiscale anchors uniformly on each pixel of the last feature map and classifies whether an anchor is part of the foreground or background with one pixel in the last feature map. The receptive field of each pixel in the last feature map is fixed in the original faster R-CNN and does not coincide with the anchor size. Hence, only a certain part can be seen for large vehicles and too much useless information is contained in the feature for small vehicles. This reduces detection accuracy. Furthermore, the perspective projection results in the vehicle bounding box size becoming related to the bounding box position, thereby reducing the effectiveness and accuracy of the uniform anchor generation method. This reduces both detection accuracy and computing speed. After the region proposal stage, many regions of interest (ROI) are generated. The ROI pooling layer projects an ROI to the last feature map and forms a new feature map with a fixed size for final classification and box regression. The number of feature map pixels in the projected region can also influence the detection performance but this is not accurately controlled in former works. In this paper, the original faster R-CNN is optimized, especially for the on-board vehicle detection. This paper tries to solve these above-mentioned problems. The proposed method is tested on the KITTI dataset and the result shows a significant improvement without too many tricky parameter adjustments and training skills. The proposed method can also be used on other objects with obvious foreshortening effects, such as on-board pedestrian detection. The basic idea of the proposed method does not rely on concrete implementation and thus, most deep learning based object detectors with multiscale feature maps can be optimized with it.


Author(s):  
Zezheng Lv ◽  
Xiaoci Huang ◽  
Yaozhong Liang ◽  
Wenguan Cao ◽  
Yuxiang Chong

Lane detection algorithms require extremely low computational costs as an important part of autonomous driving. Due to heavy backbone networks, algorithms based on pixel-wise segmentation is struggling to handle the problem of runtime consumption in the recognition of lanes. In this paper, a novel and practical methodology based on lightweight Segmentation Network is proposed, which aims to achieve accurate and efficient lane detection. Different with traditional convolutional layers, the proposed Shadow module can reduce the computational cost of the backbone network by performing linear transformations on intrinsic feature maps. Thus a lightweight backbone network Shadow-VGG-16 is built. After that, a tailored pyramid parsing module is introduced to collect different sub-domain features, which is composed of both a strip pool module based on Pyramid Scene Parsing Network (PSPNet) and a convolution attention module. Finally, a lane structural loss is proposed to explicitly model the lane structure and reduce the influence of noise, so that the pixels can fit the lane better. Extensive experimental results demonstrate that the performance of our method is significantly better than the state-of-the-art (SOTA) algorithms such as Pointlanenet and Line-CNN et al. 95.28% and 90.06% accuracy and 62.5 frames per second (fps) inference speed can be achieved on the CULane and Tusimple test dataset. Compared with the latest ERFNet, Line-CNN, SAD, F1 scores have respectively increased by 3.51%, 2.84%, and 3.82%. Meanwhile, the result from our dataset exceeds the top performances of the other by 8.6% with an 87.09 F1 score, which demonstrates the superiority of our method.


2011 ◽  
Vol 130-134 ◽  
pp. 2429-2432
Author(s):  
Liang Xiu Zhang ◽  
Xu Yun Qiu ◽  
Zhu Lin Zhang ◽  
Yu Lin Wang

Realtime on-road vehicle detection is a key technology in many transportation applications, such as driver assistance, autonomous driving and active safety. A vehicle detection algorithm based on cascaded structure is introduced. Haar-like features are used to built model in this application, and GAB algorithm is chosen to train the strong classifiers. Then, the real-time on-road vehicle classifier based on cascaded structure is constructed by combining the strong classifiers. Experimental results show that the cascaded classifier is excellent in both detection accuracy and computational efficiency, which ensures its application to collision warning system.


2019 ◽  
Vol 16 (4) ◽  
pp. 172988141987067
Author(s):  
Enze Yang ◽  
Linlin Huang ◽  
Jian Hu

Vehicle detection is involved in a wide range of intelligent transportation and smart city applications, and the demand of fast and accurate detection of vehicles is increasing. In this article, we propose a convolutional neural network-based framework, called separable reverse connected network, for multi-scale vehicles detection. In this network, reverse connected structure enriches the semantic context information of previous layers, while separable convolution is introduced for sparse representation of heavy feature maps generated from subnetworks. Further, we use multi-scale training scheme, online hard example mining, and model compression technique to accelerate the training process as well as reduce the parameters. Experimental results on Pascal Visual Object Classes (VOC) 2007 + 2012 and MicroSoft Common Objects in COntext (MS COCO) 2014 demonstrate the proposed method yields state-of-the-art performance. Moreover, by separable convolution and model compression, the network of two-stage detector is accelerated by about two times with little loss of detection accuracy.


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Liwei Zhang ◽  
Jiahong Lai ◽  
Zenghui Zhang ◽  
Zhen Deng ◽  
Bingwei He ◽  
...  

Multiobject Tracking (MOT) is one of the most important abilities of autonomous driving systems. However, most of the existing MOT methods only use a single sensor, such as a camera, which has the problem of insufficient reliability. In this paper, we propose a novel Multiobject Tracking method by fusing deep appearance features and motion information of objects. In this method, the locations of objects are first determined based on a 2D object detector and a 3D object detector. We use the Nonmaximum Suppression (NMS) algorithm to combine the detection results of the two detectors to ensure the detection accuracy in complex scenes. After that, we use Convolutional Neural Network (CNN) to learn the deep appearance features of objects and employ Kalman Filter to obtain the motion information of objects. Finally, the MOT task is achieved by associating the motion information and deep appearance features. A successful match indicates that the object was tracked successfully. A set of experiments on the KITTI Tracking Benchmark shows that the proposed MOT method can effectively perform the MOT task. The Multiobject Tracking Accuracy (MOTA) is up to 76.40% and the Multiobject Tracking Precision (MOTP) is up to 83.50%.


Author(s):  
Y. Yuan ◽  
M. Sester

Abstract. Collective perception of connected vehicles can sufficiently increase the safety and reliability of autonomous driving by sharing perception information. However, collecting real experimental data for such scenarios is extremely expensive. Therefore, we built a computational efficient co-simulation synthetic data generator through CARLA and SUMO simulators. The simulated data contain image and point cloud data as well as ground truth for object detection and semantic segmentation tasks. To verify the superior performance gain of collective perception over single-vehicle perception, we conducted experiments of vehicle detection, which is one of the most important perception tasks for autonomous driving, on this data set. A 3D object detector and a Bird’s Eye View (BEV) detector are trained and then test with different configurations of the number of cooperative vehicles and vehicle communication ranges. The experiment results showed that collective perception can not only dramatically increase the overall mean detection accuracy but also the localization accuracy of detected bounding boxes. Besides, a vehicle detection comparison experiment showed that the detection performance drop caused by sensor observation noise can be canceled out by redundant information collected by multiple vehicles.


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Hoanh Nguyen

Vehicle detection is a crucial task in autonomous driving systems. Due to large variance of scales and heavy occlusion of vehicle in an image, this task is still a challenging problem. Recent vehicle detection methods typically exploit feature pyramid to detect vehicles at different scales. However, the drawbacks in the design prevent the multiscale features from being completely exploited. This paper introduces a feature pyramid architecture to address this problem. In the proposed architecture, an improving region proposal network is designed to generate intermediate feature maps which are then used to add more discriminative representations to feature maps generated by the backbone network, as well as improving the computational cost of the network. To generate more discriminative feature representations, this paper introduces multilayer enhancement module to reweight feature representations of feature maps generated by the backbone network to increase the discrimination of foreground objects and background regions in each feature map. In addition, an adaptive RoI pooling module is proposed to pool features from all pyramid levels for each proposal and fuse them for the detection network. Experimental results on the KITTI vehicle detection benchmark and the PASCAL VOC 2007 car dataset show that the proposed approach obtains better detection performance compared with recent methods on vehicle detection.


2020 ◽  
Vol 6 ◽  
pp. e311
Author(s):  
Zheming Fan ◽  
Chengbin Peng ◽  
Licun Dai ◽  
Feng Cao ◽  
Jianyu Qi ◽  
...  

Recently, object detection methods have developed rapidly and have been widely used in many areas. In many scenarios, helmet wearing detection is very useful, because people are required to wear helmets to protect their safety when they work in construction sites or cycle in the streets. However, for the problem of helmet wearing detection in complex scenes such as construction sites and workshops, the detection accuracy of current approaches still needs to be improved. In this work, we analyze the mechanism and performance of several detection algorithms and identify two feasible base algorithms that have complementary advantages. We use one base algorithm to detect relatively large heads and helmets. Also, we use the other base algorithm to detect relatively small heads, and we add another convolutional neural network to detect whether there is a helmet above each head. Then, we integrate these two base algorithms with an ensemble method. In this method, we first propose an approach to merge information of heads and helmets from the base algorithms, and then propose a linear function to estimate the confidence score of the identified heads and helmets. Experiments on a benchmark data set show that, our approach increases the precision and recall for base algorithms, and the mean Average Precision of our approach is 0.93, which is better than many other approaches. With GPU acceleration, our approach can achieve real-time processing on contemporary computers, which is useful in practice.


Author(s):  
Chen Guoqiang ◽  
Yi Huailong ◽  
Mao Zhuangzhuang

Aims: The factors including light, weather, dynamic objects, seasonal effects and structures bring great challenges for the autonomous driving algorithm in the real world. Autonomous vehicles can detect different object obstacles in complex scenes to ensure safe driving. Background: The ability to detect vehicles and pedestrians is critical to the safe driving of autonomous vehicles. Automated vehicle vision systems must handle extremely wide and challenging scenarios. Objective: The goal of the work is to design a robust detector to detect vehicles and pedestrians. The main contribution is that the Multi-level Feature Fusion Block (MFFB) and the Detector Cascade Block (DCB) are designed. The multi-level feature fusion and multi-step prediction are used which greatly improve the detection object precision. Methods: The paper proposes a vehicle and pedestrian object detector, which is an end-to-end deep convolutional neural network. The key parts of the paper are to design the Multi-level Feature Fusion Block (MFFB) and Detector Cascade Block (DCB). The former combines inherent multi-level features by combining contextual information with useful multi-level features that combine high resolution but low semantics and low resolution but high semantic features. The latter uses multi-step prediction, cascades a series of detectors, and combines predictions of multiple feature maps to handle objects of different sizes. Results: The experiments on the RobotCar dataset and the KITTI dataset show that our algorithm can achieve high precision results through real-time detection. The algorithm achieves 84.61% mAP on the RobotCar dataset and is evaluated on the well-known KITTI benchmark dataset, achieving 81.54% mAP. In particular, the detection accuracy of a single-category vehicle reaches 90.02%. Conclusion: The experimental results show that the proposed algorithm has a good trade-off between detection accuracy and detection speed, which is beyond the current state-of-the-art RefineDet algorithm. The 2D object detector is proposed in the paper, which can solve the problem of vehicle and pedestrian detection and improve the accuracy, robustness and generalization ability in autonomous driving.


2019 ◽  
Vol 9 (24) ◽  
pp. 5397
Author(s):  
Kun Zhao ◽  
Li Liu ◽  
Yu Meng ◽  
Qing Gu

3D object detection has recently become a research hotspot in the field of autonomous driving. Although great progress has been made, it still needs to be further improved. Therefore, this paper presents FDCA, a feature deep continuous aggregation network using multi-sensors for 3D vehicle detection. The proposed network adopts a two-stage structure with the bird’s-eye view (BEV) map and the RGB image as an input. In the first stage, two feature extractors were used to generate feature maps with the high-resolution and representational ability for each input view. These feature maps were then fused and fed to a 3D proposal generator to obtain the reliable 3D vehicle proposals. In the second stage, the refinement network aggregated the features of the proposal regions further and performed classifications, a 3D bounding boxes regression, and orientation estimations to predict the location and heading of vehicles in 3D space. The FDCA network proposed was trained and evaluated on the KITTI 3D object detection benchmark. The experimental results of the validation set illustrated that compared with other fusion-based methods, the 3D average precision (AP) could achieve 76.82% on a moderate setting while having real-time capability, which was higher than that of the second-best performing method by 2.38%. Meanwhile, the results of ablation experiments show that the convergence rate of FDCA was much faster and the stability was also much better, making it a candidate for application in autonomous driving.


Sign in / Sign up

Export Citation Format

Share Document