Research on Object Detection Algorithm Based on Multilayer Information Fusion

2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Bao-Yuan Chen ◽  
Yu-Kun Shen ◽  
Kun Sun

At present, object detectors based on convolutional neural networks generally rely on the last layer of features extracted by the feature extraction network. During repeated convolution and pooling of deep features, position information cannot be completely propagated to later layers. This paper proposes a multiscale feature reuse detection model, which includes the basic feature extraction network DenseNet, a feature fusion network, a multiscale anchor region proposal network, and a classification and regression network. Fusing high-dimensional and low-dimensional features not only strengthens the model's sensitivity to objects of different sizes but also strengthens the transmission of information, so that the feature map carries rich deep semantic information and shallow location information at the same time, which significantly improves the robustness and detection accuracy of the model. The algorithm is trained and tested on the Pascal VOC2007 dataset. The experimental results show that the mean average precision on the dataset is 73.87%. Compared with the mainstream Faster RCNN and SSD detection models, the mean average precision of the DenseNet-based object detection algorithm is improved by 5.63% and 3.86%, respectively.
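The core fusion step this abstract describes — combining a deep, semantically rich feature map with a shallow, spatially precise one — can be sketched in numpy. This is a minimal illustration, not the paper's implementation; the function names and channel sizes are hypothetical, and nearest-neighbor upsampling stands in for whatever resampling the model actually uses.

```python
import numpy as np

def upsample2x(fmap):
    """Nearest-neighbor 2x upsampling of a (C, H, W) feature map."""
    return fmap.repeat(2, axis=1).repeat(2, axis=2)

def fuse(deep, shallow):
    """Upsample the deep (semantic) map to the shallow (positional)
    map's resolution and concatenate along the channel axis."""
    up = upsample2x(deep)
    assert up.shape[1:] == shallow.shape[1:], "spatial sizes must match"
    return np.concatenate([up, shallow], axis=0)

deep = np.random.rand(256, 13, 13)     # high-level semantics, low resolution
shallow = np.random.rand(128, 26, 26)  # low-level detail, high resolution
fused = fuse(deep, shallow)
print(fused.shape)  # (384, 26, 26)
```

The fused map keeps the shallow map's resolution while carrying both feature types, which is what lets a single detection head see deep semantics and shallow locations at once.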

2022 ◽  
Vol 11 (01) ◽  
pp. 22-26
Author(s):  
Hui Xiang ◽  
Junyan Han ◽  
Hanqing Wang ◽  
Hao Li ◽  
Shangqing Li ◽  
...  

Aiming at the low detection accuracy and poor recognition of small-scale targets in traditional vehicle and pedestrian detection methods, a vehicle and pedestrian detection method based on an improved YOLOv4-Tiny is proposed. On the basis of YOLOv4-Tiny, an 8-fold downsampling feature layer was added for feature fusion, the PANet structure was used to perform bidirectional fusion of the deep and shallow features from the output feature layers of the backbone network, and a detection head for small targets was added. The results show that the mean average precision of the improved method reaches 85.93%, with detection performance similar to that of YOLOv4. Compared with YOLOv4-Tiny, the mean average precision of the improved method is increased by 24.45%, and the detection speed reaches 67.83 FPS, which means the detection effect is significantly improved and meets real-time requirements.
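Why an 8-fold downsampling layer helps small targets can be seen from the detection grid sizes: a smaller stride yields a finer grid, so each cell covers fewer pixels and small objects are less likely to be lost. A quick sketch (the 416-pixel input size is a common YOLOv4-Tiny default, assumed here rather than stated in the abstract):

```python
def grid_size(input_size, stride):
    """Spatial size of the detection grid at a given downsampling stride."""
    return input_size // stride

input_size = 416  # typical YOLOv4-Tiny input resolution (assumed)
for stride in (32, 16, 8):
    g = grid_size(input_size, stride)
    print(f"stride {stride:2d}: {g}x{g} grid, each cell covers {stride}x{stride} px")
```

The added stride-8 head quadruples the number of cells relative to stride 16, which is where the gain on small-scale vehicles and pedestrians comes from.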


Electronics ◽  
2020 ◽  
Vol 9 (8) ◽  
pp. 1235
Author(s):  
Yang Yang ◽  
Hongmin Deng

In order to make the classification and regression of single-stage detectors more accurate, this paper proposes an object detection algorithm named Global Context You-Only-Look-Once v3 (GC-YOLOv3), based on You-Only-Look-Once (YOLO). Firstly, a better cascading model with learnable semantic fusion between the feature extraction network and the feature pyramid network is designed to improve detection accuracy using a global context block. Secondly, the information to be retained is screened by combining three differently scaled feature maps. Finally, a global self-attention mechanism is used to highlight the useful information in feature maps while suppressing irrelevant information. Experiments show that GC-YOLOv3 reaches a maximum of 55.5 mean Average Precision (mAP)@0.5 on the Common Objects in Context (COCO) 2017 test-dev set and that its mAP is 5.1% higher than that of YOLOv3 on the Pascal Visual Object Classes (PASCAL VOC) 2007 test set. These experiments indicate that the proposed GC-YOLOv3 model performs strongly on both the PASCAL VOC and COCO datasets.
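The global self-attention idea — reweighting spatial positions so that informative locations are emphasized and irrelevant ones suppressed — can be sketched with a softmax over per-position scores. This is a toy stand-in, not GC-YOLOv3's actual block: the scoring function (a simple channel mean) and the rescaling are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def global_spatial_attention(fmap):
    """Weight each spatial position of a (C, H, W) map by a softmax
    over a per-position score, emphasizing informative locations."""
    C, H, W = fmap.shape
    scores = fmap.mean(axis=0).reshape(-1)      # one scalar score per position
    weights = softmax(scores).reshape(1, H, W)  # sums to 1 over all positions
    return fmap * weights * (H * W)             # rescale to keep magnitudes comparable

fmap = np.random.rand(8, 4, 4)
out = global_spatial_attention(fmap)
print(out.shape)  # (8, 4, 4)
```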


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Rui Wang ◽  
Ziyue Wang ◽  
Zhengwei Xu ◽  
Chi Wang ◽  
Qiang Li ◽  
...  

Object detection is an important part of autonomous driving technology. To ensure the safe running of vehicles at high speed, real-time and accurate detection of all objects on the road is required. How to balance the speed and accuracy of detection has been a hot research topic in recent years. This paper puts forward a one-stage object detection algorithm based on YOLOv4, which improves detection accuracy while supporting real-time operation. The backbone of the algorithm doubles the stacking times of the last residual block of CSPDarkNet53. The neck replaces the SPP with the RFB structure, improves the PAN structure of the feature fusion module, and adds the CBAM and CA attention mechanisms to the backbone and neck; finally, the overall width of the network is reduced to 3/4 of the original, so as to reduce the model parameters and improve inference speed. Compared with YOLOv4, the algorithm improves the average accuracy on the KITTI dataset by 2.06% and on the BDD dataset by 2.95%. With detection accuracy almost unchanged, the inference speed of the algorithm is increased by 9.14%, enabling real-time detection at more than 58.47 FPS.
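Scaling the network width to 3/4 shrinks parameters faster than the width factor itself, because a convolution's weight count depends on both its input and output channel counts. A back-of-the-envelope check (layer sizes here are illustrative, not the paper's):

```python
def conv_params(c_in, c_out, k=3):
    """Weight count of a k x k convolution (bias terms ignored)."""
    return c_in * c_out * k * k

base = conv_params(256, 256)  # an example full-width layer
slim = conv_params(192, 192)  # both channel counts scaled by 3/4
print(slim / base)  # (3/4)^2 = 0.5625
```

So a uniform 3/4 width cut roughly halves per-layer convolution parameters, which is consistent with the abstract's goal of fewer parameters and faster inference.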


2021 ◽  
pp. 157-166
Author(s):  
Lei Zhao ◽  
Jia Su ◽  
Zhiping Shi ◽  
Yong Guan

This paper focuses on using traditional image processing algorithms with some apparent-to-semantic features to improve detection accuracy. Based on optimization of the Faster R-CNN algorithm, a mainstream framework in current object detection, multi-channel features are obtained by combining traditional image feature algorithms (such as Integral Channel Features (ICF), Histograms of Oriented Gradients (HOG), and Local Binary Patterns (LBP)) with advanced semantic feature algorithms (such as segmentation and heatmaps). In order to jointly train on the original image and the outputs of the above feature extraction algorithms, a network called the Multi-Channel Feature Network (MCFN) is proposed to increase object detection accuracy while minimizing system weight. MCFN provides a multi-channel input interface that is limited neither to the RGB components of a single picture nor to a fixed number of input channels. The experimental results show the relationship between the number of additional channels, model performance, and accuracy. With two additional channels, the Mean Average Precision (mAP) improves by 2%-3% over the basic Faster R-CNN structure. When the number of extra channels is increased further, accuracy does not increase linearly; in fact, system performance starts to fluctuate within a range after the number of additional channels reaches six.
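The multi-channel input idea — feeding the network the RGB image plus extra hand-crafted feature planes — amounts to stacking single-channel maps onto the image along the channel axis. A minimal sketch, assuming the extra maps (stand-ins for HOG or LBP responses here) have already been computed at the image's resolution:

```python
import numpy as np

def build_multichannel_input(rgb, extra_maps):
    """Stack an (H, W, 3) RGB image with single-channel feature maps
    (e.g., HOG or LBP responses) into an (H, W, 3 + N) input tensor."""
    for m in extra_maps:
        assert m.shape == rgb.shape[:2], "feature maps must match image size"
    return np.dstack([rgb] + [m[..., np.newaxis] for m in extra_maps])

rgb = np.random.rand(64, 64, 3)
hog_like = np.random.rand(64, 64)  # hypothetical HOG-response channel
lbp_like = np.random.rand(64, 64)  # hypothetical LBP channel
x = build_multichannel_input(rgb, [hog_like, lbp_like])
print(x.shape)  # (64, 64, 5)
```

The first convolution of the network then simply takes 3 + N input channels instead of 3, which matches the abstract's point that the interface is not limited to a fixed channel count.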


2018 ◽  
Vol 2018 ◽  
pp. 1-10 ◽  
Author(s):  
Guo X. Hu ◽  
Zhong Yang ◽  
Lei Hu ◽  
Li Huang ◽  
Jia M. Han

Existing object detection algorithms based on deep convolutional neural networks need to carry out multilevel convolution and pooling operations on the entire image in order to extract deep semantic features. Such models obtain good results for large objects. However, they fail to detect small objects that have low resolution and are greatly influenced by noise, because the features produced by repeated convolution operations do not fully represent the essential characteristics of small objects. In this paper, we achieve good detection accuracy by extracting features at different convolution levels of the object and using these multiscale features to detect small objects. In our detection model, we extract image features from the third, fourth, and fifth convolutional layers, and these three scales of features are then concatenated into a one-dimensional vector. The vector is used to classify objects with classifiers and to locate objects by bounding-box regression. In testing, the detection accuracy of our model for small objects is 11% higher than that of state-of-the-art models. In addition, we also used the model to detect aircraft in remote sensing images and achieved good results.
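The concatenation step — flattening features from three convolution levels into one descriptor for the classifier and regressor — can be sketched directly. The channel counts and spatial sizes below are illustrative assumptions, not the paper's actual dimensions:

```python
import numpy as np

def multiscale_vector(fmaps):
    """Flatten feature maps taken from different convolution levels and
    concatenate them into a single 1-D descriptor."""
    return np.concatenate([f.ravel() for f in fmaps])

f3 = np.random.rand(64, 8, 8)   # third-level features (assumed sizes)
f4 = np.random.rand(128, 4, 4)  # fourth-level features
f5 = np.random.rand(256, 2, 2)  # fifth-level features
v = multiscale_vector([f3, f4, f5])
print(v.shape)  # (64*64 + 128*16 + 256*4,) = (7168,)
```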


Water ◽  
2021 ◽  
Vol 13 (17) ◽  
pp. 2420
Author(s):  
Pengfei Shi ◽  
Xiwang Xu ◽  
Jianjun Ni ◽  
Yuanxue Xin ◽  
Weisheng Huang ◽  
...  

Underwater organisms are an important part of the underwater ecological environment, and increasing attention is being paid to perceiving that environment by intelligent means such as machine vision. However, many factors reduce the accuracy of underwater biological detection, such as low-quality images, varied sizes and shapes, and the overlapping or occlusion of underwater organisms. Therefore, this paper proposes an underwater biological detection algorithm based on an improved Faster-RCNN. Firstly, ResNet is used as the backbone feature extraction network of Faster-RCNN. Then, a BiFPN (Bidirectional Feature Pyramid Network) is used to build a ResNet–BiFPN structure, which improves the capability of feature extraction and multi-scale feature fusion. Additionally, EIoU (Effective IoU) is used to replace IoU to reduce the proportion of redundant bounding boxes in the training data. Moreover, K-means++ clustering is used to generate more suitable anchor boxes to improve detection accuracy. Finally, the experimental results show that the detection accuracy of the proposed algorithm on the URPC2018 dataset is improved to 88.94%, which is 8.26% higher than Faster-RCNN. The results fully demonstrate the effectiveness of the proposed algorithm.
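Anchor clustering works by grouping the (width, height) pairs of ground-truth boxes so that the cluster centers become the anchor sizes; 1 − IoU is the usual distance, so anchors match box shapes rather than raw pixel differences. The sketch below uses plain random initialization rather than the full K-means++ seeding the paper describes, and the toy boxes are invented for illustration:

```python
import numpy as np

def wh_iou(box, anchors):
    """IoU between a (w, h) box and an array of (w, h) anchors,
    assuming all boxes share the same center."""
    inter = np.minimum(box[0], anchors[:, 0]) * np.minimum(box[1], anchors[:, 1])
    union = box[0] * box[1] + anchors[:, 0] * anchors[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=50, seed=0):
    """Cluster (w, h) boxes into k anchors, using 1 - IoU as the distance."""
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        # assign each box to the anchor it overlaps most
        assign = np.array([np.argmax(wh_iou(b, anchors)) for b in boxes])
        for j in range(k):
            members = boxes[assign == j]
            if len(members):
                anchors[j] = members.mean(axis=0)
    return anchors

boxes = np.array([[10., 12.], [11., 13.], [50., 60.], [55., 62.], [100., 40.]])
print(kmeans_anchors(boxes, k=3))  # three (w, h) anchor shapes
```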


2021 ◽  
Vol 2132 (1) ◽  
pp. 012036
Author(s):  
Dawei Liu ◽  
Shujing Gao

Abstract An improved algorithm is proposed to solve the inaccurate recognition and low recall of the Faster Regions with Convolutional Neural Network (Faster-RCNN) algorithm when detecting ship targets in remote sensing images. The algorithm is based on the Faster-RCNN network framework. Aiming at the small size and dense distribution of ship targets in remote sensing images, the feature extraction network is improved to enhance the detection of small targets. ResNet50 is used as the basic feature extraction network, and a dilated (hole) residual block is introduced for multi-layer feature fusion to construct a new feature extraction network, which improves the feature extraction capability of the algorithm. The experimental results show that, compared with the Faster-RCNN algorithm, this algorithm can learn richer target features in smaller pixel areas, thereby effectively improving the detection accuracy of ship targets.
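The benefit of dilated (hole) convolutions is that they enlarge the receptive field without extra parameters or downsampling, which matters for small, dense targets. A standard receptive-field calculation illustrates this (the layer configuration below is an example, not the paper's network):

```python
def receptive_field(layers):
    """Receptive field of stacked convolutions.
    Each layer is (kernel_size, stride, dilation)."""
    rf, jump = 1, 1
    for k, s, d in layers:
        k_eff = d * (k - 1) + 1   # dilation enlarges the kernel's span
        rf += (k_eff - 1) * jump
        jump *= s
    return rf

# three 3x3 stride-1 convolutions: plain vs. dilated with rates 1, 2, 4
print(receptive_field([(3, 1, 1)] * 3))                    # 7
print(receptive_field([(3, 1, 1), (3, 1, 2), (3, 1, 4)]))  # 15
```

The dilated stack more than doubles the receptive field at the same cost and resolution, letting small-target features carry wider context.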


2018 ◽  
Vol 10 (1) ◽  
pp. 57-64 ◽  
Author(s):  
Rizqa Raaiqa Bintana ◽  
Chastine Fatichah ◽  
Diana Purwitasari

Community-based question answering (CQA) is formed to help people search for information they need through a community. One condition that may occur in CQA is that people cannot obtain the information they need and therefore post a new question. This causes the CQA archive to grow with duplicated questions. It therefore becomes an important problem to find questions in the CQA archive that are semantically similar to a new question. In this study, we use convolutional neural network methods for semantic modeling of sentences to obtain words that represent the content of the documents and the new question. For the process of finding questions semantically similar to a new question (query) in the question-answer document archive, the convolutional neural network method obtains a mean average precision of 0.422, whereas the vector space model, used as a comparison, obtains a mean average precision of 0.282. Index Terms—community-based question answering, convolutional neural network, question retrieval
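The mean average precision figures compared here (0.422 vs. 0.282) follow the standard retrieval definition: for each query, precision is averaged at the rank of every relevant result, then averaged over queries. A small self-contained implementation with toy relevance lists:

```python
def average_precision(ranked_relevance):
    """AP for one query: ranked_relevance lists 0/1 relevance flags in
    retrieval order; precision is averaged at each relevant hit."""
    hits, precisions = 0, []
    for i, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / i)
    return sum(precisions) / max(hits, 1)

def mean_average_precision(queries):
    """Mean of per-query AP over all queries."""
    return sum(average_precision(q) for q in queries) / len(queries)

# two toy queries: relevant items at ranks 1 and 3, and at rank 2
print(mean_average_precision([[1, 0, 1], [0, 1, 0]]))  # 0.666...
```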


2021 ◽  
pp. 1-18
Author(s):  
Hui Liu ◽  
Boxia He ◽  
Yong He ◽  
Xiaotian Tao

Existing seal ring surface defect detection methods for aerospace applications suffer from low detection efficiency, strong specificity, large fine-grained classification errors, and unstable detection results. Considering these problems, a fine-grained seal ring surface defect detection algorithm for aerospace applications is proposed. Based on an analysis of the stacking process of standard convolutions, heat maps of the original pixels in the receptive field participating in the convolution operation are quantified and generated. According to the generated heat maps, a feature extraction optimization method using convolution combinations with different dilation rates is proposed, and an efficient convolutional feature extraction network containing three kinds of dilated convolutions is designed. Combined with the O-ring surface defect features, a multiscale defect detection network is designed. Before the heads for multiscale classification and position regression, feature fusion tree modules are added to ensure the reuse and compression of the responsive features of different receptive fields on feature maps of the same scale. Experimental results show that on the O-rings-3000 test dataset, the mean condition accuracy of the proposed algorithm reaches 95.10% for 5 types of surface defects of aerospace O-rings. Compared with RefineDet, the mean condition accuracy of the proposed algorithm is reduced by only 1.79%, while the parameters and FLOPs are reduced by 35.29% and 64.90%, respectively. Moreover, the proposed algorithm adapts well to the image blur and lighting changes caused by cost-reduced imaging hardware, thus saving cost.


2021 ◽  
Vol 11 (13) ◽  
pp. 6016
Author(s):  
Jinsoo Kim ◽  
Jeongho Cho

For autonomous vehicles, it is critical to be aware of the driving environment to avoid collisions and drive safely. The recent evolution of convolutional neural networks has contributed significantly to accelerating the development of object detection techniques that enable autonomous vehicles to handle rapid changes in various driving environments. However, collisions in an autonomous driving environment can still occur due to undetected obstacles and various perception problems, particularly occlusion. Thus, we propose a robust object detection algorithm for environments in which objects are truncated or occluded, employing an RGB image and light detection and ranging (LiDAR) bird’s eye view (BEV) representations. This structure combines independent detection results obtained in parallel through “you only look once” networks using an RGB image and a height map converted from the BEV representation of LiDAR point cloud data (PCD). The region proposal of an object is determined via non-maximum suppression, which suppresses the bounding boxes of adjacent regions. A performance evaluation of the proposed scheme was performed on the KITTI vision benchmark suite dataset. The results demonstrate that detection accuracy with the PCD BEV representations integrated is superior to that with an RGB camera alone. In addition, robustness is improved by significantly enhanced detection accuracy even when the target objects are partially occluded when viewed from the front, demonstrating that the proposed algorithm outperforms the conventional RGB-based model.
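The fusion step this abstract describes relies on greedy non-maximum suppression to merge the parallel RGB and BEV detections: keep the highest-scoring box, discard overlapping neighbors, repeat. A minimal sketch of that standard procedure (the boxes and threshold below are illustrative, not the paper's data):

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, format (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box,
    drop neighbors whose IoU with it exceeds `thresh`, repeat."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) <= thresh]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # the near-duplicate second box is suppressed
```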

