A Few Shot Object Detection Method Based on Feature Pyramid Network and Graph Neural Network

Author(s):  
Xinlong Li ◽  
Xingwei Li ◽  
Jiating Jin ◽  
Shaojie Guan ◽  
Yizhi Ge
2020 ◽  
Vol 12 (5) ◽  
pp. 784 ◽  
Author(s):  
Wei Guo ◽  
Weihong Li ◽  
Weiguo Gong ◽  
Jinkai Cui

Multi-scale object detection is a fundamental challenge in computer vision. Although many advanced methods based on convolutional neural networks have succeeded on natural images, progress on aerial images has been relatively slow, mainly because of the large scale variations among objects and the many densely distributed small objects. In this paper, considering that the semantic information of small objects may be weakened or even disappear in the deeper layers of a neural network, we propose a new detection framework called Extended Feature Pyramid Network (EFPN) to strengthen the network's information extraction ability. In the EFPN, we first design a multi-branched dilated bottleneck (MBDB) module in the lateral connections to capture more semantic information. Then, we devise an attention pathway to better locate objects. Finally, an augmented bottom-up pathway makes shallow-layer information easier to propagate and further improves performance. Moreover, we present an adaptive scale training strategy that enables the network to better recognize multi-scale objects, and a novel clustering method that produces adaptive anchors so that the network better learns data features. Experiments on public aerial datasets indicate that the presented method obtains state-of-the-art performance.
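The abstract does not describe its clustering method, so the following is only a minimal sketch of the general idea of data-driven anchors. It uses plain k-means over box (width, height) pairs with 1 - IoU as the distance, a common baseline and not the authors' novel method. The function names, the sample boxes, and the seed are assumptions made for illustration.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) pairs aligned at a common corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors (hypothetical helper).

    Each box joins the anchor it overlaps most (highest IoU), and each
    anchor then moves to the mean width and height of its members.
    """
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)
        new_anchors = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(m[0] for m in members) / len(members)
                h = sum(m[1] for m in members) / len(members)
                new_anchors.append((w, h))
            else:
                # Keep an anchor that attracted no boxes unchanged.
                new_anchors.append(anchors[i])
        if new_anchors == anchors:
            break
        anchors = new_anchors
    # Sort by area so the smallest anchor comes first.
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Illustrative data only: three small and three large boxes.
boxes = [(10, 10), (12, 11), (11, 12), (100, 100), (110, 95), (95, 105)]
anchors = kmeans_anchors(boxes, k=2)
```

On this toy data the two anchors settle on the mean shape of the small boxes and the mean shape of the large boxes, which is the behaviour one would want from anchors adapted to a dataset with strong scale variation.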


2019 ◽  
Vol 2019 ◽  
pp. 1-16
Author(s):  
Jiangfan Feng ◽  
Fanjie Wang ◽  
Siqin Feng ◽  
Yongrong Peng

Convolutional neural network (CNN) based object detection has achieved remarkable success. However, existing CNN-based algorithms struggle to detect small-scale objects, because the response of such objects may be lost once the feature map reaches a certain depth, and the scale of objects (such as cars, buses, and pedestrians) in traffic images and videos commonly varies greatly. In this paper, we present a 32-layer multibranch convolutional neural network named MBNet for fast object detection in traffic scenes. Our model uses three detection branches, with feature maps of size 16 × 16, 32 × 32, and 64 × 64, respectively, to optimize detection for large-, medium-, and small-scale objects. By means of a multitask loss function, our model can be trained end-to-end. The experimental results show that our model achieves state-of-the-art performance in terms of precision and recall rate, and that its detection speed (up to 33 fps) meets the real-time requirements of industry.
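To make the three grid resolutions concrete, the sketch below maps an object centre in pixel coordinates to a cell on each branch's feature map. The 512 × 512 input size is an assumption (the abstract does not state it), and the names `GRID_SIZES`, `stride`, and `responsible_cell` are hypothetical helpers, not part of MBNet.

```python
# Feature-map sizes per branch, as listed in the abstract.
GRID_SIZES = {"large": 16, "medium": 32, "small": 64}

# Assumed input resolution; the abstract does not give one.
IMAGE_SIZE = 512


def stride(image_size, grid_size):
    """Number of input pixels covered by one feature-map cell per axis."""
    return image_size // grid_size


def responsible_cell(cx, cy, image_size, grid_size):
    """Return the (row, col) of the grid cell that contains (cx, cy)."""
    cell = image_size / grid_size
    col = min(int(cx / cell), grid_size - 1)
    row = min(int(cy / cell), grid_size - 1)
    return row, col


# Cell indices for one object centre on every branch.
cells = {
    name: responsible_cell(100, 300, IMAGE_SIZE, size)
    for name, size in GRID_SIZES.items()
}
```

Under the assumed input size, one cell of the 64 × 64 map covers 8 × 8 pixels while one cell of the 16 × 16 map covers 32 × 32 pixels, which is one way to see why the finer map suits small objects.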


2019 ◽  
Vol 11 (16) ◽  
pp. 1921 ◽  
Author(s):  
Zijun Duo ◽  
Wenke Wang ◽  
Huizan Wang

Oceanic mesoscale eddies greatly influence energy and matter transport and acoustic propagation. However, the traditional detection method for oceanic mesoscale eddies relies too heavily on a threshold value and is significantly subjective. Existing machine learning methods are not mature or targeted enough, as their training sets lack authority. To address these problems, this paper constructs OEDNet, a network for automatic identification and positioning of mesoscale eddies that is based on an object detection network. First, 2D image processing technology is used to augment a small number of accurate eddy samples annotated by marine experts, generating the training set. Then, an object detection model with a deep residual network and a feature pyramid network as its main structure is designed and optimized for small samples and complex regions in the mesoscale eddies of the ocean. Experimental results show that the model achieves better recognition than the traditional detection method and generalizes well across different sea areas.
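The abstract names a feature pyramid network as part of the main structure without giving details. As a rough illustration of the top-down merge that such a pyramid generally performs, the sketch below upsamples a coarse map by a factor of two with nearest-neighbour copying and adds it element-wise to the finer map. It omits the 1 × 1 lateral convolutions of a full FPN, works on plain nested lists, and is not the OEDNet implementation; `upsample2x` and `merge_top_down` are hypothetical names.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D list of numbers."""
    out = []
    for row in fmap:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def merge_top_down(coarse, fine):
    """Add the upsampled coarse map to the fine map, element by element."""
    up = upsample2x(coarse)
    if len(up) != len(fine) or len(up[0]) != len(fine[0]):
        raise ValueError("fine map must be exactly twice the coarse size")
    return [
        [u + f for u, f in zip(up_row, fine_row)]
        for up_row, fine_row in zip(up, fine)
    ]


# Illustrative values only.
coarse = [[1, 2], [3, 4]]
fine = [[1] * 4 for _ in range(4)]
merged = merge_top_down(coarse, fine)
```

The merged map keeps the resolution of the finer level while carrying values from the coarser one, which is the property that makes pyramids attractive for small targets.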


2019 ◽  
Vol 56 (4) ◽  
pp. 041502 ◽  
Author(s):  
任之俊 Ren Zhijun ◽  
蔺素珍 Lin Suzhen ◽  
李大威 Li Dawei ◽  
王丽芳 Wang Lifang ◽  
左健宏 Zuo Jianhong

2021 ◽  
Vol 12 (2) ◽  
pp. 128
Author(s):  
Anky Aditya P ◽  
Suryo Adhi Wibowo ◽  
Rissa Rahmania

Abstract: Augmented Reality (AR) is a technology that combines real-world dimensions with virtual-world dimensions displayed in real time. In an AR environment, the interaction techniques used can vary. Marker-based AR is one type of AR that allows virtual objects to be displayed in the real world by using markers as pointers. Marker-based AR requires an object detection method to track the markers. In this study, a system that can detect objects in the form of fingertips is designed, using the Faster Region-based Convolutional Neural Network (Faster R-CNN) method. Faster R-CNN is an object detection method that combines the Fast R-CNN method with the Region Proposal Network (RPN). The detection outputs used for tracking are the coordinates x and y, the width, and the length. This research uses Faster R-CNN because it has a faster computing speed than the previous method, namely Particle Filter. The Faster R-CNN method uses the ResNet architecture as the core of the CNN. The system configurations tested are 25K, 50K, and 75K training steps with the same-padding scheme. The data are taken from a video and consist of 10800 training samples and 3600 test samples. The best system configuration, based on parameter priority for AR technology, is obtained with 50K training steps.
Keywords: augmented reality, convolutional neural network, faster region-based convolutional neural network, region proposal network, ResNet.
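The abstract says the tracker consumes x, y, width, and length but does not say how the detector's boxes are encoded. The sketch below assumes, purely for illustration, that a detection arrives as normalized corner coordinates (x_min, y_min, x_max, y_max) in the range 0 to 1 and that x, y denote the top-left corner in pixels. The function name and the frame size are hypothetical.

```python
def corners_to_xywh(box, frame_w, frame_h):
    """Convert normalized corners to pixel (x, y, width, height).

    Assumes box = (x_min, y_min, x_max, y_max) with values in [0, 1]
    and returns x, y as the top-left corner in pixels.
    """
    x_min, y_min, x_max, y_max = box
    x = x_min * frame_w
    y = y_min * frame_h
    width = (x_max - x_min) * frame_w
    height = (y_max - y_min) * frame_h
    return x, y, width, height


# Illustrative detection on an assumed 640 x 480 video frame.
tracked = corners_to_xywh((0.25, 0.5, 0.75, 1.0), 640, 480)
```

If the detector instead reports box centres or pixel coordinates, the conversion changes accordingly, so the convention should be checked against the actual detector output.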

