SNS-CF: Siamese Network with Spatially Semantic Correlation Features for Object Tracking

Sensors ◽  
2020 ◽  
Vol 20 (17) ◽  
pp. 4881
Author(s):  
Thierry Ntwari ◽  
Hasil Park ◽  
Joongchol Shin ◽  
Joonki Paik

Recent advances in object tracking based on deep Siamese networks have shifted attention away from correlation filters. However, the Siamese network alone is not as accurate as state-of-the-art correlation filter-based trackers, whereas correlation filter-based trackers alone suffer from a frame update problem. In this paper, we present a Siamese network with spatially semantic correlation features (SNS-CF) for accurate, robust object tracking. To deal with various types of features spread over many regions of the input frame, the proposed SNS-CF consists of (1) a Siamese feature extractor, (2) a spatially semantic feature extractor, and (3) an adaptive correlation filter. To the best of the authors' knowledge, the proposed SNS-CF is the first attempt to fuse a Siamese network with a correlation filter to provide high-frame-rate, real-time visual tracking with performance favorable to state-of-the-art methods on multiple benchmarks.
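To make the Siamese/correlation-filter fusion idea concrete, a minimal Python sketch is given below. The function names, the normalization, and the fixed fusion weight are illustrative assumptions, not the SNS-CF implementation; real trackers would replace the random arrays with learned feature maps.

```python
# Hypothetical sketch: fuse a Siamese similarity map with a correlation-filter
# response. Names and the weighting rule are assumptions, not the paper's code.
import numpy as np

def xcorr_response(template_feat, search_feat):
    """FFT-based cross-correlation of a template with a larger search region."""
    T = np.fft.fft2(template_feat, s=search_feat.shape)
    S = np.fft.fft2(search_feat)
    return np.real(np.fft.ifft2(np.conj(T) * S))

def fuse(siamese_map, cf_map, alpha=0.6):
    """Weighted fusion of the two response maps (alpha is a free parameter)."""
    norm = lambda m: (m - m.min()) / (m.max() - m.min() + 1e-8)
    return alpha * norm(siamese_map) + (1 - alpha) * norm(cf_map)

# Toy usage with random arrays standing in for extracted feature maps.
rng = np.random.default_rng(0)
template, search = rng.random((32, 32)), rng.random((64, 64))
siamese_map = xcorr_response(template, search)        # Siamese branch
cf_map = xcorr_response(template ** 2, search ** 2)   # stand-in for the CF branch
fused = fuse(siamese_map, cf_map)
cy, cx = np.unravel_index(np.argmax(fused), fused.shape)
print("estimated target centre:", (cy, cx))
```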

2021 ◽  
Author(s):  
Kosuke Honda ◽  
Hamido Fujita

In recent years, template-based methods such as Siamese network trackers and correlation filter (CF)-based trackers have achieved state-of-the-art performance on several benchmarks. Recent Siamese network trackers use deep features extracted from convolutional neural networks to locate the target. However, their tracking performance decreases when distractors similar to the object are present or the target object is deformed. On the other hand, CF-based trackers use handcrafted features (e.g., HOG features) to spatially locate the target. These two approaches have complementary characteristics due to differences in learning methods, features used, and the size of the search regions, and we also found that they are complementary in terms of benchmark performance. Therefore, we propose the Complementary Tracking framework using Average peak-to-correlation energy (CTA). CTA is a generic object tracking framework that runs CF trackers and Siamese trackers in parallel and exploits their complementary characteristics. In CTA, when a tracking failure of the Siamese tracker is detected using the Average Peak-to-Correlation Energy (APCE), an evaluation index of the response map, the CF tracker corrects the output. In experiments on OTB100, CTA significantly improves performance over the original trackers for several combinations of Siamese trackers and CF trackers.
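The switching criterion can be sketched with the standard APCE definition, APCE = |F_max - F_min|^2 / mean((F - F_min)^2), computed over the response map. The threshold and the surrounding selection logic below are assumptions for illustration, not CTA's exact configuration.

```python
# Sketch of an APCE-based fallback from the Siamese tracker to the CF tracker.
import numpy as np

def apce(response):
    """APCE = |Fmax - Fmin|^2 / mean((F - Fmin)^2) over the response map."""
    f_max, f_min = response.max(), response.min()
    return (f_max - f_min) ** 2 / (np.mean((response - f_min) ** 2) + 1e-8)

def select_output(siamese_box, cf_box, siamese_response, apce_threshold=20.0):
    """Use the Siamese prediction unless its response map looks unreliable."""
    if apce(siamese_response) < apce_threshold:
        return cf_box        # likely tracking failure: trust the CF tracker
    return siamese_box

# Toy usage: a sharply peaked map (reliable) vs. a flat noisy map (unreliable).
peaked = np.zeros((17, 17)); peaked[8, 8] = 1.0
noisy = np.random.default_rng(1).random((17, 17)) * 0.1
print(apce(peaked) > apce(noisy))  # True: the peaked map scores much higher
```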


Author(s):  
D. Zhang ◽  
J. Lv ◽  
Z. Cheng ◽  
Y. Bai ◽  
Y. Cao

Abstract. With the development of deep learning object tracking methods in recent years, the fully convolutional Siamese network tracking algorithm SiamFC has become a classic deep learning tracker. To address the problem that the tracking accuracy of SiamFC degrades against complex backgrounds, this paper introduces an attention mechanism into SiamFC that performs channel and spatial weighting on the feature maps obtained by convolving the input image. At the same time, the CNN backbone of the algorithm is adjusted, and a Siamese network combined with an attention mechanism for object tracking is proposed. This strengthens the effectiveness of feature extraction and enhances the ability of the network to discriminate targets. The algorithm is tested on the OTB2015, VOT2016, and VOT2017 datasets and compared with multiple object tracking algorithms. Experimental results show that the proposed algorithm better handles complex backgrounds in object tracking and has certain advantages over the other algorithms.
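A generic channel-plus-spatial attention block of the kind described above can be sketched as follows. The layer sizes, reduction ratio, and kernel sizes are assumptions in the spirit of common attention modules, not this paper's exact architecture.

```python
# Illustrative channel + spatial weighting of a convolutional feature map.
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        # Channel attention: squeeze spatial dims, produce per-channel weights.
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: a conv over pooled per-location statistics.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_mlp(x)                        # channel weighting
        avg_map = x.mean(dim=1, keepdim=True)
        max_map = x.amax(dim=1, keepdim=True)
        attn = self.spatial_conv(torch.cat([avg_map, max_map], dim=1))
        return x * attn                                    # spatial weighting

# Toy usage on a feature map of shape (batch, channels, H, W).
feat = torch.randn(1, 256, 22, 22)
print(ChannelSpatialAttention(256)(feat).shape)  # torch.Size([1, 256, 22, 22])
```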


Sensors ◽  
2019 ◽  
Vol 19 (10) ◽  
pp. 2362 ◽  
Author(s):  
Yijin Yang ◽  
Yihong Zhang ◽  
Demin Li ◽  
Zhijie Wang

Correlation filter-based methods have recently performed remarkably well in terms of accuracy and speed in visual object tracking. However, most existing correlation filter-based methods are not robust to significant appearance changes of the target, especially when the target undergoes deformation, illumination variation, or rotation. In this paper, a novel parallel correlation filters (PCF) framework is proposed for real-time visual object tracking. Firstly, the proposed method constructs two parallel correlation filters, one for tracking appearance changes of the target and the other for tracking its translation. Secondly, by weighted merging of the response maps of these two parallel correlation filters, the proposed method accurately locates the center position of the target. Finally, in the training stage, a new, more reasonable distribution of the correlation output is proposed to replace the original Gaussian distribution and train more accurate correlation filters, which prevents the model from drifting and achieves excellent tracking performance. Extensive qualitative and quantitative experiments on the common object tracking benchmarks OTB-2013 and OTB-2015 demonstrate that the proposed PCF tracker outperforms most state-of-the-art trackers and achieves high real-time tracking performance.
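For context, the sketch below shows a generic single-channel correlation filter trained by closed-form ridge regression in the Fourier domain (MOSSE/KCF style), with two such filters run in parallel and their responses merged with fixed weights. It only illustrates the "parallel filters plus weighted fusion" idea; PCF's actual label distribution and filter design are not reproduced here, and the second "appearance" view is a placeholder.

```python
# Generic CF training/detection step, two filters fused with fixed weights.
import numpy as np

def gaussian_label(shape, sigma=2.0):
    """Standard Gaussian-shaped desired output centred on the target."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = h // 2, w // 2
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

def train_filter(patch, label, lam=1e-2):
    """Closed-form ridge regression: W = (Y * conj(X)) / (X * conj(X) + lam)."""
    X, Y = np.fft.fft2(patch), np.fft.fft2(label)
    return (Y * np.conj(X)) / (X * np.conj(X) + lam)

def detect(filt, patch):
    """Apply a trained filter to a new patch and return its response map."""
    return np.real(np.fft.ifft2(filt * np.fft.fft2(patch)))

# Toy usage: two parallel filters, weighted response fusion, peak localization.
rng = np.random.default_rng(0)
patch = rng.random((64, 64))
label = gaussian_label(patch.shape)
f_translation = train_filter(patch, label)
f_appearance = train_filter(patch ** 2, label)   # stand-in for a second view
response = (0.7 * detect(f_translation, patch)
            + 0.3 * detect(f_appearance, patch ** 2))
print(np.unravel_index(np.argmax(response), response.shape))
```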


2020 ◽  
Vol 226 ◽  
pp. 03010 ◽  
Author(s):  
Pavel Goncharov ◽  
Alexander Uzhinskiy ◽  
Gennady Ososkov ◽  
Andrey Nechaevskiy ◽  
Julia Zudikhina

Crop losses are a major threat to the wellbeing of rural families, to the economy and governments, and to food security worldwide. The goal of our research is to develop a multi-functional platform that helps the farming community fight plant diseases. In our previous works, we reported the creation of a special database of healthy and diseased plant leaves consisting of five sets of grape images, and proposed a classification model based on a deep Siamese network followed by a k-nearest neighbors (KNN) classifier. We then extended our database to five sets of images for grape, corn, and wheat (611 images in total). Since the classification accuracy decreased to 86% after this extension, in this paper we propose a novel architecture with a deep Siamese network as the feature extractor and a single-layer perceptron as the classifier, which yields a significant gain in accuracy, up to 96%.
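The classifier half of this pipeline can be sketched as below: a frozen Siamese network produces embeddings, and a single linear layer is trained on top of them. The embedding size, class count, and training details are placeholders, not the authors' configuration; random tensors stand in for the real embeddings.

```python
# Sketch: single-layer perceptron on top of (frozen) Siamese embeddings.
import torch
import torch.nn as nn

EMBED_DIM, NUM_CLASSES = 128, 15   # placeholder sizes, not the paper's values

class SingleLayerPerceptron(nn.Module):
    """Linear classifier applied to feature vectors from the Siamese network."""
    def __init__(self, embed_dim, num_classes):
        super().__init__()
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, embeddings):
        return self.fc(embeddings)

# Toy training loop on random embeddings standing in for Siamese features.
model = SingleLayerPerceptron(EMBED_DIM, NUM_CLASSES)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
embeddings = torch.randn(64, EMBED_DIM)           # output of the Siamese net
labels = torch.randint(0, NUM_CLASSES, (64,))
for _ in range(5):
    optimizer.zero_grad()
    loss = criterion(model(embeddings), labels)
    loss.backward()
    optimizer.step()
print("final toy loss:", float(loss))
```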


Author(s):  
Daqun Li ◽  
Yi Yu ◽  
Xiaolin Chen

Abstract. To improve the deficient tracking ability of the fully convolutional Siamese network (SiamFC) in complex scenes, an object tracking framework with a Siamese network and a re-detection mechanism (Siam-RM) is proposed. The mechanism adopts the Siamese instance search tracker (SINT) as the re-detection network. When multiple peaks appear on the response map of SiamFC, the more accurate re-detection network can re-determine the location of the object. Meanwhile, to adapt to various changes in the appearance of the object, this paper employs a generative model to construct the templates of SiamFC. Furthermore, a high-confidence template updating method is used to prevent the template from being contaminated. Objective evaluation on the popular online tracking benchmark (OTB) shows that the tracking accuracy and success rate of the proposed framework reach 79.8% and 63.8%, respectively. Compared to SiamFC, results on several representative video sequences demonstrate that our framework has higher accuracy and robustness in scenes with fast motion, occlusion, background clutter, and illumination variations.
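The trigger condition, "multiple peaks on the response map", can be sketched as follows. The relative threshold, neighbourhood size, and the stand-in re-detection callback are assumptions; the actual Siam-RM logic and the SINT network are not reproduced.

```python
# Sketch: ambiguous response maps (several near-equal peaks) trigger re-detection.
import numpy as np
from scipy.ndimage import maximum_filter

def count_significant_peaks(response, rel_threshold=0.8, neighborhood=5):
    """Count local maxima whose height is close to the global maximum."""
    local_max = (response == maximum_filter(response, size=neighborhood))
    significant = response >= rel_threshold * response.max()
    return int(np.count_nonzero(local_max & significant))

def track_step(response, siamfc_box, redetect_fn):
    """Fall back to the re-detection network when the response is ambiguous."""
    if count_significant_peaks(response) > 1:
        return redetect_fn()        # e.g., run the re-detection branch
    return siamfc_box

# Toy usage: two near-equal peaks make the map ambiguous.
resp = np.zeros((17, 17)); resp[4, 4] = 1.0; resp[12, 12] = 0.95
print(track_step(resp, siamfc_box=(0, 0, 10, 10), redetect_fn=lambda: "redetect"))
```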


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1129 ◽  
Author(s):  
Jianming Zhang ◽  
Yang Liu ◽  
Hehua Liu ◽  
Jin Wang

Visual object tracking is a significant technology for camera-based sensor network applications. Correlation filter (CF)-based tracking algorithms that comprehensively use multilayer convolutional features have achieved excellent performance. However, tracking failures occur in some challenging situations because ordinary features cannot represent object appearance variations well and the correlation filters are updated improperly. In this paper, we propose a local–global multiple correlation filters (LGCF) tracking algorithm for edge computing systems that capture moving targets, such as vehicles and pedestrians. First, we construct a global correlation filter model with deep convolutional features, and choose a horizontal or vertical division according to the aspect ratio to build two local filters with hand-crafted features. Then, we propose a local–global collaborative strategy to exchange information between the local and global correlation filters. This strategy avoids erroneous learning of the object appearance model. Finally, we propose a time-space peak-to-sidelobe ratio (TSPSR) to evaluate the stability of the current CF. When the estimated results of the current CF are not reliable, a Kalman filter re-detection (KFR) model is enabled to recapture the object. The experimental results show that the presented algorithm achieves better performance on OTB-2013 and OTB-2015 than 12 other recent tracking algorithms. Moreover, our algorithm handles various challenges in object tracking well.
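A stability check of this flavour can be sketched with a peak-to-sidelobe ratio (PSR) tracked over a short temporal window. The exact TSPSR formula, the drop threshold, and the Kalman re-detection step are not reproduced here; these are assumptions chosen for illustration.

```python
# Sketch: rolling PSR monitor that flags when the current CF becomes unreliable.
import numpy as np
from collections import deque

def psr(response, exclude=5):
    """Peak-to-sidelobe ratio: (peak - sidelobe mean) / sidelobe std."""
    y, x = np.unravel_index(np.argmax(response), response.shape)
    mask = np.ones_like(response, dtype=bool)
    mask[max(0, y - exclude): y + exclude + 1,
         max(0, x - exclude): x + exclude + 1] = False
    sidelobe = response[mask]
    return (response[y, x] - sidelobe.mean()) / (sidelobe.std() + 1e-8)

class StabilityMonitor:
    """Flag the filter as unstable when the current PSR drops well below
    its recent average; a re-detection module (e.g., KFR) would then run."""
    def __init__(self, window=10, drop_ratio=0.5):
        self.history = deque(maxlen=window)
        self.drop_ratio = drop_ratio

    def is_stable(self, response):
        current = psr(response)
        stable = (not self.history
                  or current >= self.drop_ratio * np.mean(list(self.history)))
        self.history.append(current)
        return stable

# Toy usage: a sharp response is judged stable, a subsequent flat one is not.
monitor = StabilityMonitor()
sharp = np.zeros((31, 31)); sharp[15, 15] = 1.0
flat = np.full((31, 31), 0.5)
print(monitor.is_stable(sharp), monitor.is_stable(flat))
```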


Mathematics ◽  
2021 ◽  
Vol 9 (10) ◽  
pp. 1130
Author(s):  
Ming-Hao Lin ◽  
Zhi-Xiang Hou ◽  
Kai-Han Cheng ◽  
Chin-Hsien Wu ◽  
Yan-Tsung Peng

Cameras are essential parts of portable devices such as smartphones and tablets. Most people have a smartphone and can take pictures anywhere and anytime to record their lives. However, pictures captured by cameras may suffer from noise contamination, causing issues for subsequent image analysis such as image recognition, object tracking, and object classification. This paper develops an effective combinational denoising framework based on the proposed Adaptive and Overlapped Average Filtering (AOAF) and Mixed-pooling Attention Refinement Networks (MARNs). First, we apply AOAF to the noisy input image to obtain a preliminarily denoised result in which noisy pixels are removed and recovered. Next, MARNs take the preliminary result as input and output a refined image in which details and edges are better reconstructed. The experimental results demonstrate that our method performs favorably against state-of-the-art denoising methods.
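The two-stage structure can be illustrated with the sketch below: a simple local-average pass standing in for the preliminary filtering, chained into a tiny residual network standing in for the learned refinement. Neither stage reproduces AOAF or MARNs; the filtering rule, threshold, and network are placeholders that only show how the stages would be composed.

```python
# Illustrative two-stage denoising pipeline: rough filter, then learned refinement.
import numpy as np
import torch
import torch.nn as nn
from scipy.ndimage import uniform_filter

def preliminary_denoise(noisy, window=3, threshold=0.25):
    """Replace pixels that deviate strongly from their local average."""
    local_avg = uniform_filter(noisy, size=window)
    outliers = np.abs(noisy - local_avg) > threshold
    return np.where(outliers, local_avg, noisy)

class RefinementNet(nn.Module):
    """A tiny residual CNN standing in for the attention refinement stage."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)   # predict a residual correction

# Toy usage on a random "noisy" single-channel image in [0, 1].
noisy = np.random.default_rng(0).random((64, 64)).astype(np.float32)
prelim = preliminary_denoise(noisy)
refined = RefinementNet()(torch.from_numpy(prelim)[None, None])
print(refined.shape)  # torch.Size([1, 1, 64, 64])
```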


Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6388
Author(s):  
Jia Chen ◽  
Fan Wang ◽  
Yingjie Zhang ◽  
Yibo Ai ◽  
Weidong Zhang

The visual tracking task is divided into classification and regression subtasks, and manifold features are introduced to improve tracker performance. Although previous anchor-based trackers have achieved superior tracking performance, they not only require manually set anchor parameters but also ignore the influence of the geometric characteristics of the object on tracker performance. In this paper, we propose a novel Siamese network framework with ResNet50 as the backbone, an anchor-free tracker based on manifold features. The network design is simple and easy to understand; it takes the influence of geometric features on tracking performance into account while reducing the number of parameters to compute, thereby improving tracking performance. In experiments, we compared our tracker with the most advanced trackers on public benchmarks and obtained state-of-the-art performance.
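The anchor-free idea can be sketched with a head in which every location on the correlation feature map predicts an objectness score plus four distances to the box sides, so no anchor parameters need to be set. The channel widths, stride, and decoding rule below are assumptions, not this paper's design.

```python
# Sketch of an FCOS-style anchor-free prediction head for a Siamese tracker.
import torch
import torch.nn as nn

class AnchorFreeHead(nn.Module):
    def __init__(self, in_channels=256):
        super().__init__()
        self.cls = nn.Conv2d(in_channels, 1, kernel_size=1)   # objectness map
        self.reg = nn.Conv2d(in_channels, 4, kernel_size=1)   # l, t, r, b offsets

    def forward(self, corr_feat):
        return self.cls(corr_feat), torch.relu(self.reg(corr_feat))

def decode_box(cls_map, reg_map, stride=8):
    """Take the highest-scoring location and turn its offsets into a box."""
    score = cls_map[0, 0]
    y, x = torch.nonzero(score == score.max())[0]
    l, t, r, b = reg_map[0, :, y, x]
    cx, cy = x * stride, y * stride
    return float(cx - l), float(cy - t), float(cx + r), float(cy + b)

# Toy usage on a random correlation feature map.
head = AnchorFreeHead()
cls_map, reg_map = head(torch.randn(1, 256, 25, 25))
print(decode_box(cls_map, reg_map))
```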


2020 ◽  
Vol 10 (9) ◽  
pp. 3021
Author(s):  
Wangpeng He ◽  
Heyi Li ◽  
Wei Liu ◽  
Cheng Li ◽  
Baolong Guo

Object tracking is a challenging research task because of drastic appearance changes of the target and a lack of training samples. Most online learning trackers are hampered by complications such as drifting under occlusion, out-of-view targets, or fast motion. In this paper, a real-time object tracking algorithm termed “robust sum of template and pixel-wise learners” (rStaple) is proposed to address these problems. It combines multi-feature correlation filters with a color histogram. Firstly, we extract a combination of specific features from the search area around the target and then merge feature channels to train a translation correlation filter online. Secondly, the target state is determined by a discriminating mechanism wherein the model update procedure stops when the target is occluded or out of view and is re-activated when the target re-appears. In addition, the score map is significantly enhanced by calculating a color histogram score over the search area. The target position is estimated by combining the enhanced color histogram score with the correlation filter response map. Finally, a scale filter is trained for multi-scale detection to obtain the final tracking result. Extensive experimental results on a large benchmark dataset demonstrate that the proposed rStaple is superior to several state-of-the-art algorithms in terms of accuracy and efficiency.
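A minimal sketch of the score fusion and the "stop updating when confidence is low" gate is given below. The merge weight, confidence measure, and threshold are assumptions for illustration; rStaple's actual discriminating mechanism and scale filter are not reproduced.

```python
# Sketch: merge CF response with a colour-histogram score, gate the model update.
import numpy as np

def merge_scores(cf_response, hist_score, gamma=0.3):
    """Staple-style linear merge of template and colour-histogram scores."""
    norm = lambda m: (m - m.min()) / (m.max() - m.min() + 1e-8)
    return (1 - gamma) * norm(cf_response) + gamma * norm(hist_score)

def should_update_model(merged, peak_threshold=0.6):
    """Skip the model update when the fused peak is weak (e.g., the target is
    occluded or out of view); resume once confidence recovers."""
    return merged.max() >= peak_threshold   # merged map lies in [0, 1]

# Toy usage with random maps standing in for the two score maps.
rng = np.random.default_rng(0)
cf_resp, hist = rng.random((50, 50)), rng.random((50, 50))
merged = merge_scores(cf_resp, hist)
pos = np.unravel_index(np.argmax(merged), merged.shape)
print("target position:", pos, "update model:", should_update_model(merged))
```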

