scholarly journals SiamRCR: Reciprocal Classification and Regression for Visual Object Tracking

Author(s):  
Jinlong Peng ◽  
Zhengkai Jiang ◽  
Yueyang Gu ◽  
Yang Wu ◽  
Yabiao Wang ◽  
...  

Recently, most siamese network based trackers locate targets via object classification and bounding-box regression. Generally, they select the bounding-box with maximum classification confidence as the final prediction. This strategy may miss the right result due to the accuracy misalignment between classification and regression. In this paper, we propose a novel siamese tracking algorithm called SiamRCR, addressing this problem with a simple, light and effective solution. It builds reciprocal links between classification and regression branches, which can dynamically re-weight their losses for each positive sample. In addition, we add a localization branch to predict the localization accuracy, so that it can work as the replacement of the regression assistance link during inference. This branch makes the training and inference more consistent. Extensive experimental results demonstrate the effectiveness of SiamRCR and its superiority over the state-of-the-art competitors on GOT-10k, LaSOT, TrackingNet, OTB-2015, VOT-2018 and VOT-2019. Moreover, our SiamRCR runs at 65 FPS, far above the real-time requirement.

Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6388
Author(s):  
Jia Chen ◽  
Fan Wang ◽  
Yingjie Zhang ◽  
Yibo Ai ◽  
Weidong Zhang

Visual tracking task is divided into classification and regression tasks, and manifold features are introduced to improve the performance of the tracker. Although the previous anchor-based tracker has achieved superior tracking performance, the anchor-based tracker not only needs to set parameters manually but also ignores the influence of the geometric characteristics of the object on the tracker performance. In this paper, we propose a novel Siamese network framework with ResNet50 as the backbone, which is an anchor-free tracker based on manifold features. The network design is simple and easy to understand, which not only considers the influence of geometric features on the target tracking performance but also reduces the calculation of parameters and improves the target tracking performance. In the experiment, we compared our tracker with the most advanced public benchmarks and obtained a state-of-the-art performance.


Sensors ◽  
2020 ◽  
Vol 20 (14) ◽  
pp. 4021 ◽  
Author(s):  
Mustansar Fiaz ◽  
Arif Mahmood ◽  
Soon Ki Jung

We propose to improve the visual object tracking by introducing a soft mask based low-level feature fusion technique. The proposed technique is further strengthened by integrating channel and spatial attention mechanisms. The proposed approach is integrated within a Siamese framework to demonstrate its effectiveness for visual object tracking. The proposed soft mask is used to give more importance to the target regions as compared to the other regions to enable effective target feature representation and to increase discriminative power. The low-level feature fusion improves the tracker robustness against distractors. The channel attention is used to identify more discriminative channels for better target representation. The spatial attention complements the soft mask based approach to better localize the target objects in challenging tracking scenarios. We evaluated our proposed approach over five publicly available benchmark datasets and performed extensive comparisons with 39 state-of-the-art tracking algorithms. The proposed tracker demonstrates excellent performance compared to the existing state-of-the-art trackers.


Author(s):  
Aishi Li ◽  
Ming Yang ◽  
Wanqi Yang

Discriminative correlation filters have recently achieved excellent performance for visual object tracking. The key to success is to make full use of dense sampling and specific properties of circulant matrices in the Fourier domain. However, previous studies don't take into consideration the importance and complementary information of different features, simply concatenating them. This paper investigates an effective method of feature integration for correlation filters, which jointly learns filters, as well as importance maps in each frame. These importance maps borrow the advantages of different features, aiming to achieve complementary traits and improve robustness. Moreover, for each feature, an importance map is shared by its all channels to avoid overfitting. In addition, we introduce a regularization term for the importance maps and use the penalty factor to control the significance of features. Based on handcrafted and CNN features, we implement two trackers, which achieve a competitive performance compared with several state-of-the-art trackers.


Symmetry ◽  
2019 ◽  
Vol 11 (9) ◽  
pp. 1155 ◽  
Author(s):  
Fawad ◽  
Muhammad Jamil Khan ◽  
MuhibUr Rahman ◽  
Yasar Amin ◽  
Hannu Tenhunen

Kernel correlation filters (KCF) demonstrate significant potential in visual object tracking by employing robust descriptors. Proper selection of color and texture features can provide robustness against appearance variations. However, the use of multiple descriptors would lead to a considerable feature dimension. In this paper, we propose a novel low-rank descriptor, that provides better precision and success rate in comparison to state-of-the-art trackers. We accomplished this by concatenating the magnitude component of the Overlapped Multi-oriented Tri-scale Local Binary Pattern (OMTLBP), Robustness-Driven Hybrid Descriptor (RDHD), Histogram of Oriented Gradients (HoG), and Color Naming (CN) features. We reduced the rank of our proposed multi-channel feature to diminish the computational complexity. We formulated the Support Vector Machine (SVM) model by utilizing the circulant matrix of our proposed feature vector in the kernel correlation filter. The use of discrete Fourier transform in the iterative learning of SVM reduced the computational complexity of our proposed visual tracking algorithm. Extensive experimental results on Visual Tracker Benchmark dataset show better accuracy in comparison to other state-of-the-art trackers.


2022 ◽  
Vol 16 (4) ◽  
pp. 1-19
Author(s):  
Fei Gao ◽  
Jiada Li ◽  
Yisu Ge ◽  
Jianwen Shao ◽  
Shufang Lu ◽  
...  

With the popularization of visual object tracking (VOT), more and more trajectory data are obtained and have begun to gain widespread attention in the fields of mobile robots, intelligent video surveillance, and the like. How to clean the anomalous trajectories hidden in the massive data has become one of the research hotspots. Anomalous trajectories should be detected and cleaned before the trajectory data can be effectively used. In this article, a Trajectory Evaluator by Sub-tracks (TES) for detecting VOT-based anomalous trajectory is proposed. Feature of Anomalousness is defined and described as the Eigenvector of classifier to filter Track Lets anomalous trajectory and IDentity Switch anomalous trajectory, which includes Feature of Anomalous Pose and Feature of Anomalous Sub-tracks (FAS). In the comparative experiments, TES achieves better results on different scenes than state-of-the-art methods. Moreover, FAS makes better performance than point flow, least square method fitting and Chebyshev Polynomial Fitting. It is verified that TES is more accurate and effective and is conducive to the sub-tracks trajectory data analysis.


Information ◽  
2019 ◽  
Vol 10 (1) ◽  
pp. 26 ◽  
Author(s):  
Senquan Yang ◽  
Yuan Xie ◽  
Pu Li ◽  
Haoxiang Wen ◽  
Huan Luo ◽  
...  

Color histogram-based trackers have obtained excellent performance against many challenging situations. However, since the appearance of color is sensitive to illumination, they tend to achieve lower accuracy when illumination is severely variant throughout a sequence. To overcome this limitation, we propose a novel hyperline clustering based discriminant model, an illumination invariant model that is able to distinguish the object from its surrounding background. Furthermore, we exploit this model and propose an anchor based scale estimation to cope with shape deformation and scale variation. Numerous experiments on recent online tracking benchmark datasets demonstrate that our approach achieve favorable performance compared with several state-of-the-art tracking algorithms. In particular, our approach achieves higher accuracy than comparative methods in the illumination variant and shape deformation challenging situations.


Sensors ◽  
2019 ◽  
Vol 19 (2) ◽  
pp. 387 ◽  
Author(s):  
Ming Du ◽  
Yan Ding ◽  
Xiuyun Meng ◽  
Hua-Liang Wei ◽  
Yifan Zhao

In recent years, regression trackers have drawn increasing attention in the visual-object tracking community due to their favorable performance and easy implementation. The tracker algorithms directly learn mapping from dense samples around the target object to Gaussian-like soft labels. However, in many real applications, when applied to test data, the extreme imbalanced distribution of training samples usually hinders the robustness and accuracy of regression trackers. In this paper, we propose a novel effective distractor-aware loss function to balance this issue by highlighting the significant domain and by severely penalizing the pure background. In addition, we introduce a full differentiable hierarchy-normalized concatenation connection to exploit abstractions across multiple convolutional layers. Extensive experiments were conducted on five challenging benchmark-tracking datasets, that is, OTB-13, OTB-15, TC-128, UAV-123, and VOT17. The experimental results are promising and show that the proposed tracker performs much better than nearly all the compared state-of-the-art approaches.


2020 ◽  
Vol 34 (07) ◽  
pp. 12184-12191
Author(s):  
Ning Wang ◽  
Wengang Zhou ◽  
Guojun Qi ◽  
Houqiang Li

In visual object tracking, by reasonably fusing multiple experts, ensemble framework typically achieves superior performance compared to the individual experts. However, the necessity of parallelly running all the experts in most existing ensemble frameworks heavily limits their efficiency. In this paper, we propose POST, a POlicy-based Switch Tracker for robust and efficient visual tracking. The proposed POST tracker consists of multiple weak but complementary experts (trackers) and adaptively assigns one suitable expert for tracking in each frame. By formulating this expert switch in consecutive frames as a decision-making problem, we learn an agent via reinforcement learning to directly decide which expert to handle the current frame without running others. In this way, the proposed POST tracker maintains the performance merit of multiple diverse models while favorably ensuring the tracking efficiency. Extensive ablation studies and experimental comparisons against state-of-the-art trackers on 5 prevalent benchmarks verify the effectiveness of the proposed method.


Sign in / Sign up

Export Citation Format

Share Document