SiamRCR: Reciprocal Classification and Regression for Visual Object Tracking

Most recent Siamese-network-based trackers locate targets via object classification and bounding-box regression. Generally, they select the bounding box with the maximum classification confidence as the final prediction. This strategy may miss the right result because the accuracy of classification and regression is misaligned. In this paper, we propose a novel Siamese tracking algorithm called SiamRCR, which addresses this problem with a simple, light, and effective solution. It builds reciprocal links between the classification and regression branches, which dynamically re-weight their losses for each positive sample. In addition, we add a localization branch that predicts the localization accuracy, so that it can replace the regression assistance link during inference. This branch makes training and inference more consistent. Extensive experimental results demonstrate the effectiveness of SiamRCR and its superiority over state-of-the-art competitors on GOT-10k, LaSOT, TrackingNet, OTB-2015, VOT-2018, and VOT-2019. Moreover, SiamRCR runs at 65 FPS, far above the real-time requirement.
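
The misalignment between classification confidence and localization quality can be illustrated with a minimal sketch. It assumes each candidate box carries a classification confidence and a predicted localization accuracy, and that the two are combined by a simple product at inference. The product rule, the `Candidate` class, and all numbers are illustrative assumptions, not details given in the abstract.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    """Hypothetical candidate box with two per-box scores."""
    box: Tuple[int, int, int, int]  # (x, y, w, h)
    cls_score: float                # classification confidence
    loc_score: float                # predicted localization accuracy


def pick_by_classification(cands: List[Candidate]) -> Candidate:
    # Common strategy: trust the classification branch alone.
    return max(cands, key=lambda c: c.cls_score)


def pick_by_joint_score(cands: List[Candidate]) -> Candidate:
    # Assumed combination rule: product of both scores.
    return max(cands, key=lambda c: c.cls_score * c.loc_score)


candidates = [
    Candidate((10, 10, 50, 50), cls_score=0.95, loc_score=0.40),  # confident, poorly localized
    Candidate((12, 11, 48, 52), cls_score=0.90, loc_score=0.85),  # slightly less confident, well localized
    Candidate((40, 40, 30, 30), cls_score=0.30, loc_score=0.90),  # likely background
]

print(pick_by_classification(candidates).box)
print(pick_by_joint_score(candidates).box)
```

In this toy data the two rules choose different boxes, which is the kind of disagreement the localization branch is meant to resolve.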

SiamMFC: Visual Object Tracking Based on Mainfold Full Convolution Siamese Network

Sensors ◽ 10.3390/s21196388 ◽ 2021 ◽ Vol 21 (19) ◽ pp. 6388

Author(s): Jia Chen ◽ Fan Wang ◽ Yingjie Zhang ◽ Yibo Ai ◽ Weidong Zhang

Keyword(s): Object Tracking ◽ Target Tracking ◽ State Of The Art ◽ Tracking Performance ◽ Visual Object ◽ Geometric Features ◽ Visual Object Tracking ◽ Calculation Of Parameters ◽ Siamese Network ◽ Classification And Regression

The visual tracking task is divided into classification and regression tasks, and manifold features are introduced to improve the performance of the tracker. Although previous anchor-based trackers have achieved superior tracking performance, they not only require parameters to be set manually but also ignore the influence of the object's geometric characteristics on tracker performance. In this paper, we propose a novel Siamese network framework with ResNet50 as the backbone, which is an anchor-free tracker based on manifold features. The network design is simple and easy to understand; it not only considers the influence of geometric features on target tracking performance but also reduces the calculation of parameters and improves tracking performance. In the experiments, we compared our tracker with the most advanced trackers on public benchmarks and obtained state-of-the-art performance.

Learning Soft Mask Based Feature Fusion with Channel and Spatial Attention for Robust Visual Object Tracking

Sensors ◽ 10.3390/s20144021 ◽ 2020 ◽ Vol 20 (14) ◽ pp. 4021 ◽ Cited By ~ 2

Author(s): Mustansar Fiaz ◽ Arif Mahmood ◽ Soon Ki Jung

Keyword(s): Object Tracking ◽ Spatial Attention ◽ Feature Fusion ◽ State Of The Art ◽ Feature Representation ◽ Visual Object ◽ Target Feature ◽ Visual Object Tracking ◽ Low Level ◽ Benchmark Datasets

We propose to improve visual object tracking by introducing a soft-mask-based low-level feature fusion technique. The proposed technique is further strengthened by integrating channel and spatial attention mechanisms. The approach is integrated within a Siamese framework to demonstrate its effectiveness for visual object tracking. The soft mask gives more importance to the target regions than to other regions, enabling effective target feature representation and increasing discriminative power. The low-level feature fusion improves the tracker's robustness against distractors. The channel attention identifies more discriminative channels for better target representation. The spatial attention complements the soft-mask-based approach to better localize target objects in challenging tracking scenarios. We evaluated our approach on five publicly available benchmark datasets and performed extensive comparisons with 39 state-of-the-art tracking algorithms. The proposed tracker demonstrates excellent performance compared to existing state-of-the-art trackers.

The State-of-the-Art in Handling Occlusions for Visual Object Tracking

IEICE Transactions on Information and Systems ◽ 10.1587/transinf.2014edr0002 ◽ 2015 ◽ Vol E98.D (7) ◽ pp. 1260-1274 ◽ Cited By ~ 10

Author(s): Kourosh MESHGI ◽ Shin ISHII

Keyword(s): Object Tracking ◽ State Of The Art ◽ The State ◽ Visual Object ◽ Visual Object Tracking

Feature Integration with Adaptive Importance Maps for Visual Tracking

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2018/108 ◽ 2018

Author(s): Aishi Li ◽ Ming Yang ◽ Wanqi Yang

Keyword(s): State Of The Art ◽ Feature Integration ◽ Circulant Matrices ◽ Visual Object ◽ Correlation Filters ◽ Excellent Performance ◽ Complementary Information ◽ Visual Object Tracking ◽ Penalty Factor ◽ Dense Sampling

Discriminative correlation filters have recently achieved excellent performance for visual object tracking. The key to their success is making full use of dense sampling and the specific properties of circulant matrices in the Fourier domain. However, previous studies do not take into consideration the importance and complementary information of different features and simply concatenate them. This paper investigates an effective method of feature integration for correlation filters, which jointly learns the filters as well as importance maps in each frame. These importance maps borrow the advantages of different features, aiming to achieve complementary traits and improve robustness. Moreover, for each feature, one importance map is shared by all of its channels to avoid overfitting. In addition, we introduce a regularization term for the importance maps and use a penalty factor to control the significance of features. Based on handcrafted and CNN features, we implement two trackers, which achieve competitive performance compared with several state-of-the-art trackers.
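
The role of dense sampling and circulant structure can be made concrete with a small sketch. It trains a generic single-channel, one-dimensional correlation filter by ridge regression over all cyclic shifts of a signal, solved independently per frequency. This is a textbook-style baseline written under stated assumptions, not the joint filter and importance-map learning proposed in the paper; the function names, the toy signal, and the regularization value `lam` are all hypothetical.

```python
import cmath


def dft(v):
    """Naive O(n^2) discrete Fourier transform (fine for a toy example)."""
    n = len(v)
    return [sum(v[k] * cmath.exp(-2j * cmath.pi * f * k / n) for k in range(n))
            for f in range(n)]


def idft(V):
    """Inverse of dft()."""
    n = len(V)
    return [sum(V[f] * cmath.exp(2j * cmath.pi * f * k / n) for f in range(n)) / n
            for k in range(n)]


def train_filter(x, y, lam=1e-3):
    """Ridge regression on a circulant data matrix, solved per frequency."""
    X, Y = dft(x), dft(y)
    return [Xf.conjugate() * Yf / (abs(Xf) ** 2 + lam) for Xf, Yf in zip(X, Y)]


def respond(H, z):
    """Response map of the learned filter H on a new signal z."""
    Z = dft(z)
    return [r.real for r in idft([Hf * Zf for Hf, Zf in zip(H, Z)])]


def shift(v, s):
    """Cyclically shift v to the right by s samples."""
    n = len(v)
    return [v[(k - s) % n] for k in range(n)]


def argmax(v):
    return max(range(len(v)), key=lambda i: v[i])


x = [4.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0]   # toy "target appearance"
y = [1.0, 0.6, 0.1, 0.0, 0.0, 0.0, 0.1, 0.6]   # desired response, peak at index 0

H = train_filter(x, y)
peak_same = argmax(respond(H, x))               # response on the training signal
peak_moved = argmax(respond(H, shift(x, 3)))    # response after the target moves
print(peak_same, peak_moved)
```

Because every cyclic shift of `x` is implicitly a training sample, no explicit sampling loop is needed, and the response peak follows the cyclic shift of the input.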

Low-Rank Multi-Channel Features for Robust Visual Object Tracking

Symmetry ◽ 10.3390/sym11091155 ◽ 2019 ◽ Vol 11 (9) ◽ pp. 1155 ◽ Cited By ~ 3

Author(s): Fawad ◽ Muhammad Jamil Khan ◽ MuhibUr Rahman ◽ Yasar Amin ◽ Hannu Tenhunen

Keyword(s): Computational Complexity ◽ Object Tracking ◽ State Of The Art ◽ Color Naming ◽ Circulant Matrix ◽ Low Rank ◽ Support Vector ◽ Visual Object ◽ Visual Object Tracking ◽ Kernel Correlation

Kernel correlation filters (KCF) demonstrate significant potential in visual object tracking by employing robust descriptors. Proper selection of color and texture features can provide robustness against appearance variations. However, the use of multiple descriptors leads to a considerable feature dimension. In this paper, we propose a novel low-rank descriptor that provides better precision and success rate than state-of-the-art trackers. We accomplished this by concatenating the magnitude component of the Overlapped Multi-oriented Tri-scale Local Binary Pattern (OMTLBP), Robustness-Driven Hybrid Descriptor (RDHD), Histogram of Oriented Gradients (HoG), and Color Naming (CN) features. We reduced the rank of the proposed multi-channel feature to diminish the computational complexity. We formulated the Support Vector Machine (SVM) model by utilizing the circulant matrix of the proposed feature vector in the kernel correlation filter. The use of the discrete Fourier transform in the iterative learning of the SVM reduced the computational complexity of the proposed visual tracking algorithm. Extensive experimental results on the Visual Tracker Benchmark dataset show better accuracy than other state-of-the-art trackers.

A Trajectory Evaluator by Sub-tracks for Detecting VOT-based Anomalous Trajectory

ACM Transactions on Knowledge Discovery from Data ◽ 10.1145/3490032 ◽ 2022 ◽ Vol 16 (4) ◽ pp. 1-19

Author(s): Fei Gao ◽ Jiada Li ◽ Yisu Ge ◽ Jianwen Shao ◽ Shufang Lu ◽ ...

Keyword(s): Data Analysis ◽ Mobile Robots ◽ State Of The Art ◽ Least Square Method ◽ Least Square ◽ Visual Object ◽ Massive Data ◽ Trajectory Data ◽ Visual Object Tracking ◽ Research Hotspots

With the popularization of visual object tracking (VOT), more and more trajectory data are being obtained and have begun to gain widespread attention in fields such as mobile robots and intelligent video surveillance. How to clean the anomalous trajectories hidden in the massive data has become a research hotspot, since anomalous trajectories should be detected and cleaned before the trajectory data can be used effectively. In this article, a Trajectory Evaluator by Sub-tracks (TES) for detecting VOT-based anomalous trajectories is proposed. The Feature of Anomalousness, which includes the Feature of Anomalous Pose and the Feature of Anomalous Sub-tracks (FAS), is defined and described as the Eigenvector of the classifier used to filter Track Lets anomalous trajectories and IDentity Switch anomalous trajectories. In the comparative experiments, TES achieves better results on different scenes than state-of-the-art methods. Moreover, FAS performs better than point flow, least square method fitting, and Chebyshev polynomial fitting. The results verify that TES is more accurate and effective and is conducive to sub-track trajectory data analysis.

Visual Object Tracking Robust to Illumination Variation Based on Hyperline Clustering

Information ◽ 10.3390/info10010026 ◽ 2019 ◽ Vol 10 (1) ◽ pp. 26 ◽ Cited By ~ 2

Author(s): Senquan Yang ◽ Yuan Xie ◽ Pu Li ◽ Haoxiang Wen ◽ Huan Luo ◽ ...

Keyword(s): State Of The Art ◽ Shape Deformation ◽ Visual Object ◽ Excellent Performance ◽ Illumination Variation ◽ Visual Object Tracking ◽ Discriminant Model ◽ Lower Accuracy ◽ Benchmark Datasets ◽ Online Tracking

Color histogram-based trackers have obtained excellent performance in many challenging situations. However, since the appearance of color is sensitive to illumination, they tend to achieve lower accuracy when illumination varies severely throughout a sequence. To overcome this limitation, we propose a novel hyperline-clustering-based discriminant model, an illumination-invariant model that is able to distinguish the object from its surrounding background. Furthermore, we exploit this model and propose an anchor-based scale estimation to cope with shape deformation and scale variation. Numerous experiments on recent online tracking benchmark datasets demonstrate that our approach achieves favorable performance compared with several state-of-the-art tracking algorithms. In particular, our approach achieves higher accuracy than the compared methods in challenging situations involving illumination variation and shape deformation.

Distractor-Aware Deep Regression for Visual Tracking

Sensors ◽ 10.3390/s19020387 ◽ 2019 ◽ Vol 19 (2) ◽ pp. 387 ◽ Cited By ~ 1

Author(s): Ming Du ◽ Yan Ding ◽ Xiuyun Meng ◽ Hua-Liang Wei ◽ Yifan Zhao

Keyword(s): Object Tracking ◽ Visual Tracking ◽ Test Data ◽ Loss Function ◽ State Of The Art ◽ Target Object ◽ Visual Object ◽ Visual Object Tracking ◽ Training Samples ◽ Better Than

In recent years, regression trackers have drawn increasing attention in the visual object tracking community due to their favorable performance and easy implementation. These trackers directly learn a mapping from dense samples around the target object to Gaussian-like soft labels. However, in many real applications, the extremely imbalanced distribution of training samples hinders the robustness and accuracy of regression trackers when they are applied to test data. In this paper, we propose a novel and effective distractor-aware loss function that addresses this imbalance by highlighting the significant domain and severely penalizing the pure background. In addition, we introduce a fully differentiable hierarchy-normalized concatenation connection to exploit abstractions across multiple convolutional layers. Extensive experiments were conducted on five challenging tracking benchmark datasets, namely OTB-13, OTB-15, TC-128, UAV-123, and VOT17. The experimental results are promising and show that the proposed tracker performs much better than nearly all the compared state-of-the-art approaches.
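
The imbalance mentioned in this abstract is easy to see on a toy label map. The sketch below builds a Gaussian-like soft label map centred on the target and measures how few cells carry a non-negligible label. The map size, `sigma`, and the 0.1 threshold are illustrative assumptions; the paper's distractor-aware loss itself is not reproduced here.

```python
import math


def gaussian_labels(height, width, sigma):
    """Soft regression targets: 1.0 at the map centre, decaying with distance."""
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    return [[math.exp(-((r - cy) ** 2 + (c - cx) ** 2) / (2.0 * sigma ** 2))
             for c in range(width)]
            for r in range(height)]


labels = gaussian_labels(17, 17, sigma=2.0)

# Fraction of dense samples whose label is above an (assumed) 0.1 threshold.
flat = [v for row in labels for v in row]
foreground_fraction = sum(v > 0.1 for v in flat) / len(flat)

print(labels[8][8])                    # label at the target centre
print(round(foreground_fraction, 3))   # most cells are near-zero background
```

With most samples labelled close to zero, an unweighted regression loss is dominated by background, which is the issue a re-weighted loss is meant to counter.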

POST: POlicy-Based Switch Tracking

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v34i07.6899 ◽ 2020 ◽ Vol 34 (07) ◽ pp. 12184-12191

Author(s): Ning Wang ◽ Wengang Zhou ◽ Guojun Qi ◽ Houqiang Li

Keyword(s): Decision Making ◽ State Of The Art ◽ Superior Performance ◽ Visual Object ◽ Current Frame ◽ Visual Object Tracking ◽ Decision Making Problem ◽ The Individual ◽ Multiple Experts ◽ Experimental Comparisons

In visual object tracking, ensemble frameworks that reasonably fuse multiple experts typically achieve superior performance compared to the individual experts. However, the need to run all the experts in parallel heavily limits the efficiency of most existing ensemble frameworks. In this paper, we propose POST, a POlicy-based Switch Tracker for robust and efficient visual tracking. The proposed POST tracker consists of multiple weak but complementary experts (trackers) and adaptively assigns one suitable expert for tracking in each frame. By formulating this expert switch across consecutive frames as a decision-making problem, we learn an agent via reinforcement learning that directly decides which expert should handle the current frame without running the others. In this way, the proposed POST tracker maintains the performance merit of multiple diverse models while ensuring tracking efficiency. Extensive ablation studies and experimental comparisons against state-of-the-art trackers on 5 prevalent benchmarks verify the effectiveness of the proposed method.
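
The efficiency argument can be sketched as a control loop in which exactly one expert runs per frame. In the sketch below the policy is a hand-written stand-in for the learned reinforcement-learning agent, and the `CountingExpert` class, its fixed horizontal offsets, and the integer "frames" are all hypothetical placeholders rather than anything described in the abstract.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h)


class CountingExpert:
    """Toy tracker that shifts the box by a fixed offset and counts its calls."""

    def __init__(self, name: str, offset: float) -> None:
        self.name = name
        self.offset = offset
        self.calls = 0

    def track(self, frame: int, prev_box: Box) -> Box:
        self.calls += 1
        x, y, w, h = prev_box
        return (x + self.offset, y, w, h)


def run_switch_tracker(frames: Sequence[int],
                       experts: List[CountingExpert],
                       policy: Callable[[int, Box], int],
                       init_box: Box) -> List[Box]:
    """Per frame, ask the policy for one expert index and run only that expert."""
    box = init_box
    boxes = []
    for frame in frames:
        idx = policy(frame, box)
        box = experts[idx].track(frame, box)
        boxes.append(box)
    return boxes


experts = [CountingExpert("fast", 1.0), CountingExpert("robust", 2.0)]
frames = list(range(6))


def toy_policy(frame: int, box: Box) -> int:
    # Stand-in for the learned agent: alternate experts by frame parity.
    return frame % 2


boxes = run_switch_tracker(frames, experts, toy_policy, (0.0, 0.0, 10.0, 10.0))
total_calls = sum(e.calls for e in experts)
print(total_calls, boxes[-1])
```

The total number of expert calls equals the number of frames, whereas an ensemble that runs every expert in parallel would need one call per expert per frame.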
