Mutual Learning and Feature Fusion Siamese Networks for Visual Object Tracking

Author(s):  
Min Jiang ◽  
Yuyao Zhao ◽  
Jun Kong
Author(s):  
Zheng Zhu ◽  
Qiang Wang ◽  
Bo Li ◽  
Wei Wu ◽  
Junjie Yan ◽  
...  

2018 ◽  
Vol 77 (17) ◽  
pp. 22131-22143 ◽  
Author(s):  
Longchao Yang ◽  
Peilin Jiang ◽  
Fei Wang ◽  
Xuan Wang

Sensors ◽  
2020 ◽  
Vol 20 (14) ◽  
pp. 4021 ◽  
Author(s):  
Mustansar Fiaz ◽  
Arif Mahmood ◽  
Soon Ki Jung

We propose to improve visual object tracking by introducing a soft-mask-based low-level feature fusion technique, further strengthened by channel and spatial attention mechanisms. The approach is integrated within a Siamese framework to demonstrate its effectiveness for visual object tracking. The soft mask gives more importance to target regions than to other regions, enabling effective target feature representation and increasing discriminative power. The low-level feature fusion improves tracker robustness against distractors. Channel attention identifies the more discriminative channels for better target representation, and spatial attention complements the soft mask to better localize target objects in challenging tracking scenarios. We evaluated the proposed approach on five publicly available benchmark datasets and performed extensive comparisons with 39 state-of-the-art tracking algorithms. The proposed tracker demonstrates excellent performance compared to existing state-of-the-art trackers.
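
The soft-mask idea in the abstract above (weighting target regions more than other regions) can be illustrated with a minimal sketch. The abstract does not say how the mask is built, so the Gaussian falloff, the function names `gaussian_soft_mask` and `apply_soft_mask`, and the `center` and `sigma` parameters are all assumptions made for illustration, not the authors' method.

```python
import math


def gaussian_soft_mask(height, width, center, sigma):
    """Build a height x width mask with values in (0, 1].

    ASSUMPTION: a Gaussian centered on the target is used as the mask shape;
    the abstract does not specify the actual mask construction.
    """
    cy, cx = center
    return [
        [
            math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def apply_soft_mask(feature_map, mask):
    """Weight one feature channel element-wise by the mask."""
    return [
        [f * m for f, m in zip(f_row, m_row)]
        for f_row, m_row in zip(feature_map, mask)
    ]


# Hypothetical 5x5 single-channel feature map with the target at the center.
features = [[1.0] * 5 for _ in range(5)]
mask = gaussian_soft_mask(5, 5, center=(2, 2), sigma=1.0)
weighted = apply_soft_mask(features, mask)
```

In this sketch, responses at the assumed target center pass through unchanged, while responses farther away are attenuated smoothly rather than cut off, which is what distinguishes a soft mask from a binary crop.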


Electronics ◽  
2020 ◽  
Vol 9 (5) ◽  
pp. 854
Author(s):  
Yuxiang Yang ◽  
Weiwei Xing ◽  
Shunli Zhang ◽  
Qi Yu ◽  
Xiaoyu Guo ◽  
...  

Visual object tracking with Siamese networks has achieved favorable accuracy and speed. However, the features used in Siamese networks contain spatially redundant information, which increases computation and limits the networks' discriminative ability. To address this issue, we present a novel frequency-aware feature (FAF) method for robust visual object tracking in complex scenes. Unlike previous works, which select features from different channels or layers, the proposed method factorizes the feature map into multiple frequency components and reduces the low-frequency information, which is spatially redundant. Reducing the resolution of the low-frequency map saves computation and also enlarges the receptive field of the layer, yielding more discriminative information. To further improve the performance of FAF, we design an innovative data-independent augmentation for object tracking that improves the discriminative ability of the tracker by enhancing linear representation among training samples through convex combinations of the images and tags. Finally, we propose a joint judgment strategy that combines intersection-over-union (IoU) and classification scores to adjust the bounding box result and improve tracking accuracy. Extensive experiments on 5 challenging benchmarks demonstrate that our FAF method performs favorably against state-of-the-art (SOTA) tracking methods while running at around 45 frames per second.
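
The augmentation described above, convex combinations of images and tags, can be sketched as follows. This is a minimal illustration under stated assumptions: images and tags are represented as flat lists of floats, the mixing coefficient `lam` is passed in directly rather than sampled, and the function name `convex_combination` is hypothetical. The abstract does not say how the coefficient is chosen or how samples are paired.

```python
def convex_combination(image_a, image_b, tag_a, tag_b, lam):
    """Mix two training samples with weight lam in [0, 1].

    ASSUMPTION: images and tags are flat lists of floats of equal length;
    how the paper samples lam and pairs samples is not stated in the abstract.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1] for a convex combination")
    if len(image_a) != len(image_b) or len(tag_a) != len(tag_b):
        raise ValueError("paired samples must have matching shapes")

    # The same coefficient is applied to inputs and tags, so the mixed
    # sample lies on the line segment between the two originals.
    mixed_image = [lam * a + (1.0 - lam) * b for a, b in zip(image_a, image_b)]
    mixed_tag = [lam * a + (1.0 - lam) * b for a, b in zip(tag_a, tag_b)]
    return mixed_image, mixed_tag


# Hypothetical two-pixel images with one-hot tags.
mixed_image, mixed_tag = convex_combination(
    [0.0, 1.0], [1.0, 1.0], [1.0, 0.0], [0.0, 1.0], lam=0.25
)
```

Because the inputs and the tags share one coefficient, a model trained on such samples is encouraged to behave linearly between training examples, which matches the abstract's stated goal of enhancing linear representation among training samples.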

