Attention shake siamese network with auxiliary relocation branch for visual object tracking

Offline-trained Siamese networks are not robust to complex, changing environments in visual object tracking. Without online learning, a Siamese network cannot learn instance-specific domain knowledge or adapt to changes in target appearance. In this paper, a new lightweight Siamese network is proposed for feature extraction. To cope with the dynamics of targets and backgrounds, the weights of the proposed network are updated online during tracking. To enhance discrimination capability, a cross-entropy loss is integrated into the contrastive loss. Inspired by the face verification algorithm DeepID2, a Bayesian verification model is applied for candidate selection. More generally, visual object tracking can benefit from face verification algorithms. Numerical results suggest that the newly developed algorithm achieves comparable performance on public benchmarks.
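The abstract does not say how the cross-entropy term is integrated into the contrastive loss. As a minimal sketch, the Python below assumes a simple weighted sum over a single pair of samples. The function names, the `margin` and `weight` parameters, and the use of a binary match probability are all illustrative assumptions, not the authors' formulation.

```python
import math


def contrastive_loss(distance, same, margin=1.0):
    """Standard contrastive loss for one pair.

    `distance` is the embedding distance; `margin` is an assumed hyperparameter.
    """
    if same:
        # Matching pairs are pulled together.
        return 0.5 * distance ** 2
    # Non-matching pairs are pushed apart until they exceed the margin.
    return 0.5 * max(0.0, margin - distance) ** 2


def cross_entropy_loss(prob_same, same, eps=1e-12):
    """Binary cross-entropy on the predicted probability that the pair matches."""
    p = min(max(prob_same, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -math.log(p) if same else -math.log(1.0 - p)


def combined_loss(distance, prob_same, same, weight=0.5, margin=1.0):
    """Weighted sum of the two terms; `weight` is a hypothetical balancing factor."""
    return (contrastive_loss(distance, same, margin)
            + weight * cross_entropy_loss(prob_same, same))
```

With this combination, a non-matching pair whose distance already exceeds the margin contributes only through the cross-entropy term, which is one plausible way a classification signal could sharpen discrimination.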


ACSiamRPN: Adaptive Context Sampling for Visual Object Tracking

Electronics, 2020, Vol 9 (9), pp. 1528, DOI 10.3390/electronics9091528

Author(s): Xiaofei Qin, Yipeng Zhang, Hang Chang, Hao Lu, Xuedian Zhang

Keyword(s): Object Tracking, Long Range, Global Maximum, Visual Object, Visual Object Tracking, Siamese Network, Information Embedding, Selection Step, Block Based, The Impact

In visual object tracking, the Siamese network tracker based on the region proposal network (SiamRPN) has achieved promising results in both speed and accuracy. However, it does not consider the relationships and differences between the long-range context information of various objects. In this paper, we add a global context block (GC block), which is lightweight and can effectively model long-range dependencies, to the Siamese network part of SiamRPN so that the tracker can better understand the tracking scene. We also propose a novel convolution module, the cropping-inside selective kernel block (CiSK block), based on selective kernel convolution (SK convolution, a module proposed in selective kernel networks), and use it in the region proposal network (RPN) part of SiamRPN, where it adaptively adjusts the size of the receptive field for different types of objects. The CiSK block makes two improvements to SK convolution. First, in the fusion step of SK convolution, we use both global average pooling (GAP) and global maximum pooling (GMP) to enhance global information embedding. Second, after the selection step of SK convolution, we crop out the outermost pixels of the features to reduce the impact of padding operations. Experimental results show that on the OTB100 benchmark, we achieve an accuracy of 0.857 and a success rate of 0.643. On the VOT2016 and VOT2019 benchmarks, we achieve expected average overlap (EAO) scores of 0.394 and 0.240, respectively.
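The two CiSK modifications can be illustrated on a toy feature map. The pure-Python sketch below is not the authors' implementation. It assumes a feature map stored as a list of channels, each a 2-D list, assumes the GAP and GMP descriptors are combined by element-wise addition (the abstract does not specify how), and assumes a one-pixel crop by default. All function names are hypothetical.

```python
def global_avg_pool(fmap):
    """Mean of each channel; fmap is a list of H x W nested lists."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def global_max_pool(fmap):
    """Maximum of each channel."""
    return [max(max(row) for row in ch) for ch in fmap]


def fuse_descriptor(fmap):
    """Per-channel descriptor from both poolings.

    Element-wise addition is an assumed way of combining GAP and GMP.
    """
    return [a + m for a, m in zip(global_avg_pool(fmap), global_max_pool(fmap))]


def crop_border(fmap, border=1):
    """Drop the outermost `border` pixels of every channel.

    These are the positions most affected by zero padding.
    """
    return [
        [row[border:len(row) - border] for row in ch[border:len(ch) - border]]
        for ch in fmap
    ]


# One 3 x 3 channel: the crop keeps only the centre pixel.
fmap = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]
descriptor = fuse_descriptor(fmap)  # GAP = 5.0, GMP = 9
cropped = crop_border(fmap)         # [[[5]]]
```

In a real SK-style block the fused descriptor would feed the small fully connected layers that produce the branch attention weights. That part is omitted here because the abstract gives no details about it.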


Attention shake siamese network with auxiliary relocation branch for visual object tracking

IOU – Siamtrack: IOU Guided Siamese Network For Visual Object Tracking

A Asymmetric Attention Siamese Network for Visual Object Tracking

Dual Attention based Siamese Network for Visual Object Tracking

Visual Object Tracking by Hierarchical Attention Siamese Network

Dice Loss in Siamese Network for Visual Object Tracking

DomainSiam: Domain-Aware Siamese Network for Visual Object Tracking

Visual object tracking based on Siamese network and online patch filters

Learning Dynamic Siamese Network for Visual Object Tracking

Online Siamese Network for Visual Object Tracking

ACSiamRPN: Adaptive Context Sampling for Visual Object Tracking
