Fully Convolutional Single-Crop Siamese Networks for Real-Time Visual Object Tracking
Visual object tracking aims to follow an arbitrary object through a video, and many algorithms based on deep convolutional neural networks have achieved significant performance improvements in recent years. However, most of them cannot run in real time because deep feature extraction is computationally expensive. This paper presents a single-crop visual object tracking algorithm based on the fully convolutional Siamese network (SiamFC). The proposed algorithm significantly reduces the computational burden by extracting feature maps at multiple scales from a single image crop. Experimental results show that the proposed algorithm runs faster than SiamFC.
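
The single-crop idea can be illustrated with a minimal sketch. The abstract does not say how the multi-scale feature maps are derived from the one crop, so the sketch assumes one plausible reading: the network runs once on the largest search crop, and each scale variant is a centered window of that single feature map, resized to a common size. The function names (`center_crop`, `resize_nearest`, `multi_scale_from_single_map`), the toy 8x8 map, and the window sizes are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: assumes scale variants are centered windows of ONE
# feature map, resized to a common size. This is an assumption, not the
# paper's confirmed procedure.

def center_crop(fmap, size):
    """Return the central size x size window of a square 2-D feature map."""
    n = len(fmap)
    start = (n - size) // 2
    return [row[start:start + size] for row in fmap[start:start + size]]


def resize_nearest(fmap, out_size):
    """Resize a square 2-D map to out_size x out_size by nearest-neighbor sampling."""
    n = len(fmap)
    return [
        [fmap[(i * n) // out_size][(j * n) // out_size] for j in range(out_size)]
        for i in range(out_size)
    ]


def multi_scale_from_single_map(fmap, crop_sizes, out_size):
    """Derive one feature map per scale from a single precomputed feature map."""
    return [resize_nearest(center_crop(fmap, s), out_size) for s in crop_sizes]


# Toy stand-in for the output of one forward pass over the largest crop.
fmap = [[r * 8 + c for c in range(8)] for r in range(8)]

# Three hypothetical scales: a small, a medium, and the full window.
scales = multi_scale_from_single_map(fmap, crop_sizes=[4, 6, 8], out_size=6)
```

Under this assumption, the expensive feature extractor is evaluated once per frame rather than once per scale, and the per-scale work reduces to cheap indexing on the feature map. A real tracker would more likely use bilinear interpolation on multi-channel tensors than nearest-neighbor sampling on a single channel.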