rStaple: A Robust Complementary Learning Method for Real-Time Object Tracking

Object tracking is a challenging research task because of drastic appearance changes of the target and a lack of training samples. Most online learning trackers are hampered by complications such as drifting under occlusion, the target moving out of view, or fast motion. In this paper, a real-time object tracking algorithm termed “robust sum of template and pixel-wise learners” (rStaple) is proposed to address these problems. It combines multi-feature correlation filters with a color histogram. First, we extract a combination of specific features from the search area around the target and then merge the feature channels to train a translation correlation filter online. Second, the target state is determined by a discriminating mechanism, wherein the model update procedure stops when the target is occluded or out of view and is re-activated when the target reappears. In addition, the color histogram score is calculated over the search area and used to enhance the score map. The target position is then estimated by combining the enhanced color histogram score with the correlation filter response map. Finally, a scale filter is trained for multi-scale detection to obtain the final tracking result. Extensive experimental results on a large benchmark dataset demonstrate that the proposed rStaple is superior to several state-of-the-art algorithms in terms of accuracy and efficiency.
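The merging and update-gating steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the linear blend, the `hist_weight` value, and the peak-based `threshold` gate are all assumptions chosen for illustration, since the abstract does not give the exact formulas.

```python
import numpy as np

def merge_responses(cf_response, hist_score, hist_weight=0.3):
    """Linearly blend a correlation filter response with a color histogram score.

    hist_weight is a hypothetical blending factor, not a value from the paper.
    """
    if cf_response.shape != hist_score.shape:
        raise ValueError("score maps must share a shape")
    return (1.0 - hist_weight) * cf_response + hist_weight * hist_score

def locate_peak(score_map):
    """Return the (row, col) index of the highest score."""
    return np.unravel_index(np.argmax(score_map), score_map.shape)

def should_update(cf_response, threshold=0.25):
    """Hypothetical gate: update the model only when the response peak is confident.

    A low peak is treated here as a sign of occlusion or an out-of-view target.
    """
    return float(cf_response.max()) >= threshold
```

In a tracking loop under these assumptions, the position estimate would come from `locate_peak(merge_responses(...))`, and the filter and histogram models would be refreshed only on frames where `should_update` returns `True`.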


Parallel Correlation Filters for Real-Time Visual Tracking

Sensors ◽

10.3390/s19102362 ◽

2019 ◽

Vol 19 (10) ◽

pp. 2362 ◽

Cited By ~ 5

Author(s):

Yijin Yang ◽

Yihong Zhang ◽

Demin Li ◽

Zhijie Wang

Keyword(s):

Object Tracking ◽

Real Time ◽

Research Field ◽

Tracking Performance ◽

Correlation Filter ◽

Visual Object ◽

Correlation Filters ◽

Illumination Variation ◽

Visual Object Tracking ◽

Appearance Changes

Correlation filter-based methods have recently performed remarkably well in terms of accuracy and speed in the visual object tracking research field. However, most existing correlation filter-based methods are not robust to significant appearance changes in the target, especially when the target undergoes deformation, illumination variation, or rotation. In this paper, a novel parallel correlation filters (PCF) framework is proposed for real-time visual object tracking. First, the proposed method constructs two parallel correlation filters, one for tracking the appearance changes of the target and the other for tracking its translation. Second, by weighted merging of the response maps of these two parallel correlation filters, the proposed method accurately locates the center position of the target. Finally, in the training stage, a new distribution of the correlation output is proposed to replace the original Gaussian distribution, which trains more accurate correlation filters and helps prevent the model from drifting. Extensive qualitative and quantitative experiments on the common object tracking benchmarks OTB-2013 and OTB-2015 demonstrate that the proposed PCF tracker outperforms most state-of-the-art trackers while running in real time.
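For context, the conventional baseline that this abstract refers to, a correlation filter trained against a Gaussian-shaped desired output, can be sketched as follows. This is a generic single-channel, MOSSE-style closed form and does not reproduce the new output distribution or the parallel filters proposed in the paper; the function names and the `sigma` and `lam` values are assumptions.

```python
import numpy as np

def gaussian_label(shape, sigma=2.0):
    """Gaussian desired output with its peak moved to index (0, 0)."""
    h, w = shape
    ys = np.arange(h) - h // 2
    xs = np.arange(w) - w // 2
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    g = np.exp(-(yy ** 2 + xx ** 2) / (2.0 * sigma ** 2))
    # ifftshift moves the centered peak to the origin, so zero shift maps to (0, 0)
    return np.fft.ifftshift(g)

def train_filter(patch, label, lam=1e-2):
    """Closed-form regularized filter in the Fourier domain (single channel)."""
    X = np.fft.fft2(patch)
    Y = np.fft.fft2(label)
    return (Y * np.conj(X)) / (X * np.conj(X) + lam)

def detect(filter_hat, patch):
    """Response map for a new patch; the peak index gives the cyclic shift."""
    Z = np.fft.fft2(patch)
    return np.real(np.fft.ifft2(filter_hat * Z))
```

With this setup, applying the filter to a cyclically shifted copy of the training patch yields a response whose peak sits at the shift, which is how such trackers read off the translation of the target.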


Correlation filter tracker with siamese: A robust and real-time object tracking framework

Neurocomputing ◽

10.1016/j.neucom.2019.05.033 ◽

2019 ◽

Vol 358 ◽

pp. 33-43 ◽

Cited By ~ 4

Author(s):

Gengzheng Pan ◽

Guochun Chen ◽

Wenxiong Kang ◽

Junhui Hou

Keyword(s):

Object Tracking ◽

Real Time ◽

Correlation Filter


Visual Object Multimodality Tracking Based on Correlation Filters for Edge Computing

Security and Communication Networks ◽

10.1155/2020/8891035 ◽

2020 ◽

Vol 2020 ◽

pp. 1-13

Author(s):

Guosheng Yang ◽

Qisheng Wei

Keyword(s):

Neural Network ◽

Deep Learning ◽

Target Position ◽

Correlation Filter ◽

Estimation Accuracy ◽

Visual Object ◽

Correlation Filters ◽

Data Set ◽

Hierarchical Processing ◽

Target Rotation

In recent years, visual object tracking has become a very active research field, which is mainly divided into correlation filter-based tracking and deep learning-based tracking (e.g., deep convolutional neural networks and Siamese neural networks). Target tracking algorithms based on deep learning require a large amount of computation and are usually deployed on expensive graphics cards. However, for the many monitoring devices in the Internet of Things, it is difficult to capture all the moving targets on each device in real time, so it is necessary to perform hierarchical processing and use correlation filter-based tracking in insensitive areas to alleviate the local computing pressure. In sensitive areas, the video stream is uploaded to a cloud computing platform with a faster computing speed, which runs an algorithm based on deep features. In this paper, we mainly focus on correlation filter-based tracking. Among correlation filter-based trackers, the discriminative scale space tracker (DSST) is one of the most popular and typical, and it has been successfully applied in many fields. However, DSST still leaves room for improvement in two respects. One is that the algorithm does not explicitly consider target rotation. The other is that extracting histogram of oriented gradient (HOG) features from many patches centered at the target position, which is needed to ensure scale estimation accuracy, imposes a very heavy computational load. To address these two problems, we introduce an alterable patch number for target scale tracking and a space search for target rotation tracking into the standard DSST tracking method. We propose a visual object multimodality tracker based on correlation filters (MTCF) that simultaneously copes with translation, scale, and in-plane rotation of the tracked target and obtains its position, scale, and attitude angle at the same time. Finally, experiments on the Visual Tracker Benchmark data set show the effectiveness of the proposed algorithms in multimodality tracking.
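The computational load mentioned above grows with the number of scale patches that must be extracted around the target. The sketch below shows a DSST-style scale pyramid with a configurable patch count. It is only an illustration under assumptions: the function names are hypothetical, the defaults of 33 scales and a step of 1.02 are the values commonly reported for DSST rather than settings taken from this paper, and the paper's actual rule for altering the patch number is not reproduced.

```python
import numpy as np

def scale_factors(num_scales=33, step=1.02):
    """Scale factors step**n for exponents n centered on zero."""
    exponents = np.arange(num_scales) - (num_scales - 1) / 2.0
    return step ** exponents

def candidate_sizes(target_size, num_scales=33, step=1.02):
    """Patch sizes (width, height) to sample around the current target size."""
    w, h = target_size
    sizes = []
    for f in scale_factors(num_scales, step):
        # Clamp to at least one pixel so very small targets stay valid
        sizes.append((max(1, int(round(w * f))), max(1, int(round(h * f)))))
    return sizes
```

Each candidate size corresponds to one feature extraction per frame, so lowering `num_scales` reduces the cost roughly in proportion, at the price of a coarser or narrower scale search.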


Real-Time Visual Tracking with Variational Structure Attention Network

Sensors ◽

10.3390/s19224904 ◽

2019 ◽

Vol 19 (22) ◽

pp. 4904 ◽

Cited By ~ 1

Author(s):

Yeongbin Kim ◽

Joongchol Shin ◽

Hasil Park ◽

Joonki Paik

Keyword(s):

Real Time ◽

Visual Tracking ◽

Boundary Effect ◽

Online Training ◽

Correlation Filter ◽

Shape Distortion ◽

Correlation Filters ◽

Attention Network ◽

Real Time Processing ◽

Variational Structure

Online training frameworks based on discriminative correlation filters for visual tracking have recently shown significant improvement in both accuracy and speed. However, correlation filter-based discriminative approaches share a common problem: tracking performance degrades when the local structure of a target is distorted by the boundary effect. The shape distortion of the target is mainly caused by the circulant structure assumed in Fourier domain processing, which makes the correlation filter learn from distorted training samples. In this paper, we present a structure–attention network to preserve the target structure from the distortion caused by the boundary effect. More specifically, we adopt a variational auto-encoder as a structure–attention network to generate various and representative target structures. We also propose two denoising criteria using a novel reconstruction loss for the variational auto-encoding framework to capture more robust structures even under the boundary condition. Through the proposed structure–attention framework, discriminative correlation filters can learn robust structure information of targets during online training with enhanced discriminating performance and adaptability. Experimental results on major visual tracking benchmark datasets show that the proposed method achieves better or comparable performance compared with state-of-the-art tracking methods at a real-time processing speed of more than 80 frames per second.
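The circulant structure behind the boundary effect can be seen in a short one-dimensional sketch. It illustrates only the general property that FFT-based correlation scores every cyclic shift of the input, so content that leaves one edge wraps around to the other; it does not implement the structure–attention network. The function names are assumptions.

```python
import numpy as np

def circular_correlation(x, h):
    """Correlate x with a real template h over all cyclic shifts using the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.conj(np.fft.fft(h))))

def explicit_circular_correlation(x, h):
    """Same result computed shift by shift, making the wrap-around visible."""
    n = len(x)
    # np.roll(x, -s)[i] equals x[(i + s) % n], so samples wrap across the boundary
    return np.array([np.dot(np.roll(x, -s), h) for s in range(n)])
```

Because every shifted training sample other than the unshifted one contains wrapped content, the filter is trained partly on samples that never occur in real frames, which is the distortion the abstract attributes to the boundary effect.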
