Efficient Online Object Tracking Scheme for Challenging Scenarios

Visual object tracking (VOT) is a vital part of many computer vision applications, such as surveillance, unmanned aerial vehicles (UAVs), and medical diagnostics. In recent years, substantial progress has been made on various VOT challenges, such as changes of scale, occlusion, motion blur, and illumination variation. This paper proposes a tracking algorithm in a spatiotemporal context (STC) framework. To overcome the limitations of STC under scale variation, a max-pooling-based scale scheme is incorporated by maximizing over the posterior probability. To prevent the target model from drifting, an efficient occlusion-handling mechanism is proposed. Occlusion is detected by an average peak-to-correlation energy (APCE)-based mechanism applied to the response map between consecutive frames. Once occlusion is detected, a fractional-gain Kalman filter is used to handle it. A further extension uses APCE criteria to adapt the target model under motion blur and other factors. Extensive evaluation indicates that the proposed algorithm achieves strong results compared with various tracking methods.
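
As a rough illustration of the APCE-based occlusion test mentioned above, the sketch below computes APCE with its commonly used definition, the squared peak-to-minimum gap divided by the mean squared deviation from the minimum. The function names, the history-based rule, and the `ratio` threshold are assumptions made for illustration and are not taken from the paper.

```python
def apce(response):
    """Average peak-to-correlation energy of a 2-D response map (list of rows)."""
    values = [v for row in response for v in row]
    f_max, f_min = max(values), min(values)
    # Mean squared deviation of every response value from the minimum.
    energy = sum((v - f_min) ** 2 for v in values) / len(values)
    if energy == 0.0:
        return 0.0  # flat map: no distinguishable peak
    return (f_max - f_min) ** 2 / energy


def is_occluded(apce_now, apce_history, ratio=0.5):
    """Flag occlusion when APCE falls well below its historical mean.

    `ratio` is a hypothetical threshold, not a value from the paper.
    """
    if not apce_history:
        return False
    mean_apce = sum(apce_history) / len(apce_history)
    return apce_now < ratio * mean_apce
```

A single sharp peak gives a high APCE, while a map with several comparable peaks (typical when the target is hidden) gives a lower one, which is what the comparison against the history exploits.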

Spatio-Temporal Context, Correlation Filter and Measurement Estimation Collaboration Based Visual Object Tracking

Sensors ◽ 10.3390/s21082841 ◽ 2021 ◽ Vol 21 (8) ◽ pp. 2841

Author(s): Khizer Mehmood ◽ Abdul Jalil ◽ Ahmad Ali ◽ Baber Khan ◽ Maria Murad ◽ ...

Keyword(s): Kalman Filter ◽ Object Tracking ◽ Environmental Changes ◽ Correlation Filter ◽ Temporal Context ◽ Visual Object ◽ Target Model ◽ Change Of Scale ◽ Tracking Model ◽ Spatio Temporal

Despite notable progress in recent years, various challenges in object tracking, such as scale variation, partial or full occlusion, background clutter, and illumination variation, still need to be resolved with improved estimation for real-time applications. This paper proposes a robust and fast algorithm for object tracking based on spatio-temporal context (STC). A pyramid representation-based scale correlation filter is incorporated to overcome STC's inability to handle rapid changes in target scale. It learns the appearance changes induced by variations in target scale, sampled at a set of different scales. During occlusion, most correlation filter trackers start drifting because the model is updated with wrong samples. To prevent the target model from drifting, an occlusion detection and handling mechanism is incorporated. Occlusion is detected from the peak correlation score of the response map. Once occlusion is detected, an extended Kalman filter is used for occlusion handling: it continuously predicts the target location during occlusion and passes it to the STC tracking model. This decreases the chance of tracking failure, as the Kalman filter continuously updates itself and the tracking model. The model is further improved by fusion with average peak-to-correlation energy (APCE) criteria, which automatically update the target model to deal with environmental changes. Extensive evaluation on benchmark datasets indicates the efficacy of the proposed tracking method relative to state-of-the-art trackers.
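
The predict-during-occlusion idea can be sketched with a filter that is updated while the target is visible and only predicts while it is hidden. Note the substitution: this is a plain linear constant-velocity Kalman filter on one image axis, not the extended Kalman filter the abstract describes, and the class name, noise variances, and the `None`-for-occluded convention are all assumptions.

```python
class ConstantVelocityKalman1D:
    """Linear Kalman filter with state [position, velocity] on one axis."""

    def __init__(self, position, process_var=1e-2, measurement_var=1.0):
        self.p, self.v = float(position), 0.0
        # Symmetric covariance [[c_pp, c_pv], [c_pv, c_vv]].
        self.c_pp, self.c_pv, self.c_vv = 1.0, 0.0, 1.0
        self.q, self.r = process_var, measurement_var  # hypothetical values

    def predict(self):
        # x <- F x with F = [[1, 1], [0, 1]] (unit time step).
        self.p += self.v
        # P <- F P F^T + q I
        c_pp = self.c_pp + 2 * self.c_pv + self.c_vv + self.q
        c_pv = self.c_pv + self.c_vv
        c_vv = self.c_vv + self.q
        self.c_pp, self.c_pv, self.c_vv = c_pp, c_pv, c_vv
        return self.p

    def update(self, measured_position):
        # Only the position is observed: H = [1, 0].
        innovation = measured_position - self.p
        s = self.c_pp + self.r
        k_p, k_v = self.c_pp / s, self.c_pv / s
        self.p += k_p * innovation
        self.v += k_v * innovation
        # P <- (I - K H) P
        c_pp = (1 - k_p) * self.c_pp
        c_pv = (1 - k_p) * self.c_pv
        c_vv = self.c_vv - k_v * self.c_pv
        self.c_pp, self.c_pv, self.c_vv = c_pp, c_pv, c_vv


def track_axis(measurements):
    """Filter positions per frame; None marks a frame flagged as occluded."""
    kf = ConstantVelocityKalman1D(measurements[0])
    estimates = [kf.p]
    for z in measurements[1:]:
        kf.predict()
        if z is not None:
            kf.update(z)  # visible: correct with the tracker's position
        estimates.append(kf.p)  # occluded: keep the pure prediction
    return estimates
```

In a full tracker, the estimate produced on occluded frames would be what gets handed back to the tracking model in place of its own, unreliable position.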

A Visual Object Tracking Algorithm Based on Improved TLD

Algorithms ◽ 10.3390/a13010015 ◽ 2020 ◽ Vol 13 (1) ◽ pp. 15 ◽ Cited By ~ 3

Author(s): Xinxin Zhen ◽ Shumin Fei ◽ Yinmin Wang ◽ Wei Du

Keyword(s): Object Tracking ◽ Environmental Changes ◽ Failure Detection ◽ Motion Blur ◽ Tracking Performance ◽ Visual Object ◽ Tracking Problem ◽ Visual Object Tracking ◽ Scanning Area ◽ Tracking Module

Visual object tracking is an important research topic in the field of computer vision. Tracking–learning–detection (TLD) decomposes the tracking problem into three modules (tracking, learning, and detection), which provides effective ideas for solving the tracking problem. To improve the tracking performance of the TLD tracker, three improvements are proposed in this paper. First, the built-in tracking module is replaced with a kernelized correlation filter (KCF) algorithm based on the histogram of oriented gradients (HOG) descriptor. Second, failure detection is added to the KCF response to identify whether KCF has lost the target. Third, a more specific detection area for the detection module is obtained from the estimated location provided by the tracking module. With these changes, the scanning area of object detection is reduced, and a full-frame search is required in the detection module only if the tracking module fails to track the object. Comparative experiments were conducted on the object tracking benchmark (OTB), and the results showed that tracking speed and accuracy were improved. Further, with the proposed method the TLD tracker performed better in different challenging scenarios, such as motion blur, occlusion, and environmental changes. Moreover, the improved TLD achieved outstanding tracking performance compared with common tracking algorithms.
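
One way to picture the reduced scanning area is a function that chooses the detector's search region from the tracker's output. This is only a sketch: the peak-response test, the `failure_threshold` and `margin` values, and the box conventions are hypothetical and are not the paper's failure-detection rule.

```python
def detection_region(frame_size, tracker_box, peak_response,
                     failure_threshold=0.25, margin=2.0):
    """Return the (x0, y0, x1, y1) region the detector should scan.

    frame_size is (width, height); tracker_box is (x, y, w, h).
    failure_threshold and margin are illustrative values only.
    """
    frame_w, frame_h = frame_size
    if peak_response < failure_threshold:
        # Tracker judged to have failed: fall back to a full-frame search.
        return (0, 0, frame_w, frame_h)
    x, y, w, h = tracker_box
    cx, cy = x + w / 2, y + h / 2
    half_w, half_h = margin * w / 2, margin * h / 2
    # Scan a window around the tracker's estimate, clipped to the frame.
    x0 = max(0, int(cx - half_w))
    y0 = max(0, int(cy - half_h))
    x1 = min(frame_w, int(cx + half_w))
    y1 = min(frame_h, int(cy + half_h))
    return (x0, y0, x1, y1)
```

With a confident tracker the detector scans only a window a few times the target size, which is where the speed gain described above would come from.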

Re2EMA: Regularized and Reinitialized Exponential Moving Average for Target Model Update in Object Tracking

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v33i01.33018457 ◽ 2019 ◽ Vol 33 ◽ pp. 8457-8464

Author(s): Jianglei Huang ◽ Wengang Zhou

Keyword(s): Object Tracking ◽ Moving Average ◽ Transformation Matrix ◽ Visual Object ◽ Optimal Model ◽ Visual Object Tracking ◽ Target Model ◽ Model Update ◽ Optimal Target

Target model update plays an important role in visual object tracking. However, performing an optimal model update is challenging. In this work, we propose to achieve an optimal target model by learning a transformation matrix from the last target model to the newly generated one, which results in a minimization objective. This objective poses two challenges. The first is that the newly generated target model is unreliable. To overcome this problem, we propose to impose a penalty that limits the distance between the learned target model and the last one. The second is that, as time evolves, we cannot decide whether the last target model has been corrupted. To resolve this dilemma, we propose a reinitialization term. In addition, to control the complexity of the transformation matrix, we add a regularizer. We find that the solution of the optimization formula, with some simplifications, degenerates to an exponential moving average (EMA). Finally, despite its simplicity, extensive experiments conducted on several commonly used benchmarks demonstrate the effectiveness of our proposed approach in relatively long-term scenarios.
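
For reference, the plain EMA update that the abstract says the solution degenerates to can be written in a few lines. This sketch shows only the standard EMA, without the paper's regularization or reinitialization terms; the function name and the default `learning_rate` are assumptions.

```python
def ema_update(previous_model, new_model, learning_rate=0.02):
    """Blend the newly generated target model into the previous one.

    Models are flat lists of coefficients; learning_rate is a
    hypothetical default, not a value from the paper.
    """
    keep = 1.0 - learning_rate
    return [keep * old + learning_rate * new
            for old, new in zip(previous_model, new_model)]
```

A small learning rate keeps the model close to its history, which limits the damage from one unreliable new model but also slows recovery if the stored model is already corrupted.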

Visual object tracking using Fourier domain phase information

Signal Image and Video Processing ◽ 10.1007/s11760-021-01968-5 ◽ 2021

Author(s): Serdar Cakir ◽ A. Enis Cetin

Keyword(s): Object Tracking ◽ Visual Object ◽ Fourier Domain ◽ Phase Information ◽ Visual Object Tracking ◽ Domain Phase

Adaptive Channel Selection for Robust Visual Object Tracking with Discriminative Correlation Filters

International Journal of Computer Vision ◽ 10.1007/s11263-021-01435-1 ◽ 2021

Author(s): Tianyang Xu ◽ Zhenhua Feng ◽ Xiao-Jun Wu ◽ Josef Kittler

Keyword(s): Object Tracking ◽ Augmented Lagrangian Method ◽ Channel Selection ◽ Image Feature ◽ Superior Performance ◽ Appearance Model ◽ Visual Object ◽ Correlation Filters ◽ Visual Object Tracking ◽ Feature Representations

Discriminative Correlation Filters (DCF) have been shown to achieve impressive performance in visual object tracking. However, existing DCF-based trackers rely heavily on learning regularised appearance models from invariant image feature representations. To further improve the accuracy of DCF and provide a parsimonious model from the attribute perspective, we propose to gauge the relevance of multi-channel features for the purpose of channel selection. This is achieved by assessing the information conveyed by the features of each channel as a group, using an adaptive group elastic net that induces independent sparsity and temporal smoothness on the DCF solution. The robustness and stability of the learned appearance model are significantly enhanced by the proposed method, as the process of channel selection performs implicit spatial regularisation. We use the augmented Lagrangian method to optimise the discriminative filters efficiently. The experimental results obtained on a number of well-known benchmarking datasets demonstrate the effectiveness and stability of the proposed method. Superior performance over state-of-the-art trackers is achieved using fewer than 10% of the deep feature channels.
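
To make the idea of selecting channels as groups concrete, the sketch below applies the standard group soft-thresholding operator (the proximal step of a group-lasso penalty) to per-channel filter coefficients. This is not the authors' adaptive group elastic net or their augmented Lagrangian solver; the function names and the `lam` value used in any call are assumptions.

```python
import math


def group_soft_threshold(channels, lam):
    """Shrink each channel's coefficients as one group.

    channels is a list of per-channel coefficient lists. A channel whose
    L2 norm is at most lam is zeroed out entirely, i.e. deselected.
    """
    result = []
    for coeffs in channels:
        norm = math.sqrt(sum(c * c for c in coeffs))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        result.append([scale * c for c in coeffs])
    return result


def active_channels(channels):
    """Indices of channels that still have a nonzero coefficient."""
    return [i for i, coeffs in enumerate(channels)
            if any(c != 0.0 for c in coeffs)]
```

Because the shrinkage acts on a whole channel's norm, weak channels vanish together rather than coefficient by coefficient, which is how a group penalty can leave only a small fraction of channels active.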
