A multivariate intersection over union of SiamRPN network for visual tracking
Abstract
The SiamRPN algorithm performs well in visual tracking, but it tends to drift under occlusion and fast motion because it uses the $$\ell_1$$-smooth loss function to measure the regression location of the bounding box. In this paper, we propose a multivariate intersection over union (MIOU) loss for the SiamRPN tracking framework. First, the MIOU loss includes three geometric factors in regression: the overlap area ratio, the center distance ratio, and the aspect ratio, which better reflect the degree of coincidence between the target box and the prediction box. Second, we improve the definition of the aspect ratio loss to avoid gradient explosion and to improve the optimization of the prediction box. Finally, based on the SiamRPN tracker, we compare the tracking performance of the $$\ell_1$$-smooth loss, IOU loss, GIOU loss, DIOU loss, and MIOU loss. Experimental results show that the MIOU loss achieves better target location regression than the other loss functions on the OTB2015 and VOT2016 benchmarks, especially under occlusion, illumination change, and fast motion.
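
As a rough illustration of how three geometric factors can enter a box regression loss, the following Python sketch combines an overlap term, a normalized center distance term, and an aspect ratio term. It is not the MIOU definition proposed in this paper: the aspect ratio term and its weight `alpha` follow the commonly used CIoU-style arctangent form, and the function names, the `(x1, y1, x2, y2)` box layout, and the `eps` constant are assumptions made for the example.

```python
import math


def geometric_terms(pred, target):
    """Return (iou, center_distance_ratio, aspect_term) for two boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed layout).
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Overlap area ratio: intersection over union.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    area_p = (px2 - px1) * (py2 - py1)
    area_t = (tx2 - tx1) * (ty2 - ty1)
    iou = inter / (area_p + area_t - inter)

    # Center distance ratio: squared distance between box centers,
    # normalized by the squared diagonal of the smallest enclosing box.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    enc_w = max(px2, tx2) - min(px1, tx1)
    enc_h = max(py2, ty2) - min(py1, ty1)
    dist_ratio = (dx * dx + dy * dy) / (enc_w * enc_w + enc_h * enc_h)

    # Aspect ratio term in the CIoU-style arctangent form (an assumption,
    # not the improved definition the paper proposes).
    angle_t = math.atan((tx2 - tx1) / (ty2 - ty1))
    angle_p = math.atan((px2 - px1) / (py2 - py1))
    aspect = (4.0 / math.pi ** 2) * (angle_t - angle_p) ** 2

    return iou, dist_ratio, aspect


def three_factor_box_loss(pred, target, eps=1e-9):
    """Hypothetical loss combining the three geometric factors."""
    iou, dist_ratio, aspect = geometric_terms(pred, target)
    # Weight for the aspect term; eps avoids division by zero when the
    # boxes coincide exactly.
    alpha = aspect / ((1.0 - iou) + aspect + eps)
    return 1.0 - iou + dist_ratio + alpha * aspect
```

Unlike a plain IOU loss, which saturates at 1 for every pair of non-overlapping boxes, the center distance term in this sketch keeps the loss sensitive to how far apart the boxes are, which is one reason such terms are used for targets that move quickly or are partly occluded.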