A spatio-temporal video analysis system for object segmentation

Author(s):  
Jianhui Xia ◽  
Yulin Wang
2020 ◽  
Vol 71 (7) ◽  
pp. 868-880
Author(s):  
Nguyen Hong-Quan ◽  
Nguyen Thuy-Binh ◽  
Tran Duc-Long ◽  
Le Thi-Lan

Along with the strong development of camera networks, video analysis systems have become more and more popular and have been applied in various practical applications. In this paper, we focus on the person re-identification (person ReID) task, which is a crucial step in video analysis systems. The purpose of person ReID is to associate multiple images of a given person moving through a non-overlapping camera network. Many efforts have been made on person ReID. However, most studies on person ReID deal only with well-aligned bounding boxes that are detected manually and considered perfect inputs for person ReID. In fact, when building a fully automated person ReID system, the quality of the two preceding steps, person detection and tracking, may have a strong effect on person ReID performance. The contribution of this paper is two-fold. First, a unified framework for person ReID based on deep learning models is proposed. In this framework, a deep neural network for person detection is coupled with a deep-learning-based tracking method. Besides, features extracted from an improved ResNet architecture are proposed for person representation to achieve higher ReID accuracy. Second, our self-built dataset is introduced and employed for evaluation of all three steps in the fully automated person ReID framework.
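To make the matching stage of such a detection-tracking-ReID pipeline concrete, the following is a minimal sketch, assuming a plain torchvision ResNet-50 backbone as a stand-in for the paper's improved ResNet and cosine-similarity ranking against a gallery; the function names and crop size are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: appearance-feature extraction and gallery ranking for person ReID.
# The backbone, crop size, and matching rule are assumptions for illustration only.
import torch
import torch.nn.functional as F
from torchvision import models, transforms

# Plain ResNet-50 as a stand-in for the paper's improved ResNet;
# the classification head is dropped to expose 2048-d features.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize((256, 128)),            # common person-crop resolution
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(crop):
    """Map a PIL person crop to an L2-normalised appearance descriptor."""
    x = preprocess(crop).unsqueeze(0)
    return F.normalize(backbone(x), dim=1)

def reid_rank(query_crop, gallery_crops, gallery_ids):
    """Rank gallery identities by cosine similarity to the query crop."""
    q = embed(query_crop)
    g = torch.cat([embed(c) for c in gallery_crops], dim=0)
    sims = (q @ g.T).squeeze(0)               # cosine similarity of unit vectors
    order = torch.argsort(sims, descending=True)
    return [(gallery_ids[i], float(sims[i])) for i in order]
```

In a fully automated system, the gallery crops would come from the detector and tracker outputs rather than hand-aligned boxes, which is exactly where the abstract notes ReID accuracy can degrade.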


2021 ◽  
Author(s):  
Sundaram Muthu ◽  
Ruwan Tennakoon ◽  
Reza Hoseinnezhad ◽  
Alireza Bab-Hadiashar

This paper presents a new approach, called TMNet, to the unsupervised video object segmentation (UVOS) problem. UVOS remains challenging because prior methods suffer from generalization errors when segmenting multiple objects in unseen test videos (category agnostic), over-reliance on inaccurate optic flow, and difficulty capturing fine details at object boundaries. These issues make UVOS, particularly in the presence of multiple objects, an ill-defined problem. Our focus is to constrain the problem and improve segmentation results by including multiple available cues, such as appearance, motion, image edges, flow edges, and tracking information, through neural attention. To solve the challenging category-agnostic multiple-object UVOS problem, our model is designed to predict neighbourhood affinities for pixels belonging to the same object and to cluster them to obtain an accurate segmentation. To achieve multi-cue neural attention, we designed a Temporal Motion Attention module, as part of our segmentation framework, to learn spatio-temporal features. To refine and improve the accuracy of object segmentation boundaries, an edge refinement module (using image and optic-flow edges) and a geometry-based loss function are incorporated. The overall framework is capable of segmenting objects and finding accurate object boundaries without any heuristic post-processing, which enables the method to be used on unseen videos. Experimental results on the challenging DAVIS16 and multi-object DAVIS17 datasets show that the proposed TMNet performs favourably compared to state-of-the-art methods without post-processing.
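The affinity-then-cluster idea in the abstract can be illustrated with a small sketch: given predicted affinities between each pixel and its right and bottom neighbours, pixels linked by high affinities are merged into the same object via union-find. The two-channel affinity layout, the threshold, and the 4-neighbourhood are illustrative assumptions and not the paper's exact formulation.

```python
# Hedged sketch: turn predicted neighbourhood affinities into an instance label map.
# Affinity layout, threshold, and neighbourhood are assumptions for illustration.
import numpy as np

def cluster_affinities(affinity, threshold=0.5):
    """affinity: (2, H, W); channel 0 = affinity to the right neighbour,
    channel 1 = affinity to the bottom neighbour. Returns an (H, W) label map
    where pixels linked by affinities above `threshold` share a label."""
    _, h, w = affinity.shape
    parent = np.arange(h * w)

    def find(i):                       # union-find root with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    for y in range(h):
        for x in range(w):
            idx = y * w + x
            if x + 1 < w and affinity[0, y, x] > threshold:
                union(idx, idx + 1)    # merge with right neighbour
            if y + 1 < h and affinity[1, y, x] > threshold:
                union(idx, idx + w)    # merge with bottom neighbour

    roots = np.array([find(i) for i in range(h * w)])
    _, labels = np.unique(roots, return_inverse=True)
    return labels.reshape(h, w)

# Example: low affinity along one column splits the grid into two objects.
aff = np.ones((2, 4, 6))
aff[0, :, 2] = 0.0                     # cut horizontal links after column 2
print(cluster_affinities(aff))
```

In TMNet, such affinities would come from the attention-equipped network rather than being hand-set, and boundary quality would further depend on the edge refinement module described above.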

