video representation
Recently Published Documents


Total documents: 167 (five years: 46)

H-index: 19 (five years: 3)

2021. Author(s): Yue Liu, Junqi Ma, Yufei Xie, Xuefeng Yang, Xingzhen Tao, ...

2021. Author(s): Jingran Zhang, Xing Xu, Fumin Shen, Yazhou Yao, Jie Shao, ...

2021, Vol 22. Author(s): Jonathan Doan, Irfan Sheikh, Lawrence Elmer, Mehmood Rashid

Author(s): Yuqi Huo, Mingyu Ding, Haoyu Lu, Ziyuan Huang, Mingqian Tang, ...

This paper proposes a novel pretext task for self-supervised video representation learning that exploits spatiotemporal continuity in videos. It is motivated by the fact that videos are spatiotemporal by nature, so a representation learned by detecting spatiotemporal continuity and discontinuity should benefit downstream video content analysis tasks. A natural choice of pretext task is to construct spatiotemporal (3D) jigsaw puzzles and learn to solve them. However, as we demonstrate in the experiments, this task turns out to be intractable. We thus propose Constrained Spatiotemporal Jigsaw (CSJ), in which the 3D jigsaws are formed in a constrained manner to ensure that large continuous spatiotemporal cuboids exist. This provides sufficient cues for the model to reason about continuity. Instead of solving the puzzles directly, which could still be extremely hard, we carefully design four surrogate tasks that are easier to solve. The four tasks aim to learn representations sensitive to spatiotemporal continuity at both the local and global levels. Extensive experiments show that CSJ achieves state-of-the-art results on various benchmarks.
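
To make the idea of a spatiotemporal (3D) jigsaw concrete, the following is a minimal NumPy sketch and not the authors' implementation. It splits a toy video into a grid of cuboids, shuffles them without constraint, and counts the possible orderings, which may hint at why solving the puzzle directly is hard. The abstract does not specify how CSJ constrains the shuffle, so `axiswise_jigsaw` is a hypothetical stand-in (whole slabs are permuted along each axis independently) chosen only to show how a constraint shrinks the puzzle space. All function names here are illustrative assumptions.

```python
import math
import numpy as np


def split_into_cuboids(video, grid):
    """Split a (T, H, W) array into gt*gh*gw equally sized cuboids."""
    T, H, W = video.shape
    gt, gh, gw = grid
    assert T % gt == 0 and H % gh == 0 and W % gw == 0
    t, h, w = T // gt, H // gh, W // gw
    cells = video.reshape(gt, t, gh, h, gw, w).transpose(0, 2, 4, 1, 3, 5)
    return cells.reshape(gt * gh * gw, t, h, w)


def assemble(cells, grid):
    """Inverse of split_into_cuboids: place cuboids back on the grid in order."""
    gt, gh, gw = grid
    _, t, h, w = cells.shape
    x = cells.reshape(gt, gh, gw, t, h, w).transpose(0, 3, 1, 4, 2, 5)
    return x.reshape(gt * t, gh * h, gw * w)


def unconstrained_jigsaw(video, grid, rng):
    """Shuffle all cuboids with one arbitrary permutation."""
    cells = split_into_cuboids(video, grid)
    perm = rng.permutation(len(cells))
    return assemble(cells[perm], grid), perm


def axiswise_jigsaw(video, grid, rng):
    """Hypothetical constraint (an assumption, not the CSJ rule):
    permute whole slabs along each axis independently."""
    out = video
    perms = []
    for axis, g in enumerate(grid):
        assert out.shape[axis] % g == 0
        perm = rng.permutation(g)
        slabs = np.split(out, g, axis=axis)
        out = np.concatenate([slabs[i] for i in perm], axis=axis)
        perms.append(perm)
    return out, perms


def label_space_unconstrained(grid):
    """Number of distinct orderings when every cuboid can go anywhere."""
    gt, gh, gw = grid
    return math.factorial(gt * gh * gw)


def label_space_axiswise(grid):
    """Number of distinct orderings under the axis-wise constraint."""
    gt, gh, gw = grid
    return math.factorial(gt) * math.factorial(gh) * math.factorial(gw)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.random((8, 32, 32))  # toy grayscale clip: 8 frames of 32x32
    grid = (2, 2, 2)
    shuffled, perm = unconstrained_jigsaw(video, grid, rng)
    constrained, perms = axiswise_jigsaw(video, grid, rng)
    print(label_space_unconstrained(grid))  # 8! = 40320
    print(label_space_axiswise(grid))       # 2! * 2! * 2! = 8
```

Even on a 2x2x2 grid the unconstrained puzzle has 40320 possible orderings against 8 for this toy constraint, and the gap grows factorially with finer grids. The surrogate tasks mentioned in the abstract are not sketched here because the abstract does not describe them.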


Author(s): Zhipeng Wang, Chunping Hou, Guanghui Yue, Qingyuan Yang
