A Video Descriptor Using Orientation Tensors and Shape-Based Trajectory Clustering
Dense trajectories have been shown to be a very promising approach in the field of human action recognition. In this paper, we propose a new kind of video descriptor, generated from the relationship between a trajectory's optical flow and the gradient field in its neighborhood. Orientation tensors are used to accumulate relevant information over the video, representing the tendency of direction in the descriptor space for that kind of movement. Furthermore, we propose a method to cluster trajectories according to their shape. This method allows us to accumulate different motion patterns in different tensors and to more easily distinguish trajectories produced by real movement from those produced by camera movement. The proposed method achieves the best known recognition rates among methods satisfying the self-descriptor constraint on popular datasets: Hollywood2 (up to 46%) and KTH (up to 94%).
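The abstract's core idea of accumulating per-frame descriptor vectors into an orientation tensor can be sketched as follows. This is a minimal illustration of the general orientation-tensor technique (summing rank-1 outer products and normalizing), not the authors' exact implementation; the function name, the Frobenius normalization, and the toy input are all assumptions for demonstration.

```python
import numpy as np

def accumulate_orientation_tensor(vectors):
    """Accumulate descriptor vectors into a single orientation tensor.

    vectors: (N, d) array, one d-dimensional descriptor per frame or
    trajectory point. Each vector contributes a rank-1 outer product,
    so the tensor captures the dominant directions in descriptor space.
    (Illustrative sketch; not the paper's exact formulation.)
    """
    d = vectors.shape[1]
    T = np.zeros((d, d))
    for v in vectors:
        T += np.outer(v, v)  # rank-1 update for this frame's descriptor
    # Normalize (here by Frobenius norm, an assumed choice) so tensors
    # from videos of different lengths remain comparable.
    return T / np.linalg.norm(T)

# Toy usage: three 2-D descriptor vectors accumulated into one tensor.
T = accumulate_orientation_tensor(np.array([[1.0, 0.0],
                                            [0.0, 1.0],
                                            [1.0, 1.0]]))
```

The resulting symmetric matrix can then be flattened (e.g. its upper triangle) and fed to a classifier; keeping separate tensors per trajectory-shape cluster, as the abstract describes, would simply mean calling such an accumulator once per cluster.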