In this paper, we bring high-performance deformable part models (DPMs) from object detection to human action recognition and localization, and propose a unified method for detecting actions in video sequences. DPMs have attracted considerable attention in the field of object detection. We generalize the approach from 2D still images to 3D spatiotemporal volumes. Human actions are described by features based on 3D histograms of oriented gradients. Different poses are represented by a mixture of models at different resolutions. The model automatically selects the most discriminative 3D parts and learns their anchor positions relative to the root. Empirical results on several video datasets demonstrate the effectiveness of the proposed method for both action recognition and localization.
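To make the root-and-parts idea concrete, the following is a minimal sketch of how a deformable part model could score one root placement in a spatiotemporal volume. It is an illustration under stated assumptions, not the implementation used in this work: the function names `part_score` and `dpm_score`, the quadratic deformation penalty, the search `radius`, and the choice to keep root and parts at the same resolution are all hypothetical simplifications. Filter responses are assumed to be precomputed 3D arrays indexed as (t, y, x).

```python
import numpy as np


def part_score(part_resp, anchor, deform_w, radius=2):
    """Best part response near its anchor, minus a quadratic deformation cost.

    part_resp: 3D array of part-filter responses, indexed (t, y, x).
    anchor:    (t, y, x) anchor position of the part (assumed layout).
    deform_w:  (wt, wy, wx) non-negative deformation weights (assumed form).
    """
    T, H, W = part_resp.shape
    at, ay, ax = anchor
    best = -np.inf
    for dt in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                t, y, x = at + dt, ay + dy, ax + dx
                # Skip displacements that fall outside the volume.
                if 0 <= t < T and 0 <= y < H and 0 <= x < W:
                    cost = (deform_w[0] * dt * dt
                            + deform_w[1] * dy * dy
                            + deform_w[2] * dx * dx)
                    best = max(best, part_resp[t, y, x] - cost)
    return best


def dpm_score(root_resp, root_pos, part_resps, offsets, deform_ws, bias=0.0):
    """Root response plus the best deformed placement of every part."""
    t, y, x = root_pos
    total = root_resp[t, y, x] + bias
    for resp, off, w in zip(part_resps, offsets, deform_ws):
        # Anchor of the part, expressed relative to the root position.
        anchor = (t + off[0], y + off[1], x + off[2])
        total += part_score(resp, anchor, w)
    return total
```

In this sketch a part may drift away from its anchor when a stronger response nearby outweighs the deformation cost; larger weights in `deform_ws` keep the part closer to its learned anchor.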