The aim of this work is to present a visual-based human action recognition system which is adapted to constrained embedded devices, such as smart phones. Basically, vision-based human action recognition is a combination of feature-tracking, descriptor-extraction and subsequent classification of image representations, with a color-based identification tool to distinguish between multiple human subjects. Simple descriptors sets were evaluated to optimize recognition rate and performance and two dimensional (2D) descriptors were found to be effective. These sets installed on the latest phones can recognize human actions in videos in less than one second with a success rate of over 82%.