This paper presents several approaches to the machine perception of motion and discusses the role and levels of knowledge in each. In particular, different techniques of motion understanding as focusing on one of movement, activity or action are described.
Movements
are the most atomic primitives, requiring no contextual or sequence knowledge to be recognized; movement is often addressed using either view–invariant or view–specific geometric techniques.
Activity
refers to sequences of movements or states, where the only real knowledge required is the statistics of the sequence; much of the recent work in gesture understanding falls within this category of motion perception. Finally,
actions
are larger–scale events, which typically include interaction with the environment and causal relationships; action understanding straddles the grey division between perception and cognition, computer vision and artificial intelligence. These levels are illustrated with examples drawn mostly from the group's work in understanding motion in video imagery. It is argued that the utility of such a division is that it makes explicit the representational competencies and manipulations necessary for perception.