Published online by Cambridge University Press: 05 July 2012
Analyzing the behaviors of people in smart environments using multimodal sensors requires answering a set of typical questions: Who are the people? Where are they? What activities are they doing? When? With whom are they interacting? And how are they interacting? In this view, locating people or their faces and characterizing them (e.g., extracting their body or head orientation) allows us to address the first two questions (who and where), and is usually one of the first steps before applying higher-level multimodal scene analysis algorithms that address the remaining questions. Over the last ten years, tracking algorithms have made considerable progress, particularly in indoor environments or for specific applications, where they have reached a maturity that allows their deployment in real systems and applications. Nevertheless, several issues can still make tracking difficult: background clutter and potentially small object size; complex shape, appearance, and motion, and their changes over time or across camera views; inaccurate or rough scene calibration, or inconsistent camera calibration between views for 3D tracking; and real-time processing requirements. In what follows, we discuss some important aspects of tracking algorithms and introduce the remaining chapter content.
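To make the discussion concrete, the following is a minimal illustrative sketch (not taken from the chapter) of one common building block of indoor person trackers: a constant-velocity Kalman filter that smooths noisy per-frame position detections. The class name, the process-noise parameter `q`, and the measurement-noise parameter `r` are all assumptions chosen for illustration; a real tracker would couple such a filter with detection, data association, and appearance models.

```python
# Illustrative sketch only: a 1D constant-velocity Kalman filter, applied
# independently per coordinate to smooth noisy person-position detections.
# Parameter values (q, r) are assumptions, not values from the chapter.

class Kalman1D:
    """Constant-velocity Kalman filter for one image/world coordinate."""

    def __init__(self, pos, q=1e-2, r=1.0):
        self.x = [pos, 0.0]                 # state: [position, velocity]
        self.P = [[1.0, 0.0], [0.0, 1.0]]   # state covariance
        self.q = q                          # process noise intensity
        self.r = r                          # measurement noise variance

    def predict(self, dt=1.0):
        # State transition F = [[1, dt], [0, 1]] (constant velocity).
        pos, vel = self.x
        self.x = [pos + dt * vel, vel]
        P = self.P
        # P <- F P F^T + Q  (Q approximated as q on the diagonal)
        p00 = P[0][0] + dt * (P[0][1] + P[1][0]) + dt * dt * P[1][1] + self.q
        p01 = P[0][1] + dt * P[1][1]
        p10 = P[1][0] + dt * P[1][1]
        p11 = P[1][1] + self.q
        self.P = [[p00, p01], [p10, p11]]

    def update(self, z):
        # Measurement model H = [1, 0]: we observe position only.
        s = self.P[0][0] + self.r           # innovation covariance
        k0 = self.P[0][0] / s               # Kalman gain (position)
        k1 = self.P[1][0] / s               # Kalman gain (velocity)
        y = z - self.x[0]                   # innovation (detection residual)
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        P = self.P
        # P <- (I - K H) P
        self.P = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
                  [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]


# Usage: feed a short sequence of noisy position detections.
kf = Kalman1D(0.0)
for z in [1.0, 2.1, 2.9, 4.2]:
    kf.predict()
    kf.update(z)
```

After a few detections the filter's state holds a smoothed position estimate and an estimated velocity, which is what lets such trackers bridge short detection gaps caused by clutter or occlusion.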
Scenarios and Set-ups. Scenarios and application needs strongly influence the considered physical environment, and therefore the set-up (where, how many, and what type of sensors are used) and the choice of tracking method. A first set of scenarios commonly involves tracking people in so-called smart spaces (Singh et al., 2006).