Published online by Cambridge University Press: 20 May 2010
The recognition of objects in images is a central task in computer vision. It is particularly challenging because the form and appearance of 3-D objects projected onto 2-D images undergo significant variation owing to the numerous dimensions of visual transformations, such as changes in viewing distance and direction, illumination conditions, occlusion, articulation, and, perhaps most significantly, within-category type variation. The challenge is how to encode the collection of these forms and appearances (Fig. 23.1), which are high-dimensional manifolds embedded in even higher-dimensional spaces, such that the topology induced by such variations is preserved. We believe this is the key to the successful differentiation of various categories.
Form and appearance play separate and distinct, but perhaps interacting, roles in recognition. The early exploration of the form-only role assumed that figures can be successfully segregated from the image. This assumption was justified by the vast body of "segmentation" research going on at the same time. Currently, it is generally accepted that segmentation as a stand-alone approach is ill-posed and that segmentation must be approached together with recognition or other high-level tasks. Nevertheless, research on form-only representation and recognition remains immensely valuable in that it has identified key issues in shape representation and key challenges in recognition, such as those arising in the presence of occlusion, articulation, and other unwieldy visual transformations, which are present even in such a highly oversimplified domain. Section 23.2 reviews some of the lessons learned and captured in the context of the shock-graph approach to recognition.