Published online by Cambridge University Press: 20 May 2010
One possible approach to category recognition is to model object categories as graphs of features and to focus mainly on the second-order (pairwise) relationships between them, which include both category-dependent and perceptual grouping constraints. This differs from the popular bag-of-words model (Csurka et al. 2004), which concentrates exclusively on local features and ignores the higher-order interactions between them. The main observation is that higher-order relationships between model features are more important for category recognition than local, first-order features. Earlier studies support the view that simple, unary features, without higher-order relationships (such as geometric constraints or conjunctions of properties), are not sufficient at the higher cognitive levels where object category recognition takes place (Treisman 1986; Hummel 2000). The importance of pairwise relationships between features was recognized early on, starting with Ullman's theory of the correspondence process, which introduced the notion of correspondence strength that takes into account both the local (unary) affinities and the pairwise interactions between features (Marr 1982).
More generally, the use of pairwise or global geometric constraints between contour fragments was explored extensively in early work. For example, interpretation trees were used to find correspondences between contour fragments, aligning an object model with the object instance in a novel image of a cluttered scene (Grimson and Lozano-Pérez 1987; Grimson 1990a). Other approaches relied on more global techniques based on transformation voting alignment (Lowe 1985), as well as geometric reasoning on groups of fragments (Goad 1983; Brooks 1981).