Towards Integration of Different Paradigms in Modeling, Representation, and Learning of Visual Categories

doi:10.1017/CBO9780511635465.018

17 - Towards Integration of Different Paradigms in Modeling, Representation, and Learning of Visual Categories

Published online by Cambridge University Press: 20 May 2010

Mario Fritz and Bernt Schiele

Edited by

Sven J. Dickinson, University of Toronto
Aleš Leonardis, University of Ljubljana
Bernt Schiele, Technische Universität, Darmstadt, Germany
Michael J. Tarr, Carnegie Mellon University, Pennsylvania

Summary

Introduction

Object representations for categorization tasks should be applicable to a wide range of objects, scalable to large numbers of object classes, and at the same time learnable from a few training samples. Although such a scalable representation remains elusive, it has been argued that it should have at least the following properties: it should enable sharing of features (Torralba et al. 2007), it should combine generative models with discriminative models (Fritz et al. 2005; Jaakkola and Haussler 1999), and it should combine local and global as well as appearance- and shape-based features (Leibe et al. 2005). Additionally, we argue that such object representations should be applicable both to unsupervised learning (e.g., visual object discovery) and to supervised training (e.g., object detection). Therefore, we extend our previous efforts in hybrid modeling (Fritz et al. 2005) with ideas from unsupervised learning of generative decompositions to obtain an approach that integrates different paradigms of modeling, representing, and learning visual categories.

We present a novel method for the discovery and detection of visual object categories based on decompositions using topic models. The approach is capable of learning a compact, low-dimensional representation for multiple visual categories from multiple viewpoints without labeling of training instances. The learnt object components range from local structures through line segments to global silhouette-like descriptions. This representation can be used to discover object categories in a fully unsupervised fashion. Furthermore, we employ the representation as the basis for a supervised multicategory detection system that makes efficient use of training examples and outperforms purely feature-based representations.
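
To make the idea of a topic-model decomposition concrete, the following is a minimal sketch, not the chapter's implementation. The summary says only "topic models", so probabilistic latent semantic analysis (pLSA) fitted by EM is used here as an assumed stand-in. The function names (`fit_plsa`, `normalized`, `log_likelihood`) and the toy `counts` matrix of visual-word occurrences per image are hypothetical and chosen purely for illustration.

```python
import math
import random

def normalized(row):
    """Scale a non-negative row to sum to 1 (uniform if it has no mass)."""
    total = sum(row)
    if total == 0:
        return [1.0 / len(row)] * len(row)
    return [v / total for v in row]

def log_likelihood(counts, p_z_d, p_w_z):
    """Sum over observed (image, word) pairs of n * log p(word | image)."""
    ll = 0.0
    for d, row in enumerate(counts):
        for w, n in enumerate(row):
            if n:
                ll += n * math.log(sum(pz * pw[w] for pz, pw in zip(p_z_d[d], p_w_z)))
    return ll

def fit_plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM; returns p(topic|image), p(word|topic), log-likelihood trace."""
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random positive initialisation breaks the symmetry between topics.
    p_z_d = [normalized([rng.random() + 0.1 for _ in range(n_topics)]) for _ in range(n_docs)]
    p_w_z = [normalized([rng.random() + 0.1 for _ in range(n_words)]) for _ in range(n_topics)]
    history = []
    for _ in range(n_iter):
        new_zd = [[0.0] * n_topics for _ in range(n_docs)]
        new_wz = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: responsibility of each topic for this (image, word) pair.
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                denom = sum(joint)
                for z in range(n_topics):
                    r = n * joint[z] / denom
                    new_zd[d][z] += r
                    new_wz[z][w] += r
        # M-step: renormalise the accumulated expected counts.
        p_z_d = [normalized(row) for row in new_zd]
        p_w_z = [normalized(row) for row in new_wz]
        history.append(log_likelihood(counts, p_z_d, p_w_z))
    return p_z_d, p_w_z, history

# Hypothetical toy data: rows are images, columns are visual-word counts.
counts = [
    [8, 6, 5, 1, 0, 0],
    [7, 7, 4, 0, 0, 1],
    [9, 5, 6, 0, 1, 0],
    [0, 1, 0, 6, 8, 5],
    [1, 0, 0, 5, 9, 4],
    [0, 0, 1, 7, 6, 6],
]
p_z_d, p_w_z, history = fit_plsa(counts, n_topics=2)
# The dominant topic per image acts as an unsupervised category assignment.
print([max(range(2), key=lambda z: row[z]) for row in p_z_d])
```

In this sketch each row of `p_z_d` is a low-dimensional description of one image. On block-structured counts like the toy data, one would expect the dominant topic to group images by category. The same rows could also be passed to a supervised classifier as features, in the spirit of the detection system described above.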

Type: Chapter
Information: Object Categorization: Computer and Human Vision Perspectives, pp. 324-347

DOI: https://doi.org/10.1017/CBO9780511635465.018

Publisher: Cambridge University Press

Print publication year: 2009
