Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction
- Part one Pattern Classification with Binary-Output Neural Networks
- Part two Pattern Classification with Real-Output Networks
- Part three Learning Real-Valued Functions
- 16 Learning Classes of Real Functions
- 17 Uniform Convergence Results for Real Function Classes
- 18 Bounding Covering Numbers
- 19 Sample Complexity of Learning Real Function Classes
- 20 Convex Classes
- 21 Other Learning Problems
- Part four Algorithmics
- Appendix 1 Useful Results
- Bibliography
- Author index
- Subject index
20 Convex Classes
Published online by Cambridge University Press: 26 February 2010
Introduction
We have seen in the previous chapter that finiteness of the fat-shattering dimension is necessary and sufficient for learning. Unfortunately, there is a considerable gap between our lower and upper bounds on sample complexity. Even for a function class with finite pseudo-dimension, the bounds show only that the sample complexity is Ω(1/ε) and O(1/ε²). In this chapter, we show that this gap is not just a consequence of our lack of skill in proving sample complexity bounds: there are function classes demonstrating that both rates are possible. More surprisingly, we show that the sample complexity (or, equivalently, the estimation error rate) is determined by the ‘closure convexity’ of the function class. (Closure convexity is a slightly weaker condition than convexity.) Specifically, for function classes with finite pseudo-dimension, if the class is closure convex, the sample complexity grows roughly as 1/ε; if it is not closure convex, the sample complexity grows roughly as 1/ε², and no other rates are possible (ignoring log factors).
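In symbols, the dichotomy reads as follows (a paraphrase of the claim just stated, with constants, log factors, and the confidence parameter suppressed; not a verbatim theorem statement): for a class $F$ with finite pseudo-dimension,
$$
m_F(\varepsilon) = \tilde{\Theta}\!\left(\frac{1}{\varepsilon}\right) \ \text{if $F$ is closure convex}, \qquad m_F(\varepsilon) = \tilde{\Theta}\!\left(\frac{1}{\varepsilon^{2}}\right) \ \text{otherwise.}
$$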
To understand the intuition behind these results, consider a domain X of cardinality one. In this case, a function class is equivalent to a bounded subset of the real numbers, and the learning problem is equivalent to finding the best approximation from that subset to the expectation of a bounded random variable. It is a standard result of probability theory that the squared difference between the sample average and the expectation of such a random variable decreases as 1/m.
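The 1/m rate in this one-point case follows from a standard variance computation (a textbook fact, not an argument specific to this chapter): if $Y_1, \dots, Y_m$ are i.i.d. copies of a random variable $Y$ taking values in an interval of length $B$, then
$$
\mathbb{E}\!\left[\left(\frac{1}{m}\sum_{i=1}^{m} Y_i - \mathbb{E}Y\right)^{2}\right] = \frac{\operatorname{Var}(Y)}{m} \le \frac{B^{2}}{4m},
$$
so driving the squared error below ε requires on the order of 1/ε examples, matching the faster of the two rates above.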
Neural Network Learning: Theoretical Foundations, pp. 269–283. Publisher: Cambridge University Press. Print publication year: 1999.