Book contents
- Frontmatter
- Contents
- Preface
- 1 Getting Started
- 2 Perceptron Learning – Basics
- 3 A Choice of Learning Rules
- 4 Augmented Statistical Mechanics Formulation
- 5 Noisy Teachers
- 6 The Storage Problem
- 7 Discontinuous Learning
- 8 Unsupervised Learning
- 9 On-line Learning
- 10 Making Contact with Statistics
- 11 A Bird's Eye View: Multifractals
- 12 Multilayer Networks
- 13 On-line Learning in Multilayer Networks
- 14 What Else?
- Appendices
- Bibliography
- Index
9 - On-line Learning
Published online by Cambridge University Press: 05 June 2012
Summary
So far we have focused on the performance of various learning rules as a function of the size of a training set whose examples are all selected before training starts and remain available throughout the training period. In real life and in many practical situations, however, training examples come and go with time. Learning then has to proceed on-line, using only the training example available at any particular moment. This contrasts with the previous scenario, called off-line or batch learning, in which all training examples are available at all times.
For the Hebb rule, the off-line and on-line scenarios coincide: each example provides an additive contribution to the synaptic vector, independent of the other examples. As already mentioned in chapter 3, this rule performs rather badly for large training sets, precisely because it treats all training examples in exactly the same way. The purpose of this chapter is to introduce more advanced or alternative on-line learning rules and to compare their performance with that of their off-line counterparts.
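The additivity of the Hebb rule can be made concrete in a few lines: each example contributes a term proportional to its label times the input vector, independent of the order of presentation, so training one example at a time gives the same final coupling vector as batch training. The following is a minimal sketch, assuming a teacher perceptron with random binary inputs; the dimension `N`, the number of examples, and the normalisation are illustrative choices of ours, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100                                  # input dimension (illustrative)
J = np.zeros(N)                          # student coupling vector

# Hypothetical teacher vector providing the labels
T = rng.standard_normal(N)

for _ in range(500):
    xi = rng.choice([-1.0, 1.0], size=N)  # random binary example
    sigma = np.sign(T @ xi)               # teacher's classification
    J += sigma * xi / np.sqrt(N)          # additive Hebbian update

# The overlap with the teacher grows with the number of examples seen
overlap = (J @ T) / (np.linalg.norm(J) * np.linalg.norm(T))
```

Because each update is a fixed additive term, presenting the examples on-line, in any order, or all at once yields the identical vector J.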
Stochastic gradient descent
In an on-line scenario, the training examples are presented once, in sequential order, and the coupling vector J is updated at each time step using information from that single example only.
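A generic on-line update of this kind follows the gradient of a single-example cost. The sketch below uses a smooth squared error on the local field of a student perceptron learning from a teacher; the cost function, learning rate, and all sizes are our own illustrative assumptions, not a prescription from the text:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50
J = rng.standard_normal(N) * 0.01        # small random initial couplings
T = rng.standard_normal(N)               # hypothetical teacher vector
eta = 0.5                                # learning rate (illustrative)

for _ in range(2000):
    xi = rng.choice([-1.0, 1.0], size=N)  # current example only
    sigma = np.sign(T @ xi)               # teacher's label
    h = J @ xi / np.sqrt(N)               # student's local field
    # gradient of the single-example cost 0.5*(tanh(h) - sigma)^2 w.r.t. J
    grad = (np.tanh(h) - sigma) * (1.0 - np.tanh(h) ** 2) * xi / np.sqrt(N)
    J -= eta * grad                       # stochastic gradient step

overlap = (J @ T) / (np.linalg.norm(J) * np.linalg.norm(T))
```

Each step uses only the example currently available; the example is then discarded, in contrast with batch gradient descent, which averages the gradient over the whole training set before each update.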
- Statistical Mechanics of Learning, pp. 149-175. Publisher: Cambridge University Press. Print publication year: 2001.