Book contents
- Frontmatter
- Contents
- Foreword by Steven Salzberg
- Preface
- Acknowledgements
- 1 Introduction
- 2 Mathematical preliminaries
- 3 Overview of computational gene prediction
- 4 Gene finder evaluation
- 5 A toy exon finder
- 6 Hidden Markov models
- 7 Signal and content sensors
- 8 Generalized hidden Markov models
- 9 Comparative gene finding
- 10 Machine-learning methods
- 11 Tips and tricks
- 12 Advanced topics
- Appendix
- References
- Index
10 - Machine-learning methods
Published online by Cambridge University Press: 05 June 2012
Summary
Quite a few of the techniques described in the foregoing chapters could be said to qualify as machine-learning methods. In this chapter we consider a number of other popular machine-learning algorithms and models which either have seen limited use in gene finding, or would seem to offer possible avenues for future investigation in this arena. Most of the methods which we describe are relatively easy to implement in software, and nearly all are available in open-source implementations (see Appendix). While the current emphasis in the field of gene prediction seems to be on Markovian systems (in one form or another), an expanded role for other predictive techniques in the future is not inconceivable.
Overview of automatic classification
Perhaps the most typical setting for machine-learning applications is that of N-way classification (Figure 10.1). In this setting, a test case (i.e., a novel object) is presented to a classifier for assignment to one of a fixed number of discrete categories. The test case is typically encoded as a vector of real-valued or integer-valued attributes (i.e., random variables – section 2.6), though we will generally treat all attributes as being real-valued; thus the attributes of a single test case are drawn from $\mathbb{R}^m$, for some integer m. The categories to which test cases are to be mapped are typically encoded as integer values in a fixed range, one value per category.
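The classification setting described above can be made concrete with a small sketch. The chapter excerpt does not prescribe a particular classifier, so the example below uses a nearest-centroid rule purely as an illustration. The function names `train_centroids` and `classify` are hypothetical, and numbering the categories 0 to N-1 is an assumption of this sketch rather than a convention taken from the text.

```python
from typing import List, Sequence


def train_centroids(examples: Sequence[Sequence[float]],
                    labels: Sequence[int],
                    n_classes: int) -> List[List[float]]:
    """Compute one mean attribute vector (centroid) per category.

    Assumes labels are integers in 0..n_classes-1 (a convention chosen
    for this sketch) and that every category has at least one example.
    """
    m = len(examples[0])  # number of attributes per test case
    sums = [[0.0] * m for _ in range(n_classes)]
    counts = [0] * n_classes
    for x, y in zip(examples, labels):
        counts[y] += 1
        for j, value in enumerate(x):
            sums[y][j] += value
    if any(c == 0 for c in counts):
        raise ValueError("every category needs at least one training example")
    return [[s / counts[c] for s in sums[c]] for c in range(n_classes)]


def classify(centroids: Sequence[Sequence[float]],
             x: Sequence[float]) -> int:
    """Map an m-dimensional attribute vector to the nearest centroid's category."""
    def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    distances = [squared_distance(c, x) for c in centroids]
    return distances.index(min(distances))


# Toy data: four training cases with m = 2 attributes and N = 2 categories.
training_cases = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
training_labels = [0, 0, 1, 1]
centroids = train_centroids(training_cases, training_labels, n_classes=2)

print(classify(centroids, [0.2, 0.1]))  # nearest to category 0's centroid
print(classify(centroids, [5.0, 4.0]))  # nearest to category 1's centroid
```

Whatever classifier is used, the interface stays the same: a vector of m real-valued attributes goes in, and a single integer category label comes out.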
- Type: Chapter
- Information: Methods for Computational Gene Prediction, pp. 325-357. Publisher: Cambridge University Press. Print publication year: 2007.