Book contents
- Frontmatter
- Contents
- Preface
- I Exact String Matching: The Fundamental String Problem
- 1 Exact Matching: Fundamental Preprocessing and First Algorithms
- 2 Exact Matching: Classical Comparison-Based Methods
- 3 Exact Matching: A Deeper Look at Classical Methods
- 4 Seminumerical String Matching
- II Suffix Trees and Their Uses
- III Inexact Matching, Sequence Alignment, Dynamic Programming
- IV Currents, Cousins, and Cameos
- Epilogue – where next?
- Bibliography
- Glossary
- Index
1 - Exact Matching: Fundamental Preprocessing and First Algorithms
from I - Exact String Matching: The Fundamental String Problem
Published online by Cambridge University Press: 23 June 2010
Summary
The naive method
Almost all discussions of exact matching begin with the naive method, and we follow this tradition. The naive method aligns the left end of P with the left end of T and then compares the characters of P and T left to right until either two unequal characters are found or until P is exhausted, in which case an occurrence of P is reported. In either case, P is then shifted one place to the right, and the comparisons are restarted from the left end of P. This process repeats until the right end of P shifts past the right end of T.
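The procedure just described can be sketched in a few lines of Python. This is a minimal illustration rather than code from the book: the function name `naive_match` and the choice to report 0-based start positions are assumptions made here for convenience.

```python
def naive_match(pattern: str, text: str) -> list[int]:
    """Return the 0-based start positions of every occurrence of pattern in text."""
    n, m = len(pattern), len(text)
    occurrences = []
    # Each shift aligns the left end of P with text[shift]. The last useful
    # shift is m - n; beyond it the right end of P passes the right end of T.
    for shift in range(m - n + 1):
        j = 0
        # Compare left to right until a mismatch is found or P is exhausted.
        while j < n and text[shift + j] == pattern[j]:
            j += 1
        if j == n:
            occurrences.append(shift)  # P was exhausted: report an occurrence
        # In either case, move P one place to the right and restart at its left end.
    return occurrences


if __name__ == "__main__":
    print(naive_match("abc", "xabcabcx"))  # [1, 4]
```

If the pattern is longer than the text, the range of shifts is empty and no occurrences are reported.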
Using n to denote the length of P and m to denote the length of T, the worst-case number of comparisons made by this method is Θ(nm). In particular, if both P and T consist of the same repeated character, then there is an occurrence of P at each of the first m − n + 1 positions of T and the method performs exactly n(m − n + 1) comparisons. For example, if P = aaa and T = aaaaaaaaaa, then n = 3, m = 10, and 3 × (10 − 3 + 1) = 24 comparisons are made.
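The count in this example can be checked by instrumenting the naive method to tally character comparisons. The sketch below is an illustration under stated assumptions: `count_comparisons` is a hypothetical helper name, and it counts every character comparison, including the one that reveals a mismatch.

```python
def count_comparisons(pattern: str, text: str) -> int:
    """Count the character comparisons the naive method makes on pattern and text."""
    n, m = len(pattern), len(text)
    comparisons = 0
    for shift in range(m - n + 1):
        for j in range(n):
            comparisons += 1  # one comparison of pattern[j] against text[shift + j]
            if text[shift + j] != pattern[j]:
                break  # mismatch: abandon this alignment and shift P right by one
    return comparisons


if __name__ == "__main__":
    n, m = 3, 10
    total = count_comparisons("a" * n, "a" * m)
    print(total)                     # 24
    print(total == n * (m - n + 1))  # True
```

By contrast, a pattern whose first character never appears in the text, such as P = baa against the same T, mismatches immediately at each of the 8 alignments and costs only 8 comparisons.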
The naive method is certainly simple to understand and program, but its worst-case running time of Θ(nm) may be unsatisfactory and can be improved. Even the practical running time of the naive method may be too slow for larger texts and patterns.
- Type: Chapter
- Information: Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology, pp. 5-15
- Publisher: Cambridge University Press
- Print publication year: 1997