1 - Why Data Structures? A Motivating Example
Published online by Cambridge University Press: 10 November 2016
Summary
To begin the study of data structures, I demonstrate the usefulness of even quite simple structures by working through a detailed motivating example. We shall afterward come back to the basics and build up our body of knowledge incrementally.
The algorithm presented in this introduction is due to R. S. Boyer and J S. Moore and solves the string matching problem in a surprisingly efficient way. The techniques, although sophisticated, do not require any advanced mathematical tools for their understanding. It is precisely because of this simplicity that the algorithm is a good example of the usefulness of data structures, even the simplest ones. In fact, all that is needed to make the algorithm work are two small arrays storing integers.
There are two sorts of algorithms that, when first encountered, inspire both perplexity and admiration. The first is an algorithm so complicated that one can hardly imagine how its inventors came up with the idea, triggering a reaction of the kind, “How could they think of that?” The other is just the opposite: a flash of ingenuity that yields an utterly simple solution, leaving us with the question, “Why didn't I think of that?” The Boyer–Moore algorithm is of this second kind.
We encounter instances of the string matching problem on a daily basis. It is defined generically as follows: given a text T = T[1]T[2] · · · T[n] of length n characters and a string S = S[1]S[2] · · · S[m] of length m, find the first location, or all locations, of S in T, if S appears there at all. In the example of Figure 1.1, the string S = TRYME is indeed found in T, starting at position 22.
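To make the problem statement concrete, the following minimal sketch uses Python's built-in `str.find` and converts its 0-based result to the 1-based positions used in the text. The function name `first_occurrence` and the sample text are assumptions made for illustration; the sample is not claimed to be the text of Figure 1.1.

```python
from typing import Optional


def first_occurrence(text: str, pattern: str) -> Optional[int]:
    """Return the 1-based position of the first occurrence of pattern
    in text, or None if pattern does not appear at all."""
    index = text.find(pattern)  # 0-based index, or -1 if absent
    return None if index == -1 else index + 1


# Hypothetical sample text, chosen only for illustration.
sample_text = "FRIENDS, ROMANS, COUNTRYMEN"
print(first_occurrence(sample_text, "TRYME"))
print(first_occurrence(sample_text, "XYZ"))
```

The built-in search hides how the match is actually found; it serves here only as a specification of what a string matching algorithm has to return.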
To solve the problem, we imagine that the string is aligned underneath the text, starting with both text and string left justified. One can then compare corresponding characters until a mismatch is found, which enables us to move the string forward to a new potential matching position. We call this the naive approach. It should be emphasized that describing the string as sliding along an imaginary path is only an aid to understanding.
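The naive approach can be sketched as follows. This is an illustration under stated assumptions, not the author's code: it assumes that after a mismatch the string is moved forward by exactly one position, and the name `naive_match` is hypothetical. Positions are reported 1-based, as in the text.

```python
def naive_match(text: str, pattern: str) -> list[int]:
    """Return the 1-based starting positions of all occurrences of
    pattern in text, found by the naive approach."""
    n, m = len(text), len(pattern)
    positions = []
    # shift is the number of positions the string has been moved forward.
    for shift in range(n - m + 1):
        j = 0
        # Compare corresponding characters left to right until a mismatch.
        while j < m and text[shift + j] == pattern[j]:
            j += 1
        if j == m:  # all m characters matched at this alignment
            positions.append(shift + 1)
        # Assumption: after a mismatch, move forward by one position only.
    return positions


# Hypothetical sample text, chosen only for illustration.
print(naive_match("FRIENDS, ROMANS, COUNTRYMEN", "TRYME"))
```

No string is actually moved in memory; the variable `shift` merely records where the comparison starts. In the worst case this sketch performs on the order of n · m character comparisons.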
Type: Chapter
Information: Basic Concepts in Data Structures, pp. 1-13. Publisher: Cambridge University Press. Print publication year: 2016.