Basic Concepts

R. D. Tennent

doi:10.1017/CBO9781139164900.012

Recognizing patterns in strings is ubiquitous in computing. For example, a programming-language compiler has to recognize whether input programs match the “pattern” defined by the syntax rules of the programming language. Many important software tools, such as text editors, command interpreters, and formatters, require the capability to recognize patterns in strings. In fact, any program that reads textual input from its users must implicitly test the well-formedness of that input. In the following chapters, we introduce some of the interesting concepts and techniques that may be used to address this class of applications.

Strings

In computing, the term string is normally understood to refer to a finite sequence of characters drawn from a character set such as ASCII. We will find it convenient to generalize this concept slightly by allowing string components to be drawn from an arbitrary finite set, termed the alphabet or the vocabulary.

Definition 7.1 If Σ is any finite set, a string over Σ is any finite sequence of elements of Σ.
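Definition 7.1 can be mirrored fairly directly in code. The following Python sketch is illustrative only: the function name `is_string_over` and the binary example alphabet are assumptions introduced here, not notation from the text. It models a candidate string as a finite sequence and checks that every component is an element of the given finite alphabet.

```python
from typing import Hashable, Iterable, Sequence


def is_string_over(alphabet: Iterable[Hashable], candidate: Sequence[Hashable]) -> bool:
    """Return True if every component of `candidate` is an element of `alphabet`.

    `is_string_over` is a hypothetical helper name used for illustration.
    """
    sigma = frozenset(alphabet)  # the finite set playing the role of Σ
    return all(symbol in sigma for symbol in candidate)


# Example alphabet (an assumption for illustration): the binary digits.
binary = {"0", "1"}

print(is_string_over(binary, ("0", "1", "1")))  # True
print(is_string_over(binary, ("0", "2")))       # False: "2" is not in the alphabet
print(is_string_over(binary, ()))               # True: the empty sequence is a finite sequence
```

Representing the alphabet as a set and the string as a tuple keeps the two roles distinct: the alphabet is unordered and has no repetition, whereas the string is ordered and may repeat components.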

Strings may also be termed words or sentences. String components (i.e., elements of Σ) might be termed, depending on the context, characters, tokens, symbols, atoms, or generators. If Σ is the ASCII character set, a string over Σ is exactly what is normally considered to be a string. But if we are discussing the syntax of a programming language, the relevant vocabulary might be a set of lexical tokens, ignoring, at this level of abstraction, the substructure of multiple-character tokens such as <=.
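The effect of choosing a different vocabulary can be seen in a small Python sketch. The `Token` enumeration and the sample text `x <= 10` are assumptions made up for this illustration; they do not come from any particular programming language's definition. The same text is viewed first as a string over the ASCII character set and then as a string over a vocabulary of lexical tokens, in which `<=` is a single component.

```python
from enum import Enum


class Token(Enum):
    """A hypothetical token vocabulary, chosen only for illustration."""
    IDENT = "identifier"
    LE = "<="
    NUMBER = "number"


source_text = "x <= 10"

# Viewed over the ASCII alphabet, each character (including spaces) is a component.
as_characters = tuple(source_text)

# Viewed over the token vocabulary, "<=" is one indivisible component,
# and the internal spelling of "x" and "10" is ignored.
as_tokens = (Token.IDENT, Token.LE, Token.NUMBER)

print(len(as_characters))  # 7
print(len(as_tokens))      # 3
```

At the token level the two characters `<` and `=` no longer appear as separate components, which is the abstraction the paragraph above describes.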
