In this chapter, we present some basic material from formal language theory, and we concentrate on those topics that arise and have been studied in connection with group theory. These include, for example, real-time and indexed languages, and 2-variable automata, which do not generally merit extensive coverage in textbooks on the subject.
Although we do not assume any prior knowledge of the subject from the reader, some familiarity with the basics would be helpful. As a general reference, we recommend the first edition of the standard text by Hopcroft and Ullman [159]. The third edition of this book (co-authored also by Motwani) [158] serves as a more leisurely but less complete introduction to the subject. For many of the basic results of formal language theory, we provide only sketch proofs here (which we nevertheless hope are enough to allow the reader to fill in the details) and refer to [159] or [158] for full details and further discussion.
Languages, automata and grammars
Let A be a finite alphabet. Recall from 1.1.2 that A* denotes the set of all strings or words over A, including the empty word. We call a subset of A* a language over A. We shall sometimes study families of languages; the languages in a family are not normally all defined over the same alphabet.
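As a small illustration of these definitions, the following Python sketch enumerates the words of A* up to a given length and picks out a finite part of one language over A. The alphabet {a, b}, the helper names words_up_to and in_language, and the choice of example language (words containing an even number of occurrences of a) are assumptions made purely for illustration.

```python
from itertools import product


def words_up_to(alphabet, max_len):
    """Yield every word over `alphabet` of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def in_language(word):
    """Membership test for the example language: an even number of a's."""
    return word.count("a") % 2 == 0


# Hypothetical two-letter alphabet; the empty word is represented by "".
A = ("a", "b")
words = list(words_up_to(A, 2))

# A language is just a subset of A*; here we list its words of length <= 2.
finite_part = [w for w in words if in_language(w)]
```

Since A* is infinite whenever A is non-empty, a program can only ever list a finite portion of it; the language itself is represented here by its membership test rather than by an explicit set.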
A language over an alphabet A may be defined in various ways, in particular by an automaton that accepts it, or by a grammar that generates it. We use the term automaton in a general sense that encompasses the range from finite state automata to Turing machines. We use the term grammar where other authors use the terms Type 0 grammar, phrase structure grammar, semi-Thue system or unrestricted grammar.
Automata
We consider an automaton M over an alphabet A to be a device (with a finite description) that reads input strings over A from a tape, and accepts some of them; the language of M, which is denoted by L(M), consists of those strings that are accepted. In this chapter we describe many of the different types of automata that can be defined, of varying complexity, ranging from the most basic finite state automata to the most general types of Turing machines, operating either deterministically or non-deterministically, and we examine properties of their associated languages.
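To make the notions of acceptance and of L(M) concrete, here is a minimal Python sketch of the simplest kind of automaton mentioned above, a deterministic finite state automaton. The class name, the dictionary encoding of the transitions, the state names and the example machine are all assumptions chosen for illustration; they are not notation used elsewhere in the text.

```python
class FiniteStateAutomaton:
    """A deterministic finite state automaton with a partial transition map."""

    def __init__(self, alphabet, transitions, start, accepting):
        self.alphabet = set(alphabet)
        self.transitions = dict(transitions)  # (state, letter) -> state
        self.start = start
        self.accepting = set(accepting)

    def accepts(self, word):
        """Read `word` letter by letter; accept if we end in an accepting state."""
        state = self.start
        for letter in word:
            if letter not in self.alphabet:
                raise ValueError(f"{letter!r} is not in the alphabet")
            state = self.transitions.get((state, letter))
            if state is None:  # no transition defined: the word is rejected
                return False
        return state in self.accepting


# Hypothetical example: M accepts the words over {a, b} with an even number of a's.
M = FiniteStateAutomaton(
    alphabet="ab",
    transitions={
        ("even", "a"): "odd",
        ("even", "b"): "even",
        ("odd", "a"): "even",
        ("odd", "b"): "odd",
    },
    start="even",
    accepting={"even"},
)

results = {w: M.accepts(w) for w in ["", "a", "ab", "aba", "bb"]}
```

In this encoding L(M) is exactly the set of words w for which M.accepts(w) returns True. The finite description of the device is the tuple of data passed to the constructor, while the language it defines is infinite.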