  • Print publication year: 2019
  • Online publication date: July 2019

10 - The Minimal Deterministic Finite-State Automaton for a Finite Language

from Part II - From Theory to Practice


A fundamental task in natural language processing is the efficient representation of lexica. From a computational viewpoint, lexica need to be represented in a way that directly supports fast access to entries and minimizes space requirements. A standard method is to represent lexica as minimal deterministic (classical) finite-state automata. One way to reach such a representation is to first build the trie of the lexicon and then minimize it. However, the intermediate trie is in general much larger than the resulting minimal automaton. A much better strategy is therefore to use a specialized algorithm that computes the minimal deterministic automaton directly, in an incremental way. In this chapter we describe such a procedure.
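To make the idea of direct, incremental construction concrete, here is a minimal Python sketch. It is not taken from the chapter: it assumes the words arrive in lexicographic order and follows the general shape of the well-known sorted-input incremental construction, in which each finished suffix is merged into a register of already-seen equivalent states. The names `State`, `build_minimal_dfa`, `accepts`, and `count_states` are illustrative only, and the chapter's own procedure may differ in its details.

```python
class State:
    """One automaton state: outgoing transitions plus a finality flag."""

    def __init__(self):
        self.edges = {}      # symbol -> State
        self.final = False

    def signature(self):
        # Two states are equivalent when they agree on finality and on their
        # (symbol, target) pairs, provided the targets are already canonical.
        return (self.final,
                tuple(sorted((c, id(t)) for c, t in self.edges.items())))


def build_minimal_dfa(sorted_words):
    """Build a minimal acyclic DFA from words given in lexicographic order."""
    root = State()
    register = {}    # signature -> canonical state
    unchecked = []   # (parent, symbol, child) along the previous word's path
    previous = ""

    def minimize(down_to):
        # Merge states of the previous word's path, deepest first, until only
        # the first `down_to` transitions remain unchecked.
        while len(unchecked) > down_to:
            parent, symbol, child = unchecked.pop()
            canonical = register.setdefault(child.signature(), child)
            parent.edges[symbol] = canonical

    for word in sorted_words:
        if word < previous:
            raise ValueError("input must be sorted lexicographically")
        # Length of the common prefix with the previous word.
        common = 0
        for a, b in zip(word, previous):
            if a != b:
                break
            common += 1
        # Everything below the common prefix can no longer change.
        minimize(common)
        node = unchecked[-1][2] if unchecked else root
        for symbol in word[common:]:
            child = State()
            node.edges[symbol] = child
            unchecked.append((node, symbol, child))
            node = child
        node.final = True
        previous = word

    minimize(0)
    return root


def accepts(root, word):
    """Return True if the automaton rooted at `root` accepts `word`."""
    node = root
    for symbol in word:
        node = node.edges.get(symbol)
        if node is None:
            return False
    return node.final


def count_states(root):
    """Count the states reachable from `root`."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.edges.values())
    return len(seen)


words = ["tap", "taps", "top", "tops"]
root = build_minimal_dfa(words)
```

For this four-word lexicon the trie would need eight states, whereas the automaton built above has five, because the suffixes after "ta" and "to" are shared. At no point does the sketch hold the full trie in memory: only the path of the most recently added word remains unminimized.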