Book contents
- Frontmatter
- Contents
- Preface
- 1 The smallest free number
- 2 A surpassing problem
- 3 Improving on saddleback search
- 4 A selection problem
- 5 Sorting pairwise sums
- 6 Making a century
- 7 Building a tree with minimum height
- 8 Unravelling greedy algorithms
- 9 Finding celebrities
- 10 Removing duplicates
- 11 Not the maximum segment sum
- 12 Ranking suffixes
- 13 The Burrows–Wheeler transform
- 14 The last tail
- 15 All the common prefixes
- 16 The Boyer–Moore algorithm
- 17 The Knuth–Morris–Pratt algorithm
- 18 Planning solves the Rush Hour problem
- 19 A simple Sudoku solver
- 20 The Countdown problem
- 21 Hylomorphisms and nexuses
- 22 Three ways of computing determinants
- 23 Inside the convex hull
- 24 Rational arithmetic coding
- 25 Integer arithmetic coding
- 26 The Schorr–Waite algorithm
- 27 Orderly insertion
- 28 Loopless functional algorithms
- 29 The Johnson–Trotter algorithm
- 30 Spider spinning for dummies
- Index
13 - The Burrows–Wheeler transform
Published online by Cambridge University Press: 05 March 2013
Summary
Introduction
The Burrows–Wheeler transform (BWT) is a method for permuting a list with the aim of bringing repeated elements together. Its main use is as a preprocessing step in data compression. Lists with many repeated adjacent elements can be encoded compactly using simple schemes such as run-length or move-to-front encoding. The result can then be fed into more advanced compressors, such as Huffman or arithmetic coding, to compress the input even more.
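To make the two simple schemes concrete, here is a minimal Python sketch of run-length and move-to-front encoding. The function names and the explicit `alphabet` parameter are illustrative assumptions, not taken from the chapter, and practical compressors use more refined variants.

```python
from itertools import groupby


def run_length_encode(xs):
    """Collapse each run of equal adjacent elements into (element, count)."""
    return [(x, len(list(run))) for x, run in groupby(xs)]


def move_to_front_encode(xs, alphabet):
    """Replace each element by its current index in a table of symbols,
    then move that symbol to the front of the table.

    `alphabet` (an assumed parameter) lists every symbol that may occur.
    """
    table = list(alphabet)
    out = []
    for x in xs:
        i = table.index(x)
        out.append(i)
        table.insert(0, table.pop(i))  # most recent symbol now has index 0
    return out
```

Both encoders reward adjacency: a run of n equal elements becomes a single pair under run-length encoding, and becomes one index followed by n - 1 zeros under move-to-front encoding.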
Clearly, the best way of bringing repeated elements together is just to sort the list. But the idea has a major flaw as a preliminary to compression: there is no way to recover the original list unless the complete sorting permutation is also produced as part of the output. Without the ability to recover the original input, data compression is pointless; and if a permutation has to be produced as well, then compression is ineffective. Instead, the BWT achieves a more modest permutation, one that brings some but not all repeated elements into adjacent positions. The main advantage of the BWT is that the transform can be inverted using a single additional piece of information, namely an integer k in the range 0 ≤ k < n, where n is the length of the (nonempty) input list. In this pearl we describe the BWT, identify the fundamental reason why inversion is possible, and use it to derive the inverse transform from its specification.
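As an illustration, one common formulation of the transform sorts all rotations of the input, outputs the last element of each sorted rotation, and takes k to be the position of the original input among the sorted rotations. The Python sketch below follows that formulation. The names `bwt` and `inverse_bwt` are assumptions made here, and the inverse shown is the naive textbook reconstruction, not the inverse derived in the pearl.

```python
def bwt(xs):
    """Return (last column of the sorted rotations, index k of the input)."""
    xs = list(xs)
    n = len(xs)
    if n == 0:
        raise ValueError("input must be nonempty")
    rotations = sorted(xs[i:] + xs[:i] for i in range(n))
    k = rotations.index(xs)  # 0 <= k < n
    return [r[-1] for r in rotations], k


def inverse_bwt(last, k):
    """Naive inverse: rebuild the sorted rotations one column at a time.

    Each pass prepends the last column to the rows and re-sorts, so after
    n passes the table holds the sorted rotations and row k is the input.
    """
    n = len(last)
    table = [[] for _ in range(n)]
    for _ in range(n):
        table = sorted([last[i]] + table[i] for i in range(n))
    return table[k]


transformed, k = bwt("yokohama")
print("".join(transformed), k)                # hmooakya 7
print("".join(inverse_bwt(transformed, k)))   # yokohama
```

By contrast, plain sorting maps every permutation of the same elements to the same output, which is why it cannot be undone without the full sorting permutation. The naive inverse re-sorts n times and is meant only to show that the last column together with k determines the input.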
- Type: Chapter
- Information: Pearls of Functional Algorithm Design, pp. 91–101. Publisher: Cambridge University Press. Print publication year: 2010