This paper uses data from the Human Gene Mutation Database to contrast two hypotheses for
the origin of short DNA repeats: substitutions, and insertions that duplicate adjacent sequences.
Because substitutions are much more common than insertions, they are the dominant source of
new 2-repeat loci. Insertions are rarer, but over 70% of the 2–4 base insertion mutations are
duplications of adjacent sequences, and over half of these generate new repeat regions. Insertions
contribute fewer new repeat loci than substitutions, but their relative importance increases rapidly
with repeat number, so that all new 4–5-repeat mutations come from insertions, as do all 3-repeat
mutations of tetranucleotide repeats. This suggests that the process of repeat duplication that
dominates microsatellite evolution at high repeat numbers is also important very early in
microsatellite evolution. This result sheds light on the puzzle of the origin of short tandem repeats.
It also suggests that most short insertion mutations derive from a slippage-like process during