
Stochastic scrabble: large deviations for sequences with scores

Published online by Cambridge University Press:  14 July 2016

Richard Arratia
University of Southern California
Pricilla Morris
University of Southern California
Michael S. Waterman
University of Southern California


A derivation of a law of large numbers for the highest-scoring matching subsequence is given. Let $X_k$, $Y_k$ be i.i.d. $q = (q(i))_{i \in S}$ letters from a finite alphabet $S$ and $v = (v(i))_{i \in S}$ be a sequence of non-negative real numbers assigned to the letters of $S$. Using a scoring system similar to that of the game Scrabble, the score of a word $w = i_1 \cdots i_m$ is defined to be $V(w) = v(i_1) + \cdots + v(i_m)$. Let $V_n$ denote the value of the highest-scoring matching contiguous subsequence between $X_1 X_2 \cdots X_n$ and $Y_1 Y_2 \cdots Y_n$. In this paper, we show that $V_n / (K \log n) \to 1$ a.s., where $K \equiv K(q, v)$. The method employed here involves ‘stuttering’ the letters to construct a Markov chain and applying previous results for the length of the longest matching subsequence. An explicit form for $\beta \in \mathrm{Pr}(S)$, where $\beta(i)$ denotes the proportion of letter $i$ found in the highest-scoring word, is given. A similar treatment for Markov chains is also included.
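To make the quantity $V_n$ concrete, the following is a minimal sketch of how the highest-scoring matching contiguous word between two finite sequences could be computed by dynamic programming. It is an illustration only, not the method of the paper: the function name `highest_scoring_match` and the example alphabet and scores are hypothetical, and the sketch assumes that a match may start at any position in either sequence.

```python
def highest_scoring_match(x, y, v):
    """Return (score, word) for the highest-scoring common contiguous word.

    x, y : sequences of letters (for example, strings)
    v    : dict mapping each letter to a non-negative score (hypothetical input)
    """
    best_score, best_end, best_len = 0.0, 0, 0
    # prev[j] holds (score, length) of the common word ending at x[i-2], y[j-1]
    prev = [(0.0, 0)] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        curr = [(0.0, 0)] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                score, length = prev[j - 1]
                # Scores are non-negative, so extending a match never hurts
                curr[j] = (score + v[x[i - 1]], length + 1)
                if curr[j][0] > best_score:
                    best_score, best_end, best_len = curr[j][0], i, length + 1
        prev = curr
    return best_score, x[best_end - best_len:best_end]


# Illustrative scores only; these are not taken from the paper
scores = {"A": 1, "Q": 10}
value, word = highest_scoring_match("AAAAQ", "QAAAA", scores)
```

In this example the longest common word is `AAAA`, but the highest-scoring one is the single letter `Q`, which is why the score $V_n$ and the length of the longest match are different statistics.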

Implicit in these results is a large-deviation result for the additive functional, $H \equiv \sum_{n < \tau} v(X_n)$, for a Markov chain stopped at the hitting time $\tau$ of some state. We give this large-deviation result explicitly, for Markov chains in discrete time and in continuous time.
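As a rough illustration of the additive functional in discrete time, the sketch below accumulates $v(X_n)$ along a path until the chain hits a target state. It makes assumptions not stated in the abstract: the hitting time is taken to be the first $n \ge 1$ at which the chain visits the target, the transition probabilities are supplied as a nested dictionary, and the names `stopped_functional` and `sample_path` are hypothetical.

```python
import random


def stopped_functional(path, v, target):
    """Sum v(X_n) over n < tau, where tau is the first n >= 1 with X_n == target.

    The convention tau >= 1 is an assumption made for this sketch.
    """
    total = 0.0
    for n, state in enumerate(path):
        if n >= 1 and state == target:
            return total
        total += v[state]
    raise ValueError("path never hits the target after time 0")


def sample_path(P, start, target, rng):
    """Simulate the chain from `start` until it first hits `target` at a time >= 1.

    P is a dict of dicts: P[a][b] is the probability of moving from a to b.
    """
    path = [start]
    while True:
        current = path[-1]
        states = list(P[current])
        weights = [P[current][s] for s in states]
        nxt = rng.choices(states, weights=weights)[0]
        path.append(nxt)
        if nxt == target:
            return path


# Illustrative two-state chain; the numbers are not taken from the paper
P = {"a": {"a": 0.5, "b": 0.5}, "b": {"a": 0.5, "b": 0.5}}
v = {"a": 1.0, "b": 2.0}
rng = random.Random(0)
path = sample_path(P, "a", "a", rng)
H = stopped_functional(path, v, "a")
```

Repeating the simulation many times gives an empirical distribution for $H$, whose tail is the object a large-deviation statement would describe.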

Research Papers
Copyright © Applied Probability Trust 1988 



Research supported by the System Development Foundation, the National Science Foundation, the Institute of Mathematics and its Applications and the National Institutes of Health.

