How Not To Count Gaps

Published online by Cambridge University Press:  27 June 2016

Eric S. Wheeler
Affiliation:
University of Toronto

Extract

Králík (1977) examined four Czech texts of 7000 words each, in which he marked every occurrence of the word meaning "and" (and, separately, three other words). The interval between two successive occurrences of the marked word is called a gap, and the number of words in that gap is the gap length. When the number of gaps of each length was recorded, it was found that there were many more short gaps than long ones. Králík proposed an exponential decay model of this distribution of gap lengths: for N gaps in a text of T words, the proportion of those gaps that are of length x should tend to be:

f(x) = a exp(-ax), where a = N/T

That is, as the gap length increases, the number of such gaps decreases in a smooth, downwardly convex, “swooping” curve.
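
The gap count and the model value can be illustrated with a minimal sketch. It assumes that the gap length is the number of words strictly between two successive occurrences of the marked word; the extract does not spell out this convention. The function names and the toy word list are invented for illustration and are not Králík's data or procedure.

```python
import math
from collections import Counter


def gap_lengths(words, target):
    """Lengths of the gaps between successive occurrences of `target`.

    Assumption: a gap's length is the number of words strictly between
    two successive occurrences (adjacent occurrences give length 0).
    """
    positions = [i for i, w in enumerate(words) if w == target]
    return [b - a - 1 for a, b in zip(positions, positions[1:])]


def model_proportion(x, n_gaps, n_words):
    """Exponential model from the text: f(x) = a * exp(-a * x), a = N / T."""
    a = n_gaps / n_words
    return a * math.exp(-a * x)


# Toy text, made up for illustration only.
words = "a b and c and d e f and g and and h i j k and".split()

gaps = gap_lengths(words, "and")
observed = Counter(gaps)
N, T = len(gaps), len(words)

# Print gap length, observed proportion, and model proportion side by side.
for x in sorted(observed):
    print(x, observed[x] / N, model_proportion(x, N, T))
```

On a real text the observed proportions would be tabulated for every gap length and set against the model curve; the toy list here is far too short to say anything about fit.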

Type
Remarks/Remarques
Copyright
Copyright © Canadian Linguistic Association 1979

References

Králík, Jan (1977) “An Application of Exponential Distribution Law in Quantitative Linguistics.” Prague Studies in Mathematical Linguistics 5:23335.
Simon, H. A. (1968) “On Judging the Plausibility of Theories,” in Rootselaar, B. Van and Staal, J. F., eds., Logic, Methodology and Philosophy of Science III. Amsterdam: North Holland Publishing.