
The coinages in Seuss

Published online by Cambridge University Press:  10 August 2022

Department of Linguistics 3125 Campbell Hall University of California, Los Angeles Los Angeles, CA 90095-1543 USA


The children's books of Dr. Seuss abound in words that the author invented. Inspection shows that these coinages are not arbitrary, raising the challenge of specifying the linguistic basis on which they were created. Drawing evidence from regression analyses covering the full set of Seuss coinages, I note several patterns, which include coinages that are phonotactically ill-formed, coinages meant to sound German and coinages that assist compliance with the meter. But the primary coinage principle for Seuss appears to have been to use words that include phonesthemes (Firth 1930), small quasi-morphemic sequences affiliated with vague meanings. For instance, the coinage Snumm contains two phonesthemes identified in earlier research, [sn-] and [-ʌm]. Concerning phonesthemes in general, I assert their affiliation with vernacular style, and suggest that phonesthemes can be identified in words purely from their stylistic effect, even when the affiliated meaning is absent. This is true, I argue, both for Seuss’s coinages and for the existing vocabulary.

Research Article
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence, which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright © The Author, 2022. Published by Cambridge University Press

1 Introduction: coined words

The reader of children's books by Dr. Seuss (Theodor Seuss Geisel, 1904–91) cannot help but notice the great number of words Seuss coined himself. In (1) I give some examples, specifically the full set of coined words in If I Ran the Circus (1956).

  (1) The Seuss coinages in If I Ran the Circus

Here is a passage employing two of these coined words:

  (2) From a country called Frumm comes this Drum-Tummied Snumm

    Who can drum any tune that you might care to hum.

    Doesn't hurt him a bit, cause his Drum-Tummy's numb.

In coining words, Seuss was hardly alone among authors of fiction; more exalted figures of English literature such as Jonathan Swift or James Joyce did the same. Moreover, words are also coined by ordinary people from time to time (Marchand 1960: 320–1; Malkiel 1990: 105). When adopted generally, these words end up in the dictionary, listed as words of obscure origin. Thus, for instance, Eisiminger (1981) compiled a list of English words that have no etymology, and it abounds in slangy, obviously new forms. The same is true for older forms that gradually lost their slangy tinge and settled into standard usage (see, e.g., the Oxford English Dictionary's entries for boy, girl, big and bad). So research questions about word coinage are not confined to literature but are part of the study of language in general. A prolific word-coiner like Seuss can help us to explore word coinage in one ‘idiolect’ for which ample data are attested.

The freedom available to word-coiners is in principle very broad: it seems that they need only find some string that conforms to the phonotactics of their language enough to be pronounceable by other speakers. But this greatly oversimplifies the question. Word-coiners create their words for a reason, and they make substantial use of the phonological resources of their language when they create a novel phonological form. This point is well established, I believe, by recent research on word coinage, notably the extensive current research program on the creation of Pokémon names (for overviews, see Shih et al. 2018 and Kawahara 2021a). However, we will see that the Seuss corpus has idiosyncrasies that justify a slightly different analytical approach.Footnote 2

Before plunging into the Seuss coinages, I should offer a couple of clarifications. First, when I say ‘coinage’ here, I obviously am using it in a restricted sense, namely ‘made up de novo’, since we can also say that word creation in the normal way – application of the language's word formation rules, as in Seuss-ian or un-mute – counts as coinage.Footnote 3 Second, I acknowledge that word coinage often relies on lexical as well as phonological resources: a coined word is sometimes perceived to be similar in its phonology and semantics to an existing word (MacDonald 1988: 67; Magnus 2001: 140). For simplicity, the discussion below will ignore this factor.

2 The Seuss coinages: an attempt at precise description

I adopt the strategy of Shih and Kawahara, employing a digital data corpus and statistical modeling in order to obtain objective testimony about issues that can become subjective very easily. The modeling described here employs the technique of logistic regression (on which see, e.g., Johnson 2008: 159–74). The purpose of my models is to predict for any given word, on the basis of just its phonological form, whether it is a Seuss coinage or a real word (for similar applications in other domains, see Hayes 2016). It is, of course, impossible to achieve perfect prediction, and what the model really does is to assign a ‘probability-of-Seussian’ value to every form, so its predictions are gradient. The deeper purpose of the modeling is that, once a model has been optimized, we can make useful inferences from its internal structure, specifically the degree to which the model attributes explanatory importance to principles hypothesized to underlie Seuss’s coinage practice.

I employed readily accessible data. The Seuss coinages, which number about 435 in his complete oeuvre, were carefully collected and described by Lathem (2000). I extracted the coinages from Lathem's work and rendered them in phonetic transcription by hand. I believe the latter task is not difficult or controversial, in light of Seuss’s clear use of orthography and the additional clues provided by rhyme. For English in general I employed my own version of the Carnegie-Mellon pronunciation dictionary. My edited version, with 17,744 entries, includes only words that have a frequency of one or above in the English CELEX database (Baayen et al. 1995). This is meant to restrict it to words likely to be familiar to English speakers.Footnote 4 I also excluded words formed with highly productive suffixes such as inflectional [-z/-s/-əz] (plural, possessive, 3rd sg. pres.) or [-d/-t/-əd] (past tenses and participles). This is important because there are sequences that are very unusual in stems, but common in inflected forms. For instance, [ts] is rare in stems (e.g. Katz, Hertz) but is ordinary in inflected forms like cats or hurts. I argue below that Seuss indeed uses [ts] as a basis for coinages.

All the analytic work I did for this article (lexical databases with phonetic transcription, R scripts, spreadsheets) may be accessed in the Supplemental Materials at

2.1 What principles might be used to characterize the coined words?

Following a few pilot efforts, I settled on the following procedure to guide the work: I searched a fairly large preliminary set of predictive principles, then narrowed it down to a smaller set with just the most effective ones.

My pilot studies indicated that several strongly predictive traits consisted of word-initial syllable onsets, such as the [sn-] of Snumm. To be thorough, I searched the entire set of 73 attested word-initial onsets, irrespective of whether they occur in the real-word corpus or the Seuss corpus (some onsets occurred only in Seuss). I also found that particular vowels, occurring in the main-stressed position of a word, were sometimes highly predictive of Seuss, such as [ʌ]. Thus, in my more serious search, I included all 15 main-stressed vowels from the corpora as potential predictive factors.Footnote 5
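The pooling of word-initial onsets across the two corpora can be sketched as follows. This is an illustrative Python sketch, not the article's own R code: the function names, the simplified vowel inventory and the one-character-per-segment transcription are all assumptions made here, and the sketch leaves vowel-initial words (empty onsets) out of the pooled set, which the text does not specify.

```python
# Assumed, simplified vowel inventory; one character per segment.
VOWELS = set("iɪeɛæʌəɑɔoʊuaɜ")

def initial_onset(transcription: str) -> str:
    """Return the consonants preceding the first vowel ('' if vowel-initial)."""
    onset = []
    for segment in transcription:
        if segment in VOWELS:
            break
        onset.append(segment)
    return "".join(onset)

def attested_onsets(*corpora):
    """Pool the non-empty word-initial onsets found in any of the corpora."""
    pooled = {initial_onset(word) for corpus in corpora for word in corpus}
    return pooled - {""}

# Toy word lists (hypothetical transcriptions, for illustration only).
real_words = ["snʌb", "frʌm", "ʌp"]
coinages = ["snʌm", "ʃlɑts"]
onsets = attested_onsets(real_words, coinages)
```

Each onset in the pooled set would then serve as one candidate feature, whether it comes from the real-word list, the coinage list, or both.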

I also incorporated into my search some ideas taken from the research literature on sound symbolism, mostly from work on phonesthemes. These are short segmental sequences that don't fully qualify as morphemes, but nonetheless often impart a (perhaps vague) meaning to words that they contain. Phonesthemes are discussed extensively below in section 3.4. For purposes of including multiple phonesthemes in the initial search, I relied on the lists in Marchand (1960) and Hutchins (1998).

2.2 The culling procedure

The final model was made by reducing the original set of factors, described in the previous section, to a smaller set, each of whose members demonstrably contributes to predicting Seussian status in a logistic regression model.

I offer a brief note on logistic regression. Every principle that might help predict Seussian status is here termed a feature.Footnote 6 Each feature is given a particular number, called its weight. In my own setup, if the weight of a feature is positive, it means that the feature favors Seuss-coinage status; if the weight is negative, it means that the feature militates against Seussian status; and if it is zero, the feature is indifferent. Greater magnitudes of weights (either positive or negative) have greater effect.

The output of the model, for any given phonetically transcribed word, is a value ranging from zero to one, expressing the estimated probability that a word is Seussian; a perfect model would assign one to all Seuss coinages and zero to all ordinary words. Where computation enters is in setting the weights: one's chosen logistic-regression software will calculate the weights that best separate out the Seuss coinages in the data from the real words.Footnote 7
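In standard notation, the relation between weights and output probability can be written as below. This is the textbook form of logistic regression rather than a formula quoted from the article; the symbols $w_0$ (the intercept), $w_i$ (the weight of feature $i$) and $f_i(x)$ (the value of feature $i$ for word $x$) are labels chosen here for illustration.

```latex
P(\text{Seuss} \mid x) \;=\; \frac{1}{1 + \exp\!\Bigl(-\bigl(w_0 + \sum_{i} w_i\, f_i(x)\bigr)\Bigr)}
```

When the weighted sum in the exponent is zero the probability is exactly 0.5; positive sums push it toward one and negative sums toward zero.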

My chosen software was the bayesglm() function within the R statistics system. This version of logistic regression is somewhat conservative, assigning lower-magnitude weights than would obtain under the simplest forms of logistic regression.

I sought to trim my large candidate system of features into one that would be much smaller but perform almost as well. First, I removed all features that tested nonsignificant (by a .1 p value), then culled further with the stepAIC() function, which lets us keep a feature only if it creates improvement by the Akaike Information Criterion, a well-known measure that penalizes overly complex models.Footnote 8 Statistical evaluation of all models reported here is given below in appendix A.
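For reference, the Akaike Information Criterion has the standard definition below, where $k$ is the number of estimated parameters (here, the intercept plus one weight per retained feature) and $\hat{L}$ is the maximized likelihood of the model. The symbols are the conventional ones, not notation taken from the article.

```latex
\mathrm{AIC} \;=\; 2k \;-\; 2\ln\hat{L}
```

Lower values are better, so adding a feature is worthwhile by this criterion only if it raises the log-likelihood $\ln\hat{L}$ by more than one unit, enough to offset the penalty of 2 for the extra parameter.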

2.3 The features of the completed model

Tables 1 and 2 below give what I found. There is one non-specific feature in the system, the Intercept, which simply is a raw penalty against being Seussian – a sensible penalty, in light of the disparity in numbers. The weight of the Intercept is −4.80, which is quite large. Hence, for any form to receive a really strong Seussian probability, it must rack up substantial compensation from the positive-weighted features in order, as it were, to climb out of the hole.
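The size of the "hole" created by the intercept can be made concrete with a short calculation. The sketch below is illustrative Python rather than the article's R code; it assumes binary features (each active feature simply adds its weight), and the two feature weights of 2.40 are hypothetical values chosen only so that they cancel the intercept.

```python
import math

def logistic(z: float) -> float:
    """Map a weighted sum (log-odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

INTERCEPT = -4.80  # intercept weight reported in the text

def probability(active_feature_weights: list[float]) -> float:
    """Probability for a word whose active (binary) features have these weights."""
    return logistic(INTERCEPT + sum(active_feature_weights))

# A word that triggers no feature at all: a little under 1 percent.
baseline = probability([])

# Hypothetical weights, for illustration only: two features of weight 2.40
# exactly offset the intercept and bring the probability up to one half.
even_odds = probability([2.40, 2.40])
```

Under these assumptions a form needs positive feature weights summing to about 4.8 merely to reach even odds of being classified as Seussian.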

Table 1. Features that favor being a Seuss coinage

Table 2. Features that favor being a real word

For each feature in the tables, I give the following information:

  • The form of the feature; usually a sequence of phonemes. ‘[’ means that a feature counts the relevant sequence only if it is initial in the word; ‘]’ analogously means ‘final’; no bracket means ‘anywhere’. A few features deviate from this format and are described in words.

  • The weight of the feature. For intuitive interpretation of weights see footnote.Footnote 9

  • The number of words, both Seussian and real, that come under the scope of the feature, with representative Seussian examples.

  • Explanatory comments, where applicable; these serve as placeholders for the discussion to follow. ‘Marchand’ with page number means that a sequence has been identified as an English phonestheme by Marchand (1960), the reference source used for statistical testing in section 4.
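The bracket notation for features described in the list above can be read as a simple matching rule. The following Python sketch is one possible reading, not the article's implementation: the function name is hypothetical, transcriptions are treated as plain strings of phoneme symbols, and the choice to count repeated occurrences for unanchored features is an assumption.

```python
def feature_count(feature: str, transcription: str) -> int:
    """Count matches of a feature written in the tables' bracket notation.

    '[xy'  matches only word-initially, 'xy]' only word-finally,
    and a bare 'xy' matches anywhere in the word.
    """
    if feature.startswith("["):
        return int(transcription.startswith(feature[1:]))
    if feature.endswith("]"):
        return int(transcription.endswith(feature[:-1]))
    # Unanchored: count non-overlapping occurrences (an assumption).
    return transcription.count(feature)

# Example feature strings are illustrations of the notation only;
# they are not copied from tables 1 and 2.
initial_sn = feature_count("[sn", "snʌm")   # initial [sn] in Snumm
final_ts = feature_count("ts]", "ʃlɑts")    # final [ts] in a Schlottz-like form
no_match = feature_count("ts]", "tsɑ")      # [ts] present but not final
```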

2.4 How well does the model work?

We should not expect the model to make always-correct up-or-down decisions on whether a word is a Seuss coinage or a normal word; it would be remarkable if Seuss somehow managed to make every coinage fully distinguishable in this way. Rather, we should see if the model makes meaningful, useful distinctions. It emerges here that the model's average ‘probability is Seuss’ for Seussian coinages is 21.1 percent, whereas the average ‘probability is Seuss’ for normal words is only 1.9 percent.
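The two averages just reported are per-class means of the model's output. A minimal Python sketch of that computation follows; the function name is hypothetical and the toy probabilities are made up for illustration, not taken from the article's data.

```python
from statistics import mean

def mean_probability_by_class(probabilities, is_coinage):
    """Return (mean for coinages, mean for real words) of model probabilities."""
    coined = [p for p, c in zip(probabilities, is_coinage) if c]
    real = [p for p, c in zip(probabilities, is_coinage) if not c]
    return mean(coined), mean(real)

# Toy, made-up model outputs: two coinages followed by three real words.
toy_probs = [0.60, 0.10, 0.02, 0.00, 0.04]
toy_labels = [True, True, False, False, False]
coined_mean, real_mean = mean_probability_by_class(toy_probs, toy_labels)
```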

We can get a more detailed picture by comparing histograms. In figures 1 and 2, I plot the probabilities assigned by the model to the 435 Seuss words, compared to the probabilities assigned to the 17,744 real words (in the latter, the scale is compressed to accommodate them in the same vertical space).

Figure 1. Histogram of model probabilities of Seuss coinages

Figure 2. Histogram of model probabilities of real words

It should be clear that the model predicts a very different distribution for the Seussian coinages.

We can also examine the extremes of behavior. In table 3 are given the ten most ‘Seussian’ Seuss coinages. I include the particular phoneme sequences that are picked up by the features of tables 1 and 2 and converted, via the math of logistic regression, to high predicted probability.

Table 3. The ten Seuss coinages with the highest model probability

In less detail, (3) gives the ten least Seussian Seuss coinages, as well as the ‘most Seussian’ and ‘least Seussian’ real words (the latter consists of ten words randomly chosen from the 484 real words that got a 0.000 score).

  (3) Further examples of the model's behavior

    (a) Seuss coinages rated by the model as minimally Seussian

      lopulous (0.004), Lass-a-lack (0.004), Hippo-Heimer (0.004), Ronk (0.003), Antrum (0.003), Offt (0.003), Gee-Hossa-Flat (0.003), Solla (Sollew) (0.003), rippulous (0.002), Keck (0.001)

    (b) Real words rated as most Seussian by the model

      xerox (0.845), zinc (0.843), quartz (0.837), zigzag (0.837), waltz (0.757), snuggle (0.754), snuff (0.749), snoop (0.732), zip (0.722), snub (0.659)

    (c) Sample of real words rated as minimally Seussian by the model (all at 0.000)

      administration, appreciation, appreciative, chicanery, competitor, elaboration, electromagnetic, encyclopedic, immemorial, meteorological Footnote 11

These are meant mainly as a guide for the intuition, though the forms of (3b) evoke a further phenomenon: Seuss occasionally adapts a real word, often bearing Seussian phonological traits, to serve as a novel word; for example zip, respelled as Zipp, is used as a surname in Oh Say Can You Say?. These forms are discussed in appendix B.

One wonders whether the model could be improved by further work. I would judge that this is likely, since many of the Seussian coinages are assigned low scores but somehow sound Seussian to me, for example sporn, Jounce and tweetle, all with scores below 0.02 – something is still missing. However, I believe the model in its present form suffices for its intended purpose; namely, that we can inspect it, trying to find in its features some principles that will be informative about Seuss’s coinage practice in general terms.

3 The Seuss coinages: seeking general principles

I will put forth four proposed principles of Seussian coinage.

3.1 Meter

First, the Seussian words are skewed somewhat to make them fit easily into his hallmark meter, anapestic tetrameter – see features (k) and (p) in table 1. Since these metrical principles are so distinct from the main theme of this article, I have relegated discussion of them to appendix C below.

3.2 Phonotactic violations

As Nilsen (1977) observed, a noticeable minority of the coinages violate principles of English phonotactics, specifically word-level phonological well-formedness. For English phonotactics see e.g. Hammond (1999), Hayes & Wilson (2008) and Daland et al. (2011). The patterns noted below are probably not controversial.

First, a number of onsets found in the Seuss coinages are not permissible in the core English vocabulary (although they may occur in unassimilated borrowings).

  (4) Illegal onsets occurring in Seuss coinages

Among other onsets tagged in the feature-selection process described above, [θw] and [gw] are very unusual in real English words and might be regarded as hovering on the fringes of ill-formedness. The impossible coda [bsk] is found in Obsk (a bird in Scrambled Eggs Super), along with Tobsk and Nobsk in the same location.

The letter name Nuh, from On Beyond Zebra, ends in a lax vowel ([nʌ]),Footnote 12 something impossible in ordinary words and limited to quasi-gestural forms like uh ([ʌ]; hesitation noise) and duh ([dʌ]; used to indicate one's interlocutor has missed something obvious).

Lastly, consider Snumm, quoted above in (2). Along with Snimm (a proper name from Too Many Daves) this coinage violates a phonotactic principle discussed in Davis (1991): English avoids the occurrence of similar or identical consonants in the C positions of the formula sCVC. Davis’ constraints include, for instance, bans on /spVp/ and /skVk/ (spip or skeck would be odd as English words). In the present context, the relevant ban, also noticed by Davis, is on /sNVN/, where N is any nasal consonant. As Davis points out, no such words exist in English and I personally find smem, smun, snam (and indeed Snumm and Snimm) to sound odd.
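The sCVC restriction just described lends itself to a mechanical check. The Python sketch below is a deliberate simplification, not Davis's formulation or the article's code: it inspects only the start of the word, treats "similar" as "identical, or both nasal", and assumes a simplified vowel inventory with one character per segment. The function name is hypothetical.

```python
NASALS = set("mnŋ")
VOWELS = set("iɪeɛæʌəɑɔoʊuaɜ")  # assumed, simplified vowel inventory

def violates_scvc(transcription: str) -> bool:
    """True if the word begins s-C1-V-C2 with C1, C2 identical or both nasal."""
    if len(transcription) < 4 or transcription[0] != "s":
        return False
    c1, v, c2 = transcription[1], transcription[2], transcription[3]
    # Require the consonant-vowel-consonant shape after the initial s.
    if c1 in VOWELS or v not in VOWELS or c2 in VOWELS:
        return False
    return c1 == c2 or (c1 in NASALS and c2 in NASALS)

snumm = violates_scvc("snʌm")   # /sNVN/: Snumm
spip = violates_scvc("spɪp")    # /spVp/: identical stops
snub = violates_scvc("snʌb")    # ordinary word: no violation
```

On this reading Snumm and Snimm are flagged by the nasal clause, spip and skeck by the identity clause, and real words such as snub or stop pass.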

Unsurprisingly, none of these phonotactic violations is extreme, like, say, the use of uvular consonants or grossly sonority-violating initial or final clusters. It seems that Seuss wanted his words to sound funny, but would hardly want to inflict an impossible phonetic challenge on his readers.

The specific examples given above most likely are only the most salient cases of a more general pattern: Westbury et al.'s (2016) experiments suggest that phonotactically improbable English nonce words are more likely than chance to be felt as funny, and their sample of Seuss coinages emerged in the aggregate as less phonotactically probable than ordinary words.

3.3 Words that sound German

Nilsen (1977) and Teuber (2018) suggest that a number of the Seuss coinages sound like German words. Some of these have already been mentioned in the previous section: words beginning in [ʃl] and [ʃn] are aberrant in English, but are normal in German.

German is of course closely related to English and has similar phonotactics. Yet the phonological history of the language (see e.g. Chambers & Wilkie 1970) has produced some points of departure. By the Second Consonant Shift, historical Germanic *t and *p (preserved intact in English) evolved in certain contexts into [ts] and [pf], sequences that are very rare in English. Thus, the clusters in Katze [ˈkatsə] ‘cat’ and Kropf ‘(bird's) crop’ are a point of phonological divergence between German and English that is attested in multiple words. The [ʃl] and [ʃn] clusters just mentioned were also created by sound change, from historical *sl and *sn. All four of these German-linked patterns are shown in table 1 above to be statistically unambiguous features for Seuss coinages.Footnote 13

That these coinages were actually intended by Seuss to sound German is made plausible by several factors. First, the orthography Seuss chose for them is largely German, as in Gitz, Glotz, Schlottz, Schnutz (that is, tz not ts, sch not sh). Second, the texts include a few overt German cultural references, notably the blue-footed mandolinist Gretchen von Schwinn, from Oh Say Can You Say, and the castle of Krupp, from Dr. Seuss's Sleep Book. Lastly, Seuss’s German-styled coinage practice can be related to his own life history (Morgan & Morgan 1995): he grew up in a German-speaking family (he was third-generation) in Springfield, Massachusetts, a city that during his youth included a vibrant German-American community.

3.3.1 The German coinages and the American audience

It is only natural that Seuss, a popular artist, would have attempted to create coinages that would make sense to his readers. In the present context this raises the question of whether Seuss’s audience (mostly mid-century Americans) would have been able to identify Germanness in nonce words. An intriguing research finding by Oh et al. (2020) bears on this question: they show by experiment that non-Māori residents of New Zealand, very few of whom can actually speak Māori, nonetheless have an accurate sense of the phonotactic principles of the language, obtained from second-hand exposure. This suggests that if Seuss’s audience had enough second-hand exposure to German they likewise could have internalized a sense of what German phonology is like. It seems reasonable to me to claim that mid-century Americans did indeed have considerable exposure to German; this was the period following World War II, and closer to the historical time when German-Americans were the nation's largest ethnic minority.Footnote 14 Of course, even now many American Seuss readers would surely recognize Schlottz as a German-like word.Footnote 15

3.4 Phonesthemes

For present purposes, I define a phonestheme as the following: (i) it is a segment or segment sequence that occurs in multiple words; (ii) it has some vague, often expressive meaning; (iii) its ‘residue’ in a word is not a morpheme; e.g. in words of the form [ Ph X ]word, where Ph is a phonestheme, X is not in general an identifiable morpheme of the language. To give an example, initial [sn-] is a well-known phonestheme of English. Its meaning is (vaguely, as always) ‘having something to do with the nose’, as in snoot, snot, sneeze, snout, snuff, snore, sniff, sniffle, snort; and by extension ‘looking down the nose at’, snob, snooty, sneer, snicker, snide, sniffy and snub. Footnote 16 We will examine other phonesthemes below.

Phonesthemes are the topic of a large research literature,Footnote 17 which I briefly discuss before going on to the Seuss coinages.

3.4.1 Theories about phonesthemes

I see three basic lines of thought.

The first is least relevant here, so let us dispose of it up front. Phonesthemes, or at least many of them, are often said to have a natural phonetic basis, as in the affiliation of [i] (a low-sonority vowel with high F2) with smallness (Jespersen 1933). For a careful overview of this topic see Kawahara (2021b). For present purposes I believe it will be safe to ignore whether a phonestheme is natural or arbitrary.

More pertinently, there are different points of view about where phonesthemes come from and their role in language. One prominent viewpoint is the word affinities approach, put forth by Bolinger (1965) and Magnus (2001). This sees phonesthemes as the result of word comparison: human language learners comb through their lexicons, seeking all conceivable correlations between phoneme sequences and meaning. Of course, when pursued to a successful conclusion, this learning behavior yields knowledge of the authentic morphology, enabling most words to be parsed into a sequence of clearly defined, plainly meaningful morphemes. Phonesthemes, in contrast, are the morpheme candidates left on the workbench when learning doesn't fully succeed – hence, they occur in words whose ‘residues’ (X in [ Ph X ]word) are meaningless, their meanings are elusive, and native speaker judgments about them are difficult and ambivalent.Footnote 18

A rather different view on phonesthemes is put forth by Bloomfield (1933: 156), Wales (1990) and Joseph (1994), who emphasize the stylistic function of phonesthemes: phonesthemic words are characteristically vernacular in tone and expressive in function. Joseph (1994: 222, 229) articulates this view clearly, describing phonesthemic words as ‘expressive, affective, connotative’; they ‘add color to the language’. An important component of this view, put forth by Joseph, is that a word can include a phonestheme which embodies style without bearing any trace of the phonestheme's meaning. This will turn out to be important when we later turn to Seuss.

The stylistic function of phonesthemes arises, I suspect, from their use in word coinage. Speakers obviously do not concoct phonologically novel words for the purpose of making their meaning clear; rather, these coinages are intended to make an impression, based on their imaginative phonological content. Earlier (section 1), I mentioned the apparent fact that many of our existing words originated as phonesthetic coinages, the work of creative speakers long forgotten. It is not unreasonable to regard these coinages, at least at the moment of origin, as the deployment of phonesthemes in the service of verbal folk art.Footnote 19 Here, I suggest that Seuss embraced this art form as part of his own distinctive vernacular style.

I have now given two accounts of phonesthemes, but how do we integrate them? Here again, word coinage provides the key: the anonymous verbal artists who coin new words draw on the set of word affinities to make their words more vivid as well as more intelligible. Although the phonesthemes originate with word affinities, the fact that they are repeatedly used to create new vernacular words over time means that the phonesthemes themselves are likely eventually to acquire the vernacular stylistic tinge. And the process may be self-feeding: the acquired stylistic tinge invites word-coiners to make use of the phonestheme more frequently, a virtuous cycle.

3.4.2 Phonesthemic words: a three-way distinction

With the above general discussion in mind, we now turn to a proposed taxonomy of the words in which the phonesthemes occur. The idea is that for any given phonestheme, we will normally find words that fit into each of the following categories.

  (5) A three-way classification for phonesthemic vocabulary

    (a) Words in the meaningful core of a phonestheme (‘core words’) both contain the phonestheme and bear the appropriate meaning.

    (b) Words in the penumbra of a phonestheme contain the phonestheme and also convey the vivid, expressive character of phonesthetic style; but they do not bear the meaning of the phonestheme.

    (c) Words in the neutral zone of a phonestheme contain the segments of the phonestheme but do not bear the meaning of the phonestheme and lack a vivid, expressive meaning; they are not phonesthetic.Footnote 20

I illustrate this taxonomy for the ‘nasal’ phonestheme [sn-] already mentioned. The core of this phonestheme would include the words with nasal meaning enumerated earlier: snoot, snot, sneeze, snout, snuff, snore, sniff, sniffle, snort, snob, snooty, sneer, snicker, snide, sniffy, snub. What of the penumbra? I suggest that it includes words like snazzy, snag, snap, snatch, sneak, snip, snitch, snoop and snug. These seem unnasal in their meaning, but they are nonetheless expressive, in the way that phonesthemes characteristically are. To defend this claim, I juxtapose some words that occupy the penumbra of the [sn-] phonestheme with their literal near-equivalents:

  (6) Comparing penumbral phonestheme words with literal expression

Here are further comparisons, in this case involving words in the penumbrae of two phonesthemes to be discussed below, [z-] and [j-]:

  (7) Further comparisons with [z-] and [j-]

I think it is clear that the words in (6) and (7) that include the phonestheme are more vivid and more colloquial. The implication is that a phonestheme does not require its core meaning to be present to render its stylistic effect.

Unsurprisingly, the element of vivid style that is the sole phonesthemic property of penumbral words is also found in the words of the core, as the comparisons of (8) suggest.

  (8) The stylistic effect of phonesthemes in core vocabulary

Consider next the neutral zone. It is treated here as the set of words that accidentally contain the segments of a phonestheme, in the same way that, say, lens accidentally contains the [-z] of the plural suffix. This zone can be a source of frustration to anyone lecturing about phonesthemes, whose audience is naturally inclined to ask, ‘What about word X? Isn't that a counterexample?’ It seems best to acknowledge that most phonesthemes do have a neutral zone, but the existence of this zone should not be taken as counterevidence to the existence of the phonestheme – in pointing out a phonestheme, we are only pointing out a pattern that is too frequent to be coincidence, not an implicational law. Indeed, Hutchins’ (1998) experiments affirmed psychological reality for phonesthemes that possess a demonstrable neutral zone.

The neutral zone for [sn-], a potent phonestheme, is small; I suggest that two plausible candidate words are snow and snail.

3.4.3 ‘Gravitational attraction’ in phonesthemes

A number of scholars (e.g. Jespersen 1922: 407; Malkiel 1990; Magnus 2001: 8, 72; Pentangelo 2020) have suggested that phonesthemes exert a kind of gravitational attraction, drawing additional words into their membership by adjusting either their formFootnote 21 or their meaning. In present terms this claim can be elaborated a bit: I suggest that members of the penumbra may gradually assume semantic properties of the core, and members of the neutral zone may be drawn into the core or penumbra, becoming regarded as phonesthetic and expressive. Such drift is likely to be the result of language misacquisition; children are prone to mislearn either the style level or the meaning of phonesthetic words.

Here is an example of drift into the core: Malkiel (1990: 99–110) documents an Italian phonestheme of the form CVCiCiV (CiCi a geminate) with meaning ‘negative, or ridiculous, or both’, which has pulled in words that were formerly neutral such as nullo ‘nothing’ and secco ‘dry’, giving them novel secondary usages that fit the core meaning. Another example is the extraordinary semantic drift of English snob (roughly, from ‘lowlife’ to ‘one who looks down on others’), documented in the OED. For drift into the penumbra, I am on more speculative ground, but the reader may wish to ponder the words snooker and snipe. I feel that they belong in the penumbra, not the neutral zone, of [sn-]: as words they seem absurdly jokey and vivid for the purpose of denoting an ordinary indoor sport and bird species.Footnote 22 The Broadway composer Irving Berlin evidently felt a sense of pull for the phonestheme [j-] when he wrote the musical Yip Yip Yaphank, attracting the name of the Long Island town where he did his Army service ([ˈjæpæŋk]) into the penumbra of the [j-] phonestheme.

What enables a neutral-zone word to resist the inward pull of its component phonestheme? I suspect frequency matters: in my lexical database, the most frequent words (per CELEX) beginning with the phonesthemes discussed here have at most a modest penumbral tinge: snow,Footnote 23 Z, zone, year, use, young. The other cause of phonestheme resistance is speech register: formal or technical words are incompatible with the stylistic character of phonesthesia, and so they can contain the phonesthemic sequence without being pulled in: Snell's Law, zoning, zinc, zinnia, yarrow, ubiquity. For further discussion see Magnus (2001: 5, 10, 34, 72).

An important implication of the above for present purposes is this: a word coined for purposes of writing a children's book would be unlikely to occupy the neutral zone of any phonestheme it contains. If an author uses the segments of a phonestheme, it will probably be perceived by readers as being intended as a phonestheme. The word frequency of a coinage is very low (i.e. zero); and technical or formal vocabulary would hardly be expected in a children's book.
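The reasoning of this section can be caricatured as a small decision procedure. The Python sketch below is only an illustration: the `Word` record, its three boolean fields and the function `zone` are hypothetical names introduced here, and the hard yes/no decisions stand in for what the text treats as gradient tendencies.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical record for a word that contains the segments of a phonestheme."""
    form: str
    has_phonesthemic_meaning: bool  # e.g. a nasal meaning for [sn-]
    is_frequent: bool               # high lexical frequency
    is_formal_or_technical: bool    # belongs to a formal or technical register


def zone(word: Word) -> str:
    """Assign a word to core, penumbra or neutral zone (a simplifying assumption)."""
    if word.has_phonesthemic_meaning:
        return "core"
    # Frequency and formal register are the two proposed sources of resistance.
    if word.is_frequent or word.is_formal_or_technical:
        return "neutral"
    return "penumbra"


# A coinage has zero frequency and is not formal, so it never comes out neutral.
print(zone(Word("Sneedle", True, False, False)))  # core
print(zone(Word("Snumm", False, False, False)))   # penumbra
print(zone(Word("zinc", False, False, True)))     # neutral
```

Under these assumptions a children's-book coinage can only land in the core or the penumbra, which is the implication drawn in the text.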

3.4.4 Phonesthemes in general: summary

Summing up, in the discussion below I will approach Seuss’s coinages from the viewpoint of the three-way taxonomy of (5), which emphasizes (a) the stylistic role of phonesthemes; (b) the possibility of phonesthemes that convey style but not the relevant meaning; (c) gravitational attraction, under specified conditions, from neutral zone to penumbra to core. These ideas can be connected to the rough theories of phonesthemes discussed above. The core words acquire their phonesthetic meaning via the word-comparison process, during language acquisition. Core words tend to be felt as vernacular for the reason given earlier: that use in coinages over time gradually lends the phonestheme a vernacular tone. Words of the penumbra have meanings that cannot be accommodated within the phonestheme's semantic territory, but speakers nevertheless apprehend their vernacular character, either from context, or simply by adopting the reasonable hypothesis that whatever is phonesthemic is also vernacular. Lastly, neutral zone words are the words that can escape the gravitational-attraction mechanism: either they are so frequent that they can maintain their style and meaning on their own, or they fall into a dry, technical lexical domain, so that no one would think of using them in vernacular style.

At this point we can turn to some of the particular phonesthemes used in Seuss’s coinages. I will argue that a minority of the phonesthemic usages in Seuss are core, the rest are penumbral and none are neutral.

3.4.5 [sn-]

The 21 Seuss coinages that begin with the ‘nasal’ phonestheme [sn-] are given in (9):

  1. (9) Coinages with [sn-]

snop, snarggled, Snarp, snaff, Snux, Snumm, snuv, Snegg, Sneth, Snick, Snimm, Snee, Sneetch, Sneetcher, Sneedle, Sneeden, Sneelock, Sneepy, Snooker(s), Snoo, Snoor

Of these, I have identified four as belonging to the core of [sn-]. Snaff, from The Big Brag, inherits the phonesthemic status of sniff, of which it is a jocular past tense. Snarggled appears in a sequence of verbs with sneezed, snuffled and sniffed, describing inhalation of polluted air, in The Lorax. The snobbish Sneetches plainly qualify, per Seuss’s description:

  1. (10) With their snoots in the air, they would sniff and they'd snort

A more subtle case is the Sneedle, from On Beyond Zebra: this is an insect whose nose takes the form of a large and frightening stinger:

  1. (11) Then we go on to SNEE. And the SNEE is for Sneedle

    A terrible kind of ferocious mos-keedle

    Whose hum-dinger stinger is sharp as a needle.

However, this seems to exhaust the core [sn-] words in Seuss, as the remaining 17 [sn-] coinages are slim pickings for anyone seeking out nasal meaning. For instance, the Drum-Tummied Snumm, from (2) above, has a spectacular tummy, but a very ordinary nose. Elsewhere in If I Ran the Circus, neither the Harp-Twanging Snarp nor Mr. Sneelock seems nasal in any way, and the same goes for the remaining words in (9). I would suggest that these forms are indeed penumbral; i.e. expressive but not nasally meaningful.

3.4.6 [z-]

[z-] is given short shrift by my primary reference, Marchand (1960) (‘an infrequent initial’) but is taken more seriously by Wescott (1980), who demonstrates considerable productivity for it. Let us consider the cases from my own data corpus. The 47 real [z]-initial words in my dictionary include eight that seem fairly clearly phonesthetic: zest, zigzag, zing, zip, zoom, zot, zany and zap. Were I to try to define the core meaning of [z-], I would guess something like ‘with great liveliness’. Thus, a person who is zany is not just somewhat crazy, but crazy in a lively way; for a lizard to zap an insect it must make a very abrupt movement of its tongue.

The [z-] phonestheme also appears to have a penumbra. For example, zit is a very expressive way to denote a pimple, but pimples are not lively. Zilch means ‘nothing’, but is used to express the idea with feeling and humor. Zonked is plainly expressive but denotes stupor rather than liveliness. There is a neutral zone, composed of technical expressions like zinc and zinnia. A possible example of a neutral-zone word drawn toward the penumbra is Zenith, a brand name that did well for selling television sets in Seuss’s day.

[z-] has been noticed before by scholars of the Seuss coinages (Teuber 2018; Keyes 2021) and is indeed the most frequent phonestheme in his work, with 40 occurrences.

  1. (12) Coinages with [z-]

Zomba-ma-tant, Zozzfozzel, Zax, Zans, zang, Zatz, Zatz-it, zazz, Zuff, Zuk, zum, zummer, Zummz, Zummzian, Zutt, zuzz, Zall, Zong, Zorn, Zower, Zike, Zed, Zellar, Zelf, Zable, Zayt, Zidd, Ziff, Zillow, Zinn-a-Zu, Zind, Zinzibar, zizz, Zizzer-Zoof, Zizzy, Zizzer-Zazzer-Zuzz, Zeep, Zook, Zooie, zoop

Searching through their meanings, we again find just a few cases in which the [z]-word occupies the phonesthemic core. Zoop is part of Zoom-a-zoop, describing a virtuosic trapeze act in If I Ran the Circus. When the bird character Gertrude McFuzz suddenly sprouts a spectacular tail to satisfy her vanity, she does it ‘With a zang! With a zing!’. With an extension to not-quite-initial position, we may include G-r-r-zapp, G-r-r-zibb, G-r-r-zopp, the sounds of the arrows shot by the Yeoman of the Bowmen in The 500 Hats of Bartholomew Cubbins. However, most instances of [z-] in Seuss appear to be only penumbral. Notably, several [z]-initial Seussian animals are placid and serene: the Ziffs and Zuffs of Scrambled Eggs Super, the Zizzer-Zazzer-Zuzz of Dr. Seuss's ABC, the Zatz-it of On Beyond Zebra, and the Zans and the Zeep of One Fish Two Fish Red Fish Blue Fish. [fn. 24]

3.4.7 [j-]

For Marchand (1960: 334) this phonestheme is for ‘words expressive of vocal sounds’; my own preference would be to characterize it as ‘vigorous or uncontrolled vocalization’. Core examples are: yahoo, yammer, yatter, yap, yawp, yell, yelp, yip, yipe, yippee, yo, yowl and yodel. Some penumbral words are yo-yo, yank and Yankee. [j-] is a ‘weaker’ phonestheme than the other two and it has a large neutral zone including words like yellow [fn. 25], yoke, yarn, yolk and eucharist. Neutral zone words that (for me at least) risk falling into the penumbra are Yonkers, yak and yam, which seem a bit silly for purposes of denoting a city, an animal and a vegetable; see also Yaphank, above.

As a phonestheme in Seuss [j-] includes the following core items:

  1. (13) Core [j-]-initial coinages in Seuss

    1. (a) YOPP, the cry for help made by a small Who that saves the Whos from destruction (Horton Hears a Who) [fn. 26]

    2. (b) Yekko, a beast who ‘howls in an underground grotto in Gekko’ (On Beyond Zebra)

    3. (c) Ying, a creature with whom it is fun to sing (One Fish Two Fish)

But as before, the penumbral examples outnumber them: these include Yop (this time a name of a creature, in One Fish Two Fish); Yink, another creature in One Fish Two Fish; and Yupster, a place name in On Beyond Zebra. There are about ten other cases.

To sum up this section: the patterning of phonesthemes in Seuss’s coinages matches their behavior in real language: we find full-blown core coinages like Sneedle, bearing the appropriate meaning; as well as penumbral coinages like Snumm, in which the phonestheme provides only expressiveness and style. The third case, namely appearance of the phonesthemic segments without any phonesthetic effect at all, appears to be impossible, since in real life these cases exist only among words that are frequent or learned, neither of which could plausibly be used in a Seuss coinage.

4 Are these speculations on the right track? A statistical test

To return to the main thread, we sought to explain in general terms Seuss’s coinage practice, and came up with four hypotheses:

  • Words that match Seuss’s meter are likely to be Seuss coinages.

  • Words that are phonotactically aberrant are likely to be Seuss coinages.

  • Words that sound German are likely to be Seuss coinages.

  • Words that contain phonesthemes are likely to be Seuss coinages.

The statistical model described in section 2 was meant to provide the raw material for evaluating these hypotheses in detail. However, that model only tests phoneme sequences as such, and we have not yet tested whether it is really true that it is the phonesthemic status of these sequences, as I have claimed, that is essential. Perhaps Seuss’s practice is systematic, but has nothing to do with phonesthemes. Hypothesis-testing in this domain is not straightforward given the notorious subjectivity of phonesthemic analysis.

Hoping to find objectivity, I constructed a second logistic regression model on a different basis. Whereas the previous model was an attempt to scrutinize a great number of potential features, hoping to find the best ones, my second model implements only the proposed phonesthemes found in one single reference source, Marchand (1960); I will call it the Marchand Model. [fn. 27] The model is less complete and accurate than the Full Model given in tables 1 and 2, but it is arguably objective. Marchand had no ax to grind concerning Seuss, but simply compiled a long list, offering his considered and informed judgment (based on examination of numerous examples) of whether a particular sequence was phonesthemic.

In compiling his list it is clear that Marchand examined all English vowels, all possible onsets and a great many syllable rhymes. [fn. 28] In these domains, if Marchand makes no mention of a sequence, it is reasonable to infer that he saw no reason to call it a phonestheme. Unsurprisingly, there is much overlap in the features of the Marchand Model with my Full Model, and to show this, I included the page numbers of Marchand's discussion of these various sequences in tables 1 and 2 above. As before, the complete Marchand Model may be inspected in the Supplemental Materials.

I made two versions of the Marchand Model, coarse-grained and fine-grained. The coarse-grained version implements the intended statistical test. It has just five features, shown in table 4.

Table 4. Features of the coarse-grained Marchand Model

I fitted the coarse-grained Marchand Model to the same data as before, again using the BayesGLM() package in R. The constraint weights and significance values that were calculated are given in (14).

  1. (14) Result of fitting the coarse-grained Marchand Model

The model, being so coarse, is far less effective than the model of tables 1 and 2 in predicting Seussian status; see appendix A for details. The key point of the model is that it permits Marchand's independent testimony to bear on the question of whether Seuss’s practice is indeed phonesthemic. The results of the model are that Germanness, phonotactic ill-formedness and metrical appropriateness all test significant as factors for predicting Seussian status. In addition, phonesthemic status, for sequences identified as such by Marchand, also matters; although the constraint weights may be lower than those for Germanness and phonotactics, the number of words covered is considerably larger. [fn. 30, fn. 31]
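To make concrete how a model of this kind turns feature values into a predicted probability of Seussian status, here is a minimal Python sketch. Everything specific in it is an assumption made for illustration: the five feature names are guesses at the content of table 4, and the weights and intercept are invented placeholders, not the fitted values of (14).

```python
import math

# Hypothetical weights for five binary features (NOT the fitted values in (14)).
WEIGHTS = {
    "german": 2.0,
    "phonotactically_ill_formed": 2.0,
    "metrically_apt": 1.0,
    "marchand_onset": 1.0,
    "marchand_rhyme": 1.0,
}
INTERCEPT = -5.0  # hypothetical; a low value reflects the rarity of coinages


def seuss_probability(features: dict) -> float:
    """Logistic regression: sum the weights of the active features, then squash."""
    score = INTERCEPT + sum(
        weight for name, weight in WEIGHTS.items() if features.get(name, False)
    )
    return 1.0 / (1.0 + math.exp(-score))


plain = seuss_probability({})
phonesthemic = seuss_probability({"marchand_onset": True, "marchand_rhyme": True})
print(plain < phonesthemic)  # True: active features raise the probability
```

In this setup a feature with a modest weight can still matter a great deal overall if it applies to many words, which is the point made above about the Marchand features.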

To obtain a more detailed look, I also ran a fine-grained version of the Marchand Model that separates out all the Marchand-mentioned features (there are 44 for onsets and 116 for rhymes). The result, available in the Supplemental Materials, demonstrates that most of the work of predicting Seussian status is being done by a fairly small subset of Marchand's features; only 21 of 160 meet the criterion of bearing a weight of at least 1 and receiving a p-value < .001. [fn. 32]
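The selection criterion just stated (weight of at least 1, p-value below .001) amounts to a simple filter over the fitted features. The Python sketch below shows one way to express it; the function name and the (weight, p-value) pairs are invented placeholders, since the real figures are in the Supplemental Materials and are not reproduced here.

```python
def strong_features(fitted: dict, min_weight: float = 1.0, max_p: float = 0.001) -> list:
    """Return features with weight >= min_weight and p < max_p, heaviest first."""
    kept = [
        (name, weight)
        for name, (weight, p_value) in fitted.items()
        if weight >= min_weight and p_value < max_p
    ]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in kept]


# Placeholder (weight, p-value) pairs, invented purely for illustration.
example_fit = {
    "[z-]": (3.0, 0.0001),
    "[sn-]": (2.5, 0.0002),
    "hypothetical_weak": (0.4, 0.0001),       # significant, but weight too low
    "hypothetical_nonsig": (1.2, 0.03),       # heavy enough, but not significant
}
print(strong_features(example_fit))  # ['[z-]', '[sn-]']
```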

The upshot of these studies, I believe, is as follows. If we agree to take Marchand as an impartial witness for phonesthemic status, then it seems almost certain that Seuss is using phonesthemes when he coins words. Further, Seuss is making use of only a modest subset of Marchand's phonesthemes. There are at least two possible reasons for this. First, Marchand may have been overenthusiastic in positing phonesthemes (I tend to think so, particularly among the onsets). Second, Seuss was perhaps making an unconscious artistic decision, choosing his favorites from a larger available inventory.

5 Conclusions

Verbal artists, particularly popular artists, must rely on phonological resources they share with their reading community. This dictum is confirmed by Seuss’s coinage practice. First, native speakers of English internalize a detailed phonotactics of their language, which leads them to be amused by novel words like Thneed. Speakers also have some ability to internalize phonotactic principles of languages they don't speak but find accessible, and hence can be entertained by novel pseudo-German words like Schlottz. Lastly, they have internalized a system of phonesthemes, which gives them the ability to appreciate novel phonesthemic words. Just like real-life phonesthetic words, coined ones may either include the semantic component of the phonestheme, as in Sneedle, or exclude it, with the phonestheme offering only a sense of style, as in Snumm.

It goes without saying that the rigor of the research reported in this article would be increased by extensive experimentation, in the research tradition of, e.g., Fordyce (1988), Hutchins (1998) and Bergen (2004). We would like to know more about which proposed phonesthemes are actually internalized by native speakers, what meanings they are assigned, whether my proposed ‘penumbra’ (section 3.4.2) is psychologically real, and (on a different topic) the extent to which older American English speakers (the original Seuss audience) have internalized the phonotactics of German (section 3.3.1). Since my account depends on the ability of people to learn the stylistic affiliation of particular linguistic entities, we are also in need of a theory of how this is done.

Lastly, it might also be useful to carry out studies comparing Seuss’s use of phonesthemes with that of other word-coiners – in literature, in ordinary life and in industry (see Wong 2014, and the Pokémon research cited in section 1). I imagine that such study would find considerable variation. While Seuss’s choices were principled, they probably exploit only a subset of the resources the language offers; this is what the fine-grained Marchand Model (section 4) suggests. My conjecture reflects the view (section 3.4) that word coinage is a folk art: within the limits of what their language makes available, verbal artists can make choices.

Appendix A: Performance of the models compared

The metrics given here are explicated in, e.g., Johnson (2008).

Appendix B: Coinages homophonous with real words

I excluded from the regression analyses the 66 Seuss ‘coinages’ that exist as real words, but are used in context as novel. For instance, Flummox is used in Seuss not as a verb but as a noun, the name of an imaginary creature in If I Ran the Circus. From the viewpoint of the Full Model developed in section 2, these ‘semi-coinages’ appear to have an intermediate status, as the figures in (15) indicate.

The ten real words with highest Seussian model probability are Zipp (0.722), Flummox (0.489), Krupp (0.338), guff (0.329), Snide (0.264), Fuddle (0.227), Quibble (0.222), duff (0.179), Snell (0.167) and Gusset (0.15).

Appendix C: The coinage principles motivated by meter

I discuss in this appendix two principles of Seuss coinage that are based on the fact that he would naturally favor coinages that fit well with his favorite meter, which is anapestic tetrameter, base form /x x X x x X x x X x x X/. Seuss took care to write this meter with strict adherence to syllable count; i.e. he sought never to invoke the license (common among his parodists) of substituting a binary for a ternary foot. Hence, words like Bippo-no-Bungus or Motta-fa-Potta-fa-Pell [fn. 33], which have the sequence ˈσ σ̆ σ̆ ˈσ, serve a useful metrical purpose. [fn. 34] They are fairly common in the Seuss oeuvre, despite his general dispreference for long coinages (table 2, (a)).
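The metrical usefulness of such words can be illustrated with a short Python sketch that checks where a word's stress pattern lines up with the base form of the meter. The function name, the strict position-by-position matching and the stress transcriptions of the two example words (X for stressed, x for stressless) are assumptions made here for illustration; real scansion is more flexible than this.

```python
# Base form of anapestic tetrameter: x = weak position, X = strong position.
TEMPLATE = "xxXxxXxxXxxX"


def meter_offsets(stress_pattern: str, template: str = TEMPLATE) -> list[int]:
    """Return every offset at which the stress pattern lines up with the template.

    Strict matching (stressed syllables only in X, stressless only in x)
    is a simplifying assumption of this sketch.
    """
    n = len(stress_pattern)
    return [
        i for i in range(len(template) - n + 1)
        if template[i:i + n] == stress_pattern
    ]


print(meter_offsets("XxxXx"))    # Bippo-no-Bungus (assumed stressing) -> [2, 5]
print(meter_offsets("XxxXxxX"))  # Motta-fa-Potta-fa-Pell (assumed)    -> [2, 5]
print(meter_offsets("XxXx"))     # a binary (trochaic) pattern         -> []
```

Words containing the ternary sequence fit the line at more than one position, whereas a word with binary alternation fits nowhere unless a foot substitution is allowed.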

A special case is found in words that include the medial sequence [əmə], as in Katta-ma-side, Yuzz-a-ma-Tuzz and six similar instances. This sequence appears in real English words, usually vernacular; the cases known to me are razzamatazz, rigamarole, tacamahac, Fishamajig, whatcha-ma-callit, thingamajig(ger), thingamabob and Kalamazoo (the last three appear in Seuss). The morphological status of [-əmə-] is obscure to me, but it does seem to be productive (e.g. rigamarole evolved from earlier rigmarole; OED); perhaps it is a phonestheme. In any event, [-əmə-] provided Seuss with a source for words that are both colloquial and metrically facilitating.


This article has benefited from input by audience members in the UCLA Phonology Seminar and a later public lecture given by the author celebrating the creation of the Theresa M. and Henry P. Biggs Centennial Term Chair in Linguistics at UCLA. I would like to thank the Biggses for making the latter occasion possible. Thanks as well to Stuart Davis, Shigeto Kawahara, Claire Moore-Cantwell, Kie Zuraw and the ELL reviewers for their helpful input. In addition, I would also like to acknowledge two persons who shared with me the exhilarating experience of reading Seuss aloud: my late mother, who read the books to me when I was a child, and my son Peter, to whom I read them when he was a child.

Since I have been working on Seuss, people occasionally have asked me about the political controversy currently surrounding him. My reply is to suggest reading about him before arriving at any conclusions. Morgan & Morgan (1995) is a good place to start.

2 The work just cited tends to track very general properties, such as ‘number of voiced obstruents’ or ‘number of syllables’, whereas I have found it useful to zero in on particular phonemic sequences. In Fordyce's taxonomy (Fordyce 1988: 238–9), I am largely addressing phonesthemes, whereas Shih et al. and Kawahara are addressing sound symbolism.

3 Indeed, this is not unknown in Seuss, as a reviewer points out: a-snooze, from How the Grinch Stole Christmas, follows the word formation rule responsible for aflame, adrift, etc. (Marchand 1960: 92).

4 By this I mean adult speakers, whom I consider to be the criterial audience. Like most children's book authors, Seuss sought to keep the adult reader engaged, and it seems fairly certain that much of his technique (e.g. the use of Germanisms; section 3.3) goes over children's heads. If the comparison lexicon were to be restricted to words likely to be known to children, then the model would probably be less accurate in predicting Seussian status.

5 For both cases above, it might have been possible to generalize the initial findings, using standard phonological features in the normal way. However, I usually found this was not all that helpful (see fn. 10 for the one case that I retained in the final model). Since generalizing the features vastly expands the number of hypotheses to consider and makes diagnosis harder, I did not pursue it any further.

6 The terminology comes, I believe, from computer science. In linguistics, we would be more likely to call the features constraints, following the research tradition of Optimality Theory (Prince & Smolensky 1993/2004). However, constraints normally act only as penalties for particular candidates, and here they more often reward them; hence the terminology.

7 The method of computation used here is a standard one, namely to maximize likelihood, the probability predicted by the model for the corpus as a whole. This is the product of the probability assigned to Seussian-status for all 435 Seuss coinages, multiplied by the probability assigned to real-status for all 17,744 real words. Maximizing this product (it would approach 1 in a perfect model, 0 in a perfectly bad model) maximizes what we intuit as model effectiveness. Actual likelihood achieved by this and other models is reported (in log form) in appendix A.
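The likelihood described in this footnote can be written out in a few lines of Python. This is a minimal sketch, not the fitting procedure itself: the function name and the toy probabilities are assumptions, and the computation is done in log form (a sum of logs rather than a product), as in appendix A.

```python
import math


def log_likelihood(seuss_probs, is_seuss):
    """Log of the product of the probabilities assigned to each word's true status.

    seuss_probs[i] is the model's predicted probability that word i is a Seuss
    coinage; is_seuss[i] is True for coinages and False for real words.
    Probabilities are assumed to lie strictly between 0 and 1 for this sketch.
    """
    total = 0.0
    for p, coined in zip(seuss_probs, is_seuss):
        total += math.log(p if coined else 1.0 - p)
    return total


# Toy example with one coinage and one real word (invented probabilities).
print(log_likelihood([0.9, 0.2], [True, False]))  # log(0.9) + log(0.8)
```

A better model assigns higher probability to each word's actual status, so its log likelihood lies closer to 0.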

8 It would have been more principled to use stepAIC() to do all of the culling, but this task overwhelmed my computing equipment.

9 By the math of logistic regression, it can be shown that the formula e^weight tells us the ratio of predicted probability between otherwise-identical forms that do and do not come under the scope of a feature. A weight difference of 5 means a probability ratio of about 150, 2 about 7, 1 about 2.7, .5 about 1.6, and 0 means equal probability.
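The figures in this footnote can be checked directly by exponentiating each weight. In the short Python sketch below the function name is an assumption. Strictly speaking, the exponentiated weight is the ratio of odds; it approximates the ratio of probabilities when the probabilities involved are small, as they are for most words here.

```python
import math


def ratio_for_weight(weight: float) -> float:
    """Exponentiate a weight difference to get the corresponding (odds) ratio."""
    return math.exp(weight)


for w in (5, 2, 1, 0.5, 0):
    print(w, round(ratio_for_weight(w), 1))
# 5 148.4
# 2 7.4
# 1 2.7
# 0.5 1.6
# 0 1.0
```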

10 The weight is low, but this is because there are nine other features in the system that do the same work, essentially preempting this general constraint.

11 Perhaps more informative is a sampling of minimal-Seuss-score monosyllables, which are for the most part not from learned vocabulary: case, coal, cork, corn, course, shake, stake, stave, stove, sty.

12 We know this because in On Beyond Zebra Nuh is the letter used to spell Nutches, which rhymes with hutches.

13 I did not include in my feature set any phoneme sequences that correspond to actual German morphemes, but these do occur a number of times in the coinages: Herk-heimer, Hippo-heimer, Eiffelberg, Eisenbart, Bickelbaum, Biffer-baum, Katzen-bein, Katzen-stein, Spritz.

14 For what it's worth, I myself was a mid-century American and judge that during my childhood the United States was far more attuned to Germanness than it is today. The mid-twentieth century was the heyday of Wernher von Braun, ‘Hogan's Heroes’, beers like Schlitz, Pabst, Schmidt and Rheingold, stores with names like Ski Haus and Cheese Haus, and a fast-food chain called Wienerschnitzel.

15 A final note on Germanness: we will never know what the tight-lipped Seuss thought about his Germanisms, but they could not have been a fully neutral topic for him. In 1917, when the United States joined the fight against Germany, a xenophobic campaign to deculturalize German-Americans took place, including in Springfield, where the young Seuss was beaten in the street (Morgan & Morgan 1995).

16 Throughout, I limit examples to words I know, hopefully thereby approximating the words Seuss knew. The Oxford English Dictionary offers a wealth of obscure and obsolete words that include one of the three focus phonesthemes discussed below ([sn-], [z-], [j-]). I believe including these words would lead to similar conclusions.

17 Some helpful entry points to this literature are Schmidtke et al. 2014 (psychology and cognitive science), Magnus 2001 (linguistics) and Kawahara 2020 (linguistics). On the role of phonesthemes in word coinage (i.e. by ordinary people of the past) see Pentangelo (2020). The idea that Seuss uses phonesthemes in his coinages was first put forth by Teuber (2018).

18 Interestingly, in recent years it has become possible to implement the word-affinities approach as a computational model (Otis & Sagi 2008; Liu et al. 2018), since there exist ways to approximate meaning using text distributions.

19 Wales (1990) aptly refers to the coiners of novel (vernacular) words as ‘folk poets’.

20 Citations: (5a) is agreed upon by all. To my knowledge, only Joseph (1994: 229–30) has ever noticed (5b), the penumbra. The neutral zone, (5c), is widely noted, e.g. by Jespersen (1922: 406) and Fordyce (1988: 177). Note that my use of the term ‘core’ differs from that of Fordyce, who uses it to describe those words that embody the phonestheme's meaning most clearly and saliently. That degree of adherence to the meaning of a phonestheme is gradient seems clear from Fordyce's as well as Hutchins’ (1998) experiments.

21 For example, Jespersen suggests that peep originated as a phonesthetic ‘repair’ of pipe, a word which had lost its phonesthetic appropriateness when the Great Vowel Shift altered its vowel from [iː] to [aɪ].

22 It is circular reasoning, but worth pointing out, that Seuss used both words in his books: snipe appear (as such) in If I Ran the Circus, and Snookers is a surname in Happy Birthday to You. A journalist calls snooker ‘the funniest word I have ever heard’;

23 Snow is perhaps a core word for sniffers of white powder cocaine, a usage dated to 1914 by the OED.

24 A possibility to consider is that the [z-] phonestheme possesses two cores, the second of which evokes sleepiness. Some relevant real words I have noticed are zzz (orthographic phonestheme denoting snoring), zone out and zonked. For Seuss, several of the animals cited above are portrayed as sleepy. For multiple-core phonesthemes see Fordyce (1988: 194–5), Wales (1990), Magnus (2001) and Pentangelo (2020).

25 Yellow is perhaps penumbral when used to mean ‘cowardly’.

26 As MacDonald (1988: 86) observes, the climactic YOPP is prepared by the appearance of four evidently phonesthemic [j-] words (yapping, yipping, yo-yo, yip) in the immediately preceding pages. One is reminded of Bergen's (2004) experimental finding that phonesthemes can be primed.

27 I picked Marchand because he tends to be somewhat conservative, refraining from seeing phonesthemes everywhere he looks. Magnus (2001) and Bolinger (1965) are, conceivably, correct in claiming that phonesthemes are omnipresent, but if this view is true then the hypothesis ‘Seuss used phonesthemes in coining words’ becomes trivial and not worth checking.

28 Indeed, departing from his normal practice, he seeks a phonesthemic interpretation for every vowel, so there was no point in including them in my Marchand Model. Marchand also covers a few singleton final codas, but they did not improve the accuracy of the model and I omit them here.

29 Of course the German sequences are themselves mostly ill-formed in English, but for clarity I excluded them from the scope of my Phonotactic feature, which covers only the remaining ill-formed cases.

30 A reviewer suggested examining a model that includes interaction terms; this would test, for example, if having a Marchand onset is more important in words that are metrically felicitous. A check revealed that no interaction terms test as significant.

31 A final note: as reviewers have commented, the factors overlap; e.g. initial [z-] is a phonestheme, but it may also be a Germanism (from the sound change *[s] > [z]) and is moreover phonotactically somewhat improbable (Hayes & Wilson 2008). I'm not sure what kind of test could control for these issues.

32 These 21 are listed as follows in descending order of weight: [z-], [sn-], [-ɑp], [-ʌmp], [-ʌd], [-ʌm], [-ʌb], [bl-], [-uːn], [gl-], [-ɛk], [skr-], [-uː], [g-], [j-], [kw-], [gr-], [w-], [fl-], [b-], [dʒ-].

33 Seuss often includes hyphens in the spelling of his long coinages. I suspect these are not intended to denote pseudo-morphological structure, but are only intended to make the long words easier to read. Teuber (2018) suggests that hyphen placement also aids the reader in assigning stress; e.g. Va-Vode comes across more clearly as finally stressed than Vavode would be.

34 To be sure, Seuss also wrote in binary meters, usually in short works for young children such as Green Eggs and Ham. I suggest that these were not as hard to write (he favored them in old age), and need little metrical help from coinages.


Baayen, Harald, Piepenbrock, Richard & Gulikers, Leon. 1995. CELEX2 LDC96L14. Web download. Philadelphia, PA: Linguistic Data Consortium.Google Scholar
Bergen, Benjamin K. 2004. The psychological reality of phonaesthemes. Language 80, 290311.CrossRefGoogle Scholar
Bloomfield, Leonard. 1933. Language. New York: Henry Holt.Google Scholar
Bolinger, Dwight. 1965. Forms of English: Accent, morpheme, order. Cambridge, MA: Harvard University Press.Google Scholar
Chambers, W. Walker & Wilkie, John R.. 1970. A short history of the German language. London: Methuen.Google Scholar
Daland, Robert, Hayes, Bruce, White, James, Garellek, Marc, Davis, Andrea & Norrmann, Ingrid. 2011. Explaining sonority projection effects. Phonology 28, 197234.CrossRefGoogle Scholar
Davis, Stuart. 1991. Coronals and the phonotactics of nonadjacent consonants in English. In Paradis, Carole & Prunet, Jean-François (eds.), The special status of coronals: Internal and external evidence, 4960. San Diego, CA: Academic Press.CrossRefGoogle Scholar
Eisiminger, Sterling. 1981. Etymology unknown: Toward a master list of words of obscure origin. American Speech 56, 146–8.CrossRefGoogle Scholar
Firth, J. R. 1930. Speech. London: Ernest Benn.Google Scholar
Fordyce, James. 1988. Studies in sound symbolism with special reference to English. PhD dissertation, University of California, Los Angeles.Google Scholar
Hammond, Michael. 1999. The phonology of English: A prosodic optimality-theoretic approach. Oxford: Oxford University Press.Google Scholar
Hayes, Bruce. 2016. Comparative phonotactics. Proceedings of the 50th Meeting of the Chicago Linguistic Society, 265–85.Google Scholar
Hayes, Bruce & Wilson, Colin. 2008. A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry 39, 379440.CrossRefGoogle Scholar
Hutchins, Sharon Suzanne. 1998. The psychological reality, variability, and compositionality of English phonesthemes. PhD dissertation, Department of Psychology, Emory University.Google Scholar
Jespersen, Otto. 1922. Language: Its nature, development, and origin. London: Allen & Unwin.Google Scholar
Jespersen, Otto. 1933. Symbolic value of the vowel i. In his Linguistica: Selected papers in English, French and German, 283303. Copenhagen: Levin & Munksgaard.Google Scholar
Johnson, Keith. 2008. Quantitative methods in linguistics. Malden, MA: Blackwell.Google Scholar
Joseph, Brian. 1994. Modern Greek [ts]: Beyond sound symbolism. In Hinton, Leanne, Nichols, Johanna & Ohala, John J. (eds.), Sound symbolism, 222–36. Cambridge: Cambridge University Press.Google Scholar
Kawahara, Shigeto. 2020. Sound symbolism and theoretical phonology. Language and Linguistics Compass 14, e12372.CrossRefGoogle Scholar
Kawahara, Shigeto. 2021a. How Pokémonastics has evolved: Version 1.0. Scholar
Kawahara, Shigeto. 2021b. Phonetic bases of sound symbolism: A review. Scholar
Keyes, Ralph. 2021. The hidden history of coined words. Oxford: Oxford University Press.CrossRefGoogle Scholar
Lathem, Edward Connery. 2000. Who's who & what's what in the books of Dr. Seuss. Hanover, NH: Dartmouth College.Google Scholar
Liu, Nelson F., Levow, Gina-Anne & Smith, Noah A.. 2018. Discovering phonesthemes with sparse regularization. Proceedings of the Second Workshop on Subword/Character Level Models, 49–54. New Orleans, LA: Association for Computational Linguistics.CrossRefGoogle Scholar
MacDonald, Ruth K. 1988. Dr. Seuss (Theodore Seuss Geisel). Woodbridge, CT: G. K. Hall & Company.
Magnus, Margaret. 2001. What's in a word? Studies in phonosemantics. PhD dissertation, Norwegian University of Science and Technology, Trondheim.
Malkiel, Yakov. 1990. Diachronic problems in phonosymbolism. Amsterdam: John Benjamins.
Marchand, Hans. 1960. The categories and types of present-day English word-formation. Wiesbaden: Otto Harrassowitz.
Morgan, Judith & Morgan, Neil. 1995. Dr. Seuss & Mr. Geisel: A biography. New York: Random House.
Nilsen, Don L. F. 1977. Dr. Seuss as grammar consultant. Language Arts 54, 567–74.
Oh, Y., Needle, J., Todd, Simon, Beckner, Clay, Hay, Jennifer & King, Jeannette. 2020. Non-Māori-speaking New Zealanders have a Māori proto-lexicon. Scientific Reports 10, art. 22318.
Otis, Katya & Sagi, Eyal. 2008. Phonaesthemes: A corpus-based analysis. Proceedings of the 30th Annual Meeting of the Cognitive Science Society, 65–70.
Pentangelo, Joseph. 2020. Phonesthetics and the etymologies of blood and bone. English Language and Linguistics 25, 225–55.
Prince, Alan & Smolensky, Paul. 1993/2004. Optimality Theory: Constraint interaction in generative grammar. Technical report, Rutgers University Center for Cognitive Science. [Published 2004; Oxford: Blackwell.]
Schmidtke, David S., Conrad, Markus & Jacobs, Arthur M. 2014. Phonological iconicity. Frontiers in Psychology 5, 80.
Shih, Stephanie S., Ackerman, Jordan, Hermalin, Noah, Inkelas, Sharon & Kavitskaya, Darya. 2018. Pokémonikers: A study of sound symbolism and Pokémon names. Proceedings of the Linguistic Society of America 3.42, 1–6.
Teuber, Daniel. 2018. An analysis of the nonce-words of Dr. Seuss. Journal of Osaka Sangyo University 34, 43–69.
Wales, Katie. 1990. Phonotactics and phonesthesia: The power of folk lexicology. In Ramsaran, Susan (ed.), Studies in the pronunciation of English: A commemorative volume in honour of A. C. Gimson, 339–51. London: Routledge.
Wescott, Roger. 1980. ‘Zazzification’ in American English slang. In his Sound and sense: Linguistic essays on phonosemic subjects, 391–3. Lake Bluff, IL: Jupiter Press.
Westbury, Chris, Shaoul, Cyrus, Moroschan, Gail & Ramscar, Michael. 2016. Telling the world's least funny jokes: On the quantification of humor as entropy. Journal of Memory and Language 86, 141–56.
Wong, Andrew. 2014. Branding and linguistic anthropology: Brand names, indexical fields, and sound symbolism. Practicing Anthropology 36, 38–41.
Table 1. Features that favor being a Seuss coinage

Table 2. Features that favor being a real word

Figure 1. Histogram of model probabilities of Seuss coinages

Figure 2. Histogram of model probabilities of real words

Table 3. The ten Seuss coinages with the highest model probability

Table 4. Features of the coarse-grained Marchand Model