1.1 Introduction
In 1964 Harvey Sacks (1935–1975) began giving lectures on conversation at UCLA and circulated them privately on request, in the process creating his “most successful and prolific form of scientific communication” (Schegloff 1992a: xix). The time was a propitious one for innovation in the human sciences. Human cognition and reasoning had been recently rehabilitated as respectable topics of academic study, having been banished for the previous forty years as subjective and unobservable by the now-waning edict of logical positivism and behaviorism. In psychology, pioneering work by Jerome Bruner (Bruner et al. 1956) and George Miller (Miller et al. 1960) led the way and, in linguistics, the behaviorism that had held sway since the days of Leonard Bloomfield was destroyed in Chomsky’s (1959) devastating review of Skinner’s Verbal Behavior. In anthropology, ethnological research on non-Western classification systems began to expand (Conklin 1959; Frake 1961), while in sociology there arose a widespread research program premised on the social construction of reality (Schütz 1962; Berger and Luckmann 1967; Holzner 1968). Finally, strongly influenced by the approach to communication that focused on context analysis (McQuown 1971; Leeds-Hurwitz 1987; Kendon 1990), Goffman’s research into the interaction order (1955, 1956) was a major influence on Sacks and his colleagues (Schegloff 1992a; Clayman et al. 2022).
The philosophical research associated with these movements, particularly the work of Alfred Schütz (1962) and Ludwig Wittgenstein (1958), made it clear that the human sciences had a range of difficult issues to confront. These may be summarized as follows: First, social actions are meaningful and involve meaning-making. Second, this is achieved through a combination of the content of actions and their contexts. Third, action involves shared or intersubjective meanings that may be far from perfect but are normally enough to allow the participants to carry on. Finally, as Schegloff (1988: 442) observed, “social life is lived in single occurrences.” Because they embody unique and singular conjunctions of content and context, actions are themselves unique and singular. While possessed of general and generalizable characteristics, these contents and contexts nonetheless intersect in mutually elaborative synergies to enable meanings that are also particular (Garfinkel 1967: 76–103; Garfinkel and Sacks 1970).
The consensus among those who were determined to study social interaction – in particular, Harold Garfinkel (1952, 1967) and Erving Goffman (1955, 1964, 1967, 1983) – was that none of these problems could be analyzed by appealing to psychological processes or other essentially private states of mind. Rather, the task was to focus on the surface of action – the visible order of society (Livingston 2008) – since it is in public action and public communication that meaning-making and comprehension are managed.
Within the emerging Conversation Analysis (henceforth CA) paradigm, the focus was not on the “propositions” of positivistic philosophy, nor on the “messages” of communication science, nor yet again on the “meanings” that might emerge from cultural analysis. Rather, CA was premised on the notion that, so far as human interaction is concerned, the job of speech is to deliver actions. For example, if A tells B that “Someone just vandalized my car” (Wilson 1991), B has the task of determining whether this statement is intended as a complaint, a request for help, or as an excuse (for a late or non-arrival). This task is of pressing significance because B must frame a response to this information then and there, and this response (for example, an offer of assistance, or an expression of sympathy, or of forgiveness) will unavoidably portray B’s analysis of the action that A’s utterance implemented. It follows that conversational participants have an urgent and compelling interest in the performance and analysis of social actions, and this in turn mandates the corresponding CA focus on action and sequence organization (Schegloff and Sacks 1973; Sacks 1987, 1992).
The leading idea that dominated the early development of CA, as well as its subsequent evolution, was Garfinkel’s (1967) notion that social action is methodical, that is, organized by reference to shared methods of acting and reasoning about action that are the fundamental resources for meaning-making. Moreover, the same methods are deployed both in producing recognizable actions and in recognizing them. As Sacks put it: “A culture is an apparatus for generating recognizable actions; if the same procedures are used for generating as for detecting, that is perhaps as simple a solution to the problem of recognizability as is formulatable” (Sacks Lectures, Fall 1965, Appendix A, p. 226, emphasis in original). The program of research should, accordingly, center on the methods for producing and recognizing action, with these methods construed as a structured set that, like language itself, functions in broad independence from personal sentiments and proclivities.
With these insights in hand, CA thinking slowly developed toward the kind of research that we witness today (Clayman et al. 2022). At first, the going was very slow. By the time of Sacks’s death in 1975 in an automobile accident, fewer than twenty CA papers had been published. Starting in the 1980s, the field began to grow and by 2020 the number of publications per annum had increased approximately fortyfold. Starting from an exclusively anglophone base, CA methods have come increasingly to be applied across a broad range of languages.¹
This growth has also been associated with an institutionalization of the field together with the broadening of CA’s disciplinary context (Stivers and Sidnell 2013). Initially motivated by fundamental sociological concerns with the analysis of social action and intersubjective understanding bequeathed by Schütz (1962), CA rapidly found an audience within the field of discursive psychology (Potter and Wetherell 1987) and now forms the basis of the more recently founded field of interactional linguistics (Couper-Kuhlen and Selting 2018). It also intersects with a variety of other fields, including computer science and neuroscience, and has a very wide range of applications in the fields of medicine, education, law enforcement, media analysis, second language acquisition, and research into disabilities involving autism, stroke, neuro-degenerative illness, and other forms of brain damage.
1.2 A Revolution in Methods
A field that started with the preoccupations of CA – a focus on social action, the achievement of intersubjectivity, and the lodging of both in an ever-shifting envelope of context – could hardly proceed using the older methods of discourse analysis. Speech act theory, with its invention of single sentence examples, devoid of any real context save that stipulated by the theorist, coupled with a top-down analytic process conceived in terms of rules and conditions, was already foundering as an approach to action and proved to be largely without merit as an approach to discourse (Levinson 1981, 1983, 2013, 2017). From the outset, Sacks argued that research based on imagined sentences and hypothesized contexts was essentially constrained by what audiences would accept as “reasonable” (1984: 25). Moreover, these imagined scenes would lack the detail and complexity of the real world of action. Indeed, as he put it, from “close looking at the world we can find things that we could not, by imagination, assert were there. We would not know that they were ‘typical’ … Indeed, we might not have noticed that they happen” (p. 25).
As Sacks’s reference to “close looking” suggests, he adopted the stance of a naturalist going into the field, and finding and describing elements of social actions and observing their “habitats,” the contexts in which they appeared. Recordings enabled him to study them repeatedly and to allow interactional materials to be examined in concert with others. A vital result was “that anyone else can go and see whether what was said is so. And that is a tremendous control on seeing whether one is learning anything” (1984: 25). The tenor of Sacks’s comments on methods here, made at the inception of CA, is very much that of an explorer who does not know what is out there to be found. Accordingly, Sacks was generally opposed to the use of methods including coding, experimental procedures, role playing, etc. (cf. Heritage and Atkinson 1984) that would involve top-down generalization or would otherwise presuppose what is likely to be found. Such methods have the potential to obscure the particularity, precision, and underlying organization of observable human conduct. This stance remains a fundamental one within CA, though the increasing expansion of CA findings has more recently eventuated in the use of statistical, experimental, and neuroscience methods as adjuncts to CA investigations (see Robinson et al. 2022).
1.3 The Basic Structure of CA Research
At the present time, CA research may be characterized in terms of three levels of analysis.
1.3.1 Sequence
The most fundamental of these levels is that of sequence. All actions are built by reference to some “place” in talk and exploit this positioning as a sense-making resource (Schegloff 1984). Moreover, a current action will normally “project” the relevance of some next action, albeit with different degrees of constraint on what should be done next (see below). Combining these two points we may say that each action is context shaped and context renewing (Heritage 1984a: 242). In addition, each next action will normally display some analysis of what has just been said. (1) below is taken from the opening of a conversation between a community nurse (HV) and a new mother, at the mother’s home. The baby is chewing on something:
(1)
[HV:4A1:1] ((HV: health visitor; M: Mother and F: Father))
1  HV:  He’s enjoying that [isn’t he.
2  F:                      [°Yes, he certainly is=°
3  M:   =He’s not hungry ’cuz (h)he’s ju(h)st (h)had
4       ’iz bo:ttle .hhh
5       (0.5)
6  HV:  You’re feeding him on (.) Cow and Gate Premium.=
Here the HV’s apparently casual observation about the baby at line 1 is treated in distinctive ways by the child’s father and mother. The father treats it as something to be agreed with, which he does promptly at line 2. The mother’s response, however, is notably defensive. She is at pains to claim that her baby’s behavior is not due to hunger (lines 3–4), and in this way treats the HV’s observation as potentially critical of her parenting. These two different responses likely reflect the division of labor in the household, at least as far as childcare is concerned. It is in these responses that preceding speakers can see how they were understood and can respond by validating or correcting the understandings displayed in second turns. It follows that, as we shall see, sequences are also a fundamental vehicle for the management of intersubjective understanding between interactants.
1.3.2 Practices
The actions that make up sequences are the objects of numerous practices of turn construction (Drew 2013). To be successfully identified, these practices will normally (a) be recurrent, (b) be specifically situated within a turn or sequence, and (c) differentiate the turn from one that is comparable but does not embody the practice in question. Here are some examples:
(i) prefacing a turn with an address term, which selects the next speaker:
(2) A: John, do you want another drink?
In this case, the address term selects John as the recipient of the offer.
(ii) prefacing a turn with the word well, projecting a turn that will be expanded to more than one unit of talk (Heritage 2015), and that will likely be negatively valenced (Davidson 1984; Heritage 2015; Kendrick and Torreira 2015). In (3) Edgerton is offering to help a family stricken with an injury:
(3)
[Heritage:0II:4:11–19]
1  Edg:  =Oh:hh Lord.< And we were wondering if there’s anything
2        we can do to help<
3  Mic:  [Well that’s]
4  Edg:  [I mean] can we do any shopping for her or
5        something like tha:t?
6        (0.7)
7  Mic:  Well that’s most ki:nd Edgerton .hhh At the moment
8        no:. Because we’ve still got two bo:ys at home.
9  Edg:  Of course.
After receiving the offer to help with the shopping (line 4), Michael begins his response with well at line 7. His response comprises two turn constructional units: an appreciation preceded by well, and a rejection of the offer accompanied by an account. The conjunction here between negative valence and turn expansion is scarcely accidental. In fact, the well preface is critical because, without it, that’s most kind might have been understood as an acceptance of the offer, rather than as an appreciation headed toward rejection. Here the well preface instructs the recipient that there will likely be more to come (Heritage 2015).
(iii) incorporating the word any into a polar (yes/no) question:
(4)
[Heritage 2010]
1  Doc:  Do you have any drug aller:gies?
This communicates the questioner’s expectation that the response to the question is likely to be negative (Heritage et al. 2007) and is therefore an aspect of the epistemic stance of the questioner (Heritage and Raymond 2021).
As previously noted, practices must be identified as recurrent and as specifically situated. This kind of identification is essential for analysis. For example, the import of practices such as well-prefacing shifts with respect to the word’s placement both within a turn (Lerner and Kitzinger 2019), and within a sequence (Heritage 2015, 2018).
1.3.3 Organizations of Practices
Conversation Analysis is founded on Goffman’s (1955, 1964, 1967, 1983) notion that social interaction comprises an institutional order in its own right, and one that is embedded in a wide variety of other social institutions, while structuring participation within them. As an institution, social interaction has its own internal organization and, as Schegloff (2006: 71) notes, “where stable talk in interaction is sustained, solutions to key organizational problems are in operation, and … organizations of practice are the basis for these solutions.” Collections of practices are readily found in relation to a range of systemic organizational problems in interaction, all of which are putatively universal (Schegloff 2006). These include the following:
(i) Turn-taking: General resources are available to solve questions concerning who should talk next, and when they should do so. These embrace practices for the construction of units of talk (turn constructional units) and projecting their conclusion, for their one-at-a-time allocation, and for allocating next turns to next speakers. These were described in Sacks et al.’s (1974) foundational paper on the topic and have been greatly elaborated in subsequent decades. Recent comparative research strongly suggests that the issues raised by the Sacks et al. study have universal relevance across languages (Stivers et al. 2009).
(ii) Sequence organization: This will be elaborated in more detail later in this chapter, but the sequence organizational problem concerns the resources and constraints through which turns at talk are lined up into coherent trajectories of interaction. Although many of the practices involved are specific to particular kinds of actions and sequences, the generic notion of paired actions has been a central resource for understanding how sequences work and is also likely to be a conversational universal (Kendrick et al. 2020).
(iii) Repair: Interaction can run awry when failures of speaking, hearing, and understanding undermine the mutual intelligibility on which it is based. The organization of practices for dealing with such problems is described under the heading of “repair.” Once again, the basic elements were sketched out by Schegloff et al. (1977) with an important addendum by Schegloff (1992b), and this analysis has undergone considerable refinement in the intervening years. Again, many of the basic principles of repair organization are highly general, if not universal (Dingemanse et al. 2015).
(iv) Social solidarity: If the first three levels of organization are addressed to the coherent management of interaction itself, the following groups of interactional practices are concerned with the management of the relations between speakers. In the case of social solidarity, these practices are focused on the limitation of conflict and the avoidance of its escalation. Here the organization of “preference” (Pomerantz 1984; Sacks 1987; Schegloff 2007: 58–97; Pomerantz and Heritage 2013; Pillet-Shore 2017) mandating the (delayed) timing and (mitigated) expression of disaffiliative actions is central (see below).
(v) Knowledge management: Different speakers bring distinctive kinds of knowledge to bear in the course of their interactional engagements. A wide variety of interactional practices inform the expression of this knowledge: these practices concern its novelty, how the speaker knows it, and whether the speaker has primacy in access to it (Heritage and Raymond 2005).
(vi) Deontic rights: Speakers may vary in their deontic rights: the right to issue directives to others or to require actions from them (Stevanovic and Peräkylä 2012). A range of practices are associated with the management of these rights and with the concrete expression of directives under different circumstances (Stevanovic and Svennevig 2015).
If these six domains of interaction have attracted a good deal of research, there are others that are also managed through clusters of practices. These include references to persons and places, gaze and body behavior, the expression of emotion, storytelling and many more.
1.4 Major Concepts in Conversation Analysis
1.4.1 Sequence
Unquestionably, the fundamental and animating concept in CA is that of sequence. Every action (including the first) will necessarily occur at some “place” in an interaction and its content will unavoidably be produced and understood by reference to that place. This means that the most fundamental context in which an action, housed in a turn at talk and/or bodily behavior, will be understood is the immediately preceding turn. It is sequences of action which serve as the medium through which every other aspect of context – both local and distal, social and psychological – emerges at the surface of social interaction (Schegloff 1987, 1991, 1992c). It cannot be sufficiently stressed that, for CA, sequence is the engine room of interaction, and it is the vehicle through which other elements of context such as social identities and social institutions are realized (Heritage and Clayman 2010). Accordingly, very many resources are deployed in the management of sequential coherence, including the “fitting” of elementary actions to one another (Schegloff and Sacks 1973), the management of coherence and cohesion across turns (Halliday and Hasan 1976), and the use of a variety of particles such as oh, well, and by the way to fine tune the relationship between a current turn and its immediately preceding one (Schiffrin 1987; Fischer 2006; Heritage and Sorjonen 2018).
The initial work on sequences was focused on adjacency pairs: actions tightly organized as first pair parts and second pair parts in which the production of a first part strongly obligates a second speaker to produce a second “next” (Schegloff 1972, 2007: 13–27; Schegloff and Sacks 1973). Greetings, “goodbyes,” and question–answer pairs are paradigmatic examples. This apparently simple conceptualization has far-reaching implications. The normative requirement for a fitted second pair part to occur “next” creates a context in which failures to respond appropriately become “noticeable absences,” which in turn can be sanctioned by a first pair part producer who may repeat the first pair part in the hope of response, or complain about the absence. Negative inferences may also be drawn from absent second pair parts: the absence may be attributed to inattention, rudeness, hostility, or an unwillingness to respond due, for example, to a desire to avoid self-incrimination.² The close-ordering constraint characteristic of adjacency pairs is associated with both the aim of “getting things to happen” in interaction, and processes of understanding that are associated with this. As Schegloff and Sacks (1973: 297–298) summarized the point:
by an adjacently positioned second, a speaker can show that he understood what a prior aimed at, and that he is willing to go along with that. Also, by virtue of the occurrence of an adjacently produced second, the doer of a first can see that what he intended was indeed understood, and that it was or was not accepted. Also, of course, a second can assert his failure to understand, or disagreement, and inspection of a second by a first can allow the first speaker to see that while the second thought he understood, indeed he misunderstood. It is then through the use of adjacent positioning that appreciations, failures, corrections, etcetera can be themselves understandably attempted.
It is these functions that also necessitate the close ordering of adjacency pair parts. Again Schegloff and Sacks (1973: 297) pinpoint the fundamental issue:
Given the utterance-by-utterance organization of turn-taking, unless close ordering is attempted there can be no methodic assurance that a more or less eventually aimed-for successive utterance or utterance type will ever be produced. If a next speaker does not do it, that speaker may provide for a further next that should not do it (or should do something that is not it); and, if what follows that next is “free” and does not do the originally aimed-for utterance, it (i.e., the utterance placed there) may provide for a yet further next that does not do it, etc. Close ordering is, then, the basic generalized means for assuring that some desired event will ever happen. If it cannot be made to happen next, its happening is not merely delayed, but may never come about. The adjacency pair technique, in providing a determinate “when” for it to happen, i.e., “next,” has then means for handling the close order problem, where that problem has its import, through its control of the assurance that some relevant event will be made to occur.
These observations further suggest a second vital dimension of close ordering and sequence in general. This concerns the use of sequential organization in the management of intersubjectivity, to which we now turn.
1.4.1.1 Sequence Organization and Intersubjectivity
More or less explicit in the preceding discussion is the idea that sequence organization is a vehicle for the management of mutual understanding and social accountability. Central to this process is what may be termed the “three-step model of intersubjectivity” (Heritage 1984a: 254–264, 2022). This is based on the following two notions: (i) a second turn response to a prior turn communicates, if only indirectly and inferentially, a treatment of the prior action that allows a first speaker to see whether (or how) they were understood; and (ii) a third turn from the first speaker is necessary for the second speaker to see whether the understanding conveyed in Turn 2 was appropriate. “Turn 3” is thus the point at which an intersubjectively shared understanding of “Turn 1” is first available and, just as important, is known to be shared and accountably so by the participants (Heritage 1984a; Schegloff 1992b). It follows that each and every next turn at talk is subject to this three-step consolidation of understanding. Thus, rather like a tracked vehicle, an ongoing process of intersubjective consolidation – an incarnate “running index” (Heritage 1988) of the state of the talk – is continuously and unavoidably extruded through successive turns at talk. An important consequence of this line of reasoning is that intersubjective understanding is neither an exclusively subjective, nor an instantaneous event. Rather, it is a social process unfolding across multiple turns at talk.
However, “intersubjective understanding” is not the explicit project of sequences. Rather, the actions that compose sequences are accountable in their own right as actions and as competent understandings of what has gone before. In this context then, every turn simultaneously embodies both an understanding of the prior turn and constitutes an action in its own right. This double role for turns at talk is nicely illustrated in (5) below. Here Marty approaches Loes, who is at a desk, with a request (line 1):
(5)
[Schegloff 1992b: 1321]
1  Marty:  Loes, do you have a calendar,
2  Loes:   Yeah ((reaches for her desk calendar))
3  Marty:  Do you have one that hangs on the wall?
4  Loes:   Oh, you want one.
5  Marty:  Yeah
Loes’s responsive action (line 2) is interdicted by Marty’s second request (line 3), at which point Loes, at line 4, articulates the understanding that Marty wants to have a calendar to take away, rather than merely to consult one. This “display of understanding” simultaneously conveys a revised understanding of line 1, while also covertly accounting for the previous misunderstanding. Accountability in its two fundamental senses, of understanding and of action, is seamlessly interwoven here.
Adjacency pair sequences can be expanded. These expansions can precede the adjacency pair proper (pre-expansions) and in this position are frequently used to establish the appropriateness of the “base” adjacency pair sequence (Schegloff 2007: 28–57). Insert expansions, occurring between a first pair part and a second, are mainly occupied with initiating repair on the first pair part or establishing conditions that will be necessary for the provision of the second (Schegloff 2007: 97–114). While both these types of expansions are normally themselves carried out through adjacency pairs, the third type of expansion – the post-expansion – is often more loosely structured and can take a wide variety of forms (Schegloff 2007: 115–180).
1.4.1.2 Relaxing the Notion of Sequential Constraint
Conversations, of course, are not exclusively implemented through sequences of adjacency pairs. Rather the latter are a particularly tightly organized form of sequential constraint, in which recipients have few options or “room for maneuver.” However, the very idea of sequential organization suggests that talk is always organized by a relationship of “nextness” to a prior. And, even though this relationship may be looser in terms of the constraint placed on a next turn by a prior, it is still “central to the ways in which talk-in-interaction is organized and understood. Next turns are understood by co-participants to display their speaker’s understanding of the just-prior turn and to embody an action responsive to the just-prior turn so understood” (Schegloff 2007: 15).
Consider the following case. Here a daughter Lesley (Les) has recommended garlic tablets to her mother, and enquires whether she has followed up on the suggestion:
(6)
This sequence begins with two adjacency pair sequences that are “chained” both logically and in terms of anaphoric reference. The sequence is apparently concluded with an oh-prefaced assessment (line 5), which is a frequent method of sequence closure (Heritage 1984b; Schegloff 2007: 118–136). However, at line 6 Mum initiates a further expansion of the sequence with “Garlic’n parsley” that is tied to Lesley’s initial turn that describes the tablets as “garlic” and is understandable, and understood, as a minor correction of that earlier characterization. When Lesley confirms this at line 7, she slightly mischaracterizes the name of the supermarket chain that sells the tablets, attracting a further correction from Mum at line 10. Lesley then confirms the now corrected name and moves to close the sequence with another assessment (line 11). At line 15, she initiates a further expansion on the subject.
The point about this chain of turns is that, after line 5, none of these contributions is mandated by the turn preceding it. Rather, each is “volunteered” and achieves its sense as an action through its inferred connections to preceding talk. In this looser form of sequential organization, action is both constructed and understood through the retrospective application of relevance rules that convey the general sequential implicativeness of turns at talk (Heritage 2012b).
1.4.2 Preference Organization
It is a readily observed fact about conversational interaction that participants generally avoid actions that disaffiliate with others. That is, confronted with an offer, request, invitation, or an evaluation of something, participants will generally manage their conduct to avoid or mitigate responsive actions that disagree with or refuse what came before. Conversation analysts use the term “preference organization” to describe the range of practices that are associated with this outcome (Davidson 1984; Pomerantz 1984; Sacks 1987; Pomerantz and Heritage 2013; Pillet-Shore 2017). The term was coined by Sacks in a public lecture in 1973 (later published in 1987, some years after his death). It does not refer to the private, personal, or psychological desires of individuals, but rather to institutionalized practices of talk that are generally expected under such circumstances (Heritage 1984a: 265–280; Schegloff 2007: 58–96).
In his original paper, Sacks argued that both first speakers and second speakers engage in practices that contribute to the minimization of potential or actual disaffiliation. A primary resource open to first speakers is the use of pre-sequences to check out the availability of recipients for some form of joint action, as in (7):
(7)
[SB:1 (Schegloff 2007:31)]
1 John: Ha you doin-<say what’r you doing?
2 Judy: Well, we’re going out. Why.
3 John: Oh, I was just gonna say come out and come over
4 here and talk this evening, but if you’re going out
5 [you can’t very well do that
6 Judy: [“Talk,” you mean get drunk, don’t you?
By means of Judy’s response to line 1, John can avoid launching the invitation he had apparently planned, Judy is released from having to reject that invitation, and John is released from receiving a rejection. This kind of mutual interest in the avoidance of rejection drives many pre-sequences of this sort.
The producers of first actions have considerable resources to design their turns to be more or less insistent. This is particularly relevant in the case of requests, with their potential for unwanted imposition on the recipient. Curl and Drew (2008) have influentially argued that the design of requests can vary between the poles of entitlement and contingency. Request formats such as “I want X” or “I need X” tend to be deployed in contexts of high entitlement, whereas “I was wondering if …” and irrealis formulations such as “Could you X” recognize the discretion of the recipient and the contingencies associated with granting the request (Drew and Couper-Kuhlen 2014). The relationship between grammar, turn design, and social action is presently undergoing extensive exploration for first actions of this sort (Kendrick and Drew 2016; Floyd et al. 2020; Rossi 2022; Taleghani-Nikazm et al. 2020). It may be added that the design of offers has similarly been shown to be shaped by the contexts of their emergence (Curl 2006), and both offers and requests are understood against contexts of need and desirability (Clayman and Heritage 2014). At a still greater level of generality, entitlement and contingency are calibrated within the larger social context first outlined by Brown and Levinson (1987) in terms of the weight of the request, and the relative power and social distance between the parties.
Turning to responsive actions, the initial CA research pointed to delay and mitigation as primary practices in managing dispreferred, disaffiliative actions (Reference Davidson, Atkinson and HeritageDavidson 1984; Reference Pomerantz, Atkinson and HeritagePomerantz 1984; Reference Sacks, Button and LeeSacks 1987). Reviewing some details from (3) above will illustrate this:
(3)
[Detail]
4 Edg: [Imean] can we do any shopping for her or
5 something like tha:t?
6 (0.7)
7 Mic: Well that’s most ki:nd Edgerton .hhh At the moment
8 no:. Because we’ve still got two bo:ys at home.
Here Michael’s response to Edgerton’s offer involves (i) delay (line 6), (ii) a well-prefaced appreciation (line 7), (iii) a qualified rejection (lines 7–8), and (iv) an account (“we’ve still got two bo:ys at home”) that describes a state of affairs unrelated to, and presumably unknown to, the offerer. The net effect of the account is to deflect any sense that the rejection could be a negative commentary either on the offerer as a person, or on the social appropriateness of the offer. Note here that Michael’s pause of 700 milliseconds at line 6 is close to the length of time that predicts dispreferred responses in next turn (Kendrick and Torreira 2015),Footnote 3 and also to the delay – also 700 milliseconds – after which experimental subjects start to judge that a dispreferred response is likely (Roberts and Francis 2013).
Both the first speaker and the second speaker practices that make up preference organization are associated with “face” and affiliation (Reference GoffmanGoffman 1955; Reference HoltgravesHoltgraves 1992; Reference LernerLerner 1996; Reference Heritage, Clayman, Mondada and PeräkyläHeritage and Clayman 2023). This is demonstrated by the fact that, in contexts such as self-deprecations where disagreement would be affiliative and agreement would be disaffiliative, none of these characteristic features are present when second speakers disagree (Reference Pomerantz, Atkinson and HeritagePomerantz 1984). Earlier it was noted that these features are not tied to the personal or subjective desires of participants. This is evidenced in the fact that when affiliative and disaffiliative actions are produced in the prescribed ways, this is treated as unremarkable. However, if the pattern is reversed, and disaffiliative actions are done “early” and affiliative ones “late,” this is treated as remarkable, inferentially rich and an index of “real” feelings and attitudes.Footnote 4
1.4.3 Turn Design
Sequences are made up of turns at talk. These turns are the objects of practices of turn design that massively contribute to the understanding of turns as actions. Put simply, turn design embraces all the resources that speakers draw upon in building a turn to be understood in the way they wish it to be understood. Turn design is a vast topic, embracing a wide range of resources deployed in turn construction. These include “lexis (or words), phonetic and prosodic resources, syntactic, morphological and other grammatical forms, timing, laughter and aspiration, gesture and other bodily movements and positions (including eye gaze)” (Reference Drew, Sidnell and StiversDrew 2013: 132), many of which have long traditions of research in linguistics and related fields, and all of which have been the objects of steadily growing research in CA. These topics can only be touched upon in a survey such as this.
It is comparatively straightforward to see the relevance of these resources in contexts where speakers use self-repair to adjust the actions that they are in the course of producing (Reference Drew, Sidnell and StiversDrew 2013). For example, in the following case Emma has called her friend Margy to thank her for a lunch event the previous week. Margy deflects the thanks by indicating that it was a pleasure to have her and continues by suggesting that “We’ll have to do that more often.” The conversation continues as follows:
(8)
[NB:VII:82–88 (from Drew et al. 2013)]
1 Mar: =W’l haftuh do tha[t more] o[:ften.]
2 Emm: [.hhhhh] [Wul w]hy don’t we: uh-m:=
3 Emm: =Why don’t I take you’n Mo:m up there tuh: Coco’s.someday
4 fer lu:nch.
Emma begins her turn with “Why don’t we:” (line 2), matching the “we” construction of Margy’s previous turn, before revising her turn so that it begins with “Why don’t I … .” The significance of this simple revision is obvious: it alters the action projected to be implemented by her turn from a “suggestion” to an “offer” which would reciprocate the lunch engagement that she has recently enjoyed. Here the grammatical adjustment of her turn at talk drastically revises the action it is implementing to one that Emma feels is better fitted to its social context.
Lexical choice is also a vehicle for the management of meaning. Reference Drew, Sidnell and StiversDrew (2013: 131) focuses on the role of the word burglar in Mom’s announcement of the events of the previous evening (lines 4–5):
(9)
[Field X(C):2:00 1:00 4:00 27–37 (from Drew 2013:131)]
1 Mom: Oh hello:
2 Kat: ‘lo,
3 (0.4)
4 Mom: .hhh I thought you were the police we had a bu:rglar
5 las’ ni:ght
6 (.)
7 Kat: ↑Really. Did[’e↑take anything.
8 Mom: [.hhh
9 (0.2)
10 Mom: .h NO: ..hh Uh: m (0.3) You see. ↑we were in bed it
11 wz about three twen↓ty.h a.m:, …
Noting that Katherine’s line 7 does not express dismay and does not assume that anything was stolen, Drew argues that her position is responsive to Mom’s use of the word burglar rather than burglary in line 4. While burglar describes a person intending to commit a robbery, a burglary describes the consummation of such an act. Katherine’s question is fitted to this lexical choice, and Mom confirms that, in fact, nothing was taken (line 10). Interaction in institutional contexts is pervasively shaped by these kinds of lexical selections (Reference Drew, Heritage, Drew and HeritageDrew and Heritage 1992b; Reference Heritage, Sanders and FitchHeritage 2005; Reference Heritage and ClaymanHeritage and Clayman 2010).
Other practices of turn design are associated with the management of cohesion across sequences of turns at talk (Halliday and Hasan 1976; Drew 2013). Consider anaphora, for example. In the following sequence, pronominal anaphora is deployed to sustain sequential linkages across turns at lines 6, 10, and 12 that focus on a non-present third party – Alice:
(10)
[SN4: 615–630]
1 Mark: So (‘r) you da:ting Keith?
2 (1.0)
3 Kar: It’s a frie:nd.
4 (0.5)
5 Mark: What about that girl he use(d) to go with for so long.
6 Kar: → A:lice? I [don’t-] they gave up.
7 Mark: [(mm)]
8 (0.4)
9 Mark: °(Oh)?°
10 Kar: → I dunno where she is but I-
11 (0.9)
12 Kar: → Talks about her every so o:ften, but- I dunno where she is
13 (0.5)
14 Mark: hmh
15 (0.2)
16 Sher: *→ Alice was stra::nge,
17 (0.3) ((rubbing sound))
18 Mark: → Very o:dd. She used to call herself a pro:stitute,=
19 → =’n I used to- (0.4) ask her if she was getting any
20 more money than …
There are, of course, also sequential sources of cohesion across this series of turns (Reference Schegloff and DorvalSchegloff 1990), in particular, its construction through a question–answer + post-expansion adjacency pair structure. However, it is also clear that anaphoric reference is a massive source of cohesion in the sequence across lines 6–12.
At line 16, however, Sheri starts on a new topic, while retaining Alice as the subject of the sequence. Notably, this is begun with a repeat of the word Alice. As suggested by Fox (1987: 71) and subsequently reinforced by Schegloff (1996a: 452), this return to the base (locally initial) reference form indexes that what will follow is a new sequence, rather than a continuation of the prior talk. More recently, in a generalization of this argument, Raymond et al. (2021) have argued that the use of non-anaphoric resources for co-reference is constitutively associated with conversational actions that are inherently agentive.
We conclude this brief survey of turn design with a consideration of the role of particles – words like and, oh, well, bueno, pues, Achso, etc. – in the management of turns and sequences. Recent years have seen a large boom in studies of these objects both within and across languages (e.g., Fischer 2006; Kim and Kuroshima 2013; Heinemann and Koivisto 2016; Heritage and Sorjonen 2018; among many others). Many of them occur in turn-initial positions, and often function there to take up a stance toward what has just been said, or what the speaker is about to say. For example, the particle oh – a “change of state token” ordinarily used to acknowledge new information (Heritage 1984b, 2018) – can be used in second position to respond to assessments so as to convey a more knowledgeable (K+) position relative to the previous speaker. In (11), for example, Jon and Lyn have been to see the movie Midnight Cowboy while Eve has not. At line 6, Eve offers the opinion that the movie is depressing, attributing this judgment to a friend Jo:
(11)
[JS:II:61:ST]
1 Jon: We saw Midnight Cowboy yesterday -or [suh-
2 Eve: [Oh?
3 Jon: Friday.
4 Lyn: Didju s- you saw that, [it’s really good.
5 Eve: [No I haven’t seen it
6 Jo saw it ‘n she said she f- depressed her
7 ter[ribly
8 Jon: → [Oh it’s [terribly depressing
9 Lyn: → [Oh it’s depressing.
10 Eve: Ve[ry
11 Lyn: → [But it’s a fantastic [film.
12 Jon: → [It’s a beautiful movie.
At lines 8 and 9, both Jon and Lyn agree with this assessment, and they both preface their responses with oh, indexing an independent stance of epistemic superiority based in firsthand experience of the movie relative to Eve’s “hearsay” information. Subsequently, both go on to leverage this epistemic stance into disagreements counterbalancing the “depressing” aspect of the film with its other qualities which they describe as “fantastic” and “beautiful” respectively (lines 11 and 12).
The notion of stance as an aspect of turn design is attracting increasingly significant research interest. Major dimensions of stance include epistemic, deontic, and affective aspects. Following Sacks’s (1987) observations about preference, researchers began to explore the notion that questions will routinely convey the questioner’s stance concerning the likelihood of the state of affairs under question primarily through the use of polarity items (Pomerantz 1988; Robinson 2020a; Heritage and Raymond 2021; Raymond and Heritage 2021). Similarly, choices between declarative and interrogative question forms will communicate the questioner’s confidence in that likelihood (Heritage 2010, 2012a). Thus, in the following sequence, the doctor’s initial question proposes an affirmative hypothesis for confirmation (Bolinger 1978) – that the patient is, in fact, married – while its interrogative syntax communicates that he is less than certain about it:
(12)
[Heritage 2010]
1 DOC: Are you married?
2 (.)
3 PAT: No.
4 (.)
5 DOC: You’re divorced cur[rently,
6 PAT: [Mm hm,
His second declarative question, however, at line 5 displays more assurance that the patient is likely to be divorced. Studies of responses to polar questions are congruent with these observations. Interactants tend to respond minimally to questions that invite confirmation but are more likely to respond phrasally or clausally in response to information-seeking questions (Enfield et al. 2019; Stivers 2010). Even comparatively brief responses encode the degree to which the respondent views the question as appropriately framed and the response as self-evident (Raymond 2003; Heritage and Raymond 2005, 2012; Raymond and Heritage 2006; Fox and Thompson 2010; Thompson, Fox and Couper-Kuhlen 2015; Stivers 2019, 2022). Deontic stance is similarly the object of a wide range of research (Stevanovic 2011; Rossi 2012; Stevanovic and Peräkylä 2012; Drew and Couper-Kuhlen 2014; Kendrick and Drew 2016; Floyd et al. 2020; Thompson et al. 2021). Affective stance is also an expanding locus of research (Goodwin 2007a; Ruusuvuori 2007; Du Bois and Kärkkäinen 2012; Local and Walker 2012; Peräkylä and Sorjonen 2012; Reber 2012; Stevanovic and Peräkylä 2014; Voutilainen et al. 2014).
This section has presented a “snapshot” of a few of the very many aspects of turn design that inform the production of sequences of action. Some of these have involved elements that are well known to linguists. However, once these are understood against a context of action construction and sequence organization – which Schegloff terms a positionally sensitive grammar (Reference Schegloff, Ochs, Thompson and SchegloffSchegloff 1996b; Reference Mazeland, Sidnell and StiversMazeland 2013) – new analytic vistas are revealed in which the mutual relevance of grammar and interaction can become strikingly clear (Reference Couper-Kuhlen and SeltingCouper-Kuhlen and Selting 2018).
1.4.4 Recipient Design
Conversational actions are managed with reference to the particulars both of local sequential context and of personal identities. Conversation analysts refer to this aspect of conversational organization as recipient design. This was defined by Reference Sacks, Schegloff and JeffersonSacks et al. (1974: 727) as referring to “a multitude of respects in which the talk by a party in a conversation is constructed or designed in ways which display an orientation and sensitivity to the particular other(s) who are the co-participants.” Sacks et al. add that “in our work, we have found recipient design to operate with regard to word selection, topic selection, admissibility and ordering of sequences, options and obligations for starting and terminating conversations, etc.” (p. 727). Plainly this is a domain of massive scope.
Recipient design perhaps emerges most transparently and frequently in relation to what others may know. For example, in deploying referring expressions about persons, places, and time, speakers must determine what expressions will be recognizable to their recipient (Reference Schegloff and SudnowSchegloff 1972, Reference Schegloff and Fox1996a; Reference Schegloff and GivonSacks and Schegloff 1979; Reference StiversEnfield and Stivers 2007; Reference Raymond and WhiteRaymond and White 2017). As Reference Schegloff and SudnowSchegloff (1972) observes about place formulations: “For any location to which reference is made, there is a set of terms each of which, by a correspondence test, is a correct way to refer to it.” In such a context the place formulation used will be selected by reference to the presumed knowledge of the recipient, the topic, the local action involved, and so on. Similar issues inform person and time references as well.
Sacks and Schegloff (1979) outlined a preference structure for these kinds of references in which a speaker will ordinarily begin with a minimal reference form and proceed to expand on it in the case of non-recognition by the recipient, as in (13):
(13)
[PT:NB:VII:7]
1 Emm: .hhh B’t anyway we played golf et San or Bud played et San
2 Ma:rcus so I went down with’im=
3 → =yihknow that’s back’v Ensk- (.) E[sc’ndido [so,
4 Mar: → [Ye:ah. [Mhm,
In this case, Emma’s initial reference to San Marcos fails to attract a response by the end of line 2, whereupon she expands the reference (line 3), receiving a prompt acknowledgment (line 4) at the completion of the expansion.
To the extent that they may anticipate this kind of recognition failure, speakers may deploy pre-sequences that help to assure successful recognition (Reference Heritage, Enfield and StiversHeritage 2007). In (14) Dave is trying to recruit Pete to go on a fishing trip but is unsure whether he knows the proposed location.
(14)
[Northridge 2:3]
1 Dave: → Like yih know whereah: : Pilgrim Lake is i(ts)-=
2 → =that’s on the other si:de u’th’Grapevine=
3 → =yihknow this side of the Grapevine
4 Pete: → Oh the’s jus’ up to Bakersfield.
5 Dave: Yeah.
Here, the pre-sequence at line 1 serves as an adjunct to the recipient design of the place reference.
Pre-sequences can solve other interactional issues where the state of recipient knowledge may be uncertain. For example, pre-announcements are routinely deployed to determine whether a candidate recipient already knows the news to be told. In (15), in which A and B are a couple, and C and D are also, A makes a pre-announcement (Maynard 2003; Terasaki [1976] 2004) about some good news. The sequence is depicted below:
(15)
[A and B are a couple, as are C and D]
1 A: Hey we got good news.
2 C: [What’s the good ne]ws,
3 D: [I k n o : w.]
4 (.)
5 A: [Oh ya do::?
6 B: [Ya heard it?
7 A: Oh good.
8 C: Oh yeah, mm hm
9 D: Except I don’t know what a giant follicular
10 lymphoblastoma is.
11 A: Who the hell does except a doctor.
It will be observed that the other couple (C and D) respond to this announcement in contrasting ways: C asks about the news (line 2), while D claims to know it (line 3). At lines 5 and 6, A and B both respond to the claim of knowledge (line 3) rather than the go-ahead request (line 2) and the news delivery is aborted.Footnote 5
As the prevalence of pre-sequences may suggest, recipient design is not always easily or flawlessly achieved. In (16), Lesley describes a lunch date with a person she refers to as “Missiz Baker.” With this reference form she projects her recipient (Joyce) to have only a slight acquaintance with the person in question. However, as it turns out, this is not quite correct:
(16)
[Holt 5/88:1:2:174–180]
1 Les: i Yes: : .hh An’ I met Missiz Baker ‘n we had lunch together
2 which wz very ni:[ce,.hhh
3 Joy: → [Oh did you with Di:a:nne,°Ba[ker.°
4 Les: [ih- ye:s ‘n
5 then: she wz going an’ I suddenly re’mbered she’d paid f’the
6 lo:t fortunately I managed to catch her.
At line 3, Joyce responds with a turn that contains a revised reference form that includes the given name of Lesley’s lunch companion (“Di:a:nne, °Baker.°”), thus intimating a closer relationship than Lesley projected. Here the expanded person reference (line 3) is appended to a turn component (“Oh did you”) that otherwise invites an expansion of the informing to which Lesley seems to be committed. In this context, the turn forwards the informing while also declining to accept the reference form (“Missiz Baker”) presented by Lesley. Thus, the potentially disambiguating, but otherwise redundant, addition of “Di:a:nne, Baker” is most likely designed to index a failure of recipient design on Lesley’s part: Joyce knows “Dianne Baker” better than Lesley initially supposed at line 1.
Finally, in (17) from a call to a birth helpline, the caller has previously said that she is a retired midwife (Reference Kitzinger and MandelbaumKitzinger and Mandelbaum 2013: 185–186) and, confident in this knowledge, the helpline worker (Hlp) uses a technical term (cephalic presentation) to describe the likely position of a baby in the womb (line 1):
(17)
[Kitzinger and Mandelbaum 2013: 186]
1 Hlp: → [·hhh But- but-] i- it’s a cephalic presentation i:sn’t i:t.
2 Clr: Sorry¿
3 Hlp: → hh uh the baby’s head down¿
4 Clr: Yes he’s cephalic, y[es ()
5 Hlp: [·hh Yes, ·hhh a::nd u::m not posterior.
6 (0.9)
7 Clr: No:t that I’m aware of.=
8 Hlp: =No
When the caller (Clr) initiates an “open class” repair (Drew 1997)Footnote 6 on this turn (line 2), the helpline worker revises the description into “lay” terms (line 3). The caller then confirms Hlp’s supposition using the previously deployed technical term (line 4). Subsequently the helpline worker continues by using technical terminology at line 5 that treats the caller as qualified in this branch of medicine. In this case, the helpline worker’s beliefs about the caller’s experience and knowledge waver briefly, and the caller pushes back at this brief failure of recipient design.
With the topic of recipient design and the analytical focus on how talk is organized with reference to “the particular other(s) who are the co-participants,” CA approaches a particular dimension of language and context: the dimension of identity. This is an immensely variegated topic and one that is also approached through Membership Categorization Analysis (MCA), which is beyond the scope of the present chapter. Although MCA was once significantly separated from CA, there is now considerable convergence between the two fields of study, and there is much to be gained from their rapprochement.
1.4.5 Progressivity
In a notable passage, Reference SchegloffSchegloff (2007) observes the following:
Among the most pervasively relevant features in the organization of talk-and-other-conduct-in-interaction is the relationship of adjacency or “nextness.” The default relationship between the components of most kinds of organization is that each should come next after the prior. In articulating a turn-constructional unit, each element – each word, for example – should come next after the one before; in fact, at a smaller level of granularity, each syllable – indeed, each sound – should come next after the one before it.
And he continues:
Moving from some element to a hearably-next-one with nothing intervening is the embodiment of, and the measure of, progressivity. Should something intervene between some element and what is hearable as a/the next one due – should something violate or interfere with their contiguity, whether next sound, next word or next turn – it will be heard as qualifying the progressivity of the talk, and will be examined for its import, for what understanding should be accorded it.
Originally formulated in the context of a paper on repair (Reference Schegloff and GivonSchegloff 1979), these remarks have obvious implications for the idea of sequence which have been somewhat explored above, but they also have large implications for the study of turns at talk, and the clausal, phrasal, and lexical elements that compose them.
Considering the production of a turn at talk, we can begin with the notion that the preceding turn sets up (a range of) likely prospects for its content, including the action it is in the process of delivering. These expectations are likely consolidated fairly early in the turn (Reference Bögels, Kendrick and LevinsonBögels et al. 2019; Reference Gisladottir, Bögels and LevinsonGisladottir et al. 2018) and are progressively refined across the sequence of words that compose it, including anticipation of its completion (Reference De Ruiter, Mitterer and Enfieldde Ruiter et al. 2006; Reference Jefferson, D’Urso and LeonardiJefferson 1983). As Raymond and Lerner observe:
When one initiates a turn at talk, the unfolding turn-so-far will project roughly what it will take to complete it. Moreover, the continuing moment-by-moment unfolding of a turn will be inspected for the progressive realization (suspension, deflection, or abandonment) of what has been projected so far. The hallmark of this realization is found in such material elements as the pace of the talk, the adjacent placement of syntactically next words and the intonation contour that carries the talk. … In this sense the projectability of a speaker’s turn at talk constitutes a proximate normative structure within which a range of other organizational contingencies are coordinated and managed – including the timing and design of action by others; it is precisely this progressively realized structure that makes any deflections in its locally projected course a site of action, a recognizable form of action, and a site of action and interpretation by others.
It is in this context that repair initiation becomes significant (Reference Drew, Sidnell and StiversDrew 2013). Initiated through sound stretches, glottal stops, uhs, you knows, and other resources (Reference Clayman, Sidnell and StiversClayman 2013; Reference Clayman and RaymondClayman and Raymond 2021; Reference Schegloff and GivonSchegloff 1979), repair can be initiated on any word, indeed any sound (including the final one) of a turn (Reference Schegloff and GivonSchegloff 1979), and can modify any aspect of the action being delivered (Reference Schegloff and GivonSchegloff 1979, Reference Schegloff, Hayashi, Raymond and Sidnell2013; Reference Drew, Walker, Ogden, Hayashi, Raymond and SidnellDrew et al. 2013). These modifications can be to entire actions (as in (8) above), to the entire grammatical form of a turn (as in (18), line 3 below, which is modified to a more appropriate epistemic stance (Reference Drew, Walker, Ogden, Hayashi, Raymond and SidnellDrew et al. 2013)), to convert a “content” into a polar question (as in (19)), or to redo an assertion to adjust its epistemic stance (20), line 7; finally, repair may be initiated at or just past the final sound of a turn constructional unit, as in (11), line 1.
(18)
[Field SO88(II):1:3:1 ] (Leslie is caller) (Drew et al. 2013)
1 Hal: Oh ‘el[lo Lesl[ie?
2 Les: [.hhhh [I RANG you up- (.) ah: think it wz la:s’ night.
3 But you were- (.) u-were you ↑ou:t? or: was it the night
4 before per[↓haps.
5 Hal: [Uh:m night be↓fore I expect we w’r dancing Tuesdee ni:ght.
(19)
[NB: IV: 2 :2 (Schegloff 1979)]
1 Agnes: Chop [it.
2 Martha: [Tell me, uh what- d’you need a hot sauce?
3 (0.5)
4 Agnes: t’hhh a Taco sauce.
(20)
The overwhelming relevance of progressivity in turn and sequence organization creates powerful pressures to complete repairs within the turn containing the trouble source rather than adopting some form of “wait and see” approach (Reference Schegloff, Jefferson and SacksSchegloff et al. 1977; Reference Schegloff and GivonSchegloff 1979), even though repair actions inhibit progressivity within the turn (Reference Heritage, Enfield and StiversHeritage 2007). Progressivity and recipients’ capacity to project what will emerge next also provide fertile soil for multimodal operations within the sentence. We conclude by noting that progressivity deeply penetrates the organization of sequences, to the extent of overriding other normative aspects of sequence organization such as next speaker selection (Reference Stivers and RobinsonStivers and Robinson 2006). So overwhelming are the pressures for progressivity exerted by the organization of turn-taking and sequence organization that special aspects of sequence organization are required to end conversations (Reference Schegloff and SacksSchegloff and Sacks 1973).
1.4.6 Multimodality
Interaction is prosecuted through bodily engagements as well as turns at talk. Indeed, as Charles Reference GoodwinGoodwin (1981, Reference Goodwin2000a, Reference Goodwin2018) observed, utterances are not merely the carriers of action, but also sites within which multiple social actions are taking place. This arises from a fundamental fact about social interaction. Whereas the acoustic basis of spoken language militates against extensive overlapping talk, bodily behavior is not inhibited in this way (Reference Deppermann, Mondada and DoehlerDeppermann et al. 2021). Rather it can, and usually does, emerge before the onset of speech and readily accompanies turns at talk projecting their course, and amplifying, elaborating, or modulating their content. As Reference MondadaMondada (2021a: 1) puts it: the “temporality of action is emergent, incremental, organized step-by-step, and adjustable to situated contingencies.” Prominent among these bodily resources are gaze, gesture (including facial gesture), body posture, and touch, all of which are conditioned by the physical and epistemic ecology in which a sequence occurs.
These elements of bodily action can arise before turns begin and be involved in multiple reflexive adjustments across the turn, facilitating action ascription, and enabling early response (Goodwin 2018; Deppermann and Schmidt 2021; Mondada 2021a). For example, speaker gaze aversion before turn beginnings reliably predicts subsequent dispreferred turns (Kendrick and Holler 2017; Robinson 2020b; Pekarek Doehler et al. 2021), while gaze withdrawal at sequence boundaries is quite strongly associated with sequence closure, especially if performed by the one who initiated the sequence (Rossano et al. 2009; Rossano 2012, 2013). More generally, embodied responses to a current speaker can emerge when that speaker’s action becomes intelligible, in contrast to spoken responses which generally await the completion of the prior turn (Jefferson 1973, 1983; Mondada 2021a). And, as Goodwin (1979) was the first to point out, speakers readily make segmental and gestural adjustments to accommodate the bodily responses of their interlocutors as their turn at talk unfolds – a phenomenon termed micro-sequentiality (Mondada 2018). These micro-sequential adjustments between action and response may be temporally adjacent if not simultaneous. This stretches and blurs the notion of sequence considered in terms of first and second actions, and also raises the question of how to characterize the nature and onset of response (Mondada 2021a).
Finally, first actions themselves can be implemented without language, as when assistance of some kind is recruited without a verbal request (Reference Kendrick and DrewKendrick and Drew 2016; Reference Floyd, Rossi and EnfieldFloyd et al. 2020).
In general, it is possible to distinguish relatively routine constellations or assemblies of multimodal resources that are constructed in comparatively repetitive activities. For example, Mondada (2021a, 2021b) demonstrates the body orientation, gazing, and pointing that are the precursors and the accompaniments deployed in the verbal selection of a cheese for sampling or purchase. A similar patterning may be observed in the bodily postures associated with different levels of involvement with multiple others (Schegloff 1998). Other studies document the deployment of multimodal resources in unique and singular moments in interactional events. Goodwin’s (2003) study of pointing is an exemplary case.
Goodwin was arguably a founder of the analysis of body behavior in relation to social interaction, and his work on gaze (Goodwin 1979, 1981) greatly expanded the vocabulary of analysis. This focused not simply on speakers’ gaze toward others but also on how they track the gaze of their co-participants, and whether the co-participants gaze back, see the gaze, engage in mutual gaze, and so on. The field now embraces investigations into the integration of gaze into holistic arrangements of multimodal resources that include gesture, body postures, movements, and the manipulation of artifacts. These studies have given rise to a wide-ranging body of research on showing and looking together at something, and coming to see that thing in common (Goodwin 1994, 2000a). They have established ways of finding the integration of the material environment within interaction and enabled ways of investigating the multisensoriality of interaction (Mondada 2019).
Goodwin’s work also readily accommodated the study of gesture (Goodwin 1986, 2000a), and it is gesture that is the primary focus of a long series of studies by Streeck (2009; see also Streeck et al. 2011; Meyer et al. 2017). This research analyzes the fundamental modes of gesturing – indexical (environmentally coupled; Goodwin 2007b), depictive, and conceptual – and develops a powerful case for an enactive approach to gesture that identifies its role in relation to concurrent speech, positioning within turns and sequences of action, ongoing physical activities, and the physical environment. It would thus be appropriate to suggest that contemporary multimodal analysis is cementing a robust interface between language, action, and the material world and a more refined understanding of human social engagement.
1.4.7 Institutional Talk
Perhaps the most extensive domain of contemporary CA research comprises “institutional talk.” The term was coined by Drew and Heritage (1992b) to address the fact that interactional sequences in task-focused interactions – especially those involving interactions between institutional role incumbents and lay counterparts, but also in other workplace contexts – are strikingly distinctive relative to the ordinary interactions between friends, acquaintances, and family members. Early work in CA did not much distinguish between ordinary conversation and “institutional” interaction. Harvey Sacks’s early lectures dealt with calls to a suicide prevention center and later used data from group therapy sessions (labeled “GTS”) in lectures and other papers. Sacks used these data not to explore their institutional import, but rather to examine generic issues concerning turn-taking, repair, sequence organization, and storytelling. It was only in the late 1970s, with the publication of Atkinson and Drew’s (1979) research on courtroom interaction, that researchers began focusing on the distinctive nature of institutional talk in its own right. Other research swiftly followed, including work on news interviews, calls to the emergency services, classroom interaction, interaction in medical settings, mediation, therapy, and many others.
In a 1992 collection of studies, Drew and Heritage (1992b) built on work by Stephen Levinson (1979) to outline the distinctive characteristics of institutional interactions as a class. They argued that, in contrast to ordinary conversation, these interactions
(1) normally involve the participants in specific goal orientations which are tied to their institution-relevant identities, such as doctor and patient, teacher and student, etc.;
(2) normally involve special constraints on what will be treated as allowable contributions to the business at hand; and
(3) are normally associated with inferential frameworks and procedures that are particular to specific institutional contexts.
These features were not offered as defining characteristics of institutional interactions. Rather, as Drew and Heritage also noted, the boundaries between ordinary conversation and institutional talk are inexact, permeable, and difficult to specify (cf. Schegloff 1999). Institutional talk is not confined to particular physical or symbolic settings such as classrooms, hospitals, or offices, and ordinary conversation can equally emerge in each of these settings (Drew and Sorjonen 1997). However, as Heritage (2013) noted:
though the boundaries between ordinary conversation and institutional interactions cannot always be specified with precision, the distinction between the two is often abundantly obvious to even naïve observers, who do not readily confuse medical consultations, courtroom examinations, news interviews or mediation hearings with ordinary conversation between peers (Atkinson 1982). The study of institutional interaction is essentially mandated by these basic differences.
The three characteristics listed above suggest several underlying assumptions concerning the primacy of ordinary conversational interaction. First, ordinary conversational interaction involves the deployment of the widest array of interactional rules and practices. Talk in various institutional settings, by comparison, involves restrictions in the use of particular practices and the re-specification of those that remain (Heritage 1985; Drew and Heritage 1992b; Heritage and Clayman 2010). Second, ordinary conversation is the predominant form of interaction in the social world: Other forms of institutional interaction are practiced in more restricted “niche” environments. Third, ordinary conversational practices are historically primary in the life of a society. Conversational interaction evidently antedates legal or pedagogic discourse, for example. Fourth, conversation is biographically primary in the life of the individual: language socialization proceeds through conversation (Ochs and Schieffelin 1979, 1986). Thus, for example, if children are to become successful participants in the classroom, they must learn new interactional conventions and practices that are different from those of ordinary conversation. Fifth, ordinary conversation is characterized by relative stability over time, whereas institutional talk can undergo rapid historical change (Clayman and Heritage 2002a, 2002b; Heritage and Clayman 2010). And finally, whereas most conversational norms and practices are, like those of grammar, tacitly learned and implemented, the norms and practices of institutional interactions can be, and sometimes are, the objects of explicit societal discussion, justification, and proposals for active change. While this process is most obvious in, for example, discussions of the rules for Presidential Debates in US Elections, it is also present in movements for the reform of educational practice, for alterations in the practice of doctor–patient encounters, and in adjustments to rules for the conduct of police interrogations and courtroom trials.
Contemporary CA research on institutional talk can be divided into two broad forms. The first seeks to delineate the basic organization of talk within a particular institutional context. This can include special turn-taking arrangements and mechanisms, the overall structural organization of the interaction, aspects of sequence organization, turn design, lexical choice, and epistemic and other forms of asymmetry between the participants. The goal here has been to identify what is “institutional” about institutional interaction. This goal was comparatively urgent. Beginning from the notion that each turn at talk is context-shaped and context-renewing, it follows that “context” is both a project and a product of each successive turn at talk. As a result, context, identity, and institutions have to be treated as inexorably locally produced and hence as transformable at any moment (Heritage and Clayman 2010). Thus, identifying the interactional practices through which identities and institutions are made relevant in talk is an essential first step in the application of CA to social institutions. Credible sketches of interaction in particular institutional settings have emerged over the past twenty years. The following represent a small sample of quite large literatures: in relation to the courts (Atkinson and Drew 1979; Drew 1992; Dupret et al. 2015); emergency calls (Zimmerman 1984, 1992; Whalen et al. 1988; Whalen and Zimmerman 1990); primary care (Robinson 2001, 2003; Heritage and Maynard 2006); counseling (Peräkylä 1995; Silverman 1997); mediation (Greatbatch and Dingwall 1998; Garcia 2019); and news interviews (Clayman and Heritage 2002a). Some of this research is summarized in Heritage and Clayman (2010).
The second, and still larger, body of research into institutional interactions involves the application of CA to evaluate or even to suggest change to conduct in these settings. The distinctive feature of this research is that it is primarily intended as a contribution to another disciplinary field, while contributions to the body of CA research are secondary at best. For example, in a study focusing on the increasingly adversarial character of presidential news conferences, Clayman and colleagues used resources from CA and other branches of linguistics to classify journalists’ questions and to document processes of change (Clayman and Heritage 2002b; Heritage and Clayman 2013; see also Clayman and Heritage 2021). Developing the study further, they were able to identify some of the major social factors associated with these processes (Clayman et al. 2006, 2007, 2010). This was primarily a contribution to political science, rather than CA. In medicine, a study of question design showed that the avoidance of negative polarity items (primarily “any”) could strongly reduce the likelihood that patients would leave the doctor’s office with unmet medical concerns (Heritage et al. 2007). Also in the early 2000s, an extensive series of studies by Stivers (including 2002, 2005, 2007; Stivers et al. 2003) formed the foundation of a successful intervention to reduce inappropriate antibiotic prescribing by pediatricians (Kronman et al. 2020). Again, these are findings concerning medical practice, and only secondarily CA findings. These applications involved the use of statistical evidence, founded on CA findings, as the basis for recommendations, but statistical procedures are not always necessary to influence institutional arrangements. On the contrary, smaller-scale qualitative studies can also be deeply illuminating for practitioners and can form the basis for changes in practice.
1.5 Conclusion
Conversation Analysis has aimed for, and in some measure achieved, findings that are formal and generalizable. This has led some to the conclusion that the field is somehow indifferent to the contexts in which interaction occurs – “the sins of noncontextuality,” as Goffman (1981: 32) described them (see also Cicourel 1987; Duranti 1997). For many in the field, myself included, this criticism (for that is how it is intended) is difficult to reconcile with what we know about the practice of CA. For the fact is that much of what we mean by the “context” of an utterance consists of the prior turn and sequence in which it emerges and to which it responds. Moreover, it is commonly the case that many of the more “distal” aspects of social context are structurally or inferentially embedded in those prior sequences (Schegloff 1991, 1992c). It might be claimed, indeed, that in an important sense the whole effort of CA is aimed at explicating the meaning of “context” in interaction. However, just as the structure and meaning of sentences is built up from constituents, so too is the structure and meaning of interactions. The research strategy for identifying these constituents and their properties must necessarily involve identifying the trans-situational role and significance of particular practices of interaction. This involves examining the contextual boundaries of practices (Schegloff 1997; Clift and Raymond 2018), together with the ways their import is particularized in specific moments of interaction (see Heritage 2020 for an illustration). The upshot here is that CA research requires extraordinary care and attention to the contexts in which a practice is deployed as a means of identifying the trans-contextual ways in which it can function as a constituent of interaction.
Interaction, Schegloff avers, “is the primordial site of human sociality” (1996b; see also Schegloff 2006). Across massive social and historical change and in the first contacts between peoples previously unknown to one another, the interaction order provides for continuity and mutual understanding. The organization of interaction “needs to be – and is – robust enough, flexible enough, and sufficiently self-maintaining to sustain social order at family dinners and in coal mining pits, around the surgical operating table and on skid row, in New York City and Montenegro and Rossel Island, and so forth, in every nook and cranny where human life is to be found” (Schegloff 2006: 71). The sustaining of this social order is achieved through organizations of interactional practices that provide for a self-organizing and self-reproducing system for the living of real lives. Some of these have been briefly sketched above.
In the brief compass of this chapter, I have tried to indicate and to illustrate some of the interests and concerns that conversation analysts have pursued as the field developed from a small and marginal “cottage industry” into a major international multidisciplinary field of inquiry. There are significant omissions and limitations in this account. Neither turn-taking nor repair has been addressed in anything but the most fleeting and tangential way. Action ascription (Levinson 2013; Deppermann and Haugh 2022), despite its obvious importance, has likewise been given short shrift. The nature and significance of findings in the numerous languages that conversation analysts have addressed have also, perforce, been triaged out. These omissions and others notwithstanding, this sketch may, I hope, convey something of this very large field in general, and of its approach to context in particular.
Transcription Conventions
The typed or printed examples embody an effort to have the spelling of the words give a rough indication of how the words were produced. Often this involves a departure from standard orthography. In addition:
- ?,._
Punctuation is designed to capture intonation, not grammar, and should be used to describe intonation at the end of a word/sound at the end of a sentence or some other shorter unit. Use the symbols as follows: a question mark for marked upward intonation, a comma for slightly upward intonation, and a period for falling intonation. An “empty underline” indicates level final intonation.
- [
Left-side brackets indicate where overlapping talk begins.
- ]
Right-side brackets indicate where overlapping talk ends, or mark alignments within a continuing stream of overlapping talk.
- (0.8)
Numbers in parentheses indicate periods of silence, in tenths of a second.
- :::
Colons indicate a lengthening of the sound just preceding them, proportional to the number of colons.
- becau-
A hyphen indicates an abrupt cut-off or self-interruption of the sound in progress indicated by the preceding letter(s) (the example here represents a self-interrupted “because”).
- He says
Underlining indicates stress or emphasis.
- dr^ink
A “hat” or circumflex accent symbol indicates a marked pitch rise.
- =
Equal signs (ordinarily at the end of one line and the start of an ensuing one) indicate a “latched” relationship – no silence at all between them.
- ()
Empty parentheses indicate talk too obscure to transcribe. Words or letters inside such parentheses indicate the transcriber’s best estimate of what is being said.
- hhh.hhh
The letter h is used to indicate hearable aspiration, its length roughly proportional to the number of hs. If preceded by a dot, the aspiration is an in-breath. Aspiration internal to a word is enclosed in parentheses. Otherwise hs may indicate anything from ordinary breathing to sighing, laughing, etc.
- °
Talk appearing within degree signs is lower in volume relative to surrounding talk.
- ((looks))
Words in double parentheses indicate transcriber’s comments, not transcriptions.
- →
Arrows in the margin point to the lines of transcript relevant to the point being made in the text.