Interactive alignment is a phenomenon whereby interlocutors adopt and re-use each other's language patterns in the course of authentic interaction. According to the interactive alignment model, originally proposed by Pickering & Garrod (2004), this linguistic coordination is one way in which interlocutors achieve understanding in dialogue, occurring at the level of the lexicon, grammar, and pronunciation. The goal of this paper is to extend this model to second language (L2) pronunciation and to discuss its possible implications for L2 pronunciation teaching.

1. Introduction

In an episode of ‘I Love Lucy’, a much-loved American TV comedy show, the main character Lucy and her friend Ethel were taking private French lessons so that they could master at least enough spoken French to order a meal at a French restaurant. In the end, Lucy and Ethel learned no useful phrases but had lots of fun imitating the pronunciation of their French tutor, who used a ‘listen and repeat’ approach to teaching the words le crayon ‘pencil’ and la plume ‘pen’. At the end of the lesson, Lucy sarcastically remarked, ‘We're in great shape if the restaurant we go to happens to serve pens and pencils!’ This tongue-in-cheek example aptly highlights several challenging questions that both language researchers and teachers – particularly those interested in the learning and teaching of second language (L2) pronunciation – have grappled with for decades: How is L2 pronunciation learned and how is it best taught? The goal of this paper is to begin addressing at least some aspects of these questions by bringing in some evidence from psycholinguistics. I will first briefly outline a possible view of L2 pronunciation learning, will then show some preliminary data that are consistent with this view, and conclude by discussing some possible implications of this view for L2 pronunciation learning and teaching.

In the field of L2 pronunciation learning, there is clearly no shortage of interesting theoretical proposals which are designed to explain how various aspects of L2 pronunciation are learned (e.g. Eckman 1991; Major 2002; Escudero & Boersma 2004; Best & Tyler 2007; Flege 2007; Darcy et al. 2012). However, it is clear (especially to those involved in the practical task of language teaching) that although many of these proposals are scientifically sound and engaging, they often have little to contribute to L2 pedagogy, either because they were not designed with practice in mind or because such links have not yet been established. Fraser (2004) captured this sentiment in her criticism of similar theoretical views:

This is of course a valid scientific analysis, but is of limited direct use in the practical task of helping learners alter their pronunciation, since we are dealing here, not with phonological systems in collision, but with people learning a cognitive skill (p. 279).

While the utility of theories cannot (and should not) be judged solely on their contributions to practice, one goal of theory-building in L2 pronunciation should be the establishment of ‘best practice’, or the idea that research should ultimately inform L2 pronunciation teaching. This is because pronunciation is not simply a fascinating object of inquiry, but one that permeates all spheres of human life, lying at the core of oral language expression and embodying the way in which the speaker and hearer work together to produce and understand each other's utterances. My goal therefore is to contribute to the overall objective of bridging the gap between research and practice by outlining a ‘teaching-friendly’ view of L2 pronunciation learning and discussing its pedagogical applications. This view is an extension of the interactive alignment model to L2 pronunciation learning and teaching.

2. Interactive alignment

Interactive alignment, as a theoretical view, originated in the field of cognitive psychology and was first articulated by Pickering & Garrod (2004). Underlying this view is the idea that dialogue is the most natural mode of human communication, and that the goal of interaction is for interlocutors to arrive at a common situation model. In other words, interlocutors need to establish ‘common ground’, which includes (but is not limited to) information about people, time, actions, and their causes and consequences.

An interesting question here is how interlocutors achieve such common ground in the course of an interaction. Pickering & Garrod proposed that at least one way of doing so is related to how interlocutors use language in the course of interaction. More specifically, interlocutors achieve understanding by aligning or coordinating their language at various levels: lexical, syntactic, and phonological. And this alignment becomes evident during conversation when interlocutors adopt and repeatedly use each other's language patterns. For example, native speakers engaged in communication tasks tend to re-use each other's lexical content and phrasal structure across turns as they work to construct a common understanding as part of interaction (e.g. Garrod & Anderson 1987). Moreover, alignment at one linguistic level is assumed to lead to alignment at another, suggesting that alignment is greatest when speakers use common lexical, syntactic, and phonological patterns in an integrated fashion. This, Pickering & Garrod argued, illustrates convergence in language use, which promotes successful communication.

Since then, researchers have shown that native-speaking interlocutors constantly demonstrate linguistic alignment or coordination in spoken interaction. Interlocutors re-use not only each other's words (Garrod & Anderson 1987; Brennan & Clark 1996; Bortfeld & Brennan 1997) and grammatical structures (Branigan, Pickering & Cleland 2000; Branigan et al. 2010) but also converge on common phonetic realizations of words (Clarke & Garrett 2004; Pardo 2006) and on common accent and speech rate (Giles, Coupland & Coupland 1991). This re-use of language patterns across interlocutors, indicative of alignment at various linguistic levels, has come to be seen as a powerful repetition-driven cognitive mechanism which supports successful interaction.

3. Interactive alignment in pronunciation

To date, the interactive alignment view has been successfully applied to describe different aspects of interaction between native speakers (Garrod & Pickering 2009) and has been extended to bilingual code-switching (Kootstra, van Hell & Dijkstra 2010). Can interactive alignment also be used to explain how learners acquire and use L2 pronunciation?

3.1 Alignment in native speaker interaction

There is now a considerable body of evidence that native-speaking interlocutors converge on common speech patterns in the course of interaction. This idea is far from novel. In fact, what is referred to here as linguistic alignment at the level of pronunciation has been studied for decades within sociolinguistics as part of accommodation theory (Giles 1973; Giles et al. 1991; Shepard, Giles & Le Poire 2001). Briefly, accommodation theory is a framework for the study of linguistic and nonlinguistic behaviour, in the context of social interaction, as a function of interlocutor beliefs, attitudes, and sociocultural conditions. Over 20 years ago, for example, Giles et al. (1991) listed several speech characteristics on which interlocutors appear to converge during laboratory-controlled and spontaneous interactions. These characteristics included utterance length, speech rate, information density, volume, and pausing frequencies and lengths, as well as response latency. Accommodation theory explains such linguistic convergence as a sign of interlocutors’ (often subconscious) desire for mutual social integration and identification and their need for mutual social approval.

More recently, working within the cognitive processing perspective, researchers have demonstrated tight links between interlocutors’ speech output (production) and speech input (perception) in conversation. Pardo (2006), for instance, has shown that interlocutors converge on common phonetic realizations of words and that such convergence occurs rapidly (early on in the conversation) and persists for at least one week after the initial conversation. In another study, Kim, Horton & Bradlow (2011) have found that native-speaking interlocutors sharing the same dialect are more likely to converge on common phonetic and prosodic speech patterns than interlocutors with distinct dialects, suggesting that convergence is facilitated when interlocutors share a common linguistic background (see also Pardo, Jay & Krauss 2010; Nielsen 2011; Babel 2012; Pardo et al. 2012; Pardo et al. 2013). Phonetic convergence can occur even for speech that is only seen. For example, listeners show similar degrees of phonetic convergence for words that they heard and for words that they lip-read from a silent video recording of a speaker (Miller, Sanchez & Rosenblum 2010; Mol et al. 2012). Taken together, these findings point to the existence of a rapid and probably automatic process of phonetic alignment in native-speaking interlocutors. This process appears to reflect a human perceptual system which adapts readily in response to recent experience (Samuel & Kraljic 2009).

3.2 Alignment in L2 learners

When it comes to L2 learners interacting with other learners or with native speakers, it is far less obvious whether and under what circumstances learners demonstrate interactive alignment in pronunciation. It appears, though, that interactive alignment depends on several related variables, including the native language background of interlocutors, their level of L2 proficiency, their familiarity with each other's way of speaking, and perhaps some shared knowledge (Costa, Pickering & Sorace 2008; Garrod & Pickering 2009). However, at least for English, most of the interactions in today's world occur between non-native speakers who might not share a common language or might not have shared background knowledge. Language classrooms are also increasingly diverse, composed of learners from different social, educational, and linguistic backgrounds. In other words, alignment may be less likely to occur because of the diversity which typifies many L2 interactions. In fact, Gambi & Pickering (2013) recently hypothesized that the extent of alignment in pronunciation will be determined by perceived (and actual) similarity between the two interlocutors, with greatest alignment occurring when similarity is high. Presumably, such similarity is based on a variety of factors, including linguistic (e.g. differences in interlocutors’ language backgrounds, L2 proficiency), cognitive (e.g. automaticity of L2 skill), and social (e.g. extent of social interaction between the two interlocutors, attitudes). Therefore, before interactive alignment can be extended to L2 pronunciation learning, it is important to determine if it does indeed occur in conversations between L2 speakers, particularly with respect to pronunciation.

Some evidence bearing on this issue comes from recent work by Trofimovich & Kennedy (unpublished). These researchers audio- and video-recorded 34 non-native speakers of English, from many different language backgrounds, interacting with each other in two-way information-gap tasks in a university setting. In one of the tasks, interlocutors had to reconstruct a common map from the partial information given to each interlocutor (either landmarks, or the route with most landmarks missing). In another task, interlocutors had to co-construct a common six-picture narrative from three pictures given to each partner. The analyses of speaker interaction in these tasks support at least three generalizations. First, some conversation partners are more successful than others at managing the interaction on a turn-by-turn basis, with successful interactions nearly always characterized by vast amounts of repetition across dialogue partners. Second, repetition appears to be a common way of addressing intelligibility problems, which is consistent with previous findings in intercultural communication (Bremer & Simonot 1996; Watterson 2008). Finally, and most relevant to L2 pronunciation learning, in some cases speakers can, through repetition, converge on a pronunciation used by their interlocutor. In the following example, which comes from the map task, Shokri consistently drops the /h/ in the words ‘house’ and ‘home’; however, he converges on a more targetlike production after experiencing Yu's targetlike variants of these words.

These descriptive data were further supported by analyses of listener judgments of L2 interactions. In order to determine if global interactive alignment occurs in conversation between two L2 learners, Trofimovich & Kennedy took 50-second excerpts from the beginning (first minute) and end of each interaction (approximately six minutes later). These excerpts were then presented in scrambled order to ten native-speaker listeners, who rated each interaction on a 50-millimeter continuous scale estimating the extent to which both interlocutors sounded similar. The results of this analysis are shown in Figure 1, which illustrates that the L2 interlocutors in the excerpts at the end of the interaction were rated by listeners as sounding more similar than the same interlocutors communicating early on in their conversation. This effect was statistically significant for both tasks.

Figure 1 Listener ratings of how similar L2 interlocutors sounded in excerpts from the beginning (first minute) and end of the interaction (approximately six minutes later).
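The comparison summarized in Figure 1 can be sketched computationally. The following Python fragment is a minimal illustration only, not the analysis carried out by Trofimovich & Kennedy: the record layout, the function name mean_similarity_by_phase, and all rating values are hypothetical. It simply averages listener ratings (in millimetres on the 50-millimeter scale) for early and late excerpts. The text does not state which significance test was used, so none is shown.

```python
from statistics import mean

SCALE_MM = 50  # length of the rating scale reported in the text


def mean_similarity_by_phase(ratings):
    """Average similarity ratings separately for early and late excerpts.

    `ratings` is a list of (phase, rating_mm) tuples, where phase is
    "early" or "late". This layout is an assumption made for illustration.
    """
    by_phase = {"early": [], "late": []}
    for phase, rating_mm in ratings:
        if not 0 <= rating_mm <= SCALE_MM:
            raise ValueError(f"rating {rating_mm} is off the {SCALE_MM} mm scale")
        by_phase[phase].append(rating_mm)
    return {phase: mean(values) for phase, values in by_phase.items()}


# Invented ratings for one pair of interlocutors (not data from the study).
example_ratings = [
    ("early", 20), ("early", 24), ("early", 22),
    ("late", 30), ("late", 28), ("late", 32),
]
summary = mean_similarity_by_phase(example_ratings)
print(summary)
```

A higher mean for the late excerpts than for the early ones would correspond to the pattern the figure describes, with interlocutors judged as sounding more similar by the end of the interaction.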

3.3 Interactive alignment as a teaching tool?

If interactive alignment of the kind demonstrated by Trofimovich & Kennedy occurs in the course of an authentic interaction in the L2, then an interesting question to explore is whether such alignment can be exploited pedagogically. Some evidence for potential applications of interactive alignment to the teaching of L2 pronunciation comes from a recent classroom-based study with university-level L2 learners of English conducted by Trofimovich, McDonough & Foote (unpublished). These researchers created communicative activities for learners enrolled in a course of English for academic purposes, and seeded these activities with multiple instances of three- and four-syllable English academic words with the stress on the second syllable (e.g. consider, intelligent). Such words, which feature a common stress pattern in English, often pose challenges for even advanced L2 speakers (e.g. Murphy 2004). Besides providing learners with opportunities to exchange information on a topic relevant to unit themes, these activities were meant to elicit interactive alignment. In other words, it was hoped that L2 interlocutors would produce a target stress pattern more frequently after their interlocutor produced one than when their interlocutor had not produced an accurate stress. This is precisely what was observed. Across all four tasks administered throughout a 13-week course, L2 learners showed the tendency to repeat a target stress pattern after experiencing one in the immediately preceding turn in the speech of their interlocutor. This finding is illustrated in Figure 2, which combines the data from the four communicative tasks.

Figure 2 Mean proportion of accurate word stress produced after a target stress pattern versus after a non-target stress pattern in interlocutor's speech.
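The measure plotted in Figure 2 is a conditional proportion: how often a speaker stresses a target word accurately, depending on whether the interlocutor's immediately preceding turn contained an accurate (target) or inaccurate (non-target) stress pattern. The Python sketch below shows one way such a tally could be computed. It is an assumption-laden illustration rather than the procedure used in the study: the turn coding (one target word per turn, coded True for accurate stress), the function name accuracy_by_prime, and the example dialogue are all hypothetical.

```python
def accuracy_by_prime(turns):
    """Proportion of accurate stress after target vs non-target primes.

    `turns` is an ordered list of (speaker, accurate) tuples, one per turn
    containing a target word; `accurate` is True when stress was targetlike.
    Only turns that directly follow the *other* speaker's turn are counted.
    """
    # Each entry holds [number accurate, number of opportunities].
    counts = {"after_target": [0, 0], "after_nontarget": [0, 0]}
    for (prev_speaker, prev_ok), (speaker, ok) in zip(turns, turns[1:]):
        if prev_speaker == speaker:
            continue  # no interlocutor prime in the immediately preceding turn
        key = "after_target" if prev_ok else "after_nontarget"
        counts[key][1] += 1
        if ok:
            counts[key][0] += 1
    # None marks a condition with no opportunities.
    return {
        key: (n_ok / n_total if n_total else None)
        for key, (n_ok, n_total) in counts.items()
    }


# Invented coding of an eight-turn exchange (not data from the study).
example_turns = [
    ("A", True), ("B", True), ("A", False), ("B", False),
    ("A", True), ("B", True), ("A", True), ("B", False),
]
result = accuracy_by_prime(example_turns)
print(result)
```

Alignment of the kind reported above would appear as a higher proportion in the after_target condition than in the after_nontarget condition once such tallies are pooled across pairs and tasks.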

The following excerpt, taken from a true/false quiz on children's health, illustrates alignment between two L2 learners in terms of their accurate stress placement in two academic words: deTECted and asSUMption (where capitals designate the stressed syllable).

In this exchange, Speaker B produced an accurate stress pattern after hearing an accurate stress pattern spoken by Speaker A, and this tendency was proportionately greater than the tendency for Speaker B to produce an accurate stress pattern after Speaker A had not produced one.

Interactive alignment aside, these findings indicate that collaborative tasks seeded with instances of targeted pronunciation patterns appear to be successful in providing practice opportunities for learners. In total, academic words were used by each pair of interlocutors on average between 23 and 29 times, with each speaker producing about 11–15 words and consequently hearing his or her partner say an equivalent number of words. This amounts to a sizeable exposure to the target stress pattern in a 10–15 minute activity in which learners are not intentionally focusing on word stress at all, and are simultaneously receiving practice opportunities for other English skills, such as fluency development, vocabulary, or question formation. This means that even in cases where interactive alignment does not occur, learners are still given ample opportunities to practice the target structure in a communicative setting.

4. Implications for pronunciation learning

If the alignment view is at all relevant to L2 pronunciation learning, what does it have to offer? There are several positive consequences of exploring L2 pronunciation from the alignment perspective. First, it firmly places intelligibility as central to communicative success (Levis 2005; Derwing & Munro 2009). If interlocutors’ goal is to achieve understanding, then intelligibility problems can be viewed as failure to align at the level of phonetic/prosodic perception and production. Interactive alignment thus becomes one means for interlocutors to resolve and avoid communication breakdowns, particularly when lack of intelligibility compromises smooth and efficient communication. Second, the alignment view is not relevant just to language immersion as a context of language learning; it also firmly establishes pronunciation within communicative approaches to classroom language learning and teaching (Celce-Murcia, Brinton & Goodwin 2010), with a dual focus on both the speaker and the hearer as active participants in communication. Third, the alignment view gives input an important role in L2 pronunciation learning, especially its quantity and variability (e.g. Thomson 2012). Indeed, learners should be capable of aligning not just to a single speaker (e.g. their teacher). To develop this capacity, they may need to be exposed to a variety of interlocutors through interaction. Finally, the alignment view does not exclude social and contextual influences on learning. For instance, in the course of interaction, interlocutors might align not only in terms of language but also in terms of gestures, facial expressions, eye gaze, and body movement (Atkinson et al. 2007; Churchill et al. 2010). In fact, alignment can be viewed even more broadly – in the context of an individual's interaction with his or her environment (Atkinson 2011). It is possible to imagine, then, that interlocutors can also align (or fail to do so) at the level of social factors, such as attitudes, beliefs, and identity, and that these could influence the nature of interaction and the quality of language produced (Lindemann 2002). From this vantage point, the interactive alignment view emerges as a useful framework for researchers to explain some of the complexities of L2 pronunciation development both from cognitive and sociocultural perspectives, and for teachers to develop or refine activities for use in L2 pronunciation classrooms.

5. Implications for pronunciation teaching

If repetition of language patterns at different linguistic levels is indeed a commonplace feature of communication among native speakers and can be observed and elicited in L2 learners, then what can interactive alignment, as a theoretical view, offer to L2 pronunciation teaching? The answer to this question likely depends on a clear understanding of what underlies alignment in dialogue. In their original model, Pickering & Garrod (2004) proposed priming as the main mechanism of alignment in dialogue. Priming is essentially an implicit, unconscious repetition phenomenon. It refers to speakers re-using language patterns experienced in recent discourse (McDonough & Trofimovich 2008). There is strong support for repetition and priming as implicit phenomena in the fields of social and cognitive psychology, both for native speakers and L2 learners. In social psychology, for instance, mimicry (verbal, facial, emotional, and behavioural repetition) has long been regarded as an automatic and implicit phenomenon of social behaviour (Chartrand & Dalton 2008). And in the field of cognitive psychology, the unconscious repetition of language patterns experienced in recent discourse (shown as priming effects) is considered an automatic and implicit language learning mechanism (Ferreira & Bock 2006; Chang, Janciauskas & Fitz 2012). Thus, the involvement of implicit learning in linguistic alignment is established. What needs to be clarified, though, is how more explicit and overt ways of language learning and use relate to alignment and how such explicit ways of learning (e.g. category formation, inferencing) may be harnessed to promote linguistic alignment. Gambi & Pickering (2011, 2013) have recently outlined a perspective which has the potential to bridge implicit and explicit influences on alignment (see also Pickering & Garrod 2013). This perspective, which is based on a tight coordination between the speaker's and listener's language comprehension and production systems, assumes that speakers not only produce their own utterances but also predict the utterances in the speech of their listeners as they jointly construct understanding in dialogue. Most importantly, the extent of interlocutor coordination, according to Gambi & Pickering, depends on both implicit alignment processes (e.g. mimicry) and explicit contextual factors (e.g. interlocutors’ communicative role).

Although it may be premature to suggest definitive applications of the interactive alignment view to L2 pronunciation teaching, at least until we better understand the implicit and explicit mechanisms underlying repetition in dialogue, several possibilities nevertheless come to mind. First, L2 learners will likely benefit from awareness-raising activities that will sensitize them to the fact that successful interaction often involves a lot of repetition. Learners might also benefit from listening activities featuring authentic spoken interaction, in order to become aware of pronunciation patterns (both segmental and especially suprasegmental) often repeated between interlocutors. Learners may then become more sensitive to how repetition can be used to construct successful interactions (Bremer & Simonot 1996; Watterson 2008). Second, if we adopt the alignment view, then pronunciation activities specifically targeting linguistic alignment hold some promise in pronunciation teaching. This includes collaborative classroom-based activities designed to elicit alignment with target pronunciation patterns (as opposed to convergence on common pronunciation errors), activities featuring corrective feedback, especially recasts, as repeated models of targetlike language, as well as activities built around high-frequency, functional, formulaic language (e.g. Gatbonton & Segalowitz 2005; Trofimovich & Gatbonton 2006; Saito & Lyster 2012; Trofimovich, McDonough & Neumann 2013; Trofimovich et al. unpublished).

Third, if we assume that alignment is enhanced when learners encounter patterns of language that match in many ways – for example, in terms of grammar and pronunciation – we can also hypothesize that alignment should be enhanced for patterns of language experienced simultaneously across several modalities, sensory channels, and presentation media. And there is some very interesting evidence emerging about the effectiveness of multimodal, multisensory techniques applied to the teaching of L2 pronunciation (e.g. Levis & Pickering 2004; Sueyoshi & Hardison 2005; Hardison 2010). Last but not least, the alignment view implies that different kinds of imitation activities – such as silent mouthing (Davis & Rinvolucri 1990), mirroring, echoing, and shadowing (Celce-Murcia et al. 2010) – as well as dramatic imitation techniques that involve imitating not only speech, but also gestures, facial expressions, and affect (Hardison & Sonchaeng 2005), may be particularly useful in helping L2 learners align to a model. I am not arguing here for meaningless, drill-like repetition used as a teaching tool. Instead, my view is informed by insights from first language acquisition literature, where repetition is viewed as a powerful mechanism used by children in learning their native language (Meltzoff 2005; Tomasello & Carpenter 2005) and used only when children understand the meanings and functions of the language or action they experience (Tomasello & Carpenter 2005). L2 learners do seem to rely heavily on repetition as a strategy in pronunciation learning (Osburne 2003; Ding 2007); however, the learning potential of various repetition activities has not been thoroughly explored. The alignment perspective provides a useful framework for investigating the pedagogical value of repetition.

6. Concluding remarks

So where does this all leave us, at least for now? For researchers, the interactive alignment view seems to provide a useful framework for fruitful future investigations into L2 pronunciation learning. Researchers could, for example, study alignment as a function of interlocutor characteristics, investigate alignment in interactive assessment contexts and its impact on assessment validity, compare interactive alignment in native–non-native and L2–L2 communication (especially in lingua franca contexts), study alignment as a complex and situated phenomenon, or examine the learning potential of alignment activities. Two particularly promising avenues of future research include studying long-term learning benefits of alignment in order to establish the acquisitional value of alignment, as well as investigating the kinds of learner and task characteristics (e.g. learner proficiency level, task type, speaker attitude) that lead to maximum alignment (or, in contrast, result in a lack of alignment) in meaningful classroom-based interaction. And for learners like Lucy and Ethel and their teachers, the alignment view suggests that contextualized, meaningful repetition-based activities, including dramatic imitation, multimodal/multisensory tasks, and collaborative alignment activities, can be a useful addition to the activities learners and teachers already use in L2 pronunciation classrooms.


The research reported in this paper was supported by grants from the Social Sciences and Humanities Research Council of Canada (SSHRC) and Fonds québécois de la recherche sur la société et la culture (FQRSC). I would like to thank Sarita Kennedy and three anonymous reviewers for their helpful comments on earlier drafts of this manuscript. I am particularly grateful to my students, co-authors, and research assistants for their invaluable assistance and insight.


Dr. Pavel Trofimovich is an associate professor of applied linguistics in the Department of Education and a member of the Centre for the Study of Learning and Performance at Concordia University in Montreal, Canada. His research focuses on cognitive aspects of L2 processing, L2 phonology, sociolinguistic aspects of L2 acquisition, and the teaching and learning of L2 pronunciation. He has co-authored two books on the use of cognitive psycholinguistic research methods in L2 research and has published over 45 peer-reviewed articles and book chapters. He currently serves as an associate editor for Language Learning.