This article presents a multifactorial analysis of the behaviour of post-obstruent liquid consonants in word-final position in French. More specifically, we examine the phenomenon whereby a final <l> or <r> in a word's orthographic transcription is not pronounced in connected speech, as in the items ‘découvre’ and ‘terrible’, pronounced [dekuv] and [teʁib] respectively instead of [dekuvʁ] and [teʁibl]. In total, more than 2,500 items containing a word-final obstruent+liquid cluster were extracted from a 13-hour speech corpus annotated at several levels. The corpus comprises the productions of 120 speakers from three French-speaking countries (Belgium, France and Switzerland), recorded in two different tasks (reading and conversation). To determine what affects the (non-)pronunciation of <l> or <r> in these contexts, a dozen predictors of various kinds are tested in a single statistical model. Phonetic characteristics such as place and manner of articulation, right-hand context, prosodic status and articulation rate are taken into account, as are speaker-related predictors (country of origin, sex, age, social class) and speech style. Generalised linear mixed models reveal that only half of these predictors emerge as having a significant effect on the variable under study. The results are discussed in light of earlier studies on the sociolinguistic aspects of pronunciation variants in French.
Chapter 1 of Discourse Syntax introduces the concept of discourse syntax and connects this topic to what students likely already know from a basic introduction to English syntax, like parts of speech, basic principles of canonical word order in English and basic patterns of grammatical variation, such as syntactic movement. It emphasizes that patterns of variation are systematic and often rooted in the surrounding discourse. The chapter also introduces corpora of English, such as the Corpus of Contemporary American English (COCA), and the notion of reference grammars.
This chapter presents some of the most significant studies in the history of intercultural pragmatics (IP) research that have applied the methodology of corpus pragmatics (CP). In fact, the use of corpora has been an essential contribution to IP in crucial areas such as formulaic language, context and common ground, or politeness research, among others, with the conviction that CP has redefined the conceptualization of pragmatic competence in a globalized world. The chapter follows a topical structure in which critical areas of research from an intercultural and corpus pragmatic perspective are addressed, such as the role of the lingua franca; the use of academic, professional, and scientific language; cross-cultural studies; prosody, multimodality, and computer-mediated communication; and learner corpora. In all these areas, the chapter highlights the significant research concerns and achievements that have helped to shape IP as an essential discipline in current linguistic theory. A final section offers conclusions and ideas for further research.
This study investigates how Japanese-speaking children learn interactional dependencies in conversations that determine the use of un, a token typically used as a positive response to yes-no questions, backchannel, and acknowledgement. We hypothesise that children learn to produce un appropriately by recognising different types of cues occurring in the immediately preceding turns. We built a set of generalised linear models on the longitudinal conversation data from seven children aged 1 to 5 years and their caregivers. Our models revealed that children not only increased their un production, but also learned to attend to relevant cues in the preceding turns to understand when to respond by producing un. Children increasingly produced un when their interlocutors asked a yes-no question or signalled the continuation of their own speech. These results illustrate how children learn the probabilistic dependency between adjacent turns, and become able to participate in conversational interactions.
This Element explores approaches to locating and examining social identity in corpora with and without the aid of demographic metadata. This is a key concern in corpus-aided studies of language and identity, and this Element sets out to explore the main challenges and affordances associated with either approach and to discern what either approach can (and cannot) show. It describes two case studies which each compare two approaches to social identity variables – sex and age – in a corpus of 14 million words of patient comments about NHS cancer services in England. The first approach utilises demographic tags to group comments according to patients' sex/age while the second involves categorising cases where patients disclose their sex/age in their comments. This Element compares the findings from either approach, with the approaches themselves being critically discussed in terms of their implications for corpus-aided studies of language and identity.
In this chapter, I show that mental health and illness is an increasingly important topic in UK society, both in terms of the number of newspaper articles covering mental illness-related issues and the increased prevalence of mental illness generally. I also show how the public are increasingly aware of the language used to discuss mental illness in the press. Moreover, I explain how the language used to discuss mental illness is being increasingly prescribed by anti-stigma initiatives. Despite anti-stigma activities and initiatives, very little research exists that explores the language used to discuss mental illness in the press using a purely linguistic approach. For this reason, I set out the research gap in the existing literature that this book goes some way to addressing. I also introduce the MI 1984–2014 Corpus and provide an outline for the rest of this book.
This study investigated whether Korean children follow the acquisition pattern predicted by the Aspect Hypothesis (Shirai & Andersen, 1995), and the relationship between caretakers’ and children’s speech. Accordingly, we analyzed a Korean corpus (Ryu-Corpus) on the CHILDES database (MacWhinney, 2000), which comprised longitudinal video-recorded interactions of three Korean children and their caregivers. Results indicate that the children used the past marker -ess principally with telic verbs, consistent with the Aspect Hypothesis. Each child’s usage closely reflects the caretaker’s frequency, yielding a high correlation (τb = 0.79). However, the acquisition of the imperfective marker -ko iss did not show a predicted association with activity verbs, contrary to the Aspect Hypothesis. Furthermore, caretakers’ input did not correlate with the children’s utterances of the imperfective marker (τb = 0.40). We argue that multiple factors such as input frequency, language-specific organization of aspectual semantics, and individual differences should be considered to explain tense-aspect acquisition.
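The τb statistic reported above is Kendall's rank correlation with a correction for ties. As a minimal sketch of how such a coefficient is computed, the function below counts concordant, discordant and tied pairs directly; the function name and the frequency values are hypothetical illustrations and are not taken from the Ryu-Corpus or from the study's own analysis.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences of numbers.

    tau_b = (C - D) / sqrt((C + D + Tx) * (C + D + Ty)), where C and D are
    the numbers of concordant and discordant pairs, Tx counts pairs tied
    only in x, and Ty counts pairs tied only in y.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: contributes to no term
        if dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1

    denom = sqrt((concordant + discordant + ties_x)
                 * (concordant + discordant + ties_y))
    if denom == 0:
        return float("nan")  # undefined when one variable is constant
    return (concordant - discordant) / denom


# Hypothetical per-verb-class frequencies (illustrative only).
caregiver_freq = [12, 7, 7, 3, 1]
child_freq = [9, 8, 4, 2, 2]
print(kendall_tau_b(caregiver_freq, child_freq))
```

Values near 1 indicate that the two rankings largely agree, values near 0 that they are unrelated. In practice a library routine such as `scipy.stats.kendalltau` would usually be preferred, since it also returns a p-value.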
We show that empirical corpus-based research is prevalent across subdisciplines of (applied) linguistics, not just in “corpus linguistics” journals. We define a corpus as a large, principled sample of texts designed to represent a target domain of language use. Corpus representativeness is conceptualized as the extent to which a corpus permits accurate and meaningful generalizations about linguistic patterns that are typical in a domain. Corpus representativeness involves two main considerations, which are both relative to the linguistic research goal of interest: domain considerations (adequate representation of the text varieties in the domain), and distribution considerations (adequate representation of the distribution of linguistic features in the domain).
This article aims to provide a fresh approach to the study of hypercorrection, the misguided application of a real or imagined rule – typically in response to prescriptive pressure – in which the speaker's attempt to be ‘correct’ leads to an ‘incorrect’ result. Instead of more familiar sources of information on hypercorrection such as attitude elicitation studies and prescriptive commentary, insights are sought from quantitative and qualitative data extracted from the 2-billion-word Global Web-based English corpus (GloWbE; Davies 2013). Five categories are investigated: case-marked pronouns, -ly and non-ly adverbs, agreement with number-transparent nouns, (extended uses of) irrealis were, and ‘hyperforeign’ noun suffixation. The nature and extent of hypercorrection in these categories, across the twenty English varieties represented in GloWbE, are investigated and discussed. Findings include a tendency for hypercorrection to be more common in American than in British English, and more prevalent in the ‘Inner Circle’ (IC) than in the ‘Outer Circle’ (OC) varieties (particularly with established constructions which have been the target of institutionalised prescriptive commentary over a long period of time).
This chapter begins with the basics of affixation, including the types of morphemes that are commonly found in English (prefixes, suffixes, bound bases, formatives, extenders). Students learn to formulate word formation rules, to represent the internal structure of complex words with tree structures, and to understand difficulties that arise in segmenting words. The chapter also considers the range of meanings that derivational affixes can express. It includes a section on compounding that considers the difficulties in defining what a compound is, the notion of headedness, and the types of compounds that are common in English. The chapter also briefly considers minor types of word formation such as backformation, blending, acronyms, initialisms, and coinage. Students are taught to use corpora such as the Corpus of Contemporary American English to find their own morphological data.
How do women and men from around the world really speak English? Using examples from World Englishes in Africa, America, Asia, Britain and the Caribbean, this book explores the degree of variation based on gender, in native-, second- and foreign-language varieties. Each chapter is rooted in a particular set of linguistic corpora, and combines authentic records of speakers with state-of-the-art statistical modelling. It gives empirically reliable evaluations of the impact of gender on linguistic choices in the context of other (socio-)linguistic factors, such as age or speaker status, under consideration of local social realities. It analyses linguistic phenomena traditionally associated with genderlectal research, such as hedges, intensifiers or quotatives, as well as those associated with World Englishes, like the dative or genitive alternation. A truly innovative approach to the subject, this book is essential reading for researchers and advanced students with an interest in language, gender and World Englishes.
Digital technology has had a profound and generally beneficial effect on dictionaries and other language-reference tools. Electronic dictionaries continue to evolve and it seems likely that for people born in the current century and beyond, ‘dictionary’ may cease to have its primary denotation as a thick book filled with a list of alphabetised words and their definitions. The idea of the dictionary developed over centuries to its place of privilege in the mid-twentieth century: an authoritative book that could be found in nearly every home. In the decades since then, the idea of the dictionary has rapidly evolved to become, especially for today’s digital natives, an amorphous collection of data that lives in the cloud and that should be quickly retrievable to anyone who desires to find the definition of a word they don’t know, using whatever device they have at hand. In their efforts to become the newest, best, and most dazzling, makers of electronic dictionaries today must not lose sight of the fact that the core need of their user is a simple one that can be met with a simple solution, provided to them with what is now relatively simple technology.
This chapter reviews the transformative effects of technology on dictionary-making, focusing on four main areas: the use of databases for storing and organising dictionary text; the creation and exploitation of corpora for use as the dictionary’s evidence base; the enhancement of the value and usability of corpus data through the application of software tools developed in the NLP (natural language processing) community; and the migration of dictionaries from print to online media. During the last half-century, activity in all these areas has brought fundamental changes to the way dictionaries are created and made available to their users. We trace the development of corpus-based lexicography in English, from the early work of John Sinclair and his colleagues in the 1980s to the present day. Lexicographers working in English and other widely used languages now have access to resources which would scarcely have been imaginable thirty years ago: very large corpora (measured in tens of billions of words) and sophisticated corpus-querying tools are routinely available. The move from print to digital publication is a more recent development, but no less significant. The far-reaching implications of these changes – for dictionary-makers and dictionary-users alike – are explored at every stage.
In Chapter 16, the authors point to the four roles of teachers in vocabulary courses and present research-based suggestions for the effective instruction of vocabulary; they also present a case study that investigated teachers’ perceptions about useful vocabulary, followed by principles required for helping learners with desirable vocabulary learning outcomes.
With increased lexical influence and general English competence among Norwegian language users, the association of the suffix -s with the category of plural appears to be expanding. This article explores the occurrence and productivity of non-possessive -s in contemporary Norwegian, a feature which incorporates several phenomena. Our aim is to chart the lexico-grammatical categories instantiated by this morpho-phonological segment in light of the previous literature on Anglicisms in Norwegian and on the basis of empirical evidence from present-day language use. The article presents a corpus-based survey of categories where non-possessive -s occurs (i) as the plural marker of Anglicisms, e.g. drinks; (ii) in colloquialisms such as dritings ‘dead drunk’ – a combination of a domestic noun and English (or Norwegian) -ing + non-possessive -s reanalysed into an adjectival stem; (iii) in nouns like en caps ‘a (baseball) cap’, where it has lost its plurality marking function and become part of the lexical stem; and (iv) sporadically as a plurality marker of domestic or non-English words, e.g. temas. The variability in presence vs. absence of -s is further explored in four case studies dedicated to different stages of borrowing.
Using a cross-linguistic approach, we investigated Turkish-speaking children's acquisition and use of relative clauses (RCs) by examining longitudinal child–caregiver interactions and cross-sectional peer conversations. Longitudinal data were collected from 8 children between the ages of 8 and 36 months. The peer conversational corpus came from 78 children aged between 43 and 64 months. Children produced RCs later than in English (Diessel, 2004) and Mandarin (Chen & Shirai, 2015), and demonstrated increasing semantic and structural complexity with age. Despite the morphosyntactic difficulty of object RCs, and prior experimental findings showing a subject RC advantage, preschool-aged children produced object RCs, which were highly frequent in child-directed speech, as frequently as subject RCs. Object RCs in spontaneous speech were semantically less demanding (with pronominal subjects and inanimate head nouns) than the stimuli used in prior experiments. Results suggest that multiple factors such as input frequency and morphosyntactic and semantic difficulty affect the acquisition patterns.
Resultatives in English and Dutch have developed special degree readings. These readings stem from a reinterpretation of the resultative predicate as indicating a high degree rather than an actual result. For example, when a parent says I love you to death, one need not call the cops, since the sentence is not about love turning lethal, but merely indicative of a high degree of affection. Such cases have often been noted in the literature as idiomatic, but this view ignores the fact that these are not isolated cases but productive constructions that can be used with a variety of verbs. We explore various resultative constructions in English and Dutch, and give a classification of the subtypes involved as well as their diachronic development from ordinary to degree interpretation. We link these subtypes to lexical semantic classes of verbs. Both English and Dutch show a steady growth in the lexical and structural diversity of degree resultatives throughout the early modern and contemporary periods (1600-2000). We focus in our paper on the period 1800-2000, for which we did an extensive corpus study using the Corpus of Historical American English (COHA) and Delpher (a collection of digitized Dutch newspapers, journals, magazines, and other resources). One of our findings is that, similar to other types of expressive language, such as degree modification and emphatic negation, taboo expressions play a role in degree resultatives; in fact, their role is excessive. We outline a number of the commonalities among the semantic domains of expressive language used in resultatives.
There are several types of absolute constructions (ACs) in English. Among these, this article investigates the so-called what-with AC, which has not received much attention in the study of English grammar. This article considers the grammatical properties of the construction from a synchronic as well as a diachronic perspective, using much more representative and robust corpora than previous studies. Based on corpus data drawn from historical corpora such as COHA (Corpus of Historical American English, 400 million words), the article addresses questions about changes in the construction's syntactic, semantic and pragmatic properties. In addition, the article provides a Construction Grammar perspective, which supports previous research in arguing that the construction is undergoing the processes of grammatical constructionalization.