Most sociolinguistic research, in the UK as well as elsewhere, has not been cognisant of recent advances in human geography (Britain, 2009, 2010). More specifically, current dialectological research tends either not to be informed by rigorous geographical sampling methods or to rely on geographical methods from the 1980s and early 1990s, such as the CURDS functional regions algorithm used to great effect by Cheshire, Edwards & Whittle (1989, 1993). Britain, who has been at the forefront of sociolinguistic theorising of the concept of space in dialectology, proffers three main points of criticism of the variationist enterprise, which we reproduce here in full:
Firstly, variationism has at worst largely ignored spatiality and at best treated it quite distinctly and separately from other social factors until relatively recently. Secondly, when it has engaged with space, it has tended to be a social devoid, Euclidean, distance-is-all type of space, rather than a socially rich spatiality, which recognises that “the fact that social processes take place over space and in a geographically-differentiated world affects their operation” (Massey, 1985:16), again until relatively recently. And thirdly, space has not, yet again until recently, seen the sort of critique in sociolinguistics that has been witnessed by concepts such as style (…). (Britain, 2009:142)
Indeed, the majority of multi-locality sociolinguistic work can be described as spatially naïve, using geographical space merely as a canvas, unanalysed and undertheorized, onto which the results of linguistic analysis can be mapped. However, since the 1970s and 1980s, human geographers have started to conceptualise regions (and places within them) as dynamic entities which warrant more flexible and emically driven multifactorial approaches. Contemporary human geography, having moved beyond static, a priori approaches to space, aims at investigating “the construction of human geographies, the social production of space and the restless formation and reformation of geographical landscapes” (Soja, 1989:10–11). Since little of this work has permeated into sociolinguistics, this paper sets out to investigate the ways in which the discipline can fruitfully draw on models created within the framework of human geography. More specifically, we investigate the benefits of using geographically informed parameters for sampling in multi-locality studies.
In this paper we put forward a model that embraces a socially sensitive approach to space as a sampling criterion. We also report on a pilot analysis that used a range of human geographical methods for sampling across the extreme North East of England (see Map 1).
2. Geographical approaches for sampling in dialectological projects
The primary concern of most multi-locality dialectological projects, especially of older studies but also many recent ones, has been social (rather than geo-demographical) representativeness. Great care is generally taken to investigate and/or control for variability along the classic factors of gender, socioeconomic class, and age (plus sometimes attitudinal and/or network factors). As such, “social variables of the local dialect speakers [are] … homogenized as much as possible in order to examine geographical variation” (Barbiers, Cornips & Kunst, 2007:60). Space, however, although it is the object of investigation, has tended to be treated as carrier material, a blank slate over which linguistic variability is superimposed. Britain (2009:144) comments that “there was actually very little that can be considered truly geographical, let alone spatially sensitive in the work of the traditional dialectologists,” and to a great extent there still is not.Footnote 1 And so, Labov's summary paper (1982:42) rightly states that “the study of the heterogeneity in space has not advanced at the same tempo as research in single communities.”
At the start of the 21st century, dialectology, and with it the theorisation and manipulation of space as it pertains to linguistic analysis, seems to be undergoing an upswing. Two large overview volumes have recently appeared (Auer & Schmidt, 2010; Lameli, Kehrein, & Rabanus, 2010). Critical reflections on space are underway and are being published more widely in the literature (Buchstaller, 2008; Britain, 2004, 2009, 2010; Horvath & Horvath, 2001, 2002; Stuart-Smith, 2002–5). Our descriptive base has also been broadened by the recent collection of a range of large-scale multi-locality data-sets, many of which aim at spatial and human geographical representativeness, leaving outdated grid-based models behind or at least supplementing them with more socially sensitive sampling methods. Let us investigate the sampling strategies of a number of recent large-scale projects in order to trace their conceptualisation and manipulation of space as well as the notion of representativeness that underlies these methods.Footnote 2
An ever-increasing number of atlas projects are coming out of ‘socio-syntax’, a new linguistic sub-discipline that investigates syntactic micro-variability by sampling across larger geographical areas. We briefly discuss the sampling methods underlying the Dynamic Syntactic Atlas of the Netherlands Dialects (SAND, http://www.meertens.knaw.nl/sand/zoeken/), which—under the auspices of the European Science Foundation funded Edisyn project—functions as a hub for similar dialect syntax projects (http://www.dialectsyntax.org/wiki/About_Edisyn).Footnote 3
The SAND approach to sampling is combinatory: It relies on tessellation via a grid model (250 cells of variable size for the whole of the Netherlands, including both urban and rural localities) to ascertain overall spatial representativeness. But it is also sensitive to human geographical factors such as political borders, demographic changes, (counter)urbanisation and isolation: Certain types of locations received a higher sampling density, namely (i) those that are relatively isolated (e.g. (former) islands) and (ii) locations in transitional areas (e.g. between Frisian and Low-Saxon, German and Dutch and along the Germanic-Romance language border). The same also holds for locations in areas for which pilot projects or the linguistic literature revealed more dialectological variation (cf. Lekakou and Barbiers, p.c. 29 March, 2009). SAND does not sample according to population size, “but two important criteria for the selection of the locations were: 1. the villages should have some history. […]. Very recent locations which are fast growing due to industrial or administrative developments were excluded like Zoetermeer or Almere (which is a recently founded village near Amsterdam and thus fast growing), 2. the location didn't undergo very fast demographic changes recently” (Cornips, p.c. 6 April, 2009). While the inclusion of such socio-demographic sampling parameters is a huge step forward, they appear to be administered on a case-by-case basis rather than based on principled parameters rooted in geographical practice. What is more, as in the early days of dialectology, instead of investigating the effects of certain geosensitive types of human activity, such as in-migration, SAND excludes areas that are the locus of such changes.Footnote 4
The Atlas of North American English (TELSUR) (Labov, Ash & Boberg, 2006), which is based on 417 speakers across the territory of English-speaking North America, “was designed with the goal of representing the largest possible population, with special attention to those speakers who are expected to be the most advanced in processes of linguistic change” (http://www.ling.upenn.edu/phono_atlas/sampling.html). The project has incorporated contemporary human geographical concepts such as urbanisation and newspaper readership catchment areas into its sampling design. Three types of areas are sampled: Central Cities (CC), Zones of Influence (ZI), and Urbanized Areas (UA). CCs are defined on the basis of population distribution, with at least 200,000 inhabitants in the 1990 census. ZIs are derived from data on newspaper circulation from the 1992 County Penetration Reports of the Audit Bureau of Circulations (ABC); they consist of counties with the highest circulation of a city's newspaper(s), compared to the circulation of all other cities’ newspaper(s) in that city (cf. http://www.ling.upenn.edu/phono_atlas/sampling.html). Finally, UAs are used as a way to sample at a geographically refined and more meaningful scale, unconstrained by political and administrative boundaries. UAs consist of a core CC (or a group of nearby cities) and the surrounding densely settled territory, with a combined population of at least 50,000. Various population density measures are used to incorporate contiguous census block groups (rather than whole counties) around each core in order to form distinctive UAs. In the design of the TELSUR/Atlas sample, if a speaker is a native of any place within a UA, s/he is taken to be linguistically representative of the respective city's speech community.
In order to differentiate the amount of sampling to be carried out in smaller cities within each ZI, the CCs are further divided into types by population of the corresponding UA (above one million, between 200,000 and one million, or below 200,000 inhabitants) and by physical area of the ZI (with 5,000 square miles as a cut-off).
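The two-way typology just described can be read as a simple lookup. The following Python sketch is purely illustrative: the function name, the category labels, and the treatment of values that fall exactly on a threshold are assumptions made here, not part of the TELSUR documentation.

```python
def classify_central_city(ua_population: int, zi_area_sq_miles: float) -> tuple[str, str]:
    """Return a (population band, ZI size band) pair for one Central City.

    Thresholds follow the text: UA population above one million, between
    200,000 and one million, or below 200,000; ZI area cut-off at 5,000
    square miles. Which side a boundary value falls on is an assumption.
    """
    if ua_population > 1_000_000:
        population_band = "above 1 million"
    elif ua_population >= 200_000:
        population_band = "200,000 to 1 million"
    else:
        population_band = "below 200,000"

    # Hypothetical labels for the two sides of the 5,000 square mile cut-off.
    area_band = "large ZI" if zi_area_sq_miles >= 5_000 else "small ZI"
    return population_band, area_band
```

A call such as `classify_central_city(350_000, 2_400)` would place a city in the middle population band with a small ZI; how each resulting type translates into an amount of sampling is not specified in the passage above and is therefore left out.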
Hence, TELSUR achieves broad geographical coverage and is based on a well-defined, geographically sensitive sampling strategy. However, it is restricted to urban speech (see Milroy & Gordon, 2003:21). Obviously, focusing on either the urban or the rural dramatically reduces the demographic representativeness of the study to just this settlement type, a rather narrow sampling universe in Sankoff's (1980) terms.Footnote 5
Furthermore, while the sampling strategy of newspaper readership catchment areas might provide an adequate profile of speakers’ ideological belonging in the United States, this may not be an appropriate approach in other national contexts. In the UK for example, regional newspapers have a limited following and newspaper readership is class-based rather than geographically distributed (although socio-economic class is obviously not distributed evenly across space). Hence, other measures are needed in order to “cut (…) through the connective tissue of the world in such a way that its fundamental [social] integrities are retained” (Gregory, 1985:328).
Geographers have drawn our attention to the fact that “it is flows between places and not places themselves that matter” (Dorling, 2004:104). The only dialectological project we are aware of that applies flow-based geographies is the Survey of British Dialect Grammar (Cheshire et al., 1989, 1993). This project, which aims at collecting a large-scale data-base in the British Isles, is quite radical in its adaptation of human geographical models to sociolinguistics and in many respects it functions as a methodological precursor to this study. Cheshire et al. (1989, 1993) conducted a large-scale investigation into vernacular morpho-syntax based on questionnaires sent to schools across the UK. They relied on the functional regions system produced on the basis of the 1981 census data by the Centre for Urban and Regional Development Studies (CURDS) at Newcastle University for the classification of their data-points. Functional regions are defined as areas with some geographical coherence, usually measured via parameters such as an area's socio-economic profile, commuting flows by working age population and in/out-migration patterns (Coombes, Dixon, Goddard, Openshaw & Taylor, 1982; Masser & Scheurwater, 1980). They divide the country into a set of urban centres, based on statistical information regarding employment and retailing opportunities. The surrounding areas attached to these urban centres are defined on the basis of commuting patterns, resulting in 228 Functional Regions for the UK, consisting of cores, rings, and outer and rural areas.
These cores are described as the “pivotal nodes of economic activity and social life” (Champion & Coombes, 1983), while their surrounding areas are defined in relation to commuting patterns, reflecting the degree to which their residents depend on the cores for their jobs.
The CURDS functional regions have been widely used by economic geographers and regional scientists for the analysis of economic and social change in a variety of urban and regional scales in Great Britain. Cheshire et al. (1989, 1993) did not sample according to these parameters, but they categorize the 87 schools whose questionnaire responses they analyse in terms of their geographical location into cores, rings, and outer and rural areas. Since 75% of their responses were from the core areas, the Survey of British Dialect Grammar is biased towards the urban centres. Cheshire et al. (1993:63) conclude that “the CURDS system is potentially of great value for research into patterns of linguistic variation and change in the British Isles since it identified important patterns of social communication between people from different geographical areas, on the basis of their economic activity”. In this paper we use an approach that reflects the concept of functional zones pioneered by CURDS as a sampling strategy.
3. Towards a sampling model for the British Isles
The first and to date only large-scale atlas project in England, the Survey of English Dialects (SED, Orton et al., 1962–1971), conducted between 1950 and 1961, covers an impressive number and geographical spread of sampling points: 313 localities in England, the Isle of Man and some areas of Wales close to the English border. A contemporary investigation of dialectal differences in the UK could follow two, often conflicting, principles, namely diachronic comparability with the SED or synchronic geo-demographical representativeness of the area, both of which we discuss in turn.
We could aim for the former and take the sampling points of the Survey of English Dialects as starting points. However, given that the selection process that led to the choice of the SED localities was rather ad hoc Footnote 6 (see Chambers & Trudgill's 1998 criticism), the representativeness of the data is questionable and, we would argue, not defensible. Indeed, even diachronic comparability is debatable, since several sampling points that were once rural isolated localities (such as the former mining villages Earsdon and Washington) have become commuter villages/towns as a result of counterurbanisation.
Even if we get around this issue (by sampling nearby localities, for example), the problem persists that such a sampling strategy is arbitrary and not based on bona fide socio-spatial parameters. What is needed is a dialectology that is rooted in the everyday reality of the people who live in the area investigated and thus cognisant of the fact that “space and spatiality in general is socially constructed (….). [and] constantly evolving” (Allen, Massey & Cochrane, 1998:138). A geographically informed sampling method for a dialectological project would thus aim to represent human activity across space, leading to the appropriation and manipulation of geographies. Indeed, sociolinguists such as Britain (2002) and Kerswill (2009b) remind us that dialectological researchers need to orient their understanding of space to the socio-geographical day-to-day practices of people. More specifically, our research needs to be sensitive to the fact that the
geographies and histories of our social networks and those of the social, economic, and political institutions which guide our daily lives in the West are played out, routinised, and reproduced within functional zones (…) [Consequently] the socio-geographical trajectories of speakers and their institutions are often strongly guided by past practices, by attitudinal considerations and by physical factors, and hence regions are formed. (Britain, 2009:151)
Dialectology thus needs to develop sampling criteria that are sensitive to the everyday flows of human interaction and routinised activities.Footnote 7
As we discussed above, the concept of functional regions, the “pivotal nodes of economic activity and social life” (Champion & Coombes, 1983), has been used to great effect by Cheshire et al. (1993) to classify the schools participating in their Survey of British Dialect Grammar. In this paper, we will use functional regions as a parameter for sampling across space rather than as a descriptive element post hoc. Our units of analysis, the Office for National Statistics travel-to-work areas (TTWAs), are based on up-to-date information from the 2001 census, yet they also reflect the concept of CURDS functional regions in that they group smaller areas into larger ones according to the strength of flows between them. TTWAs are defined by the following criteria, which were laid out in 2007 using 2001 census data on commuting (home and work addresses/postcodes, see http://www.ons.gov.uk/ons/guide-method/geography/beginner-s-guide/other/travel-to-work-areas/index.html): (i) At least 75% of the resident economically active population work in the area; (ii) at least 75% of everyone working in the area live in the area; and (iii) the minimum size is a working population of 3,500 (Coombes & Bond, 2008). This means in effect that geographical units (in our case, 2001 census areas) “‘organise themselves’ on the basis of their mutual commuting links” (Mooney & Carling, 2006:71) within the group with which they had the strongest mutual coherence. As such, TTWAs satisfy the criteria of minimum “population size and self-containment” (Shortt, Moore, Coombes & Wymer, 2005:2715), but they rely solely on optimising commuting flows, making them straightforward to conceptualise in a sociolinguistic framework.
In other words, TTWAs do not pose an additional level of complexity to the functional regions.Footnote 8 Based on these criteria, the whole of the British Isles subdivides into 243 TTWAs. These areas of routine movement host the most fundamental grooves of daily interaction, based on commuting to and from work, often subsuming school runs, shopping trips and evening entertainment on the way and thus leading to the creation of space time zones. They are thus inherently meaningful in terms of people's daily routines and interactions.
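The three TTWA criteria listed above amount to two self-containment ratios plus a size threshold. The sketch below restates them in Python under stated assumptions: the class and function names are hypothetical, and the “working population” of criterion (iii) is taken here to mean the resident economically active population, which the text does not specify. It checks a single candidate area and does not attempt the iterative grouping of census areas by which TTWAs are actually built.

```python
from dataclasses import dataclass


@dataclass
class AreaFlows:
    """Commuting counts for one candidate area (hypothetical structure)."""
    live_and_work: int     # residents who also work inside the area
    resident_workers: int  # economically active residents, working anywhere
    local_jobs: int        # everyone working in the area, living anywhere


def meets_ttwa_criteria(flows: AreaFlows,
                        min_share: float = 0.75,
                        min_size: int = 3_500) -> bool:
    """Check criteria (i) to (iii) as stated in the text for a single area."""
    # (iii) minimum size; assumed here to refer to resident workers.
    if flows.resident_workers < min_size or flows.local_jobs == 0:
        return False
    # (i) share of resident workers who work in the area.
    supply_side = flows.live_and_work / flows.resident_workers
    # (ii) share of people working in the area who also live there.
    demand_side = flows.live_and_work / flows.local_jobs
    return supply_side >= min_share and demand_side >= min_share
```

For instance, an area with 10,000 resident workers, 9,000 local jobs and 8,000 people who both live and work there would pass (80% and roughly 89% self-containment), whereas the same area with only 6,000 such people would fail criterion (i).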
What makes sampling via TTWAs inherently superior to approaches that are based on grids or stationary political boundaries is the fact that they are the fundamentally local outcomes of people constructing their ‘own’ place (Kerswill, 2009b). They also conform entirely to Giddens' (1984:376) concept of routinisation as “the habitual, taken for granted character of the vast bulk of activities of day-to-day social life”. For example, TTWAs have been used in geographical research to compare patterns of commuting in relation to employment opportunities and to develop employment policies. We propose that they are an ideal starting point for dialectological work since they (i) provide a stringently controlled sampling framework that is based on contemporary geographical methods, (ii) are widely available and easily accessible at http://www.ons.gov.uk/ons/guide-method/geography/beginner-s-guide/other/travel-to-work-areas/index.html, and (iii) readily lend themselves to a range of applications within the field of dialectology.
Taking zones of routinised everyday movement as a starting point thus results in a geography that is based on the human appropriation of space. Map 2 shows that the TTWAs in the North of England/South of Scotland are fundamentally independent of, and indeed criss-cross, political boundaries. For example, the TTWA centring around Berwick-upon-Tweed stretches on both sides of the political border. The special status of Berwick in the history of the English-Scottish border is reflected in the gestalt of the TTWA, with commuters from both directions flocking into Berwick-upon-Tweed (Glauser, 1974).Footnote 9
Britain (2010) points out that while routine interaction creates spaces of various kinds and shapes, a geographically sensitive approach to space also needs to account for a wealth of intimately inter-correlated socio-demographical factors. Indeed, once we have chosen the fundamental basis of spatial analysis, the next, rather thorny, question is where to sample inside a TTWA while ensuring socio-demographic representativeness. We will briefly discuss the repercussions of using SED localities before proposing several socio-demographic parameters that could be used as sampling criteria. Using GIS (Geographical Information Systems) for manipulating the socio-economic information, we then embark on overlay analysis of TTWAs with socio-economic areal characteristics in order to define the sampling areas.
Above, we have argued against using SED sampling points due to their ad hoc character and lack of diachronic comparability. Map 3 provides another argument against the use of SED sampling points: It would lead to oversampling in certain areas (i.e. Wark, Haltwhistle, and Allendale in the Hexham & Haltwhistle TTWA) as well as undersampling in others (no sampling points in the Hartlepool or Darlington TTWAs).
What is needed is a principled method for selecting localities within these TTWAs. We would like to argue that the sampling points of any dialectological project that aims to be representative of the area it covers should correspond to the socio-demographic and economic make-up of the area. Since the TTWAs are obviously heterogeneous in this respect (given the emphasis on commuting criteria for their construction), we need to investigate their socio-demographic characteristics. Such an analysis fundamentally relies on the concept of socio-economic area classification, or SEAC, a key concept in social geography, in relation to area profiling and geo-demographics (Harris, Slight & Webber, 2005). Geo-demographic classifications use socio-economic data from national censuses and other governmental and commercial databases to “group together geographic areas according to key characteristics common to the population in that grouping” (http://www.ons.gov.uk/ons/guide-method/geography/products/area-classifications/ns-area-classifications/index/available-geographies/index.html?format=print). In the context of the British Isles, it “distils key results from the 2001 Census for the whole of the UK at a fine grain to indicate the character of local areas” (http://areaclassification.org.uk/getting-started/getting-started-what-is-the-output-area-classification/). This results in a categorization of areas of variable sizes (from local authorities to wards down to very small census output areas) according to a range of socio-demographic and economic components that were included in the census. The main dimensions of these components are demographic, household composition, housing, socio-economic, employment and industry sector.
Cluster analysis of the 2001 census data has revealed that the British Isles can be grouped into 9 socio-economic “supergroups,” i.e. areas with characteristic socio-economic and demographic profiles. These supergroups are industrial hinterlands, traditional manufacturing, built-up areas, prospering metropolitan, student communities, multicultural metropolitan, suburbs and small towns, coastal and countryside, and accessible countryside (Vickers & Rees, 2007). Figure 1 shows a radar chart representing the profile of areas that are classified as “traditional manufacturing”. “Each spoke of the wheel represents a ‘variable’ – a characteristic of the population. Points are plotted to indicate values for each variable relative to the mean of the population” (http://www.ons.gov.uk/ons/guide-method/geography/products/area-classifications/ns-area-classifications/index/overview/index.html#4).
Hence, in terms of their socio-demographic profile, areas that correspond to the “traditional manufacturing” profile tend to have an above average share of people unemployed or routinely employed and separated/divorced/single parent households. These areas also tend to have a high percentage of terraced housing (and lower share of detached housing) as well as a much lower share of households owning two cars. For our analysis, we used the results of the geo-demographic cluster analysis based on the data available from the National Statistics 2001 Area Classification (http://www.ons.gov.uk/ons/guide-method/geography/products/area-classifications/national-statistics-area-classifications/national-statistics-2001-area-classifications/index.html). We chose as our unit of analysis the 2001 census statistical ward, which is a “frozen in time” version of the ever-changing electoral ward. Wards (statistical or electoral) are fundamentally local areal units in the British context and therefore meaningful from the perspective of the individual, despite the fact that their detailed boundaries may change every few years as a result of electoral considerations (e.g. to ensure representation amongst the electorate).Footnote 10 The second reason why wards were selected as the unit of analysis is that they facilitate communication between researchers, fieldworkers and subjects during the sampling and recruiting process. In short, it is easier to seek subjects—and communicate the exact space requirements to them—from a list of qualifying wards (that people can relate to), rather than a much longer and complex list of postcode areas (or even specific streets). However, it is noted here that this method can be fine-tuned by using a finer level of areal sampling units (such as census output areas or even full postcodes) if a sufficiently large number of informants is to be recruited.
We then superimposed the SEAC-based supergroup ward profiles on the TTWAs of the North East of England; Map 4 is the result of this procedure. It reveals the diversity of the North East region in terms of socio-demographic make-up: From the predominantly “coastal and countryside” areas south of Berwick (which itself is classified as a “built up area”) and the Northumberland countryside we move south to the urban conurbation of Newcastle and Gateshead, which is dominated by wards classified as “industrial hinterland” and “traditional manufacturing”.
The fundamental advantage of the SEAC classification, apart from the fact that it is readily available online, is that it is population-sensitive and encapsulates a wealth of socially relevant variables that have been selected on the basis of multivariate analysis (see http://www.ons.gov.uk/ons/guide-method/geography/products/area-classifications/national-statistics-area-classifications/national-statistics-2001-area-classifications/methodology-and-variables/wards/index.html for methodology and the full set of variables). Thus, using a SEAC-based sampling strategy not only gives us an overview of the socio-economic make-up of our TTWAs, it also allows us to make representative sampling decisions on the basis of a wealth of socio-demographic information.
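Once every ward carries both a TTWA label and a supergroup label, the socio-economic make-up of each TTWA reduces to a population-weighted tally. The following sketch shows one way such a tally could be computed; it is not the GIS overlay procedure itself, and the record layout, function name and toy figures in the usage example are hypothetical.

```python
from collections import defaultdict


def supergroup_profile(wards):
    """Return, per TTWA, the share of population living in each supergroup.

    `wards` is an iterable of dicts with the (assumed) keys
    'ttwa', 'supergroup' and 'population'.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for ward in wards:
        totals[ward["ttwa"]][ward["supergroup"]] += ward["population"]

    profile = {}
    for ttwa, groups in totals.items():
        ttwa_population = sum(groups.values())
        profile[ttwa] = {
            group: population / ttwa_population
            for group, population in groups.items()
        }
    return profile


# Toy records, invented for illustration only (not census figures).
example_wards = [
    {"ttwa": "T1", "supergroup": "traditional manufacturing", "population": 6_000},
    {"ttwa": "T1", "supergroup": "industrial hinterland", "population": 4_000},
    {"ttwa": "T2", "supergroup": "coastal and countryside", "population": 500},
]
example_profile = supergroup_profile(example_wards)
```

Shares of this kind could then inform how many sampling points to place in each supergroup within a TTWA, although that allocation remains a judgement for the researcher.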
The question of how many data points are needed is obviously fundamentally dependent on a range of factors, including the research questions, focus and scale of the project in terms of time and financial resources, and thus cannot possibly be decided a priori. Here, we report on a small-scale pilot project that tests the usefulness of the methods described above for dialectological research. We decided to sample in the four northernmost TTWAs of the North East of England, namely “Berwick”, “Morpeth, Ashington and Alnwick”, “Hexham and Haltwhistle” as well as “Newcastle and Durham”. Our sampling points were chosen on the basis of geo-demographic and population-based representativeness and, as a further consideration (if possible), the existence of an old SED point in the wider area. For the northernmost TTWA, we sampled in Lowick, a former SED sampling point, which, being a hamlet of only 560 inhabitants, is wholly representative of a TTWA that is classified as predominantly coastal and countryside. The inland of the Morpeth, Ashington & Alnwick TTWA is coastal and countryside. Most of the population live along the coastline, in a stretch of area classified as industrial hinterland and traditional manufacturing. We sampled in Linton Colliery, a small ex-mining village of only a few hundred inhabitants about 1.5 miles from the old SED sampling point Ellington. In the Hexham and Haltwhistle TTWA, which is also mostly classified as countryside, we sampled just outside of Hexham.
The geo-demographic profile of the heavily populated Newcastle and Durham TTWA was slightly more complex, with the majority of wards classified as traditional manufacturing (52 wards comprising 394,834 inhabitants) and industrial hinterland (67 wards comprising 392,995 inhabitants). We aimed at a sampling strategy that captures this diversity. We thus chose two traditional manufacturing sampling points, namely Westerhope and Jarrow, which are situated north and south of the Tyne within the perimeters of the urban conurbation. We also chose one industrial hinterland sampling point further south, the ward of Delves Lane, a village of about 1,300 inhabitants. This also gives us the opportunity to investigate whether the traditional isoglosses that earlier research has revealed to run south of the urban conurbation (see Glauser, 1974, 2000; Kolb, 1966; Kolb, Glauser, Elmer & Stamm, 1979)Footnote 11 still hold in 2009.
4. Putting theory into practice: Applying the new method to a dialectological project
We now discuss an application of the model we have developed for sampling across space in the context of the British Isles. Given that the aim of this pilot study is to test the socio-geographically sensitive method outlined above, we restricted our sample to only one slice of the population: Older (40+) speakers with comparatively little formal education (none of our informants went to university or received any form of higher postsecondary education). We sampled one man and one woman per location, all of whom share either kinship or friendship networks with their paired partners and maintain dense social networks in their local community. The informants were born in the locality and have lived in the same ward (or in an adjacent one, provided that it has the same socio-demographic profile) at least until the age of 18 and for most of their adult lives.
One fundamental restriction of our sample is thus that it only includes the informants commonly used in dialectological research. Note in this respect that previous research has established that different socio-demographic groups have different geographies; restricting one's sampling universe to one or more groups can only give us access to one amongst a multitude of intersecting spatialities (see e.g. the geographies of age, Hopkins & Pain, 2007; gender, Bondi, 1996; McDowell, 1992; ethnicity, Bonnett, 1996, 1997; or disability, Imrie, 1996). We have thus decided to control for a maximum of social factors. A larger follow-up project will need to include speakers with a range of different speaker profiles in order to get a picture of the full socio-demographic reality of the area covered.
We report on the results from an indirect grammaticality judgement task.Footnote 12 Informants were asked to rate sentences by assigning them a number that corresponds to a verbal descriptor (see Labov, 1996). We used the following four-point scale:
1 This type of sentence would never be used here—it seems very odd.
2 This type of sentence is not very common here but it doesn't seem too odd.
3 I have heard this type of sentence locally but it's not that common.
4 People around here use this type of sentence a lot.
Example (1) illustrates a sentence as it was administered in our questionnaire. All sentences to be judged were marked in bold and embedded in a short contextualising text of two to three sentences to make them pragmatically more acceptable (see also Buchstaller & Corrigan, Reference Buchstaller and Corrigan2011).
(1) Example of the indirect grammaticality judgement task
Please rate these sentences as described above.
The local supermarket got robbed and the police were looking for a witness. They were asking a group of children whether they had seen anything. Suzie pointed at a little girl. She said “That's the girl seen it”.
Altogether there were 149 sentences (74 experimental sentences, 75 fillers) which alternated in randomised order. We divided these sentences into two questionnaires, of which we constructed 2 randomisations each. Every informant thus completed 2 questionnaires with a lengthy break in-between—half of the informants filled out the first randomisation and the others filled out the second.
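The construction of the questionnaires described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how the 149 sentences were split or shuffled, so the even split, the filler-first alternation, the `seed` parameter and the function names `alternate` and `build_questionnaires` are all hypothetical.

```python
import random

def alternate(fillers, experimental):
    """Interleave two lists so that items alternate, starting with a filler."""
    order = []
    for i in range(max(len(fillers), len(experimental))):
        if i < len(fillers):
            order.append(fillers[i])
        if i < len(experimental):
            order.append(experimental[i])
    return order

def build_questionnaires(experimental, fillers, seed=0):
    """Return {(questionnaire, randomisation): ordered list of sentences}.

    Assumption: sentences are split as evenly as possible over two
    questionnaires, and each questionnaire is shuffled twice.
    """
    rng = random.Random(seed)
    exp, fil = experimental[:], fillers[:]
    rng.shuffle(exp)
    rng.shuffle(fil)
    e_mid = len(exp) // 2
    f_mid = (len(fil) + 1) // 2  # the odd filler goes to questionnaire 1
    pools = [(exp[:e_mid], fil[:f_mid]), (exp[e_mid:], fil[f_mid:])]
    questionnaires = {}
    for q, (e_pool, f_pool) in enumerate(pools, start=1):
        for r in (1, 2):
            e_r, f_r = e_pool[:], f_pool[:]
            rng.shuffle(e_r)
            rng.shuffle(f_r)
            questionnaires[(q, r)] = alternate(f_r, e_r)
    return questionnaires
```

With 74 experimental sentences and 75 fillers, this yields one questionnaire of 75 items and one of 74, each available in two orders. An informant would then be given questionnaires (1, r) and (2, r) for a single value of r.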
The linguistic features included in this pilot project are so-called typical “Northern” features, i.e. variants that are traditionally associated with Tyneside English and/or Scottish English as described in Beal (Reference Beal1993, Reference Beal2004) and Miller (Reference Miller2004) inter alia. We illustrate them briefly in turn.
The non-standard second person plural pronoun, often spelled yous, is a feature of both Tyneside and Scottish English (Beal, Reference Beal1993:205; Beal & Corrigan, Reference Beal and Corrigan2004; Miller, Reference Miller2004:49; Buchstaller & Corrigan, to appear). We tested for the effect of syntactic position on respondents’ ratings, namely subject versus object position (2a. and 2b. respectively).
(2) a. Yous could share some pasta.
b. I want to play my song to yous.
Multiple negation is widely regarded as being “one of the most stigmatized features of non-standard English” (Beal & Corrigan, Reference Beal and Corrigan2005:145). We investigated respondents’ acceptance of two types of non-standard negation, multiple negation with Standard English lexis—verbal negation and negation with negative polarity items (in 3a.-b.)—as well as the presence of a vernacular negator, Scots dinnae where Standard English calls for don't or do not (in 3c.), and the Tyneside English equivalent divven't (in 3d.).
(3) a. I didn't see nobody.
b. Nobody bought nothing.
c. I dinnae eat steak.
d. She divven't read novels.
We also considered the acceptance rates of relative clause markers used in subject, animate, restrictive sentences. The vernacular variants examined were as (4a.), what (4b.) and zero (4c.).Footnote 13 Ball (Reference Ball1996:243) states that “there is no vernacular norm for either BrE or AmE with respect to the distribution of relative markers.” Indeed, speakers of non-standard varieties of English tend to show locally specific patterns in their usage of vernacular strategies, which tend to be at the expense of marking with WH-elements (Poussa, Reference Poussa1985; Tagliamonte, Smith & Lawrence, 2005).
(4) a. It's my mother as needs them.
b. He's the man what bought it.
c. That's the man Ø helped me.
Finally, we investigated the Northern Subject Rule (henceforth NSR, as in 5a.-b.), a phenomenon whereby verbs attract an -s suffix even when the subject NP is not third person singular in function (Beal, Reference Beal2004:122).Footnote 14 Little is known about the geographical scope of its use, and the extent to which its constraints are stable across space. We tested for the NP/PRO constraint, which “marks a verb with -s if its subject is anything but an adjacent pronoun” (Montgomery, Reference Montgomery1994:86, see 5a). We also analysed the acceptability of NSR with conjoined nouns forming the subject (as in 5b., see Beal & Corrigan, Reference Beal and Corrigan2000; Godfrey & Tagliamonte, Reference Godfrey and Tagliamonte1999; Buchstaller, Corrigan, Holmberg, Honeybone & Maguire, Reference Buchstaller, Corrigan, Holmberg, Honeybone and Maguire2013).
(5) a. The children says they will return your kindness when they go Ø out there … (Fitzpatrick, Reference Fitzpatrick1994:350)
b. My mother and father hides in the garden.
We now move on to describe our results in a series of tables which depict the acceptability ratings of these four constructions by linguistic environment and locality. The two-dimensionality of these tables conceals a north-south axis from Lowick in the north, over Linton Colliery to Hexham, Westerhope and Jarrow and finally to Delves Lane in the south, and an east–west axis, which will become particularly important with respect to the location of Hexham, west of the urban conglomeration.Footnote 15 Following Horvath & Horvath (Reference Horvath and Horvath2002), we consider the linguistic conditioning of these variables across geography, interpreting breaks in the probabilistic patterns of these variables as areas where “one pattern of sociolinguistic variability gives way to another pattern of sociolinguistic variability at some point in space” (Horvath & Horvath, Reference Horvath and Horvath2001:47). Hence, according to the cartographical method of natural breaks advocated by these authors, the loci of quantitative or qualitative differences in the constraints that govern these linguistic variables can be interpreted as areas of dialect transitions or—if they are found to cluster in space—even dialect boundaries.
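One common way of formalising a natural break is the Jenks/Fisher criterion, which places a class boundary where it minimises the variation within each class. The sketch below applies this criterion to a list of mean ratings and splits them into two classes. It is an illustration only: whether Horvath & Horvath's procedure corresponds exactly to this criterion is an assumption, the function names are hypothetical, and the values in the example are invented placeholders rather than ratings from our tables.

```python
def sum_sq_dev(values):
    """Sum of squared deviations of the values from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def natural_break_two_classes(values):
    """Split sorted values into two classes at the cut that minimises
    the total within-class sum of squared deviations."""
    ordered = sorted(values)
    if len(ordered) < 2:
        raise ValueError("at least two values are needed to find a break")
    best = None
    for cut in range(1, len(ordered)):
        lower, upper = ordered[:cut], ordered[cut:]
        cost = sum_sq_dev(lower) + sum_sq_dev(upper)
        if best is None or cost < best[0]:
            best = (cost, lower, upper)
    return best[1], best[2]

# Invented placeholder ratings on the four-point scale (not study data).
print(natural_break_two_classes([1.0, 1.5, 3.5, 4.0]))
# ([1.0, 1.5], [3.5, 4.0])
```

The sketch operates on rating values alone. Interpreting a break as a dialect transition additionally requires checking that the localities on either side of it are also separated in space.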
Table 1 depicts the ratings for 2nd person plural yous, a feature that has been described as a Northern variant more widely (Beal & Corrigan, Reference Beal and Corrigan2004). Indeed, the ratings from the 6 localities—while variable in terms of overall acceptability—confirm that yous, overall as well as in both syntactic positions, is generally recognized as being used by people across the localities sampled in the North East.
Note, however, that informants in Delves Lane, Linton Colliery, and Lowick, the three non-urban localities at the northern and southern periphery of our sample, are less accepting of the feature. It remains to be investigated whether this is an indication of yous being associated with urban speech communities (as suggested in Beal & Corrigan, Reference Beal and Corrigan2004), especially since the feature receives high acceptability rates in the rural locality close to Hexham.Footnote 16
The preferred linguistic context of yous is subject position. While this constraint is not significant (probably due to low token numbers), the overall direction is the same everywhere except Westerhope, where yous-ratings are independent of syntactic position. But even in Westerhope, including more speakers in the analysis (as we have done on the basis of a follow-up study), and thus increasing token numbers, results in the same pattern of subject over object. Overall, the results for vernacular 2nd person yous form a relatively homogeneous picture where all localities share the same constraint hierarchy. Differentiation across space starts to show when we look at negation in Table 2.
Hughes, Trudgill and Watt (2005) claim that multiple negation is used frequently in Northern and Scottish dialects (but see Anderwald, Reference Anderwald2004). Indeed, all of our informants identified multiple negation as a feature that is used in the North East, albeit with different degrees of acceptability. Importantly, Table 2 displays the transition from the Scottish dinnae to the typical Tyneside divven't as incremental changes in mean ratings from one locality to the next. Unsurprisingly, dinnae received the highest possible acceptability rating, 4, in the northernmost locality, Lowick, where informants are generally very accepting of vernacular negation. Some 43 miles further south, in Linton Colliery, the acceptability of dinnae has dropped to 3.25, but it is still the highest-rated negative variant. Conversely, informants in the urban Newcastle-Gateshead area prefer the Tyneside form, divven't (see Beal, Reference Beal1993; Glauser, Reference Glauser1974). Note the very low ratings for dinnae, particularly in Jarrow.
Note also that Delves Lane, the southernmost sampling point, manifests reduced ratings of divven't and increased acceptability for dinnae. We suspect that this is due to the phonetic similarity of dinnae to another localised northern form, dinnet (attested south of the Tyne by Ellis, Reference Ellis1889, for South Shields and by Orton, Reference Orton1933, for Byers Green, a mere 16 miles from Delves Lane, see also Beal, Burbano-Elizondo & Llamas, Reference Beal, Burbano-Elizondo and Llamas2012),Footnote 17 which we did not test for in this pilot study. There is thus a clear north-south gradation in terms of preference of forms, from dinnae in the North over divven't in the urban Newcastle-Gateshead conurbation to (we assume) dinnet further south. We interpret these results as the perceptual probabilistic outcome of a fan (Glauser, Reference Glauser2000).Footnote 18 Note, however, that the east-west dimension also matters in this respect: Informants just outside of Hexham, which is at about the same latitude as Newcastle, have roughly equal ratings for dinnae and divven't. Further research is needed to ascertain whether this finding is an expression of the fact that the dinnae-territory spreads further south in the rural TTWA west of Newcastle or whether Hexhamites also regard themselves as users of dinnet (or of other local forms). Overall, it seems that the urban Newcastle-Gateshead conurbation is the clear geographical stronghold of divven't, whereas nasal variants reach much higher acceptability rates elsewhere. Let us now tackle the ratings for relativisation in Table 3.
In line with Hughes et al. (Reference Hughes, Trudgill and Watt2005) and Cheshire et al. (Reference Cheshire, Edwards and Whittle1989), what is rated highest in all our localities (except Delves Lane, where vernacular relatives receive relatively even ratings). Note, however, that, contrary to claims in the literature, high acceptability of what is not restricted to urban localities: the form achieves high scores in Linton Colliery, Lowick and amongst the informants close to Hexham. Note also that, in spite of the fact that what is readily accepted in Linton Colliery/Lowick, and has been recorded in Glasgow (Miller, Reference Miller1993:62), the variant is not traditionally found in Scottish dialects, and Poussa (Reference Poussa1985) has suggested that it has spread upwards from the south.
Note in this respect that Edwards & Weltens’ review (Reference Edwards and Weltens1985) suggests that, especially in the North, speakers prefer other vernacular relativisation strategies. In our study, however, the relativiser as is generally rated relatively low and does not follow any consistent pattern (see also Tagliamonte et al., Reference Tagliamonte, Smith and Lawrence2005; Kortmann, Reference Kortmann2004). Zero relatives only achieve acceptability ratings that surpass what-ratings amongst informants in Delves Lane, the southernmost locality. The only other locality with reasonable acceptability ratings for zero relatives is Lowick in the extreme North East (and to a certain extent Hexham). Note that the zero form, which has been in use ever since Old English (Traugott, Reference Traugott1972), has been found in the Southern Scottish Borders by Murray (Reference Murray1873:194), who commented that “an ellipsis of the relative is extremely common.” It has also been attested in both Tyneside and Sheffield in The Survey of English Dialects (Orton et al., Reference Orton1962–1971), The Newcastle Electronic Corpus of Tyneside English (http://www.ncl.ac.uk/necte) and The Survey of Sheffield Usage (see Beal & Corrigan, Reference Beal and Corrigan2007; Buchstaller & Corrigan, to appear). Given the lack of comparative diachronic quantitative data across the North Eastern area, it is not entirely clear whether our finding might be taken as an indication that the geo-spatial locus of zero relatives in the North East is, at least synchronically, more in the peripheral areas. More data, also from younger age groups, is needed in order to establish the complex competition amongst relativisation strategies in the North East of England.
Finally, let us consider the linguistic conditioning of the NSR across the six localities in Table 4. Historically, as we pointed out above, verbal -s has been reported to be conditioned by the NP/PRO constraint. Conjoined nouns also tend to favour the occurrence of verbal -s. Synchronically, however, these constraints seem to be undergoing locally specific reinterpretation (see Buchstaller et al., Reference Buchstaller, Corrigan, Holmberg, Honeybone and Maguire2013).
The acceptability ratings in Table 4 reveal localised patterns. Three localities, Westerhope, Delves Lane and Linton Colliery, display a binary constraint hierarchy whereby conjoined NPs receive considerably higher ratings than subjects that consist of full non-3rd person singular NPs or pronouns, which are rated least acceptable. The rating of NPs over pronouns is a synchronic reflex of the NP/PRO rule. The preference of conjoined NPs over full NPs is fully in line with Visser (Reference Visser1963), Beal & Corrigan (Reference Beal and Corrigan2000) and Godfrey & Tagliamonte (Reference Godfrey and Tagliamonte1999). Indeed, Buchstaller et al. (Reference Buchstaller, Corrigan, Holmberg, Honeybone and Maguire2013) have suggested that this pattern might be due to reanalysis of the 2nd conjoint of the complex subject NP as a 3rd person sg. subject.
Note, however, that Jarrow and Hexham display a slightly different pattern whereby full NPs favour the acceptance of the NSR over conjoined NPs, with pronouns coming last, as in the other localities. We might want to argue that in these localities the original NP/PRO constraint is still firmly in place, whereas the reinterpretation of the 2nd conjoint has not taken place. Indeed, research in Hawick, a small town in the Scottish Borders, has revealed similar results (see Buchstaller et al., Reference Buchstaller, Corrigan, Holmberg, Honeybone and Maguire2013; Childs, 2013).
Finally, the informants in Lowick, while displaying the conjoined NP effect, rate pronouns (the lowest ranked environment everywhere else) higher than single NPs. Hence, it seems that informants in Lowick do not orient to the NP/PRO constraint at all. Obviously, given the small number of informants sampled in these localities, the variability in Table 4 might be due to orthogonal social/attitudinal or even idiosyncratic factors, and these results need to be confirmed on the basis of a larger database. However, the findings reported here support research by Buchstaller et al. (Reference Buchstaller, Corrigan, Holmberg, Honeybone and Maguire2013) conducted in Westerhope and Hawick that is based on a larger number of participants.
We suggest that there are two possible explanations for the geographically differentiated outcome in Table 4, assuming it is not a sampling artefact. The variability could be the result of the locally specific adaptation of a bundle of linguistic constraints that are currently changing across a wider spatial area. As Buchstaller et al. (Reference Buchstaller, Corrigan, Holmberg, Honeybone and Maguire2013) point out, the NSR seems to be in the process of undergoing major reanalysis, and our data from 6 different localities across the North East suggest that this process results in geospatial diversity synchronically. Alternatively, it might well be that the NSR, even historically, has never had the geographical uniformity it has been portrayed as having. Rather, it might have always been subject to localised constraints. Historical treatments tend to be based on impressionistic and/or small-scale studies and past empirical research lacks systematic geographical coverage. More data, synchronic as well as diachronic, is needed from a range of localities in order to ascertain the mechanisms lying behind the results shown in Table 4 (see Pietsch, Reference Pietsch2005; Ramisch, Reference Ramisch2008).
Overall, Tables 1–4 display probabilistically gradient acceptability ratings of linguistic variability. Conceptualising these ratings within two dimensions, namely space—north to south and east to west—and place—the urban conurbation Newcastle-Gateshead versus various rural locations (Horvath & Horvath, Reference Horvath and Horvath2001)—reveals that some variables are more systematically patterned than others. Indeed, using the natural break pattern allows us to examine the “dialect landscapes” (Britain, Reference Britain2010:72) across the North East that fall out of a socio-demographically informed sampling strategy.
The northern sampling points, Lowick and Linton Colliery, give high ratings to dinnae, whereas the urban Tyneside complex has particularly high ratings of divven't. Delves Lane, the most southern locality, while not categorically different from any of the other sampling points, manifests the influence of another nasal variant that has been associated with more southern localities. These ratings give support on the perceptual level to the description of the English–Scottish border as a fan. However, orthogonal to space, place effects are also operational in the ratings for vernacular negatives: Hexham, which is at the same latitude as Newcastle, garners relatively high dinnae ratings. It thus seems as if a preference for divven't is associated mainly with the conurbation Newcastle-Gateshead. We also detected a possible urban predominance for yous, which received much higher ratings in localities within the boundaries of the urban conurbation (Jarrow and Westerhope) than in the rural countryside, except for Hexham. Similarly, the higher acceptability ratings for zero forms on the northern and southern periphery might be due to the preponderance of a competitor form, what, in the urban centre. The ratings for the NSR, on the other hand, seem to be the locally specific manifestations of a phenomenon that has been described as generally northern (Murray, Reference Murray1873) but the constraining factors of which seem to vary from place to place (see Buchstaller & Corrigan, to appear).
Chambers and Trudgill (Reference Chambers and Trudgill1998:30) point out that “the future of dialect studies will have to be directed towards more representative populations.” Indeed, we have argued that considerations of geo-demographical representativeness have not received the kind of attention they warrant in large-scale dialectological work. Britain, similarly, finds that much of dialectological research has either tended to “carefully control (…) [space] out of the study” (2009:143) or turned it into a “homogenised, historically-, socio-economically-, and institutionally blind blank canvas” (2010:87). In this paper, we investigate the interface between human geography and sociolinguistics with an eye on methods that have the potential to enrich dialectological research. We focus on concepts and models that inform sampling decisions in multi-locality dialectological research.
The first sampling parameter we propose relies on tessellation via travel to work areas (TTWAs), although other types of zones, borders or “functional” regions that encapsulate regular human flows/activities can also be used, depending on the study context and data availability. Such functional regions, we argue, are the fundamentally local outcomes of routinised day-to-day behaviour and therefore inherently meaningful for the understanding of spatialised practices, linguistic as well as others. As such, socially sensitive tessellation has a major advantage over the standard dialectological sampling criteria, which focused, if at all, on “fairly long-term mobility rather than that of the taken-for-granted everyday kind” (Britain, Reference Britain2010:87). However, we have to point out that the UK TTWAs we used have been defined to treat all commuting flows equally, ignoring personal circumstances such as teleworking and gendered employment opportunities and practices. A more sensitive approach to tessellation might take into account factors such as age, ethnicity, industry, occupation and gender, to name a few, which is also possible from an analytical perspective, depending on relevant data availability.
Within these TTWAs we rely on a formal socio-economic area classification for choosing sampling points that correspond to the socio-demographic and economic make-up of the region they represent. A model that relies on a combination of travel to work areas and socio-economic area classification has a number of assets: Both parameters are readily available online and easily applied to a sampling universe of any size and location within Britain (and, contingent on the availability of census data, also elsewhere). They have been tested in a wide range of geo-demographical research, which has the added benefit of interdisciplinary convergence. Finally, they hand dialectological researchers a ready-to-use sampling tool that is fully cognisant of the recent advances in human geography. This is also true for many other countries with regular population censuses (e.g. US, Australia) or detailed citizens’ registers (e.g. Germany, Sweden). Such population-wide data may have different names and consist of different variables, but they are widely accepted for geo-demographic research. For example, functional regions derived from the census (i.e. similar to the TTWAs) are known as Traffic Analysis Zones (TAZ) in the US and Travel Zones in Australia, available with Journey to Work (JTW) data. Although different parameters may have been used in their construction, the resulting regions and purpose are similar to the TTWAs. At the smaller geographical level, there is a range of governmental or proprietary data “products” providing socio-economic classifications (also known as population segmentations), similar to the one used here. Official census classifications are available freely, while commercial population segmentations attract a premium, although sometimes data companies are prepared to negotiate lower prices or make available free data for research purposes.
In contexts where such official or commercial products are not available, researchers should be in a position to obtain a number of census-derived variables (e.g. age, employment, housing tenure of population) for larger areas and produce such classifications themselves, using standard statistical techniques (e.g. cluster analysis in SPSS). Such a task is more demanding and requires some understanding of spatial statistics, but it is still feasible to produce classifications from raw population data for sociolinguistic sampling. In any case, we encourage sociolinguists to consider such geographical approaches and discuss their data needs with colleagues from geography and planning.
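To give an impression of what producing such a classification involves, the following sketch groups areas by a plain k-means cluster analysis written in standard-library Python, rather than in SPSS. It is a minimal illustration and not the procedure behind any official area classification: the function names, the choice of k-means, the deterministic starting centroids and the three example variables are all assumptions, and the figures in the usage note are invented placeholders.

```python
from statistics import mean, pstdev

def standardise(rows):
    """Convert each column to z-scores so that no variable dominates."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(rows, k, max_iter=100):
    """Plain k-means; the first k rows serve as starting centroids."""
    centroids = [list(r) for r in rows[:k]]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(r, centroids[j])) for r in rows
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels
```

A call such as `k_means(standardise(profiles), k=2)`, where `profiles` holds one row per area (for instance percentage aged 65+, percentage employed in manufacturing, percentage owner-occupied), returns one cluster label per area. Sampling points can then be chosen from the clusters that cover most of the population.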
Applying these two methods as sampling parameters to the extreme North East of England leaves us with a measure of socio-demographic representativeness and geographically informed coverage. We chose six localities with varying geospatial and socio-economic profiles to investigate a range of linguistic features. Acceptability judgement tasks conducted across these sampling points reveal a dialectological landscape constrained by linguistic, geographic and human geographic factors. By themselves, the results of our pilot study are only a small jigsaw piece amongst innumerable linguistic geographies which continue to develop and evolve and, as such, fundamentally limited in scope. But we hope to have demonstrated that the combinatory approach we propose is an adequate tool for dialectological research.
Obviously, given the fluidity of development both on the linguistic and the geographical plane, geo-spatially sensitive research needs to keep abreast of newer census data as it becomes available (the raw data for the 2011 UK census has just been released for larger areas and local authorities). Indeed, a comparison between census outputs from different years, while often marred by varying tessellation units, can provide important information about the changing nature of routines, linguistic as well as geographical ones. Furthermore, while space precludes us from showcasing such an analysis here, a research design that samples different age bands would also be in the position to investigate important questions such as the extent and direction of levelling or dialect supralocalisation.Footnote 19 In future research, such questions need to be investigated on the basis of both perception and production data. Another unanswered question is whether the findings reported here converge with patterns formed on the basis of phonology. Moreover, while we have refrained from doing so due to the relatively low number of speakers per tessellation unit, the statistical concept of standard deviation could be used as an analytical parameter, providing researchers with a fruitful diagnostic of focusing (Le Page, Reference Le Page1978), especially when comparing younger and older speakers.
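As a minimal sketch of the standard deviation diagnostic mentioned above, the spread of ratings can be computed per tessellation unit, with lower values indicating closer agreement among informants. The function name `rating_spread`, the locality labels and the ratings are hypothetical placeholders. Treating the four-point scale as numeric is itself an assumption.

```python
from statistics import stdev

def rating_spread(ratings_by_locality):
    """Sample standard deviation of ratings per locality.

    Localities with fewer than two ratings are skipped, since the
    sample standard deviation is undefined for a single value.
    """
    return {
        locality: stdev(ratings)
        for locality, ratings in ratings_by_locality.items()
        if len(ratings) >= 2
    }

# Invented placeholder ratings (not study data).
spread = rating_spread({
    "Locality A": [4, 4, 3, 4],  # close agreement
    "Locality B": [1, 4, 2, 3],  # diffuse judgements
})
print(spread["Locality A"] < spread["Locality B"])  # True
```

Comparing such values between younger and older speakers in the same unit would give a first indication of whether judgements are becoming more or less focused.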
To conclude, applying a combinatory human geographical sampling method to investigate linguistic variability in the extreme North East has resulted in a socio-demographically informed snapshot of socio-geographical patterns of language variation. We hope the method we propose has brought us one step further towards a “spatially sensitive dialectology, one which recognises and synthesises the ever evolving physical, social and perceptual spaces we live in and by, it places the spaces created, maintained and changed by interaction at centre stage” (Britain, Reference Britain2010:69, emphasis in original).
We gratefully acknowledge the support of Newcastle University's Faculty Research Fund, as well as our fieldworker Laura Steventon. The digital maps used hold Crown Copyright from EDINA Digimap, a JISC supplied service. © Crown Copyright/database right 2013. An Ordnance Survey/EDINA supplied service.