
Sign advantage: Both children and adults’ spatial expressions in sign are more informative than those in speech and gestures combined

Published online by Cambridge University Press:  13 December 2022

Dilay Z. KARADÖLLER*
Affiliation:
Max Planck Institute for Psycholinguistics, Netherlands; Centre for Language Studies, Radboud University, Netherlands
Beyza SÜMER
Affiliation:
Max Planck Institute for Psycholinguistics, Netherlands; Amsterdam Center for Language and Communication, University of Amsterdam, Netherlands
Ercenur ÜNAL
Affiliation:
Department of Psychology, Ozyegin University, Istanbul, Turkey
Aslı ÖZYÜREK
Affiliation:
Max Planck Institute for Psycholinguistics, Netherlands; Centre for Language Studies, Radboud University, Netherlands; Donders Institute for Brain, Cognition and Behavior, Radboud University, Netherlands
*Corresponding author: Dilay Z. Karadöller, Max Planck Institute for Psycholinguistics, Wundtlaan 1, 6525XD Nijmegen. E-mail: dilay.karadoller@mpi.nl

Abstract

Expressing Left-Right relations is challenging for speaking-children. Yet, this challenge has been shown to be absent for signing-children, possibly due to iconicity in the visual-spatial modality of expression. We investigate whether there is also a modality advantage when speaking-children’s co-speech gestures are considered. Eight-year-old children and adults who were either hearing monolingual speakers of Turkish or deaf signers of Turkish Sign Language described pictures of objects in various spatial relations. Descriptions were coded for informativeness in speech, sign, and speech-gesture combinations for encoding Left-Right relations. The use of co-speech gestures increased the informativeness of speakers’ spatial expressions compared to speech alone. This pattern was more prominent for children than adults. However, signing-children and signing-adults were more informative than speaking-children and speaking-adults even when co-speech gestures were considered. Thus, both speaking- and signing-children benefit from iconic expressions in the visual modality. Finally, in each modality, children were less informative than adults, pointing to the challenge of this spatial domain in development.

Type: Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2022. Published by Cambridge University Press

Introduction

Children, from early on, see and interact with the objects surrounding them (e.g., a fork next to a plate). They also need to communicate about these objects and the spatial relations between them to function and navigate successfully in the world. To do so, children need to learn how to map the linguistic expressions in their specific languages to spatial relations. Previous work has shown that children learning different spoken languages show considerable variability in learning to encode spatial relations (e.g., Bowerman, 1996a, 1996b; Johnston & Slobin, 1979). However, it is not known whether the development of spatial language use can be modulated by visually motivated form-meaning mappings (i.e., iconicity; Perniss, Thompson, & Vigliocco, 2010) as in the case of sign languages and/or co-speech gestures. Speakers and signers can use iconicity to map the relative relations of objects in real space onto sign and gesture space in an analogue manner (e.g., Emmorey, 2002; Perniss, 2007). In this study, we aim to investigate whether such iconic affordances of visual expressions provide an advantage for children compared to the use of arbitrary expressions in speech. To do so, we focus on encoding Left-Right relations, which have been found to be challenging for children learning spoken languages.

A substantial amount of research across many spoken languages has shown that communicating about space is an early developing skill. While some spatial terms such as In-On-Under emerge already at around 2 years of age (Johnston & Slobin, 1979), others, such as those requiring viewpoint, take longer (Grigoroglou, Johanson, & Papafragou, 2019; Johnston & Slobin, 1979; Landau, 2017; Sümer, 2015). Particularly for the relations between objects located on the lateral axis, speaking-children frequently produce under-informative descriptions with missing (e.g., Side, Next to) or incorrect (e.g., Front) spatial information instead of providing uniquely referring (i.e., informative) expressions (e.g., using spatial terms such as Left-Right; Sümer, Perniss, Zwitserlood, & Özyürek, 2014). This has been attributed to delays in the development of cognitive understanding of Left-Right spatial relations between the objects (Benton, 1959; Corballis & Beale, 1976; Harris, 1972; Piaget, 1972; Rigal, 1994, 1996).

However, recent research has shown that children acquiring sign languages might have an advantage in encoding such cognitively challenging spatial relations. For instance, signing-children learn to encode Left-Right relations between objects much earlier (around age 5) than their speaking peers (not until age 8), possibly due to iconic expressions in sign languages (Sümer et al., 2014). Nevertheless, speaking-children also use iconic expressions, such as gestures while communicating about space (e.g., Furman, Özyürek, & Allen, 2006; Furman, Özyürek, & Küntay, 2010; Furman, Küntay, & Özyürek, 2014; Göksun, Hirsh-Pasek, & Golinkoff, 2010; Iverson & Goldin-Meadow, 1998; Özyürek, 2018; Özyürek, Kita, Allen, Brown, Furman, & Ishizuka, 2008). It is not known whether signing-children will have an advantage when speaking-children’s iconic co-speech gestures are also taken into account.

In this paper, by taking a multimodal approach to the development of spatial language, we investigate whether iconic expressions provide linguistic and expressive tools for children and adults to convey more spatial information than speech alone. To do so, we study child and adult signers of Turkish Sign Language (Türk İşaret Dili, TİD) and child and adult speakers of Turkish. In the sections that follow, we first describe what is known about the linguistic expressions of locative relations, focusing specifically on Left-Right in speech, sign, and co-speech gestures. Next, we review the literature on the development of such expressions in different modalities. Based on this literature, we derive a set of predictions on whether the visual modality of expression modulates the development of spatial language use in childhood and whether these patterns carry into adulthood.

Linguistic encoding of locative relations

The linguistic encoding of locative spatial relations requires the mention of Figure and Ground objects as well as the spatial relation between them. In a spatial configuration, the Figure refers to the smaller and foregrounded object, which is located with respect to a backgrounded, and usually bigger, object, known as the Ground (Talmy, 1985). Figure 1 depicts various locative spatial relations between the pen (Figure) and the paper (Ground). Descriptions of locative spatial relations can vary in requiring an external perspective, which may be viewer- or environment-centered (Levinson, 1996, 2003; Majid, 2002; Pederson, Danziger, Wilkins, Levinson, Kita, & Senft, 1998; see also Li & Gleitman, 2002). In this study, we are interested in the viewer-centered spatial relations that are especially likely to manifest in cases where Ground objects do not have intrinsic features, and thus require speakers to consider a viewpoint in using spatial terms. For instance, in Figure 1a, the spatial relation between the objects is independent of the viewpoint of the observer. However, in some spatial relations, such as Left-Right (Figure 1b) or Front-Behind (Figure 1c), the spatial relation between the objects depends on the viewpoint of the observer (see Martin & Sera, 2006 for a discussion; see also Landau, 2017; Levinson, 2003). For Front-Behind, informational cues such as visibility (in the case of Front) and occlusion (in the case of Behind) signal the asymmetrical relationship and help distinguish the two spatial relations from each other (Grigoroglou et al., 2019). Left-Right relations, however, offer no such informational cues and remain two categorically distinct but symmetrical spatial layouts. The current study focuses on the encoding of Left-Right relations, for which language users need to be explicit in their descriptions in order to be informative.

Figure 1. Objects in viewpoint-independent (a) and viewpoint-dependent (b & c) spatial configurations.

Linguistic encoding of space in speech, sign and co-speech gestures

Speech

In encoding locative spatial relations, speech transforms visual and three-dimensional experiences into categorical linguistic forms that have an arbitrary relationship to their meaning. For instance, in order to describe the spatial relation between the pen and the paper in Figure 1b, English speakers might rely on prepositional phrases with Left or Right – depending on their viewpoint. Alternatively, in order to describe the spatial relation between the objects in the same picture, English speakers may use general spatial terms such as Next to. However, the latter description might be under-informative in certain contexts – for example, when distinguishing between two categorical layouts, such as Left versus Right, because it fails to specify the exact spatial relation between the objects compared to expressions using Left-Right spatial terms.

In this study, following Sümer (2015), we focus on descriptions in Turkish. For describing the picture in Figure 1b in an informative way (i.e., to distinguish Left from Right), Turkish speakers use Sol ‘Left’ or Sağ ‘Right’. Alternatively, Turkish speakers can use the general relational term Yan ‘Side’. This general relational term in Turkish (unlike Next to in English) can be used to refer to any side of an object, including its Front and Back. Thus, when Yan ‘Side’ is used, it is rather under-informative and cannot distinguish one viewpoint-dependent relation from another. Therefore, in Turkish, Left-Right relations are most informatively described when specific spatial terms are used. It should be noted that Turkish speakers typically describe viewpoint-dependent relations from their own viewpoint (Sümer, 2015). More information regarding the descriptions in Turkish is provided in the coding section.

Sign

In encoding locative spatial relations, sign languages incorporate linguistic forms that bear iconic links to their meanings. The most frequent iconic form for describing spatial relations, including Left-Right, is the use of morphologically complex classifier constructions, as shown in Figure 2d (Emmorey, 2002; Janke & Marshall, 2017; Perniss, Zwitserlood, & Özyürek, 2015a; Supalla, 1982; Zwitserlood, 2012). In these constructions, the location of the hands encodes the location of the objects with respect to each other, while the handshape encodes objects’ shape information (Emmorey, 2002; Perniss et al., 2015a; Supalla, 1982; Zwitserlood, 2012). To illustrate, while describing the spatial relation between the cup and the toothbrush, signers first introduce the lexical signs for the cup (Figure 2a) and the toothbrush (Figure 2c), and later they choose classifier handshapes to indicate the size and shape of these two objects (e.g., Figure 2d). More specifically, signers choose a round handshape to represent the round shape of the cup and an elongated handshape (i.e., index finger) to represent the shape of the toothbrush. They then position their hands in the signing space in a way analogous to the spatial relations in the picture. Thus, the representation of spatial relations between objects in the signing space maps onto the exact spatial relation between the objects in real space from a specific viewpoint (mainly the signer/viewer viewpoint). For instance, if the toothbrush were located to the right of the cup, the signer would position the index-finger handshape for the toothbrush to the right of the classifier handshape locating the cup, from her viewpoint. This allows a diagrammatically iconic expression of the relative locations of the objects (Perniss, 2007).

Figure 2. Informative description from a TİD signer using a classifier construction to encode the spatial relation between the cup and the toothbrush.

In addition to classifier constructions, signers can use other linguistic forms – albeit less frequently – to express the spatial relation between objects. These include relational lexemes (Arık, 2003; Sümer, 2015), tracing the shape of the objects and locating them in the signing space (Perniss et al., 2015a), pointing to indicate the object’s location in the signing space (Karadöller, Sümer, & Özyürek, 2021), and lexical verb placements (Newport, 1988) (see the coding section and Figures 9 and 10 for more details). Even though the handshapes in these forms are not iconic themselves, similar to classifier constructions, all of these forms convey the relative spatial locations of the objects with respect to each other from the signer’s viewpoint in a diagrammatically iconic way. In this sense, they are almost always informative in conveying object locations and differ from the under-informative expressions in spoken languages (e.g., Yan ‘Side’ in Turkish or Next to in English), which fail to distinguish between the two symmetrical layouts.

Co-speech gestures

Visual modes of expression allowing iconic and analogue encodings are not specific to sign languages. These types of expressions can be found in spoken languages in the form of co-speech gestures (Kendon, 2004; Kita & Özyürek, 2003; McNeill, 1992, 2005; Özyürek, 2018). Co-speech gestures can be used to indicate locations of objects in gesture space in an analogue manner due to their iconic affordances. Therefore, spoken expressions accompanied by gestures might convey more spatial information than speech alone. For instance, when describing locations in space, speakers sometimes encode space in an ambiguous way in speech (e.g., Here or There) while also using gestures to indicate relative locations of entities in space (McNeill, 2005; Peeters & Özyürek, 2016). Figure 3 exemplifies the use of a directional pointing gesture with speech. In this example, although speech fails to give information regarding the exact spatial relation between the objects, the directional pointing gesture to the right gestural space indicates that the fork is on the right side. In this sense, even a pointing gesture can map the location of an object in real space to gesture space in a diagrammatically iconic way in relation to the speaker’s body. In such descriptions, gestures might serve as a helpful tool during communication by disambiguating information conveyed in speech (McNeill, 1992; see more examples in the coding section) and thus contribute to the linguistic encoding of the spatial relation.

Figure 3. An example from a Turkish speaker using a pointing gesture towards the right while mentioning “Side” in speech.

Notes. The underlined word denotes the speech that the gesture temporally overlaps with. The description is informative only when both speech and gesture are considered.

Development of linguistic encoding of space in speech, sign, and co-speech gestures

Speech

Previous research has revealed some regularities in learning to express spatial relations across spoken languages. Specifically, children first encode spatial terms for viewpoint-independent relations (i.e., In-On-Under) starting around age 2 (Casasola, 2008; Casasola, Cohen, & Chiarello, 2003; Clark, 1973; Johnston & Slobin, 1979). This is followed by viewpoint-dependent ones such as Front-Behind (Durkin, 1980; Grigoroglou et al., 2019; Johnston & Slobin, 1979; Piaget & Inhelder, 1971). Linguistic expressions for Left-Right, however, appear latest and have been found to be delayed in children acquiring a spoken language even until 10 years of age (Abarbanell & Li, 2021; Benton, 1959; Harris, 1972; Piaget, 1972; Rigal, 1994, 1996; Sümer, 2015; see also Corballis & Beale, 1976). This order has been hypothesized to reflect non-linguistic conceptual development of space (e.g., Clark, 2004; Johnston, 1985, 1988).

Learning to encode Left-Right is considered to be a two-step process. First, children develop a conceptual understanding of their own Left-Right and use the relevant spatial terms to refer to their own body (Howard & Templeton, 1966). As a next step, they map these spatial terms onto other people’s left and right hands/legs (Howard & Templeton, 1966; Piaget, 1972). Encoding Left-Right relations between objects appears even later (e.g., Sümer et al., 2014). Even though speaking-children use Left-Right spatial terms to encode spatial relations between objects around ages 8-10, they still use them less frequently than adults and often provide incorrect or missing information in their speech-only descriptions (see Abarbanell & Li, 2021 and Sümer et al., 2014 for the use of alternative spatial terms, such as Front, for describing Left-Right). This has been attributed to the symmetrical nature of Left-Right, which makes it hard to distinguish Left and Right from each other.

Sign

Recent research on sign language acquisition raises the possibility that the above-mentioned development of learning to encode Left-Right in speech might not reflect a challenge in conceptual development. Rather, it might be due to the difficulty of mapping arbitrary and categorical terms onto Left-Right relations. If this is the case, the iconic affordances of sign languages could facilitate children’s encoding of Left-Right relations. Empirical support for this claim comes from a study by Sümer et al. (2014) showing that TİD signing-children can produce expressions of Left-Right relations in adult-like ways earlier than Turkish-speaking-children when only speech is considered. Importantly, this advantage has not been found for other spatial relations, such as In-On-Under (Sümer & Özyürek, 2020). The advantage found for signing-children in encoding Left-Right cannot be explained by morphological complexity, lexical diversity, or other typological differences between Turkish and TİD, as these were similar across expressions used for Left-Right and In-On-Under. Instead, this advantage seems to be best explained by the iconic affordances of sign languages, which allow iconic mappings of spatial relations onto the signing space (Emmorey, 2002) and possibly ease the encoding of cognitively challenging spatial relations. This possibility has been supported by the early use of classifier constructions as well as relational lexemes that directly map relations onto the right or left side of the body (Karadöller et al., 2021; Sümer, 2015; Sümer et al., 2014 for TİD; Manhardt, Özyürek, Sümer, Mulder, Karadöller, & Brouwer, 2020 for Sign Language of the Netherlands). See Figure 9 in the coding section for a body-anchored encoding of Left in TİD.

Co-speech gestures

Similar to sign language encodings, the iconic affordances of co-speech gestures also allow visually motivated expressions of space along with speech (see Özyürek, 2018 for a review). A few studies have found gestures to be an important indicator of the development of spatial communication (e.g., Sauter, Uttal, Alman, Goldin-Meadow, & Levine, 2012; Sekine, 2009). In one of these studies, Sekine (2009) investigated route descriptions by children (e.g., from school to home) in three age groups (4, 5, and 6 years). The results showed a correlation between spatial information used in speech (use of Left-Right terms and mention of landmarks on the route) and the spontaneous use of spatial gestures. Another study investigating descriptions of the spatial layout of hidden objects in a room found that 8-year-olds rarely encoded the spatial location of objects in speech but often used gestures to convey the locations of objects when prompted to use their hands (Sauter et al., 2012). Based on this previous research, it is plausible to argue that children’s gestures might convey information about spatial relations in cases where speech is under-informative. This, however, has not been investigated for the expression of Left-Right relations, for which speaking-children are known to show delayed acquisition in their speech.

Present study

As the research reviewed above shows, the visual modality of expression (i.e., sign and gesture) seems to be privileged in providing more spatial information compared to speech, possibly due to the affordances of iconic form-meaning mappings (Goldin-Meadow & Beilock, 2010; Sommerville, Woodward, & Needham, 2005). However, the role of the visual modality as a modulating factor in spatial language development has not been fully examined. Until now, researchers have typically studied sign or speech independently. Moreover, the limited work studying both sign and speech has compared sign to speech alone. However, comparing sign to speech and speech-gesture combinations could more realistically approximate the development of spatial language use, as it would capture all semiotic tools available to spoken languages, including both arbitrary/categorical (i.e., in auditory-vocal speech) and iconic/analogue (i.e., in visual-spatial co-speech gestures) expressions (Goldin-Meadow & Brentari, 2017; Özyürek & Woll, 2019). Hence, here we investigate, for the first time, how deaf child and adult signers acquiring sign language from birth and hearing child and adult speakers express Left-Right relations in sign, speech, and speech-gesture combinations to provide informative expressions.

We defined informativeness in terms of whether a participant’s description distinguishes symmetrical Left-Right relations (see Grigoroglou & Papafragou, 2019a for a similar approach in the domain of events). In the present study, participants engaged in a communicative task in which they saw displays with 4 pictures presenting different spatial configurations of the same two objects (see Karadöller, Sümer, Ünal, & Özyürek, 2022; Manhardt et al., 2020; Manhardt, Brouwer, & Özyürek, 2021 for a similar procedure). Within one display, the only distinguishing feature of the pictures was the spatial configuration between the objects (see Figure 4 for examples of displays). One of the pictures in the display was the “target picture” to be described to a confederate addressee, who had to find it on her tablet among the same four pictures displayed in a different arrangement. A detailed description of the stimulus materials, procedure, and coding is provided in the methods section.

Figure 4. Non-contrast (a) and contrast (b) experimental displays.

We chose Turkish and TİD because there is a strong tendency for Turkish speakers, especially children, to use under-informative descriptions (e.g., Yan ‘Side’) while describing Left-Right relations between objects, even at the age of 8 (Sümer, 2015). We focused on age 8 to build on this previous work in Turkish and TİD. Moreover, although not directly studied for the domain of space, Turkish has been found to be a high gesture culture in general (Azar, Özyürek, & Backus, 2020). Due to these features of Turkish and based on previous work showing that gestures can be used as a tool to convey spatial information by children (Sauter et al., 2012; Sekine, 2009), we investigate whether signing-children still have an advantage in describing Left-Right relations in informative ways compared to speaking-children even when their multimodal expressions are taken into account.

Predictions

We grouped our predictions into two clusters. First, we compared sign to speech (i.e., Unimodal Descriptions). Then, we compared sign to speech by also taking into account gestures (i.e., Multimodal Descriptions). In each section, we also compared the development of spatial expressions of children to adults.

In unimodal descriptions, we expected an overall effect of modality, such that signers would produce informative descriptions in sign more frequently than speakers would do so in speech. This would be due to the affordances of the visual modality that allow iconic/analogue expressions (Goldin-Meadow & Brentari, 2017; Özyürek & Woll, 2019; Taub, 2001; Taub & Galvan, 2001). Regarding developmental differences between children and adults in the two groups, there are two possibilities. One possibility is that speaking-children would produce informative expressions less frequently than adults, but signing-children would produce informative descriptions as frequently as signing-adults. This would be in line with previously reported developmental patterns for speaking-children (Clark, 1973) and signing-children, who have been found to produce adult-like expressions of Left-Right relations starting from 4 years of age (Sümer, 2015). Alternatively, signing-children, similar to speaking-children, might produce informative descriptions less frequently than adults despite the advantage of the visual modality. This latter possibility would indicate a universal challenge in the conceptual development of the spatial domain, specifically for Left-Right, regardless of the modality of expression (Clark, 1973).

Turning to multimodal descriptions, we first predicted that the iconic affordances of gestures might facilitate expressing spatial relations in informative ways. In line with this, we expected speaking-children to use co-speech gestures that complement their under-informative speech more frequently than adults, who would mostly be informative in their speech already (Alibali & Goldin-Meadow, 1993; Church & Goldin-Meadow, 1986; Perry, Church, & Goldin-Meadow, 1992; Sauter et al., 2012).

Next, when comparing speech-gesture combinations to sign, if co-speech gestures help with the informativeness of speakers’ expressions, we expected modality differences between speakers and signers and developmental differences between speaking-children and speaking-adults to disappear (Goldin-Meadow & Brentari, 2017; Özyürek & Woll, 2019). However, it is still possible for signers to produce informative descriptions more frequently than speakers. Such a finding could be due to the fact that iconic forms in sign are conventional linguistic tools (Brentari, 2010; Emmorey, 2002; Klima & Bellugi, 1979), unlike co-speech gestures that are learned and used flexibly as a composite system together with speech (Kendon, 2004; McNeill, 1992, 2005; see Özyürek & Woll, 2019 and Perniss, Özyürek, & Morgan, 2015b for a discussion).

Method

The methods reported in this study were approved by the Ethics Review Board of Radboud University Nijmegen and the Survey and Research Commission of the Republic of Turkey Ministry of National Education.

Participants

Speaking participants consisted of hearing monolingual Turkish speaking-children (N = 24; 14 Female; Mean Age = 8;6; SD Age = 0.93; Age Range = 6;7 – 9;5) and adults (N = 23; 14 Female; Mean Age = 35;9; SD Age = 10.29; Age Range = 19;8 – 50). Data from 1 additional speaking-child and 2 additional speaking-adults were excluded from the study because they were bilingual. Additionally, 2 speaking-children and 2 speaking-adults were excluded because their average number of spatial encodings was 3 standard deviations below the group mean.

Signing participants consisted of TİD signing-children (N = 21; 12 Female; Mean Age = 8;5; SD Age = 1.29; Age Range = 6;8 – 11) and adults (N = 26; 21 Female; Mean Age = 29;10; SD Age = 8.34; Age Range = 18;2 – 48;7). Data from an additional 6 signing-children and 4 signing-adults were excluded from the study due to failure to follow the instructions (n = 7), problems with the testing equipment (n = 1), or disruption during the testing sessions (n = 2). All signing participants were profoundly and congenitally deaf and acquired TİD from birth from their deaf signing parents. They did not receive speech therapy and were exposed to written Turkish when they started the school for the deaf.

We determined the sample size based on convenience. Working with special populations poses certain challenges in reaching participants. Here, we report data from signers who had been exposed to sign language from birth by their signing deaf parents. This group represents 10% of the deaf population in the world (Mitchell & Karchmer, 2004) and in Turkey (İlkbaşaran, 2015). Hence, the number of participants in each group reported in this study (speaking-children, signing-children, speaking-adults, and signing-adults) was determined by the total number of deaf children attending the deaf schools in İstanbul from which we could collect data. We collected data from all students who matched our criteria (e.g., age, absence of comorbid health issues). To our knowledge, the current sample includes the largest number of deaf signers exposed to sign language from birth by their parents of any study conducted in the field so far. However, we could not balance gender within the adult group, as we were limited by the number of deaf adults living in İstanbul. We collected data from almost all deaf adults who met our criteria and were willing to participate. Participation was voluntary, and at the end of the study all children received a gender-neutral color pencil kit and adult participants received monetary compensation for their participation.

We compared speaking and signing participants’ ages and visual-spatial cognitive abilities (i.e., Corsi Block Tapping Task score; Corsi, 1972) to ensure similarity, separately for children and adults. To do so, we conducted Bayesian t-tests, which assessed the probability of the mean difference (MDIFF) being greater than zero and less than zero, using the R package BayesianFirstAid (version 0.1; Bååth, 2014). Signing- and speaking-children were similar in age (Bayesian two sample t-test: MDIFF (–5) > 0: p = 0.556, MDIFF (5) < 0: p = 0.444) and in Corsi Block Tapping Task score (Bayesian two sample t-test: MDIFF (–5) > 0: p = 0.972, MDIFF (5) < 0: p = 0.280). Moreover, signing- and speaking-adults were also similar in age (Bayesian two sample t-test: MDIFF (–5) > 0: p = 0.736, MDIFF (5) < 0: p = 0.264) and Corsi Block Tapping Task score (Bayesian two sample t-test: MDIFF (–5) > 0: p = 0.866, MDIFF (5) < 0: p = 0.134).
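For readers who wish to see how such a comparison can be set up, the following is a minimal sketch in R using the BayesianFirstAid package cited above. The data frame and column names (signers, speakers, age_months, corsi_score) are hypothetical placeholders, not the authors' actual variables or analysis script.

```r
# Minimal sketch of the group-similarity checks described above, assuming two
# hypothetical data frames `signers` and `speakers`, each with columns
# `age_months` and `corsi_score`; not the authors' actual analysis script.
library(BayesianFirstAid)  # version 0.1 (Bååth, 2014)

# Bayesian two sample t-test on age (posterior of the mean difference)
fit_age <- bayes.t.test(signers$age_months, speakers$age_months)
summary(fit_age)

# Same comparison for Corsi Block Tapping Task scores
fit_corsi <- bayes.t.test(signers$corsi_score, speakers$corsi_score)
summary(fit_corsi)
```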

Materials

Stimuli consisted of 84 displays. Each display had 4 pictures presented in a 2 x 2 grid. Individual pictures in each display showed the same two objects in various spatial configurations. Ground objects (e.g., a jar) were always in the center of the pictures, and they did not have “intrinsic” sides determined by their shape (e.g., a picture frame has an intrinsic front, but a jar does not). Figure objects (e.g., a pencil) changed their location in relation to the Ground objects. In each display, the target picture to be described was indicated by an arrow. Experimental displays (n = 28) contained Left-Right spatial configurations between objects (e.g., the pencil is to the left of the cup). In half of the experimental displays, only the target picture contained a Left or Right spatial configuration between the objects, and all non-target pictures contained spatial configurations other than Left-Right (i.e., Non-contrast displays). In the remaining half of the experimental displays, one non-target picture contained the contrastive spatial configuration (i.e., the contrast picture; if the target picture contained a Left spatial configuration, the contrast picture contained a Right spatial configuration, or vice versa) and the remaining pictures contained spatial configurations other than Left-Right (i.e., Contrast displays). See Figure 4 for example displays. The rationale for having Contrast displays in addition to Non-contrast displays was to increase the need for informativeness in describing the spatial relation between the objects in the target pictures. With this manipulation, we aimed to test whether participants used more informative descriptions for Contrast than Non-contrast displays in order to distinguish the target picture more distinctively from the other pictures in the display (see Manhardt et al., 2020, 2021 for a similar procedure).

In addition to the experimental displays, we included 56 filler displays to avoid attention to the Left-Right spatial configurations. Filler displays consisted of target pictures in Front (n = 14), Behind (n = 14), In (n = 14), and On (n = 14) spatial relations between objects.

All visual displays were piloted to ensure that both children and adults could identify and name the objects in the display. Across all 84 displays, each Figure object (e.g., pen) was presented only once. Ground objects (e.g., cup) were presented 4 times but always with different Figure objects (e.g., cup-pencil, cup-egg, cup-fork, cup-chocolate). The same Ground object was never presented twice in a row. Moreover, the same relation between the objects was not presented as a target picture more than twice in a row, to avoid biases towards one type of spatial relation. There were two sets of displays with the same Ground objects but with different Figure objects. All other configurations were similar across the two sets. The order of the displays and the locations of the pictures within each display were randomized for each participant.

Procedure

The description and familiarization tasks were originally designed as part of an eye-tracking experiment; however, for the purpose of this paper, we report only the description data.

Description task

Participants were presented with the description task after the familiarization task was completed (see details below). Trials started with a fixation cross (2000ms), followed by a display of 4 pictures (1000ms). Next, an arrow appeared for 500ms to indicate the target picture; after it disappeared, the 4 pictures remained on the screen (2000ms) until visual white noise appeared. Participants were instructed to describe the target picture to an addressee sitting across the table immediately after the appearance of the visual white noise. This was done to prevent children from pointing towards the screen to show the pictures or objects in a picture while describing. Participants were instructed that the addressee would choose the target picture on her tablet based on the participant’s description. They were also aware that the addressee had the same 4 pictures but in a different arrangement in the display and without the arrow. The addressee was a confederate and pretended to choose a picture on her tablet based on the participant’s description. Participants moved to the next trial by pressing the ENTER key on the keyboard. Having an addressee, albeit a confederate, was especially important considering previous reports of children’s tendency to be under-informative in the presence of an inattentive listener or in the absence of a listener (Bahtiyar & Küntay, 2009; Girbau, 2001; Grigoroglou & Papafragou, 2019b). See Figure 5 for the timeline of a trial in the description task.

Figure 5. Timeline of a trial in the description task.

At the beginning of the description task, participants completed practice trials (n = 3), which were repeated if necessary. During the practice trials, when participants failed to understand the task instructions, the experimenter repeated them. Both during the practice trials and throughout the experiment, the addressee did not give feedback on whether or not the description was correct, in order to avoid biasing responses in upcoming trials, but pretended to have found the right picture. When spatial information was missing from a participant’s description, the addressee asked only for the location of the Figure object. In such cases, speaking participants were asked for the location of the Figure object in Turkish (e.g., Kalem nerede? ‘Where is the pencil?’), and signing participants were asked for the location of the Figure object in TİD using the lexical sign for ‘where’ and the lexical sign for the Figure object in the target picture. In order to provide consistent feedback, no other instructions were given to the participants. The addressee asked for such a clarification only once. Even if participants provided a description with missing spatial information in the second round, the addressee did not ask for further clarification and pretended to choose a picture on her tablet. Moreover, we did not give speaking participants explicit instructions to gesture or not to gesture; thus, all of the gestures were spontaneously produced. Hearing adult speakers of Turkish served as the addressee and the experimenter for speaking participants, and deaf adult signers of TİD served as the addressee and the experimenter for signing participants.

Familiarization task

The familiarization task was introduced before the description task. This task aimed to familiarize participants with the general structure of the displays: a 2 x 2 grid of pictures showing two objects in various spatial configurations with respect to each other. Each participant was presented with one of the two display sets, selected at random; the other set was then used in the description task.

Corsi Block Tapping Task

Participants received the computerized version of the Corsi Block Tapping Task in forward order. We administered this task to ensure that speaking and signing participants had similar spatial working memory spans (Corsi, 1972). This was especially important given that previous studies have shown mixed evidence for visuospatial abilities across signers and speakers (see Emmorey, 2002; Marshall, Jones, Denmark, Mason, Atkinson, Botting, & Morgan, 2015).

All of the tasks (Familiarization Task, Description Task, and Corsi Block Tapping Task) were administered on a Dell laptop running Presentation NBS 16.4 (Neurobehavioral Systems, Albany, CA). Task instructions were given orally to speaking participants and in sign to signing participants, in order to avoid misunderstandings that written instructions might cause for signers. We applied the same procedure to speakers to keep the experimental procedure identical across groups. The description task was video-recorded from the front and side-top angles to allow for speech, sign, and gesture coding.

Annotation and coding

All descriptions produced in the description task were annotated for Target Pictures. Descriptions with speech, gesture, and sign were coded using ELAN (Version 4.9.3), a free annotation tool (http://tla.mpi.nl/tools/tla-tools/elan/) for multimedia resources developed by the Max Planck Institute for Psycholinguistics, The Language Archive, Nijmegen, The Netherlands (Wittenburg, Brugman, Russel, Klassmann, & Sloetjes, 2006).

We coded descriptions in speech, gesture, and sign. Next, we formed informativeness categories, first based on the information conveyed in speech alone, then considering co-speech gestures together with speech, and finally based on sign. We operationalized informativeness as whether participants’ descriptions provided a uniquely referring expression that distinguished the spatial relation between the objects in the target picture from the other pictures in the display. In almost all descriptions, participants used their own perspective in encoding the spatial relations between objects.

Speech

Speech data were annotated and coded by the first author, who is a hearing native speaker of Turkish. We did not conduct reliability coding for speech, as speech coding involved the presence of spatial terms (e.g., Left) that could be unambiguously heard and identified by a hearing native speaker of Turkish. We grouped participants’ descriptions into two categories based on whether or not the linguistic form used to encode the spatial relation in the description was informative in uniquely identifying the target picture when only speech is considered.

  (1) Informative in Speech: This category consisted of descriptions that included the Sol ‘Left’ – Sağ ‘Right’ spatial terms. The specific spatial terms used in these descriptions provided uniquely referring expressions that distinguished the target picture (Figure 6a).

  (2) Under-informative in Speech: This category consisted of all the remaining descriptions, as they failed to provide enough information to uniquely identify the target picture. These descriptions included the following sub-categories:

    (2a) Descriptions with a general relational term (Yan ‘Side’; Figure 6b) that failed to provide uniquely referring information distinguishing the actual spatial relation (e.g., which object is on the left side and which object is on the right side).

    (2b) Descriptions with specific spatial terms other than Left-Right (e.g., Ön ‘Front’; Figure 6c). These were especially frequent in children’s descriptions, as children tended to encode Left-Right with other spatial relations, especially Front (Abarbanell & Li, 2021; Sümer et al., 2014). We did not want to label these descriptions as incorrect, since we could not be sure whether children used these spatial terms because of a difficulty in mapping Left-Right spatial terms onto Left-Right relations. A few cases also included spatial terms other than Front to describe target pictures in Left-Right spatial relations. Based on our definition of informativeness, all of these descriptions that encoded Left-Right with other spatial terms were not informative enough for the addressee to pick out the correct picture on her tablet and were thus considered under-informative.

    (2c) Descriptions with a missing spatial relation, where participants only labeled the objects but not the spatial relation between them (Figure 6d).

Figure 6. Examples from Turkish speakers describing the spatial relation between the pencil and the cup using (a) Left-Right spatial terms, (b) general relational term Side, (c) spatial terms other than Left-Right, (d) missing encoding of spatial relation between the objects.

Speech and gesture

We further coded spontaneous co-speech gestures, as identified by strokes (see Kita, van der Hulst, & van Gijn, 1998), produced by participants that conveyed information regarding the location of the two objects or the spatial relation between the objects. We did not take into account other types of gestures, such as beat gestures. We did this coding per description and regardless of the type of speech used in the description.

In order to ensure reliability, 25% of the gesture data (5 children and 5 adults) were coded by another hearing native speaker of Turkish. There was substantial agreement between the coders for the type of spatial gestures used to localize Figure (88% agreement, kappa = 0.77) and Ground (92% agreement, kappa = 0.79) objects. All disagreements were discussed until 100% agreement was reached.
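As an illustration of how such agreement figures can be computed, the following R sketch uses the irr package purely as an example; the paper does not state which tool was used, and the data frame and column names are hypothetical.

```r
# Illustrative sketch of the inter-coder agreement check described above,
# assuming a hypothetical data frame `gesture_codes` with one row per coded
# gesture and the two coders' labels in columns `coder1` and `coder2`;
# not the authors' actual reliability script.
library(irr)  # kappa2() computes Cohen's kappa for two raters

# Raw percentage agreement between the two coders
mean(gesture_codes$coder1 == gesture_codes$coder2)

# Cohen's kappa, which corrects the agreement rate for chance
kappa2(gesture_codes[, c("coder1", "coder2")])
```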

For each description, we coded gestures separately for the Figure and Ground objects. These gestures included either directional pointing gestures indicating the location of the Figure or Ground object in an analogue way (Figures 3 and 7) or iconic hand placement gestures indicating the location of the Figure and/or Ground object in the gesture space (Figure 8). Both of these spatial gesture types, like the linguistic structures used in sign to represent space, give spatial information about the Left-Right relations between objects from the viewpoint of the speaker and help identify the target picture uniquely among the other referents in the display. As a next step, we considered these spatial gestures on top of what was conveyed in speech and redefined the informativeness categories for speakers, creating multimodal informativeness categories (summarized schematically in the sketch after this list):

  (1) Informative in speech: This category only involved Left-Right spatial terms, as described above. Some of the descriptions with Left-Right spatial terms also included accompanying spatial gestures. However, for these descriptions, spatial gestures did not add to the informativeness of the description and were considered redundant. That is, speech was already informative even without considering the gestures. Thus, they did not form a new category (see Figure 7).

  (2) Informative in speech-plus-gesture: This category consisted of descriptions that included the general spatial term Yan ‘Side’ in speech together with spatial gestures. In these descriptions, the spatial information missing from speech with Yan ‘Side’ was conveyed via spatial gestures (see Figures 3 and 8 for examples). Thus, these descriptions were informative only when the spatial gestures were considered.

  (3) Under-informative even when gestures are considered: This category consisted of descriptions with specific spatial terms other than Left-Right (e.g., Front; Figure 6c) as well as descriptions with a missing spatial relation, where participants only labeled the objects but not the spatial relation between them (Figure 6d). These descriptions were still under-informative even when gestures were considered together with speech. That is, gestures did not contribute to the informativeness of the description above speech.
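The decision logic behind these categories can be summarized in a short sketch. The function name and inputs below are hypothetical illustrations of the scheme, not the authors' coding script.

```r
# Illustrative sketch (not the authors' coding script) of the multimodal
# informativeness categories listed above. `speech_term` is the spatial term
# used in speech ("left_right", "side", "other", or "none"), and
# `has_spatial_gesture` indicates whether the description contains a
# directional pointing or hand placement gesture locating the objects.
classify_description <- function(speech_term, has_spatial_gesture) {
  if (speech_term == "left_right") {
    # Left-Right terms are informative on their own; any gestures are redundant
    "Informative in speech"
  } else if (speech_term == "side" && has_spatial_gesture) {
    # 'Yan' (Side) alone is under-informative, but a spatial gesture disambiguates it
    "Informative in speech-plus-gesture"
  } else {
    # Other spatial terms (e.g., Front) or missing relations remain under-informative
    "Under-informative"
  }
}

classify_description("side", TRUE)   # "Informative in speech-plus-gesture"
```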

Figure 7. Informative in Speech description from a Turkish speaker using a specific spatial term (Left) together with a directional pointing gesture to the left.

Note. Underlined words denote the speech that the gesture overlapped with. The description is informative even when only speech is considered.

Figure 8. Informative in speech-plus-gesture description from a Turkish speaker using a general spatial term (Side) together with iconic hand placement gestures.

Notes. The participant introduced the gestures sequentially. The gesture indicating the basket (RH) was performed when the participant mentioned the basket in her speech. The gesture indicating the newspaper (LH) was performed when the participant mentioned the newspaper. Both gestures remained in the gesture space until the end of the sentence. The description is informative only when the information in speech and gestures combined is considered.

Sign

Sign data were annotated by a hearing L2 signer of TİD and coded by another hearing L2 signer of TİD. Annotations and coding were checked by a trained native deaf signer of TİD. We did not conduct reliability coding for sign, as we only included in the final dataset the linguistic forms that were unambiguously approved by this signer.

We coded descriptions for the presence of a spatial relation between the objects and the type of linguistic form used to localize the Figure object in relation to the Ground object. Signers used 5 different linguistic forms. These forms were classifier constructions (Figure 2d), which are one of the most common forms for localizing the Figure object in relation to the Ground object in sign languages in general (Emmorey, 2002) and also in TİD (Arık, 2003; Karadöller et al., 2021; Sümer, 2015; Sümer et al., 2014). They allow signers to encode information about the entities through handshape classifications of objects (e.g., Emmorey, 2002; Janke & Marshall, 2017; Manhardt et al., 2020; Perniss et al., 2015a; Zwitserlood, 2012). Alternatively, signers also used other forms: relational lexemes, which are the lexical signs for spatial terms used in sign languages (Arık, 2003; Manhardt et al., 2020; Figure 9); tracing the shape of the Figure object in the signing space (Karadöller et al., 2021; Figure 10); pointing to the location of the Figure object in the signing space (Karadöller et al., 2021); and placing a lexical verb to locate an object in the signing space (see Karadöller, 2021).

Figure 9. Informative in sign description from a TİD signer using a relational lexeme for Left to encode the spatial relation between the cup and the ruler.

Figure 10. Informative in sign description from a TİD signer tracing the shape of the Figure object in the signing space to encode the spatial relation between the cup and the ruler.

We grouped participants’ descriptions into two categories according to whether or not the linguistic form used to encode the spatial relation in sign was uniquely informative in identifying the target picture, that is, whether it conveyed which object is where relative to the other through diagrammatic iconicity.

  (1) Informative in Sign: This category included all of the linguistic forms mentioned above, as they described the exact spatial relation between the objects and distinguished the target picture uniquely from the other pictures in the display.

  (2) Under-informative in Sign: This category included descriptions with an incorrect spatial relation (e.g., describing the pen as in front of the paper, although the target picture showed the pen to the left of the paper) or a missing spatial relation (e.g., only labeling the Figure and Ground objects but not the spatial relation between them, Figure 11).

Figure 11. Under-informative description in sign with missing spatial relation between objects.

Results

Data presented in this section were analyzed using generalized mixed-effects logistic regression models (glmer) with random intercepts for Subjects and Items. This mixed-effects approach allowed us to take into account the random variability due to having different participants and different items. All models were fit with the lme4 package (version 1.1.17; Bates, Mächler, Bolker, & Walker, 2014) in R (version 3.6.3; R Core Team, 2018) with the bobyqa optimizer (Powell, 2009). We did not include random slopes in any of the models because all of our models tested between-subjects effects, which cannot be added as random slopes.

Unimodal descriptions

First, we investigated whether the frequency of informative descriptions differs across the modalities and age groups (see dark green bars compared to yellow bars in Figure 12). We used a glmer model to test the fixed effects of Modality (Informative in speech versus Informative in sign) and Age Group (Children versus Adults), and an interaction between them on binary values for the presence of informative descriptions (Present = 1, Absent = 0) at the item level. The fixed effects of Modality and Age Group were analyzed with centered contrasts (–0.5, 0.5). The model revealed a fixed effect of Modality (β = –4.32, SE = 0.68, p < 0.001): signers produced informative descriptions (Mean = 0.94; SD = 0.24) more frequently than speakers (Mean = 0.58; SD = 0.49). The model also revealed a fixed effect of Age Group (β = 4.57, SE = 0.69, p < 0.001): adults produced informative descriptions (Mean = 0.93; SD = 0.26) more frequently than children (Mean = 0.58; SD = 0.49). There was no interaction between Modality and Age Group (β = 1.61, SE = 1.26, p = 0.203).
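A minimal sketch of this model structure in R is shown below; the data frame and column names are hypothetical placeholders rather than the authors' actual script, and the multimodal models reported later in this section follow the same structure.

```r
# Minimal sketch of the mixed-effects logistic regression reported above,
# assuming a hypothetical item-level data frame `unimodal` with columns
# `informative` (1/0), `modality` ("speech"/"sign"), `age_group`
# ("child"/"adult"), `subject`, and `item`; not the authors' actual script.
library(lme4)  # version 1.1.17 (Bates et al., 2014)

# Centered contrasts (-0.5, 0.5) for the two between-subjects predictors
unimodal$modality_c  <- ifelse(unimodal$modality == "sign", 0.5, -0.5)
unimodal$age_group_c <- ifelse(unimodal$age_group == "adult", 0.5, -0.5)

model <- glmer(
  informative ~ modality_c * age_group_c + (1 | subject) + (1 | item),
  data = unimodal,
  family = binomial,
  control = glmerControl(optimizer = "bobyqa")
)
summary(model)  # fixed effects of Modality, Age Group, and their interaction
```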

Figure 12. Proportion of Informative descriptions across Age Groups and Modality.

Multimodal descriptions

First, we tested whether children’s spatial gestures conveyed information missing from under-informative speech more often than adults’ gestures did. To test this, we compared the frequency of descriptions that were Informative in speech-plus-gesture across children and adults (see the light green bars in Figure 12). We used a glmer model to test the fixed effect of Age Group (Children, Adults) on whether the descriptions were Informative in speech-plus-gesture (1) or not (0) at the item level. The fixed effect of Age Group was analyzed with centered contrasts (–0.5, 0.5). The model revealed a fixed effect of Age Group (β = 2.94, SE = 0.70, p < 0.001): children (Mean = 0.45; SD = 0.46) produced descriptions that were Informative in speech-plus-gesture more frequently than adults (Mean = 0.08; SD = 0.26).

Finally, we investigated whether the frequency of informative descriptions differed across modalities and age groups when gestures were taken into account (see the light and dark green bars compared to the yellow bars in Figure 12). We used a glmer model to test the fixed effects of Modality (Informative in speech and Informative in speech-plus-gesture combined versus Informative in sign) and Age Group (Children versus Adults), and their interaction, on binary values for the presence of informative descriptions (Present = 1, Absent = 0) at the item level. The fixed effects of Modality and Age Group were analyzed with centered contrasts (–0.5, 0.5). The model revealed a fixed effect of Modality (β = –1.30, SE = 0.37, p < 0.001): signers (Mean = 0.94; SD = 0.27) produced informative descriptions (in sign) more frequently than speakers (Mean = 0.85; SD = 0.36) did (in speech and speech-plus-gesture combined). The model also revealed a fixed effect of Age Group (β = 2.25, SE = 0.37, p < 0.001): adults (Mean = 0.97; SD = 0.18) produced informative descriptions more frequently than children (Mean = 0.82; SD = 0.38). There was no interaction between Modality and Age Group (β = –1.12, SE = 0.74, p = 0.127).
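A possible way of deriving the combined outcome for this comparison is sketched below; the column names (informative_sign, informative_speech, informative_speech_gesture) are hypothetical, and the scoring rule is our reading of the description above rather than the authors' actual code.

# For signers, a trial counts as informative if it is informative in sign;
# for speakers, if it is informative in speech alone OR in speech-plus-gesture.
dat$informative_any <- ifelse(
  dat$modality == "sign",
  dat$informative_sign,
  pmax(dat$informative_speech, dat$informative_speech_gesture)
)

m2 <- glmer(
  informative_any ~ modality_c * age_group_c + (1 | subject) + (1 | item),
  data = dat, family = binomial,
  control = glmerControl(optimizer = "bobyqa")
)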

Discussion

In this study, we investigated whether the modality of expression influences the informativeness of Left-Right expressions of children and adults. To our knowledge, this is the first study to investigate the adult-like uses of Left-Right expressions with a multimodal perspective by considering spatial co-speech gestures and comparing descriptions in sign not only to speech but also to speech-gesture combinations. Overall, results showed that gestures help increase the informativeness of spatial descriptions compared to information in speech for speaking-children. However, sign expressions were still more informative even when gestures were considered. Finally, across both modalities children were less informative than adults.

Sign has an advantage over speech both for children and adults

Unimodal comparisons across modalities revealed that signers produced informative descriptions more frequently than speakers, regardless of age group. This can be attributed to the facilitating effect of iconicity in sign language expressions, which provides more information than expressions in speech. Nevertheless, the fact that expressions in sign were more informative than those in speech should not be taken to indicate that signers have a more developed conception of Left-Right relations between objects than speakers do. Rather, it implies that the iconic affordances of sign language expressions allow for more direct encoding and thus increase the informativeness of the expression (see Sümer, 2015; Slonimska, Özyürek, & Capirci, 2020; Taub, 2001 for adults). These findings contribute to our understanding of the development of spatial language use in speech and sign. That is, spatial language development may depend on several linguistic factors varying across languages (see Johnston & Slobin, 1979). Despite these linguistic differences between TİD and Turkish, signing- and speaking-children encode In-On-Under at similar ages (Sümer & Özyürek, 2020). By contrast, they differ in encoding Left-Right. This difference points to a complex interplay between the modality of expression and the cognitive and linguistic development of space.

Gesture enhances the informativeness of spoken expressions, more so for children than for adults

When we considered the multimodal descriptions of speakers, we found that both children and adults used spatial gestures that disambiguated descriptions containing Side. This trend was more prominent for children than for adults. This is reminiscent of previous findings from other domains showing that gestures can help clarify the meaning of Side. For instance, Cook, Duffy, and Fenn (2013) showed that using gestures to refer to the two sides of an equation when teaching math operations helps children solve math equations more accurately. It seems that the use of gestures is an important tool for clarifying under-informative speech (see also Kelly, 2001). Moreover, we extend this finding to situations where participants were not explicitly instructed to gesture (Sauter et al., 2012), since all of the gestures elicited in our study were spontaneous. Here, the frequent use of spontaneous gestures by Turkish speakers could reflect Turkish being a high-gesture culture (Azar et al., 2020), and thus raises possibilities for further investigation in other high- (e.g., Italian) and low- (e.g., Dutch) gesture cultures.

Moreover, we showed that children could communicate Left-Right spatial relations between two objects informatively through co-speech gestures before communicating them informatively in speech. This corroborates previous literature on gestures preceding speech, which has already been established for several other domains (Alibali & Goldin-Meadow, 1993; Church & Goldin-Meadow, 1986; Perry et al., 1992; Sauter et al., 2012). It seems that by age 8, children have some conceptual understanding of Left-Right spatial relations, yet they fail to map the arbitrary/categorical linguistic forms in speech onto these conceptual representations. In these instances, gestures could act as a medium for representing already established spatial concepts that fail to surface in speech. Such instances were very rare in adult language, as adults could already map Left-Right spatial terms onto these concepts. Together, our findings highlight the importance of considering children’s multimodal encodings in assessing their pragmatic (i.e., informativeness; Grigoroglou et al., 2019) and cognitive development (Hermer-Vazquez, Moffet, & Munkholm, 2001).

Signed descriptions are more informative even when gestures are considered both for children and adults

When spatial gestures were considered together with speech and compared to expressions in sign, signers continued to be more informative than speakers. This can be attributed to iconic expressions being obligatory and conventional linguistic forms in sign languages (Brentari, 2010; Emmorey, 2002; Klima & Bellugi, 1979). Co-speech gestures, in contrast, are used flexibly and only as a composite system together with speech (Kendon, 2004; McNeill, 1992, 2005; see Perniss et al., 2015b for a discussion). Differences in the way co-speech gestures and signs are used during acquisition might underlie this sign advantage.

Visual modality conveys more information than speech alone

The enhanced informativeness found in both sign and gesture compared to speech adds further evidence for the importance of the body’s interaction with the world in shaping language and cognition (embodied cognition; Chu & Kita, 2008; Goldin-Meadow, 2016). It seems that children map their bodies’ interaction with the spatial relations between objects more easily onto iconic gestures/signs than onto speech (see Sümer, 2015 for a discussion). Both signers and speakers described the spatial relation between objects predominantly from their own perspective. In those descriptions, participants might be mentally aligning the locations of the objects they saw on the computer screen with the left and right sides of their body. This alignment might have eased encoding of the object locations in sign/gesture space by placing the hands, rather than by mapping them onto abstract spatial terms in speech. Moreover, signers used specific linguistic forms that are body-anchored (i.e., relational lexemes), although these were infrequent compared to classifiers. These signs have been found to facilitate learning to encode spatial relations earlier in sign than in speech, especially for Left-Right (Sümer, 2015; Sümer et al., 2014). Thus, having body-anchored lexical signs already in the sign language lexicon might have allowed signers to encode more information about Left-Right spatial relations between objects using their own body as a reference. Overall, the visual modality allows a more direct mapping of visual/bodily experience onto linguistic labels, which therefore results in more informative descriptions of Left-Right relations between objects.

Left-Right remains a challenging spatial domain for children even when the visual modality is considered

Contrary to our initial expectation, developmental differences in the informativeness of Left-Right expressions did not disappear even when we considered the visual modality of expression in sign or co-speech gestures. Children were less informative than adults for both unimodal and multimodal descriptions. This suggests an intricate interplay between language development, cognitive development, and the visual modality of expression. On the one hand, spatial gestures contribute to the informativeness of descriptions more for children than for adults. On the other hand, this contribution is not sufficient for speaking-children to reach adult levels of informativeness. Together, these results speak to the general claim that Left-Right is challenging for children regardless of the modality of communication (Abarbanell & Li, 2021; Clark, 1973; Rigal, 1994, 1996; Sümer, 2015; Sümer et al., 2014).

It is possible to attribute the differences between children and adults to the development of the pragmatic knowledge required to provide informative descriptions. One way to investigate whether children and adults differ due to differences in pragmatic knowledge (as opposed to ease of encoding with iconic expressions) is to examine differences between contrast and non-contrast trials. However, participants did not change the way they described the pictures across contrast and non-contrast displays. Another way would be to use a non-confederate addressee and examine various description/selection instances. For example, examining how participants change their description patterns after the addressee selects an incorrect picture could reveal important insights into the development of children’s pragmatic knowledge compared to adults’. This can be investigated in future research.

Future directions

First of all, the current study focused on manual articulators in sign and co-speech gestures in order to maintain similarity to previous research on the development of encoding space. In addition to the manual articulators, head/torso movements and eye-gaze direction may provide important contributions to our understanding of the role of multimodal communication. We call for further research to establish systematic and conventional ways of integrating these aspects into investigations of the development of multimodal communication.

Secondly, the current study did not directly assess participants’ actual knowledge of Left-Right terms, which could have been used to strengthen our claims about the modulating effect of the visual modality on the acquisition of Left-Right language.

Finally, even though our findings on spatial language use suggest an advantage of using sign compared to using speech or speech-gesture combinations, this advantage seems to be limited to tasks where language is explicitly used within the task. We did not, for instance, find differences in visual-spatial working memory between signers and speakers. Similarly, we did not find differences in participants’ spatial memory accuracy when we asked them to remember the picture that had been described in different modalities (Karadöller et al., 2021; see also Karadöller, 2021 for a discussion). Together, these findings call for investigations of other indices of cognition, such as visual attention during the planning of a description, where some research has demonstrated modality-based variation (see Manhardt et al., 2021).

Conclusion

In summary, the visual modality of expression, in sign or gesture, can modulate the development of spatial language use in signers and speakers. However, the facilitating effect of sign in conveying informative spatial descriptions was stronger than that of co-speech gestures. Having obligatory and conventional iconic expressions as linguistic forms in sign languages, unlike co-speech gestures that are used flexibly as composite utterances with speech, might have facilitated the development of informativeness in signers’ Left-Right descriptions. Finally, both signing- and speaking-children were less informative than adults, even with the advantage of a visual modality allowing iconic descriptions. This corroborates earlier claims pointing to the challenge of this spatial domain in conceptual and linguistic development (Clark, 1973; Johnston, 1985, 1988). The results of the present study call for investigations in other languages, bilinguals, and cultures (e.g., low-gesture cultures) and on different aspects of language development to unravel how cognitive and linguistic (e.g., modality of expression) factors interact and determine the outcomes for developmental milestones.

Acknowledgements

We thank our deaf assistants Sevinç Yücealtay-Akın, Feride Öksüz, Selda Işıktaş-Yıldız, Yusuf Ermez and hearing assistants Hükümran Sümer, Ekin Tünçok, İrem Yeşilçavdar and Yeşim Özüer for their help in data collection and coding, and Jeroen Geerts for processing the video data. We thank Dr. Francie Manhardt and Dr. Susanne Brouwer for their contribution to the design of this study. We also thank Dr. Tilbe Göksun and Dr. Demet Özer for their intellectual contributions to this manuscript.

Funding

This research was supported by an NWO VICI Grant (277-70-013) awarded to Aslı Özyürek.

Competing interests

The authors declare none.

Footnotes

1 Unlike in countries with general newborn hearing screening and robust early intervention, deaf children in Turkey do not typically receive speech therapy and are exposed mostly to written Turkish when they start school. In the schools for the deaf, TİD is not part of the official curriculum, as all of these schools employ only “oral/written education” in Turkish (İlkbaşaran, 2015).

2 For all models reported below, a more complex version of the model with Display Type (Contrast, Non-contrast) as an additional fixed effect was also tested. None of these models revealed a fixed effect of Display Type, and they did not provide a better fit to the data. Thus, the fixed effect of Display Type was omitted from the models.

References

Abarbanell, L., & Li, P. (2021). Unraveling the contribution of left-right language on spatial perspective taking. Spatial Cognition & Computation, 1–38.
Alibali, M. W., & Goldin-Meadow, S. (1993). Gesture-speech mismatch and mechanisms of learning: What the hands reveal about a child’s state of mind. Cognitive Psychology, 25(4), 468–523. http://doi.org/10.1006/cogp.1993.1012
Arık, E. (2003). Spatial representations in Turkish and Sign Language of Turkey (TİD). MA thesis, University of Amsterdam, Amsterdam, NL.
Azar, Z., Özyürek, A., & Backus, A. (2020). Turkish-Dutch bilinguals maintain language-specific reference tracking strategies in elicited narratives. International Journal of Bilingualism, 24(2), 376–409. http://doi.org/10.1177/1367006919838375
Bååth, R. (2014). BayesianFirstAid: A package that implements Bayesian alternatives to the classical test functions in R.
Bahtiyar, S., & Küntay, A. C. (2009). Integration of communicative partner’s visual perspective in patterns of referential requests. Journal of Child Language, 36(3), 529–555. http://doi.org/10.1017/S0305000908009094
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2014). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48.
Benton, A. (1959). Right-left discrimination and finger localization. New York: Hoeber-Harper.
Bowerman, M. (1996a). Learning how to structure space for language: A cross-linguistic perspective. In Bloom, P., Peterson, M., Nadel, L., & Garett, M. (Eds.), Language and space (pp. 385–436). Cambridge, MA: MIT Press.
Bowerman, M. (1996b). The origins of children’s spatial semantic categories: Cognitive versus linguistic determinants. In Gumperz, J. J. & Levinson, S. C. (Eds.), Rethinking linguistic relativity (pp. 145–176). Cambridge: Cambridge University Press.
Brentari, D. (Ed.). (2010). Sign languages. Cambridge University Press.
Casasola, M. (2008). The development of infants’ spatial categories. Current Directions in Psychological Science, 17(1), 21–25. http://doi.org/10.1111/j.1467-8721.2008.00541.x
Casasola, M., Cohen, L. B., & Chiarello, E. (2003). Six-month-old infants’ categorization of containment spatial relations. Child Development, 74(3), 679–693.
Chu, M., & Kita, S. (2008). Spontaneous gestures during mental rotation tasks: Insights into the microdevelopment of the motor strategy. Journal of Experimental Psychology: General, 137(4), 706–723.
Church, R. B., & Goldin-Meadow, S. (1986). The mismatch between gesture and speech as an index of transitional knowledge. Cognition, 23, 43–71. http://doi.org/10.1111/1467-8624.00562
Clark, E. V. (1973). Non-linguistic strategies and the acquisition of word meanings. Cognition, 2(2), 161–182. http://doi.org/10.1016/0010-0277(72)90010-8
Clark, E. V. (2004). How language acquisition builds on cognitive development. Trends in Cognitive Sciences, 8(10), 472–478.
Cook, S. W., Duffy, R. G., & Fenn, K. M. (2013). Consolidation and transfer of learning after observing hand gesture. Child Development, 84(6), 1863–1871. http://doi.org/10.1111/cdev.12097
Corballis, M. C., & Beale, I. L. (1976). The psychology of left-right. Hillsdale, NJ: Erlbaum.
Corsi, P. M. (1972). Human memory and the medial temporal region of the brain (Unpublished doctoral dissertation). McGill University, Montreal, Canada.
Durkin, K. (1980). The production of locative prepositions by young school children. Educational Studies, 6(1), 9–30. http://doi.org/10.1080/0305569800060102
Emmorey, K. (2002). Language, cognition, and the brain: Insights from sign language research. Mahwah, NJ: Lawrence Erlbaum Associates. http://doi.org/10.4324/9781410603982
Furman, R., Küntay, A., & Özyürek, A. (2014). Early language-specificity of children’s event encoding in speech and gesture: Evidence from caused motion in Turkish. Language, Cognition and Neuroscience, 29, 620–634.
Furman, R., Özyürek, A., & Allen, S. (2006). Learning to express causal events across languages: What do speech and gesture patterns reveal? In Bamman, D., Magnitskaia, T., & Zaller, C. (Eds.), Proceedings of the 30th Annual Boston University Conference on Language Development (pp. 190–201). Somerville, MA: Cascadilla Press.
Furman, R., Özyürek, A., & Küntay, A. C. (2010). Early language-specificity in Turkish children’s caused motion event expressions in speech and gesture. In Franich, K., Iserman, K. M., & Keil, L. L. (Eds.), Proceedings of the 34th Boston University Conference on Language Development (Vol. 1, pp. 126–137). Somerville, MA: Cascadilla Press.
Girbau, D. (2001). Children’s referential communication failure: The ambiguity and abbreviation of message. Journal of Language and Social Psychology, 20(1-2), 81–89. http://doi.org/10.1177/0261927X01020001004
Göksun, T., Hirsh-Pasek, K., & Golinkoff, R. M. (2010). How do preschoolers express cause in gesture and speech? Cognitive Development, 25(1), 56–68.
Goldin-Meadow, S. (2016). Using our hands to change our minds. Wiley Interdisciplinary Reviews: Cognitive Science, 8(1–2), 16.
Goldin-Meadow, S., & Beilock, S. L. (2010). Action’s influence on thought: The case of gesture. Perspectives on Psychological Science, 5, 664–674. http://doi.org/10.1177/1745691610388764
Goldin-Meadow, S., & Brentari, D. (2017). Gesture, sign, and language: The coming of age of sign language and gesture studies. Behavioral and Brain Sciences, 40, e46. http://doi.org/10.1017/S0140525X15001247
Grigoroglou, M., Johanson, M., & Papafragou, A. (2019). Pragmatics and spatial language: The acquisition of front and back. Developmental Psychology, 55(4), 729. http://doi.org/10.1037/dev0000663
Grigoroglou, M., & Papafragou, A. (2019a). Children’s (and adults’) production adjustments to generic and particular listener needs. Cognitive Science, 43(10), e12790. http://doi.org/10.1111/cogs.12790
Grigoroglou, M., & Papafragou, A. (2019b). Interactive contexts increase informativeness in children’s referential communication. Developmental Psychology, 55(5), 951. http://doi.org/10.1037/dev0000693
Harris, L. (1972). Discrimination of left and right, and development of the logic relations. Merrill-Palmer Quarterly, 18, 307–320.
Hermer-Vazquez, L., Moffet, A., & Munkholm, P. (2001). Language, space, and the development of cognitive flexibility in humans: The case of two spatial memory tasks. Cognition, 79(3), 263–299.
Howard, I., & Templeton, W. (1966). Human spatial orientation. London: Wiley.
İlkbaşaran, D. (2015). Literacies, mobilities and agencies of deaf youth in Turkey: Constraints and opportunities in the 21st century [Unpublished doctoral dissertation]. University of California.
Iverson, J. M., & Goldin-Meadow, S. (1998). Why people gesture when they speak. Nature, 396(6708), 228.
Janke, V., & Marshall, C. R. (2017). Using the hands to represent objects in space: Gesture as a substrate for signed language acquisition. Frontiers in Psychology, 8, 2007.
Johnston, J. R. (1985). Cognitive prerequisites: The evidence from children learning English. In Slobin, D. (Ed.), The crosslinguistic study of language acquisition. Vol. 2: The data (pp. 961–1004). Hillside, NJ: Lawrence Erlbaum Associates.
Johnston, J. R. (1988). Children’s verbal representation of spatial location. In Stiles-Davis, J., Kritchevsky, M., & Bellugi, U. (Eds.), Spatial cognition: Brain bases and development (pp. 195–205). Hillside, NJ: Lawrence Erlbaum Associates.
Johnston, J. R., & Slobin, D. I. (1979). The development of locative expressions in English, Italian, Serbo-Croatian and Turkish. Journal of Child Language, 6(3), 529–545. http://doi.org/10.1017/S030500090000252X
Karadöller, D. Z. (2021). Development of spatial language and memory: Effects of language modality and late sign language exposure. Published doctoral dissertation, Radboud University Nijmegen. MPI Series in Psycholinguistics, Nijmegen, The Netherlands. ISBN: 978-94-92910-33-2.
Karadöller, D. Z., Sümer, B., & Özyürek, A. (2021). Effects and non-effects of late language exposure on spatial language development: Evidence from deaf adults and children. Language Learning and Development, 17(1), 1–25. https://doi.org/10.1080/15475441.2020.1823846
Karadöller, D. Z., Sümer, B., Ünal, E., & Özyürek, A. (2022). Late sign language exposure does not modulate the relation between spatial language and spatial memory in deaf children and adults. Memory & Cognition. Advance online publication. https://doi.org/10.3758/s13421-022-01281-7
Kelly, S. D. (2001). Broadening the units of analysis in communication: Speech and nonverbal behaviours in pragmatic comprehension. Journal of Child Language, 28, 325–349. http://doi.org/10.1017/S0305000901004664
Kendon, A. (2004). Gesture: Visible action as utterance. Cambridge University Press. http://doi.org/10.1017/CBO9780511807572
Kita, S., & Özyürek, A. (2003). What does cross-linguistic variation in semantic coordination of speech and gesture reveal? Evidence for an interface representation of spatial thinking and speaking. Journal of Memory and Language, 48(1), 16–32. http://doi.org/10.1016/S0749-596X(02)00505-3
Kita, S., van der Hulst, H., & van Gijn, I. (1998). Movement phases in signs and co-speech gestures, and their transcription by human coders. In Wachsmuth, I. & Fröhlich, M. (Eds.), Gesture and sign language in human-computer interaction (pp. 23–35). Berlin: Springer.
Klima, E., & Bellugi, U. (1979). The signs of language. Cambridge, MA: Harvard University Press.
Landau, B. (2017). Update on “what” and “where” in spatial language: A new division of labor for spatial terms. Cognitive Science, 41, 321–350.
Levinson, S. C. (1996). Frames of reference and Molyneux’s question: Cross-linguistic evidence. In Bloom, P., Peterson, M., Nadel, L., & Garrett, M. (Eds.), Language and space (pp. 109–169). Cambridge, MA: MIT Press.
Levinson, S. C. (2003). Space in language and cognition: Explorations in cognitive diversity. New York, NY: Cambridge University Press. https://doi.org/10.1017/CBO9780511613609
Li, P., & Gleitman, L. (2002). Turning the tables: Language and spatial reasoning. Cognition, 83(3), 265–294.
Majid, A. (2002). Frames of reference and language concepts. Trends in Cognitive Sciences, 6(12), 503–504.
Manhardt, F., Brouwer, S., & Özyürek, A. (2021). A tale of two modalities: Sign and speech influence each other in bimodal bilinguals. Psychological Science, 32(3), 424–436.
Manhardt, F., Özyürek, A., Sümer, B., Mulder, K., Karadöller, D. Z., & Brouwer, S. (2020). Iconicity in spatial language guides visual attention: A comparison between signers’ and speakers’ eye gaze during message preparation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 46(9), 1735–1753. https://doi.org/10.1037/xlm0000843
Marshall, C., Jones, A., Denmark, T., Mason, K., Atkinson, J., Botting, N., & Morgan, G. (2015). Deaf children’s non-verbal working memory is impacted by their language experience. Frontiers in Psychology, 6, 527.
Martin, A. J., & Sera, M. D. (2006). The acquisition of spatial constructions in American Sign Language and English. Journal of Deaf Studies and Deaf Education, 11(4), 391–402. http://doi.org/10.1093/deafed/enl004
McNeill, D. (1992). Hand and mind: What gestures reveal about thought. Chicago: University of Chicago Press.
McNeill, D. (2005). Gesture and thought. Chicago: University of Chicago Press. http://doi.org/10.7208/chicago/9780226514642.001.0001
Mitchell, R. E., & Karchmer, M. (2004). Chasing the mythical ten percent: Parental hearing status of deaf and hard of hearing students in the United States. Sign Language Studies, 4(2), 138–163. https://doi.org/10.1353/sls.2004.0005
Newport, E. L. (1988). Constraints on learning and their role in language acquisition: Studies of the acquisition of American Sign Language. Language Sciences, 10(1), 147–172. https://doi.org/10.1016/0388-0001(88)90010-1
Özyürek, A. (2018). Cross-linguistic variation in children’s multimodal utterances. In Hickmann, M., Veneziano, E., & Jisa, H. (Eds.), Sources of variation in first language acquisition: Languages, contexts, and learners (pp. 123–138). Amsterdam: Benjamins.
Özyürek, A., Kita, S., Allen, S., Brown, A., Furman, R., & Ishizuka, T. (2008). Development of cross-linguistic variation in speech and gesture: Motion events in English and Turkish. Developmental Psychology, 44(4), 1040–1054. https://doi.org/10.1037/0012-1649.44.4.1040
Özyürek, A., & Woll, B. (2019). Language in the visual modality: Cospeech gesture and sign language. In Hagoort, P. (Ed.), Human language: From genes and brain to behavior (pp. 67–83). Cambridge, MA: MIT Press.
Pederson, E., Danziger, E., Wilkins, D. G., Levinson, S. C., Kita, S., & Senft, G. (1998). Semantic typology and spatial conceptualization. Language, 74, 557–589.
Peeters, D., & Özyürek, A. (2016). This and that revisited: A social and multimodal approach to spatial demonstratives. Frontiers in Psychology, 7, 222.
Perniss, P. (2007). Space and iconicity in German Sign Language (DGS) [Doctoral dissertation]. Max Planck Institute for Psycholinguistics, The Netherlands.
Perniss, P., Özyürek, A., & Morgan, G. (Eds.). (2015b). The influence of the visual modality on language structure and conventionalization: Insights from sign language and gesture [Special issue]. Topics in Cognitive Science, 7(1). http://doi.org/10.1111/tops.12127
Perniss, P., Thompson, R. L., & Vigliocco, G. (2010). Iconicity as a general property of language: Evidence from spoken and signed languages. Frontiers in Psychology, 1, Article 227. https://doi.org/10.3389/fpsyg.2010.00227
Perniss, P., Zwitserlood, I., & Özyürek, A. (2015a). Does space structure spatial language? A comparison of spatial expression across sign languages. Language, 91(3), 611–641. http://doi.org/10.1353/lan.2015.0041
Perry, M., Church, R. B., & Goldin-Meadow, S. (1992). Is gesture-speech mismatch a general index of transitional knowledge? Cognitive Development, 7(1), 109–122. http://doi.org/10.1016/0885-2014(92)90007-E
Piaget, J. (1972). Judgment and reasoning in the child. Totowa, NJ: Littlefield, Adams. (Originally published in 1928).
Piaget, J., & Inhelder, B. (1971). The child’s conceptualization of space. New York: Norton. (Originally published in 1948).
Powell, M. J. (2009). The BOBYQA algorithm for bound constrained optimization without derivatives. Cambridge NA Report NA2009/06, University of Cambridge, Cambridge, 26–46.
R Core Team (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/
Rigal, R. (1994). Right-left orientation: Development of correct use of right and left terms. Perceptual and Motor Skills, 79(3), 1259. doi:10.2466/pms.1994.79.3.1259
Rigal, R. (1996). Right-left orientation, mental rotation, and perspective-taking: When can children imagine what people see from their own viewpoint? Perceptual and Motor Skills, 83(3), 831–843. doi:10.2466/pms.1996.83.3.831
Sauter, M., Uttal, D. H., Alman, A. S., Goldin-Meadow, S., & Levine, S. C. (2012). Learning what children know about space from looking at their hands: The added value of gesture in spatial communication. Journal of Experimental Child Psychology, 111(4), 587–606. http://doi.org/10.1016/j.jecp.2011.11.009
Sekine, K. (2009). Changes in frame of reference use across the preschool years: A longitudinal study of the gestures and speech produced during route descriptions. Language and Cognitive Processes, 24, 218–238. http://doi.org/10.1080/01690960801941327
Slonimska, A., Özyürek, A., & Capirci, O. (2020). The role of iconicity and simultaneity for efficient communication: The case of Italian Sign Language (LIS). Cognition, 200, 104246. http://doi.org/10.1016/j.cognition.2020.104246
Sommerville, J. A., Woodward, A. L., & Needham, A. (2005). Action experience alters 3-month-old infants’ perception of others’ actions. Cognition, 96(1), B1–B11. http://doi.org/10.1016/j.cognition.2004.07.004
Sümer, B. (2015). Acquisition of spatial language by signing and speaking children: A comparison of Turkish Sign Language (TİD) and Turkish. Unpublished doctoral dissertation, Radboud University Nijmegen, Nijmegen.
Sümer, B., & Özyürek, A. (2020). No effects of modality in development of locative expressions of space in signing and speaking children. Journal of Child Language, 47(6), 1101–1131.
Sümer, B., Perniss, P. M., Zwitserlood, I. E. P., & Özyürek, A. (2014). Learning to express “left-right” & “front-behind” in a sign versus spoken language. In Bello, P., Guarini, M., McShane, M., & Scassellati, B. (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society. Austin, TX: Cognitive Science Society.
Supalla, T. R. (1982). Structure and acquisition of verbs of motion and location in American Sign Language. Unpublished doctoral dissertation, UCSD, USA.
Talmy, L. (1985). Lexicalization patterns: Semantic structure in lexical forms. Language Typology and Syntactic Description, 3, 57–149.
Taub, S. (2001). Language from the body: Iconicity and metaphor in American Sign Language. Cambridge University Press. http://doi.org/10.1017/CBO9780511509629
Taub, S., & Galvan, D. (2001). Patterns of conceptual encoding in ASL motion descriptions. Sign Language Studies, 175–200.
Wittenburg, P., Brugman, H., Russel, A., Klassmann, A., & Sloetjes, H. (2006). ELAN: A professional framework for multimodality research. In 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 1556–1559).
Zwitserlood, I. (2012). Classifiers: Meaning in the hand. In Pfau, R., Steinbach, M., & Woll, B. (Eds.), Sign language: An international handbook (pp. 158–186). Berlin: Mouton de Gruyter.