Gender as stylistic bricolage: Transmasculine voices and the relationship between fundamental frequency and /s/


Despite the importance of gender differences in the voice, sociolinguists have not paid sufficient attention to the sociolinguistic processes through which phonetic resources are mobilized in the construction of a gendered voice. This article argues that gender differences in the voice—including those influenced by physiology—are best understood as elements of sociolinguistic style rather than static properties. With a focus on transgender speakers in the early stages of masculinizing hormone therapy, the analysis demonstrates the complex interrelationship of the gendered meanings attributable to characteristics like fundamental frequency and /s/. Trans speakers challenge systems for categorizing voices as female or male, which assume that different aspects of the gendered voice will pattern together in normative ways. Yet a voice's gender is not a unidimensional feature, but a cluster of features that take on meaning only in context with one another, leaving them open for recombination and change through stylistic bricolage. (Transgender, style, gender, voice, pitch, sibilants)


Gender differentiation in the voice is of critical importance to sociolinguists, who have built theories about ‘female’ and ‘male’ voices into our most basic methods for sociophonetic analysis. Phoneticians, and even many sociolinguists, often look to biology to explain these differences, even where evidence for social influences is plentiful. Some sociolinguists, especially those whose work focuses on language, gender, and sexuality (as early as Sachs 1975), have seen gender socialization early in life as an alternative account for gender differences in the voice. What these perspectives tend to share, however, is a structural, deterministic understanding of the gendered voice, in which sex and/or early gender socialization lock in gender differences that persist across the lifespan. In the last two decades, sociophoneticians have increasingly turned toward investigations of gender non-normativity in the voice (e.g. Campbell-Kibler 2007; Podesva 2007; Stuart-Smith 2007; Levon 2007, 2014; Zimman 2013; Pharao, Maegaard, Spindler, & Kristiansen 2014), revealing a greater degree of variation in the gendered voice than heteronormative discourses would suggest. Gay men, or in some cases men whose voices are perceived as gay-sounding, have been the focus of numerous studies and have been described as selectively appropriating and recombining phonetic elements associated with normative femininity and masculinity in order to index their sexual orientation (e.g. Gaudio 1994; Munson 2007). This set of insights has been crucial in the development of sociophonetic research on sexuality, yet the force of its critique has yet to impact the study of phonetic expressions of gender, particularly when it comes to features like fundamental frequency, the acoustic correlate of vocal pitch.

Transgender people are a group of speakers who have great potential to complexify our ideas about gender and the voice, but who have yet to receive in-depth attention from sociolinguists. In this article, I argue that trans voices demand a retheorization of phonetic gender differentiation in order to account for the full diversity of the gendered voice. One illuminating way to approach gender diversity is to bring variation in the gendered voice under the rubric of sociolinguistic style, a theoretical construct that recognizes the inherent variability of language as well as the capacity for speakers to combine and recombine linguistic features in order to construct social meaning. I argue that rather than taking for granted that broad gender differences in the voice arise directly from physiology, sociolinguists must critically examine the relationship between biology, gender socialization early in life, and identities embodied throughout the lifespan. Although there is a small body of literature on the voices of transgender people, it has mostly been carried out by speech-language pathologists who are primarily interested in discovering how speech therapists might help trans people feminize (or, much less often, masculinize) their voices. Bringing the tools of sociolinguistics to bear on trans voices promises a far greater range of insights on the boundaries between ‘female’ and ‘male’ voices, the ways gendered phonetic styles might change over the lifetime, and the interaction between physiology and social practice. As an alternative to biological (or social) determinism, this article argues that sociolinguists must tease apart the meaning of a phrase like ‘(fe)male voice’ if we hope to account for a fuller range of potential gendered styles employed by both transgender and cisgender (i.e. nontransgender) speakers.

The voice acts as a critical mediator of gender perceptions, making it hugely important to transgender people who are pursuing a transition from one gender role to another. Those whose voices mark them as audibly transgender often find that speaking is the act that prevents them from having their self-identified gender recognized and affirmed by others. The voice is thus as much of a focus in the lives of trans people as it is for sociolinguists. Trans women in particular often seek out linguistic feminization techniques in order to bring their voices closer to normative standards for women's speech. Trans men, by contrast, often make use of testosterone therapy, which has a dramatic effect on vocal pitch, much as it does during a typical male puberty. The little research that has been published on the voices of transmasculine people comes primarily from speech-language pathologists (van Borsel, de Cuypere, Rubens, & Destareke 2000; Adler & van Borsel 2006; Damrose 2009) and from scholars and instructors of drama and vocal performance (McNamara 2007; Constansis 2008), with the exception of linguistic work by Papp (2011), Zimman (2012, 2013, 2015), and Podesva & Van Hofwegen (2016). The paucity of this research is partly attributable to the assumption among health practitioners, researchers, and community members alike that the effects of testosterone on the larynx make other kinds of vocal masculinization unnecessary (see also Kulick 1999). As a result, trans men and trans women tend to have very different kinds of awareness about gender and the voice and different attitudes toward self-consciously changing their voices (Zimman 2016). In the present analysis, I focus on the phonetic changes that trans men and other transmasculine people experience during the first one to two years of testosterone therapy, much as Papp (2011) and Zimman (2012) document.
Hormonal changes clearly exert an influence on the pitch range used by these speakers, but I argue that the interpretation of these changes requires consideration of the other features that cluster together to constitute a speaker's sociolinguistic style(s). The challenges that arise in attempting to categorize trans voices as female or male highlight the folly of treating gender as a binary opposition, or even a unidimensional continuum. We cannot assume that the features that constitute a gendered voice will cluster together in only one of two predictable ways, with women doing one set of things and men doing another. Transgender speakers bring into sharp relief how variable the gendered voice can be, but the insights extend to cisgender speakers as well. The goal of this paper, then, is not to provide a generalizable account of a representative sample of trans voices, nor to make any comparative claims about how transgender speakers differ from their cisgender counterparts. Rather, its aim is to complexify our understanding of what it means to have a female or male voice at all. Trans voices demand a reframing of the gendered voice as a fluid set of multidimensional styles rather than a static property determined by speaker sex. Viewing gender as a form of style sheds light not only on the way voices become gendered, but also on the broader theorization of how language is imbued with social meaning.

The analysis presented here is based on two years of ethnographic participant-observation in transmasculine communities in the San Francisco Bay Area. Transmasculine is an umbrella label that includes transgender men (i.e. those assigned to a female gender role at birth who self-identify as men) as well as other individuals who were female-assigned but who do not identify as women, including some nonbinary individuals (i.e. those who do not identify as strictly female or male). The study involved recording a group of ten transmasculine people over the course of a year during their first months of hormone therapy in order to document how their fundamental frequency (or F0) was changing, as well as conducting participant observation to situate those changes ethnographically and stylistically. In addition to F0, the broader project (documented in Zimman 2012) also looks at changes in vowel formants and the frequency profile for /s/; in this article, I focus only on F0 and /s/. These features were selected because they pair one aspect of the voice that is clearly influenced by biology (pitch) with another whose variation is driven by articulatory rather than anatomical difference (/s/). Together, they represent an opportunity to consider how social and physiological forces interact. While the changes in F0 alone paint one picture of how masculine or feminine these speakers sound, the incorporation of /s/ into the analysis shifts this picture considerably such that the interpretation of either of these features depends on the other. Approaches to semiotic style that treat social meaning as produced through stylistic bricolage (as in Eckert 2004) are ideal for unpacking the various ways that gender is indexed phonetically, leading to the conclusion that gender itself is a form of sociolinguistic style.

Literature review

Style and indexical meaning

Style has been an integral part of sociolinguistic theory since its earliest days, when it served as a framework for investigating intraspeaker variation and linking the individual to the community in work like Labov's classic study of New York City (1966/2006, 1972). In the intervening decades, style has been theorized in a variety of ways, often with a focus on how to account for shifts in individuals’ ways of speaking across contexts (see Schilling-Estes 2002 for a review). Styles have been characterized as bundles of linguistic features that are grouped together in meaningful ways (Eckert & Rickford 2001; Coupland 2003; Eckert 2004; also Agha 2005). That is, styles are not simply groupings of variables, but groupings that are based on a coherent logic that both draws on and transforms previous uses of those variables. Importantly, the assignment of social meaning happens at the level of the style rather than the individual linguistic variable, despite the tendency among some sociolinguists to search for a one-to-one correspondence between features and social meaning (Eckert 2004). For early variationist sociolinguists, the meaning of style had to do with formality and informality, which in turn mapped onto standard and vernacular and, by extension, the stratification of socioeconomic classes. In contemporary research, more emphasis is placed on the wide range of meanings style can produce in the process of constructing discursive stances, personae, and identities. As a result, style has come to figure prominently in the theorization of sociolinguistic meaning, and it is this aspect of style that is of critical importance for the present discussion.

Indexicality features prominently in the contemporary theorization of style. Because social meaning is attributed to styles rather than features, the indexical meaning of a linguistic feature cannot be determined outside of its indexical context. This has led to the theorization of the indexical field (Eckert 2008), a fluid assemblage of potential meanings that may attach to a particular linguistic variable. Eckert's primary example, the released /t/, has been associated with various temporary stances (e.g. as emphatic, formal, polite, angry), more perduring qualities (e.g. as elegant, educated, articulate, prissy) and social types (e.g. as British, a school teacher, a nerd girl, a gay diva) (Eckert 2008:469). Given that speakers do not invoke all of these meanings simultaneously and with the same intensity, it is the presence of other socially meaningful indexes, linguistic or otherwise, that works to select from the potential meanings the indexical field allows (see also Coupland 2003). The process of combining socially meaningful signs into coherent styles has been identified as a kind of bricolage (Eckert 2004, following the work of Levi-Strauss 1966 and Hebdige 1979) in which speakers draw on disparate linguistic resources and use them to construct something that exceeds the meaning of its individual components.

Stylistic bricolage has been particularly informative in studies of sexuality and the voice aiming to identify the phonetic characteristics that tend to appear in the voices of men who either identify as gay or whose voices are perceived as ‘gay sounding’. This has proven a challenging task, as many of the acoustic features that correlate with perceived sexuality in one study turn out to be nonsignificant for another (Zimman 2013:4–6). A few researchers have designed experiments that more carefully tease apart the contribution of individual variables as well as their relationship to one another. Levon (2007) examines the relationship between two acoustic measures that are also relevant for this article: pitch range and the duration of /s/. The experiment used digitally manipulated speech from one gay speaker and one straight speaker in order to test whether changing a speaker's pitch range and/or altering the duration of /s/ might influence the way listeners judge their sexuality. While decreasing the pitch range of the gay-sounding speaker in this study made him more likely to be identified as straight, the straight-sounding speaker's gayness ratings were not significantly affected by changes in pitch range. Levon finds that it is the combination of a long /s/ duration and a large pitch range that consistently resulted in higher gayness ratings. Campbell-Kibler (2011) presents another experiment that also uses a matched guise method, in this case including three features that have been associated with perceived gayness among male speakers: a fronted or retracted /s/, a higher or lower fundamental frequency, and the use of the velar or alveolar variant of (ING) (see also Campbell-Kibler 2007). 
While the correlation between gayness and a fronted /s/ was stable across Campbell-Kibler's (2011) speakers and guises, there were also more contextual associations, like the one between a fronted /s/ paired with the use of velar (ING), which together worked to evoke an image of the ‘smart, effeminate gay man’. It could not be said that the use of the velar form of (ING), on its own, indexes gayness, but it can index a certain kind of gay identity when paired with other variables like a high frequency (or fronted) /s/. Each of these experiments provides strong evidence that voices are heard as ‘gay’ or ‘straight’ not because of the presence or absence of any one linguistic variable, but based on the particular combination of phonetic variables a speaker employs. In other words, indexing sexuality with the voice is a matter of identifying sociolinguistic styles (as in Podesva, Roberts, & Campbell-Kibler 2002), not individual variables.

Pharao et al. (2014) present another example of an experiment concerned with the variable perception of /s/ that underscores the critical importance of intersectionality. Most of the research on perceived sexuality among men either focuses on the voices of white speakers or leaves the ethnoracial identities of speakers unspecified. As a result, little has been said among sociophoneticians about the connections between sexuality, race, and immigration status that have been so widely discussed by queer theorists (e.g. Duggan 2003; Massad 2007; Puar 2007). Pharao and his colleagues discuss a fronted /s/ that can be found in two types of speech in Copenhagen. A fronted /s/ in modern Copenhagen Danish (a racially unmarked variety) is associated with the perception of men as gay, as it is in the varieties of English discussed above. However, it is also found in a variety known as ‘street language’ that can index a tough masculinity associated with immigrant youth. Rather than perceiving all speech containing a fronted /s/ as equally gay, listeners’ associations were sensitive to the variety in which the sibilant occurred (see also Levon 2014 on stereotypes as cognitive mechanisms guiding the selection of indexical meaning).

While the sociophonetic study of sexuality has been greatly advanced by the incorporation of bricolage as a theoretical tool, the study of basic gender differentiation in the voice has not yet been approached from this perspective. Gender differences in fundamental frequency and formant frequencies continue to be naturalized and taken for granted, often based purely on the researcher's own perceptions of speakers’ genders. These assumptions have concrete implications for basic phonetic methods like formant extraction and vowel normalization, which generally treat women and men as relatively homogenous, nonoverlapping groups distinguished by physiology. The central goal of this article, then, is to demonstrate that the gendered voice, and indeed gender itself, is constituted through sociolinguistic style.

The acoustic variables

The acoustic analysis to follow focuses on two variables: fundamental frequency (or F0) and the realization of /s/. These features make ideal candidates for the type of analysis presented here because each has a different relationship to sex (i.e. bodily characteristics) and gender (i.e. learned practices). Differences in women's and men's potential F0 range are clearly constrained by the size of the larynx, while the gendered patterns for /s/ point toward an articulatory source (e.g. Munson 2007; Stuart-Smith 2007; Fuchs & Toda 2010; Zimman 2013; Podesva & Van Hofwegen 2016).

When it comes to F0, the bulk of the gender difference found in adults arises during puberty (Lee, Potamianos, & Narayanan 1999; Whiteside 2001), at which point young people of all genders typically undergo a lowering of fundamental frequency (e.g. Hollien & Paul 1969; Hollien, Green, & Massey 1994; Evans, Neave, Wakelin, & Hamilton 2008). High levels of testosterone generally have a profound effect on the size of the larynx (Titze 1989), as transgender men make particularly clear (van Borsel et al. 2000; Papp 2011; Zimman 2012). Among adult speakers of American English, average speaking fundamental frequency is often placed at around 100–120 Hz for men and 200–220 Hz for women (e.g. Fitch & Holbrook 1970; Linke 1973; Stoicheff 1981; Traunmüller & Eriksson 1995; Simpson 2009).

At the same time, there is strong evidence for the influence of social forces even in cases like pitch, where biology is also clearly at play. Differences in the F0 of women and men have been shown to vary across time (e.g. Hollien & Shipp 1972; Yuasa 2008), across cultures and languages (e.g. Majewski, Hollien, & Zalewski 1972; Loveday 1981), and across social groups who speak different varieties of the ‘same’ language (e.g. Szakay 2006). The estimates for American English-speaking women's and men's mean F0 just cited, for instance, are lower than those found in studies on demographically similar speakers conducted during the 1950s and 1960s (see Yuasa 2008 for more); this has been interpreted as evidence of a generational shift in norms for gendered speaking styles. Yuasa (2008) compares the downward trend in the F0 of American English speakers to the situation among Japanese speakers, who do not seem to be undergoing this type of shift. In her comparison, she also highlights the culture-specific patterns in men's and women's use of vocal pitch across these (admittedly homogenized) groups. Speakers of Japanese, according to Yuasa (also Loveday 1981), maintain greater distance between women's and men's habitual F0 range, with women using a higher F0 than American English-speaking women, and men using a lower F0 than American English-speaking men. Given that pitch can differ across cultures and languages, it is unsurprising that it can also be used differently across social groups understood as speaking the same language. For instance, Szakay (2006) documents differences in Maori (indigenous) and pakeha (white) New Zealanders and finds that the former group displays a significantly higher mean pitch range than the latter.
Furthermore, prepubescent children have been found in some studies (Hasek, Singh, & Murry 1980; Ingrisano, Weismer, & Schuckers 1980; Ferrand & Bloom 1996) to make use of different pitch ranges and intonational contours based on assigned gender as early as age seven. Though some authors have speculated on a biological cause for these differences in children, the evidence as a whole shows that children learn many gendered articulatory patterns from a young age (e.g. Busby & Plant 1995; Flipsen, Shriberg, Weismer, Karlsson, & McSweeny 1999; Whiteside & Marshall 2001) despite the lack of dimorphism in the vocal tract (Fitch & Giedd 1999). Ferrand & Bloom (1996), for example, find that prepubescent boys’ lowering of F0 occurs simultaneously with changes in the intonational contours they employ. Although physiology is clearly important, it seems that at least part of the observed gender differences in F0 derives from socially learned habits.

Sibilant consonants, particularly /s/, have also been linked to gender across a number of studies, but in this case the evidence points toward a more purely articulatory cause. Variation in gendered qualities of /s/ derives in large part from the size of the space between a speaker's tongue and the teeth. The closer the tongue is to the teeth, the higher the frequency of /s/. The bulk of the friction produced in the articulation of /s/ in English occurs in a high frequency range, generally above 4,000 Hz (Shadle 1990, 1991). Women are widely reported to produce this sound at a higher frequency than men (Schwartz 1968; Flipsen et al. 1999; Heffernan 2004; Stuart-Smith 2007; Fuchs & Toda 2010), or, more precisely, with relatively greater amplitude in the high frequencies, but this pattern is not without exception. The frequency profile of /s/ can be measured in a number of ways. Two of the most widely used methods are peak frequency, which identifies the frequency with the highest amplitude, and center of gravity (COG), which calculates a weighted mean for the full spectrum of frequencies. A review of studies on this subject by Flipsen and colleagues (1999) suggests that women's mean or peak frequency for /s/ tends to fall roughly between 6,500 and 8,100 Hz, while men's has been said to range from around 4,000 to 7,100 Hz.
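The two measures just described can be illustrated with a minimal sketch. It is not the procedure used in this study: the function names, the toy spectrum, and the `power` exponent applied to the amplitudes before weighting are all assumptions made for illustration, and the spectrum is supplied directly as paired lists of frequencies and amplitudes rather than computed from a recording.

```python
def peak_frequency(freqs, amps):
    """Return the frequency (Hz) of the spectral bin with the highest amplitude."""
    if not freqs or len(freqs) != len(amps):
        raise ValueError("freqs and amps must be non-empty and of equal length")
    best = max(range(len(amps)), key=lambda i: amps[i])
    return freqs[best]


def center_of_gravity(freqs, amps, power=2.0):
    """Return the amplitude-weighted mean frequency (Hz) of a spectrum.

    Each bin is weighted by amps[i] ** power. The default of 2.0 (weighting
    by power rather than raw amplitude) is an assumption of this sketch.
    """
    if not freqs or len(freqs) != len(amps):
        raise ValueError("freqs and amps must be non-empty and of equal length")
    weights = [a ** power for a in amps]
    total = sum(weights)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(f * w for f, w in zip(freqs, weights)) / total


# Hypothetical five-bin spectrum for a single /s/ token (invented values).
freqs = [4000.0, 5000.0, 6000.0, 7000.0, 8000.0]
amps = [1.0, 3.0, 2.0, 2.0, 2.0]

peak = peak_frequency(freqs, amps)    # bin with the most energy
cog = center_of_gravity(freqs, amps)  # weighted mean over all bins
```

In this toy spectrum the peak falls in the 5,000 Hz bin while the center of gravity is pulled upward by the energy spread across the higher bins, which is why the two measures can give different values for the same token.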

The overall picture of variation in /s/ makes a weak case for a correlation between /s/ production and sexual dimorphism. It is well known that /s/ is sensitive to articulatory differences (Shadle 1991), yet analyses of this feature provide another example of researchers’ desire to attribute gender differences in acoustic output to a binary model of biological sex. Several phoneticians have speculated that differences in vocal tract length might lead to gender-based patterns in the acoustics of /s/ (e.g. Flipsen et al. 1999:668). However, sex differentiation in the vocal anatomy is known to exist mainly in the posterior region—particularly the pharynx and larynx—whereas the frequency profile of /s/ is determined primarily by the size of the ‘front cavity’, a term for the aforementioned space between the tongue and the teeth (Shadle 1991). There has been speculation that the front cavity might differ anatomically by sex, or that other anatomical influences may be at work, such as the size and shape of the palate (Fuchs & Toda 2010), but comparisons of anatomical measures designed to test this relationship have failed to find statistically significant differences in the palates of women and men; measures designed to capture place of articulation were far more successful in predicting gender differences.

Variation in the placement of the tongue during the production of the sound provides by far the best explanation for the differences in this sound, not only across speaker gender, but also language, dialect, sexuality, age, and class, as in work by Heffernan (2004) comparing Japanese and English, Gordon, Barthmaier & Sands (2002) on seven unrelated languages, Stuart-Smith (2007) on class and age in Glasgow, Campbell-Kibler (2011) on the American South, and Munson (2007) on men's sexuality. The most significant findings for the present analysis are the association between high frequency /s/ and gay men and the complex patterns of gendered /s/ in Glasgow. In studies of men's (perceived) sexuality, /s/ has been more consistently linked with the percept of a gay male speaker than any other feature (see Zimman 2013 for a review and more on the connection between trans men and ‘gay sounding’ voices). Even more powerfully, Stuart-Smith (2007) shows that, in Glasgow, gender differences in /s/ are class- and age-specific. While the expected gender differences occur among middle-class speakers for both adults and adolescents of thirteen to fourteen years of age, Stuart-Smith found that the working-class girls in the younger age group produced the same kind of low frequency /s/ that was typical of the adult male speakers. Since working-class girls are clearly more similar physically to middle-class girls than they are to adult men of any class, we can conclude that socially driven articulatory differences are responsible for this noteworthy finding. Studies like these remind us that ‘women’ and ‘men’ are far from homogenous categories, and that intersectional identities can produce quite different constellations of gender-linked linguistic features. Taken together, this research points to the primacy of articulation in the production of gender differences in /s/, to the extent that any physical differences that might exist can be overridden by socially mediated identity work.

In the interest of exploring the diversity of sociolinguistic styles among one group of transgender speakers, this article pairs an analysis of fundamental frequency with an analysis of /s/ in order to provide a view on the process of bricolage, in which features from disparate sources work together to constitute holistic gendered styles. Before presenting the data, some background on the community of study and analytic methods would be informative.

Research context and methods

The ethnography

Like a growing segment of sociophonetic research, the analysis presented here is drawn from ethnographic fieldwork, which I carried out with a group of transgender speakers in the San Francisco Bay Area from 2010 to 2012. From an ethnographic perspective, language is fundamentally social and can never be separated from the socially meaningful contexts in which it occurs. One goal of linguistic ethnographers, then, is to discover how context produces, and is produced by, the social and linguistic practices observed in interaction. Ethnography has played a major role in the development of sociocultural linguistics from its earliest days (e.g. Hymes 1962), and has been particularly influential in the field of language and gender, and as part of the ‘third wave’ of sociolinguistic research (Eckert 2012). The last two decades in particular have seen a rise in the popularity of ethnographic methods among sociophoneticians (e.g. Eckert 2000; Podesva 2007, 2011; Mendoza-Denton 2008, 2011; Becker 2009; Drager 2009; Hall-Lew 2009). The study of gender differentiation in the voice, however, has most often been carried out by phoneticians working in scientific traditions, generally with data collected in laboratory conditions without much discussion of how the interactional aspects of that context might produce certain kinds of findings. Where ethnographers tend to focus on the situated, irreproducible nature of social interactions, phoneticians put more value on the scientific method, experimental replicability, and eliminating ‘bias’. One way these differences manifest is in the genre of speech data used for analysis. While phoneticians value speech recorded under relatively controlled conditions, ethnographers prefer to capture the speech of everyday life with minimal intrusion or manipulation on the part of the researcher (though always with the recognition that the effect of the researcher cannot be escaped). 
Of course, it is one of the foundational observations of sociolinguistics that people tend to speak differently in more formal or self-conscious contexts (Labov 1972).

Given this context, it may be a surprise that the analysis presented here employs read speech. The usefulness of read speech for isolating changes over time is clear, but such an analysis must come with caveats. Most importantly, I do not treat the findings of this analysis as representative or generalizable to other speaking contexts. Instead, I frame the self-conscious aspect of read speech as a potential opportunity for participants to produce a kind of self-conscious gendered performance. As speakers who are acutely attuned to the potential for masculinity and femininity to be conveyed linguistically, trans people who are in the early stages of a gender-role transition make for poor ‘naïve speakers’ even in the most conversational contexts. With the gendered characteristics of the voice weighing heavily on their minds, and their knowledge that they were participating in a study by a linguist seeking participants just starting hormones, it was obvious to participants that their changing voices were the focus. This is not to imply that all of the speakers documented here are consciously manipulating their speech in similar ways (or at all), but rather to suggest that the self-conscious aspects of read speech can be as much a source of insight as a limitation (see Gafter 2016).

The study and its participants

A challenge facing any ethnographic or sociolinguistic inquiry is how the community of study should be defined. The practice of delimiting a group of speakers as a community on the basis of shared demographic characteristics or overtly claimed identities has been thoroughly interrogated in both sociolinguistics generally and the study of language and sexuality in particular (Eckert & McConnell-Ginet 1992; Kulick 2000; Bucholtz & Hall 2004). There are two alternatives to the speech community model (Gumperz 1968/2001) that are useful for this study. The first is the community of practice (Eckert & McConnell-Ginet 1992), which conceptualizes group-level social relationships in terms of culturally meaningful activity. The participants in this study did not constitute a single community of practice (or CoP), and not all participants knew one another. Instead, these speakers participated in an overlapping set of CoPs. Eckert & McConnell-Ginet's (2007) revisitation of the framework pushes sociocultural linguists to connect these micro-scale communities to macro-scale social categories. The second alternative is Anderson's (1983) notion of the imagined community, which, though it was developed for the study of nationalism, is one way to conceptualize the sense of community that can be shared by people who have never met. Valentine (2007) focuses on the imagining of ‘the transgender community’—a construct to which all of the speakers in this study oriented as well.

The goal of recruitment for this study was to record people as close as possible to their first injection of testosterone in order to capture the ensuing changes. Because of the difficulty of finding people at exactly the right moment in their transitions, I also included some speakers who had already been on testosterone for a few months, and in one case a full year. The study included fifteen regular participants in all, ten of whom were able to participate in a full year of audio recordings—the other five left the Bay Area permanently or for extended periods of time. The ten long-term participants were recorded approximately once each month over the course of a year, and at each meeting performed a reading of Fairbanks’ (1960) Rainbow Passage as well as being recorded during some other kind of speech activity, such as an interview or semi-structured conversation with the researcher (as in Alim 2004) or an interaction between the participant and their friends, partners, or family members.

Table 1 contains demographic information for the ten speakers in this study who were recorded for a full year. As the table indicates, speakers ranged in age from twenty-two to forty-six years, though all but two participants—whom I call Adam² (age thirty-eight) and Mack (age forty-six)—were in their twenties. Nine self-defined as white, while one, Carl, described himself as a Filipino person of color. Seven out of the ten participants grew up in the Bay Area, while the other three came from the New York City suburbs (Adam), Spain (Pol, who acquired English natively from his English mother), and eastern Massachusetts (James).³ Class, which was assigned based on a combination of familial background and current profession, showed greater variability within this group, ranging from working to upper classes. Table 1 also includes information about participants’ gender identities (i.e. the categories they claim for themselves) and gender presentations (i.e. how they present their genders semiotically through resources like clothing, hair, bodily hexis, etc.). The words used for both of these columns are selected from participants’ self-descriptions and fleshed out by ethnographic observation. The gender identities participants claim cover several distinct groupings: men who see themselves as fundamentally part of the same category as cis men; trans men, who prefer to have their trans status marked and acknowledged when they describe their gender identity; genderqueer or nonbinary individuals who situate themselves outside of the female/male binary; and trans boys, a label adopted by some of these participants who identify primarily with masculinity but are not comfortable describing themselves as men (see Zimman 2012, 2015 for discussion). The final column of Table 1 includes the sexuality of participants, including both the identity label speakers use (i.e. as either queer or straight for this sample) and the genders to which they are attracted (since queer does not easily map onto specific forms of desire—some apply the label to their attraction to women, while others identify as queer because of their interest in men and/or nonbinary genders).

Table 1. Participants’ self-reported identities.

The participants in my fieldwork are not representative of the transmasculine community as a whole—particularly in the dimensions of race and age—yet even in the small sample of individuals I worked with for the course of a year, it is clear that there is a great deal of variety in the forms of masculinities enacted in trans communities. Despite the range of gender identities and expressions, what all of the speakers had in common was their choice to make use of injectable testosterone as a means of bodily masculinization, which is the most common medical intervention pursued by transmasculine people. Table 2 indicates how long each speaker had been on testosterone at our first meeting as well as his initial and final doses of testosterone (or T, as it's known in the community). Several speakers began at a ‘low dose’ of 25–50 milligrams of testosterone every week, working up to the typical ‘full dose’ of 100 mg per week or 200 mg biweekly. Some took a full dose from the very beginning of their hormone therapy; others preferred to remain at a low dose or to build up their dosage very slowly. Importantly, though, a person's dosage does not directly determine the actual testosterone levels in the bloodstream, so one trans person may find 100 mg per week provides ideal testosterone levels while others may need a higher or lower dose to reach the same range.

Table 2. Testosterone doses.

Acoustic methods

As noted, the analysis in this article is based on read speech in the form of repeated readings of the Rainbow Passage (see the appendix) on a more or less monthly basis over the course of approximately one year. A Fostex field recorder and Audio-Technica headset microphone were used to make all recordings in as quiet a space as the fieldwork allowed. All aspects of data collection and analysis were carried out by the author.

There are two sets of phonetic measurements to discuss for the purposes of this analysis: fundamental frequency (or F0) and the sibilant consonant /s/. Each production of the Rainbow Passage was divided into intonational phrases (IPs), each of which was measured for maximum, minimum, and mean F0. The pitch floor in Praat was set to 75 Hz, which meant that some stretches of creaky voice quality were included in the means.⁵ For this reason, minimum F0 was not included in the present analysis, as speakers’ minimums primarily reflect the extent to which they used creaky voice quality, a matter left for future investigation. Without speakers’ minimum F0, maximum F0 and the range between the two become less informative, so for the purposes of this analysis only mean F0 is considered.

For analysis of /s/, fourteen word-initial tokens were selected from the Rainbow Passage and high-pass filtered at 1,000 Hz to mitigate some low-frequency background noise (see methodological notes in Stuart-Smith 2007) and rule out the possibility that changes in the back cavity would affect measures of /s/. Spectral slices were generated from the middle of each token to capture a fuller frequency profile than a single point can provide. Following authors like Stuart-Smith (2007) and Munson (2007), a moments analysis was used to measure the distribution of frequencies in each spectrum. For the purposes of the present analysis, center of gravity (COG) is used as the indicator of /s/ frequency because it is the measure most consistently linked to gender, whereas skew is more often linked to gay men's speech (Munson 2007; see Zimman 2012 for more on modeling COG and skew in this group).
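To make the center of gravity measure concrete, the following is a minimal sketch of how a first spectral moment can be computed from a spectral slice. It is not the procedure used in the study, which was carried out in Praat. The function names, the 44,100 Hz sample rate, the synthetic two-tone signal, and the choice to weight frequencies by power rather than amplitude are all assumptions made for illustration.

```python
import cmath
import math


def power_spectrum(samples, sample_rate):
    """Naive DFT power spectrum for bins from 0 Hz up to the Nyquist frequency."""
    n = len(samples)
    freqs, powers = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        freqs.append(k * sample_rate / n)
        powers.append(abs(acc) ** 2)
    return freqs, powers


def center_of_gravity(freqs, powers, floor_hz=1000.0):
    """Power-weighted mean frequency, ignoring energy below floor_hz.

    Dropping bins below floor_hz stands in for the 1,000 Hz high-pass
    filter described in the text (an illustrative simplification).
    """
    kept = [(f, p) for f, p in zip(freqs, powers) if f >= floor_hz]
    total = sum(p for _, p in kept)
    if total == 0:
        raise ValueError("no spectral energy above the floor")
    return sum(f * p for f, p in kept) / total


# Hypothetical 10 ms slice: a 6,000 Hz component standing in for frication
# energy plus a 200 Hz component standing in for low-frequency noise.
SAMPLE_RATE = 44100  # assumed; the source does not state a sample rate
N = 441
slice_ = [math.sin(2 * math.pi * 6000 * t / SAMPLE_RATE)
          + math.sin(2 * math.pi * 200 * t / SAMPLE_RATE)
          for t in range(N)]

freqs, powers = power_spectrum(slice_, SAMPLE_RATE)
cog_filtered = center_of_gravity(freqs, powers)         # floor at 1,000 Hz
cog_unfiltered = center_of_gravity(freqs, powers, 0.0)  # no floor
```

In this synthetic case the two components carry equal power, so the unfiltered center of gravity falls midway between them (about 3,100 Hz), while the floored value sits at about 6,000 Hz. This illustrates why low-frequency energy is removed before measuring /s/.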

The dataset thus includes measurements for nineteen IPs and fourteen tokens of /s/ per recording. There are between eight and thirteen recordings of the Rainbow Passage available for each of the ten speakers analyzed on a longitudinal basis. Table 3 has the total number of recordings, IPs, and tokens of /s/ analyzed for each speaker. While nineteen and fourteen are not particularly large numbers of tokens for each recording, bear in mind that these tokens are produced in consistent phonological, discursive, and intonational contexts.

Table 3. Number of recordings and tokens analyzed.

Linear mixed effects regressions were used to model the relationship between the acoustic measures and the number of weeks speakers had been on testosterone. First, linear regressions were performed on all data for each acoustic variable in order to determine whether, for these speakers as a group, a correlation exists between F0 or /s/ and time on testosterone. Because various individual factors may influence the way F0 changes, this was followed up by separate analyses for each individual speaker to determine the degree of inter-speaker variability when it came to change over time. In each regression, the dependent variable was the acoustic measure (mean F0 or COG of /s/) while time on testosterone was included as a fixed-effect variable. A random-effect variable was also included: for analysis of F0, the random variable was IP, while for /s/ the random variable was the word in which the token appeared. In the analyses of all speakers grouped together, speaker is also included as a random-effect variable.
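One way to write down the grouped model described above is sketched here. The notation is mine rather than the article's, and treating each random-effect variable as a random intercept is an assumption, since the text does not specify the random-effects structure. Here y is the acoustic measure (mean F0 or COG of /s/) for speaker i, recording j, and item k, where an item is an IP for F0 or a word for /s/.

```latex
% Grouped analysis (all speakers); random intercepts are assumed
y_{ijk} = \beta_0 + \beta_1\,\mathrm{weeks}_{ij} + u_i + v_k + \varepsilon_{ijk},
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
v_k \sim \mathcal{N}(0, \sigma_v^2),\quad
\varepsilon_{ijk} \sim \mathcal{N}(0, \sigma^2)
```

Under this reading, the coefficient on weeks corresponds to the B values reported in the results, and the by-speaker analyses correspond to the same equation fitted to one speaker's data at a time, with the speaker term u_i dropped.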


Three sets of results are of interest for this article: the change in fundamental frequency experienced by speakers in this sample, their realization of /s/, and the relationship between these two measures.

Results of fundamental frequency analysis

To begin with the results for all ten of the speakers analyzed here as a group, there is a highly significant negative relationship between mean F0 and the number of weeks a speaker has been on testosterone (B = −0.4362, p < 0.001). As a whole, speakers who have been on testosterone longer have lower mean F0s than those who have been on testosterone for less time. Figure 1 illustrates this relationship in a scatterplot with a regression line.

Figure 1. Mean F0 by weeks on testosterone.

Looking at the data as a whole, however, hides important differences across speakers in this study. Most notably, the high-frequency tokens in the range of 40 to 140 weeks in Figure 1 were produced by a single speaker—Mack—who was recorded during his second year on testosterone. Table 4 contains results for each individual speaker. Each row provides a speaker's mean F0 for his first and final recordings of the Rainbow Passage and how many weeks he had been on testosterone when those recordings were made. Also included are coefficients that represent the direction and size of the correlation and p values that indicate its statistical significance. Speakers are ordered in this table according to the amount of time they had been on testosterone when they were first recorded. As the table indicates, all but one speaker—Dave—underwent a significant downward change in F0 over the course of participation in this study. Importantly, though, Dave had been on testosterone longer than anyone else in this study (just over one year at the time of our first recording) and also had the lowest mean F0, which was already 115 Hz during our first recording. He had clearly already gone through significant pitch changes before I began recording his voice, which he described as being “high soprano” before his transition.

Table 4. Changes over time in F0.

Figure 2 visualizes the data by speaker. Here, each empty triangle represents the mean F0 for the first recording made by each speaker, while each filled circle represents the mean at the final recording.

Figure 2. Mean F0 by speaker at first and last recording.

Though the pattern of downward change is consistent across speakers, there are also clear differences in the rate at which the changes occurred and the amount of change that took place. For example, James had been on testosterone for twenty-five weeks when I began recording him, but his voice had already dropped to an average F0 of 133 Hz. After only three weeks on hormones, Elvis was already at an average of 146 Hz. Yet Mack's last recording, which took place after two years on T, had a mean F0 of 176 Hz, higher than any other participant's starting mean. Physical size might be an attractive explanation for these differences, but in fact Mack was one of the tallest and overall largest participants in the study, while Dave was quite petite. There is also clear variation in how long speakers’ voices continue to change. While the change in James’ F0 was statistically significant, the raw change between his first and last recording was only 9 Hz, suggesting that his pitch had plateaued by the end of our work together. Mack, by contrast, had already been on testosterone for almost a year when I began recording him, yet he had a drop from 204 Hz to 176 Hz during his second year of hormone therapy. Mack contrasts most clearly with Dave, who did not see significant changes in F0 during his second year on testosterone. In addition to rate and trajectory of pitch changes, the raw amount of change each speaker saw varied considerably. The starkest comparison on this front is between Devin and Carl, who had similar mean F0 values during our first recording of 169 Hz and 171 Hz, respectively. Devin is the speaker who saw the biggest drop in mean F0, taking him to an average of 113 Hz, while Carl's pitch only fell 16 Hz, down to 155 Hz. Though we do not know what these speakers’ F0 would have been had I been able to record them before they began testosterone, they clearly occupied different pitch ranges after one year on hormones.

Most speakers eventually reached a mean under approximately 130 Hz, with the exceptions of Carl (whose final mean F0 was 155 Hz) and Mack (whose final mean was 176 Hz). Pol, though he eventually reached a mean of 127 Hz, also remained in the higher ranges (above approximately 160 Hz) for longer than most participants. There are a few factors that might help to explain some of the variability we find in these speakers’ starting fundamental frequency, ending fundamental frequency, and the paths that connect them. I briefly mention four potential factors below to draw attention to the way that both physiological and social factors seem to exert an effect on F0 among these speakers, but full consideration of the causes behind F0 change is beyond the scope of this article. These factors are age, regional dialect, ethnoracial identity, and testosterone dose.

Age may have an important effect on the degree of change trans speakers on testosterone experience. Within trans communities, it is often said that younger people have an easier time with gender role transitions, and part of this ease is ascribed to the physiological effects of aging. This could be a factor in explaining why Mack, as the oldest speaker in the sample (age forty-six), has seen less pitch change than the younger participants. It's worth noting, however, that Adam (age thirty-eight) and another speaker who did not complete a full year of recordings, Ethan (age forty-eight), reached the same range as the study's younger participants in only a few months’ time. Furthermore, Mack proved able to produce much lower pitched speech than he habitually used, but he was self-conscious about this deeper voice because it did not feel entirely authentic (see Zimman 2012 for details). If age is a factor in limiting some speakers’ vocal changes, then, physiology need not be the only or primary cause of those limitations. Perhaps speakers who have spent several decades living in a female gender role and employing a female-sounding voice have ingrained articulatory habits that are less easily changed than those of younger speakers, who have had less time to build up a gendered habitus.

The second factor worth further investigation is dialect. All but one of the speakers in this study were from the United States, and among those, most were from the San Francisco Bay Area. The clearest exception to the dialectal trend is Pol, a twenty-three-year-old white, working class, queer trans boy who grew up in Spain and is natively bilingual in Spanish and British English, learned at home from his Cornwall-born mother. Although Pol did eventually land in a normative male pitch range, with an average mean F0 of 127 Hz, his progression was slower than most of the other participants. Pol also reported being perceived as a woman, both in person and over the phone, for longer than most of the other speakers in the study. Even toward the end of his participation, when his vocal pitch had reached a low point, he described being read as female much more often than he would like. It is not necessarily the case that dialect directly constrains the pitch range that speakers employ, but Pol's soft-spoken British accent may have played into the perception of his gender in the US, especially given American ideologies about the supposed ‘effeminacy’ of British and European men.

A third factor worth further investigation is ethnoracial identity. The fact that all but one of the speakers in this sample self-identified as white makes it difficult to fully explore the implications of race and its effect on gendered styles, but it is notable that the one person of color in the study also stands out as occupying a different pitch range than the white speakers. This individual is Carl, a twenty-one-year-old Filipino American recent college graduate who grew up in the Bay Area in a middle-class immigrant family; Carl describes himself as simultaneously straight (based on his attraction to women) and queer (based on his identity as a trans person), and I discuss his gender presentation at greater length below. In addition to being the only person of color to participate in the longitudinal element of the project, Carl had a pitch that was significantly higher than any other participant's, save Mack's. But with so little sociolinguistic research available on Filipino Americans, and so little research about patterns of gendered phonetic measures among people of color, it is difficult to assess whether Carl's pitch range is typical among Filipino American men. Because of the limitations of this sample of ten, my current extensions of this research are focused specifically on trans people of color.

Finally, the particular dose of testosterone taken by speakers seems to be of importance. A standard ‘full dose’ of testosterone is approximately 100 mg per week, but many transmasculine people begin hormone therapy at a lower dose of 25–50 mg per week. Some continue at this low dose, like Pol (25 mg/week) and Adam (50 mg/week), while others increase their dose over time, like James, Mack, and Tony. Still others begin hormone therapy at a full dose, meaning they experience a more rapid shift toward a normatively male endocrine balance. Devin, the speaker with the largest change in F0 over the course of his participation in this study, was among those to begin at a full weekly dose of 100 mg. At the same time, dosage cannot explain this variance by itself, as several speakers with low doses (e.g. James) nevertheless had large changes in F0.

Hormone levels provide a particularly striking example of the ways physiology and social forces interact. The effects of hormones like testosterone and estrogen are often seen as a matter of biological predetermination. Yet trans people highlight the fact that even elements such as hormone levels are sensitive to social context, since trans people's choices about whether to go on hormones, and at what dose, are influenced by social factors such as an individual's gender identity, life circumstance, and desired transition path. Nor are synthetic hormones particular to trans people's healthcare; indeed, many cis people make use of products that influence their so-called sex hormones such as birth control, fertility treatments, postmenopausal hormone replacement therapy, testosterone replacement for older men, and even hair-loss combatants like Rogaine. Synthetic hormones aside, biologists have documented the effect social behaviors have on the levels of hormones a person's body produces (e.g. Bernhardt, Dabbs, Fielden, & Lutter 1998).

Larger scale investigations of transgender speakers should take these factors into account, and with a large enough sample size it may be possible to tease apart the contribution of age, dialect, race, and hormone exposure. However, what none of these accounts would be able to capture, as long as F0 is examined in isolation, is the critical importance of the stylistic elements with which F0 co-occurs.

Results of sibilant analysis

Though it might be tempting to treat a speaker's amount of change or ending point in F0 as an indicator of how masculine or feminine these speakers’ voices are, doing so would miss the important fact that pitch is only one marker of gender among many. It is for this reason that I now turn to an analysis of /s/, another sociophonetic index, which differs from fundamental frequency in the ways described above, in part because gender differences in sibilants are not subject to the same kind of physiological influence that exists for F0. As with fundamental frequency, here I present the changes occurring in some speakers’ productions of /s/ before underscoring the great inter-speaker variability to be found for this consonant.

First, the results of the regression performed across all ten speakers indicate that there is a correlation between center of gravity for /s/ and the amount of time spent on testosterone (B = –2.761, p < 0.01). However, this relationship is clearly much weaker than that discovered between F0 and time on testosterone, as Figure 3 illustrates. This group-based analysis also obscures what is revealed in a by-speaker analysis, which is that only a few participants in this study experienced a change in /s/.

Figure 3. Center of gravity for /s/ by weeks on testosterone.

As with Table 4, Table 5 shows the starting and ending means for the center of gravity of /s/ as well as coefficients and p values from the regression analysis. Unlike the seemingly universal downward changes in pitch spurred by testosterone for these speakers, only a few participants showed significant, but small, changes in center of gravity. Three saw a decrease in center of gravity: Adam (who went from 6,867 Hz to 6,481 Hz), Tony (who went from 6,159 Hz to 5,734 Hz), and Devin (who went from 9,459 Hz to 8,779 Hz). More strikingly, however, one speaker actually saw an increase in center of gravity: Carl (who went from an average of 6,522 Hz to 6,877 Hz). Most immediately, the variability in these findings aligns better with articulatory, rather than anatomical, accounts of gender differences in /s/, particularly since testosterone would not cause speakers to produce /s/ in a more normatively ‘feminine’ way, as Carl did. As I have discussed in greater depth elsewhere (Zimman 2012, 2016), several participants in my study talked about feeling more comfortable incorporating certain traditionally feminine characteristics into their gender expression once they started being perceived as men. Carl offered one of the most concrete examples of this process. When I first met him, Carl had an unremarkably masculine gender presentation that often took the form of slightly baggy jeans and t-shirts featuring pop-culture references such as video-game characters. As his transition progressed, however, he spoke openly about feeling increasingly comfortable expressing more ‘fem’ characteristics in both dress and mannerisms. Carl found that being almost universally perceived as male (despite his relatively high-pitched voice) allowed him greater freedom in the kinds of masculinities he could enact without risk of undermining recognition of his core gender identity as a trans man. 
Toward the end of my fieldwork, Carl had started letting his hair grow longer and was experimenting with a wider range of clothing styles—fitted cut-off jeans stopping just below the knee, for example—than the more conventional masculine look he preferred when we first met. Instead of looking like a relatively gender-normative adolescent boy, Carl began looking more like a queer, somewhat bohemian young man who is largely unconcerned with hegemonic norms for men's self-presentation. At the very same time that Carl showed a significant increase in center of gravity—approximately halfway through our year of recordings—he also showed a slight increase in F0.

Table 5. Changes over time in center of gravity for /s/.

All in all, these changes are relatively small compared to those seen in F0, and the more important aspect of /s/ for this discussion is not how its articulation changed over time for these speakers, but how much variability there is in these speakers’ centers of gravity for /s/. Figure 4 breaks down center of gravity by speaker, ordered from lowest to highest mean COG, and shows that the ten individuals in this study cover an enormous range of possible frequencies for /s/, as low as 4,300 Hz and as high as 11,000 Hz. According to Flipsen et al.’s (1999) review of the literature, the COG of /s/ for English-speaking men has been reported throughout the range of 4,000–7,000 Hz, while for women the values range from 6,500–8,100 Hz; the speakers in this study cover that entire range.

Figure 4. Center of gravity for [s] by speaker.

To make sense of the way /s/ varies across this population of speakers, it is helpful to invoke these speakers’ local understanding of gender. Specifically, we need to separate gender identity and gender presentation, and to treat these separate concepts as distinct from gender assignment and sexuality (Zimman 2015). Gender identity is a phrase used in transgender communities to refer to the gender categories one claims for oneself, including identities like woman and man, but also the distinctions mentioned earlier between identifying as a man versus a trans man, or between identifying as a trans man versus a trans boy. The speakers with the lowest center of gravity all identified as either men or trans men, including Mack, Carl, Adam, Tony, and Kyle. The speakers who described themselves as genderqueer or some other nonbinary gender identity all appear on the right of the chart: Elvis, James, Pol, and Devin. Gender identity alone, however, does not explain this spread, because Dave identifies strongly as a man yet has the second highest mean COG. This is why we must distinguish gender identity, as a claimed category, from the semiotic expression of femininity or masculinity known as gender presentation or expression. These are separate constructs for trans people in recognition that some men—including trans men—are not terribly masculine while some women—including trans women—are more butch than femme. Dave illustrates this disjuncture as a man who maintains what he describes as a fem gender expression constituted in part by his preference for bright, flamboyant, form-fitting clothing (often designed to be worn by women), as well as his vocal and gestural habits, which he describes as queeny and faggy. Dave is small in frame and stands just an inch or two over five feet, but is consistently perceived as male, undoubtedly due in part to his facial hair and low-pitched voice. He gives off a distinctly queer masculinity, however, in part through his voice. He makes ample use of falsetto voice quality, exhibits wild excursions in pitch range that contribute to his engaging and expressive interactional style, and, as Figure 4 indicates, has one of the highest frequency productions of /s/ among these speakers. See Zimman (2012, 2015) for an extended account of the role of gender expression and sexual orientation in the full sample of participants in this study.

Connecting features: The relationship between fundamental frequency and /s/

The temptation to categorize the femininity or masculinity of a transgender person's voice according to fundamental frequency alone conceals the view of gender's complexities that comes from analyzing a larger set of resources for constructing gender differentiation in the voice. It would be easy to conclude that some speakers in this study have been more successful than others when it comes to changes in pitch, but including /s/ in this characterization reveals that gender is not a unidimensional scale with femininity on one end of a continuum and masculinity on the other. Instead, gender is constituted by numerous overlapping and sometimes even conflicting sets of practices that can be combined and recombined in a huge variety of ways. Both gender and the gendered voice are in this sense produced through stylistic bricolage, and a gender-role transition is one of the most dramatic recontextualizations a person can experience. Transgender people talk about this recontextualizing process, at least with respect to nonlinguistic practices: for instance, a man who fawns over small children, takes up a lot of physical space on public transportation, compliments a woman's shoes, or playfully shoves a friend will be seen quite differently from a woman who does the same. When a trans person's gender identity is recognized publicly, as is the case for the trans speakers in this study who saw substantial lowering of vocal pitch, this also allows for more flexibility in gender presentation. In other words, transmasculine people who are categorized as physiologically male can engage in feminine activities—in some cases up to and including wearing ‘women's clothing’—without being perceived as female. Several speakers in this study spoke explicitly about feeling more comfortable (re)incorporating elements of feminine gender presentation that they had previously rejected because they threatened to undermine the gender perception they desired. 
Wanting a male-sounding voice was the most common reason my participants gave for pursuing testosterone therapy to begin with, but desire for a prototypically masculine voice was less common. Even when I asked straight-identified speakers like Mack and Ethan how they would feel if they were perceived as gay or feminine men, their responses stressed that they would not be too bothered, so long as they were perceived as men of some sort. Queer-identified participants, by contrast, often expressed a strong preference not to sound or act like heteronormative men.

If having a male-sounding voice allows speakers to avoid working to masculinize other aspects of their linguistic styles, as Zimman (2016) argues, it is not because pitch simply overrides other gender differences in the voice. In fact, a pilot perceptual experiment on the perceived gender of these speakers suggests that both formant frequencies and the center of gravity for /s/ mediate the categorization of speaker gender based on pitch. Pitch is unquestionably important in its own right, but the salience of a major change in fundamental frequency comes in part from the way it recontextualizes other aspects of a person's sociolinguistic style. In this way, the gendered voice is constructed through the same processes of stylistic bricolage we see at work in other domains of linguistic variation. With the development of a low-pitched voice, a high frequency /s/ can go from indexing a kind of hetero- and cisnormative femininity to indexing a gay or queer male identity (as in Zimman 2013). Although analyzing two features is only scratching at the surface of how multidimensional styles are formulated, the relationship between fundamental frequency and /s/ provides a striking complication to attempts at characterizing the femininity or masculinity of a voice.

In order to see how variation in fundamental frequency compares to variation in center of gravity for /s/ among these speakers, Table 6 presents a ranking for each of these variables: the speaker with the lowest mean frequency has a rank of one and the speaker with the highest mean frequency has a rank of ten. Speakers are listed here from lowest to highest mean F0, with a rank indicated in the left column. The ranking for center of gravity is listed in the right column. These numbers are grand means that include all tokens from all recordings in order to give a holistic sense of each speaker's voice at the time of their participation. My argument here is not that the ranking in F0 causes speakers to produce /s/ in the way they do, nor that these rankings are anything but relative to this group of speakers; they are therefore not necessarily generalizable. What they do show is that F0 and other markers of gender do not necessarily pattern together in normative ways. To reduce the gender of the voice to a single dimension would be to neglect the larger indexical picture, which is that there is a great deal of variation within gender groups, not just between them. Just as the meaning of style relies on a holistic picture of a person's speech, so too does the meaning of gender.
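The ranking procedure described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the speaker labels and token values are hypothetical placeholders rather than the study's data, the function names are invented, and the Spearman rank correlation at the end is an added summary statistic that the article itself does not report. The formula used assumes there are no tied ranks.

```python
from statistics import mean

# Hypothetical token-level measurements in Hz. These are placeholders,
# NOT values from the study; real input would hold every token from
# every recording for each speaker.
tokens = {
    "speaker_A": {"f0": [110.0, 120.0, 115.0], "cog": [7200.0, 7000.0, 7100.0]},
    "speaker_B": {"f0": [140.0, 150.0, 145.0], "cog": [5600.0, 5400.0, 5500.0]},
    "speaker_C": {"f0": [125.0, 135.0, 130.0], "cog": [6300.0, 6500.0, 6400.0]},
}


def grand_means(data, measure):
    """Grand mean per speaker: pool all tokens for one measure and average."""
    return {speaker: mean(values[measure]) for speaker, values in data.items()}


def rank_ascending(means):
    """Rank 1 goes to the lowest mean, rank n to the highest."""
    ordered = sorted(means, key=means.get)
    return {speaker: position + 1 for position, speaker in enumerate(ordered)}


def spearman_rho(rank_x, rank_y):
    """Spearman rank correlation (no-ties formula); -1 is a full reversal."""
    n = len(rank_x)
    d_squared = sum((rank_x[s] - rank_y[s]) ** 2 for s in rank_x)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


f0_rank = rank_ascending(grand_means(tokens, "f0"))
cog_rank = rank_ascending(grand_means(tokens, "cog"))

# List speakers from lowest to highest mean F0, as the table does.
for speaker in sorted(f0_rank, key=f0_rank.get):
    print(speaker, f0_rank[speaker], cog_rank[speaker])

print(spearman_rho(f0_rank, cog_rank))
```

With these placeholder values the two rank orders are exact mirror images, so the correlation is -1.0. A coefficient of this kind only summarizes relative ordering within the group being ranked, which matches the caution above that the rankings are not generalizable beyond these speakers.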

Table 6. Speakers’ grand means and rankings for F0 and center of gravity for /s/.

The most striking aspect of Table 6 is the near perfect reversal of the rank order. The speakers with the highest fundamental frequency (Carl and Mack) are also the speakers with the lowest frequency /s/, while those with the highest frequency /s/ (Dave and Devin) have the lowest F0. Clearly not all speakers show an opposition between these features—Pol does not, nor does Kyle, and Elvis and Tony are marginal cases—and, of course, this pattern exists specifically within the context of this group of speakers. But if we had analyzed F0 alone and concluded that Mack has the most feminine voice while Dave has the most masculine voice, we would fail to discover that Dave's voice is extremely queeny, in his words, while Mack's is decidedly masculine despite its high pitch. These are only two features, and the introduction of additional measures like formant frequencies, voice quality, and so forth, would further complexify this picture. But even when we consider only mean F0 and /s/, it becomes far more difficult to describe any of these speakers' voices as more or less masculine than the others. Yet, it is through consideration of the features together that we can begin to understand the logic behind these styles. Even as variation in /s/ aligns with speakers’ multi-layered relationships with gender identity, gender presentation, and sexuality, those particular indexical meanings are facilitated by the presence of a low fundamental frequency.

Turning back to the notion of style, bricolage helps to explain how the recontextualization of indexical meaning takes place. Both normative femininity and queer or non-normative masculinities co-exist on the indexical field for a high-frequency /s/, but it is the linguistic and social context of the variable that activates one of those meanings over the other. Of course, these are not arbitrary features. Normatively speaking, a person's embodied sex characteristics determine how they may be gendered, making the body a gatekeeper for the interpretation of gendered social practices. In more trans-affirming contexts, such as those found within trans communities, bodily normativity does not regulate access to gender affirmation; in this sense, ideology plays an important role in indexical configurations.


Sociolinguists often frame widespread gender differences in the voice differently from other kinds of sociophonetic variation. In spite of the field's commitment to uncovering the social logic of linguistic difference, the process by which voices are assigned female or male by analysts remains saturated with biological essentialism. Even in the case of gender's close ally—sexuality—we may talk about voices as sounding gay, but rarely do we describe voices as being gay. To describe a voice as female or male, by contrast, is an attribution process that usually occurs without reflection. As a result of the naturalization of binary gender differences in the voice, we know a great deal about the way binary gender categories correlate with phonetic linguistic variables (particularly those related to regional and class-based variation) but relatively little about variation in the phonetic indexes that most directly index gender, like pitch. Transgender voices challenge us to bring greater reflexivity to this process and examine our criteria for sexing or gendering a voice.

The biological essentialism that characterizes many linguists’ discourses about gender and the voice naturalizes a system where the sexing of a voice is determined by the sexing of the individual who produces it. One important lesson of queer and poststructuralist gender theory, however, is that biological sex is itself a social construct, and transgender embodiment has played a major role in supporting this argument. If female and male voices are distinguished on the basis of laryngeal size, for instance, the challenge comes in demarcating the exact boundary between these categories or determining whether it is possible to have a voice that is neither female nor male. If, by this logic, certain larynges are too small to be considered male, or too large to be considered female, does that mean some cisgender men (not to mention prepubescent boys) have female voices and some cisgender women have male voices, if their laryngeal size falls short of or exceeds the normative boundary? More likely, voices are designated female or male on the basis of the (perceived) identity of the speaker, such that (fe)male voice operates as a kind of circular shorthand for ‘a voice belonging to a (fe)male person’, which is then conflated with a voice that displays a normatively gendered pitch range, formant frequencies, and so on. The merging of sex and gender can pass by unnoticed in many studies when only cisgender speakers are considered, yet the result of this merger is far murkier theoretical waters regarding the causes and implications of gender differences in the voice.

Even where biological processes are clearly at play, as they are for transmasculine people on testosterone, social factors always have some role in mediating the relationship between embodiment, linguistic practice, and gendered meaning. To assume that the masculinization of transmasculine people's voices is caused by testosterone alone is to miss out on the complexity of their sociolinguistic styles, in which pitch is only one element among many. This article has focused on the greater depth of understanding that can be achieved by strategically examining two characteristics of the gendered voice, but gendered styles are most fruitfully examined as constellations of linguistic features that are assigned meaning holistically. Transgender speakers illuminate better than most how the construction of gender takes place, but the model of gender as style also provides a deeper, more intersectional account of gendered variation among cisgender speakers. However normative or exceptional a speaker's identity, treating gender as a single dimension provides a flat, if not plainly unrepresentative, picture of the gendered meanings the voice takes on. Despite its rampant naturalization, gender fundamentally operates like other facets of identity: constructed through a process of bricolage that draws on both material and symbolic resources, emerging contextually in concert with multiple intersecting identities, and remaining open to change in the ongoing process of articulating the self.

Appendix: Rainbow Passage (Fairbanks 1960)

When the sunlight strikes raindrops in the air, they act as a prism and form a rainbow. The rainbow is a division of white light into many beautiful colors. These take the shape of a long round arch, with its path high above, and its two ends apparently beyond the horizon. There is, according to legend, a boiling pot of gold at one end. People look, but no one ever finds it. When a man looks for something beyond his reach, his friends say he is looking for the pot of gold at the end of the rainbow. Throughout the centuries people have explained the rainbow in various ways. Some have accepted it as a miracle without physical explanation. To the Hebrews it was a token that there would be no more universal floods. The Greeks used to imagine that it was a sign from the gods to foretell war or heavy rain. The Norsemen considered the rainbow as a bridge over which the gods passed from earth to their home in the sky. Others have tried to explain the phenomenon physically. Aristotle thought that the rainbow was caused by reflection of the sun's rays by the rain. Since then physicists have found that it is not reflection, but refraction by the raindrops which causes the rainbows. Many complicated ideas about the rainbow have been formed. The difference in the rainbow depends considerably upon the size of the drops, and the width of the colored band increases as the size of the drops increases. The actual primary rainbow observed is said to be the effect of super-imposition of a number of bows. If the red of the second bow falls upon the green of the first, the result is to give a bow with an abnormally wide yellow band, since red and green light when mixed form yellow. This is a very common type of bow, one showing mainly red and yellow, with little or no green or blue.


1 A common misconception about bricolage as a theory of style that arose during the review of this article is that it necessarily involves speakers being consciously aware of the ways they combine linguistic features into holistic styles. Bricolage did arise as part of a shift away from older, more mechanistic perspectives on sociolinguistic style (see Eckert & Rickford 2001 for a range of examples); however, the theory itself does not require intentionality on the part of the speaker. Bricolage can draw from biologically influenced as well as socially learned variables, as my analysis makes clear (see also Zimman 2013). The core function of the concept of bricolage is to help us understand how meaning is constructed, not to make claims about the degree to which speakers are conscious of the process. See Zimman (2016) for a discussion of agency and intentionality among the same set of speakers discussed here.

2 All names are pseudonyms chosen in consultation with participants.

3 One anonymous reviewer inquired as to whether variation in age or speaker dialect might explain some of the variation in /s/ among these speakers. However, I am unaware of any literature linking the particular regions these speakers are from (the San Francisco Bay Area, New York, Massachusetts, and Cornwall-influenced British English) to differences in /s/ of the sort documented by Stuart-Smith (2007) and Campbell-Kibler (2011).

4 In the case of Mack and Dave, the amount of testosterone is derived from the second year of hormone therapy rather than the first.

5 The choice to either include or exclude portions of talk produced with creaky phonation is a difficult one. Because creak is characteristically produced with a low frequency (often well under 100 Hz), these means may be lower than the speakers’ mean F0 in modal voice quality. By contrast, excluding creak could be seen as producing artificially higher means. The correct answer depends on whether listeners perceive creaky voice quality as low pitched or whether it is perceived as fundamentally different from modal speech.


Adler, Richard K., & van Borsel, John (2006). Female-to-male considerations. In Adler, Richard K., Hirsch, Sandy, & Mordaunt, Michelle (eds.), Voice and communication therapy for the transgender/transsexual client: A comprehensive clinical guide, 139–67. San Diego, CA: Plural Publishing.
Agha, Asif (2005). Voice, footing, enregisterment. Journal of Linguistic Anthropology 15(1):38–59.
Alim, H. Samy (2004). You know my steez: An ethnographic and sociolinguistic study of styleshifting in a Black American speech community. Durham, NC: Duke University Press.
Anderson, Benedict (1983). Imagined communities: Reflections on the origin and spread of nationalism. New York: Verso.
Becker, Kara (2009). /r/ and the construction of place identity on New York City's Lower East Side. Journal of Sociolinguistics 13(5):634–58.
Bernhardt, Paul C.; Dabbs, James M. Jr.; Fielden, Julie A.; & Lutter, Candice D. (1998). Testosterone changes during vicarious experiences of winning and losing among fans at sporting events. Physiology & Behavior 65(1):59–62.
Bucholtz, Mary, & Hall, Kira (2004). Theorizing identity in language and sexuality research. Language in Society 33(4):469–516.
Busby, Peter A., & Plant, Geoff L. (1995). Formant frequency values of vowels produced by preadolescent boys and girls. Journal of the Acoustical Society of America 97(4):2603–7.
Campbell-Kibler, Kathryn (2007). Accent, (ING), and the social logic of listener perceptions. American Speech 82(1):32–64.
Campbell-Kibler, Kathryn (2011). Intersecting variables and perceived sexual orientation in men. American Speech 86(1):52–68.
Constansis, Alexandros N. (2008). The changing female-to-male (FTM) voice. Radical Musicology 3. Online:
Coupland, Nikolas (2003). Style: Language variation and identity. Cambridge: Cambridge University Press.
Damrose, Edward J. (2009). Quantifying the impact of androgen therapy on the female larynx. Auris Nasus Larynx 36(1):110–12.
Drager, Katie (2009). A sociophonetic ethnography of Selwyn Girls’ High. Christchurch: University of Canterbury dissertation.
Duggan, Lisa (2003). The twilight of equality? Neoliberalism, cultural politics, and the attack on democracy. Boston, MA: Beacon Press.
Eckert, Penelope (2000). Linguistic variation as social practice. Oxford: Blackwell.
Eckert, Penelope (2004). The meaning of style. Texas Linguistic Forum 47(1):41–43.
Eckert, Penelope (2008). Variation and the indexical field. Journal of Sociolinguistics 12(4):453–76.
Eckert, Penelope (2012). Three waves of variation study: The emergence of meaning in the study of variation. Annual Review of Anthropology 41:87–100.
Eckert, Penelope, & McConnell-Ginet, Sally (1992). Think practically and look locally: Language and gender as community-based practice. Annual Review of Anthropology 21:461–90.
Eckert, Penelope, & McConnell-Ginet, Sally (2007). Putting communities of practice in their place. Gender & Language 1(1):27–37.
Eckert, Penelope, & Rickford, John R. (eds.) (2001). Style and sociolinguistic variation. New York: Cambridge University Press.
Evans, Sarah; Neave, Nick; Wakelin, Delia; & Hamilton, Colin (2008). The relationship between testosterone and vocal frequencies in human males. Physiology & Behavior 93(4–5):783–88.
Fairbanks, Grant (1960). Voice and articulation drillbook. New York: Harper & Row.
Ferrand, Carole T., & Bloom, Ronald L. (1996). Gender differences in children's intonational patterns. Journal of Voice 10(3):284–91.
Fitch, James L., & Holbrook, Anthony (1970). Modal vocal fundamental frequency of young adults. Archives of Otolaryngology 92:379–82.
Fitch, W. Tecumseh, & Giedd, Jay (1999). Morphology and development of the human vocal tract: A study using magnetic resonance imaging. Journal of the Acoustical Society of America 106(3):1511–22.
Flipsen, Peter Jr.; Shriberg, Lawrence; Weismer, Gary; Karlsson, Heather; & McSweeny, Jane (1999). Acoustic characteristics of /s/ in adolescents. Journal of Speech, Language, and Hearing Research 42(3):663–77.
Fuchs, Susanne, & Toda, Martine (2010). Do differences in male versus female /s/ reflect biological or sociophonetic factors? In Fuchs, Susanne, Toda, Martine, & Zygis, Marzena (eds.), An interdisciplinary guide to turbulent sounds, 281–302. Berlin: Mouton de Gruyter.
Gafter, Roey J. (2016). What's a stigmatized variant doing in the word list? Authenticity in reading styles and Hebrew pharyngeal. Journal of Sociolinguistics 20(1):31–58.
Gaudio, Rudolf P. (1994). Sounding gay: Pitch properties in the speech of gay and straight men. American Speech 69(1):30–57.
Gordon, Matthew; Barthmaier, Paul; & Sands, Kathy (2002). A cross-linguistic acoustic study of voiceless fricatives. Journal of the International Phonetic Association 32(2):141–74.
Gumperz, John J. (1968/2001). The speech community. In Duranti, Alessandro (ed.), Linguistic anthropology: A reader, 43–52. Malden, MA: Blackwell.
Hall-Lew, Lauren (2009). Ethnicity and phonetic variation in a San Francisco neighborhood. Unpublished dissertation, Linguistics, Stanford University, Palo Alto, CA.
Hasek, Carol S.; Singh, Sadanand; & Murry, Thomas (1980). Acoustic attributes of preadolescent voices. Journal of the Acoustical Society of America 68(5):1262–65.
Hebdige, Dick (1979). Subculture: The meaning of style. New York: Routledge.
Heffernan, Kevin (2004). Evidence from HNR that /s/ is a social marker of gender. Toronto Working Papers in Linguistics 23(2):71–84.
Hollien, Harry; Green, Rachel; & Massey, Karen (1994). Longitudinal research on adolescent voice change in males. Journal of the Acoustical Society of America 96(5):2646–53.
Hollien, Harry, & Paul, Patricia (1969). A second evaluation of the speaking fundamental frequency characteristics of post-adolescent girls. Language and Speech 12(2):119–24.
Hollien, Harry, & Shipp, Thomas (1972). Speaking fundamental frequency and chronologic age in males. Journal of Speech, Language, and Hearing Research 15(1):155–59.
Hymes, Dell (1962). The ethnography of speaking. In Gladwin, Thomas & Sturtevant, William C. (eds.), Anthropology and human behavior, 13–53. Washington, DC: The Anthropological Society of Washington.
Ingrisano, Dennis; Weismer, Gary; & Schuckers, Gordon H. (1980). Sex identification of preschool children. Folia Phoniatrica et Logopaedica 32(1):61–69.
Kulick, Don (1999). Transgender and language: A review of the literature and suggestions for the future. GLQ: A Journal of Lesbian and Gay Studies 5(4):605–22.
Kulick, Don (2000). Gay and lesbian language. Annual Review of Anthropology 29:243–85.
Labov, William (1966/2006). The social stratification of English in New York City. Cambridge: Cambridge University Press.
Labov, William (1972). Sociolinguistic patterns. Philadelphia: University of Pennsylvania Press.
Lee, Sungbok; Potamianos, Alexandros; & Narayanan, Shrikanth (1999). Acoustics of children's speech: Developmental changes of temporal and spectral parameters. Journal of the Acoustical Society of America 105(3):1455–68.
Levi-Strauss, Claude (1966). The savage mind. Chicago: University of Chicago Press.
Levon, Erez (2007). Sexuality in context: Variation and the sociolinguistic perception of identity. Language in Society 36(4):533–54.
Levon, Erez (2014). Categories, stereotypes, and the linguistic perception of sexuality. Language in Society 43(5):539–66.
Linke, C. E. (1973). A study of pitch characteristics of female voices and their relationship to vocal effectiveness. Folia Phoniatrica et Logopaedica 25:173–85.
Loveday, Leo (1981). Pitch, politeness and sexual role: An exploratory investigation. Language and Speech 24(1):71–88.
Majewski, Wojciech; Hollien, Harry; & Zalewski, Janusz (1972). Speaking fundamental frequency of Polish adult males. Phonetica 25(2):119–25.
Massad, Joseph (2007). Desiring Arabs. Chicago: University of Chicago Press.
McNamara, Catherine (2007). Re-inhabiting an uninhabitable body: Interventions in voice production with transsexual men. Research in Drama Education 12(2):195–206.
Mendoza-Denton, Norma (2008). Homegirls: Language and cultural practice among Latina youth gangs. Malden, MA: Blackwell.
Mendoza-Denton, Norma (2011). The semiotic hitchhiker's guide to creaky voice: Circulation and gendered hardcore in a Chicana/o gang persona. Journal of Linguistic Anthropology 21(2):261–80.
Munson, Benjamin (2007). The acoustic correlates of perceived sexual orientation, perceived masculinity, and perceived femininity. Language and Speech 50(1):125–42.
Papp, Viktória (2011). Speaker gender: Physiology, performance and perception. Houston, TX: Rice University dissertation.
Pharao, Nicolai; Maegaard, Marie; Spindler Møller, Janus; & Kristiansen, Tore (2014). Indexical meanings of [s+] among Copenhagen youth: Social perception of a phonetic variant in different prosodic contexts. Language in Society 43(1):1–31.
Podesva, Robert J. (2007). Phonation type as a stylistic variable: The use of falsetto in constructing a persona. Journal of Sociolinguistics 11(4):478–504.
Podesva, Robert J. (2011). Salience and the social meaning of declarative contours: Three case studies of gay professionals. Journal of English Linguistics 39(3):233–64.
Podesva, Robert J.; Roberts, Sarah J.; & Campbell-Kibler, Kathryn (2002). Sharing resources and indexing meanings in the production of gay styles. In Campbell-Kibler, Kathryn, Podesva, Robert J., Roberts, Sarah J., & Wong, Andrew (eds.), Language and sexuality: Contesting meaning in theory and practice, 175–89. Stanford, CA: CSLI Publications.
Podesva, Robert J., & Van Hofwegen, Janneke (2016). /s/exuality in small-town California: Gender normativity and the acoustic realization of /s/. In Levon, Erez & Mendes, Ronald Beline (eds.), Language, sexuality, and power: Studies in intersectional sociolinguistics, 168–88. New York: Oxford University Press.
Puar, Jasbir K. (2007). Terrorist assemblages: Homonationalism in queer times. Durham, NC: Duke University Press.
Sachs, Jacqueline (1975). Cues to the identification of sex in children's speech. In Thorne, Barrie & Henley, Nancy (eds.), Language and sex: Difference and dominance, 152–71. Rowley, MA: Newbury House.
Schilling-Estes, Natalie (2002). Investigating stylistic variation. In Chambers, J. K., Trudgill, Peter, & Schilling-Estes, Natalie (eds.), The handbook of language variation and change, 375–401. Malden, MA: Blackwell.
Schwartz, Martin F. (1968). Identification of speaker sex from isolated voiceless fricatives. Journal of the Acoustical Society of America 43(5):1178–79.
Shadle, Christine H. (1990). Articulatory-acoustic relationships in fricative consonants. In Hardcastle, William J. & Marchal, Alain (eds.), Speech production and speech modelling, 187–209. Norwell, MA: Kluwer.
Shadle, Christine H. (1991). The effect of geometry on source mechanisms of fricative consonants. Journal of Phonetics 19(3–4):409–24.
Simpson, Adrian P. (2009). Phonetic differences between male and female speech. Language and Linguistics Compass 3(2):621–40.
Stoicheff, Margaret L. (1981). Speaking fundamental frequency characteristics of nonsmoking female adults. Journal of Speech and Hearing Research 24(3):437–41.
Stuart-Smith, Jane (2007). Empirical evidence for gendered speech production: /s/ in Glaswegian. In Cole, Jennifer, & Hualde, José Ignacio (eds.), Laboratory phonology 9, 65–86. New York: Mouton de Gruyter.
Szakay, Anita (2006). Rhythm and pitch as markers of ethnicity in New Zealand English. In Warren, Paul & Watson, Catherine I. (eds.), Proceedings of the 11th Australian International Conference on Speech Science & Technology, 421–26. Auckland: University of Auckland.
Titze, Ingo R. (1989). Physiologic and acoustic differences between male and female voices. Journal of the Acoustical Society of America 85(4):1699–1707.
Traunmüller, Hartmut, & Eriksson, Anders (1995). The frequency range of the voice fundamental in the speech of male and female adults. Stockholm: Stockholm University, ms. Online:; accessed August 17, 2010.
Valentine, David (2007). Imagining transgender: An ethnography of a category. Durham, NC: Duke University Press.
van Borsel, John; de Cuypere, Griet; Rubens, Robert; & Destaerke, B. (2000). Voice problems in female-to-male transsexuals. International Journal of Language & Communication Disorders 35(3):427–42.
Whiteside, Sandra P. (2001). Sex-specific fundamental and formant frequency patterns in a cross-sectional study. Journal of the Acoustical Society of America 110(1):464–78.
Whiteside, Sandra P., & Marshall, Jeni (2001). Developmental trends in voice onset time: Some evidence for sex differences. Phonetica 58(3):196–210.
Yuasa, Ikuko Patricia (2008). Culture and gender of voice pitch: A sociophonetic comparison of the Japanese and Americans. London: Equinox.
Zimman, Lal (2012). Voices in transition: Testosterone, transmasculinity, and the gendered voice among female-to-male transgender people. Boulder: University of Colorado, Boulder dissertation.
Zimman, Lal (2013). Hegemonic masculinity and the variability of gay-sounding speech: The perceived sexuality of transgender men. Journal of Language and Sexuality 2(1):5–43.
Zimman, Lal (2015). Transmasculinity and the voice: Gender assignment, identity, and presentation. In Milani, Tommaso (ed.), Language and masculinities: Performances, intersections, dislocations, 197–219. New York: Routledge.
Zimman, Lal (2016). Agency and the gendered voice: Metalinguistic rejections of vocal masculinization among female-to-male transgender speakers. In Babel, Anna (ed.), Awareness and control in sociolinguistic research, 253–77. Cambridge: Cambridge University Press.