
Going beyond F0: The acquisition of Mandarin tones

Published online by Cambridge University Press:  12 May 2020

Nari RHEE*
University of Pennsylvania, USA
Utrecht University, Netherlands
Jianjing KUANG*
University of Pennsylvania, USA
*Corresponding authors: Department of Linguistics, 3401-C Walnut Street, Suite 300, C-Wing, University of Pennsylvania, Philadelphia, PA 19104-6228, USA.


Using a semi-spontaneous speech corpus, we present evidence from computational modelling of tonal productions from Mandarin-speaking children (4 to 11 years old) and adults, showing that children exceed the adult-level tonal distinction at the age of 7 to 8 years using F0 cues, but do not reach the high adult-level distinction using spectral cues even at the age of 10 to 11 years. The difference in the developmental curves of F0 and spectral cues suggests that, in Mandarin tone production, secondary cues continue to develop even after the mastery of primary cues.

Brief Research Report
Copyright © The Author(s), 2020. Published by Cambridge University Press



Supplementary material: PDF

Rhee et al. supplementary material
