
Computational Construction Grammar

A Usage-Based Approach

Published online by Cambridge University Press: 08 May 2024

Jonathan Dunn
Affiliation: University of Illinois, Urbana-Champaign

Summary

This Element introduces a usage-based computational approach to Construction Grammar that draws on techniques from natural language processing and unsupervised machine learning. This work explores how to represent constructions, how to learn constructions from a corpus, and how to arrange the constructions in a grammar as a network. From a theoretical perspective, this Element examines how construction grammars emerge from usage alone as complex systems, with slot-constraints learned at the same time that constructions are learned. From a practical perspective, this work is accompanied by a Python package that enables linguists to incorporate construction grammars into their own corpus-based work. The computational experiments in this Element are important for testing the learnability, variability, and confirmability of Construction Grammar as a theory of language. All code examples use the cloud computing platform Code Ocean to guide readers through the implementation of these algorithms.
Type: Element
Online ISBN: 9781009233743
Publisher: Cambridge University Press
Print publication: 06 June 2024

