A Theory of Linguistic Individuality for Authorship Analysis

Published online by Cambridge University Press: 12 May 2023

Andrea Nini
Affiliation:
University of Manchester

Summary

Authorship analysis is the process of determining who produced a questioned text by analysing its language. Although computational methods have become markedly more successful at this task in recent years, they are often not amenable to interpretation. Authorship analysis is, in effect, an area of computer science that draws very little on linguistics or cognitive science. This Element introduces a Theory of Linguistic Individuality that, starting from basic notions of cognitive linguistics, establishes a formal framework for the mathematical modelling of language processing. This formal framework is then applied in three computational experiments, which include the use of the likelihood ratio framework. The results suggest new avenues of research and a change of perspective in the way authorship analysis is currently carried out.
Type: Element
Information
Online ISBN: 9781108974851
Publisher: Cambridge University Press
Print publication: 15 June 2023


References

Anthonissen, L. and Petré, P. (2019) ‘Grammaticalization and the linguistic individual: New avenues in lifespan research’, Linguistics Vanguard, 5(s2), pp. 20180037. Available at: https://doi.org/10.1515/lingvan-2018-0037.CrossRefGoogle Scholar
Antonia, A., Craig, H., and Elliott, J. (2014) ‘Language chunking, data sparseness, and the value of a long marker list: Explorations with word n-grams and authorial attribution’, Literary and Linguistic Computing, 29(2), pp. 147–63. Available at: https://doi.org/10.1093/llc/fqt028.Google Scholar
Argamon, S. (2008) ‘Interpreting Burrows’s Delta: Geometric and probabilistic foundations’, Literary and Linguistic Computing, 23(2), pp. 131–47. Available at: https://doi.org/10.1093/llc/fqn003.Google Scholar
Argamon, S. E. (2018) ‘Computational forensic authorship analysis: Promises and pitfalls’, Language and Law / Linguagem e Direito, 5(2), pp. 737.Google Scholar
Barlow, M. (2013) ‘Individual differences and usage-based grammar’, International Journal of Corpus Linguistics, 18(4), pp. 443–78. Available at: https://doi.org/10.1075/ijcl.18.4.01bar.Google Scholar
Beckner, C., Ellis, N. C., Blythe, R., et al. (2009) ‘Language is a complex adaptive system: Position paper’, Language Learning, 59, pp. 126.Google Scholar
Biber, D. (1988) Variation across Speech and Writing. Cambridge: Cambridge University Press.Google Scholar
Biber, D. (2009) ‘A corpus-driven approach to formulaic language in English: Multi-word patterns in speech and writing’, International Journal of Corpus Linguistics, 14(3), pp. 275311.Google Scholar
Biber, D. and Conrad, S. (2009) Register, Genre, and Style. Cambridge: Cambridge University Press.Google Scholar
Bloch, B. (1948) ‘A set of postulates for phonemic analysis’, Language, 24(1), pp. 346.Google Scholar
Braun-Blanquet, J. (1932) Plant Sociology: The Study of Plant Communities. New York: McGraw-Hill.Google Scholar
Burrows, J. (2002) ‘“Delta”: A measure of stylistic difference and a guide to likely authorship’, Literary and Linguistic Computing, 17(3), p. 267.Google Scholar
Bybee, J. L. (2006) ‘From usage to grammar: The mind’s response to repetition’, Language, 82(4), pp. 711–33. Available at: https://doi.org/10.1353/lan.2006.0186.CrossRefGoogle Scholar
Bybee, J. (2010) Language, Usage and Cognition. Cambridge: Cambridge University Press.Google Scholar
Carne, M. and Ishihara, S. (2021) ‘Feature-based forensic text comparison using a Poisson model for likelihood ratio estimation’, in Proceedings of the 18th Workshop of the Australasian Language Technology Association. Australasian Language Technology Association, pp. 3242.Google Scholar
Chaski, C. E. (2001) ‘Empirical evaluations of language-based author identification techniques’, Forensic Linguistics, 8(1), pp. 165.Google Scholar
Christiansen, M. H. and Chater, N. (2016) ‘The Now-or-Never bottleneck: A fundamental constraint on language’, Behavioral and Brain Sciences, 39, p. e62. Available at: https://doi.org/10.1017/S0140525X1500031X.Google Scholar
Cohen, J. (1960) ‘A coefficient of agreement for nominal scales’, Educational and Psychological Measurements, 20, pp. 3746.Google Scholar
Cole, L. C. (1949) ‘The measurement of interspecific association’, Ecology, 30, pp. 411–24.Google Scholar
Consonni, V. and Todeschini, R. (2012) ‘New similarity coefficients for binary data’, MATCH Communications in Mathematical and in Computer Chemistry, 68, pp. 581−92.Google Scholar
Coulthard, M. (2004) ‘Author identification, idiolect, and linguistic uniqueness’, Applied Linguistics, 25, pp. 431–47.Google Scholar
Coulthard, M. (2013) ‘On admissible linguistic evidence’, Journal of Law and Policy, 21, pp. 441–66.Google Scholar
Coulthard, M., Johnson, A., and Wright, D. (2017) An Introduction to Forensic Linguistics. Abingdon: Routledge.Google Scholar
Cowan, N. (2001) ‘The magical number 4 in short-term memory: A reconsideration of mental storage capacity’, Behavioral and Brain Sciences, 24(1), pp. 87114. Available at: https://doi.org/10.1017/S0140525X01003922.CrossRefGoogle ScholarPubMed
Croft, W. (2001) Radical Construction Grammar: Syntactic Theory in Typological Perspective. Oxford: Oxford University Press. Available at: https://doi.org/10.1093/acprof:oso/9780198299554.001.0001.Google Scholar
Dąbrowska, E. (2012) ‘Different speakers, different grammars’, Linguistic Approaches to Bilingualism, 2(3), pp. 219–53. Available at: https://doi.org/10.1075/lab.2.3.01dab.CrossRefGoogle Scholar
Dąbrowska, E. (2015) ‘Individual differences in grammatical knowledge’, in Dąbrowska, E. and Divjak, D. (eds.) Handbook of Cognitive Linguistics. Berlin: De Gruyter, pp. 650–67.CrossRefGoogle Scholar
Dąbrowska, E. (2018) ‘Experience, aptitude and individual differences in native language ultimate attainment’, Cognition, 178 (May), pp. 222–35. Available at: https://doi.org/10.1016/j.cognition.2018.05.018.Google Scholar
Dąbrowska, E. (2020) ‘Language as a phenomenon of the third kind’, Cognitive Linguistics, 31(2), pp. 213–29. Available at: https://doi.org/10.1515/cog-2019-0029.CrossRefGoogle Scholar
Daelemans, W. (2013) ‘Explanation in computational stylometry’, Computational Linguistics and Intelligent Text Processing, 7817(2), pp. 451–62.Google Scholar
Dasgupta, I. and Gershman, S. J. (2021) ‘Memory as a computational resource’, Trends in Cognitive Sciences, 25(3), pp. 240–51. Available at: https://doi.org/10.1016/j.tics.2020.12.008.CrossRefGoogle ScholarPubMed
Diessel, H. (2019) The Grammar Network: How Linguistic Structure is Shaped by Language Use. Cambridge: Cambridge University Press.Google Scholar
Divjak, D. (2019) Frequency in Language: Memory, Attention and Learning. Cambridge: Cambridge University Press.CrossRefGoogle Scholar
Driver, H. E. and Kroeber, A. L. (1932) ‘Quantitative expression of cultural relationship’, University of California Publications in American Archaeology and Ethnology, 31, pp. 211–56.Google Scholar
Dugar, T. K., Gowtham, S., and Chakraborty, U. Kr. (2022) ‘Comparing word embeddings on authorship identification’, in Borah, S. and Panigrahi, R. (eds.) Applied Soft Computing: Techniques and Applications. Boca Raton, FL: CRC Press, pp. 177–94.Google Scholar
Dunn, J. (2017) ‘Computational learning of construction grammars’, Language and Cognition, 9(2), pp. 254–92. Available at: https://doi.org/10.1017/langcog.2016.7.CrossRefGoogle Scholar
Dunn, J. and Nini, A. (2021) ‘Production vs perception: The role of individuality in usage-based grammar induction’, in Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics (online), Association for Computational Linguistics. Available at: https://aclanthology.org/2021.cmcl-1.19/, pp. 149–59.Google Scholar
Eder, M. (2015) ‘Does size matter? Authorship attribution, small samples, big problem’, Digital Scholarship in the Humanities, 30(2), pp. 167–82. Available at: https://doi.org/10.1093/llc/fqt066.Google Scholar
Ellis, N. C. (2002) ‘Frequency effects in language processing’, Studies in Second Language Acquisition, 24(02), pp. 143–88. Available at: https://doi.org/10.1017/S0272263102002024.Google Scholar
Ellis, N. C., Römer, U., and O’Donnell, M. B. (2016) Usage-Based Approaches to Language Acquisition and Processing: Cognitive and Corpus Investigations of Construction Grammar. Malden, MA: Wiley-Blackwell.Google Scholar
Erman, B. and Warren, B. (2000) ‘The idiom principle and the open choice principle’, Text, 20(1), pp. 2962. Available at: https://doi.org/10.1515/text.1.2000.20.1.29.Google Scholar
Evert, S., Proisl, T., Vitt, T., et al. (2015) ‘Towards a better understanding of Burrows’s Delta in literary authorship attribution’, in Feldman, A., Kazantseva, A., Szpakowicz, S., et al. (eds.) Proceedings of the Fourth Workshop on Computational Linguistics for Literature. Denver, CO: Association for Computational Linguistics, pp. 7988.Google Scholar
Evert, S., Proisl, T., Jannidis, F., et al. (2017) ‘Understanding and explaining Delta measures for authorship attribution’, Digital Scholarship in the Humanities, 32, pp. ii4ii16. Available at: https://doi.org/10.1093/llc/fqx023.Google Scholar
Fedorenko, E. (2021) ‘The human language system in the mind and brain’, in 5th Usage-Based Linguistics Conference (online), Tel Aviv University. Available at: https://youtu.be/edlY4GbH1tU .Google Scholar
Fonteyn, L. (2021) ‘Constructional change across the lifespan of 20 early modern gentlemen’, in 11th International Conference on Construction Grammar (ICCG11). Antwerp: University of Antwerp. Available at: https://doi.org/10.5281/zenodo.5220179.Google Scholar
Fonteyn, L. and Nini, A. (2020) ‘Individuality in syntactic variation: An investigation of the seventeenth-century gerund alternation’, Cognitive Linguistics, 31(2), pp. 279308. Available at: https://doi.org/10.1515/COG-2019-0040.Google Scholar
Galbraith, D. (2009) ‘Cognitive models of writing’, German as a Foreign Language, 2–3, pp. 722.Google Scholar
Gerlach, M. and Altmann, E. G. (2013) ‘Stochastic model for the vocabulary growth in natural languages’, Physical Review X, 3(2), p. 021006. Available at: https://doi.org/10.1103/PhysRevX.3.021006.Google Scholar
Gobet, F., Lane, P. C. R., Croker, S., et al. (2001) ‘Chunking mechanisms in human learning’, Trends in Cognitive Sciences, 5(6), pp. 236–43. Available at: https://doi.org/10.1016/S1364-6613(00)01662-4.Google Scholar
Goldberg, A. E. (1995) Constructions: A Construction Grammar Approach to Argument Structure. Chicago, IL: University of Chicago Press.Google Scholar
Goldberg, A. E. (2003) ‘Constructions: A new theoretical approach to language’, Trends in Cognitive Science, 7(5), pp. 219–24.CrossRefGoogle ScholarPubMed
Goldberg, A. E. (2006) Constructions at Work: The Nature of Generalization in Language. Oxford: Oxford University Press.Google Scholar
Goldberg, A. E. (2019) Explain Me This: Creativity, Competition, and the Partial Productivity of Constructions. Princeton, NJ: Princeton University Press.Google Scholar
Goodman, L. A. and Kruskal, W. H. (1954) ‘Measures of association for cross classifications’, Journal of the American Statistical Association, 49, pp. 732–64.Google Scholar
Grant, T. (2007) ‘Quantifying evidence in forensic authorship analysis’, International Journal of Speech Language and the Law, 14(1), pp. 125. Available at: https://doi.org/10.1558/ijsll.v14i1.1.Google Scholar
Grant, T. (2010) ‘Txt 4n6: Idiolect free authorship analysis’, in Coulthard, M. (ed.) Routledge Handbook of Forensic Linguistics. London: Routledge, pp. 508–23.Google Scholar
Grant, T. (2022) The Idea of Progress in Forensic Authorship Analysis. Elements in Forensic Linguistics. Cambridge: Cambridge University Press. Available at: www.cambridge.org/core/elements/idea-of-progress-in-forensic-authorship-analysis/6A4F7668B4831CCD7DBF74DECA3EBA06.Google Scholar
Grant, T. and MacLeod, N. (2018) ‘Resources and constraints in linguistic identity performance: A theory of authorship’, Language and Law / Linguagem e Direito, 5(1), pp. 8096.Google Scholar
Grant, T. and MacLeod, N. (2020) Language and Online Identities: The Undercover Policing of Internet Sexual Crime. Cambridge: Cambridge University Press.CrossRefGoogle Scholar
Gries, S. T. (2013) ‘50-something years of work on collocations: What is or should be next’, International Journal of Corpus Linguistics, 18(1), pp. 137–66. Available at: https://doi.org/10.1075/ijcl.18.1.09gri.Google Scholar
Grieve, J. (2007) ‘Quantitative authorship attribution: An evaluation of techniques’, Literary and Linguistic Computing, 22(3), pp. 251–70.CrossRefGoogle Scholar
Grieve, J., Clarke, I., Chiang, E., et al. (2019) ‘Attributing the Bixby Letter using n-gram tracing’, Digital Scholarship in the Humanities, 34(3), pp. 493512.Google Scholar
Halliday, M. A. K. and Matthiessen, C. M. I. M. (2004) An Introduction to Functional Grammar. London: Arnold.Google Scholar
Halvani, O., Graner, L., and Regev, R. (2020) ‘Cross-domain authorship verification based on topic agnostic features’, in L. Cappellato, C. Eickhoff, N. Ferro, and A. Névéol (eds.) Working Notes of CLEF 2020: Conference and Labs of the Evaluation Forum. Available at: https://ceur-ws.org/Vol-2696/.Google Scholar
Hasan, R. (1996) ‘Ways of saying: ways of meaning’, in Cloran, C., Butt, D., and Williams, G. (eds.) Ways of Saying, Ways of Meaning: Selected Papers of Ruqaiya Hasan. London: Cassell, pp. 191242.Google Scholar
Hasan, R. (2009a) ‘On semantic variation’, in Webster, J. (ed.) The Collected Works of Ruqaiya Hasan Vol. 2: Semantic Variation: Meaning in Society and in Sociolinguistics. London: Equinox, pp. 4172.Google Scholar
Hasan, R. (2009b) ‘Wanted: A theory for integrated sociolinguistics’, in Webster, J. (ed.) The Collected Works of Ruqaiya Hasan Vol. 2: Semantic Variation: Meaning in Society and in Sociolinguistics. London: Equinox, pp. 540.Google Scholar
Hasson, U., Chen, J., and Honey, C. J. (2015) ‘Hierarchical process memory: Memory as an integral component of information processing’, Trends in Cognitive Sciences, 19(6), pp. 304313. Available at: https://doi.org/10.1016/j.tics.2015.04.006.Google Scholar
Hawkins, R. P. and Dotson, V. A. (1968) ‘Reliability scores that delude: An Alice in Wonderland trip through the misleading characteristics of interobserver agreement scores in interval coding’, in Ramp, E. and Semb, G. (eds.) Behavior Analysis: Areas of Research and Application. Englewood Cliffs, NJ: Prentice Hall.Google Scholar
Hayek, L.-A. C. (1994) ‘Analysis of amphibian biodiversity data’, in Heyer, R. W. et al. (eds.) Measuring and Monitoring Biological Diversity: Standard Methods for Amphibians. Washington, DC: Smithsonian Books, pp. 207–70.Google Scholar
Heaps, H. S. (1978) Information Retrieval: Computational and Theoretical Aspects. Library and Information Science Series. New York: Academic Press.Google Scholar
Herdan, G. (1960) Type-Token Mathematics. Janua linguarum, Series maior, 4. ’s-Gravenhage: Mouton.Google Scholar
Hilpert, M. (2014) Construction Grammar and Its Application to English. Edinburgh: Edinburgh University Press.Google Scholar
Hoey, M. (2005) Lexical Priming: A New Theory of Words and Language. London: Routledge.Google Scholar
Hoover, D. L. (2004) ‘Testing Burrows’s Delta’, Literary and Linguistic Computing, 19(4), pp. 453–75.Google Scholar
Houvardas, J. and Stamatatos, E. (2006) ‘N-gram feature selection for authorship identification’, in Euzenat, J. and Domingue, J. (eds.) Artificial Intelligence: Methodology, Systems, and Applications. AIMSA 2006, Bulgaria. Berlin: Springer, pp. 7786. Available at: https://doi.org/10.1007/11861461_10.Google Scholar
Hudson, R. (2010) An Introduction to Word Grammar. Cambridge: Cambridge University Press.CrossRefGoogle Scholar
Hudson, R. A. (1996) Sociolinguistics. 2nd ed. Cambridge Textbooks in Linguistics. Cambridge: Cambridge University Press.Google Scholar
Hunston, S. and Francis, G. (2000) Pattern Grammar: A Corpus-Driven Approach to the Lexical Grammar of English. Edited by Francis, G.. Studies in Corpus Linguistics, 4. Amsterdam: John Benjamins.Google Scholar
Ishihara, S. (2021a) ‘Score-based likelihood ratios for linguistic text evidence with a bag-of-words model’, Forensic Science International, 327, p. 110980. Available at: https://doi.org/10.1016/j.forsciint.2021.110980.Google Scholar
Ishihara, S. (2021b) ‘The influence of background data size on the performance of a score-based likelihood ratio system: A case of forensic text comparison’, in Proceedings of the 18th Workshop of the Australasian Language Technology Association. ALTA, pp. 2131. Available at: https://aclanthology.org/volumes/2020.alta-1/.Google Scholar
Jaccard, P. (1912) ‘The distribution of the flora in the alpine zone’, New Phytologist, 11(2), pp. 3750. Available at: https://doi.org/10.1111/j.1469-8137.1912.tb05611.x.Google Scholar
Jafariakinabad, F. and Hua, K. A. (2021) ‘Unifying lexical, syntactic, and structural representations of written language for authorship attribution’, SN Computer Science, 2(481), pp. 114. Available at: https://doi.org/10.1007/s42979-021-00911-2.Google Scholar
Jain, A. K., Ross, A., and Prabhakar, S. (2004) ‘An introduction to biometric recognition’, IEEE Transactions on Circuits and Systems for Video Technology, 14(1), pp. 420. Available at: https://doi.org/10.1109/TCSVT.2003.818349.Google Scholar
Jannidis, F., Pielström, S., Schöch, C., and Vitt., T. (2015) ‘Improving Burrows’ Delta: An empirical evaluation of text distance measures’, in Digital Humanities Conference 2015. Sydney, Australia: Alliance of Digital Humanities Organizations.Google Scholar
Johnson, A. and Wright, D. (2014) ‘Identifying idiolect in forensic authorship attribution: An n-gram textbite approach’, Language and Law/Linguagem e Direito, 1(1), pp. 3769.Google Scholar
Johnstone, B. (1996) The Linguistic Individual: Self-Expression in Language and Linguistics. Oxford: Oxford University Press.Google Scholar
Juola, P. (2008) ‘Authorship attribution’, Foundations and Trends® in Information Retrieval, 1(3), pp. 233334. Available at: https://doi.org/10.1561/1500000005.Google Scholar
Juola, P. (2012) ‘Large-scale experiments in authorship attribution’, English Studies, 93(3), pp. 275–83.Google Scholar
Jurafsky, D. and Martin, J. H. (2009) Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Upper Saddle River, NJ: Pearson/Prentice Hall.Google Scholar
Keller, R. (1994) On Language Change: The Invisible Hand in Language. London: Taylor & Francis.Google Scholar
Kestemont, M. (2014) ‘Function words in authorship attribution from black magic to theory?’, in Proceedings of the 3rd Workshop on Computational Linguistics for Literature (CLfL) @ EACL 2014. Gothenburg, Sweden: Association for Computational Linguistics, pp. 5966.Google Scholar
Kestemont, M., Stover, J., Koppel, M., Karsdorp, F., and Daelemans, W. (2016) ‘Authenticating the writings of Julius Caesar’, Expert Systems With Applications, 63, pp. 8696. Available at: https://doi.org/10.1016/j.eswa.2016.06.029.Google Scholar
Kestemont, M., Manjavacas, E., Markov, I., et al. (2020) Overview of the Cross-Domain Authorship Verification Task at PAN 2020, Available at: https://pan.webis.de/downloads/publications/papers/kestemont_2020.pdf.Google Scholar
Kidd, E., Donnelly, S., and Christiansen, M. H. (2018) ‘Individual differences in language acquisition and processing’, Trends in Cognitive Sciences, 22(2), pp. 154–69. Available at: https://doi.org/10.1016/j.tics.2017.11.006.Google Scholar
Kidd, E., Bidgood, A., Donnelly, S., Durrant, S., Peter, M. S., and Rowland, C. F. (2020) ‘Individual differences in first language acquisition and their theoretical implications’, in Rowland, C. F., Theakston, A., Ambridge, B., and Twomey, K. (eds.) Current Perspectives on Child Language Acquisition: How children use their environment to learn. Amsterdam: John Benjamins, pp. 189219. Available at: https://doi.org/10.1075/tilar.27.09kid.Google Scholar
Koppel, M. and Schler, J. (2004) ‘Authorship verification as a one-class classification problem’, in Proceedings of the 21th International Conference on Machine Learning. Banff, Alberta, Canada: ACM, pp. 62–7.Google Scholar
Koppel, M. and Winter, Y. (2014) ‘Determining if two documents are written by the same author’, Journal of the Association for Information Science and Technology, 65(1), pp. 178–87.Google Scholar
Koppel, M., Schler, J., and Argamon, S. (2009) ‘Computational methods in authorship attribution’, Journal of the American Society for Information Science and Technology, 60(1), pp. 926.Google Scholar
Koppel, M., Schler, J., and Argamon, S. (2011) ‘Authorship attribution in the wild’, Language Resources and Evaluation, 45(1), pp. 8394. Available at: https://doi.org/10.1007/s10579-009-9111-2.Google Scholar
Koppel, M., Schler, J., and Argamon, S. (2013) ‘Authorship attribution: What’s easy and what’s hard?’, Journal of Law and Policy, 21, pp. 317–31.Google Scholar
Kulczynski, S. (1927) ‘Die Pflanzenassociationen der Pienenen’, Bulletin International de l’Academie Polonaise des Sciences et des Lettres. Classe des Sciences Mathematiques et Naturelles. Serie B. Sciences Naturelles, Suppl. II(2), pp. 57203.Google Scholar
Lakoff, G. (1990) ‘The Invariance Hypothesis: Is abstract reason based on image-schemas?’, Cognitive Linguistics, 1(1), pp. 3974. Available at: https://doi.org/10.1515/cogl.1990.1.1.39.Google Scholar
Lancashire, I. (1997) ‘Empirically determining Shakespeare’s idiolect’, Shakespeare Studies, 25, pp. 171–85.Google Scholar
Lancashire, I. (2010) Forgetful Muses: Reading the Author in the Text. Toronto: University of Toronto Press.Google Scholar
Langacker, R. W. (1987) Foundations of Cognitive Grammar. Stanford, CA: Stanford University Press.Google Scholar
Leeuwen, D. A. van (2015) ROC: Compute Structures to Compute ROC and DET Plots and Metrics for 2-Class Classifiers. R package. Available at: https://rdrr.io/github/davidavdav/ROC/.Google Scholar
Lewis, D. D., Yang, Y., Rose, T. G., and Li, F. (2004) ‘RCV1: A new benchmark collection for text categorization research’, Journal of Machine Learning Research, 5, pp. 361–97.Google Scholar
López-Monroy, A. P., Montes-y-Gómez, M., Villaseñor-Pineda, L., Carrasco-Ochoa, J. A., and Martínez-Trinidad, J. F. (2012) ‘A new document author representation for authorship attribution’, in Mexican Conference on Pattern Recognition. Berlin: Springer, pp. 283–92. Available at: https://doi.org/10.1007/978-3-642-31149-9_29.Google Scholar
Mccauley, S. M. and Christiansen, M. H. (2015) ‘Individual differences in chunking ability predict on-line sentence processing’, in Noelle, D. C., Dale, R., Warlaumont, A., et al. (eds.), Proceedings of the 37th Annual Conference of the Cognitive Science Society. Pasadena, CA: Cognitive Science Society, pp. 1553–8.Google Scholar
McMenamin, G. R. (2002) Forensic Linguistics: Advances in Forensic Stylistics. Boca Raton, FL: CRC Press.Google Scholar
Mikros, G. K. and Argiri, E. K. (2007) ‘Investigating topic influence in authorship attribution’, in Stein, B., Koppel, M., and Stamatatos, E. (eds.), Proceedings of the SIGIR 2007 International Workshop on Plagiarism Analysis, Authorship Identification, and Near-Duplicate Detection, vol. 276. Amsterdam: CEUR-WS.org. Available at: http://ceur-ws.org/Vol-276.Google Scholar
Miller, G. A. (1956) ‘The magical number seven, plus or minus two: Some limits on our capacity for processing information’, Psychological Review, 63(2), pp. 8197.Google Scholar
Mollin, S. (2009) ‘“I entirely understand” is a Blairism: The methodology of identifying idiolectal collocations’, International Journal of Corpus Linguistics, 14(3), pp. 367–92. Available at: https://doi.org/10.1075/ijcl.14.3.04mol.Google Scholar
Mosteller, F. and Wallace, D. L. (1963) ‘Inference in an authorship problem’, Journal of the American Statistical Association, pp. 275309. Available at: https://doi.org/10.2307/2283270.Google Scholar
Mountford, M. D. (1962) ‘An index of similarity and its applications to classificatory problems’, in Murphy, P.W. (ed.) Progress in Soil Zoology. London: Butterworths, pp. 4350.Google Scholar
Murauer, B. and Specht, G. (2021) ‘Developing a benchmark for reducing data bias in authorship attribution’, in Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems (Eval4NLP 2021). Association for Computational Linguistics, pp. 179–88. Available at: https://aclanthology.org/2021.eval4nlp-1.18.pdf.Google Scholar
Narayanan, A., Paskov, H., Gong, N. Z., et al. (2012) ‘On the feasibility of internet-scale author identification’, in Security and Privacy (SP), 2012 IEEE Symposium on. IEEE, pp. 300–14. Available at: https://ieeexplore.ieee.org/document/6234420.Google Scholar
Nini, A. (2018) ‘An authorship analysis of the Jack the Ripper letters’, Digital Scholarship in the Humanities, 33(3), pp. 621–36.Google Scholar
Nini, A. and Grant, T. (2013) ‘Bridging the gap between stylistic and cognitive approaches to authorship analysis using Systemic Functional Linguistics and multidimensional analysis’, International Journal of Speech Language and the Law, 20(2), pp. 173202.CrossRefGoogle Scholar
Nini, A., Cameron, M., and Murphy, C. (2021) ‘Experimental evidence on the individuality of lexicogrammar’, in International Construction Grammar Conference 11 (ICCG11). Antwerp: University of Antwerp. Available at: https://doi.org/10.5281/zenodo.5227222.Google Scholar
Oakes, M. P. (2014) Literary Detective Work on the Computer. Amsterdam: John Benjamins.CrossRefGoogle Scholar
Ochiai, A. (1957) ‘Zoogeographic studies on the soleoid fishes found in Japan and its neighboring regions’, Bulletin of the Japanese Society of Fisheries Science, 22, pp. 526–30.Google Scholar
Pearson, K. and Heron, D. (1913) ‘On theories of association’, Biometrika, 9, pp. 159315.Google Scholar
Petré, P. and Van de Velde, F. (2018) ‘The real-time dynamics of the individual and the community in grammaticalization’, Language, 94(4), pp. 867901.Google Scholar
Pinker, S. (1994) Language Instinct. New York: William Morrow.Google Scholar
Plakias, S. and Stamatatos, E. (2008) ‘Tensor space models for authorship identification’, in Darzentas, J., Vouros, G. A., Vosinakis, S., and Arnellos, A. (eds.) Proceedings of the 5th Hellenic Conference on Artificial Intelligence (SETN’08). Syros, Greece: LNCS, pp. 239–49.Google Scholar
Pokhriyal, N., Tayal, K., Nwogu, I., and Govindaraju, V. (2017) ‘Cognitive-biometric recognition from language usage: A feasibility study’, IEEE Transactions on Information Forensics and Security, 12(1), pp. 134–43. Available at: https://doi.org/10.1109/TIFS.2016.2604213.Google Scholar
Proisl, T., Evert, S., Jannidis, F., Schöch, C., Konle, L., and Pielström, S. (2018) ‘Delta vs. n-gram tracing: Evaluating the robustness of authorship attribution methods’, in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018). Miyazaki, Japan: European Language Resources Association (ELRA), pp. 3309–14.Google Scholar
Renouf, A. and Sinclair, J. (1991) ‘Collocational frameworks in English’, in Aijmer, K. and Altenherg, B. (eds.) English Corpus Linguistics: Studies in Honour of Jan Svartvik. London: Longman, pp. 128–43.Google Scholar
Rogot, E. and Goldberg, I. D. (1966) ‘A proposed index for measuring agreement in test-retest studies’, Journal of Chronic Disease, 19, pp. 9911006.Google Scholar
Russell, P. F. and Rao, T. R. (1940) ‘On habitat and association of species of Anopheline larvae in South Eastern Madras’, Journal of the Malaria Institute of India, 3, pp. 153–78.Google Scholar
Sapkota, U., Bethard, S., Montes-y-Gómez, M., and Solorio, T. (2015) ‘Not all character n-grams are created equal: A study in authorship attribution’, in Human Language Technologies: The 2015 Annual Conference of the North American Chapter of the ACL. Denver, CO: ACL, pp. 93102.Google Scholar
Sari, Y., Vlachos, A., and Stevenson, M. (2017) ‘Continuous N-gram representations for authorship attribution’, in Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, pp. 267–73. Available at: https://doi.org/10.18653/v1/e17-2043.Google Scholar
Schmid, H.-J. (2015) ‘A blueprint of the Entrenchment-and-Conventionalization Model’, Yearbook of the German Cognitive Linguistics Association, 3(1), pp. 325. Available at: https://doi.org/10.1515/gcla-2015-0002.Google Scholar
Schmid, H.-J. and Mantlik, A. (2015) ‘Entrenchment in historical corpora? Reconstructing dead authors’ minds from their usage profiles’, Anglia, 133 (4), pp. 583623. Available at: https://doi.org/10.1515/ang-2015-0056.CrossRefGoogle Scholar
Schmid, H.-J., Würschinger, Q., Fischer, S., and Küchenhoff, H. (2021) ‘That’s cool: Computational sociolinguistic methods for investigating individual lexico-grammatical variation’, Frontiers in Artificial Intelligence, 3, p. 89. Available at: https://doi.org/10.3389/frai.2020.547531.Google Scholar
Schmitt, N. (2004) Formulaic Sequences: Acquisition, Processing, and Use. Amsterdam : John Benjamins.Google Scholar
Seidman, S. (2013) ‘Authorship verification using the impostors method’, in Forner, P., Navigli, R., Tufis, D., and Ferro, N. (eds.) CLEF 2013 Evaluation Labs and Workshop – Working Notes Papers. Valencia, Spain, pp. 23–6. Available at: https://ceur-ws.org/Vol-1179/.Google Scholar
Shannon, C. E. (1948) ‘A mathematical theory of communication’, Bell System Technical Journal, 27, pp. 379423 & 623–56.Google Scholar
Simpson, G. G. (1943) ‘Mammals and the nature of continents’, Amercian Journal of Science, 241, pp. 131.CrossRefGoogle Scholar
Sinclair, J. (1991) Corpus, Concordance, Collocation. Oxford: Oxford University Press.Google Scholar
Smet, H. De (2016) ‘The root of ruthless: Individual variation as a window on mental representation’, International Journal of Corpus Linguistics, 21(2), pp. 250–71. Available at: https://doi.org/10.1075/ijcl.21.2.05des.Google Scholar
Smith, P. W. H. and Aldridge, W. (2011) ‘Improving authorship attribution: Optimizing Burrows’ Delta method*’, Journal of Quantitative Linguistics, 18 (1), pp. 6388. Available at: https://doi.org/10.1080/09296174.2011.533591.Google Scholar
Sokal, R. R. and Michener, C. D. (1958) ‘A statistical method for evaluating systematic relationships’, University of Kansas Science Bulletin, 38, pp. 1409–38.
Sokal, R. R. and Sneath, P. H. A. (1963) Principles of Numerical Taxonomy. San Francisco, CA: W.H. Freeman.
Solan, L. M. and Tiersma, P. M. (2005) Speaking of Crime: The Language of Criminal Justice. Chicago, IL: University of Chicago Press.
Sorgenfrei, T. (1958) ‘Molluscan assemblages from the marine middle Miocene of South Jutland and their environments’, Danmark Geologiske Undersøgelse. Serie 2, 79, pp. 403–8.
Stamatatos, E. (2009) ‘A survey of modern authorship attribution methods’, Journal of the American Society for Information Science and Technology, 60(3), pp. 538–56. Available at: https://doi.org/10.1002/asi.21001.
Stamatatos, E. (2013) ‘On the robustness of authorship attribution based on character n-gram features’, Journal of Law and Policy, 21(2), pp. 421–39.
Svartvik, J. (1968) The Evans Statements: A Case for Forensic Linguistics. Gothenburg: University of Gothenburg Press.
Todeschini, R., Consonni, V., Xiang, H., Holliday, J., Buscema, M., and Willett, P. (2012) ‘Similarity coefficients for binary chemoinformatics data: Overview and extended comparison using simulated and real data sets’, Journal of Chemical Information and Modeling, 52(11), pp. 2884–901. Available at: https://doi.org/10.1021/ci300261r.
Turell, M. T. (2010) ‘The use of textual, grammatical and sociolinguistic evidence in forensic text comparison’, International Journal of Speech Language and the Law, 17(2), pp. 211–50.
Turell, M. T. and Gavaldà, N. (2013) ‘Towards an index of idiolectal similitude (or distance) in forensic authorship analysis’, Journal of Law and Policy, 21, pp. 495–514.
Ullman, M. T. (2004) ‘Contributions of memory circuits to language: the declarative/procedural model’, Cognition, 92(1–2), pp. 231–70. Available at: https://doi.org/10.1016/j.cognition.2003.10.008.
Ullman, M. T. (2013) ‘The role of declarative and procedural memory in disorders of language’, Linguistic Variation, 13(2), pp. 133–54. Available at: https://doi.org/10.1075/lv.13.2.01ull.
Vetchinnikova, S. (2017) ‘On the relationship between the cognitive and the communal: A complex systems perspective’, in Filppula, M., Klemola, J., Mauranen, A., and Vetchinnikova, S. (eds.) Changing English. Berlin: De Gruyter, pp. 277–310. Available at: https://doi.org/10.1515/9783110429657-015.
Warrens, M. J. (2008) ‘Similarity coefficients for binary data’. Unpublished thesis, Leiden University.
Wible, D. and Tsao, N.-L. (2010) ‘StringNet as a computational resource for discovering and investigating linguistic constructions’, in Proceedings of the NAACL HLT Workshop on Extracting and Using Constructions in Computational Linguistics. Los Angeles, California, USA, pp. 25–31. Available at: https://aclanthology.org/W10-0804/.
Wray, A. (2008) Formulaic Language: Pushing the Boundaries. Oxford: Oxford University Press.
Wright, D. (2013) ‘Stylistic variation within genre conventions in the Enron email corpus: Developing a text-sensitive methodology for authorship research’, International Journal of Speech Language and the Law, 20(1), pp. 45–75.
Wright, D. (2017) ‘Using word n-grams to identify authors and idiolects: A corpus approach to a forensic linguistic problem’, International Journal of Corpus Linguistics, 22(2), pp. 212–41. Available at: https://doi.org/10.1075/ijcl.22.2.03wri.
Yule, G. U. (1900) ‘On the association of attributes in statistics’, Philosophical Transactions of the Royal Society, 75, pp. 257–319.
Yule, G. U. (1912) ‘On the methods of measuring association between two attributes’, Journal of the Royal Statistical Society, 75, pp. 579–642.