
Integrating LSA-based hierarchical conceptual space and machine learning methods for leveling the readability of domain-specific texts

Hou-Chiang Tseng, Berlin Chen, Tao-Hsing Chang and Yao-Ting Sung


Text readability assessment is a challenging interdisciplinary endeavor with rich practical implications. It has long drawn the attention of researchers internationally, and the readability models developed since have been widely applied in various fields. Previous readability models, however, have relied only on linguistic features designed for general text analysis and have not been sufficiently accurate when used to gauge domain-specific texts. In view of this, this study proposes a hierarchical conceptual space constructed with latent semantic analysis (LSA) that can be used to train a readability model to accurately assess domain-specific texts. Compared with a baseline using a traditional model, the new model improves accuracy by 13.88% to reach 68.98% when leveling social science texts, and by 24.61% to reach 73.96% when leveling natural science texts. When the readability features developed in this study are combined with general linguistic features, the accuracy of leveling social science texts improves even further, by 31.58% to reach 86.68%, and that of natural science texts by 26.56% to reach 75.91%. These results indicate that the readability features developed in this study can be used both on their own to train a readability model for leveling domain-specific texts and in combination with more common linguistic features to enhance the model's efficacy. Future research can extend the generalizability of the model by applying the proposed method to texts from different fields and grade levels, thus broadening its practical applications.
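The abstract gives the accuracy figures as an improvement plus a resulting accuracy but does not state the baseline itself. The short sketch below recovers it under one assumption that the abstract does not spell out: that each "improves by X%" is an absolute difference in percentage points rather than a relative gain. The function name `implied_baseline` and the dictionary layout are illustrative only.

```python
# Figures as reported in the abstract: (improvement, resulting accuracy), in percent.
reported = {
    "social science": {"lsa_only": (13.88, 68.98), "combined": (31.58, 86.68)},
    "natural science": {"lsa_only": (24.61, 73.96), "combined": (26.56, 75.91)},
}


def implied_baseline(improvement, accuracy):
    """Baseline accuracy implied by a reported improvement.

    Assumption (not stated in the abstract): the improvement is an
    absolute percentage-point difference, so baseline = accuracy - improvement.
    """
    return round(accuracy - improvement, 2)


# Compute the implied baseline for every domain and feature set.
baselines = {
    domain: {name: implied_baseline(*pair) for name, pair in models.items()}
    for domain, models in reported.items()
}

for domain, values in baselines.items():
    print(domain, values)
```

Under this reading, the LSA-only and combined-feature results imply the same baseline within each domain (55.10% for social science texts and 49.35% for natural science texts), which suggests that both comparisons are made against a single traditional model per domain.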







