Efficient parallel and incremental parsing of practical context-free languages

Published online by Cambridge University Press:  23 July 2015

JEAN-PHILIPPE BERNARDY
Affiliation: Chalmers University of Technology & University of Gothenburg, Sweden (e-mail: bernardy@chalmers.se)
KOEN CLAESSEN
Affiliation: Chalmers University of Technology & University of Gothenburg, Sweden (e-mail: koen@chalmers.se)

Abstract

We present a divide-and-conquer algorithm for parsing context-free languages efficiently. Our algorithm is an instance of Valiant's algorithm (1975; General context-free recognition in less than cubic time. J. Comput. Syst. Sci. 10(2), 308–314), which reduces the problem of parsing to matrix multiplication. We show that, while the conquer step of Valiant's algorithm is O(n³), it improves to O(log² n) under certain conditions satisfied by many useful inputs that occur in practice, provided that one uses a sparse representation of matrices. The improvement happens because the overwhelming majority of the matrices involved in the multiplications are empty. This result is relevant to modern computing: divide-and-conquer algorithms with a polylogarithmic conquer step can be parallelized relatively easily.
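
To make the reduction to matrix multiplication concrete, the following is a minimal Python sketch of a chart recognizer for a grammar in Chomsky normal form. It is the classic bottom-up chart (CYK-style) construction, not the divide-and-conquer algorithm of the article; it only illustrates the cell product that Valiant's reduction treats as a matrix multiplication, and how a sparse chart lets products with empty cells be skipped. The toy bracket grammar and the names UNARY, BINARY, cell_product and recognise are assumptions introduced for illustration.

```python
# Hypothetical toy grammar in Chomsky normal form for balanced brackets:
#   S -> S S | L R | L X ;  X -> S R ;  L -> '(' ;  R -> ')'
UNARY = {"(": {"L"}, ")": {"R"}}
BINARY = {
    ("S", "S"): {"S"},
    ("L", "R"): {"S"},
    ("L", "X"): {"S"},
    ("S", "R"): {"X"},
}


def cell_product(left, right):
    """Return every A with a rule A -> B C where B is in left and C in right."""
    out = set()
    for b in left:
        for c in right:
            out |= BINARY.get((b, c), set())
    return out


def recognise(tokens, start="S"):
    """Fill an upper-triangular chart; chart[i, j] covers tokens[i:j].

    The chart is a dict, so empty cells are simply absent (a sparse
    representation). Returns (accepted, chart).
    """
    n = len(tokens)
    chart = {}
    for i, tok in enumerate(tokens):
        cats = UNARY.get(tok, set())
        if cats:
            chart[i, i + 1] = set(cats)
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = set()
            for k in range(i + 1, j):
                left = chart.get((i, k))
                right = chart.get((k, j))
                if left and right:  # skip any product with an empty cell
                    cell |= cell_product(left, right)
            if cell:
                chart[i, j] = cell
    return start in chart.get((0, n), set()), chart
```

Calling recognise("(()())") accepts the input, while recognise("(()") rejects it. In this sketch the combination of sub-charts is still done cell by cell; the abstract's claim concerns replacing that step with products of sparse sub-matrices, most of which are empty on typical inputs.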

Type
Articles
Copyright
Copyright © Cambridge University Press 2015 

References

Allison, L. (1992) Lazy dynamic-programming can be eager. Inform. Process. Lett. 43 (4), 207–212.
Bernardy, J.-P. (2008) Yi: An editor in Haskell for Haskell. In Proceedings of the 1st ACM SIGPLAN Symposium on Haskell. ACM, pp. 61–62.
Bernardy, J.-P. (2009) Lazy functional incremental parsing. In Proceedings of the 2nd ACM SIGPLAN Symposium on Haskell. ACM, pp. 49–60.
Bernardy, J.-P. & Claessen, K. (2013) Efficient divide-and-conquer parsing of practical context-free languages. In Proceedings of the 18th ACM SIGPLAN International Conference on Funct. Programming, pp. 111–122.
Bird, R. (1986) An Introduction to the Theory of Lists. Programming Research Group, Oxford University Comp. Laboratory.
Burckhardt, S., Leijen, D., Sadowski, C., Yi, J. & Ball, T. (2011) Two for the price of one: A model for parallel and incremental computation. In Proceedings of the 2011 ACM International Conference on Object Oriented Programming Systems Languages and Applications. ACM, pp. 427–444.
Chomsky, N. (1959) On certain formal properties of grammars. Inform. Control 2 (2), 137–167.
Chytil, M., Crochemore, M., Monien, B. & Rytter, W. (1991) On the parallel recognition of unambiguous context-free languages. Theor. Comput. Sci. 81 (2), 311–316.
Claessen, K. (2004) Parallel parsing processes. J. Funct. Program. 14 (6), 741–757.
Cocke, J. (1969) Programming Languages and their Compilers: Preliminary Notes. Courant Institute of Mathematical Sci., New York University.
Cormen, T. H., Leiserson, C. E., Rivest, R. L. & Stein, C. (2001) Introduction to Algorithms, 2nd ed. MIT Press.
Forsberg, M. & Ranta, A. BNFC Quick reference, chapter Appendix A, London: College Publications, pp. 175–192.
Free Software Foundation. (1991) GNU general public license.
Gibbons, J. (1996) The third homomorphism theorem. J. Funct. Program. 6 (4), 657–665.
Hinze, R. & Paterson, R. (2006) Finger trees: A simple general-purpose data structure. J. Funct. Program. 16 (2), 197–218.
Hughes, R. J. M. & Swierstra, S. D. (2003) Polish parsers, step by step. In Proceedings of the Eighth ACM SIGPLAN International Conference on Funct. Programming. ACM, pp. 239–248.
Kasami, T. (1965) An Efficient Recognition and Syntax Analysis Algorithm for Context-Free Languages. Technical Report, DTIC Document.
Lange, M. & Leiß, H. (2009) To CNF or not to CNF? An efficient yet presentable version of the CYK algorithm. Inform. Didactica 8, 2008–2010.
Morita, K., Morihata, A., Matsuzaki, K., Hu, Z. & Takeichi, M. (2007) Automatic inversion generates divide-and-conquer parallel programs. ACM SIGPLAN Not. 42 (6), 146–155.
Okhotin, A. (2014) Parsing by matrix multiplication generalized to boolean grammars. Theor. Comput. Sci. 516 (0), 101–120.
O'Sullivan, B. (2013) The Criterion benchmarking library.
Rytter, W. & Giancarlo, R. (1987) Optimal parallel parsing of bracket languages. Theor. Comput. Sci. 53 (2), 295–306.
Sikkel, K. & Nijholt, A. (1997) Parsing of Context-Free Languages. Berlin: Springer-Verlag, pp. 61–100.
Strassen, V. (1969) Gaussian elimination is not optimal. Numer. Math. 13, 354–356. DOI: 10.1007/BF02165411.
Tomita, M. (1986) Efficient Parsing for Natural Language. Dordrecht: Kluwer Academic Publishers.
Valiant, L. (1975) General context-free recognition in less than cubic time. J. Comput. Syst. Sci. 10 (2), 308–314.
Wagner, T. A. & Graham, S. L. (1998) Efficient and flexible incremental parsing. ACM Trans. Program. Lang. Syst. 20 (5), 980–1013.
Younger, D. (1967) Recognition and parsing of context-free languages in time n³. Inform. Control 10 (2), 189–208.