Best parse parsing with Earley's and Inside algorithms on probabilistic RTN

  • Young S. Han (a1) and Key-Sun Choi (a1)

Abstract

Inside parsing is a best-parse method based on the Inside algorithm, which is often used to estimate the probabilistic parameters of stochastic context-free grammars. It finds a best parse in O(N^3 G^3) time, where N is the input size and G is the grammar size. The Earley algorithm can be made to return best parses with the same complexity in N.
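To make the idea concrete, here is a minimal sketch of a best-parse search built on the Inside recursion, with the sum over derivations replaced by a maximum. It assumes a grammar in Chomsky normal form rather than the probabilistic RTN used in the paper, and the names `best_parse`, `lexical`, and `binary` are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict


def best_parse(words, lexical, binary, start="S"):
    """Viterbi-style Inside computation.

    Assumption: the grammar is in Chomsky normal form (not an RTN).
    lexical maps (A, word) -> P(A -> word).
    binary maps (A, B, C) -> P(A -> B C).
    Returns (probability, tree), or None if no parse exists.
    """
    n = len(words)
    # best[(i, j)][A] = (probability, backpointer) for words[i:j] derived from A
    best = defaultdict(dict)

    # Spans of length 1: lexical rules.
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w and p > best[(i, i + 1)].get(a, (0.0, None))[0]:
                best[(i, i + 1)][a] = (p, w)

    # Longer spans: keep the maximum over split points and rules.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    if b in best[(i, k)] and c in best[(k, j)]:
                        q = p * best[(i, k)][b][0] * best[(k, j)][c][0]
                        if q > best[(i, j)].get(a, (0.0, None))[0]:
                            best[(i, j)][a] = (q, (k, b, c))

    if start not in best[(0, n)]:
        return None

    def build(a, i, j):
        # Follow backpointers to rebuild the tree as nested tuples.
        _, bp = best[(i, j)][a]
        if isinstance(bp, str):
            return (a, bp)
        k, b, c = bp
        return (a, build(b, i, k), build(c, k, j))

    return best[(0, n)][start][0], build(start, 0, n)
```

The three nested loops over span, start position, and split point account for the cubic dependence on N. In this sketch the grammar contributes only one pass over the binary rules per cell, so it does not reproduce the G^3 factor stated for the RTN formulation.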

Through experiments, we show that Inside parsing can be more efficient than Earley parsing when the grammar is sufficiently large and the input sentences are sufficiently short. For instance, Inside parsing is faster on sentences of 16 or fewer words for a grammar containing 429 states. In practice, parsing can be made efficient by employing the two methods selectively.
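Selective use of the two methods could be as simple as dispatching on sentence length. The sketch below is only an illustration: the function name `choose_parser` is hypothetical, and the default crossover of 16 words is the figure reported for the 429-state grammar, so it would need to be measured again for any other grammar.

```python
def choose_parser(num_words, crossover=16):
    """Pick a parsing method by sentence length.

    Assumption: crossover=16 is the value reported for one 429-state
    grammar; other grammars will have a different crossover point.
    """
    return "inside" if num_words <= crossover else "earley"
```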

The redundancy of the Inside algorithm can be reduced by top-down filtering using the chart produced by the Earley algorithm, which is useful when training the probabilistic parameters of a grammar. Extensive experiments on the Penn Tree corpus show that the efficiency of the Inside computation can be improved by up to 55%.
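The filtering idea can be sketched as follows, again assuming a grammar in Chomsky normal form instead of an RTN. The `allowed` set stands in for the constituents licensed by an Earley chart; how that set is extracted from the chart is not shown here, and the names `inside`, `allowed`, and `work` are hypothetical. The `work` counter only tallies rule applications attempted, as a rough proxy for the redundancy that filtering removes.

```python
def inside(words, lexical, binary, allowed=None):
    """Inside probabilities with optional top-down filtering.

    Assumptions: CNF grammar; `allowed` is a set of (A, i, j) triples
    taken from an Earley chart (construction not shown). With
    allowed=None every cell is computed, as in the plain algorithm.
    Returns (beta, work), where beta[(A, i, j)] is the inside probability
    and work counts the rule applications that were attempted.
    """
    n = len(words)
    beta = {}
    work = 0

    def ok(a, i, j):
        return allowed is None or (a, i, j) in allowed

    # Spans of length 1: lexical rules.
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w and ok(a, i, i + 1):
                beta[(a, i, i + 1)] = beta.get((a, i, i + 1), 0.0) + p

    # Longer spans: sum over rules and split points, skipping filtered cells.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for (a, b, c), p in binary.items():
                if not ok(a, i, j):
                    continue
                for k in range(i + 1, j):
                    work += 1
                    left = beta.get((b, i, k), 0.0)
                    right = beta.get((c, k, j), 0.0)
                    if left and right:
                        beta[(a, i, j)] = beta.get((a, i, j), 0.0) + p * left * right
    return beta, work
```

As long as `allowed` contains every constituent that takes part in a complete parse, the inside probability of the root is unchanged while fewer cells are visited.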
