Exponential inequalities for VLMC empirical trees

Antonio Galves; Véronique Maume-Deschamps; Bernard Schmitt

doi:10.1051/ps:2007035

Published online by Cambridge University Press: 23 January 2008

Antonio Galves: Instituto de Matemática e Estatística, Universidade de São Paulo, BP 66281, 05315-970 São Paulo, Brasil; galves@ime.usp.br
Véronique Maume-Deschamps: Institut de Mathématiques de Bourgogne, BP 47870, 21078 Dijon cedex, France; vmaume@u-bourgogne.fr
Bernard Schmitt: Institut de Mathématiques de Bourgogne, BP 47870, 21078 Dijon cedex, France; schmittb@u-bourgogne.fr

Abstract

A seminal paper by Rissanen, published in 1983, introduced the class of Variable Length Markov Chains and the algorithm Context, which estimates the probabilistic tree generating the chain. Although the subject has recently been considered in several papers, the central question of the rate of convergence of the algorithm remained open. This is the question we address here. We provide an exponential upper bound for the probability of incorrect estimation of the probabilistic tree, as a function of the size of the sample. As a consequence, we prove the almost sure consistency of the algorithm Context. We also derive exponential upper bounds for type I errors and for the probability of underestimation of the context tree. The constants appearing in the bounds are all explicit and obtained in a constructive way.

Keywords

Variable Length Markov Chain; context tree; algorithm Context; weak dependence

Type: Research Article
Information: ESAIM: Probability and Statistics, Volume 12, 2008, pp. 219–229

DOI: https://doi.org/10.1051/ps:2007035
Copyright: © EDP Sciences, SMAI, 2008

References

Bejerano, G. and Yona, G., Variations on probabilistic suffix trees: statistical modeling and prediction of protein families. Bioinformatics 17 (2001) 23–43.

Bühlmann, P. and Wyner, A., Variable length Markov chains. Ann. Statist. 27 (1999) 480–513.

Csiszár, I., Large-scale typicality of Markov sample paths and consistency of MDL order estimators. Special issue on Shannon theory: perspective, trends, and applications. IEEE Trans. Inform. Theory 48 (2002) 1616–1628.

Csiszár, I. and Talata, Z., Context tree estimation for not necessarily finite memory processes via BIC and MDL, manuscript (2005).

Dedecker, J. and Doukhan, P., A new covariance inequality and applications. Stochastic Process. Appl. 106 (2003) 63–80.

Dedecker, J. and Prieur, C., New dependence coefficients. Examples and applications to statistics. Prob. Theory Relat. Fields 132 (2005) 203–236.

Ferrari, P. and Galves, A., Coupling and regeneration for stochastic processes. Notes for a minicourse presented in XIII Escuela Venezolana de Matematicas. Can be downloaded from www.ime.usp.br/~pablo/book/abstract.html (2000).

Ferrari, F. and Wyner, A., Estimation of general stationary processes by variable length Markov chains. Scand. J. Statist. 30 (2003) 459–480.

Leonardi, F. and Galves, A., Sequence Motif identification and protein classification using probabilistic trees. Lect. Notes Comput. Sci. 3594 (2005) 190–193.

Maume-Deschamps, V., Exponential inequalities and estimation of conditional probabilities, in Dependence in probability and statistics, Lect. Notes in Stat., Vol. 187, P. Bertail, P. Doukhan and P. Soulier Eds. Springer (2006).

Rissanen, J., A universal data compression system. IEEE Trans. Inform. Theory 29 (1983) 656–664.

Tjalkens, T.J. and Willems, F.M.J., Implementing the context-tree weighting method: arithmetic coding. Recent advances in interdisciplinary mathematics (Portland, ME, 1997). J. Combin. Inform. System Sci. 25 (2000) 49–58.

Willems, F.M.J., Shtarkov, Y.M. and Tjalkens, T.J., The context-tree weighting method: basic properties. IEEE Trans. Inform. Theory 41 (1995) 653–664.
