Bootstrap and model selection for stochastic chains with memory of variable length

Grant number: 09/09494-0
Support Opportunities: Scholarships in Brazil - Post-Doctoral
Start date: October 01, 2009
End date: September 30, 2011
Field of knowledge: Physical Sciences and Mathematics - Probability and Statistics - Statistics
Principal Investigator: Jefferson Antonio Galves
Grantee: Matthieu Pierre Lerasle
Host Institution: Instituto de Matemática e Estatística (IME). Universidade de São Paulo (USP). São Paulo, SP, Brazil

Abstract

Resampling methods have been used with success in the practical implementation of statistical tools. Recently, non-asymptotic results have been obtained theoretically, in particular in model selection, leading to great improvements in practical procedures. We propose to study rigorously some resampling methods for chains with memory of variable length. These chains, introduced by Rissanen in 1983, have been intensively studied since then. This is due not only to the mathematical interest of this new class of processes, but also to its potential as a model for scientific data coming from domains as different as biology and linguistics. From a statistical point of view, given a sample, the basic question is to retrieve the smallest context tree associated with a chain with memory of variable length that best fits the sample. Several methods have been proposed to achieve this goal, but they all depend on unknown constants or on unknown properties of the estimator. Resampling methods seem particularly suitable for handling this kind of difficulty.

Besides its intrinsic interest as a theoretical statistics project, this proposal also has an interdisciplinary and applied aspect, motivated by the linguistic challenge of retrieving rhythmic features from a written text. Preliminary studies strongly suggest that context trees are good candidates as signatures of the rhythmic classes whose existence has been conjectured in the linguistics literature. The practical goal of this project is to apply the new theoretical results as tools to analyse linguistic samples, seeking new evidence, obtained with a rigorous statistical methodology, in support of the linguistic conjecture that rhythmic classes of languages exist. (AU)

Scientific publications (5)
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
LERASLE, MATTHIEU. Optimal model selection in density estimation. ANNALES DE L INSTITUT HENRI POINCARE-PROBABILITES ET STATISTIQUES, v. 48, n. 3, p. 884-908, . (09/09494-0)
LERASLE, MATTHIEU; TAKAHASHI, DANIEL Y.. An oracle approach for interaction neighborhood estimation in random fields. ELECTRONIC JOURNAL OF STATISTICS, v. 5, p. 534-571, . (08/08171-0, 09/09494-0)
GALLO, S.; LERASLE, M.; TAKAHASHI, D. Y.. Markov Approximation of Chains of Infinite Order in the d̄-metric. Markov Processes and Related Fields, v. 19, n. 1, p. 51-82, . (09/09494-0, 09/09809-1, 08/08171-0)
LERASLE, MATTHIEU; TAKAHASHI, DANIEL Y.. Sharp oracle inequalities and slope heuristic for specification probabilities estimation in discrete random fields. BERNOULLI, v. 22, n. 1, p. 325-344, . (08/08171-0, 09/09494-0)
LERASLE, MATTHIEU. OPTIMAL MODEL SELECTION FOR DENSITY ESTIMATION OF STATIONARY DATA UNDER VARIOUS MIXING CONDITIONS. ANNALS OF STATISTICS, v. 39, n. 4, p. 1852-1877, . (09/09494-0)