Consistent estimation of stochastic processes with variable length memory: applications to the modeling of biological sequences

Grant number: 09/09411-8
Support Opportunities: Regular Research Grants
Start date: September 01, 2009
End date: November 30, 2011
Field of knowledge: Physical Sciences and Mathematics - Probability and Statistics - Statistics
Principal Investigator: Florencia Graciela Leonardi
Grantee: Florencia Graciela Leonardi
Host Institution: Instituto de Matemática e Estatística (IME). Universidade de São Paulo (USP). São Paulo, SP, Brazil

Abstract

The main goal of this project is to study the properties of stochastic processes with variable length memory and their estimation. We also propose to apply data analysis tools based on this type of process to two important problems in biology: the genetic mapping of complex diseases and the characterization of secondary structure in proteins. From the theoretical point of view, one of the main objectives of this project is to obtain sufficient conditions for the consistency of empirical context trees that are not necessarily truncated at a level defined a priori. We also wish to obtain optimal upper bounds for the speed of convergence of these estimators. In addition, we propose to study the conditions under which certain functionals of stochastic processes can be consistently estimated. From the point of view of applications, for the problem of mapping complex diseases we propose to develop algorithms for the analysis of SNP maps that make it possible to infer ancestral relationships between different individuals and to identify which regions of the genome are most strongly associated with the disease. With regard to secondary structure in proteins, we wish to develop algorithms to predict the different secondary structures from amino acid sequences. We also expect that this association between secondary structure and sequence could be used to design more efficient algorithms for classifying proteins into families. (AU)

Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
GARIVIER, A.; LEONARDI, F. Context tree selection: A unifying view. Stochastic Processes and their Applications, v. 121, n. 11, p. 2488-2506, . (09/09411-8)
GALVES, ANTONIO; GALVES, CHARLOTTE; GARCIA, JESUS E.; GARCIA, NANCY L.; LEONARDI, FLORENCIA. Context tree selection and linguistic rhythm retrieval from written texts. Annals of Applied Statistics, v. 6, n. 1, p. 186-209, . (09/09411-8)