Stochastic behavior, critical phenomena and rhythmic patterns identification in natural languages

Grant number: 03/09930-9
Support type: PRONEX Research - Thematic Grants
Duration: October 01, 2004 - March 31, 2009
Field of knowledge: Physical Sciences and Mathematics - Probability and Statistics
Cooperation agreement: CNPq - Pronex
Principal Investigator: Jefferson Antonio Galves
Grantee: Jefferson Antonio Galves
Home Institution: Instituto de Matemática e Estatística (IME). Universidade de São Paulo (USP). São Paulo, SP, Brazil
Co-Principal Investigators: Maria Bernadete Marques Abaurre; Nancy Lopes Garcia
Associated grant(s): 06/52639-1 - Nasal consonants in Tupi and Jê languages, AR.EXT

Abstract

The aim of this interdisciplinary project is to develop the area of Stochastic Processes Theory in order to rigorously address the following central problems in Linguistics: 1. the existence of rhythmic patterns in natural languages; 2. the existence of a discrete typology characterized by well-defined critical points, as opposed to a rhythmic continuum; 3. the existence of rhythmic features in the acoustic signal and in written texts. Besides obtaining new mathematical results that are interesting in their own right, the project will use the conceptual framework of Probability Theory to interpret linguistic data. Statistical analysis will be crucial for arriving at a deeper understanding of the linguistic issues. As a by-product, the project will develop the statistical and computational tools necessary to treat the relevant linguistic data. This last aspect opens the possibility of technological developments in Language Engineering. In order to achieve its goals, the project team will engage in the following activities: 1. mathematical research to study the properties of the formal models proposed by the project; 2. linguistic research to update and reformulate the central questions of the project in the light of the new mathematical findings; 3. building acoustic and written corpora, and laboratory treatment of linguistic samples; 4. statistical analysis of linguistic data in order to fit the mathematical models and to find evidence for or against the predictions of the models; 5. development of new computational and statistical tools for the analysis of linguistic data.
The research activities will be conducted at the Núcleo de Modelagem Estocástica e Complexidade da USP (Numec), at IME-USP, IMEC-UNICAMP, IEb-UNICAMP, Mathematics-UFMG, and also at foreign research institutions (Laboratoire de Mathématiques-Rouen, Centre de Physique Théorique-CNRS and Ecole Polytechnique de Palaiseau, Centre de Physique Théorique-CNRS Luminy, Fisica-Roma La Sapienza, Lingüística-Ferrara, Lingüística-Lisboa, Lingüística-Braga, Cognitive Sciences-UPenn, Laboratoire de Sciences Cognitives-Ecole de Hautes Etudes en Sciences Sociales and ENS, Linguistics-Bielefeld and Freiburg, Linguistics-Northwestern University). Regular working visits by members of the project team to the different institutions that host the project will therefore be essential. The work will be coordinated at Numec-USP, where the Experimental Phonetics Laboratory and the Scientific Calculus Laboratory will be located. This project requires an array of specific knowledge and skills spanning Mathematics, Computer Science, Statistics and Linguistics... (AU)

Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
GALVES, ANTONIO; GARCIA, NANCY L.; PRIEUR, CLEMENTINE. Perfect Simulation of a Coupling Achieving the d̄-distance Between Ordered Pairs of Binary Chains of Infinite Order. Journal of Statistical Physics, v. 141, n. 4, p. 669-682, NOV 2010. Web of Science Citations: 2.
COLLET, PIERRE; GALVES, ANTONIO; LEONARDI, FLORENCIA. Random perturbations of stochastic processes with unbounded variable length memory. Electronic Journal of Probability, v. 13, p. 1345-1361, AUG 25 2008. Web of Science Citations: 6.
LEONARDI, F.G. A generalization of the PST algorithm: modeling the sparse nature of protein sequences. Bioinformatics, v. 22, n. 11, p. 1302-1307, 2006.
