|Support type:||Scholarships in Brazil - Scientific Initiation|
|Effective date (Start):||April 01, 2007|
|Effective date (End):||March 31, 2009|
|Field of knowledge:||Physical Sciences and Mathematics - Computer Science|
|Principal researcher:||Ivandre Paraboni|
|Grantee:||Wilker Ferreira Aziz|
|Home Institution:||Escola de Artes, Ciências e Humanidades (EACH). Universidade de São Paulo (USP). São Paulo, SP, Brazil|
Machine Translation (MT) is one of the most traditional subfields of Natural Language Processing (NLP). Since the 1950s, many efforts have been made to translate texts written in one language into another, but the computational problem of translation remains largely unsolved. Well-known difficulties include the sheer complexity of natural languages and the need for large amounts of heterogeneous linguistic knowledge (e.g., syntactic, semantic, and pragmatic) in the translation task. Recently, however, important achievements have been made in this and many other NLP subfields using purely statistical models, which use little or no linguistic knowledge to solve the task and instead rely on aligned parallel corpora as a basis for learning the translation process. In this document we propose the study and possible development of textual alignment techniques for parallel corpora as a first step towards the development of a statistical MT system.