Investigation of textual alignment techniques for statistical machine translation

Grant number: 06/04818-4
Support type: Scholarships in Brazil - Scientific Initiation
Effective date (Start): April 01, 2007
Effective date (End): March 31, 2009
Field of knowledge: Physical Sciences and Mathematics - Computer Science
Principal researcher: Ivandre Paraboni
Grantee: Wilker Ferreira Aziz
Home Institution: Escola de Artes, Ciências e Humanidades (EACH). Universidade de São Paulo (USP). São Paulo, SP, Brazil

Abstract

Machine Translation (MT) is one of the most traditional subfields of Natural Language Processing (NLP). Since the 1950s, many efforts have been made to translate texts written in one language into another, but the computational problem of translation remains largely unsolved. Well-known difficulties include the sheer complexity of natural languages and the need for large amounts of heterogeneous linguistic knowledge (e.g., syntactic, semantic, and pragmatic) in the translation task. Recently, however, important achievements have been made in this and many other NLP subfields using purely statistical models, which use little or no linguistic knowledge to solve the task and instead rely on aligned parallel corpora as a basis for learning the translation process. In this document, we propose the study and possible development of textual alignment techniques for parallel corpora as a first step towards the development of a statistical MT system.