Automatic induction of bilingual resources from aligned parallel corpora: application to shallow-transfer machine translation

Author(s):
Caseli, Helena M. ; Nunes, Maria das Graças V. ; Forcada, Mikel L.
Total number of authors: 3
Document type: Scientific article
Source: MACHINE TRANSLATION; v. 20, n. 4, p. 19-pg., 2006-03-01.
Abstract

The availability of machine-readable bilingual linguistic resources is crucial not only for rule-based machine translation but also for other applications such as cross-lingual information retrieval. However, building such resources (bilingual single-word and multi-word correspondences, translation rules) demands extensive manual work. As a consequence, bilingual resources are usually harder to find than "shallow" monolingual resources such as morphological dictionaries or part-of-speech taggers, especially when they involve a less-resourced language. This paper describes a methodology to automatically build both bilingual dictionaries and shallow-transfer rules by extracting knowledge from word-aligned parallel corpora processed with shallow monolingual resources (morphological analysers and part-of-speech taggers). We present experiments for Brazilian Portuguese-Spanish and Brazilian Portuguese-English parallel texts. The results show that the proposed methodology can enable the rapid creation of valuable computational resources (bilingual dictionaries and shallow-transfer rules) for machine translation and other natural language processing tasks. (AU)

FAPESP grant: 04/06707-0 - Machine translation involving Portuguese
Grantee: Maria das Graças Volpe Nunes
Support type: Research Grant - Regular
FAPESP grant: 02/13207-8 - Corpus-based machine translation involving Brazilian Portuguese
Grantee: Helena de Medeiros Caseli
Support type: Scholarships in Brazil - Doctorate