Semantic driven automated post-editing for Brazilian Portuguese

Grant number: 16/21317-0
Support type: Scholarships in Brazil - Scientific Initiation
Effective date (Start): March 01, 2017
Effective date (End): December 31, 2018
Field of knowledge: Physical Sciences and Mathematics - Computer Science
Principal Investigator: Helena de Medeiros Caseli
Grantee: Marcio Lima Inácio
Home Institution: Centro de Ciências Exatas e de Tecnologia (CCET). Universidade Federal de São Carlos (UFSCAR). São Carlos, SP, Brazil

Abstract

Machine Translation (MT) is one of the most important applications (and subfields) of Natural Language Processing (NLP). Given a text in a source language as input, an MT system generates an equivalent version of it in a target language. After more than 70 years of MT research, during which various approaches have been proposed and implemented (such as rule-based MT, phrase-based statistical MT and neural MT), the field's ambitious original goal has still not been achieved: fully automatic, good-quality translation for unrestricted domains. As a rule, therefore, automatic translations have to be post-edited by humans to become accurate and fluent in the target language. Manual post-editing, however, is an arduous process that requires specialized effort. In this context, several proposals for automated post-editing have emerged in recent years. This project aims to investigate automated post-editing based on semantic knowledge. One of the most traditional ways of representing textual semantics is based on the distributional hypothesis, which characterizes a word by the contexts in which it occurs. This contextual information can be encoded in distributional semantic models (DSMs), which represent words as vectors in a high-dimensional space that associates each word with its contexts of occurrence. The project therefore investigates how DSMs can be applied to automated post-editing. This proposal is related to the MMeaning project (Regular Aid from FAPESP #2016/13002-0). (AU)
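As an illustration of the distributional idea described in the abstract, the sketch below builds simple count-based word vectors from co-occurrence windows and compares them with cosine similarity. It is only a minimal example under stated assumptions: the toy corpus, the window size and the function names (build_cooccurrence_vectors, cosine) are invented for illustration, and the abstract does not say which kind of DSM the project actually uses.

```python
from collections import Counter, defaultdict
import math


def build_cooccurrence_vectors(sentences, window=2):
    """Map each word to a sparse vector of counts of its neighbouring words.

    Hypothetical helper: a plain count-based DSM, chosen only for illustration.
    """
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy Brazilian Portuguese corpus (invented, unaccented for simplicity).
corpus = [
    "o gato bebe leite",
    "o cachorro bebe agua",
    "o gato come peixe",
    "o cachorro come carne",
]

vectors = build_cooccurrence_vectors(corpus)

# Words that occur in similar contexts end up with similar vectors.
sim_gato_cachorro = cosine(vectors["gato"], vectors["cachorro"])
sim_gato_leite = cosine(vectors["gato"], vectors["leite"])
print(sim_gato_cachorro, sim_gato_leite)
```

In this toy corpus, "gato" and "cachorro" share most of their contexts, so their vectors are closer to each other than "gato" is to "leite". A post-editing system could, in principle, use such similarities to judge whether a translated word fits its context, though that application is an assumption here and not a description of the project's method.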