Recognizing textual entailment in Portuguese

Author(s):
Erick Rocha Fonseca
Total Authors: 1
Document type: Doctoral Thesis
Press: São Carlos.
Institution: Universidade de São Paulo (USP). Instituto de Ciências Matemáticas e de Computação (ICMC/SB)
Defense date:
Examining board members:
Sandra Maria Aluisio; Fábio Natanael Kepler; Maria das Graças Volpe Nunes; Moacir Antonelli Ponti; Paulo Quaresma Neto
Advisor: Sandra Maria Aluisio
Abstract

Recognizing Textual Entailment (RTE) consists of automatically identifying whether a natural language text passage is true based on the content of another passage. This problem has been studied in Natural Language Processing (NLP) for some years and has recently gained prominence, with the availability of larger quantities of annotated data and the development of deep learning methods. This doctoral research aimed to develop resources and methods for RTE, especially for Portuguese. During its execution, the ASSIN corpus was compiled, the first to provide data for training and evaluating RTE systems in Portuguese, and a workshop of the same name was organized, gathering researchers interested in the topic. Moreover, computational experiments were carried out with different RTE techniques on English and Portuguese data. A new RTE model, TEDIN (Tree Edit Distance Network), was developed. This model is based on the concept of syntactic tree edit distance, already explored in other RTE works. Its distinguishing feature is that it combines explicit linguistic knowledge representation with the flexibility and representational capacity of neural networks. An RTE model based on classical machine learning and feature engineering, Infernal, was also developed. TEDIN obtained experimental results below those of other models from the literature, and a careful analysis of its behavior shows the difficulty of modelling differences between syntactic trees. Infernal, on the other hand, had positive results on ASSIN, setting a new state of the art for RTE in Portuguese. (AU)

FAPESP's process: 13/22973-0 - Textual inference applied to Question and Answering Systems
Grantee: Erick Rocha Fonseca
Support Opportunities: Scholarships in Brazil - Doctorate