Using Dominance Chains to Detect Annotation Variants in Parsed Corpora

Author(s):
Faria, Pablo ; New York Acad Sci
Total Authors: 2
Document type: Journal article
Source: 2014 IEEE 10th International Conference on e-Science (e-Science), Vol. 1; 8 pp., 2014-01-01.
Abstract

This paper presents results on detecting annotation variation in parsed corpora (treebanks). Treebanks are generally built by combining automatic tools (i.e., taggers and parsers) with human intervention. This process introduces inconsistencies, and thus variation, in the annotation, caused by factors such as disagreement in interpretation and incomplete or unclear annotation guidelines. In this study, the algorithm for automatic detection of variation proposed in [1] is evaluated against the Tycho Brahe Corpus (TBC, [2]) and compared to an alternative implementation in which annotation variants are characterized by means of "dominance chains". Experimental results show that the modified version achieves better relative precision and recall than the original method. (AU)

FAPESP's process: 13/18090-6 - Study and development of methods for automatic detection and correction of errors and inconsistencies in syntactically annotated corpora
Grantee: Pablo Picasso Feliciano de Faria
Support Opportunities: Scholarships in Brazil - Post-Doctoral