
Linguistic analysis of textual aspects for automatic multi-document summarization

Grant number: 13/13107-8
Support type: Scholarships in Brazil - Scientific Initiation
Effective date (Start): September 01, 2013
Effective date (End): August 31, 2014
Field of knowledge: Linguistics, Literature and Arts - Linguistics
Principal Investigator: Ariani Di Felippo
Grantee: Vinícius Felix dos Santos
Home Institution: Centro de Educação e Ciências Humanas (CECH). Universidade Federal de São Carlos (UFSCAR). São Carlos, SP, Brazil

Abstract

Several studies have demonstrated that human summaries produced from collections of news texts from different sources on the same topic (i.e., multi-document summaries) contain specific aspects that depend on the category of the news. The "aspects" are defined as basic units of information. For example, a summary of "natural disasters" news has the following aspects: what, when, where, why, who_affected, damages and countermeasures. Based on that, some Automatic Multi-document Summarization methods produce summaries by selecting sentences from the source texts that convey the aspects expected for the category of the collection. This project aims at: (i) revising the annotation of the aspects in the 50 human multi-document summaries of the Portuguese CSTNews corpus, and (ii) annotating the aspects in the 140 source texts of the CSTNews corpus. The annotation revision is motivated by the fact that there is no clear and well-defined theory of aspects, so the criteria for identifying and defining these aspects need to be refined. The annotation of the source texts of CSTNews is essential for developing aspect-based multi-document summarization methods, especially for Portuguese, since such methods require a corpus of annotated source texts. As part of the SUSTENTO project (FAPESP 2012/13246-5 / CNPq 483231/2012-6), which aims at generating linguistic knowledge for Automatic Multi-document Summarization of Portuguese, this undergraduate research project aims to help refine the theoretical knowledge on textual aspects and to characterize the human multi-document summaries of CSTNews.
