|Support type:||Scholarships in Brazil - Scientific Initiation|
|Effective date (Start):||September 01, 2013|
|Effective date (End):||August 31, 2014|
|Field of knowledge:||Linguistics, Literature and Arts - Linguistics|
|Principal Investigator:||Ariani Di Felippo|
|Home Institution:||Centro de Educação e Ciências Humanas (CECH). Universidade Federal de São Carlos (UFSCAR). São Carlos, SP, Brazil|
Computational applications able to handle the vast amount of available information, especially online, have become increasingly relevant. Automatic Multi-Document Summarization (MDS) is one such application. It aims at automatically producing a single summary from a group of texts on the same topic. To produce summaries free of cohesion and coherence problems, MDS methods have to deal with multi-document phenomena such as redundancy, complementarity, and contradiction among information units. Although interest in MDS is recent, many systems have already been developed, including some for Portuguese. Given the importance of MDS systems, the linguistic characterization of human multi-document summaries becomes increasingly necessary, as it generates knowledge for the production of linguistically motivated summaries. The goal of this undergraduate research project is therefore to characterize human multi-document summaries at the lexical level. As part of the SUSTENTO project (2012/13246-5 FAPESP / CNPq 483231/2012-6), which aims at generating linguistic knowledge for MDS of Portuguese, this project aims at (i) specifying the density of nouns, adjectives, verbs, and adverbs in the summaries in relation to their source texts and (ii) describing the similarities and differences between these lexical units in the summaries and in their source texts. In the end, the goal is to obtain lexical features that can serve as conditions for the automatic production of linguistically motivated summaries.
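
As an illustration of aim (i), the comparison of part-of-speech density between a summary and its source texts could be sketched as follows. This is a minimal sketch under stated assumptions, not the project's method: the abstract does not define "density", so it is taken here as the share of tokens carrying a given tag; the tag names follow the Universal Dependencies convention; the input is assumed to be already tokenized and tagged; and the function names and toy data are hypothetical.

```python
from collections import Counter

# Content-word tags of interest (Universal Dependencies names; an assumption).
CONTENT_TAGS = ("NOUN", "ADJ", "VERB", "ADV")


def pos_density(tagged_tokens):
    """Return, for each content-word tag, its share of all tokens.

    `tagged_tokens` is a list of (word, tag) pairs produced by any
    part-of-speech tagger; tagging itself is outside this sketch.
    """
    total = len(tagged_tokens)
    if total == 0:
        return {tag: 0.0 for tag in CONTENT_TAGS}
    counts = Counter(tag for _, tag in tagged_tokens)
    return {tag: counts[tag] / total for tag in CONTENT_TAGS}


def density_difference(summary, sources):
    """Density in the summary minus density in the pooled source texts."""
    pooled = [token for text in sources for token in text]
    summary_density = pos_density(summary)
    source_density = pos_density(pooled)
    return {tag: summary_density[tag] - source_density[tag]
            for tag in CONTENT_TAGS}


if __name__ == "__main__":
    # Toy, hand-tagged data for illustration only (not from any corpus).
    summary = [("chuva", "NOUN"), ("forte", "ADJ")]
    sources = [
        [("a", "DET"), ("chuva", "NOUN")],
        [("caiu", "VERB"), ("ontem", "ADV")],
    ]
    for tag, diff in density_difference(summary, sources).items():
        print(f"{tag}: {diff:+.2f}")
```

A positive value means the tag is denser in the summary than in the pooled sources. Pooling all source texts before computing density is one possible design choice; averaging per-text densities would weight short and long sources equally instead.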