An approach to automatic organization of dynamic text collections using incrementa...
Incorporating the semantics into the websensors construction process
Automatic clustering based on nature inspired metaheuristics
Grant number: | 11/19850-9 |
Support Opportunities: | Regular Research Grants |
Start date: | January 01, 2012 |
End date: | December 31, 2013 |
Field of knowledge: | Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques |
Principal Investigator: | Solange Oliveira Rezende |
Grantee: | Solange Oliveira Rezende |
Host Institution: | Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil |
Abstract
Textual information retrieval is traditionally based on keyword search, which returns a list of documents ordered by their relevance to the query. This approach has some well-known limitations. Users generally explore only the first results in the list, to the detriment of documents that the search engine ranks as less relevant. Moreover, a significant part of the information is lost because users have difficulty expressing their search objectives through keywords. In this project, methods for hierarchical clustering of documents are explored to help organize search engine results. Data returned by one or more search engines are organized into groups, so that items that are similar and related to the same topic are placed in the same group. Furthermore, the groups are organized hierarchically: the nearer a group is to the root node, the more general the knowledge it represents. The details of a given group and its more specific knowledge are arranged in groups and subgroups at lower levels of the hierarchy. Each group has a succinct description, i.e., a topic that helps the user explore the results at different levels of granularity. This organization based on hierarchical topics facilitates the search for information of interest and provides a view complementary to the model based on a list ordered by document relevance. On the other hand, clustering search results has specific requirements and challenges. The dynamic nature of the data returned by search engines, the need for computational efficiency, and the demand for interpretation and interaction by users give rise to new requirements. These requirements pose scientific and technological challenges, and addressing them is the objective of this research project. (AU)
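The idea of grouping search results into a labeled topic hierarchy can be illustrated with a minimal sketch. It is not the project's method: it assumes plain bag-of-words vectors, cosine similarity, and a simple batch agglomerative procedure that merges the two most similar groups by their summed term vectors, whereas the abstract stresses dynamic data and efficiency. The names `build_hierarchy`, `label`, `show`, and the tiny stopword list are hypothetical and chosen only for this example.

```python
import math
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption of this sketch).
STOPWORDS = {"a", "an", "the", "of", "for", "in", "and", "to", "with", "on"}

def tokenize(text):
    """Lowercase, split on whitespace, drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]

def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def label(vec, k=2):
    """Succinct group description: the k most frequent terms (ties broken alphabetically)."""
    ranked = sorted(vec.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

def build_hierarchy(snippets):
    """Batch agglomerative clustering: repeatedly merge the two most similar groups."""
    nodes = [{"vec": Counter(tokenize(s)), "docs": [i], "children": []}
             for i, s in enumerate(snippets)]
    while len(nodes) > 1:
        # Compare every pair of current groups and pick the most similar one.
        _, i, j = max(
            ((cosine(nodes[i]["vec"], nodes[j]["vec"]), i, j)
             for i in range(len(nodes)) for j in range(i + 1, len(nodes))),
            key=lambda t: t[0],
        )
        a, b = nodes[i], nodes[j]
        merged = {"vec": a["vec"] + b["vec"],          # summed term counts
                  "docs": sorted(a["docs"] + b["docs"]),
                  "children": [a, b]}
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]  # root: the most general group

def show(node, depth=0):
    """Print the hierarchy, one group per line, indented by level."""
    print("  " * depth + f"{label(node['vec'])} docs={node['docs']}")
    for child in node["children"]:
        show(child, depth + 1)

if __name__ == "__main__":
    snippets = [
        "python clustering tutorial",
        "hierarchical clustering python",
        "jaguar car price",
        "jaguar car review",
    ]
    show(build_hierarchy(snippets))
```

In this toy input the two clustering snippets and the two jaguar snippets each form a subgroup below the root, and each subgroup is described by its most frequent terms. A real system for search results would need an incremental update strategy and better term weighting, which this sketch leaves out.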