
Hierarchical clustering methods for automatic organization of search engines results

Grant number: 11/19850-9
Support type: Regular Research Grants
Duration: January 01, 2012 - December 31, 2013
Field of knowledge: Physical Sciences and Mathematics - Computer Science
Principal Investigator: Solange Oliveira Rezende
Grantee: Solange Oliveira Rezende
Home Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil

Abstract

Textual information retrieval is traditionally based on keyword search, which returns a list of documents ordered by their relevance to the query. However, this approach has some well-known limitations. Users generally explore only the first results in the list, to the detriment of documents that the search engine ranks as less relevant. Moreover, a significant part of the information is lost because users have difficulty expressing their search objectives through keywords. In this project, methods for hierarchical clustering of documents are explored to help organize search engine results. Data returned by one or more search engines are organized into groups, in which items that are similar and related to the same topic are placed in the same group. Furthermore, the groups are organized hierarchically, such that the closer a group is to the root node, the more general the knowledge it represents. The details of a given group and its more specific knowledge are arranged in groups and subgroups at lower levels of the hierarchy. Each group has a succinct description, i.e., a topic that helps the user in an exploratory search of the results at different levels of granularity. This organization based on hierarchical topics facilitates the search for the information of interest and provides a view complementary to the model based on a list ordered by document relevance. On the other hand, clustering search results has specific requirements and challenges. The dynamic nature of the data returned by search engines, the need for computational efficiency, and the demand for interpretation and interaction by users give rise to new requirements. These requirements pose scientific and technological challenges, and addressing them is the objective of this research project. (AU)
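
The idea of grouping results into a topic hierarchy can be illustrated with a small sketch. It is not the project's method, which the abstract does not specify: it assumes bag-of-words vectors, cosine similarity, and a simple centroid-style agglomerative merge, and the names `vectorize`, `cosine`, `build_hierarchy`, `label`, and `show`, as well as the sample snippets, are hypothetical. Each internal node is described by its most frequent terms, standing in for the succinct topic description mentioned above.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts for one result snippet (assumed representation)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def build_hierarchy(snippets):
    """Repeatedly merge the two most similar clusters until one root remains.

    A cluster is a dict holding its summed term vector ("terms"), the indices
    of the snippets it covers ("members"), and its sub-clusters ("children").
    """
    clusters = [{"terms": vectorize(s), "members": [i], "children": []}
                for i, s in enumerate(snippets)]
    while len(clusters) > 1:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = max(pairs, key=lambda p: cosine(clusters[p[0]]["terms"],
                                               clusters[p[1]]["terms"]))
        a, b = clusters[i], clusters[j]
        merged = {"terms": a["terms"] + b["terms"],
                  "members": sorted(a["members"] + b["members"]),
                  "children": [a, b]}
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0]

def label(cluster, n=2):
    """Describe a cluster by its n most frequent terms (a crude topic label)."""
    return [term for term, _ in cluster["terms"].most_common(n)]

def show(cluster, snippets, depth=0):
    """Print the hierarchy: topic labels for groups, snippet text for leaves."""
    indent = "  " * depth
    if cluster["children"]:
        print(indent + "topic: " + ", ".join(label(cluster)))
        for child in cluster["children"]:
            show(child, snippets, depth + 1)
    else:
        print(indent + snippets[cluster["members"][0]])

# Hypothetical snippets for an ambiguous query.
results = [
    "jaguar car dealer price",
    "jaguar car engine review",
    "jaguar cat habitat rainforest",
    "jaguar cat diet prey",
]
root = build_hierarchy(results)
show(root, results)
```

The quadratic pair search in each merge step is acceptable for a few dozen snippets but would not meet the efficiency requirement the abstract raises for larger result sets. The frequency-based labels also fail to discriminate between sibling groups when a query term appears in every snippet.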

Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
ROSSI, RAFAEL GERALDELI; LOPES, ALNEU DE ANDRADE; FALEIROS, THIAGO DE PAULO; REZENDE, SOLANGE OLIVEIRA. Inductive Model Generation for Text Classification Using a Bipartite Heterogeneous Network. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, v. 29, n. 3, p. 361-375, MAY 2014. Web of Science Citations: 10.
