XML Ranked Keyword Search

Grant number: 10/00330-2
Support Opportunities: Scholarships in Brazil - Post-Doctoral
Start date: October 01, 2010
End date: August 31, 2011
Field of knowledge: Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques
Principal Investigator: Caetano Traina Junior
Grantee: Joe Tekli
Host Institution: Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil

Abstract

XML has recently become established as a major standard for information exchange and management, and is broadly employed for data representation and storage. With the increased use of XML, especially on the Web, developing efficient search and retrieval techniques for XML data has become very important, particularly for the database (DB) and information retrieval (IR) communities. Because XML allows structured and unstructured data to be combined, recent DB and IR research shows a growing interest in merging techniques from the two fields, exploiting IR methods in DBs and vice versa, for example by extending DB-style XML query languages to support ranked results. In this project, we intend to develop a framework for efficiently producing ranked results for keyword search queries over large heterogeneous XML document collections. Our approach is based on the IR keyword search model: we aim to develop a keyword-based search environment that is better adapted to, and more efficient at, searching and retrieving XML-encoded documents. Therefore, instead of relying on complex query languages such as XML-QL, XQL or XQuery to search XML data, we retain the simple, widely used keyword query method and exploit XML's nested structure during query processing. In other words, we aim to develop a user-friendly technique for searching XML data in which users express queries in the simplest possible form (keywords), so that less control is given to the user and more of the logic is placed in the ranking mechanism to best match the user's needs.
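
As a purely illustrative sketch of keyword search that exploits XML's nested structure, the Python fragment below ranks elements by keyword hits and lets each hit count less the further up the tree it is propagated. It is not the project's framework: the function names (tokenize, rank_elements), the decay parameter and the scoring rule are assumptions made only for this example.

```python
import re
import xml.etree.ElementTree as ET


def tokenize(text):
    """Lowercase word tokens of a text fragment (empty list for None)."""
    return re.findall(r"\w+", (text or "").lower())


def rank_elements(xml_text, keywords, decay=0.5):
    """Return (score, path) pairs for matching elements, best first.

    Hypothetical scoring rule, assumed for this sketch only:
        score(e) = keyword hits in e's own text + decay * sum(score(child))
    so a hit contributes less to each more distant ancestor.
    """
    wanted = {k.lower() for k in keywords}
    results = []

    def visit(elem, path):
        # Direct hits: the element's leading text plus text after each child.
        tokens = tokenize(elem.text)
        for child in elem:
            tokens += tokenize(child.tail)
        score = float(sum(1 for t in tokens if t in wanted))
        # Nested hits, discounted once per level of nesting.
        for i, child in enumerate(elem):
            # The index is the child's position among all children of elem.
            score += decay * visit(child, f"{path}/{child.tag}[{i + 1}]")
        if score > 0:
            results.append((score, path))
        return score

    root = ET.fromstring(xml_text)
    visit(root, f"/{root.tag}")
    # Highest score first; ties are broken by path for a stable order.
    results.sort(key=lambda r: (-r[0], r[1]))
    return results


SAMPLE = (
    "<library>"
    "<book><title>XML keyword search</title><note>ranking</note></book>"
    "<book><title>Databases</title><note>XML storage</note></book>"
    "</library>"
)

for score, path in rank_elements(SAMPLE, ["xml", "search"]):
    print(f"{score:.2f}  {path}")
```

With this rule the most specific element containing the keywords ranks first, and its ancestors follow with lower scores. A real system would also need indexing and a principled relevance model, which this sketch does not attempt.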

Scientific publications
(References retrieved automatically from Web of Science and SciELO through information on FAPESP grants and their corresponding numbers as mentioned in the publications by the authors)
TEKLI, JOE; CHBEIR, RICHARD; TRAINA, AGMA J. M.; TRAINA, JR., CAETANO; FILETO, RENATO. Approximate XML structure validation based on document-grammar tree similarity. INFORMATION SCIENCES, v. 295, p. 258-302, . (10/00330-2)
TEKLI, JOE; CHBEIR, RICHARD. A novel XML document structure comparison framework based-on sub-tree commonalities and label semantics. Journal of Web Semantics, v. 11, p. 14-40, . (10/00330-2)
TEKLI, JOE; CHBEIR, RICHARD; TRAINA, AGMA J. M.; TRAINA, CAETANO, JR.. XML document-grammar comparison: related problems and applications. OPEN COMPUTER SCIENCE, v. 1, n. 1, p. 20-pg., . (10/00330-2)