DivDB: A System for Diversifying Query Results

Author(s):
Vieira, Marcos R. ; Razente, Humberto L. ; Barioni, Maria C. N. ; Hadjieleftheriou, Marios ; Srivastava, Divesh ; Traina, Caetano, Jr. ; Tsotras, Vassilis J.
Total Authors: 7
Document type: Journal article
Source: PROCEEDINGS OF THE VLDB ENDOWMENT; v. 4, n. 12, p. 4-pg., 2011-08-01.
Abstract

With the availability of very large databases, an exploratory query can easily lead to a vast answer set, typically ranked by each answer's relevance (e.g., top-k, tf-idf) to the user query. Navigating through such an answer set requires huge effort, and users often give up after perusing the first few answers, so interesting answers hidden further down the answer set can easily be missed. One approach to this problem is to present the user with the most diverse of the answers, based on some diversity criterion. In this demonstration we present DivDB, a system we built to provide query result diversification for both advanced and novice users. For experienced users, who may want to test the performance of existing and new algorithms, we provide an SQL-based extension for formulating queries with diversification. For novice users, who may be more interested in the result than in how to tune the various algorithms' parameters, the DivDB system allows the user to provide a "hint" to the optimizer on speed vs. quality of result. Moreover, novice users can use an interface to dynamically change the tradeoff value between relevance and diversity in the result, and thus visually inspect the result as they interact with this parameter. This feature is valuable to the end user because finding a good tradeoff value is a very hard task that depends on several variables (e.g., query parameters, evaluation algorithms, and dataset properties). In this demonstration we show a study of the DivDB system with two image databases that contain many images of the same object under different settings (e.g., different camera angles). We show how DivDB helps users to iteratively inspect diversification in the query result, without needing to know how to tune the many different parameters of the several existing algorithms in the DivDB system. (AU)
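
To make the relevance vs. diversity tradeoff described in the abstract concrete, the following is a minimal sketch of one common greedy heuristic (in the style of maximal marginal relevance). It is not taken from DivDB and does not reproduce any of its algorithms or its SQL extension; the function name `diversify`, the `tradeoff` parameter, and the use of Euclidean distance on points are assumptions made for illustration only.

```python
from math import dist  # Euclidean distance, Python 3.8+


def diversify(query, candidates, k, tradeoff):
    """Greedily pick up to k candidates (hypothetical helper, not DivDB's API).

    tradeoff = 0.0 ranks purely by closeness to the query (relevance);
    larger values reward distance from already selected items (diversity).
    """
    remaining = list(candidates)
    selected = []
    while remaining and len(selected) < k:
        def score(point):
            # Relevance: closer to the query is better, so negate the distance.
            relevance = -dist(query, point)
            # Diversity: distance to the nearest item selected so far.
            diversity = min((dist(point, s) for s in selected), default=0.0)
            return (1 - tradeoff) * relevance + tradeoff * diversity

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = (0.0, 0.0)
points = [(1.0, 0.0), (1.1, 0.0), (1.2, 0.0), (0.0, 3.0), (-4.0, 0.0)]
print(diversify(query, points, k=2, tradeoff=0.0))
print(diversify(query, points, k=2, tradeoff=0.5))
```

With `tradeoff=0.0` the two selected points are near-duplicates close to the query, while raising it to `0.5` swaps the second pick for a point far from the first. This is the kind of change a user would see when moving the tradeoff value interactively.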

FAPESP's process: 06/00336-5 - Incorporating relevance feedback into queries for similarity in database management systems
Grantee: Humberto Luiz Razente
Support Opportunities: Scholarships in Brazil - Doctorate