Querying Priced Information in Databases: The Conjunctive Case

Carmo, Renato; Feder, Tomas; Kohayakawa, Yoshiharu; Laber, Eduardo; Motwani, Rajeev; O'Callaghan, Liadan; Panigrahy, Rina; Thomas, Dilys

Full text
Author(s):	Carmo, Renato ; Feder, Tomas ; Kohayakawa, Yoshiharu ; Laber, Eduardo ; Motwani, Rajeev ; O'Callaghan, Liadan ; Panigrahy, Rina ; Thomas, Dilys Total Authors: 8
Document type:	Journal article
Source:	ACM TRANSACTIONS ON ALGORITHMS; v. 3, n. 1, p. 22-pg., 2007-02-01.
Abstract
Query optimization that involves expensive predicates has received considerable attention in the database community. Typically, the output to a database query is a set of tuples that satisfy certain conditions, and, with expensive predicates, these conditions may be computationally costly to verify. In the simplest case, when the query looks for the set of tuples that simultaneously satisfy k expensive predicates, the problem reduces to ordering the evaluation of the predicates so as to minimize the time to output the set of tuples comprising the answer to the query. We study different cases of the problem: the sequential case, in which a single processor is available to evaluate the predicates, and the distributed case, in which there are k processors available, each dedicated to a different attribute (column) of the database, and there is no communication cost between the processors. For the sequential case, we give a simple and fast deterministic k-approximation algorithm, and prove that k is the best possible approximation ratio for a deterministic algorithm, even if exponential time algorithms are allowed. We also propose a randomized, polynomial time algorithm with expected approximation ratio 1 + root 2/2 approximate to 1.707 for k = 2, and prove that 3/2 is the best possible expected approximation ratio for randomized algorithms. We also show that given 0 <= epsilon <= 1, no randomized algorithm achieves approximation ratio smaller than 1 + epsilon with probability larger than (1 + epsilon)/2. For the distributed case, we consider two different models: the preemptive model, in which a processor is allowed to interrupt the evaluation of a predicate, and the nonpreemptive model, in which the evaluation of a predicate must be completed once started. We show that k is the best possible approximation ratio for a deterministic algorithm, even if exponential time algorithms are allowed. For the preemptive model, we introduce a polynomial time k-approximation algorithm. For the nonpreemptive model, we introduce a polynomial time O(k log(2) k)-approximation algorithm. (AU)

FAPESP's process:	03/09925-5 - Foundations of computer science: combinatory algorithms and discrete structures
Grantee:	Yoshiharu Kohayakawa
Support Opportunities:	PRONEX Research - Thematic Grants

Short URL