Digital Symposium Collection 2000  

 
 
 
 
 
 

 




















On the Complexity of the View-Selection Problem

Howard J. Karloff and Milena Mihail

  View Paper (PDF)  

Return to Views/Queries

Abstract
A commonly used and powerful technique for improving query response time over very large databases is to precompute ("Lmaterialize") frequently asked queries ("views"). The problem is to select an appropriate set of views, given a limited amount of resources. Harinarayan, Rajaraman and Ullman formalized this technique by proposing a framework in which queries are modeled by a weighted partial order, and selecting a set of views whose materialization minimizes the average query response time is equivalent to selecting a subset of nodes of the partial order that minimizes a suitably defined cost function. Because this problem is NPHard, the focus is on approximability and heuristics. Harinarayan, Rajaraman and Ullman proposed a greedy heuristic together with a "benefit" criterion to measure its performance; this heuristic and performance measure are used in several subsequent papers which generalize their work. We prove the following lower bounds: (a) The greedy heuristic of Harinarayan, Rajaraman and Ullman has query response time at least n/12 times optimal for infinitely many n. (Compare this to the fact that no algorithm, regardless of how naive it is, ever has query response time exceeding n times optimal.) (b) If PfNP, then for every e > 0, every polynomialtime approximation algorithm for the view-selection problem will output solutions with query response time at least n’-’ times optimal, for infinitely many n, even for partial orders with bounded degrees and bounded depth. (c) A similar result applies even if we generously allow the algorithm to materialize ak views, LY a constant, and compare its performance to the optimal achievable when k views are chosen. Our results prove (if P#NP) that the view- selection problem is essentially inapproximable for general partial orders (the "benefit" performance measure of Harinarayan, Rajaraman and Ullman provides no competitiveness guarantee against the optimal solution). Hence studies of the Harinarayan, Rajaraman and Ullman framework and its generalizations should focus on special cases of practical significance, such as hypercubes, and on experimental comparison of heuristics.


References

Note: References link to DBLP on the Web.

[1]
Uriel Feige : A Threshold of ln n for Approximating Set Cover (Preliminary Version). STOC 1996 : 314-318
[2]
M. R. Garey , David S. Johnson : Computer and Intractability: A Guide to NP-Completeness. W. H. Freeman 1979, ISBN 0-7167-1044-7
[3]
Himanshu Gupta : Selection of Views to Materialize in a Data Warehouse. ICDT 1997 : 98-112
[4]
Himanshu Gupta , Venky Harinarayan , Anand Rajaraman , Jeffrey D. Ullman : Index Selection for OLAP. ICDE 1997 : 208-219
[5]
Himanshu Gupta , Inderpal Singh Mumick : Selection of Views to Materialize Under a Maintenance Cost Constraint. ICDT 1999 : 453-470
[6]
Venky Harinarayan , Anand Rajaraman , Jeffrey D. Ullman : Implementing Data Cubes Efficiently. SIGMOD Conf. 1996 : 205-216
[7]
...
[8]
Carsten Lund , Mihalis Yannakakis : On the Hardness of Approximating Minimization Problems. JACM 41(5) : 960-981(1994)
[9]
Jeffrey D. Ullman : Efficient Implementation of Data Cubes Via Materialized Views. KDD 1996 : 386-388

BIBTEX

@inproceedings{DBLP:conf/pods/KarloffM99,
  author    = {Howard J. Karloff and
                Milena Mihail},
   title     = {On the Complexity of the View-Selection Problem},
   booktitle = {Proceedings of the Eighteenth ACM SIGACT-SIGMOD-SIGART Symposium
                on Principles of Database Systems, May 31 - June 2, 1999, Philadelphia,
                Pennsylvania},
   publisher = {ACM Press},
   year      = {1999},
   isbn      = {1-58113-062-7},
   pages     = {167-173},
   crossref  = {DBLP:conf/pods/99},
   bibsource = {DBLP, http://dblp.uni-trier.de} } },


























Copyright(C) 2000 ACM