ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

Accessing Data Cubes along Complex Dimensions.

Yuping Yang, Mukesh Singhal: Accessing Data Cubes along Complex Dimensions. DOLAP 1999: 73-78
@inproceedings{DBLP:conf/dolap/YangS99,
  author    = {Yuping Yang and
               Mukesh Singhal},
  title     = {Accessing Data Cubes along Complex Dimensions},
  booktitle = {DOLAP '99, ACM Second International Workshop on Data Warehousing
               and OLAP, November 6, 1999, Kansas City, Missouri, USA, Proceedings},
  publisher = {ACM},
  year      = {1999},
  pages     = {73-78},
  ee        = {db/conf/dolap/YangS99.html, http://doi.acm.org/10.1145/319757.319794},
  crossref  = {DBLP:conf/dolap/99},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}
BibTeX

Abstract

In a data warehouse, data cubes are accessed through their dimensions. If dimensions are numerical, because numerical data can be clustered or sorted, fast access methods such as binary search or B+ trees can be applied. However, complex attributes such as keyword sets of document contents are not easily sorted or clustered. Although it is highly desirable that documents can be searched through their sets of keywords.

Signature index is known for its ability to search along complex attributes. We propose a new indexing structure, dimensional signature index (DSI), for fast query processing in data cubes. DSI is particularly suitable for accessing data in data cubes through complex dimensions.

Through a mathematical analysis, we found that if one signature index (feature index) is built for each dimension of the data cube, if the size of all feature indices is equal to the size of a large signature index for the entire data cube as a flat file, and if a query execution involves all dimensions of a data cube, the search cost in all these feature indices is the same as the search cost in the large signature index for the entire data cube.

The significance of this discovery is that usually a query does not involve all dimensions of a data cube. By making one feature index for each dimension, only those feature indices involved in the query predicates need to be accessed. On average, this represents significant faster query executions than using a large signature file for the entire data cube.

The use of DSI scheme does not exclude the use of other fast signature index schemes. Each feature index in DSI can also use any of the previously proposed fast signature indices (S-trees, multi-leveled, frame-sliced, etc.) to achieve even faster access speed.

Copyright © 1999 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 2 Issue 4, CIKM, DOLAP, GIS, SIGFIDET, ..." and ... DVD Version: Load ACM SIGMOD Anthology DVD 1" and ... BibTeX

Printed Edition

DOLAP '99, ACM Second International Workshop on Data Warehousing and OLAP, November 6, 1999, Kansas City, Missouri, USA, Proceedings. ACM 1999
Contents BibTeX

Online Edition

Citation Page BibTeX

References

[1]
...
[2]
Walter W. Chang, Hans-Jörg Schek: A Signature Access Method for the Starburst Database System. VLDB 1989: 145-153 BibTeX
[3]
Uwe Deppisch: S-Tree: A Dynamic Balanced Signature Index for Office Retrieval. SIGIR 1986: 77-87 BibTeX
[4]
Christos Faloutsos, Stavros Christodoulakis: Description and Performance Analysis of Signature File Methods for Office Filing. ACM Trans. Inf. Syst. 5(3): 237-257(1987) BibTeX
[5]
Christos Faloutsos: Signature-Based Text Retrieval Methods: A Survey. IEEE Data Eng. Bull. 13(1): 25-32(1990) BibTeX
[6]
Li Fan, Pei Cao, Jussara M. Almeida, Andrei Z. Broder: Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol. SIGCOMM 1998: 254-265 BibTeX
[7]
Himanshu Gupta, Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman: Index Selection for OLAP. ICDE 1997: 208-219 BibTeX
[8]
Himanshu Gupta: Selection of Views to Materialize in a Data Warehouse. ICDT 1997: 98-112 BibTeX
[9]
Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman: Implementing Data Cubes Efficiently. SIGMOD Conference 1996: 205-216 BibTeX
[10]
Yoshiharu Ishikawa, Hiroyuki Kitagawa, Nobuo Ohbo: Evaluation of Signature Files as Set Access Facilities in OODBs. SIGMOD Conference 1993: 247-256 BibTeX
[11]
Dik Lun Lee, Chun-Wu Roger Leng: Partitioned Signature Files: Design Issues and Performance Evaluation. ACM Trans. Inf. Syst. 7(2): 158-180(1989) BibTeX
[12]
Dik Lun Lee, Young Man Kim, Gaurav Patel: Efficient Signature File Methods for Text Retrieval. IEEE Trans. Knowl. Data Eng. 7(3): 423-435(1995) BibTeX
[13]
Zheng Lin, Christos Faloutsos: Frame-Sliced Signature Files. IEEE Trans. Knowl. Data Eng. 4(3): 281-289(1992) BibTeX
[14]
John L. Pfaltz, William J. Berman, Edgar M. Cagley: Partial-Match Retrieval Using Indexed Descriptor Files. Commun. ACM 23(9): 522-528(1980) BibTeX
[15]
Charles S. Roberts: Partial-Match Via the Method of Superimposed Codes. Proceedings of the IEEE 67(12): 1624-1642(1979) BibTeX
[16]
Ron Sacks-Davis, Kotagiri Ramamohanarao: A two level superimposed coding scheme for partial match retrieval. Inf. Syst. 8(4): 273-289(1983) BibTeX
[17]
Ron Sacks-Davis, Alan J. Kent, Kotagiri Ramamohanarao: Multikey Access Methods Based on Superimposed Coding Techniques. ACM Trans. Database Syst. 12(4): 655-696(1987) BibTeX
[18]
...
[19]
...
[20]
...
[21]
Yuping Yang, Mukesh Singhal: A New Access Index for Fast Execution of Conjunctive Queries over Text Data. IDEAS 1999: 248-253 BibTeX
BibTeX
ACM SIGMOD Anthology - DBLP: [Home | Search: Author, Title | Conferences | Journals]
DOLAP 1999 Proceedings, ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:07:21 2009