
Active Storage for Large-Scale Data Mining and Multimedia.

Erik Riedel, Garth A. Gibson, Christos Faloutsos: Active Storage for Large-Scale Data Mining and Multimedia. VLDB 1998: 62-73
@inproceedings{DBLP:conf/vldb/RiedelGF98,
  author    = {Erik Riedel and
               Garth A. Gibson and
               Christos Faloutsos},
  editor    = {Ashish Gupta and
               Oded Shmueli and
               Jennifer Widom},
  title     = {Active Storage for Large-Scale Data Mining and Multimedia},
  booktitle = {VLDB'98, Proceedings of 24th International Conference on Very
               Large Data Bases, August 24-27, 1998, New York City, New York,
               USA},
  publisher = {Morgan Kaufmann},
  year      = {1998},
  isbn      = {1-55860-566-5},
  pages     = {62-73},
  ee        = {db/conf/vldb/RiedelGF98.html},
  crossref  = {DBLP:conf/vldb/98},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

The increasing performance and decreasing cost of processors and memory are causing system intelligence to move into peripherals from the CPU. Storage system designers are using this trend toward "excess" compute power to perform more complex processing and optimizations inside storage devices. To date, such optimizations have been at relatively low levels of the storage protocol. At the same time, trends in storage density, mechanics, and electronics are eliminating the bottleneck in moving data off the media and putting pressure on interconnects and host processors to move data more efficiently. We propose a system called Active Disks that takes advantage of processing power on individual disk drives to run application-level code. Moving portions of an application's processing to execute directly at disk drives can dramatically reduce data traffic and take advantage of the storage parallelism already present in large systems today. We discuss several types of applications that would benefit from this capability with a focus on the areas of database, data mining, and multimedia. We develop an analytical model of the speedups possible for scan-intensive applications in an Active Disk system. We also experiment with a prototype Active Disk system using relatively low-powered processors in comparison to a database server system with a single, fast processor. Our experiments validate the intuition in our model and demonstrate speedups of 2x on 10 disks across four scan-based applications. The model promises linear speedups in disk arrays of hundreds of disks, provided the application data is large enough.

Copyright © 1998 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.


Online Paper

ACM SIGMOD DiSC

CDROM Version: Load the CDROM "DiSC, Volume 1 Number 1" and ...

ACM SIGMOD Anthology

DVD Version: Load "ACM SIGMOD Anthology DVD 1" and ...

Printed Edition

Ashish Gupta, Oded Shmueli, Jennifer Widom (Eds.): VLDB'98, Proceedings of 24th International Conference on Very Large Data Bases, August 24-27, 1998, New York City, New York, USA. Morgan Kaufmann 1998, ISBN 1-55860-566-5

References

[Acharya98]
...
[Agrawal95]
Rakesh Agrawal, Ramakrishnan Srikant: Fast Algorithms for Mining Association Rules in Large Databases. VLDB 1994: 487-499
[Agrawal96]
Rakesh Agrawal, John C. Shafer: Parallel Mining of Association Rules. IEEE Trans. Knowl. Data Eng. 8(6): 962-969(1996)
[Almaden97]
...
[Arpaci Dusseau97]
Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, David E. Culler, Joseph M. Hellerstein, David A. Patterson: High-Performance Sorting on Networks of Workstations. SIGMOD Conference 1997: 243-254
[Arya94]
Manish Arya, William F. Cody, Christos Faloutsos, Joel E. Richardson, Arthur Toya: QBISM: Extending a DBMS to Support 3D Medical Images. ICDE 1994: 314-325
[Barclay97]
...
[Berchtold96]
Stefan Berchtold, Daniel A. Keim, Hans-Peter Kriegel: The X-tree : An Index Structure for High-Dimensional Data. VLDB 1996: 28-39
[Berchtold97]
Stefan Berchtold, Christian Böhm, Daniel A. Keim, Hans-Peter Kriegel: A Cost Model For Nearest Neighbor Search in High-Dimensional Data Space. PODS 1997: 78-86
[Bershad95]
Brian N. Bershad, Stefan Savage, Przemyslaw Pardyak, Emin Gün Sirer, Marc E. Fiuczynski, David Becker, Craig Chambers, Susan J. Eggers: Extensibility, Safety and Performance in the SPIN Operating System. SOSP 1995: 267-284
[Bitton88]
Dina Bitton, Jim Gray: Disk Shadowing. VLDB 1988: 331-338
[Blelloch98]
...
[Boral83]
Haran Boral, David J. DeWitt: Database Machines: An Idea Whose Time Passed? A Critique of the Future of Database Machines. IWDM 1983: 166-187
[Cao94]
Pei Cao, Swee Boon Lim, Shivakumar Venkataraman, John Wilkes: The TickerTAIP Parallel RAID Architecture. ACM Trans. Comput. Syst. 12(3): 236-269(1994)
[DeWitt81]
David J. DeWitt, Paula B. Hawthorn: A Performance Evaluation of Data Base Machine Architectures (Invited Paper). VLDB 1981: 199-214
[DeWitt85]
David J. DeWitt, Robert H. Gerber: Multiprocessor Hash-Based Join Algorithms. VLDB 1985: 151-164
[DeWitt91]
David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider: Parallel Sorting on a Shared-Nothing Architecture using Probabilistic Splitting. PDIS 1991: 280-291
[DeWitt92]
David J. DeWitt, Jim Gray: Parallel Database Systems: The Future of High Performance Database Systems. Commun. ACM 35(6): 85-98(1992)
[Drapeau94]
Ann L. Drapeau, Ken Shirriff, John H. Hartman, Ethan L. Miller, Srinivasan Seshan, Randy H. Katz, Ken Lutz, David A. Patterson, Edward K. Lee, Peter M. Chen, Garth A. Gibson: RAID-II: A High-Bandwidth Network File Server. ISCA 1994: 234-244
[Faloutsos94]
Christos Faloutsos, Ron Barber, Myron Flickner, Jim Hafner, Wayne Niblack, Dragutin Petkovic, William Equitz: Efficient and Effective Querying by Image Content. J. Intell. Inf. Syst. 3(3/4): 231-262(1994)
[Faloutsos96]
...
[Flickner95]
Myron Flickner, Harpreet S. Sawhney, Jonathan Ashley, Qian Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee, Dragutin Petkovic, David Steele, Peter Yanker: Query by Image and Video Content: The QBIC System. IEEE Computer 28(9): 23-32(1995)
[Gibson97]
Garth A. Gibson, David Nagle, Khalil Amiri, Fay W. Chang, Eugene M. Feinberg, Howard Gobioff, Chen Lee, Berend Ozceri, Erik Riedel, David Rochberg, Jim Zelenka: File Server Scaling with Network-Attached Secure Disks. SIGMETRICS 1997: 272-284
[Gibson98]
...
[Gosling96]
James Gosling, William N. Joy, Guy L. Steele Jr.: The Java™ Language Specification. Addison-Wesley 1996, ISBN 0-201-63451-1
[Gray97]
...
[Grochowski96]
...
[Hsiao79]
...
[Keeton98]
...
[Kitsuregawa83]
Masaru Kitsuregawa, Hidehiko Tanaka, Tohru Moto-Oka: Application of Hash to Data Base Machine and Its Architecture. New Generation Comput. 1(1): 63-74(1983)
[Kotz94]
David Kotz: Disk-directed I/O for MIMD Multiprocessors. OSDI 1994: 61-74
[Lee96]
Edward K. Lee, Chandramohan A. Thekkath: Petal: Distributed Virtual Disks. ASPLOS 1996: 84-92
[Livny87]
Miron Livny, Setrag Khoshafian, Haran Boral: Multi-Disk Management Algorithms. SIGMETRICS 1987: 69-77
[Necula96]
George C. Necula, Peter Lee: Safe Kernel Extensions Without Run-Time Checking. OSDI 1996: 229-243
[Ozharahan75]
...
[Patterson88]
David A. Patterson, Garth A. Gibson, Randy H. Katz: A Case for Redundant Arrays of Inexpensive Disks (RAID). SIGMOD Conference 1988: 109-116
[Patterson95]
R. Hugo Patterson, Garth A. Gibson, Eka Ginting, Daniel Stodolsky, Jim Zelenka: Informed Prefetching and Caching. SOSP 1995: 79-95
[Quest97]
...
[Riedel97]
...
[Romer96]
Theodore H. Romer, Dennis Lee, Geoffrey M. Voelker, Alec Wolman, Wayne A. Wong, Jean-Loup Baer, Brian N. Bershad, Henry M. Levy: The Structure and Performance of Interpreters. ASPLOS 1996: 150-159
[Ruemmler91]
...
[Seagate97]
...
[Small95]
...
[Smith79]
...
[Smith95]
...
[StorageTek94]
...
[Su75]
Stanley Y. W. Su, G. Jack Lipovski: CASSM: A Cellular System for Very Large Data Bases. VLDB 1975: 456-472
[TPC98]
...
[TriCore97]
...
[Turley96]
...
[VanMeter96]
...
[Virage98]
...
[Wactlar96]
Howard D. Wactlar, Takeo Kanade, Michael A. Smith, Scott M. Stevens: Intelligent Access to Digital Video: Informedia Project. IEEE Computer 29(5): 46-52(1996)
[Wahbe93]
Robert Wahbe, Steven Lucco, Thomas E. Anderson, Susan L. Graham: Efficient Software-Based Fault Isolation. SOSP 1993: 203-216
[Welling98]
...
[Wilkes95]
John Wilkes, Richard A. Golding, Carl Staelin, Tim Sullivan: The HP AutoRAID Hierarchical Storage System. SOSP 1995: 96-108
[Yao85]
Andrew Chi-Chih Yao, F. Frances Yao: A General Approach to d-Dimensional Geometric Queries (Extended Abstract). STOC 1985: 163-168

Referenced by

  1. Erik Riedel, Christos Faloutsos, Gregory R. Ganger, David Nagle: Data Mining on an OLTP System (Nearly) for Free. SIGMOD Conference 2000: 13-21
  2. Felipe Cariño, William O'Connell, John Burgess, Joel H. Saltz: Active Storage Hierarchy, Database Systems and Applications - Socratic Exegesis. VLDB 1999: 611-614
  3. Kimberly Keeton, David A. Patterson, Joseph M. Hellerstein: A Case for Intelligent Disks (IDISKs). SIGMOD Record 27(3): 42-52(1998)
VLDB Proceedings: Copyright © by VLDB Endowment,
ACM SIGMOD Anthology: Copyright © by ACM (info@acm.org), Corrections: anthology@acm.org
DBLP: Copyright © by Michael Ley (ley@uni-trier.de), last change: Sat May 16 23:46:20 2009