@inproceedings{DBLP:conf/ssdbm/BurnettCT83,
author = {Robert A. Burnett and
Paula J. Cowley and
James J. Thomas},
editor = {Roy Hammond and
John L. McCarthy},
title = {Management and Display of Data Analysis Environments for Large
Data Sets},
booktitle = {Proceedings of the Second International Workshop on Statistical
Database Management, Los Altos, California, USA, September 27-29,
1983},
publisher = {Lawrence Berkeley Laboratory},
year = {1983},
pages = {22-31},
ee = {db/conf/ssdbm/BurnettCT83.html},
crossref = {DBLP:conf/ssdbm/83},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Data analysis is typically an iterative process in which the choice of the next analysis operation is largely determined by the results of previous operations on the data set. With large data sets, many analysis paths may be explored before meaningful results are obtained. Along each path, the analyst creates a sequence of "data analysis environments," each environment being a frame or "snapshot" of the data set and associated descriptions, conditions, models, and analysis results. The data analysis environment may be changed incrementally through temporary data modifications, subsets, samples, or statistical operations; or the analyst may wish to restore the conditions of a previous environment as a starting point from which a new analysis path can be generated. Existing analysis systems, however, lack facilities to maintain, save, or restore all of the components required to completely describe or reconstruct a data analysis environment.
This paper describes ongoing research at Pacific Northwest Laboratory (PNL) in data management and display techniques for multiple data analysis environments. Specifically, research is being conducted in four major areas: (1) the development of a model of the data analysis process incorporating the concepts of data analysis environments; (2) the design and use of data modification definitions (differential files) to represent multiple versions of a large data base; (3) the use of data dictionaries/directories to manage, describe, and control multiple data analysis environments; and (4) the application of graphical display and interaction techniques to the examination and selection of data analysis environments. The results of these research efforts will be integrated to provide a new dimension in interactive data analysis.
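The idea in area (2), representing many versions of a large data base as differential files over one shared base, can be illustrated with a minimal sketch. This is not the PNL design, which the abstract does not specify; the `Environment` class, its fields, and the sample records are all hypothetical and chosen only for illustration. Each environment stores just its own changes plus a link to its parent, so restoring an earlier environment amounts to deriving a new branch from it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

Record = Dict[str, float]


@dataclass
class Environment:
    """Hypothetical data analysis environment: a differential file over a parent."""
    name: str
    parent: Optional["Environment"] = None
    updates: Dict[int, Record] = field(default_factory=dict)  # modified or added records
    deletions: Set[int] = field(default_factory=set)          # records removed, e.g. by subsetting

    def derive(self, name, updates=None, deletions=None) -> "Environment":
        """Start a new environment that records only its changes relative to this one."""
        return Environment(name, self, dict(updates or {}), set(deletions or ()))

    def materialize(self, base: Dict[int, Record]) -> Dict[int, Record]:
        """Rebuild this environment's view by replaying differentials from the root down."""
        chain: List["Environment"] = []
        env: Optional["Environment"] = self
        while env is not None:
            chain.append(env)
            env = env.parent
        view = dict(base)  # shallow copy: the shared base data set is never modified
        for env in reversed(chain):
            for key in env.deletions:
                view.pop(key, None)
            view.update(env.updates)
        return view


# Assumed toy data: record id -> fields.
base = {1: {"x": 1.0}, 2: {"x": 2.0}, 3: {"x": 300.0}}

root = Environment("raw")
cleaned = root.derive("cleaned", updates={3: {"x": 3.0}})  # temporary data modification
subset = cleaned.derive("subset", deletions={1})           # one analysis path
alt = cleaned.derive("alt", updates={2: {"x": 20.0}})      # new path from a restored environment

subset_view = subset.materialize(base)
alt_view = alt.materialize(base)
```

Because `subset` and `alt` both point back to `cleaned`, the two analysis paths share the correction to record 3 without copying the base data. A real system would also have to track the descriptions, conditions, models, and results that the abstract lists as part of an environment, which this sketch omits.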