Similarity Query Processing Using Disk Arrays
Apostolos N. Papadopoulos (Aristotle University of Thessaloniki)
Yannis Manolopoulos (Aristotle University of Thessaloniki)
Similarity queries are fundamental operations that are used extensively in
many modern applications, whereas disk arrays are powerful storage media of
increasing importance. The basic trade-off in similarity query processing
in such a system is that increased parallelism leads to higher resource
consumption and lower throughput, whereas low parallelism leads to higher
response times. Here, we propose a technique that carefully examines the
currently available data in order to exploit parallelism up to a point,
while retaining low response times during query processing.
The underlying access method is a variation of the R*-tree, which is
distributed among the components of a disk array, and the system is
modeled using event-driven simulation. The experimental results
demonstrate that the proposed approach outperforms, by factors, both a
previous branch-and-bound algorithm and a greedy algorithm that maximizes
parallelism. Moreover, a comparison of the proposed algorithm with a
hypothetical optimal one (with respect to the number of disk accesses)
shows that the former is, on average, two times slower than the latter.
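
The trade-off described above can be illustrated with a small sketch. It is not the algorithm proposed in the paper, and it models neither an R*-tree nor a disk array: it assumes a toy in-memory tree built by sorting and chunking 2-D points, and a hypothetical parameter `width` that caps how many nodes are fetched together in one round. The number of rounds stands in for response time and the number of node accesses for resource consumption. With `width=1` the search behaves like a sequential best-first branch-and-bound; with an unbounded `width` it behaves like a greedy search that fetches every eligible node at once.

```python
import heapq
import math
import random


class Node:
    """Toy tree node (an assumption, not an R*-tree): leaves hold points."""

    def __init__(self, points=(), children=()):
        self.points = list(points)
        self.children = list(children)
        corners = self.points + [c for ch in self.children
                                 for c in (ch.box[:2], ch.box[2:])]
        xs = [c[0] for c in corners]
        ys = [c[1] for c in corners]
        self.box = (min(xs), min(ys), max(xs), max(ys))  # bounding rectangle


def build(points, leaf_size=8, fanout=4, depth=0):
    """Build the toy tree by sorting on alternating axes and chunking."""
    if len(points) <= leaf_size:
        return Node(points=points)
    pts = sorted(points, key=lambda p: p[depth % 2])
    step = math.ceil(len(pts) / fanout)
    return Node(children=[build(pts[i:i + step], leaf_size, fanout, depth + 1)
                          for i in range(0, len(pts), step)])


def mindist(box, q):
    """Smallest possible distance from query q to any point inside box."""
    dx = max(box[0] - q[0], 0.0, q[0] - box[2])
    dy = max(box[1] - q[1], 0.0, q[1] - box[3])
    return math.hypot(dx, dy)


def knn(root, q, k, width):
    """Exact k-NN search that fetches at most `width` nodes per round.

    Returns (neighbors, rounds, accesses); neighbors are (distance, point).
    """
    best = []                      # max-heap of (-distance, point)
    frontier = [(0.0, 0, root)]    # min-heap of (mindist, tiebreak, node)
    tiebreak = 1
    rounds = accesses = 0
    while frontier:
        # Distance of the current k-th neighbor; infinite until k are known.
        bound = -best[0][0] if len(best) == k else math.inf
        batch = []
        while frontier and len(batch) < width and frontier[0][0] <= bound:
            batch.append(heapq.heappop(frontier)[2])
        if not batch:
            break                  # every remaining node is too far away
        rounds += 1                # one round of parallel fetches
        accesses += len(batch)
        for node in batch:
            for p in node.points:
                d = math.dist(p, q)
                if len(best) < k:
                    heapq.heappush(best, (-d, p))
                elif d < -best[0][0]:
                    heapq.heapreplace(best, (-d, p))
            for child in node.children:
                heapq.heappush(frontier,
                               (mindist(child.box, q), tiebreak, child))
                tiebreak += 1
    return sorted((-nd, p) for nd, p in best), rounds, accesses


if __name__ == "__main__":
    random.seed(0)
    data = [(random.random(), random.random()) for _ in range(2000)]
    tree = build(data)
    for w in (1, 4, math.inf):
        _, rounds, accesses = knn(tree, (0.5, 0.5), 10, w)
        print(f"width={w}: rounds={rounds}, accesses={accesses}")
```

Every setting of `width` returns the same exact neighbors; only the balance between rounds and accesses changes. The sequential setting never accesses more nodes than a wider one on the same tree, while a wider setting can finish in fewer rounds at the price of fetching nodes that later turn out to be unnecessary.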
For more information about the authors, visit the Data Engineering Research
Group of the Department of Informatics, Aristotle University of
Thessaloniki, Greece.