This paper discusses strategies for query optimization over semi-structured data. The "hybrid plan" is a very nice idea: evaluation starts from both the top and the bottom of the path and meets somewhere in the middle. This can be very efficient when too many paths go out of the root, or when there are too many leaf nodes but only a small number of them satisfy the condition.
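To make the meet-in-the-middle idea concrete, here is a minimal sketch in Python. It is not the paper's algorithm or Lore's implementation; the edge-triple graph model, the function names (`build_indexes`, `top_down`, `bottom_up`, `hybrid`), the `split` parameter and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict

def build_indexes(edges):
    """Build forward and reverse lookups from (parent, label, child) triples."""
    children = defaultdict(list)  # (parent, label) -> [child, ...]
    parents = defaultdict(list)   # (child, label)  -> [parent, ...]
    for parent, label, child in edges:
        children[(parent, label)].append(child)
        parents[(child, label)].append(parent)
    return children, parents

def top_down(children, start, labels):
    """Follow the labels downward from start; return the nodes reached."""
    frontier = {start}
    for label in labels:
        frontier = {c for n in frontier for c in children.get((n, label), [])}
    return frontier

def bottom_up(parents, leaves, labels):
    """Walk upward from the given leaves along labels (given in path order).

    Returns a dict mapping each ancestor reached to the leaves below it.
    """
    reach = {leaf: {leaf} for leaf in leaves}
    for label in reversed(labels):
        up = defaultdict(set)
        for node, found in reach.items():
            for p in parents.get((node, label), []):
                up[p] |= found
        reach = up
    return reach

def hybrid(children, parents, root, path, matching_leaves, split):
    """Evaluate path[:split] top-down and path[split:] bottom-up, then join."""
    upper = top_down(children, root, path[:split])
    lower = bottom_up(parents, matching_leaves, path[split:])
    result = set()
    for node in upper.intersection(lower):  # nodes where the two halves meet
        result |= lower[node]
    return result

# Hypothetical toy data: root.dept.emp.name = "Ann"
edges = [
    ("root", "dept", "d1"), ("root", "dept", "d2"),
    ("d1", "emp", "e1"), ("d1", "emp", "e2"), ("d2", "emp", "e3"),
    ("e1", "name", "n1"), ("e2", "name", "n2"), ("e3", "name", "n3"),
]
values = {"n1": "Ann", "n2": "Bob", "n3": "Ann"}
children, parents = build_indexes(edges)
ann_leaves = {n for n, v in values.items() if v == "Ann"}  # stand-in for a value index lookup
answer = hybrid(children, parents, "root", ["dept", "emp", "name"], ann_leaves, split=1)
```

With `split=0` the plan is purely bottom-up and with `split=len(path)` it is purely top-down; every split returns the same answer, so choosing the split point only affects how many nodes are visited along the way.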
The idea is to minimize the number of nodes that must be searched or joined along the path. This strategy works well when the query condition finds an element by name or by an "equal" condition, because then an element can be selected (or thrown away) by looking only at the value of the element itself. Now consider a query with an "unequal" condition that defines a relation between nodes, for example, finding pairs of nodes in which the grandparent's value is larger than the grandchild's value. It would be relatively harder to use this strategy for such a query.
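The difference between the two kinds of condition can be sketched as follows. This is only an illustration under simplifying assumptions: the data is a tree with a single parent per node, and the names `local_matches`, `grandparent_greater`, `parent_of` and `node_values` are hypothetical.

```python
def local_matches(node_values, target):
    """Equality condition: each node is decided from its own value alone."""
    return {n for n, v in node_values.items() if v == target}

def grandparent_greater(parent_of, node_values):
    """Relational condition: each decision needs two nodes two edges apart."""
    pairs = set()
    for node, v in node_values.items():
        gp = parent_of.get(parent_of.get(node))  # None if there is no grandparent
        if gp is not None and gp in node_values and node_values[gp] > v:
            pairs.add((gp, node))
    return pairs

# Hypothetical chain a -> b -> c -> d with a numeric value on every node
parent_of = {"b": "a", "c": "b", "d": "c"}
node_values = {"a": 10, "b": 3, "c": 5, "d": 7}

equal_hits = local_matches(node_values, 5)
related_pairs = grandparent_greater(parent_of, node_values)
```

In the first function a value lookup is enough to prune a node, which is what lets a bottom-up or hybrid plan start from a small set of leaves. In the second, no node can be discarded before its partner has been found, so the path between them has to be traversed first.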
This strategy is also based on the Lore index system, with its value index, label index, edge index and path index. All of them enable fast and efficient access to the data, but they can make modifications to the data relatively complex and inefficient. The strategy would therefore work better on static data than on data that changes very often. Alternatively, when a modification is needed, the part of the index related to the modified data could be removed and rebuilt.
Copyright © 2000 by the author(s). Review published with permission.