Logical Logging to Extend Recovery to New Domains

David Lomet*       Mark Tuttle
Microsoft Research       Compaq Cambridge Research Lab
lomet@microsoft.com       tuttle@crl.dec.com

Abstract

Recovery can be extended to new domains at reduced logging cost by exploiting "logical" log operations. During recovery, a logged logical operation may read data values from any recoverable object, not solely from values on the log or from the updated object. Hence, during normal operation, we needn't log these values, a substantial saving. In [8], we developed a redo recovery theory that dealt with general log operations and proved that the stable database remains recoverable if it can be explained in terms of an installation graph. This graph was used to derive a write graph that drives cache management by determining a flush order for cached objects that ensures that the objects read by log operations at recovery have the appropriate values. In this paper we show that a more refined write graph is possible, permitting more flexible cache management that can atomically flush smaller sets of objects. Based on the new write graph, we show how: (i) the cache manager can inject its own operations to break up atomic flush sets; and (ii) the recovery process can avoid redoing operations whose effects are no longer needed, by exploiting generalized recovery LSNs. These advances permit us to easily deal with logical log operations. They thus make it more cost/effective to extend database style recovery techniques to enable recovery of, e.g., files and applications.