Efficient and Transparent Application Recovery in Client-Server Information Systems
David Lomet (Microsoft Research)
Gerhard Weikum (University of the Saarland, Germany)

Database systems recover persistent data, providing high database availability. However, database applications, typically residing on client or middle-tier application-server machines, may lose work because of a server failure. This prevents the masking of server failures from the human user and substantially degrades application availability. This paper aims to enable high application availability with an integrated method for database server recovery and transparent application recovery in a client-server system. The approach, based on application message logging, is similar to earlier work on distributed system fault tolerance. However, we exploit advanced database logging and recovery techniques and request/reply messaging properties to significantly improve efficiency. Forced log I/Os, frequently required by other methods, are usually avoided. Restart time, for both failed server and failed client, is reduced by checkpointing and log truncation. Our method ensures that a server can recover independently of clients. A client may reduce logging overhead in return for dependency on server availability during client restart.