The first major benefit of using WAL is a
significantly reduced number of disk writes, because only the log
file needs to be flushed to disk at the time of transaction
commit, rather than every data file changed by the transaction.
In multiuser environments, commits of many transactions
may be accomplished with a single fsync of
the log file. Furthermore, the log file is written sequentially,
and so the cost of syncing the log is much less than the cost of
flushing the data pages. This is especially true for servers
handling many small transactions touching different parts of the data
store.
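The effect of committing many transactions with one fsync can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the TinyLog class, the commit_batch function, and the record format are hypothetical and do not reflect EnterpriseDB's actual WAL format or code.

```python
import os
import tempfile


class TinyLog:
    """Hypothetical append-only log; not EnterpriseDB's actual WAL format."""

    def __init__(self, path):
        # Open for appending in binary mode; writes always go to the end.
        self._f = open(path, "ab")
        self.fsync_calls = 0

    def append_commit(self, txn_id):
        # Sequential append only: no seeking into scattered data files.
        self._f.write(f"COMMIT {txn_id}\n".encode("ascii"))

    def flush(self):
        # Push Python's buffer to the OS, then ask the OS to write to disk.
        self._f.flush()
        os.fsync(self._f.fileno())
        self.fsync_calls += 1

    def close(self):
        self._f.close()


def commit_batch(log, txn_ids):
    """Write commit records for several transactions, then sync once."""
    for txn_id in txn_ids:
        log.append_commit(txn_id)
    log.flush()  # a single fsync covers every commit in the batch


with tempfile.TemporaryDirectory() as d:
    log = TinyLog(os.path.join(d, "wal.log"))
    commit_batch(log, [101, 102, 103])
    print(log.fsync_calls)  # 1
    log.close()
```

Three transactions become durable at the cost of one sync of one sequentially written file, instead of one sync per changed data file.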
The next benefit is consistency of the data pages. Before WAL,
EnterpriseDB could not guarantee consistency in the case
of a crash. Before WAL, any crash during writing could result in:
1. index rows pointing to nonexistent table rows
2. index rows lost in split operations
3. totally corrupted table or index page content, because
   of partially written data pages
Problems with indexes (problems 1 and 2) could possibly have been
fixed by additional fsync calls, but it is
not obvious how to handle the last case without
WAL. When necessary, WAL saves the entire content of a data
page in the log, so that the page can be restored to a consistent
state during crash recovery.
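To see why a logged copy of the whole page handles a partially written page, consider the following toy model in Python. It is a sketch under stated assumptions, not EnterpriseDB's recovery code: the 8-byte page size, the dictionary standing in for the disk, and the functions write_page_torn and recover are all hypothetical.

```python
PAGE_SIZE = 8  # tiny page size, for illustration only


def write_page_torn(disk, page_no, new_bytes, bytes_written):
    """Simulate a crash partway through a page write: only a prefix lands."""
    old = disk[page_no]
    disk[page_no] = new_bytes[:bytes_written] + old[bytes_written:]


def recover(disk, log):
    """Replay logged full-page images in order, overwriting each page."""
    for page_no, image in log:
        disk[page_no] = image


disk = {0: b"A" * PAGE_SIZE}  # page 0 holds the old content
log = []                      # stand-in for the write-ahead log

new_page = b"B" * PAGE_SIZE
log.append((0, new_page))              # log the full page image first
write_page_torn(disk, 0, new_page, 3)  # crash leaves a mix of old and new
torn = disk[0]

recover(disk, log)                     # after restart, replay the log
print(torn, disk[0])  # b'BBBAAAAA' b'BBBBBBBB'
```

Because the complete page image reached the log before the data page was touched, recovery does not need to interpret the damaged page at all; it simply overwrites it.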
Finally, WAL makes it possible to support online
backup and point-in-time recovery, as described in Section 36.3. By archiving the WAL data we can support
reverting to any time instant covered by the available WAL data:
we simply install a prior physical backup of the database, and
replay the WAL just as far as the desired time. What's more,
the physical backup doesn't have to be an instantaneous snapshot
of the database state; if it is made over some period of time,
then replaying the WAL for that period will fix any internal
inconsistencies.
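The replay idea can be modeled with a short Python sketch. It rests on strong simplifications that are assumptions of this example rather than facts about EnterpriseDB: the database is a dictionary, each log record is a hypothetical (timestamp, key, value) tuple that sets a key to an absolute value, and records are ordered by timestamp.

```python
def restore_to(base_backup, wal_records, target_time):
    """Start from a backup and replay log records up to target_time."""
    state = dict(base_backup)  # copy, so the backup itself stays untouched
    for timestamp, key, value in wal_records:
        if timestamp > target_time:
            break  # stop replay at the desired point in time
        state[key] = value  # records set absolute values, so replay is idempotent
    return state


# Archived log covering the period since the backup started.
wal = [(10, "a", 2), (20, "b", 3), (30, "a", 4)]

# A snapshot taken instantaneously before any of the logged changes.
clean_backup = {"a": 1, "b": 0}

# A backup copied over some period of time: "b" was copied before any
# change, "a" was copied after the change at time 10.
fuzzy_backup = {"a": 2, "b": 0}

print(restore_to(clean_backup, wal, 20))  # {'a': 2, 'b': 3}
print(restore_to(fuzzy_backup, wal, 20))  # {'a': 2, 'b': 3}
```

Both backups converge to the same state because every change made while the backup was running is also present in the log and is reapplied during replay.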