Data Consistency Checks at the Application Level

Because readers in PostgreSQL do not lock data, regardless of transaction isolation level, data read by one transaction can be overwritten by another concurrent transaction. In other words, if a row is returned by SELECT it does not mean that the row is still current at the instant it is returned (that is, sometime after the current query began). The row might have been modified or deleted by an already-committed transaction that committed after this one started. Even if the row is still valid "now", it could be changed or deleted before the current transaction does a commit or rollback.

Another way to think about it is that each transaction sees a snapshot of the database contents, and concurrently executing transactions may very well see different snapshots. This means that the whole concept of "now" is somewhat suspect anyway. This is not normally a problem if the client applications are isolated from each other, but if the clients can communicate via channels outside the database, serious confusion may ensue.

To ensure the current validity of a row and protect it against concurrent updates you must use SELECT FOR UPDATE or an appropriate LOCK TABLE statement. (SELECT FOR UPDATE locks just the returned rows against concurrent updates, while LOCK TABLE locks the whole table.) This should be taken into account when porting applications to PostgreSQL from other environments.

Global validity checks require extra thought under MVCC. For example, a banking application might wish to check that the sum of all credits in one table equals the sum of debits in another table when both tables are being actively updated. Comparing the results of two successive PostgreSQL SELECT SUM(...) commands will not work reliably under Read Committed mode as the second query will likely include the results of transactions not counted by the first. Doing the two sums in a single serializable transaction will give an accurate picture of the effects of transactions that committed before the serializable transaction started—but you might legitimately wonder whether the answer is still relevant by the time it is delivered. If the serializable transaction itself applied some changes before trying to make the consistency check, the usefulness of the check becomes even more debatable because now it includes some but not all post-transaction-start changes. In such cases a careful person might wish to lock all tables needed for the check in order to get an indisputable picture of current reality. A SHARE mode (or higher) lock guarantees that there are no uncommitted changes in the locked table, other than those of the current transaction.

Note also that if you are relying on explicit locks to prevent concurrent changes, you should use Read Committed mode. Also, when in Serializable mode be careful to obtain the lock(s) before performing queries. An explicit lock obtained in a serializable transaction guarantees that no other transactions modifying the table are still running—but if the snapshot seen by the transaction predates obtaining the lock, it may predate some now-committed changes in the table. A serializable transaction's snapshot is actually frozen at the start of its first query (SELECT/INSERT/UPDATE/DELETE), so it is possible to obtain explicit locks before the snapshot is frozen.