Chapter 14. The Rule System

Some other database systems define active database rules. These are usually stored procedures and triggers and are implemented in PostgreSQL as functions and triggers. (Refer to the Red Hat Database SQL Guide and Reference.)

The query rewrite rule system (the Rule System from now on) is totally different from stored procedures and triggers. It modifies queries to take rules into consideration, and then passes the modified query to the query planner for planning and execution. It is very powerful, and can be used for many things such as query language procedures, views, and versions.

Rules are created with the CREATE RULES command. For its syntax, refer to the Red Hat Database SQL Guide and Reference.

Views in PostgreSQL are implemented using the Rule System. In fact there is absolutely no difference between a
CREATE VIEW myview AS SELECT * FROM mytab;
compared against the two commands
CREATE TABLE myview (same attribute list as for mytab);
CREATE RULE "_RETmyview" AS ON SELECT TO myview DO INSTEAD
    SELECT * FROM mytab;
because this is exactly what the CREATE VIEW command does internally. This has some side effects. One of them is that the information about a view in the PostgreSQL system catalogs is exactly the same as it is for a table. So for the query parser, there is absolutely no difference between a table and a view. They are the same thing - relations.

Query Trees

To understand how the Rule System works it is necessary to know when it is invoked and what its input and results are.

The Rule System is located between the query parser and the planner. It takes the output of the parser, one query tree, and the rewrite rules from the pg_rewrite catalog table, which are query trees too (with some extra information), and creates zero or many query trees as result. So its input and output are always things the parser itself could have produced, and thus anything it sees is basically representable as an SQL statement.

Now what is a query tree? It is an internal representation of an SQL statement where the parts that comprise it are stored separately. These query trees are visible when starting the PostgreSQL backend with debug level 4 and typing queries into the interactive backend interface. The rule actions in the pg_rewrite system catalog are also stored as query trees. They are not formatted like the debug output, but they contain exactly the same information.

When reading the SQL representations of the query trees in this chapter it is necessary to be able to identify the parts into which the statement is broken when it is transformed in the query tree structure. The parts of a query tree are:

the command type

This is a simple value telling which command (SELECT, INSERT, UPDATE, DELETE) produced the parse tree.

the range table

The range table is a list of relations that are used in the query. In a SELECT statement these are the relations given after the FROM keyword.

Every range table entry identifies a table or view and tells by which name it is called in the other parts of the query. In the query tree the range table entries are referenced by index rather than by name, so here it does not matter if there are duplicate names as it would in an SQL statement. This can happen after the range tables of rules have been merged in. The examples in this document will not have this situation, however.

the result relation

This is an index into the range table that identifies the relation where the results of the query go.

SELECT queries normally do not have a result relation. The special case of a SELECT INTO is mostly identical to a CREATE TABLE, INSERT ... SELECT sequence and is not discussed separately here.

On INSERT, UPDATE, and DELETE queries the result relation is the table (or view!) where the changes take effect.

the target list

The target list is a list of expressions that define the result of the query. In the case of a SELECT, the expressions are what builds the final output of the query. They are the expressions between the SELECT and the FROM keywords. (* is just an abbreviation for all the attribute names of a relation. It is expanded by the parser into the individual attributes, so the Rule System never sees it.)

DELETE queries do not need a target list because they do not produce any result. In fact the planner will add a special CTID entry to the empty target list. But this is after the rule system and will be discussed later. As far as the Rule System is concerned, the target list is empty.

In INSERT queries the target list describes the new rows that should go into the result relation. It is the expressions in the VALUES clause or the ones from the SELECT clause in INSERT ... SELECT. Missing columns of the result relation will be filled in by the planner with a constant NULL expression.

In UPDATE queries, the target list describes the new rows that should replace the old ones. At the Rule System stage, it contains just the expressions from the "SET attribute = expression" part of the query. The planner will add missing columns by inserting expressions that copy the values from the old row into the new one. And it will add the special CTID entry just as for DELETE too.

Every entry in the target list contains an expression that can be a constant value, a variable pointing to an attribute of one of the relations in the range table, a parameter, or an expression tree made of function calls, constants, variables, operators etc.

the qualification

The query's qualification is an expression much like one of those contained in the target list entries. The result value of this expression is a Boolean that tells if the operation (INSERT, UPDATE, DELETE or SELECT) for the final result row should be executed or not. It is the WHERE clause of an SQL statement.

the join tree

The query's join tree shows the structure of the FROM clause. For a simple query such as SELECT FROM a, b, c the join tree is just a list of the FROM items, because we are allowed to join them in any order. But when JOIN expressions (particularly outer joins) are used, we have to join in the order shown by the joins. The join tree shows the structure of the JOIN expressions. The restrictions associated with particular JOIN clauses (from ON or USING expressions) are stored as qualification expressions attached to those join tree nodes. It turns out to be convenient to store the top-level WHERE expression as a qualification attached to the top-level join tree item, too. So really the join tree represents both the FROM and WHERE clauses of a SELECT.

the others

The other parts of the query tree like the ORDER BY clause are not of interest here. The Rule System substitutes entries there while applying rules, but that does not have much to do with the fundamentals of the rule system.