Beginning with PostgreSQL 7.1, it is possible to control the query planner to some extent by using explicit JOIN syntax. To see why this matters, we first need some background.
In a simple join query such as the following, the planner is free to join the given tables in any order:
SELECT * FROM a,b,c WHERE a.id = b.id AND b.ref = c.id;
When a query involves only two or three tables, there are not many join orders to worry about. But the number of possible join orders grows exponentially as the number of tables expands. Beyond ten or so input tables it is no longer practical to do an exhaustive search of all the possibilities, and even for six or seven tables planning may take an annoyingly long time. When there are too many input tables, the PostgreSQL planner switches from exhaustive search to a genetic, probabilistic search through a limited number of possibilities. (The switch-over threshold is set by the GEQO_THRESHOLD run-time parameter described in the section "Planner and Optimizer Tuning" in Chapter 1.) The genetic search takes less time, but it will not necessarily find the best possible plan.
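As a minimal sketch, the threshold can be inspected and changed for the current session as shown below. The value 8 is purely illustrative, not a recommendation, and the precise meaning of the number may differ between releases, so check the parameter description for the version in use.

```sql
-- Show the current switch-over point between exhaustive and genetic search.
SHOW geqo_threshold;

-- Illustrative value only: lower the threshold so that the genetic search
-- is used for smaller queries than it would be by default.
SET geqo_threshold = 8;

-- Go back to the default setting for the rest of the session.
RESET geqo_threshold;
```

Raising the threshold trades longer planning time for a more thorough search; lowering it does the opposite.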
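When the query involves outer joins, the planner has much less freedom than it does for plain (inner) joins. For example, consider the following query. Because a row must be emitted for each row of a that has no matching row in the join of b and c, the planner must join b to c first and then join a to that result: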
SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
In PostgreSQL 7.1, the planner treats all explicit JOIN syntaxes as constraining the join order, even though it is not logically necessary to make such a constraint for inner joins. Therefore, all of the following queries give the same result, but the ones written with explicit JOIN syntax leave the planner fewer join orders to consider:
SELECT * FROM a,b,c WHERE a.id = b.id AND b.ref = c.id;
SELECT * FROM a CROSS JOIN b CROSS JOIN c WHERE a.id = b.id AND b.ref = c.id;
SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
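One way to see how these spellings are planned is to run them through EXPLAIN. The sketch below is only an illustration: the table definitions are hypothetical (the text gives only the column names id and ref), and the plans printed depend on the server version, its settings, and the table contents, so no output is shown.

```sql
-- Hypothetical tables; only the column names come from the examples above.
CREATE TABLE a (id integer);
CREATE TABLE b (id integer, ref integer);
CREATE TABLE c (id integer);

-- Plain FROM list: the planner may consider every join order.
EXPLAIN SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;

-- Explicit JOIN syntax: as described above, PostgreSQL 7.1 joins b to c
-- first and then joins a to that result.
EXPLAIN SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);

-- Clean up the illustration tables.
DROP TABLE a;
DROP TABLE b;
DROP TABLE c;
```

Comparing the two plans on realistic data shows whether the order written in the JOIN syntax is better or worse than the one the planner picks on its own.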
You do not need to constrain the join order completely in order to cut search time, because it's OK to use JOIN operators in a plain FROM list. For example:
SELECT * FROM a CROSS JOIN b, c, d, e WHERE ...;
If you have a mix of outer and inner joins in a complex query, you might not want to constrain the planner's search for a good ordering of inner joins inside an outer join. You cannot do that directly in the JOIN syntax, but you can get around the syntactic limitation by using subselects. For example,
SELECT * FROM d LEFT JOIN (SELECT * FROM a, b, c WHERE ...) AS ss ON (...);
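For a concrete picture of the subselect technique, here is a sketch with the elided conditions replaced by hypothetical ones. The table definitions, the join conditions, and the output aliases a_id and c_id are all assumptions made for illustration; they are not taken from the query above.

```sql
-- Hypothetical tables for the illustration.
CREATE TABLE a (id integer);
CREATE TABLE b (id integer, ref integer);
CREATE TABLE c (id integer);
CREATE TABLE d (id integer);

-- The inner joins sit in a plain FROM list inside the subselect, so their
-- order is not fixed by JOIN syntax; only the LEFT JOIN to d is written
-- explicitly. Columns are aliased so the subselect has no duplicate names.
EXPLAIN
SELECT *
FROM d LEFT JOIN
     (SELECT a.id AS a_id, b.ref, c.id AS c_id
      FROM a, b, c
      WHERE a.id = b.id AND b.ref = c.id) AS ss
     ON (d.id = ss.a_id);

-- Clean up the illustration tables.
DROP TABLE a;
DROP TABLE b;
DROP TABLE c;
DROP TABLE d;
```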
Constraining the planner's search in this way is a useful technique both for reducing planning time and for directing the planner to a good query plan. If the planner chooses a bad join order by default, you can force it to choose a better order via JOIN syntax, assuming that you know of a better order. Experimentation is recommended.