Types of Nodes in a QEP

Each node in a QEP has detailed information associated with it, depending on the type of node.

The types of nodes are as follows:

Orig (or leaf) node—describes a particular table
Proj-rest node—describes the result of a projection and/or where clause restriction applied to a subsidiary node
Join node—describes a join. One of the following strategies is used:
- Cartesian product
- Full sort merge
- Partial sort merge
- Key and tid lookup joins
- Subquery joins
Exchange node—describes a point at which separate plan fragments execute concurrently on different threads as part of a parallel query plan

Sort Nodes in a QEP

Many types of nodes can also be shown as sort nodes. A sorting node causes the data to be sorted as it is returned. Any node other than an orig node can appear with a sort indication. A query with a sort clause on it has a sort node as the topmost node in the QEP. This type of node displays:

Total number of pages and tuples
Query cost information

For a description of each of these parts of the display, see Information on a QEP.

Sort nodes make use of a sort buffer and so consume primarily CPU resources, requiring disk I/O only when the volume of data to be sorted cannot be accommodated in the sort buffer. The heapsort algorithm is used; it is very fast on both unordered data and data which is nearly sorted.

Non-Join Nodes in a QEP

Types of non-join nodes are as follows:

Orig
Projection-restriction
Exchange

Orig Nodes

Orig nodes are nodes with no children. When reading a QEP, you should first find the orig nodes of the tree. Orig nodes are the most common type of QEP node.

Orig nodes describe a base table or secondary index being accessed from the query. This type of node displays the following:

Table or index name
Storage structure
Total number of pages and tuples

For a description of each of these parts of the display, see Information on a QEP.

Projection-Restriction Nodes

A projection-restriction (proj-rest) node describes the result of a projection and/or where clause restriction applied to a subsidiary node. It defines how a subsidiary node is to key into the table and what qualification to use on the table. This type of node displays the following:

A label identifying it as a proj-rest node
Storage structure (which is usually heap)
Total number of pages and tuples
Query cost information
Optionally, sort information (whether the data produced by this node is already sorted or requires sorting)

For a description of each of these parts of the display, see Information on a QEP.

Proj-rest nodes are used to remove data irrelevant to the query from a table, so that the minimum amount of data is carried around during the execution of the query. Therefore, only those columns referenced in the target list and where clause for a table are projected, and only those rows that satisfy the where clause restrictions are retained for use higher in the plan.

All you see is the amount of disk I/O required to read the appropriate rows from the node below, and that amount depends on what storage structures were used and the number of pages accessed.

Exchange Nodes

Exchange nodes appear in parallel query plans. An exchange node results in one or more threads being spawned to execute the plan fragment beneath the node. It allows different portions of a complex query plan to execute concurrently, reducing the overall elapsed time taken to execute the query. This type of node displays:

Estimated reduction in execution time due to the presence of the exchange node
Count of child threads spawned to execute the plan fragment below the exchange node
PC Join count, the number of join fragments performed by a partition compatible join

For more information, see Parallel Query Execution.

Examples of Non-join Nodes

Here are QEP examples that illustrate non-join nodes. The Sample Tables section describes the tables and indexes used in these examples.

Sample Tables

In these examples, the following two tables are used:

Table arel(col1, col2, col3):
Name: arel

Owner: supp60

Created: 26-oct-1998 08:50:00

Location: ii_database

Type: user table

Version: II2.5

Row width: 413

Number of rows: 156

Storage structure: hash

Duplicate Rows: allowed

Number of pages: 70

Overflow data pages: 6

Column Information:

Column                                      Key

Name   Type      Length   Nulls   Defaults  Seq

col1   integer    4       yes       no       1

col2   integer    4       yes       no

col3   varchar    400     yes       no

Secondary indexes: aindex (col2)  structure: isam
Table brel(col1,col2):
Name: brel

Owner: supp60

Created: 26-oct-1998 08:53:00

Location: ii_database

Type: user table

Version: II2.5

Row width: 10

Number of rows: 156

Storage structure: isam

Duplicate Rows: allowed

Number of pages: 3

Overflow data pages: 1

Column Information:

Column                                        Key

Name    Type    Length   Nulls   Defaults   Seq

col1    integer   4       yes       n         1

col2    integer   4       yes       no

Secondary indexes: none

Primary Key Lookup

This is an example of a simple primary key lookup. The QEP is shown below for the following SQL query:

select col1, col2 from arel

where col1 = 1234

order by col2;

QUERY PLAN 3,1, no timeout, of main query

Sort Keep dups

Pages 1 Tups 1

D1 C0

Proj-rest

Sorted(col1)

Pages 1 Tups 1

D1 C0

arel

Hashed(col1)

Pages 70 Tups 156

Reading this QEP diagram from bottom to top, Hashed(col1) means the row is being read through the index to return only those rows for which "col1 = 1234," as opposed to Hashed(NU) where NU indicates that the key structure cannot be used and all rows are returned. The projection-restriction node selected the rows matching the where constraint and removed superfluous columns. The final sort node reflects the sort clause on the query.

Select on a Non-Keyed Field

The following is an example of a select on a non-keyed field. The QEP is shown below for the following SQL query:

select col1, col2 from arel

where col3 = 'x'

order by col1;

QUERY PLAN 3,1, no timeout, of main query

Sort Keep dups

Pages 1 Tups 1

D9 C0

Proj-rest

Heap

Pages 1 Tups 1

D9 C0

arel

Hashed(NU)

Pages 70 Tups 156

In this example the Hashed(NU) implies that the table had to be scanned (that is, all 70 pages had to be visited). Without optimization statistics, the query optimizer uses a best guess approach (1% for equalities and 10% for non-equalities).

The query optimizer takes into account disk read-ahead or group reads when performing scans of tables—although 70 pages of data have to be read to scan arel, the estimated disk I/O value is only nine reads (D9) due to this effect. The query optimizer assumes a typical read-ahead of eight pages when performing scans, so here 70/8 reads generates an estimate of nine disk operations.

Join Nodes in a QEP

There is an inner and an outer tree beneath every join node, which function similarly to an inner and outer program loop. By convention, the left-hand tree is called the outer tree, and the right-hand tree is called the inner tree.

There are various types of join nodes, described individually below, but the joining method is the same: for each row from the outer tree, there is a join to all of the rows that can possibly qualify from the inner tree. The next row from the outer tree is processed, and so on.

Any join node can have outer join information if an outer join is present.

Cartesian Product Node

The Cartesian product, or cart-prod, strictly follows the unoptimized join model, with each row in the outer node compared to all rows from the inner node. This does not mean that all rows are actually read, only that all rows that satisfy the conditions of the query are compared.

A typical abbreviated example of a QEP diagram involving a cart-prod is shown below:

Cart-Prod

/ \

proj-rest table

table

This node is displayed with the following information on a QEP diagram:

A label identifying it as a cart-prod join node, along with the column(s) on which processing is done
If an outer join has been requested, one of the following labels indicating the type of join:
[LEFT JOIN]
[FULL JOIN]
[RIGHT JOIN]
Storage structure (which is usually heap)
Total number of pages and tuples
Query cost information
Optionally, sort information (whether the data produced by this node is already sorted or requires sorting)

The cart-prod is most frequently observed in disjoint queries (that is, when use of correlation variables and table names are mixed in the query). However, the following cases can also generate cart-prods, without adversely affecting performance:

Queries involving ORs that can usually be decomposed into a sequence of smaller queries
No join specification (a disjoint table or no where clause, so that there is no relationship between the two tables)
Small tables or amounts of data
Non_equijoins, such as a query with the following where clause:
where r1.col1 > r2.col2

Cart-prods are sometimes associated with substantial estimates for disk I/O usage and affect performance adversely.

This example shows a QEP diagram with a cart-prod join node resulting from the following simple disjoint query:

select arel.col1 from arel, arel a

where a.col1 = 99;

QUERY PLAN 7,1, no timeout, of main query

Cart-Prod

Heap

Pages 1 Tups 243

D9 C4

/ \

Proj-rest Proj-rest

Sorted(NU) Heap

Pages 1 Tups 2 Pages 1 Tups 156

D1 C0 D8 C1

/ /

arel arel

Hashed(col1) Hashed(NU)

Pages 70 Tups 156 Pages 70 Tups 156

Full Sort Merge Node

The full sort merge (FSM) is a more optimal join: it typically joins the inner and outer subtrees with many fewer comparisons than the cart-prod requires. This is done by assuring that both subtrees are sorted in the order of the join columns. If one or the other is not already sorted (for example, by being read from a B-tree index constructed on the join columns), the query plan can include a sort to put the rows in the correct order. This allows the rows of the outer subtree to be joined to the matching rows of the inner subtree with one pass over each. The inner subtree is not scanned multiple times, as with the cart-prod join.

A typical abbreviated example of a QEP diagram involving an FSM is shown below:

join

/ \

sort sort

/ /

proj-rest proj-rest

/ /

table table

This node is displayed with the following information on a QEP diagram:

A label identifying it as a FSM join node, along with the column(s) on which join processing is done
If an outer join has been requested, one of the following labels indicating the type of join:
[LEFT JOIN]
[FULL JOIN]
[RIGHT JOIN]
Storage structure (which is usually heap) or a list of sort columns if the data is being sorted
Total number of pages and tuples
Query cost information
Optionally, sort information (whether the data produced by this node is already sorted or requires sorting)

The FSM is most common when a "bulk" join takes place with no row restrictions on either table involved in the join, as with an SQL select statement of the following form:

select * from r1, r2 where r1.joincol = r2.joincol;

This example shows a QEP diagram with an FSM join node resulting from such a bulk join:

select a.col2, b.col2 from arel a, brel b

where a.col1 = b.col1;

QUERY PLAN 5,1, no timeout, of main query

            FSM Join(col1)
            Heap
            Pages 1 Tups 156
            D9 C40
    /               \
    Proj-rest          Proj-rest
    Sorted(eno)      Sorted(eno)
    Pages 1 Tups 156   Pages 1 Tups 156
    D8 C1              D1 C1
   /                     /
arel                    brel
Hashed (NU)             Isam (NU)
Pages 70 Tups 156       Pages 3 Tups 156

Partial Sort Merge Node

The partial sort merge (PSM) is a cross between a full sort merge and a cart-prod. The inner tree must be in sorted order. The outer tree can be sorted or partially sorted. The outer tree in PSM scenarios can always be derived from an ISAM table. Comparisons proceed as for the full sort merge until an outer value is found to be out of order. At that point the inner loop is restarted. Because ISAM tables are reasonably well ordered (depending on how many rows have been added because the last reorganization), the number of inner loop restarts is typically small.

A typical abbreviated example of a QEP diagram involving a PSM is shown below:

Join

/ \

proj-rest sort

/ /

table proj-rest

table

This node is displayed with the following information on a QEP diagram:

A label identifying it as a PSM join node, along with the columns on which processing is done
If an outer join has been requested, one of the following labels indicating the type of join:
[LEFT JOIN]
[FULL JOIN]
[RIGHT JOIN]
Storage structure (which is usually heap)
Total number of pages and tuples
Query cost information
Optionally, sort information (whether the data produced by this node is already sorted or requires sorting)

The following example shows a QEP diagram with a PSM join:

select a.col2, b.col2 from arel a, brel b

where a.col1 = b.col2;

QUERY PLAN 6,1, no timeout, of main query

PSM Join(col1)

Heap

Pages 1 Tups 156

D9 C26

/ \

Proj-rest Proj-rest

Heap Sort on(col1)

Pages 1 Tups 156 Pages 1 Tups 156

D1 C1 D8 C1

/ /

brel arel

Isam(col2) Hashed(NU)

Pages 3 Tups 156 Pages 70 Tups 156

Hash Join Node

The hash is an optimized join that replaces the FSM join when one or both of the subtrees must be sorted. It functions by loading the rows of one subtree into a memory resident hash table, keyed on the values of the join columns. The rows of the other subtree are hashed on their join key values into the hash table, allowing very efficient matching of joined rows. By avoiding the sort(s) of the FSM join, the hash join can be much more efficient.

A typical abbreviated example of a QEP diagram involving a hash join is shown below:

Join

/ \

proj-rest proj-rest

/ /

table table

This node is displayed with the following information on a QEP diagram:

A label identifying it as a hash join node, along with the columns on which join processing is done
If an outer join has been requested, one of the following labels indicating the type of join:
[LEFT JOIN]
[FULL JOIN]
[RIGHT JOIN]
Storage structure (which is usually heap) or a list of sort columns if the data is being sorted
Total number of pages and tuples
Query cost information
Optionally, sort information (whether the data produced by this node is already sorted or requires sorting)

Like the FSM join, the hash join is most common when a bulk join takes place with no row restrictions on either table involved in the join, as with an SQL select statement of the following form:

select from r1,r2 where r1.joincol= r2.joincol;

This example shows a QEP diagram with hash join node resulting from such a bulk join:

select a.col2, b.col2 from arel a, brel b

where a.col1 = b.col2;

QUERY PLAN 1,1, no timeout, of main query

HASH Join(col1)

Heap

Pages 1 Tups 156

D9 C40

/ \

Proj-rest Proj-rest

Heap Heap

Pages 1 Tups 156 Pages 1 Tups 156

D8 C1 D1 C1

/ /

arel(a) brel (b)

Hashed (NU) Isam (NU)

Pages 70 Tups 156 Pages 3 Tups 156

Key and Tid Lookup Join Node

In key and tid lookup joins, the outer and inner data set is not static. For each outer row, the join selects values and forms a key to search in the inner join. A key lookup join uses keyed access through the structure of the inner table or index, and a tid lookup join uses the tuple identifier (tid) value. For more information, see Tids.

A typical abbreviated example of a QEP diagram involving this type of join is shown below:

Join

/ \

sort btree (or hash or isam)

proj-rest

table

This node is displayed with the following information on a QEP diagram:

A label identifying it as a key (K) or tid (T) lookup join, along with the column(s) on which processing is done
If an outer join has been requested, the following label indicating the type of join:
[LEFT JOIN]
Storage structure (which is usually heap)
Total number of pages and tuples
Query cost information
Optionally, sort information (whether the data produced by this node is already sorted or requires sorting)

This case is seen most frequently where the outer subtree proj-rest returns few rows, so it is faster to do a keyed lookup (on an ISAM, hash, or B-tree table) than any of the sort merge operations.

The following example shows a QEP diagram with a key lookup join:

select b.col1, b.col2, a.col2

from arel a, brel b

where a.col3 = 'x' and a.col1 = b.col1;

K Join(col1)

Heap

Pages 1 Tups 2

D3 C0

/ \

Proj-rest brel

Heap Isam(col1)

Pages 1 Tups 2 Pages 3 Tups 156

D1 C0

arel

Hashed(NU)

Pages 70 Tups 156

In the next example of a tid lookup join, access to the base table is through the secondary index, and proj-rest collects tids for sorting. The tids are used for a direct tid lookup in the base table. Therefore, the storage structure of the base table is NU:

select a.col1, a.col2 from arel a

where a.col2 = 99

order by a.col2;

Sort(col2)

Pages 1 Tups 1

D4 C1

T Join(tidp)

Heap

Pages 1 Tups 1

D4 C0

/ \

Proj-rest arel

Sort on(tidp) Hashed(NU)

Pages 1 Tups 1 Pages 70 Tups 156

D2 C0

aindex

Isam(col2)

Pages 2 Tups 156

Subquery Join Node

The subquery join is specific to SQL because SQL allows subselects as part of a query. These nodes join rows from a query to matching rows of a contained subselect, thus allowing the subselect restrictions on the query to be evaluated.

A typical abbreviated example of a QEP diagram involving a subquery join is shown below:

SE join

/ \

proj-rest Tn

table

In this diagram, Tn identifies the QEP constructed to evaluate the subselect.

This node is displayed with the following information on a QEP diagram:

A label identifying it as a subquery (SE) join, along with the column(s) on which processing is done
Storage structure (which is usually heap)
Total number of pages and tuples
Query cost information
Optionally, sort information (whether the data produced by this node is already sorted or requires sorting)

The following example shows a QEP diagram with a subquery join:

select * from arel a

where a.col2 = (

select col2 from brel b

where a.col1 = b.col1)

and col1 = 5;

QUERY PLAN 3,1, no timeout, of T1

Proj-rest

Heap

Pages 1 Tups 1

D1 C0

brel

Hashed(col1)

Pages 3 Tups 156

QUERY PLAN 4,2, no timeout, of main query

SE Join(col1)

Heap

Pages 1 Tups 1

D2 C0

/ \

Proj-rest T1

Heap Heap

Pages 1 Tups 1 Pages 1 Tups 1

D1 C0

arel

Hashed(col1)

Pages 70 Tups 156

In the QEP pane of the SQL Scratchpad window in VDBA, these two QEPs are shown in separate tabs.

Subquery joins are reasonably expensive because the subselect must be executed once for each row of the outer subtree. The query optimizer contains logic to flatten various types of subselects into normal joins with the containing query. This allows more optimal joins to be used to evaluate the subselect.

As an example of the subselect flattening enhancement features of the query optimizer, consider the following subselect:

select r.a from r where r.c =

(select avg(s.a) from s

where s.b = r.b and r.d > s.d);

Instead of two scans of table r, the query optimizer has eliminated a scan of table r by evaluating the aggregate at an interior QEP node. The QEP appears similar to the previous example:

QUERY PLAN 7,1, no timeout, of main query

Hash Join(b)

avg(s.a)

Heap

Pages 2 Tups 359

D133 C171

/ \

Proj-rest Proj-rest

Sorted(b) Heap

Pages 1 Tups 359 Pages 40 Tups 359

D65 C112 D32 C3

/ /

r s

Btree (b,c) Hashed(NU)

Pages 44 Tups 359 Pages 44 Tups 359

Multiple Query Execution Plans

The query optimizer can generate multiple QEPs if the query includes any of the following objects:

SQL subqueries (in, exists, all, any, and so on.)
SQL union clause
SQL group by clause
Views that need to be materialized

As an example of multiple QEPs, consider the processing of a view on a union. The following statement creates the view:

create view view1 as

select distinct col1 from arel

union

select distinct col2 from arel;

There are two selects, designated #1 and #2 in their respective QEPs below. Now consider the query optimizer action in evaluating the following query:

select * from view1;

This generates three QEPs, which are shown in order in the example QEP diagrams that follow:

The first select in the union
The second select in the union
Main query—the merged result of the two unioned selects
QUERY PLAN of union view T0

               Sort Unique

               Pages 1 Tups 156

               D1 C10

              /

          Proj-rest

          Heap

          Pages 1 Tups 156

          D1 C0

          /

      aindex

      Isam(col2)

      Pages 2 Tups 156

QUERY PLAN of union subquery

             Sort Unique

             Pages 1 Tups 156

             D4 C10

            /

        Proj-rest

        Heap

        Pages 1 Tups 156

        D4 C0

       /

   arel

   Hashed(NU)

   Pages 70 Tups 156

QUERY PLAN of main query

             Sort Unique

             Pages 1 Tups 156

             D19 C20

            /

         Proj-rest

         Heap

         Pages 1 Tups 156

         D12 C11

        /

      T0

      Heap

      Pages 1 Tups 156

In the QEP pane of the SQL Scratchpad window in VDBA, these QEPs are shown in separate tabs.

More Complex QEPs

The previous series of QEPs on the different classes of joins involved only two tables. More complex QEPs involving joins with three or more tables can be read as a sequence of two-table joins that have already been described, with the query optimizer deciding what is the optimal join sequence. The key to understanding these complex QEPs is recognizing the join sequences and the types of joins being implemented.

Parallel Query Execution

With its thread support, Ingres has long supported the concurrent execution of separate queries. For short OLTP style queries, this permits a many fold increase in the number of such queries that can be executed in a given unit of time.

Ingres has the additional capability to split up the execution of individual long running queries over multiple threads. This parallel execution of a single complex query reduces the time to execute it.

Parallel query plans are implemented in Ingres by the introduction of the exchange node type. For more information, see Sample Parallel QEPs.

An exchange node marks the boundary between processing threads in a query plan. It can spawn one or many threads to execute the query plan fragment below the exchange node concurrent with the fragment above the exchange node. The exchange node itself passes or exchanges rows from threads below the node to the thread above the node.

Types of Parallelism

Ingres compiles exchange nodes into queries to implement any of three types of parallelism:

Inter-node (pipelined) parallelism – an exchange node that spawns a single thread effectively pipelines rows from the plan fragment below the node to the plan fragment above the node. For example, an exchange node below a sort node allows the plan fragment below to generate rows at the same time as sorting is being done for previous rows. Plan fragments that produce and consume rows at the same rate can effectively overlap their processing, reducing the overall execution time for the query.
Inter-node (bushy) parallelism – exchange nodes inserted over essentially independent query plan fragments allow those fragments to execute independently of one another. A specialized case of bushy parallelism occurs in union queries when a single exchange node is placed above the unioned selects. One thread is created for each of the select plan fragments, allowing the selects to be processed concurrently.
Intra-node (partitioned table) parallelism – a single exchange node is placed above the orig node for a partitioned table. The exchange node creates several (4, 8, and so forth) child threads, each one of which retrieves data from a subset of the partitions of the table. This allows the concurrent reading of rows from the different partitions, clearly reducing the elapsed time required to process the table. A variation on partitioned parallelism (called a partition compatible join) occurs when two partitioned tables with the same number of partitions are joined using their respective partitioning columns. The query optimizer places the exchange node above the join in the query plan, resulting in mini-joins being performed between the rows of compatible pairs of partitions. These mini-joins are performed concurrently on different threads.

Enabling Parallel Query Plans

The generation of parallel query plans is controlled by several configuration parameters, as well as a session level Set statement. The opf_pq_dop parameter defines the degree of parallelism or maximum number of exchange nodes that can be compiled into a query plan. A value of 0 or 1 prevents the generation of parallel plans, but any other positive value enables them. The session level set parallel <n> statement can be used to override the CBF parameter, where <n> is the degree of parallelism for queries compiled in the session.

The opf_pq_threshold parameter is a companion to opf_pq_dop and defines the cost threshold of a query plan before exchange nodes are inserted into it. Because there are slight overheads in initiating parallel query plans, compile only plans that benefit from parallel processing. The Ingres query optimizer currently compiles a serial query plan as it always has. If the degree of parallelism is defined to permit exchange nodes to be added to the plan, the cost estimate of the serial plan must still exceed the threshold value before a parallel plan is generated. The cost estimate is computed as the sum of the C and D numbers at the top of the query plan.

Sample Parallel QEPs

The following example shows a QEP diagram with a 1:n exchange node above a join node. It shows eight threads being created to divide the work of the join.

Two joins each of pairwise compatible partitions of the underlying tables are performed by each thread, all in parallel.

select a.col1, b.col2

from aprel a, bprel b

where a.col1 = b.col1

and b.col2 between 500 and 1500

Exchange

Heap

Pages 319 Tups 14012

Reduction 4301

Threads 8

PC Join count 16

D689 C6386

Hash Join(col1)

Heap

Pages 319 Tups 14012

PC Join count 16

D689 C6386

/ \

Proj-rest Proj-rest

Heap Heap

Pages 73 Tups 10000 Pages 83 Tups 14012

D384 C100 D305 C140

/ /aprel bprelB-Tree(NU) B-Tree(col2)

Pages 768 Tups 10000 Pages 12872 Tups 299814

Partitions 128 Partitions 16

PC Join count 16 PC Join count 16

This example shows a union select in which the exchange node generates one thread to process each of the selects.

select a.col1 from arel a

union all select b.col2 from brel b

QUERY PLAN 1,3, no timeout, of union subquery

Proj-rest

Heap

Pages 25 Tups 8197

D283 C82

brel

(b)

Heap

Pages 1132 Tups 8197

QUERY PLAN 1,2, no timeout, of union subquery

Proj-rest

Heap

Pages 16 Tups 5126

D218 C51

arel

(a)

Heap

Pages 872 Tups 5126

QUERY PLAN 1,1, no timeout, of main query

Exchange

Heap

Pages 54 Tups 17886

Reduction 242

Threads 2

D501 C266

Proj-rest

Heap

Pages 54 Tups 17886 D501 C266

Heap

Pages 16 Tups 17886

Optimizer Timeout

The query optimizer does not search for further execution plans if it believes that the best plan it has found so far takes less time to execute than the time it has taken evaluating QEPs. In this case, the query optimizer times out, stopping its operation and returning with the best QEP it has found up to that point.

You can tell if the query optimizer timed out by checking the header of the QEP for set qep diagrams or the Timeout check box in the QEP pane of the SQL Scratchpad window in VDBA. In the QEP pane the Timeout check box is disabled if the optimizer was able to find the optimal QEP without timing out. If the optimizer timed out before searching all QEPs, this check box is enabled. In QEP diagrams generated using set qep, these same conditions are indicated in the QEP diagram header using the keywords "no timeout" and "timed out," respectively.

The timeout feature is controlled using the SQL set statement with the joinop [no]timeout clause, as discussed in the set statement entry in the SQL Reference Guide. By default, the optimizer times out, but you can turn the timeout feature off to enable the query optimizer to evaluate all plans. To do this, issue the following statement (for example, from the VDBA SQL Scratchpad window, from a terminal monitor, or from within an embedded SQL application):

set joinop notimeout

To turn the timeout feature back on, the corresponding statement is:

set joinop timeout

Because it is elapsed CPU time that is being measured, QEPs that time out can change, depending on machine load.

To control this feature using an operating system environment variable:

Windows:

set ING_SET=set joinop notimeout

Unix:

C shell:

setenv ING_SET "set joinop notimeout"

Bourne shell:

ING_SET = "set joinop notimeout"

export ING_SET

VMS:

define ING_SET "set joinop notimeout"

Note: The fact that the query optimizer has timed out on a particular query does not guarantee that a better QEP is found; it indicates that not all QEPs have been checked, and so potentially, a better QEP is found.

Greedy Optimization

The Ingres query optimizer performs an exhaustive search of all possible plans for executing a query. It computes a cost estimate for each possible plan and chooses the cheapest alternative according to the cost algorithms. For relatively simple queries, this process is very fast. However, for queries involving large numbers of tables, the number of possible plans can be very large and the length of time spent enumerating them all and associating costs with them can be significant.

The Optimizer Timeout feature is useful in shortening the time taken to identify an efficient query plan. However, for very large queries, queries with large numbers of potentially useful secondary indexes and for large queries whose tables have large numbers of rows (leading to very expensive plans that do not timeout quickly), the optimizer can take an unacceptable length of time producing a viable query plan.

The query optimizer now includes an alternative mechanism for generating query plans that can greatly reduce the length of compile time. Rather than enumerate all possible query plans (including all permutations of table join order, all combinations of tables and useful secondary indexes, and all "shapes" of query plans), the new "greedy" enumeration heuristic starts with small plan fragments, using them as building blocks to construct the eventual query plan. The chosen fragments are always the lowest cost at that stage of plan construction, so even though large numbers of potential plans are not considered, those that are chosen are also based on cost estimation techniques.

This technique is used by default for any query involving at least five base tables and at least ten base tables, plus potentially useful secondary indexes. The chosen plans are usually very good, in particular, when the vastly reduced compile time is taken into consideration. However, for the rare cases in which the new technique produces a much slower plan, it can be toggled using the following statements:

[exec sql] set joinop nogreedy

exec sql] set joinop greedy

Summary for Evaluating QEPs

Here is a list of the main points to check when evaluating a QEP:

Cart-prods can be caused by errors due to disjoint queries or queries that involve certain types of OR operations. Also, joins involving calculations, data type conversions, and non-equijoins can generate cart-prods. Alternative ways of posing the query is often advised under these circumstances.
The NU on the storage structure part in the orig node description is not a good sign if you believe the storage structure must have been used to restrict the number of rows being read.
Verify that the appropriate secondary indexes are being used. Running the optimization process to generate statistics on the indexed columns allows the query optimizer to better differentiate between the selectivity powers of the different indexes for a particular situation.
If there is little data in a table (for example, less than five pages) the query optimizer can consider a scan of the table rather than use any primary or secondary indexes, because little is to be gained from using them.
Check that row estimates are accurate on the QEP nodes. If not, run the optimization process to generate statistics on the columns in question.