Previous Topic

Next Topic

Types of Nodes in a QEP

Each node in a QEP has detailed information associated with it, depending on the type of node.

The types of nodes are as follows:

Previous Topic

Next Topic

Sort Nodes in a QEP

Many types of nodes can also be shown as sort nodes. A sorting node causes the data to be sorted as it is returned. Any node other than an orig node can appear with a sort indication. A query with a sort clause on it has a sort node as the topmost node in the QEP. This type of node displays:

For a description of each of these parts of the display, see Information on a QEP.

Sort nodes make use of a sort buffer and so consume primarily CPU resources, requiring disk I/O only when the volume of data to be sorted cannot be accommodated in the sort buffer. The heapsort algorithm is used; it is very fast on both unordered data and data which is nearly sorted.

Previous Topic

Next Topic

Non-Join Nodes in a QEP

Types of non-join nodes are as follows:

Previous Topic

Next Topic

Orig Nodes

Orig nodes are nodes with no children. When reading a QEP, you should first find the orig nodes of the tree. Orig nodes are the most common type of QEP node.

Orig nodes describe a base table or secondary index being accessed from the query. This type of node displays the following:

For a description of each of these parts of the display, see Information on a QEP.

Previous Topic

Next Topic

Projection-Restriction Nodes

A projection-restriction (proj-rest) node describes the result of a projection and/or where clause restriction applied to a subsidiary node. It defines how a subsidiary node is to key into the table and what qualification to use on the table. This type of node displays the following:

For a description of each of these parts of the display, see Information on a QEP.

Proj-rest nodes are used to remove data irrelevant to the query from a table, so that the minimum amount of data is carried around during the execution of the query. Therefore, only those columns referenced in the target list and where clause for a table are projected, and only those rows that satisfy the where clause restrictions are retained for use higher in the plan.

All you see is the amount of disk I/O required to read the appropriate rows from the node below, and that amount depends on what storage structures were used and the number of pages accessed.

Previous Topic

Next Topic

Exchange Nodes

Exchange nodes appear in parallel query plans. An exchange node results in one or more threads being spawned to execute the plan fragment beneath the node. It allows different portions of a complex query plan to execute concurrently, reducing the overall elapsed time taken to execute the query. This type of node displays:

For more information, see Parallel Query Execution.

Previous Topic

Next Topic

Examples of Non-join Nodes

Here are QEP examples that illustrate non-join nodes. The Sample Tables section describes the tables and indexes used in these examples.

Sample Tables

In these examples, the following two tables are used:

  1. Table arel(col1, col2, col3):

    Name: arel

    Owner: supp60

    Created: 26-oct-1998 08:50:00

    Location: ii_database

    Type: user table

    Version: II2.5

    Row width: 413

    Number of rows: 156

    Storage structure: hash

    Duplicate Rows: allowed

    Number of pages: 70

    Overflow data pages: 6

    Column Information:

    Column                                      Key

    Name   Type      Length   Nulls   Defaults  Seq

    col1   integer    4       yes       no       1

    col2   integer    4       yes       no

    col3   varchar    400     yes       no

    Secondary indexes: aindex (col2)  structure: isam

  2. Table brel(col1,col2):

    Name: brel

    Owner: supp60

    Created: 26-oct-1998 08:53:00

    Location: ii_database

    Type: user table

    Version: II2.5

    Row width: 10

    Number of rows: 156

    Storage structure: isam

    Duplicate Rows: allowed

    Number of pages: 3

    Overflow data pages: 1

    Column Information:

    Column                                        Key

    Name    Type     Length   Nulls   Defaults   Seq

    col1    integer    4       yes       n         1

    col2    integer    4       yes       no

    Secondary indexes: none

Primary Key Lookup

This is an example of a simple primary key lookup. The QEP is shown below for the following SQL query:

select col1, col2 from arel

  where col1 = 1234

  order by col2;

QUERY PLAN 3,1, no timeout, of main query

                Sort Keep dups

                Pages 1 Tups 1

                D1 C0

                /

            Proj-rest

            Sorted(col1)

            Pages 1 Tups 1

            D1 C0

            /

         arel

         Hashed(col1)

         Pages 70 Tups 156

Reading this QEP diagram from bottom to top, Hashed(col1) means the row is being read through the index to return only those rows for which "col1 = 1234," as opposed to Hashed(NU) where NU indicates that the key structure cannot be used and all rows are returned. The projection-restriction node selected the rows matching the where constraint and removed superfluous columns. The final sort node reflects the sort clause on the query.

Select on a Non-Keyed Field

The following is an example of a select on a non-keyed field. The QEP is shown below for the following SQL query:

select col1, col2 from arel

  where col3 = 'x'

  order by col1;

QUERY PLAN 3,1, no timeout, of main query

                    Sort Keep dups

                    Pages 1 Tups 1

                    D9 C0

                    /

                 Proj-rest

                 Heap

                 Pages 1 Tups 1

                 D9 C0

                  /

              arel

              Hashed(NU)

              Pages 70 Tups 156

In this example the Hashed(NU) implies that the table had to be scanned (that is, all 70 pages had to be visited). Without optimization statistics, the query optimizer uses a best guess approach (1% for equalities and 10% for non-equalities).

The query optimizer takes into account disk read-ahead or group reads when performing scans of tables—although 70 pages of data have to be read to scan arel, the estimated disk I/O value is only nine reads (D9) due to this effect. The query optimizer assumes a typical read-ahead of eight pages when performing scans, so here 70/8 reads generates an estimate of nine disk operations.

Previous Topic

Next Topic

Join Nodes in a QEP

There is an inner and an outer tree beneath every join node, which function similarly to an inner and outer program loop. By convention, the left-hand tree is called the outer tree, and the right-hand tree is called the inner tree.

There are various types of join nodes, described individually below, but the joining method is the same: for each row from the outer tree, there is a join to all of the rows that can possibly qualify from the inner tree. The next row from the outer tree is processed, and so on.

Any join node can have outer join information if an outer join is present.

Previous Topic

Next Topic

Cartesian Product Node

The Cartesian product, or cart-prod, strictly follows the unoptimized join model, with each row in the outer node compared to all rows from the inner node. This does not mean that all rows are actually read, only that all rows that satisfy the conditions of the query are compared.

A typical abbreviated example of a QEP diagram involving a cart-prod is shown below:

       Cart-Prod

      /     \

proj-rest table

   /

table

This node is displayed with the following information on a QEP diagram:

The cart-prod is most frequently observed in disjoint queries (that is, when use of correlation variables and table names are mixed in the query). However, the following cases can also generate cart-prods, without adversely affecting performance:

Cart-prods are sometimes associated with substantial estimates for disk I/O usage and affect performance adversely.

This example shows a QEP diagram with a cart-prod join node resulting from the following simple disjoint query:

select arel.col1 from arel, arel a

  where a.col1 = 99;

QUERY PLAN 7,1, no timeout, of main query

                    Cart-Prod

                    Heap

                    Pages 1 Tups 243

                    D9 C4

                   /                \

          Proj-rest               Proj-rest

          Sorted(NU)              Heap

          Pages 1 Tups 2        Pages 1 Tups 156

          D1 C0                   D8 C1

          /                        /

     arel                      arel

     Hashed(col1)              Hashed(NU)

     Pages 70 Tups 156        Pages 70 Tups 156

Previous Topic

Next Topic

Full Sort Merge Node

The full sort merge (FSM) is a more optimal join: it typically joins the inner and outer subtrees with many fewer comparisons than the cart-prod requires. This is done by assuring that both subtrees are sorted in the order of the join columns. If one or the other is not already sorted (for example, by being read from a B-tree index constructed on the join columns), the query plan can include a sort to put the rows in the correct order. This allows the rows of the outer subtree to be joined to the matching rows of the inner subtree with one pass over each. The inner subtree is not scanned multiple times, as with the cart-prod join.

A typical abbreviated example of a QEP diagram involving an FSM is shown below:

               join

              /    \

           sort    sort

            /       /

   proj-rest proj-rest

  /            /

table       table

This node is displayed with the following information on a QEP diagram:

The FSM is most common when a "bulk" join takes place with no row restrictions on either table involved in the join, as with an SQL select statement of the following form:

select * from r1, r2 where r1.joincol = r2.joincol;

This example shows a QEP diagram with an FSM join node resulting from such a bulk join:

select a.col2, b.col2 from arel a, brel b

  where a.col1 = b.col1;

QUERY PLAN 5,1, no timeout, of main query

            FSM Join(col1)
            Heap
            Pages 1 Tups 156
            D9 C40
    /               \
    Proj-rest          Proj-rest
    Sorted(eno)      Sorted(eno)
    Pages 1 Tups 156   Pages 1 Tups 156
    D8 C1              D1 C1
   /                     /
arel                    brel
Hashed (NU)             Isam (NU)
Pages 70 Tups 156       Pages 3 Tups 156

Previous Topic

Next Topic

Partial Sort Merge Node

The partial sort merge (PSM) is a cross between a full sort merge and a cart-prod. The inner tree must be in sorted order. The outer tree can be sorted or partially sorted. The outer tree in PSM scenarios can always be derived from an ISAM table. Comparisons proceed as for the full sort merge until an outer value is found to be out of order. At that point the inner loop is restarted. Because ISAM tables are reasonably well ordered (depending on how many rows have been added because the last reorganization), the number of inner loop restarts is typically small.

A typical abbreviated example of a QEP diagram involving a PSM is shown below:

           Join

          /      \

      proj-rest  sort

      /           /

    table     proj-rest

                  /

                table

This node is displayed with the following information on a QEP diagram:

The following example shows a QEP diagram with a PSM join:

select a.col2, b.col2 from arel a, brel b

  where a.col1 = b.col2;

QUERY PLAN 6,1, no timeout, of main query

                     PSM Join(col1)

                     Heap

                     Pages 1 Tups 156

                     D9 C26

                    /          \

           Proj-rest            Proj-rest

           Heap                 Sort on(col1)

           Pages 1 Tups 156     Pages 1 Tups 156

           D1 C1                D8 C1

          /                     /

       brel                  arel

       Isam(col2)            Hashed(NU)

       Pages 3 Tups 156      Pages 70 Tups 156

Previous Topic

Next Topic

Hash Join Node

The hash is an optimized join that replaces the FSM join when one or both of the subtrees must be sorted. It functions by loading the rows of one subtree into a memory resident hash table, keyed on the values of the join columns. The rows of the other subtree are hashed on their join key values into the hash table, allowing very efficient matching of joined rows. By avoiding the sort(s) of the FSM join, the hash join can be much more efficient.

A typical abbreviated example of a QEP diagram involving a hash join is shown below:

           Join

          /      \

      proj-rest  proj-rest

      /           /

    table     table

This node is displayed with the following information on a QEP diagram:

Like the FSM join, the hash join is most common when a bulk join takes place with no row restrictions on either table involved in the join, as with an SQL select statement of the following form:

select from r1,r2 where r1.joincol= r2.joincol;

This example shows a QEP diagram with hash join node resulting from such a bulk join:

select a.col2, b.col2 from arel a, brel b

where a.col1 = b.col2;

QUERY PLAN 1,1, no timeout, of main query

                     HASH Join(col1)

                     Heap

                     Pages 1 Tups 156

                     D9 C40

                    /             \

           Proj-rest            Proj-rest

  Heap         Heap 

           Pages 1 Tups 156     Pages 1 Tups 156

           D8 C1                D1 C1

          /                     /

       arel(a)                brel (b)

       Hashed (NU)           Isam (NU)

       Pages 70 Tups 156     Pages 3 Tups 156

Previous Topic

Next Topic

Key and Tid Lookup Join Node

In key and tid lookup joins, the outer and inner data set is not static. For each outer row, the join selects values and forms a key to search in the inner join. A key lookup join uses keyed access through the structure of the inner table or index, and a tid lookup join uses the tuple identifier (tid) value. For more information, see Tids.

A typical abbreviated example of a QEP diagram involving this type of join is shown below:

         Join

       /     \

    sort      btree (or hash or isam)

       \      

      proj-rest

         \

       table

This node is displayed with the following information on a QEP diagram:

This case is seen most frequently where the outer subtree proj-rest returns few rows, so it is faster to do a keyed lookup (on an ISAM, hash, or B-tree table) than any of the sort merge operations.

The following example shows a QEP diagram with a key lookup join:

select b.col1, b.col2, a.col2

  from arel a, brel b

  where a.col3 = 'x' and a.col1 = b.col1;

                  K Join(col1)

                  Heap

                  Pages 1 Tups 2

                  D3 C0

                   /           \

           Proj-rest            brel

           Heap                Isam(col1)

           Pages 1 Tups 2       Pages 3 Tups 156

           D1 C0

            /

       arel

       Hashed(NU)

       Pages 70 Tups 156

In the next example of a tid lookup join, access to the base table is through the secondary index, and proj-rest collects tids for sorting. The tids are used for a direct tid lookup in the base table. Therefore, the storage structure of the base table is NU:

select a.col1, a.col2 from arel a

  where a.col2 = 99

  order by a.col2;

                   Sort(col2)

                   Pages 1 Tups 1

                   D4 C1

                   /

             T Join(tidp)

             Heap

             Pages 1 Tups 1

             D4 C0

            /               \

         Proj-rest          arel

         Sort on(tidp)      Hashed(NU)

         Pages 1 Tups 1     Pages 70 Tups 156

         D2 C0

        /

   aindex

   Isam(col2)

   Pages 2 Tups 156

Previous Topic

Next Topic

Subquery Join Node

The subquery join is specific to SQL because SQL allows subselects as part of a query. These nodes join rows from a query to matching rows of a contained subselect, thus allowing the subselect restrictions on the query to be evaluated.

A typical abbreviated example of a QEP diagram involving a subquery join is shown below:

              SE join

              /     \

       proj-rest     Tn

         /

    table

In this diagram, Tn identifies the QEP constructed to evaluate the subselect.

This node is displayed with the following information on a QEP diagram:

The following example shows a QEP diagram with a subquery join:

select * from arel a

  where a.col2 = (

    select col2 from brel b

      where a.col1 = b.col1)

  and col1 = 5;

QUERY PLAN 3,1, no timeout, of T1

               Proj-rest

               Heap

               Pages 1 Tups 1

               D1 C0

               /

          brel

          Hashed(col1)

          Pages 3 Tups 156

QUERY PLAN 4,2, no timeout, of main query

               SE Join(col1)

               Heap

               Pages 1 Tups 1

               D2 C0

              /              \

          Proj-rest          T1

          Heap               Heap

          Pages 1 Tups 1     Pages 1 Tups 1

          D1 C0

          /

      arel

      Hashed(col1)

      Pages 70 Tups 156

In the QEP pane of the SQL Scratchpad window in VDBA, these two QEPs are shown in separate tabs.

Subquery joins are reasonably expensive because the subselect must be executed once for each row of the outer subtree. The query optimizer contains logic to flatten various types of subselects into normal joins with the containing query. This allows more optimal joins to be used to evaluate the subselect.

As an example of the subselect flattening enhancement features of the query optimizer, consider the following subselect:

select r.a from r where r.c =

 (select avg(s.a) from s

where s.b = r.b and r.d > s.d);

Instead of two scans of table r, the query optimizer has eliminated a scan of table r by evaluating the aggregate at an interior QEP node. The QEP appears similar to the previous example:

QUERY PLAN 7,1, no timeout, of main query

                     Hash Join(b)

                     avg(s.a)

                     Heap

                     Pages 2 Tups 359

                     D133 C171

                    /          \

             Proj-rest         Proj-rest

             Sorted(b)         Heap

             Pages 1 Tups 359  Pages 40 Tups 359

             D65 C112          D32 C3

            /                  /

          r                   s

          Btree (b,c)         Hashed(NU)

          Pages 44 Tups 359   Pages 44 Tups 359

Previous Topic

Next Topic

Multiple Query Execution Plans

The query optimizer can generate multiple QEPs if the query includes any of the following objects:

As an example of multiple QEPs, consider the processing of a view on a union. The following statement creates the view:

create view view1 as

  select distinct col1 from arel

union

  select distinct col2 from arel;

There are two selects, designated #1 and #2 in their respective QEPs below. Now consider the query optimizer action in evaluating the following query:

select * from view1;

This generates three QEPs, which are shown in order in the example QEP diagrams that follow:

  1. The first select in the union
  2. The second select in the union
  3. Main query—the merged result of the two unioned selects

    QUERY PLAN of union view T0

                   Sort Unique

                   Pages 1 Tups 156

                   D1 C10

                  /

              Proj-rest

              Heap

              Pages 1 Tups 156

              D1 C0

              /

          aindex

          Isam(col2)

          Pages 2 Tups 156

    QUERY PLAN of union subquery

                 Sort Unique

                 Pages 1 Tups 156

                 D4 C10

                /

            Proj-rest

            Heap

            Pages 1 Tups 156

            D4 C0

           /

       arel

       Hashed(NU)

       Pages 70 Tups 156

    QUERY PLAN of main query

                 Sort Unique

                 Pages 1 Tups 156

                 D19 C20

                /

             Proj-rest

             Heap

             Pages 1 Tups 156

             D12 C11

            /

          T0

          Heap

          Pages 1 Tups 156

In the QEP pane of the SQL Scratchpad window in VDBA, these QEPs are shown in separate tabs.

Previous Topic

Next Topic

More Complex QEPs

The previous series of QEPs on the different classes of joins involved only two tables. More complex QEPs involving joins with three or more tables can be read as a sequence of two-table joins that have already been described, with the query optimizer deciding what is the optimal join sequence. The key to understanding these complex QEPs is recognizing the join sequences and the types of joins being implemented.

Previous Topic

Next Topic

Parallel Query Execution

With its thread support, Ingres has long supported the concurrent execution of separate queries. For short OLTP style queries, this permits a many fold increase in the number of such queries that can be executed in a given unit of time.

Ingres has the additional capability to split up the execution of individual long running queries over multiple threads. This parallel execution of a single complex query reduces the time to execute it.

Parallel query plans are implemented in Ingres by the introduction of the exchange node type. For more information, see Sample Parallel QEPs.

An exchange node marks the boundary between processing threads in a query plan. It can spawn one or many threads to execute the query plan fragment below the exchange node concurrent with the fragment above the exchange node. The exchange node itself passes or exchanges rows from threads below the node to the thread above the node.

Previous Topic

Next Topic

Types of Parallelism

Ingres compiles exchange nodes into queries to implement any of three types of parallelism:

Previous Topic

Next Topic

Enabling Parallel Query Plans

The generation of parallel query plans is controlled by several configuration parameters, as well as a session level Set statement. The opf_pq_dop parameter defines the degree of parallelism or maximum number of exchange nodes that can be compiled into a query plan. A value of 0 or 1 prevents the generation of parallel plans, but any other positive value enables them. The session level set parallel <n> statement can be used to override the CBF parameter, where <n> is the degree of parallelism for queries compiled in the session.

The opf_pq_threshold parameter is a companion to opf_pq_dop and defines the cost threshold of a query plan before exchange nodes are inserted into it. Because there are slight overheads in initiating parallel query plans, compile only plans that benefit from parallel processing. The Ingres query optimizer currently compiles a serial query plan as it always has. If the degree of parallelism is defined to permit exchange nodes to be added to the plan, the cost estimate of the serial plan must still exceed the threshold value before a parallel plan is generated. The cost estimate is computed as the sum of the C and D numbers at the top of the query plan.

Previous Topic

Next Topic

Sample Parallel QEPs

The following example shows a QEP diagram with a 1:n exchange node above a join node. It shows eight threads being created to divide the work of the join.

Two joins each of pairwise compatible partitions of the underlying tables are performed by each thread, all in parallel.

select a.col1, b.col2

   from aprel a, bprel b

   where a.col1 = b.col1

       and b.col2 between 500 and 1500

                                    Exchange

                                    Heap

                                    Pages 319 Tups 14012

                                    Reduction 4301

                                    Threads 8

                                    PC Join count 16

                                    D689 C6386

                         /

                        Hash Join(col1)

                        Heap

                        Pages 319 Tups 14012

                        PC Join count 16

                        D689 C6386

              /                     \

            Proj-rest               Proj-rest

            Heap                    Heap

            Pages 73 Tups 10000     Pages 83 Tups 14012

            D384 C100               D305 C140

 /                       /aprel           bprelB-Tree(NU)    B-Tree(col2)

Pages 768 Tups 10000    Pages 12872 Tups 299814

Partitions 128          Partitions 16

PC Join count 16        PC Join count 16

This example shows a union select in which the exchange node generates one thread to process each of the selects.

select a.col1 from arel a

    union all select b.col2 from brel b

QUERY PLAN 1,3, no timeout, of union subquery

            Proj-rest

            Heap

            Pages 25 Tups 8197

            D283 C82

 /

brel

(b)

Heap

Pages 1132 Tups 8197

QUERY PLAN 1,2, no timeout, of union subquery

            Proj-rest

            Heap

            Pages 16 Tups 5126

            D218 C51

 /

arel

(a)

Heap

Pages 872 Tups 5126

QUERY PLAN 1,1, no timeout, of main query

                        Exchange

                        Heap

                        Pages 54 Tups 17886

                        Reduction 242

                        Threads 2

                        D501 C266

             /

            Proj-rest

            Heap

            Pages 54 Tups 17886            D501 C266

 /

T3

Heap

Pages 16 Tups 17886

Previous Topic

Next Topic

Optimizer Timeout

The query optimizer does not search for further execution plans if it believes that the best plan it has found so far takes less time to execute than the time it has taken evaluating QEPs. In this case, the query optimizer times out, stopping its operation and returning with the best QEP it has found up to that point.

You can tell if the query optimizer timed out by checking the header of the QEP for set qep diagrams or the Timeout check box in the QEP pane of the SQL Scratchpad window in VDBA. In the QEP pane the Timeout check box is disabled if the optimizer was able to find the optimal QEP without timing out. If the optimizer timed out before searching all QEPs, this check box is enabled. In QEP diagrams generated using set qep, these same conditions are indicated in the QEP diagram header using the keywords "no timeout" and "timed out," respectively.

The timeout feature is controlled using the SQL set statement with the joinop [no]timeout clause, as discussed in the set statement entry in the SQL Reference Guide. By default, the optimizer times out, but you can turn the timeout feature off to enable the query optimizer to evaluate all plans. To do this, issue the following statement (for example, from the VDBA SQL Scratchpad window, from a terminal monitor, or from within an embedded SQL application):

set joinop notimeout

To turn the timeout feature back on, the corresponding statement is:

set joinop timeout

Because it is elapsed CPU time that is being measured, QEPs that time out can change, depending on machine load.

To control this feature using an operating system environment variable:

Windows:

set ING_SET=set joinop notimeout

Unix:

C shell:

setenv ING_SET "set joinop notimeout"

Bourne shell:

ING_SET = "set joinop notimeout"

export ING_SET

VMS:

define ING_SET "set joinop notimeout"

Note: The fact that the query optimizer has timed out on a particular query does not guarantee that a better QEP is found; it indicates that not all QEPs have been checked, and so potentially, a better QEP is found.

Previous Topic

Next Topic

Greedy Optimization

The Ingres query optimizer performs an exhaustive search of all possible plans for executing a query. It computes a cost estimate for each possible plan and chooses the cheapest alternative according to the cost algorithms. For relatively simple queries, this process is very fast. However, for queries involving large numbers of tables, the number of possible plans can be very large and the length of time spent enumerating them all and associating costs with them can be significant.

The Optimizer Timeout feature is useful in shortening the time taken to identify an efficient query plan. However, for very large queries, queries with large numbers of potentially useful secondary indexes and for large queries whose tables have large numbers of rows (leading to very expensive plans that do not timeout quickly), the optimizer can take an unacceptable length of time producing a viable query plan.

The query optimizer now includes an alternative mechanism for generating query plans that can greatly reduce the length of compile time. Rather than enumerate all possible query plans (including all permutations of table join order, all combinations of tables and useful secondary indexes, and all "shapes" of query plans), the new "greedy" enumeration heuristic starts with small plan fragments, using them as building blocks to construct the eventual query plan. The chosen fragments are always the lowest cost at that stage of plan construction, so even though large numbers of potential plans are not considered, those that are chosen are also based on cost estimation techniques.

This technique is used by default for any query involving at least five base tables and at least ten base tables, plus potentially useful secondary indexes. The chosen plans are usually very good, in particular, when the vastly reduced compile time is taken into consideration. However, for the rare cases in which the new technique produces a much slower plan, it can be toggled using the following statements:

[exec sql] set joinop nogreedy

or

exec sql] set joinop greedy

Previous Topic

Next Topic

Summary for Evaluating QEPs

Here is a list of the main points to check when evaluating a QEP:


© 2007 Ingres Corporation. All rights reserved.