Partitioning Schemes

A table can be partitioned. Partitioning distributes the rows of a table among a number of sub-tables (partitions). A partitioning scheme determines which rows are sent to which partitions.

After the partitioning scheme is defined, partitioning is managed automatically by Ingres.

To define a table with partitions, use the PARTITION= option in the CREATE TABLE WITH clause.

When creating tables, NOPARTITION is the default.

Each dimension of a partitioning scheme defines a rule, or distribution scheme, for assigning rows to partitions. Conceptually, a dimension defines a set of logical partitions; each logical partition can then be subdivided according to the next dimension's rule. Dimensions are evaluated from left to right.

Four distribution scheme types are available: AUTOMATIC, HASH, LIST, and RANGE. Hash, list, and range are data-dependent and require the ON clause. Automatic distribution is not data dependent and does not allow the ON clause.

An automatic distribution is used when the only goal is to spread rows evenly among the partitions. Rows are arbitrarily assigned to random partitions.

A hash distribution is used to spread rows evenly among the partitions deterministically, by use of a hash value (instead of randomly). Given a value for the partitioning columns, a query can predict which partition contains the rows that have the matching value. Thus a query can restrict its search to a subset of partitions.

A list distribution is used to assign rows to partitions based on specific values in one or more columns. A row's partitioning column values are compared to each partition's list values, and when a match is found, the row is sent to that partition. Multiple list values per partition are allowed. If a row matches any of the list values, that partition is selected. One of the partitions must contain the default value in its list; this partition is selected if the row matches none of the list values.

A range distribution is used to assign ranges of values to partitions. The range containing a row's partitioning column values determines the partition in which the row is placed. The ranges must be defined in such a way that every possible value falls into exactly one range. Overlapping ranges are not allowed. Separate ranges cannot map to the same partition—that is, one range, one partition.

A partition defined with values < rangevalue contains all possible values less than rangevalue, down to some smaller rangevalue in the scheme. Similarly, a partition defined with values > rangevalue contains all possible values greater than rangevalue, up to some larger rangevalue in the scheme. Because all values must be covered by some range, the smallest rangevalue must have the operator < (or <=), and the largest rangevalue must have the operator > (or >=). The partitions need not be defined in order of rangevalue.

Multi-column values are tested from left to right. For example, a three-column value (1, 10, 5) is greater than (1, 2, 3000).

While a null can be incorporated into a rangevalue, it is not recommended. The ordering of null relative to non-null values is defined by the SQL standard, so the resulting partitioning is dependent on server implementation.

The optional logical partition names must be unique for each table. The same partition name is allowed to occur in other partitioned tables. If a partition name is omitted, the system generates a name (of the form iipartnn).

If NO LOCATION= is specified for a partition, the location list is defined by the enclosing statement's with-clause (that is, the with-clause that contains the PARTITION= clause).

Partitioning Syntax

A table partition definition has the following format:

PARTITION = ( dimension ) | ( ( dimension ) { SUBPARTITION (dimension)} )

The syntax for each partition dimension is:

dimension = rule [ ON column { , column } ]

partitionspec { , partitionspec }

rule
Defines the type of distribution scheme for assigning rows to partitions. Valid values are:
- AUTOMATIC
  Assigns rows arbitrarily to random partitions. The ON clause cannot be used with automatic distribution.
- HASH
  If rule is specified as automatic or hash, the syntax for partitionspec merely defines the number of partitions and optionally their names:
  
  partitionspec = [nn] PARTITION[s] [ ( name {, name} ) ] [with-clause]
  
  The option number nn defaults to 1 if omitted.
- LIST
  Causes the syntax for partitionspec to define the list values to be mapped to each partition:
  
  partitionspec = PARTITION [name] values ( listvalue {, listvalue} )
  
  [with-clause]
  
  listvalue = single-constant-value
  
                |
  
                ( single-constant-value { , single-constant-value } )
  
                |
  
  default
  
  The first form is used when there is only one partitioning column, while the second form is used when there are multiple partitioning columns.
  
  RANGE
  
  Causes the syntax for partitionspec to define the ranges that map to each partition:
  
  partitionspec = PARTITION [name] values testing-op rangevalue
  [with-clause]
  
  where testing-op is one of the operators: < , <= , > , or >=
  
  rangevalue = single-constant-value
  
                |
  
                ( single-constant-value { , single-constant-value } )
  
  Range values work like list values: parentheses are needed for multiple ON columns. Ranges do not allow the default keyword.
  
  with-clause = with with-option
  
                | with ( with-option { , with-option } )
  
  with-option = location = ( location { , location } )