The optimizedb command generates statistics that are used by the Ingres Query Optimizer to select an efficient query processing strategy.
Statistics are generated on the specified columns, and stored in system catalogs (iistats and iihistograms).
Complete and accurate statistics in the system catalogs result in more efficient query execution strategies and faster system performance. The process of generating complete and accurate statistics requires time, but a balance between accurate statistics and the time to generate them can be achieved by specifying the -zx or -zs flag. Statistics need to be refreshed only when a significant change in the distribution of a column's values has occurred.
The statistics generated by the optimizedb command for any column consist of two elements:
The accuracy of the histograms can be controlled by the -zu# and -zr# flags described below. Increasing the number of cells in the histograms increases the amount of space required for the iihistograms table and thus increases somewhat the amount of space and time used by the optimizer. However, the increased accuracy of the statistics generally results in more efficient query execution strategies.
We recommend that you generate the statistics for all columns that appear in the qualification (where clause) of a query statement. If statistics are missing or incorrect, the query will still execute, but the speed of query processing can be affected.
After running optimizedb, it is prudent to run sysmod. This is especially true the first time optimizedb is run on a database.
Note: Although optimizedb does not lock the database or individual tables while it is retrieving values and generating statistics, after the statistics have been collected and stored in the appropriate catalogs, optimizedb takes an exclusive lock on the database or individual tables to complete its task.
For additional information on the Ingres Query Optimizer and the use of the optimizedb command, see the Database Administrator Guide.
The optimizedb command has the following format:
optimizedb [SQL option flags][-i filename] [-o filename] [-z flags]
dbname[/server_class] {-rtablename {-acolumnname}} | {-xrtablename} [-help]
Indicate SQL option flags that are automatically passed. The optimizedb command accepts the following SQL option flags. For a complete description of these flags, see the sql Command.
+U | -U
-u
-cN
-tN
-ikN
-fkxM.N
+w | -w
-xk
Reads statistics from filename instead of operating directly on the database.
The filename must be a file in ASCII format that was generated by the statdump command using the -o flag. While you can edit this file, only two types of changes are acceptable: modifying values and adding rows that describe cells. Do not change the format of the file, that is, do not change the order in which data appears or add an incomplete new row.
The -r and -a flags, when used with this flag, act as filters. Optimizedb reads in from the file only those statistics that belong to the specified table or column.
Optimizedb does not use the row and page count values in the file unless the -zp flag is also specified.
Note: These values are vital for correct operation of the DBMS. If you use the -zp flag, be sure to put new values for row and page counts in iitables.
Warning! A file with histogram data represented in hex format (generated by the -zhex flag) cannot be used as input to the optimizedb -i command. Doing so will result in incorrect histogram data, which will affect the performance of optimization algorithms.
Writes the output to the specified file instead of to the system catalogs.
Specify options to optimizedb. For details, see Optimizedb -z Flags.
Indicates the name of the database, and if required, the server_class, as described in Standard Flags and Parameters.
Specifies tablenames to be processed. If no table name is specified, then all columns for all tables in the database are processed.
The table name can be qualified with a valid schema name in the format schema.tablename, as described in Schema Qualifier.
If tablename specifies a secondary index name, optimizedb creates a composite histogram on the key columns comprising the index.
Specifies one or more tablenames to be excluded from processing. Except for these tables, all columns in all tables in the database are processed.
Note: Using both the -rtablename and -xrtablename parameters is not permitted in a single optimizedb request; nor is using both the -xrtablename and -acolumnname parameters.
Limits processing to the specified columns plus any columns included through the -zk flag. You can use the -acolumname flag only if the -rtablename parameter is specified.
Displays command syntax online.
The –z flags on the optimizedb command are as follows:
Directs optimizedb to optimize the system catalogs in addition to the base tables. If you want to optimize selected system catalogs only, use this flag and specify the individual tables with the -r flag. This flag is valid only if the user issuing the command is the DBA for the specified database.
Requests a composite histogram on primary key structure.
Directs optimizedb to use its algorithm to estimate the number of distinct values and repetition factor for a column whose histogram is built with sampling (see the -zs# option).
Directs optimizedb to read filename for all other command line flags, database names, and any other command line arguments. This file must contain only one flag per line (see the examples below). If this flag is specified, no other flags or arguments can appear on the command line; they must, instead, appear in the specified file.
Directs optimizedb to use the "fast query" option, which significantly reduces the time to build a histogram. This option improves performance only when the repetition factor of the column is 20 or higher.
The -zfq flag can also cause optimizedb to generate a global temporary table from the values of the histogrammed columns when more than one column is identified in the optimizedb command. The histograms are then built by reading from the faster temporary table, rather than from the base table. The smaller and faster temporary table offers additional performance benefits for the fast query option.
Optimizedb builds the global temporary table when -zfq is specified, and when the number of histogrammed columns and the size of the temporary table row (relative to the size of a base table row) meet certain criteria. See the description of the –znt flag, which can be used with the –zfq flag.
Because there is no performance benefit in building more than one histogram on a table with a single execution of optimizedb, it is recommended that repetitious columns be specified in one execution of optimizedb (with the –zfq flag) and that the others be specified in a separate execution.
Prints the histogram that was generated for each column. This flag also implies the -zv flag.
Generates histogram cell values in hex format, which is useful for seeing how Unicode data is stored. This flag is only effective when used with the -zh and the -o flags.
Generates statistics for columns that are keys on the table or are indexed, in addition to columns specified on the command line.
Reuses existing repetition factor if there is one.
Disables the default behavior of creating histograms from a maximum 500,000 row sample. Using this parameter assures that all rows are read from a table during the histogram building process.
Disables the use of global temporary tables when using the "fast query" option (-zfq) if disk space is not sufficient.
This flag is used only with the -zfq flag.
Directs optimizedb to read floating-point numbers using the precision level specified by #. Use this flag in conjunction with the -i filename flag.
Directs optimizedb to read the row and page count values in the file specified with the -i flag and to store those values in the appropriate system catalog (they can be viewed in iitables).
Specifies the maximum number of cells that the histogram can contain if optimizedb creates an inexact histogram. In an inexact histogram, each cell represents a range of values.
The allowable range is 1<#<15000 (that is, the minimum is 2 and the maximum is 14999).
The default number of cells is 100.
Creates statistics based on sample data. The percentage of table rows sampled is determined by the value of #. This number must be a floating-point number in the range of 0 to 100. Specifying the optional s (-zss) will cause the tuple identifiers (TIDs), which are used to retrieve the sample rows, to be sorted before the rows are retrieved. This decreases retrieval time but increases the amount of memory used by optimizedb.
Specifies the maximum number of cells an exact histogram can contain. In an exact histogram, each cell represents a single, unique value.
The allowable range is 1 to 15000.
The default number of cells is 100.
Prints information about each column as it is being processed.
Sets the complete flag, which indicates whether a column contains all possible values. The range of values in a column affects query optimization. By default, columns are assumed to be not complete.
Directs optimizedb to determine only the minimum and maximum values for each column rather than full statistics. Because minimum and maximum values for columns from the same table can be determined by a single scan through the table, this flag provides a quick way to generate a minimal set of statistics. Minimal statistics cannot be created on columns holding only null values.
This command generates full statistics for all columns in all tables in the empdata database:
optimizedb empdata
This command generates statistics for key or indexed columns in the employee and dept tables and for the dno column in the dept table:
optimizedb -zk empdata -remployee -rdept -adno
This command performs the same operation as the previous example, but from a file:
optimizedb -zf flagfile
where flagfile contains:
-zk
empdata
-remployee
-rdept
-adno
The following command does the following:
optimizedb
-zk
-zv
-zx
empdata
-remployee
-aeno
-rdept
-rsalhist;
This command allows up to 100 unique values from each column in the employee table before merging adjacent values into the same histogram cell:
optimizedb
-zu100
empdata
-remployee;