Defining Data Types

This chapter describes the data type definition and the routines that are required for a data type.

Data Type Definition

The DBMS Server requires a data type definition for each user-defined abstract data type. This definition specifies the name, length, and ID of the new data type. The definition also points to the routines that manage and manipulate the new data type.

Structure IIADD_DT_DFN Fields

The fields of the structure IIADD_DT_DFN, described below, compose the data type definition. The first five fields specify the name, length, ID of the new data type, its underlying type, and its attributes. These fields are as follows:

The remaining fields in the IIADD_DT_DFN structure are filled with the addresses of the required routines that manipulate the data type. The Required Routines for Data Type Definition section lists these routines and the common characteristics that they share. The remainder of the chapter describes each routine in detail.

Required Routines for Data Type Definition

For each data type that you add, you must provide the following routines:

These routines are called using the following syntax:

status = fid_routine(scb, [arg1 [, arg2]], result)

The routines must return a 4-byte integer of type II_STATUS whose value represents the overall result of the function. A value of 0 (II_OK) means successful completion. A value of 5 (II_ERROR) means an error. In the case of an error, the query execution is terminated.

Each function has two to four parameters. The first parameter must be a Session Control Block. This structure contains information used by the upper layers of the DBMS data type subsystem. Inside the Session Control Block is a structure named scb_error of the type II_ERR_STRUCT. The scb_error structure must be filled in whenever the function returns an error.

The fields of scb_error are listed here:

The last parameter must be a pointer to a result of some type. The result structure can be a single element, such as an integer, containing some sort of indicator, but most often it is an II_DATA_VALUE (defined below).

The result can contain valid data that specifies some portion of the work to be done. This is often the case when the routine creates one portion of a value and another portion is created elsewhere. For example, this happens when getempty() provides the data while the actual type and length are created elsewhere.

The two optional parameters (arg1 and arg2) are pointers to II_DATA_VALUE structures that describe the data values manipulated by the routine. Each function uses II_DATA_VALUE structures.
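The calling convention described above can be sketched as follows. This is a minimal illustration, not the definitions from the Ingres headers: the type name II_SCB, the II_ERR_STRUCT fields er_code and er_msg, the field types in II_DATA_VALUE, and the routine sketch_copy are all assumptions made for the example. Only II_STATUS, II_OK (0), II_ERROR (5), scb_error, db_data, db_length, and db_datatype are taken from this chapter.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef int32_t II_STATUS;           /* a 4-byte integer, per the text */
#define II_OK    0
#define II_ERROR 5

/* Simplified stand-ins; the real layouts come from the Ingres headers. */
typedef struct {
    int32_t er_code;                 /* hypothetical field name */
    char    er_msg[64];              /* hypothetical field name */
} II_ERR_STRUCT;

typedef struct {
    II_ERR_STRUCT scb_error;         /* field named in the text */
} II_SCB;                            /* type name is an assumption */

typedef struct {
    void    *db_data;                /* field types are assumptions */
    int32_t  db_length;
    int16_t  db_datatype;
} II_DATA_VALUE;

/* Hypothetical one-argument routine: copies arg1 into result. */
II_STATUS sketch_copy(II_SCB *scb, const II_DATA_VALUE *arg1, II_DATA_VALUE *result)
{
    assert(scb != NULL);             /* the first parameter is always required */

    if (arg1 == NULL || result == NULL ||
        arg1->db_data == NULL || result->db_data == NULL ||
        arg1->db_length < 0 || result->db_length < arg1->db_length) {
        /* Fill in scb_error before returning an error status. */
        scb->scb_error.er_code = 1;  /* hypothetical error number */
        snprintf(scb->scb_error.er_msg, sizeof scb->scb_error.er_msg,
                 "sketch_copy: bad argument");
        return II_ERROR;
    }

    memcpy(result->db_data, arg1->db_data, (size_t)arg1->db_length);
    result->db_length   = arg1->db_length;
    result->db_datatype = arg1->db_datatype;
    return II_OK;
}
```

A caller checks the returned status first and consults scb_error only when the status is not II_OK.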

The fields in the II_DATA_VALUE structure are as follows:

compare Routine—Compare Two Data Elements

This routine compares values of two user-defined data types. If the II_DT_NOSORT attribute is provided in the dtd_attributes field, then the compare routine is not necessary.

The input arguments are II_DATA_VALUE pointers to the two data elements being compared. The data elements must be of the same type. The final argument, result, must be set to be a negative number, 0, or a positive, non-zero number depending on whether the first argument is less than, equal to, or greater than the second argument. That is,

if arg1 < arg2, result is negative
else if arg1 == arg2, result equals 0
else result is positive (non-zero)

The address of this routine must be placed in the dtd_compare_addr field of the IIADD_DT_DFN structure.
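As an illustration of the contract above, the following sketch compares two values of a hypothetical user-defined type, SketchPoint, ordered first by x and then by y. The type, the routine name, the II_SCB type name, and the er_code field are assumptions made for the example. The result is computed with explicit comparisons rather than subtraction, so that large values cannot overflow.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

typedef int32_t II_STATUS;
#define II_OK    0
#define II_ERROR 5

/* Simplified stand-ins; the real layouts come from the Ingres headers. */
typedef struct { int32_t er_code; } II_ERR_STRUCT;     /* er_code is hypothetical */
typedef struct { II_ERR_STRUCT scb_error; } II_SCB;    /* type name assumed */
typedef struct { void *db_data; int32_t db_length; int16_t db_datatype; } II_DATA_VALUE;

/* Hypothetical user-defined type. */
typedef struct { int32_t x; int32_t y; } SketchPoint;

II_STATUS sketch_compare(II_SCB *scb, const II_DATA_VALUE *arg1,
                         const II_DATA_VALUE *arg2, int *result)
{
    assert(scb != NULL);

    /* Both elements must be of the same type and of the expected size. */
    if (arg1 == NULL || arg2 == NULL || result == NULL ||
        arg1->db_datatype != arg2->db_datatype ||
        arg1->db_length != (int32_t)sizeof(SketchPoint) ||
        arg2->db_length != (int32_t)sizeof(SketchPoint)) {
        scb->scb_error.er_code = 1;   /* hypothetical error number */
        return II_ERROR;
    }

    const SketchPoint *p = arg1->db_data;
    const SketchPoint *q = arg2->db_data;

    if (p->x != q->x)
        *result = (p->x < q->x) ? -1 : 1;
    else if (p->y != q->y)
        *result = (p->y < q->y) ? -1 : 1;
    else
        *result = 0;
    return II_OK;
}
```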

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS

dbtoev Routine—Determine External Data Type

The dbtoev routine determines the external data type to which a user-defined data type is converted.

This routine returns the external type specification for the input data type. A coercion (function instance) must be defined to convert the input data type to the given output data type and length.

This routine is called by the DBMS Server to determine how to pass a non-exportable data type to an Ingres tool as the result of a select or fetch statement. This routine sets the ev_value field to the external data type and length for the specified user-defined data type. You must place the address of this routine in the dtd_dbtoev_addr field of the IIADD_DT_DFN structure.

The output data type (db_datatype) must be an SQL data type. Valid values are:

The copy SQL statement does not use this interface. User-defined (and other non-exportable) data types are returned in their original state, as char data of the appropriate length.

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS

dhmax Routine—Create Default Maximum Histogram Value

This routine creates the default maximum histogram value. The default histogram values are used by the optimizer when no histogram data is present in the system catalogs. For a discussion of creating a default histogram routine, see dhmin Routine. If the II_DT_NOHISTOGRAM attribute is set, then this routine is not necessary.

This routine and the hmax routine form a pair, similar to the pair hmin and dhmin, except that hmax and dhmax deal with the maximum and default maximums, respectively, instead of the minimums.

Place the address of this routine in the dtd_dhmax_addr field of the IIADD_DT_DFN structure.

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS

dhmin Routine—Create Default Minimum Histogram Value

This routine creates the default minimum histogram value. You must place the address of this routine in the dtd_dhmin_addr field of the IIADD_DT_DFN structure. If the II_DT_NOHISTOGRAM attribute is present, then this routine is not necessary.

The default histogram values are used by the optimizer when no histogram data is present in the system catalogs. (Optimizedb, which creates statistics for use in histograms, cannot be run on user-defined data types.)

This routine differs from the hmin routine in that hmin provides the histogram for the smallest possible value, whereas dhmin provides the histogram for the smallest "usual" value.

No values are provided to this routine; you must determine what the minimum and maximum default values are. For example, if a data type is being used to store temperatures, a valid range is probably absolute zero to some very high number. However, a reasonable default minimum and maximum, indicating the range used by most queries, is probably -20 degrees (F) to +120 degrees (F) for temperatures in the continental US.
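The temperature example can be sketched as a dhmin and dhmax pair. The sketch assumes that each routine receives only the Session Control Block and a result whose db_data points to caller-allocated space, and that the histogram value for this hypothetical type is a 4-byte integer number of degrees. The actual parameters and histogram type depend on your definition. The routine names, the II_SCB type name, and the er_code field are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int32_t II_STATUS;
#define II_OK    0
#define II_ERROR 5

/* Simplified stand-ins; the real layouts come from the Ingres headers. */
typedef struct { int32_t er_code; } II_ERR_STRUCT;     /* er_code is hypothetical */
typedef struct { II_ERR_STRUCT scb_error; } II_SCB;    /* type name assumed */
typedef struct { void *db_data; int32_t db_length; int16_t db_datatype; } II_DATA_VALUE;

/* The "usual" range from the text, in degrees Fahrenheit. */
#define SK_DEFAULT_MIN_F (-20)
#define SK_DEFAULT_MAX_F 120

/* Store a 4-byte histogram value in the caller's result buffer. */
static II_STATUS put_degrees(II_SCB *scb, II_DATA_VALUE *result, int32_t degrees)
{
    assert(scb != NULL);
    if (result == NULL || result->db_data == NULL ||
        result->db_length < (int32_t)sizeof degrees) {
        scb->scb_error.er_code = 1;   /* hypothetical error number */
        return II_ERROR;
    }
    memcpy(result->db_data, &degrees, sizeof degrees);
    result->db_length = (int32_t)sizeof degrees;
    return II_OK;
}

II_STATUS sketch_dhmin(II_SCB *scb, II_DATA_VALUE *result)
{
    return put_degrees(scb, result, SK_DEFAULT_MIN_F);
}

II_STATUS sketch_dhmax(II_SCB *scb, II_DATA_VALUE *result)
{
    return put_degrees(scb, result, SK_DEFAULT_MAX_F);
}
```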

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS

getempty Routine—Get an Empty Value

This routine constructs the given empty value for this data type. 'Empty value' refers to the default value for a data type. (For example, the getempty routine for an integer creates the value 0.) NULLs are handled transparently, outside of this routine.

Place the address of this routine in the dtd_getempty_addr field of the IIADD_DT_DFN structure.

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS

hashprep Routine—Prepare Value for Hash Key

This routine prepares a data value for use as a hash key. Place the address of this routine in the dtd_hashprep_addr field of the IIADD_DT_DFN structure. If the II_DT_NOKEY attribute is present, then this routine is not necessary.

For most data types, hash key preparation is a simple copy operation, copying the input data to the output. However, some data types may require more processing. For example, character data types may require blank removal or case translation.

The DBMS Server hash algorithm treats the hash key as a simple byte stream. It does not make allowances for the special characteristics of a data type. You must normalize any variable-length data types within this routine. Unused space must be initialized to some known value. For example, character strings are typically padded with blanks. You must also ensure that there are no compiler-generated holes in your data type. Holes can occur when a compiler pads a structure definition for alignment.

This routine must transform any two values of a data type that compare as equal (using the compare routine) into identical byte streams.
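The following sketch shows one way to meet these requirements for a hypothetical string type, SketchStr, whose compare routine is assumed to ignore case and trailing blanks. The key is written as a plain byte array rather than a structure, so it cannot contain compiler-generated holes, and every unused byte is set to a blank. The type, the routine name, the II_SCB type name, and the er_code field are assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>

typedef int32_t II_STATUS;
#define II_OK    0
#define II_ERROR 5

/* Simplified stand-ins; the real layouts come from the Ingres headers. */
typedef struct { int32_t er_code; } II_ERR_STRUCT;     /* er_code is hypothetical */
typedef struct { II_ERR_STRUCT scb_error; } II_SCB;    /* type name assumed */
typedef struct { void *db_data; int32_t db_length; int16_t db_datatype; } II_DATA_VALUE;

/* Hypothetical variable-length string type with a fixed capacity. */
#define SK_CAP 14
typedef struct { int16_t len; char text[SK_CAP]; } SketchStr;

II_STATUS sketch_hashprep(II_SCB *scb, const II_DATA_VALUE *in, II_DATA_VALUE *key)
{
    assert(scb != NULL);
    if (in == NULL || key == NULL || in->db_data == NULL ||
        key->db_data == NULL || key->db_length < SK_CAP) {
        scb->scb_error.er_code = 1;   /* hypothetical error number */
        return II_ERROR;
    }

    const SketchStr *s = in->db_data;
    if (s->len < 0 || s->len > SK_CAP) {
        scb->scb_error.er_code = 2;   /* hypothetical error number */
        return II_ERROR;
    }

    /* Trailing blanks are not significant for this type. */
    int n = s->len;
    while (n > 0 && s->text[n - 1] == ' ')
        n--;

    /* Initialize every byte of the key, then fold case. */
    unsigned char *out = key->db_data;
    memset(out, ' ', SK_CAP);
    for (int i = 0; i < n; i++)
        out[i] = (unsigned char)tolower((unsigned char)s->text[i]);

    key->db_length = SK_CAP;
    return II_OK;
}
```

With this normalization, the values 'Abc' and 'ABC  ' (two trailing blanks) produce identical byte streams.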

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS

helem Routine—Create a Histogram Element for Data Value

This routine creates a histogram value for a data element. Place the address of this routine in the dtd_helem_addr field of the IIADD_DT_DFN structure. If the II_DT_NOHISTOGRAM attribute is present, then this routine is not necessary.

A histogram value is a representation of the data element. The DBMS query optimizer uses histogram values in the evaluation of query plans. The optimizer restricts the length of the histogram value to 8.

The comparison of the histograms of two values must match the comparison of their respective values, because the histogram value definition (if a < b, then h(a) < h(b)) assumes that 'a < b' uses the same compare routine. Histogram values have a type; for details, see hg_dtln Routine.
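As a sketch of an order-preserving histogram routine, consider a hypothetical character type whose values are compared byte by byte after blank padding. Taking the first 8 bytes, padded with blanks, gives a histogram value whose ordering never contradicts the ordering of the full values, although two values that share an 8-byte prefix map to the same histogram value. The sketch treats the limit of 8 as a byte count and the histogram as a byte string; both are assumptions, as are the routine name, the II_SCB type name, and the er_code field.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int32_t II_STATUS;
#define II_OK    0
#define II_ERROR 5

/* Simplified stand-ins; the real layouts come from the Ingres headers. */
typedef struct { int32_t er_code; } II_ERR_STRUCT;     /* er_code is hypothetical */
typedef struct { II_ERR_STRUCT scb_error; } II_SCB;    /* type name assumed */
typedef struct { void *db_data; int32_t db_length; int16_t db_datatype; } II_DATA_VALUE;

#define SK_HIST_LEN 8   /* histogram length limit stated in the text */

II_STATUS sketch_helem(II_SCB *scb, const II_DATA_VALUE *in, II_DATA_VALUE *hist)
{
    assert(scb != NULL);
    if (in == NULL || hist == NULL || in->db_data == NULL ||
        hist->db_data == NULL || in->db_length < 0 ||
        hist->db_length < SK_HIST_LEN) {
        scb->scb_error.er_code = 1;   /* hypothetical error number */
        return II_ERROR;
    }

    /* Copy at most SK_HIST_LEN bytes; pad the remainder with blanks. */
    size_t n = (size_t)(in->db_length < SK_HIST_LEN ? in->db_length : SK_HIST_LEN);
    memset(hist->db_data, ' ', SK_HIST_LEN);
    memcpy(hist->db_data, in->db_data, n);
    hist->db_length = SK_HIST_LEN;
    return II_OK;
}
```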

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS

hg_dtln Routine—Provide Type and Length for Histogram Value

The hg_dtln routine provides the data type and length for a histogram value for a given data type.

This routine builds a datavalue, dv_histogram, which describes the data type and length of the histogram value for the data type specified in the input dv_from. Place the address of this routine in the dtd_hg_dtln_addr field of the IIADD_DT_DFN structure. If the II_DT_NOHISTOGRAM attribute is present, then this routine is not necessary.

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS

hmax Routine—Create Histogram Value for Maximum Value

This routine is used by the optimizer to obtain the histogram value for the largest value of a type. Place the address of this routine in the dtd_hmax_addr field of the IIADD_DT_DFN structure. If the II_DT_NOHISTOGRAM attribute is present, then this routine is not necessary.

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS

hmin Routine—Create Histogram Value for Minimum Value

This routine is used by the optimizer to obtain the histogram value for the smallest value of a type. For a discussion of histograms, see helem Routine.

Note: The smallest value for the given data type is expected to be known implicitly by the routine.

Place the address of this routine in the dtd_hmin_addr field of the IIADD_DT_DFN structure. If the II_DT_NOHISTOGRAM attribute is present, then this routine is not necessary.

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS

Previous Topic

Next Topic

keybuild Routine—Build a Key from the Value

The keybuild routine builds an isam, B-tree, or hash key from the value.

This routine constructs a key pair for use by the system. Place the address of this routine in the dtd_keybld_addr field of the IIADD_DT_DFN structure. If the II_DT_NOKEY attribute is present, then this routine is not necessary.

A key pair consists of a high-key/low-key combination whose values represent the largest and smallest values that match the key, respectively. The key pair that results from this routine is based upon the type of key desired.

The DBMS query optimizer uses this operation for building keys for traversing hash, isam, or B-tree tables. Whenever the DBMS Server must look up a value in a table using an ordered index (either hash, isam, or B-tree), it uses that value to form two other values. These two values represent the 'key', that is, the upper and lower limits of the search space. It is not guaranteed that all values in the relation matching the value are between the upper and lower limits produced by keybuild().

Along with the value being keyed on, the caller of keybuild must specify the comparison operator being used (for example, '<'). One of the input parameters for this routine, .adc_opkey, represents the type of operation for which this key is being built. The possible values for this parameter and the operators they represent are:

Keybuild's main purpose is to build the upper and lower values of the search space. These values are called the high key and low key, respectively. In addition, keybuild returns the type of key that was formed, which tells what type of search must be performed.

The value returned in adc_tykey determines whether or not a key pair is built and, if built, whether the pair is the high or low key. If you are interested only in what type of key is built and not in the actual search space, then set the db_data field (in the II_DATA_VALUE structure pointed to by adc_lokey and/or adc_hikey) to point to a zero address.

The following are the values returned in adc_tykey and their interpretations:

The most likely combinations are:

Although there is fairly strong correlation between the key operator and the type of key built, you cannot use the key operator to predict with certainty the type of key built. For example, assume that you are keying on an i2 column with the '<' operator but the supplied key value is an i4 whose value is 50000. The key that is built is II_KALLMATCH, not II_KHIGHKEY, as might be expected. Do not rely on the key operator to tell you what type of key is built.
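The i2 and i4 example can be worked through in code. The sketch below classifies the key for an i2 column when the supplied value is an i4. The enumerations and the function are hypothetical: SK_KHIGHKEY and SK_KALLMATCH stand in for II_KHIGHKEY and II_KALLMATCH, and the other names and all numeric values are assumptions. Passing a null pointer for lokey or hikey mirrors the case where only the key type is of interest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the comparison operators and key types. */
typedef enum { SK_OP_LT, SK_OP_LE, SK_OP_EQ, SK_OP_GE, SK_OP_GT } SketchOp;
typedef enum {
    SK_KNOMATCH, SK_KEXACTKEY, SK_KLOWKEY, SK_KHIGHKEY, SK_KALLMATCH
} SketchKeyType;

/* Classify a key on an i2 column for an i4 search value. */
SketchKeyType sketch_keytype(SketchOp op, int32_t value,
                             int16_t *lokey, int16_t *hikey)
{
    int above = value > INT16_MAX;   /* larger than any i2 value  */
    int below = value < INT16_MIN;   /* smaller than any i2 value */

    switch (op) {
    case SK_OP_EQ:
        if (above || below) return SK_KNOMATCH;
        if (lokey != NULL) *lokey = (int16_t)value;
        if (hikey != NULL) *hikey = (int16_t)value;
        return SK_KEXACTKEY;
    case SK_OP_LT:
    case SK_OP_LE:
        if (above) return SK_KALLMATCH;   /* every i2 value qualifies */
        if (below) return SK_KNOMATCH;
        if (hikey != NULL) *hikey = (int16_t)value;
        return SK_KHIGHKEY;
    case SK_OP_GE:
    case SK_OP_GT:
        if (below) return SK_KALLMATCH;
        if (above) return SK_KNOMATCH;
        if (lokey != NULL) *lokey = (int16_t)value;
        return SK_KLOWKEY;
    }
    assert(!"unknown operator");
    return SK_KNOMATCH;
}
```

With the '<' operator, a value of 100 yields a high key, whereas a value of 50000 yields the all-match result because every i2 value is less than 50000.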

Note: The data type of the data value in .adc_kdv might not be the same type as the key required from this routine. If it is not, you must supply a coercion to change it to the required data type.

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS

length_check Routine—Check for Valid Length

The length_check routine checks that the specified length for the data type is valid. If the length is user specified, the routine returns the corresponding internal length. If the length is not user specified, the routine returns the user length that corresponds to the internal length.

Place the address of this routine in the dtd_lenchk_addr field of the IIADD_DT_DFN structure.

If the value of user_specified is not 0, then the length is a value specified by a user or user program (for example, 4 if the user typed 'varchar(4)'). If the value of user_specified is 0, then the length is the internal length (for example, 6 for varchar(4)).

If you specify result_dv, then it must be set to the valid length regardless of the success or failure of the routine. If user_specified is non-zero, then result_dv must specify the corresponding internal length. Conversely, if user_specified is zero, then result_dv must specify the user length corresponding to the provided internal length.
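The two directions of the length conversion can be sketched for a hypothetical varchar-like type. The sketch assumes a 2-byte count prefix, which matches the varchar(4) example (user length 4, internal length 6), a made-up maximum user length of 2000, and a parameter order of scb, user_specified, dv, result_dv. On failure it reports the nearest valid length, which is one possible reading of "the valid length". The routine name, the II_SCB type name, and the er_code field are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t II_STATUS;
#define II_OK    0
#define II_ERROR 5

/* Simplified stand-ins; the real layouts come from the Ingres headers. */
typedef struct { int32_t er_code; } II_ERR_STRUCT;     /* er_code is hypothetical */
typedef struct { II_ERR_STRUCT scb_error; } II_SCB;    /* type name assumed */
typedef struct { void *db_data; int32_t db_length; int16_t db_datatype; } II_DATA_VALUE;

#define SK_COUNT_BYTES  2      /* assumed size of the count prefix */
#define SK_MAX_USER_LEN 2000   /* hypothetical upper limit */

II_STATUS sketch_length_check(II_SCB *scb, int user_specified,
                              const II_DATA_VALUE *dv, II_DATA_VALUE *result_dv)
{
    assert(scb != NULL && dv != NULL);

    /* Valid range for the kind of length that was supplied. */
    int32_t lo = 1, hi = SK_MAX_USER_LEN;
    if (!user_specified) {
        lo += SK_COUNT_BYTES;
        hi += SK_COUNT_BYTES;
    }

    int32_t len = dv->db_length;
    int valid = (len >= lo && len <= hi);
    if (!valid)
        len = (len < lo) ? lo : hi;   /* nearest valid length (assumption) */

    /* Set the result whether or not the input length was valid. */
    if (result_dv != NULL)
        result_dv->db_length = user_specified ? len + SK_COUNT_BYTES
                                              : len - SK_COUNT_BYTES;

    if (!valid) {
        scb->scb_error.er_code = 1;   /* hypothetical error number */
        return II_ERROR;
    }
    return II_OK;
}
```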

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS

minmaxdv Routine—Provide Min/Max Values and Lengths

The minmaxdv routine provides the minimum and maximum values and lengths for a data type.

Place the address of this routine in the dtd_minmaxdv_addr field of the IIADD_DT_DFN structure. If the II_DT_NOHISTOGRAM attribute is present, then this routine is not necessary.

Depending on the input parameters, the routine returns one or both of the following:

The two input parameters are min_dv and max_dv; both are pointers to II_DATA_VALUEs. The lengths specified (db_length) for each may be different, but their data types (db_datatype) must be the same.

The routine uses the following rules to process these inputs:

1. If an input is NULL, then processing for that input is not performed. This allows the caller who is interested in only the maximum value or only the minimum to use this routine more efficiently.

2. If the db_length field of an input is supplied as II_LEN_UNKNOWN, no corresponding value is built and placed at the output's db_data field. Instead, the routine returns the valid internal length to the db_length field.

3. If the db_data field of an input is NULL, then no value is built and placed at the corresponding output's db_data field.

4. If none of rules 1-3 apply to an input, then the value for the data type and length is built and placed at db_data.
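The rules above can be sketched for a hypothetical data type stored as a 4-byte integer. The numeric value of II_LEN_UNKNOWN, the routine names, the II_SCB type name, and the er_code field are assumptions made for the example. The comments refer to the rules in the order they are listed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef int32_t II_STATUS;
#define II_OK    0
#define II_ERROR 5

/* Simplified stand-ins; the real layouts come from the Ingres headers. */
typedef struct { int32_t er_code; } II_ERR_STRUCT;     /* er_code is hypothetical */
typedef struct { II_ERR_STRUCT scb_error; } II_SCB;    /* type name assumed */
typedef struct { void *db_data; int32_t db_length; int16_t db_datatype; } II_DATA_VALUE;

#define II_LEN_UNKNOWN (-1)   /* name from the text; value is hypothetical */

/* Apply the four rules to one of the two inputs. */
static II_STATUS fill_limit(II_DATA_VALUE *dv, int32_t value)
{
    if (dv == NULL)                          /* rule 1: skip this input */
        return II_OK;
    if (dv->db_length == II_LEN_UNKNOWN) {   /* rule 2: report length only */
        dv->db_length = (int32_t)sizeof value;
        return II_OK;
    }
    if (dv->db_data == NULL)                 /* rule 3: build no value */
        return II_OK;
    if (dv->db_length < (int32_t)sizeof value)
        return II_ERROR;                     /* buffer too small for this type */
    memcpy(dv->db_data, &value, sizeof value);   /* rule 4: build the value */
    return II_OK;
}

II_STATUS sketch_minmaxdv(II_SCB *scb, II_DATA_VALUE *min_dv, II_DATA_VALUE *max_dv)
{
    assert(scb != NULL);

    /* When both inputs are supplied, their data types must match. */
    if (min_dv != NULL && max_dv != NULL &&
        min_dv->db_datatype != max_dv->db_datatype) {
        scb->scb_error.er_code = 1;   /* hypothetical error number */
        return II_ERROR;
    }
    if (fill_limit(min_dv, INT32_MIN) != II_OK ||
        fill_limit(max_dv, INT32_MAX) != II_OK) {
        scb->scb_error.er_code = 2;   /* hypothetical error number */
        return II_ERROR;
    }
    return II_OK;
}
```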

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS

seglen Routine—Determine Length of Each Long Segment

This routine returns the maximum number of bytes that can fit into a segment. For a peripheral like long line, it is the number of points that can fit in the input length times the size of a point plus the size of a line's overhead.

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS

tmcvt Routine—Convert Data Type to Displayable Format

The tmcvt routine converts data of a user-defined data type from an internal format to a displayable format. (This displayable format is used by a terminal monitor when user-defined data types are sent without conversion to a terminal monitor.) Place the address of this routine in the dtd_tmcvt_addr field of the IIADD_DT_DFN structure.

This routine is used by the DBMS Server to format various trace statements and error messages.

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS

tmlen Routine—Determine Display Length

The tmlen routine determines the display length of the data type.

This routine returns the default and worst-case lengths for a data type if it were to be printed as text, for example, by a terminal monitor. Although user-defined data types are not returned to a terminal monitor as the user-defined types, this routine is needed by various trace flags and error formatting within the DBMS Server.

Place the address of this routine in the dtd_tmlen_addr field of the IIADD_DT_DFN structure.

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS

value_check Routine—Check for Valid Values

The value_check routine checks for valid values.

For some data types, only certain characters or bit patterns might be valid. This routine checks the patterns for validity. For example, this routine rejects C data type values that contain null characters.

Place the address of this routine in the dtd_valchk_addr field of the IIADD_DT_DFN structure.
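A value check of the kind described, rejecting character values that contain null characters, might look like the following sketch. The routine name, the II_SCB type name, and the er_code field are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef int32_t II_STATUS;
#define II_OK    0
#define II_ERROR 5

/* Simplified stand-ins; the real layouts come from the Ingres headers. */
typedef struct { int32_t er_code; } II_ERR_STRUCT;     /* er_code is hypothetical */
typedef struct { II_ERR_STRUCT scb_error; } II_SCB;    /* type name assumed */
typedef struct { void *db_data; int32_t db_length; int16_t db_datatype; } II_DATA_VALUE;

/* Accept a value only if none of its db_length bytes is a null character. */
II_STATUS sketch_value_check(II_SCB *scb, const II_DATA_VALUE *dv)
{
    assert(scb != NULL);
    if (dv == NULL || dv->db_data == NULL || dv->db_length < 0 ||
        memchr(dv->db_data, '\0', (size_t)dv->db_length) != NULL) {
        scb->scb_error.er_code = 1;   /* hypothetical error number */
        return II_ERROR;
    }
    return II_OK;
}
```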

Inputs

The inputs for this function are:

Outputs

This function has no outputs. The return value (II_OK or II_ERROR) determines correctness.

Returns

II_STATUS

xform Routine—Transform Long Types into Segments

The xform routine transforms a long data type into its component segments. Place the address of this routine in the dtd_xform_addr field of the IIADD_DT_DFN structure.

The shd_exp_action field contains the instructions for this routine. On the first call, it is set to ADW_START. Thereafter, the caller examines it when this routine returns to determine the caller's next action. Set ADW_GET_DATA to have the caller provide the next section of data. If the caller receives ADW_GET_DATA and there is no more data, then it flushes any current data and does not return to this routine. ADW_FLUSH_SEGMENT indicates that the caller disposes of the output segment and supplies a new, empty one on the next call. ADW_CONTINUE indicates that the routine is to be called again. ADW_STOP indicates that there is a problem and the process has failed; the routine is not called again.
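The exchange between the caller and the routine can be pictured with a toy driver loop. Everything in this sketch other than the action names and shd_exp_action is hypothetical: the numeric values of the actions, the workspace layout, and both functions. The toy routine copies input bytes into a fixed-size output segment, and the driver reacts to the action that the routine leaves behind.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Action names come from the text; the numeric values are hypothetical. */
enum { ADW_START, ADW_GET_DATA, ADW_FLUSH_SEGMENT, ADW_CONTINUE, ADW_STOP };

/* Hypothetical workspace shared by the caller and the routine. */
typedef struct {
    int         shd_exp_action;
    const char *in;  size_t in_len,  in_used;    /* current input section  */
    char       *out; size_t out_cap, out_used;   /* current output segment */
} SketchWork;

/* Toy xform: copy input into the segment, then tell the caller what to do. */
void sketch_xform(SketchWork *w)
{
    assert(w != NULL);
    while (w->in_used < w->in_len && w->out_used < w->out_cap)
        w->out[w->out_used++] = w->in[w->in_used++];
    w->shd_exp_action = (w->out_used == w->out_cap) ? ADW_FLUSH_SEGMENT
                                                    : ADW_GET_DATA;
}

/* Toy caller: returns the number of segments flushed, or -1 on failure.
   The sink must be large enough for all input plus a terminating null. */
int sketch_drive(const char *const *chunks, int nchunks,
                 char *seg, size_t seg_cap, char *sink)
{
    SketchWork w = { ADW_START, NULL, 0, 0, seg, seg_cap, 0 };
    size_t sunk = 0;
    int next = 0, segments = 0;

    if (seg_cap == 0)
        return -1;
    for (;;) {
        sketch_xform(&w);
        switch (w.shd_exp_action) {
        case ADW_GET_DATA:
            if (next == nchunks) {       /* no more data: flush and finish */
                if (w.out_used > 0) {
                    memcpy(sink + sunk, seg, w.out_used);
                    sunk += w.out_used;
                    segments++;
                }
                sink[sunk] = '\0';
                return segments;
            }
            w.in = chunks[next++];       /* supply the next section */
            w.in_len = strlen(w.in);
            w.in_used = 0;
            break;
        case ADW_FLUSH_SEGMENT:          /* dispose of the full segment */
            memcpy(sink + sunk, seg, w.out_used);
            sunk += w.out_used;
            segments++;
            w.out_used = 0;              /* new, empty segment */
            break;
        case ADW_CONTINUE:               /* simply call the routine again */
            break;
        default:                         /* ADW_STOP or unknown action */
            return -1;
        }
    }
}
```

For input sections "abcde" and "fgh" with a 3-byte segment, the driver flushes the segments "abc", "def", and "gh".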

Inputs

The inputs for this function are:

Outputs

The outputs for this function are:

Returns

II_STATUS


© 2007 Ingres Corporation. All rights reserved.