This chapter describes the data type definition and the routines that are required for a data type.
The DBMS Server requires a data type definition for each user-defined abstract data type. This definition specifies the name, length, and ID of the new data type. The definition also points to the routines that manage and manipulate the new data type.
The fields of the structure IIADD_DT_DFN, described below, compose the data type definition. The first five fields specify the name, length, ID of the new data type, its underlying type, and its attributes. These fields are as follows:
Contains the value (hex 210) specified by II_O_DATATYPE.
Is the name of the new data type. The name must be a character string no longer than 32 characters. If the string is shorter than 32 characters, it must be null terminated. For example, the data type char is specified as 'char\0'.
Is the data type identifier. It is a 2-byte integer field. The ID must be a value between the values represented by ADD_LOW_USER and ADD_HIGH_USER, 16384 and 16511 respectively. This field cannot be altered once the data type is in use.
Is the data type ID, used to store large object segments. This field is used only when the II_DT_PERIPHERAL attribute is set.
Specifies the attributes of the data type. If none of the attributes are necessary or appropriate, then set this field to II_DT_NOBITS (bits are described in the following table).
Large objects do not have any inherent sort order, cannot be used as keys for tables, and cannot have histograms. To specify the attributes in the dtd_attributes field, use the constants listed here:
Indicates that the data type has no specific attributes.
Indicates that the data type cannot be specified as a key column in a modify or create index statement.
Indicates that the data type cannot be specified as the target of a sort by or order by clause in a query, nor can the DBMS Server sort on this data type during the execution of a query. Data types tagged with this bit may have no inherent sort order. They are simply marked as "different" by the sort comparison routine.
Indicates that histograms cannot be constructed for this data type.
Indicates that the data type is stored outside of the basic table format. This means that the table itself can contain either the full data element or it can contain a "coupon" that can be redeemed later to obtain the actual data type.
The data representing a peripheral data type is always represented by an II_PERIPHERAL structure. This structure represents the union of the II_COUPON structure and a byte stream (an array of 1 byte/char). This data structure also contains a flag indicating whether this is the real thing or the coupon.
Indicates that the data type can be specified as occurring a maximum number of times. If compressed, only the actual number of occurrences is saved, thereby saving disk storage.
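The name and ID constraints described above can be checked before a definition is registered. The following is a minimal sketch only: ADD_LOW_USER and ADD_HIGH_USER use the values quoted in this chapter, while DT_NAME_MAX, dt_id_is_valid, and dt_set_name are hypothetical names introduced for illustration and are not part of the product interface.

```c
#include <string.h>

/* Values quoted in the text above. */
#define ADD_LOW_USER  16384
#define ADD_HIGH_USER 16511
#define DT_NAME_MAX   32    /* hypothetical name for the 32-character limit */

/* Hypothetical helper: returns 1 if the ID lies in the user range, else 0. */
static int dt_id_is_valid(int id)
{
    return id >= ADD_LOW_USER && id <= ADD_HIGH_USER;
}

/*
 * Hypothetical helper: copies a name into a 32-byte name field.
 * A name shorter than 32 characters ends up null terminated; a name of
 * exactly 32 characters fills the field with no terminator.
 * Returns 0 on success, -1 if the name is empty or too long.
 */
static int dt_set_name(char field[DT_NAME_MAX], const char *name)
{
    size_t n = strlen(name);

    if (n == 0 || n > DT_NAME_MAX)
        return -1;
    memset(field, 0, DT_NAME_MAX);   /* zero fill supplies the terminator */
    memcpy(field, name, n);
    return 0;
}
```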
The remaining fields in the IIADD_DT_DFN structure are filled with the addresses of the required routines that manipulate the data type. The Required Routines for Data Type Definition section lists these routines and the common characteristics that they share. The remainder of the chapter describes each routine in detail.
For each data type that you add, you must provide the following routines:
These routines are called using the following syntax:
status = fid_routine(scb, [arg1 [, arg2]], result)
The routines must return a 4-byte integer of type II_STATUS whose value represents the overall result of the function. A value of 0 (II_OK) means successful completion. A value of 5 (II_ERROR) means an error. In the case of an error, the query execution is terminated.
Each function has from two to four parameters. The first parameter must be a Session Control Block. This structure contains information used by the upper layers of the DBMS data type subsystem. Inside the Session Control Block is a structure named scb_error of the type II_ERR_STRUCT. Scb_error must be filled in whenever the function returns an error.
The fields of scb_error are listed here:
Contains the error code that identifies the error to the calling facility.
Contains the value II_EXTERNAL_ERROR (3).
Contains the user error code. This is the same value as that in er_errcode.
Must contain a valid SQLSTATE error code. For details about SQLSTATE, see the SQL Reference Guide.
Sent by the DBMS Server to specify the size of the buffer pointed to by er_errmsgp.
Sent back to the DBMS Server to indicate the length of the formatted error message that was placed in the buffer to which er_errmsgp points.
Contains a pointer to a buffer where a formatted message can be placed. If this pointer is NULL (0), then no message can be provided.
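A routine that fails typically fills scb_error in one place and then returns II_ERROR. The sketch below shows one way to do that. The structure declarations are stand-ins written for this example: only scb_error, er_errcode, and er_errmsgp are named in this chapter, and the other member names (er_class, er_usererr, er_sqlstate, er_ebuflen, er_emsglen), the II_SCB layout, and the helper udt_set_error are assumptions. A real routine would take its declarations from the product header.

```c
#include <string.h>

typedef int II_STATUS;
#define II_ERROR          5
#define II_EXTERNAL_ERROR 3

/* Stand-in layout; member names other than er_errcode and er_errmsgp are assumed. */
typedef struct {
    int   er_errcode;      /* error code for the calling facility        */
    int   er_class;        /* II_EXTERNAL_ERROR                          */
    int   er_usererr;      /* same value as er_errcode                   */
    char  er_sqlstate[5];  /* SQLSTATE, five characters, no terminator   */
    int   er_ebuflen;      /* size of the buffer at er_errmsgp (input)   */
    int   er_emsglen;      /* length of the message written (output)     */
    char *er_errmsgp;      /* message buffer, or NULL                    */
} II_ERR_STRUCT;

typedef struct { II_ERR_STRUCT scb_error; } II_SCB;

/* Hypothetical helper: fills scb_error and returns II_ERROR.
 * sqlstate must point to at least five characters. */
static II_STATUS udt_set_error(II_SCB *scb, int code,
                               const char *sqlstate, const char *msg)
{
    II_ERR_STRUCT *e = &scb->scb_error;

    e->er_errcode = code;
    e->er_class   = II_EXTERNAL_ERROR;
    e->er_usererr = code;
    memcpy(e->er_sqlstate, sqlstate, sizeof e->er_sqlstate);
    e->er_emsglen = 0;

    /* A NULL pointer means that no message can be provided. */
    if (e->er_errmsgp != NULL && e->er_ebuflen > 0) {
        size_t n = strlen(msg);
        if (n > (size_t)e->er_ebuflen)
            n = (size_t)e->er_ebuflen;      /* truncate to the buffer size */
        memcpy(e->er_errmsgp, msg, n);
        e->er_emsglen = (int)n;
    }
    return II_ERROR;
}
```

A routine can then end its error path with a single statement such as return udt_set_error(scb, code, "22003", "value out of range").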
The last parameter must be a pointer to a result of some type. The result structure can be a single element, such as an integer, containing some sort of indicator, but most often it is an II_DATA_VALUE (defined below).
The result can contain valid data that specifies some portion of the work to be done. This is often done when the routine is creating a portion of a value and another portion of the value is created elsewhere. For example, this is done if getempty() was providing the data and the actual type and length were being created elsewhere.
The two optional parameters (arg1 and arg2) are pointers to II_DATA_VALUE structures that describe the data values manipulated by the routine. Each function uses II_DATA_VALUE structures.
The fields in the II_DATA_VALUE structure are as follows:
Type identifier of the data.
Length of the data.
Pointer to the actual data (if appropriate). This data may not be aligned as required by your machine, and you may need to align the data for correct operation.
Precision of the data value. For most data types, this is not needed, and should be ignored on input and set to 0 for output.
For DECIMAL, the high order byte will represent the value's precision (total number of significant digits), and the low order byte will hold the value's scale (the number of these digits that are to the right of an implied decimal point.)
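Two points from the field descriptions above lend themselves to a short illustration: reading data that may not be aligned, and packing a DECIMAL precision and scale into one 2-byte value. This is a sketch under stated assumptions: the helper names are hypothetical, and read_i4 assumes that int is 4 bytes wide on the target machine.

```c
#include <string.h>

/* Packs precision into the high-order byte and scale into the low-order byte. */
static unsigned short prec_encode(int precision, int scale)
{
    return (unsigned short)(((precision & 0xFF) << 8) | (scale & 0xFF));
}

/* Extracts the precision (total number of significant digits). */
static int prec_precision(unsigned short prec)
{
    return (prec >> 8) & 0xFF;
}

/* Extracts the scale (digits to the right of the implied decimal point). */
static int prec_scale(unsigned short prec)
{
    return prec & 0xFF;
}

/*
 * Reads an integer from a data pointer that may be unaligned.
 * Copying through memcpy avoids an unaligned access that a direct
 * cast and dereference could cause on some machines.
 */
static int read_i4(const char *data)
{
    int v;

    memcpy(&v, data, sizeof v);
    return v;
}
```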
This routine compares values of two user-defined data types. If the II_DT_NOSORT attribute is provided in the dtd_attributes field, then the compare routine is not necessary.
The input arguments are II_DATA_VALUE pointers to the two data elements being compared. The data elements must be of the same type. The final argument, result, must be set to be a negative number, 0, or a positive, non-zero number depending on whether the first argument is less than, equal to, or greater than the second argument. That is,
if arg1 < arg2, result is negative
else if arg1 == arg2 result equals 0
else result is non-zero, positive
The address of this routine must be placed in the dtd_compare_addr field of the IIADD_DT_DFN structure.
The inputs for this function are:
Pointer to Session Control Block
First operand. Pointer to an II_DATA_VALUE structure that contains the first value to be compared.
Second operand. Pointer to an II_DATA_VALUE structure that contains the second value to be compared.
Pointer to integer to contain the result of the operation
The outputs for this function are:
Filled with the result of the operation. This routine sets *result as follows:
< 0 if op1 < op2
> 0 if op1 > op2
= 0 if op1 is equal to op2
II_STATUS
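As an illustration of the contract described above, the following sketch compares two values of a hypothetical user-defined type that stores a temperature as a single integer. The II_SCB and II_DATA_VALUE declarations are stand-ins written for this example (the member name db_prec and the member order are assumptions), and temp_compare is a hypothetical routine name.

```c
#include <string.h>

typedef int II_STATUS;
#define II_OK    0
#define II_ERROR 5

/* Stand-in declarations; a real routine uses the product header. */
typedef struct { int unused; } II_SCB;
typedef struct {
    short db_datatype;
    short db_prec;
    int   db_length;
    char *db_data;
} II_DATA_VALUE;

/* Hypothetical compare routine for an integer-valued temperature type. */
static II_STATUS temp_compare(II_SCB *scb, II_DATA_VALUE *op1,
                              II_DATA_VALUE *op2, int *result)
{
    int a, b;

    (void)scb;
    /* Both operands must be of the same type and the expected length.
     * A real routine would also fill scb_error before returning. */
    if (op1->db_datatype != op2->db_datatype ||
        op1->db_length != (int)sizeof a || op2->db_length != (int)sizeof b)
        return II_ERROR;

    /* The data may be unaligned, so copy it out before use. */
    memcpy(&a, op1->db_data, sizeof a);
    memcpy(&b, op2->db_data, sizeof b);

    /* Yields -1, 0, or 1 without the overflow risk of a - b. */
    *result = (a > b) - (a < b);
    return II_OK;
}
```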
The dbtoev routine determines the external data type to which a user-defined data type is converted.
This routine returns the external type specification for the input data type. A coercion (function instance) must be defined to convert the input data type to the given output data type and length.
This routine is called by the DBMS Server to determine how to pass a non-exportable data type to an Ingres tool as the result of a select or fetch statement. This routine sets the ev_value field to the external data type and length for the specified user-defined data type. You must place the address of this routine in the dtd_dbtoev_addr field of the IIADD_DT_DFN structure.
The output data type (db_datatype) must be an SQL data type. Valid values are:
The copy SQL statement does not use this interface. User-defined (and other non-exportable) data types are returned in their original state, as char data of the appropriate length.
The inputs for this function are:
Pointer to an SCB
Ptr to II_DATA_VALUE for database type
Ptr to II_DATA_VALUE for export type
The outputs for this function are:
Filled in as follows:
Type of export value. See the description for a list of valid values for this field.
Length of export value
Must be 0
II_STATUS
This routine creates the default maximum histogram value. The default histogram values are used by the optimizer when no histogram data is present in the system catalogs. For a discussion of creating a default histogram routine, see dhmin Routine. If the II_DT_NOHISTOGRAM attribute is set, then this routine is not necessary.
This routine and the hmax routine form a pair, similar to the pair hmin and dhmin, except that hmax and dhmax deal with the maximum and default maximums, respectively, instead of the minimums.
Place the address of this routine in the dtd_dhmax_addr field of the IIADD_DT_DFN structure.
The inputs for this function are:
Pointer to an SCB
Pointer to a datavalue containing the type for the value
Pointer to a datavalue for the histogram
The outputs for this function are:
Filled with the histogram value
II_STATUS
This routine creates the minimum default histogram value. You must place the address of this routine in the dtd_dhmin_addr field of the IIADD_DT_DFN structure. If the II_DT_NOHISTOGRAM attribute is present, then this routine is not necessary.
The default histogram values are used by the optimizer when no histogram data is present in the system catalogs. (Optimizedb, which creates statistics for use in histograms, cannot be run on user-defined data types.)
This routine differs from the hmin routine in that hmin provides the histogram for the smallest possible value, whereas dhmin provides the histogram for the smallest "usual" value.
No values are provided to this routine; you must determine what the minimum and maximum default values are. For example, if a data type is being used to store temperatures, a valid range is probably absolute 0 to some very high number. However, a reasonable default minimum and maximum, indicating the range used by most queries, is probably -20 degrees (F) to +120 degrees (F) for temperatures in the continental US.
The inputs for this function are:
Pointer to a SCB
Pointer to a datavalue containing the type and length for the value
Pointer to a datavalue for the histogram
The outputs for this function are:
Filled with the histogram value
II_STATUS
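The temperature example above can be sketched as a pair of default histogram routines. This is an illustration only: it assumes a hypothetical temperature type whose histogram value is a single integer, the structure declarations are stand-ins (db_prec and the member order are assumptions), and the names temp_dhmin, temp_dhmax, and put_hg are hypothetical.

```c
#include <string.h>

typedef int II_STATUS;
#define II_OK    0
#define II_ERROR 5

/* Stand-in declarations; a real routine uses the product header. */
typedef struct { int unused; } II_SCB;
typedef struct {
    short db_datatype;
    short db_prec;
    int   db_length;
    char *db_data;
} II_DATA_VALUE;

/* Writes one integer histogram value into the output datavalue. */
static II_STATUS put_hg(II_DATA_VALUE *hg_dv, int value)
{
    if (hg_dv->db_data == NULL || hg_dv->db_length < (int)sizeof value)
        return II_ERROR;
    memcpy(hg_dv->db_data, &value, sizeof value);
    return II_OK;
}

/* Smallest "usual" temperature, in degrees Fahrenheit. */
static II_STATUS temp_dhmin(II_SCB *scb, II_DATA_VALUE *from_dv,
                            II_DATA_VALUE *hg_dv)
{
    (void)scb; (void)from_dv;   /* no input value is needed */
    return put_hg(hg_dv, -20);
}

/* Largest "usual" temperature, in degrees Fahrenheit. */
static II_STATUS temp_dhmax(II_SCB *scb, II_DATA_VALUE *from_dv,
                            II_DATA_VALUE *hg_dv)
{
    (void)scb; (void)from_dv;
    return put_hg(hg_dv, 120);
}
```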
This routine constructs the given empty value for this data type. 'Empty value' refers to the default value for a data type. (For example, the getempty routine for an integer creates the value 0.) NULLs are handled transparently, outside of this routine.
Place the address of this routine in the dtd_getempty_addr field of the IIADD_DT_DFN structure.
The inputs for this function are:
Pointer to a SCB.
Pointer to II_DATA_VALUE in which to place the empty data value:
The data type for the empty data value.
The length for the empty data value.
Pointer to location to place the db_data field for the empty data value. Note that this is often a pointer into a tuple.
The outputs for this function are:
The data for the empty data value is entered.
II_STATUS
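A getempty routine can be very small. The sketch below assumes a hypothetical type whose empty value is all zero bytes, which is a design choice for this example and not a requirement stated in this chapter. The structure declarations are stand-ins, and udt_getempty is a hypothetical name.

```c
#include <string.h>

typedef int II_STATUS;
#define II_OK    0
#define II_ERROR 5

/* Stand-in declarations; a real routine uses the product header. */
typedef struct { int unused; } II_SCB;
typedef struct {
    short db_datatype;
    short db_prec;
    int   db_length;
    char *db_data;
} II_DATA_VALUE;

/* Hypothetical getempty routine: the empty value is db_length zero bytes. */
static II_STATUS udt_getempty(II_SCB *scb, II_DATA_VALUE *empty_dv)
{
    (void)scb;
    if (empty_dv->db_data == NULL || empty_dv->db_length <= 0)
        return II_ERROR;

    /* db_data often points into a tuple, so write exactly db_length bytes. */
    memset(empty_dv->db_data, 0, (size_t)empty_dv->db_length);
    return II_OK;
}
```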
This routine prepares a data value for becoming a hash key. Place the address of this routine in the dtd_hashprep_addr field of the IIADD_DT_DFN structure. If the II_DT_NOKEY attribute is present, then this routine is not necessary.
For most data types, hash key preparation is a simple copy operation, copying the input data to the output. However, some data types may require more processing. For example, character data types may require blank removal or case translation.
The DBMS Server hash algorithm treats the hash key as a simple byte stream. It does not make allowances for the special characteristics of a data type. You must normalize any variable-length data types within this routine. Unused space must be initialized to some known value. For example, character strings are typically padded with blanks. You must also ensure that there are no compiler-generated holes in your data type. Holes can occur when a compiler pads a structure definition for alignment.
This routine must transform any two values of a data type that compare as equal (using the compare routine) into identical byte streams.
The inputs for this function are:
Pointer to a SCB
Pointer to an II_DATA_VALUE for value to be keyed upon.
Pointer to an II_DATA_VALUE that contains the key.
The outputs for this function are:
The length of the key
The key value
II_STATUS
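The normalization requirement above can be illustrated with a hypothetical case-insensitive, blank-padded string type. Two values that the type's compare routine would treat as equal ('Ab' followed by unused space, and 'aB' followed by blanks) must produce the same key bytes. The structure declarations are stand-ins, and ci_hashprep is a hypothetical name.

```c
#include <ctype.h>
#include <string.h>

typedef int II_STATUS;
#define II_OK    0
#define II_ERROR 5

/* Stand-in declarations; a real routine uses the product header. */
typedef struct { int unused; } II_SCB;
typedef struct {
    short db_datatype;
    short db_prec;
    int   db_length;
    char *db_data;
} II_DATA_VALUE;

/* Hypothetical hashprep routine for a case-insensitive string type. */
static II_STATUS ci_hashprep(II_SCB *scb, II_DATA_VALUE *dv_from,
                             II_DATA_VALUE *dv_key)
{
    int i;
    int n = dv_from->db_length;

    (void)scb;
    if (dv_key->db_data == NULL || dv_key->db_length < n)
        return II_ERROR;

    /* Fold case so that values differing only in case hash alike. */
    for (i = 0; i < n && dv_from->db_data[i] != '\0'; i++)
        dv_key->db_data[i] = (char)tolower((unsigned char)dv_from->db_data[i]);

    /* Set everything after the string to a known value (blanks). */
    memset(dv_key->db_data + i, ' ', (size_t)(n - i));

    dv_key->db_length = n;
    return II_OK;
}
```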
This routine creates a histogram value for a data element. Place the address of this routine in the dtd_helem_addr field of the IIADD_DT_DFN structure. If the II_DT_NOHISTOGRAM attribute is present, then this routine is not necessary.
A histogram value is a representation of the data element. The DBMS query optimizer uses histogram values in the evaluation of query plans. The optimizer restricts the length of the histogram value to 8.
The comparison of the histograms of two values must match the comparison of their respective values, because the histogram value definition (if a < b, then h(a) < h(b)) assumes that 'a < b' uses the same compare routine. Histogram values have a type; for details, see the description of the hg_dtln Routine.
The inputs for this function are:
Pointer to SCB
Value for which a histogram is desired
Pointer to data value into which to place the histogram value
Contains the type of the histogram value
Contains the length of the histogram value
Pointer to space of (db_length) bytes into which the histogram value is placed
The outputs for this function are:
Contains the histogram value
II_STATUS
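One simple way to stay within the 8-byte limit is to use a prefix of the value as its histogram. The sketch below assumes a hypothetical fixed-length byte-string type that is compared byte by byte, so a prefix never orders two values the opposite way from the full comparison (two distinct values can share a prefix, in which case their histograms are equal). The structure declarations are stand-ins, and HG_MAX and str_helem are hypothetical names.

```c
#include <string.h>

typedef int II_STATUS;
#define II_OK    0
#define II_ERROR 5
#define HG_MAX   8   /* histogram length limit stated in the text */

/* Stand-in declarations; a real routine uses the product header. */
typedef struct { int unused; } II_SCB;
typedef struct {
    short db_datatype;
    short db_prec;
    int   db_length;
    char *db_data;
} II_DATA_VALUE;

/* Hypothetical helem routine: the histogram is a zero-padded prefix. */
static II_STATUS str_helem(II_SCB *scb, II_DATA_VALUE *dv_from,
                           II_DATA_VALUE *dv_hg)
{
    int n;

    (void)scb;
    if (dv_hg->db_data == NULL || dv_hg->db_length <= 0 ||
        dv_hg->db_length > HG_MAX || dv_from->db_length < 0)
        return II_ERROR;

    /* Copy at most db_length bytes of the histogram space. */
    n = dv_from->db_length < dv_hg->db_length ? dv_from->db_length
                                              : dv_hg->db_length;
    memset(dv_hg->db_data, 0, (size_t)dv_hg->db_length);
    memcpy(dv_hg->db_data, dv_from->db_data, (size_t)n);
    return II_OK;
}
```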
The hg_dtln routine provides the data type and length for a histogram value for a given data type.
This routine builds a datavalue, dv_histogram, which describes the data type and length of the histogram value for the data type specified in the input dv_from. Place the address of this routine in the dtd_hg_dtln_addr field of the IIADD_DT_DFN structure. If the II_DT_NOHISTOGRAM attribute is present, then this routine is not necessary.
The inputs for this function are:
Pointer to an SCB
Datavalue describing the data type:
Data type name
Length of the data type
Pointer to the datavalue provided to describe the histogram value
The outputs for this function are:
Filled with the required type and length
The data type
The data length
Must be 0
II_STATUS
This routine is used by the optimizer to obtain the histogram value for the largest value of a type. Place the address of this routine in the dtd_hmax_addr field of the IIADD_DT_DFN structure. If the II_DT_NOHISTOGRAM attribute is present, then this routine is not necessary.
The inputs for this function are:
Pointer to an SCB
Pointer to a datavalue describing the type and length of the desired histogram value
Pointer to a datavalue for the histogram
The outputs for this function are:
Contains the histogram value
II_STATUS
This routine is used by the optimizer to obtain the histogram value for the smallest value of a type. For a discussion of histograms, see helem Routine.
Note: The smallest value for the given data type is expected to be known implicitly by the routine.
Place the address of this routine in the dtd_hmin_addr field of the IIADD_DT_DFN structure. If the II_DT_NOHISTOGRAM attribute is present, then this routine is not necessary.
The inputs for this function are:
Pointer to SCB
Pointer to a datavalue describing the type and length of the user-typed value.
Pointer to a datavalue for the histogram
The outputs for this function are:
Contains the histogram value
II_STATUS
The keybuild routine builds an isam, B-tree, or hash key from the value.
This routine constructs a key pair for use by the system. Place the address of this routine in the dtd_keybld_addr field of the IIADD_DT_DFN structure. If the II_DT_NOKEY attribute is present, then this routine is not necessary.
A key pair consists of a high-key/low-key combination whose values represent the largest and smallest values that match the key, respectively. The key pair that results from this routine is based upon the type of key desired.
The DBMS query optimizer uses this operation for building keys for traversing hash, isam, or B-tree tables. Whenever the DBMS Server must look up a value in a table using an ordered index (either hash, isam, or B-tree), it uses that value to form two other values. These two values represent the 'key', that is, the upper and lower limits of the search space. It is not guaranteed that all values in the relation matching the value are between the upper and lower limits produced by keybuild().
Along with the value being keyed on, the caller of keybuild must specify the comparison operator being used (for example, '<'). One of the input parameters for this routine, .adc_opkey, represents the type of operation for which this key is being built. The possible values for this parameter and the operators they represent are:
Keybuild's main purpose is to build the upper and lower values of the search space. These values are called the high key and low key, respectively. In addition, keybuild returns the type of key that was formed, which tells what type of search must be performed.
The value returned in adc_tykey determines whether or not a key pair is built and, if built, whether the pair is the high or low key. If you are interested only in what type of key is built and not in the actual search space, then set the db_data field (in the II_DATA_VALUE structure pointed to by adc_lokey and/or adc_hikey) to point to a zero address.
The following are the values returned in adc_tykey and their interpretations:
No values in the table match, so no scan of the table is done. In this case, the low key is set to maximum value for the data type and length of the column being keyed and the high key is set to the minimum.
Only a single value from the table matches. The low and high keys are set to the same value. The execution phase seeks to this point and scans forward until it is sure that it has exhausted all possible matching values.
All values in the table that match lie within a range. The low key is set to represent the lowest matching value in the table and the high key is set to represent the highest matching value. The execution phase seeks to the point matching the low key and scans forward until it has exhausted all values that might be less than or equal to the high key.
All values in the table that match lie at the low end of the table; they are less than or equal to some value. In this case, the high key is set to that value (the upper bound) and the low key is set to the minimum value for the data type and length of the column being keyed (unbounded). The execution phase starts at the beginning of the table and seeks forward until it has exhausted all values that might be less than or equal to the high key.
All values in the table that match lie at the high end of the table; they are greater than or equal to some value. In this case, the low key is set to that value (the lower bound) and the high key is set to the maximum value for the data type and length of the column being keyed (unbounded). The execution phase seeks to the point of the low key and scans forward from there.
All values in the table may match. The low key is set to minimum value for the data type and length of the column being keyed and the high key is set to the maximum value. A full scan of the table must be performed.
The most likely combinations are:
Although there is fairly strong correlation between the key operator and the type of key built, you cannot use the key operator to predict with certainty the type of key built. For example, assume that you are keying on an i2 column with the '<' operator but the supplied key value is an i4 whose value is 50000. The key that is built is II_KALLMATCH, not II_KHIGHKEY, as might be expected. Do not rely on the key operator to tell you what type of key is built.
Note: The data type of the datavalue in .adc_kdv may not be the same type as the key required from this routine. If it is not, you must supply a coercion to change it to the required data type.
The inputs for this function are:
Pointer to SCB
Pointer to key block data structure:
Datavalue for which to build a key.
This datavalue does not need to be of the same type as the required key.
Operator type for which key is being built
Pointer to area for key. If 0, do not build key.
Pointer to area for key. If 0, do not build key.
Pointers to II_DATA_VALUEs
The outputs for this function are:
Key block filled with following:
Type key provided.
Pointer to area for key. If 0, do not build key. If adc_tykey is II_KEXACTKEY or II_KLOWKEY, this is the key built.
Pointer to area for key. If 0, do not build key. If adc_tykey is II_KEXACTKEY or II_KHIGHKEY, this is the key built.
II_STATUS
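The i2 column example above (a '<' comparison against the i4 value 50000) can be worked through in a small sketch of the key-type decision. It is deliberately simplified: it operates on plain integers in place of the key block, and it treats '<' and '<=' alike because the key only bounds the search space. II_KEXACTKEY, II_KLOWKEY, II_KHIGHKEY, and II_KALLMATCH are named in this chapter; the names II_KNOMATCH and II_KRANGEKEY, all numeric values, the OP_ operator constants, and the function names are assumptions made for this example.

```c
#include <stddef.h>

#define I2_MIN (-32768)
#define I2_MAX 32767

/* Hypothetical operator codes. */
enum { OP_EQ, OP_LT, OP_LE, OP_GT, OP_GE };

/* Key types; numeric values (and two of the names) are assumed. */
enum { II_KNOMATCH = 1, II_KEXACTKEY, II_KRANGEKEY,
       II_KHIGHKEY, II_KLOWKEY, II_KALLMATCH };

/* Stores the keys; a null pointer means "do not build this key". */
static void set_keys(short *lo, short *hi, long lov, long hiv)
{
    if (lo != NULL) *lo = (short)lov;
    if (hi != NULL) *hi = (short)hiv;
}

/* Classifies a search on an i2 column, given a wider key value. */
static int i2_key_type(int op, long value, short *lokey, short *hikey)
{
    int below = value < I2_MIN;   /* smaller than any i2 */
    int above = value > I2_MAX;   /* larger than any i2  */

    switch (op) {
    case OP_EQ:
        if (below || above) {
            set_keys(lokey, hikey, I2_MAX, I2_MIN);
            return II_KNOMATCH;
        }
        set_keys(lokey, hikey, value, value);
        return II_KEXACTKEY;
    case OP_LT:
    case OP_LE:
        if (above) { set_keys(lokey, hikey, I2_MIN, I2_MAX); return II_KALLMATCH; }
        if (below) { set_keys(lokey, hikey, I2_MAX, I2_MIN); return II_KNOMATCH; }
        set_keys(lokey, hikey, I2_MIN, value);
        return II_KHIGHKEY;
    case OP_GT:
    case OP_GE:
        if (below) { set_keys(lokey, hikey, I2_MIN, I2_MAX); return II_KALLMATCH; }
        if (above) { set_keys(lokey, hikey, I2_MAX, I2_MIN); return II_KNOMATCH; }
        set_keys(lokey, hikey, value, I2_MAX);
        return II_KLOWKEY;
    default:
        /* Unknown operator: fall back to a full scan. */
        set_keys(lokey, hikey, I2_MIN, I2_MAX);
        return II_KALLMATCH;
    }
}
```

With op set to OP_LT and value set to 50000, every i2 value qualifies, so the sketch reports II_KALLMATCH instead of II_KHIGHKEY.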
The length_check routine checks that the specified length for the data type is valid. If the specified length is user specified, the routine returns the corresponding internal length. If the length is not user specified, it returns a user length corresponding to internal length.
Place the address of this routine in the dtd_lenchk_addr field of the IIADD_DT_DFN structure.
If the value of user_specified is not 0, then the length is a value specified by a user or user program; for example, 4 if the user typed 'varchar(4)'. If the value of user_specified is 0, then the length is the internal length, for example, 6 for varchar(4).
If you specify result_dv, then it must be set to the valid length regardless of the success or failure of the routine. If user_specified is non-zero, then result_dv must specify the corresponding internal length. Conversely, if user_specified is zero, then result_dv must specify the user length corresponding to the provided internal length.
The inputs for this function are:
Pointer to SCB.
0 if not user specified, non-zero otherwise.
Pointer to datavalue to be checked. If user_specified is non-zero, then the length field refers to the length specified by the user. Otherwise, it refers to an internal length.
Pointer to an II_DATA_VALUE into which to place the correct length. This parameter can be NULL (0). When this is the case, simply return success or error status.
The outputs for this function are:
Contains the valid length. If the user_specified field is non-zero, this field must be set to the corresponding internal length. If the user_specified field is 0, then set this to the corresponding user length.
II_STATUS
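The varchar(4) example above (user length 4, internal length 6) can be sketched as follows for a hypothetical varchar-like type with a 2-byte length prefix. The structure declarations are stand-ins; VLEN_OVERHEAD, VLEN_MAX_USER (an arbitrary limit chosen for this example), and vstr_lenchk are hypothetical names. On failure the sketch reports the nearest valid length, which is one possible reading of the requirement that result_dv always receive a valid length.

```c
#include <stddef.h>

typedef int II_STATUS;
#define II_OK    0
#define II_ERROR 5

#define VLEN_OVERHEAD 2      /* hypothetical 2-byte length prefix   */
#define VLEN_MAX_USER 2000   /* hypothetical maximum user length    */

/* Stand-in declarations; a real routine uses the product header. */
typedef struct { int unused; } II_SCB;
typedef struct {
    short db_datatype;
    short db_prec;
    int   db_length;
    char *db_data;
} II_DATA_VALUE;

/* Hypothetical length_check routine for a varchar-like type. */
static II_STATUS vstr_lenchk(II_SCB *scb, int user_specified,
                             II_DATA_VALUE *dv, II_DATA_VALUE *result_dv)
{
    II_STATUS status = II_OK;
    /* Work in user-length terms whichever form was supplied. */
    int user_len = user_specified ? dv->db_length
                                  : dv->db_length - VLEN_OVERHEAD;

    (void)scb;
    if (user_len < 1) {
        user_len = 1;
        status = II_ERROR;
    } else if (user_len > VLEN_MAX_USER) {
        user_len = VLEN_MAX_USER;
        status = II_ERROR;
    }

    /* result_dv may be NULL; then only the status is returned. */
    if (result_dv != NULL) {
        result_dv->db_prec = 0;
        result_dv->db_length = user_specified ? user_len + VLEN_OVERHEAD
                                              : user_len;
    }
    return status;
}
```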
The minmaxdv routine provides the minimum and maximum values and lengths for a data type.
Place the address of this routine in the dtd_minmaxdv_addr field of the IIADD_DT_DFN structure. If the II_DT_NOHISTOGRAM attribute is present, then this routine is not necessary.
Depending on the input parameters, the routine returns one or both of the following:
The two input parameters are min_dv and max_dv; both are pointers to II_DATA_VALUEs. The lengths specified (db_length) for each may be different, but their data types (db_datatype) must be the same.
The routine uses the following rules to process these inputs:
If an input is NULL, then processing for that input is not performed. This allows the caller who is interested in only the maximum value or only the minimum to use this routine more efficiently.
If the db_length field of an input is supplied as II_LEN_UNKNOWN, no corresponding value is built and placed at the output's db_data field. Instead, the routine returns the valid internal length to the db_length field.
If the db_data field of an input is NULL, then no value is built and placed at the corresponding output's db_data field.
If none of rules 1-3 apply to an input, then the value for the data type and length is built and placed at db_data.
The inputs for this function are:
Pointer to an SCB.
Pointer to II_DATA_VALUE for the 'min'. If this is NULL, 'min' processing is skipped:
Its data type. Must be the same as data type for 'max'.
The length to build the 'min' value for, or II_LEN_UNKNOWN, if the 'min' length is requested.
Pointer to location to place the 'min' non-null value, if requested. If this is NULL no 'min' value is created.
Pointer to II_DATA_VALUE for the 'max'. If this is NULL, 'max' processing is skipped:
Its data type. Must be the same as data type for 'min'.
The length to build the 'max' value for, or II_LEN_UNKNOWN, if the 'max' length is requested.
Pointer to location to place the 'max' non-null value, if requested. If this is NULL no 'max' value is created.
The outputs for this function are:
If this was supplied as NULL, 'min' processing is skipped.
If this was supplied as II_LEN_UNKNOWN, the 'min' valid internal length for this data type is returned.
If this was supplied as NULL, or if the db_length field was supplied as II_LEN_UNKNOWN, nothing is returned. Otherwise, the 'min' non-null value for this data type and length is built and placed at the location pointed to by db_data.
If this was supplied as NULL, 'max' processing is skipped.
If this was supplied as II_LEN_UNKNOWN, the 'max' valid internal length for this data type is returned.
If this was supplied as NULL, or if the db_length field was supplied as II_LEN_UNKNOWN, nothing is returned. Otherwise, the 'max' non-null value for this data type and length is built and placed at the location pointed to by db_data.
II_STATUS
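The four processing rules above apply independently to the 'min' and 'max' inputs, so one helper can serve both. The sketch below assumes a hypothetical type stored as a single integer. The structure declarations are stand-ins, the numeric value chosen for II_LEN_UNKNOWN is an assumption, and the names one_side and temp_minmaxdv are hypothetical.

```c
#include <limits.h>
#include <string.h>

typedef int II_STATUS;
#define II_OK    0
#define II_ERROR 5
#define II_LEN_UNKNOWN (-1)   /* stand-in value; assumed for this sketch */

/* Stand-in declarations; a real routine uses the product header. */
typedef struct { int unused; } II_SCB;
typedef struct {
    short db_datatype;
    short db_prec;
    int   db_length;
    char *db_data;
} II_DATA_VALUE;

/* Applies rules 1 through 4 to one input. */
static II_STATUS one_side(II_DATA_VALUE *dv, int value)
{
    if (dv == NULL)                          /* rule 1: skip this side     */
        return II_OK;
    if (dv->db_length == II_LEN_UNKNOWN) {   /* rule 2: return length only */
        dv->db_length = (int)sizeof value;
        return II_OK;
    }
    if (dv->db_data == NULL)                 /* rule 3: nothing to build   */
        return II_OK;
    if (dv->db_length != (int)sizeof value)
        return II_ERROR;
    memcpy(dv->db_data, &value, sizeof value);   /* rule 4: build value    */
    return II_OK;
}

/* Hypothetical minmaxdv routine for an integer-valued type. */
static II_STATUS temp_minmaxdv(II_SCB *scb, II_DATA_VALUE *min_dv,
                               II_DATA_VALUE *max_dv)
{
    (void)scb;
    /* When both inputs are given, their data types must agree. */
    if (min_dv != NULL && max_dv != NULL &&
        min_dv->db_datatype != max_dv->db_datatype)
        return II_ERROR;
    if (one_side(min_dv, INT_MIN) != II_OK)
        return II_ERROR;
    return one_side(max_dv, INT_MAX);
}
```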
This routine returns the maximum number of bytes that can fit into a segment. For a peripheral like long line, it is the number of points that can fit in the input length times the size of a point plus the size of a line's overhead.
The inputs for this function are:
Pointer to a SCB
Data type ID of peripheral object
Pointer to an II_DATA_VALUE with the maximum size of the segment in the length field
The outputs for this function are:
Pointer to an II_DATA_VALUE that receives the underlying data type, the maximum length of the underlying data type, and the precision of the underlying data type.
II_STATUS
The tmcvt routine converts data of a user-defined data type from an internal format to a displayable format. (This displayable format is used by a terminal monitor when user-defined data types are sent without conversion to a terminal monitor.) Place the address of this routine in the dtd_tmcvt_addr field of the IIADD_DT_DFN structure.
This routine is used by the DBMS Server to format various trace statements and error messages.
The inputs for this function are:
Pointer to a SCB.
Pointer to a datavalue containing the data to be displayed
Pointer to a datavalue that provides the output space. The datavalue's db_data field points to an area of db_length bytes.
The outputs for this function are:
Filled with the output
Filled with the number of characters placed in to_dv->db_data
II_STATUS
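A display conversion for a hypothetical integer-valued temperature type might look like the sketch below. The structure declarations are stand-ins, the parameter order follows the inputs and outputs listed above, and the name temp_tmcvt and the "72 F" output style are choices made for this example.

```c
#include <stdio.h>
#include <string.h>

typedef int II_STATUS;
#define II_OK    0
#define II_ERROR 5

/* Stand-in declarations; a real routine uses the product header. */
typedef struct { int unused; } II_SCB;
typedef struct {
    short db_datatype;
    short db_prec;
    int   db_length;
    char *db_data;
} II_DATA_VALUE;

/* Hypothetical tmcvt routine: formats a temperature as text. */
static II_STATUS temp_tmcvt(II_SCB *scb, II_DATA_VALUE *from_dv,
                            II_DATA_VALUE *to_dv, int *output_length)
{
    char text[32];
    int v, n;

    (void)scb;
    if (from_dv->db_length != (int)sizeof v || to_dv->db_data == NULL)
        return II_ERROR;
    memcpy(&v, from_dv->db_data, sizeof v);   /* data may be unaligned */

    n = snprintf(text, sizeof text, "%d F", v);
    if (n < 0 || n >= (int)sizeof text || n > to_dv->db_length)
        return II_ERROR;                      /* output space too small */

    /* The output is not null terminated; its length is returned instead. */
    memcpy(to_dv->db_data, text, (size_t)n);
    *output_length = n;
    return II_OK;
}
```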
The tmlen routine determines the display length of the data type.
This routine returns the default and worst-case lengths for a data type if it were to be printed as text, for example, by a terminal monitor. Although user-defined data types are not returned to a terminal monitor as the user-defined types, this routine is needed by various trace flags and error formatting within the DBMS Server.
Place the address of this routine in the dtd_tmlen_addr field of the IIADD_DT_DFN structure.
The inputs for this function are:
Pointer to a SCB
Pointer to the datavalue for which the call is being made
The outputs for this function are:
Pointer to a 2-byte integer into which the default width was placed
Pointer to a 2-byte integer in which the largest (worst case) width was placed
II_STATUS
The value_check routine checks for valid values.
For some data types, only certain characters or bit patterns might be valid. This routine checks the patterns for validity. For example, the value_check routine for the C data type rejects values that contain null characters.
Place the address of this routine in the dtd_valchk_addr field of the IIADD_DT_DFN structure.
The inputs for this function are:
Pointer to a SCB
Pointer to data value in question
This function has no outputs. The return value (II_OK or II_ERROR) determines correctness.
II_STATUS
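The null-character check mentioned above can be sketched for a hypothetical fixed-length string type. The structure declarations are stand-ins, and cstr_valchk is a hypothetical name. A real routine would also fill scb_error before returning II_ERROR.

```c
#include <string.h>

typedef int II_STATUS;
#define II_OK    0
#define II_ERROR 5

/* Stand-in declarations; a real routine uses the product header. */
typedef struct { int unused; } II_SCB;
typedef struct {
    short db_datatype;
    short db_prec;
    int   db_length;
    char *db_data;
} II_DATA_VALUE;

/* Hypothetical value_check routine: rejects embedded null characters. */
static II_STATUS cstr_valchk(II_SCB *scb, II_DATA_VALUE *dv)
{
    (void)scb;
    if (dv->db_length > 0 &&
        memchr(dv->db_data, '\0', (size_t)dv->db_length) != NULL)
        return II_ERROR;
    return II_OK;
}
```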
The xform routine transforms a long data type into its component segments. Place the address of this routine in the dtd_xform_addr field of the IIADD_DT_DFN structure.
The shd_exp_action field contains the instructions for this routine. ADW_START is the value on the first call. Thereafter, the field is examined at the return of this routine to determine the caller's next action. Set ADW_GET_DATA to have the caller provide the next section of data. If the caller receives ADW_GET_DATA and there is no more data, then it flushes any current data and does not return to this routine. ADW_FLUSH_SEGMENT indicates that the caller disposes of the output segment, and supplies a new, empty one on the next call. ADW_CONTINUE indicates that the routine is to be called again. ADW_STOP indicates that there is a problem and the process has failed; the routine is not called again.
The inputs for this function are:
Pointer to a SCB.
Pointer to the workspace for peripheral operations (II_LO_WKSP). This workspace contains fields for the: data type being transformed; the action for this call; the pointer to, length of, and amount used of the input area; and the pointer to, length of, and amount used of the output area.
The outputs for this function are:
Modified to give caller action
Modified as used by the routine
II_STATUS