Whenever you create a table without specifying a storage structure, the storage structure assigned is heap. In a heap structure, the table has no key—it is simply a heap of data. When you add a row, it is added to the end of the heap. This makes heap the fastest storage structure to use when you are initially loading tables or adding a large quantity of data.
However, when you want to retrieve a particular row from a heap table, you must search through every row in the table looking for rows that qualify. This makes heap relatively slow for retrieval if tables have more than a few pages. For more information, see the chapter "Maintaining Storage Structures."
Note: The heapsort structure is like heap, but with the rows sorted and duplicates removed (unless duplicates are allowed).
A heap table consists of a chain of pages. The layout of the sample heap table, employee, is shown below:
empno name age salary comment
+-------------------------------------------
Page 0 | 17| Shigio | 29| 28000.000|
| 9| Blumberg | 33| 32000.000|
| 26| Stover | 38| 35000.000|
| 1| Mandic | 46| 43000.000|
|-------------------------------------------
Page 1 | 18| Giller | 47| 46000.000|
| 10| Ming | 23| 22000.000|
| 27| Curry | 34| 32000.000|
| 2| Ross | 50| 55000.000|
|-------------------------------------------
Page 2 | 19| McTigue | 44| 41000.000|
| 11| Robinson | 64| 80000.000|
| 28| Kay | 41| 38000.000|
| 3| Stein | 44| 40000.000|
|-------------------------------------------
Page 3 | 20| Cameron | 37| 35000.000|
| 12| Saxena | 24| 22000.000|
| 29| Ramos | 31| 30000.000|
| 4| Stannich | 36| 33000.000|
|-------------------------------------------
Page 4 | 21| Huber | 35| 32000.000|
| 13| Clark | 43| 40000.000|
| 30| Brodie | 42| 40000.000|
| 5| Verducci | 55| 55000.000|
|-------------------------------------------
Page 5 | 22| Zimmerman | 26| 25000.000|
| 14| Kreseski | 25| 24000.000|
| 31| Smith | 20| 10000.000|
| 6| Aitken | 49| 50000.000|
|-------------------------------------------
Page 6 | 23| Gordon | 28| 27000.000|
| 15| Green | 27| 26000.000|
| 7| Curan | 30| 30000.000| Fire
| 24| Sabel | 21| 21000.000|
|-------------------------------------------
Page 7 | 16| Gregori | 32| 31000.000|
| 8| McShane | 22| 22000.000|
| 25| Sullivan | 38| 35000.000|
|
+-------------------------------------------
Because table scans are expensive, heap is not a good structure to use while querying large tables. A retrieval of this type must look at every page in the employee table:
Select * from employee
where employee.name = 'Sullivan';
A retrieval like this also scans the entire table, even though Shigio's record is the first row of the first page:
Select * from employee
where employee.name = 'Shigio';
Because heap tables do not eliminate duplicate rows, the entire table must be scanned in case there is another employee named Shigio on another page in the table.
Heap is used as the default storage structure when a table is first created, because it is assumed that a newly created table is likely to be loaded with data.
Loading is optimized by not doing "per row" logging. Therefore, you must load into an empty table. This can be a table that was just created and into which no data has ever been added or deleted. Or it can be an existing table that was truncated by clicking Delete All Data in the Modify Table Structure dialog or by using the modify to truncate statement.
The empty table must also have the following characteristics:
Heap is also the best structure to use for adding data. Additions to a heap table progress quickly because inserted rows are added to the end of the heap. There is no overhead of calculating what page the row is on. The disadvantage is that the heap structure does not make use of deleted row space except at the end of the table.
Aside from compressed storage structures, the heap structure produces tables with the smallest number of pages. This is because every page in a heap table is filled as completely as possible. This is referred to as a 100% fill factor. A heap table is approximately half the size of the same table modified to hash because hash uses a 50% default fill factor instead of 100%.
After loading or adding the data, you can modify the table to another storage structure. (Do not modify an empty table to another storage structure before loading the data.)
To free deleted space, remodify the table to heap using the Modify Table Structure dialog or the modify statement.
Very small tables can usually be left as heap tables. If the table fits on one to five pages as a heap, there is no speed advantage to modifying it to a different structure.
Note: The heap structure is sometimes used for large tables in conjunction with a secondary index. This can be useful in a situation where the table is so large it cannot be modified, but an accelerated access method is needed.
Heap is a good storage structure to use in any of these cases:
Do not use heap for large tables when query performance is the top priority. Heap is also a poor storage structure to use if you look up particular rows by key value.
The following are problems encountered with heap storage structure, and their solutions:
Problem |
Solution |
---|---|
Access is slow on a table created from another table (for example, using the create table as select statement or the Create Table as Select check box in the Create Table dialog). |
Change the storage structure of the table from which you are selecting the data, or specify a storage structure other than heap for the table you are creating. |
Space once used by deleted rows is never reused. |
Modify the table to reclaim the deleted row space (for example, using the modify statement or the Modify Table Structure dialog). In this case, you can still choose heap as the storage structure. |
Selects and updates are slow. |
If the table is not small, modify it to another storage structure. Heap is used only for small tables because the entire table is always scanned. Alternatively, you can create a secondary index. |
Inserts are not concurrent. |
Use row locking if the page size is greater than 4 KB, or modify to another structure. All inserts to a heap table are sent to the last page. |