Chapter 2. Fast Index Creation in the InnoDB Storage Engine

Chapter 2. Fast Index Creation in the InnoDB Storage Engine
Prev		Next

2.1. Overview of Fast Index Creation

In MySQL versions up to 5.0, adding or dropping an index on a table with existing data can be very slow if the table has many rows. The CREATE INDEX and DROP INDEX commands work by creating a new, empty table defined with the requested set of indexes. It then copies the existing rows to the new table one-by-one, updating the indexes as it goes. Inserting entries into the indexes in this fashion, where the key values are not sorted, requires random access to the index nodes, and is far from optimal. After all rows from the original table are copied, the old table is dropped and the copy is renamed with the name of the original table.

Beginning with version 5.1, MySQL allows a storage engine to create or drop indexes without copying the contents of the entire table. The standard built-in InnoDB in MySQL version 5.1, however, does not take advantage of this capability. With the InnoDB Plugin, however, users can in most cases add and drop indexes much more efficiently than with prior releases.

In InnoDB, the rows of a table are stored in a clustered (or primary key) index, forming what some database systems call an “index-organized table”. Changing the clustered index requires copying the data, even with the InnoDB Plugin. However, adding or dropping a secondary index with the InnoDB Plugin is much faster, since it does not involve copying the data.

This new mechanism also means that you can generally speed the overall process of creating and loading an indexed table by creating the table with only the clustered index, and adding the secondary indexes after the data is loaded.

Although no syntax changes are required in the CREATE INDEX or DROP INDEX commands, some factors affect the performance, space usage, and semantics of this operation (see Section 2.6, “Limitations”).

Because the ability to create and drop indexes does not require use of a new on-disk file format, it is possible to temporarily use the InnoDB Plugin to create or drop an index, and then fall back to using the standard built-in InnoDB in MySQL for normal operations if you wish. See Chapter 11, Downgrading from the InnoDB Plugin for more information.

2.2. Examples

It is possible to create multiple indexes on a table with one ALTER TABLE command. This is relatively efficient, because the clustered index of the table needs to be scanned only once (although the data is sorted separately for each new index). For example:

CREATE TABLE T1(A INT PRIMARY KEY,
  B INT, C CHAR(1)) ENGINE=InnoDB;
INSERT INTO T1 VALUES
 (1,2,'a'), (2,3,'b'), (3,2,'c'), (4,3,'d'), (5,2,'e');
COMMIT;
ALTER TABLE T1 ADD INDEX (B), ADD UNIQUE INDEX (C);

The above commands will create table T1 with the clustered index (primary key) on column A, insert several rows, and then build two new indexes on columns B and C. If there were many rows inserted into T1 before the ALTER TABLE command, this approach would be much more efficient than creating the table with all its indexes before loading the data.

You may also create the indexes one at a time, but then the clustered index of the table is scanned (as well as sorted) once for each CREATE INDEX command. Thus, the following commands are not as efficient as the ALTER TABLE command above, even though neither requires recreating the clustered index for table T1.

CREATE INDEX B ON T1 (B);
CREATE UNIQUE INDEX C ON T1 (C);

Dropping indexes in the InnoDB Plugin does not require any copying of table data. Thus, you can equally quickly drop multiple indexes with a single ALTER TABLE command or multiple DROP INDEX commands:

ALTER TABLE T1 DROP INDEX B, DROP INDEX C;

or

DROP INDEX B ON T1;
DROP INDEX C ON T1;

Restructuring the clustered index in InnoDB always requires copying the data in the table. For example, if you create a table without a primary key, InnoDB chooses one for you, which may be the first UNIQUE key defined on NOT NULL columns, or a system-generated key. Defining a PRIMARY KEY later causes the data to be copied, as in the following example:

CREATE TABLE T2 (A INT, B INT) ENGINE=InnoDB;
INSERT INTO T2 VALUES (NULL, 1);
ALTER TABLE T2 ADD PRIMARY KEY (B);

Note that when you create a UNIQUE or PRIMARY KEY index, InnoDB must do some extra work. For UNIQUE indexes, InnoDB checks that the table contains no duplicate values for the key. For a PRIMARY KEY index, InnoDB also checks that none of the PRIMARY KEY columns contains a NULL. It is best to define the primary key when you create a table, so you need not rebuild the table later.

2.3. Implementation

InnoDB has two types of indexes: the clustered index and secondary indexes. Since the clustered index contains the data values in its B-tree nodes, adding or dropping a clustered index does involve copying the data, and creating a new copy of the table. A secondary index, however, contains only the index key and the value of the primary key. This type of index may be created or dropped without copying the data in the clustered index. Furthermore, because the secondary index contains the values of the primary key (used to access the clustered index when needed), when you change the definition of the primary key, thus recreating the clustered index, all secondary indexes are recreated as well.

Dropping a secondary index is simple. Only the internal InnoDB system tables and the MySQL data dictionary tables need to be updated to reflect the fact that the index no longer exists. InnoDB returns the storage used for the index to the tablespace that contained it, so that new indexes or additional table rows may use the space.

To add a secondary index to an existing table, InnoDB scans the table, and sorts the rows using memory buffers and temporary files in order by the value(s) of the secondary index key column(s). The B-tree is then built in key-value order, which is more efficient than inserting rows into an index in random order with respect to the key values. Because the B-tree nodes are split when they fill, building the index in this way results in a higher fill-factor for the index, making it more efficient for subsequent access.

2.4. Concurrency Considerations

While a secondary index is being created or dropped, the table is locked in shared mode. That is, any writes to the table are blocked, but the data in the table may be read. When you alter the clustered index of a table, however, the table is locked in exclusive mode, because the data must be copied. Thus, during the creation of a new clustered index, all operations on the table are blocked.

Before it can start executing, a CREATE INDEX or ALTER TABLE command must always wait for currently executing transactions that are accessing the table to commit or rollback before it can proceed. In addition, ALTER TABLE commands that create a new clustered index must wait for all SELECT statements that access the table to complete (or their containing transactions to commit). Even though the original index exists throughout the creation of the new clustered index, no transactions whose execution spans the creation of the index can be accessing the table, because the original table must be dropped when clustered index is restructured.

Once a CREATE INDEX or ALTER TABLE command that creates a secondary index begins executing, queries may access the table for read access, but may not update the table. If an ALTER TABLE command is changing the clustered index, all queries must wait until the operation completes.

A newly-created secondary index contains only data that is current in the table as of the time the CREATE INDEX or ALTER TABLE command begins to execute. Specifically, a newly-created index contains only the versions of data as of the most-recently committed transactions prior to the creation of the index. The index thus does not contain any rows that were deleted (and therefore marked for deletion) by transactions that completed before the CREATE INDEX or ALTER TABLE began. Similarly, the index contains only current versions of every row, and none of the old versions of rows that were updated by transactions that ran before the index was created.

Because a newly-created index contains only information about data current at the time the index was created, queries that need to see data that was deleted or changed before the index was created cannot use the index. The only queries that could be affected by this limitation are those executing in transactions that began before the creation of the index was begun. For such queries, unpredictable results could occur. Newer queries can use the index.

2.5. Crash Recovery

No data is lost if the server crashes while an ALTER TABLE command is executing. Recovery, however, is different for clustered indexes and secondary indexes.

If the server crashes while creating a secondary index, upon recovery, InnoDB drops any partially created indexes. All you need to do to create the index is to re-run the ALTER TABLE or CREATE INDEX command.

However, when a crash occurs during the creation of a clustered index, recovery is somewhat more complicated, because the data in the table must be copied to an entirely new clustered index. Remember that all InnoDB tables are stored as clustered indexes. In the following discussion, we use the word table and clustered index interchangeably.

The InnoDB Plugin creates the new clustered index by copying the existing data from the original table to a temporary table that has the desired index structure. Once the data is completely copied to this temporary table, the original table is renamed with a different temporary table name. The temporary table comprising the new clustered index is then renamed with the name of the original table, and the original table is then dropped from the database.

If a system crash occurs while creating a new clustered index, no data is lost, but users must complete the recovery process using the temporary tables that exist during the process.

Users rarely re-create a clustered index or re-define primary keys on large tables. Because system crashes are uncommon and the situation described here is rare, this manual does not provide information on recovering from this scenario. Instead, please see the InnoDB web site: http://www.innodb.com/support/tips.

2.6. Limitations

Take the following considerations into account when creating or dropping indexes using the InnoDB Plugin:

During index creation, files are written to the temporary directory ($TMPDIR on Unix, %TEMP% on Windows, or the value of --tmpdir configuration variable). Each temporary file is large enough to hold one column that makes up the new index, and each one is removed as soon as it is merged into the final index.
Due to a limitation of MySQL, the table is copied, rather than using “Fast Index Creation” when you create an index on a TEMPORARY TABLE. This has been reported as MySQL Bug #39833.
To avoid consistency issues between the InnoDB data dictionary and the MySQL data dictionary, the table is copied, rather than using Fast Index Creation when you use the ALTER TABLE ... RENAME COLUMN syntax.
The command ALTER IGNORE TABLE t ADD UNIQUE INDEX does not delete duplicate rows. This has been reported as MySQL Bug #40344. The IGNORE keyword is ignored. If any duplicate rows exist, the operation fails with the following error message:
```
ERROR 23000: Duplicate entry '347' for key 'pl'
```
As noted above, a newly-created index contains only information about data current at the time the index was created. Therefore, you should not run queries in a transaction that might use a secondary index that did not exist at the beginning of the transaction. There is no way for InnoDB to access “old” data that is consistent with the rest of the data read by the transaction. See the discussion of locking in Section 2.4, “Concurrency Considerations”.
Prior to InnoDB Plugin 1.0.4, unexpected results could occur if a query attempts to use an index created after the start of the transaction containing the query. If an old transaction attempts to access a “too new” index, InnoDB Plugin 1.0.4 and later reports an error:
```
ERROR HY000: Table definition has changed, please retry transaction
```
As the error message suggests, committing (or rolling back) the transaction, and restarting it, cures the problem.
InnoDB Plugin 1.0.2 introduces some improvements in error handling when users attempt to drop indexes. See section Section 8.7, “Better Error Handling when Dropping Indexes” for details.
MySQL 5.1 does not support efficient creation or dropping of FOREIGN KEY constraints. Therefore, if you use ALTER TABLE to add or remove a REFERENCES constraint, the child table will be copied, rather than using “Fast Index Creation”.

Prev		Next
Chapter 1. Introduction to the InnoDB Plugin	Home	Chapter 3. InnoDB Data Compression