Chapter 2. Fast Index Creation in the InnoDB Storage Engine
Prev		Next

Chapter 2. Fast Index Creation in the InnoDB Storage Engine

Table of Contents

2.1. Overview of Fast Index Creation
2.2. Examples of Fast Index Creation
2.3. Implementation Details of Fast Index Creation
2.4. Concurrency Considerations for Fast Index Creation
2.5. How Crash Recovery Works with Fast Index Creation
2.6. Limitations of Fast Index Creation

In MySQL 5.5 and higher, or in MySQL 5.1 with the InnoDB Plugin, creating and dropping secondary indexes does not copy the contents of the entire table, making this operation much more efficient than with prior releases.

2.1. Overview of Fast Index Creation

With MySQL 5.5 and higher, or MySQL 5.1 with the InnoDB Plugin, creating and dropping secondary indexes for InnoDB tables is much faster than before. Historically, adding or dropping an index on a table with existing data could be very slow. The CREATE INDEX and DROP INDEX statements worked by creating a new, empty table defined with the requested set of indexes, then copying the existing rows to the new table one-by-one, updating the indexes as the rows are inserted. After all rows from the original table were copied, the old table was dropped and the copy was renamed with the name of the original table.

The performance speedup for fast index creation applies to secondary indexes, not to the primary key index. The rows of an InnoDB table are stored in a clustered index organized based on the primary key, forming what some database systems call an “index-organized table”. Because the table structure is so closely tied to the primary key, redefining the primary key still requires copying the data.

This new mechanism also means that you can generally speed the overall process of creating and loading an indexed table by creating the table with only the clustered index, and adding the secondary indexes after the data is loaded.

Although no syntax changes are required in the CREATE INDEX or DROP INDEX commands, some factors affect the performance, space usage, and semantics of this operation (see Section 2.6, “Limitations of Fast Index Creation”).

2.2. Examples of Fast Index Creation

It is possible to create multiple indexes on a table with one ALTER TABLE statement. This is relatively efficient, because the clustered index of the table needs to be scanned only once (although the data is sorted separately for each new index). For example:

CREATE TABLE T1(A INT PRIMARY KEY, B INT, C CHAR(1)) ENGINE=InnoDB;
INSERT INTO T1 VALUES (1,2,'a'), (2,3,'b'), (3,2,'c'), (4,3,'d'), (5,2,'e');
COMMIT;
ALTER TABLE T1 ADD INDEX (B), ADD UNIQUE INDEX (C);

The above statements create table T1 with the clustered index (primary key) on column A, insert several rows, and then build two new indexes on columns B and C. If there were many rows inserted into T1 before the ALTER TABLE statement, this approach is much more efficient than creating all the secondary indexes before loading the data.

You can also create the indexes one at a time, but then the clustered index of the table is scanned (as well as sorted) once for each CREATE INDEX statement. Thus, the following statements are not as efficient as the ALTER TABLE statement above, even though neither requires recreating the clustered index for table T1.

CREATE INDEX B ON T1 (B);
CREATE UNIQUE INDEX C ON T1 (C);

Dropping InnoDB secondary indexes also does not require any copying of table data. You can equally quickly drop multiple indexes with a single ALTER TABLE statement or multiple DROP INDEX statements:

ALTER TABLE T1 DROP INDEX B, DROP INDEX C;

or:

DROP INDEX B ON T1;
DROP INDEX C ON T1;

Restructuring the clustered index in InnoDB always requires copying the data in the table. For example, if you create a table without a primary key, InnoDB chooses one for you, which may be the first UNIQUE key defined on NOT NULL columns, or a system-generated key. Defining a PRIMARY KEY later causes the data to be copied, as in the following example:

CREATE TABLE T2 (A INT, B INT) ENGINE=InnoDB;
INSERT INTO T2 VALUES (NULL, 1);
ALTER TABLE T2 ADD PRIMARY KEY (B);

When you create a UNIQUE or PRIMARY KEY index, InnoDB must do some extra work. For UNIQUE indexes, InnoDB checks that the table contains no duplicate values for the key. For a PRIMARY KEY index, InnoDB also checks that none of the PRIMARY KEY columns contains a NULL. It is best to define the primary key when you create a table, so you need not rebuild the table later.

2.3. Implementation Details of Fast Index Creation

InnoDB has two types of indexes: the clustered index and secondary indexes. Since the clustered index contains the data values in its B-tree nodes, adding or dropping a clustered index does involve copying the data, and creating a new copy of the table. A secondary index, however, contains only the index key and the value of the primary key. This type of index can be created or dropped without copying the data in the clustered index. Because each secondary index contains copies of the primary key values (used to access the clustered index when needed), when you change the definition of the primary key, all secondary indexes are recreated as well.

Dropping a secondary index is simple. Only the internal InnoDB system tables and the MySQL data dictionary tables are updated to reflect the fact that the index no longer exists. InnoDB returns the storage used for the index to the tablespace that contained it, so that new indexes or additional table rows can use the space.

To add a secondary index to an existing table, InnoDB scans the table, and sorts the rows using memory buffers and temporary files in order by the values of the secondary index key columns. The B-tree is then built in key-value order, which is more efficient than inserting rows into an index in random order. Because the B-tree nodes are split when they fill, building the index in this way results in a higher fill-factor for the index, making it more efficient for subsequent access.

2.4. Concurrency Considerations for Fast Index Creation

While an InnoDB secondary index is being created or dropped, the table is locked in shared mode. Any writes to the table are blocked, but the data in the table can be read. When you alter the clustered index of a table, the table is locked in exclusive mode, because the data must be copied. Thus, during the creation of a new clustered index, all operations on the table are blocked.

A CREATE INDEX or ALTER TABLE statement for an InnoDB table always waits for currently executing transactions that are accessing the table to commit or roll back. ALTER TABLE statements that redefine an InnoDB primary key wait for all SELECT statements that access the table to complete, or their containing transactions to commit. No transactions whose execution spans the creation of the index can be accessing the table, because the original table is dropped when the clustered index is restructured.

Once a CREATE INDEX or ALTER TABLE statement that creates an InnoDB secondary index begins executing, queries can access the table for read access, but cannot update the table. If an ALTER TABLE statement is changing the clustered index for an InnoDB table, all queries wait until the operation completes.

A newly-created InnoDB secondary index contains only the committed data in the table at the time the CREATE INDEX or ALTER TABLE statement begins to execute. It does not contain any uncommitted values, old versions of values, or values marked for deletion but not yet removed from the old index.

Because a newly-created index contains only information about data current at the time the index was created, queries that need to see data that was deleted or changed before the index was created cannot use the index. The only queries that could be affected by this limitation are those executing in transactions that began before the creation of the index was begun. For such queries, unpredictable results could occur. Newer queries can use the index.

2.5. How Crash Recovery Works with Fast Index Creation

Although no data is lost if the server crashes while an ALTER TABLE statement is executing, the crash recovery process is different for clustered indexes and secondary indexes.

If the server crashes while creating an InnoDB secondary index, upon recovery, MySQL drops any partially created indexes. You must re-run the ALTER TABLE or CREATE INDEX statement.

When a crash occurs during the creation of an InnoDB clustered index, recovery is more complicated, because the data in the table must be copied to an entirely new clustered index. Remember that all InnoDB tables are stored as clustered indexes. In the following discussion, we use the word table and clustered index interchangeably.

MySQL creates the new clustered index by copying the existing data from the original InnoDB table to a temporary table that has the desired index structure. Once the data is completely copied to this temporary table, the original table is renamed with a different temporary table name. The temporary table comprising the new clustered index is renamed with the name of the original table, and the original table is dropped from the database.

If a system crash occurs while creating a new clustered index, no data is lost, but you must complete the recovery process using the temporary tables that exist during the process. Since it is rare to re-create a clustered index or re-define primary keys on large tables, or to encounter a system crash during this operation, this manual does not provide information on recovering from this scenario. Instead, please see the InnoDB web site: http://www.innodb.com/support/tips.

2.6. Limitations of Fast Index Creation

Take the following considerations into account when creating or dropping InnoDB indexes:

During index creation, files are written to the temporary directory ($TMPDIR on Unix, %TEMP% on Windows, or the value of the --tmpdir configuration variable). Each temporary file is large enough to hold one column that makes up the new index, and each one is removed as soon as it is merged into the final index.
The table is copied, rather than using Fast Index Creation when you create an index on a TEMPORARY TABLE. This has been reported as MySQL Bug #39833.
To avoid consistency issues between the InnoDB data dictionary and the MySQL data dictionary, the table is copied, rather than using Fast Index Creation when you use the ALTER TABLE ... RENAME COLUMN syntax.
The statement ALTER IGNORE TABLE t ADD UNIQUE INDEX does not delete duplicate rows. This has been reported as MySQL Bug #40344. The IGNORE keyword is ignored. If any duplicate rows exist, the operation fails with the following error message:
```
ERROR 23000: Duplicate entry '347' for key 'pl'
```
As noted above, a newly-created index contains only information about data current at the time the index was created. Therefore, you should not run queries in a transaction that might use a secondary index that did not exist at the beginning of the transaction. There is no way for InnoDB to access “old” data that is consistent with the rest of the data read by the transaction. See the discussion of locking in Section 2.4, “Concurrency Considerations for Fast Index Creation”.
Prior to InnoDB storage engine 1.0.4, unexpected results could occur if a query attempts to use an index created after the start of the transaction containing the query. If an old transaction attempts to access a “too new” index, InnoDB storage engine 1.0.4 and later reports an error:
```
ERROR HY000: Table definition has changed, please retry transaction
```
As the error message suggests, committing (or rolling back) the transaction, and restarting it, cures the problem.
InnoDB storage engine 1.0.2 introduces some improvements in error handling when users attempt to drop indexes. See section Section 8.6, “Better Error Handling when Dropping Indexes” for details.
MySQL 5.5 does not support efficient creation or dropping of FOREIGN KEY constraints. Therefore, if you use ALTER TABLE to add or remove a REFERENCES constraint, the child table is copied, rather than using Fast Index Creation.

Prev		Next
Chapter 1. Introduction to InnoDB 1.1	Home	Chapter 3. InnoDB Data Compression