Table of Contents
In MySQL 5.5 and higher, or in MySQL 5.1 with the InnoDB Plugin, creating and dropping secondary indexes does not copy the contents of the entire table, making this operation much more efficient than with prior releases.
With MySQL 5.5 and higher, or MySQL 5.1 with the InnoDB Plugin,
creating and dropping
secondary indexes for
InnoDB tables is much faster than before. Historically, adding or
dropping an index on a table with existing data could be very
slow. The CREATE INDEX and
DROP INDEX statements worked by
creating a new, empty table defined with the requested set of
indexes, then copying the existing rows to the new table
one-by-one, updating the indexes as the rows are inserted. After
all rows from the original table were copied, the old table was
dropped and the copy was renamed with the name of the original
table.
The performance speedup for fast index creation applies to secondary indexes, not to the primary key index. The rows of an InnoDB table are stored in a clustered index organized based on the primary key, forming what some database systems call an “index-organized table”. Because the table structure is so closely tied to the primary key, redefining the primary key still requires copying the data.
This new mechanism also means that you can generally speed the overall process of creating and loading an indexed table by creating the table with only the clustered index, and adding the secondary indexes after the data is loaded.
Although no syntax changes are required in the
CREATE INDEX or
DROP INDEX commands, some factors
affect the performance, space usage, and semantics of this
operation (see Section 2.6, “Limitations of Fast Index Creation”).
It is possible to create multiple indexes on a table with one
ALTER TABLE statement. This is
relatively efficient, because the clustered index of the table
needs to be scanned only once (although the data is sorted
separately for each new index). For example:
CREATE TABLE T1(A INT PRIMARY KEY, B INT, C CHAR(1)) ENGINE=InnoDB; INSERT INTO T1 VALUES (1,2,'a'), (2,3,'b'), (3,2,'c'), (4,3,'d'), (5,2,'e'); COMMIT; ALTER TABLE T1 ADD INDEX (B), ADD UNIQUE INDEX (C);
The above statements create table T1 with the
clustered index (primary key) on column A,
insert several rows, and then build two new indexes on columns
B and C. If there were many
rows inserted into T1 before the
ALTER TABLE statement, this
approach is much more efficient than creating all the secondary
indexes before loading the data.
You can also create the indexes one at a time, but then the
clustered index of the table is scanned (as well as sorted) once
for each CREATE INDEX statement.
Thus, the following statements are not as efficient as the
ALTER TABLE statement above, even
though neither requires recreating the clustered index for table
T1.
CREATE INDEX B ON T1 (B); CREATE UNIQUE INDEX C ON T1 (C);
Dropping InnoDB secondary indexes also does not require any
copying of table data. You can equally quickly drop multiple
indexes with a single ALTER TABLE
statement or multiple DROP INDEX
statements:
ALTER TABLE T1 DROP INDEX B, DROP INDEX C;
or:
DROP INDEX B ON T1; DROP INDEX C ON T1;
Restructuring the clustered index in InnoDB always requires
copying the data in the table. For example, if you create a table
without a primary key, InnoDB chooses one for you, which may be
the first UNIQUE key defined on NOT
NULL columns, or a system-generated key. Defining a
PRIMARY KEY later causes the data to be copied,
as in the following example:
CREATE TABLE T2 (A INT, B INT) ENGINE=InnoDB; INSERT INTO T2 VALUES (NULL, 1); ALTER TABLE T2 ADD PRIMARY KEY (B);
When you create a UNIQUE or PRIMARY
KEY index, InnoDB must do some extra work. For
UNIQUE indexes, InnoDB checks that the table
contains no duplicate values for the key. For a PRIMARY
KEY index, InnoDB also checks that none of the
PRIMARY KEY columns contains a
NULL. It is best to define the primary key when
you create a table, so you need not rebuild the table later.
InnoDB has two types of indexes: the clustered index and secondary indexes. Since the clustered index contains the data values in its B-tree nodes, adding or dropping a clustered index does involve copying the data, and creating a new copy of the table. A secondary index, however, contains only the index key and the value of the primary key. This type of index can be created or dropped without copying the data in the clustered index. Because each secondary index contains copies of the primary key values (used to access the clustered index when needed), when you change the definition of the primary key, all secondary indexes are recreated as well.
Dropping a secondary index is simple. Only the internal InnoDB system tables and the MySQL data dictionary tables are updated to reflect the fact that the index no longer exists. InnoDB returns the storage used for the index to the tablespace that contained it, so that new indexes or additional table rows can use the space.
To add a secondary index to an existing table, InnoDB scans the table, and sorts the rows using memory buffers and temporary files in order by the values of the secondary index key columns. The B-tree is then built in key-value order, which is more efficient than inserting rows into an index in random order. Because the B-tree nodes are split when they fill, building the index in this way results in a higher fill-factor for the index, making it more efficient for subsequent access.
While an InnoDB secondary index is being created or dropped, the table is locked in shared mode. Any writes to the table are blocked, but the data in the table can be read. When you alter the clustered index of a table, the table is locked in exclusive mode, because the data must be copied. Thus, during the creation of a new clustered index, all operations on the table are blocked.
A CREATE INDEX or
ALTER TABLE statement for an InnoDB
table always waits for currently executing transactions that are
accessing the table to commit or roll back.
ALTER TABLE statements that
redefine an InnoDB primary key wait for all
SELECT statements that access the table to
complete, or their containing transactions to commit. No
transactions whose execution spans the creation of the index can
be accessing the table, because the original table is dropped when
the clustered index is restructured.
Once a CREATE INDEX or
ALTER TABLE statement that creates
an InnoDB secondary index begins executing, queries can access the
table for read access, but cannot update the table. If an
ALTER TABLE statement is changing
the clustered index for an InnoDB table, all queries wait until
the operation completes.
A newly-created InnoDB secondary index contains only the committed
data in the table at the time the CREATE
INDEX or ALTER TABLE
statement begins to execute. It does not contain any uncommitted
values, old versions of values, or values marked for deletion but
not yet removed from the old index.
Because a newly-created index contains only information about data current at the time the index was created, queries that need to see data that was deleted or changed before the index was created cannot use the index. The only queries that could be affected by this limitation are those executing in transactions that began before the creation of the index was begun. For such queries, unpredictable results could occur. Newer queries can use the index.
Although no data is lost if the server crashes while an
ALTER TABLE statement is executing,
the crash recovery
process is different for
clustered indexes and
secondary indexes.
If the server crashes while creating an InnoDB secondary index,
upon recovery, MySQL drops any partially created indexes. You must
re-run the ALTER TABLE or
CREATE INDEX statement.
When a crash occurs during the creation of an InnoDB clustered index, recovery is more complicated, because the data in the table must be copied to an entirely new clustered index. Remember that all InnoDB tables are stored as clustered indexes. In the following discussion, we use the word table and clustered index interchangeably.
MySQL creates the new clustered index by copying the existing data from the original InnoDB table to a temporary table that has the desired index structure. Once the data is completely copied to this temporary table, the original table is renamed with a different temporary table name. The temporary table comprising the new clustered index is renamed with the name of the original table, and the original table is dropped from the database.
If a system crash occurs while creating a new clustered index, no data is lost, but you must complete the recovery process using the temporary tables that exist during the process. Since it is rare to re-create a clustered index or re-define primary keys on large tables, or to encounter a system crash during this operation, this manual does not provide information on recovering from this scenario. Instead, please see the InnoDB web site: http://www.innodb.com/support/tips.
Take the following considerations into account when creating or dropping InnoDB indexes:
During index creation, files are written to the temporary
directory ($TMPDIR on Unix,
%TEMP% on Windows, or the value of the
--tmpdir configuration
variable). Each temporary file is large enough to hold one
column that makes up the new index, and each one is removed as
soon as it is merged into the final index.
The table is copied, rather than using Fast Index Creation
when you create an index on a TEMPORARY
TABLE. This has been reported as
MySQL
Bug #39833.
To avoid consistency issues between the InnoDB data dictionary
and the MySQL data dictionary, the table is copied, rather
than using Fast Index Creation when you use the ALTER
TABLE ... RENAME COLUMN syntax.
The statement ALTER IGNORE TABLE
does
not delete duplicate rows. This has been reported as
MySQL
Bug #40344. The t ADD UNIQUE INDEXIGNORE keyword is
ignored. If any duplicate rows exist, the operation fails with
the following error message:
ERROR 23000: Duplicate entry '347' for key 'pl'
As noted above, a newly-created index contains only information about data current at the time the index was created. Therefore, you should not run queries in a transaction that might use a secondary index that did not exist at the beginning of the transaction. There is no way for InnoDB to access “old” data that is consistent with the rest of the data read by the transaction. See the discussion of locking in Section 2.4, “Concurrency Considerations for Fast Index Creation”.
Prior to InnoDB storage engine 1.0.4, unexpected results could occur if a query attempts to use an index created after the start of the transaction containing the query. If an old transaction attempts to access a “too new” index, InnoDB storage engine 1.0.4 and later reports an error:
ERROR HY000: Table definition has changed, please retry transaction
As the error message suggests, committing (or rolling back) the transaction, and restarting it, cures the problem.
InnoDB storage engine 1.0.2 introduces some improvements in error handling when users attempt to drop indexes. See section Section 8.6, “Better Error Handling when Dropping Indexes” for details.
MySQL 5.5 does not support efficient creation or dropping of
FOREIGN KEY constraints. Therefore, if you
use ALTER TABLE to add or
remove a REFERENCES constraint, the child
table is copied, rather than using Fast Index Creation.