Table of Contents
In MySQL versions up to 5.0, adding or dropping an index on a
table with existing data can be very slow if the table has many
rows. The CREATE INDEX
and DROP INDEX
commands work by
creating a new, empty table defined with the requested set of
indexes. It then copies the existing rows to the new table
one-by-one, updating the indexes as it goes. Inserting entries
into the indexes in this fashion, where the key values are not
sorted, requires random access to the index nodes, and is far from
optimal. After all rows from the original table are copied, the
old table is dropped and the copy is renamed with the name of the
original table.
Beginning with version 5.1, MySQL allows a storage engine to create or drop indexes without copying the contents of the entire table. The standard built-in InnoDB in MySQL version 5.1, however, does not take advantage of this capability. With the InnoDB Plugin, however, users can in most cases add and drop indexes much more efficiently than with prior releases.
In InnoDB, the rows of a table are stored in a clustered (or primary key) index, forming what some database systems call an “index-organized table”. Changing the clustered index requires copying the data, even with the InnoDB Plugin. However, adding or dropping a secondary index with the InnoDB Plugin is much faster, since it does not involve copying the data.
This new mechanism also means that you can generally speed the overall process of creating and loading an indexed table by creating the table with only the clustered index, and adding the secondary indexes after the data is loaded.
Although no syntax changes are required in the CREATE INDEX
or
DROP INDEX
commands, some factors affect the performance, space
usage, and semantics of this operation (see
Section 2.6, “Limitations”).
Because the ability to create and drop indexes does not require use of a new on-disk file format, it is possible to temporarily use the InnoDB Plugin to create or drop an index, and then fall back to using the standard built-in InnoDB in MySQL for normal operations if you wish. See Chapter 11, Downgrading from the InnoDB Plugin for more information.
It is possible to create multiple indexes on a table with one
ALTER TABLE
command. This is relatively efficient, because the
clustered index of the table needs to be scanned only once
(although the data is sorted separately for each new index). For
example:
CREATE TABLE T1(A INT PRIMARY KEY, B INT, C CHAR(1)) ENGINE=InnoDB; INSERT INTO T1 VALUES (1,2,'a'), (2,3,'b'), (3,2,'c'), (4,3,'d'), (5,2,'e'); COMMIT; ALTER TABLE T1 ADD INDEX (B), ADD UNIQUE INDEX (C);
The above commands will create table T1
with
the clustered index (primary key) on column A
,
insert several rows, and then build two new indexes on columns
B
and C
. If there were many
rows inserted into T1
before the ALTER TABLE
command, this approach would be much more efficient than creating
the table with all its indexes before loading the data.
You may also create the indexes one at a time, but then the
clustered index of the table is scanned (as well as sorted) once
for each CREATE INDEX
command. Thus, the following commands are
not as efficient as the ALTER TABLE
command above, even though
neither requires recreating the clustered index for table
T1
.
CREATE INDEX B ON T1 (B); CREATE UNIQUE INDEX C ON T1 (C);
Dropping indexes in the InnoDB Plugin does not require any
copying of table data. Thus, you can equally quickly drop multiple
indexes with a single ALTER TABLE
command or multiple
DROP INDEX
commands:
ALTER TABLE T1 DROP INDEX B, DROP INDEX C;
or
DROP INDEX B ON T1; DROP INDEX C ON T1;
Restructuring the clustered index in InnoDB always requires
copying the data in the table. For example, if you create a table
without a primary key, InnoDB chooses one for you, which may be
the first UNIQUE
key defined on NOT
NULL
columns, or a system-generated key. Defining a
PRIMARY KEY
later causes the data to be copied,
as in the following example:
CREATE TABLE T2 (A INT, B INT) ENGINE=InnoDB; INSERT INTO T2 VALUES (NULL, 1); ALTER TABLE T2 ADD PRIMARY KEY (B);
Note that when you create a UNIQUE
or
PRIMARY KEY
index, InnoDB must do some extra
work. For UNIQUE
indexes, InnoDB checks that
the table contains no duplicate values for the key. For a
PRIMARY KEY
index, InnoDB also checks that
none of the PRIMARY KEY
columns contains a
NULL
. It is best to define the primary key when
you create a table, so you need not rebuild the table later.
InnoDB has two types of indexes: the clustered index and secondary indexes. Since the clustered index contains the data values in its B-tree nodes, adding or dropping a clustered index does involve copying the data, and creating a new copy of the table. A secondary index, however, contains only the index key and the value of the primary key. This type of index may be created or dropped without copying the data in the clustered index. Furthermore, because the secondary index contains the values of the primary key (used to access the clustered index when needed), when you change the definition of the primary key, thus recreating the clustered index, all secondary indexes are recreated as well.
Dropping a secondary index is simple. Only the internal InnoDB system tables and the MySQL data dictionary tables need to be updated to reflect the fact that the index no longer exists. InnoDB returns the storage used for the index to the tablespace that contained it, so that new indexes or additional table rows may use the space.
To add a secondary index to an existing table, InnoDB scans the table, and sorts the rows using memory buffers and temporary files in order by the value(s) of the secondary index key column(s). The B-tree is then built in key-value order, which is more efficient than inserting rows into an index in random order with respect to the key values. Because the B-tree nodes are split when they fill, building the index in this way results in a higher fill-factor for the index, making it more efficient for subsequent access.
While a secondary index is being created or dropped, the table is locked in shared mode. That is, any writes to the table are blocked, but the data in the table may be read. When you alter the clustered index of a table, however, the table is locked in exclusive mode, because the data must be copied. Thus, during the creation of a new clustered index, all operations on the table are blocked.
Before it can start executing, a CREATE INDEX
or ALTER TABLE
command must always wait for currently executing transactions that
are accessing the table to commit or rollback before it can
proceed. In addition, ALTER TABLE
commands that create a new
clustered index must wait for all SELECT
statements that access the table to complete (or their containing
transactions to commit). Even though the original index exists
throughout the creation of the new clustered index, no
transactions whose execution spans the creation of the index can
be accessing the table, because the original table must be dropped
when clustered index is restructured.
Once a CREATE INDEX
or ALTER TABLE
command that creates a
secondary index begins executing, queries may access the table for
read access, but may not update the table. If an ALTER TABLE
command is changing the clustered index, all queries must wait
until the operation completes.
A newly-created secondary index contains only data that is current
in the table as of the time the CREATE INDEX
or ALTER TABLE
command begins to execute. Specifically, a newly-created index
contains only the versions of data as of the most-recently
committed transactions prior to the creation of the index. The
index thus does not contain any rows that were deleted (and
therefore marked for deletion) by transactions that completed
before the CREATE INDEX
or ALTER TABLE
began. Similarly, the
index contains only current versions of every row, and none of the
old versions of rows that were updated by transactions that ran
before the index was created.
Because a newly-created index contains only information about data current at the time the index was created, queries that need to see data that was deleted or changed before the index was created cannot use the index. The only queries that could be affected by this limitation are those executing in transactions that began before the creation of the index was begun. For such queries, unpredictable results could occur. Newer queries can use the index.
No data is lost if the server crashes while an ALTER TABLE
command is executing. Recovery, however, is different for
clustered indexes and secondary indexes.
If the server crashes while creating a secondary index, upon
recovery, InnoDB drops any partially created indexes. All you
need to do to create the index is to re-run the ALTER TABLE
or
CREATE INDEX
command.
However, when a crash occurs during the creation of a clustered index, recovery is somewhat more complicated, because the data in the table must be copied to an entirely new clustered index. Remember that all InnoDB tables are stored as clustered indexes. In the following discussion, we use the word table and clustered index interchangeably.
The InnoDB Plugin creates the new clustered index by copying the existing data from the original table to a temporary table that has the desired index structure. Once the data is completely copied to this temporary table, the original table is renamed with a different temporary table name. The temporary table comprising the new clustered index is then renamed with the name of the original table, and the original table is then dropped from the database.
If a system crash occurs while creating a new clustered index, no data is lost, but users must complete the recovery process using the temporary tables that exist during the process.
Users rarely re-create a clustered index or re-define primary keys on large tables. Because system crashes are uncommon and the situation described here is rare, this manual does not provide information on recovering from this scenario. Instead, please see the InnoDB web site: http://www.innodb.com/support/tips.
Take the following considerations into account when creating or dropping indexes using the InnoDB Plugin:
During index creation, files are written to the temporary
directory ($TMPDIR
on Unix,
%TEMP%
on Windows, or the value of
--tmpdir
configuration
variable). Each temporary file is large enough to hold one
column that makes up the new index, and each one is removed as
soon as it is merged into the final index.
Due to a limitation of MySQL, the table is copied, rather than
using “Fast Index Creation” when you create an
index on a TEMPORARY TABLE
. This has been
reported as
MySQL
Bug #39833.
To avoid consistency issues between the InnoDB data dictionary
and the MySQL data dictionary, the table is copied, rather
than using Fast Index Creation when you use the ALTER
TABLE ... RENAME COLUMN
syntax.
The command ALTER IGNORE TABLE
does
not delete duplicate rows. This has been reported as
MySQL
Bug #40344. The t
ADD UNIQUE INDEXIGNORE
keyword is
ignored. If any duplicate rows exist, the operation fails with
the following error message:
ERROR 23000: Duplicate entry '347
' for key 'pl
'
As noted above, a newly-created index contains only information about data current at the time the index was created. Therefore, you should not run queries in a transaction that might use a secondary index that did not exist at the beginning of the transaction. There is no way for InnoDB to access “old” data that is consistent with the rest of the data read by the transaction. See the discussion of locking in Section 2.4, “Concurrency Considerations”.
Prior to InnoDB Plugin 1.0.4, unexpected results could occur if a query attempts to use an index created after the start of the transaction containing the query. If an old transaction attempts to access a “too new” index, InnoDB Plugin 1.0.4 and later reports an error:
ERROR HY000: Table definition has changed, please retry transaction
As the error message suggests, committing (or rolling back) the transaction, and restarting it, cures the problem.
InnoDB Plugin 1.0.2 introduces some improvements in error handling when users attempt to drop indexes. See section Section 8.7, “Better Error Handling when Dropping Indexes” for details.
MySQL 5.1 does not support efficient creation or dropping of
FOREIGN KEY
constraints. Therefore, if you use ALTER TABLE
to add or remove a REFERENCES
constraint, the child table
will be copied, rather than using “Fast Index
Creation”.