Table of Contents
This section discusses how certain InnoDB features, such as table
compression and off-page
storage of long columns, are controlled by the
ROW_FORMAT
clause of the CREATE
TABLE
statement. It discusses considerations for choosing
the right row format and compatibility of row formats between MySQL
releases.
The storage for rows and associated columns affects performance for queries and DML operations. As more rows fit into a single disk page, queries and index lookups can work faster, less cache memory is required in the InnoDB buffer pool, and less I/O is required to write out updated values for the numeric and short string columns.
All data in InnoDB is stored in database pages that make up a B-tree index (the clustered index organized according to the primary key columns). Table data and indexes both use this type of structure. The nodes of the index data structure contain the values of all the columns in that row (for the clustered index) or the index columns and the primary key columns (for secondary indexes).
Variable-length columns are an exception to this rule. Columns
such as BLOB
and VARCHAR
that are too long to fit on a B-tree page are stored on separately
allocated disk pages called
overflow pages. We call
such columns off-page
column. The values of these columns are stored on
singly-linked lists of overflow pages, and each such column has
its own list of one or more overflow pages. In some cases, all or
a prefix of the long column value is stored in the B-tree, to
avoid wasting storage and eliminating the need to read a separate
page.
The Barracuda file format
provides a new option (KEY_BLOCK_SIZE
) to
control how much column data is stored in the clustered index, and
how much is placed on overflow pages.
You specify the row format for a table with the
ROW_FORMAT
clause of the
CREATE TABLE
and
ALTER TABLE
statements.
When innodb_file_format
is set to
Barracuda and a table is created with
ROW_FORMAT=DYNAMIC
or
ROW_FORMAT=COMPRESSED
, long column values are
stored fully off-page, and the clustered index record contains
only a 20-byte pointer to the overflow page.
Whether any columns are stored off-page depends on the page size and the total size of the row. When the row is too long, InnoDB chooses the longest columns for off-page storage until the clustered index record fits on the B-tree page.
The DYNAMIC
row format maintains the efficiency
of storing the entire row in the index node if it fits (as do the
COMPACT
and REDUNDANT
formats), but this new format avoids the problem of filling B-tree
nodes with a large number of data bytes of long columns. The
DYNAMIC
format is based on the idea that if a
portion of a long data value is stored off-page, it is usually
most efficient to store all of the value off-page. With
DYNAMIC
format, shorter columns are likely to
remain in the B-tree node, minimizing the number of overflow pages
needed for any given row.
The COMPRESSED
row format uses similar internal
details for off-page storage as the DYNAMIC
row
format, with additional storage and performance considerations
from the table and index data being compressed and using smaller
page sizes. For full details about the
COMPRESSED
row format, see
Chapter 3, InnoDB Data Compression.
Early versions of InnoDB used an unnamed file format (now called
Antelope) for database files.
With that format, tables were defined with
ROW_FORMAT=COMPACT
(or
ROW_FORMAT=REDUNDANT
) and InnoDB stored up to
the first 768 bytes of variable-length columns (such as
BLOB
and VARCHAR
) in the
index record within the B-tree node, with the remainder stored on
the overflow pages.
To preserve compatibility with those prior versions, tables
created with the newest InnoDB use the prefix format, unless one
of ROW_FORMAT=DYNAMIC
or
ROW_FORMAT=COMPRESSED
is specified (or implied)
on the CREATE TABLE
statement.
With the Antelope file format, if the value of a column is 768
bytes or less, no overflow page is needed, and some savings in I/O
may result, since the value is in the B-tree node. This works well
for relatively short BLOB
values, but may cause
B-tree nodes to fill with data rather than key values, reducing
their efficiency. Tables with many BLOB
columns
could cause B-tree nodes to become too full of data, and contain
too few rows, making the entire index less efficient than if the
rows were shorter or if the column values were stored off-page.