By setting InnoDB configuration options, you can create tables where the data is stored in compressed form. The compression means less data is transferred between disk and memory, and takes up less space in memory. The benefits are amplified for tables with secondary indexes, because index data is compressed also. Compression can be especially important for SSD storage devices, because they tend to have lower capacity than HDD devices.
Because processors and cache memories have increased in speed more than disk storage devices, many workloads are I/O-bound. Data compression enables smaller database size, reduced I/O, and improved throughput, at the small cost of increased CPU utilization. Compression is especially valuable for read-intensive applications, on systems with enough RAM to keep frequently-used data in memory.
An InnoDB table created with ROW_FORMAT=COMPRESSED can use a smaller page size on disk than the usual 16KB default. Smaller pages require less I/O to read from and write to disk, which is especially valuable for SSD devices.

The page size is specified through the KEY_BLOCK_SIZE parameter. The different page size means the table must be in its own .ibd file rather than the system tablespace, which requires enabling the innodb_file_per_table option. The level of compression is the same regardless of the KEY_BLOCK_SIZE value. As you specify smaller values for KEY_BLOCK_SIZE, you get the I/O benefits of increasingly smaller pages. But if you specify a value that is too small, there is additional overhead to reorganize the pages when data values cannot be compressed enough to fit multiple rows in each page. There is a hard limit on how small KEY_BLOCK_SIZE can be for a table, based on the lengths of the key columns for each of its indexes. If you specify a value that is too small, the CREATE TABLE or ALTER TABLE statement fails.
In the buffer pool, the compressed data is held in small pages, with a page size based on the KEY_BLOCK_SIZE value. For extracting or updating the column values, InnoDB also creates a 16KB page in the buffer pool with the uncompressed data. Within the buffer pool, any updates to the uncompressed page are also re-written back to the equivalent compressed page. You might need to size your buffer pool to accommodate the additional data of both compressed and uncompressed pages, although uncompressed pages are evicted from the buffer pool when space is needed, and then uncompressed again on the next access.
The default uncompressed size of InnoDB data pages is 16KB. You can use the attributes ROW_FORMAT=COMPRESSED, KEY_BLOCK_SIZE, or both in the CREATE TABLE and ALTER TABLE statements to enable table compression. Depending on the combination of option values, InnoDB uses a page size of 1KB, 2KB, 4KB, 8KB, or 16KB for the .ibd file of the table. (The actual compression algorithm is not affected by the KEY_BLOCK_SIZE value.) Compression is applicable to tables, not to individual rows, despite the option name ROW_FORMAT.
To create a compressed table, you might use a statement like this:

CREATE TABLE name (column1 INT PRIMARY KEY)
  ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4;
If you specify ROW_FORMAT=COMPRESSED but not KEY_BLOCK_SIZE, the default compressed page size of 8KB is used. If KEY_BLOCK_SIZE is specified, you can omit the attribute ROW_FORMAT=COMPRESSED.
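For example, the following two statements are equivalent (a sketch; the table and column names are illustrative). The first relies on the 8KB default compressed page size, while the second names it explicitly:

CREATE TABLE t1 (c1 INT PRIMARY KEY) ENGINE=InnoDB ROW_FORMAT=COMPRESSED;
CREATE TABLE t2 (c1 INT PRIMARY KEY) ENGINE=InnoDB KEY_BLOCK_SIZE=8;

Both tables use compressed 8KB pages; SHOW CREATE TABLE reports whichever attributes you actually specified.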
Setting KEY_BLOCK_SIZE=16 typically does not result in much compression, since the normal InnoDB page size is 16KB. This setting may still be useful for tables with many long BLOB, VARCHAR or TEXT columns, because such values often do compress well, and might therefore require fewer “overflow” pages as described in Section 3.4, “Compressing BLOB, VARCHAR and TEXT Columns”.
All indexes of a table (including the clustered index) are compressed using the same page size, as specified in the CREATE TABLE or ALTER TABLE statement. Table attributes such as ROW_FORMAT and KEY_BLOCK_SIZE are not part of the CREATE INDEX syntax, and are ignored if they are specified (although you see them in the output of the SHOW CREATE TABLE statement).
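For instance, in this sketch (with illustrative names), an index added after the table is created still uses the table's compressed page size:

CREATE TABLE t (c1 INT PRIMARY KEY, c2 VARCHAR(255))
  ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4;
CREATE INDEX i2 ON t (c2);  -- compressed with the table's 4KB page size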
Compressed tables are stored in a format that previous versions of InnoDB cannot process. To preserve backward compatibility of database files, compression can be specified only when the Barracuda data file format is enabled using the configuration parameter innodb_file_format.
Table compression is also not available for the InnoDB system
tablespace. The system tablespace (space 0, the
ibdata*
files) may contain user data, but it
also contains internal InnoDB system information, and therefore
is never compressed. Thus, compression applies only to tables
(and indexes) stored in their own tablespaces.
To use compression, enable file-per-table mode using the configuration parameter innodb_file_per_table and enable the Barracuda disk file format using the parameter innodb_file_format. If necessary, you can set these parameters in the MySQL option file my.cnf or my.ini, or with the SET statement without shutting down the MySQL server.
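For example, since both parameters can be changed at runtime as described above, you can enable them like this:

SET GLOBAL innodb_file_per_table=1;
SET GLOBAL innodb_file_format=Barracuda;

Settings made with SET do not persist across a server restart, so also add them to my.cnf or my.ini for a permanent configuration.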
Specifying ROW_FORMAT=COMPRESSED or KEY_BLOCK_SIZE in CREATE TABLE or ALTER TABLE statements produces the following warnings if the Barracuda file format is not enabled. You can view them with the SHOW WARNINGS statement.
| Level | Code | Message |
|---|---|---|
| Warning | 1478 | InnoDB: KEY_BLOCK_SIZE requires innodb_file_per_table. |
| Warning | 1478 | InnoDB: KEY_BLOCK_SIZE requires innodb_file_format=1 |
| Warning | 1478 | InnoDB: ignoring KEY_BLOCK_SIZE= |
| Warning | 1478 | InnoDB: ROW_FORMAT=COMPRESSED requires innodb_file_per_table. |
| Warning | 1478 | InnoDB: assuming ROW_FORMAT=COMPACT. |
These messages are only warnings, not errors, and the table is created as if the options were not specified. When InnoDB “strict mode” (see Section 8.4, “InnoDB Strict Mode”) is enabled, InnoDB generates an error, not a warning, for these cases. In strict mode, the table is not created if the current configuration does not permit using compressed tables.
The “non-strict” behavior is intended to permit you
to import a mysqldump
file into a database
that does not support compressed tables, even if the source
database contained compressed tables. In that case, MySQL
creates the table in ROW_FORMAT=COMPACT
instead of preventing the operation.
When you import the dump file into a new database, if you want to have the tables re-created as they exist in the original database, ensure the server is running the InnoDB storage engine with the proper settings for the configuration parameters innodb_file_format and innodb_file_per_table.
The attribute KEY_BLOCK_SIZE is permitted only when ROW_FORMAT is specified as COMPRESSED or is omitted. Specifying a KEY_BLOCK_SIZE with any other ROW_FORMAT generates a warning that you can view with SHOW WARNINGS. However, the table is non-compressed; the specified KEY_BLOCK_SIZE is ignored.
| Level | Code | Message |
|---|---|---|
| Warning | 1478 | InnoDB: ignoring KEY_BLOCK_SIZE= |
If you are running in InnoDB strict mode, the combination of a KEY_BLOCK_SIZE with any ROW_FORMAT other than COMPRESSED generates an error, not a warning, and the table is not created.
Table 3.1, “Meaning of CREATE TABLE and ALTER TABLE options” summarizes how the various options on CREATE TABLE and ALTER TABLE are handled.
Table 3.1. Meaning of CREATE TABLE and ALTER TABLE options

| Option | Usage | Description |
|---|---|---|
| ROW_FORMAT=REDUNDANT | Storage format used prior to MySQL 5.0.3 | Less efficient than ROW_FORMAT=COMPACT; for backward compatibility |
| ROW_FORMAT=COMPACT | Default storage format since MySQL 5.0.3 | Stores a prefix of 768 bytes of long column values in the clustered index page, with the remaining bytes stored in an overflow page |
| ROW_FORMAT=DYNAMIC | Available only with innodb_file_format=Barracuda | Stores values within the clustered index page if they fit; if not, stores only a 20-byte pointer to an overflow page (no prefix) |
| ROW_FORMAT=COMPRESSED | Available only with innodb_file_format=Barracuda | Compresses the table and indexes using zlib to a default compressed page size of 8K bytes; implies ROW_FORMAT=DYNAMIC |
| KEY_BLOCK_SIZE= | Available only with innodb_file_format=Barracuda | Specifies a compressed page size of 1, 2, 4, 8 or 16K bytes; implies ROW_FORMAT=COMPRESSED |
Table 3.2, “CREATE/ALTER TABLE Warnings and Errors when InnoDB Strict Mode is OFF” summarizes the error conditions that occur with certain combinations of configuration parameters and options on the CREATE TABLE or ALTER TABLE statements, and how the options appear in the output of SHOW TABLE STATUS.
When InnoDB strict mode is OFF, InnoDB creates or alters the table, but may ignore certain settings, as shown below. You can see the warning messages in the MySQL error log. When InnoDB strict mode is ON, these combinations of options generate errors, and the table is not created or altered. You can see the full description of the error condition with SHOW ERRORS. For example:
mysql> CREATE TABLE x (id INT PRIMARY KEY, c INT)
    -> ENGINE=INNODB KEY_BLOCK_SIZE=33333;
ERROR 1005 (HY000): Can't create table 'test.x' (errno: 1478)

mysql> SHOW ERRORS;
+-------+------+-------------------------------------------+
| Level | Code | Message                                   |
+-------+------+-------------------------------------------+
| Error | 1478 | InnoDB: invalid KEY_BLOCK_SIZE=33333.     |
| Error | 1005 | Can't create table 'test.x' (errno: 1478) |
+-------+------+-------------------------------------------+
2 rows in set (0.00 sec)
Table 3.2. CREATE/ALTER TABLE Warnings and Errors when InnoDB Strict Mode is OFF

| Syntax | Warning or Error Condition | Resulting ROW_FORMAT, as shown in SHOW TABLE STATUS |
|---|---|---|
| ROW_FORMAT=REDUNDANT | None | REDUNDANT |
| ROW_FORMAT=COMPACT | None | COMPACT |
| ROW_FORMAT=COMPRESSED or ROW_FORMAT=DYNAMIC or KEY_BLOCK_SIZE is specified | Ignored unless both innodb_file_format=Barracuda and innodb_file_per_table are enabled | COMPACT |
| Invalid KEY_BLOCK_SIZE is specified (not 1, 2, 4, 8 or 16) | KEY_BLOCK_SIZE is ignored | the requested one, or COMPACT by default |
| ROW_FORMAT=COMPRESSED and valid KEY_BLOCK_SIZE are specified | None; KEY_BLOCK_SIZE specified is used, not the 8K default | COMPRESSED |
| KEY_BLOCK_SIZE is specified with REDUNDANT, COMPACT or DYNAMIC row format | KEY_BLOCK_SIZE is ignored | REDUNDANT, COMPACT or DYNAMIC |
| ROW_FORMAT is not one of REDUNDANT, COMPACT, DYNAMIC or COMPRESSED | Ignored if recognized by the MySQL parser. Otherwise, an error is issued. | COMPACT or N/A |
When InnoDB strict mode is ON (innodb_strict_mode=1), the InnoDB storage engine rejects invalid ROW_FORMAT or KEY_BLOCK_SIZE parameters. For compatibility with earlier versions of InnoDB, strict mode is not enabled by default; instead, InnoDB issues warnings (not errors) for ignored invalid parameters.
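To get errors instead of warnings for these cases, you can enable strict mode for the current session (or globally), for example:

SET innodb_strict_mode=ON;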
Note that it is not possible to see the chosen KEY_BLOCK_SIZE using SHOW TABLE STATUS. The statement SHOW CREATE TABLE displays the KEY_BLOCK_SIZE (even if it was ignored by InnoDB). The real compressed page size inside InnoDB cannot be displayed by MySQL.
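For example, for a hypothetical table t1, you could compare what the two statements report:

SHOW CREATE TABLE t1;          -- shows KEY_BLOCK_SIZE as specified, even if it was ignored
SHOW TABLE STATUS LIKE 't1';   -- shows only the resulting Row_format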
Most often, the internal optimizations in InnoDB described in Section 3.4, “InnoDB Data Storage and Compression” ensure that the system runs well with compressed data. However, because the efficiency of compression depends on the nature of your data, there are some factors you should consider to get best performance. You need to choose which tables to compress, and what compressed page size to use. You might also adjust the size of the buffer pool based on run-time performance characteristics, such as the amount of time the system spends compressing and uncompressing data.
In general, compression works best on tables that include a reasonable number of character string columns and where the data is read far more often than it is written. Because there are no guaranteed ways to predict whether or not compression benefits a particular situation, always test with a specific workload and data set running on a representative configuration. Consider the following factors when deciding which tables to compress.
A key determinant of the efficiency of compression in reducing the size of data files is the nature of the data itself. Recall that compression works by identifying repeated strings of bytes in a block of data. Completely randomized data is the worst case. Typical data often has repeated values, and so compresses effectively. Character strings often compress well, whether defined in CHAR, VARCHAR, TEXT or BLOB columns. On the other hand, tables containing mostly binary data (integers or floating point numbers) or data that is previously compressed (for example JPEG or PNG images) generally do not compress well, significantly or at all.
You choose whether to turn on compression for each InnoDB table. A table and all of its indexes use the same (compressed) page size. It might be that the primary key (clustered) index, which contains the data for all columns of a table, compresses more effectively than the secondary indexes. For those cases where there are long rows, the use of compression might result in long column values being stored “off-page”, as discussed in Section 5.3, “Barracuda File Format: DYNAMIC and COMPRESSED Row Formats”. Those overflow pages may compress well. Given these considerations, for many applications, some tables compress more effectively than others, and you might find that your workload performs best only with a subset of tables compressed.
Experimenting is the only way to determine whether or not to compress a particular table. InnoDB compresses data in 16K chunks corresponding to the uncompressed page size, and in addition to user data, the page format includes some internal system data that is not compressed. Compression utilities compress an entire stream of data, and so may find more repeated strings across the entire input stream than InnoDB would find in a table compressed in 16K chunks. But you can get a sense of compression efficiency by using a utility that implements LZ77 compression (such as gzip or WinZip) on your data file.
Another way to test compression on a specific table is to copy some data from your uncompressed table to a similar, compressed table (having all the same indexes) and look at the size of the resulting file. When you do so (if nothing else using compression is running), you can examine the ratio of successful compression operations to overall compression operations. (In the INNODB_CMP table, compare COMPRESS_OPS to COMPRESS_OPS_OK. See INNODB_CMP for more information.) If a high percentage of compression operations complete successfully, the table might be a good candidate for compression.
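A minimal sketch of this test, assuming an existing uncompressed table t (the table names here are hypothetical):

-- Create a compressed copy with the same definition and indexes.
CREATE TABLE t_zip LIKE t;
ALTER TABLE t_zip ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;

-- Copy representative data, then compare the .ibd file sizes and
-- the ratio of successful compression operations.
INSERT INTO t_zip SELECT * FROM t;
SELECT page_size, compress_ops, compress_ops_ok
  FROM information_schema.INNODB_CMP;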
Decide whether to compress data in your application or in the InnoDB table. It is usually not sensible to store data that is compressed by an application in an InnoDB compressed table. Further compression is extremely unlikely, and the attempt to compress just wastes CPU cycles.
The InnoDB table compression is automatic and applies to all columns and index values. The columns can still be tested with operators such as LIKE, and sort operations can still use indexes even when the index values are compressed. Because indexes are often a significant fraction of the total size of a database, compression could result in significant savings in storage, I/O or processor time. The compression and decompression operations happen on the database server, which likely is a powerful system that is sized to handle the expected load.
If you compress data such as text in your application before it is inserted into the database, you might save overhead for data that does not compress well by compressing some columns and not others. This approach uses CPU cycles for compression and uncompression on the client machine rather than the database server, which might be appropriate for a distributed application with many clients, or where the client machine has spare CPU cycles.
Of course, it is possible to combine these approaches. For some applications, it may be appropriate to use some compressed tables and some uncompressed tables. It may be best to externally compress some data (and store it in uncompressed InnoDB tables) and allow InnoDB to compress (some of) the other tables in the application. As always, up-front design and real-life testing are valuable in reaching the right decision.
In addition to choosing which tables to compress (and the page size), the workload is another key determinant of performance. If the application is dominated by reads, rather than updates, fewer pages need to be reorganized and recompressed after the index page runs out of room for the per-page “modification log” that InnoDB maintains for compressed data. If the updates predominantly change non-indexed columns or those containing BLOBs or large strings that happen to be stored “off-page”, the overhead of compression may be acceptable. If the only changes to a table are INSERTs that use a monotonically increasing primary key, and there are few secondary indexes, there is little need to reorganize and recompress index pages. Since InnoDB can “delete-mark” and delete rows on compressed pages “in place” by modifying uncompressed data, DELETE operations on a table are relatively efficient.
For some environments, the time it takes to load data can be as important as run-time retrieval. Especially in data warehouse environments, many tables may be read-only or read-mostly. In those cases, it might or might not be acceptable to pay the price of compression in terms of increased load time, unless the resulting savings in fewer disk reads or in storage cost is significant.
Fundamentally, compression works best when the CPU time is available for compressing and uncompressing data. Thus, if your workload is I/O-bound rather than CPU-bound, you might find that compression can improve overall performance. When you test your application performance with different compression configurations, test on a platform similar to the planned configuration of the production system.
Reading and writing database pages from and to disk is the slowest aspect of system performance. Compression attempts to reduce I/O by using CPU time to compress and uncompress data, and is most effective when I/O is a relatively scarce resource compared to processor cycles.
This is often especially the case when running in a multi-user environment with fast, multi-core CPUs. When a page of a compressed table is in memory, InnoDB often uses an additional 16K in the buffer pool for an uncompressed copy of the page. The adaptive LRU algorithm in the InnoDB storage engine attempts to balance the use of memory between compressed and uncompressed pages to take into account whether the workload is running in an I/O-bound or CPU-bound manner. Still, a configuration with more memory dedicated to the InnoDB buffer pool tends to run better when using compressed tables than a configuration where memory is highly constrained.
The optimal setting of the compressed page size depends on the type and distribution of data that the table and its indexes contain. The compressed page size should always be bigger than the maximum record size, or operations may fail as noted in Section 3.4, “Compression of B-Tree Pages”.
Setting the compressed page size too large wastes some space, but the pages do not have to be compressed as often. If the compressed page size is set too small, inserts or updates may require time-consuming recompression, and the B-tree nodes may have to be split more frequently, leading to bigger data files and less efficient indexing.
Typically, you set the compressed page size to 8K or 4K bytes. Given that the maximum InnoDB record size is around 8K, KEY_BLOCK_SIZE=8 is usually a safe choice.
Overall application performance, CPU and I/O utilization and the size of disk files are good indicators of how effective compression is for your application.
To dig deeper into performance considerations for compressed tables, you can monitor compression performance at run time, using the Information Schema tables described in Example 6.1, “Using the Compression Information Schema Tables”. These tables reflect the internal use of memory and the rates of compression used overall.
The INNODB_CMP tables report information about compression activity for each compressed page size (KEY_BLOCK_SIZE) in use. The information in these tables is system-wide, and includes summary data across all compressed tables in your database. You can use this data to help decide whether or not to compress a table by examining these tables when no other compressed tables are being accessed.
The key statistics to consider are the number of, and amount of time spent performing, compression and uncompression operations. Since InnoDB must split B-tree nodes when they are too full to contain the compressed data following a modification, compare the number of “successful” compression operations with the number of such operations overall. Based on the information in the INNODB_CMP tables and overall application performance and hardware resource utilization, you might make changes in your hardware configuration, adjust the size of the InnoDB buffer pool, choose a different page size, or select a different set of tables to compress.
If the amount of CPU time required for compressing and uncompressing is high, changing to faster CPUs, or those with more cores, can help improve performance with the same data, application workload and set of compressed tables. Increasing the size of the InnoDB buffer pool might also help performance, so that more uncompressed pages can stay in memory, reducing the need to uncompress pages that exist in memory only in compressed form.
A large number of compression operations overall (compared to the number of INSERT, UPDATE and DELETE operations in your application and the size of the database) could indicate that some of your compressed tables are being updated too heavily for effective compression. If so, choose a larger page size, or be more selective about which tables you compress.
If the number of “successful” compression operations (COMPRESS_OPS_OK) is a high percentage of the total number of compression operations (COMPRESS_OPS), then the system is likely performing well. If the ratio is low, then InnoDB is reorganizing, recompressing, and splitting B-tree nodes more often than is desirable. In this case, avoid compressing some tables, or increase KEY_BLOCK_SIZE for some of the compressed tables. You might turn off compression for tables that cause the number of “compression failures” in your application to be more than 1% or 2% of the total. (Such a failure ratio might be acceptable during a temporary operation such as a data load.)
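As a sketch, you can compute this ratio for each page size directly from the INNODB_CMP table:

SELECT page_size,
       compress_ops,
       compress_ops_ok,
       compress_ops_ok / compress_ops AS success_ratio
  FROM information_schema.INNODB_CMP
 WHERE compress_ops > 0;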
This section describes some internal implementation details about compression in InnoDB. The information presented here may be helpful in tuning for performance, but is not necessary to know for basic use of compression.
Some operating systems implement compression at the file system level. Files are typically divided into fixed-size blocks that are compressed into variable-size blocks, which easily leads into fragmentation. Every time something inside a block is modified, the whole block is recompressed before it is written to disk. These properties make this compression technique unsuitable for use in an update-intensive database system.
InnoDB implements compression with the help of the well-known zlib library, which implements the LZ77 compression algorithm. This compression algorithm is mature, robust, and efficient in both CPU utilization and in reduction of data size. The algorithm is “lossless”, so that the original uncompressed data can always be reconstructed from the compressed form. LZ77 compression works by finding sequences of data that are repeated within the data to be compressed. The patterns of values in your data determine how well it compresses, but typical user data often compresses by 50% or more.
Unlike compression performed by an application, or compression features of some other database management systems, InnoDB compression applies both to user data and to indexes. In many cases, indexes can constitute 40-50% or more of the total database size, so this difference is significant. When compression is working well for a data set, the size of the InnoDB data files (the .ibd files) is 25% to 50% of the uncompressed size or possibly smaller. Depending on the workload, this smaller database can in turn lead to a reduction in I/O, and an increase in throughput, at a modest cost in terms of increased CPU utilization.
All user data in InnoDB is stored in pages comprising a B-tree index (the clustered index). In some other database systems, this type of index is called an “index-organized table”. Each row in the index node contains the values of the (user-specified or system-generated) primary key and all the other columns of the table.
Secondary indexes in InnoDB are also B-trees, containing pairs of values: the index key and a pointer to a row in the clustered index. The pointer is in fact the value of the primary key of the table, which is used to access the clustered index if columns other than the index key and primary key are required. Secondary index records must always fit on a single B-tree page.
The compression of B-tree nodes (of both clustered and secondary indexes) is handled differently from compression of overflow pages used to store long VARCHAR, BLOB, or TEXT columns, as explained in the following sections.
Because they are frequently updated, B-tree pages require special treatment. It is important to minimize the number of times B-tree nodes are split, as well as to minimize the need to uncompress and recompress their content.
One technique InnoDB uses is to maintain some system information in the B-tree node in uncompressed form, thus facilitating certain in-place updates. For example, this allows rows to be delete-marked and deleted without any compression operation.
In addition, InnoDB attempts to avoid unnecessary uncompression and recompression of index pages when they are changed. Within each B-tree page, the system keeps an uncompressed “modification log” to record changes made to the page. Updates and inserts of small records may be written to this modification log without requiring the entire page to be completely reconstructed.
When the space for the modification log runs out, InnoDB uncompresses the page, applies the changes and recompresses the page. If recompression fails, the B-tree nodes are split and the process is repeated until the update or insert succeeds.
Generally, InnoDB requires that each B-tree page can accommodate at least two records. For compressed tables, this requirement has been relaxed. Leaf pages of B-tree nodes (whether of the primary key or secondary indexes) only need to accommodate one record, but that record must fit, in uncompressed form, in the per-page modification log. Starting with InnoDB storage engine version 1.0.2, and if InnoDB strict mode is ON, the InnoDB storage engine checks the maximum row size during CREATE TABLE or CREATE INDEX. If the row does not fit, the following error message is issued: ERROR HY000: Too big row.
If you create a table when InnoDB strict mode is OFF, and a subsequent INSERT or UPDATE statement attempts to create an index entry that does not fit in the size of the compressed page, the operation fails with ERROR 42000: Row size too large. (This error message does not name the index for which the record is too large, or mention the length of the index record or the maximum record size on that particular index page.) To solve this problem, rebuild the table with ALTER TABLE and select a larger compressed page size (KEY_BLOCK_SIZE), shorten any column prefix indexes, or disable compression entirely with ROW_FORMAT=DYNAMIC or ROW_FORMAT=COMPACT.
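For example, either of the following rebuilds of a hypothetical table t1 can resolve the error:

ALTER TABLE t1 ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;  -- larger compressed page size
ALTER TABLE t1 ROW_FORMAT=DYNAMIC;                      -- disable compression entirely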
In a clustered index, BLOB, VARCHAR and TEXT columns that are not part of the primary key may be stored on separately allocated (“overflow”) pages. We call these off-page columns; their values are stored on singly-linked lists of overflow pages.
For tables created in ROW_FORMAT=DYNAMIC or ROW_FORMAT=COMPRESSED, the values of BLOB, TEXT or VARCHAR columns may be stored fully off-page, depending on their length and the length of the entire row. For columns that are stored off-page, the clustered index record only contains 20-byte pointers to the overflow pages, one per column. Whether any columns are stored off-page depends on the page size and the total size of the row. When the row is too long to fit entirely within the page of the clustered index, InnoDB chooses the longest columns for off-page storage until the row fits on the clustered index page. As noted above, if a row does not fit by itself on a compressed page, an error occurs.
Tables created in previous versions of InnoDB use the Antelope file format, which supports only ROW_FORMAT=REDUNDANT and ROW_FORMAT=COMPACT. In these formats, InnoDB stores the first 768 bytes of BLOB, VARCHAR and TEXT columns in the clustered index record along with the primary key. The 768-byte prefix is followed by a 20-byte pointer to the overflow pages that contain the rest of the column value.
When a table is in COMPRESSED format, all data written to overflow pages is compressed “as is”; that is, InnoDB applies the zlib compression algorithm to the entire data item. Other than the data, compressed overflow pages contain an uncompressed header and trailer comprising a page checksum and a link to the next overflow page, among other things. Therefore, very significant storage savings can be obtained for longer BLOB, TEXT or VARCHAR columns if the data is highly compressible, as is often the case with text data (but not previously compressed images).
The overflow pages are of the same size as other pages. A row containing ten columns stored off-page occupies ten overflow pages, even if the total length of the columns is only 8K bytes. In an uncompressed table, ten uncompressed overflow pages occupy 160K bytes. In a compressed table with an 8K page size, they occupy only 80K bytes. Thus, it is often more efficient to use compressed table format for tables with long column values.
Using a 16K compressed page size can reduce storage and I/O costs for BLOB, VARCHAR or TEXT columns, because such data often compresses well, and might therefore require fewer “overflow” pages, even though the B-tree nodes themselves take as many pages as in the uncompressed form.
In a compressed InnoDB table, every compressed page (whether 1K, 2K, 4K or 8K) corresponds to an uncompressed page of 16K bytes. To access the data in a page, InnoDB reads the compressed page from disk if it is not already in the buffer pool, then uncompresses the page to its original 16K byte form. This section describes how InnoDB manages the buffer pool with respect to pages of compressed tables.
To minimize I/O and to reduce the need to uncompress a page, at times the buffer pool contains both the compressed and uncompressed form of a database page. To make room for other required database pages, InnoDB may “evict” from the buffer pool an uncompressed page, while leaving the compressed page in memory. Or, if a page has not been accessed in a while, the compressed form of the page may be written to disk, to free space for other data. Thus, at any given time, the buffer pool may contain both the compressed and uncompressed forms of the page, or only the compressed form of the page, or neither.
InnoDB keeps track of which pages to keep in memory and which to evict using a least-recently-used (LRU) list, so that “hot” or frequently accessed data tends to stay in memory. When compressed tables are accessed, InnoDB uses an adaptive LRU algorithm to achieve an appropriate balance of compressed and uncompressed pages in memory. This adaptive algorithm is sensitive to whether the system is running in an I/O-bound or CPU-bound manner. The goal is to avoid spending too much processing time uncompressing pages when the CPU is busy, and to avoid doing excess I/O when the CPU has spare cycles that can be used for uncompressing compressed pages (that may already be in memory). When the system is I/O-bound, the algorithm prefers to evict the uncompressed copy of a page rather than both copies, to make more room for other disk pages to become memory resident. When the system is CPU-bound, InnoDB prefers to evict both the compressed and uncompressed page, so that more memory can be used for “hot” pages, reducing the need to uncompress data that exists in memory only in compressed form.
Before a compressed page is written to a database file, InnoDB writes a copy of the page to the redo log (if it has been recompressed since the last time it was written to the database). This is done to ensure that redo logs will always be usable, even if a future version of InnoDB uses a slightly different compression algorithm. Therefore, some increase in the size of log files, or a need for more frequent checkpoints, can be expected when using compression. The amount of increase in the log file size or checkpoint frequency depends on the number of times compressed pages are modified in a way that requires reorganization and recompression.
Note that the redo log file format (and the database file format) are different from previous releases when using compression. The MySQL Enterprise Backup product does support this latest Barracuda file format for compressed InnoDB tables. The older InnoDB Hot Backup product can only back up tables using the file format Antelope, and thus does not support InnoDB tables that use compression.