This section discusses recent InnoDB enhancements to performance and scalability, covering the performance features in InnoDB 1.1 with MySQL 5.5, and the features in the InnoDB Plugin for MySQL 5.1. This information is useful to any DBA or developer who is concerned with performance and scalability. Although some of the enhancements do not require any action on your part, knowing this information can still help you diagnose performance issues more quickly and modernize systems and applications that rely on older, inefficient behavior.
InnoDB has always been highly efficient, and includes several unique architectural elements to assure high performance and scalability. The latest InnoDB storage engine includes new features that take advantage of advances in operating systems and hardware platforms, such as multi-core processors and improved memory allocation systems. In addition, new configuration options let you better control some InnoDB internal subsystems to achieve the best performance with your workload.
Starting with MySQL 5.5 and InnoDB 1.1, the built-in InnoDB
storage engine within MySQL is upgraded to the full feature set
and performance of the former InnoDB Plugin. This change makes
these performance and scalability enhancements available to a much
wider audience than before, and eliminates the separate
installation step of the InnoDB Plugin. After learning about the
InnoDB performance features in this section, continue with Optimization to learn the best practices for overall MySQL performance, and Optimizing for InnoDB Tables in particular for InnoDB tips and guidelines.
In MySQL and InnoDB, multiple threads of execution access shared data structures. InnoDB synchronizes these accesses with its own implementation of mutexes and read/write locks. InnoDB has historically protected the internal state of a read/write lock with an InnoDB mutex. On Unix and Linux platforms, the internal state of an InnoDB mutex is protected by a Pthreads mutex, as in IEEE Std 1003.1c (POSIX.1c).
On many platforms, there is a more efficient way to implement mutexes and read/write locks. Atomic operations can often be used to synchronize the actions of multiple threads more efficiently than Pthreads. Each operation to acquire or release a lock can be done in fewer CPU instructions, and thus result in less wasted time when threads are contending for access to shared data structures. This in turn means greater scalability on multi-core platforms.
InnoDB implements mutexes and read/write locks with the built-in functions provided by the GNU Compiler Collection (GCC) for atomic memory access, instead of the Pthreads approach used previously. More specifically, InnoDB compiled with GCC version 4.1.2 or later uses the atomic builtins instead of a pthread_mutex_t to implement InnoDB mutexes and read/write locks.
On 32-bit Microsoft Windows, InnoDB has implemented mutexes (but not read/write locks) with hand-written assembler instructions. Beginning with Microsoft Windows 2000, functions for Interlocked Variable Access are available that are similar to the built-in functions provided by GCC. On Windows 2000 and higher, InnoDB makes use of the Interlocked functions. Unlike the old hand-written assembler code, the new implementation supports read/write locks and 64-bit platforms.
Solaris 10 introduced library functions for atomic operations, and InnoDB uses these functions by default. When MySQL is compiled on Solaris 10 with a compiler that does not support the built-in functions provided by the GNU Compiler Collection (GCC) for atomic memory access, InnoDB uses the library functions.
This change improves the scalability of InnoDB on multi-core systems. This feature is enabled out-of-the-box on the platforms where it is supported. You do not have to set any parameter or option to take advantage of the improved performance. On platforms where the GCC, Windows, or Solaris functions for atomic memory access are not available, InnoDB uses the traditional Pthreads method of implementing mutexes and read/write locks.
When MySQL starts, InnoDB writes a message to the log file indicating whether atomic memory access is used for mutexes, for mutexes and read/write locks, or for neither. If suitable tools are used to build InnoDB and the target CPU supports the atomic operations required, InnoDB uses the built-in functions for mutexing. If, in addition, the compare-and-swap operation can be used on thread identifiers (pthread_t), then InnoDB uses the instructions for read/write locks as well.
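For example, on a Linux build that uses the GCC atomic builtins, the server error log contains a line similar to the following at startup (the exact wording varies by platform and version):

    InnoDB: Mutexes and rw_locks use GCC atomic builtins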
Note: If you are building from source, ensure that the build process properly takes advantage of your platform capabilities.
For more information about the performance implications of locking, see Optimizing Locking Operations.
When InnoDB was developed, the memory allocators supplied with operating systems and run-time libraries were often lacking in performance and scalability. At that time, there were no memory allocator libraries tuned for multi-core CPUs. Therefore, InnoDB implemented its own memory allocator in the mem subsystem. This allocator is guarded by a single mutex, which may become a bottleneck. InnoDB also implements a wrapper interface around the system allocator (malloc and free) that is likewise guarded by a single mutex.
Today, as multi-core systems have become more widely available, and as operating systems have matured, significant improvements have been made in the memory allocators provided with operating systems. New memory allocators perform better and are more scalable than they were in the past. The leading high-performance memory allocators include Hoard, libumem, mtmalloc, ptmalloc, tbbmalloc, and TCMalloc. Most workloads, especially those where memory is frequently allocated and released (such as multi-table joins), benefit from using a more highly tuned memory allocator as opposed to the internal, InnoDB-specific memory allocator.
You can control whether InnoDB uses its own memory allocator or an allocator of the operating system by setting the value of the system configuration parameter innodb_use_sys_malloc in the MySQL option file (my.cnf or my.ini). If set to ON or 1 (the default), InnoDB uses the malloc and free functions of the underlying system rather than managing memory pools itself. This parameter is not dynamic, and takes effect only when the system is started. To continue to use the InnoDB memory allocator, set innodb_use_sys_malloc to 0.
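For example, to keep using the InnoDB internal allocator, you might add lines such as the following to the option file; this is a minimal sketch, and the pool size shown is illustrative:

    [mysqld]
    # 0 = use the InnoDB internal memory allocator;
    # 1 (the default) = use the system malloc/free
    innodb_use_sys_malloc = 0
    # honored only when the internal allocator is in use
    innodb_additional_mem_pool_size = 8M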
When the InnoDB memory allocator is disabled, InnoDB ignores the value of the parameter innodb_additional_mem_pool_size. The InnoDB memory allocator uses an additional memory pool for satisfying allocation requests without having to fall back to the system memory allocator. When the InnoDB memory allocator is disabled, all such allocation requests are fulfilled by the system memory allocator.
On Unix-like systems that use dynamic linking, replacing the memory allocator may be as easy as making the environment variable LD_PRELOAD or LD_LIBRARY_PATH point to the dynamic library that implements the allocator. On other systems, some relinking may be necessary. Refer to the documentation of the memory allocator library of your choice.
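For example, on a Linux system you might start the server with TCMalloc preloaded; the library path below is an assumption and depends on where the allocator is installed on your system:

    # the library path is illustrative; adjust it for your system
    export LD_PRELOAD=/usr/lib64/libtcmalloc_minimal.so
    mysqld_safe &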
Since InnoDB cannot track all memory use when the system memory allocator is used (innodb_use_sys_malloc is ON), the section “BUFFER POOL AND MEMORY” in the output of the SHOW ENGINE INNODB STATUS command only includes the buffer pool statistics in the “Total memory allocated”. Any memory allocated using the mem subsystem or using ut_malloc is excluded.
For more information about the performance implications of InnoDB memory usage, see Buffering and Caching.
When INSERT, UPDATE, and DELETE operations are performed on a table, the values of indexed columns (particularly the values of secondary keys) are often not in sorted order, requiring substantial I/O to bring secondary indexes up to date. InnoDB has an insert buffer that caches changes to secondary index entries when the relevant page is not in the buffer pool, thus avoiding I/O operations by not reading in the page from the disk. The buffered changes are merged when the page is loaded into the buffer pool, and the updated page is later flushed to disk using the normal mechanism. The InnoDB main thread merges buffered changes when the server is nearly idle, and during a slow shutdown.
Because it can result in fewer disk reads and writes, this feature is most valuable for workloads that are I/O-bound, for example applications with a high volume of DML operations such as bulk inserts.
However, the insert buffer occupies a part of the buffer pool, reducing the memory available to cache data pages. If the working set almost fits in the buffer pool, or if your tables have relatively few secondary indexes, it may be useful to disable insert buffering. If the working set entirely fits in the buffer pool, insert buffering does not impose any extra overhead, because it only applies to pages that are not in the buffer pool.
You can control the extent to which InnoDB performs insert buffering with the system configuration parameter innodb_change_buffering. You can turn buffering on and off for inserts, delete operations (when index records are initially marked for deletion), and purge operations (when index records are physically deleted). An update operation is represented as a combination of an insert and a delete. In MySQL 5.5 and higher, the default value changed from inserts to all.
The allowed values of innodb_change_buffering are:

all: The default value; buffer inserts, delete-marking operations, and purges.
none: Do not buffer any operations.
inserts: Buffer insert operations.
deletes: Buffer delete-marking operations.
changes: Buffer both inserts and delete-marking operations.
purges: Buffer the physical deletion operations that happen in the background.
You can set the value of this parameter in the MySQL option file (my.cnf or my.ini) or change it dynamically with the SET GLOBAL command, which requires the SUPER privilege. Changing the setting affects the buffering of new operations; the merging of already buffered entries is not affected.
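For example, the following statements enable full change buffering at runtime and then verify the setting (the SET GLOBAL statement requires the SUPER privilege):

    SET GLOBAL innodb_change_buffering = 'all';
    SHOW GLOBAL VARIABLES LIKE 'innodb_change_buffering';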
For more information about speeding up INSERT, UPDATE, and DELETE statements, see Optimizing DML Statements.
If a table fits almost entirely in main memory, the fastest way to perform queries on it is to use hash indexes rather than B-tree lookups. MySQL monitors searches on each index defined for an InnoDB table. If it notices that certain index values are being accessed frequently, it automatically builds an in-memory hash table for that index. See Adaptive Hash Indexes for background information and usage guidelines for the adaptive hash index feature and the innodb_adaptive_hash_index configuration option.
InnoDB uses operating system threads to process requests from user transactions. (Transactions may issue many requests to InnoDB before they commit or roll back.) On modern operating systems and servers with multi-core processors, where context switching is efficient, most workloads run well without any limit on the number of concurrent threads. Scalability improvements in MySQL 5.5 and up reduce the need to limit the number of concurrently executing threads inside InnoDB.
In situations where it is helpful to minimize context switching between threads, InnoDB can use a number of techniques to limit the number of concurrently executing operating system threads (and thus the number of requests that are processed at any one time). When InnoDB receives a new request from a user session, if the number of threads concurrently executing is at a pre-defined limit, the new request sleeps for a short time before it tries again. A request that cannot be rescheduled after the sleep is put in a first-in/first-out queue and eventually is processed. Threads waiting for locks are not counted in the number of concurrently executing threads.
You can limit the number of concurrent threads by setting the configuration parameter innodb_thread_concurrency. Once the number of executing threads reaches this limit, additional threads sleep for a number of microseconds, set by the configuration parameter innodb_thread_sleep_delay, before being placed into the queue.
The default value for innodb_thread_concurrency, and the implied default limit on the number of concurrent threads, has changed in various releases of MySQL and InnoDB. Currently, the default value of innodb_thread_concurrency is 0, so that by default there is no limit on the number of concurrently executing threads, as shown in Table 7.1, “Changes to innodb_thread_concurrency”.
Table 7.1. Changes to innodb_thread_concurrency
| InnoDB Version | MySQL Version | Default value | Default limit of concurrent threads | Value to allow unlimited threads |
|---|---|---|---|---|
| Built-in | Earlier than 5.1.11 | 20 | No limit | 20 or higher |
| Built-in | 5.1.11 and newer | 8 | 8 | 0 |
| InnoDB before 1.0.3 | (corresponding to Plugin) | 8 | 8 | 0 |
| InnoDB 1.0.3 and newer | (corresponding to Plugin) | 0 | No limit | 0 |
Note that InnoDB causes threads to sleep only when the number of concurrent threads is limited. When there is no limit on the number of threads, all contend equally to be scheduled. That is, if innodb_thread_concurrency is 0, the value of innodb_thread_sleep_delay is ignored.
When there is a limit on the number of threads, InnoDB reduces context-switching overhead by permitting multiple requests made during the execution of a single SQL statement to enter InnoDB without observing the limit set by innodb_thread_concurrency. Since an SQL statement (such as a join) may comprise multiple row operations within InnoDB, InnoDB assigns “tickets” that allow a thread to be scheduled repeatedly with minimal overhead.
When a new SQL statement starts, a thread has no tickets, and it must observe innodb_thread_concurrency. Once the thread is entitled to enter InnoDB, it is assigned a number of tickets that it can use for subsequently entering InnoDB. If the tickets run out, innodb_thread_concurrency is observed again and further tickets are assigned. The number of tickets to assign is specified by the global option innodb_concurrency_tickets, which is 500 by default. A thread that is waiting for a lock is given one ticket once the lock becomes available.
The correct values of these variables depend on your environment and workload. Try a range of different values to determine what value works for your applications. Before limiting the number of concurrently executing threads, review configuration options that may improve the performance of InnoDB on multi-core and multi-processor computers, such as innodb_use_sys_malloc and innodb_adaptive_hash_index.
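As a starting point for such experimentation, you might place settings like the following in the option file; the values are purely illustrative and must be validated against your own workload:

    [mysqld]
    # limit InnoDB to 16 concurrently executing threads
    innodb_thread_concurrency = 16
    # sleep 10000 microseconds before retrying to enter InnoDB
    innodb_thread_sleep_delay = 10000
    # number of "tickets" granted once a thread enters InnoDB
    innodb_concurrency_tickets = 500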
For general performance information about MySQL thread handling, see How MySQL Uses Threads for Client Connections.
A read-ahead request is an I/O request to prefetch multiple pages in the buffer pool asynchronously, in anticipation that these pages will be needed soon. InnoDB uses or has used two read-ahead algorithms to improve I/O performance:
Linear read-ahead is based on the access pattern of the pages in the buffer pool, not just their number. You can control when InnoDB performs a read-ahead operation by adjusting the number of sequential page accesses required to trigger an asynchronous read request, using the configuration parameter innodb_read_ahead_threshold. Before this parameter was added, InnoDB would only calculate whether to issue an asynchronous prefetch request for the entire next extent when it read in the last page of the current extent.
Random read-ahead is a former technique that has now been removed as of MySQL 5.5. If a certain number of pages from the same extent (64 consecutive pages) were found in the buffer pool, InnoDB asynchronously issued a request to prefetch the remaining pages of the extent. Random read-ahead added unnecessary complexity to the InnoDB code and often resulted in performance degradation rather than improvement. This feature is no longer part of InnoDB, and users should generally see equivalent or improved performance.
If the number of pages read from an extent of 64 pages is greater than or equal to innodb_read_ahead_threshold, InnoDB initiates an asynchronous read-ahead operation of the entire following extent. Thus, this parameter controls how sensitive InnoDB is to the pattern of page accesses within an extent when deciding whether to read the following extent asynchronously. The higher the value, the more strict the access pattern check. For example, if you set the value to 48, InnoDB triggers a linear read-ahead request only when 48 pages in the current extent have been accessed sequentially. If the value is 8, InnoDB triggers an asynchronous read-ahead even if as few as 8 pages in the extent were accessed sequentially.
The new configuration parameter innodb_read_ahead_threshold can be set to any value from 0 to 64. The default value is 56, meaning that an asynchronous read-ahead is performed only when 56 of the 64 pages in the extent are accessed sequentially. You can set the value of this parameter in the MySQL option file (my.cnf or my.ini), or change it dynamically with the SET GLOBAL command, which requires the SUPER privilege.
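For example, to require the stricter sequential access pattern described above before triggering read-ahead, you might issue:

    SET GLOBAL innodb_read_ahead_threshold = 48;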
The SHOW ENGINE INNODB STATUS command displays statistics to help you evaluate the effectiveness of the read-ahead algorithm. See Section 8.8, “More Read-Ahead Statistics” for more information.
For more information about I/O performance, see Optimizing InnoDB Disk I/O and Optimizing Disk I/O.
InnoDB uses background threads to service various types of I/O requests. You can configure the number of background threads that service read and write I/O on data pages, using the configuration parameters innodb_read_io_threads and innodb_write_io_threads. These parameters signify the number of background threads used for read and write requests, respectively. They are effective on all supported platforms. You can set the value of these parameters in the MySQL option file (my.cnf or my.ini); you cannot change them dynamically. The default value for these parameters is 4, and the permissible values range from 1 to 64.
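Because these parameters are not dynamic, they go in the option file. A sketch with illustrative values for a server with a capable I/O subsystem:

    [mysqld]
    # background threads servicing data page reads and writes
    innodb_read_io_threads = 8
    innodb_write_io_threads = 8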
These parameters replace innodb_file_io_threads from earlier versions of MySQL. If you try to set a value for this obsolete parameter, a warning is written to the log file and the value is ignored. This parameter only applied to Windows platforms. (On non-Windows platforms, there was only one thread each for read and write.)
The purpose of this change is to make InnoDB more scalable on high-end systems. Each background thread can handle up to 256 pending I/O requests. A major source of background I/O is read-ahead requests. InnoDB tries to balance the load of incoming requests in such a way that most of the background threads share work equally. InnoDB also attempts to allocate read requests from the same extent to the same thread, to increase the chances of coalescing the requests together. If you have a high-end I/O subsystem and you see more than 64 × innodb_read_io_threads pending read requests in SHOW ENGINE INNODB STATUS, you might gain by increasing the value of innodb_read_io_threads.
For more information about InnoDB I/O performance, see Optimizing InnoDB Disk I/O.
Starting in InnoDB 1.1 with MySQL 5.5, the asynchronous I/O capability that InnoDB has had on Windows systems is now available on Linux systems. (Other Unix-like systems continue to use synchronous I/O calls.) This feature improves the scalability of heavily I/O-bound systems, which typically show many pending reads/writes in the output of the command SHOW ENGINE INNODB STATUS\G.
Running with a large number of InnoDB I/O threads, and especially running multiple such instances on the same server machine, can exceed capacity limits on Linux systems. In this case, you can fix the error

    EAGAIN: The specified maxevents exceeds the user's limit of available events.

by writing a higher limit to /proc/sys/fs/aio-max-nr.
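For example, on Linux you might raise the limit as root; the value shown is illustrative, and you can make the setting persistent through /etc/sysctl.conf:

    # check the current limit
    cat /proc/sys/fs/aio-max-nr
    # raise it (run as root); the value is illustrative
    sysctl -w fs.aio-max-nr=1048576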
In general, if a problem with the asynchronous I/O subsystem in the OS prevents InnoDB from starting, set the option innodb_use_native_aio=0 in the configuration file. This new configuration option applies to Linux systems only, and cannot be changed once the server is running.
For more information about InnoDB I/O performance, see Optimizing InnoDB Disk I/O.
InnoDB, like any other ACID-compliant database engine, flushes the redo log of a transaction before it is committed. Historically, InnoDB used group commit functionality to group multiple such flush requests together to avoid one flush for each commit. With group commit, InnoDB issues a single write to the log file to perform the commit action for multiple user transactions that commit at about the same time, significantly improving throughput.
Group commit in InnoDB worked until MySQL 4.x, and works once again with MySQL 5.1 with the InnoDB Plugin, and MySQL 5.5 and higher. The introduction of support for the distributed transactions and Two Phase Commit (2PC) in MySQL 5.0 interfered with the InnoDB group commit functionality. This issue is now resolved.
The group commit functionality inside InnoDB works with the Two Phase Commit protocol in MySQL. Re-enabling group commit ensures that the ordering of commits in the MySQL binlog and the InnoDB log file is the same as it was before, so it is safe to use MySQL Enterprise Backup with InnoDB 1.0.4 (that is, the InnoDB Plugin with MySQL 5.1) and above. When the binlog is enabled, you typically also set the configuration option sync_binlog=0, because group commit for the binary log is only supported if it is set to 0.
Group commit is transparent; you do not need to do anything to take advantage of this significant performance improvement.
For more information about performance of COMMIT and other transactional operations, see Optimizing InnoDB Transaction Management.
The master thread in InnoDB is a thread that performs various tasks in the background. Most of these tasks are I/O related, such as flushing dirty pages from the buffer pool or writing changes from the insert buffer to the appropriate secondary indexes. The master thread attempts to perform these tasks in a way that does not adversely affect the normal working of the server. It tries to estimate the free I/O bandwidth available and tune its activities to take advantage of this free capacity. Historically, InnoDB has used a hard coded value of 100 IOPs (input/output operations per second) as the total I/O capacity of the server.
The parameter innodb_io_capacity indicates the overall I/O capacity available to InnoDB. This parameter should be set to approximately the number of I/O operations that the system can perform per second. The value depends on your system configuration. When innodb_io_capacity is set, the master thread estimates the I/O bandwidth available for background tasks based on the set value. Setting the value to 100 reverts to the old behavior.
You can set the value of innodb_io_capacity to any number 100 or greater. The default value is 200, reflecting that the performance of typical modern I/O devices is higher than in the early days of MySQL. Typically, values around the previous default of 100 are appropriate for consumer-level storage devices, such as hard drives up to 7200 RPM. Faster hard drives, RAID configurations, and SSDs benefit from higher values.
You can set the value of this parameter in the MySQL option file (my.cnf or my.ini) or change it dynamically with the SET GLOBAL command, which requires the SUPER privilege.
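For example, for a RAID array or SSD that can sustain more I/O operations per second, you might set a higher value at runtime; the figure here is illustrative:

    SET GLOBAL innodb_io_capacity = 1000;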
For more information about InnoDB I/O performance, see Optimizing InnoDB Disk I/O.
InnoDB performs certain tasks in the background, including flushing of dirty pages (those pages that have been changed but are not yet written to the database files) from the buffer pool, a task performed by the master thread. Currently, InnoDB aggressively flushes buffer pool pages if the percentage of dirty pages in the buffer pool exceeds innodb_max_dirty_pages_pct.
InnoDB uses a new algorithm to estimate the required rate of flushing, based on the speed of redo log generation and the current rate of flushing. The intent is to smooth overall performance by ensuring that buffer flush activity keeps up with the need to keep the buffer pool “clean”. Automatically adjusting the rate of flushing can help to avoid steep dips in throughput, when excessive buffer pool flushing limits the I/O capacity available for ordinary read and write activity.
InnoDB uses its log files in a circular fashion. Before reusing a portion of a log file, InnoDB flushes to disk all dirty buffer pool pages whose redo entries are contained in that portion of the log file, a process known as a sharp checkpoint. If a workload is write-intensive, it generates a lot of redo information, all written to the log file. If all available space in the log files is used up, a sharp checkpoint occurs, causing a temporary reduction in throughput. This situation can happen even though innodb_max_dirty_pages_pct is not reached.
InnoDB uses a heuristic-based algorithm to avoid such a scenario, by measuring the number of dirty pages in the buffer pool and the rate at which redo is being generated. Based on these numbers, InnoDB decides how many dirty pages to flush from the buffer pool each second. This self-adapting algorithm is able to deal with sudden changes in the workload.
Internal benchmarking has also shown that this algorithm not only maintains throughput over time, but can also improve overall throughput significantly.
Because adaptive flushing is a new feature that can significantly affect the I/O pattern of a workload, a new configuration parameter lets you turn off this feature. The default value of the boolean parameter innodb_adaptive_flushing is TRUE, enabling the new algorithm. You can set the value of this parameter in the MySQL option file (my.cnf or my.ini) or change it dynamically with the SET GLOBAL command, which requires the SUPER privilege.
For more information about InnoDB I/O performance, see Optimizing InnoDB Disk I/O.
Synchronization inside InnoDB frequently involves the use of spin loops: while waiting, InnoDB executes a tight loop of instructions repeatedly to avoid having the InnoDB process and threads be rescheduled by the operating system. If the spin loops are executed too quickly, system resources are wasted, imposing a performance penalty on transaction throughput. Most modern processors implement the PAUSE instruction for use in spin loops, so the processor can be more efficient.
InnoDB uses a PAUSE instruction in its spin loops on all platforms where such an instruction is available. This technique increases overall performance with CPU-bound workloads, and has the added benefit of minimizing power consumption during the execution of the spin loops.
You do not have to do anything to take advantage of this performance improvement.
For performance considerations for InnoDB locking operations, see Optimizing Locking Operations.
Many InnoDB mutexes and rw-locks are reserved for a short time. On a multi-core system, it can be more efficient for a thread to continuously check if it can acquire a mutex or rw-lock for a while before sleeping. If the mutex or rw-lock becomes available during this polling period, the thread can continue immediately, in the same time slice. However, too-frequent polling by multiple threads of a shared object can cause “cache ping pong”, different processors invalidating portions of each others' cache. InnoDB minimizes this issue by waiting a random time between subsequent polls. The delay is implemented as a busy loop.
You can control the maximum delay between testing a mutex or rw-lock using the parameter innodb_spin_wait_delay. The duration of the delay loop depends on the C compiler and the target processor. (In the 100MHz Pentium era, the unit of delay was one microsecond.) On a system where all processor cores share a fast cache memory, you might reduce the maximum delay or disable the busy loop altogether by setting innodb_spin_wait_delay=0. On a system with multiple processor chips, the effect of cache invalidation can be more significant and you might increase the maximum delay.
The default value of innodb_spin_wait_delay is 6. The spin wait delay is a dynamic global parameter that you can specify in the MySQL option file (my.cnf or my.ini) or change at runtime with the command SET GLOBAL innodb_spin_wait_delay=delay, where delay is the desired maximum delay. Changing the setting requires the SUPER privilege.
For performance considerations for InnoDB locking operations, see Optimizing Locking Operations.
Rather than using a strictly LRU algorithm, InnoDB uses a technique to minimize the amount of data that is brought into the buffer pool and never accessed again. The goal is to make sure that frequently accessed (“hot”) pages remain in the buffer pool, even as read-ahead and full table scans bring in new blocks that might or might not be accessed afterward.
Newly read blocks are inserted into the middle of the LRU list representing the buffer pool. All newly read pages are inserted at a location that by default is 3/8 from the tail of the LRU list. The pages are moved to the front of the list (the most-recently used end) when they are accessed in the buffer pool for the first time. Thus, pages that are never accessed never make it to the front portion of the LRU list, and “age out” sooner than with a strict LRU approach. This arrangement divides the LRU list into two segments, where the pages downstream of the insertion point are considered “old” and are desirable victims for LRU eviction.
For an explanation of the inner workings of the InnoDB buffer pool and the specifics of its LRU replacement algorithm, see The InnoDB Buffer Pool.
You can control the insertion point in the LRU list, and choose whether InnoDB applies the same optimization to blocks brought into the buffer pool by table or index scans. The configuration parameter innodb_old_blocks_pct controls the percentage of “old” blocks in the LRU list. The default value of innodb_old_blocks_pct is 37, corresponding to the original fixed ratio of 3/8. The value range is 5 (new pages in the buffer pool age out very quickly) to 95 (only 5% of the buffer pool is reserved for hot pages, making the algorithm close to the familiar LRU strategy).
The optimization that keeps the buffer pool from being churned by read-ahead can avoid similar problems due to table or index scans. In these scans, a data page is typically accessed a few times in quick succession and is never touched again. The configuration parameter innodb_old_blocks_time specifies the time window (in milliseconds) after the first access to a page during which it can be accessed without being moved to the front (most-recently used end) of the LRU list. The default value of innodb_old_blocks_time is 0, corresponding to the original behavior of moving a page to the most-recently used end of the buffer pool list when it is first accessed in the buffer pool. Increasing this value makes more blocks likely to age out of the buffer pool faster.
Both of the new parameters, innodb_old_blocks_pct and innodb_old_blocks_time, are dynamic and global; you can specify them in the MySQL option file (my.cnf or my.ini) or change them at runtime with the SET GLOBAL command. Changing the setting requires the SUPER privilege.
To help you gauge the effect of setting these parameters, the SHOW ENGINE INNODB STATUS command reports additional statistics. The BUFFER POOL AND MEMORY section now looks like:
    Total memory allocated 1107296256; in additional pool allocated 0
    Dictionary memory allocated 80360
    Buffer pool size   65535
    Free buffers       0
    Database pages     63920
    Old database pages 23600
    Modified db pages  34969
    Pending reads 32
    Pending writes: LRU 0, flush list 0, single page 0
    Pages made young 414946, not young 2930673
    1274.75 youngs/s, 16521.90 non-youngs/s
    Pages read 486005, created 3178, written 160585
    2132.37 reads/s, 3.40 creates/s, 323.74 writes/s
    Buffer pool hit rate 950 / 1000, young-making rate 30 / 1000 not 392 / 1000
    Pages read ahead 1510.10/s, evicted without access 0.00/s
    LRU len: 63920, unzip_LRU len: 0
    I/O sum[43690]:cur[221], unzip sum[0]:cur[0]
Old database pages is the number of pages in the “old” segment of the LRU list. Pages made young and not young are the total numbers of “old” pages that have been made young, or not, respectively. youngs/s and non-youngs/s are the rates at which page accesses to the “old” pages have resulted in making such pages young, or otherwise, since the last invocation of the command. young-making rate and not provide the same rates, but in terms of overall buffer pool accesses instead of accesses just to the “old” pages.
Because the effects of these parameters can vary widely based on your hardware configuration, your data, and the details of your workload, always benchmark to verify the effectiveness before changing these settings in any performance-critical or production environment.
In mixed workloads where most of the activity is OLTP, with periodic batch reporting queries that result in large scans, setting the value of innodb_old_blocks_time during the batch runs can help keep the working set of the normal workload in the buffer pool.
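A minimal sketch of that approach, assuming a batch window during which large reporting scans run (the 1000-millisecond value is illustrative; both statements require the SUPER privilege):

    -- before the batch reporting run
    SET GLOBAL innodb_old_blocks_time = 1000;
    -- ... run the large reporting scans ...
    -- afterward, restore the default behavior
    SET GLOBAL innodb_old_blocks_time = 0;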
When scanning large tables that cannot fit entirely in the buffer pool, setting innodb_old_blocks_pct to a small value keeps the data that is only read once from consuming a significant portion of the buffer pool. For example, setting innodb_old_blocks_pct=5 restricts this data that is only read once to 5% of the buffer pool.
When scanning small tables that do fit into memory, there is less overhead for moving pages around within the buffer pool, so you can leave innodb_old_blocks_pct at its default value, or even higher, such as innodb_old_blocks_pct=50.
The effect of the innodb_old_blocks_time parameter is harder to predict than that of innodb_old_blocks_pct; it is relatively small, and it varies more with the workload. To arrive at an optimal value, conduct your own benchmarks if the performance improvement from adjusting innodb_old_blocks_pct is not sufficient.
For more information about the InnoDB buffer pool, see The InnoDB Buffer Pool.
A number of optimizations speed up certain steps of the recovery that happens on the next startup after a crash. In particular, scanning the redo log and applying the redo log are faster than in MySQL 5.1 and earlier, due to improved algorithms for memory management. You do not need to take any actions to take advantage of this performance enhancement. If you kept the size of your redo log files artificially low because recovery took a long time, you can consider increasing the file size.
For more information about InnoDB recovery, see The InnoDB Recovery Process.
Starting with InnoDB 1.1 with MySQL 5.5, you can profile certain internal InnoDB operations using the MySQL Performance Schema feature. This type of tuning is primarily for expert users, those who push the limits of MySQL performance, read the MySQL source code, and evaluate optimization strategies to overcome performance bottlenecks. DBAs can also use this feature for capacity planning, to see whether their typical workload encounters any performance bottlenecks with a particular combination of CPU, RAM, and disk storage; and if so, to judge whether performance can be improved by increasing the capacity of some part of the system.
To use this feature to examine InnoDB performance:
You must be running MySQL 5.5 or higher. You must build the database server from source, enabling the Performance Schema feature by building with the --with-perfschema option. Since the Performance Schema feature introduces some performance overhead, you should use it on a test or development system rather than on a production system.
You must be running InnoDB 1.1 or higher.
You must be generally familiar with how to use the Performance Schema feature, for example to query tables in the performance_schema database.
Examine the following kinds of InnoDB objects by querying the appropriate performance_schema tables. The items associated with InnoDB all contain the substring innodb in the NAME column.
For the definitions of the *_instances tables, see Performance Schema Instance Tables. For the definitions of the *_summary_* tables, see Performance Schema Summary Tables. For the definition of the thread table, see Performance Schema Miscellaneous Tables. For the definitions of the *_current_* and *_history_* tables, see Performance Schema Wait Event Tables.
Mutexes in the mutex_instances table. (Mutexes and RW-locks related to the InnoDB buffer pool are not included in this coverage; the same applies to the output of the SHOW ENGINE INNODB MUTEX command.)
RW-locks in the rwlock_instances table.
File I/O operations in the file_instances, file_summary_by_event_name, and file_summary_by_instance tables.
Threads in the PROCESSLIST table.
During performance testing, examine the performance data in the events_waits_current and events_waits_history_long tables. If you are interested especially in InnoDB-related objects, use the clause where name like "%innodb%" to see just those entries; otherwise, examine the performance statistics for the overall MySQL server.
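For example, a query along these lines summarizes InnoDB-related wait events from the history table; the aggregation shown is illustrative:

    SELECT EVENT_NAME,
           COUNT(*) AS occurrences,
           SUM(TIMER_WAIT) AS total_timer_wait
      FROM performance_schema.events_waits_history_long
     WHERE EVENT_NAME LIKE '%innodb%'
     GROUP BY EVENT_NAME
     ORDER BY total_timer_wait DESC;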
For more information about the MySQL Performance Schema, see MySQL Performance Schema.
This performance enhancement is primarily useful for people with a large buffer pool size, typically in the multi-gigabyte range. To take advantage of this speedup, you must set the new innodb_buffer_pool_instances configuration option, and you might also adjust the innodb_buffer_pool_size value.
When the InnoDB buffer pool is large, many data requests can be satisfied by retrieving from memory. You might encounter bottlenecks from multiple threads trying to access the buffer pool at once. Starting in InnoDB 1.1 and MySQL 5.5, you can enable multiple buffer pools to minimize this contention. Each page that is stored in or read from the buffer pool is assigned to one of the buffer pools randomly, using a hashing function. Each buffer pool manages its own free lists, flush lists, LRUs, and all other data structures connected to a buffer pool, and is protected by its own buffer pool mutex.
To enable this feature, set the innodb_buffer_pool_instances configuration option to a value greater than 1 (the default) up to 64 (the maximum). This option takes effect only when you set innodb_buffer_pool_size to a size of 1 gigabyte or more. The total size you specify is divided among all the buffer pools. For best efficiency, specify a combination of innodb_buffer_pool_instances and innodb_buffer_pool_size so that each buffer pool instance is at least 1 gigabyte.
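For example, the following option file sketch divides an 8-gigabyte buffer pool into 8 instances of 1 gigabyte each; the sizes are illustrative:

    [mysqld]
    innodb_buffer_pool_size = 8G
    # each of the 8 instances manages 1GB, the recommended minimum
    innodb_buffer_pool_instances = 8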
For more information about the InnoDB buffer pool, see The InnoDB Buffer Pool.
Starting in InnoDB 1.1 with MySQL 5.5, the limit on concurrent transactions is greatly expanded, removing a bottleneck with the InnoDB rollback segment that affected high-capacity systems. The limit applies to concurrent transactions that change any data; read-only transactions do not count against that maximum.
The single rollback segment is now divided into 128 segments, each of which can support up to 1023 transactions that perform writes, for a total of approximately 128K concurrent transactions. The original transaction limit was 1023.
Each transaction is assigned to one of the rollback segments, and remains tied to that rollback segment for the duration. This enhancement improves both scalability (higher number of concurrent transactions) and performance (less contention when different transactions access the rollback segments).
To take advantage of this feature, you do not need to create any new database or tables, or reconfigure anything. You must perform a slow shutdown at some point, either before upgrading from MySQL 5.1 or earlier, or some time afterward. InnoDB makes the required changes inside the system tablespace automatically, the first time you restart after performing a slow shutdown.
For more information about performance of InnoDB under high transactional load, see Optimizing InnoDB Transaction Management.
Starting in InnoDB 1.1 with MySQL 5.5, the purge operations (a type of garbage collection) that InnoDB performs automatically can be done in a separate thread, rather than as part of the master thread. This change improves scalability, because the main database operations run independently from maintenance work happening in the background.
To enable this feature, set the configuration option innodb_purge_threads=1, as opposed to the default of 0, which combines the purge operation into the master thread.
You might not notice a significant speedup, because the purge thread might encounter new types of contention; the single purge thread really lays the groundwork for further tuning and possibly multiple purge threads in the future. There is another new configuration option, innodb_purge_batch_size, with a default of 20 and maximum of 5000. This option is mainly intended for experimentation and tuning of purge operations, and should not be interesting to typical users.
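To experiment with a separate purge thread, you might add the following to the option file; innodb_purge_batch_size is shown at its default and is mainly for experimentation:

    [mysqld]
    # run purge in its own background thread instead of the master thread
    innodb_purge_threads = 1
    # default batch size; tune only when experimenting with purge behavior
    innodb_purge_batch_size = 20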
For more information about InnoDB I/O performance, see Optimizing InnoDB Disk I/O.
This is another performance improvement that comes for free, with no user action or configuration needed. The details here are intended for performance experts who delve into the InnoDB source code, or interpret reports with keywords such as “mutex” and “log_sys”.
The mutex known as the log sys mutex has historically done double duty, controlling access to internal data structures related to log records and the LSN, as well as pages in the buffer pool that are changed when a mini-transaction is committed. Starting in InnoDB 1.1 with MySQL 5.5, these two kinds of operations are protected by separate mutexes, with a new log_buf mutex controlling writes to buffer pool pages due to mini-transactions.
For performance considerations for InnoDB locking operations, see Optimizing Locking Operations.
Starting with InnoDB 1.1 with MySQL 5.5, concurrent access to the buffer pool is faster. Operations involving the flush list, a data structure related to the buffer pool, are now controlled by a separate mutex and do not block access to the buffer pool. You do not need to configure anything to take advantage of this speedup; it is fully automatic.
For more information about the InnoDB buffer pool, see The InnoDB Buffer Pool.