Working set management and ejection
Working set management is the process of freeing up space and ensuring that the most used items are available in RAM. Ejection is the process of removing data from RAM to provide room for frequently used items.
Couchbase Server frees space in RAM while keeping the most-used items available there; this process is known as working set management. Ejection is the process of removing data from RAM to make room for frequently used items. Couchbase Server performs ejection automatically. When it ejects information, it works in conjunction with the disk persistence system to ensure that data in RAM has been persisted to disk and can be safely retrieved back into RAM if the item is requested.
In addition to the memory quota for the caching layer, there are two watermarks the engine uses to determine when it is necessary to start persisting more data to disk. These are mem_low_wat and mem_high_wat.
As the caching layer fills with data, memory usage eventually passes mem_low_wat. At this point, no action is taken. As data continues to load, usage eventually reaches mem_high_wat. At this point, a background job is scheduled to ensure items are migrated to disk and that memory is available for other Couchbase Server items. This job runs until measured memory reaches mem_low_wat. If the rate of incoming items is faster than the migration of items to disk, the system can return errors indicating there is not enough space. This continues until there is available memory. The process of removing data from the caching layer to make way for actively used information is called ejection and is controlled automatically through thresholds set on each configured bucket in the Couchbase Server cluster.
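The start and stop behavior described above can be sketched as a small state check. This is only an illustration under stated assumptions: `pager_should_run` is a hypothetical helper, not a Couchbase Server function, and the numbers are invented for the example.

```python
def pager_should_run(mem_used, low_wat, high_wat, running):
    """Decide whether the background ejection job should be running.

    Hypothetical helper: the job starts once usage reaches high_wat
    and keeps running until usage has fallen to low_wat.
    """
    if running:
        return mem_used > low_wat
    return mem_used >= high_wat


# Illustrative values only: water marks at 750 and 850 units of memory.
low_wat, high_wat = 750, 850
running = False
trace = []
for mem_used in [700, 800, 860, 820, 760, 750, 800]:
    running = pager_should_run(mem_used, low_wat, high_wat, running)
    trace.append(running)

print(trace)  # [False, False, True, True, True, False, False]
```

Note that passing mem_low_wat on the way up (800 in the trace) does not start the job; only reaching mem_high_wat does.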
Working set management process
Couchbase Server actively manages the data stored in a caching layer; this includes the information which is frequently accessed by clients and which needs to be available for rapid reads and writes. When there are too many items in RAM, Couchbase Server removes certain data to create free space and to maintain system performance. This process is called “working set management” and the set of data in RAM is referred to as the “working set”.
In general, the working set consists of all the keys, metadata, and associated documents that are frequently used and require fast access. The process the server performs to remove data from RAM is known as ejection. When the server performs this process, it removes the document, but not the keys or metadata for the item. Keeping keys and metadata in RAM serves three important purposes in a system:
- Couchbase Server uses the remaining key and metadata in RAM if a request for that key comes from a client. If a request occurs, the server then tries to fetch the item from disk and return it into RAM.
- The server can also use the keys and metadata in RAM for “miss access”. This means that it can quickly determine whether an item is missing and, if so, perform some action, such as adding it.
- Finally, the expiration process in Couchbase Server uses the metadata in RAM to quickly scan for items that are expired and later remove them from disk. This process is known as the “expiry pager” and runs every 60 minutes by default.
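The first two purposes in the list above can be illustrated with a toy model in which ejection drops only the document body. All names here (`Item`, `Bucket`, `resident`) are hypothetical and chosen for the example; the sketch also assumes every write is already persisted before ejection, which the real server coordinates with its disk persistence system.

```python
class Item:
    """Key metadata that stays in RAM even after the value is ejected."""

    def __init__(self, value):
        self.value = value
        self.resident = True  # False once the document body is ejected


class Bucket:
    def __init__(self):
        self.ram = {}   # key -> Item (key and metadata always resident)
        self.disk = {}  # key -> persisted document

    def set(self, key, value):
        self.disk[key] = value  # simplification: persisted immediately
        self.ram[key] = Item(value)

    def eject(self, key):
        item = self.ram[key]
        item.value = None       # free the document body
        item.resident = False   # key and metadata remain in RAM

    def get(self, key):
        item = self.ram.get(key)
        if item is None:
            return None         # "miss access": answered without touching disk
        if not item.resident:
            item.value = self.disk[key]  # fetch the document back into RAM
            item.resident = True
        return item.value


bucket = Bucket()
bucket.set("user::1", {"name": "Ada"})
bucket.eject("user::1")
print("user::1" in bucket.ram)   # True: the key survives ejection
print(bucket.get("user::1"))     # {'name': 'Ada'}: restored from disk
print(bucket.get("user::2"))     # None: a miss, decided from RAM alone
```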
Not Recently Used (NRU) items
All items in the server contain metadata indicating whether the item has been recently accessed or not. This metadata is known as not-recently-used (NRU). If an item has not been recently used, then the item is a candidate for ejection if the high water mark has been exceeded. When the high water mark has been exceeded, the server evicts items from RAM.
Couchbase Server provides two NRU bits per item and also provides a replication protocol that can propagate items that are frequently read, but not mutated often.
For earlier versions, Couchbase Server provided only a single bit for NRU and a different replication protocol which resulted in two issues: metadata could not reflect how frequently or recently an item had been changed, and the replication protocol only propagated NRUs for mutation items from an active vBucket to a replica vBucket. This second behavior meant that the working set on an active vBucket could be quite different than the set on a replica vBucket. By changing the replication protocol, the working set in replica vBuckets will be closer to the working set in the active vBucket.
NRUs are decremented or incremented by server processes to indicate an item is more frequently used, or less frequently used. Items with lower bit values have lower scores and are considered more frequently used. The bit values, corresponding scores and status are as follows:
Binary NRU | Score | Access pattern | Description |
---|---|---|---|
00 | 0 | Set by write access to 00. Decremented by read access or no access. | Most heavily used item. |
01 | 1 | Decremented by read access. | Frequently accessed item. |
10 | 2 | Initial value or decremented by read access. | Default for new items. |
11 | 3 | Incremented by item pager for eviction. | Less frequently used item. |
There are two processes which change the NRU for an item:
- A client reads or writes an item; the server decrements the NRU and lowers the item’s score.
- A daily process creates a list of frequently used items in RAM. After this process runs, the server increments one of the NRU bits.
Because these two processes change NRUs, they also affect which items are candidates for ejection.
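The score transitions can be modeled with three small functions. This is one reading of the table above, not server code: the function names are hypothetical, reads are assumed to lower the score by one step, writes are assumed to set it straight to 0, and server-side increments are assumed to stop at 3.

```python
INITIAL_NRU = 2          # default score for new items (binary 10)
MIN_NRU, MAX_NRU = 0, 3  # two bits give scores 0 through 3


def on_read(nru):
    """Read access lowers the score by one, never below 0."""
    return max(MIN_NRU, nru - 1)


def on_write(nru):
    """Write access sets the score to 0 (binary 00)."""
    return MIN_NRU


def on_server_increment(nru):
    """A server process raises the score by one, never above 3."""
    return min(MAX_NRU, nru + 1)


nru = INITIAL_NRU
history = [nru]
events = (on_read, on_read, on_server_increment, on_server_increment,
          on_server_increment, on_server_increment, on_write)
for event in events:
    nru = event(nru)
    history.append(nru)

print(history)  # [2, 1, 0, 1, 2, 3, 3, 0]
```

An item that is never touched drifts toward a score of 3 and becomes an ejection candidate, while any write returns it to the most heavily used state.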
Couchbase Server settings can be adjusted to change behavior during ejection. For example, you can specify the percentage of RAM to be consumed before items are ejected, or specify whether ejection should occur more frequently on replicated data than on original data. Couchbase recommends that the default settings be used.
Understanding the item pager
The item pager process, which runs periodically, removes documents from RAM and retains each item’s key and metadata. If the amount of RAM used by items reaches the high water mark (upper threshold), both active and replica data are ejected until memory usage (the amount of RAM consumed) falls to the low water mark (lower threshold). During this time, evictions occur with a probability ratio of 40% (active data) to 60% (replica data).
Both the high water mark and the low water mark are expressed as a percentage of RAM for a node, for example 80%, and both can be changed. Couchbase recommends that the following default settings be used:
Version | High water mark | Low water mark |
---|---|---|
2.0 | 75% | 60% |
2.0.1 and higher | 85% | 75% |
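To make the percentages concrete, the following sketch converts the defaults in the table into absolute thresholds for a given amount of RAM. The dictionary keys, the `water_marks` function, and the 1000 MB figure are assumptions made for the example and do not correspond to any Couchbase Server setting name.

```python
# Default (high, low) water marks in percent, taken from the table above.
DEFAULT_WATER_MARKS_PCT = {
    "2.0": (75, 60),
    "2.0.1+": (85, 75),
}


def water_marks(ram, version="2.0.1+"):
    """Return (high, low) thresholds in the same unit as `ram`."""
    high_pct, low_pct = DEFAULT_WATER_MARKS_PCT[version]
    return ram * high_pct // 100, ram * low_pct // 100


print(water_marks(1000))         # (850, 750)
print(water_marks(1000, "2.0"))  # (750, 600)
```

With 1000 MB available and the 2.0.1 defaults, ejection would begin at 850 MB and continue until usage falls to 750 MB.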
The item pager ejects items from RAM in two phases:
- Phase 1: Eject based on NRU. Scan the NRU of all items and create a list of all items with a score of 3. Eject all items with an NRU score of 3. Check RAM usage and repeat this process if usage is still above the low water mark.
- Phase 2: Eject based on algorithm. Increment all item NRUs by 1. If an NRU is equal to 3, generate a random number and eject that item if the random number is greater than a specified probability. The probability is based on current memory usage, the low water mark, and whether a vBucket is in an active or replica state. If a vBucket is in active state, the probability of ejection is lower than if the vBucket is in a replica state.
The default probabilities for active and replica vBuckets are as follows. Because an item is ejected only when the random number exceeds the probability value, the higher value for active vBuckets corresponds to a lower chance of ejection:
Active vBucket | Replica vBucket |
---|---|
60% | 40% |
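The two phases can be sketched as a single pager pass over a set of items. This is a simplified illustration rather than the server's algorithm: `pager_pass` and the item fields are hypothetical, the table values are treated as fixed thresholds for the random number, and the dependence on current memory usage mentioned in Phase 2 is left out.

```python
import random

# Values from the table above, used as thresholds for the random number.
THRESHOLD = {"active": 0.60, "replica": 0.40}


def pager_pass(items, mem_used, low_wat, rng=random.random):
    """Run one simplified pager pass and return the new memory usage.

    items maps key -> dict with "nru", "size", "state", "resident".
    """
    # Phase 1: eject every resident item whose NRU score is 3.
    for item in items.values():
        if item["resident"] and item["nru"] == 3:
            item["resident"] = False
            mem_used -= item["size"]
    if mem_used <= low_wat:
        return mem_used

    # Phase 2: age the remaining items, then eject those reaching 3
    # when the random number exceeds the threshold for their state.
    for item in items.values():
        if not item["resident"]:
            continue
        item["nru"] = min(3, item["nru"] + 1)
        if item["nru"] == 3 and rng() > THRESHOLD[item["state"]]:
            item["resident"] = False
            mem_used -= item["size"]
    return mem_used


items = {
    "a": {"nru": 3, "size": 10, "state": "active", "resident": True},
    "b": {"nru": 2, "size": 10, "state": "active", "resident": True},
    "c": {"nru": 2, "size": 10, "state": "replica", "resident": True},
    "d": {"nru": 0, "size": 10, "state": "replica", "resident": True},
}
# A fixed "random" value of 0.5 keeps the example deterministic.
remaining = pager_pass(items, mem_used=40, low_wat=20, rng=lambda: 0.5)
print(remaining)  # 20
```

With the fixed value 0.5, item "a" is ejected in Phase 1, the replica item "c" is ejected in Phase 2 (0.5 exceeds 0.40), and the active item "b" stays in RAM (0.5 does not exceed 0.60).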