FAQ: MongoDB Storage¶
This document addresses common questions regarding MongoDB’s storage system.
Storage Engine Fundamentals¶
What is a storage engine?¶
A storage engine is the part of a database that is responsible for managing how data is stored, both in memory and on disk. Many databases support multiple storage engines, where different engines perform better for specific workloads. For example, one storage engine might offer better performance for read-heavy workloads, and another might support a higher throughput for write operations.
Can you mix storage engines in a replica set?¶
Yes. You can have replica set members that use different storage engines.
When designing these multi-storage engine deployments, consider the following:
- the oplog on each member may need to be sized differently to account for differences in throughput between different storage engines.
- recovery from backups may become more complex if your backup captures data files from MongoDB: you may need to maintain backups for each storage engine.
WiredTiger Storage Engine¶
Can I upgrade an existing deployment to WiredTiger?¶
Yes. See:
How much compression does WiredTiger provide?¶
The ratio of compressed data to uncompressed data depends on your data and the compression library used. By default, collection data in WiredTiger use Snappy block compression; zlib compression is also available. Index data use prefix compression by default.
To what size should I set the WiredTiger internal cache?¶
With WiredTiger, MongoDB utilizes both the WiredTiger internal cache and the filesystem cache.
Starting in 3.4, the WiredTiger internal cache, by default, will use the larger of either:
- 50% of (RAM minus 1 GB), or
- 256 MB.
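The default rule above can be written out as a small calculation. The following is a minimal sketch in plain JavaScript, not MongoDB's own code: the function name defaultCacheBytes is hypothetical, and the sketch assumes binary units (1 GB = 1024^3 bytes) and reads the first option as half of the RAM that remains after subtracting 1 GB.

```javascript
// Hypothetical helper (not part of MongoDB): estimate the default
// WiredTiger internal cache size, in bytes, for a machine with
// `ramBytes` of RAM, following the 3.4+ rule described above.
var GiB = 1024 * 1024 * 1024; // assumption: binary units
var MiB = 1024 * 1024;

function defaultCacheBytes(ramBytes) {
  // Candidate 1: 50% of (RAM minus 1 GB).
  var halfOfRamMinusOne = 0.5 * (ramBytes - 1 * GiB);
  // Candidate 2: a 256 MB floor. The default is the larger of the two.
  return Math.max(halfOfRamMinusOne, 256 * MiB);
}
```

Under these assumptions, a machine with 4 GB of RAM gets a 1.5 GB default cache, while a machine with 1 GB of RAM falls back to the 256 MB floor.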
By default, WiredTiger uses Snappy block compression for all collections and prefix compression for all indexes. Compression defaults are configurable at a global level and can also be set on a per-collection and per-index basis during collection and index creation.
Different representations are used for data in the WiredTiger internal cache versus the on-disk format:
- Data in the filesystem cache is the same as the on-disk format, including benefits of any compression for data files. The filesystem cache is used by the operating system to reduce disk I/O.
- Indexes loaded in the WiredTiger internal cache have a different data representation to the on-disk format, but can still take advantage of index prefix compression to reduce RAM usage. Index prefix compression deduplicates common prefixes from indexed fields.
- Collection data in the WiredTiger internal cache is uncompressed and uses a different representation from the on-disk format. Block compression can provide significant on-disk storage savings, but data must be uncompressed to be manipulated by the server.
Via the filesystem cache, MongoDB automatically uses all free memory that is not used by the WiredTiger cache or by other processes.
To adjust the size of the WiredTiger internal cache, see storage.wiredTiger.engineConfig.cacheSizeGB and --wiredTigerCacheSizeGB. Avoid increasing the WiredTiger internal cache size above its default value.
Note
The storage.wiredTiger.engineConfig.cacheSizeGB limits the size of the WiredTiger internal cache. The operating system will use the available free memory for filesystem cache, which allows the compressed MongoDB data files to stay in memory. In addition, the operating system will use any free RAM to buffer file system blocks and file system cache.
To accommodate the additional consumers of RAM, you may have to decrease WiredTiger internal cache size.
The default WiredTiger internal cache size value assumes that there is a single mongod instance per machine. If a single machine contains multiple MongoDB instances, then you should decrease the setting to accommodate the other mongod instances.
If you run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM available in a system, you must set storage.wiredTiger.engineConfig.cacheSizeGB to a value less than the amount of RAM available in the container. The exact amount depends on the other processes running in the container.
To view statistics on the cache and eviction rate, see the wiredTiger.cache field returned from the serverStatus command.
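As an illustration of reading those statistics, the sketch below summarizes how full the cache is from a serverStatus document. The function name cacheSummary is hypothetical, and the three statistic names it reads are the ones commonly reported under wiredTiger.cache; treat them as assumptions and check them against the output of your own server version.

```javascript
// Hypothetical helper: summarize the wiredTiger.cache section of a
// serverStatus document.
// In the mongo shell: printjson(cacheSummary(db.serverStatus()))
function cacheSummary(status) {
  var cache = status.wiredTiger.cache;
  // Assumed statistic names; verify against your server's output.
  var used = cache["bytes currently in the cache"];
  var max = cache["maximum bytes configured"];
  return {
    usedBytes: used,
    maxBytes: max,
    fillRatio: used / max, // 1.0 means the cache is full
    dirtyBytes: cache["tracked dirty bytes in the cache"]
  };
}
```

The function takes the document as an argument rather than calling serverStatus itself, so it can also be applied to a saved copy of the output.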
How frequently does WiredTiger write to disk?¶
MongoDB configures WiredTiger to create checkpoints (i.e. write the snapshot data to disk) at intervals of 60 seconds or 2 gigabytes of journal data.
For journal data, MongoDB writes to disk according to the following intervals or condition:
- New in version 3.2: Every 50 milliseconds.
- MongoDB sets checkpoints to occur in WiredTiger on user data at an interval of 60 seconds or when 2 GB of journal data has been written, whichever occurs first.
- If the write operation includes a write concern of j: true, WiredTiger forces a sync of the WiredTiger journal files.
- Because MongoDB uses a journal file size limit of 100 MB, WiredTiger creates a new journal file approximately every 100 MB of data. When WiredTiger creates a new journal file, WiredTiger syncs the previous journal file.
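The "whichever occurs first" checkpoint rule can be modeled as a simple predicate. This is only an illustration of the rule as stated above, not WiredTiger's implementation; the function and constant names are hypothetical, and the 2 GB threshold is assumed to be in binary units.

```javascript
// Illustrative model of the checkpoint trigger described above.
var CHECKPOINT_INTERVAL_SECS = 60;
var CHECKPOINT_JOURNAL_BYTES = 2 * 1024 * 1024 * 1024; // assumption: 2 GB = 2 * 1024^3

// Returns true when either threshold has been reached since the last
// checkpoint, i.e. whichever condition occurs first triggers it.
function checkpointDue(secsSinceLast, journalBytesSinceLast) {
  return secsSinceLast >= CHECKPOINT_INTERVAL_SECS ||
         journalBytesSinceLast >= CHECKPOINT_JOURNAL_BYTES;
}
```

For example, a write-heavy workload that produces 2 GB of journal data within 5 seconds would trigger a checkpoint well before the 60-second interval elapses.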
How do I reclaim disk space in WiredTiger?¶
The WiredTiger storage engine maintains lists of empty records in data files as it deletes documents. WiredTiger can reuse this space, but returns it to the operating system only under very specific circumstances.
The amount of empty space available for reuse by WiredTiger is reflected in the output of db.collection.stats() under the heading wiredTiger.block-manager.file bytes available for reuse.
To allow the WiredTiger storage engine to release this empty space to the operating system, you can de-fragment your data file. This can be achieved using the compact command. For more information on its behavior and other considerations, see compact.
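A minimal sketch of reading that statistic follows. The function name reusableBytes is hypothetical, and the orders collection in the comments is only an example; the field path is the one named above.

```javascript
// Hypothetical helper: bytes WiredTiger could reuse inside a
// collection's data file, read from a db.collection.stats() document.
// In the mongo shell: reusableBytes(db.orders.stats())
function reusableBytes(collStats) {
  var blockManager = collStats.wiredTiger["block-manager"];
  return blockManager["file bytes available for reuse"];
}

// If the value is large, the compact command can be run on the
// collection from the mongo shell, for example:
//   db.runCommand({ compact: "orders" })
```

Review the considerations documented for compact before running it on a production deployment.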
MMAPv1 Storage Engine¶
What are memory mapped files?¶
A memory-mapped file is a file with data that the operating system places in memory by way of the mmap() system call. mmap() thus maps the file to a region of virtual memory. Memory-mapped files are the critical piece of the MMAPv1 storage engine in MongoDB. By using memory mapped files, MongoDB can treat the contents of its data files as if they were in memory. This provides MongoDB with an extremely fast and simple method for accessing and manipulating data.
How do memory mapped files work?¶
MongoDB uses memory mapped files for managing and interacting with all data.
Memory mapping assigns files to a block of virtual memory with a direct byte-for-byte correlation. MongoDB memory maps data files to memory as it accesses documents. Unaccessed data is not mapped to memory.
Once mapped, the relationship between file and memory allows MongoDB to interact with the data in the file as if it were memory.
How frequently does MMAPv1 write to disk?¶
In the default configuration for the MMAPv1 storage engine, MongoDB writes to the data files on disk every 60 seconds and writes to the journal files roughly every 100 milliseconds.
To change the interval for writing to the data files, use the storage.syncPeriodSecs setting. For the journal files, see the storage.journal.commitIntervalMs setting.
These values represent the maximum amount of time between the completion of a write operation and when MongoDB writes to the data files or to the journal files. In many cases MongoDB and the operating system flush data to disk more frequently, so the above values represent a theoretical maximum.
Why are the files in my data directory larger than the data in my database?¶
The data files in your data directory, which is the /data/db directory in default configurations, might be larger than the data set inserted into the database. Consider the following possible causes:
Preallocated data files¶
MongoDB preallocates its data files to avoid filesystem fragmentation, and because of this, the size of these files does not necessarily reflect the size of your data.
The storage.mmapv1.smallFiles option will reduce the size of these files, which may be useful if you have many small databases on disk.
The oplog¶
If this mongod is a member of a replica set, the data directory includes the oplog.rs file, which is a preallocated capped collection in the local database.
The default allocation is approximately 5% of disk space on 64-bit installations. In most cases, you should not need to resize the oplog. See Oplog Sizing for more information.
The journal¶
The data directory contains the journal files, which store write operations on disk before MongoDB applies them to databases. See Journaling.
Empty records¶
The MMAPv1 storage engine maintains lists of empty records in data files as it deletes documents and collections. This space can be reused for new record allocations within the same database, but MMAPv1 will not, by default, return this space to the operating system.
To allow the MMAPv1 storage engine to more effectively reuse space from empty records, you can de-fragment your data. To de-fragment, use the compact command. The compact command requires up to 2 gigabytes of extra disk space to run. Do not use compact if you are critically low on disk space. For more information on its behavior and other considerations, see compact.
compact only removes fragmentation from MongoDB data files within a collection and does not return any disk space to the operating system. To return disk space to the operating system, see How do I reclaim disk space?
How do I reclaim disk space?¶
Note
You do not need to reclaim disk space for MongoDB to reuse freed space. See Empty records for information on reuse of freed space.
For a secondary member of a replica set, you can resync the member: stop the secondary, delete all data and subdirectories from the member’s data directory, and restart the secondary. For details, see Resync a Member of a Replica Set.
Dropping an unused database via dropDatabase will also delete the associated data files and free up disk space.
What is the working set?¶
The working set is the total body of data that the application uses in the course of normal operation. Often this is a subset of the total data size, but the specific size of the working set depends on actual moment-to-moment use of the database.
If you run a query that requires MongoDB to scan every document in a collection, the working set will expand to include every document. Depending on physical memory size, this may cause documents in the working set to “page out,” or to be removed from physical memory by the operating system. The next time MongoDB needs to access these documents, MongoDB may incur a hard page fault.
For best performance, the majority of your working set should fit in RAM.
What are page faults?¶
With the MMAPv1 storage engine, page faults can occur as MongoDB reads from or writes data to parts of its data files that are not currently located in physical memory. In contrast, operating system page faults happen when physical memory is exhausted and pages of physical memory are swapped to disk.
If there is free memory, then the operating system can find the page on disk and load it to memory directly. However, if there is no free memory, the operating system must:
- find a page in memory that is stale or no longer needed, and write the page to disk.
- read the requested page from disk and load it into memory.
This process, on an active system, can take a long time, particularly in comparison to reading a page that is already in memory.
See Page Faults for more information.
What is the difference between soft and hard page faults?¶
Page faults occur when MongoDB, with the MMAPv1 storage engine, needs access to data that isn’t currently in active memory. A “hard” page fault refers to situations when MongoDB must access a disk to access the data. A “soft” page fault, by contrast, merely moves memory pages from one list to another, such as from an operating system file cache.
See Page Faults for more information.
Can I manually pad documents to prevent moves during updates?¶
Changed in version 3.0.0.
With the MMAPv1 storage engine, an update can cause a document to move on disk if the document grows in size. To minimize document movements, MongoDB uses padding.
You should not have to pad manually because, by default, MongoDB uses Power of 2 Sized Allocations to add padding automatically. This strategy allocates document space in sizes that are powers of 2, which helps MongoDB efficiently reuse free space created by document deletion or relocation and, in many cases, reduces how often documents must be reallocated.
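To make the rounding concrete, here is a minimal sketch of rounding a requested size up to the next power of 2. It is an illustration only, not the MMAPv1 allocator: the function name powerOf2Size is hypothetical, and the sketch ignores record headers and any minimum or maximum allocation sizes the real engine applies.

```javascript
// Illustrative only: round a requested size in bytes up to the next
// power of 2. The real allocator also accounts for record headers
// and size limits, which this sketch deliberately ignores.
function powerOf2Size(bytes) {
  var size = 1;
  while (size < bytes) {
    size *= 2; // 1, 2, 4, 8, ... until the request fits
  }
  return size;
}
```

Under this model, a 100-byte request receives a 128-byte slot, leaving 28 bytes for the document to grow into before it has to move.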
However, if you must pad a document manually, you can add a temporary field to the document and then $unset the field, as in the following example.
Warning
Do not manually pad documents in a capped collection. Applying manual padding to a document in a capped collection can break replication. Also, the padding is not preserved if you re-sync the MongoDB instance.
// Insert the document with a temporary field that reserves extra space.
var myTempPadding = [ "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
                      "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
                      "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
                      "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" ];
db.myCollection.insert( { _id: 5, paddingField: myTempPadding } );
// Remove the temporary field; the space it occupied stays allocated to the record.
db.myCollection.update( { _id: 5 },
                        { $unset: { paddingField: "" } }
                      );
// Later updates can grow into the reserved space.
db.myCollection.update( { _id: 5 },
                        { $set: { realField: "Some text that I might have needed padding for" } }
                      );
Data Storage Diagnostics¶
How can I check the size of a collection?¶
To view the statistics for a collection, including the data size, use the db.collection.stats() method from the mongo shell. The following example issues db.collection.stats() for the orders collection:
db.orders.stats();
MongoDB also provides the following methods to return specific sizes for the collection:
- db.collection.dataSize() to return data size in bytes for the collection.
- db.collection.storageSize() to return allocation size in bytes, including unused space.
- db.collection.totalSize() to return the data size plus the index size in bytes.
- db.collection.totalIndexSize() to return the index size in bytes.
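The same figures are also available as fields of the db.collection.stats() document (size, storageSize, and totalIndexSize). As a hedged sketch, the hypothetical helper below converts them from bytes to mebibytes for easier reading; the function name and the rounding to two decimals are choices of this example.

```javascript
// Hypothetical helper: report collection sizes in MiB from a
// db.collection.stats() document.
// In the mongo shell: printjson(sizeSummaryMiB(db.orders.stats()))
function sizeSummaryMiB(collStats) {
  var MiB = 1024 * 1024;
  // Convert bytes to MiB, rounded to two decimal places.
  function toMiB(bytes) {
    return Math.round((bytes / MiB) * 100) / 100;
  }
  return {
    dataMiB: toMiB(collStats.size),            // same value as dataSize()
    storageMiB: toMiB(collStats.storageSize),  // same value as storageSize()
    indexMiB: toMiB(collStats.totalIndexSize)  // same value as totalIndexSize()
  };
}
```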
The following script prints the statistics for each database:
db.adminCommand("listDatabases").databases.forEach(function (d) {
    // Switch to each database in turn and print its statistics.
    var mdb = db.getSiblingDB(d.name);
    printjson(mdb.stats());
});
The following script prints the statistics for each collection in each database:
db.adminCommand("listDatabases").databases.forEach(function (d) {
    var mdb = db.getSiblingDB(d.name);
    // Print the statistics of every collection in this database.
    mdb.getCollectionNames().forEach(function (c) {
        var s = mdb[c].stats();
        printjson(s);
    });
});
How can I check the size of indexes for a collection?¶
To view the size of the data allocated for an index, use the db.collection.stats() method and check the indexSizes field in the returned document.
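For a collection with many indexes, it can help to sort that field by size. The following is a minimal sketch; the function name indexesBySize is hypothetical, and it assumes indexSizes maps each index name to its size in bytes.

```javascript
// Hypothetical helper: list a collection's indexes from largest to
// smallest, using the indexSizes field of a stats() document.
// In the mongo shell: printjson(indexesBySize(db.orders.stats()))
function indexesBySize(collStats) {
  var sizes = collStats.indexSizes;
  return Object.keys(sizes)
    .map(function (name) {
      return { name: name, bytes: sizes[name] };
    })
    .sort(function (a, b) {
      return b.bytes - a.bytes; // descending by size
    });
}
```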
How can I get information on the storage use of a database?¶
The db.stats() method in the mongo shell returns the current state of the “active” database. For the description of the returned fields, see dbStats Output.