Backup a Sharded Cluster with Filesystem Snapshots
Changed in version 3.2: Starting in MongoDB 3.2, the procedure can be used with the MMAPv1 and the WiredTiger storage engines. With previous versions of MongoDB, the procedure applied to MMAPv1 only.
Overview
This document describes a procedure for taking a backup of all components of a sharded cluster. The procedure uses file-system snapshots to capture a copy of each mongod instance’s data. An alternate procedure uses mongodump to create binary database dumps when file-system snapshots are not available. See Backup a Sharded Cluster with Database Dumps for the alternate procedure.
Important
To capture a point-in-time backup from a sharded cluster you must stop all writes to the cluster. On a running production system, you can only capture an approximation of a point-in-time snapshot.
For more information on backups in MongoDB and backups of sharded clusters in particular, see MongoDB Backup Methods and Backup and Restore Sharded Clusters.
Considerations
Balancer
It is essential that you stop the balancer before capturing a backup.
If the balancer is active while you capture backups, the backup artifacts may be incomplete or contain duplicate data, because chunks can migrate while the backup is in progress.
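As a quick check, not part of the procedure itself, you can ask a mongos whether a balancing round is currently in progress; sh.isBalancerRunning() returns true while chunks are actively being migrated.
// From a mongos: true while a balancing round is in progress.
sh.isBalancerRunning()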
Precision
In this procedure, you will stop the cluster balancer, take a backup of the config database, and then take backups of each shard in the cluster using a file-system snapshot tool. If you need an exact moment-in-time snapshot of the system, you must stop all application writes before taking the file-system snapshots; otherwise the snapshot will only approximate a moment in time.
For approximate point-in-time snapshots, you can minimize the impact on the cluster by taking the backup from a secondary member of each replica set shard.
Consistency
If the journal and data files are on the same logical volume, you can use a single point-in-time snapshot to capture a consistent copy of the data files.
If the journal and data files are on different file systems, you must use db.fsyncLock() and db.fsyncUnlock() to ensure that the data files do not change, providing consistency for the purposes of creating backups.
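In that case, the lock brackets the snapshot, as in the following sketch; the snapshot itself is taken with your file-system tools outside the mongo shell.
db.fsyncLock()     // flush pending writes to disk and block new writes
// ... take the file-system snapshot at the operating-system level ...
db.fsyncUnlock()   // allow writes to resume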
Procedure
Disable the balancer.
To disable the balancer, connect the mongo shell to a mongos instance and run sh.stopBalancer() in the config database.
use config
sh.stopBalancer()
For more information, see the Disable the Balancer procedure.
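To confirm that the balancer is disabled before proceeding, you can check its state from the same shell; sh.getBalancerState() should return false.
sh.getBalancerState()   // false once the balancer is disabled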
If necessary, lock one secondary member of each replica set.
If your secondary does not have journaling enabled or its journal and data files are on different volumes, you must lock the secondary’s mongod instance before capturing a backup.
If your secondary has journaling enabled and its journal and data files are on the same volume, you may skip this step.
Important
If your deployment requires this step, you must perform it on one secondary of each shard and, if your sharded cluster uses a replica set for the config servers, one secondary of the config server replica set (CSRS).
Ensure that the oplog has sufficient capacity to allow these secondaries to catch up to the state of the primaries after finishing the backup procedure. See Oplog Size for more information.
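For example, rs.printReplicationInfo() prints the configured oplog size and the time span between the first and last oplog entries, which indicates how long a member can remain behind and still catch up.
// Run against a replica set member to inspect its oplog window.
rs.printReplicationInfo()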
Lock shard replica set secondary.
For each shard replica set in the sharded cluster, connect a mongo shell to the secondary member’s mongod instance and run db.fsyncLock().
db.fsyncLock()
When calling db.fsyncLock(), ensure that the connection is kept open to allow a subsequent call to db.fsyncUnlock().
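To verify that the lock took effect, you can inspect the output of db.currentOp(); while the instance is locked, it includes a fsyncLock field set to true. The same check applies to the config server secondary in the next step.
db.currentOp().fsyncLock   // true while the instance is locked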
Lock config server replica set secondary.
Connect a mongo shell to the secondary member’s mongod instance and run db.fsyncLock().
db.fsyncLock()
When calling db.fsyncLock(), ensure that the connection is kept open to allow a subsequent call to db.fsyncUnlock().
Back up one of the config servers.
Backing up a config server backs up the sharded cluster’s metadata. You only need to back up one config server, as they all hold the same data. If your cluster uses a config server replica set (CSRS), perform this step against the locked config server secondary.
If the sharded cluster uses CSRS
Confirm that the locked secondary member of the CSRS recognizes that the balancer is disabled. In a mongo shell connected to the secondary member’s mongod instance, perform the following.
use config
rs.slaveOk()
db.settings.find({ "_id" : "balancer", "stopped" : true })
If the member recognizes that the balancer is disabled, the query should return a document. Otherwise, wait until the query returns a document.
To confirm that the CSRS secondary member has replicated past the last completed migration, check the changelog collection in the config database. The last logged moveChunk operation should be a commit.
use config
db.changelog.find({what:/^moveChunk/}).sort({time:-1}).next().what
The query should return "moveChunk.commit". If not, wait until the chunk migration completes.
Take a file-system snapshot of the config server.
To create a file-system snapshot of the config server, follow the procedure in Create a Snapshot.
Back up a replica set member for each shard.
If you locked a member of the replica set shards, perform this step against the locked secondary.
You may back up the shards in parallel. For each shard, create a snapshot, using the procedure in Backup and Restore with Filesystem Snapshots.
Unlock all locked replica set members.
If you locked any mongod instances to capture the backup, unlock them.
To unlock the replica set members, use the db.fsyncUnlock() method in the mongo shell. For each locked member, use the same mongo shell session that you used to lock the instance.
db.fsyncUnlock()
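After unlocking, you can watch the secondaries catch up to their primaries; rs.printSlaveReplicationInfo() reports how far each secondary is behind.
// Run from a mongo shell connected to the replica set: reports each secondary's replication lag.
rs.printSlaveReplicationInfo()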
Enable the balancer.
To re-enable the balancer, connect the mongo shell to a mongos instance and run sh.setBalancerState().
sh.setBalancerState(true)
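As before, you can confirm the balancer’s state; sh.getBalancerState() should now return true.
sh.getBalancerState()   // true once the balancer is re-enabled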
Additional Resources
See also MongoDB Cloud Manager for seamless automation, backup, and monitoring.