- Sharding >
- Sharded Cluster Tutorials >
- Sharded Cluster Deployment Tutorials >
- Deploy a Sharded Cluster
Deploy a Sharded Cluster¶
On this page
Changed in version 3.2.
Starting in MongoDB 3.2, config servers for sharded clusters can be deployed as a replica set. The replica set config servers must run the WiredTiger storage engine. MongoDB 3.2 deprecates the use of three mirrored mongod instances for config servers.
The following tutorial deploys a new sharded cluster for MongoDB 3.2. To deploy a sharded cluster for earlier versions of MongoDB, refer to the corresponding version of the MongoDB Manual.
Considerations¶
Host Identifier¶
Warning
Sharding and “localhost” Addresses
If you use either “localhost” or 127.0.0.1 as the hostname portion of any host identifier, for example as the host argument to addShard or the value to the --configdb run time option, then you must use “localhost” or 127.0.0.1 for all host settings for any MongoDB instances in the cluster. If you mix localhost addresses and remote host address, MongoDB will error.
Connectivity¶
All members of a sharded cluster must be able to connect to all other members of a sharded cluster, including all shards and all config servers. Ensure that the network and security systems, including all interfaces and firewalls, allow these connections.
Deploy the Config Server Replica Set¶
Changed in version 3.2: Starting in MongoDB 3.2, config servers for sharded clusters can be deployed as a replica set. The replica set config servers must run the WiredTiger storage engine. MongoDB 3.2 deprecates the use of three mirrored mongod instances for config servers.
The following restrictions apply to a replica set configuration when used for config servers:
- Must have zero arbiters.
- Must have no delayed members.
- Must build indexes (i.e. no member should have buildIndexes setting set to false).
The config servers store the sharded cluster’s metadata. The following steps deploy a three member replica set for the config servers.
Start all the config servers with both the --configsvr and --replSet <name> options:
mongod --configsvr --replSet configReplSet --port <port> --dbpath <path>
Or if using a configuration file, include the sharding.clusterRole and replication.replSetName setting:
sharding: clusterRole: configsvr replication: replSetName: configReplSet net: port: <port> storage: dbpath: <path>
For additional options, see mongod or Configuration File Options.
Connect a mongo shell to one of the config servers and run rs.initiate() to initiate the replica set.
rs.initiate( { _id: "configReplSet", configsvr: true, members: [ { _id: 0, host: "<host1>:<port1>" }, { _id: 1, host: "<host2>:<port2>" }, { _id: 2, host: "<host3>:<port3>" } ] } )
To use the deprecated mirrored config server deployment topology, see Start 3 Mirrored Config Servers (Deprecated).
Start the mongos Instances¶
The mongos instances are lightweight and do not require data directories. You can run a mongos instance on a system that runs other cluster components, such as on an application server or a server running a mongod process. By default, a mongos instance runs on port 27017.
When you start the mongos instance, specify the config servers, using either the sharding.configDB setting in the configuration file or the --configdb command line option.
Note
All config servers must be running and available when you first initiate a sharded cluster.
Start one or more mongos instances. For --configdb, or sharding.configDB, specify the config server replica set name followed by a slash / and at least one of the config server hostnames and ports:
mongos --configdb configReplSet/<cfgsvr1:port1>,<cfgsvr2:port2>,<cfgsvr3:port3>
If using the deprecated mirrored config server deployment topology, see Start the mongos Instances (Deprecated).
Add Shards to the Cluster¶
A shard can be a standalone mongod or a replica set. In a production environment, each shard should be a replica set. Use the procedure in Deploy a Replica Set to deploy replica sets for each shard.
From a mongo shell, connect to the mongos instance. Issue a command using the following syntax:
mongo --host <hostname of machine running mongos> --port <port mongos listens on>
For example, if a mongos is accessible at mongos0.example.net on port 27017, issue the following command:
mongo --host mongos0.example.net --port 27017
Add each shard to the cluster using the sh.addShard() method, as shown in the examples below. Issue sh.addShard() separately for each shard. If the shard is a replica set, specify the name of the replica set and specify a member of the set. In production deployments, all shards should be replica sets.
Optional
You can instead use the addShard database command, which lets you specify a name and maximum size for the shard. If you do not specify these, MongoDB automatically assigns a name and maximum size. To use the database command, see addShard.
The following are examples of adding a shard with sh.addShard():
To add a shard for a replica set named rs1 with a member running on port 27017 on mongodb0.example.net, issue the following command:
sh.addShard( "rs1/mongodb0.example.net:27017" )
To add a shard for a standalone mongod on port 27017 of mongodb0.example.net, issue the following command:
sh.addShard( "mongodb0.example.net:27017" )
Note
It might take some time for chunks to migrate to the new shard.
Enable Sharding for a Database¶
Before you can shard a collection, you must enable sharding for the collection’s database. Enabling sharding for a database does not redistribute data but make it possible to shard the collections in that database.
Once you enable sharding for a database, MongoDB assigns a primary shard for that database where MongoDB stores all data before sharding begins.
From a mongo shell, connect to the mongos instance. Issue a command using the following syntax:
mongo --host <hostname of machine running mongos> --port <port mongos listens on>
Issue the sh.enableSharding() method, specifying the name of the database for which to enable sharding. Use the following syntax:
sh.enableSharding("<database>")
Optionally, you can enable sharding for a database using the enableSharding command, which uses the following syntax:
db.runCommand( { enableSharding: <database> } )
Shard a Collection¶
You shard on a per-collection basis.
Determine what you will use for the shard key. Your selection of the shard key affects the efficiency of sharding. See the selection considerations listed in the Considerations for Selecting Shard Key.
If the collection already contains data you must create an index on the shard key using createIndex(). If the collection is empty then MongoDB will create the index as part of the sh.shardCollection() step.
Shard a collection by issuing the sh.shardCollection() method in the mongo shell. The method uses the following syntax:
sh.shardCollection("<database>.<collection>", shard-key-pattern)
Replace the <database>.<collection> string with the full namespace of your database, which consists of the name of your database, a dot (e.g. .), and the full name of the collection. The shard-key-pattern represents your shard key, which you specify in the same form as you would an index key pattern.
Example
The following sequence of commands shards four collections:
sh.shardCollection("records.people", { "zipcode": 1, "name": 1 } ) sh.shardCollection("people.addresses", { "state": 1, "_id": 1 } ) sh.shardCollection("assets.chairs", { "type": 1, "_id": 1 } ) sh.shardCollection("events.alerts", { "_id": "hashed" } )
In order, these operations shard:
The people collection in the records database using the shard key { "zipcode": 1, "name": 1 }.
This shard key distributes documents by the value of the zipcode field. If a number of documents have the same value for this field, then that chunk will be splittable by the values of the name field.
The addresses collection in the people database using the shard key { "state": 1, "_id": 1 }.
This shard key distributes documents by the value of the state field. If a number of documents have the same value for this field, then that chunk will be splittable by the values of the _id field.
The chairs collection in the assets database using the shard key { "type": 1, "_id": 1 }.
This shard key distributes documents by the value of the type field. If a number of documents have the same value for this field, then that chunk will be splittable by the values of the _id field.
The alerts collection in the events database using the shard key { "_id": "hashed" }.
This shard key distributes documents by a hash of the value of the _id field. MongoDB computes the hash of the _id field for the hashed index, which should provide an even distribution of documents across a cluster.
Using 3 Mirrored Config Servers (Deprecated)¶
Start 3 Mirrored Config Servers (Deprecated)¶
Changed in version 3.2: Starting in MongoDB 3.2, config servers for sharded clusters can be deployed as a replica set. The replica set config servers must run the WiredTiger storage engine. MongoDB 3.2 deprecates the use of three mirrored mongod instances for config servers.
In production deployments, if using mirrored config servers, you must deploy exactly three config server instances, each running on different servers to assure good uptime and data safety. In test environments, you can run all three instances on a single server.
Important
All members of a sharded cluster must be able to connect to all other members of a sharded cluster, including all shards and all config servers. Ensure that the network and security systems including all interfaces and firewalls, allow these connections.
Create data directories for each of the three config server instances. By default, a config server stores its data files in the /data/configdb directory. You can choose a different location. To create a data directory, issue a command similar to the following:
mkdir /data/configdb
Start the three config server instances. Start each by issuing a command using the following syntax:
mongod --configsvr --dbpath <path> --port <port>
The default port for config servers is 27019. You can specify a different port. The following example starts a config server using the default port and default data directory:
mongod --configsvr --dbpath /data/configdb --port 27019
For additional command options, see mongod or Configuration File Options.
Note
All config servers must be running and available when you first initiate a sharded cluster.
Start the mongos Instances (Deprecated)¶
Changed in version 3.2: Starting in MongoDB 3.2, config servers for sharded clusters can be deployed as a replica set. The replica set config servers must run the WiredTiger storage engine. MongoDB 3.2 deprecates the use of three mirrored mongod instances for config servers.
If using 3 mirrored config servers, when you start the mongos instance, specify the hostnames of the three config servers, either in the configuration file or as command line parameters.
Tip
To avoid downtime, give each config server a logical DNS name (unrelated to the server’s physical or virtual hostname). Without logical DNS names, moving or renaming a config server requires shutting down every mongod and mongos instance in the sharded cluster.
To start a mongos instance, issue a command using the following syntax:
mongos --configdb <config server hostnames>
For example, to start a mongos that connects to config server instance running on the following hosts and on the default ports:
- cfg0.example.net
- cfg1.example.net
- cfg2.example.net
You would issue the following command:
mongos --configdb cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019
Each mongos in a sharded cluster must use the same configDB string, with identical host names listed in identical order.
If you start a mongos instance with a string that does not exactly match the string used by the other mongos instances in the cluster, the mongos instance returns a Config Database String Error error and refuses to start.
To add shards, enable sharding and shard a collection, see Add Shards to the Cluster, Enable Sharding for a Database, and Shard a Collection.
Thank you for your feedback!
We're sorry! You can Report a Problem to help us improve this page.