Configuring the default shard for nonsharded collections

In the Starting a simple sharded environment of two shards recipe in Chapter 1, Installing and Starting the MongoDB Server, we set up a simple two-shard environment. In the Connecting to a shard from the Mongo shell and performing operations recipe in the same chapter, we added data to a person collection that was sharded. However, for any collection that is not sharded, all the documents end up on one shard, called the primary shard of the database. This is acceptable for small databases with a relatively small number of collections. However, if the database grows and, at the same time, the number of unsharded collections increases, we end up overloading one particular shard (the primary shard of the database) with the data from these unsharded collections. All query operations on these unsharded collections, as well as those on sharded collections whose chunk ranges reside on this server instance, are directed to it. In such a scenario, we can change the primary shard of a database to some other instance so that the load from unsharded collections is spread across different instances. In this recipe, we will see how to view the primary shard and change it to some other server whenever needed.

Getting ready

Refer to the Starting a simple sharded environment of two shards recipe in Chapter 1, Installing and Starting the MongoDB Server, to set up and start a sharded environment. From the shell, connect to the started mongos process. Also, assuming that the two shard servers are listening on ports 27000 and 27001, connect from the shell to these two processes as well. We then have a total of three shells open: one connected to the mongos process and two connected to the individual shards.

We are using the test database for this recipe, and sharding has to be enabled on this database. If it's not, then you need to execute the following commands on the shell connected to the mongos process:

mongos> use test
mongos> sh.enableSharding('test')

How to do it…

  1. From the shell connected to the mongos process, execute the following two commands:
    mongos> db.testCol.insert({i : 1})
    mongos> sh.status()
    
  2. In the databases section of the output, look for the test database and take note of its primary. Suppose that the following is a part (showing the part under databases only) of the output of sh.status():
    databases:
     {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
     {  "_id" : "test",  "partitioned" : true,  "primary" : "shard0000" }
    
  3. The second document under databases shows us that the test database is enabled for sharding (because partitioned is true) and that its primary shard is shard0000.
  4. The primary shard, which is shard0000 in our case, is the mongod process listening on port 27000. Open the shell connected to this process and execute the following query:
    > db.testCol.find()
    
  5. Now switch to the shell connected to the other mongod process, listening on port 27001, and execute the same query again:
    > db.testCol.find()
    
  6. Note that the data will be found only on the primary shard and not on any other shard.
  7. Execute the following commands from the shell connected to the mongos process:
    mongos> use admin
    mongos> db.runCommand({movePrimary:'test', to:'shard0001'})
    
  8. Execute the following command again from the shell connected to the mongos process and check that the primary of the test database is now shard0001:
    mongos> sh.status()
    
  9. From the shells connected to the mongod processes running on ports 27000 and 27001, execute the following query again. This time, the document should be found only on the new primary shard:
    > db.testCol.find()
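
Instead of scanning the output of sh.status() by eye, the primary shard can also be read from the cluster metadata. The following is a minimal sketch, not part of the original recipe: the function name primaryShardOf is hypothetical, and it assumes that it is run from a shell connected to the mongos process, where the databases collection of the config database holds one document per database with a primary field (the same documents that sh.status() prints under databases).

```javascript
// Sketch: look up the primary shard of a database from the config metadata.
// primaryShardOf is a hypothetical helper name, not a built-in shell function.
// `db` is expected to be the shell's database handle on a mongos connection.
function primaryShardOf(db, dbName) {
  // config.databases holds one document per database, keyed by its name.
  var entry = db.getSiblingDB("config")
                .getCollection("databases")
                .findOne({ _id: dbName });
  // Return null when the database is not known to the cluster.
  return entry ? entry.primary : null;
}
```

After pasting the function into the shell connected to the mongos process, calling primaryShardOf(db, 'test') returns the value of the primary field for the test database, or null if the database is not listed.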
    

How it works…

We started a sharded setup and connected to it from the mongos process. We began by inserting a document in the testCol collection, which is not sharded, even though sharding is enabled for the test database that contains it. In such cases, the data lies on a shard called the primary shard. Do not mistake this for the primary of a replica set. This is a shard (which can itself be a replica set), and it is the shard that holds, by default, all databases and collections for which sharding is not enabled.

When we add data to a nonsharded collection, it is seen only on the primary shard. Executing sh.status() tells us which shard is the primary. To change the primary, we need to run a command against the admin database from the shell connected to the mongos process. The command is as follows:

db.runCommand({movePrimary:'<database whose primary shard is to be changed>', to:'<target shard>'})

Once the primary shard is changed, all existing data in the nonsharded collections of that database is migrated to the new primary, and all subsequent writes to these nonsharded collections go to this shard.
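
The command can be wrapped in a small helper so that it is always sent to the admin database, whatever database the shell currently points to. This is only a sketch under stated assumptions: movePrimaryShard is a hypothetical name, and `db` is assumed to be the database handle of a shell connected to the mongos process.

```javascript
// Sketch: run movePrimary against the admin database.
// movePrimaryShard is a hypothetical helper name, not a built-in shell function.
// `db` is expected to be the shell's database handle on a mongos connection.
function movePrimaryShard(db, dbName, targetShard) {
  // The command name (movePrimary) must be the first key of the document.
  var command = { movePrimary: dbName, to: targetShard };
  // getSiblingDB avoids a separate "use admin" step in the shell.
  return db.getSiblingDB("admin").runCommand(command);
}
```

Calling movePrimaryShard(db, 'test', 'shard0001') sends the same command as step 7 of this recipe and returns the server's reply document, which should be inspected before assuming the move succeeded.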

Use this command with caution, as it will migrate all the unsharded collections to the new primary, which may take time for big collections.
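
To get a feel for how much data a move would involve, we can add up the sizes of the unsharded collections beforehand. The following is a rough sketch rather than a definitive check: unshardedDataSize is a hypothetical name, and it assumes that, when called through the mongos process, the stats() helper of each collection reports a sharded flag and a size in bytes. Views and other namespaces that do not support stats() are not handled.

```javascript
// Sketch: estimate how many bytes of unsharded data a database holds.
// unshardedDataSize is a hypothetical helper name, not a built-in shell function.
// Assumes stats() reports `sharded` and `size` (bytes) for every collection.
function unshardedDataSize(db, dbName) {
  var target = db.getSiblingDB(dbName);
  var total = 0;
  target.getCollectionNames().forEach(function (name) {
    var stats = target.getCollection(name).stats();
    // Only unsharded collections live entirely on the primary shard.
    if (!stats.sharded) {
      total += stats.size;
    }
  });
  return total;
}
```

Running unshardedDataSize(db, 'test') from the shell connected to the mongos process gives an approximate figure for the amount of data that movePrimary would have to copy.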
