Configuring the default shard for non-sharded collections

In the recipe Starting a simple sharded environment of two shards in Chapter 1, Installing and Starting the Server, we set up a simple two-shard environment. In the recipe Connecting to a shard in the shell and performing operations in Chapter 1, Installing and Starting the Server, we added data to a person collection that was sharded. However, for any collection that is not sharded, all the documents end up on one shard, called the primary shard of the database. This is acceptable for small databases with a relatively small number of collections. However, as the database grows and the number of unsharded collections increases, we end up overloading one particular shard (the primary shard of the database) with the data from these unsharded collections. All query operations on these unsharded collections, as well as those on sharded collections whose relevant ranges reside on this server instance, are directed to it. In such a scenario, we can change the primary shard of a database to some other instance so that the unsharded collections are balanced out across different instances.

In this recipe, we will see how to view this primary shard and change it to some other server whenever needed.

Getting ready

Following the recipe Starting a simple sharded environment of two shards in Chapter 1, Installing and Starting the Server, set up and start a sharded environment. From the shell, connect to the started mongos process. Also, assuming that the two shard servers are listening on ports 27000 and 27001, connect from the shell to these two processes. We then have a total of three shells open: one connected to the mongos process and two connected to the individual shards.

We are using the test database for this recipe, and sharding has to be enabled on it. If it is not, execute the following in the shell connected to the mongos process:

mongos> use test
mongos> sh.enableSharding('test')

How to do it…

  1. From the shell connected to the mongos process, execute the following two commands:
    mongos> db.testCol.insert({i : 1})
    mongos> sh.status()
    
  2. In the databases section of the output, look for the test database and note its primary. Suppose the following is the part of the sh.status() output that appears under databases:
    databases:
     {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
     {  "_id" : "test",  "partitioned" : true,  "primary" : "shard0000" }
    

    The second document under the databases shows us that the database test is enabled for sharding (because partitioned is true) and the primary shard is shard0000.

  3. The primary shard, which is shard0000 in our case, is the mongod process listening to port 27000. Open the shell connected to this process and execute the following in it:
    > db.testCol.find()
    
  4. Now, switch to the shell connected to the other mongod process, listening on port 27001, and execute the same query again:
    > db.testCol.find()
    

    Note that the data is found only on the primary shard and not on the other shard.

  5. Execute the following command from the mongos shell:
    mongos> use admin
    mongos> db.runCommand({movePrimary:'test', to:'shard0001'})
    
  6. Execute the following command from the mongo shell connected to the mongos process. The primary of the test database should now be reported as shard0001:
    mongos> sh.status()
    
  7. From the shells connected to the mongod processes listening on ports 27000 and 27001, execute the following query:
    > db.testCol.find()
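
Steps 2 and 6 read the primary shard off the sh.status() output by eye. As a minimal sketch, the same lookup can be expressed as a small function over database documents shaped like the ones shown in step 2. The helper name primaryShardOf is hypothetical, and the sample array simply copies the output from step 2. In a real mongos shell, the documents could presumably be obtained with db.getSiblingDB('config').databases.find().toArray(), assuming the config database stores them in the same shape.

```javascript
// Sample documents copied from the sh.status() output shown in step 2.
const databases = [
  { _id: "admin", partitioned: false, primary: "config" },
  { _id: "test", partitioned: true, primary: "shard0000" }
];

// Hypothetical helper (not part of the mongo shell): return the primary
// shard recorded for a database, or null if the database is not listed.
function primaryShardOf(databaseDocs, dbName) {
  const entry = databaseDocs.find((doc) => doc._id === dbName);
  return entry ? entry.primary : null;
}

console.log(primaryShardOf(databases, "test")); // prints shard0000
```

Running the same lookup after step 5 against fresh documents is one way to confirm that the primary has changed.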
    

How it works…

We started a sharded setup and connected to it through the mongos process. We began by inserting a document into the testCol collection, which is not sharded even though the test database itself is enabled for sharding. In such cases, the data resides on a shard called the primary shard. Do not confuse this with the primary of a replica set. This is a shard (which can itself be a replica set), and it is the shard chosen by default for every database and collection for which sharding is not enabled.

When we added the data to a non-sharded collection, it was seen only on the primary shard. Executing sh.status() tells us which shard is the primary. To change the primary, we execute a command against the admin database from the shell connected to the mongos process. The command is as follows:

db.runCommand({movePrimary:'<database whose primary shard is to be changed>', to:'<target shard>'})
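
The command template above can be wrapped in a small function so that the two placeholders are checked before anything is sent to the server. This is only a sketch: movePrimaryCommand is a hypothetical helper, not a shell built-in, and it only builds the command document.

```javascript
// Hypothetical helper: build the movePrimary command document from the
// database name and the target shard name, rejecting empty values.
function movePrimaryCommand(dbName, targetShard) {
  if (typeof dbName !== "string" || dbName.length === 0) {
    throw new Error("database name must be a non-empty string");
  }
  if (typeof targetShard !== "string" || targetShard.length === 0) {
    throw new Error("target shard must be a non-empty string");
  }
  return { movePrimary: dbName, to: targetShard };
}

const cmd = movePrimaryCommand("test", "shard0001");
console.log(JSON.stringify(cmd)); // prints {"movePrimary":"test","to":"shard0001"}
```

In the shell connected to the mongos process, the resulting document would then be passed to the admin database, for example with db.getSiblingDB('admin').runCommand(cmd).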

Once the primary shard is changed, all the existing data of the non-sharded collections in the database is migrated to the new primary, and all subsequent writes to non-sharded collections go to this shard.

Use this command with caution as it will migrate all the unsharded collections to the new primary, which may take time for big collections.
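
Before running the command on a large database, it can help to list which collections would actually be moved. The following is a rough sketch under stated assumptions: unshardedCollections is a hypothetical helper, the collection names would come from something like db.getCollectionNames(), and the sharded namespaces are assumed to be the _id values of the documents in the config database's collections collection (for example, test.person). The sample values assume that the person collection from the earlier recipe lives in the test database.

```javascript
// Hypothetical helper: given every collection name in a database and the
// namespaces that are sharded, return the collections that are not sharded
// and would therefore move with the primary shard.
function unshardedCollections(dbName, collectionNames, shardedNamespaces) {
  const sharded = new Set(shardedNamespaces);
  return collectionNames.filter(
    (name) => !name.startsWith("system.") && !sharded.has(dbName + "." + name)
  );
}

// Assumed sample data: person is sharded, testCol is not.
const toMove = unshardedCollections("test", ["person", "testCol"], ["test.person"]);
console.log(toMove);
```

Checking the size of each listed collection first gives a rough idea of how long the migration might take.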
