The mongostat and mongotop utilities

These names will look familiar to most of you, as they echo two popular Unix commands, iostat and top. For MongoDB, mongostat and mongotop are two utilities that do much the same job as those Unix commands, and both are used to monitor a mongo instance.

Getting ready

In this recipe, we will simulate some operations on a standalone mongo instance by running a script that keeps the server busy, and then run these utilities in another terminal to monitor the instance.

You need to start a standalone server listening on any port for client connections; in this case, we will stick to the default, 27017. If you are not aware of how to start a standalone server, refer to Installing single node MongoDB in Chapter 1, Installing and Starting the Server. We also need to download the script KeepServerBusy.js from the Packt site and keep it handy on a local drive for execution. It is also assumed that the bin directory of your mongo installation is present in the path variable of your operating system. If not, these commands need to be executed from the shell with the absolute path of the executable. The two utilities, mongostat and mongotop, come standard with the mongo installation.

How to do it…

  1. Start the MongoDB server, and let it listen to the default port for connections.
  2. In a separate terminal, execute the provided JavaScript KeepServerBusy.js as follows:
    $ mongo KeepServerBusy.js --quiet
    
  3. Open a new OS terminal and execute the following command:
    $ mongostat
    
  4. Capture the output content for some time and then hit Ctrl + C to stop the command from capturing more stats. Keep the terminal open or copy the stats to another file.
  5. Now, execute the following command from the terminal:
    $ mongotop
    
  6. Capture the output content for some time and then hit Ctrl + C to stop the command from capturing more stats. Keep the terminal open or copy the stats to another file.
  7. Hit Ctrl + C in the shell where the provided JavaScript KeepServerBusy.js was executed to stop the operation that keeps the server busy.

How it works…

Let's see what we have captured from these two utilities.

We start by analyzing mongostat. On my laptop, the capture using mongostat looks like this:

mongostat
connected to: 127.0.0.1
insert query update delete getmore command flushes mapped vsize   res faults idx miss % qr|qw ar|aw netIn netOut conn     time
  1000     1    950   1000       1     1|0       0 624.0M  1.4G 50.0M      0          0   0|0   0|1  431k   238k    2 08:59:21
  1000     1   1159   1000       1     1|0       0 624.0M  1.4G 50.0M      0          0   0|0   0|0  468k   252k    2 08:59:22
  1000     1    984   1000       1     1|0       0 624.0M  1.4G 50.0M      0          0   0|0   0|1  437k   240k    2 08:59:23
  1000     1   1066   1000       1     1|0       0 624.0M  1.4G 50.0M      0          0   0|0   0|1  452k   246k    2 08:59:24
  1000     1    944   1000       1     2|0       0 624.0M  1.4G 50.0M      0          0   0|0   0|1  431k   237k    2 08:59:25
  1000     1   1149   1000       1     1|0       0 624.0M  1.4G 50.0M      0          0   0|0   0|1  466k   252k    2 08:59:26
  1000     2   1015   1053       2     1|0       0 624.0M  1.4G 50.0M      0          0   0|0   0|0  450k   293k    2 08:59:27

You may choose to look at what the script KeepServerBusy.js does to keep the server busy. All it does is insert 1000 documents into the collection monitoringTest, update them one by one to set a new key in each, execute a find and iterate through all of them, and finally delete them one by one. It is basically a write-intensive operation.
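The cycle described above can be sketched in a few lines. This is not the actual KeepServerBusy.js, which is a JavaScript file run by the mongo shell; it is a Python rendition for illustration only. The function name, the field names value and updated, and the use of sequential _id values are all assumptions. The collection argument is expected to behave like a pymongo collection (insert_one, update_one, find, and delete_one).

```python
def keep_server_busy_once(collection, count=1000):
    """Run one insert/update/find/delete cycle and return the number of
    documents seen by the find. Field names here are hypothetical."""
    # Insert `count` documents, one at a time.
    for i in range(count):
        collection.insert_one({"_id": i, "value": i})

    # Update them one by one, setting a new key on each document.
    for i in range(count):
        collection.update_one({"_id": i}, {"$set": {"updated": True}})

    # Execute a find and iterate through all of the documents.
    seen = 0
    for _doc in collection.find():
        seen += 1

    # Finally, delete them one by one.
    for i in range(count):
        collection.delete_one({"_id": i})

    return seen
```

To keep a server busy with this sketch, one would call the function in an endless loop with a real collection, for example the monitoringTest collection of the test database obtained through pymongo, and stop it with Ctrl + C.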

The mongostat output looks cluttered when the lines wrap, but let's analyze the fields one by one and see which ones to keep an eye on.

Column(s)

Description

insert, query, update, delete

The first four columns are the number of insert, query, update, and delete operations per second. The figures are per second because consecutive samples are captured one second apart, as indicated by the last column.

getmore

When the cursor runs out of data for the query, it executes a getmore operation on the server to get more results for the query executed earlier. This column shows the number of getmore operations executed in this given time frame of 1 second. In our case, there are not many getmore operations that are executed.

command

This is the number of commands executed on the server in the given time frame of 1 second. In our case, it was usually just one. The value is shown as two numbers separated by a |, with local commands before it and replicated commands after it. The number after the | is 0 in our case, as the server was running in standalone mode. Try executing mongostat connecting to a replica set primary and secondary. You should see slightly different figures there.

flushes

This is the number of times data was flushed to disk in the interval of 1 second (fsyncs in the case of the MMAPv1 storage engine, and checkpoints triggered within the polling interval in the case of the WiredTiger storage engine).

mapped, vsize, and res (mapped, virtual, and resident memory)

Mapped memory is the amount of memory mapped by the mongod process to the database. This will typically be the same as the size of the database. Virtual memory, on the other hand, is the memory allocated to the entire mongod process. This will be more than twice the size of mapped memory, especially when journaling is enabled. Finally, resident memory is the actual amount of physical memory used by mongo. Each figure carries a unit suffix, such as M for megabytes or G for gigabytes. The total amount of physical memory might be a lot more than what is being used by mongo, but that is not a concern unless a lot of page faults occur (which does not happen in the output shown earlier, where the faults column stays at 0).

faults

This is the number of page faults occurring per second. It should be as low as possible, because it indicates the number of times mongo had to go to disk to obtain a document or index that was missing from main memory. Page faults are less of a problem when using SSDs for persistent storage than when using spinning disk drives.

locked

Since version 2.2, all write operations to a collection lock the database in which the collection resides and do not acquire a global lock. This field shows the database that was locked for the majority of the time in the given time interval. In our case, the test database is locked for the majority of the time. Note that this column is printed by older versions of mongostat and is absent from the capture shown earlier.

idx miss %

This field gives the percentage of index accesses for which the required index was not present in memory. Each such miss causes a page fault, and the disk needs to be accessed to get the index. Another disk access might be needed to get the document as well. This figure, too, should be low. A high percentage of index misses needs attention.

qr | qw

These are the queued-up reads and writes that are waiting for a chance to be executed. If these numbers go up, the database is being overwhelmed by a higher volume of reads and writes than it can handle. If the values are too high, keep an eye on page faults and database lock percentages to get more insight into the increased queue counts. If the data set is too large, sharding the collection can improve performance significantly.

ar | aw

This is the number of active readers and writers (clients). Even a large number here is nothing to worry about, as long as the other stats we saw previously are under control.

netIn and netOut

The network traffic in and out of the mongo server in the given time frame. The figure is measured in bytes. For example, 431k means roughly 431 kilobytes.

conn

This indicates the number of open connections. Keep a watch on it to make sure that it does not keep growing.

time

This is the time at which the sample was captured.
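To work with these columns in a script, a captured line can be split into named fields. The following is a minimal sketch that assumes exactly the 18-column layout of the capture shown earlier; other mongostat versions and replica set members print different columns, so the column list would need adjusting. The function names and the queue threshold of 10 are hypothetical choices made for illustration.

```python
# Column names in the order they appear in the capture shown earlier.
COLUMNS = [
    "insert", "query", "update", "delete", "getmore", "command",
    "flushes", "mapped", "vsize", "res", "faults", "idx miss %",
    "qr|qw", "ar|aw", "netIn", "netOut", "conn", "time",
]

INTEGER_COLUMNS = ("insert", "query", "update", "delete",
                   "getmore", "flushes", "faults", "conn")
PAIR_COLUMNS = ("command", "qr|qw", "ar|aw")


def parse_mongostat_row(line):
    """Turn one data line of the capture into a dict keyed by column name."""
    values = line.split()
    if len(values) != len(COLUMNS):
        raise ValueError("unexpected number of columns: %d" % len(values))
    row = dict(zip(COLUMNS, values))
    # Plain counters become integers.
    for name in INTEGER_COLUMNS:
        row[name] = int(row[name])
    # Values such as 0|1 become a pair of integers.
    for name in PAIR_COLUMNS:
        left, right = row[name].split("|")
        row[name] = (int(left), int(right))
    # Sizes, network figures, and the time stay as the strings mongostat printed.
    return row


def queues_are_backing_up(row, threshold=10):
    """Return True when queued reads or writes exceed the (arbitrary) threshold."""
    queued_reads, queued_writes = row["qr|qw"]
    return queued_reads > threshold or queued_writes > threshold
```

Passing the first data line of the capture to parse_mongostat_row gives a dictionary in which, for example, the update count and the ar|aw pair can be read by name instead of by position.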

Some more fields are shown if mongostat is connected to a replica set primary or secondary. As an assignment, once the stats for a standalone instance are collected, start a replica set and execute the same script to keep the server busy. Use mongostat to connect to a primary and a secondary instance and see the different stats captured.

Apart from mongostat, we also used the mongotop utility to capture the stats. Let's see its output and make some sense out of it:

$ mongotop
connected to: 127.0.0.1
                        ns       total        read       write
2014-01-15T17:55:13
       test.monitoringTest       899ms         1ms       898ms
         test.system.users         0ms         0ms         0ms
    test.system.namespaces         0ms         0ms         0ms
            test.system.js         0ms         0ms         0ms
       test.system.indexes         0ms         0ms         0ms

                        ns       total        read       write
2014-01-15T17:55:14
       test.monitoringTest       959ms         0ms       959ms
         test.system.users         0ms         0ms         0ms
    test.system.namespaces         0ms         0ms         0ms
            test.system.js         0ms         0ms         0ms
       test.system.indexes         0ms         0ms         0ms

                        ns       total        read       write
2014-01-15T17:55:15
       test.monitoringTest       954ms         1ms       953ms
         test.system.users         0ms         0ms         0ms
    test.system.namespaces         0ms         0ms         0ms
            test.system.js         0ms         0ms         0ms
       test.system.indexes         0ms         0ms         0ms

There is not much to see in this stat. For each namespace (collection), we see the total time it was busy reading or writing in the given slice of 1 second. The value given in total is the sum of the read and write times. If we compare the mongotop and mongostat output for the same time slice, the percentage of the interval spent writing would be very close to the percentage of time for which the database was locked, in mongostat versions that report it.
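The relationship between the three figures can be checked with a short script. The sketch below assumes the four-column layout of the capture above, with every time ending in ms; the function names are hypothetical.

```python
def _to_milliseconds(text):
    """Convert a value such as '899ms' to the integer 899."""
    if not text.endswith("ms"):
        raise ValueError("expected a value ending in 'ms': %r" % text)
    return int(text[:-2])


def parse_mongotop_row(line):
    """Split one namespace line of the capture into its four fields."""
    namespace, total, read, write = line.split()
    return {
        "ns": namespace,
        "total": _to_milliseconds(total),
        "read": _to_milliseconds(read),
        "write": _to_milliseconds(write),
    }


def write_percentage(row, interval_ms=1000):
    """Share of the polling interval spent writing, as a percentage."""
    return 100.0 * row["write"] / interval_ms
```

For the first test.monitoringTest line of the capture, total (899) equals read (1) plus write (898), and the write time accounts for about 89.8 percent of the 1-second interval.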

The command mongotop accepts a parameter on the command line as follows:

$ mongotop 5

In this case, the interval after which the stats will be printed out will be 5 seconds as opposed to the default value of 1 second.

Note

Starting with MongoDB 3.0, both the mongotop and mongostat utilities can produce output in JSON format using the --json option. This can be very useful for custom monitoring or metrics collection scripts that rely on these utilities.
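A collection script built on the --json option might start with a reader like the one below. It is only a sketch: it assumes that each sample arrives as one JSON document per line, and it makes no assumption about the keys inside each document, which should be checked against the output of your own version. The function name is hypothetical.

```python
import json


def iter_json_samples(lines):
    """Yield one parsed document per non-empty line that contains valid JSON.

    Lines that are not valid JSON (for example, a connection banner) are
    skipped rather than treated as errors.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue
```

In practice, the lines would come from the standard output of a mongostat or mongotop process started with --json, for example by piping it into a script that passes sys.stdin to this function.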

See also

  • In the recipe Getting current executing operations and killing them, we will see how to get the currently executing operations from the shell and kill them if needed.
  • In the recipe Using profiler to profile operations, we will see how to use the built-in profiling feature of Mongo to log operation execution times.