Getting current executing operations and killing them

In this recipe, we will see how to view the currently running operations and kill those that have been running for a long time.

Getting ready

We will simulate some operations on a standalone mongo instance. We need to start a standalone server listening on any port for client connections; in this case, we will stick to the default 27017. If you are not sure how to start a standalone server, refer to Installing single node MongoDB in Chapter 1, Installing and Starting the Server. We also need two shells connected to that server. One shell will be used for the background index creation and the other to monitor the current operations and then kill the index build.

How to do it…

  1. We cannot easily reproduce a genuinely long-running operation in a test environment, so we will create an index on a large collection and rely on it taking a while to build. How long it takes depends on your hardware.
  2. To set up the test, execute the following in the mongo shell:
    > db.currentOpTest.drop()
    > for (var i = 1; i <= 10000000; i++) { db.currentOpTest.insert({'i': i}) }
    

    Inserting about 10 million documents this way might take some time.

  3. Once the documents are inserted, we will execute an operation that creates an index in the background. If you would like to know more about index creation, refer to the recipe Creating a background and foreground index in the shell in Chapter 2, Command-line Operations and Indexes, but it is not a prerequisite for this recipe.
  4. Create a background index on the field i in the document. This index creation is the operation we will view using currentOp and then attempt to kill using killOp. Execute the following in one shell to start the background index creation. It takes a fairly long time; on my laptop it took well over 100 seconds.
    > db.currentOpTest.ensureIndex({i:1}, {background:1})
    
  5. In the second shell, execute the following command to get the currently executing operations:
    > db.currentOp().inprog
    
  6. Look through the operations in progress and find the one for the index creation. In our case, it was the only operation in progress on the test machine. Depending on the server version, it appears either as an insert on system.indexes or, as in the sample output given in the next section, as a query on <database name>.$cmd whose query document contains createIndexes. The keys to look out for in the output document are ns and op, respectively. Note the opid field of this operation; in this case, it is 11587458. (The sample output in the next section shows a different value, 3212789; use whatever opid your server reports.)
  7. Kill the operation from the shell with the following command, passing the opid (operation ID) we noted earlier:
    > db.killOp(11587458)
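
On a busy server, picking the index build out of the inprog array by eye can be tedious. The following is a minimal sketch of filtering that array in JavaScript. The helper name findIndexBuilds and the sample data (including the opid 42) are made up for illustration, and the code is plain JavaScript run against a hard-coded array rather than a live server. In the mongo shell, you would pass db.currentOp().inprog instead and use print or printjson in place of console.log.

```javascript
// Hypothetical helper: pick index builds out of a currentOp-style array.
// `ops` is assumed to have the same shape as db.currentOp().inprog.
function findIndexBuilds(ops, collectionName) {
  return ops.filter(function (op) {
    // A createIndexes command issued against <database>.$cmd.
    var viaCommand = op.query && op.query.createIndexes === collectionName;
    // An insert into <database>.system.indexes (not filtered by collection here).
    var viaInsert = op.op === 'insert' && /\.system\.indexes$/.test(op.ns || '');
    return Boolean(viaCommand || viaInsert);
  });
}

// Trimmed, partly made-up sample data for illustration only.
var sampleOps = [
  { opid: 3212789, op: 'query', ns: 'test.$cmd', secs_running: 1,
    query: { createIndexes: 'currentOpTest' } },
  { opid: 42, op: 'getmore', ns: 'local.oplog.rs', secs_running: 0 }
];

var builds = findIndexBuilds(sampleOps, 'currentOpTest');
var opids = builds.map(function (op) { return op.opid; });
console.log(opids);
```

The opid values collected this way are the ones you would hand to db.killOp.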
    

How it works…

We will split our explanation into two sections: the first covers the details of the current operation and the second covers killing it.

In our case, the index creation process is the long-running operation that we intend to kill. We create a big collection with about 10 million documents and start a background index creation process.

On executing the db.currentOp() operation, we get a document as the result with a field inprog whose value is an array of other documents each representing a currently running operation. It is common to get a big list of documents on a busy system. Here is a document taken for the index creation operation:

{
        "desc" : "conn12",
        "threadId" : "0x3be96c0",
        "connectionId" : 12,
        "opid" : 3212789,
        "active" : true,
        "secs_running" : 1,
        "microsecs_running" : NumberLong(729029),
        "op" : "query",
        "ns" : "test.$cmd",
        "query" : {
            "createIndexes" : "currentOpTest",
            "indexes" : [
                {
                    "key" : {
                        "i" : 1
                    },
                    "name" : "i_1",
                    "background" : 1
                }
            ]
        },
        "client" : "127.0.0.1:36542",
        "msg" : "Index Build (background) Index Build (background): 384120/1000000 38%",
        "progress" : {
            "done" : 384120,
            "total" : 1000000
        },
        "numYields" : 3003,
        "locks" : {
            "Global" : "w",
            "MMAPV1Journal" : "w",
            "Database" : "w",
            "Collection" : "W"
        },
        "waitingForLock" : true,
        "lockStats" : {
            "Global" : {
                "acquireCount" : {
                    "w" : NumberLong(3004)
                }
            },
            "MMAPV1Journal" : {
                "acquireCount" : {
                    "w" : NumberLong(387127)
                },
                "acquireWaitCount" : {
                    "w" : NumberLong(9)
                },
                "timeAcquiringMicros" : {
                    "w" : NumberLong(60025)
                }
            },
            "Database" : {
                "acquireCount" : {
                    "w" : NumberLong(3004),
                    "W" : NumberLong(1)
                }
            },
            "Collection" : {
                "acquireCount" : {
                    "W" : NumberLong(3004)
                },
                "acquireWaitCount" : {
                    "W" : NumberLong(1)
                },
                "timeAcquiringMicros" : {
                    "W" : NumberLong(66)
                }
            },
            "Metadata" : {
                "acquireCount" : {
                    "W" : NumberLong(4)
                }
            }
        }
}

We will see what these fields mean in the following table:

| Field | Description |
| --- | --- |
| opid | A unique operation ID identifying the operation. This is the ID to use when killing an operation. |
| active | A Boolean value indicating whether the operation has started. It is false only while the operation is waiting to acquire the lock it needs to execute. Once the operation starts, the value stays true even at moments when it has yielded the lock and is not executing. |
| secs_running | The time, in seconds, that the operation has been running. |
| op | The type of the operation. Possible values are insert, query, getmore, update, remove, and command. An index build appears as an insert into the system collection of indexes on servers that create indexes that way; in the sample above it appears as query, because the build was issued as a createIndexes command against test.$cmd. |
| ns | The fully qualified namespace of the target, in the form <database name>.<collection name>. |
| insert | The document that would be inserted into the collection. |
| query | A field that is present for operations other than insert, getmore, and command. |
| client | The IP address or hostname and the port of the client that initiated the operation. |
| desc | The description of the client, mostly the client connection name. |
| connectionId | The identifier of the client connection from which the request originated. |
| locks | A document listing the locks held for this operation, showing the type and mode of each lock. The possible modes are: R, a Shared (S) lock; W, an Exclusive (X) lock; r, an Intent Shared (IS) lock; w, an Intent Exclusive (IX) lock. |
| waitingForLock | Indicates whether the operation is waiting for a lock to be acquired. For instance, if the preceding index creation were not a background process, other operations on this database would queue up waiting for the lock, and this flag would be true for those operations. |
| msg | A human-readable message for the operation. In this case, it shows the percentage of the operation that is complete, as this is an index creation operation. |
| progress | The state of the operation: total gives the total number of documents in the collection and done gives the number indexed so far. In this case, the collection already held somewhat more than 10 million documents. The percentage complete is computed from these figures. |
| numYields | The number of times the process has yielded the lock to allow other operations to execute. Since this is a background index creation process, this number keeps increasing as the server makes it yield frequently to let other operations execute. Had it been a foreground process, the lock would never be yielded until the operation completed. |
| lockStats | A document with further nested documents giving the statistics for how long this operation has held the read or write lock and how long it waited to acquire the lock. |

Note

If you have a replica set, you will also see many getmore operations on the oplog on the primary, issued by the secondaries.
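
To make the progress and locks fields described in the table above more concrete, here is a minimal sketch that derives a completion percentage and spells out the lock modes for one operation document. The helper names percentDone and describeLocks are hypothetical, and rounding down is an assumption that happens to match the 38% in the sample msg; the server may compute its figure differently. The code is plain JavaScript working on a hard-coded document trimmed from the sample output.

```javascript
// Lock mode letters as described in the table above.
var LOCK_MODES = {
  R: 'Shared (S)',
  W: 'Exclusive (X)',
  r: 'Intent Shared (IS)',
  w: 'Intent Exclusive (IX)'
};

// Hypothetical helper: percentage complete from a progress sub-document.
function percentDone(progress) {
  if (!progress || !progress.total) {
    return null; // no progress information reported for this operation
  }
  // Rounding down is an assumption, not documented server behavior.
  return Math.floor((progress.done / progress.total) * 100);
}

// Hypothetical helper: translate a locks sub-document into readable names.
function describeLocks(locks) {
  var out = {};
  Object.keys(locks || {}).forEach(function (resource) {
    out[resource] = LOCK_MODES[locks[resource]] || locks[resource];
  });
  return out;
}

// Trimmed copy of the sample operation document shown earlier.
var sampleOp = {
  opid: 3212789,
  progress: { done: 384120, total: 1000000 },
  locks: { Global: 'w', MMAPV1Journal: 'w', Database: 'w', Collection: 'W' }
};

var pct = percentDone(sampleOp.progress);
var lockNames = describeLocks(sampleOp.locks);
console.log(pct + '% done', lockNames);
```

In the mongo shell, the same helpers could be applied to each element of db.currentOp().inprog, with printjson in place of console.log.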

  1. To see system operations as well, pass true as the parameter to the currentOp function call as follows:
    > db.currentOp(true)
    
  2. Next, we will see how to kill a user-initiated operation using the killOp function. It is called as follows:
    > db.killOp(<operation id>)
    

    In our case, the index creation process had the operation ID 11587458, so it is killed as follows:

    > db.killOp(11587458)
    

    On killing any operation, irrespective of whether the given operation ID exists or not, we see the following message on the console:

    { "info" : "attempting to kill op" }
    

    Thus, seeing this message doesn't mean that the operation was killed. It only means that the server will attempt to kill the operation, if it exists.

  3. If killOp is issued for an operation that cannot be killed immediately, the field killPending starts appearing in the currentOp output for that operation. For example, execute the following query on the shell:
    > db.currentOpTest.find({$where:'sleep(100000)'})
    

This will not return, and the thread executing the query will sleep for 100 seconds. This is an operation that cannot be killed immediately using killOp. Try executing currentOp from another shell (do not press Tab for auto completion, as your shell may just hang), get the operation ID, and then kill it using killOp. If you execute currentOp again, you should see that the operation is still running, but its document now contains a new key, killPending, stating that a kill has been requested for this operation and is pending.
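
Because killOp prints the same message whether or not anything was killed, the practical way to confirm the outcome is to look at currentOp again. The following is a minimal sketch of that check. The helper name killStatus, the opid 101, and both snapshots are made up for illustration; in the mongo shell, the array would come from db.currentOp().inprog.

```javascript
// Hypothetical helper: classify an operation after killOp has been issued.
// `ops` is assumed to have the same shape as db.currentOp().inprog.
function killStatus(ops, opid) {
  var match = ops.filter(function (op) { return op.opid === opid; })[0];
  if (!match) {
    return 'gone'; // no longer listed: it finished or was killed
  }
  return match.killPending ? 'kill pending' : 'still running';
}

// Made-up snapshots for illustration only.
var beforeKill = [{ opid: 101, op: 'query', ns: 'test.currentOpTest' }];
var afterKill = [{ opid: 101, op: 'query', ns: 'test.currentOpTest', killPending: true }];

console.log(killStatus(beforeKill, 101));
console.log(killStatus(afterKill, 101));
console.log(killStatus([], 101));
```

Note that 'gone' cannot distinguish an operation that was killed from one that completed on its own; the currentOp output alone does not carry that information.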
