Chapter 2. MongoDB through the JavaScript shell

This chapter covers

  • Using CRUD operations in the MongoDB shell
  • Building indexes and using explain()
  • Understanding basic administration
  • Getting help

The previous chapter hinted at the experience of running MongoDB. If you’re ready for a more hands-on introduction, this is it. Using the MongoDB shell, this chapter teaches the database’s basic concepts through a series of exercises. You’ll learn how to create, read, update, and delete (CRUD) documents and, in the process, get to know MongoDB’s query language. In addition, we’ll take a preliminary look at database indexes and how they’re used to optimize queries. Then we’ll explore some basic administrative commands and suggest a few ways of getting help as you continue working with MongoDB’s shell. Think of this chapter as both an elaboration of the concepts already introduced and as a practical tour of the most common tasks performed from the MongoDB shell.

The MongoDB shell is the go-to tool for experimenting with the database, running ad-hoc queries, and administering running MongoDB instances. When you’re writing an application that uses MongoDB, you’ll use a language driver (like MongoDB’s Ruby gem) rather than the shell, but the shell is likely where you’ll test and refine these queries. Any and all MongoDB queries can be run from the shell.

If you’re completely new to MongoDB’s shell, know that it provides all the features that you’d expect of such a tool; it allows you to examine and manipulate data and administer the database server itself. MongoDB’s shell differs from others, however, in its query language. Instead of employing a standardized query language such as SQL, you interact with the server using the JavaScript programming language and a simple API. This means that you can write JavaScript scripts in the shell that interact with a MongoDB database. If you’re not familiar with JavaScript, rest assured that only a superficial knowledge of the language is necessary to take advantage of the shell, and all examples in this chapter will be explained thoroughly. The MongoDB API in the shell is similar to most of the language drivers, so it’s easy to take queries you write in the shell and run them from your application.

You’ll benefit most from this chapter if you follow along with the examples, but to do that, you’ll need to have MongoDB installed on your system. You’ll find installation instructions in appendix A.

2.1. Diving into the MongoDB shell

MongoDB’s JavaScript shell makes it easy to play with data and get a tangible sense of documents, collections, and the database’s particular query language. Think of the following walkthrough as a practical introduction to MongoDB.

You’ll begin by getting the shell up and running. Then you’ll see how JavaScript represents documents, and you’ll learn how to insert these documents into a MongoDB collection. To verify these inserts, you’ll practice querying the collection. Then it’s on to updates. Finally, we’ll finish out the CRUD operations by learning to remove data and drop collections.

2.1.1. Starting the shell

Follow the instructions in appendix A and you should quickly have a working MongoDB installation on your computer, as well as a running mongod instance. Once you do, start the MongoDB shell by running the mongo executable:

mongo

If the shell program starts successfully, your screen will look like figure 2.1. The shell heading displays the version of MongoDB you’re running, along with some additional information about the currently selected database.

Figure 2.1. MongoDB JavaScript shell on startup

If you know some JavaScript, you can start entering code and exploring the shell right away. Either way, read on to see how to run your first operations against MongoDB.

2.1.2. Databases, collections, and documents

As you probably know by now, MongoDB stores its information in documents, which can be printed out in JSON (JavaScript Object Notation) format. You’d probably like to store different types of documents, like users and orders, in separate places. This means that MongoDB needs a way to group documents, similar to a table in an RDBMS. In MongoDB, this is called a collection.

MongoDB divides collections into separate databases. Unlike the usual overhead that databases produce in the SQL world, databases in MongoDB are just namespaces to distinguish between collections. To query MongoDB, you’ll need to know the database (or namespace) and collection you want to query for documents. If no other database is specified on startup, the shell selects a default database called test. As a way of keeping all the subsequent tutorial exercises under the same namespace, let’s start by switching to the tutorial database:

> use tutorial
switched to db tutorial

You’ll see a message verifying that you’ve switched databases.

Why does MongoDB have both databases and collections? The answer lies in how MongoDB writes its data out to disk. All collections in a database are grouped in the same files, so it makes sense, from a memory perspective, to keep related collections in the same database. You might also want to have different applications access the same collections (multitenancy), and it's useful to keep your data organized so you're prepared for future requirements.

On creating databases and collections

You may be wondering how you can switch to the tutorial database without explicitly creating it. In fact, creating the database isn’t required. Databases and collections are created only when documents are first inserted. This behavior is consistent with MongoDB’s dynamic approach to data; just as the structure of documents needn’t be defined in advance, individual collections and databases can be created at runtime. This can lead to a simplified and accelerated development process. That said, if you’re concerned about databases or collections being created accidentally, most of the drivers let you enable a strict mode to prevent such careless errors.

It’s time to create your first document. Because you’re using a JavaScript shell, your documents will be specified in JSON. For instance, a simple document describing a user might look like this:

{username: "smith"}

The document contains a single key and value for storing Smith’s username.

2.1.3. Inserts and queries

To save this document, you need to choose a collection to save it to. Appropriately enough, you’ll save it to the users collection. Here’s how:

> db.users.insert({username: "smith"})
WriteResult({ "nInserted" : 1 })

Note

Note that in our examples, we’ll preface MongoDB shell commands with a > so that you can tell the difference between the command and its output.

You may notice a slight delay after entering this code. At this point, neither the tutorial database nor the users collection has been created on disk. The delay is caused by the allocation of the initial data files for both.

If the insert succeeds, you’ve just saved your first document. In the default MongoDB configuration, this data is now guaranteed to be inserted even if you kill the shell or suddenly restart your machine. You can issue a query to see the new document:

> db.users.find()

Since the data is now part of the users collection, reopening the shell and running the query will show the same result. The response will look something like this:

{ "_id" : ObjectId("552e458158cd52bcb257c324"), "username" : "smith" }

_id fields in MongoDB

Note that an _id field has been added to the document. You can think of the _id value as the document's primary key. Every MongoDB document requires an _id, and if one isn't present when the document is created, a special MongoDB ObjectID will be generated and added to the document at that time. The ObjectID that appears in your console won't be the same as the one in the code listing, but it will be unique among all _id values in the collection, which is the only requirement for the field. You can supply your own _id by including it in the document you insert; the ObjectID is just MongoDB's default.
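
The _id rules can be modeled in a few lines of plain JavaScript that run outside the shell. Treat this as an in-memory sketch rather than a description of how the server works: insertDoc and fakeObjectId are hypothetical helpers, and fakeObjectId is a counter-based stand-in that only produces a unique 24-character hex string, not a real ObjectID.

```javascript
// Toy model of the _id rules; not MongoDB's implementation.
let counter = 0;

// Hypothetical stand-in for ObjectID generation: a unique 24-character hex string.
function fakeObjectId() {
  counter += 1;
  return counter.toString(16).padStart(24, "0");
}

// Insert a document into an in-memory "collection" (a plain array).
function insertDoc(collection, doc) {
  // Keep a caller-supplied _id; otherwise generate one.
  const id = "_id" in doc ? doc._id : fakeObjectId();
  // The only requirement: _id must be unique within the collection.
  if (collection.some((existing) => existing._id === id)) {
    throw new Error("duplicate _id: " + id);
  }
  const stored = Object.assign({ _id: id }, doc);
  collection.push(stored);
  return stored;
}

const users = [];
const smith = insertDoc(users, { username: "smith" });          // _id generated
const jones = insertDoc(users, { _id: 42, username: "jones" }); // _id supplied
console.log(smith._id.length, jones._id);
```

Inserting a second document with _id 42 would throw, which mirrors the uniqueness requirement described above.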

We’ll have more to say about ObjectIDs in the next chapter. Let’s continue for now by adding a second user to the collection:

> db.users.insert({username: "jones"})
WriteResult({ "nInserted" : 1 })

There should now be two documents in the collection. Go ahead and verify this by running the count command:

> db.users.count()
2

Pass a query predicate

Now that you have more than one document in the collection, let’s look at some slightly more sophisticated queries. As before, you can still query for all the documents in the collection:

> db.users.find()
{ "_id" : ObjectId("552e458158cd52bcb257c324"), "username" : "smith" }
{ "_id" : ObjectId("552e542a58cd52bcb257c325"), "username" : "jones" }

You can also pass a simple query selector to the find method. A query selector is a document that’s used to match against all documents in the collection. To query for all documents where the username is jones, you pass a simple document that acts as your query selector like this:

> db.users.find({username: "jones"})
{ "_id" : ObjectId("552e542a58cd52bcb257c325"), "username" : "jones" }

The query predicate {username: "jones"} returns all documents where the username is jones—it literally matches against the existing documents.

Note that calling the find method without any argument is equivalent to passing in an empty predicate; db.users.find() is the same as db.users.find({}).

You can also specify multiple fields in the query predicate, which creates an implicit AND among the fields. For example, you query with the following selector:

> db.users.find({
... _id: ObjectId("552e458158cd52bcb257c324"),
... username: "smith"
... })
{ "_id" : ObjectId("552e458158cd52bcb257c324"), "username" : "smith" }

The three dots after the first line of the query are added by the MongoDB shell to indicate that the command takes more than one line.

The query predicate is identical to the returned document. The predicate ANDs the fields, so this query searches for a document that matches on both the _id and username fields.

You can also use MongoDB’s $and operator explicitly. The previous query is identical to

> db.users.find({ $and: [
... { _id: ObjectId("552e458158cd52bcb257c324") },
... { username: "smith" }
... ] })
{ "_id" : ObjectId("552e458158cd52bcb257c324"), "username" : "smith" }

Selecting documents with an OR is similar: just use the $or operator. Consider the following query:

> db.users.find({ $or: [
... { username: "smith" },
... { username: "jones" }
... ]})
{ "_id" : ObjectId("552e458158cd52bcb257c324"), "username" : "smith" }
{ "_id" : ObjectId("552e542a58cd52bcb257c325"), "username" : "jones" }

The query returns both the smith and jones documents, because we asked for either a username of smith or a username of jones.
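
If it helps to see the matching rules spelled out, here is a minimal in-memory sketch in plain JavaScript. It's a toy model, not MongoDB's query engine: the matches and find functions are hypothetical, and they handle only exact equality, $and, and $or.

```javascript
// Toy selector matcher; handles only equality, $and, and $or.
function matches(doc, selector) {
  // Every key in the selector must be satisfied (the implicit AND).
  return Object.keys(selector).every((key) => {
    if (key === "$and") return selector.$and.every((s) => matches(doc, s));
    if (key === "$or") return selector.$or.some((s) => matches(doc, s));
    return doc[key] === selector[key]; // plain field: exact equality
  });
}

const users = [
  { _id: 1, username: "smith" },
  { _id: 2, username: "jones" },
];

// An omitted selector behaves like the empty selector {}.
const find = (selector = {}) => users.filter((doc) => matches(doc, selector));

console.log(find({ _id: 1, username: "smith" }).length);
console.log(find({ $and: [{ _id: 1 }, { username: "smith" }] }).length);
console.log(find({ $or: [{ username: "smith" }, { username: "jones" }] }).length);
console.log(find().length);
```

The first two calls match one document each, and the last two match both, in line with the shell results shown earlier.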

This example is different than previous ones, because it doesn’t just insert or search for a specific document. Rather, the query itself is a document. The idea of representing commands as documents is used often in MongoDB and may come as a surprise if you’re used to relational databases. One advantage of this interface is that it’s easier to build queries programmatically in your application because they’re documents rather than a long SQL string.
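
To make the point about building queries programmatically concrete, here is a small sketch in plain JavaScript. The buildUsernameQuery function is a hypothetical helper, not part of any MongoDB API; it only assembles an ordinary object of the same shape as the selectors used above.

```javascript
// Build a query selector from a list of usernames (hypothetical helper).
function buildUsernameQuery(names) {
  if (names.length === 0) return {};                      // empty selector matches everything
  if (names.length === 1) return { username: names[0] };  // simple equality
  // One $or clause per username.
  return { $or: names.map((name) => ({ username: name })) };
}

const query = buildUsernameQuery(["smith", "jones"]);
console.log(JSON.stringify(query));
```

Because the result is just a document, it can be handed to find() unchanged; no string concatenation or escaping is involved.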

We’ve presented the basics of creating and reading data. Now it’s time to look at how to update that data.

2.1.4. Updating documents

All updates require at least two arguments. The first specifies which documents to update, and the second defines how the selected documents should be modified. The first few examples demonstrate modifying a single document, but the same operations can be applied to many documents, even an entire collection, as we show at the end of this section. But keep in mind that by default the update() method updates a single document.

There are two general types of updates, with different properties and use cases. One type of update involves applying modification operations to a document or documents, and the other type involves replacing the old document with a new one.

For the following examples, we’ll look at this sample document:

> db.users.find({username: "smith"})
{ "_id" : ObjectId("552e458158cd52bcb257c324"), "username" : "smith" }

Operator update

The first type of update involves passing a document with some kind of operator description as the second argument to the update function. In this section, you’ll see an example of how to use the $set operator, which sets a single field to the specified value.

Suppose that user Smith decides to add her country of residence. You can record this with the following update:

> db.users.update({username: "smith"}, {$set: {country: "Canada"}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

This update tells MongoDB to find a document where the username is smith, and then to set the value of the country property to Canada. You see the change reflected in the message that gets sent back by the server. If you now issue a query, you’ll see that the document has been updated accordingly:

> db.users.find({username: "smith"})
{ "_id" : ObjectId("552e458158cd52bcb257c324"), "username" : "smith", "country" : "Canada" }

Replacement update

Another way to update a document is to replace it rather than just set a field. This is sometimes mistakenly used when an operator update with a $set was intended. Consider a slightly different update command:

> db.users.update({username: "smith"}, {country: "Canada"})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

In this case, the document is replaced with one that only contains the country field, and the username field is removed because the first document is used only for matching and the second document is used for replacing the document that was previously matched. You should be careful when you use this kind of update. A query for the document yields the following:

> db.users.find({country: "Canada"})
{ "_id" : ObjectId("552e458158cd52bcb257c324"), "country" : "Canada" }

The _id is the same, yet data has been replaced in the update. Be sure to use the $set operator if you intend to add or set fields rather than to replace the entire document. Add the username back to the record:

> db.users.update({country: "Canada"}, {$set: {username: "smith"}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.users.find({country: "Canada"})
{ "_id" : ObjectId("552e458158cd52bcb257c324"), "country" : "Canada",
     "username" : "smith" }

If you later decide that the country stored in the profile is no longer needed, the value can be removed as easily using the $unset operator:

> db.users.update({username: "smith"}, {$unset: {country: 1}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.users.find({username: "smith"})
{ "_id" : ObjectId("552e458158cd52bcb257c324"), "username" : "smith" }
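
The difference between the update styles can be summarized with a toy model in plain JavaScript. This is a sketch under simplifying assumptions, not the server's logic: applyUpdate is a hypothetical function that understands only $set, $unset, and whole-document replacement.

```javascript
// Toy model of update semantics; not MongoDB's implementation.
function applyUpdate(doc, update) {
  const isOperatorUpdate = Object.keys(update).some((key) => key.startsWith("$"));
  if (!isOperatorUpdate) {
    // Replacement update: only _id survives from the old document.
    return Object.assign({ _id: doc._id }, update);
  }
  const result = Object.assign({}, doc);
  Object.assign(result, update.$set || {});            // add or overwrite fields
  for (const field of Object.keys(update.$unset || {})) {
    delete result[field];                              // remove fields
  }
  return result;
}

const smith = { _id: 1, username: "smith" };
const withCountry = applyUpdate(smith, { $set: { country: "Canada" } });
const replaced = applyUpdate(withCountry, { country: "Canada" });
const cleaned = applyUpdate(withCountry, { $unset: { country: 1 } });
console.log(withCountry, replaced, cleaned);
```

Note how the replacement loses the username field while keeping the _id, which is exactly the surprise described earlier in this section.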
Updating complex data

Let’s enrich this example. You’re representing your data with documents, which, as you saw in chapter 1, can contain complex data structures. Let’s suppose that, in addition to storing profile information, your users can store lists of their favorite things. A good document representation might look something like this:

{
  username: "smith",
  favorites: {
    cities: ["Chicago", "Cheyenne"],
    movies: ["Casablanca", "For a Few Dollars More", "The Sting"]
  }
}

The favorites key points to an object containing two other keys, which point to lists of favorite cities and movies. Given what you know already, can you think of a way to modify the original smith document to look like this? The $set operator should come to mind:

> db.users.update( {username: "smith"},
...   {
...     $set: {
...       favorites: {
...         cities: ["Chicago", "Cheyenne"],
...         movies: ["Casablanca", "For a Few Dollars More", "The Sting"]
...       }
...     }
...   })
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

Please note that the use of spacing for indenting isn’t mandatory, but it helps avoid errors as the document is more readable this way.

Let’s modify jones similarly, but in this case you’ll only add a couple of favorite movies:

> db.users.update( {username: "jones"},
...   {
...     $set: {
...       favorites: {
...         movies: ["Casablanca", "Rocky"]
...       }
...     }
...   })
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

If you make a typo, you can use the up arrow key to recall the last shell statement.

Now query the users collection to make sure that both updates succeeded:

> db.users.find().pretty()
{
    "_id" : ObjectId("552e458158cd52bcb257c324"),
    "username" : "smith",
    "favorites" : {
        "cities" : [
            "Chicago",
            "Cheyenne"
        ],
        "movies" : [
            "Casablanca",
            "For a Few Dollars More",
            "The Sting"
        ]
    }
}

{
    "_id" : ObjectId("552e542a58cd52bcb257c325"),
    "username" : "jones",
    "favorites" : {
        "movies" : [
            "Casablanca",
            "Rocky"
        ]
    }
}

Strictly speaking, the find() command returns a cursor over the matching documents, and you access the documents by iterating that cursor. The shell does this for you: it automatically iterates the cursor up to 20 times and prints the resulting documents, if that many are available.

With a couple of example documents at your fingertips, you can now begin to see the power of MongoDB’s query language. In particular, the query engine’s ability to reach into nested inner objects and match against array elements proves useful in this situation. Notice how we appended the pretty operation to the find operation to get nicely formatted results returned by the server. Strictly speaking, pretty() is actually cursor.pretty(), which is a way of configuring a cursor to display results in an easy-to-read format.

You can see an example of both of these concepts demonstrated in this query to find all users who like the movie Casablanca:

> db.users.find({"favorites.movies": "Casablanca"})

The dot between favorites and movies instructs the query engine to look for a key named favorites that points to an object with an inner key named movies and then to match the value of the inner key. Thus, this query will return both user documents because queries on arrays will match if any element in the array matches the original query.
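
The two ideas at work here, walking a dotted path and matching any element of an array, can be sketched in plain JavaScript. The matchesPath function below is a hypothetical illustration that supports equality only; it isn't how the query engine is implemented.

```javascript
// Toy model of dot-notation matching; supports equality only.
function matchesPath(doc, path, expected) {
  // Walk the dotted path one key at a time.
  let value = doc;
  for (const key of path.split(".")) {
    if (value === null || typeof value !== "object") return false;
    value = value[key];
  }
  // An array matches if any element equals the expected value.
  return Array.isArray(value) ? value.includes(expected) : value === expected;
}

const users = [
  { username: "smith", favorites: { cities: ["Chicago", "Cheyenne"], movies: ["Casablanca", "The Sting"] } },
  { username: "jones", favorites: { movies: ["Casablanca", "Rocky"] } },
];

// Return the usernames of all users whose value at `path` matches.
const likes = (path, value) =>
  users.filter((user) => matchesPath(user, path, value)).map((user) => user.username);

console.log(likes("favorites.movies", "Casablanca"));
console.log(likes("favorites.cities", "Cheyenne"));
```

The first call finds both users; the second finds only smith, because jones has no cities list at all.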

To see a more involved example, suppose you know that any user who likes Casablanca also likes The Maltese Falcon and that you want to update your database to reflect this fact. How would you represent this as a MongoDB update?

More advanced updates

You could conceivably use the $set operator again, but doing so would require you to rewrite and send the entire array of movies. Because all you want to do is to add an element to the list, you’re better off using either $push or $addToSet. Both operators add an item to an array, but the second does so uniquely, preventing a duplicate addition. This is the update you’re looking for:

> db.users.update( {"favorites.movies": "Casablanca"},
...     {$addToSet: {"favorites.movies": "The Maltese Falcon"} },
...           false,
...           true )
WriteResult({ "nMatched" : 2, "nUpserted" : 0, "nModified" : 2 })

Most of this should be decipherable by now. The first argument is a query predicate that matches against users who have Casablanca in their movies list. The second argument adds The Maltese Falcon to that list using the $addToSet operator.

The third argument, false, controls whether an upsert is allowed. This tells the update operation whether it should insert a document if it doesn’t already exist, which has different behavior depending on whether the update is an operator update or a replacement update.

The fourth argument, true, indicates that this is a multi-update. By default, a MongoDB update operation will apply only to the first document matched by the query selector. If you want the operation to apply to all documents matched, you must be explicit about that. You want your update to apply to both smith and jones, so the multi-update is necessary.
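
Here is a rough in-memory sketch of how $push, $addToSet, and the multi flag differ. It's plain JavaScript with hypothetical names (updateMovies and a flattened movies field instead of favorites.movies), meant only to illustrate the semantics described above.

```javascript
// Toy model of $push, $addToSet, and the multi flag on one array field.
function updateMovies(users, title, operator, value, multi) {
  let modified = 0;
  for (const user of users) {
    if (!user.movies.includes(title)) continue;  // query predicate
    // $push always appends; $addToSet appends only if the value is absent.
    if (operator === "$push" || !user.movies.includes(value)) {
      user.movies.push(value);
      modified += 1;
    }
    if (!multi) break;                           // default: first match only
  }
  return modified;
}

const users = [
  { username: "smith", movies: ["Casablanca", "The Sting"] },
  { username: "jones", movies: ["Casablanca", "Rocky"] },
];

const first = updateMovies(users, "Casablanca", "$addToSet", "The Maltese Falcon", true);
const second = updateMovies(users, "Casablanca", "$addToSet", "The Maltese Falcon", true);
const third = updateMovies(users, "Casablanca", "$push", "The Maltese Falcon", false);
console.log(first, second, third);
```

The multi-update with $addToSet modifies both users once and then nothing on a repeat, whereas the single-document $push adds a duplicate entry to the first match only.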

We’ll cover updates in more detail later, but try these examples before moving on.

2.1.5. Deleting data

Now you know the basics of creating, reading, and updating data through the MongoDB shell. We’ve saved the simplest operation, removing data, for last.

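If given an empty query selector, a remove operation will clear a collection of all its documents. The shell requires a query argument, so calling remove() with no argument at all results in an error. To get rid of, say, a foo collection's contents, you enter:

> db.foo.remove({})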

You often need to remove only a certain subset of a collection’s documents, and for that, you can pass a query selector to the remove() method. If you want to remove all users whose favorite city is Cheyenne, the expression is straightforward:

> db.users.remove({"favorites.cities": "Cheyenne"})
WriteResult({ "nRemoved" : 1 })

Note that the remove() operation doesn’t actually delete the collection; it merely removes documents from a collection. You can think of it as being analogous to SQL’s DELETE command.

If your intent is to delete the collection along with all of its indexes, use the drop() method:

> db.users.drop()

Creating, reading, updating, and deleting are the basic operations of any database; if you've followed along, you should be in a position to continue practicing basic CRUD operations in MongoDB. In section 2.2, you'll learn how to enhance your queries, updates, and deletes by taking a brief look at secondary indexes.

2.1.6. Other shell features

You may have noticed this already, but the shell does a lot of things to make working with MongoDB easier. You can revisit earlier commands by using the up and down arrows, and use autocomplete for certain inputs, like collection names. The autocomplete feature uses the tab key to autocomplete or to list the completion possibilities.[1] You can also discover more information in the shell by typing this:

> help

1

For the full list of keyboard shortcuts, please visit http://docs.mongodb.org/v3.0/reference/program/mongo/#mongo-keyboard-shortcuts.

A lot of functions print pretty help messages that explain them as well. Try it out:

> db.help()
DB methods:
    db.adminCommand(nameOrDocument) - switches to 'admin' db, and runs command [ just calls db.runCommand(...) ]
    db.auth(username, password)
    db.cloneDatabase(fromhost)
    db.commandHelp(name) returns the help for the command
    db.copyDatabase(fromdb, todb, fromhost)
...

Help on queries is provided through a different function called explain, which we’ll investigate in later sections. There are also a number of options you can use when starting the MongoDB shell. To display a list of these, add the help flag when you start the MongoDB shell:

$ mongo --help

You don’t need to worry about all these features, and we’re not done working with the shell yet, but it’s worth knowing where you can find more information when you need it.

2.2. Creating and querying with indexes

It’s common to create indexes to enhance query performance. Fortunately, MongoDB’s indexes can be created easily from the shell. If you’re new to database indexes, this section should make the need for them clear; if you already have indexing experience, you’ll see how easy it is to create indexes and then profile queries against them using the explain() method.

2.2.1. Creating a large collection

An indexing example makes sense only if you have a collection with many documents. So you’ll add 20,000 simple documents to a numbers collection. Because the MongoDB shell is also a JavaScript interpreter, the code to accomplish this is simple:

> for(i = 0; i < 20000; i++) {
    db.numbers.save({num: i});
  }
WriteResult({ "nInserted" : 1 })

That’s a lot of documents, so don’t be surprised if the insert takes a few seconds to complete. Once it returns, you can run a couple of queries to verify that all the documents are present:

> db.numbers.count()
20000
> db.numbers.find()
{ "_id": ObjectId("4bfbf132dba1aa7c30ac830a"), "num": 0 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac830b"), "num": 1 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac830c"), "num": 2 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac830d"), "num": 3 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac830e"), "num": 4 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac830f"), "num": 5 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac8310"), "num": 6 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac8311"), "num": 7 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac8312"), "num": 8 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac8313"), "num": 9 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac8314"), "num": 10 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac8315"), "num": 11 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac8316"), "num": 12 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac8317"), "num": 13 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac8318"), "num": 14 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac8319"), "num": 15 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac831a"), "num": 16 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac831b"), "num": 17 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac831c"), "num": 18 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac831d"), "num": 19 }
Type "it" for more

The count() command shows that you’ve inserted 20,000 documents. The subsequent query displays the first 20 results (this number may be different in your shell). You can display additional results with the it command:

> it
{ "_id": ObjectId("4bfbf132dba1aa7c30ac831e"), "num": 20 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac831f"), "num": 21 }
{ "_id": ObjectId("4bfbf132dba1aa7c30ac8320"), "num": 22 }
...

The it command instructs the shell to return the next result set.[2]

2

You may be wondering what’s happening behind the scenes here. All queries create a cursor, which allows for iteration over a result set. This is somewhat hidden when using the shell, so it isn’t necessary to discuss in detail at the moment. If you can’t wait to learn more about cursors and their idiosyncrasies, see chapters 3 and 4.
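
If you'd like a feel for what the shell is doing, the batching behavior can be imitated with a JavaScript generator. This is only an analogy: numbersCursor and nextBatch are hypothetical names, and a real cursor fetches documents from the server rather than generating them locally.

```javascript
// Toy cursor: a generator that yields documents one at a time.
function* numbersCursor(total) {
  for (let i = 0; i < total; i++) yield { num: i };
}

// Pull up to batchSize documents, as the shell does before printing.
function nextBatch(cursor, batchSize) {
  const batch = [];
  while (batch.length < batchSize) {
    const step = cursor.next();
    if (step.done) break;  // cursor exhausted
    batch.push(step.value);
  }
  return batch;
}

const cursor = numbersCursor(45);
const first = nextBatch(cursor, 20);   // roughly what find() prints
const second = nextBatch(cursor, 20);  // roughly what the first "it" prints
const third = nextBatch(cursor, 20);   // only 5 documents remain
console.log(first.length, second[0].num, third.length);
```

Each call resumes where the previous one stopped, which is why it continues with num 20 rather than starting over.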

With a sizable set of documents available, let’s try a couple queries. Given what you know about MongoDB’s query engine, a simple query matching a document on its num attribute makes sense:

> db.numbers.find({num: 500})
{ "_id" : ObjectId("4bfbf132dba1aa7c30ac84fe"), "num" : 500 }

Range queries

More interestingly, you can also issue range queries using the special $gt and $lt operators. They stand for greater than and less than, respectively. Here's how you query for all documents with a num value greater than 19,995:

> db.numbers.find( {num: {"$gt": 19995 }} )
{ "_id" : ObjectId("552e660b58cd52bcb2581142"), "num" : 19996 }
{ "_id" : ObjectId("552e660b58cd52bcb2581143"), "num" : 19997 }
{ "_id" : ObjectId("552e660b58cd52bcb2581144"), "num" : 19998 }
{ "_id" : ObjectId("552e660b58cd52bcb2581145"), "num" : 19999 }

You can also combine the two operators to specify upper and lower boundaries:

> db.numbers.find( {num: {"$gt": 20, "$lt": 25 }} )
{ "_id" : ObjectId("552e660558cd52bcb257c33b"), "num" : 21 }
{ "_id" : ObjectId("552e660558cd52bcb257c33c"), "num" : 22 }
{ "_id" : ObjectId("552e660558cd52bcb257c33d"), "num" : 23 }
{ "_id" : ObjectId("552e660558cd52bcb257c33e"), "num" : 24 }

You can see that by using a simple JSON document, you're able to specify a range query in much the same way you might in SQL. $gt and $lt are only two of a host of operators that make up the MongoDB query language. Others include $gte for greater than or equal to, $lte for (you guessed it) less than or equal to, and $ne for not equal to. You'll see other operators and many more example queries in later chapters.
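
As a rough illustration of what these operators mean, the following plain JavaScript evaluates a range condition against an array of numbers. The comparators table and inRange function are hypothetical and cover only the five operators just mentioned; the real query engine does considerably more.

```javascript
// Toy evaluation of comparison operators on a single numeric field.
const comparators = {
  $gt: (value, bound) => value > bound,
  $gte: (value, bound) => value >= bound,
  $lt: (value, bound) => value < bound,
  $lte: (value, bound) => value <= bound,
  $ne: (value, bound) => value !== bound,
};

// Every operator in the condition must hold (an implicit AND).
function inRange(value, condition) {
  return Object.keys(condition).every((op) => comparators[op](value, condition[op]));
}

const numbers = Array.from({ length: 20000 }, (_, i) => i);
const between = numbers.filter((n) => inRange(n, { $gt: 20, $lt: 25 }));
const top = numbers.filter((n) => inRange(n, { $gt: 19995 }));
console.log(between, top);
```

The two results correspond to the num values returned by the range queries shown above.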

Of course, queries like this are of little value unless they’re also efficient. In the next section, we’ll start thinking about query efficiency by exploring MongoDB’s indexing features.

2.2.2. Indexing and explain( )

If you’ve spent time working with relational databases, you’re probably familiar with SQL’s EXPLAIN, an invaluable tool for debugging or optimizing a query. When any database receives a query, it must plan out how to execute it; this is called a query plan. EXPLAIN describes query paths and allows developers to diagnose slow operations by determining which indexes a query has used. Often a query can be executed in multiple ways, and sometimes this results in behavior you might not expect. EXPLAIN explains. MongoDB has its own version of EXPLAIN that provides the same service. To get an idea of how it works, let’s apply it to one of the queries you just issued. Try running the following on your system:

> db.numbers.find({num: {"$gt": 19995}}).explain("executionStats")

The result should look something like what you see in the next listing. The "executionStats" keyword is new to MongoDB 3.0 and requests a different mode that gives more detailed output.

Listing 2.1. Typical explain("executionStats") output for an unindexed query
{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "tutorial.numbers",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "num" : {
                    "$gt" : 19995
            }
        },
        "winningPlan" : {
            "stage" : "COLLSCAN",
            "filter" : {
                "num" : {
                    "$gt" : 19995
                }
            },
            "direction" : "forward"
        },
        "rejectedPlans" : [ ]
    },
    "executionStats" : {
        "executionSuccess" : true,
        "nReturned" : 4,
        "executionTimeMillis" : 8,
        "totalKeysExamined" : 0,
        "totalDocsExamined" : 20000,
        "executionStages" : {
            "stage" : "COLLSCAN",
            "filter" : {
                "num" : {
                    "$gt" : 19995
                }
            },
            "nReturned" : 4,
            "executionTimeMillisEstimate" : 0,
            "works" : 20002,
            "advanced" : 4,
            "needTime" : 19997,
            "needFetch" : 0,
            "saveState" : 156,
            "restoreState" : 156,
            "isEOF" : 1,
            "invalidates" : 0,
            "direction" : "forward",
            "docsExamined" : 20000
        }
    },
    "serverInfo" : {
        "host" : "rMacBook.local",
        "port" : 27017,
        "version" : "3.0.6",
        "gitVersion" : "nogitversion"
    },
    "ok" : 1
}

Upon examining the explain() output,[3] you may be surprised to see that the query engine has to scan the entire collection, all 20,000 documents (totalDocsExamined), to return only four results (nReturned). The value of the totalKeysExamined field shows the number of index entries scanned, which is zero. Such a large difference between the number of documents scanned and the number returned marks this as an inefficient query. In a real-world situation, where the collection and the documents themselves would likely be larger, the time needed to process the query would be substantially greater than the eight milliseconds (executionTimeMillis) noted here (this may be different on your machine).

3

In these examples the host field shows the machine's hostname. On your platform this may appear as localhost, your machine's name, or its name plus .local. Don't worry if your output looks a little different from ours; it can vary based on your platform and your exact version of MongoDB.

What this collection needs is an index. You can create an index for the num key within the documents using the createIndex() method. Try entering the following index creation code:

> db.numbers.createIndex({num: 1})
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}

The createIndex() method replaces the ensureIndex() method in MongoDB 3.0. If you're using an older MongoDB version, you should use ensureIndex() instead of createIndex(). In MongoDB 3.0, ensureIndex() is still valid as an alias for createIndex(), but you should stop using it.

As with other MongoDB operations, such as queries and updates, you pass a document to the createIndex() method to define the index’s keys. In this case, the {num: 1} document indicates that an ascending index should be built on the num key for all documents in the numbers collection.

You can verify that the index has been created by calling the getIndexes() method:

> db.numbers.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "tutorial.numbers"
    },
    {
        "v" : 1,
        "key" : {
            "num" : 1
        },
        "name" : "num_1",
        "ns" : "tutorial.numbers"
    }
]

The collection now has two indexes. The first is the standard _id index that’s automatically built for every collection; the second is the index you created on num. The indexes for those fields are called _id_ and num_1, respectively. If you don’t provide a name, MongoDB generates one automatically from the indexed keys and their sort directions.
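The naming convention behind num_1 can be sketched in plain JavaScript. The defaultIndexName helper below is hypothetical and not part of the shell; it only illustrates the rule of joining each key and its direction with underscores. The _id index is a special case and is always named _id_.

```javascript
// Hypothetical helper (not a shell method): mimics how a default index
// name is derived from the key document passed to createIndex().
function defaultIndexName(keys) {
  return Object.keys(keys)
    .map(function (field) { return field + "_" + keys[field]; })
    .join("_");
}

console.log(defaultIndexName({num: 1}));            // num_1
console.log(defaultIndexName({num: 1, name: -1}));  // num_1_name_-1
```

The second call uses a made-up compound key to show how a descending direction (-1) ends up in the name.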

If you run your query with the explain() method, you’ll now see the dramatic difference in query response time, as shown in the following listing.

Listing 2.2. explain() output for an indexed query

Now that the query utilizes the index num_1 on num, it scans only the four documents pertaining to the query. This reduces the total time to serve the query from 8 ms to 0 ms!
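To see why the index cuts the work so sharply, here is a rough simulation in plain JavaScript. It isn't MongoDB code, and it assumes a range query along the lines of num greater than 19,995 over documents numbered 0 through 19,999. The function names and the sorted-array "index" are illustrative stand-ins; a real index is a B-tree and more elaborate.

```javascript
// Plain-JavaScript simulation, not MongoDB code: 20,000 documents with
// num values 0..19999, and an assumed range query for num > 19995.
var docs = [];
for (var i = 0; i < 20000; i++) {
  docs.push({num: i});
}

// Collection scan: every document must be examined.
function collectionScan(docs, min) {
  var examined = 0, returned = [];
  docs.forEach(function (doc) {
    examined++;
    if (doc.num > min) returned.push(doc);
  });
  return {docsExamined: examined, nReturned: returned.length};
}

// Stand-in for an index: sorted [num, position] entries.
var index = docs
  .map(function (doc, pos) { return [doc.num, pos]; })
  .sort(function (a, b) { return a[0] - b[0]; });

// Binary search finds the first entry greater than min; only the
// entries from that point on are visited.
function indexScan(docs, index, min) {
  var lo = 0, hi = index.length;
  while (lo < hi) {
    var mid = (lo + hi) >> 1;
    if (index[mid][0] > min) hi = mid; else lo = mid + 1;
  }
  var keysExamined = 0, returned = [];
  for (var k = lo; k < index.length; k++) {
    keysExamined++;
    returned.push(docs[index[k][1]]);
  }
  return {keysExamined: keysExamined, nReturned: returned.length};
}

console.log(collectionScan(docs, 19995));    // 20000 examined, 4 returned
console.log(indexScan(docs, index, 19995));  // 4 keys examined, 4 returned
```

The scan touches all 20,000 documents to find four matches, whereas the sorted structure lets the search jump straight to the matching range.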

Indexes don’t come free; they take up some space and can make your inserts slightly more expensive, but they’re an essential tool for query optimization. If this example intrigues you, be sure to check out chapter 8, which is devoted to indexing and query optimization. Next you’ll look at the basic administrative commands required to get information about your MongoDB instance. You’ll also learn techniques for getting help from the shell, which will aid in mastering the various shell commands.

2.3. Basic administration

This chapter promised to be an introduction to MongoDB via the JavaScript shell. You’ve already learned the basics of data manipulation and indexing. Here, we’ll present techniques for getting information about your mongod process. For instance, you’ll probably want to know how much space your various collections are taking up, or how many indexes you’ve defined on a given collection. The commands detailed here can take you a long way in helping to diagnose performance issues and keep tabs on your data.

We’ll also look at MongoDB’s command interface. Most of the special, non-CRUD operations that can be performed on a MongoDB instance, from server status checks to data file integrity verification, are implemented using database commands. We’ll explain what commands are in the MongoDB context and show how easy they are to use. Finally, it’s always good to know where to look for help. To that end, we’ll point out places in the shell where you can turn for help to further your exploration of MongoDB.

2.3.1. Getting database information

You’ll often want to know which collections and databases exist on a given installation. Fortunately, the MongoDB shell provides a number of commands, along with some syntactic sugar, for getting information about the system.

show dbs prints a list of all the databases on the system:

> show dbs
admin     (empty)
local     0.078GB
tutorial  0.078GB

show collections displays a list of all the collections defined on the current database.[4] If the tutorial database is still selected, you’ll see a list of the collections you worked with in the preceding tutorial:

4

You can also enter the more succinct show tables.

> show collections
numbers
system.indexes
users

The one collection that you may not recognize is system.indexes. This is a special collection that exists for every database. Each entry in system.indexes defines an index for the database, which you can view using the getIndexes() method, as you saw earlier. But MongoDB 3.0 deprecates direct access to the system.indexes collection; you should use the createIndexes and listIndexes commands instead. The getIndexes() JavaScript method can be replaced by the db.runCommand( {"listIndexes": "numbers"} ) shell command.
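If you go the listIndexes route, the index documents arrive wrapped in a command response rather than as a bare array. The sketch below assumes the MongoDB 3.0 response shape, with the index documents in cursor.firstBatch. The indexNames helper and the fakeDb object are hypothetical; fakeDb exists only so the snippet runs outside the shell, and in the shell you’d pass the real db object instead.

```javascript
// Extracts index names from a listIndexes command response.
// Assumes the response shape {cursor: {firstBatch: [...]}, ok: 1}.
function indexNames(db, collectionName) {
  var res = db.runCommand({listIndexes: collectionName});
  if (!res.ok) throw new Error("listIndexes failed");
  return res.cursor.firstBatch.map(function (ix) { return ix.name; });
}

// Hypothetical stand-in for the shell's db object, returning a canned
// response modeled on the getIndexes() output shown earlier.
var fakeDb = {
  runCommand: function (cmd) {
    return {
      cursor: {
        id: 0,
        ns: "tutorial.$cmd.listIndexes." + cmd.listIndexes,
        firstBatch: [
          {v: 1, key: {_id: 1}, name: "_id_", ns: "tutorial.numbers"},
          {v: 1, key: {num: 1}, name: "num_1", ns: "tutorial.numbers"}
        ]
      },
      ok: 1
    };
  }
};

console.log(indexNames(fakeDb, "numbers"));  // the names _id_ and num_1
```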

For lower-level insight into databases and collections, the stats() method proves useful. When you run it on a database object, you’ll get the following output:

> db.stats()
{
    "db" : "tutorial",
    "collections" : 4,
    "objects" : 20010,
    "avgObjSize" : 48.0223888055972,
    "dataSize" : 960928,
    "storageSize" : 2818048,
    "numExtents" : 8,
    "indexes" : 3,
    "indexSize" : 1177344,
    "fileSize" : 67108864,
    "nsSizeMB" : 16,
    "extentFreeList" : {
        "num" : 0,
        "totalSize" : 0
    },
    "dataFileVersion" : {
        "major" : 4,
        "minor" : 5
    },
    "ok" : 1
}

You can also run the stats() method on an individual collection:

> db.numbers.stats()
{
    "ns" : "tutorial.numbers",
    "count" : 20000,
    "size" : 960064,
    "avgObjSize" : 48,
    "storageSize" : 2793472,
    "numExtents" : 5,
    "nindexes" : 2,
    "lastExtentSize" : 2097152,
    "paddingFactor" : 1,
    "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
    "systemFlags" : 1,
    "userFlags" : 1,
    "totalIndexSize" : 1169168,
    "indexSizes" : {
        "_id_" : 654080,
        "num_1" : 515088
    },
    "ok" : 1
}

Some of the values provided in these result documents are useful only in complicated debugging or tuning situations. But at the very least, you’ll be able to find out how much space a given collection and its indexes are occupying.
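A couple of these fields are simple arithmetic over the others, which is a handy sanity check when you read stats output. The following plain-JavaScript sketch copies a few values from the two result documents above; the variable names are ours, not the shell’s.

```javascript
// Values copied from the db.stats() and db.numbers.stats() output above.
var dbStats = {objects: 20010, dataSize: 960928, avgObjSize: 48.0223888055972};
var collStats = {
  totalIndexSize: 1169168,
  indexSizes: {"_id_": 654080, "num_1": 515088}
};

// Average object size is the data size divided by the object count.
var avg = dbStats.dataSize / dbStats.objects;

// Total index size is the sum of the per-index sizes.
var indexTotal = Object.keys(collStats.indexSizes).reduce(function (sum, name) {
  return sum + collStats.indexSizes[name];
}, 0);

console.log(avg.toFixed(4));  // 48.0224
console.log(indexTotal);      // 1169168
```

Both results line up with the avgObjSize and totalIndexSize values reported by the server.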

2.3.2. How commands work

A certain set of MongoDB operations—distinct from the insert, update, remove, and query operations described so far in this chapter—are known as database commands. Database commands are generally administrative, as with the stats() methods just presented, but they may also control core MongoDB features, such as updating data.

Regardless of the functionality they provide, what all database commands have in common is their implementation as queries on a special virtual collection called $cmd. To show what this means, let’s take a quick example. Recall how you invoked the stats() database command:

> db.stats()

The stats() method is a helper that wraps the shell’s command invocation method. Try entering the following equivalent operation:

> db.runCommand( {dbstats: 1} )

The results are identical to what’s provided by the stats() method. Note that the command is defined by the document {dbstats: 1}. In general, you can run any available command by passing its document definition to the runCommand() method. Here’s how you’d run the collection stats command:

> db.runCommand( {collstats: "numbers"} )

The output should look familiar.

But to get to the heart of database commands, you need to see how the runCommand() method works. That’s not hard to find out because the MongoDB shell will print the implementation of any method whose executing parentheses are omitted. Instead of running the command like this

> db.runCommand()

you can execute the parentheses-less version and see the internals:

> db.runCommand
  function ( obj, extra ){
    if ( typeof( obj ) == "string" ){
        var n = {};
        n[obj] = 1;
        obj = n;
        if ( extra && typeof( extra ) == "object" ) {
            for ( var x in extra ) {
                n[x] = extra[x];
            }
        }
    }
    return this.getCollection( "$cmd" ).findOne( obj );
  }

The last line in the function is nothing more than a query on the $cmd collection. To define it properly, then, a database command is a query on a special collection, $cmd, where the query selector defines the command itself. That’s all there is to it. Can you think of a way to run the collection stats command manually? It’s this simple:

> db.$cmd.findOne( {collstats: "numbers"} );

Using the runCommand helper is easier but it’s always good to know what’s going on beneath the surface.
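One detail of the runCommand() source worth pulling out is the string shortcut: a bare command name is expanded into a command document before the query is issued. Here is that normalization step restated as a standalone function. The name toCommandDocument is ours, and scale is used only as an example of an extra option.

```javascript
// Pure-JavaScript restatement of the normalization step in runCommand():
// a string command name becomes a document whose first key is that name,
// and any extra options are merged in after it.
function toCommandDocument(obj, extra) {
  if (typeof obj === "string") {
    var n = {};
    n[obj] = 1;
    if (extra && typeof extra === "object") {
      for (var x in extra) {
        n[x] = extra[x];
      }
    }
    return n;
  }
  // Anything that's already a document is passed through untouched.
  return obj;
}

console.log(toCommandDocument("dbstats"));                 // { dbstats: 1 }
console.log(toCommandDocument("dbstats", {scale: 1024}));  // { dbstats: 1, scale: 1024 }
console.log(toCommandDocument({collstats: "numbers"}));    // { collstats: 'numbers' }
```

Note that the shortcut always sets the command’s value to 1, so it suits commands like dbstats but not collstats, which needs the collection name as its value.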

2.4. Getting help

By now, the value of the MongoDB shell as a testing ground for experimenting with data and administering the database should be evident. But because you’ll likely spend a lot of time in the shell, it’s worth knowing how to get help.

The built-in help commands are the first place to look. db.help() prints a list of commonly used methods for operating on databases. You’ll find a similar list of methods for operating on collections by running db.numbers.help().

There’s also built-in tab completion. Start typing the first characters of any method and then press the Tab key twice. You’ll see a list of all matching methods. Here’s the tab completion for collection methods beginning with get:

> db.numbers.get
db.numbers.getCollection(          db.numbers.getIndexes(             db.numbers.getShardDistribution(
db.numbers.getDB(                  db.numbers.getIndices(             db.numbers.getShardVersion(
db.numbers.getDiskStorageStats(    db.numbers.getMongo(               db.numbers.getSlaveOk(
db.numbers.getFullName(            db.numbers.getName(                db.numbers.getSplitKeysForChunks(
db.numbers.getIndexKeys(           db.numbers.getPagesInRAM(          db.numbers.getWriteConcern(
db.numbers.getIndexSpecs(          db.numbers.getPlanCache(
db.numbers.getIndexStats(          db.numbers.getQueryOptions(

The official MongoDB manual is an invaluable resource and can be found at http://docs.mongodb.org. It has both tutorials and reference material, and it’s kept up-to-date with new releases of MongoDB. The manual also includes documentation for each language-specific MongoDB driver implementation, such as the Ruby driver, which is necessary when accessing MongoDB from an application.

If you’re more ambitious, and are comfortable with JavaScript, the shell makes it easy to examine the implementation of any given method. For instance, suppose you’d like to know exactly how the save() method works. Sure, you could go trolling through the MongoDB source code, but there’s an easier way: enter the method name without the executing parentheses. Here’s how you’d normally execute save():

> db.numbers.save({num: 123123123});

And this is how you can check the implementation:

> db.numbers.save
function ( obj , opts ){
    if ( obj == null )
        throw "can't save a null";

    if ( typeof( obj ) == "number" || typeof( obj) == "string" )
        throw "can't save a number or string"

    if ( typeof( obj._id ) == "undefined" ){
        obj._id = new ObjectId();
        return this.insert( obj , opts );
    }
    else {
        return this.update( { _id : obj._id } , obj , Object.merge({ upsert:true }, opts));
    }
}

Read the function definition closely, and you’ll see that save() is merely a wrapper for insert() and update(). After checking the type of the obj argument, if the object you’re trying to save doesn’t have an _id field, then the field is added, and insert() is invoked. Otherwise an update is performed.

This trick for examining the shell’s methods comes in handy. Keep this technique in mind as you continue exploring the MongoDB shell.
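If you’d like to watch the insert-or-update branching of save() without touching a database, the following sketch reproduces it against an in-memory stand-in. FakeCollection is hypothetical, and a simple counter takes the place of new ObjectId(); only the branching logic mirrors the shell source shown above.

```javascript
// In-memory stand-in for a collection (hypothetical, for illustration only).
function FakeCollection() {
  this.docs = {};   // documents keyed by _id
  this.nextId = 1;  // simple counter standing in for new ObjectId()
}

FakeCollection.prototype.insert = function (obj) {
  this.docs[obj._id] = obj;
  return "insert";
};

FakeCollection.prototype.update = function (query, obj, opts) {
  // With upsert: true the document is written whether or not it exists.
  if (opts.upsert || this.docs.hasOwnProperty(query._id)) {
    this.docs[query._id] = obj;
  }
  return "update";
};

// Same branching as the shell's save(): no _id means insert, otherwise upsert.
FakeCollection.prototype.save = function (obj) {
  if (obj == null) throw new Error("can't save a null");
  if (typeof obj === "number" || typeof obj === "string") {
    throw new Error("can't save a number or string");
  }
  if (typeof obj._id === "undefined") {
    obj._id = this.nextId++;
    return this.insert(obj);
  }
  return this.update({_id: obj._id}, obj, {upsert: true});
};

var coll = new FakeCollection();
var doc = {num: 123123123};
console.log(coll.save(doc));  // insert
doc.num = 42;
console.log(coll.save(doc));  // update
```

The first call takes the insert path because the document has no _id yet; the second call finds the _id that save() added and takes the update path.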

2.5. Summary

You’ve now seen the document data model in practice, and we’ve demonstrated a variety of common MongoDB operations on that data model. You’ve learned how to create indexes and have seen an example of index-based performance improvements through the use of explain(). In addition, you should be able to extract information about the collections and databases on your system, you now know all about the clever $cmd collection, and if you ever need help, you’ve picked up a few tricks for finding your way around.

You can learn a lot by working in the MongoDB shell, but there’s no substitute for the experience of building a real application. That’s why we’re going from a carefree data playground to a real-world data workshop in the next chapter. You’ll see how the drivers work, and then, using the Ruby driver, you’ll build a simple application, hitting MongoDB with some real, live data.
