Chapter 4
MongoDB Functionality

Now that you know the basic set of MongoDB objects, let’s discuss what you can do with them through MongoDB functionality. We’ll provide an overview to the four most common functions (adding, querying, updating, and deleting) as well as related operators. Because this is a book on how to do data modeling for MongoDB databases, and not a book on how to use MongoDB, we cover core functionality, but leave it to the MongoDB user guide and other resources on the market to cover the more advanced functionality (see Appendix B for a full list of MongoDB references).

Adding Data in MongoDB

The insert( ) function adds a document or an array of documents to a collection. The general structure for an insert( ) statement is:

There are three parts to the insert( ) function:

  • Database. We need to identify which database we would like to add documents to. When we first start the MongoDB client, there is a variable called db that is assigned to represent the database for the current connection. Variables are pointers to the actual data they are assigned. Think of db as the pointer to your database. By default, db points to the test database. To see this yourself, type db at the command prompt and hit <Enter>.
  • Collection. We need to identify the collection within this database we would like to add documents to.
  • Documents. We need to list the one or more documents that will be added to the collection. We can add one document, or we can add many documents using an array.

For example, to add a new book document into the book collection, we can execute this statement:

db.book.insert( { titleName : “Data Modeling for MongoDB” } )

If the book collection does not exist, it will be created when adding this first document. The insert statement will also lead to the creation of an ObjectId field to identify this new document.

Multiple documents can be added to a collection with one command by using an array:

db.book.insert( [ { titleName : “Data Modeling Made Simple” },

{ titleName : “Extreme Scoping” },

{ titleName : “UML Database Modeling Workbook” }

] )

To see what we just created, use the find command, which we’ll cover in the next section:

db.book.find( )

And this is what is returned:

{ “_id” : ObjectId(“530f8be4d77a08823086017c”), “titleName” : “Data Modeling for MongoDB” }

{ “_id” : ObjectId(“530f8be4d77a08823086017d”), “titleName” : “Data Modeling Made Simple” }

{ “_id” : ObjectId(“530f8be4d77a08823086017e”), “titleName” : “Extreme Scoping” }

{ “_id” : ObjectId(“530f8be4d77a08823086017f”), “titleName” : “UML Database Modeling Workbook” }

Note the ObjectIds were automatically created for us. We could optionally specify values for each ObjectId field. ObjectId is a default that’s good for guaranteeing uniqueness. However, if there are already unique keys in your data, you can use these unique keys for ObjectId values and this will save you an extra index.

The previous example illustrates when a document just contains the ObjectID and the book’s title. But what if we have more fields such as the three books in this spreadsheet?

Title Name

Subtitle Name

Page Count

Categories

Extreme Scoping

An Agile Approach to Enterprise Data Warehousing and Business Intelligence

300

Agile, Business Intelligence

Business unIntelligence

Insight and Innovation beyond Analytics and Big Data

442

Data Warehouse, Business Intelligence

Secrets of Analytical Leaders

268

Analytics

Here is the statement to add these three new titles to our book collection:

db.book.insert( [  { titleName : “Extreme Scoping”,   

subtitleName : “An Agile Approach to Enterprise Data Warehousing and Business Intelligence”,

pageCount : 300,

categories : [ “Agile”, “Business Intelligence” ]

},

{

titleName : “Business unIntelligence”,

subtitleName : “Insight and Innovation beyond Analytics and Big Data”,

pageCount : 442,

categories : [ “Data Warehouse”, “Business Intelligence” ]

},

{

titleName : “Secrets of Analytical Leaders”,

pageCount : 268,

categories : [ “Analytics” ]

} ] )

Note that not all documents have to contain all fields since the third book does not have a subtitle. Also note how easy it is to add multiple values for a single field through an array as in adding the books’ categories. Let’s do another  find( ) and see what comes back:

db.book.find( )

And this is what is returned:

{ “_id” : ObjectId(“530f8be4d77a08823086017c”), “titleName” : “Data Modeling for MongoDB” }

{ “_id” : ObjectId(“530f8be4d77a08823086017d”), “titleName” : “Data Modeling Made Simple” }

{ “_id” : ObjectId(“530f8be4d77a08823086017e”), “titleName” : “Extreme Scoping” }

{ “_id” : ObjectId(“530f8be4d77a08823086017f”), “titleName” : “UML Database Modeling Workbook” }

{ “_id” : ObjectId(“530f8be4d77a08823086017g”), “titleName” : “Extreme Scoping”, “subtitleName” : “An Agile Approach to Enterprise Data Warehousing and Business Intelligence”, “pageCount” : 300, “categories” : [ “Agile”, “Business Intelligence” ] }

{ “_id” : ObjectId(“530f8be4d77a08823086017h”), “titleName” : “Business unIntelligence”, “subtitleName” : “Insight and Innovation beyond Analytics and Big Data”, “pageCount” : 442, “categories” : [ “Data Warehouse”, “Business Intelligence” ] }

{ “_id” : ObjectId(“530f8be4d77a08823086017i”), “titleName” : “Secrets of Analytical Leaders”, “pageCount” : 268, “categories” : [ “Analytics” ] }

Notice that the Extreme Scoping book now exists twice in the collection. If this is indeed the same book we will need to remove the duplicate, which we will do shortly with remove( ).

Querying Data in MongoDB

The find( ) function displays documents from a collection. The general structure for a find( ) statement is:

There are three parts to the find( ) function:

  • Database. We need to identify which database we are querying. More accurately, the variable pointing to the database we are querying, such as in this case db.
  • Collection. We need to identify the collection within this database we are querying.
  • Parameters. We need to identify the documents and fields within this collection we would like displayed. The first parameter contains the criteria for displaying the documents, including optionally using operators to bring back one or more documents. The second parameter called “projection” specifies which fields to display. In RDBMS terms, the first parameter specifies the rows and the second specifies the columns. Note that if we leave out both parameters, all documents will be displayed (up to the first 20 and then more can be displayed by typing it).

So if we would like to display all books in our book collection, we can execute:

db.book.find( )

If we would like to bring back a specific book in our collection, we can execute:

db.book.find( { titleName : “Data Modeling for MongoDB” } )

If we would like to display a specific book in our collection and only the titleName field (along with _id), we can execute:

db.book.find( { titleName : “Data Modeling for MongoDB” },

{ titleName : 1 } )

The 1 after a field name tells MongoDB we would like this field displayed. To exclude a field, use a 0. So if we would like all of the fields displayed except the titleName field, we can execute:

db.book.find( { titleName : “Data Modeling for MongoDB” }, { titleName : 0 } )

If there is more than one field in the find clause, there is an implied “AND” between the conditions. That is, all conditions must be true. So, for example, to find the book titled Extreme Scoping, which contains 300 pages, we can write this query:

db.book.find( { titleName : “Extreme Scoping”, pageCount : 300 } )

You can include any of the following operators as well as other operators described in the MongoDB reference manual:

  • Greater than: $gt
  • Great than or equal to: $gte
  • Less than: $lt
  • Less than or equal to: $lte
  • Not equal to: $ne
  • Match any or all conditions: $or
  • Match all conditions: $and
  • Match all conditions not specified: $not
  • Match any of the values that exist in an array: $in
  • Match all documents that contain a specified field: $exists
  • Increment the value of a field by a certain amount: $inc

For example, if we would like to find all of the books in this list that are over 275 pages:

Extreme Scoping, 300 pages

Data Engineering, 100 pages

Business unIntelligence, 442 pages

Secrets of Analytical Leaders, 268 pages

We can run this query:

db.book.find( { pageCount : { $gt : 275 } } )

And this would be returned:

Extreme Scoping, 300 pages

Business unIntelligence, 442 pages

We can also add functions to the end of the find statement for further query refinement. If we wanted to sort the documents returned by titleName, we can run this query:

db.book.find( { pageCount : { $gt : 275 } } ).sort( {titleName : 1 } )

And this would be returned:

Business unIntelligence, 442 pages

Extreme Scoping, 300 pages

Note that the 1 in the sort command indicates ascending order and -1 indicates descending order.

If we wanted to count the number of documents returned in this same find( ) query, we can execute this statement:

db.book.find( { pageCount : { $gt : 275 } } ).count( )

This statement would return 2 for the two records returned.

EXERCISE 3: Interpreting Queries

Based on the sample dataset below from the Title collection, what would each of the following two queries return? See Appendix A for the answer.

Title Name

Page Count

Publication Year

Author

Amazon Review

Extreme Scoping

300

2013

Larissa Moss

4.5

Data Engineering

100

2013

Brian Shive

4.25

Business unIntelligence

442

2013

Barry Devlin, PhD

5

Data Modeling Made Simple

250

2009

Steve Hoberman

4.35

db.title.find( { $or : [ { pageCount : { $gt : 200 } },

{ publicationYear : { $lt : 2010 } } ] } )

db.title.find( { authorName : { $ne : “Shive” } } )

Updating Data in MongoDB

The update( ) function modifies one or more documents from a collection. The general structure for a update( ) statement is:

There are three parts to the update( ) function:

  • Database. We need to identify which database we are updating. More accurately, the variable pointing to the database we are updating, such as in this case db.
  • Collection. We need to identify the collection within this database we are updating.
  • Parameters. We need to identify the documents within this collection we would like changed. The first parameter contains the criteria for finding the documents we would like changed, which is identical to the syntax in the find( ) function because we are querying for documents. The second parameter, called “changes,” specifies how we would like to change the documents we find. We may decide to modify the values in one or more fields. The third parameter contains two indicators that can appear in any order: upsert and multi. Upsert means “update else insert,” meaning if this indicator is set to true or 1 and the criteria does not bring back any documents in the collection to update, then a new document will be created. If we leave out the upsert parameter or set it to false or 0, a new document will not be created. Multi is an indicator that, if set to true or 1, tells MongoDB to update all of the documents that meet the criteria. If we leave out the multi parameter or set it to false or 0, only the first document encountered will be updated.

So, if we would like to update the title for Data Modeling for MongoDB, we can execute this statement:

db.book.update( { titleName : “Data Modeling for MongoDB” },

{ titleName : “Fifty Shades of Data Modeling” } )

Because we left off the multi and upsert parameters, we would only change the first occurrence of this title, and if this title did not exist in our collection, we would not insert this title as a new document.

If we wanted to update all documents that meet this criteria, we can execute this statement:

db.book.update( { titleName : “Data Modeling for MongoDB” },

{ titleName : “Fifty Shades of Data Modeling” }, { multi: true } )

Note that the basic update function is designed for addressing the situation when what you are searching for and what you are changing are the same fields. If we wanted to change a different field than the one being searched for, you will need to use the $set command:

db.book.update ( {_id : ObjectId(“530f8be4d77a08823086017d”) },

{ “$set” : { titleName : “Fifty Shades of Data Modeling” } } )

What would happen if we wanted to update a field but the field did not exist? The solution is to again use $set. $set sets the value of a field. If the field does not yet exist, it will be created. If the field does exist, it will be updated with the value specified by $set. This can be handy for updating schema or adding user-defined fields. For example, if we wanted to update the edition but edition did not exist, we can run this statement:

db.book.update ( { titleName : “Fifty Shades of Data Modeling” },

{ “$set” : { edition : 2 } } )

Now this document will have an edition field:

db.book.find( { titleName : “Fifty Shades of Data Modeling” } )

{ “_id” : ObjectId(“530f8be4d77a08823086017d”), “titleName” : “Fifty Shades of Data Modeling”, “edition” : 2 }

If we decided that this book is really the third edition and not the second edition, $set can be used again to change the value:

db.book.update( { titleName : “Fifty Shades of Data Modeling” },

{ “$set” : { edition : 3 } } )

$set can even change the type of the field it modifies. For instance, if we decide that we would like to store the character string three instead of 3, we can run this statement:

db.book.update ( { titleName : “Fifty Shades of Data Modeling” },

{ “$set” : { edition : “three” } } )

If we decide the edition field really isn’t needed anyway, we can remove this field altogether with $unset:

db.book.update ( { titleName : “Fifty Shades of Data Modeling” }, { “$unset” : { edition : “three” } } )

The $inc modifier can be used to increment the value of an existing field and, similar to $set, can be used to create a new field if it does not already exist. For example, let’s say we would like to add a field called bookTotalCopiesSoldCount, which specifies how many copies of a title have been sold since the book was first published. We can run this statement:

db.book.update ( { titleName : “Fifty Shades of Data Modeling” },

{ “$inc” : { bookTotalCopiesSoldCount : 3000 } } )

If we look at the document after this update, we’ll see the following:

db.book.find ( { titleName : “Fifty Shades of Data Modeling” } )

{ “_id” : ObjectId(“530f8be4d77a08823086017d”), “titleName” : “Fifty Shades of Data Modeling”, “bookTotalCopiesSoldCount” : 3000 }

Next month, after 100 more copies of this title are sold, we can run this statement to add 100 more copies to the total:

db.book.update ( { titleName : “Fifty Shades of Data Modeling” },

{ “$inc” : { bookTotalCopiesSoldCount : 100 } } )

If we look at the document after this update, we’ll see the following:

db.book.find( { titleName : “Fifty Shades of Data Modeling” } )

{ “_id” : ObjectId(“530f8be4d77a08823086017d”), “titleName” : “Fifty Shades of Data Modeling”, “bookTotalCopiesSoldCount” : 3100 }

There are many more ways to update data and especially ways to update data in arrays. Please refer to the MongoDB user guide for more information.

Deleting Data in MongoDB

The remove( ) function deletes one or more documents from a collection. The general structure for a remove( ) statement is:

There are three parts to the remove( ) function:

  • Database. We need to identify which database we are removing data from. More accurately, the variable pointing to the database, such as in this case db.
  • Collection. We need to identify the collection we are removing data from.
  • Parameters. We need to identify the criteria for the documents that need to be removed. Setting the justOne parameter to true or 1 removes just the first occurrence of the criteria. If we leave out the justOne parameter or set it to false or 0, all documents that meet the criteria will be removed.  

To remove all of the documents within the book collection, we can execute the remove( ) statement without any parameters:

db.book.remove( )

To remove a particular book, we can execute this statement:

db.book.remove( { titleName : “Fifty Shades of Data Modeling” } )

Note that if we had five books with the same title, the above statement will remove all five of them.

Recall earlier that we added the book Extreme Scoping twice in our collection:

{ _id : ObjectId(“530f8be4d77a08823086017e”), titleName : “Extreme Scoping” }

{ _id : ObjectId(“530f8be4d77a08823086017g”), titleName : “Extreme Scoping”, “subtitleName” : “An Agile Approach to Enterprise Data Warehousing and Business Intelligence”, pageCount : 300, categories : [ “Agile”, “Business Intelligence” ] }

We can now remove the duplicate:

db.book.remove( { titleName : “Extreme Scoping” } )

But don’t hit the enter button just yet! This will remove both occurrences of Extreme Scoping, and we only want to remove the first one. Executing the following statement would remove just the first one:

db.book.remove( { _id : ObjectId(“530f8be4d77a08823086017e”) } )

Instead of running the above statement, we could also just remove the first occurrence of Extreme Scoping by running this statement:

db.book.remove( { titleName : “Extreme Scoping” }, 1 )

Using 1 or true as the second parameter in the remove statement deletes only the first occurrence. Using 0 or false or leaving this second parameter off completely deletes all of the documents that meet the specified criteria.

EXERCISE 4: MongoDB Functionality

Below is some sample data from the Title entity. Write the MongoDB statements to insert this data into the collection Title. Then remove FruITion. Then update the page count for Data Quality Assessment to 350. Then view all of the data. See Appendix A for the answer.

Title Name

Page Count

Publication Year

Author

FruITion

100

2010

Chris Potts

Data Quality Assessment

400

2009

Arkady Maydanchik

Data Modeling Theory and Practice

300

2008

Graeme Simsion

Key Points

  • To add one or more documents to a collection, use the insert( ) function.
  • To retrieve one or more documents from a collection, use the find( ) function.
  • To modify one or more documents in a collection, use the update( ) function.
  • To delete one or more documents in a collection, use the delete( ) function.
  • Although the basic functionality was covered in this chapter, there is a lot more to learn about MongoDB including more functions, more operators, and knowing the efficiency tradeoffs for each option.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset