Now that you know the basic set of MongoDB objects, let’s discuss what you can do with them through MongoDB functionality. We’ll provide an overview to the four most common functions (adding, querying, updating, and deleting) as well as related operators. Because this is a book on how to do data modeling for MongoDB databases, and not a book on how to use MongoDB, we cover core functionality, but leave it to the MongoDB user guide and other resources on the market to cover the more advanced functionality (see Appendix B for a full list of MongoDB references).
Adding Data in MongoDB
The insert( ) function adds a document or an array of documents to a collection. The general structure for an insert( ) statement is:
There are three parts to the insert( ) function:
For example, to add a new book document into the book collection, we can execute this statement:
db.book.insert( { titleName : “Data Modeling for MongoDB” } )
If the book collection does not exist, it will be created when adding this first document. The insert statement will also lead to the creation of an ObjectId field to identify this new document.
Multiple documents can be added to a collection with one command by using an array:
db.book.insert( [ { titleName : “Data Modeling Made Simple” },
{ titleName : “Extreme Scoping” },
{ titleName : “UML Database Modeling Workbook” }
] )
To see what we just created, use the find command, which we’ll cover in the next section:
db.book.find( )
And this is what is returned:
{ “_id” : ObjectId(“530f8be4d77a08823086017c”), “titleName” : “Data Modeling for MongoDB” }
{ “_id” : ObjectId(“530f8be4d77a08823086017d”), “titleName” : “Data Modeling Made Simple” }
{ “_id” : ObjectId(“530f8be4d77a08823086017e”), “titleName” : “Extreme Scoping” }
{ “_id” : ObjectId(“530f8be4d77a08823086017f”), “titleName” : “UML Database Modeling Workbook” }
Note the ObjectIds were automatically created for us. We could optionally specify values for each ObjectId field. ObjectId is a default that’s good for guaranteeing uniqueness. However, if there are already unique keys in your data, you can use these unique keys for ObjectId values and this will save you an extra index.
The previous example illustrates when a document just contains the ObjectID and the book’s title. But what if we have more fields such as the three books in this spreadsheet?
Title Name |
Subtitle Name |
Page Count |
Categories |
Extreme Scoping |
An Agile Approach to Enterprise Data Warehousing and Business Intelligence |
300 |
Agile, Business Intelligence |
Business unIntelligence |
Insight and Innovation beyond Analytics and Big Data |
442 |
Data Warehouse, Business Intelligence |
Secrets of Analytical Leaders |
268 |
Analytics |
Here is the statement to add these three new titles to our book collection:
db.book.insert( [ { titleName : “Extreme Scoping”,
subtitleName : “An Agile Approach to Enterprise Data Warehousing and Business Intelligence”,
pageCount : 300,
categories : [ “Agile”, “Business Intelligence” ]
},
{
titleName : “Business unIntelligence”,
subtitleName : “Insight and Innovation beyond Analytics and Big Data”,
pageCount : 442,
categories : [ “Data Warehouse”, “Business Intelligence” ]
},
{
titleName : “Secrets of Analytical Leaders”,
pageCount : 268,
categories : [ “Analytics” ]
} ] )
Note that not all documents have to contain all fields since the third book does not have a subtitle. Also note how easy it is to add multiple values for a single field through an array as in adding the books’ categories. Let’s do another find( ) and see what comes back:
db.book.find( )
And this is what is returned:
{ “_id” : ObjectId(“530f8be4d77a08823086017c”), “titleName” : “Data Modeling for MongoDB” }
{ “_id” : ObjectId(“530f8be4d77a08823086017d”), “titleName” : “Data Modeling Made Simple” }
{ “_id” : ObjectId(“530f8be4d77a08823086017e”), “titleName” : “Extreme Scoping” }
{ “_id” : ObjectId(“530f8be4d77a08823086017f”), “titleName” : “UML Database Modeling Workbook” }
{ “_id” : ObjectId(“530f8be4d77a08823086017g”), “titleName” : “Extreme Scoping”, “subtitleName” : “An Agile Approach to Enterprise Data Warehousing and Business Intelligence”, “pageCount” : 300, “categories” : [ “Agile”, “Business Intelligence” ] }
{ “_id” : ObjectId(“530f8be4d77a08823086017h”), “titleName” : “Business unIntelligence”, “subtitleName” : “Insight and Innovation beyond Analytics and Big Data”, “pageCount” : 442, “categories” : [ “Data Warehouse”, “Business Intelligence” ] }
{ “_id” : ObjectId(“530f8be4d77a08823086017i”), “titleName” : “Secrets of Analytical Leaders”, “pageCount” : 268, “categories” : [ “Analytics” ] }
Notice that the Extreme Scoping book now exists twice in the collection. If this is indeed the same book we will need to remove the duplicate, which we will do shortly with remove( ).
Querying Data in MongoDB
The find( ) function displays documents from a collection. The general structure for a find( ) statement is:
There are three parts to the find( ) function:
So if we would like to display all books in our book collection, we can execute:
db.book.find( )
If we would like to bring back a specific book in our collection, we can execute:
db.book.find( { titleName : “Data Modeling for MongoDB” } )
If we would like to display a specific book in our collection and only the titleName field (along with _id), we can execute:
db.book.find( { titleName : “Data Modeling for MongoDB” },
{ titleName : 1 } )
The 1 after a field name tells MongoDB we would like this field displayed. To exclude a field, use a 0. So if we would like all of the fields displayed except the titleName field, we can execute:
db.book.find( { titleName : “Data Modeling for MongoDB” }, { titleName : 0 } )
If there is more than one field in the find clause, there is an implied “AND” between the conditions. That is, all conditions must be true. So, for example, to find the book titled Extreme Scoping, which contains 300 pages, we can write this query:
db.book.find( { titleName : “Extreme Scoping”, pageCount : 300 } )
You can include any of the following operators as well as other operators described in the MongoDB reference manual:
For example, if we would like to find all of the books in this list that are over 275 pages:
Extreme Scoping, 300 pages
Data Engineering, 100 pages
Business unIntelligence, 442 pages
Secrets of Analytical Leaders, 268 pages
We can run this query:
db.book.find( { pageCount : { $gt : 275 } } )
And this would be returned:
Extreme Scoping, 300 pages
Business unIntelligence, 442 pages
We can also add functions to the end of the find statement for further query refinement. If we wanted to sort the documents returned by titleName, we can run this query:
db.book.find( { pageCount : { $gt : 275 } } ).sort( {titleName : 1 } )
And this would be returned:
Business unIntelligence, 442 pages
Extreme Scoping, 300 pages
Note that the 1 in the sort command indicates ascending order and -1 indicates descending order.
If we wanted to count the number of documents returned in this same find( ) query, we can execute this statement:
db.book.find( { pageCount : { $gt : 275 } } ).count( )
This statement would return 2 for the two records returned.
EXERCISE 3: Interpreting Queries
Based on the sample dataset below from the Title collection, what would each of the following two queries return? See Appendix A for the answer.
Title Name |
Page Count |
Publication Year |
Author |
Amazon Review |
Extreme Scoping |
300 |
2013 |
Larissa Moss |
4.5 |
Data Engineering |
100 |
2013 |
Brian Shive |
4.25 |
Business unIntelligence |
442 |
2013 |
Barry Devlin, PhD |
5 |
Data Modeling Made Simple |
250 |
2009 |
Steve Hoberman |
4.35 |
db.title.find( { $or : [ { pageCount : { $gt : 200 } },
{ publicationYear : { $lt : 2010 } } ] } )
db.title.find( { authorName : { $ne : “Shive” } } )
Updating Data in MongoDB
The update( ) function modifies one or more documents from a collection. The general structure for a update( ) statement is:
There are three parts to the update( ) function:
So, if we would like to update the title for Data Modeling for MongoDB, we can execute this statement:
db.book.update( { titleName : “Data Modeling for MongoDB” },
{ titleName : “Fifty Shades of Data Modeling” } )
Because we left off the multi and upsert parameters, we would only change the first occurrence of this title, and if this title did not exist in our collection, we would not insert this title as a new document.
If we wanted to update all documents that meet this criteria, we can execute this statement:
db.book.update( { titleName : “Data Modeling for MongoDB” },
{ titleName : “Fifty Shades of Data Modeling” }, { multi: true } )
Note that the basic update function is designed for addressing the situation when what you are searching for and what you are changing are the same fields. If we wanted to change a different field than the one being searched for, you will need to use the $set command:
db.book.update ( {_id : ObjectId(“530f8be4d77a08823086017d”) },
{ “$set” : { titleName : “Fifty Shades of Data Modeling” } } )
What would happen if we wanted to update a field but the field did not exist? The solution is to again use $set. $set sets the value of a field. If the field does not yet exist, it will be created. If the field does exist, it will be updated with the value specified by $set. This can be handy for updating schema or adding user-defined fields. For example, if we wanted to update the edition but edition did not exist, we can run this statement:
db.book.update ( { titleName : “Fifty Shades of Data Modeling” },
{ “$set” : { edition : 2 } } )
Now this document will have an edition field:
db.book.find( { titleName : “Fifty Shades of Data Modeling” } )
{ “_id” : ObjectId(“530f8be4d77a08823086017d”), “titleName” : “Fifty Shades of Data Modeling”, “edition” : 2 }
If we decided that this book is really the third edition and not the second edition, $set can be used again to change the value:
db.book.update( { titleName : “Fifty Shades of Data Modeling” },
{ “$set” : { edition : 3 } } )
$set can even change the type of the field it modifies. For instance, if we decide that we would like to store the character string three instead of 3, we can run this statement:
db.book.update ( { titleName : “Fifty Shades of Data Modeling” },
{ “$set” : { edition : “three” } } )
If we decide the edition field really isn’t needed anyway, we can remove this field altogether with $unset:
db.book.update ( { titleName : “Fifty Shades of Data Modeling” }, { “$unset” : { edition : “three” } } )
The $inc modifier can be used to increment the value of an existing field and, similar to $set, can be used to create a new field if it does not already exist. For example, let’s say we would like to add a field called bookTotalCopiesSoldCount, which specifies how many copies of a title have been sold since the book was first published. We can run this statement:
db.book.update ( { titleName : “Fifty Shades of Data Modeling” },
{ “$inc” : { bookTotalCopiesSoldCount : 3000 } } )
If we look at the document after this update, we’ll see the following:
db.book.find ( { titleName : “Fifty Shades of Data Modeling” } )
{ “_id” : ObjectId(“530f8be4d77a08823086017d”), “titleName” : “Fifty Shades of Data Modeling”, “bookTotalCopiesSoldCount” : 3000 }
Next month, after 100 more copies of this title are sold, we can run this statement to add 100 more copies to the total:
db.book.update ( { titleName : “Fifty Shades of Data Modeling” },
{ “$inc” : { bookTotalCopiesSoldCount : 100 } } )
If we look at the document after this update, we’ll see the following:
db.book.find( { titleName : “Fifty Shades of Data Modeling” } )
{ “_id” : ObjectId(“530f8be4d77a08823086017d”), “titleName” : “Fifty Shades of Data Modeling”, “bookTotalCopiesSoldCount” : 3100 }
There are many more ways to update data and especially ways to update data in arrays. Please refer to the MongoDB user guide for more information.
Deleting Data in MongoDB
The remove( ) function deletes one or more documents from a collection. The general structure for a remove( ) statement is:
There are three parts to the remove( ) function:
To remove all of the documents within the book collection, we can execute the remove( ) statement without any parameters:
db.book.remove( )
To remove a particular book, we can execute this statement:
db.book.remove( { titleName : “Fifty Shades of Data Modeling” } )
Note that if we had five books with the same title, the above statement will remove all five of them.
Recall earlier that we added the book Extreme Scoping twice in our collection:
{ _id : ObjectId(“530f8be4d77a08823086017e”), titleName : “Extreme Scoping” }
{ _id : ObjectId(“530f8be4d77a08823086017g”), titleName : “Extreme Scoping”, “subtitleName” : “An Agile Approach to Enterprise Data Warehousing and Business Intelligence”, pageCount : 300, categories : [ “Agile”, “Business Intelligence” ] }
We can now remove the duplicate:
db.book.remove( { titleName : “Extreme Scoping” } )
But don’t hit the enter button just yet! This will remove both occurrences of Extreme Scoping, and we only want to remove the first one. Executing the following statement would remove just the first one:
db.book.remove( { _id : ObjectId(“530f8be4d77a08823086017e”) } )
Instead of running the above statement, we could also just remove the first occurrence of Extreme Scoping by running this statement:
db.book.remove( { titleName : “Extreme Scoping” }, 1 )
Using 1 or true as the second parameter in the remove statement deletes only the first occurrence. Using 0 or false or leaving this second parameter off completely deletes all of the documents that meet the specified criteria.
EXERCISE 4: MongoDB Functionality
Below is some sample data from the Title entity. Write the MongoDB statements to insert this data into the collection Title. Then remove FruITion. Then update the page count for Data Quality Assessment to 350. Then view all of the data. See Appendix A for the answer.
Title Name |
Page Count |
Publication Year |
Author |
FruITion |
100 |
2010 |
Chris Potts |
Data Quality Assessment |
400 |
2009 |
Arkady Maydanchik |
Data Modeling Theory and Practice |
300 |
2008 |
Graeme Simsion |
Key Points
|