MongoDB is a distributed, document-oriented NoSQL data store, designed to provide scalable, high-performance data storage. In many scenarios, it can replace a traditional relational database or a key/value data store. The most distinctive feature of Mongo is its powerful query language, whose syntax is somewhat similar to an object-oriented query language.
The following are the features of MongoDB:
We can use R and MongoDB together by installing the following prerequisites:
The following steps install MongoDB on Ubuntu 12.04 and CentOS:
First, we will see the installation steps for Ubuntu.
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
Create the /etc/apt/sources.list.d/mongodb.list file by using the following command:
echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list
sudo apt-get update
sudo apt-get install mongodb-10gen
Now, we will see the installation steps for CentOS.
Create the /etc/yum.repos.d/mongodb.repo file and use one of the following configurations.
For a 64-bit (x86_64) system:
[mongodb]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/
gpgcheck=0
enabled=1
For a 32-bit (i686) system:
[mongodb]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/i686/
gpgcheck=0
enabled=1
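The two repository configurations above differ only in the architecture segment of `baseurl`. As a minimal sketch, the following Python snippet builds the file text for a given machine type. The function name `repo_config` and the rule "anything that is not x86_64 is treated as i686" are assumptions made for illustration only; the URLs are copied from the configurations above.

```python
import platform

# Base URLs copied from the two configurations shown above.
BASE_URLS = {
    "x86_64": "http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/",
    "i686": "http://downloads-distro.mongodb.org/repo/redhat/os/i686/",
}


def repo_config(machine):
    """Return the text of mongodb.repo for the given machine type."""
    # Assumption for this sketch: anything other than x86_64 is treated as i686.
    arch = "x86_64" if machine == "x86_64" else "i686"
    lines = [
        "[mongodb]",
        "name=MongoDB Repository",
        "baseurl=" + BASE_URLS[arch],
        "gpgcheck=0",
        "enabled=1",
    ]
    return "\n".join(lines) + "\n"


# platform.machine() reports the machine type of the current host.
print(repo_config(platform.machine()))
```

The printed text could then be saved as /etc/yum.repos.d/mongodb.repo with root privileges.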
With the following command, install a stable version of MongoDB and the associated tools:
yum install mongo-10gen mongo-10gen-server
Now, you have successfully installed MongoDB.
Useful commands for controlling a mongodb service
To start the mongodb service we use the following command:
sudo service mongodb start
To stop the mongodb service we use the following command:
sudo service mongodb stop
To restart the mongodb service we use the following command:
sudo service mongodb restart
To start a Mongo console we use the following command:
mongo
The following are the mappings of SQL terms to MongoDB terms for better understanding of data storage:
No. | SQL Term | MongoDB Term |
---|---|---|
1. | Database | Database |
2. | Table | Collection |
3. | Index | Index |
4. | Row | Document |
5. | Column | Field |
6. | Joining | Embedding & linking |
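To make the last row of the table concrete, here is a small sketch that models a database as plain Python dictionaries; it does not use MongoDB or any driver. The collection names, the field names, and the helper `follow_link` are all made up for illustration. Embedding stores related data inside the document itself, whereas linking stores only the `_id` of a document in another collection, which then needs a second lookup.

```python
# A toy "database": collection name -> list of documents (plain dicts).
db = {
    "students": [
        {
            "_id": 1,
            "name": "siddharth",
            # Embedding: the address lives inside the student document.
            "address": {"city": "Boston"},
            # Linking: only the _id of a document in another collection is kept.
            "course_id": 10,
        }
    ],
    "courses": [{"_id": 10, "title": "Statistics"}],
}


def follow_link(db, collection, doc_id):
    """Resolve a link by finding the document with the given _id."""
    for doc in db[collection]:
        if doc["_id"] == doc_id:
            return doc
    return None


student = db["students"][0]
# Embedded data is read directly from the document.
print(student["address"]["city"])
# Linked data needs a second lookup in the other collection.
print(follow_link(db, "courses", student["course_id"])["title"])
```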
The following is the mapping of SQL statements to Mongo QL statements, which helps with query development and conversion:
No. | SQL Statement | Mongo QL Statement |
---|---|---|
1. | INSERT INTO students VALUES(1,1) | $db->students->insert(array("a" => 1, "b" => 1)); |
2. | SELECT a, b FROM students | $db->students->find(array(), array("a" => 1, "b" => 1)); |
3. | SELECT * FROM students WHERE age < 15 | $db->students->find(array("age" => array('$lt' => 15))); |
4. | UPDATE students SET a=1 WHERE b='q' | $db->students->update(array("b" => "q"), array('$set' => array("a" => 1))); |
5. | DELETE FROM students WHERE name="siddharth" | $db->students->remove(array("name" => "siddharth")); |
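The query, projection, `$lt`, `$set`, and remove arguments in the table can be easier to follow when run against ordinary data. The sketch below is a toy in-memory model written in plain Python, not MongoDB and not a driver API: the functions `matches`, `find`, `update`, and `remove` and the sample documents are hypothetical, and only equality and `$lt` conditions are handled.

```python
def matches(doc, query):
    """True if doc satisfies every condition (equality or {'$lt': x})."""
    for field, cond in query.items():
        if isinstance(cond, dict):
            # Only the $lt operator from row 3 is modelled here.
            if field not in doc or not doc[field] < cond["$lt"]:
                return False
        elif doc.get(field) != cond:
            return False
    return True


def find(coll, query=None, fields=None):
    """SELECT: filter with query, then keep only the listed fields."""
    out = []
    for doc in coll:
        if matches(doc, query or {}):
            if fields:
                out.append({k: v for k, v in doc.items() if k in fields})
            else:
                out.append(dict(doc))
    return out


def update(coll, query, change):
    """UPDATE ... SET: apply change['$set'] to the first match only."""
    for doc in coll:
        if matches(doc, query):
            doc.update(change["$set"])
            return


def remove(coll, query):
    """DELETE: drop every document that matches query."""
    coll[:] = [doc for doc in coll if not matches(doc, query)]


# Made-up sample data standing in for the students collection.
students = [
    {"name": "siddharth", "age": 14, "a": 0, "b": "q"},
    {"name": "asha", "age": 16, "a": 5, "b": "r"},
]
print(find(students, {}, {"a": 1, "b": 1}))       # row 2
print(find(students, {"age": {"$lt": 15}}))       # row 3
update(students, {"b": "q"}, {"$set": {"a": 1}})  # row 4
remove(students, {"name": "siddharth"})           # row 5
print(students)
```

One difference from SQL is worth noting: a MongoDB update without the multi option changes only the first matching document, which is why `update` here stops after one match, whereas an SQL `UPDATE` changes every matching row.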
To use MongoDB within R, we need to have installed R with the rmongodb library. We can install rmongodb from CRAN via the following command:
# installing library rmongodb in R
install.packages("rmongodb")
We have learned how to install MongoDB in Ubuntu 12.04. Now, we can perform all the necessary operations on our data. In this section, we are going to learn how Mongo data can be handled and imported in R for data analytics activity. For loading the library we use the following command:
# loading the library of rmongodb
library(rmongodb)
# Mongo connection establishment
mongo <- mongo.create()
# Check whether the connection is alive
mongo.is.connected(mongo)
# Create a BSON object cache
buf <- mongo.bson.buffer.create()
# Add element to the object buf
mongo.bson.buffer.append(buf, "name", "Echo")
Objects of the mongo.bson class are used to store BSON documents. BSON is the form that MongoDB uses to store documents in its database. MongoDB network traffic also uses BSON messages:
b <- mongo.bson.from.list(list(name="Fred", age=29, city="Boston"))
iter <- mongo.bson.iterator.create(b) # b is of class "mongo.bson"
while (mongo.bson.iterator.next(iter))
  print(mongo.bson.iterator.value(iter))
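To give a feel for what a BSON document looks like on the wire, the following Python sketch encodes a small document by hand, following the layout in the BSON specification: a little-endian int32 total size, a sequence of elements, and a terminating NUL byte. It is a simplified illustration and not how rmongodb works internally. It handles only strings and 32-bit integers, the function names are hypothetical, and `age` is encoded as a 32-bit integer for simplicity.

```python
import struct


def encode_element(name, value):
    """Encode one field as a BSON element: type byte, C-string name, value."""
    key = name.encode("utf-8") + b"\x00"
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # Type 0x02: int32 byte length (including the trailing NUL), then bytes.
        return b"\x02" + key + struct.pack("<i", len(data)) + data
    if isinstance(value, int):
        # Type 0x10: 32-bit little-endian integer.
        return b"\x10" + key + struct.pack("<i", value)
    raise TypeError("this sketch only handles str and 32-bit int values")


def encode_document(doc):
    """Encode a dict as a BSON document: int32 total size, elements, NUL."""
    body = b"".join(encode_element(k, v) for k, v in doc.items())
    total = 4 + len(body) + 1  # size prefix + elements + terminator
    return struct.pack("<i", total) + body + b"\x00"


raw = encode_document({"name": "Fred", "age": 29, "city": "Boston"})
# The first four bytes hold the total length of the document.
print(len(raw), struct.unpack("<i", raw[:4])[0])
```

Because every document carries its own length prefix, a reader can skip over a document without parsing its fields.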
We will now see how a Mongo data object can be manipulated within R:
# To check whether mongo is connected or not in R.
if (mongo.is.connected(mongo)) {
  ns <- "test.people"

  # Returns a fresh mongo.bson.buffer object ready to have data
  # appended onto it in R.
  buf <- mongo.bson.buffer.create()
  mongo.bson.buffer.append(buf, "name", "Joe")
  criteria <- mongo.bson.from.buffer(buf)

  # mongo.bson.buffer objects are used to build mongo.bson objects.
  buf <- mongo.bson.buffer.create()
  mongo.bson.buffer.start.object(buf, "$inc")
  mongo.bson.buffer.append(buf, "age", 1L)
  mongo.bson.buffer.finish.object(buf)
  objNew <- mongo.bson.from.buffer(buf)

  # increment the age field of the first record matching name "Joe"
  mongo.update(mongo, ns, criteria, objNew)

  # mongo.bson.buffer objects are used to build mongo.bson objects.
  buf <- mongo.bson.buffer.create()
  mongo.bson.buffer.append(buf, "name", "Jeff")
  criteria <- mongo.bson.from.buffer(buf)

  # mongo.bson.buffer objects are used to build mongo.bson objects.
  buf <- mongo.bson.buffer.create()
  mongo.bson.buffer.append(buf, "name", "Jeff")
  mongo.bson.buffer.append(buf, "age", 27L)
  objNew <- mongo.bson.from.buffer(buf)

  # update the entire record to { name: "Jeff", age: 27 }
  # where name equals "Jeff"
  # if such a record exists; otherwise, insert this as a new record
  mongo.update(mongo, ns, criteria, objNew, mongo.update.upsert)

  # do a shorthand update:
  mongo.update(mongo, ns, list(name="John"), list(name="John", age=25))
}
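The three kinds of update used above (an increment, a whole-record replacement with upsert, and a plain replacement) behave differently when no record matches. The following sketch imitates those semantics on a plain Python list so the outcomes can be inspected without a server. It is a toy model, not the rmongodb API: the function `update`, its return strings, and Joe's starting age are made up, and only the `$inc` operator is handled.

```python
def update(coll, criteria, obj_new, upsert=False):
    """Toy update on a list of dicts; acts on the first match only."""
    for i, doc in enumerate(coll):
        if all(doc.get(k) == v for k, v in criteria.items()):
            if "$inc" in obj_new:
                # Operator update: add each amount to the existing field.
                for field, amount in obj_new["$inc"].items():
                    doc[field] = doc.get(field, 0) + amount
            else:
                # No operator: the whole record is replaced by obj_new.
                coll[i] = dict(obj_new)
            return "updated"
    if upsert:
        if "$inc" in obj_new:
            raise NotImplementedError("operator upserts are not modelled")
        # Nothing matched, so obj_new is inserted as a new record.
        coll.append(dict(obj_new))
        return "inserted"
    return "no match"


# Made-up starting data: one record for Joe.
people = [{"name": "Joe", "age": 30}]
print(update(people, {"name": "Joe"}, {"$inc": {"age": 1}}))
print(update(people, {"name": "Jeff"}, {"name": "Jeff", "age": 27}, upsert=True))
print(update(people, {"name": "John"}, {"name": "John", "age": 25}))
print(people)
```

In this model the last call leaves the list unchanged because no record named "John" exists and upsert is not requested, which mirrors why the upsert flag matters for the "Jeff" update.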