2.7.5 NoSQL

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Introducing Big Data | 29

Indeed, one should write SQL queries for them. Also, unlike MapReduce, there is no Reduce

step, although the queries can always do aggregational work. Even though they are really a grid

of multiple databases, they present an interface where one sends them a single query and one

gets back a single result set. Thus, it abstracts the complexity of parallel processing. Typically,

MPP data warehouses are packaged as physical appliances, and typically, they do not use direct

attached storage, but instead they use a more enterprise-oriented network storage scheme.

2.7.5 NoSQL

NoSQL stands for ‘not only SQL’. NoSQL databases are different from relational databases, where

they do not mandate a predened structured model for the data. This makes it a better choice to

store any information that does not render itself easily into the relational table and record format.

The key differentiating characteristics of NoSQL databases are as follows.

• Most NoSQL databases allow the schema from row to row to differ quite substantially.

• They do not have the concept of referential integrity. In fact, most NoSQL databases do not

even support an ACID compliance. From the perspective of consistency, NoSQL databases

do not support immediate consistency in the database. However, by doing that, they do allow

for greater availability. They allow write’s to be buffered and thus, read’s to be less blocked.

NoSQL databases are quite popular with a wide variety of industrial applications. There are four

types of NoSQL databases and they are briey explained as follows.

• Key-value stores: In key-value stores, the data is stored in key-value pairs. Look at the exam-

ple in Figure 2.8. Here, we have a Customer table and an Order table. Look at the Customer

table. For a Row ID of 101, we have a few keys, such as First_Name, Last_Name, Address

and Last_Order_ID. Then, there is another row with an ID of 102.

FIGURE 2.8 Example of key-value stores

Database

Table: Customers

Table: Orders

Row ID: 101

First_Name: John

Last_Name: Doe

Address: 123 Park Street

Last_Order_ID: 1701

Row ID: 102

First_Name: Jane

Last_Name: Doe

Address: 456 Green Street

Last_Order_ID: 1702

Row ID: 1701

Price: 1000 USD

Item_ID: 2345

Item_ID: 7890

Row ID: 1702

Price: 700 USD

Item_ID: 4321

Item_ID: 5446

M02 Big Data Simplified XXXX 01.indd 29 5/10/2019 9:56:53 AM

30 | Big Data Simplied

• Document stores: In this case, instead of having rows, there are documents. Conceptually,

the documents in a Document Store are similar to rows. The documents contain key and

value pairs. The slight difference here is that the value of a key can itself be a document.

The documents can be JavaScript objects and they are encoded using JavaScript Object

Notation or JSON. JavaScript language ends up being used as the internal language for these

databases as well. The documents can also be in XML or other semi-structured formats. All

these technologies are extremely familiar to a web developer. As such, the document stores

tend to be highly used in web applications, with the usage of this technology, the static con-

tent in a website and the actual data-driven content end up having a lot in common.

As depicted in the example in Figure 2.9, Customers and Orders are stand-alone contain-

ers that have no overall parent. In this case, they are called databases. Instead of containing

rows, they contain documents. The Customer document 101 has an Address key, and the

value of that key is itself a document, with keys and values for House Number and Street

Name. You will also see that the Orders key has a document as its value, and that document,

in turn, has a key called Last_Order_ID. The value of that key points to a document in

another database, in this case, the Order document 1701 in the Orders database.

• Graph databases: They are primarily used for tracking relationships and therefore, such

data is considered good for social media applications where friends, networks and their

comments that relate to specific status messages all need to be tracked and analysed. They

really focus on the relationships between entities, rather than especially being focused on

FIGURE 2.9 Example of a document store

Database: Customers Database: Orders

Document ID: 101

First_Name: John

Second_Name: Doe

Address:

House_No: 123

Street_Name: Park Street

Orders:

Last_Order_ID: 1701

Document ID: 102

First_Name: Jane

Second_Name: Doe

Address:

House_No: 456

Street_Name: Green Street

Orders:

Last_Order_ID: 1702

Document ID: 1701

Price: 1000 USD

Item_ID: 2345

Item_ID: 7890

Document ID: 1702

Price: 700 USD

Item_ID: 4321

Item_ID: 5446

M02 Big Data Simplified XXXX 01.indd 30 5/10/2019 9:56:53 AM

Introducing Big Data | 31

the representation of the entities themselves. Graph databases tend to be extremely useful

for social media applications, for example, to track relationships between people in social

networks.

As depicted in the example in Figure 2.10, there is a database with a number of objects

call nodes. There is a node for John Smith. There is another node which has four different

properties in it, namely, House No, Street Name, City, State and Zip. Between two nodes

there is an edge. The edge is given a relationship name of Address. This is a kind of edge that

represents a property. So, John Smith has an address, and the address in turn has four prop-

erties and values of its own. But John Smith can also have a simple relationship to another

node in the database that represents another person. In this case, John Smith is a friend of

John Doe. John Smith has also sent an invitation to Jane Doe in his social network. Also,

John Doe has commented on a photo by Jane Smith in a social network. So, as observed in

the diagram, there are heterogeneous types of data which can be related through the edges.

There can also be a hierarchical relationship. In this case, John Smith placed an order, and

that order has its own properties and values as well. Evidently, this type of database is highly

appropriate for social media applications.

• Wide column stores: It is found that, HDFS based databases like HBase and Cassandra are

wide column or column family stores, and so the column family store category is used in Big

Data scenarios frequently. Let us now take a look at wide column stores with an example.

FIGURE 2.10 Example of a graph database

Database

House No: 123

Street: Park Street

City: New York

State: NY

Zip Code: 10014

Jane Smith

John Doe

John Smith

Jane Doe

Order ID: 1701

Total Price: 1000

USD

Item ID: 2345

Type: Dress

Colour: Red

Item ID: 7890

Type: Shirt

Colour: Blue

Address

Friend

Commented

on a photo

Sent a friend

request

Placed an

onine order

Order

contains

Order

contains

M02 Big Data Simplified XXXX 01.indd 31 5/10/2019 9:56:54 AM

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 2.7.5 NoSQL

Create new playlist

Sign In

Sign Up

Table of Contents for
2.7.5 NoSQL