Chapter 1 – Introduction to the Kognitio Architecture

“I saw the angel in the marble and carved until I set him free.”

– Michelangelo

What is Parallel Processing?

“After enlightenment, the laundry”

- Zen Proverb

image

“After parallel processing the laundry, enlightenment!”

- Kognitio Zen Proverb

Two guys were having fun on a Saturday night when one said, “I’ve got to go and do my laundry.” The other said, “What?!” The man explained that if he went to the laundromat the next morning, he would be lucky to get one machine and would be there all day. But if he went on Saturday night, he could get all the machines. Then he could do all his washing and drying in two hours. Now that’s parallel processing mixed in with a little dry humor!

The Basics of a Single Computer

image

“When you are courting a nice girl, an hour seems like a second. When you sit on a red-hot cinder, a second seems like an hour. That’s relativity.”

– Albert Einstein

Data on disk does absolutely nothing. When data is requested, the computer moves the data one block at a time from disk into memory. Once the data is in memory, it is processed by the CPU at lightning speed. All computers work this way. The "Achilles Heel" of every computer is the slow process of moving data from disk to memory. The real theory of relativity is to find out how to get blocks of data from the disk into memory faster!

Data in Memory is fast as Lightning

image

“You can observe a lot by watching.”

– Yogi Berra

Once the data block is moved off of the disk and into memory, the processing of that block happens as fast as lightning. It is the movement of the block from disk into memory that slows down every computer. Data being processed in memory is so fast that even Yogi Berra couldn't catch it!

Parallel Processing Of Data

image

"If the facts don't fit the theory, change the facts."

- Albert Einstein

Big Data is all about parallel processing. Parallel processing is all about taking the rows of a table and spreading them among many parallel processing units. In Kognitio, these parallel processing units are referred to as Nodes. Above, we can see a table called Orders. There are 16 rows in the table, spread across four Nodes, so each Node holds four rows. Now the Nodes can process the data in parallel and be up to four times as fast. What Albert Einstein meant to say was, “If the theory doesn't fit the dimension table, change it to a fact.” Each Node shares nothing with the others and holds a portion of every table.
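The Orders example can be sketched in a few lines of Python. This is only an illustration of the idea, not Kognitio code: the row values, the `NODE_COUNT` name, and the `node_sum` helper are all made up for the example, and Python threads stand in for Nodes to show the pattern rather than real speed.

```python
from concurrent.futures import ThreadPoolExecutor

NODE_COUNT = 4  # assumption: four Nodes, since 16 rows / 4 rows per Node = 4

# Hypothetical Orders table: 16 rows of (order_id, order_total).
orders = [(order_id, 100 * order_id) for order_id in range(1, 17)]

# Spread the rows so that each Node holds its own slice (shared nothing).
nodes = [orders[i::NODE_COUNT] for i in range(NODE_COUNT)]


def node_sum(rows):
    """Work done by one Node: total only the rows it holds."""
    return sum(order_total for _, order_total in rows)


# Every Node works on its own slice at the same time.
with ThreadPoolExecutor(max_workers=NODE_COUNT) as pool:
    partial_sums = list(pool.map(node_sum, nodes))

# A final step combines the per-Node results into one answer.
grand_total = sum(partial_sums)

print([len(rows) for rows in nodes])  # rows held by each Node
print(grand_total)
```

Each Node only ever touches its own four rows, and the small combining step at the end is all that needs to see every Node's result.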

Kognitio is an In-Memory System

image

Kognitio distributes the rows of every table equally across each and every node.

The tables are loaded into memory so they can be queried at lightning speeds.

The combination of parallel processing and in-memory technology makes Kognitio one of the fastest systems in the world for processing large amounts of data.

Kognitio pins tables in memory so when they are queried the speeds are incredible. The Achilles Heel of every computer is in moving data from disk into memory, but Kognitio does this at startup so queries are fast.

Kognitio has Three Table Distribution Options

1) Random - Round Robin Distribution (Default)

2) Hash Distributed

3) Replicated

Most Kognitio tables will use a Round Robin distribution. This is the default when no distribution key is defined. The data is spread evenly across the nodes to maximize parallel processing.
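As a rough picture of what round robin means, the sketch below deals rows out to the nodes one at a time, like cards to players. The `round_robin` function and the sample rows are hypothetical names for illustration; how Kognitio actually places rows internally is not shown in this chapter.

```python
from itertools import cycle


def round_robin(rows, node_count):
    """Deal rows out to the nodes one at a time, in turn."""
    nodes = [[] for _ in range(node_count)]
    for node, row in zip(cycle(nodes), rows):
        node.append(row)
    return nodes


# Ten hypothetical rows spread over four nodes.
rows = [f"row_{n}" for n in range(1, 11)]
nodes = round_robin(rows, node_count=4)

for node_number, held in enumerate(nodes, start=1):
    print(node_number, held)
```

No matter what the rows contain, no node ends up with more than one row above any other, which is what keeps every node equally busy.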

Hash distribution: One or more columns are chosen as the distribution key. The key is run through the Kognitio hashing algorithm to divide the data among all of the nodes. Like values will hash to the same node. This is often done for tables that are joined together. When the join key is also the distribution key for both tables, the join is fast and efficient because matching rows already sit on the same node.
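To see why like values landing on the same node helps a join, consider the sketch below. It is an illustration only: CRC32 is a stand-in and is not the Kognitio hashing algorithm, and the tables, `node_for`, and `hash_distribute` are hypothetical names invented for the example.

```python
import zlib

NODE_COUNT = 4  # assumption for the example


def node_for(key, node_count=NODE_COUNT):
    """Pick a node from the key. CRC32 is a stand-in, NOT Kognitio's hash."""
    return zlib.crc32(str(key).encode("utf-8")) % node_count


def hash_distribute(rows, key_index, node_count=NODE_COUNT):
    """Place each row on the node chosen by hashing its distribution key."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[node_for(row[key_index], node_count)].append(row)
    return nodes


# Hypothetical tables, both distributed on customer_id.
customers = [(1, "Ann"), (2, "Bob"), (3, "Cy"), (4, "Dee")]   # (customer_id, name)
orders = [(101, 1), (102, 3), (103, 1), (104, 4), (105, 2)]   # (order_id, customer_id)

customer_nodes = hash_distribute(customers, key_index=0)
order_nodes = hash_distribute(orders, key_index=1)

# Each node joins only its own rows; no row has to move between nodes.
joined = []
for local_customers, local_orders in zip(customer_nodes, order_nodes):
    names = dict(local_customers)  # customer_id -> name, local to this node
    for order_id, customer_id in local_orders:
        joined.append((order_id, names[customer_id]))

print(sorted(joined))
```

Because both tables use the same key and the same hash, every order finds its customer on its own node, so the lookup inside the loop never has to reach across to another node.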

Replicated: A table can be replicated in its entirety on all nodes. This is often done when a smaller table, such as a dimension table, will be joined to a larger table.
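Replication can be pictured the same way. In this sketch, which uses made-up tables and names rather than anything from Kognitio itself, a small dimension table is copied to every node while a larger table is spread out, so each node can join its share of the large table without asking any other node for help.

```python
NODE_COUNT = 4  # assumption for the example

# Hypothetical small dimension table: region_id -> region name.
regions = {1: "North", 2: "South"}

# Hypothetical larger table: (sale_id, region_id).
sales = [(sale_id, 1 + sale_id % 2) for sale_id in range(1, 9)]

# The large table is spread across the nodes (round robin here).
sales_nodes = [sales[i::NODE_COUNT] for i in range(NODE_COUNT)]

# The small table is replicated: every node gets its own full copy.
region_copies = [dict(regions) for _ in range(NODE_COUNT)]

# Each node joins its slice of sales to its local copy of regions.
joined = []
for local_sales, local_regions in zip(sales_nodes, region_copies):
    for sale_id, region_id in local_sales:
        joined.append((sale_id, local_regions[region_id]))

print(sorted(joined))
```

The price is memory: the small table is stored once per node, which is why this option suits smaller tables.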

Kognitio makes creating tables and distribution easy. Check out the fundamentals above.

Kognitio has Linear Scalability

image

"A Journey of a thousand miles begins with a single step."

- Lao Tzu

Kognitio was born to be parallel. With each query, a single step is performed in parallel by each Node. A Kognitio system consists of a series of Nodes that work in parallel to store and process your data. This design allows you to start small and grow infinitely. If your Kognitio system provides you with an excellent Return on Investment (ROI), then continue to invest by purchasing more Nodes. Most companies start small, but after seeing what Kognitio can do, they continue to grow their ROI from the single step of implementing a Kognitio system to millions of dollars in profits. Double your nodes and double your speeds . . . forever. The Kognitio Data Warehouse actually provides a journey of a thousand smiles!
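The "double your nodes, double your speeds" idea comes down to simple arithmetic, sketched below. This is an idealized model, not a benchmark: the `scan_seconds` function, the row count, and the per-node scan rate are all made-up numbers chosen to make the math easy to follow.

```python
def scan_seconds(total_rows, node_count, rows_per_second_per_node):
    """Idealized time for every node to scan its equal share of the rows."""
    rows_per_node = total_rows / node_count
    return rows_per_node / rows_per_second_per_node


TOTAL_ROWS = 1_600_000_000    # hypothetical table size
ROWS_PER_SECOND = 10_000_000  # hypothetical scan rate of one node

for node_count in (4, 8, 16):
    print(node_count, scan_seconds(TOTAL_ROWS, node_count, ROWS_PER_SECOND))
```

In this model each doubling of the nodes halves the share of rows every node has to scan, so the elapsed time is cut in half as well.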

Nexus is Now Available for Kognitio

image

Why the Nexus Chameleon should be your query tool of choice:

1) Queries every major system

2) Provides visualization and automatically writes the SQL

3) Can perform cross-system joins with a few clicks of the mouse

4) Converts table structures and moves the table and data between systems

5) Compares and synchronizes databases

6) Can move an entire database of tables or views between systems

7) Has the "Garden of Analysis" to re-query answer sets inside your PC

8) Provides a dashboard of graphs and charts for answer sets

Download the Nexus for a free trial at www.CoffingDW.com and use Nexus in-house or on the cloud.
