Introducing Big Data

Big Data deals with large and complex datasets that can be structured, semi-structured, or unstructured and that typically do not fit into memory. Such datasets have to be processed in place, which means that the computation is done where the data resides. When we talk to developers, the people actually building Big Data systems and applications, we get a better idea of what Big Data means in practice. They typically mention the 3Vs model of Big Data: velocity, volume, and variety.

Velocity refers to the low-latency, real-time speed at which the analytics need to be applied. A typical example is performing analytics on a continuous stream of data originating from a social networking site, or on an aggregation of disparate data sources.

Volume refers to the size of the dataset. It may be in KB, MB, GB, TB, or PB, depending on the type of application that generates or receives the data.

Variety refers to the various types of data that can exist, for example, text, audio, video, and photos.
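
To make the velocity point more concrete, the following is a minimal Python sketch of analytics applied to a continuous stream. It is not taken from any particular Big Data platform. The names running_mean and simulated_feed are hypothetical, and the feed is a small hard-coded stand-in for a live source such as a social networking site. Each value is processed as it arrives, and only a count and a running total are kept, so the whole dataset never has to fit into memory.

```python
from typing import Iterable, Iterator


def running_mean(stream: Iterable[float]) -> Iterator[float]:
    """Yield the mean of all values seen so far, one result per incoming value."""
    count = 0
    total = 0.0
    for value in stream:
        # Only two numbers are kept in memory, however long the stream is.
        count += 1
        total += value
        yield total / count


def simulated_feed() -> Iterator[float]:
    """Hypothetical stand-in for a live source (an assumption for illustration)."""
    for value in (4.0, 8.0, 6.0, 2.0):
        yield value


if __name__ == "__main__":
    for mean in running_mean(simulated_feed()):
        print(mean)
```

In a real system the feed would be unbounded and the aggregate would probably be more elaborate, but the pattern of updating a small state per record would stay the same.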

Big Data usually includes datasets whose sizes are beyond the ability of conventional systems to handle. It is not possible for such systems to process this amount of data within the time frame mandated by the business. Big Data volumes are a constantly moving target, as of 2012 ranging from a few dozen terabytes to many petabytes of data in a single dataset. Faced with this seemingly insurmountable challenge, entirely new platforms have emerged, which are called Big Data platforms.
