CHAPTER 6
Introducing Spark and Kafka
OBJECTIVE
6.1 Introducing Spark
6.2 Working with Kafka
Now that we have covered the core Big Data components,
such as Hadoop, MapReduce and NoSQL, it is the right
time to introduce another very important part of the
Hadoop ecosystem: Apache Spark. Spark is widely
used across organizations to process large data sets.
It is extremely popular for its great processing speed
and ability to integrate with diverse databases. Apache
Spark is often used alongside Apache Kafka, an open source
distributed streaming platform which is used to stream
data. Kafka was developed at LinkedIn in Scala and Java and
was later contributed to the Apache Software Foundation. It
provides a unified, high-throughput, low-latency platform
for handling real-time data feeds. We shall study the
functions of Apache Kafka in this chapter.