Part III. Spark Streaming

In this part, we learn about Spark Streaming.

Spark Streaming was the first streaming API offered on Apache Spark and is currently used in production by many companies around the world. It provides a powerful and extensible functional API based on the core Spark abstractions. Nowadays, Spark Streaming is mature and stable.

Our exploration of Spark Streaming begins with a practical example that gives us an initial feel for its API and programming model. As we progress through this part, we explore the different aspects involved in programming and running robust Spark Streaming applications:

  • Understanding the Discretized Stream (DStream) abstraction

  • Creating applications using the API and programming model

  • Consuming and producing data using streaming sources and Output Operations

  • Combining Spark SQL and other libraries into streaming applications

  • Understanding the fault-tolerance characteristics and how to create robust applications

  • Monitoring and managing streaming applications

After this part, you will have the knowledge required to design, implement, and execute stream-processing applications using Spark Streaming. You will also be prepared for Part IV, which covers more advanced topics such as the application of probabilistic data structures for stream processing and online machine learning.
