Part I. Fundamentals of Stream Processing with Apache Spark

The first part of this book is dedicated to building solid foundations on the concepts that underpin stream processing and a theoretical understanding of Apache Spark as a streaming engine.

We begin by discussing the drivers behind the adoption of stream-processing techniques and systems in the enterprise today (Chapter 1). We then establish the vocabulary and concepts common to stream processing (Chapter 2). Next, we take a quick look at how we arrived at the current state of the art by discussing different streaming architectures (Chapter 3), and we outline a theoretical understanding of Apache Spark as a streaming engine (Chapter 4).

At this point, readers can jump directly to the more practical discussion of Structured Streaming in Part II or Spark Streaming in Part III.

For those who prefer to gain a deeper understanding before venturing into APIs and runtimes, we suggest that you continue reading about Spark’s Distributed Processing model in Chapter 5, in which we lay out the core concepts that will later help you better understand the different implementations, options, and features offered by Spark Streaming and Structured Streaming.

In Chapter 6, we deepen our understanding of the resilience model implemented by Spark and how it relieves developers of much of the burden of implementing robust streaming applications that can run enterprise-critical workloads 24/7.

With this new knowledge, we are ready to venture into the two streaming APIs of Spark, which we do in the subsequent parts of this book.
