In this part, we view Apache Spark’s streaming engines within a broader scope. We begin with a detailed comparison with other relevant projects in the distributed stream-processing industry, explaining both where Spark comes from and why no alternative is exactly like it.
We briefly describe other distributed processing engines and compare each of them to Spark, including the following:
Apache Storm: A historical landmark of distributed processing, and a system that still has a legacy footprint today
Apache Flink: A distributed stream processing engine that is the most active competitor of Spark
Apache Kafka Streams: A reliable distributed log and stream connector that is fast developing analytical chops
We also touch on the cloud offerings of the main players (Amazon and Microsoft) as well as the centralizing engine of Google Cloud Dataflow.
After you are equipped with a detailed sense of the potential and challenges of Apache Spark’s streaming ambitions, we’ll touch on how you can become involved with the community and ecosystem of stream processing with Apache Spark, providing references for contributing, discussing, and growing in the practice of streaming analytics.