Preface

The success of Hadoop as a big data platform raised user expectations, both for solving a wider range of analytics challenges and for reducing latency. Various tools evolved over time to meet these needs, but when Apache Spark arrived, it provided a single runtime to address all of them. It eliminated the need to combine multiple tools, each with its own challenges and learning curve. By using memory for storage in addition to compute, Apache Spark avoids writing intermediate data to disk and can increase processing speed by up to 100 times. Through its libraries, the same runtime serves various analytics needs, such as machine learning and real-time streaming.
This book covers the installation and configuration of Apache Spark, and shows how to build solutions using the Spark Core, Spark SQL, Spark Streaming, MLlib, and GraphX libraries.


For more information on this book's recipes, please visit infoobjects.com/spark-cookbook.