Spark is a complex distributed computing framework with many moving parts. Various cluster resources, such as memory, CPU, and network bandwidth, can become bottlenecks at different points. Because Spark is an in-memory compute framework, memory has the biggest impact.
Another issue is that Spark applications commonly use a huge amount of memory, sometimes more than 100 GB; such memory footprints are rare in traditional Java applications.
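These heap sizes are not inferred; they are set explicitly when the application is submitted. A minimal sketch using the standard spark-submit memory flags (the application file name and cluster manager here are placeholders, not from the original text):

```shell
# Request an 8 GB heap for the driver and 100 GB for each executor.
# "my_app.py" and the YARN master are hypothetical examples.
spark-submit \
  --master yarn \
  --driver-memory 8g \
  --executor-memory 100g \
  my_app.py
```

The same values can also be supplied as `spark.driver.memory` and `spark.executor.memory` in the application's configuration.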
In Spark, memory optimization is needed at two levels: the driver and the executors. The following diagram shows these two levels of operation in Spark: