Spark memory allocation

By default, Spark uses 60 percent of the JVM heap for its own execution and storage needs and leaves the remaining 40 percent for user objects and metadata. Within that 60 percent, half is reserved for caching RDDs/DataFrames (storage) and the other half for execution. Sometimes you may not need 60 percent of the heap for Spark's own needs and can reduce this limit so that more space is available for object creation (and there is less need for GC).
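
To make these fractions concrete, here is a small back-of-the-envelope sketch in Scala. The roughly 300 MB that Spark reserves for internal bookkeeping is an assumption based on the unified memory manager and may vary by version:

// Rough sketch of how a 10 GB executor heap is carved up under the defaults.
// The 300 MB reserved figure is an assumption; exact values vary by version.
val heap     = 10L * 1024 * 1024 * 1024   // 10 GB executor heap
val reserved = 300L * 1024 * 1024         // memory Spark keeps for itself
val usable   = heap - reserved

val memoryFraction  = 0.6                 // spark.memory.fraction (default)
val storageFraction = 0.5                 // spark.memory.storageFraction (default)

val unified   = (usable * memoryFraction).toLong    // execution + storage pool
val storage   = (unified * storageFraction).toLong  // cached RDDs/DataFrames
val execution = unified - storage                   // shuffles, joins, sorts

println(f"unified: ${unified / 1e9}%.2f GB, storage: ${storage / 1e9}%.2f GB, execution: ${execution / 1e9}%.2f GB")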

You can change the share of unified memory allocated to the RDD/DataFrame cache to 40 percent by starting the Spark shell with the storage fraction set explicitly:

$ spark-shell --conf spark.memory.storageFraction=0.4
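
The same property can also be set programmatically when building the session, which is handy outside the interactive shell. A minimal sketch, assuming a Spark 2.x-style SparkSession; the application name is arbitrary:

import org.apache.spark.sql.SparkSession

// Memory properties must be set before the SparkContext starts,
// so they are passed to the builder rather than changed at runtime.
val spark = SparkSession.builder()
  .appName("storage-fraction-demo")               // hypothetical name
  .config("spark.memory.storageFraction", "0.4")  // cache gets 40% of the unified pool
  .getOrCreate()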

Spark also has another property, spark.memory.fraction, which expresses the combined execution and storage memory as a fraction of the total JVM heap space (the default being 0.6):

$ spark-shell --conf spark.memory.fraction=0.7
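
Once the shell is up, you can confirm which values are actually in effect. A quick check; note that conf.get returns explicitly set values, so a property that was never overridden may not be present:

// Inside the Spark shell (spark is the pre-built SparkSession):
spark.conf.get("spark.memory.fraction")         // "0.7"
spark.conf.get("spark.memory.storageFraction")  // "0.4" if set as above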