How to do it...

  1. Add the --executor-memory command-line argument when submitting a job with spark-submit to set the on-heap memory for the worker nodes. For example, we could use --executor-memory 4g to allocate 4 GB of memory.
  2. Add the --conf command-line argument to set the off-heap memory for the worker node:
--conf "spark.executor.extraJavaOptions=-Dorg.bytedeco.javacpp.maxbytes=8G"
  3. Add the --conf command-line argument to set the off-heap memory for the master node. For example, we could use --conf "spark.driver.extraJavaOptions=-Dorg.bytedeco.javacpp.maxbytes=8G" to allocate 8 GB of off-heap memory to the driver.
  4. Add the --driver-memory command-line argument to specify the on-heap memory for the master node. For example, we could use --driver-memory 4g to allocate 4 GB of memory.
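Taken together, steps 1 to 4 can be passed in a single spark-submit invocation. The following is an illustrative sketch only; the main class, JAR name, and memory sizes are placeholders to be adapted to your cluster:
spark-submit \
  --class com.example.DistributedTraining \
  --executor-memory 4g \
  --driver-memory 4g \
  --conf "spark.executor.extraJavaOptions=-Dorg.bytedeco.javacpp.maxbytes=8G" \
  --conf "spark.driver.extraJavaOptions=-Dorg.bytedeco.javacpp.maxbytes=8G" \
  dl4j-spark-training.jar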
  5. Configure garbage collection for the worker nodes by calling workerTogglePeriodicGC() and workerPeriodicGCFrequency() while you set up the distributed neural network using SharedTrainingMaster:
new SharedTrainingMaster.Builder(voidConfiguration, minibatch)
    .workerTogglePeriodicGC(true)
    .workerPeriodicGCFrequency(frequencyIntervalInMs)
    .build();
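For context, here is a minimal sketch of how this builder is typically wired up with the VoidConfiguration and SharedTrainingMaster classes from DL4J's gradient-sharing Spark module. The network mask, controller address, and GC frequency are illustrative assumptions rather than values from this recipe:
VoidConfiguration voidConfiguration = VoidConfiguration.builder()
    .networkMask("10.0.0.0/16")        // subnet shared by the Spark nodes (illustrative)
    .controllerAddress("10.0.2.4")     // IP address of the master node (illustrative)
    .build();

SharedTrainingMaster trainingMaster = new SharedTrainingMaster.Builder(voidConfiguration, minibatch)
    .batchSizePerWorker(minibatch)     // number of examples per worker per fit
    .workerTogglePeriodicGC(true)      // enable periodic System.gc() calls on the workers
    .workerPeriodicGCFrequency(10000)  // trigger GC roughly every 10 seconds (illustrative)
    .build();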
  6. Enable Kryo optimization in DL4J by adding the following dependency to the pom.xml file:
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-kryo_2.11</artifactId>
    <version>1.0.0-beta3</version>
</dependency>
  7. Configure KryoSerializer with SparkConf:
SparkConf conf = new SparkConf();
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
conf.set("spark.kryo.registrator", "org.nd4j.Nd4jRegistrator");
  8. Add locality configuration to spark-submit, as shown here:
--conf spark.locality.wait=0 
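Setting spark.locality.wait to 0 tells the Spark scheduler not to wait for a data-local executor slot before launching a task, which typically reduces scheduling delays for training jobs at the cost of some additional network transfer.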