How to do it...

Here are the installation steps:

  1. Open the terminal and download the binaries using the following command:
        $ wget http://d3kbcqa49mib13.cloudfront.net/spark-2.1.0-bin-hadoop2.7.tgz
  1. Unpack the binaries:
        $ tar -zxf spark-2.1.0-bin-hadoop2.7.tgz
  1. Rename the folder containing the binaries by stripping the version information:
        $ sudo mv spark-2.1.0-bin-hadoop2.7 spark
  1. Move the configuration folder to /etc so that it can be replaced by a symbolic link later:
        $ sudo mv spark/conf /etc/spark
  1. Create your company-specific installation directory under /opt. As the recipes in this book are tested on the infoobjects sandbox, use infoobjects as the directory name. Create the /opt/infoobjects directory:
        $ sudo mkdir -p /opt/infoobjects
  1. Move the spark directory to /opt/infoobjects, as it's an add-on software package:
        $ sudo mv spark /opt/infoobjects/
  1. Change the permissions of the spark home directory to 0755 (user: read-write-execute; group: read-execute; world: read-execute):
        $ sudo chmod -R 755 /opt/infoobjects/spark
  1. Move to the spark home directory:
        $ cd /opt/infoobjects/spark
  1. Create the symbolic link:
        $ sudo ln -s /etc/spark conf
  1. Append the Spark binaries path to PATH in .bashrc (the single quotes keep $PATH from being expanded before the line is written):
        $ echo 'export PATH=$PATH:/opt/infoobjects/spark/bin' >> /home/hduser/.bashrc
  1. Open a new terminal.
  1. Create the log directory in /var:
        $ sudo mkdir -p /var/log/spark
  1. Make hduser the owner of Spark's log directory:
        $ sudo chown -R hduser:hduser /var/log/spark
  1. Create Spark's tmp directory:
        $ mkdir /tmp/spark
  1. Configure Spark using the following commands:
        $ cd /etc/spark
        $ echo "export HADOOP_CONF_DIR=/opt/infoobjects/hadoop/etc/hadoop" >> spark-env.sh
        $ echo "export YARN_CONF_DIR=/opt/infoobjects/hadoop/etc/hadoop" >> spark-env.sh
        $ echo "export SPARK_LOG_DIR=/var/log/spark" >> spark-env.sh
        $ echo "export SPARK_WORKER_DIR=/tmp/spark" >> spark-env.sh
  1. Change the ownership of the spark home directory to root:
        $ sudo chown -R root:root /opt/infoobjects/spark
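
Once the steps above are done, a quick check of the resulting layout can catch a missed step. The following is a minimal sketch and not part of the original recipe: the function name check_spark_install and its four positional parameters are hypothetical, and the defaults are simply the paths used in the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the recipe): checks the directory layout
# that the installation steps are expected to produce.
# Arguments (all optional, defaults are the recipe's paths):
#   $1 spark home, $2 configuration directory, $3 log directory, $4 worker directory
check_spark_install() {
  local spark_home="${1:-/opt/infoobjects/spark}"
  local conf_dir="${2:-/etc/spark}"
  local log_dir="${3:-/var/log/spark}"
  local work_dir="${4:-/tmp/spark}"
  local status=0
  local var

  # The Spark launch scripts live under bin/.
  [ -d "$spark_home/bin" ] || { echo "missing directory: $spark_home/bin"; status=1; }

  # conf must be a symbolic link that resolves to a directory.
  [ -L "$spark_home/conf" ] || { echo "not a symbolic link: $spark_home/conf"; status=1; }
  [ -d "$spark_home/conf" ] || { echo "conf does not resolve to a directory"; status=1; }

  # The log directory must be writable by the user running this check.
  [ -w "$log_dir" ] || { echo "not writable: $log_dir"; status=1; }
  [ -d "$work_dir" ] || { echo "missing directory: $work_dir"; status=1; }

  # Each variable from the configuration step should be exported in spark-env.sh.
  for var in HADOOP_CONF_DIR YARN_CONF_DIR SPARK_LOG_DIR SPARK_WORKER_DIR; do
    grep -q "^export $var=" "$conf_dir/spark-env.sh" 2>/dev/null \
      || { echo "not set in spark-env.sh: $var"; status=1; }
  done

  [ "$status" -eq 0 ] && echo "layout OK"
  return "$status"
}
```

Calling check_spark_install with no arguments checks the recipe's default paths; run it as hduser so that the writability test reflects the account that will run Spark. Passing explicit paths makes it possible to try the check against a scratch directory first.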