Virtualizing the environment with Vagrant

To create a portable Python and Spark environment that can be easily shared and cloned, we can build the development environment from a Vagrantfile.

We will refer to the Massive Open Online Courses (MOOCs) delivered by UC Berkeley and Databricks.

The course labs were executed on IPython Notebooks powered by PySpark. They can be found in the following GitHub repository: https://github.com/spark-mooc/mooc-setup/.

Once you have set up Vagrant on your machine, follow these instructions to get started: https://docs.vagrantup.com/v2/getting-started/index.html.

Clone the spark-mooc/mooc-setup/ GitHub repository into your work directory and run the command $ vagrant up from within the cloned directory.

Be aware that the version of Spark may be outdated, as the Vagrantfile may not be up to date.
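The clone-and-launch step can also be scripted. The following is a minimal sketch in Python, assuming that git and vagrant are both installed and on the PATH. The helper names plan_steps and run_steps, the work_dir argument, and the explicit mooc-setup target directory are illustrative choices and not part of the course material.

```python
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/spark-mooc/mooc-setup/"  # repository named in the text
CLONE_DIR = "mooc-setup"  # explicit clone target (an assumption made for this sketch)


def plan_steps(work_dir):
    """Return (command, cwd) pairs for cloning the repository and booting the VM."""
    work_dir = Path(work_dir)
    return [
        # Clone into <work_dir>/mooc-setup.
        (["git", "clone", REPO_URL, CLONE_DIR], work_dir),
        # Boot the VM from the directory that holds the Vagrantfile.
        (["vagrant", "up"], work_dir / CLONE_DIR),
    ]


def run_steps(steps):
    """Run each command in its directory and stop at the first failure."""
    for command, cwd in steps:
        subprocess.run(command, cwd=cwd, check=True)


if __name__ == "__main__":
    # Only print the plan here; call run_steps(plan_steps(".")) to execute it.
    for command, cwd in plan_steps("."):
        print(f"{cwd}> {' '.join(command)}")
```

Typing the same two commands by hand in a terminal is equivalent; the script only makes the sequence repeatable.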

You will see an output similar to this:

C:\Programs\spark\edx1001\mooc-setup-master>vagrant up
Bringing machine 'sparkvm' up with 'virtualbox' provider...
==> sparkvm: Checking if box 'sparkmooc/base' is up to date...
==> sparkvm: Clearing any previously set forwarded ports...
==> sparkvm: Clearing any previously set network interfaces...
==> sparkvm: Preparing network interfaces based on configuration...
    sparkvm: Adapter 1: nat
==> sparkvm: Forwarding ports...
    sparkvm: 8001 => 8001 (adapter 1)
    sparkvm: 4040 => 4040 (adapter 1)
    sparkvm: 22 => 2222 (adapter 1)
==> sparkvm: Booting VM...
==> sparkvm: Waiting for machine to boot. This may take a few minutes...
    sparkvm: SSH address: 127.0.0.1:2222
    sparkvm: SSH username: vagrant
    sparkvm: SSH auth method: private key
    sparkvm: Warning: Connection timeout. Retrying...
    sparkvm: Warning: Remote connection disconnect. Retrying...
==> sparkvm: Machine booted and ready!
==> sparkvm: Checking for guest additions in VM...
==> sparkvm: Setting hostname...
==> sparkvm: Mounting shared folders...
    sparkvm: /vagrant => C:/Programs/spark/edx1001/mooc-setup-master
==> sparkvm: Machine already provisioned. Run `vagrant provision` or use the `--provision`
==> sparkvm: to force provisioning. Provisioners marked to run always will still run.

C:\Programs\spark\edx1001\mooc-setup-master>

This will launch the IPython Notebooks powered by PySpark on localhost:8001.
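The "Forwarding ports" lines in the output are printed as guest port => host port, so the notebook server on guest port 8001 is reachable on host port 8001, and SSH on guest port 22 is reachable on host port 2222. As a rough sketch of how to confirm this from the host, the following Python snippet parses those lines and tries a TCP connection to each host port. The function names parse_forwarded_ports and is_port_open are hypothetical helpers written for this example, and the regular expression assumes the output format shown here, which may differ in other Vagrant versions.

```python
import re
import socket

# Forwarding lines copied from the `vagrant up` output in this section.
SAMPLE_OUTPUT = """\
    sparkvm: 8001 => 8001 (adapter 1)
    sparkvm: 4040 => 4040 (adapter 1)
    sparkvm: 22 => 2222 (adapter 1)
"""

# Matches "<machine>: <guest> => <host> (adapter N)".
FORWARD_RE = re.compile(r"^\s*\S+:\s+(\d+)\s+=>\s+(\d+)\s+\(adapter \d+\)\s*$")


def parse_forwarded_ports(output):
    """Return a {guest_port: host_port} dict from `vagrant up` output."""
    ports = {}
    for line in output.splitlines():
        match = FORWARD_RE.match(line)
        if match:
            ports[int(match.group(1))] = int(match.group(2))
    return ports


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections and timeouts.
        return False


if __name__ == "__main__":
    for guest, host_port in parse_forwarded_ports(SAMPLE_OUTPUT).items():
        state = "open" if is_port_open("127.0.0.1", host_port) else "closed"
        print(f"guest {guest} -> localhost:{host_port}: {state}")
```

Port 4040 is the default port of the Spark application web UI, so it will likely report as closed until a SparkContext is running inside the VM.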
