Appendix B. Hadoop Setup

Hortonworks Sandbox

Hortonworks Sandbox is a Hadoop learning and development environment that runs as a virtual machine. It is a widely used way to learn Hadoop because it ships with most of the latest application stack of the Hortonworks Data Platform (HDP).

We have used Hortonworks Sandbox throughout the book. At the time of this writing, the latest version of the sandbox is 1.3.

Setting up the Hortonworks Sandbox

The following steps will help you set up Hortonworks Sandbox:

  1. Download the Oracle VirtualBox installer from https://www.virtualbox.org.
  2. Launch the installer and accept all the default options.
  3. Download the Hortonworks Sandbox virtual image for VirtualBox, located at http://hortonworks.com/products/hortonworks-sandbox. At the time of writing, Hortonworks+Sandbox+1.3+VirtualBox+RC6.ova is the latest image available.
  4. Launch the Oracle VirtualBox application.
  5. In the File menu, choose Import Appliance.
  6. The Import Virtual Appliance dialog will appear; click on the Open Appliance... button and navigate to the image file.
  7. Click on the Next button.
  8. Accept the default settings and click on the Import button.
  9. On the image list, you will find Hortonworks Sandbox 1.3. The following screenshot shows the Hortonworks Sandbox in an image listbox:
    [Screenshot: Hortonworks Sandbox 1.3 in the VirtualBox image list]
  10. On the menu bar, click on Settings.
  11. The settings dialog appears. On the left-hand side panel of the dialog, choose Network.
  12. In the Adapter 1 menu tab, make sure the checkbox labeled Enable Network Adapter is checked.
  13. In the Attached to listbox, select Bridged Adapter. With this configuration, the VM behaves as if it had its own network interface card (NIC) and IP address on your network. Click on OK to accept the configuration. The following screenshot shows the VirtualBox network configuration display:
    [Screenshot: VirtualBox network configuration with Bridged Adapter selected]
  14. In the menu bar, click on the Start button to run the VM.
  15. After the VM has started up completely, press Alt + F5 to log in to the virtual machine. Use root as the username and hadoop as the password.
  16. The sandbox uses DHCP to obtain its IP address. Assuming your PC is on the 192.168.1.x network, we will change the sandbox's IP address to the static address 192.168.1.122 by editing the /etc/sysconfig/network-scripts/ifcfg-eth0 file. Use the following values (the file itself stores each one as a KEY=value line):
    • DEVICE: eth0
    • TYPE: Ethernet
    • ONBOOT: yes
    • NM_CONTROLLED: yes
    • BOOTPROTO: static
    • IPADDR: 192.168.1.122
    • NETMASK: 255.255.255.0
    • DEFROUTE: yes
    • PEERDNS: no
    • PEERROUTES: yes
    • IPV4_FAILURE_FATAL: yes
    • IPV6INIT: no
    • NAME: System eth0
  17. Restart the network by issuing the service network restart command.
  18. From the host, try to ping the new IP address. If the ping succeeds, you are ready to move on to the next step.
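
The values in step 16 are listed as name and value pairs, but ifcfg files are read as shell-style KEY=value assignments, so a value that contains a space (such as the NAME entry) needs quotes. The following is a minimal sketch that renders the listed values in that form; the names IFCFG_ETH0 and render_ifcfg are hypothetical helpers introduced here for illustration, and the script only prints the text rather than writing to the file.

```python
# Values from step 16, in the order listed there.
IFCFG_ETH0 = {
    "DEVICE": "eth0",
    "TYPE": "Ethernet",
    "ONBOOT": "yes",
    "NM_CONTROLLED": "yes",
    "BOOTPROTO": "static",
    "IPADDR": "192.168.1.122",
    "NETMASK": "255.255.255.0",
    "DEFROUTE": "yes",
    "PEERDNS": "no",
    "PEERROUTES": "yes",
    "IPV4_FAILURE_FATAL": "yes",
    "IPV6INIT": "no",
    "NAME": "System eth0",
}


def render_ifcfg(settings):
    """Return the settings as KEY=value lines, quoting values that contain spaces."""
    lines = []
    for key, value in settings.items():
        if " " in value:
            value = f'"{value}"'
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the rendered text so it can be compared with the file on the sandbox.
    print(render_ifcfg(IFCFG_ETH0), end="")
```

Comparing the printed text with the contents of /etc/sysconfig/network-scripts/ifcfg-eth0 on the sandbox is a quick way to spot a mistyped key before restarting the network.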

Hortonworks Sandbox web administration

The following steps introduce the sandbox's web-based administration interface:

  1. Launch your web browser on the host. In the address bar, type http://192.168.1.122:8888. This opens the sandbox home page, which consists of an application menu, an administrative menu, and a collection of written and video tutorials.
  2. Under the Use the Sandbox box, click on the Start button. This will open Hue—an open source UI application for Apache Hadoop. The following screenshot shows the Hortonworks Sandbox web page display:
    [Screenshot: Hortonworks Sandbox web page]
  3. In the upper-right corner of the page, note that you are currently logged in as hue. The following screenshot shows hue as the currently logged-in user:
    [Screenshot: hue shown as the logged-in user]
  4. In the menu bar, explore the list of Hadoop application menus. The following screenshot shows a list of Hadoop-related application menus:
    [Screenshot: list of Hadoop-related application menus]
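
If the sandbox home page does not load in the browser, it can help to check from the host whether anything is listening on the sandbox's web port. The following is a minimal sketch, assuming the static address 192.168.1.122 and port 8888 used above; is_port_open is a hypothetical helper name, and a successful TCP connection only shows that the port is open, not that the page itself is healthy.

```python
import socket

SANDBOX_HOST = "192.168.1.122"  # static address assigned during setup
SANDBOX_PORT = 8888             # port of the sandbox home page


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__":
    state = "reachable" if is_port_open(SANDBOX_HOST, SANDBOX_PORT) else "not reachable"
    print(f"{SANDBOX_HOST}:{SANDBOX_PORT} is {state}")
```

A "not reachable" result when ping succeeds suggests that the VM is up but its web services have not finished starting.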