Installing Pig

As described in the earlier chapters, you can use the Hadoop MapReduce interface to program most applications. However, when an application includes many MapReduce steps, programming it directly with MapReduce becomes complicated.

Several higher-level programming interfaces, such as Pig and Hive, are built on top of MapReduce to simplify the programming of parallel applications. We will discuss these two interfaces in the following recipes.

How to do it...

This section demonstrates how to install Pig.

  1. Download Pig 0.10.0 from http://pig.apache.org/releases.html.
  2. Extract the Pig distribution by running the following command. We will refer to the extracted directory as PIG_HOME.
    > tar xvf pig-0.10.0.tar.gz
    
  3. To run Pig commands, change the directory to PIG_HOME and run the pig command. The --help option lists the available options, and the -x local option starts the grunt shell in the local mode.
    > cd PIG_HOME
    > bin/pig --help
    > bin/pig -x local
    grunt>
    

You can issue the Pig commands from the grunt shell.
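
If you prefer not to change into PIG_HOME every time, one option is to export PIG_HOME and add its bin directory to the PATH. The following is a minimal sketch, not part of the original recipe. It assumes that the tarball was extracted into the current directory as pig-0.10.0 (adjust the path if you extracted it elsewhere) and that the pig launcher needs JAVA_HOME to locate Java.

```shell
#!/bin/sh
# Assumption: the tarball was extracted in the current directory as pig-0.10.0.
# An already exported PIG_HOME takes precedence over this default.
PIG_HOME="${PIG_HOME:-$PWD/pig-0.10.0}"
export PIG_HOME

# Put the pig launcher on the PATH so it can be started from any directory.
PATH="$PIG_HOME/bin:$PATH"
export PATH

# Warn early if JAVA_HOME is unset, because the launcher relies on it to find Java.
if [ -z "${JAVA_HOME:-}" ]; then
    echo "warning: JAVA_HOME is not set; bin/pig may refuse to start" >&2
fi

echo "PIG_HOME=$PIG_HOME"
```

With these lines in your shell profile, running pig -x local from any directory should start the same grunt shell as in step 3.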

How it works...

The preceding instructions set up Pig in the local mode, in which Pig runs on a single machine against the local filesystem and does not need a Hadoop cluster. You can use the grunt> shell to execute Pig commands.
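
Besides the interactive grunt shell, the pig command also accepts a script file, which is handy for a quick check of the installation. The following sketch is an illustration under stated assumptions: the file names input.txt and sample.pig and the sample data are hypothetical, and the pig launcher is assumed to be on the PATH (otherwise the script only writes the files).

```shell
#!/bin/sh
# Work in a scratch directory so the example does not touch other files.
WORK_DIR="$(mktemp -d)"

# Hypothetical sample input: one word per line.
printf 'apple\nbanana\ncherry\n' > "$WORK_DIR/input.txt"

# A two-statement Pig Latin script: load each line as a string,
# then print the resulting relation to the console.
cat > "$WORK_DIR/sample.pig" <<'EOF'
lines = LOAD 'input.txt' AS (line:chararray);
DUMP lines;
EOF

# Run the script in the local mode only if the pig launcher is available.
if command -v pig >/dev/null 2>&1; then
    (cd "$WORK_DIR" && pig -x local sample.pig)
else
    echo "pig not found on PATH; script written to $WORK_DIR/sample.pig"
fi
```

The same two Pig Latin statements can also be typed one at a time at the grunt> prompt.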

There's more...

The preceding commands explain how to run Pig in the local mode. The link http://pig.apache.org/docs/r0.10.0/start.html#Running+the+Pig+Scripts+in+Mapreduce+Mode explains how to run Pig in the distributed (MapReduce) mode.
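
The execution mode is selected with the -x option: local for the local mode and mapreduce for the distributed mode. As a rough sketch, the following helper chooses between the two. The function name choose_exectype and the heuristic it uses (treating a hadoop client on the PATH as a sign that a cluster is configured) are assumptions made for illustration and are not part of Pig.

```shell
#!/bin/sh
# Hypothetical helper: pick an execution type for the -x option.
# Heuristic (an assumption): if a hadoop client is on the PATH, assume a
# cluster is configured and use mapreduce; otherwise fall back to local.
choose_exectype() {
    if command -v hadoop >/dev/null 2>&1; then
        echo mapreduce
    else
        echo local
    fi
}

EXECTYPE="$(choose_exectype)"

# Print the command instead of running it, so the sketch is safe to try
# on a machine without Pig or Hadoop installed.
echo "would run: pig -x $EXECTYPE"
```

In practice the MapReduce mode also needs Pig to find your Hadoop configuration, so consult the linked documentation before relying on a check like this one.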
