Installing Hive

Like Pig, Hive provides an alternative programming model for writing data processing jobs. It allows users to map their data into a relational model and process it through SQL-like commands.

Because of its SQL-style language, Hive feels natural to users who have done data warehousing with relational databases. It is therefore often used as a data warehousing tool.

Getting ready

You need a machine that has JDK 1.6 or later installed.
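
The JDK requirement above can be checked from the shell before going further. The following is a minimal sketch, not part of the original recipe: the helper name jdk_version_ok is hypothetical, and it assumes the version string follows either the older "1.x" scheme (for example 1.6.0_45) or the newer scheme (for example 11.0.2).

```shell
# Hypothetical helper: succeed if a Java version string denotes 1.6 or later.
# Accepts both the older "1.x" scheme (1.6.0_45) and the newer one (11.0.2).
jdk_version_ok() {
  major="${1%%.*}"              # text before the first dot
  major="${major%%[!0-9]*}"     # drop any non-numeric suffix such as "-ea"
  [ -n "$major" ] || return 1
  if [ "$major" -eq 1 ]; then
    rest="${1#*.}"              # text after the first dot
    minor="${rest%%.*}"
    minor="${minor%%[!0-9]*}"
    [ -n "$minor" ] && [ "$minor" -ge 6 ]
  else
    [ "$major" -ge 6 ]
  fi
}

# Usage: "java -version" prints to stderr, with the version in double quotes.
#   current="$(java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }')"
#   jdk_version_ok "$current" && echo "Java $current is new enough"
```

Running javac -version as well is a quick way to confirm that a full JDK, rather than only a runtime, is on the PATH.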

How to do it...

This section describes how to install Hive.

  1. Download Hive 0.9.0 from
  2. Unzip the distribution by running the following command.
    > tar xvf hive-0.9.0.tar.gz
  3. Download Hadoop 1.0.0 distribution from
  4. Unzip the Hadoop distribution with the following command.
    > tar xvf hadoop-1.0.0.tar.gz
  5. Define the environment variables pointing to the Hadoop and Hive distributions.
    > export HIVE_HOME=<hive distribution>
    > export HADOOP_HOME=<hadoop distribution>
  6. Configure Hive by adding the following section to the conf/hive-site.xml file.
  7. Delete the HADOOP_HOME/build folder to avoid a bug that will cause Hive to fail.
  8. Start Hive by running the following commands from HIVE_HOME:
    > cd hive-0.9.0
    > bin/hive
    WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the files.
    Logging initialized using configuration in jar:file:/Users/srinath/playground/hadoop-book/hive-0.9.0/lib/hive-common-0.9.0.jar!/
    Hive history file=/tmp/srinath/hive_job_log_srinath_201206072032_139699150.txt
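
Steps 2, 4, 5, and 7 above can be collected into one shell function. This is a sketch under stated assumptions rather than part of the original recipe: the function name prepare_hive is hypothetical, and it assumes both archives were downloaded into a single directory that is passed as an absolute path. Step 6 is left out because it depends on the hive-site.xml section, which has to be added by hand.

```shell
# Hypothetical helper covering steps 2, 4, 5, and 7.
# $1 is assumed to be the absolute path of the directory holding both archives.
prepare_hive() {
  install_dir="$1"

  # Steps 2 and 4: unpack each archive unless it has already been unpacked.
  for name in hive-0.9.0 hadoop-1.0.0; do
    if [ ! -d "$install_dir/$name" ] && [ -f "$install_dir/$name.tar.gz" ]; then
      tar xvf "$install_dir/$name.tar.gz" -C "$install_dir" || return 1
    fi
  done

  # Step 5: point the environment variables at the two distributions.
  export HIVE_HOME="$install_dir/hive-0.9.0"
  export HADOOP_HOME="$install_dir/hadoop-1.0.0"

  # Step 7: delete the Hadoop build folder.
  # The :? expansion aborts instead of deleting "/build" if the variable is empty.
  rm -rf "${HADOOP_HOME:?}/build"
}

# Usage (the path is only an example):
#   prepare_hive "$HOME/playground"
#   cd "$HIVE_HOME" && bin/hive
```

Because the function exports the variables, it has to be defined and called in the same shell session that later starts Hive.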

How it works...

The preceding commands set up Hive, which then runs against the Hadoop distribution that HADOOP_HOME points to.
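
Since Hive locates Hadoop through HADOOP_HOME, a wrong or missing value is a likely cause of a failed start. The following sketch checks both variables before bin/hive is launched. The function name check_hive_env is hypothetical; the only assumption about the distributions is that they ship the bin/hive and bin/hadoop launcher scripts.

```shell
# Hypothetical helper: verify that HIVE_HOME and HADOOP_HOME look usable.
check_hive_env() {
  status=0
  if [ ! -x "${HIVE_HOME:-}/bin/hive" ]; then
    echo "bin/hive not found under HIVE_HOME='${HIVE_HOME:-}'" >&2
    status=1
  fi
  if [ ! -x "${HADOOP_HOME:-}/bin/hadoop" ]; then
    echo "bin/hadoop not found under HADOOP_HOME='${HADOOP_HOME:-}'" >&2
    status=1
  fi
  # The recipe asks for the Hadoop build folder to be deleted before starting Hive.
  if [ -d "${HADOOP_HOME:-}/build" ]; then
    echo "HADOOP_HOME/build still exists and should be deleted" >&2
    status=1
  fi
  return "$status"
}

# Usage:
#   check_hive_env && "$HIVE_HOME/bin/hive"
```

If the check passes but Hive still fails to start, running "$HADOOP_HOME/bin/hadoop" version confirms whether the Hadoop launcher itself works.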
