Installing HBase

HBase is a highly scalable NoSQL data store that supports columnar-style data storage. As we will see in the next recipe, it works very closely with Hadoop.

[Screenshot: HBase data model]

The preceding screenshot depicts the HBase data model. As shown, HBase includes several tables. Each table has zero or more rows, where a row consists of a single row ID and multiple name-value pairs. For example, the first row has the row ID Foundation and several name-value pairs, such as author with the value asimov. Although the data model resembles the relational data model, different rows in an HBase table may have different columns. For instance, the second row may contain completely different name-value pairs from the first one. You can find more details about the data model in Google's Bigtable paper at http://research.google.com/archive/bigtable.html.
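
To make the idea of rows with different columns concrete, here is a minimal shell sketch that models a table as one line per cell (row ID, column name, value). Only the Foundation/author/asimov cell comes from the example above; the second row, its columns, and the helper names columns_of and value_of are hypothetical. It is a plain-text illustration of the model, not a description of how HBase stores data.

```shell
#!/bin/sh
# One line per cell: <row ID> <column name> <value>.
# Only the first cell comes from the text; row2 and its columns are hypothetical.
cells='Foundation author asimov
row2 colour blue
row2 size large'

# Print the column names stored for a given row ID, one per line.
columns_of() {
    printf '%s\n' "$cells" | awk -v row="$1" '$1 == row { print $2 }'
}

# Print the value for a row ID and column name (prints nothing if absent).
value_of() {
    printf '%s\n' "$cells" |
        awk -v row="$1" -v col="$2" '$1 == row && $2 == col { print $3 }'
}

columns_of Foundation        # columns present in the first row
columns_of row2              # a different set of columns in the second row
value_of Foundation author   # look up one name-value pair
```

Because a row is just the set of cells that share a row ID, adding a new column to one row requires no change to any other row.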

By default, Hadoop loads data from flat files, and it is the responsibility of the MapReduce job to read and parse the data through data formatters. However, there are often use cases where the data is already in a structured form. Although it is possible to export this data into flat files, processing it with conventional MapReduce jobs in this way has several disadvantages:

  • Processing needs extra steps to convert and export the data
  • Exporting the data needs additional storage
  • Exporting and parsing take more computing power
  • Specific code has to be written to export and parse the data

HBase addresses these concerns by enabling users to read data directly from HBase and write results directly to HBase, without having to convert the data to flat files.

How to do it...

This section demonstrates how to install HBase.

  1. Download HBase 0.94.2 from http://hbase.apache.org/.
  2. Unpack the distribution by running the following command. We will call the resulting directory HBASE_HOME.
    >tar xfz hbase-0.94.2-SNAPSHOT.tar.gz
    
  3. Create a data directory to be used by HBase:
    >cd $HBASE_HOME
    >mkdir hbase-data
    
  4. Add the following to the HBASE_HOME/conf/hbase-site.xml file, setting hbase.rootdir to the absolute path of the data directory created in the previous step:
    <configuration>
    <property>
    <name>hbase.rootdir</name>
    <value>file:///Users/srinath/playground/hadoop-book/hbase-0.94.2/hbase-data</value>
    </property>
    </configuration>
  5. Start the HBase server by running the following command from HBASE_HOME:
    >./bin/start-hbase.sh
    
  6. Verify the HBase installation by starting the HBase shell from HBASE_HOME:
    >bin/hbase shell
    HBase Shell; enter 'help<RETURN>' for list of supported commands.
    Type "exit<RETURN>" to leave the HBase Shell
    Version 0.92.1, r1298924, Fri Mar  9 16:58:34 UTC 2012
    
  7. Create a table named test with a single column family, cf, and confirm that it exists using the following commands:
    hbase(main):001:0> create 'test', 'cf'
    0 row(s) in 1.8630 seconds
    
    hbase(main):002:0> list 'test'
    TABLE
    test
    1 row(s) in 0.0180 seconds
    
  8. Store the value val1 in the test table under the row ID row1 and the column cf:a (column family cf, qualifier a) using the following command:
    hbase(main):004:0> put 'test', 'row1', 'cf:a', 'val1'
    0 row(s) in 0.0680 seconds
    
  9. Scan the table using the following command. It prints all the data in the table:
    hbase(main):005:0> scan 'test'
    ROW      COLUMN+CELL
    row1     column=cf:a, timestamp=1338485017447, value=val1
    1 row(s) in 0.0320 seconds
    
  10. Get the row from the table using the following command, giving test as the table name and row1 as the row ID:
    hbase(main):006:0> get 'test', 'row1'
    COLUMN    CELL
    cf:a      timestamp=1338485017447, value=val1
    1 row(s) in 0.0130 seconds
    
    hbase(main):007:0> exit
    
  11. The preceding commands verify the HBase installation.
  12. When done, shut down HBase by running the following command from HBASE_HOME:
    > ./bin/stop-hbase.sh
    stopping hbase..............
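
Steps 3 and 4 can also be scripted so that the hbase.rootdir value always matches the data directory that was actually created. The following is a minimal sketch, not part of the HBase distribution: the function name write_hbase_site is hypothetical, and the example call writes into a scratch directory rather than a real HBASE_HOME so that it cannot overwrite an existing configuration.

```shell
#!/bin/sh
# Write a minimal hbase-site.xml that points hbase.rootdir at a local
# data directory, creating the directories if they do not exist.
#   $1: HBase home directory
#   $2: data directory; must be an absolute path so that file://$2 is a
#       valid file:/// URL
write_hbase_site() {
    hbase_home=$1
    data_dir=$2
    mkdir -p "$hbase_home/conf" "$data_dir"
    cat > "$hbase_home/conf/hbase-site.xml" <<EOF
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file://$data_dir</value>
  </property>
</configuration>
EOF
}

# Example run against a scratch directory (an assumption for illustration).
# Pass your real HBASE_HOME instead to configure an actual installation.
demo_home=$(mktemp -d)
write_hbase_site "$demo_home" "$demo_home/hbase-data"
cat "$demo_home/conf/hbase-site.xml"
```

Keeping the value element on a single line avoids stray whitespace ending up inside the configured path.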
    

How it works...

The preceding steps configure and run HBase in local mode. The start command starts the HBase server, and the HBase shell connects to the server and issues the commands.
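
The same start, verify, and stop cycle can be run without typing into the shell. The following is a sketch under two assumptions: that HBASE_HOME is exported in the environment, and that your HBase version accepts a command file as an argument to bin/hbase shell. The function name write_smoke_test is hypothetical. If HBASE_HOME is not set, the script only writes the command file and runs nothing.

```shell
#!/bin/sh
# Write the HBase shell commands from steps 7 to 10 into the file named by $1.
write_smoke_test() {
    cat > "$1" <<'EOF'
create 'test', 'cf'
list 'test'
put 'test', 'row1', 'cf:a', 'val1'
scan 'test'
get 'test', 'row1'
exit
EOF
}

script=$(mktemp)
write_smoke_test "$script"

# Run the commands only when an HBase installation is available.
# Assumption: `bin/hbase shell <file>` executes the commands in the file.
if [ -n "${HBASE_HOME:-}" ] && [ -x "$HBASE_HOME/bin/hbase" ]; then
    "$HBASE_HOME/bin/start-hbase.sh"
    "$HBASE_HOME/bin/hbase" shell "$script"
    "$HBASE_HOME/bin/stop-hbase.sh"
else
    echo "HBASE_HOME not set; commands written to $script"
fi
```

Note that create reports an error if the test table already exists from an earlier run, so this sketch is best suited to a fresh data directory.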

There's more...

The preceding commands show how to run HBase in local mode. To run HBase in distributed mode, see http://hbase.apache.org/book/standalone_dist.html#distributed.
