Apache HBase is a distributed Big Data store for Hadoop. It allows random, real-time, read/write access to Big Data. It is designed as a column-oriented data storage model, inspired by Google's Bigtable.
Following are the features for HBase:
Pre-requisites for RHBase are as follows:
Here we assume that users have already configured Hadoop on their Linux machine. If anyone wishes to know how to install Hadoop on Linux, please refer to Chapter 1, Getting Ready to Use R and Hadoop.
Following are the steps for installing HBase:
wget http://apache.cs.utah.edu/hbase/stable/hbase-0.94.11.tar.gz
tar -xzf hbase-0.94.11.tar.gz
cd hbase-0.94.11/
vi conf/hbase-site.xml
Update hbase-env.sh:
vi conf/hbase-env.sh
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HBASE_HOME=/usr/local/hbase-0.94.11
export HADOOP_INSTALL=/usr/local/hadoop
export HBASE_CLASSPATH=/usr/local/hadoop/conf
export HBASE_MANAGES_ZK=true
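The variables set in hbase-env.sh only help if they point at directories that actually exist on the machine. The following is a minimal sketch of a check for that; the helper name check_dirs is hypothetical and is not part of HBase. Note also that shell assignments must be written without spaces around the = sign, otherwise the export fails.

```shell
#!/bin/sh
# Hypothetical helper (not part of HBase): for each variable name given,
# report whether it is set and points to an existing directory.
# Returns non-zero if any check fails.
check_dirs() {
    rc=0
    for name in "$@"; do
        # Indirect lookup of the variable called $name (POSIX-safe via eval).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            rc=1
        elif [ ! -d "$value" ]; then
            echo "$name points to a missing directory: $value"
            rc=1
        else
            echo "$name ok: $value"
        fi
    done
    return $rc
}
```

After sourcing the file with . conf/hbase-env.sh, it could be called as check_dirs JAVA_HOME HBASE_HOME HADOOP_INSTALL HBASE_CLASSPATH.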
Update hbase-site.xml:
vi conf/hbase-site.xml
Set the HBase configuration in hbase-site.xml, which should look like the following code:
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/root/hadoop/hdata</value>
  </property>
</configuration>
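To confirm that a setting such as hbase.rootdir was saved as intended, the value can be read back from the file. The sketch below is one rough way to do this with awk; get_hbase_property is a hypothetical helper name, and it assumes each <name> and <value> element sits on a single line, with the value following its name. It is not a real XML parser.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one property from hbase-site.xml.
# Assumes each <name>...</name> and <value>...</value> fits on one line
# and that the value of a property comes after its name.
get_hbase_property() {
    # $1 = path to hbase-site.xml, $2 = property name to look up
    awk -v want="$2" '
        /<name>/ {
            n = $0
            sub(/.*<name>[ \t]*/, "", n)      # drop everything up to the name
            sub(/[ \t]*<\/name>.*/, "", n)    # drop the closing tag and the rest
        }
        /<value>/ {
            v = $0
            sub(/.*<value>[ \t]*/, "", v)
            sub(/[ \t]*<\/value>.*/, "", v)
            if (n == want) { print v; found = 1; exit }
        }
        END { exit !found }                   # non-zero status if not found
    ' "$1"
}
```

For example, get_hbase_property conf/hbase-site.xml hbase.rootdir would print the configured root directory, and the function returns a non-zero status when the property is absent.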
Copy the Hadoop configuration file and libraries into the HBase installation:
cp $HADOOP_HOME/conf/hdfs-site.xml $HBASE_HOME/conf
cp $HADOOP_HOME/hadoop-core-1.0.3.jar $HBASE_HOME/lib
cp $HADOOP_HOME/lib/commons-configuration-1.6.jar $HBASE_HOME/lib
cp $HADOOP_HOME/lib/commons-collections-3.2.1.jar $HBASE_HOME/lib
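The JAR file names in the copy step carry version numbers, so they may differ on a machine with another Hadoop release. As a small sketch under that assumption, the hypothetical helper copy_into below copies each file only if it exists and reports any that are missing, rather than stopping silently halfway.

```shell
#!/bin/sh
# Hypothetical helper: copy each listed file into a destination directory,
# reporting files that are missing. Returns non-zero if any copy fails.
copy_into() {
    dest=$1
    shift
    rc=0
    for src in "$@"; do
        if [ ! -f "$src" ]; then
            echo "missing: $src" >&2
            rc=1
        elif cp "$src" "$dest"/; then
            echo "copied $src"
        else
            rc=1
        fi
    done
    return $rc
}
```

With the paths used in this section, a call could look like copy_into "$HBASE_HOME/lib" "$HADOOP_HOME/hadoop-core-1.0.3.jar" "$HADOOP_HOME/lib/commons-configuration-1.6.jar".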
Following are the steps for installing Thrift:
wget http://archive.apache.org/dist/thrift/0.8.0/thrift-0.8.0.tar.gz
To extract the downloaded .tar.gz file, use the following command:
tar xzvf thrift-0.8.0.tar.gz
cd thrift-0.8.0/
./configure
make
make install
After installing HBase, we will see how to get the RHBase library.
To download rhbase, we use the following command:
wget https://github.com/RevolutionAnalytics/rhbase/blob/master/build/rhbase_1.2.0.tar.gz
R CMD INSTALL rhbase_1.2.0.tar.gz
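If a download returns an HTML page or is cut short, the file will still carry a .tar.gz name but the install or extract step will fail. A quick sketch of a sanity check is shown below; is_valid_tarball is a hypothetical helper that relies on tar -tzf, which lists an archive without extracting it and fails when the file is not a readable gzip tarball.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file exists and tar can list it
# as a gzip-compressed archive. Nothing is extracted.
is_valid_tarball() {
    [ -f "$1" ] && tar -tzf "$1" >/dev/null 2>&1
}
```

For example, is_valid_tarball rhbase_1.2.0.tar.gz && R CMD INSTALL rhbase_1.2.0.tar.gz would run the install only when the archive can be read.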
Once RHBase is installed, we can load the dataset in R from HBase with the help of RHBase:
hb.list.tables()
hb.new.table("student")
hb.describe.table("student_rhbase")
hb.get ('student_rhbase', 'mary')