The Fuse-DFS project allows us to mount HDFS on Linux (supports many other flavors of Unix as well) as a standard filesystem. This allows any program or user to access and interact with HDFS similar to a traditional filesystem.
You must have FUSE installed on your system; Fuse-DFS depends on the FUSE kernel module and its user-space libraries. JAVA_HOME must be set to point to a JDK, not to a JRE. You must also have root privileges on the node where you plan to mount the HDFS filesystem.
The following recipe assumes you already have pre-built libhdfs libraries. Hadoop ships pre-built libhdfs libraries for the Linux x86_64 and i386 platforms. If you are using some other platform, first follow the Building libhdfs section below to build the libhdfs libraries.
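Before starting, it can help to confirm whether your machine matches one of the pre-built platforms. The sketch below is an assumption-laden convenience check: the directory names (Linux-amd64-64, Linux-i386-32) are those used under c++/ in the Hadoop distribution, and HADOOP_HOME is assumed to be set.

```shell
# Sketch: check whether this machine matches a platform for which
# Hadoop ships pre-built libhdfs libraries. Directory names below are
# the ones used under c++/ in the Hadoop distribution (an assumption
# to verify against your release).
ARCH=$(uname -m)
case "$ARCH" in
  x86_64) PREBUILT="Linux-amd64-64" ;;
  i?86)   PREBUILT="Linux-i386-32" ;;
  *)      PREBUILT="" ;;
esac
if [ -n "$PREBUILT" ] && [ -d "${HADOOP_HOME:-.}/c++/$PREBUILT/lib" ]; then
  echo "pre-built libhdfs available: $PREBUILT"
else
  echo "no pre-built libhdfs for $ARCH; build libhdfs first"
fi
```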
The following steps show you how to mount an HDFS filesystem as a standard file system on Linux:
Go to $HADOOP_HOME and create a new directory named build:
>cd $HADOOP_HOME
>mkdir build
Create a symbolic link to the pre-built libhdfs libraries inside the build directory:
>ln -s c++/Linux-amd64-64/lib/ build/libhdfs
Copy the c++ directory to the build folder:
>cp -R c++/ build/
Execute the following command in $HADOOP_HOME. This command generates the fuse_dfs and fuse_dfs_wrapper.sh files in the build/contrib/fuse-dfs/ directory:
>ant compile-contrib -Dlibhdfs=1 -Dfusedfs=1
Check the paths in fuse_dfs_wrapper.sh and correct them if needed. You may have to change the libhdfs path in the following line:
export LD_LIBRARY_PATH=$JAVA_HOME/jre/lib/$OS_ARCH/server:$HADOOP_HOME/build/libhdfs/:/usr/local/lib
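A quick way to catch a wrong path before mounting is to walk the entries of that LD_LIBRARY_PATH value and report which directories actually exist. This is a sketch: OS_ARCH (e.g. amd64 or i386) and the fallback defaults below are assumptions to adjust for your system.

```shell
# Sketch: sanity-check each directory that fuse_dfs_wrapper.sh puts on
# LD_LIBRARY_PATH. OS_ARCH and the fallback values are assumptions;
# mirror the exact line from your wrapper script.
LIB_PATH="${JAVA_HOME:-/usr/java}/jre/lib/${OS_ARCH:-amd64}/server:${HADOOP_HOME:-.}/build/libhdfs:/usr/local/lib"
for dir in $(echo "$LIB_PATH" | tr ':' ' '); do
  if [ -d "$dir" ]; then
    echo "found:   $dir"
  else
    echo "missing: $dir"
  fi
done
```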
Uncomment the user_allow_other option in /etc/fuse.conf. Then create a directory to serve as the mount point:
>mkdir /u/hdfs
Execute the following commands in the build/contrib/fuse-dfs/ directory. You have to execute the mount command with root privileges. Make sure that the HADOOP_HOME and JAVA_HOME environment variables are set properly in the root environment as well. The optional -d parameter enables debug mode; it is helpful to run the command in debug mode to identify any errors when you run it for the first time. The rw parameter mounts the filesystem read-write (ro for read-only). -oserver must point to the NameNode hostname, and -oport should provide the NameNode port number:
>chmod a+x fuse_dfs_wrapper.sh
>./fuse_dfs_wrapper.sh rw -oserver=localhost -oport=9000 /u/hdfs/ -d
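Once mounted, any program can use ordinary filesystem operations on HDFS. The sketch below assumes the /u/hdfs mount point from the previous step; so that it can be tried anywhere, a temporary directory stands in when HDFS is not actually mounted.

```shell
# Sketch: standard tools working against a Fuse-DFS mount point.
# MOUNT_POINT is assumed to be /u/hdfs when HDFS is mounted; a temp
# directory is used here as a stand-in so the commands run anywhere.
MOUNT_POINT="${MOUNT_POINT:-$(mktemp -d)}"
echo "hello hdfs" > "$MOUNT_POINT/greeting.txt"  # ordinary write
cat "$MOUNT_POINT/greeting.txt"                  # ordinary read
ls -l "$MOUNT_POINT"                             # ordinary listing
df -h "$MOUNT_POINT"                             # shows the fuse_dfs mount when live
```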
Fuse-DFS is based on FUSE (Filesystem in Userspace). The FUSE project (http://fuse.sourceforge.net/) makes it possible to implement filesystems in user space. Fuse-DFS interacts with the HDFS filesystem using the libhdfs C API. libhdfs uses JNI to spawn a JVM that communicates with the configured HDFS NameNode.
Many instances of HDFS can be mounted on different directories using Fuse-DFS by repeating the preceding steps with a different mount point and NameNode for each instance.
In order to build libhdfs, you must have the following software installed in your system:
- ant-nodeps and ant-trax packages
- automake package
- libtool package
- zlib-devel package

Compile libhdfs by executing the following command in $HADOOP_HOME:
>ant compile-c++-libhdfs -Dislibhdfs=1
Package the distribution together with libhdfs by executing the following command. Provide the path to JDK 1.5 using the -Djava5.home property, and provide the path to the Apache Forrest installation using the -Dforrest.home property:
>ant package -Djava5.home=/u/jdk1.5 -Dforrest.home=/u/apache-forrest-0.8
Check whether the build/libhdfs directory contains the libhdfs.* files. If it doesn't, copy those files to build/libhdfs from the build/c++/<your_architecture>/lib directory:
>cp -R build/c++/<your_architecture>/lib/ build/libhdfs
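The check above can be scripted. This is a hedged sketch: it assumes you run it from $HADOOP_HOME (or with HADOOP_HOME set), and it only reports whether the expected libhdfs.* artifacts are present.

```shell
# Sketch: verify the libhdfs build artifacts are in build/libhdfs
# before moving on to the Fuse-DFS build. Assumes the current
# directory is $HADOOP_HOME if HADOOP_HOME is unset.
LIBDIR="${HADOOP_HOME:-.}/build/libhdfs"
if ls "$LIBDIR"/libhdfs.* >/dev/null 2>&1; then
  echo "libhdfs libraries are in place"
else
  echo "libhdfs.* not found in $LIBDIR; copy them from build/c++/<your_architecture>/lib"
fi
```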