Hadoop supports specifying multiple directories for the DataNode data directory. This feature lets us use multiple disks or volumes to store the data blocks on each DataNode. Hadoop tries to store equal amounts of data in each directory. Hadoop also supports limiting the amount of disk space used by HDFS.
The following steps will show you how to add multiple disk volumes:
In $HADOOP_HOME/conf/hdfs-site.xml, provide a comma-separated list of directories corresponding to the data storage locations in each volume under the dfs.data.dir property.
<property>
  <name>dfs.data.dir</name>
  <value>/u1/hadoop/data,/u2/hadoop/data</value>
</property>
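As a quick sanity check on the comma-separated value, the following minimal sketch parses a configuration fragment and lists the directories it names. The helper name data_dirs and the embedded HDFS_SITE string are assumptions made for illustration; they are not part of Hadoop, and the sketch only mimics how such a value could be split.

```python
import xml.etree.ElementTree as ET

# Example configuration fragment mirroring the dfs.data.dir property
# from this recipe (embedded here so the sketch is self-contained).
HDFS_SITE = """
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/u1/hadoop/data,/u2/hadoop/data</value>
  </property>
</configuration>
"""


def data_dirs(xml_text, prop="dfs.data.dir"):
    """Hypothetical helper: return the directories listed under `prop`."""
    root = ET.fromstring(xml_text)
    for p in root.iter("property"):
        if p.findtext("name") == prop:
            value = p.findtext("value") or ""
            # Split on commas and drop surrounding whitespace and empty entries.
            return [d.strip() for d in value.split(",") if d.strip()]
    return []


print(data_dirs(HDFS_SITE))  # ['/u1/hadoop/data', '/u2/hadoop/data']
```

Each entry in the resulting list should point to a directory on a different disk or volume for the configuration to spread data across them.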
To limit HDFS disk usage, add the following property to $HADOOP_HOME/conf/hdfs-site.xml to reserve space for non-DFS usage. The value specifies the number of bytes per volume that HDFS cannot use.
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>6000000000</value>
  <description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.</description>
</property>
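Because the reservation applies per volume, the total space withheld from HDFS grows with the number of data directories. The following rough sketch illustrates the arithmetic. The volume sizes are hypothetical, and the function usable_bytes is a simplification for illustration; the DataNode's actual accounting may differ in detail.

```python
# Value of dfs.datanode.du.reserved from this recipe, in bytes.
RESERVED = 6000000000

# Hypothetical raw capacities for the two volumes, in bytes.
VOLUMES = {
    "/u1/hadoop/data": 500 * 10**9,
    "/u2/hadoop/data": 250 * 10**9,
}


def usable_bytes(capacity, reserved=RESERVED):
    """Simplified estimate: capacity minus the per-volume reservation."""
    return max(capacity - reserved, 0)


for path, capacity in VOLUMES.items():
    print(path, usable_bytes(capacity))

# The reservation is subtracted once for every volume.
total_usable = sum(usable_bytes(c) for c in VOLUMES.values())
print("total usable:", total_usable)
```

With two volumes, 12,000,000,000 bytes in total are kept free for non-DFS use under these assumptions.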