There are several situations in which you may want to decommission one or more DataNodes from an HDFS cluster. This recipe shows how to gracefully decommission DataNodes without incurring data loss and without having to restart the cluster.
The following steps show you how to decommission DataNodes gracefully:

1. Edit the conf/hdfs-site.xml file by adding the following property:

<property>
<name>dfs.hosts.exclude</name>
<value>[FULL_PATH_TO_THE_EXCLUDE_FILE]</value>
<description>Names a file that contains a list of hosts that are not permitted to connect to the namenode. The full pathname of the file must be specified. If the value is empty, no hosts are excluded.</description>
</property>
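The exclude file itself is a plain text file with one hostname per line. A minimal sketch of creating one, assuming a path of /tmp/hdfs-exclude and hypothetical DataNode hostnames (both are placeholders; use the path you configured in dfs.hosts.exclude):

```shell
# Hypothetical example: create the exclude file referenced by
# dfs.hosts.exclude (the path and hostnames below are placeholders).
EXCLUDE_FILE=/tmp/hdfs-exclude
cat > "$EXCLUDE_FILE" <<'EOF'
datanode3.example.com
datanode4.example.com
EOF
# One hostname per line, nothing else.
cat "$EXCLUDE_FILE"
```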
2. Add the hostnames of the nodes to be decommissioned to the exclude file.

3. Run the following command to make the NameNode reread the exclude file:

>bin/hadoop dfsadmin -refreshNodes

4. Monitor the progress of the decommissioning using the following command:

>bin/hadoop dfsadmin -report
.....
Name: myhost:50010
Decommission Status : Decommission in progress
Configured Capacity: ....
.....
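If you want to check the status from a script rather than by eye, one option is to pull the Decommission Status line out of the report. A minimal sketch; the report text below is a canned sample standing in for the real bin/hadoop dfsadmin -report output, and the capacity figure is illustrative:

```shell
# Canned sample of the relevant report lines (illustrative only);
# on a real cluster this text would come from: bin/hadoop dfsadmin -report
report='Name: myhost:50010
Decommission Status : Decommission in progress
Configured Capacity: 84278861824 (78.49 GB)'

# Extract the status value: split on " : " and print the second field.
status=$(printf '%s\n' "$report" | awk -F' : ' '/^Decommission Status/ {print $2}')
echo "$status"
```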
5. When you want to add the nodes back into the cluster, remove their hostnames from the exclude file and execute the bin/hadoop dfsadmin -refreshNodes command again.

The decommissioning process can also be stopped before it completes by removing the node's hostname from the exclude file and then executing the bin/hadoop dfsadmin -refreshNodes command.

When a node is being decommissioned, HDFS replicates the blocks stored on that node to the other nodes in the cluster. Decommissioning can be a slow process because HDFS deliberately throttles it to avoid overwhelming the cluster. Shutting down nodes without decommissioning them first may result in data loss.
After the decommissioning is completed, the nodes mentioned in the exclude
file are not allowed to communicate with the NameNode.
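Because decommissioning can take a long time, a script may want to block until no node still reports "Decommission in progress". A sketch of such a wait loop; the hadoop shell function below is a stub standing in for bin/hadoop so the example is self-contained, and would be deleted on a real cluster:

```shell
# Stub standing in for bin/hadoop (an assumption for illustration);
# on a real cluster, delete this function and call bin/hadoop directly.
hadoop() {
  echo "Decommission Status : Decommissioned"
}

# Poll the report until no node is still mid-decommission.
while hadoop dfsadmin -report | grep -q "Decommission in progress"; do
  sleep 30   # throttle polling; HDFS decommissions slowly by design
done
echo "decommission complete"
```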