HDFS basic command-line file operations

HDFS is a distributed filesystem and, like a Unix filesystem, it lets users manipulate files and directories with shell commands. This recipe shows how to run the basic HDFS file operations from the command line.

Many HDFS commands closely mirror their Unix counterparts. For example, consider the following command:

>hadoop dfs -cat /data/foo.txt

The command reads the /data/foo.txt file and prints it to the screen, just like the cat command on a Unix system.

Getting ready

Start the HDFS server by following the Setting up HDFS recipe.

How to do it...

  1. Change the directory to HADOOP_HOME.
  2. Run the following command to create a new directory called /test:
    >bin/hadoop dfs -mkdir /test
    
  3. The HDFS filesystem has / as its root directory, just like the Unix filesystem. Run the following command to list the contents of the HDFS root directory:
    >bin/hadoop dfs -ls /
    
  4. Run the following command to copy the local README.txt file to the /test directory:
    >bin/hadoop dfs -put README.txt /test
    
  5. Run the following command to list the /test directory:
    >bin/hadoop dfs -ls /test
    
    Found 1 items
    -rw-r--r--   1 srinath supergroup       1366 2012-04-10 07:06 /test/README.txt
    
  6. Run the following command to copy /test/README.txt to the local directory as README-NEW.txt:
    >bin/hadoop dfs -get /test/README.txt README-NEW.txt
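
The steps above can be collected into one small script. This is only a sketch: it assumes it is run from HADOOP_HOME, that README.txt exists in the current directory, and that HDFS is already running. The names run_dfs, walkthrough, and HADOOP_BIN are hypothetical helpers introduced for illustration; they are not part of Hadoop. If the launcher is not found, the script only prints the commands, which makes it safe to try as a dry run.

```shell
#!/bin/sh
# Assumption: the script is started from HADOOP_HOME, so bin/hadoop is the launcher.
HADOOP_BIN="${HADOOP_BIN:-bin/hadoop}"

# Print the command, then run it only if the launcher is executable.
# Without a Hadoop installation this behaves as a dry run.
run_dfs() {
    echo "+ $HADOOP_BIN dfs $*"
    if [ -x "$HADOOP_BIN" ]; then
        "$HADOOP_BIN" dfs "$@"
    fi
}

# The five HDFS commands from the recipe, stopping at the first failure.
walkthrough() {
    run_dfs -mkdir /test &&
    run_dfs -ls / &&
    run_dfs -put README.txt /test &&
    run_dfs -ls /test &&
    run_dfs -get /test/README.txt README-NEW.txt
}

walkthrough
```

Running the script a second time against a live cluster may report errors for paths that already exist, so treat it as a one-shot walkthrough rather than a repeatable job.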
    

How it works...

When a command is issued, the client talks to the HDFS NameNode on the user's behalf and carries out the operation. Generally, we refer to a file or folder using a path that starts with /, for example, /data; the client then picks up the NameNode address from the configuration files in the HADOOP_HOME/conf directory.

However, if needed, we can use a fully qualified path to force the client to talk to a specific NameNode. For example, hdfs://bar.foo.com:9000/data asks the client to talk to the NameNode running on bar.foo.com at port 9000.
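
The two addressing styles can be compared with a short sketch. It assumes a Hadoop 1.x layout, where the default NameNode address is typically set by the fs.default.name property in conf/core-site.xml; that property name and file are assumptions not stated in this recipe. The default_fs function is a hypothetical helper, and the sample configuration is written to a temporary file purely for illustration. The script prints commands and does not contact a cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the fs.default.name value from a core-site.xml file.
# Assumes <name> and <value> sit on consecutive lines.
default_fs() {
    sed -n '/<name>fs\.default\.name<\/name>/{n;s/.*<value>\(.*\)<\/value>.*/\1/p;}' "$1"
}

# Sample configuration for illustration only; a real one lives in HADOOP_HOME/conf.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://bar.foo.com:9000</value>
  </property>
</configuration>
EOF

base="$(default_fs "$conf")"

# A path starting with / is resolved against the configured NameNode.
echo "/data resolves to $base/data"

# A fully qualified URI names the NameNode explicitly, regardless of configuration.
echo "bin/hadoop dfs -ls hdfs://bar.foo.com:9000/data"

rm -f "$conf"
```

With this sample configuration both forms point at the same NameNode; the fully qualified form matters when the target differs from the configured default.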

There's more...

HDFS supports counterparts of many Unix commands, such as cp, mv, and chown, and they follow the same pattern as the commands discussed above. The document http://hadoop.apache.org/docs/r1.0.3/file_system_shell.html provides a list of all commands. We will use these commands throughout the recipes in this book.
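
As a sketch of that pattern, the following script applies cp, mv, and chown to the file uploaded earlier. The file names README-copy.txt and README-moved.txt are made up for illustration, the owner and group are taken from the sample listing above, and run_dfs and HADOOP_BIN are hypothetical helpers. Changing ownership normally requires HDFS superuser rights, so that step may fail for an ordinary user. If the launcher is not found, the commands are only printed.

```shell
#!/bin/sh
# Assumption: the script is started from HADOOP_HOME, so bin/hadoop is the launcher.
HADOOP_BIN="${HADOOP_BIN:-bin/hadoop}"

# Print the command, then run it only if the launcher is executable.
run_dfs() {
    echo "+ $HADOOP_BIN dfs $*"
    if [ -x "$HADOOP_BIN" ]; then
        "$HADOOP_BIN" dfs "$@"
    fi
}

# Copy a file within HDFS (illustrative destination name).
run_dfs -cp /test/README.txt /test/README-copy.txt

# Rename (move) the copy within HDFS.
run_dfs -mv /test/README-copy.txt /test/README-moved.txt

# Change owner and group; typically needs superuser rights.
run_dfs -chown srinath:supergroup /test/README-moved.txt
```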
