HDFS (Hadoop Distributed File System)

HDFS is a self-controlled, distributed filesystem, providing scalable and reliable data storage on commodity hardware with a fault tolerance mechanism. HDFS, in association with MapReduce, manages data storage on a clustered environment by scaling up, based on content and resource consumption. Unlike RDBMS systems, HDFS can store data in multiple formats such as files, images, and videos in addition to the text. In the case of exceptions on one node, HDFS switches between the nodes and transfers information at rapid speed.

HDFS behaves in the master slave architecture with a single NameNode and a number of DataNodes. The master (NameNode) is responsible for managing the namespace of the filesystem and controling the client file access. The DataNodes are responsible for serving read and write requests from the filesystem's clients. The DataNodes also perform block creation, deletion, and replication on instructions from NameNode.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset