A closer look to hdfs


I am keeping up-skilling on Hadoop & NoSql in general, so this post might be useful if you’re doing the same…

Hdfs is optimized for steaming operations, to achieve high performance when processing huge amount of data, a trade off is it’s poor on random seeks, but because of the nature of the way data is meant to be processed, this is not a real limitation.
The structure is intrinsically write once, so data cannot be deleted or changes; because there is no real reread of the same blocks (in the steaming processing) no local cache needs; the failure handling is a core feature; for example even if one rack (of machine) has communication issues, hadoop is still able to keep m/r jobs going (with degrading performances as probably data needs to be re-replicated);

Namenode and datanode

The clusters can supply different services, consider them as nodes that can do different responsibilities along the time-line. The hdfs uses a master-slave model: the namenode is a traffic coordinator, it provides a single namespace for all the file-system (of course it’s spread along the clusters), the datanode is where the data is located, namenode looks after any datanode that fails (when it’s so it’s considered not reachable or down).
Let’s see a diagram where a client uses the namenode when using hdfs, so the namenode check the status of the datanodes and then, they start to stream the data requested by the client


Note the secondary namenode is not a failover node, it does checkpoint for the namenode fs so if the namenode goes down it can used to restart the process; it’s usually on a separate server.

Filesystem Namespace

Hdfs support the majority command of unix fs, (create, move, cp etc), this operations happened though namenode, the manipulation goes through the namenode not directly accessing the datanode. Some features such has hard/soft link are not supported as not relevant in the hdfs world.

Block replication
Because of the streaming nature of hdfs and quantity of data, each datablock is 64 MB.

If we have a file that have this size


Hdfs will break this in 5 parts (hp the size is 64*4 + X) we have the block is replicated 3 times (by defualt), so the namenode will spread the 5 blocks on the available datanode


The namenode itself maintains an imaginary table of where the data is replicated so in case of failure it will instruct a free datanode to replicate the data of the other 2 available.

Reading hdfs


let’s see the steps when a client reads from hdfs.
Step 1
the client calls the distributed file-system, it calls the namenode to figure out from which datanode in the cluster the block (we want read data from) is located in the cluster. Don’t forget that as data is replicated, so we expected to have a list of namenodes holding copies of the data, from which we can read;
Step 2
This list of datanodes is based on the proximity to the client; this makes sense when looking to minimize wait time; the namenode is aware of this information when the cluster is built (adding/removing clusters)
Step 3
A FSDataInputStream is returned to the client, this is an objects that abstrscts the process
Step 4
so the client can read the data from the datanode; data can be streamed to the client
Step 5
Connections are opened/closed during this process to read data from the cluster, don’t forget is a distributed system
Writing to hdfs
the writing is basically specular and the main difference is we have a node pipe of blocks as the data is replicated across (3) datanodes, and we have a FSDataOutputStream and checking phase oh Acknowledge so the namenode can update is internal references, sure the data is correctly replicated.


Buy this http://www.informit.com/store/hadoop-fundamentals-livelessons-video-training-9780133392821


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s