Hadoop Distributed File System
HDFS is the file-system layer of Hadoop. Every operating system has a file system (in Windows, for example, NTFS or FAT32) that manages metadata about your directories and files; in the same way, in HDFS the master node (the NameNode) manages metadata about all the files and directories present in the whole cluster.
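The NameNode's role as a metadata catalogue can be pictured as a simple lookup table. The sketch below is an illustration only, with invented names, and not the real NameNode data structures: clients ask the NameNode where a file's blocks live, then read the bytes directly from the DataNodes.

```python
# Toy sketch of the NameNode idea: one in-memory catalogue of metadata
# for every file in the cluster. Illustrative only, not real HDFS code.

class ToyNameNode:
    def __init__(self):
        # path -> metadata (size, replication factor, block locations)
        self.metadata = {}

    def create_file(self, path, size_mb, replication=3):
        self.metadata[path] = {
            "size_mb": size_mb,
            "replication": replication,
            "block_locations": {},  # block id -> list of DataNode names
        }

    def lookup(self, path):
        # Clients ask the NameNode *where* data lives; the actual bytes
        # are read directly from the DataNodes, not through the master.
        return self.metadata.get(path)

nn = ToyNameNode()
nn.create_file("/user/demo/input.txt", size_mb=200)
print(nn.lookup("/user/demo/input.txt"))
```

The key point the sketch shows is that the NameNode stores only metadata; file contents never pass through it.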
Let us understand the concept of the Hadoop Distributed File System.
1- Hadoop works on the cluster-computing concept, using a master-slave architecture.
2- The master and the slaves are machines, each serving a couple of services (daemons).
Master:
1- NameNode
2- JobTracker
3- Secondary NameNode
Slave:
1- DataNode
2- TaskTracker
3- Child JVM
1- The JobTracker is the master daemon and the TaskTracker is the slave daemon. The JobTracker distributes a job's tasks across all the nodes where the data partitions exist.
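This "send the work to the data" idea (data locality) can be sketched as a tiny scheduling function. The function name and data shapes below are invented for illustration; the real JobTracker is far more involved:

```python
# Hypothetical sketch of data-locality scheduling: prefer to run each
# task on a node that already stores that task's input block, so the
# data does not have to cross the network.

def assign_tasks(block_locations, free_nodes):
    """block_locations: block id -> nodes holding a replica of it.
    free_nodes: nodes with a free task slot.
    Returns: block id -> chosen node."""
    assignment = {}
    for block, holders in block_locations.items():
        # A node that holds the data gives a local read (fast path).
        local = [n for n in holders if n in free_nodes]
        # Otherwise fall back to any free node (remote read).
        assignment[block] = local[0] if local else sorted(free_nodes)[0]
    return assignment

blocks = {"blk_1": ["node1", "node2"], "blk_2": ["node3"]}
print(assign_tasks(blocks, {"node2", "node3"}))
```

Under these assumptions, `blk_1` lands on `node2` and `blk_2` on `node3`, because both nodes already hold the needed block.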
2- Every DataNode also runs a TaskTracker and child JVMs, and all TaskTrackers register themselves with the JobTracker while they are working on a Hadoop job.
3- The job is executed by the child JVMs, and all TaskTrackers send reports to the JobTracker, like acknowledgement messages.
4- All DataNodes send acknowledgement (heartbeat) messages to the master NameNode to report that they are still alive and working properly.
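The heartbeat mechanism boils down to the master remembering when each node last reported in, and treating long-silent nodes as dead. A minimal sketch, with invented names and an assumed timeout value:

```python
# Sketch of heartbeat-based liveness: the NameNode records the last
# heartbeat time per DataNode; a node silent longer than the timeout
# is considered dead. Timestamps are plain seconds for illustration.

def live_nodes(last_heartbeat, now, timeout=30):
    """last_heartbeat: node -> timestamp of its most recent heartbeat.
    Returns the set of nodes still considered alive at time `now`."""
    return {n for n, t in last_heartbeat.items() if now - t <= timeout}

beats = {"node1": 100, "node2": 95, "node3": 60}
print(sorted(live_nodes(beats, now=120)))
```

Here `node3` has been silent for 60 seconds, past the 30-second timeout, so it is dropped from the live set; the real cluster would then re-replicate its blocks elsewhere.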
5- The Secondary NameNode works like a helper to the NameNode. Its task is to periodically merge the recent edit log into the cluster's current image file (fsimage) and update the NameNode with the resulting checkpoint.
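That checkpoint step amounts to replaying the edit log on top of the last saved image. The sketch below uses a plain dict as the "image" and invented operation names; it is only meant to show the merge idea, not the real fsimage format:

```python
# Sketch of the Secondary NameNode's checkpoint: start from the last
# fsimage, replay each logged edit, and hand back the merged image.

def checkpoint(fsimage, edits):
    """fsimage: path -> metadata; edits: list of ("create", path, meta)
    or ("delete", path) tuples, in the order they happened."""
    image = dict(fsimage)  # start from the last saved image
    for edit in edits:
        if edit[0] == "create":
            _, path, meta = edit
            image[path] = meta
        elif edit[0] == "delete":
            image.pop(edit[1], None)
    return image  # this merged image is what the NameNode gets back

old_image = {"/a": {"size": 1}}
edit_log = [("create", "/b", {"size": 2}), ("delete", "/a")]
print(checkpoint(old_image, edit_log))
```

Merging offline like this keeps the NameNode's edit log short, so a restart does not have to replay a huge log.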
6- So finally: the TaskTracker sends reports to the JobTracker, the DataNode sends reports to the NameNode, and the Secondary NameNode updates the master NameNode.