
Hadoop Distributed File System (HDFS)


HDFS is the file system management layer of Hadoop. Just as every operating system has a file system (Windows, for example, uses NTFS or FAT32) to manage metadata about directories and files, HDFS has a master node, the NameNode, that manages metadata about every file and directory in the whole cluster.
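To make the NameNode's role concrete, here is a minimal Python sketch of the kind of metadata it keeps. The paths, block IDs, and hostnames are invented for the example; real HDFS persists this information in the fsimage and edit log, not in plain dictionaries.

```python
# Toy model of NameNode metadata: only names and locations, never file contents.
# The NameNode maps each file path to its blocks, and each block to the
# DataNodes that hold a replica of it.
namespace = {
    "/user/data/logs.txt": ["blk_001", "blk_002"],
}
block_locations = {
    "blk_001": ["datanode1", "datanode2", "datanode3"],
    "blk_002": ["datanode2", "datanode3", "datanode4"],
}

def locate(path):
    """Return (block, replica hosts) pairs for a file, like a client lookup."""
    return [(blk, block_locations[blk]) for blk in namespace[path]]

print(locate("/user/data/logs.txt"))
```

A client reading a file only asks the NameNode *where* the blocks are, then streams the actual bytes directly from the DataNodes.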


Hadoop is not a single tool; it is an ecosystem built around a distributed file system. When a user uploads data to HDFS, Hadoop distributes that data across multiple nodes. HDFS and MapReduce are the two base components of the Hadoop ecosystem.
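The distribution step can be sketched like this: a file is cut into fixed-size blocks, and each block's replicas are spread across several nodes. The 128 MB block size matches the modern HDFS default, but the node names and the simple round-robin placement below are simplifications for illustration.

```python
# Sketch of HDFS-style block splitting and replica placement (assumptions:
# 4 hypothetical nodes, round-robin placement instead of HDFS's real
# rack-aware policy).
BLOCK_SIZE = 128 * 1024 * 1024   # bytes per block (HDFS default)
REPLICATION = 3                  # copies of each block (HDFS default)
NODES = ["dn1", "dn2", "dn3", "dn4"]

def split_and_place(file_size):
    """Return one (block_index, replica_nodes) entry per block of the file."""
    n_blocks = -(-file_size // BLOCK_SIZE)   # ceiling division
    placement = []
    for i in range(n_blocks):
        replicas = [NODES[(i + r) % len(NODES)] for r in range(REPLICATION)]
        placement.append((i, replicas))
    return placement

# A 300 MB file needs 3 blocks (2 full + 1 partial), each stored 3 times.
print(split_and_place(300 * 1024 * 1024))
```

Because every block lives on several nodes, losing one machine never loses data, and MapReduce tasks can run wherever a copy of their input already sits.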

Let us understand the concept of the Hadoop distributed file system.
1- Hadoop works on the cluster computing concept, which has a master-slave architecture.
2- Master and slave machines each run a set of services.
Master:
    1- NameNode
    2- JobTracker
    3- Secondary NameNode

Slave:
    1- DataNode
    2- TaskTracker
    3- Child JVM

1- JobTracker is the master daemon and TaskTracker is the slave daemon. The JobTracker distributes a job across all the nodes where the data partitions exist.

2- Every data node runs a TaskTracker and child JVMs. All TaskTrackers register themselves with the JobTracker while they are working on a Hadoop job.

3- The job is executed by child JVMs, and every TaskTracker sends progress reports to the JobTracker, much like acknowledgement messages.
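Points 1-3 can be sketched as a toy locality-aware scheduler: the JobTracker prefers to assign a task to a node that already holds the task's input block, falling back to any free node otherwise. The block and node names here are hypothetical, and real Hadoop scheduling is considerably more involved.

```python
# Sketch of JobTracker-style task assignment with data locality (assumed
# block placements and one task slot per node, for illustration only).
block_hosts = {
    "blk_001": ["dn1", "dn2"],
    "blk_002": ["dn2", "dn3"],
    "blk_003": ["dn3", "dn4"],
}
free_slots = {"dn1": 1, "dn2": 1, "dn3": 1, "dn4": 1}

def assign(block):
    """Prefer a node holding the block; otherwise take any node with a free slot."""
    local = [n for n in block_hosts[block] if free_slots[n] > 0]
    node = local[0] if local else next(n for n, s in free_slots.items() if s > 0)
    free_slots[node] -= 1
    return node

for blk in block_hosts:
    print(blk, "->", assign(blk))
```

Shipping the computation to the data, rather than the data to the computation, is the core idea that makes MapReduce efficient on large files.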

4- All DataNodes send heartbeat messages to the master NameNode to report that they are still alive and working properly.
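That heartbeat mechanism can be sketched as follows: the NameNode records when each DataNode last reported in and treats silent nodes as dead. The 10-minute timeout mirrors the classic HDFS default; the clock values here are simulated rather than read from a real timer.

```python
# Sketch of NameNode-style liveness tracking via heartbeats.
HEARTBEAT_TIMEOUT = 600  # seconds of silence before a node is considered dead

last_heartbeat = {}      # node -> simulated time of its last heartbeat

def receive_heartbeat(node, now):
    """Record that a DataNode checked in at simulated time `now`."""
    last_heartbeat[node] = now

def live_nodes(now):
    """Nodes whose last heartbeat is within the timeout window."""
    return sorted(n for n, t in last_heartbeat.items()
                  if now - t <= HEARTBEAT_TIMEOUT)

receive_heartbeat("dn1", now=0)
receive_heartbeat("dn2", now=500)
# At t=700, dn1 has missed the window (700 - 0 > 600) but dn2 is still live.
print(live_nodes(now=700))
```

When a node goes silent, the real NameNode re-replicates the blocks that node held onto the surviving DataNodes.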

5- The Secondary NameNode works as a helper to the NameNode. Its task is to periodically merge the recent namespace edits into the current image of the cluster, which is stored in the fsimage file, and update the NameNode with the result.

6- So, in summary: TaskTrackers send reports to the JobTracker, DataNodes send reports to the NameNode, and the Secondary NameNode updates the master NameNode.
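The checkpoint described in point 5 can be modeled as replaying the edit log onto the last fsimage to produce a fresh snapshot. The operations and paths below are invented for illustration; real fsimage and edit-log files use a binary format, not Python dictionaries.

```python
# Sketch of a Secondary NameNode checkpoint: fsimage + edit log -> new fsimage.
fsimage = {"/user": "dir"}                       # last saved namespace snapshot
edit_log = [("mkdir", "/user/data"),             # changes made since then
            ("create", "/user/data/logs.txt"),
            ("delete", "/user/data/logs.txt")]

def checkpoint(image, edits):
    """Apply each logged operation to a copy of the image; return the new snapshot."""
    new_image = dict(image)
    for op, path in edits:
        if op == "mkdir":
            new_image[path] = "dir"
        elif op == "create":
            new_image[path] = "file"
        elif op == "delete":
            new_image.pop(path, None)
    return new_image

print(checkpoint(fsimage, edit_log))
```

Because the merged fsimage is shipped back to the NameNode, a restart only has to replay the short edit log written since the last checkpoint instead of the cluster's entire history.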




