Thursday, February 22, 2018

Different ways to start Hadoop daemon processes

There are several ways to start the Hadoop daemon processes. Newbies usually know how to start the processes, but they don't know the differences among the methods and when to use each one.

So basically, Hadoop processes can be started or stopped in three ways:

1- start-all.sh and stop-all.sh
2- start-dfs.sh, stop-dfs.sh and start-yarn.sh, stop-yarn.sh
3- hadoop-daemon.sh start namenode/datanode and hadoop-daemon.sh stop namenode/datanode

Differences


1- start-all.sh and stop-all.sh: Used to start and stop all the Hadoop daemons at once. Issuing either command on the master machine will start/stop the daemons on all the nodes of the cluster.
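As a minimal sketch (assuming $HADOOP_HOME points at the Hadoop installation directory and passwordless SSH is configured from the master to every slave node, as a standard cluster setup requires):

```shell
# Assumes $HADOOP_HOME is set and passwordless SSH from the master
# to all slave nodes is configured.

# One command on the master brings up every daemon on every node
# (NameNode, SecondaryNameNode, DataNodes, ResourceManager, NodeManagers):
$HADOOP_HOME/sbin/start-all.sh

# ...and one command brings them all down again:
$HADOOP_HOME/sbin/stop-all.sh
```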

2- start-dfs.sh, stop-dfs.sh and start-yarn.sh, stop-yarn.sh: Same as above, but these start/stop the HDFS and YARN daemons separately, from the master machine, on all the nodes. It is now advisable to use these commands over start-all.sh and stop-all.sh, which are deprecated.
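A sketch of the separate start/stop sequence (assuming Hadoop 2.x with $HADOOP_HOME set; the convention is to start HDFS before YARN and stop them in the reverse order):

```shell
# Start HDFS first, then YARN, from the master machine.
$HADOOP_HOME/sbin/start-dfs.sh    # NameNode, SecondaryNameNode, DataNodes
$HADOOP_HOME/sbin/start-yarn.sh   # ResourceManager, NodeManagers

# Stop in the reverse order: YARN first, then HDFS.
$HADOOP_HOME/sbin/stop-yarn.sh
$HADOOP_HOME/sbin/stop-dfs.sh
```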

3- hadoop-daemon.sh start namenode/datanode and hadoop-daemon.sh stop namenode/datanode: Used to start or stop an individual daemon on a single machine manually. You need to log in to the particular node and issue these commands there. (For the YARN daemons, the equivalent script is yarn-daemon.sh.)

Use case: Suppose you have added a new DataNode to your cluster and you need to start the DataNode daemon on this machine only:

$HADOOP_HOME/sbin/hadoop-daemon.sh start datanode
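After starting the daemon, you can confirm it is actually running on that node with jps, the JVM process-status tool that ships with the JDK and lists Java processes by class name:

```shell
# List the Java processes on this node; a healthy new DataNode
# should appear in the output.
jps
# Expected to include a line such as:
#   12345 DataNode
```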

