Select Hadoop: Calculating the Capacity of a Node

Wednesday, February 8, 2017

Calculating the Capacity of a Node

Because YARN has now removed the hard partitioned mapper and reducer slots of Hadoop

Version 1, new capacity calculations are required. There are eight important parameters

for calculating a node’s capacity that are specified in mapred-site.xml and yarnsite.xml:

In mapred-site.xml:

1. mapreduce.map.memory.mb

2. mapreduce.reduce.memory.mb

These are the hard limits enforced by Hadoop on each mapper or reducer task.

1. mapreduce.map.java.opts

2. mapreduce.reduce.java.opts

The heapsize of the jvm –Xmx for the mapper or reducer task. Remember to leave room
for the JVM Perm Gen and Native Libs used. This value should always be lower than
mapreduce.[map\|reduce].memory.mb.

In yarn-site.xml:

• yarn.scheduler.minimum-allocation-mb

The smallest container that YARN will allow.

• yarn.scheduler.maximum-allocation-mb

The largest container that YARN will allow.

• yarn.nodemanager.resource.memory-mb

The amount of physical memory (RAM) for Containers on the compute node. It is

important that this is not equal to the total amount of RAM on the node, as other

Hadoop services also require RAM.

• yarn.nodemanager.vmem-pmem-ratio

The amount of virtual memory that each Container is allowed. This can be calculated with:

containerMemoryRequest*vmem-pmem-ratio

As an example, consider a configuration with the settings in the following table.

Example YARN MapReduce Settings

Property

Value

mapreduce.map.memory.mb

1536

mapreduce.reduce.memory.mb

2560

mapreduce.map.java.opts

-Xmx1024m

mapreduce.reduce.java.opts

-Xmx2048m

yarn.scheduler.minimum-allocation-mb

512

yarn.scheduler.maximum-allocation-mb

4096

yarn.nodemanager.resource.memory-mb

36864

yarn.nodemanager.vmem-pmem-ratio

2.1

With these settings, each map and reduce task has a generous 512MB of overhead

for the Container, as evidenced by the difference between the mapreduce.[map|

reduce].memory.mb and the mapreduce.[map|reduce].java.opts.

Next, YARN has been configured to allow a Container no smaller than 512MB and no

larger than 4GB. The compute nodes have 36GB of RAM available for Containers. With

a virtual memory ratio of 2.1 (the default value), each map can have up to 3225.6MB of

RAM, or a reducer can have 5376MB of virtual RAM.

This means that the compute node configured for 36GB of Container space can support

up to 24 maps or 14 reducers, or any combination of mappers and reducers allowed by the

available resources on the node.

2 comments:

vignesjosephApril 3, 2017 at 12:45 PM
Big data(Hadoop) is mostly using their data analytics check process.The cloud contribution aggressive for Hadoop related map reduce code.Selenium Online Training
ReplyDelete
Replies

Add comment

Select Hadoop

Wednesday, February 8, 2017

Calculating the Capacity of a Node

2 comments:

Kafka Architecture

Search This Blog