YARN Tuning Guide
Service                        Category  CPU (cores)  Memory (MB)  Notes
Operating System               Overhead  1            8192         Most operating systems use 4-
Other services                 Overhead  0            0            Enter the required cores or me
Cloudera Manager agent         Overhead  1            1024         Allocate 1GB and 1 vcore for C
HDFS DataNode                  CDH       1            1024         Allocation for the HDFS DataN
YARN NodeManager               CDH       1            1024         Allocation for the YARN NodeM
Impala daemon                  CDH       0            0            (Optional Service) Suggestion:
HBase RegionServer             CDH       0            0            (Optional Service) Suggestion:
Solr Server                    CDH       0            0            (Optional Service) Suggestion:
Kudu Server                    CDH       0            0            (Optional Service) Suggestion:
Available Container Resources            44           250880
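The "Available Container Resources" row appears to be the host totals minus the per-service allocations listed above. The following is a minimal sketch of that arithmetic. The host totals used here (48 vcores and 256 GB of memory) are assumptions chosen so that the result matches the row above; they are not stated in the table, and the function name is hypothetical.

```python
# Per-service allocations taken from the table above: (CPU cores, memory in MB).
ALLOCATIONS = {
    "Operating System": (1, 8192),
    "Other services": (0, 0),
    "Cloudera Manager agent": (1, 1024),
    "HDFS DataNode": (1, 1024),
    "YARN NodeManager": (1, 1024),
    "Impala daemon": (0, 0),
    "HBase RegionServer": (0, 0),
    "Solr Server": (0, 0),
    "Kudu Server": (0, 0),
}


def available_container_resources(host_vcores, host_memory_mb, allocations):
    """Subtract every service allocation from the host totals."""
    used_cores = sum(cores for cores, _ in allocations.values())
    used_mb = sum(mb for _, mb in allocations.values())
    return host_vcores - used_cores, host_memory_mb - used_mb


# Assumed host totals (not from the source): 48 vcores, 256 GB of memory.
vcores, memory_mb = available_container_resources(48, 256 * 1024, ALLOCATIONS)
print(vcores, memory_mb)
```

With these assumed totals the sketch reproduces the 44 vcores and 250880 MB shown in the table; enabling an optional service would reduce both figures by its allocation.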
Container resources
Item                                 Value   Notes
Physical Cores to Vcores Multiplier  1       Set this ratio based on the expected number of concurrent threads in a container per core. Default is 1.
YARN Available Vcores                44      This value will be used in STEP 4 for YARN Configuration.
YARN Available Memory (MB)           250880  This value will be used in STEP 4 for YARN Configuration.

Description / Notes
Node memory in gigabytes.
Number of CPUs and the number of hardware cores per CPU. The calculation of vcores includes HyperThreading support.
Does the CPU support HyperThreading? (yes / no)
Number of hard drives and size per drive in JBOD configuration.
Number of Ethernet connections and the transfer speed (1G, 10G, 40G, 100G).
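As a rough illustration of how the vcore figures could be derived, the sketch below computes host vcores from the CPU layout and then applies the physical-cores-to-vcores multiplier. Both function names are hypothetical, the 2 x 12 core layout is an assumed example, and the sketch assumes HyperThreading exposes two hardware threads per physical core.

```python
def host_vcores(cpus, cores_per_cpu, hyperthreading):
    """Hardware threads on the host.

    Assumption: HyperThreading provides two threads per physical core.
    """
    threads_per_core = 2 if hyperthreading else 1
    return cpus * cores_per_cpu * threads_per_core


def yarn_vcores(available_cores, multiplier=1):
    """Apply the physical-cores-to-vcores multiplier (default 1)."""
    return int(available_cores * multiplier)


# Assumed example layout: 2 CPUs with 12 cores each, HyperThreading enabled.
print(host_vcores(2, 12, True))
# 44 cores left for containers with a multiplier of 1.
print(yarn_vcores(44, 1))
```

A multiplier above 1 oversubscribes the cores, which only makes sense when containers are expected to spend much of their time waiting rather than computing.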
index sizes.
ding on data sizes.
YARN Configuration
STEP 4: YARN Configuration on Cluster
This is the first set of configuration values for your cluster. You can
set these values in YARN -> Configuration.
yarn.scheduler.increment-allocation-mb  512  Memory allocations must be a multiple of this value, in megabytes.

Copied from STEP 2 "Available Resources".
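To make the "multiple of this value" rule concrete, here is a small sketch that rounds a memory request up to the next multiple of the increment. It assumes requests are rounded up rather than rejected, and the function name is hypothetical; only the 512 MB increment comes from the setting above.

```python
def round_up_to_increment(request_mb, increment_mb=512):
    """Round a memory request up to the next multiple of the increment."""
    if increment_mb <= 0:
        raise ValueError("increment_mb must be positive")
    # Ceiling division using integer arithmetic only.
    return -(-request_mb // increment_mb) * increment_mb


for request in (1000, 1024, 1025):
    print(request, "->", round_up_to_increment(request))
```

Under this assumption a request of 1025 MB occupies 1536 MB, so container sizes that are already multiples of the increment avoid wasted memory.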
440
10
to OutOfMemory issues
MapReduce Configuration
STEP 7: MapReduce Configuration
For CDH 5.5 and later, we recommend specifying only the heap size or the
container size for map and reduce tasks, not both. The value that is not
specified is calculated from the setting
mapreduce.job.heap.memory-mb.ratio. This calculation follows Cloudera
Manager and derives the heap size from the ratio and the container size.
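The relationship between container size, ratio, and heap size can be sketched as follows. This is an illustration only: the function name is hypothetical, the ratio of 0.8 and the container sizes are assumed example values, and rounding down to a whole megabyte is an assumption rather than documented behavior.

```python
def heap_from_container(container_mb, ratio=0.8):
    """Derive the Java heap size (MB) from the container size and the ratio.

    Assumption: the result is truncated to a whole number of megabytes.
    """
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    return int(container_mb * ratio)


for container_mb in (1024, 2048, 4096):
    heap_mb = heap_from_container(container_mb)
    # The heap would typically be passed to the JVM as a -Xmx option.
    print(f"container={container_mb} MB -> -Xmx{heap_mb}m")
```

The memory left over after the heap (about 20% of the container with a ratio of 0.8) covers JVM overhead outside the heap.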
Check                                                           Status  Notes
AM memory request must fit within scheduler limits              GOOD    yarn.scheduler.minimum-allocation-mb <= yarn.app.mapreduce.am.resource.mb <= yarn.scheduler.maximum-allocation-mb
Container size must be large enough for Java heap and overhead  GOOD    Java heap should be between 75% and 90% of the container size: too low wastes resources, too high could lead to OOM
Ratio should be between 0.75 and 0.9                            GOOD    Java heap should be between 75% and 90% of the container size: too low wastes resources, too high could lead to OOM
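The checks above can be sketched as simple predicates. This is a hedged illustration: the function names and the example values are hypothetical, the "BAD" label is an assumed counterpart to the GOOD status shown in the source, and the AM memory request is assumed to be the value of yarn.app.mapreduce.am.resource.mb.

```python
def am_memory_within_limits(am_mb, min_alloc_mb, max_alloc_mb):
    """AM memory request must lie between the scheduler minimum and maximum."""
    return min_alloc_mb <= am_mb <= max_alloc_mb


def heap_ratio_ok(heap_mb, container_mb, low=0.75, high=0.9):
    """Java heap should be between 75% and 90% of the container size."""
    return low <= heap_mb / container_mb <= high


def status(ok):
    # "BAD" is an assumed label; the source only shows "GOOD".
    return "GOOD" if ok else "BAD"


# Hypothetical example values, in MB.
print(status(am_memory_within_limits(1024, 1024, 8192)))
print(status(heap_ratio_ok(819, 1024)))   # about 80% of the container
print(status(heap_ratio_ok(1000, 1024)))  # about 98%, risks OOM
```

A heap below 75% of the container leaves memory unused, while a heap above 90% leaves too little room for non-heap JVM overhead.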