Configuring Component Parameters

Web Configuration

Component parameters can be modified on the Ambari cluster WebUI. For example, to modify the parameters of the Yarn component, click YARN in the left pane, click the CONFIGS tab, and then change the values on the SETTINGS and ADVANCED tabs.

If the parameter has an initial value on the WebUI, use the search box on the CONFIGS tab page to locate the parameter and then modify it. For example, to change the value of yarn.nodemanager.resource.cpu-vcores for YARN, enter yarn.nodemanager.resource.cpu-vcores in the search box and click the modification icon next to the parameter.

If the parameter does not have an initial value on the WebUI (for example, yarn.nodemanager.numa-awareness.enabled), add the parameter in the Custom yarn-site area (Custom XXX-site for other components).
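
Entries added in the Custom yarn-site area are plain name/value pairs. As a minimal sketch, assuming they end up as standard Hadoop property entries in yarn-site.xml, the following Python snippet renders such pairs in that XML layout. The function name render_site_properties is hypothetical and only serves the illustration; it does not call Ambari.

```python
import xml.etree.ElementTree as ET


def render_site_properties(props):
    """Render name/value pairs as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    # ET.indent is only available in Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# The two NUMA parameters that have no initial value on the WebUI.
custom_yarn_site = {
    "yarn.nodemanager.numa-awareness.enabled": "true",
    "yarn.nodemanager.numa-awareness.read-topology": "true",
}

print(render_site_properties(custom_yarn_site))
```

The same helper could be reused for other Custom XXX-site areas by passing a different dictionary.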

The following table describes the parameter settings of each component.

| Component | Parameter | Recommended Value | Description |
| --- | --- | --- | --- |
| Yarn -> ResourceManager | ResourceManager Java heap size | 1024 | Increase the JVM heap size to reduce the garbage collection (GC) frequency. NOTE: The values are not fixed. Increase or decrease the values of Xms and Xmx based on the observed GC behavior. |
| Yarn -> NodeManager | NodeManager Java heap size | 1024 | Same as for the ResourceManager Java heap size. |
| Yarn -> NodeManager | yarn.nodemanager.resource.cpu-vcores | 48 on a 48-core Kunpeng computing platform | Number of CPU cores that can be allocated to containers. |
| Yarn -> NodeManager | yarn.nodemanager.resource.memory-mb | Equal to the physical memory capacity of the data node | Memory that can be allocated to containers. |
| Yarn -> NodeManager | yarn.nodemanager.numa-awareness.enabled | true | Whether NodeManager is NUMA-aware when it starts a container. |
| Yarn -> NodeManager | yarn.nodemanager.numa-awareness.read-topology | true | Whether NodeManager reads the NUMA topology automatically. |
| MapReduce2 | mapreduce.map.memory.mb | 7168 | Maximum memory that can be used by a map task. |
| MapReduce2 | mapreduce.reduce.memory.mb | 14336 | Maximum memory that can be used by a reduce task. |
| MapReduce2 | mapreduce.job.reduce.slowstart.completedmaps | 0.35 | Resources for reduce tasks are requested only after the ratio of completed map tasks reaches this value. |
| HDFS -> NameNode | NameNode Java heap size | 3072 | Increase the JVM heap size to reduce the GC frequency. NOTE: The values are not fixed. Increase or decrease the values of Xms and Xmx based on the observed GC behavior. |
| HDFS -> NameNode | NameNode new generation size | 384 | Same as for the NameNode Java heap size. |
| HDFS -> NameNode | NameNode maximum new generation size | 384 | Same as for the NameNode Java heap size. |
| HDFS -> DataNode | dfs.datanode.handler.count | 512 | Number of DataNode service threads. Increase the value as appropriate. |
| HDFS -> NameNode | dfs.namenode.service.handler.count | 32 | Number of NameNode RPC server threads that listen for requests from DataNodes and other non-client requests. Increase the value as appropriate. |
| HDFS -> NameNode | dfs.namenode.handler.count | 1200 | Number of NameNode RPC server threads that listen for client requests. Increase the value as appropriate. |
| TEZ | tez.am.resource.memory.mb | 7168 | Equal to the value of yarn.scheduler.minimum-allocation-mb. The default value is 7168. |
| TEZ | tez.runtime.io.sort.mb | 1892 | Set this parameter to 40% of hive.tez.container.size. Generally, the value does not exceed 2 GB. |
| TEZ | tez.am.container.reuse.enabled | true | Container reuse switch. |
| TEZ | tez.runtime.unordered.output.buffer.size-mb | 537 | Set this parameter to 10% of hive.tez.container.size. |
| TEZ | tez.am.resource.cpu.vcores | 10 | Number of virtual CPU cores used. The default value is 1. This parameter must be added manually. |
| TEZ | tez.container.max.java.heap.fraction | 0.85 | Fraction of the memory provided by Yarn that is allocated to the Java process. The default value is 0.8. This parameter must be added manually. |
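
The two Tez buffer sizes above are described as fractions of hive.tez.container.size. The following is a minimal sketch of that arithmetic in Python. The helper name tez_buffer_sizes is hypothetical, and treating "generally does not exceed 2 GB" as a hard cap of 2048 MB is an assumption made for illustration. The result depends on the container size passed in and need not equal the recommended values listed in the table.

```python
def tez_buffer_sizes(container_mb, sort_cap_mb=2048):
    """Derive Tez buffer sizes (in MB) from hive.tez.container.size.

    Rules taken from the table:
      tez.runtime.io.sort.mb                      = 40% of the container size
      tez.runtime.unordered.output.buffer.size-mb = 10% of the container size
    Assumption: the "does not exceed 2 GB" guidance is applied as a hard cap.
    """
    sort_mb = min(container_mb * 40 // 100, sort_cap_mb)
    unordered_mb = container_mb * 10 // 100
    return {
        "tez.runtime.io.sort.mb": sort_mb,
        "tez.runtime.unordered.output.buffer.size-mb": unordered_mb,
    }


# Example with a 4096 MB container (an arbitrary illustrative size).
for name, value in tez_buffer_sizes(4096).items():
    print(f"{name}={value}")
```

Integer arithmetic is used so that the results are whole megabytes without floating-point rounding.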

Client Configuration

Only the optimization parameters that need to be modified are listed.

Table 1 SQL1

| Parameter | Setting |
| --- | --- |
| hive.map.aggr | TRUE |
| hive.vectorized.execution.enabled | TRUE |
| hive.auto.convert.join | TRUE |
| hive.auto.convert.join.noconditionaltask | TRUE |
| hive.limit.optimize.enable | TRUE |
| hive.exec.parallel | TRUE |
| hive.cbo.enable | TRUE |
| hive.exec.compress.intermediate | TRUE |
| hive.tez.container.size | 7168 |
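
One way to apply a per-query table like the one above is to issue Hive SET statements in the client session before running the query. The sketch below, in Python, turns the Table 1 settings into such statements. It assumes the settings are applied per session with SET, and the helper name to_hive_set_statements is hypothetical.

```python
def to_hive_set_statements(settings):
    """Format a dict of Hive parameters as 'SET key=value;' lines."""
    lines = []
    for key, value in settings.items():
        # Hive expects lowercase true/false for boolean parameters.
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"SET {key}={value};")
    return "\n".join(lines)


# Settings listed in Table 1 (SQL1).
sql1_settings = {
    "hive.map.aggr": True,
    "hive.vectorized.execution.enabled": True,
    "hive.auto.convert.join": True,
    "hive.auto.convert.join.noconditionaltask": True,
    "hive.limit.optimize.enable": True,
    "hive.exec.parallel": True,
    "hive.cbo.enable": True,
    "hive.exec.compress.intermediate": True,
    "hive.tez.container.size": 7168,
}

print(to_hive_set_statements(sql1_settings))
```

The generated lines can be placed at the top of the SQL script for the corresponding query. The other tables can be handled the same way with their own dictionaries.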

Table 2 SQL2

| Parameter | Setting |
| --- | --- |
| hive.map.aggr | TRUE |
| hive.vectorized.execution.enabled | TRUE |
| hive.auto.convert.join | TRUE |
| hive.auto.convert.join.noconditionaltask | TRUE |
| hive.exec.parallel | TRUE |
| hive.cbo.enable | TRUE |
| hive.exec.compress.intermediate | TRUE |
| hive.exec.reducers.max | 576 |
| hive.tez.container.size | 5120 |

Table 3 SQL3

| Parameter | Setting |
| --- | --- |
| hive.map.aggr | TRUE |
| hive.vectorized.execution.enabled | TRUE |
| hive.auto.convert.join | TRUE |
| hive.auto.convert.join.noconditionaltask | TRUE |
| hive.limit.optimize.enable | TRUE |
| hive.exec.parallel | TRUE |
| hive.exec.reducers.max | 376 |
| hive.tez.container.size | 7168 |

Table 4 SQL4

| Parameter | Setting |
| --- | --- |
| hive.map.aggr | TRUE |
| hive.vectorized.execution.enabled | TRUE |
| hive.auto.convert.join | TRUE |
| hive.auto.convert.join.noconditionaltask | TRUE |
| hive.limit.optimize.enable | TRUE |
| hive.exec.parallel | TRUE |
| hive.cbo.enable | TRUE |
| hive.tez.container.size | 6148 |

Table 5 SQL5

| Parameter | Setting |
| --- | --- |
| hive.map.aggr | TRUE |
| hive.vectorized.execution.enabled | TRUE |
| hive.auto.convert.join | TRUE |
| hive.auto.convert.join.noconditionaltask | TRUE |
| hive.limit.optimize.enable | TRUE |
| hive.exec.parallel | TRUE |
| hive.cbo.enable | TRUE |
| hive.tez.container.size | 8192 |

Table 6 SQL6

| Parameter | Setting |
| --- | --- |
| hive.map.aggr | TRUE |
| hive.vectorized.execution.enabled | TRUE |
| hive.auto.convert.join | TRUE |
| hive.auto.convert.join.noconditionaltask | TRUE |
| hive.limit.optimize.enable | TRUE |
| hive.exec.parallel | TRUE |
| hive.cbo.enable | TRUE |
| hive.exec.reducers.max | 365 |
| hive.tez.container.size | 7168 |

Table 7 SQL7

| Parameter | Setting |
| --- | --- |
| hive.map.aggr | TRUE |
| hive.vectorized.execution.enabled | TRUE |
| hive.auto.convert.join | TRUE |
| hive.auto.convert.join.noconditionaltask | TRUE |
| hive.limit.optimize.enable | TRUE |
| hive.exec.parallel | TRUE |
| hive.cbo.enable | TRUE |
| hive.exec.reducers.max | 1 |
| hive.tez.container.size | 7168 |

Table 8 SQL8

| Parameter | Setting |
| --- | --- |
| hive.map.aggr | TRUE |
| hive.vectorized.execution.enabled | TRUE |
| hive.auto.convert.join | TRUE |
| hive.auto.convert.join.noconditionaltask | TRUE |
| hive.limit.optimize.enable | TRUE |
| hive.exec.parallel | TRUE |
| hive.tez.container.size | 7168 |

Table 9 SQL9

| Parameter | Setting |
| --- | --- |
| hive.map.aggr | TRUE |
| hive.vectorized.execution.enabled | TRUE |
| hive.auto.convert.join | TRUE |
| hive.auto.convert.join.noconditionaltask | TRUE |
| hive.limit.optimize.enable | TRUE |
| hive.exec.parallel | TRUE |
| hive.cbo.enable | TRUE |
| hive.exec.reducers.max | 365 |
| hive.tez.container.size | 7168 |

Table 10 SQL10

| Parameter | Setting |
| --- | --- |
| hive.map.aggr | TRUE |
| hive.vectorized.execution.enabled | TRUE |
| hive.auto.convert.join | TRUE |
| hive.auto.convert.join.noconditionaltask | TRUE |
| hive.limit.optimize.enable | TRUE |
| hive.exec.parallel | TRUE |
| hive.cbo.enable | TRUE |
| hive.exec.reducers.max | 120 |
| hive.tez.container.size | 7168 |