Big Data Tuning

  1. Click the icon next to System Profiler.

    Choose AI Tuning. The page for creating a task is displayed.

  2. Set task parameters, as shown in Figure 1. Table 1, Table 2, and Table 3 describe the parameters.

    AI tuning analysis is available only on CentOS 7.6, openEuler 20.03, and openEuler 22.03 LTS.

    Figure 1 Creating an AI tuning analysis task (big data)
    Table 1 Parameters for creating an AI tuning analysis task (big data-Hive)

    | Parameter | Description |
    | --- | --- |
    | Task Name | Name of the task. The name must contain only letters, digits, and underscores (_) and must be 1 to 64 characters long. |
    | Application Type | Type of the application to be tuned. Select Big data. |
    | Application Name | Name of the application to be tuned. Select Hive. |
    | Application Version | Application version, which can be Hive 3.0.0 or 3.1.0-3.1.3. |
    | Root User Password | Password of the root user for the DevKit node. Ensure that you have root permissions for AI tuning. |
    | Master Node | Master node of the cluster. |
    | JAVA_HOME | JDK installation directory. |
    | Application Executable File Path | Path to the executable file of the to-be-tuned application, for example, /application/hive/bin. |
    | Application Configuration Parameter | Select the application parameters that you want to configure. All parameters are selected by default. You can click Add Parameter to add parameters or click Restore to restore configuration parameters to the original ones. |
    | Pressure Test Tool | Tool used to perform a pressure test on the application. It can be TPC-DS. |
    | Pressure Test Tool Version | Pressure test tool version, which can be TPC-DS 3.0. |
    | Test Case | Test case used by the pressure test tool. By default, query1.sql is selected. You can select one from query1.sql to query99.sql. |
    | Tuning Metric | Metric for application tuning, which defaults to latency. |
    | Database | Name of the database used for the pressure test. |
    | Tuning Iterations | Number of iterations for application tuning. The options are 20, 50, 100, 150 (default), and 200. |
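
The Task Name rule in Table 1 (letters, digits, and underscores only, 1 to 64 characters) can be checked before you fill in the form. The following is a minimal sketch in Python, not DevKit's own validation: the function name is_valid_task_name is hypothetical, and it assumes that "letters" means ASCII letters.

```python
import re

# Assumption: "letters" means ASCII A-Z and a-z; DevKit's own check may differ.
_TASK_NAME_RE = re.compile(r"[A-Za-z0-9_]{1,64}")


def is_valid_task_name(name: str) -> bool:
    """Return True if name is 1 to 64 characters of letters, digits, or underscores."""
    return _TASK_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("hive_tuning_01", "hive-tuning", ""):
        print(repr(candidate), is_valid_task_name(candidate))
```

Using fullmatch rather than match ensures that the whole name, not just a prefix, satisfies the rule.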

    Table 2 Parameters for creating an AI tuning analysis task (big data-Flink)

    | Parameter | Description |
    | --- | --- |
    | Task Name | Name of the task. The name must contain only letters, digits, and underscores (_) and must be 1 to 64 characters long. |
    | Application Type | Type of the application to be tuned. Select Big data. |
    | Application Name | Name of the application to be tuned. Select Flink. |
    | Application Version | Version of the application to be tuned. It can be Flink 1.12-1.15. |
    | Root User Password | Password of the root user for the DevKit node. Ensure that you have root permissions for AI tuning. |
    | Deployment Mode | Application deployment mode. The options are YARN (default) and Standalone. |
    | Master & Benchmark Node | Node where the pressure test tool resides. You can click Add Node to add an agent node. |
    | JAVA_HOME | JDK installation directory. |
    | Application Executable File Path | Path to the executable file of the to-be-tuned application, for example, /application/flink/bin. |
    | (Optional) Startup Parameter | Parameters used to start the application. The tool provides three default parameters. You can click Add to add new parameters. This parameter is available when Deployment Mode is set to YARN. |
    | Application Configuration Parameter | Select the application parameters that you want to configure. All parameters are selected by default. You can click Add Parameter to add parameters or click Restore to restore configuration parameters to the original ones. |
    | Flink Master IP Address | IP address of the master node in the Flink cluster. This parameter is available when Deployment Mode is set to Standalone. |
    | Application Port on Flink Master Node | Port of the Flink application on the master node. This parameter is available when Deployment Mode is set to Standalone. |
    | Pressure Test Tool | Tool used to perform a pressure test on the application. It can be HiBench. Flink 1.15 supports Huawei Cloud HiBench only. |
    | Pressure Test Tool Version | Pressure test tool version, which can be HiBench 7.0. |
    | Test Case | Test case used by the pressure test tool, which defaults to identity. The options are identity, repartition, and wordcount. |
    | Tuning Metric | Metric for application tuning, which defaults to throughput. The options are throughput, latency, and throughput/latency. |
    | Pressure Test Tool Path | Path to the pressure test tool, for example, /opt/HiBench-7.0. NOTE: You are advised to set the application path to a path such as /home or /opt. Do not set the application path to a system directory such as /, /dev, /sys, or /boot. Otherwise, system exceptions may occur. |
    | Throughput | Throughput of the test case, which defaults to 20K. The options are 20K, 40K, 60K, 80K, 100K, 200K, 300K, 400K, 500K, 600K, 700K, 800K, 900K, 1000K, 2000K, 4000K, 6000K, 8000K, and 10000K. |
    | Tuning Iterations | Number of iterations for application tuning. The options are 20, 50, 100, 150 (default), and 200. |
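
The NOTE in Table 2 warns against pointing a path at a system directory such as /, /dev, /sys, or /boot. The following Python sketch shows one way to screen a path before entering it. The function name is_risky_path is hypothetical, and the directory list contains only the examples named in the NOTE, so the set that DevKit actually rejects may be larger.

```python
import posixpath
from pathlib import PurePosixPath

# Only the directories named in the NOTE; "such as" implies there may be others.
SYSTEM_DIRS = tuple(PurePosixPath(d) for d in ("/dev", "/sys", "/boot"))


def is_risky_path(path: str) -> bool:
    """Return True if path is the root directory or lies in a listed system directory."""
    # Collapse "." and ".." segments so that /opt/../boot is treated as /boot.
    p = PurePosixPath(posixpath.normpath(path))
    if p.is_absolute() and len(p.parts) == 1:
        return True  # the root directory itself
    return any(p == d or d in p.parents for d in SYSTEM_DIRS)


if __name__ == "__main__":
    for candidate in ("/opt/HiBench-7.0", "/boot/HiBench-7.0", "/"):
        print(candidate, is_risky_path(candidate))
```

The check compares whole path components, so a directory such as /developer is not mistaken for /dev.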

    Table 3 Parameters for creating an AI tuning analysis task (big data-Spark)

    | Parameter | Description |
    | --- | --- |
    | Task Name | Name of the task. The name must contain only letters, digits, and underscores (_) and must be 1 to 64 characters long. |
    | Application Type | Type of the application to be tuned. Select Big data. |
    | Application Name | Name of the application to be tuned. Select Spark. |
    | Application Version | Version of the application to be tuned. The supported Spark versions are 2.3.0-2.3.2, 2.4.1-2.4.7, 3.0.0-3.0.3, 3.1.0-3.1.2, 3.2.1, 3.2.2, 3.3.0, and 3.3.1. |
    | Root User Password | Password of the root user for the DevKit node. Ensure that you have root permissions for AI tuning. |
    | Master Node | Master node of the cluster. |
    | JAVA_HOME | JDK installation directory. |
    | Application Executable File Path | Path to the executable file of the to-be-tuned application, for example, /application/spark/bin. |
    | (Optional) OmniOperator Directory | OmniOperator directory. |
    | Deployment Mode | Application deployment mode. The options are YARN (default) and Standalone. |
    | Application Configuration Parameter | Select the application parameters that you want to configure. All parameters are selected by default. You can click Add Parameter to add parameters or click Restore to restore configuration parameters to the original ones. |
    | Pressure Test Tool | Tool used to perform a pressure test on the application. It can be TPC-DS. |
    | Pressure Test Tool Version | Pressure test tool version, which can be TPC-DS 3.0. |
    | Test Case | Test case used by the pressure test tool. query1.sql is selected by default. You can select any test case from query1.sql to query99.sql. Cases 14, 23, 24, and 39 have two types: a and b. |
    | Tuning Metric | Metric for application tuning, which defaults to latency. |
    | Database | Name of the database used for the pressure test. |
    | Tuning Iterations | Number of iterations for application tuning. The options are 20, 50, 100, 150 (default), and 200. |
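
The supported Spark versions in the Application Version row of Table 3 form several disjoint ranges, which makes them easy to misread. The following Python sketch encodes those ranges and checks a version string against them. The function name is_supported_spark is hypothetical, and the sketch assumes plain three-part version strings such as 3.3.0 without vendor suffixes.

```python
# Inclusive (lowest, highest) ranges copied from the Application Version row of Table 3.
SUPPORTED_SPARK = [
    ((2, 3, 0), (2, 3, 2)),
    ((2, 4, 1), (2, 4, 7)),
    ((3, 0, 0), (3, 0, 3)),
    ((3, 1, 0), (3, 1, 2)),
    ((3, 2, 1), (3, 2, 2)),
    ((3, 3, 0), (3, 3, 1)),
]


def is_supported_spark(version: str) -> bool:
    """Return True if a major.minor.patch string falls inside a supported range."""
    parts = version.split(".")
    # Assumption: exactly three numeric parts, no suffix such as "-SNAPSHOT".
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        return False
    v = tuple(int(p) for p in parts)
    return any(low <= v <= high for low, high in SUPPORTED_SPARK)


if __name__ == "__main__":
    for candidate in ("3.3.0", "3.2.0", "2.4.7"):
        print(candidate, is_supported_spark(candidate))
```

Note that 3.2.0 and 2.4.0 fall outside the listed ranges even though neighboring patch releases are supported.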

  3. Click Verify and Create.
  4. Click the task name to view the tuning information (using Spark 3.3.0 as an example).
    • If the test case cannot be executed, the task fails. You can click AI Tuning Run Log to download the log and view the failure cause and case information.
    • An icon marks a round as invalid, which may be caused by parameter conflicts or environment exceptions. A small number of invalid rounds does not affect the final tuning result, but a large number of invalid rounds may terminate the tuning process.
    • Separate icons mark the reference value used as the starting point for tuning and each round in which tuning succeeded.

    Each row indicates one iteration of tuning. You can click Stop to stop the tuning.

    Figure 2 AI-based tuning analysis for a big data application
  5. Click Download Tuned Parameter Set to obtain the tuned application configuration parameters.