Using the Background Spark Task Tuning Function
You can run the python tuning.pyc command to enable background task tuning. After a tuning command is executed, the server tunes Spark tasks in the background.
Command Function
On the management node, you can run this command to start a tuning task and specify the tuning target, retest mode, and tuning method.
Syntax
python tuning.pyc [-h] -l LOAD_ID -r {hijacking,backend} [-t {iterative,expert,transfer,native}]
Parameter Description
| Item | Description |
|---|---|
| -h or --help | Optional. Displays help information about the command, including its usage, parameter definitions, and additional description. |
| -l or --load-id | Mandatory. Indicates the load ID queried in the loads table. |
| -r or --retest-way | Mandatory. Indicates the retest mode. Retests ensure that the results are reliable. The options are hijacking and backend. |
| -t or --tuning-method | Mandatory when -r or --retest-way is set to hijacking, and optional when -r or --retest-way is set to backend. Indicates the tuning method. The options are iterative, expert, transfer, and native. |
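The parameter rules above can be sketched as a small argument parser. This is only an illustration of the documented syntax, not the actual implementation of tuning.pyc; the function names build_parser and parse_tuning_args are hypothetical, and the load ID is kept as a string because its type is not stated here.

```python
import argparse

# Choices copied from the documented syntax line.
RETEST_WAYS = ("hijacking", "backend")
TUNING_METHODS = ("iterative", "expert", "transfer", "native")


def build_parser():
    """Build a parser that mirrors the documented tuning.pyc syntax."""
    parser = argparse.ArgumentParser(prog="tuning.pyc")
    parser.add_argument("-l", "--load-id", required=True,
                        help="load ID queried in the loads table")
    parser.add_argument("-r", "--retest-way", required=True,
                        choices=RETEST_WAYS, help="retest mode")
    parser.add_argument("-t", "--tuning-method", choices=TUNING_METHODS,
                        help="tuning method")
    return parser


def parse_tuning_args(argv):
    """Parse argv and apply the documented rule for -t."""
    parser = build_parser()
    args = parser.parse_args(argv)
    # -t is mandatory with hijacking and optional with backend.
    if args.retest_way == "hijacking" and args.tuning_method is None:
        parser.error("-t/--tuning-method is required when -r is hijacking")
    return args
```

For example, parse_tuning_args(["-l", "1", "-r", "backend"]) is accepted without -t, while the same call with -r hijacking and no -t exits with a usage error.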
Example of Usage
- Display the command usage, parameter definition, and additional description.
python tuning.pyc --help

- Optimize the load whose load-id is 1 based on expert rules and check the optimization effect through background retests.
python tuning.pyc -l 1 -r backend -t expert
- If the -t or --tuning-method option is not specified and the retest mode is backend, tuning uses the tuning.strategy configuration item in the common_config.ini configuration file of OmniAdvisor 2.0 by default.
python tuning.pyc -l 1 -r backend
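To check which default applies before omitting -t, the tuning.strategy item can be read from common_config.ini. The following is a minimal sketch that assumes the file uses standard INI sections. The section name and value in SAMPLE_INI are hypothetical placeholders, so the helper searches every section instead of relying on a particular one.

```python
import configparser


def find_option(ini_text, option):
    """Return the first value of `option` found in any section, else None."""
    # Interpolation is disabled so that literal '%' characters are kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    for section in parser.sections():
        if parser.has_option(section, option):
            return parser.get(section, option)
    return None


# Hypothetical content: the section name and the value are assumptions,
# not the actual contents of common_config.ini.
SAMPLE_INI = """
[common]
tuning.strategy = expert
"""

print(find_option(SAMPLE_INI, "tuning.strategy"))
```

With a real installation, pass the text of the actual common_config.ini file to find_option instead of SAMPLE_INI.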
Example of Tuning
- Construct test data.
spark-sql --master yarn --deploy-mode client --driver-cores 8 --driver-memory 20G --num-executors 36 --executor-cores 8 --executor-memory 29g -e "CREATE DATABASE IF NOT EXISTS omnitest; CREATE TABLE IF NOT EXISTS omnitest.employee (id INT, name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;"
- Intercept the load.
Intercept the user load to obtain the load and related information for subsequent tuning. For details, see Using the Foreground Spark Task Interception Function.
export enable_omniadvisor=true
spark-sql --master yarn --deploy-mode client --driver-cores 8 --driver-memory 20G --num-executors 36 --executor-cores 8 --executor-memory 29g -e "SELECT * FROM omnitest.employee;"
After a load is intercepted and recorded in the database, any subsequent command for that same load is also intercepted, and the current user-defined task configuration is replaced with the optimal configuration recommended by the system. (In the initial state of OmniAdvisor 2.0, the current user-defined task configuration is used.) If the recommended configuration fails to execute, the system rolls back to the default configuration.
- Query the load and related information.
After a load is executed and intercepted, you can query its tuning information in the database. OmniAdvisor Database Tables describes the database table structure.
select id, rounds, load_id, method from omniadvisor_tuning_record;

In this example, load_id is 7. This load_id is used in the "Start tuning" step. Replace it with the actual load_id returned by your query.
- Start tuning.
- Example 1
Use the expert rule-based tuning method to tune the load whose load_id is 7. Retests run in the background during tuning. The number of retests equals the value of tuning.retest.times in the $OMNIADVISOR_HOME/omniruntime-omniadvisor-2.0.0/config/common_config.ini file.
python tuning.pyc -l 7 -r backend -t expert
- Example 2
Let OmniAdvisor 2.0 determine the tuning method based on its default tuning policy and historical tuning records.
python tuning.pyc -l 7 -r backend
After the tuning is complete, the optimal load configuration is updated so that it can be used the next time the load is intercepted (see the "Intercept the load" step).
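When tuning is started from a script, the command line can be assembled programmatically. The sketch below builds only the invocations shown in the examples. The helper names build_tuning_command and run_tuning are hypothetical, and the working directory that contains tuning.pyc is left to the caller because it is not specified here.

```python
import subprocess


def build_tuning_command(load_id, retest_way="backend", tuning_method=None):
    """Return the argument list for a documented tuning.pyc invocation."""
    command = ["python", "tuning.pyc", "-l", str(load_id), "-r", retest_way]
    # Omitting -t lets OmniAdvisor fall back to its default tuning policy.
    if tuning_method is not None:
        command += ["-t", tuning_method]
    return command


def run_tuning(load_id, retest_way="backend", tuning_method=None, cwd=None):
    """Run the command; cwd should be the directory that holds tuning.pyc."""
    command = build_tuning_command(load_id, retest_way, tuning_method)
    # check=True raises CalledProcessError on a non-zero exit status.
    return subprocess.run(command, cwd=cwd, check=True)
```

For instance, build_tuning_command(7, "backend", "expert") corresponds to Example 1, and build_tuning_command(7) corresponds to Example 2.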