OmniAdvisor
OmniAdvisor is the parameter tuning feature.
It parses the parameters of historical Spark and Hive SQL tasks, uses AI algorithms to guide parameter sampling, and provides end-to-end online parameter tuning for those tasks.
It consists of four modules:
- SQL log parsing module: parses Spark and Hive logs to obtain SQL parameter information.
- SQL parameter sampling tuning module: samples different SQL parameter configurations and executes tasks with the sampled parameters.
- SQL parameter recommendation module: for a task that has already been tuned, searches the database for the optimal parameters among its historical runs and uses them as the execution parameters of the task.
- SQL exception parameter processing module: handles exceptions that occur when a task is executed with sampled or recommended SQL parameters.
Figure 1 Software architecture of OmniAdvisor
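The interaction between the sampling module and the exception processing module can be illustrated with a minimal sketch. It is not OmniAdvisor code: uniform random sampling stands in for the AI-guided sampler, the search space and baseline values are illustrative, and the names sample_params, run_with_fallback, and run_task are hypothetical.

```python
import random

# Hypothetical search space: real Spark property names, illustrative candidate values.
SEARCH_SPACE = {
    "spark.executor.memory": ["2g", "4g", "8g"],
    "spark.executor.cores": [2, 4, 8],
    "spark.sql.shuffle.partitions": [100, 200, 400],
}

# Hypothetical baseline configuration used when a sampled configuration fails.
DEFAULT_PARAMS = {
    "spark.executor.memory": "2g",
    "spark.executor.cores": 2,
    "spark.sql.shuffle.partitions": 200,
}


def sample_params(rng):
    """Draw one configuration uniformly at random (stand-in for the AI sampler)."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def run_with_fallback(run_task, params, default_params=DEFAULT_PARAMS):
    """Run a task with the given parameters; on failure, rerun with the defaults.

    run_task is a caller-supplied function that takes a parameter dict and
    returns the execution time. Returns (params_used, duration, failed_params),
    where failed_params is None if the first attempt succeeded.
    """
    try:
        return params, run_task(params), None
    except Exception:
        # Assumption: falling back to the baseline is an acceptable way to
        # handle a failed sample; the real module may behave differently.
        return default_params, run_task(default_params), params
```

Returning the failed configuration alongside the result lets the caller record it, so that the same parameters are not sampled or recommended again.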
A typical tuning workflow is as follows:
- Execute a Spark or Hive Tez task with the default parameters.
- After the task finishes, the cluster retains its execution logs. (For a Spark cluster, you need to configure spark.history.fs.logDirectory; for the Hive Tez engine, you need to start the timeline server service.) Use the log parsing module to parse these logs, then save the parsed SQL parameter information, SQL execution status, and execution time to the MySQL database.
- Select the tasks whose parameters you want to tune. For each task, obtain its historical parameters from the database, sample them to obtain the target parameters, execute the task with the target parameters, parse the execution results, and update the database.
- When you need to re-execute a task, you can search the database for the optimal parameters of historical tasks and use them as the execution parameters of the task.
Figure 2 Feature scenario analysis
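The store-and-recommend part of this workflow can be sketched as follows. This is an illustration under stated assumptions, not the OmniAdvisor schema: an in-memory SQLite database stands in for MySQL, the task_history table and its columns are hypothetical, and "optimal" is taken to mean the shortest execution time among successful runs.

```python
import json
import sqlite3


def open_history_db():
    """Create an in-memory database with a hypothetical task history table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE task_history (
               task_name TEXT, params TEXT, status TEXT, duration_s REAL)"""
    )
    return conn


def record_run(conn, task_name, params, status, duration_s):
    """Save the parsed parameters, execution status, and execution time of one run."""
    conn.execute(
        "INSERT INTO task_history VALUES (?, ?, ?, ?)",
        (task_name, json.dumps(params, sort_keys=True), status, duration_s),
    )


def best_params(conn, task_name):
    """Return the parameters of the fastest successful run, or None if there is none."""
    row = conn.execute(
        """SELECT params FROM task_history
           WHERE task_name = ? AND status = 'SUCCEEDED'
           ORDER BY duration_s ASC LIMIT 1""",
        (task_name,),
    ).fetchone()
    return json.loads(row[0]) if row else None


conn = open_history_db()
record_run(conn, "daily_report", {"spark.sql.shuffle.partitions": 200}, "SUCCEEDED", 120.0)
record_run(conn, "daily_report", {"spark.sql.shuffle.partitions": 400}, "SUCCEEDED", 95.0)
record_run(conn, "daily_report", {"spark.sql.shuffle.partitions": 800}, "FAILED", 10.0)
print(best_params(conn, "daily_report"))  # {'spark.sql.shuffle.partitions': 400}
```

Failed runs are excluded from the lookup so that a configuration that terminated early is never recommended on the basis of its short run time.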