Architecture
This section describes the software architecture of the OmniMV feature.
OmniMV uses AI algorithms to recommend the optimal materialized views based on historical SQL queries. In Spark, it automatically matches SQL statements against the materialized views and replaces the matching part of the execution plan with the matched materialized view. This reduces repeated calculations and improves query efficiency. When you submit an SQL task to a Spark cluster, the cluster management node splits the task into subtasks and distributes them to multiple compute nodes for execution.
Figure 1 shows the architecture.
OmniMV consists of two modules: the candidate view generation module and the SQL rewrite module.
- Candidate view generation module: generates candidate views.
- SQL rewrite module: modifies the physical execution plan of SQL statements.
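The rewrite idea can be illustrated with a minimal sketch. It assumes a toy plan tree, not Spark's actual plan classes, and exact subtree equality as the matching rule. The `Plan` class, the `rewrite` function, and the view name `mv_orders_customers` are hypothetical names introduced only for illustration. OmniMV's real matching logic is not described in this section and is likely more flexible.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Plan:
    """Toy plan node (hypothetical; not a Spark class)."""
    op: str                            # e.g. "Scan", "Join", "Aggregate"
    arg: str = ""                      # table name, join condition, ...
    children: Tuple["Plan", ...] = ()  # child plan nodes


def rewrite(plan: Plan, views: Dict[Plan, str]) -> Plan:
    """Replace every subtree that equals a view definition with a scan of that view."""
    if plan in views:
        return Plan("Scan", views[plan])
    # No match at this node: keep it and try to rewrite its children.
    return Plan(plan.op, plan.arg, tuple(rewrite(c, views) for c in plan.children))


# A join that several queries repeat, and a query built on top of it.
join = Plan("Join", "o.cid = c.id", (Plan("Scan", "orders"), Plan("Scan", "customers")))
query = Plan("Aggregate", "sum(amount) by region", (join,))

# Assumption: a materialized view with this name already stores the join result.
views = {join: "mv_orders_customers"}
rewritten = rewrite(query, views)
```

After the rewrite, the aggregate reads directly from the materialized view instead of recomputing the join, which is the source of the saved computation.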
The process of using OmniMV is as follows:
1. Collect historical query logs, such as Yarn logs.
2. Parse the historical logs to obtain SQL information, including the SQL text, SQL execution plan, and SQL running time.
3. Generate candidate views based on the information obtained in step 2, and then use the greedy selection policy to select the top N candidate views.
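The candidate generation and greedy selection steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not OmniMV's actual algorithm. It assumes each parsed log record carries a list of normalized sub-plan signatures, that a sub-plan becomes a candidate when at least two queries share it, and that a candidate's benefit is the total running time of the not-yet-covered queries that contain it. `QueryRecord`, `generate_candidates`, and `select_top_n` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class QueryRecord(NamedTuple):
    """One parsed log entry (hypothetical structure)."""
    sql: str
    subplans: List[str]  # assumed: normalized signatures of the query's sub-plans
    runtime_s: float     # running time in seconds


def generate_candidates(records: List[QueryRecord]) -> Dict[str, Set[int]]:
    """Map each sub-plan shared by two or more queries to the indices of those queries."""
    usage: Dict[str, Set[int]] = defaultdict(set)
    for i, rec in enumerate(records):
        for sig in rec.subplans:
            usage[sig].add(i)
    return {sig: qs for sig, qs in usage.items() if len(qs) >= 2}


def select_top_n(records: List[QueryRecord],
                 candidates: Dict[str, Set[int]], n: int) -> List[str]:
    """Greedily pick up to n candidates by marginal benefit (assumed scoring)."""
    chosen: List[str] = []
    covered: Set[int] = set()  # queries already served by a chosen view
    remaining = dict(candidates)

    def gain(sig: str) -> float:
        return sum(records[i].runtime_s for i in remaining[sig] - covered)

    while remaining and len(chosen) < n:
        # sorted() makes ties resolve alphabetically, so results are deterministic.
        best = max(sorted(remaining), key=gain)
        if gain(best) <= 0:
            break  # no remaining candidate helps any uncovered query
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


records = [
    QueryRecord("q1", ["A", "B"], 10.0),
    QueryRecord("q2", ["A"], 20.0),
    QueryRecord("q3", ["B", "C"], 5.0),
    QueryRecord("q4", ["C"], 1.0),
]
candidates = generate_candidates(records)
top2 = select_top_n(records, candidates, 2)
```

With this sample data, the sketch picks `A` first (30 seconds of covered running time) and then `C` rather than `B`, because once `q1` is covered, `C` adds 6 seconds of benefit and `B` only 5. A real implementation would also need to weigh the storage and refresh cost of each view, which this sketch ignores.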
