OmniOperator

OmniOperator is the operator acceleration feature.

OmniOperator implements big data SQL operators in native code (C/C++) to improve query performance. It combines columnar storage, vectorized execution, and the Kunpeng acceleration library to make operator execution, and therefore the query engine, more efficient. OmniOperator exposes a fixed set of interfaces to distributed tasks. When you submit an SQL task to a Spark cluster, the cluster management node splits the task into subtasks and distributes them to multiple compute nodes for execution.

User code invokes OmniOperator only within a single subtask, and OmniOperator does not interact with other subtasks. Figure 1 shows the OmniOperator architecture.

Figure 1 Software architecture of OmniOperator

OmniOperator provides the following features:

  • Implements high-performance operators in native code. The operators make full use of the hardware's computing capabilities, especially heterogeneous computing power. Compared with Java and Scala operators, OmniOperator greatly improves the performance of the compute engine.
  • Provides an efficient data organization mode. OmniOperator defines a language-independent, column-oriented storage format and implements it in off-heap memory as OmniVec. Data in an OmniVec can be read with zero copy and without serialization overhead, so in-memory data is processed more efficiently.
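
The column-oriented, zero-copy idea behind OmniVec can be illustrated with a small, language-neutral sketch. The following Python example is not the OmniVec API: the `IntColumn` class and the `filter_sum` function are hypothetical names introduced only for illustration, and a standard-library `array` buffer stands in for the off-heap memory that OmniVec uses. It shows each column stored in one contiguous buffer, a slice that shares that buffer instead of copying it, and an operator that processes whole columns at a time.

```python
from array import array


class IntColumn:
    """Hypothetical integer column backed by one contiguous buffer.

    This is an illustration only, not the real OmniVec interface.
    """

    def __init__(self, values):
        self._buf = array("i", values)     # contiguous storage for the column
        self.view = memoryview(self._buf)  # typed view over the buffer, no copy

    def slice(self, start, stop):
        # Slicing a memoryview returns a new view over the same buffer,
        # so no element is copied or serialized.
        return self.view[start:stop]


def filter_sum(prices, quantities, min_qty):
    """Column-at-a-time operator: sum the prices whose quantity >= min_qty."""
    total = 0
    for price, qty in zip(prices, quantities):
        if qty >= min_qty:
            total += price
    return total


# Two columns of the same four-row table, stored column by column.
prices = IntColumn([10, 20, 30, 40])
quantities = IntColumn([1, 5, 2, 7])

# A zero-copy window over rows 1..2 of the price column.
window = prices.slice(1, 3)
window[0] = 25  # the write is visible through the original column

result = filter_sum(prices.view, quantities.view, min_qty=5)
print(prices.view.tolist(), result)
```

Because `window` shares the buffer of `prices`, the write through the slice changes the original column. A native operator can read a shared column buffer in the same way, without first converting or copying the data.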