OmniOperator

OmniOperator exposes a fixed set of interfaces to distributed tasks. When you submit an SQL task to a Spark cluster, the cluster management node splits the task into subtasks and distributes them to multiple compute nodes for execution. User code invokes OmniOperator only within a single subtask, and OmniOperator does not interact with other subtasks. Figure 1 shows the OmniOperator architecture.

Figure 1 Software architecture of OmniOperator

OmniOperator provides the following features:

Implements high-performance operators in native code. The native operators make full use of hardware computing capabilities, especially heterogeneous computing power. Compared with Java and Scala operators, they greatly improve the performance of the compute engine.
Provides an efficient data organization mode. OmniOperator defines a language-independent, column-oriented storage format and uses off-heap memory to implement OmniVec, which supports zero-copy reads. Because no serialization overhead is incurred, data in memory can be processed more efficiently.
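
The idea behind an off-heap, column-oriented vector with zero-copy reads can be illustrated with a minimal Java sketch. This is not the OmniVec API: the class OffHeapIntColumn and all of its methods are hypothetical names introduced for illustration only, and the sketch assumes a single column of 4-byte integers backed by a direct java.nio.ByteBuffer. It shows that a slice of the column is a view over the same off-heap memory, so reading a sub-range requires neither copying nor serialization.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical off-heap int column; illustrative only, not the OmniVec API. */
final class OffHeapIntColumn {
    private final ByteBuffer buf; // direct buffer: its memory lives outside the Java heap
    private final int length;     // number of int values in this column (or view)

    private OffHeapIntColumn(ByteBuffer buf, int length) {
        this.buf = buf;
        this.length = length;
    }

    /** Allocates an off-heap column that holds `length` 4-byte ints. */
    static OffHeapIntColumn allocate(int length) {
        ByteBuffer b = ByteBuffer.allocateDirect(length * Integer.BYTES)
                                 .order(ByteOrder.nativeOrder());
        return new OffHeapIntColumn(b, length);
    }

    int length() { return length; }

    boolean isOffHeap() { return buf.isDirect(); }

    int get(int i) {
        checkIndex(i);
        return buf.getInt(i * Integer.BYTES); // absolute read, no intermediate copy
    }

    void set(int i, int value) {
        checkIndex(i);
        buf.putInt(i * Integer.BYTES, value);
    }

    /** Returns a view of [offset, offset + count) that shares memory with this column. */
    OffHeapIntColumn slice(int offset, int count) {
        if (offset < 0 || count < 0 || offset + count > length) {
            throw new IndexOutOfBoundsException("offset=" + offset + ", count=" + count);
        }
        ByteBuffer d = buf.duplicate();            // new buffer object, same memory
        d.limit((offset + count) * Integer.BYTES);
        d.position(offset * Integer.BYTES);
        // slice() resets the byte order to big-endian, so restore native order.
        ByteBuffer view = d.slice().order(ByteOrder.nativeOrder());
        return new OffHeapIntColumn(view, count);
    }

    private void checkIndex(int i) {
        if (i < 0 || i >= length) {
            throw new IndexOutOfBoundsException("index=" + i + ", length=" + length);
        }
    }

    public static void main(String[] args) {
        OffHeapIntColumn col = OffHeapIntColumn.allocate(4);
        for (int i = 0; i < col.length(); i++) {
            col.set(i, i * 10);
        }
        OffHeapIntColumn view = col.slice(1, 2); // covers elements 1 and 2, no copy
        view.set(0, 99);                         // write through the view
        System.out.println(col.get(1));          // the write is visible in the parent column
    }
}
```

Because the view and the parent column share one memory region, a write through either is visible through the other. The same property would let native code and JVM code read a column without converting it between representations, which is the benefit the feature description attributes to OmniVec.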
