When Spark Executes INSERT Statements to Query Multiple Wide Tables Using JOIN, a Core Dump Occurs Due to Insufficient Memory of the SMJ Operator

Symptom

When Spark executes an INSERT statement over data that has only one partition, and a Sort Merge Join is performed across 50 consecutive tables, the off-heap memory is exhausted. The OmniOperator SMJ operator then invokes the C++ new operator to allocate vector memory; the allocation fails and a core dump occurs.

Key Process and Cause Analysis

OmniOperator uses column-based processing and therefore occupies more memory than native Spark, which uses row-based processing. In addition, the memory allocated by the SMJ operator during computation can be released only after the task completes.

The problem occurs when an INSERT statement is executed and the data has only one partition. Spark then generates only one task, so the Sort Merge Join across the 50 tables runs entirely within that single task. During computation, the 50 consecutive SMJ operators exhaust the configured 38 GB of off-heap memory, and the next allocation attempted via the C++ new operator fails, causing a core dump.
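One way to confirm this condition is to inspect the physical plan of the failing statement. The sketch below is illustrative only: it assumes the spark-sql CLI is available, and target_table and t1/t2/t3 are placeholders for the real tables. A long chain of SortMergeJoin nodes in the plan, combined with a single input partition, indicates that all joins will run inside one task.

```shell
# Hypothetical diagnostic: table names are placeholders.
# Look for a chain of SortMergeJoin nodes in the printed physical plan.
spark-sql -e "EXPLAIN INSERT INTO target_table
              SELECT * FROM t1
              JOIN t2 ON t1.k = t2.k
              JOIN t3 ON t2.k = t3.k"
```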

Conclusion and Solution

This is a rare scenario. Spark jobs are designed to exploit the concurrency of large-scale clusters, so in normal cases a single task (single thread) does not execute JOINs over a large number of tables. If this situation does occur, rectify it in either of the following ways:

  • Increase the off-heap memory by adjusting the value of spark.memory.offHeap.size, then rerun the job.
  • Roll back to the native (row-based) Spark process and rerun the job.
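The first option can be applied on the submission command line. This is a minimal sketch, not a prescriptive setting: the 64g value, the class name, and the jar name are placeholders chosen for illustration, and spark.memory.offHeap.enabled must be on for the size setting to take effect.

```shell
# Illustrative only: the 64g figure and the application/jar names are
# placeholders, not recommendations from this case.
spark-submit \
  --class com.example.InsertJob \
  --conf spark.memory.offHeap.enabled=true \
  --conf spark.memory.offHeap.size=64g \
  insert-job.jar
```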