Error Reported When Viewing the OmniShuffle Log

Symptom

The OmniShuffle log contains the error message "Failed to send sync package, Operation timed out". The error indicates that the handshake failed, but the Spark task still ends normally.
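To confirm the symptom, you can search the log for the exact error text. The following is a minimal sketch, not an official tool: the function name `find_sync_timeouts` and the script name in the usage comment are hypothetical, and the log file path is whatever your deployment uses.

```python
import sys
from typing import Iterable, List, Tuple

# Error text exactly as quoted in the Symptom section.
ERROR_TEXT = "Failed to send sync package, Operation timed out"


def find_sync_timeouts(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, line) for every log line that contains the error text."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if ERROR_TEXT in line:
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python find_sync_timeouts.py <path to the OmniShuffle log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for number, line in find_sync_timeouts(log):
            print(f"{number}: {line}")
```

The timestamps of the matching lines can then be compared with the retry entries on the Spark side.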

Key Process and Cause Analysis

Generally, this error occurs because the peer system has too little free memory or its memory is severely fragmented. Establishing the connection between the two ends then takes so long that the timeout is triggered.
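One way to check the first cause is to read the memory counters on the peer node. The sketch below assumes the peer runs Linux and exposes /proc/meminfo; the function name `parse_meminfo` is hypothetical. It does not measure fragmentation (on Linux, /proc/buddyinfo lists the free blocks per allocation order and can be inspected separately).

```python
import os
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo content into {field: value}.

    Most values are in kB; a few entries (for example HugePages_Total)
    are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo", encoding="utf-8") as f:
        info = parse_meminfo(f.read())
    # MemAvailable may be missing on very old kernels, hence .get().
    for field in ("MemTotal", "MemFree", "MemAvailable"):
        print(field, info.get(field), "kB")
```

A low MemAvailable value on the peer at the time of the error is consistent with the cause described above.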

Conclusion and Solution

Reduce the memory configured for OmniShuffle and Spark. The error itself does not cause the job to fail: Spark has an internal retry mechanism, so when the error occurs, the Spark task is retried and continues to run. The corresponding retry log is displayed on the Spark web UI.
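On the Spark side, memory is typically lowered through properties such as `spark.executor.memory` and `spark.executor.memoryOverhead`. The sketch below only builds the corresponding `--conf` arguments for `spark-submit`. The helper name `spark_memory_conf_args` is hypothetical and the values are illustrative assumptions, not recommendations; the OmniShuffle memory setting is configured separately and is not shown here.

```python
from typing import Dict, List


def spark_memory_conf_args(settings: Dict[str, str]) -> List[str]:
    """Turn Spark properties into a list of --conf arguments for spark-submit."""
    args = []
    for key, value in settings.items():
        args.extend(["--conf", f"{key}={value}"])
    return args


# Illustrative values only; size them for your own cluster.
reduced = {
    "spark.executor.memory": "8g",
    "spark.executor.memoryOverhead": "2g",
}

print(" ".join(spark_memory_conf_args(reduced)))
# prints: --conf spark.executor.memory=8g --conf spark.executor.memoryOverhead=2g
```

After rerunning the job with the lower values, check the log again to see whether the timeout still appears.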