BoostIO Process Fault/Recovery

In separated deployment mode, BoostIO runs as an independent cache process; in converged deployment mode, BoostIO and JuiceFS are loaded into one process. The cache process handles cache client requests, caches read and write data, and manages resources. Table 1 describes the impact of cache process faults and recovery, and how each scenario is handled.

Table 1 Cache process fault scenarios

Scenario: Cache process fault

Impact

  • Copy data in the write cache is lost.
  • Object data in the read cache is lost.
  • The SDK fails to distribute requests.

Handling Method

  • The ZooKeeper heartbeat detects that the cache process is faulty and instructs the cluster management module to update the view and then release the new view.
  • Partition data affected by the process fault is forcibly evicted to the back-end storage.

Remarks

  • If the cache process is temporarily faulty, only the partition view is changed.
  • If the cache process is permanently faulty, the cluster management module removes the node from the cluster and changes both the node view and the partition view.
  • The time window for determining temporary faults is configurable.

Scenario: Cache process recovery

Impact

Read/write cache functions need to be restored.

Handling Method

  • Temporary fault: The ZooKeeper heartbeat detects that the cache process has recovered and instructs the cluster management module to update the view and then release the new view.
  • Permanent fault: The removed node is added to the cluster as a new node for expansion.

Remarks

None
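
The difference between temporary and permanent faults in Table 1 can be illustrated with a minimal Python sketch. This is not BoostIO code: the class `ClusterView`, the functions `classify_fault`, `handle_fault`, and `handle_recovery`, the parameter `temp_fault_window`, and the round-robin reassignment of partitions are all hypothetical names and assumptions chosen for illustration. The sketch only models which views change in each case.

```python
from dataclasses import dataclass
from typing import Dict, Set

TEMPORARY = "temporary"
PERMANENT = "permanent"


def classify_fault(seconds_down: float, temp_fault_window: float) -> str:
    """An outage counts as temporary while it stays within the configurable window."""
    return TEMPORARY if seconds_down <= temp_fault_window else PERMANENT


@dataclass
class ClusterView:
    nodes: Set[str]              # node view: current cluster members
    partitions: Dict[int, str]   # partition view: partition id -> owner node
    version: int = 0             # incremented each time a new view is released

    def _reassign(self, faulty: str) -> None:
        # Assumption: partitions of the faulty node move to survivors round-robin.
        survivors = sorted(n for n in self.nodes if n != faulty)
        if not survivors:
            raise RuntimeError("no surviving node can take over the partitions")
        moved = sorted(p for p, owner in self.partitions.items() if owner == faulty)
        for i, partition in enumerate(moved):
            self.partitions[partition] = survivors[i % len(survivors)]

    def handle_fault(self, node: str, seconds_down: float,
                     temp_fault_window: float) -> str:
        kind = classify_fault(seconds_down, temp_fault_window)
        self._reassign(node)          # the partition view changes in both cases
        if kind == PERMANENT:
            self.nodes.discard(node)  # the node view changes only for permanent faults
        self.version += 1             # release the new view
        return kind

    def handle_recovery(self, node: str) -> str:
        if node in self.nodes:
            # Temporary fault: the node never left the cluster, so only a
            # refreshed view is released (partition rebalancing is omitted here).
            self.version += 1
            return TEMPORARY
        # Permanent fault: the removed node rejoins as a new node (expansion).
        self.nodes.add(node)
        self.version += 1
        return PERMANENT


view = ClusterView(nodes={"node-a", "node-b", "node-c"},
                   partitions={0: "node-a", 1: "node-b", 2: "node-c"})
first = view.handle_fault("node-b", seconds_down=30, temp_fault_window=300)
second = view.handle_fault("node-b", seconds_down=600, temp_fault_window=300)
rejoined = view.handle_recovery("node-b")
```

In this run, the first call falls inside the assumed 300-second window, so only the partition view changes; the second call exceeds the window, so the node is also removed from the node view and must later rejoin as a new node.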