BoostIO Communication Fault/Recovery

NICs carry the communication between cache clients and cache servers and among cache servers. When a NIC is faulty, this communication fails: for example, request messages cannot be sent or responses cannot be received. Table 1 describes how NIC faults and NIC recovery are handled.

Table 1 NIC fault scenarios

Scenario: NIC fault

Impact:
  • The SDK fails to send requests.
  • Receive operations on the SDK time out.
  • The partition view cannot be received.

Handling Method:
  • The ZooKeeper heartbeat detects that the NIC is faulty and instructs the cluster management module to update the view and then publish the new view.
  • Partition data affected by the NIC fault is forcibly evicted to the back-end storage.

Remarks:
  • Troubleshooting is supported for occupied-port scenarios and for firewall/NIC-down scenarios.
  • Troubleshooting is not supported for NIC exceptions such as packet loss, error packets, one-way communication, and intermittent disconnections.

Scenario: NIC recovery

Impact:
  • The request sending function is restored.
  • Re-entry of each read/write cache process is allowed.

Handling Method:
  • The ZooKeeper heartbeat detects that the NIC has recovered and instructs the cluster management module to update the view and then publish the new view.

Remarks: None
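
The heartbeat-driven view update described in Table 1 can be illustrated with a small, self-contained Python simulation. This is only a sketch under stated assumptions, not BoostIO's or ZooKeeper's actual implementation: the names ClusterView and HeartbeatMonitor, the timeout value, and the node names are all hypothetical, and timestamps are passed in explicitly so that the behavior is deterministic. Forced eviction of partition data is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterView:
    """Hypothetical view: a version number plus the set of reachable nodes."""
    version: int
    healthy_nodes: frozenset


class HeartbeatMonitor:
    """Toy stand-in for the heartbeat check and the cluster management module."""

    def __init__(self, nodes, timeout_s=3.0):
        self.timeout_s = timeout_s
        # Time of the last heartbeat received from each node.
        self.last_seen = {node: 0.0 for node in nodes}
        self.view = ClusterView(version=1, healthy_nodes=frozenset(nodes))
        # Every view that has been published, oldest first.
        self.published = [self.view]

    def heartbeat(self, node, now):
        """Record a heartbeat that arrived from `node` at time `now`."""
        self.last_seen[node] = now

    def check(self, now):
        """Publish a new view if the set of healthy nodes has changed."""
        healthy = frozenset(
            node
            for node, seen in self.last_seen.items()
            if now - seen <= self.timeout_s
        )
        if healthy != self.view.healthy_nodes:
            # A fault or a recovery was detected: bump the version and publish.
            self.view = ClusterView(self.view.version + 1, healthy)
            self.published.append(self.view)
        return self.view
```

In this sketch, a node whose heartbeats stop for longer than timeout_s is dropped from the next published view, and it is added back in a later view once its heartbeats resume. This mirrors the NIC fault and NIC recovery rows of the table.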