Abnormal ceph-mon Daemons

Problem Description

Table 1 Basic information

Item                  | Information
Source of the Problem | Online maintenance
Product               | Kunpeng BoostKit
Sub-item              | SDS
Service Scenario      | Compilation and installation
Component             | Other
Output Time           | 2019-10-28
Author                | Chen Xiaobo 00416232
Team                  | Kunpeng BoostKit
Review Result         | Review passed
Review Date           | 2019-11-05
Release Date          | 2020-03-20
Keywords              | Abnormal ceph-mon daemons

Symptom

The ceph -s command output shows that slow ops exist in the ceph-mon daemons. The message is as follows:

HEALTH_WARN 376 slow ops, oldest one blocked for 894 sec, daemons [mon.ceph4,mon.ceph5,mon.ceph6] have slow ops.
SLOW_OPS 376 slow ops, oldest one blocked for 894 sec, daemons [mon.ceph4,mon.ceph5,mon.ceph6] have slow ops.
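
When many nodes are being checked, it can help to pull the affected daemons out of the warning programmatically. The following is a minimal Python sketch, not part of the original procedure. The function name parse_slow_ops and the regular expression are illustrative assumptions modeled on the message quoted above; other Ceph releases may word the warning differently.

```python
import re

# Pattern modeled on the warning line quoted in this article (an assumption,
# not an official Ceph output specification).
SLOW_OPS_RE = re.compile(
    r"(?P<count>\d+) slow ops, oldest one blocked for (?P<age>\d+) sec, "
    r"daemons \[(?P<daemons>[^\]]*)\] have slow ops"
)


def parse_slow_ops(line):
    """Return (count, oldest_age_sec, daemon_list), or None if the line does not match."""
    match = SLOW_OPS_RE.search(line)
    if match is None:
        return None
    # Daemon names are comma-separated inside the square brackets.
    daemons = [d.strip() for d in match.group("daemons").split(",") if d.strip()]
    return int(match.group("count")), int(match.group("age")), daemons


if __name__ == "__main__":
    sample = (
        "HEALTH_WARN 376 slow ops, oldest one blocked for 894 sec, "
        "daemons [mon.ceph4,mon.ceph5,mon.ceph6] have slow ops."
    )
    print(parse_slow_ops(sample))
```

The function returns None for lines that do not carry a slow-ops warning, so it can be applied to every line of the health output without extra filtering.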

Key Process and Cause Analysis

After the Ceph cluster was redeployed, the configuration file of the original Ceph cluster overwrote that of the new cluster. As a result, the NUMA affinity configuration no longer met the site requirements.

Conclusion and Solution

Reconfigure NUMA affinity by modifying the NUMA affinity settings in the ceph.conf file to match the site requirements. For example:
[osd.N]
osd_numa_node = 1
public_network_interface = bond1
cluster_network_interface = bond1
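
Because the root cause here was an overwritten configuration file, it may be worth checking ceph.conf against the intended NUMA settings after every redeployment. The following is a minimal Python sketch under stated assumptions: the function name find_numa_mismatches and the shape of the expected dictionary are hypothetical, the option names are the three from the example above, and the file is assumed to spell them with underscores rather than spaces.

```python
import configparser

# Option names taken from the example in this article. The expected values
# are site-specific and must be supplied by the operator.
NUMA_KEYS = ("osd_numa_node", "public_network_interface", "cluster_network_interface")


def find_numa_mismatches(conf_text, expected):
    """Compare each [osd.*] section with `expected`.

    `expected` maps a section name such as "osd.0" to a dict of option values.
    Returns {section: {option: (found, wanted)}} for every differing option.
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    mismatches = {}
    for section in parser.sections():
        if not section.startswith("osd."):
            continue  # only per-OSD sections carry these settings in this sketch
        diffs = {}
        for key in NUMA_KEYS:
            wanted = expected.get(section, {}).get(key)
            found = parser.get(section, key, fallback=None)
            if wanted is not None and found != wanted:
                diffs[key] = (found, wanted)
        if diffs:
            mismatches[section] = diffs
    return mismatches


if __name__ == "__main__":
    # Path is the usual default location; adjust it if the cluster uses another one.
    with open("/etc/ceph/ceph.conf") as handle:
        text = handle.read()
    # Hypothetical expectation for one OSD; replace with the site's real plan.
    print(find_numa_mismatches(text, {"osd.0": {"osd_numa_node": "1"}}))
```

An empty result means every checked option matches the expectation. A value of None in the found position means the option is missing from that section.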