Abnormal ceph-mon Daemons
Problem Description
| Item | Information |
|---|---|
| Source of the Problem | Online maintenance |
| Product | Kunpeng BoostKit |
| Sub-item | SDS |
| Service Scenario | Compilation and installation |
| Component | Other |
| Output Time | 2019-10-28 |
| Author | Chen Xiaobo 00416232 |
| Team | Kunpeng BoostKit |
| Review Result | Review passed |
| Review Date | 2019-11-05 |
| Release Date | 2020-03-20 |
| Keywords | Abnormal ceph-mon daemons |
Symptom
The ceph -s command output shows that slow ops exist in the ceph-mon daemons. The message is as follows:
HEALTH_WARN 376 slow ops, oldest one blocked for 894 sec, daemons [mon.ceph4,mon.ceph5,mon.ceph6] have slow ops.
SLOW_OPS 376 slow ops, oldest one blocked for 894 sec, daemons [mon.ceph4,mon.ceph5,mon.ceph6] have slow ops.
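When the warning has to be tracked across several checks, it can help to pull the numbers out of the message programmatically. The following is a minimal sketch, not part of the original procedure: the function name `parse_slow_ops` is hypothetical, and the pattern assumes the message wording shown above with daemon names in the usual `type.id` form (for example `mon.ceph4`). Other Ceph releases may word the message differently.

```python
import re

# Pattern modeled on the slow-ops message quoted in this article (assumption:
# the wording is stable for the Ceph release in use).
SLOW_OPS_RE = re.compile(
    r"(?P<count>\d+) slow ops, oldest one blocked for (?P<age>\d+) sec, "
    r"daemons \[(?P<daemons>[^\]]*)\] have slow ops"
)


def parse_slow_ops(line):
    """Return slow-op count, oldest blocked time, and daemon names, or None."""
    match = SLOW_OPS_RE.search(line)
    if match is None:
        return None
    # Daemon names are comma-separated inside the square brackets.
    daemons = [d.strip() for d in match.group("daemons").split(",") if d.strip()]
    return {
        "count": int(match.group("count")),
        "oldest_sec": int(match.group("age")),
        "daemons": daemons,
    }


if __name__ == "__main__":
    sample = (
        "HEALTH_WARN 376 slow ops, oldest one blocked for 894 sec, "
        "daemons [mon.ceph4,mon.ceph5,mon.ceph6] have slow ops."
    )
    print(parse_slow_ops(sample))
```

A line that does not contain a slow-ops message yields `None`, so the function can be applied to every line of the health output.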
Key Process and Cause Analysis
After the Ceph cluster is redeployed, the configuration file of the original Ceph cluster overwrites that of the current cluster. As a result, the NUMA affinity configuration does not meet the site requirements.
Conclusion and Solution
Reconfigure NUMA affinity by modifying the NUMA affinity settings in the ceph.conf file so that they match the current site. For example:
[osd.N]
osd_numa_node = 1
public_network_interface = bond1
cluster_network_interface = bond1
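Because the root cause here was an old configuration file overwriting the current one, it may be worth checking the per-OSD sections of ceph.conf after every redeployment. The sketch below is one possible way to do that, under stated assumptions: the function name `osd_numa_settings` and the sample text are hypothetical, the file is assumed to be plain INI-style text that Python's `configparser` can read, and only the three keys from the example above are inspected.

```python
import configparser

# The three settings shown in the example above.
NUMA_KEYS = (
    "osd_numa_node",
    "public_network_interface",
    "cluster_network_interface",
)


def osd_numa_settings(conf_text):
    """Map each [osd.*] section to its NUMA-related values (None if unset)."""
    parser = configparser.ConfigParser(
        interpolation=None,   # keep '%' characters in values literal
        strict=False,         # tolerate repeated sections or options
        delimiters=("=",),    # do not treat ':' as a key/value separator
    )
    parser.read_string(conf_text)
    report = {}
    for section in parser.sections():
        if not section.startswith("osd."):
            continue
        # Ceph accepts option names written with spaces or underscores,
        # so normalize to the underscore form before looking keys up.
        options = {k.replace(" ", "_"): v for k, v in parser.items(section)}
        report[section] = {key: options.get(key) for key in NUMA_KEYS}
    return report


# Hypothetical sample; in practice, read the text from the real ceph.conf.
SAMPLE = """\
[osd.0]
osd_numa_node = 1
public_network_interface = bond1
cluster_network_interface = bond1

[osd.1]
osd numa node = 0
"""

if __name__ == "__main__":
    for section, values in osd_numa_settings(SAMPLE).items():
        missing = [key for key, value in values.items() if value is None]
        print(section, values, "missing:", missing)
```

Sections that report missing keys, or a NUMA node or interface that does not match the current hardware layout, are candidates for the reconfiguration described above.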