Bug #36085
Updated by Patrick Donnelly over 5 years ago
<pre>
  cluster:
    id:     10f9c55a-a813-44d7-bce7-e6159a98dc61
    health: HEALTH_WARN
            774413/230157054 objects misplaced (0.336%)
            Degraded data redundancy: 27570/230157054 objects degraded (0.012%), 1 pg degraded, 1 pg undersized

  services:
    mon: 3 daemons, quorum rndcl94,rndcl106,rndcl154
    mgr: rndcl94(active), standbys: rndcl106, rndcl154
    mds: cephfs-1/1/1 up {0=rndcl94=up:active(laggy or crashed)}
    osd: 24 osds: 24 up, 24 in; 11 remapped pgs

  data:
    pools:   2 pools, 512 pgs
    objects: 76.72 M objects, 14 TiB
    usage:   53 TiB used, 36 TiB / 89 TiB avail
    pgs:     27570/230157054 objects degraded (0.012%)
             774413/230157054 objects misplaced (0.336%)
             501 active+clean
             10  active+remapped+backfilling
             1   active+undersized+degraded+remapped+backfilling

  io:
    client:   9.7 MiB/s rd, 6.5 MiB/s wr, 94 op/s rd, 630 op/s wr
    recovery: 17 MiB/s, 90 objects/s
</pre>

The MDS is reported as "laggy or crashed" every few seconds and then returns to active. This flapping seems to make CephFS very slow, but the MDS process never actually dies. I tried restarting the MDS, but it seemed to have no effect. I set debug_ms=1 and captured some logs.
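To quantify how often the MDS flips to "laggy or crashed", the status text can be polled and checked with a small script. The following is only a minimal sketch: it assumes the plain-text status layout shown above, and the function name `laggy_mds` and the regular expression are illustrative, not part of Ceph.

```python
import re

# Matches entries such as "0=rndcl94=up:active(laggy or crashed)" on the mds line.
# The pattern is an assumption based on the status output in this report.
MDS_ENTRY = re.compile(r"(\d+)=([^=\s{}]+)=([a-z:_]+)(\(laggy or crashed\))?")


def laggy_mds(status_text):
    """Return (rank, name, state) for every MDS flagged laggy in the status text."""
    flagged = []
    for line in status_text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("mds:"):
            continue
        for rank, name, state, laggy in MDS_ENTRY.findall(stripped):
            if laggy:  # empty string when the "(laggy or crashed)" suffix is absent
                flagged.append((int(rank), name, state))
    return flagged


# The mds line copied from the status output in this report.
sample = "    mds: cephfs-1/1/1 up {0=rndcl94=up:active(laggy or crashed)}"
print(laggy_mds(sample))
```

Running this against status output captured every second or so would give a rough count of how frequently the flag appears, which could be lined up against the debug_ms=1 log timestamps.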