Bug #10381 (closed)

health HEALTH_WARN mds ceph239 is laggy

Added by science luo over 9 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi there.
Today I ran a script to do some tests on my Ceph cluster via a CephFS client, including dd/rm/cp of files smaller than 10K.
After 1 hour the CephFS client was frozen, so I checked my cluster health, which showed the following:
[root@MON_137 ceph-deploy]# ceph -s
cluster fe614861-e6fb-426f-90f7-682fd6f2def3
health HEALTH_WARN mds ceph239 is laggy
monmap e3: 3 mons at {MON_137=10.118.202.137:6789/0,MON_156=10.118.202.156:6789/0,MON_9=10.118.202.9:6789/0}, election epoch 130, quorum 0,1,2 MON_9,MON_137,MON_156
mdsmap e106: 1/1/1 up {0=ceph239=up:active(laggy or crashed)}
osdmap e381: 11 osds: 11 up, 11 in
pgmap v12714: 384 pgs, 3 pools, 2284 MB data, 1022 objects
10317 MB used, 2386 GB / 2396 GB avail
384 active+clean

A coredump log can be found in the attachment.
The MDS can't work any more.
BTW, my max_mds was 1, not 2.
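
For reference, a minimal sketch of the kind of small-file test loop described above; the mount point /mnt/cephfs, directory name, file count, and sizes are placeholders, not taken from the actual script:

    mkdir -p /mnt/cephfs/testdir
    for i in $(seq 1 1000); do
        # write an 8K file, copy it, then delete both
        dd if=/dev/zero of=/mnt/cephfs/testdir/file_$i bs=1K count=8 2>/dev/null
        cp /mnt/cephfs/testdir/file_$i /mnt/cephfs/testdir/file_$i.copy
        rm -f /mnt/cephfs/testdir/file_$i /mnt/cephfs/testdir/file_$i.copy
    done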


Files

ceph-mds.ceph239.log (1.84 MB) - science luo, 12/18/2014 10:52 PM
ceph mds dump.txt (759 Bytes) - science luo, 12/21/2014 04:53 PM
ceph osd dump.txt (3.14 KB) - science luo, 12/21/2014 04:53 PM
Actions #1

Updated by Greg Farnum over 9 years ago

Can you provide the output of "ceph mds dump" and "ceph osd dump"?

It looks like the MDS is trying to access a pool that doesn't exist
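
For example, both dumps can be captured to files on a node with an admin keyring (the output filenames here are arbitrary):

    ceph mds dump > ceph_mds_dump.txt
    ceph osd dump > ceph_osd_dump.txt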

Actions #2

Updated by science luo over 9 years ago

Greg Farnum wrote:

Can you provide the output of "ceph mds dump" and "ceph osd dump"?

It looks like the MDS is trying to access a pool that doesn't exist

Actions #3

Updated by Sage Weil over 9 years ago

  • Project changed from Ceph to CephFS
Actions #4

Updated by Greg Farnum over 9 years ago

  • Status changed from New to Resolved

Whoops, this fell through the cracks.

Anyway, the MDS map has pool 0 set as its data pool, but the OSDMap doesn't have such a pool (it has a "data" pool, but its ID is 3). I think there were several sequences of (unwise) monitor commands one could use to get into this state in the past, but we guard against it appropriately now: the OSDMonitor won't let you delete pools that are in use by the MDS, and the MDS won't let you insert pool IDs that don't exist.
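
As an illustration of that mismatch, comparing the data_pools field of the MDS map against the pools known to the OSD map shows the problem directly (the output values below are illustrative, based on the description of the attached dumps, not copied from them):

    ceph mds dump | grep data_pools    # e.g. "data_pools  0"
    ceph osd lspools                   # e.g. "3 data,..." -- no pool with ID 0

Any pool ID listed in data_pools that does not appear in the OSD pool list will make the MDS fail in the way reported here.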
