Bug #10381
health HEALTH_WARN mds ceph239 is laggy
Status: Closed
Description
Hi there.
Today I ran a script to do some tests on my Ceph cluster via a CephFS client, including dd/rm/cp of files smaller than 10K.
After an hour the CephFS client froze, so I checked my Ceph health, which showed:
[root@MON_137 ceph-deploy]# ceph -s
cluster fe614861-e6fb-426f-90f7-682fd6f2def3
health HEALTH_WARN mds ceph239 is laggy
monmap e3: 3 mons at {MON_137=10.118.202.137:6789/0,MON_156=10.118.202.156:6789/0,MON_9=10.118.202.9:6789/0}, election epoch 130, quorum 0,1,2 MON_9,MON_137,MON_156
mdsmap e106: 1/1/1 up {0=ceph239=up:active(laggy or crashed)}
osdmap e381: 11 osds: 11 up, 11 in
pgmap v12714: 384 pgs, 3 pools, 2284 MB data, 1022 objects
10317 MB used, 2386 GB / 2396 GB avail
384 active+clean
I also found a coredump log, in the attachment.
The MDS can't work any more.
BTW, my max_mds was 1, not 2.
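For context, the workload was roughly this shape (a minimal sketch only; the directory, file sizes, and loop count are assumptions, not the original script — on a real reproduction you would point it at a directory on the CephFS mount):

```shell
#!/bin/bash
# Hypothetical reconstruction of the small-file stress loop described above.
# The target directory is an assumption; in the report it would be a path
# on the CephFS mount. Defaults to a temp dir so it runs anywhere.
DIR=${1:-$(mktemp -d)}
mkdir -p "$DIR"

for i in $(seq 1 20); do
    # Write a file smaller than 10K, copy it, then remove both.
    dd if=/dev/zero of="$DIR/f$i" bs=1K count=$(( (i % 9) + 1 )) 2>/dev/null
    cp "$DIR/f$i" "$DIR/f$i.copy"
    rm -f "$DIR/f$i" "$DIR/f$i.copy"
done
echo "completed $i iterations in $DIR"
```

The report says the client hung under exactly this kind of metadata-heavy small-file churn, which is what drives MDS load.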
Updated by Greg Farnum over 9 years ago
Can you provide the output of "ceph mds dump" and "ceph osd dump"?
It looks like the MDS is trying to access a pool that doesn't exist.
Updated by science luo over 9 years ago
- File ceph mds dump.txt ceph mds dump.txt added
- File ceph osd dump.txt ceph osd dump.txt added
Greg Farnum wrote:
Can you provide the output of "ceph mds dump" and "ceph osd dump"?
It looks like the MDS is trying to access a pool that doesn't exist.
Updated by Greg Farnum over 9 years ago
- Status changed from New to Resolved
Whoops, this fell through the cracks.
Anyway, the MDS map has pool 0 set as its data pool, but the OSDMap doesn't have such a pool (it has a "data" pool, but its ID is 3). I think there were several sequences of (unwise) monitor commands that could be used to achieve this effect in the past, but we cover it appropriately now: the OSDMonitor won't let you delete pools in use by the MDS, and the MDS won't let you insert pool IDs that don't exist.
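The mismatch is easy to spot mechanically once you have both dumps. A minimal sketch of the check, using hand-built sample structures modelled on this report (the field names `data_pools`, `pools`, `pool`, and `pool_name` mirror the shape of the JSON dump output, but the data here is illustrative, not taken from the attachments):

```python
def missing_data_pools(mds_dump, osd_dump):
    """Return MDS data-pool IDs that have no matching pool in the OSD map."""
    osd_pool_ids = {p["pool"] for p in osd_dump["pools"]}
    return [pid for pid in mds_dump.get("data_pools", []) if pid not in osd_pool_ids]

# Sample data modelled on this bug: the MDS map points at pool 0,
# but the OSD map only contains pool 3 (named "data").
mds_dump = {"data_pools": [0]}
osd_dump = {"pools": [{"pool": 3, "pool_name": "data"}]}

print(missing_data_pools(mds_dump, osd_dump))  # [0] -> pool 0 is dangling
```

A non-empty result is the condition Greg describes: the MDS tries to read or write objects in a pool the OSDMap no longer knows about.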