Bug #9264
mds: occasionally log segments can't trim
Status: Duplicate
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Development
Severity: 3 - minor
Description
This happened with the latest lab MDS restart yesterday; we have the logs (for another 6 days or so).
root@mira040:~# ceph mds dump 618
dumped mdsmap epoch 618
epoch 618
flags 0
created 2013-08-14 03:19:58.297184
modified 2014-08-27 10:05:56.352173
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
last_failure 467
last_failure_osd_epoch 354013
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap}
max_mds 1
in 0
up {0=1716410}
failed
stopped
data_pools 0
metadata_pool 1
inline_data disabled
1716410: 10.214.134.10:6800/33114 'burnupi21' mds.0.38 up:replay seq 1
Updated by Sage Weil over 9 years ago
2014-08-27 00:54:32.894038 7f0a35232700 6 mds.0.journal LogSegment(1566182478170).try_to_expire
is the segment that gets stuck:
2014-08-27 02:34:08.364557 7f0a35232700 10 mds.0.log trim 172 / 30 segments, 215078 / -1 events, 1 (1046) expiring, 140 (175941) expired
2014-08-27 02:34:08.364561 7f0a35232700 5 mds.0.log trim already expiring segment 1566182478170, 1046 events
2014-08-27 02:34:08.364565 7f0a35232700 5 mds.0.log trim already expired segment 1566186672593, 1109 events
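For anyone digging through the saved logs, the trim lines above can be scanned mechanically. This is only a rough Python sketch, not anything shipped with Ceph: the regular expressions are inferred solely from the three log lines quoted in this ticket, and the names `parse_trim_summary`, `find_stuck_segments`, and the `min_repeats` threshold are made up for illustration. It assumes a segment that keeps showing up as "already expiring" across trim passes is a candidate for being stuck.

```python
import re
from collections import Counter

# Patterns inferred from the log lines quoted in this ticket; other Ceph
# versions may format these messages differently (assumption).
TRIM_RE = re.compile(
    r"mds\.\d+\.log trim (?P<segments>\d+) / (?P<max_segments>-?\d+) segments, "
    r"(?P<events>\d+) / (?P<max_events>-?\d+) events, "
    r"(?P<expiring>\d+) \((?P<expiring_events>\d+)\) expiring, "
    r"(?P<expired>\d+) \((?P<expired_events>\d+)\) expired"
)
EXPIRING_RE = re.compile(r"trim already expiring segment (\d+), (\d+) events")


def parse_trim_summary(line):
    """Return the counters from a 'trim N / M segments' line, or None."""
    m = TRIM_RE.search(line)
    return {k: int(v) for k, v in m.groupdict().items()} if m else None


def find_stuck_segments(lines, min_repeats=2):
    """Return ids of segments reported 'already expiring' at least min_repeats times."""
    seen = Counter()
    for line in lines:
        m = EXPIRING_RE.search(line)
        if m:
            seen[m.group(1)] += 1
    return sorted(seg for seg, n in seen.items() if n >= min_repeats)
```

Feeding the whole MDS log through `find_stuck_segments` with a larger `min_repeats` should narrow things down to segments like 1566182478170 that never leave the expiring state; the right threshold depends on how often trim runs and is a guess here.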
Updated by Sage Weil over 9 years ago
- Status changed from New to 12
2014-08-27 00:54:32.901022 7f0a35232700 10 mds.0.journal try_to_expire waiting for nest flush on [inode 605 [...2,head] ~mds0/stray5/ auth v25125019 ap=3+1 f(v13 m2014-08-27 00:38:56.778628 443=418+25) n(v566 rc2014-08-27 00:38:56.778628 b395656374 a1 444=418+26) (inest lock->sync w=1 dirty) (ifile lock w=2) (iversion lock) | dirtyscattered=1 lock=2 dirfrag=1 stickydirs=1 stray=1 dirtyrstat=0 dirty=1 waiter=1 authpin=1 0x10c98e00]
2014-08-27 00:54:32.901039 7f0a35232700 10 mds.0.locker scatter_nudge auth, waiting for stable (inest lock->sync w=1 dirty) on [inode 605 [...2,head] ~mds0/stray5/ auth v25125019 ap=3+1 f(v13 m2014-08-27 00:38:56.778628 443=418+25) n(v566 rc2014-08-27 00:38:56.778628 b395656374 a1 444=418+26) (inest lock->sync w=1 dirty) (ifile lock w=2) (iversion lock) | dirtyscattered=1 lock=2 dirfrag=1 stickydirs=1 stray=1 dirtyrstat=0 dirty=1 waiter=1 authpin=1 0x10c98e00]
2014-08-27 00:54:32.901054 7f0a35232700 10 mds.0.cache.ino(605) add_waiter tag 400000000000 0x1e6f3230 !ambig 1 !frozen 1 !freezing 1
It's waiting for the wrlock to be released. The wrlock is held by a request that is stuck revoking caps due to bug #8962.
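To see at a glance which locks on an inode are holding things up, the parenthesized lock tokens in the inode dump can be pulled apart. The following is a hedged Python sketch, not a Ceph tool: the helper name `parse_locks` is hypothetical, the pattern is based only on the tokens visible in the log lines above, and it assumes that a state containing `->` is a transitional (not yet stable) state and that `w=N` is the wrlock count.

```python
import re

# Matches tokens such as "(inest lock->sync w=1 dirty)" or "(iversion lock)".
# Derived only from the inode dump quoted in this ticket (assumption).
LOCK_RE = re.compile(r"\((i[a-z]+) ([^\s)]+)([^)]*)\)")


def parse_locks(line):
    """Return one dict per lock token found in an inode dump line."""
    locks = []
    for name, state, rest in LOCK_RE.findall(line):
        w = re.search(r"\bw=(\d+)", rest)
        locks.append({
            "name": name,
            "state": state,
            # Assumed meaning: w=N is the number of wrlocks held.
            "wrlocks": int(w.group(1)) if w else 0,
            # Assumed meaning: "a->b" is a transition still in progress.
            "unstable": "->" in state,
        })
    return locks


def blocked_locks(line):
    """Locks that are mid-transition while wrlocks are still held."""
    return [l["name"] for l in parse_locks(line) if l["unstable"] and l["wrlocks"] > 0]
```

On the scatter_nudge line above this picks out `inest`, which matches the "waiting for stable (inest lock->sync w=1 dirty)" message: the lock cannot finish its transition to sync while the wrlock is outstanding.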