Bug #8962
closedkcephfs: client does not release revoked cap
0%
Description
several instances where the mds tries to revoke a cap (Ls and Fs have been observed so far) and the client doesn't release. this causes requests to hang.
restarting the mds clears it up.
Updated by Zheng Yan almost 10 years ago
I saw similar hang a few weeks ago. In that case, all OSDs were down, the MDS couldn't submit log event.
Updated by Sage Weil almost 10 years ago
- Priority changed from Immediate to Urgent
Updated by Sage Weil almost 10 years ago
here is the final state of the directory:
ubuntu@teuthology:/a/zack-2014-07-29_14:05:07-ceph-deploy-firefly-testing-basic-multi/384761$ find remote/plana62/log remote/plana62/log remote/plana62/log/ceph-client.admin.10109.log.gz remote/plana62/log/ceph-client.admin.11178.log.gz remote/plana62/log/ceph-client.admin.8373.log.gz remote/plana62/log/ceph-client.admin.8422.log.gz remote/plana62/log/ceph-client.admin.9230.log.gz remote/plana62/log/ceph-client.bootstrap-osd.10560.log.gz remote/plana62/log/ceph-client.bootstrap-osd.10589.log.gz remote/plana62/log/ceph-client.bootstrap-osd.10648.log.gz remote/plana62/log/ceph-client.bootstrap-osd.8969.log.gz remote/plana62/log/ceph-client.bootstrap-osd.8998.log.gz remote/plana62/log/ceph-client.bootstrap-osd.9056.log.gz remote/plana62/log/ceph-client.bootstrap-osd.9617.log.gz remote/plana62/log/ceph-client.bootstrap-osd.9646.log.gz remote/plana62/log/ceph-client.bootstrap-osd.9707.log.gz remote/plana62/log/ceph-mon.plana62.log.gz remote/plana62/log/ceph-osd.0.log.gz remote/plana62/log/ceph-osd.1.log.gz remote/plana62/log/ceph-osd.2.log.gz remote/plana62/log/ceph.log.gz remote/plana62/log/tmpMXKzyM
and the request that hanged is
30884112 mds0 unlink #100017f166c/tmpMXKzyM (teuthology-archive/zack-2014-07-29_14:05:07-ceph-deploy-firefly-testing-basic-multi/384761/remote/plana62/log/tmpMXKzyM)
Updated by Sage Weil almost 10 years ago
and the code that did it is in teuthology.git/teuthology/misc.py:
def pull_directory(remote, remotedir, localdir): """ Copy a remote directory to a local directory. """ log.debug('Transferring archived files from %s:%s to %s', remote.shortname, remotedir, localdir) if not os.path.exists(localdir): os.mkdir(localdir) _, local_tarfile = tempfile.mkstemp(dir=localdir) remote.get_tar(remotedir, local_tarfile, sudo=True) with open(local_tarfile, 'r+') as fb1: tar = tarfile.open(mode='r|gz', fileobj=fb1) while True: ti = tar.next() if ti is None: break if ti.isdir(): # ignore silently; easier to just create leading dirs below pass elif ti.isfile(): sub = safepath.munge(ti.name) safepath.makedirs(root=localdir, path=os.path.dirname(sub)) tar.makefile(ti, targetpath=os.path.join(localdir, sub)) else: if ti.isdev(): type_ = 'device' elif ti.issym(): type_ = 'symlink' elif ti.islnk(): type_ = 'hard link' else: type_ = 'unknown' log.info('Ignoring tar entry: %r type %r', ti.name, type_) continue os.remove(local_tarfile)
where the final os.remove() is what hanged
Updated by Zheng Yan almost 10 years ago
no clue what happened. please dump the mds cache when it happens next time
Updated by Sage Weil almost 10 years ago
Zheng Yan wrote:
no clue what happened. please dump the mds cache when it happens next time
We have a dump, actually... see ~ubuntu/mds.cache.dump
Updated by Zheng Yan almost 10 years ago
Sage Weil wrote:
Zheng Yan wrote:
no clue what happened. please dump the mds cache when it happens next time
We have a dump, actually... see ~ubuntu/mds.cache.dump
can't find it on machine teuthology
Updated by Sage Weil almost 10 years ago
Zheng Yan wrote:
Sage Weil wrote:
Zheng Yan wrote:
no clue what happened. please dump the mds cache when it happens next time
We have a dump, actually... see ~ubuntu/mds.cache.dump
can't find it on machine teuthology
sorry, mira060:~ubuntu/mds.cache.dump
Updated by Zheng Yan almost 10 years ago
[inode 100012f9a68 [2,head] /teuthology-archive/sage-2014-07-17_16:03:19-rados-master-testing-basic-plana/367941/remote/plana41/log/ceph-client.admin.31671.log.gz auth v18862 ap=2+0 s=48 n(v0 b48 1=1+0) (ilink sync->lock) (iversion lock w=1 last_client=317444) caps={204886=pAsXsFscr/pAsLsXsFscr/-@3,317444=p/-@1} | lock=2 caps=1 waiter=1 authpin=1 0x5f623500]
looks suspicious. But no idea why client didn't release Ls cap
Updated by Sage Weil almost 10 years ago
another similar hang:
# cat /sys/kernel/debug/ceph/28f7427e-5558-4ffd-ae1a-51ec3042759a.client333715/mdsc 13216547 mds0 create #100019c52b7/.teuthology.log.swp (teuthology-archive/teuthology-2014-08-01_02:32:01-rados-master-testing-basic-plana/392050/.teuthology.log.swp) 13218990 mds0 getattr #100019c52b7 13220035 mds0 getattr #100019c52b7
2014-08-03 18:04:55.910700 7f9ad82fe700 10 mds.0.locker wrlock_start (ifile sync) on [inode 100019c52b7 [...2,head] /teuthology-archive/teuthology-2014-08-01_02:32:01-rados-master-testing-basic-plana/392050/ auth v35450 ap=1+1 f(v0 m2014-08-01 23:18:03.469804 10=8+2) n(v6 rc2014-08-01 23:18:03.469804 b3646332 299=294+5) (isnap sync r=1) (inest lock) (iversion lock) caps={191425=pAsLsXsFs/-@2,333715=pAsLsXs/p@5} | request=0 lock=1 dirfrag=1 caps=1 authpin=1 0x3b9e5c00] 2014-08-03 18:04:55.910711 7f9ad82fe700 7 mds.0.locker simple_lock on (ifile sync) on [inode 100019c52b7 [...2,head] /teuthology-archive/teuthology-2014-08-01_02:32:01-rados-master-testing-basic-plana/392050/ auth v35450 ap=1+1 f(v0 m2014-08-01 23:18:03.469804 10=8+2) n(v6 rc2014-08-01 23:18:03.469804 b3646332 299=294+5) (isnap sync r=1) (inest lock) (iversion lock) caps={191425=pAsLsXsFs/-@2,333715=pAsLsXs/p@5} | request=0 lock=1 dirfrag=1 caps=1 authpin=1 0x3b9e5c00] 2014-08-03 18:04:55.910727 7f9ad82fe700 7 mds.0.locker issue_caps allowed=pAsLsXsFc, xlocker allowed=pAsLsXsFc on [inode 100019c52b7 [...2,head] /teuthology-archive/teuthology-2014-08-01_02:32:01-rados-master-testing-basic-plana/392050/ auth v35450 ap=1+1 f(v0 m2014-08-01 23:18:03.469804 10=8+2) n(v6 rc2014-08-01 23:18:03.469804 b3646332 299=294+5) (isnap sync r=1) (inest lock) (ifile sync->lock) (iversion lock) caps={191425=pAsLsXsFs/-@2,333715=pAsLsXs/p@5} | request=0 lock=1 dirfrag=1 caps=1 authpin=1 0x3b9e5c00] 2014-08-03 18:04:55.910742 7f9ad82fe700 7 mds.0.locker sending MClientCaps to client.191425 seq 3 new pending pAsLsXs was pAsLsXsFs 2014-08-03 18:04:55.910748 7f9ad82fe700 10 mds.0.30 send_message_client_counted client.191425 seq 295 client_caps(revoke ino 100019c52b7 408287541 seq 3 caps=pAsLsXs dirty=- wanted=- follows 0 size 0/0 ts 1 mtime 2014-08-01 23:18:03.469804) v4 2014-08-03 18:04:55.910755 7f9ad82fe700 1 -- 10.214.134.10:6800/9063 --> 10.214.137.25:0/3163831945 -- client_caps(revoke ino 100019c52b7 408287541 seq 3 caps=pAsLsXs dirty=- wanted=- follows 0 size 0/0 ts 1 mtime 2014-08-01 23:18:03.469804) v4 -- ?+0 0xcd43c00 con 0x50a5b00 2014-08-03 18:04:55.910768 7f9ad82fe700 10 mds.0.cache.ino(100019c52b7) auth_pin by 0x3b9e63f0 on [inode 100019c52b7 [...2,head] /teuthology-archive/teuthology-2014-08-01_02:32:01-rados-master-testing-basic-plana/392050/ auth v35450 ap=2+1 f(v0 m2014-08-01 23:18:03.469804 10=8+2) n(v6 rc2014-08-01 23:18:03.469804 b3646332 299=294+5) (isnap sync r=1) (inest lock) (ifile sync->lock) (iversion lock) caps={191425=pAsLsXs/pAsLsXsFs/-@3,333715=pAsLsXs/p@5} | request=0 lock=1 dirfrag=1 caps=1 authpin=1 0x3b9e5c00] now 2+1 2014-08-03 18:04:55.910788 7f9ad82fe700 7 mds.0.locker wrlock_start waiting on (ifile sync->lock) on [inode 100019c52b7 [...2,head] /teuthology-archive/teuthology-2014-08-01_02:32:01-rados-master-testing-basic-plana/392050/ auth v35450 ap=2+1 f(v0 m2014-08-01 23:18:03.469804 10=8+2) n(v6 rc2014-08-01 23:18:03.469804 b3646332 299=294+5) (isnap sync r=1) (inest lock) (ifile sync->lock) (iversion lock) caps={191425=pAsLsXs/pAsLsXsFs/-@3,333715=pAsLsXs/p@5} | request=0 lock=1 dirfrag=1 caps=1 authpin=1 0x3b9e5c00] 2014-08-03 18:04:55.910801 7f9ad82fe700 10 mds.0.cache.ino(100019c52b7) add_waiter tag 40000000 0x54f86200 !ambig 1 !frozen 1 !freezing 1 2014-08-03 18:04:55.910805 7f9ad82fe700 10 mds.0.locker nudge_log (ifile sync->lock) on [inode 100019c52b7 [...2,head] /teuthology-archive/teuthology-2014-08-01_02:32:01-rados-master-testing-basic-plana/392050/ auth v35450 ap=2+1 f(v0 m2014-08-01 23:18:03.469804 10=8+2) n(v6 rc2014-08-01 23:18:03.469804 b3646332 299=294+5) (isnap sync r=1) (inest lock) (ifile sync->lock) (iversion lock) caps={191425=pAsLsXs/pAsLsXsFs/-@3,333715=pAsLsXs/p@5} | request=0 lock=1 dirfrag=1 caps=1 waiter=1 authpin=1 0x3b9e5c00]
debug mds = 10 logs are on burnupi21 (careful, they are very big!)
the above entry is at offset /var/log/ceph/ceph-mds.burnupi21.log byte 395877069361/408914568239 97% (press RETURN)
Updated by Sage Weil over 9 years ago
35139797 mds0 mkdir #1000207b106/421099 (teuthology-archive/samuelj-2014-08-12_21:17:59-rados-wip-9054-testing-basic-multi/421099) 35151800 mds0 mkdir #100020b68d0/422304 (teuthology-archive/sage-2014-08-13_12:02:27-rgw-wip-mds-contexts-testing-basic-mira/422304)
restarted at 2014-08-13 13:22:35.617241, so these ops are a bit before that.
client is 551831
Updated by Sage Weil over 9 years ago
2014-08-13 12:42:09.831606 7ff844faa700 7 mds.0.server dispatch_client_request client_request(client.551831:35139797 mkdir #1000207b106/421099) ... 2014-08-13 12:42:09.832104 7ff844faa700 7 mds.0.locker simple_lock on (ifile sync) on [inode 1000207b106 [...2,head] /teuthology-archive/samuelj-2014-08-12_21:17:59-rados-wip-9054-testing-basic-multi/ auth v6956066 ap=1+1 f(v0 m2014-08-13 12:41:24.208187 558=0+558) n(v4275 rc2014-08-13 12:42:09.350741 b107169850288 266688=262748+3940) (isnap sync r=1) (inest lock dirty) (iversion lock) caps={191425=pAsLsXsFs/-@2,204886=pAsLsXs/-@1,551831=pAsLsXs/p@72781} | dirtyscattered=1 request=0 lock=1 dirfrag=1 caps=1 dirtyrstat=0 dirtyparent=0 dirty=1 waiter=0 authpin=1 0xcb1e2e00] 2014-08-13 12:42:09.832121 7ff844faa700 7 mds.0.locker issue_caps allowed=pAsLsXsFc, xlocker allowed=pAsLsXsFc on [inode 1000207b106 [...2,head] /teuthology-archive/samuelj-2014-08-12_21:17:59-rados-wip-9054-testing-basic-multi/ auth v6956066 ap=1+1 f(v0 m2014-08-13 12:41:24.208187 558=0+558) n(v4275 rc2014-08-13 12:42:09.350741 b107169850288 266688=262748+3940) (isnap sync r=1) (inest lock dirty) (ifile sync->lock) (iversion lock) caps={191425=pAsLsXsFs/-@2,204886=pAsLsXs/-@1,551831=pAsLsXs/p@72781} | dirtyscattered=1 request=0 lock=1 dirfrag=1 caps=1 dirtyrstat=0 dirtyparent=0 dirty=1 waiter=0 authpin=1 0xcb1e2e00] 2014-08-13 12:42:09.832140 7ff844faa700 7 mds.0.locker sending MClientCaps to client.191425 seq 3 new pending pAsLsXs was pAsLsXsFs 2014-08-13 12:42:09.832145 7ff844faa700 10 mds.0.31 send_message_client_counted client.191425 seq 3297 client_caps(revoke ino 1000207b106 546031007 seq 3 caps=pAsLsXs dirty=- wanted=- follows 0 size 0/0 ts 1 mtime 2014-08-13 12:41:24.208187) v4 2014-08-13 12:42:09.832155 7ff844faa700 1 -- 10.214.134.10:6801/40004 --> 10.214.137.25:0/3163831945 -- client_caps(revoke ino 1000207b106 546031007 seq 3 caps=pAsLsXs dirty=- wanted=- follows 0 size 0/0 ts 1 mtime 2014-08-13 12:41:24.208187) v4 -- ?+0 0x25cb2800 con 0x47929a0 2014-08-13 12:42:09.832172 7ff844faa700 10 mds.0.cache.ino(1000207b106) auth_pin by 0xcb1e35f0 on [inode 1000207b106 [...2,head] /teuthology-archive/samuelj-2014-08-12_21:17:59-rados-wip-9054-testing-basic-multi/ auth v6956066 ap=2+1 f(v0 m2014-08-13 12:41:24.208187 558=0+558) n(v4275 rc2014-08-13 12:42:09.350741 b107169850288 266688=262748+3940) (isnap sync r=1) (inest lock dirty) (ifile sync->lock) (iversion lock) caps={191425=pAsLsXs/pAsLsXsFs/-@3,204886=pAsLsXs/-@1,551831=pAsLsXs/p@72781} | dirtyscattered=1 request=0 lock=1 dirfrag=1 caps=1 dirtyrstat=0 dirtyparent=0 dirty=1 waiter=0 authpin=1 0xcb1e2e00] now 2+1 2014-08-13 12:42:09.832192 7ff844faa700 7 mds.0.locker wrlock_start waiting on (ifile sync->lock) on [inode 1000207b106 [...2,head] /teuthology-archive/samuelj-2014-08-12_21:17:59-rados-wip-9054-testing-basic-multi/ auth v6956066 ap=2+1 f(v0 m2014-08-13 12:41:24.208187 558=0+558) n(v4275 rc2014-08-13 12:42:09.350741 b107169850288 266688=262748+3940) (isnap sync r=1) (inest lock dirty) (ifile sync->lock) (iversion lock) caps={191425=pAsLsXs/pAsLsXsFs/-@3,204886=pAsLsXs/-@1,551831=pAsLsXs/p@72781} | dirtyscattered=1 request=0 lock=1 dirfrag=1 caps=1 dirtyrstat=0 dirtyparent=0 dirty=1 waiter=0 authpin=1 0xcb1e2e00] 2014-08-13 12:42:09.832211 7ff844faa700 10 mds.0.cache.ino(1000207b106) add_waiter tag 40000000 0x2189d3e0 !ambig 1 !frozen 1 !freezing 1 2014-08-13 12:42:09.832214 7ff844faa700 10 mds.0.locker nudge_log (ifile sync->lock) on [inode 1000207b106 [...2,head] /teuthology-archive/samuelj-2014-08-12_21:17:59-rados-wip-9054-testing-basic-multi/ auth v6956066 ap=2+1 f(v0 m2014-08-13 12:41:24.208187 558=0+558) n(v4275 rc2014-08-13 12:42:09.350741 b107169850288 266688=262748+3940) (isnap sync r=1) (inest lock dirty) (ifile sync->lock) (iversion lock) caps={191425=pAsLsXs/pAsLsXsFs/-@3,204886=pAsLsXs/-@1,551831=pAsLsXs/p@72781} | dirtyscattered=1 request=0 lock=1 dirfrag=1 caps=1 dirtyrstat=0 dirtyparent=0 dirty=1 waiter=1 authpin=1 0xcb1e2e00]
...and client isn't releasing the Fs cap.
Updated by Sage Weil over 9 years ago
- Subject changed from mds: unlink hangs due to inest lock hang in stray dir to kcephfs: client does not release revoked cap
- Description updated (diff)
- Category changed from 47 to 53
Updated by Sage Weil over 9 years ago
27898660 mds0 unlink #10001a2e027/ceph-client.admin.10021.log.gz (teuthology-archive/teuthology-2014-08-02_09:30:01-upgrade:firefly-firefly-testing-basic-vps/394383/remote/ubuntu@vpm114.front.sepia.ceph.com/log/ceph-client. admin.10021.log.gz) 33236750 mds0 unlink #10001a5d579/ceph-client.admin.17160.log.gz (teuthology-archive/teuthology-2014-08-03_02:30:01-rados-next-testing-basic-plana/394888/remote/plana79/log/ceph-client.admin.17160.log.gz) 34531973 mds0 unlink #10002ac81a2/.results.log.swx (teuthology-archive/john-2014-08-26_17:50:53-fs:basic-wip-9152-testing-basic-multi/.results.log.swx) 35586069 mds0 unlink #10002acf844/tmpS1rm84 (teuthology-archive/yehudasa-2014-08-26_09:38:51-rgw-wip-9148-testing-basic-multi/452015/remote/plana15/log/tmpS1rm84) 35586109 mds0 unlink #10002acf680/tmphc4wEK (teuthology-archive/yehudasa-2014-08-26_09:38:51-rgw-wip-9148-testing-basic-multi/452057/remote/plana93/tmphc4wEK) 35586150 mds0 unlink #10002acf8b3/tmpxetA_M (teuthology-archive/yehudasa-2014-08-26_09:38:51-rgw-wip-9148-testing-basic-multi/452124/remote/plana69/log/tmpxetA_M) 35586184 mds0 unlink #10002acf8d6/tmpuzeuEn (teuthology-archive/yehudasa-2014-08-26_09:38:51-rgw-wip-9148-testing-basic-multi/452120/remote/burnupi38/log/tmpuzeuEn) 35586199 mds0 unlink #10002acf747/tmpbX2W_s (teuthology-archive/yehudasa-2014-08-26_09:38:51-rgw-wip-9148-testing-basic-multi/452093/remote/burnupi49/tmpbX2W_s) 35586204 mds0 unlink #10002acf795/tmpItnN8V (teuthology-archive/sage-2014-08-26_16:54:45-smoke-master-testing-basic-multi/453263/remote/plana58/tmpItnN8V) 35586217 mds0 unlink #10002acf76e/tmpYJrkNt (teuthology-archive/yehudasa-2014-08-26_09:38:51-rgw-wip-9148-testing-basic-multi/452113/remote/mira072/tmpYJrkNt) 35593280 mds0 unlink #10002acfd27/tmptwtITj (teuthology-archive/yehudasa-2014-08-26_09:38:51-rgw-wip-9148-testing-basic-multi/452108/remote/plana19/log/tmptwtITj) 35593352 mds0 unlink #10002acfd6c/tmphgMufE (teuthology-archive/yehudasa-2014-08-26_09:38:51-rgw-wip-9148-testing-basic-multi/452163/remote/plana09/log/tmphgMufE) 35593469 mds0 unlink #10002acfd8e/tmpGvNVyb (teuthology-archive/yehudasa-2014-08-26_09:38:51-rgw-wip-9148-testing-basic-multi/452175/remote/plana68/log/tmpGvNVyb) 35593601 mds0 unlink #10002acfe01/tmp4W0qqi (teuthology-archive/yehudasa-2014-08-26_09:38:51-rgw-wip-9148-testing-basic-multi/452142/remote/plana49/log/tmp4W0qqi) 35595950 mds0 unlink #10002ad041b/tmpyxaMjK (teuthology-archive/yehudasa-2014-08-26_09:38:51-rgw-wip-9148-testing-basic-multi/452133/remote/mira094/log/tmpyxaMjK) 35595969 mds0 unlink #10002ad045e/tmpBsj_I5 (teuthology-archive/yehudasa-2014-08-26_09:38:51-rgw-wip-9148-testing-basic-multi/452206/remote/plana65/tmpBsj_I5) 35596074 mds0 unlink #10002ad045b/tmp04raza (teuthology-archive/yehudasa-2014-08-26_09:38:51-rgw-wip-9148-testing-basic-multi/452141/remote/plana85/log/tmp04raza) 35601947 mds0 getattr #10001a2e027 35613016 mds0 unlink #10002ad1c50/tmpMSzLYm (teuthology-archive/samuelj-2014-08-26_19:30:53-rados-wip-sam-testing-testing-basic-multi/454102/remote/plana18/tmpMSzLYm) 35613047 mds0 unlink #10002ad2177/tmpivUOro (teuthology-archive/yehudasa-2014-08-26_09:38:51-rgw-wip-9148-testing-basic-multi/452171/remote/plana70/log/tmpivUOro) 35613095 mds0 unlink #10002ad21a7/tmpwRVfX0 (teuthology-archive/yehudasa-2014-08-26_09:38:51-rgw-wip-9148-testing-basic-multi/452099/remote/plana63/log/tmpwRVfX0) 35613921 mds0 symlink (unsafe) #10002ad21d1/worker.log (teuthology-archive/yehudasa-2014-08-26_09:38:51-rgw-wip-9148-testing-basic-multi/452252/worker.log) /var/lib/teuthworker/archive/worker_logs/worker.multi.4768
restarting now (Wed Aug 27 10:06:26 PDT 2014)
Updated by Sage Weil over 9 years ago
ubuntu@burnupi21:/var/log/ceph$ zgrep :27898660 ceph-mds.burnupi21.log.?.gz ceph-mds.burnupi21.log.1.gz:2014-08-26 00:12:29.373631 7f0a37337700 1 -- 10.214.134.10:6801/19153 <== client.764753 10.214.137.23:0/3344997032 1803770 ==== client_request(client.764753:27898660 unlink #10001a2e027/ceph-client.admin.10021.log.gz) v2 ==== 248+0+0 (4113235506 0 0) 0x1ab06000 con 0x7eeb420 ceph-mds.burnupi21.log.1.gz:2014-08-26 00:12:29.373641 7f0a37337700 4 mds.0.server handle_client_request client_request(client.764753:27898660 unlink #10001a2e027/ceph-client.admin.10021.log.gz) v2 ceph-mds.burnupi21.log.1.gz:2014-08-26 00:12:29.373645 7f0a37337700 7 mds.0.cache request_start request(client.764753:27898660 cr=0x1ab06000) ceph-mds.burnupi21.log.1.gz:2014-08-26 00:12:29.373789 7f0a37337700 7 mds.0.server dispatch_client_request client_request(client.764753:27898660 unlink #10001a2e027/ceph-client.admin.10021.log.gz) v2 ceph-mds.burnupi21.log.1.gz:2014-08-26 00:12:29.373829 7f0a37337700 10 mds.0.locker acquire_locks request(client.764753:27898660 cr=0x1ab06000)
Updated by Zheng Yan over 9 years ago
2014-08-26 00:12:29.374924 7f0a37337700 10 mds.0.locker nudge_log (ilink sync->lock) on [inode 10001a2e17b [2,head] /teuthology-archive/teuthology-2014-08-02_09:30:01-upgrade:firefly-firefly-testing-basic-vps/394383/remote/ubuntu@vpm114.front.sepia.ceph.com/log/ceph-client.admin.10021.log.gz auth v3370 ap=2+0 s=48 n(v0 b48 1=1+0) (ilink sync->lock) (iversion lock w=1 last_client=764753) caps={191425=pAsXsFscr/pAsLsXsFscr/-@3,764753=p/-@1} | lock=2 caps=1 waiter=1 authpin=1 0x2a4f3b00]
client.191425 is gitbuilder-archive, which runs 3.13.0-rc3-ceph-00078-g125d725 kernel. Its dmesg has:
[18222160.521420] front: 0003e510: ff ff ff ff 00 00 00 00 b6 3e 06 00 00 00 00 00 .........>...... [18222160.521421] front: 0003e520: 00 00 00 00 00 00 00 00 55 0d 00 00 00 00 00 00 ........U....... [18222160.521423] front: 0003e530: fc 8f 76 27 00 00 00 00 02 00 00 00 00 00 00 00 ..v'............ [18222160.521424] front: 0003e540: 01 00 00 00 00 00 00 00 01 00 00 40 00 01 00 00 ...........@.... ...
I think this bug should be fixed by "commit 54008399 ceph: preallocate buffer for readdir reply"
Updated by Sage Weil over 9 years ago
- Status changed from 12 to Resolved
old bug; restarted those clients.