Bug #16954
openMetadata damage reported with snapshots+smallcache+dirfrags ("object missing on disk")
0%
Description
http://pulpito.ceph.com/jspray-2016-08-07_16:42:13-fs-wip-prompt-frag-distro-basic-mira/353833
2016-08-08 01:33:47.438171 mds.0 172.21.4.128:6812/7613 2 : cluster [ERR] dir 10000002550 object missing on disk; some files may be lost 2016-08-08 01:58:36.368616 mds.0 172.21.4.120:6812/6701 2 : cluster [ERR] dir 1000000873f object missing on disk; some files may be lost
It looks like the 01:33:47 incident fell of the end of the rotated MDS logs.
So looking at the 873f one:
mira028/log/ceph-mds.a.log.3.gz: 2016-08-08 01:58:36.354486 7f5d7f9fb700 7 mds.0.cache traverse: opening base ino 1000000873f snap head 2016-08-08 01:58:36.354497 7f5d7f9fb700 10 mds.0.server traverse_to_auth_dir [dir 1000000873f /client.0/tmp/linux-2.6.33/block/ [2,head] auth v=85 cv=0/0 ap=0+42+42 state=1610612736 f(v0 m2016-08-08 01:58:12.302965 21=21+0) n(v0 rc2016-08-08 01:58:12.302965 b329311 21=21+0) hs=21+0,ss=0+0 dirty=21 | child=1 dirty=1 authpin=0 0xb2a4708] 2016-08-08 01:58:36.354522 7f5d7f9fb700 10 mds.0.server rdlock_path_xlock_dentry dir [dir 1000000873f /client.0/tmp/linux-2.6.33/block/ [2,head] auth v=85 cv=0/0 ap=0+42+42 state=1610612736 f(v0 m2016-08-08 01:58:12.302965 21=21+0) n(v0 rc2016-08-08 01:58:12.302965 b329311 21=21+0) hs=21+0,ss=0+0 dirty=21 | child=1 dirty=1 authpin=0 0xb2a4708] 2016-08-08 01:58:36.354543 7f5d7f9fb700 10 mds.0.server prepare_null_dentry compat_ioctl.c in [dir 1000000873f /client.0/tmp/linux-2.6.33/block/ [2,head] auth v=85 cv=0/0 ap=0+42+42 state=1610612736 f(v0 m2016-08-08 01:58:12.302965 21=21+0) n(v0 rc2016-08-08 01:58:12.302965 b329311 21=21+0) hs=21+0,ss=0+0 dirty=21 | child=1 dirty=1 authpin=0 0xb2a4708] 2016-08-08 01:58:36.354561 7f5d7f9fb700 20 mds.0.cache.dir(1000000873f) lookup (head, 'compat_ioctl.c') 2016-08-08 01:58:36.354565 7f5d7f9fb700 20 mds.0.cache.dir(1000000873f) miss -> (blk-cgroup.h,head) 2016-08-08 01:58:36.354567 7f5d7f9fb700 7 mds.0.server incomplete dir contents for [dir 1000000873f /client.0/tmp/linux-2.6.33/block/ [2,head] auth v=85 cv=0/0 ap=0+42+42 state=1610612736 f(v0 m2016-08-08 01:58:12.302965 21=21+0) n(v0 rc2016-08-08 01:58:12.302965 b329311 21=21+0) hs=21+0,ss=0+0 dirty=21 | child=1 dirty=1 authpin=0 0xb2a4708], fetching 2016-08-08 01:58:36.354583 7f5d7f9fb700 10 mds.0.cache.dir(1000000873f) fetch on [dir 1000000873f /client.0/tmp/linux-2.6.33/block/ [2,head] auth v=85 cv=0/0 ap=0+42+42 state=1610612736 f(v0 m2016-08-08 01:58:12.302965 21=21+0) n(v0 rc2016-08-08 01:58:12.302965 b329311 21=21+0) hs=21+0,ss=0+0 dirty=21 | child=1 dirty=1 authpin=0 0xb2a4708] 2016-08-08 01:58:36.354599 7f5d7f9fb700 10 mds.0.cache.dir(1000000873f) auth_pin by 0xb2a4708 on [dir 1000000873f /client.0/tmp/linux-2.6.33/block/ [2,head] auth v=85 cv=0/0 ap=1+42+42 state=1610612736 f(v0 m2016-08-08 01:58:12.302965 21=21+0) n(v0 rc2016-08-08 01:58:12.302965 b329311 21=21+0) hs=21+0,ss=0+0 dirty=21 | child=1 dirty=1 waiter=1 authpin=1 0xb2a4708] count now 1 + 42 2016-08-08 01:58:36.354646 7f5d7f9fb700 1 -- 172.21.4.120:6812/6701 --> 172.21.4.120:6804/28828 -- osd_op(mds.0.763:498 1.8d280eb2 1000000873f.00000000 [omap-get-header 0~0,omap-get-vals 0~16,getxattr parent] snapc 0=[] ack+read+known_if_redirected+full_force e195) v7 -- ?+0 0xbcb5340 con 0xb0e9600 2016-08-08 01:58:36.355623 7f5d7d6f4700 1 -- 172.21.4.120:6812/6701 <== osd.4 172.21.4.120:6804/28828 10 ==== osd_op_reply(498 1000000873f.00000000 [omap-get-header 0~0,omap-get-vals 0~16,getxattr] v0'0 uv0 ack = -2 ((2) No such file or directory)) v7 ==== 224+0+0 (3960339603 0 0) 0xb16f340 con 0xb0e9600 2016-08-08 01:58:36.368553 7f5d7f1fa700 10 mds.0.cache.dir(1000000873f) _fetched header 0 bytes 0 keys for [dir 1000000873f /client.0/tmp/linux-2.6.33/block/ [2,head] auth v=85 cv=0/0 ap=1+42+42 state=1610612864 f(v0 m2016-08-08 01:58:12.302965 21=21+0) n(v0 rc2016-08-08 01:58:12.302965 b329311 21=21+0) hs=21+0,ss=0+0 dirty=21 | child=1 dirty=1 waiter=1 authpin=1 0xb2a4708] 2016-08-08 01:58:36.368578 7f5d7f1fa700 0 mds.0.cache.dir(1000000873f) _fetched missing object for [dir 1000000873f /client.0/tmp/linux-2.6.33/block/ [2,head] auth v=85 cv=0/0 ap=1+42+42 state=1610612864 f(v0 m2016-08-08 01:58:12.302965 21=21+0) n(v0 rc2016-08-08 01:58:12.302965 b329311 21=21+0) hs=21+0,ss=0+0 dirty=21 | child=1 dirty=1 waiter=1 authpin=1 0xb2a4708] 2016-08-08 01:58:36.368610 7f5d7f1fa700 -1 log_channel(cluster) log [ERR] : dir 1000000873f object missing on disk; some files may be lost 2016-08-08 01:58:36.373126 7f5d7f1fa700 10 mds.0.cache.dir(1000000873f) auth_unpin by 0xb2a4708 on [dir 1000000873f /client.0/tmp/linux-2.6.33/block/ [2,head] auth v=85 cv=0/0 ap=0+42+42 state=1611137026|complete|badfrag f(v0 m2016-08-08 01:58:12.302965 21=21+0) n(v0 rc2016-08-08 01:58:12.302965 b329311 21=21+0) hs=21+0,ss=0+0 dirty=21 | child=1 dirty=1 waiter=1 authpin=0 0xb2a4708] count now 0 + 42 2016-08-08 01:58:36.373156 7f5d7f1fa700 11 mds.0.cache.dir(1000000873f) finish_waiting mask 2 result -5 on [dir 1000000873f /client.0/tmp/linux-2.6.33/block/ [2,head] auth v=85 cv=0/0 ap=0+42+42 state=1611137026|complete|badfrag f(v0 m2016-08-08 01:58:12.302965 21=21+0) n(v0 rc2016-08-08 01:58:12.302965 b329311 21=21+0) hs=21+0,ss=0+0 dirty=21 | child=1 dirty=1 waiter=1 authpin=0 0xb2a4708] 2016-08-08 01:58:36.373181 7f5d7f1fa700 7 mds.0.server dispatch_client_request client_req
It did get split a few seconds before:
2016-08-08 01:58:33.326310 7f5d7d1ef700 10 mds.0.cache.dir(1000000873f) split by 1 bits on [dir 1000000873f /client.0/tmp/linux-2.6.33/block/ [2,head] auth v=85 cv=0/0 state=1610612738|complete f(v0 m2016-08-08 01:58:12.302965 21=21+0) n(v0 rc2016-08-08 01:58:12.302965 b329311 21=21+0) hs=21+0,ss=0+0 dirty=21 | child=1 dirty=1 0xb2a4708]
Updated by John Spray over 7 years ago
Given that it happened twice in one job, seems a decent change it's reproducible, let's see:
http://pulpito.ceph.com/jspray-2016-08-08_03:35:40-fs:thrash-wip-prompt-frag-distro-basic-mira/
Updated by John Spray over 7 years ago
2016-08-08 01:58:36.354646 7f5d7f9fb700 1 -- 172.21.4.120:6812/6701 --> 172.21.4.120:6804/28828 -- osd_op(mds.0.763:498 1.8d280eb2 1000000873f.00000000 [omap-get-header 0~0,omap-get-vals 0~16,getxattr parent] snapc 0=[] ack+read+known_if_redirected+full_force e195) v7 -- ?+0 0xbcb5340 con 0xb0e9600 2016-08-08 01:58:36.355623 7f5d7d6f4700 1 -- 172.21.4.120:6812/6701 <== osd.4 172.21.4.120:6804/28828 10 ==== osd_op_reply(498 1000000873f.00000000 [omap-get-header 0~0,omap-get-vals 0~16,getxattr] v0'0 uv0 ack = -2 ((2) No such file or directory)) v7 ==== 224+0+0 (3960339603 0 0) 0xb16f340 con 0xb0e9600 2016-08-08 02:02:38.515244 7f26568ea700 1 -- 172.21.4.120:6812/7374 --> 172.21.4.120:6804/28828 -- osd_op(mds.0.816:1137 1.8d280eb2 1000000873f.00000000 [stat,omap-set-header 0~258,omap-set-vals 0~3215] snapc 0=[] ondisk+write+known_if_redirected+full_force e204) v7 -- ?+0 0xc3a78c0 con 0xa131600 2016-08-08 02:02:38.744511 7f26524de700 1 -- 172.21.4.120:6812/7374 <== osd.4 172.21.4.120:6804/28828 493 ==== osd_op_reply(1137 1000000873f.00000000 [stat,omap-set-header 0~258,omap-set-vals 0~3215] v204'4708 uv4708 ondisk = 0) v7 ==== 224+0+0 (4210584545 0 0) 0xb9882c0 con 0xa131600
So much later the object does eventually get written on journal segment expiry. So probably the bug here is that during path traversal we are incorrectly identifying a just-split dirfrag as being incomplete, and trying to load from the metadata pool something that hasn't been written there yet.
Updated by John Spray over 7 years ago
Hmm, no failures in that re-run, so it's not quite completely reproducible.