Bug #38452
mds: assert crash loop while unlinking file
Description
Here is the stack trace; it was triggered by a PostgreSQL server trying to unlink a file in its log archive.
ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
 1: (()+0x4058c0) [0x5588549cd8c0]
 2: (()+0x12890) [0x7fdf788d3890]
 3: (gsignal()+0xc7) [0x7fdf779c6e97]
 4: (abort()+0x141) [0x7fdf779c8801]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x250) [0x7fdf78fbd530]
 6: (()+0x26d5a7) [0x7fdf78fbd5a7]
 7: (Server::_unlink_local(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*)+0x15f8) [0x5588547482e8]
 8: (Server::handle_client_unlink(boost::intrusive_ptr<MDRequestImpl>&)+0x961) [0x558854748cd1]
 9: (Server::handle_client_request(MClientRequest*)+0x49b) [0x55885476305b]
 10: (Server::dispatch(Message*)+0x2db) [0x558854766d1b]
 11: (MDSRank::handle_deferrable_message(Message*)+0x434) [0x5588546da1e4]
 12: (MDSRank::_dispatch(Message*, bool)+0x89b) [0x5588546e7a1b]
 13: (MDSRank::retry_dispatch(Message*)+0x12) [0x5588546e8012]
 14: (MDSInternalContextBase::complete(int)+0x67) [0x55885494df87]
 15: (MDSRank::_advance_queues()+0xf1) [0x5588546e69c1]
 16: (MDSRank::ProgressThread::entry()+0x43) [0x5588546e7043]
 17: (()+0x76db) [0x7fdf788c86db]
 18: (clone()+0x3f) [0x7fdf77aa988f]
Attached is a debug-ms=10 log; however, I can't reproduce the crash anymore, as I worked around it by killing the ceph-fuse client on our Postgres server and restarting the MDS.
History
#1 Updated by Patrick Donnelly almost 5 years ago
- Subject changed from ceph-mds assert crash loop while unlinking file to mds: assert crash loop while unlinking file
- Description updated (diff)
- Priority changed from Normal to High
- Target version set to v15.0.0
- Start date deleted (02/22/2019)
- Backport set to nautilus,mimic,luminous
- ceph-qa-suite deleted (fs)
#2 Updated by Jérôme Poulin almost 5 years ago
I can reproduce the problem by deleting a specific file if you need more information. This is a development environment so I can reproduce on demand.
#3 Updated by Jérôme Poulin almost 5 years ago
Here is the last assert I got; do you think they're caused by the same source?
0> 2019-03-03 21:25:31.306 7f2145c48700 -1 /build/ceph-13.2.4/src/mds/Server.cc: In function 'void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*)' thread 7f2145c48700 time 2019-03-03 21:25:31.306591
/build/ceph-13.2.4/src/mds/Server.cc: 6567: FAILED assert(in->first <= straydn->first)
 ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7f214de5d3e2]
 2: (()+0x26d5a7) [0x7f214de5d5a7]
 3: (Server::_unlink_local(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*)+0x15f8) [0x559839a2c2e8]
 4: (Server::handle_client_unlink(boost::intrusive_ptr<MDRequestImpl>&)+0x961) [0x559839a2ccd1]
 5: (Server::handle_client_request(MClientRequest*)+0x49b) [0x559839a4705b]
 6: (Server::dispatch(Message*)+0x2db) [0x559839a4ad1b]
 7: (MDSRank::handle_deferrable_message(Message*)+0x434) [0x5598399be1e4]
 8: (MDSRank::_dispatch(Message*, bool)+0x89b) [0x5598399cba1b]
 9: (MDSRank::retry_dispatch(Message*)+0x12) [0x5598399cc012]
 10: (MDSInternalContextBase::complete(int)+0x67) [0x559839c31f87]
 11: (MDSRank::_advance_queues()+0xf1) [0x5598399ca9c1]
 12: (MDSRank::_dispatch(Message*, bool)+0x2d0) [0x5598399cb450]
 13: (MDSRankDispatcher::ms_dispatch(Message*)+0x15) [0x5598399cbff5]
 14: (MDSDaemon::ms_dispatch(Message*)+0xd3) [0x5598399b5e03]
 15: (DispatchQueue::entry()+0xb92) [0x7f214ded7722]
 16: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f214df762cd]
 17: (()+0x76db) [0x7f214d7686db]
 18: (clone()+0x3f) [0x7f214c94988f]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
#4 Updated by Zheng Yan almost 5 years ago
please add 'debug_mds = 20' to ceph.conf, then trigger the assertion again
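For reference, a minimal sketch of that ceph.conf change (placing it under the [mds] section is an assumption about your config layout):

[mds]
    # raise MDS debug logging until the assertion is captured, then revert
    debug_mds = 20

The same value can also be set at runtime through the admin socket with 'ceph daemon mds.<id> config set debug_mds 20', where mds.<id> is a placeholder for your MDS name.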
#5 Updated by Jérôme Poulin almost 5 years ago
- File ceph-mds.log.xz added
Here is the whole log attached (8.4MB)
#6 Updated by Zheng Yan almost 5 years ago
looks like you set debug_ms to 20, not debug_mds
#7 Updated by Jérôme Poulin almost 5 years ago
Sorry, my mistake; the new file with debug-mds=20 is now 520 MB.
Since the resulting compressed file is 10 MB, do you have a preferred place where I can upload it?
Also, do you have a GPG key I can encrypt the file with since it seems to contain a lot of data (or is it only metadata)?
I can also remove the replay/rejoin/reconnect phase if you prefer to have it attached to this ticket.
#8 Updated by Zheng Yan almost 5 years ago
removing replay/rejoin/reconnect should be good
#10 Updated by Zheng Yan almost 5 years ago
[inode 0x100000367e5 [head,head] /pg_xlog_archives/9.6/smobile/000000200000002C000000BB.00000028.backup.zst auth v145794 ap=2+0 s=193 n(v0 rc2019-03-03 21:27:34.226503) (iversion lock w=1 last_client=25558652) caps={25432667=pAsXsFs/-@2,25558652=pAsXsFscr/-@1} | request=1 lock=2 caps=1 dirty=0 waiter=0 authpin=1 0x55933ec87100]
Looks like a field of the inode is corrupted; no idea what happened. Please grep 'inode 0x100000367e5' from the full log and upload the result.
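For example, a minimal shell sketch (the log path and output file name are assumptions):

# collect every line mentioning the corrupted inode from the full debug log
grep 'inode 0x100000367e5' /var/log/ceph/ceph-mds.log > ceph-mds-0x100000367e5.log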
#11 Updated by Jérôme Poulin almost 5 years ago
- File ceph-mds-0x100000367e5.log View added
#12 Updated by Zheng Yan almost 5 years ago
there is no clue in the log. If you want to fix the issue, you can:
- ceph daemon mds.xx flush journal
- stop mds
- rados -p cephfs_metadata_pool rmomapkey 1000000088b.01800000 000000200000002C000000BB.00000028.backup.zst_head
- restart mds
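Spelled out as concrete commands, a sketch of that procedure could look like the following (the MDS id 'xx', the systemd unit name, and the metadata pool name are assumptions to adapt to your cluster):

# flush the MDS journal so dentry state is persisted to the metadata pool
ceph daemon mds.xx flush journal
# stop the MDS (unit name is an assumption; stop the daemon however you normally do)
systemctl stop ceph-mds@xx
# remove the corrupted dentry's omap key from the directory fragment object
rados -p cephfs_metadata_pool rmomapkey 1000000088b.01800000 000000200000002C000000BB.00000028.backup.zst_head
# start the MDS again
systemctl start ceph-mds@xx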
#13 Updated by Patrick Donnelly almost 5 years ago
- Target version deleted (v15.0.0)
#14 Updated by Jérôme Poulin almost 5 years ago
I had to repeat the procedure a couple of times before everything was cleaned up, but now it is back to normal.
I included a hexdump of a key for your reference.
┌─│15:37:52│0│$ ~
└──╼ rados --cluster dev -p cephfsmeta listomapkeys 1000000088b.01800000 | grep 000000200000002C000000BD.zst
000000200000002C000000BD.zst_head
┌─│15:38:03│0│jerome.poulin@JeromeCauca:~
└──╼ rados --cluster dev -p cephfsmeta rmomapkey 1000000088b.01800000 000000200000002C000000BD.zst_head
┌─│15:38:20│0│$ ~
└──╼ rados --cluster dev -p cephfsmeta listomapkeys 1000000088b.01800000 | grep 000000200000002C000000C2.zst
000000200000002C000000C2.zst_head
┌─│15:38:49│0│$ ~
└──╼ rados --cluster dev -p cephfsmeta rmomapkey 1000000088b.01800000 000000200000002C000000C2.zst_head
┌─│15:38:55│0│$ ~
└──╼ rados --cluster dev -p cephfsmeta listomapkeys 1000000088b.01800000 | grep 000000200000002C000000BE.zst
000000200000002C000000BE.zst_head
┌─│15:39:30│0│$ ~
└──╼ rados --cluster dev -p cephfsmeta getomapval 1000000088b.01800000 000000200000002C000000BE.zst_head
value (462 bytes) :
00000000 fe ff ff ff ff ff ff ff 49 0f 06 a3 01 00 00 24 |........I......$|
00000010 68 03 00 00 01 00 00 00 00 00 00 16 8d 7c 5c f6 |h............|\.|
00000020 62 bc 0d 00 80 00 00 00 00 00 00 00 00 00 00 01 |b...............|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 02 02 18 00 |................|
00000040 00 00 00 00 40 00 01 00 00 00 00 00 40 00 06 00 |....@.......@...|
00000050 00 00 00 00 00 00 00 00 00 00 c4 b1 2c 00 00 00 |............,...|
00000060 00 00 01 00 00 00 ff ff ff ff ff ff ff ff 00 00 |................|
00000070 00 00 00 00 00 00 00 00 00 00 ca 93 a8 5b 00 00 |.............[..|
00000080 00 00 ca 93 a8 5b 00 00 00 00 00 00 00 00 00 00 |.....[..........|
00000090 00 00 03 02 28 00 00 00 00 00 00 00 00 00 00 00 |....(...........|
000000a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000000c0 03 02 38 00 00 00 00 00 00 00 00 00 00 00 00 00 |..8.............|
000000d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000000f0 00 00 00 00 00 00 16 8d 7c 5c f6 62 bc 0d 03 02 |........|\.b....|
00000100 38 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |8...............|
00000110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000130 00 00 00 00 16 8d 7c 5c f6 62 bc 0d 85 39 02 00 |......|\.b...9..|
00000140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000150 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 |................|
00000160 c4 b1 2c 00 00 00 00 00 ff ff ff ff ff ff ff ff |..,.............|
00000170 00 00 00 00 01 01 10 00 00 00 00 00 00 00 00 00 |................|
00000180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001a0 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ff ff |................|
000001b0 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001c0 00 00 fe ff ff ff ff ff ff ff 00 00 00 00       |..............|
000001ce
┌─│15:39:41│0│$ ~
└──╼ rados --cluster dev -p cephfsmeta rmomapkey 1000000088b.01800000 000000200000002C000000BE.zst_head
┌─│15:39:49│0│$ ~
└──╼ rados --cluster dev -p cephfsmeta rmomapkey 1000000088b.01800000 000000200000002C000000BA.zst_head
#15 Updated by Patrick Donnelly about 4 years ago
- Related to Bug #42829: tools/cephfs: linkages injected by cephfs-data-scan have first == head added
#16 Updated by Patrick Donnelly about 4 years ago
- Status changed from New to Resolved
- Assignee set to Zheng Yan
- Backport deleted (nautilus,mimic,luminous)
#17 Updated by Patrick Donnelly over 2 years ago
- Duplicated by Bug #41147: mds: crash loop - Server.cc 6835: FAILED ceph_assert(in->first <= straydn->first) added
#18 Updated by Patrick Donnelly about 1 year ago
- Category set to Correctness/Safety
- Status changed from Resolved to Need More Info
- Assignee deleted (Zheng Yan)
- Priority changed from High to Urgent
- Target version set to v18.0.0
#19 Updated by Patrick Donnelly about 1 year ago
- Duplicated by Bug #54643: crash: void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*): assert(in->first <= straydn->first) added
#20 Updated by Patrick Donnelly about 1 year ago
- Duplicated by Bug #53179: Crash when unlink in corrupted cephfs added
#21 Updated by Felix Stolte 12 months ago
- File mds_crash.tar.gz added
Last week our mds crashed only once with the following error:
2022-12-08T13:05:48.919+0100 7f440afec700 -1 /build/ceph-16.2.10/src/mds/ScatterLock.h: In function 'void ScatterLock::set_xlock_snap_sync(MDSContext*)' thread 7f440afec700 time 2022-12-08T13:05:48.921223+0100
/build/ceph-16.2.10/src/mds/ScatterLock.h: 59: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE)
At 13:05 a snapshot was created, and at 13:18 we observed the _unlink_local error again, leading to the crash loop; we killed the MDS session:
2022-12-08T13:18:57.756+0100 7fec7d8fd700 -1 /build/ceph-16.2.10/src/mds/Server.cc: In function 'void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*)' thread 7fec7d8fd700 time 2022-12-08T13:18:57.757771+0100
/build/ceph-16.2.10/src/mds/Server.cc: 7806: FAILED ceph_assert(in->first <= straydn->first)
User confirmed the file in question was copied to the cephfs shortly before snapshot creation.
While debugging the issue we realized we had an MTU mismatch: all OSDs and clients used an MTU of 9000 while the MDS was using an MTU of 1500. The client in question is using a CephFS kernel mount with kernel 5.4.0-135. I attached the error log of the first crash with scatter_lock, assuming it was the cause of the unlink_local problem.
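For anyone checking for the same mismatch, a minimal sketch (the interface name and peer host are placeholders):

# show the configured MTU of a given interface
ip link show dev eth0
# verify a 9000-byte path end to end: 8972 bytes of ICMP payload + 28 bytes of headers, fragmentation prohibited
ping -M do -s 8972 other-ceph-host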
#22 Updated by Patrick Donnelly 11 months ago
- Related to Bug #58482: mds: catch damage to CDentry's first member before persisting added
#23 Updated by Milind Changire 5 months ago
- Duplicated by Bug #60669: crash: void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*): assert(in->first <= straydn->first) added
#24 Updated by Dhairya Parmar 5 months ago
- Related to Bug #54546: mds: crash due to corrupt inode and omap entry added
#25 Updated by Patrick Donnelly 3 months ago
- Target version deleted (v18.0.0)