
Bug #38452

mds: assert crash loop while unlinking file

Added by Jérôme Poulin almost 5 years ago. Updated 3 months ago.

Status:
Need More Info
Priority:
Urgent
Assignee:
-
Category:
Correctness/Safety
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Here is the stack trace; it was caused by PostgreSQL trying to unlink a file in the log archive.

 ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
 1: (()+0x4058c0) [0x5588549cd8c0]
 2: (()+0x12890) [0x7fdf788d3890]
 3: (gsignal()+0xc7) [0x7fdf779c6e97]
 4: (abort()+0x141) [0x7fdf779c8801]
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x250) [0x7fdf78fbd530]
 6: (()+0x26d5a7) [0x7fdf78fbd5a7]
 7: (Server::_unlink_local(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*)+0x15f8) [0x5588547482e8]
 8: (Server::handle_client_unlink(boost::intrusive_ptr<MDRequestImpl>&)+0x961) [0x558854748cd1]
 9: (Server::handle_client_request(MClientRequest*)+0x49b) [0x55885476305b]
 10: (Server::dispatch(Message*)+0x2db) [0x558854766d1b]
 11: (MDSRank::handle_deferrable_message(Message*)+0x434) [0x5588546da1e4]
 12: (MDSRank::_dispatch(Message*, bool)+0x89b) [0x5588546e7a1b]
 13: (MDSRank::retry_dispatch(Message*)+0x12) [0x5588546e8012]
 14: (MDSInternalContextBase::complete(int)+0x67) [0x55885494df87]
 15: (MDSRank::_advance_queues()+0xf1) [0x5588546e69c1]
 16: (MDSRank::ProgressThread::entry()+0x43) [0x5588546e7043]
 17: (()+0x76db) [0x7fdf788c86db]
 18: (clone()+0x3f) [0x7fdf77aa988f]

I attached a debug-ms=10 log file; however, I can't reproduce the crash, as I fixed it by killing the ceph-fuse client on our PostgreSQL server and restarting the MDS.

ceph-mds.log.gz (225 KB) Jérôme Poulin, 02/22/2019 05:06 PM

ceph-mds.log.xz (82.8 KB) Jérôme Poulin, 03/04/2019 07:16 AM

ceph-mds-active.log.xz (92.7 KB) Jérôme Poulin, 03/04/2019 02:54 PM

ceph-mds-0x100000367e5.log (51.9 KB) Jérôme Poulin, 03/05/2019 02:49 PM

mds_crash.tar.gz - error scatter_lock (125 KB) Felix Stolte, 12/13/2022 09:28 AM


Related issues

Related to CephFS - Bug #42829: tools/cephfs: linkages injected by cephfs-data-scan have first == head Resolved
Related to CephFS - Bug #58482: mds: catch damage to CDentry's first member before persisting Resolved
Related to CephFS - Bug #54546: mds: crash due to corrupt inode and omap entry New
Duplicated by CephFS - Bug #41147: mds: crash loop - Server.cc 6835: FAILED ceph_assert(in->first <= straydn->first) Duplicate
Duplicated by CephFS - Bug #54643: crash: void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*): assert(in->first <= straydn->first) Duplicate
Duplicated by CephFS - Bug #53179: Crash when unlink in corrupted cephfs Duplicate
Duplicated by CephFS - Bug #60669: crash: void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*): assert(in->first <= straydn->first) New

History

#1 Updated by Patrick Donnelly almost 5 years ago

  • Subject changed from ceph-mds assert crash loop while unlinking file to mds: assert crash loop while unlinking file
  • Description updated (diff)
  • Priority changed from Normal to High
  • Target version set to v15.0.0
  • Start date deleted (02/22/2019)
  • Backport set to nautilus,mimic,luminous
  • ceph-qa-suite deleted (fs)

#2 Updated by Jérôme Poulin almost 5 years ago

I can reproduce the problem by deleting a specific file if you need more information. This is a development environment so I can reproduce on demand.

#3 Updated by Jérôme Poulin almost 5 years ago

Here is the last assert I got; do you think it is caused by the same source?

     0> 2019-03-03 21:25:31.306 7f2145c48700 -1 /build/ceph-13.2.4/src/mds/Server.cc: In function 'void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*)' thread 7f2145c48700 time 2019-03-03 21:25:31.306591
/build/ceph-13.2.4/src/mds/Server.cc: 6567: FAILED assert(in->first <= straydn->first)

 ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7f214de5d3e2]
 2: (()+0x26d5a7) [0x7f214de5d5a7]
 3: (Server::_unlink_local(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*)+0x15f8) [0x559839a2c2e8]
 4: (Server::handle_client_unlink(boost::intrusive_ptr<MDRequestImpl>&)+0x961) [0x559839a2ccd1]
 5: (Server::handle_client_request(MClientRequest*)+0x49b) [0x559839a4705b]
 6: (Server::dispatch(Message*)+0x2db) [0x559839a4ad1b]
 7: (MDSRank::handle_deferrable_message(Message*)+0x434) [0x5598399be1e4]
 8: (MDSRank::_dispatch(Message*, bool)+0x89b) [0x5598399cba1b]
 9: (MDSRank::retry_dispatch(Message*)+0x12) [0x5598399cc012]
 10: (MDSInternalContextBase::complete(int)+0x67) [0x559839c31f87]
 11: (MDSRank::_advance_queues()+0xf1) [0x5598399ca9c1]
 12: (MDSRank::_dispatch(Message*, bool)+0x2d0) [0x5598399cb450]
 13: (MDSRankDispatcher::ms_dispatch(Message*)+0x15) [0x5598399cbff5]
 14: (MDSDaemon::ms_dispatch(Message*)+0xd3) [0x5598399b5e03]
 15: (DispatchQueue::entry()+0xb92) [0x7f214ded7722]
 16: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f214df762cd]
 17: (()+0x76db) [0x7f214d7686db]
 18: (clone()+0x3f) [0x7f214c94988f]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

#4 Updated by Zheng Yan almost 5 years ago

please add 'debug_mds = 20' to ceph.conf, then trigger the assertion again
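
For reference, a minimal sketch of how that could be done; mds.<name> is a placeholder for the actual daemon name:

    # Persistent: add to the [mds] section of ceph.conf on the MDS host,
    # then restart the MDS so it takes effect.
    [mds]
        debug_mds = 20

    # Or set it at runtime via the admin socket (assuming it is reachable):
    ceph daemon mds.<name> config set debug_mds 20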

#5 Updated by Jérôme Poulin almost 5 years ago

Here is the whole log, attached (8.4 MB).

#6 Updated by Zheng Yan almost 5 years ago

looks like you set debug_ms to 20, not debug_mds

#7 Updated by Jérôme Poulin almost 5 years ago

Sorry, my mistake; the new log file with debug_mds=20 is now 520 MB.

Since the resulting compressed file is 10 MB, do you have a preferred place where I can upload it?

Also, do you have a GPG key I can encrypt the file with, since it seems to contain a lot of data (or is it only metadata)?

I can also remove the replay/rejoin/reconnect phase if you prefer to have it joined to this ticket.

#8 Updated by Zheng Yan almost 5 years ago

removing replay/rejoin/reconnect should be good

#9 Updated by Jérôme Poulin almost 5 years ago

Here you go.

#10 Updated by Zheng Yan almost 5 years ago

[inode 0x100000367e5 [head,head] /pg_xlog_archives/9.6/smobile/000000200000002C000000BB.00000028.backup.zst auth v145794 ap=2+0 s=193 n(v0 rc2019-03-03 21:27:34.226503) (iversion lock w=1 last_client=25558652) caps={25432667=pAsXsFs/-@2,25558652=pAsXsFscr/-@1} | request=1 lock=2 caps=1 dirty=0 waiter=0 authpin=1 0x55933ec87100]

Looks like a field of the inode is corrupted; no idea what happened. Please grep 'inode 0x100000367e5' from the full log and upload the result.
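
For reference, a minimal sketch of the kind of extraction that produced the attached ceph-mds-0x100000367e5.log; the log path is a placeholder for wherever the full debug_mds=20 log was written:

    # Pull every line mentioning the suspect inode out of the full MDS log.
    grep 'inode 0x100000367e5' /var/log/ceph/ceph-mds.<name>.log > ceph-mds-0x100000367e5.log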

#12 Updated by Zheng Yan almost 5 years ago

there is no clue in the log. If you want to fix the issue, you can:

- ceph daemon mds.xx flush journal
- stop mds
- rados -p cephfs_metadata_pool rmomapkey 1000000088b.01800000 000000200000002C000000BB.00000028.backup.zst_head
- restart mds
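
For reference, a minimal sketch of how the dentry key could be confirmed (and saved) before removal, using the same pool and object names given above:

    # List the dentry omap keys in the directory fragment object and check
    # that the suspect entry is present.
    rados -p cephfs_metadata_pool listomapkeys 1000000088b.01800000 | grep 000000200000002C000000BB
    # Optionally keep a copy of its value for later analysis.
    rados -p cephfs_metadata_pool getomapval 1000000088b.01800000 000000200000002C000000BB.00000028.backup.zst_head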

#13 Updated by Patrick Donnelly almost 5 years ago

  • Target version deleted (v15.0.0)

#14 Updated by Jérôme Poulin almost 5 years ago

I had to repeat the procedure a couple of times before everything was cleaned up, but now it is back to normal.

I included a hexdump of a key for your reference.

┌─│15:37:52│0│$ ~
└──╼ rados --cluster dev -p cephfsmeta listomapkeys 1000000088b.01800000 | grep 000000200000002C000000BD.zst
000000200000002C000000BD.zst_head
┌─│15:38:03│0│jerome.poulin@JeromeCauca:~
└──╼ rados --cluster dev -p cephfsmeta rmomapkey 1000000088b.01800000 000000200000002C000000BD.zst_head
┌─│15:38:20│0│$ ~
└──╼ rados --cluster dev -p cephfsmeta listomapkeys 1000000088b.01800000 | grep 000000200000002C000000C2.zst
000000200000002C000000C2.zst_head
┌─│15:38:49│0│$ ~
└──╼ rados --cluster dev -p cephfsmeta rmomapkey 1000000088b.01800000 000000200000002C000000C2.zst_head
┌─│15:38:55│0│$ ~
└──╼ rados --cluster dev -p cephfsmeta listomapkeys 1000000088b.01800000 | grep 000000200000002C000000BE.zst
000000200000002C000000BE.zst_head
┌─│15:39:30│0│$ ~
└──╼ rados --cluster dev -p cephfsmeta getomapval 1000000088b.01800000 000000200000002C000000BE.zst_head
value (462 bytes) :
00000000  fe ff ff ff ff ff ff ff  49 0f 06 a3 01 00 00 24  |........I......$|
00000010  68 03 00 00 01 00 00 00  00 00 00 16 8d 7c 5c f6  |h............|\.|
00000020  62 bc 0d 00 80 00 00 00  00 00 00 00 00 00 00 01  |b...............|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 02 02 18 00  |................|
00000040  00 00 00 00 40 00 01 00  00 00 00 00 40 00 06 00  |....@.......@...|
00000050  00 00 00 00 00 00 00 00  00 00 c4 b1 2c 00 00 00  |............,...|
00000060  00 00 01 00 00 00 ff ff  ff ff ff ff ff ff 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 ca 93 a8 5b 00 00  |.............[..|
00000080  00 00 ca 93 a8 5b 00 00  00 00 00 00 00 00 00 00  |.....[..........|
00000090  00 00 03 02 28 00 00 00  00 00 00 00 00 00 00 00  |....(...........|
000000a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000000c0  03 02 38 00 00 00 00 00  00 00 00 00 00 00 00 00  |..8.............|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000000f0  00 00 00 00 00 00 16 8d  7c 5c f6 62 bc 0d 03 02  |........|\.b....|
00000100  38 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |8...............|
00000110  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000130  00 00 00 00 16 8d 7c 5c  f6 62 bc 0d 85 39 02 00  |......|\.b...9..|
00000140  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000150  00 00 00 00 01 00 00 00  00 00 00 00 00 00 00 00  |................|
00000160  c4 b1 2c 00 00 00 00 00  ff ff ff ff ff ff ff ff  |..,.............|
00000170  00 00 00 00 01 01 10 00  00 00 00 00 00 00 00 00  |................|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001a0  00 00 00 00 00 00 01 00  00 00 00 00 00 00 ff ff  |................|
000001b0  ff ff 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001c0  00 00 fe ff ff ff ff ff  ff ff 00 00 00 00        |..............|
000001ce

┌─│15:39:41│0│$ ~
└──╼ rados --cluster dev -p cephfsmeta rmomapkey 1000000088b.01800000 000000200000002C000000BE.zst_head
┌─│15:39:49│0│$ ~
└──╼ rados --cluster dev -p cephfsmeta rmomapkey 1000000088b.01800000 000000200000002C000000BA.zst_head

#15 Updated by Patrick Donnelly about 4 years ago

  • Related to Bug #42829: tools/cephfs: linkages injected by cephfs-data-scan have first == head added

#16 Updated by Patrick Donnelly about 4 years ago

  • Status changed from New to Resolved
  • Assignee set to Zheng Yan
  • Backport deleted (nautilus,mimic,luminous)

#17 Updated by Patrick Donnelly over 2 years ago

  • Duplicated by Bug #41147: mds: crash loop - Server.cc 6835: FAILED ceph_assert(in->first <= straydn->first) added

#18 Updated by Patrick Donnelly about 1 year ago

  • Category set to Correctness/Safety
  • Status changed from Resolved to Need More Info
  • Assignee deleted (Zheng Yan)
  • Priority changed from High to Urgent
  • Target version set to v18.0.0

#19 Updated by Patrick Donnelly about 1 year ago

  • Duplicated by Bug #54643: crash: void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*): assert(in->first <= straydn->first) added

#20 Updated by Patrick Donnelly about 1 year ago

  • Duplicated by Bug #53179: Crash when unlink in corrupted cephfs added

#21 Updated by Felix Stolte 12 months ago

Last week our mds crashed only once with the following error:

2022-12-08T13:05:48.919+0100 7f440afec700 -1 /build/ceph-16.2.10/src/mds/ScatterLock.h: In function 'void ScatterLock::set_xlock_snap_sync(MDSContext*)' thread 7f440afec700 time 2022-12-08T13:05:48.921223+0100
/build/ceph-16.2.10/src/mds/ScatterLock.h: 59: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE)

At 13:05 a snapshot was created, and at 13:18 we observed the unlink_local error again, leading to the crash loop, so we killed the MDS session:

2022-12-08T13:18:57.756+0100 7fec7d8fd700 -1 /build/ceph-16.2.10/src/mds/Server.cc: In function 'void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*)' thread 7fec7d8fd700 time 2022-12-08T13:18:57.757771+0100
/build/ceph-16.2.10/src/mds/Server.cc: 7806: FAILED ceph_assert(in->first <= straydn->first)

User confirmed the file in question was copied to the cephfs shortly before snapshot creation.

While debugging the issue we realized we have an MTU mismatch: all OSDs and clients used an MTU of 9000, while the MDS was using an MTU of 1500. The client in question is using a CephFS kernel mount with kernel 5.4.0-135. I attached the error log of the first crash with scatter_lock, assuming it was the cause of the unlink_local problem.

#22 Updated by Patrick Donnelly 11 months ago

  • Related to Bug #58482: mds: catch damage to CDentry's first member before persisting added

#23 Updated by Milind Changire 5 months ago

  • Duplicated by Bug #60669: crash: void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*): assert(in->first <= straydn->first) added

#24 Updated by Dhairya Parmar 5 months ago

  • Related to Bug #54546: mds: crash due to corrupt inode and omap entry added

#25 Updated by Patrick Donnelly 3 months ago

  • Target version deleted (v18.0.0)
