Bug #47698

mds crashed in try_remove_dentries_for_stray after touching file in strange directory

Added by Dan van der Ster 12 months ago. Updated 12 months ago.

Status:
New
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We had a directory "b1" which appeared empty but could not be rmdir'd.
The directory also reported an absurdly large size, another sign of metadata corruption.

We scrubbed the path, and eventually the entire filesystem, but the scrub did not find any problems.

Here is the parent:

# rados listomapkeys -p cephfs_metadata 2002261a436.00000000
b1_head
#
# rados listomapvals -p cephfs_metadata 2002261a436.00000000
b1_head
value (462 bytes) :
00000000  02 00 00 00 00 00 00 00  49 0f 06 a3 01 00 00 06  |........I.......|
00000010  a7 61 22 00 02 00 00 00  00 00 00 f3 28 4e 5f 04  |.a".........(N_.|
00000020  ea d1 32 ed 41 00 00 e1  03 00 00 de 03 00 00 01  |..2.A...........|
00000030  00 00 00 00 02 00 00 00  00 00 00 00 02 02 18 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 ff ff  |................|
00000050  ff ff ff ff ff ff 00 00  00 00 00 00 00 00 00 00  |................|
00000060  00 00 01 00 00 00 ff ff  ff ff ff ff ff ff 00 00  |................|
00000070  00 00 00 00 00 00 00 00  00 00 f3 28 4e 5f 04 ea  |...........(N_..|
00000080  d1 32 2b e6 8d 5e 8e 15  7f 3a 00 00 00 00 00 00  |.2+..^...:......|
00000090  00 00 03 02 28 00 00 00  01 00 00 00 00 00 00 00  |....(...........|
000000a0  f3 28 4e 5f 04 ea d1 32  ff ff ff ff ff ff ff ff  |.(N_...2........|
000000b0  00 00 00 00 00 00 00 00  24 00 00 00 00 00 00 00  |........$.......|
000000c0  03 02 38 00 00 00 02 00  00 00 00 00 00 00 54 f8  |..8...........T.|
000000d0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff 01 00  |................|
000000e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000f0  00 00 00 00 00 00 f3 28  4e 5f 04 ea d1 32 03 02  |.......(N_...2..|
00000100  38 00 00 00 02 00 00 00  00 00 00 00 54 f8 ff ff  |8...........T...|
00000110  ff ff ff ff ff ff ff ff  ff ff ff ff 01 00 00 00  |................|
00000120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000130  00 00 00 00 f3 28 4e 5f  04 ea d1 32 a9 45 00 00  |.....(N_...2.E..|
00000140  00 00 00 00 00 00 00 00  00 00 00 00 01 00 00 00  |................|
00000150  00 00 00 00 9c 05 00 00  00 00 00 00 00 00 00 00  |................|
00000160  00 00 00 00 00 00 00 00  ff ff ff ff ff ff ff ff  |................|
00000170  00 00 00 00 01 01 10 00  00 00 00 00 00 00 00 00  |................|
00000180  00 00 00 00 00 00 00 00  00 00 00 00 00 00 a9 2f  |.............../|
00000190  00 00 00 00 00 00 48 cb  96 5e 22 f6 ea 26 2b e6  |......H..^"..&+.|
000001a0  8d 5e 8e 15 7f 3a 24 00  00 00 00 00 00 00 ff ff  |.^...:$.........|
000001b0  ff ff 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001c0  00 00 fe ff ff ff ff ff  ff ff 00 00 00 00        |..............|
000001ce

And here was the directory itself:

# rados listomapvals -p cephfs_metadata 2002261a706.00000000
#
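For reference, the omap object names above follow CephFS's dirfrag naming convention: the directory's inode number in lowercase hex, a dot, then the 8-hex-digit fragment id (00000000 for an unfragmented directory). A quick sketch, using the parent inode from this ticket:

```shell
ino=0x2002261a436   # the parent directory's inode number (from this ticket)
frag=0              # unfragmented directory
# Dirfrag object name in the metadata pool: <ino hex>.<frag as 8 hex digits>
printf '%x.%08x\n' "$ino" "$frag"
# 2002261a436.00000000
```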

Before trying to remove that omap key directly, I thought maybe I should touch a file in the `b1` dir.
This crashed the MDS:

    -1> 2020-09-30 14:26:44.144 7f3216895700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.11/rpm/el7/BUILD/ceph-14.2.11/src/mds/CDir.cc: In function 'void CDir::try_remove_dentries_for_stray()' thread 7f3216895700 time 2020-09-30 14:26:44.142112
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.11/rpm/el7/BUILD/ceph-14.2.11/src/mds/CDir.cc: 758: FAILED ceph_assert(dn->get_linkage()->is_null())

 ceph version 14.2.11 (f7fdb2f52131f54b891a2ec99d8205561242cdaf) nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x7f3226335025]
 2: (()+0x25c1ed) [0x7f32263351ed]
 3: (CDir::try_remove_dentries_for_stray()+0x259) [0x557075dfad19]
 4: (MDCache::clear_dirty_bits_for_stray(CInode*)+0x1c8) [0x557075d24ef8]
 5: (StrayManager::_eval_stray(CDentry*)+0x3a2) [0x557075d752f2]
 6: (StrayManager::eval_stray(CDentry*)+0x1f) [0x557075d75acf]
 7: (Server::_unlink_local_finish(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*, unsigned long)+0x3f9) [0x557075c7e2c9]
 8: (MDSContext::complete(int)+0x74) [0x557075e8a0e4]
 9: (MDSIOContextBase::complete(int)+0x16f) [0x557075e8a33f]
 10: (MDSLogContextBase::complete(int)+0x40) [0x557075e8a550]
 11: (Finisher::finisher_thread_entry()+0x16f) [0x7f32263be38f]
 12: (()+0x7ea5) [0x7f32241ebea5]
 13: (clone()+0x6d) [0x7f3222e998dd]

The MDS then fails to start; it crashes the same way during clientreplay:

    -3> 2020-09-30 14:27:47.203 7f24f0f55700  1 mds.0.80598 clientreplay_start
    -2> 2020-09-30 14:27:47.232 7f24f0f55700  1 mds.0.80598  still have 0 requests need to be replayed, 1 sessions need to be reclaimed
    -1> 2020-09-30 14:27:47.299 7f24ebf4b700 -1 /builddir/build/BUILD/ceph-14.2.11/src/mds/CDir.cc: In function 'void CDir::try_remove_dentries_for_stray()' thread 7f24ebf4b700 time 2020-09-30 14:27:47.298616
/builddir/build/BUILD/ceph-14.2.11/src/mds/CDir.cc: 758: FAILED ceph_assert(dn->get_linkage()->is_null())

 ceph version 14.2.11 (f7fdb2f52131f54b891a2ec99d8205561242cdaf) nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x7f24fa472025]
 2: (()+0x25c1ed) [0x7f24fa4721ed]
 3: (CDir::try_remove_dentries_for_stray()+0x259) [0x559ea7c3dc89]
 4: (MDCache::clear_dirty_bits_for_stray(CInode*)+0x1c8) [0x559ea7b67e68]
 5: (StrayManager::_eval_stray(CDentry*)+0x3a2) [0x559ea7bb8262]
 6: (StrayManager::eval_stray(CDentry*)+0x1f) [0x559ea7bb8a3f]
 7: (MDCache::scan_stray_dir(dirfrag_t)+0x134) [0x559ea7b3c694]
 8: (MDSContext::complete(int)+0x74) [0x559ea7ccd054]
 9: (MDSRank::_advance_queues()+0xa4) [0x559ea7a5b054]
 10: (MDSRank::ProgressThread::entry()+0x3d) [0x559ea7a5b6cd]
 11: (()+0x7ea5) [0x7f24f8328ea5]
 12: (clone()+0x6d) [0x7f24f6fd68dd]

I have started this recovery procedure: https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/

History

#1 Updated by Dan van der Ster 12 months ago

After finishing the following, the MDS started:

# cephfs-journal-tool --rank=cephfs:0 event recover_dentries summary
Events by type:
  OPEN: 6242
  SESSION: 24
  SESSIONS: 7
  SUBTREEMAP: 240
  UPDATE: 185247
Errors: 0
# cephfs-journal-tool --rank=cephfs:0 journal reset

# cephfs-table-tool all reset session

We are running `ceph tell mds.x scrub start / recursive repair` now.

Interestingly, after the fs was back online we looked for the broken dir, and the dir (and its parent) were gone.

#2 Updated by Dan van der Ster 12 months ago

Here is the `b1` dir at the start of this issue:

# ls -l
total 0
drwxr-xr-x 1 lemon lemon                    1 Sep 30 10:20 60
drwxr-xr-x 1 lemon lemon 18446744073709551615 Sep  1 12:56 b1
drwxr-xr-x 1 lemon lemon                    1 Sep 30 10:20 ce
# rmdir b1
rmdir: failed to remove ‘b1’: Directory not empty
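That size, 18446744073709551615, is 0xffffffffffffffff — i.e. -1 stored in an unsigned 64-bit field — which suggests the directory's recursive stats underflowed. A quick check:

```shell
# The size ls reported, shown in hex: all ones in 64 bits.
printf '%x\n' 18446744073709551615
# ffffffffffffffff
# -1 interpreted as an unsigned 64-bit value:
printf '%u\n' -1
# 18446744073709551615
```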

#3 Updated by Zheng Yan 12 months ago

Try deleting 'b1' using 'rados rmomapkey'. If you have debug_mds=10, it should be easy to get b1's parent dirfrag (the corresponding omap object).
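A sketch of what that would look like here, using the parent dirfrag object from this ticket. The omap key is the dentry name plus the "_head" suffix, matching the listomapkeys output above; the actual rados commands are shown as comments since they are destructive.

```shell
dentry="b1"            # name of the broken entry
key="${dentry}_head"   # omap key format: <dentry name>_head
echo "$key"
# b1_head

# Then, against the parent dirfrag object (destructive -- save the
# value first so it can be restored if needed):
#   rados -p cephfs_metadata getomapval 2002261a436.00000000 "$key" b1_head.bak
#   rados -p cephfs_metadata rmomapkey  2002261a436.00000000 "$key"
```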

#4 Updated by Dan van der Ster 12 months ago

b1 was no longer there after we followed the recover_dentries procedure, so it is gone.

#5 Updated by Dan van der Ster 12 months ago

Now I'm trying to repair the metadata on this fs so it is fully consistent.
When I run 'scrub start / force recursive repair' and 'scrub start ~mdsdir force recursive repair', it finds and corrects some rstats and complains about dup inodes (in stray and in the fs). But after running those scrubs a few times, they eventually come back clean.

Then I fail over the MDS to a standby, and the standby complains about dup inodes again, and scrub again reports errors.

The mds log is at ceph-post-file: 1e6b1579-d030-4895-847b-f901f9414399

How do I clean up this fs?
I've heard in the past about "compare the dup inodes, and rmomapkey for the lower versions" and I've heard about running scan_links.
Should I do either of those now?

#6 Updated by Zheng Yan 12 months ago

Dan van der Ster wrote:

Now I'm trying to repair the metadata on this fs so it is fully consistent.
When I run 'scrub start / force recursive repair' and 'scrub start ~mdsdir force recursive repair', it finds and corrects some rstats and complains about dup inodes (in stray and in the fs). But after running those scrubs a few times, they eventually come back clean.

Then I fail over the MDS to a standby, and the standby complains about dup inodes again, and scrub again reports errors.

The mds log is at ceph-post-file: 1e6b1579-d030-4895-847b-f901f9414399

How do I clean up this fs?

'cephfs-data-scan scan_links' can fix this error

I've heard in the past about "compare the dup inodes, and rmomapkey for the lower versions" and I've heard about running scan_links.
Should I do either of those now?
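For reference, a sketch of how `cephfs-data-scan scan_links` is typically run. The filesystem name `cephfs` is from this ticket, but the surrounding fail/joinable steps are assumptions about the standard offline-repair flow, not commands from this ticket; the tool must run while no MDS is active.

```shell
# Stop the MDS rank(s) for the filesystem; scan_links must not run
# against a live MDS.
ceph fs fail cephfs

# Walk the metadata pool, fixing duplicate/dangling dentry links and
# bad nlink counts.
cephfs-data-scan scan_links

# Allow an MDS to take over the rank again.
ceph fs set cephfs joinable true
```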

#7 Updated by Dan van der Ster 12 months ago

Thanks. scan_links fixed these dups (and some Bad nlink). For future reference, it took 2 hours to complete on a cluster with 4.3 million objects in cephfs_metadata.

Do we want to keep this open to work on the original corrupted dir which caused the crash? Or, in future, should we just `rados rmomapkey` the strange-looking directory?
