Bug #44380

mds: MDCache.cc: 2335: FAILED ceph_assert(!"unmatched rstat rbytes" == g_conf()->mds_verify_scatter)

Added by Patrick Donnelly 7 months ago. Updated 4 months ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
% Done:

0%

Source:
Q/A
Tags:
Backport:
octopus,nautilus
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature:

Description

2020-02-29T13:29:53.496+0000 7f67c0191700 10 MDSIOContextBase::complete: 13C_MDC_Recover
2020-02-29T13:29:53.496+0000 7f67c0191700 10 MDSContext::complete: 13C_MDC_Recover
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0 RecoveryQueue::_recovered _recovered r=0 size=2 mtime=2020-02-29T13:29:33.859441+0000 for [inode 0x10000000016 [head,head] /subdir/21 auth v356 ap=2 recovering s=2 n() (ifile excl->sync) (iversion lock) cr={5509=0-4194304@fffffffffffffffd} | ptrwaiter=0 request=0 lock=0 caps=0 dirtyrstat=1 dirty=0 authpin=1 0x55e6b547a800]
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.locker check_inode_max_size new_ranges {} update_size 1 on [inode 0x10000000016 [head,head] /subdir/21 auth v356 ap=2 s=2 n() (ifile excl->sync) (iversion lock) cr={5509=0-4194304@fffffffffffffffd} | ptrwaiter=0 request=0 lock=0 caps=0 dirtyrstat=1 dirty=0 authpin=1 0x55e6b547a800]
2020-02-29T13:29:53.496+0000 7f67c0191700 15 mds.0.cache.ino(0x10000000016) project_inode 0x10000000016
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache.dir(0x10000000000) pre_dirty 358
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache.den(0x10000000000 21) pre_dirty [dentry #0x1/subdir/21 [head,head] auth (dversion lock) pv=358 v=356 ino=0x10000000016 state=1073741824 | lock=0 fragmenting=0 inodepin=1 dirty=0 authpin=0 0x55e6b5478480]
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache.ino(0x10000000016) pre_dirty 358 (current v 356)
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.locker check_inode_max_size client_ranges {5509=0-4194304@fffffffffffffffd} -> {}
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.locker check_inode_max_size size 2 -> 2
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.locker check_inode_max_size mtime 2020-02-29T13:29:33.000000+0000 -> 2020-02-29T13:29:33.859441+0000
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache predirty_journal_parents linkunlink=0 primary_dn follows head [inode 0x10000000016 [head,head] /subdir/21 auth v356 pv358 ap=2 s=2 n()->n(v0 rc2020-02-29T13:29:33.859441+0000 b2)/n() (ifile excl->sync) (iversion lock) cr={5509=0-4194304@fffffffffffffffd} | ptrwaiter=0 request=0 lock=0 caps=0 dirtyrstat=1 dirty=0 authpin=1 0x55e6b547a800]
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache.dir(0x10000000000) auth_pin by 0x55e6b54cd440 on [dir 0x10000000000 /subdir/ [2,head] auth v=357 cv=357/357 ap=1+2 state=1073741825|complete f(v0 100=100+0)/f(v0 m2020-02-29T13:29:28.159623+0000 100=100+0) n(v1 rc2020-02-29T13:29:28.160623+0000 b188 99=99+0) hs=100+0,ss=0+0 | child=1 frozen=0 dirty=0 authpin=1 0x55e6b54b2580] count now 1
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache.dir(0x10000000000) project_fnode 0x55e6b45e4b10
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache.dir(0x10000000000) pre_dirty 359
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache  taking wrlock on (inest lock) on [inode 0x10000000000 [...2,head] /subdir/ auth v410 f(v0 m2020-02-29T13:29:28.159623+0000 100=100+0) n(v1 rc2020-02-29T13:29:28.160623+0000 b190 101=100+1) (inest lock) (iversion lock) | dirtyscattered=0 request=0 lock=0 dirfrag=1 caps=0 dirtyrstat=0 dirty=1 waiter=0 authpin=0 0x55e6b5419800]
2020-02-29T13:29:53.496+0000 7f67c0191700  7 mds.0.locker wrlock_force  on (inest lock) on [inode 0x10000000000 [...2,head] /subdir/ auth v410 f(v0 m2020-02-29T13:29:28.159623+0000 100=100+0) n(v1 rc2020-02-29T13:29:28.160623+0000 b190 101=100+1) (inest lock) (iversion lock) | dirtyscattered=0 request=0 lock=0 dirfrag=1 caps=0 dirtyrstat=0 dirty=1 waiter=0 authpin=0 0x55e6b5419800]
2020-02-29T13:29:53.496+0000 7f67c0191700 10  mds.0.cache.snaprealm(0x1 seq 1 0x55e6b5404500) have_past_parents_open [1,head]
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache projected_rstat_inode_to_frag first head linkunlink 0 [inode 0x10000000016 [head,head] /subdir/21 auth v356 pv358 ap=2 s=2 n()->n(v0 rc2020-02-29T13:29:33.859441+0000 b2)/n() (ifile excl->sync) (iversion lock) cr={5509=0-4194304@fffffffffffffffd} | ptrwaiter=0 request=0 lock=0 caps=0 dirtyrstat=1 dirty=0 authpin=1 0x55e6b547a800]
2020-02-29T13:29:53.496+0000 7f67c0191700 20 mds.0.cache     frag head is [2,head]
2020-02-29T13:29:53.496+0000 7f67c0191700 20 mds.0.cache  inode update is [head,head]
2020-02-29T13:29:53.496+0000 7f67c0191700 20 mds.0.cache  floor of head from parent dn [dentry #0x1/subdir/21 [head,head] auth (dversion lock) pv=358 v=356 ino=0x10000000016 state=1073741824 | lock=0 fragmenting=0 inodepin=1 dirty=0 authpin=0 0x55e6b5478480]
2020-02-29T13:29:53.496+0000 7f67c0191700 10  mds.0.cache.snaprealm(0x1 seq 1 0x55e6b5404500) have_past_parents_open [1,head]
2020-02-29T13:29:53.496+0000 7f67c0191700 10  mds.0.cache.snaprealm(0x1 seq 1 0x55e6b5404500) get_snaps  (seq 1 cached_seq 1)
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache _project_rstat_inode_to_frag [head,head]
2020-02-29T13:29:53.496+0000 7f67c0191700 20 mds.0.cache   inode           rstat n(v0 rc2020-02-29T13:29:33.859441+0000 b2)
2020-02-29T13:29:53.496+0000 7f67c0191700 20 mds.0.cache   inode accounted_rstat n()
2020-02-29T13:29:53.496+0000 7f67c0191700 20 mds.0.cache                   delta n(v0 rc2020-02-29T13:29:33.859441+0000 b2)
2020-02-29T13:29:53.496+0000 7f67c0191700 20 mds.0.cache  projecting to head [2,head] n(v1 rc2020-02-29T13:29:28.160623+0000 b188 99=99+0)
2020-02-29T13:29:53.496+0000 7f67c0191700 20 mds.0.cache   project to [2,head] n(v1 rc2020-02-29T13:29:28.160623+0000 b188 99=99+0)
2020-02-29T13:29:53.496+0000 7f67c91a3700  1 -- [v2:172.21.15.2:6832/2828792259,v1:172.21.15.2:6833/2828792259] >> v1:172.21.15.40:0/1109131888 conn(0x55e6b53bc000 legacy=0x55e6b541b000 unknown :6833 s=STATE_CONNECTION_ESTABLISHED l=1).read_bulk peer close file descriptor 38
2020-02-29T13:29:53.496+0000 7f67c91a3700  1 -- [v2:172.21.15.2:6832/2828792259,v1:172.21.15.2:6833/2828792259] >> v1:172.21.15.40:0/1109131888 conn(0x55e6b53bc000 legacy=0x55e6b541b000 unknown :6833 s=STATE_CONNECTION_ESTABLISHED l=1).read_until read failed
2020-02-29T13:29:53.496+0000 7f67c0191700 20 mds.0.cache       result [2,head] n(v1 rc2020-02-29T13:29:33.859441+0000 b190 99=99+0) [dir 0x10000000000 /subdir/ [2,head] auth pv=359 v=357 cv=357/357 ap=1+2 state=1073741825|complete f(v0 100=100+0)/f(v0 m2020-02-29T13:29:28.159623+0000 100=100+0)->f(v0 100=100+0)/f(v0 m2020-02-29T13:29:28.159623+0000 100=100+0) n(v1 rc2020-02-29T13:29:28.160623+0000 b188 99=99+0)->n(v1 rc2020-02-29T13:29:33.859441+0000 b190 99=99+0)/n(v1 rc2020-02-29T13:29:28.160623+0000 b188 99=99+0) hs=100+0,ss=0+0 | child=1 frozen=0 dirty=0 authpin=1 0x55e6b54b2580]
2020-02-29T13:29:53.496+0000 7f67c91a3700  1 --1- [v2:172.21.15.2:6832/2828792259,v1:172.21.15.2:6833/2828792259] >> v1:172.21.15.40:0/1109131888 conn(0x55e6b53bc000 0x55e6b541b000 :6833 s=OPENED pgs=2 cs=1 l=1).handle_message read tag failed
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache.ino(0x10000000016) clear_dirty_rstat
2020-02-29T13:29:53.496+0000 7f67c91a3700  1 --1- [v2:172.21.15.2:6832/2828792259,v1:172.21.15.2:6833/2828792259] >> v1:172.21.15.40:0/1109131888 conn(0x55e6b53bc000 0x55e6b541b000 :6833 s=OPENED pgs=2 cs=1 l=1).fault on lossy channel, failing
2020-02-29T13:29:53.496+0000 7f67c0191700  7 mds.0.locker local_wrlock_grab  on (iversion lock) on [inode 0x10000000000 [...2,head] /subdir/ auth v410 f(v0 m2020-02-29T13:29:28.159623+0000 100=100+0) n(v1 rc2020-02-29T13:29:28.160623+0000 b190 101=100+1) (inest lock w=1) (iversion lock) | dirtyscattered=0 request=0 lock=1 dirfrag=1 caps=0 dirtyrstat=0 dirty=1 waiter=0 authpin=0 0x55e6b5419800]
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache.ino(0x10000000000) auth_pin by 0x55e6b54cd440 on [inode 0x10000000000 [...2,head] /subdir/ auth v410 ap=1 f(v0 m2020-02-29T13:29:28.159623+0000 100=100+0) n(v1 rc2020-02-29T13:29:28.160623+0000 b190 101=100+1) (inest lock w=1) (iversion lock w=1) | dirtyscattered=0 request=0 lock=2 dirfrag=1 caps=0 dirtyrstat=0 dirty=1 waiter=0 authpin=1 0x55e6b5419800] now 1
2020-02-29T13:29:53.496+0000 7f67c0191700 15 mds.0.cache.dir(0x1) adjust_nested_auth_pins 1 on [dir 0x1 / [2,head] auth v=411 cv=409/409 dir_auth=0 ap=0+1 state=1610612737|complete f(v0 m2020-02-29T13:29:28.065624+0000 1=0+1) n(v2 rc2020-02-29T13:29:28.160623+0000 b190 101=100+1) hs=1+0,ss=0+0 dirty=1 | child=1 subtree=1 subtreetemp=0 dirty=1 waiter=0 authpin=0 0x55e6b529f600] by 0x55e6b5419800 count now 0/1
2020-02-29T13:29:53.496+0000 7f67c0191700 10  mds.0.cache.snaprealm(0x3 seq 1 0x55e6b5404280) have_past_parents_open [1,head]
2020-02-29T13:29:53.496+0000 7f67c0191700 15 mds.0.cache.ino(0x10000000000) project_inode 0x10000000000
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache.dir(0x1) pre_dirty 412
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache.den(0x1 subdir) pre_dirty [dentry #0x1/subdir [2,head] auth (dversion lock) pv=412 v=410 ino=0x10000000000 state=1610612736 | request=0 lock=0 inodepin=1 dirty=1 authpin=0 0x55e6b5416fc0]
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache.ino(0x10000000000) pre_dirty 412 (current v 410)
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache predirty_journal_parents frag->inode on [dir 0x10000000000 /subdir/ [2,head] auth pv=359 v=357 cv=357/357 ap=1+2 state=1073741825|complete f(v0 100=100+0)/f(v0 m2020-02-29T13:29:28.159623+0000 100=100+0)->f(v0 100=100+0)/f(v0 m2020-02-29T13:29:28.159623+0000 100=100+0) n(v1 rc2020-02-29T13:29:28.160623+0000 b188 99=99+0)->n(v1 rc2020-02-29T13:29:33.859441+0000 b190 99=99+0)/n(v1 rc2020-02-29T13:29:28.160623+0000 b188 99=99+0) hs=100+0,ss=0+0 | child=1 frozen=0 dirty=0 authpin=1 0x55e6b54b2580]
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache project_rstat_frag_to_inode [2,head]
2020-02-29T13:29:53.496+0000 7f67c0191700 20 mds.0.cache   frag           rstat n(v1 rc2020-02-29T13:29:33.859441+0000 b190 99=99+0)
2020-02-29T13:29:53.496+0000 7f67c0191700 20 mds.0.cache   frag accounted_rstat n(v1 rc2020-02-29T13:29:28.160623+0000 b188 99=99+0)
2020-02-29T13:29:53.496+0000 7f67c0191700 20 mds.0.cache                  delta n(v1 rc2020-02-29T13:29:33.859441+0000 b2)
2020-02-29T13:29:53.496+0000 7f67c0191700 20 mds.0.cache  projecting to [2,head] n(v1 rc2020-02-29T13:29:28.160623+0000 b190 101=100+1)
2020-02-29T13:29:53.496+0000 7f67c0191700 20 mds.0.cache         result [2,head] n(v1 rc2020-02-29T13:29:33.859441+0000 b192 101=100+1)
2020-02-29T13:29:53.496+0000 7f67c0191700 -1 log_channel(cluster) log [ERR] : unmatched rstat rbytes on single dirfrag 0x10000000000, inode has n(v1 rc2020-02-29T13:29:33.859441+0000 b192 101=100+1), dirfrag has n(v1 rc2020-02-29T13:29:33.859441+0000 b190 99=99+0)
2020-02-29T13:29:53.497+0000 7f67c0191700 -1 /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/huge/release/15.1.0-1435-g8e904aa/rpm/el8/BUILD/ceph-15.1.0-1435-g8e904aa/src/mds/MDCache.cc: In function 'void MDCache::predirty_journal_parents(MutationRef, EMetaBlob*, CInode*, CDir*, int, int, snapid_t)' thread 7f67c0191700 time 2020-02-29T13:29:53.497532+0000
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/huge/release/15.1.0-1435-g8e904aa/rpm/el8/BUILD/ceph-15.1.0-1435-g8e904aa/src/mds/MDCache.cc: 2335: FAILED ceph_assert(!"unmatched rstat rbytes" == g_conf()->mds_verify_scatter)

 ceph version 15.1.0-1435-g8e904aa (8e904aaab37a2615eec03fd9c9f166a80774d91e) octopus (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f67ce2e30f0]
 2: (()+0x27930a) [0x7f67ce2e330a]
 3: (MDCache::predirty_journal_parents(boost::intrusive_ptr<MutationImpl>, EMetaBlob*, CInode*, CDir*, int, int, snapid_t)+0x232a) [0x55e6b32d923a]
 4: (Locker::check_inode_max_size(CInode*, bool, unsigned long, unsigned long, utime_t)+0x414) [0x55e6b33a9ab4]
 5: (RecoveryQueue::_recovered(CInode*, int, unsigned long, utime_t)+0x8bd) [0x55e6b33819dd]
 6: (MDSContext::complete(int)+0x56) [0x55e6b34c92d6]
 7: (MDSIOContextBase::complete(int)+0xa3) [0x55e6b34c94e3]
 8: (Filer::C_Probe::finish(int)+0xb5) [0x55e6b3565e75]
 9: (Context::complete(int)+0xd) [0x55e6b31f9f5d]
 10: (Finisher::finisher_thread_entry()+0x1a5) [0x7f67ce373a05]
 11: (()+0x82de) [0x7f67cd10f2de]
 12: (clone()+0x43) [0x7f67cbca2133]
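Side note on reading the assert expression itself (this is just an explanation of the C++ idiom, not MDS code): `!"unmatched rstat rbytes"` negates a string literal, which is a non-null pointer and therefore always true, so the whole expression is `false == g_conf()->mds_verify_scatter` and the assert fires exactly when `mds_verify_scatter` is true; the string exists only so the message shows up in the abort output. A minimal Python analogue of the check, with made-up parameter names:

```python
def check_rstats_consistent(inode_rbytes, dirfrag_rbytes, verify_scatter):
    """Model of the C++ idiom ceph_assert(!"msg" == flag).

    (not "msg") is always False in Python too, so the assertion trips
    exactly when verify_scatter is True; with the flag off, a mismatch
    is tolerated (the MDS only logs the cluster error).
    """
    if inode_rbytes != dirfrag_rbytes:
        # C++: ceph_assert(!"unmatched rstat rbytes" == g_conf()->mds_verify_scatter)
        assert (not "unmatched rstat rbytes") == verify_scatter, \
            "unmatched rstat rbytes"
        return False  # mismatch observed but not fatal
    return True
```

With `verify_scatter=False` the mismatch above (b192 vs b190) would only have produced the `[ERR]` cluster log line; with it `True`, the daemon aborts.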

From: /ceph/teuthology-archive/pdonnell-2020-02-29_02:56:38-kcephfs-wip-pdonnell-testing-20200229.001503-distro-basic-smithi/4811126/remote/smithi002/log/ceph-mds.a.log.gz

The only PR in this batch related to this code path is: https://github.com/ceph/ceph/pull/33538

I don't see how #33538 would cause this.

The last run of kcephfs on Feb 16th did not have this issue: http://pulpito.ceph.com/pdonnell-2020-02-16_17:35:17-kcephfs-wip-pdonnell-testing-20200215.033325-distro-basic-smithi/
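The failing arithmetic in the log can be condensed into a toy model (a sketch only, not the MDS implementation; the function and variable names are made up). File 21's recovered rstat delta (b2) is first projected into the dirfrag, then the frag's delta is projected into the parent inode, but the parent's stored rstat (b190) was already 2 bytes ahead of the dirfrag's (b188), so the two ends diverge by exactly those 2 bytes:

```python
# Toy model of the two projection steps in predirty_journal_parents,
# using the byte counts from the log above.
def propagate(child_rstat, child_accounted, frag_rstat,
              frag_accounted, parent_rstat):
    # step 1: child inode -> dirfrag (_project_rstat_inode_to_frag)
    delta = child_rstat - child_accounted
    frag_rstat += delta
    # step 2: dirfrag -> parent inode (project_rstat_frag_to_inode)
    delta = frag_rstat - frag_accounted
    parent_rstat += delta
    return frag_rstat, parent_rstat

# The dirfrag had lost dentry "21" (2 bytes), so it holds b188 over 99
# entries, while the parent inode's stored rstat still counts it (b190).
# Replaying file 21's recovered rstat (b2, never accounted):
frag, parent = propagate(child_rstat=2, child_accounted=0,
                         frag_rstat=188, frag_accounted=188,
                         parent_rstat=190)
# frag ends at 190 bytes, parent at 192 -> "unmatched rstat rbytes":
# inode b192, dirfrag b190, the exact values in the assert message.
```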


Related issues

Copied to fs - Backport #45220: octopus: mds: MDCache.cc: 2335: FAILED ceph_assert(!"unmatched rstat rbytes" == g_conf()->mds_verify_scatter) Resolved
Copied to fs - Backport #45497: nautilus: mds: MDCache.cc: 2335: FAILED ceph_assert(!"unmatched rstat rbytes" == g_conf()->mds_verify_scatter) Resolved

History

#1 Updated by Xiubo Li 6 months ago

  • Assignee set to Xiubo Li

#2 Updated by Xiubo Li 6 months ago

  • Status changed from New to In Progress

#3 Updated by Zheng Yan 6 months ago

The test that triggered the assertion is test_fragmented_injection. I think we need to add the following settings:

self.fs.set_ceph_conf('mds', 'mds verify scatter', False)
self.fs.set_ceph_conf('mds', 'mds debug scatterstat', False)

#4 Updated by Xiubo Li 6 months ago

Zheng Yan wrote:

The test that triggered the assertion is test_fragmented_injection. I think we need to add the following settings:

self.fs.set_ceph_conf('mds', 'mds verify scatter', False)
self.fs.set_ceph_conf('mds', 'mds debug scatterstat', False)

Yeah, I tried them by setting them to True; locally I couldn't reproduce it after running more than 30 times so far.

#5 Updated by Greg Farnum 6 months ago

Zheng Yan wrote:

The test that triggered the assertion is test_fragmented_injection. I think we need to add the following settings:

self.fs.set_ceph_conf('mds', 'mds verify scatter', False)
self.fs.set_ceph_conf('mds', 'mds debug scatterstat', False)

That's just disabling the debug checking; I see this test pokes at our metadata but I think the data scan should clean it up. What steps lead to the rstats going wrong?

#6 Updated by Zheng Yan 6 months ago

Greg Farnum wrote:

That's just disabling the debug checking; I see this test pokes at our metadata but I think the data scan should clean it up. What steps lead to the rstats going wrong?

data scan does not fix fragstat / rstat. The test uses 'rados rmomapkey' to remove a dentry, which caused the incorrect rstat.
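To make the injection concrete (a toy model with made-up structures, not the real Ceph on-disk format): a dirfrag stores its dentries as omap keys such as "21_head" on a metadata-pool object, while the parent's fragstat/rstat are aggregate counters kept separately. Removing one omap key the way the test does deletes the dentry and touches nothing else, so the aggregates go stale:

```python
# Toy model: a dirfrag as an omap-like dict of dentry key -> file size,
# with the parent's cached aggregate stats held separately.
def remove_omap_key(dirfrag, dentry):
    """Delete a dentry the way 'rados rmomapkey' does: the omap entry
    vanishes, but nothing updates the parent's fragstat/rstat."""
    dirfrag.pop(dentry)

dirfrag = {f"{i}_head": 2 for i in range(100)}   # 100 files, 2 bytes each
parent_stats = {"nfiles": 100, "rbytes": sum(dirfrag.values())}

remove_omap_key(dirfrag, "21_head")
# The dirfrag now sums to 198 bytes over 99 entries, while the parent
# still claims 100 files / 200 bytes -- the stats have diverged, and
# cephfs-data-scan reinserting the dentry does not reconcile them.
```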

#7 Updated by Xiubo Li 6 months ago

Zheng Yan wrote:

Greg Farnum wrote:

That's just disabling the debug checking; I see this test pokes at our metadata but I think the data scan should clean it up. What steps lead to the rstats going wrong?

data scan does not fix fragstat / rstat. The test uses 'rados rmomapkey' to remove a dentry, which caused the incorrect rstat.

The "test_fragmented_injection" test case will stop all the mds daemons when doing the "rmomapkey 21_head" for "subdir/21" and the data scan. Later, when the mds daemons start again, all the dentries under "subdir/" will be loaded from the pools, including "subdir/21", so it shouldn't matter that the data_scan does not update the rstat/dirstat, right?

I checked the teuthology.log; the test_fragmented_injection test case succeeded:

2020-02-29T13:29:54.230 INFO:teuthology.orchestra.run.smithi002.stderr:2020-02-29T13:29:54.198+0000 7fceda039700  1 -- 172.21.15.2:0/3828554392 shutdown_connections
2020-02-29T13:29:54.230 INFO:teuthology.orchestra.run.smithi002.stderr:2020-02-29T13:29:54.198+0000 7fceda039700  1 -- 172.21.15.2:0/3828554392 wait complete.
2020-02-29T13:29:54.230 INFO:tasks.cephfs_test_runner:test_fragmented_injection (tasks.cephfs.test_data_scan.TestDataScan) ... ok
2020-02-29T13:29:54.231 INFO:tasks.cephfs_test_runner:Starting test: test_parallel_execution (tasks.cephfs.test_data_scan.TestDataScan)

All the dentries under /subdir/ should be regular files and the file count should be 100, with no directories. But just when taking the wrlock, there is one extra directory and one extra regular file added by someone: we can see "n(v1 rc2020-02-29T13:29:28.160623+0000 b190 101=100+1)" in:

2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache  taking wrlock on (inest lock) on [inode 0x10000000000 [...2,head] /subdir/ auth v410 f(v0 m2020-02-29T13:29:28.159623+0000 100=100+0) n(v1 rc2020-02-29T13:29:28.160623+0000 b190 101=100+1) (inest lock) (iversion lock) | dirtyscattered=0 request=0 lock=0 dirfrag=1 caps=0 dirtyrstat=0 dirty=1 waiter=0 authpin=0 0x55e6b5419800]
2020-02-29T13:29:53.496+0000 7f67c0191700  7 mds.0.locker wrlock_force  on (inest lock) on [inode 0x10000000000 [...2,head] /subdir/ auth v410 f(v0 m2020-02-29T13:29:28.159623+0000 100=100+0) n(v1 rc2020-02-29T13:29:28.160623+0000 b190 101=100+1) (inest lock) (iversion lock) | dirtyscattered=0 request=0 lock=0 dirfrag=1 caps=0 dirtyrstat=0 dirty=1 waiter=0 authpin=0 0x55e6b5419800]

But just before taking the wrlock, we can see that "/subdir" still has 99 regular files and no directory:

2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache predirty_journal_parents linkunlink=0 primary_dn follows head [inode 0x10000000016 [head,head] /subdir/21 auth v356 pv358 ap=2 s=2 n()->n(v0 rc2020-02-29T13:29:33.859441+0000 b2)/n() (ifile excl->sync) (iversion lock) cr={5509=0-4194304@fffffffffffffffd} | ptrwaiter=0 request=0 lock=0 caps=0 dirtyrstat=1 dirty=0 authpin=1 0x55e6b547a800]
2020-02-29T13:29:53.496+0000 7f67c0191700 10 mds.0.cache.dir(0x10000000000) auth_pin by 0x55e6b54cd440 on [dir 0x10000000000 /subdir/ [2,head] auth v=357 cv=357/357 ap=1+2 state=1073741825|complete f(v0 100=100+0)/f(v0 m2020-02-29T13:29:28.159623+0000 100=100+0) n(v1 rc2020-02-29T13:29:28.160623+0000 b188 99=99+0) hs=100+0,ss=0+0 | child=1 frozen=0 dirty=0 authpin=1 0x55e6b54b2580] count now 1

Including "subdir/21" here, the regular file count is 100.

After the test_fragmented_injection test case, the "subdir/*" entries are kept and can be used by other test cases.

BTW, in which case will the recovery queue be triggered? I didn't see any file recovery test case run.

BRs

#8 Updated by Zheng Yan 6 months ago

Xiubo Li wrote:

Zheng Yan wrote:

Greg Farnum wrote:

That's just disabling the debug checking; I see this test pokes at our metadata but I think the data scan should clean it up. What steps lead to the rstats going wrong?

rstat can go wrong if the MDS merges dirfrags automatically before the data scan

data scan does not fix fragstat / rstat. The test uses 'rados rmomapkey' to remove a dentry, which caused the incorrect rstat.

The "test_fragmented_injection" test case will stop all the mds daemons when doing the "rmomapkey 21_head" for "subdir/21" and the data scan. Later, when the mds daemons start again, all the dentries under "subdir/" will be loaded from the pools, including "subdir/21", so it shouldn't matter that the data_scan does not update the rstat/dirstat, right?

I checked the teuthology.log; the test_fragmented_injection test case succeeded:

[...]

All the dentries under /subdir/ should be regular files and the file count should be 100, with no directories. But just when taking the wrlock, there is one extra directory and one extra regular file added by someone: we can see "n(v1 rc2020-02-29T13:29:28.160623+0000 b190 101=100+1)" in:

the one dir in the rstat is the directory itself

[...]

But just before taking the wrlock, we can see that "/subdir" still has 99 regular files and no directory:

[...]

Including "subdir/21" here, the regular file count is 100.

After the test_fragmented_injection test case, the "subdir/*" entries are kept and can be used by other test cases.

BTW, in which case will the recovery queue be triggered? I didn't see any file recovery test case run.

BRs

#9 Updated by Xiubo Li 6 months ago

I could reproduce it 100% locally; the steps are:
1) run the test_fragmented_injection test case
2) manually remove subdir/ from the mountpoint.

  441 2020-04-08T01:06:41.005-0400 7f00ebd3d700 10 mds.0.server _unlink_local [dentry #0x1/subdir/21 [head,head] auth (dn xlock x=1 by 0x557c45093      000) (dversion lock w=1 last_client=35588) pv=0 v=303 ap=2 ino=0x1000000020a state=1073741824 | request=1 lock=2 fragmenting=0 inodepin=1 au      thpin=1 0x557c450ade00]                                                                                                                     
  442 2020-04-08T01:06:41.005-0400 7f00ebd3d700 10 mds.0.cache.dir(0x10000000000) pre_dirty 305
  443 2020-04-08T01:06:41.005-0400 7f00ebd3d700 10 mds.0.cache.den(0x10000000000 21) pre_dirty [dentry #0x1/subdir/21 [head,head] auth (dn xlock x      =1 by 0x557c45093000) (dversion lock w=1 last_client=35588) pv=305 v=303 ap=2 ino=0x1000000020a state=1073741824 | request=1 lock=2 fragment      ing=0 inodepin=1 authpin=1 0x557c450ade00]
  444 2020-04-08T01:06:41.005-0400 7f00ebd3d700 15 mds.0.cache.ino(0x1000000020a) project_inode 0x1000000020a
  445 2020-04-08T01:06:41.005-0400 7f00ebd3d700 10 mds.0.cache.dir(0x601) pre_dirty 3
  446 2020-04-08T01:06:41.005-0400 7f00ebd3d700 10 mds.0.cache.den(0x601 1000000020a) pre_dirty [dentry #0x100/stray1/1000000020a [2,head] auth NU      LL (dn xlock x=1 by 0x557c45093000) (dversion lock w=1 last_client=35588) pv=3 v=2 ap=2 ino=(nil) state=1342177345|new|bottomlru | request=1       lock=2 authpin=1 0x557c450c8600]
  447 2020-04-08T01:06:41.005-0400 7f00ebd3d700 10 mds.0.cache.ino(0x1000000020a) pre_dirty 3 (current v 1)
  448 2020-04-08T01:06:41.005-0400 7f00ebd3d700 10 mds.0.cache predirty_journal_parents do_parent_mtime linkunlink=-1 primary_dn follows head [ino      de 0x1000000020a [head,head] ~mds0/stray1/1000000020a auth v1 pv3 ap=3 s=2 n()->n(v0 rc2020-04-08T01:06:40.995200-0400)/n() (ilink xlock x=1       by 0x557c45093000) (iversion lock w=1 last_client=35588) caps={35588=pAsXsFscr/-@3} | request=1 lock=3 caps=1 authpin=1 0x557c450b5c00]
  449 2020-04-08T01:06:41.005-0400 7f00ebd3d700 10 mds.0.cache.dir(0x10000000000) auth_pin by 0x557c45093000 on [dir 0x10000000000 /subdir/ [2,hea      d] auth v=304 cv=304/304 ap=2+5 state=1074003969|complete f(v0 100=100+0)/f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0) n(v0 rc2020-04-07      T20:35:35.448479-0400 b188 99=99+0) hs=100+0,ss=0+0 | child=1 frozen=0 dirty=0 authpin=1 0x557c450cc000] count now 2
  450 2020-04-08T01:06:41.006-0400 7f00ebd3d700 10 mds.0.cache.dir(0x10000000000) project_fnode 0x557c45058410
  451 2020-04-08T01:06:41.006-0400 7f00ebd3d700 10 mds.0.cache.dir(0x10000000000) pre_dirty 306
  452 2020-04-08T01:06:41.006-0400 7f00ebd3d700 10 mds.0.cache predirty_journal_parents bumping change_attr to 1 on 0x557c450cc000
  453 2020-04-08T01:06:41.006-0400 7f00ebd3d700 10 mds.0.cache predirty_journal_parents updating mtime on [dir 0x10000000000 /subdir/ [2,head] aut      h pv=306 v=304 cv=304/304 ap=2+5 state=1074003969|complete f(v0 100=100+0)/f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0)->f(v0 m2020-04-0      8T01:06:40.995200-0400 100=100+0)/f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0) n(v0 rc2020-04-07T20:35:35.448479-0400 b188 99=99+0)->n(v      0 rc2020-04-07T20:35:35.448479-0400 b188 99=99+0) hs=100+0,ss=0+0 | child=1 frozen=0 dirty=0 authpin=1 0x557c450cc000]
  454 2020-04-08T01:06:41.006-0400 7f00ebd3d700 10 mds.0.cache predirty_journal_parents updating size on [dir 0x10000000000 /subdir/ [2,head] auth       pv=306 v=304 cv=304/304 ap=2+5 state=1074003969|complete f(v0 100=100+0)/f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0)->f(v0 m2020-04-08      T01:06:40.995200-0400 100=100+0)/f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0) n(v0 rc2020-04-07T20:35:35.448479-0400 b188 99=99+0)->n(v0       rc2020-04-08T01:06:40.995200-0400 b188 99=99+0)/n(v0 rc2020-04-07T20:35:35.448479-0400 b188 99=99+0) hs=100+0,ss=0+0 | child=1 frozen=0 dir      ty=0 authpin=1 0x557c450cc000]
  455 2020-04-08T01:06:41.006-0400 7f00ebd3d700 10  mds.0.cache.snaprealm(0x1 seq 1 0x557c45023b80) have_past_parents_open [1,head]
  456 2020-04-08T01:06:41.006-0400 7f00ebd3d700 10 mds.0.cache projected_rstat_inode_to_frag first head linkunlink -1 [inode 0x1000000020a [head,h      ead] ~mds0/stray1/1000000020a auth v1 pv3 ap=3 s=2 n()->n(v0 rc2020-04-08T01:06:40.995200-0400)/n() (ilink xlock x=1 by 0x557c45093000) (ive      rsion lock w=1 last_client=35588) caps={35588=pAsXsFscr/-@3} | request=1 lock=3 caps=1 authpin=1 0x557c450b5c00]
  457 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache     frag head is [2,head] 
  458 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache  inode update is [head,head]
  459 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache  floor of 2 from parent dn [dentry #0x100/stray1/1000000020a [2,head] auth NULL (dn       xlock x=1 by 0x557c45093000) (dversion lock w=1 last_client=35588) pv=3 v=2 ap=2 ino=(nil) state=1342177345|new|bottomlru | request=1 lock=      2 authpin=1 0x557c450c8600]
  460 2020-04-08T01:06:41.006-0400 7f00ebd3d700 10  mds.0.cache.snaprealm(0x1 seq 1 0x557c45023b80) have_past_parents_open [1,head]
  461 2020-04-08T01:06:41.006-0400 7f00ebd3d700 10  mds.0.cache.snaprealm(0x1 seq 1 0x557c45023b80) get_snaps  (seq 1 cached_seq 1)
  462 2020-04-08T01:06:41.006-0400 7f00ebd3d700 10 mds.0.cache _project_rstat_inode_to_frag [head,head]
  463 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache   inode           rstat n(v0 rc2020-04-08T01:06:40.995200-0400)
  464 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache   inode accounted_rstat n()
  465 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache                   delta n()
  466 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache  projecting to head [2,head] n(v0 rc2020-04-08T01:06:40.995200-0400 b188 99=99+0)
  467 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache   project to [2,head] n(v0 rc2020-04-08T01:06:40.995200-0400 b188 99=99+0)
  468 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache       result [2,head] n(v0 rc2020-04-08T01:06:40.995200-0400 b188 99=99+0) [dir 0x1      0000000000 /subdir/ [2,head] auth pv=306 v=304 cv=304/304 ap=2+5 state=1074003969|complete f(v0 100=100+0)/f(v0 m2020-04-07T20:35:35.443058-      0400 100=100+0)->f(v0 m2020-04-08T01:06:40.995200-0400 99=99+0)/f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0) n(v0 rc2020-04-07T20:35:35.      448479-0400 b188 99=99+0)->n(v0 rc2020-04-08T01:06:40.995200-0400 b188 99=99+0)/n(v0 rc2020-04-07T20:35:35.448479-0400 b188 99=99+0) hs=100+      0,ss=0+0 | child=1 frozen=0 dirty=0 authpin=1 0x557c450cc000]
  469 2020-04-08T01:06:41.006-0400 7f00ebd3d700  7 mds.0.locker local_wrlock_grab  on (iversion lock) on [inode 0x10000000000 [...2,head] /subdir/       auth v407 ap=2 f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0) n(v0 rc2020-04-07T20:35:35.448479-0400 b190 101=100+1) (isnap sync r=1) (in      est lock w=1) (ifile lock w=1) (iversion lock) caps={35588=pAsLsXs/-@2},l=35588 | request=0 lock=3 dirfrag=1 caps=1 dirty=0 waiter=0 authpin      =1 0x557c4504fc00]
  470 2020-04-08T01:06:41.006-0400 7f00ebd3d700 10  mds.0.cache.snaprealm(0x3 seq 1 0x557c45023900) have_past_parents_open [1,head]
  471 2020-04-08T01:06:41.006-0400 7f00ebd3d700 15 mds.0.cache.ino(0x10000000000) project_inode 0x10000000000
  472 2020-04-08T01:06:41.006-0400 7f00ebd3d700 10 mds.0.cache.dir(0x1) pre_dirty 408
  473 2020-04-08T01:06:41.006-0400 7f00ebd3d700 10 mds.0.cache.den(0x1 subdir) pre_dirty [dentry #0x1/subdir [2,head] auth (dversion lock) pv=408       v=407 ino=0x10000000000 state=1073741824 | request=0 lock=0 inodepin=1 dirty=0 authpin=0 0x557c45044600]                                    
  474 2020-04-08T01:06:41.006-0400 7f00ebd3d700 10 mds.0.cache.ino(0x10000000000) pre_dirty 408 (current v 407)
  475 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache predirty_journal_parents add_delta f(v0 m2020-04-08T01:06:40.995200-0400 99=99+0)
  476 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache predirty_journal_parents         - f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0)
  477 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache predirty_journal_parents     gives f(v0 m2020-04-08T01:06:40.995200-0400 99=99+0) o      n [inode 0x10000000000 [...2,head] /subdir/ auth v407 pv408 ap=2 f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0)->f(v0 m2020-04-08T01:06:40      .995200-0400 99=99+0) n(v0 rc2020-04-07T20:35:35.448479-0400 b190 101=100+1)->n(v0 rc2020-04-07T20:35:35.448479-0400 b190 101=100+1) (isnap       sync r=1) (inest lock w=1) (ifile lock w=1) (iversion lock w=1 last_client=35588) caps={35588=pAsLsXs/-@2},l=35588 | request=0 lock=4 dirfra      g=1 caps=1 dirty=0 waiter=0 authpin=1 0x557c4504fc00]
  478 2020-04-08T01:06:41.006-0400 7f00ebd3d700 10 mds.0.cache predirty_journal_parents frag->inode on [dir 0x10000000000 /subdir/ [2,head] auth p      v=306 v=304 cv=304/304 ap=2+5 state=1074003969|complete f(v0 100=100+0)/f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0)->f(v0 m2020-04-08T0      1:06:40.995200-0400 99=99+0) n(v0 rc2020-04-07T20:35:35.448479-0400 b188 99=99+0)->n(v0 rc2020-04-08T01:06:40.995200-0400 b188 99=99+0)/n(v0       rc2020-04-07T20:35:35.448479-0400 b188 99=99+0) hs=100+0,ss=0+0 | child=1 frozen=0 dirty=0 authpin=1 0x557c450cc000]
  479 2020-04-08T01:06:41.006-0400 7f00ebd3d700 10 mds.0.cache project_rstat_frag_to_inode [2,head]
  480 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache   frag           rstat n(v0 rc2020-04-08T01:06:40.995200-0400 b188 99=99+0)
  481 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache   frag accounted_rstat n(v0 rc2020-04-07T20:35:35.448479-0400 b188 99=99+0)
  482 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache                  delta n(v0 rc2020-04-08T01:06:40.995200-0400)
  483 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache  projecting to [2,head] n(v0 rc2020-04-07T20:35:35.448479-0400 b190 101=100+1)
  484 2020-04-08T01:06:41.006-0400 7f00ebd3d700 20 mds.0.cache         result [2,head] n(v0 rc2020-04-08T01:06:40.995200-0400 b190 101=100+1)
  485 2020-04-08T01:06:41.007-0400 7f00ebd3d700 -1 log_channel(cluster) log [ERR] : unmatched rstat rbytes on single dirfrag 0x10000000000, inode       has n(v0 rc2020-04-08T01:06:40.995200-0400 b190 101=100+1), dirfrag has n(v0 rc2020-04-08T01:06:40.995200-0400 b188 99=99+0)
  486 2020-04-08T01:06:41.031-0400 7f00ebd3d700 -1 /data/ceph/src/mds/MDCache.cc: In function 'void MDCache::predirty_journal_parents(MutationRef,       EMetaBlob*, CInode*, CDir*, int, int, snapid_t)' thread 7f00ebd3d700 time 2020-04-08T01:06:41.008923-0400
  487 /data/ceph/src/mds/MDCache.cc: 2335: FAILED ceph_assert(!"unmatched rstat rbytes" == g_conf()->mds_verify_scatter)
  488 
  489  ceph version 15.2.0-328-ge7d98c3 (e7d98c37a96c82f76e51ca97ffc39657178ebce7) octopus (rc)
  490  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1aa) [0x7f00f4a4b56c]
  491  2: (()+0x14807ee) [0x7f00f4a4b7ee]
  492  3: (MDCache::predirty_journal_parents(boost::intrusive_ptr<MutationImpl>, EMetaBlob*, CInode*, CDir*, int, int, snapid_t)+0x24f0) [0x557c41      adc0f6]
  493  4: (Server::_unlink_local(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*)+0x73f) [0x557c41a1092b]
  494  5: (Server::handle_client_unlink(boost::intrusive_ptr<MDRequestImpl>&)+0x15b0) [0x557c41a1001a]
  495  6: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0x129c) [0x557c419e91cc]
  496  7: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest const> const&)+0x15ac) [0x557c419e7720]
  497  8: (Server::dispatch(boost::intrusive_ptr<Message const> const&)+0x99f) [0x557c419cf575]
  498  9: (MDSRank::handle_deferrable_message(boost::intrusive_ptr<Message const> const&)+0x814) [0x557c4195e864]
  499  10: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x387) [0x557c4195d319]
  500  11: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0xdd) [0x557c4195cea7]
  501  12: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x1f3) [0x557c4193c5f3]
  502  13: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0xe9) [0x7f00f4c145b1]
  503  14: (DispatchQueue::entry()+0x5fc) [0x7f00f4c13284]
  504  15: (DispatchQueue::DispatchThread::entry()+0x1c) [0x7f00f4d6c9ae]
  505  16: (Thread::entry_wrapper()+0x78) [0x7f00f49e91b0]
  506  17: (Thread::_entry_func(void*)+0x18) [0x7f00f49e912e]
  507  18: (()+0x7e65) [0x7f00f2705e65]
  508  19: (clone()+0x6d) [0x7f00f13b388d]
  509 
  510 2020-04-08T01:06:41.057-0400 7f00ebd3d700 -1 *** Caught signal (Aborted) **
  511  in thread 7f00ebd3d700 thread_name:ms_dispatch
  512 
  513  ceph version 15.2.0-328-ge7d98c3 (e7d98c37a96c82f76e51ca97ffc39657178ebce7) octopus (rc)
  514  1: (()+0xd8d80c) [0x557c41ed880c]
  515  2: (()+0xf5f0) [0x7f00f270d5f0]
  516  3: (gsignal()+0x37) [0x7f00f12eb337]
  517  4: (abort()+0x148) [0x7f00f12eca28]
  518  5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x379) [0x7f00f4a4b73b]
  519  6: (()+0x14807ee) [0x7f00f4a4b7ee]
  520  7: (MDCache::predirty_journal_parents(boost::intrusive_ptr<MutationImpl>, EMetaBlob*, CInode*, CDir*, int, int, snapid_t)+0x24f0) [0x557c41adc0f6]
  521  8: (Server::_unlink_local(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*)+0x73f) [0x557c41a1092b]
  522  9: (Server::handle_client_unlink(boost::intrusive_ptr<MDRequestImpl>&)+0x15b0) [0x557c41a1001a]
  523  10: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0x129c) [0x557c419e91cc]
  524  11: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest const> const&)+0x15ac) [0x557c419e7720]
  525  12: (Server::dispatch(boost::intrusive_ptr<Message const> const&)+0x99f) [0x557c419cf575]
  526  13: (MDSRank::handle_deferrable_message(boost::intrusive_ptr<Message const> const&)+0x814) [0x557c4195e864]
  527  14: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x387) [0x557c4195d319]
  528  15: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0xdd) [0x557c4195cea7]
  529  16: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x1f3) [0x557c4193c5f3]
  530  17: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0xe9) [0x7f00f4c145b1]
  531  18: (DispatchQueue::entry()+0x5fc) [0x7f00f4c13284]
  532  19: (DispatchQueue::DispatchThread::entry()+0x1c) [0x7f00f4d6c9ae]
  533  20: (Thread::entry_wrapper()+0x78) [0x7f00f49e91b0]
  534  21: (Thread::_entry_func(void*)+0x18) [0x7f00f49e912e]
  535  22: (()+0x7e65) [0x7f00f2705e65]
  536  23: (clone()+0x6d) [0x7f00f13b388d]
  537  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
  538

The problem is that after the dirfrag merge completes, the dirfrag rstat is incorrect: it is "b188 99=99+0", but "b190 100=100+0" is expected:

  3754 2020-04-07T20:36:35.649-0400 7f00e5530700 10 MDSContext::complete: 18C_MDC_FragmentPrep
  3755 2020-04-07T20:36:35.649-0400 7f00e5530700 10 mds.0.cache fragment_logged 0x10000000000 bits -1 on [inode 0x10000000000 [...2,head] /subdir/ auth v403 pv407 ap=2 f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0)->f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0) n(v0 rc2020-04-07T20:35:35.448479-0400 b190 101=100+1)->n(v0 rc2020-04-07T20:35:35.448479-0400 b190 101=100+1) (idft lock w=1) (inest lock w=1) (ifile excl w=1) (iversion lock) caps={35564=pAsLsXsFs/-@5},l=35564 | request=0 lock=3 dirfrag=1 caps=1 waiter=0 authpin=1 0x557c4504fc00]
  3756 2020-04-07T20:36:35.649-0400 7f00e5530700 15 mds.0.cache.ino(0x10000000000) pop_and_dirty_projected_inode 0x10000000000 v407
  3757 2020-04-07T20:36:35.649-0400 7f00e5530700 10 mds.0.cache.ino(0x10000000000) mark_dirty [inode 0x10000000000 [...2,head] /subdir/ auth v403 pv407 ap=2 f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0)->f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0) n(v0 rc2020-04-07T20:35:35.448479-0400 b190 101=100+1)->n(v0 rc2020-04-07T20:35:35.448479-0400 b190 101=100+1) (idft lock w=1) (inest lock w=1) (ifile excl w=1) (iversion lock) caps={35564=pAsLsXsFs/-@5},l=35564 | request=0 lock=3 dirfrag=1 caps=1 waiter=0 authpin=1 0x557c4504fc00]
  3758 2020-04-07T20:36:35.649-0400 7f00e5530700 10 mds.0.cache.den(0x1 subdir) mark_dirty [dentry #0x1/subdir [2,head] auth (dversion lock) pv=407 v=405 ino=0x10000000000 state=1073741824 | request=0 lock=0 inodepin=1 authpin=0 0x557c45044600]
  3759 2020-04-07T20:36:35.649-0400 7f00e5530700 10 mds.0.cache.dir(0x1) _mark_dirty (already dirty) [dir 0x1 / [2,head] auth v=407 cv=0/0 dir_auth=0 ap=0+2 state=1610874881|complete f(v0 m2020-04-07T20:35:34.406134-0400 1=0+1) n(v1 rc2020-04-07T20:35:35.448479-0400 b190 101=100+1) hs=1+0,ss=0+0 dirty=1 | child=1 subtree=1 subtreetemp=0 dirty=1 waiter=0 authpin=0 0x557c4430b700] version 407
  3760 2020-04-07T20:36:35.649-0400 7f00e5530700 10 mds.0.cache  storing result frag [dir 0x10000000000 /subdir/ [2,head] auth v=304 cv=0/0 state=1610874889|complete|frozendir f(v0 100=100+0)/f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0) n(v0 rc2020-04-07T20:35:35.448479-0400 b188 99=99+0) hs=100+0,ss=0+0 | child=1 frozen=1 dirty=1 0x557c450cc000]
  3761 2020-04-07T20:36:35.649-0400 7f00e5530700 10 mds.0.cache.dir(0x10000000000) auth_pin by 0x557c44f8d200 on [dir 0x10000000000 /subdir/ [2,head] auth v=304 cv=0/0 ap=1+0 state=1610874889|complete|frozendir f(v0 100=100+0)/f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0) n(v0 rc2020-04-07T20:35:35.448479-0400 b188 99=99+0) hs=100+0,ss=0+0 | child=1 frozen=1 dirty=1 authpin=1 0x557c450cc000] count now 1
  3762 2020-04-07T20:36:35.649-0400 7f00e5530700 10 mds.0.cache.dir(0x10000000000) commit want 0 on [dir 0x10000000000 /subdir/ [2,head] auth v=304 cv=0/0 ap=1+0 state=1610878985|complete|frozendir|fragmenting f(v0 100=100+0)/f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0) n(v0 rc2020-04-07T20:35:35.448479-0400 b188 99=99+0) hs=100+0,ss=0+0 | child=1 frozen=1 dirty=1 authpin=1 0x557c450cc000]
  3763 2020-04-07T20:36:35.649-0400 7f00e5530700 10 mds.0.cache.dir(0x10000000000) auth_pin by 0x557c450cc000 on [dir 0x10000000000 /subdir/ [2,head] auth v=304 cv=0/0 ap=2+0 state=1610878985|complete|frozendir|fragmenting f(v0 100=100+0)/f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0) n(v0 rc2020-04-07T20:35:35.448479-0400 b188 99=99+0) hs=100+0,ss=0+0 | child=1 frozen=1 dirty=1 authpin=1 0x557c450cc000] count now 2
  3764 2020-04-07T20:36:35.650-0400 7f00e5530700 10 mds.0.cache.dir(0x10000000000) _commit want 304 on [dir 0x10000000000 /subdir/ [2,head] auth v=304 cv=0/0 ap=2+0 state=1610878985|complete|frozendir|fragmenting f(v0 100=100+0)/f(v0 m2020-04-07T20:35:35.443058-0400 100=100+0) n(v0 rc2020-04-07T20:35:35.448479-0400 b188 99=99+0) hs=100+0,ss=0+0 | child=1 frozen=1 dirty=1 authpin=1 0x557c450cc000]
  3765 2020-04-07T20:36:35.650-0400 7f00e5530700 10 mds.0.cache.dir(0x10000000000) marking committing
  3766 2020-04-07T20:36:35.650-0400 7f00e5530700 10 mds.0.cache.dir(0x10000000000) _omap_commit

So after the dirfrag merge, the MDS daemon will crash as soon as predirty_journal_parents() is called, regardless of which code path reaches it.
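To make the failing check concrete, here is a simplified model (plain Python rather than Ceph's C++; the function name is illustrative, not from the source tree) of the invariant that mds_verify_scatter enforces in predirty_journal_parents(): for a directory with a single dirfrag, the inode's recursive stats must equal the dirfrag's stats plus the directory inode itself.

```python
# Simplified model of the single-dirfrag rstat consistency check.
# Each rstat is modeled as a tuple (rbytes, rfiles, rsubdirs); in the
# logs this appears as n(... b<rbytes> <total>=<rfiles>+<rsubdirs>).

def rstats_match(inode_rstat, frag_rstat):
    """The directory inode counts itself as one extra subdir on top
    of the stats aggregated from its single dirfrag."""
    i_bytes, i_files, i_subdirs = inode_rstat
    f_bytes, f_files, f_subdirs = frag_rstat
    return (i_bytes, i_files, i_subdirs) == (f_bytes, f_files, f_subdirs + 1)

# Values from the crash log: the inode has n(... b190 101=100+1) while
# the dirfrag has n(... b188 99=99+0), so the check fails and the
# assert fires.
print(rstats_match((190, 100, 1), (188, 99, 0)))   # False
# With the expected dirfrag rstat b190 100=100+0 it would hold:
print(rstats_match((190, 100, 1), (190, 100, 0)))  # True
```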

#10 Updated by Xiubo Li 6 months ago

The root cause is that the cephfs-data-scan tool did not update InodeStore->inode->rstat, so when the MDS daemon loads the dentry for "subdir/21", pi->rstat is empty and "subdir/" never gets a chance to account for the size and file count of "subdir/21".
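A small sketch (hypothetical Python, not the actual tool code) of how one empty rstat propagates into the mismatch seen above: a dirfrag's rstat is the sum of its dentries' rstats, so a recovered file whose rstat was left at zero simply vanishes from the parent's accounting.

```python
from dataclasses import dataclass

@dataclass
class Rstat:
    rbytes: int = 0   # recursive byte count
    rfiles: int = 0   # recursive file count

def frag_rstat(child_rstats):
    """Aggregate the rstats of all dentries in a dirfrag."""
    total = Rstat()
    for c in child_rstats:
        total.rbytes += c.rbytes
        total.rfiles += c.rfiles
    return total

intact = Rstat(rbytes=188, rfiles=99)  # aggregate of the 99 intact entries
broken = Rstat()                       # "subdir/21": rstat left empty by data-scan
fixed = Rstat(rbytes=2, rfiles=1)      # what the fix fills in (file size s=2)

print(frag_rstat([intact, broken]))  # b188, 99 files -> mismatches the inode
print(frag_rstat([intact, fixed]))   # b190, 100 files -> consistent
```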

#11 Updated by Xiubo Li 6 months ago

The fix PR: https://github.com/ceph/ceph/pull/34410

It adds accounted_rstat/rstat when building a file dentry in the data-scan tool and sets dnfirst=2 by default, which also fixes another MDS daemon crash.

#12 Updated by Xiubo Li 6 months ago

  • Status changed from In Progress to Fix Under Review

#13 Updated by Greg Farnum 5 months ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to octopus

#14 Updated by Nathan Cutler 5 months ago

  • Copied to Backport #45220: octopus: mds: MDCache.cc: 2335: FAILED ceph_assert(!"unmatched rstat rbytes" == g_conf()->mds_verify_scatter) added

#15 Updated by Nathan Cutler 5 months ago

  • Pull request ID set to 34410

#16 Updated by Patrick Donnelly 5 months ago

  • Backport changed from octopus to octopus,nautilus

#17 Updated by Nathan Cutler 4 months ago

  • Copied to Backport #45497: nautilus: mds: MDCache.cc: 2335: FAILED ceph_assert(!"unmatched rstat rbytes" == g_conf()->mds_verify_scatter) added

#18 Updated by Nathan Cutler 4 months ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
