Bug #65606

workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))

Added by Leonid Usov 12 days ago. Updated 2 days ago.

Status: Pending Backport
Priority: High
Assignee:
Category: -
Target version:
% Done: 0%
Source: Q/A
Tags: backport_processed
Backport: squid
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS, quiesce
Labels (FS): qa-failure
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

https://pulpito.ceph.com/leonidus-2024-04-21_11:37:13-fs-wip-lusov-quiescer-distro-default-smithi/7666598/

The first instance of a slow op reported by teuthology:

2024-04-21T12:18:20.317 INFO:journalctl@ceph.mon.c.smithi129.stdout:Apr 21 12:18:20 smithi129 ceph-mon[31790]: 1 slow requests, 1 included below; oldest blocked for > 199.712049 secs
2024-04-21T12:18:20.318 INFO:journalctl@ceph.mon.c.smithi129.stdout:Apr 21 12:18:20 smithi129 ceph-mon[31790]: slow request 199.712049 seconds old, received at 2024-04-21T12:15:00.098184+0000: peer_request:mds.0:1981 currently dispatched
2024-04-21T12:18:20.348 INFO:journalctl@ceph.mon.b.smithi073.stdout:Apr 21 12:18:20 smithi073 ceph-mon[31345]: 1 slow requests, 1 included below; oldest blocked for > 199.712049 secs
2024-04-21T12:18:20.348 INFO:journalctl@ceph.mon.b.smithi073.stdout:Apr 21 12:18:20 smithi073 ceph-mon[31345]: slow request 199.712049 seconds old, received at 2024-04-21T12:15:00.098184+0000: peer_request:mds.0:1981 currently dispatched
2024-04-21T12:18:20.443 INFO:journalctl@ceph.mon.a.smithi046.stdout:Apr 21 12:18:20 smithi046 ceph-mon[27983]: 1 slow requests, 1 included below; oldest blocked for > 199.712049 secs
2024-04-21T12:18:20.444 INFO:journalctl@ceph.mon.a.smithi046.stdout:Apr 21 12:18:20 smithi046 ceph-mon[27983]: slow request 199.712049 seconds old, received at 2024-04-21T12:15:00.098184+0000: peer_request:mds.0:1981 currently dispatched

leonidus@teuthology:/a/leonidus-2024-04-21_11:37:13-fs-wip-lusov-quiescer-distro-default-smithi/7666598
$ zgrep slow remote/smithi073/log/88391cde-ffd5-11ee-bc93-c7b262605968/ceph-mds.b.log.gz
2024-04-21T12:18:24.468+0000 7fa9f195d700  0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 210.002867 secs
2024-04-21T12:18:24.468+0000 7fa9f195d700  0 log_channel(cluster) log [WRN] : slow request 210.002867 seconds old, received at 2024-04-21T12:14:54.467714+0000: internal op exportdir:mds.0:1981 currently failed to acquire quiesce lock

The same log shows traces of a crash:

2024-04-21T12:18:40.865+0000 7fa9ec953700 -1 /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/19.0.0-3154-ga630765c/rpm/el8/BUILD/ceph-19.0.0-3154-ga630765c/src/mds/Locker.cc: In function 'void Locker::local_wrlock_finish(const lock_iterator&, MutationImpl*)' thread 7fa9ec953700 time 2024-04-21T12:18:40.864289+0000
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/19.0.0-3154-ga630765c/rpm/el8/BUILD/ceph-19.0.0-3154-ga630765c/src/mds/Locker.cc: 5513: FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))

 ceph version 19.0.0-3154-ga630765c (a630765c1f93c6e61d9b40c4b76437611d5ef29e) squid (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x135) [0x7fa9fd7893dd]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x2d15a3) [0x7fa9fd7895a3]
 3: (Locker::local_wrlock_finish(std::_Rb_tree_const_iterator<MutationImpl::LockOp> const&, MutationImpl*)+0x7b4) [0x55cad974fec4]
 4: (Locker::_drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*, bool)+0x1ca) [0x55cad977063a]
 5: (Locker::drop_non_rdlocks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x5f) [0x55cad97764cf]
 6: (Server::reply_client_request(boost::intrusive_ptr<MDRequestImpl> const&, boost::intrusive_ptr<MClientReply> const&)+0x4af) [0x55cad95c3a5f]
 7: (Server::respond_to_request(boost::intrusive_ptr<MDRequestImpl> const&, int)+0x3b8) [0x55cad95c5458]
 8: (Server::_link_remote_finish(boost::intrusive_ptr<MDRequestImpl> const&, bool, CDentry*, CInode*, unsigned long)+0x165) [0x55cad95c87f5]
 9: (MDSContext::complete(int)+0x5f) [0x55cad98b05ff]
 10: (MDSIOContextBase::complete(int)+0x3b4) [0x55cad98b0b04]
 11: (MDSLogContextBase::complete(int)+0x5b) [0x55cad98b0cfb]
 12: (Finisher::finisher_thread_entry()+0x19d) [0x7fa9fd829d1d]
 13: /lib64/libpthread.so.0(+0x81ca) [0x7fa9fc5071ca]
 14: clone()

Related issues 1 (1 open, 0 closed)

Copied to CephFS - Backport #65710: squid: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK)) (Fix Under Review, Leonid Usov)
Actions #1

Updated by Leonid Usov 12 days ago · Edited

We had a successful quiesce on mds.0, followed by the export dir request in question. The export dir request failed to acquire the quiesce lock, as expected, but then it somehow got stuck:

2024-04-21T12:14:58.973+0000 7fa9ec953700 20 mds.0.cache dispatch_quiesce_inode request(mds.0:1992 nref=3) quiescing [inode 0x10000000007 [...12,head] /volumes/qa/sv_1/567795c4-3420-4752-a6f2-111d0eb520d0/postgres/data/pg_wal/ rep@1.1 fragtree_t(*^3) v4180 f(v21 m2024-04-21T12:13:33.779632+0000 65=64+1) n(v31 rc2024-04-21T12:14:06.122296+0000 b1073741824 66=64+2)/n(v25 rc2024-04-21T12:13:33.779632+0000 b1073741824 66=64+2) old_inodes=11 (inest mix dirty) (ifile mix) (iquiesce lock x=1 by request(mds.0:1992 nref=4)) caps={24989=pAsLsXs/-@1} | dirtyscattered=1 request=1 lock=1 importing=0 dirfrag=1 caps=1 waiter=0 0x55cadcdecc00]
2024-04-21T12:14:58.973+0000 7fa9ec953700 10 mds.0.cache dispatch_quiesce_inode request(mds.0:1992 nref=3) non-auth quiesce complete of [inode 0x10000000007 [...12,head] /volumes/qa/sv_1/567795c4-3420-4752-a6f2-111d0eb520d0/postgres/data/pg_wal/ rep@1.1 fragtree_t(*^3) v4180 f(v21 m2024-04-21T12:13:33.779632+0000 65=64+1) n(v31 rc2024-04-21T12:14:06.122296+0000 b1073741824 66=64+2)/n(v25 rc2024-04-21T12:13:33.779632+0000 b1073741824 66=64+2) old_inodes=11 (inest mix dirty) (ifile mix) (iquiesce lock x=1 by request(mds.0:1992 nref=4)) caps={24989=pAsLsXs/-@1} | dirtyscattered=1 request=1 lock=1 importing=0 dirfrag=1 caps=1 waiter=0 0x55cadcdecc00]
...
2024-04-21T12:15:00.097+0000 7fa9ee156700  7 mds.0.mig dispatch_export_dir request(mds.0:1981 nref=3) [dir 0x10000000007.110* /volumes/qa/sv_1/567795c4-3420-4752-a6f2-111d0eb520d0/postgres/data/pg_wal/ [2,head] auth{1=1} v=3089 cv=3083/3083 dir_auth=0 ap=1+13 state=1610875905|complete|exporting f(v21 m2024-04-21T12:13:33.672633+0000 8=7+1) n(v31 rc2024-04-21T12:14:45.426888+0000 b117440512 8=7+1)/n(v31 rc2024-04-21T12:13:57.056390+0000 b117440512 8=7+1) hs=8+8,ss=16+0 dirty=3 | ptrwaiter=0 request=1 child=1 frozen=0 subtree=1 importing=0 replicated=1 dirty=1 authpin=1 0x55cadd716000]
...
2024-04-21T12:15:00.097+0000 7fa9f3160700  7 mds.0.locker local_wrlock_start  on (iquiesce lock x=1 by request(mds.0:1992 nref=3)) on [inode 0x10000000007 [...12,head] /volumes/qa/sv_1/567795c4-3420-4752-a6f2-111d0eb520d0/postgres/data/pg_wal/ rep@1.1 fragtree_t(*^3) v4198 f(v21 m2024-04-21T12:13:33.779632+0000 65=64+1) n(v31 rc2024-04-21T12:14:06.122296+0000 b1073741824 66=64+2)/n(v25 rc2024-04-21T12:13:33.779632+0000 b1073741824 66=64+2) old_inodes=11 (inest mix dirty) (ifile mix) (iquiesce lock x=1 by request(mds.0:1992 nref=3)) caps={24989=pAsLsXs/-@1} | dirtyscattered=1 request=2 lock=1 importing=0 dirfrag=1 caps=1 waiter=0 0x55cadcdecc00]

The quiesce in question finished after 15 seconds:

2024-04-21T12:15:32.654+0000 7fa9ea14e700 20 quiesce.mgr.24545 <leader_upkeep_db> updating peer 24460 with 1 sets modified in db version range ((56:42)..43]
2024-04-21T12:15:32.654+0000 7fa9ea14e700 10 quiesce.mds.0 <operator()> sending listing q-db[v:(56:43) sets:1/0] to the peer 24460
2024-04-21T12:15:32.654+0000 7fa9ea14e700  1 -- [v2:172.21.15.73:6838/2448738084,v1:172.21.15.73:6839/2448738084] send_to--> mds [v2:172.21.15.129:6836/3029675336,v1:172.21.15.129:6837/3029675336] --  -- ?+0 0x55cadd5aea80
2024-04-21T12:15:32.654+0000 7fa9ea94f700 20 quiesce.agt <agent_thread_main> old = (56:42), current = (56:43)
2024-04-21T12:15:32.654+0000 7fa9ea14e700  1 -- [v2:172.21.15.73:6838/2448738084,v1:172.21.15.73:6839/2448738084] --> [v2:172.21.15.129:6836/3029675336,v1:172.21.15.129:6837/3029675336] --  -- 0x55cadd5aea80 con 0x55cadc8bbc00
2024-04-21T12:15:32.654+0000 7fa9ea94f700 10 quiesce.agt <agent_thread_main> Calling `cancel` on `file:/` with handle <mds.0:1982>
2024-04-21T12:15:32.655+0000 7fa9ea94f700 20 mds.0.cache killing quiesce op request(mds.0:1992 nref=3)
2024-04-21T12:15:32.655+0000 7fa9ea94f700 10 mds.0.cache request_kill request(mds.0:1992 nref=3)
2024-04-21T12:15:32.655+0000 7fa9ea94f700 15 mds.0.cache request_cleanup request(mds.0:1992 nref=3)
2024-04-21T12:15:32.655+0000 7fa9ea94f700  1 -- [v2:172.21.15.73:6838/2448738084,v1:172.21.15.73:6839/2448738084] send_to--> mds [v2:172.21.15.129:6836/3029675336,v1:172.21.15.129:6837/3029675336] -- peer_request(mds.0:1992.0 finish) -- ?+0 0x55cadd1b1180
2024-04-21T12:15:32.655+0000 7fa9ea94f700  1 -- [v2:172.21.15.73:6838/2448738084,v1:172.21.15.73:6839/2448738084] --> [v2:172.21.15.129:6836/3029675336,v1:172.21.15.129:6837/3029675336] -- peer_request(mds.0:1992.0 finish) -- 0x55cadd1b1180 con 0x55cadc8bbc00
2024-04-21T12:15:32.655+0000 7fa9ea94f700 10 mds.0.cache request_drop_foreign_locks forgetting lock (iquiesce lock x=1 by request(mds.0:1992 nref=4)) on 0x55cadcdecc00

This is suspicious: iquiesce is a local lock, so should it even be considered by request_drop_foreign_locks? If this code erroneously strips the lock from the request's lock list, we may never drop it properly via Locker::local_xlock_finish(), which would explain why the internal op waiting for that lock never resumed.
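
To make the suspected mechanism concrete, below is a minimal standalone model (not Ceph code; every name in it is invented for the sketch) of a held local lock being forgotten during cleanup instead of being finished, leaving its waiter stuck forever:

// Illustrative model only: a request that "forgets" a held local lock
// during cleanup never wakes the ops queued as waiters on that lock.
#include <functional>
#include <iostream>
#include <list>
#include <string>
#include <vector>

struct LocalLock {
    std::string name;
    bool held = false;
    std::vector<std::function<void()>> waiters;  // ops blocked on this lock

    void finish() {  // proper release: clear the hold and wake all waiters
        held = false;
        for (auto &w : waiters) w();
        waiters.clear();
    }
};

struct Request {
    std::string name;
    std::list<LocalLock*> locks;  // locks currently held by this request
};

int main() {
    LocalLock iquiesce{"iquiesce"};
    Request quiesce_op{"mds.0:1992"};

    // the quiesce op takes the local lock
    iquiesce.held = true;
    quiesce_op.locks.push_back(&iquiesce);

    // an exportdir-style op blocks on the same lock and queues a waiter
    iquiesce.waiters.push_back([] { std::cout << "exportdir resumed\n"; });

    // buggy cleanup: remove the lock from the request's list without
    // finishing it -- this models the "forgetting lock" line in the log
    quiesce_op.locks.remove(&iquiesce);

    // the lock stays held and the waiter is never woken
    std::cout << "held=" << iquiesce.held
              << " waiters=" << iquiesce.waiters.size() << "\n";
    // correct cleanup would have called iquiesce.finish() instead
}

With the buggy cleanup this prints held=1 waiters=1 and "exportdir resumed" never appears, which mirrors the exportdir op staying blocked on the quiesce lock.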

Actions #2

Updated by Leonid Usov 12 days ago · Edited

  • Status changed from New to Fix Under Review
  • Pull request ID set to 57020
Actions #3

Updated by Leonid Usov 12 days ago

The incorrect behavior of request_drop_foreign_locks, which stripped the local quiesce lock from the request, resulted in the crash when a client request later happened to release its wrlock on the quiesce lock of the same inode (see the illustrative model after the log excerpt):

2024-04-21T12:18:40.863+0000 7fa9ec953700  7 mds.0.locker local_wrlock_finish  on (iquiesce lock w=1 last_client=24989) on [inode 0x10000000007 [...1a,head] /volumes/qa/sv_1/567795c4-3420-4752-a6f2-111d0eb520d0/postgres/data/pg_wal/ rep@1.1 fragtree_t(*^3) v4824 f(v21 m2024-04-21T12:13:33.779632+0000 65=64+1) n(v40 rc2024-04-21T12:17:53.624957+0000 b1073741824 66=64+2)/n(v25 rc2024-04-21T12:13:33.779632+0000 b1073741824 66=64+2) old_inodes=11 (isnap sync r=1) (inest mix w=1 dirty) (ifile mix w=1 dirty) (iquiesce lock w=1 last_client=24989) caps={24989=pAsLsXs/pAsLsXsFsxcral@26} | dirtyscattered=2 request=2 lock=4 importing=0 dirfrag=7 caps=1 dirwaiter=0 waiter=1 discoverbase=0 0x55cadcdecc00]
2024-04-21T12:18:40.864+0000 7fa9f3160700  1 -- [v2:172.21.15.73:6838/2448738084,v1:172.21.15.73:6839/2448738084] <== mds.2 v2:172.21.15.129:6838/2581261009 3851 ==== peer_request(client.24989:4542.0 committed) ==== 139+0+0 (crc 0 0 0) 0x55cadd255180 con 0x55cadc7ea800
2024-04-21T12:18:40.865+0000 7fa9ec953700 -1 /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/19.0.0-3154-ga630765c/rpm/el8/BUILD/ceph-19.0.0-3154-ga630765c/src/mds/Locker.cc: In function 'void Locker::local_wrlock_finish(const lock_iterator&, MutationImpl*)' thread 7fa9ec953700 time 2024-04-21T12:18:40.864289+0000
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/19.0.0-3154-ga630765c/rpm/el8/BUILD/ceph-19.0.0-3154-ga630765c/src/mds/Locker.cc: 5513: FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK))
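
Reading the failed assertion literally, it requires that a WAIT_WR waiter on the lock is always accompanied by a WAIT_XLOCK waiter. A leaked, never-finished xlock leaves exactly an orphaned WAIT_WR waiter behind, so the next wrlock release trips the check. A tiny standalone model of that sequence (again, not Ceph code; the types, fields and state transitions are invented and simplified for the sketch):

// Illustrative model of the tripped invariant: a WAIT_WR waiter implies a
// WAIT_XLOCK waiter.
#include <cassert>
#include <iostream>
#include <set>

enum Wait { WAIT_WR, WAIT_XLOCK };

struct LocalLockModel {
    int wrlocks = 0;
    int xlocks = 0;
    std::multiset<Wait> waiters;

    bool is_waiter_for(Wait w) const { return waiters.count(w) > 0; }

    void wrlock_finish() {
        --wrlocks;
        // mirrors the ceph_assert from Locker::local_wrlock_finish
        assert(!is_waiter_for(WAIT_WR) || is_waiter_for(WAIT_XLOCK));
    }
};

int main() {
    LocalLockModel iquiesce;

    iquiesce.xlocks = 1;               // quiesce op xlocks the lock
    iquiesce.waiters.insert(WAIT_WR);  // exportdir queues as a WR waiter

    // the quiesce op is killed and its xlock is forgotten instead of being
    // finished, so the WR waiter is never woken or cleaned up
    iquiesce.xlocks = 0;

    iquiesce.wrlocks = 1;              // a client op takes a wrlock ...
    iquiesce.wrlock_finish();          // ... and releasing it hits the assert
    std::cout << "not reached (with assertions enabled)\n";
}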
Actions #4

Updated by Leonid Usov 12 days ago

  • Assignee set to Leonid Usov
  • Priority changed from Normal to High
  • Backport set to squid
Actions #5

Updated by Leonid Usov 11 days ago · Edited

Fixing the request_drop_foreign_locks method uncovered another crash with the same root cause, this time when dropping the locks:

2024-04-22T06:57:07.321+0000 7f1b41447700 20 mds.0.cache killing quiesce op request(mds.0:2027 nref=3)
2024-04-22T06:57:07.321+0000 7f1b41447700 10 mds.0.cache request_kill request(mds.0:2027 nref=3)
2024-04-22T06:57:07.321+0000 7f1b41447700 15 mds.0.cache request_cleanup request(mds.0:2027 nref=3)
2024-04-22T06:57:07.323+0000 7f1b41447700 -1 /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/19.0.0-3156-g86f7587a/rpm/el8/BUILD/ceph-19.0.0-3156-g86f7587a/src/mds/Locker.cc: In function 'void Locker::_drop_locks(MutationImpl*, std::set<CInode*>*, bool)' thread 7f1b41447700 time 2024-04-22T06:57:07.322796+0000
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/19.0.0-3156-g86f7587a/rpm/el8/BUILD/ceph-19.0.0-3156-g86f7587a/src/mds/Locker.cc: 720: FAILED ceph_assert(lock->get_sm()->can_remote_xlock)

 ceph version 19.0.0-3156-g86f7587a (86f7587a5a09af35f0895e1a2d08527638fae697) squid (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x135) [0x7f1b542813dd]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x2d15a3) [0x7f1b542815a3]
 3: (Locker::_drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*, bool)+0xa2c) [0x55624223ae9c]
 4: (Locker::drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x7f) [0x55624223af9f]
 5: (MDCache::request_cleanup(boost::intrusive_ptr<MDRequestImpl> const&)+0x243) [0x5562421690d3]
 6: (MDCache::request_cleanup(boost::intrusive_ptr<MDRequestImpl> const&)+0x1ac) [0x55624216903c]
 7: /usr/bin/ceph-mds(+0x60dd3f) [0x5562423e8d3f]
 8: (QuiesceAgent::agent_thread_main()+0x1680) [0x556242451560]
 9: /lib64/libpthread.so.0(+0x81ca) [0x7f1b52fff1ca]
 10: clone()

The assumption that any xlock on a non-auth object must be a remote xlock turned out to be incorrect (a standalone illustration follows the excerpt):

  if (it->is_xlock()) {
    if (obj->is_auth()) {
      ...
    } else {
      ceph_assert(lock->get_sm()->can_remote_xlock); // <<
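
A standalone illustration of why the assertion is too strong once local locks can show up here: an xlock held on a non-auth object may still belong to a purely local lock like iquiesce, which can never be a remote xlock. The branch structure and names below are invented for the sketch and are not the actual fix:

// Illustrative model: distinguish local-lock xlocks before assuming that a
// non-auth xlock must be remote. Not the Ceph implementation.
#include <cassert>
#include <iostream>

enum class LockKind { AuthXlock, LocalXlock, RemoteXlock };

bool can_remote_xlock(LockKind k) { return k == LockKind::RemoteXlock; }

void drop_xlock(LockKind kind, bool obj_is_auth) {
    if (obj_is_auth) {
        // finish the xlock locally on the auth object
    } else if (kind == LockKind::LocalXlock) {
        // local locks (e.g. iquiesce) are released locally even on non-auth
        // objects; the original code fell through to the assert below
    } else {
        assert(can_remote_xlock(kind));  // only truly remote xlocks remain
        // release the remote xlock
    }
}

int main() {
    // with the extra branch, a quiesce xlock held on a replica no longer
    // trips the assertion that crashed _drop_locks
    drop_xlock(LockKind::LocalXlock, /*obj_is_auth=*/false);
    std::cout << "ok\n";
}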
Actions #6

Updated by Leonid Usov 11 days ago

Another instance of this issue: https://pulpito.ceph.com/leonidus-2024-04-21_11:37:13-fs-wip-lusov-quiescer-distro-default-smithi/7666596/

2024-04-21T12:09:29.474 INFO:journalctl@ceph.mon.c.smithi188.stdout:Apr 21 12:09:29 smithi188 ceph-mon[31370]: 1 slow requests, 1 included below; oldest blocked for > 184.805394 secs
2024-04-21T12:09:29.474 INFO:journalctl@ceph.mon.c.smithi188.stdout:Apr 21 12:09:29 smithi188 ceph-mon[31370]: slow request 184.805394 seconds old, received at 2024-04-21T12:06:22.675885+0000: client_request(client.24946:39803 create owner_uid=1000, owner_gid=1316 #0x10000000273/file7017 2024-04-21T12:06:17.693293+0000 FWD=1 caller_uid=1000, caller_gid=1316{6,36,1000,1316,}) currently failed to acquire quiesce lock
Actions #7

Updated by Patrick Donnelly 3 days ago

  • Status changed from Fix Under Review to Pending Backport
  • Target version set to v20.0.0
  • Source set to Q/A
  • Component(FS) MDS added
  • Labels (FS) qa-failure added
Actions #8

Updated by Backport Bot 3 days ago

  • Copied to Backport #65710: squid: workload fails due to slow ops, assert in logs mds/Locker.cc: 551 FAILED ceph_assert(!lock->is_waiter_for(SimpleLock::WAIT_WR) || lock->is_waiter_for(SimpleLock::WAIT_XLOCK)) added
Actions #9

Updated by Backport Bot 3 days ago

  • Tags set to backport_processed
Actions #10

Updated by Leonid Usov 2 days ago

  • Component(FS) deleted (MDS)
Actions #11

Updated by Leonid Usov 2 days ago

  • Component(FS) MDS added