Bug #47833
mds FAILED ceph_assert(sessions != 0) in function 'void SessionMap::hit_session(Session*)'
Description
We are not able to decrease from max_mds=2 to 1 on our cephfs cluster.
As soon as we decrease max_mds, the mds goes to up:stopping and the migrator starts exporting a few dirs, and then we hit this assert:
2020-10-12 15:53:18.353 7ffae5ee4700 -1 /builddir/build/BUILD/ceph-14.2.11/src/mds/SessionMap.cc: In function 'void SessionMap::hit_session(Session*)' thread 7ffae5ee4700 time 2020-10-12 15:53:18.349619
/builddir/build/BUILD/ceph-14.2.11/src/mds/SessionMap.cc: 1019: FAILED ceph_assert(sessions != 0)

 ceph version 14.2.11 (f7fdb2f52131f54b891a2ec99d8205561242cdaf) nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x7ffaef177025]
 2: (()+0x25c1ed) [0x7ffaef1771ed]
 3: (()+0x3c2bfa) [0x559ca2efebfa]
 4: (Server::reply_client_request(boost::intrusive_ptr<MDRequestImpl>&, boost::intrusive_ptr<MClientReply> const&)+0xb45) [0x559ca2ce3035]
 5: (Server::respond_to_request(boost::intrusive_ptr<MDRequestImpl>&, int)+0x1c9) [0x559ca2ce32e9]
 6: (MDSContext::complete(int)+0x74) [0x559ca2f0a054]
 7: (MDCache::_do_find_ino_peer(MDCache::find_ino_peer_info_t&)+0x516) [0x559ca2d753d6]
 8: (MDCache::handle_find_ino_reply(boost::intrusive_ptr<MMDSFindInoReply const> const&)+0x502) [0x559ca2d810f2]
 9: (MDCache::dispatch(boost::intrusive_ptr<Message const> const&)+0x1a7) [0x559ca2db6b07]
 10: (MDSRank::handle_deferrable_message(boost::intrusive_ptr<Message const> const&)+0x48a) [0x559ca2c9692a]
 11: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x7ea) [0x559ca2c98fda]
 12: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0x41) [0x559ca2c99441]
 13: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x108) [0x559ca2c86508]
 14: (DispatchQueue::entry()+0x1699) [0x7ffaef397d69]
 15: (DispatchQueue::DispatchThread::entry()+0xd) [0x7ffaef4451ed]
 16: (()+0x7ea5) [0x7ffaed02dea5]
 17: (clone()+0x6d) [0x7ffaebcdb8dd]
The standby gets the same assert.
The full mds log is at ceph-post-file: 12f9e692-2727-4311-944f-9aa7b8da4499
And the log of the standby mds is at ceph-post-file: 0037c9fd-de03-4018-8433-455f1c94c456
We set max_mds back to 2 and the crashes stop.
Maybe relevant: on this cluster, we have directories manually pinned to rank 0 and 1.
History
#1 Updated by Dan van der Ster over 2 years ago
Here is a log with debug_mds=10. ceph-post-file: f4f87969-d492-4e1d-8e8e-5c9e81e45d2f
From what I can gather (and I might be wrong), this happens when a client does a getattr with a stale file handle:
   -42> 2020-10-12 16:22:54.848 7faf9f4df700  7 mds.1.server reply_client_request -116 ((116) Stale file handle) client_request(client.1247869482:6492 getattr As #0x10041167375 2020-10-06 17:09:53.325431 RETRY=229 caller_uid=0, caller_gid=0{}) v4
   -41> 2020-10-12 16:22:54.848 7faf9f4df700 10 mds.1.server apply_allocated_inos 0x0 / [] / 0x0
    -2> 2020-10-12 16:22:54.851 7faf9f4df700 -1 /builddir/build/BUILD/ceph-14.2.11/src/mds/SessionMap.cc: In function 'void SessionMap::hit_session(Session*)' thread 7faf9f4df700 time 2020-10-12 16:22:54.849676
     0> 2020-10-12 16:22:54.854 7faf9f4df700 -1 *** Caught signal (Aborted) ** in thread 7faf9f4df700 thread_name:ms_dispatch
I am going to evict those clients with stale file handles and then try again.
#2 Updated by Dan van der Ster over 2 years ago
Indeed, I evicted the weird clients spinning on Stale file handles, and then the mds stopping procedure finished without crashing.
For completeness, here is one such client that can trigger this:
{ "id": 1223433359, "entity": { "name": { "type": "client", "num": 1223433359 }, "addr": { "type": "any", "addr": "redacted:0", "nonce": 1813269304 } }, "state": "open", "num_leases": 0, "num_caps": 1, "request_load_avg": 14545, "uptime": 16737.925794727998, "requests_in_flight": 0, "completed_requests": 0, "reconnecting": false, "recall_caps": { "value": 0, "halflife": 60 }, "release_caps": { "value": 0, "halflife": 60 }, "recall_caps_throttle": { "value": 0, "halflife": 2.5 }, "recall_caps_throttle2o": { "value": 0, "halflife": 0.5 }, "session_cache_liveness": { "value": 0, "halflife": 300 }, "inst": "client.1223433359 redacted:0/1813269304", "completed_requests": [], "prealloc_inos": [], "used_inos": [], "client_metadata": { "features": "0x0000000000001fff", "ceph_sha1": "75f4de193b3ea58512f204623e6c5a16e6c1e1ba", "ceph_version": "ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)", "entity_id": "pvc-990e69c2-518d-4974-8867-4d6ad8ddf387", "hostname": "kubeflow-ml-001-redacted-node-1.cern.ch.ch", "mount_point": "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-990e69c2-518d-4974-8867-4d6ad8ddf387/globalmount", "pid": "891", "root": "/volumes/_nogroup/af09d1d5-6854-4156-8d47-3fd0d3aed30b" } },
#3 Updated by Patrick Donnelly over 2 years ago
Dan van der Ster wrote:
Indeed, I evicted the weird clients spinning on Stale file handles, and then the mds stopping procedure finished without crashing.
For completeness, here is one such client that can trigger this:
[...]
Hi Dan, is that session dump from rank 0? It must be that rank 1 did not have a session with that client for that assertion to trip.
#4 Updated by Dan van der Ster over 2 years ago
Patrick Donnelly wrote:
Dan van der Ster wrote:
Indeed, I evicted the weird clients spinning on Stale file handles, and then the mds stopping procedure finished without crashing.
For completeness, here is one such client that can trigger this:
[...]

Hi Dan, is that session dump from rank 0? It must be that rank 1 did not have a session with that client for that assertion to trip.
Hi Patrick, I'm not sure, because this was the output of a tell mds.* command. But I'll check next time I trigger this. (I can reproduce the stuck clients by failing over to a standby mds.)
#5 Updated by Dan van der Ster over 2 years ago
I have the coredump so we can debug further. In the hit_session frame, we see the session clearly:
(gdb) up
#8  0x000055f33658dbfa in SessionMap::hit_session (this=0x55f338b5f150, session=session@entry=0x55f338c3b000)
(gdb) p session->state
$3 = 3
(gdb) p Session::STATE_CLOSING+0
$4 = 3
(gdb) p session->info
$5 = {inst = {name = {_type = 8 '\b', _num = 1247869482, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static TYPE_MGR = 16, static NEW = -1}, addr = {static TYPE_DEFAULT = entity_addr_t::TYPE_MSGR2, type = 3, nonce = 1807280899, u = {sa = {sa_family = 2, sa_data = "\000\000\211\212\067'\000\000\000\000\000\000\000"}, sin = {sin_family = 2, sin_port = 0, sin_addr = {s_addr = 657951369}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 0, sin6_flowinfo = 657951369, sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}}}, completed_requests = std::map with 0 elements, prealloc_inos = {_size = 0, m = std::map with 0 elements}, used_inos = {_size = 0, Python Exception <class 'gdb.error'> There is no member or method named _M_value_field.: m = std::map with 0 elements}, client_metadata = {kv_map = std::map with 7 elements, features = {static bits_per_block = 64, _vec = std::vector of length 1, capacity 1 = {8191}}}, completed_flushes = std::set with 0 elements, auth_name = {static STR_TO_ENTITY_TYPE = {_M_elems = {{type = 32, str = 0x7fafa8fff606 "auth"}, {type = 1, str = 0x7fafa8fde9a7 "mon"}, {type = 4, str = 0x7fafa9021bf8 "osd"}, {type = 2, str = 0x7fafa90344ff "mds"}, {type = 16, str = 0x7fafa901bf7e "mgr"}, {type = 8, str = 0x7fafa901bfcb "client"}}}, type = 8, id = "pvc-fb998d31-d9c4-462d-a7a4-6f768458cd56", type_id = "client.pvc-fb998d31-d9c4-462d-a7a4-6f768458cd56"}}
So it looks like we are in hit_session while the session is in Session::STATE_CLOSING, which means the OPEN + STALE count checked by the assert can be zero.
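To make that concrete, here is a stand-alone toy sketch (illustrative types only, not the Ceph sources) of how the OPEN + STALE count can be zero while we are still replying to a request on a session that is already closing:

// Toy model, not Ceph code: shows why ceph_assert(sessions != 0) can fire
// when the only session left on a stopping rank is already CLOSING.
#include <cassert>
#include <cstdint>
#include <map>

enum SessionState { STATE_OPEN, STATE_STALE, STATE_CLOSING };

struct ToySessionMap {
  std::map<SessionState, uint64_t> count;  // sessions per state

  uint64_t get_session_count_in_state(SessionState s) const {
    auto it = count.find(s);
    return it == count.end() ? 0 : it->second;
  }

  // pre-fix accounting: CLOSING sessions are not counted
  void hit_session() const {
    uint64_t sessions = get_session_count_in_state(STATE_OPEN) +
                        get_session_count_in_state(STATE_STALE);
    assert(sessions != 0);  // trips during rank stop
  }
};

int main() {
  ToySessionMap sm;
  // While the rank is up:stopping, its last client session can already be
  // CLOSING when a retried request (the stale-fh getattr) gets its reply.
  sm.count[STATE_CLOSING] = 1;
  sm.hit_session();  // aborts: OPEN + STALE == 0
}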
Does the diff below fix it, or would it be better to check the session state before calling hit_session? (A sketch of that alternative follows the diff.)
diff --git a/src/mds/SessionMap.cc b/src/mds/SessionMap.cc
index ae80cef..18bfc5c 100644
--- a/src/mds/SessionMap.cc
+++ b/src/mds/SessionMap.cc
@@ -1015,7 +1015,8 @@ int Session::check_access(CInode *in, unsigned mask,
 // track total and per session load
 void SessionMap::hit_session(Session *session) {
   uint64_t sessions = get_session_count_in_state(Session::STATE_OPEN) +
-                      get_session_count_in_state(Session::STATE_STALE);
+                      get_session_count_in_state(Session::STATE_STALE) +
+                      get_session_count_in_state(Session::STATE_CLOSING);
   ceph_assert(sessions != 0);
 
   double total_load = total_load_avg.hit();
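For comparison, here is a rough sketch of the alternative mentioned above, i.e. guarding the call site instead of widening the count. The types below are stand-ins, and the is_open()/is_stale() predicates are assumed for illustration; per the backtrace, the caller here is Server::reply_client_request. Either way a session in STATE_CLOSING would no longer trip the assert.

// Sketch only (stand-in types, not the Ceph sources): skip load accounting
// for sessions that hit_session() does not count.
struct Session {
  enum State { STATE_OPEN, STATE_STALE, STATE_CLOSING };
  State state = STATE_OPEN;
  bool is_open() const  { return state == STATE_OPEN; }
  bool is_stale() const { return state == STATE_STALE; }
};

struct SessionMap {
  void hit_session(Session *session) {
    (void)session;  // unchanged in this alternative: still asserts OPEN + STALE != 0
  }
};

// Hypothetical call-site guard: closing sessions simply stop contributing
// to the per-session load average.
void hit_session_if_counted(SessionMap &sm, Session *s) {
  if (s->is_open() || s->is_stale())
    sm.hit_session(s);
}

int main() {
  SessionMap sm;
  Session closing;
  closing.state = Session::STATE_CLOSING;
  hit_session_if_counted(sm, &closing);  // no-op: session is not counted
}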
#6 Updated by Patrick Donnelly over 2 years ago
That patch looks correct. Would you like to post the PR, Dan?
#7 Updated by Dan van der Ster over 2 years ago
- Backport set to nautilus, octopus
- Pull request ID set to 37646
#8 Updated by Dan van der Ster over 2 years ago
I tested the fix in the same env, with the stale fh clients, and now the mds does not crash while stopping.
#9 Updated by Nathan Cutler over 2 years ago
- Status changed from New to Fix Under Review
#10 Updated by Patrick Donnelly over 2 years ago
- Assignee set to Dan van der Ster
- Target version set to v16.0.0
- Source set to Community (dev)
- Backport changed from nautilus, octopus to nautilus,octopus
- Component(FS) MDS added
#11 Updated by Patrick Donnelly over 2 years ago
- Status changed from Fix Under Review to Pending Backport
#12 Updated by Nathan Cutler over 2 years ago
- Copied to Backport #47935: nautilus: mds FAILED ceph_assert(sessions != 0) in function 'void SessionMap::hit_session(Session*)' added
#13 Updated by Nathan Cutler over 2 years ago
- Copied to Backport #47936: octopus: mds FAILED ceph_assert(sessions != 0) in function 'void SessionMap::hit_session(Session*)' added
#14 Updated by Nathan Cutler over 2 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".