Project

General

Profile

Bug #18757

Jewel ceph-fuse does not recover after lost connection to MDS

Added by Henrik Korkuc 10 months ago. Updated 4 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
Start date:
02/01/2017
Due date:
% Done:

0%

Source:
Tags:
Backport:
jewel, kraken
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Release:
Component(FS):
ceph-fuse
Needs Doc:
No

Description

After ceph-fuse loses connection to MDS for few minutes, it does not recover - accessing mountpoint hangs processes.
Replicated using 10.2.3 and 10.2.5 clients. Didn't try on Kraken.

Example setup:
periodically poke mount:
while [ 1 ]; do date; ls -l cephfs/test ; sleep 10; done

then firewall connection to MDS:
iptables -A INPUT -p tcp --dport 6816 -j DROP

after pokes stop wait few minutes (5?). If we wait for too short time, it recovers, if long enough it hangs. Attaching client logs with enabled debug options:
debug_mds = 20/20
debug_mds_balancer = 20/20
debug_mds_log = 20/20
debug_objecter = 20/20
debug_rados = 20/20
debug_client = 20/20
debug_ms = 20/20
debug_fuse = 20/20

and a thread backtrace.

gdb.txt.gz (2.96 KB) Henrik Korkuc, 02/01/2017 12:23 PM

ceph-client.cephfs.log.gz (27.6 KB) Henrik Korkuc, 02/01/2017 12:24 PM

ceph-client.cephfs.log View (586 KB) Henrik Korkuc, 02/02/2017 08:24 AM

ceph-mds.henrik-eu2.log.gz (97 KB) Henrik Korkuc, 02/02/2017 08:25 AM


Related issues

Copied to fs - Backport #19677: jewel: Jewel ceph-fuse does not recover after lost connection to MDS Resolved
Copied to fs - Backport #19678: kraken: Jewel ceph-fuse does not recover after lost connection to MDS Resolved

History

#1 Updated by John Spray 10 months ago

  • Status changed from New to Rejected

Clients which lose connectivity to the MDS are evicted after a timeout given by the "mds session timeout" setting. Evicting a client may cause the client to get stuck, as it can't necessarily get back into a healthy state.

Presumably you are reproducing this artificially because you encountered it in practice somewhere; the solution in practice is to either increase that interval, or fix whatever is causing the loss of contact to begin with.

There are changes coming in the future that will make this functionality a bit friendlier to make the client cope more gracefully with being evicted, only evict clients when they're holding up other clients.

#2 Updated by Henrik Korkuc 10 months ago

Yes, I am reproducing it artificially after short network blip caused permanent mount point hangs of multiple servers, sending load avg of these servers to the sky by periodic mountpoint stats (monitoring system).

If ceph-fuse cannot handle session reset and only way to recover is to manually SIGKILL ceph-fuse and remount, why doesn't ceph-fuse just kill itself? In my opinion erroring out a user would be much more friendlier than just hanging it permanently.

#3 Updated by John Spray 10 months ago

  • Status changed from Rejected to New

Hmm, now that I actually read the log (like a reasonable person :-)) it is a little bit strange that the server is sending the client client_session(stale) messages (not close messages) but then apparently dropping the client's getattrs.

Could you upload the MDS log from the same period, and mark the times at which the connection was interrupted and then restored?

#4 Updated by Henrik Korkuc 10 months ago

Attaching client and MDS logs. This time I was mounting from another server and firewalling both input and output to make sure everything is isolated. Final outcome didn't change

#5 Updated by Henrik Korkuc 10 months ago

forgot to add a timeline:
08:08:29 mounted
08:09:54 iptables up
08:10:50 ls stuck
08:16:02 iptables down
08:17:56 tried "ls -f"

#6 Updated by Henrik Korkuc 10 months ago

Comparing logs I noticed that MDS clock is ~30s behind client. ntpd was dead on one of test servers... Will try to compensate in notes

~08:10:56 MDS decides client is stale:

2017-02-02 08:10:26.538843 7f3b0042f700 10 mds.0.server new stale session client.4151 10.194.0.100:0/4006638222 last 2017-02-02 08:09:21.880119

~08:14:56 client session is closed:

2017-02-02 08:14:26.545466 7f3b0042f700  0 log_channel(cluster) log [INF] : closing stale session client.4151 10.194.0.100:0/4006638222 after 304.665332
2017-02-02 08:14:26.545476 7f3b0042f700 10 mds.0.server autoclosing stale session client.4151 10.194.0.100:0/4006638222 last 2017-02-02 08:09:21.880119

~08:16:10 new session is created, resetsession sent:

2017-02-02 08:15:41.931194 7f3afdb29700 10 mds.client.cephfs  new session 0x55c844270680 for client.4151 10.194.0.100:0/4006638222 con 0x55c8445b8180
<..>
2017-02-02 08:15:41.931248 7f3afdb29700  0 -- 10.194.0.189:6816/31695 >> 10.194.0.100:0/4006638222 pipe(0x55c84441e800 sd=18 :6816 s=0 pgs=0 cs=0 l=0 c=0x55c8445b8180).accept we re
set (peer sent cseq 2), sending RESETSESSION

client get's reset session, marks it's session as stale.

2017-02-02 08:16:10.810857 7f8cc4ff9700  0 client.4151 ms_handle_remote_reset on 10.194.0.189:6816/31695
2017-02-02 08:16:10.810869 7f8cc4ff9700  1 client.4151 reset from mds we were open; mark session as stale

08:16:31 client requests for caps, get's ignored because it's session is closed. Happens multiple times.
client:

2017-02-02 08:16:31.016380 7f8ccc5ef700 10 -- 10.194.0.100:0/4006638222 >> 10.194.0.189:6816/31695 pipe(0x5616c998ca10 sd=0 :33785 s=2 pgs=7 cs=1 l=0 c=0x5616c998ad60).reader got ack seq 742216328 >= 742216328 on 0x7f8c9c018040 client_session(request_renewcaps seq 26) v1

MDS:
2017-02-02 08:16:01.932965 7f3b02d35700  3 mds.0.server handle_client_session client_session(request_renewcaps seq 26) v1 from client.4151
2017-02-02 08:16:01.932969 7f3b02d35700 10 mds.0.server ignoring renewcaps on non open|stale session (closed)

Looking at src/client/Client.cc I think that ms_handle_remote_reset() "MetaSession::STATE_OPEN" case of state switch should reopen session instead of marking it stale. I could try making a diff moving case to be together with MetaSession::STATE_OPENING as it looks like it does the same. What do you think? Would session close/open have some other side effects for fuse client in open state?

#7 Updated by Henrik Korkuc 9 months ago

I created https://github.com/ceph/ceph/pull/13522

This resolves hang and allows work with mountpoint in this test case. I am just not sure if it is proper way to do a reconnect and if there are any additional side effects.

#8 Updated by Zheng Yan 9 months ago

you can use 'ceph daemon client.xxx kick_stale_sessions' to recover this issue. Maybe we should add config option to decide if automatic reconnect is desired.

#9 Updated by Henrik Korkuc 9 months ago

I updated PR to do _closed_mds_session(s).

As for config option, I would expect client to reconnect automagically after connection loss (it is self-healing after all).

On the other hand, I can do config option too if it is considered a better way to handle this change.

#10 Updated by John Spray 7 months ago

  • Status changed from New to Pending Backport
  • Backport set to jewel, kraken

Let's backport this for the benefit of people running cephfs today

#11 Updated by Nathan Cutler 7 months ago

  • Copied to Backport #19677: jewel: Jewel ceph-fuse does not recover after lost connection to MDS added

#12 Updated by Nathan Cutler 7 months ago

  • Copied to Backport #19678: kraken: Jewel ceph-fuse does not recover after lost connection to MDS added

#13 Updated by Nathan Cutler 4 months ago

  • Status changed from Pending Backport to Resolved

Also available in: Atom PDF