Project

General

Profile

Actions

Bug #62579

open

client: evicted warning because client completes unmount before thrashed MDS comes back

Added by Patrick Donnelly 8 months ago. Updated 7 months ago.

Status:
Pending Backport
Priority:
Normal
Category:
-
Target version:
% Done:

0%

Source:
Q/A
Tags:
backport_processed
Backport:
reef,quincy,pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
qa-suite
Labels (FS):
qa, qa-failure
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2023-08-22T17:33:03.517 INFO:teuthology.orchestra.run.smithi008.stdout:2023-08-22T17:31:34.138829+0000 mds.d (mds.0) 1 : cluster [WRN] evicting unresponsive client smithi008:0 (4965), after waiting 46.9909 seconds during MDS startup

From: /teuthology/yuriw-2023-08-22_14:48:56-fs-pacific-release-distro-default-smithi/7376380/teuthology.log

client.0 was evicted, it completed its workunit:

2023-08-22T17:28:38.841 INFO:tasks.workunit:Stopping ['fs/snaps'] on client.0...
...
2023-08-22T17:29:02.854 DEBUG:teuthology.orchestra.run.smithi008:> sudo fusermount -u /home/ubuntu/cephtest/mnt.0

The client log shows:

2023-08-22T17:29:23.960+0000 7fb20d7fa700 10 client.4965 unmounting: trim pass, size was 0+0
2023-08-22T17:29:23.960+0000 7fb20d7fa700 20 client.4965 trim_cache size 0 max 16384
2023-08-22T17:29:23.960+0000 7fb20d7fa700 10 client.4965 unmounting: trim pass, size still 0+0
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 client.4965 1 mds(s) did not respond to session close -- timing out.                                                                             
2023-08-22T17:29:32.920+0000 7fb2218a2200  5 client.4965 _closed_mds_session mds.0 seq 1
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 -- 192.168.0.1:0/2936571564 >> [v2:172.21.15.8:6836/1047878986,v1:172.21.15.8:6837/1047878986] conn(0x7fb21158a040 msgr2=0x7fb20066b240 crc :-1 s=STATE_CONNECTION_ESTABLISHED l=0).mark_down
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 --2- 192.168.0.1:0/2936571564 >> [v2:172.21.15.8:6836/1047878986,v1:172.21.15.8:6837/1047878986] conn(0x7fb21158a040 0x7fb20066b240 crc :-1 s=READY pgs=19 cs=0 l=0 rev1=1 rx=0 tx=0).stop
2023-08-22T17:29:32.920+0000 7fb2218a2200 10 client.4965 remove_session_caps mds.0
2023-08-22T17:29:32.920+0000 7fb2218a2200 10 client.4965 kick_requests_closed for mds.0
2023-08-22T17:29:32.920+0000 7fb2218a2200 10 client.4965 unmount stopping remount finisher
2023-08-22T17:29:32.920+0000 7fb2218a2200  2 client.4965 unmounted.
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 client.4965 shutdown
2023-08-22T17:29:32.920+0000 7fb2218a2200 10 client.4965 shutdown stopping cache invalidator finisher                                                                                         
2023-08-22T17:29:32.920+0000 7fb2218a2200 10 client.4965 shutdown stopping dentry invalidator finisher                                                                                        
2023-08-22T17:29:32.920+0000 7fb2218a2200 10 client.4965 shutdown stopping interrupt finisher
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 -- 192.168.0.1:0/2936571564 >> [v2:172.21.15.8:6808/33075,v1:172.21.15.8:6809/33075] conn(0x7fb200551dd0 msgr2=0x7fb20206ebc0 crc :-1 s=STATE_CONNECTION_ESTABLISHED l=1).mark_down
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 --2- 192.168.0.1:0/2936571564 >> [v2:172.21.15.8:6808/33075,v1:172.21.15.8:6809/33075] conn(0x7fb200551dd0 0x7fb20206ebc0 crc :-1 s=READY pgs=462
cs=0 l=1 rev1=1 rx=0 tx=0).stop
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 -- 192.168.0.1:0/2936571564 >> [v2:172.21.15.8:6810/33076,v1:172.21.15.8:6811/33076] conn(0x7fb200d92af0 msgr2=0x7fb200176a80 crc :-1 s=STATE_CONNECTION_ESTABLISHED l=1).mark_down
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 --2- 192.168.0.1:0/2936571564 >> [v2:172.21.15.8:6810/33076,v1:172.21.15.8:6811/33076] conn(0x7fb200d92af0 0x7fb200176a80 crc :-1 s=READY pgs=408
cs=0 l=1 rev1=1 rx=0 tx=0).stop
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 -- 192.168.0.1:0/2936571564 >> [v2:172.21.15.8:6800/33074,v1:172.21.15.8:6801/33074] conn(0x7fb200514500 msgr2=0x7fb2020766a0 crc :-1 s=STATE_CONNECTION_ESTABLISHED l=1).mark_down
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 --2- 192.168.0.1:0/2936571564 >> [v2:172.21.15.8:6800/33074,v1:172.21.15.8:6801/33074] conn(0x7fb200514500 0x7fb2020766a0 crc :-1 s=READY pgs=395
cs=0 l=1 rev1=1 rx=0 tx=0).stop
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 -- 192.168.0.1:0/2936571564 >> [v2:172.21.15.8:6824/33085,v1:172.21.15.8:6825/33085] conn(0x7fb2102f2040 msgr2=0x7fb2003bcd10 crc :-1 s=STATE_CONNECTION_ESTABLISHED l=1).mark_down
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 --2- 192.168.0.1:0/2936571564 >> [v2:172.21.15.8:6824/33085,v1:172.21.15.8:6825/33085] conn(0x7fb2102f2040 0x7fb2003bcd10 crc :-1 s=READY pgs=567
cs=0 l=1 rev1=1 rx=0 tx=0).stop
2023-08-22T17:29:32.920+0000 7fb21fcd3700  1 -- 192.168.0.1:0/2936571564 reap_dead start
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 -- 192.168.0.1:0/2936571564 >> [v2:172.21.15.163:6800/33318,v1:172.21.15.163:6801/33318] conn(0x7fb21028f890 msgr2=0x7fb2000bc8a0 crc :-1 s=STATE_CONNECTION_ESTABLISHED l=1).mark_down
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 --2- 192.168.0.1:0/2936571564 >> [v2:172.21.15.163:6800/33318,v1:172.21.15.163:6801/33318] conn(0x7fb21028f890 0x7fb2000bc8a0 crc :-1 s=READY pgs=495 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2023-08-22T17:29:32.920+0000 7fb21fcd3700  1 -- 192.168.0.1:0/2936571564 reap_dead start
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 -- 192.168.0.1:0/2936571564 >> [v2:172.21.15.163:6824/33334,v1:172.21.15.163:6825/33334] conn(0x7fb200482330 msgr2=0x7fb20042a8d0 crc :-1 s=STATE_CONNECTION_ESTABLISHED l=1).mark_down
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 --2- 192.168.0.1:0/2936571564 >> [v2:172.21.15.163:6824/33334,v1:172.21.15.163:6825/33334] conn(0x7fb200482330 0x7fb20042a8d0 crc :-1 s=READY pgs=603 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 -- 192.168.0.1:0/2936571564 >> [v2:172.21.15.163:6808/33320,v1:172.21.15.163:6809/33320] conn(0x7fb212ce2600 msgr2=0x7fb2002f4a80 crc :-1 s=STATE_CONNECTION_ESTABLISHED l=1).mark_down
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 --2- 192.168.0.1:0/2936571564 >> [v2:172.21.15.163:6808/33320,v1:172.21.15.163:6809/33320] conn(0x7fb212ce2600 0x7fb2002f4a80 crc :-1 s=READY pgs=375 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 -- 192.168.0.1:0/2936571564 >> [v2:172.21.15.163:6816/33319,v1:172.21.15.163:6817/33319] conn(0x7fb200ee2ec0 msgr2=0x7fb20091e6c0 crc :-1 s=STATE_CONNECTION_ESTABLISHED l=1).mark_down
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 --2- 192.168.0.1:0/2936571564 >> [v2:172.21.15.163:6816/33319,v1:172.21.15.163:6817/33319] conn(0x7fb200ee2ec0 0x7fb20091e6c0 crc :-1 s=READY pgs=535 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 -- 192.168.0.1:0/2936571564 >> [v2:172.21.15.8:3300/0,v1:172.21.15.8:6789/0] conn(0x55cb87c03190 msgr2=0x55cb87c15230 secure :-1 s=STATE_CONNECTION_ESTABLISHED l=1).mark_down
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 --2- 192.168.0.1:0/2936571564 >> [v2:172.21.15.8:3300/0,v1:172.21.15.8:6789/0] conn(0x55cb87c03190 0x55cb87c15230 secure :-1 s=READY pgs=275 cs=0
l=1 rev1=1 rx=0x7fb21800d300 tx=0x7fb218005d40).stop
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 -- 192.168.0.1:0/2936571564 shutdown_connections
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 --2- 192.168.0.1:0/2936571564 >> [v2:172.21.15.163:6824/33334,v1:172.21.15.163:6825/33334] conn(0x7fb200482330 0x7fb20042a8d0 unknown :-1 s=CLOSED pgs=603 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 --2- 192.168.0.1:0/2936571564 >> [v2:172.21.15.163:6808/33320,v1:172.21.15.163:6809/33320] conn(0x7fb212ce2600 0x7fb2002f4a80 unknown :-1 s=CLOSED pgs=375 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 --2- 192.168.0.1:0/2936571564 >> [v2:172.21.15.163:6816/33319,v1:172.21.15.163:6817/33319] conn(0x7fb200ee2ec0 0x7fb20091e6c0 unknown :-1 s=CLOSED pgs=535 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 --2- 192.168.0.1:0/2936571564 >> [v2:172.21.15.8:3300/0,v1:172.21.15.8:6789/0] conn(0x55cb87c03190 0x55cb87c15230 unknown :-1 s=CLOSED pgs=275 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 -- 192.168.0.1:0/2936571564 >> 192.168.0.1:0/2936571564 conn(0x55cb87bf6ac0 msgr2=0x55cb87bf9020 unknown :-1 s=STATE_NONE l=0).mark_down         
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 -- 192.168.0.1:0/2936571564 shutdown_connections
2023-08-22T17:29:32.920+0000 7fb2218a2200  1 -- 192.168.0.1:0/2936571564 wait complete.
2023-08-22T17:29:32.960+0000 7fb2218a2200 20 client.4965 trim_cache size 0 max 16384

mds.0 was thrashed so the client gave up reconnecting to the MDS during unmount. When mds.0 came back, it evicted the client's stale session.


Related issues 3 (2 open1 closed)

Copied to CephFS - Backport #62897: pacific: client: evicted warning because client completes unmount before thrashed MDS comes backResolvedPatrick DonnellyActions
Copied to CephFS - Backport #62898: quincy: client: evicted warning because client completes unmount before thrashed MDS comes backIn ProgressPatrick DonnellyActions
Copied to CephFS - Backport #62899: reef: client: evicted warning because client completes unmount before thrashed MDS comes backIn ProgressPatrick DonnellyActions
Actions #1

Updated by Patrick Donnelly 8 months ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 53149
Actions #2

Updated by Patrick Donnelly 7 months ago

  • Status changed from Fix Under Review to Pending Backport
Actions #3

Updated by Backport Bot 7 months ago

  • Copied to Backport #62897: pacific: client: evicted warning because client completes unmount before thrashed MDS comes back added
Actions #4

Updated by Backport Bot 7 months ago

  • Copied to Backport #62898: quincy: client: evicted warning because client completes unmount before thrashed MDS comes back added
Actions #5

Updated by Backport Bot 7 months ago

  • Copied to Backport #62899: reef: client: evicted warning because client completes unmount before thrashed MDS comes back added
Actions #6

Updated by Backport Bot 7 months ago

  • Tags set to backport_processed
Actions

Also available in: Atom PDF