Project

General

Profile

Bug #50112

MDS stuck at stopping when reducing max_mds

Added by 玮文 胡 19 days ago. Updated 12 days ago.

Status:
Pending Backport
Priority:
Normal
Assignee:
Category:
-
Target version:
% Done:

0%

Source:
Community (user)
Tags:
Backport:
pacific,octopus,nautilus
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Client
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We are trying to upgrade to v16 today. Cephadm is trying to reduce max_mds to 1 automatically. However, MDS.1 is stuck at stopping state for about 10 minutes and seems making no progress. We can't wait longer because some MDS requests are blocked.

We are running 2 active MDS for weeks, with both kernel and userspace clients.

On the web dashboard, when stuck at stopping state, MDS.1 has 0 dentries, 2 inodes, 1 dirs, 0 caps.

We've tried 3 times. All get the same result.

To get out of this state, we reset max_mds to 2 and restart the stopping MDS.

Here are the logs from MDS.1

2021-04-02T10:36:28.832+0000 7f17f888f700 20 mds.1.locker caps_tick 0 revoking caps
2021-04-02T10:36:28.832+0000 7f17f888f700 20 mds.1.23151 updating export targets, currently 1 ranks are targets
2021-04-02T10:36:28.832+0000 7f17f888f700 20 mds.1.23151 export target mds.0 is [C 4.40e-01]
2021-04-02T10:36:28.832+0000 7f17f888f700 10 mds.1.log trim 1 / 128 segments, 1 / -1 events, 0 (0) expiring, 0 (0) expired
2021-04-02T10:36:28.832+0000 7f17f888f700 10 mds.1.log _trim_expired_segments waiting for 181543359/644740121223 to expire
2021-04-02T10:36:28.832+0000 7f17f888f700  7 mds.1.cache shutdown_pass
2021-04-02T10:36:28.832+0000 7f17f888f700 10 mds.1.cache shutdown_export_strays 0x60a ''
2021-04-02T10:36:28.832+0000 7f17f888f700  7 mds.1.cache trim bytes_used=5kB limit=16GB reservation=0.05% count=18446744073709551615
2021-04-02T10:36:28.832+0000 7f17f888f700  7 mds.1.cache trim_lru trimming 18446744073709551615 items from LRU size=0 mid=0 pintail=0 pinned=0
2021-04-02T10:36:28.832+0000 7f17f888f700  7 mds.1.cache trim_lru trimmed 0 items
2021-04-02T10:36:28.832+0000 7f17f888f700  5 mds.1.cache lru size now 0/0
2021-04-02T10:36:28.832+0000 7f17f888f700 10 mds.1.cache Migrating any ephemerally pinned inodes
2021-04-02T10:36:28.832+0000 7f17f888f700 20 mds.1.bal handle_export_pins export_pin_queue size=0
2021-04-02T10:36:28.832+0000 7f17f888f700  7 mds.1.cache looking for subtrees to export
2021-04-02T10:36:28.832+0000 7f17f888f700 10 mds.1.cache   examining [dir 0x101 ~mds1/ [2,head] auth v=70451715 cv=70451715/70451715 dir_auth=1 state=1073741824 f(v0 10=0+10) n(v322133 rc2021-04-02T10:26:20.324976+0000 b50 45=35+10) hs=0+0,ss=0+0 | child=0 subtree=1 subtreetemp=0 replicated=0 dirty=0 waiter=0 authpin=0 0x55ad7c233600] bounds
2021-04-02T10:36:28.832+0000 7f17f888f700 10 mds.1.log trim_all: 1/0/0
2021-04-02T10:36:28.832+0000 7f17f888f700 10 mds.1.log _trim_expired_segments waiting for 181543359/644740121223 to expire
2021-04-02T10:36:28.832+0000 7f17f888f700  7 mds.1.cache still have replicated objects
2021-04-02T10:36:28.832+0000 7f17f888f700  7 mds.1.23151 shutdown_pass=false
2021-04-02T10:36:29.056+0000 7f17fa893700 20 mds.1.23151 get_session have 0x55ad7c4d0800 client.5274411 v1:222.201.187.248:0/44200992 state closed
2021-04-02T10:36:29.056+0000 7f17fa893700  3 mds.1.server handle_client_session client_session(request_renewcaps seq 11) from client.5274411
2021-04-02T10:36:29.056+0000 7f17fa893700 10 mds.1.server ignoring renewcaps on non open|stale session (closed)
2021-04-02T10:36:29.236+0000 7f17fa893700  7 mds.1.cache cache_expire from mds.0
2021-04-02T10:36:29.528+0000 7f17f608a700 20 mds.1.cache upkeep thread trimming cache; last trim 1.000067010s ago
2021-04-02T10:36:29.528+0000 7f17f608a700 10 mds.1.cache trim_client_leases
2021-04-02T10:36:29.528+0000 7f17f608a700  7 mds.1.cache trim bytes_used=5kB limit=16GB reservation=0.05% count=0
2021-04-02T10:36:29.528+0000 7f17f608a700  7 mds.1.cache trim_lru trimming 0 items from LRU size=0 mid=0 pintail=0 pinned=0
2021-04-02T10:36:29.528+0000 7f17f608a700  7 mds.1.cache trim_lru trimmed 0 items
2021-04-02T10:36:29.528+0000 7f17f608a700  2 mds.1.cache Memory usage:  total 7697216, rss 289596, heap 332004, baseline 332004, 0 / 2 inodes have caps, 0 caps, 0 caps per inode
2021-04-02T10:36:29.528+0000 7f17f608a700  7 mds.1.server recall_client_state: min=100 max=8388608 total=0 flags=0xa
2021-04-02T10:36:29.528+0000 7f17f608a700  7 mds.1.server recalled 0 client caps.
2021-04-02T10:36:29.528+0000 7f17f608a700 20 mds.1.cache upkeep thread waiting interval 1s
2021-04-02T10:36:29.664+0000 7f17fa893700 20 mds.1.23151 get_session have 0x55ad7c4d1200 client.6374310 v1:222.201.187.246:0/1497586911 state closed
2021-04-02T10:36:29.664+0000 7f17fa893700  3 mds.1.server handle_client_session client_session(request_renewcaps seq 12) from client.6374310
2021-04-02T10:36:29.664+0000 7f17fa893700 10 mds.1.server ignoring renewcaps on non open|stale session (closed)
2021-04-02T10:36:30.236+0000 7f17fa893700  7 mds.1.cache cache_expire from mds.0
2021-04-02T10:36:30.336+0000 7f17fa893700 20 mds.1.23151 get_session have 0x55ad7c4cf400 client.6514667 v1:222.201.187.170:0/3293323087 state closed
2021-04-02T10:36:30.336+0000 7f17fa893700  3 mds.1.server handle_client_session client_session(request_renewcaps seq 2) from client.6514667
2021-04-02T10:36:30.336+0000 7f17fa893700 10 mds.1.server ignoring renewcaps on non open|stale session (closed)
2021-04-02T10:36:30.368+0000 7f17f888f700 20 mds.1.23151 get_task_status
2021-04-02T10:36:30.368+0000 7f17f888f700 20 mds.1.23151 schedule_update_timer_task
2021-04-02T10:36:30.528+0000 7f17f608a700 20 mds.1.cache upkeep thread trimming cache; last trim 1.000068049s ago
2021-04-02T10:36:30.528+0000 7f17f608a700 10 mds.1.cache trim_client_leases
2021-04-02T10:36:30.528+0000 7f17f608a700  7 mds.1.cache trim bytes_used=5kB limit=16GB reservation=0.05% count=0
2021-04-02T10:36:30.528+0000 7f17f608a700  7 mds.1.cache trim_lru trimming 0 items from LRU size=0 mid=0 pintail=0 pinned=0
2021-04-02T10:36:30.528+0000 7f17f608a700  7 mds.1.cache trim_lru trimmed 0 items
2021-04-02T10:36:30.528+0000 7f17f608a700  2 mds.1.cache Memory usage:  total 7697216, rss 289596, heap 332004, baseline 332004, 0 / 2 inodes have caps, 0 caps, 0 caps per inode
2021-04-02T10:36:30.528+0000 7f17f608a700  7 mds.1.server recall_client_state: min=100 max=8388608 total=0 flags=0xa
2021-04-02T10:36:30.528+0000 7f17f608a700  7 mds.1.server recalled 0 client caps.
2021-04-02T10:36:30.528+0000 7f17f608a700 20 mds.1.cache upkeep thread waiting interval 1s
2021-04-02T10:36:30.840+0000 7f17f808e700  5 mds.beacon.cephfs.gpu006.ddpekw Sending beacon up:stopping seq 7654
2021-04-02T10:36:30.840+0000 7f17f808e700 20 mds.beacon.cephfs.gpu006.ddpekw sender thread waiting interval 4s
2021-04-02T10:36:30.844+0000 7f17fd098700  5 mds.beacon.cephfs.gpu006.ddpekw received beacon reply up:stopping seq 7654 rtt 0.00400027
2021-04-02T10:36:31.164+0000 7f17fa893700 20 mds.1.23151 get_session have 0x55ad88956800 client.5274448 v1:222.201.187.233:0/638074764 state closed
2021-04-02T10:36:31.164+0000 7f17fa893700  3 mds.1.server handle_client_session client_session(request_renewcaps seq 11) from client.5274448
2021-04-02T10:36:31.164+0000 7f17fa893700 10 mds.1.server ignoring renewcaps on non open|stale session (closed)
2021-04-02T10:36:31.236+0000 7f17fa893700  7 mds.1.cache cache_expire from mds.0
2021-04-02T10:36:31.528+0000 7f17f608a700 20 mds.1.cache upkeep thread trimming cache; last trim 1.000069055s ago
2021-04-02T10:36:31.528+0000 7f17f608a700 10 mds.1.cache trim_client_leases
2021-04-02T10:36:31.528+0000 7f17f608a700  7 mds.1.cache trim bytes_used=5kB limit=16GB reservation=0.05% count=0
2021-04-02T10:36:31.528+0000 7f17f608a700  7 mds.1.cache trim_lru trimming 0 items from LRU size=0 mid=0 pintail=0 pinned=0
2021-04-02T10:36:31.528+0000 7f17f608a700  7 mds.1.cache trim_lru trimmed 0 items
2021-04-02T10:36:31.528+0000 7f17f608a700  2 mds.1.cache Memory usage:  total 7697216, rss 289596, heap 332004, baseline 332004, 0 / 2 inodes have caps, 0 caps, 0 caps per inode
2021-04-02T10:36:31.528+0000 7f17f608a700  7 mds.1.server recall_client_state: min=100 max=8388608 total=0 flags=0xa
2021-04-02T10:36:31.528+0000 7f17f608a700  7 mds.1.server recalled 0 client caps.
2021-04-02T10:36:31.528+0000 7f17f608a700 20 mds.1.cache upkeep thread waiting interval 1s
2021-04-02T10:36:32.236+0000 7f17fa893700  7 mds.1.cache cache_expire from mds.0
2021-04-02T10:36:32.364+0000 7f17f888f700 20 mds.1.23151 get_task_status
2021-04-02T10:36:32.364+0000 7f17f888f700 20 mds.1.23151 schedule_update_timer_task
2021-04-02T10:36:32.524+0000 7f17f608a700 20 mds.1.cache upkeep thread trimming cache; last trim 0.996069741s ago
2021-04-02T10:36:32.524+0000 7f17f608a700 10 mds.1.cache trim_client_leases
2021-04-02T10:36:32.524+0000 7f17f608a700  7 mds.1.cache trim bytes_used=5kB limit=16GB reservation=0.05% count=0
2021-04-02T10:36:32.524+0000 7f17f608a700  7 mds.1.cache trim_lru trimming 0 items from LRU size=0 mid=0 pintail=0 pinned=0
2021-04-02T10:36:32.524+0000 7f17f608a700  7 mds.1.cache trim_lru trimmed 0 items
2021-04-02T10:36:32.528+0000 7f17f608a700  2 mds.1.cache Memory usage:  total 7697216, rss 289596, heap 332004, baseline 332004, 0 / 2 inodes have caps, 0 caps, 0 caps per inode
2021-04-02T10:36:32.528+0000 7f17f608a700  7 mds.1.server recall_client_state: min=100 max=8388608 total=0 flags=0xa
2021-04-02T10:36:32.528+0000 7f17f608a700  7 mds.1.server recalled 0 client caps.
2021-04-02T10:36:32.528+0000 7f17f608a700 20 mds.1.cache upkeep thread waiting interval 1s
2021-04-02T10:36:33.236+0000 7f17fa893700  7 mds.1.cache cache_expire from mds.0
2021-04-02T10:36:33.352+0000 7f17fa893700 20 mds.1.23151 get_session have 0x55ada600ad00 client.5457150 v1:222.201.187.241:0/2631762979 state closed
2021-04-02T10:36:33.352+0000 7f17fa893700  3 mds.1.server handle_client_session client_session(request_renewcaps seq 12) from client.5457150
2021-04-02T10:36:33.352+0000 7f17fa893700 10 mds.1.server ignoring renewcaps on non open|stale session (closed)
2021-04-02T10:36:33.528+0000 7f17f608a700 20 mds.1.cache upkeep thread trimming cache; last trim 1.000070970s ago
2021-04-02T10:36:33.528+0000 7f17f608a700 10 mds.1.cache trim_client_leases
2021-04-02T10:36:33.528+0000 7f17f608a700  7 mds.1.cache trim bytes_used=5kB limit=16GB reservation=0.05% count=0
2021-04-02T10:36:33.528+0000 7f17f608a700  7 mds.1.cache trim_lru trimming 0 items from LRU size=0 mid=0 pintail=0 pinned=0
2021-04-02T10:36:33.528+0000 7f17f608a700  7 mds.1.cache trim_lru trimmed 0 items
2021-04-02T10:36:33.528+0000 7f17f608a700  2 mds.1.cache Memory usage:  total 7697216, rss 289596, heap 332004, baseline 332004, 0 / 2 inodes have caps, 0 caps, 0 caps per inode
2021-04-02T10:36:33.528+0000 7f17f608a700  7 mds.1.server recall_client_state: min=100 max=8388608 total=0 flags=0xa
2021-04-02T10:36:33.528+0000 7f17f608a700  7 mds.1.server recalled 0 client caps.
2021-04-02T10:36:33.528+0000 7f17f608a700 20 mds.1.cache upkeep thread waiting interval 0.995397667s
2021-04-02T10:36:33.828+0000 7f17f888f700 10 mds.1.log trim 1 / 128 segments, 1 / -1 events, 0 (0) expiring, 0 (0) expired
2021-04-02T10:36:33.828+0000 7f17f888f700 10 mds.1.log _trim_expired_segments waiting for 181543359/644740121223 to expire
2021-04-02T10:36:33.828+0000 7f17f888f700 10 mds.1.server find_idle_sessions. last cleared laggy state 5.81484e+06s ago
2021-04-02T10:36:33.828+0000 7f17f888f700 10 mds.1.locker scatter_tick
2021-04-02T10:36:33.828+0000 7f17f888f700 20 mds.1.locker caps_tick 0 revoking caps
2021-04-02T10:36:33.828+0000 7f17f888f700 20 mds.1.23151 updating export targets, currently 1 ranks are targets
2021-04-02T10:36:33.828+0000 7f17f888f700 20 mds.1.23151 export target mds.0 is [C 3.11e-01]
2021-04-02T10:36:33.828+0000 7f17f888f700 10 mds.1.log trim 1 / 128 segments, 1 / -1 events, 0 (0) expiring, 0 (0) expired
2021-04-02T10:36:33.828+0000 7f17f888f700 10 mds.1.log _trim_expired_segments waiting for 181543359/644740121223 to expire
2021-04-02T10:36:33.828+0000 7f17f888f700  7 mds.1.cache shutdown_pass
2021-04-02T10:36:33.828+0000 7f17f888f700 10 mds.1.cache shutdown_export_strays 0x60a ''
2021-04-02T10:36:33.828+0000 7f17f888f700  7 mds.1.cache trim bytes_used=5kB limit=16GB reservation=0.05% count=18446744073709551615
2021-04-02T10:36:33.828+0000 7f17f888f700  7 mds.1.cache trim_lru trimming 18446744073709551615 items from LRU size=0 mid=0 pintail=0 pinned=0
2021-04-02T10:36:33.828+0000 7f17f888f700  7 mds.1.cache trim_lru trimmed 0 items
2021-04-02T10:36:33.828+0000 7f17f888f700  5 mds.1.cache lru size now 0/0
2021-04-02T10:36:33.828+0000 7f17f888f700 10 mds.1.cache Migrating any ephemerally pinned inodes
2021-04-02T10:36:33.828+0000 7f17f888f700 20 mds.1.bal handle_export_pins export_pin_queue size=0
2021-04-02T10:36:33.828+0000 7f17f888f700  7 mds.1.cache looking for subtrees to export
2021-04-02T10:36:33.828+0000 7f17f888f700 10 mds.1.cache   examining [dir 0x101 ~mds1/ [2,head] auth v=70451715 cv=70451715/70451715 dir_auth=1 state=1073741824 f(v0 10=0+10) n(v322133 rc2021-04-02T10:26:20.324976+0000 b50 45=35+10) hs=0+0,ss=0+0 | child=0 subtree=1 subtreetemp=0 replicated=0 dirty=0 waiter=0 authpin=0 0x55ad7c233600] bounds
2021-04-02T10:36:33.828+0000 7f17f888f700 10 mds.1.log trim_all: 1/0/0
2021-04-02T10:36:33.828+0000 7f17f888f700 10 mds.1.log _trim_expired_segments waiting for 181543359/644740121223 to expire
2021-04-02T10:36:33.828+0000 7f17f888f700  7 mds.1.cache still have replicated objects
2021-04-02T10:36:33.828+0000 7f17f888f700  7 mds.1.23151 shutdown_pass=false
2021-04-02T10:36:34.237+0000 7f17fa893700  7 mds.1.cache cache_expire from mds.0
2021-04-02T10:36:34.365+0000 7f17f888f700 20 mds.1.23151 get_task_status
2021-04-02T10:36:34.365+0000 7f17f888f700 20 mds.1.23151 schedule_update_timer_task
2021-04-02T10:36:34.525+0000 7f17f608a700 20 mds.1.cache upkeep thread trimming cache; last trim 0.996071593s ago
2021-04-02T10:36:34.525+0000 7f17f608a700 10 mds.1.cache trim_client_leases
2021-04-02T10:36:34.525+0000 7f17f608a700  7 mds.1.cache trim bytes_used=5kB limit=16GB reservation=0.05% count=0
2021-04-02T10:36:34.525+0000 7f17f608a700  7 mds.1.cache trim_lru trimming 0 items from LRU size=0 mid=0 pintail=0 pinned=0
2021-04-02T10:36:34.525+0000 7f17f608a700  7 mds.1.cache trim_lru trimmed 0 items
2021-04-02T10:36:34.525+0000 7f17f608a700  2 mds.1.cache Memory usage:  total 7697216, rss 289596, heap 332004, baseline 332004, 0 / 2 inodes have caps, 0 caps, 0 caps per inode
2021-04-02T10:36:34.525+0000 7f17f608a700  7 mds.1.server recall_client_state: min=100 max=8388608 total=0 flags=0xa
2021-04-02T10:36:34.525+0000 7f17f608a700  7 mds.1.server recalled 0 client caps.
2021-04-02T10:36:34.525+0000 7f17f608a700 10 mds.1.cache releasing free memory
2021-04-02T10:36:34.525+0000 7f17f608a700 20 mds.1.cache upkeep thread waiting interval 1s
2021-04-02T10:36:34.797+0000 7f17fa893700 20 mds.1.23151 get_session have 0x55ad7c4cea00 client.6574441 222.201.187.249:0/4238770464 state closed
2021-04-02T10:36:34.797+0000 7f17fa893700  3 mds.1.server handle_client_session client_session(request_renewcaps seq 2) v1 from client.6574441
2021-04-02T10:36:34.797+0000 7f17fa893700 10 mds.1.server ignoring renewcaps on non open|stale session (closed)
2021-04-02T10:36:34.837+0000 7f17f808e700  5 mds.beacon.cephfs.gpu006.ddpekw Sending beacon up:stopping seq 7655
2021-04-02T10:36:34.837+0000 7f17f808e700 20 mds.beacon.cephfs.gpu006.ddpekw sender thread waiting interval 4s
2021-04-02T10:36:34.841+0000 7f17fd098700  5 mds.beacon.cephfs.gpu006.ddpekw received beacon reply up:stopping seq 7655 rtt 0.00400029
2021-04-02T10:36:35.237+0000 7f17fa893700  7 mds.1.cache cache_expire from mds.0
2021-04-02T10:36:35.525+0000 7f17f608a700 20 mds.1.cache upkeep thread trimming cache; last trim 1.000072765s ago
2021-04-02T10:36:35.525+0000 7f17f608a700 10 mds.1.cache trim_client_leases
2021-04-02T10:36:35.525+0000 7f17f608a700  7 mds.1.cache trim bytes_used=5kB limit=16GB reservation=0.05% count=0
2021-04-02T10:36:35.525+0000 7f17f608a700  7 mds.1.cache trim_lru trimming 0 items from LRU size=0 mid=0 pintail=0 pinned=0
2021-04-02T10:36:35.525+0000 7f17f608a700  7 mds.1.cache trim_lru trimmed 0 items
2021-04-02T10:36:35.525+0000 7f17f608a700  2 mds.1.cache Memory usage:  total 7697216, rss 289588, heap 332004, baseline 332004, 0 / 2 inodes have caps, 0 caps, 0 caps per inode
2021-04-02T10:36:35.525+0000 7f17f608a700  7 mds.1.server recall_client_state: min=100 max=8388608 total=0 flags=0xa
2021-04-02T10:36:35.525+0000 7f17f608a700  7 mds.1.server recalled 0 client caps.
2021-04-02T10:36:35.525+0000 7f17f608a700 20 mds.1.cache upkeep thread waiting interval 1s
2021-04-02T10:36:36.237+0000 7f17fa893700  7 mds.1.cache cache_expire from mds.0
2021-04-02T10:36:36.365+0000 7f17f888f700 20 mds.1.23151 get_task_status
2021-04-02T10:36:36.365+0000 7f17f888f700 20 mds.1.23151 schedule_update_timer_task
2021-04-02T10:36:36.369+0000 7f17fa893700 20 mds.1.23151 get_session have 0x55ad88955900 client.5274453 v1:222.201.187.231:0/3972282249 state closed
2021-04-02T10:36:36.369+0000 7f17fa893700  3 mds.1.server handle_client_session client_session(request_renewcaps seq 12) from client.5274453
2021-04-02T10:36:36.369+0000 7f17fa893700 10 mds.1.server ignoring renewcaps on non open|stale session (closed)
2021-04-02T10:36:36.525+0000 7f17f608a700 20 mds.1.cache upkeep thread trimming cache; last trim 1.000073624s ago
2021-04-02T10:36:36.525+0000 7f17f608a700 10 mds.1.cache trim_client_leases
2021-04-02T10:36:36.525+0000 7f17f608a700  7 mds.1.cache trim bytes_used=5kB limit=16GB reservation=0.05% count=0
2021-04-02T10:36:36.525+0000 7f17f608a700  7 mds.1.cache trim_lru trimming 0 items from LRU size=0 mid=0 pintail=0 pinned=0
2021-04-02T10:36:36.525+0000 7f17f608a700  7 mds.1.cache trim_lru trimmed 0 items
2021-04-02T10:36:36.525+0000 7f17f608a700  2 mds.1.cache Memory usage:  total 7697216, rss 289588, heap 332004, baseline 332004, 0 / 2 inodes have caps, 0 caps, 0 caps per inode
2021-04-02T10:36:36.525+0000 7f17f608a700  7 mds.1.server recall_client_state: min=100 max=8388608 total=0 flags=0xa
2021-04-02T10:36:36.525+0000 7f17f608a700  7 mds.1.server recalled 0 client caps.
2021-04-02T10:36:36.525+0000 7f17f608a700 20 mds.1.cache upkeep thread waiting interval 1s
2021-04-02T10:36:36.949+0000 7f17fa893700 20 mds.1.23151 get_session have 0x55ada600a800 client.5218886 v1:116.56.143.171:0/2506259812 state closed
2021-04-02T10:36:36.949+0000 7f17fa893700  3 mds.1.server handle_client_session client_session(request_renewcaps seq 12) from client.5218886
2021-04-02T10:36:36.949+0000 7f17fa893700 10 mds.1.server ignoring renewcaps on non open|stale session (closed)
2021-04-02T10:36:37.237+0000 7f17fa893700  7 mds.1.cache cache_expire from mds.0
2021-04-02T10:36:37.525+0000 7f17f608a700 20 mds.1.cache upkeep thread trimming cache; last trim 1.000074454s ago
2021-04-02T10:36:37.525+0000 7f17f608a700 10 mds.1.cache trim_client_leases
2021-04-02T10:36:37.525+0000 7f17f608a700  7 mds.1.cache trim bytes_used=5kB limit=16GB reservation=0.05% count=0
2021-04-02T10:36:37.525+0000 7f17f608a700  7 mds.1.cache trim_lru trimming 0 items from LRU size=0 mid=0 pintail=0 pinned=0
2021-04-02T10:36:37.525+0000 7f17f608a700  7 mds.1.cache trim_lru trimmed 0 items
2021-04-02T10:36:37.529+0000 7f17f608a700  2 mds.1.cache Memory usage:  total 7697216, rss 289588, heap 332004, baseline 332004, 0 / 2 inodes have caps, 0 caps, 0 caps per inode
2021-04-02T10:36:37.529+0000 7f17f608a700  7 mds.1.server recall_client_state: min=100 max=8388608 total=0 flags=0xa
2021-04-02T10:36:37.529+0000 7f17f608a700  7 mds.1.server recalled 0 client caps.
2021-04-02T10:36:37.529+0000 7f17f608a700 20 mds.1.cache upkeep thread waiting interval 1s
2021-04-02T10:36:37.737+0000 7f17fa893700 20 mds.1.23151 get_session have 0x55ada600a300 client.6544148 v1:222.201.187.244:0/4103172439 state closed
2021-04-02T10:36:37.737+0000 7f17fa893700  3 mds.1.server handle_client_session client_session(request_renewcaps seq 12) from client.6544148
2021-04-02T10:36:37.737+0000 7f17fa893700 10 mds.1.server ignoring renewcaps on non open|stale session (closed)
2021-04-02T10:36:38.237+0000 7f17fa893700  7 mds.1.cache cache_expire from mds.0
2021-04-02T10:36:38.365+0000 7f17f888f700 20 mds.1.23151 get_task_status
2021-04-02T10:36:38.365+0000 7f17f888f700 20 mds.1.23151 schedule_update_timer_task
2021-04-02T10:36:38.529+0000 7f17f608a700 20 mds.1.cache upkeep thread trimming cache; last trim 1.000075263s ago
2021-04-02T10:36:38.529+0000 7f17f608a700 10 mds.1.cache trim_client_leases
2021-04-02T10:36:38.529+0000 7f17f608a700  7 mds.1.cache trim bytes_used=5kB limit=16GB reservation=0.05% count=0
2021-04-02T10:36:38.529+0000 7f17f608a700  7 mds.1.cache trim_lru trimming 0 items from LRU size=0 mid=0 pintail=0 pinned=0
2021-04-02T10:36:38.529+0000 7f17f608a700  7 mds.1.cache trim_lru trimmed 0 items
2021-04-02T10:36:38.529+0000 7f17f608a700  2 mds.1.cache Memory usage:  total 7697216, rss 289588, heap 332004, baseline 332004, 0 / 2 inodes have caps, 0 caps, 0 caps per inode
2021-04-02T10:36:38.529+0000 7f17f608a700  7 mds.1.server recall_client_state: min=100 max=8388608 total=0 flags=0xa
2021-04-02T10:36:38.529+0000 7f17f608a700  7 mds.1.server recalled 0 client caps.
2021-04-02T10:36:38.529+0000 7f17f608a700 20 mds.1.cache upkeep thread waiting interval 1s
2021-04-02T10:36:38.737+0000 7f17fa893700 20 mds.1.23151 get_session have 0x55ad88955400 client.5274416 v1:222.201.187.242:0/3687007628 state closed
2021-04-02T10:36:38.737+0000 7f17fa893700  3 mds.1.server handle_client_session client_session(request_renewcaps seq 11) from client.5274416
2021-04-02T10:36:38.737+0000 7f17fa893700 10 mds.1.server ignoring renewcaps on non open|stale session (closed)
2021-04-02T10:36:38.829+0000 7f17f888f700 10 mds.1.log trim 1 / 128 segments, 1 / -1 events, 0 (0) expiring, 0 (0) expired
2021-04-02T10:36:38.829+0000 7f17f888f700 10 mds.1.log _trim_expired_segments waiting for 181543359/644740121223 to expire
2021-04-02T10:36:38.829+0000 7f17f888f700 10 mds.1.server find_idle_sessions. last cleared laggy state 5.81484e+06s ago
2021-04-02T10:36:38.829+0000 7f17f888f700 10 mds.1.locker scatter_tick
2021-04-02T10:36:38.829+0000 7f17f888f700 20 mds.1.locker caps_tick 0 revoking caps
2021-04-02T10:36:38.829+0000 7f17f888f700 20 mds.1.23151 updating export targets, currently 1 ranks are targets
2021-04-02T10:36:38.829+0000 7f17f888f700 20 mds.1.23151 export target mds.0 is [C 2.20e-01]
2021-04-02T10:36:38.829+0000 7f17f888f700 10 mds.1.log trim 1 / 128 segments, 1 / -1 events, 0 (0) expiring, 0 (0) expired
2021-04-02T10:36:38.829+0000 7f17f888f700 10 mds.1.log _trim_expired_segments waiting for 181543359/644740121223 to expire
2021-04-02T10:36:38.829+0000 7f17f888f700  7 mds.1.cache shutdown_pass
2021-04-02T10:36:38.829+0000 7f17f888f700 10 mds.1.cache shutdown_export_strays 0x60a ''
2021-04-02T10:36:38.829+0000 7f17f888f700  7 mds.1.cache trim bytes_used=5kB limit=16GB reservation=0.05% count=18446744073709551615
2021-04-02T10:36:38.829+0000 7f17f888f700  7 mds.1.cache trim_lru trimming 18446744073709551615 items from LRU size=0 mid=0 pintail=0 pinned=0
2021-04-02T10:36:38.829+0000 7f17f888f700  7 mds.1.cache trim_lru trimmed 0 items
2021-04-02T10:36:38.829+0000 7f17f888f700  5 mds.1.cache lru size now 0/0
2021-04-02T10:36:38.829+0000 7f17f888f700 10 mds.1.cache Migrating any ephemerally pinned inodes
2021-04-02T10:36:38.829+0000 7f17f888f700 20 mds.1.bal handle_export_pins export_pin_queue size=0
2021-04-02T10:36:38.829+0000 7f17f888f700  7 mds.1.cache looking for subtrees to export
2021-04-02T10:36:38.829+0000 7f17f888f700 10 mds.1.cache   examining [dir 0x101 ~mds1/ [2,head] auth v=70451715 cv=70451715/70451715 dir_auth=1 state=1073741824 f(v0 10=0+10) n(v322133 rc2021-04-02T10:26:20.324976+0000 b50 45=35+10) hs=0+0,ss=0+0 | child=0 subtree=1 subtreetemp=0 replicated=0 dirty=0 waiter=0 authpin=0 0x55ad7c233600] bounds
2021-04-02T10:36:38.829+0000 7f17f888f700 10 mds.1.log trim_all: 1/0/0
2021-04-02T10:36:38.829+0000 7f17f888f700 10 mds.1.log _trim_expired_segments waiting for 181543359/644740121223 to expire
2021-04-02T10:36:38.829+0000 7f17f888f700  7 mds.1.cache still have replicated objects
2021-04-02T10:36:38.829+0000 7f17f888f700  7 mds.1.23151 shutdown_pass=false
2021-04-02T10:36:38.837+0000 7f17f808e700  5 mds.beacon.cephfs.gpu006.ddpekw Sending beacon up:stopping seq 7656
2021-04-02T10:36:38.837+0000 7f17f808e700 20 mds.beacon.cephfs.gpu006.ddpekw sender thread waiting interval 4s
2021-04-02T10:36:38.841+0000 7f17fd098700  5 mds.beacon.cephfs.gpu006.ddpekw received beacon reply up:stopping seq 7656 rtt 0.0040003
2021-04-02T10:36:39.229+0000 7f17fa893700 20 mds.1.23151 get_session have 0x55ad7c47ed00 client.5274447 v1:222.201.187.247:0/816986335 state closed
2021-04-02T10:36:39.229+0000 7f17fa893700  3 mds.1.server handle_client_session client_session(request_renewcaps seq 11) from client.5274447
2021-04-02T10:36:39.229+0000 7f17fa893700 10 mds.1.server ignoring renewcaps on non open|stale session (closed)
2021-04-02T10:36:39.253+0000 7f17fa893700  7 mds.1.cache cache_expire from mds.0
2021-04-02T10:36:39.529+0000 7f17f608a700 20 mds.1.cache upkeep thread trimming cache; last trim 1.000076044s ago
2021-04-02T10:36:39.529+0000 7f17f608a700 10 mds.1.cache trim_client_leases
2021-04-02T10:36:39.529+0000 7f17f608a700  7 mds.1.cache trim bytes_used=5kB limit=16GB reservation=0.05% count=0
2021-04-02T10:36:39.529+0000 7f17f608a700  7 mds.1.cache trim_lru trimming 0 items from LRU size=0 mid=0 pintail=0 pinned=0
2021-04-02T10:36:39.529+0000 7f17f608a700  7 mds.1.cache trim_lru trimmed 0 items
2021-04-02T10:36:39.529+0000 7f17f608a700  2 mds.1.cache Memory usage:  total 7697216, rss 289588, heap 332004, baseline 332004, 0 / 2 inodes have caps, 0 caps, 0 caps per inode
2021-04-02T10:36:39.529+0000 7f17f608a700  7 mds.1.server recall_client_state: min=100 max=8388608 total=0 flags=0xa
2021-04-02T10:36:39.529+0000 7f17f608a700  7 mds.1.server recalled 0 client caps.
2021-04-02T10:36:39.529+0000 7f17f608a700 20 mds.1.cache upkeep thread waiting interval 1s
2021-04-02T10:36:44.529+0000 7f17fa893700  1 mds.cephfs.gpu006.ddpekw Updating MDS map to version 23242 from mon.1


Related issues

Copied to CephFS - Backport #50288: octopus: MDS stuck at stopping when reducing max_mds In Progress
Copied to CephFS - Backport #50289: pacific: MDS stuck at stopping when reducing max_mds In Progress
Copied to CephFS - Backport #50290: nautilus: MDS stuck at stopping when reducing max_mds In Progress

History

#1 Updated by 玮文 胡 19 days ago

I've figured out "7 mds.1.cache still have replicated objects" may be the reason that this MDS cannot complete its shutdown. But I don't know the meaning of "replicated" here.

#2 Updated by Patrick Donnelly 19 days ago

  • Status changed from New to Triaged
  • Assignee set to Xiubo Li
  • Target version set to v17.0.0
  • Backport set to pacific,octopus,nautilus

#3 Updated by 玮文 胡 17 days ago

I spent some more time investigating this. Hope this is helpful.

When I issue "dump cache" to the stopping mds.1, I got the following output, and this inode should be the inode that is referred to by the log "still have replicated objects".

[
    ...
    {
        "ino": 257,
        ...
        "auth_state": {
            "replicas": {
                "0": 2801
            }
        },
        "replica_state": {
            "authority": [
                1,
                -2
            ],
            "replica_nonce": 0
        },
        "auth_pins": 0,
        "is_frozen": false,
        "is_freezing": false,
        "pins": {
            "dirtyscattered": 0,
            "lock": 0,
            "dirfrag": 1,
            "openingsnapparents": 0,
            "replicated": 1,
            "dirty": 0,
            "waiter": 0,
            "authpin": 0
        },
        ...
        "dirfrags": [
            {
                "path": "~mds1",
                "dirfrag": "0x101",
                "snapid_first": 2,
                "projected_version": "70555552",
                "version": "70555552",
                "committing_version": "70555552",
                "committed_version": "70555552",
                "is_rep": false,
                "dir_auth": "1",
                "states": [
                    "auth" 
                ],
                "is_auth": true,
                "auth_state": {
                    "replicas": {}
                },
                "replica_state": {
                    "authority": [
                        1,
                        -2
                    ],
                    "replica_nonce": 0
                },
                "auth_pins": 0,
                "is_frozen": false,
                "is_freezing": false,
                "pins": {
                    "child": 0,
                    "subtree": 1,
                    "subtreetemp": 0,
                    "replicated": 0,
                    "dirty": 0,
                    "waiter": 0,
                    "authpin": 0
                },
                "nref": 1,
                "dentries": []
            }
        ]
    }
]

If I understand it correctly, this replica record should be removed from MDCache::inode_remove_replica() called by MDCache::handle_cache_expire(). And from the logs of mds.0, I got this:

2021-04-04T08:26:21.345+0000 7f6f0c02f700 20 mds.0.cache trim: try expiring [inode 0x101 [...2,head] ~mds1/ rep@1.2802 v2220160 snaprealm=0x55be009a5b80 f(v0 10=0+10) n(v323722 rc2021-04-04T08:15:40.780573+0000 b846538 52=41+11)/n(v0 rc2020-09-25T04:16:49.903977+0000 11=0+11) (inest mix) (iversion lock) caps={5274411=pAsLsXsFs/-@1,6374310=pAsLsXsFs/-@1,6465050=pAsLsXsFs/-@1} | lock=0 dirfrag=0 caps=1 dirwaiter=0 discoverbase=0 0x55bdc30f6800] for stopping mds.0x55b87d219008
2021-04-04T08:26:21.345+0000 7f6f0c02f700 10 mds.0.cache expire_recursive:[inode 0x101 [...2,head] ~mds1/ rep@1.2802 v2220160 snaprealm=0x55be009a5b80 f(v0 10=0+10) n(v323722 rc2021-04-04T08:15:40.780573+0000 b846538 52=41+11)/n(v0 rc2020-09-25T04:16:49.903977+0000 11=0+11) (inest mix) (iversion lock) caps={5274411=pAsLsXsFs/-@1,6374310=pAsLsXsFs/-@1,6465050=pAsLsXsFs/-@1} | lock=0 dirfrag=0 caps=1 dirwaiter=0 discoverbase=0 0x55bdc30f6800]
2021-04-04T08:26:21.345+0000 7f6f0c02f700 20 mds.0.cache trim: successfully expired mdsdir
2021-04-04T08:26:21.345+0000 7f6f0c02f700  7 mds.0.cache sending cache_expire to 1

So mds.0 seems correctly sent the cache_expire message. However, the corresponding logs from mds.1 shows:

2021-04-04T08:26:21.349+0000 7f9dea00d700  7 mds.1.cache cache_expire from mds.0

without any following logs. Seems it is getting an empty message?

Any ideas about what happened here?

#4 Updated by 玮文 胡 17 days ago

Some more progress. mds.0 seems actually sending an empty cache_expire message to mds.1, despite saying "successfully expired mdsdir" in the log. The call "expire_recursive(mdsdir_in, expiremap)" in "MDCache::trim()" seems only expire the CDentry in this inode but not the "mdsdir_in" (inode 0x101) itself. But the inode 0x101 has no Dentry in mds.0.cache. I got this from "dump cache" on mds.0, this operation makes mds.0 unresponsive and be taken over by a standby. After the take-over, reducing max_mds to 1 succeeded.

#5 Updated by Patrick Donnelly 16 days ago

I wonder if this is related to

https://tracker.ceph.com/issues/49922

The code would indicate it's normal for mdsdir to be replicated (end of MDCache::trim) but it's clearly not doing the right thing when trimming.

#6 Updated by Xiubo Li 15 days ago

玮文 胡 wrote:

Some more progress. mds.0 seems actually sending an empty cache_expire message to mds.1, despite saying "successfully expired mdsdir" in the log. The call "expire_recursive(mdsdir_in, expiremap)" in "MDCache::trim()" seems only expire the CDentry in this inode but not the "mdsdir_in" (inode 0x101) itself. But the inode 0x101 has no Dentry in mds.0.cache. I got this from "dump cache" on mds.0, this operation makes mds.0 unresponsive and be taken over by a standby. After the take-over, reducing max_mds to 1 succeeded.

Yeah, you're right, in this case only the mdsdir is replicated, but the mdsdir's remplica didn't removed from mds.0.

This case is similar with the previous bug which has been fixed by Zheng Yan years ago:

commit c4bfe3e3ffc1e8a8025c13a5c49063d3d41df1b5
Author: Yan, Zheng <ukernel@gmail.com>
Date:   Mon Mar 20 11:36:18 2017 +0800

    mds: proper trim stopping mds's mdsdir inode

    previous code does not work when only mdsdir inode is replicated
    (mdsdir dirfrag is not replicated)

    Signed-off-by: "Yan, Zheng" <zyan@redhat.com>

But the issue still exists.

There has two ways to fix this:
1, in mds.0, to check if the mdsdir dirfrag is not replicated, just remove the mdsdir's replica in mds.0.
2, in mds.1, maybe we remove Line#6876 in MDCache::trim() to make the mdsdir expired from mds.0:

 6801 std::pair<bool, uint64_t> MDCache::trim(uint64_t count)
 6802 {
      ...
 6874   std::set<mds_rank_t> stopping;
 6875   mds->mdsmap->get_mds_set(stopping, MDSMap::STATE_STOPPING);
 6876   stopping.erase(mds->get_nodeid());                                                                                                
 6877   for (auto rank : stopping) {
 6878     CInode* mdsdir_in = get_inode(MDS_INO_MDSDIR(rank));
 6879     if (!mdsdir_in)
 6880       continue;
     ...

#7 Updated by Xiubo Li 15 days ago

It seems the "mdsdir_in->get_num_ref()" is not zero in mds.0:

 6889     const bool aborted = expire_recursive(mdsdir_in, expiremap);                                                                    
 6890     if (!aborted) {
 6891       dout(20) << __func__ << ": successfully expired mdsdir" << dendl;
 6892       auto&& ls = mdsdir_in->get_dirfrags();
 6893       for (auto dir : ls) {
 6894         if (dir->get_num_ref() == 1) {  // subtree pin
 6895           trim_dirfrag(dir, dir, expiremap);
 6896           ++trimmed;
 6897         }
 6898       }
 6899       if (mdsdir_in->get_num_ref() == 0) {  
 6900         trim_inode(NULL, mdsdir_in, NULL, expiremap);                                                                               
 6901         ++trimmed;
 6902       }
 6903     } else {
 6904       dout(20) << __func__ << ": some unexpirable contents in mdsdir" << dendl;                                                     
 6905     }

And in mds.0 there are no logs like:

 934587 2021-04-06T16:24:52.186+0800 14d16e76a700 15 mds.0.cache trim_inode [inode 0x101 [...2,head] ~mds1/ rep@1.1 v1 snaprealm=0x564924b        73b80 f(v0 10=0+10) n(v0 rc2021-04-06T16:21:19.832840+0800 11=0+11) (iversion lock) 0x564922d19000]
 934588 2021-04-06T16:24:52.186+0800 14d16e76a700  7 mds.0.locker rdlock_try on (idft sync) on [inode 0x101 [...2,head] ~mds1/ rep@1.1 v1         snaprealm=0x564924b73b80 f(v0 10=0+10) n(v0 rc2021-04-06T16:21:19.832840+0800 11=0+11) (iversion lock) 0x564922d19000]            
 934589 2021-04-06T16:24:52.186+0800 14d16e76a700 12 mds.0.cache   sending expire to mds.1 on [inode 0x101 [...2,head] ~mds1/ rep@1.1 v1 s        naprealm=0x564924b73b80 f(v0 10=0+10) n(v0 rc2021-04-06T16:21:19.832840+0800 11=0+11) (iversion lock) 0x564922d19000]
 934590 2021-04-06T16:24:52.186+0800 14d16e76a700 14 mds.0.cache remove_inode [inode 0x101 [...2,head] ~mds1/ rep@1.1 v1 snaprealm=0x56492        4b73b80 f(v0 10=0+10) n(v0 rc2021-04-06T16:21:19.832840+0800 11=0+11) (iversion lock) 0x564922d19000]

In theory, in trim() it will call trim_inode(0x101) and expire the 0x101 inode.

#8 Updated by 玮文 胡 15 days ago

Xiubo Li wrote:

It seems the "mdsdir_in->get_num_ref()" is not zero in mds.0

Yes, do you have any idea why? As I already restarted mds.0, I cannot reproduce it anymore. "dump cache" from mds.0 contains:

[inode 0x101 [...2,head] ~mds1/ rep@1.2804 v2220248 snaprealm=0x55be009a5b80 f(v0 10=0+10) n(v323745 rc2021-04-04T10:15:45.587536+0000 b1094205 166=148+18)/n(v0 rc2020-09-25T04:16:49.903977+0000 11=0+11) (inest mix) (iversion lock) caps={5274411=pAsLsXsFs/-@1,6374310=pAsLsXsFs/-@1,6465050=pAsLsXsFs/-@1} | lock=0 dirfrag=1 caps=1 dirwaiter=0 discoverbase=0 0x55bdc30f6800]
 [dir 0x101 ~mds1/ [2,head] rep@1.1 dir_auth=1 state=0 f(v0 10=0+10) n(v323742 rc2021-04-04T09:37:24.538296+0000 b3653816 155=145+10)/n(v323742 rc2021-04-04T09:33:57.704404+0000 b1593846 184=174+10) hs=0+0,ss=0+0 | child=0 subtree=1 0x55c16eed8000]

Maybe because there are some caps?

#9 Updated by Xiubo Li 15 days ago

玮文 胡 wrote:

Xiubo Li wrote:

It seems the "mdsdir_in->get_num_ref()" is not zero in mds.0

Yes, do you have any idea why? As I already restarted mds.0, I cannot reproduce it anymore. "dump cache" from mds.0 contains:

[...]

Maybe because there are some caps?

Maybe, locally I cannot reproduce it by running the test for a whole day.

BTW, do you have more logs for the mds.0 ? I need to check the logs to get why.

#10 Updated by 玮文 胡 15 days ago

Xiubo Li wrote:

BTW, do you have more logs for the mds.0 ? I need to check the logs to get why.

Only less than 1 second logs are saved. Others are suppressed by journals. As logs may contain sensitive information, I have sent it to your by E-mail address

#11 Updated by Xiubo Li 15 days ago

玮文 胡 wrote:

Xiubo Li wrote:

It seems the "mdsdir_in->get_num_ref()" is not zero in mds.0

Yes, do you have any idea why? As I already restarted mds.0, I cannot reproduce it anymore. "dump cache" from mds.0 contains:

[...]

Maybe because there are some caps?

Yeah, it is. There still has cap was pinned:

| lock=0 caps=1 dirwaiter=0 discoverbase=0

#12 Updated by Xiubo Li 15 days ago

Patrick Donnelly wrote:

I wonder if this is related to

https://tracker.ceph.com/issues/49922

IMO you are right.

For the mds private inode/dirs, it shouldn't be normal to handle the client caps, right ?

#13 Updated by 玮文 胡 14 days ago

Xiubo Li wrote:

For the mds private inode/dirs, it shouldn't be normal to handle the client caps, right ?

I think so. So we can recall these caps when stopping, or just pretend this inode does not exist for every client request?

#14 Updated by Xiubo Li 14 days ago

玮文 胡 wrote:

Xiubo Li wrote:

For the mds private inode/dirs, it shouldn't be normal to handle the client caps, right ?

I think so. So we can recall these caps when stopping, or just pretend this inode does not exist for every client request?

With any of Patrick's fixing in MDS [1] or Jeff's kclient fixing [2], it should be okay.

Since you were reproducing this with both kclient and userspace client, I will do the same fixing in client/Client.cc which is the same with Jeff's kernel patch [2]. With this we can see what codepaths have led to these calls next time.

[1] https://github.com/ceph/ceph/pull/40389
[2] https://patchwork.kernel.org/project/ceph-devel/list/?series=460827

#15 Updated by Patrick Donnelly 12 days ago

  • Status changed from Triaged to Pending Backport
  • Pull request ID set to 40642
  • Component(FS) Client added
  • Component(FS) deleted (MDS)

#16 Updated by Backport Bot 12 days ago

  • Copied to Backport #50288: octopus: MDS stuck at stopping when reducing max_mds added

#17 Updated by Backport Bot 12 days ago

  • Copied to Backport #50289: pacific: MDS stuck at stopping when reducing max_mds added

#18 Updated by Backport Bot 12 days ago

  • Copied to Backport #50290: nautilus: MDS stuck at stopping when reducing max_mds added

Also available in: Atom PDF