Bug #59163

mds: stuck in up:rejoin when it cannot "open" missing directory inode

Added by Patrick Donnelly 11 months ago. Updated 9 months ago.

Status: New
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: reef,quincy,pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

tasks.cephfs.test_damage.TestDamage.test_object_deletion tests damage handling when no clients are in the session list (for better or worse). Due to a kernel bug (#59162), the client does not correctly close its session, so the MDS keeps a directory inode in the OpenFileTable. The presence of this entry causes the MDS to try to open the dirfrag during up:rejoin. The directory was deliberately removed to test the response to metadata damage. In this scenario, up:rejoin apparently becomes stuck.

Other mutations also fail, probably for the same reason.
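
The test loops on a health check of the fsmap until it times out, as seen in the `are_daemons_healthy` log lines below. A minimal sketch of that kind of check (hypothetical helper, not the actual `tasks.cephfs.filesystem` code) against a trimmed fragment of the fsmap from the log:

```python
# Hypothetical sketch: report every MDS daemon in the fsmap whose state is
# not up:active. A rank stuck in up:rejoin shows up here on every poll,
# producing the "Unhealthy mds state gid_7947:up:rejoin" failure below.

def unhealthy_daemons(mdsmap: dict) -> list:
    """Return 'gid:state' strings for every daemon not in up:active."""
    bad = []
    for gid, info in mdsmap.get("info", {}).items():
        if info["state"] != "up:active":
            bad.append("{}:{}".format(gid, info["state"]))
    return bad

# Trimmed-down fsmap fragment matching the log excerpt below.
mdsmap = {
    "info": {
        "gid_7947": {"gid": 7947, "name": "c", "rank": 0, "state": "up:rejoin"},
    },
}
print(unhealthy_daemons(mdsmap))  # → ['gid_7947:up:rejoin']
```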

See:

2023-03-23T19:30:35.545 INFO:teuthology.orchestra.run.smithi119.stderr:dumped fsmap epoch 94
2023-03-23T19:30:35.565 DEBUG:tasks.cephfs.filesystem:are_daemons_healthy: mds map: {'epoch': 94, 'flags': 18, 'flags_state': {'joinable': True, 'allow_snaps': True, 'allow_multimds_snaps': True, 'allow_standby_replay': False, 'refuse_client_session': False}, 'ever_allowed_features': 0, 'explicitly_allowed_features': 0, 'created': '2023-03-23T19:28:02.712518+0000', 'modified': '2023-03-23T19:30:21.752731+0000', 'tableserver': 0, 'root': 0, 'session_timeout': 60, 'session_autoclose': 300, 'required_client_features': {}, 'max_file_size': 1099511627776, 'max_xattr_size': 65536, 'last_failure': 0, 'last_failure_osd_epoch': 102, 'compat': {'compat': {}, 'ro_compat': {}, 'incompat': {'feature_1': 'base v0.20', 'feature_2': 'client writeable ranges', 'feature_3': 'default file layouts on dirs', 'feature_4': 'dir inode in separate object', 'feature_5': 'mds uses versioned encoding', 'feature_6': 'dirfrag is stored in omap', 'feature_7': 'mds uses inline data', 'feature_8': 'no anchor table', 'feature_9': 'file layout v2', 'feature_10': 'snaprealm v2'}}, 'max_mds': 1, 'in': [0], 'up': {'mds_0': 7947}, 'failed': [], 'damaged': [], 'stopped': [], 'info': {'gid_7947': {'gid': 7947, 'name': 'c', 'rank': 0, 'incarnation': 91, 'state': 'up:rejoin', 'state_seq': 17, 'addr': '172.21.15.119:6837/4165463071', 'addrs': {'addrvec': [{'type': 'v2', 'addr': '172.21.15.119:6836', 'nonce': 4165463071}, {'type': 'v1', 'addr': '172.21.15.119:6837', 'nonce': 4165463071}]}, 'join_fscid': -1, 'export_targets': [], 'features': 4540138322906710015, 'flags': 0, 'compat': {'compat': {}, 'ro_compat': {}, 'incompat': {'feature_1': 'base v0.20', 'feature_2': 'client writeable ranges', 'feature_3': 'default file layouts on dirs', 'feature_4': 'dir inode in separate object', 'feature_5': 'mds uses versioned encoding', 'feature_6': 'dirfrag is stored in omap', 'feature_7': 'mds uses inline data', 'feature_8': 'no anchor table', 'feature_9': 'file layout v2', 'feature_10': 'snaprealm v2'}}}}, 'data_pools': [15], 'metadata_pool': 14, 'enabled': True, 'fs_name': 'cephfs', 'balancer': '', 'bal_rank_mask': '-1', 'standby_count_wanted': 0}
2023-03-23T19:30:35.566 WARNING:tasks.cephfs.filesystem:Unhealthy mds state gid_7947:up:rejoin
2023-03-23T19:30:35.566 INFO:tasks.cephfs.test_damage:Result: Mutation 'Delete 10000000000.00000000' should have left us healthy, actually not.

/teuthology/pdonnell-2023-03-23_18:44:22-fs-wip-pdonnell-testing-20230323.162417-distro-default-smithi/7217902/teuthology.log


Related issues

Related to Linux kernel client - Bug #59162: kernel completes umount without waiting for mds close Fix Under Review
Duplicated by CephFS - Bug #59230: Test failure: test_object_deletion (tasks.cephfs.test_damage.TestDamage) Duplicate

History

#1 Updated by Patrick Donnelly 11 months ago

  • Related to Bug #59162: kernel completes umount without waiting for mds close added

#2 Updated by Venky Shankar 11 months ago

  • Category set to Correctness/Safety
  • Assignee set to Venky Shankar

#3 Updated by Patrick Donnelly 11 months ago

  • Duplicated by Bug #59230: Test failure: test_object_deletion (tasks.cephfs.test_damage.TestDamage) added

#4 Updated by Venky Shankar 10 months ago

The MDS got marked down:damaged because it could not decode the CDir fnode:

   -49> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 10 mds.0.cache rejoin_start
   -48> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 10 mds.0.cache process_imported_caps
   -47> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 10 mds.0.openfiles prefetch_inodes
   -46> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 10 mds.0.openfiles _prefetch_inodes state 1
   -45> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 10 mds.0.cache open_ino_batch_start
   -44> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 10 mds.0.cache open_ino 0x10000000000 pool 14 want_replica 0
   -43> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 10 mds.0.cache _open_ino_traverse_dir: ino 0x10000000000 ret 0
   -42> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 10 mds.0.cache open_ino_traverse_dir ino 0x10000000000 [<0x1/subdir v0>]
   -41> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 20 mds.0.cache.dir(0x1) lookup (subdir, 'head')
   -40> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 10 mds.0.cache  fetching incomplete [dir 0x1 / [2,head] auth v=12 cv=0/0 dir_auth=0 state=1073741824 f(v0 m2023-03-23T19:28:10.297554+0000 1=0+1) n(v1 rc2023-03-23T19:28:10.481550+0000 b6291456 2=1+1) hs=0+0,ss=0+0 | subtree=1 0x5591337c8d00]
   -39> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 10 mds.0.cache open_ino_batch_submit
   -38> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 10 mds.0.cache.dir(0x1) fetch_keys 1 keys on [dir 0x1 / [2,head] auth v=12 cv=0/0 dir_auth=0 state=1073741824 f(v0 m2023-03-23T19:28:10.297554+0000 1=0+1) n(v1 rc2023-03-23T19:28:10.481550+0000 b6291456 2=1+1) hs=0+0,ss=0+0 | subtree=1 0x5591337c8d00]
   -37> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 10 mds.0.cache.dir(0x1) auth_pin by 0x5591337c8d00 on [dir 0x1 / [2,head] auth v=12 cv=0/0 dir_auth=0 ap=1+0 state=1073741824 f(v0 m2023-03-23T19:28:10.297554+0000 1=0+1) n(v1 rc2023-03-23T19:28:10.481550+0000 b6291456 2=1+1) hs=0+0,ss=0+0 | dnwaiter=1 subtree=1 authpin=1 0x5591337c8d00] count now 1
   -36> 2023-03-23T19:41:16.405+0000 7f9a85ad7700  1 -- [v2:172.21.15.119:6836/2518833919,v1:172.21.15.119:6837/2518833919] --> [v2:172.21.15.148:6824/33963,v1:172.21.15.148:6825/33963] -- osd_op(unknown.0.366:24 14.1f 14:ff5b34d6:::1.00000000:head [omap-get-header,omap-get-vals-by-keys in=19b,getxattr parent in=6b] snapc 0=[] ondisk+read+known_if_redirected+full_force+supports_pool_eio e145) v8 -- 0x5591345de800 con 0x5591345df000
   -35> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 20 mds.0.bal hit_dir 3 pop is 1, frag * size 1 [pop IRD:[C 0.00e+00] IWR:[C 0.00e+00] RDR:[C 0.00e+00] FET:[C 1.00e+00] STR:[C 0.00e+00] *LOAD:2.0]
   -34> 2023-03-23T19:41:16.405+0000 7f9a85ad7700  1 mds.0.366 rejoin_joint_start
   -33> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 10 mds.0.cache rejoin_send_rejoins with recovery_set
   -32> 2023-03-23T19:41:16.405+0000 7f9a85ad7700  7 mds.0.cache rejoin_send_rejoins still processing imported caps, delaying
   -31> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 10 notify_mdsmap: mds.metrics
   -30> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 10 notify_mdsmap: mds.metrics: rank0 is unavailable
   -29> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 10 reset_seq: mds.metrics: last_updated_seq=0
   -28> 2023-03-23T19:41:16.405+0000 7f9a85ad7700 20 set_next_seq: mds.metrics: current sequence number 0, setting next sequence number 0
   -27> 2023-03-23T19:41:16.405+0000 7f9a88add700  1 -- [v2:172.21.15.119:6836/2518833919,v1:172.21.15.119:6837/2518833919] <== osd.7 v2:172.21.15.148:6824/33963 5 ==== osd_op_reply(24 1.00000000 [omap-get-header out=80b,omap-get-vals-by-keys out=523b,getxattr (30) out=30b] v0'0 uv543 ondisk = 0) v8 ==== 238+0+633 (crc 0 0 0) 0x5591338aafc0 con 0x5591345df000
   -26> 2023-03-23T19:41:16.405+0000 7f9a7facb700 10 MDSIOContextBase::complete: 21C_IO_Dir_OMAP_Fetched
   -25> 2023-03-23T19:41:16.405+0000 7f9a7facb700 10 MDSContext::complete: 21C_IO_Dir_OMAP_Fetched
   -24> 2023-03-23T19:41:16.405+0000 7f9a7facb700 10 mds.0.cache.dir(0x1) _fetched header 80 bytes 1 keys for [dir 0x1 / [2,head] auth v=12 cv=0/0 dir_auth=0 ap=1+0 state=1073741824 f(v0 m2023-03-23T19:28:10.297554+0000 1=0+1) n(v1 rc2023-03-23T19:28:10.481550+0000 b6291456 2=1+1) hs=0+0,ss=0+0 | dnwaiter=1 subtree=1 authpin=1 0x5591337c8d00]
   -23> 2023-03-23T19:41:16.405+0000 7f9a7facb700 -1 mds.0.cache.dir(0x1) Corrupt fnode in dirfrag 0x1: void fnode_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) no longer understand old encoding version 3 < 101: Malformed input [buffer:3]
   -22> 2023-03-23T19:41:16.405+0000 7f9a7facb700  0 log_channel(cluster) log [WRN] : Corrupt fnode header in 0x1: void fnode_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) no longer understand old encoding version 3 < 101: Malformed input [buffer:3] ()
   -21> 2023-03-23T19:41:16.405+0000 7f9a7facb700 10 mds.0.cache.dir(0x1) go_bad *
   -20> 2023-03-23T19:41:16.405+0000 7f9a7facb700 -1 mds.0.damage notify_dirfrag Damage to fragment * of ino 0x1 is fatal because it is a system directory for this rank
   -19> 2023-03-23T19:41:16.405+0000 7f9a7facb700  5 mds.beacon.c set_want_state: up:rejoin -> down:damaged
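
The actual check lives in C++ (`fnode_t::decode` and Ceph's versioned-encoding macros); the following Python sketch is only illustrative of the version guard that fires above, where the deliberately corrupted dirfrag header yields an encoding version of 3 while the decoder requires at least 101:

```python
# Illustrative sketch only, not Ceph code: a toy versioned decoder whose
# first byte plays the role of struct_v. Corrupting the on-disk header makes
# that byte nonsense, so the guard rejects it as "Malformed input", which is
# what drives go_bad / notify_dirfrag and the down:damaged transition above.

class MalformedInput(Exception):
    pass

def decode_versioned(buf: bytes, oldest_supported: int) -> dict:
    """Decode a toy versioned blob: byte 0 is the encoding version."""
    if not buf:
        raise MalformedInput("empty buffer")
    struct_v = buf[0]
    if struct_v < oldest_supported:
        raise MalformedInput(
            "no longer understand old encoding version "
            "{} < {}".format(struct_v, oldest_supported)
        )
    return {"struct_v": struct_v, "payload": buf[1:]}

# A corrupted header whose first byte happens to decode as version 3:
try:
    decode_versioned(bytes([3, 0xDE, 0xAD]), oldest_supported=101)
except MalformedInput as e:
    print(e)  # mirrors the "3 < 101" complaint in the log
```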

#5 Updated by Venky Shankar 9 months ago

Sorry for the delay - back to working on this.
