Bug #55464

cephfs: mds/client error when client stale reconnect

Added by Mer Xuanyi 7 months ago. Updated 7 months ago.

Status: In Progress
Priority: Normal
Assignee: -
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Community (dev)
Tags:
Backport: quincy, pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): Client, MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Options:
mds_session_blocklist_on_evict: false
mds_session_blocklist_on_timeout: false
client_reconnect_stale: true

We expect the client to keep working when the MDS reboots (or when the session goes stale because the network is temporarily unavailable).
In practice, this can lead to an MDS or client crash.
When the MDS reboots into the RECONNECT phase, the client detects the state change from the mdsmap and calls Client::send_reconnect.
In Client::send_reconnect, the client sends session->unsafe_requests, the unprocessed mds_requests, and the client_reconnect message.
If the reconnect times out, the MDS calls kill_session to evict the client (without blocklisting it). The client then calls Client::kick_requests_closed from Client::_closed_mds_session, which kicks and removes all in-flight requests. Looks nice, right? But Client::make_request checks whether request->reply is set and, if it is not, tries to resend the request once the MDS is active and the session has been reopened. That can lead to lots of errors.
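
The client-side sequence described above can be sketched with a small toy model. This is only an illustration under simplifying assumptions, not the Ceph client code: the Request and ToyClient classes, their fields, and retry_after_reopen are hypothetical names, and the rule "drop requests that got an early reply, resend the ones with no reply" is taken from the scenario in this report.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    tid: int
    op: str
    got_unsafe: bool = False  # an early (unsafe) reply was received
    reply: bool = False       # a final reply was received
    kick: bool = False        # the waiter was woken by a session close


@dataclass
class ToyClient:
    # Hypothetical stand-in for the client's table of in-flight requests.
    mds_requests: dict = field(default_factory=dict)
    sent: list = field(default_factory=list)

    def register(self, req):
        self.mds_requests[req.tid] = req

    def kick_requests_closed(self):
        """Session was closed by eviction: drop requests that already got
        an early reply, and kick the waiters of the remaining ones."""
        for tid in list(self.mds_requests):
            req = self.mds_requests[tid]
            if req.got_unsafe:
                del self.mds_requests[tid]
            else:
                req.kick = True

    def retry_after_reopen(self):
        """Models the retry loop: once the session is reopened, every
        request that still has no reply is sent again."""
        for req in self.mds_requests.values():
            if not req.reply:
                req.kick = False
                self.sent.append(req.tid)
        return self.sent


client = ToyClient()
client.register(Request(1, "mkdir test_dir", got_unsafe=True))
client.register(Request(2, "create test_dir/test_file"))
client.kick_requests_closed()
resent = client.retry_after_reopen()
print(sorted(client.mds_requests), resent)
```

In this model the request that depends on the dropped one is the only one resent, which is the situation the rest of this report is about.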

One typical situation: we have two requests, req.1 (mkdir test_dir) and req.2 (touch test_dir/test_file), and req.1 has received an early_reply.
When the MDS reboots but the client's reconnect times out, the client drops req.1. req.2 is resent once the MDS state changes to active and the session is reopened, but the Server can't process this request correctly because the ino of test_dir does not really exist. Finally the MDS tells the client that the ino of test_dir is stale, and the client retries req.2 in an infinite loop.
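
The req.1/req.2 scenario can be replayed against a toy MDS to show why the resend can never succeed. This is a sketch under stated assumptions, not real MDS behavior: ToyMDS, its journaled/in_memory split, and resend_until_done are hypothetical, and the retry loop is capped at max_attempts only so that the example terminates.

```python
import errno


class ToyMDS:
    """Hypothetical MDS model: an early reply is sent before the
    operation is journaled, so a reboot loses it."""

    def __init__(self):
        self.journaled = {}   # ino -> name, survives a reboot
        self.in_memory = {}   # ino -> name, lost on reboot
        self.next_ino = 0x1000

    def mkdir(self, name):
        ino = self.next_ino
        self.next_ino += 1
        self.in_memory[ino] = name  # early reply, not yet journaled
        return ino

    def reboot(self):
        # Only journaled state is recovered after a restart.
        self.in_memory = dict(self.journaled)

    def create(self, parent_ino, name):
        if parent_ino not in self.in_memory:
            return -errno.ESTALE  # parent ino is unknown to this MDS
        return 0


def resend_until_done(mds, parent_ino, name, max_attempts=5):
    """Resend the create until it succeeds; the cap is an assumption
    added here so the demonstration stops."""
    results = []
    for _ in range(max_attempts):
        rc = mds.create(parent_ino, name)
        results.append(rc)
        if rc == 0:
            break
    return results


mds = ToyMDS()
test_dir_ino = mds.mkdir("test_dir")  # req.1: early reply only
mds.reboot()                          # req.1 is lost; the client drops it
results = resend_until_done(mds, test_dir_ino, "test_file")  # req.2
print(results)
```

Every attempt returns -ESTALE because nothing in the retry path recreates test_dir, so without the cap the loop would not end.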

Possible Problems from this bug:

1. stale ino
2. inconsistent client cache / client crash (the ino is added to inode_map when the MDS early_reply is handled, but is later updated for a different request after the MDS reboots)
3. inconsistent objecter state (the client writes data once it gets the early_reply, but the request is dropped when the MDS reboots)
4. mds crash (when the MDS allocates an ino, it finds that the ino is already in inode_map -- only seen in jewel, where a special OPEN event is journaled after mkdir)

PRs #29095 and #30969 (removed by commit a7a1b0a3) solved part of this problem, but they only take effect while the MDS has not yet switched to active.

History

#1 Updated by Venky Shankar 7 months ago

  • Category set to Correctness/Safety
  • Status changed from New to In Progress
  • Backport set to quincy, pacific
  • Pull request ID set to 46050
