Bug #46535

open

mds: Importer MDS failing right after EImportStart event is journaled, causes incorrect blacklisting of client session

Added by Sidharth Anupkrishnan almost 4 years ago. Updated almost 2 years ago.

Status:
In Progress
Priority:
Normal
Category:
-
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
pacific,octopus,nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

If an MDS hits mds_kill_import_at = 7 during an import (after EImportStart is journaled but before the ImportAck is sent to the exporter) and a standby MDS subsequently takes over, the imported client session is incorrectly blacklisted. The reason is that before this kill point (https://github.com/sidharthanup/ceph/blob/wip-multimdss-killpoint-test/src/mds/Migrator.cc#L3026) is hit, handle_export_dir() calls prepare_force_open_sessions() (https://github.com/sidharthanup/ceph/blob/wip-multimdss-killpoint-test/src/mds/Migrator.cc#L2699). This call marks the session as open and dirty, and that state is later persisted as part of the journal event EImportStart. During journal replay on the new MDS (in EImportStart::replay()), this information is relayed to the new MDS, which then believes it has an open session with the client, whereas in reality that session was never opened with the client. During up:reconnect, the new MDS tries to reconnect with the client, gets no response, and ends up blacklisting the client.

We probably want to try to force open the dirty session during EImportStart replay.
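
The sequence described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Ceph code: ToyMds, JournalEvent, Client, import_then_crash() and replay_and_reconnect() are hypothetical names invented for illustration, and the force_open_on_replay flag stands in for the proposed behaviour of force-opening the dirty session during EImportStart replay.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy model only: none of these types are Ceph's real classes.
enum class SessionState { Closed, Open };

// Stands in for the journaled EImportStart event.
struct JournalEvent {
  std::map<std::string, SessionState> sessions;  // client name -> journaled state
};

struct Client {
  std::set<int> known_mds;   // ranks the client really has a session with
  bool blacklisted = false;
};

struct ToyMds {
  int rank;
  std::map<std::string, SessionState> sessions;
};

// Importer side: mark the session open (dirty), journal it, then "die" at the
// kill point, i.e. before the client is ever told about the new session.
JournalEvent import_then_crash(ToyMds& importer, const std::string& client_name) {
  importer.sessions[client_name] = SessionState::Open;
  JournalEvent ev{importer.sessions};  // session state is persisted with the event
  return ev;                           // crash here: the client was never contacted
}

// Standby side: replay the journal, then run a simplified reconnect phase.
// force_open_on_replay is a hypothetical switch modelling the proposed fix.
void replay_and_reconnect(ToyMds& standby, const JournalEvent& ev,
                          std::map<std::string, Client>& clients,
                          bool force_open_on_replay) {
  standby.sessions = ev.sessions;  // replay trusts the journaled state
  for (const auto& [name, state] : standby.sessions) {
    if (state != SessionState::Open) continue;
    assert(clients.count(name) == 1);
    Client& c = clients.at(name);
    if (force_open_on_replay) {
      c.known_mds.insert(standby.rank);  // session is really opened with the client
    }
    // A client only answers a rank it actually has a session with.
    if (c.known_mds.count(standby.rank) == 0) {
      c.blacklisted = true;  // no response during reconnect
    }
  }
}
```

In this model, replaying with force_open_on_replay set to false leaves the client blacklisted, while setting it to true does not, which mirrors the reasoning in the description. How a real fix would open the session during replay is not specified in this report.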

Actions #1

Updated by Sidharth Anupkrishnan almost 4 years ago

  • Project changed from Ceph to CephFS
  • Component(FS) MDS added
Actions #2

Updated by Sidharth Anupkrishnan almost 4 years ago

  • Description updated (diff)
Actions #3

Updated by Sidharth Anupkrishnan almost 4 years ago

  • Description updated (diff)
Actions #4

Updated by Sidharth Anupkrishnan almost 4 years ago

  • Description updated (diff)
Actions #5

Updated by Sidharth Anupkrishnan almost 4 years ago

  • Description updated (diff)
Actions #6

Updated by Sidharth Anupkrishnan almost 4 years ago

  • Description updated (diff)
Actions #7

Updated by Patrick Donnelly over 3 years ago

  • Status changed from New to In Progress
  • Assignee set to Sidharth Anupkrishnan
  • Target version set to v16.0.0
  • Source set to Q/A
Actions #8

Updated by Patrick Donnelly over 3 years ago

  • Target version changed from v16.0.0 to v17.0.0
  • Backport set to pacific,octopus,nautilus
Actions #9

Updated by Patrick Donnelly almost 2 years ago

  • Target version deleted (v17.0.0)