Bug #44677

closed

stale scrub status entry from a failed mds shows up in `ceph status`

Added by Venky Shankar about 4 years ago. Updated almost 4 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Administration/Usability
Target version:
% Done: 0%
Source: Community (dev)
Tags:
Backport: nautilus,octopus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This happens intermittently. When an active MDS (mds.b) is terminated, mds.c transitions to active, but the task status section of `ceph -s` still shows a scrub status entry for both MDSs:

ceph -s
...
...

  services:
    mon: 1 daemons, quorum a (age 24s)
    mgr: x(active, since 18s)
    mds: a:1 {0=b=up:active} 2 up:standby
    osd: 3 osds: 3 up (since 18s), 3 in (since 2h)

  task status:
    scrub status:
        mds.b: idle
        mds.c: idle
...
...

ceph-mgr should prune stale entries after `mgr_service_beacon_grace` seconds, but that does not happen. The cause is that when ceph-mgr receives an updated fsmap, it removes the failed MDS from its tracking index (`daemon_state`). However, `DaemonServer::_prune_pending_service_map()` needs the MDS entry to be present in the tracking index in order to prune the corresponding stale entry from the service map. As a result, the stale entries remain in the service map until ceph-mgr is restarted or fails over.
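
The interaction described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not ceph-mgr code: the dictionaries, the function names `prune_needs_index` and `prune_untracked_too`, the `last_beacon` field, and the grace value of 60 are all hypothetical, and the second function is one conceivable remedy rather than a description of the change that was actually merged.

```python
GRACE = 60  # illustrative stand-in for mgr_service_beacon_grace, in seconds


def prune_needs_index(service_map, daemon_state, now, grace=GRACE):
    """Toy version of the reported behaviour: an entry is pruned only if
    it is still present in the tracking index."""
    for name in list(service_map):
        tracked = daemon_state.get(name)
        if tracked is None:
            continue  # no tracking entry, so the stale entry is never pruned
        if now - tracked["last_beacon"] > grace:
            del service_map[name]
    return service_map


def prune_untracked_too(service_map, daemon_state, now, grace=GRACE):
    """Hypothetical remedy (an assumption, not the merged fix): treat a
    missing tracking entry as stale."""
    for name in list(service_map):
        tracked = daemon_state.get(name)
        if tracked is None or now - tracked["last_beacon"] > grace:
            del service_map[name]
    return service_map


# mds.b failed and was dropped from the index after the fsmap update;
# mds.c is alive and last beaconed 10 seconds ago.
daemon_state = {"mds.c": {"last_beacon": 990}}
stale = prune_needs_index({"mds.b": "idle", "mds.c": "idle"}, daemon_state, now=1000)
fixed = prune_untracked_too({"mds.b": "idle", "mds.c": "idle"}, daemon_state, now=1000)
print(sorted(stale))  # ['mds.b', 'mds.c']
print(sorted(fixed))  # ['mds.c']
```

In the first variant the entry for mds.b survives every pruning pass because nothing in the index can mark it as stale, which matches the symptom seen in the `ceph -s` output.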


Related issues 2 (0 open, 2 closed)

Copied to CephFS - Backport #45049: octopus: stale scrub status entry from a failed mds shows up in `ceph status` (Resolved, Nathan Cutler)
Copied to CephFS - Backport #45050: nautilus: stale scrub status entry from a failed mds shows up in `ceph status` (Resolved, Wei-Chung Cheng)
#1

Updated by Venky Shankar about 4 years ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 34281
#2

Updated by Kefu Chai about 4 years ago

  • Backport changed from nautilus to nautilus,octopus
#3

Updated by Greg Farnum about 4 years ago

  • Status changed from Fix Under Review to Pending Backport
#4

Updated by Nathan Cutler about 4 years ago

  • Copied to Backport #45049: octopus: stale scrub status entry from a failed mds shows up in `ceph status` added
#5

Updated by Nathan Cutler about 4 years ago

  • Copied to Backport #45050: nautilus: stale scrub status entry from a failed mds shows up in `ceph status` added
#6

Updated by Nathan Cutler almost 4 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
