Bug #44798

closed

librados mon_command (mgr) command hang

Added by Sage Weil about 4 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: octopus,nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

- mon starts
- mgr starts
- mgr fetches mon metadata
- more mons are added to the cluster (post-bootstrap)
- librados app/test starts
- call to librados mon_command with a mgr command (e.g., 'pg deepscrub')
- MonClient sends to a peon mon (not the bootstrap mon)
- peon forwards to mgr
- mgr rejects connection with

2020-03-28T21:12:26.479+0000 7f12d7cdf700  2 mgr.server handle_open ignoring open from mon.c 172.21.15.37:0/139366808; not ready for session (expect reconnect)

This is reproduced by a fix to qa/tasks/cephadm.py that puts all mons in the ceph.conf instead of just the bootstrap mon, so these mgr commands frequently go to a peon.
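For illustration, the client side of the sequence above could look roughly like the following sketch using the Python rados bindings. It is not the test that hit the hang. The helper names, the pgid "1.0", the conffile path, and the JSON prefix "pg deep-scrub" (my reading of the 'pg deepscrub' mentioned above) are all assumptions.

```python
import json


def build_mgr_command(pgid):
    """Build the JSON command string for a mgr-handled command.

    Assumption: the 'pg deepscrub' in the report corresponds to the
    "pg deep-scrub" prefix with a "pgid" argument.
    """
    return json.dumps({"prefix": "pg deep-scrub", "pgid": pgid})


def send_mgr_command(cluster, pgid):
    """Send the command via mon_command on a connected cluster handle.

    The command goes to whichever mon the client is connected to; if
    that mon is a peon, it forwards the command to the mgr. In the
    scenario described above, this is the call that would not return.
    """
    cmd = build_mgr_command(pgid)
    # mon_command returns a (return code, output bytes, status string) tuple.
    return cluster.mon_command(cmd, b"")


def run_against_cluster(conffile="/etc/ceph/ceph.conf", pgid="1.0"):
    """Hypothetical driver; needs python3-rados and a reachable cluster."""
    import rados  # imported here so the helpers above work without it

    cluster = rados.Rados(conffile=conffile)
    cluster.connect()
    try:
        return send_mgr_command(cluster, pgid)
    finally:
        cluster.shutdown()
```

Running run_against_cluster() with a ceph.conf that lists all mons makes it likely that the client talks to a peon, which is the condition described above.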


Related issues 2 (0 open, 2 closed)

Copied to RADOS - Backport #44835: nautilus: librados mon_command (mgr) command hang (Rejected)
Copied to RADOS - Backport #44836: octopus: librados mon_command (mgr) command hang (Resolved, Nathan Cutler)
#1

Updated by Sage Weil about 4 years ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 34266
#2

Updated by Sage Weil about 4 years ago

  • Backport set to octopus,nautilus
#3

Updated by Sage Weil about 4 years ago

  • Status changed from Fix Under Review to Pending Backport
#4

Updated by Nathan Cutler about 4 years ago

  • Copied to Backport #44835: nautilus: librados mon_command (mgr) command hang added
#5

Updated by Nathan Cutler about 4 years ago

  • Copied to Backport #44836: octopus: librados mon_command (mgr) command hang added
#6

Updated by Loïc Dachary over 2 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
