Bug #46230

closed

front_addr and cluster_network leads to hanging MGR

Added by Michal Nasiadka almost 4 years ago. Updated almost 4 years ago.

Status: Won't Fix
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

For the past day or so, all ceph orch commands have been failing, although all other ceph commands run fine.

Even after setting debug_mgr = 5, the only entry in the mgr log is:

2020-06-26T12:19:40.608+0000 7fd53ae0c700  0 log_channel(audit) log [DBG] : from='client.994212 -' entity='client.admin' cmd=[{"prefix": "orch status", "target": ["mon-mgr", ""]}]: dispatch

Files

ceph-mgr.strace (622 KB), Michal Nasiadka, 06/26/2020 04:03 PM
ceph-mgr-3.log (742 KB), Michal Nasiadka, 06/26/2020 04:09 PM
config-key.dump (440 KB), Michal Nasiadka, 06/26/2020 04:20 PM

Related issues 1 (1 open, 0 closed)

Related to RADOS - Feature #46238: raise a HEALTH warn, if OSDs use the cluster_network for the front (New)

Actions #1

Updated by Michal Nasiadka almost 4 years ago

The ceph cephadm * commands also hang.

Actions #2

Updated by Sebastian Wagner almost 4 years ago

  • Description updated (diff)

Actions #5

Updated by Michal Nasiadka almost 4 years ago

I noticed that some of my OSDs had their front_addr in the cluster_network, which is not accessible from the Ceph monitors.
A simple restart of those OSDs helped: their front_addr moved back into the public_network, and everything started working again.
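
For anyone hitting the same symptom, the check described above can be sketched roughly as follows. This is a minimal illustration, not part of the original report. It assumes the per-OSD metadata has already been loaded as a list of dicts with "id" and "front_addr" keys (for example from the JSON output of ceph osd metadata), and that front_addr holds IPv4 addresses in a form like "[v2:IP:PORT/NONCE,v1:IP:PORT/NONCE]" or "IP:PORT/NONCE". The function name, sample addresses, and networks are hypothetical.

```python
import ipaddress
import re

# Matches "IPv4:port" inside strings such as
# "[v2:192.168.1.10:6800/1,v1:192.168.1.10:6801/1]" or "192.168.1.10:6800/1".
# Assumption: IPv4 only; IPv6 addresses are not handled by this sketch.
_IPV4_WITH_PORT = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3}):\d+")


def osds_outside_public_network(osd_metadata, public_network):
    """Return the ids of OSDs whose front_addr is not inside public_network.

    osd_metadata: list of dicts with "id" and "front_addr" keys
                  (assumed shape, see the note above).
    public_network: CIDR string, e.g. "192.168.1.0/24".
    """
    network = ipaddress.ip_network(public_network)
    offenders = []
    for osd in osd_metadata:
        found = _IPV4_WITH_PORT.findall(osd.get("front_addr", ""))
        addresses = {ipaddress.ip_address(ip) for ip in found}
        if any(addr not in network for addr in addresses):
            offenders.append(osd["id"])
    return offenders


# Hypothetical sample data: OSD 1 advertises a front_addr in a different network.
sample_metadata = [
    {"id": 0, "front_addr": "[v2:192.168.1.10:6800/1,v1:192.168.1.10:6801/1]"},
    {"id": 1, "front_addr": "[v2:10.10.0.11:6800/2,v1:10.10.0.11:6801/2]"},
]
suspect_osds = osds_outside_public_network(sample_metadata, "192.168.1.0/24")
```

With the sample data, suspect_osds contains only OSD 1. In my case, restarting the OSDs flagged this way was enough to get front_addr back into the public_network.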

Actions #6

Updated by Sebastian Wagner almost 4 years ago

  • Related to Feature #46238: raise a HEALTH warn, if OSDs use the cluster_network for the front added

Actions #7

Updated by Sebastian Wagner almost 4 years ago

  • Project changed from Orchestrator to mgr
  • Subject changed from ceph orch commands hang forever to front_addr and cluster_network leads to hanging MGR
  • Category deleted (orchestrator)

Actions #8

Updated by Josh Durgin almost 4 years ago

  • Status changed from New to Won't Fix

Sounds like this was a network configuration issue.
