Bug #46230: front_addr and cluster_network leads to hanging MGR - mgr - Ceph

Bug #46230

closed

front_addr and cluster_network leads to hanging MGR

Added by Michal Nasiadka almost 4 years ago. Updated almost 4 years ago.

Status: Won't Fix
Priority: Normal
Severity: 2 - major
Description

For the past day or so, all ceph orch commands have been failing, although all other ceph commands run fine.

Even after setting debug_mgr = 5, the only entry in the mgr log is:

2020-06-26T12:19:40.608+0000 7fd53ae0c700  0 log_channel(audit) log [DBG] : from='client.994212 -' entity='client.admin' cmd=[{"prefix": "orch status", "target": ["mon-mgr", ""]}]: dispatch

Files

ceph-mgr.strace (622 KB), Michal Nasiadka, 06/26/2020 04:03 PM
ceph-mgr-3.log (742 KB), Michal Nasiadka, 06/26/2020 04:09 PM
config-key.dump (440 KB), Michal Nasiadka, 06/26/2020 04:20 PM

Related issues 1 (1 open — 0 closed)

Updated by Michal Nasiadka almost 4 years ago

ceph cephadm * commands also hang

Updated by Sebastian Wagner almost 4 years ago

Description updated

Updated by Michal Nasiadka almost 4 years ago

File ceph-mgr.strace added
File ceph-mgr-3.log added

Updated by Michal Nasiadka almost 4 years ago

File config-key.dump added

Updated by Michal Nasiadka almost 4 years ago

I noticed that some of my OSDs had their front_addr in the cluster_network, which is not reachable from the Ceph monitors.
A simple restart of those OSDs helped: their front_addr is back in the public_network, and everything started working again.
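
For anyone wanting to check for the same condition, here is a minimal sketch (not part of the original report) that flags OSDs whose front_addr lies outside the public network. It assumes the metadata has the shape returned by `ceph osd metadata --format json`, with front_addr given as a string like "[v2:IP:PORT/NONCE,v1:IP:PORT/NONCE]", and it handles IPv4 only. The function name, the sample addresses, and the 10.0.0.0/24 network are hypothetical.

```python
import ipaddress
import re

# Matches the IPv4 address in entries such as "v2:10.0.0.11:6800/101".
# Assumption: front_addr uses this "vN:IP:PORT/NONCE" layout.
_ADDR_RE = re.compile(r"(?:v[12]:)?(\d{1,3}(?:\.\d{1,3}){3}):\d+")


def osds_outside_public_network(osd_metadata, public_network):
    """Return ids of OSDs whose front_addr has an IP outside public_network."""
    network = ipaddress.ip_network(public_network)
    misplaced = []
    for osd in osd_metadata:
        ips = _ADDR_RE.findall(osd.get("front_addr", ""))
        if any(ipaddress.ip_address(ip) not in network for ip in ips):
            misplaced.append(osd["id"])
    return misplaced


# Hypothetical sample data; real input would be the parsed JSON output
# of `ceph osd metadata --format json`.
sample = [
    {"id": 0, "front_addr": "[v2:10.0.0.11:6800/101,v1:10.0.0.11:6801/101]"},
    {"id": 1, "front_addr": "[v2:192.168.50.12:6800/202,v1:192.168.50.12:6801/202]"},
]
print(osds_outside_public_network(sample, "10.0.0.0/24"))
```

Any OSD id this reports would be a candidate for the restart described above.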

Updated by Sebastian Wagner almost 4 years ago

Related to Feature #46238: raise a HEALTH warn, if OSDs use the cluster_network for the front added

Updated by Sebastian Wagner almost 4 years ago

Project changed from Orchestrator to mgr
Subject changed from ceph orch commands hang forever to front_addr and cluster_network leads to hanging MGR
Category deleted (orchestrator)

Updated by Josh Durgin almost 4 years ago

Status changed from New to Won't Fix

Sounds like this was a network configuration issue.
