Bug #45737 (closed)

Module 'cephadm' has failed: cannot send (already closed?)

Added by Alain Deleglise almost 4 years ago. Updated almost 4 years ago.

Status:
Duplicate
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi,

I have a development cluster running on 4 VMs. They're all running CentOS 8 Stream and were bootstrapped using cephadm (Octopus).

We were running tests, and I stopped VMs randomly to see how the cluster would react.

When I switched the VMs back on, the cluster was in a warning state, saying that 2 nodes didn't pass the checks. So I ran `ceph cephadm check-host HOST`: host1 was fine, but host2 gave a Python stack trace; unfortunately I did not copy it.

Now `ceph cephadm` commands return: Error EIO: Module 'cephadm' has experienced an error and cannot handle commands: cannot send (already closed?)

ceph -v
ceph version 15.2.2 (0c857e985a29d90501a285f242ea9c008df49eb8) octopus (stable)

cat /etc/centos-release
CentOS Linux release 8.1.1911 (Core)
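
For reference, a minimal sketch of the sequence described above (host1 and host2 are the two hosts mentioned; the exact warning text on this cluster was not captured):

ceph health detail              # showed the warning about hosts failing the cephadm check
ceph cephadm check-host host1   # passed
ceph cephadm check-host host2   # raised a Python traceback (not captured)
ceph orch host ls               # any further cephadm/orchestrator command now fails with:
                                # Error EIO: Module 'cephadm' has experienced an error and cannot
                                # handle commands: cannot send (already closed?)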


Related issues 1 (0 open, 1 closed)

Related to Orchestrator - Bug #45627: cephadm: frequently getting `1 hosts fail cephadm check` (Resolved, assigned to Matthew Oliver)

Actions #1

Updated by Sebastian Wagner almost 4 years ago

  • Project changed from Ceph to Orchestrator
  • Category deleted (ceph cli)
Actions #2

Updated by Sebastian Wagner almost 4 years ago

  • Related to Bug #45627: cephadm: frequently getting `1 hosts fail cephadm check` added
Actions #3

Updated by Sebastian Wagner almost 4 years ago

  • Status changed from New to Duplicate
Actions #4

Updated by Sebastian Wagner almost 4 years ago

  • Target version deleted (v15.2.2)
Actions #5

Updated by Alain Deleglise almost 4 years ago

Hi,

So besides the fact that this is a duplicate of an issue whose fix is waiting for review, what should I do in the meantime?

I mean, is it fixable in some way? Should I wait for the fix to be released and update the components? Right now my cluster is a development cluster, so it's not crucial, but one can easily imagine this happening in production; what is the recommended approach in such a situation?

Thanks
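
As a general note, not specific to this tracker: a commonly used way to recover from a crashed mgr module, assuming a standby mgr is available, is to fail over or reload the mgr so that the cephadm module is restarted. A minimal sketch, with MGR_NAME standing in for the active mgr reported by `ceph -s`:

ceph -s                          # note the currently active mgr
ceph mgr fail MGR_NAME           # fail over to a standby mgr; the cephadm module is reloaded there
ceph mgr module disable cephadm  # alternative: disable ...
ceph mgr module enable cephadm   # ... and re-enable the module on the current mgr

This only restarts the module; the underlying host-check failure itself is tracked in #45627.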

