Bug #45167

cephadm: mons are not properly deployed

Added by Sebastian Wagner 6 months ago. Updated 4 months ago.

Status: New
Priority: Low
Category: cephadm (binary)
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

cephadm run --name mon.hostname --fsid myfsid
INFO:cephadm:Using recent ceph image ceph/ceph:v15
debug 2020-04-21T17:13:54.335+0000 7fed2debf6c0  0 set uid:gid to 167:167 (ceph:ceph)
debug 2020-04-21T17:13:54.335+0000 7fed2debf6c0  0 ceph version 15.2.1 (9fd2f65f91d9246fae2c841a6222d34d121680ee) octopus (stable), process ceph-mon, pid 1
debug 2020-04-21T17:13:54.335+0000 7fed2debf6c0  0 pidfile_write: ignore empty --pid-file
debug 2020-04-21T17:13:54.343+0000 7fed2debf6c0  0 load: jerasure load: lrc load: isa
debug 2020-04-21T17:13:54.347+0000 7fed2debf6c0 -1 Invalid argument: /var/lib/ceph/mon/hostname/store.db: does not exist (create_if_missing is false)

debug 2020-04-21T17:13:54.347+0000 7fed2debf6c0 -1 error opening mon data directory at '/var/lib/ceph/mon/hostname': (22) Invalid argument

We need to be able to recover from this error.

Afterwards, when I run `ceph orch apply mon hostname`, the `/var/lib/ceph/<fsid>` directory gets removed. When I re-add it, the directory comes back, but empty, and `ceph orch ls` shows this:

mon                1/2  0s ago     4m   storage-14b-0,storage-14b-1  mix                       mix
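
To spot this state from a script, a row like the one above can be parsed for its running/expected count. This is only a minimal sketch: `parse_running` is a hypothetical helper, and it assumes the second whitespace-separated column is always `running/expected`, as it is in the row shown here.

```python
def parse_running(row):
    """Parse one `ceph orch ls` row into (service, running, expected).

    Assumption: the second column has the form "running/expected".
    """
    fields = row.split()
    running, expected = fields[1].split("/")
    return fields[0], int(running), int(expected)


# The row reported in this issue.
row = "mon                1/2  0s ago     4m   storage-14b-0,storage-14b-1  mix                       mix"

name, running, expected = parse_running(row)
if running < expected:
    print(f"{name}: only {running} of {expected} daemons running")
```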

Related issues

Related to Orchestrator - Bug #45235: cephadm: mons are not properly undeployed (Can't reproduce)
Copied from Orchestrator - Documentation #45165: cephadm troubleshooting: recover from broken daemons (New)

History

#1 Updated by Sebastian Wagner 6 months ago

#2 Updated by Sebastian Wagner 6 months ago

  • Tracker changed from Documentation to Bug
  • Status changed from New to In Progress
  • Assignee set to Sebastian Wagner
  • Regression set to No
  • Severity set to 3 - minor

can confirm:

root@ubuntu:/var/lib/ceph/a9df56ad-29b0-4bf4-8e47-a35c7657b332/mon.ubuntu# ls -lR
.:
total 32
-rw------- 1  167  167  210 Apr 23 13:28 config
-rw------- 1  167  167   77 Apr 23 13:28 keyring
-rw------- 1  167  167    8 Apr 23 13:28 kv_backend
drwxr-xr-x 2  167  167 4096 Apr 23 13:28 store.db
-rw------- 1  167  167   38 Apr 23 13:28 unit.configured
-rw------- 1  167  167   48 Apr 23 13:28 unit.created
-rw------- 1 root root   47 Apr 23 13:28 unit.image
-rw------- 1 root root    0 Apr 23 13:28 unit.poststop
-rw------- 1 root root 1054 Apr 23 13:28 unit.run

./store.db:
total 0
-rw-r--r-- 1 167 167 0 Apr 23 13:28 LOCK
root@ubuntu:/var/lib/ceph/a9df56ad-29b0-4bf4-8e47-a35c7657b332/mon.ubuntu# bash unit.run 
debug 2020-04-23T12:23:07.797+0000 7f3ca4bdb6c0  0 set uid:gid to 167:167 (ceph:ceph)
debug 2020-04-23T12:23:07.797+0000 7f3ca4bdb6c0  0 ceph version 16.0.0-901-g713ef3c (713ef3c7e762ef9a07171570426c6462d3ecf2d6) pacific (dev), process ceph-mon, pid 1
debug 2020-04-23T12:23:07.797+0000 7f3ca4bdb6c0  0 pidfile_write: ignore empty --pid-file
debug 2020-04-23T12:23:07.801+0000 7f3ca4bdb6c0  0 load: jerasure load: lrc load: isa 
debug 2020-04-23T12:23:07.801+0000 7f3ca4bdb6c0 -1 Invalid argument: /var/lib/ceph/mon/ceph-ubuntu/store.db: does not exist (create_if_missing is false)

debug 2020-04-23T12:23:07.801+0000 7f3ca4bdb6c0 -1 error opening mon data directory at '/var/lib/ceph/mon/ceph-ubuntu': (22) Invalid argument
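
The listing above shows a `store.db` directory that contains only a zero-byte `LOCK` file. A rough check for that state could look like the sketch below. `mon_store_looks_empty` is a hypothetical helper, not part of cephadm, and it assumes that a populated mon store always holds more than the `LOCK` file.

```python
import os
import tempfile


def mon_store_looks_empty(mon_dir):
    """Return True if mon_dir/store.db is missing or holds nothing but LOCK."""
    store = os.path.join(mon_dir, "store.db")
    if not os.path.isdir(store):
        return True
    # An empty directory or one with only LOCK counts as uninitialized.
    return set(os.listdir(store)) <= {"LOCK"}


if __name__ == "__main__":
    # Recreate the layout from the listing in a scratch directory.
    with tempfile.TemporaryDirectory() as mon_dir:
        os.mkdir(os.path.join(mon_dir, "store.db"))
        open(os.path.join(mon_dir, "store.db", "LOCK"), "w").close()
        print(mon_store_looks_empty(mon_dir))  # prints True
```

On a real host the function would be pointed at a directory such as `/var/lib/ceph/<fsid>/mon.<hostname>`.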

#3 Updated by Sebastian Wagner 6 months ago

  • Status changed from In Progress to New

hm. this needs more investigation

#4 Updated by Sebastian Wagner 5 months ago

  • Related to Bug #45235: cephadm: mons are not properly undeployed added

#6 Updated by Sebastian Wagner 4 months ago

  • Priority changed from High to Low

low, until it happens again.
