Bug #45961
cephadm: high load and slow disk make "cephadm bootstrap" fail
Status: Resolved
Priority: Normal
Assignee:
Category: cephadm (binary)
Target version:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Description
When running "cephadm bootstrap" in a libvirt-based virtual environment (four VMs) on an overloaded host that has a single 2 TB spinning disk:
uptime 16:30:05 up 1 day 6:56, 2 users, load average: 19.45, 17.69, 14.18
cephadm bootstrap fails with the following in the log:
INFO:cephadm:Waiting for mon to start...
INFO:cephadm:Waiting for mon...
DEBUG:cephadm:Running command: /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:/usr/bin/ceph:timeout after 30 seconds
INFO:cephadm:Non-zero exit code -9 from /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:mon not available, waiting (1/10)...
DEBUG:cephadm:Running command: /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:/usr/bin/ceph:timeout after 30 seconds
INFO:cephadm:Non-zero exit code -9 from /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:mon not available, waiting (2/10)...
DEBUG:cephadm:Running command: /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:/usr/bin/ceph:timeout after 30 seconds
INFO:cephadm:Non-zero exit code -9 from /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:mon not available, waiting (3/10)...
DEBUG:cephadm:Running command: /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:/usr/bin/ceph:timeout after 30 seconds
INFO:cephadm:Non-zero exit code -9 from /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:mon not available, waiting (4/10)...
DEBUG:cephadm:Running command: /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:/usr/bin/ceph:timeout after 30 seconds
INFO:cephadm:Non-zero exit code -9 from /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:mon not available, waiting (5/10)...
DEBUG:cephadm:Running command: /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:/usr/bin/ceph:timeout after 30 seconds
INFO:cephadm:Non-zero exit code -9 from /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:mon not available, waiting (6/10)...
DEBUG:cephadm:Running command: /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
DEBUG:cephadm:/usr/bin/ceph:stdout cluster:
DEBUG:cephadm:/usr/bin/ceph:stdout id: 3bc1642a-aa51-11ea-93cd-525400336add
DEBUG:cephadm:/usr/bin/ceph:stdout health: HEALTH_OK
DEBUG:cephadm:/usr/bin/ceph:stdout
DEBUG:cephadm:/usr/bin/ceph:stdout services:
DEBUG:cephadm:/usr/bin/ceph:stdout mon: 1 daemons, quorum node1 (age 76s)
DEBUG:cephadm:/usr/bin/ceph:stdout mgr: no daemons active
DEBUG:cephadm:/usr/bin/ceph:stdout osd: 0 osds: 0 up, 0 in
DEBUG:cephadm:/usr/bin/ceph:stdout
DEBUG:cephadm:/usr/bin/ceph:stdout data:
DEBUG:cephadm:/usr/bin/ceph:stdout pools: 0 pools, 0 pgs
DEBUG:cephadm:/usr/bin/ceph:stdout objects: 0 objects, 0 B
DEBUG:cephadm:/usr/bin/ceph:stdout usage: 0 B used, 0 B / 0 B avail
DEBUG:cephadm:/usr/bin/ceph:stdout pgs:
DEBUG:cephadm:/usr/bin/ceph:stdout
INFO:cephadm:/usr/bin/ceph:timeout after 30 seconds
INFO:cephadm:Non-zero exit code -9 from /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:/usr/bin/ceph:stdout cluster:
INFO:cephadm:/usr/bin/ceph:stdout id: 3bc1642a-aa51-11ea-93cd-525400336add
INFO:cephadm:/usr/bin/ceph:stdout health: HEALTH_OK
INFO:cephadm:/usr/bin/ceph:stdout
INFO:cephadm:/usr/bin/ceph:stdout services:
INFO:cephadm:/usr/bin/ceph:stdout mon: 1 daemons, quorum node1 (age 76s)
INFO:cephadm:/usr/bin/ceph:stdout mgr: no daemons active
INFO:cephadm:/usr/bin/ceph:stdout osd: 0 osds: 0 up, 0 in
INFO:cephadm:/usr/bin/ceph:stdout
INFO:cephadm:/usr/bin/ceph:stdout data:
INFO:cephadm:/usr/bin/ceph:stdout pools: 0 pools, 0 pgs
INFO:cephadm:/usr/bin/ceph:stdout objects: 0 objects, 0 B
INFO:cephadm:/usr/bin/ceph:stdout usage: 0 B used, 0 B / 0 B avail
INFO:cephadm:/usr/bin/ceph:stdout pgs:
INFO:cephadm:/usr/bin/ceph:stdout
INFO:cephadm:mon not available, waiting (7/10)...
DEBUG:cephadm:Running command: /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:/usr/bin/ceph:timeout after 30 seconds
INFO:cephadm:Non-zero exit code -9 from /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:mon not available, waiting (8/10)...
DEBUG:cephadm:Running command: /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:/usr/bin/ceph:timeout after 30 seconds
INFO:cephadm:Non-zero exit code -9 from /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:mon not available, waiting (9/10)...
DEBUG:cephadm:Running command: /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:/usr/bin/ceph:timeout after 30 seconds
INFO:cephadm:Non-zero exit code -9 from /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:mon not available, waiting (10/10)...
DEBUG:cephadm:Running command: /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
DEBUG:cephadm:/usr/bin/ceph:stdout cluster:
DEBUG:cephadm:/usr/bin/ceph:stdout id: 3bc1642a-aa51-11ea-93cd-525400336add
DEBUG:cephadm:/usr/bin/ceph:stdout health: HEALTH_OK
DEBUG:cephadm:/usr/bin/ceph:stdout
DEBUG:cephadm:/usr/bin/ceph:stdout services:
DEBUG:cephadm:/usr/bin/ceph:stdout mon: 1 daemons, quorum node1 (age 4m)
DEBUG:cephadm:/usr/bin/ceph:stdout mgr: no daemons active
DEBUG:cephadm:/usr/bin/ceph:stdout osd: 0 osds: 0 up, 0 in
DEBUG:cephadm:/usr/bin/ceph:stdout
DEBUG:cephadm:/usr/bin/ceph:stdout data:
DEBUG:cephadm:/usr/bin/ceph:stdout pools: 0 pools, 0 pgs
DEBUG:cephadm:/usr/bin/ceph:stdout objects: 0 objects, 0 B
DEBUG:cephadm:/usr/bin/ceph:stdout usage: 0 B used, 0 B / 0 B avail
DEBUG:cephadm:/usr/bin/ceph:stdout pgs:
DEBUG:cephadm:/usr/bin/ceph:stdout
INFO:cephadm:/usr/bin/ceph:timeout after 30 seconds
INFO:cephadm:Non-zero exit code -9 from /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph -e NODE_NAME=node1 -v /var/lib/ceph/3bc1642a-aa51-11ea-93cd-525400336add/mon.node1:/var/lib/ceph/mon/ceph-node1:z -v /tmp/ceph-tmph797_v0x:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpr2d31del:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph registry.suse.de/devel/storage/7.0/containers/ses/7/ceph/ceph status
INFO:cephadm:/usr/bin/ceph:stdout cluster:
INFO:cephadm:/usr/bin/ceph:stdout id: 3bc1642a-aa51-11ea-93cd-525400336add
INFO:cephadm:/usr/bin/ceph:stdout health: HEALTH_OK
INFO:cephadm:/usr/bin/ceph:stdout
INFO:cephadm:/usr/bin/ceph:stdout services:
INFO:cephadm:/usr/bin/ceph:stdout mon: 1 daemons, quorum node1 (age 4m)
INFO:cephadm:/usr/bin/ceph:stdout mgr: no daemons active
INFO:cephadm:/usr/bin/ceph:stdout osd: 0 osds: 0 up, 0 in
INFO:cephadm:/usr/bin/ceph:stdout
INFO:cephadm:/usr/bin/ceph:stdout data:
INFO:cephadm:/usr/bin/ceph:stdout pools: 0 pools, 0 pgs
INFO:cephadm:/usr/bin/ceph:stdout objects: 0 objects, 0 B
INFO:cephadm:/usr/bin/ceph:stdout usage: 0 B used, 0 B / 0 B avail
INFO:cephadm:/usr/bin/ceph:stdout pgs:
INFO:cephadm:/usr/bin/ceph:stdout
Traceback (most recent call last):
  File "/usr/sbin/cephadm", line 4684, in <module>
    r = args.func()
  File "/usr/sbin/cephadm", line 1153, in _default_image
    return func()
  File "/usr/sbin/cephadm", line 2473, in command_bootstrap
    is_available('mon', is_mon_available)
  File "/usr/sbin/cephadm", line 900, in is_available
    % (what, retry))
__main__.Error: mon not available after 10 tries
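The log above follows a recognizable pattern: each "ceph status" probe is killed after 30 seconds (exit code -9, i.e. SIGKILL), and bootstrap gives up after 10 failed probes. The following is a minimal Python sketch of that pattern, not cephadm's actual code. The names `is_available` and `Error` appear in the traceback, but their signatures here are assumptions, and `run_with_timeout` and the `retries` and `delay` parameters are hypothetical. It also illustrates how a probe can print complete output (as in tries 7 and 10 above, which show HEALTH_OK) and still count as a failure when the process does not exit before the deadline.

```python
import subprocess
import sys
import time


class Error(Exception):
    """Raised when a service never becomes available (name taken from the traceback)."""


def run_with_timeout(cmd, timeout):
    """Run cmd and return (output, returncode).

    Hypothetical helper: if the process is still running after `timeout`
    seconds it is killed, so on POSIX the return code is -9 (SIGKILL),
    even if the process already wrote its full output.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    try:
        out, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        # Collect whatever the process managed to print before it was killed.
        out, _ = proc.communicate()
    return out, proc.returncode


def is_available(what, probe, retries=10, delay=1.0):
    """Call probe() up to `retries` times; return the attempt that succeeded.

    Assumed signature: `retries` and `delay` are illustrative parameters.
    """
    for attempt in range(1, retries + 1):
        if probe():
            return attempt
        print("%s not available, waiting (%d/%d)..." % (what, attempt, retries))
        time.sleep(delay)
    raise Error("%s not available after %d tries" % (what, retries))
```

A probe built as `lambda: run_with_timeout(cmd, 30)[1] == 0` fails whenever the command needs more than 30 seconds to exit, regardless of what it printed. On a host as loaded as the one described here, a longer per-probe timeout or more retries would be the obvious knobs to turn in such a loop.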