Bug #40100

Missing block.wal and block.db symlinks on restart

Added by Corey Bryant almost 5 years ago. Updated almost 5 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source: -
Tags: -
Backport: -
Regression: No
Severity: 3 - minor
Reviewed: -
Affected Versions: -
ceph-qa-suite: -
Pull request ID: -
Crash signature (v1): -
Crash signature (v2): -

Description

We are tracking a bug in Ubuntu (https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1828617) where a race on system restart causes missing block.wal and block.db symlinks.

There is a loop for each OSD that calls 'ceph-volume lvm trigger' up to 30 times until the OSD is activated, for example:
[2019-05-31 01:27:29,235][ceph_volume.process][INFO ] Running command: ceph-volume lvm trigger 4-7478edfc-f321-40a2-a105-8e8a2c8ca3f6
[2019-05-31 01:27:35,435][ceph_volume.process][INFO ] stderr --> RuntimeError: could not find osd.4 with fsid 7478edfc-f321-40a2-a105-8e8a2c8ca3f6
[2019-05-31 01:27:35,530][systemd][WARNING] command returned non-zero exit status: 1
[2019-05-31 01:27:35,531][systemd][WARNING] failed activating OSD, retries left: 30
[2019-05-31 01:27:44,122][ceph_volume.process][INFO ] stderr --> RuntimeError: could not find osd.4 with fsid 7478edfc-f321-40a2-a105-8e8a2c8ca3f6
[2019-05-31 01:27:44,174][systemd][WARNING] command returned non-zero exit status: 1
[2019-05-31 01:27:44,175][systemd][WARNING] failed activating OSD, retries left: 29
...

The race appears to be that 'ceph-volume lvm trigger' succeeds while the WAL and DB devices are not yet ready:
https://github.com/ceph/ceph/blob/luminous/src/ceph-volume/ceph_volume/systemd/main.py#L93

Then the symlinks don't get set up here:
https://github.com/ceph/ceph/blob/luminous/src/ceph-volume/ceph_volume/devices/lvm/activate.py#L154
https://github.com/ceph/ceph/blob/luminous/src/ceph-volume/ceph_volume/devices/lvm/activate.py#L177

I wonder if we can have similar 'ceph-volume lvm trigger'-ish calls/loops for WAL and DB devices per OSD in src/ceph-volume/ceph_volume/systemd/main.py. We can determine if an OSD has a DB or WAL device from the lvm tags.
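For illustration only, a minimal Python sketch of how the lvm tags could be inspected to decide whether an OSD has separate WAL/DB devices; the helper name and the plain 'lvs' subprocess call are assumptions, not the actual ceph-volume code:

import subprocess

def osd_has_wal_db(osd_id, osd_fsid):
    # Hypothetical helper: inspect the lvm tags the same way the
    # 'lvs -o lv_tags | grep ...' commands in the comments below do, and
    # report whether the OSD's block LV references WAL/DB devices.
    out = subprocess.check_output(
        ['lvs', '--noheadings', '-o', 'lv_tags'], universal_newlines=True)
    for line in out.splitlines():
        tags = line.strip()
        if ('type=block' in tags
                and 'ceph.osd_id=%s' % osd_id in tags
                and 'ceph.osd_fsid=%s' % osd_fsid in tags):
            return ('ceph.wal_device=' in tags, 'ceph.db_device=' in tags)
    return (False, False)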

#1

Updated by Corey Bryant almost 5 years ago

Can we do something along these lines in ceph_volume/systemd/main.py after the existing while loop?

  1. using extra_data in ceph_volume/systemd/main.py, get ceph.wal_device and ceph.db_device from lvs tag with matching ceph.osd_id, ceph.osd_fsid, and type=block
  2. e.g. where extra_data=ceph.osd_id=0-e20dbce0-34f4-46b3-8efc-f41edbcae3d7
    sudo lvs -o lv_tags|grep type=block | grep ceph.osd_id=0 | grep ceph\.osd_fsid=e20dbce0-34f4-46b3-8efc-f41edbcae3d7 | grep ceph\.wal_device
    sudo lvs -o lv_tags|grep type=block | grep ceph.osd_id=0 | grep ceph\.osd_fsid=e20dbce0-34f4-46b3-8efc-f41edbcae3d7 | grep ceph\.db_device
  3. loop until the following is found or CEPH_VOLUME_SYSTEMD_TRIES times
  4. where ceph.wal_device=/dev/ceph-wal-8a073a5b-6e42-43bf-a99d-e30c649362ea/osd-wal-e20dbce0-34f4-46b3-8efc-f41edbcae3d7
    sudo lvs -o lv_tags|grep type=wal | grep ceph.wal_device=/dev/ceph-wal-8a073a5b-6e42-43bf-a99d-e30c649362ea/osd-wal-e20dbce0-34f4-46b3-8efc-f41edbcae3d7
  5. loop until the following is found or CEPH_VOLUME_SYSTEMD_TRIES times
  6. where ceph.db_device=/dev/ceph-db-c37da146-b9a3-4339-bb2f-819f223982d3/osd-db-e20dbce0-34f4-46b3-8efc-f41edbcae3d7
    sudo lvs -o lv_tags|grep type=db | grep ceph.db_device=/dev/ceph-db-c37da146-b9a3-4339-bb2f-819f223982d3/osd-db-e20dbce0-34f4-46b3-8efc-f41edbcae3d7
#2

Updated by Corey Bryant almost 5 years ago

Trying that again, the formatting in the last comment was unintended:

# using extra_data in ceph_volume/systemd/main.py, get ceph.wal_device and ceph.db_device from lvs tag with matching ceph.osd_id, ceph.osd_fsid, and type=block
# e.g. where extra_data=ceph.osd_id=0-e20dbce0-34f4-46b3-8efc-f41edbcae3d7
sudo lvs -o lv_tags|grep type=block | grep ceph.osd_id=0 | grep ceph\.osd_fsid=e20dbce0-34f4-46b3-8efc-f41edbcae3d7 | grep ceph\.wal_device
sudo lvs -o lv_tags|grep type=block | grep ceph.osd_id=0 | grep ceph\.osd_fsid=e20dbce0-34f4-46b3-8efc-f41edbcae3d7 | grep ceph\.db_device
# loop until the following is found or CEPH_VOLUME_SYSTEMD_TRIES times
# where ceph.wal_device=/dev/ceph-wal-8a073a5b-6e42-43bf-a99d-e30c649362ea/osd-wal-e20dbce0-34f4-46b3-8efc-f41edbcae3d7
sudo lvs -o lv_tags|grep type=wal | grep ceph.wal_device=/dev/ceph-wal-8a073a5b-6e42-43bf-a99d-e30c649362ea/osd-wal-e20dbce0-34f4-46b3-8efc-f41edbcae3d7
# loop until the following is found or CEPH_VOLUME_SYSTEMD_TRIES times
# where ceph.db_device=/dev/ceph-db-c37da146-b9a3-4339-bb2f-819f223982d3/osd-db-e20dbce0-34f4-46b3-8efc-f41edbcae3d7
sudo lvs -o lv_tags|grep type=db | grep ceph.db_device=/dev/ceph-db-c37da146-b9a3-4339-bb2f-819f223982d3/osd-db-e20dbce0-34f4-46b3-8efc-f41edbcae3d7
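
A minimal Python sketch of the same polling, assuming a plain subprocess call to 'lvs' and the 30-try / 5-second defaults discussed later in this ticket; the helper names are hypothetical, not the actual ceph-volume code:

import os
import subprocess
import time

TRIES = int(os.environ.get('CEPH_VOLUME_SYSTEMD_TRIES', 30))
INTERVAL = int(os.environ.get('CEPH_VOLUME_SYSTEMD_INTERVAL', 5))

def lv_tag_visible(lv_type, device_tag):
    # True once an LV of the given type (wal or db) carries the expected
    # ceph.wal_device / ceph.db_device tag value.
    out = subprocess.check_output(
        ['lvs', '--noheadings', '-o', 'lv_tags'], universal_newlines=True)
    return any('type=%s' % lv_type in line and device_tag in line
               for line in out.splitlines())

def wait_for_lv(lv_type, device_tag):
    # Poll until the WAL or DB LV shows up, or give up after TRIES attempts.
    for _ in range(TRIES):
        if lv_tag_visible(lv_type, device_tag):
            return True
        time.sleep(INTERVAL)
    return False

For the example values above this would be called as wait_for_lv('wal', 'ceph.wal_device=/dev/ceph-wal-8a073a5b-6e42-43bf-a99d-e30c649362ea/osd-wal-e20dbce0-34f4-46b3-8efc-f41edbcae3d7'), and likewise for the db device.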
#4

Updated by Greg Farnum almost 5 years ago

  • Project changed from Ceph to ceph-volume
  • Status changed from New to Fix Under Review
#5

Updated by David Casier almost 5 years ago

Proposed pull request (based on the work of coreycb): https://github.com/ceph/ceph/pull/28520

In coreycb's proposal, the command was evaluated before waiting for the WAL/DB devices to arrive.

In order to keep an equivalent timeout, I propose to put both checks in the same loop.
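
Structurally, the proposal amounts to something like the sketch below, where the activation attempt and the WAL/DB readiness check share one retry budget so the overall timeout stays at TRIES * INTERVAL; the function names are placeholders, not the code from the pull request:

import time

TRIES = 30      # CEPH_VOLUME_SYSTEMD_TRIES
INTERVAL = 5    # CEPH_VOLUME_SYSTEMD_INTERVAL

def trigger_osd():
    # Placeholder for the existing 'ceph-volume lvm trigger' call.
    return False

def wal_and_db_ready():
    # Placeholder for the lvm-tag checks sketched in the earlier comments.
    return False

def activate():
    # Both checks run in the same iteration, so waiting for the WAL/DB
    # devices does not extend the total timeout of TRIES * INTERVAL seconds.
    for _ in range(TRIES):
        if trigger_osd() and wal_and_db_ready():
            return True
        time.sleep(INTERVAL)
    return False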

#6

Updated by David Casier almost 5 years ago

I do not know the consequences if "ceph-volume simple" is used, so I added a check: "if sub_command == 'lvm':".

#7

Updated by Alfredo Deza almost 5 years ago

I commented in the PR, but want to reiterate here: we knew there was a chance that on certain systems the 30 tries at a 5 second interval wouldn't be enough, which is why we made it configurable rather than hard-coded.

In this case, the situation can benefit from changing the environment variables (as opposed to adding extra intervals or tries).

The environment variables are:

CEPH_VOLUME_SYSTEMD_TRIES
CEPH_VOLUME_SYSTEMD_INTERVAL
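
For illustration, a small sketch of how such overrides are typically read from the environment; the 30/5 defaults are the values mentioned in this ticket, and the exact handling inside ceph-volume may differ:

import os

tries = int(os.environ.get('CEPH_VOLUME_SYSTEMD_TRIES', 30))
interval = int(os.environ.get('CEPH_VOLUME_SYSTEMD_INTERVAL', 5))

Raising CEPH_VOLUME_SYSTEMD_TRIES (or the interval) in the environment of the ceph-volume systemd units would then extend the total wait without a code change.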

#8

Updated by Kefu Chai almost 5 years ago

  • Pull request ID set to 28791
#9

Updated by Jan Fajerski almost 5 years ago

  • Status changed from Fix Under Review to Resolved

Looks like the fix was merged? Feel free to re-open if it's still an issue.
