Bug #46124

closed

Potential race condition regression around new OSD flock()s

Added by Niklas Hambuechen almost 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport: octopus, nautilus
Regression: Yes
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

In #38150 and PR https://github.com/ceph/ceph/pull/26245, a new `flock()` approach was introduced.

When I run `ceph-osd --mkfs` on v15.2.3, I see racy behaviour: it sometimes tries to take a lock that it apparently already holds, which results in `EAGAIN (Resource temporarily unavailable)` errors and a failed OSD creation.
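For context, this kind of `EAGAIN` can be reproduced outside Ceph. On Linux, a `flock()` lock belongs to the open file description, so re-locking through the same descriptor succeeds, while a second, independent `open()` of the same file conflicts with it, even within one process. The Python sketch below illustrates only that kernel behaviour. It uses a temporary file instead of a block device, the helper names `try_lock` and `demo` are made up for illustration, and it is not meant to represent the actual code path in `ceph-osd`.

```python
import errno
import fcntl
import os
import tempfile


def try_lock(fd):
    """Attempt LOCK_EX|LOCK_NB on fd; return 0 on success, else the errno."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return 0
    except OSError as e:
        return e.errno


def demo():
    """Lock one file through two independent descriptors and record each result."""
    results = {}
    with tempfile.NamedTemporaryFile() as tmp:
        fd_a = os.open(tmp.name, os.O_RDWR)
        fd_b = os.open(tmp.name, os.O_RDWR)  # separate open file description
        try:
            results["first_lock"] = try_lock(fd_a)
            # Same descriptor again: the existing lock is simply kept.
            results["relock_same_fd"] = try_lock(fd_a)
            # Different descriptor, same file: conflicts with the lock above.
            results["lock_other_fd"] = try_lock(fd_b)
        finally:
            # Closing the only descriptor of fd_a's description drops its lock.
            os.close(fd_a)
        try:
            results["lock_after_close"] = try_lock(fd_b)
        finally:
            os.close(fd_b)
    return results


if __name__ == "__main__":
    for step, rc in demo().items():
        print(step, "ok" if rc == 0 else errno.errorcode[rc])
```

If the same pattern applies here, a failing `flock()` would mean that some other open file description (in this process or another one) still held a lock on the device at that moment. That is an assumption, not something the traces below prove.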

Here are 3 runs with `strace`:

# strace -fye flock,close ceph-osd -i 0 --mkfs --osd-uuid f1889a99-b4dc-4044-a244-eabca4539ac9 --setuser ceph --setgroup ceph --osd-objectstore bluestore 2>&1 | grep -E 'flock|Resource|close*dev/xvd|^\+\+\+ exited'
[pid 19933] flock(12</dev/xvdg>, LOCK_EX|LOCK_NB) = 0
[pid 19933] flock(24</dev/xvdh>, LOCK_EX|LOCK_NB) = 0
[pid 19933] flock(12</dev/xvdg>, LOCK_EX|LOCK_NB) = 0
[pid 19933] flock(24</dev/xvdh>, LOCK_EX|LOCK_NB) = 0
[pid 19933] flock(24</dev/xvdh>, LOCK_EX|LOCK_NB) = -1 EAGAIN (Resource temporarily unavailable)
2020-06-21T02:59:26.498+0000 7fce7d0abdc0 -1 bdev(0x55a5f810e700 /var/lib/ceph/osd/ceph-0/block.db) _lock flock failed on /var/lib/ceph/osd/ceph-0/block.db
2020-06-21T02:59:26.498+0000 7fce7d0abdc0 -1 bdev(0x55a5f810e700 /var/lib/ceph/osd/ceph-0/block.db) open failed to lock /var/lib/ceph/osd/ceph-0/block.db: (11) Resource temporarily unavailable
2020-06-21T02:59:26.499+0000 7fce7d0abdc0 -1 bluestore(/var/lib/ceph/osd/ceph-0) _minimal_open_bluefs add block device(/var/lib/ceph/osd/ceph-0/block.db) returned: (11) Resource temporarily unavailable
2020-06-21T02:59:26.837+0000 7fce7d0abdc0 -1 OSD::mkfs: couldn't mount ObjectStore: error (11) Resource temporarily unavailable
2020-06-21T02:59:26.837+0000 7fce7d0abdc0 -1  ** ERROR: error creating empty object store in /var/lib/ceph/osd/ceph-0: (11) Resource temporarily unavailable
+++ exited with 250 +++

# strace -fye flock,close ceph-osd -i 0 --mkfs --osd-uuid f1889a99-b4dc-4044-a244-eabca4539ac9 --setuser ceph --setgroup ceph --osd-objectstore bluestore 2>&1 | grep -E 'flock|Resource|close*dev/xvd|^\+\+\+ exited'
[pid 21477] flock(12</dev/xvdg>, LOCK_EX|LOCK_NB) = 0
[pid 21477] flock(24</dev/xvdh>, LOCK_EX|LOCK_NB) = 0
[pid 21477] flock(12</dev/xvdg>, LOCK_EX|LOCK_NB) = -1 EAGAIN (Resource temporarily unavailable)
+++ exited with 250 +++

# strace -fye flock,close ceph-osd -i 0 --mkfs --osd-uuid f1889a99-b4dc-4044-a244-eabca4539ac9 --setuser ceph --setgroup ceph --osd-objectstore bluestore 2>&1 | grep -E 'flock|Resource|close*dev/xvd|^\+\+\+ exited'
[pid 20320] flock(12</dev/xvdg>, LOCK_EX|LOCK_NB) = 0
[pid 20320] flock(24</dev/xvdh>, LOCK_EX|LOCK_NB) = 0
[pid 20320] flock(12</dev/xvdg>, LOCK_EX|LOCK_NB) = 0
[pid 20320] flock(24</dev/xvdh>, LOCK_EX|LOCK_NB) = 0
[pid 20320] flock(24</dev/xvdh>, LOCK_EX|LOCK_NB) = 0
+++ exited with 0 +++
  • The first run fails with `EAGAIN` on `/dev/xvdh` and prints errors
  • The second run, oddly, also hits `EAGAIN` (on `/dev/xvdg`) and exits with code 250, but no error is printed to the user (silent failure)
  • The third run succeeds with exit code 0
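Since the second run fails without printing anything, a script that drives `ceph-osd --mkfs` probably should not rely on error output alone. The following is a minimal sketch of classifying a run by exit status first and by stderr content second. The helper name `classify_run` and its three labels are hypothetical, and it assumes nothing about `ceph-osd` beyond the exit codes shown in the traces above.

```python
import subprocess
import sys


def classify_run(cmd):
    """Run cmd and classify it by exit status, then by whether stderr has text."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    if proc.returncode == 0:
        return "success"
    if proc.stderr.strip():
        return "failure with message"
    # Non-zero exit and nothing on stderr: the case seen in the second run.
    return "silent failure"


if __name__ == "__main__":
    # Stand-in command that exits with 250 and prints nothing.
    print(classify_run([sys.executable, "-c", "import sys; sys.exit(250)"]))
```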

Related issues 2 (0 open, 2 closed)

Copied to bluestore - Backport #47707: nautilus: Potential race condition regression around new OSD flock()s (Resolved, Nathan Cutler)
Copied to bluestore - Backport #47708: octopus: Potential race condition regression around new OSD flock()s (Resolved, Nathan Cutler)