Bug #14737

libkrbd vs udev event ordering

Added by Ilya Dryomov about 4 years ago. Updated 12 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
rbd
Target version:
-
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature:

Description

I sat down and looked into a few rare instances of "rbd map" hangs, mostly during fsx runs over the last year. The only explanation I could come up with was udev messing with us, and indeed:

What are udev event ordering guarantees? Specifically, if I get three
kernel uevents with seqnums i, i+1 and i+2, is it guaranteed that udev
will deliver them to the udev monitor socket in the same (i.e., seqnum)
order? I don't have any logs and the problem occurs very infrequently,
but what I'm seeing could be explained by udev delivering those three
events in i+1, i+2, i order. Is that possible?

udev only maintains order between child and parent nodes, but not
otherwise.

I see. So the idea is that everybody should subscribe to udev events
and then, because udev doesn't expose seqnums in any way, nobody should
even know they exist and, even though the kernel is guaranteed to
deliver certain uevents in certain order, udev is free to reorder their
udev counterparts, correct?

Yes.

We need to improve our sysfs layout (which has been on my radar for a while anyway) and maybe try to think of a feasible fix for the existing layout.
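To make the hazard from the exchange above concrete, here is a minimal Python sketch. It is not libkrbd code and not the eventual fix: the `Event` tuple, the device names, and both helper functions are hypothetical and only illustrate why a wait that assumes seqnum order can hang when udev delivers events in a different order, while a wait that accepts any order does not.

```python
from typing import Iterable, NamedTuple, Sequence


class Event(NamedTuple):
    """Hypothetical stand-in for a udev event (not a real udev structure)."""
    action: str     # e.g. "add"
    subsystem: str  # e.g. "rbd" or "block"
    name: str       # device name, made up for this example


def wait_in_order(events: Iterable[Event], expected: Sequence[Event]) -> bool:
    """Order-dependent wait: look for expected[0], then expected[1], and so on.

    Events that show up before they are "due" are consumed and dropped, so a
    reordered stream never satisfies the wait. A real monitor loop would block
    forever at that point; this sketch returns False when the stream ends.
    """
    stream = iter(events)
    for want in expected:
        for ev in stream:
            if ev == want:
                break
        else:
            # Stream exhausted without seeing `want`: the "hang" case.
            return False
    return True


def wait_any_order(events: Iterable[Event], expected: Sequence[Event]) -> bool:
    """Order-independent wait: succeed once every expected event has been seen."""
    pending = set(expected)
    if not pending:
        return True
    for ev in events:
        pending.discard(ev)
        if not pending:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical pair of events a mapper might wait for.
    rbd_add = Event("add", "rbd", "0")
    blk_add = Event("add", "block", "rbd0")
    reordered = [blk_add, rbd_add]  # delivered in the "wrong" order

    print(wait_in_order(reordered, [rbd_add, blk_add]))   # False
    print(wait_any_order(reordered, [rbd_add, blk_add]))  # True
```

The order-independent version trades strictness for robustness: it cannot tell a reordered stream from an in-order one, which is acceptable only if nothing else depends on the relative order of the events.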


Related issues

Related to rbd - Bug #39089: krbd: fix rbd map hang due to udev return subsystem unordered (Resolved, 04/03/2019)

History

#1 Updated by Ilya Dryomov about 4 years ago

  • Description updated (diff)

#2 Updated by Ilya Dryomov almost 4 years ago

  • Status changed from 12 to In Progress

#3 Updated by Ilya Dryomov about 1 year ago

  • Related to Bug #39089: krbd: fix rbd map hang due to udev return subsystem unordered added

#4 Updated by Ilya Dryomov 12 months ago

  • Status changed from In Progress to Resolved

Haven't seen "rbd map" hanging this way in the lab with newer udev (i.e. systemd).

Anyway, mostly fixed in https://github.com/ceph/ceph/pull/27339.

#5 Updated by Ilya Dryomov 12 months ago

  • Category set to rbd
