Bug #14737
closed
libkrbd vs udev event ordering
Description
I sat down and looked into a few rare instances of "rbd map" hangs, mostly during fsx runs over the last year. The only explanation I could come up with was udev messing with us, and indeed:
What are udev event ordering guarantees? Specifically, if I get three
kernel uevents with seqnums i, i+1 and i+2, is it guaranteed that udev
will deliver them to the udev monitor socket in the same (i.e., seqnum)
order? I don't have any logs and the problem occurs very infrequently,
but what I'm seeing could be explained by udev delivering those three
events in i+1, i+2, i order. Is that possible?

udev only maintains order between child and parent nodes, but not
otherwise.

I see. So the idea is that everybody should subscribe to udev events
and then, because udev doesn't expose seqnums in any way, nobody should
even know they exist and, even though the kernel is guaranteed to
deliver certain uevents in certain order, udev is free to reorder their
udev counterparts, correct?

Yes.
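To illustrate why the reordering described in the exchange above can turn into a hang, here is a minimal Python sketch. It is not libkrbd code and not necessarily what the fix referenced later in this ticket does. The event names (`ev_i`, `ev_i1`, `ev_i2`) and both helper functions are hypothetical, and a finite list stands in for the udev monitor socket. One waiter assumes seqnum order, the other only requires that every expected event eventually shows up.

```python
from typing import Hashable, Iterable, Optional, Sequence


def wait_in_order(events: Iterable[Hashable],
                  expected: Sequence[Hashable]) -> Optional[int]:
    """Fragile waiter: advances only when the next delivered event is the
    next expected one. Returns the number of events consumed, or None if
    the stream ends first (on a live socket this would block forever)."""
    idx = 0
    for consumed, ev in enumerate(events, start=1):
        if ev == expected[idx]:
            idx += 1
            if idx == len(expected):
                return consumed
    return None


def wait_for_all(events: Iterable[Hashable],
                 expected: Iterable[Hashable]) -> Optional[int]:
    """Order-independent waiter: tracks the set of events still pending
    and finishes as soon as that set is empty, whatever the order."""
    pending = set(expected)
    for consumed, ev in enumerate(events, start=1):
        pending.discard(ev)
        if not pending:
            return consumed
    return None


# Hypothetical labels for the three uevents with seqnums i, i+1, i+2.
KERNEL_ORDER = ["ev_i", "ev_i1", "ev_i2"]
# The delivery order suspected in this report: i+1, i+2, i.
REORDERED = ["ev_i1", "ev_i2", "ev_i"]

in_order_result = wait_in_order(REORDERED, KERNEL_ORDER)
any_order_result = wait_for_all(REORDERED, KERNEL_ORDER)
```

With the reordered delivery, `wait_in_order` matches `ev_i` only as the last event and then keeps waiting for `ev_i1`, which has already gone by, so it never completes. `wait_for_all` completes after the third event because it makes no assumption about order.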
Need to improve our sysfs layout (which has been on my radar for a while anyway) and maybe try to think of a feasible fix for the existing stuff.
Updated by Ilya Dryomov almost 8 years ago
- Status changed from 12 to In Progress
Updated by Ilya Dryomov about 5 years ago
- Related to Bug #39089: krbd: fix rbd map hang due to udev return subsystem unordered added
Updated by Ilya Dryomov about 5 years ago
- Status changed from In Progress to Resolved
Haven't seen "rbd map" hanging this way in the lab with newer udev (i.e. systemd).
Anyway, mostly fixed in https://github.com/ceph/ceph/pull/27339.