Bug #41036

closed

concurrent "rbd unmap" failures due to udev

Added by Ilya Dryomov almost 5 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Normal
Target version: -
% Done: 0%
Backport: luminous,mimic,nautilus
Regression: No
Severity: 3 - minor

Description

Unmapping 200 images concurrently leaves behind 10-20 mappings:

# map-200.sh
OK
# rbd showmapped | wc -l
201
# for ((i = 0; i < 200; i++)); do rbd unmap /dev/rbd$i & done
rbd: '/dev/rbd4' is not an rbd device
rbd: unmap failed: (22) Invalid argument
rbd: unmap failed: (19) No such device
rbd: unmap failed: (19) No such device
rbd: '/dev/rbd70' is not an rbd device
rbd: '/dev/rbd106' is not an rbd device
rbd: unmap failed: rbd: unmap failed: (22) Invalid argument
(22) Invalid argument
rbd: '/dev/rbd136' is not an rbd device
rbd: unmap failed: (22) Invalid argument
rbd: '/dev/rbd167' is not an rbd device
rbd: '/dev/rbd138' is not an rbd device
rbd: unmap failed: rbd: unmap failed: (22) Invalid argument
rbd: unmap failed: (22) Invalid argument(19) No such device
rbd: unmap failed: (19) No such device
rbd: '/dev/rbd160' is not an rbd device
rbd: unmap failed: (22) Invalid argument
rbd: '/dev/rbd163' is not an rbd device
rbd: unmap failed: (19) No such device
rbd: unmap failed: (22) Invalid argument
rbd: unmap failed: (19) No such device
rbd: '/dev/rbd173' is not an rbd device
rbd: unmap failed: (22) Invalid argument
rbd: unmap failed: (19) No such device
rbd: unmap failed: (19) No such device
rbd: '/dev/rbd181' is not an rbd device
rbd: unmap failed: (22) Invalid argument
rbd: unmap failed: (19) No such device

# rbd showmapped 
id  pool namespace image  snap device      
106 rbd            img106 -    /dev/rbd106 
136 rbd            img137 -    /dev/rbd136 
138 rbd            img140 -    /dev/rbd138 
140 rbd            img141 -    /dev/rbd140 
160 rbd            img158 -    /dev/rbd160 
162 rbd            img165 -    /dev/rbd162 
163 rbd            img162 -    /dev/rbd163 
167 rbd            img168 -    /dev/rbd167 
173 rbd            img173 -    /dev/rbd173 
177 rbd            img176 -    /dev/rbd177 
181 rbd            img183 -    /dev/rbd181 
184 rbd            img187 -    /dev/rbd184 
187 rbd            img184 -    /dev/rbd187 
188 rbd            img186 -    /dev/rbd188 
189 rbd            img189 -    /dev/rbd189 
4   rbd            img5   -    /dev/rbd4   
70  rbd            img70  -    /dev/rbd70  
83  rbd            img82  -    /dev/rbd83  
93  rbd            img96  -    /dev/rbd93

This is because udev_enumerate_scan_devices() called from devno_to_krbd_id() sporadically fails with either ENODEV or ENOENT. Under normal circumstances devno_to_krbd_id() returns ENOENT explicitly, which the caller treats as "not an rbd device" and translates to EINVAL.

Looking at strace output, the filtering code inside libudev does find the right device and continues on. udev_enumerate_scan_devices() fails later.
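
The error translation described above can be illustrated with a small toy model. This is only a sketch, not the actual Ceph code: the two-argument signatures, the `scan_rc` and `matched` parameters, and the `unmap_error()` helper are hypothetical and exist purely to show why a spurious ENOENT from the udev scan is reported as "not an rbd device" with EINVAL, while a spurious ENODEV surfaces unchanged as "(19) No such device".

```python
import errno


def devno_to_krbd_id(scan_rc, matched):
    """Toy stand-in for the lookup (hypothetical signature).

    scan_rc -- assumed return value of the udev scan: 0 or a negative errno
    matched -- whether the scan found a matching rbd device
    """
    if scan_rc < 0:
        # A sporadic scan failure (ENODEV or ENOENT) is passed up as-is.
        return scan_rc
    if not matched:
        # The explicit, expected "no such rbd device" result.
        return -errno.ENOENT
    return 0


def unmap_error(scan_rc, matched):
    """Toy stand-in for the caller (hypothetical helper)."""
    r = devno_to_krbd_id(scan_rc, matched)
    if r == -errno.ENOENT:
        # The caller cannot tell an explicit ENOENT from a spurious one,
        # so both become "not an rbd device" and EINVAL.
        return -errno.EINVAL
    return r
```

In this model, `unmap_error(-errno.ENOENT, True)` yields `-errno.EINVAL` even though the device is mapped, which corresponds to the "is not an rbd device" / "(22) Invalid argument" pairs in the output above, and `unmap_error(-errno.ENODEV, True)` yields `-errno.ENODEV`, corresponding to the "(19) No such device" lines.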


Related issues 5 (0 open, 5 closed)

Related to rbd - Bug #41404: [rbd] rbd map hangs up infinitely after osd down (Resolved, Ilya Dryomov, 08/23/2019)
Related to Ceph - Fix #42523: backport "common/thread: Fix race condition in make_named_thread" to mimic and nautilus (Resolved, Ilya Dryomov, 10/29/2019)
Copied to rbd - Backport #42524: nautilus: concurrent "rbd unmap" failures due to udev (Resolved, Nathan Cutler)
Copied to rbd - Backport #42526: mimic: concurrent "rbd unmap" failures due to udev (Resolved, Ilya Dryomov)
Copied to rbd - Backport #42527: luminous: concurrent "rbd unmap" failures due to udev (Resolved, Ilya Dryomov)
