lock ops are not re-sent when cluster gets marked un-full
3 - minor
Pull request ID:
I'm not certain what the correct behavior should be in this case, so maybe it is not a bug, but here is what is happening: when an OSD becomes full, a process fails and we unmount the rbd and attempt to remove the lock associated with the rbd for that process. The unmount works fine, but removing the lock is failing right now because the list_lockers() function call never returns. Here is a code snippet I tried with a fake rbd lock on a test cluster:

    import rbd
    import rados

    with rados.Rados(conffile='/etc/ceph/ceph.conf') as cluster:
        with cluster.open_ioctx('rbd') as ioctx:
            with rbd.Image(ioctx, 'msd1') as image:
                image.list_lockers()

The process never returns, even after the ceph cluster is returned to healthy. The only indication of the error is a message in /var/log/messages:

    Jul 11 23:25:05 node-172-16-0-13 python: 2013-07-11 23:25:05.826793 7ffc66d72700 0 client.6911.objecter FULL, paused modify 0x7ffc687c6050 tid 2

Any help would be greatly appreciated.

ceph version: ceph version 0.61.4 (1669132fcfc27d0c0b5e5bb93ade59d147e23404)
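As a client-side workaround until the underlying issue is fixed, one option is to bound the blocking call with a timeout so the calling process can recover instead of hanging forever. This is only a sketch: call_with_timeout is a hypothetical helper (not part of librbd/librados), and if the wrapped call truly hangs, its worker thread lingers in the background rather than being killed.

    import concurrent.futures

    def call_with_timeout(fn, timeout, *args, **kwargs):
        """Run fn(*args, **kwargs) in a worker thread and return its result.

        Raises concurrent.futures.TimeoutError if fn does not finish within
        `timeout` seconds. Note: a hung worker thread is not terminated; it
        keeps running in the background after the timeout fires.
        """
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(fn, *args, **kwargs)
        try:
            return future.result(timeout=timeout)
        finally:
            # Do not block waiting for a possibly-hung worker to exit.
            pool.shutdown(wait=False)

With rbd, the hanging call would be wrapped as call_with_timeout(image.list_lockers, 5.0), letting the unmount/cleanup path log a failure and move on instead of blocking indefinitely.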
This may turn out to be a librados issue, but it showed up via rbd locking.