Bug #39710

open

[exclusive lock] An image got more than one lock owner at the same time

Added by yu feng almost 5 years ago. Updated almost 5 years ago.

Status:
New
Priority:
Low
Assignee:
-
Target version:
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

With Ceph Mimic 13.2.5, I wrote a simple program to test the librbd exclusive-lock-related APIs (i.e. lock_acquire, lock_break, ...) and found that, in a specific case, more than one client can consider itself the lock owner at the same time.

The "rbd_blacklist_on_break_lock" is setted as false before the test, and the information of the tested image is:

[root@node0012 ~]# rbd info p1/lun7
rbd image 'lun7':
size 1 GiB in 256 objects
order 22 (4 MiB objects)
id: f6736b8b4567
block_name_prefix: rbd_data.f6736b8b4567
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
op_features:
flags:
create_timestamp: Sat May 11 14:29:38 2019

The test procedure is (a minimal reproducer sketch follows the list):
1. Clients on host node0010 and host node0011 (with IPs 10.0.10.10 and 10.0.10.11, respectively) open the same image p1/lun7 described above.
2. The client on node0010 calls lock_acquire.
3. The client on node0011 calls lock_break and then lock_acquire.
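
The reproducer is roughly the following; this is only a minimal sketch using the public librbd C++ API (rbd/librbd.hpp), with the pool/image names from the test above and error handling reduced to asserts:

#include <rados/librados.hpp>
#include <rbd/librbd.hpp>
#include <cassert>
#include <iostream>
#include <list>
#include <string>

int main() {
  librados::Rados rados;
  assert(rados.init("admin") == 0);
  assert(rados.conf_read_file(nullptr) == 0);  // picks up ceph.conf, including
                                               // rbd_blacklist_on_break_lock = false
  assert(rados.connect() == 0);

  librados::IoCtx ioctx;
  assert(rados.ioctx_create("p1", ioctx) == 0);

  librbd::RBD rbd;
  librbd::Image image;
  assert(rbd.open(ioctx, image, "lun7") == 0);  // step 1

  // Steps 2/3: break any existing lock, then acquire it ourselves. With
  // rbd_blacklist_on_break_lock = false the previous owner is NOT blacklisted.
  rbd_lock_mode_t lock_mode;
  std::list<std::string> lock_owners;
  if (image.lock_get_owners(&lock_mode, &lock_owners) == 0 && !lock_owners.empty()) {
    assert(image.lock_break(lock_mode, lock_owners.front()) == 0);
  }
  assert(image.lock_acquire(RBD_LOCK_MODE_EXCLUSIVE) == 0);

  bool owner = false;
  image.is_exclusive_lock_owner(&owner);
  std::cout << "am I the lock owner? " << owner << std::endl;

  image.close();
  ioctx.close();
  rados.shutdown();
  return 0;
}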

After step 3, the command shows the lock owner is on node0011:

[root@node0012 ~]# rados -p p1 lock info rbd_header.f6736b8b4567 rbd_lock
{"name":"rbd_lock","type":"exclusive","tag":"internal","lockers":[{"name":"client.63133","cookie":"auto 140200885894320","description":"","expiration":"0.000000","addr":"10.0.10.11:0/2140833570"}]}

[root@node0012 ~]# rbd lock ls p1/lun7
There is 1 exclusive lock on this image.
Locker ID Address
client.63133 auto 140200885894320 10.0.10.11:0/2140833570

but the client on node0010 still thinks it is the lock owner:

2019-05-11 14:34:24.585 7f24d9ffb700 10 librbd::Watcher::C_NotifyAck 0x7f24d400b880 C_NotifyAck: id=4621384811568, handle=139796018117808
2019-05-11 14:34:24.585 7f24d9ffb700 10 librbd::ImageWatcher: 0x7f24d4004dd0 image exclusively locked announcement
2019-05-11 14:34:24.585 7f24d9ffb700 10 librbd::ImageWatcher: 0x7f24d4004dd0 current lock owner: [*63133,140200885894320*]
2019-05-11 14:34:24.585 7f24d9ffb700 20 librbd::ManagedLock: 0x7f24d400e860 is_lock_owner: =1
2019-05-11 14:34:24.585 7f24d9ffb700 10 librbd::Watcher::C_NotifyAck 0x7f24d400b880 finish: r=0
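
(For reference, a check like the following on node0010, using the same public API as the sketch above, still reports local ownership at this point while the cluster reports client.63133 as the locker. The helper name is my own; `image` is assumed to be the already-open p1/lun7 handle.)

#include <rbd/librbd.hpp>
#include <iostream>
#include <list>
#include <string>

// Sketch: compare the client's local belief with the cluster-side locker list.
void dump_lock_view(librbd::Image &image) {
  bool owner = false;
  image.is_exclusive_lock_owner(&owner);            // local lock state machine's view
  std::cout << "local is_exclusive_lock_owner: " << owner << std::endl;

  rbd_lock_mode_t mode;
  std::list<std::string> owners;
  if (image.lock_get_owners(&mode, &owners) == 0) { // queries the OSDs
    for (const auto &o : owners) {
      std::cout << "cluster-side locker: " << o << std::endl;
    }
  }
}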

Then the client on node0010 continued to write data and got the following log:

2019-05-11 14:34:46.465 7f25042cb480 20 librbd::io::ImageRequestWQ: 0x1e042d0 aio_write: ictx=0x1e02f40, completion=0x1e05550, off=0, len=4096, flags=0
2019-05-11 14:34:46.465 7f25042cb480 20 librbd::io::ImageRequestWQ: 0x1e042d0 queue: ictx=0x1e02f40, req=0x1dec9d0
2019-05-11 14:34:46.465 7f24d97fa700 20 librbd::io::AsyncOperation: 0x1e05688 start_op
2019-05-11 14:34:46.465 7f24d97fa700 20 librbd::io::ImageRequestWQ: 0x1e042d0 process: ictx=0x1e02f40, req=0x1dec9d0
2019-05-11 14:34:46.465 7f24d97fa700 20 librbd::io::ImageRequest: 0x7f24d97f6e90 send: aio_write: ictx=0x1e02f40, completion=0x1e05550

2019-05-11 14:34:46.465 7f24d97fa700 20 librbd::ObjectMap: 0x7f24cc008db0 aio_update: start=0, end=1, ->1
2019-05-11 14:34:46.465 7f24d97fa700 20 librbd::object_map::UpdateRequest: 0x7f24cc0068e0 update_object_map: ictx=0x1e02f40, oid=rbd_object_map.f6736b8b4567, [0,1) = ->1
2019-05-11 14:34:46.465 7f24d97fa700 10 client.63127.objecter _op_submit oid rbd_object_map.f6736b8b4567 '@1' '@1' [call lock.assert_locked,call rbd.object_map_update] tid 29 osd.4
2019-05-11 14:34:46.472 7f24f2ad0700 10 client.63127.objecter op 0 rval -16 len 0
2019-05-11 14:34:46.472 7f24f2ad0700 10 client.63127.objecter op 1 rval 0 len 0
2019-05-11 14:34:46.472 7f24d9ffb700 20 librbd::object_map::UpdateRequest: 0x7f24cc0068e0 handle_update_object_map: r=-16
2019-05-11 14:34:46.472 7f24d9ffb700 20 librbd::object_map::UpdateRequest: 0x7f24cc0068e0 update_in_memory_object_map:
2019-05-11 14:34:46.472 7f24d9ffb700 20 librbd::object_map::Request: 0x7f24cc0068e0 should_complete: r=-16
2019-05-11 14:34:46.472 7f24d9ffb700 -1 librbd::object_map::Request: failed to update object map: (16) Device or resource busy
2019-05-11 14:34:46.472 7f24d9ffb700 -1 librbd::object_map::InvalidateRequest: 0x7f24d4011d30 invalidating object map in-memory
2019-05-11 14:34:46.472 7f24d9ffb700 -1 librbd::object_map::InvalidateRequest: 0x7f24d4011d30 invalidating object map on-disk

The object_map::UpdateRequest got error -16 because the "call lock.assert_locked" op failed. We can also see that the object rbd_object_map.f6736b8b4567 is locked by the client on node0011:

[root@node0012 ~]# rados -p p1 lock info rbd_object_map.f6736b8b4567 rbd_lock
{"name":"rbd_lock","type":"exclusive","tag":"","lockers":[{"name":"client.63133","cookie":"","description":"","expiration":"0.000000","addr":"10.0.10.11:0/2140833570"}]}

In the end, the image ended up with the "object map invalid" and "fast diff invalid" flags:

[root@node0012 ~]# rbd info p1/lun7
rbd image 'lun7':
size 1 GiB in 256 objects
order 22 (4 MiB objects)
id: f6736b8b4567
block_name_prefix: rbd_data.f6736b8b4567
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
op_features:
flags: object map invalid, fast diff invalid
create_timestamp: Sat May 11 14:29:38 2019

Here are my questions:
1. Why did the client on node0010 receive the "AcquiredLockPayload" notify message from the true lock owner but not change its lock state? It seems the following code does not take effect in this case.

RWLock::RLocker owner_locker(m_image_ctx.owner_lock);
if (m_image_ctx.exclusive_lock != nullptr) {
  // potentially wake up the exclusive lock state machine now that
  // a lock owner has advertised itself
  m_image_ctx.exclusive_lock->handle_peer_notification(0);
}

2. Should the exclusive lock feature always be used together with "rbd_blacklist_on_break_lock = true"?
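
(For context, the non-default setting used in this test was placed in ceph.conf; an equivalent, illustrative way to apply it to a single client at runtime with librados would be:)

#include <rados/librados.hpp>
#include <cassert>

int main() {
  librados::Rados rados;
  assert(rados.init("admin") == 0);
  assert(rados.conf_read_file(nullptr) == 0);
  // Same effect as "rbd_blacklist_on_break_lock = false" in ceph.conf,
  // but limited to this one client.
  assert(rados.conf_set("rbd_blacklist_on_break_lock", "false") == 0);
  assert(rados.connect() == 0);
  // ... open the image and run the lock test as described above ...
  rados.shutdown();
  return 0;
}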


Files

node0010.log.tar.gz (54.8 KB) — yu feng, 05/11/2019 07:45 AM
node0011.log.tar.gz (55 KB) — yu feng, 05/11/2019 07:47 AM
#1

Updated by Greg Farnum almost 5 years ago

  • Project changed from Ceph to rbd
#2

Updated by Jason Dillaman almost 5 years ago

  • Priority changed from Normal to Low

Honestly, all bets are off if you set "rbd_blacklist_on_break_lock" to false. You are testing code paths that are way off the established path.

#3

Updated by yu feng almost 5 years ago

Jason Dillaman wrote:

Honestly, all bets are off if you set "rbd_blacklist_on_break_lock" to false. You are testing code paths that are way off the established path.

Hi Jason,
I did this test because it's what tcmu-runner does. If the option "rbd_blacklist_on_break_lock" is true, there would be hundreds of clients added to the blacklist.
Yes, maybe it's an issue with tcmu-runner; I posted it here just to get some advice about the configuration of the exclusive lock.
Thanks.

#4

Updated by Jason Dillaman almost 5 years ago

yu feng wrote:

I did this test because it's what tcmu-runner does.

Is that true? I just grepped that code and I don't see it being disabled. The "rbd_lock_break" API is actually hard-coded to force a blacklist [1].

[1] https://github.com/ceph/ceph/blob/master/src/librbd/internal.cc#L1184

#5

Updated by yu feng almost 5 years ago

The tcmu-runner I tested at first was a continuous integration build (not a formal release). If two tcmu-runners of this version open the same image, they will race for the exclusive lock of the image and add each other to the blacklist (via lock_break) over and over again.
So I set "rbd_blacklist_on_break_lock = false" in ceph.conf manually to avoid this issue, and then I got the problem described above.

But after updating tcmu-runner to the formal release 1.4.0, it seems the lock-racing issue no longer exists.

Anyway, thanks for your advice.
