Bug #45714
Updated by Jason Dillaman almost 4 years ago
If The rbd-mirror attempts to shut down thrasher test case got stuck waiting for quiesce / unquiesce while releasing its journal replayer, it might shut down after a op has started but before lock -- the OpFinish event is processed. In that case, the op will never finish point where it was blacklisted by peers and eventually aborted by the wait-for-events state will never advance. This reverts to the pre-Octopus behavior prior to rbd-mirror being refactored. trasher for not completing in a timely fashion. <pre> 2020-05-22T17:36:33.055+0000 7f3c5b3a8700 10 librbd::ImageWatcher: 0x5564c65f8240 notify_quiesce: request_id=3 2020-05-22T17:36:33.055+0000 7f3c5b3a8700 10 librbd::ImageWatcher: 0x5564c65f8240 notify_quiesce: async_request_id=[4367,93891297478144,3] attempts=10 2020-05-22T17:36:33.055+0000 7f3c5b3a8700 20 librbd::watcher::Notifier: 0x5564c65f82c0 notify: pending=1 2020-05-22T17:36:33.055+0000 7f3c743da700 5 librbd::Watcher: 0x5564c96cf680 notifications_blocked: blocked=0 2020-05-22T17:36:33.055+0000 7f3c743da700 10 librbd::Watcher::C_NotifyAck 0x5564c1b0fbd0 C_NotifyAck: id=1258425422260, handle=93891333330944 2020-05-22T17:36:33.055+0000 7f3c743da700 10 librbd::ImageWatcher: 0x5564c96cf680 quiesce request: [4367,93891333330944,3] 2020-05-22T17:36:33.055+0000 7f3c743da700 20 librbd::ImageState: 0x5564c1f94580 notify_quiesce 2020-05-22T17:36:33.055+0000 7f3c5b3a8700 10 librbd::ImageWatcher: 0x5564c8fdb2b0 C_ResponseMessage: r=0 2020-05-22T17:36:33.055+0000 7f3c5b3a8700 10 librbd::Watcher::C_NotifyAck 0x5564c1b0fbd0 finish: r=0 2020-05-22T17:36:33.055+0000 7f3c7fbf1700 15 rbd::mirror::image_replayer::util::decode_client_meta: 2020-05-22T17:36:33.055+0000 7f3c7fbf1700 15 rbd::mirror::image_replayer::util::decode_client_meta: client found: client_meta=[image_id=110f23aef24a, state=Replaying, sync_object_count=120, sync_points=[], snap_seqs=[[local_snap_seq=458, peer_snap_seq618]] 2020-05-22T17:36:33.055+0000 7f3c7fbf1700 5 rbd::mirror::image_replayer::journal::Replayer: 0x5564c5a2e700 validate_remote_client_state: image_id=110f23aef24a, remote_client_meta.image_id=110f23aef24a, remote_client.state=connected 2020-05-22T17:36:33.055+0000 7f3c743da700 5 librbd::Watcher: 0x5564c65f8240 notifications_blocked: blocked=0 2020-05-22T17:36:33.055+0000 7f3c743da700 10 librbd::Watcher::C_NotifyAck 0x5564c9b29a90 C_NotifyAck: id=1258425419227, handle=93891297478144 2020-05-22T17:36:33.055+0000 7f3c743da700 10 librbd::ImageWatcher: 0x5564c65f8240 quiesce request: [4367,93891297478144,3] ... snip ... 2020-05-22T17:36:37.648+0000 7f3c7fbf1700 15 rbd::mirror::ImageReplayer: 0x5564bed39900 [3/6171e4b9-eb13-49a4-a4c3-b3ae89c31304] unregister_admin_socket_hook: 2020-05-22T17:36:43.054+0000 7f3c43b79700 10 librbd::ImageWatcher: 0x5564c65f8240 quiesce request [4367,93891297478144,3] timed out 2020-05-22T17:36:43.054+0000 7f3c43b79700 20 librbd::ImageState: 0x5564c177d600 notify_unquiesce 2020-05-22T17:36:45.871+0000 7f3c43b79700 10 librbd::ImageWatcher: 0x5564be582b40 quiesce request [4367,93891187611648,3] timed out 2020-05-22T17:36:45.871+0000 7f3c43b79700 20 librbd::ImageState: 0x5564c8c47a80 notify_unquiesce </pre> http://qa-proxy.ceph.com/teuthology/jdillaman-2020-05-22_12:40:24-rbd-wip-jd-testing-distro-basic-smithi/5082994/teuthology.log