Bug #59685
open[RBD Mirror] `Mirror image status` reports "up+stopping_replay" state upon a force promote on secondary, after a disaster of primary
0%
Description
After a disaster with primary, force promote a secondary image and check the mirror image state (after sometime) the status is reported as "up+stopping_replay".
Actual results:
Image status after force promote:
state":"up+stopping_replay"
Expected results:
Image status after force promote:
state":"up+stopped"
Reproducer: 100%
Updated by Prasanna Kumar Kalever 12 months ago
✨ sudo ../src/mstop.sh clustera ✨ ps -aux | grep rbd-mirror ✨ sudo kill -9 22786 --> rbd-mirror of site-a ✨ sudo ./bin/rbd --cluster=site-b mirror image promote pool1/img --debug-rbd=0 --force Image promoted to primary ✨ sudo ./bin/rbd --cluster=site-b mirror image status pool1/img --debug-rbd=0 img: global_id: d85f530b-ac7a-465c-ae0a-90b6fde8abaf state: up+stopping_replay description: local image linked to unknown peer service: admin on localhost.localdomain last_update: 2023-05-09 15:22:17 peer_sites: name: 0a287f34-04b6-418e-897e-c91be9e0a7e9 state: down+starting_replay description: starting replay last_update: 2023-05-09 12:19:17 snapshots: 9 .mirror.primary.d85f530b-ac7a-465c-ae0a-90b6fde8abaf.4251976f-b9e9-46bc-91c8-bb9f2f9954e4 (peer_uuids:[2505e916-4a47-4a42-950e-1c9c6d888214])
Updated by Prasanna Kumar Kalever 12 months ago
Here is my Analysis:
template <typename I> void ImageReplayer<I>::on_stop_journal_replay(int r, const std::string &desc) { dout(10) << dendl; { ┆ std::lock_guard locker{m_lock}; ┆ if (m_state != STATE_REPLAYING) { ┆ ┆ // might be invoked multiple times while stopping ┆ ┆ return; ┆ } ┆ m_stop_requested = true; ┆ m_state = STATE_STOPPING; } cancel_update_mirror_image_replay_status(); set_state_description(r, desc); update_mirror_image_status(true, boost::none); shut_down(0); }
shut_down() is invoked above
template <typename I> void ImageReplayer<I>::shut_down(int r) { dout(10) << "r=" << r << dendl; [...] // chain the shut down sequence (reverse order) Context *ctx = new LambdaContext( ----> Doesn't get invoked ┆ [this, r](int _r) { ┆ ┆ update_mirror_image_status(true, STATE_STOPPED); ┆ ┆ handle_shut_down(r); ┆ }); // destruct the state builder if (m_state_builder != nullptr) { ┆ ctx = new LambdaContext([this, ctx](int r) { ┆ ┆ m_state_builder->close(ctx); ----> This will be eventually blocked and will not call ctx->complete(0) ┆ }); } [...] }The LambdaContext in the above shut_down() is not executed, so it won't be able to execute `update_mirror_image_status(true, STATE_STOPPED);` This is because,
m_state_builder->close(ctx); [src/tools/rbd_mirror/ImageReplayer.cc::shut_down()] --> void StateBuilder<I>::close(Context* on_finish) {} [src/tools/rbd_mirror/image_replayer/snapshot/StateBuilder.cc] --> void StateBuilder<I>::close_local_image(Context* on_finish) {} [../src/tools/rbd_mirror/image_replayer/StateBuilder.cc] --> void StateBuilder<I>::close_remote_image(Context* on_finish) {} [../src/tools/rbd_mirror/image_replayer/StateBuilder.cc] --> void CloseImageRequest<I>::send() {} [../src/tools/rbd_mirror/image_replayer/CloseImageRequest.cc] --> void CloseImageRequest<I>::close_image() {} [../src/tools/rbd_mirror/image_replayer/CloseImageRequest.cc]
template <typename I> void CloseImageRequest<I>::close_image() { dout(20) << dendl; Context *ctx = create_context_callback< ┆ CloseImageRequest<I>, &CloseImageRequest<I>::handle_close_image>(this); (*m_image_ctx)->state->close(ctx); ------> This is supposed to call void ImageState<I>::close(Context *on_finish){}, but that is not happening currently }
As the above is async calls, the execution of the process doesn't block but, close_remote_image() doesn't continue, which means the "on_finish->complete(0)" doesn't run as a result (if you trace back) the LambdaContext code hunch in the shut_down() is not invoked and update_mirror_image_status(true, STATE_STOPPED); is not called.
The remote cluster is no more available because of the disaster, this is why remote image close is not continuing.
Updated by Ilya Dryomov 12 months ago
Hi Prasanna,
Where exactly in ImageState::close() does it hang?
Updated by Prasanna Kumar Kalever 12 months ago
(*m_image_ctx)->state->close(ctx); ------> This is supposed to call void ImageState<I>::close(Context *on_finish){}, but that is not happening currently
I had rerun the command `sudo ./bin/rbd --cluster=site-b mirror image promote pool1/img --force --debug-rbd=0` without `--debug-rbd=0`, it produced the desired logs where I can see the above `void ImageState<I>::close(Context *on_finish){}` is continuing (running) ...
Client 1:
2023-05-09T19:53:29.677+0530 7f58fb7ed640 20 rbd::mirror::image_replayer::CloseImageRequest: 0x563cb258f7c0 handle_close_imagePRASANNAXXX: CloseImageRequest<I>::handle_close_image(): r=0 2023-05-09T19:53:29.677+0530 7f58fb7ed640 20 rbd::mirror::image_replayer::CloseImageRequest: 0x563cb258f7c0 handle_close_imagePRASANNAXXX: CloseImageRequest<I>::handle_close_image(): calling complete r=0 2023-05-09T19:53:29.677+0530 7f58fb7ed640 10 rbd::mirror::image_replayer::StateBuilder: 0x563cb23f2000 handle_close_local_image: r=0 2023-05-09T19:53:29.677+0530 7f58fb7ed640 10 rbd::mirror::image_replayer::StateBuilder: 0x563cb23f2000 close_remote_image: PRASANNAXXX: close_remote_image 2023-05-09T19:53:29.677+0530 7f58fb7ed640 10 rbd::mirror::image_replayer::StateBuilder: 0x563cb23f2000 close_remote_image: PRASANNAXXX: remote_image_ctx != nullptr 2023-05-09T19:53:29.677+0530 7f58fb7ed640 20 rbd::mirror::image_replayer::CloseImageRequest: 0x563cb216fa70 sendPRASANNAXXX: In CloseImageRequest<I>::send() 2023-05-09T19:53:29.677+0530 7f58fb7ed640 20 rbd::mirror::image_replayer::CloseImageRequest: 0x563cb216fa70 close_imagePRASANNAXXX: In CloseImageRequest<I>::close_image() 2023-05-09T19:53:29.677+0530 7f58fb7ed640 20 rbd::mirror::image_replayer::CloseImageRequest: 0x563cb216fa70 close_imagePRASANNAXXX: In CloseImageRequest<I>::close_image() called (*m_image_ctx)->state->close(ctx) 2023-05-09T19:53:29.677+0530 7f58fb7ed640 10 rbd::mirror::image_replayer::StateBuilder: 0x563cb23f2000 close_remote_image: PRASANNAXXX: done requesting image_replayer::CloseImageRequest<I>::send for remote 2023-05-09T19:53:29.677+0530 7f58fb7ed640 10 rbd::mirror::image_replayer::StateBuilder: 0x563cb23f2000 handle_close_local_image: PRASANNAXXX: handle_close_local_image called complete, r=0 2023-05-09T19:53:32.934+0530 7f590b80d640 10 rbd::mirror::MirrorStatusUpdater 0x563cad3a91e0 handle_timer_task: 2023-05-09T19:53:32.934+0530 7f590b80d640 10 rbd::mirror::MirrorStatusUpdater 0x563cad3a91e0 schedule_timer_task: 2023-05-09T19:53:32.934+0530 7f590b80d640 10 rbd::mirror::MirrorStatusUpdater 0x563cad3a91e0 queue_update_task:
Client 2:
2023-05-09T19:53:31.389+0530 7f3826860580 20 librbd::ImageState: 0x55a64d6c6e80 closePRASANNAXXX: ImageState<I>::close() 2023-05-09T19:53:31.389+0530 7f3826860580 20 librbd::ImageState: 0x55a64d6c6e80 closePRASANNAXXX: ImageState<I>::close() 2 2023-05-09T19:53:31.389+0530 7f3826860580 20 librbd::ImageState: 0x55a64d6c6e80 closePRASANNAXXX: ImageState<I>::close() 2 taking Action 2023-05-09T19:53:31.389+0530 7f3826860580 20 librbd::ImageState: 0x55a64d6c6e80 closePRASANNAXXX: ImageState<I>::close() 2 taking Action done 2023-05-09T19:53:31.389+0530 7f3826860580 20 librbd::ImageState: 0x55a64d6c6e80 closePRASANNAXXX: ImageState<I>::close() 2 execute_action_unlock 2023-05-09T19:53:31.389+0530 7f3826860580 20 librbd::ImageState: 0x55a64d6c6e80 0x55a64d6c6e80 append_context PRASANNAXXX: pushing to action_contexts->second.push_back(context) 2023-05-09T19:53:31.389+0530 7f3826860580 10 librbd::ImageState: 0x55a64d6c6e80 0x55a64d6c6e80 send_close_unlock 2023-05-09T19:53:31.389+0530 7f3826860580 10 librbd::ConfigWatcher: shut_down: 2023-05-09T19:53:31.389+0530 7f3826860580 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 send_block_image_watcher 2023-05-09T19:53:31.389+0530 7f3826860580 10 librbd::ImageWatcher: 0x7f37ec008c00 block_notifies 2023-05-09T19:53:31.389+0530 7f3826860580 5 librbd::Watcher: 0x7f37ec008c00 block_notifies: blocked_count=1 2023-05-09T19:53:31.389+0530 7f3826860580 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 handle_block_image_watcher: r=0 2023-05-09T19:53:31.389+0530 7f3826860580 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 send_shut_down_update_watchers 2023-05-09T19:53:31.389+0530 7f3826860580 20 librbd::ImageState: 0x55a64d6c6e80 shut_down_update_watchers 2023-05-09T19:53:31.389+0530 7f3826860580 20 librbd::ImageState: 0x55a64d6c7100 ImageUpdateWatchers::shut_down 2023-05-09T19:53:31.389+0530 7f3826860580 20 librbd::ImageState: 0x55a64d6c7100 ImageUpdateWatchers::shut_down: completing shut down 2023-05-09T19:53:31.389+0530 7f3826860580 20 librbd::ImageState: 0x55a64d6c6e80 closePRASANNAXXX: ImageState<I>::close() 2 execute_action_unlock done 2023-05-09T19:53:31.389+0530 7f3826860580 20 librbd::ImageState: 0x55a64d6c6e80 closePRASANNAXXX: ImageState<I>::close() done 2023-05-09T19:53:31.389+0530 7f3812ffd640 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 handle_shut_down_update_watchers: r=0 2023-05-09T19:53:31.389+0530 7f3812ffd640 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 send_flush 2023-05-09T19:53:31.389+0530 7f3812ffd640 20 librbd::io::AsyncOperation: 0x7f37ec00a890 start_op 2023-05-09T19:53:31.389+0530 7f3812ffd640 20 librbd::io::Dispatcher: 0x55a64d6c8f90 send: dispatch_spec=0x7f37ec018a70 2023-05-09T19:53:31.389+0530 7f3812ffd640 20 librbd::io::QueueImageDispatch: 0x55a64d6c90e0 flush: tid=4 2023-05-09T19:53:31.389+0530 7f3812ffd640 20 librbd::io::Dispatcher: 0x55a64d6c8f90 send: dispatch_spec=0x7f37ec018a70 2023-05-09T19:53:31.389+0530 7f3812ffd640 20 librbd::io::QosImageDispatch: 0x55a64d6c9290 flush: tid=4 2023-05-09T19:53:31.389+0530 7f3812ffd640 20 librbd::io::Dispatcher: 0x55a64d6c8f90 send: dispatch_spec=0x7f37ec018a70 2023-05-09T19:53:31.389+0530 7f3812ffd640 20 librbd::io::RefreshImageDispatch: 0x55a64d6c9230 flush: tid=4 2023-05-09T19:53:31.389+0530 7f3812ffd640 20 librbd::io::WriteBlockImageDispatch: 0x55a64d46d1d0 flush: tid=4 2023-05-09T19:53:31.389+0530 7f3812ffd640 20 librbd::io::ImageDispatch: 0x55a64d6c90c0 flush: 2023-05-09T19:53:31.389+0530 7f3812ffd640 20 librbd::io::ImageRequest: 0x7f3812ff7590 send: aio_flush: ictx=0x55a64d677090, completion=0x7f37ec00a780 2023-05-09T19:53:31.389+0530 7f3812ffd640 20 librbd::io::AioCompletion: 0x7f37ec00a780 set_request_count: pending=1 2023-05-09T19:53:31.389+0530 7f3812ffd640 20 librbd::io::Dispatcher: 0x55a64d6ca580 send: dispatch_spec=0x7f37ec016670 2023-05-09T19:53:31.389+0530 7f3812ffd640 20 librbd::cache::WriteAroundObjectDispatch: 0x7f37e8005720 flush: 2023-05-09T19:53:31.389+0530 7f3812ffd640 20 librbd::io::AioCompletion: 0x7f37ec00a780 complete_request: cb=1, pending=0 2023-05-09T19:53:31.389+0530 7f3812ffd640 20 librbd::io::AioCompletion: 0x7f37ec00a780 finalize: r=0 2023-05-09T19:53:31.389+0530 7f3812ffd640 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 handle_flush: r=0 2023-05-09T19:53:31.389+0530 7f3812ffd640 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 send_unregister_image_watcher 2023-05-09T19:53:31.389+0530 7f3812ffd640 10 librbd::ImageWatcher: 0x7f37ec008c00 unregistering image watcher 2023-05-09T19:53:31.389+0530 7f3812ffd640 10 librbd::Watcher: 0x7f37ec008c00 unregister_watch: 2023-05-09T19:53:31.389+0530 7f3812ffd640 20 librbd::io::AsyncOperation: 0x7f37ec00a890 finish_op 2023-05-09T19:53:31.394+0530 7f37fdffb640 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 handle_unregister_image_watcher: r=0 2023-05-09T19:53:31.394+0530 7f37fdffb640 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 send_flush_readahead 2023-05-09T19:53:31.394+0530 7f3812ffd640 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 handle_flush_readahead: r=0 2023-05-09T19:53:31.394+0530 7f3812ffd640 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 send_shut_down_image_dispatcher 2023-05-09T19:53:31.394+0530 7f3812ffd640 20 librbd::io::AsyncOperation: 0x7f37e0001f00 start_op 2023-05-09T19:53:31.394+0530 7f3812ffd640 5 librbd::io::Dispatcher: 0x55a64d6c8f90 shut_down: 2023-05-09T19:53:31.394+0530 7f3812ffd640 20 librbd::io::FlushTracker: 0x55a64d46b5a0 shut_down: 2023-05-09T19:53:31.395+0530 7f3812ffd640 20 librbd::io::FlushTracker: 0x55a64d46ba10 shut_down: 2023-05-09T19:53:31.395+0530 7f3812ffd640 20 librbd::io::AsyncOperation: 0x7f37e0001f00 finish_op 2023-05-09T19:53:31.395+0530 7f3812ffd640 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 handle_shut_down_image_dispatcher: r=0 2023-05-09T19:53:31.395+0530 7f3812ffd640 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 send_shut_down_object_dispatcher 2023-05-09T19:53:31.395+0530 7f3812ffd640 5 librbd::io::Dispatcher: 0x55a64d6ca580 shut_down: 2023-05-09T19:53:31.395+0530 7f3812ffd640 5 librbd::io::ObjectDispatch: 0x55a64d6ca310 shut_down: 2023-05-09T19:53:31.395+0530 7f3812ffd640 5 librbd::cache::WriteAroundObjectDispatch: 0x7f37e8005720 shut_down: 2023-05-09T19:53:31.395+0530 7f3812ffd640 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 handle_shut_down_object_dispatcher: r=0 2023-05-09T19:53:31.395+0530 7f3812ffd640 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 send_flush_op_work_queue 2023-05-09T19:53:31.395+0530 7f3812ffd640 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 handle_flush_op_work_queue: r=0 2023-05-09T19:53:31.395+0530 7f3812ffd640 10 librbd::image::CloseRequest: 0x55a64d6d2dc0 handle_flush_image_watcher: r=0 2023-05-09T19:53:31.395+0530 7f3812ffd640 10 librbd::ImageState: 0x55a64d6c6e80 0x55a64d6c6e80 handle_close: r=0 2023-05-09T19:53:31.395+0530 7f37fdffb640 10 librbd::ImageCtx: 0x55a64d677090 ~ImageCtx 2023-05-09T19:53:31.395+0530 7f37fdffb640 20 librbd::AsioEngine: 0x55a64d6c4e80 ~AsioEngine: 2023-05-09T19:53:31.395+0530 7f37fdffb640 20 librbd::asio::ContextWQ: 0x55a64d6c4ec0 ~ContextWQ:
The problem is "CloseImageRequest<I>::handle_close_image(int r)" is never invoked on `client 1` for remote image.
Updated by Ilya Dryomov 12 months ago
What do you mean by "client 1" and "client 2" and "remote image" here? Force promote is a local operation (to the cluster the image is being promoted on) -- it's supposed to operate only on the image that is being promoted.
Updated by Prasanna Kumar Kalever 12 months ago
As commented earlier:
template <typename I> void ImageReplayer<I>::on_stop_journal_replay(int r, const std::string &desc) { dout(10) << dendl; { ┆ std::lock_guard locker{m_lock}; ┆ if (m_state != STATE_REPLAYING) { ┆ ┆ // might be invoked multiple times while stopping ┆ ┆ return; ┆ } ┆ m_stop_requested = true; ┆ m_state = STATE_STOPPING; } cancel_update_mirror_image_replay_status(); set_state_description(r, desc); update_mirror_image_status(true, boost::none); shut_down(0); }
shut_down() is invoked above
template <typename I> void ImageReplayer<I>::shut_down(int r) { dout(10) << "r=" << r << dendl; [...] // chain the shut down sequence (reverse order) Context *ctx = new LambdaContext( ----> Doesn't get invoked ┆ [this, r](int _r) { ┆ ┆ update_mirror_image_status(true, STATE_STOPPED); ┆ ┆ handle_shut_down(r); ┆ }); // destruct the state builder if (m_state_builder != nullptr) { ┆ ctx = new LambdaContext([this, ctx](int r) { ┆ ┆ m_state_builder->close(ctx); ----> This will be eventually blocked and will not call ctx->complete(0) ┆ }); } [...] }
Here is the list of calls and actions that happen which leads to miss of calling `update_mirror_image_status(true, STATE_STOPPED);` above:
-> m_state_builder->close(ctx); [src/tools/rbd_mirror/ImageReplayer.cc::shut_down()] -> void StateBuilder<I>::close(Context* on_finish) [src/tools/rbd_mirror/image_replayer/snapshot/StateBuilder.cc] -> void StateBuilder<I>::close_local_image(Context* on_finish) [src/tools/rbd_mirror/image_replayer/StateBuilder.cc] -> void CloseImageRequest<I>::close_image() [src/tools/rbd_mirror/image_replayer/CloseImageRequest.cc ] -> void ImageState<I>::close(Context *on_finish) [src/librbd/ImageState.cc] -> void ImageState<I>::execute_action_unlock(const Action &action Context *on_finish) [src/librbd/ImageState.cc] -> void ImageState<I>::append_context(const Action &action, Context *context) [src/librbd/ImageState.cc] -> void ImageState<I>::execute_next_action_unlock() [src/librbd/ImageState.cc] -> void ImageState<I>::send_close_unlock() [src/librbd/ImageState.cc] -> void CloseRequest<I>::send() [src/librbd/image/CloseRequest.cc] -> void CloseRequest<I>::send_block_image_watcher() [src/librbd/image/CloseRequest.cc] -> void CloseRequest<I>::handle_block_image_watcher(int r) [src/librbd/image/CloseRequest.cc] -> void CloseRequest<I>::send_shut_down_update_watchers() -> void CloseRequest<I>::handle_shut_down_update_watchers(int r) -> void CloseRequest<I>::send_flush() -> void CloseRequest<I>::handle_flush(int r) -> void CloseRequest<I>::send_shut_down_exclusive_lock() -> void CloseRequest<I>::send_unregister_image_watcher() -> void ImageWatcher<I>::unregister_watch(Context *on_finish) [src/librbd/ImageWatcher.cc] -> void Watcher::unregister_watch(Context *on_finish) [src/librbd/Watcher.cc] -> void CloseRequest<I>::handle_unregister_image_watcher(int r) [src/librbd/image/CloseRequest.cc] -> void CloseRequest<I>::send_flush_readahead() -> void CloseRequest<I>::handle_flush_readahead(int r) -> void CloseRequest<I>::send_shut_down_image_dispatcher() -> void CloseRequest<I>::handle_shut_down_image_dispatcher(int r) -> void CloseRequest<I>::send_shut_down_object_dispatcher() -> void CloseRequest<I>::handle_shut_down_object_dispatcher(int r) -> void CloseRequest<I>::send_flush_op_work_queue() -> void CloseRequest<I>::handle_flush_op_work_queue(int r) -> void CloseRequest<I>::send_close_parent() -> void CloseRequest<I>::handle_close_parent(int r) -> void CloseRequest<I>::send_flush_image_watcher() -> void CloseRequest<I>::handle_flush_image_watcher(int r) -> void CloseRequest<I>::finish() ->void ImageState<I>::handle_close(int r) [src/librbd/ImageState.cc] -> void CloseImageRequest<I>::handle_close_image(int r) [src/tools/rbd_mirror/image_replayer/CloseImageRequest.cc] -> void StateBuilder<I>::handle_close_local_image(int r, Context* on_finish) [src/tools/rbd_mirror/image_replayer/StateBuilder.cc] -> void StateBuilder<I>::close(Context* on_finish) [src/tools/rbd_mirror/image_replayer/snapshot/StateBuilder.cc] -> void StateBuilder<I>::close_remote_image(Context* on_finish) [src/tools/rbd_mirror/image_replayer/StateBuilder.cc] -> void CloseImageRequest<I>::close_image() [src/tools/rbd_mirror/image_replayer/CloseImageRequest.cc] -> void ImageState<I>::close(Context *on_finish) [src/librbd/ImageState.cc] -> void ImageState<I>::execute_action_unlock(const Action &action Context *on_finish) [src/librbd/ImageState.cc] -> void ImageState<I>::append_context(const Action &action, Context *context) [src/librbd/ImageState.cc] -> void ImageState<I>::execute_next_action_unlock() [src/librbd/ImageState.cc] -> void ImageState<I>::send_close_unlock() [src/librbd/ImageState.cc] -> void CloseRequest<I>::send() [src/librbd/image/CloseRequest.cc] -> void CloseRequest<I>::send_block_image_watcher() [src/librbd/image/CloseRequest.cc] -> void CloseRequest<I>::handle_block_image_watcher(int r) [src/librbd/image/CloseRequest.cc] -> void CloseRequest<I>::send_shut_down_update_watchers() -> void CloseRequest<I>::handle_shut_down_update_watchers(int r) -> void CloseRequest<I>::send_flush() -> void CloseRequest<I>::handle_flush(int r) -> void CloseRequest<I>::send_shut_down_exclusive_lock() -> void CloseRequest<I>::send_unregister_image_watcher() -> void ImageWatcher<I>::unregister_watch(Context *on_finish) [src/librbd/ImageWatcher.cc] -> void Watcher::unregister_watch(Context *on_finish) [src/librbd/Watcher.cc] -> This doesn't return
For remote image the control just stuck here in `m_ioctx.aio_unwatch` and doesn't invoke the callback:
void Watcher::unregister_watch(Context *on_finish) { ldout(m_cct, 10) << dendl; { ┆ std::unique_lock watch_locker{m_watch_lock}; ┆ if (m_watch_state != WATCH_STATE_IDLE) { ┆ ┆ ldout(m_cct, 10) << "delaying unregister until register completed" ┆ ┆ ┆ ┆ ┆ ┆ ┆ ┆ ┆ ┆ ┆<< dendl; ┆ ┆ ceph_assert(m_unregister_watch_ctx == nullptr); ┆ ┆ m_unregister_watch_ctx = new LambdaContext([this, on_finish](int r) { ┆ ┆ ┆ ┆ unregister_watch(on_finish); ┆ ┆ ┆ }); ┆ ┆ return; ┆ } else if (is_registered(m_watch_lock)) { ┆ ┆ librados::AioCompletion *aio_comp = create_rados_callback( ┆ ┆ ┆ new C_UnwatchAndFlush(m_ioctx, on_finish)); int r = m_ioctx.aio_unwatch(m_watch_handle, aio_comp); ┆ ┆ ceph_assert(r == 0); ┆ ┆ aio_comp->release(); ┆ ┆ m_watch_handle = 0; ┆ ┆ m_watch_blocklisted = false; ┆ ┆ return; ┆ } } on_finish->complete(0); }
It looks like `m_ioctx.aio_unwatch()` in `void Watcher::unregister_watch()` is stuck and doesn't callback.
Updated by Prasanna Kumar Kalever 12 months ago
- Status changed from New to Fix Under Review
- Pull request ID set to 51540