Bug #19897
closed
rbd may hang at 99% when removing a clone image
Added by Tang Jin almost 7 years ago. Updated over 6 years ago.
Description
Prerequisite: rbd_op_threads is 3 and rbd_cache is disabled.
When rbd removes a clone image, the rbd command may hang at 99%.
To remove a clone image, rbd first destroys the parent ImageCtx, which includes deleting the parent's op_work_queue. Deleting the queue waits until the ThreadPool has finished all of its jobs, and with 3 threads several jobs can be in flight at once.
The problem is that this 'drain' operation runs inside the ThreadPool::worker context itself. The worker that calls drain is still counted as an active job, so the active count never reaches zero and the ThreadPool never finishes.
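The self-drain hang described above can be reproduced with a toy model. This is only a sketch, not librbd's ThreadPool: TinyPool, drain_from_worker and drain_from_caller are hypothetical names, and the timeout on drain() is added purely so the demonstration terminates instead of hanging.

```python
import queue
import threading


class TinyPool:
    """Toy thread pool whose drain() waits until no job is active."""

    def __init__(self, num_threads):
        self._jobs = queue.Queue()
        self._cond = threading.Condition()
        self._active = 0  # jobs that are queued or currently running
        for _ in range(num_threads):
            threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, job):
        with self._cond:
            self._active += 1
        self._jobs.put(job)

    def _worker(self):
        while True:
            job = self._jobs.get()
            try:
                job()
            finally:
                # The job only stops counting as active after it returns.
                with self._cond:
                    self._active -= 1
                    self._cond.notify_all()

    def drain(self, timeout):
        """Return True once no job is active, False if the timeout expires."""
        with self._cond:
            return self._cond.wait_for(lambda: self._active == 0, timeout)


def drain_from_worker(num_threads=3, timeout=0.2):
    """Call drain() from inside a job, as in the scenario reported here."""
    pool = TinyPool(num_threads)
    result = []
    finished = threading.Event()

    def job():
        # This job is itself active, so the count cannot drop to zero.
        result.append(pool.drain(timeout))
        finished.set()

    pool.submit(job)
    finished.wait()
    return result[0]


def drain_from_caller(num_threads=3, timeout=2.0):
    """Call drain() from outside the pool, which completes normally."""
    pool = TinyPool(num_threads)
    pool.submit(lambda: None)
    return pool.drain(timeout)
```

In this model drain_from_worker() gives up after the timeout, while drain_from_caller() drains successfully. Without the timeout, the first call would block forever, which matches the hang at 99%.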
Updated by Tang Jin almost 7 years ago
This is the log with tp=15 enabled:
root@node1:jintang$ rbd rm test_pool/test_child
2017-05-10 10:37:43.889802 7f381283ed40 10 librbd::thread_pool start
2017-05-10 10:37:43.889807 7f381283ed40 10 librbd::thread_pool registering config observer on rbd_op_threads
2017-05-10 10:37:43.889812 7f381283ed40 10 librbd::thread_pool start_threads creating and starting 0x561767abe700
2017-05-10 10:37:43.889874 7f381283ed40 10 librbd::thread_pool start_threads creating and starting 0x561767af6880
2017-05-10 10:37:43.890051 7f381283ed40 10 librbd::thread_pool start_threads creating and starting 0x561767af6b50
2017-05-10 10:37:43.890074 7f381283ed40 15 librbd::thread_pool started
2017-05-10 10:37:43.890099 7f37ef7fe700 10 librbd::thread_pool worker start
2017-05-10 10:37:43.890129 7f37effff700 10 librbd::thread_pool worker start
2017-05-10 10:37:43.890148 7f37f4e05700 10 librbd::thread_pool worker start
2017-05-10 10:37:43.914399 7f37ef7fe700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37d8003920 (1 active)
2017-05-10 10:37:43.914422 7f37ef7fe700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37d8003920 (0 active)
2017-05-10 10:37:43.914425 7f37ef7fe700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37e4000fd0 (1 active)
2017-05-10 10:37:43.914434 7f37f4e05700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37d8003ad0 (2 active)
2017-05-10 10:37:43.914440 7f37effff700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37d80032d0 (3 active)
2017-05-10 10:37:43.914446 7f37ef7fe700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37e4000fd0 (2 active)
2017-05-10 10:37:43.914452 7f37f4e05700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37d8003ad0 (1 active)
2017-05-10 10:37:43.914466 7f37ef7fe700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37dc000df0 (2 active)
2017-05-10 10:37:43.914471 7f37effff700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37d80032d0 (1 active)
2017-05-10 10:37:43.914525 7f37ef7fe700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37dc000df0 (0 active)
2017-05-10 10:37:43.914865 7f37effff700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37d800aa00 (1 active)
2017-05-10 10:37:43.914873 7f37effff700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37d800aa00 (0 active)
2017-05-10 10:37:43.914873 7f37effff700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37dc000d80 (1 active)
2017-05-10 10:37:43.914880 7f37effff700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37dc000d80 (0 active)
2017-05-10 10:37:43.914881 7f37effff700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37d8002040 (1 active)
2017-05-10 10:37:43.914885 7f37effff700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37d8002040 (0 active)
2017-05-10 10:37:43.914885 7f37effff700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x561767af7370 (1 active)
2017-05-10 10:37:43.914907 7f37effff700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x561767af7370 (0 active)
2017-05-10 10:37:43.914978 7f37ef7fe700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x561767af78f0 (1 active)
2017-05-10 10:37:43.914988 7f37ef7fe700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x561767af78f0 (0 active)
2017-05-10 10:37:43.914989 7f37ef7fe700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37e40018e0 (1 active)
2017-05-10 10:37:43.915044 7f37ef7fe700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37e40018e0 (0 active)
2017-05-10 10:37:43.933738 7f37f4e05700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x561767af7350 (1 active)
2017-05-10 10:37:43.934303 7f37f4e05700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x561767af7350 (0 active)
Removing image: 99% complete...
2017-05-10 10:37:43.937271 7f37ef7fe700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x561767af78f0 (1 active)
2017-05-10 10:37:43.937304 7f37ef7fe700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x561767af78f0 (0 active)
2017-05-10 10:37:43.937305 7f37ef7fe700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37e4002df0 (1 active)
2017-05-10 10:37:43.937316 7f37ef7fe700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37e4002df0 (0 active)
2017-05-10 10:37:43.946777 7f37f4e05700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x561767a65360 (1 active)
2017-05-10 10:37:43.946845 7f37f4e05700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x561767a65360 (0 active)
2017-05-10 10:37:43.961223 7f37effff700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37e0003b60 (1 active)
2017-05-10 10:37:43.961230 7f37effff700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37e0003b60 (0 active)
2017-05-10 10:37:43.961231 7f37effff700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37dc000e40 (1 active)
2017-05-10 10:37:43.961239 7f37effff700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37dc000e40 (0 active)
2017-05-10 10:37:43.961239 7f37effff700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37dc001c30 (1 active)
2017-05-10 10:37:43.961264 7f37effff700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37dc001c30 (0 active)
2017-05-10 10:37:43.961265 7f37effff700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37dc001f00 (1 active)
2017-05-10 10:37:43.961286 7f37f4e05700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37dc001f50 (2 active)
2017-05-10 10:37:43.961296 7f37effff700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37dc001f00 (1 active)
2017-05-10 10:37:43.961339 7f37f4e05700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37dc001f50 (0 active)
2017-05-10 10:37:43.980598 7f37effff700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37dc001c50 (1 active)
2017-05-10 10:37:43.980657 7f37effff700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37dc001c50 (0 active)
2017-05-10 10:37:43.980659 7f37effff700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37dc000dd0 (1 active)
2017-05-10 10:37:43.980663 7f37effff700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37dc000dd0 (0 active)
2017-05-10 10:37:43.980683 7f37effff700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37dc000df0 (1 active)
2017-05-10 10:37:43.980694 7f37ef7fe700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37dc000d80 (2 active)
2017-05-10 10:37:43.980726 7f37f4e05700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37e4001860 (3 active)
2017-05-10 10:37:43.980736 7f37f4e05700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37e4001860 (2 active)
2017-05-10 10:37:43.980738 7f37f4e05700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37e4002df0 (3 active)
2017-05-10 10:37:43.980742 7f37f4e05700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37e4002df0 (2 active)
2017-05-10 10:37:43.980743 7f37f4e05700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37e0003ee0 (3 active)
2017-05-10 10:37:43.980747 7f37ef7fe700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37dc000d80 (2 active)
2017-05-10 10:37:43.980755 7f37ef7fe700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37dc001f50 (3 active)
2017-05-10 10:37:43.980760 7f37ef7fe700 10 librbd::thread_pool drain
2017-05-10 10:37:43.980760 7f37f4e05700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37e0003ee0 (2 active)
2017-05-10 10:37:43.980762 7f37effff700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37dc000df0 (1 active)
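One way to check the log above is to pair each "start processing" line with its "done processing" line and see which job never completes, then compare that thread with the one that logged "drain". The following is a rough sketch: the helper names are hypothetical, the regular expressions assume the line layout shown in this log, and SAMPLE is copied from the last lines of the log.

```python
import re

# Assumed layout: "<date> <time> <thread> <level> <prefix> worker wq <queue> start|done processing <ptr>"
JOB_RE = re.compile(
    r"(?P<thread>[0-9a-f]+)\s+\d+\s+\S+ worker wq \S+ "
    r"(?P<event>start|done) processing (?P<item>0x[0-9a-f]+)"
)
DRAIN_RE = re.compile(r"(?P<thread>[0-9a-f]+)\s+\d+\s+librbd::thread_pool drain")


def unfinished_jobs(lines):
    """Map each job pointer that was started but never finished to its thread."""
    running = {}
    for line in lines:
        match = JOB_RE.search(line)
        if not match:
            continue
        if match["event"] == "start":
            running[match["item"]] = match["thread"]
        else:
            # A "done" whose "start" is outside the excerpt is simply ignored.
            running.pop(match["item"], None)
    return running


def drain_threads(lines):
    """Return the threads that logged a thread_pool drain."""
    return [m["thread"] for m in map(DRAIN_RE.search, lines) if m]


SAMPLE = """\
2017-05-10 10:37:43.980743 7f37f4e05700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37e0003ee0 (3 active)
2017-05-10 10:37:43.980747 7f37ef7fe700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37dc000d80 (2 active)
2017-05-10 10:37:43.980755 7f37ef7fe700 12 tp_librbd worker wq librbd::op_work_queue start processing 0x7f37dc001f50 (3 active)
2017-05-10 10:37:43.980760 7f37ef7fe700 10 librbd::thread_pool drain
2017-05-10 10:37:43.980760 7f37f4e05700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37e0003ee0 (2 active)
2017-05-10 10:37:43.980762 7f37effff700 15 tp_librbd worker wq librbd::op_work_queue done processing 0x7f37dc000df0 (1 active)
""".splitlines()

if __name__ == "__main__":
    print("unfinished:", unfinished_jobs(SAMPLE))
    print("drain called by:", drain_threads(SAMPLE))
```

On this excerpt, the only job left unfinished is 0x7f37dc001f50 on thread 7f37ef7fe700, the same thread that logged the drain.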
Updated by Nathan Cutler almost 7 years ago
- Project changed from Ceph to rbd
- Category deleted (librbd)
Updated by Nathan Cutler almost 7 years ago
- Status changed from New to Fix Under Review
Updated by Jason Dillaman almost 7 years ago
- Priority changed from High to Normal
- Severity changed from 2 - major to 3 - minor
Note: op work threads are currently hard-coded to 1.
Updated by Jason Dillaman over 6 years ago
- Status changed from Fix Under Review to Duplicate
Multithreading issues are being tracked under #17379.