Bug #18963

Updated by Jason Dillaman about 7 years ago

When a local image is force-promoted to primary, the local rbd-mirror daemon should detect that the image is now primary, shut down its image replayer, and release the exclusive lock. However, if the remote peer is unreachable, the daemon can deadlock, and the image replayers do not shut down correctly.

 <pre> 
 #0    0x00007f96db88b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 
 #1    0x00007f96dc6c7ad1 in Wait (mutex=..., this=0x7f9636ff9da0) at common/Cond.h:56 
 #2    librados::IoCtxImpl::operate_read (this=this@entry=0x7f96efdfb050, oid=..., o=o@entry=0x7f9636ff9fc0, pbl=pbl@entry=0x7f9636ffa180, flags=flags@entry=0) at librados/IoCtxImpl.cc:725 
 #3    0x00007f96dc6d25d3 in librados::IoCtxImpl::exec (this=0x7f96efdfb050, oid=..., cls=cls@entry=0x7f96e649f4c7 "rbd", method=method@entry=0x7f96e64e42e7 "mirror_mode_get", inbl=..., outbl=...) at librados/IoCtxImpl.cc:1135 
 #4    0x00007f96dc681a74 in librados::IoCtx::exec (this=this@entry=0x7f96efdfb710, oid="rbd_mirroring", cls=cls@entry=0x7f96e649f4c7 "rbd", method=method@entry=0x7f96e64e42e7 "mirror_mode_get", inbl=..., outbl=...) at librados/librados.cc:1273 
 #5    0x00007f96e638ec7d in librbd::cls_client::mirror_mode_get (ioctx=ioctx@entry=0x7f96efdfb710, mirror_mode=mirror_mode@entry=0x7f9636ffa21c) at cls/rbd/cls_rbd_client.cc:1042 
 #6    0x00007f96e623bf10 in librbd::mirror_mode_get (io_ctx=..., mirror_mode=mirror_mode@entry=0x7f9636ffa3dc) at librbd/internal.cc:3445 
 #7    0x00007f96e61d471a in rbd::mirror::PoolWatcher::refresh (this=this@entry=0x7f96efdfb710, image_ids=image_ids@entry=0x7f9636ffa680) at tools/rbd_mirror/PoolWatcher.cc:90 
 #8    0x00007f96e61d54df in rbd::mirror::PoolWatcher::refresh_images (this=0x7f96efdfb710, reschedule=<optimized out>) at tools/rbd_mirror/PoolWatcher.cc:65 
 #9    0x00007f96e61b0c9a in operator() (a0=<optimized out>, this=<optimized out>) at /usr/include/boost/function/function_template.hpp:767 
 #10 FunctionContext::finish (this=<optimized out>, r=<optimized out>) at include/Context.h:460 
 #11 0x00007f96e61aeb89 in Context::complete (this=0x7f954c00d530, r=<optimized out>) at include/Context.h:64 
 #12 0x00007f96e63ccd24 in SafeTimer::timer_thread (this=0x7f96efdfb730) at common/Timer.cc:105 
 #13 0x00007f96e63ce75d in SafeTimerThread::entry (this=<optimized out>) at common/Timer.cc:38 
 #14 0x00007f96db887dc5 in start_thread () from /lib64/libpthread.so.0 
 #15 0x00007f96da77073d in clone () from /lib64/libc.so.6 
 </pre> 

 <pre> 
 #0    0x00007f96db88b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 
 #1    0x00007f96dc6c7ad1 in Wait (mutex=..., this=0x7f9596ffa120) at common/Cond.h:56 
 #2    librados::IoCtxImpl::operate_read (this=this@entry=0x7f96efe66fb0, oid=..., o=o@entry=0x7f9596ffa340, pbl=pbl@entry=0x7f9596ffa500, flags=flags@entry=0) at librados/IoCtxImpl.cc:725 
 #3    0x00007f96dc6d25d3 in librados::IoCtxImpl::exec (this=0x7f96efe66fb0, oid=..., cls=cls@entry=0x7f96e649f4c7 "rbd", method=method@entry=0x7f96e64e42c7 "mirror_uuid_get", inbl=..., outbl=...) at librados/IoCtxImpl.cc:1135 
 #4    0x00007f96dc681a74 in librados::IoCtx::exec (this=this@entry=0x7f96efe2d3f8, oid="rbd_mirroring", cls=cls@entry=0x7f96e649f4c7 "rbd", method=method@entry=0x7f96e64e42c7 "mirror_uuid_get", inbl=..., outbl=...) at librados/librados.cc:1273 
 #5    0x00007f96e638e8dd in librbd::cls_client::mirror_uuid_get (ioctx=ioctx@entry=0x7f96efe2d3f8, uuid=uuid@entry=0x7f9596ffa650) at cls/rbd/cls_rbd_client.cc:1010 
 Python Exception <type 'exceptions.ValueError'> Cannot find type const rbd::mirror::Replayer::ImageIds::_Rep_type:  
 #6    0x00007f96e61ac49f in rbd::mirror::Replayer::set_sources (this=this@entry=0x7f96efe2d2d0, image_ids=std::set with 4 elements) at tools/rbd_mirror/Replayer.cc:631 
 #7    0x00007f96e61adc47 in rbd::mirror::Replayer::run (this=0x7f96efe2d2d0) at tools/rbd_mirror/Replayer.cc:453 
 #8    0x00007f96e61b15fd in rbd::mirror::Replayer::ReplayerThread::entry (this=<optimized out>) at tools/rbd_mirror/Replayer.h:125 
 #9    0x00007f96db887dc5 in start_thread () from /lib64/libpthread.so.0 
 #10 0x00007f96da77073d in clone () from /lib64/libc.so.6 
 </pre> 

 <pre> 
 #0    0x00007f96db88e1bd in __lll_lock_wait () from /lib64/libpthread.so.0 
 #1    0x00007f96db889d02 in _L_lock_791 () from /lib64/libpthread.so.0 
 #2    0x00007f96db889c08 in pthread_mutex_lock () from /lib64/libpthread.so.0 
 #3    0x00007f96e63c5458 in Mutex::Lock (this=this@entry=0x7f96efdf5ad8, no_lockdep=no_lockdep@entry=false) at common/Mutex.cc:110 
 #4    0x00007f96e61a6767 in Locker (m=..., this=<synthetic pointer>) at common/Mutex.h:115 
 #5    rbd::mirror::Replayer::is_blacklisted (this=0x7f96efdf5ab0) at tools/rbd_mirror/Replayer.cc:263 
 Python Exception <type 'exceptions.ValueError'> Cannot find type const rbd::mirror::Mirror::PoolPeers::_Rep_type:  
 #6    0x00007f96e61a218b in rbd::mirror::Mirror::update_replayers (this=this@entry=0x7f96efdbcbe0, pool_peers=std::map with 3 elements) at tools/rbd_mirror/Mirror.cc:368 
 #7    0x00007f96e61a2cf6 in rbd::mirror::Mirror::run (this=0x7f96efdbcbe0) at tools/rbd_mirror/Mirror.cc:237 
 #8    0x00007f96e619a592 in main (argc=<optimized out>, argv=0x7ffe3e072c68) at tools/rbd_mirror/main.cc:74 
 </pre>
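
The first two backtraces show threads parked in synchronous librados @exec@ calls, and the third shows the main thread waiting for a mutex in @Replayer::is_blacklisted@. One plausible reading, which the traces suggest but do not prove, is that a lock is held across a call that never returns while the peer is unreachable. The following is a minimal C++ sketch of that general pattern and of one way to avoid it. It is not the rbd-mirror code: @ReplayerSketch@, @refresh_holding_lock@, @refresh_outside_lock@, @try_is_blacklisted@ and @query_succeeds_while_peer_hangs@ are hypothetical names, and the hung peer is simulated with a future that stays unset until the test releases it.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a replayer; NOT the rbd::mirror::Replayer class.
class ReplayerSketch {
public:
  // Anti-pattern: the lock stays held for as long as the call blocks.
  void refresh_holding_lock(const std::function<bool()>& blocking_call) {
    std::lock_guard<std::timed_mutex> locker(m_lock);
    m_blacklisted = blocking_call();
  }

  // Alternative: block without the lock, then take it briefly to publish.
  void refresh_outside_lock(const std::function<bool()>& blocking_call) {
    bool result = blocking_call();
    std::lock_guard<std::timed_mutex> locker(m_lock);
    m_blacklisted = result;
  }

  // Status query that gives up after `timeout` instead of waiting forever.
  // Returns false if the lock could not be acquired in time.
  bool try_is_blacklisted(std::chrono::milliseconds timeout, bool* out) {
    std::unique_lock<std::timed_mutex> locker(m_lock, timeout);
    if (!locker.owns_lock()) {
      return false;
    }
    *out = m_blacklisted;
    return true;
  }

private:
  std::timed_mutex m_lock;
  bool m_blacklisted = false;
};

// Simulates an unreachable peer and reports whether a status query from
// another thread was answered while the "remote" call was still hung.
bool query_succeeds_while_peer_hangs(bool hold_lock_during_call) {
  ReplayerSketch replayer;
  std::atomic<bool> entered{false};
  std::promise<void> release;
  std::shared_future<void> gate = release.get_future().share();

  // Assumed stand-in for a synchronous call to the peer: it signals that
  // it has started and then blocks until the gate is released.
  auto hung_call = [&entered, gate]() {
    entered = true;
    gate.wait();
    return true;
  };

  std::thread worker([&]() {
    if (hold_lock_during_call) {
      replayer.refresh_holding_lock(hung_call);
    } else {
      replayer.refresh_outside_lock(hung_call);
    }
  });

  // Wait until the worker is inside the hung call.
  while (!entered) {
    std::this_thread::yield();
  }

  bool value = false;
  bool answered =
      replayer.try_is_blacklisted(std::chrono::milliseconds(50), &value);

  release.set_value();  // let the simulated peer "come back"
  worker.join();
  return answered;
}
```

In this sketch, @query_succeeds_while_peer_hangs(true)@ is the case where the query cannot get the lock, and @query_succeeds_while_peer_hangs(false)@ is the case where it can. A timed lock only keeps the caller from hanging; keeping blocking calls outside the lock (or making them asynchronous) is what removes the dependency on the peer.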
