Bug #14089
closed
ceph rbd api is not thread safe
Added by ceph zte over 8 years ago.
Updated over 8 years ago.
Description
My Ceph version is 0.94. When I call the rbd API from multiple threads, as in the script below, librbd dumps core. Why is the librbd API not thread safe?
import rados
import rbd
import hashlib
import datetime
import threading

RADOS_NAME = 'client.admin'
RBDTIMEOUT = 20
num = 0

def md5(raw):
    import hashlib
    hasher = hashlib.md5()
    hasher.update(raw)
    return hasher.hexdigest()

mu = threading.Lock()
mu1 = threading.Lock()

class RBDHandle():
    def __init__(self, clustername="ceph"):
        self._clustername = clustername

    def getmd5info(self, ivalue):
        hasher = hashlib.md5()
        hasher.update(str(ivalue))
        return hasher.hexdigest()

    def testconnect(self):
        try:
            cluster_handle = rados.Rados(name=RADOS_NAME, clustername=self._clustername, conffile='')
            cluster_handle.connect(timeout=RBDTIMEOUT)
        except:
            print "it is error"
        finally:
            cluster_handle.shutdown()
        if mu1.acquire():
            global num
            num = num + 1
            mu1.release()
        print num

def test_fuc():
    crb = RBDHandle()
    data = {}
    threads = []
    for i in range(1, 10):
        print "it is begin"
        t = threading.Thread(target=crb.testconnect)
        threads.append(t)
    for t in threads:
        t.setDaemon(True)
        t.start()
    t.join()

if __name__ == "__main__":
    while True:
        global num
        num = 0
        test_fuc()
        if num != 9:
            break
    print "it is over"
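For readers without a cluster at hand, the threading pattern of the script above can be sketched on its own. This is only an illustration in Python 3 syntax (the original script is Python 2). `run_concurrently` and `fake_connect_and_shutdown` are hypothetical names, and the stand-in function does not touch rados or librbd, so this sketch cannot reproduce the crash. Unlike the script above, it joins every thread before reading the counter.

```python
import threading


def run_concurrently(target, n_threads):
    """Run target() in n_threads threads and wait for all of them.

    Returns the number of calls that finished without raising.
    """
    completed = [0]
    lock = threading.Lock()

    def worker():
        try:
            target()
        except Exception:
            # A failed call is simply not counted.
            return
        with lock:
            completed[0] += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    # Join every thread, so the counter is final when it is read.
    for t in threads:
        t.join()
    return completed[0]


def fake_connect_and_shutdown():
    # Hypothetical stand-in for the rados.Rados connect()/shutdown() pair.
    pass


if __name__ == "__main__":
    print(run_concurrently(fake_connect_and_shutdown, 9))
```

To exercise the real code path, the stand-in would be replaced by a function doing the connect and shutdown calls from the report.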
- Project changed from Ceph to rbd
- Subject changed from ceph rbd api is not thread sate. to ceph rbd api is not thread safe
- Status changed from New to Need More Info
- Priority changed from Urgent to Normal
What is the exact issue you are encountering?
- Description updated (diff)
I run the Python code below from many threads, for example 100. It sometimes fails with errors such as "Exception in thread Thread-95 (most likely raised during interpreter shutdown)".
def testconnect(self):
    try:
        cluster_handle = rados.Rados(name=RADOS_NAME, clustername=self._clustername, conffile='')
        cluster_handle.connect(timeout=RBDTIMEOUT)
    finally:
        cluster_handle.shutdown()
The output printed when it dumps core is as follows:
pure virtual method called
terminate called without an active exception
Aborted (core dumped)
Could you open the core file that was dumped with gdb and get a backtrace?
i.e. run 'gdb python /path/to/core/file' and then the gdb command 'bt', and paste the full output here?
This does seem to be a good test case - on master running it in a loop exposed a different bug (#14115)
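The gdb steps requested above can also be run non-interactively. The following is a minimal sketch, assuming `gdb` is on the PATH and that the core file came from the `python` executable. The helper names are hypothetical. The core path is a caller-supplied argument and is not filled in here.

```python
import subprocess


def gdb_backtrace_command(core_path, executable="python"):
    """Build a gdb command line that prints a backtrace and exits."""
    # -batch makes gdb exit after running the commands given with -ex.
    return ["gdb", "-batch", "-ex", "bt", executable, core_path]


def collect_backtrace(core_path, executable="python"):
    """Run gdb on the core file and return its combined output as text."""
    result = subprocess.run(
        gdb_backtrace_command(core_path, executable),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout
```

Calling `collect_backtrace` with the path of the dumped core file returns the text that would otherwise be copied from the interactive gdb session.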
(gdb) bt
#0 0x00007f39abea35f9 in raise () from /lib64/libc.so.6
#1 0x00007f39abea5068 in abort () from /lib64/libc.so.6
#2 0x00007f399d8ef9d5 in __gnu_cxx::__verbose_terminate_handler() ()
from /lib64/libstdc++.so.6
#3 0x00007f399d8ed946 in ?? () from /lib64/libstdc++.so.6
#4 0x00007f399d8ed973 in std::terminate() () from /lib64/libstdc++.so.6
#5 0x00007f399d8ee4df in __cxa_pure_virtual () from /lib64/libstdc++.so.6
#6 0x00007f3969a84dd4 in ThreadPool::WorkQueueVal<std::pair<Context*, int>, std::pair<Context*, int> >::_void_dequeue (this=0x7f36f8494c40)
at ./common/WorkQueue.h:177
#7 0x00007f3969b75534 in ThreadPool::worker (this=0x7f36f822d650,
wt=0x7f36f87b6d30) at common/WorkQueue.cc:120
#8 0x00007f3969b76ab0 in ThreadPool::WorkThread::entry (this=<optimized out>)
at common/WorkQueue.h:318
#9 0x00007f39ac93fdf3 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f39abf6454d in clone () from /lib64/libc.so.6
I believe this is a duplicate of #13636 (#13758 is the backport ticket for Hammer). It should be included in the forthcoming v0.94.6 release.
- Status changed from Need More Info to Duplicate