Bug #14089


ceph rbd api is not thread safe

Added by ceph zte over 8 years ago. Updated over 8 years ago.

Status:
Duplicate
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
rbd
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

My ceph version is 0.94. When I use multiple threads to run the rbd API as shown below, librbd core dumps.

Why is the librbd API not thread safe?

import rados
import rbd
import hashlib
import datetime
import threading

RADOS_NAME = 'client.admin'
RBDTIMEOUT = 20
num = 0

def md5(raw):
    hasher = hashlib.md5()
    hasher.update(raw)
    return hasher.hexdigest()

mu = threading.Lock()
mu1 = threading.Lock()

class RBDHandle():
    def __init__(self, clustername="ceph"):
        self._clustername = clustername

    def getmd5info(self, ivalue):
        hasher = hashlib.md5()
        hasher.update(str(ivalue))
        return hasher.hexdigest()

    def testconnect(self):
        try:
            cluster_handle = rados.Rados(name=RADOS_NAME, clustername=self._clustername, conffile='')
            cluster_handle.connect(timeout=RBDTIMEOUT)
        except:
            print "it is error"
        finally:
            cluster_handle.shutdown()

            if mu1.acquire():
                global num
                num = num + 1
                mu1.release()
                print num

def test_fuc():
    crb = RBDHandle()
    threads = []
    for i in range(1, 10):
        print "it is begin"
        t = threading.Thread(target=crb.testconnect)
        threads.append(t)

    for t in threads:
        t.setDaemon(True)
        t.start()
    for t in threads:
        t.join()

if __name__ == "__main__":
    while True:
        num = 0
        test_fuc()
        if num != 9:
            break
    print "it is over"
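The fan-out pattern in the reproducer can be exercised without a cluster by stubbing out the rados calls. Below is a minimal, self-contained sketch; `fake_connect` is a hypothetical stand-in for the connect/shutdown sequence, and every worker is joined before the lock-protected counter is checked:

```python
import threading

num = 0
lock = threading.Lock()

def fake_connect():
    # Stand-in for rados.Rados(...).connect()/shutdown()
    global num
    with lock:
        num += 1

def run_threads(count=9):
    threads = [threading.Thread(target=fake_connect) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:  # join *every* thread, not just the last one
        t.join()
    return num

print(run_threads())  # -> 9
```

Joining each thread ensures no worker is still running when the interpreter begins shutting down.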
Actions #1

Updated by Nathan Cutler over 8 years ago

  • Project changed from Ceph to rbd
  • Subject changed from ceph rbd api is not thread sate. to ceph rbd api is not thread safe
Actions #2

Updated by Jason Dillaman over 8 years ago

  • Status changed from New to Need More Info
  • Priority changed from Urgent to Normal

What is the exact issue you are encountering?

Actions #3

Updated by Josh Durgin over 8 years ago

  • Description updated (diff)
Actions #4

Updated by ceph zte over 8 years ago

I run the Python script below with many threads, e.g. 100. It often fails with errors like "Exception in thread Thread-95 (most likely raised during interpreter shutdown)".

def testconnect(self):
    try:
        cluster_handle = rados.Rados(name=RADOS_NAME, clustername=self._clustername, conffile='')
        cluster_handle.connect(timeout=RBDTIMEOUT)
    finally:
        cluster_handle.shutdown()
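One hazard in the snippet above, separate from any librbd issue: if `rados.Rados(...)` itself raises, the `finally` clause calls `shutdown()` on a variable that was never bound, so the real failure is masked by an `UnboundLocalError`. A defensive sketch of the same control flow (`FakeCluster` is a hypothetical stand-in, since no cluster is needed to illustrate it):

```python
class FakeCluster(object):
    """Stand-in for rados.Rados; records whether shutdown() ran."""
    def __init__(self):
        self.shut_down = False
    def connect(self, timeout=None):
        pass
    def shutdown(self):
        self.shut_down = True

def testconnect(factory=FakeCluster):
    cluster_handle = None
    try:
        cluster_handle = factory()
        cluster_handle.connect(timeout=20)
    finally:
        if cluster_handle is not None:  # only shut down what was actually created
            cluster_handle.shutdown()
    return cluster_handle

handle = testconnect()
print(handle.shut_down)  # -> True
```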
Actions #5

Updated by ceph zte over 8 years ago

The core dump print is as below

pure virtual method called
terminate called without an active exception
Aborted (core dumped)

Actions #6

Updated by Josh Durgin over 8 years ago

Could you open the core file that was dumped with gdb and get a backtrace?

i.e. run 'gdb python /path/to/core/file' and then the gdb command 'bt', and paste the full output here?

This does seem to be a good test case: on master, running it in a loop exposed a different bug (#14115).

Actions #7

Updated by ceph zte over 8 years ago

(gdb) bt
#0  0x00007f39abea35f9 in raise () from /lib64/libc.so.6
#1  0x00007f39abea5068 in abort () from /lib64/libc.so.6
#2  0x00007f399d8ef9d5 in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6
#3  0x00007f399d8ed946 in ?? () from /lib64/libstdc++.so.6
#4  0x00007f399d8ed973 in std::terminate() () from /lib64/libstdc++.so.6
#5  0x00007f399d8ee4df in __cxa_pure_virtual () from /lib64/libstdc++.so.6
#6  0x00007f3969a84dd4 in ThreadPool::WorkQueueVal<std::pair<Context*, int>, std::pair<Context*, int> >::_void_dequeue (this=0x7f36f8494c40) at ./common/WorkQueue.h:177
#7  0x00007f3969b75534 in ThreadPool::worker (this=0x7f36f822d650, wt=0x7f36f87b6d30) at common/WorkQueue.cc:120
#8  0x00007f3969b76ab0 in ThreadPool::WorkThread::entry (this=<optimized out>) at common/WorkQueue.h:318
#9  0x00007f39ac93fdf3 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f39abf6454d in clone () from /lib64/libc.so.6

Actions #8

Updated by Jason Dillaman over 8 years ago

I believe this is a duplicate of #13636 (#13758 is the backport ticket for Hammer). It should be included in the forthcoming v0.94.6 release.

Actions #9

Updated by Jason Dillaman over 8 years ago

  • Status changed from Need More Info to Duplicate