Bug #4185

closed

Python multiprocessing exhibiting odd behaviour with librados

Added by Gerben Meijer about 11 years ago. Updated about 10 years ago.

Status: Won't Fix
Priority: High
Assignee:
Category: librados
Target version: -
% Done: 0%
Source: Community (user)
Tags: python
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The following code snippet yields unexpected results:

#!/usr/bin/python
import multiprocessing
import rbd
import rados
import logging
# some useful multiprocessing logging
mpl = multiprocessing.log_to_stderr()
mpl.setLevel(logging.INFO)

def childconnect():
  # set up a rados object
  threadcluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
  # and connect. but this fails, mon logs show 'unexpected key: req.key=0'
  threadcluster.connect()
  # :(
  threadcluster.shutdown()

if __name__ == '__main__':
  # set up a main connection
  cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
  cluster.connect()
  # do some generic stuff first, before we set up child processes to do more work.
  job = []
  job.append(multiprocessing.Process(target=childconnect))
  job.append(multiprocessing.Process(target=childconnect))
  for j in job:
    j.start()
    j.join()
  # and we're done.
  cluster.shutdown()

If the cluster object and the cluster.connect/shutdown calls are removed from the main process, the child connections work fine. If they are kept as is, the child connections consistently fail:

Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "./mpcephfail.py", line 14, in childconnect
    threadcluster.connect()
  File "/usr/lib/python2.7/dist-packages/rados.py", line 185, in connect
    raise make_ex(ret, "error calling connect")
PermissionError: error calling connect

The mons log the following error:

2013-02-18 10:52:54.592627 7f871c6f1700  0 cephx server client.admin:  unexpected key: req.key=0 expected_key=e0c36589ad8a04e7

This can be worked around by doing everything Ceph-related in a child process and never connecting in the main process, but it seems to be an issue with the Python bindings.
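A minimal sketch of that workaround, assuming the only requirement is that the parent never creates or connects a Rados object. The names make_cluster, ceph_task and run_tasks are hypothetical helpers, not part of the rados bindings, and the lazy import of rados inside the child is an extra precaution rather than something the report shows to be necessary.

```python
import multiprocessing

CONF = '/etc/ceph/ceph.conf'  # conffile path taken from the report


def make_cluster():
    # Imported here so that only the child process touches the bindings
    # (a precaution; the report only ties the failure to connecting).
    import rados
    return rados.Rados(conffile=CONF)


def ceph_task(results, factory=make_cluster):
    """Runs inside a child process: connect, do the work, shut down."""
    try:
        cluster = factory()
        cluster.connect()
        try:
            pass  # the real RADOS/RBD work would go here
        finally:
            cluster.shutdown()
        outcome = ('ok', None)
    except Exception as exc:
        # Report the failure to the parent instead of dying silently.
        outcome = ('error', repr(exc))
    results.put(outcome)


def run_tasks(count=2):
    """Parent side: start the children and collect one outcome from each."""
    results = multiprocessing.Queue()
    jobs = [multiprocessing.Process(target=ceph_task, args=(results,))
            for _ in range(count)]
    for j in jobs:
        j.start()
    outcomes = [results.get() for _ in jobs]
    for j in jobs:
        j.join()
    return outcomes
```

Call run_tasks() from under the usual `if __name__ == '__main__':` guard. The factory argument exists only so that the child logic can be exercised with a stand-in object on a machine without a cluster.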

Actions #1

Updated by Ian Colle about 11 years ago

  • Assignee set to Josh Durgin
Actions #2

Updated by Ian Colle over 10 years ago

  • Priority changed from Normal to High
Actions #3

Updated by Samuel Just over 10 years ago

  • Status changed from New to Won't Fix

Multiprocessing forks, and that's a problem.
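One way to make this failure mode easier to spot is a small guard that remembers which process first connected and refuses to connect from a process that inherited that marker through a fork. This is only a sketch: ForkGuard and GUARD are hypothetical names, nothing like this exists in the rados bindings, and it assumes that any connect in an ancestor process makes a later connect in a forked child unsafe, which matches the report but is not confirmed beyond it.

```python
import os


class ForkGuard(object):
    """Remembers the PID of the process that first connected."""

    def __init__(self):
        self._connected_pid = None

    def mark_connected(self):
        # Record only the first connect; later calls keep the original PID.
        if self._connected_pid is None:
            self._connected_pid = os.getpid()

    def check(self):
        """Raise if the marker was inherited from another process."""
        pid = self._connected_pid
        if pid is not None and pid != os.getpid():
            raise RuntimeError(
                'process %d inherited cluster state from process %d; '
                'connect before forking is not supported'
                % (os.getpid(), pid))


# Module-level instance, copied into every forked child along with its state.
GUARD = ForkGuard()
```

The intended use is to call GUARD.check() just before connect() and GUARD.mark_connected() right after it succeeds. A child forked from an already connected parent would then stop with an explicit message instead of the PermissionError shown in the report.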

Actions #4

Updated by Evan Felix about 10 years ago

Some notes for the next person who comes across this issue:

You can use the multiprocessing managers to push all the rados calls to a single process. The code can be changed to this:

#!/usr/bin/python
import multiprocessing
import multiprocessing.managers
import rbd
import rados
import logging
# some useful multiprocessing logging
mpl = multiprocessing.log_to_stderr()
mpl.setLevel(logging.INFO)

class MyManager(multiprocessing.managers.BaseManager):
        pass
MyManager.register("Rados",rados.Rados)

def childconnect(mgr):
  # set up a rados object
  threadcluster = mgr.Rados(conffile='/etc/ceph/ceph.conf')
  # and connect; through the manager proxy the call runs in the manager process
  threadcluster.connect()
  # shut down again, also through the proxy
  threadcluster.shutdown()

if __name__ == '__main__':
  mgr = MyManager()
  mgr.start()
  # set up a main connection
  cluster = mgr.Rados(conffile='/etc/ceph/ceph.conf')
  cluster.connect()
  # do some generic stuff first, before we set up child processes to do more work.
  job = []
  job.append(multiprocessing.Process(target=childconnect,args=(mgr,)))
  job.append(multiprocessing.Process(target=childconnect,args=(mgr,)))
  for j in job:
    j.start()
    j.join()
  # and we're done.
  cluster.shutdown()