Bug #11128 (closed)

"[ FAILED ] LibCephFS".* tests in smoke-master-distro-basic-multi

Added by Yuri Weinstein about 9 years ago. Updated about 9 years ago.

Status: Resolved
Priority: Urgent
Assignee: Zheng Yan
Category: -
Target version: -
% Done: 0%
Source: Q/A
Severity: 3 - minor

Description

Run: http://pulpito.ceph.com/teuthology-2015-03-14_02:35:02-smoke-master-distro-basic-multi/
Job: 803964
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2015-03-14_02:35:02-smoke-master-distro-basic-multi/803964/

2015-03-14T22:04:50.177 INFO:tasks.workunit.client.0.plana64.stdout:[ RUN      ] LibCephFS.ThreesomeLocking
2015-03-14T22:04:55.181 INFO:tasks.workunit.client.0.plana64.stdout:[       OK ] LibCephFS.ThreesomeLocking (5004 ms)
2015-03-14T22:04:55.181 INFO:tasks.workunit.client.0.plana64.stdout:[ RUN      ] LibCephFS.InterProcessLocking
2015-03-14T22:04:55.193 INFO:tasks.workunit.client.0.plana64.stdout:test/libcephfs/flock.cc:386: Failure
2015-03-14T22:04:55.193 INFO:tasks.workunit.client.0.plana64.stdout:Value of: ceph_mount(cmount, __null)
2015-03-14T22:04:55.193 INFO:tasks.workunit.client.0.plana64.stdout:  Actual: -1
2015-03-14T22:04:55.193 INFO:tasks.workunit.client.0.plana64.stdout:Expected: 0
2015-03-14T22:05:00.357 INFO:tasks.workunit.client.0.plana64.stdout:test/libcephfs/flock.cc:470: Failure
2015-03-14T22:05:00.357 INFO:tasks.workunit.client.0.plana64.stdout:Value of: sem_timedwait(&s.sem, abstime(ts, waitSlowMs))
2015-03-14T22:05:00.357 INFO:tasks.workunit.client.0.plana64.stdout:  Actual: -1
2015-03-14T22:05:00.357 INFO:tasks.workunit.client.0.plana64.stdout:Expected: 0
2015-03-14T22:05:00.357 INFO:tasks.workunit.client.0.plana64.stdout:[  FAILED  ] LibCephFS.InterProcessLocking (5176 ms)
2015-03-14T22:05:00.357 INFO:tasks.workunit.client.0.plana64.stdout:[ RUN      ] LibCephFS.ThreesomeInterProcessLocking
2015-03-14T22:05:00.370 INFO:tasks.workunit.client.0.plana64.stdout:test/libcephfs/flock.cc:386: Failure
2015-03-14T22:05:00.370 INFO:tasks.workunit.client.0.plana64.stdout:Value of: ceph_mount(cmount, __null)
2015-03-14T22:05:00.370 INFO:tasks.workunit.client.0.plana64.stdout:  Actual: -1
2015-03-14T22:05:00.370 INFO:tasks.workunit.client.0.plana64.stdout:Expected: 0
2015-03-14T22:05:00.371 INFO:tasks.workunit.client.0.plana64.stdout:test/libcephfs/flock.cc:386: Failure
2015-03-14T22:05:00.371 INFO:tasks.workunit.client.0.plana64.stdout:Value of: ceph_mount(cmount, __null)
2015-03-14T22:05:00.371 INFO:tasks.workunit.client.0.plana64.stdout:  Actual: -1
2015-03-14T22:05:00.371 INFO:tasks.workunit.client.0.plana64.stdout:Expected: 0
2015-03-14T22:05:05.507 INFO:tasks.workunit.client.0.plana64.stdout:test/libcephfs/flock.cc:581: Failure
2015-03-14T22:05:05.507 INFO:tasks.workunit.client.0.plana64.stdout:Value of: sem_timedwait(&s.sem, abstime(ts, waitSlowMs))
2015-03-14T22:05:05.507 INFO:tasks.workunit.client.0.plana64.stdout:  Actual: -1
2015-03-14T22:05:05.507 INFO:tasks.workunit.client.0.plana64.stdout:Expected: 0
2015-03-14T22:05:05.507 INFO:tasks.workunit.client.0.plana64.stdout:[  FAILED  ] LibCephFS.ThreesomeInterProcessLocking (5150 ms)
2015-03-14T22:05:05.508 INFO:tasks.workunit.client.0.plana64.stdout:[----------] 41 tests from LibCephFS (102919 ms total)
2015-03-14T22:05:05.508 INFO:tasks.workunit.client.0.plana64.stdout:
2015-03-14T22:05:05.508 INFO:tasks.workunit.client.0.plana64.stdout:[----------] 1 test from Caps
2015-03-14T22:05:05.508 INFO:tasks.workunit.client.0.plana64.stdout:[ RUN      ] Caps.ReadZero
2015-03-14T22:05:10.184 INFO:tasks.workunit.client.0.plana64.stdout:[       OK ] Caps.ReadZero (4677 ms)
2015-03-14T22:05:10.184 INFO:tasks.workunit.client.0.plana64.stdout:[----------] 1 test from Caps (4677 ms total)
2015-03-14T22:05:10.184 INFO:tasks.workunit.client.0.plana64.stdout:
2015-03-14T22:05:10.184 INFO:tasks.workunit.client.0.plana64.stdout:[----------] Global test environment tear-down
2015-03-14T22:05:10.184 INFO:tasks.workunit.client.0.plana64.stdout:[==========] 42 tests from 2 test cases ran. (107596 ms total)
2015-03-14T22:05:10.185 INFO:tasks.workunit.client.0.plana64.stdout:[  PASSED  ] 40 tests.
2015-03-14T22:05:10.185 INFO:tasks.workunit.client.0.plana64.stdout:[  FAILED  ] 2 tests, listed below:
2015-03-14T22:05:10.185 INFO:tasks.workunit.client.0.plana64.stdout:[  FAILED  ] LibCephFS.InterProcessLocking
2015-03-14T22:05:10.185 INFO:tasks.workunit.client.0.plana64.stdout:[  FAILED  ] LibCephFS.ThreesomeInterProcessLocking
2015-03-14T22:05:10.185 INFO:tasks.workunit.client.0.plana64.stdout:
2015-03-14T22:05:10.185 INFO:tasks.workunit.client.0.plana64.stdout: 2 FAILED TESTS
2015-03-14T22:05:10.189 INFO:tasks.workunit:Stopping ['libcephfs/test.sh'] on client.0...

Files

nss_test.cc (3.98 KB), Zheng Yan, 03/20/2015 01:16 PM
Actions #1

Updated by Greg Farnum about 9 years ago

  • Assignee set to Zheng Yan

It looks to me like these tests haven't passed since they were merged. :/ Zheng, please check out the failures!

My first thought was that it's an old version of fuse, but looking at http://qa-proxy.ceph.com/teuthology/teuthology-2015-03-13_23:04:01-fs-master-testing-basic-multi/803906/ the kernel is our testing branch, based off of 4.0-rc3. :(

http://qa-proxy.ceph.com/teuthology/teuthology-2015-03-13_23:04:01-fs-master-testing-basic-multi/803953/ has a lockdep assert mixed in with it; perhaps the test code is racy or broken in some way around that, which is causing issues?

And finally http://qa-proxy.ceph.com/teuthology/teuthology-2015-03-14_23:04:02-fs-hammer-testing-basic-multi/804436/ is dying on a message assert; I'm not sure if that's related or not.

All of these except the last one are duplicated elsewhere.

Actions #2

Updated by Sage Weil about 9 years ago

  • Project changed from Ceph to CephFS
Actions #3

Updated by Zheng Yan about 9 years ago

Sorry, my PR does not work. The failure of "ceph_mount(cmount, __null)" is caused by an authentication error. I have no idea why the parent process succeeded in authenticating but the forked child process failed.

Log of the parent process:

2015-03-17 08:09:39.232632 7fb4d1ffb700 10 cephx: set_have_need_key no handler for service mds
2015-03-17 08:09:39.232635 7fb4d1ffb700 10 cephx: set_have_need_key no handler for service osd
2015-03-17 08:09:39.232636 7fb4d1ffb700 10 cephx: set_have_need_key no handler for service auth
2015-03-17 08:09:39.232637 7fb4d1ffb700 10 cephx: validate_tickets want 38 have 0 need 38
2015-03-17 08:09:39.232642 7fb4d1ffb700 10 monclient(hunting): my global_id is 4295
2015-03-17 08:09:39.232645 7fb4d1ffb700 10 cephx client: handle_response ret = 0
2015-03-17 08:09:39.232647 7fb4d1ffb700 10 cephx client:  got initial server challenge 2780952419503487784
2015-03-17 08:09:39.232651 7fb4d1ffb700 10 cephx client: validate_tickets: want=38 need=38 have=0
2015-03-17 08:09:39.232652 7fb4d1ffb700 10 cephx: set_have_need_key no handler for service mds
2015-03-17 08:09:39.232657 7fb4d1ffb700 10 cephx: set_have_need_key no handler for service osd
2015-03-17 08:09:39.232658 7fb4d1ffb700 10 cephx: set_have_need_key no handler for service auth
2015-03-17 08:09:39.232659 7fb4d1ffb700 10 cephx: validate_tickets want 38 have 0 need 38
2015-03-17 08:09:39.232660 7fb4d1ffb700 10 cephx client: want=38 need=38 have=0
2015-03-17 08:09:39.232663 7fb4d1ffb700 10 cephx client: build_request
2015-03-17 08:09:39.232827 7fb4d1ffb700 10 cephx client: get auth session key: client_challenge 15310706112129456750
2015-03-17 08:09:39.232838 7fb4d1ffb700 10 monclient(hunting): _send_mon_message to mon.c at 10.214.131.21:6790/0
2015-03-17 08:09:39.233844 7fb4d1ffb700 10 cephx client: handle_response ret = 0
2015-03-17 08:09:39.233853 7fb4d1ffb700 10 cephx client:  get_auth_session_key
2015-03-17 08:09:39.233858 7fb4d1ffb700 10 cephx: verify_service_ticket_reply got 1 keys
2015-03-17 08:09:39.233860 7fb4d1ffb700 10 cephx: got key for service_id auth
2015-03-17 08:09:39.233905 7fb4d1ffb700 10 cephx:  ticket.secret_id=2
2015-03-17 08:09:39.233908 7fb4d1ffb700 10 cephx: verify_service_ticket_reply service auth secret_id 2 session_key AQCzQwhVA4XxDRAA3bHFZdDRPb+U2P/K91HVKw== validity=43200.000000
2015-03-17 08:09:39.233930 7fb4d1ffb700 10 cephx: ticket expires=2015-03-17 20:09:39.233930 renew_after=2015-03-17 17:09:39.233930
2015-03-17 08:09:39.233937 7fb4d1ffb700 10 cephx client:  want=38 need=38 have=0
2015-03-17 08:09:39.233939 7fb4d1ffb700 10 cephx: set_have_need_key no handler for service mds
2015-03-17 08:09:39.233940 7fb4d1ffb700 10 cephx: set_have_need_key no handler for service osd
2015-03-17 08:09:39.233942 7fb4d1ffb700 10 cephx: validate_tickets want 38 have 32 need 6
2015-03-17 08:09:39.233944 7fb4d1ffb700 10 cephx client: validate_tickets: want=38 need=6 have=32
2015-03-17 08:09:39.233946 7fb4d1ffb700 10 cephx: set_have_need_key no handler for service mds
2015-03-17 08:09:39.233947 7fb4d1ffb700 10 cephx: set_have_need_key no handler for service osd
2015-03-17 08:09:39.233948 7fb4d1ffb700 10 cephx: validate_tickets want 38 have 32 need 6
2015-03-17 08:09:39.233949 7fb4d1ffb700 10 cephx client: want=38 need=6 have=32
2015-03-17 08:09:39.233951 7fb4d1ffb700 10 cephx client: build_request
2015-03-17 08:09:39.233952 7fb4d1ffb700 10 cephx client: get service keys: want=38 need=6 have=32
2015-03-17 08:09:39.233996 7fb4d1ffb700 10 monclient(hunting): _send_mon_message to mon.c at 10.214.131.21:6790/0
2015-03-17 08:09:39.235088 7fb4d1ffb700 10 cephx client: handle_response ret = 0
2015-03-17 08:09:39.235092 7fb4d1ffb700 10 cephx client:  get_principal_session_key session_key AQCzQwhVA4XxDRAA3bHFZdDRPb+U2P/K91HVKw==
2015-03-17 08:09:39.235105 7fb4d1ffb700 10 cephx: verify_service_ticket_reply got 2 keys
2015-03-17 08:09:39.235107 7fb4d1ffb700 10 cephx: got key for service_id mds
2015-03-17 08:09:39.235145 7fb4d1ffb700 10 cephx:  ticket.secret_id=2
2015-03-17 08:09:39.235147 7fb4d1ffb700 10 cephx: verify_service_ticket_reply service mds secret_id 2 session_key AQCzQwhVE8cDDhAAIw5E5M6ErXQZ2Z3y9S4X1Q== validity=3600.000000
2015-03-17 08:09:39.235158 7fb4d1ffb700 10 cephx: ticket expires=2015-03-17 09:09:39.235158 renew_after=2015-03-17 08:54:39.235158
2015-03-17 08:09:39.235165 7fb4d1ffb700 10 cephx: got key for service_id osd
2015-03-17 08:09:39.235192 7fb4d1ffb700 10 cephx:  ticket.secret_id=2
2015-03-17 08:09:39.235194 7fb4d1ffb700 10 cephx: verify_service_ticket_reply service osd secret_id 2 session_key AQCzQwhVaQwEDhAAcqMEfLHaEkxo6ZFy8NJLHw== validity=3600.000000
2015-03-17 08:09:39.235204 7fb4d1ffb700 10 cephx: ticket expires=2015-03-17 09:09:39.235204 renew_after=2015-03-17 08:54:39.235204
2015-03-17 08:09:39.235210 7fb4d1ffb700 10 cephx: validate_tickets want 38 have 38 need 0
2015-03-17 08:09:39.235213 7fb4d1ffb700  1 monclient(hunting): found mon.c
2015-03-17 08:09:39.235214 7fb4d1ffb700 10 monclient: _send_mon_message to mon.c at 10.214.131.21:6790/0
2015-03-17 08:09:39.235238 7fb4d1ffb700 10 monclient: _send_mon_message to mon.c at 10.214.131.21:6790/0
2015-03-17 08:09:39.235256 7fb4d1ffb700 10 cephx: validate_tickets want 38 have 38 need 0
2015-03-17 08:09:39.235257 7fb4d1ffb700 20 cephx client: need_tickets: want=38 need=0 have=38
2015-03-17 08:09:39.235261 7fb4d1ffb700 20 monclient: _check_auth_rotating not needed by client.admin
2015-03-17 08:09:39.235277 7fb4dfbf0780  5 monclient: authenticate success, global_id 4295

Log of the child process:

2015-03-17 08:10:35.235680 7fb4d1ffb700 10 cephx: set_have_need_key no handler for service mds
2015-03-17 08:10:35.235683 7fb4d1ffb700 10 cephx: set_have_need_key no handler for service osd
2015-03-17 08:10:35.235684 7fb4d1ffb700 10 cephx: set_have_need_key no handler for service auth
2015-03-17 08:10:35.235685 7fb4d1ffb700 10 cephx: validate_tickets want 38 have 0 need 38
2015-03-17 08:10:35.235689 7fb4d1ffb700 10 monclient(hunting): my global_id is 4156
2015-03-17 08:10:35.235692 7fb4d1ffb700 10 cephx client: handle_response ret = 0
2015-03-17 08:10:35.235694 7fb4d1ffb700 10 cephx client:  got initial server challenge 3578289682439041774
2015-03-17 08:10:35.235697 7fb4d1ffb700 10 cephx client: validate_tickets: want=38 need=38 have=0
2015-03-17 08:10:35.235699 7fb4d1ffb700 10 cephx: set_have_need_key no handler for service mds
2015-03-17 08:10:35.235700 7fb4d1ffb700 10 cephx: set_have_need_key no handler for service osd
2015-03-17 08:10:35.235701 7fb4d1ffb700 10 cephx: set_have_need_key no handler for service auth
2015-03-17 08:10:35.235704 7fb4d1ffb700 10 cephx: validate_tickets want 38 have 0 need 38
2015-03-17 08:10:35.235705 7fb4d1ffb700 10 cephx client: want=38 need=38 have=0
2015-03-17 08:10:35.235708 7fb4d1ffb700 10 cephx client: build_request
2015-03-17 08:10:35.235893 7fb4d1ffb700 10 cephx client: get auth session key: client_challenge 4441271746298158252
2015-03-17 08:10:35.235903 7fb4d1ffb700 10 monclient(hunting): _send_mon_message to mon.b at 10.214.131.21:6789/0
2015-03-17 08:10:35.236682 7fb4d1ffb700 10 cephx client: handle_response ret = -1
2015-03-17 08:10:35.236686 7fb4d1ffb700  1 monclient(hunting): found mon.b
2015-03-17 08:10:35.236713 7fb4dfbf0780  1 client.-1 shutdown
2015-03-17 08:10:35.237245 7fb4dfbf0780 10 monclient: shutdown
2015-03-17 08:10:35.237261 7fb4dfbf0780 20 monclient: shutdown discarding pending message mon_subscribe({osdmap=0}) v2
2015-03-17 08:10:35.237275 7fb4dfbf0780 20 monclient: shutdown discarding pending message mon_subscribe({mdsmap=0+,osdmap=0}) v2
2015-03-17 08:10:35.238457 7fb4dfbf0780 20 client.-1 trim_cache size 0 max 0

Actions #4

Updated by Greg Farnum about 9 years ago

I haven't looked at the code, but if I had to guess, it's fork()ing at a bad time and so hitting authentication issues with the monitor?

We've seen other failures beyond connection issues (note the lockdep assert in particular) so I think we'll probably want the PR to go in as well.

Actions #5

Updated by Greg Farnum about 9 years ago

Uh, these two tests both have one of their threads return via "exit(EXIT_FAILURE);" in what looks like the normal path. Surely that will cause the test to be interpreted as a failure, which is not the intended behavior?

Actions #6

Updated by Greg Farnum about 9 years ago

Right, I forgot some of what we discussed in standup. That exit code is okay because we exit earlier in process_ConcurrentLocking. It's odd that this isn't a problem on our local nodes but apparently is in the sepia lab. :/

Actions #7

Updated by Greg Farnum about 9 years ago

The mon log has these lines from one run:

2015-03-18 16:04:49.786133 7fac487ed700  0 cephx server client.admin:  unexpected key: req.key=7ff428ff85e0 expected_key=106bfa9b7693976a
2015-03-18 16:04:55.021756 7fac487ed700  0 cephx server client.admin:  unexpected key: req.key=7ff42bffe5e0 expected_key=8443cd5d4fc6d54c
2015-03-18 16:04:55.023631 7fac487ed700  0 cephx server client.admin:  unexpected key: req.key=7ff42bffe5e0 expected_key=b317a0892f21ffeb

Note that the given key is the same for two of the three requests (the client is using a different nonce for each one) but the expected key is different for all of them.

This doesn't happen with locally-built code on my rex box, but it does happen reliably in the sepia lab with our packages. I think there must be some kind of race in the test code, probably to do with the fork, but I can't see it either. :/
Maybe it's about the shadowing of the "waitMs" static const member by the concurrency tests? Or... I've got nothing.

Actions #8

Updated by Zheng Yan about 9 years ago

Found a race: https://github.com/ceph/ceph/commit/15d85a6e4a47f241917e6c1becbe90e21309b711. I don't know if it's related to the mount failure.

Actions #9

Updated by Zheng Yan about 9 years ago

After applying the following patch:

diff --git a/src/auth/cephx/CephxProtocol.cc b/src/auth/cephx/CephxProtocol.cc
index f57f063..cbc3c06 100644
--- a/src/auth/cephx/CephxProtocol.cc
+++ b/src/auth/cephx/CephxProtocol.cc
@@ -25,22 +25,25 @@

 void cephx_calc_client_server_challenge(CephContext *cct, CryptoKey& secret, uint64_t server_challenge, 
-                 uint64_t client_challenge, uint64_t *key, std::string &ret)
+                 uint64_t client_challenge, uint64_t *key, std::string &error)
 {
   CephXChallengeBlob b;
   b.server_challenge = server_challenge;
   b.client_challenge = client_challenge;

   bufferlist enc;
-  std::string error;
-  if (encode_encrypt(cct, b, secret, enc, error))
+  if (encode_encrypt(cct, b, secret, enc, error)) {
+    ldout(cct, 10) << "cephx_calc_client_server_challenge secret " << secret << " server_challenge " << server_challenge << " client_challenge " << client_challenge <<  " error " << error << dendl;
     return;
+  }

   uint64_t k = 0;
   const uint64_t *p = (const uint64_t *)enc.c_str();
   for (int pos = 0; pos + sizeof(k) <= enc.length(); pos+=sizeof(k), p++)
     k ^= mswab64(*p);
   *key = k;
+
+  ldout(cct, 10) << "cephx_calc_client_server_challenge secret " << secret << " server_challenge " << server_challenge << " client_challenge " << client_challenge <<  " key " << k << dendl;
 }

The log of the child process shows:

2015-03-19 22:47:42.306073 7fdc9504f780  5 adding auth protocol: cephx
2015-03-19 22:47:42.306652 7fdc9504f780  2 auth: KeyRing::load: loaded key file /etc/ceph/ceph.keyring
2015-03-19 22:47:42.309303 7fdc867fc700 10 cephx: set_have_need_key no handler for service mds
2015-03-19 22:47:42.309307 7fdc867fc700 10 cephx: set_have_need_key no handler for service osd
2015-03-19 22:47:42.309309 7fdc867fc700 10 cephx: set_have_need_key no handler for service auth
2015-03-19 22:47:42.309311 7fdc867fc700 10 cephx: validate_tickets want 38 have 0 need 38
2015-03-19 22:47:42.309317 7fdc867fc700 10 cephx client: handle_response ret = 0
2015-03-19 22:47:42.309320 7fdc867fc700 10 cephx client:  got initial server challenge 2435188869190731736
2015-03-19 22:47:42.309323 7fdc867fc700 10 cephx client: validate_tickets: want=38 need=38 have=0
2015-03-19 22:47:42.309326 7fdc867fc700 10 cephx: set_have_need_key no handler for service mds
2015-03-19 22:47:42.309327 7fdc867fc700 10 cephx: set_have_need_key no handler for service osd
2015-03-19 22:47:42.309329 7fdc867fc700 10 cephx: set_have_need_key no handler for service auth
2015-03-19 22:47:42.309331 7fdc867fc700 10 cephx: validate_tickets want 38 have 0 need 38
2015-03-19 22:47:42.309335 7fdc867fc700 10 cephx client: want=38 need=38 have=0
2015-03-19 22:47:42.309340 7fdc867fc700 10 cephx client: build_request
2015-03-19 22:47:42.309674 7fdc867fc700 10 cephx: cephx_calc_client_server_challenge secret AQCUswtVs16kHhAAE3EY2eLllJbpGJTDE8N31g== server_challenge 2435188869190731736 client_challenge 8954408529516010106 error cannot convert AES key for NSS: -8023
2015-03-19 22:47:42.309706 7fdc867fc700 20 cephx client: cephx_calc_client_server_challenge error: cannot convert AES key for NSS: -8023
2015-03-19 22:47:42.310461 7fdc867fc700 10 cephx client: handle_response ret = -22

The secret is the same as in the parent process. No idea why there is an NSS error.

Actions #10

Updated by Zheng Yan about 9 years ago

I can reliably reproduce this locally when configuring ceph with "--with-nss --without-cryptopp".
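
For anyone trying this, a rough local-reproduction recipe might look like the following (a sketch assuming the autotools build Ceph used at the time; only the --with-nss --without-cryptopp flags come from the comment above, the rest are assumed defaults):

./autogen.sh
./configure --with-nss --without-cryptopp
make

With these flags the cephx AES operations go through NSS rather than Crypto++, which would explain why the failure shows up with the lab packages but not with a Crypto++ build.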

Actions #11

Updated by Zheng Yan about 9 years ago

Simple C++ code to reproduce this issue:

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <assert.h>
#define __USE_FILE_OFFSET64
#include "cephfs/libcephfs.h" 

#define ASSERT_EQ(x,y) assert((x) == (y))

#define STARTUP_CEPH(cmount) do {                       \
        ASSERT_EQ(0, ceph_create(&cmount, NULL));       \
        ASSERT_EQ(0, ceph_conf_parse_env(cmount, NULL));\
        ASSERT_EQ(0, ceph_conf_read_file(cmount, NULL));\
        ASSERT_EQ(0, ceph_mount(cmount, NULL)); \
} while(0)

#define CLEANUP_CEPH(cmount) do {               \
        ASSERT_EQ(0, ceph_unmount(cmount));     \
        ASSERT_EQ(0, ceph_release(cmount));     \
} while(0)

void mount_umount() {
        struct ceph_mount_info *mount;
        STARTUP_CEPH(mount);
        sleep(5);
        CLEANUP_CEPH(mount);
}

int main(int argc, char *argv[])
{
        // no failure if comment out this line
        mount_umount();

        pid_t pid = fork();
        if (pid == 0) {
                mount_umount();
        } else {
                mount_umount();

                int status;
                ASSERT_EQ(pid, waitpid(pid, &status, 0));
                ASSERT_EQ(status, 0);
        }
        return 0;
}
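
(For reference: assuming the libcephfs development headers and library are installed, something like "g++ fork_test.cc -o fork_test -lcephfs" should build this, and running it on a node with a valid /etc/ceph/ceph.keyring and cephx enabled should show the child's ceph_mount() failing while the parent's succeeds. The file name and compiler flags here are illustrative, not part of the original report.)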
Actions #12

Updated by Zheng Yan about 9 years ago

Replacing libcephfs with librados has the same issue.

Actions #13

Updated by Zheng Yan about 9 years ago

It looks like we need to call the NSS init function in both the parent and the child process; calling the NSS init function only before fork() does not work.
Test code is attached.
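
To make the idea concrete, a minimal sketch of "initialize NSS again in the child" is below. This is illustrative only and is not the code that was merged (see the pull requests referenced in the following comments); the pid-tracking helper and the use of NSS_NoDB_Init()/NSS_Shutdown() are assumptions about one possible shape of the fix.

#include <unistd.h>
#include <nss.h>        // NSS_NoDB_Init, NSS_Shutdown; requires the NSS dev headers

static pid_t crypto_init_pid = 0;

// Hypothetical helper: call before any cephx crypto operation.
static void crypto_init_if_needed()
{
        pid_t pid = getpid();
        if (crypto_init_pid == pid)
                return;                 // already initialized in this process
        if (crypto_init_pid != 0)
                NSS_Shutdown();         // drop NSS state inherited across fork()
        NSS_NoDB_Init(NULL);            // minimal, database-less NSS init
        crypto_init_pid = pid;
}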

Actions #14

Updated by Greg Farnum about 9 years ago

Zheng, do we need to merge any changes besides https://github.com/ceph/ceph/pull/4098 to make this work?

I think init() is called in the separate child/parent threads anyway, but your comment about that being required leaves me unsure.

Actions #15

Updated by Zheng Yan about 9 years ago

To make NSS work after fork(), https://github.com/ceph/ceph/pull/4098 is enough. For the libcephfs flock test, we still need https://github.com/ceph/ceph/pull/4117.

Actions #16

Updated by Greg Farnum about 9 years ago

  • Status changed from New to Resolved

Both merged!
