Bug #5874
closed
rgw: cuttlefish cls_rgw tests fail against next
% Done: 0%
Source: Q/A
Severity: 3 - minor
Description
2013-08-03T15:11:11.863 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [==========] Running 8 tests from 1 test case.
2013-08-03T15:11:11.863 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [----------] Global test environment set-up.
2013-08-03T15:11:11.863 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [----------] 8 tests from cls_rgw
2013-08-03T15:11:11.863 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [ RUN ] cls_rgw.init
2013-08-03T15:11:13.066 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [ OK ] cls_rgw.init (1203 ms)
2013-08-03T15:11:13.066 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [ RUN ] cls_rgw.index_basic
2013-08-03T15:11:13.070 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: test/cls_rgw/test_cls_rgw.cc:99: Failure
2013-08-03T15:11:13.070 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: Value of: ioctx.operate(bucket_oid, op)
2013-08-03T15:11:13.070 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: Actual: -5
2013-08-03T15:11:13.070 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: Expected: 0
2013-08-03T15:11:13.070 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [ FAILED ] cls_rgw.index_basic (4 ms)
2013-08-03T15:11:13.071 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [ RUN ] cls_rgw.index_multiple_obj_writers
2013-08-03T15:11:13.223 INFO:teuthology.task.ceph.osd.3.err:[10.214.131.30]: daemon-helper: command crashed with signal 11
2013-08-03T15:11:36.930 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [ OK ] cls_rgw.index_multiple_obj_writers (23861 ms)
2013-08-03T15:11:36.930 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [ RUN ] cls_rgw.index_remove_object
2013-08-03T15:11:37.181 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [ OK ] cls_rgw.index_remove_object (251 ms)
2013-08-03T15:11:37.181 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [ RUN ] cls_rgw.index_suggest
2013-08-03T15:11:37.185 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: test/cls_rgw/test_cls_rgw.cc:262: Failure
2013-08-03T15:11:37.185 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: Value of: ioctx.operate(bucket_oid, op)
2013-08-03T15:11:37.185 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: Actual: -5
2013-08-03T15:11:37.185 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: Expected: 0
2013-08-03T15:11:37.186 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [ FAILED ] cls_rgw.index_suggest (4 ms)
2013-08-03T15:11:37.186 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [ RUN ] cls_rgw.gc_set
2013-08-03T15:11:37.774 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [ OK ] cls_rgw.gc_set (589 ms)
2013-08-03T15:11:37.774 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [ RUN ] cls_rgw.gc_defer
2013-08-03T15:11:46.873 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [ OK ] cls_rgw.gc_defer (9099 ms)
2013-08-03T15:11:46.873 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [ RUN ] cls_rgw.finalize
2013-08-03T15:11:48.137 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [ OK ] cls_rgw.finalize (1264 ms)
2013-08-03T15:11:48.137 INFO:teuthology.task.workunit.client.0.out:[10.214.131.28]: [----------] 8 tests from cls_rgw (36275 ms total)
test is
ubuntu@teuthology:/a/teuthology-2013-08-02_01:30:04-upgrade-next-testing-basic-plana/93844$ cat orig.config.yaml
kernel:
  kdb: true
  sha1: 05542c395ce50bb1750cc6fead85727903fc3e72
machine_type: plana
nuke-on-error: true
os_type: ubuntu
overrides:
  admin_socket:
    branch: next
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
    log-whitelist:
    - slow request
    sha1: ef036bd4bc0e79bff8a5805800fbdeb0cc2db6ae
  ceph-deploy:
    branch:
      dev: next
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
  install:
    ceph:
      sha1: ef036bd4bc0e79bff8a5805800fbdeb0cc2db6ae
  s3tests:
    branch: next
  workunit:
    sha1: ef036bd4bc0e79bff8a5805800fbdeb0cc2db6ae
roles:
- - mon.a
  - mds.a
  - osd.0
  - osd.1
- - mon.b
  - mon.c
  - osd.2
  - osd.3
- - client.0
tasks:
- chef: null
- clock.check: null
- install:
    branch: cuttlefish
- ceph: null
- workunit:
    branch: cuttlefish
    clients:
      client.0:
      - rados/load-gen-mix.sh
- install.upgrade:
    osd.0:
      branch: next
    osd.2:
      branch: next
- ceph.restart:
  - osd.0
  - osd.2
- workunit:
    branch: cuttlefish
    clients:
      client.0:
      - rados/test.sh
      - cls
teuthology_branch: next
Note that the cls test is run from client.0, which was not upgraded, while 2 of the 4 osds (osd.0 and osd.2) were upgraded and restarted.
Updated by Yehuda Sadeh over 10 years ago
It looks like osd.3 crashed here:
2013-08-03T15:11:13.223 INFO:teuthology.task.ceph.osd.3.err:[10.214.131.30]: daemon-helper: command crashed with signal 11
Updated by Yehuda Sadeh over 10 years ago
We get this, which looks like #5752:
128.41c26202 e397) v4 ==== 147+0+20 (3830404673 0 3874395120) 0x15726c0 con 0x1f17580
2013-08-06 15:46:22.967531 7f169c587700 0 _load_class could not open class /usr/lib/rados-classes/libcls_rgw.so (dlopen failed): /usr/lib/rados-classes/libcls_rgw.so: undefined symbol: _Z21cls_current_subop_numPv
2013-08-06 15:46:22.967559 7f169c587700 1 -- 10.214.131.4:6803/4043 --> 10.214.131.38:0/1006175 -- osd_op_reply(2 bucket-0 [call rgw.bucket_init_index] ack = -5 (Input/output error)) v4 -- ?+0 0x1a72600 con 0x1f17580
and then the osd crashes. Not sure why we see it; shouldn't the osd have been installed with the new version by now? The crash itself is around here:
#4 <signal handler called>
#5 0x000000000009f406 in ?? ()
#6 0x00007f1691268755 in ?? ()
#7 0x0000000002f8aa00 in ?? ()
#8 0x0000000001536eb0 in ?? ()
#9 0x0000000001536e28 in ?? ()
#10 0x000000000068bd64 in ClassHandler::_load_class(ClassHandler::ClassData*) ()
#11 0x000000000068c3a6 in ClassHandler::open_class(std::string const&, ClassHandler::ClassData**) ()
#12 0x000000000060869b in OSD::init_op_flags(std::tr1::shared_ptr<OpRequest>) ()
#13 0x000000000063ce78 in OSD::handle_op(std::tr1::shared_ptr<OpRequest>) ()
#14 0x0000000000646159 in OSD::dispatch_op(std::tr1::shared_ptr<OpRequest>) ()
#15 0x00000000006521fe in OSD::_dispatch(Message*) ()
#16 0x0000000000652986 in OSD::ms_dispatch(Message*) ()
#17 0x00000000008e7ec1 in DispatchQueue::entry() ()
#18 0x0000000000832e4d in DispatchQueue::DispatchThread::entry() ()
#19 0x00007f16a8711e9a in start_thread (arg=0x7f169c587700) at pthread_create.c:308
#20 0x00007f16a68a74bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
Updated by Yehuda Sadeh over 10 years ago
The osd hasn't been restarted at this point.
Updated by Yehuda Sadeh over 10 years ago
So basically this is #5752. We can try working around it by running the objclass unit test before the upgrade (which will hopefully get the osd to load the class object).
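One way that workaround could look in the task list, as a rough sketch only: every task name below is copied from orig.config.yaml in the description, and the extra `cls` entry in the first workunit is an assumption about where the pre-upgrade run would go, not a tested config.

```yaml
# Sketch, not a verified teuthology job: only the "assumed addition" line
# differs from the tasks section of orig.config.yaml.
tasks:
- install:
    branch: cuttlefish
- ceph: null
- workunit:
    branch: cuttlefish
    clients:
      client.0:
      - rados/load-gen-mix.sh
      - cls              # assumed addition: run the cls tests once before the upgrade
- install.upgrade:
    osd.0:
      branch: next
    osd.2:
      branch: next
- ceph.restart:
  - osd.0
  - osd.2
- workunit:
    branch: cuttlefish
    clients:
      client.0:
      - rados/test.sh
      - cls
```

The idea is that the not-yet-restarted osds would already have the class loaded when the packages change underneath them, so they would not need to dlopen the new libcls_rgw.so later.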
Updated by Sage Weil over 10 years ago
- Status changed from New to Resolved
Backported the preload osd class patches to cuttlefish and enabled them in teuthology so we can avoid this problem in testing.