Bug #20175

test_librbd_api.sh fails in upgrade test

Added by Kefu Chai 4 months ago. Updated 25 days ago.

Status:
Resolved
Priority:
Immediate
Assignee:
-
Target version:
-
Start date:
06/04/2017
Due date:
% Done:

0%

Source:
Tags:
Backport:
kraken, jewel
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Release:
jewel
Needs Doc:
No

Description

2017-06-04T05:28:29.904 INFO:tasks.workunit.client.0.mira083.stderr:/build/ceph-12.0.2-2130-g5f43976/src/common/Mutex.cc: 109: FAILED assert(r == 0)
2017-06-04T05:28:29.905 INFO:tasks.workunit.client.0.mira083.stderr: ceph version  12.0.2-2130-g5f43976 (5f43976ec3165b1ca915e26e07d405c72ae12abc) luminous (dev)
2017-06-04T05:28:29.905 INFO:tasks.workunit.client.0.mira083.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x10e) [0x7f0a57f0038e]
2017-06-04T05:28:29.905 INFO:tasks.workunit.client.0.mira083.stderr: 2: (Mutex::Lock(bool)+0x1a4) [0x7f0a57ed41d4]
2017-06-04T05:28:29.905 INFO:tasks.workunit.client.0.mira083.stderr: 3: (md_config_t::add_observer(md_config_obs_t*)+0x36) [0x7f0a5811f6e6]
2017-06-04T05:28:29.905 INFO:tasks.workunit.client.0.mira083.stderr: 4: (TracepointProvider::TracepointProvider(CephContext*, char const*, char const*)+0xa1) [0x7f0a58164dd1]
2017-06-04T05:28:29.905 INFO:tasks.workunit.client.0.mira083.stderr: 5: (()+0x76254) [0x7f0a62650254]
2017-06-04T05:28:29.905 INFO:tasks.workunit.client.0.mira083.stderr: 6: (rados_create()+0x71) [0x7f0a62650401]
2017-06-04T05:28:29.905 INFO:tasks.workunit.client.0.mira083.stderr: 7: (connect_cluster_pp(librados::Rados&)+0xa3) [0x7f0a6322ac13]
2017-06-04T05:28:29.906 INFO:tasks.workunit.client.0.mira083.stderr: 8: (main()+0x3c) [0x7f0a6315e66c]
2017-06-04T05:28:29.906 INFO:tasks.workunit.client.0.mira083.stderr: 9: (__libc_start_main()+0xf5) [0x7f0a60840f45]
2017-06-04T05:28:29.906 INFO:tasks.workunit.client.0.mira083.stderr: 10: (()+0x118ec7) [0x7f0a6315fec7]
2017-06-04T05:28:29.906 INFO:tasks.workunit.client.0.mira083.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2017-06-04T05:28:29.911 INFO:tasks.workunit.client.0.mira083.stderr:Aborted (core dumped)
2017-06-04T05:28:29.913 INFO:tasks.workunit:Stopping ['rbd/test_librbd_api.sh'] on client.0...

see http://qa-proxy.ceph.com/teuthology/kchai-2017-06-04_05:23:18-upgrade-wip-mgr-stats-kefu---basic-mira/1259395/teuthology.log

My guess is that the new rbd client does not work with the old librbd.


Related issues

Copied to rbd - Backport #20351: kraken: test_librbd_api.sh fails in upgrade test Resolved
Copied to rbd - Backport #20532: jewel: test_librbd_api.sh fails in upgrade test Resolved

History

#1 Updated by Kefu Chai 4 months ago

In jewel, TracepointProvider.cc was part of libcommon.a, which was in turn statically included in librbd.so. In luminous it was moved into libceph-common.so, and librbd.so is linked against that dynamically.

Not sure if it's relevant.

#2 Updated by Kefu Chai 4 months ago

It's not relevant. Also, jewel's "ceph_test_librbd_api" crashed in a different way when dynamically linked against librbd and librados from master.

#3 Updated by Sage Weil 4 months ago

  • Assignee set to Kefu Chai

#4 Updated by Kefu Chai 3 months ago

(gdb) b PerfCounters::PerfCounters
Breakpoint 2 at 0x555555769bd0 (3 locations)
(gdb) info shared
From                To                  Syms Read   Shared Object Library
0x00007ffff7dd9aa0  0x00007ffff7df5040  Yes         /lib64/ld-linux-x86-64.so.2
0x00007ffff768f9d0  0x00007ffff7a18da6  Yes         /var/ceph/ceph/build/lib/librbd.so.1
0x00007ffff725caf0  0x00007ffff739f32f  Yes         /var/ceph/ceph/build/lib/librados.so.2
0x00007ffff6edaab0  0x00007ffff6ee7811  Yes         /lib/x86_64-linux-gnu/libpthread.so.0
0x00007ffff6cd1d80  0x00007ffff6cd294e  Yes         /lib/x86_64-linux-gnu/libdl.so.2
0x00007ffff6aba060  0x00007ffff6ac991f  Yes (*)     /usr/lib/x86_64-linux-gnu/libboost_thread.so.1.62.0
0x00007ffff68a30e0  0x00007ffff68a5ecf  Yes         /lib/x86_64-linux-gnu/librt.so.1
0x00007ffff6570b10  0x00007ffff665cae4  Yes (*)     /usr/lib/x86_64-linux-gnu/libnss3.so
0x00007ffff6323ce0  0x00007ffff634441f  Yes (*)     /usr/lib/x86_64-linux-gnu/libnspr4.so
0x00007ffff610b5d0  0x00007ffff6111afd  Yes (*)     /usr/lib/x86_64-linux-gnu/libboost_iostreams.so.1.62.0
0x00007ffff5efc060  0x00007ffff5efcd06  Yes (*)     /usr/lib/x86_64-linux-gnu/libboost_system.so.1.62.0
0x00007ffff5c06230  0x00007ffff5cae5c9  Yes (*)     /usr/lib/x86_64-linux-gnu/libstdc++.so.6
0x00007ffff587b680  0x00007ffff58e78da  Yes         /lib/x86_64-linux-gnu/libm.so.6
0x00007ffff5661ac0  0x00007ffff5671fe5  Yes (*)     /lib/x86_64-linux-gnu/libgcc_s.so.1
0x00007ffff52df910  0x00007ffff5409403  Yes         /lib/x86_64-linux-gnu/libc.so.6
0x00007fffec2f1700  0x00007fffeca5ef8f  Yes         /var/ceph/ceph/build/lib/libceph-common.so.0
0x00007fffeaf90970  0x00007fffeaf9ca52  Yes         /lib/x86_64-linux-gnu/libresolv.so.2
0x00007fffead7ea50  0x00007fffead875d9  Yes (*)     /usr/lib/x86_64-linux-gnu/libibverbs.so.1
0x00007ffff7fb9ca0  0x00007ffff7fcb786  Yes (*)     /lib/x86_64-linux-gnu/libudev.so.1
0x00007fffeab3cfe0  0x00007fffeab668d9  Yes (*)     /lib/x86_64-linux-gnu/libblkid.so.1
0x00007fffea8e8320  0x00007fffea91af32  Yes (*)     /usr/lib/x86_64-linux-gnu/libssl3.so
0x00007fffea6b8a70  0x00007fffea6d07eb  Yes (*)     /usr/lib/x86_64-linux-gnu/libsmime3.so
0x00007fffea48d2b0  0x00007fffea49c311  Yes (*)     /usr/lib/x86_64-linux-gnu/libnssutil3.so
0x00007fffea27e040  0x00007fffea27ef38  Yes (*)     /usr/lib/x86_64-linux-gnu/libplds4.so
0x00007fffea0794b0  0x00007fffea07aba6  Yes (*)     /usr/lib/x86_64-linux-gnu/libplc4.so
0x00007fffe9e601c0  0x00007fffe9e70afe  Yes (*)     /lib/x86_64-linux-gnu/libz.so.1
0x00007fffe9c4f700  0x00007fffe9c5b452  Yes (*)     /lib/x86_64-linux-gnu/libbz2.so.1.0
0x00007fffe9a00000  0x00007fffe9a2c5ff  Yes (*)     /usr/lib/x86_64-linux-gnu/libnl-route-3.so.200
0x00007fffe97cea90  0x00007fffe97dc0c5  Yes (*)     /lib/x86_64-linux-gnu/libnl-3.so.200
0x00007fffe95c2570  0x00007fffe95c3c41  Yes (*)     /lib/x86_64-linux-gnu/libuuid.so.1
0x00007fffe93a69a0  0x00007fffe93ad4fd  Yes (*)     /usr/lib/x86_64-linux-gnu/liblttng-ust-tracepoint.so.0
0x00007fffe919ef70  0x00007fffe91a22d9  Yes (*)     /usr/lib/x86_64-linux-gnu/liburcu-bp.so.4
0x00007fffe8f96800  0x00007fffe8f99979  Yes (*)     /usr/lib/x86_64-linux-gnu/liburcu-cds.so.4
0x00007fffe8d912d0  0x00007fffe8d926b9  Yes (*)     /usr/lib/x86_64-linux-gnu/liburcu-common.so.4
(gdb) info inferior
  Num  Description       Executable
* 1    process 27464     /usr/lib/ceph/bin/ceph_test_librbd_api

$ less /proc/27464/maps
555555554000-555555869000 r-xp 00000000 08:05 11668827                   /usr/lib/ceph/bin/ceph_test_librbd_api
555555a69000-555555a73000 r--p 00315000 08:05 11668827                   /usr/lib/ceph/bin/ceph_test_librbd_api
555555a73000-555555a7a000 rw-p 0031f000 08:05 11668827                   /usr/lib/ceph/bin/ceph_test_librbd_api
555555a7a000-55555eabd000 rw-p 00000000 00:00 0                          [heap]
7fffe858f000-7fffe8590000 ---p 00000000 00:00 0
7fffe8590000-7fffe8d90000 rw-p 00000000 00:00 0

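The breakpoint address 0x555555769bd0 falls inside the r-xp mapping of the executable (555555554000-555555869000), not in librbd.so.1 or libceph-common.so.0, so the PerfCounters ctor being called is the one in the executable itself!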
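The check behind that conclusion is a plain range test: an address belongs to a mapping if it lies in the half-open range [start, end). The sketch below is illustrative only. `Mapping`, `contains` and `check_breakpoint_location` are hypothetical names, not part of Ceph or gdb, and the numbers are copied from the gdb output and the first /proc/27464/maps line above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for illustration; not part of Ceph or gdb.
struct Mapping {
  std::uint64_t start;  // first address of the range (inclusive)
  std::uint64_t end;    // end of the range (treated as exclusive)
};

// True if addr falls inside the half-open range [start, end).
constexpr bool contains(const Mapping& m, std::uint64_t addr) {
  return addr >= m.start && addr < m.end;
}

// r-xp mapping of /usr/lib/ceph/bin/ceph_test_librbd_api from /proc/27464/maps.
constexpr Mapping exe_text{0x555555554000ULL, 0x555555869000ULL};
// From/To range that "info shared" reports for librbd.so.1.
constexpr Mapping librbd_text{0x00007ffff768f9d0ULL, 0x00007ffff7a18da6ULL};
// Address gdb printed when setting the PerfCounters::PerfCounters breakpoint.
constexpr std::uint64_t breakpoint_addr = 0x555555769bd0ULL;

// The breakpoint lands in the executable's text, not in the shared library.
void check_breakpoint_location() {
  assert(contains(exe_text, breakpoint_addr));
  assert(!contains(librbd_text, breakpoint_addr));
}
```

Calling `check_breakpoint_location()` passes with these values, which matches the observation that the test binary carries its own copy of the PerfCounters code.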

#5 Updated by Kefu Chai 3 months ago

[kchai@mira078 ~]$ /usr/bin/ceph_test_librbd_api
common/perf_counters.cc: In function 'PerfCounters* PerfCountersBuilder::create_perf_counters()' thread 7ffa5ff3c200 time 2017-06-09 12:05:57.193919
common/perf_counters.cc: 445: FAILED assert(d->type != PERFCOUNTER_NONE)
 ceph version 10.2.7-257-gce1fc34 (ce1fc3492e87c669f7059c2047a3bed077418a89)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7ffa60176bd5]
 2: (PerfCountersBuilder::create_perf_counters()+0x57) [0x7ffa6019f2d7]
 3: (Finisher::Finisher(CephContext*, std::string, std::string)+0x255) [0x7ffa5f58abb5]
 4: (librados::RadosClient::RadosClient(CephContext*)+0x20e) [0x7ffa5f5881ee]
 5: (rados_create()+0xb0) [0x7ffa5f55c870]
 6: (connect_cluster_pp(librados::Rados&)+0xb6) [0x7ffa60159b16]
 7: (main()+0x4c) [0x7ffa6008925c]
 8: (__libc_start_main()+0xf5) [0x7ffa5d758b35]
 9: (()+0x116c27) [0x7ffa6008ac27]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Aborted

This is exactly what I see in my testbed if I launch /usr/lib/ceph/bin/ceph_test_librbd_api.

#6 Updated by Kefu Chai 3 months ago

B+ |148         PerfCountersBuilder b(cct, string("finisher-") + name,
  >|149                               l_finisher_first, l_finisher_last);
   |150         b.add_u64(l_finisher_queue_len, "queue_len");

b.m_perf_counters->m_data should have 2 elements at line 149, but it shrinks to 0. I set watchpoints on the size-related members of that m_data, but they didn't help =(

(gdb) p b->m_perf_counters->m_data._M_impl._M_finish
$46 = (std::_Vector_base<PerfCounters::perf_counter_data_any_d, std::allocator<PerfCounters::perf_counter_data_any_d> >::pointer) 0x55555eab9750
(gdb) p b->m_perf_counters->m_data._M_impl._M_start
$47 = (std::_Vector_base<PerfCounters::perf_counter_data_any_d, std::allocator<PerfCounters::perf_counter_data_any_d> >::pointer) 0x55555eab9750

(gdb) p &m_data._M_impl._M_start
$91 = (std::_Vector_base<PerfCounters::perf_counter_data_any_d, std::allocator<PerfCounters::perf_counter_data_any_d> >::pointer *) 0x55555eab9610

(gdb) p &m_data._M_impl._M_finish
$90 = (std::_Vector_base<PerfCounters::perf_counter_data_any_d, std::allocator<PerfCounters::perf_counter_data_any_d> >::pointer *) 0x55555eab9618

(gdb) p/x *0x55555eab9610
$108 = 0x5eab96e0
(gdb) p/x *0x55555eab9618
$113 = 0x5eab9750
(gdb) info b
4       hw watchpoint  keep y                      *0x55555eab9610
5       hw watchpoint  keep y                      *0x55555eab9618

But the watchpoints never triggered where the values changed.

(gdb) p b.m_perf_counters->m_data._M_impl._M_finish
$126 = (std::_Vector_base<PerfCounters::perf_counter_data_any_d, std::allocator<PerfCounters::perf_counter_data_any_d> >::pointer) 0x55555eab9750
(gdb) p &b.m_perf_counters->m_data._M_impl._M_finish
$127 = (std::_Vector_base<PerfCounters::perf_counter_data_any_d, std::allocator<PerfCounters::perf_counter_data_any_d> >::pointer *) 0x55555eab9620

The address of b.m_perf_counters->m_data._M_impl._M_finish changed: it is 0x55555eab9620 here, but was 0x55555eab9618 above!
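For reference, libstdc++ computes a vector's size as the distance between _M_start and _M_finish, so the pointer values printed above can be turned into element counts by hand. The sketch below is only that arithmetic. `element_count` is a hypothetical helper, and the element size of 56 bytes is an assumption inferred from "2 elements" spanning 0x70 bytes; the session never prints sizeof(PerfCounters::perf_counter_data_any_d). The raw reads $108 and $113 also appear truncated to 32 bits, which does not affect the difference.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of elements between two raw pointer values for a given element size.
// elem_size must be supplied by the caller; 56 in the usage below is an
// assumption, not a value taken from the gdb session.
std::size_t element_count(std::uint64_t start, std::uint64_t finish,
                          std::size_t elem_size) {
  assert(elem_size != 0 && finish >= start);
  return static_cast<std::size_t>((finish - start) / elem_size);
}
```

With $46 and $47 (both 0x55555eab9750), `element_count(0x55555eab9750, 0x55555eab9750, 56)` is 0, the "shrunk" view. With the raw reads $108 and $113, `element_count(0x5eab96e0, 0x5eab9750, 56)` is 2, the expected view. The two views disagree because they read the pointers from different offsets.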

#7 Updated by Kefu Chai 3 months ago

This is because the layout of PerfCounters changed in luminous: we added a new field, prio_adjust, to it. So the PerfCounters instances instantiated by the jewel executable can no longer be read correctly by the luminous executable/shared library.
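A minimal sketch of why one added field is enough to break this: every member declared after the new field moves to a higher offset, so code compiled against one layout reads the wrong bytes from an object built with the other. The two structs below are simplified stand-ins, not the real PerfCounters definition. Only the name prio_adjust comes from this ticket; the other member names and `data_finish_shift` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified stand-in for the older layout (NOT the real class).
namespace jewel_like {
struct Counters {
  void* cct;
  int lower_bound;
  int upper_bound;
  std::uint64_t* data_start;   // plays the role of m_data._M_impl._M_start
  std::uint64_t* data_finish;  // plays the role of m_data._M_impl._M_finish
};
}  // namespace jewel_like

// Same stand-in with one extra field inserted before the data pointers.
namespace luminous_like {
struct Counters {
  void* cct;
  int lower_bound;
  int upper_bound;
  int prio_adjust;             // the newly added field
  std::uint64_t* data_start;
  std::uint64_t* data_finish;
};
}  // namespace luminous_like

// How many bytes data_finish moved between the two layouts.
constexpr std::ptrdiff_t data_finish_shift() {
  return static_cast<std::ptrdiff_t>(offsetof(luminous_like::Counters, data_finish)) -
         static_cast<std::ptrdiff_t>(offsetof(jewel_like::Counters, data_finish));
}

// The two layouts are not interchangeable.
void check_layouts_differ() {
  assert(data_finish_shift() > 0);
  assert(sizeof(luminous_like::Counters) > sizeof(jewel_like::Counters));
}
```

On a typical x86-64 ABI this stand-in shifts data_finish by 8 bytes (4 bytes for the int plus 4 bytes of padding), which happens to be the same distance as the 0x55555eab9618 versus 0x55555eab9620 discrepancy seen in gdb. The real class has more members, so treat this as an analogy only.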

#8 Updated by Kefu Chai 3 months ago

Because user applications do not link against libcommon, I think it'd be fine to backport the prio_adjust change to jewel just to appease ceph_test_librbd_api, so that jewel's libcommon ABI is consistent with that of luminous.

And we don't need to worry about this from luminous onward, because almost all client-side executables are now linked against libceph-common dynamically.

#9 Updated by Kefu Chai 3 months ago

  • Category changed from librbd to common
  • Status changed from New to Need Review
  • Release jewel added

#10 Updated by Nathan Cutler 3 months ago

  • Project changed from Ceph to CI
  • Category deleted (common)

#11 Updated by Nathan Cutler 3 months ago

  • Project changed from CI to Ceph

#12 Updated by Kefu Chai 3 months ago

tested at http://qa-proxy.ceph.com/teuthology/kchai-2017-06-12_12:19:18-upgrade-wip-20175-kefu---basic-mira/1279912/teuthology.log.

/home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=kraken TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 RBD_FEATURES=13 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/rbd/test_librbd_api.sh'
2017-06-12T12:25:45.773 INFO:tasks.workunit.client.0.mira083.stdout:[==========] Running 82 tests from 3 test cases.
2017-06-12T12:25:45.773 INFO:tasks.workunit.client.0.mira083.stdout:[----------] Global test environment set-up.
2017-06-12T12:25:45.773 INFO:tasks.workunit.client.0.mira083.stdout:[----------] 68 tests from TestLibRBD
2017-06-12T12:25:45.773 INFO:tasks.workunit.client.0.mira083.stdout:seed 17944
2017-06-12T12:25:45.809 INFO:tasks.workunit.client.0.mira083.stdout:[ RUN      ] TestLibRBD.CreateAndStat
2017-06-12T12:25:46.681 INFO:tasks.workunit.client.0.mira083.stdout:using new format!
2017-06-12T12:25:49.756 INFO:tasks.workunit.client.0.mira083.stdout:image has size 2097152 and order 22
2017-06-12T12:25:49.756 INFO:tasks.workunit.client.0.mira083.stdout:[       OK ] TestLibRBD.CreateAndStat (3946 ms)
2017-06-12T12:25:49.756 INFO:tasks.workunit.client.0.mira083.stdout:[ RUN      ] TestLibRBD.CreateAndStatPP
2017-06-12T12:25:49.758 INFO:tasks.workunit.client.0.mira083.stdout:using new format!
2017-06-12T12:25:49.810 INFO:tasks.workunit.client.0.mira083.stdout:[       OK ] TestLibRBD.CreateAndStatPP (54 ms)
2017-06-12T12:25:49.810 INFO:tasks.workunit.client.0.mira083.stdout:[ RUN      ] TestLibRBD.GetId
2017-06-12T12:25:49.810 INFO:tasks.workunit.client.0.mira083.stdout:using new format!
2017-06-12T12:25:49.851 INFO:tasks.workunit.client.0.mira083.stdout:[       OK ] TestLibRBD.GetId (42 ms)
2017-06-12T12:25:49.852 INFO:tasks.workunit.client.0.mira083.stdout:[ RUN      ] TestLibRBD.GetIdPP
2017-06-12T12:25:49.852 INFO:tasks.workunit.client.0.mira083.stdout:using new format!
2017-06-12T12:25:49.895 INFO:tasks.workunit.client.0.mira083.stdout:[       OK ] TestLibRBD.GetIdPP (43 ms)
2017-06-12T12:25:49.895 INFO:tasks.workunit.client.0.mira083.stdout:[ RUN      ] TestLibRBD.GetBlockNamePrefix
2017-06-12T12:25:49.895 INFO:tasks.workunit.client.0.mira083.stdout:using new format!
2017-06-12T12:25:49.934 INFO:tasks.workunit.client.0.mira083.stdout:[       OK ] TestLibRBD.GetBlockNamePrefix (40 ms)
2017-06-12T12:25:49.935 INFO:tasks.workunit.client.0.mira083.stdout:[ RUN      ] TestLibRBD.GetBlockNamePrefixPP
2017-06-12T12:25:49.935 INFO:tasks.workunit.client.0.mira083.stdout:using new format!
2017-06-12T12:25:49.977 INFO:tasks.workunit.client.0.mira083.stdout:[       OK ] TestLibRBD.GetBlockNamePrefixPP (42 ms)
...
2017-06-12T12:25:50.429 INFO:tasks.workunit.client.0.mira083.stdout:[ RUN      ] TestLibRBD.TestCreateLsDelete
2017-06-12T12:25:50.744 INFO:tasks.workunit.client.0.mira083.stdout:using new format!
2017-06-12T12:25:53.807 INFO:tasks.workunit.client.0.mira083.stdout:image: image15
2017-06-12T12:25:53.807 INFO:tasks.workunit.client.0.mira083.stdout:expected = image15
2017-06-12T12:25:53.808 INFO:tasks.workunit.client.0.mira083.stdout:found image15
2017-06-12T12:25:53.808 INFO:tasks.workunit.client.0.mira083.stdout:using new format!
2017-06-12T12:25:53.870 INFO:tasks.workunit.client.0.mira083.stdout:image: image15
2017-06-12T12:25:53.870 INFO:tasks.workunit.client.0.mira083.stdout:image: image16
2017-06-12T12:25:53.870 INFO:tasks.workunit.client.0.mira083.stdout:expected = image15
2017-06-12T12:25:53.870 INFO:tasks.workunit.client.0.mira083.stdout:found image15
2017-06-12T12:25:53.870 INFO:tasks.workunit.client.0.mira083.stdout:expected = image16
2017-06-12T12:25:53.870 INFO:tasks.workunit.client.0.mira083.stdout:found image16
2017-06-12T12:25:53.870 INFO:tasks.workunit.client.0.mira083.stdout:test/librbd/test_librbd.cc:712: Failure
2017-06-12T12:25:53.870 INFO:tasks.workunit.client.0.mira083.stdout:Value of: rbd_remove(ioctx, name.c_str())
2017-06-12T12:25:53.871 INFO:tasks.workunit.client.0.mira083.stdout:  Actual: -95
2017-06-12T12:25:53.871 INFO:tasks.workunit.client.0.mira083.stdout:Expected: 0
2017-06-12T12:25:53.871 INFO:tasks.workunit.client.0.mira083.stdout:[  FAILED  ] TestLibRBD.TestCreateLsDelete (3441 ms)
...
2017-06-12T12:26:22.246 INFO:tasks.workunit.client.0.mira083.stdout:[ RUN      ] TestLibRBD.TestClone
2017-06-12T12:26:23.602 INFO:tasks.workunit.client.0.mira083.stdout:made parent image "parent" 
2017-06-12T12:26:23.603 INFO:tasks.workunit.client.0.mira083.stdout:parent has no parent info
2017-06-12T12:26:23.603 INFO:tasks.workunit.client.0.mira083.stdout:made snapshot "parent@parent_snap" 
2017-06-12T12:26:23.603 INFO:tasks.workunit.client.0.mira083.stdout:can't unprotect an unprotected snap
2017-06-12T12:26:23.603 INFO:tasks.workunit.client.0.mira083.stdout:can't protect a protected snap
2017-06-12T12:26:23.603 INFO:tasks.workunit.client.0.mira083.stdout:made and opened clone "child" 
2017-06-12T12:26:23.603 INFO:tasks.workunit.client.0.mira083.stdout:read: 8
2017-06-12T12:26:23.603 INFO:tasks.workunit.client.0.mira083.stdout:read: 8
2017-06-12T12:26:23.604 INFO:tasks.workunit.client.0.mira083.stdout:read: 8
2017-06-12T12:26:23.604 INFO:tasks.workunit.client.0.mira083.stdout:sizes and overlaps are good between parent and child
2017-06-12T12:26:23.604 INFO:tasks.workunit.client.0.mira083.stdout:sized down clone, changed overlap
2017-06-12T12:26:23.604 INFO:tasks.workunit.client.0.mira083.stdout:parent info: size 4194304 obj_size 4194304 parent_pool -1
2017-06-12T12:26:23.604 INFO:tasks.workunit.client.0.mira083.stdout:sized up clone, changed size but not overlap or parent's size
2017-06-12T12:26:23.604 INFO:tasks.workunit.client.0.mira083.stdout:can't remove parent while child still exists
2017-06-12T12:26:23.604 INFO:tasks.workunit.client.0.mira083.stdout:test/librbd/test_librbd.cc:1836: Failure
2017-06-12T12:26:23.604 INFO:tasks.workunit.client.0.mira083.stdout:Value of: rbd_remove(ioctx, child_name.c_str())
2017-06-12T12:26:23.604 INFO:tasks.workunit.client.0.mira083.stdout:  Actual: -95
2017-06-12T12:26:23.604 INFO:tasks.workunit.client.0.mira083.stdout:Expected: 0
2017-06-12T12:26:23.604 INFO:tasks.workunit.client.0.mira083.stdout:[  FAILED  ] TestLibRBD.TestClone (1359 ms)
...
2017-06-12T12:26:40.219 INFO:tasks.workunit.client.0.mira083.stdout:[ RUN      ] TestLibRBD.FlushAio
2017-06-12T12:26:40.219 INFO:tasks.workunit.client.0.mira083.stdout:using new format!
2017-06-12T12:26:40.644 INFO:tasks.workunit.client.0.mira083.stdout:test/librbd/test_librbd.cc:2424: Failure
2017-06-12T12:26:40.644 INFO:tasks.workunit.client.0.mira083.stdout:Value of: rbd_remove(ioctx, name.c_str())
2017-06-12T12:26:40.645 INFO:tasks.workunit.client.0.mira083.stdout:  Actual: -95
2017-06-12T12:26:40.645 INFO:tasks.workunit.client.0.mira083.stdout:Expected: 0
2017-06-12T12:26:40.645 INFO:tasks.workunit.client.0.mira083.stdout:[  FAILED  ] TestLibRBD.FlushAio (426 ms)
...
2017-06-12T12:26:50.104 INFO:tasks.workunit.client.0.mira083.stdout:[ RUN      ] TestLibRBD.ObjectMapConsistentSnap
2017-06-12T12:26:50.104 INFO:tasks.workunit.client.0.mira083.stdout:using new format!
2017-06-12T12:27:00.694 INFO:tasks.workunit.client.0.mira083.stderr:Segmentation fault (core dumped)
2017-06-12 12:25:34.563325 7f2ab3ca7800  0 ceph version 10.2.7-268-g119f147 (119f147556b043a680b5d058eef37642afa40e3a), process ceph-osd, pid 16812
..
2017-06-12 12:25:49.739670 7f2a94461700  1 -- 172.21.7.120:6800/16812 --> 172.21.7.118:0/2588823292 -- osd_op_reply(17 rbd_header.101359cbae3d [call rbd.get_data_pool] v0'0 uv0 ondisk = -95 ((95) Operation not supported)) v7 -- ?+0 0x5582a9c31180 con 0x5582a9fb2000
..
2017-06-12 12:25:49.749531 7f2a94461700  1 -- 172.21.7.120:6800/16812 --> 172.21.7.118:0/2588823292 -- osd_op_reply(22 rbd_header.101359cbae3d [call rbd.image_get_group] v0'0 uv0 ondisk = -95 ((95) Operation not supported)) v7 -- ?+0 0x5582aa12e840 con 0x5582a9fb2000
..
2017-06-12 12:26:01.937492 7f2a92c5e700  1 -- 172.21.7.120:6800/16812 --> 172.21.7.118:0/2588823292 -- osd_op_reply(235 rbd_header.101398c4bf8 [call rbd.get_data_pool] v0'0 uv0 ondisk = -95 ((95) Operation not supported)) v7 -- ?+0 0x5582aa131c80 con 0x5582a9fb2000
..
2017-06-12 12:26:01.948089 7f2a92c5e700  1 -- 172.21.7.120:6800/16812 --> 172.21.7.118:0/2588823292 -- osd_op_reply(240 rbd_header.101398c4bf8 [call rbd.image_get_group] v0'0 uv0 ondisk = -95 ((95) Operation not supported)) v7 -- ?+0 0x5582aa526840 con 0x5582a9fb2000

see /a/kchai-2017-06-12_12:19:18-upgrade-wip-20175-kefu---basic-mira/1279912/remote/mira082/log/ceph-osd.0.log.gz

See also #17422: the rbd rados cls in jewel does not support "get_data_pool" and "image_get_group", but the luminous librbd is calling them. That is where the -95 (EOPNOTSUPP, "Operation not supported") results above come from.
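One common way for a newer client to stay compatible with an older cluster is to treat -EOPNOTSUPP from an optional class method as "feature absent" and fall back to a default, while still propagating real errors. The sketch below shows only that pattern under stated assumptions. It is not the actual librbd fix, and `PoolLookup`, `get_data_pool_with_fallback` and the callback signature are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <functional>

// Hypothetical result type: r is 0 or a negative errno, pool_id is the answer.
struct PoolLookup {
  int r;
  std::int64_t pool_id;
};

// call_get_data_pool stands in for a cls method call such as
// "rbd.get_data_pool"; it returns 0 or a negative errno and fills *pool_id.
PoolLookup get_data_pool_with_fallback(
    const std::function<int(std::int64_t*)>& call_get_data_pool,
    std::int64_t fallback_pool) {
  std::int64_t pool = fallback_pool;
  int r = call_get_data_pool(&pool);
  if (r == -EOPNOTSUPP) {
    // Older OSD class without this method: not an error, use the default.
    return {0, fallback_pool};
  }
  if (r < 0) {
    // Any other failure is a real error and is passed through unchanged.
    return {r, fallback_pool};
  }
  return {0, pool};
}
```

With a callback that always returns -EOPNOTSUPP (as the jewel OSD does in the log above), the helper reports success with the fallback pool instead of letting -95 surface from a later call such as rbd_remove().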

#13 Updated by Kefu Chai 3 months ago

  • Project changed from Ceph to rbd
  • Assignee changed from Kefu Chai to Jason Dillaman

Jason, could you help take a look? From inspecting qa/suites/upgrade/client-upgrade/jewel-client-x/basic, I think the combination of "jewel librbd client + luminous librbd + jewel cluster" is used on purpose to test interoperability.

#14 Updated by Jason Dillaman 3 months ago

  • Status changed from Need Review to Pending Backport
  • Backport set to kraken,jewel

#15 Updated by Kefu Chai 3 months ago

  • Status changed from Pending Backport to Verified

Jason, the test still fails with this fix. See http://tracker.ceph.com/issues/20175#note-12

#16 Updated by Jason Dillaman 3 months ago

@Kefu: thanks -- that's a different issue. I'll take care of that today.

#17 Updated by Kefu Chai 3 months ago

  • Status changed from Verified to Pending Backport
  • Assignee deleted (Jason Dillaman)

#18 Updated by Nathan Cutler 3 months ago

  • Copied to Backport #20351: kraken: test_librbd_api.sh fails in upgrade test added

#20 Updated by Nathan Cutler 3 months ago

  • Backport changed from kraken,jewel to kraken

#21 Updated by Nathan Cutler 3 months ago

  • Backport changed from kraken to kraken, jewel

#23 Updated by Nathan Cutler 3 months ago

  • Copied to Backport #20532: jewel: test_librbd_api.sh fails in upgrade test added

#24 Updated by Nathan Cutler 25 days ago

  • Status changed from Pending Backport to Resolved
