Project

General

Profile

Bug #57757

ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of_buffer'

Added by Laura Flores 4 months ago. Updated 2 months ago.

Status:
Fix Under Review
Priority:
Normal
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/yuriw-2022-09-29_16:44:24-rados-wip-lflores-testing-distro-default-smithi/7048173/remote/smithi133/crash/posted/2022-09-29T23:32:11.008544Z_f3f12160-378c-4631-a08e-53563a23d684/meta

{
    "crash_id": "2022-09-29T23:32:11.008544Z_f3f12160-378c-4631-a08e-53563a23d684",
    "timestamp": "2022-09-29T23:32:11.008544Z",
    "process_name": "ceph-osd",
    "entity_name": "osd.9",
    "ceph_version": "18.0.0-131-gb2711ca9",
    "utsname_hostname": "smithi133",
    "utsname_sysname": "Linux",
    "utsname_release": "5.4.0-126-generic",
    "utsname_version": "#142-Ubuntu SMP Fri Aug 26 12:12:57 UTC 2022",
    "utsname_machine": "x86_64",
    "os_name": "Ubuntu",
    "os_id": "ubuntu",
    "os_version_id": "20.04",
    "os_version": "20.04.5 LTS (Focal Fossa)",
    "backtrace": [
        "/lib/x86_64-linux-gnu/libpthread.so.0(+0x14420) [0x7f06c745d420]",
        "gsignal()",
        "abort()",
        "ceph-osd(+0xd0d130) [0x55b3c6c14130]",
        "(__cxxabiv1::__terminate(void (*)())+0xc) [0x55b3c7e4770c]",
        "ceph-osd(+0x1f40777) [0x55b3c7e47777]",
        "ceph-osd(+0x1f408d9) [0x55b3c7e478d9]",
        "ceph-osd(+0xce37c1) [0x55b3c6bea7c1]",
        "(ECUtil::decode(ECUtil::stripe_info_t const&, std::shared_ptr<ceph::ErasureCodeInterface>&, std::map<int, ceph::buffer::v15_2_0::list, std::less<int>, std::allocator<std::pair<int const, ceph::buffer::v15_2_0::list> > >&, std::map<int, ceph::buffer::v15_2_0::list*, std::less<int>, std::allocator<std::pair<int const, ceph::buffer::v15_2_0::list*> > >&)+0x49b) [0x55b3c6f831cb]",
        "(ECBackend::handle_recovery_read_complete(hobject_t const&, boost::tuples::tuple<unsigned long, unsigned long, std::map<pg_shard_t, ceph::buffer::v15_2_0::list, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, ceph::buffer::v15_2_0::list> > >, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>&, std::optional<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >, RecoveryMessages*)+0x431) [0x55b3c70db3e1]",
        "(OnRecoveryReadComplete::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)+0x71) [0x55b3c7100981]",
        "(ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x8d) [0x55b3c70cc9ad]",
        "(ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&, RecoveryMessages*, ZTracer::Trace const&)+0xdc2) [0x55b3c70e4032]",
        "(ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x2d4) [0x55b3c70e4894]",
        "(PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x56) [0x55b3c6e9c726]",
        "(PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x4d2) [0x55b3c6e38682]",
        "(OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x1c0) [0x55b3c6c93860]",
        "(ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x69) [0x55b3c6f87629]",
        "(OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xb28) [0x55b3c6cacf38]",
        "(ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x434) [0x55b3c73d7f14]",
        "(ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x55b3c73db004]",
        "/lib/x86_64-linux-gnu/libpthread.so.0(+0x8609) [0x7f06c7451609]",
        "clone()" 
    ]
}

Teuthology log:

2022-09-29T23:32:10.984 INFO:tasks.ceph.osd.8.smithi133.stderr:2022-09-29T23:32:10.978+0000 7fe938179700 -1 received  signal: Hangup from /usr/bin/python3 /usr/bin/daemon-helper kill ceph-osd -f --cluster ceph -i 8  (PID: 26823) UID: 0
2022-09-29T23:32:10.992 INFO:tasks.ceph.osd.9.smithi133.stderr:terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of_buffer'
2022-09-29T23:32:10.993 INFO:tasks.ceph.osd.9.smithi133.stderr:  what():  End of buffer [buffer:2]
2022-09-29T23:32:10.993 INFO:tasks.ceph.osd.9.smithi133.stderr:*** Caught signal (Aborted) **
2022-09-29T23:32:10.993 INFO:tasks.ceph.osd.9.smithi133.stderr: in thread 7f06a5796700 thread_name:tp_osd_tp
2022-09-29T23:32:11.006 INFO:teuthology.orchestra.run.smithi012.stdout:{
2022-09-29T23:32:11.006 INFO:teuthology.orchestra.run.smithi012.stdout:    "ops": [],
2022-09-29T23:32:11.006 INFO:teuthology.orchestra.run.smithi012.stdout:    "num_ops": 0
2022-09-29T23:32:11.006 INFO:teuthology.orchestra.run.smithi012.stdout:}
2022-09-29T23:32:11.010 INFO:tasks.ceph.osd.9.smithi133.stderr: ceph version 18.0.0-131-gb2711ca9 (b2711ca9ca68627a0666c08c8c65c3a27a2da3f2) quincy (dev)
2022-09-29T23:32:11.010 INFO:tasks.ceph.osd.9.smithi133.stderr: 1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420) [0x7f06c745d420]
2022-09-29T23:32:11.011 INFO:tasks.ceph.osd.9.smithi133.stderr: 2: gsignal()
2022-09-29T23:32:11.011 INFO:tasks.ceph.osd.9.smithi133.stderr: 3: abort()
2022-09-29T23:32:11.011 INFO:tasks.ceph.osd.9.smithi133.stderr: 4: ceph-osd(+0xd0d130) [0x55b3c6c14130]
2022-09-29T23:32:11.011 INFO:tasks.ceph.osd.9.smithi133.stderr: 5: (__cxxabiv1::__terminate(void (*)())+0xc) [0x55b3c7e4770c]
2022-09-29T23:32:11.012 INFO:tasks.ceph.osd.9.smithi133.stderr: 6: ceph-osd(+0x1f40777) [0x55b3c7e47777]
2022-09-29T23:32:11.012 INFO:tasks.ceph.osd.9.smithi133.stderr: 7: ceph-osd(+0x1f408d9) [0x55b3c7e478d9]
2022-09-29T23:32:11.012 INFO:tasks.ceph.osd.9.smithi133.stderr: 8: ceph-osd(+0xce37c1) [0x55b3c6bea7c1]
2022-09-29T23:32:11.012 INFO:tasks.ceph.osd.9.smithi133.stderr: 9: (ECUtil::decode(ECUtil::stripe_info_t const&, std::shared_ptr<ceph::ErasureCodeInterface>&, std::map<int, ceph::buffer::v15_2_0::list, std::less<int>, std::allocator<std::pair<int const, ceph::buffer::v15_2_0::list> > >&, std::map<int, ceph::buffer::v15_2_0::list*, std::less<int>, std::allocator<std::pair<int const, ceph::buffer::v15_2_0::list*> > >&)+0x49b) [0x55b3c6f831cb]
2022-09-29T23:32:11.013 INFO:tasks.ceph.osd.9.smithi133.stderr: 10: (ECBackend::handle_recovery_read_complete(hobject_t const&, boost::tuples::tuple<unsigned long, unsigned long, std::map<pg_shard_t, ceph::buffer::v15_2_0::list, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, ceph::buffer::v15_2_0::list> > >, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>&, std::optional<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >, RecoveryMessages*)+0x431) [0x55b3c70db3e1]
2022-09-29T23:32:11.013 INFO:tasks.ceph.osd.9.smithi133.stderr: 11: (OnRecoveryReadComplete::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)+0x71) [0x55b3c7100981]
2022-09-29T23:32:11.013 INFO:tasks.ceph.osd.9.smithi133.stderr: 12: (ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x8d) [0x55b3c70cc9ad]
2022-09-29T23:32:11.013 INFO:tasks.ceph.osd.9.smithi133.stderr: 13: (ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&, RecoveryMessages*, ZTracer::Trace const&)+0xdc2) [0x55b3c70e4032]
2022-09-29T23:32:11.014 INFO:tasks.ceph.osd.9.smithi133.stderr: 14: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x2d4) [0x55b3c70e4894]
2022-09-29T23:32:11.014 INFO:tasks.ceph.osd.9.smithi133.stderr: 15: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x56) [0x55b3c6e9c726]
2022-09-29T23:32:11.014 INFO:tasks.ceph.osd.9.smithi133.stderr: 16: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x4d2) [0x55b3c6e38682]
2022-09-29T23:32:11.014 INFO:tasks.ceph.osd.9.smithi133.stderr: 17: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x1c0) [0x55b3c6c93860]
2022-09-29T23:32:11.015 INFO:tasks.ceph.osd.9.smithi133.stderr: 18: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x69) [0x55b3c6f87629]
2022-09-29T23:32:11.015 INFO:tasks.ceph.osd.9.smithi133.stderr: 19: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xb28) [0x55b3c6cacf38]
2022-09-29T23:32:11.015 INFO:tasks.ceph.osd.9.smithi133.stderr: 20: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x434) [0x55b3c73d7f14]
2022-09-29T23:32:11.015 INFO:tasks.ceph.osd.9.smithi133.stderr: 21: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x55b3c73db004]
2022-09-29T23:32:11.015 INFO:tasks.ceph.osd.9.smithi133.stderr: 22: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8609) [0x7f06c7451609]
2022-09-29T23:32:11.016 INFO:tasks.ceph.osd.9.smithi133.stderr: 23: clone()

Found a very similar instance of this crash tracked in Telemetry. The affected version there is v14.2.19.
http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?var-sig_v2=ab549128220593a3ca9e0e5dacabaa9ffae5833e27ebd2cb413eeb3c1ac669ee&orgId=1

There were a few other crashes that looked similar (link pasted below), but the one above is the closest to the crash logged in Teuthology. All versions are 14:
http://telemetry.front.sepia.ceph.com:4000/d/Nvj6XTaMk/spec-search?orgId=1&var-substr_1=ECUtil::decode&var-substr_2=ECBackend::handle_recovery_read_complete&var-substr_3=OSD::dequeue_op&var-majors_affected=&var-minors_affected=&var-assert_function=&var-assert_condition=&var-total_results=4&var-sig_v1=&var-sig_v2=&var-daemons=&var-daemons=ceph-osd&var-only_new_fingerprints=false&var-status_description=All&var-only_open=false

History

#1 Updated by Laura Flores 4 months ago

  • Affected Versions v14.2.9 added

Put affected version as "14.2.9" since there is no option for "14.2.19".

#2 Updated by Radoslaw Zarzynski 4 months ago

  • Assignee set to Nitzan Mordechai

There is a coredump on the teuhtology node (/ceph/teuthology-archive/yuriw-2022-09-29_16:44:24-rados-wip-lflores-testing-distro-default-smithi/7048173/remote/smithi133/coredump). Might be worth taking a look / asking Brad Hubbard.

#3 Updated by Nitzan Mordechai 3 months ago

The coredump from branch wip-lflores-testing, I was not able to create docker image since this branch is no longer available.
i'll try to recreate it with main and see if i can reproduce it

#4 Updated by Laura Flores 3 months ago

@Nitzan, here is the branch if you'd like to rebuild it on ci: https://github.com/ljflores/ceph/commits/wip-lflores-testing

It is based on main though, so you would probably be able to get the same results on the main branch.

#5 Updated by Nitzan Mordechai 3 months ago

@Laura, thanks for that! i'll try first with main as you suggested

#6 Updated by Nitzan Mordechai 3 months ago

2022-09-29T23:32:10.982+0000 7f06a5796700 20 osd.9 pg_epoch: 480 pg[3.cs0( v 477'1140 lc 444'1131 (0'0,477'1140] local-lis/les=476/477 n=9 ec=25/25 lis/c=476/394 les/c/f=477/409/0 sis=476) [9,3,11,7,4,2]p9(0) r=0 lpr=477 pi=[394,476)/1 crt=477'1140 mlcod 444'1131 active+recovering rops=1 m=1 mbc={0={(0+1)=1},1={(0+1)=1},2={(0+1)=1},3={(0+1)=1},4={(0+1)=1},5={(0+1)=1}} trimq=[13d~1,14d~1,15c~1,15e~2,164~1,167~1,16b~1,172~1] ps=[139~1,14c~1,151~1,158~1,15b~1]] handle_sub_read_reply Complete: ReadOp(tid=706, to_read={3:307edc38:::smithi01231941-9:head=read_request_t(to_read=[3145728,1048576,0], need={}, want_attrs=0)}, complete={3:307edc38:::smithi01231941-9:head=read_result_t(r=0, errors={}, noattrs, returned=(3145728, 1048576, [0(0),262144, 1(1),258048, 3(3),258048, 5(2),258048, 6(4),258048, 11(5),258048]))}, priority=127, obj_to_source={3:307edc38:::smithi01231941-9:head=0(0),1(1),3(3),5(2),6(4),11(5)}, source_to_obj={0(0)=3:307edc38:::smithi01231941-9:head,1(1)=3:307edc38:::smithi01231941-9:head,3(3)=3:307edc38:::smithi01231941-9:head,5(2)=3:307edc38:::smithi01231941-9:head,6(4)=3:307edc38:::smithi01231941-9:head,11(5)=3:307edc38:::smithi01231941-9:head}, in_progress=)


This one doesn't look right, shard 0 returned with wrong length. unlike other shards, that what i think causing the decode function to fail. it supposed to return with 258048, but it returned with the requested length instead (262144) we can see in it on OSD.0 output:
2022-09-29T23:32:10.711+0000 7f80cd27f700 10 osd.0 479 dequeue_op 0x55e331e43500 prio 127 cost 0 latency 0.000160 MOSDECSubOpWrite(3.1ds4 479/466 ECSubWrite(tid=2480, reqid=client.4590.0:4651, at_version=0'0, trim_to=0'0, roll_forward_to=479'644)) v2 pg pg[3.1ds4( v 479'644 (0'0,479'644] local-lis/les=466/467 
n=7 ec=393/25 lis/c=466/466 les/c/f=467/467/0 sis=466) [8,11,7,4,0,3]p11(1) r=4 lpr=466 luod=0'0 crt=479'641 mlcod 479'641 active mbc={} ps=[139~1,13d~1,14c~2,151~1,158~1,15b~2,15e~2,164~1,167~1,16b~1,172~1]]
2022-09-29T23:32:10.711+0000 7f80d1a88700 20 osd.0 pg_epoch: 479 pg[3.cs0( v 452'1140 (0'0,452'1140] local-lis/les=394/395 n=10 ec=25/25 lis/c=394/394 les/c/f=395/409/0 sis=476) [9,3,11,7,4,2]p9(0) r=-1 lpr=476 pi=[394,476)/1 luod=0'0 crt=452'1140 lcod 446'1136 mlcod 0'0 active mbc={} ps=[135~1,139~1,13c~1,142
~1,149~1,14c~1,14e~1,150~3,158~1,15b~1]] handle_sub_read read request=262144 r=262144 len=262144

unlike other shards that returned with correct length:

2022-09-29T23:32:10.610+0000 7f38d8e41700 20 osd.6 pg_epoch: 479 pg[3.cs4( v 450'1138 (0'0,450'1138] local-lis/les=394/395 n=10 ec=25/25 lis/c=394/394 les/c/f=395/409/0 sis=476) [9,3,11,7,4,2]p9(0) r=-1 lpr=476 pi=[394,476)/1 luod=0'0 crt=450'1138 lcod 452'1140 mlcod 0'0 active mbc={} ps=[135~1,139~1,13c~1,142~1,149~1,14c~1,14e~1,150~3,158~1,15b~1]] handle_sub_read read request=262144 r=258048 len=258048

#7 Updated by Laura Flores 3 months ago

Perhaps there is somewhere that the length should be getting updated, but it not?

#8 Updated by Laura Flores 3 months ago

The output Nitzan pasted is from printing ECBackend::read_result_t:

src/osd/ECBackend.cc

ostream &operator<<(ostream &lhs, const ECBackend::read_result_t &rhs)
{
  lhs << "read_result_t(r=" << rhs.r
      << ", errors=" << rhs.errors;
  if (rhs.attrs) {
    lhs << ", attrs=" << *(rhs.attrs);
  } else {
    lhs << ", noattrs";
  }
  return lhs << ", returned=" << rhs.returned << ")";
}

And from the function `handle_sub_read`:

src/osd/ECBackend.cc

void ECBackend::handle_sub_read(
  pg_shard_t from,
  const ECSubRead &op,
  ECSubReadReply *reply,
  const ZTracer::Trace &trace)
{

...

      if (r < 0) {
        // if we are doing fast reads, it's possible for one of the shard
        // reads to cross paths with another update and get a (harmless)
        // ENOENT.  Suppress the message to the cluster log in that case.
        if (r == -ENOENT && get_parent()->get_pool().fast_read) {
          dout(5) << __func__ << ": Error " << r
                  << " reading " << i->first << ", fast read, probably ok" 
                  << dendl;
        } else {
          get_parent()->clog_error() << "Error " << r
                                     << " reading object " 
                                     << i->first;
          dout(5) << __func__ << ": Error " << r
                  << " reading " << i->first << dendl;
        }
        goto error;
      } else {
        dout(20) << __func__ << " read request=" << j->get<1>() << " r=" << r << " len=" << bl.length() << dendl;
        reply->buffers_read[i->first].push_back(
          make_pair(
            j->get<0>(),
            bl)
          );
      }

...

  reply->from = get_parent()->whoami_shard();
  reply->tid = op.tid;
}

#9 Updated by Laura Flores 3 months ago

By the way, I have the coredump saved on the teuthology node under /home/lflores/tracker_57757.

#10 Updated by Nitzan Mordechai 3 months ago

Laura, are you able to use GDB with debuginfo on that coredump file?

#11 Updated by Laura Flores 3 months ago

No luck yet, but I'm trying to set up the right debug environment. So far, gdb is only giving me question marks, but I'm sure it's because I don't have a debug package installed properly or something.

#12 Updated by Laura Flores 3 months ago

@Brad do you have any tips on how to load the correct debug symbols for the above coredump? After running the `ceph-debug-docker.sh` script to set up an ubuntu focal environment based on the main branch, I run:

root@b7a54ac4ed27:~# gdb /usr/bin/ceph-osd /teuthology/yuriw-2022-09-29_16:44:24-rados-wip-lflores-testing-distro-default-smithi/7048173/remote/smithi133/coredump/1664494331.37862.core
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/ceph-osd...
Reading symbols from /usr/lib/debug/.build-id/d8/da6662fcecf34c749539f01c4002f19b5db27d.debug...

warning: exec file is newer than core file.
[New LWP 38723]
[New LWP 37880]
[New LWP 37866]
[New LWP 37887]
[New LWP 37862]
[New LWP 37865]
[New LWP 37878]
[New LWP 37952]
[New LWP 37871]
[New LWP 37953]
[New LWP 37883]
[New LWP 37960]
[New LWP 38693]
[New LWP 37900]
[New LWP 38696]
[New LWP 37961]
[New LWP 38700]
[New LWP 37955]
[New LWP 37879]
[New LWP 38697]
[New LWP 37962]
[New LWP 37954]
[New LWP 38698]
[New LWP 38710]
[New LWP 37913]
[New LWP 38699]
[New LWP 38706]
[New LWP 37949]
[New LWP 38731]
[New LWP 38712]
[New LWP 38702]
[New LWP 38701]
[New LWP 38327]
[New LWP 38707]
[New LWP 38704]
[New LWP 37863]
[New LWP 38694]
[New LWP 37950]
[New LWP 38711]
[New LWP 38705]
[New LWP 38708]
[New LWP 38511]
[New LWP 38719]
[New LWP 37951]
[New LWP 38709]
[New LWP 38714]
[New LWP 38722]
[New LWP 38732]
[New LWP 38703]
[New LWP 38717]
[New LWP 38715]
[New LWP 38716]
[New LWP 38729]
[New LWP 37867]
[New LWP 38718]
[New LWP 38326]
[New LWP 38726]
[New LWP 37872]
[New LWP 38727]
[New LWP 38721]
[New LWP 38734]
[New LWP 38733]
[New LWP 38510]
[New LWP 38724]
[New LWP 38713]
[New LWP 38728]
[New LWP 38692]
[New LWP 38730]
[New LWP 38695]
[New LWP 38735]
[New LWP 38725]
[New LWP 38720]
[New LWP 37884]
--Type <RET> for more, q to quit, c to continue without paging--
Core was generated by `ceph-osd -f --cluster ceph -i 9'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f06c745d2ab in ?? ()
[Current thread is 1 (LWP 38723)]
(gdb) bt
#0  0x00007f06c745d2ab in ?? ()
#1  0x0000000000001000 in ?? ()
#2  0x0000000000000000 in ?? ()

Maybe I'm missing something obvious? It's certainly not detecting the right debug symbols.

#13 Updated by Nitzan Mordechai 3 months ago

@brad, maybe it's a good candidate for another blog for upstream core dump analysis that you talked about (ubuntu 20.04)

#14 Updated by Laura Flores 3 months ago

According to Brad, the build needs to be as close to the test branch that originally experienced the crash as possible. I'm rebuilding the branch here: https://shaman.ceph.com/builds/ceph/wip-lflores-testing/51b60d63060a1ab939a826e4a67f0324a9099e5a/

Also, Brad wrote a helpful blog post on debugging an ubuntu 20.04 coredump here that we can use: https://source.redhat.com/personal_blogs/debugging_a_ceph_osd_coredump_on_ubuntu_2004

#15 Updated by Laura Flores 3 months ago

Got it to detect the right symbols with the new build!

I will attempt to analyze this coredump at a deeper level, and will post updates here.

Brad's tutorial helped a lot: https://source.redhat.com/personal_blogs/debugging_a_ceph_osd_coredump_on_ubuntu_2004

(gdb) bt
#0  raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x000055b3c738179a in reraise_fatal (signum=6) at ./src/global/signal_handler.cc:88
#2  handle_oneshot_fatal_signal (signum=6) at ./src/global/signal_handler.cc:363
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#5  0x00007f06c70bd859 in __GI_abort () at abort.c:79
#6  0x000055b3c6c14130 in __gnu_cxx::__verbose_terminate_handler() [clone .cold] () at /usr/include/c++/11/bits/allocator.h:162
#7  0x000055b3c7e4770c in __cxxabiv1::__terminate(void (*)()) ()
#8  0x000055b3c7e47777 in std::terminate() ()
#9  0x000055b3c7e478d9 in __cxa_throw ()
#10 0x000055b3c6bea7c1 in ceph::buffer::v15_2_0::list::substr_of (this=<optimized out>, other=..., off=off@entry=258048, len=len@entry=4096) at ./src/common/buffer.cc:1502
#11 0x000055b3c6f831cb in ECUtil::decode (sinfo=..., ec_impl=warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<ErasureCodeJerasure*, (__gnu_cxx::_Lock_policy)2>'
warning: RTTI symbol not found for class 'std::_Sp_counted_ptr<ErasureCodeJerasure*, (__gnu_cxx::_Lock_policy)2>'
std::shared_ptr<class ceph::ErasureCodeInterface> (use count 3, weak count 0) = {...}, Python Exception <class 'gdb.error'> Attempt to extract a component of a value that is not a struct/class/union.: 
to_decode=std::map with 6 elements, Python Exception <class 'gdb.error'> Attempt to extract a component of a value that is not a struct/class/union.: 
out=std::map with 6 elements)
    at /usr/include/c++/11/bits/stl_tree.h:350
#12 0x000055b3c70db3e1 in ECBackend::handle_recovery_read_complete (this=0x55b3dd4ac600, hoid=..., to_read=..., attrs=
    std::optional<class std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > >> [no contained value], m=0x7f06a57925a0) at ./src/osd/ECBackend.cc:464
#13 0x000055b3c7100981 in OnRecoveryReadComplete::finish (this=0x55b3dfcf0c60, in=...) at /usr/include/c++/11/ext/aligned_buffer.h:72
#14 0x000055b3c70cc9ad in GenContext<std::pair<RecoveryMessages*, ECBackend::read_result_t&>&>::complete<std::pair<RecoveryMessages*, ECBackend::read_result_t&>&> (t={...}, this=0x55b3dfcf0c60) at ./src/include/Context.h:52
#15 ECBackend::complete_read_op (this=this@entry=0x55b3dd4ac600, rop=..., m=m@entry=0x7f06a57925a0) at ./src/osd/ECBackend.cc:1351
#16 0x000055b3c70e4032 in ECBackend::handle_sub_read_reply (this=<optimized out>, from=..., op=..., m=0x7f06a57925a0, trace=...) at ./src/osd/ECBackend.cc:1334
#17 0x000055b3c70e4894 in ECBackend::_handle_message (this=0x55b3dd4ac600, _op=...) at ./src/osd/ECBackend.cc:858
#18 0x000055b3c6e9c726 in PGBackend::handle_message (this=this@entry=0x55b3dd4ac600, op=...) at /usr/include/c++/11/bits/atomic_base.h:377
#19 0x000055b3c6e38682 in PrimaryLogPG::do_request (this=0x55b3e1e3b000, op=..., handle=...) at ./obj-x86_64-linux-gnu/boost/include/boost/smart_ptr/scoped_ptr.hpp:103
#20 0x000055b3c6c93860 in OSD::dequeue_op (this=this@entry=0x55b3c9698000, pg=..., op=..., handle=...) at ./obj-x86_64-linux-gnu/boost/include/boost/smart_ptr/intrusive_ptr.hpp:197
#21 0x000055b3c6f87629 in ceph::osd::scheduler::PGOpItem::run (this=<optimized out>, osd=0x55b3c9698000, sdata=<optimized out>, pg=..., handle=...) at ./src/osd/PG.h:853
#22 0x000055b3c6cacf38 in ceph::osd::scheduler::OpSchedulerItem::run (handle=..., pg=..., sdata=<optimized out>, osd=<optimized out>, this=0x7f06a5792bc0) at /usr/include/c++/11/bits/unique_ptr.h:421
#23 OSD::ShardedOpWQ::_process (this=0x55b3c9698f10, thread_index=<optimized out>, hb=<optimized out>) at ./src/osd/OSD.cc:10822
#24 0x000055b3c73d7f14 in ShardedThreadPool::shardedthreadpool_worker (this=0x55b3c9698a58, thread_index=4) at ./src/common/WorkQueue.cc:313
#25 0x000055b3c73db004 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>) at ./src/common/WorkQueue.h:643
#26 0x00007f06c7451609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#27 0x00007f06c71ba133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

#16 Updated by Laura Flores 3 months ago

Looking at frame 12, I can see that the incorrect length (262144) for shard 0 is evident in the local variable "from". Specifically, `from[shard 0]._len`:

(gdb) f 12
#12 0x000055b3c70db3e1 in ECBackend::handle_recovery_read_complete (this=0x55b3dd4ac600, hoid=..., to_read=..., 
    attrs=std::optional<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > >> [no contained value], m=0x7f06a57925a0) at ./src/osd/ECBackend.cc:464
464      r = ECUtil::decode(sinfo, ec_impl, from, target);
(gdb) info locals
...
...
from = std::map with 6 elements = {[0] = {_buffers = {_root = {next = 0x55b3c96aecc0}, _tail = 0x55b3c96aecc0}, _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 262144, _num = 1, 
    static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, _off = 0, _len = 0}, <No data fields>}}, [1] = {_buffers = {_root = {next = 0x55b3c96aeb40}, 
      _tail = 0x55b3c96aeb40}, _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 258048, _num = 1, static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {
        next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, _off = 0, _len = 0}, <No data fields>}}, [2] = {_buffers = {_root = {next = 0x55b3ce772d00}, _tail = 0x55b3ce772d00}, 
    _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 258048, _num = 1, static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, 
        _off = 0, _len = 0}, <No data fields>}}, [3] = {_buffers = {_root = {next = 0x55b3cd6581a0}, _tail = 0x55b3cd6581a0}, _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 258048, _num = 1, 
    static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, _off = 0, _len = 0}, <No data fields>}}, [4] = {_buffers = {_root = {next = 0x55b3c96ae0c0}, 
      _tail = 0x55b3c96ae0c0}, _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 258048, _num = 1, static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {
        next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, _off = 0, _len = 0}, <No data fields>}}, [5] = {_buffers = {_root = {next = 0x55b3c96aefe0}, _tail = 0x55b3c96aefe0}, 
    _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 258048, _num = 1, static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, 
        _off = 0, _len = 0}, <No data fields>}}}

Back in the "op" argument, still in frame 12, shard 0 has the correct length (258048) recorded in op.returned_data[shard 0]._len:

op = @0x55b3ca83fca0: {hoid = {static POOL_META = -1, static POOL_TEMP_START = -2, oid = {name = "smithi01231941-9"}, snap = {val = 18446744073709551614}, hash = 473660940, max = false, nibblewise_key_cache = 3236410305, 
    hash_reverse_bits = 813620280, pool = 3, nspace = "", key = ""}, v = {version = 1132, epoch = 444, __pad = 0}, missing_on = std::set with 6 elements = {[0] = {static NO_OSD = 2147483647, osd = 2, shard = {id = 5 '\005', 
        static NO_SHARD = {id = -1 '\377', static NO_SHARD = <same as static member of an already seen type>}}}, [1] = {static NO_OSD = 2147483647, osd = 3, shard = {id = 1 '\001', static NO_SHARD = {id = -1 '\377', 
          static NO_SHARD = <same as static member of an already seen type>}}}, [2] = {static NO_OSD = 2147483647, osd = 4, shard = {id = 4 '\004', static NO_SHARD = {id = -1 '\377', 
          static NO_SHARD = <same as static member of an already seen type>}}}, [3] = {static NO_OSD = 2147483647, osd = 7, shard = {id = 3 '\003', static NO_SHARD = {id = -1 '\377', 
          static NO_SHARD = <same as static member of an already seen type>}}}, [4] = {static NO_OSD = 2147483647, osd = 9, shard = {id = 0 '\000', static NO_SHARD = {id = -1 '\377', 
          static NO_SHARD = <same as static member of an already seen type>}}}, [5] = {static NO_OSD = 2147483647, osd = 11, shard = {id = 2 '\002', static NO_SHARD = {id = -1 '\377', 
          static NO_SHARD = <same as static member of an already seen type>}}}}, missing_on_shards = std::set with 6 elements = {[0] = {id = 0 '\000', static NO_SHARD = {id = -1 '\377', 
        static NO_SHARD = <same as static member of an already seen type>}}, [1] = {id = 1 '\001', static NO_SHARD = {id = -1 '\377', static NO_SHARD = <same as static member of an already seen type>}}, [2] = {id = 2 '\002', 
      static NO_SHARD = {id = -1 '\377', static NO_SHARD = <same as static member of an already seen type>}}, [3] = {id = 3 '\003', static NO_SHARD = {id = -1 '\377', static NO_SHARD = <same as static member of an already seen type>}}, 
    [4] = {id = 4 '\004', static NO_SHARD = {id = -1 '\377', static NO_SHARD = <same as static member of an already seen type>}}, [5] = {id = 5 '\005', static NO_SHARD = {id = -1 '\377', 
        static NO_SHARD = <same as static member of an already seen type>}}}, recovery_info = {soid = {static POOL_META = -1, static POOL_TEMP_START = -2, oid = {name = "smithi01231941-9"}, snap = {val = 18446744073709551614}, 
      hash = 473660940, max = false, nibblewise_key_cache = 3236410305, hash_reverse_bits = 813620280, pool = 3, nspace = "", key = ""}, version = {version = 1132, epoch = 444, __pad = 0}, size = 6733824, oi = {soid = {
        static POOL_META = -1, static POOL_TEMP_START = -2, oid = {name = "smithi01231941-9"}, snap = {val = 18446744073709551614}, hash = 473660940, max = false, nibblewise_key_cache = 3236410305, hash_reverse_bits = 813620280, 
        pool = 3, nspace = "", key = ""}, version = {version = 1140, epoch = 452, __pad = 0}, prior_version = {version = 1132, epoch = 444, __pad = 0}, user_version = 725, last_reqid = {name = {_type = 8 '\b', _num = 4590, 
          static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8, static TYPE_MGR = 16, static NEW = -1}, tid = 4337, inc = 0}, size = 6733824, mtime = {tv = {tv_sec = 1664494275, tv_nsec = 93437093}}, 
      local_mtime = {tv = {tv_sec = 1664494277, tv_nsec = 411591785}}, flags = (object_info_t::FLAG_DIRTY | object_info_t::FLAG_DATA_DIGEST), truncate_seq = 0, truncate_size = 0, watchers = std::map with 0 elements, 
      data_digest = 3568415875, omap_digest = 4294967295, expected_object_size = 0, expected_write_size = 0, alloc_hint_flags = 0, manifest = {type = 0 '\000', redirect_target = {static POOL_META = -1, static POOL_TEMP_START = -2, 
          oid = {name = ""}, snap = {val = 0}, hash = 0, max = false, nibblewise_key_cache = 0, hash_reverse_bits = 0, pool = -9223372036854775808, nspace = "", key = ""}, chunk_map = std::map with 0 elements}}, ss = {seq = {val = 0}, 
      snaps = std::vector of length 0, capacity 0, clones = std::vector of length 0, capacity 0, clone_overlap = std::map with 0 elements, clone_size = std::map with 0 elements, clone_snaps = std::map with 0 elements}, copy_subset = {
      _size = 0, m = std::map with 0 elements}, clone_subset = std::map with 0 elements, object_exist = true}, recovery_progress = {data_recovered_to = 3145728, omap_recovered_to = "", first = false, data_complete = false, 
    omap_complete = true, error = false}, state = ECBackend::RecoveryOp::READING, returned_data = std::map with 6 elements = {[0] = {_buffers = {_root = {next = 0x55b3c96af140}, _tail = 0x55b3e0a7db40}, 
      _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 258048, _num = 63, static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, 
          _off = 0, _len = 0}, <No data fields>}}, [1] = {_buffers = {_root = {next = 0x55b3c96af240}, _tail = 0x55b3e0a7d1e0}, _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 258048, _num = 63, 
      static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, _off = 0, _len = 0}, <No data fields>}}, [2] = {_buffers = {_root = {next = 0x55b3c96af620}, 
        _tail = 0x55b3e0a7da40}, _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 258048, _num = 63, static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {
          next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, _off = 0, _len = 0}, <No data fields>}}, [3] = {_buffers = {_root = {next = 0x55b3c96af440}, _tail = 0x55b3e0a7c380}, 
      _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 258048, _num = 63, static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, 
          _off = 0, _len = 0}, <No data fields>}}, [4] = {_buffers = {_root = {next = 0x55b3c96aee80}, _tail = 0x55b3e0a7ca40}, _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 258048, _num = 63, 
      static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, _off = 0, _len = 0}, <No data fields>}}, [5] = {_buffers = {_root = {next = 0x55b3c96af2e0}, 
        _tail = 0x55b3e0a7cb60}, _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 258048, _num = 63, static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {
          next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, _off = 0, _len = 0}, <No data fields>}}}, xattrs = std::map with 4 elements = {["_"] = {_buffers = {_root = {next = 0x55b3e1c18f60}, _tail = 0x55b3e1c18f60}, 
      _carriage = 0x55b3e1c18f60, _len = 271, _num = 1, static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, _off = 0, _len = 0}, <No data fields>}}, ["__header"] = {
      _buffers = {_root = {next = 0x55b3e5d260a0}, _tail = 0x55b3e5d260a0}, _carriage = 0x55b3e5d260a0, _len = 58, _num = 1, static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {
          _raw = 0x0, _off = 0, _len = 0}, <No data fields>}}, ["hinfo_key"] = {_buffers = {_root = {next = 0x55b3ec20a940}, _tail = 0x55b3ec20a940}, _carriage = 0x55b3ec20a940, _len = 42, _num = 1, 
      static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, _off = 0, _len = 0}, <No data fields>}}, ["snapset"] = {_buffers = {_root = {next = 0x55b3e3198820}, 
        _tail = 0x55b3e3198820}, _carriage = 0x55b3e3198820, _len = 235, _num = 1, static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, _off = 0, 
          _len = 0}, <No data fields>}}}, hinfo = warning: RTTI symbol not found for class 'std::_Sp_counted_deleter<ECUtil::HashInfo*, SharedPtrRegistry<hobject_t, ECUtil::HashInfo, std::less<hobject_t> >::OnRemoval, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>'
warning: RTTI symbol not found for class 'std::_Sp_counted_deleter<ECUtil::HashInfo*, SharedPtrRegistry<hobject_t, ECUtil::HashInfo, std::less<hobject_t> >::OnRemoval, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>'
std::shared_ptr<ECUtil::HashInfo> (use count 1, weak count 1) = {get() = 0x55b3ec8d9380}, obc = std::shared_ptr<ObjectContext> (use count 2, weak count 1) = {get() = 0x55b3dfd3fb80}, 
  waiting_on_pushes = std::set with 0 elements, extent_requested = {first = 3145728, second = 1048576}}

This is the piece of code that leads up to the decode failure, on line 464 in ECBackend::handle_recovery_read_complete. From the coredump, we see that `RecoveryOp &op` has the right shard length stored. `from` should have the correct length after this line: `from[i->first.shard] = std::move(i->second);`, but something happened in the std::move() function to cause it to get the wrong length.

void ECBackend::handle_recovery_read_complete(
  const hobject_t &hoid,
  boost::tuple<uint64_t, uint64_t, map<pg_shard_t, bufferlist> > &to_read,
  std::optional<map<string, bufferlist, less<>> > attrs,
  RecoveryMessages *m)
{
  dout(10) << __func__ << ": returned " << hoid << " " 
           << "(" << to_read.get<0>()
           << ", " << to_read.get<1>()
           << ", " << to_read.get<2>()
           << ")" 
           << dendl;
  ceph_assert(recovery_ops.count(hoid));
  RecoveryOp &op = recovery_ops[hoid];
  ceph_assert(op.returned_data.empty());
  map<int, bufferlist*> target;
  for (set<shard_id_t>::iterator i = op.missing_on_shards.begin();
       i != op.missing_on_shards.end();
       ++i) {
    target[*i] = &(op.returned_data[*i]);
  }
  map<int, bufferlist> from;
  for(map<pg_shard_t, bufferlist>::iterator i = to_read.get<2>().begin();
      i != to_read.get<2>().end();
      ++i) {
    from[i->first.shard] = std::move(i->second);
  }
  dout(10) << __func__ << ": " << from << dendl;
  int r;
  r = ECUtil::decode(sinfo, ec_impl, from, target);

From what I can tell, we need to make sure that `from[shard i]._len` == `op.returned_data[shard i]._len` after std::move() is performed.

#17 Updated by Laura Flores 3 months ago

@Nitzan, what do you think about this analysis? Or are there any other frames/locals you'd like me to check?

#18 Updated by Laura Flores 3 months ago

Also to note, we can see information about argument `to_read` here:

(gdb) info args
this = 0x55b3dd4ac600
hoid = @0x55b3dfcf0c70: {static POOL_META = -1, static POOL_TEMP_START = -2, oid = {name = "smithi01231941-9"}, snap = {val = 18446744073709551614}, hash = 473660940, max = false, nibblewise_key_cache = 3236410305, 
  hash_reverse_bits = 813620280, pool = 3, nspace = "", key = ""}
to_read = @0x55b3dce02420: {<boost::tuples::cons<unsigned long, boost::tuples::cons<unsigned long, boost::tuples::cons<std::map<pg_shard_t, ceph::buffer::v15_2_0::list, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, ceph::buffer::v15_2_0::list> > >, boost::tuples::null_type> > >> = {head = 3145728, tail = {head = 1048576, tail = {head = std::map with 6 elements = {[{static NO_OSD = 2147483647, osd = 0, shard = {id = 0 '\000', static NO_SHARD = {
                id = -1 '\377', static NO_SHARD = <same as static member of an already seen type>}}}] = {_buffers = {_root = {next = 0x55b3ca68d568}, _tail = 0x55b3ca68d568}, 
            _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 0, _num = 0, static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, 
                _off = 0, _len = 0}, <No data fields>}}, [{static NO_OSD = 2147483647, osd = 1, shard = {id = 1 '\001', static NO_SHARD = {id = -1 '\377', static NO_SHARD = <same as static member of an already seen type>}}}] = {
            _buffers = {_root = {next = 0x55b3cdb015b8}, _tail = 0x55b3cdb015b8}, _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 0, _num = 0, 
            static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, _off = 0, _len = 0}, <No data fields>}}, [{static NO_OSD = 2147483647, osd = 3, shard = {
              id = 3 '\003', static NO_SHARD = {id = -1 '\377', static NO_SHARD = <same as static member of an already seen type>}}}] = {_buffers = {_root = {next = 0x55b3e24ce848}, _tail = 0x55b3e24ce848}, 
            _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 0, _num = 0, static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, 
                _off = 0, _len = 0}, <No data fields>}}, [{static NO_OSD = 2147483647, osd = 5, shard = {id = 2 '\002', static NO_SHARD = {id = -1 '\377', static NO_SHARD = <same as static member of an already seen type>}}}] = {
            _buffers = {_root = {next = 0x55b3e2d70348}, _tail = 0x55b3e2d70348}, _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 0, _num = 0, 
            static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, _off = 0, _len = 0}, <No data fields>}}, [{static NO_OSD = 2147483647, osd = 6, shard = {
              id = 4 '\004', static NO_SHARD = {id = -1 '\377', static NO_SHARD = <same as static member of an already seen type>}}}] = {_buffers = {_root = {next = 0x55b3e24cf108}, _tail = 0x55b3e24cf108}, 
            _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 0, _num = 0, static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, 
                _off = 0, _len = 0}, <No data fields>}}, [{static NO_OSD = 2147483647, osd = 11, shard = {id = 5 '\005', static NO_SHARD = {id = -1 '\377', static NO_SHARD = <same as static member of an already seen type>}}}] = {
            _buffers = {_root = {next = 0x55b3e2d71338}, _tail = 0x55b3e2d71338}, _carriage = 0x55b3c8663a40 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 0, _num = 0, 
            static always_empty_bptr = {<ceph::buffer::v15_2_0::ptr_hook> = {next = 0x0}, <ceph::buffer::v15_2_0::ptr> = {_raw = 0x0, _off = 0, _len = 0}, <No data fields>}}}}}}, <No data fields>}
attrs = std::optional<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > >> [no contained value]
m = 0x7f06a57925a0

This is relevant in that `to_read.get<2>[shard i]->second` is used to move shards in `from`. This is where the length mismatch occurred.

#19 Updated by Laura Flores 3 months ago

Brad and I did some more debugging today.

Here is the end of the log associated with the coredump:

2022-09-29 23:32:10.982902 thread: 0x7f06a5796700 priority: 20 subsystem: 24 osd.9 pg_epoch: 480 pg[3.cs0( v 477'1140 lc 444'1131 (0'0,477'1140] local-lis/les=476/477 n=9 ec=25/25 lis/c=476/394 les/c/f=477/409/0 sis=476) [9,3,11,7,4,2]p9(0) r=0 lpr=477 pi=[394,476)/1 crt=477'1140 mlcod 444'1131 active+recovering rops=1 m=1 mbc={0={(0+1)=1},1={(0+1)=1},2={(0+1)=1},3={(0+1)=1},4={(0+1)=1},5={(0+1)=1}} trimq=[13d~1,14d~1,15c~1,15e~2,164~1,167~1,16b~1,172~1] ps=[139~1,14c~1,151~1,158~1,15b~1]] handle_sub_read_reply Complete: ReadOp(tid=706, to_read={3:307edc38:::smithi01231941-9:head=read_request_t(to_read=[3145728,1048576,0], need={}, want_attrs=0)}, complete={3:307edc38:::smithi01231941-9:head=read_result_t(r=0, errors={}, noattrs, returned=(3145728, 1048576, [0(0),262144, 1(1),258048, 3(3),258048, 5(2),258048, 6(4),258048, 11(5),258048]))}, priority=127, obj_to_source={3:307edc38:::smithi01231941-9:head=0(0),1(1),3(3),5(2),6(4),11(5)}, source_to_obj={0(0)=3:307edc38:::smithi01231941-9:head,1(1)=3:307edc38:::smithi01231941-9:head,3(3)=3:307edc38:::smithi01231941-9:head,5(2)=3:307edc38:::smithi01231941-9:head,6(4)=3:307edc38:::smithi01231941-9:head,11(5)=3:307edc38:::smithi01231941-9:head}, in_progress=)
2022-09-29 23:32:10.982902 thread: 0x7f06a5796700 priority: 10 subsystem: 24 osd.9 pg_epoch: 480 pg[3.cs0( v 477'1140 lc 444'1131 (0'0,477'1140] local-lis/les=476/477 n=9 ec=25/25 lis/c=476/394 les/c/f=477/409/0 sis=476) [9,3,11,7,4,2]p9(0) r=0 lpr=477 pi=[394,476)/1 crt=477'1140 mlcod 444'1131 active+recovering rops=1 m=1 mbc={0={(0+1)=1},1={(0+1)=1},2={(0+1)=1},3={(0+1)=1},4={(0+1)=1},5={(0+1)=1}} trimq=[13d~1,14d~1,15c~1,15e~2,164~1,167~1,16b~1,172~1] ps=[139~1,14c~1,151~1,158~1,15b~1]] handle_recovery_read_complete: returned 3:307edc38:::smithi01231941-9:head (3145728, 1048576, [0(0),262144, 1(1),258048, 3(3),258048, 5(2),258048, 6(4),258048, 11(5),258048])
2022-09-29 23:32:10.982902 thread: 0x7f06a5796700 priority: 10 subsystem: 24 osd.9 pg_epoch: 480 pg[3.cs0( v 477'1140 lc 444'1131 (0'0,477'1140] local-lis/les=476/477 n=9 ec=25/25 lis/c=476/394 les/c/f=477/409/0 sis=476) [9,3,11,7,4,2]p9(0) r=0 lpr=477 pi=[394,476)/1 crt=477'1140 mlcod 444'1131 active+recovering rops=1 m=1 mbc={0={(0+1)=1},1={(0+1)=1},2={(0+1)=1},3={(0+1)=1},4={(0+1)=1},5={(0+1)=1}} trimq=[13d~1,14d~1,15c~1,15e~2,164~1,167~1,16b~1,172~1] ps=[139~1,14c~1,151~1,158~1,15b~1]] handle_recovery_read_complete: [0,262144, 1,258048, 2,258048, 3,258048, 4,258048, 5,258048]
2022-09-29 23:32:10.986902 thread: 0x7f06b77ba700 priority: 20 subsystem: 49 bluestore.MempoolThread(0x55b3c96c0b38) _resize_shards cache_size: 2845415832 kv_alloc: 1207959552 kv_used: 3310864 kv_onode_alloc: 234881024 kv_onode_used: 5699008 meta_alloc: 1140850688 meta_used: 3512567 data_alloc: 218103808 data_used: 12607488
2022-09-29 23:32:10.990902 thread: 0x7f06a078c700 priority: 20 subsystem: 24 osd.9 op_wq(6) _process 3.es5 to_process <OpSchedulerItem(3.es5 PGPeeringEvent(epoch_sent: 480 epoch_requested: 480 NullEvt) prio 255 cost 10 e480)> waiting <> waiting_peering {}
2022-09-29 23:32:10.990902 thread: 0x7f06a078c700 priority: 20 subsystem: 24 osd.9 op_wq(6) _process OpSchedulerItem(3.es5 PGPeeringEvent(epoch_sent: 480 epoch_requested: 480 NullEvt) prio 255 cost 10 e480) pg 0x55b3e1512000
2022-09-29 23:32:10.990902 thread: 0x7f06a078c700 priority: 10 subsystem: 24 osd.9 pg_epoch: 480 pg[3.es5( v 477'406 (477'406,477'406] local-lis/les=478/479 n=7 ec=25/25 lis/c=478/478 les/c/f=479/479/0 sis=480 pruub=15.075696945s) [7,4,6,11,0,9]p4(1) r=5 lpr=480 pi=[478,480)/1 crt=477'406 lcod 477'407 mlcod 0'0 unknown NOTIFY pruub 230.607772827s@ mbc={} ps=[139~1,13d~1,14c~2,151~1,158~1,15b~2,15e~2,164~1,167~1,16b~1,172~1]] do_peering_event: epoch_sent: 480 epoch_requested: 480 NullEvt
2022-09-29 23:32:10.990902 thread: 0x7f06a078c700 priority: 10 subsystem: 24 log is not dirty
2022-09-29 23:32:10.990902 thread: 0x7f06a078c700 priority: 20 subsystem: 24 osd.9 op_wq(6) _process empty q, waiting
2022-09-29 23:32:10.990902 thread: 0x7f06a4794700 priority: 20 subsystem: 24 osd.9 op_wq(6) _process empty q, waiting
2022-09-29 23:32:11.002902 thread: 0x7f06a5796700 priority: -1 subsystem: 0 *** Caught signal (Aborted) **
 in thread 7f06a5796700 thread_name:tp_osd_tp

 ceph version 18.0.0-131-gb2711ca9 (b2711ca9ca68627a0666c08c8c65c3a27a2da3f2) quincy (dev)
 1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420) [0x7f06c745d420]
 2: gsignal()
 3: abort()
 4: ceph-osd(+0xd0d130) [0x55b3c6c14130]
 5: (__cxxabiv1::__terminate(void (*)())+0xc) [0x55b3c7e4770c]
 6: ceph-osd(+0x1f40777) [0x55b3c7e47777]
 7: ceph-osd(+0x1f408d9) [0x55b3c7e478d9]
 8: ceph-osd(+0xce37c1) [0x55b3c6bea7c1]
 9: (ECUtil::decode(ECUtil::stripe_info_t const&, std::shared_ptr<ceph::ErasureCodeInterface>&, std::map<int, ceph::buffer::v15_2_0::list, std::less<int>, std::allocator<std::pair<int const, ceph::buffer::v15_2_0::list> > >&, std::map<int, ceph::buffer::v15_2_0::list*, std::less<int>, std::allocator<std::pair<int const, ceph::buffer::v15_2_0::list*> > >&)+0x49b) [0x55b3c6f831cb]
 10: (ECBackend::handle_recovery_read_complete(hobject_t const&, boost::tuples::tuple<unsigned long, unsigned long, std::map<pg_shard_t, ceph::buffer::v15_2_0::list, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, ceph::buffer::v15_2_0::list> > >, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>&, std::optional<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >, RecoveryMessages*)+0x431) [0x55b3c70db3e1]
 11: (OnRecoveryReadComplete::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)+0x71) [0x55b3c7100981]
 12: (ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x8d) [0x55b3c70cc9ad]
 13: (ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&, RecoveryMessages*, ZTracer::Trace const&)+0xdc2) [0x55b3c70e4032]
 14: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x2d4) [0x55b3c70e4894]
 15: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x56) [0x55b3c6e9c726]
 16: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x4d2) [0x55b3c6e38682]
 17: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x1c0) [0x55b3c6c93860]
 18: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x69) [0x55b3c6f87629]
 19: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xb28) [0x55b3c6cacf38]
 20: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x434) [0x55b3c73d7f14]
 21: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x55b3c73db004]
 22: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8609) [0x7f06c7451609]
 23: clone()

Right here is the last record of that bad value for shard 0, which was in `handle_recovery_read_complete`:

2022-09-29 23:32:10.982902 thread: 0x7f06a5796700 priority: 10 subsystem: 24 osd.9 pg_epoch: 480 pg[3.cs0( v 477'1140 lc 444'1131 (0'0,477'1140] local-lis/les=476/477 n=9 ec=25/25 lis/c=476/394 les/c/f=477/409/0 sis=476) [9,3,11,7,4,2]p9(0) r=0 lpr=477 pi=[394,476)/1 crt=477'1140 mlcod 444'1131 active+recovering rops=1 m=1 mbc={0={(0+1)=1},1={(0+1)=1},2={(0+1)=1},3={(0+1)=1},4={(0+1)=1},5={(0+1)=1}} trimq=[13d~1,14d~1,15c~1,15e~2,164~1,167~1,16b~1,172~1] ps=[139~1,14c~1,151~1,158~1,15b~1]] handle_recovery_read_complete: [0,262144, 1,258048, 2,258048, 3,258048, 4,258048, 5,258048]

Right above that, the bad value is also present in `handle_sub_read_reply`. We will need to trace where this bad value is first populated. At the moment, `handle_sub_read_reply` is the earliest spot we can confirm.

2022-09-29 23:32:10.982902 thread: 0x7f06a5796700 priority: 20 subsystem: 24 osd.9 pg_epoch: 480 pg[3.cs0( v 477'1140 lc 444'1131 (0'0,477'1140] local-lis/les=476/477 n=9 ec=25/25 lis/c=476/394 les/c/f=477/409/0 sis=476) [9,3,11,7,4,2]p9(0) r=0 lpr=477 pi=[394,476)/1 crt=477'1140 mlcod 444'1131 active+recovering rops=1 m=1 mbc={0={(0+1)=1},1={(0+1)=1},2={(0+1)=1},3={(0+1)=1},4={(0+1)=1},5={(0+1)=1}} trimq=[13d~1,14d~1,15c~1,15e~2,164~1,167~1,16b~1,172~1] ps=[139~1,14c~1,151~1,158~1,15b~1]] handle_sub_read_reply Complete: ReadOp(tid=706, to_read={3:307edc38:::smithi01231941-9:head=read_request_t(to_read=[3145728,1048576,0], need={}, want_attrs=0)}, complete={3:307edc38:::smithi01231941-9:head=read_result_t(r=0, errors={}, noattrs, returned=(3145728, 1048576, [0(0),262144, 1(1),258048, 3(3),258048, 5(2),258048, 6(4),258048, 11(5),258048]))}, priority=127, obj_to_source={3:307edc38:::smithi01231941-9:head=0(0),1(1),3(3),5(2),6(4),11(5)}, source_to_obj={0(0)=3:307edc38:::smithi01231941-9:head,1(1)=3:307edc38:::smithi01231941-9:head,3(3)=3:307edc38:::smithi01231941-9:head,5(2)=3:307edc38:::smithi01231941-9:head,6(4)=3:307edc38:::smithi01231941-9:head,11(5)=3:307edc38:::smithi01231941-9:head}, in_progress=)

#20 Updated by Nitzan Mordechai 3 months ago

@Laura, thanks for confirm that in the coredump, yes, shard0 also showing that when it get the chunk from bluestore:

2022-09-29T23:32:10.711+0000 7f80d1a88700 10 osd.0 479 dequeue_op 0x55e32cfef200 prio 127 cost 0 latency 0.100360 MOSDECSubOpRead(3.cs0 479/476 ECSubRead(tid=706, to_read={3:307edc38:::smithi01231941-9:head=786432,262144,0}, subchunks={3:307edc38:::smithi01231941-9:head=[0,1]}, attrs_to_read=)) v3 pg pg[3.cs0(
 v 452'1140 (0'0,452'1140] local-lis/les=394/395 n=10 ec=25/25 lis/c=394/394 les/c/f=395/409/0 sis=476) [9,3,11,7,4,2]p9(0) r=-1 lpr=476 pi=[394,476)/1 luod=0'0 crt=452'1140 lcod 446'1136 mlcod 0'0 active mbc={} ps=[135~1,139~1,13c~1,142~1,149~1,14c~1,14e~1,150~3,158~1,15b~1]]
2022-09-29T23:32:10.711+0000 7f80cda80700 20 osd.0 op_wq(4) _process empty q, waiting
2022-09-29T23:32:10.711+0000 7f80f0345700 15 osd.0 479 enqueue_op 0x55e331e43500 prio 127 type 108 cost 0 latency 0.000078 epoch 479 MOSDECSubOpWrite(3.1ds4 479/466 ECSubWrite(tid=2480, reqid=client.4590.0:4651, at_version=0'0, trim_to=0'0, roll_forward_to=479'644)) v2
2022-09-29T23:32:10.711+0000 7f80f0345700 20 osd.0 op_wq(5) _enqueue OpSchedulerItem(3.1ds4 PGOpItem(op=MOSDECSubOpWrite(3.1ds4 479/466 ECSubWrite(tid=2480, reqid=client.4590.0:4651, at_version=0'0, trim_to=0'0, roll_forward_to=479'644)) v2) prio 127 cost 0 e479)
2022-09-29T23:32:10.711+0000 7f80d1a88700 10 osd.0 pg_epoch: 479 pg[3.cs0( v 452'1140 (0'0,452'1140] local-lis/les=394/395 n=10 ec=25/25 lis/c=394/394 les/c/f=395/409/0 sis=476) [9,3,11,7,4,2]p9(0) r=-1 lpr=476 pi=[394,476)/1 luod=0'0 crt=452'1140 lcod 446'1136 mlcod 0'0 active mbc={} ps=[135~1,139~1,13c~1,142
~1,149~1,14c~1,14e~1,150~3,158~1,15b~1]] _handle_message: MOSDECSubOpRead(3.cs0 479/476 ECSubRead(tid=706, to_read={3:307edc38:::smithi01231941-9:head=786432,262144,0}, subchunks={3:307edc38:::smithi01231941-9:head=[0,1]}, attrs_to_read=)) v3
2022-09-29T23:32:10.711+0000 7f80d1a88700 15 bluestore(/var/lib/ceph/osd/ceph-0) read 3.cs0_head 0#3:307edc38:::smithi01231941-9:head# 0xc0000~40000
2022-09-29T23:32:10.711+0000 7f80d1a88700 20 _pin 0x55e32aa94000   0#3:307edc38:::smithi01231941-9:head# pinned
2022-09-29T23:32:10.711+0000 7f80d1a88700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_read 0xc0000~40000 size 0x19b000 (1683456)
2022-09-29T23:32:10.711+0000 7f80d1a88700 20 bluestore(/var/lib/ceph/osd/ceph-0) _do_read defaulting to buffered read
2022-09-29T23:32:10.711+0000 7f80cd27f700 20 osd.0 op_wq(5) _process 3.1ds4 to_process <> waiting <> waiting_peering {}
2022-09-29T23:32:10.711+0000 7f80cd27f700 20 osd.0 op_wq(5) _process OpSchedulerItem(3.1ds4 PGOpItem(op=MOSDECSubOpWrite(3.1ds4 479/466 ECSubWrite(tid=2480, reqid=client.4590.0:4651, at_version=0'0, trim_to=0'0, roll_forward_to=479'644)) v2) prio 127 cost 0 e479) queued
2022-09-29T23:32:10.711+0000 7f80d1a88700 20 bluestore(/var/lib/ceph/osd/ceph-0) _read_cache  blob Blob(0x55e3324c6d90 blob([0x5e00000~10000] csum crc32c/0x1000) use_tracker(0x10*0x1000 0x[1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000]) SharedBlob(0x55e3324c6e00 sbid 0x0)) need 0x0~10000 cache has 0x[0~10000]
2022-09-29T23:32:10.711+0000 7f80d1a88700 20 bluestore(/var/lib/ceph/osd/ceph-0) _read_cache  blob Blob(0x55e3324c6e70 blob([0x5e10000~10000] csum crc32c/0x1000) use_tracker(0x10*0x1000 0x[1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000]) SharedBlob(0x55e3324c6ee0 sbid 0x0)) need 0x0~10000 cache has 0x[0~10000]
2022-09-29T23:32:10.711+0000 7f80cd27f700 20 osd.0 op_wq(5) _process 3.1ds4 to_process <OpSchedulerItem(3.1ds4 PGOpItem(op=MOSDECSubOpWrite(3.1ds4 479/466 ECSubWrite(tid=2480, reqid=client.4590.0:4651, at_version=0'0, trim_to=0'0, roll_forward_to=479'644)) v2) prio 127 cost 0 e479)> waiting <> waiting_peering {}
2022-09-29T23:32:10.711+0000 7f80cd27f700 20 osd.0 op_wq(5) _process OpSchedulerItem(3.1ds4 PGOpItem(op=MOSDECSubOpWrite(3.1ds4 479/466 ECSubWrite(tid=2480, reqid=client.4590.0:4651, at_version=0'0, trim_to=0'0, roll_forward_to=479'644)) v2) prio 127 cost 0 e479) pg 0x55e331bb4000
2022-09-29T23:32:10.711+0000 7f80d1a88700 20 bluestore(/var/lib/ceph/osd/ceph-0) _read_cache  blob Blob(0x55e3324c6f50 blob([0x5e20000~10000] csum crc32c/0x1000) use_tracker(0x10*0x1000 0x[1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000]) SharedBlob(0x55e332d23f10 sbid 0x0)) need 0x0~10000 cache has 0x[0~10000]
2022-09-29T23:32:10.711+0000 7f80d1a88700 20 bluestore(/var/lib/ceph/osd/ceph-0) _read_cache  blob Blob(0x55e332d23dc0 blob([0x5e30000~10000] csum crc32c/0x1000) use_tracker(0x10*0x1000 0x[1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000,1000]) SharedBlob(0x55e3321c0070 sbid 0x0)) need 0x0~10000 cache has 0x[0~10000]
2022-09-29T23:32:10.711+0000 7f80d1287700 20 osd.0 op_wq(5) _process empty q, waiting
2022-09-29T23:32:10.711+0000 7f80d1a88700 20 _unpin 0x55e32aa94000   0#3:307edc38:::smithi01231941-9:head# unpinned
2022-09-29T23:32:10.711+0000 7f80d1a88700 10 bluestore(/var/lib/ceph/osd/ceph-0) read 3.cs0_head 0#3:307edc38:::smithi01231941-9:head# 0xc0000~40000 = 262144

All the rest of the shards showing "correct" length of 258048 as i mention early.
The interesting thing is how we got that chunk size, that I am not sure yet. We can see that the chunk was recovered before (the osd thrashed during the test)

#21 Updated by Nitzan Mordechai 3 months ago

after rechecking the logs it looks like we are taking 2 different versions of smithi01231941-9:head
All chunks with size 258048

2022-09-29T23:32:10.610+0000 7fb2decfb700 10 osd.5 pg_epoch: 479 pg[3.cs2( v 450'1138 (0'0,450'1138] local-lis/les=394/395 n=9 ec=25/25 lis/c=394/394 les/c/f=395/409/0 sis=476) [9,3,11,7,4,2]p9(0) r=-1 lpr=477 pi=[394,476)/1 luod=0'0 crt=450'1138 lcod 0'0 mlcod 0'0 active mbc={} ps=[135~1,139~1,13c~1,142~1,149
~1,14c~1,14e~1,150~3,158~1,15b~1]] _handle_message: MOSDECSubOpRead(3.cs2 479/476 ECSubRead(tid=706, to_read={3:307edc38:::smithi01231941-9:head=786432,262144,0}, subchunks={3:307edc38:::smithi01231941-9:head=[0,1]}, attrs_to_read=)) v3

the chunk with size 262144 (osd.0)
2022-09-29T23:32:10.711+0000 7f80d1a88700 10 osd.0 pg_epoch: 479 pg[3.cs0( v 452'1140 (0'0,452'1140] local-lis/les=394/395 n=10 ec=25/25 lis/c=394/394 les/c/f=395/409/0 sis=476) [9,3,11,7,4,2]p9(0) r=-1 lpr=476 pi=[394,476)/1 luod=0'0 crt=452'1140 lcod 446'1136 mlcod 0'0 active mbc={} ps=[135~1,139~1,13c~1,142
~1,149~1,14c~1,14e~1,150~3,158~1,15b~1]] _handle_message: MOSDECSubOpRead(3.cs0 479/476 ECSubRead(tid=706, to_read={3:307edc38:::smithi01231941-9:head=786432,262144,0}, subchunks={3:307edc38:::smithi01231941-9:head=[0,1]}, attrs_to_read=)) v3

we got that after we got shards 1,2,3,4,5 but in_progress=0(0), between that time to the replay from shard0 we had peering (probably update of the object)

 -28> 2022-09-29T23:32:10.982+0000 7f06a5796700 10 osd.9 pg_epoch: 480 pg[3.cs0( v 477'1140 lc 444'1131 (0'0,477'1140] local-lis/les=476/477 n=9 ec=25/25 lis/c=476/394 les/c/f=477/409/0 sis=476) [9,3,11,7,4,2]p9(0) r=0 lpr=477 pi=[394,476)/1 crt=477'1140 mlcod 444'1131 active+recovering rops=1 m=1 mbc={0={(0
   -27> 2022-09-29T23:32:10.982+0000 7f06a5796700 10 osd.9 480 dequeue_op 0x55b3d31a1b00 finish
   -26> 2022-09-29T23:32:10.982+0000 7f06a5796700 20 osd.9 op_wq(4) _process 3.cs0 to_process <> waiting <> waiting_peering {}
   -25> 2022-09-29T23:32:10.982+0000 7f06a5796700 20 osd.9 op_wq(4) _process OpSchedulerItem(3.cs0 PGOpItem(op=MOSDECSubOpReadReply(3.cs0 479/476 ECSubReadReply(tid=706, attrs_read=0)) v2) prio 127 cost 0 e479) queued
   -24> 2022-09-29T23:32:10.982+0000 7f06a5796700 20 osd.9 op_wq(4) _process 3.cs0 to_process <OpSchedulerItem(3.cs0 PGOpItem(op=MOSDECSubOpReadReply(3.cs0 479/476 ECSubReadReply(tid=706, attrs_read=0)) v2) prio 127 cost 0 e479)> waiting <> waiting_peering {}
   -23> 2022-09-29T23:32:10.982+0000 7f06a5796700 20 osd.9 op_wq(4) _process OpSchedulerItem(3.cs0 PGOpItem(op=MOSDECSubOpReadReply(3.cs0 479/476 ECSubReadReply(tid=706, attrs_read=0)) v2) prio 127 cost 0 e479) pg 0x55b3e1e3b000
   -22> 2022-09-29T23:32:10.982+0000 7f06a5796700 10 osd.9 480 dequeue_op 0x55b3dd0fc600 prio 127 cost 0 latency 0.272652 MOSDECSubOpReadReply(3.cs0 479/476 ECSubReadReply(tid=706, attrs_read=0)) v2 pg pg[3.cs0( v 477'1140 lc 444'1131 (0'0,477'1140] local-lis/les=476/477 n=9 ec=25/25 lis/c=476/394 les/c/f=477/


that I think caused the different

#22 Updated by Nitzan Mordechai 3 months ago

  • Status changed from New to In Progress

#23 Updated by Radoslaw Zarzynski 2 months ago

Discussed during the RADOS Team Meeting on 15 Nov.
Linking the Nitzan's gist: https://gist.github.com/NitzanMordhai/8e1d64fcaa782868cbf68b668eabd4a1.

#24 Updated by Nitzan Mordechai 2 months ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 49174

Also available in: Atom PDF