Bug #17966

closed

test-erasure-eio.sh can crash osd

Added by David Zafman over 7 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Immediate
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The size of a replica is being corrupted, but I don't know why this crash has just started happening; the ASSERT has been in place since 2014. This read should succeed because 2 out of the 3 shards have the correct size. The next test will corrupt another shard, and the client should then get EIO. In neither case should the ASSERT be hit. This could be related to the EC overwrite changes, which may have somehow affected the standard code path.

Looking at the log line before the assert, it looks like to_read is asking for 4096 bytes, but each of the two good shards returned 2048. Earlier in the log, shard 1(1) returned EIO because of the bad size:

osd.1:
pg[2.0s1( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=1 lpr=49 pi=27-48/9 luod=0'0 crt=28'1 lcod 0'0 active] get_hash_info: Mismatch of total_chunk_size 2048

It might be a coincidence that we are modifying the size of a shard and the crash appears to be related to the amount of data read.
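
For context on the arithmetic above, here is a minimal sketch, assuming a k=2, m=1 jerasure profile (an assumption; the profile is not shown in the log) and using illustrative names rather than the real ECBackend helpers: a 4096-byte logical extent is striped across 2 data shards, so each surviving shard is expected to return 2048 bytes, while the get_hash_info mismatch is what turns shard 1(1)'s reply into EIO.

// Minimal sketch of the chunk-size arithmetic, assuming a k=2, m=1 jerasure
// profile (an assumption). All names are illustrative, not ECBackend helpers.
#include <cassert>
#include <cstdint>

int main() {
  const uint64_t k = 2;               // data shards (assumed)
  const uint64_t logical_len = 4096;  // the to_read extent from the log
  const uint64_t shard_len = logical_len / k;

  // Each surviving data shard is expected to return 2048 bytes of the
  // 4096-byte logical extent...
  assert(shard_len == 2048);

  // ...while osd.1 rejects its shard in get_hash_info because the on-disk
  // size no longer matches the recorded total_chunk_size (2048), which is
  // why shard 1(1) replies with EIO instead of data.
  return 0;
}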

osd.0:
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/erasure-code/test-erasure-eio.sh:227: rados_get_data_bad_size: rados_get td/test-erasure-eio pool-jerasure obj-size-19593-0-10 0
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/erasure-code/test-erasure-eio.sh:88: rados_get: local dir=td/test-erasure-eio
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/erasure-code/test-erasure-eio.sh:89: rados_get: local poolname=pool-jerasure
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/erasure-code/test-erasure-eio.sh:90: rados_get: local objname=obj-size-19593-0-10
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/erasure-code/test-erasure-eio.sh:91: rados_get: local expect=0
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/erasure-code/test-erasure-eio.sh:96: rados_get: '[' 0 = 1 ']'
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/erasure-code/test-erasure-eio.sh:104: rados_get: rados --pool pool-jerasure get obj-size-19593-0-10 td/test-erasure-eio/COPY

   -46> 2016-11-18 17:56:57.364077 7fc76c1f3700 10 osd.0 52 dequeue_op 0x7fc790c04a40 prio 127 cost 0 latency 0.000095 MOSDECSubOpReadReply(2.0s0 52 ECSubReadReply(tid=2, attrs_read=0)) v1 pg pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean]
   -45> 2016-11-18 17:56:57.364098 7fc76c1f3700  5 -- op tracker -- seq: 194, time: 2016-11-18 17:56:57.364098, event: reached_pg, op: MOSDECSubOpReadReply(2.0s0 52 ECSubReadReply(tid=2, attrs_read=0))
   -44> 2016-11-18 17:56:57.364103 7fc76c1f3700 10 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] handle_message: MOSDECSubOpReadReply(2.0s0 52 ECSubReadReply(tid=2, attrs_read=0)) v1
   -43> 2016-11-18 17:56:57.364112 7fc76c1f3700 10 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] handle_sub_read_reply: reply ECSubReadReply(tid=2, attrs_read=0)
   -42> 2016-11-18 17:56:57.364123 7fc76c1f3700 20 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] handle_sub_read_reply shard=1(1) error=-5
   -41> 2016-11-18 17:56:57.364134 7fc76c1f3700 20 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] handle_sub_read_reply have shard=0
   -40> 2016-11-18 17:56:57.364146 7fc76c1f3700 20 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] handle_sub_read_reply minimum_to_decode failed
   -39> 2016-11-18 17:56:57.364156 7fc76c1f3700 10 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] send_all_remaining_reads have/error shards=0,1
   -38> 2016-11-18 17:56:57.364164 7fc76c1f3700 10 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] get_remaining_shards: checking acting 0(0)
   -37> 2016-11-18 17:56:57.364173 7fc76c1f3700 10 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] get_remaining_shards: checking acting 1(1)
   -36> 2016-11-18 17:56:57.364183 7fc76c1f3700 10 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] get_remaining_shards: checking acting 2(2)
   -35> 2016-11-18 17:56:57.364195 7fc76c1f3700 10 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] send_all_remaining_reads Read remaining shards 2(2)
   -34> 2016-11-18 17:56:57.364206 7fc76c1f3700 10 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] do_read_op: starting read ReadOp(tid=2, to_read={2:2c070a00:::obj-size-26512-0-10:head=read_request_t(to_read=[0,4096,0], need=2(2), want_attrs=0)}, complete={2:2c070a00:::obj-size-26512-0-10:head=read_result_t(r=0, errors={1(1)=-5}, noattrs, returned=(0, 4096, [0(0),2048]))}, priority=127, obj_to_source={2:2c070a00:::obj-size-26512-0-10:head=0(0),1(1)}, source_to_obj={0(0)=2:2c070a00:::obj-size-26512-0-10:head,1(1)=2:2c070a00:::obj-size-26512-0-10:head}, in_progress=)
   -33> 2016-11-18 17:56:57.364236 7fc76c1f3700 10 osd.0 52 send_incremental_map 51 -> 52 to 0x7fc7906e6e00 127.0.0.1:6809/6807
   -32> 2016-11-18 17:56:57.364249 7fc76c1f3700  1 -- 127.0.0.1:6801/22396 --> 127.0.0.1:6809/6807 -- osd_map(52..52 src has 1..52) v3 -- 0x7fc79064e0c0 con 0
   -31> 2016-11-18 17:56:57.364258 7fc76c1f3700  2 Event(0x7fc7901f9080 nevent=5000 time_id=10).wakeup
   -30> 2016-11-18 17:56:57.364268 7fc76c1f3700 10 osd.0 52 note_peer_epoch osd.2 has 52
   -29> 2016-11-18 17:56:57.364272 7fc76c1f3700  1 -- 127.0.0.1:6801/22396 --> 127.0.0.1:6809/6807 -- MOSDECSubOpRead(2.0s2 52 ECSubRead(tid=2, to_read={2:2c070a00:::obj-size-26512-0-10:head=0,2048,0}, attrs_to_read=)) v2 -- 0x7fc790cd4100 con 0
   -28> 2016-11-18 17:56:57.364323 7fc76c1f3700  2 Event(0x7fc7901f9080 nevent=5000 time_id=10).wakeup
   -27> 2016-11-18 17:56:57.364335 7fc76c1f3700 10 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] do_read_op: started ReadOp(tid=2, to_read={2:2c070a00:::obj-size-26512-0-10:head=read_request_t(to_read=[0,4096,0], need=2(2), want_attrs=0)}, complete={2:2c070a00:::obj-size-26512-0-10:head=read_result_t(r=0, errors={1(1)=-5}, noattrs, returned=(0, 4096, [0(0),2048]),(0, 4096, []))}, priority=127, obj_to_source={2:2c070a00:::obj-size-26512-0-10:head=0(0),1(1),2(2)}, source_to_obj={0(0)=2:2c070a00:::obj-size-26512-0-10:head,1(1)=2:2c070a00:::obj-size-26512-0-10:head,2(2)=2:2c070a00:::obj-size-26512-0-10:head}, in_progress=2(2))
   -26> 2016-11-18 17:56:57.364362 7fc76c1f3700 10 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] handle_sub_read_reply readop not complete: ReadOp(tid=2, to_read={2:2c070a00:::obj-size-26512-0-10:head=read_request_t(to_read=[0,4096,0], need=2(2), want_attrs=0)}, complete={2:2c070a00:::obj-size-26512-0-10:head=read_result_t(r=0, errors={1(1)=-5}, noattrs, returned=(0, 4096, [0(0),2048]),(0, 4096, []))}, priority=127, obj_to_source={2:2c070a00:::obj-size-26512-0-10:head=0(0),1(1),2(2)}, source_to_obj={0(0)=2:2c070a00:::obj-size-26512-0-10:head,1(1)=2:2c070a00:::obj-size-26512-0-10:head,2(2)=2:2c070a00:::obj-size-26512-0-10:head}, in_progress=2(2))
   -25> 2016-11-18 17:56:57.364383 7fc76c1f3700 10 osd.0 52 dequeue_op 0x7fc790c04a40 finish
   -24> 2016-11-18 17:56:57.364386 7fc76c1f3700  5 -- op tracker -- seq: 194, time: 2016-11-18 17:56:57.364386, event: done, op: MOSDECSubOpReadReply(2.0s0 52 ECSubReadReply(tid=2, attrs_read=0))
   -23> 2016-11-18 17:56:57.364782 7fc7817cf700  5 -- 127.0.0.1:6801/22396 >> 127.0.0.1:6809/6807 conn(0x7fc7906e6e00 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=2 cs=1 l=0). rx osd.2 seq 103 0x7fc790d42f40 osd_map(52..52 src has 1..52) v3
   -22> 2016-11-18 17:56:57.364814 7fc776207700  1 -- 127.0.0.1:6801/22396 <== osd.2 127.0.0.1:6809/6807 103 ==== osd_map(52..52 src has 1..52) v3 ==== 214+0+0 (1107775612 0 0) 0x7fc790d42f40 con 0x7fc7906e6e00
   -21> 2016-11-18 17:56:57.364834 7fc776207700 20 osd.0 52 OSD::ms_dispatch: osd_map(52..52 src has 1..52) v3
   -20> 2016-11-18 17:56:57.364840 7fc776207700 10 osd.0 52 do_waiters -- start
   -19> 2016-11-18 17:56:57.364841 7fc776207700 10 osd.0 52 do_waiters -- finish
   -18> 2016-11-18 17:56:57.364845 7fc776207700 20 osd.0 52 _dispatch 0x7fc790d42f40 osd_map(52..52 src has 1..52) v3
   -17> 2016-11-18 17:56:57.364842 7fc7817cf700  5 -- 127.0.0.1:6801/22396 >> 127.0.0.1:6809/6807 conn(0x7fc7906e6e00 :6801 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=2 cs=1 l=0). rx osd.2 seq 104 0x7fc790cd4100 MOSDECSubOpReadReply(2.0s0 52 ECSubReadReply(tid=2, attrs_read=0)) v1
   -16> 2016-11-18 17:56:57.364852 7fc776207700  3 osd.0 52 handle_osd_map epochs [52,52], i have 52, src has [1,52]
   -15> 2016-11-18 17:56:57.364855 7fc776207700 10 osd.0 52  no new maps here, dropping
   -14> 2016-11-18 17:56:57.364852 7fc7817cf700  1 -- 127.0.0.1:6801/22396 <== osd.2 127.0.0.1:6809/6807 104 ==== MOSDECSubOpReadReply(2.0s0 52 ECSubReadReply(tid=2, attrs_read=0)) v1 ==== 2187+0+0 (3338324462 0 0) 0x7fc790cd4100 con 0x7fc7906e6e00
   -13> 2016-11-18 17:56:57.364862 7fc7817cf700 10 osd.0 52 handle_replica_op MOSDECSubOpReadReply(2.0s0 52 ECSubReadReply(tid=2, attrs_read=0)) v1 epoch 52
   -12> 2016-11-18 17:56:57.364867 7fc7817cf700 20 osd.0 52 should_share_map osd.2 127.0.0.1:6809/6807 52
   -11> 2016-11-18 17:56:57.364891 7fc7817cf700 15 osd.0 52 enqueue_op 0x7fc790c258e0 prio 127 cost 0 latency 0.000060 MOSDECSubOpReadReply(2.0s0 52 ECSubReadReply(tid=2, attrs_read=0)) v1
   -10> 2016-11-18 17:56:57.364899 7fc7817cf700  5 -- op tracker -- seq: 195, time: 2016-11-18 17:56:57.364899, event: queued_for_pg, op: MOSDECSubOpReadReply(2.0s0 52 ECSubReadReply(tid=2, attrs_read=0))
    -9> 2016-11-18 17:56:57.364927 7fc76e9f8700 10 osd.0 52 dequeue_op 0x7fc790c258e0 prio 127 cost 0 latency 0.000110 MOSDECSubOpReadReply(2.0s0 52 ECSubReadReply(tid=2, attrs_read=0)) v1 pg pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean]
    -8> 2016-11-18 17:56:57.364961 7fc76e9f8700  5 -- op tracker -- seq: 195, time: 2016-11-18 17:56:57.364960, event: reached_pg, op: MOSDECSubOpReadReply(2.0s0 52 ECSubReadReply(tid=2, attrs_read=0))
    -7> 2016-11-18 17:56:57.364967 7fc76e9f8700 10 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] handle_message: MOSDECSubOpReadReply(2.0s0 52 ECSubReadReply(tid=2, attrs_read=0)) v1
    -6> 2016-11-18 17:56:57.364996 7fc76e9f8700 10 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] handle_sub_read_reply: reply ECSubReadReply(tid=2, attrs_read=0)
    -5> 2016-11-18 17:56:57.365011 7fc76e9f8700 20 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] handle_sub_read_reply have shard=0
    -4> 2016-11-18 17:56:57.365020 7fc76e9f8700 20 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] handle_sub_read_reply have shard=2
    -3> 2016-11-18 17:56:57.365041 7fc76e9f8700 -1 log_channel(cluster) log [ERR] : handle_sub_read_reply: Error(s) ignored for 2:2c070a00:::obj-size-26512-0-10:head enough copies available
    -2> 2016-11-18 17:56:57.365047 7fc76e9f8700 10 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] handle_sub_read_reply Error(s) ignored for 2:2c070a00:::obj-size-26512-0-10:head enough copies available
    -1> 2016-11-18 17:56:57.365058 7fc76e9f8700 20 osd.0 pg_epoch: 52 pg[2.0s0( v 28'1 (0'0,28'1] local-les=50 n=1 ec=27 les/c/f 50/51/0 47/49/41) [0,1,2] r=0 lpr=49 crt=28'1 mlcod 0'0 active+clean] handle_sub_read_reply Complete: ReadOp(tid=2, to_read={2:2c070a00:::obj-size-26512-0-10:head=read_request_t(to_read=[0,4096,0], need=2(2), want_attrs=0)}, complete={2:2c070a00:::obj-size-26512-0-10:head=read_result_t(r=0, errors={}, noattrs, returned=(0, 4096, [0(0),2048, 2(2),2048]),(0, 4096, []))}, priority=127, obj_to_source={2:2c070a00:::obj-size-26512-0-10:head=0(0),1(1),2(2)}, source_to_obj={0(0)=2:2c070a00:::obj-size-26512-0-10:head,1(1)=2:2c070a00:::obj-size-26512-0-10:head,2(2)=2:2c070a00:::obj-size-26512-0-10:head}, in_progress=)
     0> 2016-11-18 17:56:57.369463 7fc76e9f8700 -1 /slow/dzafman/ceph/src/osd/ECBackend.cc: In function 'virtual void CallClientContexts::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)' thread 7fc76e9f8700 time 2016-11-18 17:56:57.365080
/slow/dzafman/ceph/src/osd/ECBackend.cc: 2131: FAILED assert(res.returned.size() == to_read.size())

 ceph version 11.0.2-1754-gcc37efa (cc37efa47c1ba85244a22b324378b5cbdbf0f516)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7fc786e814a5]
 2: (CallClientContexts::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)+0x59f) [0x7fc786aeb0bf]
 3: (ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x7f) [0x7fc786aca17f]
 4: (ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&, RecoveryMessages*)+0x1089) [0x7fc786acb299]
 5: (ECBackend::handle_message(std::shared_ptr<OpRequest>)+0x1a3) [0x7fc786ad6133]
 6: (ReplicatedPG::do_request(std::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x100) [0x7fc786970e70]
 7: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x41d) [0x7fc78681c16d]
 8: (PGQueueable::RunVis::operator()(std::shared_ptr<OpRequest> const&)+0x6d) [0x7fc78681c3bd]
 9: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x7e2) [0x7fc786843632]
 10: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x947) [0x7fc786e87117]
 11: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7fc786e89270]
 12: (()+0x31c7007ee5) [0x7fc783e2bee5]
 13: (clone()+0x6d) [0x7fc782d0fd1d]
#1

Updated by David Zafman over 7 years ago

  • Status changed from New to 12
  • Assignee set to David Zafman
#2

Updated by David Zafman over 7 years ago

This doesn't have to do with the number of bytes read; rather, res.returned has 2 list elements while to_read has only 1. The second list item in res.returned appears to be extra.

This may be related to Sam's change 1e95f2ce6.

(gdb) print res.returned
$9 = std::list = {
  [0] = {<boost::tuples::cons<unsigned long, boost::tuples::cons<unsigned long, boost::tuples::cons<std::map<pg_shard_t, ceph::buffer::list, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, ceph::buffer::list> > >, boost::tuples::null_type> > >> = {head = 0, tail = {head = 4096, tail = {head = std::map with 2 elements = {[{osd = 1, shard = {id = 1 '\001', static NO_SHARD = {
                  id = -1 '\377', static NO_SHARD = <same as static member of an already seen type>}}}] = {_buffers = std::list = {[0] = {_raw = 0x7f40945c5590, _off = 131, _len = 2048}}, _len = 2048,
              _memcopy_count = 0, append_buffer = {_raw = 0x0, _off = 0, _len = 0},
              last_p = {<ceph::buffer::list::iterator_impl<false>> = {<std::iterator<std::forward_iterator_tag, char, long, char*, char&>> = {<No data fields>}, bl = 0x7f40942369a8, ls = 0x7f40942369a8,
                  off = 0, p = {_raw = , _off = 0, _len = 0}, p_off = 0}, <No data fields>}, static CLAIM_DEFAULT = 0, static CLAIM_ALLOW_NONSHAREABLE = 1}, [{osd = 2, shard = {id = 2 '\002',
                static NO_SHARD = {id = -1 '\377', static NO_SHARD = <same as static member of an already seen type>}}}] = {_buffers = std::list = {[0] = {_raw = 0x7f40945c5e90, _off = 131,
                  _len = 2048}}, _len = 2048, _memcopy_count = 0, append_buffer = {_raw = 0x0, _off = 0, _len = 0},
              last_p = {<ceph::buffer::list::iterator_impl<false>> = {<std::iterator<std::forward_iterator_tag, char, long, char*, char&>> = {<No data fields>}, bl = 0x7f409469e9a8, ls = 0x7f409469e9a8,
                  off = 0, p = {_raw = , _off = 0, _len = 0}, p_off = 0}, <No data fields>}, static CLAIM_DEFAULT = 0, static CLAIM_ALLOW_NONSHAREABLE = 1}}}}}, <No data fields>},
  [1] = {<boost::tuples::cons<unsigned long, boost::tuples::cons<unsigned long, boost::tuples::cons<std::map<pg_shard_t, ceph::buffer::list, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, ceph::buffer::list> > >, boost::tuples::null_type> > >> = {head = 0, tail = {head = 4096, tail = {head = std::map with 0 elements}}}, <No data fields>}}
(gdb) print to_read
$10 = std::list = {[0] = {<boost::tuples::cons<unsigned long, boost::tuples::cons<unsigned long, boost::tuples::cons<unsigned int, boost::tuples::null_type> > >> = {head = 0, tail = {head = 4096,
        tail = {head = 0}}}, <No data fields>}}
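
To make the mismatch concrete, below is a minimal, self-contained sketch of the invariant checked by the assert at ECBackend.cc:2131, using simplified stand-ins for read_request_t/read_result_t (the real types live in src/osd/ECBackend.h; everything here is illustrative only). The final assert fires, mirroring the crash, because the second, empty (0, 4096, []) entry leaves res.returned with two elements against a single to_read extent.

// Simplified stand-ins for ECBackend's read_request_t/read_result_t; this is
// an illustrative sketch, not the real data structures.
#include <cassert>
#include <cstdint>
#include <list>
#include <map>
#include <string>

struct Extent { uint64_t off, len; };                    // one requested extent
using ShardData = std::map<int, std::string>;            // shard id -> bytes
struct Returned { uint64_t off, len; ShardData data; };  // one completed extent

int main() {
  std::list<Extent>   to_read  = {{0, 4096}};            // the single client extent
  std::list<Returned> returned;

  // First pass: shards 0(0) and 2(2) answer with 2048 bytes each,
  // shard 1(1) errors out with EIO.
  returned.push_back({0, 4096, {{0, std::string(2048, 'a')},
                                {2, std::string(2048, 'b')}}});

  // The bug: re-driving the read appends a second, empty entry for the same
  // extent -- the "(0, 4096, [])" visible in the ReadOp dump -- so the two
  // lists no longer line up one-to-one.
  returned.push_back({0, 4096, {}});

  // The relationship CallClientContexts::finish() asserts on
  // (ECBackend.cc:2131): exactly one result per requested extent.
  assert(returned.size() == to_read.size());             // fires, as in the crash
  return 0;
}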
#3

Updated by David Zafman over 7 years ago

  • Assignee changed from David Zafman to Samuel Just

I assume something in 1e95f2ce6 caused this. I looked at it for a while, but it will probably be more obvious to Sam, so I'm assigning it to him.

#4

Updated by Samuel Just over 7 years ago

do_read_op has a precondition that op.complete is empty. send_all_remaining_reads now violates that precondition. I'll refactor this so it's less confusing (and less crashy).
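
A rough sketch of that precondition violation, with heavily simplified stand-ins for ReadOp and do_read_op (the real code is ECBackend::do_read_op and send_all_remaining_reads; the structures and assert placement below are illustrative assumptions, not the actual implementation):

// Illustrative sketch only: simplified stand-ins for ECBackend's ReadOp and
// do_read_op, showing why re-driving an op with a partially filled
// 'complete' map breaks the bookkeeping.
#include <cassert>
#include <map>
#include <string>

struct ReadResult { int extents_returned = 0; };

struct ReadOp {
  std::map<std::string, ReadResult> complete;  // per-object partial results
};

// do_read_op assumes it is starting a fresh op: nothing completed yet.
void do_read_op(ReadOp &op, const std::string &oid) {
  assert(op.complete.empty());                 // the implicit precondition
  op.complete[oid] = ReadResult{};             // begin accumulating results
}

int main() {
  ReadOp op;
  do_read_op(op, "obj-size-26512-0-10");       // initial read: precondition holds
  op.complete["obj-size-26512-0-10"].extents_returned = 1;

  // After the EIO from shard 1(1), send_all_remaining_reads re-drives the
  // same op, but the partially filled 'complete' map is still attached, so
  // the precondition no longer holds and the results end up double-counted.
  do_read_op(op, "obj-size-26512-0-10");       // asserts here in this sketch
  return 0;
}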

#5

Updated by Samuel Just over 7 years ago

  • Status changed from 12 to In Progress
  • Priority changed from Normal to Immediate
#6

Updated by Samuel Just over 7 years ago

  • Status changed from In Progress to 7
#7

Updated by Samuel Just over 7 years ago

  • Status changed from 7 to Fix Under Review
#9

Updated by Samuel Just over 7 years ago

  • Status changed from Fix Under Review to Resolved