Bug #20754

closed

osd/PrimaryLogPG.cc: 1845: FAILED assert(!cct->_conf->osd_debug_misdirected_ops)

Added by Sage Weil over 6 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

 -1075> 2017-07-23 02:37:55.090582 7f76325b2700 20 osd.2 pg_epoch: 83 pg[3.7( v 83'1 (0'0,83'1] local-lis/les=80/83 n=1 ec=73/73 lis/c 80/80 les/c/f 83/83/0 73/80/73) [2,1] r=0 lpr=82 luod=0'0 lua=0'0 crt=83'1 lcod 0'0 mlcod 0'0 active+clean] do_op: op osd_op(client.4310.0:27 3.7 3:ee71ac71:::benchmark_data_smithi065_17586_object26:head [set-alloc-hint object_size 4194304 write_size 4194304,write 0~4194304] snapc 0=[] RETRY=1 ondisk+retry+write+known_if_redirected e79) v8
 -1074> 2017-07-23 02:37:55.090590 7f76325b2700 -1 osd.2 pg_epoch: 83 pg[3.7( v 83'1 (0'0,83'1] local-lis/les=80/83 n=1 ec=73/73 lis/c 80/80 les/c/f 83/83/0 73/80/73) [2,1] r=0 lpr=82 luod=0'0 lua=0'0 crt=83'1 lcod 0'0 mlcod 0'0 active+clean] do_op 3.7 does not contain 3:ee71ac71:::benchmark_data_smithi065_17586_object26:head pg_num 26 hash 8e358e77
 -1071> 2017-07-23 02:37:55.090602 7f76325b2700  0 log_channel(cluster) log [WRN] : 3.7 does not contain 3:ee71ac71:::benchmark_data_smithi065_17586_object26:head op osd_op(client.4310.0:27 3.7 3:ee71ac71:::benchmark_data_smithi065_17586_object26:head [set-alloc-hint object_size 4194304 write_size 4194304,write 0~4194304] snapc 0=[] RETRY=1 ondisk+retry+write+known_if_redirected e79) v8
     0> 2017-07-23 02:37:55.094172 7f76325b2700 -1 /build/ceph-12.1.1-380-g5e8fa3e/src/osd/PrimaryLogPG.cc: In function 'virtual void PrimaryLogPG::do_op(OpRequestRef&)' thread 7f76325b2700 time 2017-07-23 02:37:55.090610
/build/ceph-12.1.1-380-g5e8fa3e/src/osd/PrimaryLogPG.cc: 1845: FAILED assert(!cct->_conf->osd_debug_misdirected_ops)

 ceph version 12.1.1-380-g5e8fa3e (5e8fa3e06b68fae1582c9230a3a8d1abc6146286) luminous (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x10e) [0x7f765439cc6e]
 2: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x92d) [0x7f7653fe5ebd]
 3: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xe46) [0x7f7653fa6076]
 4: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3e6) [0x7f7653e4a8c6]
 5: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x47) [0x7f765409f367]
 6: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xff5) [0x7f7653e74ea5]
 7: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x907) [0x7f76543a24f7]
 8: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f76543a4670]
 9: (()+0x8184) [0x7f7651e20184]
 10: (clone()+0x6d) [0x7f7650f1037d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

/a/sage-2017-07-23_02:11:52-rados-wip-weight-set-distro-basic-smithi/1433151
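
The "does not contain" message can be checked by hand. The following is a minimal Python sketch of the placement arithmetic, assuming the usual stable-mod mapping from an object's hash to a PG seed (a port of Ceph's ceph_stable_mod). The helper names pg_mask, pg_seed and reverse_bits32 are made up for this illustration; the hash and pg_num values are taken from the log above. The bit-reversal check assumes the hex value printed in the object name is the bit-reversed hash, which is consistent with the two values in this log.

```python
def stable_mod(x: int, b: int, bmask: int) -> int:
    """Port of ceph_stable_mod(): like x % b, but stable across splits."""
    if (x & bmask) < b:
        return x & bmask
    return x & (bmask >> 1)


def pg_mask(pg_num: int) -> int:
    """Smallest 2^n - 1 mask that covers pg_num - 1 (hypothetical helper)."""
    return (1 << (pg_num - 1).bit_length()) - 1


def pg_seed(obj_hash: int, pg_num: int) -> int:
    """PG seed an object hash maps to for a given pg_num (hypothetical helper)."""
    return stable_mod(obj_hash, pg_num, pg_mask(pg_num))


def reverse_bits32(x: int) -> int:
    """Reverse the bit order of a 32-bit value."""
    return int(f"{x:032b}"[::-1], 2)


OBJ_HASH = 0x8E358E77  # "hash 8e358e77" in the log

# With pg_num 26 the object no longer belongs to PG 3.7.
seed_after = pg_seed(OBJ_HASH, 26)
print(f"pg_num 26 -> 3.{seed_after:x}")

# Any pg_num from 8 through 23 would have placed it in 3.7.
old_pg_nums = [n for n in range(1, 26) if pg_seed(OBJ_HASH, n) == 0x7]
print("pg_num values mapping to 3.7:", old_pg_nums)

# The object name in the log carries the bit-reversed hash.
print(f"reversed hash: {reverse_bits32(OBJ_HASH):08x}")
```

The log does not state the pool's pg_num before the change, so the sketch only shows which earlier values are compatible with the client having targeted 3.7.
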
Actions #1

Updated by Sage Weil over 6 years ago

the pg was split in e80 (the mon handled the pg_num change while at e79, so the new map is e80):

2017-07-23 02:37:46.737810 7fed57749700  7 mon.a@0(leader).osd e79 prepare_update mon_command({"var": "pg_num", "prefix": "osd pool set", "pool": "unique_pool_1", "val": "26"} v 0) v1 from client.4331 172.21.15.65:0/3989964320

the request was sent in e79:
 -1085> 2017-07-23 02:37:55.090557 7f76325b2700 10 osd.2 83 dequeue_op 0x7f766b061fa0 prio 63 cost 4194304 latency 8.491205 osd_op(client.4310.0:27 3.7 3.8e358e77 (undecoded) ondisk+retry+write+known_if_redirected e79) v8 pg pg[3.7( v 83'1 (0'0,83'1] local-lis/les=80/83 n=1 ec=73/73 lis/c 80/80 les/c/f 83/83/0 73/80/73) [2,1] r=0 lpr=82 luod=0'0 lua=0'0 crt=83'1 lcod 0'0 mlcod 0'0 active+clean]

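so this is actually not a bug, just a bad assert: the client targeted 3.7 using the e79 map, from before the split, so the op is only misdirected relative to the newer map the OSD has by the time it processes it.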
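
The comparison behind that conclusion can be sketched as follows. This is only an illustration of the reasoning in Python, not the change that was made to PrimaryLogPG.cc; the regular expression and the names op_epoch and sent_before_split are hypothetical. The log excerpt is copied from the dequeue_op line above, and the split epoch (80) is the one stated in this comment.

```python
import re

# Matches the client map epoch at the end of an osd_op(...) dump,
# e.g. "... known_if_redirected e79) v8" (pattern is an assumption
# based on the log lines in this ticket).
OP_EPOCH_RE = re.compile(r"\be(\d+)\) v\d+")


def op_epoch(log_line: str) -> int:
    """Extract the map epoch the client stamped on the op."""
    match = OP_EPOCH_RE.search(log_line)
    if match is None:
        raise ValueError("no op epoch found in log line")
    return int(match.group(1))


def sent_before_split(op_map_epoch: int, split_epoch: int) -> bool:
    """True if the op was addressed using a map older than the split."""
    return op_map_epoch < split_epoch


excerpt = (
    "osd_op(client.4310.0:27 3.7 3.8e358e77 (undecoded) "
    "ondisk+retry+write+known_if_redirected e79) v8"
)
SPLIT_EPOCH = 80  # "the pg was split in e80"

epoch = op_epoch(excerpt)
stale = sent_before_split(epoch, SPLIT_EPOCH)
print(f"op epoch e{epoch}, split at e{SPLIT_EPOCH}, sent before split: {stale}")
```

An op addressed with a pre-split map can legitimately arrive at the parent PG, which is why asserting on it under osd_debug_misdirected_ops produces a false failure here.
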

Actions #2

Updated by Sage Weil over 6 years ago

  • Status changed from In Progress to Fix Under Review
Actions #3

Updated by Kefu Chai over 6 years ago

  • Status changed from Fix Under Review to Resolved
  • Assignee set to Sage Weil