Bug #22330

closed

ec: src/common/interval_map.h: 161: FAILED assert(len > 0)

Added by Patrick Donnelly over 6 years ago. Updated over 5 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: EC Pools
Target version:
% Done: 0%
Source: Development
Tags:
Backport: luminous,mimic
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: multimds
Component(RADOS): EC plugins
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2017-12-05T11:23:19.900 INFO:tasks.ceph.osd.3.smithi086.stderr:/build/ceph-13.0.0-3732-g70d0667/src/common/interval_map.h: In function 'void interval_map<K, V, S>::insert(K, K, V&&) [with K = long unsigned int; V = ceph::buffer::list; S = bl_split_merge]' thread 7f8bd8140700 time 2017-12-05 11:23:19.922589
2017-12-05T11:23:19.900 INFO:tasks.ceph.osd.3.smithi086.stderr:/build/ceph-13.0.0-3732-g70d0667/src/common/interval_map.h: 161: FAILED assert(len > 0)
2017-12-05T11:23:19.903 INFO:tasks.ceph.osd.3.smithi086.stderr: ceph version 13.0.0-3732-g70d0667 (70d06678b3571264de00c11f4a08eae8375ff04c) mimic (dev)
2017-12-05T11:23:19.904 INFO:tasks.ceph.osd.3.smithi086.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x5574c549bbb2]
2017-12-05T11:23:19.904 INFO:tasks.ceph.osd.3.smithi086.stderr: 2: (CallClientContexts::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)+0x10b4) [0x5574c52207e4]
2017-12-05T11:23:19.904 INFO:tasks.ceph.osd.3.smithi086.stderr: 3: (ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x82) [0x5574c51f1262]
2017-12-05T11:23:19.904 INFO:tasks.ceph.osd.3.smithi086.stderr: 4: (ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&, RecoveryMessages*, ZTracer::Trace const&)+0xf39) [0x5574c51fb269]
2017-12-05T11:23:19.904 INFO:tasks.ceph.osd.3.smithi086.stderr: 5: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x178) [0x5574c520a858]
2017-12-05T11:23:19.904 INFO:tasks.ceph.osd.3.smithi086.stderr: 6: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x5574c50f2bb0]
2017-12-05T11:23:19.904 INFO:tasks.ceph.osd.3.smithi086.stderr: 7: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x531) [0x5574c50aad11]
2017-12-05T11:23:19.904 INFO:tasks.ceph.osd.3.smithi086.stderr: 8: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x367) [0x5574c4efb017]
2017-12-05T11:23:19.904 INFO:tasks.ceph.osd.3.smithi086.stderr: 9: (PGOpItem::run(OSD*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x5a) [0x5574c5172a6a]
2017-12-05T11:23:19.905 INFO:tasks.ceph.osd.3.smithi086.stderr: 10: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xe7b) [0x5574c4f01aab]
2017-12-05T11:23:19.905 INFO:tasks.ceph.osd.3.smithi086.stderr: 11: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x7bb) [0x5574c549f43b]
2017-12-05T11:23:19.905 INFO:tasks.ceph.osd.3.smithi086.stderr: 12: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5574c54a1910]
2017-12-05T11:23:19.905 INFO:tasks.ceph.osd.3.smithi086.stderr: 13: (()+0x76ba) [0x7f8bf4a886ba]
2017-12-05T11:23:19.905 INFO:tasks.ceph.osd.3.smithi086.stderr: 14: (clone()+0x6d) [0x7f8bf3aff3dd]

From: /ceph/teuthology-archive/pdonnell-2017-12-05_06:50:02-multimds-wip-pdonnell-testing-20171205.044504-testing-basic-smithi/1931903/teuthology.log
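For readers unfamiliar with the assertion, the following is a minimal sketch of the precondition that fails in the trace above: an insert into an offset-keyed interval map is only valid for a non-empty range. Everything here is an assumption made for illustration. `ToyIntervalMap` and `insert_if_nonempty` are hypothetical names, not Ceph's `interval_map` or the fix that was eventually merged, and the sketch says nothing about why `CallClientContexts::finish` ended up with a zero-length extent.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for an offset -> data interval map.
// It is NOT Ceph's interval_map; it only mirrors the "len > 0" precondition.
class ToyIntervalMap {
public:
  // Store `data` as covering [off, off + len). Empty intervals are rejected,
  // which is the same kind of precondition the failed assertion enforces.
  void insert(uint64_t off, uint64_t len, std::string data) {
    assert(len > 0);
    assert(data.size() == len);
    extents_[off] = std::make_pair(len, std::move(data));
  }

  // Number of stored extents.
  std::size_t size() const { return extents_.size(); }

private:
  std::map<uint64_t, std::pair<uint64_t, std::string>> extents_;
};

// Hypothetical caller-side guard: skip zero-length results instead of
// passing them to insert(). Returns true if the extent was stored.
inline bool insert_if_nonempty(ToyIntervalMap& m, uint64_t off,
                               std::string data) {
  if (data.empty()) {
    return false;
  }
  const uint64_t len = data.size();  // read the size before moving `data`
  m.insert(off, len, std::move(data));
  return true;
}
```

Calling `m.insert(4, 0, "")` directly on this toy class would abort in a debug build, in the same way the OSD aborted here. The guard only shows one way a caller could avoid reaching the assertion.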


Related issues 4 (0 open, 4 closed)

Related to RADOS - Bug #21931: osd: src/osd/ECBackend.cc: 2164: FAILED assert((offset + length) <= (range.first.get_off() + range.first.get_len())) (Resolved, Neha Ojha, 10/25/2017)
Has duplicate RADOS - Bug #36271: src/common/interval_map.h: 161: FAILED ceph_assert(len > 0) (Duplicate)
Copied to RADOS - Backport #36437: mimic: ec: src/common/interval_map.h: 161: FAILED assert(len > 0) (Resolved, Nathan Cutler)
Copied to RADOS - Backport #36438: luminous: ec: src/common/interval_map.h: 161: FAILED assert(len > 0) (Resolved, Nathan Cutler)

Actions #1

Updated by Patrick Donnelly over 6 years ago

Another: /ceph/teuthology-archive/pdonnell-2017-12-05_06:54:06-kcephfs-wip-pdonnell-testing-20171205.044504-testing-basic-smithi/1932525/teuthology.log

Actions #2

Updated by Patrick Donnelly almost 6 years ago

  • Priority changed from Normal to Urgent
  • Target version set to v13.0.0

http://pulpito.ceph.com/pdonnell-2018-05-01_20:58:18-multimds-wip-pdonnell-testing-20180501.191840-testing-basic-smithi/2464211

/ceph/teuthology-archive/pdonnell-2018-05-01_20:58:18-multimds-wip-pdonnell-testing-20180501.191840-testing-basic-smithi/2464211/teuthology.log

and

http://pulpito.ceph.com/pdonnell-2018-05-01_20:58:18-multimds-wip-pdonnell-testing-20180501.191840-testing-basic-smithi/2464361

/ceph/teuthology-archive/pdonnell-2018-05-01_20:58:18-multimds-wip-pdonnell-testing-20180501.191840-testing-basic-smithi/2464361/teuthology.log

Actions #3

Updated by Sage Weil almost 6 years ago

  • Status changed from New to 12
  • Assignee set to Sage Weil
Actions #4

Updated by Sage Weil almost 6 years ago

  • Status changed from 12 to Need More Info

need to capture some logs...

Actions #5

Updated by Sage Weil almost 6 years ago

  • Assignee deleted (Sage Weil)
Actions #6

Updated by Patrick Donnelly over 5 years ago

/ceph/teuthology-archive/pdonnell-2018-08-02_13:06:29-multimds-wip-pdonnell-testing-20180802.044402-testing-basic-smithi/2852847/remote/smithi141/coredump/1533233406.10526.core

Coredump ^

Strangely, logs aren't getting collected.

Actions #7

Updated by Neha Ojha over 5 years ago

Running the multimds:basic suite N=20 times with --filter 'clusters/9-mds.yaml conf/{client.yaml mds.yaml mon.yaml osd.yaml} inline/yes.yaml mount/kclient/{mount.yaml overrides/{distro/random/{k-testing.yaml supported$/{rhel_latest.yaml}} ms-die-on-skipped.yaml}} objectstore-ec/bluestore-ec-root.yaml overrides/{basic/{frag_enable.yaml whitelist_health.yaml whitelist_wrongly_marked_down.yaml} fuse-default-perm-no.yaml} q_check_counter/check_counter.yaml tasks/cfuse_workunit_suites_fsx.yaml' reproduces this failure reliably.

http://pulpito.ceph.com/nojha-2018-09-17_15:49:25-multimds:basic-master-distro-basic-smithi/

Actions #8

Updated by Neha Ojha over 5 years ago

  • Status changed from Need More Info to 12
Actions #9

Updated by Patrick Donnelly over 5 years ago

  • Has duplicate Bug #36271: src/common/interval_map.h: 161: FAILED ceph_assert(len > 0) added
Actions #10

Updated by Patrick Donnelly over 5 years ago

Latest instance with logs/cores: /ceph/teuthology-archive/pdonnell-2018-10-01_03:19:12-multimds-wip-pdonnell-testing-20181001.011252-distro-basic-smithi/3090388/teuthology.log

Actions #11

Updated by Neha Ojha over 5 years ago

  • Assignee set to Neha Ojha
Actions #12

Updated by Neha Ojha over 5 years ago

  • Related to Bug #21931: osd: src/osd/ECBackend.cc: 2164: FAILED assert((offset + length) <= (range.first.get_off() + range.first.get_len())) added
Actions #13

Updated by Neha Ojha over 5 years ago

  • Status changed from 12 to Fix Under Review
Actions #14

Updated by Neha Ojha over 5 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to luminous,mimic
Actions #15

Updated by Neha Ojha over 5 years ago

Note that there is a common PR to be backported for this issue and https://tracker.ceph.com/issues/21931

Actions #16

Updated by Nathan Cutler over 5 years ago

  • Copied to Backport #36437: mimic: ec: src/common/interval_map.h: 161: FAILED assert(len > 0) added
Actions #17

Updated by Nathan Cutler over 5 years ago

  • Copied to Backport #36438: luminous: ec: src/common/interval_map.h: 161: FAILED assert(len > 0) added
Actions #18

Updated by Nathan Cutler over 5 years ago

  • Status changed from Pending Backport to Resolved