Bug #25174

osd: assert failure with FAILED assert(repop_queue.front() == repop) In function 'void PrimaryLogPG::eval_repop(PrimaryLogPG::RepGather*)'

Added by Shylesh Kumar over 2 years ago. Updated 10 months ago.

Status: Can't reproduce
Priority: Normal
Assignee:
Category: Peering
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: rados
Component(RADOS): OSD
Pull request ID:
Crash signature:

Description

branch: luminous
description: rados:downstream:singleton/{all/ec-lost-unfound.yaml msgr-failures/many.yaml
msgr/simple.yaml objectstore/bluestore-bitmap.yaml rados.yaml}

2018-07-27T09:30:39.665 INFO:tasks.ec_lost_unfound.ceph_manager:clean!
2018-07-27T09:32:27.533 INFO:tasks.ceph.osd.2.argo017.stderr:/builddir/build/BUILD/ceph-12.2.4/src/osd/PrimaryLogPG.cc: In function 'void PrimaryLogPG::eval_repop(PrimaryLogPG::RepGather*)' thread 7f296b703700 time 2018-07-27 13:32:27.401195
2018-07-27T09:32:27.533 INFO:tasks.ceph.osd.2.argo017.stderr:/builddir/build/BUILD/ceph-12.2.4/src/osd/PrimaryLogPG.cc: 9301: FAILED assert(repop_queue.front() == repop)
2018-07-27T09:32:27.537 INFO:tasks.ceph.osd.2.argo017.stderr: ceph version 12.2.4-40.el7cp (9e34484c328181b5aeee82974b7ffebcd5f3509b) luminous (stable)
2018-07-27T09:32:27.538 INFO:tasks.ceph.osd.2.argo017.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x55e5294a72d0]
2018-07-27T09:32:27.538 INFO:tasks.ceph.osd.2.argo017.stderr: 2: (PrimaryLogPG::eval_repop(PrimaryLogPG::RepGather*)+0x524) [0x55e52908ba64]
2018-07-27T09:32:27.538 INFO:tasks.ceph.osd.2.argo017.stderr: 3: (PrimaryLogPG::repop_all_applied(PrimaryLogPG::RepGather*)+0x74) [0x55e52908c334]
2018-07-27T09:32:27.538 INFO:tasks.ceph.osd.2.argo017.stderr: 4: (Context::complete(int)+0x9) [0x55e528f507e9]
2018-07-27T09:32:27.538 INFO:tasks.ceph.osd.2.argo017.stderr: 5: (ECBackend::handle_sub_write_reply(pg_shard_t, ECSubWriteReply const&, ZTracer::Trace const&)+0x193) [0x55e52921a8c3]
2018-07-27T09:32:27.539 INFO:tasks.ceph.osd.2.argo017.stderr: 6: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x2df) [0x55e52921c9ef]
2018-07-27T09:32:27.539 INFO:tasks.ceph.osd.2.argo017.stderr: 7: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x55e5291215b0]
2018-07-27T09:32:27.539 INFO:tasks.ceph.osd.2.argo017.stderr: 8: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x59c) [0x55e52908d18c]
2018-07-27T09:32:27.539 INFO:tasks.ceph.osd.2.argo017.stderr: 9: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f9) [0x55e528f12689]
2018-07-27T09:32:27.539 INFO:tasks.ceph.osd.2.argo017.stderr: 10: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x55e52918f877]
2018-07-27T09:32:27.539 INFO:tasks.ceph.osd.2.argo017.stderr: 11: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce) [0x55e528f410ae]
2018-07-27T09:32:27.539 INFO:tasks.ceph.osd.2.argo017.stderr: 12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839) [0x55e5294acde9]
2018-07-27T09:32:27.540 INFO:tasks.ceph.osd.2.argo017.stderr: 13: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55e5294aed80]
2018-07-27T09:32:27.540 INFO:tasks.ceph.osd.2.argo017.stderr: 14: (()+0x7dd5) [0x7f298a5cadd5]
2018-07-27T09:32:27.540 INFO:tasks.ceph.osd.2.argo017.stderr: 15: (clone()+0x6d) [0x7f29896bbb3d]
2018-07-27T09:32:27.540 INFO:tasks.ceph.osd.2.argo017.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2018-07-27T09:32:27.540 INFO:tasks.ceph.osd.2.argo017.stderr:2018-07-27 13:32:27.405832 7f296b703700 -1 /builddir/build/BUILD/ceph-12.2.4/src/osd/PrimaryLogPG.cc: In function 'void PrimaryLogPG::eval_repop(PrimaryLogPG::RepGather*)' thread 7f296b703700 time 2018-07-27 13:32:27.401195
2018-07-27T09:32:27.540 INFO:tasks.ceph.osd.2.argo017.stderr:/builddir/build/BUILD/ceph-12.2.4/src/osd/PrimaryLogPG.cc: 9301: FAILED assert(repop_queue.front() == repop)
2018-07-27T09:32:27.541 INFO:tasks.ceph.osd.2.argo017.stderr:
2018-07-27T09:32:27.541 INFO:tasks.ceph.osd.2.argo017.stderr: ceph version 12.2.4-40.el7cp (9e34484c328181b5aeee82974b7ffebcd5f3509b) luminous (stable)
2018-07-27T09:32:27.541 INFO:tasks.ceph.osd.2.argo017.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x55e5294a72d0]
2018-07-27T09:32:27.541 INFO:tasks.ceph.osd.2.argo017.stderr: 2: (PrimaryLogPG::eval_repop(PrimaryLogPG::RepGather*)+0x524) [0x55e52908ba64]
2018-07-27T09:32:27.541 INFO:tasks.ceph.osd.2.argo017.stderr: 3: (PrimaryLogPG::repop_all_applied(PrimaryLogPG::RepGather*)+0x74) [0x55e52908c334]
2018-07-27T09:32:27.541 INFO:tasks.ceph.osd.2.argo017.stderr: 4: (Context::complete(int)+0x9) [0x55e528f507e9]
2018-07-27T09:32:27.542 INFO:tasks.ceph.osd.2.argo017.stderr: 5: (ECBackend::handle_sub_write_reply(pg_shard_t, ECSubWriteReply const&, ZTracer::Trace const&)+0x193) [0x55e52921a8c3]
2018-07-27T09:32:27.542 INFO:tasks.ceph.osd.2.argo017.stderr: 6: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x2df) [0x55e52921c9ef]
2018-07-27T09:32:27.542 INFO:tasks.ceph.osd.2.argo017.stderr: 7: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x55e5291215b0]
2018-07-27T09:32:27.542 INFO:tasks.ceph.osd.2.argo017.stderr: 8: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x59c) [0x55e52908d18c]
2018-07-27T09:32:27.542 INFO:tasks.ceph.osd.2.argo017.stderr: 9: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f9) [0x55e528f12689]
2018-07-27T09:32:27.543 INFO:tasks.ceph.osd.2.argo017.stderr: 10: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x55e52918f877]
2018-07-27T09:32:27.543 INFO:tasks.ceph.osd.2.argo017.stderr: 11: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce) [0x55e528f410ae]
2018-07-27T09:32:27.543 INFO:tasks.ceph.osd.2.argo017.stderr: 12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839) [0x55e5294acde9]
2018-07-27T09:32:27.543 INFO:tasks.ceph.osd.2.argo017.stderr: 13: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55e5294aed80]
2018-07-27T09:32:27.543 INFO:tasks.ceph.osd.2.argo017.stderr: 14: (()+0x7dd5) [0x7f298a5cadd5]
2018-07-27T09:32:27.543 INFO:tasks.ceph.osd.2.argo017.stderr: 15: (clone()+0x6d) [0x7f29896bbb3d]
2018-07-27T09:32:27.543 INFO:tasks.ceph.osd.2.argo017.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2018-07-27T09:32:27.544 INFO:tasks.ceph.osd.2.argo017.stderr:
2018-07-27T09:32:27.563 INFO:tasks.ceph.osd.2.argo017.stderr: 0> 2018-07-27 13:32:27.405832 7f296b703700 -1 /builddir/build/BUILD/ceph-12.2.4/src/osd/PrimaryLogPG.cc: In function 'void PrimaryLogPG::eval_repop(PrimaryLogPG::RepGather*)' thread 7f296b703700 time 2018-07-27 13:32:27.401195
2018-07-27T09:32:27.563 INFO:tasks.ceph.osd.2.argo017.stderr:/builddir/build/BUILD/ceph-12.2.4/src/osd/PrimaryLogPG.cc: 9301: FAILED assert(repop_queue.front() == repop)
2018-07-27T09:32:27.563 INFO:tasks.ceph.osd.2.argo017.stderr:
2018-07-27T09:32:27.563 INFO:tasks.ceph.osd.2.argo017.stderr: ceph version 12.2.4-40.el7cp (9e34484c328181b5aeee82974b7ffebcd5f3509b) luminous (stable)
2018-07-27T09:32:27.563 INFO:tasks.ceph.osd.2.argo017.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x55e5294a72d0]
2018-07-27T09:32:27.564 INFO:tasks.ceph.osd.2.argo017.stderr: 2: (PrimaryLogPG::eval_repop(PrimaryLogPG::RepGather*)+0x524) [0x55e52908ba64]
2018-07-27T09:32:27.564 INFO:tasks.ceph.osd.2.argo017.stderr: 3: (PrimaryLogPG::repop_all_applied(PrimaryLogPG::RepGather*)+0x74) [0x55e52908c334]
2018-07-27T09:32:27.564 INFO:tasks.ceph.osd.2.argo017.stderr: 4: (Context::complete(int)+0x9) [0x55e528f507e9]
2018-07-27T09:32:27.564 INFO:tasks.ceph.osd.2.argo017.stderr: 5: (ECBackend::handle_sub_write_reply(pg_shard_t, ECSubWriteReply const&, ZTracer::Trace const&)+0x193) [0x55e52921a8c3]
2018-07-27T09:32:27.564 INFO:tasks.ceph.osd.2.argo017.stderr: 6: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x2df) [0x55e52921c9ef]
2018-07-27T09:32:27.564 INFO:tasks.ceph.osd.2.argo017.stderr: 7: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x55e5291215b0]
2018-07-27T09:32:27.564 INFO:tasks.ceph.osd.2.argo017.stderr: 8: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x59c) [0x55e52908d18c]
2018-07-27T09:32:27.565 INFO:tasks.ceph.osd.2.argo017.stderr: 9: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f9) [0x55e528f12689]
2018-07-27T09:32:27.565 INFO:tasks.ceph.osd.2.argo017.stderr: 10: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x55e52918f877]
2018-07-27T09:32:27.565 INFO:tasks.ceph.osd.2.argo017.stderr: 11: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce) [0x55e528f410ae]
2018-07-27T09:32:27.565 INFO:tasks.ceph.osd.2.argo017.stderr: 12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839) [0x55e5294acde9]
2018-07-27T09:32:27.565 INFO:tasks.ceph.osd.2.argo017.stderr: 13: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55e5294aed80]
2018-07-27T09:32:27.565 INFO:tasks.ceph.osd.2.argo017.stderr: 14: (()+0x7dd5) [0x7f298a5cadd5]
2018-07-27T09:32:27.566 INFO:tasks.ceph.osd.2.argo017.stderr: 15: (clone()+0x6d) [0x7f29896bbb3d]
2018-07-27T09:32:27.566 INFO:tasks.ceph.osd.2.argo017.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2018-07-27T09:32:27.566 INFO:tasks.ceph.osd.2.argo017.stderr:
2018-07-27T09:32:27.566 INFO:tasks.ceph.osd.2.argo017.stderr:*** Caught signal (Aborted) **
2018-07-27T09:32:27.567 INFO:tasks.ceph.osd.2.argo017.stderr: in thread 7f296b703700 thread_name:tp_osd_tp
2018-07-27T09:32:27.568 INFO:tasks.ceph.osd.2.argo017.stderr: ceph version 12.2.4-40.el7cp (9e34484c328181b5aeee82974b7ffebcd5f3509b) luminous (stable)
2018-07-27T09:32:27.568 INFO:tasks.ceph.osd.2.argo017.stderr: 1: (()+0xa3c941) [0x55e529468941]
2018-07-27T09:32:27.568 INFO:tasks.ceph.osd.2.argo017.stderr: 2: (()+0xf680) [0x7f298a5d2680]
2018-07-27T09:32:27.569 INFO:tasks.ceph.osd.2.argo017.stderr: 3: (gsignal()+0x37) [0x7f29895f3207]
2018-07-27T09:32:27.569 INFO:tasks.ceph.osd.2.argo017.stderr: 4: (abort()+0x148) [0x7f29895f48f8]
2018-07-27T09:32:27.569 INFO:tasks.ceph.osd.2.argo017.stderr: 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x55e5294a7444]
2018-07-27T09:32:27.569 INFO:tasks.ceph.osd.2.argo017.stderr: 6: (PrimaryLogPG::eval_repop(PrimaryLogPG::RepGather*)+0x524) [0x55e52908ba64]
2018-07-27T09:32:27.569 INFO:tasks.ceph.osd.2.argo017.stderr: 7: (PrimaryLogPG::repop_all_applied(PrimaryLogPG::RepGather*)+0x74) [0x55e52908c334]
2018-07-27T09:32:27.569 INFO:tasks.ceph.osd.2.argo017.stderr: 8: (Context::complete(int)+0x9) [0x55e528f507e9]
2018-07-27T09:32:27.570 INFO:tasks.ceph.osd.2.argo017.stderr: 9: (ECBackend::handle_sub_write_reply(pg_shard_t, ECSubWriteReply const&, ZTracer::Trace const&)+0x193) [0x55e52921a8c3]
2018-07-27T09:32:27.570 INFO:tasks.ceph.osd.2.argo017.stderr: 10: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x2df) [0x55e52921c9ef]
2018-07-27T09:32:27.570 INFO:tasks.ceph.osd.2.argo017.stderr: 11: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x55e5291215b0]
2018-07-27T09:32:27.570 INFO:tasks.ceph.osd.2.argo017.stderr: 12: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x59c) [0x55e52908d18c]
2018-07-27T09:32:27.570 INFO:tasks.ceph.osd.2.argo017.stderr: 13: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f9) [0x55e528f12689]
2018-07-27T09:32:27.570 INFO:tasks.ceph.osd.2.argo017.stderr: 14: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x55e52918f877]
2018-07-27T09:32:27.570 INFO:tasks.ceph.osd.2.argo017.stderr: 15: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce) [0x55e528f410ae]
2018-07-27T09:32:27.571 INFO:tasks.ceph.osd.2.argo017.stderr: 16: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839) [0x55e5294acde9]
2018-07-27T09:32:27.571 INFO:tasks.ceph.osd.2.argo017.stderr: 17: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55e5294aed80]
2018-07-27T09:32:27.571 INFO:tasks.ceph.osd.2.argo017.stderr: 18: (()+0x7dd5) [0x7f298a5cadd5]
2018-07-27T09:32:27.571 INFO:tasks.ceph.osd.2.argo017.stderr: 19: (clone()+0x6d) [0x7f29896bbb3d]
2018-07-27T09:32:27.571 INFO:tasks.ceph.osd.2.argo017.stderr:2018-07-27 13:32:27.436419 7f296b703700 -1 *** Caught signal (Aborted) **
2018-07-27T09:32:27.571 INFO:tasks.ceph.osd.2.argo017.stderr: in thread 7f296b703700 thread_name:tp_osd_tp
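
For readers unfamiliar with the failing check: read literally, assert(repop_queue.front() == repop) requires that the operation being retired is the oldest one still queued, i.e. that operations complete in the order they were issued. The toy model below is only a sketch of that invariant. The class RepopQueue, the helper completes_in_order, and the integer IDs are hypothetical names chosen for illustration; this is not the Ceph implementation and it does not explain why the ordering was violated in this run.

```python
from collections import deque


class RepopQueue:
    """Toy model of an in-order completion queue (hypothetical, not Ceph code)."""

    def __init__(self):
        self._queue = deque()

    def issue(self, repop_id):
        # Operations are queued in the order they are issued.
        self._queue.append(repop_id)

    def complete(self, repop_id):
        # Same shape as assert(repop_queue.front() == repop):
        # only the oldest outstanding operation may be retired.
        if not self._queue or self._queue[0] != repop_id:
            front = self._queue[0] if self._queue else None
            raise AssertionError(
                f"front of queue is {front}, but {repop_id} completed"
            )
        self._queue.popleft()


def completes_in_order(issued, completed):
    """Return True if `completed` retires operations in issue order."""
    queue = RepopQueue()
    for repop_id in issued:
        queue.issue(repop_id)
    try:
        for repop_id in completed:
            queue.complete(repop_id)
    except AssertionError:
        return False
    return True
```

In this model, completing operation 2 while operation 1 is still at the front is the analogue of the failure reported above.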


Related issues

Related to RADOS - Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop) Resolved 04/13/2017

History

#1 Updated by Shylesh Kumar over 2 years ago

  • Related to Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop) added

#3 Updated by Neha Ojha over 2 years ago

Do we have logs for this failure somewhere?

#4 Updated by Josh Durgin over 2 years ago

  • Assignee changed from Josh Durgin to Neha Ojha

#6 Updated by Patrick Donnelly over 1 year ago

  • Subject changed from OSD assert failure with FAILED assert(repop_queue.front() == repop) In function 'void PrimaryLogPG::eval_repop(PrimaryLogPG::RepGather*)' to osd: assert failure with FAILED assert(repop_queue.front() == repop) In function 'void PrimaryLogPG::eval_repop(PrimaryLogPG::RepGather*)'
  • Start date deleted (07/30/2018)

#7 Updated by Neha Ojha about 1 year ago

  • Status changed from New to Can't reproduce

#8 Updated by Yan Jun 10 months ago
