Bug #24422

Ceph OSDs crashing in BlueStore::queue_transactions() using EC

Added by 鹏 张 almost 6 years ago. Updated almost 6 years ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: rados
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ceph version: 12.2.5
The data pool uses an EC profile of 3+1 (k=3, m=1).
When one OSD is restarted, it crashes, and then more and more OSDs crash and restart:

-1 bluestore(/var/lib/ceph/osd/ceph-12) _txc_add_transaction error (2) No such file or directory not handled on operation 30 (op 0, counting from 0)


Related issues

Duplicates: RADOS - Bug #23145: OSD crashes during recovery of EC pg (Duplicate, 02/27/2018)

History

#1 Updated by 鹏 张 almost 6 years ago

1. -45> 2018-06-05 17:47:56.886142 7f8972974700 -1 bluestore(/var/lib/ceph/osd/ceph-12) _txc_add_transaction error (2) No such file or directory not handled on operation 30 (op 0, counting from 0)

2018-06-05T17:47:56.922047+08:00 node54 ceph-osd: /work/build/rpmbuild/BUILD/infinity-3.2.5/src/os/bluestore/BlueStore.cc: 9415: FAILED assert(0 == "unexpected error")

2. Operation 30 corresponds to OP_CLONERANGE2 = 30, // cid, oid, newoid, srcoff, len, dstoff
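
To illustrate the failure mode, here is a minimal, self-contained C++ sketch (not the Ceph source; txc_add_transaction and do_clone_range below are hypothetical stand-ins for BlueStore::_txc_add_transaction and its clone-range helper). It shows the pattern consistent with the log: the dispatcher gets -ENOENT back for op 30, that errno is not among the tolerated return codes, so the catch-all handler prints the "not handled on operation" line and asserts, i.e. the BlueStore.cc:9415 abort in the backtrace.

#include <cassert>
#include <cerrno>
#include <cstdio>

// Op value as quoted above from the ObjectStore::Transaction op enum.
enum { OP_CLONERANGE2 = 30 };

// Hypothetical stand-in for the clone-range helper: it fails with
// -ENOENT because the source object the EC recovery expects is gone.
static int do_clone_range() { return -ENOENT; }

// Sketch of the dispatch-and-assert pattern.
static void txc_add_transaction(int op, int op_index) {
  int r = 0;
  switch (op) {
  case OP_CLONERANGE2:
    r = do_clone_range();
    break;
  default:
    break;
  }
  if (r < 0) {
    // ENOENT is not an anticipated outcome for this op, so the store
    // treats it as fatal instead of skipping the transaction.
    fprintf(stderr,
            "_txc_add_transaction error (%d) No such file or directory "
            "not handled on operation %d (op %d, counting from 0)\n",
            -r, op, op_index);
    assert(0 == "unexpected error");  // aborts the OSD
  }
}

int main() {
  txc_add_transaction(OP_CLONERANGE2, 0);
  return 0;
}

Running this prints the same "error (2) ... on operation 30" line and aborts, which matches the observed behavior of every OSD that replays the same recovery transaction crashing in turn.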

#2 Updated by 鹏 张 almost 6 years ago

鹏 张 wrote:

ceph version: 12.2.5
The data pool uses an EC profile of 2+1.
When one OSD is restarted, it crashes, and then more and more OSDs crash and restart:

-1 bluestore(/var/lib/ceph/osd/ceph-12) _txc_add_transaction error (2) No such file or directory not handled on operation 30 (op 0, counting from 0)

#3 Updated by 鹏 张 almost 6 years ago

2018-06-05T17:46:28.270166+08:00 node54 ceph-osd: 3: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x3a0) [0x55d9979eb8e0]
2018-06-05T17:46:28.270428+08:00 node54 ceph-osd: 4: (ObjectStore::queue_transaction(ObjectStore::Sequencer*, ObjectStore::Transaction&&, Context*, Context*, Context*, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x171) [0x55d9975df9d1]
2018-06-05T17:46:28.270681+08:00 node54 ceph-osd: 5: (OSD::dispatch_context_transaction(PG::RecoveryCtx&, PG*, ThreadPool::TPHandle*)+0x76) [0x55d997567b36]
2018-06-05T17:46:28.270965+08:00 node54 ceph-osd: 6: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x3bb) [0x55d99759013b]
2018-06-05T17:46:28.271237+08:00 node54 ceph-osd: 7: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x17) [0x55d9975f9fb7]
2018-06-05T17:46:28.271496+08:00 node54 ceph-osd: 8: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa8e) [0x55d997b1d35e]
2018-06-05T17:46:28.271783+08:00 node54 ceph-osd: 9: (ThreadPool::WorkThread::entry()+0x10) [0x55d997b1e240]
2018-06-05T17:46:28.272069+08:00 node54 ceph-osd: 10: (()+0x7e25) [0x7f7e61d7be25]
2018-06-05T17:46:28.272330+08:00 node54 ceph-osd: 11: (clone()+0x6d) [0x7f7e60e6f34d]
2018-06-05T17:46:28.272600+08:00 node54 ceph-osd: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2018-06-05T17:46:28.272888+08:00 node54 ceph-osd: 2018-06-05 17:46:28.269083 7f7e49877700 -1 /work/build/rpmbuild/BUILD/infinity-3.2.5/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)' thread 7f7e49877700 time 2018-06-05 17:46:28.265554
2018-06-05T17:46:28.273183+08:00 node54 ceph-osd: /work/build/rpmbuild/BUILD/infinity-3.2.5/src/os/bluestore/BlueStore.cc: 9415: FAILED assert(0 == "unexpected error")
2018-06-05T17:46:28.273441+08:00 node54 ceph-osd: ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable)
2018-06-05T17:46:28.273696+08:00 node54 ceph-osd: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x55d997b16770]
2018-06-05T17:46:28.273985+08:00 node54 ceph-osd: 2: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x1487) [0x55d9979ea897]
2018-06-05T17:46:28.274237+08:00 node54 ceph-osd: 3: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x3a0) [0x55d9979eb8e0]
2018-06-05T17:46:28.274395+08:00 node54 ceph-osd: 4: (ObjectStore::queue_transaction(ObjectStore::Sequencer*, ObjectStore::Transaction&&, Context*, Context*, Context*, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x171) [0x55d9975df9d1]
2018-06-05T17:46:28.274551+08:00 node54 ceph-osd: 5: (OSD::dispatch_context_transaction(PG::RecoveryCtx&, PG*, ThreadPool::TPHandle*)+0x76) [0x55d997567b36]
2018-06-05T17:46:28.274705+08:00 node54 ceph-osd: 6: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x3bb) [0x55d99759013b]
2018-06-05T17:46:28.274860+08:00 node54 ceph-osd: 7: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x17) [0x55d9975f9fb7]
2018-06-05T17:46:28.275033+08:00 node54 ceph-osd: 8: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa8e) [0x55d997b1d35e]
2018-06-05T17:46:28.275186+08:00 node54 ceph-osd: 9: (ThreadPool::WorkThread::entry()+0x10) [0x55d997b1e240]
2018-06-05T17:46:28.275335+08:00 node54 ceph-osd: 10: (()+0x7e25) [0x7f7e61d7be25]
2018-06-05T17:46:28.275489+08:00 node54 ceph-osd: 11: (clone()+0x6d) [0x7f7e60e6f34d]
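
As the NOTE line above indicates, the raw frame addresses can be resolved by disassembling the matching ceph-osd binary. A typical invocation (the binary path is an assumption; the corresponding debuginfo package should be installed) would be:

objdump -rdS /usr/bin/ceph-osd > ceph-osd.dis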

#4 Updated by 鹏 张 almost 6 years ago

This looks the same as https://tracker.ceph.com/issues/21475. I have already set bluestore_deferred_throttle_bytes = 0 and bluestore_throttle_bytes = 0, but the problem still occurs.
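
For reference, a sketch of how those overrides would sit in ceph.conf; note that the deferred option is spelled bluestore_throttle_deferred_bytes in upstream Luminous, so the exact name is worth verifying with "ceph daemon osd.<id> config show":

[osd]
bluestore_throttle_bytes = 0
bluestore_throttle_deferred_bytes = 0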

#5 Updated by Patrick Donnelly almost 6 years ago

  • Project changed from CephFS to RADOS

#6 Updated by Greg Farnum almost 6 years ago

  • Duplicates Bug #23145: OSD crashes during recovery of EC pg added

#7 Updated by Josh Durgin almost 6 years ago

  • Status changed from New to Duplicate

#8 Updated by Sage Weil almost 6 years ago

Can you generate an osd log with 'debug osd = 20' for the crashing osd that leads up to the crash?
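
For example (a sketch; osd.12 stands in for the crashing daemon), the verbosity can be raised either persistently in ceph.conf or at runtime:

[osd]
debug osd = 20

# or, at runtime:
ceph tell osd.12 injectargs '--debug-osd 20'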

#9 Updated by Sage Weil almost 6 years ago

Sage Weil wrote:

Can you generate an osd log with 'debug osd = 20' for the crashing osd that leads up to the crash?

If so, please report it on the duplicate ticket #23145
