Bug #51338

closed

osd/scrub_machine.cc: FAILED ceph_assert(state_cast<const NotActive*>())

Added by Neha Ojha almost 3 years ago. Updated about 2 years ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport: pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Originally reported in https://tracker.ceph.com/issues/50346#note-6

-1> 2021-06-14T11:17:15.373+0200 7fb9916f5700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.4/rpm/el8/BUILD/ceph-16.2.4/src/osd/scrub_machine.cc: In function 'void Scrub::ScrubMachine::assert_not_active() const' thread 7fb9916f5700 time 2021-06-14T11:17:15.364312+0200
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.4/rpm/el8/BUILD/ceph-16.2.4/src/osd/scrub_machine.cc: 55: FAILED ceph_assert(state_cast<const NotActive*>())
ceph version 16.2.4 (3cbe25cde3cfa028984618ad32de9edc4c1eaed0) pacific (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x55f12c369064]
 2: /usr/bin/ceph-osd(+0x56927e) [0x55f12c36927e]
 3: /usr/bin/ceph-osd(+0x9df81f) [0x55f12c7df81f]
 4: (PgScrubber::replica_scrub_op(boost::intrusive_ptr<OpRequest>)+0x4bf) [0x55f12c7cfc0f]
 5: (PG::replica_scrub(boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x62) [0x55f12c51eb22]
 6: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x7bb) [0x55f12c5e3e0b]
 7: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x309) [0x55f12c46d649]
 8: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x68) [0x55f12c6ca808]
 9: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xa58) [0x55f12c48d618]
 10: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5c4) [0x55f12caf5794]
 11: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x55f12caf8434]
 12: /lib64/libpthread.so.0(+0x814a) [0x7fb9b4f0514a]
 13: clone()

Files

ceph-osd.1442.log.gz (254 KB), Andrej Filipcic, 07/03/2021 06:16 AM

Related issues 2 (1 open, 1 closed)

Related to bluestore - Bug #51133: OSDs failing to start: rocksdb: submit_common error: Corruption: block checksum mismatch (New)

Is duplicate of RADOS - Bug #51942: src/osd/scrub_machine.cc: FAILED ceph_assert(state_cast<const NotActive*>()) (Resolved, Ronen Friedman)

Actions #1

Updated by Andrej Filipcic almost 3 years ago

Another OSD crash after the scrub assert bug, log attached. RocksDB is corrupted.

Actions #2

Updated by Neha Ojha almost 3 years ago

  • Related to Bug #51133: OSDs failing to start: rocksdb: submit_common error: Corruption: block checksum mismatch added
Actions #3

Updated by André Cruz about 2 years ago

I'm also encountering this issue on Pacific (16.2.7):

/src/osd/scrub_machine.cc: In function 'void Scrub::ScrubMachine::assert_not_active() const' thread 7fdf61752700 time 2022-02-11T12:24:02.836325+0000
/src/osd/scrub_machine.cc: 55: FAILED ceph_assert(state_cast<const NotActive*>())

ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x56543596ab7e]
2: /usr/bin/ceph-osd(+0x56ad98) [0x56543596ad98]
3: /usr/bin/ceph-osd(+0x9e8fef) [0x565435de8fef]
4: (PgScrubber::replica_scrub_op(boost::intrusive_ptr<OpRequest>)+0x4bf) [0x565435dd954f]
5: (PG::replica_scrub(boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x62) [0x565435b24342]

Any pointers?

Actions #4

Updated by Neha Ojha about 2 years ago

André Cruz wrote:

I'm also encountering this issue on Pacific (16.2.7):

[...]

Any pointers?

I think we are missing a backport in pacific, https://tracker.ceph.com/issues/53338.

Actions #5

Updated by Neha Ojha about 2 years ago

  • Status changed from New to Duplicate
Actions #6

Updated by Neha Ojha about 2 years ago

  • Is duplicate of Bug #51942: src/osd/scrub_machine.cc: FAILED ceph_assert(state_cast<const NotActive*>()) added