Bug #58178

open

FAILED ceph_assert(last_e.version.version < e.version.version)

Added by Kevin Fox over 1 year ago. Updated over 1 year ago.

Status: Need More Info
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

debug -4> 2022-12-05T19:14:03.556+0000 7fe51028a200 5 osd.57 pg_epoch: 261349 pg[1.573( v 261349'617978754 (261347'617973346,261349'617978754] local-lis/les=260780/260781 n=16148 ec=9347/31 lis/c=260780/260780 les/c/f=260781/260781/17050 sis=260780) [22,57,20] r=1 lpr=0 crt=261349'617978754 lcod 0'0 mlcod 0'0 unknown mbc={}] exit Initial 0.050230 0 0.000000
debug -3> 2022-12-05T19:14:03.556+0000 7fe51028a200 5 osd.57 pg_epoch: 261349 pg[1.573( v 261349'617978754 (261347'617973346,261349'617978754] local-lis/les=260780/260781 n=16148 ec=9347/31 lis/c=260780/260780 les/c/f=260781/260781/17050 sis=260780) [22,57,20] r=1 lpr=0 crt=261349'617978754 lcod 0'0 mlcod 0'0 unknown mbc={}] enter Reset
debug -2> 2022-12-05T19:14:03.558+0000 7fe51028a200 5 osd.57 pg_epoch: 261349 pg[1.581(unlocked)] enter Initial
debug -1> 2022-12-05T19:14:03.579+0000 7fe51028a200 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/osd/PGLog.h: In function 'static void PGLog::read_log_and_missing(ObjectStore*, ObjectStore::CollectionHandle&, ghobject_t, const pg_info_t&, PGLog::IndexedLog&, missing_type&, std::ostringstream&, bool, bool*, const DoutPrefixProvider*, std::set<std::__cxx11::basic_string<char> >*, bool) [with missing_type = pg_missing_set<true>; ObjectStore::CollectionHandle = boost::intrusive_ptr<ObjectStore::CollectionImpl>; std::ostringstream = std::__cxx11::basic_ostringstream<char>]' thread 7fe51028a200 time 2022-12-05T19:14:03.577771+0000
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.10/rpm/el8/BUILD/ceph-16.2.10/src/osd/PGLog.h: 1484: FAILED ceph_assert(last_e.version.version < e.version.version)

ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x55b21213773c]
2: ceph-osd(+0x57f956) [0x55b212137956]
3: (void PGLog::read_log_and_missing<pg_missing_set<true> >(ObjectStore*, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ghobject_t, pg_info_t const&, PGLog::IndexedLog&, pg_missing_set<true>&, std::__cxx11::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >&, bool, bool*, DoutPrefixProvider const*, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >*, bool)+0x3447) [0x55b21230d517]
4: (PG::read_state(ObjectStore*)+0x13e7) [0x55b2122f8537]
5: (OSD::load_pgs()+0xa47) [0x55b2122440c7]
6: (OSD::init()+0x26f7) [0x55b212274547]
7: main()
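
For readers unfamiliar with the check: judging by the assertion text, the log reader compares the version counter of the previous entry (last_e) with that of the current one (e) and aborts unless it strictly increases. The following is only an illustrative Python sketch of that invariant, not Ceph code. EVersion and first_out_of_order are hypothetical names, and the epoch'version string form is assumed from the values printed in the log above.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class EVersion(NamedTuple):
    """Hypothetical stand-in for a PG log entry version (epoch'version)."""
    epoch: int
    version: int

    @classmethod
    def parse(cls, text: str) -> "EVersion":
        # Accepts the epoch'version form seen in the log, e.g. 261349'617978754.
        epoch, version = text.split("'")
        return cls(int(epoch), int(version))


def first_out_of_order(
    entries: Sequence[EVersion],
) -> Optional[Tuple[EVersion, EVersion]]:
    """Return the first (previous, current) pair whose version counter does
    not strictly increase, or None if every consecutive pair is ordered."""
    for prev, cur in zip(entries, entries[1:]):
        # Mirrors the shape of: last_e.version.version < e.version.version
        if not prev.version < cur.version:
            return prev, cur
    return None


if __name__ == "__main__":
    # The first two values are the log bounds printed above; the third is a
    # made-up duplicate counter added only to trigger the check.
    log = [
        EVersion.parse("261347'617973346"),
        EVersion.parse("261349'617978754"),
        EVersion.parse("261349'617978754"),
    ]
    print(first_out_of_order(log))
```

A duplicated or regressing counter between two consecutive entries is the kind of on-disk state that would make such a check fail; which pair actually triggered it on osd.57 is not known from this report.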
Actions #1

Updated by Kevin Fox over 1 year ago

Noticed an OSD doing this on a cluster over the weekend. It's been crashing consistently since.

Actions #2

Updated by Igor Fedotov over 1 year ago

  • Project changed from bluestore to RADOS
Actions #3

Updated by Nitzan Mordechai over 1 year ago

@Kevin H Fox, can you please share the failing OSD's logs with osd debug 20?
We are supposed to print the previous log entry version and the current one.

Actions #4

Updated by Kevin Fox over 1 year ago

I cannot, sorry. I reported the issue as soon as I saw it, waited a day after it showed up, then reformatted the drive and put it back into production. I couldn't delay any longer.

Actions #5

Updated by Radoslaw Zarzynski over 1 year ago

  • Status changed from New to Need More Info