Bug #54561
Status: closed
5 out of 6 OSDs crashing after update to 17.1.0-0.2.rc1.fc37.x86_64
Description of problem:
After upgrading to 17.1.0-0.2.rc1.fc37.x86_64, 5 out of my 6 OSDs crash on start.
2022-03-14T11:20:44.682+0100 7ff5a50d0180 -1 bluestore::NCB::__restore_allocator::Failed open_for_read with error-code -2
2022-03-14T11:20:44.682+0100 7ff5a50d0180 0 bluestore(/var/lib/ceph/osd/ceph-0) _init_alloc::NCB::restore_allocator() failed! Run Full Recovery from ONodes (might take a while) ...
2022-03-14T11:20:54.767+0100 7ff5a50d0180 -1 /builddir/build/BUILD/ceph-17.1.0/src/os/bluestore/AvlAllocator.cc: In function 'virtual void AvlAllocator::init_add_free(uint64_t, uint64_t)' thread 7ff5a50d0180 time 2022-03-14T11:20:54.766296+0100
/builddir/build/BUILD/ceph-17.1.0/src/os/bluestore/AvlAllocator.cc: 442: FAILED ceph_assert(offset + length <= uint64_t(device_size))
ceph version 17.1.0 (c675060073a05d40ef404d5921c81178a52af6e0) quincy (dev)
(full log attached)
Version-Release number of selected component (if applicable):
17.1.0-0.2.rc1.fc37.x86_64
How reproducible:
Steps to Reproduce:
1. Upgrade working cluster to quincy rc1 release.
Actual results:
5 of the 6 OSDs crash on start with the assertion failure above.
Expected results:
All OSDs start and run normally.
Additional info:
My cluster has 3 control nodes running Rawhide (mons, mgrs, MDSes), plus
1 physical server with 6 HDDs running 6 OSDs (also Rawhide).
I'm using CephFS and RGW.