Bug #53678: NCB's reconstruct allocations improperly handles shared blobs - bluestore - Ceph

Bug #53678

closed

NCB's reconstruct allocations improperly handles shared blobs

Added by Igor Fedotov over 2 years ago. Updated about 2 years ago.

Status:

Resolved

Priority:

Normal

Assignee:

Gabriel BenHanokh

Target version:

% Done:

Source:

Tags:

Backport:

quincy

Regression:

Severity:

3 - minor

Reviewed:

Affected Versions:

Ceph - v17.0.0

ceph-qa-suite:

Pull request ID:

44563

Crash signature (v1):

Crash signature (v2):

Description

The following assertion may fire during fsck or OSD startup when the store holds a large number of shared blobs and the OSD previously crashed with an out-of-space (ENOSPC) error.

../src/os/bluestore/fastbmap_allocator_impl.h: 809: FAILED ceph_assert(available >= allocated)

ceph version Development (no_version) quincy (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14c) [0x7f484ea021fb]
 2: /home/if/ceph.2/build/lib/libceph-common.so.2(+0x25c40c) [0x7f484ea0240c]
 3: (BitmapAllocator::init_rm_free(unsigned long, unsigned long)+0x709) [0x55c0f2ae1f39]
 4: (BlueStore::read_allocation_from_single_onode(Allocator*, boost::intrusive_ptr<BlueStore::Onode>&, BlueStore::read_alloc_stats_t&)+0x103) [0x55c0f28f5d33]
 5: (BlueStore::read_allocation_from_onodes(Allocator*, BlueStore::read_alloc_stats_t&)+0xd31) [0x55c0f29309a1]
 6: (BlueStore::reconstruct_allocations(Allocator*, BlueStore::read_alloc_stats_t&)+0x520) [0x55c0f2931a00]
 7: (BlueStore::read_allocation_from_drive_on_startup()+0x96) [0x55c0f29485c6]
 8: (BlueStore::_init_alloc(std::map<unsigned long, unsigned long, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, unsigned long> > >*)+0xac3) [0x55c0f29496b3]
 9: (BlueStore::_open_db_and_around(bool, bool)+0x367) [0x55c0f2992727]
 10: (BlueStore::_fsck(BlueStore::FSCKDepth, bool)+0x486) [0x55c0f29950f6]
 11: main()

Adam's script at https://gist.github.com/aclamk/974db54b7613e074c6bbc4877a4d722e is a good reproducer. Note, however, that the OSD has to hit an ENOSPC error before this failure appears.

IMO the root cause is a design flaw in the NCB code, which feeds the same extent from a shared blob to the NCB restore allocator multiple times. The bitmap allocator silently accepts the duplicate extents but incorrectly adjusts the available space each time, which eventually triggers the assertion.
Generally, allocators are not supposed to receive the same extent in init_rm_free calls more than once. Doing so abuses the bitmap allocator, which performs no checking for performance reasons and hence silently accepts duplicates up to some point. The AVL allocator, for example, would assert immediately.
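The double accounting described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyBitmapAllocator` and `feed_duplicates` are hypothetical names, not the real `BitmapAllocator` code, and the toy assumes the free-space counter is decremented by the full extent length on every call, with no check for already-marked units.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Toy stand-in for a bitmap allocator; NOT the real Ceph BitmapAllocator.
class ToyBitmapAllocator {
public:
  explicit ToyBitmapAllocator(uint64_t units)
    : used_(units, false), available_(units) {}

  // Assumed behavior: bits are set unconditionally and the whole length is
  // subtracted from the free counter, even if the units were already marked.
  void init_rm_free(uint64_t offset, uint64_t length) {
    assert(offset + length <= used_.size());
    for (uint64_t i = offset; i < offset + length; ++i) {
      used_[i] = true;
    }
    if (available_ < length) {
      // Stands in for ceph_assert(available >= allocated).
      throw std::logic_error("available < allocated");
    }
    available_ -= length;
  }

  // What the counter claims is free.
  uint64_t available() const { return available_; }

  // What the bitmap itself says is free.
  uint64_t really_free() const {
    uint64_t n = 0;
    for (bool u : used_) {
      if (!u) ++n;
    }
    return n;
  }

private:
  std::vector<bool> used_;
  uint64_t available_;
};

// Feeds the same extent `times` times, as if `times` onodes referenced one
// shared blob. Returns true if the invariant check tripped.
bool feed_duplicates(ToyBitmapAllocator& a, uint64_t off, uint64_t len,
                     int times) {
  try {
    for (int i = 0; i < times; ++i) {
      a.init_rm_free(off, len);
    }
  } catch (const std::logic_error&) {
    return true;
  }
  return false;
}
```

In this toy, each duplicate call widens the gap between `available()` and `really_free()`, and the check only trips once the counter runs low. That would be consistent with the failure showing up only on a store that has already run out of space, though the toy does not prove that this is the exact mechanism in the real allocator.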
I think proper shared blob handling should be similar to the fsck implementation: a standalone pass over the shared blob KV records should update the allocator, instead of handling shared blobs on a per-onode basis.
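The proposed two-pass approach could be sketched roughly as follows. All types and names here (`Extent`, `ToyOnode`, `SharedBlobRecords`, `collect_allocated`) are hypothetical simplifications for illustration; they are not BlueStore structures, and this is not necessarily how the eventual fix was implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

struct Extent {
  uint64_t offset;
  uint64_t length;
};

// Hypothetical onode: extents it owns exclusively, plus the ids of the
// shared blobs it references.
struct ToyOnode {
  std::vector<Extent> private_extents;
  std::vector<uint64_t> shared_blob_ids;
};

// Hypothetical stand-in for the shared blob KV records: id -> extents.
using SharedBlobRecords = std::map<uint64_t, std::vector<Extent>>;

// Collects every allocated extent exactly once.
std::vector<Extent> collect_allocated(const std::vector<ToyOnode>& onodes,
                                      const SharedBlobRecords& shared) {
  std::vector<Extent> out;
  // Pass 1: per-onode, taking only extents that are not in shared blobs.
  for (const auto& o : onodes) {
    out.insert(out.end(), o.private_extents.begin(), o.private_extents.end());
  }
  // Pass 2: standalone run over the shared blob records, so each shared
  // extent is reported once no matter how many onodes reference it.
  for (const auto& kv : shared) {
    out.insert(out.end(), kv.second.begin(), kv.second.end());
  }
  return out;
}

// Sum of extent lengths, useful for comparing against a free-space counter.
uint64_t total_length(const std::vector<Extent>& v) {
  uint64_t sum = 0;
  for (const auto& e : v) {
    sum += e.length;
  }
  return sum;
}
```

With this shape, each resulting extent can be passed to the allocator's init_rm_free once, which keeps the allocator's precondition (no duplicate extents) intact regardless of which allocator implementation is in use.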

Related issues 1 (0 open — 1 closed)


Updated by Igor Fedotov over 2 years ago

Updated by Neha Ojha over 2 years ago

Updated by Neha Ojha about 2 years ago

Updated by Gabriel BenHanokh about 2 years ago

Updated by Gabriel BenHanokh about 2 years ago

Updated by Backport Bot about 2 years ago

Updated by Neha Ojha about 2 years ago