Bug #62282

open

BlueFS and BlueStore use the same space (init_rm_free assert)

Added by Adam Kupczyk 9 months ago. Updated 5 months ago.

Status: New
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The problem is triggered when BlueFS mounts and tries to reserve its allocations on the shared device.

ceph version 17.2.6-70.el9cp (fe62dcdbb2c6e05782a3e2b67d025b84ff5047cc) quincy (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x11e) [0x557821136c6b]
 2: /usr/bin/ceph-osd(+0x3dbe27) [0x557821136e27]
 3: /usr/bin/ceph-osd(+0xa432d1) [0x55782179e2d1]
 4: (AvlAllocator::_try_remove_from_tree(unsigned long, unsigned long, std::function<void (unsigned long, unsigned long, bool)>)+0x24c) [0x5578217960ec]
 5: (HybridAllocator::init_rm_free(unsigned long, unsigned long)+0xc0) [0x55782179dfd0]
 6: (BlueFS::mount()+0x1f6) [0x5578217666e6]
 7: (BlueStore::_open_bluefs(bool, bool)+0x82) [0x557821690b42]
 8: (BlueStore::_prepare_db_environment(bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*)+0x5c0) [0x557821691800]
 9: (BlueStore::_open_db(bool, bool, bool)+0x179) [0x557821693439]
 10: (BlueStore::_open_db_and_around(bool, bool)+0x429) [0x557821694169]
 11: (BlueStore::_mount()+0x2ec) [0x55782169a57c]
 12: (OSD::init()+0x4fc) [0x55782127359c]
 13: main()

The offending range is from BlueFS log: range 0x6dda040000~400000

2023-07-28T18:53:56.313+0000 7fc5b803c2c0 30 bluefs mount noting alloc for file(ino 1 size 0x89c000 mtime 2023-07-27T18:29:37.033690+0000 allocated c20000 alloc_commit 10000 extents [1:0x5d800000~10000,1:0x5d6f0000~10000,1:0x6dda040000~400000,1:0x1a3d0000~400000,1:0x4fd47c0000~400000])
2023-07-28T18:53:56.313+0000 7fc5b803c2c0 10 HybridAllocator init_rm_free offset 0x5d800000 length 0x10000
2023-07-28T18:53:56.313+0000 7fc5b803c2c0 10 HybridAllocator init_rm_free offset 0x5d6f0000 length 0x10000
2023-07-28T18:53:56.313+0000 7fc5b803c2c0 10 HybridAllocator init_rm_free offset 0x6dda040000 length 0x400000
2023-07-28T18:53:56.313+0000 7fc5b803c2c0 -1 HybridAllocator init_rm_free lambda Uexpected extent:  0x6dda040000~400000
2023-07-28T18:53:56.317+0000 7fc5b803c2c0 -1 /builddir/build/BUILD/ceph-17.2.6/src/os/bluestore/HybridAllocator.cc: In function 'HybridAllocator::init_rm_free(uint64_t, uint64_t)::<lambda(uint64_t, uint64_t, bool)>' thread 7fc5b803c2c0 time 2023-07-28T18:53:56.315192+0000
/builddir/build/BUILD/ceph-17.2.6/src/os/bluestore/HybridAllocator.cc: 175: FAILED ceph_assert(false)

As a run of fsck shows:

ceph-bluestore-tool --path /rootfs/var/lib/ceph/08416350-2b97-11ee-a9a2-ac1f6b40d3fc/osd.46/ --bluestore-allocator=bitmap fsck
2023-08-01T15:33:44.344+0000 7fef99bc4600 -1 bluestore(/rootfs/var/lib/ceph/08416350-2b97-11ee-a9a2-ac1f6b40d3fc/osd.46) operator()::fsck error:  oid #-1:3105a3cf:::disk_bw_test_41:0#, extent 0x6dda2c0000~10000 or a subset is already allocated (misreferenced)

2023-08-01T15:33:44.344+0000 7fef99bc4600 -1 bluestore(/rootfs/var/lib/ceph/08416350-2b97-11ee-a9a2-ac1f6b40d3fc/osd.46) operator()::fsck error:  oid #-1:3105a3cf:::disk_bw_test_41:0#, extent 0x6dda280000~10000 or a subset is already allocated (misreferenced)

2023-08-01T15:33:44.344+0000 7fef99bc4600 -1 bluestore(/rootfs/var/lib/ceph/08416350-2b97-11ee-a9a2-ac1f6b40d3fc/osd.46) operator()::fsck error:  oid #-1:3105a3cf:::disk_bw_test_41:0#, extent 0x6dda290000~10000 or a subset is already allocated (misreferenced)

2023-08-01T15:33:44.344+0000 7fef99bc4600 -1 bluestore(/rootfs/var/lib/ceph/08416350-2b97-11ee-a9a2-ac1f6b40d3fc/osd.46) operator()::fsck error:  oid #-1:3105a3cf:::disk_bw_test_41:0#, extent 0x6dda2a0000~10000 or a subset is already allocated (misreferenced)

2023-08-01T15:33:44.344+0000 7fef99bc4600 -1 bluestore(/rootfs/var/lib/ceph/08416350-2b97-11ee-a9a2-ac1f6b40d3fc/osd.46) operator()::fsck error:  oid #-1:3105a3cf:::disk_bw_test_41:0#, extent 0x6dda2b0000~10000 or a subset is already allocated (misreferenced)

2023-08-01T15:33:44.344+0000 7fef99bc4600 -1 bluestore(/rootfs/var/lib/ceph/08416350-2b97-11ee-a9a2-ac1f6b40d3fc/osd.46) operator()::fsck error:  oid #-1:3105a3cf:::disk_bw_test_41:0#, extent 0x6dda2d0000~10000 or a subset is already allocated (misreferenced)
.....
2023-08-01T15:33:44.361+0000 7fef99bc4600 -1 bluestore(/rootfs/var/lib/ceph/08416350-2b97-11ee-a9a2-ac1f6b40d3fc/osd.46) operator()::fsck error:  oid #-1:3fc5a3cf:::disk_bw_test_40:0#, extent 0x4fd4890000~10000 or a subset is already allocated (misreferenced)

fsck status: remaining 128 error(s) and warning(s)

The same space on disk is currently occupied by both BlueFS and BlueStore.
The device is rotational and bitmap_freelist_manager is in use.
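
The overlap can be checked by hand from the logs above. The following is a minimal sketch, assuming that the `offset~length` notation in both logs gives a hexadecimal offset followed by a hexadecimal length. `Extent` and `contains` are illustrative names, not Ceph types.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative stand-in for an extent printed as offset~length in the logs.
struct Extent {
  std::uint64_t offset;
  std::uint64_t length;
  constexpr std::uint64_t end() const { return offset + length; }
};

// True when `inner` lies entirely within `outer`.
constexpr bool contains(const Extent& outer, const Extent& inner) {
  return inner.offset >= outer.offset && inner.end() <= outer.end();
}

// Two extents of the BlueFS log file (ino 1) from the "noting alloc" line.
constexpr Extent bluefs_a{0x6dda040000, 0x400000};
constexpr Extent bluefs_b{0x4fd47c0000, 0x400000};

// Two BlueStore object extents that fsck reported as misreferenced.
constexpr Extent obj_41{0x6dda2c0000, 0x10000};  // disk_bw_test_41
constexpr Extent obj_40{0x4fd4890000, 0x10000};  // disk_bw_test_40

// Each object extent sits inside a BlueFS log extent.
static_assert(contains(bluefs_a, obj_41), "disk_bw_test_41 overlaps the BlueFS log");
static_assert(contains(bluefs_b, obj_40), "disk_bw_test_40 overlaps the BlueFS log");
```

Both assertions hold at compile time: the object extents reported by fsck fall inside extents that the BlueFS log file also claims.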


Files

b.patch (1.17 KB) Igor Fedotov, 11/23/2023 09:12 PM

Related issues 1 (0 open, 1 closed)

Related to bluestore - Bug #63618: Allocator configured with 64K alloc unit might get 4K requests (Resolved, Igor Fedotov)

Actions #1

Updated by Adam Kupczyk 9 months ago

  • Subject changed from BlueFS and BlueStore use the same space to BlueFS and BlueStore use the same space (init_rm_free assert)
Actions #3

Updated by Adam Kupczyk 9 months ago

The AVL allocator (and, by extension, the hybrid allocator) can accept the same region as free twice.

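  // Hybrid allocator over a 0x746e1c00000-byte device with a 4K block size.
  unique_ptr<Allocator> alloc;
  alloc.reset(Allocator::create(g_ceph_context, "hybrid",
                0x746e1c00000, 4096, 0, 0, "dupa"));
  // Two disjoint free regions: 0x100000~10000 and 0x120000~10000.
  alloc->init_add_free(0x100000,0x10000);
  alloc->init_add_free(0x120000,0x10000);
  // Release 0x108000~10000; its first half (0x108000~8000) is already free.
  PExtentVector release_set;
  release_set.emplace_back(0x108000, 0x10000);
  alloc->release(release_set);
  // Allocate 16K at a time until the allocator has nothing left.
  int64_t r = 0;
  int64_t allocated = 0;
  do {
    PExtentVector tmp;
    r = alloc->allocate(0x4000, 0x1000, 0, 0, &tmp);
    if (r > 0) allocated += r;
    std::cout << "allocated=" << tmp << std::hex << " total 0x" << allocated << std::dec << std::endl;
  } while (r > 0);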
allocated=[0x100000~4000] total 0x4000
allocated=[0x104000~4000] total 0x8000
allocated=[0x108000~4000] total 0xc000
allocated=[0x10c000~4000] total 0x10000
allocated=[0x108000~4000] total 0x14000
allocated=[0x10c000~4000] total 0x18000
allocated=[0x110000~4000] total 0x1c000
allocated=[0x114000~4000] total 0x20000
allocated=[0x120000~4000] total 0x24000
allocated=[0x124000~4000] total 0x28000
allocated=[0x128000~4000] total 0x2c000
allocated=[0x12c000~4000] total 0x30000
allocated=[] total 0x30000

Now it has given out region 0x108000~8000 TWICE!
This convinces me that the bug is caused by releasing some region that was already free.
Then BlueFS and BlueStore independently got the same region.

The action plan is to add meaningful checks that catch these events and start testing.
Possibly even make it a part of the releases, so we could use telemetry to zoom in on whatever is going on.
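
The kind of check meant here could look roughly like the sketch below. It is a hypothetical model of a free-extent set, not Ceph's AvlAllocator code: `CheckedFreeSet` and its methods are invented for illustration, and adjacent extents are deliberately not merged to keep it short.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical free-extent set: start offset -> length.
// It rejects a release that overlaps space already marked free.
class CheckedFreeSet {
 public:
  // True if [offset, offset + length) intersects any extent already free.
  bool overlaps_free(std::uint64_t offset, std::uint64_t length) const {
    auto it = free_.upper_bound(offset);  // first free extent starting after offset
    if (it != free_.end() && it->first < offset + length) return true;
    if (it != free_.begin()) {
      auto prev = std::prev(it);  // last free extent starting at or before offset
      if (prev->first + prev->second > offset) return true;
    }
    return false;
  }

  // Marks a range free. Returns false for an empty range or when any part
  // of the range is already free, which is the event worth reporting.
  bool release(std::uint64_t offset, std::uint64_t length) {
    if (length == 0 || overlaps_free(offset, length)) return false;
    free_[offset] = length;
    return true;
  }

  // Sum of all free extent lengths.
  std::uint64_t total_free() const {
    std::uint64_t sum = 0;
    for (const auto& kv : free_) sum += kv.second;
    return sum;
  }

 private:
  std::map<std::uint64_t, std::uint64_t> free_;
};
```

Fed the same sequence as the reproducer (free 0x100000~10000, free 0x120000~10000, release 0x108000~10000), this model refuses the third call instead of accepting it. In the reproducer output the allocator handed out 0x30000 bytes in total, while the distinct space covered by the three ranges is only 0x28000; the 0x8000 difference is the doubly allocated region.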

Actions #4

Updated by Igor Fedotov 9 months ago

Adam Kupczyk wrote:

The AVL allocator (and, by extension, the hybrid allocator) can accept the same region as free twice.

[...]

[...]

Now it has given out region 0x108000~8000 TWICE!
This convinces me that the bug is caused by releasing some region that was already free.
Then BlueFS and BlueStore independently got the same region.

The action plan is to add meaningful checks that catch these events and start testing.
Possibly even make it a part of the releases, so we could use telemetry to zoom in on whatever is going on.

Would https://github.com/ceph/ceph/pull/47730 do the trick?

Actions #6

Updated by Igor Fedotov 5 months ago

I think this could be related to https://tracker.ceph.com/issues/63618 if the DB shares the main device and the legacy 64K alloc unit is in use for this device.
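
To make the size mismatch concrete, here is a small sketch of an alignment check. It only illustrates what a 4K request means numerically against a 64K alloc unit; the exact mechanism in issue 63618 is not described in this thread, and `is_aligned`, the constants, and the 4K example extent are illustrative, not Ceph code or logged data.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kLegacyAllocUnit = 0x10000;  // 64K
constexpr std::uint64_t kSmallRequest = 0x1000;      // 4K

// True when both the offset and the length are multiples of the alloc unit.
constexpr bool is_aligned(std::uint64_t offset, std::uint64_t length,
                          std::uint64_t unit) {
  return unit != 0 && offset % unit == 0 && length % unit == 0;
}

// An extent from the BlueFS log above is 64K-aligned.
static_assert(is_aligned(0x5d800000, 0x10000, kLegacyAllocUnit),
              "64K extent fits the 64K alloc unit");

// A hypothetical 4K extent is not, so a 64K-unit allocator cannot
// represent it exactly.
static_assert(!is_aligned(0x5d801000, kSmallRequest, kLegacyAllocUnit),
              "4K extent does not fit the 64K alloc unit");
```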

Actions #7

Updated by Igor Fedotov 5 months ago

The attached patch (b.patch) simulates a scenario for the Hybrid Allocator that might result in used extents being marked as free and hence cause duplicate allocations.

Actions #8

Updated by Igor Fedotov 5 months ago

  • Related to Bug #63618: Allocator configured with 64K alloc unit might get 4K requests added