osd: BlueStore.cc: BlueStore::_balance_bluefs_freespace: assert(0 == "allocate failed, wtf");
Luminous 12.2.2: an OSD crashed during deep scrubbing after 7 days of workload.
This is the same assertion as in http://tracker.ceph.com/issues/18698,
but in a different source file.
36 SSDs of 200G each, together containing 400G of RBDs and 500G of small files of 5 KB each.
Almost all SSDs are more than 50% full.
After 7 days of testing, deep scrubbing started.
As I understand it, that leads to BlueStore rebalancing.
And then crash occurs:
bluestore(/var/lib/ceph/osd/ceph-45) _balance_bluefs_freespace allocate failed on 0x76c00000 min_alloc_size 0x1000
...
... very very long stupidalloc dump ...
2: (()+0x11390) [0x7fb052f47390]
3: (gsignal()+0x38) [0x7fb051ee2428]
4: (abort()+0x16a) [0x7fb051ee402a]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x28e) [0x5641f18d9a2e]
6: (BlueStore::_balance_bluefs_freespace(std::vector<bluestore_pextent_t, mempool::pool_allocator<(mempool::pool_index_t)4, bluestore_pextent_t> >*)+0x1b21) [0x5641f176b5c1]
7: (BlueStore::_kv_sync_thread()+0x1ac0) [0x5641f176e040]
8: (BlueStore::KVSyncThread::entry()+0xd) [0x5641f17b1f8d]
9: (()+0x76ba) [0x7fb052f3d6ba]
10: (clone()+0x6d) [0x7fb051fb43dd]
As I understand it, _balance_bluefs_freespace tries to allocate space (~2G) for compacted metadata, but the allocation fails because of fragmentation,
and _balance_bluefs_freespace can't handle this failure.
I will check whether it can be avoided by setting bluestore_max_alloc_size...
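
To illustrate the fragmentation hypothesis above, here is a minimal Python sketch. It is a toy model, not Ceph code. It assumes that the BlueFS request can only be satisfied from whole, aligned chunks of an allocation unit (taken here to be 1 MiB), and that the free space consists of hypothetical 64 KiB fragments. Only the request size 0x76c00000 comes from the log line; the function name, the fragment size and the free list are made up for illustration.

```python
# Illustrative model only; this is not Ceph code.
KIB = 1024
MIB = 1024 * KIB


def usable_bytes(free_extents, alloc_unit):
    """Sum the free bytes that form whole, aligned alloc_unit chunks."""
    total = 0
    for offset, length in free_extents:
        start = -(-offset // alloc_unit) * alloc_unit        # round start up
        end = (offset + length) // alloc_unit * alloc_unit   # round end down
        if end > start:
            total += end - start
    return total


want = 0x76c00000        # request size from the log line (1900 MiB)
fragment = 64 * KIB      # hypothetical size of each free fragment
alloc_unit = 1 * MIB     # assumed BlueFS allocation unit

# Hypothetical free list: 65536 free fragments, each followed by a used gap.
free_extents = [(i * 2 * fragment, fragment) for i in range(65536)]

total_free = sum(length for _, length in free_extents)
usable = usable_bytes(free_extents, alloc_unit)

print(f"requested:  {want:#x}")
print(f"total free: {total_free:#x}")
print(f"usable in {alloc_unit:#x} units: {usable:#x}")
print("allocation would succeed" if usable >= want else "allocation would fail")
```

In this model the total free space (4 GiB) is more than twice the request, yet no fragment contains a complete aligned 1 MiB chunk, so the allocation cannot be satisfied.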
#4 Updated by Aleksei Gutikov over 3 years ago
This seems to be the same problem as described here for the bitmap allocator:
After setting bluestore_min_alloc_size = bluefs_alloc_size = 1M,
the crashes were no longer reproducible.
We used bluestore_min_alloc_size=4k, but it seems the crashes would be
reproducible even with the default 16k or 64k.
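
The following Python sketch shows why matching the two sizes might help, under stated assumptions. It models a worst-case "checkerboard" layout in which every other block of size bluestore_min_alloc_size is free, and counts how much of that free space forms whole, aligned chunks of a 1 MiB BlueFS unit. The function name, the 1 GiB device size and the checkerboard layout are hypothetical; real allocator behaviour is more complex.

```python
# Illustrative model only; this is not Ceph code.
KIB = 1024
MIB = 1024 * KIB


def aligned_free_bytes(device_size, min_alloc_size, bluefs_unit):
    """Worst-case checkerboard: every other min_alloc_size block is free.

    Returns the free bytes that form whole, aligned bluefs_unit chunks.
    """
    usable = 0
    for offset in range(0, device_size, 2 * min_alloc_size):
        start = -(-offset // bluefs_unit) * bluefs_unit              # round up
        end = (offset + min_alloc_size) // bluefs_unit * bluefs_unit  # round down
        usable += max(0, end - start)
    return usable


device_size = 1024 * MIB   # hypothetical 1 GiB device
bluefs_unit = 1 * MIB      # assumed BlueFS allocation unit

results = {
    size: aligned_free_bytes(device_size, size, bluefs_unit)
    for size in (4 * KIB, 16 * KIB, 64 * KIB, 1 * MIB)
}

for size, usable in results.items():
    print(f"min_alloc_size={size // KIB}k: {usable // MIB} MiB usable by BlueFS")
```

In this model, half of the device is free in every case, but only the 1M setting leaves free blocks that BlueFS could use; with 4k, 16k or 64k no free block is large enough.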