Bug #22510 (Closed)
osd: BlueStore.cc: BlueStore::_balance_bluefs_freespace: assert(0 == "allocate failed, wtf")
% Done: 0%
Backport: luminous
Regression: No
Severity: 3 - minor
ceph-qa-suite: rados
Description
Luminous 12.2.2: OSD crash during deep scrubbing after 7 days of workload.
Same as http://tracker.ceph.com/issues/18698, but in a different source file.
Preconditions:
36 SSDs, 200G each, containing 400G of RBDs and 500G of small files (5 KB each).
Almost all SSDs are above 50% utilization.
Deep scrubbing started after 7 days of testing.
As I understand it, that triggers BlueFS free-space rebalancing,
and then the crash occurs:
bluestore(/var/lib/ceph/osd/ceph-45) _balance_bluefs_freespace allocate failed on 0x76c00000 min_alloc_size 0x1000
...
... very very long stupidalloc dump
...
2: (()+0x11390) [0x7fb052f47390]
3: (gsignal()+0x38) [0x7fb051ee2428]
4: (abort()+0x16a) [0x7fb051ee402a]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x28e) [0x5641f18d9a2e]
6: (BlueStore::_balance_bluefs_freespace(std::vector<bluestore_pextent_t, mempool::pool_allocator<(mempool::pool_index_t)4, bluestore_pextent_t> >*)+0x1b21) [0x5641f176b5c1]
7: (BlueStore::_kv_sync_thread()+0x1ac0) [0x5641f176e040]
8: (BlueStore::KVSyncThread::entry()+0xd) [0x5641f17b1f8d]
9: (()+0x76ba) [0x7fb052f3d6ba]
10: (clone()+0x6d) [0x7fb051fb43dd]
As I understand it, _balance_bluefs_freespace tries to allocate space (~2G) for compacted metadata, but the allocation fails due to fragmentation,
and _balance_bluefs_freespace cannot handle the failure.
Will check whether it can be avoided by setting bluestore_max_alloc_size...
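To illustrate the failure mode, here is a minimal sketch of a contiguous-extent allocator (this is NOT Ceph's StupidAllocator, just a simplified model): an allocation for one large contiguous region can fail even though the total free space is far larger than the request, because every individual free extent is too small.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified free-space model: each entry is the length (bytes) of one
// contiguous free extent on the device.
struct FreeList {
    std::vector<uint64_t> extents;

    uint64_t total_free() const {
        uint64_t sum = 0;
        for (uint64_t e : extents) sum += e;
        return sum;
    }

    // First-fit allocation of a single contiguous region: succeeds only
    // if some one extent can hold the whole request.
    bool allocate(uint64_t want) {
        for (uint64_t& e : extents) {
            if (e >= want) {
                e -= want;
                return true;
            }
        }
        // Plenty of total free space may remain, but it is fragmented
        // into extents smaller than the request -- allocation fails.
        return false;
    }
};
```

For example, 4096 free extents of 1 MiB each give 4 GiB of total free space, yet a request for 0x76c00000 bytes (~1.9G, the size in the log above) fails, because no single extent is large enough.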
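For reference, the option mentioned above would be set in ceph.conf on luminous; the value below is purely illustrative, not a tested recommendation:

```
[osd]
# Cap the size of any single BlueStore allocation so large requests are
# split into smaller extents (0 = unlimited, the default).
# 67108864 = 64 MiB, illustrative value only.
bluestore_max_alloc_size = 67108864
```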