Bug #62815
closed
hybrid/avl allocators might be very ineffective when serving bluefs allocations
Added by Igor Fedotov 8 months ago.
Updated 2 months ago.
Backport:
reef, quincy, pacific
Description
When operating in best-fit mode, these allocators perform chunk lookups through the size-sorted "range_size_tree" container. This is done in two stages:
1) Use the container's lower_bound() to search for the first available chunk that is long enough.
2) Iterate sequentially to locate a properly aligned chunk, starting from the position obtained at step 1).
Step 2) might require significant effort for bluefs allocations (which use a larger allocation unit than the bluestore min_alloc_size, e.g. 64K vs. 4K) if space is highly fragmented. Plenty of potentially available chunks might be unaligned to a 64K boundary, and once aligned they aren't long enough any more.
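The two stages can be sketched roughly as follows. This is only an illustration in Python, not the actual allocator code: a sorted list of (length, offset) pairs stands in for range_size_tree, bisect_left plays the role of lower_bound(), and the names find_chunk and align_up as well as the sample free map are made up for the example.

```python
from bisect import bisect_left

K = 1024


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def find_chunk(free_by_size, want, alignment):
    """Look up a chunk of `want` bytes starting on an `alignment` boundary.

    free_by_size is a list of (length, offset) pairs sorted ascending,
    a hypothetical stand-in for the size-sorted tree.
    Returns (aligned_offset or None, number_of_candidates_examined).
    """
    # Stage 1: lower_bound, i.e. the first chunk that is long enough.
    start = bisect_left(free_by_size, (want, 0))
    examined = 0
    # Stage 2: walk forward until a chunk still fits after alignment.
    for length, offset in free_by_size[start:]:
        examined += 1
        aligned = align_up(offset, alignment)
        if aligned + want <= offset + length:
            return aligned, examined
    return None, examined


# Made-up free map: the first two chunks are long enough for 64K,
# but not once their start is rounded up to a 64K boundary.
free = sorted([
    (64 * K, 4 * K),
    (68 * K, 132 * K),
    (96 * K, 300 * K),
])

print(find_chunk(free, 64 * K, 64 * K))  # third candidate is the first fit
print(find_chunk(free, 64 * K, 4 * K))   # first candidate already fits
```

With the 64K alignment, the first two candidates are rejected even though their raw length is sufficient. That is the step 2) overhead described above, just on a tiny scale.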
This spreadsheet shows how dramatic that could be for the hybrid allocator (a real free map from a production cluster was used to build it):
https://docs.google.com/spreadsheets/d/1bh22JNqwTPSFUQPStYuuiKDEui6YSxZsusIln3tZ9t0/edit?usp=sharing
Actually the above description is valid for both best- and fast-fit modes. Sequential lookup for a 64K-aligned chunk might be pretty costly. I've made an experiment where the alignment requirement was changed from 64K to 4K, and allocation times dropped dramatically.
Looks like a good fix to me, as the large chunk alignment requirement is apparently redundant.
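To give a feel for why relaxing the alignment helps, here is a small synthetic comparison. It is not the experiment mentioned above and does not use the production free map: the free map, the scan_length helper and the chunk layout are all invented for illustration, and it counts visited candidates rather than measuring time.

```python
K = 1024


def scan_length(chunks, want, alignment):
    """Count candidates visited until one fits `want` bytes once its
    start is rounded up to `alignment`. Returns None if nothing fits.

    chunks: (offset, length) pairs in size order, each at least `want` long.
    """
    for visited, (offset, length) in enumerate(chunks, start=1):
        aligned = -(-offset // alignment) * alignment  # ceiling division
        if aligned + want <= offset + length:
            return visited
    return None


# Synthetic fragmented map: 1000 free chunks of exactly 64K, each starting
# 4K past a 64K boundary, followed by a single aligned 128K chunk.
chunks = [(i * 256 * K + 4 * K, 64 * K) for i in range(1000)]
chunks.append((1000 * 256 * K, 128 * K))

print(scan_length(chunks, 64 * K, 64 * K))  # 1001
print(scan_length(chunks, 64 * K, 4 * K))   # 1
```

In this contrived layout every 64K chunk is unusable under the 64K alignment, so the scan walks the whole list, while with 4K alignment the very first candidate is accepted.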
Originally the issue was revealed a while ago when the following PR was made: https://github.com/ceph/ceph/pull/48640
That implementation is rather overkill though, since we might just get rid of the alignment requirement instead.
- Severity changed from 3 - minor to 2 - major
- Pull request ID set to 53483
- Backport set to reef, quincy, pacific
- Status changed from New to Fix Under Review
- Is duplicate of Bug #62509: os/bluestore: suspect performance bottleneck on allocator added
- Status changed from Fix Under Review to Duplicate
- Status changed from Duplicate to Fix Under Review
- Target version set to v18.2.1
- Status changed from Fix Under Review to Pending Backport
- Copied to Backport #63760: reef: hybrid/avl allocators might be very ineffective when serving bluefs allocations added
- Copied to Backport #63761: quincy: hybrid/avl allocators might be very ineffective when serving bluefs allocations added
- Copied to Backport #63762: pacific: hybrid/avl allocators might be very ineffective when serving bluefs allocations added
- Tags set to backport_processed
- Target version deleted (v18.2.1)
- Status changed from Pending Backport to Resolved
- Assignee set to Igor Fedotov
- % Done changed from 0 to 100
- Source set to Community (dev)