Bug #47804
EC backend implementation isn't optimal when handling 4K overwrites
Status: open
Description
The EC backend performs redundant read and write ops when handling a partial, 4K-aligned overwrite.
E.g. take a 64K object in an EC 2+1 pool and a 4K overwrite at offset 0. The ObjectStore's allocation size is set to 4K, hence the stripe width is 8K.
To process the 4K overwrite, an ideal EC implementation would need (aside from the input 4K chunk) just one more existing data chunk (or the existing coding one) to be able to build the new coding chunk, i.e. the third chunk of the stripe. Then both the new data chunk and the new coding chunk are written. The other existing data chunk doesn't need any modification.
Instead, the current implementation reads two data chunks (reading the overwritten one is redundant!) and then writes all three chunks back to the stores (again, one write op is redundant).
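A minimal sketch of the ideal path for the 2+1 example, in Python. It assumes plain bytewise XOR as a stand-in for the pool's actual erasure code, and the names `xor_parity`, `old_d0`, `d1` etc. are hypothetical, not taken from the Ceph sources. It only illustrates that the new coding chunk can be built from the incoming 4K chunk and the untouched data chunk, without reading the overwritten one.

```python
CHUNK = 4096  # 4K allocation unit, i.e. the chunk size in the example


def xor_parity(a: bytes, b: bytes) -> bytes:
    """Illustrative coding function for a 2+1 stripe: bytewise XOR (assumption)."""
    return bytes(x ^ y for x, y in zip(a, b))


# One 8K stripe as it exists on the stores: two data chunks, one coding chunk.
old_d0 = bytes([0x11]) * CHUNK
d1 = bytes([0x22]) * CHUNK
old_p = xor_parity(old_d0, d1)

# The 4K overwrite at offset 0 replaces the first data chunk entirely.
new_d0 = bytes([0x33]) * CHUNK

# Ideal path: read only d1 (1 read), then write new_d0 and new_p (2 writes).
# old_d0 is never read and d1 is never rewritten.
new_p = xor_parity(new_d0, d1)

# Sanity check: the untouched chunk is still recoverable from the two written ones.
recovered_d1 = xor_parity(new_d0, new_p)
```

The current behaviour described above would correspond to reading both `old_d0` and `d1` and then writing `new_d0`, `d1` and `new_p`.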
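For EC profiles with a higher 'm' the overhead is even higher, e.g. for a 3+1 profile one read and two writes are redundant. The general calculation for an m+k EC profile (in the notation used here: m data chunks and k coding chunks; note that Ceph's EC profile parameters name them the other way round, k data and m coding) gives 1 redundant read and m-1 redundant writes per 4K write.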
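The arithmetic can be spelled out as a small Python sketch. It assumes an overwrite that touches exactly one data chunk of a stripe, and it models the current behaviour exactly as described in this report (read every data chunk, write every chunk). The function `overwrite_io` and its parameter names are hypothetical; `data_chunks` is the first number of a profile such as 2+1 and `coding_chunks` the second.

```python
def overwrite_io(data_chunks: int, coding_chunks: int) -> dict:
    """Count per-stripe ops for an overwrite of a single data chunk."""
    # Current behaviour as reported: all data chunks are read, all chunks written.
    current_reads = data_chunks
    current_writes = data_chunks + coding_chunks

    # Ideal behaviour: read only the untouched data chunks, then write the
    # new data chunk plus every coding chunk.
    ideal_reads = data_chunks - 1
    ideal_writes = 1 + coding_chunks

    return {
        "redundant_reads": current_reads - ideal_reads,
        "redundant_writes": current_writes - ideal_writes,
    }


for profile in [(2, 1), (3, 1), (4, 2)]:
    print(profile, overwrite_io(*profile))
```

For the 2+1 and 3+1 profiles this reproduces the figures quoted in the report: one redundant read in both cases, and one and two redundant writes respectively.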
Given that a 4K-aligned overwrite is a pretty common user op and that modern BlueStore uses 4K allocation units for both HDD and SSD, the above scenario looks quite realistic and worth optimizing for performance.