Fix #58759
BlueFS log runway space exhausted
Status:
New
Priority:
Low
Assignee:
-
Target version:
-
% Done:
0%
Source:
Development
Tags:
Backport:
Reviewed:
Affected Versions:
ceph-qa-suite:
rados
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
In BlueFS::_flush_and_sync_log_core we have the following data integrity check:
ceph_assert(bl.length() <= runway);
It is there because it is unacceptable to write a transaction larger than the currently available runway.
If we did, there would be no good way to read the data back (the _do_replay_recovery_read() heuristic exists, but it requires a lengthy recovery).
A possible solution: if the runway is smaller than the transaction,
we inject a log-extending transaction first.
This was almost impossible before, but these pull requests help:
https://github.com/ceph/ceph/pull/42750 "incremental update"
https://github.com/ceph/ceph/pull/48854 "4K bluefs"
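
The proposed approach could be sketched roughly as follows. This is a toy model, not BlueFS code: ToyLog, flush_with_extension, EXTEND_RECORD_SIZE and ALLOC_UNIT are hypothetical names and values chosen for illustration, and the sketch assumes the log-extending record is small enough to always fit in the remaining runway.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of the log. Not the real BlueFS types.
struct ToyLog {
  uint64_t allocated = 0;          // bytes currently allocated to the log
  uint64_t written = 0;            // bytes already consumed by records
  std::vector<uint64_t> records;   // sizes of records, in write order

  uint64_t runway() const { return allocated - written; }

  // Mirrors the invariant behind ceph_assert(bl.length() <= runway).
  void append(uint64_t len) {
    assert(len <= runway());
    written += len;
    records.push_back(len);
  }
};

// Assumed values, for illustration only.
constexpr uint64_t EXTEND_RECORD_SIZE = 4096;  // size of a log-extending record
constexpr uint64_t ALLOC_UNIT = 65536;         // granularity of log growth

// Writes a transaction of txn_len bytes. If the runway is too small,
// a log-extending record is written first. Returns true if the log
// had to be extended.
bool flush_with_extension(ToyLog& log, uint64_t txn_len) {
  bool extended = false;
  if (txn_len > log.runway()) {
    // The extending record itself must fit into what is left.
    assert(EXTEND_RECORD_SIZE <= log.runway());
    log.append(EXTEND_RECORD_SIZE);
    // Grow the log by enough whole allocation units to hold the transaction.
    uint64_t need = txn_len - log.runway();
    uint64_t grow = ((need + ALLOC_UNIT - 1) / ALLOC_UNIT) * ALLOC_UNIT;
    log.allocated += grow;
    extended = true;
  }
  log.append(txn_len);  // the original invariant now holds
  return extended;
}
```

In this model the extending record is written inside the old runway, and only afterwards is the new space counted as usable, so a replay that reads records in order would learn about the extension before it reaches the large transaction.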