Backport #13335
closed
hammer: OSD crashed when reached pool's max_bytes quota
Updated by Loïc Dachary over 8 years ago
- Description updated (diff)
- Status changed from New to In Progress
- Assignee set to Loïc Dachary
Updated by Alexey Sheplyakov over 8 years ago
There's a fix here: https://github.com/ceph/ceph/pull/6918
Updated by Loïc Dachary over 8 years ago
- Description updated (diff)
- Status changed from In Progress to New
- Assignee deleted (Loïc Dachary)
This needs to be adapted to hammer because a few things are different (no flag in messages, CEPH_OSD_FLAG_FULL_FORCE not implemented).
Updated by Alexey Sheplyakov over 8 years ago
Basically I've moved the check for a full pool to the right place (before updating the cached ObjectContext)
without changing the check itself (well, almost).
Updated by Loïc Dachary about 8 years ago
- Status changed from New to In Progress
- Assignee set to Loïc Dachary
Updated by Loïc Dachary about 8 years ago
- Status changed from In Progress to Resolved
- Target version set to v0.94.6
Updated by Loïc Dachary about 8 years ago
- Status changed from Resolved to New
- Assignee deleted (Loïc Dachary)
- Target version deleted (v0.94.6)
Updated by Loïc Dachary about 8 years ago
The commit introduces a regression and is reverted by http://tracker.ceph.com/issues/15019
Updated by Loïc Dachary about 8 years ago
- Related to Bug #15019: hammer: fs test fails with log [ERR] : OSD full dropping all updates 100% full added
Updated by Alexey Sheplyakov about 8 years ago
"The commit introduces a regression"
The commit exposes a bug in the test, which assumes it's possible to write more data than the storage capacity allows.
I believe that the OSD should reject such writes to prevent further damage (ENOSPC handling in filesystem code is not 100% foolproof), and it does so in Infernalis and Jewel.
"and is reverted by http://tracker.ceph.com/issues/15019"
I don't think reverting it is a good idea; the test case itself should be fixed instead.
Even if we want to pretend that it's possible to write 144 MB of data to a 100 MB drive,
the check should be slightly modified (that is, https://github.com/ceph/ceph/blob/hammer/src/osd/ReplicatedPG.cc#L5693 should be removed)
instead of reintroducing the obc corruption. However, I think checking for a full OSD is actually correct.
Updated by Loïc Dachary about 8 years ago
- Status changed from New to In Progress
- Assignee set to Alexey Sheplyakov
Updated by Loïc Dachary about 8 years ago
- Subject changed from OSD crashed when reached pool's max_bytes quota to hammer: OSD crashed when reached pool's max_bytes quota
Updated by Loïc Dachary about 8 years ago
- Status changed from In Progress to Resolved