Backport #13335: hammer: OSD crashed when reached pool's max_bytes quota - Ceph

Basically I've moved the check for a full pool to the right place (before updating the cached ObjectContext)
without changing the check itself (well, almost).
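
To illustrate why the placement of the check matters, here is a minimal sketch in Python rather than Ceph's C++. All names (`CachedObject`, `PoolFullError`, `write_check_late`, `write_check_first`) are hypothetical and do not correspond to the actual ReplicatedPG code. The sketch only assumes that a write updates some cached in-memory object state and can be rejected when the pool is full.

```python
from dataclasses import dataclass


class PoolFullError(Exception):
    """Hypothetical error: the write is rejected because the pool is over quota."""


@dataclass
class CachedObject:
    """Stand-in for a cached per-object context; not Ceph's ObjectContext."""
    size: int = 0


def write_check_late(obj: CachedObject, nbytes: int, pool_full: bool) -> None:
    # Problematic ordering: the cached state is updated before the check,
    # so a rejected write still leaves the cache modified.
    obj.size += nbytes
    if pool_full:
        raise PoolFullError("pool quota reached")


def write_check_first(obj: CachedObject, nbytes: int, pool_full: bool) -> None:
    # Ordering described above: reject before touching the cached state.
    if pool_full:
        raise PoolFullError("pool quota reached")
    obj.size += nbytes


def size_after_rejected_write(write) -> int:
    """Attempt a 4096-byte write against a full pool; return the cached size."""
    obj = CachedObject(size=100)
    try:
        write(obj, 4096, pool_full=True)
    except PoolFullError:
        pass
    return obj.size
```

With the late check, the cached size becomes 4196 even though the write was rejected. With the check first, it stays at 100.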

Updated by Loïc Dachary about 8 years ago

Status changed from New to In Progress
Assignee set to Loïc Dachary

Updated by Loïc Dachary about 8 years ago

Description updated (diff)

Updated by Loïc Dachary about 8 years ago

Status changed from In Progress to Resolved
Target version set to v0.94.6

Updated by Loïc Dachary about 8 years ago

Status changed from Resolved to New
Assignee deleted (~~Loïc Dachary~~)
Target version deleted (~~v0.94.6~~)

Updated by Loïc Dachary about 8 years ago

The commit introduces a regression and is reverted by http://tracker.ceph.com/issues/15019

#10

Updated by Loïc Dachary about 8 years ago

Related to Bug #15019: hammer: fs test fails with log [ERR] : OSD full dropping all updates 100% full added

#11

Updated by Alexey Sheplyakov about 8 years ago

> The commit introduces a regression

The commit exposes a bug in the test, which assumes it is possible to write more data than the storage capacity allows.
I believe the OSD should reject such writes to prevent further damage (ENOSPC handling in filesystem code is not 100% foolproof), and it does so in Infernalis and Jewel.

> and is reverted by http://tracker.ceph.com/issues/15019

I don't think reverting it is a good idea; the test case itself should be fixed instead.
Even if we want to pretend that it's possible to write 144 MB of data to a 100 MB drive,
the check should only be modified slightly: https://github.com/ceph/ceph/blob/hammer/src/osd/ReplicatedPG.cc#L5693 should be removed,
rather than reintroducing the obc corruption. However, I think checking for a full OSD is actually correct.
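
The behaviour argued for here (refuse writes once the device is full, instead of relying on the filesystem's ENOSPC path) can be sketched roughly as follows. This is a Python illustration with hypothetical names (`ToyStore`, `fill`), using the 100 MB and 144 MB figures from the comment. It is not how the OSD implements its full check.

```python
import errno

MB = 1024 * 1024


class ToyStore:
    """Hypothetical fixed-capacity store; not a Ceph class."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.used = 0

    def write(self, nbytes: int) -> None:
        # Refuse up front any write that would exceed capacity,
        # rather than letting the backing filesystem fail mid-write.
        if self.used + nbytes > self.capacity:
            raise OSError(errno.ENOSPC, "store full, write rejected")
        self.used += nbytes


def fill(store: ToyStore, total: int, chunk: int) -> int:
    """Write `total` bytes in `chunk`-sized pieces; return the bytes accepted."""
    accepted = 0
    while accepted < total:
        try:
            store.write(chunk)
        except OSError as e:
            if e.errno != errno.ENOSPC:
                raise
            break  # store is full: stop instead of pretending the data fits
        accepted += chunk
    return accepted
```

Calling `fill(ToyStore(100 * MB), 144 * MB, 4 * MB)` accepts exactly 100 MB and rejects the rest, which is the outcome a test should expect when it tries to write more than the drive can hold.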

#12