Bug #10762
mon: osd gets marked down twice
Description
- osd has intermittent network issue
- gets marked down
- network fixed
- osd comes back up
- mon election (or something else that puts load on the mons)
- osd marked down again
- osd comes back
The bug is that preprocess_failure checks is_down() but doesn't verify that the reporting epoch is after get_up_from, so a stale failure report from before the osd came back is applied again, leading to duplicate mark-downs.
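The missing check can be illustrated with a minimal, self-contained sketch. This is not the actual monitor code: `SimpleOsdMap`, `OsdState`, `FailureReport` and `should_ignore_failure` are hypothetical stand-ins, and only the names `is_down` and `get_up_from` come from the description above. Treating a report whose epoch equals up_from as current is also an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

using epoch_t = std::uint32_t;

// Hypothetical per-OSD state: whether it is up, and the map epoch at
// which its current instance came up.
struct OsdState {
  bool up = false;
  epoch_t up_from = 0;
};

// Hypothetical, heavily simplified stand-in for the monitor's OSD map.
struct SimpleOsdMap {
  std::map<int, OsdState> osds;

  // Unknown OSDs are treated as down.
  bool is_down(int osd) const {
    auto it = osds.find(osd);
    return it == osds.end() || !it->second.up;
  }

  // Epoch at which the current instance came up (0 if unknown).
  epoch_t get_up_from(int osd) const {
    auto it = osds.find(osd);
    return it == osds.end() ? 0 : it->second.up_from;
  }
};

// Hypothetical failure report: which OSD is accused, and the map epoch
// the reporter had when it generated the report.
struct FailureReport {
  int target_osd;
  epoch_t epoch;
};

// Returns true when the report should be dropped without marking the
// target down.
bool should_ignore_failure(const SimpleOsdMap& map, const FailureReport& r) {
  // Check that already existed per the description: target is down.
  if (map.is_down(r.target_osd))
    return true;
  // Added check: the report predates the current instance of the OSD,
  // so it describes an earlier instance that was already marked down.
  if (r.epoch < map.get_up_from(r.target_osd))
    return true;
  return false;
}
```

With an osd that came back up at epoch 20, a report generated at epoch 15 is ignored in this sketch, while one generated at epoch 25 is still processed.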
Associated revisions
mon: ignore osd failures from before up_from
If the failure was generated for an instance of the OSD prior to when
it came up, ignore it.
This probably causes a fair bit of unnecessary flapping in the wild...
Backport: giant, firefly
Fixes: #10762
Reported-by: Dan van der Ster <dan@vanderster.com>
Signed-off-by: Sage Weil <sage@redhat.com>
mon: ignore osd failures from before up_from
If the failure was generated for an instance of the OSD prior to when
it came up, ignore it.
This probably causes a fair bit of unnecessary flapping in the wild...
Backport: giant, firefly
Fixes: #10762
Reported-by: Dan van der Ster <dan@vanderster.com>
Signed-off-by: Sage Weil <sage@redhat.com>
(cherry picked from commit 400ac237d35d0d1d53f240fea87e8483c0e2a7f5)
History
#2 Updated by Sage Weil about 9 years ago
- Status changed from New to Fix Under Review
#3 Updated by Sage Weil about 9 years ago
- Status changed from Fix Under Review to Pending Backport
#4 Updated by Loïc Dachary about 9 years ago
#5 Updated by Loïc Dachary about 9 years ago
- firefly backport https://github.com/ceph/ceph/pull/3937
#6 Updated by Loïc Dachary about 9 years ago
- giant backport https://github.com/ceph/ceph/pull/4047
#7 Updated by Loïc Dachary about 9 years ago
#8 Updated by Loïc Dachary about 9 years ago
- Status changed from Pending Backport to Resolved