Bug #53327
closed
osd: osd_fast_shutdown_notify_mon not quite right and enable osd_fast_shutdown_notify_mon by default
Added by Sage Weil over 2 years ago.
Updated almost 2 years ago.
Category:
Correctness/Safety
Backport:
pacific,quincy,octopus
Description
- it should send MOSDMarkMeDead not MarkMeDown
- we must confirm that we set a flag (preparing to stop?) that makes the OSD drop all messages coming in, so that we can be treated as really dead.
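The two points above can be illustrated with a toy model. This is only a sketch of the intended ordering, not Ceph code: the class `ToyOSD`, the `stopping` flag, and the method names are hypothetical, and the only name taken from the description is the message type `MOSDMarkMeDead`.

```python
from dataclasses import dataclass, field


@dataclass
class ToyOSD:
    """Toy stand-in for an OSD; hypothetical, not the Ceph implementation."""
    osd_id: int
    stopping: bool = False  # hypothetical "preparing to stop" flag
    sent_to_mon: list = field(default_factory=list)
    handled: list = field(default_factory=list)

    def handle_message(self, msg: str) -> bool:
        """Process an incoming message unless shutdown has begun."""
        if self.stopping:
            return False  # dropped, so peers get no reply from this OSD
        self.handled.append(msg)
        return True

    def fast_shutdown(self, notify_mon: bool = True) -> None:
        """Set the drop flag first, then record a dead notification."""
        self.stopping = True
        if notify_mon:
            self.sent_to_mon.append(("MOSDMarkMeDead", self.osd_id))


osd = ToyOSD(osd_id=3)
osd.handle_message("op-1")                    # handled normally
osd.fast_shutdown()                           # flag set, mon notified
accepted_after = osd.handle_message("op-2")   # dropped after shutdown began
```

Setting the flag before sending the notification means that no message is answered after the OSD has declared itself dead.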
Hi Sage,
Is there any update?
- Assignee changed from Sage Weil to Nitzan Mordechai
- Status changed from In Progress to New
- Backport changed from pacific to pacific,quincy
- Status changed from New to In Progress
- Status changed from In Progress to Fix Under Review
Hi Nitzan,
I checked your patch on the current pacific branch.
Unfortunately, I still get slow ops (IO blocked for >= 5 seconds) after stopping all OSDs on one host (systemctl stop ceph-osd.target).
In the OSD log I see the message that the OSD sends the dead notification to the mon, but in ceph.log I see the "marked itself dead" messages for only some of the OSDs. The down messages are there for all affected OSDs.
I hope you can have a further look into this.
Thanks
Manuel
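A check like the one described in this comment could be scripted roughly as follows. This is a minimal sketch: the assumed line shape ("osd.<id> marked itself down" or "... dead", optionally joined by "and") and the sample lines are illustrative assumptions, not verified cluster output, so the pattern may need adjusting to the actual ceph.log format.

```python
import re

# Assumed line shape: "... osd.<id> marked itself <down|dead> [and <down|dead>] ..."
PATTERN = re.compile(
    r"\bosd\.(\d+) marked itself (down|dead)(?: and (down|dead))?"
)


def osds_missing_dead(lines):
    """Return sorted ids of OSDs logged as down but never as dead."""
    seen = {"down": set(), "dead": set()}
    for line in lines:
        for osd_id, first, second in PATTERN.findall(line):
            for state in (first, second):
                if state:  # second is empty when there is no "and ..." part
                    seen[state].add(int(osd_id))
    return sorted(seen["down"] - seen["dead"])


# Made-up sample lines for illustration only.
sample = [
    "cluster [INF] osd.1 marked itself down",
    "cluster [INF] osd.1 marked itself dead as of e100",
    "cluster [INF] osd.2 marked itself down",
]
missing = osds_missing_dead(sample)
```

To run it against a real log, pass the open file instead of `sample`, for example `osds_missing_dead(open(path))` with `path` pointing at the cluster log.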
- Pull request ID set to 44807
- Backport changed from pacific,quincy to pacific,quincy,octopus
- Status changed from Fix Under Review to Pending Backport
- Copied to Backport #55073: pacific: osd: osd_fast_shutdown_notify_mon not quite right added
- Copied to Backport #55074: octopus: osd: osd_fast_shutdown_notify_mon not quite right added
- Copied to Backport #55075: quincy: osd: osd_fast_shutdown_notify_mon not quite right added
- Subject changed from osd: osd_fast_shutdown_notify_mon not quite right to osd: osd_fast_shutdown_notify_mon not quite right and enable osd_fast_shutdown_notify_mon by default
- Has duplicate Bug #53328: osd_fast_shutdown_notify_mon option should be true by default added
octopus: osd/OSD: osd_fast_shutdown_notify_mon not quite right #45655
https://github.com/ceph/ceph/pull/45655/commits
osd/OSD: osd_fast_shutdown_notify_mon not quite right
osd: make osd_fast_shutdown_notify_mon option true by default
Are there any problems with the backport of these two patches to octopus?
Why have they not been merged into octopus yet?
Thanks!
Manuel Lausch wrote:
Hi Nitzan,
I checked your patch on the current pacific branch.
Unfortunately, I still get slow ops (IO blocked for >= 5 seconds) after stopping all OSDs on one host (systemctl stop ceph-osd.target).
In the OSD log I see the message that the OSD sends the dead notification to the mon, but in ceph.log I see the "marked itself dead" messages for only some of the OSDs. The down messages are there for all affected OSDs.
I hope you can have a further look into this.
Thanks
Manuel
Please follow this commit of mine:
https://github.com/ceph/ceph/pull/46273
https://tracker.ceph.com/issues/55665
- Status changed from Pending Backport to Resolved