Bug #18635

systemd restarts Ceph Mon too quickly after failing to start

Added by Wido den Hollander 3 months ago. Updated 18 days ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: Monitor
Target version: -
Start date: 01/23/2017
Due date:
% Done: 0%
Source:
Tags: systemd,restartsec,ceph-mon,ipv6,mon
Backport: jewel,kraken
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Release:
Needs Doc: No

Description

In the systemd unit file of the ceph-mon service, RestartSec is not set, so it defaults to 100ms.

On systems where IPv6 connectivity is still waiting for Duplicate Address Detection (DAD) or Router Advertisements (RA) to complete, the Monitor might fail to start because the address it wants to bind to is not yet available.

A higher RestartSec value mitigates this by giving the host time to bring the network up properly before systemd restarts the Monitor.
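
As a rough illustration of the mitigation, the fragment below sketches a local drop-in override that raises the restart delay. The 10s delay and the limit of 5 attempts are taken from the associated commit message; the drop-in file name is hypothetical, and using StartLimitBurst for the attempt limit is an assumption, since this report only names RestartSec. It also assumes the unit already sets a Restart= policy.

```ini
# Hypothetical drop-in file:
#   /etc/systemd/system/ceph-mon@.service.d/restart-delay.conf
[Service]
# Wait 10 seconds between restart attempts instead of the 100ms default,
# so DAD or router advertisements have time to complete.
RestartSec=10
# Assumed directive for the attempt limit: allow up to 5 starts per
# rate-limit interval. Newer systemd releases document this setting
# under [Unit] rather than [Service].
StartLimitBurst=5
```

After adding such a file, `systemctl daemon-reload` makes systemd pick it up. The values are a starting point rather than a recommendation; a host with slower network bring-up may need a longer delay.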


Related issues

Copied to Backport #18720: jewel: systemd restarts Ceph Mon too quickly after failing to start Resolved
Copied to Backport #18721: kraken: systemd restarts Ceph Mon too quickly after failing to start Resolved

Associated revisions

Revision e73eb8cc (diff)
Added by Wido den Hollander 3 months ago

systemd: Restart Mon after 10s in case of failure

In some situations the IP address the Monitor wants to bind to
might not be available yet.

This might, for example, be an IPv6 address which is still performing
DAD or waiting for a Router Advertisement to be sent by the router(s).

Have systemd wait for 10s before restarting the Mon and increase the
number of restart attempts to 5.

This gives the system time to bring up its IP addresses while
systemd delays restarting the Mon.

Fixes: #18635

Signed-off-by: Wido den Hollander <>

Revision 20e75023 (diff)
Added by Wido den Hollander 3 months ago

systemd: Restart Mon after 10s in case of failure

Fixes: #18635

Signed-off-by: Wido den Hollander <>
(cherry picked from commit e73eb8cc1e0d45af1f0b7852c551f2ddfb82a520)

Revision bf3400f7 (diff)
Added by Wido den Hollander 3 months ago

systemd: Restart Mon after 10s in case of failure

Fixes: #18635

Signed-off-by: Wido den Hollander <>
(cherry picked from commit e73eb8cc1e0d45af1f0b7852c551f2ddfb82a520)

Revision 16b2fd00 (diff)
Added by Wido den Hollander 3 months ago

systemd: Restart Mon after 10s in case of failure

Fixes: #18635

Signed-off-by: Wido den Hollander <>
(cherry picked from commit e73eb8cc1e0d45af1f0b7852c551f2ddfb82a520)

History

#1 Updated by Loic Dachary 3 months ago

  • Backport set to jewel

#2 Updated by Loic Dachary 3 months ago

  • Status changed from New to Testing

#3 Updated by Sage Weil 3 months ago

  • Status changed from Testing to Pending Backport
  • Backport changed from jewel to jewel,kraken

#4 Updated by Nathan Cutler 3 months ago

  • Copied to Backport #18720: jewel: systemd restarts Ceph Mon too quickly after failing to start added

#5 Updated by Nathan Cutler 3 months ago

  • Copied to Backport #18721: kraken: systemd restarts Ceph Mon too quickly after failing to start added

#6 Updated by Nathan Cutler 18 days ago

  • Status changed from Pending Backport to Resolved
