Bug #57897: ceph mgr restart causes restart of all iscsi daemons in a loop - Orchestrator - Ceph

Actions

Copy link

Bug #57897

open

ceph mgr restart causes restart of all iscsi daemons in a loop

Added by Dan Poltawski over 1 year ago. Updated about 1 year ago.

Status:

New

Priority:

Normal

Assignee:

Category:

orchestrator

Target version:

% Done:

Source:

Tags:

Backport:

Regression:

Yes

Severity:

1 - critical

Reviewed:

Affected Versions:

Ceph - v17.2.4, Ceph - v17.2.5

ceph-qa-suite:

Pull request ID:

Crash signature (v1):

Crash signature (v2):

Description

We have observed that since v17.2.4, a restart of the active ceph mgr appears to cause all iSCSI daemons to restart and then continue to restart in a loop causing a complete loss of iSCSI service.

We are still working out the steps to reproduce in a test environment, but given the severity of impact I thought i'd create this issue. It feels likely to be a regression from https://tracker.ceph.com/issues/57527

Current steps to reproduce:

3 node cephadm adopted cluster
Deploy 3 iSCSI daemons
Run `ceph orch upgrade start --ceph-version 17.2.5`
(This may also happen on any active manager failover)

These logs indicate the cosntant reconfiguration:

2022-10-19 06:37:42    log_channel(cephadm) log [INF] : Reconfiguring iscsi.iscsi.node0.kwbtan (dependencies changed)...
2022-10-19 06:37:42    log_channel(cephadm) log [INF] : Reconfiguring daemon iscsi.iscsi.node0.kwbtan on node0.ma1.tnp.infra
2022-10-19 06:37:56    log_channel(cephadm) log [INF] : Reconfiguring iscsi.iscsi.node1.adffbc (dependencies changed)...
2022-10-19 06:37:56    log_channel(cephadm) log [INF] : Reconfiguring daemon iscsi.iscsi.node1.adffbc on node1.ma1.tnp.infra
2022-10-19 06:38:11    log_channel(cephadm) log [INF] : Reconfiguring iscsi.iscsi.node2.rogtfr (dependencies changed)...
2022-10-19 06:38:11    log_channel(cephadm) log [INF] : Reconfiguring daemon iscsi.iscsi.node2.rogtfr on node2.ma1.tnp.infra
2022-10-19 06:39:29    log_channel(cephadm) log [INF] : Reconfiguring iscsi.iscsi.node0.kwbtan (dependencies changed)...
2022-10-19 06:39:29    log_channel(cephadm) log [INF] : Reconfiguring daemon iscsi.iscsi.node0.kwbtan on node0.ma1.tnp.infra
2022-10-19 06:39:33    log_channel(cephadm) log [INF] : Reconfiguring iscsi.iscsi.node1.adffbc (dependencies changed)...
2022-10-19 06:39:34    log_channel(cephadm) log [INF] : Reconfiguring daemon iscsi.iscsi.node1.adffbc on node1.ma1.tnp.infra
2022-10-19 06:39:39    log_channel(cephadm) log [INF] : Reconfiguring iscsi.iscsi.node2.rogtfr (dependencies changed)...
2022-10-19 06:39:39    log_channel(cephadm) log [INF] : Reconfiguring daemon iscsi.iscsi.node2.rogtfr on node2.ma1.tnp.infra
2022-10-19 06:40:01    log_channel(cephadm) log [INF] : Reconfiguring iscsi.iscsi.node0.kwbtan (dependencies changed)...
[etc]

Actions

Copy link

Updated by David Heap over 1 year ago

We currently have this situation again after an mgr failover for node maintenance/reboots and managed enabled debug logging

It looks like the orchestrator keeps reconfiguring the service due to thinking the mgr ip has changed when it hasn't - based on the code in serve.py here https://github.com/ceph/ceph/blob/v17.2.5/src/pybind/mgr/cephadm/serve.py#L910 it looks like the ip address it's getting as `last_deps` isn't updating, so it keeps trying to reconfigure the ip each time:


Nov 01 18:35:51 node1 ceph-mon[3901]: iscsi.iscsi.node0.ldfwch deps ['172.16.1.4'] -> ['172.16.1.6']
Nov 01 18:35:51 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node0.ldfwch (dependencies changed)...
Nov 01 18:36:03 node1 ceph-mon[3901]: iscsi.iscsi.node2.lofqpp deps ['172.16.1.5'] -> ['172.16.1.6']
Nov 01 18:36:03 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node2.lofqpp (dependencies changed)...
Nov 01 18:36:06 node1 ceph-mon[3901]: iscsi.iscsi.node0.ldfwch deps ['172.16.1.4'] -> ['172.16.1.6']
Nov 01 18:36:06 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node0.ldfwch (dependencies changed)...
Nov 01 18:37:10 node1 ceph-mon[3901]: iscsi.iscsi.node2.lofqpp deps ['172.16.1.5'] -> ['172.16.1.6']
Nov 01 18:37:10 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node2.lofqpp (dependencies changed)...
Nov 01 18:37:15 node1 ceph-mon[3901]: iscsi.iscsi.node0.ldfwch deps ['172.16.1.4'] -> ['172.16.1.6']
Nov 01 18:37:15 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node0.ldfwch (dependencies changed)...
Nov 01 18:38:19 node1 ceph-mon[3901]: iscsi.iscsi.node2.lofqpp deps ['172.16.1.5'] -> ['172.16.1.6']
Nov 01 18:38:19 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node2.lofqpp (dependencies changed)...
Nov 01 18:38:25 node1 ceph-mon[3901]: iscsi.iscsi.node0.ldfwch deps ['172.16.1.4'] -> ['172.16.1.6']
Nov 01 18:38:25 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node0.ldfwch (dependencies changed)...
Nov 01 18:39:31 node1 ceph-mon[3901]: iscsi.iscsi.node2.lofqpp deps ['172.16.1.5'] -> ['172.16.1.6']
Nov 01 18:39:31 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node2.lofqpp (dependencies changed)...
Nov 01 18:39:35 node1 ceph-mon[3901]: iscsi.iscsi.node0.ldfwch deps ['172.16.1.4'] -> ['172.16.1.6']
Nov 01 18:39:35 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node0.ldfwch (dependencies changed)...

When we pause orchestration and check inside the container that's created, the trusted ip list looks to be appended with the correct mgr ip as expected in quincy (which i believe is the reason the reconfiguration takes place since https://github.com/ceph/ceph/commit/cda82c98a32f51cb392fc51ba854bcae409567f8 )


trusted_ip_list = 172.16.1.4,172.16.1.5,172.16.1.6,172.16.1.7,172.16.1.8,172.16.1.6

Actions

Copy link

Updated by David Heap over 1 year ago

Attempts to stop or redeploy the daemon don't work as they seem to invoke a dependency check, which then restarts the service again


Nov 02 12:18:04 dh-ceph01-test ceph-mon[2054]: Schedule stop daemon iscsi.iscsi.dh-ceph01-test.jrdndc
Nov 02 12:18:05 dh-ceph01-test ceph-mon[2054]: Reconfiguring iscsi.iscsi.dh-ceph01-test.jrdndc (dependencies changed)...
Nov 02 12:18:05 dh-ceph01-test ceph-mon[2054]: Reconfiguring daemon iscsi.iscsi.dh-ceph01-test.jrdndc on dh-ceph01-test
Nov 02 12:18:23 dh-ceph01-test ceph-mon[2054]: Schedule redeploy daemon iscsi.iscsi.dh-ceph01-test.jrdndc
Nov 02 12:18:23 dh-ceph01-test ceph-mon[2054]: Reconfiguring iscsi.iscsi.dh-ceph01-test.jrdndc (dependencies changed)...
Nov 02 12:18:23 dh-ceph01-test ceph-mon[2054]: Reconfiguring daemon iscsi.iscsi.dh-ceph01-test.jrdndc on dh-ceph01-test

Actions

Copy link

Updated by Redouane Kachach Elhichou over 1 year ago

In this case I think it would be helpful to see what's the actual content of the deps. To get this information, from a cephadm shell enable the debug level and grep for 'deps'. Something like:

ceph config set mgr mgr/cephadm/log_to_cluster_level debug; ceph -W cephadm --watch-debug | grep -e iscsi -e deps

Actions

Copy link

Updated by Adam King over 1 year ago

this is a painful one. @David at least until we have a fix for this, I will mention that setting the iscsi spec to unmanaged (https://docs.ceph.com/en/quincy/cephadm/services/#disabling-automatic-management-of-daemons) will stop cephadm from doing these dependency checks and will at least stop this from happening. The downside is it won't update the trusted ip list when it actually should (that's what it's trying to do here since the active mgr ip needs to be in that list), but theoretically just a "ceph orch redeploy <iscsi-service-name>" after the mgr failover happens should resolve it.

Actions

Copy link

Updated by David Heap over 1 year ago

Adam King wrote:

this is a painful one. @David at least until we have a fix for this, I will mention that setting the iscsi spec to unmanaged (https://docs.ceph.com/en/quincy/cephadm/services/#disabling-automatic-management-of-daemons) will stop cephadm from doing these dependency checks and will at least stop this from happening. The downside is it won't update the trusted ip list when it actually should (that's what it's trying to do here since the active mgr ip needs to be in that list), but theoretically just a "ceph orch redeploy <iscsi-service-name>" after the mgr failover happens should resolve it.

Thanks, we will look into setting it to unmanaged for now. We've currently got orchestration paused at a point where all the containers are up, which i guess is doing a similar thing in a more global way.

Our mgr IPs are in the initial trusted IP lists we configured when deploying the iscsi services, so the updates aren't really required in our clusters anyway.

Redouane Kachach Elhichou wrote:

In this case I think it would be helpful to see what's the actual content of the deps. To get this information, from a cephadm shell enable the debug level and grep for 'deps'.

It's the same IP address issue as above when attempting to stop or redeploy the service


2022-11-08T11:59:11.048572+0000 mgr.dh-ceph00-test [INF] Schedule stop daemon iscsi.iscsi.dh-ceph00-test.asunqg
2022-11-08T11:59:11.279916+0000 mgr.dh-ceph00-test [DBG] iscsi.iscsi.dh-ceph00-test.asunqg deps ['10.10.15.183'] -> ['10.10.15.182']
2022-11-08T11:59:11.280093+0000 mgr.dh-ceph00-test [INF] Reconfiguring iscsi.iscsi.dh-ceph00-test.asunqg (dependencies changed)...
2022-11-08T11:59:21.609811+0000 mgr.dh-ceph00-test [INF] Schedule redeploy daemon iscsi.iscsi.dh-ceph00-test.asunqg
2022-11-08T11:59:21.822075+0000 mgr.dh-ceph00-test [DBG] iscsi.iscsi.dh-ceph00-test.asunqg deps ['10.10.15.183'] -> ['10.10.15.182']
2022-11-08T11:59:21.822242+0000 mgr.dh-ceph00-test [INF] Reconfiguring iscsi.iscsi.dh-ceph00-test.asunqg (dependencies changed)...

Actions

Copy link

Updated by Redouane Kachach Elhichou over 1 year ago

I tried to reproduce the issue with the same setup but wasn't successful so far. I started from a cluster running v17.2.4 with iscsi devices and then upgraded to v17.2.5. Upgrade was smooth and I didn't see any issues.

Actions

Copy link

Updated by Mykola Golub over 1 year ago

David mentioned [1] as a potential cause when the issue was introduced. But actually I think it is [2]. And apart that there seem to be a bug that causes an issue in David's environment, I am not sure I like the approach very much. I.e. now on every mgr failover we are reconfiguring and restarting iscsi gateways because of the trusted_ip_list change. I think it could be improved if the list includes not only the active mgr ip, but all mgr ips.

[1] https://github.com/ceph/ceph/commit/cda82c98a32f51cb392fc51ba854bcae409567f8
[2] https://github.com/ceph/ceph/pull/48076/commits/7ad668158c76c498c08ca584a27d288de0ac1e3b

Actions

Copy link

Updated by David Heap over 1 year ago

Hi Mykola and Redouane

Thanks for looking into this - we initially thought the same regarding the commit in the [2] link (see Dan's opening entry above), but it doesn't look like that commit is in the code tagged in 17.2.5:

https://github.com/ceph/ceph/blob/main/src/pybind/mgr/cephadm/module.py#L2480
https://github.com/ceph/ceph/blob/v17.2.5/src/pybind/mgr/cephadm/module.py#L2362

nor is the new code in the containers running 17.2.5:


heapd@dh-ceph00-test:~$ sudo ceph version
ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)

heapd@dh-ceph00-test:~$ sudo podman ps | grep mgr
c85b4d1c7e76  quay.io/ceph/ceph@sha256:0560b16bec6e84345f29fb6693cd2430884e6efff16a95d5bdd0bb06d7661c45  -n mgr.dh-ceph00-...  9 days ago  Up 9 days ago  ...

heapd@dh-ceph00-test:~$ sudo podman exec c85b4d1c7e76 grep iscsi /usr/share/ceph/mgr/cephadm/module.py
from .services.iscsi import IscsiService
        elif daemon_type == 'iscsi':
                'iscsi': PlacementSpec(count=1),
    def apply_iscsi(self, spec: ServiceSpec) -> str:

Agree that it probably shouldn't be reconfiguring and restarting on every mgr failover if the pool of mgr IPs is staying the same

Actions

Copy link

Updated by Mykola Golub about 1 year ago

Hi David,

Although I was not able to reproduce your issue, I think the commit [1] I mentioned in my previous comment may fix your issue, as it changes how deps is calculated. This change will be in 17.2.6.

Also I created a ticket [2] and PR [3] to change the mgr cephadm module not to add the mgr ip in to the trusted_ip_list if it's already there. In your case (when you already have the mgr ips in the list) it should avoid restarting the iscsi gateway on mgr failover.

[1] https://github.com/ceph/ceph/pull/48076/commits/7ad668158c76c498c08ca584a27d288de0ac1e3b
[2] https://tracker.ceph.com/issues/58792
[3] https://github.com/ceph/ceph/pull/50167

Actions

Copy link

#10

Updated by David Heap about 1 year ago

Hi Mykola

Unfortunately the change in deps calculation in 17.2.6 didn't resolve the issue as the deps still change, but hopefully your deduplication PR will help once released - thanks for submitting that!

2023-04-12T16:15:29.180866+0100 mgr.dh-ceph01-test [DBG] Applying service iscsi.iscsi spec
2023-04-12T16:15:29.181770+0100 mgr.dh-ceph01-test [DBG] Combine hosts with existing daemons [<DaemonDescription>(iscsi.iscsi.dh-ceph00-test.asunqg)] + new hosts []
2023-04-12T16:15:29.326384+0100 mgr.dh-ceph01-test [DBG] iscsi.iscsi.dh-ceph00-test.asunqg deps ['10.10.15.181,10.10.15.182,10.10.15.183,10.10.15.184,10.10.15.185,10.10.15.182'] -> ['10.10.15.181,10.10.15.182,10.10.15.183,10.10.15.184,10.10.15.185,10.10.15.183']
2023-04-12T16:15:29.326585+0100 mgr.dh-ceph01-test [INF] Reconfiguring iscsi.iscsi.dh-ceph00-test.asunqg (dependencies changed)...
2023-04-12T16:15:29.336105+0100 mgr.dh-ceph01-test [INF] Reconfiguring daemon iscsi.iscsi.dh-ceph00-test.asunqg on dh-ceph00-test

Actions

Copy link

Also available in: Atom PDF

Project

General

Profile

Ceph » Orchestrator

Custom queries

Bug #57897

ceph mgr restart causes restart of all iscsi daemons in a loop

Updated by David Heap over 1 year ago

Updated by David Heap over 1 year ago

Updated by Redouane Kachach Elhichou over 1 year ago

Updated by Adam King over 1 year ago

Updated by David Heap over 1 year ago

Updated by Redouane Kachach Elhichou over 1 year ago

Updated by Mykola Golub over 1 year ago

Updated by David Heap over 1 year ago

Updated by Mykola Golub about 1 year ago

Updated by David Heap about 1 year ago