
Bug #57897

ceph mgr restart causes restart of all iscsi daemons in a loop

Added by Dan Poltawski 3 months ago. Updated 2 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
orchestrator
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
Yes
Severity:
1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We have observed that since v17.2.4, a restart of the active ceph mgr appears to cause all iSCSI daemons to restart and then continue restarting in a loop, causing a complete loss of iSCSI service.

We are still working out the steps to reproduce this in a test environment, but given the severity of the impact I thought I'd create this issue. It feels likely to be a regression from https://tracker.ceph.com/issues/57527

Current steps to reproduce:
  • 3 node cephadm adopted cluster
  • Deploy 3 iSCSI daemons
  • Run `ceph orch upgrade start --ceph-version 17.2.5`
  • (This may also happen on any active manager failover)

These logs show the constant reconfiguration:

2022-10-19 06:37:42    log_channel(cephadm) log [INF] : Reconfiguring iscsi.iscsi.node0.kwbtan (dependencies changed)...
2022-10-19 06:37:42    log_channel(cephadm) log [INF] : Reconfiguring daemon iscsi.iscsi.node0.kwbtan on node0.ma1.tnp.infra
2022-10-19 06:37:56    log_channel(cephadm) log [INF] : Reconfiguring iscsi.iscsi.node1.adffbc (dependencies changed)...
2022-10-19 06:37:56    log_channel(cephadm) log [INF] : Reconfiguring daemon iscsi.iscsi.node1.adffbc on node1.ma1.tnp.infra
2022-10-19 06:38:11    log_channel(cephadm) log [INF] : Reconfiguring iscsi.iscsi.node2.rogtfr (dependencies changed)...
2022-10-19 06:38:11    log_channel(cephadm) log [INF] : Reconfiguring daemon iscsi.iscsi.node2.rogtfr on node2.ma1.tnp.infra
2022-10-19 06:39:29    log_channel(cephadm) log [INF] : Reconfiguring iscsi.iscsi.node0.kwbtan (dependencies changed)...
2022-10-19 06:39:29    log_channel(cephadm) log [INF] : Reconfiguring daemon iscsi.iscsi.node0.kwbtan on node0.ma1.tnp.infra
2022-10-19 06:39:33    log_channel(cephadm) log [INF] : Reconfiguring iscsi.iscsi.node1.adffbc (dependencies changed)...
2022-10-19 06:39:34    log_channel(cephadm) log [INF] : Reconfiguring daemon iscsi.iscsi.node1.adffbc on node1.ma1.tnp.infra
2022-10-19 06:39:39    log_channel(cephadm) log [INF] : Reconfiguring iscsi.iscsi.node2.rogtfr (dependencies changed)...
2022-10-19 06:39:39    log_channel(cephadm) log [INF] : Reconfiguring daemon iscsi.iscsi.node2.rogtfr on node2.ma1.tnp.infra
2022-10-19 06:40:01    log_channel(cephadm) log [INF] : Reconfiguring iscsi.iscsi.node0.kwbtan (dependencies changed)...
[etc]

History

#1 Updated by David Heap 3 months ago

We currently have this situation again after an mgr failover for node maintenance/reboots, and have managed to enable debug logging.

It looks like the orchestrator keeps reconfiguring the service because it thinks the mgr IP has changed when it hasn't. Based on the code in serve.py here https://github.com/ceph/ceph/blob/v17.2.5/src/pybind/mgr/cephadm/serve.py#L910 it looks like the IP address it gets as `last_deps` is never updated, so it keeps trying to reconfigure the daemon each time (a simplified sketch of this check follows the log excerpt below):


Nov 01 18:35:51 node1 ceph-mon[3901]: iscsi.iscsi.node0.ldfwch deps ['172.16.1.4'] -> ['172.16.1.6']
Nov 01 18:35:51 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node0.ldfwch (dependencies changed)...
Nov 01 18:36:03 node1 ceph-mon[3901]: iscsi.iscsi.node2.lofqpp deps ['172.16.1.5'] -> ['172.16.1.6']
Nov 01 18:36:03 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node2.lofqpp (dependencies changed)...
Nov 01 18:36:06 node1 ceph-mon[3901]: iscsi.iscsi.node0.ldfwch deps ['172.16.1.4'] -> ['172.16.1.6']
Nov 01 18:36:06 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node0.ldfwch (dependencies changed)...
Nov 01 18:37:10 node1 ceph-mon[3901]: iscsi.iscsi.node2.lofqpp deps ['172.16.1.5'] -> ['172.16.1.6']
Nov 01 18:37:10 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node2.lofqpp (dependencies changed)...
Nov 01 18:37:15 node1 ceph-mon[3901]: iscsi.iscsi.node0.ldfwch deps ['172.16.1.4'] -> ['172.16.1.6']
Nov 01 18:37:15 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node0.ldfwch (dependencies changed)...
Nov 01 18:38:19 node1 ceph-mon[3901]: iscsi.iscsi.node2.lofqpp deps ['172.16.1.5'] -> ['172.16.1.6']
Nov 01 18:38:19 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node2.lofqpp (dependencies changed)...
Nov 01 18:38:25 node1 ceph-mon[3901]: iscsi.iscsi.node0.ldfwch deps ['172.16.1.4'] -> ['172.16.1.6']
Nov 01 18:38:25 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node0.ldfwch (dependencies changed)...
Nov 01 18:39:31 node1 ceph-mon[3901]: iscsi.iscsi.node2.lofqpp deps ['172.16.1.5'] -> ['172.16.1.6']
Nov 01 18:39:31 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node2.lofqpp (dependencies changed)...
Nov 01 18:39:35 node1 ceph-mon[3901]: iscsi.iscsi.node0.ldfwch deps ['172.16.1.4'] -> ['172.16.1.6']
Nov 01 18:39:35 node1 ceph-mon[3901]: Reconfiguring iscsi.iscsi.node0.ldfwch (dependencies changed)...
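
For reference, that dependency check boils down to roughly the following (a simplified paraphrase with illustrative names, not the exact upstream code); if the cached last_deps entry is never refreshed after the reconfigure, every pass of the serve loop sees the same mismatch:

# Simplified paraphrase of the dependency check around serve.py L910
# (v17.2.5); names are illustrative, not the exact upstream identifiers.
def needs_reconfig(name: str, last_deps: list, new_deps: list) -> bool:
    if last_deps != new_deps:
        print(f'{name} deps {last_deps} -> {new_deps}')
        print(f'Reconfiguring {name} (dependencies changed)...')
        return True
    return False

# If the cached entry is not updated after the reconfigure, the serve
# loop hits the same mismatch on every pass:
cached = ['172.16.1.4']   # stale deps recorded before the failover
current = ['172.16.1.6']  # deps computed from the new active mgr IP
for _ in range(3):
    needs_reconfig('iscsi.iscsi.node0.ldfwch', cached, current)
    # bug symptom: cached never becomes current, so this repeats forever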

When we pause orchestration and check inside the container that gets created, the trusted IP list does appear to have the correct mgr IP appended, as expected in Quincy (which I believe is the reason the reconfiguration takes place, since https://github.com/ceph/ceph/commit/cda82c98a32f51cb392fc51ba854bcae409567f8 ):


trusted_ip_list = 172.16.1.4,172.16.1.5,172.16.1.6,172.16.1.7,172.16.1.8,172.16.1.6
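
The duplicate 172.16.1.6 at the end is consistent with the active mgr IP being appended unconditionally to the user-configured list. A minimal sketch of what that appending amounts to (assuming the behaviour of the commit linked above; the function name is illustrative, not the upstream code):

# Illustrative paraphrase of the trusted_ip_list handling; not the exact
# upstream code. The active mgr IP is appended to the user-configured list.
def build_trusted_ip_list(spec_list: str, active_mgr_ip: str) -> str:
    ips = [ip.strip() for ip in spec_list.split(',')] if spec_list else []
    ips.append(active_mgr_ip)  # appended unconditionally, hence the duplicate
    return ','.join(ips)

print(build_trusted_ip_list('172.16.1.4,172.16.1.5,172.16.1.6,172.16.1.7,172.16.1.8',
                            '172.16.1.6'))
# -> 172.16.1.4,172.16.1.5,172.16.1.6,172.16.1.7,172.16.1.8,172.16.1.6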

#2 Updated by David Heap 3 months ago

Attempts to stop or redeploy the daemon don't work, as they seem to invoke a dependency check, which then restarts the service again:


Nov 02 12:18:04 dh-ceph01-test ceph-mon[2054]: Schedule stop daemon iscsi.iscsi.dh-ceph01-test.jrdndc
Nov 02 12:18:05 dh-ceph01-test ceph-mon[2054]: Reconfiguring iscsi.iscsi.dh-ceph01-test.jrdndc (dependencies changed)...
Nov 02 12:18:05 dh-ceph01-test ceph-mon[2054]: Reconfiguring daemon iscsi.iscsi.dh-ceph01-test.jrdndc on dh-ceph01-test
Nov 02 12:18:23 dh-ceph01-test ceph-mon[2054]: Schedule redeploy daemon iscsi.iscsi.dh-ceph01-test.jrdndc
Nov 02 12:18:23 dh-ceph01-test ceph-mon[2054]: Reconfiguring iscsi.iscsi.dh-ceph01-test.jrdndc (dependencies changed)...
Nov 02 12:18:23 dh-ceph01-test ceph-mon[2054]: Reconfiguring daemon iscsi.iscsi.dh-ceph01-test.jrdndc on dh-ceph01-test

#3 Updated by Redouane Kachach Elhichou 3 months ago

In this case I think it would be helpful to see the actual contents of the deps. To get this information, enable debug logging from a cephadm shell and grep for 'deps'. Something like:

ceph config set mgr mgr/cephadm/log_to_cluster_level debug; ceph -W cephadm --watch-debug | grep -e iscsi -e deps

#4 Updated by Adam King 3 months ago

This is a painful one. @David, at least until we have a fix for this, I will mention that setting the iscsi spec to unmanaged (https://docs.ceph.com/en/quincy/cephadm/services/#disabling-automatic-management-of-daemons) will stop cephadm from doing these dependency checks, and will at least stop this from happening. The downside is that it won't update the trusted ip list when it actually should (that's what it's trying to do here, since the active mgr IP needs to be in that list), but theoretically just a "ceph orch redeploy <iscsi-service-name>" after the mgr failover happens should resolve it.
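
For anyone hitting this, the workaround would look roughly like the following (the spec file name is illustrative, and the service name iscsi.iscsi is assumed from the daemon names in the logs above):

# Add "unmanaged: true" to the existing iscsi service spec, per the
# linked docs, then apply it (file name illustrative):
#
#   service_type: iscsi
#   service_id: iscsi
#   unmanaged: true
#   ...rest of the existing spec unchanged...
#
ceph orch apply -i iscsi-spec.yaml

# After each mgr failover, manually refresh the trusted ip list:
ceph orch redeploy iscsi.iscsi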

#5 Updated by David Heap 3 months ago

Adam King wrote:

This is a painful one. @David, at least until we have a fix for this, I will mention that setting the iscsi spec to unmanaged (https://docs.ceph.com/en/quincy/cephadm/services/#disabling-automatic-management-of-daemons) will stop cephadm from doing these dependency checks, and will at least stop this from happening. The downside is that it won't update the trusted ip list when it actually should (that's what it's trying to do here, since the active mgr IP needs to be in that list), but theoretically just a "ceph orch redeploy <iscsi-service-name>" after the mgr failover happens should resolve it.

Thanks, we will look into setting it to unmanaged for now. We've currently got orchestration paused at a point where all the containers are up, which I guess does a similar thing in a more global way.

Our mgr IPs are in the initial trusted IP lists we configured when deploying the iscsi services, so the updates aren't really required in our clusters anyway.

Redouane Kachach Elhichou wrote:

In this case I think it would be helpful to see the actual contents of the deps. To get this information, enable debug logging from a cephadm shell and grep for 'deps'.

It's the same IP address issue as above when attempting to stop or redeploy the service:


2022-11-08T11:59:11.048572+0000 mgr.dh-ceph00-test [INF] Schedule stop daemon iscsi.iscsi.dh-ceph00-test.asunqg
2022-11-08T11:59:11.279916+0000 mgr.dh-ceph00-test [DBG] iscsi.iscsi.dh-ceph00-test.asunqg deps ['10.10.15.183'] -> ['10.10.15.182']
2022-11-08T11:59:11.280093+0000 mgr.dh-ceph00-test [INF] Reconfiguring iscsi.iscsi.dh-ceph00-test.asunqg (dependencies changed)...
2022-11-08T11:59:21.609811+0000 mgr.dh-ceph00-test [INF] Schedule redeploy daemon iscsi.iscsi.dh-ceph00-test.asunqg
2022-11-08T11:59:21.822075+0000 mgr.dh-ceph00-test [DBG] iscsi.iscsi.dh-ceph00-test.asunqg deps ['10.10.15.183'] -> ['10.10.15.182']
2022-11-08T11:59:21.822242+0000 mgr.dh-ceph00-test [INF] Reconfiguring iscsi.iscsi.dh-ceph00-test.asunqg (dependencies changed)...

#6 Updated by Redouane Kachach Elhichou 3 months ago

I tried to reproduce the issue with the same setup but haven't been successful so far. I started from a cluster running v17.2.4 with iscsi devices and then upgraded to v17.2.5. The upgrade was smooth and I didn't see any issues.

#7 Updated by Mykola Golub 2 months ago

David mentioned [1] as a potential cause of when the issue was introduced, but actually I think it is [2]. And apart from the fact that there seems to be a bug causing the issue in David's environment, I am not sure I like the approach very much: now, on every mgr failover, we reconfigure and restart the iscsi gateways because of the trusted_ip_list change. I think it could be improved if the list included not only the active mgr IP, but all mgr IPs (a rough sketch follows the links below).

[1] https://github.com/ceph/ceph/commit/cda82c98a32f51cb392fc51ba854bcae409567f8
[2] https://github.com/ceph/ceph/pull/48076/commits/7ad668158c76c498c08ca584a27d288de0ac1e3b
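
A minimal sketch of that idea (hypothetical, with illustrative names; not upstream code): compute the deps from the union of the spec's trusted IPs and all mgr IPs, with a stable ordering, so a failover no longer changes the result:

# Hypothetical sketch: build iscsi deps from all mgr IPs rather than only
# the active one; identifiers are illustrative, not upstream code.
def calc_iscsi_deps(spec_ips: list, all_mgr_ips: list) -> list:
    # union + stable sort: the result is identical no matter which mgr is
    # active, so the last_deps comparison becomes a no-op on failover
    return sorted(set(spec_ips) | set(all_mgr_ips))

print(calc_iscsi_deps(['172.16.1.4', '172.16.1.5'],
                      ['172.16.1.4', '172.16.1.5', '172.16.1.6']))
# -> ['172.16.1.4', '172.16.1.5', '172.16.1.6'] before and after failover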

#8 Updated by David Heap 2 months ago

Hi Mykola and Redouane,

Thanks for looking into this. We initially thought the same regarding the commit in the [2] link (see Dan's opening entry above), but it doesn't look like that commit is in the code tagged v17.2.5:

https://github.com/ceph/ceph/blob/main/src/pybind/mgr/cephadm/module.py#L2480
https://github.com/ceph/ceph/blob/v17.2.5/src/pybind/mgr/cephadm/module.py#L2362

nor is the new code in the containers running 17.2.5:


heapd@dh-ceph00-test:~$ sudo ceph version
ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)

heapd@dh-ceph00-test:~$ sudo podman ps | grep mgr
c85b4d1c7e76  quay.io/ceph/ceph@sha256:0560b16bec6e84345f29fb6693cd2430884e6efff16a95d5bdd0bb06d7661c45  -n mgr.dh-ceph00-...  9 days ago  Up 9 days ago  ...

heapd@dh-ceph00-test:~$ sudo podman exec c85b4d1c7e76 grep iscsi /usr/share/ceph/mgr/cephadm/module.py
from .services.iscsi import IscsiService
        elif daemon_type == 'iscsi':
                'iscsi': PlacementSpec(count=1),
    def apply_iscsi(self, spec: ServiceSpec) -> str:

Agreed that it probably shouldn't be reconfiguring and restarting on every mgr failover if the pool of mgr IPs stays the same.
