Bug #53693
ceph orch upgrade start is getting stuck in gibba cluster
Status: Closed
% Done: 0%
Regression: No
Severity: 3 - minor
Description
- The current Ceph version:
# ceph versions
{
    "mon": {
        "ceph version 17.0.0-9475-g8ea352e9 (8ea352e994feffca1bfd357a20c491df01db91a9) quincy (dev)": 5
    },
    "mgr": {
        "ceph version 17.0.0-9475-g8ea352e9 (8ea352e994feffca1bfd357a20c491df01db91a9) quincy (dev)": 2
    },
    "osd": {
        "ceph version 17.0.0-9475-g8ea352e9 (8ea352e994feffca1bfd357a20c491df01db91a9) quincy (dev)": 970
    },
    "mds": {
        "ceph version 17.0.0-9475-g8ea352e9 (8ea352e994feffca1bfd357a20c491df01db91a9) quincy (dev)": 2
    },
    "overall": {
        "ceph version 17.0.0-9475-g8ea352e9 (8ea352e994feffca1bfd357a20c491df01db91a9) quincy (dev)": 979
    }
}
- The version we were trying to upgrade to:
{
    "needs_update": {
        "crash.gibba001": {
            "current_id": "f79fcb826d512859ef4914712095ea7ee02622fc213f5c39ab7b2ec468965efd",
            "current_name": "quay.ceph.io/ceph-ci/ceph@sha256:14b1ea54031bea23a37c589a02be794dca9c5a0807116ffef655bea631f9a62e",
            "current_version": "17.0.0-9475-g8ea352e9"
        },
        "crash.gibba002": {
            "current_id": "f79fcb826d512859ef4914712095ea7ee02622fc213f5c39ab7b2ec468965efd",
            "current_name": "quay.ceph.io/ceph-ci/ceph@sha256:14b1ea54031bea23a37c589a02be794dca9c5a0807116ffef655bea631f9a62e",
            "current_version": "17.0.0-9475-g8ea352e9"
        },
        ........
    },
    "target_digest": "quay.ceph.io/ceph-ci/ceph@sha256:465e18548d5a9e1155bd093dfaa894e3cbc8f5b2e5a3d22b22c73a7979664155",
    "target_id": "9081735aa97cbfd10601ab1fc5fcaed6c8b41c2b22517b73c297aab304e5ffdd",
    "target_name": "quay.ceph.io/ceph-ci/ceph:4ff723061fc15c803dcf6556d02f56bdf56de5fa",
    "target_version": "ceph version 17.0.0-9718-g4ff72306 (4ff723061fc15c803dcf6556d02f56bdf56de5fa) quincy (dev)",
    "up_to_date": []
}
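The needs_update/target_* JSON above has the shape of the orchestrator's upgrade check output. As a sketch (standard cephadm CLI, assuming the same target image), the check can be re-run at any time to confirm the target image is pullable and to list which daemons still need updating:

```shell
# Pull the target image and report which daemons would be updated by it
ceph orch upgrade check --image quay.ceph.io/ceph-ci/ceph:4ff723061fc15c803dcf6556d02f56bdf56de5fa
```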
- Upgrade start and status, with debug logging enabled:
[root@gibba001 ~]# ceph config set mgr mgr/cephadm/log_level debug
[root@gibba001 ~]# ceph orch upgrade start --image quay.ceph.io/ceph-ci/ceph:4ff723061fc15c803dcf6556d02f56bdf56de5fa
Initiating upgrade to quay.ceph.io/ceph-ci/ceph:4ff723061fc15c803dcf6556d02f56bdf56de5fa
[root@gibba001 ~]# ceph orch upgrade status
{
    "target_image": "quay.ceph.io/ceph-ci/ceph:4ff723061fc15c803dcf6556d02f56bdf56de5fa",
    "in_progress": true,
    "services_complete": [],
    "progress": "",
}
- Ceph mgr logs:
2021-12-21T21:34:47.490+0000 7fd28c7cb700 0 log_channel(audit) log [DBG] : from='client.17948814 -' entity='client.admin' cmd=[{"prefix": "orch upgrade start", "image": "quay.ceph.io/ceph-ci/ceph:4ff723061fc15c803dcf6556d02f56bdf56de5fa", "target": ["mon-mgr", ""]}]: dispatch
2021-12-21T21:34:47.492+0000 7fd28cfcc700 0 [cephadm INFO root] Upgrade: Started with target quay.ceph.io/ceph-ci/ceph:4ff723061fc15c803dcf6556d02f56bdf56de5fa
2021-12-21T21:34:47.492+0000 7fd28cfcc700 0 log_channel(cephadm) log [INF] : Upgrade: Started with target quay.ceph.io/ceph-ci/ceph:4ff723061fc15c803dcf6556d02f56bdf56de5fa
2021-12-21T21:34:47.492+0000 7fd28cfcc700 0 [progress INFO root] update: starting ev 668dc33f-3fca-4bd9-9ca7-0b926137fd71 (Upgrade to quay.ceph.io/ceph-ci/ceph:4ff723061fc15c803dcf6556d02f56bdf56de5fa)
- The debug logs contain only the following entries, none of which appear to be related to the upgrade:
2021-12-21T21:34:48.142+0000 7fd286f00700 0 [progress INFO root] Processing OSDMap change 44389..44389
2021-12-21T21:34:50.813+0000 7fd25c16f700 0 [cephadm DEBUG root] Refreshed host gibba026 daemons (28)
2021-12-21T21:34:50.819+0000 7fd25d171700 0 [cephadm DEBUG root] Refreshed host gibba027 daemons (28)
2021-12-21T21:34:50.847+0000 7fd28b7c9700 0 log_channel(cluster) log [DBG] : pgmap v98: 65553 pgs: 1 active+clean+scrubbing+deep, 65552 active+clean; 992 GiB data, 4.1 TiB used, 8.8 TiB / 13 TiB avail
2021-12-21T21:34:51.075+0000 7fd25d171700 0 [cephadm DEBUG root] Received up-to-date metadata from agent on host gibba027.
2021-12-21T21:34:51.077+0000 7fd25c16f700 0 [cephadm DEBUG root] Received up-to-date metadata from agent on host gibba026.
2021-12-21T21:34:51.083+0000 7fd25a96c700 0 [cephadm DEBUG root] Refreshed host gibba023 daemons (28)
2021-12-21T21:34:51.237+0000 7fd25a96c700 0 [cephadm DEBUG root] Received up-to-date metadata from agent on host gibba023.
2021-12-21T21:34:51.483+0000 7fd25996a700 0 [cephadm DEBUG root] Refreshed host gibba030 daemons (28)
2021-12-21T21:34:51.585+0000 7fd25a16b700 0 [cephadm DEBUG root] Refreshed host gibba029 daemons (28)
2021-12-21T21:34:51.604+0000 7fd25996a700 0 [cephadm DEBUG root] Received up-to-date metadata from agent on host gibba030.
2021-12-21T21:34:51.696+0000 7fd25a16b700 0 [cephadm DEBUG root] Received up-to-date metadata from agent on host gibba029.
2021-12-21T21:34:51.841+0000 7fd25d972700 0 [cephadm DEBUG root] Refreshed host gibba031 daemons (28)
2021-12-21T21:34:51.954+0000 7fd259169700 0 [cephadm DEBUG root] Refreshed host gibba032 daemons (28)
2021-12-21T21:34:51.996+0000 7fd25d972700 0 [cephadm DEBUG root] Received up-to-date metadata from agent on host gibba031.
2021-12-21T21:34:52.087+0000 7fd259169700 0 [cephadm DEBUG root] Received up-to-date metadata from agent on host gibba032.
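Beyond the mgr log file, cephadm's debug messages can also be sent to and read back from the cluster log channel, which can help confirm whether the upgrade loop is emitting anything at all. A minimal sketch using standard cephadm commands (the log_level setting repeats what was already done above):

```shell
# Route cephadm debug output to the cluster log channel
ceph config set mgr mgr/cephadm/log_level debug
ceph config set mgr mgr/cephadm/log_to_cluster_level debug

# Read back the most recent 100 debug-level cephadm messages
ceph log last 100 debug cephadm
```

If the upgrade loop were making progress, entries mentioning "Upgrade:" would be expected here alongside the host-refresh messages.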
- Ceph status:
# ceph -s
  cluster:
    id:     182eef00-53b5-11ec-84d3-3cecef3d8fb8
    health: HEALTH_OK

  services:
    mon: 5 daemons, quorum gibba001,gibba002,gibba004,gibba005,gibba006 (age 3h)
    mgr: gibba001.zptzqf(active, since 7m), standbys: gibba002.veobjs
    mds: 1/1 daemons up, 1 standby
    osd: 1073 osds: 970 up (since 14m), 970 in (since 23h)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 65553 pgs
    objects: 230.34M objects, 992 GiB
    usage:   4.1 TiB used, 8.8 TiB / 13 TiB avail
    pgs:     65553 active+clean

  progress:
    Upgrade to quay.ceph.io/ceph-ci/ceph:4ff723061fc15c803dcf6556d02f56bdf56de5fa (0s)
      [............................]
- The upgrade progress bar in ceph status is permanently stuck as shown below. We have given the upgrade more than 15 hours to move forward, with no luck:
progress: Upgrade to quay.ceph.io/ceph-ci/ceph:4ff723061fc15c803dcf6556d02f56bdf56de5fa (0s) [............................]
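For reference, when the upgrade loop appears wedged like this, the usual ways to nudge it are sketched below. These are standard cephadm orchestrator commands, not a confirmed fix for this bug; the image argument repeats the target used above:

```shell
# Pause and resume the upgrade to kick the serve loop
ceph orch upgrade pause
ceph orch upgrade resume

# Fail over to a standby mgr so a fresh active mgr resumes the upgrade
ceph mgr fail

# As a last resort, abandon the upgrade and start it again
ceph orch upgrade stop
ceph orch upgrade start --image quay.ceph.io/ceph-ci/ceph:4ff723061fc15c803dcf6556d02f56bdf56de5fa
```

In this case none of the upgrade state changed regardless of how long the upgrade was left running.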