Feature #55715
pybind/mgr/cephadm/upgrade: allow upgrades without reducing max_mds
Status: Closed
Description
See mailing list post "Alternate Multi-MDS Upgrade Procedure".
cephadm should have a configurable upgrade option to perform cephfs upgrades without reducing max_mds.
Right now, cephadm reduces max_mds to 1 to avoid having two active MDS daemons with different versions modifying on-disk structures, exchanging cross-version-incompatible messages, or hitting other potential incompatibilities. This can be disruptive for large-scale CephFS deployments because the cluster cannot easily reduce the number of active MDS daemons to 1.
Until we have true and correct rolling upgrades for CephFS, another option is simply to fail each file system ("ceph fs fail ..."), upgrade all MDS servers, and then make the file systems joinable again ("ceph fs set foo joinable true"). This procedure should not be the default until it is well tested for minor/major upgrades (with qa tests!). Perhaps it can become the default for the Ceph S release.
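The proposed flow can be sketched as a short command sequence (a minimal sketch of the procedure described above; `foo` is an example file system name, and the actual daemon-upgrade step in the middle depends on the deployment):

```shell
# Sketch of the "fail the fs" upgrade flow; repeat for every file system.

# 1. Take the file system offline (marks it not joinable and fails all ranks).
ceph fs fail foo

# 2. Upgrade all MDS daemons while no MDS is active
#    (package upgrade + restart, or the cephadm-managed equivalent).

# 3. Allow MDS daemons to join again; standbys pick up the ranks.
ceph fs set foo joinable true
```

Since no two MDS daemons of different versions are ever active at the same time, this sidesteps the incompatibilities that motivate reducing max_mds, at the cost of a short full outage per file system.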
Updated by Dan van der Ster almost 2 years ago
FWIW here's a report on manually upgrading a small 15.2.15 cluster to 16.2.9. Two active MDSs, upgraded without decreasing to 1 active:
- RPMs upgraded, mons/mgrs/osds all restarted into 16.2.9
- ceph fs set cephfs allow_standby_replay false
- Stopped standby MDSs
- Stopped active MDSs
- Started one MDS... nothing happened.
- Started the 2nd MDS... rejoin/reconnect done. FS active.
- Started two standby MDSs.
- (Note I forgot to ceph config set mon mon_mds_skip_sanity true, but it worked anyway !!)
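The steps above correspond roughly to this command sequence (a hedged sketch: the systemd unit names are illustrative guesses for an RPM-based deployment; the host names are taken from the logs below):

```shell
# Manual multi-active upgrade as reported (15.2.15 -> 16.2.9), sketched.
# Unit names like ceph-mds@cephoctopus-1 are illustrative, not from the report.

# RPMs already upgraded; mons/mgrs/osds already restarted into 16.2.9.

ceph fs set cephfs allow_standby_replay false   # drop standby-replay daemons

systemctl stop ceph-mds@cephoctopus-3           # stop the standbys first
systemctl stop ceph-mds@cephoctopus-4

systemctl stop ceph-mds@cephoctopus-1           # then stop both actives
systemctl stop ceph-mds@cephoctopus-2

systemctl start ceph-mds@cephoctopus-1          # first MDS: begins recovering rank 0
systemctl start ceph-mds@cephoctopus-2          # second MDS: rank 1; rejoin/reconnect, FS active
systemctl start ceph-mds@cephoctopus-3          # bring the standbys back
systemctl start ceph-mds@cephoctopus-4
```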
Cluster log:
2022-05-23T11:08:33.628263+0200 mon.cephoctopus-1 [WRN] Health check failed: 1 filesystem is degraded (FS_DEGRADED)
2022-05-23T11:08:33.628321+0200 mon.cephoctopus-1 [WRN] Health check failed: 1 filesystem has a failed mds daemon (FS_WITH_FAILED_MDS)
2022-05-23T11:08:39.249083+0200 mon.cephoctopus-1 [ERR] Health check failed: 1 filesystem is offline (MDS_ALL_DOWN)
2022-05-23T11:09:05.376057+0200 mon.cephoctopus-1 [INF] Health check cleared: FS_WITH_FAILED_MDS (was: 1 filesystem has a failed mds daemon)
2022-05-23T11:09:05.376098+0200 mon.cephoctopus-1 [INF] Health check cleared: MDS_INSUFFICIENT_STANDBY (was: insufficient standby MDS daemons available)
2022-05-23T11:09:05.382936+0200 mon.cephoctopus-1 [INF] Standby daemon mds.cephoctopus-1 assigned to filesystem cephfs as rank 0
2022-05-23T11:09:05.383079+0200 mon.cephoctopus-1 [WRN] Health check failed: 1 filesystem has a failed mds daemon (FS_WITH_FAILED_MDS)
2022-05-23T11:09:05.383089+0200 mon.cephoctopus-1 [WRN] Health check failed: insufficient standby MDS daemons available (MDS_INSUFFICIENT_STANDBY)
2022-05-23T11:09:05.383096+0200 mon.cephoctopus-1 [INF] Health check cleared: MDS_ALL_DOWN (was: 1 filesystem is offline)
2022-05-23T11:09:32.466193+0200 mon.cephoctopus-1 [INF] Health check cleared: FS_WITH_FAILED_MDS (was: 1 filesystem has a failed mds daemon)
2022-05-23T11:09:32.466226+0200 mon.cephoctopus-1 [INF] Health check cleared: MDS_INSUFFICIENT_STANDBY (was: insufficient standby MDS daemons available)
2022-05-23T11:09:32.472694+0200 mon.cephoctopus-1 [INF] Standby daemon mds.cephoctopus-2 assigned to filesystem cephfs as rank 1
2022-05-23T11:09:32.472792+0200 mon.cephoctopus-1 [WRN] Health check failed: insufficient standby MDS daemons available (MDS_INSUFFICIENT_STANDBY)
2022-05-23T11:09:35.568458+0200 mon.cephoctopus-1 [INF] daemon mds.cephoctopus-1 is now active in filesystem cephfs as rank 0
2022-05-23T11:09:35.568662+0200 mon.cephoctopus-1 [INF] daemon mds.cephoctopus-2 is now active in filesystem cephfs as rank 1
2022-05-23T11:09:36.523803+0200 mon.cephoctopus-1 [INF] Health check cleared: FS_DEGRADED (was: 1 filesystem is degraded)
2022-05-23T11:10:00.000233+0200 mon.cephoctopus-1 [WRN] Health detail: HEALTH_WARN insufficient standby MDS daemons available; all OSDs are running pacific or later but require_osd_release < pacific
2022-05-23T11:10:00.000284+0200 mon.cephoctopus-1 [WRN] [WRN] MDS_INSUFFICIENT_STANDBY: insufficient standby MDS daemons available
2022-05-23T11:10:00.000294+0200 mon.cephoctopus-1 [WRN] have 0; want 1 more
2022-05-23T11:10:00.000315+0200 mon.cephoctopus-1 [WRN] [WRN] OSD_UPGRADE_FINISHED: all OSDs are running pacific or later but require_osd_release < pacific
2022-05-23T11:10:00.000326+0200 mon.cephoctopus-1 [WRN] all OSDs are running pacific or later but require_osd_release < pacific
2022-05-23T11:10:58.334807+0200 mon.cephoctopus-1 [INF] Health check cleared: MDS_INSUFFICIENT_STANDBY (was: insufficient standby MDS daemons available)
2022-05-23T11:12:15.534602+0200 mon.cephoctopus-1 [INF] Health check cleared: OSD_UPGRADE_FINISHED (was: all OSDs are running pacific or later but require_osd_release < pacific)
2022-05-23T11:12:15.534648+0200 mon.cephoctopus-1 [INF] Cluster is now healthy
2022-05-23T11:20:00.000153+0200 mon.cephoctopus-1 [INF] overall HEALTH_OK
Saw this minor warning when stopping the standbys:
2022-05-23T11:06:31.970275+0200 mon.cephoctopus-1 (mon.0) 683 : cluster [WRN] daemon mds.cephoctopus-3 compat changed unexpectedly
2022-05-23T11:06:32.000983+0200 mon.cephoctopus-1 (mon.0) 685 : cluster [DBG] osdmap e7562: 4 total, 4 up, 4 in
2022-05-23T11:06:32.206606+0200 mon.cephoctopus-1 (mon.0) 686 : cluster [DBG] fsmap cephfs:2 {0=cephoctopus-2=up:active,1=cephoctopus-1=up:active} 1 up:standby
...
2022-05-23T11:06:56.486798+0200 mon.cephoctopus-1 (mon.0) 699 : cluster [WRN] daemon mds.cephoctopus-4 compat changed unexpectedly
2022-05-23T11:06:56.508442+0200 mon.cephoctopus-1 (mon.0) 700 : cluster [DBG] osdmap e7563: 4 total, 4 up, 4 in
2022-05-23T11:06:56.514220+0200 mon.cephoctopus-1 (mon.0) 701 : cluster [WRN] Health check failed: insufficient standby MDS daemons available (MDS_INSUFFICIENT_STANDBY)
2022-05-23T11:06:56.522924+0200 mon.cephoctopus-1 (mon.0) 702 : cluster [DBG] fsmap cephfs:2 {0=cephoctopus-2=up:active,1=cephoctopus-1=up:active}
So then the actives were stopped.
MDS log from starting rank 0 for the first time in 16.2.9:
2022-05-23T11:08:32.974+0200 7f509480a700 1 mds.cephoctopus-1 suicide! Wanted state up:active
2022-05-23T11:08:35.966+0200 7f509480a700 1 mds.1.293 shutdown: shutting down rank 1
2022-05-23T11:08:35.967+0200 7f508f800700 2 mds.1.cache Memory usage: total 466120, rss 41252, heap 331984, baseline 307408, 0 / 1067 inodes have caps, 0 caps, 0 caps per inode
2022-05-23T11:08:35.966+0200 7f5094009700 0 ms_deliver_dispatch: unhandled message 0x561a70b31dc0 osd_map(7564..7564 src has 6981..7564) v4 from mon.0 v2:188.185.87.224:3300/0
2022-05-23T11:08:35.967+0200 7f5094009700 0 ms_deliver_dispatch: unhandled message 0x561a71b5a300 mdsmap(e 303) v2 from mon.0 v2:188.185.87.224:3300/0
2022-05-23T11:08:35.967+0200 7f5094009700 0 ms_deliver_dispatch: unhandled message 0x561a70a8b680 mdsmap(e 4294967295) v2 from mon.0 v2:188.185.87.224:3300/0
2022-05-23T11:09:05.237+0200 7f1ca15bb900 0 set uid:gid to 167:167 (ceph:ceph)
2022-05-23T11:09:05.237+0200 7f1ca15bb900 0 ceph version 16.2.9-1 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable), process ceph-mds, pid 1773358
2022-05-23T11:09:05.237+0200 7f1ca15bb900 1 main not setting numa affinity
2022-05-23T11:09:05.237+0200 7f1ca15bb900 0 pidfile_write: ignore empty --pid-file
2022-05-23T11:09:05.245+0200 7f1c8e7a9700 1 mds.cephoctopus-1 Updating MDS map to version 304 from mon.1
2022-05-23T11:09:05.382+0200 7f1c8e7a9700 1 mds.cephoctopus-1 Updating MDS map to version 305 from mon.1
2022-05-23T11:09:05.382+0200 7f1c8e7a9700 1 mds.cephoctopus-1 Monitors have assigned me to become a standby.
2022-05-23T11:09:05.390+0200 7f1c8e7a9700 1 mds.cephoctopus-1 Updating MDS map to version 306 from mon.1
2022-05-23T11:09:05.392+0200 7f1c8e7a9700 1 mds.0.306 handle_mds_map i am now mds.0.306
2022-05-23T11:09:05.392+0200 7f1c8e7a9700 1 mds.0.306 handle_mds_map state change up:boot --> up:replay
2022-05-23T11:09:05.392+0200 7f1c8e7a9700 1 mds.0.306 replay_start
2022-05-23T11:09:05.392+0200 7f1c8e7a9700 1 mds.0.306 waiting for osdmap 7565 (which blocklists prior instance)
2022-05-23T11:09:05.393+0200 7f1c8879d700 2 mds.0.306 Booting: 0: opening inotable
2022-05-23T11:09:05.393+0200 7f1c8879d700 2 mds.0.306 Booting: 0: opening sessionmap
2022-05-23T11:09:05.393+0200 7f1c8879d700 2 mds.0.306 Booting: 0: opening mds log
2022-05-23T11:09:05.393+0200 7f1c8879d700 2 mds.0.306 Booting: 0: opening purge queue (async)
2022-05-23T11:09:05.393+0200 7f1c8879d700 2 mds.0.306 Booting: 0: loading open file table (async)
2022-05-23T11:09:05.393+0200 7f1c8879d700 2 mds.0.306 Booting: 0: opening snap table
2022-05-23T11:09:05.430+0200 7f1c8879d700 2 mds.0.306 Booting: 1: loading/discovering base inodes
2022-05-23T11:09:05.430+0200 7f1c8879d700 0 mds.0.cache creating system inode with ino:0x100
2022-05-23T11:09:05.430+0200 7f1c8879d700 0 mds.0.cache creating system inode with ino:0x1
2022-05-23T11:09:05.445+0200 7f1c8879d700 2 mds.0.306 Booting: 2: replaying mds log
2022-05-23T11:09:05.445+0200 7f1c8879d700 2 mds.0.306 Booting: 2: waiting for purge queue recovered
2022-05-23T11:09:06.528+0200 7f1c86f9a700 1 mds.0.306 Finished replaying journal
2022-05-23T11:09:06.528+0200 7f1c86f9a700 1 mds.0.306 making mds journal writeable
2022-05-23T11:09:06.528+0200 7f1c86f9a700 2 mds.0.306 i am not alone, moving to state resolve
2022-05-23T11:09:07.272+0200 7f1c8e7a9700 1 mds.cephoctopus-1 Updating MDS map to version 307 from mon.1
2022-05-23T11:09:07.273+0200 7f1c8e7a9700 1 mds.0.306 handle_mds_map i am now mds.0.306
2022-05-23T11:09:07.273+0200 7f1c8e7a9700 1 mds.0.306 handle_mds_map state change up:replay --> up:resolve
2022-05-23T11:09:07.273+0200 7f1c8e7a9700 1 mds.0.306 resolve_start
2022-05-23T11:09:07.273+0200 7f1c8e7a9700 1 mds.0.306 reopen_log
2022-05-23T11:09:07.273+0200 7f1c8e7a9700 1 mds.0.306 recovery set is 1
2022-05-23T11:09:32.480+0200 7f1c8e7a9700 1 mds.cephoctopus-1 Updating MDS map to version 309 from mon.1
2022-05-23T11:09:32.480+0200 7f1c8e7a9700 1 mds.0.cache handle_mds_failure mds.1 : recovery peers are 1
2022-05-23T11:09:33.490+0200 7f1c8e7a9700 1 mds.cephoctopus-1 Updating MDS map to version 310 from mon.1
2022-05-23T11:09:33.490+0200 7f1c8e7a9700 1 mds.0.306 recovery set is 1
2022-05-23T11:09:33.516+0200 7f1c8e7a9700 1 mds.0.306 resolve_done
2022-05-23T11:09:34.513+0200 7f1c8e7a9700 1 mds.cephoctopus-1 Updating MDS map to version 311 from mon.1
2022-05-23T11:09:34.513+0200 7f1c8e7a9700 1 mds.0.306 handle_mds_map i am now mds.0.306
2022-05-23T11:09:34.513+0200 7f1c8e7a9700 1 mds.0.306 handle_mds_map state change up:resolve --> up:reconnect
2022-05-23T11:09:34.513+0200 7f1c8e7a9700 1 mds.0.306 reconnect_start
2022-05-23T11:09:34.513+0200 7f1c8e7a9700 1 mds.0.server reconnect_clients -- 1 sessions
2022-05-23T11:09:34.513+0200 7f1c8e7a9700 0 log_channel(cluster) log [DBG] : reconnect by client.2134445 v1:188.185.87.224:0/3785233189 after 0
2022-05-23T11:09:34.514+0200 7f1c8e7a9700 1 mds.0.306 reconnect_done
2022-05-23T11:09:35.515+0200 7f1c8e7a9700 1 mds.cephoctopus-1 Updating MDS map to version 312 from mon.1
2022-05-23T11:09:35.516+0200 7f1c8e7a9700 1 mds.0.306 handle_mds_map i am now mds.0.306
2022-05-23T11:09:35.516+0200 7f1c8e7a9700 1 mds.0.306 handle_mds_map state change up:reconnect --> up:rejoin
2022-05-23T11:09:35.516+0200 7f1c8e7a9700 1 mds.0.306 rejoin_start
2022-05-23T11:09:35.529+0200 7f1c8e7a9700 1 mds.0.306 rejoin_joint_start
2022-05-23T11:09:35.566+0200 7f1c8879d700 1 mds.0.306 rejoin_done
2022-05-23T11:09:36.259+0200 7f1c8e7a9700 0 ms_deliver_dispatch: unhandled message 0x55849fdfc9a0 client_metrics [client_metric_type: CAP_INFO cap_hits: 685398 cap_misses: 130 num_caps: 1][client_metric_type: READ_LATENCY latency: 6068600.201000][client_metric_type: WRITE_LATENCY latency: 2034-04-09T08:03:59.523000+0200][client_metric_type: METADATA_LATENCY latency: 2045-10-03T18:46:10.208000+0200][client_metric_type: DENTRY_LEASE dlease_hits: 7456 dlease_misses: 7 num_dentries: 1] v1 from client.2134445 v1:188.185.87.224:0/3785233189
2022-05-23T11:09:36.534+0200 7f1c8e7a9700 1 mds.cephoctopus-1 Updating MDS map to version 313 from mon.1
2022-05-23T11:09:36.535+0200 7f1c8e7a9700 1 mds.0.306 handle_mds_map i am now mds.0.306
2022-05-23T11:09:36.535+0200 7f1c8e7a9700 1 mds.0.306 handle_mds_map state change up:rejoin --> up:active
2022-05-23T11:09:36.535+0200 7f1c8e7a9700 1 mds.0.306 recovery_done -- successful recovery!
2022-05-23T11:09:36.535+0200 7f1c8e7a9700 1 mds.0.306 active_start
2022-05-23T11:09:36.536+0200 7f1c8e7a9700 1 mds.0.306 cluster recovered.
2022-05-23T11:09:37.537+0200 7f1c8e7a9700 1 mds.cephoctopus-1 Updating MDS map to version 314 from mon.1
Finally, starting rank 1 for the first time in 16.2.9:
2022-05-23T11:08:39.218+0200 7f46cf1ba700 1 mds.cephoctopus-2 suicide! Wanted state up:active
2022-05-23T11:08:42.758+0200 7f46cf1ba700 1 mds.0.281 shutdown: shutting down rank 0
2022-05-23T11:08:42.758+0200 7f46ce9b9700 0 ms_deliver_dispatch: unhandled message 0x5572a92fb6c0 osd_map(7565..7565 src has 6981..7565) v4 from mon.0 v2:188.185.87.224:3300/0
2022-05-23T11:08:42.758+0200 7f46ce9b9700 0 ms_deliver_dispatch: unhandled message 0x5572a92dbb00 mdsmap(e 304) v2 from mon.0 v2:188.185.87.224:3300/0
2022-05-23T11:08:42.758+0200 7f46ce9b9700 0 ms_deliver_dispatch: unhandled message 0x5572a92db980 mdsmap(e 4294967295) v2 from mon.0 v2:188.185.87.224:3300/0
2022-05-23T11:09:32.225+0200 7f02132cb900 0 set uid:gid to 167:167 (ceph:ceph)
2022-05-23T11:09:32.225+0200 7f02132cb900 0 ceph version 16.2.9-1 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable), process ceph-mds, pid 1899424
2022-05-23T11:09:32.225+0200 7f02132cb900 1 main not setting numa affinity
2022-05-23T11:09:32.225+0200 7f02132cb900 0 pidfile_write: ignore empty --pid-file
2022-05-23T11:09:32.232+0200 7f02004b8700 1 mds.cephoctopus-2 Updating MDS map to version 307 from mon.1
2022-05-23T11:09:32.471+0200 7f02004b8700 1 mds.cephoctopus-2 Updating MDS map to version 308 from mon.1
2022-05-23T11:09:32.471+0200 7f02004b8700 1 mds.cephoctopus-2 Monitors have assigned me to become a standby.
2022-05-23T11:09:32.480+0200 7f02004b8700 1 mds.cephoctopus-2 Updating MDS map to version 309 from mon.1
2022-05-23T11:09:32.482+0200 7f02004b8700 1 mds.1.309 handle_mds_map i am now mds.1.309
2022-05-23T11:09:32.482+0200 7f02004b8700 1 mds.1.309 handle_mds_map state change up:boot --> up:replay
2022-05-23T11:09:32.482+0200 7f02004b8700 1 mds.1.309 replay_start
2022-05-23T11:09:32.482+0200 7f02004b8700 1 mds.1.309 waiting for osdmap 7565 (which blocklists prior instance)
2022-05-23T11:09:32.484+0200 7f01fa4ac700 2 mds.1.309 Booting: 0: opening inotable
2022-05-23T11:09:32.484+0200 7f01fa4ac700 2 mds.1.309 Booting: 0: opening sessionmap
2022-05-23T11:09:32.484+0200 7f01fa4ac700 2 mds.1.309 Booting: 0: opening mds log
2022-05-23T11:09:32.484+0200 7f01fa4ac700 2 mds.1.309 Booting: 0: opening purge queue (async)
2022-05-23T11:09:32.484+0200 7f01fa4ac700 2 mds.1.309 Booting: 0: loading open file table (async)
2022-05-23T11:09:32.524+0200 7f01f9cab700 2 mds.1.309 Booting: 1: loading/discovering base inodes
2022-05-23T11:09:32.524+0200 7f01f9cab700 0 mds.1.cache creating system inode with ino:0x101
2022-05-23T11:09:32.525+0200 7f01f9cab700 0 mds.1.cache creating system inode with ino:0x1
2022-05-23T11:09:32.537+0200 7f01fa4ac700 2 mds.1.309 Booting: 2: replaying mds log
2022-05-23T11:09:32.537+0200 7f01fa4ac700 2 mds.1.309 Booting: 2: waiting for purge queue recovered
2022-05-23T11:09:32.599+0200 7f01f8ca9700 1 mds.1.309 Finished replaying journal
2022-05-23T11:09:32.600+0200 7f01f8ca9700 1 mds.1.309 making mds journal writeable
2022-05-23T11:09:32.600+0200 7f01f8ca9700 2 mds.1.309 i am not alone, moving to state resolve
2022-05-23T11:09:33.490+0200 7f02004b8700 1 mds.cephoctopus-2 Updating MDS map to version 310 from mon.1
2022-05-23T11:09:33.490+0200 7f02004b8700 1 mds.1.309 handle_mds_map i am now mds.1.309
2022-05-23T11:09:33.490+0200 7f02004b8700 1 mds.1.309 handle_mds_map state change up:replay --> up:resolve
2022-05-23T11:09:33.490+0200 7f02004b8700 1 mds.1.309 resolve_start
2022-05-23T11:09:33.490+0200 7f02004b8700 1 mds.1.309 reopen_log
2022-05-23T11:09:33.490+0200 7f02004b8700 1 mds.1.309 recovery set is 0
2022-05-23T11:09:33.490+0200 7f02004b8700 1 mds.1.309 recovery set is 0
2022-05-23T11:09:33.492+0200 7f02004b8700 1 mds.cephoctopus-2 parse_caps: cannot decode auth caps buffer of length 0
2022-05-23T11:09:33.493+0200 7f02004b8700 1 mds.1.309 resolve_done
2022-05-23T11:09:34.512+0200 7f02004b8700 1 mds.cephoctopus-2 Updating MDS map to version 311 from mon.1
2022-05-23T11:09:34.512+0200 7f02004b8700 1 mds.1.309 handle_mds_map i am now mds.1.309
2022-05-23T11:09:34.512+0200 7f02004b8700 1 mds.1.309 handle_mds_map state change up:resolve --> up:reconnect
2022-05-23T11:09:34.512+0200 7f02004b8700 1 mds.1.309 reconnect_start
2022-05-23T11:09:34.512+0200 7f02004b8700 1 mds.1.309 reconnect_done
2022-05-23T11:09:35.516+0200 7f02004b8700 1 mds.cephoctopus-2 Updating MDS map to version 312 from mon.1
2022-05-23T11:09:35.516+0200 7f02004b8700 1 mds.1.309 handle_mds_map i am now mds.1.309
2022-05-23T11:09:35.516+0200 7f02004b8700 1 mds.1.309 handle_mds_map state change up:reconnect --> up:rejoin
2022-05-23T11:09:35.516+0200 7f02004b8700 1 mds.1.309 rejoin_start
2022-05-23T11:09:35.516+0200 7f02004b8700 1 mds.1.309 rejoin_joint_start
2022-05-23T11:09:35.567+0200 7f02004b8700 1 mds.1.309 rejoin_done
2022-05-23T11:09:36.535+0200 7f02004b8700 1 mds.cephoctopus-2 Updating MDS map to version 313 from mon.1
2022-05-23T11:09:36.535+0200 7f02004b8700 1 mds.1.309 handle_mds_map i am now mds.1.309
2022-05-23T11:09:36.535+0200 7f02004b8700 1 mds.1.309 handle_mds_map state change up:rejoin --> up:active
2022-05-23T11:09:36.535+0200 7f02004b8700 1 mds.1.309 recovery_done -- successful recovery!
2022-05-23T11:09:36.535+0200 7f02004b8700 1 mds.1.309 active_start
2022-05-23T11:09:36.536+0200 7f02004b8700 1 mds.1.309 cluster recovered.
2022-05-23T11:09:37.537+0200 7f02004b8700 1 mds.cephoctopus-2 Updating MDS map to version 314 from mon.1
Updated by Dhairya Parmar almost 2 years ago
- Status changed from New to In Progress
Updated by Dhairya Parmar almost 2 years ago
- Status changed from In Progress to New
- Pull request ID set to 46534
Updated by Dhairya Parmar almost 2 years ago
- Status changed from New to In Progress
Updated by Dhairya Parmar almost 2 years ago
- Pull request ID changed from 46534 to 47092
Updated by Dhairya Parmar almost 2 years ago
- Status changed from In Progress to Fix Under Review
Updated by Adam King over 1 year ago
- Status changed from Fix Under Review to Resolved
Updated by Dhairya Parmar over 1 year ago
- Pull request ID changed from 47092 to 47756