Bug #64316
ceph-mon keeps crashing after upgrading from v16.2.14 to v17.2.x
Description
Hello,
I am having trouble upgrading my Ceph cluster from version 16.2.14 to version 17.2.7 and would like to ask for help.
I updated the packages with APT and then ran the following command:
systemctl restart ceph-mon.target
From this point on, the ceph-mon process keeps crashing:
2024-02-02T12:17:07.14485+01:00 o-stor-01 ceph-mon[288919]: -7> 2024-02-02T12:17:07.091+0100 7f15b5d42a00 10 mon.o-stor-01@-1(???).osd e59701 start_mapping started mapping job 0x562222533540 at 2024-02-02T12:17:07.095594+0100
2024-02-02T12:17:07.14493+01:00 o-stor-01 ceph-mon[288919]: -6> 2024-02-02T12:17:07.091+0100 7f15b5d42a00 10 mon.o-stor-01@-1(???).paxosservice(logm 1..177) refresh
2024-02-02T12:17:07.14500+01:00 o-stor-01 ceph-mon[288919]: -5> 2024-02-02T12:17:07.091+0100 7f15b5d42a00 10 mon.o-stor-01@-1(???).log v177 update_from_paxos
2024-02-02T12:17:07.14508+01:00 o-stor-01 ceph-mon[288919]: -4> 2024-02-02T12:17:07.091+0100 7f15b5d42a00 10 mon.o-stor-01@-1(???).log v177 update_from_paxos version 177 summary v 0
2024-02-02T12:17:07.14515+01:00 o-stor-01 ceph-mon[288919]: -3> 2024-02-02T12:17:07.091+0100 7f15b5d42a00 10 mon.o-stor-01@-1(???).log v177 log_external_backlog initialized external_log_to = 0 (summary v 0)
2024-02-02T12:17:07.14521+01:00 o-stor-01 ceph-mon[288919]: -2> 2024-02-02T12:17:07.091+0100 7f15b5d42a00 10 mon.o-stor-01@-1(???).log v177 update_from_paxos latest full 176
2024-02-02T12:17:07.14528+01:00 o-stor-01 ceph-mon[288919]: -1> 2024-02-02T12:17:07.095+0100 7f15b5d42a00 7 mon.o-stor-01@-1(???).log v177 update_from_paxos loading summary e176
2024-02-02T12:17:07.14534+01:00 o-stor-01 ceph-mon[288919]: 0> 2024-02-02T12:17:07.111+0100 7f15b5d42a00 -1 *** Caught signal (Aborted) **
 in thread 7f15b5d42a00 thread_name:ceph-mon
 ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
 1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420) [0x7f15b673e420]
 2: gsignal()
 3: abort()
 4: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x9e8d1) [0x7f15b65de8d1]
 5: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa37c) [0x7f15b65ea37c]
 6: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa3e7) [0x7f15b65ea3e7]
 7: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa699) [0x7f15b65ea699]
 8: (ceph::buffer::v15_2_0::list::iterator_impl<true>::copy(unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)+0xd1) [0x7f15b7016071]
 9: (LogSummary::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x17e) [0x7f15b6d0a4ce]
 10: (LogMonitor::update_from_paxos(bool*)+0x101c) [0x56222027692c]
 11: (PaxosService::refresh(bool*)+0x28f) [0x56222038147f]
 12: (Monitor::refresh_from_paxos(bool*)+0x11b) [0x5622201d1f9b]
 13: (Monitor::init_paxos()+0x74) [0x5622201d23b4]
 14: (Monitor::preinit()+0xeea) [0x56222020b9ea]
 15: main()
 16: __libc_start_main()
 17: _start()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
The complete output can be found at the following link:
https://pastebin.com/kswRFuA3
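To make the backtrace easier to read, here is a minimal Python sketch that pulls the named Ceph frames out of it and lists them from the outermost call inward. The regular expression and the helper names (`FRAME_RE`, `function_name`, `call_chain`) are my own and only assume the frame layout seen in the log above; this is not a Ceph tool.

```python
import re

# Frames 8-14 copied from the backtrace above (the frames with symbol names).
BACKTRACE = (
    "8: (ceph::buffer::v15_2_0::list::iterator_impl<true>::copy(unsigned int, "
    "std::__cxx11::basic_string<char, std::char_traits<char>, "
    "std::allocator<char> >&)+0xd1) [0x7f15b7016071] "
    "9: (LogSummary::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)"
    "+0x17e) [0x7f15b6d0a4ce] "
    "10: (LogMonitor::update_from_paxos(bool*)+0x101c) [0x56222027692c] "
    "11: (PaxosService::refresh(bool*)+0x28f) [0x56222038147f] "
    "12: (Monitor::refresh_from_paxos(bool*)+0x11b) [0x5622201d1f9b] "
    "13: (Monitor::init_paxos()+0x74) [0x5622201d23b4] "
    "14: (Monitor::preinit()+0xeea) [0x56222020b9ea]"
)

# Assumed frame layout: "N: (symbol(args)+0xOFFSET) [0xADDRESS]".
FRAME_RE = re.compile(r"(\d+): \((.+?)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]")


def function_name(symbol):
    """Return the symbol without its argument list, e.g. 'LogSummary::decode'."""
    return symbol.split("(", 1)[0]


def call_chain(text):
    """Return (frame number, function name) pairs, outermost call first."""
    frames = [(int(num), function_name(sym)) for num, sym in FRAME_RE.findall(text)]
    # The backtrace lists the innermost frame first, so sort in reverse.
    return sorted(frames, reverse=True)


for number, name in call_chain(BACKTRACE):
    print(f"{number:>2}: {name}")
```

For the frames above, the list starts at `Monitor::preinit` and ends with `LogSummary::decode` followed by the buffer `copy` call, so the abort is raised while the monitor is decoding the stored log summary during startup.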
The OS version I am using is:
Distributor ID: Ubuntu
Description: Ubuntu 20.04.6 LTS
Release: 20.04
Codename: focal
Kernel version: 5.4.0-169-generic
I also tried v17.2.6, but the same issue persists. I then tried upgrading from v16.2.14 directly to v17.2.0, the very first Quincy release, and got the same error: the ceph-mon process keeps crashing.