Bug #64316

ceph-mon keeps crashing after upgrading from v16.2.14 to v17.2.x

Added by Tamás Csonka 3 months ago.

Status: New
Priority: Normal
Assignee: -
Category: Monitor
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hello,

I am having trouble upgrading my Ceph cluster from version 16.2.14 to version 17.2.7 and would like to ask for help.

I update the packages with APT and then issue the following command:

systemctl restart ceph-mon.target

From this point on, the ceph-mon process crashes repeatedly:

2024-02-02T12:17:07.14485+01:00 o-stor-01 ceph-mon[288919]:     -7> 2024-02-02T12:17:07.091+0100 7f15b5d42a00 10 mon.o-stor-01@-1(???).osd e59701 start_mapping started mapping job 0x562222533540 at 2024-02-02T12:17:07.095594+0100
2024-02-02T12:17:07.14493+01:00 o-stor-01 ceph-mon[288919]:     -6> 2024-02-02T12:17:07.091+0100 7f15b5d42a00 10 mon.o-stor-01@-1(???).paxosservice(logm 1..177) refresh
2024-02-02T12:17:07.14500+01:00 o-stor-01 ceph-mon[288919]:     -5> 2024-02-02T12:17:07.091+0100 7f15b5d42a00 10 mon.o-stor-01@-1(???).log v177 update_from_paxos
2024-02-02T12:17:07.14508+01:00 o-stor-01 ceph-mon[288919]:     -4> 2024-02-02T12:17:07.091+0100 7f15b5d42a00 10 mon.o-stor-01@-1(???).log v177 update_from_paxos version 177 summary v 0
2024-02-02T12:17:07.14515+01:00 o-stor-01 ceph-mon[288919]:     -3> 2024-02-02T12:17:07.091+0100 7f15b5d42a00 10 mon.o-stor-01@-1(???).log v177 log_external_backlog initialized external_log_to = 0 (summary v 0)
2024-02-02T12:17:07.14521+01:00 o-stor-01 ceph-mon[288919]:     -2> 2024-02-02T12:17:07.091+0100 7f15b5d42a00 10 mon.o-stor-01@-1(???).log v177 update_from_paxos latest full 176
2024-02-02T12:17:07.14528+01:00 o-stor-01 ceph-mon[288919]:     -1> 2024-02-02T12:17:07.095+0100 7f15b5d42a00  7 mon.o-stor-01@-1(???).log v177 update_from_paxos loading summary e176
2024-02-02T12:17:07.14534+01:00 o-stor-01 ceph-mon[288919]:      0> 2024-02-02T12:17:07.111+0100 7f15b5d42a00 -1 *** Caught signal (Aborted) **
 in thread 7f15b5d42a00 thread_name:ceph-mon
 ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
 1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420) [0x7f15b673e420]
 2: gsignal()
 3: abort()
 4: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x9e8d1) [0x7f15b65de8d1]
 5: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa37c) [0x7f15b65ea37c]
 6: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa3e7) [0x7f15b65ea3e7]
 7: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa699) [0x7f15b65ea699]
 8: (ceph::buffer::v15_2_0::list::iterator_impl<true>::copy(unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)+0xd1) [0x7f15b7016071]
 9: (LogSummary::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x17e) [0x7f15b6d0a4ce]
 10: (LogMonitor::update_from_paxos(bool*)+0x101c) [0x56222027692c]
 11: (PaxosService::refresh(bool*)+0x28f) [0x56222038147f]
 12: (Monitor::refresh_from_paxos(bool*)+0x11b) [0x5622201d1f9b]
 13: (Monitor::init_paxos()+0x74) [0x5622201d23b4]
 14: (Monitor::preinit()+0xeea) [0x56222020b9ea]
 15: main()
 16: __libc_start_main()
 17: _start()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
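
For readers who only have the backtrace as a single flattened syslog line, the following is a minimal Python sketch that splits it into numbered frames and pulls out the symbol names. It is not part of any Ceph tooling; the helper names split_frames and symbol_of are hypothetical, and RAW is only an excerpt (frames 7 to 11) copied from the log above. It makes it easier to see that the last Ceph frames before the abort are the buffer iterator copy and LogSummary::decode, called from LogMonitor::update_from_paxos.

```python
import re

# Excerpt (frames 7-11) of the flattened backtrace from the log above.
RAW = (
    "7: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa699) [0x7f15b65ea699]  "
    "8: (ceph::buffer::v15_2_0::list::iterator_impl<true>::copy(unsigned int, "
    "std::__cxx11::basic_string<char, std::char_traits<char>, "
    "std::allocator<char> >&)+0xd1) [0x7f15b7016071]  "
    "9: (LogSummary::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)"
    "+0x17e) [0x7f15b6d0a4ce]  "
    "10: (LogMonitor::update_from_paxos(bool*)+0x101c) [0x56222027692c]  "
    "11: (PaxosService::refresh(bool*)+0x28f) [0x56222038147f]"
)


def split_frames(trace):
    """Split 'N: frame  N+1: frame ...' into a list of (number, text) pairs."""
    frames = []
    # A new frame starts wherever whitespace is followed by '<digits>: '.
    for part in re.split(r"\s+(?=\d+: )", trace.strip()):
        m = re.match(r"(\d+): (.*)", part)
        if m:
            frames.append((int(m.group(1)), m.group(2)))
    return frames


def symbol_of(frame_text):
    """Return the symbol of a '(symbol+0xOFF) [0xADDR]' frame, else None."""
    m = re.match(r"\((.+)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$", frame_text)
    return m.group(1) if m else None


# Print one frame per line, preferring the symbol name when one is present.
for number, text in split_frames(RAW):
    print(number, symbol_of(text) or text)
```

Frames that carry no symbol (such as the libstdc++ entries) are printed unchanged, since resolving them needs the matching debug symbols, as the NOTE in the log says.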

The complete output can be found at the following link:
https://pastebin.com/kswRFuA3

The OS version I am using is:
Distributor ID: Ubuntu
Description: Ubuntu 20.04.6 LTS
Release: 20.04
Codename: focal
Kernel version: 5.4.0-169-generic

I also tried v17.2.6, with the same result. I then tried upgrading from v16.2.14 directly to v17.2.0, the very first Quincy release, and got the same error: the ceph-mon process keeps crashing.
