Bug #64478
closed
Upgrading mon from v18.2.1 to latest-reef-devel image is causing mon to fail when decoding the MDSMap
Status: Duplicate
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: Yes
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
The Rook daily CI creates a v18.2.1 cluster with CephFS enabled, then upgrades it to the latest-reef-devel image. As soon as the mon is upgraded, it aborts and the following failure appears in the mon log:
debug -1> 2024-02-16T21:27:13.319+0000 7fca997d4c80 1 mon.a@-1(???) e1 preinit fsid 8a85cb29-13cb-4904-b9e3-6b692e7bfffb
debug 0> 2024-02-16T21:27:13.343+0000 7fca997d4c80 -1 *** Caught signal (Aborted) **
in thread 7fca997d4c80 thread_name:ceph-mon
ceph version 18.2.1-593-g744c573d (744c573dfc29e50959567861c524f9e6c038171f) reef (stable)
1: /lib64/libpthread.so.0(+0x12d20) [0x7fca9642dd20]
2: gsignal()
3: abort()
4: /lib64/libstdc++.so.6(+0x9009b) [0x7fca95a3f09b]
5: /lib64/libstdc++.so.6(+0x9654c) [0x7fca95a4554c]
6: /lib64/libstdc++.so.6(+0x965a7) [0x7fca95a455a7]
7: /lib64/libstdc++.so.6(+0x96808) [0x7fca95a45808]
8: (ceph::buffer::v15_2_0::list::iterator_impl<true>::copy(unsigned int, char*)+0xa5) [0x7fca98fd2385]
9: (MDSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0xa64) [0x7fca991ede94]
10: (Filesystem::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x162) [0x7fca991fbb42]
11: (void ceph::decode<Filesystem, std::allocator<std::shared_ptr<Filesystem> > >(std::vector<std::shared_ptr<Filesystem>, std::allocator<std::shared_ptr<Filesystem> > >&, ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x145) [0x7fca99208195]
12: (FSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x161) [0x7fca991fda01]
13: (MDSMonitor::update_from_paxos(bool*)+0x26b) [0x555c9c6e55eb]
14: (Monitor::refresh_from_paxos(bool*)+0x104) [0x555c9c473764]
15: (Monitor::preinit()+0xa2b) [0x555c9c4a1b6b]
16: main()
17: __libc_start_main()
18: _start()
See attached for the full mon log.
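For anyone triaging the attached log, the following is a minimal sketch of how the symbolized frames could be pulled out of a backtrace like the one above to see the decode call chain at a glance. The helper names (`symbolized_frames`, `short_name`) and the regular expression are illustrative assumptions, not part of any Ceph or Rook tooling, and the sample only reuses frames quoted in this report.

```python
import re

# Hypothetical helper, not part of Ceph: matches symbolized frames such as
#   "9: (MDSMap::decode(...)+0xa64) [0x7fca991ede94]"
# and skips library-only frames such as "2: gsignal()".
FRAME_RE = re.compile(
    r"^\s*(\d+):\s+\((.+)\+0x[0-9a-f]+\)\s+\[0x[0-9a-f]+\]\s*$"
)

# A few frames copied from the backtrace in this report.
SAMPLE_TRACE = """\
2: gsignal()
3: abort()
8: (ceph::buffer::v15_2_0::list::iterator_impl<true>::copy(unsigned int, char*)+0xa5) [0x7fca98fd2385]
9: (MDSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0xa64) [0x7fca991ede94]
10: (Filesystem::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x162) [0x7fca991fbb42]
12: (FSMap::decode(ceph::buffer::v15_2_0::list::iterator_impl<true>&)+0x161) [0x7fca991fda01]
13: (MDSMonitor::update_from_paxos(bool*)+0x26b) [0x555c9c6e55eb]
"""


def symbolized_frames(trace_text):
    """Return (frame_number, symbol) pairs for frames that carry a C++ symbol."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append((int(match.group(1)), match.group(2)))
    return frames


def short_name(symbol):
    """Keep only the text before the first '(' (drops the argument list)."""
    return symbol.split("(", 1)[0]


if __name__ == "__main__":
    # Print the call chain from the innermost symbolized frame outward.
    for number, symbol in symbolized_frames(SAMPLE_TRACE):
        print(f"{number:>2}: {short_name(symbol)}")
```

Run against the full mon log, the same functions would list every symbolized frame, which makes it easier to compare this trace with the one in the duplicate report.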
The purpose of the test is to confirm that Rook upgrades pass on the latest Ceph images before the next release comes out.
See also the Rook CI issue: https://github.com/rook/rook/issues/13785