Bug #51112

open

RHEL 8.4: libceph considers osdmap corrupt if numbering is discontinuous

Added by Matti Saarinen almost 3 years ago. Updated over 1 year ago.

Status: Pending Backport
Priority: Normal
Assignee:
Category: libceph
Target version: -
% Done: 0%
Source: Community (user)
Tags: backport_processed
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

If I try to map an RBD image from a cluster where the OSD numbering is discontinuous, the mapping fails with the message [85115.711643] libceph: corrupt full osdmap (-2) epoch 248869 off 10063. The cluster runs Nautilus 14.2.18 on RHEL 7. The ceph-common package on the client is also Nautilus, and the client OS is RHEL 8. The mapping succeeds from clients running RHEL 7. I tried the Mimic, Nautilus and Octopus ceph-client packages.
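For anyone triaging similar reports, the fields in that kernel message can be pulled apart with a short script. This is only a sketch: the function name `parse_corrupt_osdmap` is made up for illustration, the pattern covers just the message format quoted above, and reading the number in parentheses as a negated errno is an assumption based on the usual kernel convention.

```python
import errno
import re

# Matches only the message format quoted in this report. Other kernel
# versions may append further fields, which this sketch ignores.
OSDMAP_ERR = re.compile(
    r"libceph: corrupt (?P<kind>\w+) osdmap \((?P<code>-?\d+)\) "
    r"epoch (?P<epoch>\d+) off (?P<offset>\d+)"
)


def parse_corrupt_osdmap(line):
    """Return the fields of a 'corrupt osdmap' dmesg line, or None.

    Hypothetical helper, not part of any Ceph tooling.
    """
    match = OSDMAP_ERR.search(line)
    if match is None:
        return None
    code = int(match.group("code"))
    return {
        "kind": match.group("kind"),        # e.g. "full"
        "code": code,                       # raw value from the log
        # Assumption: the value is a negated errno, as is usual in the kernel.
        "errno_name": errno.errorcode.get(abs(code), "unknown"),
        "epoch": int(match.group("epoch")),
        "offset": int(match.group("offset")),  # byte offset into the map
    }


if __name__ == "__main__":
    msg = ("[85115.711643] libceph: corrupt full osdmap (-2) "
           "epoch 248869 off 10063")
    print(parse_corrupt_osdmap(msg))
```

Under that assumption, the -2 in the message corresponds to ENOENT, and the offset gives the position in the encoded osdmap for epoch 248869 where decoding stopped.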

# modinfo libceph
filename:       /lib/modules/4.18.0-305.3.1.el8_4.x86_64/kernel/net/ceph/libceph.ko.xz
license:        GPL
description:    Ceph core library
author:         Patience Warnick <patience@newdream.net>
author:         Yehuda Sadeh <yehuda@hq.newdream.net>
author:         Sage Weil <sage@newdream.net>
rhelversion:    8.4
srcversion:     4A720ED724979ABE2F86C68
depends:        libcrc32c,dns_resolver
intree:         Y
name:           libceph
vermagic:       4.18.0-305.3.1.el8_4.x86_64 SMP mod_unload modversions 

The client's dmesg output with kernel debugging enabled was too large to attach. It can be downloaded from here:

https://s3.datacloud.helsinki.fi/matti:public/client-dmesg-with-kernel-debug-enabled


Files

osdmap.248869 (75.6 KB), Matti Saarinen, 06/07/2021 05:41 AM
crushmap.from-osdmap-248869 (6.95 KB), Matti Saarinen, 06/07/2021 05:41 AM