Project

General

Profile

Actions

Bug #46823

closed

nautilus: kceph w/ testing branch: mdsc_handle_session corrupt message mds0 len 67

Added by Patrick Donnelly over 3 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
Yes
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS, kceph
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

[  145.812885] libceph: mon2 (1)172.21.15.84:6790 session established
[  145.813553] libceph: client8321 fsid 5cf09433-2f5f-422b-acb4-803697e075a8
[  149.694847] libceph: mds0 (1)172.21.15.65:6835 socket closed (con state OPEN)
[  149.729899] libceph: mds0 (1)172.21.15.65:6835 connection reset
[  149.736003] libceph: reset on mds0
[  149.736005] ceph: mds0 closed our session
[  149.736007] ceph: mds0 reconnect start
[  149.737125] ceph: mds0 reconnect denied
[  149.738001] libceph: mds0 (1)172.21.15.65:6835 socket closed (con state NEGOTIATING)
[  150.899734] ceph: mdsc_handle_session corrupt message mds0 len 67
[  150.906018] header: 00000000: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[  150.906023] header: 00000010: 16 00 7f 00 03 00 43 00 00 00 00 00 00 00 00 00  ......C.........
[  150.906026] header: 00000020: 00 00 00 00 02 00 00 00 00 00 00 00 00 01 00 00  ................
[  150.906030] header: 00000030: 00 30 45 8c 74                                   .0E.t
[  150.906033]  front: 00000000: 0b 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[  150.906037]  front: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00  ................
[  150.906040]  front: 00000020: 0c 00 00 00 65 72 72 6f 72 5f 73 74 72 69 6e 67  ....error_string
[  150.906044]  front: 00000030: 0b 00 00 00 62 6c 61 63 6b 6c 69 73 74 65 64 00  ....blacklisted.
[  150.906047]  front: 00000040: 00 00 00                                         ...
[  150.906051] footer: 00000000: d6 ac a0 0c 00 00 00 00 00 00 00 00 3e d3 12 5a  ............>..Z
[  150.906055] footer: 00000010: e0 4d 48 94 05                                   .MH..

From dmesg output on hung node. Teuthology log:

2020-08-03T21:57:12.992 DEBUG:tasks.cephfs.kernel_mount:Unmounting client client.0...
2020-08-03T21:57:12.992 INFO:teuthology.orchestra.run:Running command with timeout 900
2020-08-03T21:57:12.993 INFO:teuthology.orchestra.run.smithi201:> sudo umount /home/ubuntu/cephtest/mnt.0
2020-08-03T21:57:13.033 INFO:teuthology.orchestra.run.smithi045:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-08-03T21:57:13.038 INFO:teuthology.orchestra.run.smithi065:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-08-03T21:57:13.040 INFO:teuthology.orchestra.run.smithi084:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-08-03T21:57:13.043 INFO:teuthology.orchestra.run.smithi101:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-08-03T21:57:13.047 INFO:teuthology.orchestra.run.smithi171:> sudo logrotate /etc/logrotate.d/ceph-test.conf
...

From: /ceph/teuthology-archive/pdonnell-2020-08-03_20:28:49-kcephfs-nautilus-distro-basic-smithi/5284713/teuthology.log

Actions #1

Updated by Jeff Layton over 3 years ago

  • Assignee changed from Jeff Layton to Patrick Donnelly

handle_session only really deals with the "front" part of the message, so the corruption is likely there.

op == 0xb (CEPH_SESSION_REJECT)
seq == 0
msg_version == 3 (from the header section)

...the next field (32 bits) after "seq" should be the session metadata length, but it's 0. __decode_session_metadata then does this:

static int __decode_session_metadata(void **p, void *end,
                                     bool *blacklisted)
{
        /* map<string,string> */
        u32 n;
        bool err_str;
        ceph_decode_32_safe(p, end, n, bad);   <<< n == 0
        while (n-- > 0) {
                u32 len;
                ceph_decode_32_safe(p, end, len, bad);
                ceph_decode_need(p, end, len, bad);
                err_str = !strncmp(*p, "error_string", len);
                *p += len;
                ceph_decode_32_safe(p, end, len, bad);
                ceph_decode_need(p, end, len, bad);
                if (err_str && strnstr(*p, "blacklisted", len))
                        *blacklisted = true;
                *p += len;
        }
        return 0;
bad:
        return -1;
}

We can see both "error_string" and "blacklisted" in the trailing data, but because the length field was 0, those never got parsed. This looks like an encoding error by the MDS.

Assigning back to Patrick for now, but let me know if you need me to take another look.

Actions #2

Updated by Jeff Layton over 3 years ago

To be clear:

[  150.906033]  front: 00000000: 0b 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
                                 |---op----| |---------seq---------| |--mdlen--|

I think that mdlen should have been > 0, but it wasn't. Looking at it more, it almost seems like an extra 16 bytes of 0's got injected between the seq and real mdlen, as it looks like the real mdlen is at offset 0x1c in the front section.

Actions #3

Updated by Jeff Layton over 3 years ago

  • Assignee changed from Patrick Donnelly to Jeff Layton
Actions #4

Updated by Jeff Layton over 3 years ago

I missed some of the fields. op and seq come from struct ceph_mds_session_head:

struct ceph_mds_session_head {
        __le32 op;
        __le64 seq;
        struct ceph_timespec stamp;
        __le32 max_caps, max_leases;
} __attribute__ ((packed));

The remaining fields account for the missing 16 bytes. So the metadata section probably did get decoded correctly. What does look hinky is that after the metadata field, there are supposed to be feature bit words, starting with a 32-bit length.

That length is 0, and the kclient doesn't expect that. That's probably where the problem is. That changed somewhat recently with this patch:

commit 0fa8263367db9287aa0632f96c1a5f93cc478150
Author: Jeff Layton <jlayton@kernel.org>
Date:   Tue Apr 28 08:10:22 2020 -0400

    ceph: fix endianness bug when handling MDS session feature bits

    Eduard reported a problem mounting cephfs on s390 arch. The feature
    mask sent by the MDS is little-endian, so we need to convert it
    before storing and testing against it.

    Cc: stable@vger.kernel.org
    Reported-and-Tested-by: Eduard Shishkin <edward6@linux.ibm.com>
    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Which broke the handling of a zero-length feature mask. I'll spin up a patch for upstream.

Actions #5

Updated by Jeff Layton over 3 years ago

  • Status changed from New to In Progress

Patch posted upstream and marked for stable:

https://marc.info/?l=ceph-devel&m=159655872206314&w=2

Actions #6

Updated by Jeff Layton over 3 years ago

So I tried reproducing this locally and couldn't get the error to pop. I suspect that the MDS I have is not sending a zero-length feature mask, even for reject messages.

I mean, this does seem a bit weird. The only thing that a version 3 struct adds is the feature mask, so you'd expect there to always be at least one word in there. Otherwise, why use a v3 struct?

While I intend to go ahead and fix this on the client, I have to wonder if we may have a subtle bug in the MDS code.

Actions #7

Updated by Jeff Layton over 3 years ago

  • Status changed from In Progress to Resolved

Fixed upstream.

Actions

Also available in: Atom PDF