Bug #62435

open

Pod unable to mount fscrypt encrypted cephfs PVC when it moves to another node

Added by Sudhin Bengeri 9 months ago. Updated 3 months ago.

Status:
Need More Info
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Here is our setup:
Kubernetes: 1.27.3
rook: 1.11.9
ceph: 17.2.6
OS: Ubuntu 20.04 with a kernel modified to support fscrypt
(Linux wkhd 6.3.0-rc4+ #6 SMP PREEMPT_DYNAMIC Mon May 22 22:48:41 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux)

1. Made changes to operator.yaml & common.yaml
2. Enabled fscrypt on CephFileSystem
3. Enabled fscrypt on storageclass

as suggested in https://rook.io/docs/rook/v1.11/Storage-Configuration/Ceph-CSI/ceph-csi-drivers/#enable-rbd-encryption-support
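
For context, this is roughly what those changes look like. It is a minimal sketch based on the linked Rook docs rather than our exact manifests; names such as rook-cephfs-encrypted, myfs and the encryptionKMSID value are placeholders, and step 2 is omitted here:

# 1. operator.yaml / common.yaml: enable the CSI encryption feature
kubectl -n rook-ceph patch configmap rook-ceph-operator-config \
  --type merge -p '{"data":{"CSI_ENABLE_ENCRYPTION":"true"}}'

# 3. StorageClass: request encryption and reference a KMS config entry
cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs-encrypted
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  clusterID: rook-ceph
  fsName: myfs
  pool: myfs-replicated
  encrypted: "true"
  encryptionKMSID: "user-secret-metadata"
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Delete
EOF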

After that, we were able to create the pod and mount the volume. But once the pod was deleted and recreated on a different node, we get:

Warning FailedMount 27s (x7 over 2m39s) kubelet MountVolume.MountDevice failed for volume "pvc-71b1ef3f-4b06-4809-a2ac-2c42c4477db4" : rpc error: code = Internal desc = fscrypt: unsupported state metadata=true kernel_policy=false


Files

dmesg-230830.txt (18.5 KB) Sudhin Bengeri, 08/30/2023 02:11 PM
Actions #1

Updated by Venky Shankar 9 months ago

  • Assignee set to Xiubo Li

Xiubo, please take this one.

Actions #2

Updated by Venky Shankar 9 months ago

  • Status changed from New to Triaged
Actions #3

Updated by Xiubo Li 9 months ago

Venky Shankar wrote:

Xiubo, please take this one.

Sure.

Actions #4

Updated by Xiubo Li 9 months ago

  • Status changed from Triaged to Need More Info

Hi Sudhin,

This is not cephfs fscrypt. You are encrypting at the disk layer, not the filesystem layer. My understanding is that you are just mounting an encrypted RBD block device. We don't support cephfs encryption yet.

BTW, do you have any dmesg logs from the new node?

Actions #5

Updated by Sudhin Bengeri 8 months ago

Hi Xiubo,

Thanks for your response.

Are you saying that cephfs does not support fscrypt? I am not exactly sure where we read it, but I remember coming across some docs which mentioned fscrypt being supported for cephfs as well, for instance:

https://github.com/ceph/ceph-csi/blob/devel/docs/design/proposals/cephfs-fscrypt.md
https://blog.rook.io/rook-v1-11-storage-enhancements-8001aa67e10e

Here are the dmesg logs we see when the error is reported (from the node on which the pod gets scheduled and fails to mount the PVC):

[Wed Aug 30 13:38:23 2023] libceph: mon0 (1)[xxxx:xxxx:xxxx:xxx1::94]:6789 session established
[Wed Aug 30 13:38:23 2023] libceph: client82004833 fsid 226694be-9975-4db9-8618-329b2a0633e1
[Wed Aug 30 13:38:25 2023] ceph: handle_cap_grant: cap grant attempt to change fscrypt_auth on non-I_NEW inode (old len 0 new len 48)

I have attached the logs around that time in case needed.

Please let me know if I can be of any further help.

Regards,
Sudhin

Actions #6

Updated by Xiubo Li 8 months ago

Sudhin Bengeri wrote:

Hi Xiubo,

Thanks for your response.

Are you saying that cephfs does not support fscrypt? I am not exactly sure where we read it, but I remember coming across some docs which mentioned fscrypt being supported for cephfs as well, for instance:

https://github.com/ceph/ceph-csi/blob/devel/docs/design/proposals/cephfs-fscrypt.md
https://blog.rook.io/rook-v1-11-storage-enhancements-8001aa67e10e

Hi Sudhin,

Please see https://tracker.ceph.com/issues/46690; it's still under review, which means it hasn't been merged into the Linux mainline yet.

We are planning to merge it in 6.5, and it is already in the `master` branch of the ceph/ceph-client repo.

Here are the dmesg logs we see when the error is reported (from the node on which the pod gets scheduled and fails to mount the PVC):

[Wed Aug 30 13:38:23 2023] libceph: mon0 (1)[xxxx:xxxx:xxxx:xxx1::94]:6789 session established
[Wed Aug 30 13:38:23 2023] libceph: client82004833 fsid 226694be-9975-4db9-8618-329b2a0633e1
[Wed Aug 30 13:38:25 2023] ceph: handle_cap_grant: cap grant attempt to change fscrypt_auth on non-I_NEW inode (old len 0 new len 48)

Which kernel were you using? It's strange that you can see the above logs, which were introduced by:

commit 2f22d41a33cf8d595a69272acbe56b9fbb454b72
Author: Jeff Layton <jlayton@kernel.org>
Date:   Thu Aug 25 09:31:09 2022 -0400

    ceph: handle fscrypt fields in cap messages from MDS

    Handle the new fscrypt_file and fscrypt_auth fields in cap messages. Use
    them to populate new fields in cap_extra_info and update the inode with
    those values.

    Signed-off-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: Xiubo Li <xiubli@redhat.com>
    Reviewed-and-tested-by: Luís Henriques <lhenriques@suse.de>
    Reviewed-by: Milind Changire <mchangir@redhat.com>
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

Are you using the ceph/ceph-client testing branch? How else could your kernel include this patch?
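
As a sketch, one way to check whether a kernel source tree already contains that commit, run from the top of the tree (note this only finds the upstream commit itself, not patches applied by hand from the mailing list):

# Check whether the fscrypt cap-handling commit is an ancestor of the tree's HEAD
git merge-base --is-ancestor 2f22d41a33cf8d595a69272acbe56b9fbb454b72 HEAD \
  && echo "commit present" || echo "commit missing"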

Also, your ceph version is old; upstream supports fscrypt starting from v18.0.0.

Thanks

Actions #7

Updated by Sudhin Bengeri 8 months ago

Hi Xiubo,

Here is the uname -a output from the nodes:
Linux wkhd 6.3.0-rc4+ #6 SMP PREEMPT_DYNAMIC Mon May 22 22:48:41 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

The kernel was built with this patch series:

https://patchwork.kernel.org/project/ceph-devel/cover/20230417032654.32352-1-xiubli@redhat.com
(which includes Jeff Layton's work).

We recently upgraded to ceph:v18.2.0 and rook-ceph:v1.12.1. Did anything in the dmesg output shared yesterday make you think that we were using an old ceph version?

Thanks.
Sudhin

Actions #8

Updated by Xiubo Li 8 months ago

Sudhin Bengeri wrote:

Hi Xiubo,

Here is the uname -a output from the nodes:
Linux wkhd 6.3.0-rc4+ #6 SMP PREEMPT_DYNAMIC Mon May 22 22:48:41 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

The kernel was built with this patch series:

https://patchwork.kernel.org/project/ceph-devel/cover/20230417032654.32352-1-xiubli@redhat.com
(which includes Jeff Layton's work).

We recently upgraded to ceph:v18.2.0 and rook-ceph:v1.12.1. Did anything in the dmesg output shared yesterday make you think that we were using an old ceph version?

Hi Sudhin,

Okay, so you just applied the patches from the mailing list. That v19 series is out of date; the latest version is v20. But it didn't change much, and that shouldn't matter here IMO.

Could you try to reproduce it again after enabling:

1. the kernel ceph.ko module's dynamic debug logs
2. debug_mds = 25 and debug_ms = 1 for the MDS daemons

and then provide those logs? (A sketch of typical commands follows below.)
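
For reference, a minimal sketch of how these are typically enabled, assuming root on the client node and a working ceph CLI against the cluster:

# 1. On the client node: enable dynamic debug output for the ceph kernel modules
echo 'module ceph +p' > /sys/kernel/debug/dynamic_debug/control
echo 'module libceph +p' > /sys/kernel/debug/dynamic_debug/control

# 2. On the cluster: raise the MDS debug levels
ceph config set mds debug_mds 25
ceph config set mds debug_ms 1

# Then reproduce the failure and collect dmesg from the client plus the MDS logs
# (usually under /var/log/ceph/ on the MDS hosts).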

Thanks

Actions #9

Updated by Sudhin Bengeri 7 months ago

Hi Xiubo,

We built version v20 of the patch series and installed it on the ceph HD nodes; here is the updated uname -a from a node:
Linux wkhd 6.3.0-rc4-custom2 #7 SMP PREEMPT_DYNAMIC Fri Oct 13 00:55:23 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

We observed that this issue is resolved with this version, that is, when a pod moved to another node it was able to mount the fscrypt encrypted cephfs PVC. We have collected the ceph.ko module's dynamic debug logs and the MDS debug logs, but since it worked, do you still need them?

Please let us know when this kernel is going to be GA.

Thanks.

Actions #10

Updated by Xiubo Li 7 months ago

Sudhin Bengeri wrote:

Hi Xiubo,

We built version v20 of the patch series and installed it on the ceph HD nodes; here is the updated uname -a from a node:
Linux wkhd 6.3.0-rc4-custom2 #7 SMP PREEMPT_DYNAMIC Fri Oct 13 00:55:23 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

We observed that this issue is resolved with this version, that is, when a pod moved to another node it was able to mount the fscrypt encrypted cephfs PVC. We have collected the ceph.ko module's dynamic debug logs and the MDS debug logs, but since it worked, do you still need them?

Okay, then those logs are of no use to me. If you can reproduce it again, please provide the logs.

Thanks

Please let us know when this kernel is going to be GA.

Thanks.

Actions #11

Updated by Sudhin Bengeri 3 months ago

Hi Xiubo,

We tried with the following updated setup:
Kubernetes: 1.28.4
rook: 1.13.2
ceph: 18.2.1
OS: Ubuntu 20.04 with the latest 6.8-rc1 kernel
Linux wkhd2 6.8.0-060800rc1-generic #202401212233 SMP PREEMPT_DYNAMIC Sun Jan 21 22:43:23 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

With this version we see that pods are able to mount fscrypt encrypted cephfs PVCs.

However, we see the following errors when an fscrypt encrypted cephfs PVC is mounted in multiple pods and these pods get scheduled to different nodes: the PVC does not get mounted in either pod, and the pods do not come up.

Warning FailedMount 20m kubelet MountVolume.MountDevice failed for volume "pvc-e1c3d29f-6ec0-4a34-aefc-74420b7c09d1" : rpc error: code = Internal desc = "/etc/fscrypt.conf" is invalid: proto: syntax error (line 1:1): unexpected token
Warning FailedMount 116s (x11 over 19m) kubelet MountVolume.MountDevice failed for volume "pvc-e1c3d29f-6ec0-4a34-aefc-74420b7c09d1" : rpc error: code = Internal desc = fscrypt: unsupported state metadata=true kernel_policy=false"
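
For reference, one way to dump the /etc/fscrypt.conf that the CSI node plugin is complaining about; the pod label and container name are assumptions based on a default Rook deployment:

# Print /etc/fscrypt.conf from each csi-cephfsplugin pod
for pod in $(kubectl -n rook-ceph get pods -l app=csi-cephfsplugin -o name); do
  echo "== $pod =="
  kubectl -n rook-ceph exec "$pod" -c csi-cephfsplugin -- cat /etc/fscrypt.conf
done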

Further, we have observed that the ceph storage nodes crash frequently.

Please let us know if you need any debug info from the nodes.

Thanks.
Sudhin
