Project

General

Profile

Bug #49873

ceph_lremovexattr does not return error on file in ceph pacific

Added by John Mulligan 9 months ago. Updated 8 months ago.

Status:
Duplicate
Priority:
Normal
Category:
-
Target version:
% Done:

0%

Source:
Development
Tags:
Backport:
pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Client, MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

While running our go-ceph CI against pacific for the first time our CI failed in the xattr tests.
It expected a call to ceph_lremovexattr to a regular file to return an error versus the symlink which is expected to pass. Unfortunately, the ceph pacific version does not appear to return an error when ceph_lremovexattr is invoked on the file path.

Example:
create link "link_name" from file "file_name"
ceph_lsetxattr(link_name, xattr_name, value, xattr_flags) : expected ok
ceph_lremovexattr(file_name, xattr_name) : expected error
ceph_lremovexattr(link_name, xattr_name) : expected ok

In both nautilus and octopus this is working as expected. On pacific the call "ceph_lremovexattr(file_name, xattr_name)" does not return an error.

Interestingly, calling "ceph_lremovexattr(link_name, xattr_name)" twice does cause an error (ENODATA) to be returned the 2nd time. So this implies the function call is handling some error cases. Perhaps the function isn't handling normal (non-link) files correctly?

I am using the 16.1.0 build that is part of ceph container ceph/daemon-base:lastest-pacific (sha256:463b03f0c48a3c760003daa8a5ab77e87120266a784dabe489293943ec20cc3d).


Related issues

Duplicates CephFS - Bug #49833: MDS should return -ENODATA when asked to remove xattr that doesn't exist Resolved

History

#1 Updated by Patrick Donnelly 9 months ago

John Mulligan wrote:

While running our go-ceph CI against pacific for the first time our CI failed in the xattr tests.
It expected a call to ceph_lremovexattr to a regular file to return an error versus the symlink which is expected to pass. Unfortunately, the ceph pacific version does not appear to return an error when ceph_lremovexattr is invoked on the file path.

lremovexattr(2) says:

lremovexattr() is identical to removexattr(), except in the case of a symbolic link, where the extended attribute is removed from the link itself, not the file that it refers to.

I don't see why it should fail for a regular file.

#2 Updated by John Mulligan 9 months ago

To try and clarify:

The xattr is set on the link. There should be no xattr of that name on the file the link points to. The test then tries to run (ceph_)lremovexattr on both paths. It expects the call to succeed on the link and to return an error on the regular file (since it is not present there).

This test works on nautilus and octopus versions.

Also to clarify, this is using libcephfs, not going via fuse/syscalls.

Just to double check, I ran tests on a local file system with syscalls, and it works the way ceph did previously, returning ENODATA when using lremovexattr on the "wrong" path. The only wrinkle is that on a local FS I could not use the "user." xattr namespace on links, so I just ran my tests as root and used a different namespace.

#3 Updated by Patrick Donnelly 9 months ago

  • Status changed from New to Triaged
  • Assignee set to Sidharth Anupkrishnan
  • Target version set to v17.0.0
  • Source set to Development
  • Backport set to pacific
  • Component(FS) Client, MDS added

John Mulligan wrote:

To try and clarify:

The xattr is set on the link. There should be no xattr of that name on the file the link points to. The test then tries to run (ceph_)lremovexattr on both paths. It expects the call to succeed on the link and to return an error on the regular file (since it is not present there).

This test works on nautilus and octopus versions.

Also to clarify, this is using libcephfs, not going via fuse/syscalls.

Just to double check, I ran tests on a local file system with syscalls, and it works the way ceph did previously, returning ENODATA when using lremovexattr on the "wrong" path. The only wrinkle is that on a local FS I could not use the "user." xattr namespace on links, so I just ran my tests as root and used a different namespace.

Ah, I see. That does sound like a bug.

#4 Updated by John Mulligan 9 months ago

I'll also note that I did find the following issue
https://tracker.ceph.com/issues/49833

But forgot to reference it in my original reply, sorry. Where it differs from mine, is that I do see ENODATA errors in some circumstances, but not all of them (as noted above). Plus, that issue references ceph_removexattr vs. ceph_lremovexattr. But I suppose that there could be some relationship between them so I should have mentioned it earlier.

#5 Updated by Jeff Layton 9 months ago

John, I fixed a similar sounding bug in the MDS yesterday:

https://tracker.ceph.com/issues/49833

Are you able to test MDS patches? If so, then doing this test against an MDS with that patch would be interesting.

Libcephfs always does a sync call to the MDS, so it seems unlikely that the intermittent behavior you see is due to sometimes having caps and sometimes not. It might be though that the file with the "missing" xattr has (or had) other xattrs when it works. The MDS only failed to return -ENODATA when removing a non-existing xattr when there were none at all. If there were others, then it worked correctly.

#6 Updated by John Mulligan 9 months ago

Jeff Layton wrote:

John, I fixed a similar sounding bug in the MDS yesterday:

https://tracker.ceph.com/issues/49833

Are you able to test MDS patches? If so, then doing this test against an MDS with that patch would be interesting.

In theory, I could. But there are currently a lot of layers in between (the containers are built from rpms, etc). Plus I haven't built ceph myself as of yet so there's a learning curve involved too. That said I may give it a try but it is doubtful I would be able to do so quickly.

Libcephfs always does a sync call to the MDS, so it seems unlikely that the intermittent behavior you see is due to sometimes having caps and sometimes not. It might be though that the file with the "missing" xattr has (or had) other xattrs when it works. The MDS only failed to return -ENODATA when removing a non-existing xattr when there were none at all. If there were others, then it worked correctly.

The last three sentences certainly sounds very similar to what I'm seeing. But I do see that you marked 49833 as needing a backport to octopus, where octopus works fine for me.

#7 Updated by Sidharth Anupkrishnan 8 months ago

I tested this incorrectly last week using libcephfs due to an incorrect reproduction of the steps. Upon retesting correctly, I found Jeff's patch: https://github.com/ceph/ceph/pull/40158/ does indeed fix the issue. This issue is non-existent in octopus and nautilus since there is no such check for whether xattrs exist or not. There is only a check to see if the particular xattr exists(See: https://github.com/ceph/ceph/blob/octopus/src/mds/Server.cc#L5974 for Octopus and https://github.com/ceph/ceph/blob/nautilus/src/mds/Server.cc#L5732 for Nautilus). The change in xattr handling due to https://github.com/ceph/ceph/pull/36603 are included in Pacific and so there is a change in behaviur due to https://github.com/ceph/ceph/pull/36603/commits/a641e3c7600c7323532f95641a4b229c29985d55#diff-277dc6e796ccecb6aa14c9357f7a86898d6ddcf4113c110a4e00e0a10f6fefa7R5867. This is what was causing the issue. And upon retesting it with Jeff's patch the issue does not persist.

#8 Updated by John Mulligan 8 months ago

That sounds sensible to me, thanks! I did attempt to build ceph a copule of weeks ago, but unfortunately I was unable to make much progress as it seems I can't run a ceph build and a desktop environment on my laptop at the same time. I think the memory requirements to build ceph triggered the oom killer on my machine.

Anyway, now that I have gotten that off my chest.... I see that the patch Jeff linked to is already back-ported for Pacific. Once the next pacific bugfix release is out I will make sure I retest this and I will close out the issue if all goes as I expect it will. Thanks!

#9 Updated by Patrick Donnelly 8 months ago

  • Status changed from Triaged to Duplicate

Thanks for checking that Sidharth.

#10 Updated by Patrick Donnelly 8 months ago

  • Duplicates Bug #49833: MDS should return -ENODATA when asked to remove xattr that doesn't exist added

Also available in: Atom PDF