Bug #50719
openxattr returning from the dead (sic!)
Description
Hi Ceph folks,
slow from the Samba team here. :)
I'm investigating a problem at a customer site where xattr data from files is reappearing after a few hours. No kidding.
I have straces that show that a process creates and subsequently removes a specific xattr. At that point I can look at the filesystem, list the xattrs of the file, and the xattr is not there. Coming back a few hours later, all of a sudden the xattr is back.
This is happening with Ceph 14.2.16 on a Ceph mount:
- mount | grep ceph
172.17.252.46,172.17.252.47,172.17.252.48:6789:/ on /ceph type ceph (rw,relatime,name=ctdb,secret=<hidden>,acl,wsize=16777216)
I'm 100% sure that the file is not touched by any other process after the xattr has been deleted:
- because I've worked with the customer to reproduce the problem on an isolated directory tree
- because the output of stat from before and after the return-of-the-dead shows identical ctime and mtime
This is reliably (more or less) reproducible.
The higher level workflow where this is observed is the following:
- Samba fileserver version 4.8.11
- Samba share configured with Mac support, i.e.
vfs objects = fruit streams_xattr acl_xattr
(but not using the ceph VFS module)
- Mac client connecting over SMB
- Mac client copies one or more files to the share
- While the copy of a file is in progress, the Mac client will initially write a special SMB named stream called "AFP_AfpInfo" to the file, containing a special Mac metadata type/creator code "brokMACS" (an 8-char string). The server will store the stream data in an xattr in the filesystem (due to streams_xattr), the xattr name being "user.DosStream.AFP_AfpInfo:$DATA". For the Mac user the behaviour is that files with such a type/creator code are greyed out and inaccessible, which makes sense while the copy is in progress. This is by design.
- When the client has finished copying the file, it overwrites the type/creator string with 0 bytes, which Samba will internally translate to a removexattr() syscall. So the resulting sequence of xattr syscalls for the copy of one file is:
14315 11:09:58.120175 removexattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA") = -1 ENODATA (No data available)
14315 11:09:58.121224 setxattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA", "", 1, XATTR_CREATE) = 0
14315 11:09:58.121322 setxattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA", "AFP\0\0\0\1\0\0\0\0\0\200\0\0\0brokMACS\0\0\0\0\0\0\0", 61, 0) = 0
14315 11:10:00.102080 removexattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA") = 0
14315 11:10:00.212989 removexattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA") = -1 ENODATA (No data available)
14315 11:10:00.213699 setxattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA", "", 1, XATTR_CREATE) = 0
14315 11:10:00.213790 setxattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA", "", 1, 0) = 0
14315 11:10:00.214792 removexattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA") = 0
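For anyone trying to reproduce this outside Samba, the syscall sequence above can be replayed with a short standalone program. Below is a minimal sketch in Python (the function is mine; the xattr name and payload are taken from the strace) that covers the first create/write/remove cycle — the real workload does this from smbd in C, so treat it as an approximation only:

```python
import errno
import os

XATTR = "user.DosStream.AFP_AfpInfo:$DATA"
# 61-byte AFP_AfpInfo blob carrying the "brokMACS" type/creator code,
# reconstructed from the strace above
PAYLOAD = (b"AFP\x00\x00\x00\x01" + b"\x00" * 5 + b"\x80" + b"\x00" * 3
           + b"brokMACS" + b"\x00" * 37)

def replay(path):
    """Replay the first xattr syscall cycle from the strace for one file."""
    try:
        os.removexattr(path, XATTR)                    # 1st call: expect ENODATA
    except OSError as e:
        if e.errno != errno.ENODATA:
            raise
    os.setxattr(path, XATTR, b"\x00", os.XATTR_CREATE)  # create with 1 NUL byte
    os.setxattr(path, XATTR, PAYLOAD, 0)                # write the AFP blob
    os.removexattr(path, XATTR)                         # delete it again
    # After this the xattr must stay gone; re-check (hours) later.
    return XATTR in os.listxattr(path)
```

Running this in a loop over many files on a CephFS mount, then re-checking with getfattr hours later, should show whether the attribute resurfaces without Samba in the picture.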
Looking at the filesystem after the copy, we can see the xattr is indeed not present:
$ getfattr -n 'user.DosStream.AFP_AfpInfo:$DATA' -e hex Survival-Cheat-Sheet.pdf
Survival-Cheat-Sheet.pdf: user.DosStream.AFP_AfpInfo:$DATA: No such attribute
The inode timestamps at that point in time are
Access: 2021-04-23 11:09:58.055664217 +0200
Modify: 2021-02-22 15:36:42.000000000 +0100
Change: 2021-04-23 11:10:00.213654990 +0200
Now, after waiting a few hours and repeating the above command, out of nowhere the xattr data is back:
$ getfattr -n 'user.DosStream.AFP_AfpInfo:$DATA' -e hex Survival-Cheat-Sheet.pdf
# file: Survival-Cheat-Sheet.pdf
user.DosStream.AFP_AfpInfo:$DATA=0x4146500000000100000000008000000062726f6b4d41435300000000000000000000000000000000000000000000000000000000000000000000000000
Timestamps (no change compared to above):
Access: 2021-04-23 11:09:58.055664217 +0200
Modify: 2021-02-22 15:36:42.000000000 +0100
Change: 2021-04-23 11:10:00.213654990 +0200
As the "copy-in-progress" type/creator is back, the file is greyed out for the Mac users and Mac users can't access the file.
What can we do to diagnose this? And, can you help us here? Thanks!
Cheers!
-slow
Updated by Patrick Donnelly almost 3 years ago
- Status changed from New to Triaged
- Assignee set to Jeff Layton
Updated by Jeff Layton almost 3 years ago
What kernel version are you running this on? Is this something easily reproducible, or does it take a while?
There are two different modes when you do a setxattr or removexattr:
If you have the exclusive xattr cap (aka Xx) then the set or removexattr will be done locally and the cap dirtied so the new xattr blob would eventually get sent to the MDS. If you don't have the Xx cap, then it does a synchronous SETXATTR call to the MDS.
So, it's one (or possibly a combination) of three things:
- the client did the change locally and never flushed the updated xattr blob to the MDS for some reason
- the client did flush the blob to the MDS, but it dropped the update onto the floor or reverted it for some reason
- the client did a synchronous SETXATTR call and it didn't stick for some reason
Given that you saw the change in a getxattr after it happened, #3 seems less likely than the others.
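As an aside, the cap letters Jeff refers to show up in the kernel client's debugfs caps dump (/sys/kernel/debug/ceph/<fsid>.<client>/caps). A tiny sketch of how one might check a cap string for the exclusive xattr cap — the sample strings and the parsing of the cap-string format are my own reading, not output from this cluster:

```python
import re

def has_exclusive_xattr_cap(caps: str) -> bool:
    """Return True if a ceph cap string (e.g. 'pAsxLsxXsxFsxcrwb') includes
    the exclusive xattr cap: an 'x' among the flags following 'X'
    ('s' = shared, 'x' = exclusive)."""
    m = re.search(r"X([sx]*)", caps)
    return bool(m) and "x" in m.group(1)

print(has_exclusive_xattr_cap("pAsxLsxXsxFsxcrwb"))  # True: holds Xs and Xx
print(has_exclusive_xattr_cap("pAsLsXsFscr"))        # False: only shared Xs
```

With Xx held, the client applies the removexattr locally and only flushes the dirtied xattr blob to the MDS later; without it, the call goes to the MDS synchronously — which is why the three failure modes above differ.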
Updated by Ralph Böhme almost 3 years ago
Hi Jeff,
thanks for looking into this!
Jeff Layton wrote:
What kernel version are you running this on?
- uname -r
3.10.0-1062.18.1.el7.x86_64
Is this something easily reproducible, or does it take a while?
It's reliably reproducible.
There are two different modes when you do a setxattr or removexattr:
If you have the exclusive xattr cap (aka Xx) then the set or removexattr will be done locally and the cap dirtied so the new xattr blob would eventually get sent to the MDS. If you don't have the Xx cap, then it does a synchronous SETXATTR call to the MDS.
So, it's one (or possibly a combination) of three things:
- the client did the change locally and never flushed the updated xattr blob to the MDS for some reason
- the client did flush the blob to the MDS, but it dropped the update onto the floor or reverted it for some reason
- the client did a synchronous SETXATTR call and it didn't stick for some reason
Given that you saw the change in a getxattr after it happened, #3 seems less likely than the others.
Any suggestion how to diagnose this further?
Thanks!
-slow
Updated by Jeff Layton almost 3 years ago
Ralph Böhme wrote:
What kernel version are you running this on?
- uname -r
3.10.0-1062.18.1.el7.x86_64
Ok. RHEL7's kcephfs client is quite old. It's possible that this is something fixed in a more recent kernel version. It might be good to test something closer to mainline (a RHEL8.4-ish kernel is probably close enough here too).
It's reliably reproducible.
Good. Is it reproducible without samba? A standalone reproducer would be nice if possible.
There are two different modes when you do a setxattr or removexattr:
If you have the exclusive xattr cap (aka Xx) then the set or removexattr will be done locally and the cap dirtied so the new xattr blob would eventually get sent to the MDS. If you don't have the Xx cap, then it does a synchronous SETXATTR call to the MDS.
So, it's one (or possibly a combination) of three things:
- the client did the change locally and never flushed the updated xattr blob to the MDS for some reason
- the client did flush the blob to the MDS, but it dropped the update onto the floor or reverted it for some reason
- the client did a synchronous SETXATTR call and it didn't stick for some reason
Given that you saw the change in a getxattr after it happened, #3 seems less likely than the others.
Any suggestion how to diagnose this further?
I think trying to first reproduce this on a newer kernel would be the best thing. That would tell us whether we're looking at something that is still broken upstream or whether it's already fixed. Depending on the results of that, we'll need to do more to track down the actual bug, of course.
Updated by Ralph Böhme almost 3 years ago
Jeff Layton wrote:
Ok. RHEL7's kcephfs client is quite old. It's possible that this is something fixed in a more recent kernel version. It might be good to test something closer to mainline (a RHEL8.4-ish kernel is probably close enough here too).
Ok. Will try to get some systems with newer kernels and see if the problem reproduces there.
Is it reproducible without samba? A standalone reproducer would be nice if possible.
If it still reproduces on newer kernel I'll prepare a simple program that repeats the pattern.
Thanks!
-slow
Updated by Jeff Layton over 2 years ago
Ralph, ping? Were you ever able to determine whether this was fixed in later kernels?
Updated by Ralph Böhme over 2 years ago
Jeff Layton wrote:
Ralph, ping? Were you ever able to determine whether this was fixed in later kernels?
At first the problem seemed to have been solved after updating to newer kernel and ceph versions. But then a few days later it came back, worse than before. I'm going to work on trying to reproduce this with a simple local test program in the next few days.
Updated by Ralph Böhme over 2 years ago
Ralph Böhme wrote:
I'm going to work on trying to reproduce this with a local simple test program in the next days.
No luck. :/ I tried several times with a C program modelled closely after the strace I have from an smbd where, at the end, after a few hours of waiting, the "xattr returns from the dead" problem was seen.
Current versions:
Ceph Server: 14.2.21
Ceph Client: # uname -r -> 4.18.0-310.el8.x86_64
The servers will be updated in a few months; I'll report back if anything changes then.
Updated by Jeff Layton almost 2 years ago
- Status changed from Triaged to Can't reproduce
- Assignee changed from Jeff Layton to Xiubo Li
No response in several months. Closing case. Ralph, feel free to reopen if you have more info to share.
Updated by Thomas Hukkelberg 12 months ago
Hi Jeff!
Just wanted to let you know that this issue is still relevant and severe with more recent versions of both Ceph and Linux. We run Ceph 16.2.11 and Linux 5.13.19 with Samba 4.17.6; on the metadata servers we run Linux 6.2.9. We can easily reproduce this issue by copying a handful (50+) of files from a macOS workstation to a Samba share with vfs_fruit. About 5% of the files still have the brokMACS xattr sequence set after copying is finished.
Another interesting find is that it only seems to affect files whose format is known to macOS, like pdf, png, wav, txt etc. If I copy a set of files with no extension or an unknown one, the brokMACS sequence is not seen on those files.
See recent discussion on the samba mailing list: https://lists.samba.org/archive/samba/2023-May/245150.html
Updated by Tobias Hachmer 11 months ago
Hi Jeff,
we're also affected by this issue and can confirm the behaviour Thomas described.
Versions here:
Ceph 17.2.6 with kernel 5.10.0-23-amd64 on Debian Bullseye
Samba 4.18.2 (sernet packages) with kernel 5.10.0-23-amd64 on Debian Bullseye
It would be great if this issue could be reopened and investigated further.
Thanks!
Updated by Xiubo Li 11 months ago
Tobias Hachmer wrote:
Hi Jeff,
we're also affected by this issue and can confirm the behaviour Thomas described.
Versions here:
Ceph 17.2.6 with kernel 5.10.0-23-amd64 on Debian Bullseye
Samba 4.18.2 (sernet packages) with kernel 5.10.0-23-amd64 on Debian Bullseye
It would be great if this issue could be reopened and investigated further.
Thanks!
Hi Tobias
Could you reproduce it with the MDS debug logs enabled?
For the MDS daemons, set:
debug_mds = 25
debug_ms = 1
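For reference, applied as a ceph.conf fragment on the MDS hosts this would be (the section placement is the usual convention; the same settings can also be applied at runtime with `ceph config set mds debug_mds 25` and `ceph config set mds debug_ms 1`):

```ini
# Verbose MDS logging for this investigation; revert once logs are captured,
# as these levels are very noisy.
[mds]
debug_mds = 25
debug_ms = 1
```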
And also if possible please enable the kclient debug log:
echo "module ceph +p" | sudo tee /sys/kernel/debug/dynamic_debug/control
If you can provide the above logs I can find the root cause soon.
Thanks
- Xiubo
Updated by Tobias Hachmer 11 months ago
Hi Xiubo,
thanks for taking care of this issue and the debug info.
We only have a production system here. Can you please describe the impact of the two debug options in a bit more detail? Will this lead to higher/heavy load on the MDS, decreased performance, big log files, etc.?
Thanks
Tobias
Updated by Xiubo Li 11 months ago
Tobias Hachmer wrote:
Hi Xiubo,
thanks for taking care of this issue and the debug info.
We have here only a production system. Can you please describe the impact of the two debug options a bit more in detail. Will this lead to higher/heavy load of the MDS, decreased performance, big load of log file, etc.?
For a production system this will certainly lead to heavy load and performance issues in both the kclient and the MDSs.
But if you can always reproduce this, then it should be okay IMO: you can just capture the logs and then disable the debug logs on both sides afterwards.
Thanks
- Xiubo
Updated by Austin Axworthy 2 months ago
Hello,
I've come across this Ceph issue and noticed it hasn't been updated in 9 months. I aim to shed light on this bug again, hoping to contribute to finding a resolution. Please let me know what information I can provide to expedite the process. Thank you.
Updated by Xiubo Li 2 months ago
Austin Axworthy wrote:
Hello,
I've come across this Ceph issue and noticed it hasn't been updated in 9 months. I aim to shed light on this bug again, hoping to contribute to finding a resolution. Please let me know what information I can provide to expedite the process. Thank you.
Hi Austin,
If you can also reproduce it, could you help provide the debug logs I mentioned in comment#12?
Thanks.
Updated by Matthew Hutchinson about 2 months ago
We are currently working on recreating this issue internally, as this was a customer cluster that was having the issue Austin mentioned.
As a workaround for our customer, we added "vfs_streams_depot" to our SMB shares so that it does not put the xattr on the individual files and instead creates a single depot file which contains them all.
Once I have recreated this issue I will be able to provide the logs from comment #12.
Updated by Xiubo Li about 2 months ago
Matthew Hutchinson wrote:
We are currently working on recreating this issue internally as this was a customer cluster that was having the issue Austin mentioned.
As a workaround to resolve this issue for our customer, we added "vfs_streams_depot" to our SMB shares so that it does not put the attr on the individual files and instead makes a single depot file which contains them all
Once I have recreated this issue I will be able to provide the logs from comment #12
Sounds good. Thanks!
Updated by Matthew Hutchinson about 2 months ago
- File control added
- File ceph-mds.gw1.log.7z added
Xiubo Li wrote:
Matthew Hutchinson wrote:
We are currently working on recreating this issue internally as this was a customer cluster that was having the issue Austin mentioned.
As a workaround to resolve this issue for our customer, we added "vfs_streams_depot" to our SMB shares so that it does not put the attr on the individual files and instead makes a single depot file which contains them all
Once I have recreated this issue I will be able to provide the logs from comment #12
Sounds good. Thanks!
I was able to recreate the issue on a testing cluster. Attached are the MDS log and /sys/kernel/debug/dynamic_debug/control.
On the Mac the files are greyed out and state that they are locked and cannot be opened.
Let me know if there is anything else you would like me to provide.
Updated by Matthew Hutchinson about 1 month ago
Is there anything in the logs you saw that could be causing this issue? I am eager to get this resolved for all of our Mac clients using CephFS.
Updated by Xiubo Li about 1 month ago
Matthew Hutchinson wrote:
Is there anything in the logs you saw that could be causing this issue? I am eager to get this resolved for all of our MAC clients using cephfs
Sorry, I missed your last comments; I will check them tomorrow.
Updated by Xiubo Li about 1 month ago
Matthew Hutchinson wrote:
Is there anything in the logs you saw that could be causing this issue? I am eager to get this resolved for all of our MAC clients using cephfs
BTW, which xattr were you using? Still user.DosStream.AFP_AfpInfo? I couldn't see any relevant entries in the log files you uploaded.
The MDS log file is too short and I didn't see anything about this in it.
Updated by Matthew Hutchinson about 1 month ago
- File ceph-mds.gw1.log.7z.001 added
We are currently using "streams_xattr" and "acl_xattr".
Updated by Xiubo Li about 1 month ago
Matthew Hutchinson wrote:
We are currently using "streams_xattr" "acl_xattr"
Hi Matthew,
BTW, do you have the full logs, or could you upload them somewhere I can access? This file is still the same. I need the whole logs, from the moment you remove the xattr until the problem happens.
And if possible, it would be better if you could also enable the kernel ceph client's debug logs.
Thanks
Updated by Matthew Hutchinson 15 days ago
Those MDS logs are everything: they run from the moment I built the MDS services until you requested the logs, so they should include the issue.
This is a VM cluster, so I can spin it back up and gather any additional info you may need.