Bug #50719
openxattr returning from the dead (sic!)
Description
Hi Ceph folks,
slow from the Samba team here. :)
I'm investigating a problem at a customer site where xattr data from files is reappearing after a few hours. No kidding.
I have straces that show that a process creates and subsequently removes a specific xattr. At that point I can look at the filesystem, list the xattrs of the file, and the xattr is not there. Coming back a few hours later, all of a sudden the xattr is back.
This is happening with Ceph 14.2.16 on a Ceph mount:
- mount | grep ceph
172.17.252.46,172.17.252.47,172.17.252.48:6789:/ on /ceph type ceph (rw,relatime,name=ctdb,secret=<hidden>,acl,wsize=16777216)
I'm 100% sure that the file is not touched by any other process after the xattr has been deleted:
- because I've worked with the customer to reproduce the problem on an isolated directory tree
- because the output of stat from before and after the return-of-the-dead shows identical ctime and mtime
This is reliably (more or less) reproducible.
The higher level workflow where this is observed is the following:
- Samba fileserver version 4.8.11
- Samba share configured with Mac support, i.e.
vfs objects = fruit streams_xattr acl_xattr
(but not using the ceph VFS module)
- Mac client connecting over SMB
- Mac client copies one or more files to the share
- While the copy of a file is in progress, the Mac client will initially write a special SMB named stream called "AFP_AfpInfo" to the file, containing a special Mac metadata type/creator code "brokMACS" (an 8-char string). The server will store the stream data in an xattr in the filesystem (due to streams_xattr), the xattr name being "user.DosStream.AFP_AfpInfo:$DATA". For the Mac user the behaviour is that files with such a type/creator code are greyed out and inaccessible, which makes sense while the copy is in progress. This is by design.
- When the client has finished copying the file, it overwrites the type/creator string with 0 bytes, which Samba will internally translate to a removexattr() syscall. So the resulting sequence of xattr syscalls for the copy of one file is:
14315 11:09:58.120175 removexattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA") = -1 ENODATA (No data available)
14315 11:09:58.121224 setxattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA", "", 1, XATTR_CREATE) = 0
14315 11:09:58.121322 setxattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA", "AFP\0\0\0\1\0\0\0\0\0\200\0\0\0brokMACS\0\0\0\0\0\0\0", 61, 0) = 0
14315 11:10:00.102080 removexattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA") = 0
14315 11:10:00.212989 removexattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA") = -1 ENODATA (No data available)
14315 11:10:00.213699 setxattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA", "", 1, XATTR_CREATE) = 0
14315 11:10:00.213790 setxattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA", "", 1, 0) = 0
14315 11:10:00.214792 removexattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA") = 0
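For anyone trying to reproduce this outside Samba, the syscall sequence above can be replayed with a short standalone program. Below is a minimal sketch in Python (the function is mine; the xattr name and payload are taken from the strace) that covers the first create/write/remove cycle — the real workload does this from smbd in C, so treat it as an approximation only:

```python
import errno
import os

XATTR = "user.DosStream.AFP_AfpInfo:$DATA"
# 61-byte AFP_AfpInfo blob carrying the "brokMACS" type/creator code,
# reconstructed from the strace above
PAYLOAD = (b"AFP\x00\x00\x00\x01" + b"\x00" * 5 + b"\x80" + b"\x00" * 3
           + b"brokMACS" + b"\x00" * 37)

def replay(path):
    """Replay the first xattr syscall cycle from the strace for one file."""
    try:
        os.removexattr(path, XATTR)                    # 1st call: expect ENODATA
    except OSError as e:
        if e.errno != errno.ENODATA:
            raise
    os.setxattr(path, XATTR, b"\x00", os.XATTR_CREATE)  # create with 1 NUL byte
    os.setxattr(path, XATTR, PAYLOAD, 0)                # write the AFP blob
    os.removexattr(path, XATTR)                         # delete it again
    # After this the xattr must stay gone; re-check (hours) later.
    return XATTR in os.listxattr(path)
```

Running this in a loop over many files on a CephFS mount, then re-checking with getfattr hours later, should show whether the attribute resurfaces without Samba in the picture.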
Looking at the filesystem after the copy, we can see the xattr is indeed not present:
$ getfattr -n 'user.DosStream.AFP_AfpInfo:$DATA' -e hex Survival-Cheat-Sheet.pdf
Survival-Cheat-Sheet.pdf: user.DosStream.AFP_AfpInfo:$DATA: No such attribute
The inode timestamps at that point in time are
Access: 2021-04-23 11:09:58.055664217 +0200
Modify: 2021-02-22 15:36:42.000000000 +0100
Change: 2021-04-23 11:10:00.213654990 +0200
Now, after waiting a few hours and repeating the above command, out of nowhere the xattr data is back:
$ getfattr -n 'user.DosStream.AFP_AfpInfo:$DATA' -e hex Survival-Cheat-Sheet.pdf
# file: Survival-Cheat-Sheet.pdf
user.DosStream.AFP_AfpInfo:$DATA=0x4146500000000100000000008000000062726f6b4d41435300000000000000000000000000000000000000000000000000000000000000000000000000
Timestamps (no change compared to above):
Access: 2021-04-23 11:09:58.055664217 +0200
Modify: 2021-02-22 15:36:42.000000000 +0100
Change: 2021-04-23 11:10:00.213654990 +0200
As the "copy-in-progress" type/creator is back, the file is greyed out for the Mac users and Mac users can't access the file.
What can we do to diagnose this? And, can you help us here? Thanks!
Cheers!
-slow
Updated by Patrick Donnelly almost 3 years ago
- Status changed from New to Triaged
- Assignee set to Jeff Layton
Updated by Jeff Layton almost 3 years ago
What kernel version are you running this on? Is this something easily reproducible, or does it take a while?
There are two different modes when you do a setxattr or removexattr:
If you have the exclusive xattr cap (aka Xx) then the set or removexattr will be done locally and the cap dirtied so the new xattr blob would eventually get sent to the MDS. If you don't have the Xx cap, then it does a synchronous SETXATTR call to the MDS.
So, it's one (or possibly a combination) of three things:
- the client did the change locally and never flushed the updated xattr blob to the MDS for some reason
- the client did flush the blob to the MDS, but it dropped the update onto the floor or reverted it for some reason
- the client did a synchronous SETXATTR call and it didn't stick for some reason
Given that you saw the change in a getxattr after it happened, #3 seems less likely than the others.
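As an aside, the cap letters Jeff refers to show up in the kernel client's debugfs caps dump (/sys/kernel/debug/ceph/<fsid>.<client>/caps). A tiny sketch of how one might check a cap string for the exclusive xattr cap — the sample strings and the parsing of the cap-string format are my own reading, not output from this cluster:

```python
import re

def has_exclusive_xattr_cap(caps: str) -> bool:
    """Return True if a ceph cap string (e.g. 'pAsxLsxXsxFsxcrwb') includes
    the exclusive xattr cap: an 'x' among the flags following 'X'
    ('s' = shared, 'x' = exclusive)."""
    m = re.search(r"X([sx]*)", caps)
    return bool(m) and "x" in m.group(1)

print(has_exclusive_xattr_cap("pAsxLsxXsxFsxcrwb"))  # True: holds Xs and Xx
print(has_exclusive_xattr_cap("pAsLsXsFscr"))        # False: only shared Xs
```

With Xx held, the client applies the removexattr locally and only flushes the dirtied xattr blob to the MDS later; without it, the call goes to the MDS synchronously — which is why the three failure modes above differ.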
Updated by Ralph Böhme almost 3 years ago
Hi Jeff,
thanks for looking into this!
Jeff Layton wrote:
What kernel version are you running this on?
- uname -r
3.10.0-1062.18.1.el7.x86_64
Is this something easily reproducible, or does it take a while?
It's reliably reproducible.
There are two different modes when you do a setxattr or removexattr:
If you have the exclusive xattr cap (aka Xx) then the set or removexattr will be done locally and the cap dirtied so the new xattr blob would eventually get sent to the MDS. If you don't have the Xx cap, then it does a synchronous SETXATTR call to the MDS.
So, it's one (or possibly a combination) of three things:
- the client did the change locally and never flushed the updated xattr blob to the MDS for some reason
- the client did flush the blob to the MDS, but it dropped the update onto the floor or reverted it for some reason
- the client did a synchronous SETXATTR call and it didn't stick for some reason
Given that you saw the change in a getxattr after it happened, #3 seems less likely than the others.
Any suggestion how to diagnose this further?
Thanks!
-slow
Updated by Jeff Layton almost 3 years ago
Ralph Böhme wrote:
What kernel version are you running this on?
- uname -r
3.10.0-1062.18.1.el7.x86_64
Ok. RHEL7's kcephfs client is quite old. It's possible that this is something fixed in a more recent kernel version. It might be good to test something closer to mainline (a RHEL8.4-ish kernel is probably close enough here too).
It's reliably reproducible.
Good. Is it reproducible without samba? A standalone reproducer would be nice if possible.
There are two different modes when you do a setxattr or removexattr:
If you have the exclusive xattr cap (aka Xx) then the set or removexattr will be done locally and the cap dirtied so the new xattr blob would eventually get sent to the MDS. If you don't have the Xx cap, then it does a synchronous SETXATTR call to the MDS.
So, it's one (or possibly a combination) of three things:
- the client did the change locally and never flushed the updated xattr blob to the MDS for some reason
- the client did flush the blob to the MDS, but it dropped the update onto the floor or reverted it for some reason
- the client did a synchronous SETXATTR call and it didn't stick for some reason
Given that you saw the change in a getxattr after it happened, #3 seems less likely than the others.
Any suggestion how to diagnose this further?
I think trying to first reproduce this on a newer kernel would be the best thing. That would tell us whether we're looking at something that is still broken upstream or whether it's already fixed. Depending on the results of that, we'll need to do more to track down the actual bug, of course.
Updated by Ralph Böhme almost 3 years ago
Jeff Layton wrote:
Ok. RHEL7's kcephfs client is quite old. It's possible that this is something fixed in a more recent kernel version. It might be good to test something closer to mainline (a RHEL8.4-ish kernel is probably close enough here too).
Ok. Will try to get some systems with newer kernels and see if the problem reproduces there.
Is it reproducible without samba? A standalone reproducer would be nice if possible.
If it still reproduces on newer kernel I'll prepare a simple program that repeats the pattern.
Thanks!
-slow
Updated by Jeff Layton over 2 years ago
Ralph, ping? Were you ever able to determine whether this was fixed in later kernels?
Updated by Ralph Böhme over 2 years ago
Jeff Layton wrote:
Ralph, ping? Were you ever able to determine whether this was fixed in later kernels?
At first the problem seemed to have been solved after updating to newer kernel and ceph versions. But then a few days later it came back, worse than before. I'm going to work on trying to reproduce this with a simple local test program in the next few days.
Updated by Ralph Böhme over 2 years ago
Ralph Böhme wrote:
I'm going to work on trying to reproduce this with a local simple test program in the next days.
No luck. :/ I tried several times with a C program modelled closely after the strace I have from an smbd where, at the end, after a few hours of waiting, the "xattr returns from the dead" problem was seen.
Current versions:
Ceph Server: 14.2.21
Ceph Client: # uname -r -> 4.18.0-310.el8.x86_64
The servers will be updated in a few months; I'll report back if anything changes then.
Updated by Jeff Layton almost 2 years ago
- Status changed from Triaged to Can't reproduce
- Assignee changed from Jeff Layton to Xiubo Li
No response in several months. Closing case. Ralph, feel free to reopen if you have more info to share.
Updated by Thomas Hukkelberg 12 months ago
Hi Jeff!
Just wanted to let you know that this issue is still relevant and severe with more recent versions of both Ceph and Linux. We run Ceph 16.2.11 and Linux 5.13.19 with Samba 4.17.6; on the metadata servers we run Linux 6.2.9. We can easily reproduce this issue by copying a handful (50+) of files from a macOS workstation to a Samba share with vfs_fruit. About 5% of the files still have the brokMACS xattr sequence set after copying is finished.
Another interesting find is that it only seems to affect files whose format is known to macOS, like pdf, png, wav, txt etc. If I copy a set of files with no extension or an unknown one, the brokMACS sequence is not seen on those files.
See recent discussion on the samba mailing list: https://lists.samba.org/archive/samba/2023-May/245150.html
Updated by Tobias Hachmer 11 months ago
Hi Jeff,
we're also affected by this issue and can confirm the behaviour Thomas described.
Versions here:
Ceph 17.2.6 with kernel 5.10.0-23-amd64 on Debian Bullseye
Samba 4.18.2 (sernet packages) with kernel 5.10.0-23-amd64 on Debian Bullseye
It would be great if this issue could be reopened and investigated further.
Thanks!
Updated by Xiubo Li 11 months ago
Tobias Hachmer wrote:
Hi Jeff,
we're also affected by this issue and can confirm the behaviour Thomas described.
Versions here:
Ceph 17.2.6 with kernel 5.10.0-23-amd64 on Debian Bullseye
Samba 4.18.2 (sernet packages) with kernel 5.10.0-23-amd64 on Debian Bullseye
It would be great if this issue could be reopened and investigated further.
Thanks!
Hi Tobias
Could you reproduce it with the MDS debug logs enabled?
For the MDS daemons, set:
debug_mds = 25
debug_ms = 1
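For reference, applied as a ceph.conf fragment on the MDS hosts this would be (the section placement is the usual convention; the same settings can also be applied at runtime with `ceph config set mds debug_mds 25` and `ceph config set mds debug_ms 1`):

```ini
# Verbose MDS logging for this investigation; revert once logs are captured,
# as these levels are very noisy.
[mds]
debug_mds = 25
debug_ms = 1
```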
And also if possible please enable the kclient debug log:
echo "module ceph +p" | sudo tee /sys/kernel/debug/dynamic_debug/control
If you can provide the above logs I can find the root cause soon.
Thanks
- Xiubo
Updated by Tobias Hachmer 11 months ago
Hi Xiubo,
thanks for taking care of this issue and the debug info.
We only have a production system here. Can you please describe the impact of the two debug options in a bit more detail? Will this lead to higher/heavy load on the MDS, decreased performance, big log files, etc.?
Thanks
Tobias
Updated by Xiubo Li 11 months ago
Tobias Hachmer wrote:
Hi Xiubo,
thanks for taking care of this issue and the debug info.
We have here only a production system. Can you please describe the impact of the two debug options a bit more in detail. Will this lead to higher/heavy load of the MDS, decreased performance, big load of log file, etc.?
For a production system this will certainly lead to heavy load and performance issues in both the kclient and the MDSs.
But if you can always reproduce this, then it should be okay IMO: you can just capture the logs and then disable the debug logs on both sides afterwards.
Thanks
- Xiubo
Updated by Austin Axworthy 2 months ago
Hello,
I've come across this Ceph issue and noticed it hasn't been updated in 9 months. I aim to shed light on this bug again, hoping to contribute to finding a resolution. Please let me know what information I can provide to expedite the process. Thank you.
Updated by Xiubo Li 2 months ago
Austin Axworthy wrote:
Hello,
I've come across this Ceph issue and noticed it hasn't been updated in 9 months. I aim to shed light on this bug again, hoping to contribute to finding a resolution. Please let me know what information I can provide to expedite the process. Thank you.
Hi Austin,
If you can also reproduce it, could you help provide the debug logs I mentioned in comment#12?
Thanks.
Updated by Matthew Hutchinson about 2 months ago
We are currently working on recreating this issue internally, as this was a customer cluster that was having the issue Austin mentioned.
As a workaround for our customer, we added "vfs_streams_depot" to our SMB shares so that it does not put the xattr on the individual files and instead creates a single depot file which contains them all.
Once I have recreated this issue I will be able to provide the logs from comment #12.
Updated by Xiubo Li about 2 months ago
Matthew Hutchinson wrote:
We are currently working on recreating this issue internally as this was a customer cluster that was having the issue Austin mentioned.
As a workaround to resolve this issue for our customer, we added "vfs_streams_depot" to our SMB shares so that it does not put the attr on the individual files and instead makes a single depot file which contains them all
Once I have recreated this issue I will be able to provide the logs from comment #12
Sounds good. Thanks!
Updated by Matthew Hutchinson about 2 months ago
- File control added
- File ceph-mds.gw1.log.7z added
Xiubo Li wrote:
Matthew Hutchinson wrote:
We are currently working on recreating this issue internally as this was a customer cluster that was having the issue Austin mentioned.
As a workaround to resolve this issue for our customer, we added "vfs_streams_depot" to our SMB shares so that it does not put the attr on the individual files and instead makes a single depot file which contains them all
Once I have recreated this issue I will be able to provide the logs from comment #12
Sounds good. Thanks!
I was able to recreate the issue on a testing cluster. Attached are the MDS log and /sys/kernel/debug/dynamic_debug/control.
On the Mac the files are greyed out and state that they are locked and cannot be opened.
Let me know if there is anything else you would like me to provide.
Updated by Matthew Hutchinson about 1 month ago
Is there anything in the logs you saw that could be causing this issue? I am eager to get this resolved for all of our Mac clients using CephFS.
Updated by Xiubo Li about 1 month ago
Matthew Hutchinson wrote:
Is there anything in the logs you saw that could be causing this issue? I am eager to get this resolved for all of our MAC clients using cephfs
Sorry, I missed your last comments; I will check them tomorrow.
Updated by Xiubo Li about 1 month ago
Matthew Hutchinson wrote:
Is there anything in the logs you saw that could be causing this issue? I am eager to get this resolved for all of our MAC clients using cephfs
BTW, which xattr were you using? Still user.DosStream.AFP_AfpInfo? I couldn't see any relevant entries in the log files you uploaded.
The MDS log file is too short and I didn't see anything about this in it.
Updated by Matthew Hutchinson about 1 month ago
- File ceph-mds.gw1.log.7z.001 added
We are currently using "streams_xattr" and "acl_xattr".
Updated by Xiubo Li about 1 month ago
Matthew Hutchinson wrote:
We are currently using "streams_xattr" "acl_xattr"
Hi Matthew,
BTW, do you have the full logs, or could you upload them somewhere I can access? This file is still the same. I need the whole logs, from the moment you remove the xattr until the problem happens.
And if possible, it would be better if you could also enable the kernel ceph client's debug logs.
Thanks
Updated by Matthew Hutchinson 15 days ago
Those MDS logs are everything: they run from the moment I built the MDS services until you requested the logs, so they should include the issue.
This is a VM cluster, so I can spin it back up and gather any additional info you may need.