Bug #50719
openxattr returning from the dead (sic!)
Description
Hi Ceph folks,
slow from the Samba team here. :)
I'm investigating a problem at a customer site where xattr data from files is reappearing after a few hours. No kidding.
I have straces that show that a process creates and subsequently removes a specific xattr. At that point I can look at the filesystem, list the xattrs of the file, and the xattr is not there. Coming back a few hours later, all of a sudden the xattr is back.
This is happening with Ceph 14.2.16 on a Ceph mount:
$ mount | grep ceph
172.17.252.46,172.17.252.47,172.17.252.48:6789:/ on /ceph type ceph (rw,relatime,name=ctdb,secret=<hidden>,acl,wsize=16777216)
I'm 100% sure that the file is not touched by any other process after the xattr has been deleted:
- because I've worked with the customer to reproduce the problem on an isolated directory tree
- because the output of stat from before and after the return-of-the-dead shows identical ctime and mtime
This is reliably (more or less) reproducible.
The higher level workflow where this is observed is the following:
- Samba fileserver version 4.8.11
- Samba share configured with Mac support, i.e.
vfs objects = fruit streams_xattr acl_xattr
(but not using the ceph VFS module)
- Mac client connecting over SMB
- Mac client copies one or more files to the share
- While the copy of a file is in progress, the Mac client initially writes a special SMB named stream called "AFP_AfpInfo" containing a special Mac metadata type/creator code "brokMACS" (8 char string) to the file. The server stores the stream data in an xattr in the filesystem (due to streams_xattr), the xattr name being "user.DosStream.AFP_AfpInfo:$DATA". For the Mac user the behaviour is that files with such a type/creator code are greyed out and inaccessible, which makes sense while the copy is in progress. This is by design.
- When the client has finished copying the file, it overwrites the type/creator string with 0 bytes, which Samba internally translates to a removexattr() syscall. So the resulting sequence of xattr syscalls for the copy of one file is:
14315 11:09:58.120175 removexattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA") = -1 ENODATA (No data available)
14315 11:09:58.121224 setxattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA", "", 1, XATTR_CREATE) = 0
14315 11:09:58.121322 setxattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA", "AFP\0\0\0\1\0\0\0\0\0\200\0\0\0brokMACS\0\0\0\0\0\0\0", 61, 0) = 0
14315 11:10:00.102080 removexattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA") = 0
14315 11:10:00.212989 removexattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA") = -1 ENODATA (No data available)
14315 11:10:00.213699 setxattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA", "", 1, XATTR_CREATE) = 0
14315 11:10:00.213790 setxattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA", "", 1, 0) = 0
14315 11:10:00.214792 removexattr("testfolder/Folder_5/Survival-Cheat-Sheet.pdf", "user.DosStream.AFP_AfpInfo:$DATA") = 0
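The syscall sequence above could be replayed outside Samba with a small script, which may help to take Samba out of the picture when reproducing. This is only a sketch: the names `XATTR`, `AFP_INFO`, `remove_ignoring_enodata` and `replay` are made up for illustration, the 61-byte payload is rebuilt from the strace and the hex dump in this report, and the roughly two-second gap between the two rounds is not reproduced. Whether this alone triggers the problem is not known.

```python
import errno
import os

XATTR = "user.DosStream.AFP_AfpInfo:$DATA"

# 61-byte value as seen in the strace: the 24 leading bytes
# ("AFP\0" ... "brokMACS") followed by zero padding.
AFP_INFO = bytes.fromhex(
    "41465000 00000100 00000000 80000000 62726f6b 4d414353"
).ljust(61, b"\0")


def remove_ignoring_enodata(path):
    """removexattr() that tolerates the ENODATA seen in the strace."""
    try:
        os.removexattr(path, XATTR)
    except OSError as exc:
        if exc.errno != errno.ENODATA:
            raise


def replay(path):
    """Replay the eight xattr calls; return True if the xattr still exists."""
    # Round 1 writes the "copy in progress" data, round 2 a single NUL byte.
    for payload in (AFP_INFO, b"\0"):
        remove_ignoring_enodata(path)                     # ENODATA in the trace
        os.setxattr(path, XATTR, b"\0", os.XATTR_CREATE)  # create 1-byte stub
        os.setxattr(path, XATTR, payload)                 # write stream data
        os.removexattr(path, XATTR)                       # delete the stream
    try:
        os.getxattr(path, XATTR)
        return True
    except OSError as exc:
        if exc.errno == errno.ENODATA:
            return False
        raise
```

Called as `replay("/ceph/some/test/file")` (path hypothetical), it should return `False` right after the run; the interesting check is whether `os.getxattr()` on the same file starts succeeding again hours later.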
Looking at the filesystem after the copy, we can see the xattr is indeed not present:
$ getfattr -n 'user.DosStream.AFP_AfpInfo:$DATA' -e hex Survival-Cheat-Sheet.pdf
Survival-Cheat-Sheet.pdf: user.DosStream.AFP_AfpInfo:$DATA: No such attribute
The inode timestamps at that point in time are
Access: 2021-04-23 11:09:58.055664217 +0200
Modify: 2021-02-22 15:36:42.000000000 +0100
Change: 2021-04-23 11:10:00.213654990 +0200
Now, waiting a few hours and repeating the above command, out of nowhere the xattr data is back.
$ getfattr -n 'user.DosStream.AFP_AfpInfo:$DATA' -e hex Survival-Cheat-Sheet.pdf
# file: Survival-Cheat-Sheet.pdf
user.DosStream.AFP_AfpInfo:$DATA=0x4146500000000100000000008000000062726f6b4d41435300000000000000000000000000000000000000000000000000000000000000000000000000
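To make it easier to see that this is the first value written during the copy and not the later one, the leading bytes of the hex dump can be decoded. A small sketch, using only the first 24 bytes of the value above (everything after them in the dump is zero); the offsets are simply read off this dump and the strace, and the variable names are illustrative:

```python
# First 24 bytes of the value returned by getfattr; the rest is zero padding.
value = bytes.fromhex(
    "41465000 00000100 00000000 80000000 62726f6b 4d414353"
)

signature = value[0:4]       # b"AFP\x00"
type_creator = value[16:24]  # the 8-character type/creator string

print(signature, type_creator.decode("ascii"))
# prints: b'AFP\x00' brokMACS
```

So the value that reappears matches the 61-byte setxattr() from 11:09:58, not the single-byte value written at 11:10:00.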
Timestamps (no change compared to above):
Access: 2021-04-23 11:09:58.055664217 +0200
Modify: 2021-02-22 15:36:42.000000000 +0100
Change: 2021-04-23 11:10:00.213654990 +0200
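The before/after comparison can also be automated, for example by recording the timestamps and the xattr names right after the copy and again some hours later. A minimal sketch, assuming Python 3 on the Linux client; `snapshot` and `changed_fields` are hypothetical helper names:

```python
import os


def snapshot(path):
    """Capture the inode timestamps and xattr names of path."""
    st = os.stat(path)
    try:
        names = sorted(os.listxattr(path))
    except OSError:
        names = None  # e.g. filesystem without xattr support
    return {
        "atime_ns": st.st_atime_ns,
        "mtime_ns": st.st_mtime_ns,
        "ctime_ns": st.st_ctime_ns,
        "xattrs": names,
    }


def changed_fields(before, after):
    """Return the keys whose values differ between two snapshots."""
    return sorted(k for k in before if before[k] != after[k])
```

For the case described here, comparing a snapshot taken after the copy with one taken after the xattr has come back would be expected to report only `xattrs` as changed.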
As the "copy-in-progress" type/creator is back, the file is greyed out for the Mac users and Mac users can't access the file.
What can we do to diagnose this? And, can you help us here? Thanks!
Cheers!
-slow