Bug #2392
closed
First read of symlink after ceph filesystem mounted gives error
Description
On client machine (Ubuntu 12.04):
$ mount -t ceph -o name=admin,secret=xxxx 192.168.1.101:6789:/ /mnt/osd
$ mkdir /mnt/osd/dir
$ ln -s /etc/hosts /mnt/osd/dir/hosts
$ umount /mnt/osd
$ mount -t ceph -o name=admin,secret=xxxx 192.168.1.101:6789:/ /mnt/osd
$ cat /mnt/osd/dir/hosts
cat: /mnt/osd/dir/hosts: Invalid argument
$ cat /mnt/osd/dir/hosts
...contents of /etc/hosts as expected
There are 3 other Ubuntu 12.04 hosts using the 0.46 ceph package, configured with mon, mds and osd on each host (see ceph.conf).
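The check in the reproduction (the first read fails, the second succeeds) can be scripted. The following is a minimal Python sketch, not part of the original report. The helper name `read_twice` is made up for illustration, and the script does not mount or unmount anything: it assumes it is run right after the second mount, as in the steps above.

```python
import errno


def read_twice(path):
    """Read `path` twice and return one result per attempt.

    Each result is "ok" on success, or the errno name
    (for example "EINVAL" for "Invalid argument") on failure.
    """
    results = []
    for _ in range(2):
        try:
            with open(path, "rb") as f:
                f.read()
            results.append("ok")
        except OSError as e:
            # Map the numeric errno to its symbolic name where known.
            results.append(errno.errorcode.get(e.errno, str(e.errno)))
    return results
```

Calling `read_twice("/mnt/osd/dir/hosts")` on a freshly mounted client should show whether only the first attempt fails. Going by the `cat` output above, an affected client would presumably report `EINVAL` for the first read and `ok` for the second.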
Files
Updated by Greg Farnum almost 12 years ago
I notice looking at your conf file that you have 3 MDSes. Are they all active? (ie, did you increase max_mds to 3)
If they are, see if you can reproduce with a single MDS. If they aren't, diagnosing this will be a lot easier with client and MDS logs.
Updated by Mark Kirkwood almost 12 years ago
- File ceph-logs-dev1.tar.gz added
- File kern.log added
Ah - good point, no I had not updated max_mds. I redid the setup with 1 mds and 1 osd. Same issue, logs attached.
Updated by Greg Farnum almost 12 years ago
- Assignee set to Greg Farnum
Okay, no guarantees but I will try and check this out at least briefly in the next day or two. :)
Updated by Greg Farnum almost 12 years ago
Mark, can you repeat these with debug logging turned up? It'll take a fair bit of disk space but there's not very much in these logs. :)
(http://www.ceph.com/wiki/Debugging)
Updated by Mark Kirkwood almost 12 years ago
- File ceph-logs-dev1.tar.gz added
Sorry - remembering to enable debugging would have been more helpful! Logs with debugging turned on attached.
Updated by Greg Farnum almost 12 years ago
- Project changed from Ceph to Linux kernel client
- Category deleted (26)
- Assignee deleted (Greg Farnum)
Okay, this looks to me like it has to be a problem with the kernel client. The MDS definitely knows it's a symlink at all times, but I'm not the right person to look through the kernel code for this right now.
Updated by Sage Weil almost 12 years ago
- Category set to fs/ceph
- Status changed from New to 12
The problem is the lookup open intents stuff. We try to do a lookup + open, but it ends up that the lookup result is a symlink that you can't open; in fact the vfs needs to come back and follow the link.
Miklos' atomic open stuff is also broken, although there the error is "too many levels of symbolic links".
I think this is just a matter of returning "success but not open" to the vfs. Not sure of the right way to do that. Can probably just look at what nfs does.
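As a userspace analogy only (this is not the kernel code path or the fix), the same distinction is visible from a program: a symlink itself cannot be opened for reading, the link has to be followed first. On Linux, `open()` with `O_NOFOLLOW` on a symlink fails with `ELOOP`, whose message is the "Too many levels of symbolic links" text. A minimal Python sketch, assuming Linux; the names `open_results` and `demo` are made up for illustration:

```python
import errno
import os
import tempfile


def open_results(link_path):
    """Open a symlink two ways; return (follow, nofollow).

    Each element is "ok" or the errno name of the failure.
    """
    def attempt(flags):
        try:
            fd = os.open(link_path, flags)
        except OSError as e:
            return errno.errorcode.get(e.errno, str(e.errno))
        os.close(fd)
        return "ok"

    # Default open: the VFS follows the link and opens the target.
    follow = attempt(os.O_RDONLY)
    # O_NOFOLLOW: the final component is the link itself, which is not openable.
    nofollow = attempt(os.O_RDONLY | os.O_NOFOLLOW)
    return follow, nofollow


def demo():
    """Build a throwaway file plus symlink and try both kinds of open."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "target")
        with open(target, "w") as f:
            f.write("hello\n")
        link = os.path.join(d, "link")
        os.symlink(target, link)
        return open_results(link)
```

On a local Linux filesystem, `demo()` returns `("ok", "ELOOP")`: the open only succeeds once the link has been followed, which is the step the vfs needs to get a chance to do after the combined lookup + open.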
Updated by Sage Weil almost 12 years ago
- Assignee set to Sage Weil
- Priority changed from Normal to High
- Target version set to v3.5
This is going to be easy to fix once the atomic_open stuff is merged. Real Soon Now.