Bug #4850
ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
Description
2013-04-28 08:48:35.479385 7f03e249d780 1 client.4144 dump_cache
2013-04-28 08:48:35.479404 7f03e249d780 1 client.4144 dump_inode: DISCONNECTED inode 10000000004 #10000000004 ref 010000000004.head(ref=0 cap_refs={} open={2=0} mode=100666 size=0 mtime=2013-04-27 23:11:30.544590 caps=- objectset[10000000004 ts 0/0 objects 0 dirty_or_tx 0] 0x2ae8900)
2013-04-28 08:48:35.479425 7f03e249d780 1 client.4144 dump_inode: DISCONNECTED inode 10000000037 #10000000037 ref 010000000037.head(ref=0 cap_refs={1024=0,4096=0,8192=0} open={2=0} mode=100666 size=478995 mtime=2013-04-27 23:11:34.561768 caps=- objectset[10000000037 ts 2/478995 objects 0 dirty_or_tx 0] 0x34d2000)
2013-04-28 08:48:35.479439 7f03e249d780 1 client.4144 dump_inode: DISCONNECTED inode 10000000014 #10000000014 ref 010000000014.head(ref=0 cap_refs={} open={} mode=40777 size=0 mtime=2013-04-27 23:11:30.622570 caps=- COMPLETE 0x2b2b000)
The job was:
ubuntu@teuthology:/a/teuthology-2013-04-27_20:55:21-fs-next-testing-basic/2193$ cat orig.config.yaml
kernel:
  kdb: true
  sha1: 42c6070519ad45965762de9b20f5bd280a6eef68
machine_type: plana
nuke-on-error: true
overrides:
  ceph:
    conf:
      client:
        debug client: 10
      mds:
        debug mds: 20
        debug ms: 1
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
    log-whitelist:
    - slow request
    - wrongly marked me down
    sha1: 5327d06275972dc49150505b113051743a87f8b5
  s3tests:
    branch: next
  workunit:
    sha1: 5327d06275972dc49150505b113051743a87f8b5
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
  - mds.b-s-a
tasks:
- chef: null
- clock.check: null
- install: null
- ceph: null
- mds_thrash: null
- ceph-fuse: null
- workunit:
    clients:
      all:
      - suites/fsstress.sh
Related issues
History
#1 Updated by Sage Weil almost 11 years ago
- Project changed from Ceph to CephFS
- Subject changed from ceph-fuse: to ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
- Category deleted (1)
#2 Updated by Sage Weil almost 11 years ago
Have the full log... put a copy in the run dir.
#3 Updated by Greg Farnum almost 11 years ago
/a/teuthology-2013-04-28_21:32:40-fs-next-testing-basic/2662
That's an fsstress run that hung; I copied the client log into that dir.
#4 Updated by Sage Weil almost 11 years ago
- Assignee set to Sam Lang
#5 Updated by Sam Lang almost 11 years ago
- Status changed from 12 to In Progress
#6 Updated by Sam Lang almost 11 years ago
It looks like the client creates a file, then unlinks it, but never removes it from its cache because it still holds caps for the inode (caps=pAsXsFs). The mds then goes through a series of failures, with the standby mds taking over, and each time the inode in question (10000000004) gets replayed (both the openc and the unlink). Eventually the log gets trimmed and the replay no longer includes those ops. At that point the mds sends back cap export messages for the inode, because the inode isn't in the mds cache either from journal replay or from the client replaying unsafe requests.
The mds should be sending a caps revoke for the remaining caps on unlink, I think. It's not clear yet why that's not happening...
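The failure sequence described above can be sketched as a toy simulation. This is purely illustrative: `Journal`, `mds_restart`, and the op handling are hypothetical stand-ins, not real Ceph classes or logic.

```python
# Hypothetical simulation of the failure sequence in comment #6.
# Journal / mds_restart are illustrative names, not real Ceph code.

class Journal:
    def __init__(self):
        self.ops = []      # list of (op, ino) events
        self.trimmed = 0   # ops below this index have been trimmed

    def append(self, op, ino):
        self.ops.append((op, ino))

    def trim(self, upto):
        self.trimmed = upto

    def replayable(self):
        return self.ops[self.trimmed:]

def mds_restart(journal, client_caps):
    """Rebuild the takeover MDS's cache from the (possibly trimmed) journal,
    then export caps for any client-capped inode it no longer knows about."""
    cache = set()
    for op, ino in journal.replayable():
        if op == "openc":
            cache.add(ino)
        elif op == "unlink":
            pass   # unlinked but still capped by the client: stays cached
    exported = [ino for ino in client_caps if ino not in cache]
    return cache, exported

journal = Journal()
client_caps = set()

# Client creates then unlinks a file, keeping caps (caps=pAsXsFs),
# so the inode stays in the client cache.
journal.append("openc", 0x10000000004)
journal.append("unlink", 0x10000000004)
client_caps.add(0x10000000004)

# Early failovers: journal intact, replay recreates the inode, no export.
_, exported = mds_restart(journal, client_caps)
print(exported)   # []

# Eventually the journal is trimmed past both ops...
journal.trim(2)
_, exported = mds_restart(journal, client_caps)
# ...and the takeover MDS exports caps for an inode it no longer has,
# leaving the client with a DISCONNECTED inode at shutdown.
print([hex(i) for i in exported])   # ['0x10000000004']
```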
#7 Updated by Greg Farnum almost 11 years ago
We can't revoke on unlink because the file might still be held open with something accessing it. :)
#8 Updated by Sage Weil almost 11 years ago
- Priority changed from Urgent to High
#9 Updated by Sam Lang almost 11 years ago
- File ceph-client.0.28605.log.gz added
- File ceph-mds.b-s-a.log.04.gz added
- File ceph-mds.a.log.04.gz added
The attached files include the complete client log, along with the mds logs that include 10000000004 (one of the inodes that is disconnected at the client). The ceph-mds.a.log is from the mds that sends the cap exports. The full mds logs are available on teuthology at:
/a/teuthology-2013-04-27_20:55:21-fs-next-testing-basic/2193/remote/ubuntu@plana70.front.sepia.ceph.com/log/ceph-mds.b-s-a.log.gz
/a/teuthology-2013-04-27_20:55:21-fs-next-testing-basic/2193/remote/ubuntu@plana73.front.sepia.ceph.com/log/ceph-mds.a.log.gz
(6GB each)
#10 Updated by Zheng Yan almost 11 years ago
I think this is a general issue. When handling MClientReconnect, if an inode is not in the cache, the MDS tries fetching the missing inode by the path the client provides. But the path the client provides isn't always accurate, because the MDS does not send notifications to clients when handling rename/unlink.
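A toy illustration of the point above: a path the client cached at open time goes stale after a rename it was never told about, so a path-based lookup fails, while a lookup via the inode's own backtrace still resolves. All names here are hypothetical sketches, not Ceph code.

```python
# Hypothetical sketch of path-based vs backtrace-based reconnect lookup.
# namespace / backtraces are toy stand-ins for MDS state, not real Ceph.

namespace = {}    # path -> ino, the MDS's current view of the tree
backtraces = {}   # ino -> current path, updated as the inode moves

def create(path, ino):
    namespace[path] = ino
    backtraces[ino] = path

def rename(old, new):
    ino = namespace.pop(old)
    namespace[new] = ino
    backtraces[ino] = new   # the backtrace follows the inode

def reconnect_by_path(client_path):
    # MDS tries to recover the inode from the path the client remembers
    return namespace.get(client_path)

def reconnect_by_backtrace(ino):
    # recover via the inode's own backtrace instead
    return namespace.get(backtraces.get(ino))

create("/dir/a", 0x1004)
client_remembers = "/dir/a"   # client caches the path at open time

rename("/dir/a", "/dir/b")    # the client is never notified

print(reconnect_by_path(client_remembers))   # None: stale path, lookup fails
print(hex(reconnect_by_backtrace(0x1004)))   # 0x1004: found via backtrace
```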
#11 Updated by Sage Weil almost 11 years ago
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-05-01_01:00:37-fs-next-testing-basic/4534
#12 Updated by Greg Farnum almost 11 years ago
- Status changed from In Progress to 12
- Assignee deleted (Sam Lang)
Hmm, I thought we handled renames properly since they involve changing the caps state. But maybe we don't propagate the new path out to clients. :(
We certainly don't do so on unlinks, and it is indeed a problem. The clean solution would be to fix it in the protocol, although Sage suggests we might be able to do something with pinning deleted-but-referenced nodes. (I'm not at all sure we want to commit to that as a long-term solution, though.)
Sam, you're not actively working on this any more, right?
#13 Updated by Zheng Yan almost 11 years ago
- Assignee set to Zheng Yan
FYI: I have code that finds the missing inode by using the backtrace. The code is under test; I will send it out soon.
#14 Updated by Zheng Yan almost 11 years ago
- Assignee deleted (Zheng Yan)
#15 Updated by Sage Weil over 10 years ago
- Status changed from 12 to Resolved
I think this is resolved now...