Bug #11294
opensamba: DISCONNECTED inode warning
Description
This is on the hammer branch.
http://pulpito.ceph.com/teuthology-2015-03-29_23:14:01-samba-hammer-testing-basic-multi/827923/
2015-03-31 11:00:00.581700 7f38bbfff700 1 -- 10.214.131.16:0/18333 <== mds.0 10.214.135.6:6808/10215 11530 ==== client_session(renewcaps seq 1789) v1 ==== 28+0+0 (748224024 0 0) 0x7f38a8002730 con 0x472b0c0
2015-03-31 11:00:00.581710 7f38bbfff700 10 client.4116 handle_client_session client_session(renewcaps seq 1789) v1 from mds.0
2015-03-31 11:00:00.581713 7f38bbfff700 10 client.4116 unmounting: trim pass, size was 0+1
2015-03-31 11:00:00.581714 7f38bbfff700 20 client.4116 trim_cache size 0 max 0
2015-03-31 11:00:00.581716 7f38bbfff700 10 client.4116 unmounting: trim pass, size still 0+1
2015-03-31 11:00:01.756486 7f38cc6747c0 1 client.4116 dump_cache
2015-03-31 11:00:01.756505 7f38cc6747c0 1 client.4116 dump_inode: DISCONNECTED inode 10000000091 #10000000091 ref 681410000000091.head(ref=6814 ll_ref=0 cap_refs={1024=0} open={1=6814} mode=100744 size=0/0 mtime=2015-03-30 06:36:21.982318 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] 0x7f38ac046e80)
2015-03-31 11:00:01.756528 7f38cc6747c0 2 client.4116 cache still has 0+1 items, waiting (for caps to release?)
2015-03-31 11:00:06.756685 7f38cc6747c0 1 client.4116 dump_cache
2015-03-31 11:00:06.756703 7f38cc6747c0 1 client.4116 dump_inode: DISCONNECTED inode 10000000091 #10000000091 ref 681410000000091.head(ref=6814 ll_ref=0 cap_refs={1024=0} open={1=6814} mode=100744 size=0/0 mtime=2015-03-30 06:36:21.982318 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] 0x7f38ac046e80)
2015-03-31 11:00:06.756726 7f38cc6747c0 2 client.4116 cache still has 0+1 items, waiting (for caps to release?)
2015-03-31 11:00:11.756885 7f38cc6747c0 1 client.4116 dump_cache
2015-03-31 11:00:11.756903 7f38cc6747c0 1 client.4116 dump_inode: DISCONNECTED inode 10000000091 #10000000091 ref 681410000000091.head(ref=6814 ll_ref=0 cap_refs={1024=0} open={1=6814} mode=100744 size=0/0 mtime=2015-03-30 06:36:21.982318 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] 0x7f38ac046e80)
2015-03-31 11:00:11.756927 7f38cc6747c0 2 client.4116 cache still has 0+1 items, waiting (for caps to release?)
This was a hang, but I copied the actual logs into scp_remotes.
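For anyone going through the copied logs, a small script can pull the reference and open counts out of each dump_inode line so they can be compared across dump_cache passes. This is only a sketch: the regex is written against the lines quoted above, the function name parse_dump_inode is made up for illustration, and the comma-separated form of a multi-entry open={...} map is an assumption.

```python
import re

# Matches the "dump_inode: DISCONNECTED inode ..." lines quoted in this report.
DUMP_INODE = re.compile(
    r"dump_inode: DISCONNECTED inode (?P<ino>[0-9a-f]+) .*?"
    r"\(ref=(?P<ref>\d+) ll_ref=(?P<ll_ref>\d+) .*?"
    r"open=\{(?P<open>[^}]*)\}"
)


def parse_dump_inode(line):
    """Return ino, ref, ll_ref and the open-mode map for one log line, or None."""
    m = DUMP_INODE.search(line)
    if m is None:
        return None
    opens = {}
    # ASSUMPTION: several entries would be printed as "mode=count,mode=count".
    for pair in filter(None, m.group("open").split(",")):
        mode, count = pair.split("=")
        opens[int(mode)] = int(count)
    return {
        "ino": m.group("ino"),
        "ref": int(m.group("ref")),
        "ll_ref": int(m.group("ll_ref")),
        "open": opens,
    }
```

Running it over every line of the client log and printing the "open" value per timestamp would show whether the count keeps growing before the unmount.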
Updated by Greg Farnum about 9 years ago
- Status changed from New to In Progress
- Assignee set to Greg Farnum
The inode thinks it's opened 6814 times for read. That seems implausible, and I notice that it appears to be monotonically increasing. Probably something wrong with our locking code, since this is smbtorture...?
Or maybe samba's test is just that mistaken.
Updated by Greg Farnum about 9 years ago
Yeah, we're performing 16383 ll_opens on inode 10000000091, but only getting 9570 ll_release() calls on it. So this looks like an issue with either Samba tests or our bindings.
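The open/release comparison above can be reproduced with a short script along these lines. It is a rough sketch, not the method actually used: the regex assumes each relevant line contains "ll_open <ino>" or "ll_release <ino>", which is a hypothetical shape that will need adjusting to the real client log, and open_release_balance is an illustrative name.

```python
import re
from collections import Counter

# ASSUMPTION: hypothetical line shape "<op> <ino>"; adapt to the actual log format.
EVENT_RE = re.compile(r"\b(?P<op>ll_open|ll_release)\s+(?P<ino>[0-9a-f]+)\b")


def open_release_balance(lines):
    """Map ino -> (opens, releases, opens - releases) for unbalanced inodes."""
    opens, releases = Counter(), Counter()
    for line in lines:
        m = EVENT_RE.search(line)
        if m is None:
            continue
        target = opens if m.group("op") == "ll_open" else releases
        target[m.group("ino")] += 1
    return {
        ino: (opens[ino], releases[ino], opens[ino] - releases[ino])
        for ino in opens.keys() | releases.keys()
        if opens[ino] != releases[ino]
    }
```

With the counts quoted above, the difference for inode 10000000091 is 16383 - 9570 = 6813 opens without a matching release, which is close to the open={1=6814} count in the dump.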
Updated by Greg Farnum about 9 years ago
- Assignee changed from Greg Farnum to Zheng Yan
- Priority changed from Urgent to Normal
Okay, now I'm just confused because it doesn't seem that the Samba VFS is using the ll calls; it's just using ceph_open and ceph_close.
Zheng, do you have any idea?
Also, I guess this isn't urgent since it appears we aren't losing the reference internally.
Updated by Greg Farnum about 9 years ago
d'oh.
Okay, one of us will need to see if it's possible to track this to a problem with ceph-fuse or if Samba is holding the file open a bunch of times. (It looks like it's shutting down in an error case, so maybe?)
Updated by Zheng Yan about 9 years ago
- Status changed from In Progress to 12
One of the FHs that never gets released is 0x7f38741050f0. After it was opened, the FH was used once for ll_flush, but it was never released. This looks like some kind of kernel bug, but there is no kernel message to debug it further.
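Handles like this can be found by replaying the log and keeping only the FHs that are still open at the end. The sketch below assumes each relevant line carries an ll_* operation name and the FH as the first 0x... token on the line; both are assumptions about the log format, and unreleased_handles is an illustrative name.

```python
import re

# ASSUMPTION: the operation appears as "ll_<name>" and the file handle is the
# first "0x..." pointer on the same line. Adjust both to the real log format.
OP_RE = re.compile(r"\b(ll_[a-z_]+)\b")
FH_RE = re.compile(r"\b0x[0-9a-f]+\b")


def unreleased_handles(lines):
    """Return {fh: [ops since open]} for handles opened but not yet released."""
    live = {}
    for line in lines:
        op = OP_RE.search(line)
        fh = FH_RE.search(line)
        if op is None or fh is None:
            continue
        name, handle = op.group(1), fh.group(0)
        if name == "ll_open":
            # Pointer values can be reused, so each open starts a fresh history.
            live[handle] = [name]
        elif name == "ll_release":
            live.pop(handle, None)
        elif handle in live:
            live[handle].append(name)
    return live
```

For the FH above, the expected entry would list ll_open followed by a single ll_flush and nothing else.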
Updated by Greg Farnum almost 9 years ago
We saw this again: http://pulpito-rdu.front.sepia.ceph.com/teuthology-2015-04-25_16:14:02-samba-hammer-testing-basic-typica/5186/
Copied the client log; nothing looked interesting in kern.log or dmesg.
Updated by Greg Farnum almost 8 years ago
- Category changed from 46 to 43
- Assignee deleted (Zheng Yan)
- Component(FS) ceph-fuse added
This doesn't look anything like #11835 to me; I've not been tracking closely enough to know if we're still seeing hangs though.
Updated by Patrick Donnelly about 5 years ago
- Category deleted (43)
- Labels (FS) Samba/CIFS added