Bug #11294

samba: DISCONNECTED inode warning

Added by Greg Farnum over 4 years ago. Updated 6 months ago.

Status:
Need More Info
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
Start date:
03/31/2015
Due date:
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
ceph-fuse
Labels (FS):
Samba/CIFS
Pull request ID:

Description

This is on the hammer branch.
http://pulpito.ceph.com/teuthology-2015-03-29_23:14:01-samba-hammer-testing-basic-multi/827923/

2015-03-31 11:00:00.581700 7f38bbfff700  1 -- 10.214.131.16:0/18333 <== mds.0 10.214.135.6:6808/10215 11530 ==== client_session(renewcaps seq 1789) v1 ==== 28+0+0 (748224024 0 0) 0x7f38a8002730 con 0x472b0c0
2015-03-31 11:00:00.581710 7f38bbfff700 10 client.4116 handle_client_session client_session(renewcaps seq 1789) v1 from mds.0
2015-03-31 11:00:00.581713 7f38bbfff700 10 client.4116 unmounting: trim pass, size was 0+1
2015-03-31 11:00:00.581714 7f38bbfff700 20 client.4116 trim_cache size 0 max 0
2015-03-31 11:00:00.581716 7f38bbfff700 10 client.4116 unmounting: trim pass, size still 0+1
2015-03-31 11:00:01.756486 7f38cc6747c0  1 client.4116 dump_cache
2015-03-31 11:00:01.756505 7f38cc6747c0  1 client.4116 dump_inode: DISCONNECTED inode 10000000091 #10000000091 ref 681410000000091.head(ref=6814 ll_ref=0 cap_refs={1024=0} open={1=6814} mode=100744 size=0/0 mtime=2015-03-30 06:36:21.982318 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] 0x7f38ac046e80)
2015-03-31 11:00:01.756528 7f38cc6747c0  2 client.4116 cache still has 0+1 items, waiting (for caps to release?)
2015-03-31 11:00:06.756685 7f38cc6747c0  1 client.4116 dump_cache
2015-03-31 11:00:06.756703 7f38cc6747c0  1 client.4116 dump_inode: DISCONNECTED inode 10000000091 #10000000091 ref 681410000000091.head(ref=6814 ll_ref=0 cap_refs={1024=0} open={1=6814} mode=100744 size=0/0 mtime=2015-03-30 06:36:21.982318 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] 0x7f38ac046e80)
2015-03-31 11:00:06.756726 7f38cc6747c0  2 client.4116 cache still has 0+1 items, waiting (for caps to release?)
2015-03-31 11:00:11.756885 7f38cc6747c0  1 client.4116 dump_cache
2015-03-31 11:00:11.756903 7f38cc6747c0  1 client.4116 dump_inode: DISCONNECTED inode 10000000091 #10000000091 ref 681410000000091.head(ref=6814 ll_ref=0 cap_refs={1024=0} open={1=6814} mode=100744 size=0/0 mtime=2015-03-30 06:36:21.982318 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] 0x7f38ac046e80)
2015-03-31 11:00:11.756927 7f38cc6747c0  2 client.4116 cache still has 0+1 items, waiting (for caps to release?)

This was a hang, but I copied the actual logs into scp_remotes.
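For anyone triaging a similar hang, the reference and open counts can be pulled out of the dump_inode lines with a small script. This is only a sketch: the regular expression is written against the exact lines quoted above, the function name parse_disconnected is made up for illustration, and splitting multi-entry open maps on commas is an assumption. Other Ceph versions may format these lines differently.

```python
import re

# Pattern derived from the dump_inode lines quoted in this ticket (assumption:
# other client versions may print the inode dump in a different layout).
DUMP_RE = re.compile(
    r"dump_inode: DISCONNECTED inode (?P<ino>[0-9a-f]+) .*?"
    r"\(ref=(?P<ref>\d+) .*?open=\{(?P<open>[^}]*)\}"
)


def parse_disconnected(line):
    """Return (inode, ref, {mode: count}) for a DISCONNECTED line, else None."""
    m = DUMP_RE.search(line)
    if m is None:
        return None
    opens = {}
    # Assumption: several open modes would be printed comma-separated.
    for pair in m.group("open").split(","):
        if "=" in pair:
            mode, count = pair.split("=", 1)
            opens[int(mode)] = int(count)
    return m.group("ino"), int(m.group("ref")), opens
```

Running it over successive dump_cache passes shows whether the counts change while the client waits to unmount.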

History

#1 Updated by Greg Farnum over 4 years ago

  • Status changed from New to In Progress
  • Assignee set to Greg Farnum

The Inode thinks it's opened 6814 times for read. That seems implausible, and I notice that the count appears to be monotonically increasing. Probably something is wrong with our locking code, since this is smbtorture...?
Or maybe Samba's test is just that mistaken.

#2 Updated by Greg Farnum over 4 years ago

Yeah, we're performing 16383 ll_opens on inode 10000000091, but only getting 9570 ll_release() calls on it. So this looks like an issue with either Samba tests or our bindings.
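A tally like the one above could be reproduced roughly as follows. This is a hedged sketch, not the method actually used: it assumes that each ll_open and ll_release log line contains the operation name as a whole word and the inode number somewhere on the same line, which may not hold for every client version. The function name count_open_release is hypothetical.

```python
import re
from collections import Counter

# Whole-word matches so that ll_opendir / ll_releasedir are not counted.
OPEN_RE = re.compile(r"\bll_open\b")
RELEASE_RE = re.compile(r"\bll_release\b")


def count_open_release(lines, inode):
    """Return (opens, releases, unmatched) for log lines mentioning `inode`.

    Assumption: the inode number appears as plain text on each relevant line.
    """
    counts = Counter()
    for line in lines:
        if inode not in line:
            continue
        if OPEN_RE.search(line):
            counts["open"] += 1
        elif RELEASE_RE.search(line):
            counts["release"] += 1
    return counts["open"], counts["release"], counts["open"] - counts["release"]
```

With the figures reported here, 16383 opens minus 9570 releases leaves 6813 unmatched opens, which is within one of the open={1=6814} count in the inode dump.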

#3 Updated by Greg Farnum over 4 years ago

  • Assignee changed from Greg Farnum to Zheng Yan
  • Priority changed from Urgent to Normal

Okay, now I'm just confused because it doesn't seem that the Samba VFS is using the ll calls; it's just using ceph_open and ceph_close.

Zheng, do you have any idea?

Also, I guess this isn't urgent since it appears we aren't losing the reference internally.

#4 Updated by Zheng Yan over 4 years ago

In this test case, Samba uses ceph-fuse.

#5 Updated by Greg Farnum over 4 years ago

d'oh.

Okay, one of us will need to see if it's possible to track this to a problem with ceph-fuse or if Samba is holding the file open a bunch of times. (It looks like it's shutting down in an error case, so maybe?)

#6 Updated by Zheng Yan over 4 years ago

  • Status changed from In Progress to Verified

One of the FHs that does not get released is 0x7f38741050f0. After being opened, the FH was used once for ll_flush, but it never got released. This is probably some kind of kernel bug, but there is no kernel message for further debugging.
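Finding handles like this one amounts to replaying the operations per FH and keeping whatever was opened but never released. The sketch below assumes the (operation, handle) pairs have already been extracted from the client log by some other means; the event tuple format and the name unreleased_handles are illustrative, not part of Ceph.

```python
def unreleased_handles(events):
    """Map each never-released handle to the operations seen after its open.

    `events` is an iterable of (op, handle) pairs in log order, where op is a
    string such as "open", "flush" or "release" (hypothetical labels).
    """
    live = {}
    for op, handle in events:
        if op == "open":
            # A pointer value can be reused after release, so start fresh.
            live[handle] = []
        elif op == "release":
            live.pop(handle, None)
        elif handle in live:
            live[handle].append(op)
    return live
```

A handle that shows up in the result with only a flush recorded matches the pattern described above: opened, flushed once, never released.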

#7 Updated by Zheng Yan over 4 years ago

  • Status changed from Verified to Need More Info

#8 Updated by Greg Farnum over 4 years ago

We saw this again: http://pulpito-rdu.front.sepia.ceph.com/teuthology-2015-04-25_16:14:02-samba-hammer-testing-basic-typica/5186/
Copied the client log; nothing looked interesting in kern.log or dmesg.

#10 Updated by Zheng Yan about 4 years ago

  • Regression set to No

Seems like a duplicate of #11835.

#11 Updated by Greg Farnum about 3 years ago

  • Category changed from 46 to 43
  • Assignee deleted (Zheng Yan)
  • Component(FS) ceph-fuse added

This doesn't look anything like #11835 to me; I've not been tracking closely enough to know if we're still seeing hangs though.

#12 Updated by Patrick Donnelly 6 months ago

  • Category deleted (43)
  • Labels (FS) Samba/CIFS added
