Bug #5381
ceph-fuse: stuck with disconnected inodes on shutdown
Description
Seen this at least 2x in the last few days:
...
2013-06-17 13:19:59.908081 7f4244276780 1 client.4117 dump_inode: DISCONNECTED inode 10000000da5 #10000000da5 ref 010000000da5.head(ref=0 cap_refs={1024=0,4096=0,8192=0} open={3=0} mode=100600 size=30118 mtime=2013-06-17 05:41:31.083178 caps=pAsXsFs objectset[10000000da5 ts 0/0 objects 0 dirty_or_tx 0] 0x5066b00)
2013-06-17 13:19:59.908092 7f4244276780 1 client.4117 dump_inode: DISCONNECTED inode 10000000c3c #10000000c3c ref 010000000c3c.head(ref=0 cap_refs={} open={3=0} mode=100600 size=0 mtime=2013-06-17 05:40:55.982444 caps=pAsXsFs objectset[10000000c3c ts 0/0 objects 0 dirty_or_tx 0] 0x5125000)
2013-06-17 13:19:59.908102 7f4244276780 1 client.4117 dump_inode: DISCONNECTED inode 1000000073a #1000000073a ref 01000000073a.head(ref=0 cap_refs={} open={3=0} mode=100600 size=0 mtime=2013-06-17 05:38:26.190278 caps=pAsXsFs objectset[1000000073a ts 0/0 objects 0 dirty_or_tx 0] 0x3894b00)
2013-06-17 13:19:59.908111 7f4244276780 1 client.4117 dump_inode: DISCONNECTED inode 10000000070 #10000000070 ref 010000000070.head(ref=0 cap_refs={1024=0,4096=0,8192=0} open={3=0} mode=100600 size=53077 mtime=2013-06-17 05:32:34.358573 caps=pAsXsFscr objectset[10000000070 ts 0/0 objects 0 dirty_or_tx 0] 0x338b200)
2013-06-17 13:19:59.908122 7f4244276780 1 client.4117 dump_inode: DISCONNECTED inode 10000000fd0 #10000000fd0 ref 010000000fd0.head(ref=0 cap_refs={} open={3=0} mode=100600 size=0 mtime=2013-06-17 05:42:27.993567 caps=pAsXsFs objectset[10000000fd0 ts 0/0 objects 0 dirty_or_tx 0] 0x440ed80)
2013-06-17 13:19:59.908132 7f4244276780 1 client.4117 dump_inode: DISCONNECTED inode 10000002008 #10000002008 ref 010000002008.head(ref=0 cap_refs={} open={3=0} mode=100600 size=0 mtime=2013-06-17 05:47:51.065532 caps=pAsXsFs objectset[10000002008 ts 0/0 objects 0 dirty_or_tx 0] 0xbd30680)
2013-06-17 13:19:59.908151 7f4244276780 1 client.4117 dump_inode: DISCONNECTED inode 10000000477 #10000000477 ref 010000000477.head(ref=0 cap_refs={1024=0,4096=0,8192=0} open={3=0} mode=100600 size=28489 mtime=2013-06-17 05:36:00.470362 caps=pAsXsFsxcrwb objectset[10000000477 ts 0/0 objects 0 dirty_or_tx 0] 0x2edc000)
2013-06-17 13:19:59.908163 7f4244276780 1 client.4117 dump_inode: DISCONNECTED inode 10000001b06 #10000001b06 ref 010000001b06.head(ref=0 cap_refs={} open={3=0} mode=100600 size=0 mtime=2013-06-17 05:46:08.374166 caps=pAsXsFs objectset[10000001b06 ts 0/0 objects 0 dirty_or_tx 0] 0xb490d80)
2013-06-17 13:19:59.908172 7f4244276780 1 client.4117 dump_inode: DISCONNECTED inode 10000000a02 #10000000a02 ref 010000000a02.head(ref=0 cap_refs={1024=0,4096=0,8192=0} open={3=0} mode=100600 size=53 mtime=2013-06-17 05:39:53.221556 caps=pAsXsFs objectset[10000000a02 ts 0/0 objects 0 dirty_or_tx 0] 0x46c1480)
2013-06-17 13:19:59.908183 7f4244276780 1 client.4117 dump_inode: DISCONNECTED inode 10000001928 #10000001928 ref 010000001928.head(ref=0 cap_refs={} open={3=0} mode=100600 size=0 mtime=2013-06-17 05:45:42.125788 caps=pAsXsFs objectset[10000001928 ts 0/0 objects 0 dirty_or_tx 0] 0xa9cf480)
2013-06-17 13:19:59.908192 7f4244276780 1 client.4117 dump_inode: DISCONNECTED inode 10000001d00 #10000001d00 ref 010000001d00.head(ref=0 cap_refs={} open={3=0} mode=100600 size=0 mtime=2013-06-17 05:47:03.310441 caps=pAsXsFs objectset[10000001d00 ts 0/0 objects 0 dirty_or_tx 0] 0xbb52b00)
2013-06-17 13:19:59.908201 7f4244276780 1 client.4117 dump_inode: DISCONNECTED inode 1000000046f #1000000046f ref 01000000046f.head(ref=0 cap_refs={1024=0,4096=0,8192=0} open={3=0} mode=100600 size=10845 mtime=2013-06-17 05:35:59.171126 caps=pAsXsFscr objectset[1000000046f ts 0/0 objects 0 dirty_or_tx 0] 0x3f8f900)
...
fusermount has already run, so this is in umount/shutdown. Note the ref=0 on every inode.
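For triage, the dump_inode lines can be reduced to the fields that matter here (inode number, ref count, size, caps). The following is only a sketch: the regular expression is inferred from the log excerpt in this report, not from any documented log format, and the names `DUMP_RE` and `parse_disconnected` are made up for illustration.

```python
import re

# Field layout inferred from the dump_inode lines quoted in this report only.
# re.DOTALL lets an entry match even if the log line was wrapped mid-entry.
DUMP_RE = re.compile(
    r"dump_inode: DISCONNECTED inode (?P<ino>[0-9a-f]+)\s"
    r".*?\(ref=(?P<ref>\d+)\s"
    r".*?size=(?P<size>\d+)\s"
    r".*?caps=(?P<caps>\S+)",
    re.DOTALL,
)


def parse_disconnected(log_text):
    """Return one dict per DISCONNECTED inode entry found in log_text."""
    entries = []
    for m in DUMP_RE.finditer(log_text):
        entries.append({
            "ino": m.group("ino"),
            "ref": int(m.group("ref")),
            "size": int(m.group("size")),
            "caps": m.group("caps"),
        })
    return entries


# First entry from the log above, copied verbatim.
sample = (
    "2013-06-17 13:19:59.908081 7f4244276780 1 client.4117 dump_inode: "
    "DISCONNECTED inode 10000000da5 #10000000da5 ref 010000000da5.head(ref=0 "
    "cap_refs={1024=0,4096=0,8192=0} open={3=0} mode=100600 size=30118 "
    "mtime=2013-06-17 05:41:31.083178 caps=pAsXsFs "
    "objectset[10000000da5 ts 0/0 objects 0 dirty_or_tx 0] 0x5066b00)"
)

entries = parse_disconnected(sample)
# Inodes that are still cached even though nothing references them.
stuck = [e for e in entries if e["ref"] == 0]
```

Feeding the whole client log through `parse_disconnected` and filtering on `ref == 0` gives a quick list of inodes that were left in the cache with no references.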
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-06-17_01:30:05-upgrade-master-testing-basic/37859$ cat orig.config.yaml
kernel:
  kdb: true
  sha1: d9b1e9bfdfc3d76e15cbb4bc500c57e02e4779c1
machine_type: plana
nuke-on-error: true
overrides:
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
    log-whitelist:
    - slow request
    sha1: e3fb095d8aa88556e4356c76b848fa61b09acbc0
  install:
    ceph:
      sha1: e3fb095d8aa88556e4356c76b848fa61b09acbc0
  s3tests:
    branch: master
  workunit:
    sha1: e3fb095d8aa88556e4356c76b848fa61b09acbc0
roles:
- - mon.a
  - mds.a
  - osd.0
  - osd.1
- - mon.b
  - mon.c
  - osd.2
  - osd.3
- - client.0
tasks:
- chef: null
- clock.check: null
- install:
    branch: cuttlefish
- ceph:
    log-whitelist:
    - wrongly marked
- ceph-fuse: null
- workunit:
    branch: cuttlefish
    clients:
      all:
      - suites/dbench.sh
- install.upgrade:
    all:
      branch: next
- ceph.restart:
  - mon.a
  - mon.b
  - mon.c
  - osd.0
  - osd.1
  - osd.2
  - osd.3
  - mds.a
- workunit:
    branch: next
    clients:
      all:
      - kernel_untar_build.sh
Associated revisions
client: use put_inode on MetaRequest inode refs
When we drop the request inode refs, we need to use put_inode() to ensure
they get cleaned up properly (removed from inode_map, caps released, etc.).
Do this explicitly here (as we do with all other inode put() paths that
matter).
Fixes: #5381
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
client: use put_inode on MetaRequest inode refs
When we drop the request inode refs, we need to use put_inode() to ensure
they get cleaned up properly (removed from inode_map, caps released, etc.).
Do this explicitly here (as we do with all other inode put() paths that
matter).
Fixes: #5381
Backport: cuttlefish
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 81bee6487fb1ce9e090b030d61bda128a3cf4982)
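The difference the commit message describes, a raw put() on the inode versus put_inode() on the client, can be illustrated with a toy model. This is a sketch under loud assumptions: the `Inode` and `Client` classes below are hypothetical stand-ins written for this note, not Ceph's classes, and they model only the reference count and the inode_map entry (the real cleanup also releases caps, per the commit message).

```python
class Inode:
    """Toy stand-in for a client inode; only the ref count is modeled."""

    def __init__(self, ino):
        self.ino = ino
        self.ref = 0

    def get(self):
        self.ref += 1

    def put(self):
        # Raw decrement: nothing else is told when the count reaches zero.
        self.ref -= 1
        return self.ref


class Client:
    """Toy stand-in for the client cache (hypothetical, not Ceph's Client)."""

    def __init__(self):
        self.inode_map = {}

    def add_inode(self, ino):
        # Look up or create the cache entry and take one reference on it.
        inode = self.inode_map.setdefault(ino, Inode(ino))
        inode.get()
        return inode

    def put_inode(self, inode):
        # Decrement and, on the last reference, drop the cache entry too.
        if inode.put() == 0:
            self.inode_map.pop(inode.ino, None)

    def leftovers(self):
        return sorted(self.inode_map)


client = Client()
a = client.add_inode(0x10000000da5)
b = client.add_inode(0x10000000c3c)

a.put()              # raw put: ref reaches 0 but the cache entry stays behind
client.put_inode(b)  # put_inode: ref reaches 0 and the entry is removed

leaked = client.leftovers()
```

In this model `a` ends up in the same state as the inodes in the log above: ref=0 but still present in the map at shutdown.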
History
#1 Updated by Greg Farnum over 10 years ago
Good chance this is a duplicate of #4850 (though that's fsstress, so maybe not).
#2 Updated by Zheng Yan over 10 years ago
#3 Updated by Sage Weil over 10 years ago
Next time we see this (or any other ceph-fuse shutdown hang), grab the logs manually via scp before nuking, and note the job yaml pls!
#4 Updated by Sage Weil over 10 years ago
- Status changed from New to Need More Info
#5 Updated by Sage Weil over 10 years ago
The config below is sufficient to reproduce. I think this is a problem with unlinked inodes in the client cache not getting cleaned up after the MDS restarts.
machine_type: plana
interactive-on-error: true
overrides:
  admin_socket:
    branch: master
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
      client:
        debug client: 20
        debug ms: 1
      mds:
        debug mds: 20
        debug ms: 1
    log-whitelist:
    - slow request
roles:
- - mon.a
  - mds.a
  - osd.0
  - osd.1
- - mon.b
  - mon.c
  - osd.2
  - osd.3
- - client.0
tasks:
- install:
    branch: cuttlefish
- ceph: null
- ceph-fuse: null
- workunit:
    branch: cuttlefish
    clients:
      client.0:
#      - misc/trivial_sync.sh
      - suites/blogbench.sh
#- install.upgrade:
#    all:
#      branch: next
- ceph.restart:
  - mds.a
  - mon.a
  - mon.b
  - mon.c
  - osd.0
  - osd.1
  - osd.2
  - osd.3
#- workunit:
#    branch: next
#    clients:
#      client.0:
#      - misc/trivial_sync.sh
#      - suites/fsstress.sh
#6 Updated by Greg Farnum over 10 years ago
- Status changed from Need More Info to 12
#7 Updated by Sage Weil over 10 years ago
- Status changed from 12 to Fix Under Review
#8 Updated by Sage Weil over 10 years ago
- Status changed from Fix Under Review to Pending Backport
commit:946a838cffa0927d1237489e8c2c143e87d66892
#9 Updated by Sage Weil about 10 years ago
- Status changed from Pending Backport to Resolved