Feature #9477
Handle kclient shutdown with dead network more gracefully
Status:
Closed
Priority:
Normal
Assignee:
-
Category:
Introspection/Control
Target version:
-
% Done:
0%
Source:
other
Tags:
Backport:
Reviewed:
Affected Versions:
Component(FS):
kceph
Labels (FS):
Pull request ID:
Description
[ 2281.086454] INFO: task sync:3132 blocked for more than 120 seconds.
[ 2281.092778] Tainted: G E 3.17.0-rc5-ceph-00011-g177c131 #1
[ 2281.099911] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2281.107836] sync D ffff8804256ab900 0 3132 3125 0x00000000
[ 2281.115087] ffff8800369cfc18 0000000000000046 ffff8800369cfbc8 ffff8800369a0000
[ 2281.122730] 0000000000013b80 ffff8800369cffd8 ffff8800369cfbd8 0000000000013b80
[ 2281.130393] ffff88042965c3a0 ffff8800369a0000 ffffffff81118503 ffff88043fc944f8
[ 2281.137983] Call Trace:
[ 2281.140515] [<ffffffff81118503>] ? __delayacct_blkio_start+0x23/0x30
[ 2281.147043] [<ffffffff8172c9c0>] ? bit_wait+0x50/0x50
[ 2281.152264] [<ffffffff8172c149>] schedule+0x29/0x70
[ 2281.157309] [<ffffffff8172c21f>] io_schedule+0x8f/0xd0
[ 2281.162623] [<ffffffff8172c9eb>] bit_wait_io+0x2b/0x50
[ 2281.167934] [<ffffffff8172c8b5>] __wait_on_bit+0x65/0x90
[ 2281.173428] [<ffffffff8115e8fe>] ? find_get_pages_tag+0x2e/0x1e0
[ 2281.179605] [<ffffffff8115dcc7>] wait_on_page_bit+0xc7/0xd0
[ 2281.185358] [<ffffffff8109a260>] ? wake_atomic_t_function+0x40/0x40
[ 2281.191802] [<ffffffff8115e234>] filemap_fdatawait_range+0xf4/0x190
[ 2281.198288] [<ffffffff811fb905>] ? sync_inodes_sb+0x1a5/0x280
[ 2281.204206] [<ffffffff811fb905>] ? sync_inodes_sb+0x1a5/0x280
[ 2281.210135] [<ffffffff8115e2fb>] filemap_fdatawait+0x2b/0x30
[ 2281.215966] [<ffffffff811fb949>] sync_inodes_sb+0x1e9/0x280
[ 2281.221712] [<ffffffff81203100>] ? fdatawrite_one_bdev+0x20/0x20
[ 2281.227896] [<ffffffff81203100>] ? fdatawrite_one_bdev+0x20/0x20
[ 2281.234070] [<ffffffff8120311d>] sync_inodes_one_sb+0x1d/0x30
[ 2281.239983] [<ffffffff811d43a1>] iterate_supers+0xf1/0x100
[ 2281.245639] [<ffffffff81203285>] sys_sync+0x35/0x90
[ 2281.250691] [<ffffffff81732096>] system_call_fastpath+0x1a/0x1f
[ 2281.256778] 1 lock held by sync/3132:
[ 2281.260522] #0: (&type->s_umount_key#28){+++++.}, at: [<ffffffff811d4338>] iterate_supers+0x88/0x100
[ 2355.114904] libceph: mds0 10.214.132.128:6816 socket closed (con state CONNECTING)
Reproduce:
- Mount a kernel cephfs client
- Block all ceph traffic (I'm using an iptables rule)
- shutdown -r now
- Host becomes unreachable over SSH (the OpenSSH server has shut down), but never completes the shutdown
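The steps above can be sketched as a shell session. The mount point and monitor address are examples, and the port list is an assumption based on the Ceph defaults (6789 for monitors, 6800-7300 for OSDs/MDSs):

```shell
# 1. Mount a kernel CephFS client (mount point and monitor address are examples).
mount -t ceph 10.214.132.128:6789:/ /mnt/cephfs -o name=admin

# 2. Block all Ceph traffic with an iptables rule (assumed default ports).
iptables -A OUTPUT -p tcp -m multiport --dports 6789,6800:7300 -j DROP

# 3. Reboot; with dirty CephFS data, the final sync blocks indefinitely.
shutdown -r now
```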
Updated by John Spray over 9 years ago
- Subject changed from kclient prevents reboot when network is dead to kclient prevents shutdown/reboot when network is dead
- Description updated (diff)
Updated by John Spray over 9 years ago
Ah, this only happens if I have some dirty state from userspace at the time. In this instance it's my Mount.open_background routine from ceph-qa-suite/tasks/filesystem/mount.py:
import time
f = open("{path}", 'w')
f.write('content')
f.flush()
f.write('content2')
while True:
    time.sleep(1)
...aaand here's a slightly more complete console log. After this it repeats the "blocked for more than..." dump.
[ 1102.303850] libceph: client4134 fsid 86b8eeb8-25a3-46de-9aa3-a3119ca78128
[ 1102.313948] libceph: mon0 10.214.132.128:6789 session established
[ 1106.379213] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 1107.504720] init: idmapd main process (830) killed by TERM signal
[ 1107.513234] init: tty4 main process (1114) killed by TERM signal
[ 1107.519597] init: tty5 main process (1120) killed by TERM signal
[ 1107.526005] init: tty2 main process (1127) killed by TERM signal
[ 1107.532507] init: tty3 main process (1128) killed by TERM signal
[ 1107.538940] init: tty6 main process (1130) killed by TERM signal
[ 1107.545384] init: cron main process (1145) killed by TERM signal
[ 1107.552122] init: irqbalance main process (1162) killed by TERM signal
[ 1107.559293] init: ttyS2 main process (1518) killed by TERM signal
[ 1107.565893] init: tty1 main process (1519) killed by TERM signal
[ 1107.572364] init: ttyS1 main process (1520) killed by TERM signal
[ 1107.579808] init: plymouth-upstart-bridge main process (3204) terminated with status 1
 * Stopping Distributed Compiler Daemon: distccd   [ OK ]
 * Stopping nagios-nrpe   [ OK ]
Stopping PKCS#11 slot daemon: pkcsslotd.
No /usr/bin/radosgw found running; none killed.
 * Stopping System status server   [ OK ]
 * Stopping Trusted Computing daemon tcsd
 * Unmounted debugfs from /sys/kernel/debug
 * Stopping quota service rpc.rquotad   [ OK ]
 * Stopping NFS kernel daemon
[ 1107.833517] nfsd: last server has exited, flushing export cache
 * Unexporting directories for NFS kernel daemon...   [ OK ]
 * Turning off quotas...
[ 1109.031451] libceph: osd0 10.214.132.128:6800 socket closed (con state CONNECTING)
[ 1110.386623] libceph: osd0 10.214.132.128:6800 socket closed (con state CONNECTING)
[ 1112.385543] libceph: osd0 10.214.132.128:6800 socket closed (con state CONNECTING)
[ 1115.388049] libceph: osd0 10.214.132.128:6800 socket closed (con state CONNECTING)
[ 1120.389271] libceph: osd0 10.214.132.128:6800 socket closed (con state CONNECTING)
[ 1129.384467] libceph: osd0 10.214.132.128:6800 socket closed (con state CONNECTING)
[ 1146.399377] libceph: osd0 10.214.132.128:6800 socket closed (con state CONNECTING)
[ 1172.393523] ceph: mds0 caps stale
[ 1179.405782] libceph: osd0 10.214.132.128:6800 socket closed (con state CONNECTING)
[ 1244.395107] libceph: osd0 10.214.132.128:6800 socket closed (con state CONNECTING)
[ 1319.962782] INFO: task sync:3382 blocked for more than 120 seconds.
[ 1319.969152] Tainted: G E 3.17.0-rc5-ceph-00011-g177c131 #1
[ 1319.976283] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1319.984210] sync D ffff880425bde3c0 0 3382 3375 0x00000000
[ 1319.991392] ffff880427447c18 0000000000000046 ffff880427447bc8 ffff880426698000
[ 1319.999040] 0000000000013b80 ffff880427447fd8 ffff880427447bd8 0000000000013b80
[ 1320.006684] ffff88042966c3a0 ffff880426698000 ffffffff81118503 ffff88043fd544f8
[ 1320.014269] Call Trace:
[ 1320.016807] [<ffffffff81118503>] ? __delayacct_blkio_start+0x23/0x30
[ 1320.023339] [<ffffffff8172c9c0>] ? bit_wait+0x50/0x50
[ 1320.028565] [<ffffffff8172c149>] schedule+0x29/0x70
[ 1320.033612] [<ffffffff8172c21f>] io_schedule+0x8f/0xd0
[ 1320.038923] [<ffffffff8172c9eb>] bit_wait_io+0x2b/0x50
[ 1320.044233] [<ffffffff8172c8b5>] __wait_on_bit+0x65/0x90
[ 1320.049720] [<ffffffff8115e8fe>] ? find_get_pages_tag+0x2e/0x1e0
[ 1320.055898] [<ffffffff8115dcc7>] wait_on_page_bit+0xc7/0xd0
[ 1320.061644] [<ffffffff8109a260>] ? wake_atomic_t_function+0x40/0x40
[ 1320.068086] [<ffffffff8115e234>] filemap_fdatawait_range+0xf4/0x190
[ 1320.074533] [<ffffffff811fb905>] ? sync_inodes_sb+0x1a5/0x280
[ 1320.080450] [<ffffffff811fb905>] ? sync_inodes_sb+0x1a5/0x280
[ 1320.086366] [<ffffffff8115e2fb>] filemap_fdatawait+0x2b/0x30
[ 1320.092199] [<ffffffff811fb949>] sync_inodes_sb+0x1e9/0x280
[ 1320.097943] [<ffffffff81203100>] ? fdatawrite_one_bdev+0x20/0x20
[ 1320.104116] [<ffffffff81203100>] ? fdatawrite_one_bdev+0x20/0x20
[ 1320.110297] [<ffffffff8120311d>] sync_inodes_one_sb+0x1d/0x30
[ 1320.116216] [<ffffffff811d43a1>] iterate_supers+0xf1/0x100
[ 1320.121871] [<ffffffff81203285>] sys_sync+0x35/0x90
[ 1320.126926] [<ffffffff81732096>] system_call_fastpath+0x1a/0x1f
[ 1320.133012] 1 lock held by sync/3382:
[ 1320.136753] #0: (&type->s_umount_key#28){+++++.}, at: [<ffffffff811d4338>] iterate_supers+0x88/0x100
[ 1373.606150] libceph: osd0 10.214.132.128:6800 socket closed (con state CONNECTING)
[ 1402.270808] ceph: mds0 hung
Updated by John Spray over 9 years ago
- Project changed from Ceph to CephFS
- Category set to 53
Updated by John Spray over 9 years ago
- Tracker changed from Bug to Feature
- Subject changed from kclient prevents shutdown/reboot when network is dead to Handle kclient shutdown with dead network more gracefully
In the general case (e.g. root filesystem is cephfs) there's nothing we can do: the system can't shut down until the filesystem data can be flushed. But for typical distributed filesystem use cases, there are things we can do to be more helpful:
- Hook into userspace init system to try remounting cephfs as RO before shutting down the SSH daemon
- Detect if the network has been shut down in userspace and assume it will be unavailable thereafter (so "crash" the filesystem in response)
- Log to syslog before shutting down userspace services to indicate that we are going into shutdown with dirty data in the filesystem
- Print a more descriptive message to the console while shutdown is stuck, like "Cannot complete shutdown because Ceph client caches are dirty, waiting..."
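The first and third suggestions above could be combined in a userspace pre-shutdown hook; a minimal sketch, assuming a hypothetical mount point of /mnt/cephfs and an init system that runs the hook before networking is torn down:

```shell
#!/bin/sh
# Hypothetical pre-shutdown hook (not part of Ceph): remount CephFS
# read-only while the network is still up, which flushes dirty data and
# prevents new writes; leave a syslog trail if the remount fails, since
# that means we are going into shutdown with dirty data.
MNT=/mnt/cephfs   # assumed mount point

if ! mount -o remount,ro "$MNT"; then
    logger -p daemon.warning \
        "cephfs: entering shutdown with dirty data on $MNT; sync may hang"
fi
```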
Updated by Vasu Kulkarni almost 8 years ago
All four of the options above look interesting, but implementing the fourth (the descriptive console message) in the meantime would be very useful.
Updated by Greg Farnum almost 8 years ago
- Category changed from 53 to Introspection/Control
Updated by Zheng Yan over 4 years ago
- Status changed from New to Closed
This can be handled by 'umount -f'.
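For anyone hitting this, the workaround looks roughly like the following (the mount point is hypothetical; -f force-unmounts even when the MDS is unreachable, and -l lazily detaches the mount if processes still hold it open):

```shell
# Force-unmount the unresponsive CephFS mount so shutdown can proceed.
umount -f /mnt/cephfs
# If the mount is still busy, a lazy unmount detaches it immediately:
umount -l /mnt/cephfs
```

Note that dirty data that could not reach the cluster is lost either way; this only lets the host finish shutting down.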