Bug #54107: kclient: hang during umount - CephFS - Ceph

Actions

Copy link

Bug #54107

closed

kclient: hang during umount

Added by Patrick Donnelly about 2 years ago. Updated almost 2 years ago.

Status:

Resolved

Priority:

High

Assignee:

Jeff Layton

Category:

Target version:

Ceph - v18.0.0

% Done:

Source:

Q/A

Tags:

Backport:

Regression:

Severity:

3 - minor

Reviewed:

Affected Versions:

ceph-qa-suite:

Component(FS):

kceph

Labels (FS):

qa-failure

Pull request ID:

Crash signature (v1):

Crash signature (v2):

Description

2022-01-29T02:45:30.067286+00:00 smithi006 kernel: INFO: task umount:96866 blocked for more than 120 seconds.
2022-01-29T02:45:30.067390+00:00 smithi006 kernel:      Not tainted 4.18.0-305.el8.x86_64 #1
2022-01-29T02:45:30.067419+00:00 smithi006 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2022-01-29T02:45:30.075501+00:00 smithi006 kernel: umount          D    0 96866  96850 0x00004080
2022-01-29T02:45:30.081365+00:00 smithi006 kernel: Call Trace:
2022-01-29T02:45:30.084184+00:00 smithi006 kernel: __schedule+0x2c4/0x700
2022-01-29T02:45:30.088058+00:00 smithi006 kernel: schedule+0x38/0xa0
2022-01-29T02:45:30.096379+00:00 smithi006 kernel: ceph_mdsc_sync+0x2f1/0x350 [ceph]
2022-01-29T02:45:30.096428+00:00 smithi006 kernel: ? finish_wait+0x80/0x80
2022-01-29T02:45:30.100309+00:00 smithi006 kernel: ceph_sync_fs+0x2f/0xb0 [ceph]
2022-01-29T02:45:30.104750+00:00 smithi006 kernel: sync_filesystem+0x71/0x90
2022-01-29T02:45:30.108852+00:00 smithi006 kernel: generic_shutdown_super+0x22/0x100
2022-01-29T02:45:30.113653+00:00 smithi006 kernel: kill_anon_super+0x14/0x30
2022-01-29T02:45:30.122198+00:00 smithi006 kernel: ceph_kill_sb+0x39/0x70 [ceph]
2022-01-29T02:45:30.122244+00:00 smithi006 kernel: deactivate_locked_super+0x34/0x70
2022-01-29T02:45:30.127003+00:00 smithi006 kernel: cleanup_mnt+0x3b/0x70
2022-01-29T02:45:30.130745+00:00 smithi006 kernel: task_work_run+0x8a/0xb0
2022-01-29T02:45:30.134696+00:00 smithi006 kernel: exit_to_usermode_loop+0xeb/0xf0
2022-01-29T02:45:30.139339+00:00 smithi006 kernel: do_syscall_64+0x198/0x1a0
2022-01-29T02:45:30.143456+00:00 smithi006 kernel: entry_SYSCALL_64_after_hwframe+0x65/0xca
2022-01-29T02:45:30.148857+00:00 smithi006 kernel: RIP: 0033:0x7f19ccba4dfb
2022-01-29T02:45:30.159709+00:00 smithi006 kernel: Code: Unable to access opcode bytes at RIP 0x7f19ccba4dd1.
2022-01-29T02:45:30.159752+00:00 smithi006 kernel: RSP: 002b:00007ffdf87aad78 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
2022-01-29T02:45:30.167628+00:00 smithi006 kernel: RAX: 0000000000000000 RBX: 00005624695dc5d0 RCX: 00007f19ccba4dfb
2022-01-29T02:45:30.175123+00:00 smithi006 kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00005624695dc7b0
2022-01-29T02:45:30.182624+00:00 smithi006 kernel: RBP: 0000000000000000 R08: 00005624695dc7e0 R09: 00007f19ccc28580
2022-01-29T02:45:30.190119+00:00 smithi006 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00005624695dc7b0
2022-01-29T02:45:30.197616+00:00 smithi006 kernel: R13: 00007f19cd952184 R14: 0000000000000000 R15: 00000000ffffffff

From: /ceph/teuthology-archive/pdonnell-2022-01-29_01:47:41-fs:workload-wip-pdonnell-testing-20220127.171526-distro-default-smithi/6648146/remote/smithi006/syslog/kern.log.gz

Related issues 1 (0 open — 1 closed)

Actions

Copy link

Updated by Patrick Donnelly about 2 years ago

Related to Bug #54106: kclient: hang during workunit cleanup added

Actions

Copy link

Updated by Patrick Donnelly about 2 years ago

Maybe related on testing branch: /ceph/teuthology-archive/pdonnell-2022-01-28_01:59:55-fs:workload-wip-pdonnell-testing-20220127.171526-distro-default-smithi/6645065/remote/smithi019/syslog/kern.log.gz

2022-01-28T03:22:27.902598+00:00 smithi019 kernel: libceph: mon0 (2)172.21.15.19:3300 session established
2022-01-28T03:22:27.910606+00:00 smithi019 kernel: libceph: client15204 fsid fbf413ce-7fe7-11ec-8c35-001a4aab830c
2022-01-28T03:41:39.136284+00:00 smithi019 kernel: ceph: dump_cap_flushes: still waiting for cap flushes through 79658:
2022-01-28T03:41:39.136444+00:00 smithi019 kernel: ceph: 5000000121f:fffffffffffffffe Fw 77655 0 1
2022-01-28T03:42:40.575953+00:00 smithi019 kernel: ceph: dump_cap_flushes: still waiting for cap flushes through 79658:
2022-01-28T03:42:40.576109+00:00 smithi019 kernel: ceph: 5000000121f:fffffffffffffffe Fw 77655 0 1
2022-01-28T03:43:42.015942+00:00 smithi019 kernel: ceph: dump_cap_flushes: still waiting for cap flushes through 79658:
2022-01-28T03:43:42.016095+00:00 smithi019 kernel: ceph: 5000000121f:fffffffffffffffe Fw 77655 0 1
2022-01-28T03:44:43.455977+00:00 smithi019 kernel: ceph: dump_cap_flushes: still waiting for cap flushes through 79658:
2022-01-28T03:44:43.456138+00:00 smithi019 kernel: ceph: 5000000121f:fffffffffffffffe Fw 77655 0 1
2022-01-28T03:45:36.299594+00:00 smithi019 kernel: libceph: mds0 (2)172.21.15.139:6838 socket closed (con state OPEN)
2022-01-28T03:45:36.320594+00:00 smithi019 kernel: libceph: mds2 (2)172.21.15.139:6834 socket closed (con state OPEN)
2022-01-28T03:45:36.331594+00:00 smithi019 kernel: libceph: mds1 (2)172.21.15.139:6836 socket closed (con state OPEN)
2022-01-28T03:45:36.372590+00:00 smithi019 kernel: libceph: mds3 (2)172.21.15.139:6832 socket closed (con state OPEN)
2022-01-28T03:45:36.372778+00:00 smithi019 kernel: libceph: mds4 (2)172.21.15.19:6840 socket closed (con state OPEN)
2022-01-28T06:58:34.552206+00:00 smithi019 kernel: perf: interrupt took too long (2525 > 2500), lowering kernel.perf_event_max_sample_rate to 79000

Actions

Copy link

Updated by Jeff Layton about 2 years ago

Status changed from New to In Progress

Actions

Copy link

Updated by Jeff Layton about 2 years ago

It really looks like we just never got a FLUSH_ACK for this cap:

2022-01-28T03:41:39.136284+00:00 smithi019 kernel: ceph: dump_cap_flushes: still waiting for cap flushes through 79658:
2022-01-28T03:41:39.136444+00:00 smithi019 kernel: ceph: 5000000121f:fffffffffffffffe Fw 77655 0 1

It really looks a lot like we just never received a FLUSH_ACK for this inode. The 4th field (the 0) is ci->i_last_cap_flush_ack, and that is set to 0, which means we've never seen a FLUSH_ACK for this inode.

What happens if the MDS gets a cap flush for an inode that doesn't exist yet? I don't think we can end up sending a cap flush before the create call, but it seems plausible that they could be processed in reverse order in the MDS.

Do we need to make the client wait until it gets the create response before we send a cap flush for it?

Actions

Copy link

Updated by Jeff Layton about 2 years ago

Venky, Patrick : any thoughts on the above? Either way, it might be nice to revise the dout() messages in Locker::handle_client_caps with lower numerical values, so that we get log messages if it just drops a cap flush on the floor.

I'll see if I can come up with an RFC patch for the MDS.

Actions

Copy link

Updated by Patrick Donnelly about 2 years ago

Jeff Layton wrote:

It really looks like we just never got a FLUSH_ACK for this cap:

[...]

It really looks a lot like we just never received a FLUSH_ACK for this inode. The 4th field (the 0) is ci->i_last_cap_flush_ack, and that is set to 0, which means we've never seen a FLUSH_ACK for this inode.

What happens if the MDS gets a cap flush for an inode that doesn't exist yet? I don't think we can end up sending a cap flush before the create call, but it seems plausible that they could be processed in reverse order in the MDS.

I think if the create gets put on a wait list then maybe (?) a cap flush could be sent and ignored by the MDS.