Bug #57986
Status: Closed
ceph: ceph_fl_release_lock cause "unable to handle kernel paging request at ffffffffffffff34"
% Done: 0%
Regression: No
Severity: 3 - minor
Description
 #0 [ffff95202f33b960] machine_kexec at ffffffff890662f4
 #1 [ffff95202f33b9c0] __crash_kexec at ffffffff89122b82
 #2 [ffff95202f33ba90] crash_kexec at ffffffff89122c70
 #3 [ffff95202f33baa8] oops_end at ffffffff89791798
 #4 [ffff95202f33bad0] no_context at ffffffff89075d14
 #5 [ffff95202f33bb20] __bad_area_nosemaphore at ffffffff89075fe2
 #6 [ffff95202f33bb70] bad_area_nosemaphore at ffffffff89076104
 #7 [ffff95202f33bb80] __do_page_fault at ffffffff89794750
 #8 [ffff95202f33bbf0] do_page_fault at ffffffff89794975
 #9 [ffff95202f33bc20] page_fault at ffffffff89790778
    [exception RIP: ceph_fl_release_lock+20]
    RIP: ffffffffc08247a4  RSP: ffff95202f33bcd0  RFLAGS: 00010286
    RAX: ffff952d4ebd8a00  RBX: 0000000000000000  RCX: dead000000000200
    RDX: ffff95202f33bd60  RSI: ffff95202f33bd60  RDI: ffff9526b6ac5b00
    RBP: ffff95202f33bce0   R8: ffff9526b6ac5b18   R9: ffffffffc083c368
    R10: 0000000000001109  R11: 0000000000000000  R12: ffff95202f33bd60
    R13: ffff9526b6ac5b00  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#10 [ffff95202f33bce8] locks_release_private at ffffffff892ab3d7
#11 [ffff95202f33bd00] locks_free_lock at ffffffff892ac34d
#12 [ffff95202f33bd18] locks_dispose_list at ffffffff892ac44b
#13 [ffff95202f33bd40] __posix_lock_file at ffffffff892acdfa
#14 [ffff95202f33bda8] posix_lock_file at ffffffff892ad146
#15 [ffff95202f33bdb8] ceph_lock at ffffffffc0824e8a [ceph]
#16 [ffff95202f33bdf8] vfs_lock_file at ffffffff892ad185
#17 [ffff95202f33be08] locks_remove_posix at ffffffff892ad239
#18 [ffff95202f33bee0] locks_remove_posix at ffffffff892ad2a0
#19 [ffff95202f33bef0] filp_close at ffffffff8924baa6
#20 [ffff95202f33bf18] __close_fd at ffffffff8926f89c
#21 [ffff95202f33bf40] sys_close at ffffffff8924d503
#22 [ffff95202f33bf50] system_call_fastpath at ffffffff89799f92
    RIP: 00007f806ec446ab  RSP: 00007f80517f0d90  RFLAGS: 00010206
    RAX: 0000000000000003  RBX: 00007f8030001a20  RCX: 00007f80300386b0
    RDX: 00007f806ef0d880  RSI: 0000000000000001  RDI: 0000000000000006
    RBP: 00007f806ef0e3c0   R8: 00007f80517fa700   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000206  R12: 0000000000000000
    R13: 00007f80300035b0  R14: 00007f80517f1104  R15: 000000000000006c
    ORIG_RAX: 0000000000000003  CS: 0033  SS: 002b
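A side note on the faulting address in the title: read as a signed 64-bit value, ffffffffffffff34 is a small negative number, which is what one would expect if a structure member were accessed through a pointer computed from a NULL base (for example, container_of-style arithmetic on a NULL inode pointer). That interpretation is an assumption; the report does not include the structure layout. The helper name `as_signed64` below is made up for this sketch.

```python
def as_signed64(addr):
    """Interpret a 64-bit address as a two's-complement signed value."""
    # Addresses with the top bit set wrap around to negative offsets.
    return addr - (1 << 64) if addr & (1 << 63) else addr


# Faulting address taken from the bug title.
fault_addr = 0xFFFFFFFFFFFFFF34
offset = as_signed64(fault_addr)

# Print the offset relative to NULL in decimal and hex.
print(offset, hex(offset))
```

The result is an offset of -204 (-0xcc) from address zero, which would fit a NULL pointer being adjusted by a fixed structure offset before the dereference.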
Updated by Xiubo Li over 1 year ago
There appears to be a race in `filp_close()`. For example, in a single process a file is opened twice, with two different file descriptors in two different threads: filpA for userspace ThreadA and filpB for userspace ThreadB. Both ThreadA and ThreadB then set POSIX locks on the file:
Userspace ThreadA:                      Userspace ThreadB:
filp_close():                           filp_close():
 ->locks_remove_posix(filpA):            ->locks_remove_posix(filpB):
   ->vfs_lock_file():                      ->vfs_lock_file():
     ->ceph_lock():                          ->ceph_lock():
       ->posix_lock_file():
         ->__posix_lock_file():
           ->Iterate over and remove all of the
             inode's POSIX locks that have the
             same owner. All of the POSIX locks
             share one owner, current->files, so
             this also removes ThreadB's lock.
                                               ->posix_lock_file():
                                                 ->__posix_lock_file():
                                                   ->Does nothing, since no POSIX
                                                     lock is left on the inode.
                                                   ->locks_dispose_list():
                                                     ->Does nothing either.
                                         ->fput(filpB):
                                           ->__fput(filpB):
                                             ->file->f_path.dentry = NULL;
                                             ->file->f_inode = NULL;
           ->locks_dispose_list():
             ->locks_free_lock():
               ->locks_release_private():
                 ->Releases both ThreadA's and
                   ThreadB's POSIX locks, and
                   crashes when it accesses filpB.
 ->fput(filpA)
In kernel space, ThreadA and ThreadB share the same file descriptors, so, if my understanding is correct, the POSIX locks' owner is the same for both: current->files.
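The shared-owner behaviour can be observed from userspace. The following is a minimal sketch, not a reproducer for the kernel crash: it opens one file twice in a single process, takes a POSIX lock through the first descriptor, and shows that closing the second descriptor drops the lock. The helper names `blocked_for_other_process` and `demo` are made up for this example, and a single thread is used because the lock owner is process-wide anyway. It assumes a Linux or other POSIX system with a local temporary directory.

```python
import fcntl
import os
import tempfile


def blocked_for_other_process(path):
    """Fork a child (a different lock owner) and report whether it is
    refused a non-blocking exclusive POSIX lock on `path`."""
    pid = os.fork()
    if pid == 0:
        status = 0
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            status = 1  # a conflicting lock is held by another owner
        finally:
            os._exit(status)  # never fall back into the parent's code
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 1


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        fd_a = os.open(path, os.O_RDWR)  # plays the role of filpA
        fd_b = os.open(path, os.O_RDWR)  # plays the role of filpB
        fcntl.lockf(fd_a, fcntl.LOCK_EX)  # POSIX lock taken via fd_a
        held_before = blocked_for_other_process(path)
        os.close(fd_b)  # closing the *other* descriptor...
        held_after = blocked_for_other_process(path)  # ...drops the lock
        os.close(fd_a)
        return held_before, held_after
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

On a system with standard POSIX record-lock semantics, the lock is visible to the child before `fd_b` is closed and gone afterwards, even though it was taken through `fd_a`. This is the userspace view of one close path removing every lock that belongs to the same owner.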
Updated by Xiubo Li over 1 year ago
- Status changed from In Progress to Fix Under Review
The patchwork links:
Jeff's VFS locks patch:
https://patchwork.kernel.org/project/ceph-devel/list/?series=695950
Ceph layer patches:
https://patchwork.kernel.org/project/ceph-devel/list/?series=696724
Updated by Xiubo Li over 1 year ago
- Status changed from Fix Under Review to Resolved
Applied to the mainline and closing this tracker.