Bug #11188

CephFS/NFS lockup

Added by Greg Farnum over 4 years ago. Updated over 4 years ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
NFS (Linux Kernel)
Target version:
-
Start date:
03/20/2015
Due date:
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:

Description

From http://pulpito.ceph.com/teuthology-2015-03-18_23:10:01-knfs-master-testing-basic-multi/810664/

The first backtrace is:

Mar 19 08:29:38 mira087 kernel: [13945.287341] Call Trace:
Mar 19 08:29:38 mira087 kernel: [13945.287343]  [<ffffffff813ce8ef>] __delay+0xf/0x20
Mar 19 08:29:38 mira087 kernel: [13945.287347]  [<ffffffff810b13ce>] do_raw_spin_lock+0x8e/0x170
Mar 19 08:29:38 mira087 kernel: [13945.287351]  [<ffffffff81764910>] _raw_spin_lock+0x40/0x50
Mar 19 08:29:38 mira087 kernel: [13945.287354]  [<ffffffff8123367a>] ? generic_setlease+0x1fa/0x730
Mar 19 08:29:38 mira087 kernel: [13945.287359]  [<ffffffff81372a25>] ? apparmor_file_lock+0x25/0x30
Mar 19 08:29:38 mira087 kernel: [13945.287365]  [<ffffffff8123367a>] generic_setlease+0x1fa/0x730
Mar 19 08:29:38 mira087 kernel: [13945.287369]  [<ffffffff81233bd5>] vfs_setlease+0x25/0x30
Mar 19 08:29:38 mira087 kernel: [13945.287373]  [<ffffffffa033b7d7>] nfs4_put_deleg_lease+0x77/0x90 [nfsd]
Mar 19 08:29:38 mira087 kernel: [13945.287390]  [<ffffffffa034473c>] nfsd4_delegreturn+0x15c/0x170 [nfsd]
Mar 19 08:29:38 mira087 kernel: [13945.287404]  [<ffffffffa03445e5>] ? nfsd4_delegreturn+0x5/0x170 [nfsd]
Mar 19 08:29:38 mira087 kernel: [13945.287418]  [<ffffffffa0330a54>] nfsd4_proc_compound+0x4b4/0x750 [nfsd]
Mar 19 08:29:38 mira087 kernel: [13945.287431]  [<ffffffffa031c505>] nfsd_dispatch+0xe5/0x230 [nfsd]
Mar 19 08:29:38 mira087 kernel: [13945.287440]  [<ffffffffa026d9c2>] ? svc_tcp_adjust_wspace+0x12/0x30 [sunrpc]
Mar 19 08:29:38 mira087 kernel: [13945.287462]  [<ffffffffa026c7a4>] svc_process_common+0x324/0x680 [sunrpc]
Mar 19 08:29:38 mira087 kernel: [13945.287481]  [<ffffffffa026ce73>] svc_process+0x123/0x200 [sunrpc]
Mar 19 08:29:38 mira087 kernel: [13945.287499]  [<ffffffffa031bd47>] nfsd+0x167/0x1e0 [nfsd]
Mar 19 08:29:38 mira087 kernel: [13945.287507]  [<ffffffffa031bbe5>] ? nfsd+0x5/0x1e0 [nfsd]
Mar 19 08:29:38 mira087 kernel: [13945.287516]  [<ffffffffa031bbe0>] ? nfsd_destroy+0xd0/0xd0 [nfsd]
Mar 19 08:29:38 mira087 kernel: [13945.287523]  [<ffffffff8107c7bf>] kthread+0xef/0x110
Mar 19 08:29:38 mira087 kernel: [13945.287530]  [<ffffffff8107c6d0>] ? flush_kthread_worker+0xf0/0xf0
Mar 19 08:29:38 mira087 kernel: [13945.287533]  [<ffffffff81765548>] ret_from_fork+0x58/0x90
Mar 19 08:29:38 mira087 kernel: [13945.287536]  [<ffffffff8107c6d0>] ? flush_kthread_worker+0xf0/0xf0

I copied the whole kern.log file to mira087.kern.log in that test directory.

0001-locks-fix-file_lock-deletion-inside-loop.patch (1.28 KB) Zheng Yan, 03/27/2015 02:37 AM

History

#1 Updated by Zheng Yan over 4 years ago

Looks like an NFSD bug; let's wait for another rc release.

#2 Updated by Greg Farnum over 4 years ago

If it's definitely not a CephFS bug we should probably report the failure somewhere?

#3 Updated by Zheng Yan over 4 years ago

It's a bug in the kernel locking code. I sent the attached patch to fs-devel.

#4 Updated by Zheng Yan over 4 years ago

  • Status changed from Testing to Resolved

Upstreamed by commit a901125c65544aa05c52e1a7388c3900e8af105f.
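
For readers unfamiliar with this class of bug: the attached patch's title ("fix file_lock deletion inside loop") suggests the classic hazard of freeing a list entry while a loop still needs that entry's next pointer. The following is only a userspace sketch of that general pattern, not the kernel code or the patch itself; `struct lease`, `lease_push`, `lease_remove_owner`, and `lease_count` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a lock record on a singly linked list. */
struct lease {
    int owner;
    struct lease *next;
};

/* Prepend a new lease; returns the new head (old head if malloc fails). */
static struct lease *lease_push(struct lease *head, int owner)
{
    struct lease *l = malloc(sizeof(*l));
    if (!l)
        return head;
    l->owner = owner;
    l->next = head;
    return l;
}

/*
 * Remove every lease held by `owner` and return how many were removed.
 * The next pointer is saved BEFORE the current entry is freed, so the
 * loop never reads freed memory. Reading cur->next after free(cur) is
 * the unsafe variant this sketch is meant to contrast with.
 */
static int lease_remove_owner(struct lease **head, int owner)
{
    struct lease **link = head;
    int removed = 0;

    assert(head != NULL);
    while (*link) {
        struct lease *cur = *link;
        struct lease *next = cur->next; /* saved before any free */

        if (cur->owner == owner) {
            *link = next;   /* unlink */
            free(cur);      /* safe: `next` already captured */
            removed++;
        } else {
            link = &cur->next;
        }
    }
    return removed;
}

/* Count the entries remaining on the list. */
static int lease_count(const struct lease *head)
{
    int n = 0;
    for (; head; head = head->next)
        n++;
    return n;
}
```

In the kernel, the analogous idiom is the `_safe` family of list iterators (for example `list_for_each_entry_safe`), which keep a second cursor for the same reason. Whether the upstream commit uses exactly that helper is an assumption here and should be checked against the commit itself.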
