Bug #2572

closed

krbd: writeback errors?

Added by Alex Elder almost 12 years ago. Updated about 11 years ago.

Status: Resolved
Priority: Normal
Target version: -
% Done: 0%
Source: Development

Description

While trying to reproduce a null pointer messenger problem,
I kept hitting messages like this after some (fairly random)
number of iterations:

[ 885.866666] =====================================
[ 885.866667] [ BUG: bad unlock balance detected! ]
[ 885.866669] 3.5.0-rc1-ceph-00037-g07accf4 #1 Not tainted
[ 885.866670] -------------------------------------
[ 885.866673] flush-250:0/9378 is trying to release lock (&(&wb->list_lock)->r:
[ 885.866679] [<ffffffff811ac309>] wb_writeback+0x199/0x2d0
[ 885.866679] but there are no more locks to release!

(full dump below)

This appears to be a writeback-related problem, but I think we
need to investigate to see whether the problem has something to
do with RBD.

I have been running xfstests over RBD, and test 049 is the one
that reliably reproduces problems. That may make this issue
somewhat harder to understand, because test 049 exercises various
combinations of one file system mounted on a loop device backed
by a file on another file system, which is in turn backed by an
RBD device. The loop-mounted and underlying file systems are
ext2 or xfs.

Actions #1

Updated by Sage Weil almost 12 years ago

  • Project changed from Linux kernel client to rbd

Actions #2

Updated by Josh Durgin almost 12 years ago

  • Subject changed from rbd client: writeback errors? to krbd: writeback errors?

Actions #3

Updated by Ian Colle about 11 years ago

  • Assignee set to Alex Elder

Actions #4

Updated by Alex Elder about 11 years ago

  • Status changed from New to Resolved

I've run xfstests 049 many times consecutively
and I am no longer seeing this issue.

I'm about to run it 100 times, and if I don't see the issue I'm
going to update the nightly test file so that test 049 is added
back to the nightly runs. I couldn't find any issue that
recorded removing that test from the nightly runs, so
I created http://tracker.ceph.com/issues/4244.
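
A repeated run like the one described above could be checked automatically. The following is only a minimal sketch, not the actual nightly test tooling: the function names are illustrative, and `run_once` is a hypothetical callable that performs one iteration (for example, one run of xfstests 049) and returns the kernel log lines produced during it. The only strings it relies on are the ones visible in the log excerpt in the description.

```python
import re
from typing import Callable, Iterable, Optional

# Banner text copied from the lockdep report quoted in the description.
BANNER = "BUG: bad unlock balance detected!"

# Matches a leading kernel timestamp such as "[ 885.866666] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")

# Matches the line naming the task, e.g. "flush-250:0/9378 is trying to release lock".
RELEASE = re.compile(r"^(\S+) is trying to release lock")


def find_unlock_balance_task(log_lines: Iterable[str]) -> Optional[str]:
    """Return the task named in the first 'bad unlock balance' report.

    Returns None if the banner is absent, or if the banner is present
    but no 'is trying to release lock' line follows it.
    """
    seen_banner = False
    for raw in log_lines:
        line = TIMESTAMP.sub("", raw.strip())
        if BANNER in line:
            seen_banner = True
            continue
        if seen_banner:
            match = RELEASE.match(line)
            if match:
                return match.group(1)
    return None


def run_until_hit(run_once: Callable[[], Iterable[str]],
                  max_iterations: int) -> Optional[int]:
    """Call run_once up to max_iterations times.

    run_once is a hypothetical hook: it should run one test iteration
    and return the kernel log lines produced during that iteration.
    Returns the 1-based iteration number of the first hit, or None.
    """
    for iteration in range(1, max_iterations + 1):
        if find_unlock_balance_task(run_once()) is not None:
            return iteration
    return None
```

With `max_iterations=100`, a return value of None would correspond to "ran it 100 times and did not see the issue"; any other value is the iteration on which the message first appeared.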
