Bug #552

closed

Samba with kernel oplocks=on produces lots of corrupt mds entries in dmesg

Added by Paul Komkoff over 13 years ago. Updated over 13 years ago.

Status: Resolved
Priority: Normal
Category: -
% Done: 0%

Description

With kernel oplocks = yes in smb.conf, Samba fills dmesg with messages like these:

[ 4472.504211] ceph: problem parsing dir contents -5
[ 4472.505543] ceph: mds parse_reply err -5
[ 4472.506799] ceph: mdsc_handle_reply got corrupt reply mds0
[ 4472.508131] header: 00000000: 21 00 00 00 00 00 00 00 d1 57 05 00 00 00 00 00 !........W......
[ 4472.508135] header: 00000010: 1a 00 7f 00 01 00 1c 00 00 00 00 00 00 00 00 00 ................
[ 4472.508138] header: 00000020: 00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 ................
[ 4472.508141] header: 00000030: 00 2a 7f 70 e5 .*.p.
[ 4472.508143] front: 00000000: 10 01 00 00 00 00 00 00 45 00 00 00 01 00 00 00 ........E.......
[ 4472.508146] front: 00000010: 00 00 00 01 00 00 00 01 00 00 00 00 ............
[ 4472.508149] footer: 00000000: 24 13 71 62 00 00 00 00 00 00 00 00 01 $.qb.........

client kernel 2.6.36 (2.6.36-1.fc15.x86_64)
server ceph version 0.23~rc (commit:62716aa7c9a264c7a575bbccde0d8a7002563210)

Actions #1

Updated by Sage Weil over 13 years ago

  • Target version set to v2.6.37

Actions #2

Updated by Sage Weil over 13 years ago

  • Assignee set to Greg Farnum

From the reply dump, it looks like a ceph_mds_reply_head, a length 0 tracebl, a length 1 extrabl (containing a u8 == 1), and a length 0 snapbl. The question is where that extrabl came from. My guess is

  bufferlist lock_bl;
  ::encode(lock_state, lock_bl);

  MClientReply *reply = new MClientReply(req);
  reply->set_extra_bl(lock_bl);

from handle_client_file_readlock(). That lock_state looks suspect because it's a pointer:

  ceph_lock_state_t *lock_state = NULL;

Even if it did encode something sensible, though, the kclient must not try to decode the extra payload as directory contents unless the original request was a readdir.
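
To make the reading of the dump concrete, here is a minimal sketch that walks the 28 "front" bytes from the report and splits them into the reply head and the three length-prefixed blobs. It is not the patch that fixed this bug. It assumes that ceph_mds_reply_head is packed as three little-endian 32-bit fields (op, result, mdsmap_epoch) followed by three u8 flags (safe, is_dentry, is_target), and that the op codes are 0x00305 for readdir and 0x00110 for getfilelock; check both against ceph_fs.h for your tree. The names Reader, DecodedReply, decode_front and extra_is_dir_contents are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Front payload copied from the "front:" lines of the dmesg dump (28 bytes).
static const std::vector<uint8_t> kFront = {
    0x10, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x45, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x01,
    0x00, 0x00, 0x00, 0x00};

// Assumed op codes; verify against ceph_fs.h.
constexpr uint32_t kOpReaddir = 0x00305;
constexpr uint32_t kOpGetFileLock = 0x00110;

struct DecodedReply {
    uint32_t op = 0, result = 0, mdsmap_epoch = 0;
    uint8_t safe = 0, is_dentry = 0, is_target = 0;
    std::vector<uint8_t> trace, extra, snap;
};

// Bounds-checked little-endian cursor over a byte buffer.
class Reader {
public:
    explicit Reader(const std::vector<uint8_t>& b) : buf_(b) {}

    bool u8(uint8_t& out) {
        if (buf_.size() - pos_ < 1) return false;
        out = buf_[pos_++];
        return true;
    }
    bool le32(uint32_t& out) {
        if (buf_.size() - pos_ < 4) return false;
        out = 0;
        for (int i = 0; i < 4; ++i)
            out |= static_cast<uint32_t>(buf_[pos_ + i]) << (8 * i);
        pos_ += 4;
        return true;
    }
    // A blob is a le32 length followed by that many bytes.
    bool blob(std::vector<uint8_t>& out) {
        uint32_t len = 0;
        if (!le32(len) || len > buf_.size() - pos_) return false;
        out.assign(buf_.begin() + pos_, buf_.begin() + pos_ + len);
        pos_ += len;
        return true;
    }
    bool done() const { return pos_ == buf_.size(); }

private:
    const std::vector<uint8_t>& buf_;
    std::size_t pos_ = 0;
};

// Decode head + trace + extra + snap; fail on short or trailing data.
bool decode_front(const std::vector<uint8_t>& front, DecodedReply& d) {
    Reader r(front);
    return r.le32(d.op) && r.le32(d.result) && r.le32(d.mdsmap_epoch) &&
           r.u8(d.safe) && r.u8(d.is_dentry) && r.u8(d.is_target) &&
           r.blob(d.trace) && r.blob(d.extra) && r.blob(d.snap) && r.done();
}

// Only a readdir reply should have its extra blob parsed as dir contents.
bool extra_is_dir_contents(const DecodedReply& d) {
    return d.op == kOpReaddir && !d.extra.empty();
}
```

Under those assumptions, decode_front(kFront, d) consumes all 28 bytes and gives op 0x110, an empty trace, a one-byte extra blob holding 1, and an empty snap blob. Since 0x110 is not the readdir op, extra_is_dir_contents(d) is false, which is the check the kclient was missing.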

Actions #3

Updated by Greg Farnum over 13 years ago

  • Status changed from New to 7

Our friends at Tcloud just submitted patches for this today, which I've applied to the unstable branch of our kernel client tree and the testing branch of the userspace daemons. Hopefully that'll take care of this!

Actions #4

Updated by Greg Farnum over 13 years ago

  • Status changed from 7 to Resolved

Closing this out unless we hear about more issues.
