Bug #22256

closed

nfs-ganesha: crashes in free_delegrecall_context

Added by Jeff Layton over 6 years ago. Updated about 5 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Client, Ganesha FSAL
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I've been working on delegation support in cephfs for ganesha. The ceph pieces were recently merged, so I rebased my ceph delegations patch on top of the latest ganesha -next branch. I'm now seeing regular crashes when running the cthon special tests against it. This is one of them:

(gdb) bt
#0  0x00007ffff5aa01f7 in raise () from /lib64/libc.so.6
#1  0x00007ffff5aa18e8 in abort () from /lib64/libc.so.6
#2  0x0000000000435bb5 in free_delegrecall_context (deleg_ctx=0x7ffef40008c0) at /home/jlayton/git/ganesha/src/FSAL_UP/fsal_up_top.c:1075
#3  0x0000000000436394 in delegrecall_completion_func (call=0x7ffef40009a8) at /home/jlayton/git/ganesha/src/FSAL_UP/fsal_up_top.c:1201
#4  0x000000000043e123 in nfs_rpc_call_process (cc=0x7ffef4000a20) at /home/jlayton/git/ganesha/src/MainNFSD/nfs_rpc_callback.c:921
#5  0x00007ffff63d0adf in svc_rqst_expire_task (wpe=0x7ffef4000a20) at /home/jlayton/git/ganesha/src/libntirpc/src/svc_rqst.c:293
#6  0x00007ffff63db89d in work_pool_thread (arg=0x7fff94000c30) at /home/jlayton/git/ganesha/src/libntirpc/src/work_pool.c:176
#7  0x00007ffff6805e25 in start_thread () from /lib64/libpthread.so.0
#8  0x00007ffff5b6334d in clone () from /lib64/libc.so.6

Essentially, it looks like the deleg_ctx has already been freed at this point, and the drc_clid pointer is now bogus. The code then asserts because pthread_mutex_lock returned EINVAL (probably because the mutex has been scribbled over).
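
To illustrate the failure mode described above, here is a minimal sketch of a lock helper that treats any non-zero return from pthread_mutex_lock as fatal. The names checked_lock, lock_or_abort and demo_relock_error are hypothetical and are not taken from the ganesha source. Locking freed memory is undefined behavior and cannot be reproduced safely, so the sketch assumes an error-checking mutex and triggers a well-defined failure (EDEADLK on relock) instead of the EINVAL seen in the crash.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not ganesha's actual code): lock a mutex and
 * report a non-zero return code instead of silently ignoring it. */
int checked_lock(pthread_mutex_t *mtx, const char *what)
{
	int rc = pthread_mutex_lock(mtx);

	if (rc != 0)
		fprintf(stderr, "lock of %s failed: error %d\n", what, rc);
	return rc;
}

/* Strict variant: any lock failure is fatal. A helper like this would
 * produce the raise()/abort() frames at the top of a backtrace when
 * the mutex memory is no longer valid. */
void lock_or_abort(pthread_mutex_t *mtx, const char *what)
{
	if (checked_lock(mtx, what) != 0)
		abort();
}

/* Produce a lock failure with defined behavior: relocking an
 * error-checking mutex from the owning thread returns EDEADLK. */
int demo_relock_error(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t mtx;
	int rc;

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&mtx, &attr);
	pthread_mutexattr_destroy(&attr);

	lock_or_abort(&mtx, "demo mutex");     /* first lock succeeds */
	rc = checked_lock(&mtx, "demo mutex"); /* relock by owner fails */

	pthread_mutex_unlock(&mtx);
	pthread_mutex_destroy(&mtx);
	return rc;
}
```

Calling demo_relock_error() returns the error code from the second lock attempt. In the crash above the same kind of check fired, but the cause was a stale pointer rather than a relock, so the real problem is the lifetime of deleg_ctx and not the lock call itself.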

The patch to add delegation support to ceph is here:

https://review.gerrithub.io/#/c/377714/

The patch is fairly straightforward. The crash shows up when running the cthon special tests against the server, and it eventually happens during one of the rename tests.


Files

ganesha.conf (1.02 KB), Jeff Layton, 11/27/2017 07:35 PM
Actions #1

Updated by Jeff Layton over 6 years ago

Here's my ganesha.conf as well. I bisected the problem down to commit 46a5e8535f978b1e12dcb15cbdcbf6d5e757d24e (nfs_rpc_call). If I base my ceph patch on top of the commit just before this one, it works fine.

Actions #2

Updated by Patrick Donnelly over 6 years ago

  • Subject changed from crashes in free_delegrecall_context to nfs-ganesha: crashes in free_delegrecall_context
  • Status changed from New to In Progress
  • Assignee set to Jeff Layton
  • Source set to Development
Actions #3

Updated by Jeff Layton over 6 years ago

  • Status changed from In Progress to Resolved

This was fixed by commit f332c172a2884c04a0d4e743c8858ff3e7f957a1 in ganesha (and the associated ntirpc changes).

Actions #4

Updated by Patrick Donnelly about 5 years ago

  • Category deleted (109)
  • Component(FS) Client, Ganesha FSAL added