Bug #22256
nfs-ganesha: crashes in free_delegrecall_context
Status: closed
% Done: 0%
Source: Development
Regression: No
Severity: 3 - minor
Component(FS): Client, Ganesha FSAL
Description
I've been working on delegation support in cephfs for ganesha. The ceph pieces were recently merged, so I rebased my ceph delegations patch on top of the latest ganesha -next branch. I'm now seeing regular crashes when running the cthon special tests against it. This is one of them:
(gdb) bt
#0  0x00007ffff5aa01f7 in raise () from /lib64/libc.so.6
#1  0x00007ffff5aa18e8 in abort () from /lib64/libc.so.6
#2  0x0000000000435bb5 in free_delegrecall_context (deleg_ctx=0x7ffef40008c0) at /home/jlayton/git/ganesha/src/FSAL_UP/fsal_up_top.c:1075
#3  0x0000000000436394 in delegrecall_completion_func (call=0x7ffef40009a8) at /home/jlayton/git/ganesha/src/FSAL_UP/fsal_up_top.c:1201
#4  0x000000000043e123 in nfs_rpc_call_process (cc=0x7ffef4000a20) at /home/jlayton/git/ganesha/src/MainNFSD/nfs_rpc_callback.c:921
#5  0x00007ffff63d0adf in svc_rqst_expire_task (wpe=0x7ffef4000a20) at /home/jlayton/git/ganesha/src/libntirpc/src/svc_rqst.c:293
#6  0x00007ffff63db89d in work_pool_thread (arg=0x7fff94000c30) at /home/jlayton/git/ganesha/src/libntirpc/src/work_pool.c:176
#7  0x00007ffff6805e25 in start_thread () from /lib64/libpthread.so.0
#8  0x00007ffff5b6334d in clone () from /lib64/libc.so.6
Essentially, it looks like the deleg_ctx has already been freed at this point, and the drc_clid pointer is now bogus. The code then asserts because pthread_mutex_lock returned EINVAL (probably because the mutex has been scribbled over).
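For readers unfamiliar with the failure mode, here is a minimal sketch of a lock wrapper that treats any non-zero return from pthread_mutex_lock as fatal, which is the kind of check that would produce the abort() seen in frame #1. This is not the Ganesha code: checked_lock, must_lock and demo_relock_error are hypothetical names invented for illustration. Locking a mutex in freed memory is undefined behaviour and cannot be reproduced portably, so the demo triggers a different, well-defined error instead (EDEADLK from relocking an error-checking mutex) purely to show the error path.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* exposes PTHREAD_MUTEX_ERRORCHECK under strict -std modes */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: take the lock and report a failure instead of
 * silently ignoring it. Returns 0 on success or the pthread error code. */
static int checked_lock(pthread_mutex_t *m, const char *what)
{
    int rc = pthread_mutex_lock(m);

    if (rc != 0)
        fprintf(stderr, "lock of %s failed: rc=%d\n", what, rc);
    return rc;
}

/* Hypothetical strict variant: any lock failure is treated as fatal.
 * A corrupted mutex that made pthread_mutex_lock return EINVAL would
 * end up in abort() here. */
static void must_lock(pthread_mutex_t *m, const char *what)
{
    if (checked_lock(m, what) != 0)
        abort();
}

/* Demo of the error path using a well-defined failure: relocking an
 * error-checking mutex from the owning thread. Returns the error code
 * of the second lock attempt. */
static int demo_relock_error(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&m, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    must_lock(&m, "demo mutex");          /* first lock succeeds */
    rc = checked_lock(&m, "demo mutex");  /* second lock by the owner fails */

    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}
```

Calling demo_relock_error() exercises the non-fatal wrapper and returns the error code from the second lock attempt. Swapping that call for must_lock would abort the process, much as the backtrace above ends in abort().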
The patch to add delegation support to ceph is here:
https://review.gerrithub.io/#/c/377714/
The patch is fairly straightforward. I mostly notice the crash when running the cthon special tests against the server; it eventually crashes during one of the rename tests.