
Bug #52508

nfs-ganesha crashes when it calls libcephfs, triggering __ceph_assert_fail

Added by le le over 2 years ago. Updated over 1 year ago.

Status:
Triaged
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
pacific,octopus
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Ganesha FSAL, libcephfs
Labels (FS):
crash
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

CentOS 8
cephfs: v16.2.5
nfs-ganesha: v3.5 (latest release)

In my checkout of the source from git, Inode.cc reads:

76: if (!in.dentries.empty())
77: out << " parents=" << in.dentries;

I find something strange: the backtrace shows the coredump at line 77, but when I print the value of the `if()` condition it is false. What happened? Are the symbols in the backtrace wrong, or does my code not match the libcephfs build the backtrace was taken from?

The in.dentries value is {_front = 0x0, _back = 0x0, _size = 0}; can that still coredump?

Some debug info:

#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ff8b60b5db5 in __GI_abort () at abort.c:79
#2 0x00007ff8a3e3db75 in ceph::__ceph_assert_fail(char const*, char const*, int, char const*) () from /usr/lib64/ceph/libceph-common.so.2
#3 0x00007ff8a3e3dd3e in ceph::__ceph_assert_fail(ceph::assert_data const&) () from /usr/lib64/ceph/libceph-common.so.2
#4 0x00007ff8acbad627 in xlist<Dentry*>::const_iterator::operator++ (this=) at /usr/include/c++/8/ostream:559
#5 operator<< (list=..., oss=...) at /usr/src/debug/ceph-16.2.5-0.el8.x86_64/src/include/xlist.h:212
#6 operator<< (out=..., in=...) at /usr/src/debug/ceph-16.2.5-0.el8.x86_64/src/client/Inode.cc:77
#7 0x00007ff8acb715ae in Client::ll_sync_inode (this=0x55e3a853c7f0, in=in@entry=0x7ff6a9903120, syncdataonly=syncdataonly@entry=false) at /usr/include/c++/8/ostream:559
#8 0x00007ff8acadbe55 in ceph_ll_sync_inode (cmount=cmount@entry=0x55e3a849b2b0, in=in@entry=0x7ff6a9903120, syncdataonly=syncdataonly@entry=0) at /usr/src/debug/ceph-16.2.5-0.el8.x86_64/src/libcephfs.cc:1865
#9 0x00007ff8acebba15 in fsal_ceph_ll_setattr (creds=, mask=409, stx=0x7ff87c3b7310, i=, cmount=) at /usr/src/debug/nfs-ganesha-3.5-3.el8.x86_64/src/FSAL/FSAL_CEPH/statx_compat.h:209
#10 ceph_fsal_setattr2 (obj_hdl=0x7ff76c2a75b0, bypass=, state=, attrib_set=0x7ff87c3b75a0) at /usr/src/debug/nfs-ganesha-3.5-3.el8.x86_64/src/FSAL/FSAL_CEPH/handle.c:2410
#11 0x00007ff8b860eb9f in mdcache_setattr2 (obj_hdl=0x7ff76c50d3a8, bypass=, state=0x0, attrs=0x7ff87c3b75a0) at /usr/src/debug/nfs-ganesha-3.5-3.el8.x86_64/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1002
#12 0x00007ff8b852b7b4 in fsal_setattr (obj=0x7ff76c50d3a8, bypass=bypass@entry=false, state=0x0, attr=attr@entry=0x7ff87c3b75a0) at /usr/src/debug/nfs-ganesha-3.5-3.el8.x86_64/src/FSAL/fsal_helper.c:573
#13 0x00007ff8b85e2d17 in nfs4_op_setattr (op=0x7ff800bc0800, data=0x7ff8002c0850, resp=0x7ff8002f2560) at /usr/src/debug/nfs-ganesha-3.5-3.el8.x86_64/src/Protocols/NFS/nfs4_op_setattr.c:212
#14 0x00007ff8b85c678f in process_one_op (data=data@entry=0x7ff8002c0850, status=status@entry=0x7ff87c3b779c) at /usr/src/debug/nfs-ganesha-3.5-3.el8.x86_64/src/Protocols/NFS/nfs4_Compound.c:920
#15 0x00007ff8b85c7927 in nfs4_Compound (arg=0x7ff800cd7d00, req=0x7ff800cd74f0, res=0x7ff80035f410) at /usr/src/debug/nfs-ganesha-3.5-3.el8.x86_64/src/Protocols/NFS/nfs4_Compound.c:1327
#16 0x00007ff8b8547c46 in nfs_rpc_process_request (reqdata=0x7ff800cd74f0) at /usr/src/debug/nfs-ganesha-3.5-3.el8.x86_64/src/MainNFSD/nfs_worker_thread.c:1508
#17 0x00007ff8b82d5800 in svc_request () from /lib64/libntirpc.so.3.5
#18 0x00007ff8b82d2bf9 in svc_rqst_xprt_task_recv () from /lib64/libntirpc.so.3.5
#19 0x00007ff8b82d35d8 in svc_rqst_epoll_loop () from /lib64/libntirpc.so.3.5
#20 0x00007ff8b82de65d in work_pool_thread () from /lib64/libntirpc.so.3.5
#21 0x00007ff8b687114a in start_thread (arg=) at pthread_create.c:479
#22 0x00007ff8b6190dc3 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) f 6
#6 operator<< (out=..., in=...) at /usr/src/debug/ceph-16.2.5-0.el8.x86_64/src/client/Inode.cc:77
77 in /usr/src/debug/ceph-16.2.5-0.el8.x86_64/src/client/Inode.cc
(gdb) p in.dentries
$27 = {_front = 0x0, _back = 0x0, _size = 0}

(gdb) p !(in.dentries._front == 0)
$28 = false

(gdb) f 7
#7 0x00007ff8acb715ae in Client::ll_sync_inode (this=0x55e3a853c7f0, in=in@entry=0x7ff6a9903120, syncdataonly=syncdataonly@entry=false) at /usr/include/c++/8/ostream:559
559 operator<<(basic_ostream<char, _Traits>& __out, const char* __s)
(gdb) p !(in->dentries._front == 0)
$29 = false
(gdb) p in
$30 = (Inode *) 0x7ff6a9903120
(gdb) p *in
$31 = {ceph::common::RefCountedObject = {}, client = 0x55e3a853c7f0, ino = {val = 1099547201712}, snapid = {val = 18446744073709551614}, faked_ino = 0, rdev = 0, ctime = {tv = {tv_sec = 1630293648, tv_nsec = 828338984}}, btime = {tv = {
tv_sec = 1630293648, tv_nsec = 825306256}}, mode = 33188, uid = 1000, gid = 1000, nlink = 1, dir_layout = {dl_dir_hash = 0 '\000', dl_unused1 = 0 '\000', dl_unused2 = 0, dl_unused3 = 0}, layout = {stripe_unit = 4194304, stripe_count = 1,
object_size = 4194304, pool_id = 8, pool_ns = ""}, size = 0, truncate_seq = 1, truncate_size = 18446744073709551615, mtime = {tv = {tv_sec = 1630293648, tv_nsec = 828311544}}, atime = {tv = {tv_sec = 1630293648, tv_nsec = 828311494}},
time_warp_seq = 2, change_attr = 2, max_size = 4194304, dirstat = {<scatter_info_t> = {version = 0}, mtime = {tv = {tv_sec = 0, tv_nsec = 0}}, change_attr = 0, nfiles = 0, nsubdirs = 0}, rstat = {<scatter_info_t> = {version = 0}, rctime = {tv = {
tv_sec = 0, tv_nsec = 0}}, rbytes = 0, rfiles = 1, rsubdirs = 0, rsnaps = 0}, version = 6030, xattr_version = 1, snap_btime = {tv = {tv_sec = 0, tv_nsec = 0}}, snap_metadata = std::map with 0 elements, inline_version = 18446744073709551615,
inline_data = {_buffers = {_root = {next = 0x7ff6a99032a8}, _tail = 0x7ff6a99032a8}, _carriage = 0x7ff8aca94b80 ceph::buffer::v15_2_0::list::always_empty_bptr, _len = 0, _num = 0, static always_empty_bptr = {ceph::buffer::v15_2_0::ptr_hook = {
next = 0x0}, ceph::buffer::v15_2_0::ptr = {_raw = 0x0, _off = 0, _len = 0}, }}, fscrypt = false, flags = 0, quota = {max_bytes = 0, max_files = 0}, dir = 0x0, dirfragtree = {
_splits = {<compact_map_base<frag_t, int, std::map<frag_t, int, std::less<frag_t>, std::allocator<std::pair<frag_t const, int> > > >> = {
map = std::unique_ptr<std::map<frag_t, int, std::less<frag_t>, std::allocator<std::pair<frag_t const, int> > >> = {get() = 0x0}}, }}, dir_release_count = 1, dir_ordered_count = 1, dir_hashed = false, dir_replicated = false,
caps = std::map with 1 element = {[0] = {inode = @0x7ff6a9903120, session = 0x55e3a8559488, cap_id = 1, issued = 16349, implemented = 16349, wanted = 13005, seq = 2, issue_seq = 1, mseq = 0, gen = 0, latest_perms = {m_uid = 1000, m_gid = 1000,
gid_count = 1, gids = 0x7ff7c846f960, alloced_gids = true}, cap_item = {_item = 0x7ff6a8b4e168, _prev = 0x7ff7014917c8, _next = 0x0, _list = 0x55e3a8559518}}}, auth_cap = 0x7ff6a8b4e168, cap_dirtier_uid = 1000, cap_dirtier_gid = 1000,
dirty_caps = 520, flushing_caps = 0, flushing_cap_tids = std::map with 0 elements, shared_gen = 1, cache_gen = 1, snap_caps = 0, snap_cap_refs = 0, hold_caps_until = {tv = {tv_sec = 0, tv_nsec = 0}}, delay_cap_item = {_item = 0x7ff6a9903120,
_prev = 0x0, _next = 0x0, _list = 0x0}, dirty_cap_item = {_item = 0x7ff6a9903120, _prev = 0x7ff6a841f610, _next = 0x0, _list = 0x55e3a853d620}, flushing_cap_item = {_item = 0x7ff6a9903120, _prev = 0x0, _next = 0x0, _list = 0x0},
snaprealm = 0x7ff848011b60, snaprealm_item = {_item = 0x7ff6a9903120, _prev = 0x7ff6a9902e68, _next = 0x0, _list = 0x7ff848011c18}, snapdir_parent = {px = 0x0}, cap_snaps = std::map with 0 elements, open_by_mode = std::map with 1 element = {[2] = 1},
cap_refs = std::map with 0 elements, oset = {parent = 0x7ff6a9903120, ino = {val = 1099547201712}, truncate_seq = 0, truncate_size = 0, poolid = 8, objects = {_front = 0x0, _back = 0x0, _size = 0}, dirty_or_tx = 0, return_enoent = false},
reported_size = 0, wanted_max_size = 0, requested_max_size = 0, ll_ref = 1, dentries = {_front = 0x0, _back = 0x0, _size = 0}, symlink = "", xattrs = std::map with 0 elements, fragmap = std::map with 0 elements, frag_repmap = std::map with 0 elements,
waitfor_caps = empty std::__cxx11::list, waitfor_commit = empty std::__cxx11::list, waitfor_deleg = empty std::__cxx11::list, fcntl_locks = std::unique_ptr<ceph_lock_state_t> = {get() = 0x0}, flock_locks = std::unique_ptr<ceph_lock_state_t> = {
get() = 0x0}, delegations = empty std::__cxx11::list, unsafe_ops = {_front = 0x0, _back = 0x0, _size = 0}, fhs = std::set with 1 element = {[0] = 0x7ff82c445b50}, dir_pin = -1}
(gdb) f 6
#6 operator<< (out=..., in=...) at /usr/src/debug/ceph-16.2.5-0.el8.x86_64/src/client/Inode.cc:77
77 /usr/src/debug/ceph-16.2.5-0.el8.x86_64/src/client/Inode.cc: No such file or directory.
(gdb) p in
$32 = (const Inode &) @0x7ff6a9903120: {ceph::common::RefCountedObject = {}, client = 0x55e3a853c7f0, ino = {val = 1099547201712}, snapid = {val = 18446744073709551614}, faked_ino = 0, rdev = 0, ctime = {tv = {tv_sec = 1630293648,
tv_nsec = 828338984}}, btime = {tv = {tv_sec = 1630293648, tv_nsec = 825306256}}, mode = 33188, uid = 1000, gid = 1000, nlink = 1, dir_layout = {dl_dir_hash = 0 '\000', dl_unused1 = 0 '\000', dl_unused2 = 0, dl_unused3 = 0}, layout = {
stripe_unit = 4194304, stripe_count = 1, object_size = 4194304, pool_id = 8, pool_ns = ""}, size = 0, truncate_seq = 1, truncate_size = 18446744073709551615, mtime = {tv = {tv_sec = 1630293648, tv_nsec = 828311544}}, atime = {tv = {
tv_sec = 1630293648, tv_nsec = 828311494}}, time_warp_seq = 2, change_attr = 2, max_size = 4194304, dirstat = {<scatter_info_t> = {version = 0}, mtime = {tv = {tv_sec = 0, tv_nsec = 0}}, change_attr = 0, nfiles = 0, nsubdirs = 0},
rstat = {<scatter_info_t> = {version = 0}, rctime = {tv = {tv_sec = 0, tv_nsec = 0}}, rbytes = 0, rfiles = 1, rsubdirs = 0, rsnaps = 0}, version = 6030, xattr_version = 1, snap_btime = {tv = {tv_sec = 0, tv_nsec = 0}},
snap_metadata = std::map with 0 elements, inline_version = 18446744073709551615, inline_data = {_buffers = {_root = {next = 0x7ff6a99032a8}, _tail = 0x7ff6a99032a8}, _carriage = 0x7ff8aca94b80 ceph::buffer::v15_2_0::list::always_empty_bptr, _len = 0,
_num = 0, static always_empty_bptr = {ceph::buffer::v15_2_0::ptr_hook = {next = 0x0}, ceph::buffer::v15_2_0::ptr = {_raw = 0x0, _off = 0, _len = 0}, }}, fscrypt = false, flags = 0, quota = {max_bytes = 0, max_files = 0},
dir = 0x0, dirfragtree = {_splits = {<compact_map_base<frag_t, int, std::map<frag_t, int, std::less<frag_t>, std::allocator<std::pair<frag_t const, int> > > >> = {
map = std::unique_ptr<std::map<frag_t, int, std::less<frag_t>, std::allocator<std::pair<frag_t const, int> > >> = {get() = 0x0}}, }}, dir_release_count = 1, dir_ordered_count = 1, dir_hashed = false, dir_replicated = false,
caps = std::map with 1 element = {[0] = {inode = @0x7ff6a9903120, session = 0x55e3a8559488, cap_id = 1, issued = 16349, implemented = 16349, wanted = 13005, seq = 2, issue_seq = 1, mseq = 0, gen = 0, latest_perms = {m_uid = 1000, m_gid = 1000,
gid_count = 1, gids = 0x7ff7c846f960, alloced_gids = true}, cap_item = {_item = 0x7ff6a8b4e168, _prev = 0x7ff7014917c8, _next = 0x0, _list = 0x55e3a8559518}}}, auth_cap = 0x7ff6a8b4e168, cap_dirtier_uid = 1000, cap_dirtier_gid = 1000,
dirty_caps = 520, flushing_caps = 0, flushing_cap_tids = std::map with 0 elements, shared_gen = 1, cache_gen = 1, snap_caps = 0, snap_cap_refs = 0, hold_caps_until = {tv = {tv_sec = 0, tv_nsec = 0}}, delay_cap_item = {_item = 0x7ff6a9903120,
_prev = 0x0, _next = 0x0, _list = 0x0}, dirty_cap_item = {_item = 0x7ff6a9903120, _prev = 0x7ff6a841f610, _next = 0x0, _list = 0x55e3a853d620}, flushing_cap_item = {_item = 0x7ff6a9903120, _prev = 0x0, _next = 0x0, _list = 0x0},
snaprealm = 0x7ff848011b60, snaprealm_item = {_item = 0x7ff6a9903120, _prev = 0x7ff6a9902e68, _next = 0x0, _list = 0x7ff848011c18}, snapdir_parent = {px = 0x0}, cap_snaps = std::map with 0 elements, open_by_mode = std::map with 1 element = {[2] = 1},
cap_refs = std::map with 0 elements, oset = {parent = 0x7ff6a9903120, ino = {val = 1099547201712}, truncate_seq = 0, truncate_size = 0, poolid = 8, objects = {_front = 0x0, _back = 0x0, _size = 0}, dirty_or_tx = 0, return_enoent = false},
reported_size = 0, wanted_max_size = 0, requested_max_size = 0, ll_ref = 1, dentries = {_front = 0x0, _back = 0x0, _size = 0}, symlink = "", xattrs = std::map with 0 elements, fragmap = std::map with 0 elements, frag_repmap = std::map with 0 elements,
waitfor_caps = empty std::__cxx11::list, waitfor_commit = empty std::__cxx11::list, waitfor_deleg = empty std::__cxx11::list, fcntl_locks = std::unique_ptr<ceph_lock_state_t> = {get() = 0x0}, flock_locks = std::unique_ptr<ceph_lock_state_t> = {
get() = 0x0}, delegations = empty std::__cxx11::list, unsafe_ops = {_front = 0x0, _back = 0x0, _size = 0}, fhs = std::set with 1 element = {[0] = 0x7ff82c445b50}, dir_pin = -1}

History

#1 Updated by le le over 2 years ago

Is the exception caused by the compiler's optimization?

#2 Updated by Patrick Donnelly over 2 years ago

  • Description updated (diff)

#3 Updated by Patrick Donnelly over 2 years ago

  • Status changed from New to Triaged
  • Assignee set to Ramana Raja
  • Target version set to v17.0.0
  • Source set to Community (user)
  • Backport set to pacific,octopus

#4 Updated by Patrick Donnelly over 2 years ago

le le wrote:

Is the exception caused by the compiler's optimization?

Probably there is a race condition not protected by the big client lock.

#5 Updated by le le over 2 years ago

Patrick Donnelly wrote:

le le wrote:

Is the exception caused by the compiler's optimization?

Probably there is a race condition not protected by the big client lock.

Thanks a lot. Can I comment out these three lines of code to avoid the crash, if I don't need that log output?

The three lines in Client.cc:

14398: //ldout(cct, 3) << "ll_sync_inode " << *in << " " << dendl;
14399: //tout(cct) << "ll_sync_inode" << std::endl;
14400: //tout(cct) << (uintptr_t)in << std::endl;

#6 Updated by Patrick Donnelly over 1 year ago

  • Target version deleted (v17.0.0)
