Bug #24241
NFS-Ganesha libcephfs: Assert failure in object_locator_to_pg
Description
Calling ceph_ll_get_stripe_osd from the nfs-ganesha Ceph FSAL (file mds.c) triggers an assertion failure that aborts the process. The code is very old and it is not used anywhere else. The cause may be the Ceph setup, or the code may need to be updated.
(gdb) bt
#0 0x00007fa20ebc1f67 in raise () from /lib64/libc.so.6
#1 0x00007fa20ebc333a in abort () from /lib64/libc.so.6
#2 0x00007fa2059bac35 in ceph::__ceph_assert_fail (assertion=assertion@entry=0x7fa205deb596 "ret == 0",
file=file@entry=0x7fa205e06ad0 "ceph/src/osd/OSDMap.h", line=line@entry=1024,
func=func@entry=0x7fa205e07ba0 <OSDMap::object_locator_to_pg(object_t const&, object_locator_t const&) const::__PRETTY_FUNCTION__> "pg_t OSDMap::object_locator_to_pg(const object_t&, const object_locator_t&) const") at ceph/src/common/assert.cc:66
#3 0x00007fa205b1e496 in OSDMap::object_locator_to_pg (loc=..., oid=..., this=0x21ff830)
at ceph/src/osd/OSDMap.h:1024
#4 OSDMap::make_object_layout (this=this@entry=0x21ff830, oid=..., pg_pool=pg_pool@entry=-1, nspace="")
at ceph/src/osd/OSDMap.cc:1919
#5 0x00007fa20112e797 in OSDMap::file_to_object_layout (layout=..., oid=..., this=0x21ff830)
at ceph/src/osd/OSDMap.h:1035
#6 Client::<lambda(const OSDMap&)>::operator() (o=..., __closure=<optimized out>)
at ceph/src/client/Client.cc:12473
#7 Objecter::with_osdmap<Client::ll_get_stripe_osd(Inode*, uint64_t, file_layout_t*)::<lambda(const OSDMap&)> > (cb=..., this=<optimized out>)
at ceph/src/osdc/Objecter.h:2068
#8 Client::ll_get_stripe_osd (this=<optimized out>, in=<optimized out>, blockno=<optimized out>, layout=layout@entry=0x7fa172276e70)
Updated by Patrick Donnelly almost 6 years ago
- Status changed from New to Need More Info
- Target version set to v14.0.0
- Source set to Community (user)
- Backport set to mimic,luminous
- Component(FS) Client added
What version of Ceph are you using?
Updated by supriti singh almost 6 years ago
Patrick Donnelly wrote:
What version of Ceph are you using?
I run a vstart cluster from master (last commit on May 2: 65a74d40fd69ba6987f59e67e05d7efe7fe20671). It could be that the parameters passed to ceph_ll_get_stripe_osd() from nfs-ganesha (in my implementation) are somehow incorrect. From the commit log, it seems this assert is hit when the pool is gone (https://github.com/ceph/ceph/commit/d4ed22ffe5cc50c5169a08ee4b42bb0c0b7d76d4). I will debug it further.
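For illustration only, here is a toy C++ model of the suspected failure mode. It is not Ceph code: ToyOsdMap, locate and stripe_lookup are hypothetical names. It assumes that a lookup on a pool id missing from the map (such as the pg_pool=-1 seen in frame #4) returns a nonzero error, which is what would trip an assert on "ret == 0". The sketch shows a caller that passes the error back instead of asserting.

```cpp
#include <cerrno>
#include <cstdint>
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in for an OSD map: pool id -> number of PGs.
// This is NOT the real OSDMap class; it only models "pool may be absent".
struct ToyOsdMap {
  std::map<int64_t, uint32_t> pools;

  // Returns 0 and fills *pg on success, -ENOENT if the pool id is unknown
  // (for example -1 from an uninitialized layout, or a deleted pool).
  int locate(int64_t pool_id, const std::string& oid, uint32_t* pg) const {
    auto it = pools.find(pool_id);
    if (it == pools.end() || it->second == 0)
      return -ENOENT;
    *pg = static_cast<uint32_t>(std::hash<std::string>{}(oid) % it->second);
    return 0;
  }
};

// Hypothetical caller: propagates the lookup error to its own caller
// instead of asserting that the lookup succeeded.
int stripe_lookup(const ToyOsdMap& m, int64_t pool_id,
                  const std::string& oid, uint32_t* pg_out) {
  uint32_t pg = 0;
  int ret = m.locate(pool_id, oid, &pg);
  if (ret < 0)
    return ret;  // e.g. -ENOENT when the pool is gone
  *pg_out = pg;
  return 0;
}
```

With a map that only contains pool 1, stripe_lookup on pool 1 succeeds, while pool -1 yields -ENOENT rather than an abort. Whether the real client should return an error here or reject the bad layout earlier is a separate design question.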
Updated by Jeff Layton almost 6 years ago
If you have time, it's probably worthwhile to roll a new testcase for ceph_ll_get_stripe_osd for this sort of thing. That's simpler than testing this with ganesha, and would help ensure that we don't end up breaking it again in the future.
Updated by Patrick Donnelly almost 6 years ago
- Status changed from Need More Info to New
- Priority changed from Normal to High
Updated by Patrick Donnelly about 5 years ago
- Target version changed from v14.0.0 to v15.0.0