Bug #2536
librados crashed while getting stat of an object
Status: Can't reproduce
Priority: Normal
Assignee: -
Category: librados
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
librados crashed while getting stat of an object:
./log/SubsystemMap.h: In function 'bool ceph::log::SubsystemMap::should_gather(unsigned int, int)' thread 7f52adaf3700 time 2012-06-11 18:33:47.455897
./log/SubsystemMap.h: 74: FAILED assert(sub < m_subsys.size())
ceph version 0.47.2 (commit:8bf9fde89bd6ebc4b0645b2fe02dadb1c17ad372)
1: (()+0xebbc0) [0x7f5298a92bc0]
2: (librados::IoCtxImpl::stat(object_t const&, unsigned long*, long*)+0x597) [0x7f5298ab2a87]
3: (rados_stat()+0x3a) [0x7f5298a975fa]
4: (x_stat(enif_environment_t*, int, unsigned long const*)+0x1cd) [0x7f5298f11f65]
5: (process_main()+0x4f32) [0x544122]
6: /usr/local/erlang/lib/erlang/erts-5.9.1/bin/beam.smp() [0x4a7d08]
7: /usr/local/erlang/lib/erlang/erts-5.9.1/bin/beam.smp() [0x5bcb20]
8: (()+0x7e9a) [0x7f52b378be9a]
9: (clone()+0x6d) [0x7f52b32b14bd]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
Aborted (core dumped)
Attached is the objdump file.
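For readers unfamiliar with the failed check, assert(sub < m_subsys.size()) reads as a bounds check on a subsystem index. The following is a minimal sketch of that kind of check, not the Ceph source: SubsystemTable, Subsystem, index_rejected and run_table_demo are hypothetical names, the level comparison is an assumption, and the sketch throws std::out_of_range where the trace shows an assertion failure. It illustrates that an empty or cleared table rejects every index, including index 0.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a table of log subsystems. This is NOT Ceph's
// SubsystemMap, only an illustration of the bounds check named in the trace.
struct Subsystem {
    std::string name;
    int gather_level;  // assumption: levels at or below this are gathered
};

class SubsystemTable {
public:
    void add(const std::string& name, int gather_level) {
        subsys_.push_back(Subsystem{name, gather_level});
    }

    // Empties the table, as a teardown might.
    void clear() { subsys_.clear(); }

    // Same shape as the failed check: the index must lie inside the table.
    // Throws instead of aborting so a caller can observe the failure.
    bool should_gather(unsigned sub, int level) const {
        if (!(sub < subsys_.size())) {
            throw std::out_of_range("subsystem index outside table");
        }
        return level <= subsys_[sub].gather_level;
    }

private:
    std::vector<Subsystem> subsys_;
};

// Returns true if should_gather rejects the given index.
inline bool index_rejected(const SubsystemTable& t, unsigned sub) {
    try {
        t.should_gather(sub, 0);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Small usage example.
inline void run_table_demo() {
    SubsystemTable t;
    assert(index_rejected(t, 0));     // empty table: every index is rejected
    t.add("rados", 5);
    assert(t.should_gather(0, 1));    // 1 <= 5
    assert(!t.should_gather(0, 10));  // 10 > 5
    assert(index_rejected(t, 1));     // only index 0 exists
    t.clear();
    assert(index_rejected(t, 0));     // cleared table rejects index 0 again
}
```

Whether the table in this report was empty, not yet populated, or simply indexed with a bad value cannot be told from the trace alone.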
History
#1 Updated by Sage Weil over 11 years ago
Have you seen this problem since then? It looks like it could be due to racing with rados startup or shutdown...
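If the cause is indeed a race with startup or shutdown, one caller-side mitigation would be to serialize object operations against the client's lifecycle. The following is a minimal sketch under that assumption: ClientGuard, with_client and run_guard_demo are hypothetical names and not part of librados, and the connect and shutdown bodies only flip a flag where a real caller would invoke its client.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical caller-side wrapper. ClientGuard is NOT part of librados.
class ClientGuard {
public:
    void connect() {
        std::lock_guard<std::mutex> lock(mu_);
        // A real caller would finish connecting its client here.
        connected_ = true;
    }

    void shutdown() {
        std::lock_guard<std::mutex> lock(mu_);
        // A real caller would shut its client down here.
        connected_ = false;
    }

    // Runs op only while connected. Returns false, without calling op,
    // before connect() or after shutdown().
    template <typename Op>
    bool with_client(Op op) {
        std::lock_guard<std::mutex> lock(mu_);
        if (!connected_) return false;
        op();
        return true;
    }

private:
    std::mutex mu_;
    bool connected_ = false;
};

// Small usage example.
inline void run_guard_demo() {
    ClientGuard guard;
    int calls = 0;
    assert(!guard.with_client([&] { ++calls; }));  // not connected yet
    guard.connect();
    assert(guard.with_client([&] { ++calls; }));   // runs once
    guard.shutdown();
    assert(!guard.with_client([&] { ++calls; }));  // refused after shutdown
    assert(calls == 1);
}
```

Holding one mutex across op serializes all operations, which is coarse. It keeps the sketch short and makes it impossible for an operation to overlap with connect() or shutdown().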
#2 Updated by Sage Weil over 11 years ago
- Status changed from New to Need More Info
#3 Updated by Sage Weil about 11 years ago
- Priority changed from High to Normal
#4 Updated by Sage Weil about 11 years ago
- Status changed from Need More Info to Can't reproduce
#5 Updated by Benjamin Schulz almost 11 years ago
- File objdump.txt added
Hi,
I got the same assertion:
radosgw-admin user create
./log/SubsystemMap.h: In function 'bool ceph::log::SubsystemMap::should_gather(unsigned int, int)' thread 7f8fa8546760 time 2012-11-26 02:41:55.267779
./log/SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
ceph version 0.48.1argonaut (a7ad701b9bd479f20429f19e6fea7373ca6bba7c)
1: (()+0x17c3a7) [0x7f8fa770f3a7]
2: (MonClient::build_initial_monmap()+0x1e9) [0x7f8fa77da249]
3: (librados::RadosClient::connect()+0x48) [0x7f8fa77231b8]
4: (RGWRados::initialize()+0x49) [0x48ece9]
5: (RGWCache<RGWRados>::initialize()+0x17) [0x4a31e7]
6: (RGWRados::init_storage_provider(CephContext*)+0x30) [0x48ebe0]
7: (main()+0xfd0) [0x42c8c0]
8: (__libc_start_main()+0xfd) [0x7f8fa5cbcead]
9: radosgw-admin() [0x432761]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
*** Caught signal (Segmentation fault) **
in thread 7f8fa8546760
ceph version 0.48.1argonaut (a7ad701b9bd479f20429f19e6fea7373ca6bba7c)
1: radosgw-admin() [0x43d8b2]
2: (()+0xf030) [0x7f8fa715c030]
3: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x38f) [0x445edf]
4: (()+0x17c3a7) [0x7f8fa770f3a7]
5: (MonClient::build_initial_monmap()+0x1e9) [0x7f8fa77da249]
6: (librados::RadosClient::connect()+0x48) [0x7f8fa77231b8]
7: (RGWRados::initialize()+0x49) [0x48ece9]
8: (RGWCache<RGWRados>::initialize()+0x17) [0x4a31e7]
9: (RGWRados::init_storage_provider(CephContext*)+0x30) [0x48ebe0]
10: (main()+0xfd0) [0x42c8c0]
11: (__libc_start_main()+0xfd) [0x7f8fa5cbcead]
12: radosgw-admin() [0x432761]
Segmentation fault
I'm running v0.48.1 on Debian wheezy. The system is set up in two VMs, and I can reproduce the crash every time. Contact me if you're interested in the VM images.
Best regards
-- Benjamin
#6 Updated by Greg Farnum almost 11 years ago
Hey Benjamin, this is the same assert but quite a different call chain — can you create a new bug? Preferably with some logs of the crash?