Bug #5426
librbd: mutex assert in perfcounters::tinc in librbd::AioCompletion::complete()
Status: Closed (100% done)
Description
2013-06-22T12:24:53.336 INFO:teuthology.orchestra.run.err:common/Mutex.cc: In function 'void Mutex::Lock(bool)' thread 7f32aaffd700 time 2013-06-22 12:24:55.286391
2013-06-22T12:24:53.336 INFO:teuthology.orchestra.run.err:common/Mutex.cc: 93: FAILED assert(r == 0)
2013-06-22T12:24:53.337 INFO:teuthology.orchestra.run.err: ceph version 0.61.4-5-gd572cf6 (d572cf6f77418f217a5a8e37f1124dc566e24d0b)
2013-06-22T12:24:53.337 INFO:teuthology.orchestra.run.err: 1: (Mutex::Lock(bool)+0x1c3) [0x7f32b5362593]
2013-06-22T12:24:53.337 INFO:teuthology.orchestra.run.err: 2: (PerfCounters::tinc(int, utime_t)+0x5e) [0x7f32b535fdbe]
2013-06-22T12:24:53.337 INFO:teuthology.orchestra.run.err: 3: (librbd::AioCompletion::complete()+0x210) [0x7f32b5f72ac0]
2013-06-22T12:24:53.337 INFO:teuthology.orchestra.run.err: 4: (librbd::AioCompletion::complete_request(CephContext*, long)+0x1c7) [0x7f32b5f71ed7]
2013-06-22T12:24:53.337 INFO:teuthology.orchestra.run.err: 5: (librbd::C_AioRead::finish(int)+0x1d6) [0x7f32b5f72336]
2013-06-22T12:24:53.337 INFO:teuthology.orchestra.run.err: 6: (Context::complete(int)+0xa) [0x7f32b5f7281a]
2013-06-22T12:24:53.337 INFO:teuthology.orchestra.run.err: 7: (librbd::rados_req_cb(void*, void*)+0x47) [0x7f32b5f86517]
2013-06-22T12:24:53.337 INFO:teuthology.orchestra.run.err: 8: (librados::C_AioComplete::finish(int)+0x1d) [0x7f32b530b12d]
2013-06-22T12:24:53.337 INFO:teuthology.orchestra.run.err: 9: (Finisher::finisher_thread_entry()+0x1c0) [0x7f32b5378bb0]
2013-06-22T12:24:53.338 INFO:teuthology.orchestra.run.err: 10: (()+0x7e9a) [0x7f32b4902e9a]
2013-06-22T12:24:53.338 INFO:teuthology.orchestra.run.err: 11: (clone()+0x6d) [0x7f32b4c0bccd]
2013-06-22T12:24:53.338 INFO:teuthology.orchestra.run.err: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
run was
ubuntu@teuthology:/a/teuthology-2013-06-21_20:00:30-rbd-cuttlefish-testing-basic/42512$ cat orig.config.yaml
kernel:
  kdb: true
  sha1: 2dd322b42d608a37f3e5beed57a8fbc673da6e32
machine_type: plana
nuke-on-error: true
overrides:
  admin_socket:
    branch: cuttlefish
  ceph:
    conf:
      client:
        rbd cache: false
      global:
        ms inject socket failures: 5000
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
      osd:
        osd op thread timeout: 60
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: d572cf6f77418f217a5a8e37f1124dc566e24d0b
  install:
    ceph:
      sha1: d572cf6f77418f217a5a8e37f1124dc566e24d0b
  s3tests:
    branch: cuttlefish
  workunit:
    sha1: d572cf6f77418f217a5a8e37f1124dc566e24d0b
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.0
tasks:
- chef: null
- clock.check: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    timeout: 1200
- rbd_fsx:
    clients:
    - client.0
    ops: 2000
Updated by Sage Weil almost 11 years ago
- Status changed from New to 12
2013-07-09T20:46:21.561 INFO:teuthology.orchestra.run.err:common/Mutex.cc: In function 'void Mutex::Lock(bool)' thread 7fb3b1ffb700 time 2013-07-09 20:46:41.372806
2013-07-09T20:46:21.561 INFO:teuthology.orchestra.run.err:common/Mutex.cc: 93: FAILED assert(r == 0)
2013-07-09T20:46:21.567 INFO:teuthology.orchestra.run.err: ceph version 0.61.4-48-g24f90b8 (24f90b832c695ef13021db66a178c18369ac356d)
2013-07-09T20:46:21.567 INFO:teuthology.orchestra.run.err: 1: (Mutex::Lock(bool)+0x1c3) [0x7fb3bc4410d3]
2013-07-09T20:46:21.567 INFO:teuthology.orchestra.run.err: 2: (PerfCounters::tinc(int, utime_t)+0x5e) [0x7fb3bc43e8fe]
2013-07-09T20:46:21.567 INFO:teuthology.orchestra.run.err: 3: (librbd::AioCompletion::complete()+0x210) [0x7fb3bd053ac0]
2013-07-09T20:46:21.567 INFO:teuthology.orchestra.run.err: 4: (librbd::AioCompletion::complete_request(CephContext*, long)+0x1c7) [0x7fb3bd052ed7]
2013-07-09T20:46:21.567 INFO:teuthology.orchestra.run.err: 5: (librbd::C_AioRead::finish(int)+0x1d6) [0x7fb3bd053336]
2013-07-09T20:46:21.568 INFO:teuthology.orchestra.run.err: 6: (Context::complete(int)+0xa) [0x7fb3bd05381a]
2013-07-09T20:46:21.568 INFO:teuthology.orchestra.run.err: 7: (librbd::rados_req_cb(void*, void*)+0x47) [0x7fb3bd067517]
2013-07-09T20:46:21.568 INFO:teuthology.orchestra.run.err: 8: (librados::C_AioComplete::finish(int)+0x1d) [0x7fb3bc3e9c6d]
2013-07-09T20:46:21.568 INFO:teuthology.orchestra.run.err: 9: (Finisher::finisher_thread_entry()+0x1c0) [0x7fb3bc4576f0]
2013-07-09T20:46:21.568 INFO:teuthology.orchestra.run.err: 10: (()+0x7e9a) [0x7fb3bb9e0e9a]
2013-07-09T20:46:21.568 INFO:teuthology.orchestra.run.err: 11: (clone()+0x6d) [0x7fb3bbce9ccd]
2013-07-09T20:46:21.568 INFO:teuthology.orchestra.run.err: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2013-07-09T20:46:21.569 INFO:teuthology.orchestra.run.err:terminate called after throwing an instance of 'ceph::FailedAssertion'
job was
ubuntu@teuthology:/a/teuthology-2013-07-09_20:00:36-rbd-cuttlefish-testing-basic/61034$ cat orig.config.yaml
kernel:
  kdb: true
  sha1: 365b57b1317524bb0cdd15859a224ba1ab58d1d7
machine_type: plana
nuke-on-error: true
overrides:
  admin_socket:
    branch: cuttlefish
  ceph:
    conf:
      client:
        rbd cache: false
      global:
        ms inject socket failures: 5000
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
    fs: xfs
    log-whitelist:
    - slow request
    sha1: 24f90b832c695ef13021db66a178c18369ac356d
  install:
    ceph:
      sha1: 24f90b832c695ef13021db66a178c18369ac356d
  s3tests:
    branch: cuttlefish
  workunit:
    sha1: 24f90b832c695ef13021db66a178c18369ac356d
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.0
tasks:
- chef: null
- clock.check: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    timeout: 1200
- rbd_fsx:
    clients:
    - client.0
    ops: 2000
Updated by Sage Weil almost 11 years ago
both of these are cuttlefish.. maybe something that hasn't been backported?
Updated by Sage Weil almost 11 years ago
ubuntu@teuthology:/a/teuthology-2013-07-13_20:00:45-rbd-cuttlefish-testing-basic/66483$ cat orig.config.yaml
kernel:
  kdb: true
  sha1: 365b57b1317524bb0cdd15859a224ba1ab58d1d7
machine_type: plana
nuke-on-error: true
overrides:
  admin_socket:
    branch: cuttlefish
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
      osd:
        osd op thread timeout: 60
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: 658240710baaf9c661b8fbf856322907a0d394ee
  install:
    ceph:
      sha1: 658240710baaf9c661b8fbf856322907a0d394ee
  s3tests:
    branch: cuttlefish
  workunit:
    sha1: 658240710baaf9c661b8fbf856322907a0d394ee
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
tasks:
- chef: null
- clock.check: null
- install: null
- ceph:
    conf:
      client:
        rbd cache: false
- rbd_fsx:
    clients:
    - client.0
    ops: 5000
cuttlefish again!
Updated by Sage Weil almost 11 years ago
okay, not cuttlefish.
ubuntu@teuthology:/a/teuthology-2013-07-17_01:00:47-rbd-next-testing-basic/70375
2013-07-17T10:14:13.999 INFO:teuthology.task.workunit.client.0.err: 0> 2013-07-17 10:14:41.818619 7f00ae7fc700 -1 *** Caught signal (Segmentation fault) **
2013-07-17T10:14:13.999 INFO:teuthology.task.workunit.client.0.err: in thread 7f00ae7fc700
2013-07-17T10:14:13.999 INFO:teuthology.task.workunit.client.0.err:
2013-07-17T10:14:13.999 INFO:teuthology.task.workunit.client.0.err: ceph version 0.66-669-g884fa2f (884fa2fcb6d707b23317bab1da909586ddc27608)
2013-07-17T10:14:13.999 INFO:teuthology.task.workunit.client.0.err: 1: rbd() [0x41f2aa]
2013-07-17T10:14:14.000 INFO:teuthology.task.workunit.client.0.err: 2: (()+0xfcb0) [0x7f00b8569cb0]
2013-07-17T10:14:14.000 INFO:teuthology.task.workunit.client.0.err: 3: (PerfCounters::tinc(int, utime_t)+0x28) [0x7f00b895f6b8]
2013-07-17T10:14:14.000 INFO:teuthology.task.workunit.client.0.err: 4: (librbd::AioCompletion::complete()+0x210) [0x7f00b95e95d0]
2013-07-17T10:14:14.000 INFO:teuthology.task.workunit.client.0.err: 5: (librbd::AioCompletion::complete_request(CephContext*, long)+0x1c7) [0x7f00b95e8a37]
2013-07-17T10:14:14.000 INFO:teuthology.task.workunit.client.0.err: 6: (librbd::C_AioRead::finish(int)+0x15d) [0x7f00b95e8e1d]
2013-07-17T10:14:14.000 INFO:teuthology.task.workunit.client.0.err: 7: (Context::complete(int)+0xa) [0x7f00b95e932a]
2013-07-17T10:14:14.000 INFO:teuthology.task.workunit.client.0.err: 8: (librbd::rados_req_cb(void*, void*)+0x47) [0x7f00b95fd007]
2013-07-17T10:14:14.000 INFO:teuthology.task.workunit.client.0.err: 9: (librados::C_AioComplete::finish(int)+0x1d) [0x7f00b88fef7d]
2013-07-17T10:14:14.000 INFO:teuthology.task.workunit.client.0.err: 10: (Finisher::finisher_thread_entry()+0x1c0) [0x7f00b89826b0]
2013-07-17T10:14:14.000 INFO:teuthology.task.workunit.client.0.err: 11: (()+0x7e9a) [0x7f00b8561e9a]
2013-07-17T10:14:14.000 INFO:teuthology.task.workunit.client.0.err: 12: (clone()+0x6d) [0x7f00b7b74ccd]
2013-07-17T10:14:14.000 INFO:teuthology.task.workunit.client.0.err: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Updated by Josh Durgin almost 11 years ago
- Status changed from 12 to In Progress
Updated by Josh Durgin over 10 years ago
Reproduced after 22 tries with no debugging on. Couldn't reproduce with debugging enabled, but this is a start.
The PerfCounters object hasn't been destroyed yet, but the _conf and other fields that its m_cct references are invalid, so m_cct has probably already been destroyed at this point.
p *(this->m_cct)
$8 = {nref = {val = 23004272}, _conf = 0x671, _log = 0x7facf203ebf8, _module_type = 4060343288, _service_thread = 0x15e8960, _log_obs = 0x15e8960, _admin_socket = 0x30, _service_thread_lock = 769, _perf_counters_collection = 0x15b7570, _perf_counters_conf_obs = 0x16af240, _admin_hook = 0x10000fffffffe, _heartbeat_map = 0x722032393935363a, _crypto_none = 0x2e617461645f6462, _crypto_aes = 0x3166636463303031}
Updated by Josh Durgin over 10 years ago
nevermind, the PerfCounters is invalid too, with m_lower_bound = 22777856, m_upper_bound = 0
Updated by Sage Weil over 10 years ago
ubuntu@teuthology:/a/teuthology-2013-08-22_01:01:01-rbd-next-testing-basic-plana/888
Updated by Sage Weil over 10 years ago
ubuntu@teuthology:/a/teuthology-2013-08-23_00:30:06-ceph-deploy-master---basic-saya/1087
slightly different, though:
...
Thread 10 (Thread 0x7f9d1c19e700 (LWP 4064)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007f9d1df100ab in std::vector<ceph::log::Subsystem, std::allocator<ceph::log::Subsystem> >::_M_fill_insert(__gnu_cxx::__normal_iterator<ceph::log::Subsystem*, std::vector<ceph::log::Subsystem, std::allocator<ceph::log::Subsystem> > >, unsigned long, ceph::log::Subsystem const&) () from /usr/lib/librados.so.2
#2 0x0000000000000000 in ?? ()

Thread 9 (Thread 0x7f9d1b19c700 (LWP 4066)):
#0 0x00007f9d1d6cb313 in __GI___poll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
#1 0x00007f9d1de9561b in std::pair<std::string const, boost::variant<std::string, bool, long, double, std::vector<std::string, std::allocator<std::string> >, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> >::~pair() () from /usr/lib/librados.so.2
#2 0x0000000000000000 in ?? ()

Thread 8 (Thread 0x7f9d19999700 (LWP 4069)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:215
#1 0x00007f9d1dea9ebb in Finisher::wait_for_empty() () from /usr/lib/librados.so.2
#2 0x0000000000000000 in ?? ()

Thread 7 (Thread 0x7f9d0acf3700 (LWP 8025)):
#0 0x00007f9d1d6cb313 in __GI___poll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
#1 0x00007f9d1df8ed72 in Pipe::read_message(Message**) () from /usr/lib/librados.so.2
#2 0x0000000000000000 in ?? ()

Thread 6 (Thread 0x7f9d0adf4700 (LWP 8024)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007f9d1df9b3db in Pipe::writer() () from /usr/lib/librados.so.2
#2 0x00007f9d1dfa66bd in ?? () from /usr/lib/librados.so.2
#3 0x0000000000000000 in ?? ()

Thread 5 (Thread 0x7f9d0abf2700 (LWP 8041)):
#0 0x00007f9d1d6cb313 in __GI___poll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
#1 0x00007f9d1df8ed72 in Pipe::read_message(Message**) () from /usr/lib/librados.so.2
#2 0x0000000000000000 in ?? ()

Thread 4 (Thread 0x7f9d1ef9a780 (LWP 4063)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007f9d1de173b3 in std::_Rb_tree<unsigned long, std::pair<unsigned long const, Objecter::PoolOp*>, std::_Select1st<std::pair<unsigned long const, Objecter::PoolOp*> >, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, Objecter::PoolOp*> > >::_M_erase(std::_Rb_tree_node<std::pair<unsigned long const, Objecter::PoolOp*> >*) () from /usr/lib/librados.so.2
#2 0x00007f9d1de2285b in librados::IoCtxImpl::aio_sparse_read(object_t, librados::AioCompletionImpl*, std::map<unsigned long, unsigned long, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, unsigned long> > >*, ceph::buffer::list*, unsigned long, unsigned long, unsigned long) () from /usr/lib/librados.so.2
#3 0x00007fff1216c820 in ?? ()
#4 0x00007fff1216c820 in ?? ()
#5 0x0000000100000000 in ?? ()
#6 0x00007fff1216c820 in ?? ()
#7 0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7f9d0b1f8700 (LWP 8080)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007f9d1df9b3db in Pipe::writer() () from /usr/lib/librados.so.2
#2 0x00007f9d1dfa66bd in ?? () from /usr/lib/librados.so.2
#3 0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7f9d0aef5700 (LWP 8039)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1 0x00007f9d1df9b3db in Pipe::writer() () from /usr/lib/librados.so.2
#2 0x00007f9d1dfa66bd in ?? () from /usr/lib/librados.so.2
#3 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f9d0bfff700 (LWP 4074)):
#0 0x00007f9d1de87188 in PerfCountersCollection::remove(PerfCounters*) () from /usr/lib/librados.so.2
#1 0x0000003000000003 in ?? ()
#2 0x0000000000020000 in ?? ()
#3 0x00007f9d0bffe778 in ?? ()
#4 0x00007f9d0bffe778 in ?? ()
#5 0x00007f9c00000000 in ?? ()
#6 0x0000000000000000 in ?? ()
(gdb) q
Updated by Sage Weil over 10 years ago
ubuntu@teuthology:/a/teuthology-2013-09-05_01:00:58-rbd-next-testing-basic-plana/22153
Updated by Tamilarasi muthamizhan over 10 years ago
ubuntu@teuthology:/a/teuthology-2013-09-09_01:36:24-upgrade-small-next-testing-basic-vps/27433
Updated by Tamilarasi muthamizhan over 10 years ago
latest logs: ubuntu@teuthology:/a/teuthology-2013-11-08_05:35:02-upgrade-small-next-testing-basic-vps/90358
Updated by Ian Colle over 10 years ago
- Priority changed from Normal to Urgent
Upgraded to Urgent since it is causing failures in nightlies.
Updated by Tamilarasi muthamizhan over 10 years ago
recent logs : ubuntu@teuthology:/a/teuthology-2013-11-21_23:01:11-rbd-next-testing-basic-plana/113257
2013-11-22T03:56:35.652 INFO:teuthology.orchestra.run.out:[10.214.132.22]: 3523 flatten
2013-11-22T03:56:38.000 INFO:teuthology.orchestra.run.out:[10.214.132.22]: 3525 trunc from 0x62e39fe to 0x1b9c7ec
2013-11-22T03:56:38.135 INFO:teuthology.orchestra.run.out:[10.214.132.22]: 3529 punch from 0x1b7c8d9 to 0x1b8ae76, (0xe59d bytes)
2013-11-22T03:56:38.178 INFO:teuthology.orchestra.run.out:[10.214.132.22]: 3530 trunc from 0x1b9c7ec to 0x74f792a
2013-11-22T03:56:38.210 INFO:teuthology.orchestra.run.out:[10.214.132.22]: 3531 clone 20 order 18 su 131072 sc 12
2013-11-22T03:56:39.281 INFO:teuthology.orchestra.run.out:[10.214.132.22]: checking clone #18, image image_client.0-clone18 against file /home/ubuntu/cephtest/archive/fsx-image_client.0-parent19
2013-11-22T03:56:41.435 INFO:teuthology.orchestra.run.err:[10.214.132.22]: common/Mutex.cc: In function 'void Mutex::Lock(bool)' thread 7f291d4f1700 time 2013-11-22 03:56:41.432336
2013-11-22T03:56:41.435 INFO:teuthology.orchestra.run.err:[10.214.132.22]: common/Mutex.cc: 93: FAILED assert(r == 0)
2013-11-22T03:56:41.435 INFO:teuthology.orchestra.run.err:[10.214.132.22]: ceph version 0.72-214-g203065c (203065c076daf9aaa9ade3d7cdd1b10e6700bfe6)
2013-11-22T03:56:41.436 INFO:teuthology.orchestra.run.err:[10.214.132.22]: 1: (Mutex::Lock(bool)+0x1c3) [0x7f29232fc0e3]
2013-11-22T03:56:41.436 INFO:teuthology.orchestra.run.err:[10.214.132.22]: 2: (PerfCounters::tinc(int, utime_t)+0x5e) [0x7f29232f983e]
2013-11-22T03:56:41.436 INFO:teuthology.orchestra.run.err:[10.214.132.22]: 3: (librbd::AioCompletion::complete()+0x210) [0x7f2923fecb10]
2013-11-22T03:56:41.436 INFO:teuthology.orchestra.run.err:[10.214.132.22]: 4: (librbd::AioCompletion::complete_request(CephContext*, long)+0x177) [0x7f2923febfd7]
2013-11-22T03:56:41.437 INFO:teuthology.orchestra.run.err:[10.214.132.22]: 5: (librbd::C_AioRead::finish(int)+0x104) [0x7f2923fec2d4]
2013-11-22T03:56:41.437 INFO:teuthology.orchestra.run.err:[10.214.132.22]: 6: (Context::complete(int)+0x9) [0x7f2923fec769]
2013-11-22T03:56:41.437 INFO:teuthology.orchestra.run.err:[10.214.132.22]: 7: (librbd::rados_req_cb(void*, void*)+0x47) [0x7f2923ff6f57]
2013-11-22T03:56:41.437 INFO:teuthology.orchestra.run.err:[10.214.132.22]: 8: (librados::C_AioComplete::finish(int)+0x1d) [0x7f292329024d]
2013-11-22T03:56:41.438 INFO:teuthology.orchestra.run.err:[10.214.132.22]: 9: (Context::complete(int)+0x9) [0x7f2923270e69]
2013-11-22T03:56:41.438 INFO:teuthology.orchestra.run.err:[10.214.132.22]: 10: (Finisher::finisher_thread_entry()+0x1c0) [0x7f292331d7d0]
2013-11-22T03:56:41.438 INFO:teuthology.orchestra.run.err:[10.214.132.22]: 11: (()+0x7e9a) [0x7f2922825e9a]
2013-11-22T03:56:41.439 INFO:teuthology.orchestra.run.err:[10.214.132.22]: 12: (clone()+0x6d) [0x7f2922b2eccd]
2013-11-22T03:56:41.439 INFO:teuthology.orchestra.run.err:[10.214.132.22]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2013-11-22T03:56:41.439 INFO:teuthology.orchestra.run.err:[10.214.132.22]: terminate called after throwing an instance of 'ceph::FailedAssertion'
2013-11-22T03:56:52.443 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-next/teuthology/run_tasks.py", line 33, in run_tasks
    manager.__enter__()
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/home/teuthworker/teuthology-next/teuthology/task/rbd_fsx.py", line 40, in task
    p.spawn(_run_one_client, ctx, config, role)
  File "/home/teuthworker/teuthology-next/teuthology/parallel.py", line 83, in __exit__
    for result in self:
  File "/home/teuthworker/teuthology-next/teuthology/parallel.py", line 100, in next
    resurrect_traceback(result)
  File "/home/teuthworker/teuthology-next/teuthology/parallel.py", line 19, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/teuthology-next/teuthology/task/rbd_fsx.py", line 62, in _run_one_client
    'image_{image}'.format(image=role),
  File "/home/teuthworker/teuthology-next/teuthology/orchestra/remote.py", line 47, in run
    r = self._runner(client=self.ssh, **kwargs)
  File "/home/teuthworker/teuthology-next/teuthology/orchestra/run.py", line 271, in run
    r.exitstatus = _check_status(r.exitstatus)
  File "/home/teuthworker/teuthology-next/teuthology/orchestra/run.py", line 264, in _check_status
    raise CommandCrashedError(command=r.command)
CommandCrashedError: Command crashed: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph_test_librbd_fsx -d -W -R -p 100 -P /home/ubuntu/cephtest/archive -t 1 -l 250000000 -S 0 -N 5000 pool_client.0 image_client.0'
2013-11-22T03:56:52.650 ERROR:teuthology.run_tasks: Sentry event: http://sentry.ceph.com/inktank/teuthology/search?q=c31f734e23384b0396a3d735e27ad1f7
CommandCrashedError: Command crashed: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph_test_librbd_fsx -d -W -R -p 100 -P /home/ubuntu/cephtest/archive -t 1 -l 250000000 -S 0 -N 5000 pool_client.0 image_client.0'
Updated by Dmitry Borodaenko over 10 years ago
This problem is easily reproducible with Mirantis OpenStack 4.0:
https://bugs.launchpad.net/fuel/+bug/1260911
After converting and uploading a RAW CirrOS image, all you have to do is launch a VM from that image, then run rbd -p compute export <instance-id>_disk - > /dev/null.
Updated by Josh Durgin over 10 years ago
- Status changed from In Progress to Fix Under Review
- Backport set to dumpling, emperor
Updated by Josh Durgin over 10 years ago
- Status changed from Fix Under Review to Pending Backport
I cherry-picked the fix to the dumpling-5426 and emperor-5426 branches, based on dumpling and emperor respectively. The fix will go into the actual dumpling and emperor branches once it's passed nightly tests for a while.
Updated by Sage Weil over 10 years ago
- Status changed from Pending Backport to Resolved
Updated by Dmitry Borodaenko over 10 years ago
Fix confirmed with the packages from http://gitbuilder.ceph.com/ceph-deb-precise-x86_64-basic/ref/dumpling-5426/ on Mirantis OpenStack 4.0; the SEGV no longer occurs.