Bug #23947
openceph_test_rados dumped core, Error: finished tid 1 when last_acked_tid was 6
0%
Description
I don't see a stack trace in teuthology.log
There is a core file that can be analyzed using packages in https://shaman.ceph.com/repos/ceph/wip-zafman-testing/e31d02d156079600e558ac1e73f3b433f5daa498/
Updated by David Zafman almost 6 years ago
Stack trace from core:
Thread 1 (Thread 0x7efc09ffb700 (LWP 15989)): #0 0x00007efc2c322428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54 #1 0x00007efc2c32402a in __GI_abort () at abort.c:89 #2 0x00005560047aabce in WriteOp::_finish (this=0x55600584adf0, info=<optimized out>) at /build/ceph-13.0.2-1912-ge31d02d/src/test/osd/RadosModel.h:906 #3 0x00005560047c6369 in TestOp::finish (info=0x5560058080d0, this=<optimized out>) at /build/ceph-13.0.2-1912-ge31d02d/src/test/osd/RadosModel.cc:20 #4 write_callback (comp=<optimized out>, arg=0x55600587cb70) at /build/ceph-13.0.2-1912-ge31d02d/src/test/osd/RadosModel.cc:33 #5 0x00007efc35716ae6 in librados::C_AioComplete::finish(int) () from /usr/lib/librados.so.2 #6 0x00007efc356f5949 in Context::complete(int) () from /usr/lib/librados.so.2 #7 0x00007efc2c999c1e in Finisher::finisher_thread_entry() () from /usr/lib/ceph/libceph-common.so.0 #8 0x00007efc3545b6ba in start_thread (arg=0x7efc09ffb700) at pthread_create.c:333 #9 0x00007efc2c3f441d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109 2018-04-29T06:55:45.486 INFO:tasks.rados.rados.0.smithi146.stderr:Error: finished tid 1 when last_acked_tid was 6
Line 906 is this ceph_abort()
void _finish(CallbackInfo *info) override { assert(info); context->state_lock.Lock(); uint64_t tid = info->id; cout << num << ": finishing write tid " << tid << " to " << context->prefix + oid << std::endl; if (tid <= last_acked_tid) { cerr << "Error: finished tid " << tid << " when last_acked_tid was " << last_acked_tid << std::endl; ceph_abort(); }
Updated by David Zafman almost 6 years ago
(gdb) print *this $4 = {<TestOp> = {_vptr.TestOp = 0x5560049d8088 <vtable for WriteOp+16>, num = 2487, context = 0x7fff72564bb0, stat = 0x7fff725648b0, done = false}, oid = "110", cont = {objnum = 1532, cursnap = 0, seqnum = 1532, prefix = "smithi14615956-OID: 110 snap 0\n", oid = ""}, waiting = std::set with 4 elements = {[0] = 0x5560057c9490, [1] = 0x556005887ca0, [2] = 0x5560058ad980, [3] = 0x55600607aa90}, rcompletion = 0x556005826aa0, waiting_on = 1, last_acked_tid = 6, read_op = {<librados::ObjectOperation> = {<No data fields>}, <No data fields>}, write_op = {<librados::ObjectOperation> = {<No data fields>}, unused = 0x0}, rbuffer = { _buffers = {<std::__cxx11::_List_base<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >> = {_M_impl = {<std::allocator<std::_List_node<ceph::buffer::ptr> >> = {<__gnu_cxx::new_allocator<std::_List_node<ceph::buffer::ptr> >> = {<No data fields>}, <No data fields>}, _M_node = {<std::__detail::_List_node_base> = {_M_next = 0x7efc1c001f90, _M_prev = 0x7efc1c001f90}, _M_storage = {_M_storage = "\001\000\000\000\000\000\000"}}}}, <No data fields>}, _len = 1, _memcopy_count = 0, append_buffer = {_raw = 0x0, _off = 0, _len = 0}, last_p = {<ceph::buffer::list::iterator_impl<false>> = {<std::iterator<std::forward_iterator_tag, char, long, char*, char&>> = {<No data fields>}, bl = 0x55600584aef8, ls = 0x55600584aef8, Python Exception <class 'gdb.error'> There is no member or method named _M_data.: off = 0, p = , p_off = 0}, <No data fields>}, static CLAIM_DEFAULT = 0, static CLAIM_ALLOW_NONSHAREABLE = 1}, do_append = false, do_excl = false}
$ grep "smithi14615956-110" teuthology.log 2018-04-29T06:53:05.103 INFO:tasks.rados.rados.0.smithi146.stdout:110: writing smithi14615956-110 from 631963 to 1263046 tid 1 2018-04-29T06:53:05.110 INFO:tasks.rados.rados.0.smithi146.stdout:110: writing smithi14615956-110 from 2061949 to 2621968 tid 2 2018-04-29T06:53:05.116 INFO:tasks.rados.rados.0.smithi146.stdout:110: writing smithi14615956-110 from 3243812 to 3748579 tid 3 2018-04-29T06:53:05.356 INFO:tasks.rados.rados.0.smithi146.stdout:110: finishing write tid 1 to smithi14615956-110 2018-04-29T06:53:05.356 INFO:tasks.rados.rados.0.smithi146.stdout:110: finishing write tid 2 to smithi14615956-110 2018-04-29T06:53:05.356 INFO:tasks.rados.rados.0.smithi146.stdout:110: finishing write tid 3 to smithi14615956-110 2018-04-29T06:53:05.356 INFO:tasks.rados.rados.0.smithi146.stdout:110: finishing write tid 5 to smithi14615956-110 2018-04-29T06:53:05.357 INFO:tasks.rados.rados.0.smithi146.stdout:110: finishing write tid 6 to smithi14615956-110 2018-04-29T06:55:36.190 INFO:tasks.rados.rados.0.smithi146.stdout:2487: writing smithi14615956-110 from 548324 to 1216745 tid 1 2018-04-29T06:55:36.196 INFO:tasks.rados.rados.0.smithi146.stdout:2487: writing smithi14615956-110 from 1749002 to 2312864 tid 2 2018-04-29T06:55:36.209 INFO:tasks.rados.rados.0.smithi146.stdout:2487: writing smithi14615956-110 from 2937672 to 3584631 tid 3 2018-04-29T06:55:37.554 INFO:tasks.rados.rados.0.smithi146.stdout:2487: finishing write tid 2 to smithi14615956-110 2018-04-29T06:55:37.647 INFO:tasks.rados.rados.0.smithi146.stdout:2487: finishing write tid 3 to smithi14615956-110 2018-04-29T06:55:37.647 INFO:tasks.rados.rados.0.smithi146.stdout:2487: finishing write tid 5 to smithi14615956-110 2018-04-29T06:55:37.647 INFO:tasks.rados.rados.0.smithi146.stdout:2487: finishing write tid 6 to smithi14615956-110 2018-04-29T06:55:45.486 INFO:tasks.rados.rados.0.smithi146.stdout:2487: finishing write tid 1 to smithi14615956-110 2018-04-29T06:55:45.486 INFO:tasks.rados.rados.0.smithi146.stderr:Error: finished tid 1 when last_acked_tid was 6
Updated by David Zafman almost 6 years ago
- Subject changed from ceph_test_rados dumped core to ceph_test_rados dumped core, Error: finished tid 1 when last_acked_tid was 6
Updated by David Zafman almost 6 years ago
A 2018-04-29T06:55:36.190 INFO:tasks.rados.rados.0.smithi146.stdout:2487: writing smithi14615956-110 from 548324 to 1216745 tid 1
B 2018-04-29T06:55:36.196 INFO:tasks.rados.rados.0.smithi146.stdout:2487: writing smithi14615956-110 from 1749002 to 2312864 tid 2
C 2018-04-29T06:55:36.209 INFO:tasks.rados.rados.0.smithi146.stdout:2487: writing smithi14615956-110 from 2937672 to 3584631 tid 3
The write at 548324 "A" was processed 8 seconds later with RETRY=2.
B 2018-04-29 06:55:36.978 7fec604e3700 10 osd.3 pg_epoch: 97 pg[2.13( v 88'300 (0'0,88'300] local-lis/les=96/97 n=14 ec=96/17 lis/c 96/17 les/c/f 97/18/0 17/96/17) [3,2,4] r=0 lpr=96 pi=[17,96)/1 crt=88'300 lcod 0'0 mlcod 0'0 active mbc={}] new_repop rep_tid 2077 on osd_op(client.4339.0:9440 2.13 2:c958b3e9:::smithi14615956-110:head [write 1749002~563862 [fadvise_dontneed]] snapc 0=[] ondisk+write+known_if_redirected e96) v8
C 2018-04-29 06:55:36.978 7fec604e3700 10 osd.3 pg_epoch: 97 pg[2.13( v 97'301 (0'0,97'301] local-lis/les=96/97 n=14 ec=96/17 lis/c 96/96 les/c/f 97/97/0 17/96/17) [3,2,4] r=0 lpr=96 luod=88'300 crt=97'301 lcod 0'0 mlcod 0'0 active+clean] new_repop rep_tid 2078 on osd_op(client.4339.0:9441 2.13 2:c958b3e9:::smithi14615956-110:head [write 2937672~646959 [fadvise_dontneed]] snapc 0=[] ondisk+write+known_if_redirected e96) v8
A 2018-04-29 06:55:44.650 7fec604e3700 10 osd.3 pg_epoch: 104 pg[2.13( v 97'307 (0'0,97'307] local-lis/les=102/103 n=14 ec=96/17 lis/c 102/96 les/c/f 103/97/0 101/102/17) [3,2,1]/[3,2] async=[1] r=0 lpr=102 pi=[96,102)/1 rops=2 crt=97'307 lcod 97'306 mlcod 0'0 active+recovering+undersized+remapped mbc={255={(2+1)=8}}] new_repop rep_tid 2133 on osd_op(client.4339.0:9439 2.13 2:c958b3e9:::smithi14615956-110:head [write 548324~668421 [fadvise_dontneed]] snapc 0=[] RETRY=2 ondisk+retry+write+known_if_redirected e104) v8
Updated by David Zafman almost 6 years ago
- Related to Bug #23857: flush (manifest) vs async recovery causes out of order op added
Updated by Kefu Chai almost 6 years ago
- Related to Bug #23598: hammer->jewel: ceph_test_rados crashes during radosbench task in jewel rados upgrade test added
Updated by Kefu Chai about 5 years ago
/a/kchai-2019-01-29_02:56:52-rados-wip-kefu2-testing-2019-01-28-1728-distro-basic-mira/3521986
/a/kchai-2019-01-29_02:56:52-rados-wip-kefu2-testing-2019-01-28-1728-distro-basic-mira/3521988
Updated by Kefu Chai about 5 years ago
/a/kchai-2019-02-16_11:36:29-rados-wip-sage-testing-2019-02-16-1748-distro-basic-smithi/3601272
/a/kchai-2019-05-27_10:18:11-rados-wip-kefu-testing-2019-05-27-1206-distro-basic-smithi/3983050
Updated by Kefu Chai about 4 years ago
2020-03-24T16:51:52.628 INFO:tasks.rados.rados.0.smithi167.stdout:5179: finishing write tid 5 to smithi16730471-110 2020-03-24T16:51:52.628 INFO:tasks.rados.rados.0.smithi167.stdout:5179: finishing write tid 2 to smithi16730471-110 2020-03-24T16:51:52.628 INFO:tasks.rados.rados.0.smithi167.stderr:Error: finished tid 2 when last_acked_tid was 5 2020-03-24T16:51:52.628 INFO:tasks.rados.rados.0.smithi167.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/ gigantic/release/15.1.1-346-g983ae03/rpm/el8/BUILD/ceph-15.1.1-346-g983ae03/src/test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7fabea7fc700 time 2020-03-2 4T16:51:52.620939+0000 2020-03-24T16:51:52.629 INFO:tasks.rados.rados.0.smithi167.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/ gigantic/release/15.1.1-346-g983ae03/rpm/el8/BUILD/ceph-15.1.1-346-g983ae03/src/test/osd/RadosModel.h: 907: ceph_abort_msg("abort() called") 2020-03-24T16:51:52.629 INFO:tasks.rados.rados.0.smithi167.stderr: ceph version 15.1.1-346-g983ae03 (983ae031b114222849ba5d8e81e4e984a5f8d8f8) octopus (rc) 2020-03-24T16:51:52.629 INFO:tasks.rados.rados.0.smithi167.stderr: 1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > con st&)+0xe5) [0x7fac05b9e310] 2020-03-24T16:51:52.629 INFO:tasks.rados.rados.0.smithi167.stderr: 2: (WriteOp::_finish(TestOp::CallbackInfo*)+0x666) [0x557f6dbd9da6] 2020-03-24T16:51:52.629 INFO:tasks.rados.rados.0.smithi167.stderr: 3: (write_callback(void*, void*)+0x1d) [0x557f6dbfa29d] 2020-03-24T16:51:52.629 INFO:tasks.rados.rados.0.smithi167.stderr: 4: (()+0x9e44e) [0x7fac0f0b444e] 2020-03-24T16:51:52.629 INFO:tasks.rados.rados.0.smithi167.stderr: 5: (()+0x619ed) [0x7fac0f0779ed] 2020-03-24T16:51:52.630 INFO:tasks.rados.rados.0.smithi167.stderr: 6: (Finisher::finisher_thread_entry()+0x1a5) [0x7fac05c2e6d5] 2020-03-24T16:51:52.630 INFO:tasks.rados.rados.0.smithi167.stderr: 7: (()+0x82de) [0x7fac0522d2de]
/a/kchai-2020-03-24_03:31:20-rados-wip-kefu-testing-2020-03-23-2313-distro-basic-smithi/4886049