Bug #22743
open"RadosModel.h: 854: FAILED assert(0)" in upgrade:hammer-x-jewel-distro-basic-smithi
0%
Description
Run: http://pulpito.ceph.com/teuthology-2018-01-19_01:15:02-upgrade:hammer-x-jewel-distro-basic-smithi/
Job: 2088826
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2018-01-19_01:15:02-upgrade:hammer-x-jewel-distro-basic-smithi/2088826/teuthology.log
2018-01-21T16:58:24.901 INFO:tasks.rados.rados.0.smithi076.stderr:Error: finished tid 1 when last_acked_tid was 6 2018-01-21T16:58:24.906 INFO:tasks.rados.rados.0.smithi076.stderr:./test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7f35dbfff700 time 2018-01-21 16:58:24.910783 2018-01-21T16:58:24.906 INFO:tasks.rados.rados.0.smithi076.stderr:./test/osd/RadosModel.h: 854: FAILED assert(0) 2018-01-21T16:58:24.906 INFO:tasks.rados.rados.0.smithi076.stderr: ceph version 0.94.10-85-ga8e54ce (a8e54cee69fc2fdc8df27f35ebe1b56444f43317) 2018-01-21T16:58:24.906 INFO:tasks.rados.rados.0.smithi076.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x4e147b] 2018-01-21T16:58:24.907 INFO:tasks.rados.rados.0.smithi076.stderr: 2: (WriteOp::_finish(TestOp::CallbackInfo*)+0x46e) [0x4c8bfe] 2018-01-21T16:58:24.907 INFO:tasks.rados.rados.0.smithi076.stderr: 3: (write_callback(void*, void*)+0x19) [0x4d8a79] 2018-01-21T16:58:24.907 INFO:tasks.rados.rados.0.smithi076.stderr: 4: (()+0x9526d) [0x7f35f4f3526d] 2018-01-21T16:58:24.907 INFO:tasks.rados.rados.0.smithi076.stderr: 5: (()+0x704c9) [0x7f35f4f104c9] 2018-01-21T16:58:24.907 INFO:tasks.rados.rados.0.smithi076.stderr: 6: (()+0x1351e8) [0x7f35f4fd51e8] 2018-01-21T16:58:24.907 INFO:tasks.rados.rados.0.smithi076.stderr: 7: (()+0x8184) [0x7f35f4a74184] 2018-01-21T16:58:24.907 INFO:tasks.rados.rados.0.smithi076.stderr: 8: (clone()+0x6d) [0x7f35f33f4ffd] 2018-01-21T16:58:24.907 INFO:tasks.rados.rados.0.smithi076.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Updated by Greg Farnum about 6 years ago
- Project changed from Ceph to RADOS
- Category set to Correctness/Safety
- Priority changed from Urgent to Normal
- Release set to hammer
I don't think a bug in a hammer binary during an upgrade test to jewel is an urgent problem at this point?
Updated by Nathan Cutler about 6 years ago
- Related to Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi added
Updated by Nathan Cutler about 6 years ago
- Related to Bug #22123: osd: objecter sends out of sync with pg epochs for proxied ops added
Updated by Nathan Cutler about 6 years ago
@David I guess this is a duplicate, too?
Updated by David Zafman about 6 years ago
@Nathan Weinberg This doesn't have cache tier, so it would be a different issue. Maybe related to upgrade?
Updated by Greg Farnum about 6 years ago
I looked at it briefly and the output doesn't make any sense to me, but I don't have a lot of context around what these tests are doing. Namely, I don't see a tid write 6 being sent out anywhere nearby.
2018-02-16T02:28:25.685 INFO:tasks.rados.rados.0.smithi141.stdout:3573: writing smithi14116405-7 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299 from 533518 to 1261370 tid 1 2018-02-16T02:28:25.713 INFO:tasks.rados.rados.0.smithi141.stdout:3573: writing smithi14116405-7 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299 from 2029614 to 2524516 tid 2 2018-02-16T02:28:25.739 INFO:tasks.rados.rados.0.smithi141.stdout:3573: writing smithi14116405-7 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299 from 3193506 to 3733518 tid 3 2018-02-16T02:28:25.740 INFO:tasks.rados.rados.0.smithi141.stdout:3574: read oid 42 snap -1 2018-02-16T02:28:25.740 INFO:tasks.rados.rados.0.smithi141.stdout:3574: expect deleted 2018-02-16T02:28:25.740 INFO:tasks.rados.rados.0.smithi141.stdout:3575: snap_create 2018-02-16T02:28:25.742 INFO:tasks.rados.rados.0.smithi141.stdout:3573: finishing write tid 6 to smithi14116405-7 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299 2018-02-16T02:28:26.651 INFO:tasks.ceph.mon.c.smithi184.stderr:2018-02-16 02:28:26.649819 7fb7d3e73700 -1 mon.c@2(peon).osd e1600 update_from_paxos full map CRC mismatch, resetting to canonical 2018-02-16T02:28:26.652 INFO:tasks.ceph.mon.b.smithi184.stderr:2018-02-16 02:28:26.650992 7fb08a9bb700 -1 mon.b@1(peon).osd e1600 update_from_paxos full map CRC mismatch, resetting to canonical 2018-02-16T02:28:26.654 INFO:tasks.rados.rados.0.smithi141.stdout:3574: done (5 left) 2018-02-16T02:28:26.654 INFO:tasks.rados.rados.0.smithi141.stdout:3575: done (4 left) 2018-02-16T02:28:26.655 INFO:tasks.rados.rados.0.smithi141.stdout:3576: snap_create 2018-02-16T02:28:27.431 INFO:tasks.ceph.mon.b.smithi184.stderr:2018-02-16 02:28:27.429488 7fb08a9bb700 -1 mon.b@1(peon).osd e1601 update_from_paxos full map CRC mismatch, resetting to canonical 2018-02-16T02:28:27.432 INFO:tasks.ceph.mon.c.smithi184.stderr:2018-02-16 02:28:27.429488 7fb7d3e73700 -1 mon.c@2(peon).osd e1601 update_from_paxos full map CRC mismatch, resetting to canonical 2018-02-16T02:28:27.434 INFO:tasks.rados.rados.0.smithi141.stdout:3576: done (4 left) 2018-02-16T02:28:27.435 INFO:tasks.rados.rados.0.smithi141.stdout:3577: rollback oid 36 current snap is 456 2018-02-16T02:28:27.435 INFO:tasks.rados.rados.0.smithi141.stdout:rollback oid 36 to 446 2018-02-16T02:28:27.435 INFO:tasks.rados.rados.0.smithi141.stdout:3578: snap_create 2018-02-16T02:28:27.457 INFO:tasks.rados.rados.0.smithi141.stdout:3573: finishing write tid 1 to smithi14116405-7 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299 2018-02-16T02:28:27.457 INFO:tasks.rados.rados.0.smithi141.stderr:Error: finished tid 1 when last_acked_tid was 6
So I guess it's not outputting all the tids it's using, but at a quick skim of the source I'm not sure how. Nonetheless, it's obviously happening, and if you scroll back searching for object smithi14116405-7, you'll find that same basic pattern of writing tids 1-3 or 1-4, and then getting "finishing write tid" statements for 1-6. All in all, I'm confuzzled.
Updated by Nathan Cutler about 6 years ago
Happened again (jewel 10.2.11 integration testing) - http://qa-proxy.ceph.com/teuthology/smithfarm-2018-03-28_20:31:45-rados-wip-jewel-backports-distro-basic-smithi/2331780/