Project

General

Profile

Bug #22743

"RadosModel.h: 854: FAILED assert(0)" in upgrade:hammer-x-jewel-distro-basic-smithi

Added by Yuri Weinstein about 6 years ago. Updated almost 6 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Correctness/Safety
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
upgrade/hammer-x
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Run: http://pulpito.ceph.com/teuthology-2018-01-19_01:15:02-upgrade:hammer-x-jewel-distro-basic-smithi/
Job: 2088826
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2018-01-19_01:15:02-upgrade:hammer-x-jewel-distro-basic-smithi/2088826/teuthology.log

2018-01-21T16:58:24.901 INFO:tasks.rados.rados.0.smithi076.stderr:Error: finished tid 1 when last_acked_tid was 6
2018-01-21T16:58:24.906 INFO:tasks.rados.rados.0.smithi076.stderr:./test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7f35dbfff700 time 2018-01-21 16:58:24.910783
2018-01-21T16:58:24.906 INFO:tasks.rados.rados.0.smithi076.stderr:./test/osd/RadosModel.h: 854: FAILED assert(0)
2018-01-21T16:58:24.906 INFO:tasks.rados.rados.0.smithi076.stderr: ceph version 0.94.10-85-ga8e54ce (a8e54cee69fc2fdc8df27f35ebe1b56444f43317)
2018-01-21T16:58:24.906 INFO:tasks.rados.rados.0.smithi076.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x4e147b]
2018-01-21T16:58:24.907 INFO:tasks.rados.rados.0.smithi076.stderr: 2: (WriteOp::_finish(TestOp::CallbackInfo*)+0x46e) [0x4c8bfe]
2018-01-21T16:58:24.907 INFO:tasks.rados.rados.0.smithi076.stderr: 3: (write_callback(void*, void*)+0x19) [0x4d8a79]
2018-01-21T16:58:24.907 INFO:tasks.rados.rados.0.smithi076.stderr: 4: (()+0x9526d) [0x7f35f4f3526d]
2018-01-21T16:58:24.907 INFO:tasks.rados.rados.0.smithi076.stderr: 5: (()+0x704c9) [0x7f35f4f104c9]
2018-01-21T16:58:24.907 INFO:tasks.rados.rados.0.smithi076.stderr: 6: (()+0x1351e8) [0x7f35f4fd51e8]
2018-01-21T16:58:24.907 INFO:tasks.rados.rados.0.smithi076.stderr: 7: (()+0x8184) [0x7f35f4a74184]
2018-01-21T16:58:24.907 INFO:tasks.rados.rados.0.smithi076.stderr: 8: (clone()+0x6d) [0x7f35f33f4ffd]
2018-01-21T16:58:24.907 INFO:tasks.rados.rados.0.smithi076.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Related issues

Related to RADOS - Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi Duplicate 11/07/2017
Related to RADOS - Bug #22123: osd: objecter sends out of sync with pg epochs for proxied ops Resolved 11/14/2017

History

#1 Updated by Greg Farnum about 6 years ago

  • Project changed from Ceph to RADOS
  • Category set to Correctness/Safety
  • Priority changed from Urgent to Normal
  • Release set to hammer

I don't think a bug in a hammer binary during an upgrade test to jewel is an urgent problem at this point?

#3 Updated by Nathan Cutler about 6 years ago

  • Related to Bug #22064: "RadosModel.h: 865: FAILED assert(0)" in rados-jewel-distro-basic-smithi added

#4 Updated by Nathan Cutler about 6 years ago

  • Related to Bug #22123: osd: objecter sends out of sync with pg epochs for proxied ops added

#5 Updated by Nathan Cutler about 6 years ago

@David I guess this is a duplicate, too?

#6 Updated by David Zafman about 6 years ago

@nathan This doesn't have cache tier, so it would be a different issue. Maybe related to upgrade?

#7 Updated by Greg Farnum about 6 years ago

I looked at it briefly and the output doesn't make any sense to me, but I don't have a lot of context around what these tests are doing. Namely, I don't see a tid write 6 being sent out anywhere nearby.

2018-02-16T02:28:25.685 INFO:tasks.rados.rados.0.smithi141.stdout:3573:  writing smithi14116405-7 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299 from 533518 to 1261370 tid 1
2018-02-16T02:28:25.713 INFO:tasks.rados.rados.0.smithi141.stdout:3573:  writing smithi14116405-7 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299 from 2029614 to 2524516 tid 2
2018-02-16T02:28:25.739 INFO:tasks.rados.rados.0.smithi141.stdout:3573:  writing smithi14116405-7 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299 from 3193506 to 3733518 tid 3
2018-02-16T02:28:25.740 INFO:tasks.rados.rados.0.smithi141.stdout:3574: read oid 42 snap -1
2018-02-16T02:28:25.740 INFO:tasks.rados.rados.0.smithi141.stdout:3574:  expect deleted
2018-02-16T02:28:25.740 INFO:tasks.rados.rados.0.smithi141.stdout:3575: snap_create
2018-02-16T02:28:25.742 INFO:tasks.rados.rados.0.smithi141.stdout:3573:  finishing write tid 6 to smithi14116405-7 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299
2018-02-16T02:28:26.651 INFO:tasks.ceph.mon.c.smithi184.stderr:2018-02-16 02:28:26.649819 7fb7d3e73700 -1 mon.c@2(peon).osd e1600 update_from_paxos full map CRC mismatch, resetting to canonical
2018-02-16T02:28:26.652 INFO:tasks.ceph.mon.b.smithi184.stderr:2018-02-16 02:28:26.650992 7fb08a9bb700 -1 mon.b@1(peon).osd e1600 update_from_paxos full map CRC mismatch, resetting to canonical
2018-02-16T02:28:26.654 INFO:tasks.rados.rados.0.smithi141.stdout:3574: done (5 left)
2018-02-16T02:28:26.654 INFO:tasks.rados.rados.0.smithi141.stdout:3575: done (4 left)
2018-02-16T02:28:26.655 INFO:tasks.rados.rados.0.smithi141.stdout:3576: snap_create
2018-02-16T02:28:27.431 INFO:tasks.ceph.mon.b.smithi184.stderr:2018-02-16 02:28:27.429488 7fb08a9bb700 -1 mon.b@1(peon).osd e1601 update_from_paxos full map CRC mismatch, resetting to canonical
2018-02-16T02:28:27.432 INFO:tasks.ceph.mon.c.smithi184.stderr:2018-02-16 02:28:27.429488 7fb7d3e73700 -1 mon.c@2(peon).osd e1601 update_from_paxos full map CRC mismatch, resetting to canonical
2018-02-16T02:28:27.434 INFO:tasks.rados.rados.0.smithi141.stdout:3576: done (4 left)
2018-02-16T02:28:27.435 INFO:tasks.rados.rados.0.smithi141.stdout:3577: rollback oid 36 current snap is 456
2018-02-16T02:28:27.435 INFO:tasks.rados.rados.0.smithi141.stdout:rollback oid 36 to 446
2018-02-16T02:28:27.435 INFO:tasks.rados.rados.0.smithi141.stdout:3578: snap_create
2018-02-16T02:28:27.457 INFO:tasks.rados.rados.0.smithi141.stdout:3573:  finishing write tid 1 to smithi14116405-7 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299
2018-02-16T02:28:27.457 INFO:tasks.rados.rados.0.smithi141.stderr:Error: finished tid 1 when last_acked_tid was 6

So I guess it's not outputting all the tids it's using, but at a quick skim of the source I'm not sure how. Nonetheless, it's obviously happening, and if you scroll back searching for object smithi14116405-7, you'll find that same basic pattern of writing tids 1-3 or 1-4, and then getting "finishing write tid" statements for 1-6. All in all, I'm confuzzled.

Also available in: Atom PDF