Backport #17970

closed

jewel: osd/ECBackend.cc: 201: FAILED assert(res.errors.empty())

Added by Nathan Cutler over 7 years ago. Updated about 7 years ago.

Status: Resolved
Priority: Normal
Assignee: David Zafman
Target version:
Release: jewel
Pull request ID:
Crash signature (v1):
Crash signature (v2):


Related issues 1 (0 open, 1 closed)

Copied from Ceph - Bug #13937: osd/ECBackend.cc: 201: FAILED assert(res.errors.empty()) (Resolved, David Zafman, 12/01/2015)

Actions #1

Updated by Nathan Cutler over 7 years ago

  • Copied from Bug #13937: osd/ECBackend.cc: 201: FAILED assert(res.errors.empty()) added
Actions #2

Updated by Nathan Cutler over 7 years ago

  • Description updated (diff)
  • Status changed from New to In Progress
  • Assignee set to David Zafman
Actions #3

Updated by David Zafman over 7 years ago

  • Status changed from In Progress to Fix Under Review
Actions #4

Updated by Aaron T over 7 years ago

I compiled this pull request earlier today and started an OSD (17) with the new code. Then I started the OSDs which were sending error reports. Another crash happened almost immediately. Please let me know if I should open a new bug report since the error came from ReplicatedPG.cc.

Log snippet:

 -286> 2016-12-05 12:26:01.167445 7faaacfdb700 10 log is not dirty
   -16> 2016-12-05 12:26:01.584140 7faaacfdb700 10 osd.17 109374 do_recovery can start 1 (1/50 rops)
   -13> 2016-12-05 12:26:01.584815 7faaacfdb700 10 osd.17 109374 do_recovery starting 1 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]
   -12> 2016-12-05 12:26:01.584854 7faaacfdb700 10 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE] recover_replicas(1)
   -11> 2016-12-05 12:26:01.584891 7faaacfdb700 10 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.1(10) missing 0 objects.
   -10> 2016-12-05 12:26:01.584920 7faaacfdb700 20 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.1(10) missing {}
    -9> 2016-12-05 12:26:01.584942 7faaacfdb700 10 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.8(13) missing 0 objects.
    -8> 2016-12-05 12:26:01.584962 7faaacfdb700 20 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.8(13) missing {}
    -7> 2016-12-05 12:26:01.584984 7faaacfdb700 10 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.9(7) missing 0 objects.
    -6> 2016-12-05 12:26:01.585005 7faaacfdb700 20 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.9(7) missing {}
    -5> 2016-12-05 12:26:01.585026 7faaacfdb700 10 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.10(3) missing 0 objects.
    -4> 2016-12-05 12:26:01.585048 7faaacfdb700 20 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.10(3) missing {}
    -3> 2016-12-05 12:26:01.585070 7faaacfdb700 10 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.13(2) missing 1 objects.
    -2> 2016-12-05 12:26:01.585092 7faaacfdb700 20 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.13(2) missing {2:c04d383d:::100000755d1.000001dd:head=0'0}
    -1> 2016-12-05 12:26:01.585118 7faaacfdb700 -1 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE] recover_replicas: object added to missing set for backfill, but is not in recovering, error!
     0> 2016-12-05 12:26:01.590780 7faaacfdb700 -1 osd/ReplicatedPG.cc: In function 'int ReplicatedPG::recover_replicas(int, ThreadPool::TPHandle&)' thread 7faaacfdb700 time 2016-12-05 12:26:01.585153
2016-12-05 12:26:01.645575 7faaacfdb700 -1 *** Caught signal (Aborted) **
 in thread 7faaacfdb700 thread_name:tp_osd_recov
     0> 2016-12-05 12:26:01.645575 7faaacfdb700 -1 *** Caught signal (Aborted) **
 in thread 7faaacfdb700 thread_name:tp_osd_recov
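
When reading a dump like the one above, it can help to pull out only the peers that report a non-zero missing count. The following is a minimal sketch, assuming the `peer osd.N(shard) missing K objects.` line format seen in this snippet; the function name `peers_with_missing` and the shortened sample lines are illustrative and not part of Ceph.

```python
import re

# Matches lines such as "... peer osd.13(2) missing 1 objects."
# The pattern is inferred from the log excerpt above (an assumption, not a Ceph API).
PEER_MISSING = re.compile(r"peer osd\.(\d+)\((\d+)\) missing (\d+) objects\.")


def peers_with_missing(lines):
    """Return {(osd_id, shard): count} for peers reporting a non-zero missing count."""
    result = {}
    for line in lines:
        match = PEER_MISSING.search(line)
        if match is None:
            continue  # e.g. the "missing {...}" detail lines, or unrelated log lines
        osd_id, shard, count = (int(group) for group in match.groups())
        if count > 0:
            result[(osd_id, shard)] = count
    return result


# Shortened, hypothetical sample lines in the same shape as the excerpt.
sample = [
    "... NIBBLEWISE]  peer osd.1(10) missing 0 objects.",
    "... NIBBLEWISE]  peer osd.1(10) missing {}",
    "... NIBBLEWISE]  peer osd.13(2) missing 1 objects.",
    "... NIBBLEWISE]  peer osd.13(2) missing {2:c04d383d:::100000755d1.000001dd:head=0'0}",
]

print(peers_with_missing(sample))
```

Applied to the excerpt in this comment, such a filter would single out osd.13(2), the only peer listed with a missing object before the `recover_replicas` error.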
Actions #5

Updated by Loïc Dachary over 7 years ago

@Aaron zhang could you please open a bug report for this? In the bug report, please include every detail, including access to the git repository you used, the SHA1 you compiled, and how you compiled the sources. Thanks!

Actions #6

Updated by Aaron T over 7 years ago

Loic Dachary wrote:

@Aaron zhang could you please open a bug report for this? In the bug report, please include every detail, including access to the git repository you used, the SHA1 you compiled, and how you compiled the sources. Thanks!

@Loïc Dachary - Created an initial bug report here: #18162

Actions #7

Updated by Ken Dreyer about 7 years ago

  • Description updated (diff)
Actions #8

Updated by Nathan Cutler about 7 years ago

  • Status changed from Fix Under Review to Resolved
  • Target version set to v10.2.7