Backport #17970

jewel: osd/ECBackend.cc: 201: FAILED assert(res.errors.empty())

Added by Nathan Cutler over 4 years ago. Updated about 4 years ago.

Status: Resolved
Priority: Normal
Assignee:
Target version:
Release: jewel
Crash signature (v1):
Crash signature (v2):


Related issues

Copied from Ceph - Bug #13937: osd/ECBackend.cc: 201: FAILED assert(res.errors.empty()) Resolved 12/01/2015

History

#1 Updated by Nathan Cutler over 4 years ago

  • Copied from Bug #13937: osd/ECBackend.cc: 201: FAILED assert(res.errors.empty()) added

#2 Updated by Nathan Cutler over 4 years ago

  • Description updated (diff)
  • Status changed from New to In Progress
  • Assignee set to David Zafman

#3 Updated by David Zafman over 4 years ago

  • Status changed from In Progress to Fix Under Review

#4 Updated by Aaron Ten Clay over 4 years ago

I compiled this pull request earlier today and started one OSD (osd.17) with the new code. I then started the OSDs that had been sending error reports, and another crash happened almost immediately. Please let me know if I should open a new bug report, since this error came from ReplicatedPG.cc.

Log snippet:

 -286> 2016-12-05 12:26:01.167445 7faaacfdb700 10 log is not dirty
   -16> 2016-12-05 12:26:01.584140 7faaacfdb700 10 osd.17 109374 do_recovery can start 1 (1/50 rops)
   -13> 2016-12-05 12:26:01.584815 7faaacfdb700 10 osd.17 109374 do_recovery starting 1 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]
   -12> 2016-12-05 12:26:01.584854 7faaacfdb700 10 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE] recover_replicas(1)
   -11> 2016-12-05 12:26:01.584891 7faaacfdb700 10 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.1(10) missing 0 objects.
   -10> 2016-12-05 12:26:01.584920 7faaacfdb700 20 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.1(10) missing {}
    -9> 2016-12-05 12:26:01.584942 7faaacfdb700 10 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.8(13) missing 0 objects.
    -8> 2016-12-05 12:26:01.584962 7faaacfdb700 20 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.8(13) missing {}
    -7> 2016-12-05 12:26:01.584984 7faaacfdb700 10 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.9(7) missing 0 objects.
    -6> 2016-12-05 12:26:01.585005 7faaacfdb700 20 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.9(7) missing {}
    -5> 2016-12-05 12:26:01.585026 7faaacfdb700 10 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.10(3) missing 0 objects.
    -4> 2016-12-05 12:26:01.585048 7faaacfdb700 20 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.10(3) missing {}
    -3> 2016-12-05 12:26:01.585070 7faaacfdb700 10 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.13(2) missing 1 objects.
    -2> 2016-12-05 12:26:01.585092 7faaacfdb700 20 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE]  peer osd.13(2) missing {2:c04d383d:::100000755d1.000001dd:head=0'0}
    -1> 2016-12-05 12:26:01.585118 7faaacfdb700 -1 osd.17 pg_epoch: 109374 pg[2.3s0( v 91692'213861 (78780'203853,91692'213861] local-les=109374 n=136827 ec=136 les/c/f 109374/80462/109085 109366/109373/109373) [37,61,13,34,43,35,18,39,54,2147483647,21,22,58,8]/[17,2147483647,38,10,16,35,43,9,37,45,1,23,34,8] r=0 lpr=109373 pi=80459-109372/1202 bft=13(2),18(6),21(10),22(11),34(3),37(0),39(7),43(4),54(8),58(12),61(1) crt=91692'213859 lcod 0'0 mlcod 0'0 active+undersized+degraded+remapped+inconsistent+backfilling NIBBLEWISE] recover_replicas: object added to missing set for backfill, but is not in recovering, error!
     0> 2016-12-05 12:26:01.590780 7faaacfdb700 -1 osd/ReplicatedPG.cc: In function 'int ReplicatedPG::recover_replicas(int, ThreadPool::TPHandle&)' thread 7faaacfdb700 time 2016-12-05 12:26:01.585153
2016-12-05 12:26:01.645575 7faaacfdb700 -1 *** Caught signal (Aborted) **
 in thread 7faaacfdb700 thread_name:tp_osd_recov
     0> 2016-12-05 12:26:01.645575 7faaacfdb700 -1 *** Caught signal (Aborted) **
 in thread 7faaacfdb700 thread_name:tp_osd_recov
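
When reading a dump like the one above, it can help to pull out only the peers that report a non-zero missing count. The following is a minimal sketch, not part of Ceph: the function name `peers_with_missing` and the abbreviated sample lines (the long pg[...] prefix is replaced with "...") are assumptions made for illustration. It matches only the "missing N objects." summary lines, not the "missing {...}" detail lines.

```python
import re

# Matches summary lines such as "peer osd.13(2) missing 1 objects."
# Groups: OSD id, shard index, missing-object count.
PEER_MISSING = re.compile(r"peer osd\.(\d+)\((\d+)\) missing (\d+) objects\.")


def peers_with_missing(lines):
    """Return {(osd_id, shard): count} for peers with a non-zero missing count.

    Hypothetical helper for illustration; it only inspects log text.
    """
    result = {}
    for line in lines:
        match = PEER_MISSING.search(line)
        if match:
            osd_id, shard, count = (int(group) for group in match.groups())
            if count > 0:
                result[(osd_id, shard)] = count
    return result


# Abbreviated sample lines (assumption: "..." stands in for the pg[...] prefix).
sample = [
    "... NIBBLEWISE]  peer osd.10(3) missing 0 objects.",
    "... NIBBLEWISE]  peer osd.13(2) missing 1 objects.",
    "... NIBBLEWISE]  peer osd.13(2) missing {2:c04d383d:::100000755d1.000001dd:head=0'0}",
]

print(peers_with_missing(sample))
```

Run against the snippet in this comment, the only peer reported would be osd.13 (shard 2), which is the peer whose missing set contains the object named just before the recover_replicas error.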

#5 Updated by Loïc Dachary over 4 years ago

@Aaron, could you please open a bug report for this? In the bug report, make sure you include every detail: access to the git repository you used, the SHA1 you compiled, and how you compiled the sources. Thanks!

#6 Updated by Aaron Ten Clay over 4 years ago

Loïc Dachary wrote:

@Aaron, could you please open a bug report for this? In the bug report, make sure you include every detail: access to the git repository you used, the SHA1 you compiled, and how you compiled the sources. Thanks!

@Loic - Created an initial bug report here: #18162

#7 Updated by Ken Dreyer about 4 years ago

  • Description updated (diff)

#8 Updated by Nathan Cutler about 4 years ago

  • Status changed from Fix Under Review to Resolved
  • Target version set to v10.2.7
