General

Profile

Patrick Fruh

  • Registered on: 09/12/2017
  • Last connection: 12/19/2017

Issues

Activity

12/19/2017

09:18 PM RADOS Bug #22486: ceph shows wrong MAX AVAIL with hybrid (chooseleaf firstn 1, chooseleaf firstn -1) CR...
Forgot to put the output in code tags; sadly, I can't edit the original, so here it is again to make it more readable:...
09:14 PM RADOS Bug #22486 (New): ceph shows wrong MAX AVAIL with hybrid (chooseleaf firstn 1, chooseleaf firstn ...
I have the following configuration of OSDs:
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
0 hdd 5...

10/18/2017

09:27 PM Ceph Bug #21834 (Duplicate): Filestore OSD Segfault in thread 7f084dffc700 thread_name:tp_fstore_op
I just zapped and deleted one of my OSDs and added it again, because of the segfault described here: http://tracker.c...
09:03 AM Ceph Bug #21826 (Duplicate): Filestore OSDs start segfaulting
Since upgrading to luminous 12.1.0 some of my filestore OSDs regularly start segfaulting/flapping during the nightly...

09/21/2017

09:10 PM Ceph Bug #21416: osd/PGLog.cc: 60: FAILED assert(s <= can_rollback_to) after upgrade to luminous
So, to sum everything up:
# I upgraded my 6 hosts with 41 OSDs total from Ceph 10.2.9 to 12.2.0 and from CentOS 7....
08:46 PM Ceph Bug #21416: osd/PGLog.cc: 60: FAILED assert(s <= can_rollback_to) after upgrade to luminous
Looking deeper into it, it's only my osd.0 that crashed and only some osds have this "pg_epoch" spam, the other logs ...
08:43 PM Ceph Bug #21416: osd/PGLog.cc: 60: FAILED assert(s <= can_rollback_to) after upgrade to luminous
Looking at the most recent logs, some OSDs still crashed yesterday early morning (even after going back to filestore ...
08:34 PM Ceph Bug #21416: osd/PGLog.cc: 60: FAILED assert(s <= can_rollback_to) after upgrade to luminous
Yes, the errors seemingly started to show up when the OSDs were under high load (recovering themselves or recovering ...

09/16/2017

01:00 PM Ceph Bug #21416: osd/PGLog.cc: 60: FAILED assert(s <= can_rollback_to) after upgrade to luminous
Self test of all 3 OSDs and journal / DB SSD came out ok.
11:15 AM Ceph Bug #21416: osd/PGLog.cc: 60: FAILED assert(s <= can_rollback_to) after upgrade to luminous
Those 3 OSDs share a journaling / DB SSD, so that SSD might be what is failing. I'm currently running a SMART self-test on it.
