Bug #5060
closed
osd: decode failure in load_pgs on 0.56.4
Added by Ivan Kudryavtsev almost 11 years ago.
Updated almost 11 years ago.
Description
One of my OSD hosts crashed under high load, and after the reboot it is unable to start some of its OSDs.
The error log for osd.3 is attached.
Before the crash, the load average (LA) climbed from 1.0 to 30 and iowait rose as well; this may be a RAID controller bug.
Files
- Assignee set to Samuel Just
- Priority changed from Normal to Urgent
- Subject changed from Ceph crashed under high load, osd start failed after reboot to osd: decode failure in load_pgs on 0.56.4
- Status changed from New to Need More Info
Was the ceph-osd process that originally crashed under load also 0.56.4, or an earlier version? (Do you have the log or crash dump for that?)
And, can you reproduce the start-up crash with 'debug osd = 20' and 'debug filestore = 20' and 'debug ms = 1'?
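For reference, the debug levels requested above could be set in ceph.conf roughly as follows before restarting the failing daemon. This is a minimal sketch: placing the options in the `[osd]` section (so they apply to every OSD on the host) is an assumption, and they could instead go under a per-daemon section such as `[osd.3]`.

```ini
; Sketch only: section placement is an assumption, not taken from this report.
[osd]
    debug osd = 20        ; verbose OSD logging
    debug filestore = 20  ; verbose FileStore logging
    debug ms = 1          ; basic messenger logging
```

These levels produce large log files, so they are usually reverted once the failed start-up has been captured.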
Yes, it was running 0.56.4 at the time of the crash, and before it as well.
I cannot reproduce it because the affected OSDs have already been reformatted.
- Priority changed from Urgent to High
- Status changed from Need More Info to Can't reproduce
If you see this again, please capture the stack trace of the original crash before recovering, and if you can, generate a complete log of the failed restart that follows. Thanks!