Bug #12093
closed
"osd/ReplicatedPG.cc: 10405: FAILED assert(obc)" timezone fix does not work
Description
When I start my OSDs, they crash. The OSDs that fail are part of an SSD cache.
I've attached a sample crash and my osd map.
I see this is considered a duplicate of the timezone issue. That is not the case: all machines involved have the same time and timezone and are NTP-synced, and they still have the problem.
I was able to repair this earlier by starting just the OSDs without the MDS. I drained the SSD cache, set it to forward, and let it run. About an hour after I enabled writeback again, the OSDs went back to crashing. They crash when connecting to the monitor. All OSDs on one machine can stay up as long as the monitor is off, but as soon as they connect, OSDs from both machines start crashing.
The data is completely inaccessible until I can get these SSD cache OSDs to stay online. My main data OSDs do not have any problems, but I can't access them because a writeback cache sits in front of them.