Bug #6761 (closed)

Emperor's "dirty" flag is being interpreted as "lost" by Dumpling OSDs

Added by Corin Langosch over 10 years ago. Updated over 10 years ago.

Status: Resolved
Priority: Immediate
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Severity: 3 - minor

Description

All my systems run Ubuntu 12.10. I had been running Dumpling for a few months without any errors. My KVM guests use qemu-rbd and run on the same machines as the mons and OSDs.

I just upgraded all my monitors (3) and one OSD (out of 14) to Emperor. The cluster is healthy and seems to be running fine. A few minutes after the upgrade, a few of my qemu (KVM) guests just died. There are no core dumps and no logs.
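
For reference, a minimal sketch of how the cluster state and the versions of the individual daemons can be checked during such a mixed-version upgrade (the osd and mon ids used below are just examples):

  # overall cluster health and status
  ceph -s
  ceph health detail
  # version of the locally installed ceph packages
  ceph --version
  # version a specific running daemon reports
  ceph tell osd.0 version
  ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok version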

When I start one of the dead KVM guests from the command line, it dies again after a few seconds (over VNC I can see Linux boot up inside the VM, and then the VNC connection drops). On the command line I get http://pastie.org/8477535. This clearly looks like a Ceph bug.

I already tried to pass "cache=none" but it doesn't help.
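
For context, a qemu -drive option for an RBD-backed disk with cache=none typically looks something like the following (pool name, image name, auth id and config path are placeholders, not the actual values from this setup):

  -drive file=rbd:rbd/vm-disk-1:id=admin:conf=/etc/ceph/ceph.conf,format=raw,if=virtio,cache=none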

Another thing I noticed: I restarted the one upgraded OSD to see if the problem is related to it. While the upgraded OSD was down (not out, so no rebalance was active!), the KVM guest still didn't start up. I then started the OSD again. Just a few moments later I discovered that another already-running KVM guest had died (one which had died during the upgrade before, but which I had been able to restart).
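
For reference, a common way to take a single OSD down briefly without it being marked out (and thus without triggering a rebalance) is sketched below; the osd id "3" is just an example, and the upstart commands assume the standard Ubuntu ceph packages:

  # prevent down OSDs from being marked out while they are worked on
  ceph osd set noout
  # restart the one upgraded OSD (upstart job on Ubuntu; with sysvinit: "service ceph restart osd.3")
  stop ceph-osd id=3
  start ceph-osd id=3
  # re-enable automatic out-marking once the OSD is back up
  ceph osd unset noout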

I really hope someone can help me, as this is a production cluster and I need to get the virtual machines running again ASAP.

Thank you!


Files

export.txt (5.07 MB) - Corin Langosch, 11/13/2013 09:55 AM
data.tar.gz (45.6 MB) - Corin Langosch, 11/13/2013 12:24 PM