Bug #3797

closed

osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut

Added by Corin Langosch over 11 years ago. Updated over 11 years ago.

Status:
Duplicate
Priority:
Urgent
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Community (user)
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I just upgraded one of my production servers (2 OSDs) from 0.48.2argonaut to the latest 0.48.3argonaut, and one of the OSDs has now been at 100% CPU for over 12 minutes. I never saw this with the previous argonaut release. Its memory usage is only 30 MB RSS (the other OSD seems to be running normally, using almost no CPU and around 300 MB RSS). "ceph -w" reports that the cluster is healthy and everything is up, but it seems the one OSD is hanging and not really doing anything, since its memory usage stays that low.

Here's the logfile of the osd: http://pastie.org/pastes/5687380/text
Output of strace is here (not really much): http://pastie.org/5687401/text

Restarting the OSD doesn't help: it shuts down cleanly, but after starting it again it goes straight back to 100% CPU. What can I do?


Files

ceph-osd.8.log (4.86 MB), Corin Langosch, 01/27/2013 05:03 AM
ceph.png (25.6 KB), Corin Langosch, 01/27/2013 05:06 AM
ceph-osd.8.log (7.23 MB), Corin Langosch, 01/27/2013 10:25 AM
gdb.txt (130 KB), Corin Langosch, 01/27/2013 10:56 AM

Related issues: 1 (0 open, 1 closed)

Related to Ceph - Feature #3376: use external leveldb package for default builds (Duplicate)
