Bug #7206

closed

Ceph MDS Hang on hadoop workloads

Added by Greg Bowyer over 10 years ago. Updated about 5 years ago.

Status: Can't reproduce
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): Client, Common/Protocol, Hadoop/Java, MDS
Labels (FS): Java/Hadoop
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

With blatant disregard for the posted advice about the stability of CephFS, I attempted to use it to test some Hadoop workloads.

When the Hadoop job gets to roughly 30% complete, the MDS appears to go out to lunch and lock up.

This is with Ubuntu 13.10, kernel 3.11.0, Ceph Emperor, and Hadoop CDH 4.x.

I had to update the Ceph filesystem hook for Hadoop 2.x, which might be the cause, but even if those updates are invalid, a buggy client should not really be able to freak out the MDS.

The machines are on AWS, so Xen might be part of the issue.

CephFS is mounted on the machines as well.

I could not coax a perf report out of the machine (that also locks up).

I am more than willing to help track this down.


Files

ceph-mds.parlimentarian-10-2-38-5.log (8.71 MB), Greg Bowyer, 01/22/2014 12:30 PM
client.dmesg (2.42 KB), Greg Bowyer, 01/22/2014 12:30 PM