Bug #7206

closed

Ceph MDS Hang on hadoop workloads

Added by Greg Bowyer over 10 years ago. Updated about 5 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Client, Common/Protocol, Hadoop/Java, MDS
Labels (FS):
Java/Hadoop
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

With blatant disregard for posted advice about the stability of ceph-fs, I attempted to use it for testing out some hadoop workloads.

When the Hadoop job gets to roughly 30% complete, the MDS appears to go out to lunch, locking up.

This is with Ubuntu 13.10, kernel 3.11.0, Ceph Emperor, and Hadoop CDH 4.x.

I had to update the Ceph filesystem hook for Hadoop 2.x; that might be the cause, but I think that even if my updates are invalid, a buggy client should not really be able to freak out the MDS.

The machines are on AWS, so Xen might be part of the issue.

CephFS is mounted on the machines as well.

I could not coax a perf report out of the machine (that also locks up).

I am more than willing to help track this down.


Files

ceph-mds.parlimentarian-10-2-38-5.log (8.71 MB) ceph-mds.parlimentarian-10-2-38-5.log Greg Bowyer, 01/22/2014 12:30 PM
client.dmesg (2.42 KB) client.dmesg Greg Bowyer, 01/22/2014 12:30 PM
Actions #1

Updated by Greg Farnum over 10 years ago

This could be an issue with time sync on the nodes; check your clock drift. (That's the only issue I know of that we've run into with Hadoop.) If you're using the Hadoop/CephFS Filesystem you don't need to mount CephFS on the client nodes, btw.

If that's not the issue, you'll need to reproduce with mds logging turned on (probably debug mds = 20, debug ms = 1) and client logging (debug client = 20, debug ms = 1) and the admin socket enabled on the Hadoop nodes.
Once it hangs, see if you can get the mds to dump its cache ("ceph tell mds.0 dumpcache", I think) and gather the "mds_requests" and "dump_cache" output from the client admin sockets.
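For reference, a sketch of what the logging settings suggested above might look like in ceph.conf. The section placement and the admin socket path are assumptions for illustration, not taken from this thread:

```ini
# Sketch only: debug levels suggested above, placed in ceph.conf.
# Section names and the admin socket path are assumptions.
[mds]
    debug mds = 20
    debug ms = 1

[client]
    debug client = 20
    debug ms = 1
    # enable the admin socket on the Hadoop client nodes
    admin socket = /var/run/ceph/$name.$pid.asok
```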

Actions #2

Updated by Greg Bowyer over 10 years ago

Greg Farnum wrote:

This could be an issue with time sync on the nodes; check your clock drift. (That's the only issue I know of that we've run into with Hadoop.) If you're using the Hadoop/CephFS Filesystem you don't need to mount CephFS on the client nodes, btw.

Double-checked; all the clocks are in sync (NTP and all).

If that's not the issue, you'll need to reproduce with mds logging turned on (probably debug mds = 20, debug ms = 1) and client logging (debug client = 20, debug ms = 1) and the admin socket enabled on the Hadoop nodes.
Once it hangs, see if you can get the mds to dump its cache ("ceph tell mds.0 dumpcache", I think) and gather the "mds_requests" and "dump_cache" output from the client admin sockets.

I will do this tonight when the cluster is quiet. Is there anything else I can grab at the same time?

Actions #3

Updated by Greg Farnum over 10 years ago

That should be enough to either diagnose it or realize we need to reproduce it locally.

Actions #4

Updated by Greg Bowyer over 10 years ago

Greg Farnum already knows this, but for reference

I spent a large part of today with debug logging on, trying to catch this occurring, without much success. I think the debug logging slows everything down to the point where it masks the timing junk.

I am going to see if I can grab an MDS dump when/if it happens again.


Actions #5

Updated by Zheng Yan over 10 years ago

Just enable the admin socket (without enabling client or MDS debug). When the hang occurs, dump the client requests and the MDS cache (ceph --admin-daemon /var/run/ceph/ceph-fuse.asok mds_requests, ceph mds tell 0 dumpcache /tmp/cachedump.mds0)
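A minimal sketch of enabling only the admin socket, per the suggestion above, with debug logging left at its defaults. The socket path is an assumption; the thread itself refers to /var/run/ceph/ceph-fuse.asok:

```ini
# Sketch only: enable just the admin socket on the client nodes,
# without raising debug levels. Path is an assumption.
[client]
    admin socket = /var/run/ceph/$name.$pid.asok

# Then, when the hang occurs, run (paths/IDs as in the comment above):
#   ceph --admin-daemon /var/run/ceph/ceph-fuse.asok mds_requests
#   ceph mds tell 0 dumpcache /tmp/cachedump.mds0
```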

Actions #6

Updated by Greg Farnum about 10 years ago

  • Status changed from New to Need More Info
Actions #7

Updated by Greg Farnum almost 8 years ago

  • Status changed from Need More Info to Can't reproduce

If this was a time-sync issue, we fixed a bunch of weird stuff in the switch to solely client-directed mtime updates.

Actions #8

Updated by Greg Farnum almost 8 years ago

  • Category changed from 47 to 48
  • Component(FS) Client, Common/Protocol, Hadoop/Java, MDS added
Actions #9

Updated by Patrick Donnelly about 5 years ago

  • Category deleted (48)
  • Labels (FS) Java/Hadoop added