MDS dumpability


Expand the 'dump' command set so that a running MDS can dump more state to a local file or Unix domain socket for debugging visibility.



Interested Parties

  • Sage Weil (Inktank)

Current Status

A 'dumpcache <filename>' command will dump all inodes, dentries, and dirs to a file using the operator<<() methods. This is a good start, but only captures what is exposed by those methods--not all structure fields.
All encodable objects (that go to disk or over the wire) include a dump(Formatter*) method that will generate structured json (or xml etc) output for debugging. The in-memory only structures (like CInode, CDentry, CDir) don't have these (yet).
The admin socket infrastructure lets you query a running daemon from the command line (e.g., 'ceph daemon mds.a <command>') and get the output. It is fully buffered (entire result is built in memory, then written to the output).
The Formatter class has JSON and XML implementations. It is fully buffered (entire output is generated in memory, then written to the output socket).

Detailed Description

We should add dump() methods to all important in-memory structures, including:
  • CInode
  • CDir
  • CDentry
  • Capability
Any dump should be recursive (i.e., dumping the inode will also dump the capabilities). It should not follow links, though (dumping a dir shouldn't dump all dentries and inodes... or should it?)
We should also capture other important state, like:
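A minimal sketch of what such recursive dump() methods could look like. Everything here is a stand-in: ToyFormatter is a tiny buffered JSON writer written for this example (not Ceph's Formatter class, although its method names are modelled on it), and ToyInode / ToyCapability are hypothetical structures with made-up fields, not the real CInode / Capability. The inode dump recurses into its capabilities but does not follow links to other cache objects.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Toy buffered JSON writer (assumption: stands in for a Formatter).
// It does no string escaping, so it is only suitable for illustration.
class ToyFormatter {
public:
  void open_object_section(const std::string& name) { open(name, false); }
  void open_array_section(const std::string& name) { open(name, true); }
  void close_section() {
    out_ << (stack_.back().is_array ? ']' : '}');
    stack_.pop_back();
  }
  void dump_unsigned(const std::string& name, std::uint64_t v) {
    begin_value(name);
    out_ << v;
  }
  void dump_string(const std::string& name, const std::string& v) {
    begin_value(name);
    out_ << '"' << v << '"';
  }
  std::string str() const { return out_.str(); }

private:
  struct Section { bool is_array; bool empty; };

  void open(const std::string& name, bool is_array) {
    begin_value(name);
    out_ << (is_array ? '[' : '{');
    stack_.push_back(Section{is_array, true});
  }
  // Write the separating comma and, inside an object, the quoted key.
  void begin_value(const std::string& name) {
    if (stack_.empty()) return;  // the top-level section has no key
    Section& s = stack_.back();
    if (!s.empty) out_ << ',';
    s.empty = false;
    if (!s.is_array) out_ << '"' << name << "\":";
  }

  std::ostringstream out_;
  std::vector<Section> stack_;
};

// Hypothetical stand-in for Capability.
struct ToyCapability {
  std::uint64_t client;
  std::string issued;
  void dump(ToyFormatter* f) const {
    f->dump_unsigned("client", client);
    f->dump_string("issued", issued);
  }
};

// Hypothetical stand-in for CInode: dumps its own fields, then recurses
// into the capabilities it owns.
struct ToyInode {
  std::uint64_t ino;
  std::vector<ToyCapability> caps;
  void dump(ToyFormatter* f) const {
    f->dump_unsigned("ino", ino);
    f->open_array_section("caps");
    for (const auto& c : caps) {
      f->open_object_section("cap");
      c.dump(f);
      f->close_section();
    }
    f->close_section();
  }
};

// Wrap one inode in a top-level object and return the JSON text.
std::string dump_inode_json(const ToyInode& in) {
  ToyFormatter f;
  f.open_object_section("inode");
  in.dump(&f);
  f.close_section();
  return f.str();
}
```

The caller opens the enclosing section and the object only emits its own fields, so the same dump() can be reused whether the inode appears at the top level or nested inside a larger cache dump.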
  • MDRequest / Mutation
  • Session
  • LogSegments
Then we wire them up to admin/dev commands to dump cache and other in-memory state.
The final step to make this usable on large clusters is to make the Formatter interface stream based. Right now it looks like:
  • create Formatter
  • dump to it; class accumulates result in memory
  • flush to a bufferlist
  • write bufferlist to final destination
Instead, it should be more like:
  • create Formatter
  • specify the sink (bufferlist, fd, something). Maybe make it an ostream, and make bufferlist work with that?
  • write to it.
  • at the very end call flush() (no arguments!)
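The proposed flow could be sketched roughly as follows, taking the "make it an ostream" option. StreamingJsonFormatter is a hypothetical class written for this example, not an existing Ceph interface; the threshold parameter and its 4096 default are arbitrary assumptions. The formatter is bound to its sink at construction, pushes output to the sink whenever the pending buffer reaches the threshold, and takes no arguments to flush().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stream-based formatter: memory use is bounded by the
// threshold rather than by the size of the whole dump.
class StreamingJsonFormatter {
public:
  explicit StreamingJsonFormatter(std::ostream& sink,
                                  std::size_t threshold = 4096)
    : sink_(sink), threshold_(threshold) {}

  void open_object_section(const std::string& name) { open(name, false); }
  void open_array_section(const std::string& name) { open(name, true); }
  void close_section() {
    emit(stack_.back().is_array ? "]" : "}");
    stack_.pop_back();
  }
  void dump_unsigned(const std::string& name, std::uint64_t v) {
    begin_value(name);
    emit(std::to_string(v));
  }
  // No arguments: the destination was fixed when the formatter was created.
  void flush() {
    drain();
    sink_.flush();
  }

private:
  struct Section { bool is_array; bool empty; };

  void open(const std::string& name, bool is_array) {
    begin_value(name);
    emit(is_array ? "[" : "{");
    stack_.push_back(Section{is_array, true});
  }
  // Write the separating comma and, inside an object, the quoted key.
  void begin_value(const std::string& name) {
    if (stack_.empty()) return;  // the top-level section has no key
    Section& s = stack_.back();
    std::string prefix;
    if (!s.empty) prefix += ',';
    s.empty = false;
    if (!s.is_array) prefix += '"' + name + "\":";
    emit(prefix);
  }
  // Buffer a fragment; hand the buffer to the sink once it is large enough.
  void emit(const std::string& fragment) {
    pending_ += fragment;
    if (pending_.size() >= threshold_) drain();
  }
  void drain() {
    sink_.write(pending_.data(),
                static_cast<std::streamsize>(pending_.size()));
    pending_.clear();
  }

  std::ostream& sink_;
  std::size_t threshold_;
  std::string pending_;
  std::vector<Section> stack_;
};
```

With a std::ostringstream as the sink this behaves like the current buffered approach; with a stream backed by an fd or socket, output leaves the process while the dump is still being generated.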

Work items

Coding tasks

  1. mds: add dump methods to CInode, CDir, CDentry, Capability
  2. mds: add dump methods to Session, MDRequest, LogSegment
  3. admin socket dump commands
  4. tell dump commands (dump to a file)
  5. Formatter: refactor into a stream-based implementation for better memory efficiency
  6. admin socket: make the protocol handle streaming output of unknown length
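For item 6, one possible way to carry output of unknown length is chunked framing: each chunk is sent with its own length prefix, and a zero-length chunk marks the end. The sketch below is only an illustration of that idea, not the existing admin socket wire format; the 4-byte big-endian prefix and the function names are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Append one frame: a 4-byte big-endian length followed by the payload.
// An empty payload is reserved as the end-of-stream marker, so callers
// should not pass one for ordinary data.
void append_chunk(std::string& wire, const std::string& payload) {
  const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
  for (int shift = 24; shift >= 0; shift -= 8)
    wire.push_back(static_cast<char>((n >> shift) & 0xff));
  wire += payload;
}

// Append the zero-length frame that terminates the stream.
void append_end(std::string& wire) { append_chunk(wire, ""); }

// Reassemble the payload from a sequence of frames. Returns false if the
// input is truncated or the terminating frame is missing.
bool read_chunked(const std::string& wire, std::string* out) {
  std::size_t pos = 0;
  out->clear();
  while (pos + 4 <= wire.size()) {
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i)
      n = (n << 8) | static_cast<unsigned char>(wire[pos + i]);
    pos += 4;
    if (n == 0) return true;                    // end-of-stream marker
    if (wire.size() - pos < n) return false;    // payload cut short
    out->append(wire, pos, n);
    pos += n;
  }
  return false;                                 // ran out before the marker
}
```

The sender never needs to know the total size up front, which pairs naturally with a stream-based Formatter, and the receiver can tell a complete reply from a connection that dropped partway through.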