Ceph User Committee meeting 2014-04-03

Executive summary

The agenda was:

Documentation of the new Firefly features (tiering, erasure code)

  • Good: second answer is
  • Pain point : ease of use of tiering and erasure code
  • Needs clarification : is erasure code beneficial to smaller users ?
  • Wish : more tiering and erasure code use cases
  • Needs clarification : does erasure code require more work for the MONs ?
  • Wish : are there any plans for "glued objects", i.e. adding a bunch of small objects together into one large blob, then erasure coding that blob ?
  • Needs clarification : "10 DCs" example. It does not show the tradeoff of this solution: to read one object, you have to read from 6 DCs!
  • Needs clarification : relationships between tiering and erasure code because at the moment it looks like tiering is exclusively for caching
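
The tradeoff raised for the "10 DCs" example can be made concrete with a little arithmetic. This is a minimal sketch, assuming the example is an erasure coded pool with k=6 data chunks and m=4 coding chunks, one chunk per data center. The k and m values are inferred from the "read from 6 DCs" remark and are an assumption, and ec_tradeoff is a hypothetical helper, not part of Ceph.

```python
def ec_tradeoff(k, m):
    """Summarize an erasure coded layout that places one chunk per data center.

    k: number of data chunks (assumed value, see the note above)
    m: number of coding chunks (assumed value, see the note above)
    """
    return {
        "dcs_used": k + m,                # one chunk stored in each data center
        "dcs_read_per_object": k,         # k chunks are needed to rebuild an object
        "dc_failures_tolerated": m,       # up to m chunks may be lost
        "storage_overhead": (k + m) / k,  # raw bytes stored per byte of user data
    }


result = ec_tradeoff(6, 4)
for name, value in result.items():
    print(name, value)
```

Under these assumptions, the layout tolerates the loss of 4 data centers at a lower raw storage cost than full replication, but every read has to gather chunks from 6 data centers.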



  • Needs clarification : when will CephFS be ready for production ?
  • Wish : solid list of show-stoppers to make it prod-ready
  • Needs clarification : fsck tool has yet to be developed, manual repair tools
  • Wish: wiki page for CephFS use cases
    • store files
    • web content for existing non-ceph-aware applications
    • legacy that needs to scale out
    • legacy that needs capacity (RAID arrays can only get so large)
    • backups
    • a filesystem without a SPOF
    • Hadoop / HDFS compatibility
    • reexporting as CIFS/NFS
    • backing existing tools that use a filesystem
    • distributing images to be local for hypervisor nodes in OpenStack
    • SAN/NAS stuff
    • HPC in the context of
    • Lustre alternative
    • reduce storage costs / replace NetApp


  • Wish: asynchronous replication at the rados level
  • Wish: gzip rados class
  • Pain point: more documentation is needed on decoding the log messages from Ceph daemons
  • Pain point: explanations of all the configuration parameters (config_opts.h)
  • Wish: use cache tiering to say "this slow data can be compressed now"
  • Pain point: the mon node frequently floods its log with the same message; the logger could aggregate identical messages
  • Needs clarification: is the cache pool ready for production ?
  • Wish: bandwidth reservations / guarantee that pools have a certain amount of IOPS/throughput available even if other pools are hammering the storage system
  • Pain point: Samba/netatalk on top of RBD is stable but a bit slow; more IOPS are needed
  • Pain point: Ceph 0.72 with Debian's bleeding-edge 3.14-rc7 kernel fails with btrfs corruption, even on a 'very small' Ceph cluster with only 3 guest VMs running the phoronix-test-suite disk test on it.
  • Wish: "hierarchical near" backfilling, based on CRUSH location, e.g. with 4 replicas, 2 in each rack: instead of backfilling from the primary, backfill from an OSD in the same rack
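
The suggestion that the logger aggregate identical messages could look roughly like the following. This is only an illustrative sketch of the idea, not how the Ceph logger works; aggregate_repeats and the sample messages are hypothetical.

```python
def aggregate_repeats(messages):
    """Collapse each run of consecutive identical messages into one line with a count."""
    out = []
    prev, count = None, 0
    for msg in messages:
        if msg == prev:
            count += 1  # same message again: only bump the counter
            continue
        if prev is not None:
            # A different message arrived: flush the previous run.
            out.append(prev if count == 1 else f"{prev} (repeated {count} times)")
        prev, count = msg, 1
    if prev is not None:
        # Flush the final run.
        out.append(prev if count == 1 else f"{prev} (repeated {count} times)")
    return out


# Hypothetical sample messages, not real Ceph log output.
sample = ["same warning", "same warning", "same warning", "something else"]
for line in aggregate_repeats(sample):
    print(line)
```

Only consecutive duplicates are merged, so the order of distinct messages in the log is preserved.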