Ceph User Committee meeting 2014-04-03
The agenda was:
Documentation of the new Firefly features (tiering, erasure code)
- Good: the second result for https://www.google.com/search?q=erasure+code+ceph is https://ceph.com/docs/master/dev/erasure-coded-pool/
- Pain point : ease of use of tiering and erasure code
- Needs clarification : is erasure code beneficial to smaller users ?
- Wish : more tiering and erasure code use cases
- Needs clarification : does erasure code require more work for the MONs ?
- Wish : are there any plans for "glued objects", i.e. adding a bunch of small objects together into one large blob, then erasure coding that blob ?
- Needs clarification : the "10 DCs" example does not show the tradeoff of this solution: to read one object, you have to read from 6 DCs!
- Needs clarification : relationships between tiering and erasure code because at the moment it looks like tiering is exclusively for caching
- Needs clarification : when will CephFS be ready for production ?
- Wish : solid list of show-stoppers to make it prod-ready
- Needs clarification : fsck tool has yet to be developed, manual repair tools
- Wish: wiki page for CephFS use cases
  - store files
  - web content for existing non-ceph-aware applications
  - legacy applications that need to scale out
  - legacy applications that need capacity (RAID arrays can only get so large)
  - a filesystem without a SPOF
  - hadoop / HDFS compatibility
  - reexporting as cifs/nfs
  - backing existing tools that use FS
  - distributing images to be local for hypervisor nodes in openstack
  - SAN/NAS stuff
  - HPC in the context of http://www.castep.org/
  - lustre alternative
  - reduce storage costs / replace netapp
- Wish: asynchronous replication at the rados level
- Wish: gzip rados class
- Pain point: more documentation is needed on decoding the log messages from Ceph daemons
- Pain point: explanations of all the configuration parameters (config_opts.h)
- Wish: use cache tiering to say "this slow data can be compressed now"
- Pain point: the mon node frequently floods its log with the same message; the logger could aggregate identical messages
- Needs clarification: is the cache pool ready for production ?
- Wish: bandwidth reservations / guarantee that pools have a certain amount of IOPS/throughput available even if other pools are hammering the storage system
- Pain point: samba/netatalk on top of RBD is stable, a bit slow though, need more IOPS
- Pain point: ceph 0.72 with debian's bleeding edge 3.14-rc7 kernel fails with btrfs corruption, even on a 'very small' ceph cluster with only 3 guest VMs running the phoronix-test-suite disk test on it.
- Wish: "hierarchical near" backfilling, based on crush location, i.e. with 4 replicas, 2 in each rack: instead of backfilling from the primary, backfill from an OSD in the same rack
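
The read tradeoff raised above for the "10 DCs" erasure code example can be made concrete with a little arithmetic. The sketch below is only an illustration, not taken from the Ceph documentation: it assumes the example corresponds to k=6 data chunks and m=4 coding chunks with one chunk per DC (inferred from "10 DCs" and "read from 6 DCs"), and the helper names `ec_profile` and `replica_profile` are hypothetical.

```python
from fractions import Fraction


def ec_profile(k: int, m: int) -> dict:
    """Summarize an erasure-coded layout with k data and m coding chunks.

    Assumption: each chunk lives in a different location (e.g. one per DC).
    """
    return {
        "locations": k + m,                      # chunks written per object
        "reads_per_object": k,                   # chunks needed to rebuild it
        "failures_tolerated": m,                 # chunks that may be lost
        "storage_overhead": Fraction(k + m, k),  # raw bytes per usable byte
    }


def replica_profile(n: int) -> dict:
    """Same summary for plain n-way replication, for comparison."""
    return {
        "locations": n,
        "reads_per_object": 1,                   # any single copy is enough
        "failures_tolerated": n - 1,
        "storage_overhead": Fraction(n, 1),
    }


if __name__ == "__main__":
    # k=6, m=4 is an assumed reading of the "10 DCs" example.
    for name, profile in (("EC k=6 m=4", ec_profile(6, 4)),
                          ("3 replicas", replica_profile(3))):
        print(name)
        for key, value in profile.items():
            print(f"  {key}: {value}")
```

Under these assumptions the erasure-coded layout stores 5/3 of the raw data instead of 3 times, and tolerates 4 lost chunks, but every read touches 6 locations where replication touches 1. That is the cost the documentation example would need to spell out.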