Ceph-brag will be an anonymized cluster-reporting tool that collects a "registry" of Ceph clusters for community knowledge. The data will be displayed on a public web page, identified only by UUID by default, but users can claim their cluster and publish ownership information if they so desire.
- Loic Dachary <firstname.lastname@example.org>
- Patrick McGarry (Inktank)
- Sebastien Han (eNovance)
- Sage Weil (Inktank)
- Mike Dawson <email@example.com>
- Haomai Wang <firstname.lastname@example.org>
- Danny Al-Gaaf <email@example.com>
- Use the http://wiki.ceph.com/Brag namespace for publication
- A teuthology task should be able to publish results if given proper credentials
- Setup consists of a declarative (e.g. JSON/XML-exportable) description of topology, hardware, OSs, and Ceph (setup method, command/script steps and/or Chef/Puppet artifacts, and the final running Ceph configs on node classes).
- Apdex (Application Performance Index) is an open standard developed by an alliance of companies. It defines a standard method for reporting and comparing the performance of software applications in computing. Its purpose is to convert measurements into insights about user satisfaction, by specifying a uniform way to analyze and report on the degree to which measured performance meets user expectations. http://en.wikipedia.org/wiki/Apdex
- What is the relationship with https://blueprints.launchpad.net/oslo/+spec/opt-in-stats-tracking ?
- Check the new osd metadata as a way to extract information https://github.com/ceph/ceph/pull/843
- lshw -xml to extract the hardware configuration
- CAS latency of the RAM https://github.com/enovance/edeploy/blob/master/build/sources/timings.c
- megacli on Dell & hpacucli on HP ... to learn more about the disks
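The Apdex score mentioned above reduces to a simple formula: satisfied samples count fully, tolerating samples count half, frustrated samples not at all. A minimal sketch (the sample counts are made up for illustration):

```python
def apdex(satisfied, tolerating, total):
    """Apdex = (satisfied + tolerating / 2) / total, in [0, 1]."""
    if total == 0:
        return None
    return (satisfied + tolerating / 2.0) / total

# e.g. 170 satisfied and 20 tolerating out of 200 samples
score = apdex(170, 20, 200)  # 0.9
```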
Detailed Description¶
The client side is a 'ceph brag' or 'ceph-brag' command. It generates a lump of anonymous JSON and sends it to brag.ceph.com (or similar). The report includes:
- a unique identifier for the cluster. This is not the cluster fsid, but a new UUID, generated once and stored via the config-key interface, so that subsequent ceph-brag runs re-use the same id.
- cluster creation date
- number of osds, mons, mdss, pgs
- number of bytes, objects, pools
- number of bytes, ios read/written
- number of unique ips (hosts)?
- count of crush items by type (root, rack, host, osd)
- per-pool metadata
- replica count
- type (just rep for now, soon ec)
- os, kernel info (once available)
- hardware information (CPUs, RAM, network ... maybe reduced to some basic anonymized data)
- ceph version(s)
- Contact email
- Use-case (rgw, openstack, genomics, hpc, log archival, backups, whatever)
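The fields above could be assembled into a single JSON document. The sketch below shows one plausible shape for that payload; the field names and values are illustrative assumptions, not a final wire format:

```python
import json
import uuid

# Hypothetical brag payload -- field names are assumptions for illustration.
report = {
    "uuid": str(uuid.uuid4()),  # generated once, NOT the cluster fsid
    "cluster_creation_date": "2014-01-15",
    "components_count": {"num_osds": 12, "num_mons": 3,
                         "num_mdss": 1, "num_pgs": 1024},
    "crush_types": {"root": 1, "rack": 2, "host": 6, "osd": 12},
    "pools": [{"size": 3, "type": "rep"}],  # per-pool metadata
    "ceph_version": "0.72.2",
    "ownership": {"email": None, "use_case": None},  # empty until claimed
}
payload = json.dumps(report)
```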
We need to make sure not to expose any critical information to the public that could be used, for example, to run exploits or DDoS attacks against a cluster. This is critical; otherwise no big company will ever expose any information via this tool. Even publishing the Ceph version in use could be a problem.
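One way to report aggregate facts (such as the number of unique hosts) without leaking hostnames or IPs is to count salted one-way hashes instead of the raw identifiers. A minimal sketch, assuming a per-cluster secret salt that never leaves the site:

```python
import hashlib

def anonymize(identifier, salt):
    """Salted one-way hash so raw hostnames/IPs are never transmitted."""
    return hashlib.sha256(salt.encode() + identifier.encode()).hexdigest()

# hypothetical host list; only the count of distinct hashes is reported
hosts = ["node1.internal", "node2.internal", "node1.internal"]
unique_hosts = len({anonymize(h, "per-cluster-secret") for h in hosts})  # 2
```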
ceph-brag # generate brag json, dump to stdout
ceph-brag publish # post it!
ceph-brag update-metadata --name ... --organization ... --email ... --description ...
ceph-brag unpublish --yes-i-am-shy
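The subcommands above could be wired up with argparse. This is a hypothetical CLI skeleton, not the actual implementation; the option names simply mirror the sketch above:

```python
import argparse

def build_parser():
    """Skeleton for the ceph-brag subcommands sketched above (illustrative)."""
    parser = argparse.ArgumentParser(prog="ceph-brag")
    sub = parser.add_subparsers(dest="command")
    sub.add_parser("publish", help="post the brag json")
    meta = sub.add_parser("update-metadata", help="attach ownership info")
    for opt in ("--name", "--organization", "--email", "--description"):
        meta.add_argument(opt)
    unpub = sub.add_parser("unpublish")
    unpub.add_argument("--yes-i-am-shy", action="store_true", required=True)
    return parser

args = build_parser().parse_args(["update-metadata", "--email", "a@b.c"])
```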
The server side is a WSGI or similar modern/cute REST endpoint that simply logs the result to a database.
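A minimal sketch of such an endpoint, assuming it only needs to accept a POSTed JSON blob and store it (an in-memory list stands in for the real database):

```python
import json

REPORTS = []  # stand-in for the real database

def application(environ, start_response):
    """Minimal WSGI app: accept a POSTed brag json and log it."""
    if environ["REQUEST_METHOD"] == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length)
        REPORTS.append(json.loads(body))
        start_response("201 Created", [("Content-Type", "text/plain")])
        return [b"stored\n"]
    start_response("405 Method Not Allowed", [("Content-Type", "text/plain")])
    return [b""]
```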
- generate all the json
- ceph-brag publish
- ceph-brag update-metadata
- ceph-brag clear-metadata
- ceph-brag unpublish
- ceph-brag server
- basic tool to summarize results
- number of clusters, bytes, objects
- os, ceph version histograms
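The summaries above are simple aggregations over the published reports. A toy sketch with `collections.Counter` (the report field names are assumptions carried over from the payload sketch):

```python
from collections import Counter

# hypothetical published reports
reports = [
    {"ceph_version": "0.72.2", "os": "ubuntu", "num_bytes": 10 * 2**40},
    {"ceph_version": "0.72.2", "os": "centos", "num_bytes": 4 * 2**40},
    {"ceph_version": "0.67.5", "os": "ubuntu", "num_bytes": 1 * 2**40},
]

num_clusters = len(reports)
total_bytes = sum(r["num_bytes"] for r in reports)
version_histogram = Counter(r["ceph_version"] for r in reports)
os_histogram = Counter(r["os"] for r in reports)
```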
Build / release tasks¶
- deploy brag server to ceph.com
- document security implications of ceph-brag
- document how to obtain the public database
- programmatically, or by request from a human?