Ceph-Brag¶

Summary¶

Ceph-brag is a planned anonymized cluster reporting tool that collects a "registry" of Ceph clusters for community knowledge. The data will be displayed on a public web page, identified only by a UUID by default; users can claim their cluster and publish ownership information if they wish.

Owners¶

Loic Dachary <loic@dachary.org>
Patrick McGarry (Inktank)
Sebastien Han (eNovance)
Sage Weil (Inktank)

Interested Parties¶

Mike Dawson <mike.dawson@cloudapt.com>
Haomai Wang <haomaiwang@gmail.com>
Danny Al-Gaaf <danny.al-gaaf@bisect.de>

Current Status¶

Use the http://wiki.ceph.com/Brag namespace for publication
A teuthology task should be able to publish results if given proper credentials
Setup consists of (declarative, e.g. JSON/XML exportable) Topology, Hardware, OSs, Ceph (setup method, command/script steps and/or chef/puppet artifacts, final running ceph configs on node classes).
Apdex (Application Performance Index) is an open standard developed by an alliance of companies. It defines a standard method for reporting and comparing the performance of software applications in computing. Its purpose is to convert measurements into insights about user satisfaction, by specifying a uniform way to analyze and report on the degree to which measured performance meets user expectations. http://en.wikipedia.org/wiki/Apdex
What is the relationship with https://blueprints.launchpad.net/oslo/+spec/opt-in-stats-tracking ?
Check the new osd metadata as a way to extract information https://github.com/ceph/ceph/pull/843
lshw -xml to extract the hardware configuration
CAS of the RAM https://github.com/enovance/edeploy/blob/master/build/sources/timings.c
megacli on Dell & hpacucli on HP ... to figure out more about disks
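
The Apdex score mentioned in the notes above reduces to a short formula: given a target time T, samples at or below T count as satisfied, samples between T and 4T count as tolerating (weighted one half), and anything slower counts as frustrated. The sketch below only illustrates that formula; which measurements a brag report would feed into it, and what T would be, are not decided here, and the sample values are invented.

```python
def apdex(samples, threshold):
    """Return the Apdex score for response times, given target time T."""
    if not samples:
        raise ValueError("at least one sample is required")
    # Satisfied: at or below T. Tolerating: above T, up to and including 4T.
    satisfied = sum(1 for s in samples if s <= threshold)
    tolerating = sum(1 for s in samples if threshold < s <= 4 * threshold)
    # Frustrated samples contribute nothing to the numerator.
    return (satisfied + tolerating / 2) / len(samples)


# Illustrative latencies in seconds with T = 0.5 (made-up values).
print(apdex([0.1, 0.2, 0.9, 3.0], 0.5))
```

Here two samples are satisfied, one is tolerating and one is frustrated, so the score is (2 + 0.5) / 4.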

Detailed Description¶

The client side is a 'ceph brag' or 'ceph-brag' command. It generates an anonymous lump of JSON and sends it to brag.ceph.com (or similar). The JSON includes:

a unique identifier for the cluster. This is not the cluster fsid but a new UUID, generated once and stored via the config-key interface, so that subsequent ceph-brag commands will re-use the same id.
cluster creation date
number of osds, mons, mdss, pgs
number of bytes, objects, pools
number of bytes, ios read/written
number of unique ips (hosts)?
count of crush items by type (root, rack, host, osd)
per-pool metadata
- replica count
- type (just replicated for now, soon erasure-coded)
os, kernel info (once available)
hardware information (CPUs, RAM, network ... maybe reduced to some basic anonymised data)
ceph version(s)
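
The fields listed above could be assembled roughly as follows. This is a minimal sketch, not the tool's actual format: the JSON field names, the key name `ceph-brag/uuid`, and the sample numbers are all assumptions, and a plain dict stands in for the config-key interface so the example runs without a cluster.

```python
import json
import uuid

UUID_KEY = "ceph-brag/uuid"  # assumed key name; the blueprint does not fix one


def get_brag_uuid(store):
    """Return the brag UUID, creating and saving it on first use.

    `store` is a dict standing in for the config-key interface.
    """
    if UUID_KEY not in store:
        store[UUID_KEY] = str(uuid.uuid4())
    return store[UUID_KEY]


def build_report(store, stats):
    """Assemble the anonymous report from already-collected statistics."""
    count_keys = ("osds", "mons", "mdss", "pgs", "bytes", "objects", "pools")
    return {
        "uuid": get_brag_uuid(store),  # not the cluster fsid
        "cluster_creation_date": stats["created"],
        "counts": {k: stats[k] for k in count_keys},
        "crush_types": stats["crush_types"],
        "pools": stats["pools_meta"],
        "ceph_versions": stats["ceph_versions"],
    }


store = {}  # hypothetical in-memory stand-in for the config-key store
sample = {  # invented values, for illustration only
    "created": "2013-11-01",
    "osds": 12, "mons": 3, "mdss": 1, "pgs": 1024,
    "bytes": 10 ** 12, "objects": 250000, "pools": 3,
    "crush_types": {"root": 1, "rack": 2, "host": 4, "osd": 12},
    "pools_meta": [{"replica_count": 3, "type": "rep"}],
    "ceph_versions": ["0.72"],
}
print(json.dumps(build_report(store, sample), indent=2))
```

Because the UUID is read back from the store after the first call, repeated reports from the same cluster carry the same identifier, which is what lets the server recognise updates.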

On each brag, dump the current ownership information for the cluster. By default this is empty/undefined. ceph-brag options can be used to update the following fields:

Name
Organization
Contact email
Use-case (rgw, openstack, genomics, hpc, log archival, backups, whatever)

We need to make sure that no critical information is exposed to the public that could be used to run, e.g., exploits or DDoS attacks against a cluster. This is critical; otherwise no big company will ever expose any information via this tool. Even providing information about the ceph version in use could be a problem.

Usage:

ceph-brag # generate brag json, dump to stdout
ceph-brag publish # post it!
ceph-brag update-metadata --name ... --organization ... --email ... --description ...
ceph-brag clear-metadata
ceph-brag unpublish --yes-i-am-shy
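
The usage above maps naturally onto subcommands. The following is one possible sketch using Python's `argparse`; the command and option names come from the usage lines, while the function name `build_parser`, the help strings, and the choice to make `--yes-i-am-shy` mandatory are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the proposed ceph-brag usage."""
    parser = argparse.ArgumentParser(prog="ceph-brag")
    # With no subcommand, args.command is None: generate JSON and dump it.
    sub = parser.add_subparsers(dest="command")

    sub.add_parser("publish", help="post the report")

    update = sub.add_parser("update-metadata", help="set ownership fields")
    for field in ("name", "organization", "email", "description"):
        update.add_argument("--" + field)

    sub.add_parser("clear-metadata", help="reset ownership fields")

    unpublish = sub.add_parser("unpublish", help="withdraw the report")
    # Assumed to be a mandatory confirmation flag.
    unpublish.add_argument("--yes-i-am-shy", action="store_true", required=True)
    return parser


args = build_parser().parse_args(["update-metadata", "--name", "Example"])
print(args.command, args.name)
```

Requiring the confirmation flag means a bare `ceph-brag unpublish` exits with a usage error instead of silently withdrawing the report.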

Server side is some WSGI or similar modern/cute REST endpoint. Simply logs the result to a database.
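
A minimal sketch of such an endpoint, using only the Python standard library, might look like this. The table layout, the response bodies, and the requirement that the report carry a `uuid` field are assumptions made for illustration; an in-memory SQLite database stands in for whatever database the real server would use.

```python
import json
import sqlite3

# In-memory database for the sketch; a real server would use persistent storage.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE reports ("
    "uuid TEXT, received TEXT DEFAULT CURRENT_TIMESTAMP, body TEXT)"
)


def app(environ, start_response):
    """WSGI application: accept a POSTed JSON report and log it."""
    if environ["REQUEST_METHOD"] != "POST":
        start_response("405 Method Not Allowed", [("Allow", "POST")])
        return [b""]
    length = int(environ.get("CONTENT_LENGTH") or 0)
    raw = environ["wsgi.input"].read(length)
    try:
        report = json.loads(raw)
        cluster_id = str(report["uuid"])
    except (ValueError, KeyError, TypeError):
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [b"expected a JSON object with a uuid field\n"]
    # Store the re-serialized report alongside its cluster identifier.
    db.execute(
        "INSERT INTO reports (uuid, body) VALUES (?, ?)",
        (cluster_id, json.dumps(report)),
    )
    db.commit()
    start_response("201 Created", [("Content-Type", "text/plain")])
    return [b"ok\n"]
```

For local experiments the application can be served with `wsgiref.simple_server.make_server` from the standard library; any WSGI server would work equally well.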

Work items¶

Coding tasks¶

ceph-brag
1. generate all the json
ceph-brag publish
ceph-brag update-metadata
ceph-brag clear-metadata
ceph-brag unpublish
ceph-brag server
basic tool to summarize results
1. number of clusters, bytes, objects
2. os, ceph version histograms
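
The summary tool described above could start out as something like the sketch below. It assumes each stored report is a dict with `uuid`, `bytes`, `objects`, `os` and `ceph_version` keys, which are hypothetical field names, and it keeps only the most recent report per cluster so that repeated brags are not double-counted.

```python
from collections import Counter


def summarize(reports):
    """Summarize reports given in the order they were received."""
    latest = {}
    for report in reports:
        # A later report from the same cluster replaces the earlier one.
        latest[report["uuid"]] = report
    clusters = list(latest.values())
    return {
        "clusters": len(clusters),
        "bytes": sum(r["bytes"] for r in clusters),
        "objects": sum(r["objects"] for r in clusters),
        "os": Counter(r["os"] for r in clusters),
        "ceph_version": Counter(r["ceph_version"] for r in clusters),
    }


# Invented sample data: cluster "a" reports twice, cluster "b" once.
reports = [
    {"uuid": "a", "bytes": 100, "objects": 10, "os": "linux", "ceph_version": "0.67"},
    {"uuid": "b", "bytes": 300, "objects": 30, "os": "linux", "ceph_version": "0.72"},
    {"uuid": "a", "bytes": 200, "objects": 20, "os": "linux", "ceph_version": "0.72"},
]
print(summarize(reports))
```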

Build / release tasks¶

deploy brag server to ceph.com

Documentation tasks¶

document security implications of ceph-brag
document how to obtain the public database
1. programmatically, or by request from a human?

Updated by Jessica Mack almost 9 years ago · 1 revisions
