h1. Calamari/api/hardware/storage

h3. Summary

In a large distributed system, things are always happening. We care more about the causes and implications of these events than about the constant stream itself.
This is a design to connect the events happening to hardware with the Ceph constructs they affect.

h3. Owners

* Gregory Meno (Red Hat), Pacific
* Joe Handzik (HP), Central

h3. Interested Parties

If you are interested in contributing to this blueprint, or want to be a "speaker" during the Summit session, list your name here.

* Name (Affiliation)
* Name (Affiliation)
* Name

h3. Current Status

planning and implementation

h3. Detailed Description

First I’d like to make an exciting announcement:

Packages for Calamari and Romana are available on download.ceph.com.

What packages?
calamari-server, calamari-clients (romana), and diamond

OK, I’ve got these packages; what do I do with them?
http://calamari.readthedocs.org/en/latest/operations/server_install.html

What is the plan going forward?
* get nightly test suites running
* public-facing build infrastructure

What distributions are supported?
* CentOS
* Ubuntu

When will packages for distribution XYZ be provided?
* when volunteers emerge to lead the effort
* Fedora 21+ is planned

Now let’s talk about hardware and Ceph.

In a large distributed system, things are always happening. We care more about the causes and implications of these events than about the constant stream itself.
[JH] - Cause is key, definitely. We may want to consider how best to store a stream of events, though, for post-event trend analysis. At large scales, a bad batch of drives can be identified early via I/O trends, drive health, and failure identification (for example). Definitely not our first priority here though, I agree.
This is a design to connect the events happening to hardware with the Ceph constructs they affect.

OSD.128 is down? That’s on host foo_bar_5... but which drive is that? Is the failure software or hardware? What do I replace it with? How long has it been failing?
These questions probably sound familiar if you operate a Ceph cluster. We want to improve our ability to answer them by implementing a new storage hardware API.

* OSDs have storage hardware
* Storage hardware has events
* Events can inform proper corrective action.

Example:

<pre>
api/v2/hardware/storage    # provides a list of all known storage
</pre>

Thoughts:
* This data should be paginated (see the example request after the questions below)

Questions:
* What are the ways we’d like to filter this data? By host, by manufacturer, by service, by has_error?
* by_service filtering would be an indirect way to learn about all the hardware that backs a pool. Should we just filter by_pool?
* How do we apply SES commands to this endpoint?
** Not just SES commands, but other CLI commands too. Like I mentioned in my email, I’d like some direction from users here if we can get it, but it’s not essential for the first wave of things I’d expect us to implement.
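
To make the pagination and filtering questions concrete, here is a minimal sketch of what a client query could look like. Nothing in it is settled: the base URL, credentials, the filter parameter names (host, has_error), the page parameter, and the paginated "results" envelope are all placeholders for discussion.

<pre><code class="python">
import requests

# Hypothetical query against the proposed list endpoint. The base URL,
# credentials, and filter/pagination parameter names are placeholders.
API = "http://calamari.example.com/api/v2"

resp = requests.get(
    API + "/hardware/storage",
    params={"host": "foo_bar_5", "has_error": "true", "page": 1},
    auth=("admin", "admin"),
)
resp.raise_for_status()

# Assuming a paginated envelope with a "results" list, print one short line
# per device so an operator can spot a failing drive quickly.
for device in resp.json()["results"]:
    print("%s/%s: %s" % (device["host"], device["drive"], device["status"]))
</code></pre>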
71
72
api/v2/hardware/storage/SCSI_WWID
73
host # foo_bar_5
74
drive # sdb
75
type # something that makes sense to a human
76
capacity
77
usage
78
manufacturer
79
FW version
80
serial #
81
services
82
[osd.128, mon1]
83
last_event
84
status
85
full_status
86
timestamp
87
event_message # human readable
88
error_rate ?
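
Purely for illustration, the fields above might serialize into a document like the one below; every value is invented, and the exact shape (flat fields vs. a nested last_event) is still open.

<pre><code class="python">
# Invented example of one drive's representation; field names follow the
# outline above, values and nesting are assumptions for discussion only.
example_device = {
    "host": "foo_bar_5",
    "drive": "sdb",
    "type": "7200 RPM SAS HDD",          # something that makes sense to a human
    "capacity": 4000787030016,           # bytes
    "usage": 0.62,                       # fraction of capacity in use
    "manufacturer": "ExampleCorp",
    "fw_version": "ABC1",
    "serial": "S0EXAMPLE123",
    "services": ["osd.128", "mon1"],
    "last_event": {
        "status": "warning",
        "full_status": {},               # e.g. raw SMART attributes
        "timestamp": "2015-07-01T18:09:00Z",
        "event_message": "Reallocated sector count increased",
    },
    "error_rate": None,                  # open question, see below
}
</code></pre>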

Questions:
* Calamari currently stores all state in memory and serializes it to persistent storage for crash recovery. Will raw SMART data for the most recent status on the largest clusters overwhelm a typical Calamari installation?
** Fair question, I’m not sure. Probably worth pushing this down into a log file of some sort regardless (see the sketch after this list).
* Would error-rate be an effective way to work around this limitation?
** I don’t know that we should squash SMART into a single number; I think John mentioned that in his feedback and I tend to agree. I’d prefer the workaround I listed under #1.
* How do we provide meaningful baseline data for identifying outliers? Is that part of this API?
** At first, no. It might be interesting to consider a world where all Ceph users can contribute to a database of sorts, where we globalize trend analysis of drive failures. Pie-in-the-sky, yes. But it’d make for a heck of a demo :)
* Is there any additional metadata that should be collected / presented?
* Is SCSI_WWID the correct persistent identifier for storage?
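
As a sketch of the "push it down into a log file" idea: keep only a small per-drive summary in memory to back the API, and append the raw SMART report to an on-disk log for later trend analysis. The smartctl flags, log path, and health check below are assumptions, and the real version would live in the agent (diamond / salt minion) rather than a standalone script.

<pre><code class="python">
import json
import subprocess
import time

# Assumed log location; the real integration point is an open question.
LOG_PATH = "/var/log/calamari/smart-%s.log"

latest_status = {}  # drive name -> small summary kept in memory for the API

def poll_drive(dev):
    # Grab health and attribute data; flags are illustrative. smartctl uses a
    # bitmask exit status, so read stdout directly instead of check_output.
    proc = subprocess.Popen(
        ["smartctl", "-H", "-A", "/dev/%s" % dev],
        stdout=subprocess.PIPE, universal_newlines=True,
    )
    raw, _ = proc.communicate()
    # Append the full raw report to a per-drive log instead of holding it in RAM.
    with open(LOG_PATH % dev, "a") as log:
        log.write(json.dumps({"ts": time.time(), "raw": raw}) + "\n")
    # Keep only a tiny summary in memory to back status/last_event.
    healthy = "PASSED" in raw or "OK" in raw   # ATA vs. SAS wording, roughly
    latest_status[dev] = {"ts": time.time(),
                          "status": "ok" if healthy else "warning"}

poll_drive("sdb")
</code></pre>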

Resources:

* https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/5/html/Online_Storage_Reconfiguration_Guide/persistent_naming.html

Relevant thoughts on sysfs:

* https://www.kernel.org/pub/linux/kernel/people/mochel/doc/papers/ols-2005/mochel.pdf

Relevant components of /sys (a short sketch of reading these attributes follows the links below):

* /sys/block/sd<letter>
* /sys/class/block
* /sys/block/sd<letter>/device (symlink to the actual device)
* /sys/class/enclosure

* http://www.cs.fsu.edu/~baker/devices/lxr/http/source/linux/drivers/misc/enclosure.c
* http://lxr.free-electrons.com/source/include/scsi/scsi_device.h (the vpd_pg83 stuff is relevant)
* http://lxr.free-electrons.com/source/drivers/scsi/scsi.c
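
A small sketch of reading those /sys attributes, to show where the identifying data could come from. Attribute availability varies by kernel and transport; the wwid file (derived from the VPD page 0x83 data referenced above) is not present on every setup, so everything here is best-effort.

<pre><code class="python">
import glob
import os

def read_attr(path):
    """Return the stripped contents of a sysfs attribute, or None if absent."""
    try:
        with open(path) as f:
            return f.read().strip()
    except IOError:
        return None

# Walk SCSI-style block devices and collect a few identifying attributes.
for blockdir in sorted(glob.glob("/sys/block/sd*")):
    name = os.path.basename(blockdir)
    dev = os.path.join(blockdir, "device")   # symlink to the actual device
    info = {
        "drive": name,
        "wwid": read_attr(os.path.join(dev, "wwid")),       # from VPD page 0x83, kernel-dependent
        "vendor": read_attr(os.path.join(dev, "vendor")),
        "model": read_attr(os.path.join(dev, "model")),
        "sectors": read_attr(os.path.join(blockdir, "size")),  # 512-byte sectors
    }
    print(info)
</code></pre>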

Tools:

* sg3_utils
* sgpio: http://linux.die.net/man/1/sgpio
* sg_ses: http://sg.danny.cz/sg/sg_ses.html
* sg_vpd: http://linux.die.net/man/8/sg_vpd

h3. Work items

This section should contain a list of work tasks created by this blueprint. Please include engineering tasks as well as related build/release and documentation work. If this blueprint requires cleanup of deprecated features, please list those tasks as well.

h3. Coding tasks

* https://github.com/ceph/ceph/pull/4699
* pull OSD hardware info into Calamari
* write checks for the storage hardware
* Task 3

h3. Build / release tasks

* Task 1
* Task 2
* Task 3

h3. Documentation tasks

* Task 1
* Task 2
* Task 3

h3. Deprecation tasks

* Task 1
* Task 2
* Task 3