Feature #62856
open
cephfs: persist an audit log in CephFS
0%
Description
... for quickly learning what disaster tools and commands have been run on the file system.
Too often we see a cluster which is "broken" but it's not clear what may have been done to get it to that state. We have suspicions but asking a group of folks cooperatively maintaining the cluster may yield incomplete answers. So, it'd be very helpful if we could consult a persistent log which records all recovery actions taken on the cluster.
I think this log, to start, would simply record what commands (e.g. cephfs-journal-tool ...) have been run. We can modify these tools to record that information.
The trivial way to do this would be to link these tools to libcephsqlite where a database would be stored in the metadata pool. That database can then be pulled down when collecting data from the cluster and analyzed locally by developers.
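A minimal sketch of what such a log could look like, assuming a single SQLite table. The table name `audit_log`, its columns, and the helpers `open_audit_db`, `record_command` and `list_commands` are hypothetical and not part of any existing Ceph tool. The sketch uses Python's standard `sqlite3` module against a local or in-memory database; with libcephsqlite the same schema would presumably be opened through the Ceph SQLite VFS so that the database lives in the metadata pool, but that wiring is not shown here.

```python
import getpass
import json
import socket
import sqlite3
import time

# Hypothetical schema: one row per tool invocation.
SCHEMA = """
CREATE TABLE IF NOT EXISTS audit_log (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,
    ts   REAL NOT NULL,   -- UNIX timestamp of the invocation
    host TEXT NOT NULL,   -- host the tool was run on
    user TEXT NOT NULL,   -- local user who ran it
    argv TEXT NOT NULL    -- full command line, JSON-encoded list
)
"""


def open_audit_db(path=":memory:"):
    """Open (and create if needed) the audit database.

    Assumption: a plain local SQLite file or in-memory database stands in
    for a database stored in the metadata pool via libcephsqlite.
    """
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def _current_user():
    # getpass.getuser() can fail in minimal environments; fall back.
    try:
        return getpass.getuser()
    except Exception:
        return "unknown"


def record_command(conn, argv):
    """Append one command invocation to the log and return its row id."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "INSERT INTO audit_log (ts, host, user, argv) VALUES (?, ?, ?, ?)",
            (time.time(), socket.gethostname(), _current_user(), json.dumps(argv)),
        )
    return cur.lastrowid


def list_commands(conn):
    """Return the recorded command lines in the order they were run."""
    rows = conn.execute("SELECT argv FROM audit_log ORDER BY id").fetchall()
    return [json.loads(row[0]) for row in rows]
```

A tool would call `record_command(conn, sys.argv)` once at startup, before it does any work. For example, `record_command(conn, ["cephfs-journal-tool", "journal", "inspect"])` (an illustrative command line) adds one row, and `list_commands(conn)` returns the full history for later analysis.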
Updated by Patrick Donnelly 8 months ago
- Blocked by Feature #62884: audit: create audit module which persists in RADOS important operations performed on the cluster added
Updated by Patrick Donnelly 8 months ago
We discussed this in standup today. We are now considering a design with a new "audit" module in the ceph-mgr.
Updated by Patrick Donnelly 13 days ago
- Target version changed from v19.0.0 to v20.0.0