Feature #62856
open
cephfs: persist an audit log in CephFS
0%
Description
... for quickly learning what disaster tools and commands have been run on the file system.
Too often we see a cluster that is "broken" but it's not clear what may have been done to get it into that state. We have suspicions, but asking a group of folks cooperatively maintaining the cluster may yield incomplete answers. So, it'd be very helpful if we could consult a persistent log which records all recovery actions taken on the cluster.
I think this log, to start, would simply record what commands (e.g. cephfs-journal-tool ...) have been run. We can modify these tools to record that information.
The trivial way to do this would be to link these tools against libcephsqlite, with the database stored in the metadata pool. That database can then be pulled down when collecting data from the cluster and analyzed locally by developers.
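A rough sketch of what such a log could look like, using Python's standard sqlite3 module as a stand-in for libcephsqlite. Everything here is an assumption for illustration: the `audit_log` table, its columns, and the helper names `open_audit_db`, `record_command` and `history` are hypothetical, and the example commands are only sample arguments. A real tool would open the database through the libcephsqlite VFS so that it lives in the metadata pool rather than in a local file or in memory.

```python
import json
import socket
import sqlite3
import time

# Hypothetical schema: one row per invocation of a recovery tool.
SCHEMA = """
CREATE TABLE IF NOT EXISTS audit_log (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,
    ts   REAL NOT NULL,   -- seconds since the epoch
    host TEXT NOT NULL,   -- host the tool was run on
    tool TEXT NOT NULL,   -- argv[0], e.g. cephfs-journal-tool
    argv TEXT NOT NULL    -- full command line, JSON-encoded
)
"""


def open_audit_db(path):
    """Open (and create if needed) the audit database at `path`."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_command(conn, argv, now=None):
    """Append one command line to the audit log."""
    ts = time.time() if now is None else now
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT INTO audit_log (ts, host, tool, argv) VALUES (?, ?, ?, ?)",
            (ts, socket.gethostname(), argv[0], json.dumps(argv)),
        )


def history(conn):
    """Return (ts, tool, argv) tuples in the order they were recorded."""
    cur = conn.execute("SELECT ts, tool, argv FROM audit_log ORDER BY id")
    return [(ts, tool, json.loads(argv)) for ts, tool, argv in cur]


if __name__ == "__main__":
    db = open_audit_db(":memory:")
    record_command(db, ["cephfs-journal-tool", "journal", "reset"])
    for ts, tool, argv in history(db):
        print(ts, " ".join(argv))
```

Each tool would call something like `record_command` with its own argv before doing any work, and a developer analyzing a pulled-down copy of the database would only need the equivalent of `history`.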