Feature #50614
closed
[pwl] enhance "rbd status" output and periodically update it
0%
Description
The "Image cache state" section is very confusing because it is effectively a snapshot from the time the cache was loaded. It is not updated until the cache is closed in an orderly fashion, so a dirty cache can be reported as clean, and so on...
Also, no metrics of any kind are included. It shouldn't take an admin socket, two different configuration options and a grep through debug output and/or raw perf counters to get an idea of how the cache is doing.
Updated by Ilya Dryomov almost 3 years ago
- Related to Bug #50613: [pwl] "rbd status" output is incorrect added
Updated by CONGMIN YIN over 2 years ago
no metrics of any kind are included. It shouldn't take an admin socket, two different configuration options and a grep through debug output and/or raw perf counters to get an idea of how the cache is doing.
Hi @Ilya Dryomov, I don't quite understand the second sentence in the description. Can you explain the problem more?
Updated by Ilya Dryomov over 2 years ago
There is some useful data collected in the form of perf counters, such as the number of hits, the number of bytes read from the cache, various latencies, etc. See AbstractWriteLog::perf_start(). But perf counters are rather hard to access on the client side: an admin socket may not be set up, and if it is set up, one needs to find the right one and then manually grab the data with "ceph --admin-daemon ... perf dump" or similar. If the workload is restarted, a different admin socket usually gets created, so automating the collection and aggregation with external tools is a pain.
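To give a sense of what that manual collection involves once a "perf dump" has been captured, here is a minimal Python sketch that picks the relevant sections out of the JSON and derives a read hit ratio. The section name ("pwl_example") and the counter names ("rd_req", "rd_hit_req") are hypothetical placeholders, not confirmed names from AbstractWriteLog::perf_start(), and the sample JSON is made up purely to exercise the code.

```python
import json


def find_sections(perf_dump, marker):
    """Return the perf counter sections whose name contains `marker`."""
    return {
        name: counters
        for name, counters in perf_dump.items()
        if marker in name and isinstance(counters, dict)
    }


def hit_ratio(counters, total_key, hit_key):
    """Compute hits / total, or None when no requests were counted."""
    total = counters.get(total_key, 0)
    if not total:
        return None
    return counters.get(hit_key, 0) / total


# Made-up stand-in for saved "perf dump" output; real section and
# counter names will differ (assumption, see the note above).
SAMPLE = '{"pwl_example": {"rd_req": 200, "rd_hit_req": 150}, "other": {"x": 1}}'

if __name__ == "__main__":
    dump = json.loads(SAMPLE)
    for name, counters in find_sections(dump, "pwl").items():
        print(name, hit_ratio(counters, "rd_req", "rd_hit_req"))
```

Even with a helper like this, something still has to locate the right admin socket after every workload restart, which is the part that makes external aggregation painful.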
A "grep through debug output" refers to AbstractWriteLog::periodic_stats(). Again, very useful data, but in order to get to it, one needs to set "debug rbd pwl = 1" and "rbd_persistent_cache_log_periodic_stats = true" and grep the log file for "STATS:". And again, the log file may not be set up, etc.
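The "grep the log file" step can be sketched in Python as below. Only the "STATS:" marker comes from the description above; the sample log lines, their prefixes and their payloads are invented for illustration and do not reflect the real periodic_stats() output format.

```python
def stats_lines(lines, marker="STATS:"):
    """Yield the part of each log line starting at `marker`."""
    for line in lines:
        pos = line.find(marker)
        if pos != -1:
            yield line[pos:].rstrip("\n")


# Invented sample lines (assumption); a real client log looks different.
SAMPLE_LOG = [
    "ts1 example-subsystem: STATS: example-payload-1\n",
    "ts2 example-subsystem: unrelated message\n",
    "ts3 example-subsystem: STATS: example-payload-2\n",
]

if __name__ == "__main__":
    for entry in stats_lines(SAMPLE_LOG):
        print(entry)
```

In practice the input would be an open client log file rather than a list, and that only works if both options are set and a log file is configured in the first place.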
"rbd status" should be taught to report some of this data. Not all of it -- just what is immediately useful to the end user, some kind of "at a glance" view.
Updated by Ilya Dryomov about 2 years ago
- Status changed from New to In Progress
- Pull request ID set to 45684
The "dirty cache can be reported as clean" part has been addressed in https://github.com/ceph/ceph/pull/45660.
Updated by Ilya Dryomov about 2 years ago
- Status changed from In Progress to Pending Backport
Updated by Backport Bot about 2 years ago
- Copied to Backport #55292: pacific: [pwl] enhance "rbd status" output and periodically update it added
Updated by Backport Bot about 2 years ago
- Copied to Backport #55293: quincy: [pwl] enhance "rbd status" output and periodically update it added
Updated by Ilya Dryomov almost 2 years ago
- Status changed from Pending Backport to Resolved