Feature #3805
open
Added by Sage Weil over 11 years ago.
Updated over 11 years ago.
Description
If a log message comes through that duplicates the previous one, increment a counter and log it only once, with a "(repeated N times)" type message.
Related issues
1 (1 open — 0 closed)
What kind of dups are we trying to detect?
This sounds to me like a wishlist item that requires much more work to be useful than we'd like. From Deb's previous ticket comments I think she'd like to see similar outputs (but with different entities associated) being compressed down. But with most of those kinds of messages, the entity is the important part on the rare occasion when we care about the log message.
I tend to think there aren't very many dups we could usefully compress. It's pretty easy to add a one-string buffer to compare everything but the timestamp, but I suspect it also wouldn't compress very much. I left 3775 as "need more info" to suggest some strings that could profitably be removed or compressed because they have little useful info in them, but I suspect there aren't many.
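For reference, here is a minimal sketch of the one-string buffer idea, in Python rather than in the actual logging code. It assumes each line has the form "<timestamp> <body>" with no spaces in the timestamp; the class name `DupSuppressor`, the `emit` callback and the wording of the summary line are all made up for illustration.

```python
class DupSuppressor:
    """Collapse consecutive log lines that differ only in their timestamp."""

    def __init__(self, emit):
        self.emit = emit        # callable that receives each output line
        self.last_body = None   # the one-string buffer: body of the previous line
        self.repeats = 0        # duplicates swallowed since last_body was emitted

    @staticmethod
    def body_of(line):
        # Assumed layout: "<timestamp> <body>"; real log lines may differ.
        _stamp, _sep, body = line.partition(" ")
        return body

    def log(self, line):
        body = self.body_of(line)
        if body == self.last_body:
            # Same text as the previous line: count it, print nothing.
            self.repeats += 1
            return
        self.flush()
        self.last_body = body
        self.emit(line)

    def flush(self):
        # Report swallowed duplicates, if any, then reset the counter.
        if self.repeats:
            self.emit(f"(last message repeated {self.repeats} times)")
        self.repeats = 0


out = []
suppressor = DupSuppressor(out.append)
for line in [
    "10:00:01 clock skew detected on mon.a",
    "10:00:02 clock skew detected on mon.a",
    "10:00:03 clock skew detected on mon.a",
    "10:00:04 osd.3 marked down",
]:
    suppressor.log(line)
suppressor.flush()
```

With this input, `out` holds the first clock skew line, one "(last message repeated 2 times)" line, and the osd.3 line. Note that only exact consecutive repeats collapse, which is why it probably wouldn't compress much when messages from different entities interleave.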
The one that comes to mind is "no heartbeat from osd.foo since timestamp bar" messages. We could try to identify the few cases where this does happen (grep the mailing list maybe?), and add appropriate backoff/escalation logic to those cases. I suspect that repeated messages either can be ignored and should be summarized (as Sage suggests), such as clock skew, or indicate something severe (probably more severe as the number of messages increases), so handling them all in the same way might not be appropriate. In the severe cases, can we start reporting outside of logging (output in the ceph status summary, start sending messages to all the terminals on the node, etc.)?
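A rough sketch of what per-message backoff/escalation could look like, again in Python and purely illustrative. The `RepeatPolicy` name, the power-of-two backoff schedule and the `escalate_at` threshold are assumptions, not anything that exists today; "escalate" stands in for whatever out-of-log reporting we would pick.

```python
from collections import Counter


class RepeatPolicy:
    """Decide what to do with each occurrence of a recurring message."""

    def __init__(self, escalate_at=8):
        self.counts = Counter()          # occurrences seen per message key
        self.escalate_at = escalate_at   # hypothetical severity threshold

    def observe(self, key):
        """Return 'log', 'escalate', or 'suppress' for this occurrence."""
        self.counts[key] += 1
        n = self.counts[key]
        if n == self.escalate_at:
            # Enough repeats to treat as severe: report outside the log.
            return "escalate"
        if (n & (n - 1)) == 0:
            # Backoff: only occurrences 1, 2, 4, 8, ... reach the log.
            return "log"
        return "suppress"


policy = RepeatPolicy(escalate_at=8)
key = "no heartbeat from osd.foo"  # message text with the timestamp stripped
actions = [policy.observe(key) for _ in range(10)]
```

Here the first, second and fourth occurrences are logged, the eighth escalates, and the rest are suppressed. Benign repeats such as clock skew could use the same counter with escalation disabled, so the two classes of message need not be handled the same way.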