Common/TrackedOp: inaccurate count for total slow requests
See bool OpTracker::check_ops_in_flight for details.
common/TrackedOp: fix inaccurate counting for total slow requests
In the original design there are two counters in charge of collecting
potentially problematic requests, namely 'slow' and 'warned'.
Counter 'slow' is responsible for capturing all the requests which have
already hit the "complain" limit and shall be marked as "slow" while
counter 'warned' is responsible for counting those requests which
have already hit the "warning interval" and thus shall be logged.
The problem is that once the 'warned' counter hits log_threshold, we
quit the entire for loop, even though the remaining shard_queues may
still contain slow requests. As a result, the 'slow' counter does not
reflect the real number of slow requests across all the shard_queues
in this case. In particular, no slow requests are counted at all when
'log_threshold' is set to zero (is this intentional? If not, we should
never allow 'log_threshold' to be zero).
The solution is to keep counting 'slow' requests until we have finished
traversing all the shard_queues, whether or not we have already gathered
enough requests for logging. Once we have, we simply stop incrementing
the 'warned' counter and skip the logging step.
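
To make the difference concrete, here is a minimal, self-contained sketch of
the two counting strategies. It is not the actual OpTracker code: the Op and
Counts structs, the warn_due flag, and both function names are hypothetical
stand-ins, and the real implementation takes per-shard locks and computes
warning intervals with backoff, which is omitted here.

```cpp
#include <vector>

// Hypothetical stand-in for a tracked request; not the real TrackedOp.
struct Op {
  double age_sec;  // how long the request has been in flight
  bool warn_due;   // true once the op's warning interval has elapsed
};

struct Counts {
  int slow = 0;    // requests past the complaint time
  int warned = 0;  // requests selected for logging
};

using Shards = std::vector<std::vector<Op>>;

// Old behaviour (simplified): stop traversing as soon as enough
// requests have been gathered for logging.
Counts count_early_exit(const Shards& shards, double complaint_time,
                        int log_threshold) {
  Counts c;
  for (const auto& shard : shards) {
    for (const auto& op : shard) {
      if (c.warned >= log_threshold) return c;  // leaves every loop at once
      if (op.age_sec < complaint_time) continue;
      ++c.slow;
      if (op.warn_due) ++c.warned;  // logging would happen here
    }
  }
  return c;
}

// Fixed behaviour (simplified): always visit every shard so 'slow' is
// complete; only 'warned' and the logging are capped by log_threshold.
Counts count_full_traversal(const Shards& shards, double complaint_time,
                            int log_threshold) {
  Counts c;
  for (const auto& shard : shards) {
    for (const auto& op : shard) {
      if (op.age_sec < complaint_time) continue;
      ++c.slow;
      if (op.warn_due && c.warned < log_threshold) {
        ++c.warned;  // logging would happen here
      }
    }
  }
  return c;
}
```

With four requests older than the complaint time spread over three shards and
a log_threshold of 2, the early-exit variant reports slow == 2 while the
full-traversal variant reports slow == 4; both report warned == 2. With
log_threshold set to zero the early-exit variant reports slow == 0.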