Project

General

Profile

Actions

Bug #62529

open

PrimaryLogPG::log_op_stats uses `now` time vs op end time when calculating op latency

Added by Michael Kidd 9 months ago. Updated 9 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
low-hanging-fruit
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

  • With `debug_osd` 15 or higher, a log line called `log_op_stats` is output, which ends with a reported latency value `lat`.
  • This line can be output several 10's of milliseconds after the op actually completes.
  • However, the `PrimaryLogPG::log_op_stats` uses `ceph_clock_now` as the time for calculating the op duration (lat) instead of the time when the op completed.
  • This can result in an arbitrarily higher latency value being reported.

https://github.com/ceph/ceph/blob/main/src/osd/PrimaryLogPG.cc#L4384-L4387

Actions

Also available in: Atom PDF