Bug #50100
stale slow osd heartbeats health alert
Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
[WRN] OSD_SLOW_PING_TIME_BACK: Slow OSD heartbeats on back (longest 329881.007ms)
    Slow OSD heartbeats on back from osd.57 [] to osd.34 [] 329881.007 msec
    Slow OSD heartbeats on back from osd.46 [] to osd.22 [] 322439.429 msec
    Slow OSD heartbeats on back from osd.6 [] to osd.39 [] 322141.381 msec
    Slow OSD heartbeats on back from osd.6 [] to osd.1 [] 317347.729 msec
    Slow OSD heartbeats on back from osd.19 [] to osd.36 [] 312785.584 msec
    Slow OSD heartbeats on back from osd.43 [] to osd.61 [] 92993.926 msec
    Slow OSD heartbeats on back from osd.43 [] to osd.63 [] 92839.392 msec
    Slow OSD heartbeats on back from osd.43 [] to osd.53 [] 92786.246 msec possibly improving
    Slow OSD heartbeats on back from osd.43 [] to osd.52 [] 92786.206 msec
    Slow OSD heartbeats on back from osd.57 [] to osd.51 [] 92587.894 msec
    Truncated long network list. Use ceph daemon mgr.# dump_osd_network for more information
[WRN] OSD_SLOW_PING_TIME_FRONT: Slow OSD heartbeats on front (longest 330695.632ms)
    Slow OSD heartbeats on front from osd.57 [] to osd.34 [] 330695.632 msec
    Slow OSD heartbeats on front from osd.46 [] to osd.22 [] 322945.797 msec
    Slow OSD heartbeats on front from osd.6 [] to osd.39 [] 320848.377 msec
    Slow OSD heartbeats on front from osd.6 [] to osd.1 [] 317744.886 msec
    Slow OSD heartbeats on front from osd.19 [] to osd.36 [] 313810.277 msec
    Slow OSD heartbeats on front from osd.43 [] to osd.52 [] 92994.370 msec
    Slow OSD heartbeats on front from osd.57 [] to osd.51 [] 92884.778 msec
    Slow OSD heartbeats on front from osd.43 [] to osd.53 [] 92839.072 msec
    Slow OSD heartbeats on front from osd.43 [] to osd.63 [] 92786.355 msec
    Slow OSD heartbeats on front from osd.43 [] to osd.61 [] 92786.328 msec
    Truncated long network list. Use ceph daemon mgr.# dump_osd_network for more information
The dump_osd_network output has:
{
    "threshold": 1000,
    "entries": [
        {
            "last update": "Thu Apr 1 15:36:08 2021",
            "stale": true,
            "from osd": 57,
            "to osd": 34,
            "interface": "front",
            "average": {
                "1min": 330695.632,
                "5min": 330695.632,
                "15min": 44364.478
            },
            "min": {
                "1min": 330695.632,
                "5min": 330695.632,
                "15min": 330695.632
            },
            "max": {
                "1min": 330695.632,
                "5min": 330695.632,
                "15min": 330695.632
            },
            "last": 1.964
        },
        ...
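To make the mismatch in that entry easier to spot, here is a minimal Python sketch that picks out entries marked stale whose most recent ping is nonetheless below the threshold. The field names come from the dump excerpt above; the helper name `suspicious_entries` is hypothetical, the sample holds only the one entry quoted here, and it assumes `threshold` and `last` are both in milliseconds.

```python
import json

# Excerpt of the dump quoted above (only the single entry shown in this report).
DUMP = json.loads("""
{
  "threshold": 1000,
  "entries": [
    {
      "last update": "Thu Apr 1 15:36:08 2021",
      "stale": true,
      "from osd": 57,
      "to osd": 34,
      "interface": "front",
      "average": {"1min": 330695.632, "5min": 330695.632, "15min": 44364.478},
      "last": 1.964
    }
  ]
}
""")


def suspicious_entries(dump):
    """Hypothetical helper: stale entries whose last ping is under the threshold.

    Assumes "threshold" and "last" use the same unit (milliseconds).
    """
    threshold = dump["threshold"]
    return [
        (e["from osd"], e["to osd"], e["interface"], e["last"])
        for e in dump["entries"]
        if e.get("stale") and e["last"] < threshold
    ]


for src, dst, iface, last in suspicious_entries(DUMP):
    print(f"osd.{src} -> osd.{dst} ({iface}): stale, last ping {last} ms")
```

For the entry above, the last ping is 1.964 while the 1min and 5min averages still read 330695.632, which is what makes the alert look stale.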
But when I look at 'ceph pg dump osds', I don't see osd.57 pinging osd.34 (34 is not in its HB_PEERS list):
OSD_STAT  USED     AVAIL    USED_RAW  TOTAL    HB_PEERS  PG_SUM  PRIMARY_PG_SUM
...
57        547 GiB  1.3 TiB  547 GiB   1.8 TiB  [0,2,4,8,9,10,12,13,14,18,21,22,24,26,28,29,30,33,35,36,37,39,42,43,44,45,46,51,52,53,55,59,61]  38  7
The same goes for the other reported pairs.
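The check I did by eye can be sketched in Python as follows. The HB_PEERS string is copied from the osd.57 row above, and the two targets are the ones the health alert reports for osd.57 (osd.34 and osd.51); the helper names `parse_hb_peers` and `missing_peers` are made up for illustration.

```python
# HB_PEERS column for osd.57, copied from 'ceph pg dump osds' above.
HB_PEERS_57 = (
    "[0,2,4,8,9,10,12,13,14,18,21,22,24,26,28,29,30,33,35,36,37,39,"
    "42,43,44,45,46,51,52,53,55,59,61]"
)


def parse_hb_peers(field):
    """Turn a bracketed, comma-separated HB_PEERS field into a set of OSD ids."""
    inner = field.strip().strip("[]")
    return {int(tok) for tok in inner.split(",") if tok.strip()}


def missing_peers(hb_peers_field, reported_targets):
    """Hypothetical helper: reported ping targets absent from HB_PEERS."""
    peers = parse_hb_peers(hb_peers_field)
    return sorted(t for t in reported_targets if t not in peers)


# Targets the health alert lists for osd.57.
print(missing_peers(HB_PEERS_57, [34, 51]))  # prints [34]
```

So osd.51 is a current heartbeat peer of osd.57, but osd.34 is not, even though the alert still lists it.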
This is 16.1.0-1341-gc0a8a600 / 16.2.0.