Bug #48269

insights module can generate too much data, fail to put in config-key

Added by Dan Mick 7 months ago. Updated 2 months ago.

Status: New
Priority: Urgent
Assignee:
Category: insights module
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

A sick cluster has a lot of insights data:

# ceph insights | wc
  72010  296222 3730601

The mgr logs the entire payload on "config-key set", and then reports:

failed: (27) File too large
2020-11-17T06:25:09.736+0000 7f812ee58700 0 mgr set_store mon returned -27: error: entry size limited to 65536 bytes. Use 'mon config key max entry size' to manually adjust

The insights report is lost.
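
For scale, the 3730601 bytes reported by wc above is roughly 57 times the 65536-byte limit quoted in the error. A minimal sketch of a size check that could run before attempting the store is shown below. The function name and the assumption that the limit applies to the UTF-8 encoded JSON value are illustrative, not taken from the insights module.

```python
import json

# Limit quoted in the mon error message (bytes).
MAX_ENTRY_SIZE = 65536


def fits_in_config_key(report, limit=MAX_ENTRY_SIZE):
    """Return True if the JSON-encoded report is within the size limit.

    Hypothetical helper: assumes the limit is applied to the UTF-8
    encoded JSON value that would be passed to "config-key set".
    """
    encoded = json.dumps(report).encode("utf-8")
    return len(encoded) <= limit
```

A caller could use this to log a short warning and skip the store, rather than sending an oversized payload that the mon will reject.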

Not sure what to recommend for this.

History

#1 Updated by Brad Hubbard 7 months ago

  • Category set to insights module

#2 Updated by Neha Ojha 7 months ago

  • Priority changed from Normal to Urgent

#3 Updated by Brad Hubbard 7 months ago

The 64k limit of mon_config_key_max_entry_size is arbitrary and has previously been expanded in https://github.com/badone/ceph/commit/b38b8e980cb477ab2b0f320ab51eaa0c0fec7da6. I wonder whether we should just expand it again, though that may not be an ideal solution. Alternatively, we could grant the 'insights' module the ability to ignore that limit.

#4 Updated by Dan Mick 7 months ago

We could, but that's a big big expansion. I don't know how much to worry about monstore space consumption.

Another option is compression I suppose, perhaps at some threshold.
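
A minimal sketch of threshold-based compression follows, assuming the report is a JSON-serializable dict and the stored value must be text. The threshold value, the "raw:"/"z:" prefixes, and the function names are hypothetical and only illustrate the idea.

```python
import base64
import json
import zlib

# Hypothetical threshold: compress only when the raw JSON exceeds it.
THRESHOLD = 65536


def encode_report(report, threshold=THRESHOLD):
    """Serialize a report, compressing it only above the threshold."""
    raw = json.dumps(report)
    if len(raw.encode("utf-8")) <= threshold:
        return "raw:" + raw
    # zlib output is binary, so base64-encode it to keep the value textual.
    packed = base64.b64encode(zlib.compress(raw.encode("utf-8")))
    return "z:" + packed.decode("ascii")


def decode_report(stored):
    """Reverse encode_report for either prefix."""
    tag, _, payload = stored.partition(":")
    if tag == "raw":
        return json.loads(payload)
    if tag == "z":
        return json.loads(zlib.decompress(base64.b64decode(payload)))
    raise ValueError("unknown report encoding: %r" % tag)
```

Compression only lowers the odds of hitting the limit. A sufficiently large report can still exceed it after compression, so the store would need to handle failure either way.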

#5 Updated by Brad Hubbard 7 months ago

It's not clear at all how big this might actually get.

# ceph insights | wc
 118123  654579 7384608

The calls to 'config-key set mgr/insights' occur every 10 seconds by default, and each one writes a substantial entry to the log, which should definitely be reviewed as well.

# grep  "config-key set mgr/insights" ceph-mgr.XXX005.xxyjcw.log|awk '{print($1)}'|head -5
2020-11-30T06:25:02.015+0000
2020-11-30T06:25:14.159+0000
2020-11-30T06:25:26.643+0000
2020-11-30T06:25:38.231+0000
2020-11-30T06:25:52.300+0000

# grep  "config-key set mgr/insights" ceph-mgr.XXX005.xxyjcw.log|head -1|wc
      1   12830  117254

# grep  "config-key set mgr/insights" ceph-mgr.XXX005.xxyjcw.log|tail -1|wc
      1   30097  269537
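
As a quick check on the call rate, the gaps between the five timestamps above can be computed with a short script. This is only a sketch over the sample shown here. The observed gaps are roughly 11.6 to 14.1 seconds, slightly longer than the 10-second default.

```python
from datetime import datetime

# Timestamps copied from the grep output above.
TIMESTAMPS = [
    "2020-11-30T06:25:02.015+0000",
    "2020-11-30T06:25:14.159+0000",
    "2020-11-30T06:25:26.643+0000",
    "2020-11-30T06:25:38.231+0000",
    "2020-11-30T06:25:52.300+0000",
]


def intervals(stamps):
    """Return the gaps, in seconds, between consecutive log timestamps."""
    parsed = [datetime.strptime(s, "%Y-%m-%dT%H:%M:%S.%f%z") for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(parsed, parsed[1:])]


deltas = intervals(TIMESTAMPS)
for delta in deltas:
    print("%.3f s" % delta)
```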

#6 Updated by Sage Weil 4 months ago

I would rather create a tiny 'insights' RADOS pool and dump the insights reports there. The devicehealth module takes this approach.

#7 Updated by Josh Durgin 3 months ago

We discussed this at CDM a few months ago: https://pad.ceph.com/p/insights_config-key_set_failure_problem. The main conclusion was to store the reports in memory only, and perhaps not to store every single health update.
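
One way to read that conclusion is sketched below: keep a bounded in-memory history and skip an update that is identical to the previous one. The class name, the 24-entry bound, and the digest-based comparison are assumptions made for illustration, not the design agreed at the meeting.

```python
import hashlib
import json
from collections import deque


class InMemoryHealthHistory:
    """Hypothetical bounded, in-memory store for health updates."""

    def __init__(self, max_entries=24):
        # deque with maxlen silently drops the oldest entry when full.
        self._entries = deque(maxlen=max_entries)
        self._last_digest = None

    def record(self, health):
        """Store the update unless it matches the previous one.

        Returns True if the update was stored, False if it was skipped.
        """
        encoded = json.dumps(health, sort_keys=True).encode("utf-8")
        digest = hashlib.sha256(encoded).hexdigest()
        if digest == self._last_digest:
            return False
        self._last_digest = digest
        self._entries.append(health)
        return True

    def entries(self):
        """Return the stored updates, oldest first."""
        return list(self._entries)
```

Nothing here survives a mgr restart or failover, which is the trade-off of keeping the data in memory only.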

#8 Updated by Brad Hubbard 2 months ago

  • Assignee set to Brad Hubbard
  • Priority changed from Urgent to High

#9 Updated by Brad Hubbard 2 months ago

  • Priority changed from High to Urgent
