Bug #48269

insights module can generate too much data, fail to put in config-key

Added by Dan Mick about 3 years ago. Updated about 2 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
insights module
Target version:
-
% Done:
0%

Source:
Development
Tags:
Backport:
nautilus,octopus,pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

A sick cluster has a lot of insights data:

# ceph insights | wc
  72010  296222 3730601

The mgr logs the entire packet on "config-key set", and then shows

failed: (27) File too large
2020-11-17T06:25:09.736+0000 7f812ee58700 0 mgr set_store mon returned -27: error: entry size limited to 65536 bytes. Use 'mon config key max entry size' to manually adjust

The insights report is lost.
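To put the numbers above side by side, here is a minimal sketch that compares the observed report size with the 65536-byte limit from the mon error. The helper name `fits_in_config_key` is hypothetical and not part of the insights module; measuring the size as the length of the UTF-8 JSON encoding is also an assumption about how the limit is applied.

```python
import json

# Limit quoted in the mon error message (bytes).
MAX_ENTRY_SIZE = 65536

def fits_in_config_key(report: dict, limit: int = MAX_ENTRY_SIZE) -> bool:
    """Return True if the JSON-encoded report is within the entry size limit.

    Hypothetical helper: assumes the limit applies to the encoded byte length.
    """
    encoded = json.dumps(report).encode("utf-8")
    return len(encoded) <= limit

# Byte count reported by `ceph insights | wc` in the description.
observed_bytes = 3730601
ratio = observed_bytes / MAX_ENTRY_SIZE
print(f"report is about {ratio:.1f}x the config-key entry limit")
```

A check like this could run before the "config-key set" call, so an oversized report is detected up front rather than being rejected by the mon.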

Not sure what to recommend for this.


Related issues

Duplicated by mgr - Bug #51637: mgr/insights: mgr consumes excessive amounts of memory Duplicate
Copied to mgr - Backport #51949: octopus: insights module can generate too much data, fail to put in config-key Resolved
Copied to mgr - Backport #51950: nautilus: insights module can generate too much data, fail to put in config-key Resolved
Copied to mgr - Backport #51951: pacific: insights module can generate too much data, fail to put in config-key Resolved

History

#1 Updated by Brad Hubbard about 3 years ago

  • Category set to insights module

#2 Updated by Neha Ojha about 3 years ago

  • Priority changed from Normal to Urgent

#3 Updated by Brad Hubbard about 3 years ago

The 64k limit of mon_config_key_max_entry_size is arbitrary and has been expanded before (https://github.com/badone/ceph/commit/b38b8e980cb477ab2b0f320ab51eaa0c0fec7da6). I'm wondering whether we should just expand it again, though that is possibly not a dream solution. Alternatively, we could grant the 'insights' module the ability to ignore that limit.

#4 Updated by Dan Mick about 3 years ago

We could, but that's a big big expansion. I don't know how much to worry about monstore space consumption.

Another option is compression I suppose, perhaps at some threshold.
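As a rough illustration of compression at a threshold, the sketch below compresses the JSON-encoded report with zlib only when it exceeds a chosen size and tags the result so a reader can tell the two forms apart. The threshold value, the one-byte tag scheme, and the function names are all assumptions made for illustration; none of them come from the insights module. Since config-key values are normally text, a real implementation would probably also need to encode the compressed bytes (for example with base64).

```python
import json
import zlib

# Hypothetical threshold: compress anything larger than 16 KiB.
COMPRESS_THRESHOLD = 16 * 1024

def encode_report(report, threshold=COMPRESS_THRESHOLD):
    """Encode a report as tagged bytes: b'J' + raw JSON, or b'Z' + zlib data."""
    raw = json.dumps(report).encode("utf-8")
    if len(raw) <= threshold:
        return b"J" + raw
    return b"Z" + zlib.compress(raw, 6)

def decode_report(blob):
    """Reverse encode_report, decompressing only when the tag says so."""
    tag, body = blob[:1], blob[1:]
    if tag == b"Z":
        body = zlib.decompress(body)
    elif tag != b"J":
        raise ValueError("unknown report encoding tag: %r" % tag)
    return json.loads(body.decode("utf-8"))
```

Compression only reduces the size; it does not bound it, so a sufficiently large report could still exceed the entry limit.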

#5 Updated by Brad Hubbard almost 3 years ago

It's not clear at all how big this might actually get.

# ceph insights | wc
 118123  654579 7384608

The calls to 'config-key set mgr/insights' occur every 10 seconds by default and lead to a substantial entry in the log that should definitely be reviewed as well.

# grep  "config-key set mgr/insights" ceph-mgr.XXX005.xxyjcw.log|awk '{print($1)}'|head -5
2020-11-30T06:25:02.015+0000
2020-11-30T06:25:14.159+0000
2020-11-30T06:25:26.643+0000
2020-11-30T06:25:38.231+0000
2020-11-30T06:25:52.300+0000

# grep  "config-key set mgr/insights" ceph-mgr.XXX005.xxyjcw.log|head -1|wc
      1   12830  117254

# grep  "config-key set mgr/insights" ceph-mgr.XXX005.xxyjcw.log|tail -1|wc
      1   30097  269537
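To get a feel for the log growth, this sketch derives the mean interval between the five timestamps above and extrapolates a daily log volume from the two line sizes measured with wc. It is back-of-the-envelope arithmetic only, assuming the interval and line sizes stay in the observed range for a whole day.

```python
from datetime import datetime

# Timestamps of the first five "config-key set mgr/insights" log lines.
stamps = [
    "2020-11-30T06:25:02.015+0000",
    "2020-11-30T06:25:14.159+0000",
    "2020-11-30T06:25:26.643+0000",
    "2020-11-30T06:25:38.231+0000",
    "2020-11-30T06:25:52.300+0000",
]
times = [datetime.strptime(s, "%Y-%m-%dT%H:%M:%S.%f%z") for s in stamps]

# Seconds between consecutive writes, and their mean.
gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
mean_gap = sum(gaps) / len(gaps)

# Sizes in bytes of the first and last matching log lines (from wc).
first_line_bytes = 117254
last_line_bytes = 269537

writes_per_day = 86400 / mean_gap
low_estimate = writes_per_day * first_line_bytes
high_estimate = writes_per_day * last_line_bytes

print(f"mean interval: {mean_gap:.1f} s")
print(f"daily log volume: {low_estimate / 1e9:.2f} to {high_estimate / 1e9:.2f} GB")
```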

#6 Updated by Sage Weil almost 3 years ago

I would rather create a tiny 'insights' rados pool and dump the insights reports there. The devicehealth module takes this approach.

#7 Updated by Josh Durgin over 2 years ago

We discussed this at CDM a few months ago: https://pad.ceph.com/p/insights_config-key_set_failure_problem - the main conclusion was to store the reports in memory only, and perhaps not to store every single health update.
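One way to picture that conclusion is a bounded in-memory history that skips a health update when it is identical to the previous one. This is a sketch under assumptions, not the code from the eventual fix: the class name, the `max_entries` default, and the "skip if unchanged" rule are all hypothetical.

```python
import json
from collections import deque

class InMemoryHealthHistory:
    """Keep recent health updates in memory only (hypothetical sketch)."""

    def __init__(self, max_entries=24):
        # deque with maxlen drops the oldest entry once the bound is reached.
        self._entries = deque(maxlen=max_entries)
        self._last_key = None

    def record(self, timestamp, checks):
        """Store the update unless it matches the previous one.

        Returns True if the update was stored, False if it was skipped.
        """
        key = json.dumps(checks, sort_keys=True)
        if key == self._last_key:
            return False
        self._last_key = key
        self._entries.append((timestamp, checks))
        return True

    def entries(self):
        """Return the retained (timestamp, checks) pairs, oldest first."""
        return list(self._entries)
```

The trade-off is that the history is lost whenever the mgr restarts or fails over, in exchange for no monstore writes and a fixed upper bound on the number of retained updates.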

#8 Updated by Brad Hubbard over 2 years ago

  • Assignee set to Brad Hubbard
  • Priority changed from Urgent to High

#9 Updated by Brad Hubbard over 2 years ago

  • Priority changed from High to Urgent

#10 Updated by Brad Hubbard over 2 years ago

  • Pull request ID set to 42442

#11 Updated by Brad Hubbard over 2 years ago

  • Duplicated by Bug #51637: mgr/insights: mgr consumes excessive amounts of memory added

#12 Updated by Konstantin Shalygin over 2 years ago

  • Status changed from New to Fix Under Review
  • Target version set to v14.2.23
  • Backport set to nautilus pacific
  • Affected Versions v14.2.22 added

#13 Updated by Neha Ojha over 2 years ago

  • Backport changed from nautilus pacific to nautilus,octopus,pacific

#14 Updated by Brad Hubbard over 2 years ago

  • Status changed from Fix Under Review to Pending Backport

#15 Updated by Backport Bot over 2 years ago

  • Copied to Backport #51949: octopus: insights module can generate too much data, fail to put in config-key added

#16 Updated by Backport Bot over 2 years ago

  • Copied to Backport #51950: nautilus: insights module can generate too much data, fail to put in config-key added

#17 Updated by Backport Bot over 2 years ago

  • Copied to Backport #51951: pacific: insights module can generate too much data, fail to put in config-key added

#18 Updated by Loïc Dachary about 2 years ago

  • Target version deleted (v14.2.23)

#19 Updated by Brad Hubbard about 2 years ago

  • Status changed from Pending Backport to Resolved
