Bug #48269

closed

insights module can generate too much data, fail to put in config-key

Added by Dan Mick over 3 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: insights module
Target version: -
% Done: 0%
Source: Development
Tags:
Backport: nautilus,octopus,pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

A sick cluster has a lot of insights data:

# ceph insights | wc
  72010  296222 3730601

The mgr logs the entire packet on "config-key set", and then shows

failed: (27) File too large
2020-11-17T06:25:09.736+0000 7f812ee58700 0 mgr set_store mon returned -27: error: entry size limited to 65536 bytes. Use 'mon config key max entry size' to manually adjust

The insights report is lost.
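
To put the numbers above in perspective, here is a minimal sketch of a size check against the 65536-byte limit quoted in the mon error. It is not the insights module's actual code: `fits_config_key` and the constant name are hypothetical, and the size of the `ceph insights` CLI output is only a rough proxy for the payload the mgr tries to store.

```python
import json

# Limit quoted in the mon error message above, in bytes.
MON_CONFIG_KEY_MAX_ENTRY_SIZE = 65536


def fits_config_key(report, limit=MON_CONFIG_KEY_MAX_ENTRY_SIZE):
    """Return (fits, size) for the JSON-encoded report (hypothetical helper)."""
    size = len(json.dumps(report).encode("utf-8"))
    return size <= limit, size


# Byte count reported by `ceph insights | wc` in the description.
observed = 3730601
ratio = observed / MON_CONFIG_KEY_MAX_ENTRY_SIZE
print(f"observed output is about {ratio:.0f}x the limit")
```

A check like this would at least let a caller detect the oversize case before the write fails, rather than losing the report silently.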

Not sure what to recommend for this.


Related issues 4 (0 open, 4 closed)

Has duplicate mgr - Bug #51637: mgr/insights: mgr consumes excessive amounts of memory (Duplicate, Brad Hubbard)
Copied to mgr - Backport #51949: octopus: insights module can generate too much data, fail to put in config-key (Resolved, Laura Paduano)
Copied to mgr - Backport #51950: nautilus: insights module can generate too much data, fail to put in config-key (Resolved, Brad Hubbard)
Copied to mgr - Backport #51951: pacific: insights module can generate too much data, fail to put in config-key (Resolved, Laura Paduano)
Actions #1

Updated by Brad Hubbard over 3 years ago

  • Category set to insights module
Actions #2

Updated by Neha Ojha over 3 years ago

  • Priority changed from Normal to Urgent
Actions #3

Updated by Brad Hubbard over 3 years ago

The 64k limit of mon_config_key_max_entry_size is arbitrary and has previously been expanded (https://github.com/badone/ceph/commit/b38b8e980cb477ab2b0f320ab51eaa0c0fec7da6). Do we just expand it again? That is possibly not a dream solution. Alternatively, we could grant the 'insights' module the ability to ignore that limit.

Actions #4

Updated by Dan Mick over 3 years ago

We could, but that's a big big expansion. I don't know how much to worry about monstore space consumption.

Another option is compression I suppose, perhaps at some threshold.
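
A rough sketch of what threshold-based compression could look like, using Python's standard `zlib`. The function names and the `raw:`/`zlib:` prefixes are hypothetical and not part of any Ceph API; a real implementation would also need to check whether config-key values can safely carry binary data (or encode them first) and whether the compressed result still exceeds the limit.

```python
import json
import zlib


def encode_report(report, threshold=65536):
    """Serialize the report; compress only when it exceeds the threshold."""
    raw = json.dumps(report).encode("utf-8")
    if len(raw) <= threshold:
        return b"raw:" + raw
    return b"zlib:" + zlib.compress(raw)


def decode_report(blob):
    """Reverse encode_report, using the prefix to detect compression."""
    if blob.startswith(b"zlib:"):
        return json.loads(zlib.decompress(blob[len(b"zlib:"):]))
    return json.loads(blob[len(b"raw:"):])
```

Compression only buys headroom: the compression ratio depends on the data, so a sufficiently large report could still fail the size check.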

Actions #5

Updated by Brad Hubbard over 3 years ago

It's not clear at all how big this might actually get.

# ceph insights | wc
 118123  654579 7384608

The calls to 'config-key set mgr/insights' occur every 10 seconds by default and lead to a substantial entry in the log that should definitely be reviewed as well.

# grep  "config-key set mgr/insights" ceph-mgr.XXX005.xxyjcw.log|awk '{print($1)}'|head -5
2020-11-30T06:25:02.015+0000
2020-11-30T06:25:14.159+0000
2020-11-30T06:25:26.643+0000
2020-11-30T06:25:38.231+0000
2020-11-30T06:25:52.300+0000

# grep  "config-key set mgr/insights" ceph-mgr.XXX005.xxyjcw.log|head -1|wc
      1   12830  117254

# grep  "config-key set mgr/insights" ceph-mgr.XXX005.xxyjcw.log|tail -1|wc
      1   30097  269537
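
For reference, the gaps between the five timestamps above can be computed with a short script. This is only a sketch over the values pasted in this comment; the helper name `intervals` is made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the grep output above.
stamps = [
    "2020-11-30T06:25:02.015+0000",
    "2020-11-30T06:25:14.159+0000",
    "2020-11-30T06:25:26.643+0000",
    "2020-11-30T06:25:38.231+0000",
    "2020-11-30T06:25:52.300+0000",
]


def intervals(timestamps):
    """Return the gaps in seconds between consecutive log timestamps."""
    parsed = [datetime.strptime(t, "%Y-%m-%dT%H:%M:%S.%f%z") for t in timestamps]
    return [(b - a).total_seconds() for a, b in zip(parsed, parsed[1:])]


gaps = intervals(stamps)
mean_gap = sum(gaps) / len(gaps)
print([round(g, 3) for g in gaps], round(mean_gap, 3))
```

In this sample the gaps range from about 11.6 to 14.1 seconds, so the effective period is somewhat longer than the 10-second default.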
Actions #6

Updated by Sage Weil about 3 years ago

I would rather create a tiny 'insights' rados pool and dump the insights reports there. The devicehealth module takes this approach.

Actions #7

Updated by Josh Durgin about 3 years ago

We discussed this at CDM a few months ago: https://pad.ceph.com/p/insights_config-key_set_failure_problem - the main conclusion was to store the reports in memory only, and perhaps not to store every single health update.
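
As an illustration of that conclusion only, and not the change that was eventually merged, a bounded in-memory history that skips unchanged health updates might look like the sketch below. The class name, the `max_entries` default, and the shape of `checks` (a dict of check name to summary) are all assumptions.

```python
from collections import deque


class InMemoryHealthHistory:
    """Hypothetical bounded, in-memory store for health updates."""

    def __init__(self, max_entries=24):
        # deque drops the oldest entry automatically once maxlen is reached.
        self._entries = deque(maxlen=max_entries)
        self._last = None

    def record(self, timestamp, checks):
        """Store the update only if the checks changed; return True if stored."""
        snapshot = dict(checks)  # shallow copy so later caller mutation is ignored
        if snapshot == self._last:
            return False
        self._entries.append((timestamp, snapshot))
        self._last = snapshot
        return True

    def entries(self):
        return list(self._entries)
```

Nothing here survives a mgr restart, which is the trade-off of keeping the data in memory only.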

Actions #8

Updated by Brad Hubbard about 3 years ago

  • Assignee set to Brad Hubbard
  • Priority changed from Urgent to High
Actions #9

Updated by Brad Hubbard about 3 years ago

  • Priority changed from High to Urgent
Actions #10

Updated by Brad Hubbard almost 3 years ago

  • Pull request ID set to 42442
Actions #11

Updated by Brad Hubbard almost 3 years ago

  • Has duplicate Bug #51637: mgr/insights: mgr consumes excessive amounts of memory added
Actions #12

Updated by Konstantin Shalygin almost 3 years ago

  • Status changed from New to Fix Under Review
  • Target version set to v14.2.23
  • Backport set to nautilus pacific
  • Affected Versions v14.2.22 added
Actions #13

Updated by Neha Ojha over 2 years ago

  • Backport changed from nautilus pacific to nautilus,octopus,pacific
Actions #14

Updated by Brad Hubbard over 2 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #15

Updated by Backport Bot over 2 years ago

  • Copied to Backport #51949: octopus: insights module can generate too much data, fail to put in config-key added
Actions #16

Updated by Backport Bot over 2 years ago

  • Copied to Backport #51950: nautilus: insights module can generate too much data, fail to put in config-key added
Actions #17

Updated by Backport Bot over 2 years ago

  • Copied to Backport #51951: pacific: insights module can generate too much data, fail to put in config-key added
Actions #18

Updated by Loïc Dachary over 2 years ago

  • Target version deleted (v14.2.23)
Actions #19

Updated by Brad Hubbard over 2 years ago

  • Status changed from Pending Backport to Resolved