Project

General

Profile

Actions

Bug #57460

closed

Json formatted ceph pg dump hangs on large clusters

Added by Ponnuvel P over 1 year ago. Updated 3 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
low-hanging-fruit
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

PG dump command `ceph pg dump --format json-pretty` hangs on large clusters.

The ceph-mgr daemon hangs and eventually fails over to the other ceph-mgr.

This happens in a cluster with 4391 OSDs and 66675 PGs.

The non-json `ceph pg dump` is about 35M in this cluster. So by the json formatted output would be considerably large (perhaps close to 2G depending on the number .

This was observed on an Octopus (15.2.13) cluster. But I'd expect the same happen in Pacific or Quincy on any sufficiently large cluster.

One option could be to disable network_ping_times from the Python (ceph-mgr) module which would is the major component (size-wise) for each of the OSDs.

Actions

Also available in: Atom PDF