Bug #49736

closed

cephfs-top: missing keys in the client_metadata

Added by Jos Collin about 3 years ago. Updated about 3 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport: pacific
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The mgr/stats client_metadata is missing keys for some clients, which causes the exception in cephfs-top [2] that is reported in the BZ [1]. Either cephfs-top should handle the missing metadata entries, or mgr/stats should fill in defaults until it can update the metadata. The exception occurs intermittently while cephfs-top is running; there are no known steps that reproduce it reliably.

Below is the `ceph fs perf stats` output dumped during the exception. Note that the client_metadata entry for client.14585 contains only the IP key.

{"version": 1, "global_counters": ["cap_hit", "read_latency", "write_latency", "metadata_latency", "dentry_lease"], "counters": [], 

"client_metadata": 
{"client.14504": {"IP": "127.0.0.1", "hostname": "smithi069", "root": "/", "mount_point": "/mnt/cephfs", "valid_metrics": ["cap_hit", "read_latency", "write_latency", "metadata_latency", "dentry_lease"]}, 
"client.14507": {"IP": "127.0.0.1", "hostname": "smithi069", "root": "/", "mount_point": "/mnt/cephfs2", "valid_metrics": ["cap_hit", "read_latency", "write_latency", "metadata_latency", "dentry_lease"]}, 
"client.14585": {"IP": "127.0.0.1"}}, 

"global_metrics": 
{"client.14504": [[2, 0], [0, 0], [0, 0], [0, 3038554], [0, 0]], 
"client.14507": [[2, 0], [0, 0], [0, 0], [0, 3091147], [0, 0]], 
"client.14585": [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0]]}, 

"metrics": {"delayed_ranks": [], "mds.0": {"client.14504": [], "client.14507": [], "client.14585": []}}}

The mgr logs from the time of the exception show the same thing. They cannot be attached to this ticket because they exceed the maximum file size of 1000 KB.
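The incomplete entry in the dump above can also be found programmatically. The following is a minimal sketch, not code from Ceph: the function name `find_incomplete_clients` is hypothetical, and the expected key set is an assumption taken from the complete entries in the dump (client.14504 and client.14507).

```python
import json

# Keys present on the complete entries in the dump above. Treating this as
# the "expected" set is an assumption made for illustration only.
EXPECTED_KEYS = ("IP", "hostname", "root", "mount_point", "valid_metrics")


def find_incomplete_clients(perf_stats_json):
    """Return {client_id: [missing keys]} for clients lacking metadata keys."""
    stats = json.loads(perf_stats_json)
    incomplete = {}
    for client_id, meta in stats.get("client_metadata", {}).items():
        absent = [key for key in EXPECTED_KEYS if key not in meta]
        if absent:
            incomplete[client_id] = absent
    return incomplete


# Trimmed copy of the client_metadata section from the dump above.
sample = json.dumps({
    "version": 1,
    "client_metadata": {
        "client.14504": {"IP": "127.0.0.1", "hostname": "smithi069",
                         "root": "/", "mount_point": "/mnt/cephfs",
                         "valid_metrics": ["cap_hit"]},
        "client.14507": {"IP": "127.0.0.1", "hostname": "smithi069",
                         "root": "/", "mount_point": "/mnt/cephfs2",
                         "valid_metrics": ["cap_hit"]},
        "client.14585": {"IP": "127.0.0.1"},
    },
})

print(find_incomplete_clients(sample))
```

Run against the trimmed sample, only client.14585 is reported, with hostname, root, mount_point and valid_metrics listed as missing.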

More Details:
At [3], mgr/stats sets only the IP metadata initially and then sends a request to the MDS for the remaining metadata. If cephfs-top queries mgr/stats before the reply arrives, the stats are dumped with the incomplete entry, which causes the exception. So either cephfs-top should be prepared to handle missing keys, or mgr/stats should fill in defaults (N/A, not available) and update them when it receives the metadata query reply. On the MDS side, the metadata query reply was observed to contain no metadata for client.14585; this also needs to be debugged.
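Both options come down to the same idea: never index a metadata key that may not have arrived yet. The sketch below illustrates this under stated assumptions; it is not the fix that was merged. The names `METADATA_DEFAULTS`, `with_defaults` and `format_client_row` are hypothetical, and the choice of an empty list as the default for valid_metrics is an assumption.

```python
import copy

NOT_AVAILABLE = "N/A"

# Hypothetical placeholder values, used until the MDS reply fills in the
# real metadata. The key names are the ones seen in the dump above.
METADATA_DEFAULTS = {
    "IP": NOT_AVAILABLE,
    "hostname": NOT_AVAILABLE,
    "root": NOT_AVAILABLE,
    "mount_point": NOT_AVAILABLE,
    "valid_metrics": [],
}


def with_defaults(meta):
    """Producer side (mgr/stats): return meta with every expected key set."""
    filled = copy.deepcopy(METADATA_DEFAULTS)
    filled.update(meta)  # values that are already known override the defaults
    return filled


def format_client_row(client_id, meta):
    """Consumer side (cephfs-top): read keys with .get() instead of []."""
    mount = meta.get("mount_point", NOT_AVAILABLE)
    host = meta.get("hostname", NOT_AVAILABLE)
    ip = meta.get("IP", NOT_AVAILABLE)
    return f"{client_id} {mount} {host}/{ip}"


partial = {"IP": "127.0.0.1"}  # the state of client.14585 in the dump

# Consumer-side tolerance: no KeyError on the partial entry.
print(format_client_row("client.14585", partial))

# Producer-side defaults, later updated when the metadata reply arrives.
entry = with_defaults(partial)
entry.update({"hostname": "smithi069", "root": "/",
              "mount_point": "/mnt/cephfs3"})  # illustrative reply values
print(entry)
```

Filling defaults on the mgr/stats side keeps every consumer of `ceph fs perf stats` safe, while the `.get()` approach protects cephfs-top even when it talks to a manager that does not fill defaults. The two are not mutually exclusive.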

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1934426
[2] https://github.com/ceph/ceph/blob/master/src/tools/cephfs/top/cephfs-top#L256
[3] https://github.com/ceph/ceph/blob/master/src/pybind/mgr/stats/fs/perf_stats.py#L275


Related issues 1 (0 open, 1 closed)

Copied to CephFS - Backport #49973: pacific: cephfs-top: missing keys in the client_metadata (Resolved, Jos Collin)
