Bug #62560

open

Provide a ceph subcommand for a client-side connectivity test

Added by Alexander Patrakov 9 months ago.

Status: New
Priority: Normal
Assignee: -
Category: ceph cli
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Recently, I had to troubleshoot a customer's cluster where, after new OSD hosts were added, reading certain files from CephFS hung. We spent a lot of time on this, and in the end it came down to an ACL misconfiguration on one of the network switches: all cluster nodes could communicate with each other, but the client nodes were not allowed to reach the new OSD hosts at the network level.

To make troubleshooting easier, please provide a new ceph client subcommand (e.g., "ceph connectivity") that tries to connect to every MON, MGR, OSD, MDS, and maybe also every RADOS gateway, and reports any failures, such as connection resets and timeouts. It could also send a large dummy request (say, 16 kilobytes) or provoke a large dummy reply in order to test for MTU-related issues.
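To illustrate the kind of check I have in mind, here is a minimal Python sketch. It is not an existing ceph feature: the names `Endpoint`, `probe`, and `check_connectivity` are hypothetical, and the list of daemon addresses is assumed to be collected separately (for example, from the cluster maps). It only attempts a plain TCP connection from the client node to each address and classifies the result; it does not speak the Ceph messenger protocol, so it would not catch MTU-related problems.

```python
import socket
from typing import Dict, Iterable, NamedTuple


class Endpoint(NamedTuple):
    """One daemon address to test (hypothetical structure for this sketch)."""
    name: str  # label such as "osd.12" or "mon.a"
    host: str
    port: int


def probe(endpoint: Endpoint, timeout: float = 3.0) -> str:
    """Try a plain TCP connection and classify the outcome."""
    try:
        with socket.create_connection((endpoint.host, endpoint.port), timeout=timeout):
            return "ok"
    except socket.timeout:
        # Typical when a firewall or switch ACL silently drops packets.
        return "timeout"
    except ConnectionRefusedError:
        return "refused"
    except ConnectionResetError:
        return "reset"
    except OSError as exc:
        # Anything else, e.g. "no route to host" or a name resolution failure.
        return f"error: {exc}"


def check_connectivity(endpoints: Iterable[Endpoint], timeout: float = 3.0) -> Dict[str, str]:
    """Probe every endpoint in turn and return a name -> status mapping."""
    return {ep.name: probe(ep, timeout) for ep in endpoints}
```

Run from a client node against the addresses of all daemons, any entry whose status is not "ok" would point at the hosts that the client cannot reach, which is exactly the information that was missing in the case described above.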
