Bug #50237

open

cephfs-journal-tool/cephfs-data-scan: Stuck in infinite loop with "NetHandler create_socket couldn't create socket"

Added by n u about 3 years ago. Updated almost 3 years ago.

Status: Need More Info
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): tools
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Both tools get stuck in an infinite loop and output only this message: "NetHandler create_socket couldn't create socket".

Env:
version: 15.2.4
dockerized: ceph/ceph:v15.2.4
kernel: 3.10.0-1160.15.2.el7.x86_64

mons are msgrv2 only
ceph.conf:
mon host = [v2:10.0.0.1:3300],[v2:10.0.0.2:3300], ...
ms_mon_client/cluster/service_mode = secure
ms_cluster/service/client = secure
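
To make the "msgrv2 only" point concrete, here is a minimal sketch that classifies the entries of a `mon host` value by messenger protocol. The helper name `classify_mon_host` and the regular expression are illustrative assumptions, not part of Ceph; the sketch handles only IPv4 entries written as `v1:ip:port` or `v2:ip:port`, as in the ceph.conf excerpt above.

```python
import re

# Matches entries such as "v2:10.0.0.1:3300" or "v1:10.0.0.1:6789".
# Assumption: IPv4 addresses only; this is not Ceph's own address parser.
ADDR_RE = re.compile(r"(v1|v2):(\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def classify_mon_host(mon_host):
    """Return {"v1": [...], "v2": [...]} with (ip, port) tuples found in mon_host."""
    found = {"v1": [], "v2": []}
    for proto, ip, port in ADDR_RE.findall(mon_host):
        found[proto].append((ip, int(port)))
    return found


if __name__ == "__main__":
    # Sample value shaped like the ceph.conf excerpt in this report.
    sample = "[v2:10.0.0.1:3300],[v2:10.0.0.2:3300]"
    result = classify_mon_host(sample)
    print(result)
    if not result["v1"]:
        print("no msgr v1 addresses are listed in mon host")
```

With the configuration shown in this report, the `v1` list comes back empty, which would fit the suspicion that a client attempting msgrv1 has no address to try. This is a sketch for inspecting the config only, not a diagnosis.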

Example commands we tried:
cephfs-data-scan pg_files / 23.4
cephfs-journal-tool journal export backup.bin

For debugging, we have tried the following so far:
- Running the command from a non-containerized environment
- Multiple client versions (nautilus, very recent octopus build)
- Looking at debug logs: we noticed that the client tries msgrv1 but not msgrv2. I'm not 100% sure, because I didn't record the logs (I can go back and verify if needed)

Is it possible that these tools cannot speak msgrv2?
If yes, what would be the process to "enable" msgrv1? (Our monitors are already listening on the v1 port; however, if I update ceph.conf with the v1 addresses and run, e.g., ceph -s, the monitor that receives the query crashes.)

The following is a similar bug report; however, in our case the tools don't work at all.
https://tracker.ceph.com/issues/41034


Files

debuglogs_50237.tar.gz (785 KB): daemon debug logs, strace outputs while running the cephfs recovery tools (n u, 04/16/2021 01:30 PM)