Bug #47734

client: hang after statfs

Added by Patrick Donnelly 4 months ago. Updated about 1 month ago.

Status: Resolved
Priority: Urgent
Category: -
Target version:
% Done: 0%
Source: Development
Tags:
Backport: octopus,nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): Client
Labels (FS):
Pull request ID:
Crash signature:

Description

2020-10-02T16:25:49.256 INFO:teuthology.orchestra.run.smithi097.stderr:Error EINVAL: pool 'test_new_default_ec-data' (id '29') is an erasure-coded pool. Use of an EC pool for the default data pool is discouraged; see the online CephFS documentation for more information. Use --force to override.
2020-10-02T16:25:49.258 DEBUG:teuthology.orchestra.run:got remote process result: 22
2020-10-02T16:25:49.260 INFO:teuthology.nuke.actions:Clearing teuthology firewall rules...
2020-10-02T16:25:49.261 INFO:teuthology.orchestra.run.smithi097:> sudo sh -c 'iptables-save | grep -v teuthology | iptables-restore'
2020-10-02T16:25:49.289 INFO:teuthology.orchestra.run.smithi174:> sudo sh -c 'iptables-save | grep -v teuthology | iptables-restore'
2020-10-02T16:25:49.323 INFO:teuthology.nuke.actions:Cleared teuthology firewall rules.
2020-10-02T16:25:49.324 INFO:teuthology.orchestra.run:Running command with timeout 900
2020-10-02T16:25:49.324 INFO:teuthology.orchestra.run.smithi097:> (cd /home/ubuntu/cephtest && exec stat --file-system '--printf=%T
2020-10-02T16:25:49.325 INFO:teuthology.orchestra.run.smithi097:> ' -- /home/ubuntu/cephtest/mnt.0)

From: /ceph/teuthology-archive/pdonnell-2020-10-02_16:04:20-fs-wip-pdonnell-testing-20201002.031142-octopus-distro-basic-smithi/5489148/teuthology.log

This does not reproduce on master/Pacific because of commit 8728da9c085fea4c34e4247f45d495437f32c5fd: the is_mounted check no longer does a statfs on the ceph-fuse mount. The underlying cause is that the Client does not set a timeout when waiting for responses from the monitors:

https://github.com/ceph/ceph/blob/2d5bf454b7b42bf883a89b5ec12ad457c95fc37d/src/client/Client.cc#L15062

We should update this wait to use some reasonable timeout.
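
To illustrate the difference between an unbounded wait and a bounded one, here is a minimal Python sketch. It is not the Ceph client code (which is C++); the helper name `wait_for_reply`, the use of `threading.Event` to stand in for a monitor reply, and the choice of `-ETIMEDOUT` as the error value are all assumptions made for illustration.

```python
import errno
import threading


def wait_for_reply(reply_ready: threading.Event, timeout_s: float) -> int:
    """Hypothetical helper: wait for a reply, but give up after timeout_s.

    Returns 0 if the reply arrived in time, otherwise -ETIMEDOUT.
    Calling reply_ready.wait() with no timeout would block forever if the
    reply never comes, which is the kind of hang described in this report.
    """
    if reply_ready.wait(timeout=timeout_s):
        return 0
    return -errno.ETIMEDOUT


# Case 1: the reply never arrives (e.g. monitors are unreachable).
never_arrives = threading.Event()
rc_timeout = wait_for_reply(never_arrives, timeout_s=0.05)

# Case 2: the reply arrives shortly after the wait begins.
arrives_soon = threading.Event()
timer = threading.Timer(0.01, arrives_soon.set)
timer.start()
rc_ok = wait_for_reply(arrives_soon, timeout_s=1.0)
timer.join()
```

With a bounded wait, a caller such as statfs can return an error to the application instead of blocking indefinitely; what timeout value is "reasonable" is left open here, as it is in the report.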


Related issues

Related to CephFS - Feature #44044: qa: add network namespaces to kernel/ceph-fuse mounts for partition testing Resolved
Copied to CephFS - Backport #47941: nautilus: octopus: client: hang after statfs Rejected
Copied to CephFS - Backport #47942: octopus: octopus: client: hang after statfs Resolved

History

#1 Updated by Patrick Donnelly 4 months ago

  • Related to Feature #44044: qa: add network namespaces to kernel/ceph-fuse mounts for partition testing added

#2 Updated by Patrick Donnelly 4 months ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 37529

#3 Updated by Patrick Donnelly 3 months ago

  • Status changed from Fix Under Review to Pending Backport

#4 Updated by Nathan Cutler 3 months ago

  • Subject changed from octopus: client: hang after statfs to client: hang after statfs
  • Affected Versions v15.2.5 added

#5 Updated by Nathan Cutler 3 months ago

  • Copied to Backport #47941: nautilus: octopus: client: hang after statfs added

#6 Updated by Nathan Cutler 3 months ago

  • Copied to Backport #47942: octopus: octopus: client: hang after statfs added

#7 Updated by Nathan Cutler about 1 month ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
