Bug #18872

write to cephfs mount hangs, ceph-fuse and kernel

Added by Jan Fajerski 6 months ago. Updated 8 days ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 02/09/2017
Due date:
% Done: 0%
Source:
Tags:
Backport: kraken, jewel
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Release: master
Component(FS): ceph-fuse
Needs Doc: No

Description

When trying to write to a cephfs mount using 'dd', the client hangs indefinitely. The kernel client can be interrupted with <ctrl-c>; the fuse client must be killed using 'kill'. The file itself is created but has no content (size 0). The cluster is healthy and cephfs can be accessed by other clients normally.

Client is a PPC64le machine. Ceph version on the client is ceph version 10.2.5-239-g3a6a822 (3a6a822c8125858afaeac7a1ee0d121d063660f0).

Cluster machines are x86_64. Ceph version for the daemons is ceph version 10.2.4-211-g12b091b (12b091b4a40947aa43919e71a318ed0dcedc8734)

Attached logs are trimmed to the time frame for the write. I don't see anything obvious. The client seems to wait for the right capabilities: 10 client.94152 waiting for caps need Fw want Fb

strace shows the ceph-fuse process blocked in futex():
strace -p $(pgrep ceph-fuse)
Process 32761 attached
futex(0x3ffffd654ba0, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, ffffffff^CProcess 32761 detached
<detached ...>

ceph-client.admin.log - Client logs with 'debug client = 20 debug ms = 1' (49.1 KB) Jan Fajerski, 02/09/2017 03:26 PM

ceph-mds.ix64ph1078.log - MDS log with 'debug mds = 20' (100 KB) Jan Fajerski, 02/09/2017 03:26 PM


Related issues

Copied to fs - Backport #19845: kraken: write to cephfs mount hangs, ceph-fuse and kernel Resolved
Copied to fs - Backport #19846: jewel: write to cephfs mount hangs, ceph-fuse and kernel Resolved

History

#1 Updated by Jan Fajerski 6 months ago

Also, the daemon commands don't return anything. That is, on the client, mds_requests, objecter_requests and ops_in_flight are all empty. The MDS doesn't seem to have any ops_in_flight either.

#2 Updated by Greg Farnum 6 months ago

Well, the problem is clearly indicated by the client:

2017-02-09 15:47:20.426573 3fff787fea60 10 client.94152 waiting for caps need Fw want Fb

Not sure why it's not updating correctly, but I can't find commit 3a6a822c8125858afaeac7a1ee0d121d063660f0. Where's this client come from?

#3 Updated by John Spray 6 months ago

PPC clients! Wondering if you've tried running any of the automated tests (the unit tests, or teuthology suites?) on PPC? They might help to isolate any issues.

#4 Updated by Jan Fajerski 6 months ago

The commit is from the SUSE repo. It's part of the ses4 branch: https://github.com/SUSE/ceph/commits/ses4. Sorry, I should have mentioned that.

I have to admit I haven't thought of any testing yet. Building now and will look into teuthology testing too.

#5 Updated by Jan Fajerski 5 months ago

make check seems to get stuck on unittest_throttle, right after PASS: unittest_log.

Edit: The machine has very little RAM (~3GB). Will rerun on a larger machine.

#6 Updated by Jan Fajerski 5 months ago

make check finishes with 2 failed suites.

  • FAIL: test/osd/osd-scrub-repair.sh
  • FAIL: test/osd/osd-scrub-snaps.sh

The failures seem unrelated to the observed issue, though I might be missing something in the 24M log file.
Any ideas or pointers appreciated.

#7 Updated by Jan Fajerski 3 months ago

  • Status changed from New to Need Review

Turns out this is an issue of ceph leaking arch-dependent flags on the wire. See the kernel mailing list thread "[PATCH] ceph: Fix file open flags on ppc64" for details and the patch.

The same issue is present in the fuse client.
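
To illustrate the kind of mismatch described above, here is a minimal sketch in Python. It is not the actual kernel or ceph-fuse fix. The octal values are the open(2) flag values as I understand them from the Linux uapi fcntl headers (asm-generic for x86_64, and the powerpc override); please verify them against your own tree. The helper names `decode` and `to_wire` are hypothetical, and using the generic layout as the wire format is an assumption made for this example.

```python
# Open-flag values (octal). Assumed from the Linux uapi fcntl headers;
# verify against your kernel tree before relying on them.
GENERIC = {  # asm-generic layout, as used on x86_64
    "O_WRONLY": 0o1,
    "O_CREAT": 0o100,
    "O_TRUNC": 0o1000,
    "O_DIRECT": 0o40000,
    "O_LARGEFILE": 0o100000,
    "O_DIRECTORY": 0o200000,
    "O_NOFOLLOW": 0o400000,
}

# powerpc assigns different bits to four of these flags.
POWERPC = dict(
    GENERIC,
    O_DIRECTORY=0o40000,
    O_NOFOLLOW=0o100000,
    O_LARGEFILE=0o200000,
    O_DIRECT=0o400000,
)


def decode(flags, table):
    """Return the names of all flags in `table` whose bit is set in `flags`."""
    return {name for name, bit in table.items() if flags & bit}


def to_wire(flags, local, wire=GENERIC):
    """Translate locally encoded flags into a fixed wire encoding.

    Hypothetical helper: `wire` defaults to the generic layout purely
    for illustration.
    """
    out = 0
    for name, bit in local.items():
        if flags & bit:
            out |= wire[name]
    return out


# A hypothetical write open as encoded on a ppc64 client.
ppc_flags = (POWERPC["O_WRONLY"] | POWERPC["O_CREAT"]
             | POWERPC["O_TRUNC"] | POWERPC["O_LARGEFILE"])

# Sent raw and read with the generic table, O_LARGEFILE looks like O_DIRECTORY.
misread = decode(ppc_flags, GENERIC)

# Translated first, the receiver sees the flags the client meant.
translated = decode(to_wire(ppc_flags, POWERPC), GENERIC)

print(sorted(misread))
print(sorted(translated))
```

The point of the sketch is only that raw flag integers are not portable between architectures, so they need an explicit translation step before leaving the local system. Whether this exact flag combination is what triggered the hang reported here is not established by this example.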

#8 Updated by Jan Fajerski 3 months ago

  • Subject changed from Jewel write to cephfs mount hangs, ceph-fuse and kernel to write to cephfs mount hangs, ceph-fuse and kernel
  • Affected Versions deleted (v10.2.5, v10.2.6)
  • Release master added
  • Release deleted (jewel)
  • Component(FS) ceph-fuse added

#9 Updated by Zheng Yan 3 months ago

  • Status changed from Need Review to Pending Backport
  • Backport set to kraken, jewel

#10 Updated by Nathan Cutler 3 months ago

  • Copied to Backport #19845: kraken: write to cephfs mount hangs, ceph-fuse and kernel added

#11 Updated by Nathan Cutler 3 months ago

  • Copied to Backport #19846: jewel: write to cephfs mount hangs, ceph-fuse and kernel added

#12 Updated by Nathan Cutler 8 days ago

  • Status changed from Pending Backport to Resolved
  • Assignee set to Jan Fajerski
