Bug #18872

closed

write to cephfs mount hangs, ceph-fuse and kernel

Added by Jan Fajerski about 7 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: kraken, jewel
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): ceph-fuse
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When trying to write to a cephfs mount using 'dd', the client hangs indefinitely. The kernel client can be interrupted with <ctrl-c>; the fuse client must be killed using 'kill'. The file itself is created but has no content (size 0). The cluster is healthy and cephfs can be accessed by other clients normally.

Client is a PPC64le machine. Ceph version on the client is ceph version 10.2.5-239-g3a6a822 (3a6a822c8125858afaeac7a1ee0d121d063660f0).

Cluster machines are x86_64. Ceph version for the daemons is ceph version 10.2.4-211-g12b091b (12b091b4a40947aa43919e71a318ed0dcedc8734)

Attached logs are trimmed to the time frame for the write. I don't see anything obvious. The client seems to wait for the right capabilities: 10 client.94152 waiting for caps need Fw want Fb

strace says the ceph-fuse process is in futex()
strace -p $(pgrep ceph-fuse)
Process 32761 attached
futex(0x3ffffd654ba0, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, ffffffff^CProcess 32761 detached
<detached ...>


Files

ceph-client.admin.log (49.1 KB): Client logs with 'debug client = 20 debug ms = 1'. Jan Fajerski, 02/09/2017 03:26 PM
ceph-mds.ix64ph1078.log (100 KB): MDS log with 'debug mds = 20'. Jan Fajerski, 02/09/2017 03:26 PM

Related issues 2 (0 open, 2 closed)

Copied to CephFS - Backport #19845: kraken: write to cephfs mount hangs, ceph-fuse and kernel (Resolved, Jan Fajerski)
Copied to CephFS - Backport #19846: jewel: write to cephfs mount hangs, ceph-fuse and kernel (Resolved, Jan Fajerski)
Actions #1

Updated by Jan Fajerski about 7 years ago

Also, the daemon commands don't return anything: on the client, mds_requests, objecter_requests and ops_in_flight are all empty. The MDS doesn't seem to have any ops_in_flight either.

Actions #2

Updated by Greg Farnum about 7 years ago

Well, the problem is clearly indicated by the client

2017-02-09 15:47:20.426573 3fff787fea60 10 client.94152 waiting for caps need Fw want Fb

Not sure why it's not updating correctly, but I can't find commit 3a6a822c8125858afaeac7a1ee0d121d063660f0. Where's this client come from?

Actions #3

Updated by John Spray about 7 years ago

PPC clients! Wondering if you've tried running any of the automated tests (the unit tests, or teuthology suites?) on PPC? They might help to isolate any issues.

Actions #4

Updated by Jan Fajerski about 7 years ago

The commit is from the SUSE repo. It's part of the ses4 branch: https://github.com/SUSE/ceph/commits/ses4. Sorry, I should have mentioned that.

I have to admit I haven't thought of any testing yet. Building now and will look into teuthology testing too.

Actions #5

Updated by Jan Fajerski about 7 years ago

make check seems to get stuck on unittest_throttle, after PASS: unittest_log.

Edit: The machine has very little RAM (~3GB). Will rerun on a larger machine.

Actions #6

Updated by Jan Fajerski about 7 years ago

make check finishes with 2 failed suites.

  • FAIL: test/osd/osd-scrub-repair.sh
  • FAIL: test/osd/osd-scrub-snaps.sh

The failures seem unrelated to the issue observed, though I might be missing something in the 24M log file.
Any ideas or pointers appreciated.

Actions #7

Updated by Jan Fajerski almost 7 years ago

  • Status changed from New to Fix Under Review

Turns out this is an issue of ceph leaking arch-dependent flags on the wire. See "[PATCH] ceph: Fix file open flags on ppc64" on the kernel mailing list for details and the patch.

Same issue is present in the fuse client.
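
To illustrate the class of problem, here is a minimal sketch of translating host open flags into a fixed wire encoding before sending them to a server on another architecture. The per-architecture values are the Linux UAPI header values as I understand them; the WIRE table and the helper name flags_sys_to_wire are hypothetical, chosen for illustration, and are not taken from the patch or the Ceph source.

```python
# Open-flag values from the Linux UAPI headers (octal). The access mode bits
# (O_RDONLY=0, O_WRONLY=1, O_RDWR=2) agree on both architectures; several
# other flags do not.
X86_64 = {
    "O_CREAT": 0o100,
    "O_EXCL": 0o200,
    "O_TRUNC": 0o1000,
    "O_LARGEFILE": 0o100000,
    "O_DIRECTORY": 0o200000,
    "O_NOFOLLOW": 0o400000,
}
PPC64 = dict(
    X86_64,
    O_DIRECTORY=0o40000,
    O_NOFOLLOW=0o100000,
    O_LARGEFILE=0o200000,
)

# Hypothetical fixed wire encoding (illustration only). Flags that have no
# meaning to the server, such as O_LARGEFILE, are simply not listed.
WIRE = {
    "O_CREAT": 0o100,
    "O_EXCL": 0o200,
    "O_TRUNC": 0o1000,
    "O_DIRECTORY": 0o200000,
    "O_NOFOLLOW": 0o400000,
}

ACCESS_MODE_MASK = 0o3
O_WRONLY = 0o1


def flags_sys_to_wire(flags, host):
    """Map host-specific open flags to the fixed wire encoding."""
    wire = flags & ACCESS_MODE_MASK
    for name, wire_bit in WIRE.items():
        if flags & host[name]:
            wire |= wire_bit
    return wire


# The same logical open() call has different raw values on each host.
ppc_open = O_WRONLY | PPC64["O_CREAT"] | PPC64["O_LARGEFILE"]
x86_open = O_WRONLY | X86_64["O_CREAT"] | X86_64["O_LARGEFILE"]

# Sent raw, the ppc64 value carries the bit that x86_64 reads as O_DIRECTORY.
misread_as_directory = bool(ppc_open & X86_64["O_DIRECTORY"])

# After translation, both hosts send the same wire value.
ppc_wire = flags_sys_to_wire(ppc_open, PPC64)
x86_wire = flags_sys_to_wire(x86_open, X86_64)
```

In this sketch the raw ppc64 flags would be misread by an x86_64 peer, while the translated values agree. Whether this exact collision is the one behind the hang is described in the patch, not here.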

Actions #8

Updated by Jan Fajerski almost 7 years ago

  • Subject changed from Jewel write to cephfs mount hangs, ceph-fuse and kernel to write to cephfs mount hangs, ceph-fuse and kernel
  • Release deleted (jewel)
  • Release set to master
  • Affected Versions deleted (v10.2.5, v10.2.6)
  • Component(FS) ceph-fuse added
Actions #9

Updated by Zheng Yan almost 7 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to kraken, jewel
Actions #10

Updated by Nathan Cutler almost 7 years ago

  • Copied to Backport #19845: kraken: write to cephfs mount hangs, ceph-fuse and kernel added
Actions #11

Updated by Nathan Cutler almost 7 years ago

  • Copied to Backport #19846: jewel: write to cephfs mount hangs, ceph-fuse and kernel added
Actions #12

Updated by Nathan Cutler over 6 years ago

  • Status changed from Pending Backport to Resolved
  • Assignee set to Jan Fajerski