Bug #17275 (closed): MDS long-time blocked ops. ceph-fuse locks up with getattr of file

Added by Henrik Korkuc over 7 years ago. Updated about 7 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport: jewel
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS, ceph-fuse
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

In a 10.2.1 cluster we are seeing some blocked ceph-fuse (tested with 10.2.2) metadata accesses to files.

I cannot reproduce the issue but it happens from time to time.

mds dump_blocked_ops shows a number of long-running ops. Logs show some slow requests, e.g. "6 slow requests, 1 included below; oldest blocked for > 83360.920878 secs". Attaching some log files:
ceph-mds_log with debug_mds* set to 20/20, grepped for "10000328fca\|client.7397637"
dump_blocked_ops
mds_cache.gz

The problem is not Jewel-specific; we had similar problems on a Hammer cluster/client too.
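
For reference, a rough sketch of how the MDS-side dumps above can be gathered from the MDS admin socket; "mds.a" and the output path are placeholders for the actual daemon name and file:

ceph daemon mds.a dump_blocked_ops > dump_blocked_ops
ceph daemon mds.a dump cache /tmp/mds_cache    # then compress (gz/xz) before attaching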


Files

ceph-mds_log (18.1 KB) ceph-mds_log Henrik Korkuc, 09/14/2016 12:53 PM
dump_blocked_ops (6.88 KB) dump_blocked_ops Henrik Korkuc, 09/14/2016 12:54 PM
mds_cache.xz (553 KB) mds_cache.xz Henrik Korkuc, 09/14/2016 12:55 PM
dump_cache (13.2 KB) dump_cache Henrik Korkuc, 09/14/2016 02:36 PM
mds_requests (592 Bytes) mds_requests Henrik Korkuc, 09/14/2016 02:36 PM
objecter_requests (131 Bytes) objecter_requests Henrik Korkuc, 09/14/2016 02:36 PM
dump_cache_2816210 (20.1 KB) dump_cache_2816210 Henrik Korkuc, 09/15/2016 06:40 AM
gdb.txt (594 KB) gdb.txt Henrik Korkuc, 09/16/2016 07:40 AM
client.27841966_gdb.txt (118 KB) client.27841966_gdb.txt Henrik Korkuc, 09/26/2016 02:50 PM
client.27841966_objecter_requests (3.22 KB) client.27841966_objecter_requests Henrik Korkuc, 09/26/2016 02:50 PM
mds-dump_blocked_ops (318 KB) mds-dump_blocked_ops Henrik Korkuc, 09/26/2016 02:51 PM

Related issues: 1 (0 open, 1 closed)

Copied to CephFS - Backport #17697: jewel: MDS long-time blocked ops. ceph-fuse locks up with getattr of file (Resolved, Loïc Dachary)

Actions #1

Updated by Henrik Korkuc over 7 years ago

I forgot to attach the files... Attaching now. I also recompressed mds_cache with xz to get it below 1 MB (a 50% smaller file compared to gz).

Actions #2

Updated by Zheng Yan over 7 years ago

Please run the "dump cache", "mds_requests" and "objecter_requests" commands for client.7397637 and upload the outputs.
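
These are admin socket commands run on the machine hosting that ceph-fuse; a minimal sketch, assuming the default socket location under /var/run/ceph (check the actual *.asok name there):

SOCK=/var/run/ceph/ceph-client.admin.asok     # placeholder; use the hung client's socket
ceph --admin-daemon "$SOCK" dump_cache        > dump_cache
ceph --admin-daemon "$SOCK" mds_requests      > mds_requests
ceph --admin-daemon "$SOCK" objecter_requests > objecter_requests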

Actions #3

Updated by Henrik Korkuc over 7 years ago

Since I killed that ceph-fuse, I caught another hung ceph-fuse instead. Attaching outputs from that one.

Actions #4

Updated by Zheng Yan over 7 years ago

The client that causes the problem is client.2816210; please dump its cache.

Actions #5

Updated by Henrik Korkuc over 7 years ago

Attaching.

BTW, as you'll probably notice, the 2016-09-11-170046 directory is in another path in this dump. It was moved to "moved" after it started causing problems. For dirs in this "broken" state it takes minutes to move them.

Actions #6

Updated by Zheng Yan over 7 years ago

It seems there is a blocked write in client.2816210. Could you use gdb to attach to that ceph-fuse, run the "thread apply all bt" command in gdb, and upload the output?
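
One way to capture this non-interactively; a sketch assuming a single ceph-fuse process on the host and debug symbols installed for readable frames:

gdb --batch -p "$(pidof ceph-fuse)" -ex "thread apply all bt" > gdb.txt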

Actions #7

Updated by Henrik Korkuc over 7 years ago

Actions #8

Updated by Zheng Yan over 7 years ago

Thread 835 (Thread 0x7f4c937fe700 (LWP 18496)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f4cbcfeb81d in Wait (mutex=..., this=0x7f4cc7e04a68) at ./common/Cond.h:56
#2  ObjectCacher::maybe_wait_for_writeback (this=this@entry=0x7f4cc7e046b0, len=len@entry=8192) at osdc/ObjectCacher.cc:1660
#3  0x00007f4cbcffafd3 in ObjectCacher::_wait_for_write (this=this@entry=0x7f4cc7e046b0, wr=wr@entry=0x7f49901e0c50, len=len@entry=8192, oset=oset@entry=0x7f4ca009cf48, 
    onfreespace=onfreespace@entry=0x0) at osdc/ObjectCacher.cc:1682
#4  0x00007f4cbd000c29 in ObjectCacher::writex (this=this@entry=0x7f4cc7e046b0, wr=wr@entry=0x7f49901e0c50, oset=oset@entry=0x7f4ca009cf48, onfreespace=onfreespace@entry=0x0)
    at osdc/ObjectCacher.cc:1626
#5  0x00007f4cbcf8fb01 in file_write (flags=0, mtime=..., bl=..., len=<optimized out>, offset=671285764096, snapc=..., layout=0x7f4ca009cc28, oset=0x7f4ca009cf48, this=0x7f4cc7e046b0)
    at ./osdc/ObjectCacher.h:766
#6  Client::_write (this=this@entry=0x7f4cc7e01180, f=f@entry=0x7f4c7c024ea0, offset=offset@entry=671285764096, size=size@entry=8192, 
    buf=buf@entry=0x7f4cc7e117c0 "\267\n\207\272\207\037\257\350\367\274\203\064\373|E\324\063V\244G\031\211MRa\035\215\327+\036<\373\204\370h\"\005\002!]\276\206Yݿ\252\351[\364\211\262\036\232\354\022}", iov=iov@entry=0x0, iovcnt=0) at client/Client.cc:8469
#7  0x00007f4cbcf90c67 in Client::ll_write (this=0x7f4cc7e01180, fh=0x7f4c7c024ea0, off=off@entry=671285764096, len=len@entry=8192, 
    data=data@entry=0x7f4cc7e117c0 "\267\n\207\272\207\037\257\350\367\274\203\064\373|E\324\063V\244G\031\211MRa\035\215\327+\036<\373\204\370h\"\005\002!]\276\206Yݿ\252\351[\364\211\262\036\232\354\022}") at client/Client.cc:11456
#8  0x00007f4cbcf41596 in fuse_ll_write (req=0x7f4c8c0b7280, ino=<optimized out>, 
    buf=0x7f4cc7e117c0 "\267\n\207\272\207\037\257\350\367\274\203\064\373|E\324\063V\244G\031\211MRa\035\215\327+\036<\373\204\370h\"\005\002!]\276\206Yݿ\252\351[\364\211\262\036\232\354\022}", 
    size=8192, off=671285764096, fi=0x7f4c937fd5c0) at client/fuse_ll.cc:503
#9  0x00007f4cbc968fe4 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#10 0x00007f4cbc969895 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#11 0x00007f4cbc966394 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#12 0x00007f4cbc31f0ca in start_thread (arg=0x7f4c937fe700) at pthread_create.c:309
#13 0x00007f4cbac868ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This thread is blocked, which prevents the MDS from serving other requests. Please run the objecter_requests command for client.2816210 and check whether there are hung OSD requests.

Actions #9

Updated by Henrik Korkuc over 7 years ago

{
    "ops": [],
    "linger_ops": [],
    "pool_ops": [],
    "pool_stat_ops": [],
    "statfs_ops": [],
    "command_ops": []
}

Actions #10

Updated by Zheng Yan over 7 years ago

This is strange; could you do some interactive debugging (through xchat)?

Actions #11

Updated by Henrik Korkuc over 7 years ago

Sure. I am "singler" on OFTC #ceph.

Actions #12

Updated by Zheng Yan over 7 years ago

  • Status changed from New to 7

It seems ObjectCacher::maybe_wait_for_writeback() doesn't get woken up properly. Maybe the following patch can fix it:

diff --git a/src/osdc/ObjectCacher.cc b/src/osdc/ObjectCacher.cc
index e18de8c..ca58eb2 100644
--- a/src/osdc/ObjectCacher.cc
+++ b/src/osdc/ObjectCacher.cc
@@ -2522,6 +2522,8 @@ void ObjectCacher::bh_stat_sub(BufferHead *bh)
   default:
     assert(0 == "bh_stat_sub: invalid bufferhead state");
   }
+  if (get_stat_dirty_waiting() > 0)
+    stat_cond.Signal();
 }

 void ObjectCacher::bh_set_state(BufferHead *bh, int s)

Actions #13

Updated by Henrik Korkuc over 7 years ago

Hey Zheng,

Can you provide instructions on how I can trace the client that blocks ops? client.2816210 in this case.

I have some blocked ops, but I am not sure whether they are running the new or the old version.

Actions #14

Updated by Zheng Yan over 7 years ago

Check the cluster log and find lines like "client.xxx failing to respond to capability release". client.xxx is the client causing the problem.
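
For example, on a monitor host (assuming the default cluster log location; adjust the path if yours differs):

grep "failing to respond to capability release" /var/log/ceph/ceph.log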

Actions #15

Updated by Henrik Korkuc over 7 years ago

It looks like this time I got a read request to an OSD stuck for 2 days. The client is 0.94.9.

client cache: ceph-post-file: cd539449-cca1-4b36-bf32-54f0353c0815
mds cache: ceph-post-file: 477d14b0-dcc9-4291-a7fd-fe654483b6b8

The cluster does not have stale/down/missing PGs, but there was PG splitting/rebalancing done some time ago (I think the splitting was before that read request).
"ceph pg 1.1ab4db54 query" says "Error ENOENT: i don't have pgid 1.1ab4db54".

Actions #16

Updated by Zheng Yan over 7 years ago

Henrik Korkuc wrote:

It looks like this time I got a read request to an OSD stuck for 2 days. The client is 0.94.9.

client cache: ceph-post-file: cd539449-cca1-4b36-bf32-54f0353c0815
mds cache: ceph-post-file: 477d14b0-dcc9-4291-a7fd-fe654483b6b8

The cluster does not have stale/down/missing PGs, but there was PG splitting/rebalancing done some time ago (I think the splitting was before that read request).
"ceph pg 1.1ab4db54 query" says "Error ENOENT: i don't have pgid 1.1ab4db54".

This is an Objecter issue; please open a new ticket (category: librados).

Actions #17

Updated by Henrik Korkuc over 7 years ago

I created #17413 for that. Still waiting for this ticket's issue to show up again.

Actions #18

Updated by Henrik Korkuc over 7 years ago

Got a long getattr lock, but this time the client has long-running objecter requests. I will be filing a ticket for that soon. I think we can assume the patch helped with these problems.

Actions #20

Updated by Zheng Yan over 7 years ago

  • Status changed from 7 to Fix Under Review
Actions #21

Updated by John Spray over 7 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to jewel
Actions #22

Updated by Loïc Dachary over 7 years ago

  • Copied to Backport #17697: jewel: MDS long-time blocked ops. ceph-fuse locks up with getattr of file added
Actions #23

Updated by Nathan Cutler about 7 years ago

  • Status changed from Pending Backport to Resolved