Bug #17275

MDS long-time blocked ops. ceph-fuse locks up with getattr of file

Added by Henrik Korkuc 6 months ago. Updated about 2 months ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date: 09/14/2016
Due date:
% Done: 0%
Source: other
Tags:
Backport: jewel
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Release: hammer, jewel
Component(FS): MDS, ceph-fuse
Needs Doc: No

Description

On a 10.2.1 cluster we are seeing blocked ceph-fuse (tested with 10.2.2) metadata accesses to files.

I cannot reproduce the issue, but it happens from time to time.

mds dump_blocked_ops shows a number of long-running ops, and the logs show some slow requests, e.g. " 6 slow requests, 1 included below; oldest blocked for > 83360.920878 secs". Attaching some log files:
ceph-mds_log with debug_mds* set to 20/20, grepped for "10000328fca\|client.7397637"
dump_blocked_ops
mds_cache.gz

The problem is not Jewel-specific; we had similar problems on a hammer cluster/client too.

ceph-mds_log (18.1 KB) Henrik Korkuc, 09/14/2016 12:53 PM

dump_blocked_ops (6.88 KB) Henrik Korkuc, 09/14/2016 12:54 PM

mds_cache.xz (553 KB) Henrik Korkuc, 09/14/2016 12:55 PM

dump_cache (13.2 KB) Henrik Korkuc, 09/14/2016 02:36 PM

mds_requests (592 Bytes) Henrik Korkuc, 09/14/2016 02:36 PM

objecter_requests (131 Bytes) Henrik Korkuc, 09/14/2016 02:36 PM

dump_cache_2816210 (20.1 KB) Henrik Korkuc, 09/15/2016 06:40 AM

gdb.txt (594 KB) Henrik Korkuc, 09/16/2016 07:40 AM

client.27841966_gdb.txt (118 KB) Henrik Korkuc, 09/26/2016 02:50 PM

client.27841966_objecter_requests (3.22 KB) Henrik Korkuc, 09/26/2016 02:50 PM

mds-dump_blocked_ops (318 KB) Henrik Korkuc, 09/26/2016 02:51 PM


Related issues

Copied to Backport #17697: jewel: MDS long-time blocked ops. ceph-fuse locks up with getattr of file Resolved

History

#1 Updated by Henrik Korkuc 6 months ago

Forgot to attach the files... attaching now. Also recompressed mds_cache with xz to get it below 1 MB (the file is 50% smaller than the gz).

#2 Updated by Zheng Yan 6 months ago

Please run the "dump cache", "mds_requests" and "objecter_requests" commands for client.7397637 and upload the outputs.

#3 Updated by Henrik Korkuc 6 months ago

Since I had already killed that ceph-fuse, I am attaching outputs from another hung ceph-fuse instead.

#4 Updated by Zheng Yan 6 months ago

The client causing the problem is 2816210; please dump its cache.

#5 Updated by Henrik Korkuc 6 months ago

Attaching.

By the way, as you'll probably notice, the 2016-09-11-170046 directory is under a different path in this dump. It was moved to "moved" after it started to cause problems, and moving dirs in this "broken" state takes minutes.

#6 Updated by Zheng Yan 6 months ago

It seems there is a blocked write in client.2816210. Could you use gdb to attach to that ceph-fuse, then run the "thread apply all bt" command in gdb and upload the output?

#7 Updated by Henrik Korkuc 6 months ago

#8 Updated by Zheng Yan 6 months ago

Thread 835 (Thread 0x7f4c937fe700 (LWP 18496)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f4cbcfeb81d in Wait (mutex=..., this=0x7f4cc7e04a68) at ./common/Cond.h:56
#2  ObjectCacher::maybe_wait_for_writeback (this=this@entry=0x7f4cc7e046b0, len=len@entry=8192) at osdc/ObjectCacher.cc:1660
#3  0x00007f4cbcffafd3 in ObjectCacher::_wait_for_write (this=this@entry=0x7f4cc7e046b0, wr=wr@entry=0x7f49901e0c50, len=len@entry=8192, oset=oset@entry=0x7f4ca009cf48, 
    onfreespace=onfreespace@entry=0x0) at osdc/ObjectCacher.cc:1682
#4  0x00007f4cbd000c29 in ObjectCacher::writex (this=this@entry=0x7f4cc7e046b0, wr=wr@entry=0x7f49901e0c50, oset=oset@entry=0x7f4ca009cf48, onfreespace=onfreespace@entry=0x0)
    at osdc/ObjectCacher.cc:1626
#5  0x00007f4cbcf8fb01 in file_write (flags=0, mtime=..., bl=..., len=<optimized out>, offset=671285764096, snapc=..., layout=0x7f4ca009cc28, oset=0x7f4ca009cf48, this=0x7f4cc7e046b0)
    at ./osdc/ObjectCacher.h:766
#6  Client::_write (this=this@entry=0x7f4cc7e01180, f=f@entry=0x7f4c7c024ea0, offset=offset@entry=671285764096, size=size@entry=8192, 
    buf=buf@entry=0x7f4cc7e117c0 "\267\n\207\272\207\037\257\350\367\274\203\064\373|E\324\063V\244G\031\211MRa\035\215\327+\036<\373\204\370h\"\005\002!]\276\206Yݿ\252\351[\364\211\262\036\232\354\022}", iov=iov@entry=0x0, iovcnt=0) at client/Client.cc:8469
#7  0x00007f4cbcf90c67 in Client::ll_write (this=0x7f4cc7e01180, fh=0x7f4c7c024ea0, off=off@entry=671285764096, len=len@entry=8192, 
    data=data@entry=0x7f4cc7e117c0 "\267\n\207\272\207\037\257\350\367\274\203\064\373|E\324\063V\244G\031\211MRa\035\215\327+\036<\373\204\370h\"\005\002!]\276\206Yݿ\252\351[\364\211\262\036\232\354\022}") at client/Client.cc:11456
#8  0x00007f4cbcf41596 in fuse_ll_write (req=0x7f4c8c0b7280, ino=<optimized out>, 
    buf=0x7f4cc7e117c0 "\267\n\207\272\207\037\257\350\367\274\203\064\373|E\324\063V\244G\031\211MRa\035\215\327+\036<\373\204\370h\"\005\002!]\276\206Yݿ\252\351[\364\211\262\036\232\354\022}", 
    size=8192, off=671285764096, fi=0x7f4c937fd5c0) at client/fuse_ll.cc:503
#9  0x00007f4cbc968fe4 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#10 0x00007f4cbc969895 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#11 0x00007f4cbc966394 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#12 0x00007f4cbc31f0ca in start_thread (arg=0x7f4c937fe700) at pthread_create.c:309
#13 0x00007f4cbac868ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This thread is blocked, which prevents the MDS from serving other requests. Please run the objecter_requests command for client.2816210 and check whether there are any hung OSD requests.

#9 Updated by Henrik Korkuc 6 months ago

{
"ops": [],
"linger_ops": [],
"pool_ops": [],
"pool_stat_ops": [],
"statfs_ops": [],
"command_ops": []
}

#10 Updated by Zheng Yan 6 months ago

This is strange. Could you do some interactive debugging (through xchat)?

#11 Updated by Henrik Korkuc 6 months ago

Sure. I am "singler" on OFTC #ceph.

#12 Updated by Zheng Yan 6 months ago

  • Status changed from New to Testing

It seems ObjectCacher::maybe_wait_for_writeback() doesn't get woken up properly. Maybe the following patch can fix it:

diff --git a/src/osdc/ObjectCacher.cc b/src/osdc/ObjectCacher.cc
index e18de8c..ca58eb2 100644
--- a/src/osdc/ObjectCacher.cc
+++ b/src/osdc/ObjectCacher.cc
@@ -2522,6 +2522,8 @@ void ObjectCacher::bh_stat_sub(BufferHead *bh)
   default:
     assert(0 == "bh_stat_sub: invalid bufferhead state");
   }
+  if (get_stat_dirty_waiting() > 0)
+    stat_cond.Signal();
 }

 void ObjectCacher::bh_set_state(BufferHead *bh, int s)
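
For illustration only, here is a minimal standalone sketch of the missed-wakeup pattern this patch addresses. It is not the actual ObjectCacher code: it uses standard C++ threads instead of Ceph's Mutex/Cond, and the names, predicate and throttle limit are simplified assumptions. A writer blocks while too much data is dirty or in flight, and the path that lowers the accounting has to signal the condition, otherwise the writer can sleep forever even after writeback has finished.

#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <iostream>
#include <mutex>
#include <thread>

std::mutex lock;
std::condition_variable stat_cond;

uint64_t stat_dirty = 0;          // bytes dirty in the cache
uint64_t stat_tx = 0;             // bytes currently being written back
uint64_t stat_dirty_waiting = 0;  // bytes of writes blocked in the wait loop
const uint64_t max_dirty = 64;    // hypothetical throttle limit

// Roughly what ObjectCacher::maybe_wait_for_writeback() does: block the
// write path until enough dirty/in-flight data has drained.
void maybe_wait_for_writeback(uint64_t len) {
  std::unique_lock<std::mutex> l(lock);
  while (stat_dirty + stat_tx > 0 &&
         stat_dirty + stat_tx + len > max_dirty) {
    stat_dirty_waiting += len;
    stat_cond.wait(l);            // sleeps until something signals stat_cond
    stat_dirty_waiting -= len;
  }
  stat_dirty += len;              // admit the new dirty data
}

// Roughly what happens when a writeback completes and the bufferhead
// accounting is decremented. Without the signal at the end, a writer
// blocked above is never woken up, even though space has been freed.
void writeback_finished(uint64_t len) {
  std::lock_guard<std::mutex> l(lock);
  stat_tx -= len;
  // The equivalent of the two lines the patch adds to bh_stat_sub():
  if (stat_dirty_waiting > 0)
    stat_cond.notify_all();
}

int main() {
  // Pretend 64 bytes are already in flight, so a new 8-byte write must wait.
  stat_tx = 64;

  std::thread writer([] {
    maybe_wait_for_writeback(8);
    std::cout << "write admitted" << std::endl;
  });

  std::this_thread::sleep_for(std::chrono::milliseconds(100));
  writeback_finished(64);         // drains the in-flight data and wakes the writer

  writer.join();
  return 0;
}

In the real code, bh_stat_sub() is one of the places where the dirty/tx totals drop when a bufferhead changes state, which is why the patch signals stat_cond there whenever get_stat_dirty_waiting() > 0.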

#13 Updated by Henrik Korkuc 6 months ago

Hey Zheng,

Can you give me instructions on how to trace which client is blocking ops? client.2816210 in this case.

I have some blocked ops, but I am not sure whether those clients are running the new or the old version.

#14 Updated by Zheng Yan 6 months ago

  • Needs Doc set to No

Check the cluster log and find lines like "client.xxx failing to respond to capability release". client.xxx is the client that causes the problem.

#15 Updated by Henrik Korkuc 6 months ago

It looks like this time I got a read request to an OSD stuck for 2 days. The client is 0.94.9.

client cache: ceph-post-file: cd539449-cca1-4b36-bf32-54f0353c0815
mds cache: ceph-post-file: 477d14b0-dcc9-4291-a7fd-fe654483b6b8

The cluster does not have stale/down/missing PGs, but there was PG splitting/rebalancing done some time ago (I think the splitting was before that read request).
"ceph pg 1.1ab4db54 query" says "Error ENOENT: i don't have pgid 1.1ab4db54"

#16 Updated by Zheng Yan 6 months ago

Henrik Korkuc wrote:

It looks like this time I got a read request to an OSD stuck for 2 days. The client is 0.94.9.

client cache: ceph-post-file: cd539449-cca1-4b36-bf32-54f0353c0815
mds cache: ceph-post-file: 477d14b0-dcc9-4291-a7fd-fe654483b6b8

The cluster does not have stale/down/missing PGs, but there was PG splitting/rebalancing done some time ago (I think the splitting was before that read request).
"ceph pg 1.1ab4db54 query" says "Error ENOENT: i don't have pgid 1.1ab4db54"

This is an Objecter issue; please open a new ticket (Category: librados).

#17 Updated by Henrik Korkuc 6 months ago

I created #17413 for that. Still waiting for this ticket's issue to pop up again.

#18 Updated by Henrik Korkuc 5 months ago

Got another long-blocked getattr, but this time the client has long-running objecter requests. I will be filing a ticket for that soon. I think we can assume the patch helped with these problems.

#20 Updated by Zheng Yan 5 months ago

  • Status changed from Testing to Need Review

#21 Updated by John Spray 5 months ago

  • Status changed from Need Review to Pending Backport
  • Backport set to jewel

#22 Updated by Loic Dachary 5 months ago

  • Copied to Backport #17697: jewel: MDS long-time blocked ops. ceph-fuse locks up with getattr of file added

#23 Updated by Nathan Cutler about 2 months ago

  • Status changed from Pending Backport to Resolved
