Bug #17275
MDS long-time blocked ops. ceph-fuse locks up with getattr of file
Status: Closed
Description
On a 10.2.1 cluster we are seeing some blocked ceph-fuse metadata accesses to files (also reproduced with a 10.2.2 client).
I cannot reproduce the issue on demand, but it happens from time to time.
"mds dump_blocked_ops" shows a number of long-running ops, and the logs show slow requests, e.g. " 6 slow requests, 1 included below; oldest blocked for > 83360.920878 secs". Attaching some log files:
ceph-mds_log with debug_mds set to 20/20, grepped for "10000328fca\|client.7397637"
dump_blocked_ops
mds_cache.gz
The problem is not Jewel-specific; we had similar problems on a hammer cluster/client too.
Files
Updated by Henrik Korkuc over 7 years ago
- File ceph-mds_log ceph-mds_log added
- File dump_blocked_ops dump_blocked_ops added
- File mds_cache.xz mds_cache.xz added
Forgot to attach the files earlier; attaching now. Also recompressed mds_cache with xz to get it below 1 MB (a 50% smaller file compared to gz).
Updated by Zheng Yan over 7 years ago
please run the "dump_cache", "mds_requests" and "objecter_requests" admin-socket commands for client.7397637 and upload the outputs.
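For reference, these dumps come from the ceph-fuse client's admin socket. A sketch of the invocations, assuming a default socket path under /var/run/ceph/ (the exact .asok filename is an assumption; check your client's admin_socket setting):

```shell
# Query the ceph-fuse admin socket for the state requested above.
# Adjust SOCK to the actual socket of the hung client; the name below
# is illustrative, not taken from this cluster.
SOCK=/var/run/ceph/ceph-client.7397637.asok

ceph --admin-daemon "$SOCK" dump_cache        > dump_cache
ceph --admin-daemon "$SOCK" mds_requests      > mds_requests
ceph --admin-daemon "$SOCK" objecter_requests > objecter_requests
```

The same pattern works against an MDS socket for "dump_blocked_ops".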
Updated by Henrik Korkuc over 7 years ago
- File dump_cache dump_cache added
- File mds_requests mds_requests added
- File objecter_requests objecter_requests added
I had already killed that ceph-fuse, so these are from another ceph-fuse that hung. Attaching the outputs from that one.
Updated by Zheng Yan over 7 years ago
the client causing the problem is client.2816210; please dump its cache
Updated by Henrik Korkuc over 7 years ago
- File dump_cache_2816210 dump_cache_2816210 added
Attaching.
By the way, as you'll probably notice, the 2016-09-11-170046 directory is in a different path in this dump. It was moved to "moved" after it started causing problems. For directories in this "broken" state, a move takes minutes.
Updated by Zheng Yan over 7 years ago
It seems there is a blocked write in client.2816210. Could you use gdb to attach to that ceph-fuse, run "thread apply all bt" in gdb, and upload the output?
Updated by Zheng Yan over 7 years ago
Thread 835 (Thread 0x7f4c937fe700 (LWP 18496)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f4cbcfeb81d in Wait (mutex=..., this=0x7f4cc7e04a68) at ./common/Cond.h:56
#2  ObjectCacher::maybe_wait_for_writeback (this=this@entry=0x7f4cc7e046b0, len=len@entry=8192) at osdc/ObjectCacher.cc:1660
#3  0x00007f4cbcffafd3 in ObjectCacher::_wait_for_write (this=this@entry=0x7f4cc7e046b0, wr=wr@entry=0x7f49901e0c50, len=len@entry=8192, oset=oset@entry=0x7f4ca009cf48, onfreespace=onfreespace@entry=0x0) at osdc/ObjectCacher.cc:1682
#4  0x00007f4cbd000c29 in ObjectCacher::writex (this=this@entry=0x7f4cc7e046b0, wr=wr@entry=0x7f49901e0c50, oset=oset@entry=0x7f4ca009cf48, onfreespace=onfreespace@entry=0x0) at osdc/ObjectCacher.cc:1626
#5  0x00007f4cbcf8fb01 in file_write (flags=0, mtime=..., bl=..., len=<optimized out>, offset=671285764096, snapc=..., layout=0x7f4ca009cc28, oset=0x7f4ca009cf48, this=0x7f4cc7e046b0) at ./osdc/ObjectCacher.h:766
#6  Client::_write (this=this@entry=0x7f4cc7e01180, f=f@entry=0x7f4c7c024ea0, offset=offset@entry=671285764096, size=size@entry=8192, buf=buf@entry=0x7f4cc7e117c0 "\267\n\207\272\207\037\257\350\367\274\203\064\373|E\324\063V\244G\031\211MRa\035\215\327+\036<\373\204\370h\"\005\002!]\276\206Yݿ\252\351[\364\211\262\036\232\354\022}", iov=iov@entry=0x0, iovcnt=0) at client/Client.cc:8469
#7  0x00007f4cbcf90c67 in Client::ll_write (this=0x7f4cc7e01180, fh=0x7f4c7c024ea0, off=off@entry=671285764096, len=len@entry=8192, data=data@entry=0x7f4cc7e117c0 "\267\n\207\272\207\037\257\350\367\274\203\064\373|E\324\063V\244G\031\211MRa\035\215\327+\036<\373\204\370h\"\005\002!]\276\206Yݿ\252\351[\364\211\262\036\232\354\022}") at client/Client.cc:11456
#8  0x00007f4cbcf41596 in fuse_ll_write (req=0x7f4c8c0b7280, ino=<optimized out>, buf=0x7f4cc7e117c0 "\267\n\207\272\207\037\257\350\367\274\203\064\373|E\324\063V\244G\031\211MRa\035\215\327+\036<\373\204\370h\"\005\002!]\276\206Yݿ\252\351[\364\211\262\036\232\354\022}", size=8192, off=671285764096, fi=0x7f4c937fd5c0) at client/fuse_ll.cc:503
#9  0x00007f4cbc968fe4 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#10 0x00007f4cbc969895 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#11 0x00007f4cbc966394 in ?? () from /lib/x86_64-linux-gnu/libfuse.so.2
#12 0x00007f4cbc31f0ca in start_thread (arg=0x7f4c937fe700) at pthread_create.c:309
#13 0x00007f4cbac868ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
This thread is blocked, which prevents the MDS from serving other clients' requests. Please run the objecter_requests command for client.2816210 and check whether there are hung OSD requests.
Updated by Henrik Korkuc over 7 years ago
{
"ops": [],
"linger_ops": [],
"pool_ops": [],
"pool_stat_ops": [],
"statfs_ops": [],
"command_ops": []
}
Updated by Zheng Yan over 7 years ago
This is strange. Could you do some interactive debugging (through xchat)?
Updated by Zheng Yan over 7 years ago
- Status changed from New to 7
It seems ObjectCacher::maybe_wait_for_writeback() doesn't get woken up properly. Maybe the following patch fixes it:
diff --git a/src/osdc/ObjectCacher.cc b/src/osdc/ObjectCacher.cc
index e18de8c..ca58eb2 100644
--- a/src/osdc/ObjectCacher.cc
+++ b/src/osdc/ObjectCacher.cc
@@ -2522,6 +2522,8 @@ void ObjectCacher::bh_stat_sub(BufferHead *bh)
   default:
     assert(0 == "bh_stat_sub: invalid bufferhead state");
   }
+  if (get_stat_dirty_waiting() > 0)
+    stat_cond.Signal();
 }
 
 void ObjectCacher::bh_set_state(BufferHead *bh, int s)
Updated by Henrik Korkuc over 7 years ago
hey Zheng,
can you give me instructions on how to trace which client is blocking the ops? client.2816210 in this case.
I have some blocked ops, but I am not sure whether the clients involved are running the new or the old version.
Updated by Zheng Yan over 7 years ago
Check the cluster log for lines like "client.xxx failing to respond to capability release". That client.xxx is the one causing the problem.
Updated by Henrik Korkuc over 7 years ago
- File client.27841966_gdb.txt client.27841966_gdb.txt added
- File client.27841966_objecter_requests client.27841966_objecter_requests added
- File mds-dump_blocked_ops mds-dump_blocked_ops added
It looks like this time I got a read request to an OSD that has been stuck for 2 days. The client is 0.94.9.
client cache: ceph-post-file: cd539449-cca1-4b36-bf32-54f0353c0815
mds cache: ceph-post-file: 477d14b0-dcc9-4291-a7fd-fe654483b6b8
The cluster does not have stale/down/missing PGs, but PG splitting/rebalancing was done some time ago (I think the splitting happened before that read request).
ceph pg 1.1ab4db54 query says "Error ENOENT: i don't have pgid 1.1ab4db54"
Updated by Zheng Yan over 7 years ago
Henrik Korkuc wrote:
it looks like this time I got a read request stuck for 2 days from OSD. Client is 0.94.9
client cache: ceph-post-file: cd539449-cca1-4b36-bf32-54f0353c0815
mds cache: ceph-post-file: 477d14b0-dcc9-4291-a7fd-fe654483b6b8
cluster does not have stale/down/missing pgs, but there was pg splitting/rebalancing done some time ago (I think splitting was before that read request).
ceph pg 1.1ab4db54 query says "Error ENOENT: i don't have pgid 1.1ab4db54"
This is an objecter issue; please open a new ticket (Category: librados).
Updated by Henrik Korkuc over 7 years ago
I created #17413 for that. Still waiting for this ticket's issue to pop up again.
Updated by Henrik Korkuc over 7 years ago
Got a long-blocked getattr, but this time the client has long-running objecter requests; I will be filing a ticket for that soon. I think we can assume the patch helped with these problems.
Updated by Henrik Korkuc over 7 years ago
Updated by Zheng Yan over 7 years ago
- Status changed from 7 to Fix Under Review
Updated by John Spray over 7 years ago
- Status changed from Fix Under Review to Pending Backport
- Backport set to jewel
Updated by Loïc Dachary over 7 years ago
- Copied to Backport #17697: jewel: MDS long-time blocked ops. ceph-fuse locks up with getattr of file added
Updated by Nathan Cutler about 7 years ago
- Status changed from Pending Backport to Resolved