Bug #1472

closed

cfuse hangs with v0.34

Added by Sam Lang over 12 years ago. Updated over 7 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%


Description

I'm seeing cfuse hang at what appear to be random points (on random requests to the servers). Here are backtraces from some of the hung cfuse processes:

hang 1:

Thread 1 (Thread 0x7ff70921b760 (LWP 18813)):
#0 0x00007ff708bc3bac in pthread_cond_wait@@GLIBC_2.3.2 ()
from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00000000006a4c91 in Cond::Wait (this=0x7fff0ab9ada0, mutex=...)
at ../../src/common/Cond.h:46
#2 0x0000000000691956 in Client::_read_async (this=0x11b5380, f=0x4b0cb90,
off=25079808, len=131072, bl=0x7fff0ab9af50)
at ../../src/client/Client.cc:5165
#3 0x0000000000690a42 in Client::_read (this=0x11b5380, f=0x4b0cb90,
offset=25079808, size=131072, bl=0x7fff0ab9af50)
at ../../src/client/Client.cc:5056
#4 0x00000000006a0045 in Client::ll_read (this=0x11b5380, fh=0x4b0cb90,
off=25079808, len=131072, bl=0x7fff0ab9af50)
at ../../src/client/Client.cc:6723
#5 0x0000000000660cd3 in ceph_ll_read (req=0x22b7a80, ino=1099511628151,
size=131072, off=25079808, fi=0x7fff0ab9afd0)
at ../../src/client/fuse_ll.cc:339
#6 0x00007ff708de8d0e in ?? () from /lib/libfuse.so.2
#7 0x00007ff708de6cb5 in fuse_session_loop () from /lib/libfuse.so.2
#8 0x0000000000661ab9 in ceph_fuse_ll_main (c=0x11b5380, argc=4,
argv=0x11bf000, fd=5) at ../../src/client/fuse_ll.cc:610
#9 0x00000000006584c4 in main (argc=4, argv=0x11bf000, envp=0x11af000)
at ../../src/cfuse.cc:134

hang 2:

#0 0x00007fb1cf155bac in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00000000006a4c91 in Cond::Wait (this=0x7fff617d78e0, mutex=...) at ../../src/common/Cond.h:46
#2 0x000000000066f58a in Client::make_request (this=0x1dff380, request=0x1ead600, uid=1005, gid=1006, ptarget=0x0,
use_mds=-1, pdirbl=0x0) at ../../src/client/Client.cc:1081
#3 0x0000000000687df0 in Client::_getattr (this=0x1dff380, in=0x1eafb00, mask=341, uid=1005, gid=1006)
at ../../src/client/Client.cc:3904
#4 0x0000000000696eb4 in Client::ll_getattr (this=0x1dff380, vino=..., attr=0x7fff617d7ad0, uid=1005, gid=1006)
at ../../src/client/Client.cc:5857
#5 0x000000000065f9d2 in ceph_ll_getattr (req=0x1e13080, ino=1099511628120, fi=0x0) at ../../src/client/fuse_ll.cc:130
#6 0x00007fb1cf37b085 in ?? () from /lib/libfuse.so.2
#7 0x00007fb1cf378cb5 in fuse_session_loop () from /lib/libfuse.so.2
#8 0x0000000000661ab9 in ceph_fuse_ll_main (c=0x1dff380, argc=4, argv=0x1e09000, fd=5) at ../../src/client/fuse_ll.cc:610
#9 0x00000000006584c4 in main (argc=4, argv=0x1e09000, envp=0x1df9000) at ../../src/cfuse.cc:134

hang 3:

#0 0x00007f1fea859bac in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00000000006a4c91 in Cond::Wait (this=0x7fff55c46880, mutex=...) at ../../src/common/Cond.h:46
#2 0x000000000067a183 in Client::wait_on_list (this=0x19e0380, ls=...) at ../../src/client/Client.cc:2305
#3 0x00000000006773a6 in Client::get_caps (this=0x19e0380, in=0x4983000, need=2048, want=1024, got=0x7fff55c46a44, endoff=-1)
at ../../src/client/Client.cc:1973
#4 0x00000000006909d8 in Client::_read (this=0x19e0380, f=0x58fadc0, offset=0, size=4096, bl=0x7fff55c46b20)
at ../../src/client/Client.cc:5044
#5 0x00000000006a0045 in Client::ll_read (this=0x19e0380, fh=0x58fadc0, off=0, len=4096, bl=0x7fff55c46b20)
at ../../src/client/Client.cc:6723
#6 0x0000000000660cd3 in ceph_ll_read (req=0xd6b0900, ino=1099511636749, size=4096, off=0, fi=0x7fff55c46ba0)
at ../../src/client/fuse_ll.cc:339
#7 0x00007f1feaa7ed0e in ?? () from /lib/libfuse.so.2
#8 0x00007f1feaa7ccb5 in fuse_session_loop () from /lib/libfuse.so.2
#9 0x0000000000661ab9 in ceph_fuse_ll_main (c=0x19e0380, argc=4, argv=0x19ea000, fd=5) at ../../src/client/fuse_ll.cc:610
#10 0x00000000006584c4 in main (argc=4, argv=0x19ea000, envp=0x19da000) at ../../src/cfuse.cc:134

Also, I'm not sure whether this is related, but I see quite a few ms_handle_reset messages in the client logs:

2011-08-31 13:28:36.958444 7f2548aeb700 client4193 ms_handle_reset on 192.168.101.11:6812/8154
2011-08-31 13:28:36.958574 7f2548aeb700 client4193 ms_handle_reset on 192.168.101.15:6801/27154
2011-08-31 13:28:37.005442 7f2548aeb700 client4193 ms_handle_reset on 192.168.101.13:6813/15849
2011-08-31 13:30:13.639193 7f2548aeb700 client4193 ms_handle_reset on 192.168.101.15:6805/27238
2011-08-31 13:30:28.158462 7f2548aeb700 client4193 ms_handle_reset on 192.168.101.12:6801/23779
2011-08-31 13:30:33.408491 7f2548aeb700 client4193 ms_handle_reset on 192.168.101.11:6808/7922
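
To check whether the resets cluster on particular servers, the log lines above can be tallied per peer address. The following is a minimal sketch, not part of Ceph: `count_resets` and `RESET_RE` are hypothetical names, and the regular expression assumes every line has the same shape as the six quoted above (`ms_handle_reset on <ip>:<port>/<nonce>`).

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpt above:
#   <date> <time> <thread id> <client> ms_handle_reset on <ip>:<port>/<nonce>
RESET_RE = re.compile(r"ms_handle_reset on (\d+\.\d+\.\d+\.\d+):(\d+)/(\d+)")


def count_resets(lines):
    """Count ms_handle_reset events per peer IP address."""
    counts = Counter()
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# The six log lines quoted in this report.
sample = [
    "2011-08-31 13:28:36.958444 7f2548aeb700 client4193 ms_handle_reset on 192.168.101.11:6812/8154",
    "2011-08-31 13:28:36.958574 7f2548aeb700 client4193 ms_handle_reset on 192.168.101.15:6801/27154",
    "2011-08-31 13:28:37.005442 7f2548aeb700 client4193 ms_handle_reset on 192.168.101.13:6813/15849",
    "2011-08-31 13:30:13.639193 7f2548aeb700 client4193 ms_handle_reset on 192.168.101.15:6805/27238",
    "2011-08-31 13:30:28.158462 7f2548aeb700 client4193 ms_handle_reset on 192.168.101.12:6801/23779",
    "2011-08-31 13:30:33.408491 7f2548aeb700 client4193 ms_handle_reset on 192.168.101.11:6808/7922",
]

if __name__ == "__main__":
    for host, n in sorted(count_resets(sample).items()):
        print(host, n)
```

On this excerpt the resets are spread across four hosts (192.168.101.11 through .15, with .14 absent), which by itself does not point at a single misbehaving server. Running the same tally over the full client log would give a better picture.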
