Bug #23883: kclient: CephFS kernel client hang - CephFS - Ceph

Actions

Copy link

Bug #23883

open

kclient: CephFS kernel client hang

Added by wei jin almost 6 years ago. Updated about 5 years ago.

Status:

New

Priority:

Normal

Assignee:

Category:

Target version:

% Done:

Source:

Community (user)

Tags:

Backport:

Regression:

Severity:

2 - major

Reviewed:

Affected Versions:

ceph-qa-suite:

Component(FS):

kceph

Labels (FS):

multimds

Pull request ID:

Crash signature (v1):

Crash signature (v2):

Description

ceph: 12.2.4/12.2.5
os: debian jessie
kernel: 4.9/4.4

After restart all mds(6 in total, 5 active, 1 standby), client will hang at 'stat' operation according to strace. (test command: df, lsof...). It is very easy to trigger this issue if we do 'ls -lR' with a huge directory which includes million files before we restart all mds.

dmesg shows that connection was denied and I did not run into this issue with Jewel. This is very critical because many commands will hang on the client machine, and seems reboot is the only way to rescue.

I am not sure whether this patch(https://github.com/ceph/ceph/pull/21592) will fix this issue too.
or
It is related to authentication bug #110131

[164426.867436] libceph: mds4 10.20.52.159:6800 socket closed (con state OPEN)
[164427.871075] libceph: mds0 10.20.52.160:6800 socket closed (con state OPEN)
[164431.274627] libceph: mds0 10.20.52.160:6800 socket closed (con state CONNECTING)
[164432.042622] libceph: mds0 10.20.52.160:6800 socket closed (con state CONNECTING)
[164433.034613] libceph: mds0 10.20.52.160:6800 socket closed (con state CONNECTING)
[164435.050590] libceph: mds0 10.20.52.160:6800 socket closed (con state CONNECTING)
[164439.213001] libceph: wrong peer, want 10.20.52.160:6800/-713929449, got 10.20.52.160:6800/1282156498
[164439.213003] libceph: mds0 10.20.52.160:6800 wrong peer at address
[164479.144207] ceph: mds0 caps stale
[164479.144209] ceph: mds4 caps stale
[164488.007687] ceph: mds4 reconnect start
[164636.153486] ceph: mds0 reconnect start
[164647.877800] ceph: mds4 recovery completed
[164647.877807] ceph: mds0 recovery completed
[164647.878088] libceph: mon0 10.20.63.194:6789 session lost, hunting for new mon
[164647.887724] ceph: mds0 reconnect denied
[164647.887742] ceph: mds4 reconnect denied
[164647.890247] libceph: mon4 10.20.64.66:6789 session established
[164654.634241] libceph: mds4 10.20.52.155:6800 socket closed (con state NEGOTIATING)
[165547.897829] libceph: mds2 10.20.52.158:6800 socket closed (con state OPEN)
[165555.605291] libceph: mds4 10.20.52.155:6800 socket closed (con state OPEN)
[166274.096654] libceph: mds1 10.20.52.156:6800 socket closed (con state OPEN)
[166469.161566] libceph: mds3 10.20.52.159:6800 socket closed (con state OPEN)

Actions

Copy link

Updated by Patrick Donnelly almost 6 years ago

Subject changed from CephFS kernel client hang to kclient: CephFS kernel client hang
Priority changed from Normal to High
Target version set to v13.0.0
Source set to Community (user)
Severity changed from 3 - minor to 2 - major
Component(FS) kceph added
Labels (FS) multimds added

Actions

Copy link

Updated by Zheng Yan almost 6 years ago

need mds log to check what happened

Actions

Copy link

Updated by Zheng Yan almost 6 years ago

Component(FS) Common/Protocol added

besides, 4.4/4.9 kernel is too old for using multimds.

Actions

Copy link

Updated by Zheng Yan almost 6 years ago

Component(FS) deleted (~~Common/Protocol~~)

Actions

Copy link

Updated by wei jin almost 6 years ago

Zheng Yan wrote:

besides, 4.4/4.9 kernel is too old for using multimds.

It is very difficult to upgrade kernel versions for online machine.
I exported every user's directory to different ranks when using mulitmds, is it possible for old kernel?

And I reproduced this issue just now by using one active mds, all other five mds are standby.

Actions

Copy link

Updated by Zheng Yan almost 6 years ago

please upload mds log

Actions

Copy link

Updated by wei jin almost 6 years ago

Zheng Yan wrote:

please upload mds log

which level?
after setting debug_mds = 20 and debug_ms = 1, log file is about 500MB in one minute.

Actions

Copy link

Updated by Zheng Yan almost 6 years ago

debug_mds = 10. only for period that mds is recovering

Actions

Copy link

Updated by wei jin almost 6 years ago

2018-04-28 17:58:57.506825 7f2dcf726700 10 mds.0.journal EMetaBlob.replay for [2,head] had [inode 0x100067aa9dd [...2,head] /jinwei_test/tmp/n20-064-084/n20-064-084_1085/ auth v136880 f(v0 m2
018-03-28 00:07:43.390546 1=0+1) n(v15 rc2018-03-28 00:08:23.687127 b9825152 100002=100000+2) (iversion lock) | dirfrag=1 0x56101da4d800]
2018-04-28 17:58:57.506835 7f2dcf726700 10 mds.0.log _replay - complete, 29900 events
2018-04-28 17:58:57.506837 7f2dcf726700 10 mds.0.log _replay_thread kicking waiters
2018-04-28 17:58:57.506841 7f2dcf726700 10 MDSInternalContextBase::complete: 15C_MDS_BootStart
2018-04-28 17:58:57.506844 7f2dcf726700  1 mds.0.140006 replay_done
2018-04-28 17:58:57.506846 7f2dcf726700  1 mds.0.140006 making mds journal writeable
2018-04-28 17:58:57.507025 7f2dcf726700  2 mds.0.140006 i am alone, moving to state reconnect
2018-04-28 17:58:57.507030 7f2dcf726700  3 mds.0.140006 request_state up:reconnect
2018-04-28 17:58:57.507031 7f2dcf726700 10 mds.beacon.n20-052-155 set_want_state: up:replay -> up:reconnect
2018-04-28 17:58:57.507041 7f2dcf726700 10 mds.beacon.n20-052-155 _send up:reconnect seq 3
2018-04-28 17:58:57.507054 7f2dcf726700 10 mds.0.log _replay_thread finish
2018-04-28 17:58:58.365873 7f2dd9ecb700 10 mds.client.admin  existing session 0x561016b28d80 for client.2902113 10.20.63.195:0/1496633167 existing con 0, new/authorizing con 0x561026ef0800
2018-04-28 17:58:58.365888 7f2dd9ecb700 10 mds.client.admin ms_verify_authorizer: parsing auth_cap_str='allow *'
2018-04-28 17:58:58.365939 7f2dd6f35700 10 mds.n20-052-155 ms_handle_accept 10.20.63.195:0/1496633167 con 0x561026ef0800 session 0x561016b28d80
2018-04-28 17:58:58.365952 7f2dd6f35700 10 mds.n20-052-155  session connection 0 -> 0x561026ef0800
2018-04-28 17:58:58.365973 7f2dd6f35700  5 mds.n20-052-155 handle_mds_map epoch 140007 from mon.1
2018-04-28 17:58:58.366014 7f2dd6f35700 10 mds.n20-052-155      my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in
separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2}
2018-04-28 17:58:58.366021 7f2dd6f35700 10 mds.n20-052-155  mdsmap compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in
separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2}
2018-04-28 17:58:58.366025 7f2dd6f35700 10 mds.n20-052-155 map says I am 10.20.52.155:6800/1926089105 mds.0.140006 state up:reconnect
2018-04-28 17:58:58.366032 7f2dd6f35700 10 mds.n20-052-155 handle_mds_map: handling map as rank 0
2018-04-28 17:58:58.366034 7f2dd6f35700  1 mds.0.140006 handle_mds_map i am now mds.0.140006
2018-04-28 17:58:58.366036 7f2dd6f35700  1 mds.0.140006 handle_mds_map state change up:replay --> up:reconnect
2018-04-28 17:58:58.366047 7f2dd6f35700  1 mds.0.140006 reconnect_start
2018-04-28 17:58:58.366050 7f2dd6f35700  1 mds.0.140006 reopen_log
2018-04-28 17:58:58.366051 7f2dd6f35700 10 mds.0.cache rollback_uncommitted_fragments: 0 pending
2018-04-28 17:58:58.366058 7f2dd6f35700 10 mds.0.server apply_blacklist: killed 0
2018-04-28 17:58:58.366060 7f2dd6f35700  4 mds.0.140006 reconnect_start: killed 0 blacklisted sessions (5 blacklist entries, 2)
2018-04-28 17:58:58.366064 7f2dd6f35700  1 mds.0.server reconnect_clients -- 2 sessions
2018-04-28 17:58:58.366066 7f2dd6f35700 10 mds.0.sessionmap dump
2018-04-28 17:58:58.366067 7f2dd6f35700 10 mds.0.sessionmap client.3193954 0x561016b27a00 state open completed {} prealloc_inos [] used_inos []
2018-04-28 17:58:58.366070 7f2dd6f35700 10 mds.0.sessionmap client.2902113 0x561016b28d80 state open completed {} prealloc_inos [] used_inos []
2018-04-28 17:58:58.366105 7f2dd6f35700  7 mds.0.server handle_client_reconnect client.2902113
2018-04-28 17:58:58.366110 7f2dd6f35700 10 mds.0.server  reconnect_start 2018-04-28 17:58:58.366063 delay 0.000045
2018-04-28 17:58:58.366145 7f2dd6f35700  0 log_channel(cluster) log [DBG] : reconnect by client.2902113 10.20.63.195:0/1496633167 after 0.000045
2018-04-28 17:58:58.366151 7f2dd6f35700 10  mds.0.cache.snaprealm(0x1 seq 1 0x56101693f840) have_past_parents_open [1,head]
2018-04-28 17:58:58.366154 7f2dd6f35700 10  mds.0.cache.snaprealm(0x1 seq 1 0x56101693f840) build_snap_set [0,head] on snaprealm(0x1 seq 1 lc 0 cr 0 cps 1 snaps={} 0x56101693f840)
2018-04-28 17:58:58.366158 7f2dd6f35700 10  mds.0.cache.snaprealm(0x1 seq 1 0x56101693f840) build_snap_trace my_snaps []
2018-04-28 17:58:58.366161 7f2dd6f35700 10  mds.0.cache.snaprealm(0x1 seq 1 0x56101693f840) check_cache rebuilt  seq 1 cached_seq 1 cached_last_created 0 cached_last_destroyed 0)
2018-04-28 17:58:58.366164 7f2dd6f35700 10 mds.0.cache finish_snaprealm_reconnect client.2902113 up to date on snaprealm(0x1 seq 1 lc 0 cr 0 cps 1 snaps={} 0x56101693f840)
2018-04-28 17:58:58.366169 7f2dd6f35700 10 mds.0.cache.ino(0x1) add_client_cap first cap, joining realm snaprealm(0x1 seq 1 lc 0 cr 0 cps 1 snaps={} 0x56101693f840)
2018-04-28 17:58:58.366723 7f2dd6f35700 10 mds.beacon.n20-052-155 handle_mds_beacon up:reconnect seq 3 rtt 0.859677
2018-04-28 17:59:07.045243 7f2dd6f35700  3 mds.0.server handle_client_session client_session(request_renewcaps seq 4335) from client.2902113
2018-04-28 17:59:07.045267 7f2dd6f35700 10 mds.0.sessionmap touch_session s=0x561016b28d80 name=client.2902113
2018-04-28 17:59:07.507095 7f2dd3f2f700 10 mds.beacon.n20-052-155 _send up:reconnect seq 4
2018-04-28 17:59:07.507566 7f2dd6f35700 10 mds.beacon.n20-052-155 handle_mds_beacon up:reconnect seq 4 rtt 0.000460
2018-04-28 17:59:17.507143 7f2dd3f2f700 10 mds.beacon.n20-052-155 _send up:reconnect seq 5
2018-04-28 17:59:17.507759 7f2dd6f35700 10 mds.beacon.n20-052-155 handle_mds_beacon up:reconnect seq 5 rtt 0.000605
2018-04-28 17:59:27.045248 7f2dd6f35700  3 mds.0.server handle_client_session client_session(request_renewcaps seq 4336) from client.2902113
2018-04-28 17:59:27.045271 7f2dd6f35700 10 mds.0.sessionmap touch_session s=0x561016b28d80 name=client.2902113
2018-04-28 17:59:27.507195 7f2dd3f2f700 10 mds.beacon.n20-052-155 _send up:reconnect seq 6
2018-04-28 17:59:27.507694 7f2dd6f35700 10 mds.beacon.n20-052-155 handle_mds_beacon up:reconnect seq 6 rtt 0.000488
2018-04-28 17:59:37.507244 7f2dd3f2f700 10 mds.beacon.n20-052-155 _send up:reconnect seq 7
2018-04-28 17:59:37.507748 7f2dd6f35700 10 mds.beacon.n20-052-155 handle_mds_beacon up:reconnect seq 7 rtt 0.000492
2018-04-28 17:59:47.045236 7f2dd6f35700  3 mds.0.server handle_client_session client_session(request_renewcaps seq 4337) from client.2902113
2018-04-28 17:59:47.045261 7f2dd6f35700 10 mds.0.sessionmap touch_session s=0x561016b28d80 name=client.2902113
2018-04-28 17:59:47.507299 7f2dd3f2f700 10 mds.beacon.n20-052-155 _send up:reconnect seq 8
2018-04-28 17:59:47.507868 7f2dd6f35700 10 mds.beacon.n20-052-155 handle_mds_beacon up:reconnect seq 8 rtt 0.000556
2018-04-28 17:59:57.507356 7f2dd3f2f700 10 mds.beacon.n20-052-155 _send up:reconnect seq 9
2018-04-28 17:59:57.507858 7f2dd6f35700 10 mds.beacon.n20-052-155 handle_mds_beacon up:reconnect seq 9 rtt 0.000491
2018-04-28 18:00:07.045267 7f2dd6f35700  3 mds.0.server handle_client_session client_session(request_renewcaps seq 4338) from client.2902113
2018-04-28 18:00:07.045290 7f2dd6f35700 10 mds.0.sessionmap touch_session s=0x561016b28d80 name=client.2902113
2018-04-28 18:00:07.507407 7f2dd3f2f700 10 mds.beacon.n20-052-155 _send up:reconnect seq 10
2018-04-28 18:00:07.507902 7f2dd6f35700 10 mds.beacon.n20-052-155 handle_mds_beacon up:reconnect seq 10 rtt 0.000484
2018-04-28 18:00:17.507466 7f2dd3f2f700 10 mds.beacon.n20-052-155 _send up:reconnect seq 11
2018-04-28 18:00:17.508056 7f2dd6f35700 10 mds.beacon.n20-052-155 handle_mds_beacon up:reconnect seq 11 rtt 0.000579
2018-04-28 18:00:27.045255 7f2dd6f35700  3 mds.0.server handle_client_session client_session(request_renewcaps seq 4339) from client.2902113
2018-04-28 18:00:27.045278 7f2dd6f35700 10 mds.0.sessionmap touch_session s=0x561016b28d80 name=client.2902113
2018-04-28 18:00:27.507531 7f2dd3f2f700 10 mds.beacon.n20-052-155 _send up:reconnect seq 12
2018-04-28 18:00:27.508062 7f2dd6f35700 10 mds.beacon.n20-052-155 handle_mds_beacon up:reconnect seq 12 rtt 0.000516
2018-04-28 18:00:37.507588 7f2dd3f2f700 10 mds.beacon.n20-052-155 _send up:reconnect seq 13
2018-04-28 18:00:37.508095 7f2dd6f35700 10 mds.beacon.n20-052-155 handle_mds_beacon up:reconnect seq 13 rtt 0.000496
2018-04-28 18:00:41.637045 7f2dd4730700 10 mds.0.server reconnect timed out
2018-04-28 18:00:41.637051 7f2dd4730700  1 mds.0.server reconnect gave up on client.3193954 10.8.147.33:0/2305264606
2018-04-28 18:00:41.637095 7f2dd4730700  0 log_channel(cluster) log [WRN] : evicting unresponsive client n8-147-033: (3193954), after waiting 100 seconds during MDS startup
2018-04-28 18:00:41.637105 7f2dd4730700  4 mds.0.140006 Preparing blacklist command... (wait=0)
2018-04-28 18:00:41.637124 7f2dd4730700  4 mds.0.140006 Sending mon blacklist command: {"prefix":"osd blacklist", "blacklistop":"add","addr":"10.8.147.33:0/2305264606"}
2018-04-28 18:00:41.637139 7f2dd4730700  1 mds.0.server reconnect will complete once clients are evicted
2018-04-28 18:00:41.704825 7f2dd6f35700  7 mds.0.server operator(): full = 0 epoch = 63658
2018-04-28 18:00:41.704834 7f2dd6f35700  4 mds.0.140006 handle_osd_map epoch 63658, 1 new blacklist entries
2018-04-28 18:00:41.704838 7f2dd6f35700 10 mds.0.server kill_session 0x561016b27a00
2018-04-28 18:00:41.704854 7f2dd6f35700 10 mds.0.server apply_blacklist: killed 1
2018-04-28 18:00:41.704859 7f2dd6f35700  4 mds.0.140006 set_osd_epoch_barrier: epoch=63658
2018-04-28 18:00:41.704868 7f2dcff27700  5 mds.0.log _submit_thread 5911330478902~97 : ESession client.3193954 10.8.147.33:0/2305264606 close cmapv 1107443677
2018-04-28 18:00:41.763626 7f2dd0f29700 10 MDSIOContextBase::complete: 20C_MDS_session_finish
2018-04-28 18:00:41.763632 7f2dd0f29700 10 mds.0.server _session_logged client.3193954 10.8.147.33:0/2305264606 state_seq 2 close 1107443677
2018-04-28 18:00:41.763640 7f2dd0f29700 10 mds.0.sessionmap remove_session s=0x561016b27a00 name=client.3193954
2018-04-28 18:00:41.787344 7f2dd0f29700  4 mds.0.140006 set_osd_epoch_barrier: epoch=63658
2018-04-28 18:00:41.787349 7f2dd0f29700  1 mds.0.140006 session 3193954 was removed while we waited for blacklist
2018-04-28 18:00:41.787352 7f2dd0f29700 10 MDSInternalContextBase::complete: 25MDSInternalContextWrapper
2018-04-28 18:00:41.787356 7f2dd0f29700  7 mds.0.server reconnect_gather_finish.  failed on 1 clients
2018-04-28 18:00:41.787358 7f2dd0f29700 10 MDSInternalContextBase::complete: 12C_MDS_VoidFn
2018-04-28 18:00:41.787359 7f2dd0f29700  1 mds.0.140006 reconnect_done
2018-04-28 18:00:41.787360 7f2dd0f29700  3 mds.0.140006 request_state up:rejoin
2018-04-28 18:00:41.787362 7f2dd0f29700 10 mds.beacon.n20-052-155 set_want_state: up:reconnect -> up:rejoin
2018-04-28 18:00:41.787368 7f2dd0f29700 10 mds.beacon.n20-052-155 _send up:rejoin seq 14
2018-04-28 18:00:42.704532 7f2dd6f35700  5 mds.n20-052-155 handle_mds_map epoch 140009 from mon.1
2018-04-28 18:00:42.704568 7f2dd6f35700 10 mds.n20-052-155      my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in
separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2}
2018-04-28 18:00:42.704573 7f2dd6f35700 10 mds.n20-052-155  mdsmap compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in
separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2}
2018-04-28 18:00:42.704576 7f2dd6f35700 10 mds.n20-052-155 map says I am 10.20.52.155:6800/1926089105 mds.0.140006 state up:rejoin
2018-04-28 18:00:42.704582 7f2dd6f35700 10 mds.n20-052-155 handle_mds_map: handling map as rank 0
2018-04-28 18:00:42.704584 7f2dd6f35700  1 mds.0.140006 handle_mds_map i am now mds.0.140006
2018-04-28 18:00:42.704586 7f2dd6f35700  1 mds.0.140006 handle_mds_map state change up:reconnect --> up:rejoin
2018-04-28 18:00:42.704596 7f2dd6f35700  1 mds.0.140006 rejoin_start
2018-04-28 18:00:42.704603 7f2dd6f35700 10 mds.0.cache rejoin_start
2018-04-28 18:00:42.704606 7f2dd6f35700 10 mds.0.cache process_imported_caps
2018-04-28 18:00:42.704607 7f2dd6f35700  7 mds.0.cache trim_non_auth
2018-04-28 18:00:42.728120 7f2dd6f35700 10 mds.0.cache |__ 0    auth [dir 0x100 ~mds0/ [2,head] auth v=200553 cv=0/0 dir_auth=0 state=1610612736 f(v0 10=0+10) n(v76 rc2018-04-27 15:49:01.4215
35 b355887 3361=3351+10) hs=3+0,ss=0+0 dirty=3 | child=1 subtree=1 subtreetemp=0 dirty=1 0x561016a18a00]
2018-04-28 18:00:42.728166 7f2dd6f35700 10 mds.0.cache |__ 0    auth [dir 0x1 / [2,head] auth v=749783 cv=0/0 dir_auth=0 state=1610612736 f(v0 m2018-04-26 15:20:11.253382 1=0+1) n(v41662 rc20
18-04-27 15:49:01.421535 b93269611600943 10417207935=10416998889+209046) hs=1+1,ss=0+0 dirty=2 | child=1 subtree=1 subtreetemp=0 dirty=1 0x561016a17c00]
2018-04-28 18:00:42.728403 7f2dd6f35700  1 mds.0.140006 rejoin_joint_start
2018-04-28 18:00:42.728407 7f2dd6f35700 10 mds.0.cache rejoin_send_rejoins with recovery_set
2018-04-28 18:00:42.728411 7f2dd6f35700 10 mds.0.cache nothing to rejoin
2018-04-28 18:00:42.728412 7f2dd6f35700 10 mds.0.cache rejoin_gather_finish
2018-04-28 18:00:42.728413 7f2dd6f35700 10 mds.0.cache open_undef_inodes_dirfrags 0 inodes 0 dirfrags
2018-04-28 18:00:42.728415 7f2dd6f35700 10 mds.0.cache process_imported_caps
2018-04-28 18:00:42.728417 7f2dd6f35700 10 mds.0.cache choose_lock_states_and_reconnect_caps
2018-04-28 18:00:42.728423 7f2dd6f35700 10  mds.0.cache.snaprealm(0x1 seq 1 0x56101693f840) have_past_parents_open [1,head]
2018-04-28 18:00:42.728426 7f2dd6f35700 10 mds.0.cache  have past snap parents for realm snaprealm(0x1 seq 1 lc 0 cr 0 cps 1 snaps={} 0x56101693f840) on [inode 0x1 [...2,head] / auth v339255
snaprealm=0x56101693f840 f(v0 m2018-04-26 15:20:11.253382 1=0+1) n(v41662 rc2018-04-27 15:49:01.421535 b93269611600943 10417207936=10416998889+209047)/n(v0 1=0+1) (inest sync dirty) (iversion
 lock) caps={2902113=pAsLsXsFs/-@0} | dirtyscattered=1 dirfrag=1 caps=1 dirty=1 0x561016a17500]
2018-04-28 18:00:42.728453 7f2dd6f35700 10  mds.0.cache.snaprealm(0x1 seq 1 0x56101693f840) have_past_parents_open [1,head]
2018-04-28 18:00:42.728455 7f2dd6f35700 10 mds.0.cache  have past snap parents for realm snaprealm(0x1 seq 1 lc 0 cr 0 cps 1 snaps={} 0x56101693f840) on [inode 0x1 [...2,head] / auth v339255
snaprealm=0x56101693f840 f(v0 m2018-04-26 15:20:11.253382 1=0+1) n(v41662 rc2018-04-27 15:49:01.421535 b93269611600943 10417207936=10416998889+209047)/n(v0 1=0+1) (inest sync dirty) (iversion
 lock) caps={2902113=pAsLsXsFs/-@0} | dirtyscattered=1 dirfrag=1 caps=1 dirty=1 0x561016a17500]
......

Actions

Copy link

#10

Updated by wei jin almost 6 years ago

client kernel dmesg:

[  818.110868] ceph: loaded (mds proto 32)
[  818.130534] libceph: mon4 10.20.64.66:6789 session established
[  818.149258] libceph: client3193954 fsid c3e505ba-e748-4043-a493-9cc74a215bcd
[ 2829.862812] libceph: mds0 10.20.52.155:6800 socket closed (con state OPEN)
[ 2847.113131] ceph: mds0 reconnect start
[ 2937.976098] ceph: mds0 caps stale
[ 2937.976335] libceph: mon4 10.20.64.66:6789 session lost, hunting for new mon
[ 2937.996316] ceph: mds0 reconnect success
[ 2937.998839] libceph: mon3 10.20.64.30:6789 session established
[ 2943.907730] ceph: mds0 recovery completed
[ 4929.770933] perf: interrupt took too long (2512 > 2500), lowering kernel.perf_event_max_sample_rate to 79500
[ 7204.949325] libceph: mds0 10.20.52.157:6800 socket closed (con state OPEN)
[ 7222.551954] ceph: mds0 reconnect start
[ 7247.320959] libceph: mon3 10.20.64.30:6789 session lost, hunting for new mon
[ 7376.090566] INFO: task kworker/0:1:313 blocked for more than 120 seconds.
[ 7376.090586]       Not tainted 4.9.0-0.bpo.3-amd64 #1 Debian 4.9.30-2+deb9u2~bpo8+1
[ 7376.090606] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 7376.090626] kworker/0:1     D    0   313      2 0x00000000
[ 7376.090648] Workqueue: events delayed_work [ceph]
[ 7376.090650]  ffff88532ef64800 0000000000000000 ffff883333301100 ffff883331e87080
[ 7376.090652]  ffff88333fc187c0 ffffb3e31962fd58 ffffffff8f202a4d 0000000000000000
[ 7376.090654]  ffff883333301100 000000008ecc339b ffff883331e87080 ffff883331e87080
[ 7376.090656] Call Trace:
[ 7376.090662]  [<ffffffff8f202a4d>] ? __schedule+0x23d/0x6d0
[ 7376.090664]  [<ffffffff8f202f12>] ? schedule+0x32/0x80
[ 7376.090665]  [<ffffffff8f20319a>] ? schedule_preempt_disabled+0xa/0x10
[ 7376.090667]  [<ffffffff8f204d64>] ? __mutex_lock_slowpath+0xb4/0x130
[ 7376.090669]  [<ffffffff8f204dfb>] ? mutex_lock+0x1b/0x30
[ 7376.090675]  [<ffffffffc0c86cf6>] ? delayed_work+0xf6/0x200 [ceph]
[ 7376.090677]  [<ffffffff8ec917cb>] ? process_one_work+0x14b/0x410
[ 7376.090679]  [<ffffffff8ec92285>] ? worker_thread+0x65/0x4a0
[ 7376.090680]  [<ffffffff8ec92220>] ? rescuer_thread+0x340/0x340
[ 7376.090681]  [<ffffffff8ec97580>] ? kthread+0xf0/0x110
[ 7376.090683]  [<ffffffff8ec2476b>] ? __switch_to+0x2bb/0x700
[ 7376.090685]  [<ffffffff8ec97490>] ? kthread_park+0x60/0x60
[ 7376.090686]  [<ffffffff8f207a35>] ? ret_from_fork+0x25/0x30
[ 7417.651326] ceph: mds0 caps stale
[ 7417.670181] ceph: mds0 reconnect denied
[ 7417.672496] libceph: mon0 10.20.63.194:6789 session established
[ 7417.682234] libceph: mds0 10.20.52.155:6800 socket closed (con state NEGOTIATING)

Actions

Copy link

#11

Updated by dongdong tao almost 6 years ago

Hi Wei,
this is a very interesting problem, from your description, i would like to share my thought:
this should be your client "10.8.147.33" carries too much (millions) file capabilities , 1 million caps is about 100MB, but this still can not explain why this client need to spend more than 100s to send all the caps. so i suspect a very slow network speed between your client(10.8.147.33) and mds(10.20.52.155)， looks like they are not in the same segment.

Actions

Copy link

#12

Updated by Jeff Layton almost 6 years ago

v4.9 is quite old at this point, so it would be helpful to know if this is something that has already been fixed in more recent kernels. Is this reproducible on a more recent kernel? Ideally, something around v4.16-ish ?

Actions

Copy link

#13