Bug #18798 (Closed)
FS activity hung, MDS reports client "failing to respond to capability release"
Description
I've had two occurrences in the past 3 weeks where filesystem activity hangs, with the MDS reporting a client "failing to respond to capability release". I'm using the kernel cephfs client. Ceph is version 10.2.5 and the kernel is version 4.9.0. I've since downgraded the kernel to 4.7.5, as I was running that kernel previously and did not have any issues (which could be just luck, or the bug was introduced somewhere between 4.7.5 and 4.9.0).
More details below:
[root@kb-ceph03 ~]# ceph status
    cluster 3bd52056-9ed8-4ff6-9359-6a3ea437d1e9
     health HEALTH_WARN
            mds0: Client kb-ceph03.knc.local: failing to respond to capability release
     monmap e2: 3 mons at {kb-ceph01=10.100.212.2:6789/0,kb-ceph02=10.100.212.3:6789/0,kb-ceph03=10.100.212.4:6789/0}
            election epoch 362, quorum 0,1,2 kb-ceph01,kb-ceph02,kb-ceph03
      fsmap e215: 1/1/1 up {0=kb-ceph03=up:active}, 2 up:standby
     osdmap e1293: 24 osds: 24 up, 24 in
            flags sortbitwise,require_jewel_osds
      pgmap v2612120: 1296 pgs, 4 pools, 5782 GB data, 1446 kobjects
            15314 GB used, 115 TB / 130 TB avail
                1296 active+clean
  client io 8127 kB/s wr, 0 op/s rd, 1 op/s wr

[root@kb-ceph03 ~]# ceph mds stat
e215: 1/1/1 up {0=kb-ceph03=up:active}, 2 up:standby

[root@kb-ceph03 ~]# ceph daemon mds.kb-ceph03 dump_ops_in_flight
{
    "ops": [
        {
            "description": "client_request(client.48718333:240331824 getattr pAsLsXsFs #1000000498c 2017-02-01 14:03:38.913579)",
            "initiated_at": "2017-02-01 14:03:38.914845",
            "age": 1446.955546,
            "duration": 1446.955708,
            "type_data": [
                "failed to rdlock, waiting",
                "client.48718333:240331824",
                "client_request",
                {
                    "client": "client.48718333",
                    "tid": 240331824
                },
                [
                    {
                        "time": "2017-02-01 14:03:38.914845",
                        "event": "initiated"
                    },
                    {
                        "time": "2017-02-01 14:03:38.914958",
                        "event": "failed to rdlock, waiting"
                    }
                ]
            ]
        },
        {
            "description": "client_request(client.48718333:240322739 getattr pAsLsXsFs #1000000498c 2017-02-01 14:01:27.049810)",
            "initiated_at": "2017-02-01 14:01:27.049678",
            "age": 1578.820714,
            "duration": 1578.820904,
            "type_data": [
                "failed to rdlock, waiting",
                "client.48718333:240322739",
                "client_request",
                {
                    "client": "client.48718333",
                    "tid": 240322739
                },
                [
                    {
                        "time": "2017-02-01 14:01:27.049678",
                        "event": "initiated"
                    },
                    {
                        "time": "2017-02-01 14:01:27.049869",
                        "event": "failed to rdlock, waiting"
                    }
                ]
            ]
        }
    ],
    "num_ops": 2
}

[root@kb-ceph03 ~]# ceph daemon mds.kb-ceph03 session ls
[
    {
        "id": 68523984,
        "num_leases": 1,
        "num_caps": 820,
        "state": "open",
        "replay_requests": 0,
        "completed_requests": 0,
        "reconnecting": false,
        "inst": "client.68523984 10.100.212.3:0\/2863784256",
        "client_metadata": {
            "entity_id": "admin",
            "hostname": "kb-ceph02.knc.local",
            "kernel_version": "4.9.0-1.el7.elrepo.x86_64",
            "root": "\/"
        }
    },
    {
        "id": 48805081,
        "num_leases": 0,
        "num_caps": 843,
        "state": "open",
        "replay_requests": 0,
        "completed_requests": 0,
        "reconnecting": false,
        "inst": "client.48805081 10.100.212.4:0\/1660598714",
        "client_metadata": {
            "entity_id": "admin",
            "hostname": "kb-ceph03.knc.local",
            "kernel_version": "4.9.0-1.el7.elrepo.x86_64",
            "root": "\/"
        }
    },
    {
        "id": 48718333,
        "num_leases": 0,
        "num_caps": 840,
        "state": "open",
        "replay_requests": 0,
        "completed_requests": 2,
        "reconnecting": false,
        "inst": "client.48718333 10.100.212.2:0\/2586033108",
        "client_metadata": {
            "entity_id": "admin",
            "hostname": "kb-ceph01.knc.local",
            "kernel_version": "4.9.0-1.el7.elrepo.x86_64",
            "root": "\/"
        }
    }
]
Updated by Zheng Yan about 7 years ago
Please run 'ceph daemon mds.kb-ceph03 dump cache /tmp/cachedump.0' and upload /tmp/cachedump.0. Besides, please check if there are any hung OSD requests in /sys/kernel/debug/ceph/xxxx/osdc.
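For reference, the two checks as commands (a sketch: the xxxx component under /sys/kernel/debug/ceph/ is the cluster FSID plus client instance ID, so a glob stands in for the exact name here, and debugfs must be mounted):

    # On the MDS host: dump the MDS cache to a file for upload
    ceph daemon mds.kb-ceph03 dump cache /tmp/cachedump.0

    # On each kernel client: list in-flight OSD requests; entries that
    # never drain from the REQUESTS section indicate a hung OSD request
    for f in /sys/kernel/debug/ceph/*/osdc; do echo "== $f =="; cat "$f"; done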
Updated by Zheng Yan about 7 years ago
probably fixed by https://github.com/ceph/ceph-client/commit/10a2699426a732cbf3fc9e835187e8b914f0c61cy
Updated by Darrell Enns about 7 years ago
Does the mds cache dump need to be done while it is hung? It's a production system, so I wasn't able to leave it in a hung state. I'd have to wait and see if it happens again. If a cache dump while operating normally is useful, I can grab one tomorrow.
Is there a typo in that GitHub URL? It's returning 404 for me.
Updated by Darrell Enns about 7 years ago
Cache dump is here:
It was too large to attach to the ticket. The link is good for 14 days. The cluster was operating normally when the dump was taken (no capability release error).
Updated by Darrell Enns about 7 years ago
Zheng Yan wrote:
probably fixed by https://github.com/ceph/ceph-client/commit/10a2699426a732cbf3fc9e835187e8b914f0c61cy
Have you got a working URL for this commit? I'd like to figure out which kernel versions (if any) may contain a potential fix.
Updated by elder one about 7 years ago
Hit the same bug today with 4.4.59 kernel client.
2 MDS servers (1 standby) 10.2.6-1 on Ubuntu Trusty.
From mds log:
2017-04-04 17:32:37.705768 7f08c5342700  0 log_channel(cluster) log [WRN] : client.41746305 isn't responding to mclientcaps(revoke), ino 100008eb60c pending pAsLsXsFr issued pAsLsXsFscr, sent 60.141398 seconds ago
2017-04-04 17:33:37.706640 7f08c5342700  0 log_channel(cluster) log [WRN] : client.41746305 isn't responding to mclientcaps(revoke), ino 100008eb60c pending pAsLsXsFr issued pAsLsXsFscr, sent 120.142098 seconds ago
2017-04-04 17:34:57.706859 7f08c5342700  0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 32.794437 secs
2017-04-04 17:34:57.706872 7f08c5342700  0 log_channel(cluster) log [WRN] : slow request 32.794437 seconds old, received at 2017-04-04 17:34:24.912336: client_request(client.41746305:884324 getattr pAsLsXsFs #100008eb60c 2017-04-04 17:34:24.909674) currently failed to rdlock, waiting
Updated by Darrell Enns about 7 years ago
Very interesting! That means we have:
4.4.59 - bug
4.7.5 - no bug (or at least rare enough that I'm not seeing it)
4.9.0 - bug
Is the 4.4.59 kernel you are using the stock Ubuntu one? I wonder if they backported some ceph code that included the bug.
Updated by Zheng Yan about 7 years ago
Darrell Enns wrote:
Zheng Yan wrote:
probably fixed by https://github.com/ceph/ceph-client/commit/10a2699426a732cbf3fc9e835187e8b914f0c61cy
Have you got a working URL for this commit? I'd like to figure out which kernel versions (if any) may contain a potential fix.
Sorry, it's https://github.com/ceph/ceph-client/commit/d641df819db8b80198fd85d9de91137e8a823b07
Updated by Zheng Yan about 7 years ago
elder one wrote:
Hit the same bug today with 4.4.59 kernel client.
2 MDS servers (1 standby) 10.2.6-1 on Ubuntu Trusty. From the mds log:
[...]
Please check if there are any hung OSD requests in /sys/kernel/debug/ceph/xxxx/osdc.
Updated by elder one about 7 years ago
My cephfs clients (Ubuntu Xenial) are running a kernel from the Ubuntu PPA: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.4.59/
The Ceph cluster (Ubuntu Trusty), including the MDS servers, runs the 4.4.39 kernel, also from the same page.
I don't have this client node any more; I killed it on the private cloud and booted the next client from the same base image.
However, I have all the application logs (Postfix/Dovecot) on a syslog server, and from these I don't see any errors accessing cephfs.
I do monitor the ceph cluster and its disks - there was no high disk activity/iops during the time this error happened.
Updated by Zheng Yan about 7 years ago
elder one wrote:
My cephfs clients (Ubuntu Xenial) are running a kernel from the Ubuntu PPA: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.4.59/
The Ceph cluster (Ubuntu Trusty), including the MDS servers, runs the 4.4.39 kernel, also from the same page. I don't have this client node any more; I killed it on the private cloud and booted the next client from the same base image.
However, I have all the application logs (Postfix/Dovecot) on a syslog server, and from these I don't see any errors accessing cephfs. I do monitor the ceph cluster and its disks - there was no high disk activity/iops during the time this error happened.
Seems like a different bug. If it happens again, please dump the mds cache (ceph daemon mds.xxx dump cache /tmp/cachedump.0) and check whether there are any hung OSD requests in /sys/kernel/debug/ceph/xxxx/osdc.
Updated by Darrell Enns about 7 years ago
Thanks for the link, Zheng. Looks like that fix is in mainline as of 4.10.2. When I have a chance, I'll try upgrading and see how it goes.
Updated by elder one about 7 years ago
Error happened again:
ceph -s
    cluster a5ac4128-5b84-4c15-b760-2621a011d55d
     health HEALTH_WARN
            mds0: Client mailnode-6-eMail-38-2223:mailserver failing to respond to capability release
     monmap e1: 3 mons at {cph01=192.168.30.71:6789/0,cph02=192.168.30.72:6789/0,cph03=192.168.30.73:6789/0}
            election epoch 1128, quorum 0,1,2 cph01,cph02,cph03
      fsmap e845: 1/1/1 up {0=cph05=up:active}, 1 up:standby-replay
     osdmap e67159: 55 osds: 55 up, 55 in
            flags require_jewel_osds
      pgmap v74062434: 6512 pgs, 13 pools, 6525 GB data, 3051 kobjects
            19903 GB used, 29888 GB / 49791 GB avail
                6512 active+clean
  client io 1760 kB/s rd, 2924 kB/s wr, 33 op/s rd, 337 op/s wr
Can't dump the mds cache during the day, as it locked my fs for a minute last time while dumping/reloading 1.5M files.
Empty output from the /sys/kernel/debug/ceph/xxxx/osdc file on all 3 fs kernel clients.
Checked the other files in the same dir as well: on the node the cluster reports the error for (mailnode-6-eMail-38-2223), there is an extra line, "want next osdmap", in /sys/kernel/debug/ceph/xxxx/monc. The mdsmap and osdmap numbers in this file are the same on all nodes.
Also, /sys/kernel/debug/ceph/xxxx/mdsc on the client that ceph reports the error for is empty, but on the other 2 healthy nodes there's a list with the references reported in the mds log:
779015  mds0  getattr  #10000908ca7
780164  mds0  getattr  #10000908ca7
781191  mds0  getattr  #10000908ca7
781758  mds0  getattr  #10000908ca7
782175  mds0  getattr  #10000908ca7
783760  mds0  getattr  #10000908ca7
786552  mds0  getattr  #10000908ca7
789168  mds0  getattr  #10000908ca7
790370  mds0  getattr  #10000908ca7
Fragment from mds log:
2017-04-10 12:56:06.469300 7f1d0b5f6700  0 log_channel(cluster) log [WRN] : 28 slow requests, 2 included below; oldest blocked for > 4013.082964 secs
2017-04-10 12:56:06.469341 7f1d0b5f6700  0 log_channel(cluster) log [WRN] : slow request 3842.578557 seconds old, received at 2017-04-10 11:52:03.890467: client_request(client.41923726:781758 getattr pAsLsXsFs #10000908ca7 2017-04-10 11:52:03.887023) currently failed to rdlock, waiting
2017-04-10 12:56:06.469676 7f1d0b5f6700  0 log_channel(cluster) log [WRN] : slow request 482.127906 seconds old, received at 2017-04-10 12:48:04.341118: client_request(client.41932292:734242 getattr pAsLsXsFs #10000908ca7 2017-04-10 12:48:04.337616) currently failed to rdlock, waiting
Updated by Zheng Yan about 7 years ago
Can you try updating the cephfs client to a 4.10.x kernel?
Updated by elder one about 7 years ago
Upgraded cephfs clients to 4.10.9 kernel.
Updated by elder one about 7 years ago
Some weird issues arose with my applications (unrelated to cephfs) on the 4.10.x kernel.
The same fix has been in the 4.9.x kernel tree since version 4.9.14.
Downgraded the kernel to 4.9.21.
Let's see how it works...
Updated by elder one about 7 years ago
Got another error with 4.9.21 kernel.
On the cephfs node /sys/kernel/debug/ceph/xxx/:
cat osdc
REQUESTS 0 homeless 0
LINGER REQUESTS

cat mdsc
699592  mds0  getattr  #100009133ce
In the MDS log:
2017-04-11 16:20:49.021727 7f1d0b5f6700  0 log_channel(cluster) log [WRN] : client.42086552 isn't responding to mclientcaps(revoke), ino 100009133ce pending pAsLsXsFr issued pAsLsXsFscr, sent 123.722723 seconds ago
2017-04-11 16:20:59.010267 7f1d0b5f6700  0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 32.989459 secs
2017-04-11 16:20:59.010336 7f1d0b5f6700  0 log_channel(cluster) log [WRN] : slow request 32.989459 seconds old, received at 2017-04-11 16:20:26.020377: client_request(client.42086552:699592 getattr pAsLsXsFs #100009133ce 2017-04-11 16:20:26.013495) currently failed to rdlock, waiting
Seems to me that the Dovecot server on the cephfs node locks things up - when I stopped the Dovecot service, it didn't stop normally:
service dovecot stop

.. syslog ..
systemd[1]: Closed Dovecot IMAP/POP3 email server activation socket.
systemd[1]: Stopping Dovecot IMAP/POP3 email server...
systemd[1]: dovecot.service: State 'stop-sigterm' timed out. Killing.
systemd[1]: Stopped Dovecot IMAP/POP3 email server.
systemd[1]: dovecot.service: Unit entered failed state.
systemd[1]: dovecot.service: Failed with result 'timeout'.
After Dovecot terminated, the Ceph cluster reported OK status.
Updated by elder one about 7 years ago
Also dumped the mds cache (healthy cluster state, 8 hours later).
Searched for the inode from the mds error log:
[inode 100009133ce [2,head] /mail/data/virtual/domain/username/cur/1491916721.M239018P30451.mailnode-19-eMail-38-2241,S=134272,W=136238:2,S auth v26371 s=134272 n(v0 b134272 1=1+0) (iversion lock) caps={42066516=pAsLsXsFscr/-@1,42086552=pAsXs/-@12} | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=0 dirty=0 waiter=0 authpin=0 0x55900a71f680]
Updated by Zheng Yan about 7 years ago
elder one wrote:
Got another error with 4.9.21 kernel.
On the cephfs node /sys/kernel/debug/ceph/xxx/:
[...]
[...] In the MDS log:
[...] Seems to me that the Dovecot server on the cephfs node locks things up - when I stopped the Dovecot service, it didn't stop normally:
[...]
After Dovecot terminated, Ceph cluster reported OK status.
It's caused by a different bug. Could you please try adding "mmap_disable = yes" to the dovecot config?
Updated by elder one about 7 years ago
I do have mmap disabled in Dovecot.
Relevant bits from the Dovecot conf:
mmap_disable = yes
mail_nfs_index = yes
mail_nfs_storage = yes
mail_fsync = always
lock_method = fcntl
Updated by Zheng Yan about 7 years ago
Could you try adding commit 2b1ac852 ("ceph: try getting buffer capability for readahead/fadvise") to your 4.9.x kernel? If it still does not work, we'll need kernel debugging to find out what happened (which may not be convenient in your case).
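For example, one possible sequence (a sketch only: the source path is illustrative, and it assumes a source tree and .config matching your running 4.9.x kernel; GitHub serves a commit as a patch when .patch is appended to its URL):

    # Apply the fix from the ceph-client tree to the kernel source
    cd /usr/src/linux-4.9.21
    wget https://github.com/ceph/ceph-client/commit/2b1ac852.patch
    patch -p1 < 2b1ac852.patch

    # Rebuild just the cephfs module and install it for the running kernel
    make modules_prepare
    make M=fs/ceph modules
    cp fs/ceph/ceph.ko /lib/modules/$(uname -r)/kernel/fs/ceph/
    depmod -a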
Updated by elder one about 7 years ago
Well, I managed to patch the ceph kernel module with the 2b1ac852 commit, but my Ubuntu 4.9 kernel will not load unsigned modules...
Error in syslog on cephfs mount:
libceph: loaded (mon/osd proto 15/24)
PKCS#7 signature not signed with a trusted key
ceph: loading out-of-tree module taints kernel.
ceph: module verification failed: signature and/or required key missing - tainting kernel
Updated by Zheng Yan about 7 years ago
elder one wrote:
Well, I managed to patch the ceph kernel module with the 2b1ac852 commit, but my Ubuntu 4.9 kernel will not load unsigned modules...
Error in syslog on cephfs mount:
[...]
Try installing the compiled kernel and modules?
Updated by elder one about 7 years ago
Yeah, did that.
My test server with the compiled 4.9 kernel and patched ceph module mounted cephfs just fine.
It might take a while to run it on production servers.
Any chance this 2b1ac852 patch gets backported to the 4.9 mainline kernel?
Updated by elder one about 7 years ago
One difference I noticed between the 4.4 and 4.9 kernels:
- with the 4.4 kernel, cephfs directory sizes are shown as the total bytes of all files inside the dir, but the 4.9 kernel shows the file count at the next level inside the directory.
Is it a bug or a feature?
Updated by Darrell Enns about 7 years ago
FYI, I don't think the module signature stuff is an issue. It's just notifying you that you've loaded a module that didn't come with your distribution. It should still work just fine.
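If you want to confirm, a quick check (taint letters such as O for out-of-tree and E for unsigned are standard kernel flags):

    # Non-zero just means the kernel is tainted; it keeps working
    cat /proc/sys/kernel/tainted

    # Per-module taint letters show up in parentheses, e.g. (OE)
    grep ceph /proc/modules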
Updated by elder one about 7 years ago
Yeah, you're right Darrell, operator error :) I guess the VM did not have the correct storage network attached when I tried to mount CephFS on the stock kernel with the patched ceph module.
Anyway, running now with a recompiled 4.9.22 kernel and the patched ceph module.
Updated by elder one almost 7 years ago
Just reporting that there have been no errors since using the patched (commit 2b1ac852) cephfs kernel module.
Also upgraded ceph to 10.2.7 two weeks ago.
Could this commit be backported to the 4.9 longterm kernel?
Updated by Zheng Yan almost 7 years ago
elder one wrote:
One difference I noticed between the 4.4 and 4.9 kernels:
- with the 4.4 kernel, cephfs directory sizes are shown as the total bytes of all files inside the dir, but the 4.9 kernel shows the file count at the next level inside the directory. Is it a bug or a feature?
Feature; it's controlled by the rbytes/norbytes mount option.
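For illustration, a mount sketch (the monitor address and auth options are placeholders based on the cluster shown above; the secretfile path is hypothetical):

    # rbytes: a directory's size is the recursive byte total of its contents
    mount -t ceph 192.168.30.71:6789:/ /mnt/cephfs \
        -o name=mailserver,secretfile=/etc/ceph/secret,rbytes

    # norbytes: a directory's size reflects its entry count instead
    # (the behavior seen on 4.9 above)
    mount -t ceph 192.168.30.71:6789:/ /mnt/cephfs \
        -o name=mailserver,secretfile=/etc/ceph/secret,norbytes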
Updated by Zheng Yan almost 7 years ago
elder one wrote:
Just reporting that there have been no errors since using the patched (commit 2b1ac852) cephfs kernel module.
Also upgraded ceph to 10.2.7 two weeks ago.
Could this commit be backported to the 4.9 longterm kernel?
I have sent a request to stable@vger.kernel.org.
Updated by elder one almost 7 years ago
Zheng Yan wrote:
elder one wrote:
One difference I noticed between the 4.4 and 4.9 kernels:
- with the 4.4 kernel, cephfs directory sizes are shown as the total bytes of all files inside the dir, but the 4.9 kernel shows the file count at the next level inside the directory. Is it a bug or a feature?
Feature; it's controlled by the rbytes/norbytes mount option.
Hmm, the manual states that rbytes is ON by default, and my cephfs is not mounted with the norbytes option.
Updated by Zheng Yan almost 7 years ago
- Status changed from New to Resolved
"ceph: try getting buffer capability for readahead/fadvise" has backported into 4.9.x