Bug #44976

closed

MDS problem slow requests, cache pressure, damaged metadata after upgrading 14.2.7 to 14.2.8

Added by mitchell walls about 4 years ago. Updated almost 2 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Performance/Resource Usage
Target version:
% Done:

0%

Source:
Community (user)
Tags:
cephfs ganesha mds ERR_FSAL_SERVERFAULT
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
ceph-deploy
Component(FS):
Ganesha FSAL
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I originally posted this on the nfs-ganesha GitHub issues. They referred me here since they think it is related to a problem in libcephfs or the MDS. They were thinking that Attr_Expiration_Time = 0 was aggravating an underlying issue rather than being the issue itself.
I am planning on upgrading to Octopus. I figured I would wait a couple of weeks before doing that while we figure this out.

OS: Centos 7.7.1908
Versions:
Ceph - 14.2.8
NFS-Ganesha - 2.8.3

12 days ago I upgraded from 14.2.7 to 14.2.8. I started seeing the cache pressure messages after that upgrade.
The CephFS system sat mostly dormant for over a week. Then, last week on Thursday, I started seeing all of the issues below.

Firstly, whether I have 1 MDS or 3, I tend to see clients failing to respond to cache pressure.
Also, the MDS always reports damaged metadata, and the damage pops right back up after I repair it with a scrub.

If I have 3 MDSes, I start to get "3 MDSs report slow requests". The requests are failing to RDLOCK/WRLOCK.
I should be able to recreate this; I just want to make sure I am gathering all of the logs needed when I do.
Let me know which logs you would like, so that I use the right log levels and grab the right logs.
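
For reference, I would presumably bump the MDS log levels with something like this before reproducing (a sketch; tell me if you want different subsystems or levels):

# assumption: the centralized config database is in use, so this applies to all MDS daemons
ceph config set mds debug_mds 20
ceph config set mds debug_ms 1
# and back to the defaults afterwards
ceph config rm mds debug_mds
ceph config rm mds debug_ms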

The cluster has existed for almost a year (10 months) and has run Nautilus the entire time.
There have been no disk or hardware failures in that time.
The NFS CephFS export has existed pretty much since the beginning, with the exact same use case.

I can repair the metadata, but the damage eventually shows right back up.
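
For context, the repair cycle I keep running is essentially this (a rough sketch; the exact scrub syntax may differ between releases):

# queue a recursive repair scrub from the filesystem root on the active MDS
ceph tell mds.cmp03 scrub start / recursive repair
# list what the scrub flagged (this is where the backtrace damage shows up)
ceph tell mds.cmp03 damage ls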

Also, once the filesystem gets into a bad state with 3 MDSes and metadata damage, the slow requests issue will arise just from listing files in a directory on a kernel mount.

The NFS-Ganesha logs at debug level show nothing at all. If I increase to 3 MDSes, then they start to show this:
03/04/2020 13:57:51 : epoch 5e866ee0 : cmp04: ganesha.nfsd-1096024[svc_595] posix2fsal_error :FSAL :CRIT :Mapping 108(default) to ERR_FSAL_SERVERFAULT

I am using autoscale for this cluster, so besides the CephFS warnings that is the only thing I get.
These are the messages with just 1 MDS.
I plan to switch to multiple MDSes and recreate the slow requests RDLOCK/WRLOCK issue.

ALL OF THE FOLLOWING IS WITH 1 MDS. Like I said, it gets much worse with 3.

[root@cmp03 ~]# ceph -s
  cluster:
    id:     FSID
    health: HEALTH_WARN
            2 clients failing to respond to cache pressure
            1 MDSs report damaged metadata
            too few PGs per OSD (15 < min 30)

  services:
    mon: 3 daemons, quorum mon01,mon02,mon03 (age 12d)
    mgr: mon01(active, since 12d), standbys: mon02, mon03
    mds: nfs:1 {0=cmp03=up:active} 3 up:standby
    osd: 64 osds: 64 up (since 12d), 64 in (since 9w)

  data:
    pools:   8 pools, 336 pgs
    objects: 4.78M objects, 6.6 TiB
    usage:   43 TiB used, 446 TiB / 488 TiB avail
    pgs:     336 active+clean

  io:
    client:   0 B/s rd, 338 KiB/s wr, 14 op/s rd, 16 op/s wr

[root@cmp03 ~]# ceph health detail
HEALTH_ERR 2 clients failing to respond to cache pressure; 1 MDSs report damaged metadata; too few PGs per OSD (15 < min 30)
MDS_CLIENT_RECALL 2 clients failing to respond to cache pressure
    mdscmp03(mds.0): Client cmp04. failing to respond to cache pressure client_id: 106550698
    mdscmp03(mds.0): Client cmp04. failing to respond to cache pressure client_id: 106550713
MDS_DAMAGE 1 MDSs report damaged metadata
    mdscmp03(mds.0): Metadata damage detected
TOO_FEW_PGS too few PGs per OSD (15 < min 30)

[root@cmp04 ~]# ceph tell mds.cmp03 client ls
2020-04-07 08:04:02.930 7ff9457fa700  0 client.107373917 ms_handle_reset on v2:10.10.48.102:6800/3250397436
2020-04-07 08:04:02.963 7ff9467fc700  0 client.107312857 ms_handle_reset on v2:10.10.48.102:6800/3250397436
[
    {
        "id": 106550713,
        "entity": {
            "name": {
                "type": "client",
                "num": 106550713
            },
            "addr": {
                "type": "any",
                "addr": "10.10.48.103:0",
                "nonce": 452355589
            }
        },
        "state": "open",
        "num_leases": 0,
        "num_caps": 95382,
        "request_load_avg": 0,
        "uptime": 59384.837150430998,
        "requests_in_flight": 0,
        "completed_requests": 5,
        "reconnecting": false,
        "recall_caps": {
            "value": 286887.99426187889,
            "halflife": 60
        },
        "release_caps": {
            "value": 0,
            "halflife": 60
        },
        "recall_caps_throttle": {
            "value": 10319.76544103905,
            "halflife": 2.5
        },
        "recall_caps_throttle2o": {
            "value": 815.64572559331623,
            "halflife": 0.5
        },
        "session_cache_liveness": {
            "value": 0,
            "halflife": 300
        },
        "inst": "client.106550713 10.10.48.103:0/452355589",
        "completed_requests": [
            {
                "tid": 1367968,
                "created_ino": "0x0" 
            },
            {
                "tid": 1367969,
                "created_ino": "0x0" 
            },
            {
                "tid": 1367975,
                "created_ino": "0x0" 
            },
            {
                "tid": 1367977,
                "created_ino": "0x0" 
            },
            {
                "tid": 1367979,
                "created_ino": "0x0" 
            }
        ],
        "prealloc_inos": [
            {
                "start": 1099522467650,
                "length": 287
            },
            {
                "start": 1099522469581,
                "length": 505
            }
        ],
        "used_inos": [],
        "client_metadata": {
            "features": "0x0000000000001fff",
            "ceph_sha1": "2d095e947a02261ce61424021bb43bd3022d35cb",
            "ceph_version": "ceph version 14.2.8 (2d095e947a02261ce61424021bb43bd3022d35cb) nautilus (stable)",
            "entity_id": "ganesha-9209657c-3804-4c42-99e3-d669f936a3f4",
            "hostname": "cmp04.",
            "pid": "1630708",
            "root": "/volumes/_nogroup/9209657c-3804-4c42-99e3-d669f936a3f4",
            "timeout": "300",
            "uuid": "ganesha-cmp04.-0411" 
        }
    },
    {
        "id": 107031118,
        "entity": {
            "name": {
                "type": "client",
                "num": 107031118
            },
            "addr": {
                "type": "v1",
                "addr": "10.10.48.103:0",
                "nonce": 3527069069
            }
        },
        "state": "open",
        "num_leases": 0,
        "num_caps": 1,
        "request_load_avg": 0,
        "uptime": 59384.837150430998,
        "requests_in_flight": 0,
        "completed_requests": 0,
        "reconnecting": false,
        "recall_caps": {
            "value": 0,
            "halflife": 60
        },
        "release_caps": {
            "value": 0,
            "halflife": 60
        },
        "recall_caps_throttle": {
            "value": 0,
            "halflife": 2.5
        },
        "recall_caps_throttle2o": {
            "value": 0,
            "halflife": 0.5
        },
        "session_cache_liveness": {
            "value": 0,
            "halflife": 300
        },
        "inst": "client.107031118 v1:10.10.48.103:0/3527069069",
        "completed_requests": [],
        "prealloc_inos": [],
        "used_inos": [],
        "client_metadata": {
            "features": "0x00000000000000ff",
            "entity_id": "admin",
            "hostname": "cmp04.",
            "kernel_version": "3.10.0-1062.12.1.el7.x86_64",
            "root": "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7" 
        }
    },
    {
        "id": 106550698,
        "entity": {
            "name": {
                "type": "client",
                "num": 106550698
            },
            "addr": {
                "type": "any",
                "addr": "10.10.48.103:0",
                "nonce": 4140370337
            }
        },
        "state": "open",
        "num_leases": 0,
        "num_caps": 4771,
        "request_load_avg": 0,
        "uptime": 59384.837150430998,
        "requests_in_flight": 0,
        "completed_requests": 0,
        "reconnecting": false,
        "recall_caps": {
            "value": 302690.92799215915,
            "halflife": 60
        },
        "release_caps": {
            "value": 0,
            "halflife": 60
        },
        "recall_caps_throttle": {
            "value": 12256.831612812737,
            "halflife": 2.5
        },
        "recall_caps_throttle2o": {
            "value": 2601.7867175086562,
            "halflife": 0.5
        },
        "session_cache_liveness": {
            "value": 0,
            "halflife": 300
        },
        "inst": "client.106550698 10.10.48.103:0/4140370337",
        "completed_requests": [],
        "prealloc_inos": [
            {
                "start": 1099522471904,
                "length": 687
            }
        ],
        "used_inos": [],
        "client_metadata": {
            "features": "0x0000000000001fff",
            "ceph_sha1": "2d095e947a02261ce61424021bb43bd3022d35cb",
            "ceph_version": "ceph version 14.2.8 (2d095e947a02261ce61424021bb43bd3022d35cb) nautilus (stable)",
            "entity_id": "ganesha-a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7",
            "hostname": "cmp04.",
            "pid": "1630708",
            "root": "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7",
            "timeout": "300",
            "uuid": "ganesha-cmp04.-03fe" 
        }
    },
    {
        "id": 106103709,
        "entity": {
            "name": {
                "type": "client",
                "num": 106103709
            },
            "addr": {
                "type": "v1",
                "addr": "10.10.48.103:0",
                "nonce": 1266568608
            }
        },
        "state": "open",
        "num_leases": 0,
        "num_caps": 3741,
        "request_load_avg": 3,
        "uptime": 59384.837150430998,
        "requests_in_flight": 0,
        "completed_requests": 1,
        "reconnecting": false,
        "recall_caps": {
            "value": 0,
            "halflife": 60
        },
        "release_caps": {
            "value": 0,
            "halflife": 60
        },
        "recall_caps_throttle": {
            "value": 0,
            "halflife": 2.5
        },
        "recall_caps_throttle2o": {
            "value": 0,
            "halflife": 0.5
        },
        "session_cache_liveness": {
            "value": 136.57798974559933,
            "halflife": 300
        },
        "inst": "client.106103709 v1:10.10.48.103:0/1266568608",
        "completed_requests": [
            {
                "tid": 4537378,
                "created_ino": "0x0" 
            }
        ],
        "prealloc_inos": [],
        "used_inos": [],
        "client_metadata": {
            "features": "0x00000000000000ff",
            "entity_id": "admin",
            "hostname": "cmp04.",
            "kernel_version": "3.10.0-1062.12.1.el7.x86_64",
            "root": "/volumes/_nogroup/9209657c-3804-4c42-99e3-d669f936a3f4" 
        }
    }
]

[root@cmp04 ~]# ceph tell mds.cmp03 damage ls
....
7000+ backtrace errors
....
{
    "damage_type": "backtrace",
    "id": 4289458104,
    "ino": 1099522468770,
    "path": "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag45_nearest10_hog/msm_6tic_N11_lag45.txt" 
},
{
    "damage_type": "backtrace",
    "id": 4291358526,
    "ino": 1099522468884,
    "path": "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3_nearest10_hog/state_apo_6tic_N11_lag3_9.pdb" 
},
{
    "damage_type": "backtrace",
    "id": 4291586401,
    "ino": 1099522467412,
    "path": "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag25_nearest10_hog/state_apo_6tic_N11_lag25_1.pdb" 
},
{
    "damage_type": "backtrace",
    "id": 4293141981,
    "ino": 1099522471402,
    "path": "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/corr_frames10/p450_apoN11_lag83_spearmancorr.npy" 
},
{
    "damage_type": "backtrace",
    "id": 4293413627,
    "ino": 1099522467311,
    "path": "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag29_nearest10_hog/state_apo_6tic_N11_lag29_0.pdb" 
},
{
    "damage_type": "backtrace",
    "id": 4294040932,
    "ino": 1099522468824,
    "path": "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag31_nearest10_hog/state_apo_6tic_N11_lag31_10.pdb" 

[root@cmp03 ~]# cat /etc/ceph/ceph.conf
[global]
fsid = FSID
mon_initial_members = mon01, mon02, mon03
mon_host = 10.10.48.27,10.10.48.28,10.10.48.29
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

public_network = 10.10.48.0/24
cluster_network= 10.10.53.0/24

mon_allow_pool_delete = true

log_to_syslog = true
err_to_syslog = true
mon_cluster_log_to_syslog = true
clog_to_syslog = true

[mds]
mds_cache_memory_limit = 68719476736
mds_cache_reservation = 0.05

[osd]
bluestore_cache_size_hdd = 8589934592
bluestore_cache_size_ssd = 8589934592

[client]
rbd_cache = true
rbd_cache_size = 67108864
rbd_cache_max_dirty = 50331648
rbd_cache_target_dirty = 33554432
rbd_cache_max_dirty_age = 2
rbd_cache_writethrough_until_flush = true
rbd_concurrent_management_ops = 20

[client.manila]
client_mount_uid = 0
client_mount_gid = 0
keyring = /etc/ceph/manila.keyring

[root@cmp04 ~]# cat /etc/ganesha/ceph.conf
LOG
{
    Default_Log_Level = DEBUG;
}
NFS_CORE_PARAM
{
    Enable_NLM = false;
    Enable_RQUOTA = false;
    Protocols = 4;
}
NFSv4
{
    Delegations = true;
    RecoveryBackend = rados_ng;
    Minor_Versions =  1,2;
}
MDCACHE {
    Dir_Chunk = 0;
    NParts = 1;
    Cache_Size = 1;
}
RADOS_URLS
{
    Ceph_Conf = /etc/ceph/ceph.conf;
    UserId = "admin";
    watch_url = "rados://nfs-ganesha/ganesha-export-counter";
}
%url rados://nfs-ganesha/ganesha-export-index
CEPH
{
}
RADOS_KV
{
    Ceph_Conf = /etc/ceph/ceph.conf;
    UserId = "admin";
    pool = "nfs-ganesha";
}
NFS_KRB5
{
       Active_krb5 = false;
}

RADOS EXPORT EXAMPLE:

EXPORT {
    CLIENT {
        Clients = {ips}
        Access_Type = "rw";
    }

    SecType = "sys";
    Pseudo = "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7";
    Squash = "None";
    FSAL {
        Secret_Access_Key = {secret}
        User_Id = {user}
        Name = "Ceph";
    }
    Tag = "share-a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7";
    Path = "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7";
    Export_Id = 1022;
}

Here are the logs from the scrub run. There are about 7485 lines of this; they all look the same.

2020-04-07 07:54:16.326 7f3308d21700  0 log_channel(cluster) log [WRN] : Scrub error on inode 0x10000a576dc (/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/state_apo_6tic_N11_lag3700_5.pdb) see mds.cmp03 log and `damage ls` output for details
2020-04-07 07:54:16.326 7f3308d21700  0 log_channel(cluster) do_log log to syslog
2020-04-07 07:54:16.326 7f3308d21700 -1 mds.0.scrubstack _validate_inode_done scrub error on inode [inode 0x10000a576dc [2,head] /volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/state_apo_6tic_N11_lag3700_5.pdb auth v62 dirtyparent s=6229785 n(v0 rc2020-04-06 18:20:15.392548 b6229785 1=1+0) (iversion lock) caps={106550698=pAsLsXsFsc/-@4},l=106550698(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 scrubqueue=0 dirty=1 authpin=0 0x555a9bd6ee00]: {"performed_validation":true,"passed_validation":false,"backtrace":{"checked":true,"passed":false,"read_ret_val":-2,"ondisk_value":"(-1)0x0:[]//","memoryvalue":"(39)0x10000a576dc:[<0x10000a576c9/state_apo_6tic_N11_lag3700_5.pdb v62>,<0x10000003c9f/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog v103691>,<0x40000000f4e/analysis_apo v189475>,<0x400000011a5/namdoutput v89091>,<0x10000000001/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7 v3379713>,<0x10000000000/_nogroup v3104037>,<0x1/volumes v1692870>]//","error_str":"failed to decode on-disk backtrace (0 bytes)!"},"raw_stats":{"checked":false,"passed":false,"read_ret_val":0,"ondisk_value.dirstat":"f()","ondisk_value.rstat":"n()","memory_value.dirrstat":"f()","memory_value.rstat":"n()","error_str":""},"return_code":-2}
2020-04-07 07:54:16.326 7f3308d21700  0 log_channel(cluster) log [WRN] : Scrub error on inode 0x10000a576e7 (/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/msm_6tic_N11_lag3700.txt) see mds.cmp03 log and `damage ls` output for details
2020-04-07 07:54:16.326 7f3308d21700  0 log_channel(cluster) do_log log to syslog
2020-04-07 07:54:16.326 7f3308d21700 -1 mds.0.scrubstack _validate_inode_done scrub error on inode [inode 0x10000a576e7 [2,head] /volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/msm_6tic_N11_lag3700.txt auth v116 dirtyparent s=2065 n(v0 rc2020-04-06 18:20:52.995274 b2065 1=1+0) (iversion lock) caps={106550698=pAsLsXsFsc/-@2},l=106550698(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 scrubqueue=0 dirty=1 authpin=0 0x555a4541dc00]: {"performed_validation":true,"passed_validation":false,"backtrace":{"checked":true,"passed":false,"read_ret_val":-2,"ondisk_value":"(-1)0x0:[]//","memoryvalue":"(39)0x10000a576e7:[<0x10000a576c9/msm_6tic_N11_lag3700.txt v116>,<0x10000003c9f/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog v103691>,<0x40000000f4e/analysis_apo v189475>,<0x400000011a5/namdoutput v89091>,<0x10000000001/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7 v3379713>,<0x10000000000/_nogroup v3104037>,<0x1/volumes v1692870>]//","error_str":"failed to decode on-disk backtrace (0 bytes)!"},"raw_stats":{"checked":false,"passed":false,"read_ret_val":0,"ondisk_value.dirstat":"f()","ondisk_value.rstat":"n()","memory_value.dirrstat":"f()","memory_value.rstat":"n()","error_str":""},"return_code":-2}
2020-04-07 07:54:16.326 7f3308d21700  0 log_channel(cluster) log [WRN] : Scrub error on inode 0x10000a576c9 (/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog) see mds.cmp03 log and `damage ls` output for details
2020-04-07 07:54:16.326 7f3308d21700  0 log_channel(cluster) do_log log to syslog
2020-04-07 07:54:16.326 7f3308d21700 -1 mds.0.scrubstack _validate_inode_done scrub error on inode [inode 0x10000a576c9 [...2,head] /volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/ auth v103691 dirtyparent f(v0 m2020-04-06 18:20:52.988600 12=12+0) n(v0 rc2020-04-06 18:20:52.995274 b68529700 13=12+1) (inest lock) (ifile excl) (iversion lock) caps={106550698=pAsLsXsFsx/AsLsXsFsx@14},l=106550698 | request=0 lock=0 dirfrag=1 caps=1 dirtyparent=1 scrubqueue=0 dirty=1 waiter=0 authpin=0 0x555a53c58700]: {"performed_validation":true,"passed_validation":false,"backtrace":{"checked":true,"passed":false,"read_ret_val":-2,"ondisk_value":"(-1)0x0:[]//","memoryvalue":"(38)0x10000a576c9:[<0x10000003c9f/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog v103691>,<0x40000000f4e/analysis_apo v189475>,<0x400000011a5/namdoutput v89091>,<0x10000000001/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7 v3379713>,<0x10000000000/_nogroup v3104037>,<0x1/volumes v1692870>]//","error_str":"failed to decode on-disk backtrace (0 bytes)!"},"raw_stats":{"checked":true,"passed":true,"read_ret_val":0,"ondisk_value.dirstat":"f(v0 m2020-04-06 18:20:52.988600 12=12+0)","ondisk_value.rstat":"n(v0 rc2020-04-06 18:20:52.995274 b68529700 13=12+1)","memory_value.dirrstat":"f(v0 m2020-04-06 18:20:52.988600 12=12+0)","memory_value.rstat":"n(v0 rc2020-04-06 18:20:52.995274 b68529700 13=12+1)","error_str":""},"return_code":-2}
2020-04-07 07:54:16.356 7f3308d21700  0 log_channel(cluster) log [INF] : scrub complete with tag 'ce3c7e20-71eb-4bb6-bbc5-d2463ba8ff38'
2020-04-07 07:54:16.356 7f3308d21700  0 log_channel(cluster) do_log log to syslog
2020-04-07 07:54:18.698 7f330f52e700  1 mds.cmp03 Updating MDS map to version 26580 from mon.2
2020-04-07 09:17:58.270 7f330f52e700  1 mds.cmp03 Updating MDS map to version 26581 from mon.2

Scrub Repair logs

2020-04-07 09:23:44.230 7f3308d21700  0 log_channel(cluster) do_log log to syslog
2020-04-07 09:23:44.230 7f3308d21700 -1 mds.0.scrubstack _validate_inode_done scrub error on inode [inode 0x10000a576dc [2,head] /volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/state_apo_6tic_N11_lag3700_5.pdb auth v62 dirtyparent s=6229785 n(v0 rc2020-04-06 18:20:15.392548 b6229785 1=1+0) (iversion lock) caps={106550698=pAsLsXsFsc/-@4},l=106550698(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 scrubqueue=0 dirty=1 authpin=0 0x555a9bd6ee00]: {"performed_validation":true,"passed_validation":false,"backtrace":{"checked":true,"passed":false,"read_ret_val":-2,"ondisk_value":"(-1)0x0:[]//","memoryvalue":"(39)0x10000a576dc:[<0x10000a576c9/state_apo_6tic_N11_lag3700_5.pdb v62>,<0x10000003c9f/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog v103691>,<0x40000000f4e/analysis_apo v189475>,<0x400000011a5/namdoutput v89091>,<0x10000000001/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7 v3379713>,<0x10000000000/_nogroup v3104143>,<0x1/volumes v1692976>]//","error_str":"failed to decode on-disk backtrace (0 bytes)!"},"raw_stats":{"checked":false,"passed":false,"read_ret_val":0,"ondisk_value.dirstat":"f()","ondisk_value.rstat":"n()","memory_value.dirrstat":"f()","memory_value.rstat":"n()","error_str":""},"return_code":-2}
2020-04-07 09:23:44.230 7f3308d21700  0 log_channel(cluster) log [WRN] : bad backtrace on inode 0x10000a576e7(/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/msm_6tic_N11_lag3700.txt), rewriting it
2020-04-07 09:23:44.230 7f3308d21700  0 log_channel(cluster) do_log log to syslog
2020-04-07 09:23:44.230 7f3308d21700  0 log_channel(cluster) log [INF] : Scrub repaired inode 0x10000a576e7 (/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/msm_6tic_N11_lag3700.txt)
2020-04-07 09:23:44.230 7f3308d21700  0 log_channel(cluster) do_log log to syslog
2020-04-07 09:23:44.230 7f3308d21700 -1 mds.0.scrubstack _validate_inode_done scrub error on inode [inode 0x10000a576e7 [2,head] /volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/msm_6tic_N11_lag3700.txt auth v116 dirtyparent s=2065 n(v0 rc2020-04-06 18:20:52.995274 b2065 1=1+0) (iversion lock) caps={106550698=pAsLsXsFsc/-@2},l=106550698(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 scrubqueue=0 dirty=1 authpin=0 0x555a4541dc00]: {"performed_validation":true,"passed_validation":false,"backtrace":{"checked":true,"passed":false,"read_ret_val":-2,"ondisk_value":"(-1)0x0:[]//","memoryvalue":"(39)0x10000a576e7:[<0x10000a576c9/msm_6tic_N11_lag3700.txt v116>,<0x10000003c9f/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog v103691>,<0x40000000f4e/analysis_apo v189475>,<0x400000011a5/namdoutput v89091>,<0x10000000001/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7 v3379713>,<0x10000000000/_nogroup v3104143>,<0x1/volumes v1692976>]//","error_str":"failed to decode on-disk backtrace (0 bytes)!"},"raw_stats":{"checked":false,"passed":false,"read_ret_val":0,"ondisk_value.dirstat":"f()","ondisk_value.rstat":"n()","memory_value.dirrstat":"f()","memory_value.rstat":"n()","error_str":""},"return_code":-2}
2020-04-07 09:23:44.231 7f3308d21700  0 log_channel(cluster) log [WRN] : bad backtrace on inode 0x10000a576c9(/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog), rewriting it
2020-04-07 09:23:44.231 7f3308d21700  0 log_channel(cluster) do_log log to syslog
2020-04-07 09:23:44.231 7f3308d21700  0 log_channel(cluster) log [INF] : Scrub repaired inode 0x10000a576c9 (/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog)
2020-04-07 09:23:44.231 7f3308d21700  0 log_channel(cluster) do_log log to syslog
2020-04-07 09:23:44.231 7f3308d21700 -1 mds.0.scrubstack _validate_inode_done scrub error on inode [inode 0x10000a576c9 [...2,head] /volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/ auth v103691 dirtyparent f(v0 m2020-04-06 18:20:52.988600 12=12+0) n(v0 rc2020-04-06 18:20:52.995274 b68529700 13=12+1) (inest lock) (ifile excl) (iversion lock) caps={106550698=pAsLsXsFsx/AsLsXsFsx@16},l=106550698 | request=0 lock=0 dirfrag=1 caps=1 dirtyparent=1 scrubqueue=0 dirty=1 waiter=0 authpin=0 0x555a53c58700]: {"performed_validation":true,"passed_validation":false,"backtrace":{"checked":true,"passed":false,"read_ret_val":-2,"ondisk_value":"(-1)0x0:[]//","memoryvalue":"(38)0x10000a576c9:[<0x10000003c9f/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog v103691>,<0x40000000f4e/analysis_apo v189475>,<0x400000011a5/namdoutput v89091>,<0x10000000001/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7 v3379713>,<0x10000000000/_nogroup v3104143>,<0x1/volumes v1692976>]//","error_str":"failed to decode on-disk backtrace (0 bytes)!"},"raw_stats":{"checked":true,"passed":true,"read_ret_val":0,"ondisk_value.dirstat":"f(v0 m2020-04-06 18:20:52.988600 12=12+0)","ondisk_value.rstat":"n(v0 rc2020-04-06 18:20:52.995274 b68529700 13=12+1)","memory_value.dirrstat":"f(v0 m2020-04-06 18:20:52.988600 12=12+0)","memory_value.rstat":"n(v0 rc2020-04-06 18:20:52.995274 b68529700 13=12+1)","error_str":""},"return_code":-2}
2020-04-07 09:27:33.890 7f3308d21700  0 log_channel(cluster) log [INF] : scrub complete with tag '47c8a435-7fdd-40f1-879f-3cb435ce530f'
2020-04-07 09:27:33.890 7f3308d21700  0 log_channel(cluster) do_log log to syslog
2020-04-07 09:31:42.355 7f330f52e700  1 mds.cmp03 Updating MDS map to version 26582 from mon.2

Files

ceph-mds.jy-cmp04.log-20200403.gz (72.5 KB) - Day it starts - mitchell walls, 04/07/2020 03:21 PM
ceph-mds.jy-cmp04.log-20200404.gz (223 KB) - Debugging day - mitchell walls, 04/07/2020 03:21 PM
ceph-mds.jy-cmp03.log-20200404.gz (400 KB) - mitchell walls, 04/07/2020 03:23 PM
ceph-mds.jy-cmp03.log-20200403.gz (129 KB) - mitchell walls, 04/07/2020 03:23 PM
ceph-mds.jy-blk02.log-20200403.gz (181 KB) - mitchell walls, 04/07/2020 03:23 PM
ceph-mds.jy-blk01.log-20200404.gz (261 KB) - mitchell walls, 04/07/2020 03:23 PM
ceph-mds.jy-blk01.log-20200403.gz (130 KB) - mitchell walls, 04/07/2020 03:23 PM
ceph-mds.jy-blk02.log-20200404-part1.log.gz (800 KB) - mitchell walls, 04/07/2020 03:34 PM
ceph-mds.jy-blk02.log-20200404-part2.log.gz (755 KB) - mitchell walls, 04/07/2020 03:34 PM

Related issues 1 (0 open, 1 closed)

Related to CephFS - Feature #12334: nfs-ganesha: handle client cache pressure in NFS Ganesha FSAL (Resolved, Jeff Layton)

Actions

Updated by mitchell walls about 4 years ago

Just as a heads up, I was restarting the MDSes a lot to troubleshoot this issue. After a restart it would work fine for a couple of hours.

Actions #4

Updated by mitchell walls about 4 years ago

Not hours, minutes.

Actions #5

Updated by Greg Farnum about 4 years ago

  • Assignee set to Jeff Layton

The backtrace "metadata damage" is probably erroneous: https://tracker.ceph.com/issues/43543; if we have busy files that never get evicted from the journal it's possible the backtrace doesn't get written down and we improperly call that an error.

So the cache pressure messages, and any cluster overloading they may be a sign of, are probably the real issue here.

Actions #6

Updated by Greg Farnum about 4 years ago

  • Project changed from Ceph to CephFS
  • Category changed from common to Performance/Resource Usage
  • Component(FS) Ganesha FSAL added
Actions #8

Updated by mitchell walls about 4 years ago

I forgot to mention that all of the MDS servers (and the whole cluster, really) are pretty large, with two CPUs, 256-384 GB of RAM, and 10 Gb networking. They were old VM hosts.

The traffic level has been the same since the cluster was created.

Actions #9

Updated by mitchell walls about 4 years ago

This also seems to be very closely related to this issue https://tracker.ceph.com/issues/44947

Actions #10

Updated by Jeff Layton about 4 years ago

I agree with Greg. The metadata damage warnings are probably unrelated to the cache pressure issues. Let's focus on the cache pressure problems here for now, and we can open another bug to handle the metadata damage problem if we find that it's necessary.

03/04/2020 13:57:51 : epoch 5e866ee0 : cmp04: ganesha.nfsd-1096024[svc_595] posix2fsal_error :FSAL :CRIT :Mapping 108(default) to ERR_FSAL_SERVERFAULT

108 is ESHUTDOWN. I don't see any occurrences of that in the libcephfs code or ganesha, so I'm a little curious where that would be coming from. That may be unrelated though (or be due to blacklisting).
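
If blacklisting turns out to be a factor, one quick thing to check (just a sketch) is whether the ganesha client addresses show up in the OSD blacklist:

# run from any node with an admin keyring
ceph osd blacklist ls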

Both the ganesha clients seem to have large num_caps and recall_caps* values in the info, and 0 for release_caps. Here's a snippet from the first one:

"state": "open",
"num_leases": 0,
"num_caps": 95382,
"request_load_avg": 0,
"uptime": 59384.837150430998,
"requests_in_flight": 0,
"completed_requests": 5,
"reconnecting": false,
"recall_caps": {
"value": 286887.99426187889,
"halflife": 60
},
"release_caps": {
"value": 0,
"halflife": 60
},
"recall_caps_throttle": {
"value": 10319.76544103905,
"halflife": 2.5
},
"recall_caps_throttle2o": {
"value": 815.64572559331623,
"halflife": 0.5
},

So I'm guessing this means that the MDS sent the client a bunch of revokes but it's not sending any cap releases? Possibly there is a refcount leak of some sort, but I'm not as familiar with how cap releases are driven in libcephfs.

Actions #11

Updated by Jeff Layton about 4 years ago

I have another case opened internally at RH, and got them to force a coredump. From that I was able to dump ganesha's mdcache LRU state:

(gdb) p lru_state
$15 = {
entries_hiwat = 100000,
entries_used = 1768856,
chunks_hiwat = 100000,
chunks_used = 0,
fds_system_imposed = 1048576,
fds_hard_limit = 1038090,
fds_hiwat = 943718,
fds_lowat = 524288,
futility = 0,
per_lane_work = 50,
biggest_window = 419430,
prev_fd_count = 0,
prev_time = 1586157121,
fd_state = 0
}

The entries_used is a lot larger than it should be here (the max should be entries_hiwat). That's probably indicative of a refcount leak in ganesha.

Mitchell, are you able to attach the debugger to the running ganesha daemon when it's in this state, and do:

(gdb) p lru_state

You will need the nfs-ganesha-debuginfo package installed.
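
Something like the following should do it (a sketch; adjust to however ganesha is installed and running on your host):

# assumption: the daemon runs as ganesha.nfsd and debuginfo packages are available in your repos
yum install -y nfs-ganesha-debuginfo
gdb -p $(pidof ganesha.nfsd)
(gdb) p lru_state
(gdb) detach
(gdb) quit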

Actions #12

Updated by mitchell walls about 4 years ago

(gdb) p lru_state
$1 = {entries_hiwat = 100000, entries_used = 100148, chunks_hiwat = 100000, chunks_used = 0,
fds_system_imposed = 1048576, fds_hard_limit = 1038090, fds_hiwat = 943718, fds_lowat = 524288,
futility = 0, per_lane_work = 50, biggest_window = 419430, prev_fd_count = 0,
prev_time = 1586281947, fd_state = 0}

Actions #13

Updated by Jeff Layton about 4 years ago

Thanks, it's not much above entries_hiwat, but it is somewhat above it, so it seems likely this is the same issue as the other one I'm looking at. One way to confirm it would be to set something like this in the ganesha config:

MDCACHE {
    Entries_HWMark = 10000;
}

...and keep an eye on the entries_used value in the lru_state. If you cross that threshold and it never seems to go down then that's a pretty good indicator that there is a refcount leak of these entries.
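
If it's easier, the counter can also be sampled non-interactively so you can watch it over time (a sketch):

# attach briefly, print just the counter, and detach again
gdb -p $(pidof ganesha.nfsd) -batch -ex 'print lru_state.entries_used'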

Unfortunately, refcount leaks are tough to track down. What would help most at this point is a way to reliably reproduce this (including a well-defined workload on the NFS clients that triggers this).

Actions #14

Updated by mitchell walls about 4 years ago

Okay, so do you also think that is related to the MDS slow requests from when I was running 3 MDSes?

I am working on creating a dummy way to mimic the workload that was being run at the time. Thankfully it was just one user at the time it started.

Actions #15

Updated by Jeff Layton about 4 years ago

mitchell walls wrote:

Okay, so do you also think that is related to the MDS slow requests from when I was running 3 MDSes?

It seems likely, but it's not 100% clear at this point.

I am working on creating a dummy way to mimic the workload that was being run at the time. Thankfully it was just one user at the time it started.

That would help a lot.

Actions #16

Updated by Jeff Layton about 4 years ago

...and to be clear too:

If you're working on a reproducer, then setting Entries_HWMark to a really low value (100 or 1000?) might be good. That should be enough to be able to tell whether you're seeing the problem w/o necessarily bringing the MDS to its knees. The ultimate goal here is to determine what specific situation leads to this sort of leak, which makes it simpler to find it. So, getting it down to a small number of potential operations makes it simpler to see what happened.

Actions #17

Updated by mitchell walls about 4 years ago

Okay, thank you. I am working on it. The script they were working with was dealing with a group of files that were symlinked from a different directory in NFS. They were all relatively large. It was at this point that the MDS requests got hung up. Cache pressure existed this entire time. How does the MDS load balancing handle symlinks?

lrwxrwxrwx 1 52 Nov 26 14:04 linkedtraj10.dcd -> ../../p450/p450_apo_C10/step7_production_apo_C10.dcd
lrwxrwxrwx 1 52 Mar 17 10:15 linkedtraj11.dcd -> ../../p450/p450_apo_C11/step7_production_apo_C11.dcd
lrwxrwxrwx 1 52 Nov 26 14:04 linkedtraj12.dcd -> ../../p450/p450_apo_C12/step7_production_apo_C12.dcd
lrwxrwxrwx 1 52 Nov 26 14:04 linkedtraj13.dcd -> ../../p450/p450_apo_C5/step7.1_production_apo_C5.dcd
lrwxrwxrwx 1 52 Nov 26 14:04 linkedtraj14.dcd -> ../../p450/p450_apo_C6/step7.1_production_apo_C6.dcd
lrwxrwxrwx 1 52 Nov 26 14:04 linkedtraj15.dcd -> ../../p450/p450_apo_C7/step7.1_production_apo_C7.dcd
lrwxrwxrwx 1 52 Nov 26 14:04 linkedtraj16.dcd -> ../../p450/p450_apo_C8/step7.1_production_apo_C8.dcd
lrwxrwxrwx 1 52 Nov 26 14:04 linkedtraj17.dcd -> ../../p450/p450_apo_C9/step7.1_production_apo_C9.dcd
lrwxrwxrwx 1 54 Nov 26 14:04 linkedtraj18.dcd -> ../../p450/p450_apo_C10/step7.1_production_apo_C10.dcd
lrwxrwxrwx 1 52 Nov 26 14:04 linkedtraj19.dcd -> ../../p450/p450_apo_C5/step7.2_production_apo_C5.dcd
lrwxrwxrwx 1 38 Nov 26 14:04 linkedtraj1.dcd -> ../../p450/prod/step7_productionC1.dcd
lrwxrwxrwx 1 54 Nov 26 14:04 linkedtraj20.dcd -> ../../p450/p450_apo_C10/step7.2_production_apo_C10.dcd
lrwxrwxrwx 1 38 Nov 26 14:04 linkedtraj2.dcd -> ../../p450/prod/step7_productionC2.dcd
lrwxrwxrwx 1 50 Nov 26 14:04 linkedtraj3.dcd -> ../../p450/p450_apo_C1/step7_production_apo_C1.dcd
lrwxrwxrwx 1 50 Nov 26 14:04 linkedtraj4.dcd -> ../../p450/p450_apo_C4/step7_production_apo_C4.dcd
lrwxrwxrwx 1 50 Nov 26 14:04 linkedtraj5.dcd -> ../../p450/p450_apo_C5/step7_production_apo_C5.dcd
lrwxrwxrwx 1 50 Nov 26 14:04 linkedtraj6.dcd -> ../../p450/p450_apo_C6/step7_production_apo_C6.dcd
lrwxrwxrwx 1 50 Nov 26 14:04 linkedtraj7.dcd -> ../../p450/p450_apo_C7/step7_production_apo_C7.dcd
lrwxrwxrwx 1 50 Nov 26 14:04 linkedtraj8.dcd -> ../../p450/p450_apo_C8/step7_production_apo_C8.dcd
lrwxrwxrwx 1 50 Nov 26 14:04 linkedtraj9.dcd -> ../../p450/p450_apo_C9/step7_production_apo_C9.dcd

Actions #18

Updated by Jeff Layton about 4 years ago

mitchell walls wrote:

Okay, thank you. I am working on it. The script they were working with was dealing with a group of files that were symlinked from a different directory in NFS. They were all relatively large. It was at this point that the MDS requests got hung up. Cache pressure existed this entire time. How does the MDS load balancing handle symlinks?

Symlinks are just a particular sort of inode and end up distributed among MDSes according to the same rules as anything else (I think based on the primary dentry link). The symlinks in this case would be resolved on the NFS clients though, so it seems unlikely that they would be a factor here. That said, we don't really know at this point. I'll play around with symlink chasing and see if it makes any difference.

FWIW, I've also tried lowering the Entries_HWMark value to really low values (100) and run a number of testcases against it and have not yet reproduced the problem.

Actions #19

Updated by Jeff Layton about 4 years ago

More questions:

You said this happened when you updated from 14.2.7 to 14.2.8? That's probably coincidence unless ganesha was updated at the same time. Was it?

It might be interesting to downgrade libcephfs to v14.2.7 on the ganesha server if that's possible. Does that make the problem go away?
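
Something along these lines on the ganesha host is what I have in mind (a rough sketch; the exact package names and whether 14.2.7 is still in your repos are assumptions):

# assumption: CentOS 7 with a repo that still carries the 14.2.7 nautilus packages
yum downgrade libcephfs2-14.2.7
systemctl restart nfs-ganesha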

Actions #20

Updated by mitchell walls about 4 years ago

Update on the cache pressure message: it was still there with the single MDS, even with entries_used less than entries_hiwat. I have to restart the ganesha service after attaching the debugger, since accessing the share gives me an I/O error afterwards, so the message goes away after I restart. The results are below.

$1 = {entries_hiwat = 10000, entries_used = 1978, chunks_hiwat = 100000, chunks_used = 0, fds_system_imposed = 1048576, fds_hard_limit = 1038090, fds_hiwat = 943718, fds_lowat = 524288, futility = 0, 
  per_lane_work = 50, biggest_window = 419430, prev_fd_count = 0, prev_time = 1586350073, fd_state = 0}

I was wrong about the updates. It was ceph from 14.2.1 -> 14.2.7 and nfs-ganesha from 2.8.2 -> 2.8.3 when it first started, which could be a coincidence. It was later that the entire cluster was also updated from 14.2.1 -> 14.2.8, including the ganesha server.

I will work on doing a downgrade. I am still working on the workload that is causing this issue. It is a Python msmbuilder/mdtraj script which pulls in about 20 or so symlinked files larger than 500 GB to create a pandas table. The script is only reading at this point.

Actions #21

Updated by mitchell walls about 4 years ago

I have gotten further and it gets really weird. It isn't actually the Python script; it was a move they did, which I found after looking into their history. This is just in regards to the complete hangup with slow requests.

Let's say I do this with 3 MDSes:

(base) [root@jy-cmp03 figures_N4_30nearest]# pwd
/mnt/namdoutput/analysis_apo/figures_N4_30nearest
(base) [root@jy-cmp03 figures_N4_30nearest]# ls
p450_apoN4_lag100.png  p450_apoN4_lag32.png  p450_apoN4_lag55.png  p450_apoN4_lag78.png
p450_apoN4_lag10.png   p450_apoN4_lag33.png  p450_apoN4_lag56.png  p450_apoN4_lag79.png
p450_apoN4_lag11.png   p450_apoN4_lag34.png  p450_apoN4_lag57.png  p450_apoN4_lag7.png
p450_apoN4_lag12.png   p450_apoN4_lag35.png  p450_apoN4_lag58.png  p450_apoN4_lag80.png
p450_apoN4_lag13.png   p450_apoN4_lag36.png  p450_apoN4_lag59.png  p450_apoN4_lag81.png
p450_apoN4_lag14.png   p450_apoN4_lag37.png  p450_apoN4_lag5.png   p450_apoN4_lag82.png
p450_apoN4_lag15.png   p450_apoN4_lag38.png  p450_apoN4_lag60.png  p450_apoN4_lag83.png
p450_apoN4_lag16.png   p450_apoN4_lag39.png  p450_apoN4_lag61.png  p450_apoN4_lag84.png
p450_apoN4_lag17.png   p450_apoN4_lag3.png   p450_apoN4_lag62.png  p450_apoN4_lag85.png
p450_apoN4_lag18.png   p450_apoN4_lag40.png  p450_apoN4_lag63.png  p450_apoN4_lag86.png
p450_apoN4_lag19.png   p450_apoN4_lag41.png  p450_apoN4_lag64.png  p450_apoN4_lag87.png
p450_apoN4_lag1.png    p450_apoN4_lag42.png  p450_apoN4_lag65.png  p450_apoN4_lag88.png
p450_apoN4_lag20.png   p450_apoN4_lag43.png  p450_apoN4_lag66.png  p450_apoN4_lag89.png
p450_apoN4_lag21.png   p450_apoN4_lag44.png  p450_apoN4_lag67.png  p450_apoN4_lag8.png
p450_apoN4_lag22.png   p450_apoN4_lag45.png  p450_apoN4_lag68.png  p450_apoN4_lag90.png
p450_apoN4_lag23.png   p450_apoN4_lag46.png  p450_apoN4_lag69.png  p450_apoN4_lag91.png
p450_apoN4_lag24.png   p450_apoN4_lag47.png  p450_apoN4_lag6.png   p450_apoN4_lag92.png
p450_apoN4_lag25.png   p450_apoN4_lag48.png  p450_apoN4_lag70.png  p450_apoN4_lag93.png
p450_apoN4_lag26.png   p450_apoN4_lag49.png  p450_apoN4_lag71.png  p450_apoN4_lag94.png
p450_apoN4_lag27.png   p450_apoN4_lag4.png   p450_apoN4_lag72.png  p450_apoN4_lag95.png
p450_apoN4_lag28.png   p450_apoN4_lag50.png  p450_apoN4_lag73.png  p450_apoN4_lag96.png
p450_apoN4_lag29.png   p450_apoN4_lag51.png  p450_apoN4_lag74.png  p450_apoN4_lag97.png
p450_apoN4_lag2.png    p450_apoN4_lag52.png  p450_apoN4_lag75.png  p450_apoN4_lag98.png
p450_apoN4_lag30.png   p450_apoN4_lag53.png  p450_apoN4_lag76.png  p450_apoN4_lag99.png
p450_apoN4_lag31.png   p450_apoN4_lag54.png  p450_apoN4_lag77.png  p450_apoN4_lag9.png
(base) [root@jy-cmp03 figures_N4_30nearest]# mv p450_apoN4*.png ../

It completely hangs here, and ceph health shows 3 MDSes reporting slow requests. If I am just using 1 MDS it works just fine.

Now, with 3 MDSes, if I run this instead, it works without any issues and no slow requests. It is also fine with 1 MDS.

(base) [root@jy-cmp03 figures_N4_30nearest]# mv p450_apoN4_lag*.png ../

The script/workload did fine no matter what I tried with the Attr setting; everything was fine. As soon as I ran this move from their history, the whole thing completely hung up.
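
While the mv is hung I can also dump the in-flight MDS ops to show what the rename is blocked on (a sketch, assuming the admin sockets are reachable on each MDS host):

# run on each MDS host against the local daemon
ceph daemon mds.jy-cmp04 dump_ops_in_flight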

Actions #22

Updated by mitchell walls about 4 years ago

Here are the logs from all three MDSes when this occurs.

2020-04-08 14:00:02.179 7f2c8f6f3700  0 mds.1.migrator nicely exporting to mds.0 [dir 0x10000003c9f.010* /volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/ [2,head] auth{0=1,2=1} v=133634 cv=0/0 dir_auth=1 state=1610874881|complete f(v902 m2020-04-08 13:59:30.177939 147=13+134) n(v2051 rc2020-04-08 13:59:30.177939 b12117192385 1669=1531+138)/n(v2051 rc2020-04-08 13:59:14.337857 b12117476856 1681=1543+138) hs=147+14,ss=0+0 dirty=44 | child=1 frozen=0 subtree=1 importing=0 replicated=1 dirty=1 authpin=0 0x55e7b1f19900]
2020-04-08 14:00:02.180 7f2c8f6f3700  0 mds.1.bal reexporting [dir 0x10000003c9f.100* /volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/ [2,head] auth{0=1,2=1} v=133660 cv=0/0 dir_auth=1 ap=0+2 state=1610874881|complete f(v902 m2020-04-08 13:59:30.268940 140=11+129) n(v2051 rc2020-04-08 13:59:30.268940 b10415607247 1324=1190+134)/n(v2051 rc2020-04-08 13:59:14.353857 b10415869931 1335=1201+134) hs=140+14,ss=0+0 dirty=16 | child=1 frozen=0 subtree=1 importing=0 replicated=1 dirty=1 authpin=0 0x55e7b26f0300] pop 8.20051 to mds.2
2020-04-08 14:00:02.180 7f2c8f6f3700  0 mds.1.migrator nicely exporting to mds.2 [dir 0x10000003c9f.100* /volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/ [2,head] auth{0=1,2=1} v=133660 cv=0/0 dir_auth=1 ap=0+2 state=1610874881|complete f(v902 m2020-04-08 13:59:30.268940 140=11+129) n(v2051 rc2020-04-08 13:59:30.268940 b10415607247 1324=1190+134)/n(v2051 rc2020-04-08 13:59:14.353857 b10415869931 1335=1201+134) hs=140+14,ss=0+0 dirty=16 | child=1 frozen=0 subtree=1 importing=0 replicated=1 dirty=1 authpin=0 0x55e7b26f0300]
2020-04-08 14:00:02.180 7f2c8f6f3700  0 mds.1.bal reexporting [dir 0x10000003c9f.011* /volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/ [2,head] auth{0=1,2=1} v=133664 cv=0/0 dir_auth=1 state=1610874881|complete f(v902 m2020-04-08 13:59:30.237939 163=12+151) n(v2051 rc2020-04-08 13:59:30.237939 b12511767427 1371=1220+151)/n(v2051 rc2020-04-08 13:59:14.351857 b12512171507 1388=1237+151) hs=163+20,ss=0+0 dirty=23 | child=1 frozen=0 subtree=1 importing=0 replicated=1 dirty=1 authpin=0 0x55e7b2b0e000] pop 9.7282 to mds.2
2020-04-08 14:00:03.816 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26921 from mon.1
2020-04-08 14:00:24.847 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26922 from mon.1
2020-04-08 14:00:25.856 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26923 from mon.1
2020-04-08 14:00:33.311 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 3 included below; oldest blocked for > 43.001595 secs
2020-04-08 14:00:33.311 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:33.311 7f2c8ceee700  0 log_channel(cluster) log [WRN] : slow request 31.130491 seconds old, received at 2020-04-08 14:00:02.180864: internal op exportdir:mds.1:5 currently failed to wrlock, waiting
2020-04-08 14:00:33.311 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:33.311 7f2c8ceee700  0 log_channel(cluster) log [WRN] : slow request 31.130388 seconds old, received at 2020-04-08 14:00:02.180967: internal op exportdir:mds.1:6 currently failed to wrlock, waiting
2020-04-08 14:00:33.311 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:33.311 7f2c8ceee700  0 log_channel(cluster) log [WRN] : slow request 43.001594 seconds old, received at 2020-04-08 13:59:50.309761: client_request(client.107721161:8648 rename #0x10000003c9f/p450_apoN4_lag100.png #0x30000075da6/p450_apoN4_lag100.png 2020-04-08 13:59:50.308041 caller_uid=0, caller_gid=0{}) currently failed to wrlock, waiting
2020-04-08 14:00:33.311 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:37.337 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26924 from mon.1
2020-04-08 14:00:38.311 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 48.001707 secs
2020-04-08 14:00:38.311 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:41.379 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26925 from mon.1
2020-04-08 14:00:43.310 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 53.001811 secs
2020-04-08 14:00:43.310 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:45.433 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26926 from mon.1
2020-04-08 14:00:48.310 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 58.001929 secs
2020-04-08 14:00:48.310 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:53.310 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 1 included below; oldest blocked for > 63.002042 secs
2020-04-08 14:00:53.310 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:53.311 7f2c8ceee700  0 log_channel(cluster) log [WRN] : slow request 63.002042 seconds old, received at 2020-04-08 13:59:50.309761: client_request(client.107721161:8648 rename #0x10000003c9f/p450_apoN4_lag100.png #0x30000075da6/p450_apoN4_lag100.png 2020-04-08 13:59:50.308041 caller_uid=0, caller_gid=0{}) currently failed to wrlock, waiting
2020-04-08 14:00:53.311 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:58.311 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 68.002170 secs
2020-04-08 14:00:58.311 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:03.311 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 2 included below; oldest blocked for > 73.002280 secs
2020-04-08 14:01:03.311 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:03.311 7f2c8ceee700  0 log_channel(cluster) log [WRN] : slow request 61.131176 seconds old, received at 2020-04-08 14:00:02.180864: internal op exportdir:mds.1:5 currently failed to wrlock, waiting
2020-04-08 14:01:03.311 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:03.311 7f2c8ceee700  0 log_channel(cluster) log [WRN] : slow request 61.131073 seconds old, received at 2020-04-08 14:00:02.180967: internal op exportdir:mds.1:6 currently failed to wrlock, waiting
2020-04-08 14:01:03.311 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:08.311 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 78.002395 secs
2020-04-08 14:01:08.311 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:09.192 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26927 from mon.1
2020-04-08 14:01:13.311 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 83.002502 secs
2020-04-08 14:01:13.311 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:18.311 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 88.002619 secs
2020-04-08 14:01:18.311 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:23.311 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 93.002726 secs
2020-04-08 14:01:23.311 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:28.311 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 98.002841 secs
2020-04-08 14:01:28.311 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:33.311 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 103.002946 secs
2020-04-08 14:01:33.311 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:38.312 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 108.003063 secs
2020-04-08 14:01:38.312 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:41.261 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26928 from mon.1
2020-04-08 14:01:43.312 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 113.003171 secs
2020-04-08 14:01:43.312 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:43.375 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26929 from mon.1
2020-04-08 14:01:45.401 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26930 from mon.1
2020-04-08 14:01:48.312 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 118.003284 secs
2020-04-08 14:01:48.312 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:53.312 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 1 included below; oldest blocked for > 123.003389 secs
2020-04-08 14:01:53.312 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:53.312 7f2c8ceee700  0 log_channel(cluster) log [WRN] : slow request 123.003388 seconds old, received at 2020-04-08 13:59:50.309761: client_request(client.107721161:8648 rename #0x10000003c9f/p450_apoN4_lag100.png #0x30000075da6/p450_apoN4_lag100.png 2020-04-08 13:59:50.308041 caller_uid=0, caller_gid=0{}) currently failed to wrlock, waiting
2020-04-08 14:01:53.312 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:58.312 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 128.003504 secs
2020-04-08 14:01:58.312 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:01.541 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26931 from mon.1
2020-04-08 14:02:03.312 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 2 included below; oldest blocked for > 133.003607 secs
2020-04-08 14:02:03.312 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:03.312 7f2c8ceee700  0 log_channel(cluster) log [WRN] : slow request 121.132503 seconds old, received at 2020-04-08 14:00:02.180864: internal op exportdir:mds.1:5 currently failed to wrlock, waiting
2020-04-08 14:02:03.312 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:03.312 7f2c8ceee700  0 log_channel(cluster) log [WRN] : slow request 121.132400 seconds old, received at 2020-04-08 14:00:02.180967: internal op exportdir:mds.1:6 currently failed to wrlock, waiting
2020-04-08 14:02:03.312 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:08.196 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26932 from mon.1
2020-04-08 14:02:08.312 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 138.003720 secs
2020-04-08 14:02:08.312 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:09.212 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26933 from mon.1
2020-04-08 14:02:12.974 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26934 from mon.1
2020-04-08 14:02:13.312 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 143.003828 secs
2020-04-08 14:02:13.312 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:18.312 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 148.003937 secs
2020-04-08 14:02:18.312 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:23.312 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 153.004046 secs
2020-04-08 14:02:23.312 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:28.313 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 158.004171 secs
2020-04-08 14:02:28.313 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:29.197 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26935 from mon.1
2020-04-08 14:02:33.313 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 163.004279 secs
2020-04-08 14:02:33.313 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:38.313 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 168.004392 secs
2020-04-08 14:02:38.313 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:41.425 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26936 from mon.1
2020-04-08 14:02:43.313 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 173.004501 secs
2020-04-08 14:02:43.313 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:48.312 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 178.004616 secs
2020-04-08 14:02:48.312 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:53.023 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26937 from mon.1
2020-04-08 14:02:53.312 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 183.004753 secs
2020-04-08 14:02:53.312 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:58.313 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 188.004869 secs
2020-04-08 14:02:58.313 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:03.314 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 193.004976 secs
2020-04-08 14:03:03.314 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:08.314 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 198.005084 secs
2020-04-08 14:03:08.314 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:13.052 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26938 from mon.1
2020-04-08 14:03:13.314 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 203.005176 secs
2020-04-08 14:03:13.314 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:18.314 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 208.005288 secs
2020-04-08 14:03:18.314 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:21.216 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26939 from mon.1
2020-04-08 14:03:23.314 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 213.005397 secs
2020-04-08 14:03:23.314 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:25.393 7f2c8f6f3700  1 mds.jy-cmp04 Updating MDS map to version 26940 from mon.1
2020-04-08 14:03:28.315 7f2c8ceee700  0 log_channel(cluster) log [WRN] : 3 slow requests, 0 included below; oldest blocked for > 218.005513 secs
2020-04-08 14:03:28.315 7f2c8ceee700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:28.423 7f2c8fef4700 -1 received  signal: Terminated from /usr/lib/systemd/systemd --system --deserialize 17  (PID: 1) UID: 0
2020-04-08 14:03:28.423 7f2c8fef4700 -1 mds.jy-cmp04 *** got signal Terminated ***
2020-04-08 14:03:28.423 7f2c8fef4700  1 mds.jy-cmp04 suicide! Wanted state up:active
2020-04-08 14:03:28.480 7f2c8fef4700  1 mds.1.26914 shutdown: shutting down rank 1


replicated=1 dirty=1 authpin=0 tempexporting=0 0x55bf19858500]
2020-04-08 14:00:03.816 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26921 from mon.2
2020-04-08 14:00:22.178 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 31.869775 secs
2020-04-08 14:00:22.178 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:22.178 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 31.869775 seconds old, received at 2020-04-08 13:59:50.310183: rejoin:client.107721161:8648 currently dispatched
2020-04-08 14:00:22.178 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:24.847 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26922 from mon.2
2020-04-08 14:00:25.855 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26923 from mon.2
2020-04-08 14:00:27.178 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 36.869877 secs
2020-04-08 14:00:27.178 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:32.178 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 41.869979 secs
2020-04-08 14:00:32.178 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:37.178 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 3 slow requests, 2 included below; oldest blocked for > 46.870081 secs
2020-04-08 14:00:37.178 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:37.178 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 34.999144 seconds old, received at 2020-04-08 14:00:02.181119: rejoin:mds.1:5 currently dispatched
2020-04-08 14:00:37.178 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:37.179 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 33.363097 seconds old, received at 2020-04-08 14:00:03.817166: rejoin:mds.1:6 currently dispatched
2020-04-08 14:00:37.179 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:37.337 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26924 from mon.2
2020-04-08 14:00:41.379 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26925 from mon.2
2020-04-08 14:00:42.179 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 4 slow requests, 1 included below; oldest blocked for > 51.870185 secs
2020-04-08 14:00:42.179 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:42.179 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 33.482869 seconds old, received at 2020-04-08 14:00:08.697498: client_request(client.107721146:3770 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:00:08.697243 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:00:42.179 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:45.434 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26926 from mon.2
2020-04-08 14:00:47.179 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 4 slow requests, 0 included below; oldest blocked for > 56.870291 secs
2020-04-08 14:00:47.179 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:52.179 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 4 slow requests, 1 included below; oldest blocked for > 61.870398 secs
2020-04-08 14:00:52.179 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:52.179 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 61.870397 seconds old, received at 2020-04-08 13:59:50.310183: rejoin:client.107721161:8648 currently dispatched
2020-04-08 14:00:52.179 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:57.179 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 4 slow requests, 0 included below; oldest blocked for > 66.870503 secs
2020-04-08 14:00:57.179 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:02.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 4 slow requests, 0 included below; oldest blocked for > 71.870610 secs
2020-04-08 14:01:02.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:07.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 5 slow requests, 3 included below; oldest blocked for > 76.870714 secs
2020-04-08 14:01:07.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:07.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 64.999778 seconds old, received at 2020-04-08 14:00:02.181119: rejoin:mds.1:5 currently dispatched
2020-04-08 14:01:07.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:07.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 63.363731 seconds old, received at 2020-04-08 14:00:03.817166: rejoin:mds.1:6 currently dispatched
2020-04-08 14:01:07.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:07.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 33.015284 seconds old, received at 2020-04-08 14:00:34.165613: client_request(client.107721146:3771 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:00:34.165348 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:01:07.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:09.193 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26927 from mon.2
2020-04-08 14:01:12.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 5 slow requests, 1 included below; oldest blocked for > 81.870822 secs
2020-04-08 14:01:12.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:12.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 63.483506 seconds old, received at 2020-04-08 14:00:08.697498: client_request(client.107721146:3770 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:00:08.697243 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:01:12.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:17.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 5 slow requests, 0 included below; oldest blocked for > 86.870925 secs
2020-04-08 14:01:17.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:22.179 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 5 slow requests, 0 included below; oldest blocked for > 91.871038 secs
2020-04-08 14:01:22.179 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:27.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 5 slow requests, 0 included below; oldest blocked for > 96.871141 secs
2020-04-08 14:01:27.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:32.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 5 slow requests, 0 included below; oldest blocked for > 101.871244 secs
2020-04-08 14:01:32.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:37.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 6 slow requests, 2 included below; oldest blocked for > 106.871347 secs
2020-04-08 14:01:37.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:37.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 63.015917 seconds old, received at 2020-04-08 14:00:34.165613: client_request(client.107721146:3771 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:00:34.165348 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:01:37.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:37.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 31.735686 seconds old, received at 2020-04-08 14:01:05.445844: client_request(client.107721146:3772 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:01:05.445567 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:01:37.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:41.261 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26928 from mon.2
2020-04-08 14:01:42.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 7 slow requests, 1 included below; oldest blocked for > 111.871452 secs
2020-04-08 14:01:42.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:42.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 30.751883 seconds old, received at 2020-04-08 14:01:11.429752: client_request(client.107721146:3773 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:01:11.429495 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:01:42.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:43.375 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26929 from mon.2
2020-04-08 14:01:45.401 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26930 from mon.2
2020-04-08 14:01:47.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 7 slow requests, 0 included below; oldest blocked for > 116.871551 secs
2020-04-08 14:01:47.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:52.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 7 slow requests, 1 included below; oldest blocked for > 121.871666 secs
2020-04-08 14:01:52.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:52.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 121.871666 seconds old, received at 2020-04-08 13:59:50.310183: rejoin:client.107721161:8648 currently dispatched
2020-04-08 14:01:52.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:57.181 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 8 slow requests, 1 included below; oldest blocked for > 126.871771 secs
2020-04-08 14:01:57.181 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:57.181 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 32.180101 seconds old, received at 2020-04-08 14:01:25.001852: client_request(client.107721146:3774 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:01:25.001608 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:01:57.181 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:01.541 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26931 from mon.2
2020-04-08 14:02:02.181 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 8 slow requests, 1 included below; oldest blocked for > 131.871876 secs
2020-04-08 14:02:02.181 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:02.181 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 120.000940 seconds old, received at 2020-04-08 14:00:02.181119: rejoin:mds.1:5 currently dispatched
2020-04-08 14:02:02.181 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:07.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 9 slow requests, 3 included below; oldest blocked for > 136.871985 secs
2020-04-08 14:02:07.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:07.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 123.365002 seconds old, received at 2020-04-08 14:00:03.817166: rejoin:mds.1:6 currently dispatched
2020-04-08 14:02:07.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:07.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 61.736324 seconds old, received at 2020-04-08 14:01:05.445844: client_request(client.107721146:3772 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:01:05.445567 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:02:07.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:07.180 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 30.380576 seconds old, received at 2020-04-08 14:01:36.801591: client_request(client.107721146:3775 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:01:36.801380 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:02:07.180 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:08.196 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26932 from mon.2
2020-04-08 14:02:09.212 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26933 from mon.2
2020-04-08 14:02:12.181 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 9 slow requests, 2 included below; oldest blocked for > 141.872095 secs
2020-04-08 14:02:12.181 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:12.181 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 123.484779 seconds old, received at 2020-04-08 14:00:08.697498: client_request(client.107721146:3770 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:00:08.697243 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:02:12.181 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:12.181 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 60.752526 seconds old, received at 2020-04-08 14:01:11.429752: client_request(client.107721146:3773 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:01:11.429495 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:02:12.181 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:12.973 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26934 from mon.2
2020-04-08 14:02:17.181 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 9 slow requests, 0 included below; oldest blocked for > 146.872221 secs
2020-04-08 14:02:17.181 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:22.181 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 9 slow requests, 0 included below; oldest blocked for > 151.872306 secs
2020-04-08 14:02:22.181 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:27.182 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 10 slow requests, 2 included below; oldest blocked for > 156.872410 secs
2020-04-08 14:02:27.182 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:27.182 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 62.180741 seconds old, received at 2020-04-08 14:01:25.001852: client_request(client.107721146:3774 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:01:25.001608 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:02:27.182 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:27.182 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 34.169797 seconds old, received at 2020-04-08 14:01:53.012796: client_request(client.107721146:3776 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:01:53.012544 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:02:27.182 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:29.198 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26935 from mon.2
2020-04-08 14:02:32.182 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 10 slow requests, 0 included below; oldest blocked for > 161.872519 secs
2020-04-08 14:02:32.182 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:37.182 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 12 slow requests, 4 included below; oldest blocked for > 166.872625 secs
2020-04-08 14:02:37.182 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:37.182 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 123.017194 seconds old, received at 2020-04-08 14:00:34.165613: client_request(client.107721146:3771 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:00:34.165348 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:02:37.182 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:37.182 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 60.381216 seconds old, received at 2020-04-08 14:01:36.801591: client_request(client.107721146:3775 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:01:36.801380 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:02:37.182 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:37.182 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 31.911212 seconds old, received at 2020-04-08 14:02:05.271596: client_request(client.107721161:8651 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:02:05.270719 caller_uid=0, caller_gid=0{}) currently failed to rdlock, waiting
2020-04-08 14:02:37.182 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:37.182 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 30.887334 seconds old, received at 2020-04-08 14:02:06.295474: client_request(client.107721146:3777 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:02:06.295214 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:02:37.182 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:41.427 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26936 from mon.2
2020-04-08 14:02:42.182 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 12 slow requests, 0 included below; oldest blocked for > 171.872730 secs
2020-04-08 14:02:42.182 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:47.182 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 12 slow requests, 0 included below; oldest blocked for > 176.872834 secs
2020-04-08 14:02:47.182 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:52.182 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 14 slow requests, 2 included below; oldest blocked for > 181.872939 secs
2020-04-08 14:02:52.182 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:52.182 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 33.946467 seconds old, received at 2020-04-08 14:02:18.236654: client_request(client.107721146:3778 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:02:18.236392 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:02:52.182 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:52.182 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 30.843845 seconds old, received at 2020-04-08 14:02:21.339276: client_request(client.107721161:8653 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:02:21.338799 caller_uid=0, caller_gid=0{}) currently failed to rdlock, waiting
2020-04-08 14:02:52.182 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:53.025 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26937 from mon.2
2020-04-08 14:02:57.182 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 14 slow requests, 1 included below; oldest blocked for > 186.873042 secs
2020-04-08 14:02:57.182 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:57.182 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 64.170429 seconds old, received at 2020-04-08 14:01:53.012796: client_request(client.107721146:3776 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:01:53.012544 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:02:57.182 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:02.183 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 14 slow requests, 0 included below; oldest blocked for > 191.873148 secs
2020-04-08 14:03:02.183 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:07.183 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 14 slow requests, 3 included below; oldest blocked for > 196.873253 secs
2020-04-08 14:03:07.183 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:07.183 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 121.737592 seconds old, received at 2020-04-08 14:01:05.445844: client_request(client.107721146:3772 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:01:05.445567 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:03:07.183 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:07.183 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 61.911840 seconds old, received at 2020-04-08 14:02:05.271596: client_request(client.107721161:8651 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:02:05.270719 caller_uid=0, caller_gid=0{}) currently failed to rdlock, waiting
2020-04-08 14:03:07.183 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:07.183 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 60.887962 seconds old, received at 2020-04-08 14:02:06.295474: client_request(client.107721146:3777 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:02:06.295214 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:03:07.183 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:12.183 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 15 slow requests, 2 included below; oldest blocked for > 201.873360 secs
2020-04-08 14:03:12.183 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:12.183 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 120.753791 seconds old, received at 2020-04-08 14:01:11.429752: client_request(client.107721146:3773 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:01:11.429495 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:03:12.183 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:12.183 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 31.463685 seconds old, received at 2020-04-08 14:02:40.719858: client_request(client.107721161:8654 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:02:40.717897 caller_uid=0, caller_gid=0{}) currently failed to rdlock, waiting
2020-04-08 14:03:12.183 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:13.053 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26938 from mon.2
2020-04-08 14:03:17.183 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 16 slow requests, 1 included below; oldest blocked for > 206.873464 secs
2020-04-08 14:03:17.183 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:17.183 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 30.551437 seconds old, received at 2020-04-08 14:02:46.632210: client_request(client.107721146:3779 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:02:46.631960 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:03:17.183 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:21.216 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26939 from mon.2
2020-04-08 14:03:22.183 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 17 slow requests, 3 included below; oldest blocked for > 211.873568 secs
2020-04-08 14:03:22.183 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:22.183 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 63.947097 seconds old, received at 2020-04-08 14:02:18.236654: client_request(client.107721146:3778 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:02:18.236392 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:03:22.183 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:22.183 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 60.844475 seconds old, received at 2020-04-08 14:02:21.339276: client_request(client.107721161:8653 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:02:21.338799 caller_uid=0, caller_gid=0{}) currently failed to rdlock, waiting
2020-04-08 14:03:22.183 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:22.183 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 33.377983 seconds old, received at 2020-04-08 14:02:48.805768: client_request(client.107721146:3780 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:02:48.805542 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:03:22.183 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:25.393 7f5d24263700  1 mds.jy-blk01 Updating MDS map to version 26940 from mon.2
2020-04-08 14:03:27.183 7f5d21a5e700  0 log_channel(cluster) log [WRN] : 17 slow requests, 1 included below; oldest blocked for > 216.873673 secs
2020-04-08 14:03:27.183 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:27.183 7f5d21a5e700  0 log_channel(cluster) log [WRN] : slow request 122.182003 seconds old, received at 2020-04-08 14:01:25.001852: client_request(client.107721146:3774 getattr pAsLsXsFs #0x10000003c9f 2020-04-08 14:01:25.001608 caller_uid=1002, caller_gid=1000{4,24,27,30,44,46,108,124,1000,1006,}) currently failed to rdlock, waiting
2020-04-08 14:03:27.183 7f5d21a5e700  0 log_channel(cluster) do_log log to syslog

2020-04-08 14:00:03.816 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26921 from mon.1
2020-04-08 14:00:23.847 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 33.537515 secs
2020-04-08 14:00:23.847 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:23.847 7f2d7b58b700  0 log_channel(cluster) log [WRN] : slow request 33.537515 seconds old, received at 2020-04-08 13:59:50.310262: rejoin:client.107721161:8648 currently acquired locks
2020-04-08 14:00:23.847 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:24.847 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26922 from mon.1
2020-04-08 14:00:25.856 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26923 from mon.1
2020-04-08 14:00:28.847 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 38.537604 secs
2020-04-08 14:00:28.847 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:33.847 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 43.537703 secs
2020-04-08 14:00:33.847 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:37.337 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26924 from mon.1
2020-04-08 14:00:38.847 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 48.537785 secs
2020-04-08 14:00:38.847 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:41.380 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26925 from mon.1
2020-04-08 14:00:43.847 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 53.537873 secs
2020-04-08 14:00:43.847 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:45.434 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26926 from mon.1
2020-04-08 14:00:48.847 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 58.537963 secs
2020-04-08 14:00:48.847 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:53.847 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 63.538053 secs
2020-04-08 14:00:53.847 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:53.847 7f2d7b58b700  0 log_channel(cluster) log [WRN] : slow request 63.538052 seconds old, received at 2020-04-08 13:59:50.310262: rejoin:client.107721161:8648 currently acquired locks
2020-04-08 14:00:53.847 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:00:58.847 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 68.538142 secs
2020-04-08 14:00:58.847 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:03.848 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 73.538231 secs
2020-04-08 14:01:03.848 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:08.848 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 78.538318 secs
2020-04-08 14:01:08.848 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:09.193 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26927 from mon.1
2020-04-08 14:01:13.848 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 83.538396 secs
2020-04-08 14:01:13.848 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:18.848 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 88.538485 secs
2020-04-08 14:01:18.848 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:23.848 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 93.538569 secs
2020-04-08 14:01:23.848 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:28.848 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 98.538660 secs
2020-04-08 14:01:28.848 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:33.848 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 103.538780 secs
2020-04-08 14:01:33.848 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:38.848 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 108.538858 secs
2020-04-08 14:01:38.848 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:41.261 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26928 from mon.1
2020-04-08 14:01:43.375 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26929 from mon.1
2020-04-08 14:01:43.849 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 113.538946 secs
2020-04-08 14:01:43.849 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:45.402 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26930 from mon.1
2020-04-08 14:01:48.849 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 118.539034 secs
2020-04-08 14:01:48.849 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:53.849 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 123.539120 secs
2020-04-08 14:01:53.849 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:53.849 7f2d7b58b700  0 log_channel(cluster) log [WRN] : slow request 123.539120 seconds old, received at 2020-04-08 13:59:50.310262: rejoin:client.107721161:8648 currently acquired locks
2020-04-08 14:01:53.849 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:01:58.848 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 128.539207 secs
2020-04-08 14:01:58.848 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:01.541 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26931 from mon.1
2020-04-08 14:02:03.848 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 133.539294 secs
2020-04-08 14:02:03.848 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:08.196 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26932 from mon.1
2020-04-08 14:02:08.848 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 138.539379 secs
2020-04-08 14:02:08.848 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:09.212 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26933 from mon.1
2020-04-08 14:02:12.974 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26934 from mon.1
2020-04-08 14:02:13.848 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 143.539467 secs
2020-04-08 14:02:13.848 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:18.848 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 148.539552 secs
2020-04-08 14:02:18.848 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:23.848 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 153.539637 secs
2020-04-08 14:02:23.848 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:28.849 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 158.539746 secs
2020-04-08 14:02:28.849 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:29.198 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26935 from mon.1
2020-04-08 14:02:33.849 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 163.539823 secs
2020-04-08 14:02:33.849 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:38.849 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 168.539910 secs
2020-04-08 14:02:38.849 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:41.426 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26936 from mon.1
2020-04-08 14:02:43.849 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 173.539997 secs
2020-04-08 14:02:43.849 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:48.849 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 178.540086 secs
2020-04-08 14:02:48.849 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:53.024 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26937 from mon.1
2020-04-08 14:02:53.849 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 183.540174 secs
2020-04-08 14:02:53.849 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:02:58.849 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 188.540268 secs
2020-04-08 14:02:58.849 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:03.850 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 193.540357 secs
2020-04-08 14:03:03.850 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:08.850 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 198.540442 secs
2020-04-08 14:03:08.850 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:13.052 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26938 from mon.1
2020-04-08 14:03:13.850 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 203.540526 secs
2020-04-08 14:03:13.850 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:18.850 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 208.540615 secs
2020-04-08 14:03:18.850 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:21.216 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26939 from mon.1
2020-04-08 14:03:23.850 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 213.540705 secs
2020-04-08 14:03:23.850 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:25.393 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26940 from mon.1
2020-04-08 14:03:28.850 7f2d7b58b700  0 log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 218.540792 secs
2020-04-08 14:03:28.850 7f2d7b58b700  0 log_channel(cluster) do_log log to syslog
2020-04-08 14:03:29.426 7f2d7dd90700  1 mds.jy-blk02 Updating MDS map to version 26941 from mon.1
2020-04-08 14:03:29.426 7f2d7dd90700  1 mds.2.cache handle_mds_failure mds.1 : recovery peers are 
Actions #23

Updated by Jeff Layton about 4 years ago

Ok. If you're able to change settings, can you try setting the MDCACHE Entries_HWMark value much lower? Like maybe down to 1000 or so, and see if that alleviates the problem?

The longer explanation is that the MDSes can end up in this state even when the clients are just holding Inode references. Each mdcache entry holds one such reference, and ganesha and ceph lack a way for libcephfs to ask ganesha to release a certain number of entries. Tuning the Entries_HWMark value very low may be enough to avoid the problem altogether.
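
For reference, a minimal sketch of how that looks in ganesha.conf (the value here is only an example to illustrate the tuning, not a recommendation):

MDCACHE {
    # keep ganesha's handle cache small so it pins fewer Inode references
    Entries_HWMark = 1000;
}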

Longer term, we probably need to plumb in some way for libcephfs to request the application (ganesha) release some of its Inode refs, but that's a much more involved project.

Actions #24

Updated by mitchell walls about 4 years ago

I have set that as well, to 100, and I'm still getting the same thing. The weird thing is that this only happens with multiple MDSes.
Entries_HWMark = 100;

HEALTH_WARN 2 MDSs report slow requests; too few PGs per OSD (15 < min 30)
MDS_SLOW_REQUEST 2 MDSs report slow requests
    mdsjy-blk01(mds.1): 1 slow requests are blocked > 30 secs
    mdsjy-blk02(mds.0): 3 slow requests are blocked > 30 secs
TOO_FEW_PGS too few PGs per OSD (15 < min 30)

Here is an example of something that happens which causes the slow requests.

(base) [root@jy-cmp04 figures_N4_30nearest]# ls
p450_apoN4_lag100.png  p450_apoN4_lag33.png  p450_apoN4_lag57.png  p450_apoN4_lag80.png
p450_apoN4_lag10.png   p450_apoN4_lag34.png  p450_apoN4_lag58.png  p450_apoN4_lag81.png
p450_apoN4_lag11.png   p450_apoN4_lag35.png  p450_apoN4_lag59.png  p450_apoN4_lag82.png
p450_apoN4_lag12.png   p450_apoN4_lag36.png  p450_apoN4_lag5.png   p450_apoN4_lag83.png
p450_apoN4_lag13.png   p450_apoN4_lag37.png  p450_apoN4_lag60.png  p450_apoN4_lag84.png
p450_apoN4_lag14.png   p450_apoN4_lag38.png  p450_apoN4_lag61.png  p450_apoN4_lag85.png
p450_apoN4_lag15.png   p450_apoN4_lag39.png  p450_apoN4_lag62.png  p450_apoN4_lag86.png
p450_apoN4_lag16.png   p450_apoN4_lag3.png   p450_apoN4_lag63.png  p450_apoN4_lag87.png
p450_apoN4_lag17.png   p450_apoN4_lag40.png  p450_apoN4_lag64.png  p450_apoN4_lag88.png
p450_apoN4_lag18.png   p450_apoN4_lag41.png  p450_apoN4_lag65.png  p450_apoN4_lag89.png
p450_apoN4_lag19.png   p450_apoN4_lag42.png  p450_apoN4_lag66.png  p450_apoN4_lag8.png
p450_apoN4_lag1.png    p450_apoN4_lag43.png  p450_apoN4_lag67.png  p450_apoN4_lag90.png
p450_apoN4_lag20.png   p450_apoN4_lag44.png  p450_apoN4_lag68.png  p450_apoN4_lag91.png
p450_apoN4_lag21.png   p450_apoN4_lag45.png  p450_apoN4_lag69.png  p450_apoN4_lag92.png
p450_apoN4_lag22.png   p450_apoN4_lag46.png  p450_apoN4_lag6.png   p450_apoN4_lag93.png
p450_apoN4_lag23.png   p450_apoN4_lag47.png  p450_apoN4_lag70.png  p450_apoN4_lag94.png
p450_apoN4_lag24.png   p450_apoN4_lag48.png  p450_apoN4_lag71.png  p450_apoN4_lag95.png
p450_apoN4_lag25.png   p450_apoN4_lag49.png  p450_apoN4_lag72.png  p450_apoN4_lag96.png
p450_apoN4_lag26.png   p450_apoN4_lag4.png   p450_apoN4_lag73.png  p450_apoN4_lag97.png
p450_apoN4_lag27.png   p450_apoN4_lag50.png  p450_apoN4_lag74.png  p450_apoN4_lag98.png
p450_apoN4_lag28.png   p450_apoN4_lag51.png  p450_apoN4_lag75.png  p450_apoN4_lag99.png
p450_apoN4_lag29.png   p450_apoN4_lag52.png  p450_apoN4_lag76.png  p450_apoN4_lag9.png
p450_apoN4_lag2.png    p450_apoN4_lag53.png  p450_apoN4_lag77.png  test
p450_apoN4_lag30.png   p450_apoN4_lag54.png  p450_apoN4_lag78.png
p450_apoN4_lag31.png   p450_apoN4_lag55.png  p450_apoN4_lag79.png
p450_apoN4_lag32.png   p450_apoN4_lag56.png  p450_apoN4_lag7.png
(base) [root@jy-cmp04 figures_N4_30nearest]# cd ..
ls
^C^C^C^C

ls

Actions #25

Updated by mitchell walls about 4 years ago

At this point I am okay with just running a single mds with 3 standbys. Every time I go to recreate the problem it is a bit of a struggle to get back to a single mds: evicting clients, restarting all MDSes, and setting max_mds back to 1 sometimes fails with stuck requests as well.
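
Roughly, the dance to get back to one active MDS looks like this (a sketch with placeholder names, not the exact commands from my terminal):

# drop back to a single active MDS rank
ceph fs set <fsname> max_mds 1
# if requests stay stuck, evict the offending client by id
ceph tell mds.<name> client evict id=<client-id>
# and restart the MDS daemons on their hosts
systemctl restart ceph-mds@<host>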

Actions #26

Updated by Jeff Layton about 4 years ago

Ok thanks. I'll see what I can do with a multimds setup.

To be clear, I think there is more than one problem here (and in more than one place):

1) cache pressure: we're probably holding more inodes than we really want to be. Sizing the Entries_HWMark value lower should keep that cache small, but...

2) ganesha is occasionally exceeding its high water marks in the mdcache: not clear on why that is yet, but this may be a bug in ganesha

3) slow requests: typically this is due to the MDS being unable to gather locks, usually because some client is not releasing caps. If it's only happening with multimds then it may be related to the remote lock facility in ceph. The problem is that I can't tell why the requests are slow with the info in the logs above, so we may need a higher debug level to understand why (a sketch of the commands for that follows this list).
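
Gathering that might look roughly like the following (a sketch assuming the usual centralized config and admin-socket commands; the MDS name is a placeholder):

# crank up MDS debug logging (20 is very verbose)
ceph config set mds debug_mds 20
ceph config set mds debug_ms 1
# on the MDS host, dump whatever requests are currently stuck
ceph daemon mds.<name> dump_ops_in_flight
# drop logging back down when done
ceph config set mds debug_mds 1/5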

I'll try to put together a setup closer to yours tomorrow and see if I reproduce it. Thanks for all your help so far!

Actions #27

Updated by mitchell walls about 4 years ago

I will increase logging on the mds and do the same thing again tomorrow. Thanks for all your help.

Actions #28

Updated by mitchell walls about 4 years ago

Okay so far, I haven't seen any messages about cache pressure recently. That is without the Entries_HWMark set at all. I'm thinking it is just mostly random.

Today I am going to focus on the slow requests. I am going to increase the debugging on the mds then cause the slow requests.

Can you give me an idea of what all you would like me to collect at this time?
What log level for all of the mdses?
Maybe a dump_ops_in_flight?

Actions #29

Updated by Jeff Layton about 4 years ago

mitchell walls wrote:

Here is an example of something that happens which causes the slow requests.

[...]

I set up a small cluster with 3 MDSes and haven't been able to reproduce this at all. What may be best is if we leave this bug for the cache pressure problems (which I think we have a handle on, and which mostly concern ganesha), and open a separate tracker ticket to track the problem you're having with slow requests on multimds (which may not be specific to ganesha).

Actions #30

Updated by mitchell walls about 4 years ago

Okay, sure, sounds good. As of right now I don't think I've seen the cache pressure messages in 2 days. I can't think of anything that I've changed; it might take another day I guess. Would you want me to create that ticket?

Actions #31

Updated by mitchell walls about 4 years ago

I have migrated the data away and did a fresh install of ceph using cephadm for octopus. I am not quite at the point of testing whether this issue is gone; I still have to migrate the data.

Actions #32

Updated by Jeff Layton about 4 years ago

I've been studying this problem for a couple of days now, and I now think we will probably need to plumb in some cache backpressure mechanisms to fix this correctly. The fundamental issue is that ganesha trims its own handle cache down to the Entries_HWMark value, and no further. That's based on scanning an LRU list of the entries and removing references to them.

The Ceph MDS wants to dynamically size the client's cache at times. The libcephfs library mostly tries to satisfy this by releasing dentries. ganesha doesn't cache dentries though -- it caches inode references. After a server reboot, where the clients reclaim everything by filehandle, we might not even have touched a single dentry. So, libcephfs will also need a callback mechanism to ask the application to release references to specific inodes if possible.

Ganesha will need a corresponding mechanism to allow that callback to release an entry from the cache.

I think with all of that in place, we'll be in a much better position to avoid cache pressure warnings under load.

Actions #33

Updated by mitchell walls about 4 years ago

Figured I'd let you know the cache pressure messages are back on Octopus, running in containers deployed using cephadm. Do you think that this is something I can ignore? Or should I temporarily switch over to using a kernel/fuse cephfs mount? Unfortunately my users aren't interested in or willing to use any APIs or RGW.

I am also running one mds at the moment. I plan to temporarily switch to three to see if the slow requests issue comes back. At that point can we say that it is not an issue on my end lol?

Also, if you have just a few minutes at some point, could you take a look at my other issue? It has to do with ganesha deployed in containers and the watch_url not working. I checked (using a terminal watcher) to make sure the dashboard is notifying on the config object when creating exports, and it is. The problem is I have to manually run podman kill --signal=SIGHUP "ganesha container id" in order to reload the exports.

Thanks for all your help!

Actions #34

Updated by Jeff Layton about 4 years ago

mitchell walls wrote:

Figured I'd let you know the cache pressure messages are back on Octopus, running in containers deployed using cephadm. Do you think that this is something I can ignore? Or should I temporarily switch over to using a kernel/fuse cephfs mount? Unfortunately my users aren't interested in or willing to use any APIs or RGW.

I'd permanently switch to kcephfs or fuse if you can (at least for the forseeable future). ceph+ganesha is really meant to be a means of "casual" access, at least for now. You'll likely get better performance out of a direct cephfs mount (either FUSE or kclient). ganesha+ceph means adding an extra hop for all of the data since we can't do pNFS to ceph objects (yet). It's still very much a WIP.
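
For completeness, a direct mount is just the usual kernel or FUSE client (a sketch with placeholder monitor address, cephx user, and paths):

# kernel client
mount -t ceph <mon-host>:6789:/ /mnt/cephfs -o name=<user>,secretfile=/etc/ceph/<user>.secret
# or the FUSE client
ceph-fuse -n client.<user> /mnt/cephfs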

I am also running one mds at the moment. I plan to temporarily switch to three to see if the slow requests issue comes back. At that point can we say that it is not an issue on my end lol?

It probably will come back. I suspect this issue is related, and that sometimes the MDS needs to recall pin caps to handle a migration event. Ganesha is holding the Inode references and won't let them go, so things get stuck. Proving this empirically is a bit tough though.

Also, if you have just a few minutes at some point, could you take a look at my other issue? It has to do with ganesha deployed in containers and the watch_url not working. I checked (using a terminal watcher) to make sure the dashboard is notifying on the config object when creating exports, and it is. The problem is I have to manually run podman kill --signal=SIGHUP "ganesha container id" in order to reload the exports.

I think that's a regression that crept into ganesha. The fix is here:

https://github.com/nfs-ganesha/nfs-ganesha/commit/640e387774f462cf04f7c7a591c6f9d8199ce312

Hopefully that will go into a stable ganesha release soon.

Actions #35

Updated by Jeff Layton about 4 years ago

All that said, the fact that all of this started occurring after an upgrade is curious. It would be nice to know whether something changed in ceph or ganesha at that point. BTW: what version of ganesha were you running before the upgrade?

Actions #36

Updated by mitchell walls about 4 years ago

One other thing, then we can close this issue if possible: do you know if knfsd sharing a cephfs mount is something that is recommended or worth it on the performance/stability end? The problem I am going to run into is that my "public_network" is kind of hidden. Maybe I can find a way to switch the public network.

Anyways thanks for all your help.

Actions #37

Updated by mitchell walls about 4 years ago

The upgrade to octopus was a brand new install. I migrated the old ceph cluster's data to a fast dtn temporarily, then mounted cephfs from both the new and old clusters and copied everything over.

If I remember right, the issue started on the old cluster after going from 14.2.1->14.2.7 on ceph and 2.8.2->2.8.3 on ganesha. I posted it in a previous comment.

Actions #38

Updated by mitchell walls about 4 years ago

There is also a possibility the workload has changed, but I believe it has stayed mostly the same for the last year. I asked the primary user to see if he was doing anything different and he said he wasn't.

I would say it is workload related. But it would happen as soon as I added more than 1 mds: then listing a directory would potentially cause slow requests. This would happen even from a cephfs kernel mount. The cache pressure always existed, even when the share/cluster was dormant during the weekend.

Actions #39

Updated by mitchell walls about 4 years ago

I'm going to send some stats on the total cephfs, number of files, size etc.

Actions #40

Updated by mitchell walls about 4 years ago

Usages:
Software and home directories for a slurm/ondemand cluster. It is not used at all yet; we are just starting out.

GPU job storage for namd and rstudio. The gpu jobs are not writing to the nfs-ganesha share. It is getting transferred as they complete.

Total CEPHFS all shared by nfs-ganesha.
SIZE: 5.6TB
REGULAR FILE COUNT: 1508890
DIR COUNT: 134463
SYMLINK COUNT: 12245

Actions #41

Updated by mitchell walls about 4 years ago

Nice update. I added over 1 million files when the issue possibly started happening.

Actions #42

Updated by mitchell walls about 4 years ago

MATLAB R2019b apparently installed 600000 files.

Actions #43

Updated by mitchell walls almost 4 years ago

I removed a million files (the application/home directory share) and now the cache pressure messages are much less frequent: roughly once an hour, on for 5-30ish minutes. All night the nfs share was unused (no traffic) with the cache pressure message still present off and on, which is a bit weird to me since it started happening after 40 minutes of 200ish iops and then continued all night after that stopped.

I am going to move to kernel/fuse mount cephfs this week. I have one question: do you think that it is ganesha that has shown stability issues at such a large file and directory count, since that was the primary change outside of the upgrade? Is there any possibility that cephfs could show the same issue? The reason I ask is because I previously had the issue with slow requests (not cache pressure warnings) on multiple MDSes while all of the ganesha exports were evicted.

I am glad we are getting somewhere, it is nice to know that I am helping even though I'm not a developer. I actually have the old cluster running outside of production that I can use to run tests for you. I might need it one more time this week if networking wants me to do a rebuild to change the public network.

Thank you for your time.

Actions #44

Updated by Jeff Layton almost 4 years ago

mitchell walls wrote:

I am going to move to kernel/fuse mount cephfs this week.

That's certainly worth testing. A kcephfs or ceph-fuse mount should be able to respond more readily to cache pressure.

I have one question: do you think that it is ganesha that has shown stability issues at such a large file and directory count, since that was the primary change outside of the upgrade? Is there any possibility that cephfs could show the same issue? The reason I ask is because I previously had the issue with slow requests (not cache pressure warnings) on multiple MDSes while all of the ganesha exports were evicted.

It's not clear. This may be a bug or it may be something that was always going to happen in your environment, once the dataset got large enough. I will say that we went through some similar struggles with ceph-fuse a couple of years ago, and had to plumb in better cache pressure handling there. I think we'll likely need to do the same with ganesha.

Actions #45

Updated by Jeff Layton almost 4 years ago

  • Pull request ID set to 34596

Ok, ceph patches are pretty much done. Just waiting on review so I can merge them. There are also some ganesha patches to make it use the new functionality, culminating in this patch:

https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/490848

Mitchell, I'd like to get these bits to you if you're able to test them in your environment. These are all client-side changes, so it should be sufficient to just run updated libcephfs, librados, and nfs-ganesha packages on the host that's running ganesha. Is this something you're able to do? It may take a few days to get packages together if so.

Actions #46

Updated by mitchell walls almost 4 years ago

The Ganesha server is now running in a podman container, version 15.2.1, so it should be easy to create an image with the patches? The issue is still there, somewhat worse now. I haven't moved to a direct cephfs mount, so ganesha is still in use.

Actions #47

Updated by Jeff Layton almost 4 years ago

Probably easy for some people, but it may take me some time. I'll see what I can do.

Actions #48

Updated by mitchell walls almost 4 years ago

I can install the package within the container as well if needed. What are the needed ceph patches? I could attempt this myself.

Actions #49

Updated by mitchell walls almost 4 years ago

Don't worry about the container; I can create it if needed. If not, I could also create the packages if you would like. I just need a list of patches to apply to the client packages, based off of CentOS 8.

Actions #50

Updated by Jeff Layton almost 4 years ago

I kicked off a ceph-ci build here:

https://shaman.ceph.com/builds/ceph/wip-jlayton-45114/d8f2dbd338fb5237bed8c34ddb4950791f1bff58/default/203542/

The ganesha patches are here:

https://github.com/jtlayton/nfs-ganesha/tree/ceph-cache-pressure

Ganesha will need to be built against the libcephfs-devel package built above.

Actions #51

Updated by mitchell walls almost 4 years ago

Just double checking: I need to update ceph on the nfs server to the RPMs from the CI, then build nfs-ganesha? Could you possibly give me the flags I would need to build ganesha for ceph? It would save me just a bit of time.

Actions #52

Updated by Jeff Layton almost 4 years ago

Sure, in a nutshell...

Check out the correct ganesha branch in git first. Clone my tree and cd into it, and then:

$ git checkout ceph-cache-pressure
$ mkdir build
$ cd build
$ cmake ../src
$ make srpm

That should produce an srpm that you can then build on the rhel/centos host with the correct libcephfs-devel and librados-devel packages installed.

$ sudo yum-builddep /path/to/srpm
$ rpm -ivh /path/to/srpm
$ rpmbuild -bb /path/to/specfile

...and that should produce packages.
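If you'd rather build binaries directly instead of going through the srpm, a rough sketch would be to enable just the Ceph FSAL at configure time (the flag names here assume the standard ganesha cmake options; everything else is the same as above):

$ cmake -DUSE_FSAL_CEPH=ON -DUSE_FSAL_RGW=OFF ../src
$ make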

Actions #53

Updated by mitchell walls almost 4 years ago

Okay, I have it set up. The userspace version for ganesha is showing:
ceph version 16.0.0-977-gd8f2dbd (d8f2dbd338fb5237bed8c34ddb4950791f1bff58) pacific (dev)

Going to give it some time. I'll report back on Monday if I don't see the messages.

I did build the docker image. It is just very large since I did all of the source building there.

Actions #54

Updated by Jeff Layton almost 4 years ago

Many thanks. Let me know how it goes!

Actions #55

Updated by mitchell walls almost 4 years ago

Unfortunately it seems to be back. It was also sitting relatively dormant during this time again. Let me know if you need any logs or stats.

Actions #56

Updated by mitchell walls almost 4 years ago

It seems the way it is manifesting might've changed; it used to just sit there. On Monday, I plan to test the multi-MDS problem.

4/24/20 2:38:26 PM
[WRN]
Health check failed: 1 clients failing to respond to cache pressure (MDS_CLIENT_RECALL)

4/24/20 2:25:39 PM
[INF]
Cluster is now healthy

4/24/20 2:25:39 PM
[INF]
Health check cleared: MDS_CLIENT_RECALL (was: 1 clients failing to respond to cache pressure)

4/24/20 2:25:38 PM
[INF]
MDS health message cleared (mds.0): Client dh-ceph-gw01 failing to respond to cache pressure

4/24/20 2:20:26 PM
[WRN]
Health check failed: 1 clients failing to respond to cache pressure (MDS_CLIENT_RECALL)

4/24/20 2:19:58 PM
[INF]
Cluster is now healthy

4/24/20 2:19:58 PM
[INF]
Health check cleared: MDS_CLIENT_RECALL (was: 1 clients failing to respond to cache pressure)

4/24/20 2:19:58 PM
[INF]
MDS health message cleared (mds.0): Client dh-ceph-gw01 failing to respond to cache pressure

4/24/20 2:10:07 PM
[WRN]
Health check failed: 1 clients failing to respond to cache pressure (MDS_CLIENT_RECALL)

4/24/20 2:00:00 PM
[INF]
overall HEALTH_OK

4/24/20 1:00:00 PM
[INF]
overall HEALTH_OK
Actions #57

Updated by Jeff Layton almost 4 years ago

Yeah, I think it's basically doing what it's supposed to do but there may be some delay between libcephfs getting the message to shrink its caches and that actually happening. It may also just be that the application you're using is holding files open for a long time. Ganesha can't release them if they are held open. If that's the case, then you may need to consider increasing the size of the mds caches as well.
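As a rough sketch (the 8 GiB figure below is purely an example, not a recommendation -- size it to the RAM available on your MDS hosts), the cache limit can be raised centrally with:

# example only: raise the MDS cache memory limit to 8 GiB for all MDS daemons
$ ceph config set mds mds_cache_memory_limit 8589934592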

In any case, let me know what happens with the multi-mds setup.

Actions #58

Updated by mitchell walls almost 4 years ago

I would say that it would be related to load or open files, but I saw the message on both the new and old ceph clusters. It was even showing up continually for hours when nothing was mounted to the NFS share on the new or old cluster, depending on which one did not have the VIP address at the time.

Running multiple MDSes now.

Actions #59

Updated by mitchell walls almost 4 years ago

I forgot to mention that the message has been going continually for a few hours.

4/25/20 4:00:00 AM
[WRN]
overall HEALTH_WARN 1 clients failing to respond to cache pressure

4/25/20 3:00:00 AM
[WRN]
overall HEALTH_WARN 1 clients failing to respond to cache pressure

4/25/20 2:00:00 AM
[WRN]
overall HEALTH_WARN 1 clients failing to respond to cache pressure

4/25/20 1:00:00 AM
[WRN]
overall HEALTH_WARN 1 clients failing to respond to cache pressure

4/25/20 12:00:00 AM
[WRN]
overall HEALTH_WARN 1 clients failing to respond to cache pressure

4/24/20 11:00:00 PM
[WRN]
overall HEALTH_WARN 1 clients failing to respond to cache pressure

4/24/20 10:00:00 PM
[WRN]
overall HEALTH_WARN 1 clients failing to respond to cache pressure

4/24/20 9:00:00 PM
[WRN]
overall HEALTH_WARN 1 clients failing to respond to cache pressure

4/24/20 8:49:16 PM
[WRN]
Health check failed: 1 clients failing to respond to cache pressure (MDS_CLIENT_RECALL)

4/24/20 8:34:08 PM
[INF]
Cluster is now healthy
Actions #60

Updated by Jeff Layton almost 4 years ago

mitchell walls wrote:

I would say that it would be related to load or open files, but I saw the message on both the new and old ceph clusters. It was even showing up continually for hours when nothing was mounted to the NFS share on the new or old cluster, depending on which one did not have the VIP address at the time.

Running multiple MDSes now.

I think I need a much better description of how you're testing this and your NFS server configuration.

So are you saying that you're seeing cache pressure warnings even when there has been zero activity on the server since it was started? Or are you saying that you see cache pressure warnings after activity against the server for a while, and that the warnings don't go away after you move the virtual IP to another server?

Note that migrating IP addresses between two different instances like that is not a configuration that we're testing or supporting, btw. You may end up seeing reboot recovery issues unless you are somehow ensuring exclusive access to the recovery backend objects during that time.
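For what it's worth, the recovery backend in question is selected in ganesha.conf; a minimal sketch (rados_ng is just one of the supported values, and the block name assumes a 2.7-or-later config):

NFSv4 {
    # keep reboot-recovery state in RADOS rather than on the local filesystem
    RecoveryBackend = rados_ng;
}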

Actions #61

Updated by mitchell walls almost 4 years ago

Let me test it some more. I think the cache pressure messages were showing up without any load. I am going to see if the messages show up on the new cluster with nothing connected to ganesha. Give me a day or two to mess around with this.

The virtual IP is just used temporarily to switch ganesha servers, which are connected to separate ceph clusters. I have two ganesha servers on completely different hardware that are connected to two different ceph clusters, both on completely different hardware as well. If I switch the virtual IP it is a brand new connection, since it causes the mount to go stale and requires a remount to the virtual IP. Both ceph clusters are hosting the same data and I keep them rsynced occasionally, or every time I need to switch while we are testing. Rsyncing is done using kernel mounts.

The reason for two separate ceph clusters was to see if that would fix the issues, as well as an opportunity to upgrade to the newest version and containerized ceph. The other reason for the virtual IP is that I need the user to have access to the data on the cluster, as we can only use nfs-ganesha at this time. I have been doing some modifications to the new cluster (changing public networks and modifying the cephfs organization), so everything is running through the old one.

Actions #62

Updated by Patrick Donnelly almost 4 years ago

  • Description updated (diff)

(reformatted)

Actions #63

Updated by Patrick Donnelly almost 4 years ago

I haven't read the entire thread but my theory is:

(a) NFS-Ganesha has a reference leak that is preventing it from releasing caps.
(b) The MDS changed in v14.2.8 (see #41865) to recall caps from quiescent sessions. Even with spare cache space, the MDS will recall caps from clients (i.e. nfs-ganesha) not doing anything.

Ganesha can't respond to the recall behavior so it shows up as a client failing to respond to cache pressure. This is indicated by the release_caps counter showing 0. The failure to lock when doing I/O on the tree also indicates this because Ganesha can never release caps.
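One way to check that (a sketch; run against the active MDS's admin socket, and the exact field names can differ a bit between releases) is to list the client sessions and compare num_caps with the recall/release counters for the ganesha session:

# <name> is the MDS daemon name on that host
$ ceph daemon mds.<name> session ls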

Actions #64

Updated by Jeff Layton almost 4 years ago

I don't see any evidence for a refcount leak in ganesha. What ganesha does do is maintain a cache of filehandles, and each of those holds a ceph Inode reference. My understanding is that that implies a "pin" cap from the MDS.

Up until the patches I did for this, ganesha didn't have a way to release those due to pressure from libcephfs -- only when it exceeded a certain count of filehandles, and based on an LRU.

With the changes I've proposed, ganesha gets a callback and then can release the filehandle iff it's not in active use. If clients are holding a lot of files open though, then it may not be able to release them.
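For context, the only knob before these patches was the handle-cache high-water mark in ganesha.conf; a minimal sketch of lowering it (assuming the MDCACHE block used by 2.8-era ganesha; older releases called the block CACHEINODE):

MDCACHE {
    # example value; caps how many handles (and thus Ceph Inode refs) ganesha keeps cached
    Entries_HWMark = 50000;
}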

Actions #65

Updated by mitchell walls almost 4 years ago

Cephfs for our cluster has grown significantly in the number of clients due to joining a large HTC grid. I will not expand the MDS count until I need to. I am curious, since I am still seeing the cache pressure message only from the 1 ganesha client out of a total of 24 clients (unfortunately we need ganesha for one legacy server): do you think that we will see degraded performance due to ganesha? I have pasted the mds perf dump below.

[root@dh-ceph01 ~]# ceph tell mds.0 perf dump
2020-05-13T06:09:15.350-0500 7f57037fe700  0 client.241970 ms_handle_reset on v2:10.10.48.40:6850/1024128545
2020-05-13T06:09:15.376-0500 7f570cff9700  0 client.241980 ms_handle_reset on v2:10.10.48.40:6850/1024128545
{
    "AsyncMessenger::Worker-0": {
        "msgr_recv_messages": 16192169,
        "msgr_send_messages": 13269945,
        "msgr_recv_bytes": 5110888844,
        "msgr_send_bytes": 18121163294,
        "msgr_created_connections": 23005,
        "msgr_active_connections": 27,
        "msgr_running_total_time": 719.551073636,
        "msgr_running_send_time": 298.941735142,
        "msgr_running_recv_time": 17132.996499097,
        "msgr_running_fast_dispatch_time": 20.863573874,
        "msgr_send_messages_queue_lat": {
            "avgcount": 13269945,
            "sum": 341.787665477,
            "avgtime": 0.000025756
        },
        "msgr_handle_ack_lat": {
            "avgcount": 8916999,
            "sum": 3.012752779,
            "avgtime": 0.000000337
        }
    },
    "AsyncMessenger::Worker-1": {
        "msgr_recv_messages": 5354392,
        "msgr_send_messages": 8144582,
        "msgr_recv_bytes": 1499369079,
        "msgr_send_bytes": 11710760385,
        "msgr_created_connections": 24961,
        "msgr_active_connections": 26,
        "msgr_running_total_time": 376.617117064,
        "msgr_running_send_time": 176.943254554,
        "msgr_running_recv_time": 432.537346632,
        "msgr_running_fast_dispatch_time": 16.974740563,
        "msgr_send_messages_queue_lat": {
            "avgcount": 8144574,
            "sum": 296.974236924,
            "avgtime": 0.000036462
        },
        "msgr_handle_ack_lat": {
            "avgcount": 4910029,
            "sum": 1.815952941,
            "avgtime": 0.000000369
        }
    },
    "AsyncMessenger::Worker-2": {
        "msgr_recv_messages": 1839791,
        "msgr_send_messages": 3000865,
        "msgr_recv_bytes": 611292197,
        "msgr_send_bytes": 7939585448,
        "msgr_created_connections": 25388,
        "msgr_active_connections": 28,
        "msgr_running_total_time": 219.774878286,
        "msgr_running_send_time": 97.883445060,
        "msgr_running_recv_time": 101.309807342,
        "msgr_running_fast_dispatch_time": 13.852478489,
        "msgr_send_messages_queue_lat": {
            "avgcount": 3000863,
            "sum": 283.374957031,
            "avgtime": 0.000094431
        },
        "msgr_handle_ack_lat": {
            "avgcount": 1885947,
            "sum": 0.840600351,
            "avgtime": 0.000000445
        }
    },
    "cct": {
        "total_workers": 1,
        "unhealthy_workers": 0
    },
    "finisher-MDSRank": {
        "queue_len": 0,
        "complete_latency": {
            "avgcount": 752442,
            "sum": 215.468005496,
            "avgtime": 0.000286358
        }
    },
    "finisher-PurgeQueue": {
        "queue_len": 0,
        "complete_latency": {
            "avgcount": 97465,
            "sum": 3.075045257,
            "avgtime": 0.000031550
        }
    },
    "mds": {
        "request": 8168047,
        "reply": 8168017,
        "reply_latency": {
            "avgcount": 8168017,
            "sum": 1939.588228047,
            "avgtime": 0.000237461
        },
        "forward": 0,
        "dir_fetch": 20410,
        "dir_commit": 190175,
        "dir_split": 3,
        "dir_merge": 0,
        "inodes": 1661698,
        "inodes_top": 1074070,
        "inodes_bottom": 587628,
        "inodes_pin_tail": 0,
        "inodes_pinned": 127311,
        "inodes_expired": 1093342,
        "inodes_with_caps": 5268,
        "caps": 7776,
        "subtrees": 2,
        "traverse": 9306762,
        "traverse_hit": 9162645,
        "traverse_forward": 0,
        "traverse_discover": 0,
        "traverse_dir_fetch": 2,
        "traverse_remote_ino": 0,
        "traverse_lock": 51,
        "load_cent": 69,
        "q": 0,
        "exported": 0,
        "exported_inodes": 0,
        "imported": 0,
        "imported_inodes": 0,
        "openino_dir_fetch": 0,
        "openino_backtrace_fetch": 244,
        "openino_peer_discover": 0,
        "root_rfiles": 1526728,
        "root_rbytes": 5791695977592,
        "root_rsnaps": 0
    },
    "mds_cache": {
        "num_strays": 0,
        "num_strays_delayed": 0,
        "num_strays_enqueuing": 0,
        "strays_created": 30365,
        "strays_enqueued": 30335,
        "strays_reintegrated": 59,
        "strays_migrated": 0,
        "num_recovering_processing": 0,
        "num_recovering_enqueued": 0,
        "num_recovering_prioritized": 0,
        "recovery_started": 2,
        "recovery_completed": 2,
        "ireq_enqueue_scrub": 0,
        "ireq_exportdir": 0,
        "ireq_flush": 0,
        "ireq_fragmentdir": 3,
        "ireq_fragstats": 0,
        "ireq_inodestats": 0
    },
    "mds_log": {
        "evadd": 11612288,
        "evex": 11621355,
        "evtrm": 11621355,
        "ev": 109146,
        "evexg": 0,
        "evexd": 1024,
        "segadd": 12901,
        "segex": 12901,
        "segtrm": 12901,
        "seg": 129,
        "segexg": 0,
        "segexd": 1,
        "expos": 22742567762,
        "wrpos": 22892586935,
        "rdpos": 3994251872,
        "jlat": {
            "avgcount": 463511,
            "sum": 7053.128651580,
            "avgtime": 0.015216744
        },
        "replayed": 118213
    },
    "mds_mem": {
        "ino": 1661585,
        "ino+": 1691920,
        "ino-": 30335,
        "dir": 125151,
        "dir+": 127848,
        "dir-": 2697,
        "dn": 1662322,
        "dn+": 2794361,
        "dn-": 1132039,
        "cap": 7776,
        "cap+": 22010052,
        "cap-": 22002276,
        "rss": 5882124,
        "heap": 331964
    },
    "mds_server": {
        "dispatch_client_request": 8496719,
        "dispatch_server_request": 0,
        "handle_client_request": 8168047,
        "handle_client_session": 1396477,
        "handle_slave_request": 0,
        "req_create_latency": {
            "avgcount": 1063774,
            "sum": 416.446819056,
            "avgtime": 0.000391480
        },
        "req_getattr_latency": {
            "avgcount": 780158,
            "sum": 452.001666216,
            "avgtime": 0.000579371
        },
        "req_getfilelock_latency": {
            "avgcount": 0,
            "sum": 0.000000000,
            "avgtime": 0.000000000
        },
        "req_link_latency": {
            "avgcount": 158,
            "sum": 0.040710362,
            "avgtime": 0.000257660
        },
        "req_lookup_latency": {
            "avgcount": 624639,
            "sum": 38.526743836,
            "avgtime": 0.000061678
        },
        "req_lookuphash_latency": {
            "avgcount": 0,
            "sum": 0.000000000,
            "avgtime": 0.000000000
        },
        "req_lookupino_latency": {
            "avgcount": 123,
            "sum": 0.124224542,
            "avgtime": 0.001009955
        },
        "req_lookupname_latency": {
            "avgcount": 0,
            "sum": 0.000000000,
            "avgtime": 0.000000000
        },
        "req_lookupparent_latency": {
            "avgcount": 1,
            "sum": 0.000032967,
            "avgtime": 0.000032967
        },
        "req_lookupsnap_latency": {
            "avgcount": 6,
            "sum": 0.000223386,
            "avgtime": 0.000037231
        },
        "req_lssnap_latency": {
            "avgcount": 27773,
            "sum": 0.834396023,
            "avgtime": 0.000030043
        },
        "req_mkdir_latency": {
            "avgcount": 106905,
            "sum": 130.336735428,
            "avgtime": 0.001219182
        },
        "req_mknod_latency": {
            "avgcount": 2,
            "sum": 0.004692740,
            "avgtime": 0.002346370
        },
        "req_mksnap_latency": {
            "avgcount": 0,
            "sum": 0.000000000,
            "avgtime": 0.000000000
        },
        "req_open_latency": {
            "avgcount": 20624,
            "sum": 2.768943281,
            "avgtime": 0.000134258
        },
        "req_readdir_latency": {
            "avgcount": 1852424,
            "sum": 427.746657278,
            "avgtime": 0.000230911
        },
        "req_rename_latency": {
            "avgcount": 1058674,
            "sum": 149.169771728,
            "avgtime": 0.000140902
        },
        "req_renamesnap_latency": {
            "avgcount": 0,
            "sum": 0.000000000,
            "avgtime": 0.000000000
        },
        "req_rmdir_latency": {
            "avgcount": 2694,
            "sum": 4.553650222,
            "avgtime": 0.001690293
        },
        "req_rmsnap_latency": {
            "avgcount": 0,
            "sum": 0.000000000,
            "avgtime": 0.000000000
        },
        "req_rmxattr_latency": {
            "avgcount": 8451,
            "sum": 0.932765755,
            "avgtime": 0.000110373
        },
        "req_setattr_latency": {
            "avgcount": 1048708,
            "sum": 148.584209959,
            "avgtime": 0.000141683
        },
        "req_setdirlayout_latency": {
            "avgcount": 0,
            "sum": 0.000000000,
            "avgtime": 0.000000000
        },
        "req_setfilelock_latency": {
            "avgcount": 34694,
            "sum": 3.761646479,
            "avgtime": 0.000108423
        },
        "req_setlayout_latency": {
            "avgcount": 0,
            "sum": 0.000000000,
            "avgtime": 0.000000000
        },
        "req_setxattr_latency": {
            "avgcount": 1519190,
            "sum": 155.909210569,
            "avgtime": 0.000102626
        },
        "req_symlink_latency": {
            "avgcount": 13583,
            "sum": 2.938335458,
            "avgtime": 0.000216324
        },
        "req_unlink_latency": {
            "avgcount": 5436,
            "sum": 4.906792762,
            "avgtime": 0.000902647
        },
        "cap_revoke_eviction": 0
    },
    "mds_sessions": {
        "session_count": 24,
        "session_add": 381,
        "session_remove": 357,
        "sessions_open": 24,
        "sessions_stale": 0,
        "total_load": 23,
        "average_load": 0,
        "avg_session_uptime": 756986
    },
    "mempool": {
        "bloom_filter_bytes": 10318316,
        "bloom_filter_items": 10318316,
        "bluestore_alloc_bytes": 0,
        "bluestore_alloc_items": 0,
        "bluestore_cache_data_bytes": 0,
        "bluestore_cache_data_items": 0,
        "bluestore_cache_onode_bytes": 0,
        "bluestore_cache_onode_items": 0,
        "bluestore_cache_other_bytes": 0,
        "bluestore_cache_other_items": 0,
        "bluestore_fsck_bytes": 0,
        "bluestore_fsck_items": 0,
        "bluestore_txc_bytes": 0,
        "bluestore_txc_items": 0,
        "bluestore_writing_deferred_bytes": 0,
        "bluestore_writing_deferred_items": 0,
        "bluestore_writing_bytes": 0,
        "bluestore_writing_items": 0,
        "bluefs_bytes": 0,
        "bluefs_items": 0,
        "buffer_anon_bytes": 885669,
        "buffer_anon_items": 556,
        "buffer_meta_bytes": 0,
        "buffer_meta_items": 0,
        "osd_bytes": 0,
        "osd_items": 0,
        "osd_mapbl_bytes": 0,
        "osd_mapbl_items": 0,
        "osd_pglog_bytes": 0,
        "osd_pglog_items": 0,
        "osdmap_bytes": 21952,
        "osdmap_items": 570,
        "osdmap_mapping_bytes": 0,
        "osdmap_mapping_items": 0,
        "pgmap_bytes": 0,
        "pgmap_items": 0,
        "mds_co_bytes": 4510033464,
        "mds_co_items": 63134747,
        "unittest_1_bytes": 0,
        "unittest_1_items": 0,
        "unittest_2_bytes": 0,
        "unittest_2_items": 0
    },
    "objecter": {
        "op_active": 0,
        "op_laggy": 0,
        "op_send": 3113016,
        "op_send_bytes": 22140859871,
        "op_resend": 0,
        "op_reply": 3113016,
        "op": 3113016,
        "op_r": 38103,
        "op_w": 3074913,
        "op_rmw": 0,
        "op_pg": 0,
        "osdop_stat": 83817,
        "osdop_create": 2232430,
        "osdop_read": 17429,
        "osdop_write": 480869,
        "osdop_writefull": 52950,
        "osdop_writesame": 0,
        "osdop_append": 0,
        "osdop_zero": 2,
        "osdop_truncate": 3,
        "osdop_delete": 67218,
        "osdop_mapext": 0,
        "osdop_sparse_read": 0,
        "osdop_clonerange": 0,
        "osdop_getxattr": 20656,
        "osdop_setxattr": 3408671,
        "osdop_cmpxattr": 0,
        "osdop_rmxattr": 0,
        "osdop_resetxattrs": 0,
        "osdop_call": 0,
        "osdop_watch": 0,
        "osdop_notify": 0,
        "osdop_src_cmpxattr": 0,
        "osdop_pgls": 0,
        "osdop_pgls_filter": 0,
        "osdop_other": 1348,
        "linger_active": 0,
        "linger_send": 0,
        "linger_resend": 0,
        "linger_ping": 0,
        "poolop_active": 0,
        "poolop_send": 0,
        "poolop_resend": 0,
        "poolstat_active": 0,
        "poolstat_send": 0,
        "poolstat_resend": 0,
        "statfs_active": 0,
        "statfs_send": 0,
        "statfs_resend": 0,
        "command_active": 0,
        "command_send": 0,
        "command_resend": 0,
        "map_epoch": 3294,
        "map_full": 0,
        "map_inc": 1304,
        "osd_sessions": 57,
        "osd_session_open": 71162,
        "osd_session_close": 71105,
        "osd_laggy": 0,
        "omap_wr": 445985,
        "omap_rd": 40834,
        "omap_del": 115828
    },
    "oft": {
        "omap_total_objs": 1,
        "omap_total_kv_pairs": 11,
        "omap_total_updates": 38544,
        "omap_total_removes": 507678
    },
    "purge_queue": {
        "pq_executing_ops": 0,
        "pq_executing_ops_high_water": 82,
        "pq_executing": 0,
        "pq_executing_high_water": 52,
        "pq_executed": 30335,
        "pq_item_in_journal": 0
    },
    "throttle-msgr_dispatch_throttler-mds": {
        "val": 0,
        "max": 104857600,
        "get_started": 0,
        "get": 23386349,
        "get_sum": 5490958524,
        "get_or_fail_fail": 0,
        "get_or_fail_success": 23386349,
        "take": 0,
        "take_sum": 0,
        "put": 23386349,
        "put_sum": 5490958524,
        "wait": {
            "avgcount": 0,
            "sum": 0.000000000,
            "avgtime": 0.000000000
        }
    },
    "throttle-objecter_bytes": {
        "val": 0,
        "max": 104857600,
        "get_started": 0,
        "get": 0,
        "get_sum": 0,
        "get_or_fail_fail": 0,
        "get_or_fail_success": 0,
        "take": 3113016,
        "take_sum": 22407217231,
        "put": 3044420,
        "put_sum": 22407217231,
        "wait": {
            "avgcount": 0,
            "sum": 0.000000000,
            "avgtime": 0.000000000
        }
    },
    "throttle-objecter_ops": {
        "val": 0,
        "max": 1024,
        "get_started": 0,
        "get": 0,
        "get_sum": 0,
        "get_or_fail_fail": 0,
        "get_or_fail_success": 0,
        "take": 3113016,
        "take_sum": 3113016,
        "put": 3113016,
        "put_sum": 3113016,
        "wait": {
            "avgcount": 0,
            "sum": 0.000000000,
            "avgtime": 0.000000000
        }
    },
    "throttle-write_buf_throttle": {
        "val": 0,
        "max": 3758096384,
        "get_started": 0,
        "get": 30335,
        "get_sum": 3284131,
        "get_or_fail_fail": 0,
        "get_or_fail_success": 30335,
        "take": 0,
        "take_sum": 0,
        "put": 17358,
        "put_sum": 3284131,
        "wait": {
            "avgcount": 0,
            "sum": 0.000000000,
            "avgtime": 0.000000000
        }
    },
    "throttle-write_buf_throttle-0x5603c6253f40": {
        "val": 0,
        "max": 3758096384,
        "get_started": 0,
        "get": 11612288,
        "get_sum": 18898334922,
        "get_or_fail_fail": 0,
        "get_or_fail_success": 11612288,
        "take": 0,
        "take_sum": 0,
        "put": 463511,
        "put_sum": 18898334922,
        "wait": {
            "avgcount": 0,
            "sum": 0.000000000,
            "avgtime": 0.000000000
        }
    }
}

Actions #66

Updated by mitchell walls almost 4 years ago

We have officially moved off of the old cluster, so now I can mess around with it without any worries (it is still serving rbd for openstack). I mounted the filesystem over NFS and ran a find -type f on the NFS mount; after the find had been running for a bit I started getting "MDSs report slow requests" and the find completely hangs. If I stop the find, the MDS slow requests warning stays active and one of the monitors starts flopping. Essentially the same behaviour, currently on version 15.2.1 running in containers with cephadm as the orchestrator.

I am mounting with noatime, and the find -type f slow requests happen on a kernel mount as well. Not sure why they are wrlocks. One other thing: the slow requests persist even after evicting all of the clients.
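For reference, here is a sketch of what I can run to capture the stuck requests and the locks they are waiting on, if that helps (run against the active MDS via its admin socket; the daemon name is a placeholder):

$ ceph health detail
# on the host running the active MDS
$ ceph daemon mds.<name> dump_ops_in_flight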

My goal for today is to see if the cache pressure messages pop up with ganesha being the only client, with one NFS share, a single MDS, and nothing mounting that NFS share. I'll play around with multiple MDSes and log the process as time allows. Something is definitely severely messed up with the slow MDS requests happening with kernel mounts and multiple MDSes. The cache pressure messages never show up for kernel mounts, at least not that I've noticed in the logs on the new cluster.

Let me know what you would like to have run and I will work on it as time permits.
Thanks!

Actions #67

Updated by mitchell walls almost 4 years ago

Okay, I have left the NFS server and the old ceph cluster connected and haven't seen a cache pressure message. I added a client and mounted a share from ganesha yesterday, and still didn't see the message. This is running ganesha on the 15.2.1 ceph container image. I am now thinking you might be correct that one of the clients connected to ganesha on the new cluster is behaving badly as far as open files, though I don't see any activity. Is there any way to see stats on a per-client or overall basis in ganesha?

Now we can focus on the MDS problem that occurs on the old and most likely on the new cluster (I won't test that at the moment). You were wanting a new ticket for this?

Actions #68

Updated by Jeff Layton almost 4 years ago

Unfortunately, ganesha doesn't have great instrumentation in this area. There is a ganeshactl program that ships with ganesha that can fetch some info from the daemon using dbus. I'm not sure if any of that shows the number of files currently held open, though. It does track some info about open files, but the stats gathering doesn't seem to be properly wired up to count open files that don't have kernel file descriptors (as is the case with FSAL_CEPH). It may be best to take this question to the nfs-ganesha mailing list, as the developers there may have suggestions on how to approach this.
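If your packages ship the ganesha_stats helper (it wraps the same dbus interface), something along these lines may be worth trying -- treat the subcommand names as a sketch, since they vary between versions:

$ ganesha_stats list_clients
$ ganesha_stats global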

As far as opening a new tracker ticket, that's probably best. As always, it's best to keep tickets like this as narrowly focused as possible. One tracker bug per actual problem will drive these things to resolution faster. It's hard to get my arms around this one since it seems to involve several different but related issues, and I'm a bit confused by your testing as you seem to have been changing several things at once.

Actions #69

Updated by mitchell walls almost 4 years ago

Okay, I will work on writing up a new ticket for the slow requests problem and, for the moment, not do anything to troubleshoot the multiple-MDS slow requests problem, given that recreating it takes an hour to move back to 1 MDS from 3 MDSes. I am focusing on 1 MDS and the cache pressure problem.

All of my testing from now on is happening on the previous, out-of-production cluster that I migrated the data off of. The thing I am testing right now is whether the cache pressure messages show up with nothing connected to nfs-ganesha. The cluster is healthy (1 MDS) except for the cache pressure messages, which have been on for over 8 hours. The only client connected to the ceph filesystem is nfs-ganesha. Nothing is mounting NFS from nfs-ganesha since I have firewalled it off. I find this very unusual. I am hoping for some steps on how to troubleshoot why this is happening. This is also currently happening on the new, freshly installed production cluster where everything was slowly migrated over via rsync. I hope this clears up the confusion.

Actions #70

Updated by Patrick Donnelly almost 4 years ago

  • Related to Feature #12334: nfs-ganesha: handle client cache pressure in NFS Ganesha FSAL added
Actions #71

Updated by Jeff Layton over 2 years ago

  • Status changed from New to Resolved

I think we ended up resolving the original issue, and then the bug wandered off into the weeds and Mitchell opened a new one for it.

Actions #72

Updated by Mike Lowe almost 2 years ago

I believe I'm seeing this condition with my cluster. What do you need to know to proceed?
