Bug #44976
Updated by Patrick Donnelly about 4 years ago
I originally posted this on the nfs-ganesha GitHub issues. They referred me here, since they think it is related to a problem in libcephfs or the MDS; their view was that `Attr_Expiration_Time = 0` was aggravating an underlying issue rather than being the issue itself. I am planning on upgrading to Octopus, but I figured I would wait a couple of weeks while we figure this out.

OS: CentOS 7.7.1908
Versions: Ceph 14.2.8, NFS-Ganesha 2.8.3

Twelve days ago I upgraded from 14.2.7 to 14.2.8, and I started seeing the cache-pressure messages after that upgrade. The CephFS system sat fairly dormant for over a week; then last Thursday all of this started:

* Whether I have 1 MDS or 3, I tend to see clients failing to respond to cache pressure.
* The MDS always reports damaged metadata, which pops back up after doing a scrub. I can repair the metadata, but it eventually shows right back up.
* If I have 3 MDSes, I also start to get "3 MDS report slow requests"; the requests are failing to RDLOCK/WRLOCK.

I should be able to recreate this; I just want to make sure I am capturing all of the logs needed when I do. Let me know which logs and log levels you would like, so I get the right logs at the right levels.

The cluster has existed for almost a year (10 months), on Nautilus the entire time, with no disk or hardware failures. The NFS CephFS share has existed pretty much since the beginning with the exact same use case.

Once the filesystem gets bad enough with 3 MDSes and metadata damage, the slow-requests issue will arise just by listing files in a directory from a kernel mount. NFS-Ganesha logs at debug level show nothing at all; if I increase to 3 MDSes, then they start to show this:

<pre>
03/04/2020 13:57:51 : epoch 5e866ee0 : cmp04: ganesha.nfsd-1096024[svc_595] posix2fsal_error :FSAL :CRIT :Mapping 108(default) to ERR_FSAL_SERVERFAULT
</pre>

I am using the PG autoscaler for this cluster, so besides the CephFS warnings that is the only other thing I get.
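When I recreate it, I assume you will want elevated MDS logging. This is roughly what I was planning to run before reproducing (a sketch; these levels are my guess at what is useful, so correct me if you want different ones):

```shell
# Raise MDS and messenger debug levels cluster-wide via the centralized
# config store (available on Nautilus) before reproducing the issue.
ceph config set mds debug_mds 20
ceph config set mds debug_ms 1

# ... reproduce the cache-pressure / RDLOCK-WRLOCK problem here ...

# Drop the levels back down afterwards so the logs stay manageable.
ceph config set mds debug_mds 1
ceph config set mds debug_ms 0
```

These commands require a live cluster with an admin keyring; the log output lands in the usual MDS log location (syslog in my setup, since I have `log_to_syslog = true`).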
These are the messages with just 1 MDS. I plan to switch to multiple MDSes and recreate the slow-requests RDLOCK/WRLOCK issue. ALL OF THE FOLLOWING IS 1 MDS. Like I said, it gets much worse with 3.

<pre>
[root@cmp03 ~]# ceph -s
  cluster:
    id:     FSID
    health: HEALTH_WARN
            2 clients failing to respond to cache pressure
            1 MDSs report damaged metadata
            too few PGs per OSD (15 < min 30)

  services:
    mon: 3 daemons, quorum mon01,mon02,mon03 (age 12d)
    mgr: mon01(active, since 12d), standbys: mon02, mon03
    mds: nfs:1 {0=cmp03=up:active} 3 up:standby
    osd: 64 osds: 64 up (since 12d), 64 in (since 9w)

  data:
    pools:   8 pools, 336 pgs
    objects: 4.78M objects, 6.6 TiB
    usage:   43 TiB used, 446 TiB / 488 TiB avail
    pgs:     336 active+clean

  io:
    client: 0 B/s rd, 338 KiB/s wr, 14 op/s rd, 16 op/s wr

[root@cmp03 ~]# ceph health detail
HEALTH_ERR 2 clients failing to respond to cache pressure; 1 MDSs report damaged metadata; too few PGs per OSD (15 < min 30)
MDS_CLIENT_RECALL 2 clients failing to respond to cache pressure
    mdscmp03(mds.0): Client cmp04. failing to respond to cache pressure client_id: 106550698
    mdscmp03(mds.0): Client cmp04.
failing to respond to cache pressure client_id: 106550713 MDS_DAMAGE 1 MDSs report damaged metadata mdscmp03(mds.0): Metadata damage detected TOO_FEW_PGS too few PGs per OSD (15 < min 30) [root@cmp04 ~]# ceph tell mds.cmp03 client ls 2020-04-07 08:04:02.930 7ff9457fa700 0 client.107373917 ms_handle_reset on v2:10.10.48.102:6800/3250397436 2020-04-07 08:04:02.963 7ff9467fc700 0 client.107312857 ms_handle_reset on v2:10.10.48.102:6800/3250397436 [ { "id": 106550713, "entity": { "name": { "type": "client", "num": 106550713 }, "addr": { "type": "any", "addr": "10.10.48.103:0", "nonce": 452355589 } }, "state": "open", "num_leases": 0, "num_caps": 95382, "request_load_avg": 0, "uptime": 59384.837150430998, "requests_in_flight": 0, "completed_requests": 5, "reconnecting": false, "recall_caps": { "value": 286887.99426187889, "halflife": 60 }, "release_caps": { "value": 0, "halflife": 60 }, "recall_caps_throttle": { "value": 10319.76544103905, "halflife": 2.5 }, "recall_caps_throttle2o": { "value": 815.64572559331623, "halflife": 0.5 }, "session_cache_liveness": { "value": 0, "halflife": 300 }, "inst": "client.106550713 10.10.48.103:0/452355589", "completed_requests": [ { "tid": 1367968, "created_ino": "0x0" }, { "tid": 1367969, "created_ino": "0x0" }, { "tid": 1367975, "created_ino": "0x0" }, { "tid": 1367977, "created_ino": "0x0" }, { "tid": 1367979, "created_ino": "0x0" } ], "prealloc_inos": [ { "start": 1099522467650, "length": 287 }, { "start": 1099522469581, "length": 505 } ], "used_inos": [], "client_metadata": { "features": "0x0000000000001fff", "ceph_sha1": "2d095e947a02261ce61424021bb43bd3022d35cb", "ceph_version": "ceph version 14.2.8 (2d095e947a02261ce61424021bb43bd3022d35cb) nautilus (stable)", "entity_id": "ganesha-9209657c-3804-4c42-99e3-d669f936a3f4", "hostname": "cmp04.", "pid": "1630708", "root": "/volumes/_nogroup/9209657c-3804-4c42-99e3-d669f936a3f4", "timeout": "300", "uuid": "ganesha-cmp04.-0411" } }, { "id": 107031118, "entity": { "name": { "type": 
"client", "num": 107031118 }, "addr": { "type": "v1", "addr": "10.10.48.103:0", "nonce": 3527069069 } }, "state": "open", "num_leases": 0, "num_caps": 1, "request_load_avg": 0, "uptime": 59384.837150430998, "requests_in_flight": 0, "completed_requests": 0, "reconnecting": false, "recall_caps": { "value": 0, "halflife": 60 }, "release_caps": { "value": 0, "halflife": 60 }, "recall_caps_throttle": { "value": 0, "halflife": 2.5 }, "recall_caps_throttle2o": { "value": 0, "halflife": 0.5 }, "session_cache_liveness": { "value": 0, "halflife": 300 }, "inst": "client.107031118 v1:10.10.48.103:0/3527069069", "completed_requests": [], "prealloc_inos": [], "used_inos": [], "client_metadata": { "features": "0x00000000000000ff", "entity_id": "admin", "hostname": "cmp04.", "kernel_version": "3.10.0-1062.12.1.el7.x86_64", "root": "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7" } }, { "id": 106550698, "entity": { "name": { "type": "client", "num": 106550698 }, "addr": { "type": "any", "addr": "10.10.48.103:0", "nonce": 4140370337 } }, "state": "open", "num_leases": 0, "num_caps": 4771, "request_load_avg": 0, "uptime": 59384.837150430998, "requests_in_flight": 0, "completed_requests": 0, "reconnecting": false, "recall_caps": { "value": 302690.92799215915, "halflife": 60 }, "release_caps": { "value": 0, "halflife": 60 }, "recall_caps_throttle": { "value": 12256.831612812737, "halflife": 2.5 }, "recall_caps_throttle2o": { "value": 2601.7867175086562, "halflife": 0.5 }, "session_cache_liveness": { "value": 0, "halflife": 300 }, "inst": "client.106550698 10.10.48.103:0/4140370337", "completed_requests": [], "prealloc_inos": [ { "start": 1099522471904, "length": 687 } ], "used_inos": [], "client_metadata": { "features": "0x0000000000001fff", "ceph_sha1": "2d095e947a02261ce61424021bb43bd3022d35cb", "ceph_version": "ceph version 14.2.8 (2d095e947a02261ce61424021bb43bd3022d35cb) nautilus (stable)", "entity_id": "ganesha-a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7", "hostname": "cmp04.", 
"pid": "1630708", "root": "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7", "timeout": "300", "uuid": "ganesha-cmp04.-03fe" } }, { "id": 106103709, "entity": { "name": { "type": "client", "num": 106103709 }, "addr": { "type": "v1", "addr": "10.10.48.103:0", "nonce": 1266568608 } }, "state": "open", "num_leases": 0, "num_caps": 3741, "request_load_avg": 3, "uptime": 59384.837150430998, "requests_in_flight": 0, "completed_requests": 1, "reconnecting": false, "recall_caps": { "value": 0, "halflife": 60 }, "release_caps": { "value": 0, "halflife": 60 }, "recall_caps_throttle": { "value": 0, "halflife": 2.5 }, "recall_caps_throttle2o": { "value": 0, "halflife": 0.5 }, "session_cache_liveness": { "value": 136.57798974559933, "halflife": 300 }, "inst": "client.106103709 v1:10.10.48.103:0/1266568608", "completed_requests": [ { "tid": 4537378, "created_ino": "0x0" } ], "prealloc_inos": [], "used_inos": [], "client_metadata": { "features": "0x00000000000000ff", "entity_id": "admin", "hostname": "cmp04.", "kernel_version": "3.10.0-1062.12.1.el7.x86_64", "root": "/volumes/_nogroup/9209657c-3804-4c42-99e3-d669f936a3f4" } } ] [root@cmp04 ~]# ceph tell mds.cmp03 damage ls .... 7000+ backtrace errors .... 
{ "damage_type": "backtrace", "id": 4289458104, "ino": 1099522468770, "path": "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag45_nearest10_hog/msm_6tic_N11_lag45.txt" }, { "damage_type": "backtrace", "id": 4291358526, "ino": 1099522468884, "path": "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3_nearest10_hog/state_apo_6tic_N11_lag3_9.pdb" }, { "damage_type": "backtrace", "id": 4291586401, "ino": 1099522467412, "path": "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag25_nearest10_hog/state_apo_6tic_N11_lag25_1.pdb" }, { "damage_type": "backtrace", "id": 4293141981, "ino": 1099522471402, "path": "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/corr_frames10/p450_apoN11_lag83_spearmancorr.npy" }, { "damage_type": "backtrace", "id": 4293413627, "ino": 1099522467311, "path": "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag29_nearest10_hog/state_apo_6tic_N11_lag29_0.pdb" }, { "damage_type": "backtrace", "id": 4294040932, "ino": 1099522468824, "path": "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag31_nearest10_hog/state_apo_6tic_N11_lag31_10.pdb" [root@cmp03 ~]# cat /etc/ceph/ceph.conf [global] fsid = FSID mon_initial_members = mon01, mon02, mon03 mon_host = 10.10.48.27,10.10.48.28,10.10.48.29 auth_cluster_required = cephx auth_service_required = cephx auth_client_required = cephx public_network = 10.10.48.0/24 cluster_network= 10.10.53.0/24 mon_allow_pool_delete = true log_to_syslog = true err_to_syslog = true mon_cluster_log_to_syslog = true clog_to_syslog = true [mds] mds_cache_memory_limit = 68719476736 mds_cache_reservation = 0.05 [osd] bluestore_cache_size_hdd = 8589934592 bluestore_cache_size_ssd = 8589934592 
[client]
rbd_cache = true
rbd_cache_size = 67108864
rbd_cache_max_dirty = 50331648
rbd_cache_target_dirty = 33554432
rbd_cache_max_dirty_age = 2
rbd_cache_writethrough_until_flush = true
rbd_concurrent_management_ops = 20

[client.manila]
client_mount_uid = 0
client_mount_gid = 0
keyring = /etc/ceph/manila.keyring

[root@cmp04 ~]# cat /etc/ganesha/ceph.conf
LOG {
    Default_Log_Level = DEBUG;
}

NFS_CORE_PARAM {
    Enable_NLM = false;
    Enable_RQUOTA = false;
    Protocols = 4;
}

NFSv4 {
    Delegations = true;
    RecoveryBackend = rados_ng;
    Minor_Versions = 1,2;
}

MDCACHE {
    Dir_Chunk = 0;
    NParts = 1;
    Cache_Size = 1;
}

RADOS_URLS {
    Ceph_Conf = /etc/ceph/ceph.conf;
    UserId = "admin";
    watch_url = "rados://nfs-ganesha/ganesha-export-counter";
}

%url rados://nfs-ganesha/ganesha-export-index

CEPH {
}

RADOS_KV {
    Ceph_Conf = /etc/ceph/ceph.conf;
    UserId = "admin";
    pool = "nfs-ganesha";
}

NFS_KRB5 {
    Active_krb5 = false;
}

RADOS EXPORT EXAMPLE:

EXPORT {
    CLIENT {
        Clients = {ips}
        Access_Type = "rw";
    }
    SecType = "sys";
    Pseudo = "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7";
    Squash = "None";
    FSAL {
        Secret_Access_Key = {secret}
        User_Id = {user}
        Name = "Ceph";
    }
    Tag = "share-a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7";
    Path = "/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7";
    Export_Id = 1022;
}
</pre>

Here are the logs from running the scrub. There are about 7485 lines of this; all look the same.
<pre> 2020-04-07 07:54:16.326 7f3308d21700 0 log_channel(cluster) log [WRN] : Scrub error on inode 0x10000a576dc (/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/state_apo_6tic_N11_lag3700_5.pdb) see mds.cmp03 log and `damage ls` output for details 2020-04-07 07:54:16.326 7f3308d21700 0 log_channel(cluster) do_log log to syslog 2020-04-07 07:54:16.326 7f3308d21700 -1 mds.0.scrubstack _validate_inode_done scrub error on inode [inode 0x10000a576dc [2,head] /volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/state_apo_6tic_N11_lag3700_5.pdb auth v62 dirtyparent s=6229785 n(v0 rc2020-04-06 18:20:15.392548 b6229785 1=1+0) (iversion lock) caps={106550698=pAsLsXsFsc/-@4},l=106550698(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 scrubqueue=0 dirty=1 authpin=0 0x555a9bd6ee00]: {"performed_validation":true,"passed_validation":false,"backtrace":{"checked":true,"passed":false,"read_ret_val":-2,"ondisk_value":"(-1)0x0:[]//","memoryvalue":"(39)0x10000a576dc:[<0x10000a576c9/state_apo_6tic_N11_lag3700_5.pdb v62>,<0x10000003c9f/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog v103691>,<0x40000000f4e/analysis_apo v189475>,<0x400000011a5/namdoutput v89091>,<0x10000000001/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7 v3379713>,<0x10000000000/_nogroup v3104037>,<0x1/volumes v1692870>]//","error_str":"failed to decode on-disk backtrace (0 bytes)!"},"raw_stats":{"checked":false,"passed":false,"read_ret_val":0,"ondisk_value.dirstat":"f()","ondisk_value.rstat":"n()","memory_value.dirrstat":"f()","memory_value.rstat":"n()","error_str":""},"return_code":-2} 2020-04-07 07:54:16.326 7f3308d21700 0 log_channel(cluster) log [WRN] : Scrub error on inode 0x10000a576e7 (/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/msm_6tic_N11_lag3700.txt) see 
mds.cmp03 log and `damage ls` output for details 2020-04-07 07:54:16.326 7f3308d21700 0 log_channel(cluster) do_log log to syslog 2020-04-07 07:54:16.326 7f3308d21700 -1 mds.0.scrubstack _validate_inode_done scrub error on inode [inode 0x10000a576e7 [2,head] /volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/msm_6tic_N11_lag3700.txt auth v116 dirtyparent s=2065 n(v0 rc2020-04-06 18:20:52.995274 b2065 1=1+0) (iversion lock) caps={106550698=pAsLsXsFsc/-@2},l=106550698(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 scrubqueue=0 dirty=1 authpin=0 0x555a4541dc00]: {"performed_validation":true,"passed_validation":false,"backtrace":{"checked":true,"passed":false,"read_ret_val":-2,"ondisk_value":"(-1)0x0:[]//","memoryvalue":"(39)0x10000a576e7:[<0x10000a576c9/msm_6tic_N11_lag3700.txt v116>,<0x10000003c9f/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog v103691>,<0x40000000f4e/analysis_apo v189475>,<0x400000011a5/namdoutput v89091>,<0x10000000001/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7 v3379713>,<0x10000000000/_nogroup v3104037>,<0x1/volumes v1692870>]//","error_str":"failed to decode on-disk backtrace (0 bytes)!"},"raw_stats":{"checked":false,"passed":false,"read_ret_val":0,"ondisk_value.dirstat":"f()","ondisk_value.rstat":"n()","memory_value.dirrstat":"f()","memory_value.rstat":"n()","error_str":""},"return_code":-2} 2020-04-07 07:54:16.326 7f3308d21700 0 log_channel(cluster) log [WRN] : Scrub error on inode 0x10000a576c9 (/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog) see mds.cmp03 log and `damage ls` output for details 2020-04-07 07:54:16.326 7f3308d21700 0 log_channel(cluster) do_log log to syslog 2020-04-07 07:54:16.326 7f3308d21700 -1 mds.0.scrubstack _validate_inode_done scrub error on inode [inode 0x10000a576c9 [...2,head] 
/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/ auth v103691 dirtyparent f(v0 m2020-04-06 18:20:52.988600 12=12+0) n(v0 rc2020-04-06 18:20:52.995274 b68529700 13=12+1) (inest lock) (ifile excl) (iversion lock) caps={106550698=pAsLsXsFsx/AsLsXsFsx@14},l=106550698 | request=0 lock=0 dirfrag=1 caps=1 dirtyparent=1 scrubqueue=0 dirty=1 waiter=0 authpin=0 0x555a53c58700]: {"performed_validation":true,"passed_validation":false,"backtrace":{"checked":true,"passed":false,"read_ret_val":-2,"ondisk_value":"(-1)0x0:[]//","memoryvalue":"(38)0x10000a576c9:[<0x10000003c9f/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog v103691>,<0x40000000f4e/analysis_apo v189475>,<0x400000011a5/namdoutput v89091>,<0x10000000001/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7 v3379713>,<0x10000000000/_nogroup v3104037>,<0x1/volumes v1692870>]//","error_str":"failed to decode on-disk backtrace (0 bytes)!"},"raw_stats":{"checked":true,"passed":true,"read_ret_val":0,"ondisk_value.dirstat":"f(v0 m2020-04-06 18:20:52.988600 12=12+0)","ondisk_value.rstat":"n(v0 rc2020-04-06 18:20:52.995274 b68529700 13=12+1)","memory_value.dirrstat":"f(v0 m2020-04-06 18:20:52.988600 12=12+0)","memory_value.rstat":"n(v0 rc2020-04-06 18:20:52.995274 b68529700 13=12+1)","error_str":""},"return_code":-2} 2020-04-07 07:54:16.356 7f3308d21700 0 log_channel(cluster) log [INF] : scrub complete with tag 'ce3c7e20-71eb-4bb6-bbc5-d2463ba8ff38' 2020-04-07 07:54:16.356 7f3308d21700 0 log_channel(cluster) do_log log to syslog 2020-04-07 07:54:18.698 7f330f52e700 1 mds.cmp03 Updating MDS map to version 26580 from mon.2 2020-04-07 09:17:58.270 7f330f52e700 1 mds.cmp03 Updating MDS map to version 26581 from mon.2 </pre> Scrub Repair logs <pre> 2020-04-07 09:23:44.230 7f3308d21700 0 log_channel(cluster) do_log log to syslog 2020-04-07 09:23:44.230 7f3308d21700 -1 mds.0.scrubstack _validate_inode_done scrub error on inode [inode 0x10000a576dc [2,head] 
/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/state_apo_6tic_N11_lag3700_5.pdb auth v62 dirtyparent s=6229785 n(v0 rc2020-04-06 18:20:15.392548 b6229785 1=1+0) (iversion lock) caps={106550698=pAsLsXsFsc/-@4},l=106550698(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 scrubqueue=0 dirty=1 authpin=0 0x555a9bd6ee00]: {"performed_validation":true,"passed_validation":false,"backtrace":{"checked":true,"passed":false,"read_ret_val":-2,"ondisk_value":"(-1)0x0:[]//","memoryvalue":"(39)0x10000a576dc:[<0x10000a576c9/state_apo_6tic_N11_lag3700_5.pdb v62>,<0x10000003c9f/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog v103691>,<0x40000000f4e/analysis_apo v189475>,<0x400000011a5/namdoutput v89091>,<0x10000000001/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7 v3379713>,<0x10000000000/_nogroup v3104143>,<0x1/volumes v1692976>]//","error_str":"failed to decode on-disk backtrace (0 bytes)!"},"raw_stats":{"checked":false,"passed":false,"read_ret_val":0,"ondisk_value.dirstat":"f()","ondisk_value.rstat":"n()","memory_value.dirrstat":"f()","memory_value.rstat":"n()","error_str":""},"return_code":-2} 2020-04-07 09:23:44.230 7f3308d21700 0 log_channel(cluster) log [WRN] : bad backtrace on inode 0x10000a576e7(/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/msm_6tic_N11_lag3700.txt), rewriting it 2020-04-07 09:23:44.230 7f3308d21700 0 log_channel(cluster) do_log log to syslog 2020-04-07 09:23:44.230 7f3308d21700 0 log_channel(cluster) log [INF] : Scrub repaired inode 0x10000a576e7 (/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/msm_6tic_N11_lag3700.txt) 2020-04-07 09:23:44.230 7f3308d21700 0 log_channel(cluster) do_log log to syslog 2020-04-07 09:23:44.230 7f3308d21700 -1 mds.0.scrubstack _validate_inode_done scrub error on inode [inode 
0x10000a576e7 [2,head] /volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/msm_6tic_N11_lag3700.txt auth v116 dirtyparent s=2065 n(v0 rc2020-04-06 18:20:52.995274 b2065 1=1+0) (iversion lock) caps={106550698=pAsLsXsFsc/-@2},l=106550698(-1) | ptrwaiter=0 request=0 lock=0 caps=1 dirtyparent=1 scrubqueue=0 dirty=1 authpin=0 0x555a4541dc00]: {"performed_validation":true,"passed_validation":false,"backtrace":{"checked":true,"passed":false,"read_ret_val":-2,"ondisk_value":"(-1)0x0:[]//","memoryvalue":"(39)0x10000a576e7:[<0x10000a576c9/msm_6tic_N11_lag3700.txt v116>,<0x10000003c9f/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog v103691>,<0x40000000f4e/analysis_apo v189475>,<0x400000011a5/namdoutput v89091>,<0x10000000001/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7 v3379713>,<0x10000000000/_nogroup v3104143>,<0x1/volumes v1692976>]//","error_str":"failed to decode on-disk backtrace (0 bytes)!"},"raw_stats":{"checked":false,"passed":false,"read_ret_val":0,"ondisk_value.dirstat":"f()","ondisk_value.rstat":"n()","memory_value.dirrstat":"f()","memory_value.rstat":"n()","error_str":""},"return_code":-2} 2020-04-07 09:23:44.231 7f3308d21700 0 log_channel(cluster) log [WRN] : bad backtrace on inode 0x10000a576c9(/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog), rewriting it 2020-04-07 09:23:44.231 7f3308d21700 0 log_channel(cluster) do_log log to syslog 2020-04-07 09:23:44.231 7f3308d21700 0 log_channel(cluster) log [INF] : Scrub repaired inode 0x10000a576c9 (/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog) 2020-04-07 09:23:44.231 7f3308d21700 0 log_channel(cluster) do_log log to syslog 2020-04-07 09:23:44.231 7f3308d21700 -1 mds.0.scrubstack _validate_inode_done scrub error on inode [inode 0x10000a576c9 [...2,head] 
/volumes/_nogroup/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7/namdoutput/analysis_apo/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog/ auth v103691 dirtyparent f(v0 m2020-04-06 18:20:52.988600 12=12+0) n(v0 rc2020-04-06 18:20:52.995274 b68529700 13=12+1) (inest lock) (ifile excl) (iversion lock) caps={106550698=pAsLsXsFsx/AsLsXsFsx@16},l=106550698 | request=0 lock=0 dirfrag=1 caps=1 dirtyparent=1 scrubqueue=0 dirty=1 waiter=0 authpin=0 0x555a53c58700]: {"performed_validation":true,"passed_validation":false,"backtrace":{"checked":true,"passed":false,"read_ret_val":-2,"ondisk_value":"(-1)0x0:[]//","memoryvalue":"(38)0x10000a576c9:[<0x10000003c9f/tunneltriangle_20dcds_6ticN11_lag3700_nearest10_hog v103691>,<0x40000000f4e/analysis_apo v189475>,<0x400000011a5/namdoutput v89091>,<0x10000000001/a6c48cd4-17e8-49df-9a4a-0f8f8b4364c7 v3379713>,<0x10000000000/_nogroup v3104143>,<0x1/volumes v1692976>]//","error_str":"failed to decode on-disk backtrace (0 bytes)!"},"raw_stats":{"checked":true,"passed":true,"read_ret_val":0,"ondisk_value.dirstat":"f(v0 m2020-04-06 18:20:52.988600 12=12+0)","ondisk_value.rstat":"n(v0 rc2020-04-06 18:20:52.995274 b68529700 13=12+1)","memory_value.dirrstat":"f(v0 m2020-04-06 18:20:52.988600 12=12+0)","memory_value.rstat":"n(v0 rc2020-04-06 18:20:52.995274 b68529700 13=12+1)","error_str":""},"return_code":-2}
2020-04-07 09:27:33.890 7f3308d21700 0 log_channel(cluster) log [INF] : scrub complete with tag '47c8a435-7fdd-40f1-879f-3cb435ce530f'
2020-04-07 09:27:33.890 7f3308d21700 0 log_channel(cluster) do_log log to syslog
2020-04-07 09:31:42.355 7f330f52e700 1 mds.cmp03 Updating MDS map to version 26582 from mon.2 </pre>
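For reference, the scrub and repair runs above were kicked off with the usual MDS scrub commands. This is roughly what I ran (from memory, so treat the exact invocations as approximate):

```shell
# List the damage the MDS has recorded (this is where the 7000+
# backtrace entries above came from).
ceph tell mds.cmp03 damage ls

# Recursive scrub of the whole tree (produces the first log excerpt above).
ceph tell mds.cmp03 scrub start / recursive

# Recursive scrub with repair (produces the "Scrub repaired inode" lines).
ceph tell mds.cmp03 scrub start / recursive,repair

# Clear an entry from the damage table once it has been repaired;
# <id> is the "id" field from the damage ls output.
ceph tell mds.cmp03 damage rm <id>
```

As noted above, the repair clears the damage temporarily, but the same backtrace damage eventually shows right back up.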