Support #16528
Stuck with CephFS with 1M files in one dir
Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Description
I'm pretty much stuck with CephFS (Jewel 10.2.2): an unsuccessful bonnie++ run left 1 million 0-byte files behind in one directory.
I can't list or delete the files or the directory, at least not in a reasonable time (I waited 4 hours).
Ceph kernel client with 4.4.14 kernel on Ubuntu 14.04.
Two metadata servers: one active, the other in standby-replay mode.
From the active MDS log when I try to access the directory:
2016-06-29 19:05:39.053345 7fcc198da700 1 heartbeat_map reset_timeout 'MDSRank' had timed out after 10
2016-06-29 19:05:57.472549 7fcc198da700 0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 33.047606 secs
2016-06-29 19:05:57.472563 7fcc198da700 0 log_channel(cluster) log [WRN] : slow request 33.047606 seconds old, received at 2016-06-29 19:05:24.424893: client_request(client.28058182:3 readdir #100000021c2 2016-06-29 19:05:24.420352) currently acquired locks
2016-06-29 19:06:25.457003 7fcc198da700 0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 61.032040 secs
2016-06-29 19:06:25.457022 7fcc198da700 0 log_channel(cluster) log [WRN] : slow request 61.032040 seconds old, received at 2016-06-29 19:05:24.424893: client_request(client.28058182:3 readdir #100000021c2 2016-06-29 19:05:24.420352) currently acquired locks
2016-06-29 19:08:02.719494 7fcc198da700 0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 35.652664 secs
2016-06-29 19:08:02.719520 7fcc198da700 0 log_channel(cluster) log [WRN] : slow request 35.652664 seconds old, received at 2016-06-29 19:07:27.066769: client_request(client.28058182:6 readdir #100000021c2 00000ef1d6BKMay 2016-06-29 19:07:27.054945) currently acquired locks
2016-06-29 19:08:30.702883 7fcc198da700 0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 63.636058 secs
2016-06-29 19:08:30.702899 7fcc198da700 0 log_channel(cluster) log [WRN] : slow request 63.636058 seconds old, received at 2016-06-29 19:07:27.066769: client_request(client.28058182:6 readdir #100000021c2 00000ef1d6BKMay 2016-06-29 19:07:27.054945) currently acquired locks
2016-06-29 19:10:43.184162 7fcc198da700 0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 35.598239 secs
2016-06-29 19:10:43.184177 7fcc198da700 0 log_channel(cluster) log [WRN] : slow request 35.598239 seconds old, received at 2016-06-29 19:10:07.585870: client_request(client.28058182:9 readdir #100000021c2 0000022f71S 2016-06-29 19:10:07.574338) currently acquired locks
2016-06-29 19:11:10.606474 7fcc198da700 0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 63.020538 secs
2016-06-29 19:11:10.606503 7fcc198da700 0 log_channel(cluster) log [WRN] : slow request 63.020538 seconds old, received at 2016-06-29 19:10:07.585870: client_request(client.28058182:9 readdir #100000021c2 0000022f71S 2016-06-29 19:10:07.574338) currently acquired locks
2016-06-29 19:13:43.266474 7fcc198da700 0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 35.753201 secs
2016-06-29 19:13:43.266489 7fcc198da700 0 log_channel(cluster) log [WRN] : slow request 35.753201 seconds old, received at 2016-06-29 19:13:07.513218: client_request(client.28058182:13 readdir #100000021c2 0000012468KDbTAmV 2016-06-29 19:13:07.502135) currently acquired locks
2016-06-29 19:14:11.108257 7fcc198da700 0 log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 63.594955 secs
2016-06-29 19:14:11.108278 7fcc198da700 0 log_channel(cluster) log [WRN] : slow request 63.594955 seconds old, received at 2016-06-29 19:13:07.513218: client_request(client.28058182:13 readdir #100000021c2 0000012468KDbTAmV 2016-06-29 19:13:07.502135) currently acquired locks
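The slow requests in this excerpt are easier to read when grouped per request. A minimal Python sketch follows, assuming the log lines keep exactly the "slow request ... client_request(...)" layout seen here. The function name summarize_slow_requests and the regular expression are illustrative and not part of any Ceph tooling.

```python
import re
import sys
from collections import defaultdict

# Pattern written against the "slow request" lines in the excerpt above;
# other Ceph releases may format these lines differently (assumption).
SLOW_RE = re.compile(
    r"slow request (?P<age>\d+\.\d+) seconds old, "
    r"received at (?P<received>\S+ \S+): "
    r"client_request\((?P<client>client\.\d+):(?P<tid>\d+) "
    r"(?P<op>\w+) (?P<ino>#[0-9a-f]+)"
)


def summarize_slow_requests(lines):
    """Map (client, tid, op, inode) to the largest reported age in seconds."""
    worst = defaultdict(float)
    for line in lines:
        m = SLOW_RE.search(line)
        if m:
            key = (m["client"], int(m["tid"]), m["op"], m["ino"])
            worst[key] = max(worst[key], float(m["age"]))
    return dict(worst)


if __name__ == "__main__":
    # Reads log text from stdin and prints one line per distinct request.
    for key, age in sorted(summarize_slow_requests(sys.stdin).items()):
        print(key, age)
```

Fed the excerpt above, this would report four distinct readdir requests (tids 3, 6, 9 and 13) from client.28058182, all against inode #100000021c2.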
Only one client has the filesystem mounted, as follows:
192.168.30.71,192.168.30.72,192.168.30.73:/ on /mnt/cephfs type ceph (name=admin,rsize=2097152,wsize=2097152,readdir_max_entries=10240,readdir_max_bytes=2097152,key=client.admin)
rados df
pool name        KB     objects  clones  degraded  unfound  rd     rd KB     wr       wr KB
cephfs_data      0      1022482  0       0         0        9938   14623544  3655424  25191882
cephfs_metadata  35953  31       0       0         0        28177  67305532  147345   6187783
Ceph cluster status is OK