Bug #13546
mv of directories hung Ceph filesystem
Status: Closed
Description
On a test Ceph cluster running Ceph v9.1.0 with the 4.3-rc6 kernel on Trusty, I tried to move two directories in a Ceph filesystem into a sub-directory with the commands:
root@ede-c1-adm01:/cephfs/top# ls
ede-c1-gw01 ede-c1-gw02
root@ede-c1-adm01:/cephfs/top# mkdir z
root@ede-c1-adm01:/cephfs/top# mv e* z/
and the Ceph filesystem hung on all mounted clients. On every client, running ls /cephfs/ hangs, and neither ^C nor ^Z will break the hang.
The system I ran the mv command on had the filesystem FUSE mounted with the mount options:
cat /proc/mounts | grep ceph
ceph-fuse /cephfs fuse.ceph-fuse rw,noatime,user_id=0,group_id=0,default_permissions,allow_other 0 0
There are 2 client systems that had the Ceph filesystem kernel mounted with the mount options:
cat /proc/mounts | grep ceph
10.14.2.11,10.14.2.12,10.14.2.13:/ /cephfs ceph rw,noatime,name=cephfs,secret=<hidden>,acl 0 0
Before I ran the mv command, I had stopped the 2 write processes on each kernel-mounted system, so there should have been no I/O going to the Ceph filesystem at the time of the mv operation.
In another window, I had a ceph -w running, and after the mv command it started showing:
2015-10-20 14:55:26.511106 mon.0 [INF] pgmap v5643: 832 pgs: 832 active+clean; 55106 MB data, 162 GB used, 863 GB / 1061 GB avail
2015-10-20 14:55:36.723555 mon.0 [INF] pgmap v5644: 832 pgs: 832 active+clean; 55106 MB data, 162 GB used, 863 GB / 1061 GB avail; 204 B/s wr, 0 op/s
2015-10-20 14:55:37.735373 mon.0 [INF] pgmap v5645: 832 pgs: 832 active+clean; 55106 MB data, 162 GB used, 863 GB / 1061 GB avail; 456 B/s wr, 0 op/s
2015-10-20 14:56:06.526391 mds.0 [WRN] 4 slow requests, 1 included below; oldest blocked for > 134822.090727 secs
2015-10-20 14:56:06.526414 mds.0 [WRN] slow request 31.369423 seconds old, received at 2015-10-20 14:55:35.156855: client_request(client.4240:131 rename #10000000001/ede-c1-gw01 #10000000000/ede-c1-gw01 2015-10-20 14:55:35.142806) currently failed to xlock, waiting
2015-10-20 14:56:31.019164 mon.0 [INF] pgmap v5646: 832 pgs: 832 active+clean; 55106 MB data, 162 GB used, 863 GB / 1061 GB avail
2015-10-20 14:56:36.527636 mds.0 [WRN] 4 slow requests, 1 included below; oldest blocked for > 134852.091916 secs
2015-10-20 14:56:36.527681 mds.0 [WRN] slow request 61.370613 seconds old, received at 2015-10-20 14:55:35.156855: client_request(client.4240:131 rename #10000000001/ede-c1-gw01 #10000000000/ede-c1-gw01 2015-10-20 14:55:35.142806) currently failed to xlock, waiting
2015-10-20 14:56:41.527858 mds.0 [WRN] 5 slow requests, 1 included below; oldest blocked for > 134857.092109 secs
2015-10-20 14:56:41.527868 mds.0 [WRN] slow request 31.129815 seconds old, received at 2015-10-20 14:56:10.397846: client_request(client.14128:14187 getattr pAsLsXsFs #10000000000 2015-10-20 14:56:10.395375) currently failed to rdlock, waiting
2015-10-20 14:56:57.166362 mon.0 [INF] pgmap v5647: 832 pgs: 832 active+clean; 55106 MB data, 162 GB used, 863 GB / 1061 GB avail
2015-10-20 14:57:11.528982 mds.0 [WRN] 5 slow requests, 1 included below; oldest blocked for > 134887.093320 secs
2015-10-20 14:57:11.529010 mds.0 [WRN] slow request 61.131026 seconds old, received at 2015-10-20 14:56:10.397846: client_request(client.14128:14187 getattr pAsLsXsFs #10000000000 2015-10-20 14:56:10.395375) currently failed to rdlock, waiting
2015-10-20 14:57:26.527825 mon.0 [INF] pgmap v5648: 832 pgs: 832 active+clean; 55106 MB data, 162 GB used, 863 GB / 1061 GB avail
2015-10-20 14:57:36.742034 mon.0 [INF] pgmap v5649: 832 pgs: 832 active+clean; 55106 MB data, 162 GB used, 863 GB / 1061 GB avail
2015-10-20 14:57:37.754074 mon.0 [INF] pgmap v5650: 832 pgs: 832 active+clean; 55106 MB data, 162 GB used, 863 GB / 1061 GB avail
2015-10-20 14:57:36.530023 mds.0 [WRN] 5 slow requests, 1 included below; oldest blocked for > 134912.094357 secs
2015-10-20 14:57:36.530046 mds.0 [WRN] slow request 121.373053 seconds old, received at 2015-10-20 14:55:35.156855: client_request(client.4240:131 rename #10000000001/ede-c1-gw01 #10000000000/ede-c1-gw01 2015-10-20 14:55:35.142806) currently failed to xlock, waiting
2015-10-20 14:58:11.531409 mds.0 [WRN] 5 slow requests, 1 included below; oldest blocked for > 134947.095728 secs
2015-10-20 14:58:11.531435 mds.0 [WRN] slow request 121.133434 seconds old, received at 2015-10-20 14:56:10.397846: client_request(client.14128:14187 getattr pAsLsXsFs #10000000000 2015-10-20 14:56:10.395375) currently failed to rdlock, waiting
2015-10-20 14:59:36.534872 mds.0 [WRN] 5 slow requests, 1 included below; oldest blocked for > 135032.099205 secs
2015-10-20 14:59:36.534895 mds.0 [WRN] slow request 241.377901 seconds old, received at 2015-10-20 14:55:35.156855: client_request(client.4240:131 rename #10000000001/ede-c1-gw01 #10000000000/ede-c1-gw01 2015-10-20 14:55:35.142806) currently failed to xlock, waiting
2015-10-20 15:00:00.000975 mon.0 [INF] HEALTH_WARN; mds0: Client ede-c1-gw01:cephfs failing to respond to capability release; mds0: Client ede-c1-gw02:cephfs failing to respond to capability release
2015-10-20 15:00:11.536200 mds.0 [WRN] 5 slow requests, 1 included below; oldest blocked for > 135067.100527 secs
2015-10-20 15:00:11.536245 mds.0 [WRN] slow request 241.138233 seconds old, received at 2015-10-20 14:56:10.397846: client_request(client.14128:14187 getattr pAsLsXsFs #10000000000 2015-10-20 14:56:10.395375) currently failed to rdlock, waiting
2015-10-20 15:03:36.543495 mds.0 [WRN] 5 slow requests, 1 included below; oldest blocked for > 135272.107818 secs
2015-10-20 15:03:36.543520 mds.0 [WRN] slow request 481.386514 seconds old, received at 2015-10-20 14:55:35.156855: client_request(client.4240:131 rename #10000000001/ede-c1-gw01 #10000000000/ede-c1-gw01 2015-10-20 14:55:35.142806) currently failed to xlock, waiting
2015-10-20 15:04:11.544806 mds.0 [WRN] 5 slow requests, 1 included below; oldest blocked for > 135307.109083 secs
2015-10-20 15:04:11.544827 mds.0 [WRN] slow request 481.146789 seconds old, received at 2015-10-20 14:56:10.397846: client_request(client.14128:14187 getattr pAsLsXsFs #10000000000 2015-10-20 14:56:10.395375) currently failed to rdlock, waiting
...
2015-10-20 15:33:46.625316 mds.0 [WRN] slow request 482.361401 seconds old, received at 2015-10-20 15:25:44.263752: client_request(client.4240:139 getattr pAsLsXsFs #10000000000 2015-10-20 15:25:44.249554) currently failed to rdlock, waiting
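The important entries in this stream are the MDS slow-request warnings: the rename from client.4240 is stuck "currently failed to xlock, waiting", and every later getattr from the kernel clients queues up behind it "failed to rdlock, waiting". As an illustration only (this helper is not part of Ceph; the regex is simply fitted to the log format shown above), a captured ceph -w log can be reduced to the stuck requests like this:

```python
import re

# Matches MDS slow-request lines of the shape seen above, e.g.
# "... mds.0 [WRN] slow request 31.369423 seconds old, received at
#  2015-10-20 14:55:35.156855: client_request(client.4240:131 rename ...)
#  currently failed to xlock, waiting"
SLOW_REQ = re.compile(
    r"slow request (?P<age>[\d.]+) seconds old, "
    r"received at (?P<received>[\d-]+ [\d:.]+): "
    r"client_request\((?P<client>client\.\d+:\d+) (?P<op>\w+).*?\) "
    r"currently failed to (?P<lock>\w+lock), waiting"
)

def parse_slow_requests(log_text):
    """Return one dict per slow-request warning: age, receive time,
    client request id, operation, and the lock it is blocked on."""
    return [m.groupdict() for m in SLOW_REQ.finditer(log_text)]
```

Run over the log above, this would show a single rename blocked on an xlock and a growing set of getattr requests blocked on rdlocks, which is the pattern of one stuck operation serializing everything behind it.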
System info:
ceph -s
    cluster 75a91bbe-b287-11e4-889f-001517987704
     health HEALTH_WARN
            mds0: Client ede-c1-gw01:cephfs failing to respond to capability release
            mds0: Client ede-c1-gw02:cephfs failing to respond to capability release
     monmap e1: 3 mons at {ede-c1-mon01=10.14.2.11:6789/0,ede-c1-mon02=10.14.2.12:6789/0,ede-c1-mon03=10.14.2.13:6789/0}
            election epoch 12, quorum 0,1,2 ede-c1-mon01,ede-c1-mon02,ede-c1-mon03
     mdsmap e22: 1/1/1 up {0=ede-c1-mds02=up:active}, 2 up:standby
     osdmap e49: 18 osds: 18 up, 18 in
            flags sortbitwise
      pgmap v5650: 832 pgs, 4 pools, 55106 MB data, 25170 objects
            162 GB used, 863 GB / 1061 GB avail
                 832 active+clean

ceph health detail
HEALTH_WARN mds0: Client ede-c1-gw01:cephfs failing to respond to capability release; mds0: Client ede-c1-gw02:cephfs failing to respond to capability release
mds0: Client ede-c1-gw01:cephfs failing to respond to capability release(client_id: 14128)
mds0: Client ede-c1-gw02:cephfs failing to respond to capability release(client_id: 14138)

ceph -v
ceph version 9.1.0 (3be81ae6cf17fcf689cd6f187c4615249fea4f61)

Kernel mounted system:
uname -a
Linux ede-c1-gw01 4.3.0-040300rc6-generic #201510182030 SMP Mon Oct 19 00:31:41 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Fuse mounted system:
uname -a
Linux ede-c1-adm01 4.3.0-040300rc6-generic #201510182030 SMP Mon Oct 19 00:31:41 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

dpkg -l | grep fuse
ii ceph-fuse 9.1.0-1trusty amd64 FUSE-based client for the Ceph distributed file system
ii fuse 2.9.2-4ubuntu4.14.04.1 amd64 Filesystem in Userspace
ii libfuse2:amd64 2.9.2-4ubuntu4.14.04.1 amd64 Filesystem in Userspace (library)

From the fuse mount:
df
Filesystem                          1K-blocks      Used  Available Use% Mounted on
/dev/sda5                            35723920   7704560   26181640  23% /
none                                        4         0          4   0% /sys/fs/cgroup
udev                                  1014384        12    1014372   1% /dev
tmpfs                                  204892       740     204152   1% /run
none                                     5120         0       5120   0% /run/lock
none                                  1024444        12    1024432   1% /run/shm
none                                   102400         0     102400   0% /run/user
ceph-fuse                          1113567232 208039936  905527296  19% /cephfs

From one of the systems using the kernel mount:
df
Filesystem                          1K-blocks      Used  Available Use% Mounted on
/dev/sda5                            35723920   3158012   30728188  10% /
none                                        4         0          4   0% /sys/fs/cgroup
udev                                  1014012         4    1014008   1% /dev
tmpfs                                  204892       664     204228   1% /run
none                                     5120         0       5120   0% /run/lock
none                                  1024444         8    1024436   1% /run/shm
none                                   102400         0     102400   0% /run/user
10.14.2.11,10.14.2.12,10.14.2.13:/ 1113567232 208039936  905527296  19% /cephfs
I am not seeing any errors in /var/log/kern.log or /var/log/syslog.
Updated by Eric Eastman over 8 years ago
Updated by Greg Farnum over 8 years ago
- Priority changed from Normal to High
Bumping this up as we ought to examine it and I think it's been lost in the shuffle.
Updated by Zheng Yan over 8 years ago
- Status changed from New to 12
I have an explanation for this.
See https://github.com/ukernel/ceph/commit/a750c361cd631a1f87ee152083d2a42c49fd02b6
Updated by Greg Farnum over 8 years ago
- Status changed from 12 to In Progress
Do you have a PR or any other commits to go with that? Is it safe to unilaterally not drop caps in a case like this?
Updated by Zheng Yan over 8 years ago
- Status changed from In Progress to Fix Under Review
I added it to https://github.com/ceph/ceph/pull/7199
Updated by Greg Farnum about 8 years ago
- Status changed from Fix Under Review to Resolved