Bug #10208
libceph: intermittent hangs under memory pressure
Status: Closed
Updated by Andrei Mikhailovsky over 9 years ago
The kern.log is attached, with data captured shortly after running the following command:
time dd if=/dev/zero of=4G00 bs=4M count=5K oflag=direct &
time dd if=/dev/zero of=4G11 bs=4M count=5K oflag=direct &
time dd if=/dev/zero of=4G22 bs=4M count=5K oflag=direct &
time dd if=/dev/zero of=4G33 bs=4M count=5K oflag=direct &
time dd if=/dev/zero of=4G44 bs=4M count=5K oflag=direct &
time dd if=/dev/zero of=4G55 bs=4M count=5K oflag=direct &
time dd if=/dev/zero of=4G66 bs=4M count=5K oflag=direct &
time dd if=/dev/zero of=4G77 bs=4M count=5K oflag=direct &
The output files are written to an NFS mount point backed by cephfs, which is mounted with mount -t ceph ... ...
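For readability, the eight backgrounded writers above can be sketched as a loop. This is a sketch, not the reporter's exact command: DIR, BS, COUNT and FLAGS are parameters I introduced, and the defaults here are deliberately tiny so the sketch is safe to run anywhere; the original run used the NFS mount point as DIR with bs=4M, count=5K and oflag=direct.

```shell
#!/bin/sh
# Loop form of the eight parallel dd writers from the report.
# The original run used DIR=<nfs mount point>, BS=4M, COUNT=5K and
# FLAGS=oflag=direct; the small defaults below are only to keep this
# sketch harmless to execute.
DIR=${DIR:-$(mktemp -d)}
BS=${BS:-4k}
COUNT=${COUNT:-2}
FLAGS=${FLAGS:-}
for i in 0 1 2 3 4 5 6 7; do
  # Same file-naming scheme as the report: 4G00, 4G11, ... 4G77.
  dd if=/dev/zero of="$DIR/4G$i$i" bs="$BS" count="$COUNT" $FLAGS 2>/dev/null &
done
wait   # block until all eight writers have finished
ls "$DIR" | wc -l
```

With the original parameters this produces eight concurrent 20 GiB direct-I/O streams onto the NFS mount, which is what drove the machine into memory pressure.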
Andrei
Updated by Zheng Yan over 9 years ago
Are the nfs mount and the cephfs mount on the same machine?
Updated by Ilya Dryomov over 9 years ago
- Status changed from 12 to Need More Info
Andrei, does the below mean you had OSDs and cephfs mounted on the same box? I missed this completely because the problem looked very similar to an rbd problem I was debugging at the time and I just assumed it was a libceph problem.
I had hang tasks from the nfsd process on the server side, not on the client side. Here is my setup:

(osd server + cephfs kernel mountpoint + nfs server) ---- IPoIB link ---- (hypervisor host + nfs client)

So when I ran the dd tests on the mountpoint from the nfs client, it produced hang tasks for the nfsd process on the nfs server side. I have not seen any hang tasks on the client itself.
Updated by Ilya Dryomov over 9 years ago
- Priority changed from Urgent to High
A similar problem with krbd that I was debugging with a user offline went away with 3.18, as far as they can tell.
I assume it was fixed by the memory-reclaim-flags patches for waitqueues and sockets that went into 3.18.
Updated by Ilya Dryomov about 5 years ago
- Status changed from Need More Info to Resolved