Bug #3267
Closed
Multiple active MDSes stall when listing freshly created files
Description
The output from ceph-debugpack is available at http://cumulonim.biz/mds.tar.gz. We were running the following version (and also tested with HEAD at ad97bbb0a1e985b91ab0ffe9ae5b15cfce465211):
ceph version 0.52 (commit:e48859474c4944d4ff201ddc9f5fd400e8898173)
We created a Ceph cluster on a single box with 3 monitors, 9 MDSes (6 active, 3 standby), and 5 OSDs. Note: we have seen the same issue with the daemons spread across several machines, but kept everything on one box to keep this report simple. After creating the cluster, we mounted CephFS from another server and ran the following command:
for D in {0..99}; do mkdir -p /mnt/ceph/$D; for F in {0..99}; do echo "Hello $F in $D" > /mnt/ceph/$D/$F; done; done
In the run captured in the provided logs, this command completed, but the MDSes stalled as soon as we ran `ls /mnt/ceph/*` (two directories were returned before the command hung). Note: in other runs of this test, the command that creates the files and directories also stalled. We ran these tests repeatedly with similar outcomes.
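For anyone who wants to repeat the test, the reproduction can be wrapped in a small script. This is only a sketch and not part of the original report: the function names `populate` and `list_with_timeout` and their parameters are made up for illustration, and it assumes bash plus the `timeout` utility from GNU coreutils. It writes the same tree as the one-liner and then bounds the `ls` so a stall shows up as an exit status instead of a hung shell.

```shell
#!/usr/bin/env bash
# Sketch of the reproduction steps. Function and parameter names are
# hypothetical and were not part of the original report.

# Create <dirs> directories under <root>, each holding <files> small files,
# using the same file contents as the one-liner in the description.
populate() {
    local root=$1 dirs=$2 files=$3
    local d f
    for ((d = 0; d < dirs; d++)); do
        mkdir -p "$root/$d" || return 1
        for ((f = 0; f < files; f++)); do
            echo "Hello $f in $d" > "$root/$d/$f" || return 1
        done
    done
}

# List everything under <root>, giving up after <secs> seconds.
# timeout(1) exits with status 124 if ls is still running when the limit hits.
list_with_timeout() {
    local root=$1 secs=$2
    timeout "$secs" ls "$root"/* > /dev/null
}

# Example, assuming CephFS is mounted at /mnt/ceph as in the report:
#   populate /mnt/ceph 100 100
#   list_with_timeout /mnt/ceph 60 || echo "ls failed or timed out (status $?)"
```

One caveat: if the stalled `ls` is blocked inside the kernel client, the signal sent by `timeout` may not be enough to terminate it, so a status of 124 should be read as a hint that the listing hung and not as proof that the process went away.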
Updated by Greg Farnum over 11 years ago
I'll try and take a look at this, but multi-MDS setups are known to be pretty unstable at this point. Have you tried just using one active with some standbys?
Updated by Stan Schwertly over 11 years ago
Greg Farnum wrote:
I'll try and take a look at this, but multi-MDS setups are known to be pretty unstable at this point. Have you tried just using one active with some standbys?
We are aware of the issue with multi-MDS setups, hence the report :)! For good measure, we reissued the command against a cluster with a single MDS and two standbys and it worked like a charm.
Let us know if there's any more detail we can provide.
Updated by Greg Farnum over 11 years ago
- Priority changed from Normal to Low
Currently de-prioritizing multi-MDS bugs.
Updated by Greg Farnum almost 8 years ago
- Category changed from 47 to 90
- Component(FS) Common/Protocol, MDS added
Updated by John Spray over 7 years ago
- Status changed from New to Closed
This ticket is old and the use case seems like something we will pick up on from the multimds suite if it's still broken.
Updated by Patrick Donnelly about 5 years ago
- Category deleted (90)
- Labels (FS) multimds added