Bug #36593

qa: quota failure caused by clients stepping on each other

Added by Patrick Donnelly about 2 years ago. Updated 3 days ago.

Status:
New
Priority:
High
Assignee:
-
Category:
-
Target version:
% Done:

0%

Source:
Q/A
Tags:
Backport:
pacific,octopus,nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature:

Description

2018-10-24T03:08:42.204 INFO:tasks.workunit.client.0.smithi071.stderr:100+0 records in
2018-10-24T03:08:42.204 INFO:tasks.workunit.client.0.smithi071.stderr:100+0 records out
2018-10-24T03:08:42.204 INFO:tasks.workunit.client.0.smithi071.stderr:104857600 bytes (105 MB) copied, 1.68958 s, 62.1 MB/s
2018-10-24T03:08:42.211 INFO:tasks.workunit.client.0.smithi071.stderr:+ rm -rf big big2 second third
2018-10-24T03:08:42.244 INFO:tasks.workunit.client.0.smithi071.stderr:+ setfattr . -n ceph.quota.max_files -v 5
2018-10-24T03:08:42.252 INFO:tasks.workunit.client.0.smithi071.stderr:+ mkdir ok
2018-10-24T03:08:42.254 INFO:tasks.workunit.client.0.smithi071.stderr:+ touch ok/1
2018-10-24T03:08:42.262 INFO:tasks.workunit.client.0.smithi071.stderr:+ touch ok/2
2018-10-24T03:08:42.266 INFO:tasks.workunit.client.0.smithi071.stderr:+ touch 3
2018-10-24T03:08:42.271 INFO:tasks.workunit.client.0.smithi071.stderr:+ expect_false touch shouldbefail
2018-10-24T03:08:42.271 INFO:tasks.workunit.client.0.smithi071.stderr:+ set -x
2018-10-24T03:08:42.271 INFO:tasks.workunit.client.0.smithi071.stderr:+ touch shouldbefail
2018-10-24T03:08:42.276 INFO:tasks.workunit.client.0.smithi071.stderr:+ return 1
2018-10-24T03:08:42.279 DEBUG:teuthology.orchestra.run:got remote process result: 1
2018-10-24T03:08:42.279 INFO:tasks.workunit:Stopping ['fs/quota'] on client.0...
2018-10-24T03:08:42.279 INFO:teuthology.orchestra.run.smithi071:Running: 'sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0'
2018-10-24T03:08:42.340 INFO:tasks.workunit.client.3.smithi086.stderr:100+0 records in
2018-10-24T03:08:42.340 INFO:tasks.workunit.client.3.smithi086.stderr:100+0 records out
2018-10-24T03:08:42.340 INFO:tasks.workunit.client.3.smithi086.stderr:104857600 bytes (105 MB) copied, 1.69412 s, 61.9 MB/s
2018-10-24T03:08:42.342 INFO:tasks.workunit.client.3.smithi086.stderr:+ rm -rf big big2 second third
2018-10-24T03:08:42.373 INFO:tasks.workunit.client.3.smithi086.stderr:+ setfattr . -n ceph.quota.max_files -v 5
2018-10-24T03:08:42.397 INFO:tasks.workunit.client.3.smithi086.stderr:+ mkdir ok
2018-10-24T03:08:42.400 INFO:tasks.workunit.client.3.smithi086.stderr:+ touch ok/1
2018-10-24T03:08:42.408 INFO:tasks.workunit.client.3.smithi086.stderr:+ touch ok/2
2018-10-24T03:08:42.412 INFO:tasks.workunit.client.3.smithi086.stderr:+ touch 3
2018-10-24T03:08:42.418 INFO:tasks.workunit.client.3.smithi086.stderr:+ expect_false touch shouldbefail
2018-10-24T03:08:42.418 INFO:tasks.workunit.client.3.smithi086.stderr:+ set -x
2018-10-24T03:08:42.418 INFO:tasks.workunit.client.3.smithi086.stderr:+ touch shouldbefail
2018-10-24T03:08:42.422 INFO:tasks.workunit.client.3.smithi086.stderr:+ return 1
2018-10-24T03:08:42.423 DEBUG:teuthology.orchestra.run:got remote process result: 1
2018-10-24T03:08:42.423 INFO:tasks.workunit:Stopping ['fs/quota'] on client.3...
2018-10-24T03:08:42.424 INFO:teuthology.orchestra.run.smithi086:Running: 'sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.3 /home/ubuntu/cephtest/clone.client.3'
2018-10-24T03:08:42.484 INFO:tasks.workunit.client.1.smithi071.stderr:100+0 records in
2018-10-24T03:08:42.484 INFO:tasks.workunit.client.1.smithi071.stderr:100+0 records out
2018-10-24T03:08:42.485 INFO:tasks.workunit.client.1.smithi071.stderr:104857600 bytes (105 MB) copied, 1.61699 s, 64.8 MB/s
2018-10-24T03:08:42.486 INFO:tasks.workunit.client.1.smithi071.stderr:+ rm -rf big big2 second third
2018-10-24T03:08:42.517 INFO:tasks.workunit.client.1.smithi071.stderr:+ setfattr . -n ceph.quota.max_files -v 5
2018-10-24T03:08:42.535 INFO:tasks.workunit.client.1.smithi071.stderr:+ mkdir ok
2018-10-24T03:08:42.536 INFO:tasks.workunit.client.1.smithi071.stderr:+ touch ok/1
...
CommandFailedError: Command failed (workunit test fs/quota/quota.sh) on smithi071 with status 1: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=307f3fef8e789fb91a70d2316de219f1d0e5899b TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/fs/quota/quota.sh'

From: /ceph/teuthology-archive/pdonnell-2018-10-24_02:35:37-fs-wip-pdonnell-testing-20181023.224346-distro-basic-smithi/3177753/teuthology.log
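
For context, the expect_false helper that trips in the trace above is a common qa-suite idiom that inverts a command's exit status. A minimal sketch of that idiom (an illustration matching what the trace shows, not guaranteed to be the exact definition in ceph.git):

```shell
# Sketch of the expect_false idiom: succeed only when the wrapped
# command fails; trace it with set -x, as seen in the log above.
expect_false() {
        set -x
        if "$@"; then return 1; else return 0; fi
}

# Example: the wrapped command fails, so expect_false succeeds.
expect_false false && echo "inverted failure: ok"
```

In the failing run, `touch shouldbefail` succeeded despite the max_files quota, so expect_false returned 1 and the workunit aborted with status 1.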

History

#1 Updated by Luis Henriques about 2 years ago

A quick look at the logs shows that there are 4 clients running this test simultaneously. I wonder whether this is something that used to succeed before, because these clients seem to be interfering with each other, setting and removing quotas.

If that's the case, a possible fix would be to have each client create its own test directory. Something like the diff below:

diff --git a/qa/workunits/fs/quota/quota.sh b/qa/workunits/fs/quota/quota.sh
index 1315be6d8609..d6e59317ecdf 100755
--- a/qa/workunits/fs/quota/quota.sh
+++ b/qa/workunits/fs/quota/quota.sh
@@ -25,8 +25,9 @@ function write_file()
        return 0
 }

-mkdir quota-test
-cd quota-test
+testdir=`hostname -A`-quota-test
+mkdir $testdir
+cd $testdir

 # bytes
 setfattr . -n ceph.quota.max_bytes -v 100000000  # 100m
@@ -123,6 +124,6 @@ expect_false setfattr -n ceph.quota -v "max_bytes=-1 max_files=-1" .
 #addme

 cd ..
-rm -rf quota-test
+rm -rf $testdir

 echo OK
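
A side note on the diff above: a `hostname -A`-derived name could still collide when two clients run on the same host, as client.0 and client.1 do on smithi071. A hedged alternative sketch using mktemp instead (assumes it runs from a scratch directory; the actual quota assertions from quota.sh are elided):

```shell
# Sketch: give each client a test directory that is unique even when
# two clients share a host (e.g. client.0 and client.1 on smithi071).
testdir="$(mktemp -d ./quota-test.XXXXXX)"
cd "$testdir"
# ... the quota assertions from quota.sh would run here ...
cd ..
rmdir "$testdir"
echo OK
```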

#2 Updated by Patrick Donnelly about 2 years ago

  • Subject changed from quota failure to qa: quota failure caused by clients stepping on each other

#3 Updated by Patrick Donnelly about 2 years ago

  • Assignee set to Patrick Donnelly

#4 Updated by Patrick Donnelly about 2 years ago

  • Assignee deleted (Patrick Donnelly)

Luis Henriques wrote:

A quick look at the logs shows that there are 4 clients running this test simultaneously. I wonder whether this is something that used to succeed before, because these clients seem to be interfering with each other, setting and removing quotas.

If that's the case, a possible fix would be to have each client create its own test directory. Something like the diff below:

[...]

Luis, I took a look. Each client gets its own subdirectory in the CephFS mount:

2018-10-24T03:08:34.009 INFO:teuthology.orchestra.run.smithi086:Running (workunit test fs/quota/quota.sh): 'mkdir -p -- /home/ubuntu/cephtest/mnt.2/client.2/tmp && cd -- /home/ubuntu/cephtest/mnt.2/client.2/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=307f3fef8e789fb91a70d2316de219f1d0e5899b TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="2" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.2 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.2 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.2/qa/workunits/fs/quota/quota.sh'

Emphasis on "mkdir -p -- /home/ubuntu/cephtest/mnt.2/client.2/tmp && cd -- /home/ubuntu/cephtest/mnt.2/client.2/tmp". mnt.2 is the CephFS root.

Mind taking another look?

#5 Updated by Luis Henriques about 2 years ago

Patrick Donnelly wrote:

Luis Henriques wrote:

A quick look at the logs shows that there are 4 clients running this test simultaneously. I wonder whether this is something that used to succeed before, because these clients seem to be interfering with each other, setting and removing quotas.

If that's the case, a possible fix would be to have each client create its own test directory. Something like the diff below:

[...]

Luis, I took a look. Each client gets its own subdirectory in the CephFS mount:

[...]

Emphasis on "mkdir -p -- /home/ubuntu/cephtest/mnt.2/client.2/tmp && cd -- /home/ubuntu/cephtest/mnt.2/client.2/tmp". mnt.2 is the CephFS root.

Mind taking another look?

Ah, sorry! I missed that. Sure, I'll have another look at the logs.

#6 Updated by Patrick Donnelly about 2 years ago

Luis Henriques wrote:

Patrick Donnelly wrote:

Luis Henriques wrote:

A quick look at the logs shows that there are 4 clients running this test simultaneously. I wonder whether this is something that used to succeed before, because these clients seem to be interfering with each other, setting and removing quotas.

If that's the case, a possible fix would be to have each client create its own test directory. Something like the diff below:

[...]

Luis, I took a look. Each client gets its own subdirectory in the CephFS mount:

[...]

Emphasis on "mkdir -p -- /home/ubuntu/cephtest/mnt.2/client.2/tmp && cd -- /home/ubuntu/cephtest/mnt.2/client.2/tmp". mnt.2 is the CephFS root.

Mind taking another look?

Ah, sorry! I missed that. Sure, I'll have another look at the logs.

Great, thanks for having a look! I'll assign this to you for now :)

#7 Updated by Patrick Donnelly about 2 years ago

  • Assignee set to Luis Henriques

#8 Updated by Luis Henriques about 2 years ago

Quick update: looking further at the logs helped me... get more confused :-)

So, all 4 clients fail the test when creating the 'shouldbefail' file: that operation should have failed due to the max_files quota being set, but it succeeded.

What I'm seeing is that both client.0 and client.1 call insert_dentry_inode() for the 'quota-test' dir several times, with different vinos. We have 4 clients: clients 2 and 3 call this function only once (with a unique vino each); the other 2 clients call it 4 times, with the vinos for all 4 clients:

  • client.2: 0x100000007d4
  • client.3: 0x10000000bbd
  • client.0: 0x10000000002, 0x100000003eb, 0x100000007d4, 0x10000000bbd
  • client.1: 0x100000003eb, 0x10000000002, 0x100000007d4, 0x10000000bbd

This looks wrong to me: a client shouldn't be reading inside the other clients' client.$ID directories, where the 'quota-test' dirs are.

So, my current theory is that the MDS is doing something wrong setting the snaprealms and the clients are receiving the wrong snaprealms for their quota-test dirs. I'll try to dig a bit deeper into the MDS code and see if I can find something, but maybe the above description rings a bell for anyone (Yan? :-) )

#9 Updated by Luis Henriques about 2 years ago

And another update: I cannot understand why two clients (both on smithi071, btw) do a readdir in the root directory. See, for example, around 03:08:36.514 in client.0.

Initially I thought this could also be a problem with casting/truncation, because the readdir was being done on 0x1 instead of 0x10000000001 (I even considered 32-bit clients, but they all seem to be 64-bit). But then client.1 should be readdir'ing 0x100000003ea, and it is still doing it on 0x1.

#10 Updated by Patrick Donnelly almost 2 years ago

  • Target version changed from v14.0.0 to v15.0.0

#11 Updated by Patrick Donnelly 11 months ago

  • Assignee deleted (Luis Henriques)
  • Priority changed from High to Normal
  • Target version deleted (v15.0.0)

#12 Updated by Patrick Donnelly 3 months ago

  • Priority changed from Normal to High
  • Target version set to v16.0.0
  • Backport deleted (mimic)

/ceph/teuthology-archive/pdonnell-2020-11-04_17:39:34-fs-wip-pdonnell-testing-20201103.210407-distro-basic-smithi/5590496/teuthology.log
/ceph/teuthology-archive/pdonnell-2020-11-04_17:39:34-fs-wip-pdonnell-testing-20201103.210407-distro-basic-smithi/5590512/teuthology.log

(might be a different failure.)

#13 Updated by Luis Henriques 2 months ago

Patrick Donnelly wrote:

/ceph/teuthology-archive/pdonnell-2020-11-04_17:39:34-fs-wip-pdonnell-testing-20201103.210407-distro-basic-smithi/5590496/teuthology.log
/ceph/teuthology-archive/pdonnell-2020-11-04_17:39:34-fs-wip-pdonnell-testing-20201103.210407-distro-basic-smithi/5590512/teuthology.log

(might be a different failure.)

Definitely a different failure, but this time I think it's easy to understand what's going on. It's a regression introduced (by me!) with the fix for cross-quota-tree renames.

Looks like the issue started happening in kernel 5.8 with commit dffdcd71458e ("ceph: allow rename operation under different quota realms"). My first guess is that the MDSs aren't updating inode rstats immediately after a truncate operation:

+ mkdir files limit
+ truncate files/file -s 10G
+ setfattr limit -n ceph.quota.max_bytes -v 1000000
+ expect_false mv files limit/
+ set -x
+ mv files limit/
+ return 1

Is there a reason for the MDSs to postpone this update? I'll need to check what the fuse client is doing here (I'm not even sure the fuse client allows renames across quota realms).
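
The failing sequence above can be rewritten as a standalone sketch. CEPHFS_TESTDIR is a hypothetical environment variable pointing into a CephFS mount with quota support; outside such a mount the sketch only prints SKIP:

```shell
# Hypothetical reproducer for the cross-quota-tree rename case above.
# CEPHFS_TESTDIR (an assumed name, not a real teuthology variable) must
# point into a CephFS mount; renaming a 10G sparse file into a dir with
# a 1MB quota must be rejected.
reproduce() {
        if [ -z "${CEPHFS_TESTDIR:-}" ]; then
                echo SKIP; return 0
        fi
        cd "$CEPHFS_TESTDIR" || return 1
        mkdir files limit
        truncate -s 10G files/file                       # sparse 10G file
        setfattr -n ceph.quota.max_bytes -v 1000000 limit
        if mv files limit/ 2>/dev/null; then
                echo "BUG: rename across quota realms succeeded"; return 1
        else
                echo "OK: rename rejected"; return 0
        fi
}

reproduce
```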

#14 Updated by Luis Henriques 2 months ago

The kernel client code is optimized to buffer the new file size when doing the truncate syscall:

        if (ia_valid & ATTR_SIZE) {
                if ((issued & CEPH_CAP_FILE_EXCL) &&
                    attr->ia_size > inode->i_size) {
                ...
                } else if ((issued & CEPH_CAP_FILE_SHARED) == 0 ||
                           attr->ia_size != inode->i_size) {
                ...
                }
        }

Only in the Fs case do we actually send the SETATTR operation to the MDS.

A trivial fix would be to always send it, even if we have Fx caps. This would have a performance impact, of course, but I believe it is what the fuse client is doing.

#15 Updated by Patrick Donnelly 2 months ago

Thanks for checking, Luis. I made a new ticket here: https://tracker.ceph.com/issues/48203

Let's move the discussion over there.

#16 Updated by Patrick Donnelly 3 days ago

  • Target version changed from v16.0.0 to v17.0.0
  • Backport set to pacific,octopus,nautilus
