Bug #48778

Setting quota triggered ops storms on meta pool

Added by Gerry D about 3 years ago. Updated about 3 years ago.

Status:
Need More Info
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

A few days ago, I experimented with setting a quota on a handful (fewer than 60) of directories on my CephFS filesystem. At the time, I was running version 15.2.1. After a while, I noticed that the number of operations on my meta pool was extremely high: normally it sits near zero, occasionally reaching a few hundred, but now it was at about 10K. I removed all the quotas I had set in the hope that this would resolve the issue, but to no avail.
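
For reference, the quotas were applied through the standard CephFS quota xattrs; a minimal sketch along these lines (the mount point, directory and values are illustrative, not the exact ones used):

# CephFS quotas are plain extended attributes set on directories.
# Limit a directory to roughly 100 GB (value is in bytes); the path is hypothetical.
setfattr -n ceph.quota.max_bytes -v 100000000000 /mnt/ceph/projects/example

# Optionally cap the number of files as well.
setfattr -n ceph.quota.max_files -v 100000 /mnt/ceph/projects/example

# Inspect the current quota.
getfattr -n ceph.quota.max_bytes /mnt/ceph/projects/example

# Setting the value to 0 is the documented way to remove a quota again.
setfattr -n ceph.quota.max_bytes -v 0 /mnt/ceph/projects/example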

As I was due for an upgrade anyway, I upgraded to 15.2.8 yesterday. At first, things seemed to be back to normal. However, after a few hours a strange pattern started to emerge: the operations on the meta pool would spike to 20K for approximately 1h40m and then return to normal for approximately 3h15m, recurring roughly every 5 hours.

I have attached a screenshot to illustrate what this looks like. The first "wave" of 10K IOPS lasts until I performed the upgrade; the 20K spikes that follow are from after the upgrade.

During these spikes, dump_ops_in_flight shows a bunch of pending operations, many of them pertaining to ceph.quota (despite there being no more quotas set), all with at least one event of "failed to xlock, waiting". Excerpt below.

        {
            "description": "client_request(mds.0:61937455 setxattr #0x10006a6465b ceph.quota caller_uid=0, caller_gid=0{})",
            "initiated_at": "2021-01-06T09:45:41.610493+0000",
            "age": 0.0010133469999999999,
            "duration": 0.001879094,
            "type_data": {
                "flag_point": "failed to xlock, waiting",
                "reqid": "mds.0:61937455",
                "op_type": "client_request",
                "client_info": {
                    "client": "mds.0",
                    "tid": 61937455
                },
                "events": [
                    {
                        "time": "2021-01-06T09:45:41.610493+0000",
                        "event": "initiated" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610493+0000",
                        "event": "throttled" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610493+0000",
                        "event": "header_read" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610493+0000",
                        "event": "all_read" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610598+0000",
                        "event": "dispatched" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610607+0000",
                        "event": "failed to xlock, waiting" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.612276+0000",
                        "event": "failed to xlock, waiting" 
                    }
                ]
            }
        },
        {
            "description": "client_request(mds.0:61937456 setxattr #0x100000014c1 ceph.quota caller_uid=0, caller_gid=0{})",
            "initiated_at": "2021-01-06T09:45:41.610515+0000",
            "age": 0.00099128499999999995,
            "duration": 0.00189281,
            "type_data": {
                "flag_point": "failed to xlock, waiting",
                "reqid": "mds.0:61937456",
                "op_type": "client_request",
                "client_info": {
                    "client": "mds.0",
                    "tid": 61937456
                },
                "events": [
                    {
                        "time": "2021-01-06T09:45:41.610515+0000",
                        "event": "initiated" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610515+0000",
                        "event": "throttled" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610515+0000",
                        "event": "header_read" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610515+0000",
                        "event": "all_read" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610609+0000",
                        "event": "dispatched" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610615+0000",
                        "event": "failed to xlock, waiting" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.612294+0000",
                        "event": "failed to xlock, waiting" 
                    }
                ]
            }
        },
        {
            "description": "client_request(client.75073:35290007 rmdir #0x10006acee5b/fonts 2021-01-06T09:45:41.609114+0000 caller_uid=3166, caller_gid=1000{})",
            "initiated_at": "2021-01-06T09:45:41.610660+0000",
            "age": 0.00084625900000000003,
            "duration": 0.001782741,
            "type_data": {
                "flag_point": "acquired locks",
                "reqid": "client.75073:35290007",
                "op_type": "client_request",
                "client_info": {
                    "client": "client.75073",
                    "tid": 35290007
                },
                "events": [
                    {
                        "time": "2021-01-06T09:45:41.610660+0000",
                        "event": "initiated" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610661+0000",
                        "event": "throttled" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610660+0000",
                        "event": "header_read" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610663+0000",
                        "event": "all_read" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610673+0000",
                        "event": "dispatched" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610690+0000",
                        "event": "acquired locks" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610697+0000",
                        "event": "failed to xlock, waiting" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.612426+0000",
                        "event": "acquired locks" 
                    }
                ]
            }
        },
        {
            "description": "client_request(client.75073:35287363 setfilelock rule 2, type 2, owner 13411631674495090448, pid 9437, start 0, length 0, wait 1 #0x10006accd98 2021-01-06T09:44:08.248412+0000 caller_uid=2730, caller_gid=1000{})",
            "initiated_at": "2021-01-06T09:44:08.251306+0000",
            "age": 93.360200323000001,
            "duration": 93.36117634,
            "type_data": {
                "flag_point": "failed to add lock, waiting",
                "reqid": "client.75073:35287363",
                "op_type": "client_request",
                "client_info": {
                    "client": "client.75073",
                    "tid": 35287363
                },
                "events": [
                    {
                        "time": "2021-01-06T09:44:08.251306+0000",
                        "event": "initiated" 
                    },
                    {
                        "time": "2021-01-06T09:44:08.251308+0000",
                        "event": "throttled" 
                    },
                    {
                        "time": "2021-01-06T09:44:08.251306+0000",
                        "event": "header_read" 
                    },
                    {
                        "time": "2021-01-06T09:44:08.251314+0000",
                        "event": "all_read" 
                    },
                    {
                        "time": "2021-01-06T09:44:08.251328+0000",
                        "event": "dispatched" 
                    },
                    {
                        "time": "2021-01-06T09:44:08.251353+0000",
                        "event": "acquired locks" 
                    },
                    {
                        "time": "2021-01-06T09:44:08.251360+0000",
                        "event": "failed to add lock, waiting" 
                    },
                    {
                        "time": "2021-01-06T09:44:10.683395+0000",
                        "event": "acquired locks" 
                    },
                    {
                        "time": "2021-01-06T09:44:10.683397+0000",
                        "event": "failed to add lock, waiting" 
                    },
                    {
                        "time": "2021-01-06T09:44:17.626619+0000",
                        "event": "acquired locks" 
                    },
                    {
                        "time": "2021-01-06T09:44:17.626621+0000",
                        "event": "failed to add lock, waiting" 
                    },
                    {
                        "time": "2021-01-06T09:44:46.166481+0000",
                        "event": "acquired locks" 
                    },
                    {
                        "time": "2021-01-06T09:44:46.166483+0000",
                        "event": "failed to add lock, waiting" 
                    },
                    {
                        "time": "2021-01-06T09:44:46.371158+0000",
                        "event": "acquired locks" 
                    },
                    {
                        "time": "2021-01-06T09:44:46.371159+0000",
                        "event": "failed to add lock, waiting" 
                    },
                    {
                        "time": "2021-01-06T09:44:47.581885+0000",
                        "event": "acquired locks" 
                    },
                    {
                        "time": "2021-01-06T09:44:47.581886+0000",
                        "event": "failed to add lock, waiting" 
                    },
                    {
                        "time": "2021-01-06T09:44:52.718687+0000",
                        "event": "acquired locks" 
                    },
                    {
                        "time": "2021-01-06T09:44:52.718689+0000",
                        "event": "failed to add lock, waiting" 
                    },
                    {
                        "time": "2021-01-06T09:45:31.331148+0000",
                        "event": "acquired locks" 
                    },
                    {
                        "time": "2021-01-06T09:45:31.331150+0000",
                        "event": "failed to add lock, waiting" 
                    }
                ]
            }
        },
        {
            "description": "client_request(mds.0:61937457 setxattr #0x10006a64656 ceph.quota caller_uid=0, caller_gid=0{})",
            "initiated_at": "2021-01-06T09:45:41.610708+0000",
            "age": 0.00079801799999999995,
            "duration": 0.0018484860000000001,
            "type_data": {
                "flag_point": "submit entry: journal_and_reply",
                "reqid": "mds.0:61937457",
                "op_type": "client_request",
                "client_info": {
                    "client": "mds.0",
                    "tid": 61937457
                },
                "events": [
                    {
                        "time": "2021-01-06T09:45:41.610708+0000",
                        "event": "initiated" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610708+0000",
                        "event": "throttled" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610708+0000",
                        "event": "header_read" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610708+0000",
                        "event": "all_read" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610728+0000",
                        "event": "dispatched" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.610750+0000",
                        "event": "failed to xlock, waiting" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.612514+0000",
                        "event": "acquired locks" 
                    },
                    {
                        "time": "2021-01-06T09:45:41.612523+0000",
                        "event": "submit entry: journal_and_reply" 
                    }
                ]
            }
        }
    ],
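
For reference, the excerpt above comes from the MDS admin socket; roughly like this (mds.a is a placeholder for the actual active MDS daemon name, and the jq filter is only a convenience for pulling out the quota-related ops):

# On the host running the active MDS: dump all operations currently in flight.
ceph daemon mds.a dump_ops_in_flight

# Show only the ops whose description mentions ceph.quota (requires jq).
ceph daemon mds.a dump_ops_in_flight | jq '.ops[] | select(.description | contains("ceph.quota"))'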

I'm at a loss as to what to do next, but I do think I have hit a bug somehow, as I don't see why this happened in the first place, and especially why it is still happening now after removing all quotas and upgrading (and thus restarting everything).

meta-iops.png View (78.1 KB) Gerry D, 01/07/2021 05:28 AM

traffic_ceph_vs_clients.png View (102 KB) Gerry D, 01/12/2021 05:52 AM

History

#1 Updated by Patrick Donnelly about 3 years ago

  • Status changed from New to Need More Info

This doesn't sound like a bug we'd expect. I think you may have some clients executing these setxattr requests, but I'm unsure of the reason or mechanism. I suggest you look at the session listing on the MDS to review which clients are connected so you can examine their workloads.
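
For anyone following along, the session listing can be taken from the MDS admin socket; a sketch (mds.a is a placeholder for the active MDS daemon name):

# List all client sessions on the active MDS, including the client id and
# client metadata such as hostname, mount point and client version.
ceph daemon mds.a session ls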

#2 Updated by Gerry D about 3 years ago

Thank you for your reply, Patrick.

I already suspected the requests might come from the clients, but my initial debugging didn't reveal any such requests (and certainly not that many) from any client. Your reply prompted me to check again, and I'm now fairly sure the issue isn't related to any client.

There are a total of 7 client sessions to the affected filesystem. Two clients have two mounts each (4 sessions total), and three clients have one mount each (3 sessions total).

Of these, the three clients with a single mount could be shut down with little impact, so I did that; there was no change in the symptoms described. Of the other two, I can only take one offline at a time without impact, which I also did, to no avail.

I also checked the debug information available on the clients under /sys/kernel/debug/ceph/*/, which shows a very low workload. The osdc file is essentially empty ("REQUESTS 0 homeless 0", no linger requests, no backoffs). I would expect the "ops in flight" I pasted in the original post to be accounted for somewhere in these files if they originated from the clients, unless I'm misunderstanding something here.
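
To be concrete, this is roughly what I looked at on each kernel client (debugfs must be mounted; the directory name under /sys/kernel/debug/ceph/ is the cluster fsid plus the client id):

# Per-mount debug info exposed by the kernel client (one directory per mount).
# mdsc lists in-flight MDS requests, osdc lists in-flight OSD requests.
for d in /sys/kernel/debug/ceph/*/; do
    echo "== $d"
    cat "$d/mdsc" "$d/osdc"
done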

Furthermore, we see far more network traffic between the Ceph nodes themselves than between the clients and the Ceph nodes. That is not totally unexpected, but I would expect at least some correlation, and there is none. The attached graph traffic_ceph_vs_clients.png illustrates what I mean: the yellow stack is the aggregate of the clients, the green is the aggregate of the Ceph nodes. As you can see, the traffic is almost entirely between the Ceph nodes, and there is a massive increase (from ~30 Mbps to 400 Mbps and later to 500 Mbps) at the exact moment the problem started.

Is there anything else I could check to pinpoint the cause of the issue?

#3 Updated by Tom Myny about 3 years ago

Small question: why do we see client_request(mds.0:61937457 setxattr #0x10006a64656 ceph.quota caller_uid=0, caller_gid=0{}) if all quotas have been removed (for now)?

Also, the client_request mentions the MDS server itself (mds.0); is that normal?

#4 Updated by Gerry D about 3 years ago

Today I was able to shut down all clients simultaneously, and I did so. The write operations continued even with all clients shut down.

I tried to get a dump of the MDS cache, but the command failed to output anything; instead, it triggered a failover to the standby MDS. Only after this failover did the write operations subside.
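
For the record, the cache dump was attempted roughly like this (mds.a is a placeholder for the daemon name and the output path is arbitrary); dumping a very large cache is itself expensive, which may be what pushed the MDS into a failover:

# On the active MDS host: write the current MDS cache contents to a file.
# On a very large cache this can stall the MDS long enough to miss beacons.
ceph daemon mds.a dump cache /tmp/mds-cache.dump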

For good measure, I restarted both the active and the standby MDS before starting the clients again. The write-operation storm has not reappeared yet, though if the situation is anything like last week's, it could start again at any moment.

#5 Updated by Gerry D about 3 years ago

After the issue had not reappeared for 6 days, I tried enabling quotas again, this time on 71 directories. Shortly after doing so, the issue came back with exactly the same symptoms: a continuous load of approximately 5K IOPS on the meta pool (normally 10-50, at most 100-500 IOPS), and traffic between the Ceph nodes back up to a continuous 200 Mbps (normally 20-40 Mbps).
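
One way to watch this directly, independent of our monitoring graphs, is the per-pool statistics; a sketch (the pool name cephfs_metadata is an assumption, substitute the actual metadata pool):

# Per-pool client I/O rates as reported by the cluster.
ceph osd pool stats cephfs_metadata

# Refresh every few seconds to watch the storm come and go.
watch -n 5 'ceph osd pool stats cephfs_metadata'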

All quotas that had been set have now been removed again, to no avail.

I will attempt to restart both MDS later and see if it goes back to normal.

#6 Updated by Gerry D about 3 years ago

After restarting both MDS the situation went back to normal.

Combing through our logs to find out which action triggered the issue, I found that the trigger seems to be setting a quota of 0 on a directory that never previously had a quota set:

root@cephclient:/mnt/ceph/# mkdir test
root@cephclient:/mnt/ceph/# getfattr -n ceph.quota.max_bytes test/
test/: ceph.quota.max_bytes: No such attribute
root@cephclient:/mnt/ceph/# setfattr -n ceph.quota.max_bytes -v 0 test

Doing the above, I saw the same symptoms as before.

After restarting the MDS again and waiting a while to make sure everything was back to normal, I tried once more, this time first setting the quota to a non-zero value and only then setting it to 0.

root@cephclient:/mnt/ceph/# mkdir test2
root@cephclient:/mnt/ceph/# getfattr -n ceph.quota.max_bytes test2/
test2/: ceph.quota.max_bytes: No such attribute
root@cephclient:/mnt/ceph/# setfattr -n ceph.quota.max_bytes -v 99999 test2
root@cephclient:/mnt/ceph/# setfattr -n ceph.quota.max_bytes -v 0 test2

This does not trigger the issue.

I re-confirmed my findings by triggering it again on a new directory later.

For now, I've worked around the issue by making sure we always set a non-zero quota before setting it to 0 (quotas are set by an automated deployment, which always sets the quota to either the desired value or off (0) for idempotency). This seems to work reliably; I will report back if time proves me wrong.
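
A minimal sketch of what the deployment now does (the helper name and paths are made up for illustration):

# Workaround: never write a quota of 0 to a directory that has never had one.
# If the target is 0, set a harmless non-zero placeholder first, then clear it.
set_quota() {
    dir="$1"
    bytes="$2"
    if [ "$bytes" -eq 0 ]; then
        setfattr -n ceph.quota.max_bytes -v 1 "$dir"    # placeholder value
    fi
    setfattr -n ceph.quota.max_bytes -v "$bytes" "$dir"
}

# Example: disable the quota on a directory (path is hypothetical).
set_quota /mnt/ceph/test3 0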

I'm happy to provide any information you might need to find the root cause now that I can reliably trigger it. I'm not sure where to start myself, so hints are welcome.

#7 Updated by Patrick Donnelly about 3 years ago

Gerry D wrote:

> After restarting both MDS the situation went back to normal.
>
> Combing through our logs to find out which action triggered the issue, I found that the trigger seems to be setting a quota of 0 on a directory that never previously had a quota set:
>
> [...]
>
> Doing the above, I saw the same symptoms as before.

I tried this with ceph-fuse and the kernel client and didn't observe any extra setxattr traffic.

> For now, I've worked around the issue by making sure we always set a non-zero quota before setting it to 0 (quotas are set by an automated deployment, which always sets the quota to either the desired value or off (0) for idempotency). This seems to work reliably; I will report back if time proves me wrong.
>
> I'm happy to provide any information you might need to find the root cause now that I can reliably trigger it. I'm not sure where to start myself, so hints are welcome.

Does this only happen with the kernel client?
