Bug #52339

closed

rgw upgrade 14.2.22 slow requests

Added by Maximilian Stinsky over 2 years ago. Updated about 2 years ago.

Status: Won't Fix - EOL
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: Yes
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Greetings,

We upgraded our Ceph cluster from 14.2.21 to 14.2.22, and since then we have seen a number of S3 requests that are quite slow.
It looks to us as if something is waiting for a timeout, because all slow requests take around 60s or a multiple of 60s.
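
One way to back up the "60s or a multiple of 60s" observation with numbers is to check how many slow requests land close to a multiple of 60 seconds. The following is only a minimal sketch: the function names, the 2s tolerance and the 10s "slow" threshold are assumptions chosen for illustration, and the request durations would have to be extracted from whatever access or ops log is available.

```python
def near_multiple_of(duration_s, period_s=60.0, tolerance_s=2.0):
    """Return True if duration_s lies within tolerance_s of a positive
    multiple of period_s (60s, 120s, ... by default)."""
    # Anything clearly shorter than one period cannot be a timeout hit.
    if duration_s < period_s - tolerance_s:
        return False
    remainder = duration_s % period_s
    # Distance to the nearest multiple, from below or from above.
    return min(remainder, period_s - remainder) <= tolerance_s


def fraction_near_multiple(durations_s, period_s=60.0, tolerance_s=2.0,
                           slow_threshold_s=10.0):
    """Among requests slower than slow_threshold_s (an assumed cutoff),
    return the fraction whose duration is close to a multiple of period_s."""
    slow = [d for d in durations_s if d >= slow_threshold_s]
    if not slow:
        return 0.0
    hits = sum(near_multiple_of(d, period_s, tolerance_s) for d in slow)
    return hits / len(slow)
```

A fraction close to 1.0 would support the suspicion that a fixed 60s timeout is involved; a low fraction would suggest the slowness has a different cause.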

After downgrading the rgw back to version 14.2.21, the problem disappeared instantly, so we are sure the problem is somewhere in the 14.2.22 patch.
The rgw logs showed a lot of "No stored secret string, cache miss" messages during the time we were running 14.2.22, which points us toward the new caching of S3 credentials from keystone (https://github.com/ceph/ceph/pull/41158).
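
To see whether the cache-miss messages line up in time with the slow requests, the log lines can be counted per minute. This is a rough sketch, not a definitive tool: it assumes each rgw log line starts with a timestamp whose first 16 characters identify the minute (for example `2021-08-19 10:12`), and the function name and `prefix_len` parameter are made up for illustration.

```python
from collections import Counter

# Message text as quoted from the rgw log in this report.
CACHE_MISS_MARKER = "No stored secret string, cache miss"


def count_marker_per_minute(lines, marker=CACHE_MISS_MARKER, prefix_len=16):
    """Count lines containing marker, grouped by the first prefix_len
    characters of the line (assumed to be a timestamp down to the minute)."""
    counts = Counter()
    for line in lines:
        if marker in line:
            counts[line[:prefix_len]] += 1
    return counts
```

Usage could look like this, with the log path being a placeholder for the actual rgw log file:

```python
with open("/path/to/rgw.log", encoding="utf-8", errors="replace") as log:
    for minute, hits in sorted(count_marker_per_minute(log).items()):
        print(minute, hits)
```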

Any idea what the problem could be or what information we need to provide to find the problem?

Thanks in advance

Actions #1

Updated by Casey Bodley over 2 years ago

  • Status changed from New to Won't Fix - EOL

The Nautilus release is EOL now, so we won't be able to fix this there. If you see similar issues with a later release, please reopen!

Actions #2

Updated by Maximilian Stinsky over 2 years ago

Yes, we know that the Nautilus release is EOL, but this leaves us with a bad feeling about upgrading to the next major release, since later releases also implement the new caching mechanism, which is most likely what causes the slow requests we saw with 14.2.22. If we upgraded to 16.2.5 and hit the same issue, we would not be able to roll back to a working release.

In the end, 14.2.22 came out after the EOL date and brought a new caching mechanism that might result in slow requests - at least in our environment.

Actions #3

Updated by Dan van der Ster over 2 years ago

Maximilian, do you actually have keystone auth enabled (`rgw_s3_auth_use_keystone = true`)?
It is disabled by default.

Actions #4

Updated by Maximilian Stinsky over 2 years ago

Yes, we have keystone auth for S3 enabled on our RGWs.

Actions #5

Updated by Maximilian Stinsky about 2 years ago

Just wanted to drop a comment that we upgraded our Ceph cluster to Pacific and are no longer seeing this issue.
We could see that the number of keystone requests dropped considerably, so the new caching feature is active.
But I still think that 14.2.22 has some kind of issue with the new rgw caching method.
