Bug #22963

radosgw-admin usage show loops indefinitely - again

Added by Ingo Reimann about 6 years ago. Updated almost 6 years ago.

Status: Resolved
Priority: High
Assignee:
Target version: -
% Done: 0%
Source:
Tags: rgw, usage
Backport: luminous
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite: rgw
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

After updating our cluster from Jewel to Luminous, radosgw-admin usage show loops. I thought #21196 should have closed this bug.

E.g.:
radosgw-admin usage show --uid="DunkelDatenaustausch" --start-date=2018-01-01 --end-date=2018-02-01

{
    "entries": [
        {
            "user": "DunkelDatenaustausch",
            "buckets": [
                {
                    "bucket": "",
                    "time": "2018-01-17 10:00:00.000000Z",
                    "epoch": 1516183200,
                    "owner": "DunkelDatenaustausch",
                    "categories": [
                        {
                            "category": "list_buckets",
                            "bytes_sent": 344,
                            "bytes_received": 0,
                            "ops": 1,
                            "successful_ops": 1
                        }
                    ]
                },
                {
                    "bucket": "",
                    "time": "2018-01-17 10:00:00.000000Z",
                    "epoch": 1516183200,
                    "owner": "DunkelDatenaustausch",
                    "categories": [
                        {
                            "category": "list_buckets",
                            "bytes_sent": 344,
                            "bytes_received": 0,
                            "ops": 1,
                            "successful_ops": 1
                        }
                    ]
                },
                {
                    "bucket": "",
                    "time": "2018-01-17 10:00:00.000000Z",
                    "epoch": 1516183200,
                    "owner": "DunkelDatenaustausch",
                    "categories": [
                        {
                            "category": "list_buckets",
                            "bytes_sent": 344,
                            "bytes_received": 0,
                            "ops": 1,
                            "successful_ops": 1
                        }
                    ]
                },
                [...]

Interestingly, the amounts of data, bytes_sent and bytes_received, seem to be correct and to cover the complete time range. We wrote a little wrapper that detects and breaks the loop; when we summed up the bytes afterwards, we still got the correct accounting.
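
The wrapper itself is not attached here; as an illustration of the idea only, a minimal Python sketch that deduplicates the repeated bucket entries and re-sums the counters (the dedup key and the post-processing approach are assumptions, not the actual wrapper code):

#!/usr/bin/env python3
# Hypothetical post-processing sketch: deduplicate the repeated bucket entries
# that a looping "radosgw-admin usage show" emits, then re-sum the byte counters.
import json
import subprocess

def usage_show(uid, start, end):
    # Run the same command as in the description and parse its JSON output.
    out = subprocess.check_output([
        "radosgw-admin", "usage", "show",
        "--uid", uid, "--start-date", start, "--end-date", end,
    ])
    return json.loads(out)

def dedup_and_sum(usage):
    seen = set()
    totals = {"bytes_sent": 0, "bytes_received": 0, "ops": 0, "successful_ops": 0}
    for entry in usage.get("entries", []):
        for bucket in entry.get("buckets", []):
            # Assumed dedup key: while looping, identical entries repeat verbatim.
            key = json.dumps(bucket, sort_keys=True)
            if key in seen:
                continue
            seen.add(key)
            for cat in bucket.get("categories", []):
                for field in totals:
                    totals[field] += cat.get(field, 0)
    return totals

print(dedup_and_sum(usage_show("DunkelDatenaustausch", "2018-01-01", "2018-02-01")))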

For some accounts, we still get the original behaviour.

The cluster is rather old; it has probably been upgraded over the years, starting from Bobtail. I have included the zonegroup config.

clusterinfo.zip (20.6 KB) Ingo Reimann, 02/09/2018 06:40 AM


Related issues

Related to RADOS - Bug #22050: ERROR type entries of pglog do not update min_last_complete_ondisk, potentially ballooning memory usage Resolved 11/06/2017
Related to rgw - Bug #23229: usage trim loops forever: infinite calls to rgw.user_usage_log_trim Resolved 03/05/2018
Copied to rgw - Backport #23686: luminous: radosgw-admin usage show loops indefinitely - again Resolved

History

#1 Updated by Greg Farnum about 6 years ago

  • Project changed from Ceph to rgw

#2 Updated by Robin Johnson about 6 years ago

  • Status changed from New to 12
  • Priority changed from Normal to High
  • Tags set to rgw, usage
  • Backport set to luminous
  • Severity changed from 3 - minor to 2 - major
  • Release set to master

This bug definitely still exists and is affecting billing.
Here's the interesting part from --debug-rgw=20 --debug-ms=1 on the client. I'm going to track it to the OSD as well, but I think it's in the client-side class code.

2018-02-15 19:03:31.179167 7f3ba1219b80  1 -- [XXXX::5752]:0/598436282 --> [XXXX::7098]:6838/2803930 -- osd_op(unknown.0.0:70 12.2 12:51cb4789:::usage.21:head [call rgw.user_usage_log_read] snapc 0=[] ondisk+read+known_if_redirected e1292041) v8 -- 0x5618934a01d0 con 0
2018-02-15 19:03:31.192370 7f3b8f1a5700  1 -- [XXXX::5752]:0/598436282 <== osd.604 [XXXX::7098]:6838/2803930 1 ==== osd_op_reply(70 usage.21 [call] v0'0 uv194985623 ondisk = 0) v8 ==== 152+0+399 (1593127871 0 1061501781) 0x7f3b80012070 con 0x56189349bbb0
2018-02-15 19:03:31.192649 7f3ba1219b80  1 -- [XXXX::5752]:0/598436282 --> [XXXX::7098]:6838/2803930 -- osd_op(unknown.0.0:71 12.2 12:51cb4789:::usage.21:head [call rgw.user_usage_log_read] snapc 0=[] ondisk+read+known_if_redirected e1292041) v8 -- 0x56189349a360 con 0
2018-02-15 19:03:31.209750 7f3b8f1a5700  1 -- [XXXX::5752]:0/598436282 <== osd.604 [XXXX::7098]:6838/2803930 2 ==== osd_op_reply(71 usage.21 [call] v0'0 uv194985623 ondisk = 0) v8 ==== 152+0+399 (1593127871 0 1061501781) 0x7f3b80012070 con 0x56189349bbb0
2018-02-15 19:03:31.209975 7f3ba1219b80  1 -- [XXXX::5752]:0/598436282 --> [XXXX::7098]:6838/2803930 -- osd_op(unknown.0.0:72 12.2 12:51cb4789:::usage.21:head [call rgw.user_usage_log_read] snapc 0=[] ondisk+read+known_if_redirected e1292041) v8 -- 0x56189349a360 con 0
2018-02-15 19:03:31.227569 7f3b8f1a5700  1 -- [XXXX::5752]:0/598436282 <== osd.604 [XXXX::7098]:6838/2803930 3 ==== osd_op_reply(72 usage.21 [call] v0'0 uv194985623 ondisk = 0) v8 ==== 152+0+399 (1593127871 0 1061501781) 0x7f3b80010850 con 0x56189349bbb0
2018-02-15 19:03:31.227796 7f3ba1219b80  1 -- [XXXX::5752]:0/598436282 --> [XXXX::7098]:6838/2803930 -- osd_op(unknown.0.0:73 12.2 12:51cb4789:::usage.21:head [call rgw.user_usage_log_read] snapc 0=[] ondisk+read+known_if_redirected e1292041) v8 -- 0x56189349a360 con 0
2018-02-15 19:03:31.245164 7f3b8f1a5700  1 -- [XXXX::5752]:0/598436282 <== osd.604 [XXXX::7098]:6838/2803930 4 ==== osd_op_reply(73 usage.21 [call] v0'0 uv194985623 ondisk = 0) v8 ==== 152+0+399 (1593127871 0 1061501781) 0x7f3b80012070 con 0x56189349bbb0

#3 Updated by Robin Johnson about 6 years ago

Debugging so far:
1. The bug only occurs on the Congress cluster, not the Benjamin cluster (which was rebuilt from scratch in the Jewel era), so it probably depends on old data. Congress has LOTS of old usage data.
2. I haven't seen the ENOENT path get triggered yet.

#4 Updated by Robin Johnson about 6 years ago

The comment by Marcin Gibula in issue 21196 is actually spot on.
This DOES actually complete, eventually, if nothing else kills it first for taking too long.

I created a brand new user, and started a capture on it, asking for yesterday's data (of which there should be NONE):
time radosgw-admin usage show --uid ceph-22963-test --start-date=2018-02-17

It did eventually complete, with the time output of:
real 37m3.425s
user 0m39.836s
sys 0m14.912s

If I pick another username, the runtime is very similar.

#5 Updated by Marcin Gibula about 6 years ago

Hi,
just FYI - I've uploaded some logs in this ticket as well - http://tracker.ceph.com/issues/22314

If you need anything more, just ask - it's currently a major blocker for us as well.

#6 Updated by Orit Wasserman about 6 years ago

  • Related to Bug #2050: rgw: crash at Objecter::_linger_commit() added

#7 Updated by Orit Wasserman about 6 years ago

  • Related to deleted (Bug #2050: rgw: crash at Objecter::_linger_commit())

#8 Updated by Orit Wasserman about 6 years ago

  • Related to Bug #22050: ERROR type entries of pglog do not update min_last_complete_ondisk, potentially ballooning memory usage added

#9 Updated by Greg Farnum about 6 years ago

I looked through the bucket_list.log and do see some inexplicably long op times. Although I could maybe see the watch-notify stuff (3) taking a while, most listing ops take 1 ms instead of 350 ms (1), and a create shouldn't take 196 ms (2) unless the PG is blocked for some reason. (Which it... might be?)

2018-02-07 07:23:43.501849 7f87efd3f700  1 -- 10.198.10.101:0/734401551 --> 10.198.10.14:6832/271459 -- osd_op(unknown.0.0:21 16.1 16:80000000::::head [pgnls start_epoch 2083520] snapc 0=[] ondisk+read+ignore_overlay+known_if_redirected e2083520) v8 -- 0x7f87e000c900 con 0
2018-02-07 07:23:43.825047 7f87ef53e700  1 -- 10.198.10.101:0/734401551 <== osd.215 10.198.10.14:6832/271459 1 ==== osd_op_reply(21  [pgnls start_epoch 2083520] v987079'84 uv84 ondisk = 1) v8 ==== 144+0+107 (3045876259 0 1982614520) 0x7f87e4009c00 con 0x7f87e0008240
...
2018-02-07 07:23:43.846484 7f8802235cc0  1 -- 10.198.10.101:0/734401551 --> 10.198.10.15:6830/158365 -- osd_op(unknown.0.0:52 8.7 8:f95f44c2:::notify.0:head [create] snapc 0=[] ondisk+write+known_if_redirected e2083520) v8 -- 0x56443269d7c0 con 0
2018-02-07 07:23:44.042682 7f87ef53e700  1 -- 10.198.10.101:0/734401551 <== osd.27 10.198.10.15:6830/158365 1 ==== osd_op_reply(52 notify.0 [create] v2083520'1819325 uv1819325 ondisk = 0) v8 ==== 152+0+0 (872112348 0 0) 0x7f87e400a900 con 0x5644326990a0
...
2018-02-07 07:23:44.042912 7f8802235cc0  1 -- 10.198.10.101:0/734401551 --> 10.198.10.15:6830/158365 -- osd_op(unknown.0.0:53 8.7 8:f95f44c2:::notify.0:head [watch watch cookie 94850903562176] snapc 0=[] ondisk+write+known_if_redirected e2083520) v8 -- 0x56443269e1b0 con 0
2018-02-07 07:23:45.013615 7f87ef53e700  1 -- 10.198.10.101:0/734401551 <== osd.27 10.198.10.15:6830/158365 2 ==== osd_op_reply(53 notify.0 [watch watch cookie 94850903562176] v2083520'1819326 uv1819325 ondisk = 0) v8 ==== 152+0+0 (739656982 0 0) 0x7f87e400a900 con 0x5644326990a0
2018-02-07 07:23:45.013746 7f8802235cc0 20 add_watcher() i=0

But since that log appears to end successfully, I'm confused.

#10 Updated by Greg Farnum about 6 years ago

Also, Sage is asking whether missing https://github.com/ceph/ceph/pull/13021 might be causing the repeated loops.

#11 Updated by Greg Farnum about 6 years ago

Okay, dug into the rgw side of things a bit more and I think https://github.com/ceph/ceph/pull/19131 contains some incorrect logic?

[16:04:51] <gregsfortytwo> it looks to me like that's a little broken: in cls_rgw_client, we keep looping until we receive ENODATA
[16:05:23] <gregsfortytwo> in cls_rgw, we only return ENODATA if "iter.empty()", which in usage_iterate_range is key.iter
[16:05:30] <gregsfortytwo> s/key.iter/key_iter/
[16:06:07] <gregsfortytwo> but in usage_iterate_range we set key_iter to non-empty if we run out of user data to trim, meaning that we're done
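
A minimal Python model of the mismatch being described (placeholder names and data shapes, not the actual cls_rgw code):

ENODATA = 61  # errno the client-side loop is waiting for

def cls_usage_read(log_entries, start, end):
    # Stand-in for the rgw.user_usage_log_read class method; names/shapes are assumptions.
    if not log_entries:
        return -ENODATA, [], ""            # ENODATA only when the key iterator is empty
    in_range = [e for e in log_entries if start <= e["epoch"] < end]
    # Running out of data in the requested range leaves key_iter non-empty
    # (meaning "done"), so the ENODATA branch above is never taken again.
    return 0, in_range, "past-end-key"

def client_usage_show(log_entries, start, end, max_calls=5):
    # The real client keeps calling until it sees ENODATA; max_calls exists only
    # so this sketch terminates instead of looping forever.
    out = []
    for _ in range(max_calls):
        ret, entries, key_iter = cls_usage_read(log_entries, start, end)
        if ret == -ENODATA:
            break
        out.extend(entries)                # the same entries pile up, as in the JSON above
    return out

print(len(client_usage_show([{"epoch": 1516183200}], 1514764800, 1517443200)))  # prints 5, not 1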

#12 Updated by Greg Farnum about 6 years ago

  • Status changed from 12 to 4

It kinda looks to me like we should just set "*truncated = false" in the cases where usage_iterate_range() reaches the end of what we care about, and that the callers should trust that truncated parameter about whether they're done or not. But perhaps I'm missing something?
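
A toy Python sketch of that direction (not the PR itself): completion is signalled through the truncated flag instead of waiting for ENODATA.

def cls_usage_read_fixed(log_entries, start, end):
    # Same toy model as in comment #11, but the end of the requested range is
    # reported via the truncated flag rather than by returning ENODATA.
    in_range = [e for e in log_entries if start <= e["epoch"] < end]
    truncated = False  # end of the range we care about; a real implementation
                       # would set this to True while more in-range data remains
    return 0, in_range, truncated

def client_usage_show_fixed(log_entries, start, end):
    out = []
    while True:
        ret, entries, truncated = cls_usage_read_fixed(log_entries, start, end)
        out.extend(entries)
        if not truncated:  # the caller trusts the truncated flag and stops
            break
    return out

print(len(client_usage_show_fixed([{"epoch": 1516183200}], 1514764800, 1517443200)))  # prints 1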

Anyway, pushed a PR: https://github.com/ceph/ceph/pull/20730

Assuming that's correct, it will for sure need a backport.

#13 Updated by Orit Wasserman about 6 years ago

  • Related to Bug #23229: usage trim loops forever: infinite calls to rgw.user_usage_log_trim added

#14 Updated by Orit Wasserman about 6 years ago

  • Related to Bug #23229: usage trim loops forever: infinite calls to rgw.user_usage_log_trim added

#15 Updated by Orit Wasserman about 6 years ago

  • Related to deleted (Bug #23229: usage trim loops forever: infinite calls to rgw.user_usage_log_trim)

#16 Updated by Orit Wasserman about 6 years ago

  • Status changed from 4 to Fix Under Review
  • Assignee set to Orit Wasserman

#17 Updated by Robin Johnson about 6 years ago

@Orit & @Greg:
I confirm this fix works for the Benjamin cluster! Now completes in a few seconds.

#18 Updated by Orit Wasserman about 6 years ago

  • Status changed from Fix Under Review to 17

#19 Updated by Orit Wasserman about 6 years ago

  • Assignee changed from Orit Wasserman to Yehuda Sadeh

#20 Updated by Robin Johnson about 6 years ago

My own further fix is here:
https://github.com/dreamhost/ceph/commit/6c0cb1c1491df1f69309ff57a5b0f9d7929e1990
It will hopefully be submitted for master this week.

Further digging shows this might have been broken since the Jewel release?

#22 Updated by Casey Bodley almost 6 years ago

  • Status changed from 17 to 7

#23 Updated by Casey Bodley almost 6 years ago

  • Status changed from 7 to Pending Backport

#24 Updated by Nathan Cutler almost 6 years ago

  • Copied to Backport #23686: luminous: radosgw-admin usage show loops indefinitely - again added

#25 Updated by Abhishek Lekshmanan almost 6 years ago

  • Status changed from Pending Backport to Resolved
