Bug #25060

open

rgw-multisite: RGWDataSyncShardCR is woken up too often

Added by Xinying Song over 5 years ago. Updated over 5 years ago.

Status: Fix Under Review
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: mimic,luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

RGWDataSyncShardCR should be woken up every 20 seconds (defined by INCREMENTAL_INTERVAL) when there is no IO to rgw. But in our production environment we found that it is woken up almost every second.
Further investigation shows that this is caused by the out-of-band data processing logic. Each time rgw receives out-of-band data, it wakes up the corresponding CR, and this CR adds a new timer event before it yields. Over time, a CR ends up with more than one timer event, so it is woken up more than once in every 20 seconds.
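
The accumulation described above can be illustrated with a small toy model. This is only a sketch of the reported behaviour, not the actual RGW coroutine code: the function `simulate` and its parameters are hypothetical, and the model simply assumes that every run of the CR schedules a new timer 20 seconds ahead while timers that are already pending stay in place.

```python
import heapq

INCREMENTAL_INTERVAL = 20  # seconds, the interval named in the report


def simulate(external_wakeups, duration):
    """Return the times (in seconds) at which the modelled CR runs.

    Toy model (assumption): each run pushes a fresh timer
    INCREMENTAL_INTERVAL seconds ahead and never cancels pending timers.
    """
    events = list(external_wakeups)  # out-of-band notifications
    heapq.heapify(events)
    heapq.heappush(events, 0)        # the initial run at t = 0
    runs = []
    while events:
        now = heapq.heappop(events)
        if now >= duration:
            break                    # everything left is past the horizon
        runs.append(now)
        # The CR re-arms a timer before yielding, whatever woke it up.
        heapq.heappush(events, now + INCREMENTAL_INTERVAL)
    return runs


idle = simulate([], 100)
busy = simulate([5, 11], 100)
print(len(idle), len(busy))
```

In this model, two out-of-band wakeups early on are enough to triple the number of runs in every later 20-second window, because each of them starts its own chain of timers.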

This problem can be reproduced as follows:
1. Set rgw_data_log_num_shards = 1 and debug_rgw = 20.
2. Run `tail -f /var/log/ceph/ceph-client.rgw.1.log | grep "shard_id=0"`; this prints log lines every 20 seconds.
3. Upload one file.
4. Observe the output from step 2; it now prints more log lines than before.
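
To put a number on step 4, the matching log lines can be counted per 20-second window. The following is a rough sketch, not a Ceph tool: the helper `wakeups_per_window` is hypothetical, and it assumes that each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp, which may need adjusting for the log format in use.

```python
from collections import Counter
from datetime import datetime

WINDOW = 20  # seconds; matches INCREMENTAL_INTERVAL from the report


def wakeups_per_window(lines, needle="shard_id=0"):
    """Count log lines containing `needle` in consecutive WINDOW-second buckets."""
    counts = Counter()
    first = None
    for line in lines:
        if needle not in line:
            continue
        try:
            # Assumed format: the line starts with "YYYY-MM-DD HH:MM:SS".
            stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a leading timestamp
        if first is None:
            first = stamp
        bucket = int((stamp - first).total_seconds()) // WINDOW
        counts[bucket] += 1
    return dict(counts)
```

It can be fed the saved output of step 2, for example `wakeups_per_window(open("rgw.log"))` with a file name of your choosing. A healthy idle shard should show roughly the same small count in every bucket, while the behaviour reported here shows counts that grow after the upload.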

Actions #2

Updated by Xinying Song over 5 years ago

Since class RGWCompletionManager in the master branch adds a new struct rgw_io_id, which is absent in luminous, the backport cannot be done by a plain cherry-pick. Could anyone give some suggestions? Thanks.

Actions #3

Updated by Nathan Cutler over 5 years ago

  • Backport set to mimic,luminous

It's too early to backport. The master PR needs to merge first; otherwise it's impossible to guarantee that the cherry-pick will be from a commit that is in master.

Actions #4

Updated by Nathan Cutler over 5 years ago

  • Status changed from New to Fix Under Review