rgw: bucket index sporadically reshards to 65521 shards
With rgw_dynamic_resharding=true, the indexes of about 10% of buckets were each resharded to 65521 shards.
The buckets contained 200-400 objects or fewer.
Even after we disabled dynamic resharding, one bucket index was still resharded two days into the test run, also to 65521 shards.
#7 Updated by Aleksei Gutikov 2 months ago
Orit Wasserman wrote:
Can you provide rgw logs?
Unfortunately, no logs are left from the last test run.
We fixed the indexes with 'radosgw-admin bucket reshard'.
I think we will take 12.1.2, enable dynamic resharding and debug logging, and repeat the test run.
Regarding debug: every resharding debug message I can find is logged at level 20, for example in RGWRados::check_bucket_shards.
We can't set any debug level to 20 because logging on our test cluster cannot handle that volume.
Could you provide a patch that logs the messages you need at level 0, or point to a branch with such changes?
#9 Updated by Aleksei Gutikov 2 months ago
- File rgw.log.1.gz added
Bug reproduced with v12.1.4.
Part of the radosgw logs is attached.
I increased the log level of the resharding-related messages, and here is what I see:
Aug 18 14:54:27 P20B-SR4-R1-CEPH-DS06 radosgw: 2017-08-18 14:54:27.408767 7f98208ab700 0 check_bucket_shards: resharding needed: stats.num_objects=18446744073709551609 shard max_objects=100000
Aug 18 14:54:27 P20B-SR4-R1-CEPH-DS06 radosgw: 2017-08-18 14:54:27.408793 7f98208ab700 0 check_bucket_shards bucket fast-stream-57 need resharding old num shards 0 new num shards 2890341191
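A note on the numbers in these two lines: 18446744073709551609 is 2^64 - 7, which looks like an unsigned 64-bit object counter that was decremented below zero. The sketch below checks that reading. The formula in suggested_shards() is an assumption, not taken from the rgw source: it supposes the suggested shard count is num_objects * 2 / max_objects, computed in unsigned 64-bit arithmetic and then truncated to 32 bits. Under that assumption the arithmetic yields exactly the logged 2890341191. The function names are hypothetical.

```python
U64 = 1 << 64  # modulus of an unsigned 64-bit integer
U32 = 1 << 32  # modulus of an unsigned 32-bit integer


def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as a signed 64-bit integer."""
    return value - U64 if value >= (1 << 63) else value


def suggested_shards(num_objects, max_objects_per_shard):
    """Hypothetical reconstruction of the shard suggestion.

    ASSUMPTION: num_objects * 2 / max_objects_per_shard, evaluated with
    64-bit wraparound and then truncated to 32 bits. This is a guess that
    happens to match the log, not the confirmed rgw code.
    """
    doubled = (num_objects * 2) % U64           # wraps like uint64_t
    return (doubled // max_objects_per_shard) % U32  # truncates like uint32_t


logged_num_objects = 18446744073709551609  # from the log line above
print(as_signed64(logged_num_objects))
print(suggested_shards(logged_num_objects, 100000))
```

If the assumption holds, the bogus shard count is a side effect of the wrapped counter, and the real question is why num_objects goes negative for these buckets.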
#10 Updated by Orit Wasserman 2 months ago
I created https://github.com/oritwas/ceph/tree/wip-rgw-resharding-debug
with extra resharding debug output that you can use.
#11 Updated by Aleksei Gutikov 2 months ago
Log from your branch:
2017-08-21 16:55:40.236138 7f4d3bd44700 0 check_bucket_shards bucket debian.thread-14.s3-test-bucket need resharding 0 old num shards 0 new num shards 45
2017-08-21 16:55:40.283410 7f4d3bd44700 0 check_bucket_shards: stats.num_objects=4 num_objs 1 num_shards 1 shard max_objects=100000
2017-08-21 16:55:40.283604 7f4d3bd44700 0 check_bucket_shards bucket debian.thread-14.s3-test-bucket need resharding 0 old num shards 0 new num shards 45
2017-08-21 16:55:43.278628 7f4d1b503700 0 check_bucket_shards: stats.num_objects=18446744073709551613 num_objs 1 num_shards 1 shard max_objects=100000
2017-08-21 16:55:43.278719 7f4d1b503700 0 check_bucket_shards: resharding needed: stats.num_objects=18446744073709551613 shard max_objects=100000
2017-08-21 16:55:43.278736 7f4d1b503700 0 check_bucket_shards bucket debian.thread-55.s3-test-bucket need resharding 1 old num shards 0 new num shards 2890341191
2017-08-21 16:55:43.278753 7f4d1b503700 0 add_bucket_to_reshard bucket =debian.thread-55.s3-test-bucket, orig_num=1, new_num_shards=65521
2017-08-21 16:55:46.182566 7f4d41756700 0 could not get bucket info for bucket=debian.thread-13.s3-test-bucket[71cdbda3-1ff8-470f-a65c-0712ea420854.4157.1]) r=-2
2017-08-21 16:55:46.182574 7f4d41756700 0 WARNING: sync_bucket() returned r=-2
2017-08-21 16:55:46.182849 7f4d41756700 0 could not get bucket info for bucket=debian.thread-14.s3-test-bucket[71cdbda3-1ff8-470f-a65c-0712ea420854.4156.4]) r=-2
2017-08-21 16:55:46.182852 7f4d41756700 0 WARNING: sync_bucket() returned r=-2
2017-08-21 16:55:46.183460 7f4d41756700 0 could not get bucket info for bucket=debian.thread-15.s3-test-bucket[71cdbda3-1ff8-470f-a65c-0712ea420854.4155.5]) r=-2
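For anyone going through a larger log, here is a small sketch that flags check_bucket_shards lines whose stats.num_objects looks like a wrapped counter (at or above 2^63) and prints the value as a signed number. It relies only on the line format shown above. The helper names are hypothetical, and the 65521 cap in clamp_shards() is inferred from the add_bucket_to_reshard line, where a suggested 2890341191 was logged as new_num_shards=65521; it is not taken from the rgw source.

```python
import re

# Matches the counter as printed in the check_bucket_shards lines above.
NUM_OBJECTS_RE = re.compile(r"stats\.num_objects=(\d+)")

# ASSUMPTION: upper bound inferred from new_num_shards=65521 in the log.
MAX_SHARDS = 65521


def find_wrapped_counters(lines):
    """Return (raw, signed) pairs for num_objects values >= 2**63."""
    found = []
    for line in lines:
        match = NUM_OBJECTS_RE.search(line)
        if match is None:
            continue
        raw = int(match.group(1))
        if raw >= (1 << 63):
            found.append((raw, raw - (1 << 64)))
    return found


def clamp_shards(suggested):
    """Cap a suggested shard count at the inferred maximum."""
    return min(suggested, MAX_SHARDS)


sample = [
    "0 check_bucket_shards: stats.num_objects=4 num_objs 1 num_shards 1 shard max_objects=100000",
    "0 check_bucket_shards: stats.num_objects=18446744073709551613 num_objs 1 num_shards 1 shard max_objects=100000",
    "0 add_bucket_to_reshard bucket =debian.thread-55.s3-test-bucket, orig_num=1, new_num_shards=65521",
]

for raw, signed in find_wrapped_counters(sample):
    print(f"num_objects={raw} reads as {signed} when treated as signed")
print(clamp_shards(2890341191))
```

In the log above, this reading gives num_objects = -3 for debian.thread-55.s3-test-bucket right before it was queued for resharding to 65521 shards.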