Bug #52776
Bucket resharding takes too long, and putting objects fails during the reshard
Status: Open
Description
There are 50 million objects in the bucket, and the bucket index needs to be resharded to 1024 shards. The reshard takes too long, and putting objects fails while it runs.
ceph version: 14.2.8
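For reference, the reshard described above is triggered manually with radosgw-admin; a minimal sketch (the bucket name is a placeholder), matching the commands used in the tests later in this thread:

# manually reshard the bucket index to 1024 shards
radosgw-admin bucket reshard --bucket=<bucket> --num-shards=1024 --yes-i-really-mean-it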
Updated by Casey Bodley over 2 years ago
- Status changed from New to Need More Info
- Tags set to reshard
This sounds like a relatively small bucket to have such performance issues. Are the index pools on SSD/NVMe? How long does the reshard take to complete?
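(A hedged sketch of how to answer both questions; the index pool name below is the zone default and may differ in your cluster, and <bucket> is a placeholder:)

# which CRUSH rule backs the index pool, and what device class it maps to
ceph osd pool get default.rgw.buckets.index crush_rule
ceph osd crush rule dump

# time the reshard, and watch its progress from another shell
time radosgw-admin bucket reshard --bucket=<bucket> --num-shards=1024 --yes-i-really-mean-it
radosgw-admin reshard status --bucket=<bucket>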
Updated by Mark Kogan over 2 years ago
posting performance results of resharding a 50M-object bucket from 1 to 1024 shards in a vstart environment
summary of performance in example environments:
elapsed time was ~6 minutes on the SSD system and ~20 minutes on the HDD system (the HDD system also reported BlueFS spillover, which the SSD system did not)
(* performance in other environments may vary depending on cluster load, (deep-)scrub, backfill, etc.)
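(A quick way to confirm the BlueFS spillover warning mentioned above, assuming ceph CLI access:)

# HEALTH_WARN details; BlueFS spillover is listed here when present
ceph health detail | grep -i spillover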
# objects were written with:
numactl -N 1 -m 1 -- ~/go/bin/hsbench -a b2345678901234567890 -s b234567890123456789012345678901234567890 -u http://127.0.0.1:8000 -z 4K -d -1 -t $(numactl -N 1 -- nproc) -b 1 -n 50000000 -m cxip -bp b01b |& tee hsbench.log
# silvertip - hdd & 14.2.8

git branch -vv
* (no branch) 2d095e947a0 14.2.8

sudo ./bin/ceph status
  cluster:
    id:     bac3a36d-eed8-4460-8619-5951395eb416
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set
            BlueFS spillover detected on 1 OSD(s)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  services:
    mon: 1 daemons, quorum a (age 43h)
    mgr: x(active, since 43h)
    osd: 1 osds: 1 up (since 43h), 1 in (since 43h)
         flags noscrub,nodeep-scrub
    rgw: 1 daemon active (8000)
  data:
    pools:   6 pools, 768 pgs
    objects: 50.00M objects, 191 GiB
    usage:   296 GiB used, 235 GiB / 531 GiB avail
    pgs:     768 active+clean

watch -cd "timeout 4s sudo ./bin/ceph df 2>/dev/null"
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
    hdd       531 GiB     235 GiB     221 GiB      296 GiB         55.77
    TOTAL     531 GiB     235 GiB     221 GiB      296 GiB         55.77

POOLS:
    POOL                          ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
    .rgw.root                      1     1.2 KiB           4      16 KiB         0       229 GiB
    default.rgw.control            2         0 B           8         0 B         0       229 GiB
    default.rgw.meta               3     4.0 KiB          22      84 KiB         0       229 GiB
    default.rgw.log                4         0 B         191         0 B         0       229 GiB
    default.rgw.buckets.index      5         0 B           2         0 B         0       229 GiB
    default.rgw.buckets.data       6     191 GiB      50.00M     191 GiB     45.40       229 GiB

sudo ./bin/radosgw-admin bucket stats --bucket=b01b000000000000 2>/dev/null | grep num
    "num_shards": 1,
    "num_objects": 50000000

fallocate -l 1M ./1M.dat

sudo time ./bin/radosgw-admin bucket reshard --bucket=b01b000000000000 --num-shards=1024 --yes-i-really-mean-it
...
50000000
50000000
2021-10-03 11:25:43.064 7fffbffff700 -1 RGWWatcher::handle_error cookie 93825003807488 err (107) Transport endpoint is not connected
2021-10-03 11:25:53.393 7fffedff5840  1 execute INFO: reshard of bucket "b01b000000000000" from "b01b000000000000:801862d0-a8fe-4809-9a76-68df48767f90.4173.2" to "b01b000000000000:801862d0-a8fe-4809-9a76-68df48767f90.183023.1" completed successfully
415.24user 52.77system 20:22.05elapsed 38%CPU (0avgtext+0avgdata 157136maxresident)k
                       ^^^^^
6032inputs+8outputs (4major+186345minor)pagefaults 0swaps

time s3cmd put 1M.dat s3://b01b000000000000
upload: '1M.dat' -> 's3://b01b000000000000/1M.dat'  [1 of 1]
 1048576 of 1048576   100% in    0s     8.12 MB/s  failed
WARNING: Upload failed: /1M.dat (timed out)
WARNING: Waiting 3 sec...
upload: '1M.dat' -> 's3://b01b000000000000/1M.dat'  [1 of 1]
 1048576 of 1048576   100% in    0s    83.06 MB/s  failed
WARNING: Upload failed: /1M.dat (timed out)
WARNING: Waiting 6 sec...
upload: '1M.dat' -> 's3://b01b000000000000/1M.dat'  [1 of 1]
 1048576 of 1048576   100% in    0s    82.34 MB/s  failed
WARNING: Upload failed: /1M.dat (timed out)
WARNING: Waiting 9 sec...
upload: '1M.dat' -> 's3://b01b000000000000/1M.dat'  [1 of 1]
 1048576 of 1048576   100% in    0s    47.62 MB/s  failed
WARNING: Upload failed: /1M.dat (timed out)
WARNING: Waiting 12 sec...
upload: '1M.dat' -> 's3://b01b000000000000/1M.dat'  [1 of 1]
 1048576 of 1048576   100% in    0s    21.36 MB/s  done
s3cmd put 1M.dat s3://b01b000000000000  0.20s user 0.05s system 0% cpu 20:30.73 total
                                                                 ^^^^^

sudo ./bin/radosgw-admin bucket stats --bucket=b01b000000000000 2>/dev/null | grep num
    "num_shards": 1024,
    "num_objects": 50000001
# sepia o07 - ssd & master

git branch -vv
* master 6939ea034a2 [origin/master: behind 12] Merge PR #43323 into master

sudo ./bin/radosgw-admin bucket stats --bucket=b01b000000000000 2>/dev/null | grep num
    "num_shards": 1,
    "num_objects": 50000000

sudo time ./bin/radosgw-admin bucket reshard --bucket=b01b000000000000 --num-shards=1024 --yes-i-really-mean-it
...
49979000
49980000
49981000
49982000
49983000
49984000
49985000
49986000
49987000
49988000
49989000
49990000
49991000
49992000
49993000
49994000
49995000
49996000
49997000
49998000
49999000
50000000
50000000
2021-10-04T07:50:43.476+0000 7ffff7e3ca80  1 execute INFO: reshard of bucket "b01b000000000000" from "b01b000000000000:e6369cbb-16f7-45e2-a904-dd2c5e69b8ed.4169.2" to "b01b000000000000:e6369cbb-16f7-45e2-a904-dd2c5e69b8ed.79935.1" completed successfully
245.17user 47.52system 6:26.06elapsed 75%CPU (0avgtext+0avgdata 173156maxresident)k
                       ^^^^
25848inputs+16outputs (34major+153351minor)pagefaults 0swaps

time s3cmd put 1M.dat s3://b01b000000000000
upload: '1M.dat' -> 's3://b01b000000000000/1M.dat'  [1 of 1]
 1048576 of 1048576   100% in    0s   165.84 MB/s  failed
WARNING: Upload failed: /1M.dat (timed out)
WARNING: Waiting 3 sec...
upload: '1M.dat' -> 's3://b01b000000000000/1M.dat'  [1 of 1]
 1048576 of 1048576   100% in   85s    12.03 KB/s  done
s3cmd put 1M.dat s3://b01b000000000000  0.12s user 0.02s system 0% cpu 6:28.30 total
                                                                ^^^^

sudo ./bin/radosgw-admin bucket stats --bucket=b01b000000000000 2>/dev/null | grep num
    "num_shards": 1024,
    "num_objects": 50000001

sudo ./bin/ceph status
  cluster:
    id:     37cc56db-397f-498c-ad99-18521fde7c26
    health: HEALTH_WARN
            12 mgr modules have failed dependencies
            noscrub,nodeep-scrub flag(s) set
            6 pool(s) have no replicas configured
  services:
    mon: 1 daemons, quorum a (age -499183281)
    mgr: x(active, since 23h)
    osd: 1 osds: 1 up (since 23h), 1 in (since 23h)
         flags noscrub,nodeep-scrub
    rgw: 1 daemon active (1 hosts, 1 zones)
  data:
    pools:   6 pools, 768 pgs
    objects: 50.00M objects, 191 GiB
    usage:   537 GiB used, 1.5 TiB / 2.0 TiB avail
    pgs:     768 active+clean
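The failed/retry pattern in the s3cmd output above is s3cmd's own built-in backoff; an equivalent minimal shell sketch (assuming the same test file and bucket as above) for retrying a put until the reshard completes:

# retry the upload with increasing backoff until it succeeds
for delay in 3 6 12 24 48; do
    s3cmd put 1M.dat s3://b01b000000000000 && break
    echo "upload failed, retrying in ${delay}s"
    sleep "${delay}"
done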
Updated by J. Eric Ivancich over 2 years ago
@Huber Ming -- Are you able to provide the "need more info"?
Updated by Huber ming over 2 years ago
As @Mark Kogan said, the elapsed time was ~6 minutes on the SSD system (resharding a 50M-object bucket from 1 to 1024 shards), and all object puts failed during this time.
Is there any method to put objects into a bucket while it is being resharded?