Bug #47174
[BlueStore] Pool/PG deletion (space reclamation) is very slow
Status: Closed
Description
Version: 14.2.8; also reproduced in 12.2.12.
- We used COSBench to fill the cluster with an RGW workload. Five buckets were filled with the following object-size distribution:
config="containers=r(1,5);objects=r(1,374000);sizes=h(1|1|50,64|64|15,8192|8192|15,65536|65536|15,1048576|1048576|5)KB" >
- Then buckets 6 through 35 (bringing the total to 35 buckets) were filled with a fixed 256 KB object size:
config="containers=u(6,15);objects=u(1,1000000);sizes=c(256)KB" config="containers=u(16,25);objects=u(1,1000000);sizes=c(256)KB" config="containers=u(26,35);objects=u(1,1000000);sizes=c(256)KB"
- Space-reclamation status observed during the above workload testing:
Thu Aug 27 13:16:27 UTC 2020
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
    hdd       361 TiB     202 TiB     158 TiB     159 TiB          43.96
    TOTAL     361 TiB     202 TiB     158 TiB     159 TiB          43.96

Thu Aug 27 16:57:09 UTC 2020
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
    hdd       361 TiB     283 TiB     77 TiB      78 TiB           21.49
    TOTAL     361 TiB     283 TiB     77 TiB      78 TiB           21.49
- Going by the stats above, and in general with these object sizes, we have seen that it takes 8+ hours to reclaim the space back to where it was before the workload started.
- The reclamation time increases further when the cluster is filled with more small objects, for example:
config="containers=u(1,5);objects=u(1,50000000);sizes=h(1|2|25,2|4|40,4|8|25,8|256|10)KB"
- It looks like the more tiny objects there are, the longer it takes to reclaim space after pool/PG deletion.
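One way to watch per-OSD deletion progress while this is happening - a minimal sketch, assuming this Nautilus build exposes the numpg_removing perf counter and that jq is installed:

# number of PGs the OSD is still removing (counter presence is an assumption)
ceph daemon osd.190 perf dump | jq '.osd.numpg_removing'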
- The cluster has 192 OSDs; the DB is on NVMe and the block device is on HDD.
+ DB partition is 62G on NVMe
+ OSD block disk is 1.8T
+ Each node has 24 OSDs
+ 12 OSD DBs share one NVMe, so two NVMe disks in total
+ 251G total RAM in each node
+ 56-core CPU
+ From one of the OSD nodes:
# uptime
17:24:24 up 2 days, 4:55, 2 users, load average: 81.69, 81.59, 81.82
top - 17:25:35 up 2 days, 4:56, 2 users, load average: 83.08, 82.01, 81.95
Tasks: 644 total, 1 running, 643 sleeping, 0 stopped, 0 zombie
%Cpu(s): 6.9 us, 2.8 sy, 0.0 ni, 52.5 id, 37.7 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem : 26385555+total, 69740848 free, 18721012+used, 6904572 buff/cache
KiB Swap: 524284 total, 524284 free, 0 used. 66270048 avail Mem

   PID USER PR NI    VIRT  RES   SHR S %CPU %MEM     TIME+ COMMAND
 84346 ceph 20  0 9735624 7.4g 22612 S 41.2  2.9 489:58.54 /usr/bin/ceph-osd -f --cluster ceph --id 173 --setuser ceph --setgroup ceph
 84266 ceph 20  0 9649848 7.4g 22448 S 29.4  2.9 394:04.31 /usr/bin/ceph-osd -f --cluster ceph --id 181 --setuser ceph --setgroup ceph
 84315 ceph 20  0 9559208 7.4g 22624 S 29.4  2.9 475:31.08 /usr/bin/ceph-osd -f --cluster ceph --id 75 --setuser ceph --setgroup ceph
 84357 ceph 20  0 9622716 7.4g 22636 S 29.4  2.9 413:25.13 /usr/bin/ceph-osd -f --cluster ceph --id 117 --setuser ceph --setgroup ceph
 84281 ceph 20  0 9798076 7.4g 22696 S 23.5  2.9 450:28.81 /usr/bin/ceph-osd -f --cluster ceph --id 52 --setuser ceph --setgroup ceph
 84291 ceph 20  0 9796536 7.4g 22688 S 23.5  2.9 341:54.12 /usr/bin/ceph-osd -f --cluster ceph --id 7 --setuser ceph --setgroup ceph
 84302 ceph 20  0 9716028 7.4g 22792 S 23.5  2.9 438:46.45 /usr/bin/ceph-osd -f --cluster ceph --id 21 --setuser ceph --setgroup ceph
 84313 ceph 20  0 9694608 7.3g 22644 S 23.5  2.9 346:54.91 /usr/bin/ceph-osd -f --cluster ceph --id 141 --setuser ceph --setgroup ceph
 84317 ceph 20  0 9736432 7.4g 22676 S 23.5  2.9 459:17.50 /usr/bin/ceph-osd -f --cluster ceph --id 30 --setuser ceph --setgroup ceph
 84336 ceph 20  0 9633048 7.4g 22784 S 23.5  2.9 469:43.33 /usr/bin/ceph-osd -f --cluster ceph --id 107 --setuser ceph --setgroup ceph
 84342 ceph 20  0 9402956 7.4g 22640 S 23.5  2.9 452:44.92 /usr/bin/ceph-osd -f --cluster ceph --id 46 --setuser ceph --setgroup ceph
 84351 ceph 20  0 9543764 7.4g 22604 S 23.5  2.9 334:12.42 /usr/bin/ceph-osd -f --cluster ceph --id 125 --setuser ceph --setgroup ceph
 84362 ceph 20  0 9514564 7.4g 22556 S 23.5  2.9 341:06.37 /usr/bin/ceph-osd -f --cluster ceph --id 91 --setuser ceph --setgroup ceph
 87544 ceph 20  0 9642692 7.4g 22600 S 23.5  2.9 345:25.25 /usr/bin/ceph-osd -f --cluster ceph --id 68 --setuser ceph --setgroup ceph
 84247 ceph 20  0 9511292 7.4g 22680 S 17.6  2.9 405:16.52 /usr/bin/ceph-osd -f --cluster ceph --id 164 --setuser ceph --setgroup ceph
 84254 ceph 20  0 9597888 7.4g 22604 S 17.6  2.9 404:31.53 /usr/bin/ceph-osd -f --cluster ceph --id 99 --setuser ceph --setgroup ceph
 84323 ceph 20  0 9652036 7.4g 22592 S 17.6  2.9 489:35.24 /usr/bin/ceph-osd -f --cluster ceph --id 190 --setuser ceph --setgroup ceph
 84341 ceph 20  0 9624968 7.4g 22484 S 17.6  2.9 369:05.30 /usr/bin/ceph-osd -f --cluster ceph --id 13 --setuser ceph --setgroup ceph
 84353 ceph 20  0 9656128 7.4g 22648 S 17.6  2.9 391:18.63 /usr/bin/ceph-osd -f --cluster ceph --id 83 --setuser ceph --setgroup ceph
 84354 ceph 20  0 9419564 7.4g 22628 S 17.6  2.9 365:26.11 /usr/bin/ceph-osd -f --cluster ceph --id 38 --setuser ceph --setgroup ceph
154813 root 20  0  162696 2856  1612 R 17.6  0.0   0:00.04 top -c -n1
 84277 ceph 20  0 9713448 7.4g 22472 S 11.8  2.9 400:50.81 /usr/bin/ceph-osd -f --cluster ceph --id 156 --setuser ceph --setgroup ceph
 84320 ceph 20  0 9736748 7.4g 22636 S 11.8  2.9 409:17.71 /usr/bin/ceph-osd -f --cluster ceph --id 148 --setuser ceph --setgroup ceph
 84355 ceph 20  0 9749624 6.8g 22588 S 11.8  2.7 359:17.82 /usr/bin/ceph-osd -f --cluster ceph --id 133 --setuser ceph --setgroup ceph
 84314 ceph 20  0 9561324 6.3g 22576 S  5.9  2.5 391:15.68 /usr/bin/ceph-osd -f --cluster ceph --id 59 --setuser ceph --setgroup ceph
Device   r/s      w/s   rkB/s      wkB/s  rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
nvme0n1  28250.00 31.00 2369532.00 244.00 0.00   0.00   0.00  0.00  2.89    2.26    83.16  83.88    7.87     0.04  100.70
nvme1n1  28957.00 36.00 2411584.00 288.00 0.00   0.00   0.00  0.00  2.64    1.81    78.04  83.28    8.00     0.03  101.00
sda      0.00     0.00  0.00       0.00   0.00   0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00
sdb      0.00     0.00  0.00       0.00   0.00   0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00
dm-0     0.00     0.00  0.00       0.00   0.00   0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00
dm-1     0.00     0.00  0.00       0.00   0.00   0.00   0.00  0.00  0.00    0.00    0.00   0.00     0.00     0.00  0.00
dm-2     2776.00  4.00  232480.00  32.00  0.00   0.00   0.00  0.00  2.89    3.00    8.04   83.75    8.00     0.36  100.00
dm-3     2849.00  2.00  241796.00  16.00  0.00   0.00   0.00  0.00  2.81    2.00    8.01   84.87    8.00     0.35  100.00
dm-4     2259.00  1.00  177868.00  8.00   0.00   0.00   0.00  0.00  2.82    2.00    6.37   78.74    8.00     0.44  100.00
dm-5     2499.00  3.00  209644.00  24.00  0.00   0.00   0.00  0.00  2.93    1.00    7.32   83.89    8.00     0.40  100.00
dm-6     3115.00  7.00  260588.00  56.00  0.00   0.00   0.00  0.00  2.84    3.29    8.89   83.66    8.00     0.32  100.00
dm-7     2293.00  3.00  191508.00  24.00  0.00   0.00   0.00  0.00  2.89    1.00    6.64   83.52    8.00     0.44  100.00
dm-8     2592.00  1.00  221416.00  12.00  0.00   0.00   0.00  0.00  2.97    1.00    7.71   85.42    12.00    0.39  100.00
dm-9     690.00   1.00  58056.00   8.00   0.00   0.00   0.00  0.00  2.86    1.00    1.98   84.14    8.00     1.38  95.50
dm-10    2176.00  3.00  181016.00  16.00  0.00   0.00   0.00  0.00  2.90    1.33    6.30   83.19    5.33     0.46  100.00
dm-11    1462.00  0.00  125620.00  0.00   0.00   0.00   0.00  0.00  2.82    0.00    4.12   85.92    0.00     0.68  100.00
dm-12    3186.00  4.00  268904.00  32.00  0.00   0.00   0.00  0.00  2.91    3.00    9.30   84.40    8.00     0.31  100.00
dm-13    2379.00  1.00  203524.00  8.00   0.00   0.00   0.00  0.00  2.95    2.00    7.04   85.55    8.00     0.42  100.00
dm-14    1801.00  4.00  150780.00  32.00  0.00   0.00   0.00  0.00  2.70    1.25    4.87   83.72    8.00     0.55  99.90
dm-15    2654.00  2.00  220952.00  16.00  0.00   0.00   0.00  0.00  2.60    2.50    6.92   83.25    8.00     0.38  100.00
dm-16    2364.00  7.00  197504.00  56.00  0.00   0.00   0.00  0.00  2.63    1.29    6.22   83.55    8.00     0.42  99.90
dm-17    3057.00  7.00  225848.00  56.00  0.00   0.00   0.00  0.00  2.58    2.43    7.89   73.88    8.00     0.33  100.00
dm-18    963.00   0.00  83284.00   0.00   0.00   0.00   0.00  0.00  2.75    0.00    2.65   86.48    0.00     1.02  98.60
dm-19    2823.00  4.00  236264.00  32.00  0.00   0.00   0.00  0.00  2.61    1.50    7.38   83.69    8.00     0.35  99.90
dm-20    3670.00  3.00  311164.00  24.00  0.00   0.00   0.00  0.00  2.62    1.67    9.61   84.79    8.00     0.27  99.90
dm-21    1604.00  0.00  136908.00  0.00   0.00   0.00   0.00  0.00  2.79    0.00    4.48   85.35    0.00     0.62  100.00
dm-22    2286.00  5.00  192028.00  40.00  0.00   0.00   0.00  0.00  2.56    2.40    5.87   84.00    8.00     0.44  100.00
dm-23    2817.00  3.00  237992.00  24.00  0.00   0.00   0.00  0.00  2.63    1.33    7.40   84.48    8.00     0.35  99.90
dm-24    3108.00  1.00  263720.00  8.00   0.00   0.00   0.00  0.00  2.69    3.00    8.38   84.85    8.00     0.32  100.10
dm-25    1809.00  0.00  155404.00  0.00   0.00   0.00   0.00  0.00  2.74    0.00    4.95   85.91    0.00     0.55  99.90
- OSD memory target value:
osd memory target = 7880472439
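For context, a value like this normally lives in ceph.conf or the monitor config store; a minimal sketch of setting the same value at runtime (Nautilus-style config store, value copied from above):

# apply the ~7.3 GiB per-OSD memory target cluster-wide
ceph config set osd osd_memory_target 7880472439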
- Cluster status (zero client I/O):
# ceph -s
  cluster:
    id:     78dbd3fd-75z2-47a7-848a-0137ba6ace9a
    health: HEALTH_WARN
            5 daemons have recently crashed
            2 slow ops, oldest one blocked for 84839 sec, daemons [osd.147,mon.f18-h09-000-r620] have slow ops.

  services:
    mon: 3 daemons, quorum mon1,mon2,mon3 (age 5d)
    mgr: mon1(active, since 5d), standbys: mon2, mon3
    osd: 192 osds: 192 up (since 109m), 192 in (since 7h)

  data:
    pools:   16 pools, 7040 pgs
    objects: 49 objects, 6.2 KiB
    usage:   74 TiB used, 287 TiB / 361 TiB avail
    pgs:     7032 active+clean
             7    active+clean+scrubbing
             1    active+clean+scrubbing+deep

  io:
    client: 253 B/s rd, 0 op/s rd, 0 op/s wr
- The data pool is an EC pool (4+2):
# ceph osd erasure-code-profile get myprofile
crush-device-class=
crush-failure-domain=osd
crush-root=default
jerasure-per-chunk-alignment=false
k=4
m=2
plugin=jerasure
technique=reed_sol_van
w=8
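For anyone reproducing the setup, a profile like the one above would typically be created along these lines - a sketch only; the pool name and PG count are hypothetical, not taken from this cluster:

ceph osd erasure-code-profile set myprofile k=4 m=2 plugin=jerasure technique=reed_sol_van crush-failure-domain=osd
ceph osd pool create ecpool 1024 1024 erasure myprofile   # 'ecpool' and 1024 PGs are illustrative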
Updated by Vikhyat Umrao over 3 years ago
I have selected one OSD to provide the debug log information. Here is perf top from the same OSD.
ceph 84323 1 28 Aug26 ? 08:11:24 /usr/bin/ceph-osd -f --cluster ceph --id 190 --setuser ceph --setgroup ceph
# perf top -p 84323
 7.88%  libc-2.17.so        [.] __memcmp_sse4_1
 7.31%  ceph-osd            [.] rocksdb::DataBlockIter::SeekToFirst
 6.79%  ceph-osd            [.] rocksdb::MergingIterator::Next
 6.07%  ceph-osd            [.] rocksdb::DataBlockIter::Next
 5.91%  libc-2.17.so        [.] __memcpy_ssse3_back
 5.33%  ceph-osd            [.] rocksdb_cache::BinnedLRUHandleTable::FindP
 2.73%  libpthread-2.17.so  [.] pthread_mutex_unlock
 2.62%  ceph-osd            [.] rocksdb::DBIter::FindNextUserEntryInternal
 2.23%  ceph-osd            [.] rocksdb::Block::NewIterator<rocksdb::DataB
 1.64%  ceph-osd            [.] rocksdb::BlockBasedTableIterator<rocksdb::
 1.17%  [kernel]            [k] irq_return
 1.06%  [kernel]            [k] do_numa_page
 0.95%  libpthread-2.17.so  [.] pthread_mutex_lock
 0.94%  [kernel]            [k] gup_pte_range
 0.85%  [kernel]            [k] error_swapgs
 0.79%  [kernel]            [k] _raw_qspin_lock
 0.77%  ceph-osd            [.] rocksdb_cache::BinnedLRUCacheShard::LRU_In
 0.70%  [kernel]            [k] page_waitqueue
 0.70%  ceph-osd            [.] rocksdb::BlockBasedTable::NewDataBlockIter
 0.70%  [kernel]            [k] change_pte_range
I have also collected logs for osd.190 at the following debug levels and will upload them.
ceph tell osd.190 injectargs '--debug_osd 20 --debug_bluestore 20 --debug_bluefs 20 --debug_rocksdb 20'
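Once collection is done, the levels can be restored the same way; a minimal sketch, assuming the stock Nautilus defaults (1/5 for osd/bluestore/bluefs and 4/5 for rocksdb - stated as an assumption, not taken from this ticket):

# drop the debug levels back to (assumed) defaults after capturing the logs
ceph tell osd.190 injectargs '--debug_osd 1/5 --debug_bluestore 1/5 --debug_bluefs 1/5 --debug_rocksdb 4/5'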
# ceph osd metadata osd.190
{
    "id": 190,
    "arch": "x86_64",
    "back_addr": "[v2:192.168.45.91:6850/84323,v1:192.168.45.91:6854/84323]",
    "back_iface": "enp4s0f1",
    "bluefs": "1",
    "bluefs_db_access_mode": "blk",
    "bluefs_db_block_size": "4096",
    "bluefs_db_dev_node": "/dev/dm-25",
    "bluefs_db_driver": "KernelDevice",
    "bluefs_db_partition_path": "/dev/dm-25",
    "bluefs_db_rotational": "0",
    "bluefs_db_size": "66676850688",
    "bluefs_db_support_discard": "1",
    "bluefs_db_type": "ssd",
    "bluefs_single_shared_device": "0",
    "bluestore_bdev_access_mode": "blk",
    "bluestore_bdev_block_size": "4096",
    "bluestore_bdev_dev_node": "/dev/dm-47",
    "bluestore_bdev_driver": "KernelDevice",
    "bluestore_bdev_partition_path": "/dev/dm-47",
    "bluestore_bdev_rotational": "1",
    "bluestore_bdev_size": "1999839952896",
    "bluestore_bdev_support_discard": "0",
    "bluestore_bdev_type": "hdd",
    "ceph_release": "nautilus",
    "ceph_version": "ceph version 14.2.8-89.el7cp (9ab115d618c72e7d9227441ec25ceb1487c76fb8) nautilus (stable)",
    "ceph_version_short": "14.2.8-89.el7cp",
    "cpu": "Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz",
    "default_device_class": "hdd",
    "device_ids": "nvme1n1=INTEL_SSDPEDMD800G4_CVFT6172003T800CGN,sdz=AVAGO_SMC3108_0002e4072e8ffcb0260199b11a800403",
    "devices": "nvme1n1,sdz",
    "distro": "rhel",
    "distro_description": "Red Hat Enterprise Linux Server 7.8 (Maipo)",
    "distro_version": "7.8",
    "front_addr": "[v2:192.168.45.91:6850/84323,v1:192.168.45.91:6854/84323]",
    "front_iface": "enp4s0f0",
    "hb_back_addr": "[v2:192.168.45.91:6858/84323,v1:192.168.45.91:6863/84323]",
    "hb_front_addr": "[v2:192.168.45.91:6859/84323,v1:192.168.45.91:6864/84323]",
    "hostname": "osdnode1.redhat.com",
    "journal_rotational": "0",
    "kernel_description": "#1 SMP Fri Jun 12 14:34:17 EDT 2020",
    "kernel_version": "3.10.0-1127.13.1.el7.x86_64",
    "mem_swap_kb": "524284",
    "mem_total_kb": "263855556",
    "network_numa_node": "0",
    "network_numa_nodes": "0",
    "objectstore_numa_nodes": "1",
    "objectstore_numa_unknown_devices": "sdz",
    "os": "Linux",
    "osd_data": "/var/lib/ceph/osd/ceph-190",
    "osd_objectstore": "bluestore",
    "rotational": "1"
}
- From the osd.190 perf dump:
"bluefs": { "gift_bytes": 0, "reclaim_bytes": 0, "db_total_bytes": 66676842496, "db_used_bytes": 5631893504, "wal_total_bytes": 0, "wal_used_bytes": 0, "slow_total_bytes": 79993634816, "slow_used_bytes": 0, "num_files": 92, "log_bytes": 6508544, "log_compactions": 6, "logged_bytes": 688148480, "files_written_wal": 2, "files_written_sst": 1292, "bytes_written_wal": 101419847074, "bytes_written_sst": 81394218145, "bytes_written_slow": 0, "max_bytes_wal": 0, "max_bytes_db": 6096412672, "max_bytes_slow": 0, "read_random_count": 3378652, "read_random_bytes": 95902364683, "read_random_disk_count": 703478, "read_random_disk_bytes": 83455174386, "read_random_buffer_count": 2682185, "read_random_buffer_bytes": 12447185436, "read_count": 22915510, "read_bytes": 5612151269720, "read_prefetch_count": 22904442, "read_prefetch_bytes": 5611797253037 }, "bluestore": { "kv_flush_lat": { "avgcount": 26323243, "sum": 28.606289727, "avgtime": 0.000001086 }, "kv_commit_lat": { "avgcount": 26323243, "sum": 4058.024723604, "avgtime": 0.000154161 }, "kv_sync_lat": { "avgcount": 26323243, "sum": 4086.631013331, "avgtime": 0.000155248 }, "kv_final_lat": { "avgcount": 26287290, "sum": 320.624837028, "avgtime": 0.000012196 }, "state_prepare_lat": { "avgcount": 27143162, "sum": 5777.001448305, "avgtime": 0.000212834 }, "state_aio_wait_lat": { "avgcount": 27143162, "sum": 1339.831461853, "avgtime": 0.000049361 }, "state_io_done_lat": { "avgcount": 27143162, "sum": 36.584842317, "avgtime": 0.000001347 }, "state_kv_queued_lat": { "avgcount": 27143162, "sum": 1959.621523663, "avgtime": 0.000072195 }, "state_kv_commiting_lat": { "avgcount": 27143162, "sum": 5605.120579028, "avgtime": 0.000206502 }, "state_kv_done_lat": { "avgcount": 27143216, "sum": 17.832246320, "avgtime": 0.000000656 }, "state_deferred_queued_lat": { "avgcount": 55116, "sum": 2678321.722681965, "avgtime": 48.594268863 }, "state_deferred_aio_wait_lat": { "avgcount": 55116, "sum": 16.250258907, "avgtime": 0.000294837 }, "state_deferred_cleanup_lat": { "avgcount": 55116, "sum": 1575.661414028, "avgtime": 0.028588094 }, "state_finishing_lat": { "avgcount": 27143167, "sum": 3.564354538, "avgtime": 0.000000131 }, "state_done_lat": { "avgcount": 27143167, "sum": 3079267.057073008, "avgtime": 0.113445385 }, "throttle_lat": { "avgcount": 27143162, "sum": 151.964357626, "avgtime": 0.000005598 }, "submit_lat": { "avgcount": 27143162, "sum": 6204.636075448, "avgtime": 0.000228589 }, "commit_lat": { "avgcount": 27143162, "sum": 14727.335121744, "avgtime": 0.000542579 }, "read_lat": { "avgcount": 138419, "sum": 2496.145919446, "avgtime": 0.018033260 }, "read_onode_meta_lat": { "avgcount": 278646, "sum": 17.341151394, "avgtime": 0.000062233 }, "read_wait_aio_lat": { "avgcount": 140227, "sum": 2457.914231043, "avgtime": 0.017528109 }, "compress_lat": { "avgcount": 0, "sum": 0.000000000, "avgtime": 0.000000000 }, "decompress_lat": { "avgcount": 0, "sum": 0.000000000, "avgtime": 0.000000000 }, "csum_lat": { "avgcount": 196613, "sum": 17.218755752, "avgtime": 0.000087576 },
Updated by Vikhyat Umrao over 3 years ago
- dump_mempools:
# ceph daemon osd.190 dump_mempools
{
    "mempool": {
        "by_pool": {
            "bloom_filter": { "items": 0, "bytes": 0 },
            "bluestore_alloc": { "items": 960161, "bytes": 7681288 },
            "bluestore_cache_data": { "items": 122, "bytes": 36945920 },
            "bluestore_cache_onode": { "items": 2174839, "bytes": 1217909840 },
            "bluestore_cache_other": { "items": 423489947, "bytes": 814029999 },
            "bluestore_fsck": { "items": 0, "bytes": 0 },
            "bluestore_txc": { "items": 49, "bytes": 34104 },
            "bluestore_writing_deferred": { "items": 601, "bytes": 2415072 },
            "bluestore_writing": { "items": 900, "bytes": 3364950 },
            "bluefs": { "items": 2392, "bytes": 55120 },
            "buffer_anon": { "items": 216582, "bytes": 21638445 },
            "buffer_meta": { "items": 601, "bytes": 52888 },
            "osd": { "items": 317, "bytes": 3758352 },
            "osd_mapbl": { "items": 53, "bytes": 4501543 },
            "osd_pglog": { "items": 417878, "bytes": 183866320 },
            "osdmap": { "items": 57295, "bytes": 1522776 },
            "osdmap_mapping": { "items": 0, "bytes": 0 },
            "pgmap": { "items": 0, "bytes": 0 },
            "mds_co": { "items": 0, "bytes": 0 },
            "unittest_1": { "items": 0, "bytes": 0 },
            "unittest_2": { "items": 0, "bytes": 0 }
        },
        "total": { "items": 427321737, "bytes": 2297776617 }
    }
}
Updated by Vikhyat Umrao over 3 years ago
- Uploaded the debug log file and osd.190.perf.dump:
[vikhyat@redhat Desktop]$ ceph-post-file tracker.47174.osd.190.perf.dump.txt
args: -- tracker.47174.osd.190.perf.dump.txt
/usr/bin/ceph-post-file: upload tag 5bf0591d-271e-4df0-8409-4c44c581c01e
/usr/bin/ceph-post-file: user: vikhyat@redhat
/usr/bin/ceph-post-file: will upload file tracker.47174.osd.190.perf.dump.txt
sftp> mkdir post/5bf0591d-271e-4df0-8409-4c44c581c01e_vikhyat@redhat_4146a28b-38fd-4b93-9787-5c9bbe12c31a
sftp> cd post/5bf0591d-271e-4df0-8409-4c44c581c01e_vikhyat@redhat_4146a28b-38fd-4b93-9787-5c9bbe12c31a
sftp> put /tmp/tmp.1yq403KXHa user
sftp> put tracker.47174.osd.190.perf.dump.txt
/usr/bin/ceph-post-file: copy the upload id below to share with a dev:
ceph-post-file: 5bf0591d-271e-4df0-8409-4c44c581c01e

[vikhyat@redhat Desktop]$ ceph-post-file tracker.47174.osd.190.debug.logs
args: -- tracker.47174.osd.190.debug.logs
/usr/bin/ceph-post-file: upload tag 0f1309ba-2ec6-4656-8089-89c71cfe97d5
/usr/bin/ceph-post-file: user: vikhyat@redhat
/usr/bin/ceph-post-file: will upload file tracker.47174.osd.190.debug.logs
sftp> mkdir post/0f1309ba-2ec6-4656-8089-89c71cfe97d5_vikhyat@redhat_e3371103-c55d-4a57-a6eb-239ca9ff520d
sftp> cd post/0f1309ba-2ec6-4656-8089-89c71cfe97d5_vikhyat@redhat_e3371103-c55d-4a57-a6eb-239ca9ff520d
sftp> put /tmp/tmp.DccSu0nFQN user
sftp> put tracker.47174.osd.190.debug.logs
/usr/bin/ceph-post-file: copy the upload id below to share with a dev:
ceph-post-file: 0f1309ba-2ec6-4656-8089-89c71cfe97d5
Updated by Igor Fedotov over 3 years ago
- Status changed from New to In Progress
- Assignee set to Igor Fedotov
Updated by Serg D over 3 years ago
This looks similar to https://tracker.ceph.com/issues/47044.
Updated by Vikhyat Umrao over 3 years ago
Thank you, Igor, for looking into this one.
FYI, since this reproduces consistently, we again deleted multiple pools while space reclamation was running. This time we set `osd_delete_sleep_hybrid = 0` and `osd_delete_sleep_hdd = 0`; the space-reclamation stats are below.
pool deletion: Thu Sep 10 15:41:14 UTC 2020 - 147 TiB used, 214 TiB / 361 TiB avail
2020-09-10 15:41:13.771548 mgr.f18-h09-000-r620 (mgr.3034348) 832984 : cluster [DBG] pgmap v888291: 0 pgs: ; 0 B data, 147 TiB used, 214 TiB / 361 TiB avail

reclamation complete: Thu Sep 10 20:39:07 UTC 2020 - 13 TiB used, 348 TiB / 361 TiB avail
2020-09-10 20:39:07.718656 mgr.f18-h09-000-r620 (mgr.3034348) 841906 : cluster [DBG] pgmap v897531: 128 pgs: 128 active+clean; 5.8 KiB data, 13 TiB used, 348 TiB / 361 TiB avail
This time it took approximately 5 hours, and earlier, without the tuning, it was also approximately 5 hours, so I am not sure changing delete_sleep helps much here.
2020-09-09 11:50:33.986846 mgr.f18-h09-000-r620 (mgr.3034348) 784592 : cluster [DBG] pgmap v833784: 0 pgs: ; 0 B data, 158 TiB used, 203 TiB / 361 TiB avail
2020-09-09 17:22:10.892894 mgr.f18-h09-000-r620 (mgr.3034348) 794514 : cluster [DBG] pgmap v843879: 128 pgs: 128 active+clean; 6.2 KiB data, 13 TiB used, 347 TiB / 361 TiB avail; 4.4 KiB/s rd, 0 B/s wr, 7 op/s
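For reference, these sleep values can be changed on live OSDs without a restart; a minimal sketch of one way to apply them (the injectargs path, as used elsewhere in this ticket):

# zero out the per-device-class deletion throttles on all OSDs
ceph tell 'osd.*' injectargs '--osd_delete_sleep_hdd 0 --osd_delete_sleep_hybrid 0'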
Updated by Igor Fedotov over 3 years ago
- "Is duplicate of Bug #47044: PG::_delete_some isn't optimal iterating objects" added
Updated by Igor Fedotov over 3 years ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 37496
Updated by Kefu Chai over 3 years ago
- Status changed from Fix Under Review to Resolved