Bug #47174 (closed)

[BlueStore] Pool/PG deletion (space reclamation) is very slow

Added by Vikhyat Umrao over 3 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Version - 14.2.8; also reproduced in 12.2.12

- We use cosbench to fill the cluster, i.e. an RGW workload. Five buckets were filled with the following object sizes.

config="containers=r(1,5);objects=r(1,374000);sizes=h(1|1|50,64|64|15,8192|8192|15,65536|65536|15,1048576|1048576|5)KB" >

- Then buckets 6 through 35 were filled with a fixed object size of 256K.

config="containers=u(6,15);objects=u(1,1000000);sizes=c(256)KB" 
config="containers=u(16,25);objects=u(1,1000000);sizes=c(256)KB" 
config="containers=u(26,35);objects=u(1,1000000);sizes=c(256)KB" 

- Current space reclamation status during the above workload testing:

Thu Aug 27 13:16:27 UTC 2020
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
    hdd       361 TiB     202 TiB     158 TiB      159 TiB         43.96
    TOTAL     361 TiB     202 TiB     158 TiB      159 TiB         43.96

Thu Aug 27 16:57:09 UTC 2020
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED       RAW USED     %RAW USED 
    hdd       361 TiB     283 TiB     77 TiB       78 TiB         21.49 
    TOTAL     361 TiB     283 TiB     77 TiB       78 TiB         21.49 

- Going by the above stats, and in general with the above object sizes, we have seen it take 8+ hours to reclaim the space back to where it was before starting the workload.

- The space reclamation time increases further when we fill the cluster with more small objects, for example:

config="containers=u(1,5);objects=u(1,50000000);sizes=h(1|2|25,2|4|40,4|8|25,8|256|10)KB" 

- It looks like the more tiny objects there are, the more time it takes to reclaim space from pool/PG deletion.
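
For reference, a minimal shell loop (a hedged sketch, not the exact procedure used here) that samples ceph df the way the timestamped RAW STORAGE snapshots above were captured:

# every 10 minutes, print a UTC timestamp followed by the RAW STORAGE summary
while true; do date -u; ceph df | head -n 5; sleep 600; done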

- The cluster has 192 OSDs; the DB is on NVMe and the block device is on HDD.
+ DB partition is 62G on NVMe
+ OSD block disk is 1.8T
+ Each node has 24 OSDs
+ DBs for 12 OSDs share one NVMe, so 2 NVMe disks in total
+ 251G total RAM in each node
+ 56-core CPU
+ From one of the OSD nodes:
# uptime
17:24:24 up 2 days, 4:55, 2 users, load average: 81.69, 81.59, 81.82

top - 17:25:35 up 2 days,  4:56,  2 users,  load average: 83.08, 82.01, 81.95
Tasks: 644 total,   1 running, 643 sleeping,   0 stopped,   0 zombie
%Cpu(s):  6.9 us,  2.8 sy,  0.0 ni, 52.5 id, 37.7 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem : 26385555+total, 69740848 free, 18721012+used,  6904572 buff/cache
KiB Swap:   524284 total,   524284 free,        0 used. 66270048 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                       
  84346 ceph      20   0 9735624   7.4g  22612 S  41.2  2.9 489:58.54 /usr/bin/ceph-osd -f --cluster ceph --id 173 --setuser ceph --setgroup ceph                                   
  84266 ceph      20   0 9649848   7.4g  22448 S  29.4  2.9 394:04.31 /usr/bin/ceph-osd -f --cluster ceph --id 181 --setuser ceph --setgroup ceph                                   
  84315 ceph      20   0 9559208   7.4g  22624 S  29.4  2.9 475:31.08 /usr/bin/ceph-osd -f --cluster ceph --id 75 --setuser ceph --setgroup ceph                                    
  84357 ceph      20   0 9622716   7.4g  22636 S  29.4  2.9 413:25.13 /usr/bin/ceph-osd -f --cluster ceph --id 117 --setuser ceph --setgroup ceph                                   
  84281 ceph      20   0 9798076   7.4g  22696 S  23.5  2.9 450:28.81 /usr/bin/ceph-osd -f --cluster ceph --id 52 --setuser ceph --setgroup ceph                                    
  84291 ceph      20   0 9796536   7.4g  22688 S  23.5  2.9 341:54.12 /usr/bin/ceph-osd -f --cluster ceph --id 7 --setuser ceph --setgroup ceph                                     
  84302 ceph      20   0 9716028   7.4g  22792 S  23.5  2.9 438:46.45 /usr/bin/ceph-osd -f --cluster ceph --id 21 --setuser ceph --setgroup ceph                                    
  84313 ceph      20   0 9694608   7.3g  22644 S  23.5  2.9 346:54.91 /usr/bin/ceph-osd -f --cluster ceph --id 141 --setuser ceph --setgroup ceph                                   
  84317 ceph      20   0 9736432   7.4g  22676 S  23.5  2.9 459:17.50 /usr/bin/ceph-osd -f --cluster ceph --id 30 --setuser ceph --setgroup ceph                                    
  84336 ceph      20   0 9633048   7.4g  22784 S  23.5  2.9 469:43.33 /usr/bin/ceph-osd -f --cluster ceph --id 107 --setuser ceph --setgroup ceph                                   
  84342 ceph      20   0 9402956   7.4g  22640 S  23.5  2.9 452:44.92 /usr/bin/ceph-osd -f --cluster ceph --id 46 --setuser ceph --setgroup ceph                                    
  84351 ceph      20   0 9543764   7.4g  22604 S  23.5  2.9 334:12.42 /usr/bin/ceph-osd -f --cluster ceph --id 125 --setuser ceph --setgroup ceph                                   
  84362 ceph      20   0 9514564   7.4g  22556 S  23.5  2.9 341:06.37 /usr/bin/ceph-osd -f --cluster ceph --id 91 --setuser ceph --setgroup ceph                                    
  87544 ceph      20   0 9642692   7.4g  22600 S  23.5  2.9 345:25.25 /usr/bin/ceph-osd -f --cluster ceph --id 68 --setuser ceph --setgroup ceph                                    
  84247 ceph      20   0 9511292   7.4g  22680 S  17.6  2.9 405:16.52 /usr/bin/ceph-osd -f --cluster ceph --id 164 --setuser ceph --setgroup ceph                                   
  84254 ceph      20   0 9597888   7.4g  22604 S  17.6  2.9 404:31.53 /usr/bin/ceph-osd -f --cluster ceph --id 99 --setuser ceph --setgroup ceph                                    
  84323 ceph      20   0 9652036   7.4g  22592 S  17.6  2.9 489:35.24 /usr/bin/ceph-osd -f --cluster ceph --id 190 --setuser ceph --setgroup ceph                                   
  84341 ceph      20   0 9624968   7.4g  22484 S  17.6  2.9 369:05.30 /usr/bin/ceph-osd -f --cluster ceph --id 13 --setuser ceph --setgroup ceph                                    
  84353 ceph      20   0 9656128   7.4g  22648 S  17.6  2.9 391:18.63 /usr/bin/ceph-osd -f --cluster ceph --id 83 --setuser ceph --setgroup ceph                                    
  84354 ceph      20   0 9419564   7.4g  22628 S  17.6  2.9 365:26.11 /usr/bin/ceph-osd -f --cluster ceph --id 38 --setuser ceph --setgroup ceph                                    
 154813 root      20   0  162696   2856   1612 R  17.6  0.0   0:00.04 top -c -n1                                                                                                    
  84277 ceph      20   0 9713448   7.4g  22472 S  11.8  2.9 400:50.81 /usr/bin/ceph-osd -f --cluster ceph --id 156 --setuser ceph --setgroup ceph                                   
  84320 ceph      20   0 9736748   7.4g  22636 S  11.8  2.9 409:17.71 /usr/bin/ceph-osd -f --cluster ceph --id 148 --setuser ceph --setgroup ceph                                   
  84355 ceph      20   0 9749624   6.8g  22588 S  11.8  2.7 359:17.82 /usr/bin/ceph-osd -f --cluster ceph --id 133 --setuser ceph --setgroup ceph                                   
  84314 ceph      20   0 9561324   6.3g  22576 S   5.9  2.5 391:15.68 /usr/bin/ceph-osd -f --cluster ceph --id 59 --setuser ceph --setgroup ceph  
Device            r/s     w/s     rkB/s     wkB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
nvme0n1       28250.00   31.00 2369532.00    244.00     0.00     0.00   0.00   0.00    2.89    2.26  83.16    83.88     7.87   0.04 100.70
nvme1n1       28957.00   36.00 2411584.00    288.00     0.00     0.00   0.00   0.00    2.64    1.81  78.04    83.28     8.00   0.03 101.00
sda              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sdb              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
dm-0             0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
dm-1             0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
dm-2          2776.00    4.00 232480.00     32.00     0.00     0.00   0.00   0.00    2.89    3.00   8.04    83.75     8.00   0.36 100.00
dm-3          2849.00    2.00 241796.00     16.00     0.00     0.00   0.00   0.00    2.81    2.00   8.01    84.87     8.00   0.35 100.00
dm-4          2259.00    1.00 177868.00      8.00     0.00     0.00   0.00   0.00    2.82    2.00   6.37    78.74     8.00   0.44 100.00
dm-5          2499.00    3.00 209644.00     24.00     0.00     0.00   0.00   0.00    2.93    1.00   7.32    83.89     8.00   0.40 100.00
dm-6          3115.00    7.00 260588.00     56.00     0.00     0.00   0.00   0.00    2.84    3.29   8.89    83.66     8.00   0.32 100.00
dm-7          2293.00    3.00 191508.00     24.00     0.00     0.00   0.00   0.00    2.89    1.00   6.64    83.52     8.00   0.44 100.00
dm-8          2592.00    1.00 221416.00     12.00     0.00     0.00   0.00   0.00    2.97    1.00   7.71    85.42    12.00   0.39 100.00
dm-9           690.00    1.00  58056.00      8.00     0.00     0.00   0.00   0.00    2.86    1.00   1.98    84.14     8.00   1.38  95.50
dm-10         2176.00    3.00 181016.00     16.00     0.00     0.00   0.00   0.00    2.90    1.33   6.30    83.19     5.33   0.46 100.00
dm-11         1462.00    0.00 125620.00      0.00     0.00     0.00   0.00   0.00    2.82    0.00   4.12    85.92     0.00   0.68 100.00
dm-12         3186.00    4.00 268904.00     32.00     0.00     0.00   0.00   0.00    2.91    3.00   9.30    84.40     8.00   0.31 100.00
dm-13         2379.00    1.00 203524.00      8.00     0.00     0.00   0.00   0.00    2.95    2.00   7.04    85.55     8.00   0.42 100.00
dm-14         1801.00    4.00 150780.00     32.00     0.00     0.00   0.00   0.00    2.70    1.25   4.87    83.72     8.00   0.55  99.90
dm-15         2654.00    2.00 220952.00     16.00     0.00     0.00   0.00   0.00    2.60    2.50   6.92    83.25     8.00   0.38 100.00
dm-16         2364.00    7.00 197504.00     56.00     0.00     0.00   0.00   0.00    2.63    1.29   6.22    83.55     8.00   0.42  99.90
dm-17         3057.00    7.00 225848.00     56.00     0.00     0.00   0.00   0.00    2.58    2.43   7.89    73.88     8.00   0.33 100.00
dm-18          963.00    0.00  83284.00      0.00     0.00     0.00   0.00   0.00    2.75    0.00   2.65    86.48     0.00   1.02  98.60
dm-19         2823.00    4.00 236264.00     32.00     0.00     0.00   0.00   0.00    2.61    1.50   7.38    83.69     8.00   0.35  99.90
dm-20         3670.00    3.00 311164.00     24.00     0.00     0.00   0.00   0.00    2.62    1.67   9.61    84.79     8.00   0.27  99.90
dm-21         1604.00    0.00 136908.00      0.00     0.00     0.00   0.00   0.00    2.79    0.00   4.48    85.35     0.00   0.62 100.00
dm-22         2286.00    5.00 192028.00     40.00     0.00     0.00   0.00   0.00    2.56    2.40   5.87    84.00     8.00   0.44 100.00
dm-23         2817.00    3.00 237992.00     24.00     0.00     0.00   0.00   0.00    2.63    1.33   7.40    84.48     8.00   0.35  99.90
dm-24         3108.00    1.00 263720.00      8.00     0.00     0.00   0.00   0.00    2.69    3.00   8.38    84.85     8.00   0.32 100.10
dm-25         1809.00    0.00 155404.00      0.00     0.00     0.00   0.00   0.00    2.74    0.00   4.95    85.91     0.00   0.55  99.90

- OSD memory target value
osd memory target = 7880472439
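
A quick way to confirm the value an OSD daemon is actually running with is the admin socket (a hedged example, using osd.190, which is examined in the comments below):

# ceph daemon osd.190 config get osd_memory_target
{
    "osd_memory_target": "7880472439"
}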

- Cluster status - zero client IO

# ceph -s
  cluster:
    id:     78dbd3fd-75z2-47a7-848a-0137ba6ace9a
    health: HEALTH_WARN
            5 daemons have recently crashed
            2 slow ops, oldest one blocked for 84839 sec, daemons [osd.147,mon.f18-h09-000-r620] have slow ops.

  services:
    mon: 3 daemons, quorum mon1,mon2,mon3 (age 5d)
    mgr: mon1(active, since 5d), standbys: mon2, mon3
    osd: 192 osds: 192 up (since 109m), 192 in (since 7h)

  data:
    pools:   16 pools, 7040 pgs
    objects: 49 objects, 6.2 KiB
    usage:   74 TiB used, 287 TiB / 361 TiB avail
    pgs:     7032 active+clean
             7    active+clean+scrubbing
             1    active+clean+scrubbing+deep

  io:
    client:   253 B/s rd, 0 op/s rd, 0 op/s wr

- Data pool is an EC pool - 4+2

# ceph osd erasure-code-profile get myprofile
crush-device-class=
crush-failure-domain=osd
crush-root=default
jerasure-per-chunk-alignment=false
k=4
m=2
plugin=jerasure
technique=reed_sol_van
w=8
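
For context, a 4+2 profile like this is typically created and attached to the bucket data pool along the following lines (a hedged sketch; the pool name and PG counts are illustrative placeholders, not taken from this cluster):

# ceph osd erasure-code-profile set myprofile k=4 m=2 plugin=jerasure technique=reed_sol_van crush-failure-domain=osd
# ceph osd pool create default.rgw.buckets.data 2048 2048 erasure myprofile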

Related issues: 1 (0 open, 1 closed)

Is duplicate of RADOS - Bug #47044: PG::_delete_some isn't optimal iterating objects (Resolved, Igor Fedotov)

Actions #1

Updated by Vikhyat Umrao over 3 years ago

I have selected one OSD to provide the debug log information. perf top from the same OSD:

ceph 84323 1 28 Aug26 ? 08:11:24 /usr/bin/ceph-osd -f --cluster ceph --id 190 --setuser ceph --setgroup ceph
perf top -p 84323

     7.88%  libc-2.17.so        [.] __memcmp_sse4_1
     7.31%  ceph-osd            [.] rocksdb::DataBlockIter::SeekToFirst
     6.79%  ceph-osd            [.] rocksdb::MergingIterator::Next
     6.07%  ceph-osd            [.] rocksdb::DataBlockIter::Next
     5.91%  libc-2.17.so        [.] __memcpy_ssse3_back
     5.33%  ceph-osd            [.] rocksdb_cache::BinnedLRUHandleTable::FindP
     2.73%  libpthread-2.17.so  [.] pthread_mutex_unlock
     2.62%  ceph-osd            [.] rocksdb::DBIter::FindNextUserEntryInternal
     2.23%  ceph-osd            [.] rocksdb::Block::NewIterator<rocksdb::DataB
     1.64%  ceph-osd            [.] rocksdb::BlockBasedTableIterator<rocksdb::
     1.17%  [kernel]            [k] irq_return
     1.06%  [kernel]            [k] do_numa_page
     0.95%  libpthread-2.17.so  [.] pthread_mutex_lock
     0.94%  [kernel]            [k] gup_pte_range
     0.85%  [kernel]            [k] error_swapgs
     0.79%  [kernel]            [k] _raw_qspin_lock
     0.77%  ceph-osd            [.] rocksdb_cache::BinnedLRUCacheShard::LRU_In
     0.70%  [kernel]            [k] page_waitqueue
     0.70%  ceph-osd            [.] rocksdb::BlockBasedTable::NewDataBlockIter
     0.70%  [kernel]            [k] change_pte_range

I have also collected logs at the following debug levels for osd.190. I will upload them.
ceph tell osd.190 injectargs '--debug_osd 20 --debug_bluestore 20 --debug_bluefs 20 --debug_rocksdb 20'


# ceph osd metadata osd.190
{
    "id": 190,
    "arch": "x86_64",
    "back_addr": "[v2:192.168.45.91:6850/84323,v1:192.168.45.91:6854/84323]",
    "back_iface": "enp4s0f1",
    "bluefs": "1",
    "bluefs_db_access_mode": "blk",
    "bluefs_db_block_size": "4096",
    "bluefs_db_dev_node": "/dev/dm-25",
    "bluefs_db_driver": "KernelDevice",
    "bluefs_db_partition_path": "/dev/dm-25",
    "bluefs_db_rotational": "0",
    "bluefs_db_size": "66676850688",
    "bluefs_db_support_discard": "1",
    "bluefs_db_type": "ssd",
    "bluefs_single_shared_device": "0",
    "bluestore_bdev_access_mode": "blk",
    "bluestore_bdev_block_size": "4096",
    "bluestore_bdev_dev_node": "/dev/dm-47",
    "bluestore_bdev_driver": "KernelDevice",
    "bluestore_bdev_partition_path": "/dev/dm-47",
    "bluestore_bdev_rotational": "1",
    "bluestore_bdev_size": "1999839952896",
    "bluestore_bdev_support_discard": "0",
    "bluestore_bdev_type": "hdd",
    "ceph_release": "nautilus",
    "ceph_version": "ceph version 14.2.8-89.el7cp (9ab115d618c72e7d9227441ec25ceb1487c76fb8) nautilus (stable)",
    "ceph_version_short": "14.2.8-89.el7cp",
    "cpu": "Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz",
    "default_device_class": "hdd",
    "device_ids": "nvme1n1=INTEL_SSDPEDMD800G4_CVFT6172003T800CGN,sdz=AVAGO_SMC3108_0002e4072e8ffcb0260199b11a800403",
    "devices": "nvme1n1,sdz",
    "distro": "rhel",
    "distro_description": "Red Hat Enterprise Linux Server 7.8 (Maipo)",
    "distro_version": "7.8",
    "front_addr": "[v2:192.168.45.91:6850/84323,v1:192.168.45.91:6854/84323]",
    "front_iface": "enp4s0f0",
    "hb_back_addr": "[v2:192.168.45.91:6858/84323,v1:192.168.45.91:6863/84323]",
    "hb_front_addr": "[v2:192.168.45.91:6859/84323,v1:192.168.45.91:6864/84323]",
    "hostname": "osdnode1.redhat.com",
    "journal_rotational": "0",
    "kernel_description": "#1 SMP Fri Jun 12 14:34:17 EDT 2020",
    "kernel_version": "3.10.0-1127.13.1.el7.x86_64",
    "mem_swap_kb": "524284",
    "mem_total_kb": "263855556",
    "network_numa_node": "0",
    "network_numa_nodes": "0",
    "objectstore_numa_nodes": "1",
    "objectstore_numa_unknown_devices": "sdz",
    "os": "Linux",
    "osd_data": "/var/lib/ceph/osd/ceph-190",
    "osd_objectstore": "bluestore",
    "rotational": "1" 
}

- From osd.190 perf dump:

 "bluefs": {
        "gift_bytes": 0,
        "reclaim_bytes": 0,
        "db_total_bytes": 66676842496,
        "db_used_bytes": 5631893504,
        "wal_total_bytes": 0,
        "wal_used_bytes": 0,
        "slow_total_bytes": 79993634816,
        "slow_used_bytes": 0,
        "num_files": 92,
        "log_bytes": 6508544,
        "log_compactions": 6,
        "logged_bytes": 688148480,
        "files_written_wal": 2,
        "files_written_sst": 1292,
        "bytes_written_wal": 101419847074,
        "bytes_written_sst": 81394218145,
        "bytes_written_slow": 0,
        "max_bytes_wal": 0,
        "max_bytes_db": 6096412672,
        "max_bytes_slow": 0,
        "read_random_count": 3378652,
        "read_random_bytes": 95902364683,
        "read_random_disk_count": 703478,
        "read_random_disk_bytes": 83455174386,
        "read_random_buffer_count": 2682185,
        "read_random_buffer_bytes": 12447185436,
        "read_count": 22915510,
        "read_bytes": 5612151269720,
        "read_prefetch_count": 22904442,
        "read_prefetch_bytes": 5611797253037
    },

    "bluestore": {
        "kv_flush_lat": {
            "avgcount": 26323243,
            "sum": 28.606289727,
            "avgtime": 0.000001086
        },
        "kv_commit_lat": {
            "avgcount": 26323243,
            "sum": 4058.024723604,
            "avgtime": 0.000154161
        },
        "kv_sync_lat": {
            "avgcount": 26323243,
            "sum": 4086.631013331,
            "avgtime": 0.000155248
        },
        "kv_final_lat": {
            "avgcount": 26287290,
            "sum": 320.624837028,
            "avgtime": 0.000012196
        },
        "state_prepare_lat": {
            "avgcount": 27143162,
            "sum": 5777.001448305,
            "avgtime": 0.000212834
        },
        "state_aio_wait_lat": {
            "avgcount": 27143162,
            "sum": 1339.831461853,
            "avgtime": 0.000049361
        },
        "state_io_done_lat": {
            "avgcount": 27143162,
            "sum": 36.584842317,
            "avgtime": 0.000001347
        },
        "state_kv_queued_lat": {
            "avgcount": 27143162,
            "sum": 1959.621523663,
            "avgtime": 0.000072195
        },
        "state_kv_commiting_lat": {
            "avgcount": 27143162,
            "sum": 5605.120579028,
            "avgtime": 0.000206502
        },
        "state_kv_done_lat": {
            "avgcount": 27143216,
            "sum": 17.832246320,
            "avgtime": 0.000000656
        },
        "state_deferred_queued_lat": {
            "avgcount": 55116,
            "sum": 2678321.722681965,
            "avgtime": 48.594268863
        },
        "state_deferred_aio_wait_lat": {
            "avgcount": 55116,
            "sum": 16.250258907,
            "avgtime": 0.000294837
        },
        "state_deferred_cleanup_lat": {
            "avgcount": 55116,
            "sum": 1575.661414028,
            "avgtime": 0.028588094
        },
        "state_finishing_lat": {
            "avgcount": 27143167,
            "sum": 3.564354538,
            "avgtime": 0.000000131
        },
        "state_done_lat": {
            "avgcount": 27143167,
            "sum": 3079267.057073008,
            "avgtime": 0.113445385
        },
        "throttle_lat": {
            "avgcount": 27143162,
            "sum": 151.964357626,
            "avgtime": 0.000005598
        },
        "submit_lat": {
            "avgcount": 27143162,
            "sum": 6204.636075448,
            "avgtime": 0.000228589
        },
        "commit_lat": {
            "avgcount": 27143162,
            "sum": 14727.335121744,
            "avgtime": 0.000542579
        },
        "read_lat": {
            "avgcount": 138419,
            "sum": 2496.145919446,
            "avgtime": 0.018033260
        },
        "read_onode_meta_lat": {
            "avgcount": 278646,
            "sum": 17.341151394,
            "avgtime": 0.000062233
        },
        "read_wait_aio_lat": {
            "avgcount": 140227,
            "sum": 2457.914231043,
            "avgtime": 0.017528109
        },
        "compress_lat": {
            "avgcount": 0,
            "sum": 0.000000000,
            "avgtime": 0.000000000
        },
        "decompress_lat": {
            "avgcount": 0,
            "sum": 0.000000000,
            "avgtime": 0.000000000
        },
        "csum_lat": {
            "avgcount": 196613,
            "sum": 17.218755752,
            "avgtime": 0.000087576
        },

Actions #2

Updated by Vikhyat Umrao over 3 years ago

- dump_mempools:

# ceph daemon osd.190 dump_mempools
{
    "mempool": {
        "by_pool": {
            "bloom_filter": {
                "items": 0,
                "bytes": 0
            },
            "bluestore_alloc": {
                "items": 960161,
                "bytes": 7681288
            },
            "bluestore_cache_data": {
                "items": 122,
                "bytes": 36945920
            },
            "bluestore_cache_onode": {
                "items": 2174839,
                "bytes": 1217909840
            },
            "bluestore_cache_other": {
                "items": 423489947,
                "bytes": 814029999
            },
            "bluestore_fsck": {
                "items": 0,
                "bytes": 0
            },
            "bluestore_txc": {
                "items": 49,
                "bytes": 34104
            },
            "bluestore_writing_deferred": {
                "items": 601,
                "bytes": 2415072
            },
            "bluestore_writing": {
                "items": 900,
                "bytes": 3364950
            },
            "bluefs": {
                "items": 2392,
                "bytes": 55120
            },
            "buffer_anon": {
                "items": 216582,
                "bytes": 21638445
            },
            "buffer_meta": {
                "items": 601,
                "bytes": 52888
            },
            "osd": {
                "items": 317,
                "bytes": 3758352
            },
            "osd_mapbl": {
                "items": 53,
                "bytes": 4501543
            },
            "osd_pglog": {
                "items": 417878,
                "bytes": 183866320
            },
            "osdmap": {
                "items": 57295,
                "bytes": 1522776
            },
            "osdmap_mapping": {
                "items": 0,
                "bytes": 0
            },
            "pgmap": {
                "items": 0,
                "bytes": 0
            },
            "mds_co": {
                "items": 0,
                "bytes": 0
            },
            "unittest_1": {
                "items": 0,
                "bytes": 0
            },
            "unittest_2": {
                "items": 0,
                "bytes": 0
            }
        },
        "total": {
            "items": 427321737,
            "bytes": 2297776617
        }
    }
}
Actions #3

Updated by Vikhyat Umrao over 3 years ago

- Uploaded the debug log file and osd.190.perf.dump:

[vikhyat@redhat Desktop]$ ceph-post-file tracker.47174.osd.190.perf.dump.txt
args: -- tracker.47174.osd.190.perf.dump.txt
/usr/bin/ceph-post-file: upload tag 5bf0591d-271e-4df0-8409-4c44c581c01e
/usr/bin/ceph-post-file: user: vikhyat@redhat
/usr/bin/ceph-post-file: will upload file tracker.47174.osd.190.perf.dump.txt
sftp> mkdir post/5bf0591d-271e-4df0-8409-4c44c581c01e_vikhyat@redhat_4146a28b-38fd-4b93-9787-5c9bbe12c31a
sftp> cd post/5bf0591d-271e-4df0-8409-4c44c581c01e_vikhyat@redhat_4146a28b-38fd-4b93-9787-5c9bbe12c31a
sftp> put /tmp/tmp.1yq403KXHa user
sftp> put tracker.47174.osd.190.perf.dump.txt
/usr/bin/ceph-post-file: copy the upload id below to share with a dev:

ceph-post-file: 5bf0591d-271e-4df0-8409-4c44c581c01e

[vikhyat@redhat Desktop]$ ceph-post-file tracker.47174.osd.190.debug.logs 
args: -- tracker.47174.osd.190.debug.logs
/usr/bin/ceph-post-file: upload tag 0f1309ba-2ec6-4656-8089-89c71cfe97d5
/usr/bin/ceph-post-file: user: vikhyat@redhat
/usr/bin/ceph-post-file: will upload file tracker.47174.osd.190.debug.logs
sftp> mkdir post/0f1309ba-2ec6-4656-8089-89c71cfe97d5_vikhyat@redhat_e3371103-c55d-4a57-a6eb-239ca9ff520d
sftp> cd post/0f1309ba-2ec6-4656-8089-89c71cfe97d5_vikhyat@redhat_e3371103-c55d-4a57-a6eb-239ca9ff520d
sftp> put /tmp/tmp.DccSu0nFQN user
sftp> put tracker.47174.osd.190.debug.logs
/usr/bin/ceph-post-file: copy the upload id below to share with a dev:

ceph-post-file: 0f1309ba-2ec6-4656-8089-89c71cfe97d5
Actions #4

Updated by Igor Fedotov over 3 years ago

  • Status changed from New to In Progress
  • Assignee set to Igor Fedotov
Actions #5

Updated by Serg D over 3 years ago

it looks similar to problem https://tracker.ceph.com/issues/47044

Actions #6

Updated by Vikhyat Umrao over 3 years ago

Thank you, Igor, for looking into this one.

FYI, as this reproduces consistently, this time we again deleted multiple pools and space reclamation was running. This time we used `osd_delete_sleep_hybrid = 0` and `osd_delete_sleep_hdd = 0`, and the following are the stats from space reclamation.
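
For completeness, these sleeps can be changed at runtime the same way the debug levels are injected in the earlier comment; a hedged example, not necessarily the exact command used here:

# ceph tell osd.* injectargs '--osd_delete_sleep_hdd 0 --osd_delete_sleep_hybrid 0'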

pool deletion: Thu Sep 10 15:41:14 UTC 2020 - 147 TiB used, 214 TiB / 361 TiB avail

2020-09-10 15:41:13.771548 mgr.f18-h09-000-r620 (mgr.3034348) 832984 : cluster [DBG] pgmap v888291: 0 pgs: ; 0 B data, 147 TiB used, 214 TiB / 361 TiB avail

reclamation complete: Thu Sep 10 20:39:07 UTC 2020 - 13 TiB used, 348 TiB / 361 TiB avail

2020-09-10 20:39:07.718656 mgr.f18-h09-000-r620 (mgr.3034348) 841906 : cluster [DBG] pgmap v897531: 128 pgs: 128 active+clean; 5.8 KiB data, 13 TiB used, 348 TiB / 361 TiB avail

This time it took approx 5 hours, and below, when we had not tuned anything, it was also approx 5 hours, so I am not sure changing delete_sleep is helping much here.

2020-09-09 11:50:33.986846 mgr.f18-h09-000-r620 (mgr.3034348) 784592 : cluster [DBG] pgmap v833784: 0 pgs: ; 0 B data, 158 TiB used, 203 TiB / 361 TiB avail

2020-09-09 17:22:10.892894 mgr.f18-h09-000-r620 (mgr.3034348) 794514 : cluster [DBG] pgmap v843879: 128 pgs: 128 active+clean; 6.2 KiB data, 13 TiB used, 347 TiB / 361 TiB avail; 4.4 KiB/s rd, 0 B/s wr, 7 op/s
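
A rough back-of-the-envelope rate from the two runs (derived from the numbers above, rounded):

with delete sleeps = 0: (147 - 13) TiB over 15:41 -> 20:39  ≈ 134 TiB / 5.0 h ≈ 27 TiB/h
with defaults:          (158 - 13) TiB over 11:50 -> 17:22  ≈ 145 TiB / 5.5 h ≈ 26 TiB/h

i.e. both runs reclaim space at roughly the same rate, consistent with the observation that changing delete_sleep did not help much.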
Actions #7

Updated by Igor Fedotov over 3 years ago

  • Is duplicate of Bug #47044: PG::_delete_some isn't optimal iterating objects added
Actions #8

Updated by Igor Fedotov over 3 years ago

Serg D wrote:

it looks similar to problem https://tracker.ceph.com/issues/47044

yep, indeed

Actions #9

Updated by Igor Fedotov over 3 years ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 37496
Actions #10

Updated by Kefu Chai over 3 years ago

  • Status changed from Fix Under Review to Resolved