Bug #55761 (closed): ceph osd bluestore memory leak

Added by dovefi Z almost 2 years ago. Updated almost 2 years ago.

Status:
Duplicate
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
osd
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi everyone, we have a Ceph cluster and we only use RGW with an EC pool. The OSD memory keeps growing to more than 16 GB.

  • ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)
  • osd config
    ceph daemon osd.374 config get osd_memory_target
    {
        "osd_memory_target": "4294967296" 
    }
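
For reference, a minimal sketch of checking the memory target on every OSD on a host via the admin sockets (the default /var/run/ceph socket path is an assumption; adjust for your deployment):

# check osd_memory_target on all local OSDs (default admin socket path assumed)
for sock in /var/run/ceph/ceph-osd.*.asok; do
    echo "== ${sock} =="
    ceph --admin-daemon "${sock}" config get osd_memory_target
done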
    

osd memory used


    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
2158658 ceph      20   0 17.023G 0.016T  10700 S   5.6 12.9  97:37.84 ceph-osd
2167500 ceph      20   0 16.503G 0.015T  17392 S   1.7 12.1 103:05.32 ceph-osd
2164715 ceph      20   0 14.230G 0.013T  19904 S   1.3 10.6 101:39.31 ceph-osd
2163249 ceph      20   0 13.249G 0.012T  20068 S   1.3  9.9  94:41.75 ceph-osd
2160621 ceph      20   0 12.728G 0.012T   9256 S   2.3  9.4 103:20.76 ceph-osd
2158042 ceph      20   0 12.748G 0.011T  22344 S   1.0  9.3  88:23.16 ceph-osd
2155989 ceph      20   0 11.843G 0.010T  22656 S   4.3  8.5  91:13.34 ceph-osd
2168721 ceph      20   0 10.896G 0.010T  20204 S   4.6  8.0  93:46.26 ceph-osd
2166929 ceph      20   0 10.458G 9.538G  11004 S   1.3  7.6  91:26.08 ceph-osd
2161262 ceph      20   0 9011516 7.734G   9600 S   1.7  6.2  98:06.58 ceph-osd

osd heap dump info

osd.374 dumping heap profile now.
------------------------------------------------
MALLOC:    16957882712 (16172.3 MiB) Bytes in use by application
MALLOC: +            0 (    0.0 MiB) Bytes in page heap freelist
MALLOC: +    241870520 (  230.7 MiB) Bytes in central cache freelist
MALLOC: +     11673568 (   11.1 MiB) Bytes in transfer cache freelist
MALLOC: +     81639440 (   77.9 MiB) Bytes in thread cache freelists
MALLOC: +     75378880 (   71.9 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =  17368445120 (16563.8 MiB) Actual memory used (physical + swap)
MALLOC: +      3547136 (    3.4 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =  17371992256 (16567.2 MiB) Virtual address space used
MALLOC:
MALLOC:        1110696              Spans in use
MALLOC:             47              Thread heaps in use
MALLOC:           8192              Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.

The OSD is using about 16 GB of memory.
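
The tcmalloc statistics above come from the OSD heap commands; a minimal sketch, assuming osd.374 as in this report (heap release only returns unused freelist pages to the OS, it cannot reclaim memory still referenced by BlueStore caches):

ceph tell osd.374 heap stats      # print the tcmalloc summary shown above
ceph tell osd.374 heap dump       # dump a heap profile (needs the tcmalloc profiler)
ceph tell osd.374 heap release    # release freelist memory back to the OS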

osd dump_mempools

{
    "bloom_filter": {
        "items": 0,
        "bytes": 0
    },
    "bluestore_alloc": {
        "items": 15718400,
        "bytes": 15718400
    },
    "bluestore_cache_data": {
        "items": 0,
        "bytes": 0
    },
    "bluestore_cache_onode": {
        "items": 2747609,
        "bytes": 1846393248
    },
    "bluestore_cache_other": {
        "items": 682869756,
        "bytes": 9725848257
    },
    "bluestore_fsck": {
        "items": 0,
        "bytes": 0
    },
    "bluestore_txc": {
        "items": 60,
        "bytes": 44160
    },
    "bluestore_writing_deferred": {
        "items": 641,
        "bytes": 2639419
    },
    "bluestore_writing": {
        "items": 1585,
        "bytes": 6707995
    },
    "bluefs": {
        "items": 5971,
        "bytes": 227232
    },
    "buffer_anon": {
        "items": 16814,
        "bytes": 15956676
    },
    "buffer_meta": {
        "items": 115,
        "bytes": 10120
    },
    "osd": {
        "items": 117,
        "bytes": 1480752
    },
    "osd_mapbl": {
        "items": 0,
        "bytes": 0
    },
    "osd_pglog": {
        "items": 525725,
        "bytes": 103983300
    },
    "osdmap": {
        "items": 137722,
        "bytes": 3183664
    },
    "osdmap_mapping": {
        "items": 0,
        "bytes": 0
    },
    "pgmap": {
        "items": 0,
        "bytes": 0
    },
    "mds_co": {
        "items": 0,
        "bytes": 0
    },
    "unittest_1": {
        "items": 0,
        "bytes": 0
    },
    "unittest_2": {
        "items": 0,
        "bytes": 0
    },
    "total": {
        "items": 702024515,
        "bytes": 11722193223
    }
}

The bluestore_cache_other value is very high.

What is happening here?
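
A quick way to see which pool dominates is to sort the dump by size; a minimal sketch, assuming the flat Luminous-style dump_mempools output shown above, osd.374, and that jq is installed:

# sort mempools by size; the "total" entry will show up alongside the per-pool lines
ceph daemon osd.374 dump_mempools | \
    jq 'to_entries | map({pool: .key, MiB: (.value.bytes / 1048576 | floor)}) | sort_by(-.MiB)'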


Related issues: 1 (0 open, 1 closed)

Is duplicate of bluestore - Bug #48729: Bluestore memory leak on scrub operations (Resolved)

Actions #1

Updated by dovefi Z almost 2 years ago

I set mempool_debug to true and dumped again:

{
    "bloom_filter": {
        "items": 0,
        "bytes": 0,
        "by_type": {
            "unsigned char": {
                "items": 0,
                "bytes": 0
            }
        }
    },
    "bluestore_alloc": {
        "items": 15718400,
        "bytes": 15718400
    },
    "bluestore_cache_data": {
        "items": 0,
        "bytes": 0
    },
    "bluestore_cache_onode": {
        "items": 2747611,
        "bytes": 1846394592,
        "by_type": {
            "BlueStore::Onode": {
                "items": 2747611,
                "bytes": 1846394592
            }
        }
    },
    "bluestore_cache_other": {
        "items": 682869511,
        "bytes": 9725841859,
        "by_type": {
            "BlueStore::Blob": {
                "items": 5420397,
                "bytes": 563721288
            },
            "BlueStore::Buffer": {
                "items": 160,
                "bytes": 24320
            },
            "BlueStore::Extent": {
                "items": 5419151,
                "bytes": 260119248
            },
            "BlueStore::ExtentMap::Shard": {
                "items": 114,
                "bytes": 1824
            },
            "BlueStore::SharedBlob": {
                "items": 5420397,
                "bytes": 607084464
            },
            "bluestore_pextent_t": {
                "items": 163800,
                "bytes": 2620800
            },
            "char": {
                "items": 19153107,
                "bytes": 19153107
            },
            "std::_Rb_tree_node<std::pair<int const, boost::intrusive_ptr<BlueStore::Blob> > >": {
                "items": 38,
                "bytes": 1824
            },
            "std::_Rb_tree_node<std::pair<unsigned int const, std::unique_ptr<BlueStore::Buffer, std::default_delete<BlueStore::Buffer> > > >": {
                "items": 0,
                "bytes": 0
            },
            "std::__detail::_Hash_node_base*": {
                "items": 126698,
                "bytes": 1013584
            },
            "std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >": {
                "items": 0,
                "bytes": 0
            }
        }
    },
    "bluestore_fsck": {
        "items": 0,
        "bytes": 0
    },
    "bluestore_txc": {
        "items": 62,
        "bytes": 45632,
        "by_type": {
            "BlueStore::TransContext": {
                "items": 62,
                "bytes": 45632
            }
        }
    },
    "bluestore_writing_deferred": {
        "items": 665,
        "bytes": 2735063
    },
    "bluestore_writing": {
        "items": 1647,
        "bytes": 6967279
    },
    "bluefs": {
        "items": 6243,
        "bytes": 233760,
        "by_type": {
            "BlueFS::Dir": {
                "items": 3,
                "bytes": 264
            },
            "BlueFS::File": {
                "items": 334,
                "bytes": 53440
            },
            "BlueFS::FileLock": {
                "items": 1,
                "bytes": 8
            },
            "BlueFS::FileReader": {
                "items": 323,
                "bytes": 41344
            },
            "BlueFS::FileReaderBuffer": {
                "items": 0,
                "bytes": 0
            },
            "BlueFS::FileWriter": {
                "items": 6,
                "bytes": 1920
            }
        }
    },
    "buffer_anon": {
        "items": 16833,
        "bytes": 15695833
    },
    "buffer_meta": {
        "items": 117,
        "bytes": 10296,
        "by_type": {
            "ceph::buffer::raw_char": {
                "items": 0,
                "bytes": 0
            },
            "ceph::buffer::raw_claimed_char": {
                "items": 0,
                "bytes": 0
            },
            "ceph::buffer::raw_malloc": {
                "items": 0,
                "bytes": 0
            },
            "ceph::buffer::raw_mmap_pages": {
                "items": 0,
                "bytes": 0
            },
            "ceph::buffer::raw_pipe": {
                "items": 0,
                "bytes": 0
            },
            "ceph::buffer::raw_posix_aligned": {
                "items": 117,
                "bytes": 10296
            },
            "ceph::buffer::raw_static": {
                "items": 0,
                "bytes": 0
            },
            "ceph::buffer::raw_unshareable": {
                "items": 0,
                "bytes": 0
            }
        }
    },
    "osd": {
        "items": 117,
        "bytes": 1480752,
        "by_type": {
            "PG::CephPeeringEvt": {
                "items": 0,
                "bytes": 0
            },
            "PrimaryLogPG": {
                "items": 117,
                "bytes": 1480752
            }
        }
    },
    "osd_mapbl": {
        "items": 0,
        "bytes": 0
    },
    "osd_pglog": {
        "items": 525725,
        "bytes": 103983300
    },
    "osdmap": {
        "items": 135706,
        "bytes": 3151408,
        "by_type": {
            "OSDMap": {
                "items": 51,
                "bytes": 43248
            },
            "OSDMap::Incremental": {
                "items": 0,
                "bytes": 0
            }
        }
    },
    "osdmap_mapping": {
        "items": 0,
        "bytes": 0
    },
    "pgmap": {
        "items": 0,
        "bytes": 0
    },
    "mds_co": {
        "items": 0,
        "bytes": 0
    },
    "unittest_1": {
        "items": 0,
        "bytes": 0
    },
    "unittest_2": {
        "items": 0,
        "bytes": 0
    },
    "total": {
        "items": 702022637,
        "bytes": 11722258174
    }
}

The bluestore_cache_other total is not equal to the sum of its by_type values; most of the bytes are not attributed to any listed type.
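
A minimal sketch of checking that mismatch directly (same assumptions as before: osd.374 and jq available; the per-type breakdown is only populated while mempool debugging is enabled, as in the dump above):

# enable the per-type breakdown, then compare the pool total with the sum of its by_type entries
ceph daemon osd.374 config set mempool_debug true
ceph daemon osd.374 dump_mempools | \
    jq '.bluestore_cache_other | {reported_bytes: .bytes, by_type_sum: ([.by_type[].bytes] | add), unattributed_bytes: (.bytes - ([.by_type[].bytes] | add))}'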

Actions #2

Updated by dovefi Z almost 2 years ago

ceph daemon  osd.375 config get bluestore_cache_type
{
    "bluestore_cache_type": "2q" 
}

The cache type is 2q.
Actions #3

Updated by dovefi Z almost 2 years ago

We can also see the memory use in the BlueStore log:

2022-05-26 17:43:35.953782 7feea4f20700  5 bluestore.MempoolThread(0x55e2c5293bf0) _trim_shards cache_size: 134217728 kv_alloc: 134217728 kv_used: 134179922 meta_alloc: 0 meta_used: 11572236451 data_alloc: 0 data_used: 0
2022-05-26 17:43:36.154367 7feea4f20700  5 bluestore.MempoolThread(0x55e2c5293bf0) _tune_cache_size target: 4294967296 heap: 17317584896 unmapped: 1228800 mapped: 17316356096 old cache_size: 134217728 new cache size: 134217728
2022-05-26 17:43:40.964708 7feea4f20700  5 bluestore.MempoolThread(0x55e2c5293bf0) _trim_shards cache_size: 134217728 kv_alloc: 134217728 kv_used: 134179922 meta_alloc: 0 meta_used: 11572236451 data_alloc: 0 data_used: 0
2022-05-26 17:43:41.165183 7feea4f20700  5 bluestore.MempoolThread(0x55e2c5293bf0) _tune_cache_size target: 4294967296 heap: 17317584896 unmapped: 1228800 mapped: 17316356096 old cache_size: 134217728 new cache size: 134217728
2022-05-26 17:43:45.975669 7feea4f20700  5 bluestore.MempoolThread(0x55e2c5293bf0) _trim_shards cache_size: 134217728 kv_alloc: 134217728 kv_used: 134179922 meta_alloc: 0 meta_used: 11572236451 data_alloc: 0 data_used: 0
2022-05-26 17:43:46.176221 7feea4f20700  5 bluestore.MempoolThread(0x55e2c5293bf0) _tune_cache_size target: 4294967296 heap: 17317584896 unmapped: 1228800 mapped: 17316356096 old cache_size: 134217728 new cache size: 134217728
2022-05-26 17:43:50.986562 7feea4f20700  5 bluestore.MempoolThread(0x55e2c5293bf0) _trim_shards cache_size: 134217728 kv_alloc: 134217728 kv_used: 134179922 meta_alloc: 0 meta_used: 11572236451 data_alloc: 0 data_used: 0
2022-05-26 17:43:51.186998 7feea4f20700  5 bluestore.MempoolThread(0x55e2c5293bf0) _tune_cache_size target: 4294967296 heap: 17317584896 unmapped: 1228800 mapped: 17316356096 old cache_size: 134217728 new cache size: 134217728
2022-05-26 17:43:55.997303 7feea4f20700  5 bluestore.MempoolThread(0x55e2c5293bf0) _trim_shards cache_size: 134217728 kv_alloc: 134217728 kv_used: 134179922 meta_alloc: 0 meta_used: 11572236451 data_alloc: 0 data_used: 0
2022-05-26 17:43:56.197714 7feea4f20700  5 bluestore.MempoolThread(0x55e2c5293bf0) _tune_cache_size target: 4294967296 heap: 17317584896 unmapped: 1228800 mapped: 17316356096 old cache_size: 134217728 new cache size: 134217728
2022-05-26 17:44:01.007777 7feea4f20700  5 bluestore.MempoolThread(0x55e2c5293bf0) _trim_shards cache_size: 134217728 kv_alloc: 134217728 kv_used: 134179922 meta_alloc: 0 meta_used: 11572236451 data_alloc: 0 data_used: 0
2022-05-26 17:44:01.208322 7feea4f20700  5 bluestore.MempoolThread(0x55e2c5293bf0) _tune_cache_size target: 4294967296 heap: 17317584896 unmapped: 1228800 mapped: 17316356096 old cache_size: 134217728 new cache size: 134217728
2022-05-26 17:44:06.018235 7feea4f20700  5 bluestore.MempoolThread(0x55e2c5293bf0) _trim_shards cache_size: 134217728 kv_alloc: 134217728 kv_used: 134179922 meta_alloc: 0 meta_used: 11572236451 data_alloc: 0 data_used: 0
2022-05-26 17:44:06.218651 7feea4f20700  5 bluestore.MempoolThread(0x55e2c5293bf0) _tune_cache_size target: 4294967296 heap: 17317584896 unmapped: 1228800 mapped: 17316356096 old cache_size: 134217728 new cache size: 134217728
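
The log shows the autotuner has already shrunk the cache to 134217728 bytes (128 MiB) while the mapped heap stays around 16 GiB, so the tuner cannot reclaim the memory it does not control. A minimal sketch for watching that over time (the log path and osd id are examples, and the awk field positions assume the exact Luminous log format shown above):

# extract timestamp, target, mapped heap and the resulting cache size from _tune_cache_size lines
grep _tune_cache_size /var/log/ceph/ceph-osd.374.log | \
    awk '{print $1, $2, "target:", $8, "mapped:", $14, "new_cache_size:", $NF}'
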
Actions #4

Updated by Igor Fedotov almost 2 years ago

Hard to say without additional debugging, but your issue reminds me of another case with a bug in onode cache trimming: https://tracker.ceph.com/issues/48729

And the Luminous release on your cluster is definitely VERY outdated. It has been out of support for some years already.
Not to mention that 12.2.12 is not even the latest Luminous(!) minor release...
Given that, I doubt you can get much support with it.

Actions #5

Updated by dovefi Z almost 2 years ago

Igor Fedotov wrote:

Hard to say without additional debugging, but your issue reminds me of another case with a bug in onode cache trimming: https://tracker.ceph.com/issues/48729

And the Luminous release on your cluster is definitely VERY outdated. It has been out of support for some years already.
Not to mention that 12.2.12 is not even the latest Luminous(!) minor release...
Given that, I doubt you can get much support with it.

Yes, very similar to this bug. The OSD memory keeps growing during scrub and deep scrub.

Actions #6

Updated by dovefi Z almost 2 years ago

dovefi Z wrote:

Igor Fedotov wrote:

Hard to say without additional debugging, but your issue reminds me of another case with a bug in onode cache trimming: https://tracker.ceph.com/issues/48729

And the Luminous release on your cluster is definitely VERY outdated. It has been out of support for some years already.
Not to mention that 12.2.12 is not even the latest Luminous(!) minor release...
Given that, I doubt you can get much support with it.

Can you give some debugging advice? It is hard for us to upgrade a Ceph cluster in production.

Actions #7

Updated by Igor Fedotov almost 2 years ago

You can try to temporarily raise the debug-bluestore level to 20, leave it running for a minute or two, and grep the OSD log for the "maximum skip pinned reached" pattern. If it is present, then you are highly likely facing the bug I shared above. You can try to work around that misbehavior by raising the bluestore_cache_trim_max_skip_pinned config parameter to a pretty high value (e.g. 100000) and see if it helps.

Do not forget to revert debug-bluestore back to the default once the log analysis is completed, as the log might grow pretty large.
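
For reference, a minimal sketch of that procedure on Luminous (osd.374 and the log path are examples; injectargs is used since this release predates centralized `ceph config set`, and the workaround value is the one suggested above):

# temporarily raise bluestore logging, wait a couple of minutes, then revert
ceph tell osd.374 injectargs '--debug_bluestore 20/20'
sleep 120
ceph tell osd.374 injectargs '--debug_bluestore 1/5'
# look for the trimming-bug signature
grep -c 'maximum skip pinned reached' /var/log/ceph/ceph-osd.374.log
# if it shows up, try the workaround (may also be set in ceph.conf and applied via restart)
ceph tell osd.374 injectargs '--bluestore_cache_trim_max_skip_pinned 100000'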

Actions #8

Updated by dovefi Z almost 2 years ago

Igor Fedotov wrote:

You can try to temporarily raise the debug-bluestore level to 20, leave it running for a minute or two, and grep the OSD log for the "maximum skip pinned reached" pattern. If it is present, then you are highly likely facing the bug I shared above. You can try to work around that misbehavior by raising the bluestore_cache_trim_max_skip_pinned config parameter to a pretty high value (e.g. 100000) and see if it helps.

Do not forget to revert debug-bluestore back to the default once the log analysis is completed, as the log might grow pretty large.

Thank you for your advice.

Actions #9

Updated by dovefi Z almost 2 years ago

Igor Fedotov wrote:

Hard to say without additional debugging, but your issue reminds me of another case with a bug in onode cache trimming: https://tracker.ceph.com/issues/48729

And the Luminous release on your cluster is definitely VERY outdated. It has been out of support for some years already.
Not to mention that 12.2.12 is not even the latest Luminous(!) minor release...
Given that, I doubt you can get much support with it.

We have confirmed that this is the same bug as https://tracker.ceph.com/issues/48729. Changing the bluestore_cache_trim_max_skip_pinned config or backporting the PR fixes this problem.

Thank you very much!
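
For completeness, a sketch of persisting the workaround across OSD restarts; Luminous predates `ceph config set`, so ceph.conf plus a restart is the usual route (paths, the option value, and the append-style edit are assumptions — in practice merge the option into your existing [osd] section):

# add the option to ceph.conf (merge into the existing [osd] section if you have one)
printf '\n[osd]\nbluestore_cache_trim_max_skip_pinned = 100000\n' >> /etc/ceph/ceph.conf
# restart the affected OSD(s) to apply it
systemctl restart ceph-osd@374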

Actions #10

Updated by Igor Fedotov almost 2 years ago

  • Is duplicate of Bug #48729: Bluestore memory leak on scrub operations added
Actions #11

Updated by Igor Fedotov almost 2 years ago

  • Status changed from New to Duplicate
Actions #12

Updated by Igor Fedotov almost 2 years ago

  • Project changed from Ceph to bluestore
  • Category deleted (OSD)
