Bug #44924

High memory usage in fsck/repair

Added by Igor Fedotov almost 4 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport: pacific octopus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This issue was originally reported at:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/JSUDXTQWBAPXTCLM5PTJTJKTSNS5H7YG/

Preliminary investigation revealed that the OSD has 35817948 shared blobs; see num_shared_shards in the attached calc_objectstore_db_histogram.

Analysis of the fsck code shows that BlueStore's fsck builds intermediate in-memory data structures around these entries, see:

sb_info_map_t sb_info;
in BlueStore::_fsck_on_open
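To give a feel for the scale, here is a back-of-the-envelope sketch of how much RAM a map holding one entry per shared blob might need. The per-entry size is a hypothetical placeholder passed in by the caller, not a figure measured from BlueStore's sb_info_map_t; only the entry count comes from this ticket.

```cpp
#include <cassert>
#include <cstdint>

// Rough estimate of the RAM needed to hold `entries` map entries at once.
// `bytes_per_entry` is a caller-supplied guess (key + value + node overhead),
// NOT a value measured from BlueStore.
constexpr uint64_t estimate_fsck_map_bytes(uint64_t entries,
                                           uint64_t bytes_per_entry) {
  return entries * bytes_per_entry;
}

// Shared blob count reported in this ticket.
constexpr uint64_t kSharedBlobs = 35817948;

// With an assumed 100 bytes per entry the map alone needs about 3.3 GiB.
static_assert(estimate_fsck_map_bytes(kSharedBlobs, 100) == 3581794800ULL,
              "35817948 entries * 100 bytes");
```

Whatever the real per-entry cost is, it multiplies by tens of millions of entries here, which is why keeping all of them resident at once gets expensive.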

Perhaps we should revise this approach and cap the number of entries loaded into memory simultaneously. Would a per-PG (or per-shard?) analysis work?
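A minimal sketch of the per-PG idea, assuming shared blob references can be visited grouped by PG. All names here (SbRef, SbInfoSketch, check_per_pg) are hypothetical and are not BlueStore code, and this is not taken from the pull request attached to this ticket. The point is only that the map is cleared after each PG, so peak memory is bounded by the largest PG rather than by the whole OSD.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <vector>

// Hypothetical stand-in for the per-shared-blob state kept by fsck.
struct SbInfoSketch {
  uint64_t ref_count = 0;
};
using SbMap = std::map<uint64_t, SbInfoSketch>;

// One reference to a shared blob, tagged with the PG it belongs to.
struct SbRef {
  uint32_t pg;
  uint64_t sbid;
};

struct PassStats {
  std::size_t pgs_checked = 0;   // number of PGs handed to check_pg
  std::size_t peak_entries = 0;  // largest map size held at any one time
};

// Assumes `refs_grouped_by_pg` lists all references of one PG contiguously.
// Builds the map for one PG, hands it to `check_pg`, then drops it before
// moving on to the next PG.
inline PassStats check_per_pg(
    const std::vector<SbRef>& refs_grouped_by_pg,
    const std::function<void(uint32_t, const SbMap&)>& check_pg) {
  PassStats stats;
  SbMap current;
  bool have_pg = false;
  uint32_t current_pg = 0;

  auto flush = [&]() {
    if (!have_pg) return;
    stats.peak_entries = std::max(stats.peak_entries, current.size());
    check_pg(current_pg, current);
    current.clear();  // release this PG's entries before loading the next
    ++stats.pgs_checked;
  };

  for (const SbRef& r : refs_grouped_by_pg) {
    if (!have_pg || r.pg != current_pg) {
      flush();
      current_pg = r.pg;
      have_pg = true;
    }
    current[r.sbid].ref_count += 1;
  }
  flush();  // handle the last PG
  return stats;
}
```

The trade-off is that this only helps if references really can be enumerated PG by PG; if a single PG holds most of the shared blobs, the peak stays high and a finer split (per shard or per id range) would be needed.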

calc_objectstore_db_histogram.txt (84.7 KB) Igor Fedotov, 04/03/2020 10:35 AM


Related issues

Copied to bluestore - Backport #53890: pacific: High memory usage in fsck/repair Resolved
Copied to bluestore - Backport #53891: octopus: High memory usage in fsck/repair Resolved

History

#2 Updated by Igor Fedotov almost 4 years ago

  • Source set to Community (user)

#3 Updated by Igor Fedotov over 2 years ago

  • Status changed from New to Fix Under Review
  • Assignee set to Igor Fedotov
  • Pull request ID set to 43667

#4 Updated by Igor Fedotov over 2 years ago

  • Backport changed from octopus to pacific octopus

#5 Updated by Igor Fedotov about 2 years ago

  • Status changed from Fix Under Review to Pending Backport

#6 Updated by Backport Bot about 2 years ago

  • Copied to Backport #53890: pacific: High memory usage in fsck/repair added

#7 Updated by Backport Bot about 2 years ago

  • Copied to Backport #53891: octopus: High memory usage in fsck/repair added

#8 Updated by Igor Fedotov about 2 years ago

  • Status changed from Pending Backport to Resolved
