Bug #58106

open

When a large number of error ops appear in the OSDs, pglog does not trim.

Added by 王子敬 wang over 1 year ago. Updated over 1 year ago.

Status:
Need More Info
Priority:
Normal
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
rados
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When we use the S3 append and copy interfaces of the object gateway under high pressure and concurrency, a large number of error ops appear in the OSDs. We have an S3 cluster whose OSDs run out of memory because of the large amount of RAM needed to hold pglog entries, so the OSDs are OOM-killed. Because a large number of error pglog entries are never written to the hard disk, the pglog trim mechanism fails: pglog does not trim error ops. How can this problem be solved?
(This is on an OSD with osd_memory_target = 2GB, and the OSD has 223 PGs.)
osd_max_pg_log_entries = 10000
osd_min_pg_log_entries = 250
osd_pg_log_trim_max = 10000
osd_pg_log_trim_min = 100
osd_target_pg_log_entries_per_osd = 300000
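The pglog memory shown in the attached dump_mempools screenshots can also be checked programmatically. Below is a minimal sketch, assuming the JSON layout printed by `ceph daemon osd.<id> dump_mempools` (the `osd_pglog` pool under `mempool.by_pool`); the helper names are illustrative, not part of Ceph:

```python
import json
import subprocess


def pglog_mempool_usage(dump_json: str) -> tuple[int, int]:
    """Return (items, bytes) held by the osd_pglog mempool, parsed from
    the JSON that `ceph daemon osd.<id> dump_mempools` prints."""
    pools = json.loads(dump_json)["mempool"]["by_pool"]
    pg = pools["osd_pglog"]
    return pg["items"], pg["bytes"]


def dump_osd_pglog(osd_id: int) -> tuple[int, int]:
    """Query a local OSD over its admin socket (must run on the OSD host)."""
    out = subprocess.check_output(
        ["ceph", "daemon", f"osd.{osd_id}", "dump_mempools"], text=True
    )
    return pglog_mempool_usage(out)
```

Sampling this per OSD over time would show whether osd_pglog bytes keep growing past what osd_max_pg_log_entries should allow, which is the trimming failure described above.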


Files

pg_dump.txt (5.63 KB) pg_dump.txt ceph pg dump partial output 王子敬 wang, 11/30/2022 01:44 AM
2.1a0s2_pglog (416 KB) 2.1a0s2_pglog 王子敬 wang, 11/30/2022 08:31 AM
1668130751396.jpg (106 KB) 1668130751396.jpg top osd memory 王子敬 wang, 12/02/2022 01:03 AM
1666334986727.jpg (3.33 KB) 1666334986727.jpg the pglog in the momory 王子敬 wang, 12/02/2022 01:10 AM
image-2022-10-13-15-02-19-462.png (54.9 KB) image-2022-10-13-15-02-19-462.png dump_mempools 王子敬 wang, 12/06/2022 01:17 AM
image-2022-10-13-15-02-36-902.png (75.2 KB) image-2022-10-13-15-02-36-902.png 王子敬 wang, 12/06/2022 01:26 AM
image-2022-10-13-15-01-58-184.png (145 KB) image-2022-10-13-15-01-58-184.png 王子敬 wang, 12/06/2022 01:26 AM
