Feature #1583
osd: bound pg log memory usage
% Done: 0%
Description
Back in 0.34, my cluster with 3210 PGs, 3-plicated across 3 OSDs, required some 4GB of RAM for the OSDs, i.e., with all 3 OSDs up it would use 1.3GB on each, 2GB each with 2 OSDs, or 4GB for a single OSD.
With 0.35 and 3594 PGs, each OSD eats up 4.5GB of RAM just reading the local state, before even trying to contact a monitor. Two OSDs complete recovery using some 6 or 7GB each, and a single OSD skyrockets to 14+ GB before even moving PGs to peering. I could never complete recovery with a single 0.35 OSD :-(
I'm attaching the results of some memory profiling. I selected 3 relevant snapshots in the following sequence of events:
0. I brought osd.2 down and let the others recover, then brought them all down
1. I started osd.2 with memory profiling, and took snapshot alldown when it completed transitioning all PGs to crashed+down+degraded+peering
2. I started osd.1, and took snapshot cleanboth of osd.2 when all PGs were active+clean+degraded
3. I stopped osd.1, and took snapshot recovering of osd.2 when I ran out of time
Then I generated graphs out of each snapshot, as well as graphs with the incremental memory use between consecutive snapshots. They're all attached.
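(For readers reproducing this: the incremental graphs are simply per-call-site deltas between consecutive heap snapshots. A minimal sketch of that diffing step, where the snapshot representation and the site names are illustrative, not the real profiler output format:)

```python
# Sketch: compute incremental memory use between two heap snapshots.
# Each snapshot is modeled as {call_site: bytes_in_use}; real snapshots
# would come from a heap profiler and be parsed into this shape first.

def snapshot_delta(before, after):
    """Return per-site byte deltas from `before` to `after`, skipping zeros."""
    sites = set(before) | set(after)
    return {s: after.get(s, 0) - before.get(s, 0)
            for s in sites
            if after.get(s, 0) != before.get(s, 0)}

# Hypothetical numbers for two of the snapshots described above.
alldown   = {"PG::read_log": 900 << 20, "OndiskLog::block_map": 2 << 30}
cleanboth = {"PG::read_log": 900 << 20, "OndiskLog::block_map": 5 << 30,
             "MOSDMap::decode": 100 << 20}

# Growth between the 'alldown' and 'cleanboth' snapshots, largest first:
for site, d in sorted(snapshot_delta(alldown, cleanboth).items(),
                      key=lambda kv: -kv[1]):
    print(f"{site}: {d / (1 << 20):+.0f} MB")
```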
History
#1 Updated by Alexandre Oliva over 12 years ago
- File clean3-cleandegraded2.pdf added
- File cleandegraded2-recovering1.pdf added
- File clean3-recovering1.pdf added
- File beforemon.pdf added
- File clean3.pdf added
- File cleandegraded2.pdf added
- File recovering1.pdf added
- File beforemon-clean3.pdf added
For comparison, here are some graphs of memory use in 0.34, with the 3210-PG cluster, given the following events:
1. started osd.0 with memory profiling and let it run until it tried to contact mon0 (beforemon)
2. started osd.1 and osd.2 without profiling, and let them all recover until all PGs (all 3-plicated) were active+clean (clean3)
3. killed osd.2 and let the other two recover until all PGs were active+clean+degraded (cleandegraded2)
4. killed osd.1 and let osd.0 recover until all PGs were active+clean+degraded and the degraded factor had reached 66.667%; it hit peak memory use a while before recovery was done (recovering1)
I'm sorry that these graphs aren't as detailed as the ones for 0.35; I didn't keep debug info for 0.34.
#2 Updated by Sage Weil over 12 years ago
- Category set to OSD
- Assignee set to Sage Weil
- Target version set to v0.37
Just started looking at this, but alldown-to-cleanboth points to the Ondisklog::block_map, which is a stupid piece of low-hanging fruit; cleaning that up now.
#3 Updated by Sage Weil over 12 years ago
- Tracker changed from Bug to Feature
- Subject changed from excessive memory footprint growth in 0.35 to osd: bound pg log memory usage
- Assignee deleted (Sage Weil)
- Target version changed from v0.37 to v0.38
I don't think this is a new problem, but it is a problem!
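(The feature as retitled, bounding pg log memory usage, amounts to trimming each PG's in-memory log once it exceeds a configured cap. A toy sketch of the idea, not Ceph's actual implementation; the class and parameter names here are illustrative:)

```python
from collections import deque

class BoundedPGLog:
    """Toy PG log that keeps at most `max_entries` in memory.

    Older entries are trimmed once the cap is exceeded; `tail` records
    the last trimmed version, so the in-memory footprint stays bounded
    no matter how long the OSD runs.
    """
    def __init__(self, max_entries=3000):
        self.max_entries = max_entries
        self.entries = deque()
        self.tail = 0  # everything at or before this version was trimmed

    def append(self, version, op):
        self.entries.append((version, op))
        while len(self.entries) > self.max_entries:
            self.tail, _ = self.entries.popleft()

log = BoundedPGLog(max_entries=100)
for v in range(1, 1001):
    log.append(v, "modify")
print(len(log.entries), log.tail)  # stays at 100 entries; tail advances
```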
#4 Updated by Sage Weil over 12 years ago
- Position set to 939
#5 Updated by Sage Weil over 12 years ago
- Target version changed from v0.38 to v0.39
#6 Updated by Sage Weil over 12 years ago
- Position deleted (952)
- Position set to 1
- Position changed from 1 to 967
#7 Updated by Sage Weil over 12 years ago
- Position deleted (968)
- Position set to 1
- Position changed from 1 to 968
#8 Updated by Sage Weil over 12 years ago
- Position deleted (968)
- Position set to 969
#9 Updated by Sage Weil over 12 years ago
- Target version deleted (v0.39)
- Position deleted (970)
- Position set to 115
#10 Updated by Loïc Dachary over 9 years ago
- Status changed from New to Resolved
Memory consumption has improved/changed a lot since this ticket was opened, and I believe this issue is no longer relevant.