Setting values depending on device class is described here: https://docs.ceph.com/docs/mimic/rados/configuration/ceph-conf/#sections-and-masks .
However, it does not seem to work. Here are the steps I took:
1. Start OSDs with the default value for osd_memory_target (4G). All OSDs start consuming close to 4G, as expected.
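For reference, the value a running daemon is actually using can be read from its admin socket on the OSD's host (osd.96 picked arbitrarily from this host); with the default it reports 4294967296 (4 GiB):
ceph daemon osd.96 config get osd_memory_target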
2. Set osd_memory_target to 2G and wait. The expected behaviour is that the OSDs start reducing their memory consumption. This does not happen and was the reason for this ticket.
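The value was set on the plain osd section (the same setting removed again in step 4):
ceph config set osd osd_memory_target 2147483648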
3. Restart all OSDs. Now the new value for osd_memory_target takes effect and OSDs stay below 2G.
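On a systemd-based deployment this amounts to restarting the OSD units per host, e.g.:
systemctl restart ceph-osd.target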
Next I tried the device-class setting using masks, but this does not seem to work as intended.
4. Remove the osd_memory_target setting from step 2 with
ceph config rm osd osd_memory_target
5. Set a class-specific target:
ceph config set osd/class:hdd osd_memory_target 2147483648
6. A config dump now contains this line:
WHO MASK LEVEL OPTION VALUE RO
osd class:hdd basic osd_memory_target 2147483648
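Assuming ceph config get accepts a daemon name plus option name, the monitor's view of what a specific HDD OSD should receive can be cross-checked; the expectation for osd.227 is 2147483648:
ceph config get osd.227 osd_memory_target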
7. Restart an OSD backed by an HDD. The expected behaviour is that this OSD stays below 2G like all the others. Unfortunately, this does not work as documented (output from top):
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
785197 ceph 20 0 5672756 3.9g 9564 S 0.0 6.3 236:23.47 /usr/bin/ceph-osd --cluster ceph -f -i 34 --setuser ceph --setgro+
979905 ceph 20 0 4355268 3.0g 22952 S 1.0 4.7 4:25.64 /usr/bin/ceph-osd --cluster ceph -f -i 227 --setuser ceph --setgr+
898846 ceph 20 0 2996032 1.9g 8988 S 1.0 3.0 173:10.26 /usr/bin/ceph-osd --cluster ceph -f -i 96 --setuser ceph --setgro+
900352 ceph 20 0 2948024 1.9g 9008 S 1.7 3.0 166:09.68 /usr/bin/ceph-osd --cluster ceph -f -i 206 --setuser ceph --setgr+
897236 ceph 20 0 2892108 1.8g 8464 S 1.0 2.9 149:46.43 /usr/bin/ceph-osd --cluster ceph -f -i 225 --setuser ceph --setgr+
895713 ceph 20 0 3037724 1.8g 8692 S 1.3 2.9 144:59.47 /usr/bin/ceph-osd --cluster ceph -f -i 198 --setuser ceph --setgr+
899216 ceph 20 0 3001248 1.8g 8652 S 1.3 2.9 138:01.32 /usr/bin/ceph-osd --cluster ceph -f -i 200 --setuser ceph --setgr+
894927 ceph 20 0 2973764 1.8g 8804 S 0.7 2.8 163:52.28 /usr/bin/ceph-osd --cluster ceph -f -i 194 --setuser ceph --setgr+
896893 ceph 20 0 3004720 1.8g 8524 S 1.0 2.8 178:03.53 /usr/bin/ceph-osd --cluster ceph -f -i 222 --setuser ceph --setgr+
899659 ceph 20 0 2970016 1.7g 9332 S 0.7 2.8 150:56.23 /usr/bin/ceph-osd --cluster ceph -f -i 202 --setuser ceph --setgr+
897651 ceph 20 0 2934408 1.7g 8644 S 3.7 2.7 161:30.68 /usr/bin/ceph-osd --cluster ceph -f -i 133 --setuser ceph --setgr+
895245 ceph 20 0 2952512 1.7g 9096 S 1.0 2.7 204:53.27 /usr/bin/ceph-osd --cluster ceph -f -i 196 --setuser ceph --setgr+
898033 ceph 20 0 2939056 1.7g 9056 S 1.0 2.6 160:57.36 /usr/bin/ceph-osd --cluster ceph -f -i 120 --setuser ceph --setgr+
898441 ceph 20 0 2906020 1.6g 8264 S 1.3 2.6 149:21.77 /usr/bin/ceph-osd --cluster ceph -f -i 109 --setuser ceph --setgr+
899995 ceph 20 0 2947384 1.6g 9164 S 0.7 2.6 143:57.95 /usr/bin/ceph-osd --cluster ceph -f -i 204 --setuser ceph --setgr+
896046 ceph 20 0 2972096 1.6g 9076 S 1.3 2.6 181:17.36 /usr/bin/ceph-osd --cluster ceph -f -i 224 --setuser ceph --setgr+
OSD 34 is on SSD, so the 3.9G is OK. OSD 227 is the HDD-backed OSD I restarted in step 7; it does not use the value set with the class:hdd mask. All the other OSDs were restarted in step 3, while osd_memory_target was still set to 2G.
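To rule out a reporting artefact in top, the value the restarted daemon is actually running with can be read from its admin socket on the OSD's host; if the class:hdd mask were applied, this should print 2147483648 for osd.227:
ceph daemon osd.227 config get osd_memory_target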
Here is the crush tree for this host:
-136 134.01401 host ceph-16
96 hdd 8.90999 osd.96
109 hdd 8.90999 osd.109
120 hdd 8.90999 osd.120
133 hdd 8.90999 osd.133
194 hdd 8.90999 osd.194
196 hdd 8.90999 osd.196
198 hdd 8.90999 osd.198
200 hdd 8.90999 osd.200
202 hdd 8.90999 osd.202
204 hdd 8.90999 osd.204
206 hdd 8.90999 osd.206
222 hdd 8.90999 osd.222
224 hdd 8.90999 osd.224
225 hdd 8.90999 osd.225
227 hdd 8.90999 osd.227
34 ssd 0.36400 osd.34
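Just to rule out a device-class mismatch, membership in the hdd class can also be listed directly (osd 227 is expected to appear here):
ceph osd crush class ls-osd hdd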
Now it looks like there are actually two bugs here:
- Changing osd_memory_target at runtime requires a restart to take effect, but it should not.
- Setting config values with masks does not work at all, restart or not.
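As an untested workaround for the first point, the value could presumably also be pushed to the running daemons directly, bypassing the config database:
ceph tell osd.* injectargs '--osd_memory_target 2147483648'
I have not verified whether osd_memory_target actually picks this up at runtime.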
I don't use Nautilus, so I can't comment on the behaviour there. For Mimic it seems to work once the OSDs get the correct value. As I wrote above, the priority is getting config values with masks working as documented. I personally can live with the restarts if masks work properly.