luminous: Apparent Memory Leak in OSD
Since the last update (late October), I have been seeing an apparent memory leak in the OSD process on two Ceph servers in a small-business environment.
Debian stretch, kernel 4.9.0-8-amd64, luminous with Bluestore.
Two servers, each with two OSD daemons - 8 TB storage each (2x4TB) and 8GB RAM.
Memory use of each OSD process has been observed to grow at about 100 MB per hour; I have been rebooting the servers when each OSD process approaches 50% of physical memory. After a reboot they return to about 10% use each and begin growing again.
Since no one else has reported this, I believe it is likely due to my configuration, but both systems have been extremely stable up to this point.
Maybe related to my non-optimal replicas? (triple replicas on two servers)
- Subject changed from Apparent Memory Leak in OSD to luminous: Apparent Memory Leak in OSD
- Status changed from New to Need More Info
Can you dump the mempools (ceph daemon osd.NNN dump_mempools) several times over the growth of the process so we can see what is consuming the memory?
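A minimal sketch of how such dumps could be collected and compared, assuming Python is available on the OSD host and the admin socket is reachable. The helper names (dump_mempools, pool_bytes, growth) are hypothetical, and the JSON layout is an assumption: the code accepts pools either as top-level keys or nested under "mempool" -> "by_pool", since the layout may differ between releases.

```python
import json
import subprocess


def dump_mempools(osd_id):
    """Run `ceph daemon osd.<id> dump_mempools` and return the parsed JSON."""
    out = subprocess.check_output(
        ["ceph", "daemon", "osd.%d" % osd_id, "dump_mempools"]
    )
    return json.loads(out)


def pool_bytes(dump):
    """Return {pool_name: bytes} for one dump.

    Assumption: pools are either top-level keys or nested under
    "mempool" -> "by_pool"; the "total" entry is skipped.
    """
    pools = dump.get("mempool", {}).get("by_pool", dump)
    return {
        name: stats["bytes"]
        for name, stats in pools.items()
        if isinstance(stats, dict) and "bytes" in stats and name != "total"
    }


def growth(before, after):
    """Return (pool, byte delta) pairs between two dumps, largest growth first."""
    b, a = pool_bytes(before), pool_bytes(after)
    deltas = {name: a[name] - b.get(name, 0) for name in a}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)
```

Calling dump_mempools(0) once an hour and passing the first and the latest result to growth() would show which pools account for the increase.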
#9 Updated by John Jaser 2 months ago
Konstantin: thanks for pointing that out; that looks like the issue. Both OSD servers have 8GB RAM total, each running two OSD daemons, so the default osd_memory_target setting of 4294967296 (4 GiB) per OSD leaves no headroom for the OS. (It also sort of breaks the 1GB RAM per TB storage rule of thumb for my setup.)
I changed the setting to osd_memory_target = 2684354560 (2.5 GiB).
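The arithmetic behind this change can be sketched as follows. This assumes the nominal 8 GiB of RAM (usable RAM is somewhat lower) and that each OSD sits exactly at its target; osd_memory_target is a best-effort target rather than a hard limit, so real usage can overshoot. The function name headroom_bytes is hypothetical.

```python
GIB = 1024 ** 3


def headroom_bytes(total_ram, osd_count, osd_memory_target):
    """RAM left for the OS and other processes if every OSD sits at its target."""
    return total_ram - osd_count * osd_memory_target


total_ram = 8 * GIB  # nominal figure from the thread

# Default target: 4 GiB per OSD, two OSDs per server.
default_headroom = headroom_bytes(total_ram, 2, 4294967296)

# Tuned target: 2.5 GiB per OSD, two OSDs per server.
tuned_headroom = headroom_bytes(total_ram, 2, 2684354560)

print(default_headroom / GIB)  # 0.0
print(tuned_headroom / GIB)    # 3.0
```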
After 55 hours of uptime, free memory is about 2.1G, which means usage is just slightly over target, and it looks stable.
Thanks to all.
#11 Updated by Konstantin Shalygin 2 months ago
I took dumps while tuning the osd_memory_target value. Perhaps this data will be useful in the future.
With osd_memory_target set to 3GB, memory consumption is about 7-9% higher than with the default BlueStore settings in 12.2.8.