Bug #38479

Updated by Casey Bodley about 5 years ago

When multiple gateways run in the same zone of a multisite configuration, they use leases to coordinate sync with each other. While one gateway holds a lease, the other gateways keep polling until the lease becomes available. Each failed poll attempt leaves behind a spawned coroutine stack, so memory grows and is not reclaimed until shutdown. The dump below shows the stacks accumulated under a single RGWDataSyncShardControlCR:

 <pre> 
 ~/ceph/build $ bin/ceph daemon run/c2/out/radosgw.8002.asok cr dump | jq .coroutine_managers[0].run_contexts[0].entries[1].ops[0] 
 {
   "type": "25RGWDataSyncShardControlCR", 
   "spawned": [ 
     "0x55ab5bdb7860", 
     "0x55ab5c258780", 
     "0x55ab5c3565a0", 
     "0x55ab5c357680", 
     "0x55ab5c357b30", 
     "0x55ab5c364e10", 
     "0x55ab5c3650e0", 
     "0x55ab5c3652c0", 
     "0x55ab5c3654a0", 
     "0x55ab5c422e10", 
     "0x55ab5c4231d0", 
     "0x55ab5c4234a0", 
     "0x55ab5c423770", 
     "0x55ab5c4914a0", 
     "0x55ab5c491770", 
     "0x55ab5c491a40", 
     "0x55ab5c491d10", 
     "0x55ab5c5013b0", 
     "0x55ab5c501680", 
     "0x55ab5c501950", 
     "0x55ab5c501c20", 
     "0x55ab5c501ef0", 
     "0x55ab5c57bd10", 
     "0x55ab5b734960", 
     "0x55ab5c5ec5a0", 
     "0x55ab5c5ec690", 
     "0x55ab5c5ec780", 
     "0x55ab5c5ec870", 
     "0x55ab5c5ec960", 
     "0x55ab5c5eca50", 
     "0x55ab5c5ecb40", 
     "0x55ab5c5ecc30" 
   ] 
 } 
 </pre> 
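To watch this growth over time without picking out one op by hand, the same dump can be summarized per op. The following is a minimal sketch, not part of the fix: the nesting (coroutine_managers, run_contexts, entries, ops) and the "type" and "spawned" keys are taken from the jq path and output above, while the function names are hypothetical. It assumes that some ops may lack those keys and treats missing ones as empty.

```python
import json


def walk_ops(dump):
    """Yield every op dict found under coroutine_managers/run_contexts/entries/ops."""
    for manager in dump.get("coroutine_managers", []):
        for ctx in manager.get("run_contexts", []):
            for entry in ctx.get("entries", []):
                for op in entry.get("ops", []):
                    yield op


def spawned_counts(dump):
    """Return (type, number of spawned stacks) pairs, largest count first."""
    counts = [
        (op.get("type", "?"), len(op.get("spawned", [])))
        for op in walk_ops(dump)
    ]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)


def report(stream):
    """Read a 'cr dump' JSON document from a text stream and print one line per op."""
    for op_type, count in spawned_counts(json.load(stream)):
        print(f"{count:6d}  {op_type}")
```

Calling report(sys.stdin) with the output of the cr dump command piped in prints the ops sorted by spawned count. If the count for RGWDataSyncShardControlCR keeps rising between runs while another gateway holds the lease, that matches the behavior described here.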

 https://github.com/ceph/ceph/pull/26639
