Bug #19446
closed
rgw: heavy memory leak when multisite sync fails on 10.2.6
Description
A heavy memory leak was found when site sync failed in an active-active multisite cluster (2 sites). Within about 2 hours, radosgw consumes about 60 GB of memory.
Each time the following log messages occurred, the radosgw RSS increased again.
2017-04-02 22:56:10.621369 7fa580fe1700 0 ERROR: a sync operation returned error
2017-04-02 22:56:14.772360 7fa580fe1700 0 ERROR: failure in sync, backing out (sync_status=-5)
2017-04-02 22:56:14.782136 7fa580fe1700 0 ERROR: failed to log sync failure in error repo: retcode=0
2017-04-02 22:56:14.782207 7fa580fe1700 0 rgw meta sync: ERROR: RGWBackoffControlCR called coroutine returned 5>fetch_remote_obj() returned r=-2
2017-04-02 22:57:53.071968 7fa580fe1700 0 ERROR: failed to read remote data log info: ret=-5
2017-04-02 22:57:53.165198 7fa580fe1700 0 rgw meta sync: ERROR: RGWBackoffControlCR called coroutine returned -5
2017-04-02 22:58:44.120891 7fa580fe1700 0 ERROR: failed to read remote data log info: ret=-5
2017-04-02 22:58:44.124339 7fa580fe1700 0 rgw meta sync: ERROR: RGWBackoffControlCR called coroutine returned -5
2017-04-02 22:59:25.327494 7fa580fe1700 0 ERROR: failed to read remote data log info: ret=-5
2017-04-02 22:59:25.406766 7fa580fe1700 0 rgw meta sync: ERROR: RGWBackoffControlCR called coroutine returned -5
2017-04-02 23:00:35.433212 7fa580fe1700 0 ERROR: failed to fetch remote data log info: ret=-5
2017-04-02 23:00:35.442030 7fa580fe1700 0 rgw meta sync: ERROR: RGWBackoffControlCR called coroutine returned -5
2017-04-02 23:00:55.583137 7fa580fe1700 0 ERROR: failed to read remote data log info: ret=-5
2017-04-02 23:00:55.587294 7fa580fe1700 0 rgw meta sync: ERROR: RGWBackoffControlCR called coroutine returned -5
2017-04-02 23:02:12.050433 7fa58bff7700 0 store
2017-04-02 23:02:12.050548 7fa580fe1700 0 ERROR: a sync operation returned error
2017-04-02 23:02:12.052318 7fa584fe9700 0 store->fetch_remote_obj() returned r=-2
2017-04-02 23:02:12.052447 7fa580fe1700 0 ERROR: a sync operation returned error
2017-04-02 23:02:12.054184 7fa5977fe700 0 store->fetch_remote_obj() returned r=-2
2017-04-02 23:02:12.054292 7fa580fe1700 0 ERROR: a sync operation returned error
2017-04-02 23:02:12.056197 7fa58cff9700 0 store->fetch_remote_obj() returned r=-2
2017-04-02 23:02:12.056307 7fa580fe1700 0 ERROR: a sync operation returned error
2017-04-02 23:02:12.064099 7fa5897f2700 0 store->fetch_remote_obj() returned r=-2
2017-04-02 23:02:12.064239 7fa580fe1700 0 ERROR: a sync operation returned error
2017-04-02 23:02:13.126223 7fa580fe1700 0 ERROR: failure in sync, backing out (sync_status=-2)
2017-04-02 23:02:13.233178 7fa580fe1700 0 WARNING: skipping data log entry for missing bucket aspen:1d0e03f4-f7fc-4ee6-a956-b66483526e3d.4741.4
2017-04-02 23:02:27.769720 7fa5887f0700 0 store->fetch_remote_obj() returned r=-2
2017-04-02 23:02:27.772560 7fa58ffff700 0 store->fetch_remote_obj() returned r=-2
2017-04-02 23:02:27.788009 7fa580fe1700 0 ERROR: a sync operation returned error
2017-04-02 23:02:27.788023 7fa580fe1700 0 ERROR: a sync operation returned error
2017-04-02 23:02:27.963422 7fa580fe1700 0 ERROR: failure in sync, backing out (sync_status=-2)
2017-04-02 23:02:28.093612 7fa580fe1700 0 WARNING: skipping data log entry for missing bucket aspen:1d0e03f4-f7fc-4ee6-a956-b66483526e3d.4741.4
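To line up the error bursts above with RSS growth, it may help to count the ERROR messages per minute and compare that against memory samples taken over the same period. The following is only a minimal sketch: the function name `tally_sync_errors` and the regular expression are assumptions made for illustration, based on the log layout shown in this report (date, time, thread id, level, message), and are not part of radosgw or any Ceph tool.

```python
import re
from collections import Counter

# Assumed layout, taken from the log excerpt above:
# "<date> <time>.<usec> <thread id> <level> <message>"
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2})\.\d+ "
    r"(?P<thread>[0-9a-f]+)\s+(?P<level>-?\d+) (?P<msg>.*)$"
)

def tally_sync_errors(lines):
    """Count, per minute, the log messages that contain 'ERROR'."""
    per_minute = Counter()
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m and "ERROR" in m.group("msg"):
            # Keep only "HH:MM" from the time field to bucket by minute.
            minute = f"{m.group('date')} {m.group('time')[:5]}"
            per_minute[minute] += 1
    return per_minute

# A few lines copied from the log above, used as sample input.
sample = [
    "2017-04-02 22:56:10.621369 7fa580fe1700 0 ERROR: a sync operation returned error",
    "2017-04-02 22:56:14.772360 7fa580fe1700 0 ERROR: failure in sync, backing out (sync_status=-5)",
    "2017-04-02 23:02:12.052318 7fa584fe9700 0 store->fetch_remote_obj() returned r=-2",
    "2017-04-02 23:02:12.052447 7fa580fe1700 0 ERROR: a sync operation returned error",
]

for minute, count in sorted(tally_sync_errors(sample).items()):
    print(minute, count)
```

Feeding the full radosgw log through the same function (for example by passing an open file object) gives per-minute error counts that can be compared against periodic RSS readings of the radosgw process.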