Bug #20339

rgw stops serving requests after receiving SIGHUP

Added by Yuri Gorshkov about 2 years ago. Updated about 2 years ago.

Status: Need More Info
Priority: Normal
Assignee: -
Target version: -
Start date: 06/19/2017
Due date:
% Done: 0%
Source:
Tags:
Backport:
Regression: Yes
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

Hi.
We're running Ceph kraken (v11.2.0) on CentOS 7 and have noticed that radosgw stops processing requests after it receives a SIGHUP.
The logs show the following messages:

2017-06-19 00:01:03.592113 7f832be9e700  1 rgw realm reloader: Frontends paused
2017-06-19 00:01:03.592339 7f8349eda700  0 ERROR: failed to clone shard, completion_mgr.get_next() returned ret=-125
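
The ret=-125 in the second message looks like a negated errno value, a convention commonly used for return codes in Ceph. On Linux, errno 125 is ECANCELED. The following is a minimal sketch for decoding such a value; the helper name `describe_ret` is hypothetical, and the number-to-name mapping depends on the platform the script runs on.

```python
import errno
import os
from typing import Tuple


def describe_ret(ret: int) -> Tuple[str, str]:
    """Decode a negated-errno return code into (symbolic name, message).

    Assumes the value follows the "negative errno" convention; the lookup
    uses the errno table of the machine running this script.
    """
    code = abs(ret)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # Value taken from the log line above.
    print(describe_ret(-125))
```

Run on the same OS family as the rgw host, since errno numbering differs between platforms.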

Related issues

Related to rgw - Bug #20686: rgw hangs in RGWRealmReloader::reload on SIGHUP Resolved 07/19/2017

History

#1 Updated by Casey Bodley about 2 years ago

SIGHUP will pause frontends while it reloads the realm/period configuration for multisite. Are you not seeing a corresponding 'Resuming frontends with new realm configuration.' message in the log?
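
One way to check for this is to scan the rgw log for each 'Frontends paused' line and see whether a resume message follows it. The sketch below is only an illustration: the two marker strings are copied from this ticket, while the function name `find_unresumed_pauses` and the idea of passing the log path on the command line are assumptions, not part of any Ceph tooling.

```python
import sys
from typing import Iterable, List

# Marker strings as quoted in this ticket.
PAUSED = "rgw realm reloader: Frontends paused"
RESUMED = "Resuming frontends with new realm configuration."


def find_unresumed_pauses(lines: Iterable[str]) -> List[str]:
    """Return every 'Frontends paused' line that has no later resume
    message before the next pause (or before the end of the log)."""
    unresumed: List[str] = []
    pending = None  # most recent pause line still waiting for a resume
    for line in lines:
        if PAUSED in line:
            if pending is not None:
                # A second pause arrived without a resume in between.
                unresumed.append(pending)
            pending = line
        elif RESUMED in line:
            pending = None
    if pending is not None:
        unresumed.append(pending)
    return unresumed


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python check_reload.py <path-to-rgw-log>
    with open(sys.argv[1], errors="replace") as log:
        for hit in find_unresumed_pauses(log):
            print("pause without resume:", hit.rstrip())
```

Any line it prints marks a reload that paused the frontends and never resumed them, which matches the hang described in this report.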

#2 Updated by Casey Bodley about 2 years ago

  • Status changed from New to Need More Info

#3 Updated by Yehuda Sadeh about 2 years ago

I vaguely remember seeing something similar, maybe this:
http://tracker.ceph.com/issues/19834

#4 Updated by Yuri Gorshkov about 2 years ago

Hi Casey,

No, I don't see the 'Resuming...' message when the frontend hangs. For now we've resorted to restarting the rgw daemons whenever they hang.

#5 Updated by Casey Bodley about 2 years ago

  • Related to Bug #20686: rgw hangs in RGWRealmReloader::reload on SIGHUP added
