Bug #44499: rgw hung - rgw - Ceph

Bug #44499

closed

rgw hung

Added by Andrey Groshev about 4 years ago. Updated about 4 years ago.

Status: Won't Fix

Priority: Normal

Severity: 3 - minor

Affected Versions: Ceph - v14.2.8

ceph-qa-suite: rgw

Description

I have a cluster of 5 nodes.
The system pools are replicated, and the data pools use erasure coding 4+2 (min_size 5).
To do some work, I had to turn off one server.
I turned it off, and after a minute my whole cluster hung.
haproxy reports that the servers do not respond for more than 5 seconds.
I tried checking from the console with "rados ls -p .rgw.root",
and the request froze.
I understand that turning off one node left some of the PGs without enough shards (incomplete).
But the system pool is replicated, so reads from it should be available in any case.
How can I find out the reason?
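
The shard arithmetic behind the PGs that lost shards can be sketched as follows. This is a minimal illustration, not Ceph code: the function names are hypothetical, and the number of shards lost per PG is an assumption, since the report does not state the CRUSH failure domain or how shards were placed across the 5 nodes.

```python
# Hypothetical helpers illustrating erasure-coded PG availability.
# k = data shards, m = coding shards, min_size = shards required to serve I/O.

def pg_is_active(k: int, m: int, min_size: int, shards_lost: int) -> bool:
    """True if enough shards remain for the PG to keep serving I/O."""
    remaining = k + m - shards_lost
    return remaining >= min_size


def pg_is_recoverable(k: int, m: int, shards_lost: int) -> bool:
    """True if the data can still be reconstructed (at least k shards remain)."""
    return k + m - shards_lost >= k


# Values from the report: 4+2 erasure coding with min_size 5.
K, M, MIN_SIZE = 4, 2, 5

for lost in (1, 2):  # assumed number of shards of one PG on the stopped node
    print(
        f"lost={lost} "
        f"active={pg_is_active(K, M, MIN_SIZE, lost)} "
        f"recoverable={pg_is_recoverable(K, M, lost)}"
    )
```

A 4+2 profile needs 6 shards per PG, but there are only 5 nodes, so 6 shards cannot sit on 6 distinct nodes. If a node holds two shards of a PG, stopping that node leaves 4 shards, which is below min_size 5: I/O to that PG blocks even though the data is still recoverable.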

Updated by Casey Bodley about 4 years ago

Status changed from New to Won't Fix

radosgw depends on having a healthy ceph cluster, so this is expected behavior

Updated by Andrey Groshev about 4 years ago

Casey Bodley wrote:

radosgw depends on having a healthy ceph cluster, so this is expected behavior

What does "expected" mean? I have many other pools in good condition. Does the whole cluster hang because of one pool? Other users reach their perfectly healthy pools through this same RGW.
