Bug #44499

rgw hung

Added by Andrey Groshev over 2 years ago. Updated over 2 years ago.

Status:
Won't Fix
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
rgw
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I have a cluster of 5 nodes.
The system pools are replicated, and the data pools use erasure coding 4+2 (min_size 5).
For maintenance I had to shut down one server.
About a minute after I turned it off, my whole cluster hung.
haproxy reports that the servers have not responded for more than 5 seconds.
I tried to check from the console with "rados ls -p .rgw.root",
and that request hangs as well.
I understand that by turning off one node, some of the PGs lost shards and became incomplete.
But the system pool is replicated, so at least reads from it should still be available.
How can I find out the reason?
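For context, the availability arithmetic behind this kind of hang can be sketched as follows. With erasure coding k=4, m=2 and min_size=5, a PG accepts I/O only while at least 5 of its 6 shards are up; if a single host happens to hold two shards of the same PG (for example, because the CRUSH failure domain is osd rather than host), shutting that host down drops the PG below min_size and all I/O to it blocks. This is a minimal sketch of that rule; the function name and the CRUSH-domain scenario are illustrative assumptions, not taken from the report:

```python
def pg_accepts_io(k: int, m: int, min_size: int, shards_lost: int) -> bool:
    """A PG serves I/O only while the number of available shards is at
    least min_size; below that, client requests block (appear to hang).
    Hypothetical helper for illustration, not part of Ceph's API."""
    total_shards = k + m
    available = total_shards - shards_lost
    return available >= min_size

# EC 4+2 with min_size=5: losing one shard is still tolerated.
print(pg_accepts_io(4, 2, 5, shards_lost=1))  # True
# Losing two shards at once (e.g. two OSDs on the same downed host)
# drops the PG below min_size, so I/O to it blocks.
print(pg_accepts_io(4, 2, 5, shards_lost=2))  # False
```

This also illustrates why a single downed node can stall clients: any request that touches an affected PG blocks rather than failing fast.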

History

#1 Updated by Casey Bodley over 2 years ago

  • Status changed from New to Won't Fix

radosgw depends on having a healthy Ceph cluster, so this is expected behavior

#2 Updated by Andrey Groshev over 2 years ago

Casey Bodley wrote:

radosgw depends on having a healthy ceph cluster, so this is expected behavior

What does "expected" mean here? I have many other pools in good condition. Can one degraded pool hang the whole cluster? Other users access their perfectly healthy pools through this same RGW.