Bug #44499
closed
rgw hung
Description
I have a cluster of 5 nodes.
The system pools are replicated, and the data pools use erasure coding 4+2 (min_size 5).
To do some work, I had to turn off one server.
I turned it off, and after a minute my whole cluster hung.
haproxy reports that the servers do not respond for more than 5 seconds.
I tried checking from the console with "rados ls -p .rgw.root",
and the request froze.
I understand that turning off one node left some of the PGs short of chunks (incomplete).
But the system pool is replicated, so reads from it should be available in any case.
How can I find out the reason?
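The chunk arithmetic behind the description can be sketched in a few lines. This is only an illustration, not Ceph code: the function name pg_is_active is made up, and it assumes the simplified rule that an erasure-coded PG serves I/O only while at least min_size of its k+m chunks are available. With 4+2 spread over 5 hosts, six chunks cannot land on six distinct hosts, so losing one host may well cost a PG two chunks.

```python
def pg_is_active(k: int, m: int, min_size: int, chunks_lost: int) -> bool:
    """Simplified model: a PG serves I/O while available chunks >= min_size.

    k, m        -- data and coding chunks of the erasure-code profile
    min_size    -- pool min_size (5 in the report above)
    chunks_lost -- chunks of this PG that sat on the powered-off node
    """
    total = k + m
    if not 0 <= chunks_lost <= total:
        raise ValueError("chunks_lost must be between 0 and k + m")
    return total - chunks_lost >= min_size


# EC 4+2 with min_size 5, as in the report.
for lost in range(3):
    print(lost, pg_is_active(4, 2, 5, lost))
```

Under this model a PG that lost one chunk stays active, while a PG that lost two chunks drops to 4 available chunks, below min_size 5, and stops serving requests.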
Updated by Casey Bodley about 4 years ago
- Status changed from New to Won't Fix
radosgw depends on having a healthy ceph cluster, so this is expected behavior
Updated by Andrey Groshev about 4 years ago
Casey Bodley wrote:
radosgw depends on having a healthy ceph cluster, so this is expected behavior
What does "expected" mean? I have many other pools in good condition. Does the whole cluster hang because of one pool? Other users reach their perfectly healthy pools through this same RGW.
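One way to check which pools are actually affected is to group the stuck PG ids by pool. The sketch below is a hypothetical helper, not part of any Ceph tool: it relies only on the fact that a PG id has the form <pool id>.<hex seed>, and the ids in the example are invented. Real ids would have to be copied from the output of a command such as "ceph pg dump_stuck inactive".

```python
from collections import defaultdict


def group_pgs_by_pool(pg_ids):
    """Group PG ids such as '7.1a' by their numeric pool id (the part before the dot)."""
    pools = defaultdict(list)
    for pgid in pg_ids:
        pool_id, _, _seed = pgid.partition(".")
        pools[int(pool_id)].append(pgid)
    return dict(pools)


# Hypothetical stuck PG ids, for illustration only.
stuck = ["7.1a", "7.3f", "9.0"]
print(group_pgs_by_pool(stuck))
```

The pool ids can then be matched to pool names with "ceph osd pool ls detail" to see whether only the erasure-coded data pool has inactive PGs or the replicated system pools do as well.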