Bug #61508
Kernel bug with ceph 17.2.3 and kernel 5.14.0-284.11.1
Status: open
Description
Hello,
we have been encountering a kernel bug with ceph since we updated one of our nodes from AlmaLinux 9.1 to 9.2.
We can't reproduce it on demand, but it happens sooner or later whenever we let traffic onto the node.
After the update it occurred twice after 4 hours and a third time the next day after we started the node again.
The error we get from ceph is "1 client failing to respond to capability release".
To resolve it we have to reset the node. While the condition lasts, the other nodes can't read or write the specific file for which the capability release is pending.
We updated from kernel-5.14.0-162.23.1 to kernel-5.14.0-284.11.1, both with the same ceph version ceph-base-17.2.3-2.
Since we're not sure what exactly the problem is, we also opened a case on the AlmaLinux bug tracker.
If you need further information, please don't hesitate to ask, I will try to provide it if I can.
Thanks a lot in advance,
Andre