Bug #40068

open

High CPU load using Ceph Nautilus in Rook

Added by Blaine Gardner almost 5 years ago. Updated almost 5 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The Rook community is tracking an issue here: https://github.com/rook/rook/issues/3132

I believe this may be an issue in Ceph itself, or at least an issue in Ceph that Rook is somehow exacerbating. A different user also reported the same high-CPU issue to us at the Rook booth at KubeCon Barcelona.

Actions #1

Updated by Blaine Gardner almost 5 years ago

A recent update to the Rook issue suggests that the problem may be present in newly deployed Nautilus clusters, but it could not be reproduced in Mimic clusters or in a Nautilus cluster upgraded from Mimic.

Actions #2

Updated by Nathan Cutler almost 5 years ago

@Blaine - is there any reason to assume the issue is related to Bluestore? (The issue is currently in the "Bluestore" tracker - maybe it should be moved to the more general "Ceph" tracker?)

Actions #3

Updated by Blaine Gardner almost 5 years ago

No. I wasn't aware I was on a bluestore-specific site. This should be moved.

Actions #4

Updated by Nathan Cutler almost 5 years ago

  • Project changed from bluestore to Ceph

Actions #5

Updated by Vito Botta almost 5 years ago

Hi,

I am the person who created the issue on the Rook project, and I have updated it with some more info. In short, I was not able to reproduce the problem (the system becomes unresponsive when copying more than a small amount of data into a volume, until I have to forcefully reboot the server or servers) with Fedora 29 or CentOS. I hit the problem consistently with Ubuntu 18.04 with the default kernel, and with RancherOS, which I think uses the Ubuntu kernel. However, I cannot reproduce the problem even with Ubuntu if I first upgrade the kernel to 5.0.0.15 before setting up Kubernetes and Rook/Ceph. Could it be a bug/problem with the default kernel in Ubuntu 18.04?

Also, please note that I have this issue only with Nautilus. If I install Rook using Ceph Mimic, the test completes without any problems. Thanks in advance for any help!
