Bug #37532

mon: expected_num_objects warning triggers on bluestore-only setups

Added by Paul Emmerich about 5 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: Administration/Usability
Target version: -
% Done: 0%
Source:
Tags:
Backport: nautilus, octopus
Regression: No
Severity: 3 - minor
Reviewed:
ceph-qa-suite:
Component(RADOS): Monitor
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Follow-up to the mailing list thread http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-December/031711.html

To reproduce:

Run
ceph osd pool create test3 1024 1024 replicated
on a cluster consisting of only BlueStore OSDs.

Result:
Error ERANGE: For better initial performance on pools expected to store a large number of objects, consider supplying the expected_num_objects parameter when creating the pool.

Problems:

  • The error message is confusing; there is no indication of how to set this option
  • The ceph man page does not document this option either (For anyone who encounters this problem and finds this post: the full syntax is ceph osd pool create <name> <pgs> <pgs> replicated <crush rule name (default: replicated_rule)> <expected_num_objects>)
  • It is a FileStore-specific option and should not show up at all when I don't have any FileStore OSDs
  • It should only be a warning, not an error, since it says "consider supplying"
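
To make the positional order of the full syntax quoted above concrete, here is a minimal sketch that assembles the argument list. It assumes exactly the order given in this report; `build_pool_create_cmd` is a hypothetical helper and not part of Ceph, and the object count in the usage line is an arbitrary illustration.

```python
from typing import List, Optional


def build_pool_create_cmd(
    name: str,
    pg_num: int,
    pgp_num: int,
    crush_rule: str = "replicated_rule",
    expected_num_objects: Optional[int] = None,
) -> List[str]:
    """Build argv for `ceph osd pool create` (hypothetical helper).

    Positional order follows the syntax quoted in this report:
    <name> <pgs> <pgs> replicated <crush rule name> <expected_num_objects>
    """
    cmd = ["ceph", "osd", "pool", "create", name,
           str(pg_num), str(pgp_num), "replicated"]
    if expected_num_objects is not None:
        if expected_num_objects < 0:
            raise ValueError("expected_num_objects must not be negative")
        # The rule name has to be spelled out so that the object count
        # lands in the last positional slot.
        cmd += [crush_rule, str(expected_num_objects)]
    return cmd


# Arbitrary example value for the expected object count.
print(" ".join(build_pool_create_cmd("test3", 1024, 1024,
                                     expected_num_objects=1000000)))
```

The helper only builds the command line; it does not run it, so it can be checked without a cluster.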

The bug was introduced here: https://github.com/ceph/ceph/pull/23072/commits/69fb2293c4d38012e7c4781aaa39a47596125bbb

It checks the osd_objectstore config option to determine whether it's a FileStore cluster, which is incorrect. The option seems to be used only to set the default type of an OSD if it's lacking the "type" file ~ceph/osd/ceph-<id>/type. The option defaults to filestore, which is probably a good choice for a deprecated option: any weirdly deployed OSD that still relies on it is certainly quite old and therefore FileStore.

I'm not sure how/where this gets converted to an ERANGE error though.

A proper fix would be to scan the cluster for filestore OSDs and then show the warning with a more helpful message.
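
The scan proposed above can be sketched outside the monitor as follows. This is only an illustration of the idea, not the monitor-side fix: it assumes the per-OSD metadata (as reported by `ceph osd metadata`) has already been parsed into a list of dicts carrying `id` and `osd_objectstore` fields. The function names and the hint text are hypothetical, and the sample data is hand-written.

```python
import json
from typing import Any, Dict, List, Optional


def filestore_osd_ids(osd_metadata: List[Dict[str, Any]]) -> List[int]:
    """Return the sorted ids of OSDs that report filestore as object store."""
    return sorted(
        int(entry["id"])
        for entry in osd_metadata
        if entry.get("osd_objectstore") == "filestore"
    )


def expected_num_objects_hint(
    osd_metadata: List[Dict[str, Any]],
) -> Optional[str]:
    """Return a hint only if at least one FileStore OSD exists, else None."""
    ids = filestore_osd_ids(osd_metadata)
    if not ids:
        return None  # BlueStore-only cluster: nothing to warn about
    return (
        "FileStore OSDs present (ids: {}); consider passing "
        "expected_num_objects as the last argument of "
        "'ceph osd pool create'.".format(", ".join(str(i) for i in ids))
    )


# Hand-written sample shaped like per-OSD metadata (not real cluster data).
SAMPLE = json.loads("""
[
  {"id": 0, "osd_objectstore": "bluestore"},
  {"id": 1, "osd_objectstore": "filestore"},
  {"id": 2, "osd_objectstore": "bluestore"}
]
""")

print(expected_num_objects_hint(SAMPLE))
```

Keying the decision on what each OSD actually reports, rather than on the osd_objectstore config default, is the point of the sketch: a BlueStore-only cluster yields no hint at all.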

But I'd propose simply removing this warning entirely, since it's FileStore-specific and really confusing. All new clusters will be BlueStore anyway, so it's not needed. And older installations not yet converted to BlueStore have probably encountered the split issues before and know about them.


Related issues

Copied to RADOS - Backport #46738: nautilus: mon: expected_num_objects warning triggers on bluestore-only setups Resolved
Copied to RADOS - Backport #46739: octopus: mon: expected_num_objects warning triggers on bluestore-only setups Resolved

History

#2 Updated by Joao Eduardo Luis about 5 years ago

I don't think it's wise to simply remove the code because filestore is no longer the default. We need to consider existing clusters that haven't moved to bluestore yet.

I would much rather fix the underlying issues you point out:

1. making the error message a bit more self-explanatory, including how to disable it IF all the osds are running on bluestore rather than filestore; and
2. ensuring we properly document how to specify 'expected_num_objects'
3. either allow the command to succeed on a subsequent try by passing an 'i-am-sure' flag, or conveying a proper error tone rather than a warning (I'm more inclined towards the latter simply for simplicity's sake).

#3 Updated by Joao Eduardo Luis about 5 years ago

  • Project changed from Ceph to RADOS
  • Category changed from Monitor to Administration/Usability
  • Component(RADOS) Monitor added

#4 Updated by Joao Eduardo Luis about 5 years ago

  • Status changed from New to 12

#5 Updated by Joao Eduardo Luis about 5 years ago

  • Subject changed from expected_num_objects warning triggers on bluestore-only setups to mon: expected_num_objects warning triggers on bluestore-only setups

#6 Updated by Patrick Donnelly about 4 years ago

  • Status changed from 12 to New

#7 Updated by yunqing wang over 3 years ago

Joao Eduardo Luis wrote:

I don't think it's wise to simply remove the code because filestore is no longer the default. We need to consider existing clusters that haven't moved to bluestore yet.

I would much rather fix the underlying issues you point out:

1. making the error message a bit more self-explanatory, including how to disable it IF all the osds are running on bluestore rather than filestore; and
2. ensuring we properly document how to specify 'expected_num_objects'
3. either allow the command to succeed on a subsequent try by passing an 'i-am-sure' flag, or conveying a proper error tone rather than a warning (I'm more inclined towards the latter simply for simplicity's sake).

I ran into this problem recently. See https://github.com/ceph/ceph/pull/36090

#8 Updated by Nathan Cutler over 3 years ago

  • Status changed from New to Fix Under Review
  • Backport set to nautilus, octopus
  • Pull request ID set to 36090

#9 Updated by Kefu Chai over 3 years ago

  • Status changed from Fix Under Review to Pending Backport

#10 Updated by Nathan Cutler over 3 years ago

  • Copied to Backport #46738: nautilus: mon: expected_num_objects warning triggers on bluestore-only setups added

#11 Updated by Nathan Cutler over 3 years ago

  • Copied to Backport #46739: octopus: mon: expected_num_objects warning triggers on bluestore-only setups added

#12 Updated by Nathan Cutler over 3 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
