Bug #810

1) PG bits are not recognized, and 2) OSDs take too long to boot up.

Added by DongJin Lee about 13 years ago. Updated about 13 years ago.

Status: Duplicate
Priority: Normal
Category: OSD
% Done: 0%
Regression: No
Severity: 3 - minor

Description

1.

By default (0.24.3), there are 264 placement groups per OSD.
I tried to change this in the config file:

[global]
pid file = /var/run/ceph/$name.pid
logger dir = /cephlog
log dir = /cephlog
user = root
osd_pg_bits = 2
osd_pgp_bits = 2

Nothing changed: there are still 264 per OSD. The previous master branch (as of 10 Feb) had at least 2.3k per OSD, and that did not change either.

2.

After the OSDs are started, OSD peering takes too long.
The attachment (ceph-bug) contains the ceph -w log: with 6 OSDs started, peering took about 6 minutes altogether, with the OSDs at 100% util.
I believe that mounting or copying files during this process can lead to inconsistent performance results.
With the master version and 2.3k PGs, it took up to 10 minutes.
During this process, ceph thinks that some OSDs are down and reports a lot of degradation.
Generally, more OSDs mean more peering, so it takes longer.
Is there any way to speed up the peering process? (More OSDs seem to take longer, so scalability is the problem.)
Or is there at least some kind of signal, so that an immediate mount waits until all peering has finished and doesn't time out?

Currently, after the cosds have started, I sleep for 10 minutes (for 6 OSDs, just to be safe) and then mount.

If I try to mount while peering is in progress, it hangs for a while,
and then the mount fails with the message "can't read superblock".
I can keep retrying until the mount finally succeeds; however, I think this is still an 'early mount',
because I still see OSD 1 at 100% util (busy) for about another 2-3 minutes.
And I prefer not to run any benchmark tests until all OSDs are idle at 0% util.
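
The fixed 10-minute sleep could be replaced by polling until the cluster looks ready. The following is only a sketch of that idea: `wait_until_ready` and `cluster_is_healthy` are hypothetical helper names, and the check assumes the installed `ceph` tool has a `health` subcommand that prints a line starting with `HEALTH_OK`. A healthy status does not guarantee that the OSDs are idle, so a utilization check may still be needed before benchmarking.

```python
import subprocess
import time


def wait_until_ready(is_ready, timeout_s=600.0, interval_s=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll is_ready() until it returns True or timeout_s elapses.

    Returns True if the check passed, False if the timeout was reached.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def cluster_is_healthy():
    """Hypothetical readiness check.

    Assumption: `ceph health` exists in the installed version and prints
    a line starting with HEALTH_OK once peering has settled.
    """
    result = subprocess.run(["ceph", "health"],
                            capture_output=True, text=True)
    return result.returncode == 0 and result.stdout.startswith("HEALTH_OK")
```

A script would call `wait_until_ready(cluster_is_healthy)` after starting the OSDs and only mount if it returns True.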

Thanks a lot,
Regards, DJ

ceph-bug (32.1 KB) DongJin Lee, 02/17/2011 12:47 AM

History

#1 Updated by Greg Farnum about 13 years ago

  • Assignee deleted (DongJin Lee)

#2 Updated by Greg Farnum about 13 years ago

  • Assignee set to Greg Farnum
  • Target version changed from v0.24.3 to v0.25

#3 Updated by Greg Farnum about 13 years ago

  • Status changed from New to Duplicate

Setting pg bits is working properly, but you're using the wrong config name. :) Underscores should only be used on the command line; in the config file use spaces:

[global]
pid file = /var/run/ceph/$name.pid
logger dir = /cephlog
log dir = /cephlog
user = root
osd pg bits = 2
osd pgp bits = 2
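
For an existing config file written with underscores, the renaming can be automated. This is a minimal sketch, not part of Ceph: `normalize_option_names` is a hypothetical helper that replaces underscores with spaces in option names only, leaving section headers, comments, and values untouched.

```python
def normalize_option_names(conf_text):
    """Replace underscores with spaces in option names of ini-style text.

    Only the part before the first '=' is changed; section headers,
    comment lines, and values are left as they are.
    """
    out = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        if "=" in line and not stripped.startswith(("#", ";", "[")):
            key, value = line.split("=", 1)
            line = key.replace("_", " ").rstrip() + " =" + value
        out.append(line)
    return "\n".join(out)


example = "[global]\nuser = root\nosd_pg_bits = 2\nosd_pgp_bits = 2"
print(normalize_option_names(example))
```

With the example above, the last two lines come out as `osd pg bits = 2` and `osd pgp bits = 2`.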

We have a number of other bugs dealing with the OSD peering issues, so closing this as a duplicate! :)
