Bug #3617 (closed): Ceph doesn't support > 65536 PGs(?) and fails silently

Added by Faidon Liambotis over 11 years ago. Updated over 11 years ago.

Status: Resolved
Priority: Normal
Category: -
Target version: -
% Done: 0%
Source: Development

Description

Hi,

While playing with a test cluster and trying to size it according to production needs & future growth, we decided to create a pool with 65536 placement groups. There were some other pools as well. Since there is no PG splitting yet, we started with a very small cluster: two boxes with 4 OSDs each.
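
For reference, a pool with an explicit PG count is created along these lines (a sketch rather than the exact command we ran; the pool name is the one mentioned later in this report and the numeric argument is pg_num):

    ceph osd pool create .rgw.buckets 65536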

We were seeing very weird behavior, including PGs that never managed to peer and lots of unfound objects:
2012-12-11 02:18:16.576998 mon.0 [INF] pgmap v11750: 66728 pgs: 9581 active, 16659 active+clean, 2 active+remapped+wait_backfill, 28382 active+recovery_wait, 12085 peering, 4 active+remapped, 3 active+recovery_wait+remapped, 7 remapped+peering, 5 active+recovering; 79015 MB data, 166 GB used, 18433 GB / 18600 GB avail; 100586/230461 degraded (43.646%); 11716/115185 unfound (10.171%)

After inquiring about it on IRC, I was told that the maximum number of PGs is 65536 and was pointed at struct ceph_pg, presumably because of the 16-bit value for the placement seed.

If that's the case, this probably means that it overflowed and failed silently in many other ways. It'd be nice if Ceph wouldn't let you shoot yourself in the foot like this and would instead refuse to set a pool to a size that would increase the PG count over that limit.
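
To illustrate the failure mode (just a sketch, not Ceph code): if the placement seed travels in a 16-bit field, seed values at or above 2^16 wrap around, so two different PGs can end up looking identical to anything that only sees the truncated value:

    #include <cstdint>
    #include <cstdio>

    int main() {
        // Illustration only: a 32-bit seed squeezed into a 16-bit wire field
        // wraps around, so seeds 0 and 65536 become indistinguishable.
        uint32_t seed_a = 0;
        uint32_t seed_b = 65536;
        uint16_t wire_a = static_cast<uint16_t>(seed_a);
        uint16_t wire_b = static_cast<uint16_t>(seed_b);
        printf("%u -> %u, %u -> %u\n",
               (unsigned)seed_a, (unsigned)wire_a,
               (unsigned)seed_b, (unsigned)wire_b);  // both truncate to 0
        return 0;
    }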

Additionally, at the time this weird behavior happened, we were seeing a lot of OSDs get a SIGABRT, asserting in:
osd/ReplicatedPG.cc: In function 'int ReplicatedPG::pull(const hobject_t&, eversion_t, int)' thread 7f496b723700 time 2012-12-10 16:42:40.295124
osd/ReplicatedPG.cc: 4890: FAILED assert(peer_missing.count(fromosd))

The full backtrace is attached. I'm unsure whether it's related or not, but due to the lack of more info/debugging logs and the unusual nature of the setup, I'm not filing a separate bug.

This was with Ceph 0.55 and a stock configuration with nothing unusual but the number of PGs.


Files

ceph-osd-64kpg-crash.txt (2.19 KB), Faidon Liambotis, 12/13/2012 09:40 AM
#1

Updated by Ian Colle over 11 years ago

  • Assignee set to Joao Eduardo Luis
  • Priority changed from Normal to High
#2

Updated by Joao Eduardo Luis over 11 years ago

  • Status changed from New to In Progress
#3

Updated by Greg Farnum over 11 years ago

  • Status changed from In Progress to Resolved

Joao wrote a monitor patch that prohibits counts greater than 65535, and it's merged in. I've created #3622 to deal with the larger issue of the PG limit.

#4

Updated by Joao Eduardo Luis over 11 years ago

The default is now 65536, and can be adjusted using the option 'mon max pool pg num' if higher values are desired.
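
For example, in ceph.conf (the value below is purely illustrative; the option can also go under [global]):

    [mon]
        mon max pool pg num = 131072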

#5

Updated by Faidon Liambotis over 11 years ago

Note how your commit changed the (default) limit from 65535 to 65536.

#6

Updated by Sage Weil over 11 years ago

  • Status changed from Resolved to In Progress

Looking closer, I have a feeling this was a large # of pgs making a different bug surface. Jim has been running his cluster with 64K pgs for a while now without problems (aside from the ceph_pg limit, which only affects the kernel client, and for which 65536 is the max).

I'm testing this out now on my cluster.

#7

Updated by Faidon Liambotis over 11 years ago

Note that this was on a cluster with very few OSDs (4 at the time!), as I originally mentioned, and that may be a factor here. Also note that we never went over the per-pool limit, but we had a single pool (.rgw.buckets specifically) with 65536 PGs. Finally, we switched that pool to 16K PGs and it has been running happily for days (and we've enlarged it multiple times since).
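
For reference, enlarging a pool's PG count is done along these lines (a sketch; the value is illustrative, and pgp_num usually needs to be bumped alongside pg_num):

    ceph osd pool set .rgw.buckets pg_num 16384
    ceph osd pool set .rgw.buckets pgp_num 16384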

#8

Updated by Greg Farnum over 11 years ago

How's the testing come along, Sage?

#9

Updated by Ian Colle over 11 years ago

  • Priority changed from High to Normal
#10

Updated by Ian Colle over 11 years ago

  • Status changed from In Progress to Resolved