Bug #3617 (closed): Ceph doesn't support > 65536 PGs(?) and fails silently

Added by Faidon Liambotis over 11 years ago. Updated over 11 years ago.

Status: Resolved
Priority: Normal
Category: -
Target version: -
% Done: 0%
Source: Development

Description

Hi,

While playing with a test cluster and trying to size it according to production needs & future growth, we decided to create a pool with 65536 placement groups. There were some other pools as well. Since there is no PG splitting yet, we started with a very small cluster: two boxes with 4 OSDs each.
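
For reference, a pool with an explicit PG count is created along these lines (a sketch rather than the exact command we ran; the pool name is the one mentioned later in this report and the numeric argument is pg_num):

    ceph osd pool create .rgw.buckets 65536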

We were seeing very weird behavior, including PGs that never managed to peer and lots of unfound objects:
2012-12-11 02:18:16.576998 mon.0 [INF] pgmap v11750: 66728 pgs: 9581 active, 16659 active+clean, 2 active+remapped+wait_backfill, 28382 active+recovery_wait, 12085 peering, 4 active+remapped, 3 active+recovery_wait+remapped, 7 remapped+peering, 5 active+recovering; 79015 MB data, 166 GB used, 18433 GB / 18600 GB avail; 100586/230461 degraded (43.646%); 11716/115185 unfound (10.171%)

After inquiring about it on IRC, I was told that the maximum number of PGs is 65536 and was pointed at struct ceph_pg, presumably because of the 16-bit value for the placement seed.

If that's the case, this probably means that it overflowed and failed silently in many other ways. It'd be nice if Ceph wouldn't let you shoot yourself in the foot like this and would instead refuse to set a pool to a size that would increase the PG count over that limit.
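
To illustrate the failure mode (just a sketch, not Ceph code): if the placement seed travels in a 16-bit field, seed values at or above 2^16 wrap around, so two different PGs can end up looking identical to anything that only sees the truncated value:

    #include <cstdint>
    #include <cstdio>

    int main() {
        // Illustration only: a 32-bit seed squeezed into a 16-bit wire field
        // wraps around, so seeds 0 and 65536 become indistinguishable.
        uint32_t seed_a = 0;
        uint32_t seed_b = 65536;
        uint16_t wire_a = static_cast<uint16_t>(seed_a);
        uint16_t wire_b = static_cast<uint16_t>(seed_b);
        printf("%u -> %u, %u -> %u\n",
               (unsigned)seed_a, (unsigned)wire_a,
               (unsigned)seed_b, (unsigned)wire_b);  // both truncate to 0
        return 0;
    }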

Additionally, at the time this weird behavior happened, we were seeing a lot of OSDs get a SIGABRT, asserting in:
osd/ReplicatedPG.cc: In function 'int ReplicatedPG::pull(const hobject_t&, eversion_t, int)' thread 7f496b723700 time 2012-12-10 16:42:40.295124
osd/ReplicatedPG.cc: 4890: FAILED assert(peer_missing.count(fromosd))

The full backtrace is attached. I'm unsure whether it's related or not, but due to the lack of more info/debugging logs and the unusual nature of the setup, I'm not filing a separate bug.

This was with Ceph 0.55 and a stock configuration with nothing unusual but the number of PGs.


Files

ceph-osd-64kpg-crash.txt (2.19 KB), Faidon Liambotis, 12/13/2012 09:40 AM
#1

Updated by Ian Colle over 11 years ago

  • Assignee set to Joao Eduardo Luis
  • Priority changed from Normal to High
#2

Updated by Joao Eduardo Luis over 11 years ago

  • Status changed from New to In Progress
#3

Updated by Greg Farnum over 11 years ago

  • Status changed from In Progress to Resolved

Joao wrote a monitor patch that prohibits counts greater than 65535, and it's merged in. I've created #3622 to deal with the larger issue of the PG limit.

#4

Updated by Joao Eduardo Luis over 11 years ago

The default is now 65536, and can be adjusted using the option 'mon max pool pg num' if higher values are desired.
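
For example, in ceph.conf (the value below is purely illustrative; the option can also go under [global]):

    [mon]
        mon max pool pg num = 131072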

#5

Updated by Faidon Liambotis over 11 years ago

Note how your commit changed the (default) limit from 65535 to 65536.

#6

Updated by Sage Weil over 11 years ago

  • Status changed from Resolved to In Progress

Looking closer, I have a feeling this was a large # of pgs making a different bug surface. Jim has been running his cluster with 64K pgs for a while now without problems (aside from the ceph_pg limit, which only affects the kernel client, and for which 65536 is the max).

I'm testing this out now on my cluster.

#7

Updated by Faidon Liambotis over 11 years ago

Note that this was on a cluster with very few OSDs (4 at the time!), as I originally mentioned, and that may be a factor here. Also note that we never went over the per-pool limit, but we had a single pool (.rgw.buckets specifically) with 65536 PGs. Finally, we switched that pool to 16K PGs and it has been running happily for days (and we've enlarged it multiple times since).
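
For reference, enlarging a pool's PG count is done along these lines (a sketch; the value is illustrative, and pgp_num usually needs to be bumped alongside pg_num):

    ceph osd pool set .rgw.buckets pg_num 16384
    ceph osd pool set .rgw.buckets pgp_num 16384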

#8

Updated by Greg Farnum over 11 years ago

How's the testing come along, Sage?

#9

Updated by Ian Colle over 11 years ago

  • Priority changed from High to Normal
#10

Updated by Ian Colle over 11 years ago

  • Status changed from In Progress to Resolved