Bug #2161
closed
nonlinear scaling for PGMap::pg_stat encode
Added by Sage Weil about 12 years ago.
Updated about 12 years ago.
Description
> OSDs   size of latest   pg_stat_t encode time
>
> 48 2976397 0.323052
> 72 4472477 0.666633
> 96 5969461 1.159198
> 120 7466021 1.738096
> 144 8963141 2.428229
> 168 10460309 3.203832
> 192 11956709 4.083013
> 240 14950445 6.453171
> 288 17916589 9.462052
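One way to quantify "nonlinear" here is to fit a power law to the table: regress log(value) against log(OSD count) and read the slope as the scaling exponent. The sketch below does that with a plain least-squares fit over the rows above. The helper name `loglog_slope` is made up for this illustration. An exponent near 1 means linear growth, and one near 2 means quadratic growth.

```python
import math

# (OSD count, encoded size, encode time) copied from the table above.
rows = [
    (48, 2976397, 0.323052),
    (72, 4472477, 0.666633),
    (96, 5969461, 1.159198),
    (120, 7466021, 1.738096),
    (144, 8963141, 2.428229),
    (168, 10460309, 3.203832),
    (192, 11956709, 4.083013),
    (240, 14950445, 6.453171),
    (288, 17916589, 9.462052),
]


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x), i.e. the power-law exponent."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den


osds = [r[0] for r in rows]
size_exponent = loglog_slope(osds, [r[1] for r in rows])
time_exponent = loglog_slope(osds, [r[2] for r in rows])

print(f"size ~ OSDs^{size_exponent:.2f}")
print(f"time ~ OSDs^{time_exponent:.2f}")
```

The encoded size grows essentially linearly with the OSD count, while the encode time grows much faster than linearly. That is what one would expect if each encoded item pays a cost proportional to the amount already encoded.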
My guesses are:
- something in bufferlist is doing something O(n) on the list<ptr>
- some map<> is getting hammered
?
My ceph-osd processes run at 100% CPU for many minutes at a time doing this: http://pastebin.com/wYnPKWeJ
In src/include/buffer.h (version 0.44.1) I found the following comment about a set of operations including that copy_in():
// WARNING: this are horribly inefficient for large bufferlists.
Could the same inefficiency be causing the encoding behaviour above?
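To illustrate why an offset-based overwrite can be costly on a segmented buffer, here is a toy model. It is not Ceph's bufferlist: `ToySegmentedBuffer`, its `steps` counter, and the "append a length placeholder, append the payload, then backfill the length" pattern in `encode_records` are all assumptions made for this sketch. The point is only that if each `copy_in()` seeks from the first segment, then encoding n records visits on the order of n^2 segments in total.

```python
class ToySegmentedBuffer:
    """Toy buffer made of many small segments (hypothetical, not Ceph code)."""

    def __init__(self):
        self.segments = []  # list of bytearray chunks, in order
        self.steps = 0      # segments visited while seeking in copy_in()

    def append(self, data):
        self.segments.append(bytearray(data))

    def copy_in(self, offset, data):
        """Overwrite bytes at `offset`, always seeking from the first segment."""
        data = bytes(data)
        pos = 0  # absolute offset of the current segment's first byte
        for seg in self.segments:
            self.steps += 1
            end = pos + len(seg)
            if offset < end:
                start = offset - pos
                n = min(len(seg) - start, len(data))
                seg[start:start + n] = data[:n]
                data = data[n:]
                offset += n
                if not data:
                    return
            pos = end
        raise IndexError("copy_in past end of buffer")


def encode_records(count, payload=b"x" * 8):
    """Assumed pattern: write a placeholder, write the payload, backfill the length."""
    buf = ToySegmentedBuffer()
    offset = 0
    for _ in range(count):
        buf.append(b"\x00" * 4)  # 4-byte length placeholder
        buf.append(payload)
        buf.copy_in(offset, len(payload).to_bytes(4, "little"))
        offset += 4 + len(payload)
    return buf


for n in (100, 200, 400):
    print(n, encode_records(n).steps)
```

In this model, doubling the record count quadruples the number of segments visited, even though the encoded size only doubles. Remembering where the placeholder lives, instead of seeking from the start, would make each backfill constant-time in the same toy.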
Ake van der Meer wrote:
> My ceph-osd processes run at 100% CPU for many minutes at a time doing this: http://pastebin.com/wYnPKWeJ
> In src/include/buffer.h (version 0.44.1) I found the following comment about a set of operations including that copy_in():
> // WARNING: this are horribly inefficient for large bufferlists.
> Could the same inefficiency be causing the encoding behaviour above?
Aha! That is indeed the problem. Working up a fix now.
Thanks!
- Status changed from 12 to 7
- Target version set to v0.46
- Status changed from 7 to Resolved