Bug #21846
closed
Default ms log level results in ~40% performance degradation on RBD 4K random read IO
Description
With default settings, Luminous (v12.2.1) is now roughly 15% slower than Jewel (v10.2.10) on this workload, and over 40% slower than the same Luminous build with the ms logs disabled.
v10.2.10 defaults:
bw ( KiB/s): min=101152, max=115880, per=100.00%, avg=111149.07, stdev=3721.50, samples=30
iops : min=25288, max=28970, avg=27787.23, stdev=930.28, samples=30
v12.2.1 defaults:
bw ( KiB/s): min=81376, max=98360, per=100.00%, avg=92884.47, stdev=4851.53, samples=30
iops : min=20344, max=24590, avg=23221.10, stdev=1212.87, samples=30
v12.2.1 w/ "debug ms = 0":
bw ( KiB/s): min=154096, max=165448, per=100.00%, avg=160584.73, stdev=3897.58, samples=30
iops : min=38524, max=41362, avg=40146.10, stdev=974.31, samples=30
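For reference, the percentages quoted above can be re-derived from the average IOPS in the three fio runs. This is only a rough sketch: the helper name `slowdown_pct` is made up for illustration, and it uses just the reported averages, ignoring the stdev.

```python
# Average IOPS copied from the fio output above.
JEWEL_DEFAULT = 27787.23        # v10.2.10 defaults
LUMINOUS_DEFAULT = 23221.10     # v12.2.1 defaults
LUMINOUS_DEBUG_MS_0 = 40146.10  # v12.2.1 w/ "debug ms = 0"


def slowdown_pct(baseline: float, measured: float) -> float:
    """Return how much slower `measured` is than `baseline`, in percent."""
    return (baseline - measured) / baseline * 100.0


if __name__ == "__main__":
    vs_jewel = slowdown_pct(JEWEL_DEFAULT, LUMINOUS_DEFAULT)
    vs_no_ms_log = slowdown_pct(LUMINOUS_DEBUG_MS_0, LUMINOUS_DEFAULT)
    print(f"Luminous defaults vs Jewel defaults: {vs_jewel:.1f}% slower")          # 16.4
    print(f"Luminous defaults vs debug ms = 0:   {vs_no_ms_log:.1f}% slower")      # 42.2
```

So the averages work out to about 16% and 42%, in line with the "15%" and "over 40%" figures.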
Updated by Sage Weil over 6 years ago
- Status changed from New to 12
- Priority changed from Normal to Urgent
Two options?
1. Just set debug ms = 0 by default for clients.
2. Fix the async msgr to not log the second message. That probably doesn't help as much?
The in-memory ms logging seems less useful on the client side...? Jason?
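As a sketch of what option 1 amounts to for someone who wants the same effect locally today, the setting used in the benchmark above can be placed in ceph.conf. Putting it under the `[client]` section is an assumption here, so that it only affects clients and leaves the daemons' logging alone; this is not necessarily how a change to the built-in default would be implemented.

```ini
[client]
# Assumption: scope the override to clients only.
# Same setting as the 'v12.2.1 w/ "debug ms = 0"' run in the description.
debug ms = 0
```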
Updated by Jason Dillaman over 6 years ago
I posted PR https://github.com/ceph/ceph/pull/18418 as a temporary workaround for clients. I figured I would leave this one as a placeholder for perhaps a better tracing / "flight data recorder" system for performance critical code paths.
Updated by Ken Dreyer over 6 years ago
Jason Dillaman wrote:
I posted PR https://github.com/ceph/ceph/pull/18418 as a temporary workaround for clients. I figured I would leave this one as a placeholder for perhaps a better tracing / "flight data recorder" system for performance critical code paths.
How should we indicate that PR 18418 needs to go into Luminous (v12.2.2?)
Updated by Jason Dillaman over 6 years ago
Ken Dreyer wrote:
How should we indicate that PR 18418 needs to go into Luminous (v12.2.2?)
Using the magic for backporting? See tracker ticket #21860 that is associated w/ that PR.