Bug #6040

Significant slowdown of osds since v0.67 Dumpling

Added by Oliver Daudey over 10 years ago. Updated over 10 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
OSD
Target version:
-
% Done:
0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I'm running a Ceph-cluster with 3 nodes, each of which runs a mon, osd and mds. I'm using RBD on this cluster as storage for KVM; CephFS is unused at this time. While still on v0.61.7 Cuttlefish, I got 70-100+MB/sec on simple linear writes to a file with `dd' inside a VM on this cluster under regular load, and the osds usually averaged 20-100% CPU-utilisation in `top'. After the upgrade to v0.67 Dumpling, CPU-usage for the osds shot up to 100-400% in `top' (multi-core system) and the speed of my writes with `dd' inside a VM dropped to 20-40MB/sec. Users complained that disk-access inside the VMs was significantly slower, and the backups of the RBD-store I was running also fell behind quickly.

After downgrading only the osds to v0.61.7 Cuttlefish and leaving the rest at v0.67 Dumpling, speed and load returned to normal. I have reproduced this performance-hit after upgrading on a similar test-cluster under no additional load at all. Although CPU-usage for the osds wasn't as dramatic during these tests, because there was no base-load from other VMs, I/O-performance dropped significantly after upgrading there as well, and returned to normal after downgrading the osds.

I'm not sure what to make of it. There are no visible errors in the logs and everything runs and reports good health; it's just a lot slower, with a lot more CPU-usage.

Associated revisions

Revision fe68b15a (diff)
Added by Samuel Just over 10 years ago

PGLog: move the log size check after the early return

There really are stl implementations (like the one on my ubuntu 12.04
machine) which have a list::size() which is linear in the size of the
list. That assert, therefore, is quite expensive!

Fixes: #6040
Backport: Dumpling
Signed-off-by: Samuel Just <>
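
The reordering this commit message describes can be illustrated with a small sketch. It is written in Python rather than Ceph's C++, and every name in it (CountingList, write_log_check_first, write_log_check_after_return, expected_size) is hypothetical, not taken from the PGLog code. It assumes only what the message states: a size() call that is linear in the list length, and a check that used to run before an early return.

```python
class CountingList:
    """Singly linked list whose size() walks every node (linear time)."""

    def __init__(self, items=()):
        self._head = None
        self.nodes_visited = 0  # total work done by size() so far
        for item in reversed(list(items)):
            self._head = (item, self._head)

    def size(self):
        count = 0
        node = self._head
        while node is not None:
            count += 1
            self.nodes_visited += 1
            node = node[1]
        return count


def write_log_check_first(log, dirty, expected_size):
    # Check before the early return: every call pays for a full walk,
    # even when there is nothing to write.
    assert log.size() == expected_size
    if not dirty:
        return False
    return True  # stand-in for the actual write-out


def write_log_check_after_return(log, dirty, expected_size):
    # Check after the early return: clean calls skip the walk entirely.
    if not dirty:
        return False
    assert log.size() == expected_size
    return True  # stand-in for the actual write-out
```

With a 1000-entry list and 100 calls where nothing is dirty, the first variant visits 100000 nodes and the second visits none, which is the kind of difference that would show up as extra CPU-usage on a busy osd.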

Revision 1c0d75db (diff)
Added by Samuel Just over 10 years ago

PGLog: don't maintain log_keys_debug if the config is disabled

Fixes: #6040
Backport: Dumpling
Signed-off-by: Samuel Just <>
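
As a rough illustration of the idea in this commit, the sketch below gates debug-only bookkeeping behind a configuration flag. It is Python, not the Ceph source; the class LogWriter, the flag debug_verify_keys and the method check_keys are assumptions made for the example. Only the name log_keys_debug comes from the commit message.

```python
class LogWriter:
    """Writes log entries to a store; optionally mirrors the keys for debugging."""

    def __init__(self, debug_verify_keys=False):
        self.debug_verify_keys = debug_verify_keys  # hypothetical config flag
        self.store = {}  # stand-in for the on-disk key/value store
        # The debug mirror is only allocated when the flag is on.
        self.log_keys_debug = set() if debug_verify_keys else None

    def write_entry(self, key, value):
        self.store[key] = value
        if self.debug_verify_keys:
            # Extra bookkeeping, paid for only when debugging is enabled.
            self.log_keys_debug.add(key)

    def check_keys(self):
        """Return True/False when debugging is on, None when it is off."""
        if not self.debug_verify_keys:
            return None
        return self.log_keys_debug == set(self.store)
```

With the flag off, write_entry does a single store update and nothing else, so the hot path carries no cost for a check that is never run.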

Revision f808c205 (diff)
Added by Samuel Just over 10 years ago

PGLog: maintain writeout_from and trimmed

This way, we can avoid omap_rmkeyrange in the common append
and trim cases.

Fixes: #6040
Backport: Dumpling
Signed-off-by: Samuel Just <>
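
One way to picture this change is a log that remembers what has changed since the last flush, so that a flush touches only those keys instead of clearing a whole key range and rewriting everything. The Python sketch below is an illustration under that assumption, not the PGLog implementation. The names writeout_from and trimmed come from the commit message; PersistedLog, append, trim, flush and the dict used as a stand-in for the omap are hypothetical.

```python
class PersistedLog:
    """In-memory log that tracks which keys a flush actually has to touch."""

    def __init__(self):
        self.entries = {}          # in-memory log: version -> payload
        self.store = {}            # stand-in for the persisted omap
        self.writeout_from = None  # lowest version not yet persisted
        self.trimmed = set()       # versions trimmed since the last flush

    def append(self, version, payload):
        self.entries[version] = payload
        if self.writeout_from is None or version < self.writeout_from:
            self.writeout_from = version

    def trim(self, up_to):
        for version in [v for v in self.entries if v <= up_to]:
            del self.entries[version]
            self.trimmed.add(version)

    def flush(self):
        """Persist pending changes; return the list of store operations."""
        ops = []
        # Remove exactly the trimmed keys, not a whole key range.
        for version in sorted(self.trimmed):
            if version in self.store:
                del self.store[version]
                ops.append(("rm", version))
        # Write only entries at or after writeout_from.
        if self.writeout_from is not None:
            pending = sorted(v for v in self.entries if v >= self.writeout_from)
            for version in pending:
                self.store[version] = self.entries[version]
                ops.append(("set", version))
        self.trimmed.clear()
        self.writeout_from = None
        return ops
```

In this sketch, appending one entry and trimming two old ones leads to three store operations at the next flush, regardless of how long the log is.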

Revision 40dc4893 (diff)
Added by Samuel Just over 10 years ago

PGLog: move the log size check after the early return

There really are stl implementations (like the one on my ubuntu 12.04
machine) which have a list::size() which is linear in the size of the
list. That assert, therefore, is quite expensive!

Fixes: #6040
Backport: Dumpling
Signed-off-by: Samuel Just <>
(cherry picked from commit fe68b15a3d82349f8941f5b9f70fcbb5d4bc7f97)

Revision 53c7ab4d (diff)
Added by Samuel Just over 10 years ago

PGLog: don't maintain log_keys_debug if the config is disabled

Fixes: #6040
Backport: Dumpling
Signed-off-by: Samuel Just <>
(cherry picked from commit 1c0d75db1075a58d893d30494a5d7280cb308899)

Revision c22d980c (diff)
Added by Samuel Just over 10 years ago

PGLog: maintain writeout_from and trimmed

This way, we can avoid omap_rmkeyrange in the common append
and trim cases.

Fixes: #6040
Backport: Dumpling
Signed-off-by: Samuel Just <>
(cherry picked from commit f808c205c503f7d32518c91619f249466f84c4cf)

History

#1 Updated by Oliver Daudey over 10 years ago

Kernel: SMP Debian 3.2.46-1~bpo60+1 x86_64 GNU/Linux on Debian Squeeze.
QEMU/KVM: 1:1.1.2+dfsg-2~bpo60+1, recompiled with RBD-support.
Storage for osds on XFS, mount-opts: attr2,noquota,noatime,nodiratime

#2 Updated by Oliver Daudey over 10 years ago

For completeness, here is the relevant part of "ceph.conf"; the rest just defines a standard 3-node cluster, with mon, mds and osd on each node.
[global]
auth supported = cephx
auth cluster required = cephx
auth service required = cephx
auth client required = cephx

[osd]
osd crush update on start = false
osd data = /mnt/data1/ceph/osd/$name
osd journal = /var/lib/ceph/osd/$name/journal
osd journal size = 10000
# uncomment the following line if you are mounting with ext4
# filestore xattr use omap = true

[mon]
mon data = /mnt/data1/ceph/mon/$cluster-$id

#3 Updated by Samuel Just over 10 years ago

  • Assignee set to Samuel Just

wip-dumpling-pglog-undirty may help with this.

#4 Updated by Ian Colle over 10 years ago

  • Priority changed from High to Urgent

#5 Updated by Samuel Just over 10 years ago

Merged wip-dumpling-pglog-undirty, with the config set to false, into the next and dumpling branches.

#6 Updated by Samuel Just over 10 years ago

  • Status changed from New to Resolved

From ceph-users:

Hey Samuel,

I picked up 0.67.1-10-g47c8949 from the GIT-builder and the osd from
that seems to work great so far. I'll have to let it run for a while
longer to really be sure it fixed the problem, but it looks promising,
not taking any more CPU than the Cuttlefish-osds. Thanks! I'll get
back to you.

Regards,
Oliver

#7 Updated by Samuel Just over 10 years ago

  • Status changed from Resolved to Pending Backport

#8 Updated by Sage Weil over 10 years ago

  • Status changed from Pending Backport to Resolved
