Bug #6040
Significant slowdown of osds since v0.67 Dumpling
Description
I'm running a Ceph cluster with 3 nodes, each of which runs a mon, an osd and an mds. I'm using RBD on this cluster as storage for KVM; CephFS is unused at this time. While still on v0.61.7 Cuttlefish, I got 70-100+ MB/sec on simple linear writes to a file with `dd' inside a VM on this cluster under regular load, and the osds usually averaged 20-100% CPU utilisation in `top'. After the upgrade to Dumpling, CPU usage for the osds shot up to 100-400% in `top' (multi-core system) and the speed of my `dd' writes inside a VM dropped to 20-40 MB/sec. Users complained that disk access inside the VMs was significantly slower, and the backups of the RBD store that I was running also fell behind quickly.
After downgrading only the osds to v0.61.7 Cuttlefish and leaving the rest at 0.67 Dumpling, speed and load returned to normal. I have reproduced this performance hit on a similar test cluster under no additional load at all. CPU usage for the osds wasn't as dramatic during these tests because there was no base load from other VMs, but I/O performance still dropped significantly after upgrading and returned to normal after downgrading the osds.
I'm not sure what to make of it. There are no visible errors in the logs, and everything runs and reports good health; it's just a lot slower, with a lot more CPU usage.
Associated revisions
PGLog: move the log size check after the early return
There really are stl implementations (like the one on my ubuntu 12.04
machine) which have a list::size() which is linear in the size of the
list. That assert, therefore, is quite expensive!
Fixes: #6040
Backport: Dumpling
Signed-off-by: Samuel Just <sam.just@inktank.com>
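The cost described in the commit message above can be illustrated with a minimal sketch. This is not the actual PGLog code: `Log`, `write_if_dirty`, `dirty` and `size_checks` are hypothetical names chosen for illustration. Before C++11, `std::list::size()` was allowed to be linear in the list length, and some standard library implementations did walk the whole list. Placing the size check after the early return means the common "nothing to write" path never pays for it.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical stand-in for a log structure; not the real PGLog.
struct Log {
    std::list<int> entries;
    std::size_t expected_size = 0;  // tracked separately so size() can be cross-checked
    bool dirty = false;
    int size_checks = 0;            // counts how often the possibly O(n) check ran

    void append(int e) {
        entries.push_back(e);
        ++expected_size;
        dirty = true;
    }

    // Returns true if there was something to write out.
    bool write_if_dirty() {
        if (!dirty) {
            return false;           // early return first: the clean path stays O(1)
        }
        ++size_checks;
        // May be linear in the list length on pre-C++11 standard libraries.
        assert(entries.size() == expected_size);
        dirty = false;
        return true;
    }
};
```

With this ordering, repeated calls on a clean log never touch `entries.size()`; only a call that actually writes something pays for the check.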
PGLog: don't maintain log_keys_debug if the config is disabled
Fixes: #6040
Backport: Dumpling
Signed-off-by: Samuel Just <sam.just@inktank.com>
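As a rough illustration of the idea in the commit above, the sketch below keeps a debug-only shadow set of keys and only updates it when a flag is enabled. Everything here is an assumption made for illustration: `KeyWriter`, `debug_verify` and the method names are hypothetical and do not reflect Ceph's actual classes or configuration options.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical writer; the names are illustrative, not Ceph's API.
class KeyWriter {
public:
    explicit KeyWriter(bool debug_verify) : debug_verify_(debug_verify) {}

    void write_key(const std::string& key) {
        ++writes_;
        if (debug_verify_) {
            shadow_keys_.insert(key);   // extra lookup and allocation per key
        }
    }

    void remove_key(const std::string& key) {
        if (debug_verify_) {
            shadow_keys_.erase(key);
        }
    }

    // Debug-only cross-check against the keys the store reports.
    void verify(const std::set<std::string>& stored) const {
        if (debug_verify_) {
            assert(shadow_keys_ == stored);
        }
    }

    std::size_t shadow_size() const { return shadow_keys_.size(); }
    std::size_t writes() const { return writes_; }

private:
    bool debug_verify_;
    std::set<std::string> shadow_keys_;  // stays empty when debugging is off
    std::size_t writes_ = 0;
};
```

When the flag is off, the per-key bookkeeping is skipped entirely, so the hot write path does no set operations at all.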
PGLog: maintain writeout_from and trimmed
This way, we can avoid omap_rmkeyrange in the common append
and trim cases.
Fixes: #6040
Backport: Dumpling
Signed-off-by: Samuel Just <sam.just@inktank.com>
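One way to picture the approach in the commit above is a log that remembers the lowest version not yet persisted and the set of keys trimmed since the last write. A flush then only writes new entries and deletes only the trimmed keys, instead of clearing a whole key range and rewriting everything. This is a simplified model under stated assumptions: `IncrementalLog`, `Store` and `flush` are hypothetical, a `std::map` stands in for the on-disk key/value map, and only the names `writeout_from` and `trimmed` come from the commit title.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <map>
#include <set>
#include <string>

using Version = std::uint64_t;
using Store = std::map<Version, std::string>;  // stand-in for an on-disk key/value map

struct IncrementalLog {
    static constexpr Version kNothingPending = std::numeric_limits<Version>::max();

    std::map<Version, std::string> entries;    // in-memory log
    Version writeout_from = kNothingPending;   // lowest version not yet persisted
    std::set<Version> trimmed;                 // keys removed since the last flush

    void append(Version v, const std::string& payload) {
        // This sketch assumes versions are appended in increasing order.
        assert(entries.empty() || v > entries.rbegin()->first);
        entries[v] = payload;
        if (v < writeout_from) {
            writeout_from = v;
        }
    }

    // Drop all entries with version < upto and remember which keys to delete.
    void trim(Version upto) {
        auto end = entries.lower_bound(upto);
        for (auto it = entries.begin(); it != end; ++it) {
            trimmed.insert(it->first);
        }
        entries.erase(entries.begin(), end);
    }

    // Persist only what changed; returns the number of store operations issued.
    std::size_t flush(Store& store) {
        std::size_t ops = 0;
        for (Version v : trimmed) {
            store.erase(v);
            ++ops;
        }
        for (auto it = entries.lower_bound(writeout_from); it != entries.end(); ++it) {
            store[it->first] = it->second;
            ++ops;
        }
        trimmed.clear();
        writeout_from = kNothingPending;
        return ops;
    }
};
```

In this model, appending one entry and trimming two old ones costs three store operations on the next flush, regardless of how long the log is.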
PGLog: move the log size check after the early return
There really are stl implementations (like the one on my ubuntu 12.04
machine) which have a list::size() which is linear in the size of the
list. That assert, therefore, is quite expensive!
Fixes: #6040
Backport: Dumpling
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit fe68b15a3d82349f8941f5b9f70fcbb5d4bc7f97)
PGLog: don't maintain log_keys_debug if the config is disabled
Fixes: #6040
Backport: Dumpling
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 1c0d75db1075a58d893d30494a5d7280cb308899)
PGLog: maintain writeout_from and trimmed
This way, we can avoid omap_rmkeyrange in the common append
and trim cases.
Fixes: #6040
Backport: Dumpling
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit f808c205c503f7d32518c91619f249466f84c4cf)
History
#1 Updated by Oliver Daudey over 10 years ago
Kernel: SMP Debian 3.2.46-1~bpo60+1 x86_64 GNU/Linux on Debian Squeeze.
QEMU/KVM: 1:1.1.2+dfsg-2~bpo60+1, recompiled with RBD-support.
Storage for osds on XFS, mount-opts: attr2,noquota,noatime,nodiratime
#2 Updated by Oliver Daudey over 10 years ago
For completeness, the relevant part of "ceph.conf", the rest of which just defines a standard 3-node cluster, with mon, mds and osd on each node.
[global]
auth supported = cephx
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
[osd]
osd crush update on start = false
osd data = /mnt/data1/ceph/osd/$name
osd journal = /var/lib/ceph/osd/$name/journal
osd journal size = 10000
# uncomment the following line if you are mounting with ext4
# filestore xattr use omap = true
[mon]
mon data = /mnt/data1/ceph/mon/$cluster-$id
#3 Updated by Samuel Just over 10 years ago
- Assignee set to Samuel Just
wip-dumpling-pglog-undirty may help with this.
#4 Updated by Ian Colle over 10 years ago
- Priority changed from High to Urgent
#5 Updated by Samuel Just over 10 years ago
Merged wip-dumpling-pglog-undirty, with the config set to false, into next and dumpling.
#6 Updated by Samuel Just over 10 years ago
- Status changed from New to Resolved
From ceph-users:
Hey Samuel,
I picked up 0.67.1-10-g47c8949 from the GIT-builder and the osd from
that seems to work great so far. I'll have to let it run for a while
longer to really be sure it fixed the problem, but it looks promising,
not taking any more CPU than the Cuttlefish-osds. Thanks! I'll get
back to you.
Regards,
Oliver
#7 Updated by Samuel Just over 10 years ago
- Status changed from Resolved to Pending Backport
#8 Updated by Sage Weil over 10 years ago
- Status changed from Pending Backport to Resolved