Fix #5844
osd: snaptrimmer should throttle itself
Status: Closed
Description
Cuttlefish still has some problems when operating with large snapshots on a cluster with a large number of objects (~100G of committed data for an image, almost two million objects). Maybe it's possible to do some throttling to make those operations smoother. The symptoms are almost the same as for peering: reads are completely blocked for a short amount of time.
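A minimal sketch of the kind of throttling the report asks for: trim snapshot objects in small batches and sleep between batches so client IO can interleave with the background deletion. All names here (ThrottledSnapTrimmer, trim_one, batch_size) are illustrative, not Ceph's actual API; Ceph later gained a configuration option along these lines, osd_snap_trim_sleep.

```python
import time

class ThrottledSnapTrimmer:
    """Illustrative sketch (not Ceph code): trim snapshot objects in
    small batches with a sleep in between, so client IO gets a chance
    to run between deletions instead of being starved."""

    def __init__(self, objects, batch_size=16, sleep_between_batches=0.0):
        self.queue = list(objects)
        self.batch_size = batch_size
        self.sleep = sleep_between_batches
        self.trimmed = []

    def trim_one(self, obj):
        # Stand-in for the real per-object snapshot removal work.
        self.trimmed.append(obj)

    def run(self):
        while self.queue:
            batch = self.queue[:self.batch_size]
            self.queue = self.queue[self.batch_size:]
            for obj in batch:
                self.trim_one(obj)
            if self.queue and self.sleep:
                time.sleep(self.sleep)  # yield the disk to client IO

trimmer = ThrottledSnapTrimmer(range(100), batch_size=10,
                               sleep_between_batches=0.001)
trimmer.run()
```

The trade-off is classic: a longer sleep makes trimming take longer overall but flattens its impact on foreground read latency.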
Updated by Samuel Just over 10 years ago
(09:50:05 AM) sjust: xdeller: you mean snapshot trimming?
(09:50:58 AM) xdeller: if I may call it so, it's about #5844
(09:51:45 AM) sjust: xdeller: creating a snapshot causes a hang?
(09:51:50 AM) xdeller: 0.61.7 improved the situation quite a lot once the mons were upgraded
(09:51:57 AM) xdeller: both creation and deletion
(09:52:18 AM) sjust: can you describe the sequence of operations
(09:52:19 AM) sjust: ?
(09:52:21 AM) xdeller: it's not a hang, just a temporary increase in read latency
(09:54:36 AM) xdeller: create a bunch of vms, say ten with read non-cacheable I/O per osd then create a large image, commit a lot of data to it continuously and do some snapshots
(09:54:37 AM) sjust: xdeller: if it's just a brief increase in read latency, that might be map propagation
(09:54:50 AM) xdeller: it seems so
(09:55:04 AM) xdeller: just because of new osdmap
(09:55:33 AM) sjust: ah... I assumed the problem was the background snapshot removal on the osds causing IO
(09:55:33 AM) xdeller: it could take many seconds before the latest mon improvements
(09:55:50 AM) sjust: if it's the map propagation, throttling isn't relevant
(09:56:16 AM) sjust: the problem is that the clients at that point might have a newer OSDMap than the osds and have to wait for the osds to catch up
(09:57:12 AM) xdeller: I/O seems not to be a problem, as seen in per-disk stats
(09:57:41 AM) sjust: xdeller: right, the only way to fix that would be to increase OSDMap propagation speed
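The stall mechanism sjust describes can be sketched as a toy model: every RADOS op carries the client's map epoch, and an OSD that receives an op tagged with a newer epoch than its own must hold the op until the new map arrives. The class and method names below are hypothetical, chosen only to illustrate the epoch comparison; they are not Ceph's actual interfaces.

```python
class ToyOSD:
    """Toy model of the epoch-gating behaviour described in the chat:
    an op tagged with a map epoch newer than the OSD's own must wait
    until map propagation catches the OSD up."""

    def __init__(self, epoch):
        self.epoch = epoch

    def handle_op(self, client_epoch):
        if client_epoch > self.epoch:
            return "waiting_for_map"  # the read latency spike lives here
        return "served"

    def receive_map(self, epoch):
        # Map propagation: the OSD advances to the newest epoch it has seen.
        self.epoch = max(self.epoch, epoch)

osd = ToyOSD(epoch=10)
assert osd.handle_op(client_epoch=10) == "served"
# A snapshot create/delete publishes epoch 11; the client learns it first.
assert osd.handle_op(client_epoch=11) == "waiting_for_map"
osd.receive_map(11)  # propagation completes
assert osd.handle_op(client_epoch=11) == "served"
```

This is why throttling the trimmer would not help if map propagation is the bottleneck: the stall ends only when the OSD obtains the new map, regardless of how gently the trim work is scheduled.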
Updated by Andrey Korolyov over 10 years ago
- File monleader.snap.rm.log.gz added
Updated by Igor Lukyanov over 10 years ago
We've scrutinized the influence of snapshotting on disk IO and got some curious results. From our point of view, snapshot ops affect IO in two different ways:
1. The short (and nastiest) effect is that just after snapshot deletion, IO stalls for a period from ~1s to 15s; the exact duration is quite random, occasionally reaching 10-15 seconds. You can see these stalls on the graph http://imgur.com/wClp2tU (marked with purple dashes). Both reads and writes are affected. We suppose these stalls are caused by the issuing of a new osd map that accompanies every snapshot op (?). To be exact, 0.61.7 made this issue much less annoying, but it's still the case.
2. The long-lasting effect of snapshot deletion is that average read latency grows from 1-10ms to 40-80ms for tens of seconds. You can see these IO 'shelves' on the same graph. Though average timings get higher, the upper bound does not seem to grow, so this issue does not seem to have a great impact on client IO.
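The two effects above are distinguishable in a latency trace: a short stall pushes the per-sample maximum past ~1s, while a shelf only raises the window average into the tens of milliseconds. A toy classifier, with thresholds taken from the numbers in this comment (everything here is illustrative, not from any Ceph tooling):

```python
def classify_windows(latencies_ms, window=5, stall_ms=1000.0, slow_ms=40.0):
    """Label fixed-size windows of read-latency samples as 'stall'
    (any sample past ~1s), 'shelf' (average in the tens of ms), or
    'normal'. Thresholds mirror the figures reported above."""
    labels = []
    for i in range(0, len(latencies_ms), window):
        w = latencies_ms[i:i + window]
        avg = sum(w) / len(w)
        if max(w) >= stall_ms:
            labels.append("stall")
        elif avg >= slow_ms:
            labels.append("shelf")
        else:
            labels.append("normal")
    return labels

# Baseline, then a post-deletion stall, then a latency shelf, then recovery.
samples = [5] * 5 + [5, 5, 1200, 5, 5] + [60] * 5 + [5] * 5
print(classify_windows(samples))  # → ['normal', 'stall', 'shelf', 'normal']
```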
Updated by Igor Lukyanov over 10 years ago
Regarding this note:
(09:57:41 AM) sjust: xdeller: right, the only way to fix that would be to increase OSDMap propagation speed
Samuel, from your point of view, which subsystem (monitors/osds) should likely be optimized to increase osdmap propagation speed?
And is it really possible to achieve any visible improvement? We want to profile osdmap propagation and would like to know whether such profiling might make sense or would probably be useless. Thank you.
Updated by Sage Weil over 10 years ago
- Tracker changed from Bug to Fix
- Subject changed from Cluster suffering at snapshot creation/deletion to osd: snaptrimmer should throttle itself
Updated by David Zafman about 10 years ago
- Priority changed from High to Urgent
This has been seen starving client I/O.