Fix #5844
osd: snaptrimmer should throttle itself
Status: Closed
Description
Cuttlefish still has some problems when operating with large snapshots on a cluster with a large number of objects (~100G of committed data for an image, almost two million objects). Maybe it's possible to do some throttling to make those operations smoother. The symptoms are almost the same as for peering: reads are completely blocked for a short amount of time.
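A minimal sketch of the kind of throttling the report asks for: trim snapshot objects in small batches and sleep between batches so client IO can interleave with the background deletion. All names here (ThrottledSnapTrimmer, trim_one, batch_size) are illustrative, not Ceph's actual API; Ceph later gained a configuration option along these lines, osd_snap_trim_sleep.

```python
import time

class ThrottledSnapTrimmer:
    """Illustrative sketch (not Ceph code): trim snapshot objects in
    small batches with a sleep in between, so client IO gets a chance
    to run between deletions instead of being starved."""

    def __init__(self, objects, batch_size=16, sleep_between_batches=0.0):
        self.queue = list(objects)
        self.batch_size = batch_size
        self.sleep = sleep_between_batches
        self.trimmed = []

    def trim_one(self, obj):
        # Stand-in for the real per-object snapshot removal work.
        self.trimmed.append(obj)

    def run(self):
        while self.queue:
            batch = self.queue[:self.batch_size]
            self.queue = self.queue[self.batch_size:]
            for obj in batch:
                self.trim_one(obj)
            if self.queue and self.sleep:
                time.sleep(self.sleep)  # yield the disk to client IO

trimmer = ThrottledSnapTrimmer(range(100), batch_size=10,
                               sleep_between_batches=0.001)
trimmer.run()
```

The trade-off is classic: a longer sleep makes trimming take longer overall but flattens its impact on foreground read latency.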
Updated by Samuel Just over 10 years ago
(09:50:05 AM) sjust: xdeller: you mean snapshot trimming?
(09:50:58 AM) xdeller: if I may call it so, it's about #5844
(09:51:45 AM) sjust: xdeller: creating a snapshot causes a hang?
(09:51:50 AM) xdeller: 0.61.7 improved the situation quite a lot once the mons were upgraded
(09:51:57 AM) xdeller: both creation and deletion
(09:52:18 AM) sjust: can you describe the sequence of operations
(09:52:19 AM) sjust: ?
(09:52:21 AM) xdeller: it's not a hang, just a temporary increase in read latency
(09:54:36 AM) xdeller: create a bunch of vms, say ten with read non-cacheable I/O per osd then create a large image, commit a lot of data to it continuously and do some snapshots
(09:54:37 AM) sjust: xdeller: if it's just a brief increase in read latency, that might be map propagation
(09:54:50 AM) xdeller: it seems so
(09:55:04 AM) xdeller: just because of new osdmap
(09:55:33 AM) sjust: ah... I assumed the problem was the background snapshot removal on the osds causing IO
(09:55:33 AM) xdeller: it could take many seconds before the latest mon improvements
(09:55:50 AM) sjust: if it's the map propagation, throttling isn't relevant
(09:56:16 AM) sjust: the problem is that the clients at that point might have a newer OSDMap than the osds and have to wait for the osds to catch up
(09:57:12 AM) xdeller: I/O seems not to be a problem, as seen in per-disk stats
(09:57:41 AM) sjust: xdeller: right, the only way to fix that would be to increase OSDMap propagation speed
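The stall mechanism sjust describes can be sketched as a toy model: every RADOS op carries the client's map epoch, and an OSD that receives an op tagged with a newer epoch than its own must hold the op until the new map arrives. The class and method names below are hypothetical, chosen only to illustrate the epoch comparison; they are not Ceph's actual interfaces.

```python
class ToyOSD:
    """Toy model of the epoch-gating behaviour described in the chat:
    an op tagged with a map epoch newer than the OSD's own must wait
    until map propagation catches the OSD up."""

    def __init__(self, epoch):
        self.epoch = epoch

    def handle_op(self, client_epoch):
        if client_epoch > self.epoch:
            return "waiting_for_map"  # the read latency spike lives here
        return "served"

    def receive_map(self, epoch):
        # Map propagation: the OSD advances to the newest epoch it has seen.
        self.epoch = max(self.epoch, epoch)

osd = ToyOSD(epoch=10)
assert osd.handle_op(client_epoch=10) == "served"
# A snapshot create/delete publishes epoch 11; the client learns it first.
assert osd.handle_op(client_epoch=11) == "waiting_for_map"
osd.receive_map(11)  # propagation completes
assert osd.handle_op(client_epoch=11) == "served"
```

This is why throttling the trimmer would not help if map propagation is the bottleneck: the stall ends only when the OSD obtains the new map, regardless of how gently the trim work is scheduled.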
Updated by Andrey Korolyov over 10 years ago
- File monleader.snap.rm.log.gz added
Updated by Igor Lukyanov over 10 years ago
We've scrutinized the influence of snapshotting on disk IO and got some curious results. From our point of view, snapshot ops affect IO in two different ways:
1. The short (and nastiest) effect is that just after snapshot deletion, IO stalls for a period from ~1s to 15s; the exact duration is quite random, occasionally reaching 10-15 seconds. You can see these stalls on the graph http://imgur.com/wClp2tU (marked with purple dashes). Both reads and writes are affected. We suppose these stalls are caused by the issuing of a new osd map that accompanies every snapshot op (?). To be exact, 0.61.7 made this issue much less annoying, but it's still the case.
2. The long-lasting effect of snapshot deletion is that average read latency grows from 1-10ms to 40-80ms for tens of seconds. You can see these IO 'shelves' on the same graph. Though average timings get higher, the upper bound does not seem to grow, so this issue does not seem to have a great impact on client IO.
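The two effects above are distinguishable in a latency trace: a short stall pushes the per-sample maximum past ~1s, while a shelf only raises the window average into the tens of milliseconds. A toy classifier, with thresholds taken from the numbers in this comment (everything here is illustrative, not from any Ceph tooling):

```python
def classify_windows(latencies_ms, window=5, stall_ms=1000.0, slow_ms=40.0):
    """Label fixed-size windows of read-latency samples as 'stall'
    (any sample past ~1s), 'shelf' (average in the tens of ms), or
    'normal'. Thresholds mirror the figures reported above."""
    labels = []
    for i in range(0, len(latencies_ms), window):
        w = latencies_ms[i:i + window]
        avg = sum(w) / len(w)
        if max(w) >= stall_ms:
            labels.append("stall")
        elif avg >= slow_ms:
            labels.append("shelf")
        else:
            labels.append("normal")
    return labels

# Baseline, then a post-deletion stall, then a latency shelf, then recovery.
samples = [5] * 5 + [5, 5, 1200, 5, 5] + [60] * 5 + [5] * 5
print(classify_windows(samples))  # → ['normal', 'stall', 'shelf', 'normal']
```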
Updated by Igor Lukyanov over 10 years ago
Regarding this note:
(09:57:41 AM) sjust: xdeller: right, the only way to fix that would be to increase OSDMap propagation speed
Samuel, from your point of view, which subsystem (monitors/osds) should likely be optimized to increase osdmap propagation speed?
And is it really possible to achieve any visible improvement? We want to profile osdmap propagation and would like to know whether such profiling might make sense or would probably be useless. Thank you.
Updated by Sage Weil over 10 years ago
- Tracker changed from Bug to Fix
- Subject changed from Cluster suffering at snapshot creation/deletion to osd: snaptrimmer should throttle itself
Updated by David Zafman about 10 years ago
- Priority changed from High to Urgent
This has been seen starving client I/O.