https://tracker.ceph.com/ https://tracker.ceph.com/favicon.ico 2015-08-06T16:40:17Z Ceph
Ceph - Bug #12638: build_incremental() could take 40678 ms to finish https://tracker.ceph.com/issues/12638?journal_id=56271 2015-08-06T16:40:17Z Kefu Chai tchaikov@gmail.com
<ul><li><strong>Assignee</strong> set to <i>Kefu Chai</i></li></ul>
Ceph - Bug #12638: build_incremental() could take 40678 ms to finish https://tracker.ceph.com/issues/12638?journal_id=56291 2015-08-07T08:10:45Z Kefu Chai tchaikov@gmail.com
<ul></ul><pre>
2015-08-06 01:00:58.294782 7fd57c264700 10 mon-018@2(peon).osd e528892 check_sub 0x1c4b1280 next 528892 (onetime)
2015-08-06 01:00:58.294790 7fd57c264700 5 mon-018@2(peon).osd e528892 send_incremental [528892..528892] to client.81113180 10.202.48.22:0/1302109
2015-08-06 01:00:58.294815 7fd57c264700 10 mon-018@2(peon).osd e528892 build_incremental [528892..528892]
2015-08-06 01:00:58.335493 7fd57c264700 10 mon-018@2(peon).osd e528892 check_sub 0x72c20c0 next 528892 (onetime)
2015-08-06 01:00:58.335508 7fd57c264700 5 mon-018@2(peon).osd e528892 send_incremental [528892..528892] to client.81122956 10.202.48.29:0/1040905
2015-08-06 01:00:58.335512 7fd57c264700 10 mon-018@2(peon).osd e528892 build_incremental [528892..528892]
</pre>
<p>40678 µs (about 40.7 ms; the log timestamps have microsecond resolution) from <code>build_incremental()</code> to the next <code>check_sub()</code>. in this case, a lease timeout followed.</p>
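<p>A minimal Python sketch of how the gap above can be read off the log. It assumes the first two fields of every line are the date and a microsecond-resolution time, and that all lines come from the same thread. The helper names <code>parse_ts</code> and <code>build_to_check_sub_gaps</code> are hypothetical and not part of Ceph.</p>

```python
from datetime import datetime, timedelta

# Two lines copied from the log excerpt above.
SAMPLE = [
    "2015-08-06 01:00:58.294815 7fd57c264700 10 mon-018@2(peon).osd e528892 build_incremental [528892..528892]",
    "2015-08-06 01:00:58.335493 7fd57c264700 10 mon-018@2(peon).osd e528892 check_sub 0x72c20c0 next 528892 (onetime)",
]


def parse_ts(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS.ffffff' timestamp of a log line."""
    date, time = line.split()[:2]
    return datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S.%f")


def build_to_check_sub_gaps(lines):
    """Return the gap in microseconds between each build_incremental
    line and the next check_sub line that follows it."""
    gaps = []
    pending = None  # timestamp of the last build_incremental not yet paired
    for line in lines:
        if " build_incremental " in line:
            pending = parse_ts(line)
        elif " check_sub " in line and pending is not None:
            delta = parse_ts(line) - pending
            gaps.append(delta // timedelta(microseconds=1))
            pending = None
    return gaps


print(build_to_check_sub_gaps(SAMPLE))  # [40678]
```

<p>The result is in microseconds, so the first excerpt corresponds to roughly 40.7 ms spent between the two log lines.</p>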
<pre>
2015-08-07 02:07:07.063950 7f475f98d700 10 mon-018@2(peon).osd e529993 check_sub 0x17c94340 next 529993 (onetime)
2015-08-07 02:07:07.063961 7f475f98d700 5 mon-018@2(peon).osd e529993 send_incremental [529993..529993] to client.84840059 10.202.48.34:0/2331792
2015-08-07 02:07:07.063974 7f475f98d700 10 mon-018@2(peon).osd e529993 build_incremental [529993..529993]
2015-08-07 02:07:07.064150 7f475f98d700 10 mon-018@2(peon).osd e529993 check_sub 0x1bf26a00 next 529993 (onetime)
2015-08-07 02:07:07.064159 7f475f98d700 5 mon-018@2(peon).osd e529993 send_incremental [529993..529993] to client.80986206 10.202.48.26:0/1364503
2015-08-07 02:07:07.064167 7f475f98d700 10 mon-018@2(peon).osd e529993 build_incremental [529993..529993]
2015-08-07 02:07:07.064268 7f475f98d700 10 mon-018@2(peon).osd e529993 check_sub 0x1baae100 next 529993 (onetime)
</pre>
<p>176 µs from <code>build_incremental()</code> to the next <code>check_sub()</code>.</p>
<pre>
ubuntu@teuthology $ grep '29993..529993' mon.18.log | grep send_incremental | head -n1
2015-08-07 02:07:07.063961 7f475f98d700 5 mon-018@2(peon).osd e529993 send_incremental [529993..529993] to client.84840059 10.202.48.34:0/2331792
ubuntu@teuthology $ grep '29993..529993' mon.18.log | grep send_incremental | tail -n1
2015-08-07 02:07:07.270799 7f475f98d700 5 mon-018@2(peon).osd e529993 send_incremental [529993..529993] to osd.411 10.202.49.16:6821/29473
ubuntu@teuthology$ grep '29993..529993' mon.18.log | grep send_incremental | awk '{print $10}' |less | cut -d'.' -f1|grep client | wc -l
729
</pre>
<p>and it replied to 728 osdmap subscriptions in 206838 µs, i.e. about 0.2 s.</p>
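<p>The grep pipeline above can also be sketched in Python, assuming the <code>send_incremental</code> line layout shown in the excerpts. The function <code>summarize_sends</code> and the regular expression are illustrative assumptions, not Ceph tooling. It counts the replies sent to clients for one epoch and the time between the first and last <code>send_incremental</code> line for that epoch.</p>

```python
import re
from datetime import datetime, timedelta

# Assumed layout: "<date> <time> <thread> <level> <mon> e<epoch> send_incremental [a..b] to <who> <addr>"
SEND_RE = re.compile(
    r"^(\S+ \S+) .* send_incremental \[(\d+)\.\.(\d+)\] to (\S+) \S+$"
)

# First and last matching lines copied from the grep output above.
SAMPLE = [
    "2015-08-07 02:07:07.063961 7f475f98d700 5 mon-018@2(peon).osd e529993 send_incremental [529993..529993] to client.84840059 10.202.48.34:0/2331792",
    "2015-08-07 02:07:07.270799 7f475f98d700 5 mon-018@2(peon).osd e529993 send_incremental [529993..529993] to osd.411 10.202.49.16:6821/29473",
]


def summarize_sends(lines, epoch):
    """Return (client_sends, elapsed_us) for send_incremental lines
    whose range is exactly [epoch..epoch]."""
    stamps = []
    client_sends = 0
    for line in lines:
        m = SEND_RE.match(line.strip())
        if m is None or m.group(2) != str(epoch) or m.group(3) != str(epoch):
            continue
        stamps.append(datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f"))
        if m.group(4).startswith("client."):
            client_sends += 1
    if not stamps:
        return client_sends, 0
    elapsed_us = (max(stamps) - min(stamps)) // timedelta(microseconds=1)
    return client_sends, elapsed_us


print(summarize_sends(SAMPLE, 529993))  # (1, 206838)
```

<p>Run against the full <code>mon.18.log</code> instead of the two sample lines, the first value would correspond to the <code>wc -l</code> count above.</p>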
<pre>
$ grep '29993..529993' mon.18.log | grep send_incremental | awk '{print $11}' | cut -d':' -f1 | sort | uniq | wc -l
79
</pre><br />they come from 79 IP addresses, so the monitor performed well under the same load.
Ceph - Bug #12638: build_incremental() could take 40678 ms to finish https://tracker.ceph.com/issues/12638?journal_id=56310 2015-08-07T18:14:18Z Kefu Chai tchaikov@gmail.com
<ul></ul><p>sam suggested,</p>
<blockquote>
<p>we can also change the</p>
<p>- mon_leveldb_cache_size // increase this value to see if it remedies the situation <br />- mon_leveldb_size_warn</p>
</blockquote>
Ceph - Bug #12638: build_incremental() could take 40678 ms to finish https://tracker.ceph.com/issues/12638?journal_id=56605 2015-08-14T13:28:32Z Greg Farnum gfarnum@redhat.com
<ul></ul><p>These numbers look very strange to me. Do we have any more details about what's happening here, or why things might be so slow?</p>
Ceph - Bug #12638: build_incremental() could take 40678 ms to finish https://tracker.ceph.com/issues/12638?journal_id=56607 2015-08-14T14:23:36Z Kefu Chai tchaikov@gmail.com
<ul></ul><blockquote>
<p>Do we have any more details about what's happening here</p>
</blockquote>
<p>per Tupper, this happens when an OSD host restarts.</p>
<blockquote>
<p>or why things might be so slow?</p>
</blockquote>
<p>one reason could be that leveldb was compacting at that moment.</p>
Ceph - Bug #12638: build_incremental() could take 40678 ms to finish https://tracker.ceph.com/issues/12638?journal_id=56729 2015-08-18T15:54:13Z Kefu Chai tchaikov@gmail.com
<ul><li><strong>Status</strong> changed from <i>12</i> to <i>Fix Under Review</i></li></ul><p><a class="external" href="https://github.com/ceph/ceph/pull/5524">https://github.com/ceph/ceph/pull/5524</a></p>
Ceph - Bug #12638: build_incremental() could take 40678 ms to finish https://tracker.ceph.com/issues/12638?journal_id=56788 2015-08-19T07:10:25Z Kefu Chai tchaikov@gmail.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/56788/diff?detail_id=54984">diff</a>)</li><li><strong>Affected Versions</strong> <i>v0.80.10</i> added</li></ul>
Ceph - Bug #12638: build_incremental() could take 40678 ms to finish https://tracker.ceph.com/issues/12638?journal_id=57145 2015-08-27T15:55:15Z Kefu Chai tchaikov@gmail.com
<ul><li><strong>Status</strong> changed from <i>Fix Under Review</i> to <i>Pending Backport</i></li><li><strong>Backport</strong> set to <i>firefly</i></li></ul>
Ceph - Bug #12638: build_incremental() could take 40678 ms to finish https://tracker.ceph.com/issues/12638?journal_id=57259 2015-08-29T10:11:26Z Loïc Dachary loic@dachary.org
<ul><li><strong>Backport</strong> changed from <i>firefly</i> to <i>firefly,hammer</i></li></ul>
Ceph - Bug #12638: build_incremental() could take 40678 ms to finish https://tracker.ceph.com/issues/12638?journal_id=60381 2015-10-20T19:37:37Z Loïc Dachary loic@dachary.org
<ul><li><strong>Status</strong> changed from <i>Pending Backport</i> to <i>Resolved</i></li></ul>