https://tracker.ceph.com/
https://tracker.ceph.com/favicon.ico
2014-11-28T00:58:34Z
Ceph
Ceph - Feature #10198: PG removal occupy the disk thread several hours
https://tracker.ceph.com/issues/10198?journal_id=44936
2014-11-28T00:58:34Z
Zhi Zhang
zhangz.david@outlook.com
<ul></ul><pre>
scrub exits to wait for replicas to build scrub map:
------------------------------------------------
2014-11-27 04:52:11.336699 7f288658e700 10 osd.469 pg_epoch: 43797 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep] starting a
new chunky scrub
2014-11-27 04:52:11.336735 7f288658e700 10 osd.469 pg_epoch: 43797 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep] scrub start
2014-11-27 04:52:11.336758 7f288658e700 10
filestore(/home/y/var/lib/ceph/osd/ceph-469) collection_list_partial:
4.65b_head
2014-11-27 04:52:11.336778 7f288658e700 20 _collection_list_partial
0//0//-1/0/0 5-25 ls.size 0
...
2014-11-27 04:52:11.337033 7f288658e700 20 prefixes
B5608CEF,B5624762,B563D6FD,B56830C7,B5689025,B56AA0EE,B56B1F95,B56B2A1D,B5E35DE5,B5E5B3E6,B5E86E85,B5E8A3D5,B5EC883E,B5ECADEE,B5ED8453,B5EEFABE
2014-11-27 04:52:11.337101 7f288658e700 20
filestore(/home/y/var/lib/ceph/osd/ceph-469) objects:
[fec8065b/.dir.default.506102.4121/head//4,2674265b/.dir.default.1177618.958/head//4,df6d365b/.dir.default.1177618.284/head//4,7c03865b/.dir.default.1150130.745/head//4,5209865b/.dir.default.1177609.149/head//4,ee0aa65b/.dir.default.579855.7389/head//4,59f1b65b/.dir.default.1177618.1303/head//4,d1a2b65b/.dir.default.1150124.690/head//4,5ed53e5b/.dir.default.1150127.1797/head//4,6e3b5e5b/.dir.default.341342.2359/head//4,58e68e5b/.dir.default.1177618.297/head//4,5d3a8e5b/.dir.default.506102.318/head//4,e388ce5b/.dir.default.1177612.690/head//4,eedace5b/.dir.default.1177612.21/head//4,3548de5b/.dir.default.7291.5020/head//4,ebafee5b/.dir.default.29329.1440/head//4]
2014-11-27 04:52:11.339286 7f288658e700 10 osd.469 pg_epoch: 43797 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep] scrub
requesting scrubmap from osd.99
2014-11-27 04:52:11.339339 7f288658e700 10 osd.469 pg_epoch: 43797 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep] scrub
requesting scrubmap from osd.459
2014-11-27 04:52:11.339367 7f288658e700 10 osd.469 pg_epoch: 43797 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep]
build_scrub_map
2014-11-27 04:52:11.339379 7f288658e700 10
filestore(/home/y/var/lib/ceph/osd/ceph-469) collection_list_partial:
4.65b_head
2014-11-27 04:52:11.339384 7f288658e700 20 _collection_list_partial 0//0//-1
32-64 ls.size 0
2014-11-27 04:52:11.339588 7f288658e700 20 prefixes
B5608CEF,B5624762,B563D6FD,B56830C7,B5689025,B56AA0EE,B56B1F95,B56B2A1D,B5E35DE5,B5E5B3E6,B5E86E85,B5E8A3D5,B5EC883E,B5ECADEE,B5ED8453,B5EEFABE
2014-11-27 04:52:11.339654 7f288658e700 20
filestore(/home/y/var/lib/ceph/osd/ceph-469) objects:
[fec8065b/.dir.default.506102.4121/head//4,2674265b/.dir.default.1177618.958/head//4,df6d365b/.dir.default.1177618.284/head//4,7c03865b/.dir.default.1150130.745/head//4,5209865b/.dir.default.1177609.149/head//4,ee0aa65b/.dir.default.579855.7389/head//4,59f1b65b/.dir.default.1177618.1303/head//4,d1a2b65b/.dir.default.1150124.690/head//4,5ed53e5b/.dir.default.1150127.1797/head//4,6e3b5e5b/.dir.default.341342.2359/head//4,58e68e5b/.dir.default.1177618.297/head//4,5d3a8e5b/.dir.default.506102.318/head//4,e388ce5b/.dir.default.1177612.690/head//4,eedace5b/.dir.default.1177612.21/head//4,3548de5b/.dir.default.7291.5020/head//4,ebafee5b/.dir.default.29329.1440/head//4]
2014-11-27 04:52:11.339710 7f288658e700 10 osd.469 pg_epoch: 43797 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep] _scan_list
scanning 16 objects deeply
...
2014-11-27 04:52:21.859629 7f288658e700 10 osd.469 pg_epoch: 43797 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep]
build_scrub_map_chunk done.
2014-11-27 04:52:21.859720 7f288658e700 10 osd.469 pg_epoch: 43797 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep] wait for
replicas to build scrub map
------------------------------------------------
osd_sub_scrub_map OP comes back quickly:
------------------------------------------------
2014-11-27 04:52:21.893323 7f2888592700 10 osd.469 43799 dequeue_op 0x1730f4a0
prio 127 cost 6420 latency 7.496690 osd_sub_op(unknown.0.0:0 4.65b 0//0//-1
[scrub-map] v 0'0 snapset=0=[]:[] snapc=0=[]) v11 pg pg[4.65b( v 43775'1003452
(43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c 43311/43311
43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450 lcod
43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep]
...
2014-11-27 04:52:21.893511 7f2888592700 10 osd.469 pg_epoch: 43797 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep] map version
is 43775'1003452
...
2014-11-27 04:52:21.893534 7f2888592700 10 osd.469 43799 dequeue_op 0x1730f4a0
finish
2014-11-27 04:52:21.903364 7f2889594700 10 osd.469 43799 dequeue_op 0x117d6d20
prio 127 cost 6420 latency 6.399532 osd_sub_op(unknown.0.0:0 4.65b 0//0//-1
[scrub-map] v 0'0 snapset=0=[]:[] snapc=0=[]) v11 pg pg[4.65b( v 43775'1003452
(43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c 43311/43311
43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450 lcod
43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep]
...
2014-11-27 04:52:21.903423 7f2889594700 10 osd.469 pg_epoch: 43799 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep] got 459
scrub map
...
014-11-27 04:52:21.903483 7f2889594700 10 osd.469 pg_epoch: 43799 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep] map version
is 43775'1003452
2014-11-27 04:52:21.903495 7f2889594700 10 osd.469 43799 dequeue_op 0x117d6d20
finish
...
------------------------------------------------
disk_tp to process pg removing event:
------------------------------------------------
2014-11-27 04:52:11.336758 7f288658e700 10
filestore(/home/y/var/lib/ceph/osd/ceph-469) collection_list_partial:
4.65b_head
2014-11-27 04:52:11.336778 7f288658e700 20 _collection_list_partial
0//0//-1/0/0 5-25 ls.size 0
2014-11-27 04:52:11.337033 7f288658e700 20 prefixes
B5608CEF,B5624762,B563D6FD,B56830C7,B5689025,B56AA0EE,B56B1F95,B56B2A1D,B5E35DE5,B5E5B3E6,B5E86E85,B5E8A3D5,B5EC883E,B5ECADEE,B5ED8453,B5EEFABE
...
2014-11-27 06:30:04.060618 7f288658e700 10
filestore(/home/y/var/lib/ceph/osd/ceph-469) collection_list_partial:
3.264s0_head
2014-11-27 06:30:04.060629 7f288658e700 20 _collection_list_partial
3f78e264/default.1177618.14_mon03c008.cos.gq1.yahoo.com_fe4a94032b486ec449788c1cc6265024/head//3/18446744073709551615/0
32-64 ls.size 0
2014-11-27 06:30:04.060688 7f288658e700 20 prefixes 4
2014-11-27 06:30:04.060709 7f288658e700 20 prefixes 46
2014-11-27 06:30:04.060726 7f288658e700 20 prefixes 462
2014-11-27 06:30:04.060758 7f288658e700 20 prefixes 462E
2014-11-27 06:30:04.060810 7f288658e700 20 prefixes
462E8,462E9,462EA,462EB,462EC,462ED,462EE,462EF
...
------------------------------------------------
disk_up continues to process scrub:
------------------------------------------------
2014-11-27 06:34:46.196360 7f288658e700 10 osd.469 pg_epoch: 43935 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep]
scrub_compare_maps has maps, analyzing
2014-11-27 06:34:46.196475 7f288658e700 10 osd.469 pg_epoch: 43935 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep] scrub
comparing replica scrub maps
2014-11-27 06:34:46.196488 7f288658e700 2 osd.469 pg_epoch: 43935 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep] scrub
osd.469 has 16 items
2014-11-27 06:34:46.196500 7f288658e700 2 osd.469 pg_epoch: 43935 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep] scrub
replica 99 has 16 items
...
2014-11-27 06:34:46.196655 7f288658e700 10 osd.469 pg_epoch: 43935 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep]
be_select_auth_object: selecting osd 459 for obj
fec8065b/.dir.default.506102.4121/head//4
2014-11-27 06:34:46.196669 7f288658e700 10 osd.469 pg_epoch: 43935 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep]
be_select_auth_object: selecting osd 469 for obj
fec8065b/.dir.default.506102.4121/head//4
...
2014-11-27 06:34:46.197471 7f288658e700 10 osd.469 pg_epoch: 43935 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep]
be_select_auth_object: selecting osd 469 for obj
ebafee5b/.dir.default.29329.1440/head//4
2014-11-27 06:34:46.197494 7f288658e700 2 osd.469 pg_epoch: 43935 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep]
2014-11-27 06:34:46.197506 7f288658e700 10 osd.469 pg_epoch: 43935 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep] _scrub
2014-11-27 06:34:46.197580 7f288658e700 10 osd.469 pg_epoch: 43935 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep] _scrub
(deep-scrub) finish
2014-11-27 06:34:46.198070 7f288658e700 10 osd.469 pg_epoch: 43935 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep] deep-scrub
got 16/16 objects, 0/0 clones, 16/16 dirty, 16/16 omap, 0/0 hit_set_archive,
0/0 bytes.
2014-11-27 06:34:46.198089 7f288658e700 10 osd.469 pg_epoch: 43935 pg[4.65b( v
43775'1003452 (43547'1000451,43775'1003452] local-les=43311 n=16 ec=457 les/c
43311/43311 43260/43310/43310) [469,459,99] r=0 lpr=43310 crt=43769'1003450
lcod 43775'1003451 mlcod 43775'1003451 active+clean+scrubbing+deep]
process_inconsistent() checking authoritative
2014-11-27 06:34:46.198102 7f288658e700 0 log [INF] : 4.65b deep-scrub ok
------------------------------------------------
</pre>
Ceph - Feature #10198: PG removal occupy the disk thread several hours
https://tracker.ceph.com/issues/10198?journal_id=45055
2014-12-02T13:06:32Z
Samuel Just
sjust@redhat.com
<ul><li><strong>Project</strong> changed from <i>Ceph</i> to <i>rgw</i></li></ul><p>Radosgw creates unbounded size index objects, will change eventually.</p>
Ceph - Feature #10198: PG removal occupy the disk thread several hours
https://tracker.ceph.com/issues/10198?journal_id=45151
2014-12-03T19:11:45Z
Guang Yang
yguang11@outlook.com
<ul></ul><p>Samuel Just wrote:</p>
<blockquote>
<p>Radosgw creates unbounded size index objects, will change eventually.</p>
</blockquote>
<p>Hi Sam,<br />It is more like a general problem at OSD side rather than a rgw problem. In short, the problem is that a single object could be non-writable for a couple of hours because the thread taking responsibility to mark it writable was stuck doing pg remove (which is time consuming), not because it is <strong>really</strong> not writable.</p>
<p>I am thinking about two possibilities to fix:<br /> 1. Pull the time critical section into op thread (make the object as writable)<br /> 2. Use a dedicated thread pool to handle pg removing (and other time consuming work load)</p>
Ceph - Feature #10198: PG removal occupy the disk thread several hours
https://tracker.ceph.com/issues/10198?journal_id=45392
2014-12-09T16:35:49Z
Samuel Just
sjust@redhat.com
<ul></ul><p>In that case, two things:<br />1) move scrubbing into the OpWQ (I'm working on that one)<br />2) restructure pg removal to only remove N objects a time and requeue (at the front of the removal queue)</p>
Ceph - Feature #10198: PG removal occupy the disk thread several hours
https://tracker.ceph.com/issues/10198?journal_id=45393
2014-12-09T16:41:02Z
Guang Yang
yguang11@outlook.com
<ul><li><strong>Subject</strong> changed from <i>Radosgw threads get hung when hitting a paticular pattern at peering, remapping and deep-scrubbing</i> to <i>PG removal occupy the disk thread several hours</i></li></ul>
Ceph - Feature #10198: PG removal occupy the disk thread several hours
https://tracker.ceph.com/issues/10198?journal_id=46982
2015-01-27T18:54:36Z
Sage Weil
sage@newdream.net
<ul><li><strong>Project</strong> changed from <i>rgw</i> to <i>Ceph</i></li></ul>
Ceph - Feature #10198: PG removal occupy the disk thread several hours
https://tracker.ceph.com/issues/10198?journal_id=47434
2015-02-04T02:43:41Z
Guang Yang
yguang11@outlook.com
<ul></ul><p>Samuel Just wrote:</p>
<blockquote>
<p>In that case, two things:<br />2) restructure pg removal to only remove N objects a time and requeue (at the front of the removal queue)</p>
</blockquote>
<p><a class="external" href="https://github.com/ceph/ceph/pull/3564">https://github.com/ceph/ceph/pull/3564</a></p>
Ceph - Feature #10198: PG removal occupy the disk thread several hours
https://tracker.ceph.com/issues/10198?journal_id=47796
2015-02-10T21:37:46Z
Sage Weil
sage@newdream.net
<ul><li><strong>Priority</strong> changed from <i>Normal</i> to <i>High</i></li></ul>
Ceph - Feature #10198: PG removal occupy the disk thread several hours
https://tracker.ceph.com/issues/10198?journal_id=47798
2015-02-10T21:38:02Z
Samuel Just
sjust@redhat.com
<ul><li><strong>Tracker</strong> changed from <i>Bug</i> to <i>Feature</i></li><li><strong>Target version</strong> set to <i>v0.94</i></li></ul>
Ceph - Feature #10198: PG removal occupy the disk thread several hours
https://tracker.ceph.com/issues/10198?journal_id=49392
2015-03-17T20:08:01Z
Samuel Just
sjust@redhat.com
<ul><li><strong>Target version</strong> changed from <i>v0.94</i> to <i>v0.95</i></li></ul>
Ceph - Feature #10198: PG removal occupy the disk thread several hours
https://tracker.ceph.com/issues/10198?journal_id=50640
2015-04-14T20:10:24Z
Sage Weil
sage@newdream.net
<ul><li><strong>Target version</strong> changed from <i>v0.95</i> to <i>v9.0.2</i></li></ul>
Ceph - Feature #10198: PG removal occupy the disk thread several hours
https://tracker.ceph.com/issues/10198?journal_id=51501
2015-05-07T17:39:40Z
Samuel Just
sjust@redhat.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Resolved</i></li></ul>