https://tracker.ceph.com/
https://tracker.ceph.com/favicon.ico
2020-01-24T16:06:54Z
Ceph
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=156866
2020-01-24T16:06:54Z
Josh Durgin
<ul></ul><p>The test needs to be updated due to <a class="external" href="https://github.com/ceph/ceph/pull/32683">https://github.com/ceph/ceph/pull/32683</a> - anything else that sets the log lengths (like the rados suite facet) needs updating too. If the tests set osd_target_pg_log_entries_per_osd they can get a similar effect.</p>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=156897
2020-01-24T20:00:05Z
Sage Weil
sage@newdream.net
<ul></ul><p>/a/sage-2020-01-24_13:15:58-rados-wip-sage2-testing-2020-01-23-1953-distro-basic-smithi/4701051</p>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=156902
2020-01-24T20:18:46Z
Neha Ojha
nojha@redhat.com
<ul><li><strong>Project</strong> changed from <i>Ceph</i> to <i>RADOS</i></li></ul>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=156916
2020-01-24T22:22:37Z
Neha Ojha
nojha@redhat.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Fix Under Review</i></li><li><strong>Pull request ID</strong> set to <i>32851</i></li></ul>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=156917
2020-01-24T22:36:09Z
Neha Ojha
nojha@redhat.com
<ul><li><strong>Status</strong> changed from <i>Fix Under Review</i> to <i>In Progress</i></li></ul>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=156918
2020-01-24T22:52:36Z
Neha Ojha
nojha@redhat.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Fix Under Review</i></li><li><strong>Assignee</strong> set to <i>Neha Ojha</i></li></ul>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=157047
2020-01-26T17:18:38Z
Sage Weil
sage@newdream.net
<ul></ul><p>/a/sage-2020-01-24_23:29:53-rados-wip-sage2-testing-2020-01-24-1408-distro-basic-smithi/4703160</p>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=157100
2020-01-27T13:16:52Z
Sage Weil
sage@newdream.net
<ul><li><strong>Target version</strong> set to <i>v15.0.0</i></li></ul>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=157175
2020-01-27T17:05:34Z
Sage Weil
sage@newdream.net
<ul><li><strong>Status</strong> changed from <i>Fix Under Review</i> to <i>Resolved</i></li></ul>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=161170
2020-03-17T17:21:28Z
Sage Weil
sage@newdream.net
<ul></ul><p>/a/sage-2020-03-17_13:59:54-rados-wip-sage-testing-2020-03-17-0740-distro-basic-smithi/4863239</p>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=161173
2020-03-17T17:57:38Z
Neha Ojha
nojha@redhat.com
<ul><li><strong>Status</strong> changed from <i>Resolved</i> to <i>New</i></li></ul><p>Note that this is a resurrection of the same failure with different symptoms.</p>
<p>/a/sage-2020-03-17_13:59:54-rados-wip-sage-testing-2020-03-17-0740-distro-basic-smithi/4863239</p>
<p>TEST_backfill_log_1</p>
<pre>
2020-03-17T16:17:27.682 INFO:tasks.workunit.client.0.smithi014.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:82: _common_test: LOGLEN=50
2020-03-17T16:17:27.682 INFO:tasks.workunit.client.0.smithi014.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:83: _common_test: '[' 50 '!=' 2 ']'
2020-03-17T16:17:27.683 INFO:tasks.workunit.client.0.smithi014.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:84: _common_test: echo 'FAILED: Wrong log length got 50 (expected 2)'
2020-03-17T16:17:27.683 INFO:tasks.workunit.client.0.smithi014.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:85: _common_test: expr 0 + 1
2020-03-17T16:17:27.683 INFO:tasks.workunit.client.0.smithi014.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:85: _common_test: ERRORS=1
2020-03-17T16:17:27.683 INFO:tasks.workunit.client.0.smithi014.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:87: _common_test: jq '.pg_log_t.dups | length' td/osd-backfill-recovery-log/result.log
2020-03-17T16:17:27.683 INFO:tasks.workunit.client.0.smithi014.stdout:FAILED: Wrong log length got 50 (expected 2)
2020-03-17T16:17:27.684 INFO:tasks.workunit.client.0.smithi014.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:87: _common_test: DUPSLEN=7
2020-03-17T16:17:27.684 INFO:tasks.workunit.client.0.smithi014.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:88: _common_test: '[' 7 '!=' 8 ']'
2020-03-17T16:17:27.685 INFO:tasks.workunit.client.0.smithi014.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:89: _common_test: echo 'FAILED: Wrong dups length got 7 (expected 8)'
2020-03-17T16:17:27.685 INFO:tasks.workunit.client.0.smithi014.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:90: _common_test: expr 1 + 1
2020-03-17T16:17:27.685 INFO:tasks.workunit.client.0.smithi014.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:90: _common_test: ERRORS=2
2020-03-17T16:17:27.685 INFO:tasks.workunit.client.0.smithi014.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:92: _common_test: grep 'copy_up_to\|copy_after' td/osd-backfill-recovery-log/osd.0.log td/osd-backfill-recovery-log/osd.1.log td/osd-backfill-recovery-log/osd.2.log td/osd-backfill-recovery-log/osd.3.log td/osd-backfill-recovery-log/osd.4.log td/osd-backfill-recovery-log/osd.5.log
2020-03-17T16:17:27.685 INFO:tasks.workunit.client.0.smithi014.stdout:FAILED: Wrong dups length got 7 (expected 8)
</pre>
<p>/a/nojha-2020-03-16_17:35:35-rados:standalone-master-distro-basic-smithi/4860657/</p>
<p>TEST_backfill_log_2</p>
<pre>
2020-03-16T18:53:54.721 INFO:tasks.workunit.client.0.smithi001.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:82: _common_test: LOGLEN=50
2020-03-16T18:53:54.721 INFO:tasks.workunit.client.0.smithi001.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:83: _common_test: '[' 50 '!=' 2 ']'
2020-03-16T18:53:54.721 INFO:tasks.workunit.client.0.smithi001.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:84: _common_test: echo 'FAILED: Wrong log length got 50 (expected 2)'
2020-03-16T18:53:54.721 INFO:tasks.workunit.client.0.smithi001.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:85: _common_test: expr 0 + 1
2020-03-16T18:53:54.721 INFO:tasks.workunit.client.0.smithi001.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:85: _common_test: ERRORS=1
2020-03-16T18:53:54.721 INFO:tasks.workunit.client.0.smithi001.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:87: _common_test: jq '.pg_log_t.dups | length' td/osd-backfill-recovery-log/result.log
2020-03-16T18:53:54.722 INFO:tasks.workunit.client.0.smithi001.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:87: _common_test: DUPSLEN=100
2020-03-16T18:53:54.722 INFO:tasks.workunit.client.0.smithi001.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:88: _common_test: '[' 100 '!=' 148 ']'
2020-03-16T18:53:54.722 INFO:tasks.workunit.client.0.smithi001.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:89: _common_test: echo 'FAILED: Wrong dups length got 100 (expected 148)'
2020-03-16T18:53:54.722 INFO:tasks.workunit.client.0.smithi001.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:90: _common_test: expr 1 + 1
2020-03-16T18:53:54.722 INFO:tasks.workunit.client.0.smithi001.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:90: _common_test: ERRORS=2
2020-03-16T18:53:54.722 INFO:tasks.workunit.client.0.smithi001.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:92: _common_test: grep 'copy_up_to\|copy_after' td/osd-backfill-recovery-log/osd.0.log td/osd-backfill-recovery-log/osd.1.log td/osd-backfill-recovery-log/osd.2.log td/osd-backfill-recovery-log/osd.3.log td/osd-backfill-recovery-log/osd.4.log td/osd-backfill-recovery-log/osd.5.log
2020-03-16T18:53:54.722 INFO:tasks.workunit.client.0.smithi001.stdout:FAILED: Wrong log length got 50 (expected 2)
2020-03-16T18:53:54.723 INFO:tasks.workunit.client.0.smithi001.stdout:FAILED: Wrong dups length got 100 (expected 148)
</pre>
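<p>The check that fails in both traces can be sketched as follows. This is a minimal Python illustration, not the test's own shell code: the function name <code>check_pg_log_lengths</code> is hypothetical, the <code>.pg_log_t.dups</code> path comes from the jq call in the trace, and the <code>.pg_log_t.log</code> path is an assumption because the trace does not show how LOGLEN is computed. The sample input is synthetic and only reproduces the lengths reported for TEST_backfill_log_1.</p>

```python
import json


def check_pg_log_lengths(result_json, expected_log, expected_dups):
    """Return failure messages in the style of the trace; an empty list means pass."""
    pg_log = json.loads(result_json)["pg_log_t"]
    log_len = len(pg_log["log"])    # assumed key, not shown in the trace
    dups_len = len(pg_log["dups"])  # mirrors jq '.pg_log_t.dups | length'
    failures = []
    if log_len != expected_log:
        failures.append(
            f"FAILED: Wrong log length got {log_len} (expected {expected_log})")
    if dups_len != expected_dups:
        failures.append(
            f"FAILED: Wrong dups length got {dups_len} (expected {expected_dups})")
    return failures


# Synthetic stand-in for result.log with the lengths seen in TEST_backfill_log_1.
sample = json.dumps({"pg_log_t": {"log": [{}] * 50, "dups": [{}] * 7}})
for message in check_pg_log_lengths(sample, expected_log=2, expected_dups=8):
    print(message)
```

<p>Each mismatch is reported separately, which matches the trace, where ERRORS is incremented once for the log length and once for the dups length.</p>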
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=161361
2020-03-19T20:59:00Z
Neha Ojha
nojha@redhat.com
<ul></ul><p>/a/sage-2020-03-17_13:59:54-rados-wip-sage-testing-2020-03-17-0740-distro-basic-smithi/4863239</p>
<p>Comparing a failed TEST_backfill_log_1 run with one that passed, I see that "newprimary" differs between the two. Note that the test marks osd.0, osd.1, and osd.2 out and then waits for clean.</p>
<p>PASSED test</p>
<p>"up":[4,3,5],"acting":[4,3,5]</p>
<pre>
../qa/standalone/osd/osd-backfill-recovery-log.sh:78: _common_test: newprimary=4
</pre>
<p>FAILED test</p>
<pre>
2020-03-17T16:17:26.220 INFO:tasks.workunit.client.0.smithi014.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-backfill-recovery-log.sh:77: _common_test: newprimary=1
</pre>
<p>Not sure why osd.1 still remained the primary here.</p>
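<p>The comparison above amounts to asking whether any OSD that was marked out still appears in the PG mapping. A small Python sketch of that check, assuming the mapping is available as a dict with "up" and "acting" lists as in the passing run; the helper name <code>osds_still_mapped</code> is hypothetical:</p>

```python
OUT_OSDS = {0, 1, 2}  # the test marks osd.0, osd.1 and osd.2 out


def osds_still_mapped(pg_mapping, out_osds=OUT_OSDS):
    """Return the sorted OSD ids that were marked out but still appear in up or acting."""
    mapped = set(pg_mapping["up"]) | set(pg_mapping["acting"])
    return sorted(mapped & set(out_osds))


# Mapping reported for the passing run.
passed = {"up": [4, 3, 5], "acting": [4, 3, 5]}
print(osds_still_mapped(passed))

# In the failing run the trace reports newprimary=1, which is one of the out OSDs.
print(1 in OUT_OSDS)
```

<p>A non-empty result, or a primary that belongs to the out set, means the value was read before the PG had been remapped.</p>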
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=161427
2020-03-20T21:08:41Z
Neha Ojha
nojha@redhat.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li></ul>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=161505
2020-03-23T16:17:38Z
Neha Ojha
nojha@redhat.com
<ul></ul><p>Neha Ojha wrote:</p>
<blockquote>
<p>/a/sage-2020-03-17_13:59:54-rados-wip-sage-testing-2020-03-17-0740-distro-basic-smithi/4863239</p>
<p>Comparing a failed TEST_backfill_log_1 test with one that passed, I see that the "newprimary" is different in both cases. Note that the test marks osd.0,1,2 out and waits for clean.</p>
<p>PASSED test</p>
<p>"up":[4,3,5],"acting":[4,3,5]</p>
<p>[...]</p>
<p>FAILED test</p>
<p>[...]</p>
<p>Not sure why osd.1 still remained the primary here.</p>
</blockquote>
<p>Before wait_for_clean</p>
<pre>
2020-03-22T02:40:05.571 INFO:tasks.workunit.client.0.smithi001.stdout:{"pg_ready":true,"pg_stats":[{"pgid":"1.0","version":"34'143","reported_seq":"153","reported_epoch":"34","state":"active+clean","last_fresh":"2020-03-22T02:40:01.191973+0000","last_change":"2020-03-22T02:39:49.653126+0000","last_active":"2020-03-22T02:40:01.191973+0000","last_peered":"2020-03-22T02:40:01.191973+0000","last_clean":"2020-03-22T02:40:01.191973+0000","last_became_active":"2020-03-22T02:39:49.652216+0000","last_became_peered":"2020-03-22T02:39:49.652216+0000","last_unstale":"2020-03-22T02:40:01.191973+0000","last_undegraded":"2020-03-22T02:40:01.191973+0000","last_fullsized":"2020-03-22T02:40:01.191973+0000","mapping_epoch":32,"log_start":"34'100","ondisk_log_start":"34'100","created":32,"last_epoch_clean":33,"parent":"0.0","parent_split_bits":0,"last_scrub":"0'0","last_scrub_stamp":"2020-03-22T02:39:49.018635+0000","last_deep_scrub":"0'0","last_deep_scrub_stamp":"2020-03-22T02:39:49.018635+0000","last_clean_scrub_stamp":"2020-03-22T02:39:49.018635+0000","log_size":43,"ondisk_log_size":43,"stats_invalid":false,"dirty_stats_invalid":false,"omap_stats_invalid":false,"hitset_stats_invalid":false,"hitset_bytes_stats_invalid":false,"pin_stats_invalid":false,"manifest_stats_invalid":false,"snaptrimq_len":0,"stat_sum":{"num_bytes":974116,"num_objects":143,"num_object_clones":0,"num_object_copies":429,"num_objects_missing_on_primary":0,"num_objects_missing":0,"num_objects_degraded":0,"num_objects_misplaced":0,"num_objects_unfound":0,"num_objects_dirty":143,"num_whiteouts":0,"num_read":0,"num_read_kb":0,"num_write":143,"num_write_kb":1001,"num_scrub_errors":0,"num_shallow_scrub_errors":0,"num_deep_scrub_errors":0,"num_objects_recovered":0,"num_bytes_recovered":0,"num_keys_recovered":0,"num_objects_omap":0,"num_objects_hit_set_archive":0,"num_bytes_hit_set_archive":0,"num_flush":0,"num_flush_kb":0,"num_evict":0,"num_evict_kb":0,"num_promote":0,"num_flush_mode_high":0,"num_flush_mode_low":0,"num_evict_mode_some":0,"num_evict_mode_full":0,"num_objects_pinned":0,"num_legacy_snapsets":0,"num_large_omap_objects":0,"num_objects_manifest":0,"num_omap_bytes":0,"num_omap_keys":0,"num_objects_repaired":0},"up":[1,0,2],"acting":[1,0,2],"avail_no_missing":[],"object_location_counts":[],"blocked_by":[],"up_primary":1,"acting_primary":1,"purged_snaps":[]}]}
</pre>
<p>After wait_for_clean</p>
<pre>
2020-03-22T02:40:10.670 INFO:tasks.workunit.client.0.smithi001.stdout:{"pg_ready":true,"pg_stats":[{"pgid":"1.0","version":"34'150","reported_seq":"160","reported_epoch":"34","state":"active+clean","last_fresh":"2020-03-22T02:40:01.734859+0000","last_change":"2020-03-22T02:39:49.653126+0000","last_active":"2020-03-22T02:40:01.734859+0000","last_peered":"2020-03-22T02:40:01.734859+0000","last_clean":"2020-03-22T02:40:01.734859+0000","last_became_active":"2020-03-22T02:39:49.652216+0000","last_became_peered":"2020-03-22T02:39:49.652216+0000","last_unstale":"2020-03-22T02:40:01.734859+0000","last_undegraded":"2020-03-22T02:40:01.734859+0000","last_fullsized":"2020-03-22T02:40:01.734859+0000","mapping_epoch":32,"log_start":"34'100","ondisk_log_start":"34'100","created":32,"last_epoch_clean":33,"parent":"0.0","parent_split_bits":0,"last_scrub":"0'0","last_scrub_stamp":"2020-03-22T02:39:49.018635+0000","last_deep_scrub":"0'0","last_deep_scrub_stamp":"2020-03-22T02:39:49.018635+0000","last_clean_scrub_stamp":"2020-03-22T02:39:49.018635+0000","log_size":50,"ondisk_log_size":50,"stats_invalid":false,"dirty_stats_invalid":false,"omap_stats_invalid":false,"hitset_stats_invalid":false,"hitset_bytes_stats_invalid":false,"pin_stats_invalid":false,"manifest_stats_invalid":false,"snaptrimq_len":0,"stat_sum":{"num_bytes":1021800,"num_objects":150,"num_object_clones":0,"num_object_copies":450,"num_objects_missing_on_primary":0,"num_objects_missing":0,"num_objects_degraded":0,"num_objects_misplaced":0,"num_objects_unfound":0,"num_objects_dirty":150,"num_whiteouts":0,"num_read":0,"num_read_kb":0,"num_write":150,"num_write_kb":1050,"num_scrub_errors":0,"num_shallow_scrub_errors":0,"num_deep_scrub_errors":0,"num_objects_recovered":0,"num_bytes_recovered":0,"num_keys_recovered":0,"num_objects_omap":0,"num_objects_hit_set_archive":0,"num_bytes_hit_set_archive":0,"num_flush":0,"num_flush_kb":0,"num_evict":0,"num_evict_kb":0,"num_promote":0,"num_flush_mode_high":0,"num_flush_mode_low":0,"num_evict_mode_some":0,"num_evict_mode_full":0,"num_objects_pinned":0,"num_legacy_snapsets":0,"num_large_omap_objects":0,"num_objects_manifest":0,"num_omap_bytes":0,"num_omap_keys":0,"num_objects_repaired":0},"up":[1,0,2],"acting":[1,0,2],"avail_no_missing":[],"object_location_counts":[],"blocked_by":[],"up_primary":1,"acting_primary":1,"purged_snaps":[]}]}
</pre>
<p>Note that "reported_epoch" is 34 both before and after wait_for_clean, and "up" and "acting" are still [1,0,2]. This is because mgr_stats_period is 5 seconds, so the latest stats had not been fetched yet.</p>
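<p>One way to avoid acting on stale stats is to poll until the reported mapping no longer contains any of the OSDs that were marked out. The Python sketch below only illustrates that idea under stated assumptions: it is not the change made in the linked pull request, <code>wait_for_remap</code> is a hypothetical helper, and <code>fetch_pg_stats</code> is a hypothetical callable that returns one PG's stats as a dict with "up" and "acting" lists.</p>

```python
import time


def wait_for_remap(fetch_pg_stats, out_osds, timeout=30.0, interval=1.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until no OSD in out_osds appears in up or acting.

    Returns the first snapshot that satisfies the condition and raises
    TimeoutError if the deadline passes first.
    """
    out = set(out_osds)
    deadline = clock() + timeout
    while True:
        stats = fetch_pg_stats()
        mapped = set(stats["up"]) | set(stats["acting"])
        if not (mapped & out):
            return stats
        if clock() >= deadline:
            raise TimeoutError(f"PG still mapped to out OSDs: {sorted(mapped & out)}")
        sleep(interval)


# Demo with canned snapshots: two stale reports, then a remapped one.
stale = {"up": [1, 0, 2], "acting": [1, 0, 2]}
fresh = {"up": [4, 3, 5], "acting": [4, 3, 5]}
canned = iter([stale, stale, fresh])
final = wait_for_remap(lambda: next(canned), {0, 1, 2}, interval=0,
                       sleep=lambda _seconds: None)
print(final["acting"])
```

<p>The <code>sleep</code> and <code>clock</code> parameters exist only so the loop can be exercised without real delays. In a real test the interval would need to be chosen with the stats reporting period in mind.</p>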
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=161507
2020-03-23T16:22:22Z
Neha Ojha
nojha@redhat.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Fix Under Review</i></li></ul><p>Fix 2: <a class="external" href="https://github.com/ceph/ceph/pull/34126">https://github.com/ceph/ceph/pull/34126</a></p>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=161512
2020-03-23T18:55:22Z
Sage Weil
sage@newdream.net
<ul><li><strong>Status</strong> changed from <i>Fix Under Review</i> to <i>Pending Backport</i></li><li><strong>Backport</strong> set to <i>octopus</i></li></ul>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=162021
2020-03-30T22:24:10Z
Nathan Cutler
ncutler@suse.cz
<ul><li><strong>Status</strong> changed from <i>Pending Backport</i> to <i>Resolved</i></li><li><strong>Backport</strong> deleted (<del><i>octopus</i></del>)</li></ul><pre>
$ git branch --contains b208177
master
* octopus
</pre>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=162022
2020-03-30T22:27:45Z
Neha Ojha
nojha@redhat.com
<ul><li><strong>Status</strong> changed from <i>Resolved</i> to <i>Pending Backport</i></li></ul><p>Nathan, we need <a class="external" href="https://github.com/ceph/ceph/pull/34126">https://github.com/ceph/ceph/pull/34126</a> as well - See <a class="external" href="https://tracker.ceph.com/issues/43807#note-15">https://tracker.ceph.com/issues/43807#note-15</a></p>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=162142
2020-03-31T10:04:01Z
Nathan Cutler
ncutler@suse.cz
<ul><li><strong>Backport</strong> set to <i>octopus</i></li></ul>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=162149
2020-03-31T10:06:58Z
Nathan Cutler
ncutler@suse.cz
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-9 status-3 priority-4 priority-default closed" href="/issues/44847">Backport #44847</a>: octopus: osd-backfill-recovery-log.sh fails</i> added</li></ul>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=162152
2020-03-31T10:10:44Z
Nathan Cutler
ncutler@suse.cz
<ul></ul><p>Neha Ojha wrote:</p>
<blockquote>
<p>Nathan, we need <a class="external" href="https://github.com/ceph/ceph/pull/34126">https://github.com/ceph/ceph/pull/34126</a> as well - See <a class="external" href="https://tracker.ceph.com/issues/43807#note-15">https://tracker.ceph.com/issues/43807#note-15</a></p>
</blockquote>
<p>Oops! Sorry! Got it now, and octopus PR is open for review.</p>
RADOS - Bug #43807: osd-backfill-recovery-log.sh fails
https://tracker.ceph.com/issues/43807?journal_id=162309
2020-04-01T19:34:31Z
Nathan Cutler
ncutler@suse.cz
<ul><li><strong>Status</strong> changed from <i>Pending Backport</i> to <i>Resolved</i></li></ul><p>While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".</p>