Bug #12738

scrub bogus results when missing a clone

Added by Samuel Just over 8 years ago. Updated about 8 years ago.

Status: Resolved
Priority: Urgent
Assignee: David Zafman
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport: hammer
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2015-08-20 20:15:47.150457 mon.0 10.12.2.2:6789/0 1313 : cluster [INF] pgmap v5060458: 2048 pgs: 1 active+clean+scrubbing+deep+inconsistent+repair, 1 active+clean+scrubbing+deep+inconsistent, 2046 active+clean; 6933 GB data, 19127 GB used, 150 TB / 169 TB avail; 3745 kB/s rd, 10543 kB/s wr, 559 op/s
2015-08-20 20:15:43.295815 osd.19 10.12.2.6:6838/1861727 295 : cluster [INF] 2.490 repair starts
2015-08-20 20:15:44.865367 osd.19 10.12.2.6:6838/1861727 296 : cluster [ERR] repair 2.490 3fac9490/rbd_data.eb5f22eb141f2.00000000000004ba/snapdir//2 missing clones
2015-08-20 20:15:44.865606 osd.19 10.12.2.6:6838/1861727 297 : cluster [ERR] repair 2.490 2d7b9490/rbd_data.18f92c3d1b58ba.0000000000006167/head//2 expected clone 3fac9490/rbd_data.eb5f22eb141f2.00000000000004ba/141//2
2015-08-20 20:15:44.865670 osd.19 10.12.2.6:6838/1861727 298 : cluster [ERR] repair 2.490 68c89490/rbd_data.16796a3d1b58ba.0000000000000047/head//2 expected clone 2d7b9490/rbd_data.18f92c3d1b58ba.0000000000006167/141//2
2015-08-20 20:15:44.865817 osd.19 10.12.2.6:6838/1861727 299 : cluster [ERR] repair 2.490 ded49490/rbd_data.11a25c7934d3d4.0000000000008a8a/head//2 expected clone 68c89490/rbd_data.16796a3d1b58ba.0000000000000047/141//2
2015-08-20 20:15:44.865881 osd.19 10.12.2.6:6838/1861727 300 : cluster [ERR] repair 2.490 bac19490/rbd_data.1238e82ae8944a.000000000000032e/head//2 expected clone ded49490/rbd_data.11a25c7934d3d4.0000000000008a8a/141//2
2015-08-20 20:15:44.865965 osd.19 10.12.2.6:6838/1861727 301 : cluster [ERR] repair 2.490 83319490/rbd_data.18e0c63d1b58ba.000000000000beb0/head//2 expected clone bac19490/rbd_data.1238e82ae8944a.000000000000032e/141//2
2015-08-20 20:15:44.866127 osd.19 10.12.2.6:6838/1861727 302 : cluster [ERR] repair 2.490 c3c09490/rbd_data.1238e82ae8944a.0000000000000c2b/head//2 expected clone 83319490/rbd_data.18e0c63d1b58ba.000000000000beb0/141//2
2015-08-20 20:15:44.866268 osd.19 10.12.2.6:6838/1861727 303 : cluster [ERR] repair 2.490 a0cd8490/rbd_data.e4b68613183f2.0000000000010bd9/head//2 expected clone c3c09490/rbd_data.1238e82ae8944a.0000000000000c2b/141//2
2015-08-20 20:15:44.866360 osd.19 10.12.2.6:6838/1861727 304 : cluster [ERR] repair 2.490 77ad8490/rbd_data.1238e82ae8944a.0000000000000890/head//2 expected clone a0cd8490/rbd_data.e4b68613183f2.0000000000010bd9/141//2
2015-08-20 20:15:44.866402 osd.19 10.12.2.6:6838/1861727 305 : cluster [ERR] repair 2.490 4f5d8490/rbd_data.19f7e72ae8944a.0000000000000baf/head//2 expected clone 77ad8490/rbd_data.1238e82ae8944a.0000000000000890/141//2
2015-08-20 20:15:44.866553 osd.19 10.12.2.6:6838/1861727 306 : cluster [ERR] repair 2.490 aa5a8490/rbd_data.18e0c63d1b58ba.0000000000031ff2/head//2 expected clone 4f5d8490/rbd_data.19f7e72ae8944a.0000000000000baf/141//2
2015-08-20 20:15:44.866685 osd.19 10.12.2.6:6838/1861727 307 : cluster [ERR] repair 2.490 ae0a8490/rbd_data.1a13562ae8944a.0000000000011861/head//2 expected clone aa5a8490/rbd_data.18e0c63d1b58ba.0000000000031ff2/141//2
2015-08-20 20:15:44.866759 osd.19 10.12.2.6:6838/1861727 308 : cluster [ERR] repair 2.490 66398490/rbd_data.16610e3d1b58ba.000000000000166e/head//2 expected clone ae0a8490/rbd_data.1a13562ae8944a.0000000000011861/141//2
2015-08-20 20:15:44.866783 osd.19 10.12.2.6:6838/1861727 309 : cluster [ERR] repair 2.490 5a688490/rbd_data.18e0c63d1b58ba.00000000000124b9/head//2 expected clone 66398490/rbd_data.16610e3d1b58ba.000000000000166e/141//2
2015-08-20 20:15:44.866797 osd.19 10.12.2.6:6838/1861727 310 : cluster [ERR] repair 2.490 6d178490/rbd_data.1238e82ae8944a.000000000000168f/head//2 expected clone 5a688490/rbd_data.18e0c63d1b58ba.00000000000124b9/141//2
2015-08-20 20:15:44.866818 osd.19 10.12.2.6:6838/1861727 311 : cluster [ERR] repair 2.490 a9b68490/rbd_data.18e0c63d1b58ba.00000000000319bb/head//2 expected clone 6d178490/rbd_data.1238e82ae8944a.000000000000168f/141//2
2015-08-20 20:15:44.866832 osd.19 10.12.2.6:6838/1861727 312 : cluster [ERR] repair 2.490 fde58490/rbd_data.18f92c3d1b58ba.000000000000a4b9/head//2 expected clone a9b68490/rbd_data.18e0c63d1b58ba.00000000000319bb/141//2

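I think it's some kind of silly off-by-one in ReplicatedPG::_scrub: each "expected clone" error above pairs a head object with snap 141 of the object from the previous line, not with one of its own clones.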
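To make the pattern in the log easier to check, here is a minimal Python sketch that pulls out the "expected clone" errors and flags those where the expected clone belongs to a different object than the head being reported. The regular expression and the function name `mismatched_expectations` are assumptions written against the log format shown above; they are not part of Ceph.

```python
import re

# Matches "expected clone" errors as they appear in the log above.
# Object specs are assumed to look like: <hash>/<object name>/<snap>//<pool>
PATTERN = re.compile(
    r"repair (?P<pg>\S+) "
    r"(?P<head_hash>[0-9a-f]+)/(?P<head_name>[^/]+)/head//\d+ "
    r"expected clone "
    r"(?P<clone_hash>[0-9a-f]+)/(?P<clone_name>[^/]+)/(?P<snap>[0-9a-f]+)//\d+"
)


def mismatched_expectations(lines):
    """Return (head_name, clone_name) pairs whose names differ."""
    pairs = []
    for line in lines:
        m = PATTERN.search(line)
        if m and m.group("head_name") != m.group("clone_name"):
            pairs.append((m.group("head_name"), m.group("clone_name")))
    return pairs


# Two of the error lines from the report, with the timestamp prefix trimmed.
sample = [
    "cluster [ERR] repair 2.490 2d7b9490/rbd_data.18f92c3d1b58ba.0000000000006167/head//2 "
    "expected clone 3fac9490/rbd_data.eb5f22eb141f2.00000000000004ba/141//2",
    "cluster [ERR] repair 2.490 68c89490/rbd_data.16796a3d1b58ba.0000000000000047/head//2 "
    "expected clone 2d7b9490/rbd_data.18f92c3d1b58ba.0000000000006167/141//2",
]

for head, clone in mismatched_expectations(sample):
    print(f"{head} was checked against a clone of {clone}")
```

On these two lines, both errors are flagged, and the clone named in the second error is the head named in the first, which is consistent with the state for one object leaking into the check for the next.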


Related issues

Copied to Ceph - Backport #14077: hammer: scrub bogus results when missing a clone Resolved

Associated revisions

Revision a23036c6 (diff)
Added by David Zafman over 8 years ago

osd: Make the _scrub routine produce good output and detect errors properly

Catch decode errors so osd doesn't crash on corrupt OI_ATTR or SS_ATTR
Use boost::optional<> to make current state clearer
Create next_clone as needed using head/curclone
Add equivalent logic after getting to end of scrubmap.objects

Fixes: #12738

Signed-off-by: David Zafman <>
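The points in this commit message can be illustrated with a toy model. The Python sketch below is not the code in ReplicatedPG::_scrub; it assumes a simplified scan in which each head entry is followed by its clones, newest first, and it uses a hypothetical comma-separated string in place of the real SS_ATTR encoding. It shows explicit optional state for the current head, a decode failure reported as an error instead of a crash, and the same missing-clone check repeated when the scan runs out of objects.

```python
from typing import List, Optional, Tuple

# (object name, "head" or a clone snap id, raw snapset); all hypothetical.
Entry = Tuple[str, str, str]


def decode_snapset(raw: str) -> List[int]:
    """Stand-in for decoding SS_ATTR: 'a,b,c' -> [a, b, c]."""
    return [int(tok) for tok in raw.split(",") if tok]


def scrub(entries: List[Entry]) -> List[str]:
    errors: List[str] = []
    head: Optional[str] = None           # object whose clones are being matched
    pending: Optional[List[int]] = None  # clones still expected; None = unknown

    def flush() -> None:
        # Report clones still expected for the object we are leaving.
        for snap in pending or []:
            errors.append(f"{head}: missing clone {snap}")

    for name, snap, raw in entries:
        if snap == "head":
            flush()  # settle the previous object before starting a new one
            head = name
            try:
                pending = decode_snapset(raw)
            except ValueError:
                # Corrupt metadata becomes a scrub error, not a crash.
                errors.append(f"{name}: corrupt snapset")
                pending = None
            continue
        if name != head:
            errors.append(f"{name}/{snap}: clone without head")
        elif pending is None:
            continue  # snapset unreadable, nothing to compare against
        elif int(snap) in pending:
            idx = pending.index(int(snap))
            # Clones expected before this one never showed up.
            errors.extend(f"{head}: missing clone {s}" for s in pending[:idx])
            pending = pending[idx + 1:]
        else:
            errors.append(f"{name}/{snap}: unexpected clone")
    flush()  # same check once the scan reaches the end of the object list
    return errors


print(scrub([
    ("a", "head", "3,1"), ("a", "3", ""),   # clone 1 of "a" is missing
    ("b", "head", "2"), ("b", "2", ""),
    ("c", "head", ""),
]))
```

In this model the missing clone is reported once, against object "a" only; "b" and "c" are not blamed, because the expected-clone state is reset every time a new head is seen.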

Revision 18af852a (diff)
Added by David Zafman about 8 years ago

osd: Make the _scrub routine produce good output and detect errors properly

Catch decode errors so osd doesn't crash on corrupt OI_ATTR or SS_ATTR
Use boost::optional<> to make current state clearer
Create next_clone as needed using head/curclone
Add equivalent logic after getting to end of scrubmap.objects

Fixes: #12738

Signed-off-by: David Zafman <>
(cherry picked from commit a23036c6fd7de5d1dbc2bd30c967c0be51d94ca5)

Conflicts:
src/osd/ReplicatedPG.cc (no num_objects_pinned in hammer)
src/osd/ReplicatedPG.h (no get_temp_recovery_object() in hammer)

History

#1 Updated by Samuel Just over 8 years ago

  • Status changed from New to 12

#2 Updated by David Zafman over 8 years ago

  • Assignee set to David Zafman

#3 Updated by David Zafman over 8 years ago

  • Status changed from 12 to In Progress

#4 Updated by David Zafman over 8 years ago

  • Status changed from In Progress to Fix Under Review

Part of pull request #5783

0ac3dbb7ad2eedf570c56afa8fda327b78492388

#5 Updated by David Zafman over 8 years ago

  • Status changed from Fix Under Review to Resolved

a23036c6fd7de5d1dbc2bd30c967c0be51d94ca5

#6 Updated by Ken Dreyer over 8 years ago

  • Status changed from Resolved to Pending Backport
  • Backport changed from hammer,firefly to hammer

#7 Updated by Loïc Dachary over 8 years ago

  • Copied to Backport #14077: hammer: scrub bogus results when missing a clone added

#8 Updated by Loïc Dachary about 8 years ago

  • Status changed from Pending Backport to Resolved
