Bug #19247

Bluestore may return improper object content under clone/zero/read scenario

Added by Igor Fedotov about 7 years ago. Updated almost 7 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The issue is occasionally observed when running the StoreTest.SyntheticMatrixCsumVsCompression/2 test case.
The test case reports a content mismatch on a read op:
---------------------- 11 / 16 ----------------------
max_write = 131072
max_size = 262144
alignment = 512
bluestore_min_alloc_size = 16384
bluestore_compression_mode = force
bluestore_compression_algorithm = snappy
bluestore_csum_type = crc32c
bluestore_default_buffered_read = false
bluestore_default_buffered_write = true
bluestore_sync_submit_transaction = false
seeding object 0
seeding object 500
Op 0
available_objects: 995 in_flight_objects: 5 total objects: 1000 in_flight 5
Op 1000
available_objects: 995 in_flight_objects: 1 total objects: 996 in_flight 1
Op 2000
available_objects: 1008 in_flight_objects: 3 total objects: 1011 in_flight 3
Op 3000
available_objects: 1018 in_flight_objects: 2 total objects: 1020 in_flight 2
Op 4000
available_objects: 1032 in_flight_objects: 2 total objects: 1034 in_flight 2
--- buffer mismatch between offset 0x1a800 and 0x1d000, total 0x1fc00
--- expected:
00014600 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| *
0001a800 38 36 34 39 30 34 38 34 32 30 35 39 39 35 36 33 |8649048420599563|
0001a810 31 31 31 30 37 38 38 39 38 38 32 34 31 35 32 31 |1110788988241521|

---actual
00014600 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| *
0001d000 37 31 32 38 34 37 30 30 36 34 32 32 39 31 34 34 |7128470064229144|
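
For readers unfamiliar with this kind of report, here is a minimal sketch of how a mismatch range such as the one above could be computed from two buffers. It is not the code used by the test; the function name first_mismatch_range and the synthetic buffers are hypothetical. It assumes the reported range means "expected data starts at 0x1a800, but the actual buffer stays zero until 0x1d000", which is how the dumps read.

```python
def first_mismatch_range(expected: bytes, actual: bytes):
    """Return (start, end) of the first contiguous run of differing bytes.

    Hypothetical helper for illustration; returns None if the common
    prefix of the two buffers is identical.
    """
    n = min(len(expected), len(actual))
    start = next((i for i in range(n) if expected[i] != actual[i]), None)
    if start is None:
        return None
    end = start
    while end < n and expected[end] != actual[end]:
        end += 1
    return start, end


# Synthetic buffers shaped like the dump: the object is 0x1fc00 bytes long.
# Expected: zeroes up to 0x1a800, then non-zero data.
# Actual:   zeroes up to 0x1d000, then the same non-zero data.
TOTAL = 0x1FC00
expected = bytes(0x1A800) + b"7" * (TOTAL - 0x1A800)
actual = bytes(0x1D000) + b"7" * (TOTAL - 0x1D000)

mismatch = first_mismatch_range(expected, actual)
if mismatch is not None:
    print("mismatch between offset", hex(mismatch[0]), "and", hex(mismatch[1]))
```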

Reproducing it may take 5-50 runs of the test case.
Only an SSD drive was used.

The issue is reproducible on master and on commits down to 3cbe67f; older commits were not verified:
Merge: fd39516 b893123
Author: Sage Weil <>
Date: Wed Feb 15 21:51:37 2017 -0600

Merge pull request #13390 from Liuchang0812/remove-header
os/bluestore: some cleanup

An analysis of multiple failures shows a general pattern:
1) Clone a whole object or some range
2) Read on target - OK
3) Zero some range on source
4) Read on source - OK
5) Read on target - content mismatch
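
The pattern above can be written down as a toy reference model that states the expected behaviour: zeroing the source after a clone must not change what the target returns. This is only a sketch. ModelStore and its methods are hypothetical names for an in-memory model, not the BlueStore or ObjectStore API, and it models a whole-object clone only.

```python
class ModelStore:
    """Toy in-memory object store (hypothetical, NOT the BlueStore API)."""

    def __init__(self):
        self.objects = {}

    def write(self, name, offset, data):
        buf = self.objects.setdefault(name, bytearray())
        end = offset + len(data)
        if len(buf) < end:
            buf.extend(bytes(end - len(buf)))  # pad with zeroes up to the write end
        buf[offset:end] = data

    def clone(self, src, dst):
        # The clone must behave as an independent copy of the source.
        self.objects[dst] = bytearray(self.objects[src])

    def zero(self, name, offset, length):
        self.write(name, offset, bytes(length))

    def read(self, name, offset, length):
        return bytes(self.objects[name][offset:offset + length])


store = ModelStore()
payload = b"0123456789" * 100           # 1000 bytes of non-zero data
store.write("source", 0, payload)

store.clone("source", "target")                            # 1) clone
target_before = store.read("target", 0, len(payload))      # 2) read on target
store.zero("source", 200, 300)                             # 3) zero a range on source
source_after = store.read("source", 0, len(payload))       # 4) read on source
target_after = store.read("target", 0, len(payload))       # 5) read on target

# Expected behaviour: only the source sees the zeroed range.
assert source_after == payload[:200] + bytes(300) + payload[500:]
assert target_before == payload
assert target_after == payload
```

In the failures described here, step 5 is the read that returns zeroes in part of the range zeroed at step 3, which is exactly what the last assertion in the model rules out.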

Other observations:
- The broken content is filled with zeroes.
- Reads of the broken extent are always served from the cache.
- The broken extent originates from the clone op, i.e. it existed in the source object.
- The broken extent correlates strictly with the range where zero was applied: the broken range is always a subset of the zeroed range, though not necessarily all of it.

An attempt to reproduce the scenario directly in a simplified standalone test case doesn't reveal any issues.
It looks like the analysis above is missing some essential detail.

A log snippet for the case is attached below.

case8.txt (48.1 KB) Igor Fedotov, 03/09/2017 11:46 AM

19247.cc - A test case with an unsuccessful attempt to reproduce the bug (2.88 KB) Igor Fedotov, 03/09/2017 12:02 PM

History

#1 Updated by Igor Fedotov about 7 years ago

#2 Updated by Igor Fedotov almost 7 years ago

Looks resolved in current master (03/31/17)

#3 Updated by Sage Weil almost 7 years ago

  • Status changed from New to Resolved
