Bug #4364
ObjectCacher: inconsistency after flatten (closed)
% Done: 0%
Source: Development
Tags:
Backport: bobtail
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Using a vstart-created cluster with:
./ceph_test_librbd_fsx -l 40960 -N 1000 -S 2 -d rbd foo8
and ceph.conf including:
[client]
    keyring = /home/joshd/ceph/src/keyring
    log file = out/$name.log
    rbd cache = true
    ms inject socket failures = 500
    log max recent = 500
    log max new = 1000
    debug rbd = 20
    debug ms = 1
    debug objectcacher = 30
There's a read interpreted as starting with zeros (since the first bh is marked !exists in the cache):
truncating to largest ever: 0x774c
1 trunc from 0x0 to 0x774c
2 trunc from 0x774c to 0x8b7
3 write 0x5a51 thru 0x9fff (0x45af bytes)
4 read 0x3036 thru 0x533f (0x230a bytes)
5 punch from 0x1340 to 0x4d9b, (0x3a5b bytes)
8 read 0x4eca thru 0x9fff (0x5136 bytes)
9 trunc from 0xa000 to 0x3c31
13 write 0x50a thru 0x9fff (0x9af6 bytes)
14 punch from 0x8db2 to 0xa000, (0x124e bytes)
16 punch from 0x2519 to 0xa000, (0x7ae7 bytes)
18 trunc from 0xa000 to 0x7621
19 punch from 0x1430 to 0x7621, (0x61f1 bytes)
21 read 0x4969 thru 0x7620 (0x2cb8 bytes)
22 read 0x5211 thru 0x7620 (0x2410 bytes)
23 read 0x68b4 thru 0x7620 (0xd6d bytes)
24 write 0x1d41 thru 0x9fff (0x82bf bytes)
27 clone 1 order 22 su 2097152 sc 2
28 trunc from 0xa000 to 0x5b5c
29 trunc from 0x5b5c to 0x277d
30 read 0x20f3 thru 0x277c (0x68a bytes)
31 write 0x1a6c thru 0x9fff (0x8594 bytes)
32 write 0x7922 thru 0x9fff (0x26de bytes)
33 write 0x6c34 thru 0x80e2 (0x14af bytes)
35 flatten
37 read 0x807 thru 0x9fff (0x97f9 bytes)
READ BAD DATA: offset = 0x807, size = 0x97f9, fname = foo8
OFFSET GOOD BAD RANGE
0x 807 0x960d 0x0000 0x 0
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x 808 0x0d1f 0x0000 0x 1
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x 809 0x1f0d 0x0000 0x 2
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x 80a 0x0d6d 0x0000 0x 3
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x 80b 0x6d0d 0x0000 0x 4
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x 80c 0x0d9b 0x0000 0x 5
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x 80d 0x9b0d 0x0000 0x 6
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x 80e 0x0d48 0x0000 0x 7
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x 80f 0x480d 0x0000 0x 8
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x 810 0x0d59 0x0000 0x 9
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x 811 0x590d 0x0000 0x a
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x 812 0x0dca 0x0000 0x b
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x 813 0xca0d 0x0000 0x c
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x 814 0x0dfc 0x0000 0x d
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x 815 0xfc0d 0x0000 0x e
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x 816 0x0ddd 0x0000 0x f
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
LOG DUMP (37 total operations):
1( 1 mod 256): TRUNCATE UP from 0x0 to 0x774c ******WWWW
2( 2 mod 256): TRUNCATE DOWN from 0x774c to 0x8b7 ******WWWW
3( 3 mod 256): WRITE 0x5a51 thru 0x9fff (0x45af bytes) HOLE ***WWWW
4( 4 mod 256): READ 0x3036 thru 0x533f (0x230a bytes)
5( 5 mod 256): PUNCH 0x1340 thru 0x4d9a (0x3a5b bytes) ******PPPP
6( 6 mod 256): SKIPPED (no operation)
7( 7 mod 256): SKIPPED (no operation)
8( 8 mod 256): READ 0x4eca thru 0x9fff (0x5136 bytes)
9( 9 mod 256): TRUNCATE DOWN from 0xa000 to 0x3c31
10( 10 mod 256): SKIPPED (no operation)
11( 11 mod 256): SKIPPED (no operation)
12( 12 mod 256): SKIPPED (no operation)
13( 13 mod 256): WRITE 0x50a thru 0x9fff (0x9af6 bytes) EXTEND ***WWWW
14( 14 mod 256): PUNCH 0x8db2 thru 0x9fff (0x124e bytes)
15( 15 mod 256): SKIPPED (no operation)
16( 16 mod 256): PUNCH 0x2519 thru 0x9fff (0x7ae7 bytes)
17( 17 mod 256): SKIPPED (no operation)
18( 18 mod 256): TRUNCATE DOWN from 0xa000 to 0x7621
19( 19 mod 256): PUNCH 0x1430 thru 0x7620 (0x61f1 bytes)
20( 20 mod 256): SKIPPED (no operation)
21( 21 mod 256): READ 0x4969 thru 0x7620 (0x2cb8 bytes)
22( 22 mod 256): READ 0x5211 thru 0x7620 (0x2410 bytes)
23( 23 mod 256): READ 0x68b4 thru 0x7620 (0xd6d bytes)
24( 24 mod 256): WRITE 0x1d41 thru 0x9fff (0x82bf bytes) EXTEND
25( 25 mod 256): SKIPPED (no operation)
26( 26 mod 256): SKIPPED (no operation)
27( 27 mod 256): CLONE
28( 28 mod 256): TRUNCATE DOWN from 0xa000 to 0x5b5c
29( 29 mod 256): TRUNCATE DOWN from 0x5b5c to 0x277d
30( 30 mod 256): READ 0x20f3 thru 0x277c (0x68a bytes)
31( 31 mod 256): WRITE 0x1a6c thru 0x9fff (0x8594 bytes) EXTEND
32( 32 mod 256): WRITE 0x7922 thru 0x9fff (0x26de bytes)
33( 33 mod 256): WRITE 0x6c34 thru 0x80e2 (0x14af bytes)
34( 34 mod 256): SKIPPED (no operation)
35( 35 mod 256): FLATTEN
36( 36 mod 256): SKIPPED (no operation)
37( 37 mod 256): READ 0x807 thru 0x9fff (0x97f9 bytes) ***RRRR***
Correct content saved for comparison
(maybe hexdump "foo8" vs "foo8.fsxgood")
Exporting the image and comparing to foo8.fsxgood shows that the actual image matches, but the cache was wrong.
I'm guessing this is a bug in flatten and cache interaction, so you'd only see it if you're using librbd directly,
but it could be a more general problem with the cache.
Client log is attached.
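For anyone repeating the export comparison described above, here is a minimal sketch of a byte-range compare in Python. It is not the tool used for this report. The helper name first_mismatch and the file name foo8.export are assumptions for illustration, with foo8.export standing for the image exported to a local file (for example with rbd export).

```python
def first_mismatch(path_a, path_b, offset=0, length=None, chunk=1 << 16):
    """Return the first absolute offset at which two files differ, or None.

    Only the byte range [offset, offset + length) is compared; with
    length=None the comparison runs to the end of both files.
    """
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        fa.seek(offset)
        fb.seek(offset)
        pos = offset
        remaining = length
        while remaining is None or remaining > 0:
            want = chunk if remaining is None else min(chunk, remaining)
            a = fa.read(want)
            b = fb.read(want)
            if a != b:
                # Locate the first differing byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return pos + i
                # Common prefix is equal, so one file ended early.
                return pos + min(len(a), len(b))
            if not a:
                # Both files hit EOF without any difference.
                return None
            pos += len(a)
            if remaining is not None:
                remaining -= len(a)
    return None


# Hypothetical usage over the range of the failing read (op 37):
# first_mismatch("foo8.export", "foo8.fsxgood", offset=0x807, length=0x97f9)
```

A result of None over the failing range would match what is reported here: the data in the image is correct and only the cached view was wrong.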
Files
Updated by Josh Durgin about 11 years ago
- Status changed from In Progress to 7
If this doesn't cause any problems, it should be backported to bobtail. Leaving in testing until then.
Updated by Sage Weil about 11 years ago
- Status changed from 7 to Pending Backport
Updated by Josh Durgin almost 11 years ago
- Status changed from Pending Backport to Resolved