Bug #9054
ceph_test_rados: FAILED assert(!old_value.deleted())
Description
ubuntu@teuthology:/a/teuthology-2014-08-06_02:30:01-rados-next-testing-basic-plana/403383
2014-08-08T11:03:40.624 INFO:teuthology.task.rados.rados.0.plana04.stdout:2997: read oid 439 snap 261
2014-08-08T11:03:40.624 INFO:teuthology.task.rados.rados.0.plana04.stdout:2998: write oid 69 current snap is 262
2014-08-08T11:03:40.624 INFO:teuthology.task.rados.rados.0.plana04.stdout:2998: seq_num 1310 ranges {542063=481193,1342062=1}
2014-08-08T11:03:40.656 INFO:teuthology.task.rados.rados.0.plana04.stdout:2998: writing plana0410548-69 from 542063 to 1023256 tid 1
2014-08-08T11:03:40.661 INFO:teuthology.task.rados.rados.0.plana04.stdout:2998: writing plana0410548-69 from 1342062 to 1342063 tid 2
2014-08-08T11:03:40.662 INFO:teuthology.task.rados.rados.0.plana04.stdout:2999: snap_remove snap 255
2014-08-08T11:03:40.721 INFO:teuthology.task.rados.rados.0.plana04.stdout:2997: expect (ObjNum 17004944 snap 0 seq_num 3355462528)
2014-08-08T11:03:40.722 INFO:teuthology.task.rados.rados.0.plana04.stderr:./test/osd/RadosModel.h: In function 'virtual void ReadOp::_finish(TestOp::CallbackInfo*)' thread 7f78117fa700 time 2014-08-08 11:03:40.720568
2014-08-08T11:03:40.722 INFO:teuthology.task.rados.rados.0.plana04.stderr:./test/osd/RadosModel.h: 1062: FAILED assert(!old_value.deleted())
2014-08-08T11:03:40.722 INFO:teuthology.task.rados.rados.0.plana04.stderr: ceph version 0.83-412-gdceab8d (dceab8dc49d4f458e3bc9fd40eb8b487f1e35948)
2014-08-08T11:03:40.723 INFO:teuthology.task.rados.rados.0.plana04.stderr: 1: (ReadOp::_finish(TestOp::CallbackInfo*)+0xe08) [0x421398]
2014-08-08T11:03:40.723 INFO:teuthology.task.rados.rados.0.plana04.stderr: 2: (librados::C_AioComplete::finish(int)+0x1d) [0x7f781d62cfbd]
2014-08-08T11:03:40.723 INFO:teuthology.task.rados.rados.0.plana04.stderr: 3: (Context::complete(int)+0x9) [0x7f781d609e99]
2014-08-08T11:03:40.723 INFO:teuthology.task.rados.rados.0.plana04.stderr: 4: (Finisher::finisher_thread_entry()+0x1b8) [0x7f781d6bbcc8]
2014-08-08T11:03:40.723 INFO:teuthology.task.rados.rados.0.plana04.stderr: 5: (()+0x8182) [0x7f781d251182]
2014-08-08T11:03:40.723 INFO:teuthology.task.rados.rados.0.plana04.stderr: 6: (clone()+0x6d) [0x7f781ca6430d]
Associated revisions
ReplicatedPG:start_flush send a second delete
Suppose we start with the following in the cache pool:
30:[29,21,20,15,10,4]:[22(21), 15(15,10), 4(4)]+head
The object doesn't exist at 29 or 20.
First, we flush 4 leaving the backing pool with:
3:[]+head
Then, we begin to flush 15 with a delete with snapc 4:[4] leaving the
backing pool with:
4:[4]:[4(4)]
Then, we finish flushing 15 with snapc 9:[4], leaving the backing
pool with:
9:[4]:[4(4)]+head
Next, snaps 10 and 15 are removed causing clone 10 to be removed leaving
the cache with:
30:[29,21,20,4]:[22(21),4(4)]+head
We next begin to flush 22 by sending a delete with snapc 4:[4] since
prev_snapc is 4 <---------- here is the bug
The backing pool ignores this request since 4 < 9 (ORDERSNAP) leaving it
with:
9:[4]:[4(4)]+head
Then, we complete flushing 22 with snapc 19:[4] leaving the backing pool
with:
19:[4]:[4(4)]+head
Then, we begin to flush head by deleting with snapc 22:[21,20,4] leaving
the backing pool with:
22:[21,20,4]:[22(21,20), 4(4)]
Finally, we flush head leaving the backing pool with:
30:[29,21,20,4]:[22(21*,20*),4(4)]+head
When we go to flush clone 22, all we know is that 22 is dirty, has snaps
[21], and 4 is clean. As part of flushing 22, we need to do two things:
1) Ensure that the current head is cloned as cloneid 4 with snaps [4] by
sending a delete at snapc 4:[4].
2) Flush the data at snap sequence < 21 by sending a copyfrom with snapc
20:[20,4].
Unfortunately, it is possible that 1, 1&2, or 1 and part of the flush
process for some other now non-existent clone have already been
performed. Because of that, between 1) and 2), we need to send
a second delete ensuring that the object does not exist at 20.
Fixes: #9054
Backport: firefly
Signed-off-by: Samuel Just <sam.just@inktank.com>
ReplicatedPG:start_flush send a second delete
Suppose we start with the following in the cache pool:
30:[29,21,20,15,10,4]:[22(21), 15(15,10), 4(4)]+head
The object doesn't exist at 29 or 20.
First, we flush 4 leaving the backing pool with:
3:[]+head
Then, we begin to flush 15 with a delete with snapc 4:[4] leaving the
backing pool with:
4:[4]:[4(4)]
Then, we finish flushing 15 with snapc 9:[4], leaving the backing
pool with:
9:[4]:[4(4)]+head
Next, snaps 10 and 15 are removed causing clone 10 to be removed leaving
the cache with:
30:[29,21,20,4]:[22(21),4(4)]+head
We next begin to flush 22 by sending a delete with snapc 4:[4] since
prev_snapc is 4 <---------- here is the bug
The backing pool ignores this request since 4 < 9 (ORDERSNAP) leaving it
with:
9:[4]:[4(4)]+head
Then, we complete flushing 22 with snapc 19:[4] leaving the backing pool
with:
19:[4]:[4(4)]+head
Then, we begin to flush head by deleting with snapc 22:[21,20,4] leaving
the backing pool with:
22:[21,20,4]:[22(21,20), 4(4)]
Finally, we flush head leaving the backing pool with:
30:[29,21,20,4]:[22(21*,20*),4(4)]+head
When we go to flush clone 22, all we know is that 22 is dirty, has snaps
[21], and 4 is clean. As part of flushing 22, we need to do two things:
1) Ensure that the current head is cloned as cloneid 4 with snaps [4] by
sending a delete at snapc 4:[4].
2) Flush the data at snap sequence < 21 by sending a copyfrom with snapc
20:[20,4].
Unfortunately, it is possible that 1, 1&2, or 1 and part of the flush
process for some other now non-existent clone have already been
performed. Because of that, between 1) and 2), we need to send
a second delete ensuring that the object does not exist at 20.
Fixes: #9054
Backport: firefly
Related: 66c7439ea0888777b5cfc08bcb0fbd7bfd8653c3
Signed-off-by: Samuel Just <sam.just@inktank.com>
ReplicatedPG:start_flush send a second delete
Suppose we start with the following in the cache pool:
30:[29,21,20,15,10,4]:[22(21), 15(15,10), 4(4)]+head
The object doesn't exist at 29 or 20.
First, we flush 4 leaving the backing pool with:
3:[]+head
Then, we begin to flush 15 with a delete with snapc 4:[4] leaving the
backing pool with:
4:[4]:[4(4)]
Then, we finish flushing 15 with snapc 9:[4], leaving the backing
pool with:
9:[4]:[4(4)]+head
Next, snaps 10 and 15 are removed causing clone 10 to be removed leaving
the cache with:
30:[29,21,20,4]:[22(21),4(4)]+head
We next begin to flush 22 by sending a delete with snapc 4:[4] since
prev_snapc is 4 <---------- here is the bug
The backing pool ignores this request since 4 < 9 (ORDERSNAP) leaving it
with:
9:[4]:[4(4)]+head
Then, we complete flushing 22 with snapc 19:[4] leaving the backing pool
with:
19:[4]:[4(4)]+head
Then, we begin to flush head by deleting with snapc 22:[21,20,4] leaving
the backing pool with:
22:[21,20,4]:[22(21,20), 4(4)]
Finally, we flush head leaving the backing pool with:
30:[29,21,20,4]:[22(21*,20*),4(4)]+head
When we go to flush clone 22, all we know is that 22 is dirty, has snaps
[21], and 4 is clean. As part of flushing 22, we need to do two things:
1) Ensure that the current head is cloned as cloneid 4 with snaps [4] by
sending a delete at snapc 4:[4].
2) Flush the data at snap sequence < 21 by sending a copyfrom with snapc
20:[20,4].
Unfortunately, it is possible that 1, 1&2, or 1 and part of the flush
process for some other now non-existent clone have already been
performed. Because of that, between 1) and 2), we need to send
a second delete ensuring that the object does not exist at 20.
Fixes: #9054
Backport: firefly
Related: 66c7439ea0888777b5cfc08bcb0fbd7bfd8653c3
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 4843fd510b33a71999cdf9c2cfa2b4c318fa80fd)
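To make the walkthrough above easier to check, here is a small, self-contained C++ sketch of the two backing-pool behaviours the commit message relies on. It is a toy model written for this note, not the real make_writeable()/start_flush() code: an accepted write or delete whose snap context carries a snap newer than the object's snap_seq first clones the current head (the clone id being the snap context's seq, its snaps being the context snaps newer than the old seq), and ORDERSNAP drops any op whose snap context seq is older than the object's. Replaying the ops listed above reproduces the final state 30:[29,21,20,4]:[22(21*,20*),4(4)]+head.

// Toy model of the backing pool's snapshot bookkeeping, written for this
// note (it is NOT the real make_writeable()/start_flush() code). It replays
// the op sequence from the commit message above and ends with clone 22
// covering snaps {20,21} instead of {21}.
#include <cstdio>
#include <map>
#include <set>
#include <vector>

struct SnapContext {
  unsigned seq;                  // snap context seq, the N in the notation N:[a,b,...]
  std::vector<unsigned> snaps;   // snaps, newest first
};

struct BackingObject {
  bool head_exists = false;
  unsigned seq = 0;                               // highest snap context seq applied so far
  std::map<unsigned, std::set<unsigned>> clones;  // clone id -> snaps it covers

  // Apply a copy-from (write) or delete carrying snap context 'snapc'.
  bool apply(const SnapContext& snapc, bool is_delete, bool ordersnap,
             const char* label) {
    if (ordersnap && snapc.seq < seq) {
      std::printf("%-44s dropped (ORDERSNAP, %u < %u)\n", label, snapc.seq, seq);
      return false;
    }
    // Clone the current head if the snap context has a snap newer than seq.
    if (head_exists && !snapc.snaps.empty() && snapc.snaps.front() > seq) {
      std::set<unsigned> newer;
      for (unsigned snap : snapc.snaps)
        if (snap > seq)
          newer.insert(snap);
      clones[snapc.seq] = newer;   // the clone id is the snap context's seq
    }
    seq = snapc.seq;               // the snapset seq follows the last accepted op
    head_exists = !is_delete;      // a delete removes head, a write (re)creates it
    print(label);
    return true;
  }

  void print(const char* label) const {
    std::printf("%-44s seq %-3u clones [", label, seq);
    for (const auto& c : clones) {
      std::printf(" %u(", c.first);
      for (unsigned s : c.second)
        std::printf("%u ", s);
      std::printf(")");
    }
    std::printf(" ]%s\n", head_exists ? " +head" : "");
  }
};

int main() {
  BackingObject obj;
  // The sequence described in the commit message / comment #8:
  obj.apply({3, {}},               false, false, "flush clone 4 (copy-from 3:[])");
  obj.apply({4, {4}},              true,  false, "begin flush of 15 (delete 4:[4])");
  obj.apply({9, {4}},              false, false, "finish flush of 15 (copy-from 9:[4])");
  // snaps 10 and 15 are trimmed in the cache, removing clone 15 there.
  obj.apply({4, {4}},              true,  true,  "begin flush of 22 (delete 4:[4])");  // <-- the bug: dropped
  obj.apply({19, {4}},             false, false, "finish flush of 22 (copy-from 19:[4])");
  obj.apply({22, {21, 20, 4}},     true,  false, "begin flush of head (delete 22:[21,20,4])");
  obj.apply({30, {29, 21, 20, 4}}, false, false, "finish flush of head (copy-from 30:[29,21,20,4])");
  // Clone 22 now covers snaps {20,21}, although the cache pool says the
  // object did not exist at snap 20.
  return 0;
}

The last line of output shows clone 22 carrying snaps 20 and 21 (the starred 22(21*,20*) above), which is why a snap read at 20 served from the backing pool returns data for an object that the cache pool considers nonexistent at that snap.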
History
#1 Updated by Sage Weil about 9 years ago
almost there. on osd.0, we finish trimming 14a here:
2014-08-08 10:55:12.311901 7f1237847700 10 osd.0 pg_epoch: 462 pg[2.1( v 462'2839 (0'0,462'2839] local-les=422 n=53 ec=6 les/c 422/422 421/421/421) [0,1] r=0 lpr=421 luod=462'2832 crt=459'2828 lcod 462'2830 mlcod 462'2830 active+clean snaptrimq=[14a~1]] SnapTrimmer state<NotTrimming>: NotTrimming: trimming 14a
and a moment later create a clone here:
2014-08-08 10:55:12.391746 7f123a04c700 10 osd.0 pg_epoch: 462 pg[2.1( v 462'2847 (0'0,462'2847] local-les=422 n=51 ec=6 les/c 422/422 421/421/421) [0,1] r=0 lpr=421 luod=462'2834 crt=462'2832 lcod 462'2832 mlcod 462'2832 active+clean] final snapset 14a=[149,148,147,146,144,143,141,140,13e,13b,13a,139,134,12b,129,128,10f,10b]:[10f,14a]+head in df2a6681/plana0410548-439/head//2
#2 Updated by Sage Weil about 9 years ago
2014-08-08 10:55:12.312751 7f1237847700 10 osd.0 pg_epoch: 462 pg[2.1( v 462'2839 (0'0,462'2839] local-les=422 n=53 ec=6 les/c 422/422 421/421/421) [0,1] r=0 lpr=421 luod=462'2832 crt=459'2828 lcod 462'2830 mlcod 462'2830 active+clean snaptrimq=[14a~1]] SnapTrimmer state<Trimming/WaitingOnReplicas>: WaitingOnReplicas: adding snap 14a to purged_snaps
...
2014-08-08 10:55:12.391576 7f123a04c700 10 osd.0 pg_epoch: 462 pg[2.1( v 462'2847 (0'0,462'2847] local-les=422 n=51 ec=6 les/c 422/422 421/421/421) [0,1] r=0 lpr=421 luod=462'2834 crt=462'2832 lcod 462'2832 mlcod 462'2832 active+clean] cloning v 462'2837 to df2a6681/plana0410548-439/14a//2 v 462'2848 snaps=[149,148,147,146,144,143,141,140,13e,13b,13a,139,134,12b,129,128]
hmm, the clone isn't dirty
#3 Updated by Sage Weil about 9 years ago
- Assignee deleted (Sage Weil)
#4 Updated by Samuel Just about 9 years ago
- Assignee set to Samuel Just
#5 Updated by Samuel Just about 9 years ago
Hmm, I think the bug is like this:
Normally, if we get the following op sequence:
- write 1:[]
- delete 10:[3] (creates clone 10 with snaps 3)
- trim 3
- write 12:[11]
- (async, removes clone 10)
We would see something like the following sent to the backend:
- copyfrom@1 1:[] ENFORCE_SNAPC
- delete 10:[3] ORDERSNAP, ENFORCE_SNAPC (creates clone 10 with snaps 3)
- copyfrom@12 12:[11] ENFORCE_SNAPC
- (async, removes clone 10)
If instead we reorder a bit though, we might trim clone 10 before we send the delete and get:
- copyfrom@1 1:[] ENFORCE_SNAPC
- (async on cache pool, remove clone 10)
- copyfrom@11 12:[11] ENFORCE_SNAPC
This will cause the backing pool to create clone 12 instead of clone 10, and to incorrectly believe snap 11 to be part of a period during which the object existed after the first write instead of a period where the object did not exist.
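The same toy cloning rule used in the sketch after the associated revisions (clone the head when an op's snap context carries a snap newer than the object's snap_seq; the clone id is the context's seq) makes the difference between the two sequences concrete. Again an illustration only, not the OSD code:

// Which clone does the backing pool create for each op sequence?
// Toy illustration of the two sequences above (not the real OSD code).
#include <cstdio>
#include <vector>

struct Op {
  bool is_delete;
  unsigned seq;                  // snap context seq
  std::vector<unsigned> snaps;   // snap context snaps, newest first
};

static void replay(const char* name, const std::vector<Op>& ops) {
  bool head_exists = false;
  unsigned obj_seq = 0;
  std::printf("%s:\n", name);
  for (const auto& op : ops) {
    // Same rule as before: clone the head when the op carries a snap
    // newer than anything the object has seen.
    if (head_exists && !op.snaps.empty() && op.snaps.front() > obj_seq)
      std::printf("  snapc %u:[%u] clones head as clone %u with snaps [%u]\n",
                  op.seq, op.snaps.front(), op.seq, op.snaps.front());
    obj_seq = op.seq;
    head_exists = !op.is_delete;
  }
}

int main() {
  // Expected order: the delete with snapc 10:[3] reaches the backing pool,
  // so clone 10 (snaps [3]) records the interval where the object was gone.
  replay("with the delete", {
      {false, 1, {}},      // copyfrom@1  1:[]
      {true, 10, {3}},     // delete      10:[3]  -> clone 10, snaps [3]
      {false, 12, {11}},   // copyfrom@12 12:[11]
  });
  // Reordered: clone 10 is trimmed in the cache before the delete is sent,
  // so the backing pool only sees the two copy-froms and makes clone 12,
  // wrongly covering snap 11 with the first write's data.
  replay("without the delete", {
      {false, 1, {}},      // copyfrom@1  1:[]
      {false, 12, {11}},   // copyfrom@11 12:[11] -> clone 12, snaps [11]
  });
  return 0;
}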
#6 Updated by Samuel Just about 9 years ago
Thinking
#7 Updated by Samuel Just about 9 years ago
Actually, looks like this might already be handled correctly, re-consulting the log.
#8 Updated by Samuel Just about 9 years ago
Ok, we start with the following configuration in the cache (all dirty):
30:[29,21,20,15,10,4]:[22(21), 15(15,10), 4(4)]+head
The object doesn't exist at 29 or 20.
First, we flush 4 leaving the backing pool with:
3:[]+head
Then, we begin to flush 15 with a delete with snapc 4:[4] leaving the backing pool with:
4:[4]:[4(4)]
Then, we finish flushing 15 with snapc 9:[4], leaving the backing pool with:
9:[4]:[4(4)]+head
Next, snaps 10 and 15 are removed causing clone 10 to be removed leaving the cache with:
30:[29,21,20,4]:[22(21),4(4)]+head
We next begin to flush 22 by sending a delete with snapc 4:[4] since prev_snapc is 4 <---------- here is the bug
The backing pool ignores this request since 4 < 9 (ORDERSNAP) leaving it with:
9:[4]:[4(4)]+head
Then, we complete flushing 22 with snapc 19:[4] leaving the backing pool with:
19:[4]:[4(4)]+head
Then, we begin to flush head by deleting with snapc 22:[21,20,4] leaving the backing pool with:
22:[21,20,4]:[22(21,20), 4(4)]
Finally, we flush head leaving the backing pool with:
30:[29,21,20,4]:[22(21*,20*),4(4)]+head
#9 Updated by Samuel Just about 9 years ago
When we go to flush clone 22, all we know is that 22 is dirty, has snaps
[21], and 4 is clean. As part of flushing 22, we need to do two things:
1) Ensure that the current head is cloned as cloneid 4 with snaps [4] by
sending a delete at snapc 4:[4].
2) Flush the data at snap sequence < 21 by sending a copyfrom with snapc
20:[20,4].
Unfortunately, it is possible that 1, 1&2, or 1 and part of the flush
process for some other now non-existent clone have already been
performed. Because we don't know whether 1 has already happened, we
currently use ORDERSNAP to prevent us from deleting a second time after
2) has been sent. I think we can get away without it:
If the snap_seq of this object on the backing pool is > 4, we know that
at some point, a delete was first sent with snapc 4:[4], so clone 4 has
been correctly created. In that case, because clone 22 is still dirty,
the newest snap seq which could have been sent to the backing pool is
20. Thus, because there are no snaps left between 20 and 4 (there
actually can't be since there would have to be another clone there),
performing a second delete at 4:[4] is harmless since it will be
promoted to some snap_seq seq where 4 <= seq < 21, which is fine
since we will be sending a copyfrom at 20.
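To see what the second delete buys, here is a minimal sketch in the same toy model as above, starting from the backing-pool state 9:[4]:[4(4)]+head. The snap context used for the extra delete in the sketch (19:[4], a seq just below clone 22's oldest snap with no snaps newer than clone 4's) is an assumption made for illustration; the context the real fix computes lives in ReplicatedPG::start_flush.

// Effect of the extra delete when flushing clone 22, starting from the
// backing-pool state 9:[4]:[4(4)]+head reached above. Toy model only; the
// snap context assumed for the extra delete (19:[4]) is an illustrative
// choice, not the exact context computed by ReplicatedPG::start_flush.
#include <cstdio>
#include <map>
#include <set>
#include <vector>

struct Obj {
  bool head = true;                               // head holds the data flushed for clone 15
  unsigned seq = 9;                               // snap_seq after the 9:[4] copy-from
  std::map<unsigned, std::set<unsigned>> clones = {{4, {4}}};

  void apply(unsigned snapc_seq, const std::vector<unsigned>& snaps,
             bool is_delete, bool ordersnap) {
    if (ordersnap && snapc_seq < seq)
      return;                                     // EOLDSNAPC: op dropped
    if (head && !snaps.empty() && snaps.front() > seq) {
      std::set<unsigned> s;
      for (unsigned sn : snaps)
        if (sn > seq)
          s.insert(sn);
      clones[snapc_seq] = s;                      // clone the (stale) head
    }
    seq = snapc_seq;
    head = !is_delete;
  }

  void report(const char* name) const {
    std::printf("%s: clone 22 covers snaps {", name);
    auto it = clones.find(22);
    if (it != clones.end())
      for (unsigned s : it->second)
        std::printf(" %u", s);
    std::printf(" }\n");
  }
};

int main() {
  // Pre-fix: the only delete is the ORDERSNAP'd one at 4:[4], which is dropped.
  Obj broken;
  broken.apply(4, {4}, true, true);                 // dropped, 4 < 9
  broken.apply(19, {4}, false, false);              // copy-from finishing the flush of 22
  broken.apply(22, {21, 20, 4}, true, false);       // begin flush of head
  broken.apply(30, {29, 21, 20, 4}, false, false);  // finish flush of head
  broken.report("without the second delete");       // {20 21}: data appears at snap 20

  // With the fix: an extra delete lands between the (possibly dropped)
  // delete at 4:[4] and the copy-from, removing the stale head and landing
  // at a seq in [4, 21), as argued above. Its snap context here is assumed.
  Obj fixed;
  fixed.apply(4, {4}, true, true);                  // dropped here; harmless if applied
  fixed.apply(19, {4}, true, false);                // assumed extra delete, no ORDERSNAP
  fixed.apply(20, {20, 4}, false, false);           // copy-from with snapc 20:[20,4]
  fixed.apply(22, {21, 20, 4}, true, false);        // begin flush of head
  fixed.apply(30, {29, 21, 20, 4}, false, false);   // finish flush of head
  fixed.report("with the second delete");           // {21}: snap 20 falls in the gap
  return 0;
}

Run as is, the two reports print {20 21} and {21}: without the extra delete, clone 22 in the backing pool covers snap 20, so a read at that snap returns data; with it, snap 20 falls in the gap between clone 4 and clone 22, matching the cache pool's view that the object did not exist there.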
#10 Updated by Samuel Just about 9 years ago
- Status changed from New to 7
- Backport set to firefly
wip-9054
#11 Updated by Sage Weil about 9 years ago
This sounds right to me!
#12 Updated by Sage Weil about 9 years ago
ubuntu@teuthology:/a/sage-2014-08-13_15:28:18-rados-next-testing-basic-multi/422862
#13 Updated by Sage Weil about 9 years ago
- Status changed from 7 to Pending Backport
#14 Updated by Sage Weil about 9 years ago
- Status changed from Pending Backport to 12
ubuntu@teuthology:/a/sage-2014-08-30_20:39:25-rados-wip-sage-testing-testing-basic-multi/462661
same symptom, at least. maybe this is a consequence of #9294?
#15 Updated by Sage Weil about 9 years ago
Sage Weil wrote:
ubuntu@teuthology:/a/sage-2014-08-30_20:39:25-rados-wip-sage-testing-testing-basic-multi/462661
same symptom, at least. maybe this is a consequence of #9294?
no.. this run included wip-9294
#16 Updated by Samuel Just about 9 years ago
2014-08-31T01:36:27.057 INFO:tasks.rados.rados.0.plana80.stdout:update_object_version oid 348 v 3674 (ObjNum 1096 snap 204 seq_num 1096) dirty exists
2014-08-31T01:36:27.057 INFO:tasks.rados.rados.0.plana80.stdout:2267: expect (ObjNum 34580816 snap 0 seq_num 3355477840)
Looks like a bug in the tool to me. Maybe memory corruption?
#17 Updated by Samuel Just about 9 years ago
- Status changed from 12 to Resolved
Opening new bug for that one.
#19 Updated by Samuel Just about 9 years ago
2014-08-31 01:24:18.568610 7f45febb1700 10 osd.2 pg_epoch: 192 pg[2.1( v 192'1575 (0'0,192'1575] local-les=105 n=49 ec=6 les/c 105/105 104/104/97) [2,5] r=0 lpr=104 luod=192'1570 crt=192'1568 lcod 192'1568 mlcod 192'1568 active+clean] start_flush 95d4eb9d/plana802684-165/head//2 v166'1404 uv1051 non-blocking/best-effort
2014-08-31 01:24:18.568616 7f274bdd6700 10 osd.0 pg_epoch: 192 pg[1.1( v 192'341 (0'0,192'341] local-les=105 n=139 ec=5 les/c 105/105 104/104/63) [0,5] r=0 lpr=104 crt=188'331 lcod 192'339 mlcod 192'339 active+clean] handle_message: 0x1cfd900
2014-08-31 01:24:18.568617 7fbebf652700 10 write_log with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, dirty_divergent_priors: 0, writeout_from: 192'1574, trimmed:
2014-08-31 01:24:18.568619 7f45febb1700 20 osd.2 pg_epoch: 192 pg[2.1( v 192'1575 (0'0,192'1575] local-les=105 n=49 ec=6 les/c 105/105 104/104/97) [2,5] r=0 lpr=104 luod=192'1570 crt=192'1568 lcod 192'1568 mlcod 192'1568 active+clean] snapset 65=[64,63,62,61,60,5f,5e,5d,5b,59,58,57,56,55,50,4d,4c,48,45,42,39]:[]+head
2014-08-31 01:24:18.568624 7f274bdd6700 10 osd.0 pg_epoch: 192 pg[1.1( v 192'341 (0'0,192'341] local-les=105 n=139 ec=5 les/c 105/105 104/104/63) [0,5] r=0 lpr=104 crt=188'331 lcod 192'339 mlcod 192'339 active+clean] do_op osd_op(osd.2.3:467 plana802684-57 [delete] 1.43c35a9d snapc 65=[64,63,62,61,60,5f,5e,5d,5b,59,58,57,56,55,50,4d,4c,48,45,42,39] ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e192) v4 may_write -> write-ordered flags ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected
2014-08-31 01:24:18.568628 7f45febb1700 20 osd.2 pg_epoch: 192 pg[2.1( v 192'1575 (0'0,192'1575] local-les=105 n=49 ec=6 les/c 105/105 104/104/97) [2,5] r=0 lpr=104 luod=192'1570 crt=192'1568 lcod 192'1568 mlcod 192'1568 active+clean] start_flush no older clones
2014-08-31 01:24:18.568642 7f274bdd6700 15 filestore(/var/lib/ceph/osd/ceph-0) getattr 1.1_head/43c35a9d/plana802684-57/head//1 '_'
2014-08-31 01:24:18.568650 7f45febb1700 1 -- 10.214.133.36:6814/1414 --> 10.214.133.36:6800/1378 -- osd_op(osd.2.3:468 plana802684-165 [copy-from ver 1051] 1.95d4eb9d snapc 65=[64,63,62,61,60,5f,5e,5d,5b,59,58,57,56,55,50,4d,4c,48,45,42,39] ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e192) v4 -- ?+0 0x5702500 con 0x39292c0
Bug in the head bug fix case.
#20 Updated by Samuel Just about 9 years ago
- Status changed from 12 to 7
#21 Updated by Sage Weil about 9 years ago
ubuntu@teuthology:/a/teuthology-2014-09-08_02:32:01-rados-master-testing-basic-multi/472355
#22 Updated by Sage Weil about 9 years ago
- Status changed from 7 to Pending Backport
#23 Updated by Samuel Just about 9 years ago
wip-sam-testing-firefly
#24 Updated by Samuel Just about 9 years ago
- Status changed from Pending Backport to Resolved