Bug #9054: ceph_test_rados: FAILED assert(!old_value.deleted())

Added by Sage Weil over 9 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Urgent
Assignee: Samuel Just
Category: OSD
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport: firefly
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ubuntu@teuthology:/a/teuthology-2014-08-06_02:30:01-rados-next-testing-basic-plana/403383

2014-08-08T11:03:40.624 INFO:teuthology.task.rados.rados.0.plana04.stdout:2997: read oid 439 snap 261
2014-08-08T11:03:40.624 INFO:teuthology.task.rados.rados.0.plana04.stdout:2998: write oid 69 current snap is 262
2014-08-08T11:03:40.624 INFO:teuthology.task.rados.rados.0.plana04.stdout:2998:  seq_num 1310 ranges {542063=481193,1342062=1}
2014-08-08T11:03:40.656 INFO:teuthology.task.rados.rados.0.plana04.stdout:2998:  writing plana0410548-69 from 542063 to 1023256 tid 1
2014-08-08T11:03:40.661 INFO:teuthology.task.rados.rados.0.plana04.stdout:2998:  writing plana0410548-69 from 1342062 to 1342063 tid 2
2014-08-08T11:03:40.662 INFO:teuthology.task.rados.rados.0.plana04.stdout:2999: snap_remove snap 255
2014-08-08T11:03:40.721 INFO:teuthology.task.rados.rados.0.plana04.stdout:2997:  expect (ObjNum 17004944 snap 0 seq_num 3355462528)
2014-08-08T11:03:40.722 INFO:teuthology.task.rados.rados.0.plana04.stderr:./test/osd/RadosModel.h: In function 'virtual void ReadOp::_finish(TestOp::CallbackInfo*)' thread 7f78117fa700 time 2014-08-08 11:03:40.720568
2014-08-08T11:03:40.722 INFO:teuthology.task.rados.rados.0.plana04.stderr:./test/osd/RadosModel.h: 1062: FAILED assert(!old_value.deleted())
2014-08-08T11:03:40.722 INFO:teuthology.task.rados.rados.0.plana04.stderr: ceph version 0.83-412-gdceab8d (dceab8dc49d4f458e3bc9fd40eb8b487f1e35948)
2014-08-08T11:03:40.723 INFO:teuthology.task.rados.rados.0.plana04.stderr: 1: (ReadOp::_finish(TestOp::CallbackInfo*)+0xe08) [0x421398]
2014-08-08T11:03:40.723 INFO:teuthology.task.rados.rados.0.plana04.stderr: 2: (librados::C_AioComplete::finish(int)+0x1d) [0x7f781d62cfbd]
2014-08-08T11:03:40.723 INFO:teuthology.task.rados.rados.0.plana04.stderr: 3: (Context::complete(int)+0x9) [0x7f781d609e99]
2014-08-08T11:03:40.723 INFO:teuthology.task.rados.rados.0.plana04.stderr: 4: (Finisher::finisher_thread_entry()+0x1b8) [0x7f781d6bbcc8]
2014-08-08T11:03:40.723 INFO:teuthology.task.rados.rados.0.plana04.stderr: 5: (()+0x8182) [0x7f781d251182]
2014-08-08T11:03:40.723 INFO:teuthology.task.rados.rados.0.plana04.stderr: 6: (clone()+0x6d) [0x7f781ca6430d]
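For context, the failed assertion is the test harness's consistency check: the model kept by ceph_test_rados had recorded the object as deleted at the snapshot being read, yet the read completed as if it existed. A minimal sketch of that kind of check, with hypothetical names (not the actual RadosModel.h code):

#include <cassert>
#include <map>
#include <string>
#include <utility>

struct ExpectedObject {
  bool exists = false;      // model says the object is live at this snap
  std::string contents;     // expected payload
  bool deleted() const { return !exists; }
};

struct TestModel {
  // (object name, snap id) -> state the model expects, recorded when ops were issued
  std::map<std::pair<std::string, int>, ExpectedObject> expected;

  // Called from the read completion, analogous to ReadOp::_finish().
  void on_read_finished(const std::string& oid, int snap,
                        bool read_found_object,
                        const std::string& read_contents) const {
    const ExpectedObject& old_value = expected.at({oid, snap});
    if (read_found_object) {
      // Bug #9054 trips exactly here: the cluster served data for a snap
      // at which the model believes the object had been deleted.
      assert(!old_value.deleted());
      assert(read_contents == old_value.contents);
    } else {
      assert(old_value.deleted());
    }
  }
};

In the failing run above, the read of oid 439 at snap 261 took the first branch even though the model had the object marked deleted at that snap, which is the symptom the cache-tier flush bug described below produces.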


Related issues

Duplicated by Ceph - Bug #9385: ceph_test_rados: incorrect buffer at pos ... (Duplicate, 09/08/2014)

Associated revisions

Revision a525bf31 (diff)
Added by Samuel Just over 9 years ago

ReplicatedPG::start_flush: send a second delete

Suppose we start with the following in the cache pool:

30:[29,21,20,15,10,4]:[22(21), 15(15,10), 4(4)]+head

The object doesn't exist at 29 or 20.

First, we flush 4 leaving the backing pool with:

3:[]+head

Then, we begin to flush 15 with a delete with snapc 4:[4] leaving the
backing pool with:

4:[4]:[4(4)]

Then, we finish flushing 15 with snapc 9:[4], leaving the backing
pool with:

9:[4]:[4(4)]+head

Next, snaps 10 and 15 are removed causing clone 10 to be removed leaving
the cache with:

30:[29,21,20,4]:[22(21),4(4)]+head

We next begin to flush 22 by sending a delete with snapc 4:[4] since
prev_snapc is 4 <---------- here is the bug

The backing pool ignores this request since 4 < 9 (ORDERSNAP) leaving it
with:

9:[4]:[4(4)]

Then, we complete flushing 22 with snapc 19:[4] leaving the backing pool
with:

19:[4]:[4(4)]+head

Then, we begin to flush head by deleting with snapc 22:[21,20,4] leaving
the backing pool with:

22:[21,20,4]:[22(21,20), 4(4)]

Finally, we flush head leaving the backing pool with:

30:[29,21,20,4]:[22(21*,20*),4(4)]+head

When we go to flush clone 22, all we know is that 22 is dirty, has snaps
[21], and 4 is clean. As part of flushing 22, we need to do two things:
1) Ensure that the current head is cloned as cloneid 4 with snaps [4] by
sending a delete at snapc 4:[4].
2) Flush the data at snap sequence < 21 by sending a copyfrom with snapc
20:[20,4].

Unfortunately, it is possible that 1, 1&2, or 1 and part of the flush
process for some other now non-existent clone have already been
performed. Because of that, between 1) and 2), we need to send
a second delete ensuring that the object does not exist at 20.

Fixes: #9054
Backport: firefly
Signed-off-by: Samuel Just <>
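To make the ordering in the commit message above concrete, here is a hedged sketch of the flush sequence for clone 22, with the extra delete inserted between steps (1) and (2). The helper names and the exact snap contexts are illustrative assumptions, not the real ReplicatedPG::start_flush() code:

#include <cstdio>
#include <vector>

struct SnapContext {
  unsigned seq;
  std::vector<unsigned> snaps;   // newest first, all <= seq
};

// Stand-ins for the messages the cache tier sends to the backing pool.
static void send_delete(const SnapContext& sc)    { std::printf("delete    seq=%u\n", sc.seq); }
static void send_copy_from(const SnapContext& sc) { std::printf("copy-from seq=%u\n", sc.seq); }

// Flushing clone 22 (snaps [21]) while clone 4 (snaps [4]) is already clean.
int main() {
  // (1) Make sure the head on the backing pool gets cloned as clone 4 / snaps [4].
  send_delete(SnapContext{4, {4}});

  // Fix for #9054: (1) may be ignored (ORDERSNAP) or may have happened long
  // ago, so a stale head can still be live on the backing pool. A second
  // delete guarantees the object does not exist at snap 20 before the data
  // is flushed. (The snapc here is illustrative; the real patch computes it.)
  send_delete(SnapContext{19, {4}});

  // (2) Flush clone 22's data for snap sequence < 21.
  send_copy_from(SnapContext{20, {20, 4}});
  return 0;
}

Whatever snap context the real patch computes for the second delete, the key point is that it lands before the copy-from, so a stale head left by an ignored or long-past first delete can no longer end up covering snap 20.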

Revision 4843fd51 (diff)
Added by Samuel Just over 9 years ago

ReplicatedPG::start_flush: send a second delete

Suppose we start with the following in the cache pool:

30:[29,21,20,15,10,4]:[22(21), 15(15,10), 4(4)]+head

The object doesn't exist at 29 or 20.

First, we flush 4 leaving the backing pool with:

3:[]+head

Then, we begin to flush 15 with a delete with snapc 4:[4] leaving the
backing pool with:

4:[4]:[4(4)]

Then, we finish flushing 15 with snapc 9:[4], leaving the backing
pool with:

9:[4]:[4(4)]+head

Next, snaps 10 and 15 are removed causing clone 10 to be removed leaving
the cache with:

30:[29,21,20,4]:[22(21),4(4)]+head

We next begin to flush 22 by sending a delete with snapc 4:[4] since
prev_snapc is 4 <---------- here is the bug

The backing pool ignores this request since 4 < 9 (ORDERSNAP) leaving it
with:

9:[4]:[4(4)]

Then, we complete flushing 22 with snapc 19:[4] leaving the backing pool
with:

19:[4]:[4(4)]+head

Then, we begin to flush head by deleting with snapc 22:[21,20,4] leaving
the backing pool with:

22:[21,20,4]:[22(21,20), 4(4)]

Finally, we flush head leaving the backing pool with:

30:[29,21,20,4]:[22(21*,20*),4(4)]+head

When we go to flush clone 22, all we know is that 22 is dirty, has snaps
[21], and 4 is clean. As part of flushing 22, we need to do two things:
1) Ensure that the current head is cloned as cloneid 4 with snaps [4] by
sending a delete at snapc 4:[4].
2) Flush the data at snap sequence < 21 by sending a copyfrom with snapc
20:[20,4].

Unfortunately, it is possible that 1, 1&2, or 1 and part of the flush
process for some other now non-existent clone have already been
performed. Because of that, between 1) and 2), we need to send
a second delete ensuring that the object does not exist at 20.

Fixes: #9054
Backport: firefly
Related: 66c7439ea0888777b5cfc08bcb0fbd7bfd8653c3
Signed-off-by: Samuel Just <>

Revision fd96eb62 (diff)
Added by Samuel Just over 9 years ago

ReplicatedPG::start_flush: send a second delete

Suppose we start with the following in the cache pool:

30:[29,21,20,15,10,4]:[22(21), 15(15,10), 4(4)]+head

The object doesn't exist at 29 or 20.

First, we flush 4 leaving the backing pool with:

3:[]+head

Then, we begin to flush 15 with a delete with snapc 4:[4] leaving the
backing pool with:

4:[4]:[4(4)]

Then, we finish flushing 15 with snapc 9:[4], leaving the backing
pool with:

9:[4]:[4(4)]+head

Next, snaps 10 and 15 are removed causing clone 10 to be removed leaving
the cache with:

30:[29,21,20,4]:[22(21),4(4)]+head

We next begin to flush 22 by sending a delete with snapc 4:[4] since
prev_snapc is 4 <---------- here is the bug

The backing pool ignores this request since 4 < 9 (ORDERSNAP) leaving it
with:

9:[4]:[4(4)]

Then, we complete flushing 22 with snapc 19:[4] leaving the backing pool
with:

19:[4]:[4(4)]+head

Then, we begin to flush head by deleting with snapc 22:[21,20,4] leaving
the backing pool with:

22:[21,20,4]:[22(21,20), 4(4)]

Finally, we flush head leaving the backing pool with:

30:[29,21,20,4]:[22(21*,20*),4(4)]+head

When we go to flush clone 22, all we know is that 22 is dirty, has snaps
[21], and 4 is clean. As part of flushing 22, we need to do two things:
1) Ensure that the current head is cloned as cloneid 4 with snaps [4] by
sending a delete at snapc 4:[4].
2) Flush the data at snap sequence < 21 by sending a copyfrom with snapc
20:[20,4].

Unfortunately, it is possible that 1, 1&2, or 1 and part of the flush
process for some other now non-existent clone have already been
performed. Because of that, between 1) and 2), we need to send
a second delete ensuring that the object does not exist at 20.

Fixes: #9054
Backport: firefly
Related: 66c7439ea0888777b5cfc08bcb0fbd7bfd8653c3
Signed-off-by: Samuel Just <>
(cherry picked from commit 4843fd510b33a71999cdf9c2cfa2b4c318fa80fd)

History

#1 Updated by Sage Weil over 9 years ago

almost there. on osd.0, we finish trimming 14a here:

2014-08-08 10:55:12.311901 7f1237847700 10 osd.0 pg_epoch: 462 pg[2.1( v 462'2839 (0'0,462'2839] local-les=422 n=53 ec=6 les/c 422/422 421/421/421) [0,1] r=0 lpr=421 luod=462'2832 crt=459'2828 lcod 462'2830 mlcod 462'2830 active+clean snaptrimq=[14a~1]] SnapTrimmer state<NotTrimming>: NotTrimming: trimming 14a

and a moment later create a clone here:

2014-08-08 10:55:12.391746 7f123a04c700 10 osd.0 pg_epoch: 462 pg[2.1( v 462'2847 (0'0,462'2847] local-les=422 n=51 ec=6 les/c 422/422 421/421/421) [0,1] r=0 lpr=421 luod=462'2834 crt=462'2832 lcod 462'2832 mlcod 462'2832 active+clean] final snapset 14a=[149,148,147,146,144,143,141,140,13e,13b,13a,139,134,12b,129,128,10f,10b]:[10f,14a]+head in df2a6681/plana0410548-439/head//2

#2 Updated by Sage Weil over 9 years ago

2014-08-08 10:55:12.312751 7f1237847700 10 osd.0 pg_epoch: 462 pg[2.1( v 462'2839 (0'0,462'2839] local-les=422 n=53 ec=6 les/c 422/422 421/421/421) [0,1] r=0 lpr=421 luod=462'2832 crt=459'2828 lcod 462'2830 mlcod 462'2830 active+clean snaptrimq=[14a~1]] SnapTrimmer state<Trimming/WaitingOnReplicas>: WaitingOnReplicas: adding snap 14a to purged_snaps
...
2014-08-08 10:55:12.391576 7f123a04c700 10 osd.0 pg_epoch: 462 pg[2.1( v 462'2847 (0'0,462'2847] local-les=422 n=51 ec=6 les/c 422/422 421/421/421) [0,1] r=0 lpr=421 luod=462'2834 crt=462'2832 lcod 462'2832 mlcod 462'2832 active+clean] cloning v 462'2837 to df2a6681/plana0410548-439/14a//2 v 462'2848 snaps=[149,148,147,146,144,143,141,140,13e,13b,13a,139,134,12b,129,128]

hmm, the clone isn't dirty

#3 Updated by Sage Weil over 9 years ago

  • Assignee deleted (Sage Weil)

#4 Updated by Samuel Just over 9 years ago

  • Assignee set to Samuel Just

#5 Updated by Samuel Just over 9 years ago

Hmm, I think the bug is like this:

Normally, if we get the following op sequence:
- write 1:[]
- delete 10:[3] (creates clone 10 with snaps 3)
- trim 3
- write 12:[11]
- (async, removes clone 10)

We would see something like the following sent to the backend:
- copyfrom@1 1:[] ENFORCE_SNAPC
- delete 10:[3] ORDERSNAP, ENFORCE_SNAPC (creates clone 10 with snaps 3)
- copyfrom@12 12:[11] ENFORCE_SNAPC
- (async, removes clone 10)

If instead we reorder a bit though, we might trim clone 10 before we send the delete and get:
- copyfrom@1 1:[] ENFORCE_SNAPC
- (async on cache pool, remove clone 10)
- copyfrom@11 12:[11] ENFORCE_SNAPC

This causes the backing pool to create clone 12 instead of clone 10, and to incorrectly treat snap 11 as falling in a period when the object existed (after the first write) rather than in a period when the object did not exist.
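The "delete 10:[3] (creates clone 10 with snaps 3)" step relies on how RADOS handles a write or delete under a snap context: if the snapc carries snaps newer than the object's last write, the pre-write state is preserved as a clone whose id is the snapc seq and whose snaps are those newer snaps. A rough model of that rule, as I understand it (a simplification, not the OSD's actual make_writeable code):

#include <cstdio>
#include <utility>
#include <vector>

struct Obj {
  bool head_exists = false;
  unsigned seq = 0;                         // newest snapc seq the object was written with
  std::vector<std::pair<unsigned, std::vector<unsigned>>> clones;  // (clone id, snaps)
};

// Apply a delete sent with snap context (seq, snaps).
void apply_delete(Obj& o, unsigned seq, const std::vector<unsigned>& snaps) {
  // Snaps newer than the object's last write are the ones the old head must
  // keep answering for, so they get pinned to a clone.
  std::vector<unsigned> newer;
  for (unsigned s : snaps)
    if (s > o.seq) newer.push_back(s);
  if (o.head_exists && !newer.empty())
    o.clones.push_back({seq, newer});       // clone id == snapc seq
  o.head_exists = false;
  o.seq = seq;
}

int main() {
  Obj o;
  o.head_exists = true; o.seq = 1;          // state after "write 1:[]"
  apply_delete(o, 10, {3});                 // "delete 10:[3]" -> clone 10 with snaps [3]
  std::printf("clone %u created\n", o.clones.back().first);
  return 0;
}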

#6 Updated by Samuel Just over 9 years ago

Thinking

#7 Updated by Samuel Just over 9 years ago

Actually, looks like this might already be handled correctly, re-consulting the log.

#8 Updated by Samuel Just over 9 years ago

Ok, we start with the following configuration in the cache (all dirty):

30:[29,21,20,15,10,4]:[22(21), 15(15,10), 4(4)]+head

The object doesn't exist at 29 or 20.

First, we flush 4 leaving the backing pool with:

3:[]+head

Then, we begin to flush 15 with a delete with snapc 4:[4] leaving the backing pool with:

4:[4]:[4(4)]

Then, we finish flushing 15 with snapc 9:[4], leaving the backing pool with:

9:[4]:[4(4)]+head

Next, snaps 10 and 15 are removed causing clone 10 to be removed leaving the cache with:

30:[29,21,20,4]:[22(21),4(4)]+head

We next begin to flush 22 by sending a delete with snapc 4:[4] since prev_snapc is 4 <---------- here is the bug

The backing pool ignores this request since 4 < 9 (ORDERSNAP) leaving it with:

9:[4]:[4(4)]

Then, we complete flushing 22 with snapc 19:[4] leaving the backing pool with:

19:[4]:[4(4)]+head

Then, we begin to flush head by deleting with snapc 22:[21,20,4] leaving the backing pool with:

22:[21,20,4]:[22(21,20), 4(4)]

Finally, we flush head leaving the backing pool with:

30:[29,21,20,4]:[22(21*,20*),4(4)]+head
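The "ignored because 4 < 9" step above is the ORDERSNAP behaviour: the flush path sends its ops with CEPH_OSD_FLAG_ORDERSNAP, and the backing pool refuses a write whose snap context is older than the one it last saw for that object. A rough model of that admission check (my reading of the OSD behaviour, not the actual do_op code):

#include <cstdint>

struct WriteOp {
  bool ordersnap;        // CEPH_OSD_FLAG_ORDERSNAP set by the cache flush path
  uint64_t snapc_seq;    // snap context sequence carried with the op
};

// Returns true if the op may proceed, false if it must be rejected
// (the real OSD replies with an "old snap context" error in that case).
bool admit(const WriteOp& op, uint64_t obj_snapset_seq) {
  // In the walkthrough above, the delete for clone 22 carries snapc seq 4
  // while the backing object is already at seq 9, so the op is bounced and
  // the clone the cache tier was counting on is never created.
  return !(op.ordersnap && op.snapc_seq < obj_snapset_seq);
}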

#9 Updated by Samuel Just over 9 years ago

When we go to flush clone 22, all we know is that 22 is dirty, has snaps
[21], and 4 is clean. As part of flushing 22, we need to do two things:
1) Ensure that the current head is cloned as cloneid 4 with snaps [4] by
sending a delete at snapc 4:[4].
2) Flush the data at snap sequence < 21 by sending a copyfrom with snapc
20:[20,4].

Unfortunately, it is possible that 1, 1&2, or 1 and part of the flush
process for some other now non-existent clone have already been
performed. Because we don't know whether 1 has already happened, we
currently use ORDERSNAP to prevent us from deleting a second time after
2) has been sent. I think we can get away without it:

If the snap_seq of this object on the backing pool is > 4, we know that
at some point, a delete was first sent with snapc 4:[4], so clone 4 has
been correctly created. In that case, because clone 22 is still dirty,
the newest snap seq which could have been sent to the backing pool is
20. Thus, because there are no snaps left between 20 and 4 (there
actually can't be since there would have to be another clone there),
performing a second delete at 4:[4] is harmless since it will be
promoted to some snap_seq seq where 4 <= seq < 21, which is fine
since we will be sending a copyfrom at 20.

#10 Updated by Samuel Just over 9 years ago

  • Status changed from New to 7
  • Backport set to firefly

wip-9054

#11 Updated by Sage Weil over 9 years ago

This sounds right to me!

#12 Updated by Sage Weil over 9 years ago

ubuntu@teuthology:/a/sage-2014-08-13_15:28:18-rados-next-testing-basic-multi/422862

#13 Updated by Sage Weil over 9 years ago

  • Status changed from 7 to Pending Backport

#14 Updated by Sage Weil over 9 years ago

  • Status changed from Pending Backport to 12

ubuntu@teuthology:/a/sage-2014-08-30_20:39:25-rados-wip-sage-testing-testing-basic-multi/462661

same symptom, at least. maybe this is a consequence of #9294?

#15 Updated by Sage Weil over 9 years ago

Sage Weil wrote:

ubuntu@teuthology:/a/sage-2014-08-30_20:39:25-rados-wip-sage-testing-testing-basic-multi/462661

same symptom, at least. maybe this is a consequence of #9294?

no.. this run included wip-9294

#16 Updated by Samuel Just over 9 years ago

2014-08-31T01:36:27.057 INFO:tasks.rados.rados.0.plana80.stdout:update_object_version oid 348 v 3674 (ObjNum 1096 snap 204 seq_num 1096) dirty exists
2014-08-31T01:36:27.057 INFO:tasks.rados.rados.0.plana80.stdout:2267: expect (ObjNum 34580816 snap 0 seq_num 3355477840)

Looks like a bug in the tool to me. Maybe memory corruption?

#17 Updated by Samuel Just over 9 years ago

  • Status changed from 12 to Resolved

Opening new bug for that one.

#18 Updated by Samuel Just over 9 years ago

  • Status changed from Resolved to 12

nvm, reopening

#19 Updated by Samuel Just over 9 years ago

2014-08-31 01:24:18.568610 7f45febb1700 10 osd.2 pg_epoch: 192 pg[2.1( v 192'1575 (0'0,192'1575] local-les=105 n=49 ec=6 les/c 105/105 104/104/97) [2,5] r=0 lpr=104 luod=192'1570 crt=192'1568 lcod 192'1568 mlcod 192'1568 active+clean] start_flush 95d4eb9d/plana802684-165/head//2 v166'1404 uv1051 non-blocking/best-effort
2014-08-31 01:24:18.568616 7f274bdd6700 10 osd.0 pg_epoch: 192 pg[1.1( v 192'341 (0'0,192'341] local-les=105 n=139 ec=5 les/c 105/105 104/104/63) [0,5] r=0 lpr=104 crt=188'331 lcod 192'339 mlcod 192'339 active+clean] handle_message: 0x1cfd900
2014-08-31 01:24:18.568617 7fbebf652700 10 write_log with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, dirty_divergent_priors: 0, writeout_from: 192'1574, trimmed:
2014-08-31 01:24:18.568619 7f45febb1700 20 osd.2 pg_epoch: 192 pg[2.1( v 192'1575 (0'0,192'1575] local-les=105 n=49 ec=6 les/c 105/105 104/104/97) [2,5] r=0 lpr=104 luod=192'1570 crt=192'1568 lcod 192'1568 mlcod 192'1568 active+clean] snapset 65=[64,63,62,61,60,5f,5e,5d,5b,59,58,57,56,55,50,4d,4c,48,45,42,39]:[]+head
2014-08-31 01:24:18.568624 7f274bdd6700 10 osd.0 pg_epoch: 192 pg[1.1( v 192'341 (0'0,192'341] local-les=105 n=139 ec=5 les/c 105/105 104/104/63) [0,5] r=0 lpr=104 crt=188'331 lcod 192'339 mlcod 192'339 active+clean] do_op osd_op(osd.2.3:467 plana802684-57 [delete] 1.43c35a9d snapc 65=[64,63,62,61,60,5f,5e,5d,5b,59,58,57,56,55,50,4d,4c,48,45,42,39] ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e192) v4 may_write > write-ordered flags ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected
2014-08-31 01:24:18.568628 7f45febb1700 20 osd.2 pg_epoch: 192 pg[2.1( v 192'1575 (0'0,192'1575] local-les=105 n=49 ec=6 les/c 105/105 104/104/97) [2,5] r=0 lpr=104 luod=192'1570 crt=192'1568 lcod 192'1568 mlcod 192'1568 active+clean] start_flush no older clones
2014-08-31 01:24:18.568642 7f274bdd6700 15 filestore(/var/lib/ceph/osd/ceph-0) getattr 1.1_head/43c35a9d/plana802684-57/head//1 '_'
2014-08-31 01:24:18.568650 7f45febb1700 1 -- 10.214.133.36:6814/1414 --> 10.214.133.36:6800/1378 -- osd_op(osd.2.3:468 plana802684-165 [copy-from ver 1051] 1.95d4eb9d snapc 65=[64,63,62,61,60,5f,5e,5d,5b,59,58,57,56,55,50,4d,4c,48,45,42,39] ondisk+write+ignore_overlay+enforce_snapc+known_if_redirected e192) v4 -- ?+0 0x5702500 con 0x39292c0

Bug is in the flush-head case of the fix.

#20 Updated by Samuel Just over 9 years ago

  • Status changed from 12 to 7

#21 Updated by Sage Weil over 9 years ago

ubuntu@teuthology:/a/teuthology-2014-09-08_02:32:01-rados-master-testing-basic-multi/472355

#22 Updated by Sage Weil over 9 years ago

  • Status changed from 7 to Pending Backport

#23 Updated by Samuel Just over 9 years ago

wip-sam-testing-firefly

#24 Updated by Samuel Just over 9 years ago

  • Status changed from Pending Backport to Resolved
