Bug #16236

cache/proxied ops from different primaries (cache interval change) don't order properly, -ERANGE on write in ceph_test_rados

Added by Samuel Just almost 8 years ago. Updated over 4 years ago.

Status:
Won't Fix
Priority:
High
Assignee:
-
Category:
Tiering
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Came from finish_promote: writes happened between the first and second copy-get, which caused the version to change.
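The failure mode can be modeled roughly as follows (an illustrative sketch, not Ceph code): the promote path issues two copy-gets, the second asserting the object version observed by the first, so any write that lands in between makes the second copy-get fail with -ERANGE.

```python
# Hypothetical model of the race described above -- illustrative only,
# not actual Ceph code or its real API.
ERANGE = 34

class Obj:
    def __init__(self):
        self.version = 1

def copy_get(obj, assert_version=None):
    # The second copy-get asserts the version the first one observed;
    # an intervening write bumps the version and the assert fails.
    if assert_version is not None and obj.version != assert_version:
        return -ERANGE
    return obj.version

obj = Obj()
v = copy_get(obj)                        # first copy-get records version 1
obj.version += 1                         # a client write lands in between
print(copy_get(obj, assert_version=v))   # -34, i.e. -ERANGE
```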


Related issues

Duplicated by Ceph - Bug #16827: "FAILED assert(0 == "racing read got wrong version")" in rados-jewel-distro-basic-smithi Duplicate 07/27/2016

History

#1 Updated by Samuel Just almost 8 years ago

sjust@teuthology:/a/samuelj-2016-06-09_19:31:27-rados-wip-16211-jewel-distro-basic-smithi/249150/remote$ grep -o '2016.* \(en\|de\)queue_op.*osd_op(.*1.df123c4b' ceph-osd.sorted | sed 's/prio.*osd_op(//'
2016-06-10 04:43:12.115467 7fad46f5f700 15 osd.1 18 enqueue_op 0x7fad5bfa4700 client.4123.0:940 1.df123c4b
2016-06-10 04:43:12.115503 7fad38dad700 10 osd.1 18 dequeue_op 0x7fad5bfa4700 client.4123.0:940 1.df123c4b
2016-06-10 04:43:12.122410 7fad46f5f700 15 osd.1 18 enqueue_op 0x7fad5bfa7100 client.4123.0:941 1.df123c4b
2016-06-10 04:43:12.122425 7fad38dad700 10 osd.1 18 dequeue_op 0x7fad5bfa7100 client.4123.0:941 1.df123c4b
2016-06-10 04:43:12.128765 7fad46f5f700 15 osd.1 18 enqueue_op 0x7fad5bfa7600 client.4123.0:942 1.df123c4b
2016-06-10 04:43:12.128779 7fad38dad700 10 osd.1 18 dequeue_op 0x7fad5bfa7600 client.4123.0:942 1.df123c4b
2016-06-10 04:43:12.128789 7fad46f5f700 15 osd.1 18 enqueue_op 0x7fad5bfa4800 client.4123.0:943 1.df123c4b
2016-06-10 04:43:12.128912 7fad46f5f700 15 osd.1 18 enqueue_op 0x7fad5bfa4f00 osd.1.3:430 1.df123c4b
2016-06-10 04:43:12.129560 7fad38dad700 10 osd.1 18 dequeue_op 0x7fad5bfa4800 client.4123.0:943 1.df123c4b
2016-06-10 04:43:12.129973 7fad365a8700 10 osd.1 18 dequeue_op 0x7fad5bfa4f00 osd.1.3:430 1.df123c4b
2016-06-10 04:43:12.131685 7fad365a8700 10 osd.1 18 dequeue_op 0x7fad5bfa4f00 osd.1.3:430 1.df123c4b
2016-06-10 04:45:37.605757 7fad47760700 15 osd.1 134 enqueue_op 0x7fad5f96e400 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:37.605807 7fad365a8700 10 osd.1 134 dequeue_op 0x7fad5f96e400 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:37.622346 7fad47760700 15 osd.1 134 enqueue_op 0x7fad5f96fc00 client.4123.0:3613 1.df123c4b
2016-06-10 04:45:37.622388 7fad38dad700 10 osd.1 134 dequeue_op 0x7fad5f96fc00 client.4123.0:3613 1.df123c4b
2016-06-10 04:45:37.633329 7fad47760700 15 osd.1 134 enqueue_op 0x7fad5f96ea00 client.4123.0:3614 1.df123c4b
2016-06-10 04:45:37.633432 7fad38dad700 10 osd.1 134 dequeue_op 0x7fad5f96ea00 client.4123.0:3614 1.df123c4b
2016-06-10 04:45:37.633449 7fad47760700 15 osd.1 134 enqueue_op 0x7fad5f96c100 client.4123.0:3615 1.df123c4b
2016-06-10 04:45:37.633604 7fad38dad700 10 osd.1 134 dequeue_op 0x7fad5f96c100 client.4123.0:3615 1.df123c4b
2016-06-10 04:45:37.633903 7fad47760700 15 osd.1 134 enqueue_op 0x7fad5bbd8f00 osd.1.3:1508 1.df123c4b
2016-06-10 04:45:37.633929 7fad38dad700 10 osd.1 134 dequeue_op 0x7fad5bbd8f00 osd.1.3:1508 1.df123c4b
2016-06-10 04:45:49.866540 7fad38dad700 10 osd.1 136 dequeue_op 0x7fad5f96e400 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:49.867116 7fad38dad700 10 osd.1 136 dequeue_op 0x7fad5f96fc00 client.4123.0:3613 1.df123c4b
2016-06-10 04:45:49.867191 7fad38dad700 10 osd.1 136 dequeue_op 0x7fad5f96ea00 client.4123.0:3614 1.df123c4b
2016-06-10 04:45:49.867288 7fad38dad700 10 osd.1 136 dequeue_op 0x7fad5f96c100 client.4123.0:3615 1.df123c4b
2016-06-10 04:45:49.867410 7fad365a8700 10 osd.1 136 dequeue_op 0x7fad5bbd8f00 osd.1.3:1508 1.df123c4b
2016-06-10 04:45:51.618111 7fad46f5f700 15 osd.1 138 enqueue_op 0x7fad5ec61b00 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:51.618149 7fad38dad700 10 osd.1 138 dequeue_op 0x7fad5ec61b00 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:51.618177 7fad46f5f700 15 osd.1 138 enqueue_op 0x7fad5c165500 osd.3.3:1021 1.df123c4b
2016-06-10 04:45:51.618234 7fad46f5f700 15 osd.1 138 enqueue_op 0x7fad5c167800 osd.3.3:1022 1.df123c4b
2016-06-10 04:45:51.618227 7fad38dad700 10 osd.1 138 dequeue_op 0x7fad5c165500 osd.3.3:1021 1.df123c4b
2016-06-10 04:45:51.618278 7fad365a8700 10 osd.1 138 dequeue_op 0x7fad5c167800 osd.3.3:1022 1.df123c4b
2016-06-10 04:45:58.554820 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5f96e400 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:58.555157 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5ec61b00 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:58.555358 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5c165500 osd.3.3:1021 1.df123c4b
2016-06-10 04:45:58.555561 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5c167800 osd.3.3:1022 1.df123c4b
2016-06-10 04:45:58.556018 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5f96fc00 client.4123.0:3613 1.df123c4b
2016-06-10 04:45:58.556196 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5f96ea00 client.4123.0:3614 1.df123c4b
2016-06-10 04:45:58.556301 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5f96c100 client.4123.0:3615 1.df123c4b
2016-06-10 04:45:58.556411 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5bbd8f00 osd.1.3:1508 1.df123c4b
2016-06-10 04:45:58.564542 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5f96e400 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:58.564691 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5ec61b00 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:58.564797 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5c165500 osd.3.3:1021 1.df123c4b
2016-06-10 04:45:58.564899 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5c167800 osd.3.3:1022 1.df123c4b
2016-06-10 04:45:58.565246 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5f96fc00 client.4123.0:3613 1.df123c4b
2016-06-10 04:45:58.565343 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5f96ea00 client.4123.0:3614 1.df123c4b
2016-06-10 04:45:58.565396 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5f96c100 client.4123.0:3615 1.df123c4b
2016-06-10 04:45:58.565564 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5bbd8f00 osd.1.3:1508 1.df123c4b
2016-06-10 04:45:59.667061 7fad38dad700 10 osd.1 144 dequeue_op 0x7fad5f96e400 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:59.667241 7fad38dad700 10 osd.1 144 dequeue_op 0x7fad5ec61b00 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:59.667328 7fad38dad700 10 osd.1 144 dequeue_op 0x7fad5c165500 osd.3.3:1021 1.df123c4b
2016-06-10 04:45:59.667436 7fad38dad700 10 osd.1 144 dequeue_op 0x7fad5c167800 osd.3.3:1022 1.df123c4b
2016-06-10 04:45:59.667565 7fad38dad700 10 osd.1 144 dequeue_op 0x7fad5f96fc00 client.4123.0:3613 1.df123c4b
2016-06-10 04:45:59.667654 7fad38dad700 10 osd.1 144 dequeue_op 0x7fad5f96ea00 client.4123.0:3614 1.df123c4b
2016-06-10 04:45:59.667774 7fad38dad700 10 osd.1 144 dequeue_op 0x7fad5f96c100 client.4123.0:3615 1.df123c4b
2016-06-10 04:45:59.667834 7fad365a8700 10 osd.1 144 dequeue_op 0x7fad5bbd8f00 osd.1.3:1508 1.df123c4b
2016-06-10 04:45:59.680346 7fad38dad700 10 osd.1 144 dequeue_op 0x7fad5f96e400 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:59.683935 7fad38dad700 10 osd.1 144 dequeue_op 0x7fad5ec61b00 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:59.686354 7fad38dad700 10 osd.1 144 dequeue_op 0x7fad5c165500 osd.3.3:1021 1.df123c4b
2016-06-10 04:45:59.690473 7fad365a8700 10 osd.1 144 dequeue_op 0x7fad5c167800 osd.3.3:1022 1.df123c4b
2016-06-10 04:45:59.698974 7fad365a8700 10 osd.1 144 dequeue_op 0x7fad5f96fc00 client.4123.0:3613 1.df123c4b
2016-06-10 04:45:59.700963 7fad365a8700 10 osd.1 144 dequeue_op 0x7fad5f96ea00 client.4123.0:3614 1.df123c4b
2016-06-10 04:45:59.709109 7fad46f5f700 15 osd.1 144 enqueue_op 0x7fad5f96d800 osd.3.3:1026 1.df123c4b
2016-06-10 04:45:59.725828 7fad38dad700 10 osd.1 144 dequeue_op 0x7fad5f96c100 client.4123.0:3615 1.df123c4b
2016-06-10 04:45:59.733290 7fad38dad700 10 osd.1 144 dequeue_op 0x7fad5bbd8f00 osd.1.3:1508 1.df123c4b
2016-06-10 04:45:59.733916 7fad38dad700 10 osd.1 144 dequeue_op 0x7fad5f96d800 osd.3.3:1026 1.df123c4b

#2 Updated by Samuel Just almost 8 years ago

The interesting sequence is:
2016-06-10 04:45:37.605757 7fad47760700 15 osd.1 134 enqueue_op 0x7fad5f96e400 client.4123.0:3612 1.df123c4b
...
2016-06-10 04:45:51.618111 7fad46f5f700 15 osd.1 138 enqueue_op 0x7fad5ec61b00 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:51.618149 7fad38dad700 10 osd.1 138 dequeue_op 0x7fad5ec61b00 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:51.618177 7fad46f5f700 15 osd.1 138 enqueue_op 0x7fad5c165500 osd.3.3:1021 1.df123c4b
2016-06-10 04:45:51.618234 7fad46f5f700 15 osd.1 138 enqueue_op 0x7fad5c167800 osd.3.3:1022 1.df123c4b
2016-06-10 04:45:51.618227 7fad38dad700 10 osd.1 138 dequeue_op 0x7fad5c165500 osd.3.3:1021 1.df123c4b
2016-06-10 04:45:51.618278 7fad365a8700 10 osd.1 138 dequeue_op 0x7fad5c167800 osd.3.3:1022 1.df123c4b
2016-06-10 04:45:58.554820 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5f96e400 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:58.555157 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5ec61b00 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:58.555358 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5c165500 osd.3.3:1021 1.df123c4b
2016-06-10 04:45:58.555561 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5c167800 osd.3.3:1022 1.df123c4b
2016-06-10 04:45:58.556018 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5f96fc00 client.4123.0:3613 1.df123c4b
2016-06-10 04:45:58.556196 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5f96ea00 client.4123.0:3614 1.df123c4b
2016-06-10 04:45:58.556301 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5f96c100 client.4123.0:3615 1.df123c4b

That second enqueue_op is a second client.4123.0:3612 instance (note the different pointer value). Also, when they are dequeued again, the two osd ops and the new client.4123.0:3612 wind up before ops :3613-:3615. Later, when the osd ops actually get run (list-snaps and the initial copy-get), they run right before a bunch of writes, causing the second copy-get's assert-version to fail.

#3 Updated by Samuel Just almost 8 years ago

Samuel Just wrote:

The interesting sequence is:
2016-06-10 04:45:37.605757 7fad47760700 15 osd.1 134 enqueue_op 0x7fad5f96e400 client.4123.0:3612 1.df123c4b
...
2016-06-10 04:45:51.618111 7fad46f5f700 15 osd.1 138 enqueue_op 0x7fad5ec61b00 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:51.618149 7fad38dad700 10 osd.1 138 dequeue_op 0x7fad5ec61b00 client.4123.0:3612 1.df123c4b

Winds up in waiting_for_peering.

2016-06-10 04:45:51.618177 7fad46f5f700 15 osd.1 138 enqueue_op 0x7fad5c165500 osd.3.3:1021 1.df123c4b
2016-06-10 04:45:51.618234 7fad46f5f700 15 osd.1 138 enqueue_op 0x7fad5c167800 osd.3.3:1022 1.df123c4b
2016-06-10 04:45:51.618227 7fad38dad700 10 osd.1 138 dequeue_op 0x7fad5c165500 osd.3.3:1021 1.df123c4b

waiting_for_peering

2016-06-10 04:45:51.618278 7fad365a8700 10 osd.1 138 dequeue_op 0x7fad5c167800 osd.3.3:1022 1.df123c4b

waiting_for_peering

After this, something causes everything to get requeued -- specifically, an interval change. So why do :3613-:3615 dequeue in the wrong order below? Why does the original :3612 dequeue in the right order?

2016-06-10 04:45:58.554820 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5f96e400 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:58.555157 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5ec61b00 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:58.555358 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5c165500 osd.3.3:1021 1.df123c4b
2016-06-10 04:45:58.555561 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5c167800 osd.3.3:1022 1.df123c4b
2016-06-10 04:45:58.556018 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5f96fc00 client.4123.0:3613 1.df123c4b
2016-06-10 04:45:58.556196 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5f96ea00 client.4123.0:3614 1.df123c4b
2016-06-10 04:45:58.556301 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5f96c100 client.4123.0:3615 1.df123c4b

That second enqueue_op is a second client.4123.0:3612 instance (note the different pointer value). Also, when they are dequeued again, the two osd ops and the new client.4123.0:3612 wind up before ops :3613-:3615. Later, when the osd ops actually get run (list-snaps and the initial copy-get), they run right before a bunch of writes, causing the second copy-get's assert-version to fail.

#4 Updated by Samuel Just almost 8 years ago

  • Assignee set to Samuel Just
  • Priority changed from Normal to Urgent

#5 Updated by Samuel Just almost 8 years ago

2016-06-10 04:45:49.866540 7fad38dad700 10 osd.1 136 dequeue_op 0x7fad5f96e400 client.4123.0:3612 1.df123c4b

waiting_for_peered

2016-06-10 04:45:49.867116 7fad38dad700 10 osd.1 136 dequeue_op 0x7fad5f96fc00 client.4123.0:3613 1.df123c4b

waiting_for_peered

2016-06-10 04:45:49.867191 7fad38dad700 10 osd.1 136 dequeue_op 0x7fad5f96ea00 client.4123.0:3614 1.df123c4b

waiting_for_peered

2016-06-10 04:45:49.867288 7fad38dad700 10 osd.1 136 dequeue_op 0x7fad5f96c100 client.4123.0:3615 1.df123c4b

waiting_for_peered

2016-06-10 04:45:49.867410 7fad365a8700 10 osd.1 136 dequeue_op 0x7fad5bbd8f00 osd.1.3:1508 1.df123c4b
2016-06-10 04:45:51.618111 7fad46f5f700 15 osd.1 138 enqueue_op 0x7fad5ec61b00 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:51.618149 7fad38dad700 10 osd.1 138 dequeue_op 0x7fad5ec61b00 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:51.618177 7fad46f5f700 15 osd.1 138 enqueue_op 0x7fad5c165500 osd.3.3:1021 1.df123c4b
2016-06-10 04:45:51.618234 7fad46f5f700 15 osd.1 138 enqueue_op 0x7fad5c167800 osd.3.3:1022 1.df123c4b
2016-06-10 04:45:51.618227 7fad38dad700 10 osd.1 138 dequeue_op 0x7fad5c165500 osd.3.3:1021 1.df123c4b
2016-06-10 04:45:51.618278 7fad365a8700 10 osd.1 138 dequeue_op 0x7fad5c167800 osd.3.3:1022 1.df123c4b
2016-06-10 04:45:58.554820 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5f96e400 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:58.555157 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5ec61b00 client.4123.0:3612 1.df123c4b
2016-06-10 04:45:58.555358 7fad38dad700 10 osd.1 143 dequeue_op 0x7fad5c165500 osd.3.3:1021 1.df123c4b
2016-06-10 04:45:58.555561 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5c167800 osd.3.3:1022 1.df123c4b
2016-06-10 04:45:58.556018 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5f96fc00 client.4123.0:3613 1.df123c4b
2016-06-10 04:45:58.556196 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5f96ea00 client.4123.0:3614 1.df123c4b
2016-06-10 04:45:58.556301 7fad365a8700 10 osd.1 143 dequeue_op 0x7fad5f96c100 client.4123.0:3615 1.df123c4b

Ok, so the previous ones should have been in waiting_for_peered as well.

#6 Updated by Samuel Just almost 8 years ago

2016-06-10 04:42:41.619711 7fad4f7df800 0 osd.1 0 using 0 op queue with priority op cut off at 196.

I guess we're using the prioritized queue?
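A minimal sketch of why a prioritized queue alone cannot give the needed guarantee (assumed, simplified semantics; this is not the OSD's actual queue implementation): FIFO order holds only within one (priority, source) bucket, so ops from two different primaries targeting the same object may dequeue in either order.

```python
from collections import defaultdict, deque

# Simplified sketch of a prioritized op queue (assumed semantics, not
# the real Ceph PrioritizedQueue): FIFO is kept only per (priority,
# source) bucket, so ops from different sources can interleave.
class PrioQueue:
    def __init__(self):
        self.buckets = defaultdict(deque)   # (priority, source) -> ops

    def enqueue(self, priority, source, op):
        self.buckets[(priority, source)].append(op)

    def dequeue(self):
        # Highest priority wins; ties across sources resolve arbitrarily.
        key = max(self.buckets, key=lambda k: k[0])
        op = self.buckets[key].popleft()
        if not self.buckets[key]:
            del self.buckets[key]
        return op

q = PrioQueue()
q.enqueue(63, "client.4123", "op :3612")
q.enqueue(63, "client.4123", "op :3613")
q.enqueue(63, "osd.3", "copy-get 1021")
# :3612 always precedes :3613 (same bucket); where the osd.3 op lands
# relative to them is not guaranteed.
```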

#7 Updated by Samuel Just almost 8 years ago

Same interval from
2016-06-10 04:45:49.866540 7fad38dad700 10 osd.1 136 dequeue_op 0x7fad5f96e400 prio 63 cost 643082 latency 12.261479 osd_op(client.4123.0:3612 1.df123c4b smithi00225005-212 [write 763216~643082,stat] snapc 71=[71,6f,61,60,45] ondisk+write+ignore_cache+ignore_overlay+known_if_redirected e134) v7 pg pg[1.3( v 134'1051 (110'825,134'1051] local-les=124 n=137 ec=7 les/c/f 124/124/0 136/136/120) [2,5,0]/[1,2,5] r=0 lpr=136 pi=123-135/1 crt=132'1042 lcod 132'1042 mlcod 0'0 remapped+peering]
through
2016-06-10 04:45:51.618278 7fad365a8700 10 osd.1 138 dequeue_op 0x7fad5c167800 prio 63 cost 29 latency 0.000081 osd_op(osd.3.3:1022 1.df123c4b (undecoded) ack+read+rwordered+ignore_cache+ignore_overlay+map_snap_clone+known_if_redirected e138) v7 pg pg[1.3( v 134'1051 (110'825,134'1051] local-les=124 n=137 ec=7 les/c/f 124/124/0 136/136/120) [2,5,0]/[1,2,5] r=0 lpr=136 pi=123-135/1 crt=132'1042 lcod 132'1042 mlcod 0'0 remapped+peering]

#8 Updated by Samuel Just almost 8 years ago

2016-06-10 04:45:58.553823 7fad3adb1700 15 osd.1 pg_epoch: 143 pg[1.3( v 134'1051 (110'825,134'1051] local-les=124 n=137 ec=7 les/c/f 124/124/0 143/143/120) [0,5]/[1,5] r=0 lpr=143 pi=123-142/2 crt=132'1042 lcod 132'1042 mlcod 0'0 remapped NIBBLEWISE] requeue_ops 0x7fad5f96e400,0x7fad5da7a400,0x7fad5da7b700,0x7fad5da7b800,0x7fad5da7b900,0x7fad5da7ba00,0x7fad5f96fc00,0x7fad5f96ea00,0x7fad5f96c100,0x7fad5bbd8f00,0x7fad5ec61b00,0x7fad5c165500,0x7fad5c167800,0x7fad5c166900,0x7fad5c165f00,0x7fad5c164000

Focusing on just the two client.4123.0:3612 instances (0x7fad5f96e400 is the first, 0x7fad5ec61b00 is the second), we see them in the right order in the requeue_ops. The second is followed by the two osd ops (list-snaps and copy-get). So far, so boring. Ops :3613-:3615 also seem to be in the right places.

#9 Updated by Samuel Just almost 8 years ago

Ok, the first sequence of ops came from osd.1, the second from osd.3. The bug is that the queues do not guarantee ordering between ops from different sources. I think the fix is that proxied ops need to be queued based on a queuing token derived from the pgid (source or target? How does this interact with split? Can't use the raw token: too many queues). Can't queue based on the client; these need to order properly with the cache-tier primary's flushes/evicts/etc. Note that reads, writes, and cache/tiering promotion/eviction/flush operations will all need to agree on this token.
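The proposed direction can be sketched as follows (names and shard count are hypothetical, for illustration only; this is not the actual Ceph change): every op for a given PG, whether a client op, a proxied op, or a flush/evict, hashes to the same bounded shard, so relative order within the PG survives regardless of source.

```python
# Hypothetical sketch of a PG-derived queuing token -- illustrative
# names, not the real Ceph implementation. All ops for one PG map to
# one shard, so client and proxied ops keep their relative order.
NUM_SHARDS = 8   # bounded: "can't use the raw token, too many queues"

def queue_token(pg_hash, num_shards=NUM_SHARDS):
    # Fold the PG's hash into a bounded shard index.
    return pg_hash % num_shards

# A proxied op from osd.3 and a client write for the same object hash
# (e.g. 0xdf123c4b from the logs above) land in the same shard.
assert queue_token(0xdf123c4b) == queue_token(0xdf123c4b)
```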

#10 Updated by Samuel Just almost 8 years ago

  • Subject changed from jewel (probably master also): -ERANGE on write in ceph_test_rados to cache/proxied ops from different primaries (cache interval change) don't order properly, -ERANGE on write in ceph_test_rados

#11 Updated by Sage Weil almost 8 years ago

/a/sage-2016-06-11_06:29:35-rados-jewel---basic-smithi/251518

(at least ceph_test_rados got ERANGE... presumably it's this bug)

#12 Updated by Samuel Just over 7 years ago

  • Assignee deleted (Samuel Just)

#13 Updated by Samuel Just over 7 years ago

  • Duplicated by Bug #16827: "FAILED assert(0 == "racing read got wrong version")" in rados-jewel-distro-basic-smithi added

#14 Updated by Samuel Just over 7 years ago

  • Priority changed from Urgent to High

#15 Updated by Joao Eduardo Luis over 7 years ago

  • Assignee set to Joao Eduardo Luis

#16 Updated by Nathan Cutler about 7 years ago

This bug is now haunting rados runs in jewel 10.2.6 integration testing:

/a/smithfarm-2017-01-30_11:11:11-rados-wip-jewel-backports-distro-basic-smithi/764195

#17 Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to RADOS
  • Category set to Tiering
  • Component(RADOS) OSD added

#18 Updated by Greg Farnum over 6 years ago

  • Assignee deleted (Joao Eduardo Luis)

#19 Updated by Sage Weil over 6 years ago

  • Status changed from New to 12

#21 Updated by Josh Durgin over 4 years ago

  • Status changed from 12 to Won't Fix
