Project

General

Profile

Bug #12814

flipping the overlay from forward to seems to reorder writes

Added by Samuel Just about 3 years ago. Updated about 2 years ago.

Status:
Can't reproduce
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
Start date:
08/27/2015
Due date:
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
rados

Description

ubuntu@teuthology:/a/samuelj-2015-08-26_12:18:34-rados-wip-sam-testing-distro-basic-multi/1033948


Related issues

Related to Ceph - Backport #15171: hammer: osd: corruption when min_read_recency_for_promote > 1 Resolved
Duplicated by Ceph - Bug #13013: Cache tier thrashing causes out of order write Duplicate 09/09/2015
Duplicated by Ceph - Bug #14306: "RadosModel.h: 854: FAILED assert(0)" in rados-hammer-distro-basic-openstack Duplicate 01/08/2016

Associated revisions

Revision d7da6884 (diff)
Added by Sage Weil over 2 years ago

osd: add 'proxy' cache mode

The 'forward' mode is flawed for the same reason that redirects
are flawed: they allow the reordering of requests. Add a proxy
mode that can be used instead of forward. (Later we can consider
dropping forward entirely?)

Fixes: #12814
Signed-off-by: Sage Weil <>

History

#1 Updated by Sage Weil about 3 years ago

maybe /a/sage-2015-09-06_05:36:07-rados-wip-sage-testing---basic-multi/1044769 ?

2015-09-06T07:42:40.338 INFO:tasks.rados.rados.0.burnupi24.stderr:Error: finished tid 2 when last_acked_tid was 6

#2 Updated by Sage Weil almost 3 years ago

/a/sage-2015-11-14_04:50:22-rados-wip-sage-testing---basic-multi/1149284

#3 Updated by Sage Weil almost 3 years ago

  • Duplicated by Bug #13013: Cache tier thrashing causes out of order write added

#4 Updated by Samuel Just almost 3 years ago

/a/teuthology-2015-11-21_21:00:01-rados-hammer-distro-basic-openstack/17756

#5 Updated by Sage Weil almost 3 years ago

/a/sage-2015-11-25_07:26:30-rados-wip-sage-testing---basic-multi/1160057

#6 Updated by Sage Weil almost 3 years ago

teuthology:1160057  05:57 AM $ egrep 'client.4119.0:3052[456789]' remote/*/log/*osd* | grep === | sort -k 2 | grep e23
remote/burnupi26/log/ceph-osd.5.log:2015-11-25 14:38:24.300618 7fb4a7bab700  1 -- 10.214.135.38:6808/10423 <== client.4119 10.214.135.38:0/3782606168 87 ==== osd_op(client.4119.0:30524 1.65b401d9 (undecoded) ack+ondisk+retry+write+redirected+known_if_redirected e23) v7 ==== 243+0+734444 (3093478145 0 1938170131) 0x7fb4d860e680 con 0x7fb4d7267200
remote/mira057/log/ceph-osd.2.log:2015-11-25 14:38:24.320914 7f1b33bbe700  1 -- 10.214.134.136:6800/3338 <== client.4119 10.214.135.38:0/3782606168 36 ==== osd_op(client.4119.0:30525 2.65b401d9 (undecoded) ack+ondisk+write+known_if_redirected e23) v7 ==== 205+0+509614 (4268985705 0 4255468290) 0x7f1b64847c80 con 0x7f1b64272300
remote/burnupi26/log/ceph-osd.5.log:2015-11-25 14:38:24.325428 7fb4a7bab700  1 -- 10.214.135.38:6808/10423 <== client.4119 10.214.135.38:0/3782606168 88 ==== osd_op(client.4119.0:30529 1.65b401d9 (undecoded) ack+ondisk+retry+write+redirected+known_if_redirected e23) v7 ==== 205+0+509614 (3250234563 0 4255468290) 0x7fb4d8f9b440 con 0x7fb4d7267200
remote/mira057/log/ceph-osd.2.log:2015-11-25 14:38:24.331382 7f1b33bbe700  1 -- 10.214.134.136:6800/3338 <== client.4119 10.214.135.38:0/3782606168 37 ==== osd_op(client.4119.0:30526 2.65b401d9 (undecoded) ack+ondisk+write+known_if_redirected e23) v7 ==== 205+0+281605 (2638626519 0 3821793884) 0x7f1b63505e40 con 0x7f1b64272300
remote/mira057/log/ceph-osd.2.log:2015-11-25 14:38:24.331509 7f1b33bbe700  1 -- 10.214.134.136:6800/3338 <== client.4119 10.214.135.38:0/3782606168 38 ==== osd_op(client.4119.0:30527 2.65b401d9 (undecoded) ack+ondisk+write+known_if_redirected e23) v7 ==== 243+0+65 (3223704887 0 2138031401) 0x7f1b63695440 con 0x7f1b64272300
remote/mira057/log/ceph-osd.2.log:2015-11-25 14:38:24.331607 7f1b33bbe700  1 -- 10.214.134.136:6800/3338 <== client.4119 10.214.135.38:0/3782606168 39 ==== osd_op(client.4119.0:30528 2.65b401d9 (undecoded) ack+read+rwordered+known_if_redirected e23) v7 ==== 205+0+0 (1396123015 0 0) 0x7f1b63422ec0 con 0x7f1b64272300

the problematic op is the first one, client.4119.0:30524. why does it get sent with flag 'redirected' to osd.5?

#7 Updated by Loic Dachary almost 3 years ago

testing infernalis rados/singleton/{all/thrash_cache_writeback_forward_none.yaml fs/btrfs.yaml msgr/simple.yaml msgr-failures/few.yaml}, ran into it, just in case it may be helpful to know

2015-11-26T09:06:05.170 INFO:tasks.rados.rados.0.target167160.stdout:2400:  finishing write tid 1 to target167160.teuthology9942-2400
2015-11-26T09:06:05.171 INFO:tasks.rados.rados.0.target167160.stderr:Error: finished tid 1 when last_acked_tid was 5
2015-11-26T09:06:05.178 INFO:tasks.rados.rados.0.target167160.stderr:./test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7f649bfff700 time 2015-11-26 09:06:05.080725
2015-11-26T09:06:05.178 INFO:tasks.rados.rados.0.target167160.stderr:./test/osd/RadosModel.h: 862: FAILED assert(0)
2015-11-26T09:06:05.180 INFO:tasks.rados.rados.0.target167160.stderr: ceph version 9.2.0-46-g97293b5 (97293b554026cff96da59760aeb577660867a2ac)
2015-11-26T09:06:05.180 INFO:tasks.rados.rados.0.target167160.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7f64b341ffdb]
2015-11-26T09:06:05.180 INFO:tasks.rados.rados.0.target167160.stderr: 2: (WriteOp::_finish(TestOp::CallbackInfo*)+0x4b8) [0x7f64b340b8a8]
2015-11-26T09:06:05.180 INFO:tasks.rados.rados.0.target167160.stderr: 3: (write_callback(void*, void*)+0x19) [0x7f64b3417379]
2015-11-26T09:06:05.180 INFO:tasks.rados.rados.0.target167160.stderr: 4: (()+0x9ccad) [0x7f64b0c33cad]
2015-11-26T09:06:05.181 INFO:tasks.rados.rados.0.target167160.stderr: 5: (()+0x77fc9) [0x7f64b0c0efc9]
2015-11-26T09:06:05.181 INFO:tasks.rados.rados.0.target167160.stderr: 6: (()+0x13bef8) [0x7f64b0cd2ef8]
2015-11-26T09:06:05.181 INFO:tasks.rados.rados.0.target167160.stderr: 7: (()+0x8182) [0x7f64b076b182]
2015-11-26T09:06:05.181 INFO:tasks.rados.rados.0.target167160.stderr: 8: (clone()+0x6d) [0x7f64af2f547d]
2015-11-26T09:06:05.181 INFO:tasks.rados.rados.0.target167160.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

#8 Updated by Sage Weil almost 3 years ago

  • Status changed from New to In Progress
  • Assignee set to Sage Weil
  • Source changed from other to Q/A

#9 Updated by bo cai almost 3 years ago

Hi,sage,how is this going?

#10 Updated by Samuel Just over 2 years ago

  • Duplicated by Bug #14306: "RadosModel.h: 854: FAILED assert(0)" in rados-hammer-distro-basic-openstack added

#11 Updated by David Zafman over 2 years ago

Seen again in my hammer backport testing:

description: rados:singleton:all/thrash_cache_writeback_forward_none.yaml

teuthology:/a/dzafman-2016-02-24_20:32:40-rados:singleton:all-wip-14077-hammer---basic-smithi/26379

#12 Updated by Yuri Weinstein over 2 years ago

Seen in run:
http://pulpito.ovh.sepia.ceph.com:8081/teuthology-2016-03-08_02:00:01-rados-infernalis-distro-basic-openstack/
Job: 11576
Logs: http://teuthology.ovh.sepia.ceph.com/teuthology/teuthology-2016-03-08_02:00:01-rados-infernalis-distro-basic-openstack/11576/teuthology.log

2016-03-08T03:39:03.627 INFO:tasks.rados.rados.0.target078044.stdout:9241:  writing target078044.ovh.sepia.ceph.com13368-9241 from 1639782 to 2066647 tid 2
2016-03-08T03:39:03.627 INFO:tasks.rados.rados.0.target078044.stdout: waiting on 16
2016-03-08T03:39:03.691 INFO:tasks.rados.rados.0.target078044.stdout:9213:  finishing write tid 2 to target078044.ovh.sepia.ceph.com13368-9213
2016-03-08T03:39:03.691 INFO:tasks.rados.rados.0.target078044.stderr:Error: finished tid 2 when last_acked_tid was 6
2016-03-08T03:39:03.701 INFO:tasks.rados.rados.0.target078044.stderr:./test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7f8e22ffd700 time 2016-03-08 03:39:03.617390
2016-03-08T03:39:03.701 INFO:tasks.rados.rados.0.target078044.stderr:./test/osd/RadosModel.h: 862: FAILED assert(0)
2016-03-08T03:39:03.703 INFO:tasks.rados.rados.0.target078044.stderr: ceph version 9.2.1-5-g5eb2d6c (5eb2d6cf1ecb0c2d9b3caee10eff99c24d82b484)
2016-03-08T03:39:03.703 INFO:tasks.rados.rados.0.target078044.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7f8e31f85f9b]
2016-03-08T03:39:03.704 INFO:tasks.rados.rados.0.target078044.stderr: 2: (WriteOp::_finish(TestOp::CallbackInfo*)+0x4b8) [0x7f8e31f71868]
2016-03-08T03:39:03.704 INFO:tasks.rados.rados.0.target078044.stderr: 3: (write_callback(void*, void*)+0x19) [0x7f8e31f7d339]
2016-03-08T03:39:03.704 INFO:tasks.rados.rados.0.target078044.stderr: 4: (()+0x9cc7d) [0x7f8e2f799c7d]
2016-03-08T03:39:03.704 INFO:tasks.rados.rados.0.target078044.stderr: 5: (()+0x77f89) [0x7f8e2f774f89]
2016-03-08T03:39:03.704 INFO:tasks.rados.rados.0.target078044.stderr: 6: (()+0x13be88) [0x7f8e2f838e88]
2016-03-08T03:39:03.705 INFO:tasks.rados.rados.0.target078044.stderr: 7: (()+0x8182) [0x7f8e2f2d1182]
2016-03-08T03:39:03.705 INFO:tasks.rados.rados.0.target078044.stderr: 8: (clone()+0x6d) [0x7f8e2de5b47d]
2016-03-08T03:39:03.705 INFO:tasks.rados.rados.0.target078044.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

#13 Updated by Yuri Weinstein over 2 years ago

  • Release set to infernalis
  • ceph-qa-suite rados added

#14 Updated by Sage Weil over 2 years ago

  • Status changed from In Progress to Verified
  • Assignee deleted (Sage Weil)

#15 Updated by Greg Farnum over 2 years ago

Looks like a user on the mailing list ran into this: https://www.mail-archive.com/ceph-users@lists.ceph.com/msg27680.html
Apparently switching between writeback and forward is useful in some cases for getting a read cache even if your cache pool can't handle the write throughput. 0_o

#16 Updated by Sage Weil over 2 years ago

  • Related to Backport #15171: hammer: osd: corruption when min_read_recency_for_promote > 1 added

#18 Updated by Sage Weil over 2 years ago

  • Assignee set to Sage Weil

#19 Updated by Sage Weil over 2 years ago

  • Status changed from Verified to In Progress

creating a 'proxy' mode that replaces 'forward'.

#20 Updated by Sage Weil over 2 years ago

  • Priority changed from Urgent to Immediate

#21 Updated by Sage Weil over 2 years ago

  • Status changed from In Progress to Resolved

#22 Updated by Sage Weil over 2 years ago

  • Status changed from Resolved to In Progress

http://pulpito.ceph.com/sage-2016-03-30_13:23:09-rados-wip-sage-testing-distro-basic-smithi/98188


75276277278279280281282283284285286287288289290291292293294295296297298299 version is 6730 and expected 8776
2016-03-30T15:25:29.310 INFO:tasks.rados.rados.0.smithi027.stderr:./test/osd/RadosModel.h: In function 'virtual void ReadOp::_finish(TestOp::CallbackInfo*)' thread 7fab15ffb700 time 2016-03-30 22:25:29.293395
2016-03-30T15:25:29.311 INFO:tasks.rados.rados.0.smithi027.stderr:./test/osd/RadosModel.h: 1169: FAILED assert(version == old_value.version)

#23 Updated by Loic Dachary over 2 years ago

on the hammer branch

  description: rados/singleton/{all/thrash_cache_writeback_forward_none.yaml fs/btrfs.yaml
    msgr-failures/many.yaml}
...
2016-04-03T19:42:03.114 INFO:tasks.rados.rados.0.target167114240148.stdout:5143:  finishing write tid 2 to target167114240148.teuthology9910-5143
2016-04-03T19:42:03.114 INFO:tasks.rados.rados.0.target167114240148.stderr:Error: finished tid 2 when last_acked_tid was 7
2016-04-03T19:42:03.115 INFO:tasks.rados.rados.0.target167114240148.stderr:./test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7f7754ff9700 time 2016-04-03 19:42:03.103186
2016-04-03T19:42:03.115 INFO:tasks.rados.rados.0.target167114240148.stderr:./test/osd/RadosModel.h: 854: FAILED assert(0)
2016-04-03T19:42:03.116 INFO:tasks.rados.rados.0.target167114240148.stderr: ceph version 0.94.6-201-g7cfc739 (7cfc739a513da84d76b959efc51dd21c1cc552d6)
2016-04-03T19:42:03.116 INFO:tasks.rados.rados.0.target167114240148.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x4e101b]
2016-04-03T19:42:03.116 INFO:tasks.rados.rados.0.target167114240148.stderr: 2: (WriteOp::_finish(TestOp::CallbackInfo*)+0x46e) [0x4c8dce]
2016-04-03T19:42:03.116 INFO:tasks.rados.rados.0.target167114240148.stderr: 3: (write_callback(void*, void*)+0x19) [0x4d8619]
2016-04-03T19:42:03.116 INFO:tasks.rados.rados.0.target167114240148.stderr: 4: (()+0x94fcd) [0x7f77617d8fcd]
2016-04-03T19:42:03.117 INFO:tasks.rados.rados.0.target167114240148.stderr: 5: (()+0x70319) [0x7f77617b4319]
2016-04-03T19:42:03.117 INFO:tasks.rados.rados.0.target167114240148.stderr: 6: (()+0x134348) [0x7f7761878348]
2016-04-03T19:42:03.117 INFO:tasks.rados.rados.0.target167114240148.stderr: 7: (()+0x8182) [0x7f7761318182]
2016-04-03T19:42:03.117 INFO:tasks.rados.rados.0.target167114240148.stderr: 8: (clone()+0x6d) [0x7f775fc9d47d]
2016-04-03T19:42:03.117 INFO:tasks.rados.rados.0.target167114240148.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2016-04-03T19:42:03.117 INFO:tasks.rados.rados.0.target167114240148.stderr:terminate called after throwing an instance of 'ceph::FailedAssertion'

#24 Updated by Sage Weil over 2 years ago

removed the test from the hammer suite https://github.com/ceph/ceph-qa-suite/pull/931

#25 Updated by Sage Weil over 2 years ago

2016-03-30 22:25:14.308757 7fb988321700 10 osd.5 pg_epoch: 20 pg[2.2( v 20'6027 (19'2932,20'6027] local-les=8 n=14 ec=6 les/c/f 8/8/0 6/6/6) [5,1] r=0 lpr=6 luod=20'6026 crt=20'6025 lcod 20'6025 mlcod 20'6025 active+clean] do_op osd_op(client.4118.0:46216 2.1aeca46e smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299 [write 712067~706736] snapc 0=[] ack+ondisk+write+known_if_redirected e20) v7 may_write -> write-ordered flags ack+ondisk+write+known_if_redirected
2016-03-30 22:25:14.308785 7fb988321700 10 osd.5 pg_epoch: 20 pg[2.2( v 20'6027 (19'2932,20'6027] local-les=8 n=14 ec=6 les/c/f 8/8/0 6/6/6) [5,1] r=0 lpr=6 luod=20'6026 crt=20'6025 lcod 20'6025 mlcod 20'6025 active+clean] get_object_context: obc NOT found in cache: 2:76253758:::smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:head
2016-03-30 22:25:14.308822 7fb988321700 15 filestore(/var/lib/ceph/osd/ceph-5) getattr 2.2_head/#2:76253758:::smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:head# '_'
2016-03-30 22:25:14.308944 7fb988321700 10 filestore(/var/lib/ceph/osd/ceph-5) error opening file /var/lib/ceph/osd/ceph-5/current/2.2_head/smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105_44cfbf811b9e7fd5862e_0_long with flags=2: (2) No such file or directory
2016-03-30 22:25:14.308953 7fb988321700 10 filestore(/var/lib/ceph/osd/ceph-5) getattr 2.2_head/#2:76253758:::smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:head# '_' = -2
2016-03-30 22:25:14.308962 7fb988321700 10 osd.5 pg_epoch: 20 pg[2.2( v 20'6027 (19'2932,20'6027] local-les=8 n=14 ec=6 les/c/f 8/8/0 6/6/6) [5,1] r=0 lpr=6 luod=20'6026 crt=20'6025 lcod 20'6025 mlcod 20'6025 active+clean] get_object_context: no obc for soid 2:76253758:::smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:head but can_create
2016-03-30 22:25:14.308979 7fb988321700 15 filestore(/var/lib/ceph/osd/ceph-5) getattr 2.2_head/#2:76253758:::smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:snapdir# 'snapset'
2016-03-30 22:25:14.309051 7fb988321700 10 filestore(/var/lib/ceph/osd/ceph-5) error opening file /var/lib/ceph/osd/ceph-5/current/2.2_head/smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105_a23dc54bec6480cad5a6_0_long with flags=2: (2) No such file or directory
2016-03-30 22:25:14.309057 7fb988321700 10 filestore(/var/lib/ceph/osd/ceph-5) getattr 2.2_head/#2:76253758:::smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:snapdir# 'snapset' = -2
2016-03-30 22:25:14.309084 7fb988321700 10 osd.5 pg_epoch: 20 pg[2.2( v 20'6027 (19'2932,20'6027] local-les=8 n=14 ec=6 les/c/f 8/8/0 6/6/6) [5,1] r=0 lpr=6 luod=20'6026 crt=20'6025 lcod 20'6025 mlcod 20'6025 active+clean] create_object_context 0x7fb9ae22f180 2:76253758:::smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:head 
2016-03-30 22:25:14.309099 7fb988321700 10 osd.5 pg_epoch: 20 pg[2.2( v 20'6027 (19'2932,20'6027] local-les=8 n=14 ec=6 les/c/f 8/8/0 6/6/6) [5,1] r=0 lpr=6 luod=20'6026 crt=20'6025 lcod 20'6025 mlcod 20'6025 active+clean] populate_obc_watchers 2:76253758:::smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:head
2016-03-30 22:25:14.309111 7fb988321700 20 osd.5 pg_epoch: 20 pg[2.2( v 20'6027 (19'2932,20'6027] local-les=8 n=14 ec=6 les/c/f 8/8/0 6/6/6) [5,1] r=0 lpr=6 luod=20'6026 crt=20'6025 lcod 20'6025 mlcod 20'6025 active+clean] ReplicatedPG::check_blacklisted_obc_watchers for obc 2:76253758:::smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:head
2016-03-30 22:25:14.309123 7fb988321700 10 osd.5 pg_epoch: 20 pg[2.2( v 20'6027 (19'2932,20'6027] local-les=8 n=14 ec=6 les/c/f 8/8/0 6/6/6) [5,1] r=0 lpr=6 luod=20'6026 crt=20'6025 lcod 20'6025 mlcod 20'6025 active+clean] get_object_context: 0x7fb9ae22f180 2:76253758:::smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:head rwstate(none n=0 w=0) oi: 2:76253758:::smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:head(0'0 unknown.0.0:0 s 0 uv 0) ssc: 0x7fb9ae80b2c0 snapset: 0=[]:[]
2016-03-30 22:25:14.309142 7fb988321700 10 osd.5 pg_epoch: 20 pg[2.2( v 20'6027 (19'2932,20'6027] local-les=8 n=14 ec=6 les/c/f 8/8/0 6/6/6) [5,1] r=0 lpr=6 luod=20'6026 crt=20'6025 lcod 20'6025 mlcod 20'6025 active+clean] find_object_context 2:76253758:::smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:head @head oi=2:76253758:::smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:head(0'0 unknown.0.0:0 s 0 uv 0)
2016-03-30 22:25:14.309162 7fb988321700 20 osd.5 pg_epoch: 20 pg[2.2( v 20'6027 (19'2932,20'6027] local-les=8 n=14 ec=6 les/c/f 8/8/0 6/6/6) [5,1] r=0 lpr=6 luod=20'6026 crt=20'6025 lcod 20'6025 mlcod 20'6025 active+clean] agent_choose_mode0x7fb9ac6fc800 delaying, ignored
2016-03-30 22:25:14.309179 7fb988321700 10 osd.5 pg_epoch: 20 pg[2.2( v 20'6027 (19'2932,20'6027] local-les=8 n=14 ec=6 les/c/f 8/8/0 6/6/6) [5,1] r=0 lpr=6 luod=20'6026 crt=20'6025 lcod 20'6025 mlcod 20'6025 active+clean] start_copy 2:76253758:::smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:head from 2:76253758:::smithi0272108-7783 0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299:head @1 v0 flags 30 mirror_snapset

in epoch 20, the map is in proxy mode. i can't figure out why it didn't do a proxy write from the log. unfortunately the mon data tarball isn't complete, so i can't fully verify the osdmap content.

i think we need to wait for this to trigger again. enabling a higher osd log level.

#26 Updated by Sage Weil over 2 years ago

  • Status changed from In Progress to Need More Info

#27 Updated by Sage Weil over 2 years ago

  • Priority changed from Immediate to Urgent

#28 Updated by Samuel Just about 2 years ago

  • Status changed from Need More Info to Can't reproduce

Also available in: Atom PDF