Samuel Just's activity
From 01/26/2017 to 02/24/2017
02/24/2017
- 10:12 PM Ceph Bug #19076: osd/ReplicatedBackend.cc: 884: FAILED assert(j != bc->pulling.end())
- Something like https://github.com/athanatos/ceph/tree/wip-17831-18583-18809-18927-19076
- 07:31 PM Ceph Bug #19076: osd/ReplicatedBackend.cc: 884: FAILED assert(j != bc->pulling.end())
- 10:12 PM RADOS Bug #19023: ceph_test_rados invalid read caused apparently by lost intervals due to mons trimming...
- Something like: https://github.com/athanatos/ceph/tree/wip-19023
- 10:10 PM Ceph Bug #18961 (Resolved): objecter continually resends ops which don't have a callback
- 10:09 PM Ceph Bug #18937 (Resolved): cache/tiering flush bug with head delete
- 10:09 PM Ceph Revision 44b26f6a (ceph): Merge pull request #13594 from athanatos/wip-snap-trim-sleep
- osd: add snap trim reservation and re-implement osd_snap_trim_sleep
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
- 10:08 PM Ceph Revision 4f856fe9 (ceph): Merge pull request #13570 from athanatos/wip-18937
- osd: don't use ORDERSNAP for flush; always request/send ondisk ack
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Rev...
- 07:27 PM Ceph Revision 0c0feca3 (ceph): osd,osdc: eliminate FLAG_ONDISK and helpers
- The objecter actually always needs to get a response in order to
be able to not continually resend ops (even if the c...
- 07:26 PM Ceph Revision 48cc5d26 (ceph): PrimaryLogPG::start_flush: don't use ORDERSNAP, eliminate the second de...
- I think that whole thing was a misguided attempt to avoid deleting head
if it exists in the base tier (in reality it ...
02/22/2017
- 11:44 PM RADOS Bug #19023: ceph_test_rados invalid read caused apparently by lost intervals due to mons trimming...
- Well, sort of. last_epoch_clean is really about when we can forget OSDMaps. Should we retain OSDMaps on the mon (an...
- 11:33 PM RADOS Bug #19023: ceph_test_rados invalid read caused apparently by lost intervals due to mons trimming...
- 2017-02-20 20:45:59.104093 7f75c93f8700 10 osd.3 pg_epoch: 284 pg[1.16( v 278'379 (0'0,278'379] local-les=277 n=1 ec=...
- 12:09 AM RADOS Bug #19023: ceph_test_rados invalid read caused apparently by lost intervals due to mons trimming...
- 2017-02-20 20:46:28.567065 7ffa3242c700 10 osd.4 pg_epoch: 255 pg[1.16( v 254'369 (0'0,254'369] local-les=164 n=3 ec=...
- 12:05 AM RADOS Bug #19023: ceph_test_rados invalid read caused apparently by lost intervals due to mons trimming...
- 2017-02-20 20:46:40.165108 7f9e2ffc3700 10 osd.0 pg_epoch: 300 pg[1.16( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 ...
- 12:03 AM RADOS Bug #19023: ceph_test_rados invalid read caused apparently by lost intervals due to mons trimming...
- 2017-02-20 20:46:41.743173 7f9e277b2700 10 osd.0 pg_epoch: 301 pg[1.16( empty local-les=0 n=0 ec=141 les/c/f 164/164/...
- 10:34 PM Ceph Feature #18052: Replace past_intervals with more compact structure
- https://github.com/athanatos/ceph/tree/wip-past-intervals
- 10:34 PM Ceph Bug #17916 (Can't reproduce): osd/PGLog.cc: 1047: FAILED assert(oi.version == i->first)
- 10:33 PM Ceph Bug #18961 (Fix Under Review): objecter continually resends ops which don't have a callback
- https://github.com/ceph/ceph/pull/13570
- 10:33 PM Ceph Bug #18927 (Fix Under Review): on_flushed: object ... obc still alive
- https://github.com/ceph/ceph/pull/13569
- 10:32 PM Ceph Bug #18937 (Fix Under Review): cache/tiering flush bug with head delete
- https://github.com/ceph/ceph/pull/13570
02/21/2017
- 11:39 PM RADOS Bug #19023: ceph_test_rados invalid read caused apparently by lost intervals due to mons trimming...
- Notably, when it goes active at the end there, it's missing the 10 commits which happened during the [3,1] interval.
- 11:38 PM RADOS Bug #19023: ceph_test_rados invalid read caused apparently by lost intervals due to mons trimming...
- At epoch 255, 1.16 is on [4,3] and is active+clean
2017-02-20 20:45:10.962790 7fd9b7cba700 10 osd.4 pg_epoch: 255 ...
- 01:27 AM RADOS Bug #19023 (Resolved): ceph_test_rados invalid read caused apparently by lost intervals due to mo...
- samuelj@teuthology:/a/samuelj-2017-02-20_18:45:04-rados-wip-18937---basic-smithi/839771/remote
If you look back in...
- 05:24 AM Ceph Revision 2ed7759c (ceph): PrimaryLogPG: reimplement osd_snap_trim_sleep within the state machine
- Rather than blocking the main op queue, just pause for that amount of
time between state machine cycles.
Also, add o...
- 01:42 AM Ceph Bug #19024 (Can't reproduce): ec pool stuck incomplete, active+remapped -- crush mapping anomaly?
- samuelj@teuthology:/a/samuelj-2017-02-20_18:45:04-rados-wip-18937---basic-smithi/839838
I killed the osd.5 process...
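Revision 2ed7759c above describes pausing between state machine cycles instead of blocking the main op queue. The following is a minimal Python sketch of that idea only, not Ceph's C++ implementation: `Scheduler`, `SnapTrimmer`, and all of their methods are hypothetical names invented for illustration, and the clock is simulated so the example is deterministic.

```python
import heapq
from typing import Callable, List, Tuple


class Scheduler:
    """Simulated clock with a timer heap (hypothetical stand-in for a timer thread)."""

    def __init__(self) -> None:
        self.now = 0.0
        self._timers: List[Tuple[float, int, Callable[[], None]]] = []
        self._seq = 0  # tie-breaker so callbacks themselves are never compared

    def call_later(self, delay: float, fn: Callable[[], None]) -> None:
        heapq.heappush(self._timers, (self.now + delay, self._seq, fn))
        self._seq += 1

    def run(self) -> None:
        # Fire timers in time order, advancing the simulated clock.
        while self._timers:
            when, _, fn = heapq.heappop(self._timers)
            self.now = when
            fn()


class SnapTrimmer:
    """Trims one object per cycle, then re-arms a timer instead of sleeping."""

    def __init__(self, sched: Scheduler, objects: List[str],
                 sleep: float, log: List[Tuple[float, str]]) -> None:
        self.sched = sched
        self.pending = list(objects)
        self.sleep = sleep
        self.log = log
        self.state = "idle"

    def start(self) -> None:
        self.state = "trimming"
        self.sched.call_later(0.0, self.trim_one)

    def trim_one(self) -> None:
        if not self.pending:
            self.state = "idle"
            return
        obj = self.pending.pop(0)
        self.log.append((self.sched.now, f"trim {obj}"))
        if self.pending:
            # Return immediately; other work can run until the timer fires.
            self.state = "waiting"
            self.sched.call_later(self.sleep, self.trim_one)
        else:
            self.state = "idle"
```

Because `trim_one` returns right after arming the timer, any other callback scheduled in the meantime runs between trim cycles rather than waiting behind a blocking sleep.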
02/17/2017
- 05:48 PM Ceph Revision 51eee55c (ceph): ReplicatedBackend: don't queue Context outside of ObjectStore with obc
- We only flush the ObjectStore callbacks, not everything else. Thus,
there isn't a guarantee that the obc held by pu...
02/16/2017
- 09:29 PM Ceph Bug #18961 (Resolved): objecter continually resends ops which don't have a callback
- This is triggered by the delete op sent during OSD flush.
02/15/2017
- 09:55 PM Ceph Bug #18929: "osd/PG.cc: 6896: FAILED assert(pg->is_acting(osd_with_shard) || pg->is_up(osd_with_s...
- I don't understand why this is not popping up. Sage's patch is correct, but something else is going on. Why is the ...
- 09:54 PM Ceph Bug #18929: "osd/PG.cc: 6896: FAILED assert(pg->is_acting(osd_with_shard) || pg->is_up(osd_with_s...
- samuelj@teuthology:/a/samuelj-2017-02-15_01:03:44-rados-wip-sam-testing---basic-smithi/816292 also
- 07:37 PM teuthology Bug #18946 (Rejected): apt-get dependency failures on rados run
- Maybe already fixed?
- 07:36 PM teuthology Bug #18946 (Rejected): apt-get dependency failures on rados run
- samuelj@teuthology:/a/samuelj-2017-02-15_01:03:44-rados-wip-sam-testing---basic-smithi$ for i in $(~/teuthology/virtu...
- 12:09 AM Ceph Bug #18937 (Resolved): cache/tiering flush bug with head delete
- base: 77=[77,76,74,71,6f,6d,62,61]:[]+head
promoted at 77, then deleted in cache
cache: 7a=[7a,76,74,6f,6d,62,...
- 12:08 AM Ceph Bug #18809 (Resolved): FAILED assert(object_contexts.empty()) (live on master only from Jan-Feb 2...
- 12:07 AM Ceph Bug #18529 (Resolved): ERROR: test_rados.TestRados.test_ping_monitor
- 12:07 AM Ceph Bug #18927: on_flushed: object ... obc still alive
02/14/2017
- 08:00 PM Ceph Bug #17831 (Resolved): osd: ENOENT on clone
- http://tracker.ceph.com/issues/18927 and http://tracker.ceph.com/issues/18809 were caused by this series, I don't thi...
02/13/2017
- 05:47 PM Ceph Revision c2eac34c (ceph): osd/: add PG_STATE_SNAPTRIM[_WAIT] to expose snap trim state to user
- Signed-off-by: Samuel Just <sjust@redhat.com>
- 05:47 PM Ceph Revision 4aebf59d (ceph): rados: check that pool is done trimming before removing it
- Signed-off-by: Samuel Just <sjust@redhat.com>
- 05:47 PM Ceph Revision 21cc515a (ceph): osd/PrimaryLogPG: limit the number of concurrently trimming pgs
- This patch introduces an AsyncReserver for snap trimming to limit the
number of pgs on any single OSD which can be tr...
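Revision 21cc515a above limits how many PGs on a single OSD trim at once by means of a reserver. A minimal Python sketch of that pattern follows. It is an illustration under stated assumptions, not the interface of Ceph's AsyncReserver: the class `TrimReserver`, its methods, and the FIFO grant order are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Hashable, Set, Tuple


class TrimReserver:
    """Grant at most `max_active` slots at once; queue other requests FIFO.

    Hypothetical sketch: names and grant policy are assumptions, and each
    PG is expected to have at most one outstanding request.
    """

    def __init__(self, max_active: int) -> None:
        if max_active < 1:
            raise ValueError("max_active must be at least 1")
        self.max_active = max_active
        self.active: Set[Hashable] = set()
        self.waiting: Deque[Tuple[Hashable, Callable[[], None]]] = deque()

    def request(self, pg: Hashable, on_granted: Callable[[], None]) -> None:
        """Ask for a trim slot; on_granted runs once a slot is available."""
        self.waiting.append((pg, on_granted))
        self._grant()

    def release(self, pg: Hashable) -> None:
        """Give up a held slot, or cancel a request that is still queued."""
        if pg in self.active:
            self.active.remove(pg)
        else:
            self.waiting = deque(w for w in self.waiting if w[0] != pg)
        self._grant()

    def _grant(self) -> None:
        # Hand out free slots to the oldest waiters first.
        while self.waiting and len(self.active) < self.max_active:
            pg, on_granted = self.waiting.popleft()
            self.active.add(pg)
            on_granted()
```

With `max_active=2`, a third PG that calls `request` is queued and its callback runs only after one of the first two calls `release`.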
02/10/2017
- 07:20 PM Ceph Revision 6da3f9a5 (ceph): Merge pull request #13344 from gregsfortytwo/wip-osd-discussion-docs
- Wip osd discussion docs
Reviewed-by: Samuel Just <sjust@redhat.com>
- 07:18 PM Ceph Revision 534ae8fe (ceph): Merge pull request #13342 from athanatos/wip-17831-18583-18809
- osd/: don't leak context for Blessed*Context or RecoveryQueueAsync
Reviewed-by: Kefu Chai <kchai@redhat.com>
Reviewe...
02/09/2017
- 06:51 PM Ceph Bug #18533: two instances of omap_digest mismatch
- wip-18533 is now cleaned up and has two specific unit tests and a fuzzer which reproduce invalid iterator results.
- 01:50 AM Ceph Bug #18533: two instances of omap_digest mismatch
- Never mind, the bug can produce a more general set of errors than I had realized. See the more recent updates to the ...
- 12:28 AM Ceph Bug #18533: two instances of omap_digest mismatch
- If the entries David added a few days ago are the right ones, then the above bug doesn't explain what's happening in ...
- 12:13 AM Ceph Bug #18533: two instances of omap_digest mismatch
- David: Can you add the list of keys which are present on that node but shouldn't be?
02/08/2017
- 11:33 PM Ceph Bug #18533: two instances of omap_digest mismatch
- I'm pretty comfortable pinning the cluster trouble on that one, assuming the extra keys and the overlapping complete ...
- 11:30 PM Ceph Bug #18533: two instances of omap_digest mismatch
- wip-18533 above now has a unit test which causes the iterator to return a deleted value.
- 08:28 PM Ceph Bug #18533: two instances of omap_digest mismatch
- debugging: https://github.com/athanatos/ceph/tree/wip-18533
- 08:28 PM Ceph Bug #18533: two instances of omap_digest mismatch
- <davidzlap> sjust: 100011cf577.00000000
<davidzlap> sjust: I meant http://pastebin.com/19W78B6U
<sjust> davidzlap: ...
02/07/2017
- 12:31 AM Ceph Revision a00efd8d (ceph): Merge pull request #13280 from athanatos/wip-revert-jewel-18581
- Revert "Merge pull request #12978 from asheplyakov/jewel-18581"
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
- 12:28 AM Ceph Revision 0cf7a613 (ceph): Revert "Merge pull request #12978 from asheplyakov/jewel-18581"
- See: http://tracker.ceph.com/issues/18809
This reverts commit 8e69580c97622abfcbda73f92d9b6b6780be031f, reversing
ch...
02/06/2017
- 06:30 PM Ceph Backport #18724 (New): jewel: osd: calc_clone_subsets misuses try_read_lock vs missing
- I have reverted this backport, it needs to be backported with http://tracker.ceph.com/issues/18809 as well.
- 06:29 PM Ceph Bug #18583: osd: calc_clone_subsets misuses try_read_lock vs missing
- This needs to be backported with http://tracker.ceph.com/issues/18809 (not in master yet, wait on that)
02/03/2017
- 09:19 PM Ceph Backport #18610: kraken: osd: ENOENT on clone
- See http://tracker.ceph.com/issues/18809 as well (will want to backport the branch there, it has the commits from the...
- 09:12 PM Ceph Revision 91b74235 (ceph): osd/: don't leak context for Blessed*Context or RecoveryQueueAsync
- This has always been a bug, but until
68defc2b0561414711d4dd0a76bc5d0f46f8a3f8, nothing deleted those contexts
withou...
- 09:11 PM Ceph Bug #18809 (Resolved): FAILED assert(object_contexts.empty()) (live on master only from Jan-Feb 2...
- bless_context and bless_gencontext don't behave properly if the returned Context is deleted without calling complete(...
02/02/2017
- 12:45 AM Ceph Bug #18533: two instances of omap_digest mismatch
- I have copied the omap dirs for osds 72 (mira019:~samuelj/omap-osd-72), 7 (mira049:~samuelj/omap-osd-7), and 60 (mira...
- 12:03 AM Ceph Bug #18533: two instances of omap_digest mismatch
- ubuntu@mira049:~$ ( for i in {7..1}; do sudo zcat /var/log/ceph/ceph.log.$i.gz; done; sudo cat /var/log/ceph/ceph.log...
- 12:01 AM Ceph Bug #18533: two instances of omap_digest mismatch
- I suggest grabbing a copy of the leveldb instances from primary and a replica and examining the actual keys in the st...
- 12:00 AM Ceph Bug #18533: two instances of omap_digest mismatch
- samuelj@mira049:~$ ( for i in {7..1}; do sudo zcat /var/log/ceph/ceph.log.$i.gz; done; sudo cat /var/log/ceph/ceph.lo...
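The visible part of the shell commands above reads the rotated logs from oldest (`ceph.log.7.gz`) to newest (`ceph.log.1.gz`) and then the live `ceph.log`, producing one chronological stream. A minimal Python sketch of that ordering follows. The function name `iter_log_lines` and its parameters are hypothetical, and skipping missing files is an assumption of this sketch. What the original command piped the stream into is truncated in the log and is not reproduced here.

```python
import gzip
from pathlib import Path
from typing import Iterator


def iter_log_lines(log_dir: Path, name: str = "ceph.log",
                   rotations: int = 7) -> Iterator[str]:
    """Yield lines from name.N.gz down to name.1.gz, then from the live log.

    Missing files are skipped (an assumption of this sketch; zcat would
    report an error instead).
    """
    for i in range(rotations, 0, -1):
        rotated = log_dir / f"{name}.{i}.gz"
        if rotated.exists():
            # "rt" decompresses and decodes to text in one step.
            with gzip.open(rotated, "rt", errors="replace") as fh:
                yield from fh
    current = log_dir / name
    if current.exists():
        with current.open("rt", errors="replace") as fh:
            yield from fh
```

Since the function is a generator, the caller can filter or stop early without loading every rotated file into memory.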
02/01/2017
- 11:56 PM Ceph Bug #18533: two instances of omap_digest mismatch
- Whatever happened, happened in the last few days.
samuelj@mira049:~$ ( for i in {7..1}; do sudo zcat /var/log/ceph...
01/29/2017
- 04:59 AM Ceph Revision 509de4d9 (ceph): PrimaryLogPG::try_lock_for_read: give up if missing
- The only users, calc_*_subsets, might try to read_lock an object which is
missing on the primary. Returning false in t...
- 04:59 AM Ceph Revision cedaecf8 (ceph): ReplicatedBackend: take read locks for clone sources during recovery
- Otherwise, we run the risk of a clone source which hasn't actually
come into existence yet being used if we grab a cl...
01/26/2017
- 07:46 PM Ceph Revision 43e677dd (ceph): test/pybind/test_rados.py: tolerate empty output from mon ping
- Fixes: http://tracker.ceph.com/issues/18529
Signed-off-by: Samuel Just <sjust@redhat.com>