Activity
From 05/24/2017 to 06/22/2017
06/22/2017
- 07:57 PM Bug #20000: osd assert in shared_cache.hpp: 107: FAILED assert(weak_refs.empty())
- These osd assertion failures reproduce consistently on shutdown in the rgw:multisite suite.
- 06:30 PM Bug #20041: ceph-osd: PGs getting stuck in scrub state, stalling RBD
- Anecdotally, it looks like I may be running into this very same issue (or something similar) -- occasionally I have s...
- 05:46 PM Bug #19909: PastIntervals::check_new_interval: assert(lastmap->get_pools().count(pgid.pool()))
- Basically yes.
In src/mon/Session.h -> Subscription->next = -1 or 0.
I am learning C++ standard all the way but...
- 11:01 AM Bug #20381 (New): bluestore: deferred aio submission can deadlock with completion
- Turns out when something is marked as a duplicate in redmine, it automatically closes this one when I close the other...
- 11:00 AM Bug #20381 (Duplicate): bluestore: deferred aio submission can deadlock with completion
- This ticket was opened first, but let's close it in favour of 20381 because that one has the integration test logs.
- 10:53 AM Bug #20381: bluestore: deferred aio submission can deadlock with completion
- The backtrace looks exactly like the one in #20379 - duplicate?
- 10:41 AM Bug #20381 (Resolved): bluestore: deferred aio submission can deadlock with completion
- ...
- 11:00 AM Bug #20379 (Duplicate): bluestore assertion (KernelDevice.cc: 529: FAILED assert(r == 0))
- This ticket was opened first, but let's close it in favour of 20381 because that one has the integration test logs.
- 10:58 AM Bug #20379: bluestore assertion (KernelDevice.cc: 529: FAILED assert(r == 0))
- Updated title to make it clear that this isn't specific to vstart
- 10:52 AM Bug #20379: bluestore assertion (KernelDevice.cc: 529: FAILED assert(r == 0))
- Looks like the integration tests are hitting this as well.
- 09:25 AM Bug #20379 (Duplicate): bluestore assertion (KernelDevice.cc: 529: FAILED assert(r == 0))
- There's already a bug (with lots of dups) that seems to be what I'm seeing in a vstart.sh cluster. Since this bug is...
- 02:13 AM Bug #20274 (Resolved): rewind divergent deletes head whiteout
- 12:50 AM Bug #20375 (Fix Under Review): osd: omap threadpool heartbeat is only reset every 100 values
- https://github.com/ceph/ceph/pull/15823
06/21/2017
- 10:26 PM Bug #20331 (Rejected): osd/PGLog.h: 770: FAILED assert(i->prior_version == last)
- #20274 isn't merged yet, fixing it there.
- 10:20 PM Bug #20331: osd/PGLog.h: 770: FAILED assert(i->prior_version == last)
- This is fallout from 986a31f02e11d915a630cab17234ec4b8040609c, the #20274 fix. When we skip error entries the prior_...
- 10:06 PM Bug #20375 (Resolved): osd: omap threadpool heartbeat is only reset every 100 values
- This could potentially be after 100MB of reads. There's little cost to resetting the heartbeat timeout, so simply do ...
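The cheap-reset point above can be sketched as follows; all names here (the toy heartbeat, `iterate_omap`) are illustrative and are not the actual OSD threadpool code:

```python
import time

# toy heartbeat: just record when we last checked in (resetting is that cheap)
beats = []

def reset_heartbeat():
    beats.append(time.monotonic())

def iterate_omap(values):
    """Walk omap values, resetting the heartbeat on *every* value instead of
    once per 100 -- individual values can be large, so 100 of them may mean
    ~100MB of reads during which the thread would otherwise look hung."""
    count = 0
    for _ in values:
        reset_heartbeat()
        count += 1
    return count

n = iterate_omap(range(250))
```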
- 09:02 PM Bug #20358 (Resolved): bluestore: sharedblob not moved during split
- 08:42 PM Bug #19909 (New): PastIntervals::check_new_interval: assert(lastmap->get_pools().count(pgid.pool()))
- So I didn't follow it all the way through but it sure looks to me like our acting_primary input to the crashing seque...
- 09:13 AM Bug #19909: PastIntervals::check_new_interval: assert(lastmap->get_pools().count(pgid.pool()))
- Yes, i'm pretty sure it was 12.0.3. But, not on first boot, only after massive failures got me to stale+down PG statu...
- 07:52 PM Bug #20041: ceph-osd: PGs getting stuck in scrub state, stalling RBD
- Second reported case from mailing list of VMs locking up -- they also have VMs issuing periodic discards.
- 11:57 AM Bug #20041: ceph-osd: PGs getting stuck in scrub state, stalling RBD
- Shouldn't this one be flagged as a regression? It was working fine under firefly and hammer.
- 07:31 PM Bug #19943 (Resolved): osd: enoent on snaptrimmer
- 04:34 PM Bug #20169 (Fix Under Review): filestore+btrfs occasionally returns ENOSPC
- https://github.com/ceph/ceph/pull/15814
- 04:09 PM Bug #20169: filestore+btrfs occasionally returns ENOSPC
- I've seen xenial and centos failures now, no trusty yet.
- 04:07 PM Bug #20169: filestore+btrfs occasionally returns ENOSPC
- ...
- 04:09 PM Bug #20000: osd assert in shared_cache.hpp: 107: FAILED assert(weak_refs.empty())
- Also in http://qa-proxy.ceph.com/teuthology/yuriw-2017-06-21_01:02:43-rgw-master_2017_6_21-distro-basic-smithi/130726...
- 03:55 PM Bug #20360: rados/verify valgrind tests: osds fail to start (xenial valgrind)
- ok, valgrind is now restricted to centos again.
- 02:49 AM Bug #20360: rados/verify valgrind tests: osds fail to start (xenial valgrind)
- 03:46 PM Bug #20371: mgr: occasional fails to send beacons (monc reconnect backoff too aggressive?)
- It looks like it wasn't aggressive enough about reconnection to the mon:...
- 02:17 PM Bug #20371 (Resolved): mgr: occasional fails to send beacons (monc reconnect backoff too aggressi...
- for a while,...
- 01:48 PM Bug #20370 (New): leaked MOSDOp via PrimaryLogPG::_copy_some and PrimaryLogPG::do_proxy_write
- ...
- 01:43 PM Bug #20369 (New): segv in OSD::ShardedOpWQ::_process
- ...
- 12:01 PM Backport #20366 (In Progress): kraken: kraken-bluestore 11.2.0 memory leak issue
- 11:50 AM Backport #20366 (Resolved): kraken: kraken-bluestore 11.2.0 memory leak issue
- https://github.com/ceph/ceph/pull/15792
- 08:44 AM Bug #18924: kraken-bluestore 11.2.0 memory leak issue
- *master PR*: https://github.com/ceph/ceph/pull/15295
*kraken backport PR*: https://github.com/ceph/ceph/pull/15792
- 02:22 AM Bug #18924 (Pending Backport): kraken-bluestore 11.2.0 memory leak issue
- 02:21 AM Bug #18924 (Fix Under Review): kraken-bluestore 11.2.0 memory leak issue
- https://github.com/ceph/ceph/pull/15792
should help
- 02:34 AM Bug #20302 (Fix Under Review): "BlueStore.cc: 9023: FAILED assert(0 == "unexpected error")" in po...
- ...
- 02:31 AM Bug #20277 (Need More Info): bluestore crashed while performing scrub
- A bug was just fixed in the spanning blob code, see https://github.com/ceph/ceph/pull/15654. Are you able to reprodu...
- 02:23 AM Bug #20117 (Rejected): BlueStore.cc: 8585: FAILED assert(0 == "unexpected error")
- you need more log info to see what the actual error was. usually when i see this it's enospc...
- 02:12 AM Bug #19800 (Resolved): some osds are down when create a new pool and a new image of the pool (blu...
- This looks like rocksdb compaction, probably triggered in part by a big deletion. There was a recent fix to do reada...
06/20/2017
- 10:39 PM Bug #18681: ceph-disk prepare/activate misses steps and fails on [Bluestore]
- If you don't use the GPT partition labels/types that ceph-disk uses then the device ownership won't be changed to cep...
- 10:35 PM Bug #19983 (Need More Info): osds abort on shutdown with assert(/build/ceph-12.0.2/src/os/bluesto...
- Do you mean you pulled out the disk, and then ceph-osd crashed? That is normal--the disk is gone!
Or, do you mean...
- 09:15 PM Bug #20360: rados/verify valgrind tests: osds fail to start (xenial valgrind)
- https://github.com/ceph/ceph/pull/15791
- 09:07 PM Bug #20360: rados/verify valgrind tests: osds fail to start (xenial valgrind)
- related? also started seeing these:...
- 08:32 PM Bug #20360 (New): rados/verify valgrind tests: osds fail to start (xenial valgrind)
- ...
- 08:55 PM Bug #19299 (New): Jewel -> Kraken: OSD boot takes 1+ hours, unusually high CPU
- Ping Sage, you got that subprocess strace data.
- 06:45 PM Bug #19299: Jewel -> Kraken: OSD boot takes 1+ hours, unusually high CPU
- Same problem here (fresh 12.0.3). Got OSD's behind by > 5000 maps, it took ~8 hours to get them booted.
Looking in...
- 08:52 PM Bug #19700: OSD remained up despite cluster network being inactive?
- Sounds like we messed up the way cluster network heartbeating and the monitor's public network connection to the OSDs...
- 06:35 PM Bug #19700: OSD remained up despite cluster network being inactive?
- The cluster does not need to be performing any IO, other than normal peering and checking, and this will still happen...
- 08:50 PM Bug #19909: PastIntervals::check_new_interval: assert(lastmap->get_pools().count(pgid.pool()))
- red ref, are you saying you created a brand-new cluster with 12.0.3 and saw this on first boot?
Sage, do you think...
- 06:30 PM Bug #19909: PastIntervals::check_new_interval: assert(lastmap->get_pools().count(pgid.pool()))
- I can confirm the second behavior ("failed to load OSD map for epoch 1") in native installed 12.0.3 (not in productio...
- 06:20 PM Bug #19909 (Won't Fix): PastIntervals::check_new_interval: assert(lastmap->get_pools().count(pgid...
- What Greg said! :)
- 04:52 PM Bug #19909: PastIntervals::check_new_interval: assert(lastmap->get_pools().count(pgid.pool()))
- N.0.Y releases such as 12.0.2 are dev releases; you should not run them if you can't afford to rebuild them. Upgrades...
- 08:24 PM Bug #20227 (Resolved): os/bluestore/BlueStore.cc: 2617: FAILED assert(0 == "can't mark unloaded s...
- 02:56 AM Bug #20227 (Fix Under Review): os/bluestore/BlueStore.cc: 2617: FAILED assert(0 == "can't mark un...
- https://github.com/ceph/ceph/pull/15766
- 02:54 AM Bug #20227: os/bluestore/BlueStore.cc: 2617: FAILED assert(0 == "can't mark unloaded shard dirty")
- ...
- 02:50 AM Bug #20227: os/bluestore/BlueStore.cc: 2617: FAILED assert(0 == "can't mark unloaded shard dirty")
- /a/sage-2017-06-19_18:44:38-rbd:qemu-master---basic-smithi/1301319
- 08:16 PM Bug #20169: filestore+btrfs occasionally returns ENOSPC
- /a/sage-2017-06-20_16:21:45-rados-wip-sage-testing2-distro-basic-smithi/1305525
rados/thrash/{0-size-min-size-overri...
- 06:27 PM Bug #20303: filejournal: Unable to read past sequence ... journal is corrupt
- 06:15 PM Bug #20343: Jewel: OSD Thread time outs in XFS
- The filestore-level splitting and merging isn't in the logs - the best way to tell is examining a pg's directory - e....
- 05:32 PM Bug #20343: Jewel: OSD Thread time outs in XFS
- We looked through the mon logs and we can't really find any splitting (or merging) pg states in there. Do we need to...
- 12:34 AM Bug #20343: Jewel: OSD Thread time outs in XFS
- This could be filestore splitting directories into multiple subdirectories when there are many objects, then merging ...
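For reference, the filestore split point works out roughly as sketched below. This is based on the documented `filestore_split_multiple` / `filestore_merge_threshold` options; the defaults shown are assumptions to illustrate the arithmetic, so check the running config before relying on the numbers:

```python
def split_point(filestore_split_multiple, filestore_merge_threshold):
    """A directory holding a PG's objects is split into subdirectories once it
    exceeds roughly this many objects; subdirectories are merged back when
    they fall below about abs(filestore_merge_threshold) * 16 objects."""
    return filestore_split_multiple * abs(filestore_merge_threshold) * 16

# with the long-standing defaults (split_multiple=2, merge_threshold=10)
# a directory splits at 320 objects
default_split = split_point(2, 10)
```

Both split and merge are bursts of renames on XFS, which is consistent with the thread timeouts reported here.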
- 06:12 PM Bug #19943 (Fix Under Review): osd: enoent on snaptrimmer
- https://github.com/ceph/ceph/pull/15787
- 06:02 PM Bug #19943: osd: enoent on snaptrimmer
- no, i'm an idiot, ceph-objectstore-tool is doing it and it's noted in a different log file. sheesh.
- 01:43 PM Bug #19943: osd: enoent on snaptrimmer
- confirmed same thing in another run. on osd startup, fsck shows the key that was deleted....
- 04:33 PM Bug #20301: "/src/osd/SnapMapper.cc: 231: FAILED assert(r == -2)" in rados
- also in http://qa-proxy.ceph.com/teuthology/yuriw-2017-06-20_00:37:23-rados-master-2017_6_20-distro-basic-smithi/1302...
- 03:56 PM Bug #20358 (Fix Under Review): bluestore: sharedblob not moved during split
- https://github.com/ceph/ceph/pull/15783
- 03:54 PM Bug #20358 (Resolved): bluestore: sharedblob not moved during split
- ...
- 01:22 PM Bug #19960: overflow in client_io_rate in ceph osd pool stats
- Bug is not reproducible after this commit (not sure that only one contains fix):
commit d6d1db62edeb4c40a774fcb56e...
06/19/2017
- 11:05 PM Bug #20273 (Resolved): osd/OSD.h: 1957: FAILED assert(peering_queue.empty())
- 10:47 PM Bug #20041: ceph-osd: PGs getting stuck in scrub state, stalling RBD
- from thread: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-May/017869.html
[15:41:40] <jdillaman> greg...
- 10:25 PM Bug #20343: Jewel: OSD Thread time outs in XFS
- That IO pattern may just be killing the OSD on its own, but I'm not sure what RGW is turning it into or if there's st...
- 07:16 PM Bug #20343 (New): Jewel: OSD Thread time outs in XFS
- Creating a tracker ticket following suggestion from mailing list:
"
We've been having this ongoing problem with...
- 09:12 PM Bug #19960 (Resolved): overflow in client_io_rate in ceph osd pool stats
- If it's just one or two commits, we could backport (please fill in the Backport field in that case). But 131 commits?
- 09:11 PM Bug #19960: overflow in client_io_rate in ceph osd pool stats
- Aleksei: Please be more specific. PR#15073 has 131 commits - see https://github.com/ceph/ceph/pull/15073/commits
- 07:55 PM Bug #20169: filestore+btrfs occasionally returns ENOSPC
- http://pulpito.ceph.com/jdillaman-2017-05-25_16:48:38-rbd-wip-jd-testing-distro-basic-smithi/1229611
- 07:55 PM Bug #20092 (Duplicate): ceph-osd: FileStore::_do_transaction: assert(0 == "unexpected error")
- Oh, that's probably the new thing where btrfs is giving us ENOENT (Sage guessing it's about rocksdb and snapshots). T...
- 12:26 PM Bug #20092 (Rejected): ceph-osd: FileStore::_do_transaction: assert(0 == "unexpected error")
- The osd.1 log showed the rocksdb encountered a full disk:
-17> 2017-05-25 22:14:28.664403 7fb70cd9b700 -1 rocks...
- 07:51 PM Bug #20326 (Resolved): Scrubbing terminated -- not all pgs were active and clean.
- 06:45 PM Bug #20227: os/bluestore/BlueStore.cc: 2617: FAILED assert(0 == "can't mark unloaded shard dirty")
- reliably triggered, it seems, by rbd/qemu xfstests workload
- 06:45 PM Bug #19882 (Resolved): rbd/qemu: [ERR] handle_sub_read: Error -2 reading 1:e97125f5:::rbd_data.0....
- 05:43 PM Bug #19943: osd: enoent on snaptrimmer
- ...
- 03:30 PM Bug #18681: ceph-disk prepare/activate misses steps and fails on [Bluestore]
- Moving this to the RADOS bluestore tracker since it's probably owned by that team.
- 11:55 AM Bug #19939: OSD crash in MOSDRepOpReply::decode_payload
- ...
- 10:54 AM Bug #19939: OSD crash in MOSDRepOpReply::decode_payload
- Unless there was a patch, I wouldn't be too sure this is fixed -- it was an intermittent failure.
- 10:48 AM Bug #19939: OSD crash in MOSDRepOpReply::decode_payload
- all passed modulo a valgrind error in ceph-mds, see /a/kchai-2017-06-19_09:40:27-fs-master---basic-smithi/1300881/rem...
- 09:41 AM Bug #19939: OSD crash in MOSDRepOpReply::decode_payload
- rerunning at http://pulpito.ceph.com/kchai-2017-06-19_09:40:27-fs-master---basic-smithi/
- 08:14 AM Feature #15835 (Fix Under Review): filestore: randomize split threshold
06/18/2017
- 08:36 AM Bug #20332: rados bench seq option doesn't work
- Did you actually write out some data for it to read first? "seq" is just pulling back whatever was written down in th...
- 08:28 AM Bug #20303: filejournal: Unable to read past sequence ... journal is corrupt
- Bumping this priority up since it's an assert on read of committed data, rather than a simple disk write error.
- 08:24 AM Bug #20295: bluestore: Timeout in tp_osd_tp threads when running RBD bench in EC pool w/ overwrites
- Sounds like we need some way of more reliably accounting for the extra cost of EC overwrites in our throttle limits.
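One way to picture "accounting for the extra cost" is a per-op throttle that charges EC overwrites more than plain writes, since an EC overwrite implies a read-modify-write across shards. Everything below is a hypothetical sketch (the names and the 3x multiplier are made up, not OSD internals):

```python
def op_cost(size_bytes, is_ec_overwrite, ec_multiplier=3):
    """Weight an op for throttling; EC overwrites cost extra because of the
    cross-shard read-modify-write (multiplier is illustrative)."""
    cost = size_bytes
    if is_ec_overwrite:
        cost *= ec_multiplier
    return cost

class Throttle:
    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def try_acquire(self, cost):
        # caller should queue/wait on False instead of stalling an op thread
        if self.used + cost > self.limit:
            return False
        self.used += cost
        return True

t = Throttle(limit=10_000_000)
ok_rep = t.try_acquire(op_cost(4_000_000, is_ec_overwrite=False))  # fits
ok_ec = t.try_acquire(op_cost(4_000_000, is_ec_overwrite=True))    # 12MB effective
```

With the multiplier applied, the same client workload consumes the throttle budget much faster on an EC pool, which is the behavior the limits would need to reflect.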
06/17/2017
- 09:19 PM Bug #20188: filestore: os/filestore/FileStore.h: 357: FAILED assert(q.empty()) from ceph_test_obj...
- This testing branch didn't include any of the filestore improvements we've been getting, did it?
- 09:18 PM Bug #19943: osd: enoent on snaptrimmer
- /a/sage-2017-06-17_13:41:40-rados-wip-sage-testing-distro-basic-smithi/1297478
- 09:16 PM Bug #20169: filestore+btrfs occasionally returns ENOSPC
- Do we have any idea why it hasn't popped up in leveldb? Is the multi-threading stuff less conducive to being snapshot...
- 09:14 PM Bug #20134 (Rejected): test_rados.TestIoctx.test_aio_read AssertionError: 5 != 2
- 5 is EIO. That's not an error code we produce, but it's a possibility until David's stuff preventing us from returning...
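The two numbers in the failed assertion are errno values, which can be confirmed from Python's errno module:

```python
import errno

# "5 != 2": the aio read came back with EIO (5) where the test
# expected ENOENT (2)
assert errno.EIO == 5
assert errno.ENOENT == 2

eio_name = errno.errorcode[5]      # maps value -> symbolic name
enoent_name = errno.errorcode[2]
```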
- 09:10 PM Bug #20326: Scrubbing terminated -- not all pgs were active and clean.
- https://github.com/ceph/ceph/pull/15747
- 09:09 PM Bug #20116: osds abort on shutdown with assert(ceph/src/osd/OSD.cc: 4324: FAILED assert(curmap))
- Are there more logs or core dumps available around this? That backtrace looks serious but doesn't contain enough info...
- 09:05 PM Support #20108 (Resolved): PGs are not remapped correctly when one host fails
- Okay, as described (and especially since it's better in jewel) this is almost certainly about CRUSH max_retries. I'm ...
- 06:18 PM Bug #20242: Make osd-scrub-repair.sh unit test run faster
- I'm looking into making this test run faster as well as a couple of the other slow ones by splitting them up into sma...
- 06:18 PM Bug #19639 (Can't reproduce): mon crash on shutdown
- 05:52 PM Bug #19639: mon crash on shutdown
- I haven't seen this happen again in recent memory.
- 05:25 AM Bug #19639: mon crash on shutdown
- Turning this down; should close if we don't get it happening again.
- 02:59 PM Bug #19909: PastIntervals::check_new_interval: assert(lastmap->get_pools().count(pgid.pool()))
- A month has passed and I'm still not able to figure out where the problem was, nor am I able to recover my cluster. Trying ...
- 01:47 PM Bug #19909: PastIntervals::check_new_interval: assert(lastmap->get_pools().count(pgid.pool()))
- I presume this was a bug in the older dev releases, but we should verify that before release.
- 02:26 PM Bug #20099 (Need More Info): osd/filestore: osd/PGLog.cc: 911: FAILED assert(last_e.version.versi...
- Does this still exist or is it all cleaned up now? The repeating versions is a little weird but that's not enough dat...
- 02:22 PM Bug #20092: ceph-osd: FileStore::_do_transaction: assert(0 == "unexpected error")
- Do you have any evidence this *wasn't* an unexpected error given to us by the Filesystems, Jason? That does happen in...
- 02:15 PM Bug #20059: miscounting degraded objects
- Maybe we count each instance of an object when it's degraded (i.e., 3x for replicated pools), but the non-degraded on...
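The suspected arithmetic can be written out explicitly; the function and numbers below are hypothetical, matching the guess above rather than verified OSD accounting code:

```python
def degraded_summary(num_objects, replicas, degraded_objects):
    """Reproduce the suspected miscount: each degraded object is charged once
    per replica, while the object total counts objects (not object copies)."""
    degraded = degraded_objects * replicas   # 3x for a size-3 replicated pool
    return degraded, num_objects

deg, total = degraded_summary(num_objects=1000, replicas=3, degraded_objects=400)
# per-instance counting yields more "degraded" than there are objects,
# i.e. a >100% degraded ratio, which is the symptom of the miscount
```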
- 01:43 PM Bug #19882: rbd/qemu: [ERR] handle_sub_read: Error -2 reading 1:e97125f5:::rbd_data.0.10251ca0c5f...
- Is this the read of partially-written EC extents? Need some context if it's in Testing...
- 01:36 PM Bug #20227: os/bluestore/BlueStore.cc: 2617: FAILED assert(0 == "can't mark unloaded shard dirty")
- http://pulpito.ceph.com/sage-2017-06-16_19:23:03-rbd:qemu-wip-19882---basic-smithi/
reliably reproduced by rbd/qemu
- 05:50 AM Bug #19737: EAGAIN encountered during pg scrub (jewel)
- (Optimistically sorting it as a test issue.)
- 05:50 AM Bug #19737: EAGAIN encountered during pg scrub (jewel)
- Is the message that the primary OSD is down incorrect? We've seen a few things like this that are test bugs around ha...
- 05:45 AM Bug #19700 (Need More Info): OSD remained up despite cluster network being inactive?
- 05:42 AM Bug #19695: mon: leaked session
- Has this reproduced? I thought valgrind was clean enough we notice new leaks.
- 05:19 AM Bug #19518: log entry does not include per-op rvals?
- Have we *ever* filled in the per-op rvalues on retry? That sounds distressingly like returning read data on a write o...
- 05:15 AM Bug #19487 (In Progress): "GLOBAL %RAW USED" of "ceph df" is not consistent with check_full_status
- Based on PR comments we expect this to be fixed up by one of David's disk handling branches. Or did that one already...
- 03:52 AM Bug #19939: OSD crash in MOSDRepOpReply::decode_payload
- John, sorry. i missed this. will take a look at it next monday.
- 02:34 AM Bug #19486: Rebalancing can propagate corrupt copy of replicated object
- That is an interesting point about BlueStore; it will detect corruption but not manual edits...
- 02:23 AM Bug #19400 (Resolved): add more info during pool delete error
- 12:26 AM Bug #20332 (Won't Fix): rados bench seq option doesn't work
For some reason "seq" option finishes too quickly....
06/16/2017
- 09:10 PM Bug #20227: os/bluestore/BlueStore.cc: 2617: FAILED assert(0 == "can't mark unloaded shard dirty")
- /a/sage-2017-06-16_18:45:23-rados-wip-sage-testing-distro-basic-smithi/1293630
- 01:40 PM Bug #20227: os/bluestore/BlueStore.cc: 2617: FAILED assert(0 == "can't mark unloaded shard dirty")
- ...
- 09:10 PM Bug #20331 (Rejected): osd/PGLog.h: 770: FAILED assert(i->prior_version == last)
- ...
- 07:44 PM Bug #20000 (Need More Info): osd assert in shared_cache.hpp: 107: FAILED assert(weak_refs.empty())
- 07:44 PM Bug #20000: osd assert in shared_cache.hpp: 107: FAILED assert(weak_refs.empty())
- Could be... maybe also #20273?
- 02:56 AM Bug #20000: osd assert in shared_cache.hpp: 107: FAILED assert(weak_refs.empty())
- we found that the msg threads are still working after the `delete osd` in an asyncmsg env; it's because the asyncmsg::wait() ...
- 07:41 PM Bug #20274: rewind divergent deletes head whiteout
- 01:39 PM Bug #20169: filestore+btrfs occasionally returns ENOSPC
- /a/sage-2017-06-16_00:46:50-rados-wip-sage-testing-distro-basic-smithi/1292433
rados/thrash-erasure-code/{ceph.yam...
- 01:49 AM Bug #20169: filestore+btrfs occasionally returns ENOSPC
- /a/kchai-2017-06-15_17:39:27-rados-wip-kefu-testing---basic-smithi/1291475 also with rocksdb + btrfs
- 06:39 AM Bug #14088 (In Progress): mon: nothing logged when ENOSPC encountered during start up
- https://github.com/ceph/ceph/pull/15723 - merged
- 05:54 AM Bug #19320: Pg inconsistent make ceph osd down
- Hmm, did one of our official releases have the broken snapshot trimming backport semantics? I didn't think so bu...
- 04:05 AM Bug #20256 (Resolved): "ceph osd df" is broken; asserts out on Luminous-enabled clusters
- 02:30 AM Bug #20326 (In Progress): Scrubbing terminated -- not all pgs were active and clean.
- ...
- 01:29 AM Bug #20326 (New): Scrubbing terminated -- not all pgs were active and clean.
- 01:03 AM Bug #20326 (Resolved): Scrubbing terminated -- not all pgs were active and clean.
- ...
- 12:42 AM Bug #20105: LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify3/0 failure
- /a//kchai-2017-06-15_17:39:27-rados-wip-kefu-testing---basic-smithi/1291451
06/15/2017
- 09:42 PM Bug #19882: rbd/qemu: [ERR] handle_sub_read: Error -2 reading 1:e97125f5:::rbd_data.0.10251ca0c5f...
- 06:04 PM Bug #19882: rbd/qemu: [ERR] handle_sub_read: Error -2 reading 1:e97125f5:::rbd_data.0.10251ca0c5f...
- /a/teuthology-2017-06-15_02:01:02-rbd-master-distro-basic-smithi/1287766
rbd/qemu/{cache/writeback.yaml clusters/{fi...
- 05:59 PM Bug #20273 (Fix Under Review): osd/OSD.h: 1957: FAILED assert(peering_queue.empty())
- https://github.com/ceph/ceph/pull/15710
- 05:53 PM Bug #20273: osd/OSD.h: 1957: FAILED assert(peering_queue.empty())
- - handle_osd_map queued a write, with _write_committed as callback
- thread pools all shut down, including peering_w...
06/14/2017
- 08:36 PM Bug #20256: "ceph osd df" is broken; asserts out on Luminous-enabled clusters
- 08:20 PM Bug #20303 (Can't reproduce): filejournal: Unable to read past sequence ... journal is corrupt
- Run: http://pulpito.ceph.com/teuthology-2017-06-14_15:26:27-powercycle-master-distro-basic-smithi/
Job: 1285933
Log...
- 08:18 PM Bug #20302 (Resolved): "BlueStore.cc: 9023: FAILED assert(0 == "unexpected error")" in powercycle...
- Run: http://pulpito.ceph.com/teuthology-2017-06-14_15:26:27-powercycle-master-distro-basic-smithi/
Job: 1285969
Log...
- 07:52 PM Bug #20301 (Can't reproduce): "/src/osd/SnapMapper.cc: 231: FAILED assert(r == -2)" in rados
- Run: http://pulpito.ceph.com/yuriw-2017-06-14_15:02:07-rados-master_2017_6_14-distro-basic-smithi/
Job: 1285768
Log...
- 06:46 PM Bug #19943 (In Progress): osd: enoent on snaptrimmer
- 02:12 PM Bug #19943: osd: enoent on snaptrimmer
- log with more debugging at /a/sage-2017-06-14_03:38:53-rados:thrash-wip-19943---basic-smithi/1284145/ceph-osd.5.log
- 03:38 AM Bug #19943: osd: enoent on snaptrimmer
- WTH. I've seen two cases where the object exists in snapmapper in a different pool (cache tiering), but I think this is...
- 04:26 PM Bug #17806 (Resolved): OSD: do not open pgs when the pg is not in pg_map
- 10:01 AM Bug #17806: OSD: do not open pgs when the pg is not in pg_map
- The PR is merged to upstream. https://github.com/ceph/ceph/pull/11803. So please close it. Thanks.
- 03:54 AM Bug #17806: OSD: do not open pgs when the pg is not in pg_map
- Without more details I'm not sure this assessment is actually correct...
- 02:34 PM Bug #20295 (Resolved): bluestore: Timeout in tp_osd_tp threads when running RBD bench in EC pool ...
- When running "rbd bench-write" using an RBD image stored in an EC pool, the some OSD threads start to timeout and eve...
- 01:44 PM Bug #16890: rbd diff outputs nothing when the image is layered and with a writeback cache tier
- RBD isn't doing anything special with regard to cache tiering. It sounds like the whiteout in the cache tier is not r...
- 03:35 AM Bug #16890: rbd diff outputs nothing when the image is layered and with a writeback cache tier
- Jason, can you make sure you expect this to work from an RBD perspective and throw it into the RADOS project if so? :)
- 01:32 PM Feature #15835: filestore: randomize split threshold
- https://github.com/ceph/ceph/pull/15689
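The point of the feature is to jitter each collection's split point so PG directories don't all split in the same window. A hedged sketch of one way to do that (the formula and names below are illustrative, only loosely mirroring the actual `filestore_split_rand_factor` option):

```python
import random

def randomized_split_point(split_multiple, merge_threshold, rand_factor,
                           rng=random):
    """Start from the deterministic split point and scale it by a random
    factor in [1, 1 + rand_factor), chosen per collection, so collections
    split at different sizes instead of all at once."""
    base = split_multiple * abs(merge_threshold) * 16
    return int(base * (1 + rng.random() * rand_factor))

rng = random.Random(42)  # seeded so the sketch is reproducible
points = {randomized_split_point(2, 10, rand_factor=1.0, rng=rng)
          for _ in range(100)}
```

Each collection drawing its own threshold spreads the split-induced rename storms out over time instead of stalling every PG simultaneously.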
- 09:01 AM Backport #17445: jewel: list-snap cache tier missing promotion logic (was: rbd cli segfault when ...
- Greg Farnum wrote:
> Note the second reporter confirms this is with cache tiering. Rather suspect that's got more to...
- 03:46 AM Backport #17445: jewel: list-snap cache tier missing promotion logic (was: rbd cli segfault when ...
- Note the second reporter confirms this is with cache tiering. Rather suspect that's got more to do with it than snaps...
- 05:27 AM Bug #18930: received Segmentation fault in PGLog::IndexedLog::add
- Don't suppose there's still a log or core dump associated with this?
- 04:46 AM Bug #14088: mon: nothing logged when ENOSPC encountered during start up
- No, just scrubbing and trying to get things in a realistic state.
- 04:08 AM Bug #14088: mon: nothing logged when ENOSPC encountered during start up
- Greg, No, but I can try and take a look in the next few days if you'd like?
- 12:46 AM Bug #14088: mon: nothing logged when ENOSPC encountered during start up
- Brad, did you do any work on this?
- 04:35 AM Bug #18752: LibRadosList.EnumerateObjects failure
- Hasn't reproduced yet.
- 04:27 AM Bug #18328 (Closed): crush: flaky unitest:
- 04:13 AM Bug #18021 (Duplicate): Assertion "needs_recovery" fails when balance_read reaches a replica OSD ...
- These are the same thing, right?
- 04:11 AM Bug #17968: Ceph:OSD can't finish recovery+backfill process due to assertion failure
- https://github.com/ceph/ceph/pull/15489#issuecomment-308152157
- 04:09 AM Bug #17949 (Resolved): make check: unittest_bit_alloc get_used_blocks() >= 0
- Linked PR is not merged but has a comment the race condition fix was merged.
- 04:03 AM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
- David, do we have any idea why this is failing? I'm not getting any idea from what's in the comments here.
- 03:51 AM Bug #17718: EC Overwrites: update ceph-objectstore-tool export/import to handle rollforward/rollback
- Josh, is this still outstanding? I presume we need it for testing...
- 03:02 AM Bug #16385 (Fix Under Review): rados bench seq and rand tests do not work if op_size != object_size
- One of the stuck PRs:
https://github.com/ceph/ceph/pull/12203
- 02:59 AM Bug #16379 (Closed): [ERROR ] "ceph auth get-or-create for keytype admin returned -1
- It's been a year without updates and tests are more or less working, so this must be fixed.
- 02:56 AM Bug #16365 (Resolved): Better network partition detection
- We're switching to 2KB heartbeat packets now for other reasons. I don't think there's much else we can do here, pract...
- 01:37 AM Bug #16177 (Closed): leveldb horrendously slow
- Adam's cluster got cleaned up; the MDS doesn't allow you to generate directory omaps that large anymore; RGW is doing...
- 12:43 AM Bug #13493: osd: for ec, cascading crash during recovery if one shard is corrupted
- I suspect this is being resolved by David's work on EIO handling?
- 12:02 AM Bug #20283 (New): qa: missing even trivial tests for many commands
- I wrote a trivial script to look for missing commands in tests (https://github.com/ceph/ceph/pull/15675/commits/3aad0...
06/13/2017
- 11:38 PM Bug #20256: "ceph osd df" is broken; asserts out on Luminous-enabled clusters
- https://github.com/ceph/ceph/pull/15675
- 10:00 PM Bug #13111: replicatedPG:the assert occurs in the fuction ReplicatedPG::on_local_recover.
- I don't really get how the AsyncMessenger could have caused this issue...?
- 09:50 PM Bug #12659 (Closed): Can't delete cache pool
- Closing due to lack of updates and various changes in cache pools since .94.
- 09:48 PM Bug #12615: Repair of Erasure Coded pool with an unrepairable object causes pg state to lose clea...
- David, is this still an issue?
- 08:53 AM Bug #20277: bluestore crashed while performing scrub
- What happened (twice) was:
* the osd had a crc error inconsistent pg
* set debug-bluestore and debug-osd to 20
* t...
- 08:21 AM Bug #20277 (Can't reproduce): bluestore crashed while performing scrub
- ...
- 03:07 AM Bug #20274: rewind divergent deletes head whiteout
- https://github.com/ceph/ceph/pull/15649
- 02:54 AM Bug #20274 (Resolved): rewind divergent deletes head whiteout
- ...
- 03:00 AM Bug #19943: osd: enoent on snaptrimmer
- with snap trim whiteout fix applied,
/a/sage-2017-06-12_20:56:37-rados-wip-sage-testing-distro-basic-smithi/128066...
- 02:59 AM Bug #20227: os/bluestore/BlueStore.cc: 2617: FAILED assert(0 == "can't mark unloaded shard dirty")
- /a/sage-2017-06-12_20:56:37-rados-wip-sage-testing-distro-basic-smithi/1280581
has full log...
- 02:33 AM Bug #20169: filestore+btrfs occasionally returns ENOSPC
- ...
- 02:28 AM Bug #20273 (Resolved): osd/OSD.h: 1957: FAILED assert(peering_queue.empty())
- ...
06/12/2017
- 04:35 PM Bug #20256: "ceph osd df" is broken; asserts out on Luminous-enabled clusters
- So obviously what happened is I thought we had moved the osd df command into the monitor, but that didn't actually ha...
- 04:33 PM Bug #20256 (Resolved): "ceph osd df" is broken; asserts out on Luminous-enabled clusters
- I got a private email report:
When doing ‘ceph osd df’, ceph-mon always crashes. The stack info is as follows:...
- 08:46 AM Bug #18043: ceph-mon prioritizes public_network over mon_host address
- Thanks for the update, I look forward to seeing your PR :).
06/11/2017
- 07:52 PM Bug #13146 (Resolved): mon: creating a huge pool triggers a mon election
- We're throttling PG creates now.
- 07:28 PM Bug #11907: crushmap validation must not block the monitor
- Don't we internally time out crush map testing now? Does it behave sensibly if things take too long?
- 07:21 PM Bug #9523 (Closed): Both op threads and dispatcher threads could be stuck at acquiring the budget...
- Based on the PR discussion it seems the diagnosed issue wasn't the cause of the slowness. Closing since it hasn't (kn...
06/09/2017
- 07:51 PM Bug #20243 (Resolved): Improve size scrub error handling and ignore system attrs in xattr checking
Something similar to this was seen on a production system. If all the object_info_t matched there would be no erro...
- 06:39 PM Bug #20242 (Resolved): Make osd-scrub-repair.sh unit test run faster
Most likely move some tests to the rados suite.
- 01:26 AM Bug #20169: filestore+btrfs occasionally returns ENOSPC
- ugh just saw this on xenial too. hrm.
/a/sage-2017-06-08_20:27:41-rados-wip-sage-testing2-distro-basic-smithi/127...
06/08/2017
- 06:52 PM Bug #20227 (Need More Info): os/bluestore/BlueStore.cc: 2617: FAILED assert(0 == "can't mark unlo...
- Hmm, I see the fault_range call (it's in the new ec unclone code), but it's only dirtying the range including extents...
- 06:18 PM Bug #20227: os/bluestore/BlueStore.cc: 2617: FAILED assert(0 == "can't mark unloaded shard dirty")
- /a/sage-2017-06-08_02:04:29-rados-wip-sage-testing-distro-basic-smithi/1269367 too
- 06:14 PM Bug #20227 (Resolved): os/bluestore/BlueStore.cc: 2617: FAILED assert(0 == "can't mark unloaded s...
- ...
- 06:44 PM Bug #20221: kill osd + osd out leads to stale PGs
- @Greg the original bug description was updated with a simpler reproducer which does not involve copying objects. I be...
- 06:34 PM Bug #20221: kill osd + osd out leads to stale PGs
- Right, but what you've said here is that if you have pool size one, and kill the only OSD hosting it, then no other O...
- 02:58 PM Bug #20221: kill osd + osd out leads to stale PGs
- FWIW it was reproduced by badone.
- 12:20 PM Bug #20221: kill osd + osd out leads to stale PGs
- @Greg the first reproducer was not trying to rados put the same object. It was trying to rados put another object. I ...
- 12:18 PM Bug #20221: kill osd + osd out leads to stale PGs
- The reproducer works as expected on 12.0.3. The behavior changed somewhere in master after 12.0.3 was released.
- 12:17 PM Bug #20221: kill osd + osd out leads to stale PGs
- I don't understand what behavior you're looking for. Hanging is the expected behavior when data is unavailable.
- 10:07 AM Bug #20221 (New): kill osd + osd out leads to stale PGs
- h3. description
When the OSD is killed before ceph osd out, the PGs stay in stale state.
h3. reproducer
From... - 05:53 PM Bug #19960 (Pending Backport): overflow in client_io_rate in ceph osd pool stats
- 03:14 PM Bug #19960: overflow in client_io_rate in ceph osd pool stats
- > By which commit/PR?
554cf8394a9ac4f845c1fce03dd1a7f551a414a9
Merge pull request #15073 from liewegas/wip-mgr-stats
- 11:00 AM Bug #18746: monitors crashing ./include/interval_set.h: 355: FAILED assert(0) (jewel+kraken)
- Hi Greg,
Thank you for taking the time to look into this.
Following the incident of the present ticket the clus...
06/07/2017
- 08:57 PM Bug #19943: osd: enoent on snaptrimmer
- /a/sage-2017-06-07_16:25:35-rados-wip-sage-testing2-distro-basic-smithi/1268182
rados/thrash-erasure-code/{ceph.ya...
- 02:03 AM Bug #19943: osd: enoent on snaptrimmer
- /a/sage-2017-06-06_21:54:14-rados-wip-sage-testing-distro-basic-smithi/1265627
rados/thrash/{0-size-min-size-overr...
- 08:11 PM Documentation #20215 (New): librados documentation improvement for the use cases
- librados documentation improvement for the use cases including the tradeoffs of object size, i/o rate, and omap vs re...
- 04:44 PM Bug #18696: OSD might assert when LTTNG tracing is enabled
- Wonder if this PR https://github.com/ceph/ceph/pull/14304 fixes this issue as well.
- 04:01 PM Bug #18750: handle_pg_remove: pg_map_lock held for write when taking pg_lock
- I think I remember this one and it wasn't really feasible to fix (at the time). If doing code inspection you'll want ...
- 03:59 PM Bug #18746: monitors crashing ./include/interval_set.h: 355: FAILED assert(0) (jewel+kraken)
- Pretty weird, that assert appears to be an internal interval_set consistency thing: https://github.com/ceph/ceph/blob...
- 03:58 PM Bug #19198: Bluestore doubles mem usage when caching object content
- 03:50 PM Bug #18667: [cache tiering] omap data time-traveled to stale version
- Jason says this "seems to pop up randomly every few weeks or so", so it's definitely a live, going concern. :(
- 03:40 PM Bug #19086 (Rejected): BlockDevice::create should add check for readlink result instead of raise ...
- 03:36 PM Bug #18647: ceph df output with erasure coded pools
- Let's verify this prior to Luminous and write a test for it!
- 03:29 PM Bug #19023 (Fix Under Review): ceph_test_rados invalid read caused apparently by lost intervals d...
- https://github.com/ceph/ceph/pull/15555
- 01:23 PM Bug #19960: overflow in client_io_rate in ceph osd pool stats
- Aleksei Gutikov wrote:
> fixed in master
By which commit/PR?
- 12:04 PM Bug #19960: overflow in client_io_rate in ceph osd pool stats
- fixed in master
- 09:28 AM Bug #19783 (New): upgrade tests failing with "AssertionError: failed to complete snap trimming be...
- 06:34 AM Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
- Zengran Zhang wrote:
> 2017-05-19 22:48:23.854608 7f14f1c1e700 0 -- 10.10.133.1:6823/2019 >> 10.10.133.1:6819/19544...
- 02:04 AM Bug #20000: osd assert in shared_cache.hpp: 107: FAILED assert(weak_refs.empty())
- ...
- 02:02 AM Bug #20169: filestore+btrfs occasionally returns ENOSPC
- /a/sage-2017-06-06_21:54:14-rados-wip-sage-testing-distro-basic-smithi/1265467
rados/thrash/{0-size-min-size-overrid... - 02:02 AM Bug #20169: filestore+btrfs occasionally returns ENOSPC
- /a/sage-2017-06-06_21:54:14-rados-wip-sage-testing-distro-basic-smithi/1265435
rados/thrash/{0-size-min-size-overr...
06/06/2017
- 07:02 PM Bug #20068 (Resolved): osd valgrind error in CrushWrapper::has_incompat_choose_args
- 01:19 PM Bug #19943: osd: enoent on snaptrimmer
- /a/sage-2017-06-05_22:19:51-rados-wip-sage-testing-distro-basic-smithi/1262663
rados/thrash/{0-size-min-size-overr... - 01:16 PM Bug #19943: osd: enoent on snaptrimmer
- /a/sage-2017-06-05_22:19:51-rados-wip-sage-testing-distro-basic-smithi/1262583
rados/thrash/{0-size-min-size-overr... - 01:13 PM Bug #20133: EnvLibradosMutipoolTest.DBBulkLoadKeysInRandomOrder hangs on rocksdb+librados
- /a/sage-2017-06-05_22:19:51-rados-wip-sage-testing-distro-basic-smithi/1262365
- 12:40 PM Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
- 2017-05-19 22:58:05.142834 7f14de2a2700 0 osd.0 pg_epoch: 78440 pg[9.10cs0( v 78440'6350 (78438'4241,78440'6350] loc...
06/05/2017
- 09:32 PM Bug #19518: log entry does not include per-op rvals?
- /a/sage-2017-06-05_18:36:01-rados-wip-sage-testing2-distro-basic-smithi/1261843...
- 06:27 PM Bug #20188 (New): filestore: os/filestore/FileStore.h: 357: FAILED assert(q.empty()) from ceph_te...
- ...
- 06:25 PM Bug #19943: osd: enoent on snaptrimmer
- /a/sage-2017-06-05_14:47:27-rados-wip-sage-testing-distro-basic-smithi/1260424
teuthology:1260424 $ cat s...
- 06:24 PM Bug #19943: osd: enoent on snaptrimmer
- /a/sage-2017-06-05_14:47:27-rados-wip-sage-testing-distro-basic-smithi/1260344
teuthology:1260344 $ cat su...
- 09:24 AM Backport #16239 (Fix Under Review): 'ceph tell osd.0 flush_pg_stats' fails in rados qa run
- https://github.com/ceph/ceph/pull/15475
06/03/2017
- 06:46 PM Bug #20169: filestore+btrfs occasionally returns ENOSPC
- It's less likely on Centos, but I think we've seen this before and it's usually been a btrfs kernel bug that got reso...
06/02/2017
- 03:29 PM Bug #20169 (New): filestore+btrfs occasionally returns ENOSPC
- ...
- 03:14 PM Bug #19964: occasional crushtool timeouts
- /a/sage-2017-06-02_08:32:01-rados-wip-sage-testing-distro-basic-smithi/1255514
teuthology:1255514 $ cat su...
- 02:23 AM Bug #19943: osd: enoent on snaptrimmer
- /a/sage-2017-06-01_21:44:07-rados-wip-sage-testing---basic-smithi/1253654
teuthology:1253654 $ cat summary...
- 02:20 AM Bug #20134 (Rejected): test_rados.TestIoctx.test_aio_read AssertionError: 5 != 2
- 2017-06-01T22:57:09.649 INFO:tasks.workunit.client.0.smithi084.stderr:========================================...
06/01/2017
- 04:50 PM Bug #20133 (Can't reproduce): EnvLibradosMutipoolTest.DBBulkLoadKeysInRandomOrder hangs on rocksd...
- ...
- 04:43 PM Bug #19964: occasional crushtool timeouts
- /a/sage-2017-06-01_02:27:12-rados-wip-sage-testing2---basic-smithi/1249759
description: rados/singleton-bluestore/{a...
05/31/2017
- 11:06 PM Bug #19943: osd: enoent on snaptrimmer
- /a/sage-2017-05-31_18:45:30-rados-wip-sage-testing---basic-smithi/1248735
- 04:38 PM Bug #18043: ceph-mon prioritizes public_network over mon_host address
- and to elaborate on the fact that i have a branch and no pr, i do intend to finish this up soon, but likely only afte...
- 04:36 PM Bug #18043: ceph-mon prioritizes public_network over mon_host address
- fwiw, i've got a branch handling this from earlier this year: https://github.com/jecluis/ceph/commits/wip-mon-host
... - 04:04 PM Support #18508 (Closed): PGs of EC pool stuck in peering state
- There was clearly a lot going on here and none of it was clear. If switching to SimpleMessenger fixed it, I presume t...
- 03:14 PM Bug #17138: crush: inconsistent ruleset/ruled_id are difficult to figure out
- Some work in progress on this here: https://github.com/ceph/ceph/pull/13683
- 03:21 AM Bug #20117 (Rejected): BlueStore.cc: 8585: FAILED assert(0 == "unexpected error")
- version:
root@node0:~# ceph -v
ceph version 12.0.2 (5a1b6b3269da99a18984c138c23935e5eb96f73e)
bluestore+rbd+ec+o...
- 03:19 AM Bug #20116 (Can't reproduce): osds abort on shutdown with assert(ceph/src/osd/OSD.cc: 4324: FAILE...
- version:
root@node0:~# ceph -v
ceph version 12.0.2 (5a1b6b3269da99a18984c138c23935e5eb96f73e)
bluestore+rbd+ec+o...
05/30/2017
- 01:46 PM Support #20108 (Resolved): PGs are not remapped correctly when one host fails
- I have run into the following problem:
in a 6-node cluster we have 2 nodes/chassis, and the crush rule set to distri...
- 01:45 PM Bug #20041: ceph-osd: PGs getting stuck in scrub state, stalling RBD
- Logs available on teuthology:/home/jdillaman/osd.23.log_try_rados_rm.gz
05/29/2017
- 11:30 PM Bug #19790 (In Progress): rados ls on pool with no access returns no error
- 11:28 PM Bug #19790: rados ls on pool with no access returns no error
- https://github.com/ceph/ceph/pull/15354
Greg, will talk to you about the per-object cap semantics separately.
- 07:45 PM Bug #19964: occasional crushtool timeouts
- /a/sage-2017-05-28_05:00:18-rados-wip-sage-testing---basic-smithi/1238511
description: rados/singleton-bluestore/{...
- 02:51 PM Bug #17968: Ceph:OSD can't finish recovery+backfill process due to assertion failure
- https://github.com/ceph/ceph/pull/15349
05/28/2017
- 09:17 PM Bug #19909: PastIntervals::check_new_interval: assert(lastmap->get_pools().count(pgid.pool()))
- I've no idea the repercussions (thinking I'll backup and recreate the cluster) but if you write an osdmap into all of...
- 03:09 AM Bug #19943: osd: enoent on snaptrimmer
- /a/sage-2017-05-27_03:43:09-rados-wip-sage-testing2---basic-smithi/1235222
- 03:07 AM Bug #19943: osd: enoent on snaptrimmer
- /a/sage-2017-05-27_03:43:09-rados-wip-sage-testing2---basic-smithi/1235419
- 02:03 AM Bug #19964: occasional crushtool timeouts
- /a/sage-2017-05-27_03:43:09-rados-wip-sage-testing2---basic-smithi/1235225
- 01:59 AM Bug #19964: occasional crushtool timeouts
- /a/sage-2017-05-27_01:05:11-rados-wip-sage-testing---basic-smithi/1233483
- 01:57 AM Bug #20105 (Resolved): LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify3/0 failure
- ...
05/27/2017
- 08:06 AM Bug #17968: Ceph:OSD can't finish recovery+backfill process due to assertion failure
I have a document that provides the details of our analysis of this problem, but it's written in Chinese. If needed, I...
- 08:03 AM Bug #17968: Ceph:OSD can't finish recovery+backfill process due to assertion failure
- Hi, everyone.
Sorry, I forgot to watch my issues.
We found that the problem is due to "librados::OPERATION_BALA...
- 07:59 AM Bug #19983: osds abort on shutdown with assert(/build/ceph-12.0.2/src/os/bluestore/KernelDevice.c...
- I pulled out a disk, and then the problem occurred.
- 03:06 AM Bug #20099: osd/filestore: osd/PGLog.cc: 911: FAILED assert(last_e.version.version < e.version.ve...
- fang yuxiang wrote:
> I think this is not a functional issue of Ceph; maybe your local fs data is corrupted.
>
> ar...
- 03:01 AM Bug #20099: osd/filestore: osd/PGLog.cc: 911: FAILED assert(last_e.version.version < e.version.ve...
- `read_log 406'6529418` and `read_log 346'6529418` have the same seq.
Also, ceph-kvstore-tool shows:
...
- 02:46 AM Bug #20099: osd/filestore: osd/PGLog.cc: 911: FAILED assert(last_e.version.version < e.version.ve...
- I think this is not a functional issue of Ceph; maybe your local fs data is corrupted.
Are you using any block cache...
- My Ceph cluster went down when the server was powered off,
and when I restarted my OSD, it failed in read_log.
As follows:...
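The read_log failure reported for this bug trips assert(last_e.version.version < e.version.version) in osd/PGLog.cc: two consecutive log entries share the same version number (seq) even though their epochs differ. A minimal sketch of that invariant, assuming a simplified (hypothetical) model of Ceph's eversion_t rather than the real code:

```python
# Hypothetical model of the PGLog ordering invariant, not Ceph source.
from collections import namedtuple

# eversion_t is written epoch'version, e.g. 406'6529418.
Eversion = namedtuple("Eversion", ["epoch", "version"])

def log_versions_strictly_increase(entries):
    """Return True iff each entry's version strictly exceeds the previous
    entry's version; this is the condition the OSD assert enforces."""
    return all(prev.version < cur.version
               for prev, cur in zip(entries, entries[1:]))

# The two entries from the report: different epochs, same seq 6529418.
log = [Eversion(346, 6529418), Eversion(406, 6529418)]
print(log_versions_strictly_increase(log))  # False: duplicate seq trips the assert
```

This illustrates why a corrupted or duplicated on-disk log record (rather than an ordinary restart) is the suspected cause: a healthy log can never contain two entries with the same seq.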
05/26/2017
- 09:44 PM Bug #19943: osd: enoent on snaptrimmer
- http://pulpito.ceph.com/gregf-2017-05-26_06:45:56-rados-wip-19931-snaptrim-pgs---basic-smithi/1231020/
- 03:36 PM Bug #20068 (Need More Info): osd valgrind error in CrushWrapper::has_incompat_choose_args
- https://github.com/ceph/ceph/pull/15244 was merged recently and modified how things are handled. Let's see if it happen...
- 12:40 PM Bug #20092 (Duplicate): ceph-osd: FileStore::_do_transaction: assert(0 == "unexpected error")
- http://pulpito.ceph.com/jdillaman-2017-05-25_16:48:38-rbd-wip-jd-testing-distro-basic-smithi/1229611...
05/25/2017
- 10:07 PM Bug #20086 (Can't reproduce): LibRadosLockECPP.LockSharedDurPP gets EEXIST
- ...
- 06:11 AM Bug #19983: osds abort on shutdown with assert(/build/ceph-12.0.2/src/os/bluestore/KernelDevice.c...
- /a/bhubbard-2017-05-24_05:25:43-rados-wip-badone-testing---basic-smithi/1224591/teuthology.log...
- 05:56 AM Bug #19943: osd: enoent on snaptrimmer
- /a/bhubbard-2017-05-24_05:25:43-rados-wip-badone-testing---basic-smithi/1224546/teuthology.log
- 02:27 AM Bug #19964: occasional crushtool timeouts
- /a/sage-2017-05-24_22:20:09-rados-wip-sage-testing---basic-smithi/1225182
- 12:16 AM Bug #19790: rados ls on pool with no access returns no error
- Looking into this
05/24/2017
- 11:13 PM Bug #19939: OSD crash in MOSDRepOpReply::decode_payload
- Kefu, could you take a look at this one? Not sure if it's related to recent denc changes, or perhaps https://github.c...
- 10:26 AM Bug #19939: OSD crash in MOSDRepOpReply::decode_payload
- More instances from last night's master:
- http://pulpito.ceph.com/jspray-2017-05-23_22:31:39-fs-master-distro-basic... - 10:01 PM Bug #19943: osd: enoent on snaptrimmer
- /a/sage-2017-05-24_18:40:38-rados-wip-sage-testing2---basic-smithi/1224933
- 03:44 PM Bug #16890 (Fix Under Review): rbd diff outputs nothing when the image is layered and with a writ...
- 03:43 PM Feature #16883: omap not supported by ec pools
- This is due to erasure coded pools not supporting omap operations. It's a limitation for the current cache pool code,...
- 03:25 PM Bug #17170 (Can't reproduce): mon/monclient: update "unable to obtain rotating service keys when ...
- 03:22 PM Bug #17929: rados tool should bail out if you combine listing and setting the snap ID
- There is discussion on that (closed) PR. We just don't want to do snap listing as it's even more expensive than norma...
- 03:13 PM Bug #17968 (Need More Info): Ceph:OSD can't finish recovery+backfill process due to assertion fai...
- 03:13 PM Bug #17968 (Can't reproduce): Ceph:OSD can't finish recovery+backfill process due to assertion fa...
- 12:05 PM Bug #20068 (In Progress): osd valgrind error in CrushWrapper::has_incompat_choose_args
- 10:34 AM Bug #20068: osd valgrind error in CrushWrapper::has_incompat_choose_args
- Oops, left off the actual link:
http://pulpito.ceph.com/jspray-2017-05-23_22:31:39-fs-master-distro-basic-smithi/122... - 10:33 AM Bug #20068 (Resolved): osd valgrind error in CrushWrapper::has_incompat_choose_args
- Loic: assigning to you because it looks like you were working in this function recently....
- 10:47 AM Bug #20069 (New): PGs failing to create at start of test, REQUIRE_LUMINOUS not set?
- http://pulpito.ceph.com/jspray-2017-05-23_22:31:39-fs-master-distro-basic-smithi/1222407...
- 08:52 AM Bug #19790: rados ls on pool with no access returns no error
- For what it's worth, this is a regression. In Hammer, the appropriate EPERM is raised:...