Activity
From 04/27/2017 to 05/26/2017
05/26/2017
- 09:44 PM Bug #19943: osd: enoent on snaptrimmer
- http://pulpito.ceph.com/gregf-2017-05-26_06:45:56-rados-wip-19931-snaptrim-pgs---basic-smithi/1231020/
- 03:36 PM Bug #20068 (Need More Info): osd valgrind error in CrushWrapper::has_incompat_choose_args
- https://github.com/ceph/ceph/pull/15244 was merged recently and modified how things are handled. Let's see if it happen...
- 12:40 PM Bug #20092 (Duplicate): ceph-osd: FileStore::_do_transaction: assert(0 == "unexpected error")
- http://pulpito.ceph.com/jdillaman-2017-05-25_16:48:38-rbd-wip-jd-testing-distro-basic-smithi/1229611...
05/25/2017
- 10:07 PM Bug #20086 (Can't reproduce): LibRadosLockECPP.LockSharedDurPP gets EEXIST
- ...
- 06:11 AM Bug #19983: osds abort on shutdown with assert(/build/ceph-12.0.2/src/os/bluestore/KernelDevice.c...
- /a/bhubbard-2017-05-24_05:25:43-rados-wip-badone-testing---basic-smithi/1224591/teuthology.log...
- 05:56 AM Bug #19943: osd: enoent on snaptrimmer
- /a/bhubbard-2017-05-24_05:25:43-rados-wip-badone-testing---basic-smithi/1224546/teuthology.log
- 02:27 AM Bug #19964: occasional crushtool timeouts
- /a/sage-2017-05-24_22:20:09-rados-wip-sage-testing---basic-smithi/1225182
- 12:16 AM Bug #19790: rados ls on pool with no access returns no error
- Looking into this
05/24/2017
- 11:13 PM Bug #19939: OSD crash in MOSDRepOpReply::decode_payload
- Kefu, could you take a look at this one? Not sure if it's related to recent denc changes, or perhaps https://github.c...
- 10:26 AM Bug #19939: OSD crash in MOSDRepOpReply::decode_payload
- More instances from last night's master:
- http://pulpito.ceph.com/jspray-2017-05-23_22:31:39-fs-master-distro-basic...
- 10:01 PM Bug #19943: osd: enoent on snaptrimmer
- /a/sage-2017-05-24_18:40:38-rados-wip-sage-testing2---basic-smithi/1224933
- 03:44 PM Bug #16890 (Fix Under Review): rbd diff outputs nothing when the image is layered and with a writ...
- 03:43 PM Feature #16883: omap not supported by ec pools
- This is due to erasure coded pools not supporting omap operations. It's a limitation for the current cache pool code,...
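As a minimal illustration of that limitation (a sketch, not code from this tracker; the pool name "ecpool" and the object/key names are made up), an omap write against an erasure-coded pool is expected to come back with an error such as -EOPNOTSUPP rather than being applied:
<pre>
// Sketch only: attempt an omap write on a hypothetical EC pool "ecpool".
// Build with: g++ omap_ec_demo.cc -lrados
#include <rados/librados.hpp>
#include <iostream>
#include <map>
#include <string>

int main() {
  librados::Rados cluster;
  cluster.init("admin");             // client.admin; adjust for your environment
  cluster.conf_read_file(nullptr);   // default ceph.conf search path
  if (cluster.connect() < 0) return 1;

  librados::IoCtx io;
  if (cluster.ioctx_create("ecpool", io) < 0) return 1;  // hypothetical EC pool

  librados::ObjectWriteOperation op;
  std::map<std::string, librados::bufferlist> kv;
  librados::bufferlist bl;
  bl.append("value");
  kv["key"] = bl;
  op.omap_set(kv);                   // omap operation: not supported on EC pools

  int r = io.operate("obj", &op);    // expect a negative errno, e.g. -EOPNOTSUPP
  std::cout << "operate returned " << r << std::endl;
  cluster.shutdown();
  return 0;
}
</pre>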
- 03:25 PM Bug #17170 (Can't reproduce): mon/monclient: update "unable to obtain rotating service keys when ...
- 03:22 PM Bug #17929: rados tool should bail out if you combine listing and setting the snap ID
- There is discussion on that (closed) PR. We just don't want to do snap listing as it's even more expensive than norma...
- 03:13 PM Bug #17968 (Need More Info): Ceph:OSD can't finish recovery+backfill process due to assertion fai...
- 03:13 PM Bug #17968 (Can't reproduce): Ceph:OSD can't finish recovery+backfill process due to assertion fa...
- 12:05 PM Bug #20068 (In Progress): osd valgrind error in CrushWrapper::has_incompat_choose_args
- 10:34 AM Bug #20068: osd valgrind error in CrushWrapper::has_incompat_choose_args
- Oops, left off the actual link:
http://pulpito.ceph.com/jspray-2017-05-23_22:31:39-fs-master-distro-basic-smithi/122...
- 10:33 AM Bug #20068 (Resolved): osd valgrind error in CrushWrapper::has_incompat_choose_args
- Loic: assigning to you because it looks like you were working in this function recently....
- 10:47 AM Bug #20069 (New): PGs failing to create at start of test, REQUIRE_LUMINOUS not set?
- http://pulpito.ceph.com/jspray-2017-05-23_22:31:39-fs-master-distro-basic-smithi/1222407...
- 08:52 AM Bug #19790: rados ls on pool with no access returns no error
- For what it's worth, this is a regression. In Hammer, the appropriate EPERM is raised:...
05/23/2017
- 08:24 PM Bug #18165 (In Progress): OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_tar...
- 07:37 PM Bug #19790: rados ls on pool with no access returns no error
- Well, it's obvious enough, we go into PrimaryLogPG::do_pg_op() before we check op_has_sufficient_caps().
I think t...
- 06:57 PM Bug #20059 (Resolved): miscounting degraded objects
- on bigbang,...
- 09:50 AM Bug #20053 (New): crush compile / decompile loses precision on weight
- The weight of an item is displayed with %.3f and loses precision, which makes a difference in mapping.
Steps to rep...
- 03:39 AM Bug #20050: osd: very old pg creates take a long time to build past_intervals
- partially addressed by patch in wip-bigbang.
- 03:33 AM Bug #20050 (Resolved): osd: very old pg creates take a long time to build past_intervals
- (bigbang)
osds were down for a long time and pgs never got created. When the osds are finally up, they have to go...
05/22/2017
- 11:05 PM Bug #19909: PastIntervals::check_new_interval: assert(lastmap->get_pools().count(pgid.pool()))
- Still happens in 12.0.3, with the patch [[https://github.com/ceph/ceph/pull/15046]] applied.
- 08:35 PM Bug #20041: ceph-osd: PGs getting stuck in scrub state, stalling RBD
- ...
- 05:22 PM Bug #20041: ceph-osd: PGs getting stuck in scrub state, stalling RBD
- I've seen this on scrub as well.
- 03:55 PM Bug #20041 (Resolved): ceph-osd: PGs getting stuck in scrub state, stalling RBD
- See the attached logs for the remove op against rbd_data.21aafa6b8b4567.0000000000000aaa...
- 04:34 PM Bug #19964: occasional crushtool timeouts
- See this log as well:
http://qa-proxy.ceph.com/teuthology/yuriw-2017-05-20_04:20:14-rados-master_2017_5_20---basic...
- 06:51 AM Bug #20000 (Can't reproduce): osd assert in shared_cache.hpp: 107: FAILED assert(weak_refs.empty())
- version:
root@node0:~# ceph -v
ceph version 12.0.2 (5a1b6b3269da99a18984c138c23935e5eb96f73e)
bluestore+ec+overw...
05/20/2017
- 06:41 AM Bug #19964: occasional crushtool timeouts
- This is not new; I've been spotting this occasionally in our Jenkins runs.
05/19/2017
- 04:38 PM Bug #19991 (New): dmclock-tests fail on my build VM
- This fails on my build machine, which is a VM; it passes on Jenkins.
[ RUN      ] test_client.full_bore_timing
/home/dz...
- 03:18 PM Bug #19909: PastIntervals::check_new_interval: assert(lastmap->get_pools().count(pgid.pool()))
- I have the same error with 12.0.3...
- 02:52 PM Bug #19803: osd_op_reply for stat does not contain data (ceph-mds crashes with unhandled buffer::...
- After switching to writeback cache mode, this error didn't occur again. So I'm confident the proxy mode of the cache ...
- 08:30 AM Bug #19983 (Closed): osds abort on shutdown with assert(/build/ceph-12.0.2/src/os/bluestore/Kerne...
- version:
root@node0:~# ceph -v
ceph version 12.0.2 (5a1b6b3269da99a18984c138c23935e5eb96f73e)
bluestore+rbd+ec+o...
05/17/2017
- 11:09 PM Bug #19971 (In Progress): osd: deletes are performed inline during pg log processing
- 11:09 PM Bug #19971 (Resolved): osd: deletes are performed inline during pg log processing
- With a large number of deletes in a client workload, this can easily saturate a disk and cause very high latency, sin...
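The direction such a fix typically takes, sketched here generically (not the actual OSD change; every name below is made up for illustration), is to queue the deletes discovered during log processing and drain them in small batches from a background context instead of issuing them inline:
<pre>
// Generic sketch (not Ceph code): throttle deletes instead of doing them inline.
#include <deque>
#include <string>

class DeferredDeletes {
  std::deque<std::string> pending;  // objects noted for removal during log processing
  const size_t batch = 16;          // illustrative per-tick budget
public:
  void note_delete(const std::string& oid) { pending.push_back(oid); }

  // Called periodically from a background thread or work queue rather than inline,
  // so a delete-heavy client workload cannot saturate the disk in one burst.
  template <typename RemoveFn>
  void drain_some(RemoveFn remove_one) {
    for (size_t i = 0; i < batch && !pending.empty(); ++i) {
      remove_one(pending.front());
      pending.pop_front();
    }
  }
};
</pre>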
- 09:42 PM Bug #19700: OSD remained up despite cluster network being inactive?
- Was the cluster performing IO while this happened? Do your public and private networks perhaps route to each other?
...
- 07:34 PM Bug #19790: rados ls on pool with no access returns no error
- Same issue even with just @rw@:...
- 06:38 PM Bug #19790: rados ls on pool with no access returns no error
- I'm not at a computer to check, but I'm pretty sure the "allow *" is short-circuiting other security checks here and ...
- 04:10 PM Bug #19790: rados ls on pool with no access returns no error
- 09:12 AM Bug #19790: rados ls on pool with no access returns no error
- Just checking: is anyone looking at this? It's arguably a security issue, after all.
- 03:53 PM Bug #16567: Ceph raises scrub errors on pgs with objects being overwritten
- Hmm, similar reports have popped up (although with on-disk size 0) on the mailing list. Those involved cache tiers th...
- 03:51 PM Bug #16279: assert(objiter->second->version > last_divergent_update) failed
- xfs corruption means your setup was not safe for power failure, or your disk is dying. Neither is something that ceph...
- 03:38 PM Bug #15936: Osd-s on cache pool crash after upgrade from Hammer to Jewel
- Ping Joao? This looks to have been a crash in persisting/trimming HitSets, which I know underwent a bunch of changes/...
- 03:19 PM Bug #15741: librados get_last_version() doesn't return correct result after aio completion
- Any update on this, David? :)
- 02:23 PM Bug #19964 (Resolved): occasional crushtool timeouts
- ...
- 11:21 AM Bug #19960 (Resolved): overflow in client_io_rate in ceph osd pool stats
- luminous branch, v12.0.2
Output of ceph osd pool stats -f json contains overflowed values in client_io_rate sectio...
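The title points at an unsigned wrap; as a minimal sketch of how such a value usually arises (illustrative arithmetic only, not the code that computes client_io_rate), subtracting a larger previous counter from a smaller current one in 64-bit unsigned math wraps to an enormous number:
<pre>
// Illustration only: how an unsigned "rate" wraps instead of going negative.
#include <cstdint>
#include <iostream>

int main() {
  uint64_t prev_bytes = 1000000;
  uint64_t cur_bytes  = 900000;            // counter went backwards (e.g. after a reset)
  uint64_t rate = cur_bytes - prev_bytes;  // wraps to 18446744073709451616
  std::cout << rate << std::endl;          // the kind of value seen in the JSON output
  return 0;
}
</pre>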
05/16/2017
- 07:59 PM Bug #19943: osd: enoent on snaptrimmer
- /a/yuriw-2017-05-15_22:59:10-rados-wip-yuri-testing_2017_5_16-distro-basic-smithi/1181575 (bluestore)
- 05:55 PM Bug #19943: osd: enoent on snaptrimmer
- Clone 269 was trimmed but it corresponds to a lot of other snapshots, so the object shouldn't be removed until all th...
- 04:04 PM Bug #19943 (Resolved): osd: enoent on snaptrimmer
- ...
- 04:54 PM Feature #19944 (Rejected): [RFE]: add option/support config persistence with ceph tell command
- We should have support in ceph itself to make the conf changes persist; ceph tell has a good error checking mechanism a...
- 11:08 AM Bug #19939 (Resolved): OSD crash in MOSDRepOpReply::decode_payload
- Seen on kcephfs suite, running against test branch based on Monday's master....
- 02:39 AM Bug #19936 (New): filestore ENOTEMPTY
- ...
05/12/2017
- 06:32 PM Bug #19909: PastIntervals::check_new_interval: assert(lastmap->get_pools().count(pgid.pool()))
- Or Sage/Zheng can confirm if this failure mode matches that error...
- 06:31 PM Bug #19909: PastIntervals::check_new_interval: assert(lastmap->get_pools().count(pgid.pool()))
- I'm just pattern-matching from going through my email, but https://github.com/ceph/ceph/pull/15046 is about OSDMap de...
05/11/2017
- 03:14 PM Bug #19911 (Can't reproduce): osd: out of order op
- ...
- 01:01 PM Bug #19909 (Won't Fix): PastIntervals::check_new_interval: assert(lastmap->get_pools().count(pgid...
- After updating osds from 12.0.1 to 12.0.1-2248-g745902a, we get all osds failing like this:...
05/10/2017
- 02:03 PM Bug #19639: mon crash on shutdown
- Is it reproducing? Wouldn't surprise me if these were linked.
- 10:42 AM Bug #19639: mon crash on shutdown
- Sorry, I made the history confusing by editing the description. The "other one" is the one that is now the only one ...
05/09/2017
- 03:50 PM Bug #19895 (Can't reproduce): test/osd/RadosModel.h: 1169: FAILED assert(version == old_value.ver...
- ...
- 08:42 AM Bug #18698: BlueFS FAILED assert(0 == "allocate failed... wtf")
- Hi,
in fact, we are still (after some days) having this issue after upgrading to Luminous 12.0.2.
Same errors in th...
05/08/2017
- 09:50 PM Bug #19882 (Resolved): rbd/qemu: [ERR] handle_sub_read: Error -2 reading 1:e97125f5:::rbd_data.0....
- /a/sage-2017-05-08_20:50:21-rbd:qemu-wip-19863---basic-smithi/1114854
/a/sage-2017-05-08_20:50:21-rbd:qemu-wip-19863...
- 06:54 PM Bug #19881 (Can't reproduce): ceph-osd: pg_update_log_missing(1.20 epoch 66/11 rep_tid 1493 entri...
- OSD assertion failure during rbd-mirror test:
http://qa-proxy.ceph.com/teuthology/jdillaman-2017-05-08_11:56:19-rbd-...
05/07/2017
- 05:15 PM Bug #19803: osd_op_reply for stat does not contain data (ceph-mds crashes with unhandled buffer::...
- Unfortunately my colleague already fixed the MDS (recover_dentries, journal reset) - now the op reply contains data.
...
05/04/2017
- 02:35 AM Bug #17945: ceph_test_rados_api_tier: failed to decode hitset in HitSetWrite test
- saw it again, ...
- 01:05 AM Bug #19803: osd_op_reply for stat does not contain data (ceph-mds crashes with unhandled buffer::...
- osd op replies for 'stat' do not contain data. (140+0+0 in these lines) ...
05/03/2017
- 11:21 PM Bug #19849 (New): cls ops do not consistently get ENOENT on whiteouts
- The cls glue objclass.cc directly calls do_osd_ops, which inconsistently checks for !exists || is_whiteout(). Instea...
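A rough sketch of the consistent guard the report is asking for (a hypothetical helper, not the actual objclass.cc or do_osd_ops code):
<pre>
// Hypothetical helper, not Ceph code: cls-initiated ops should see a whiteout
// exactly like a missing object and get ENOENT back.
#include <cerrno>

inline int check_object_visible(bool exists, bool is_whiteout) {
  if (!exists || is_whiteout)
    return -ENOENT;   // the check the report says is applied inconsistently today
  return 0;
}
</pre>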
- 03:58 PM Bug #19803: osd_op_reply for stat does not contain data (ceph-mds crashes with unhandled buffer::...
- Thanks for looking into this.
Here is the output with debug_mds=20 and debug_ms=1:...
05/02/2017
- 04:25 PM Bug #18698: BlueFS FAILED assert(0 == "allocate failed... wtf")
- Hi,
I tested with kraken v11.2.0 again: deactivated the bluefs_allocator = stupid and restarted all my OSDs, issu...
- 10:06 AM Bug #18749: OSD: allow EC PGs to do recovery below min_size
- https://trello.com/c/5q8YSNtu
I am willing to solve this problem - 09:16 AM Bug #19803: osd_op_reply for stat does not contain data (ceph-mds crashes with unhandled buffer::...
- The crash is strange,it happened when decoding on-wire message from osd. Please add 'debug ms = 1' to mds config and ...
- 08:10 AM Cleanup #18875: osd: give deletion ops a cost when performing backfill
- Working on this issue
- 07:06 AM Bug #19400: add more info during pool delete error
- It's resolved.
05/01/2017
- 10:26 PM Bug #19818 (New): crush: get_rule_weight_osd_map does not factor in pool size, rule
- The get_rule_weight_osd_map assumes that every osd reachable by the TAKE ops is used once. This isn't true in gener...
- 05:33 PM Bug #19440: osd: trims maps that pgs haven't consumed yet when there are gaps
- 04:24 PM Bug #18698 (Can't reproduce): BlueFS FAILED assert(0 == "allocate failed... wtf")
- I haven't seen this in any of our qa... is it still happening for you? Which versions?
- 03:55 PM Bug #19639 (Need More Info): mon crash on shutdown
- what is the "other one" (besides probe_timeout #19738)?
- 02:34 PM Bug #19815 (New): Rollback/EC log entries take gratuitous amounts of memory
- Each osd consumes too much memory when I tested ec overwrite. So I watched heap memory with google-perftools.
I fou...
04/28/2017
- 09:38 PM Feature #19810 (New): qa: test that we are trimming maps
- We merged https://github.com/ceph/ceph/pull/14504 without noticing that it prevented *all* OSD map trimming, because ...
- 03:29 PM Bug #19803 (New): osd_op_reply for stat does not contain data (ceph-mds crashes with unhandled bu...
- Hi,
our MDS crashes reproducibly after some hours when we're extracting lots of zip archives (with many small file...
- 05:42 AM Bug #19800: some osds are down when creating a new pool and a new image of the pool (bluestore)
- In this case, the bug occurs if the pool is first removed and then a pool and image are created.
When it occurs, the most intui...
- 02:19 AM Bug #19800: some osds are down when creating a new pool and a new image of the pool (bluestore)
- (gdb) bt
#0 0x00002b0ff06d4cc3 in pread64 () at ../sysdeps/unix/syscall-template.S:81
#1 0x00005591f7cecd35 in pr...
- 02:18 AM Bug #19800 (Resolved): some osds are down when creating a new pool and a new image of the pool (blu...
- After much write IO, such as snapshot writing and PG splitting for the cluster, to create a new pool and a new image of t...
04/27/2017
- 06:57 PM Bug #18329 (Can't reproduce): pure virtual method called in rocksdb from bluestore
- haven't seen this since then.
- 06:19 PM Bug #19737: EAGAIN encountered during pg scrub (jewel)
- Ran 4 times - 50% failure rate: http://pulpito.ceph.com/smithfarm-2017-04-27_17:35:57-rados-wip-jewel-backports-distr...
- 05:40 PM Bug #19737: EAGAIN encountered during pg scrub (jewel)
- http://pulpito.ceph.com/smithfarm-2017-04-27_16:56:17-rados-wip-jewel-backports---basic-smithi/1074069/
- 08:58 AM Bug #19790 (Resolved): rados ls on pool with no access returns no error
- Given the following auth capabilities:...