Activity
From 09/15/2014 to 10/14/2014
10/14/2014
- 07:32 PM Bug #8620: rest/test.py occasional failure (dumpling)
- ubuntu@teuthology:/a/teuthology-2014-10-13_19:00:01-rados-dumpling-distro-basic-multi/545881
- 07:30 PM Bug #8851: Mon crash after update to 0.80.4
- In our product env, we use 0.83. Coming across this problem too.
Try this patch https://github.com/ceph/ceph/pull/2...
- 07:22 PM Bug #9765: CachePool flush -> OSD Failed
- ...
- 02:58 AM Bug #9765: CachePool flush -> OSD Failed
- I'm sorry!
*[root@ct3 ~]# ceph --version
ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)*
- 01:24 AM Bug #9765: CachePool flush -> OSD Failed
- in addition:...
- 01:19 AM Bug #9765 (Duplicate): CachePool flush -> OSD Failed
- Hi,All.
I encountered a problem flushing the data before deleting CachePool.
My crushmap:...
- 06:55 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- https://github.com/ceph/ceph/pull/2724
- 06:42 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Indeed ! Thanks !
- 06:32 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- I don't think this is the patch you want; see c776a89880fdac270e6334ad8e49fa616d05d0d4 and acfe62e0aa45bff208e38aeedad...
- 06:27 PM Bug #9073 (Fix Under Review): OSD with device/partition journals down after fresh deploy or upgra...
- * firefly backport https://github.com/ceph/ceph/pull/2724
- 06:22 PM Bug #9073 (Pending Backport): OSD with device/partition journals down after fresh deploy or upgra...
- The fix for this bug is https://github.com/ceph/ceph/commit/c776a89880fdac270e6334ad8e49fa616d05d0d4 and needs backpo...
- 06:31 PM Bug #9785 (Resolved): /etc/ceph/dmcrypt-keys and key contents are created world-readable
- get_or_create_dmcrypt_key in ceph-disk creates the key_dir and key_files, but does not set any specific permissions o...
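The permission problem described above can be illustrated with a minimal Python sketch. It is not the fix that was applied to ceph-disk; the helper name `write_private_key` and its arguments are hypothetical, and it only shows one common way to create a key directory and key file that are readable by the owner alone.

```python
import os


def write_private_key(key_dir, name, data):
    """Create key_dir with mode 0700 and write data to key_dir/name with mode 0600.

    Hypothetical helper for illustration; not part of ceph-disk.
    """
    os.makedirs(key_dir, mode=0o700, exist_ok=True)
    # makedirs leaves an existing directory untouched, so set the mode explicitly too.
    os.chmod(key_dir, 0o700)
    path = os.path.join(key_dir, name)
    # O_EXCL refuses to reuse a file that already exists with looser permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path
```

Passing the mode to `os.open` at creation time avoids a window in which the key file exists with default permissions before a later `chmod`.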
- 06:23 PM Bug #9768 (Duplicate): ceph-osd mkfs hangs
- 06:00 PM Bug #9768: ceph-osd mkfs hangs
- Created with ceph-disk prepare --fs-type=ext4 and ceph-disk activate /dev/loop3p1
- 04:46 PM Bug #9768: ceph-osd mkfs hangs
- On ubuntu-14.04 the logs of a ceph-osd mkfs on 0.80.5 that completes successfully.
- 04:04 PM Bug #9768: ceph-osd mkfs hangs
- 07:21 AM Bug #9768: ceph-osd mkfs hangs
- I browsed the patches to aio in 3.12.7 until now and saw nothing that could be related to this problem https://www.kerne...
- 07:15 AM Bug #9768: ceph-osd mkfs hangs
- ...
- 06:01 AM Bug #9768: ceph-osd mkfs hangs
- Although https://github.com/ceph/ceph/commit/2f11631f3144f2cc0e04d718e40e716540c8af19 seems related, the log shows Fi...
- 05:45 AM Bug #9768 (Duplicate): ceph-osd mkfs hangs
- h3. Workaround for Firefly <= 0.80.7
If it shows with...
- 06:15 PM CephFS Bug #9674: nightly failed multiple_rsync.sh
- rsync asks us to see previous errors;) yes, I think sudo should work
- 02:36 PM CephFS Bug #9674: nightly failed multiple_rsync.sh
- Well, that would make sense. How did you find those in the log?
We should probably just run this as sudo or someth...
- 06:30 AM CephFS Bug #9674: nightly failed multiple_rsync.sh
- ...
- 05:52 PM devops Bug #9783: upgrade ceph-common (0.80.7-1trusty) over (0.80.5-0ubuntu0.14.04.1) fails
- looks like 17732dc0c8878ea58813ad543c5359cb811079cc which probably should have included some other package control he...
- 04:55 PM devops Bug #9783 (Rejected): upgrade ceph-common (0.80.7-1trusty) over (0.80.5-0ubuntu0.14.04.1) fails
- This happens when switching from Ubuntu / Debian repositories to Ceph repositories:...
- 05:31 PM Bug #9784 (Resolved): All tools should be named consistently and argument parsing should be better
- Slowly some of the tools like ceph_objectstore_tool have migrated to have underscores in the name. But I noticed s...
- 04:42 PM Bug #9769: upgrade/firefly: latest_dumpling_release.yaml always fails
- Tests still failing but at different point.
See http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-14_14:22:47-u...
- 09:17 AM Bug #9769: upgrade/firefly: latest_dumpling_release.yaml always fails
- From rerun with verbose:...
- 08:21 AM Bug #9769 (In Progress): upgrade/firefly: latest_dumpling_release.yaml always fails
- Running with verbose is on for job - http://qa-proxy.ceph.com/teuthology/sage-2014-10-13_20:46:44-upgrade:firefly-fir...
- 06:01 AM Bug #9769 (Resolved): upgrade/firefly: latest_dumpling_release.yaml always fails
- ...
- 04:00 PM Bug #9408 (Fix Under Review): erasure-code: misalignment
- 03:57 PM Bug #9700 (Resolved): cephtool mon_osd intermittent failure
- I've not seen errors since this patch, except for firefly builds because this was not backported. Feel free to re-ope...
- 03:06 PM Bug #9388 (Duplicate): osd/PG.cc: 2945: FAILED assert(r == 0) in update_snap_map
- 03:01 PM Bug #9390 (Duplicate): EEXIST on split due to import/export
- 03:00 PM Bug #7588 (Resolved): OSD Seg fault in string assign ObjectOperation::C_ObjectOperation_copyget::...
- 02:59 PM Bug #9729: "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x:parallel-giant-dis...
- The corresponding line of code in master branch for test/librados/misc.cc was changed by Josh in Feb:
7a019b38 src/t...
- 02:51 PM Bug #9729: "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x:parallel-giant-dis...
- Same issue in run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-13_17:10:01-upgrade:dumpling-firefly-x:paral...
- 02:59 PM rgw Bug #9774 (Won't Fix): multi-version: giant rgw throws 500 with dumpling osds
- 09:42 AM rgw Bug #9774 (Won't Fix): multi-version: giant rgw throws 500 with dumpling osds
- ...
- 02:58 PM Bug #9757: mon: loops on osd pool create
- final commit is cf4e30095e8149d1df0f2c9b4c93c9df0779ec84
- 02:57 PM Bug #9757 (Resolved): mon: loops on osd pool create
- 02:31 PM Bug #9757 (Fix Under Review): mon: loops on osd pool create
- 02:28 PM Bug #9757: mon: loops on osd pool create
- This bug is dated october 12th with https://github.com/ceph/ceph/commit/0c1eafd7ab6f7d2a5eccd10ce267bde5e90932c5 whic...
- 01:51 PM Bug #9757: mon: loops on osd pool create
- * https://github.com/ceph/ceph/commit/fe43202449e3caf60e796f1205ef4303e905659d does not need to be backported because...
- 01:40 PM Bug #9757: mon: loops on osd pool create
- "mon/OSDMonitor : Use user provided ruleset for replicated pool":https://github.com/ceph/ceph/commit/cf4e30095e8149d1...
- 01:18 PM Bug #9757: mon: loops on osd pool create
- ...
- 12:54 PM Bug #9757: mon: loops on osd pool create
- This was run using the following backport https://github.com/ceph/ceph/commits/wip-9757
- 02:47 PM Feature #9781 (Resolved): ceph_objectstore_tool: On import handle splits
- Once we have OSDMap information we need to check for splits during pg import:
Sam:
Upon import, if we detect a ...
- 02:45 PM Feature #9780 (Resolved): ceph_objectstore_tool: Add OSDMap information to pg export
- Gather appropriate OSDMap information and include in export data.
- 02:43 PM Fix #7711: OpTracker output doesn't include op size for subops
- I didn't do this back then. We should get to it, though.
- 02:14 PM Linux kernel client Feature #9779: libceph: sync up with objecter
- Make sure not to break existing (correct!) behavior: we need to resend watch or notify when *any* member of the actin...
- 02:10 PM Linux kernel client Feature #9779 (Resolved): libceph: sync up with objecter
- - the way we resend lingering requests isn't quite the same
- __map_request() is too aggressive about resending:
...
- 02:04 PM Linux kernel client Bug #8806 (Resolved): libceph: must use new tid when watch is resent
- 02:02 PM Fix #9778 (New): forbid erasure code profile modifications that can modify data encoding
- even if --force is set in erasure-code-profile set because it can corrupt the content of the erasure coded pool. For ...
- 01:25 PM Feature #9449 (Resolved): mon: make ceph -s break more things onto multiple lines (health blurbs,...
- 01:24 PM Feature #9598 (Fix Under Review): re-enable Objecter fast dispatch
- 01:24 PM Fix #9194 (In Progress): librados/osd: watch reconnect needs to be exclusive to detect possibly m...
- 01:12 PM Feature #9776 (New): try to make address sanitizer work
- 12:44 PM Feature #9198 (Fix Under Review): librados: notify callback includes gid of notifier
- 12:44 PM Feature #9197 (Fix Under Review): librados/osd: notify reply payload
- 12:43 PM Feature #8899 (Resolved): Kerberos/LDAP Support:: mon: define mon role capabilities
- 12:29 PM rgw Bug #9763 (Resolved): firefly upgrade tests fail s3tests, apache goes away
- https://github.com/ceph/s3-tests/commit/7e7457e1af8481cf111f25edab198d7498e18551
- 12:19 PM rgw Bug #9763: firefly upgrade tests fail s3tests, apache goes away
- looks like a bad test in s3-tests
- 11:13 AM Bug #5925: hung ceph_test_rados_delete_pools_parallel
- 08:47 AM Bug #5925: hung ceph_test_rados_delete_pools_parallel
- I've been reproducing this reliably with wip-9321.giant. Hung job in plana12....
- 11:12 AM Bug #9696 (Resolved): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(wan...
- 10:30 AM rbd Bug #5977 (Resolved): librbd: python bindings need docstrings to show up in online docs
- commit:7022679e2c76c707d3d28c052045d11736582b3a
- 08:11 AM rbd Bug #5977: librbd: python bindings need docstrings to show up in online docs
- PR: https://github.com/ceph/ceph/pull/2720
- 08:10 AM rbd Bug #5977 (In Progress): librbd: python bindings need docstrings to show up in online docs
- 09:28 AM rbd Fix #7787: rbd diff takes longer as images grow larger
- Dependent on issue #4087
- 09:26 AM rbd Feature #7746: Capacity Management: rbd df
- Dependent on issue #4087
- 09:25 AM rbd Feature #7746 (In Progress): Capacity Management: rbd df
- 09:07 AM rbd Feature #7746 (Fix Under Review): Capacity Management: rbd df
- 09:13 AM rbd Bug #8329 (Need More Info): qemu-img rpm provided breaks snapshooting functionality on centos
- Andrija, according to Bugzilla, the availability of the "-s" option in qemu-img was a backporting bug and was effecti...
- 09:12 AM Linux kernel client Feature #190 (Resolved): krbd: DISCARD support
- 09:09 AM rbd Feature #8902 (Fix Under Review): rbd mirroring: librbd: funnel snapshot, resize events via lock ...
- 09:08 AM rgw Cleanup #9772 (In Progress): rgw: reorganize RGWRados
- 09:07 AM rgw Cleanup #9772 (Resolved): rgw: reorganize RGWRados
- need to clean up the different states, separate access to system objects vs data objects.
- 09:07 AM rbd Feature #8900 (Fix Under Review): rbd mirroring: librbd:making image locking mandatory
- 09:07 AM rbd Feature #4087 (Fix Under Review): rbd: bitmaps for tracking object existence
- 09:06 AM rgw Feature #9013: rgw: set civetweb as a default frontend
- Done, commit:63d0ec7b2c00b7f9515d492009115d87414a77ab.
- 09:02 AM rbd Bug #9771 (Won't Fix): Segmentation fault after upgrade v0.80.5 -> v0.80.6
- This is a new test that upgrades from v0.80.4 -> v0.80.5 -> v0.80.4->firefly and runs different workloads after each step.
...
- 07:32 AM Bug #9731: Ceph 0.80.6 OSD crashes
- And here is another core file from another server. The backtrace in the log looks like a different path to me.
- 07:13 AM Bug #9731: Ceph 0.80.6 OSD crashes
- The Debian Wheezy build server doesn't seem to be online yet so I haven't been able to test your patch.
However, I...
- 06:20 AM rbd Feature #9733: Separate rbd listing into CAP
- I apologize, I thought I mentioned that you need to use RBD image format 2, but re-reading my comments I seemed to ha...
- 06:20 AM CephFS Feature #9755: Fence late clients during reconnect timeout
- There can be certain cases where a client can reconnect after being evicted, e.g. if:
* the client didn't hold an...
- 03:53 AM Fix #9767 (New): do not leak ceph-disk activate lock to the OSD
- The "activate_lock":https://github.com/ceph/ceph/blob/giant/src/ceph-disk#L1997 will leak to the OSD. This is harmles...
- 03:19 AM rgw Bug #9766 (Rejected): s3tests: test_100_continue failing
- Trying out s3tests on a ceph (0.80.5) cluster with ...
- 01:38 AM Bug #9761: ceph-osd: segfault at 654c30 ip 00007f00dc5f1f07 sp 00007f00c5642e00 error 7 in ld-2.1...
- I cannot repeat this voluntarily. And debug.20 eats up space.
- 01:34 AM Bug #9761: ceph-osd: segfault at 654c30 ip 00007f00dc5f1f07 sp 00007f00c5642e00 error 7 in ld-2.1...
- Pavel Veretennikov wrote:
> Just found this error in the logs. Ceph 0.80.6, Ubuntu 14.04, kernel 3.13.0-36-generic
...
- 01:18 AM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
- Looking at the ceph-osd.2.log uploaded by Sahana.
Prior to the reported problem, there was one more crash while merg...
- 12:42 AM Linux kernel client Bug #9749 (Resolved): kcephfs: kernel divide-by-zero crash in __validate_layout (fs/ceph/ioctl.c)
- fixed by "ceph: fix divide-by-zero in __validate_layout()" in the testing branch
- 12:03 AM rgw Feature #8316: Ceilometer support for RGW Swift statistics
- If we want to support ceilometer, can't we support the statistics of both S3 & swift APIs?
10/13/2014
- 08:45 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- I added a new test (#9758) and testing it on ceph-qa-suites branch 'wip_9758' which is doing step upgrades v0.80.4-v0...
- 10:52 AM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- wip-9731-firefly does not have this patch.
- 05:40 PM rgw Bug #9763 (Resolved): firefly upgrade tests fail s3tests, apache goes away
- ...
- 04:50 PM CephFS Feature #414 (Fix Under Review): ceph-fuse: implement file locking
- 04:42 PM devops Bug #9747: ceph.spec.in will always use 95-ceph-osd-alt.rules
- running gitbuilder
- 04:40 PM devops Bug #9747 (Fix Under Review): ceph.spec.in will always use 95-ceph-osd-alt.rules
- * backported to giant already
* firefly backport https://github.com/ceph/ceph/pull/2717
- 08:16 AM devops Bug #9747 (Pending Backport): ceph.spec.in will always use 95-ceph-osd-alt.rules
- 03:05 PM rbd Feature #9733: Separate rbd listing into CAP
- Using those caps does not allow the kernel client to mount the image:
[root@nodezz ~]# ceph auth caps client.rdleb...
- 02:20 PM rbd Feature #9733: Separate rbd listing into CAP
- I've looked over that document a few times, but I'm not finding specifics about things like "object_prefix", "rbd_hea...
- 01:59 PM rbd Feature #9733: Separate rbd listing into CAP
- Yes, you should be able to use different users within Nova, Cinder, and Glance config files. The capability grammar ...
- 01:16 PM rbd Feature #9733: Separate rbd listing into CAP
- Let me try this and see if it will do what we think. I don't know enough about the Open Stack side, but I hope we can...
- 01:08 PM rbd Feature #9733: Separate rbd listing into CAP
- The RBD image directory is stored within an object named 'rbd_directory' in each pool. You could create a capspec wh...
- 12:52 PM CephFS Feature #9755: Fence late clients during reconnect timeout
- Hmm, I like the basic thrust of this, but I'm a little concerned as well — we have other tickets to let clients recon...
- 03:39 AM CephFS Feature #9755 (Resolved): Fence late clients during reconnect timeout
- During reconnect, MDSs terminate the sessions of any clients which fail to reconnect within the window. Because wh...
- 12:45 PM Linux kernel client Feature #190: krbd: DISCARD support
- This should go upstream to Linus in the next day or two (for 3.18-rc1).
- 11:29 AM Linux kernel client Feature #190: krbd: DISCARD support
- Alphe Salas wrote:
> I agree with Kyle and Brian. This feature is necessary. I would like to have more information a...
- 11:27 AM Linux kernel client Feature #190: krbd: DISCARD support
- I agree with Kyle and Brian. This feature is necessary. I would like to have more information about the status of thi...
- 12:28 PM rbd Fix #7787 (In Progress): rbd diff takes longer as images grow larger
- 12:17 PM Bug #9761 (Rejected): ceph-osd: segfault at 654c30 ip 00007f00dc5f1f07 sp 00007f00c5642e00 error ...
- Just found this error in the logs. Ceph 0.80.6, Ubuntu 14.04, kernel 3.13.0-36-generic
Nothing special in the ceph...
- 12:03 PM devops Bug #9760 (Rejected): librados2 fails to install from ceph-qa
- Which is causing lots of failures in ceph-deploy's test runs
Example full log of one of the failures: http://qa-pr...
- 10:48 AM Bug #9731: Ceph 0.80.6 OSD crashes
- 10:19 AM Linux kernel client Bug #9749: kcephfs: kernel divide-by-zero crash in __validate_layout (fs/ceph/ioctl.c)
- 09:59 AM Bug #9714 (Duplicate): Dead jobs in upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-mu...
- I think this is a dup of #9757
- 09:48 AM Bug #9714: Dead jobs in upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-multi run
- Sam, can you take a look at this?
Still an issue in one off run - http://qa-proxy.ceph.com/teuthology/teuthology-2...
- 09:35 AM rbd Bug #9742: `rbd map lun` fails with: (2) No such file or directory on kernel 3.14.14 w/ udev-216 ...
- Is this running inside a container? This looks like a problem with the authentication keys and there is a known issu...
- 09:32 AM Bug #9744 (Won't Fix): cephx: verify_reply couldn't decrypt with error: error decoding block for ...
- this happens when clocks are very skewed.
- 08:59 AM rbd Bug #9602 (Closed): rbd export -> nc ->rbd import = memory leak
- Irek, thanks for the update. Closing as not a bug.
- 04:43 AM rbd Bug #9602: rbd export -> nc ->rbd import = memory leak
- Jason Dillaman wrote:
> I quickly attempted to reproduce this on the same version w/o success. Can you attach /etc/...
- 08:47 AM Documentation #9730: ceph-deploy mon create-inital, does not take arguments
- merged commit eb27245 into master
- 08:46 AM Documentation #9730 (Resolved): ceph-deploy mon create-inital, does not take arguments
- Merged the pull request.
- 08:21 AM Documentation #9730 (In Progress): ceph-deploy mon create-inital, does not take arguments
- PR opened https://github.com/ceph/ceph/pull/2714
- 08:34 AM Bug #9757: mon: loops on osd pool create
- also breaking teuthology-2014-10-09_19:30:01-upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-multi
- 06:31 AM Bug #9757 (Resolved): mon: loops on osd pool create
- http://pulpito.ceph.com/sage-2014-10-12_09:13:46-upgrade:dumpling-x-wip-sam-firefly-testing-distro-basic-multi/541361...
- 07:22 AM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
- I have submitted the following patches:
Update s3-tests with the new small size multipart tests:
https://github.c...
- 06:56 AM Cleanup #9756: Issues found by Clang
- start with
https://github.com/ceph/autobuild-ceph/blob/master/build-ceph.sh
and make a build-ceph-clang.sh. ...
- 04:50 AM Cleanup #9756 (In Progress): Issues found by Clang
- I again [1] used Clang with -Weverything [2] to compile the Ceph repository [3].
There is still a huge amount of ser...
- 06:22 AM rbd Bug #8329: qemu-img rpm provided breaks snapshooting functionality on centos
- Any info on this? At least can we define some preferred way of enabling qemu-img/kvm to speak to CEPH (Do it your self...
- 03:16 AM CephFS Feature #9754 (Resolved): A 'fence and evict' client eviction command
- Currently the "session evict" operation on the MDS admin socket will terminate the session, and release any capabil...
- 01:07 AM Linux kernel client Bug #9355 (Closed): rbd: map fails with EINVAL inside a container
- Opened #9753.
- 01:06 AM Linux kernel client Feature #9753 (Resolved): libceph: allow custom network namespaces
- See the bottom of #9355.
- 12:57 AM Linux kernel client Bug #9192: krbd: poor read (about 10%) vs write performance
- Hi Eric,
Thanks for doing this. I was concerned about this being a regression after the queueing changes, but it ...
10/12/2014
- 10:07 PM Bug #9614: PG stuck with remapped
- The original fix was not clean, just added a new pull request: https://github.com/ceph/ceph/pull/2711
- 09:06 PM Bug #9215: Ceph Firefly 0.80.5 : OSD flapping too frequently
- karan singh wrote:
> You can close this case , problem has been solved after applying fix (0.80.5-1-gc4b77d2)
May...
- 06:43 PM Bug #9731: Ceph 0.80.6 OSD crashes
- I already saw the commit in branch. The status of this issue should not be New.
- 03:23 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- Results for the run teuthology-2014-10-11_19:00:02-upgrade:dumpling-x-wip-9731-firefly-distro-basic-multi
Still jo...
- 02:44 PM Linux kernel client Bug #9192: krbd: poor read (about 10%) vs write performance
- I was able to get some dedicated test time on one of our Ceph test clusters to rerun the kernel RBD read/write tests ...
- 12:09 PM Bug #9752 (Resolved): acting in past intervals contains primary and up_primary (looks like duplic...
- In a 0.80.6 in the context of http://tracker.ceph.com/issues/9750 the following showed up (the full output can be fou...
- 11:20 AM Bug #9751: ceph tell osd.6 version hangs
- here is the gdb output of the OSD process that fails to answer to ceph tell
- 10:18 AM Bug #9751: ceph tell osd.6 version hangs
- attaching the log with lockdep = true, starting from when the osd boots up to the point where ceph tell blocks forever
- 10:10 AM Bug #9751: ceph tell osd.6 version hangs
- greg : the log is from when the osd started up to the point where ceph tell hangs
- 09:51 AM Bug #9751: ceph tell osd.6 version hangs
- Maybe similar to #9748 and #9714 ?
- 09:49 AM Bug #9751: ceph tell osd.6 version hangs
- Was the OSD already "hanging" when you generated this log?
- 09:21 AM Bug #9751 (Rejected): ceph tell osd.6 version hangs
- ...
- 10:05 AM Bug #9718 (Fix Under Review): osd_types: check_new_interval: min_size check needs to consider CRU...
- 09:36 AM Bug #9750: pg incomplete
- I guess it's not a bug indeed, only the logical outcome of something going wrong. What is probably a bug is having th...
- 09:32 AM Bug #9750 (Won't Fix): pg incomplete
- 09:07 AM Bug #9750: pg incomplete
- So you don't actually think it's a bug?
In any case, if you still have the disk accessible, you probably want to u...
- 08:54 AM Bug #9750: pg incomplete
- osd.3 has failed recently (this morning) because btrfs turned it read-only. it is likely that it contains the missing...
- 08:45 AM Bug #9750: pg incomplete
- What did you *do* to this cluster? I don't think these PGs are supposed to have historical acting sets that look like...
- 08:01 AM Bug #9750 (Won't Fix): pg incomplete
- ...
10/11/2014
- 10:33 PM Linux kernel client Bug #9749 (Resolved): kcephfs: kernel divide-by-zero crash in __validate_layout (fs/ceph/ioctl.c)
- Our UC-KLEE tool discovered a Linux kernel divide-by-zero crash in the Ceph
client driver. I found the bug on kernel...
- 03:52 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- Sage has scheduled run on wip-9731-firefly http://pulpito.front.sepia.ceph.com/teuthology-2014-10-10_16:50:01-upgrade...
- 03:35 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- pre-firefly mons I think would also suffice to cause this bug. Actually, if you upgrade the osds from pre-firefly to...
- 03:29 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- Can you rerun with wip-sam-firefly-testing? (actually, ignore the firefly branch for the moment and use wip-sam-firef...
- 01:05 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- I guess it's expected as backport is still pending.
Update:
In the run http://pulpito.front.sepia.ceph.com/teutho...
- 01:10 PM Bug #7588 (Pending Backport): OSD Seg fault in string assign ObjectOperation::C_ObjectOperation_c...
- This actually doesn't seem to have been backported to firefly. I think it might be causing some of the cache/tiering...
- 01:01 PM Bug #9748 (Rejected): Dead jobs in upgrade:dumpling-x-firefly-distro-basic-multi run
- Jobs '537916', '537917'
Logs are in:
http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-10_19:00:01-upgrade...
- 12:54 PM Bug #9714: Dead jobs in upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-multi run
- Same problem in run - http://pulpito.front.sepia.ceph.com/teuthology-2014-10-10_19:00:01-upgrade:dumpling-x-firefly-d...
- 09:27 AM devops Bug #9747 (Fix Under Review): ceph.spec.in will always use 95-ceph-osd-alt.rules
- 09:22 AM devops Bug #9747: ceph.spec.in will always use 95-ceph-osd-alt.rules
- https://github.com/ceph/ceph/pull/2706
- 09:12 AM devops Bug #9747 (Resolved): ceph.spec.in will always use 95-ceph-osd-alt.rules
- In ceph.spec.in *%if (0%{?rhel} || 0%{?rhel} < 7)* "see sources":https://github.com/ceph/ceph/blob/giant/ceph.spec.in...
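Why the quoted condition can never be false can be checked with a small Python sketch. It assumes the usual RPM idiom in which `0%{?rhel}` evaluates to 0 when the macro is undefined and to the release number otherwise; the function name `uses_alt_rules` is hypothetical and only mirrors the boolean expression, not the spec file itself.

```python
def uses_alt_rules(rhel):
    """Mirror `0%{?rhel} || 0%{?rhel} < 7`; rhel=0 stands for "macro undefined"."""
    return bool(rhel) or rhel < 7


# Undefined (0), older releases (5, 6) and newer ones (7, 8) all take the branch.
for rhel in (0, 5, 6, 7, 8):
    print(rhel, uses_alt_rules(rhel))
```

Every row prints `True`: a non-zero release satisfies the left operand, and an undefined macro satisfies `0 < 7`, which matches the ticket title.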
- 09:08 AM Bug #9746 (Resolved): reconcile upstream ceph.spec.in with other ceph.spec (SuSE, EPEL, etc)
- There are many differences between the "epel ceph.spec":https://dl.fedoraproject.org/pub/epel/7/SRPMS/c/ceph-0.80.5-8...
- 08:32 AM devops Bug #9721 (Rejected): partx -a should be called after creating the data partition
- The diagnostic is incorrect. At the time the data partition is created it does not make sense to try to activate it b...
10/10/2014
- 08:51 PM Bug #9716 (Resolved): Warning in API headers when compiling with -Wstrict-prototypes
- commit:d98b75530b0ea8f243a4dc8e1881bc6da2bca99d
- 02:10 PM Bug #9716 (Fix Under Review): Warning in API headers when compiling with -Wstrict-prototypes
- https://github.com/ceph/ceph/pull/2701
- 01:44 PM Bug #9716: Warning in API headers when compiling with -Wstrict-prototypes
- Forgot to mention that qemu uses -Werror by default, hence the errors.
- 08:45 PM Bug #8983 (Resolved): rados bench -b option does not take orders of magnitude (k,M,..) but also d...
- commit:3b9dcff7755a3ffcb9df8a06e6d0e525e77de641
- 02:13 PM Bug #8983: rados bench -b option does not take orders of magnitude (k,M,..) but also does not thr...
- https://github.com/ceph/ceph/pull/2678
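The behaviour the ticket title asks for (accept a size suffix, reject anything else loudly) can be sketched in Python as follows. This is not the change from the pull request above; `parse_block_size` is a hypothetical helper, and treating `k`, `M` and `G` as powers of 1024 is an assumption made for the example.

```python
import re

# Assumed multipliers; the real tool may define its units differently.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_block_size(text):
    """Return the size in bytes for strings such as "4096", "8k" or "4M".

    Raises ValueError instead of silently misreading malformed input.
    """
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", text.strip())
    if match is None:
        raise ValueError("invalid block size: %r" % text)
    number, unit = match.groups()
    return int(number) * _UNITS[unit.lower()]
```

For example, `parse_block_size("4M")` yields 4194304 under the assumed units, while `parse_block_size("4X")` raises `ValueError`.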
- 08:12 PM Bug #9143 (Rejected): Incorrect key sequence in encoding object name to key for GenericObjectMap
- 07:21 PM Bug #9731: Ceph 0.80.6 OSD crashes
- wip-firefly-9696-9731 should have a fix for this as well as 9696. Let me know whether that helps.
- 03:11 PM Bug #9731: Ceph 0.80.6 OSD crashes
- I think this is a bug in PGLog::IndexedLog::trim(). Making patch.
- 11:44 AM Bug #9731: Ceph 0.80.6 OSD crashes
- Attached ceph OSD log from crash with debugging turned on.
- 11:03 AM Bug #9731: Ceph 0.80.6 OSD crashes
- I will add those to my configuration and restart ceph on each node.
Luckily this is just my test environment.
- 10:47 AM Bug #9731: Ceph 0.80.6 OSD crashes
- Can you reproduce either of these with logging?
debug osd = 20
debug filestore = 20
debug ms = 1
- 09:45 AM Bug #9731 (Can't reproduce): Ceph 0.80.6 OSD crashes
- I received 2 different crashes on 2 different OSDs on different nodes within 30s of each other on 0.80.6. I just upgr...
- 06:41 PM Bug #9744: cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
- I think I found the problem: new node (with new OSD) had incorrect time.
Everything returned to normal after correct...
- 06:10 PM Bug #9744: cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
- Found the following in the logs of the new OSD:...
- 05:57 PM Bug #9744 (Won't Fix): cephx: verify_reply couldn't decrypt with error: error decoding block for ...
- Shortly after upgrade 0.80.5 to 0.80.6 cluster became slow and then almost completely stopped
with several OSDs exhi...
- 05:45 PM rgw Bug #9307 (Resolved): "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-fir...
- above errors from yuri are #9169.. something else
- 08:10 AM rgw Bug #9307: "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-firefly-x-mast...
- Same issues in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-09_19:00:01-upgrade:dumpling-x-firefly-distro-b...
- 05:00 PM Fix #9199 (In Progress): librados: watch linger pings need to verify pg mapping hasn't changed
- 04:59 PM Fix #9196 (In Progress): librados: watch_check() to synchronous verify we haven't missed notifies
- 04:59 PM Fix #8905: msgr: encode osd epoch in nonce to avoid misc OSD reconnect races
- 04:56 PM Bug #9706 (Fix Under Review): osdc/Objecter.cc: 1570: FAILED assert(op->session)
- 09:26 AM Bug #9706 (In Progress): osdc/Objecter.cc: 1570: FAILED assert(op->session)
- tick() locking is broken
- 04:55 PM rgw Bug #7796 (Pending Backport): RGW Keystone token auth fails with '411 Length Required' when Keyst...
- 04:11 PM CephFS Bug #9679: Ceph hadoop terasort job failure
- I do believe that Hadoop kills the clients after they reach a point that the run-time believes everything has been fl...
- 02:02 PM CephFS Bug #9679: Ceph hadoop terasort job failure
- Looking at the bad client (11139), the first thing I notice is that the messaging is way backed up. What's the networ...
- 09:13 AM CephFS Bug #9679: Ceph hadoop terasort job failure
- Here is the directory listing. All of the files should be the same size....
- 03:07 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
- commit:82175ec94acc89dc75da0154f86187fb2e4dbf5e
- 03:06 PM rbd Bug #9513 (Pending Backport): rbd_cache=true default setting is degading librbd performance ~10X ...
- 01:50 PM rbd Bug #9742 (Resolved): `rbd map lun` fails with: (2) No such file or directory on kernel 3.14.14 w...
- when trying to map a standard rbd image as a block device, the command fails with (2) No such file or directory.
I...
- 01:27 PM Feature #9741 (Closed): teuthology-suite: allow scheduling sub-suites
- 01:18 PM Bug #9443: btrfs pwrite returns EEXIST on journal FileJournal::write_bl
- Don't run on that kernel. :(
(My understanding is that they have a fix in testing and it shouldn't be in an actual r...
- 01:04 PM Bug #9443: btrfs pwrite returns EEXIST on journal FileJournal::write_bl
- Is there a workaround ?
- 01:03 PM Bug #9740 (Duplicate): FileJournal::do_write assert(0)
- 12:59 PM Bug #9740 (Duplicate): FileJournal::do_write assert(0)
- http://pulpito.ceph.com/loic-2014-10-10_08:45:20-rados:thrash-erasure-code-isa-master-testing-basic-vps/536207/
...
- 01:00 PM Bug #9696 (Pending Backport): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED as...
- 10:11 AM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- Ok, can you reproduce with the logging above?
- 01:09 AM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- Also, could either Loïc or Sam explain what exact combination of circumstances causes this assert to trigger? I can't...
- 12:59 AM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- Sam, I can confirm with certainty that this did *not* happen during an upgrade from dumpling. All nodes were running ...
- 12:09 PM Bug #9739 (Won't Fix): rados cli: listsnaps does not list snaps
- To reproduce:...
- 12:04 PM Bug #9738 (Won't Fix): rados cli: objects not present in a snapshot are listed anyway
- To reproduce:...
- 11:54 AM Bug #9737 (Resolved): rados cli: --snapid (not --snap) option is broken
- Running "rados --pool mypool --snapid 1 ls" (assuming 1 is a valid snap number) crashes without printing or returning...
- 11:34 AM RADOS Bug #9736 (New): rados cli doesn't print specific usage errors
- If a user executes e.g. "rados lssnap", the command prints out the usage information. However, it does not say that ...
- 11:23 AM Bug #9735 (Resolved): "rados lock list" doesn't output final end-of-line
- The rados command-line utility doesn't output an end-of-line character at the end of the output of the "lock list" co...
- 11:10 AM Bug #9729: "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x:parallel-giant-dis...
- David, is it related to some code you were working on? Please take a look and reassign if necessary.
- 09:06 AM Bug #9729 (Resolved): "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x:paralle...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-10_08:19:51-upgrade:dumpling-firefly-x:parallel-gi...
- 10:59 AM Feature #6258: ceph-disk: zap should wipefs
- A user in the #ceph-devel channel had issues, it wouldn't matter that he tried to zap the disk, the filesystem was st...
- 10:25 AM rbd Feature #9733 (New): Separate rbd listing into CAP
- We are concerned that if the key is compromised in our OpenStack environment, then all images in the pool can be list...
- 09:46 AM Bug #9732 (Resolved): ReplicatedPG::hit_set_trim osd/ReplicatedPG.cc: 11006: FAILED assert(obc)
- The timezone of the machine was incorrect: CDT instead of CEST. All other machines (MON and OSD) are on CEST.
On a ...
- 09:26 AM Documentation #9730 (Resolved): ceph-deploy mon create-inital, does not take arguments
- It uses the same hosts that were passed into `ceph-deploy new {HOSTS}`
But this section says the user should pas...
- 09:15 AM Feature #9728: erasure-code: jerasure support for NEON
- https://github.com/ceph/ceph/pull/2694
- 09:02 AM Feature #9728 (Resolved): erasure-code: jerasure support for NEON
- Work done by Janne Grunau @ https://github.com/jannau/ceph/compare/neon . It will be available in Hammer.
- 08:50 AM rbd Feature #8902: rbd mirroring: librbd: funnel snapshot, resize events via lock holder
- WIP branch: https://github.com/ceph/ceph/compare/wip-8902
- 08:46 AM Bug #9702: "MaxWhileTries: 'wait_until_healthy'reached maximum tries" in upgrade:firefly-x-giant-...
- Same in run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-08_19:20:02-upgrade:firefly-x-giant-distro-basic-m...
- 08:46 AM Bug #9703: "Segmentation fault" in upgrade:firefly-x-giant-distro-basic-multi run
- Same in run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-08_19:20:02-upgrade:firefly-x-giant-distro-basic-m...
- 08:45 AM devops Bug #9724: VPS machines not being locked "No route to host"
- Why does this keep happening?
- 08:44 AM devops Bug #9724: VPS machines not being locked "No route to host"
- Correct URL:
http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-09_19:00:01-upgrade:dumpling-x-firefly-distro-ba...
- 07:59 AM devops Bug #9724 (Rejected): VPS machines not being locked "No route to host"
- In the run http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-09_19:00:01-upgrade:dumpling-x-firefly-distro-basic...
- 08:27 AM Bug #9727 (Duplicate): 0.86 EC+ KV OSDs crashing
- Hi, testing our Tiering setup with EC+KV backend a bit further on the latest dev release, our OSDS started to crash a...
- 08:03 AM devops Bug #9725 (Won't Fix): Error "'sudo yum install ceph-radosgw-0.67.11 -y'"in upgrade:dumpling-x-fi...
- In run http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-09_19:00:01-upgrade:dumpling-x-firefly-distro-basic-vps...
- 07:18 AM CephFS Bug #9692 (Resolved): ACL workunit syntax error
- 07:05 AM rgw Feature #9723 (New): Support metering info
- Add object storage metering support similar to openstack swift ceilometer. It should be pluggable with openstack ce...
- 05:39 AM devops Bug #9721 (Fix Under Review): partx -a should be called after creating the data partition
- https://github.com/ceph/ceph/pull/2648 and https://github.com/dachary/ceph/commit/81d6c5b5a33de745ae4a23536409de0c0e7...
- 05:26 AM devops Bug #9721: partx -a should be called after creating the data partition
- 05:18 AM devops Bug #9721 (Rejected): partx -a should be called after creating the data partition
- In the following udev is racing with the creation of the partition:...
- 03:23 AM Feature #9720 (Resolved): erasure-code: non regression should test jerasure variants
- check that content encoded with one variant exactly matches content encoded with another variant
- 12:36 AM rgw Feature #8052: Support for Keystone Identity API v3
- From swift 2.2.0 changelog:
* Added support for Keystone v3 auth.
Keystone v3 introduced the concept ...
10/09/2014
- 05:14 PM Bug #9718 (Resolved): osd_types: check_new_interval: min_size check needs to consider CRUSH_ITEM_...
- 04:40 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- wip-9696-firefly removes the assert on firefly, it's not valid for the compat case.
- 04:32 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- https://github.com/ceph/ceph/pull/2684/files
- 04:31 PM Bug #9696 (Fix Under Review): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED as...
- 04:31 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- Can you restart one of the crashing osds with
debug osd = 20
debug filestore = 20
debug ms = 1 ?
As far as we...
- 04:26 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- 04:25 PM Bug #9696 (Fix Under Review): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED as...
- https://github.com/ceph/ceph/pull/2684
- 04:20 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- 04:16 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- running in gitbuilder under the branch wip-9696-compat-acting
- 03:56 PM Bug #9696 (Fix Under Review): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED as...
- 03:38 PM Bug #9696 (In Progress): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(...
- https://github.com/ceph/ceph/pull/2682
- 03:01 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- It actually failed a new test case AFTER it went out into a stable release version.
- 02:28 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- Whoa, wait -- Loïc, are you saying this actually failed a test case and still made it into a release in a stable vers...
- 02:23 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
- For the record http://tracker.ceph.com/issues/9715 hits the same assert in similar conditions in teuthology and the f...
- 03:46 PM rgw Bug #7796 (Fix Under Review): RGW Keystone token auth fails with '411 Length Required' when Keyst...
- 03:32 PM Bug #9715: assert(want_acting_backfill.size() - want_backfill.size() == num_want_acting) firefly
- sjust: I think it's due to the compatibility thing where we include the backfill peer in the acting set if there are ...
- 03:11 PM Bug #9715: assert(want_acting_backfill.size() - want_backfill.size() == num_want_acting) firefly
- I see the change (92cfd370) that added the assert and didn't consider "compat_mode." In older OSDs we only have one ...
- 02:20 PM Bug #9715 (Duplicate): assert(want_acting_backfill.size() - want_backfill.size() == num_want_acti...
- 02:08 PM Bug #9715: assert(want_acting_backfill.size() - want_backfill.size() == num_want_acting) firefly
- " assert(want_acting_backfill.size() - want_backfill.size() == num_want_acting);":https://github.com/ceph/ceph/blob/f...
- 09:51 AM Bug #9715 (Duplicate): assert(want_acting_backfill.size() - want_backfill.size() == num_want_acti...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-08_19:30:01-upgrade:dumpling-firefly-x:stress-spli...
- 03:12 PM Bug #8983 (Fix Under Review): rados bench -b option does not take orders of magnitude (k,M,..) bu...
- 02:26 PM devops Bug #9712: ceph.com is not accessible from IPv6 only environments
- Works for me (native v6 ip).
Thanks!
- 12:23 PM devops Bug #9712 (Resolved): ceph.com is not accessible from IPv6 only environments
- This is me. Somehow the ipv6 ip address became unassigned to the ceph.com dedicated server in DH's database. Not sure...
- 10:29 AM devops Bug #9712: ceph.com is not accessible from IPv6 only environments
- Apologies if somebody else should be handling this, but I think it's yours? :)
- 07:48 AM devops Bug #9712 (Resolved): ceph.com is not accessible from IPv6 only environments
- Some time ago we found that we can't connect to ceph.com:443. Moreover, we can't ping it either....
- 01:14 PM Bug #9711 (Duplicate): 'cache' osd crash on ceph 0.86
- 01:29 AM Bug #9711 (Duplicate): 'cache' osd crash on ceph 0.86
- In a tiering setup with cache + EC on KV, one cache OSD crashed after about 12 hours of testing with rados bench.
Stackt...
- 01:13 PM Bug #9480: OSD is crashing while object deletion
- It's back http://tracker.ceph.com/issues/9711
- 12:07 PM CephFS Bug #9679: Ceph hadoop terasort job failure
- empty fs:...
- 08:21 AM CephFS Bug #9679: Ceph hadoop terasort job failure
- Thanks Huamin. Yeah, it looks like some writes are being lost, probably due to an unclean shutdown. I'll get some trac...
- 08:06 AM CephFS Bug #9679: Ceph hadoop terasort job failure
- For comparison, teragen files on CephFS
./hadoop/bin/hadoop fs -ls /in-dir-3
14/10/09 08:05:05 WARN util.NativeC...
- 07:04 AM CephFS Bug #9679: Ceph hadoop terasort job failure
- Ran the same tests on HDFS 2.4.1, though on a different setup. Terasort finished without any problem.
Cmd:
./hado...
- 11:46 AM rbd Feature #2467 (In Progress): qemu: implement bdrv_invalidate_cache
- Patch sent to qemu-devel@nongnu.org.
- 10:50 AM Bug #9716 (Resolved): Warning in API headers when compiling with -Wstrict-prototypes
- Configuring qemu fails because of the compile errors:...
- 09:40 AM Bug #9714 (Duplicate): Dead jobs in upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-mu...
- Run http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-08_19:30:01-upgrade:dumpling-firefly-x:stress-split-giant-...
- 09:30 AM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
- Thanks for the update, Ilya! You actually gave me a hint as to a workaround - run the container with `--net host` so ...
- 08:59 AM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
- The...
- 09:23 AM rbd Feature #8902 (In Progress): rbd mirroring: librbd: funnel snapshot, resize events via lock holder
- ... also include flatten.
- 09:22 AM rbd Feature #8900 (In Progress): rbd mirroring: librbd:making image locking mandatory
- 08:28 AM Bug #9610: Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)...
- 08:11 AM Bug #9610: Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)...
- Still an issue: http://pulpito.front.sepia.ceph.com/teuthology-2014-10-08_23:20:03-multi-version-giant-distro-basic-m...
- 08:16 AM rgw Bug #9612 (New): "ERROR: test suite for <module 's3tests.functional'" in multi-version-giant-test...
- Still an issue: http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-08_23:20:03-multi-version-giant-distro-basic-m...
- 08:09 AM Bug #9705 (Duplicate): "RadosModel.h: 829: FAILED assert(0)" in multi-version-giant-distro-basic-...
- 07:06 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
- I've just updated pull request 2419 with a more complete fix for the issue.
I was now able to reproduce 100% when my...
- 02:52 AM Bug #9327: Usability Issue: Ceph-deploy does not print all the commands which it is executing
- This issue is seen in the 0.84 build; can we cross-check once?
- 02:30 AM Feature #9420 (Fix Under Review): erasure-code: tools and archive to check for non regression of ...
- The gitbuilders have been updated, it is ready for review.
- 12:28 AM Bug #9077: Cluster is up in MON node even if Ceph is uninstalled in OSD node
- What will be the state of the OSD in a 3-node cluster? In a 3-node cluster there will be other OSDs running on other nodes, so...
10/08/2014
- 11:08 PM CephFS Bug #9679: Ceph hadoop terasort job failure
- missing one of these?...
- 10:46 PM CephFS Bug #9679: Ceph hadoop terasort job failure
- My bet at this point is on the generation of the input data set. Teragen creates a file with X 100-byte entries. When ...
- 07:57 PM Bug #9559: ?off-by-one vulnerability?ceph-0.80.5/src/common/fd.cc dump_open_fds() function
- Please give me a CVE ID, thanks.
- 04:23 PM Bug #9630 (Need More Info): osd: leaked pg refs on shutdown (dumpling)
- I'm out of ideas but happy to keep exploring if someone has a lead. If this happens again cross referencing the logs ...
- 04:12 PM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
- "OSD::shutdown":https://github.com/ceph/ceph/blob/dumpling/src/osd/OSD.cc#L1521 clear() the "finished":https://github...
- 03:21 PM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
- ...
- 01:53 PM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
- The last thing that happened to pg 2.15 was...
- 01:35 PM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
- Log lines related to pg 2.15...
- 11:38 AM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
- It could not be in_progress_splits: the logs do not contain the word *split*
- 11:18 AM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
- It is the same assert as http://tracker.ceph.com/issues/7891 but the PGBackend did not exist at the time, therefore t...
- 11:03 AM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
- The osd.2 actually crashed with:...
- 10:07 AM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
- The full valgrind report from remote/vpm180/log/valgrind/osd.2.log.gz...
- 03:58 PM Bug #8595: osd: client op blocks until backfill starts (dumpling)
- It seems that we need to backport the update_range/scan_range changes (intended to avoid backfill related flushes) fr...
- 01:17 PM Bug #8595: osd: client op blocks until backfill starts (dumpling)
- I think the least distasteful solution is to actually backport the last_backfill_started modifications. I'll start t...
- 03:38 PM rbd Feature #7272 (Duplicate): rbd: import performance
- 03:14 PM rbd Feature #7272: rbd: import performance
- Reads are single-threaded, but writes are asynchronous, so multiple could be in flight at once. (In rbd.cc, do_impor...
- 11:53 AM rbd Bug #9513 (Fix Under Review): rbd_cache=true default setting is degading librbd performance ~10X ...
- 11:06 AM Bug #9496 (Resolved): mon: pg scrub timestamps must be populated at pg creation
- 11:04 AM Bug #9128 (Resolved): Newly-restarted OSD may suicide itself after hitting suicide time out value...
- 11:04 AM Bug #9419 (Pending Backport): dumpling->firefly upgrade, sending setallochint?
- 10:59 AM rbd Feature #8900: rbd mirroring: librbd:making image locking mandatory
- WIP branch: https://github.com/ceph/ceph/compare/wip-8900
- 09:37 AM rbd Bug #9642: Errors in test_rbd.test_* tests in upgrade:dumpling-firefly-x:parallel-giant-distro-ba...
- 08:49 AM Bug #9706 (Resolved): osdc/Objecter.cc: 1570: FAILED assert(op->session)
- This was actually on wip-sam-testing, but does not appear related to any of the patches.
ubuntu@teuthology:/a/samu...
- 08:43 AM rgw Bug #9307 (New): "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-firefly-...
- I see the same issues on giant run:
http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-07_15:54:57-upgrade:dum...
- 08:31 AM Bug #9705 (Duplicate): "RadosModel.h: 829: FAILED assert(0)" in multi-version-giant-distro-basic-...
- Looks similar to #9528 (no root issue mentioned)
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-1...
- 08:05 AM Bug #9703 (Resolved): "Segmentation fault" in upgrade:firefly-x-giant-distro-basic-multi run
- I see coredump on mira076 client.1 (@*/531751/remote/mira076/@), but could not get any info about it.
Logs are in ...
- 08:00 AM Bug #9702 (Duplicate): "MaxWhileTries: 'wait_until_healthy'reached maximum tries" in upgrade:fire...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-06_19:20:01-upgrade:firefly-x-giant-distro-basic-m...
- 07:56 AM Bug #9700: cephtool mon_osd intermittent failure
- Waiting about a week to see if it shows up again.
- 07:30 AM Bug #9700 (Fix Under Review): cephtool mon_osd intermittent failure
- https://github.com/ceph/ceph/pull/2670...
- 07:23 AM Bug #9700: cephtool mon_osd intermittent failure
- The osd 1 goes down during the following. Reading the script and what it does I can imagine why. Unless osd.1 dies be...
- 06:57 AM Bug #9700: cephtool mon_osd intermittent failure
- ENXIO is expected when ceph tell tries to join an osd that is not ready and it should be treated as EAGAIN. If it hap...
- 06:11 AM Bug #9700 (Resolved): cephtool mon_osd intermittent failure
- Hit this one time on a gitbuilder: it's not clear to me why we have a 5-time retry here: some timeout raciness in t...
- 07:28 AM CephFS Feature #9437 (Resolved): make 'ceph tell mds.* ...' work, deprecate 'ceph mds tell * ...'
- ...
- 05:29 AM devops Support #8861: Deploying additional monitors fails.
- My workaround for that was to declare all monitors before install, and install all monitors at once. Pretty sure if I ne...
- 02:39 AM devops Support #8861: Deploying additional monitors fails.
- As per my update in #5195:
Same here. I have run through the latest quick start documentation and am using Ubuntu ...
- 05:16 AM devops Bug #9697 (Rejected): exitcode of gatherkeys has changed the latests versions
- Hi,
We've been using ceph-deploy in a deployment component, and we also use the gatherkeys function.
In some earl...
- 02:37 AM Bug #5195: "ceph-deploy mon create" fails when adding additional monitors
- Same here. I have run through the latest quick start documentation and am using Ubuntu 14.04.1 and Ceph firefly with ...
- 02:15 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
- Finally I can reproduce it! I know, I've already said that and was wrong...
Actually I still don't manage with the ...
- 12:35 AM Bug #9696 (Resolved): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(wan...
- After an upgrade from 0.80.5 to 0.80.6, almost *all* OSDs went down after hitting the following failed assertion:
...
- 12:31 AM Bug #9408 (In Progress): erasure-code: misalignment
- 12:15 AM Bug #9677 (Resolved): osd_disk_thread_ioprio_class is ignored
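Note on Bug #9700 (06:57 AM comment above): treating ENXIO as a transient, EAGAIN-like condition can be sketched as a bounded retry loop. This is a minimal Python sketch under stated assumptions, not the code from the pull request; `retry_on_enxio`, its parameters, and the default of five attempts (borrowed from the 5-time retry mentioned in the original report) are hypothetical names and values chosen for illustration.

```python
import errno
import time


def retry_on_enxio(call, attempts=5, delay=1.0, sleep=time.sleep):
    """Run call() and retry while it fails with ENXIO (hypothetical helper).

    ENXIO is treated like EAGAIN: the target may simply not be ready yet.
    Any other error, or ENXIO on the final attempt, is re-raised unchanged.
    """
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except OSError as exc:
            # Only ENXIO is considered transient in this sketch.
            if exc.errno != errno.ENXIO or attempt == attempts:
                raise
            sleep(delay)
```

A caller would wrap whatever operation may hit a not-yet-ready daemon in a zero-argument callable and pass it as `call`; the `sleep` parameter exists only so the wait can be stubbed out in tests.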
10/07/2014
- 10:01 PM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
- Josh Durgin wrote:
> The issue was reported on firefly - does it have the same behavior as master, or is there somet...
- 05:48 PM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
- The issue was reported on firefly - does it have the same behavior as master, or is there something that should be ba...
- 05:30 PM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
- Here is the response from the gateway:...
- 12:56 PM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
- Maybe the problem is that we don't send the xml body with the appropriate error?
- 12:52 PM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
- I have added a test to s3-test to check for EntityTooSmall and it *passes* on the current code. According to AWS an ...
- 09:50 PM Feature #7104: rest-api: support commands requiring 'w' cap without 'rw' cap
- I'm happy to redefine the permissions if and when that becomes an option/requirement, but until then, it seems like t...
- 09:42 PM Feature #7104: rest-api: support commands requiring 'w' cap without 'rw' cap
- The immediate issue was resolved by switching it to rw (or so my code check and utter lack of memory tells me). But I...
- 09:28 PM Feature #7104: rest-api: support commands requiring 'w' cap without 'rw' cap
- Well, hang on a minute...the question is about the nature of the command, which is totally mds-specific, not rest-api...
- 07:07 AM Feature #7104: rest-api: support commands requiring 'w' cap without 'rw' cap
- I don't know that this is still a bug, but since it was a REST api issue I don't think it belongs in the MDS tracker ...
- 07:28 PM CephFS Bug #9692 (Resolved): ACL workunit syntax error
- http://pulpito.ceph.com/gregf-2014-10-06_19:59:42-kcephfs-wip-9628-testing-basic-multi/531900...
- 07:26 PM CephFS Bug #9628 (Resolved): mds: race between ms_handle_accept() and ms_handle_reset()
- Merged to master in commit:1b7fae7b2953649564a9e226b4abedad0ce652cc
- 05:51 PM rbd Bug #9513 (In Progress): rbd_cache=true default setting is degading librbd performance ~10X in Giant
- The regression was introduced in commit 4fc9fffc494abedac0a9b1ce44706343f18466f1 (according to git bisect). This is ...
- 04:33 PM RADOS Bug #9606: mon: ambiguous error_status returned to user when type is wrong in a command
- regardless of this being properly parsed on the client or not, the monitor should not rely on client argument validat...
- 04:28 PM Bug #9496 (Fix Under Review): mon: pg scrub timestamps must be populated at pg creation
- https://github.com/ceph/ceph/pull/2663
also in wip-sam-testing
- 01:46 PM Bug #9496: mon: pg scrub timestamps must be populated at pg creation
- Kind of odd, last scrub timestamp should never be 0.
- 02:35 PM Bug #9416 (Duplicate): ods crash in upgrade:dumpling-dumpling-distro-basic-vps run
- 02:35 PM Fix #9689 (New): ceph df reports % of global size used instead of MAX AVAIL 0.80.6
- Running the ceph df command returns a much lower %USED than expected. Instead of reporting %USED of MAX AVAIL, which ...
- 02:33 PM Bug #9503 (Pending Backport): Dumpling: removing many snapshots in a short time makes OSDs go ber...
- 01:57 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
- https://github.com/ceph/ceph/pull/2659
- 02:08 PM Bug #9203 (Resolved): ceph_test_rados: ObjectDesc::iterator::advance(bool): Assertion `pos < limi...
- 11:25 AM Bug #9203 (Fix Under Review): ceph_test_rados: ObjectDesc::iterator::advance(bool): Assertion `po...
- 02:06 PM Bug #9113 (Pending Backport): osd: snap trimming eats memory, linearly
- 11:25 AM Bug #9113 (Fix Under Review): osd: snap trimming eats memory, linearly
- 02:04 PM Bug #9626 (Pending Backport): PG: cancel backfill reservations if we get a cancel during backfill
- 11:25 AM Bug #9626 (Fix Under Review): PG: cancel backfill reservations if we get a cancel during backfill
- 01:57 PM Bug #7368: ceph osd repair * blocks after some minutes and prevent other ceph pg repair commands
- Loic, if I understand correctly, #9566 is "normal" backfilling, and Sage's explanation is clear. In my case, I had lo...
- 01:40 PM Bug #7368 (Can't reproduce): ceph osd repair * blocks after some minutes and prevent other ceph p...
- 01:51 PM Bug #9467 (Won't Fix): Delete default erasure coded profile getting succeeded
- This looks like exactly how it is supposed to work?
- 01:49 PM Bug #9434: rbd rm hangs
- Are your pgs clean? (ceph -s)
- 01:43 PM Bug #9551 (Duplicate): "Segmentation fault" in upgrade:firefly-firefly-testing-basic-vps run
- 01:42 PM Bug #2848 (Won't Fix): OSDMap: pool_id is 64-bit, but pool_max is 32-bit
- 01:32 PM Bug #9515 (Duplicate): "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:...
- Ok, hopefully fixed this time.
- 01:28 PM Bug #8822 (Resolved): osd: hang on shutdown, spinlocks
- 01:25 PM Bug #9181: Osd: segv in OpTracker::unregister_inflight_op
- Somnath Roy wrote:
> Sam,
> This core is different and happening on Firefly. The other optracker fixes should also ...
- 01:25 PM Bug #9181: Osd: segv in OpTracker::unregister_inflight_op
- Sam,
This core is different and happening on Firefly. The other optracker port should also be backported to Firefly ...
- 01:15 PM Bug #9181 (Resolved): Osd: segv in OpTracker::unregister_inflight_op
- I think this got fixed with the other optracker fix?
- 01:22 PM Bug #9661 (Resolved): ceph_objectstore_tool doesn't work with memstore
- 6067f295e7bc571b43aa891f5560d96933721b19
- 01:20 PM Bug #9682 (Duplicate): "os/FileJournal.cc: 1677: FAILED assert(0)" in upgrade:firefly-firefly-dis...
- 10:55 AM Bug #9682 (Duplicate): "os/FileJournal.cc: 1677: FAILED assert(0)" in upgrade:firefly-firefly-dis...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-05_10:00:04-upgrade:firefly-firefly-distro-basic-m...
- 01:19 PM Bug #9683 (Duplicate): "Segmentation fault" in upgrade:firefly-firefly-distro-basic-multi run
- 10:58 AM Bug #9683 (Duplicate): "Segmentation fault" in upgrade:firefly-firefly-distro-basic-multi run
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-05_10:00:04-upgrade:firefly-firefly-distro-basic-m...
- 01:16 PM Bug #8333 (Can't reproduce): ceph_test_rados_delete_pools_parallel: Received fewer notifies than ...
- 01:12 PM Bug #9128: Newly-restarted OSD may suicide itself after hitting suicide time out value because it...
- 01:10 PM Bug #9322 (In Progress): OSDMap updates from pgmap can be delayed indefinitely
- 01:10 PM Bug #9321 (In Progress): pgmap updates from OSDMap can be delayed indefinitely
- 01:09 PM Bug #6101 (Can't reproduce): ceph-osd crash on corrupted store
- 12:28 PM Bug #9582 (Resolved): librados: segmentation fault on timeout
- I believe all patches affecting firefly and dumpling have been backported.
- 11:41 AM Bug #9582 (Pending Backport): librados: segmentation fault on timeout
- 11:42 AM Bug #9650 (Resolved): RWTimer cancel_event is racy
- 11:29 AM Bug #8520 (Can't reproduce): osd: segv in PushOp::print()
- 11:28 AM Bug #9008 (Pending Backport): Objecter: pg listing can deadlock when throttling is in use
- 11:28 AM Bug #9417 (Duplicate): "Segmentation fault" in upgrade:dumpling-giant-x-master-distro-basic-vps run
- 11:26 AM Bug #9614: PG stuck with remapped
- 11:03 AM Bug #9684 (Can't reproduce): "Scrubbing terminated" in upgrade:firefly-firefly-distro-basic-multi...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-05_10:00:04-upgrade:firefly-firefly-distro-basic-m...
- 10:28 AM rbd Bug #9642: Errors in test_rbd.test_* tests in upgrade:dumpling-firefly-x:parallel-giant-distro-ba...
- Should be fixed by https://github.com/ceph/ceph-qa-suite/pull/169
- 09:54 AM Messengers Fix #9678 (Rejected): errno shadowed in Pipe.cc
- 07:41 AM Messengers Fix #9678: errno shadowed in Pipe.cc
- If it is expected to see an error message when there is nothing to read, then I was mistaken.
Not retrieving the ...
- 07:13 AM Messengers Fix #9678: errno shadowed in Pipe.cc
- Where's it being reset? That error message is admittedly strange but it actually happens because the underlying funct...
- 05:54 AM Messengers Fix #9678 (Rejected): errno shadowed in Pipe.cc
- In some places errno is used after it has been reset and the original error code does not show in the message. For in...
- 09:54 AM rbd Bug #6926 (Resolved): rbd: diff output includes previously non-existent objects as zeroed extents
- commit:9a1ab95176fe4d200a83b7b4f7e2b3097d541a7a
- 09:54 AM CephFS Bug #9679: Ceph hadoop terasort job failure
- https://issues.apache.org/jira/browse/MAPREDUCE-2018
- 09:53 AM CephFS Bug #9679: Ceph hadoop terasort job failure
- https://svn.apache.org/repos/asf/hadoop/common/branches/MAPREDUCE-233/src/examples/org/apache/hadoop/examples/terasor...
- 09:39 AM CephFS Bug #9679: Ceph hadoop terasort job failure
- Teragen command:
./hadoop/bin/hadoop jar ./hadoop-2.4.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar t...
- 09:22 AM CephFS Bug #9679: Ceph hadoop terasort job failure
- Thanks for adding this. What command did you use to generate the input?
- 09:04 AM CephFS Bug #9679 (Closed): Ceph hadoop terasort job failure
- Hadoop version: 2.4.1
Ceph version:
ceph --version
ceph version 0.85-986-g031ef05 (031ef0551ebc98d824075558e884...
- 09:36 AM rbd Bug #9680 (Duplicate): Errors in test_rbd.* in upgrade:dumpling-firefly-x:parallel-giant-distro-b...
- 09:28 AM rbd Bug #9680 (Duplicate): Errors in test_rbd.* in upgrade:dumpling-firefly-x:parallel-giant-distro-b...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-06_17:20:35-upgrade:dumpling-firefly-x:parallel-gi...
- 09:32 AM Linux kernel client Bug #4689: libceph: don't have alloc_msg methods limit length
- Related to #9560, #9561?
- 09:28 AM rbd Bug #5768 (Resolved): rbd-fuse: leak in enumerate_images()
- commit:9132ca47959ae1a9a658971b0c8f4fe6e8d0cad3
- 09:26 AM rbd Bug #7385: Objectcacher setting max object counts too low
- 09:24 AM rbd Bug #9391 (Need More Info): fio rbd driver rewrites same blocks
- 09:24 AM rbd Bug #9146 (Can't reproduce): EPERM from image_read.sh
- 09:22 AM rbd Bug #9602 (Need More Info): rbd export -> nc ->rbd import = memory leak
- 09:20 AM rgw Bug #9254 (Fix Under Review): rgw: civetweb requires explicit \r\n for http headers
- 09:15 AM rgw Bug #9039 (Pending Backport): Using COPY on radosgw to copy object from one bucket to another tha...
- 09:13 AM rgw Bug #8587 (Pending Backport): rgw: subuser object not created correctly
- 09:13 AM rgw Bug #9155 (Resolved): Swift Subuser - 403 Forbidden - during upload/post
- Fixed (#8587)
- 09:11 AM rgw Bug #5595 (Fix Under Review): object has a Content-Type, but its content_type property is not sho...
- 09:09 AM Bug #9677 (Pending Backport): osd_disk_thread_ioprio_class is ignored
- 05:10 AM Bug #9677: osd_disk_thread_ioprio_class is ignored
- https://github.com/ceph/ceph/pull/2654
- 04:48 AM Bug #9677 (Resolved): osd_disk_thread_ioprio_class is ignored
- The "osd_disk_thread_ioprio_class configuration option":http://ceph.com/docs/giant/rados/configuration/osd-config-ref...
- 09:00 AM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
- 08:54 AM Bug #9635 (Resolved): mon/Paxos.cc: 1033: FAILED assert(mon->is_leader())
- 07:03 AM CephFS Bug #9636 (Duplicate): segfault in CInode::get_caps_allowed_for_client
- 07:02 AM CephFS Bug #9562 (Resolved): Lockdep assertion in Filer purge
- Backported to giant:...
- 07:02 AM CephFS Bug #8576 (Need More Info): teuthology: nfs tests failing on umount
- 06:50 AM Bug #6756: journal full hang on startup
- ...
- 06:37 AM Bug #6003: journal Unable to read past sequence 406 ...
- ...
- 06:32 AM Bug #9418 (Pending Backport): mon: drop internal-purpose messages from clients without proper caps
- 03:55 AM Bug #9077: Cluster is up in MON node even if Ceph is uninstalled in OSD node
- as per this document "http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/", that mon will get 3...
- 03:53 AM Bug #9077: Cluster is up in MON node even if Ceph is uninstalled in OSD node
- as per this document "http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/", that mon will get 3...
- 01:15 AM Bug #9676 (Resolved): disk thread ioprio class misses osd
- 01:10 AM Bug #9676 (Fix Under Review): disk thread ioprio class misses osd
- https://github.com/ceph/ceph/pull/2653
- 01:06 AM Bug #9676 (Resolved): disk thread ioprio class misses osd
- http://ceph.com/docs/master/rados/configuration/osd-config-ref/
- 12:28 AM Bug #9675: splitting a pool doesn't start when rule_id != ruleset_id
- Sorry for formatting... should be like this:...
- 12:24 AM Bug #9675 (Resolved): splitting a pool doesn't start when rule_id != ruleset_id
- commit:78e84f34da83abf5a62ae97bb84ab70774b164a6
Dumpling 0.67.10
Rule is like this:
{ "rule_id": 6,
...
10/06/2014
- 10:53 PM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
- Note that we only see the debug output when we're trying to write to the RBD bus directly on the host - from within t...
- 10:32 PM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
- Here's some debugging after disabling auth.
As root on the CoreOS host, echoing directly into the RBD bus also doe...
- 10:04 PM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
- For posterity, recording my conversation with Josh here. http://irclogs.ceph.widodh.nl/index.php?date=2014-09-04
<...
- 10:03 PM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
- Seeing the same issue on a 3.16.2 kernel: ...
- 08:55 PM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
- A fellow member of the CoreOS community is also running into this: https://groups.google.com/forum/#!topic/coreos-use...
- 06:27 PM CephFS Bug #9674: nightly failed multiple_rsync.sh
- rsync return codes aren't standard error codes. The man page says that 23 means...
- 05:59 PM CephFS Bug #9674: nightly failed multiple_rsync.sh
- #define ENFILE 23 /* File table overflow */
maybe we should adjust ulimit
- 02:23 PM CephFS Bug #9674 (Resolved): nightly failed multiple_rsync.sh
- http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-03_23:04:01-fs-giant-distro-basic-multi/527949/...
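The two comments on Bug #9674 above disagree about what the number 23 means: one reads it as the errno value ENFILE, the other as an rsync-specific exit status. A minimal Python sketch of that distinction follows. The helper names are hypothetical, and the child process is a stand-in that simply exits with 23; it is not rsync, and the sketch does not try to reproduce what the rsync man page says about that code.

```python
import errno
import os
import subprocess
import sys


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this platform."""
    return errno.errorcode.get(code, "unknown"), os.strerror(code)


def exit_status_of(argv):
    """Run a command and return its exit status (application-defined, not errno)."""
    return subprocess.run(argv).returncode


# Stand-in child process that exits with status 23 (assumption: not rsync itself).
child = [sys.executable, "-c", "import sys; sys.exit(23)"]
status = exit_status_of(child)

# The same integer looked up as an errno gives a kernel-level meaning that the
# child never reported; its exit status is defined by the program's own docs.
name, message = describe_errno(errno.ENFILE)
print("exit status:", status)
print("errno %d is %s (%s)" % (errno.ENFILE, name, message))
```

The point of the sketch is only that an exit status has to be looked up in the program's own documentation before concluding that a limit such as ulimit needs adjusting.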
- 05:52 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
- Good to know that you are able to reproduce this.
I think the log entries you mentioned are there in Firefly as well...
- 03:09 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
- I double-checked, and I had multiple versions of librbd on my path. (I forgot about installing one of them.) I remo...
- 01:14 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
- Hopefully, you made sure the librbd/librados libraries fio_rbd is loading are from the giant. As I said, replacing th...
- 11:55 AM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
- Thanks for the details. Unfortunately, I'm still unable to reproduce the issue. Was your cluster created in Firefly...
- 03:52 PM devops Bug #9658 (Resolved): upgrade from dumpling to firefly is broken
- 03:06 PM devops Bug #9658: upgrade from dumpling to firefly is broken
- ubuntu@teuthology:/a/teuthology-2014-10-06_14:06:56-upgrade:dumpling-firefly-x:parallel-wip-9658-firefly-distro-basic...
- 12:51 PM devops Bug #9658: upgrade from dumpling to firefly is broken
- ubuntu@teuthology:/a/teuthology-2014-10-06_10:31:05-upgrade:dumpling-firefly-x:parallel-wip-9658-distro-basic-vps/529856
- 10:12 AM devops Bug #9658: upgrade from dumpling to firefly is broken
- sure, am testing it now
- 09:55 AM devops Bug #9658: upgrade from dumpling to firefly is broken
- Tamil, this should be fixed in the wip-9658 branch.. can you test please? The firefly backport will be a bit differe...
- 09:48 AM devops Bug #9658: upgrade from dumpling to firefly is broken
- 03:23 PM Bug #9203: ceph_test_rados: ObjectDesc::iterator::advance(bool): Assertion `pos < limit' failed.
- 02:54 PM Bug #9419: dumpling->firefly upgrade, sending setallochint?
- Notes on using feature bits already present. The problem is that CEPH_FEATURE_MSGR_KEEPALIVE2 was back ported, so we...
- 02:17 PM Bug #9385 (Duplicate): ceph_test_rados: incorrect buffer at pos ...
- 01:49 PM Documentation #9673 (Closed): Document ceph df numbers
- We need to just write down what they mean. It's one of the first questions, and it's one of the hardest ones to answ...
- 11:29 AM Bug #9664 (Rejected): mon: ceph osd metada failure on centos7
- it fails because the dockerized centos has a fake systemd http://jperrin.github.io/centos/2014/09/25/centos-docker...
- 11:02 AM Bug #9657 (Resolved): MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
- No backport is needed; this is done. (commit:25bcc39bb809e2d13beea1529e4ab92d1b61fa5b)
- 09:56 AM devops Tasks #9669 (Resolved): teuthology.front needs an upgrade
- We need a newer libvirt version on the machine, and Ubuntu precise just doesn't contain what we need. I can't even se...
- 09:39 AM devops Bug #9654 (Duplicate): "error: subprocess paste was killed by signal (Broken pipe)" in upgrade:du...
- 09:39 AM devops Bug #9656: Remove conditional statement in ceph-radosgw startup script log section
- Hmm yeah. I think the better solution would be to fix the /var/log/ceph (/var/log/radosgw?) permissions so that log ...
- 09:37 AM Bug #9663 (Resolved): Objecter assertion failure
- 09:34 AM devops Bug #9667 (Duplicate): Missing packages in upgrade:dumpling-firefly-x:parallel-giant-distro-basic...
- #9658
- 08:45 AM devops Bug #9667 (Duplicate): Missing packages in upgrade:dumpling-firefly-x:parallel-giant-distro-basic...
- This looks like a dupe of #9640 but a different version.
Job: http://qa-proxy.ceph.com/teuthology/teuthology-2014-...
- 09:33 AM Bug #9668 (Rejected): osd killed by ABRT from FAILED assert
- this is almost certainly either the max files ulimit (ulimit -n , see max open files = ... in ceph.conf) or /proc/sys...
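The comment above points at the open-files limit (ulimit -n) as a likely cause. A minimal sketch for inspecting that limit from Python on a Unix host follows. The helper names are hypothetical, the /proc/self/fd lookup assumes Linux, and the sketch reports on the Python process itself rather than on a running ceph-osd.

```python
import os
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def fds_in_use():
    """Count descriptors currently open in this process (Linux only), else None."""
    fd_dir = "/proc/self/fd"
    if not os.path.isdir(fd_dir):
        return None
    return len(os.listdir(fd_dir))


soft, hard = open_file_limits()
print("soft limit:", soft, "hard limit:", hard)
print("descriptors in use:", fds_in_use())
```

Comparing the soft limit with the number of descriptors in use gives a rough idea of how much headroom a process has before open() starts failing.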
- 09:00 AM Bug #9668 (Rejected): osd killed by ABRT from FAILED assert
- -----------
[Mon Oct 6 15:09:03 2014] init: ceph-osd (ceph/46) main process (3058) killed by ABRT signal
---------...
- 09:32 AM Bug #9655 (Resolved): tests: qa/workunits/cephtool/test.sh fails ENXIO
- 08:38 AM devops Fix #9666 (Resolved): ceph-disk error when activate is missing an argument is cryptic
- When the device argument is missing:...
- 08:14 AM devops Bug #9665: ceph-disk zap should call partprobe
- https://github.com/ceph/ceph/pull/2648
- 07:43 AM devops Bug #9665 (Resolved): ceph-disk zap should call partprobe
- h3. User description
Symptoms:
* A disk is used by an OSD
* The OSD is no longer useful and the disk is clear...
- 08:09 AM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
- I believe the fact that the commit message for 255b430a87201c7d0cf8f10a3c1e62cbe8dd2d93 said @Backfill@ where it shou...
- 08:04 AM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
- Hi Sam,
I think this is fixed in master/giant.. correct? Just a gentle reminder that we'd appreciate a backport in d...
- 08:04 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- Hi Sam,
Same as for #9503
I think this is fixed in master/giant.. correct? Just a gentle reminder that we'd appreci...
- 12:51 AM RADOS Bug #9492 (Resolved): Crush Mapper crashes when number of replicas is less than total number of o...
10/05/2014
- 01:12 PM Bug #9663: Objecter assertion failure
- Probably this...
- 12:41 PM Bug #9663 (Resolved): Objecter assertion failure
- In latest Giant build. This bug appears to be related to http://tracker.ceph.com/issues/9067...
- 01:00 PM Bug #9664 (Rejected): mon: ceph osd metada failure on centos7
- "qa/workunits/cephtool/test.sh":https://github.com/ceph/ceph/blob/master/qa/workunits/cephtool/test.sh#L665 consisten...
10/04/2014
- 12:50 PM RADOS Bug #9492 (Fix Under Review): Crush Mapper crashes when number of replicas is less than total num...
- * firefly https://github.com/ceph/ceph/pull/2643
* giant https://github.com/ceph/ceph/pull/2642
running on http:/...
- 12:33 PM RADOS Bug #9492: Crush Mapper crashes when number of replicas is less than total number of osds to be s...
- Pull req info :
fix for firstn rules: https://github.com/ceph/ceph/pull/2568
fix for indep rules : https://githu...
- 04:41 AM RADOS Bug #9492 (Pending Backport): Crush Mapper crashes when number of replicas is less than total num...
- I think both patches should be backported to giant and firefly. Would you like to do that ? It essentially means you ...
- 02:46 AM RADOS Bug #9492 (Resolved): Crush Mapper crashes when number of replicas is less than total number of o...
- 02:38 AM Bug #9655 (Fix Under Review): tests: qa/workunits/cephtool/test.sh fails ENXIO
- https://github.com/ceph/ceph/pull/2641
10/03/2014
- 06:36 PM Bug #9657: MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
- tested with wip-9657, fix works fine.
logs are copied to vpm102.front.sepia.ceph.com:/home/ubuntu/wip-9657
- 05:02 PM Bug #9657 (Pending Backport): MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
- fix looks right. merged it into giant branch
- 04:11 PM Bug #9657: MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
- https://github.com/ceph/ceph/pull/2640
Tamil will put it through the upgrade suite.
- 04:07 PM Bug #9657 (Fix Under Review): MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
- 04:05 PM Bug #9657: MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
- Okay, it's because Message::encode() transmutes a compat_version of 0 into compat_version == HEAD_VERSION, and we are...
- 03:59 PM Bug #9657 (In Progress): MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
- Well, good news and bad news:
This is not a monitor bug, and my initial guess is that it will only affect clusters r...
- 11:11 AM Bug #9657 (Resolved): MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-01_19:20:01-upgrade:firefly-x-giant-distro-basic-m...
- 05:04 PM devops Bug #9658: upgrade from dumpling to firefly is broken
- possible debian fix in wip-9658. asked ceph-maintainers and branto for review and help with the spec file change.
- 03:56 PM devops Bug #9658 (In Progress): upgrade from dumpling to firefly is broken
- 03:52 PM devops Bug #9658 (New): upgrade from dumpling to firefly is broken
- sandon: looks like the problem is a file that was in python-ceph was moved to ceph and apt is bailing due to over-wri...
- 03:52 PM devops Bug #9658 (In Progress): upgrade from dumpling to firefly is broken
- this is broken by commit:eb0f6e347969b40c0655d3165a6c4531c6b595a3, which is post 0.80.6. phew! yay testing.
- 02:49 PM devops Bug #9658 (Resolved): upgrade from dumpling to firefly is broken
- This is definitely blocking the upgrade testing for giant.
logs: http://qa-proxy.ceph.com/teuthology/teuthology-20...
- 03:16 PM Bug #9661 (Fix Under Review): ceph_objectstore_tool doesn't work with memstore
- 03:07 PM Bug #9661 (Resolved): ceph_objectstore_tool doesn't work with memstore
- A CephContext* isn't passed to ObjectStore::create() so MemStore::mount() crashes.
MemStore::set_allow_sharded_o...
- 03:04 PM devops Bug #9640: Missing packages in multi-version-giant-testing-basic-multi
- Looks like there are two different issues here,
Update on "sudo apt-get update..."
Running the command line on a ma...
- 11:20 AM devops Bug #9640: Missing packages in multi-version-giant-testing-basic-multi
- 4 jobs failed in http://pulpito.front.sepia.ceph.com/teuthology-2014-10-01_23:20:03-multi-version-giant-distro-basic-...
- 02:53 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
- Please make sure you are following these steps..
1. Build the latest giant package both in cluster and client side...
- 01:35 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
- In my test cluster, I'm getting the same performance with "rbd cache = false" and "rbd cache = true". Could you post...
- 02:50 PM CephFS Feature #9659 (Duplicate): MDS: support cache eviction
- It would be really useful when writing certain kinds of tests (eg, for scrubbing) to be able to know that a particula...
- 12:01 PM Bug #9653 (Resolved): ceph-disk: bootstrap-osd keyring ignores --statedir
- 06:54 AM Bug #9653: ceph-disk: bootstrap-osd keyring ignores --statedir
- * giant https://github.com/ceph/ceph/pull/2635
* firefly https://github.com/ceph/ceph/pull/2634
- 05:13 AM Bug #9653 (Fix Under Review): ceph-disk: bootstrap-osd keyring ignores --statedir
- https://github.com/ceph/ceph/pull/2633
- 04:54 AM Bug #9653 (Resolved): ceph-disk: bootstrap-osd keyring ignores --statedir
- ...
- 11:40 AM Fix #9245 (Resolved): remove Monitor::osdmonitor_prepare_command
- 10:01 AM Fix #9245 (Fix Under Review): remove Monitor::osdmonitor_prepare_command
- giant backport https://github.com/ceph/ceph/pull/2637
- 09:27 AM Fix #9245 (Pending Backport): remove Monitor::osdmonitor_prepare_command
- 07:15 AM Fix #9245: remove Monitor::osdmonitor_prepare_command
- https://github.com/ceph/ceph/pull/2636
- 11:06 AM devops Bug #9656 (Rejected): Remove conditional statement in ceph-radosgw startup script log section
- The startup script has a conditional statement to determine if a log file exists, and will touch and chown the log fi...
- 10:43 AM Bug #9655 (Resolved): tests: qa/workunits/cephtool/test.sh fails ENXIO
- ...
- 10:34 AM Bug #8083: erasure-code: fix static code analysis errors found in gf-complete
- For the record these are minor fixes and I expect to see them used when NEON is merged upstream and we update the jer...
- 10:12 AM Bug #8083 (Resolved): erasure-code: fix static code analysis errors found in gf-complete
- merged https://bitbucket.org/jimplank/gf-complete/pull-request/24/static-code-analysis-fixes
- 09:04 AM devops Bug #9654 (Duplicate): "error: subprocess paste was killed by signal (Broken pipe)" in upgrade:du...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-01_19:10:01-upgrade:dumpling-firefly-x:parallel-gi...
- 07:36 AM rbd Feature #9374 (Resolved): rbd: use a rolling average for bench-write
- commit:b47fdd400e14bd1b5e5bea9d18f895c92b8050be
- 07:17 AM Bug #9644 (Can't reproduce): ceph-disk not playing nice with test/erasure-code/test-erasure-code.sh
- I tried with latest master and I'm no longer hitting it. I'm not sure if this was due to an environment issue or som...
- 04:32 AM Bug #9644: ceph-disk not playing nice with test/erasure-code/test-erasure-code.sh
- The CEPH_CONF and CEPH_ARGS are "taken care of when the test starts":https://github.com/ceph/ceph/blob/giant/src/test...
- 04:17 AM Bug #9644: ceph-disk not playing nice with test/erasure-code/test-erasure-code.sh
- Could you also include the error you get? One idea that comes to mind is that the test-erasure-code.sh does require au...
- 06:52 AM CephFS Bug #9636: segfault in CInode::get_caps_allowed_for_client
- looks like it's the same as #9628
- 12:37 AM Bug #9619 (Can't reproduce): excessive mon memory usage when rbd rm 1PB
- At 83% completion (rbd rm big)...
10/02/2014
- 08:53 PM Bug #9625 (Need More Info): firefly: memory corruption
- 07:32 PM rgw Bug #9039: Using COPY on radosgw to copy object from one bucket to another that's in another pool...
- sent a pull request now, #2632
- 05:48 PM rgw Bug #9039: Using COPY on radosgw to copy object from one bucket to another that's in another pool...
- not sure what the state of this bug is then.. yehuda?
- 06:01 PM Bug #9544 (Resolved): osd: pg deletion vs create race leads to EEXIST on mkcoll (dumpling)
- 06:00 PM rbd Bug #6494 (Resolved): High memory consumption of qemu/librbd with enabled cache
- ok did dumpling too
- 05:51 PM rbd Bug #6494: High memory consumption of qemu/librbd with enabled cache
- backported to firefly. josh, should we do dumpling too?
- 05:46 PM rgw Bug #8621 (Resolved): civetweb frontend fails authentication if URL has special chars
- a953b313f1e2f884be6ee2ce356780f4f70849dd
- 05:46 PM rgw Bug #8718 (Resolved): CORS OPTIONS request fails for presigned urls
- 6fee71154d838868807fd9824d829c8250d9d2eb
- 05:45 PM rgw Bug #8784 (Resolved): rgw: completion leak
- b0d08aab837808f18708a4f8ced0503c0fce2fec
- 05:44 PM rgw Bug #9089 (Resolved): rgw: copy_obj_data() does not stripe target object
- 05:44 PM rgw Feature #9200 (Resolved): rgw: log civetweb access
- 05:42 PM rgw Bug #9206 (Resolved): rgw: cross rgw message headers filtered by apache 2.4
- 05:41 PM rgw Bug #9353 (Resolved): Log files created under /var/log/radosgw/ do not have the .log extension
- 05:37 PM rgw Bug #9148 (Resolved): rgw: multiregion tests failing, s3tests.functional.test_s3.test_region_copy...
- 05:37 PM rgw Bug #9226 (Resolved): rgw: crash when copying specific objects
- 05:36 PM rgw Bug #9208 (Resolved): rgw: civetweb does not drain request buffer correctly
- 05:36 PM rgw Bug #9201 (Resolved): rgw: bad object with different pool alignment
- 05:23 PM Feature #8391 (Resolved): sysvinit does not support custom cluster names
- 05:22 PM Feature #8203 (Resolved): Replica setting values in df output
- 05:22 PM Feature #7792 (Closed): leveldb 1.12.0 for rhel
- 05:21 PM Feature #7344 (Resolved): osd: add additional heartbeat on cluster interface
- 05:20 PM Feature #6261 (Resolved): ceph-filestore-dump use cases for disaster recovery
- 05:18 PM Feature #5614 (Resolved): mon: enable moving pools to HASHPSPOOL mode
- 05:15 PM Feature #4914 (Resolved): rados tool: read xattr from file / stdin
- 05:14 PM Feature #4005: Add perftools to the kernel debian package script
- 05:13 PM Feature #3345 (Resolved): support multiple clusters with sysvinit
- 05:13 PM Feature #3340 (New): refuse to accept "cluster=foo" in ceph.conf
- 05:13 PM Feature #3340 (Rejected): refuse to accept "cluster=foo" in ceph.conf
- 05:13 PM Feature #3288 (Resolved): docs: document the chooseleaf command in crush
- 05:12 PM Feature #3086 (Resolved): workqueue: dynamically adjust number of threads
- 05:12 PM Feature #2894 (Resolved): cli: help command for ceph subsystems
- 05:11 PM Feature #1880 (Rejected): osd: optionally log all request latencies
- 05:11 PM Messengers Feature #1851 (Rejected): SimpleMessenger: use non-blocking io
- 05:10 PM Feature #1267 (Rejected): osd: rgw class to do acl check
- 05:09 PM RADOS Feature #84 (Rejected): mon: auto adjust pg_num as pool grows
- 05:08 PM Feature #2222 (Resolved): osd: distinguish between 'degraded' and 'misplaced'
- 05:08 PM Feature #5907 (Resolved): permanently log all administrative actions
- 05:07 PM Feature #3849 (Resolved): Track slow PGs and times OSDs marked down
- 04:28 PM Feature #8560 (Resolved): mon: instrument paxos
- 03:58 PM rgw Bug #9651 (Duplicate): RGW: Object Removal Atomicity
- The issue appears when a system goes down while there are pending object deletions. The object can be removed but will...
- 03:30 PM Bug #9650: RWTimer cancel_event is racy
- 03:30 PM Bug #9650 (Fix Under Review): RWTimer cancel_event is racy
- wip-rwtimer
- 02:57 PM Bug #9650: RWTimer cancel_event is racy
- The issue is that we execute events under a shared (read) lock, and we allow you to cancel them under a shared (read)...
- 01:55 PM Bug #9650 (Resolved): RWTimer cancel_event is racy
- (in safe mode) we carry the rwlock for the callback. but we use a separate mutex to protect the events. and we can
...
- 03:30 PM Bug #9582 (Fix Under Review): librados: segmentation fault on timeout
- 10:23 AM Bug #9582 (In Progress): librados: segmentation fault on timeout
- hmm, several failures on giant
ubuntu@teuthology:/var/lib/teuthworker/archive/samuelj-2014-10-01_18:59:42-rados-gi...
- 03:29 PM rgw Bug #9307 (Resolved): "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-fir...
- 09:31 AM rgw Bug #9307: "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-firefly-x-mast...
- suite:upgrade:dumpling-x
In run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-01_19:00:02-upgrade:dumplin...
- 02:15 PM CephFS Bug #9514 (Resolved): ceph-fuse pjd test is failing in giant nightlies
- Dumpling commit:5f601f099be98c2b061cc94fb06917e7543f3efe
Firefly commit:9fee8de25ab5c155cd6a3d32a71e45630a5ded15
- 01:56 PM Bug #8752: firefly: scrub/repair stat mismatch
- I think I found where it is happening. For a while I was using Btrfs-based OSDs with journals on SSD-based ext4. For ...
- 11:58 AM Bug #9559: ?off-by-one vulnerability?ceph-0.80.5/src/common/fd.cc dump_open_fds() function
- This was fixed in version 0.83 in commit 046c9769fc4eaffc1dd4a21b61c1c5696d537def, although I'm sure it could be back...
- 11:42 AM Bug #9649 (Can't reproduce): OSD hang in op_tp
- ubuntu@teuthology:/a/samuelj-2014-10-01_18:59:42-rados-giant-wip-testing-old-vanilla-basic-multi/524982
valgrind, ...
- 11:30 AM Bug #9626: PG: cancel backfill reservations if we get a cancel during backfill
- 11:20 AM Feature #9647 (New): osd: hard cap on PGs per OSD
- 11:00 AM devops Feature #9411: remove qemu symlink for librbd on rhel7.1 (and later)
- This ticket is inaccurate.
The version of qemu-kvm that ships with base RHEL 6.x or 7.x does not and has no plans ...
- 10:56 AM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
- ubuntu@teuthology:/a/samuelj-2014-10-01_18:59:42-rados-giant-wip-testing-old-vanilla-basic-multi/524988
- 10:51 AM devops Feature #3161 (Rejected): make gcov website public, via proxy on gitbuilder.sepia.ceph.com
- 10:49 AM devops Feature #2663 (Closed): crowbar: UI for setting generic ceph.conf values
- 10:48 AM devops Feature #2910 (Closed): crowbar: Use JBOD mode for ceph-osd
- 10:46 AM devops Feature #8037 (Closed): Test leveldb 1.12 (or newer) and package as necessary
- 10:46 AM devops Feature #3023 (Closed): juju: automated QA of OpenStack RBD integration
- 10:46 AM devops Feature #3022 (Closed): juju: automated QA of Ceph
- 10:46 AM devops Feature #2695 (Closed): crowbar: Automated QA
- 10:45 AM devops Feature #3017 (Closed): juju: dev env setup
- 10:45 AM devops Feature #3018 (Closed): juju: test deploy of openstack
- 10:45 AM devops Feature #3020 (Closed): juju: change nova to use rbd
- 10:44 AM devops Feature #7925 (In Progress): Feature: create new download.ceph.com site
- 10:42 AM devops Feature #3021 (Closed): juju: change glance to use rbd
- 09:42 AM Bug #9644 (Can't reproduce): ceph-disk not playing nice with test/erasure-code/test-erasure-code.sh
- I haven't seen anyone complaining about this so either 1) no one is running this test, or 2) I'm the only one hitting...
- 09:23 AM devops Bug #9643 (Rejected): Error "install ceph-devel-0.67.11 -y" in -upgrade:dumpling-x-firefly-distro...
- In run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-01_19:00:02-upgrade:dumpling-x-firefly-distro-basic-vps...
- 08:39 AM rbd Bug #9642 (Resolved): Errors in test_rbd.test_* tests in upgrade:dumpling-firefly-x:parallel-gian...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-01_15:06:04-upgrade:dumpling-firefly-x:parallel-gi...
- 07:55 AM Bug #9619: excessive mon memory usage when rbd rm 1PB
- The mon memory indeed grows but after 30 minutes running I'm not sure it is related. And it's growing slowly....
- 06:45 AM Bug #9619 (New): excessive mon memory usage when rbd rm 1PB
- Checking the OSD memory usage when the problem is MON growth is not a good idea.
- 06:35 AM Bug #9619 (Can't reproduce): excessive mon memory usage when rbd rm 1PB
- With a vstart cluster with one monitor and three OSDs and...
- 07:42 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
- I'm able to reproduce the problem with 0daddfbf1164d6ba3f38eee29d2f11acfa62f2b6 from your tree https://github.com/spo...
- 07:28 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
- Damn... I was a bit too fast when I thought I was reproducing the issue !
I was indeed reproducing the original one,...
- 05:40 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
- I've finally managed to reproduce it, thanks to Loic : the trick was Ubuntu + debug mode. Maybe you also need more th...
- 05:34 AM Bug #8011 (Resolved): osd/ReplicatedPG.cc: 5244: FAILED assert(soid < scrubber.start || soid >= s...
- I'm unable to reproduce it any more, assuming fixed.
- 05:33 AM Bug #8747: OSD crash on scrub:osd/ReplicatedPG.cc: 5297: FAILED assert(soid < scrubber.start || s...
- I can't reproduce any more on 0.80.5 + Firefly HEAD as of 2014-09-16...
10/01/2014
- 05:14 PM Bug #9625: firefly: memory corruption
- ubuntu@teuthology:/var/lib/teuthworker/archive/sage-bug-9625-e/521446
- 05:12 PM Bug #9625: firefly: memory corruption
- hit it again (or something very similar):...
- 04:14 PM Bug #9617 (Pending Backport): objecter shutdown races with msg dispatch
- 02:05 PM Bug #9617 (Fix Under Review): objecter shutdown races with msg dispatch
- https://github.com/ceph/ceph/pull/2621
- 03:34 PM Feature #5035 (Resolved): rados: smarter localized reads
- https://github.com/ceph/ceph/commit/22df77325165157c47bc782476e0e3ab9cf652c4
- 03:17 PM devops Bug #9640 (Rejected): Missing packages in multi-version-giant-testing-basic-multi
- In run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-01_14:25:21-multi-version-giant-distro-basic-multi/
...
- 02:18 PM Bug #9537 (Resolved): OSD crash after writing 10GB file onto EC Pool: FAILED assert(hinfo.get_tot...
- 01:41 PM rbd Bug #8187 (Resolved): librbd: list_children() reports duplicates with cache pools
- 11:56 AM rbd Bug #8187 (Fix Under Review): librbd: list_children() reports duplicates with cache pools
- https://github.com/ceph/ceph/pull/2619
- 12:53 PM Bug #9572 (Resolved): erasure-code: BlaumRoth default encoding regression
- 12:51 PM Bug #9620 (Resolved): tests: qa/workunits/cephtool/test.sh race condition
- 12:47 PM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
- Demoting to Normal because it only happens in debug mode.
- 06:45 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
- 02:45 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
- Using ...
- 12:30 PM Bug #9570 (Need More Info): osd crash in FileJournal::WriteFinisher::entry() aio
- I'm out of ideas. Hopefully there is enough background information to help with diagnosis when/if it resurfaces.
- 12:10 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- A theory that does not explain the problem, for the record....
- 11:35 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- What probably happens is that "the aio_info":https://github.com/ceph/ceph/blob/giant/src/os/FileJournal.cc#L1303 that...
- 09:23 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- "linux 3.12.7":https://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.12.7 was released in January 2014. The patc...
- 09:17 AM Bug #9570 (In Progress): osd crash in FileJournal::WriteFinisher::entry() aio
- debian/changelog does not show anything that suggests a bug was fixed in libaio after 0.3.109-2 which could relate to ...
- 08:52 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- linux kernel is 3.12.7 and libaio is 0.3.109-2ubuntu1
- 07:29 AM Bug #9570 (Need More Info): osd crash in FileJournal::WriteFinisher::entry() aio
- 06:40 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- The "aio: v4 ensure access to ctx->ring_pages is correctly serialised for migration":https://git.kernel.org/cgit/linu...
- 06:22 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- Although "aio: protect reqs_available updates from changes in interrupt handlers":https://git.kernel.org/cgit/linux/k...
- 05:45 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- It turns out that the alignment requirement has to be enforced indeed. On a 3.13 linux kernel the following:...
- 10:42 AM CephFS Bug #9636 (Duplicate): segfault in CInode::get_caps_allowed_for_client
- While doing ad-hoc killing of clients stuck on full cluster: unchecked dereference of session connection....
- 09:52 AM Feature #8188 (Resolved): librados: interface to inspect pool properties
- 09:39 AM rbd Feature #4454 (Closed): openstack: support volume migration in Cinder
- tracking via https://blueprints.launchpad.net/cinder/+spec/generic-volume-migration
- 09:36 AM rbd Feature #7921 (Resolved): Openstack: live migration for ephemeral volumes
- 09:35 AM rbd Feature #7920 (Resolved): Openstack: cloning for rbd ephemeral disks
- 09:29 AM rbd Feature #5138 (Closed): LIO Support
- Being tracked in Red Hat Bugzilla
- 09:25 AM rbd Feature #4087 (In Progress): rbd: bitmaps for tracking object existence
- 09:24 AM rbd Feature #4804 (Rejected): tgt: switch to aio
- iSCSI work now focused on LIO
- 09:23 AM Feature #9302 (Resolved): mon: 'ceph osd pool ls' command
- 09:22 AM rgw Feature #9013: rgw: set civetweb as a default frontend
- https://github.com/ceph/ceph/pull/2381
- 08:53 AM devops Fix #5900 (In Progress): Create a Python package for ceph Python bindings
- 08:24 AM rbd Bug #9602: rbd export -> nc ->rbd import = memory leak
- I quickly attempted to reproduce this on the same version w/o success. Can you attach /etc/ceph/big.conf? How large...
- 06:59 AM Bug #8942: Bad JSON output in ceph osd tree
- I see this in ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6) and it is indeed a problem.
I *think* ...
- 06:49 AM devops Feature #9133: create ceph user/group; run daemons as ceph (non-root)
- Indeed a lot of packaging updates and probably many difficulties to properly upgrade daemons :/
Anyone working on ...
- 06:18 AM CephFS Feature #7317 (In Progress): mds: behave with fs fills (e.g., allow deletion)
- 06:15 AM CephFS Feature #9437 (Fix Under Review): make 'ceph tell mds.* ...' work, deprecate 'ceph mds tell * ...'
09/30/2014
- 11:57 PM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
- Working in the container...
- 11:54 PM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
- To make sure this is not an environmental problem, I cloned a clean copy from your branch and removed .ccache entirely.
- 11:19 PM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
- Running the test in the container still fails. ...
- 11:06 PM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
- I reproduced the above valgrind output a few minutes ago on my development laptop. After upgrading from...
- 10:55 PM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
- Using the same source tree with the same kernel but inside an ubuntu 14.04 docker container, I was not able to reprod...
- 05:24 PM rgw Bug #8587 (Resolved): rgw: subuser object not created correctly
- commit:1441ffe8103f03c6b2f625f37adbb2e1cfec66bb
- 05:19 PM Bug #9635: mon/Paxos.cc: 1033: FAILED assert(mon->is_leader())
- 05:19 PM Bug #9635 (Fix Under Review): mon/Paxos.cc: 1033: FAILED assert(mon->is_leader())
- from the log it looks like this happened during shutdown. see wip-9635
- 04:54 PM Bug #9635 (Resolved): mon/Paxos.cc: 1033: FAILED assert(mon->is_leader())
- ...
- 04:58 PM rgw Bug #9575: s3tests.functional.test_s3.test_region_copy_object fails (races with radosgw-agent?)
- hmm, these seem to always happen with valgrind!
- 04:52 PM rgw Bug #9575: s3tests.functional.test_s3.test_region_copy_object fails (races with radosgw-agent?)
- ubuntu@teuthology:/a/teuthology-2014-09-29_23:02:01-rgw-giant-testing-basic-multi/519792
- 03:32 PM Bug #9459 (Need More Info): osd: blocked request
- 03:31 PM Bug #9288 (Duplicate): "Assertion `nlock == 0' failed" in upgrade:firefly-firefly-testing-basic-v...
- see #9040
- 03:09 PM Bug #8997 (Can't reproduce): ceph_test_rados_watch_notify hangs
- I suspect the watch resend fix (commit:1349383ac416673cb6df2438729fd2182876a7d1 for #9220) fixed some of these. (It ...
- 03:06 PM Bug #8595: osd: client op blocks until backfill starts (dumpling)
- The simple fixes here seem insufficient (fail in qa). Haven't seen anybody else hitting this, which surprises me a b...
- 01:22 PM Feature #9198 (In Progress): librados: notify callback includes gid of notifier
- 01:22 PM Feature #9197 (In Progress): librados/osd: notify reply payload
- 01:13 PM Feature #8899 (Fix Under Review): Kerberos/LDAP Support:: mon: define mon role capabilities
- 01:03 PM RADOS Feature #9632 (New): testing: test CrushWrapper::get_full_location_ordered()
- A recent backport of changes to get_full_location_ordered() passed all the make check and RADOS suite tests, but caus...
- 12:56 PM Feature #9031: List RADOS namespaces and list all objects in all namespaces
- 11:52 AM Bug #8822 (Need More Info): osd: hang on shutdown, spinlocks
- 11:51 AM Bug #8822: osd: hang on shutdown, spinlocks
- valgrind is 1:3.10~20140411-0ubuntu1
3.10.0 release notes claim to have fixed
336435 Valgrind hangs in pthread...
- 11:39 AM Bug #8822: osd: hang on shutdown, spinlocks
- http://stackoverflow.com/questions/24558914/valgrind-hangs-in-pthread-spin-lock-consuming-100-cpu
valgrind bug?
- 11:38 AM Bug #8822: osd: hang on shutdown, spinlocks
- happened again:...
- 11:26 AM Bug #9617: objecter shutdown races with msg dispatch
- wip-objecter-shutdown
- 11:17 AM rgw Feature #8911 (In Progress): RGW doesn't return 'x-timestamp' in header which is used by 'View De...
- 10:29 AM CephFS Bug #9562 (Pending Backport): Lockdep assertion in Filer purge
- This is popping up in Giant as well, which I believe has the new code that was the proximate cause. :)
- 10:27 AM CephFS Bug #9514 (Pending Backport): ceph-fuse pjd test is failing in giant nightlies
- In giant as commit:0ea20a668cf859881c49b33d1b6db4e636eda18a.
Needs to go to firefly as well.
- 09:58 AM devops Tasks #8366 (In Progress): Update ceph.com/docs to default to the latest major release (0.80)
- 09:47 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- https://github.com/ceph/ceph/pull/2611 seems like a good candidate for backport.
- 09:40 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- https://github.com/ceph/ceph/commit/66a9fbe2c7ba59b7cd034c17865adce3432cd2cb and https://github.com/ceph/ceph/commit/...
- 08:41 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- None of the commits in FileJournal.cc from dumpling to master fix something that could cause a problem of that nature.
- 09:40 AM Bug #9630 (Resolved): osd: leaked pg refs on shutdown (dumpling)
- ...
- 08:40 AM Bug #9610: Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)...
- 9/30/14 update - Still waiting in queue http://pulpito.front.sepia.ceph.com/teuthology-2014-09-29_23:20:02-multi-vers...
- 08:36 AM rgw Bug #9612 (Resolved): "ERROR: test suite for <module 's3tests.functional'" in multi-version-giant...
- PR https://github.com/ceph/ceph-qa-suite/pull/154
- 12:08 AM CephFS Bug #9628: mds: race between ms_handle_accept() and ms_handle_reset()
- https://github.com/ceph/ceph/pull/2596
- 12:08 AM CephFS Bug #9628 (Resolved): mds: race between ms_handle_accept() and ms_handle_reset()
- ceph version 0.85-1003-g3ae673c (3ae673c764a4fac6e554e05722f0179566ed3fb3)
1: (ceph::BackTrace::BackTrace(int)+0x2...
09/29/2014
- 11:40 PM Bug #9582: librados: segmentation fault on timeout
- Thanks for your investigations and the quick fix! We have not been able to test this fix yet, but I will report back ...
- 01:13 PM Bug #9582: librados: segmentation fault on timeout
- in giant, dumpling. still need to merge firefly backport.
- 01:10 PM Bug #9582 (Pending Backport): librados: segmentation fault on timeout
- 08:16 AM Bug #9582 (Fix Under Review): librados: segmentation fault on timeout
- 10:09 PM Bug #9459: osd: blocked request
- saw something similar on another cluster, ...
- 09:18 PM rbd Bug #9602: rbd export -> nc ->rbd import = memory leak
- As a suggestion, prohibit use of the cache during RBD imports.
- 09:03 PM rbd Bug #9602: rbd export -> nc ->rbd import = memory leak
- Hi, Sage.
I'm sorry, I had set the rbd_cache size parameter incorrectly.
The problem is not confirmed.
- 08:18 PM CephFS Bug #9562 (Resolved): Lockdep assertion in Filer purge
- 06:12 PM rgw Bug #9615 (Resolved): "ERROR: test suite for <module 's3tests.functional'" in upgrade:dumpling-du...
- Fixed typo https://github.com/ceph/ceph-qa-suite/pull/157
- 05:45 PM rgw Bug #9615: "ERROR: test suite for <module 's3tests.functional'" in upgrade:dumpling-dumpling-dist...
- Interesting. How is that possible if I only added one yaml file, v0.67.11.yaml? Will look.
- 05:36 PM rgw Bug #9615: "ERROR: test suite for <module 's3tests.functional'" in upgrade:dumpling-dumpling-dist...
- this yaml has no 'rgw' task... that's why it gets connection refused.
- 06:11 PM rgw Bug #9616: upgrade test restarts rgw, test gets 500
- Even if we do somehow get it to retry (might require changes to the fastcgi module), we'll still get 500s from reques...
- 05:47 PM rgw Bug #9616: upgrade test restarts rgw, test gets 500
- Yehuda Sadeh wrote:
> Not sure what the test is doing exactly, but the 500 is because the rgw process was restarted ...
- 05:36 PM rgw Bug #9616: upgrade test restarts rgw, test gets 500
- Not sure what the test is doing exactly, but the 500 is because the rgw process was restarted in the middle of the te...
- 05:52 PM rgw Bug #9612: "ERROR: test suite for <module 's3tests.functional'" in multi-version-giant-testing-ba...
- pls update with new test... this one was specifying firefly
- 05:45 PM rgw Bug #9169: 100-continue broken for centos/rhel
- maybe we are lacking the apache or mod_fastcgi packages here?
- 05:41 PM rgw Bug #9169: 100-continue broken for centos/rhel
- Yuri Weinstein wrote:
> Similar issue in suite:upgrade:firefly
>
> http://pulpito.front.sepia.ceph.com/teuthology...
- 05:32 PM Bug #9617 (In Progress): objecter shutdown races with msg dispatch
- 04:21 PM Bug #9617: objecter shutdown races with msg dispatch
- ...
- 05:28 PM Feature #8960 (Resolved): filestore: store backend type persistently
- 05:24 PM Bug #9142 (Can't reproduce): [ RUN ] LibRadosTwoPoolsPP.PromoteSnapScrub hang
- 05:24 PM Bug #9141 (Can't reproduce): [ RUN ] LibRadosAio.IsCompletePP hang
- 04:49 PM Bug #6301: ceph-osd hung by XFS using linux 3.10
- fwiw, after upgrading the performance test nodes from Ubuntu 13.10 to Fedora Core 20, I appear to be hitting this und...
- 04:44 PM Feature #9580: ceph-disk, ceph-osd: make journal [partition] creation conditional based on osd_ob...
- Mark Kirkwood wrote:
> While we are thinking about this, note that some of the keyvalue backends have facility to ha...
- 04:43 PM CephFS Bug #9341: MDS: very slow rejoin
- John Spray wrote:
> The userspace change and test for this are merged into master. Is the kernel side all done too?...
- 01:07 PM CephFS Bug #9341: MDS: very slow rejoin
- The userspace change and test for this are merged into master. Is the kernel side all done too?
- 04:33 PM CephFS Bug #9514: ceph-fuse pjd test is failing in giant nightlies
- 03:49 PM CephFS Bug #9514: ceph-fuse pjd test is failing in giant nightlies
- So here's a question: why does the client (temporarily) remember its ctime as being 2014-09-26 19:22:06.889397, but n...
- 02:58 PM CephFS Bug #9514 (In Progress): ceph-fuse pjd test is failing in giant nightlies
- Hah, we got the failure with logs in /a/sage-2014-09-26_17:51:11-smoke-giant-distro-basic-multi/513914
All of the ...
- 04:26 PM Bug #9614: PG stuck with remapped
- Thanks Loic for following up.
After talking to other engineers, the backfilling seems to be due to the removed O...
- 12:32 PM Bug #9614: PG stuck with remapped
- It looks like you are on the right track :-)
- 12:23 PM Bug #9614: PG stuck with remapped
- ...
- 12:13 PM Bug #9614: PG stuck with remapped
- Could you attach the full output of pg query 3.1ee7, please? The ceph osd tree output would also help to get an idea why...
- 02:21 AM Bug #9614: PG stuck with remapped
- There are still two issues:
# Some PGs are stuck with active+remapped forever (for both replicated pool and EC pool)...
- 02:07 AM Bug #9614: PG stuck with remapped
- Guang Yang wrote:
> Another observation is that even the pg dump result for such PG:
> [...]
>
> Even there is a...
- 01:53 AM Bug #9614: PG stuck with remapped
- Attaching CRUSH / EC profile / OSD dump.
- 01:28 AM Bug #9614: PG stuck with remapped
- Loic Dachary wrote:
> [...]
> The *2147483647* here shows mapping failed. Is this something you expect ?
As there ...
- 01:22 AM Bug #9614: PG stuck with remapped
- ...
- 04:24 PM Bug #9113: osd: snap trimming eats memory, linearly
- There's another piece. The trimmer is constantly requeueing.
- 02:02 PM Bug #9113 (Pending Backport): osd: snap trimming eats memory, linearly
- 01:59 PM Bug #9113 (Fix Under Review): osd: snap trimming eats memory, linearly
- 04:15 PM Bug #9620: tests: qa/workunits/cephtool/test.sh race condition
- I will verify the result when they are ready but I'm not too concerned ;-)
- 04:15 PM Bug #9620: tests: qa/workunits/cephtool/test.sh race condition
- 02:42 PM Bug #9620 (Resolved): tests: qa/workunits/cephtool/test.sh race condition
- i jumped the gun and merged, oops!
- 01:52 PM Bug #9620: tests: qa/workunits/cephtool/test.sh race condition
- gitbuilder running
- 01:51 PM Bug #9620 (Fix Under Review): tests: qa/workunits/cephtool/test.sh race condition
- https://github.com/ceph/ceph/pull/2603
- 08:18 AM Bug #9620 (Pending Backport): tests: qa/workunits/cephtool/test.sh race condition
- 04:53 AM Bug #9620 (Fix Under Review): tests: qa/workunits/cephtool/test.sh race condition
- https://github.com/ceph/ceph/pull/2594
- 04:36 AM Bug #9620: tests: qa/workunits/cephtool/test.sh race condition
- The *ceph osd thrash* command will randomly "mark osds down and up":https://github.com/ceph/ceph/blob/firefly/src/mon...
- 03:29 AM Bug #9620: tests: qa/workunits/cephtool/test.sh race condition
- The following sequence happens:
* ceph osd dump finds 3 osd "down"
* ceph osd dump finds no osd "down"
* ceph os...
- 03:24 AM Bug #9620 (Resolved): tests: qa/workunits/cephtool/test.sh race condition
- "osd are marked down":https://github.com/ceph/ceph/blob/master/qa/workunits/cephtool/test.sh#L604 and a loop checking...
- 03:46 PM RADOS Bug #9606: mon: ambiguous error_status returned to user when type is wrong in a command
- This may be easier if/when ceph_argparse gets made into a proper Python package; I hear there is renewed interest in ...
- 02:57 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- Exploring the idea that the buffers pointed to by the iovec may be overwritten or mixed up
- 08:28 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- Reading the buffer.{h,cc} code it looks like the caller is protected from a situation where a bufferptr leftover can ...
- 02:53 PM Bug #9626 (Resolved): PG: cancel backfill reservations if we get a cancel during backfill
- 02:36 PM Fix #9566: osd: prioritize recovery of OSDs with most work to do
- Factor the number of backfill (or backfill_wait) pgs on the OSD into the recovery priority. Make sure this accounts ...
- 02:14 PM Bug #9574 (Pending Backport): Backfill: recheck full status once reservation is granted
- 01:51 PM Bug #9574 (Fix Under Review): Backfill: recheck full status once reservation is granted
- 02:00 PM Bug #9388: osd/PG.cc: 2945: FAILED assert(r == 0) in update_snap_map
- This is the one with the import/export racing with split
- 01:59 PM Bug #9503 (Fix Under Review): Dumpling: removing many snapshots in a short time makes OSDs go ber...
- 01:54 PM Bug #9545 (Resolved): filestore stuck in journal->should_commit_now() loop on shutdown
- 01:52 PM Bug #8629 (Pending Backport): cache_evict needs to prevent make_writeable from creating a snapdir
- 01:45 PM Bug #9480 (Resolved): OSD is crashing while object deletion
- 01:30 PM Bug #9625: firefly: memory corruption
- /a/samuelj-2014-09-23_14:40:50-rados-firefly-wip-testing-old-vanilla-basic-multi/507058 another example
- 10:44 AM Bug #9625: firefly: memory corruption
- ubuntu@teuthology:/a/sage-2014-09-27_20:55:12-rados-firefly-distro-basic-multi/515818
ubuntu@teuthology:/a/sage-2014...
- 10:43 AM Bug #9625 (Resolved): firefly: memory corruption
- I am guessing that these two coredumps are related.
#0 0x00007f1918142f07 in _dl_map_object_deps (map=map@entry=0...
- 01:15 PM CephFS Bug #8576: teuthology: nfs tests failing on umount
- Trying the sync on Sage's go-ahead. :)
commit:56223ce98b659fe7b25b55161ef8163495f438fc in teuthology.
- 10:45 AM CephFS Bug #8576: teuthology: nfs tests failing on umount
- Is there any chance that just running a sync on the node prior to trying to "exportfs -au" might prevent this? I'm he...
- 12:51 PM devops Fix #9017 (Fix Under Review): [paddles] implement validation across all controller methods
- Pull request opened https://github.com/ceph/paddles/pull/46
- 10:30 AM Bug #9623 (Won't Fix): On cluster with 3 mons, stopping 2 mons made cluster in-accessible, with I...
- This is expected and intended behavior. The monitors are a Paxos system and require a quorum of *more than* half to b...
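The majority rule described in this comment can be illustrated with a small sketch. This is not Ceph code; `quorum_size` and `has_quorum` are hypothetical helper names used only to show the "more than half" arithmetic for a monitor cluster.

```python
def quorum_size(num_monitors: int) -> int:
    """Smallest number of monitors that is strictly more than half."""
    return num_monitors // 2 + 1


def has_quorum(num_monitors: int, num_up: int) -> bool:
    """True if the monitors still running form a strict majority."""
    return num_up >= quorum_size(num_monitors)


# Illustrative cases: (total monitors, monitors stopped)
for total, stopped in [(3, 1), (3, 2), (5, 2), (5, 3)]:
    up = total - stopped
    print(f"{total} mons, {stopped} stopped -> quorum: {has_quorum(total, up)}")
```

Under this rule, a 3-monitor cluster tolerates one stopped monitor but not two, which matches the behavior reported in this bug.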
- 07:30 AM Bug #9623: On cluster with 3 mons, stopping 2 mons made cluster in-accessible, with IO's hung/pause
- Removing myself as I may not have time to deal with this right now.
- 06:25 AM Bug #9623 (Won't Fix): On cluster with 3 mons, stopping 2 mons made cluster in-accessible, with I...
- A cluster with "n" monitor nodes will be inaccessible if "n-1" monitors are down.
It's been obser...
- 09:55 AM devops Bug #6461 (Rejected): ceph-deploy should at least issue a warning if there are parser errors read...
- `ConfigParser` will not have errors reading a config file that has duplicate sections.
In Python2.X a duplicate se...
- 08:25 AM Bug #9613 (Duplicate): "Segmentation fault" in upgrade:dumpling-giant-x:parallel-giant-distro-bas...
- #9582
- 07:50 AM devops Bug #6489 (Can't reproduce): ceph-deploy: get_nonlocal_ip() should filter ipv6 addrs
- 07:44 AM devops Bug #7483 (Rejected): ceph-deploy should fetch keyrings always
- There isn't a reasonable way to implement this. The use case is deploying to a new node and having stale files in the...
- 06:00 AM Bug #9408: erasure-code: misalignment
- Running under the branch wip-9408-buffer-alignment in http://ceph.com/gitbuilder.cgi
- 05:58 AM Bug #9408: erasure-code: misalignment
- New pull request https://github.com/ceph/ceph/pull/2595
- 02:06 AM Bug #9572: erasure-code: BlaumRoth default encoding regression
- A brute-force check of w=7 with all possible values of k proves that recovery works in all scenarios. ...
- 01:53 AM rbd Bug #9391: fio rbd driver rewrites same blocks
- Could you provide your fio job file / config to verify the issue?
09/28/2014
- 11:38 PM Bug #9592 (Resolved): librados: Not able to create Large Files with Librados
- 03:46 PM Bug #9592: librados: Not able to create Large Files with Librados
- Extend the checks to librados.hpp and aio_* https://github.com/ceph/ceph/pull/2590
- 11:30 PM Bug #9304 (Resolved): pool create with invalid crush rule name succeeds
- 11:02 PM Bug #6003: journal Unable to read past sequence 406 ...
- ...
- 09:26 PM Fix #9566: osd: prioritize recovery of OSDs with most work to do
- The recovery slows simply because there are fewer PGs left degraded and the per-pg (or per-osd) recovery rate is limi...
- 12:42 PM Bug #9619 (Can't reproduce): excessive mon memory usage when rbd rm 1PB
- Steps to reproduce:
* create a 1 petabyte rbd image
* remove the image
The mon memory usage will grow to over 10GB.
- 12:37 PM rgw Bug #9307: "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-firefly-x-mast...
- Also in http://pulpito.front.sepia.ceph.com/teuthology-2014-09-28_08:42:11-upgrade:dumpling-firefly-giant:parallel-gi...
- 12:32 PM Bug #9618 (Won't Fix): kernel 3.14 in Debian Jessie : XFS bug
- For the record: the 3.14 kernel that was (until today) the default for Debian Jessie exhibited the following XFS bug ...
- 08:37 AM Bug #9617 (Resolved): objecter shutdown races with msg dispatch
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-27_19:10:02-upgrade:firefly-giant-x:parallel-giant...
- 08:12 AM Bug #9515 (New): "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:parall...
- Still see in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-27_18:40:01-upgrade:dumpling-giant-x:parallel-gia...
- 07:57 AM rgw Bug #9615: "ERROR: test suite for <module 's3tests.functional'" in upgrade:dumpling-dumpling-dist...
- Appears to be only on @1-dumpling-install/v0.67.11.yaml@
- 07:48 AM rgw Bug #9615 (Resolved): "ERROR: test suite for <module 's3tests.functional'" in upgrade:dumpling-du...
- In http://pulpito.front.sepia.ceph.com/teuthology-2014-09-27_18:45:01-upgrade:dumpling-dumpling-distro-basic-vps/ run...
- 07:52 AM rgw Bug #9616 (Resolved): upgrade test restarts rgw, test gets 500
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-27_18:45:01-upgrade:dumpling-dumpling-distro-basic...
- 04:29 AM Bug #9614: PG stuck with remapped
- Another observation is that even the pg dump result for such PG:...
- 03:45 AM Bug #9614 (Resolved): PG stuck with remapped
- In our pre-production cluster, we observed that the cluster starts backfilling even with OSD noout flag set when ther...
09/27/2014
- 10:48 PM rbd Bug #9595: librbd: internal methods can operate on extra objects when non-default striping is used
- https://github.com/ceph/ceph/pull/2588
- 10:48 PM rbd Bug #9595 (Fix Under Review): librbd: internal methods can operate on extra objects when non-defa...
- 04:42 PM Bug #9613: "Segmentation fault" in upgrade:dumpling-giant-x:parallel-giant-distro-basic-multi run
- Looks similar to #9508
- 04:40 PM Bug #9613 (Duplicate): "Segmentation fault" in upgrade:dumpling-giant-x:parallel-giant-distro-bas...
- Two failures in http://pulpito.front.sepia.ceph.com/teuthology-2014-09-26_18:44:02-upgrade:dumpling-giant-x:parallel-...
- 11:40 AM rgw Bug #9612: "ERROR: test suite for <module 's3tests.functional'" in multi-version-giant-testing-ba...
- i suspect the giant rgw won't work with firefly osds?
- 08:56 AM rgw Bug #9612 (Rejected): "ERROR: test suite for <module 's3tests.functional'" in multi-version-giant...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-26_23:20:01-multi-version-giant-testing-basic-mult...
- 11:39 AM Bug #9610 (Resolved): Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::Cal...
- pushed fix to dumpling branch, commit:503f865d6432bead72aac0ffba0539d807f078c4
- 08:33 AM Bug #9610: Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)...
- Another similar crash in job http://pulpito.front.sepia.ceph.com/teuthology-2014-09-26_23:20:01-multi-version-giant-t...
- 08:29 AM Bug #9610 (Resolved): Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::Cal...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-26_23:20:01-multi-version-giant-testing-basic-mult...
- 11:36 AM devops Bug #9611 (Rejected): Missing packages in multi-version-giant-testing-basic-multi
- Doesn't look like a 'next' branch exists any longer so no way to fix this.
- 08:52 AM devops Bug #9611: Missing packages in multi-version-giant-testing-basic-multi
- In run http://pulpito.front.sepia.ceph.com/teuthology-2014-09-26_23:20:01-multi-version-giant-testing-basic-multi/
...
- 08:50 AM devops Bug #9611 (Rejected): Missing packages in multi-version-giant-testing-basic-multi
- 09:16 AM Bug #9592: librados: Not able to create Large Files with Librados
- Looking at librados.hpp
- 01:57 AM Bug #9592 (Fix Under Review): librados: Not able to create Large Files with Librados
- https://github.com/ceph/ceph/pull/2584 should be enough. Unless there is a good reason to write an object with chunks...
- 05:53 AM Bug #7648 (Resolved): ceph-mon corner case denial of service
- 02:32 AM Bug #7648 (Fix Under Review): ceph-mon corner case denial of service
- emperor backport https://github.com/ceph/ceph/pull/2585
- 02:22 AM Bug #7648 (Pending Backport): ceph-mon corner case denial of service
- the backport needs to be on emperor also
- 04:27 AM RADOS Bug #9606: mon: ambiguous error_status returned to user when type is wrong in a command
- ceph.in "uses ceph_argparse":https://github.com/ceph/ceph/blob/giant/src/ceph.in#L67 to validate the arguments client...
- 12:02 AM RADOS Bug #9492 (Need More Info): Crush Mapper crashes when number of replicas is less than total numbe...
- What happens with indep ?
09/26/2014
- 07:13 PM rgw Bug #9588: Keystone s3 auth integration lacking access_key = tenant:user ability supported by swi...
- So, actually talking to a swift s3 proxy with:
access_key = 'demo:demo'
secret_key = 'password'
results in:
...
- 06:14 PM Bug #7648 (Fix Under Review): ceph-mon corner case denial of service
- https://github.com/ceph/ceph/pull/2583
- 08:49 AM Bug #7648 (In Progress): ceph-mon corner case denial of service
- works for any osd that exists but is not in the crush map, it seems
- 05:53 PM Bug #9570 (In Progress): osd crash in FileJournal::WriteFinisher::entry() aio
- 03:32 PM CephFS Bug #8427: ceph-fuse: Dumpling "cache still has 0+1 items, waiting (for caps to release?)" on shu...
- Sage believes this is a bug with readahead that got fixed in subsequent releases.
- 06:51 AM CephFS Bug #8427 (Won't Fix): ceph-fuse: Dumpling "cache still has 0+1 items, waiting (for caps to relea...
- 03:26 PM RADOS Bug #9492 (Resolved): Crush Mapper crashes when number of replicas is less than total number of o...
- 01:56 PM RADOS Bug #9492: Crush Mapper crashes when number of replicas is less than total number of osds to be s...
- Ran valgrind with the patch and no errors were found with different rule combinations of num_rep and number of osds t...
- 12:44 PM Bug #9417: "Segmentation fault" in upgrade:dumpling-giant-x-master-distro-basic-vps run
- Same issue in job http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-26_10:44:24-upgrade:dumpling-giant-x:paralle...
- 12:03 PM devops Bug #9607 (Resolved): wrong epel-release version present in misc-ceph repo
- That epel release 7 RPM should never have been put in that repo. It has been removed and moved to its correct location and cep...
- 11:31 AM devops Bug #9607 (Resolved): wrong epel-release version present in misc-ceph repo
- In a CentOS 6 box where we run `yum install epel-release` it now sees that it needs to update to use the epel-release...
- 11:35 AM devops Bug #9603: No package ceph-debuginfo-0.67.10 available in upgrade:dumpling-firefly-x-giant-distro...
- Would be helpful to include:...
- 11:08 AM devops Bug #9603: No package ceph-debuginfo-0.67.10 available in upgrade:dumpling-firefly-x-giant-distro...
- Same issue in suite:upgrade:dumpling-giant-x
http://pulpito.front.sepia.ceph.com/teuthology-2014-09-26_10:44:24-up...
- 08:12 AM devops Bug #9603 (Rejected): No package ceph-debuginfo-0.67.10 available in upgrade:dumpling-firefly-x-g...
- In run http://pulpito.front.sepia.ceph.com/teuthology-2014-09-25_19:25:02-upgrade:dumpling-firefly-x-giant-distro-bas...
- 11:34 AM rbd Feature #2466 (Resolved): librbd: add invalidate_cache function to interface
- This was added a while back in commit:5d340d26dd70192eb0e4f3f240e3433fb9a24154
- 11:18 AM RADOS Bug #9606 (New): mon: ambiguous error_status returned to user when type is wrong in a command
- ...
- 10:10 AM Bug #9592: librados: Not able to create Large Files with Librados
- Nice catch Pavan Rallabhandi ;-) I had trouble reproducing the problem because I forgot the "LD_LIBRARY_PATH=.libs" ...
- 09:37 AM Bug #9592: librados: Not able to create Large Files with Librados
- The minimal script...
- 08:05 AM Bug #9592 (In Progress): librados: Not able to create Large Files with Librados
- 04:44 AM Bug #9592: librados: Not able to create Large Files with Librados
- ...
- 10:01 AM devops Bug #9548 (Rejected): ceph mon creation failed for centOS
- 09:39 AM Feature #9302 (Fix Under Review): mon: 'ceph osd pool ls' command
- https://github.com/ceph/ceph/pull/2581
- 09:34 AM devops Bug #9232: disk zap doesnt remove the dmcrypt settings on disk
- I think that `disk zap` would certainly have to clear the dmcrypt flags on the disk.
Can you make sure that it doe...
- 08:56 AM rgw Bug #9605 (Won't Fix): rgw: need to have shadow objects named after head object
- 08:55 AM rgw Feature #9604 (Resolved): rgw: create a tool for orphaned objects cleanup
- 08:07 AM devops Bug #9567 (New): Missing packages in upgrade:dumpling-firefly-x-giant-distro-basic-vps run
- Still see in today's run:
http://pulpito.front.sepia.ceph.com/teuthology-2014-09-25_19:25:02-upgrade:dumpling-fire...
- 07:41 AM Fix #9601: erasure-code: ErasureCode::encode overhead is too high
- The handling got more complicated due to the updated padding handling.
It's a little bit faster. jerasure_matrix_e...
- 05:17 AM Fix #9601: erasure-code: ErasureCode::encode overhead is too high
- The overhead has shifted but looks globally the same with https://github.com/ceph/ceph/pull/2558
!{width: 100%}jannau...
- 03:52 AM Fix #9601: erasure-code: ErasureCode::encode overhead is too high
- Applying https://github.com/ceph/ceph/pull/2558 and benchmarking again
- 03:34 AM Fix #9601 (New): erasure-code: ErasureCode::encode overhead is too high
- When encoding 4KB buffers it is ~15% of the total CPU being used although it is only preparing the buffers.
!{width:...
- 05:26 AM rbd Bug #9602 (Closed): rbd export -> nc ->rbd import = memory leak
- I see a memory leak when importing a raw device.
Export Scheme:
[rbd@rbdbackup ~]$ rbd --no-progress -n client.rbdb...
- 01:58 AM Cleanup #9600 (New): rework bufferlist::*aligned* functions
- The align function should allow 32 byte alignment (for SIMD instructions) or page alignment (for I/O). There should b...
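The two alignment targets named in this request (32 bytes for SIMD, page size for I/O) both reduce to rounding a length up to a power-of-two boundary. The sketch below only illustrates that arithmetic; `align_up` is a hypothetical helper, not part of the Ceph bufferlist API, and the 4096-byte page size is an assumption that varies by platform.

```python
def align_up(length: int, alignment: int) -> int:
    """Round length up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Adding (alignment - 1) and masking off the low bits rounds upward.
    return (length + alignment - 1) & ~(alignment - 1)


SIMD_ALIGN = 32    # alignment mentioned for SIMD instructions
PAGE_ALIGN = 4096  # assumed page size; the real value is platform dependent

for length in (1, 32, 33, 4096, 5000):
    print(length, align_up(length, SIMD_ALIGN), align_up(length, PAGE_ALIGN))
```

A single helper parameterized by the alignment covers both cases, which is one way a reworked interface could avoid separate page-only and SIMD-only code paths.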
- 01:00 AM Bug #8592: sgdisk no longer likes `--change-name` when creating partitions
- I fixed this by adding the --zap-disk option; I hope this helps.
- 12:44 AM Bug #9419: dumpling->firefly upgrade, sending setallochint?
- Thanks for explaining. Since alloc hint is optional, it does not matter if it is activated and deactivated later.
- 12:22 AM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
- Please disregard #15. I just fell victim to inaccurate documentation about the @incomplete@ PG state.
-Sam's hunch...
09/25/2014
- 11:59 PM Bug #9592: librados: Not able to create Large Files with Librados
- A modified script to debug this issue:-
####################################
import rados
import sys
try:
cluste...
- 10:02 AM Bug #9592: librados: Not able to create Large Files with Librados
- If I were to guess, something in the stack is converting the size value down to an int32 and then back up to int64, s...
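The kind of narrowing guessed at here can be demonstrated in isolation. The sketch below is not a diagnosis of this bug and does not use librados; `through_int32` is a hypothetical helper that shows what happens to a size value that is squeezed into a signed 32-bit integer and then widened back to 64 bits.

```python
import ctypes


def through_int32(size: int) -> int:
    """Narrow a size to a signed 32-bit int, then widen it back to 64 bits."""
    narrowed = ctypes.c_int32(size).value  # keeps only the low 32 bits, as signed
    return ctypes.c_int64(narrowed).value  # widening preserves the damaged value


GIB = 2 ** 30
for size in (GIB, 2 * GIB, 3 * GIB, 4 * GIB):
    print(f"{size:>12} -> {through_int32(size)}")
```

Sizes below 2 GiB survive the round trip unchanged, while larger ones come back negative or wrapped, so a length check after such a conversion would see a different value than the caller passed.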
- 09:51 AM Bug #9592 (Can't reproduce): librados: Not able to create Large Files with Librados
- ...
- 06:28 AM Bug #9592 (Resolved): librados: Not able to create Large Files with Librados
- I found this issue while I was trying to run a 1GB write COSBench workload using librados. (My 1MB write & read run was...
- 11:08 PM rgw Bug #9588: Keystone s3 auth integration lacking access_key = tenant:user ability supported by swi...
- Despite asking for swift, it seems I am actually getting the nova object store doing the s3 stuff. I'll comment again ...
- 10:16 PM rgw Bug #9588: Keystone s3 auth integration lacking access_key = tenant:user ability supported by swi...
- Hmm - maybe not tested enough, as it looks like the way devstack sets up the swift s3 layer is a bit screwy, and almo...
- 07:45 PM Documentation #9542: Error link:"Ceph Object Gateway"->"Manual Install"
- I know it "*is the way the doc is generated*", and I know "*it's not a bug in a link*", too. (Guess it's Sphinx?) But ...
- 09:39 AM Documentation #9542 (Won't Fix): Error link:"Ceph Object Gateway"->"Manual Install"
- This is the way the doc is generated, it's not a bug in a link. And it actually makes more logical sense to jump from...
- 06:23 PM CephFS Feature #541 (Resolved): mds: tempsync
- this is implemented... TSYN and related states
- 06:21 PM Feature #1092 (Rejected): mon: checkpointing
- 06:19 PM Feature #131 (Resolved): bring wireshark plugin up to date
- 05:47 PM CephFS Feature #630 (Resolved): release caps on inodes unlinked by other clients
- 05:47 PM CephFS Feature #630: release caps on inodes unlinked by other clients
- dup of #5039. already fixed by commit f8a947d92 client: trim deleted inode
- 05:03 PM Feature #9568 (Resolved): Add test case to test #9419 (ceph wip-9419)
- 04:34 PM CephFS Bug #9514: ceph-fuse pjd test is failing in giant nightlies
- This hasn't reproduced since we turned on debug logging. :(
But I did see it on a run without any logging: /a/gregf-...
- 04:09 PM Feature #9580: ceph-disk, ceph-osd: make journal [partition] creation conditional based on osd_ob...
- While we are thinking about this, note that some of the keyvalue backends have facility to have their "wal" aka journ...
- 04:05 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- I don't see how it could be related to a problem in align_bl or bufferlist::rebuild_align. The worst these could do i...
- 03:28 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- Maybe "iterating the bufferptr":https://github.com/ceph/ceph/blob/dumpling/src/os/FileJournal.cc#L1297 can return buf...
- 03:03 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- ...
- 02:43 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- Sheldon, could you upload the full log somewhere if you still have it ?
- 01:53 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- * align_bl related pull request https://github.com/ceph/ceph/pull/2501
* rebuild_align fix (back from 2013) https:/...
- 02:14 PM Bug #9203 (In Progress): ceph_test_rados: ObjectDesc::iterator::advance(bool): Assertion `pos < l...
- 02:14 PM Feature #9598 (Resolved): re-enable Objecter fast dispatch
- We had to nix fast dispatch on the Objecter because it could deadlock in conjunction with mark_down() calls.
Fixin...
- 02:06 PM Bug #9536: erasure-code: ISA plugin alignment must be constant
- (parts of) this will need to be backported with the rest of the ISA plugin stuff
- 02:05 PM Bug #9536 (Pending Backport): erasure-code: ISA plugin alignment must be constant
- 01:52 PM Bug #9389 (Need More Info): ec pg stuck peering, did not send query for one shard
- commit:d851c3f2338e8d17dfd78d631b9f7977365356aa adds better debug output (and cleans up a bit)
- 01:21 PM rbd Bug #9595 (Resolved): librbd: internal methods can operate on extra objects when non-default stri...
- ...
- 01:04 PM Bug #9295 (Resolved): osd/OSD.cc: 5501: FAILED assert(session) in ms_fast_dispatch
- 01:03 PM Bug #9295 (Duplicate): osd/OSD.cc: 5501: FAILED assert(session) in ms_fast_dispatch
- 01:03 PM Bug #9295: osd/OSD.cc: 5501: FAILED assert(session) in ms_fast_dispatch
- dup of #9462
- 01:01 PM Bug #9462 (Resolved): msgr deadlock: osd reply vs mark_down vs fault
- 12:44 PM Bug #8910 (Duplicate): ceph_test_objectstore: ObjectStore/StoreTest.ManyObjectTest/0 failure on f...
- pretty sure this is a dup of #8395
- 12:27 PM Bug #9582: librados: segmentation fault on timeout
- i'm going to see if we can just skip the rx_buffers zero-copy paths when a timeout is present
- 12:20 PM Bug #9388: osd/PG.cc: 2945: FAILED assert(r == 0) in update_snap_map
- import/export related
- 11:44 AM Bug #9571 (Resolved): rocksdb testing with powercycling fails on trusty
- this was an issue with the code fix and not a product bug.
resolved now.
- 11:14 AM Bug #9419: dumpling->firefly upgrade, sending setallochint?
- On any change of pg configuration peering happens, so a new collection of feature bits from the peers is collected....
- 10:37 AM Bug #9419 (Fix Under Review): dumpling->firefly upgrade, sending setallochint?
- 12:46 AM Bug #9419: dumpling->firefly upgrade, sending setallochint?
- What happens if
* all OSDs in a PG support setallochint
* one secondary OSD goes down
* the secondary is replac...
- 11:09 AM Bug #8395: ceph-test-objectstore doesn't clean up
- backported to firefly branch
- 11:08 AM Feature #9594 (New): stop backfill when osd becomes too full
- We will currently refuse the reservation, but we don't actually stop backfill once it is started.
- 10:45 AM Bug #9480: OSD is crashing while object deletion
- 10:42 AM Bug #9390 (In Progress): EEXIST on split due to import/export
- 10:38 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- 10:37 AM Bug #9584: OpTracker segfault on shutdown (firefly)
- shutdown race is not so important
- 10:17 AM devops Tasks #8366: Update ceph.com/docs to default to the latest major release (0.80)
- John Wilkins wrote:
> We need to review this a bit further. Pointing to the latest major release is fine, but we nee...
- 10:09 AM Bug #9593 (Resolved): osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked()) (firefly)
- 10:02 AM Bug #9593 (Fix Under Review): osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked()) (fir...
- https://github.com/ceph/ceph/pull/2576
- 09:57 AM Bug #9593 (Resolved): osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked()) (firefly)
- ...
- 09:24 AM Feature #9532 (Duplicate): rados.py should export omap interface
- #6114
- 06:31 AM rgw Bug #9469: RadosGW performance degrades with high concurrency workload.
- Debugging further, I was able to root cause the issue. I enabled debug logs for radosgw (20/20), enabled acces...
- 03:31 AM CephFS Bug #9562 (Fix Under Review): Lockdep assertion in Filer purge
- https://github.com/ceph/ceph/pull/2572
- 02:58 AM Bug #9579: Default parameters are not getting initialized for EC profile using isa EC plugin
- Documentation update on k/m, good catch ! https://github.com/ceph/ceph/pull/2571
- 02:52 AM Bug #9579: Default parameters are not getting initialized for EC profile using isa EC plugin
- This is confusing and I added http://tracker.ceph.com/issues/9589 to work on improving the user experience. Thanks fo...
- 02:49 AM Bug #9579: Default parameters are not getting initialized for EC profile using isa EC plugin
- From the code, it seems the default values of k & m for the "isa" profile are 7 & 3 respectively.
class ErasureCodeIsaDefau...
- 02:41 AM Bug #9579 (Won't Fix): Default parameters are not getting initialized for EC profile using isa EC...
- There are no defaults for k/m for the isa plugin, the parameters need to be set explicitly as documented at http://ce...
- 02:49 AM Feature #9589 (Resolved): erasure-code: query plugin for erasure-code-profile defaults
- When a parameter is missing from an erasure-code-profile (ruleset-failure-domain for instance) it falls back to the d...
- 02:14 AM Bug #8863: osd: second reservation rejection -> crash
- Hi Sage,
We are still getting this issue, even though the commit is included in our build. Any updates?
- 01:24 AM RADOS Bug #9492: Crush Mapper crashes when number of replicas is less than total number of osds to be s...
- Running in debug mode with https://github.com/ceph/ceph/pull/2568 (using the crushmap created as in the description):...
- 12:56 AM CephFS Bug #9563 (Resolved): kcephfs crash in ceph_mdsc_do_request
- 12:55 AM CephFS Bug #9564 (Resolved): kcephfs crash in _nfs4_do_open
- the bug is fixed upstream commit f39c0104 (NFS: remove BUG possibility in nfs4_open_and_get_state). I rebased the tes...
- 12:19 AM Bug #9485: Monitor crash due to wrong crush rule set
- Thanks so much.
BTW:
I repeat this in my dev environment with 60 osds on one host. I create 6 virtual racks. (you...
- 12:09 AM Bug #9485: Monitor crash due to wrong crush rule set
- Thanks for the detailed instructions. I'll try them to repeat the problem.
09/24/2014
- 10:23 PM rgw Bug #9588 (Rejected): Keystone s3 auth integration lacking access_key = tenant:user ability suppo...
- For instance according to http://docs.openstack.org/grizzly/openstack-object-storage/admin/content/configuring-openst...
- 08:40 PM Bug #9485: Monitor crash due to wrong crush rule set
- K=8 M=4 doesn't work.
I rebuild the cluster and do the following steps.
(delete all pools)
1. create a profile...
- 04:37 AM Bug #9485: Monitor crash due to wrong crush rule set
- Could you please let me know if it always works with *K=8 M=4*?
- 01:17 AM Bug #9485: Monitor crash due to wrong crush rule set
- I know that I need 11 and the rule provides 12, and it looks like CRUSH will do the truncation.
It doesn't seem to be an iss...
- 12:42 AM Bug #9485: Monitor crash due to wrong crush rule set
- You have *K=8 M=3* which means your pool needs 11 OSDs. However the rule you defined will always provide 12 OSDs and ...
- 12:22 AM Bug #9485: Monitor crash due to wrong crush rule set
- The profile used for the ecpool is K=8 M=3.
If I set the min_size = 3, max_size = 12 (as default), the monitor cras...
- 12:04 AM Bug #9485: Monitor crash due to wrong crush rule set
- Could you also attach the log of the monitor crash you are seeing? Note that if you change a crush rule that is currentl...
- 08:16 PM Bug #9585: ceph assertion using rocksdb store in master branch
- It looks like a powercycle makes the header's bitmap inconsistent with the actual data keys.
- 11:10 AM Bug #9585 (Can't reproduce): ceph assertion using rocksdb store in master branch
- ceph version 0.85-980-gc5906ec (c5906eca2ffa837891ba7d84775ece7b91f6c5c8)
ceph assertion when rocksdb is used for ...
- 07:47 PM CephFS Bug #6613: samba is crashing in teuthology
- Still happening
/a/teuthology-2014-09-22_23:14:01-samba-giant-testing-basic-multi/50607
- 07:43 PM CephFS Bug #8427: ceph-fuse: Dumpling "cache still has 0+1 items, waiting (for caps to release?)" on shu...
- /a/teuthology-2014-09-22_19:06:01-fs-dumpling-testing-basic-multi/505408
Grabbed all the logs out of /var/log/ceph...
- 04:21 PM Bug #6697 (Resolved): strncmp(3) must not be used on binary data
- 07:02 AM Bug #6697 (Fix Under Review): strncmp(3) must not be used on binary data
- https://github.com/ceph/ceph/pull/2567
- 06:51 AM Bug #6697: strncmp(3) must not be used on binary data
- 03:53 PM Bug #8910 (In Progress): ceph_test_objectstore: ObjectStore/StoreTest.ManyObjectTest/0 failure on...
- reopening this bug as it seems to happen in the nightlies,
log: http://qa-proxy.ceph.com/teuthology/teuthology-...
- 02:51 PM devops Bug #9489 (Rejected): --zap-disk does not clear enough
- ...
- 10:22 AM devops Bug #9489 (Can't reproduce): --zap-disk does not clear enough
- 10:19 AM devops Bug #9489: --zap-disk does not clear enough
- I believe the original cause of report was likely in error unrelated to ceph-disk. Loic, you had mentioned you might ...
- 09:43 AM devops Bug #9489 (Need More Info): --zap-disk does not clear enough
- A bit more context is needed here, how/what doesn't work as expected? Is it possible to reproduce?
When zap disk d...
- 02:27 PM Fix #3180: use of strerror() for possibly-negative return values
- Yeah, I actually fixed this, and forgot the bug still existed.
- 05:51 AM Fix #3180 (Rejected): use of strerror() for possibly-negative return values
- I could not find an instance where strerror is used instead of cpp_strerror in the current master...
- 02:27 PM Feature #4611: cephtool: set-quota, no get-quota
- heh, bug 4611 duplicates bug 8523, does it? :)
- 05:18 AM Feature #4611 (Duplicate): cephtool: set-quota, no get-quota
- 02:22 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- 2014-09-22 16:00:20.680448 7fee6abcf700 0 -- 10.10.10.7:6808/25820 >> 10.10.10.16:0/1007485 pipe(0xba12a00 sd=628 :6...
- 02:21 PM rgw Bug #9587 (Resolved): ceph-radosgw sysvinit script on EL6 cannot set ulimit
- The script tries to set ulimit -n 32768 as the apache user. It errors to:
bash: line 0: ulimit: open files: cannot m...
- 02:18 PM CephFS Bug #8576: teuthology: nfs tests failing on umount
- https://github.com/ceph/teuthology/pull/336
- 01:58 PM Bug #9113: osd: snap trimming eats memory, linearly
- 01:57 PM Feature #9568: Add test case to test #9419 (ceph wip-9419)
- Test case:
@0-cluster / start.yaml@...
- 12:36 PM Bug #9582: librados: segmentation fault on timeout
- Okay, looks like this is another race:
1) The message is coming in over the wire, and the Pipe grabs a preallocated ...
- 07:38 AM Bug #9582 (Resolved): librados: segmentation fault on timeout
- Summary: If you configure librados with rados_osd_op_timeout, timeouts will sometimes result in a segmentation fault....
- 10:52 AM Bug #9584: OpTracker segfault on shutdown (firefly)
- /a/samuelj-2014-09-23_14:40:50-rados-firefly-wip-testing-old-vanilla-basic-multi/507309 (once it times out)
- 10:52 AM Bug #9584 (Can't reproduce): OpTracker segfault on shutdown (firefly)
- #0 0x00007f5ec74baf07 in _dl_map_object_deps (map=map@entry=0x7f5ec76bc4e8, preloads=preloads@entry=0x0, npreloads=n...
- 10:37 AM Messengers Bug #1803 (New): msgr: behave better when ending TCP connections
- This has been greatly improved with the addition of our socket timeouts and things, but I don't think it's properly r...
- 03:12 AM Messengers Bug #1803 (Resolved): msgr: behave better when ending TCP connections
- Not sure at which point this problem was fixed but it is doubtful that it stayed around for the past three years unno...
- 10:21 AM Bug #9554 (Can't reproduce): "FAILED assert(0 == "hit suicide timeout")" in upgrade:firefly-firef...
- Looks like just an overloaded node.
- 10:17 AM RADOS Feature #4650: osd: separate OSD names from their IDs
- We expose OSD IDs in lots of places — like error reporting. But users can't specify those IDs (although they could on...
- 05:25 AM RADOS Feature #4650: osd: separate OSD names from their IDs
- From a system administration point of view there is no need to know about the OSD id. Naming the OSDs with human read...
- 10:05 AM RADOS Bug #8984 (Won't Fix): creating erasure-code pool when not having a root item default
- The recommended way to deal with the absence of a *default* root is to define an erasure-code-profile that "specifies...
- 10:00 AM Bug #8942: Bad JSON output in ceph osd tree
- 10:00 AM CephFS Cleanup #2378 (Resolved): "ceph -s" MDS output is confusing
- We don't print mds status if there's not an FS any more.
- 09:42 AM RADOS Feature #6114: Complete python binding interfaces for librados
- It went stale as I couldn't keep up with the changes to the modules themselves as the modifications were significant...
- 08:34 AM RADOS Feature #6114: Complete python binding interfaces for librados
- What has become of https://github.com/ceph/ceph/commits/wip-5900 ? Is there a reason why it was not merged ? Or am I ...
- 09:40 AM Bug #9556 (Duplicate): Segmentation fault in upgrade:dumpling-firefly-x-giant-distro-basic-multi ...
- 09:39 AM Bug #9556: Segmentation fault in upgrade:dumpling-firefly-x-giant-distro-basic-multi run
- From Sam's advice to look for something related to "Read Timeout" and from the log, this seems to be a duplicate of #...
- 09:29 AM RADOS Feature #6421: FileStore: Op unit tests
- change the %Done to reflect the fact that there is work done already.
- 09:16 AM devops Fix #8508: packaging: deb repository key should be @redhat.com
- The deb repository key just needs to be re-created with a @redhat.com email
- 09:08 AM Bug #8323 (Duplicate): mon_osd_allow_primary_affinity Can not be Injected
- 08:36 AM Feature #5511 (Duplicate): rados.py support for object locking
- #6114
- 08:26 AM Bug #7843 (Can't reproduce): OSD fails to start
- Feel free to re-open if you have a HOWTO reproduce the issue. If you figured out what was wrong, it would be nice if ...
- 08:16 AM rgw Feature #7680: Use new civetweb git repo for ceph
- The repository copy is useful when fixes are needed. They can diverge from upstream while the change is proposed.
- 08:12 AM Feature #7664 (Resolved): systemd service files
- https://github.com/ceph/ceph/tree/giant/systemd
- 08:10 AM Bug #7623 (Resolved): local 'best' uninitialized in Objecter
- Fixed by 605e645026487519d4195358330832b3369b531d
- 08:05 AM Bug #6101: ceph-osd crash on corrupted store
- Bumping so it does not get to the bottom of the list for the next bug scrub.
- 08:01 AM Bug #7368: ceph osd repair * blocks after some minutes and prevent other ceph pg repair commands
- Another mention of things slowing down when repair is almost complete : http://tracker.ceph.com/issues/9566 . Not sur...
- 07:52 AM Bug #7409 (Can't reproduce): "make check" doesn't work without --with-radosgw
- ...
- 07:43 AM Bug #9362: librados, rados_read corrupts memory on timeout
- Update: The patch branch I used did not contain the complete code that has been merged to the dumpling branch. Using ...
- 07:29 AM Feature #7340 (Duplicate): rados.py does not expose object locking
- 07:18 AM Cleanup #7105 (Closed): There are three different ways to retrieve an authentication key
- It is not necessary indeed. However, now that it has been published it would be non backward compatible to remove any...
- 07:09 AM Bug #6834 (Can't reproduce): nightlies: monitor crashed in emperor
- It either showed up again and has been associated with another issue or it has been fixed.
- 06:59 AM rgw Feature #9581 (New): Ability to move objects to a second storage tier based on policy
- To be compatible with AWS S3 API features like bucket lifecycle, ceph should have the ability to move the object from standard ...
- 06:48 AM Feature #6687: Ability to set up/down/in/out based on CRUSH hierarchy
- +1
- 06:47 AM Feature #3604 (Resolved): print lookup path when reporting -ENOENT to user-space
- 06:45 AM Feature #6567 (Rejected): emit warning on unknown/ invalid configuration directives
- This is unfortunately not possible as there is no central place to query to know what is a valid option and what is n...
- 06:26 AM Bug #6371 (Duplicate): rados bench segfaults when read --block-size < write --block-size
- 06:09 AM devops Bug #9506: Pass monitor SSH addresses via CLI flag
- This will be *very* tricky to do with CLI flags, so after discussing this with Kyle, it was decided that using the ce...
- 05:28 AM devops Bug #9506: Pass monitor SSH addresses via CLI flag
- The use case is ceph-deploy is being executed on a management node, homed on a management network. The monitors are m...
- 06:09 AM Feature #5521: Enhance PGLS or new op to list all namespace/objects in a pool.
- 06:08 AM Feature #5521 (Duplicate): Enhance PGLS or new op to list all namespace/objects in a pool.
- 06:02 AM Feature #9580 (Resolved): ceph-disk, ceph-osd: make journal [partition] creation conditional base...
- For example, with keyvaluestore-dev ceph-disk makes a journal partition and generally screws things up. See http://artic...
- 05:46 AM Bug #9579 (Won't Fix): Default parameters are not getting initialized for EC profile using isa EC...
- When an EC profile is created using erasure code plugin "isa", default values for parameters k, m and technique are ...
- 05:43 AM Feature #4771 (Rejected): Snippet / included configuration
- Loic Dachary wrote:
> The ceph.conf file tends to disappear almost entirely. The mons can contain all the information...
- 05:33 AM Feature #4771: Snippet / included configuration
- The ceph.conf file tends to disappear almost entirely. The mons can contain all the information and are a central poin...
- 05:36 AM Feature #4230 (Resolved): librados: node.js bindings
- https://github.com/ksperis/node-rados
- 05:25 AM devops Bug #9510 (Closed): ceph-deploy: Move mon keyring generation 'mon create-initial'
- 05:12 AM Feature #2158 (Duplicate): cephtool: helpful error/timeout when no monitor quorum
- 04:54 AM Subtask #4306 (Resolved): make the new snap trimmer design work with split
- 04:46 AM Feature #2147 (Resolved): objclass: add CLS_ERR macro
- https://github.com/ceph/ceph/blob/giant/src/objclass/objclass.h#L31
- 03:18 AM Feature #4005: Add perftools to the kernel debian package script
- Any progress ?
- 03:17 AM Feature #1810 (Resolved): monclient: timeouts?
- Implemented by 671a76d64bc50e4f15f4c2804d99887e22dcdb69
- 03:04 AM Bug #4206 (Resolved): concurrent rados bench processes don't work well for seq reads
- Implemented by 308758b7878c48ab64caf71ff646e057c2c1c5aa
- 03:01 AM Fix #4202: osd: pg delete
- a command that deletes a designated pg ? If so it would help to have a use case.
- 02:56 AM Support #3902 (Closed): S3-tests need to cleanup after themselves
- Tests are run on short lived machines and this won't be an issue.
- 02:54 AM Feature #3855 (Resolved): Making Scrubs Nicer
- 02:52 AM Documentation #3846 (Resolved): Debian install has incorrect gitbuilder URL
- The install pages have been reworked.
- 02:49 AM Feature #3202 (Resolved): tools: coverity clean
- An ongoing effort by Danny Al-Gaaf
- 02:42 AM Feature #3241 (Resolved): qa: integration tests for mon, osd, and mds caps
- There now are caps tests run by teuthology : https://github.com/ceph/ceph/blob/giant/qa/workunits/mon/caps.py https:/...
- 02:29 AM Feature #3095 (Resolved): rbd tool resize improvements
- ...
- 02:27 AM Feature #3083 (Resolved): Provide separate APT repos for argonaut, bobtail, etc; stable would alw...
- Not as suggested but the stable repositories are organized in a sensible way.
- 02:23 AM Feature #2953 (Resolved): append() in librados is not exposed to python API
- Implemented by 39bf68c3ceee3f62960d0866f35835325cca5660
- 02:19 AM Bug #2848: OSDMap: pool_id is 64-bit, but pool_max is 32-bit
- "still valid":https://github.com/ceph/ceph/blob/giant/src/osd/OSDMap.h#L206
- 02:16 AM Feature #2812 (Resolved): automated CentOS testing
- RPM based operating systems are now part of the teuthology runs.
- 02:14 AM Feature #2776 (Resolved): rados tool: bulk removal of objects
- Implemented by cc8df29e19a1fc441ad903aeeb59f7d3e15a5e7c
- 02:08 AM Feature #2755 (Resolved): ceph-conftool: optionally return the default for a config option if no ...
- Marking as resolved since there now is a way to get the default value, although not as suggested....
- 02:00 AM Cleanup #2671 (Resolved): buffer.h: do efficient buffer comparisons
- Resolved by 2a46564158ebf519ae6e7ee318b97c61cf032692 with content_equals
- 01:53 AM Tasks #2529 (Resolved): debian: Merge packaging changes from Ubuntu 12.04
- There is no longer a difference.
- 01:50 AM Feature #2519 (Resolved): rados: allow setting pg_num and pgp_num when creating a pool
- Using a mon cmd to create the pool instead of the specialized function supports setting pg_num / pgp_num.
- 01:40 AM Bug #2154 (Resolved): rados: bench seq should not segfault when blocksize doesn't match write blo...
- ...
- 01:32 AM Feature #2112 (Resolved): msgr fault injection
- Starting with 90f66980bfb1f2541dcb11be2c358a9832a291b1 in November 2012, a number of *OPTION(ms_inject_...* options have be...
- 01:07 AM Feature #1583 (Resolved): osd: bound pg log memory usage
- Memory consumption has improved/changed a lot since this ticket was open and I believe this issue is no longer relevant.
- 01:04 AM Feature #1619 (Resolved): libvirt: test with selinux/apparmour enabled
- I believe this has been extensively tested in the context of OpenStack
- 12:59 AM Feature #1525 (Resolved): qa: check out fio, add to ceph-qa-suite if it's good
- https://github.com/ceph/ceph-qa-suite/blob/giant/suites/tgt/basic/tasks/fio.yaml and https://github.com/ceph/ceph/blo...
- 12:52 AM Tasks #1418: set up a no-atomic-ops gitbuilder
- gitbuilders currently use *--with-libatomic-ops*
- 12:34 AM Feature #543 (Resolved): PG::search_for_missing: don't iterate over all missing
- The code base changed significantly and does not have this problem anymore.
- 12:30 AM Feature #1091 (Duplicate): librados: support pgls filter
- http://tracker.ceph.com/issues/9262
- 12:24 AM Cleanup #1042: need const iterator for bufferlist
- "still valid":https://github.com/ceph/ceph/blob/giant/src/include/buffer.h#L240
09/23/2014
- 11:58 PM Bug #9485: Monitor crash due to wrong crush rule set
- What probably happens is that you created an erasure code profile with k+m that is lower than the number of OSDs prov...
- 07:03 PM Bug #9485: Monitor crash due to wrong crush rule set
- Because the monitor crashed and cannot be restarted, I currently cannot get "ceph osd dump".
I checked the i...
- 08:40 AM Bug #9485: Monitor crash due to wrong crush rule set
- Could you also please add the output of *ceph osd dump* ? It looks like you have run into http://tracker.ceph.com/iss...
- 08:38 PM Bug #9558: Both op threads and dispatcher threads get hung even for few minutes during peering stage
- More info:
When an OSD daemon/host is down, some PGs become active+degraded, while others are still active+clean. As ...
- 06:16 PM CephFS Bug #9562: Lockdep assertion in Filer purge
- can we just unlock the PurgeRange/Probe locks before using the objecter?
- 06:21 AM CephFS Bug #9562 (In Progress): Lockdep assertion in Filer purge
- 06:21 AM CephFS Bug #9562: Lockdep assertion in Filer purge
- So I think this bug already existed with the Probe lock, but it was triggered by the new PurgeRange lock, because t...
- 05:48 PM Bug #9528: RadosModel assertion failure in firefly
- sam, please mention the parent bug.
- 01:22 PM Bug #9528 (Duplicate): RadosModel assertion failure in firefly
- 05:07 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
- fix is in wip-9487 and wip-sam-testing
- 02:28 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
- I'm not 100% sure, so I thought I'd ask: what's the exact reason for the PG being marked incomplete here? Is it the...
- 02:05 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
- 01:58 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
- https://github.com/ceph/ceph/pull/2525
The num_trimmed does not seem to be reset. I think you are not trimming at...
- 06:12 AM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
- Whoa. While a cluster with this patch applied doesn't spin like crazy in snap_trim anymore, killing an OSD seems to i...
- 04:06 PM Bug #9113: osd: snap trimming eats memory, linearly
- It's not just dumpling, the repops set in the snap trimmer is just wonky. We need to trim a bounded set of objects, ...
- 03:47 PM Bug #9554: "FAILED assert(0 == "hit suicide timeout")" in upgrade:firefly-firefly-testing-basic-v...
- The crashed osd.5 was on a node that had a load average of 38. osd.1 didn't see ping responses although it saw o...
- 03:27 PM rbd Bug #8187: librbd: list_children() reports duplicates with cache pools
- Never mind, figured it out. Apparently it's not enough to set pool2 up as a tier of pool1, it also has to be an over...
- 02:51 PM rbd Bug #8187: librbd: list_children() reports duplicates with cache pools
- Josh, I'm having trouble reproducing this. Do you have a test case?
- 10:55 AM rbd Bug #8187 (In Progress): librbd: list_children() reports duplicates with cache pools
- 02:31 PM CephFS Bug #9564: kcephfs crash in _nfs4_do_open
- /a/teuthology-2014-09-22_23:10:02-knfs-giant-testing-basic-multi/506055/teuthology.log
- 02:26 PM Bug #9462: msgr deadlock: osd reply vs mark_down vs fault
- Finally got through a suite run and it looks pretty good, but need to check the few failures:
http://pulpito.ceph.co...
- 02:26 PM devops Bug #9268 (Resolved): Recipe errors in rgw:multifs-dumpling-testing-basic-vps
- Fixed this in ceph-qa-chef. I thought there was another issue open so in teuthology and assigned to me, this was maybe...
- 02:23 PM devops Bug #9267 (Resolved): "Gem::DependencyError" in upgrade:dumpling-dumpling-distro-basic-vps
- Problematic images now include chef.
- 02:20 PM devops Bug #9489: --zap-disk does not clear enough
- 02:15 PM devops Bug #9567: Missing packages in upgrade:dumpling-firefly-x-giant-distro-basic-vps run
- Was caused when moving to the new rhel7 gitbuilder: firefly was committed but not built on the new one when the old one was...
- 02:14 PM devops Bug #9567 (Resolved): Missing packages in upgrade:dumpling-firefly-x-giant-distro-basic-vps run
- 02:09 PM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- I had some comments on that pull request.
- 02:08 PM Bug #9545: filestore stuck in journal->should_commit_now() loop on shutdown
- 02:07 PM devops Bug #9548 (Need More Info): ceph mon creation failed for centOS
- The command `mon create-initial` does not take any hosts as arguments. It doesn't take any at all.
It will look at...
- 02:07 PM devops Bug #8976 (Resolved): httpd on RHEL7 (RHEL repo) incompatible with mod_fastcgi (ceph repo)
- Closing as Tamil tested and said it was good.
- 02:07 PM Bug #8629: cache_evict needs to prevent make_writeable from creating a snapdir
- 02:05 PM Bug #9285: osd: promoted object can get evicted before promotion completes
- I left a comment on a simpler approach.
- 02:00 PM Linux kernel client Bug #8568: libceph: kernel BUG at net/ceph/osd_client.c:885
- BUG_ON(!list_empty(&req->r_req_lru_item)) in __kick_osd_requests()
Can't reproduce but need to look harder into ho...
- 01:39 PM Bug #9472 (Duplicate): osd crash in -upgrade:dumpling-dumpling-distro-basic-vps suite
- 01:38 PM Bug #9476 (Duplicate): "Segmentation fault (core dumped)" in upgrade:dumpling-giant-x:parallel-gi...
- 01:35 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
- what was the assert?
- 01:33 PM Bug #9501 (Rejected): Assertion in FileJournal::do_write
- 01:27 PM Bug #9008: Objecter: pg listing can deadlock when throttling is in use
- 01:26 PM Bug #9422 (Can't reproduce): librados: client.admin authentication error (110) Connection timed out
- 01:25 PM Bug #9274 (Can't reproduce): "AssertionError: failed to recover before timeout expired" in upgrad...
- 01:21 PM Bug #9544 (Pending Backport): osd: pg deletion vs create race leads to EEXIST on mkcoll (dumpling)
- 01:18 PM rgw Bug #8587 (Fix Under Review): rgw: subuser object not created correctly
- 01:15 PM Bug #9418: mon: drop internal-purpose messages from clients without proper caps
- 01:14 PM Bug #9546 (Rejected): LibRadosWatchNotify.WatchNotifyTest failure
- 01:13 PM Bug #9293 (Pending Backport): _collection_move_rename EEXIST
- 01:12 PM Bug #9293 (Fix Under Review): _collection_move_rename EEXIST
- 01:12 PM rgw Feature #7467 (Fix Under Review): Make radosgw work with multiple hostnames
- 09:08 AM rgw Feature #7467 (In Progress): Make radosgw work with multiple hostnames
- 01:11 PM rgw Bug #5595: object has a Content-Type, but its content_type property is not shown in Swift object ...
- Needs review, can't set status on this tracker.
- 11:21 AM rgw Bug #5595: object has a Content-Type, but its content_type property is not shown in Swift object ...
- I think this happens if the object was created before, and then its metadata was modified. It's similar to another is...
- 01:06 PM Bug #9574: Backfill: recheck full status once reservation is granted
- 12:07 PM Bug #9574 (Resolved): Backfill: recheck full status once reservation is granted
- Otherwise, we queue many backfill reservations while we are not full and then each one is granted in turn without che...
- 01:05 PM Bug #9443 (Rejected): btrfs pwrite returns EEXIST on journal FileJournal::write_bl
- Not our bug.
- 12:57 PM rgw Bug #9575: s3tests.functional.test_s3.test_region_copy_object fails (races with radosgw-agent?)
- teuthology@teuthology:/a/teuthology-2014-09-22_23:02:01-rgw-giant-testing-basic-multi/505881
- 12:56 PM rgw Bug #9575: s3tests.functional.test_s3.test_region_copy_object fails (races with radosgw-agent?)
- teuthology@teuthology:/a/teuthology-2014-09-22_23:02:01-rgw-giant-testing-basic-multi/505875
- 12:48 PM rgw Bug #9575: s3tests.functional.test_s3.test_region_copy_object fails (races with radosgw-agent?)
- Seems to me like it is timing out due to the slow EC backend.
- 12:34 PM rgw Bug #9575 (Duplicate): s3tests.functional.test_s3.test_region_copy_object fails (races with rados...
- ...
- 12:49 PM rgw Bug #9576 (Resolved): rgw: update object content-length doesn't work correctly
- This only applies to the swift POST object metadata api call.
- 11:51 AM devops Tasks #8366 (Fix Under Review): Update ceph.com/docs to default to the latest major release (0.80)
- We need to review this a bit further. Pointing to the latest major release is fine, but we need to have a way to cher...
- 11:43 AM Bug #8885 (Resolved): SIGABRT in TrackedOp::dump() via dump_ops_in_flight()
- 11:08 AM Bug #9547 (Resolved): python rados aio_read truncates returned buffer on \000
- 10:33 AM Bug #9482 (Resolved): osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log....
- 10:33 AM Bug #9339 (Resolved): ReplicatedPG crash in hitset_create
- 10:32 AM Bug #8777 (Resolved): osd/PGLog.h: 88: FAILED assert(rollback_info_trimmed_to_riter == log.rbegin())
- 10:32 AM Bug #9054 (Resolved): ceph_test_rados: FAILED assert(!old_value.deleted())
- 10:32 AM Bug #9326 (Resolved): osd crash in upgrade:dumpling-firefly-x-master-distro-basic-vps suite
- Does not need to be backported!
- 10:30 AM Bug #9240 (Resolved): osd_max_backfills = 1 can cause reserver deadlock for EC
- 10:30 AM Bug #9179 (Resolved): unfound objects, recovery timeout
- 10:30 AM Bug #9481 (Resolved): osd/PGLog.h: 87: FAILED assert(rollback_info_trimmed_to == head)
- 10:30 AM Bug #9497 (Resolved): choose_acting has to let the pg be down any time acting < min_size even if ...
- 09:39 AM Linux kernel client Bug #9573 (New): krbd: investigate a dd-in-a-loop slowdown
- Reported at the bottom of #8818.
- 09:38 AM rbd Bug #5768 (Fix Under Review): rbd-fuse: leak in enumerate_images()
- https://github.com/ceph/ceph/pull/2524
- 09:34 AM rbd Bug #6926 (Fix Under Review): rbd: diff output includes previously non-existent objects as zeroed...
- https://github.com/ceph/ceph/pull/2523
- 09:27 AM Feature #8188: librados: interface to inspect pool properties
- https://github.com/ceph/ceph/pull/2552
- 09:27 AM Feature #8188 (Fix Under Review): librados: interface to inspect pool properties
- 09:26 AM rgw Bug #7796 (Won't Fix): RGW Keystone token auth fails with '411 Length Required' when Keystone usi...
- The recommendation is to work around the issue using the aforementioned apache configuration.
- 09:14 AM rgw Bug #8676: md5sum check failed during readwrite.py
- This might have been fixed, downgrading it for now until it's dis/proved.
- 08:59 AM rgw Bug #8676: md5sum check failed during readwrite.py
- There's a chance this one is the same as #9307
- 09:07 AM rgw Bug #6611 (Won't Fix): RGW: Using underscores when setting headers returns 403
- The cgi interface prevents us from doing anything about it. With civetweb it'd be different, but at this point there'...
- 09:02 AM devops Bug #6592 (Can't reproduce): 3.8 kernel + /dev/cciss/c0d1 + precise : fail to show in /dev/disk/b...
- I lost access to the hardware before being able to properly reproduce / diagnose this border case.
- 08:59 AM rgw Bug #9307 (Pending Backport): "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dump...
- Should have been fixed by commit:d41c3e858c6f215792c67b8c2a42312cae07ece9
Note that when backporting also need to ...
- 08:57 AM Bug #9408: erasure-code: misalignment
- gitbuilder is all green
- 08:56 AM Bug #9408 (Fix Under Review): erasure-code: misalignment
- Corresponding pull request https://github.com/ceph/ceph/pull/2558
- 08:52 AM rgw Bug #9529 (Resolved): ./common/ceph_crypto.h: 83: FAILED assert(s == SECSuccess)
- 08:52 AM rgw Bug #9529: ./common/ceph_crypto.h: 83: FAILED assert(s == SECSuccess)
- Fixed by commit:7b137246b49a9f0b4d8b8d5cebfa78cc1ebd14e7
- 08:45 AM Bug #9381 (Resolved): "jerasure load dlopen(/usr/lib64/ceph/erasure-code/libec_lrc.so)" error in ...
- All rpm packages were eventually updated.
- 08:42 AM Bug #9224 (Can't reproduce): osd: segv in dlopen
- 08:29 AM Bug #9470 (Resolved): daemon pid file is not being created when running service ceph
- 08:29 AM Bug #9509 (Resolved): init script cannot stop OSDs
- 08:15 AM Bug #9572 (Fix Under Review): erasure-code: BlaumRoth default encoding regression
- 02:45 AM Bug #9572: erasure-code: BlaumRoth default encoding regression
- https://github.com/ceph/ceph/pull/2556
- 02:35 AM Bug #9572 (In Progress): erasure-code: BlaumRoth default encoding regression
- 02:10 AM Bug #9572 (Resolved): erasure-code: BlaumRoth default encoding regression
- Fixing the "bug on BlaumRoth w constraint":https://github.com/ceph/ceph/commit/9e2d04f7631cc7cd8444e7329890c2429a2d94...
- 06:31 AM Feature #9420: erasure-code: tools and archive to check for non regression of encoding
- 04:37 AM Feature #9343 (Resolved): erasure-code: allow upgrades for lrc and isa plugins
- 12:13 AM devops Bug #9506 (Rejected): Pass monitor SSH addresses via CLI flag
- There probably is something to be done to clarify the confusion between mon id and hostnames but it is another topic ;-)
09/22/2014
- 10:43 PM CephFS Bug #9563: kcephfs crash in ceph_mdsc_do_request
- the bug came from "ceph: use pagelist to present MDS request data". I force updated the testing branch, please test it.
- 05:04 AM CephFS Bug #9563 (Resolved): kcephfs crash in ceph_mdsc_do_request
- From serial console:...
- 07:50 PM Bug #9571 (Resolved): rocksdb testing with powercycling fails on trusty
- This is when osd_objectstore is using rocksdb,...
- 07:19 PM Bug #9503 (Fix Under Review): Dumpling: removing many snapshots in a short time makes OSDs go ber...
- 07:32 AM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
- OK, that seems to have done it. After installing the updated autobuild with Dan's patch and keeping the snap_trim lim...
- 07:18 PM Bug #9502 (Pending Backport): mon: does not verify disk is not full on startup
- 07:16 PM Bug #9455 (Resolved): mon: audit log read events should be debug level
- 03:57 PM devops Feature #9050: Calamari builds for ceph.com
- Yes, we need a ceph.com/<something>/calamari repo which contains the various packages.
What needs some discussion...
- 03:55 PM Bug #9570 (Rejected): osd crash in FileJournal::WriteFinisher::entry() aio
- h3. Workaround
Try with a kernel newer than 3.13 - as new as the environment allows.
h3. Collect more informati...
- 02:18 PM Feature #9420: erasure-code: tools and archive to check for non regression of encoding
- * Created the repository https://github.com/ceph/ceph-erasure-code-corpus
* Asked Sandon if having such a reposito...
- 01:37 PM Feature #9568 (Resolved): Add test case to test #9419 (ceph wip-9419)
- 12:03 PM Bug #9538 (Resolved): mon crashes on some --format=plain commands
- 10:32 AM devops Bug #9567: Missing packages in upgrade:dumpling-firefly-x-giant-distro-basic-vps run
- and http://pulpito.front.sepia.ceph.com/teuthology-2014-09-21_19:25:01-upgrade:dumpling-firefly-x-giant-distro-basic-...
- 09:10 AM devops Bug #9567 (Rejected): Missing packages in upgrade:dumpling-firefly-x-giant-distro-basic-vps run
- In run http://pulpito.front.sepia.ceph.com/teuthology-2014-09-21_19:25:01-upgrade:dumpling-firefly-x-giant-distro-bas...
- 10:13 AM Feature #9343 (Fix Under Review): erasure-code: allow upgrades for lrc and isa plugins
- Rebased the pull request against giant https://github.com/ceph/ceph/pull/2551
- 07:45 AM Feature #9343: erasure-code: allow upgrades for lrc and isa plugins
- The "logs of the failed test":http://qa-proxy.ceph.com/teuthology/ubuntu-2014-09-19_04:50:17-rados:monthrash-wip-9343...
- 07:33 AM Feature #9343: erasure-code: allow upgrades for lrc and isa plugins
- The "monthrash against giant":http://pulpito.ceph.com/ubuntu-2014-09-20_00:35:01-rados:monthrash-giant-testing-basic-...
- 10:08 AM Bug #9482: osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)
- Samuel Just wrote:
> I'd need the corresponding logs from osd.5 to be sure, but I believe the problem is that osd.5,...
- 10:05 AM Bug #9515 (Duplicate): "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:...
- 10:04 AM Bug #9515: "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:parallel-gia...
- Seems to be related to http://tracker.ceph.com/issues/9508 and recently resolved
- 10:01 AM Bug #9515: "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:parallel-gia...
- The stack trace is:...
- 09:17 AM Bug #9515: "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:parallel-gia...
- Also seeing in suite:upgrade:dumpling-firefly-x
http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-21_19:25:01...
- 07:53 AM Bug #9515: "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:parallel-gia...
- Also shows in http://tracker.ceph.com/issues/9343#note-9
- 08:52 AM Feature #9161: Cache warmup and ejection
- I started to work on this.
Is there a chance it could go into Hammer release?
- 08:51 AM Feature #9161: Cache warmup and ejection
- I started to work on this.
Is there a chance it could go into Hammer release?
- 08:43 AM Fix #9566 (Need More Info): osd: prioritize recovery of OSDs with most work to do
- Assume 72 hours for host replacement/reprovisioning SLA. When host goes down (hardware failure), we expect complete...
- 07:36 AM devops Bug #9510: ceph-deploy: Move mon keyring generation 'mon create-initial'
- Would adding a separate command for keyring creation be better?
Would moving it to `create-initial` mean that it ...
- 05:12 AM devops Bug #9506 (Need More Info): Pass monitor SSH addresses via CLI flag
- Could you give me a use case? In what context something like this would happen, and at what point in the deployment p...
- 05:08 AM CephFS Bug #9564: kcephfs crash in _nfs4_do_open
- http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-19_23:10:01-knfs-giant-testing-basic-multi/500158/...
- 05:07 AM CephFS Bug #9564 (Resolved): kcephfs crash in _nfs4_do_open
- 04:47 AM CephFS Bug #9562: Lockdep assertion in Filer purge
- ...
- 04:46 AM CephFS Bug #9562 (Resolved): Lockdep assertion in Filer purge
- 04:08 AM Linux kernel client Bug #8979 (Resolved): GPF kernel panics - auth?
- Landed in 3.17-rc5. Opened #9560 and #9561 for the issues mentioned above.
- 04:04 AM Linux kernel client Bug #9561 (Rejected): libceph: do not crash if auth reply is not understood
- 04:02 AM Linux kernel client Bug #9560 (Rejected): libceph: msg kmalloc failure handling on the reply path
- 02:55 AM Bug #9077: Cluster is up in MON node even if Ceph is uninstalled in OSD node
- Issue reproduced, find the following info
Attaching mon and dmesg log of monitor node
Executed following comman...
- 12:51 AM rbd Bug #8000: SLAB: Unable to allocate memory on node 0
- RAM frequency, interesting. Something to keep in mind..
09/21/2014
- 11:56 PM Bug #9559: ?off-by-one vulnerability?ceph-0.80.5/src/common/fd.cc dump_open_fds() function
- ceph-0.80.5/src/common/fd.cc dump_open_fds() function allows attackers to cause buffer overflow via vectors related t...
- 11:47 PM Bug #9559 (Resolved): ?off-by-one vulnerability?ceph-0.80.5/src/common/fd.cc dump_open_fds() func...
- ceph-0.80.5/src/common/fd.cc dump_open_fds() function allows attackers to cause buffer overflow via vectors related...
- 11:28 PM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
- Please ignore the previous update. Here is the correct one:
While some osds were in nearfull situation, shutdown ...
- 10:27 PM Bug #8863: osd: second reservation rejection -> crash
- Yes Sage it is included.
commit 2b13de16c522754e30a0a55fb9d072082dac455e
Author: Sage Weil <sage@redhat.com>
Dat...
- 10:24 PM Bug #9558 (Can't reproduce): Both op threads and dispatcher threads get hung even for few minutes...
- During peering stage, op threads will handle peering event and check the missing objects in this function: bool PG::M...
- 09:56 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
- Thanks a lot! I'll report back once there is an update to share.
- 07:26 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
- That log shows me PGs with huge snap_trimq, which is very unfriendly to the snap trimmer. I've added Dan's patch on t...
- 12:51 PM Bug #9503 (Need More Info): Dumpling: removing many snapshots in a short time makes OSDs go berserk
- 07:16 PM Bug #8752: firefly: scrub/repair stat mismatch
- Sage Weil wrote:
> Is it possible the inconsistencies are correlated with the kernel (vs userspace) client? That wo...
- 07:03 PM Bug #8752: firefly: scrub/repair stat mismatch
- Dmitry Smirnov wrote:
> On 0.80.5 inconsistencies disappear from pool 20 (CephFS caching pool) although I also stopp...
- 06:12 PM Bug #8752: firefly: scrub/repair stat mismatch
- On 0.80.5 inconsistencies disappear from pool 20 (CephFS caching pool) although I also stopped using kernel FS client...
- 07:11 PM rbd Bug #8000 (Closed): SLAB: Unable to allocate memory on node 0
- No particular access pattern seems to provoke this issue and frankly I have no clue what's causing it apart from "dee...
- 04:26 PM CephFS Feature #9557 (Resolved): mds: verify backtrace on fetch_dir
- Verify that the backtrace is valid when we finish fetch_dir. That is, that we would have been able to locate the dir...
- 04:13 PM Bug #9285 (Fix Under Review): osd: promoted object can get evicted before promotion completes
- 04:03 PM Bug #8629 (Fix Under Review): cache_evict needs to prevent make_writeable from creating a snapdir
- https://github.com/ceph/ceph/pull/2550
- 02:25 PM Bug #9556 (Duplicate): Segmentation fault in upgrade:dumpling-firefly-x-giant-distro-basic-multi ...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-21_10:14:47-upgrade:dumpling-firefly-x-giant-distr...
- 01:42 PM Bug #9545 (Fix Under Review): filestore stuck in journal->should_commit_now() loop on shutdown
- https://github.com/ceph/ceph/pull/2549
- 12:50 PM Bug #9389: ec pg stuck peering, did not send query for one shard
- At least on that one, looks like do_queries doesn't send the query. That can happen if the osd is down as of the osd...
- 12:41 PM Bug #9389: ec pg stuck peering, did not send query for one shard
- /a/samuelj-2014-09-20_19:00:23-rados-wip-sam-testing-firefly2-wip-testing-old-vanilla-basic-multi/501557
probably ...
- 11:06 AM Bug #9555 (Resolved): msg/Pipe.cc: 1513: FAILED assert(0 == "old msgs despite reconnect_seq featu...
- firefly
/a/samuelj-2014-09-20_19:00:23-rados-wip-sam-testing-firefly2-wip-testing-old-vanilla-basic-multi/501749/r...
- 10:08 AM Bug #9293: _collection_move_rename EEXIST
- 05:44 AM Bug #9547: python rados aio_read truncates returned buffer on \000
- firefly backport https://github.com/ceph/ceph/pull/2548
- 03:18 AM Bug #9547: python rados aio_read truncates returned buffer on \000
- The example from the description was not right but fixing it to have the expected length does not change the result o...
- 03:16 AM Bug #9547 (Pending Backport): python rados aio_read truncates returned buffer on \000
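Note on #9547: the truncation-on-\000 symptom can be illustrated outside Ceph with plain @ctypes@. The sketch below is only an illustration of the general pitfall, not the code from rados.py or from the pull requests listed above; the payload and variable names are made up for the example. Reading a ctypes character buffer through @.value@ stops at the first NUL byte, while @.raw@ or @ctypes.string_at()@ with an explicit length returns every byte.

```python
import ctypes

# Hypothetical payload containing an embedded NUL byte.
payload = b"abc\x00def"

# Fill a C character buffer with the payload, as a C read call would.
buf = ctypes.create_string_buffer(len(payload))
ctypes.memmove(buf, payload, len(payload))

# .value treats the buffer as a NUL-terminated C string and stops early.
truncated = buf.value

# .raw and string_at(ptr, length) return the full byte range.
full = buf.raw
also_full = ctypes.string_at(buf, len(payload))

print(len(truncated), len(full), len(also_full))
```

Whether the binding hit exactly this @.value@ behaviour is an assumption; the linked pull requests are the authoritative description of the fix.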
09/20/2014
- 06:46 PM Bug #9554 (Can't reproduce): "FAILED assert(0 == "hit suicide timeout")" in upgrade:firefly-firef...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-20_15:08:15-upgrade:firefly-firefly-testing-basic-...
- 05:53 PM devops Bug #9460: mira004, mira036. mira017 unresponsive
- mira004 is bad again - 2014-09-20T17:31:32.251 INFO:teuthology.provision:Downburst completed on ubuntu@vpm024.front.s...
- 03:29 PM Linux kernel client Bug #9432: kcephfs: null pointer deref in posix_acl_create
- 03:04 PM Bug #9551 (Duplicate): "Segmentation fault" in upgrade:firefly-firefly-testing-basic-vps run
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-20_13:44:11-upgrade:firefly-firefly-testing-basic-...
- 06:32 AM devops Bug #9548 (Rejected): ceph mon creation failed for centOS
- Trying to deploy Ceph on CentOS, but every time I execute the command below I get a failed response.
[ceph@ceph-...
- 04:15 AM Bug #9547 (Fix Under Review): python rados aio_read truncates returned buffer on \000
- 04:15 AM Bug #9547: python rados aio_read truncates returned buffer on \000
- running wip-9547-python-rados-truncate from https://github.com/ceph/ceph/pull/2545 on http://ceph.com/gitbuilder.cgi
- 03:44 AM Bug #9547: python rados aio_read truncates returned buffer on \000
- "need firefly backport":https://github.com/ceph/ceph/blob/firefly/src/pybind/rados.py#L1093
- 03:40 AM Bug #9547: python rados aio_read truncates returned buffer on \000
- Proposed fix https://github.com/ceph/ceph/pull/2544
- 03:36 AM Bug #9547 (Resolved): python rados aio_read truncates returned buffer on \000
- ...
- 02:16 AM Bug #9535 (Duplicate): monitor crashed after restarting
- 02:14 AM Bug #9455 (Fix Under Review): mon: audit log read events should be debug level
- https://github.com/ceph/ceph/pull/2538
- 02:14 AM Bug #9502 (Fix Under Review): mon: does not verify disk is not full on startup
- https://github.com/ceph/ceph/pull/2538
- 12:37 AM Feature #9343: erasure-code: allow upgrades for lrc and isa plugins
- What "was supposed to be the baseline":http://pulpito.ceph.com/sage-2014-09-18_17:42:51-rados:monthrash-wip-9301-dist...
09/19/2014
- 09:02 PM CephFS Bug #9178 (Resolved): samba: ENOTEMPTY on "rm -rf"
- 08:58 PM Bug #9546 (Rejected): LibRadosWatchNotify.WatchNotifyTest failure
- ...
- 05:32 PM Bug #9419: dumpling->firefly upgrade, sending setallochint?
- 05:18 PM Bug #9545: filestore stuck in journal->should_commit_now() loop on shutdown
- ...
- 05:18 PM Bug #9545: filestore stuck in journal->should_commit_now() loop on shutdown
- sync_entry is looping on the same seq while the main thread waits for umount. journal should_commit_now() is stuck r...
- 05:18 PM Bug #9545 (Resolved): filestore stuck in journal->should_commit_now() loop on shutdown
- 04:45 PM Bug #9390: EEXIST on split due to import/export
- Not precisely sure how to approach this. We can make the OSD robust to this situation or we can adjust the test to a...
- 04:44 PM Bug #9390: EEXIST on split due to import/export
- Tricky. I think that we saw the following sequence:
stop osd N
export pg X at epoch e
split pg X at epoch e+3
...
- 04:43 PM Bug #8011 (Can't reproduce): osd/ReplicatedPG.cc: 5244: FAILED assert(soid < scrubber.start || so...
- Pinged Dmitry to see if he is still seeing this or has a log
- 04:35 PM Bug #9384 (Resolved): OSD is crashing while io is running and querying withadmin socket
- 04:13 PM Bug #9502: mon: does not verify disk is not full on startup
- 04:03 PM Bug #9544: osd: pg deletion vs create race leads to EEXIST on mkcoll (dumpling)
- wip-sharedptr-registry-backport
- 03:39 PM Bug #9544 (Resolved): osd: pg deletion vs create race leads to EEXIST on mkcoll (dumpling)
- ...
- 03:42 PM Bug #7120 (Duplicate): osd: EEXIST on mkcoll on dumpling
- 03:34 PM Bug #7120: osd: EEXIST on mkcoll on dumpling
- /a/sage-2014-09-18_22:33:58-rados-dumpling-distro-basic-multi/496304/remote
- 03:34 PM CephFS Bug #9539 (Resolved): struct PurgeRange in Filer.cc needs lock to protect
- 06:32 AM CephFS Bug #9539 (Resolved): struct PurgeRange in Filer.cc needs lock to protect
- send two requests to delete 1000026dfe3.00000067, but no request to 1000026dfe3.00000068...
- 02:50 PM rgw Bug #9543 (Rejected): AssertionError(s) in upgrade:dumpling-dumpling-distro-basic-vps run
- All in http://pulpito.front.sepia.ceph.com/teuthology-2014-09-19_11:48:54-upgrade:dumpling-dumpling-distro-basic-vps/...
- 02:47 PM CephFS Bug #8576: teuthology: nfs tests failing on umount
- Been playing around with this some.
- 02:47 PM CephFS Bug #9177 (Resolved): ceph-fuse: failing MPI mdtest runs
- John fixed this by updating mdtest in ceph-qa-suite as of commit:b1365a80982dba4160e861c28d887b066ca451b6.
- 02:27 PM Bug #9301 (Pending Backport): paxos: off by one w/ versions in forming quorum
- 12:42 PM Bug #9537: OSD crash after writing 10GB file onto EC Pool: FAILED assert(hinfo.get_total_chunk_si...
- Please note that while this ought to work in the technical sense, you are unlikely to be happy with RADOS if you make...
- 11:51 AM Bug #9537: OSD crash after writing 10GB file onto EC Pool: FAILED assert(hinfo.get_total_chunk_si...
- OSD log of the primary OSD which crashed
- 07:17 AM Bug #9537 (Fix Under Review): OSD crash after writing 10GB file onto EC Pool: FAILED assert(hinfo...
- https://github.com/ceph/ceph/pull/2534 to be confirmed by the OSD logs
- 07:05 AM Bug #9537 (Need More Info): OSD crash after writing 10GB file onto EC Pool: FAILED assert(hinfo.g...
- Could you please attach the last 20,000 (twenty thousand) lines of the logs of the crashed primary OSD ?
- 03:52 AM Bug #9537: OSD crash after writing 10GB file onto EC Pool: FAILED assert(hinfo.get_total_chunk_si...
- Config:
OSD nodes: 3
Monitor nodes: 2
Number of OSD's: 24
This is observed on 0.84 and is consistently getting ...
- 03:33 AM Bug #9537 (Resolved): OSD crash after writing 10GB file onto EC Pool: FAILED assert(hinfo.get_tot...
- On freshly created cluster, created an EC pool with default ec profile.
Wrote 5MB of object file using rados put...
- 12:16 PM CephFS Feature #9284 (Resolved): mds: warn when clients are not responding to cache pressure
- Merged in giant...
- 12:03 PM rbd Feature #6228: image name metavariable
- Glad to see this feature added. Thank you! Mark, Adam, Sage, and Loic!
Assuming it wouldn't be too difficult, coul...
- 09:47 AM rbd Feature #6228 (Pending Backport): image name metavariable
- 02:01 AM rbd Feature #6228 (Fix Under Review): image name metavariable
- 11:04 AM Bug #9482: osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)
- I'd need the corresponding logs from osd.5 to be sure, but I believe the problem is that osd.5, due to 9497 and this ...
- 10:51 AM Bug #9482: osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)
- wip-sam-testing-firefly
- 10:51 AM Bug #9497: choose_acting has to let the pg be down any time acting < min_size even if there are b...
- wip-sam-testing-firefly
- 10:50 AM Bug #9481: osd/PGLog.h: 87: FAILED assert(rollback_info_trimmed_to == head)
- wip-sam-testing-firefly
- 10:50 AM Bug #9326 (Pending Backport): osd crash in upgrade:dumpling-firefly-x-master-distro-basic-vps suite
- 10:50 AM Bug #9240: osd_max_backfills = 1 can cause reserver deadlock for EC
- wip-sam-testing-firefly
- 10:50 AM Bug #9293: _collection_move_rename EEXIST
- wip-sam-testing-firefly
- 10:49 AM Bug #9179: unfound objects, recovery timeout
- wip-sam-testing-firefly
- 10:49 AM Bug #8777: osd/PGLog.h: 88: FAILED assert(rollback_info_trimmed_to_riter == log.rbegin())
- wip-sam-testing-firefly
- 10:49 AM Bug #9054: ceph_test_rados: FAILED assert(!old_value.deleted())
- wip-sam-testing-firefly
- 10:49 AM Bug #9339: ReplicatedPG crash in hitset_create
- wip-sam-testing-firefly
- 09:23 AM Documentation #9542 (Won't Fix): Error link:"Ceph Object Gateway"->"Manual Install"
- In the page "Install Ceph Object Gateway"(http://ceph.com/docs/master/install/install-ceph-gateway/index.html), the "...
- 09:05 AM Documentation #8995 (Resolved): Preflight Checklist Clarifications
- http://ceph.com/docs/master/start/ Preflight was revamped significantly to address the comments and anticipate others.
- 09:05 AM CephFS Bug #9540 (Rejected): Crash during FS upgrade: assert(o->get_num_ref() == 0)
- Never mind, seems like this was just another manifestation of the original segment reference bug -- giant HEAD is OK.
- 06:37 AM CephFS Bug #9540: Crash during FS upgrade: assert(o->get_num_ref() == 0)
- The crash hits at the last ceph.restart (after upgrade from firefly to 83bd3430e3a17b77265e696095904b7a9032d2ee).
...
- 06:33 AM CephFS Bug #9540 (Rejected): Crash during FS upgrade: assert(o->get_num_ref() == 0)
- ...
- 08:08 AM Bug #8863: osd: second reservation rejection -> crash
- does your build include commit:2b13de16c522754e30a0a55fb9d072082dac455e ?
- 07:44 AM RADOS Bug #9492 (Fix Under Review): Crush Mapper crashes when number of replicas is less than total num...
- https://github.com/ceph/ceph/pull/2528
- 07:35 AM Bug #9470 (Pending Backport): daemon pid file is not being created when running service ceph
- firefly backport : https://github.com/ceph/ceph/pull/2535
- 06:58 AM Linux kernel client Bug #9533 (Duplicate): kcephfs: fail to send requests initiated during mds restart
- this was an old bug, patch was missing from running kernel.
ceph: fix kick_requests()
- 06:56 AM Bug #9362: librados, rados_read corrupts memory on timeout
- I did another test today using the build from http://gitbuilder.ceph.com/ceph-deb-wheezy-x86_64-basic/ref/wip-dumplin...
- 06:36 AM Bug #9538: mon crashes on some --format=plain commands
- Checked all other uses of new_formatter allocated pointer in OSDMonitor
- 06:31 AM Bug #9538 (Fix Under Review): mon crashes on some --format=plain commands
- https://github.com/ceph/ceph/pull/2533
- 06:11 AM Bug #9538: mon crashes on some --format=plain commands
- 05:28 AM Bug #9538 (Resolved): mon crashes on some --format=plain commands
- Mentioned by bens on IRC, creating ticket in case we forget:...
- 06:33 AM rgw Feature #8911: RGW doesn't return 'x-timestamp' in header which is used by 'View Details' of Open...
- Hello Luis, et al..
I have a customer who's requesting status for this Feature.. They view it as a bug since it c...
- 05:17 AM Bug #9536 (Fix Under Review): erasure-code: ISA plugin alignment must be constant
- * giant backport https://github.com/ceph/ceph/pull/2531
- 05:07 AM Bug #9536 (In Progress): erasure-code: ISA plugin alignment must be constant
- 02:57 AM Bug #9536 (Resolved): erasure-code: ISA plugin alignment must be constant
- commit:28c2b6e4f2bc6d77b9150fcf9a917d85c69c9ed1
"EC_ISA_VECTOR_OP_WORDSIZE":https://github.com/ceph/ceph/blob/ma...
- 04:52 AM Feature #9343: erasure-code: allow upgrades for lrc and isa plugins
- scheduled "a monthrash":http://pulpito.ceph.com/ubuntu-2014-09-19_04:50:17-rados:monthrash-wip-9343-erasure-code-feat...
- 04:43 AM Fix #8914 (Resolved): osd crashed at assert ReplicatedBackend::build_push_op
- 02:59 AM Bug #9485: Monitor crash due to wrong crush rule set
- Hi loic:
log, "ceph osd tree" output and crush map added.
log:
0> 2014-09-19 09:43:08.462737 7f92d9674700 -...
- 01:58 AM Bug #9485: Monitor crash due to wrong crush rule set
- Hi,
It should not crash, it should give you an error of some kind maybe. Could you please attach to this ticket a ...
- 02:11 AM Bug #9408: erasure-code: misalignment
- Running under the branch wip-9408-buffer-alignment in http://ceph.com/gitbuilder.cgi
- 01:55 AM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
- It reproed. PFA logs attached.
here is the snippet:
2014-09-19 10:27:02.228364 7f86d73a2700 0 log_channel(de...
- 01:41 AM Bug #9535 (Duplicate): monitor crashed after restarting
- Recently, when I restarted my ceph cluster, the monitor crashed. Below is the output of the monitor log:
2014-09-19 ...
- 01:39 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- Hi Sage, Thanks for the quick patch. I tried wip-9487-dumpling on our test cluster and now there is no snap trimming ...
09/18/2014
- 09:43 PM Linux kernel client Bug #9533 (Duplicate): kcephfs: fail to send requests initiated during mds restart
- mds sees...
- 09:37 PM Bug #9202: Performance degradation during recovering and backfilling
- New ticket here - http://tracker.ceph.com/issues/9523
- 01:53 AM Bug #9202: Performance degradation during recovering and backfilling
- Hi Samuel,
Thanks for the short-term fix by tuning those 2 parameters of backfill scan. With tuning other backfill/...
- 09:16 PM Bug #9481: osd/PGLog.h: 87: FAILED assert(rollback_info_trimmed_to == head)
- Ceph cluster with 8 OSD nodes, each having 64 OSDs; a few OSDs were crashing with this assert. As one node had timestamp...
- 11:01 AM Bug #9481 (Pending Backport): osd/PGLog.h: 87: FAILED assert(rollback_info_trimmed_to == head)
- 09:44 AM Bug #9481 (Fix Under Review): osd/PGLog.h: 87: FAILED assert(rollback_info_trimmed_to == head)
- 08:42 PM Bug #9482: osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)
- Captured debug log with wip-log-crash-firefly branch and attached.
- 01:18 PM Bug #9482: osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)
- sjust believes I may have hit the same bug, running 0.80.5. Attached is the log from an OSD with settings:...
- 12:55 PM Bug #9482 (Pending Backport): osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head...
- 09:44 AM Bug #9482 (Fix Under Review): osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head...
- Can't set info.last_epoch_started there, going to just use history.last_epoch_started as lower bound.
- 07:16 PM Bug #9485: Monitor crash due to wrong crush rule set
- Hi, loic.
Currently I'm running some tests on my dev environment; after the tests are finished, I will reproduce i...
- 06:42 PM CephFS Feature #9189 (Resolved): Expose client identifying metadata to MDS, e.g. hostname
- 06:33 PM Feature #9532 (Duplicate): rados.py should export omap interface
- IWBN to be able to manipulate omap values with Python
- 06:03 PM Feature #8188 (In Progress): librados: interface to inspect pool properties
- 05:09 PM rgw Bug #9529 (Resolved): ./common/ceph_crypto.h: 83: FAILED assert(s == SECSuccess)
- ...
- 05:01 PM Bug #9487 (Fix Under Review): dumpling: snaptrimmer causes slow requests while backfilling. osd_s...
- wip-9487
wip-9487-dumpling for backport
- 03:23 PM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- Nevermind, I've reproduced it!
- 03:21 PM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- Thanks Sage. There's a log with debug_osd=20 attached to this issue. I'll try tomorrow to get one with debug_ms=1 too.
- 03:08 PM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- Okay, I can't seem to reproduce this.
Dan or Florian, can you attach a log? What I need is debug ms = 1 and debug...
- 02:45 PM Bug #9487 (In Progress): dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_t...
- Dan van der Ster wrote:
> I also noticed that before the snap trimmer starts, purge_snaps is [] for 5.318. Is that n...
- 02:52 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- Please comment on https://github.com/ceph/ceph/pull/2516.
Thanks!
- 04:17 PM Bug #9528: RadosModel assertion failure in firefly
- also this one,
log: http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-11_23:20:03-multi-version-master-testi...
- 04:14 PM Bug #9528 (Duplicate): RadosModel assertion failure in firefly
- This is basically firefly client running against the dumpling cluster.
logs: http://qa-proxy.ceph.com/teuthology/t...
- 03:43 PM Bug #9517 (Resolved): Errors in test_rbd.* tests in upgrade:dumpling-giant-x:parallel-giant-distr...
- This was due to ceph-qa-suite updates not being on the giant branch.
- 03:29 PM rbd Feature #6228: image name metavariable
- Yeah, that is probably a good idea anyway.. we've had uniqueness issues like this before! That is an easy thing and ...
- 03:21 PM rbd Feature #6228: image name metavariable
- It's not perfect, but we could add a cctid variable so users could specify something like "admin socket = /var/run/ce...
- 03:11 PM rbd Feature #6228: image name metavariable
- This assumes that each image has its own cct, but a process could have multiple images open in one cct. (In fact, co...
- 02:07 PM rbd Feature #6228 (In Progress): image name metavariable
- 03:01 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
- Sage, a log is at https://www.dropbox.com/s/f2xyx12y2zr7fid/ceph-osd.14.log.xz -- behold the awesomeness of xz; that ...
- 02:49 PM Bug #9503 (Duplicate): Dumpling: removing many snapshots in a short time makes OSDs go berserk
- Florian Haas wrote:
> Sage, I do have logs (@debug osd=20@, though not @debug ms=1@), but after the discussion with ...
- 02:46 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
- Sage, I do have logs (@debug osd=20@, though not @debug ms=1@), but after the discussion with Dan on the -devel list,...
- 02:35 PM Bug #9503 (Need More Info): Dumpling: removing many snapshots in a short time makes OSDs go berserk
- Hi Florian-
Can you generate some OSD logs (debug ms = 1, debug osd = 20) and attach them to the bug? The message...
- 02:23 PM Bug #9301: paxos: off by one w/ versions in forming quorum
- 02:07 PM rbd Bug #5768: rbd-fuse: leak in enumerate_images()
- 01:49 PM rbd Bug #5768 (In Progress): rbd-fuse: leak in enumerate_images()
- 02:05 PM Bug #9462: msgr deadlock: osd reply vs mark_down vs fault
- 12:55 PM Bug #9453 (Resolved): ceph_objectstore_tool incorrect log tail output for --op log
- 09:45 AM Bug #9453 (Fix Under Review): ceph_objectstore_tool incorrect log tail output for --op log
- 11:51 AM rbd Feature #7746 (In Progress): Capacity Management: rbd df
- see wip-7746
- 11:46 AM Feature #9526 (Resolved): mon: 'osd crush rename-bucket <old> <new>'
- 11:42 AM Bug #9497 (Pending Backport): choose_acting has to let the pg be down any time acting < min_size ...
- 09:43 AM Bug #9497 (Fix Under Review): choose_acting has to let the pg be down any time acting < min_size ...
- 11:26 AM rgw Bug #9525 (Duplicate): Deleted object shows in object listing
- What appears to happen is that a request to delete an object comes in while the cluster is in a terrible state perf...
- 11:16 AM rgw Bug #9169: 100-continue broken for centos/rhel
- Similar issue in suite:upgrade:firefly
http://pulpito.front.sepia.ceph.com/teuthology-2014-09-17_19:00:01-upgrade:...
- 11:04 AM rgw Bug #9479: ETag is not included in the XML response to put object copy operation
- This is under v0.67.10
- 11:04 AM rgw Bug #9478: Incorrect content type in response header
- This is under v0.67.10
- 11:00 AM Bug #8315 (Pending Backport): osd: watch callback vs callback funky
- 09:40 AM Bug #8315 (Fix Under Review): osd: watch callback vs callback funky
- 09:40 AM rbd Bug #6926 (In Progress): rbd: diff output includes previously non-existent objects as zeroed extents
- 09:40 AM Bug #9326 (Fix Under Review): osd crash in upgrade:dumpling-firefly-x-master-distro-basic-vps suite
- 09:37 AM Feature #7767 (Resolved): messenger:buffer reads
- 07:58 AM CephFS Feature #9437 (In Progress): make 'ceph tell mds.* ...' work, deprecate 'ceph mds tell * ...'
- 06:16 AM CephFS Feature #9477: Handle kclient shutdown with dead network more gracefully
- In the general case (e.g. root filesystem is cephfs) there's nothing we can do: the system can't shut down until the ...
- 05:56 AM CephFS Bug #9518 (Resolved): client metadata get lost after mds restart
- ...
- 02:30 AM CephFS Bug #9518 (Fix Under Review): client metadata get lost after mds restart
- Well, I also shouldn't have missed it while writing the code :-)
https://github.com/ceph/ceph/pull/2515
- 04:01 AM RADOS Bug #9523 (Closed): Both op threads and dispatcher threads could be stuck at acquiring the budget...
- When OSD is rejoining and peering, we still see some slow requests and performance degradation in about 5 to 10 min...
- 02:08 AM Bug #8863: osd: second reservation rejection -> crash
- Two osds were down and out due to that crash, I was not able to start those osds again. So removed those osds and add...
- 01:13 AM Linux kernel client Bug #9507: calling llistxattr(2) on a symlink crashes the client
- ...
09/17/2014
- 11:59 PM CephFS Bug #9504 (Duplicate): failed to decode message of type 24 v2: buffer::end_of_buffer
- looks like this is duplicate of #9458
- 08:23 AM CephFS Bug #9504 (Duplicate): failed to decode message of type 24 v2: buffer::end_of_buffer
- root@burnupi21:~# less /var/log/upstart/ceph-mds-ceph_burnupi21.log
...
- 11:57 PM Linux kernel client Bug #9458: client wrongly fenced
- is the client using 3.16 kernel? possibly due to missing following commit...
- 02:45 PM Linux kernel client Bug #9458: client wrongly fenced
- The kernel client is definitely doing something wrong here, but I don't know what — the userspace messenger is not in...
- 02:38 PM Linux kernel client Bug #9458: client wrongly fenced
- The MDS went into reconnect at 4:59:50...
- 11:09 AM Linux kernel client Bug #9458: client wrongly fenced
- Taking a look; luckily we have at least *some* of the logging...
- 08:17 AM Linux kernel client Bug #9458: client wrongly fenced
- mds restarted and teuthology failed to reconnect again, 07:30:34.485721
- 07:18 AM Linux kernel client Bug #9458: client wrongly fenced
- teuthology was fenced again. not sure it was during a mds restart this time, either. notably the monitors went offl...
- 10:52 PM Bug #8863: osd: second reservation rejection -> crash
- I also got the above crash when a few osds were in nearfull situation.
Snippet of logs:
2014-09-17 17:29:41.69...
- 08:46 PM CephFS Bug #9518: client metadata get lost after mds restart
- Dur, shouldn't have missed that in review. :(
- 07:44 PM CephFS Bug #9518 (Resolved): client metadata get lost after mds restart
- 04:25 PM Bug #9517 (Resolved): Errors in test_rbd.* tests in upgrade:dumpling-giant-x:parallel-giant-distr...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-17_13:53:14-upgrade:dumpling-giant-x:parallel-gian...
- 03:31 PM Bug #9452 (Resolved): All tests failed in upgrade:dumpling-giant-x:parallel-master-distro-basic-m...
- Looks like we passed those issues
#9515 might be related
- 03:29 PM Bug #9515 (Duplicate): "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-17_13:53:14-upgrade:dumpling-giant-x:parallel-gian...
- 03:05 PM CephFS Bug #9514 (Resolved): ceph-fuse pjd test is failing in giant nightlies
- commit:0ea20a668cf859881c49b33d1b6db4e636eda18a
http://qa-proxy.ceph.com/teuthology/sage-2014-09-14_18:23:49-smoke...
- 02:43 PM rbd Bug #9513 (Resolved): rbd_cache=true default setting is degrading librbd performance ~10X in Giant
- We are experiencing severe librbd performance degradation in Giant over firefly release. Here is the experiment we di...
- 02:33 PM Bug #8885: SIGABRT in TrackedOp::dump() via dump_ops_in_flight()
- It's the same issue as #9384
Here is the pull request for the same.
https://github.com/ceph/ceph/pull/2440
- 01:17 PM Bug #9508 (Resolved): objecter: segv on timeout/cancel (LibRadosIo ReadTimeout)
- commit:cef34f429972267061fc0e730ef976887ccb78a9
- 10:22 AM Bug #9508 (Fix Under Review): objecter: segv on timeout/cancel (LibRadosIo ReadTimeout)
- https://github.com/ceph/ceph/pull/2498
- 09:59 AM Bug #9508 (Resolved): objecter: segv on timeout/cancel (LibRadosIo ReadTimeout)
- ...
- 11:05 AM Bug #9509: init script cannot stop OSDs
- Yep, it needs to be backported to Firefly
- 11:01 AM Bug #9509 (Pending Backport): init script cannot stop OSDs
- See #9470. I guess the commit probably needs to be backported to firefly?
- 10:57 AM Bug #9509: init script cannot stop OSDs
- Let me redo the last sentence...
One user reported the issue on CentOS 7 and I managed to reproduce it. I assume i...
- 10:48 AM Bug #9509 (Resolved): init script cannot stop OSDs
- Running a @service ceph stop osd@ will not stop OSDs.
It seems the problem is that the OSDs are launched with the ...
- 11:04 AM devops Bug #9510 (Closed): ceph-deploy: Move mon keyring generation 'mon create-initial'
- Right now the monitor keyring is generated with 'ceph-deploy new', in cases where an admin wants to use a pre-existin...
- 10:23 AM Bug #9501: Assertion in FileJournal::do_write
- Don't worry, Sam says this is some kernel bug in btrfs, but he hasn't told the rest of us about it yet.
- 04:03 AM Bug #9501: Assertion in FileJournal::do_write
- Urgh, I have stupidly just killed that job before making a copy of the logs.
- 03:56 AM Bug #9501 (Rejected): Assertion in FileJournal::do_write
- ...
- 09:48 AM Linux kernel client Bug #9507 (Resolved): calling llistxattr(2) on a symlink crashes the client
- The code hits a "BUG();" line at https://github.com/ceph/ceph-client/blob/7e8a295295775ec9e05411cefc578ff4bfc94740/fs...
- 09:33 AM devops Bug #9506 (Rejected): Pass monitor SSH addresses via CLI flag
- In some network configurations it is desirable to have ceph-deploy access monitors from one network, and use another ...
- 08:51 AM Linux kernel client Bug #9505 (Duplicate): kcephfs: client gets stuck in reconnect loop?
- ...
- 08:37 AM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
- Added issue #9487 as *possibly* related.
- 07:52 AM Bug #9503 (Resolved): Dumpling: removing many snapshots in a short time makes OSDs go berserk
- Back in March, there was a report from Craig Lewis on the users list that mentioned several OSDs going to 100% CPU fo...
- 07:10 AM Bug #9502 (Resolved): mon: does not verify disk is not full on startup
- mira040...
- 06:09 AM rgw Feature #9359: rgw: Export user stats in get-user-info Adminops API
- Updated PR with a new commit to resolve Yehuda's comments. Please help to review it.
- 06:08 AM Bug #9490 (Rejected): crushtool crash if --num-rep is missing
- The root of the problem is #9492 : when --num-rep is missing it defaults to the range defined in the rule and does th...
- 05:58 AM Bug #9490 (In Progress): crushtool crash if --num-rep is missing
- 06:03 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- I also noticed that before the snap trimmer starts, purge_snaps is [] for 5.318. Is that normal, or should (the compl...
- 05:48 AM CephFS Feature #9189: Expose client identifying metadata to MDS, e.g. hostname
- Userspace part merged:...
- 05:47 AM CephFS Feature #9375 (Resolved): Send single 'many clients' health warning instead of N warnings for N c...
- ...
- 03:39 AM rgw Bug #9500 (Duplicate): 0.80.5 on CentOS 6.5: radosgw-admin fails to correctly name subuser object
- System info: Firefly (0.80.5 on CentOS 6.5). radosgw is configured and working fine with s3cmd.
Symptom: despite t...
- 01:23 AM Bug #8083 (In Progress): erasure-code: fix static code analysis errors found in gf-complete
- A number of fixes already are in gf-complete master and "added two":https://bitbucket.org/jimplank/gf-complete/pull-r...
09/16/2014
- 11:27 PM Bug #9488 (Rejected): Writing object onto EC pool created with customized ec profile getting hung
- k=1 m=1 is not supposed to work, it won't do anything useful. k=5 m=3 totals 8 osds and you only have 6 hence it blocks.
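The arithmetic behind this rejection can be sketched as follows. This is a minimal illustration only, not Ceph code: the function names are hypothetical, and it assumes each of the k data chunks and m coding chunks must land on a distinct OSD (or other failure domain).

```python
def chunks_required(k: int, m: int) -> int:
    """Total chunks an erasure-coded object is split into: k data + m coding."""
    return k + m


def can_place(k: int, m: int, available: int) -> bool:
    """True if there are enough distinct placement targets for every chunk.

    `available` is the number of OSDs (or failure domains) the rule can pick
    from; this helper is a hypothetical illustration, not a Ceph API.
    """
    return chunks_required(k, m) <= available


# The profile discussed in this ticket: k=5, m=3 on a cluster with 6 OSDs.
print(chunks_required(5, 3))   # 8
print(can_place(5, 3, 6))      # False: placement cannot be satisfied, so writes block
```

With only 6 OSDs, a profile such as k=4 m=2 would fit under the same assumption, while k=5 m=3 cannot.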
- 08:32 PM Bug #9488: Writing object onto EC pool created with customized ec profile getting hung
- Hi Loic,
I have 3 OSD hosts and a total of 6 OSDs.
ems@rack6-client-5:~$ sudo ceph osd crush rule dump
[
{...
- 08:36 AM Bug #9488 (Need More Info): Writing object onto EC pool created with customized ec profile gettin...
- It is the normal behavior when there are not enough hosts to satisfy the crush rules. Do you have 22 hosts available ...
- 05:14 AM Bug #9488: Writing object onto EC pool created with customized ec profile getting hung
- Attaching logs
- 05:09 AM Bug #9488: Writing object onto EC pool created with customized ec profile getting hung
- This issue is observed on ceph 0.84
- 05:07 AM Bug #9488 (Rejected): Writing object onto EC pool created with customized ec profile getting hung
- Writing an object to an EC pool created with a customized EC profile hangs.
But, writing object onto EC pool wit...
- 11:15 PM Bug #9219 (Resolved): lost_unfound test got ENOENT: i don't have pgid 1.e
- 05:38 PM Bug #9219: lost_unfound test got ENOENT: i don't have pgid 1.e
- Merged into giant by commit:782848af596fdb0be57daa68481b3976b7119141.
- 10:14 PM devops Bug #9499 (Can't reproduce): osds do not start after reboot (centos7, dm-crypt)
- most osds do not come up after reboot; only one does.
adding a 'sleep 10 ; ceph-disk activate-all' to /etc/rc.loca...
- 09:57 PM devops Bug #9498 (Resolved): el7 still using crappy el6 udev rules
- 08:33 PM Bug #9497: choose_acting has to let the pg be down any time acting < min_size even if there are b...
- 08:33 PM Bug #9497 (Resolved): choose_acting has to let the pg be down any time acting < min_size even if ...
- Otherwise, build_prior won't realize that the interval was maybe_went_rw.
- 07:05 PM RADOS Bug #9492: Crush Mapper crashes when number of replicas is less than total number of osds to be s...
- The issue is that the CRUSH temporary buffers (scratch array) are allocated according to the num_replica size configured by the ...
- 05:28 PM RADOS Bug #9492: Crush Mapper crashes when number of replicas is less than total number of osds to be s...
- Seg fault log:
CRUSH*** Caught signal (Segmentation fault) **
in thread 7f3dcb0007c0
ceph version 0.85-778-gb285...
- 12:37 PM RADOS Bug #9492 (Resolved): Crush Mapper crashes when number of replicas is less than total number of o...
- 1. ./crushtool --outfn crushmap --build --num_osds 100 host straw 4 rack straw 10 default straw 0
2. ./crushtool -d c...
- 06:36 PM Bug #9496 (Resolved): mon: pg scrub timestamps must be populated at pg creation
- logs: ubuntu@teuthology:/a/teuthology-2014-09-15_16:05:01-upgrade:firefly-giant-x:parallel-giant-distro-basic-multi/4...
- 05:37 PM CephFS Fix #9435 (Resolved): prevent use of cache pools as metadata or data pools
- Merged into giant branch in commit:eb1b2e0072bf605095f4104c2b6c2abfba216dbe
- 02:57 AM CephFS Fix #9435 (Fix Under Review): prevent use of cache pools as metadata or data pools
- https://github.com/ceph/ceph/pull/2507
- 03:46 PM Bug #9480: OSD is crashing while object deletion
- Created the following pull request for the fix.
https://github.com/ceph/ceph/pull/2510
- 02:50 PM rgw Feature #9493 (Resolved): Ability to disable keystone revocation polling when using UUID keystone...
- When using a UUID keystone provider revocation is handled by deleting the token from the persistence backend (ie. no ...
- 02:16 PM CephFS Feature #9466: kclient: Extend CephFSTestCase tests to cover kclient
- Got these passing at least once by hand using IPMI to work around #9477, suite scheduled:
http://pulpito.front.sep...
- 11:49 AM Documentation #8995: Preflight Checklist Clarifications
- Addressed clarifications and also added preflight material for other distributions.
- 11:47 AM Documentation #9475 (Resolved): Broken links on downloads page
- Resolved by Ross Turk.
- 11:36 AM Documentation #9491 (Closed): Radosgw docs incorrectly state to disable print continue on centos ...
- https://ceph.com/docs/master/radosgw/config/ states:
"On CentOS/RHEL distributions, turn off print continue. If yo...
- 11:31 AM Bug #9490 (Fix Under Review): crushtool crash if --num-rep is missing
- https://github.com/ceph/ceph/pull/2508
- 11:15 AM Bug #9490: crushtool crash if --num-rep is missing
- crash occurs when num-rep takes the value 1
- 11:00 AM Bug #9490 (Rejected): crushtool crash if --num-rep is missing
- ...
- 10:13 AM devops Bug #9489: --zap-disk does not clear enough
- it's worth noting that the OSD worked fine in the cluster after initial deployment, it's not until the node is reboot...
- 10:05 AM devops Bug #9489 (Rejected): --zap-disk does not clear enough
- Sometimes the partitions are resurrected
- 10:07 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- Andrei,
No, I haven't, but plan to try harder. I am however seeing an extreme slowdown, will open a ticket to tak...
- 02:49 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- Ilya,
I was wondering if you've managed to verify my findings? Has anyone else experienced similar behaviour?
...
- 10:00 AM devops Bug #5929 (Resolved): debian: python-ceph should depend on libcephfs1
- This was committed quite a long time ago during a bug scrub, I believe.
- 09:12 AM RADOS Fix #6109: pg <pgid> mark_unfound_lost fails if a completely-gone OSD still in map
- I'm having a similar issue, I have one unfound object that I can't delete. I'm also getting the "Error EINVAL: pg has...
- 08:46 AM Bug #9438: librados API generated doc broken
- I'd be happy to review: which pull request / branch is it?
- 08:38 AM Bug #9485 (Need More Info): Monitor crash due to wrong crush rule set
- Could you add the stack trace of the mon crash to the ticket ? I remember the discussion we had on the mailing list a...
- 03:29 AM Bug #9434: rbd rm hangs
- Loic Dachary wrote:
> Version 0.71 was a development version. Are you observing the same version on a stable release...
- 02:37 AM Bug #9304: pool create with invalid crush rule name succeeds
- "rebased against giant":https://github.com/ceph/ceph/pull/2506
- 02:33 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- Here is a bit more... I checked for "snap_trimmer entry" on other OSDs this morning. There were a few others, but all...
- 01:59 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- > I was able to isolate the cause of the backfilling to one single OSD
typo.. I was able to isolate the cause of ...
- 01:47 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- In case it wasn't clear, there is nothing special about osd.11. Each time I reweight 2 OSDs the slow requests are cau...
- 01:44 AM Bug #9487 (Resolved): dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim...
- Hi,
using dumpling 0.67.10...
We are doing quite a bit of backfilling these days in order to make room for some n...
09/15/2014
- 08:37 PM Bug #9485 (Resolved): Monitor crash due to wrong crush rule set
- I created a customized CRUSH rule for an EC pool
1 set take default
2 choose firstn 6 type rack
3 chooseleaf firstn ...
- 08:04 PM Feature #9222: annotate config options
- If we want to think in an internationalization direction, perhaps the right thing is to msg-catalog the help informat...
- 03:05 PM Feature #9222: annotate config options
- Yeah. I would also love to see min/max values for the numeric options.
- 02:56 PM Feature #9222: annotate config options
- If a fourth argument is set to a description string in config_opts.h, ceph.in could get access to it via a pybind/com...
- 05:39 PM Fix #9484: OSD: block until we have the same map as the client on pg commands
- Instead of blocking for *every* tell command (or even a subset), we can add one new command 'get_latest_osdmap' or si...
- 05:11 PM Fix #9484 (New): OSD: block until we have the same map as the client on pg commands
- Right now, if a client has a newer map than we do and sends a PG command (like list_missing, #9219) we can reply ENOE...
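The idea in this ticket (defer a PG command until the OSD's map is at least as new as the client's) can be sketched with a toy model. This is an illustration under stated assumptions, not the Ceph implementation: the class `ToyOSD`, its methods, and the string results are all hypothetical, and map epochs are reduced to plain integers.

```python
class ToyOSD:
    """Toy model of an OSD that waits for a new enough map before running a PG command."""

    def __init__(self, epoch: int) -> None:
        self.epoch = epoch      # newest osdmap epoch this OSD has seen
        self.waiting = []       # (client_epoch, command) pairs deferred for a newer map

    def handle_pg_command(self, client_epoch: int, command: str):
        # If the client has a newer map than we do, queue the command
        # instead of answering from stale state.
        if client_epoch > self.epoch:
            self.waiting.append((client_epoch, command))
            return None
        return f"ran {command} at epoch {self.epoch}"

    def advance_map(self, new_epoch: int):
        # Catch up to the newer map, then run every deferred command
        # whose required epoch has now been reached.
        self.epoch = max(self.epoch, new_epoch)
        ready = [c for e, c in self.waiting if e <= self.epoch]
        self.waiting = [(e, c) for e, c in self.waiting if e > self.epoch]
        return [f"ran {c} at epoch {self.epoch}" for c in ready]


osd = ToyOSD(epoch=18)
print(osd.handle_pg_command(19, "list_missing"))  # None: deferred until the map catches up
print(osd.advance_map(19))                        # ['ran list_missing at epoch 19']
```

The sketch only shows the ordering constraint; how the OSD actually obtains the newer map is outside its scope.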
- 05:14 PM Bug #9219 (Fix Under Review): lost_unfound test got ENOENT: i don't have pgid 1.e
- I created a few other tickets for the specific pg command issue, and created a PR so the OSD will subscribe to any os...
- 04:49 PM Bug #9219: lost_unfound test got ENOENT: i don't have pgid 1.e
- Okay, so at the time osdmap 19 was created, we had two of three OSDs running (osd.1 was down and out, and teuthology ...
- 05:08 PM Feature #9483 (Resolved): OSD: add a get_newest_map command to the admin socket
- This could be useful in testing and to "unstick" clusters in some odd situations we've seen before.
- 04:20 PM Bug #9482: osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)
- Yeah, pretty sure that's right, even if we only find backfill peers, we want to let those determine the min acceptabl...
- 04:06 PM Bug #9482: osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)
- Actually, I'm not sure that's right. Thinking.
- 04:02 PM Bug #9482 (Resolved): osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log....
- backfill peers are not setting info.last_epoch_started allowing subsequent primaries to erroneously conclude that it ...
- 03:39 PM Bug #9481 (Resolved): osd/PGLog.h: 87: FAILED assert(rollback_info_trimmed_to == head)
- Bug is PGLog::claim_log_and_clear_rollback_info sets rollback_info_trimmed_to before setting head.
- 03:27 PM Bug #9480: OSD is crashing while object deletion
- I have root-caused it; it seems to be happening because of one of my earlier changes :-( Here is the root cause.
1....
- 03:00 PM Bug #9480 (Resolved): OSD is crashing while object deletion
- Reproducible step:
1. Run a command something like this.
rados bench -p rbench 200 write -t 32 -b 1024
The O...
- 03:12 PM Bug #9109: ceph CLI: Help is missing -k keyring option
- Initial pull request:
https://github.com/ceph/ceph/pull/2483
Need to design a solution so that all clients can ...
- 03:05 PM Bug #9109: ceph CLI: Help is missing -k keyring option
- not a low hanging fruit after all, johnu will try another ;-)
- 12:51 PM Bug #9109: ceph CLI: Help is missing -k keyring option
- So, really, this applies to all the "Ceph global" options that the frontend doesn't have reason to do anything specia...
- 02:02 PM CephFS Bug #9444 (Resolved): "unmatched rstat" exception after firefly->master upgrade
- if mds_verify_scatter isn't enabled, the MDS will fix rstat mismatch atomically.
- 10:45 AM CephFS Bug #9444: "unmatched rstat" exception after firefly->master upgrade
- I think you're right, John. I'm not sure why we never saw this before though — Zheng, what changed that we're looking...
- 02:45 AM CephFS Bug #9444: "unmatched rstat" exception after firefly->master upgrade
- Is this actually fixed, in the case of filesystems created using old code? It seems like the patch prevents creating...
- 01:34 PM Bug #9452: All tests failed in upgrade:dumpling-giant-x:parallel-master-distro-basic-multi run
- The main source of these problems should be fixed by commit:cdb7675a21c9107e3596c90c2b1598def3c6899f
- 01:33 PM rbd Bug #6494: High memory consumption of qemu/librbd with enabled cache
- FTR the commits fixing this are commit:4fc9fffc494abedac0a9b1ce44706343f18466f1 and commit:cdb7675a21c9107e3596c90c2b...
- 01:04 PM CephFS Fix #9435: prevent use of cache pools as metadata or data pools
- First half here: https://github.com/ceph/ceph/tree/wip-9435 (no handling of tiering updates yet)
- 12:47 PM CephFS Fix #9435 (In Progress): prevent use of cache pools as metadata or data pools
- 12:39 PM rgw Bug #9479 (Resolved): ETag is not included in the XML response to put object copy operation
- User performs a put object copy operation, and the ETag is not included in the XML response.
- 12:37 PM rgw Bug #9478 (Resolved): Incorrect content type in response header
- User performs a put object copy operation, and seeing the content-type in the response header returned as "binary/oct...
- 12:32 PM CephFS Feature #9477: Handle kclient shutdown with dead network more gracefully
- Ah, this *only* happens if I have some dirty state from userspace at the time. In this instance it's my Mount.open...
- 11:59 AM CephFS Feature #9477 (Closed): Handle kclient shutdown with dead network more gracefully
- ...
- 10:44 AM Bug #9476 (Duplicate): "Segmentation fault (core dumped)" in upgrade:dumpling-giant-x:parallel-gi...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-14_15:05:01-upgrade:dumpling-giant-x:parallel-gian...
- 10:40 AM Linux kernel client Bug #4614 (Can't reproduce): Root cephfs does not mount at boot on Ubuntu 12.04
- 10:38 AM Documentation #9475 (Resolved): Broken links on downloads page
- The "View installation docs for..." links at the bottom of http://ceph.com/resources/downloads/ are broken, presumabl...
- 10:35 AM Bug #9470 (Resolved): daemon pid file is not being created when running service ceph
- This was fixed by commit:bccb0eb64891f65fd475e96b6386494044cae8c1, which will be in Giant.
- 05:01 AM Bug #9470 (Resolved): daemon pid file is not being created when running service ceph
- Hi,
We have been seeing some strange issues with the latest version(s) of ceph. I'm testing on 0.85 right now, an...
- 10:14 AM CephFS Bug #9423 (Resolved): failure in client_recovery task
- 10:14 AM CephFS Bug #9423: failure in client_recovery task
- Fix merged to giant....
- 08:07 AM CephFS Bug #9423: failure in client_recovery task
- 09:50 AM CephFS Feature #9466 (In Progress): kclient: Extend CephFSTestCase tests to cover kclient
- 03:43 AM CephFS Feature #9466: kclient: Extend CephFSTestCase tests to cover kclient
- kclient instrumentation to enable implementing KernelClient::get_global_id (mapping local mount to the ID we see on t...
- 03:38 AM CephFS Feature #9466 (Resolved): kclient: Extend CephFSTestCase tests to cover kclient
- Currently the mds_client_recovery and mds_client_limits tasks in ceph-qa-suite only run against the fuse client, be...
- 09:42 AM devops Feature #9474 (Resolved): unify init-radosgw versions'
- there is a sysv version and a regular version. keep these in sync.
even better would be to unify with init-ceph ....
- 08:33 AM Feature #9343: erasure-code: allow upgrades for lrc and isa plugins
- running "monthrash against master":http://pulpito.ceph.com/loic-2014-09-15_08:31:19-rados:monthrash-master-testing-ba...
- 08:33 AM Bug #9472 (Duplicate): osd crash in -upgrade:dumpling-dumpling-distro-basic-vps suite
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-14_17:00:01-upgrade:dumpling-dumpling-distro-basic...
- 08:21 AM devops Bug #9460: mira004, mira036. mira017 unresponsive
- For mira017 see : http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-14_17:00:01-upgrade:dumpling-dumpling-distro...
- 08:06 AM CephFS Bug #9177: ceph-fuse: failing MPI mdtest runs
- https://github.com/ceph/ceph-qa-suite/pull/140
- 08:04 AM CephFS Bug #9177: ceph-fuse: failing MPI mdtest runs
- 07:16 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
- I've now installed an ubuntu 14.04 but could still not make it fail.
Even valgrind has a perfectly clean output.
I ...
- 06:50 AM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
- Starting to look into this...
- 06:42 AM Bug #9408 (In Progress): erasure-code: misalignment
- Now I see it, thanks for your patience.
- 06:12 AM Bug #9408: erasure-code: misalignment
- Hi Loic, I think Janne Grunau is right. Memory alignment depends on bufferlist::c_str.
Using this patch:
...
- 04:13 AM Bug #9408: erasure-code: misalignment
- With the following applied on dcc608d5d3f701315eaf0edee6f0a4796a4d97e1...
- 03:20 AM Bug #9408: erasure-code: misalignment
- jianpeng ma wrote:
> Can you tell met the result for this situation? I run with your command but it looks good.
I...
- 04:52 AM Bug #9008: Objecter: pg listing can deadlock when throttling is in use
- Please help to review: https://github.com/ceph/ceph/pull/2489
- 04:33 AM rgw Bug #9469 (Rejected): RadosGW performance degrades with high concurrency workload.
- I am running COSbench as a performance benchmarking tool on a CEPH cluster(Swift API). Setup details are as follows:-...
- 04:29 AM Bug #9468 (Won't Fix): Unable to delete crush rule with blank space
- I am not sure how crush rule with blank space in beginning got created. But, I am not able to delete it.
ems@rack...
- 04:15 AM Bug #9467 (Won't Fix): Delete default erasure coded profile getting succeeded
- Deleting the default erasure-coded profile succeeds.
Also, re-creating erasure coded profile "default" with ...
- 03:24 AM Fix #6754 (Resolved): erasure-code: jerasure plugin does not check parameters properly