Activity

From 07/12/2014 to 08/10/2014

08/10/2014

11:43 PM devops Bug #9061 (Resolved): dumpling to firefly upgrade on RH6 restarts the daemons
Hi,
When I upgrade the RPMs on a RH6 server from 0.67.9 to 0.80.5, the daemons are (cond)restarted. I believe these ...
Dan van der Ster
07:20 PM Linux kernel client Bug #8806: libceph: must use new tid when watch is resent
meanwhile, the MWatchNotify message now has a return value encoded at the end (s32) when header.version >= 0. See wi... Sage Weil
07:19 PM Linux kernel client Bug #8806: libceph: must use new tid when watch is resent
the bug is with the kernel client: it needs to use a new tid when resending the watch. this was partially fixed on t... Sage Weil
05:04 PM Bug #9057 (Fix Under Review): mark_down from fast dispatch can deadlock
https://github.com/ceph/ceph/pull/2238 Sage Weil
10:45 AM Bug #9057: mark_down from fast dispatch can deadlock
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-08-09_14:13:44-rados-next-testing-basic-multi/410713
3 (!...
Sage Weil
08:41 AM Bug #9057 (Resolved): mark_down from fast dispatch can deadlock
... Sage Weil
04:13 PM Feature #8639 (In Progress): mon: dispatch messages while blocked waiting for IO
Sage Weil
03:45 PM Bug #8620: rest/test.py occasional failure (dumpling)
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-08-10_13:22:17-rados-dumpling-distro-basic-multi/413788 Sage Weil
02:07 PM Feature #8560 (Fix Under Review): mon: instrument paxos
Sage Weil
12:51 PM rgw Bug #8988 (Fix Under Review): AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
Two consecutive runs with the increased timeout do not show the bug ("one":http://pulpito.ceph.com/loic-2014-08-10_15:... Loïc Dachary
02:03 AM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
In a few tickets it is suggested that this may be an idle timeout problem. I "rescheduled a suite":http://pulpito.cep... Loïc Dachary
01:31 AM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
In the attached file, each part separated with *-----------------------------* is the output between the last success... Loïc Dachary
01:09 AM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
The errors for each failure are different and suggest the tests are failing for an independent reason such as the cl... Loïc Dachary
01:03 AM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
* http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-ba... Loïc Dachary
12:46 PM Bug #9055 (Fix Under Review): LibRadosTwoPoolsPP.HitSetWrite (and others) fail on remove of whiteout
https://github.com/ceph/ceph/pull/2236 Sage Weil
11:05 AM Feature #9059 (Resolved): osd: store opportunistic whole-object checksum
when we deep scrub, we have whole-object checksums that cover data and omap. store a copy in object_info_t, along ... Sage Weil
10:52 AM Bug #8935: operations not idempotent when enabling cache
sage-2014-08-09_14:13:44-rados-next-testing-basic-multi/410527 and 410528 Sage Weil
10:51 AM Bug #9058 (Can't reproduce): rest-api: long-running process may fail 'tell osd...' due to stale o...
sage-2014-08-09_14:13:44-rados-next-testing-basic-multi/410524 Sage Weil
10:48 AM Bug #8894: osd/ReplicatedPG.cc: 9281: FAILED assert(object_contexts.empty())
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-08-09_14:13:44-rados-next-testing-basic-multi/410806
alwa...
Sage Weil
02:16 AM CephFS Bug #8725: mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic-plana
"same error":http://pulpito.ceph.com/loic-2014-08-10_09:59:49-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping... Loïc Dachary
12:53 AM CephFS Bug #8725: mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic-plana
Another "similar crash":http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chun... Loïc Dachary
12:39 AM CephFS Bug #8725: mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic-plana
And the same trace at "upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-08_12:13:20-upgrade:firef... Loïc Dachary
12:33 AM CephFS Bug #8725: mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic-plana
Looks like a similar problem at "upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-08_12:13:20-upg... Loïc Dachary
01:04 AM Feature #9025: erasure-code: chunk remapping
The upgrade suite from firefly had one error related to an independent "MDS problem":http://pulpito.ceph.com/loic-201... Loïc Dachary
12:49 AM Feature #8496 (Resolved): erasure-code: ErasureCode base class
Loïc Dachary
12:41 AM Feature #8496: erasure-code: ErasureCode base class
The "upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-08_12:13:20-upgrade:firefly-x:stress-split-... Loïc Dachary
12:16 AM Bug #8978: ceph ping not working as expected
I'm experiencing the same (on newly installed ceph-cluster via Ubuntu server 14.04.1):
ceph status
cluster b6...
Kees Boogert

08/09/2014

10:55 PM rbd Bug #8000: SLAB: Unable to allocate memory on node 0
Unfortunately converting RBD to image format 2 did not fix it. User returned after being away for a week and her syst... Dmitry Smirnov
05:50 PM CephFS Bug #9056: fuse kmod + ceph-fuse triggers "BUG: sleeping function called from invalid context"

http://pulpito.front.sepia.ceph.com/john-2014-08-09_14:56:53-fs-wip-mds-contexts-testing-basic-plana/409236/
http:...
John Spray
05:48 PM CephFS Bug #9056 (Resolved): fuse kmod + ceph-fuse triggers "BUG: sleeping function called from invalid ...

kernel 5f740d7e1531099b888410e6bab13f68da9b1a4d
wip-mds-contexts (aka wip-objecter) 7be59771bff09e2b46b5467627cb...
John Spray
12:53 PM Bug #9055 (Resolved): LibRadosTwoPoolsPP.HitSetWrite (and others) fail on remove of whiteout
2014-08-09T09:03:14.670 INFO:tasks.workunit.client.0.plana70.stdout:test/librados/TestCase.cc:93: Failure
2014-08-09...
Sage Weil
12:26 PM Bug #9054: ceph_test_rados: FAILED assert(!old_value.deleted())
2014-08-08 10:55:12.312751 7f1237847700 10 osd.0 pg_epoch: 462 pg[2.1( v 462'2839 (0'0,462'2839] local-les=422 n=53 e... Sage Weil
10:04 AM Bug #9054: ceph_test_rados: FAILED assert(!old_value.deleted())
almost there. on osd.0, we finish trimming 14a here:
2014-08-08 10:55:12.311901 7f1237847700 10 osd.0 pg_epoch: 4...
Sage Weil
11:43 AM Bug #8894: osd/ReplicatedPG.cc: 9281: FAILED assert(object_contexts.empty())
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-08-08_22:30:19-rados-wip-sage-testing-testing-basic-burnupi/... Sage Weil
01:39 AM Bug #9044 (Fix Under Review): erasure-code: use ruleset instead of ruleid
"associated pull request":https://github.com/ceph/ceph/pull/2232 Loïc Dachary

08/08/2014

11:00 PM Bug #9054 (Resolved): ceph_test_rados: FAILED assert(!old_value.deleted())
ubuntu@teuthology:/a/teuthology-2014-08-06_02:30:01-rados-next-testing-basic-plana/403383... Sage Weil
10:58 PM Bug #8997: ceph_test_rados_watch_notify hangs
ubuntu@teuthology:/a/teuthology-2014-08-06_02:30:01-rados-next-testing-basic-plana/402968 Sage Weil
10:55 AM Bug #8997: ceph_test_rados_watch_notify hangs
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-08-06_02:30:01-rados-next-testing-basic-plana/402968 Sage Weil
10:54 PM Bug #9053 (Resolved): mon/Paxos.cc: 628: FAILED assert(begin->last_committed == last_committed)
ubuntu@teuthology:/a/teuthology-2014-08-06_02:30:01-rados-next-testing-basic-plana/402965
description: rados/monthra...
Sage Weil
07:36 PM Bug #9052: ceph-mon crashes with *** Caught signal (Floating point exception) **
With no OSDs in the cluster, the calculations for @pgs_per_osd@ can divide by zero (integer, but that still causes th... Dan Mick
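The divide-by-zero described in the comment above can be illustrated with a minimal sketch. This is not the ceph-mon code: the function name `pgs_per_osd` and its parameters are hypothetical, chosen only to mirror the variable named in the comment. In C and C++ an integer division by zero typically raises SIGFPE on x86, which is why the crash is reported as a "Floating point exception"; the guard below shows one way such a calculation might avoid the division when the cluster has no OSDs.

```python
def pgs_per_osd(total_pgs: int, num_osds: int) -> int:
    """Return the average number of PGs per OSD (hypothetical helper).

    Assumption for illustration: when there are no OSDs, report 0
    instead of attempting the division.
    """
    if num_osds <= 0:
        # Guard: an empty cluster has nothing to divide by.
        return 0
    return total_pgs // num_osds


if __name__ == "__main__":
    print(pgs_per_osd(128, 4))  # ordinary case
    print(pgs_per_osd(128, 0))  # empty cluster, guarded
```

Returning 0 is only one possible policy; rejecting the request with an error while no OSDs exist would be an equally plausible choice.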
07:29 PM Bug #9052 (Resolved): ceph-mon crashes with *** Caught signal (Floating point exception) **
I've found that I can crash ceph-mon by attempting to change pool values (such as pg_num) before adding OSDs to the c... Jamin Collins
06:59 PM rgw Documentation #9051 (Closed): Document rgw_defer_to_bucket_acls option
It appears that the only documentation right now is the commit message of 1d7c2041. Benjamin Gilbert
06:16 PM Bug #7576: osd: large skew in pg epochs (dumpling)
..and when we do, include commit:a52a855f6c92b03dd84cd0cc1759084f070a98c2 !! Sage Weil
06:16 PM Bug #7576 (Pending Backport): osd: large skew in pg epochs (dumpling)
still want to backport this to firefly ... Sage Weil
06:04 PM rgw Bug #8621: civetweb frontend fails authentication if URL has special chars
tested wip-8621 by executing s3tests, there are still a few failures,
logs are copied to ubuntu@mira042.front.sepi...
Tamilarasi muthamizhan
04:42 PM Fix #4205: librados: Improve Watch-notify semantics
http://pad.ceph.com/p/watch-notify Sage Weil
03:55 PM devops Feature #9050 (Rejected): Calamari builds for ceph.com
Neil Levine
03:24 PM devops Feature #6310 (Closed): Get Dumpling into CentOS Ceph repo
Neil Levine
10:31 AM Bug #9046 (Resolved): Limiting the pool object quota stops the IO, however IO does not restart if...
Issue Title: Limiting the pool object quota stops the IO, however IO does not restart if we reset the pool object quot... Hirak Mazumder
09:37 AM Bug #9040: clients can SEGV during package upgrade
Ian Colle
09:03 AM Bug #9023: valgrind failures in OSD
Another `new Session` at OSD.cc:3704
http://qa-proxy.ceph.com/teuthology/john-2014-08-07_18:44:20-fs-wip-mds-context...
John Spray
06:43 AM Bug #9044 (Resolved): erasure-code: use ruleset instead of ruleid
When "ruleset is looked up by name":https://github.com/ceph/ceph/blob/firefly/src/mon/OSDMonitor.cc#L2928 when creati... Loïc Dachary
03:15 AM Feature #9025: erasure-code: chunk remapping
"requeued, for ubuntu 14.04 to get quicker results":http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-... Loïc Dachary
03:13 AM Feature #8496: erasure-code: ErasureCode base class
"requeued, for ubuntu 14.04 to get quicker results":http://pulpito.ceph.com/loic-2014-08-08_12:13:20-upgrade:firefly-... Loïc Dachary
02:43 AM rgw Bug #9043 (Duplicate): rgw:Cannot add object to Ceph using Openstack Dashboard(Horizon) in firefly
Uploading a new object fails with message "Error: Unable to upload object".
While adding an object using Horizon w...
Ashish Chandra

08/07/2014

03:56 PM Feature #8276: ceph-filestore-dump import-rados -p <pool> <archive>
Implemented syntax:
ceph_objectstore_tool import-rados pool [import_file|-]
Import into the specified pool on r...
David Zafman
03:54 PM Bug #8396 (Resolved): osd: message delayed in Session misdirected after split
Samuel Just
03:39 PM Bug #8625 (Pending Backport): EC pool - OSD creates an empty file for op with 'create 0~0, writef...
Sage Weil
02:34 PM Bug #9040: clients can SEGV during package upgrade
https://github.com/ceph/ceph-qa-suite/pull/77 seems to fix this.
Testing now.
Yuri Weinstein
01:56 PM Bug #9040 (Won't Fix): clients can SEGV during package upgrade
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-06_16:30:35-upgrade:dumpling-dumpling---basic-vps/... Yuri Weinstein
12:37 PM rgw Bug #9039: Using COPY on radosgw to copy object from one bucket to another that's in another pool...
Well, I think data copy is the right thing to do. If I put buckets in different pools, it is because they're configured dif... Sylvain Munaut
10:40 AM rgw Bug #9039: Using COPY on radosgw to copy object from one bucket to another that's in another pool...
The problem is that it is implicitly assumed with the new manifest that the tail is going to reside at the same pool ... Yehuda Sadeh
10:07 AM rgw Bug #9039: Using COPY on radosgw to copy object from one bucket to another that's in another pool...
Really? I didn't see anything in the code that checked whether the destination bucket was in the same pool or not an... Sylvain Munaut
09:59 AM rgw Bug #9039: Using COPY on radosgw to copy object from one bucket to another that's in another pool...
That sounds like an issue with the new (firefly) manifest. Yehuda Sadeh
07:21 AM rgw Bug #9039 (Resolved): Using COPY on radosgw to copy object from one bucket to another that's in a...
Currently if you copy an object from a bucket to another one which is in another rados pool, things will just break. ... Sylvain Munaut
09:34 AM Bug #9035 (Closed): ceph cluster is using more space than actual data after replication
the used is simply summing the statfs(2) results on all the OSDs. you can see this by doing a df on the osd volumes,... Sage Weil
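The accounting described in the comment above (cluster "used" as the sum of per-OSD filesystem usage) can be sketched as follows. This is an illustrative approximation, not Ceph's implementation: the helper names and the list of OSD data paths are assumptions, and the sketch relies on `os.statvfs`, which is available on Unix-like systems only.

```python
import os
from typing import Iterable


def used_bytes(path: str) -> int:
    """Bytes in use on the filesystem holding `path`, as `df` reports it."""
    st = os.statvfs(path)
    # Used blocks = total blocks minus free blocks, in units of f_frsize.
    return (st.f_blocks - st.f_bfree) * st.f_frsize


def cluster_used_bytes(osd_paths: Iterable[str]) -> int:
    """Sum filesystem usage over all OSD data paths (hypothetical helper)."""
    return sum(used_bytes(p) for p in osd_paths)


if __name__ == "__main__":
    # Assumed example path; substitute the real OSD mount points.
    print(cluster_used_bytes(["/"]))
```

Because the figure comes from the filesystems themselves, it includes everything stored on those volumes, not only object data, which is one reason it can exceed the replicated data size.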
02:24 AM Bug #9035 (Closed): ceph cluster is using more space than actual data after replication
Ceph cluster is using more space than estimated space to store data after replication.
Total cluster capacity is 5...
Srinivasula Reddy Maram
07:52 AM rgw Bug #9037 (Duplicate): civetweb: error HEAD responses return body
Ian Colle
07:40 AM rgw Bug #9037: civetweb: error HEAD responses return body
Ah, sorry, somehow managed to miss it when I looked through the issue list. Please close this then. Valtteri Vuorikoski
07:34 AM rgw Bug #9037: civetweb: error HEAD responses return body
See #8539 Sylvain Munaut
02:59 AM rgw Bug #9037 (Duplicate): civetweb: error HEAD responses return body
0.80.5 radosgw with civetweb frontend returns body data when sending an error response to a HEAD request. This breaks... Valtteri Vuorikoski
06:41 AM CephFS Feature #9029: min/max uid for snapshot creation
Wido den Hollander
06:00 AM Bug #4254: osd: failure to recover before timeout on rados bench and thrashing; negative stats
I am seeing this issue again on v0.80.4. I stopped 3 osd processes and marked them as out to trigger data migration (... Zhi Zhang
03:08 AM Feature #8496: erasure-code: ErasureCode base class
"requeued on vps because plana are very busy":http://pulpito.ceph.com/loic-2014-08-07_12:09:48-upgrade:firefly-x:stre... Loïc Dachary
03:06 AM Feature #9025: erasure-code: chunk remapping
"queued the suite on vps because plana are very busy":http://pulpito.ceph.com/loic-2014-08-07_12:06:56-upgrade:firefl... Loïc Dachary
12:54 AM Feature #9025: erasure-code: chunk remapping
"upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-07_09:56:17-upgrade:firefly-x:stress-split-wip-... Loïc Dachary
01:23 AM Feature #9034 (New): erasure-code: better LRC strategy
The current LRC recovery strategy does not take advantage of all possibilities and may fail to discover a scenario th... Loïc Dachary
01:17 AM Feature #9033 (Resolved): erasure-code: simplified LRC
Add implicit parity and simplified LRC as "described by Andreas":https://www.mail-archive.com/ceph-devel@vger.kernel.... Loïc Dachary

08/06/2014

06:40 PM Bug #9022 (Pending Backport): Potential lock leaks in RadosClient
Sage Weil
02:58 AM Bug #9022: Potential lock leaks in RadosClient
Pull request on the way. Pavan Rallabhandi
02:58 AM Bug #9022 (Resolved): Potential lock leaks in RadosClient
While going through RadosClient, I identified a couple of interfaces, librados::RadosClient::lookup_pool() and librados::R... Pavan Rallabhandi
03:39 PM Feature #9031: List RADOS namespaces and list all objects in all namespaces

A way to implement this is to enhance the pg_ls_repsonse_t to include the namespace (or change object_t to hobject_...
David Zafman
02:30 PM Feature #9031 (Resolved): List RADOS namespaces and list all objects in all namespaces
We can currently create namespaces, but cannot easily view those that have been created. A method of listing namespac... Brian Andrus
03:23 PM devops Bug #9032 (Rejected): ceph-deploy over proxy
I have my servers working behind a proxy. When I run the ceph-deploy install command I get an error:
[ceph01][INFO ...
TJ Walker
02:05 PM Feature #9030 (Fix Under Review): mon: quickly identify 'problem' osds
Sage Weil
02:05 PM Feature #9030 (Resolved): mon: quickly identify 'problem' osds
Sage Weil
12:55 PM Bug #8860 (Fix Under Review): ceph-disk issues with custom cluster name
PR opened https://github.com/ceph/ceph/pull/2216 Alfredo Deza
12:25 PM CephFS Feature #9029 (Resolved): min/max uid for snapshot creation
On shared systems like shared hosting it might be useful to prevent regular users from creating snapshots on CephFS.
...
Wido den Hollander
12:20 PM rgw Feature #6747: PowerDNS backend for RGW bucket directing
Wido den Hollander
11:06 AM rbd Bug #8845 (Pending Backport): Flattening Clones of clone, results in command failure
Sage Weil
09:41 AM Bug #9019 (Resolved): Makefile.am: error: required file './README' not found
fixed it up with a symlink.. other solutions seemed more annoying :( Sage Weil
08:39 AM Linux kernel client Bug #8818 (Resolved): IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
OK, thanks everybody.... Ilya Dryomov
08:09 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
I switched to the good kernel (3.16.0-ceph-00037-g0532581) yesterday and re-ran my scripts overnight. The scripts co... Greg Wilson
08:39 AM Linux kernel client Bug #8464 (Resolved): krbd: deadlock
OK, thanks everybody.... Ilya Dryomov
08:06 AM Feature #8496: erasure-code: ErasureCode base class
"scheduled upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-06_17:07:04-upgrade:firefly-x:stress-... Loïc Dachary
06:22 AM Feature #8496: erasure-code: ErasureCode base class
The test "only had one job":http://pulpito.ceph.com/loic-2014-08-05_13:45:56-upgrade:firefly-x:stress-split-wip-8496-... Loïc Dachary
07:12 AM Feature #9025 (Fix Under Review): erasure-code: chunk remapping
"need review":https://github.com/ceph/ceph/pull/2213 Loïc Dachary
06:28 AM Feature #9025 (Resolved): erasure-code: chunk remapping
Interpret the *mapping* parameter and remap the chunks accordingly. For instance mapping=_DD means the data chunks ar... Loïc Dachary
07:11 AM CephFS Feature #9026 (Resolved): client: vxattr support for rctime, rsize, etc.
Sage Weil
05:44 AM Bug #9023 (Can't reproduce): valgrind failures in OSD

osd.2 from OSD.cc:462 (SafeTimer::init, pthread_create)
http://pulpito.front.sepia.ceph.com/john-2014-08-01_11:0...
John Spray

08/05/2014

11:22 PM Feature #9021 (Resolved): librbd: shared flag, object map
we need to consider a tradeoff between multi-client support and single-client support for librbd. In practice... Haomai Wang
10:43 PM Bug #8797: "ceph status" do not exit with python_2.7.8
For the moment, the Python maintainer in Debian has kindly fixed this issue for us by adding a patch to revert the problematic change ... Dmitry Smirnov
07:34 PM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
The 00036 "bad" kernel started showing the problem in the /var/log/kern.log file within minutes of starting my test s... Eric Eastman
12:49 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
Eric, Greg,
The fix on top of 3.16 + testing is in wip-request-fn.
http://gitbuilder.ceph.com/kernel-deb-precis...
Ilya Dryomov
06:16 PM Bug #9019 (Resolved): Makefile.am: error: required file './README' not found
Commit a923e2c9eb16823fa484c renamed README to README.md to render in markdown. After that, I can't generate Makefil... jianpeng ma
06:15 PM Bug #9008: Objecter: pg listing can deadlock when throttling is in use
> I'm guessing the request is hung on the OSD side of things...
Thanks Sage. Sadly, after restarting the radosgw daemon, ...
Guang Yang
08:28 AM Bug #9008 (Need More Info): Objecter: pg listing can deadlock when throttling is in use
Sage Weil
08:28 AM Bug #9008: Objecter: pg listing can deadlock when throttling is in use
please query the admin socket for the process like so:
ceph daemon /var/run/ceph/ceph-client.*.asok objecter_requ...
Sage Weil
02:44 AM Bug #9008 (Resolved): Objecter: pg listing can deadlock when throttling is in use
In our Ceph cluster (with radosgw), we found that occasionally the processing threads hang forever and eventually ha... Guang Yang
02:24 PM Bug #9018 (Resolved): "LibRadosTwoPoolsPP*" failed in upgrade:dumpling-x-firefly---basic-vps
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-05_09:22:33-upgrade:dumpling-x-firefly---basic-vps... Yuri Weinstein
02:13 PM devops Feature #8868: Update Fedora to 0.80.5 packages with ceph-common
So, there's a PR open for some restructuring of the .spec file now that we need to get in soon to make this more sane... Dan Mick
01:21 PM Fix #6278 (Resolved): osd: throttle snap trimming
Sage Weil
01:20 PM devops Fix #9017 (Rejected): [paddles] implement validation across all controller methods
paddles has a lot of boilerplate in controllers that looks like:... Alfredo Deza
01:15 PM Feature #9015 (Resolved): msgr refactoring to support xio work
Sage Weil
01:09 PM Feature #9015 (Resolved): msgr refactoring to support xio work
Sage Weil
01:14 PM Fix #8905 (In Progress): msgr: encode osd epoch in nonce to avoid misc OSD reconnect races
Sage Weil
01:10 PM Feature #7516 (Fix Under Review): mon: reweight-by-pg
Sage Weil
01:06 PM Feature #7238 (In Progress): erasure code : implement LRC plugin
Samuel Just
12:55 PM Bug #8083: erasure-code: fix static code analysis errors found in gf-complete
Loïc Dachary
12:28 PM Documentation #8875 (Resolved): `ceph-deploy new` needs to be called for every node, not just the...
PR https://github.com/ceph/ceph/pull/2206
and merged commit e6935dd into master
Alfredo Deza
09:37 AM Documentation #8875 (In Progress): `ceph-deploy new` needs to be called for every node, not just ...
I noted the problem in the docs and will fix that shortly.
You are right, you need to run `ceph-deploy new {NODES}...
Alfredo Deza
11:19 AM Bug #9011: osd memory leaks on next
gonna see if this happens on plana too Sage Weil
11:13 AM Bug #9011: osd memory leaks on next
these look like static std::strings. and some other weird leaks that don't make sense... Sage Weil
08:00 AM Bug #9011 (Duplicate): osd memory leaks on next
ubuntu@teuthology:/a/sage-2014-08-04_11:34:19-rgw-next-testing-basic-vps/397606
need to clean these up
Sage Weil
09:26 AM rgw Feature #9013 (Resolved): rgw: set civetweb as a default frontend
Should add civetweb to the default frontends. Yehuda Sadeh
09:13 AM Messengers Bug #8880 (Pending Backport): msg/Pipe.cc: 1538: FAILED assert(0 == "old msgs despite reconnect_s...
Sage Weil
09:11 AM Bug #9012 (Duplicate): "[WRN] map e277 wrongly marked me down" in upgrade:dumpling-x-firefly---ba...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-04_14:18:17-upgrade:dumpling-x-firefly---basic-vps... Yuri Weinstein
09:05 AM rgw Feature #8218 (In Progress): rgw: object versioning manifest changes
Ian Colle
09:05 AM rgw Feature #8217 (In Progress): rgw: object versioning object overwrite / delete changes
Ian Colle
09:05 AM rgw Feature #8216 (In Progress): rgw: object versioning objclass support
Ian Colle
09:05 AM rgw Feature #8473 (In Progress): rgw: Shard bucket index objects to improve single bucket PUT throughput
Ian Colle
08:54 AM rbd Bug #8845 (Fix Under Review): Flattening Clones of clone, results in command failure
https://github.com/ceph/ceph/pull/2205 Josh Durgin
08:52 AM Fix #8914: osd crashed at assert ReplicatedBackend::build_push_op
btw, the steps to reproduce this issue are mentioned by Sahana above & it can be reproduced on a single node too.
...
Dhiraj Kamble
08:47 AM Fix #8914: osd crashed at assert ReplicatedBackend::build_push_op
Hi Greg,
No, I did not intend to add any comments.
The reason I thought we should assert is so that we can serv...
Dhiraj Kamble
08:25 AM Bug #9007 (Duplicate): Ceph Firefly 0.80.4 : Unable to get some pool values
you're right. this is fixed in master, and backported to firefly-next.. will be in next firefly point release. Sage Weil
01:50 AM Bug #9007 (Duplicate): Ceph Firefly 0.80.4 : Unable to get some pool values
h1. Hello Developers
I am curious to know if there is something missing from the code for Ceph pool values.
As...
karan singh
07:56 AM rgw Bug #8676: md5sum check failed during readwrite.py
ubuntu@teuthology:/a/sage-2014-08-04_11:34:19-rgw-next-testing-basic-vps/397522 Sage Weil
04:46 AM Feature #8496: erasure-code: ErasureCode base class
"upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-05_13:45:56-upgrade:firefly-x:stress-split-wip-... Loïc Dachary
12:48 AM Feature #8496 (Fix Under Review): erasure-code: ErasureCode base class
"pull request":https://github.com/ceph/ceph/pull/2201 Loïc Dachary
04:22 AM Bug #9009 (In Progress): (wip-objecter) ObjectCacher assert in fs client
OK, no big deal, just that there are contexts in the Client, like the MDS, which need updating to take client_lock wh... John Spray
03:49 AM Bug #9009 (Resolved): (wip-objecter) ObjectCacher assert in fs client

From branch wip-mds-contexts, which is a derivative of wip-objecter.
http://qa-proxy.ceph.com/teuthology/john-20...
John Spray
03:21 AM rgw Feature #8911: RGW doesn't return 'x-timestamp' in header which is used by 'View Details' of Open...
It also does not return the "Content-type" header. Swift does return this header as well. So I would love to see r... Ashish Chandra
12:45 AM rgw Documentation #9003: rgw: document development setup for rgw
Much needed. Great! Abhishek Lekshmanan

08/04/2014

11:33 PM Feature #8496 (In Progress): erasure-code: ErasureCode base class
Because it needs work to adapt the isa plugin, it deserves a separate patch. Otherwise it mixes two unrelated topics. Loïc Dachary
05:12 AM Feature #8496 (Rejected): erasure-code: ErasureCode base class
It is part of a "larger pull request":https://github.com/ceph/ceph/pull/1911 Loïc Dachary
11:21 PM Bug #8736: thrash and scrub combination lead to error
http://pulpito.ceph.com/loic-2014-08-04_15:06:02-upgrade:firefly-x:stress-split-wip-8475-testing-basic-plana/396887/
...
Loïc Dachary
11:02 PM Feature #8475 (Resolved): erasure-code: oversized objects when using the Cauchy technique
Loïc Dachary
06:05 AM Feature #8475: erasure-code: oversized objects when using the Cauchy technique
"scheduled upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-04_15:06:02-upgrade:firefly-x:stress-... Loïc Dachary
02:07 AM Feature #8475: erasure-code: oversized objects when using the Cauchy technique
"Rebased and repushed":https://github.com/ceph/ceph/pull/1890 , running gitbuilder Loïc Dachary
08:00 PM rgw Feature #3454: Support temp URLs for Swift API
This should be documented somewhere too, at least in the table at http://ceph.com/docs/master/radosgw/swift/ Blair Bethwaite
03:09 PM Bug #8998 (Pending Backport): osd: SEGV in OSD::heartbeat()
Sage Weil
03:00 PM Bug #8998 (Fix Under Review): osd: SEGV in OSD::heartbeat()
https://github.com/ceph/ceph/pull/2198 Sage Weil
09:14 AM Bug #8998: osd: SEGV in OSD::heartbeat()
ubuntu@teuthology:/a/teuthology-2014-08-03_02:30:01-rados-next-testing-basic-plana/394893 Sage Weil
02:18 PM rgw Feature #9004 (New): rgw: multi-site: multi-master
As a user, I want to be able to write to any available RGW and have that file available on other RGWs for read and wr... Neil Levine
02:06 PM Bug #8891 (Resolved): rados bench hang during thrashing
Sage Weil
09:17 AM Bug #8891 (Fix Under Review): rados bench hang during thrashing
Sage Weil
01:53 PM rgw Documentation #9003: rgw: document development setup for rgw
While we're at it, beefing up the rgw support in vstart.sh would be great. right now you can pass RGW=1 and it will ... Sage Weil
01:49 PM rgw Documentation #9003 (Closed): rgw: document development setup for rgw
Yehuda Sadeh
11:20 AM rgw Bug #9002 (Duplicate): Creating swift key with --gen-secret in separate step from subuser creatio...
Customer reported on CentOS with Ceph v0.80.4
Steps to reproduce:
radosgw-admin user create --uid=testuser1 --dis...
Brian Andrus
11:00 AM rgw Bug #9001 (Won't Fix): Starting gateway with radosgw init script fails to create socket
Ceph Version: v0.80.4
Distro: CentOS
Customer reported, unable to reproduce.
/var/run/ceph directory owned by ...
Brian Andrus
09:16 AM Bug #7986: 3.1s0 scrub stat mismatch, got 2041/2044 objects, 0/0 clones, 2041/2044 dirty, 0/0
ubuntu@teuthology:/a/teuthology-2014-08-03_02:30:01-rados-next-testing-basic-plana/395219 Sage Weil
07:07 AM Linux kernel client Bug #8979: GPF kernel panics - auth?
pushed wip-8979 which removes the fixed buffer size. but, we still need to make things not crash when the auth reply... Sage Weil
06:57 AM Linux kernel client Bug #8979: GPF kernel panics - auth?
yeah:
#define TEMP_TICKET_BUF_LEN 256
Sage Weil
06:48 AM Linux kernel client Bug #8979: GPF kernel panics - auth?
... Sage Weil
06:36 AM Documentation #8875: `ceph-deploy new` needs to be called for every node, not just the admin one
I was able to complete install.
The first step above granted sudo rights on each node.
The way I was able to get it...
Bobby Yakov
05:56 AM Documentation #8875: `ceph-deploy new` needs to be called for every node, not just the admin one
You still need a user that can call sudo without a password prompt on remote nodes.
And it looks like you only pas...
Alfredo Deza
05:47 AM devops Bug #8893 (Resolved): ceph-deploy install command on centos 6.5 reports exception
merged commit eb9ea33 into ceph:master Alfredo Deza
01:47 AM Bug #8601 (Resolved): erasure-code: default profile does not exist after upgrade
Loïc Dachary

08/03/2014

09:48 PM rgw Bug #8864: radosgw help doesn't seem to display some debug options
I pushed a couple of commits to fix most of undocumented options in man pages & help for #8112. Can you let me know w... Abhishek Lekshmanan
09:35 PM rbd Bug #8000: SLAB: Unable to allocate memory on node 0
Finally I've isolated the issue.
Something was wrong with a particular RBD image (format 1) that was created on Ceph...
Dmitry Smirnov
09:11 PM CephFS Bug #8962: kcephfs: client does not release revoked cap
another similar hang:... Sage Weil
06:27 PM Bug #8891: rados bench hang during thrashing
i think this was the same reaper vs fast dispatch that i tracked down in wip-msgr. Sage Weil
02:48 PM devops Bug #8330: repodata on rpm repos do not list latest ceph-deploy (1.5.2)
Agreed, this is fixed. Current repodata works perfectly with all packages showing correctly (on the same host btw, I'... Simon Ironside
08:40 AM rgw Bug #8784: rgw: completion leak
ubuntu@teuthology:/a/teuthology-2014-08-01_23:02:01-rgw-master-testing-basic-plana/394054 Sage Weil
08:39 AM Bug #8996 (Resolved): "Segmentation fault" in upgrade:dumpling-x-firefly---basic-vps suite
botched (double) backport, fixed by commit:4e03d5b512c8d2f7fa51dda95c6132e676529f9b Sage Weil

08/02/2014

05:01 PM Bug #8998 (Resolved): osd: SEGV in OSD::heartbeat()
... Sage Weil
04:58 PM Bug #8997 (Can't reproduce): ceph_test_rados_watch_notify hangs
... Sage Weil
04:55 PM Bug #8996 (Resolved): "Segmentation fault" in upgrade:dumpling-x-firefly---basic-vps suite
There are lots of these errors in:
http://pulpito.front.sepia.ceph.com/teuthology-2014-08-02_08:50:33-upgrade:dumpli...
Yuri Weinstein
04:31 PM Messengers Bug #8880: msg/Pipe.cc: 1538: FAILED assert(0 == "old msgs despite reconnect_seq feature")
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-08-01_02:32:01-rados-master-testing-basic-plana/392461 Sage Weil
08:14 AM Bug #8396: osd: message delayed in Session misdirected after split
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-08-01_02:32:01-rados-master-testing-basic-plana/392256 Sage Weil
08:07 AM Bug #6003: journal Unable to read past sequence 406 ...
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-08-01_02:32:01-rados-master-testing-basic-plana/392342... Sage Weil

08/01/2014

09:25 PM Bug #8776 (Won't Fix): osd: runaway memory on dumpling
this is a result of a very large omap object and us building a transaction to delete the keys. the problem is the bi... Sage Weil
09:57 AM Bug #8776: osd: runaway memory on dumpling
Argh, it's building up a leveldb operation to atomically remove all of the keys associated with the object. I *think... Samuel Just
06:26 PM Bug #8930 (Resolved): osd: test unable to produce unfound objects
David Zafman
04:07 PM Bug #8930 (Fix Under Review): osd: test unable to produce unfound objects
David Zafman
09:41 AM Bug #8930: osd: test unable to produce unfound objects
David Zafman
03:56 PM devops Bug #8849 (Resolved): rpm restarts daemons on upgrade
already backported, commit:e75dd2e4b7adb65c2de84e633efcd6c19a6e457b and ^ Sage Weil
03:55 PM Bug #8728 (Resolved): rest/test.py osd create not idempotent
Sage Weil
03:54 PM Bug #8670: Cache tiering parameters can not be displayed for a pool
non trivial to backport.. need to get all the rados test refactoring, too! Sage Weil
03:51 PM CephFS Bug #8622 (Resolved): erasure-code: rados command does not enforce alignement constraints
commit:7a58da53ebfcaaf385c21403b654d1d2f1508e1a Sage Weil
03:48 PM Bug #6789 (Resolved): cannot remove the leader when there only are two monitors
Sage Weil
03:39 PM Bug #8944 (Pending Backport): Ceph daemon bad asok used in connection with cluster
Sage Weil
03:37 PM Bug #8714 (Pending Backport): we do not block old clients from breaking cache pools
Sage Weil
03:35 PM Feature #8674 (Pending Backport): osd: cache tier: avoid promotion on first read
commit:79d1aff1821bc9f21477636df4d0d4e57f2cd008 Sage Weil
03:32 PM rgw Bug #8937 (Pending Backport): rgw: broken large(-ish) objects
Sage Weil
03:05 PM Documentation #8995 (Resolved): Preflight Checklist Clarifications
There are several small clarifications that can be made to the Ceph Preflight Checklist to help new users try out Cep... Christopher Hertel
02:44 PM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
No need to do that just yet. I now fully understand the problem and am working on a proper fix that I'd like you to tes... Ilya Dryomov
02:37 PM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
I have done some testing and I am seeing the same thing as Eric. With the deadlock-bad kernel I hit the deadlock iss... Greg Wilson
02:06 PM Bug #8625: EC pool - OSD creates an empty file for op with 'create 0~0, writefull 0~xxx, setxattr...
Making it not an rgw bug. Samuel Just
02:06 PM Bug #8625: EC pool - OSD creates an empty file for op with 'create 0~0, writefull 0~xxx, setxattr...
wip-8625, versioning should never be necessary after a create (it will be necessary before the create if the object a... Samuel Just
09:53 AM Bug #8625: EC pool - OSD creates an empty file for op with 'create 0~0, writefull 0~xxx, setxattr...
It's the create 0~0 followed by a writefull. Arguably, we still shouldn't version the object, I'll take a look. Samuel Just
01:02 PM Fix #8993 (Closed): osd_pool_default_pgp_num woes
When setting osd_pool_default_pgp_num and not osd_pool_default_pg_num you can create pools with more pgp than pg.
...
Alexandre Marangone
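A minimal sketch of the consistency check this report points at, assuming (this is an assumption, not the fix that was applied) that a pool's pgp_num should never exceed its pg_num. The function name effective_pgp_num and the clamping behaviour are hypothetical and are not taken from the Ceph source.

```python
from typing import Optional


def effective_pgp_num(pg_num: int, pgp_num: Optional[int] = None) -> int:
    """Return a pgp_num that is never larger than pg_num.

    Hypothetical helper: if only pgp_num is configured, it is clamped
    to pg_num instead of being used as-is.
    """
    if pg_num <= 0:
        raise ValueError("pg_num must be positive")
    if pgp_num is None:
        # No explicit pgp_num: follow pg_num.
        return pg_num
    if pgp_num <= 0:
        raise ValueError("pgp_num must be positive")
    return min(pgp_num, pg_num)
```

With this check, a default pgp_num larger than the default pg_num would be reduced to pg_num rather than producing a pool with more pgp than pg.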
12:57 PM devops Bug #8893 (Fix Under Review): ceph-deploy install command on centos 6.5 reports exception
PR opened https://github.com/ceph/ceph-deploy/pull/226 Alfredo Deza
06:51 AM devops Bug #8893 (In Progress): ceph-deploy install command on centos 6.5 reports exception
Alfredo Deza
09:15 AM rbd Bug #8416 (Closed): Client Crash when try to map a volume (ubuntu)
OK, I'm going to assume this was indeed the missing features handling bug. I looked into it, it was introduced in 3.... Ilya Dryomov
08:23 AM Bug #8989 (Rejected): Failed running iogen.sh in upgrade:firefly-firefly-testing-basic-vps suite
It was a test mis-configuration. When we added a new client to run workload on, we had to be more specific about on ... Yuri Weinstein
07:10 AM Bug #8717 (Resolved): teuthology: valgrind leak checks broken for osd (at least)
Sage Weil
05:57 AM Bug #8601: erasure-code: default profile does not exist after upgrade
... Loïc Dachary
02:23 AM Feature #8992 (New): Uniqueness between two or more CRUSH ruleset choose statements
Assuming that ceph-node1 is in default root, when we define and assign following crush rule:... Szymon Zacher
01:44 AM Bug #8641: Cache tiering agent cannot flush or evict objects during the benchmark
In my opinion the problem also affects cache_min_evict_age, cache_min_flush_age and others. It's impossible to force ceph c... Szymon Zacher
12:40 AM CephFS Bug #8962: kcephfs: client does not release revoked cap
... Zheng Yan

07/31/2014

09:04 PM rgw Bug #8972 (Pending Backport): rgw: bucket index log wrong object name in multipart completion
Sage Weil
09:31 AM rgw Bug #8972 (Fix Under Review): rgw: bucket index log wrong object name in multipart completion
Sage Weil
08:54 PM CephFS Bug #8962: kcephfs: client does not release revoked cap
Zheng Yan wrote:
> Sage Weil wrote:
> > Zheng Yan wrote:
> > > no clue what happened. please dump the mds cache wh...
Sage Weil
07:32 PM CephFS Bug #8962: kcephfs: client does not release revoked cap
Sage Weil wrote:
> Zheng Yan wrote:
> > no clue what happened. please dump the mds cache when it happens next time
...
Zheng Yan
10:11 AM CephFS Bug #8962: kcephfs: client does not release revoked cap
Zheng Yan wrote:
> no clue what happened. please dump the mds cache when it happens next time
We have a dump, act...
Sage Weil
08:48 PM rgw Bug #8991 (Resolved): rgw: RGWRados::list_bi_log_entries() doesn't clear list
... Yehuda Sadeh
03:52 PM Bug #8977: osd: didn't discard sub_op_reply from previous interval?
Added some debugging to dump the OpWQ queue information if there are stale ops, running in loop. Samuel Just
12:53 PM Bug #8977: osd: didn't discard sub_op_reply from previous interval?
2014-07-30 10:40:58.317063 7fc2164da700 0 log [WRN] : slow request 960.196157 seconds old, received at 2014-07-30 10... Samuel Just
02:35 PM Bug #8989 (Rejected): Failed running iogen.sh in upgrade:firefly-firefly-testing-basic-vps suite
The majority of failures in this run are related to this: http://pulpito.front.sepia.ceph.com/teuthology-2014-07-30_12:... Yuri Weinstein
12:52 PM Feature #131 (In Progress): bring wireshark plugin is up to date
Sage Weil
12:51 PM Documentation #7 (Resolved): Document Monitor Commands
ceph -h Sage Weil
11:29 AM rgw Bug #8988 (Resolved): AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
"Related issue":http://tracker.ceph.com/issues/9100
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-201...
Yuri Weinstein
11:25 AM Bug #8982 (Pending Backport): cache pool osds crashing when data is evicting to underlying storag...
Sage Weil
11:14 AM Bug #8982 (Fix Under Review): cache pool osds crashing when data is evicting to underlying storag...
Sage Weil
08:47 AM Bug #8982 (In Progress): cache pool osds crashing when data is evicting to underlying storage pool
Sage Weil
07:36 AM Bug #8982 (Resolved): cache pool osds crashing when data is evicting to underlying storage pool
We have an erasure-coded pool 'ecdata' and a replicated (size=3) pool 'cache' acting as a writeback cache on top of it.
When...
Kenneth Waegeman
11:17 AM Bug #8969 (Pending Backport): PerfCounters.SinglePerfCounters failure on i386
Sage Weil
09:48 AM rgw Feature #8987 (New): rgw: data sync for multipart upload
Yehuda Sadeh
09:46 AM Bug #8986 (Duplicate): "[WRN] map e62 wrongly marked me down" in upgrade:dumpling-x-firefly---bas...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-07-30_13:00:44-upgrade:dumpling-x-firefly---basic-vps... Yuri Weinstein
09:43 AM Bug #8985: "[WRN] map e9 wrongly marked me down" in upgrade:dumpling-x-firefly---basic-vps suite
... Yuri Weinstein
09:42 AM Bug #8985 (Resolved): "[WRN] map e9 wrongly marked me down" in upgrade:dumpling-x-firefly---basic...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-07-30_13:00:44-upgrade:dumpling-x-firefly---basic-vps... Yuri Weinstein
09:35 AM Bug #8970 (Won't Fix): Injectargs - inconsistent parsing of bool values
these will also work:
--my-boolean-option=0
--my-boolean-option=false
but you're right, the others won't, be...
Sage Weil
09:33 AM Feature #8973: Add support for collecting usage information by namespace
We decided not to do this when designing namespaces because we wanted namespaces to scale independently of the size o... Sage Weil
08:49 AM Bug #8947 (Duplicate): Writing rados objects with max objects set for cache pool crashed osd
Oh, i see it now. This is a dup of #8982. Sage Weil
08:29 AM RADOS Support #8600: MON crashes on new crushmap injection
In addition to the choose vs. chooseleaf issue that Joao is mentioning here, we have also seen problems when min_size... Henning Stener
08:13 AM Bug #8966: ceph.conf "osd pool default size = 2" not working
Then the documentation (http://ceph.com/docs/master/start/quick-ceph-deploy/) on point 2 should be updated.... Christoph Pedro
07:58 AM RADOS Bug #8984 (Won't Fix): creating erasure-code pool when not having a root item default
When creating an EC pool:
> ceph osd pool create poolio 128 128 erasure profile15
It returns
> Error ENOENT: root ...
Kenneth Waegeman
07:46 AM Bug #8983 (Resolved): rados bench -b option does not take orders of magnitude (k,M,..) but also d...
When running this:
> rados -p <pool> bench 1000 write -t 10 -b 4M
It runs with -b 4 instead of expected
> rados -...
Kenneth Waegeman
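A small sketch of the kind of suffix handling the reporter expected from -b, assuming binary multiples (k = 1024, M = 1024**2, G = 1024**3). The helper parse_size is hypothetical and is not the rados tool's actual option parser.

```python
import re

# Assumed multipliers; the report does not say whether binary or
# decimal units were intended.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert a string such as '4M' or '4096' to a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*([kKmMgG]?)\s*", text)
    if match is None:
        # Reject unknown suffixes instead of silently dropping them.
        raise ValueError(f"unrecognized size: {text!r}")
    number, suffix = match.groups()
    return int(number) * _UNITS[suffix.lower()]
```

Rejecting an unrecognized suffix, rather than ignoring it, avoids the situation described above where 4M is quietly treated as 4.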
06:04 AM Bug #8601: erasure-code: default profile does not exist after upgrade
Apparently having an EC pool is still sufficient to prevent kernel clients from mounting, so I don't think we can bac... Greg Farnum
05:52 AM Bug #8601: erasure-code: default profile does not exist after upgrade
"firefly backport":https://github.com/ceph/ceph/pull/2178 Loïc Dachary
05:16 AM Bug #8601 (Pending Backport): erasure-code: default profile does not exist after upgrade
Loïc Dachary
02:53 AM Linux kernel client Bug #8979 (Resolved): GPF kernel panics - auth?
From James Eckersall, "GPF kernel panics" on ceph-users.
I've had a fun time with ceph this week.
We have a clust...
Ilya Dryomov

07/30/2014

10:59 PM CephFS Bug #8962: kcephfs: client does not release revoked cap
no clue what happened. please dump the mds cache when it happens next time Zheng Yan
07:01 AM CephFS Bug #8962: kcephfs: client does not release revoked cap
and the code that did it is in teuthology.git/teuthology/misc.py:... Sage Weil
07:00 AM CephFS Bug #8962: kcephfs: client does not release revoked cap
here is the final state of the directory:... Sage Weil
10:25 PM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
The deadlock-bad kernel showed the error after a few minutes of running multiple dd writes to rbd device. Here is one... Eric Eastman
11:33 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
All,
Can you try and confirm that deadlock-bad fails and deadlock-good works for you?
deadlock-bad:
http://g...
Ilya Dryomov
05:18 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
Update: At this point I'm almost certain this is not an rbd/ceph problem. Trying to track down the exact culprit. Ilya Dryomov
04:59 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
I can reproduce this with 100% certainty now on Trusty, 3.15.6-031506-generic.
Running:
bonnie++ -n 512
agai...
Karl Austin
09:57 PM Bug #8752 (New): firefly: scrub/repair stat mismatch
This problem manifests only on caching pools.
I have two EC pools with the following settings:...
Dmitry Smirnov
09:44 PM Bug #8229 (Closed): 0.80~rc1: OSD crash (domino effect)
Closing: nothing left to track here; did not have this problem with 0.80.4. Dmitry Smirnov
09:42 PM Bug #8978 (Can't reproduce): ceph ping not working as expected
Reading the doc: http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-mon/
I came across command: cep...
Eric Eastman
09:26 PM Bug #8977 (Can't reproduce): osd: didn't discard sub_op_reply from previous interval?
/a/teuthology-2014-07-29_02:30:02-rados-firefly-distro-basic-plana/384397
an op gets stuck in limbo because we are...
Sage Weil
08:54 PM rgw Bug #8586 (Pending Backport): Missing Swift API Header causes RadosGW to segfault
Sage Weil
05:57 PM devops Bug #8976: httpd on RHEL7 (RHEL repo) incompatible with mod_fastcgi (ceph repo)
Also, when trying to enable the httpd ceph pkg with systemctl:
systemctl enable httpd
httpd.service is not a nat...
Marcelo Giles
05:22 PM devops Bug #8976 (Resolved): httpd on RHEL7 (RHEL repo) incompatible with mod_fastcgi (ceph repo)
On a RHEL7 system
yum install httpd mod_fastcgi
systemctl start httpd
Apache fails to start with the folowin...
Marcelo Giles
05:12 PM Bug #8947 (Need More Info): Writing rados objects with max objects set for cache pool crashed osd
can you attach the complete logs? all three osds claim to have hit an assert, but the assert message isn't in the lo... Sage Weil
04:59 PM rbd Bug #8920 (Pending Backport): rbd/singleton/{all/formatted-output.yaml} fails on trusty due to wh...
Sage Weil
01:43 PM rbd Bug #8920 (Fix Under Review): rbd/singleton/{all/formatted-output.yaml} fails on trusty due to wh...
Sage Weil
04:36 PM Bug #8776: osd: runaway memory on dumpling
it's all here:... Sage Weil
02:49 PM Bug #8969 (Fix Under Review): PerfCounters.SinglePerfCounters failure on i386
Sage Weil
10:31 AM Bug #8969 (Resolved): PerfCounters.SinglePerfCounters failure on i386
[ RUN ] PerfCounters.SinglePerfCounters
test/perf_counters.cc:111: Failure
Value of: msg
Actual: "{"test_perfcount...
Sage Weil
02:29 PM Bug #8628 (Resolved): Bad ceph_osd_op.extent union access in ReplicatedPG::do_osd_ops
commit:58212b1245373b6f015cbff11844d33a900bf3cb Sage Weil
02:19 PM Bug #8628 (Rejected): Bad ceph_osd_op.extent union access in ReplicatedPG::do_osd_ops
ceph_osd_op_uses_extent(op.op) guards the references to the extent view of the union Sage Weil
02:13 PM Bug #8717: teuthology: valgrind leak checks broken for osd (at least)
Sage Weil
02:12 PM Bug #8717 (Resolved): teuthology: valgrind leak checks broken for osd (at least)
Sage Weil
02:12 PM Bug #8777 (Can't reproduce): osd/PGLog.h: 88: FAILED assert(rollback_info_trimmed_to_riter == log...
Sage Weil
02:11 PM Bug #8595: osd: client op blocks until backfill starts (dumpling)
Sage Weil
02:02 PM Bug #8595 (In Progress): osd: client op blocks until backfill starts (dumpling)
Sage Weil
01:59 PM Bug #8714 (Fix Under Review): we do not block old clients from breaking cache pools
https://github.com/ceph/ceph/pull/2172 Sage Weil
01:46 PM Bug #8974 (Can't reproduce): osd crashed with merge_log assert due to removal of isds
I also got the same asserts in one of the osds when I removed one osd from each node in a ceph cluster of 3 osd nodes ( 5 ... Sahana Lokeshappa
01:31 PM devops Bug #8850: ceph-deploy tests fail during tar due to file changed; incomplete shutdown?
an initial take on getting more information on what is going on:

https://github.com/ceph/teuthology/pull/302/files
Alfredo Deza
12:47 PM devops Bug #8850: ceph-deploy tests fail during tar due to file changed; incomplete shutdown?
I initially thought that the ceph daemon was still running but according to upstart docs, this output:... Alfredo Deza
11:53 AM Feature #8973 (New): Add support for collecting usage information by namespace
As of now there is no simple way to determine how much data is being used by a particular namespace. Customers curren... Tyler Brekke
11:36 AM rgw Bug #8972 (Resolved): rgw: bucket index log wrong object name in multipart completion
When completing a multipart upload operation, when removing the parts from the index the entries that are logged in t... Yehuda Sadeh
11:27 AM rgw Bug #8971 (Duplicate): rgw: s3 test failures with civetweb
teuthology logs are copied to ubuntu@mira023.front.sepia.ceph.com:/home/ubuntu/civetweb_s3
config.yaml:...
Tamilarasi muthamizhan
10:35 AM Bug #8970 (Won't Fix): Injectargs - inconsistent parsing of bool values
Hi all,
ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74) on Ubuntu 14.04 LTS
This is how I am able ...
Peter Vilhan
10:19 AM Feature #8960 (Fix Under Review): filestore: store backend type persisently
https://github.com/ceph/ceph/pull/2163 Sage Weil
10:17 AM Bug #8601: erasure-code: default profile does not exist after upgrade
"rebased and repushed":https://github.com/ceph/ceph/pull/1990 Loïc Dachary
09:37 AM Bug #8966 (Closed): ceph.conf "osd pool default size = 2" not working
the config option needs to go in the [global] section, not [default] (which is never used for anything) Sage Weil
04:31 AM Bug #8966: ceph.conf "osd pool default size = 2" not working
Recognized the failure with the command "ceph osd dump". There the pools always had size 3 (the default). Christoph Pedro
04:29 AM Bug #8966 (Closed): ceph.conf "osd pool default size = 2" not working
Version
ceph-deploy: 1.5.9
ceph 0.80.5
Ceph.config:...
Christoph Pedro
09:03 AM Documentation #8875: `ceph-deploy new` needs to be called for every node, not just the admin one
It appears I was able to get further this time, the steps are below.
Key difference is, when I did ceph-deploy new I...
Bobby Yakov
06:20 AM Documentation #8875: `ceph-deploy new` needs to be called for every node, not just the admin one
Hi Alfredo,
Nodes were cleaned out, will re-run install today and get you the log files.
In the mean time, it appea...
Bobby Yakov
06:17 AM Bug #8922: ceph-deploy mon create fails to create additional monitoring nodes.
ceph-deploy new cwtcph001
ceph-deploy install cwtcph001 cwtcph002 cwtcph003
ceph-deploy mon create cwtcph001 cwtcph...
Bobby Yakov
05:32 AM rbd Bug #8000: SLAB: Unable to allocate memory on node 0
Ilya Dryomov wrote:
> What do you mean by "I can't explain why only one machine is affected" above? Do you have oth...
Dmitry Smirnov
12:27 AM rbd Bug #8000: SLAB: Unable to allocate memory on node 0
What do you mean by "I can't explain why only one machine is affected" above? Do you have other similar boxes/setups... Ilya Dryomov
02:01 AM rgw Bug #8383: Upload part of one object passed with incorrect upload id or incorrect object id in S3...
Hi,sage,
Sure!
I use S3 API to do this test....
Jingjing Zhao
01:28 AM CephFS Bug #8961 (Won't Fix): du [directory] vs du -b [directory] size doubles
cephfs tracks recursive directory stats. A directory's size is space used by files underneath the directory. If you d... Zheng Yan

07/29/2014

09:41 PM rbd Bug #8000: SLAB: Unable to allocate memory on node 0
This problem remains very painful... Average frequency is one crash per day. Less than 24 hours ago I had two c... Dmitry Smirnov
09:38 PM Bug #8863: osd: second reservation rejection -> crash
I used this command to reimport the crushmap, but the OSD still crashes shaojun ruan
01:19 PM Bug #8863: osd: second reservation rejection -> crash
try this:
ceph osd getcrushmap -o cm
ceph osd setcrushmap -i cm
and then see if you can reproduce it after t...
Sage Weil
03:41 AM Bug #8863: osd: second reservation rejection -> crash
The OSD rejected the other OSD's backfill request twice, probably because the space is full; then the requesting OSD crashed shaojun ruan
03:27 AM Bug #8863: osd: second reservation rejection -> crash
*scenario:*
1. 3-replica
2. space is nearly full (some osd >96%)
We guess the reason is osd continuously receivi...
shaojun ruan
07:52 PM Bug #8886: Miss some folders in PG's folder
Hi, Samuel,
First, I correct my word " it should be stored in the DIR_3 at third level", actually it miss the DIR_...
Jingjing Zhao
01:43 PM Bug #8886: Miss some folders in PG's folder
Can you add a find . on that pg directory? Also, does this happen reliably? Also, on what version did you reproduce... Samuel Just
07:30 PM CephFS Bug #8962: kcephfs: client does not release revoked cap
I saw a similar hang a few weeks ago. In that case, all OSDs were down and the MDS couldn't submit log events. Zheng Yan
03:05 PM CephFS Bug #8962 (Resolved): kcephfs: client does not release revoked cap
several instances where the mds tries to revoke a cap (Ls and Fs have been observed so far) and the client doesn't re... Sage Weil
07:18 PM CephFS Bug #8964: kcephfs: client does not resend requests on mds restart
Zheng Yan
07:18 PM CephFS Bug #8964: kcephfs: client does not resend requests on mds restart
probably fixed by https://github.com/ceph/ceph-client/commit/967166011221589288348b893720d358150176b9 Zheng Yan
05:40 PM CephFS Bug #8964: kcephfs: client does not resend requests on mds restart
mds log and the client kern.log with debug cranked up:... Sage Weil
05:39 PM CephFS Bug #8964 (Resolved): kcephfs: client does not resend requests on mds restart
i have a bunch of hung requests,... Sage Weil
06:47 PM Feature #8965 (New): Improve threading for ObjectCacher
The ObjectCacher currently uses a single global lock for all state. Break this down to improve multithread performanc... Haomai Wang
03:55 PM Feature #8960: filestore: store backend type persisently
Sage Weil
10:27 AM Feature #8960 (Resolved): filestore: store backend type persisently
Sage Weil
03:32 PM rgw Bug #8586 (Fix Under Review): Missing Swift API Header causes RadosGW to segfault
Yehuda Sadeh
03:06 PM RADOS Bug #8963 (Resolved): erasure coding crush rulset breaks rbd kernel clients on non-ec pools on Ub...
On a fresh install using ceph-deploy on Ubuntu 14.04 creating any erasure coded pool breaks rbd clients on linux 3.13... Greg Dahlman
03:02 PM Bug #8726 (Resolved): (firefly command on dumpling issue?) Error "'adjust-ulimits ceph-coverage /...
commit:fcc0b2451b47793a64fc4cd4675fef667a4a5b45 in ceph-qa-suite.git Josh Durgin
02:31 PM Bug #8628: Bad ceph_osd_op.extent union access in ReplicatedPG::do_osd_ops
This was fixed in 58212b1. Adam Crume
02:28 PM devops Bug #6091 (Won't Fix): centos build should use redhat-rpm-config for debuginfo packages
Sage Weil
02:28 PM devops Bug #5819 (Won't Fix): redhat-rpm-config package needed for debuginfo packages
Sage Weil
02:26 PM devops Bug #7181 (Rejected): debian 7 wheezy init.d script will not start OSDs not corresponding to a mo...
touch /var/lib/ceph/osd/*/sysvinit Sage Weil
02:26 PM devops Bug #6937 (Resolved): udev: OSD using dmcrypt aren't automatically started
Sage Weil
02:25 PM devops Bug #6453 (Won't Fix): libapache2-mod-fastcgi Packages for Debian Squeeze have incorrect dependen...
Sage Weil
02:25 PM devops Bug #6158: selective sync of ceph precise dependencies from havana cloud archive
Note: Talk to neil about this. Sandon Van Ness
02:22 PM devops Bug #8602 (Rejected): ceph fedora package is missing erasure code libraries
redoing (redid?) these packages Sage Weil
02:22 PM Bug #8711 (Resolved): Error "ceph --format=json-pretty osd lspools" is "unrecognized command" in ...

Oops, this should have been closed already...
John Spray
01:51 PM Bug #8711: Error "ceph --format=json-pretty osd lspools" is "unrecognized command" in cuttlefish
Probably best to change the test to cope? Samuel Just
02:21 PM devops Bug #7598 (Can't reproduce): ceph-disk-activate error with ceph-deploy
Sage Weil
02:19 PM devops Bug #8581 (Can't reproduce): DNS issues when resolving hosts
Sage Weil
02:17 PM devops Bug #8734: EPEL / Ceph.com package priority issues
ceph-deploy sets the priority; other users will need to do so themselves.
perhaps that can be mentioned in the doc...
Sage Weil
02:15 PM devops Bug #5283 (Won't Fix): Ceph-deploy can't handle /dev/disk/by-* device paths
Sage Weil
02:06 PM devops Bug #7627 (Resolved): ceph-disk: does not start daemons properly under systemd
commit:3e0d9800767018625f0e7d797c812aa44c426dab Sage Weil
02:01 PM Documentation #8875: `ceph-deploy new` needs to be called for every node, not just the admin one
Can you paste the whole output of ceph-deploy? Alfredo Deza
01:58 PM Bug #6141 (Can't reproduce): OSDs crash on recovery
Samuel Just
01:52 PM Bug #8673 (Resolved): s3tests.functional.test_s3.test_multipart_upload failed in teuthology-2014-...
Sage Weil
01:50 PM Bug #8654 (Resolved): Parsing /etc/lsb-release for OSD metadata is not portable
Sage Weil
01:49 PM Bug #8644 (Rejected): 624ae21833 breaks ceph-disk
Sage Weil
01:48 PM Bug #8852 (Won't Fix): submodules not cecking out the right branch, jerasure does not compile
workaround is to remove the dir then rerun the submodule command. we blame git! Sage Weil
01:47 PM Bug #8801 (Can't reproduce): Ceph monitors do not start after server restart
from the logs the ceph-mon process was never started.. I would look in your /var/log/upstart logs? Sage Weil
01:37 PM Bug #8943 (Pending Backport): "ceph df" cannot show pool available space correctly
commit:04d0526718ccfc220b4fe0c9046ac58899d9dafc Sage Weil
01:34 PM Bug #8495 (Duplicate): osd: bad state machine event on backfill request
Sage Weil
01:29 PM Bug #8694 (Duplicate): OSD crashed (assertion failure) at FileStore::_collection_move_rename
#8733 Sage Weil
01:28 PM rgw Bug #8676: md5sum check failed during readwrite.py
I don't see anything wrong in the logs other than this:... Yehuda Sadeh
01:27 PM Bug #8753: PG::activate assert failed when recover finished
Has this happened since? Samuel Just
01:26 PM Bug #8865: cep osd setmaxosd doesn't check if osds exist
agreed Samuel Just
01:26 PM Bug #8752 (Can't reproduce): firefly: scrub/repair stat mismatch
Sage Weil
01:25 PM Bug #8752 (Resolved): firefly: scrub/repair stat mismatch
Sage Weil
01:06 PM CephFS Bug #8961 (Won't Fix): du [directory] vs du -b [directory] size doubles
Under cephfs using the kernel client, du -b shows an incorrect size.
I've also found that du --apparent-size shows...
Matt Hook
01:04 PM Bug #8717 (In Progress): teuthology: valgrind leak checks broken for osd (at least)
Sage Weil
01:03 PM Bug #8717 (Resolved): teuthology: valgrind leak checks broken for osd (at least)
Sage Weil
01:03 PM Bug #8926 (Resolved): osd: invalid Message* deref in C_SendMap
Sage Weil
01:03 PM Bug #8924 (Resolved): osd: leaking local_connection under valgrind
Sage Weil
12:59 PM Messengers Bug #8880: msg/Pipe.cc: 1538: FAILED assert(0 == "old msgs despite reconnect_seq feature")
Sage Weil
10:42 AM rgw Bug #8632 (Resolved): rgw: bucket listing with delimiter doesn't scale well
backported to dumpling commit:9604425b86f5839a109faa1f396b0d114e9b9391 Yehuda Sadeh
09:36 AM rgw Bug #8632 (Pending Backport): rgw: bucket listing with delimiter doesn't scale well
in firefly, not dumpling yet Sage Weil
10:31 AM rgw Bug #8846 (Resolved): radosgw on 0.80.4 crashes when doing a multi-part upload
Yehuda Sadeh
10:11 AM Bug #8532 (Can't reproduce): 0.80.1: OSD crash (domino effect), same as BUG #8229
Let us know if anything interesting comes up. Samuel Just
10:10 AM Bug #8229: 0.80~rc1: OSD crash (domino effect)
This bug described a whole bunch of unrelated problems, can you open a fresh bug? Samuel Just
10:01 AM Bug #8959: osd crashed in upgrade:dumpling-x-firefly---basic-vps suite
this sounds a bit like a problem we had a while back with hung IOs from the VMs? Sage Weil
08:40 AM Bug #8959: osd crashed in upgrade:dumpling-x-firefly---basic-vps suite
Seems the same crash in another tests, logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-07-28_11:48:15... Yuri Weinstein
08:36 AM Bug #8959 (Can't reproduce): osd crashed in upgrade:dumpling-x-firefly---basic-vps suite
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-07-28_11:48:15-upgrade:dumpling-x-firefly---basic-vps... Yuri Weinstein
09:41 AM CephFS Bug #8574: teuthology: NFS mounts on trusty are failing
I'm not sure if this is a different issue or a different system:... Greg Farnum
09:40 AM devops Support #8861: Deploying additional monitors fails.
I am also seeing this error when trying to add a new monitor. Same version of Ubuntu and Ceph. James Devine
09:38 AM rgw Bug #8735 (Can't reproduce): TestAccountNoContainers fail in Firefly upgrade:firefly-x:stress-split
Sage Weil
09:38 AM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
Sage Weil
09:37 AM rgw Bug #8848 (Resolved): "adjust-ulimits: command not found" in upgrade:firefly-firefly-testing-basi...
Sage Weil
09:37 AM rgw Bug #8847 (Can't reproduce): "Error initializing cluster client" in upgrade:firefly-firefly-testi...
Sage Weil
09:34 AM Bug #8921 (Won't Fix): ceph pg dump <{summary|sum|delta|pools|osds|pgs|pgs_brief}> only work corr...
Sage Weil
09:33 AM rgw Bug #8864: radosgw help doesn't seem to display some debug options
there are others that we could add Sage Weil
09:32 AM rgw Bug #8864 (Resolved): radosgw help doesn't seem to display some debug options
Sage Weil
09:32 AM rgw Bug #6911 (Won't Fix): rgw test failure on the arm set up
Sage Weil
09:31 AM rgw Bug #8111 (Need More Info): /etc/init.d/ceph-radosgw for RHEL needs QA
isn't it /etc/init.d/radosgw?
Sage Weil
09:30 AM rgw Bug #8383 (Need More Info): Upload part of one object passed with incorrect upload id or incorrec...
Can you provide more detailed steps to reproduce? ideally, a new test in s3-tests.... :) Sage Weil
09:29 AM rgw Bug #7799 (Can't reproduce): Errors in upgrade:dumpling-x:stress-split-firefly---basic-plana suite
Sage Weil
09:25 AM rgw Bug #8311 (Resolved): No pool name error in ubuntu-2014-05-06_21:02:54-upgrade:dumpling-dumpling-...
Sage Weil
09:25 AM rgw Bug #8784: rgw: completion leak
Sage Weil
09:23 AM rbd Bug #6695 (Won't Fix): Upgrade rbd failure in nightly tests. (mkdir --p ..)
Sage Weil
09:22 AM rbd Bug #5480 (Can't reproduce): libceph: unexpected old state in con_sock_state_change
Sage Weil
09:21 AM rbd Bug #8845: Flattening Clones of clone, results in command failure
fsx is now able to catch this one. Ilya Dryomov
09:19 AM rbd Bug #8845: Flattening Clones of clone, results in command failure
Josh Durgin
09:15 AM rbd Bug #8845: Flattening Clones of clone, results in command failure
Ilya Dryomov
09:21 AM rbd Bug #7693: virsh domblkinfo fails with 'Bad file descriptor'
https://bugzilla.redhat.com/show_bug.cgi?id=1124508 Sage Weil
09:17 AM rbd Bug #7620 (Can't reproduce): BUG: soft lockup - CPU#0 stuck for 23s!
Sage Weil
09:15 AM Linux kernel client Bug #8568 (New): libceph: kernel BUG at net/ceph/osd_client.c:885
Ilya Dryomov
09:10 AM Linux kernel client Bug #8568: libceph: kernel BUG at net/ceph/osd_client.c:885
Ilya Dryomov
09:14 AM rbd Bug #8709: stale size reported by ioctl(BLKGETSIZE64) after librbd_resize() returns
The problem has been traced to http://tracker.ceph.com/issues/8806. Keeping this around to re-test after it gets fixed. Ilya Dryomov
09:11 AM Bug #8439 (Won't Fix): ceph-osd crashing often
see 0.80.x Sage Weil
09:10 AM Bug #8445 (Won't Fix): osd not starting anymore
0.78 had lots of issues; see 0.80.x Sage Weil
09:01 AM rbd Bug #8318 (Can't reproduce): "rbd: create error" in upgrade:dumpling-dumpling-testing-basic-plana...
Sage Weil
09:01 AM rbd Bug #8715 (Can't reproduce): "ceph_test_librbd_fsx: invalid option -- 'h'" error in teuthology-20...
Sage Weil
06:57 AM CephFS Feature #7759 (Resolved): journal-tool: roll in resetter/dumper from MDS
... John Spray
06:56 AM CephFS Feature #7761 (Resolved): journal-tool: forwards-search through corrupt regions
... John Spray
06:55 AM CephFS Feature #7763: journal-tool: import
... John Spray
06:54 AM CephFS Feature #7763 (Resolved): journal-tool: import
This was done when undump was merged into cephfs-journal-tool John Spray
06:51 AM CephFS Bug #8773 (Resolved): failing cephfs set_layout tests
Test is retired and unsafe behaviour (data pool default to 0) is disabled in master. John Spray
06:07 AM CephFS Bug #8811 (Resolved): Journal corruption during upgrade to 0.82 with standby-replay daemons
This got fixed 11 days ago, but was never marked closed. Merged in commit:b9463e3497cc1f2a1bab0838430a4402d8c88af0 Greg Farnum
05:59 AM Bug #8932 (Resolved): rados api test hang on HitSetWrite
Merged to master in commit:37eba045ec78f2ea8f9000c6b158e20808d29fb2 Greg Farnum
05:56 AM Bug #8931 (Pending Backport): failed write reply order from ceph_test_rados
Merged to master in commit:050ac87530c2637f097e07b5373115721303f07c Greg Farnum

07/28/2014

10:47 PM Bug #8944: Ceph daemon bad asok used in connection with cluster
wip-8944 created, but gitbuilders are having enough problems I'm not submitting a PR yet Dan Mick
02:11 PM Bug #8944 (Fix Under Review): Ceph daemon bad asok used in connection with cluster
Adding the global args to the invocation of ceph-conf seems to resolve this. Dan Mick
12:41 PM Bug #8944: Ceph daemon bad asok used in connection with cluster
oh....because --cluster on the cli ... yeah.
Dan Mick
12:40 PM Bug #8944: Ceph daemon bad asok used in connection with cluster
ceph uses ceph-conf --show-config-value admin_socket -n <name> and believes it; wonder why that's not working? Dan Mick
09:58 AM Bug #8944: Ceph daemon bad asok used in connection with cluster
Sage Weil
05:01 AM Bug #8944 (Resolved): Ceph daemon bad asok used in connection with cluster
Using @ceph --cluster clustername daemon mon.host1 config@ causes ... Szymon Zacher
10:46 PM Bug #8947: Writing rados objects with max objects set for cache pool crashed osd
Uploading crash dump Mallikarjun Biradar
01:45 PM Bug #8947: Writing rados objects with max objects set for cache pool crashed osd
Could not reproduce using vstart.sh on current master branch. I never saw a crash or bug report with that stack trace. David Zafman
10:08 AM Bug #8947: Writing rados objects with max objects set for cache pool crashed osd
I don't remember the details, but we were previously crashing with a 10-object limit anyway due to hit sets and such.... Greg Farnum
08:16 AM Bug #8947: Writing rados objects with max objects set for cache pool crashed osd
Test configuration:
No of osd nodes: 3
No of osd's : 4
No of monitors: 2
Kernel versions: 3.13.0-24-generic
No o...
Mallikarjun Biradar
08:15 AM Bug #8947 (Duplicate): Writing rados objects with max objects set for cache pool crashed osd
Setting target_max_objects parameter and writing rados object onto cache pool crashed osd.
History of operations o...
Mallikarjun Biradar
06:41 PM Messengers Bug #8880 (Fix Under Review): msg/Pipe.cc: 1538: FAILED assert(0 == "old msgs despite reconnect_s...
New patches to split up the code more, as requested. :) Greg Farnum
10:56 AM Messengers Bug #8880 (In Progress): msg/Pipe.cc: 1538: FAILED assert(0 == "old msgs despite reconnect_seq fe...
Greg Farnum
02:12 PM rgw Bug #8937 (Fix Under Review): rgw: broken large(-ish) objects
Yehuda Sadeh
02:10 PM rgw Feature #7774 (Resolved): rgw: cache decoded user and bucket info
This one was merged a while ago, at commit:82c547952dc9e7a3e9fab1264f5fdd903ab6973e. Yehuda Sadeh
01:07 PM Bug #8941 (Can't reproduce): DaemonConfig.SubstitutionLoop unit test goes haywire
nevermind, most recent occurrence was feb, so ignoring this. Sage Weil
01:02 PM rgw Feature #8956 (Resolved): rgw: support bucket notification
Yehuda Sadeh
11:32 AM Documentation #8955: doc refers to [default] section, don't think it exists
http://ceph.com/docs/master/start/quick-ceph-deploy/#create-a-cluster refers to the [default] section in the ceph.con... Dan Mick
11:31 AM Documentation #8955 (Resolved): doc refers to [default] section, don't think it exists
Dan Mick
09:21 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
I'm pretty sure it's the disabled lockdep that affects this. Our testing kernel is built with lockdep enabled, Ubunt... Ilya Dryomov
08:50 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
Hi Ilya,
I can reliably reproduce the error when running this generic kernel with no changes:
http://kernel.ubu...
Greg Wilson
08:39 AM Bug #8935: operations not idempotent when enabling cache
I think you're right that a per-object log would be needed to solve this problem — and I think that means we shouldn'... Greg Farnum
08:02 AM rgw Feature #8945 (Resolved): rgw: support swift /info api
Yehuda Sadeh
06:55 AM Bug #8938 (Resolved): OSD memory leak seen with fs-master-testing-basic/kernel_untar_build.sh
This was fixed at about the same time:... John Spray
06:42 AM CephFS Feature #7810 (In Progress): libcephfs: add a test that freezes + unfreezes a client, and then ve...
John Spray
05:27 AM Bug #8895: ceph osd pool stats (displayed incorrect values)
Negative and undefined values in object counts:
*-5/0 objects degraded (-inf%)*
*-32/12 objects degraded (-266...
Andrey Matyashov
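The two quoted lines are what an unguarded ratio gives in IEEE floating point: a zero total makes the percentage infinite, and a negative counter makes it negative. Below is a minimal Python sketch of a guarded calculation. `degraded_percent` is a hypothetical helper for illustration, not Ceph's code, and it does not address why the counters went negative in the first place.

```python
from typing import Optional


def degraded_percent(degraded: int, total: int) -> Optional[float]:
    """Return degraded/total as a percentage, or None when the inputs make no sense."""
    # A zero (or negative) total, or a negative degraded count, cannot yield
    # a meaningful percentage, so report "unknown" instead of -inf or -266%.
    if total <= 0 or degraded < 0:
        return None
    return 100.0 * degraded / total


# The first two pairs are the counts quoted in the report; the third is a sane case.
for degraded, total in [(-5, 0), (-32, 12), (3, 12)]:
    print(degraded, total, degraded_percent(degraded, total))
```

Returning `None` rather than clamping to 0% keeps a broken counter visible to whoever reads the output.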
03:06 AM rgw Bug #8864: radosgw help doesn't seem to display some debug options
This should be closed with #8112 Abhishek Lekshmanan
02:48 AM Bug #8943 (Resolved): "ceph df" cannot show pool available space correctly
Currently, when a user has 2 pools with different rulesets and different roots, basically they will use differen... Xiaoxi Chen
12:37 AM Bug #8863: osd: second reservation rejection -> crash
Last week we created a new cluster (all components use v0.80.4), continuously writing data until space is full, the... shaojun ruan

07/27/2014

11:45 PM Bug #8942 (Resolved): Bad JSON output in ceph osd tree
Hi,
JSON output for @ceph osd tree@ has a bad format for the stray array: every osd is printed in the same array element....
Szymon Zacher
10:41 PM Bug #8941 (Can't reproduce): DaemonConfig.SubstitutionLoop unit test goes haywire
... Sage Weil
10:31 PM Bug #8822: osd: hang on shutdown, spinlocks
saw this again, ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-07-27_02:30:01-rados-next-testing-basi... Sage Weil
10:28 PM Bug #8396: osd: message delayed in Session misdirected after split
very likely another instance, but i didn't look closely.... Sage Weil
10:20 PM Bug #8940 (Duplicate): 3.22s1 shard 0(2) missing ad166f62/benchmark_data_plana57_30491_object1036...
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-07-27_02:30:01-rados-next-testing-basic-plana/380335
...
Sage Weil
09:47 PM Bug #6003: journal Unable to read past sequence 406 ...
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-07-27_02:30:01-rados-next-testing-basic-plana/380261
...
Sage Weil
02:32 PM Bug #8758: PGs get stuck in “replay”, but drop it upon osd restarts
As for the issue of losing replay states upon member osd restarts... Could the fix be as simple as not setting inter... Alexandre Oliva
01:44 PM Bug #8758: PGs get stuck in “replay”, but drop it upon osd restarts
Here's a patch that addresses the “stuck in replay” problem (but not the “replay is dropped after osd re-peering” one). Alexandre Oliva
11:21 AM Bug #8863 (Need More Info): osd: second reservation rejection -> crash
Sage Weil
11:20 AM Bug #8922 (Need More Info): ceph-deploy mon create fails to create additional monitoring nodes.
It sounds like the monitor names don't match the host names or something similar. Can you post the full sequence of ... Sage Weil

07/26/2014

10:14 PM Bug #8939 (In Progress): stalled LibRadosTwoPoolsPP.TryFlushReadRace; client failed to reconnect?
Sage Weil
10:10 PM Bug #8939: stalled LibRadosTwoPoolsPP.TryFlushReadRace; client failed to reconnect?
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-07-25_22:40:14-rados-wip-sage-testing-testing-basic-plana/37... Sage Weil
10:05 PM Bug #8939 (Duplicate): stalled LibRadosTwoPoolsPP.TryFlushReadRace; client failed to reconnect?
it appears the OSD was behaving properly, but things stalled because one of the stat replies got... Sage Weil
02:06 PM Bug #8938 (Resolved): OSD memory leak seen with fs-master-testing-basic/kernel_untar_build.sh

http://pulpito.front.sepia.ceph.com/teuthology-2014-07-25_23:04:01-fs-master-testing-basic-plana/378947/
Initial...
John Spray

07/25/2014

09:38 PM Bug #8931 (Fix Under Review): failed write reply order from ceph_test_rados
Sage Weil
02:46 PM Bug #8931: failed write reply order from ceph_test_rados
Sage Weil
02:16 PM Bug #8931 (In Progress): failed write reply order from ceph_test_rados
- writeback mode
- write 1 received
- put on full list
- mode changes to forward
- write 2 received
- forwarde...
Sage Weil
11:13 AM Bug #8931 (Resolved): failed write reply order from ceph_test_rados
... Sage Weil
09:37 PM Bug #8924 (Fix Under Review): osd: leaking local_connection under valgrind
Sage Weil
06:14 PM Bug #8924: osd: leaking local_connection under valgrind
https://github.com/ceph/ceph/pull/2148 Sage Weil
06:14 PM Bug #8924 (Fix Under Review): osd: leaking local_connection under valgrind
Sage Weil
08:21 PM rgw Bug #8937 (Resolved): rgw: broken large(-ish) objects
In current master, following fixes for #8442 and #8928. Specifically the latter triggers the issue. Objects end up wi... Yehuda Sadeh
06:14 PM Bug #8926 (Fix Under Review): osd: invalid Message* deref in C_SendMap
Sage Weil
10:29 AM Bug #8926: osd: invalid Message* deref in C_SendMap
wip-osd-leaks Sage Weil
09:17 AM Bug #8926 (Resolved): osd: invalid Message* deref in C_SendMap
ubuntu@teuthology:/a/sage-osd-leaks-a/377521
<error>
<unique>0x2</unique>
<tid>41</tid>
<kind>InvalidRead...
Sage Weil
06:14 PM Bug #8717 (Fix Under Review): teuthology: valgrind leak checks broken for osd (at least)
Sage Weil
05:42 PM rgw Bug #8676: md5sum check failed during readwrite.py
ubuntu@teuthology:/a/sage-2014-07-25_15:29:13-rgw-wip-msgr-testing-basic-plana/377978 Sage Weil
05:40 PM rgw Bug #8784: rgw: completion leak
ubuntu@teuthology:/a/sage-2014-07-25_15:29:13-rgw-wip-msgr-testing-basic-plana/378011
so this is in master, too
Sage Weil
04:13 PM Bug #8935: operations not idempotent when enabling cache
if we have both the pg log and an op list in object_info_t, then we can have a rados op that returns the 'recent reqi... Sage Weil
04:05 PM Bug #8935 (Resolved): operations not idempotent when enabling cache
consider:
- no cache
- send delete to base tier
- base does delete, replies
- mon set overlay
- client resends...
Sage Weil
03:21 PM rbd Bug #8821 (Resolved): rbd: ceph.conf "rbd default format" woes
Sage Weil
01:50 PM Bug #8932 (Fix Under Review): rados api test hang on HitSetWrite
https://github.com/ceph/ceph/pull/2146 Sage Weil
11:28 AM Bug #8932 (Resolved): rados api test hang on HitSetWrite
http://pulpito.ceph.com/sage-2014-07-24_11:53:12-rados-master-testing-basic-plana/376753/ Sage Weil
11:35 AM rgw Bug #8928 (Pending Backport): rgw: bad object created if stripe size is not a multiple of chunk size
Josh Durgin
10:35 AM rgw Bug #8928 (Resolved): rgw: bad object created if stripe size is not a multiple of chunk size
Yehuda Sadeh
11:34 AM rgw Bug #8442 (Pending Backport): rgw: does not detect/adapt to erasure pool stripe size
Josh Durgin
11:11 AM Bug #8930 (Resolved): osd: test unable to produce unfound objects
http://pulpito.ceph.com/sage-2014-07-24_11:53:12-rados-master-testing-basic-plana/376343/
http://pulpito.ceph.com/sa...
Sage Weil
11:01 AM Bug #8396: osd: message delayed in Session misdirected after split
ubuntu@teuthology:/a/sage-2014-07-24_11:53:12-rados-master-testing-basic-plana/376677 Sage Weil
10:53 AM rgw Feature #8929 (New): rgw:support bucket lifecycle
Yehuda Sadeh
10:34 AM RADOS Support #8600: MON crashes on new crushmap injection
JC, although we don't have a fix for the crash yet (we shouldn't crash if a crushmap is incorrectly structured), ther... Joao Eduardo Luis
10:33 AM Bug #8885 (Need More Info): SIGABRT in TrackedOp::dump() via dump_ops_in_flight()
let's wait for this to trigger on master or next Sage Weil
10:32 AM Bug #8882 (Pending Backport): osd: osd tier remove ... leaves incomplete clones behind, confusing...
Sage Weil
08:47 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
Hi Ilya,
We used Ubuntu standard kernel as well, specifically:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v...
Xavier Trilla
08:31 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
OK, so lockdep is disabled. Since you are able to reproduce it so reliably, can you try our testing kernel? The pac... Ilya Dryomov
07:52 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
We are using the standard kernels as they come from the ubuntu site. That being said when I look in the config files... Greg Wilson
12:29 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
I have been trying to reproduce this with both parallel dds and the attached script, but so far no luck. We've seen ... Ilya Dryomov

07/24/2014

07:04 PM Bug #8891 (Need More Info): rados bench hang during thrashing
now that the logging is there we wait for it to happen again... Sage Weil
03:30 PM Bug #8891 (Resolved): rados bench hang during thrashing
added debug messages to radosbench.yaml
commit 367d4da083ea47b1de9201bbda943e57617f6701
also cherry-picked to c...
Tamilarasi muthamizhan
06:59 PM Bug #8924 (Resolved): osd: leaking local_connection under valgrind
teuthology-2014-07-24_12:21:43-rados:verify-master-testing-basic-plana/376768
and several others in that run
Sage Weil
06:58 PM Bug #8890 (Resolved): osd: very slow valgrind+rados/test.sh+thrashing run
doubled the timeout in the qa suite, master, next and firefly branches.
i think what changed is all of the new EC ...
Sage Weil
03:49 PM Bug #8717: teuthology: valgrind leak checks broken for osd (at least)
scheduled teuthology run for rados/verify suite with ceph-qa-suite_branch=wip-leaks and ceph_branch=next,
looks li...
Tamilarasi muthamizhan
12:36 PM Bug #8922 (Can't reproduce): ceph-deploy mon create fails to create additional monitoring nodes.
Hi guys,
Please help.
Running into issue adding monitors. The initial monitor gets created successfully, but get ...
Bobby Yakov
12:18 PM Bug #8921: ceph pg dump <{summary|sum|delta|pools|osds|pgs|pgs_brief}> only work correctly as json
Source: ZD #1671 Tyler Brekke
12:16 PM Bug #8921 (Won't Fix): ceph pg dump <{summary|sum|delta|pools|osds|pgs|pgs_brief}> only work corr...
When ceph pg dump is run with an argument without specifying json, the normal output from ceph pg dump is returned.
...
Tyler Brekke
12:00 PM Bug #8863: osd: second reservation rejection -> crash
Can you attach the full osd log please? thanks!
also, are all mons upgraded to 0.80.4 at this point? and were tun...
Sage Weil
11:10 AM Feature #7988 (In Progress): Logs: Log every administrative action taken by a user
Joao Eduardo Luis
10:59 AM Fix #8914: osd crashed at assert ReplicatedBackend::build_push_op
Did you mean to add some comments on that second post, Dhiraj?
Anyway, this looks like a kind of disk corruption w...
Greg Farnum
03:31 AM Fix #8914: osd crashed at assert ReplicatedBackend::build_push_op
Dhiraj Kamble wrote:
> This SEGV is caused by a NULL omap iterator.
> This can happen if the object belonging t...
Dhiraj Kamble
01:46 AM Fix #8914: osd crashed at assert ReplicatedBackend::build_push_op
This SEGV is caused by a NULL omap iterator.
This can happen if the object belonging to the Primary OSD is missi...
Dhiraj Kamble
01:32 AM Fix #8914 (Resolved): osd crashed at assert ReplicatedBackend::build_push_op
h3. Steps to reproduce... Sahana Lokeshappa
10:56 AM rbd Bug #8920 (Resolved): rbd/singleton/{all/formatted-output.yaml} fails on trusty due to whitespace
ubuntu@teuthology:/a/teuthology-2014-07-23_23:00:01-rbd-next-testing-basic-plana/375973
the cram diff complains ab...
Sage Weil
10:55 AM rbd Bug #8919 (Resolved): qemu-iotests fails to find common.env
ubuntu@teuthology:/a/teuthology-2014-07-23_23:00:01-rbd-next-testing-basic-plana/375974
note this is trusty ...
...
Sage Weil
10:41 AM Support #8915 (Closed): Ceph Firefly 0.80.4 : health HEALTH_WARN pool volumes has too few pgs; cr...
These topics are pretty well-covered in the release notes. Please refer to those, and if you have any further questio... Greg Farnum
02:42 AM Support #8915 (Closed): Ceph Firefly 0.80.4 : health HEALTH_WARN pool volumes has too few pgs; cr...
Hello Ceph Developers
Recently I upgraded from dumpling to Firefly stable release 0.80.4. As soon as upgrad...
karan singh
10:29 AM Bug #8346: OSD crashes on master (FAILED assert(ip_op.waiting_for_commit.count(from)))
Yeah, looks like a dup of #8887 as well. Greg Farnum
10:20 AM CephFS Documentation #8918 (Resolved): kclient: known working kernels
A request has been made to document known working kernels on the docs, here:
http://ceph.com/docs/master/start/os-...
JuanJose Galvez
10:12 AM Bug #8895: ceph osd pool stats (displayed incorrect values)
Which part of the stats do you think are incorrect? You've got 7*1TB+2TB+500GB, which sounds like ~8339GB to me (give... Greg Farnum
09:48 AM rgw Bug #8311: No pool name error in ubuntu-2014-05-06_21:02:54-upgrade:dumpling-dumpling-testing-bas...
Yuri Weinstein
09:45 AM Documentation #8875: `ceph-deploy new` needs to be called for every node, not just the admin one
Any update? Still waiting Bobby Yakov
06:03 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
We also confirmed the issue is with 3.15 and above, as we back rev'd our system to 3.14.13 and were able to successfully c... Greg Wilson
01:17 AM Bug #8121: ReplicatedBackend::build_push_op() should handle a short read or assert
The second issue reported is not the same as the original one.
It is a SEGV due to NULL omap iterator. Please open anoth...
Dhiraj Kamble

07/23/2014

11:14 PM Bug #8863: osd: second reservation rejection -> crash
We've tracked this problem and caught some logs:
==================================================
core?
(gdb) p e...
shaojun ruan
07:11 PM Bug #8701 (Pending Backport): osd: scrub found obsolete rollback obj
Sage Weil
07:10 PM Bug #8889 (Pending Backport): osd/ReplicatedPG.cc: 5162: FAILED assert(got)
Sage Weil
06:24 PM Bug #8882 (Fix Under Review): osd: osd tier remove ... leaves incomplete clones behind, confusing...
Sage Weil
06:05 PM Bug #8882: osd: osd tier remove ... leaves incomplete clones behind, confusing scrub
This is a stupid test with tiering teardown.
- set up cache pool, write a bunch of stuff
- some objects in the ca...
Sage Weil
06:06 PM Bug #8881 (Duplicate): scrub 85.0 cf2b2318/foo15/3/test-rados-api-plana35-13313-11/85 expected cl...
same as #8882 Sage Weil
04:19 PM Bug #8884 (Can't reproduce): osd/OSD.cc: 6317: FAILED assert(p->second.empty()) in consume_map()
gonna wait to see this on a non-testing branch Sage Weil
03:28 PM Bug #8891: rados bench hang during thrashing
... Tamilarasi muthamizhan
03:04 PM Bug #8909 (Duplicate): LibRadosListECPP.ListObjectsManyPP hang on firefly
osd crashed with weak_refs assert (#7995). Sage Weil
11:12 AM Bug #8909 (Duplicate): LibRadosListECPP.ListObjectsManyPP hang on firefly
ubuntu@teuthology:/a/teuthology-2014-07-22_02:30:01-rados-firefly-distro-basic-plana/374351 Sage Weil
03:00 PM Bug #7891 (Resolved): osd: leaked pg refs on shutdown
Sage Weil
01:53 PM rbd Bug #8912 (Resolved): librbd segfaults when creating new image (rbd-ephemeral-clone-stable-icehouse)
*Background*:
Installed openstack 2014.1.1 on Ubuntu 14.04 ( apt-get update & upgrade)
- glance with ceph
- nova ...
Zollner Robert
01:52 PM rgw Feature #8911: RGW doesn't return 'x-timestamp' in header which is used by 'View Details' of Open...
Just to be clear, Swift does return this header and the OpenStack devs advise they only test against the Swift storag... Michael Kidd
01:49 PM rgw Feature #8911 (Resolved): RGW doesn't return 'x-timestamp' in header which is used by 'View Detai...
Because RGW doesn't provide an 'x-timestamp' header in its reply to OpenStack, the 'View Details' feature fails as no... Michael Kidd
01:36 PM devops Feature #8871 (Resolved): modify ceph-deploy to only install repo file and not install packages
closed this in master with 68f6a6e Alfredo Deza
06:59 AM devops Feature #8871 (Fix Under Review): modify ceph-deploy to only install repo file and not install pa...
Pull Request opened https://github.com/ceph/ceph-deploy/pull/222 Alfredo Deza
01:14 PM Bug #8910 (Can't reproduce): ceph_test_objectstore: ObjectStore/StoreTest.ManyObjectTest/0 failur...
totally perplexed by this. the previous test deletes the collection. :/
haven't been able to reproduce.
Sage Weil
12:11 PM Bug #8910 (Duplicate): ceph_test_objectstore: ObjectStore/StoreTest.ManyObjectTest/0 failure on f...
... Sage Weil
01:09 PM devops Bug #8330: repodata on rpm repos do not list latest ceph-deploy (1.5.2)
As far as I can tell this appears to be something weird with yum where it's not listing the packages if it's already in... Sandon Van Ness
12:11 PM Bug #7986: 3.1s0 scrub stat mismatch, got 2041/2044 objects, 0/0 clones, 2041/2044 dirty, 0/0
ubuntu@teuthology:/a/teuthology-2014-07-22_02:30:01-rados-firefly-distro-basic-plana/374223 Sage Weil
10:10 AM rgw Bug #8858 (Resolved): NextMarker, Prefix missing from bucket list results
Josh Durgin
04:58 AM Bug #8907: All user traffics will start to get 500 after some time if (m+1) OSDs of one EC PG are...
The problem, as David explained, is that many OPs are stuck on the OSD side and in turn hang the thread at radosgw... Guang Yang
04:00 AM Bug #8907 (Duplicate): All user traffics will start to get 500 after some time if (m+1) OSDs of o...
EC pool configuration: k=8, m=3
Steps to reproduce:
1. stop 4 OSDs of one EC PG (down), so this PG can't be wri...
Zhi Zhang
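The reproduction above follows from how erasure coding works: an object split into k data shards and m coding shards can be rebuilt from any k of them, so losing m+1 shards leaves too few. Below is a minimal Python sketch of that arithmetic. The function names are hypothetical, and the sketch assumes one shard per OSD and ignores the pool's min_size setting, which can block IO earlier in a real cluster.

```python
def readable_shards(k: int, m: int, osds_down: int) -> int:
    """Shards still available for a PG whose k + m shards each live on a distinct OSD."""
    return max(k + m - osds_down, 0)


def pg_can_serve_io(k: int, m: int, osds_down: int) -> bool:
    # An erasure-coded object can only be reconstructed from at least k shards.
    return readable_shards(k, m, osds_down) >= k


# Configuration from the report: k=8, m=3.
print(pg_can_serve_io(8, 3, 3))  # m OSDs down: exactly k shards remain
print(pg_can_serve_io(8, 3, 4))  # m + 1 OSDs down: only k - 1 shards remain
```

With 4 OSDs down only 7 of the 11 shards remain, one short of k=8, which matches the report that the PG cannot be written.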
12:23 AM Bug #8887: osd crashes at assert(e.version > info.last_update): PG:add_log_entry
Greg, can you please confirm whether #8346 is also the same Pavan Rallabhandi
12:22 AM Bug #8346: OSD crashes on master (FAILED assert(ip_op.waiting_for_commit.count(from)))
For the asserts seen in OSD.0 and OSD.2 logs, there is a tracker #8887.
For OSD.1, going through the logs, sub_op_...
Pavan Rallabhandi

07/22/2014

04:58 PM rgw Bug #8858 (Pending Backport): NextMarker, Prefix missing from bucket list results
Josh Durgin
04:46 PM Fix #8905 (New): msgr: encode osd epoch in nonce to avoid misc OSD reconnect races
We currently cannot tell whether an incoming connection is from an older or newer osd... we can only tell if it is in... Sage Weil
04:24 PM Bug #8851 (Resolved): Mon crash after update to 0.80.4
Sage Weil
04:14 PM Bug #8885: SIGABRT in TrackedOp::dump() via dump_ops_in_flight()
Have you seen this on a mainstream branch?
Looking at the dump() function, the only plausible candidate for failin...
Greg Farnum
04:01 PM Messengers Bug #8880 (Fix Under Review): msg/Pipe.cc: 1538: FAILED assert(0 == "old msgs despite reconnect_s...
wip-8880, PR https://github.com/ceph/ceph/pull/2135. It's untested and needs a suite run and review. Greg Farnum
03:32 PM Messengers Bug #8880 (In Progress): msg/Pipe.cc: 1538: FAILED assert(0 == "old msgs despite reconnect_seq fe...
And indeed there's just nothing here making sure the peer is actually active, nor even that it's the leader. I'm addi... Greg Farnum
03:11 PM Messengers Bug #8880: msg/Pipe.cc: 1538: FAILED assert(0 == "old msgs despite reconnect_seq feature")
Sequence of events
*************************
osd.0
=======
pipe to osd.1 faulted with nothing to send
scrub_shou...
Greg Farnum
04:00 PM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
Josh Durgin wrote:
> Someone reported seeing this same issue in 3.15.2 but not in 3.13.7
Ubuntu Kernel 3.14.9 also...
Xavier Trilla
01:35 PM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
Someone reported seeing this same issue in 3.15.2 but not in 3.13.7 Josh Durgin
03:35 PM Bug #8897 (Resolved): OSD: get_max_object_name_length() int versus uint
Merged as of commit:29401c7e779ad071e0dde7c2a9600fa470370d9a Greg Farnum
01:39 PM Bug #8897 (Fix Under Review): OSD: get_max_object_name_length() int versus uint
Sage Weil
01:28 PM Bug #8897 (Resolved): OSD: get_max_object_name_length() int versus uint
We now have a build warning for a signed/unsigned comparison in the OSD. It's a comparison between osd_max... Greg Farnum
02:32 PM rbd Fix #8903 (Resolved): rbd mirroring: librbd: fix watch/notify usage
specifically, we need to get a notification when we had to re-register the watch and may have missed an event. Sage Weil
02:32 PM rbd Feature #8902 (Resolved): rbd mirroring: librbd: funnel snapshot, resize events via lock holder
Sage Weil
02:30 PM rbd Feature #8901 (Resolved): rbd mirroring: librbd: add functional tests for librbd lock breaks
Sage Weil
02:30 PM rbd Feature #8900 (Resolved): rbd mirroring: librbd:making image locking mandatory
Sage Weil
02:12 PM Feature #8899 (Resolved): Kerberos/LDAP Support:: mon: define mon role capabilities
Three main roles:
- read-only (except for auth!)
- admin (read/write, but can't change auth)
- role-definer (...
Sage Weil
02:04 PM devops Bug #8893: ceph-deploy install command on centos 6.5 reports exception
Alfredo Deza
11:05 AM devops Bug #8893: ceph-deploy install command on centos 6.5 reports exception
I take it that generating ceph.repo directly is basically skipping the ceph-release RPM (which provides that file)? M... Sage Weil
01:46 PM CephFS Bug #8876: kcephfs: hang on read of length 0
The workload that triggered this was one client doing repeated tail -1, and another client doing appends. Can we con... Sage Weil
01:38 PM devops Feature #8871: modify ceph-deploy to only install repo file and not install packages
Some progress, I made this a subcommand as it would look cleaner when specifying what repo should be installed from c... Alfredo Deza
01:09 PM devops Feature #8871 (In Progress): modify ceph-deploy to only install repo file and not install packages
There are a few things that happen usually when installing a remote repo file in ceph-deploy:
* GPG file gets impo...
Alfredo Deza
01:25 PM Feature #7988: Logs: Log every administrative action taken by a user
as per Neil's request, this is what will be logged to syslog:... Joao Eduardo Luis
01:14 PM Bug #8887 (Duplicate): osd crashes at assert(e.version > info.last_update): PG:add_log_entry
There isn't much log history to look at here, but the same op is being dequeued twice by the OSD, and the second is h... Greg Farnum
01:13 PM Bug #8889: osd/ReplicatedPG.cc: 5162: FAILED assert(got)
- there is an in-progress copy_from
- backfill advances up to the snapdir object, sets backfill_pos, sets backfill_r...
Sage Weil
07:44 AM Bug #8889: osd/ReplicatedPG.cc: 5162: FAILED assert(got)
Greg Farnum wrote:
> Maybe I misunderstand, but if we're flushing snapshot 3, we need to write it (using old snapcon...
Sage Weil
12:29 PM devops Feature #8656 (Duplicate): Update Ceph packages in Fedora
with 8868 Neil Levine
08:21 AM devops Bug #8896: missing i386 packages for Trusty
I added the releases to match the i386 rule, but we will not be able to get them out until we try a new release:
<...
Alfredo Deza
08:18 AM devops Bug #8896 (Rejected): missing i386 packages for Trusty
I *think* we just missed getting them configured in Jenkins, as we do build i386 for Wheezy, Squeeze, and Precise. Alfredo Deza
07:43 AM Bug #8895 (Duplicate): ceph osd pool stats (displayed incorrect values)
... Andrey Matyashov
07:39 AM rgw Bug #8864: radosgw help doesn't seem to display some debug options
duplicate of #8112 Abhishek Lekshmanan
06:21 AM CephFS Feature #8869 (Fix Under Review): MDS: support standby-replay on old-format journals
https://github.com/ceph/ceph/pull/2132 John Spray
06:20 AM CephFS Feature #4886 (In Progress): teuthology: add tests that use the MDS dumper
Dumper is used in cephfs_journal_tool_smoke.sh, which I should make sure gets hooked into the QA suite. John Spray
02:49 AM Bug #8532: 0.80.1: OSD crash (domino effect), same as BUG #8229
still no inconsistencies. everything is running fine now. but there is now another one, who might have similar proble... Markus Blank-Burian
02:40 AM Bug #8229: 0.80~rc1: OSD crash (domino effect)
I also got the same asserts in one of the osds when I removed one osd from each node in a ceph cluster of 3 nodes ( 5 osds... Sahana Lokeshappa

07/21/2014

10:47 PM Bug #8851: Mon crash after update to 0.80.4
Greg Farnum wrote:
> Can you upload the full log of startup with crash?
> By "temporarily resolved", do you mean it...
shaojun ruan
10:17 AM Bug #8851 (Fix Under Review): Mon crash after update to 0.80.4
https://github.com/ceph/ceph/pull/2128 Joao Eduardo Luis
10:10 AM Bug #8851: Mon crash after update to 0.80.4
This issue should only affect users that have been running without cephx and have not ever created a key.
It's due...
Joao Eduardo Luis
10:34 PM Bug #8889: osd/ReplicatedPG.cc: 5162: FAILED assert(got)
Maybe I misunderstand, but if we're flushing snapshot 3, we need to write it (using old snapcontext, obviously) and t... Greg Farnum
08:50 AM Bug #8889 (Resolved): osd/ReplicatedPG.cc: 5162: FAILED assert(got)
ubuntu@teuthology:/a/teuthology-2014-07-20_02:30:01-rados-next-testing-basic-plana/371321
This is in the base tier...
Sage Weil
10:31 PM Bug #8887: osd crashes at assert(e.version > info.last_update): PG:add_log_entry
This error looks familiar to me but we don't have any other tracker entries for it. The commit in question is a part ... Greg Farnum
05:10 AM Bug #8887 (Duplicate): osd crashes at assert(e.version > info.last_update): PG:add_log_entry
I have ceph cluster with 3 monitors, 3 osd nodes (3 osds in each node)
While IO was going on, rebooted an osd node...
Sahana Lokeshappa
06:47 PM devops Bug #8893: ceph-deploy install command on centos 6.5 reports exception
So something that caused some issues here is that the ceph-release rpm file for v0.80.4 (dev) is just pointing to ... Sandon Van Ness
03:06 PM devops Bug #8893 (Resolved): ceph-deploy install command on centos 6.5 reports exception
ceph-deploy install command reports exception though command is successful.
test setup: vpm036...
Tamilarasi muthamizhan
04:47 PM rgw Bug #8632: rgw: bucket listing with delimiter doesn't scale well
wip-8632 mitigates the issue. Yehuda Sadeh
04:41 PM Bug #8894: osd/ReplicatedPG.cc: 9281: FAILED assert(object_contexts.empty())
... Sage Weil
03:38 PM Bug #8894 (In Progress): osd/ReplicatedPG.cc: 9281: FAILED assert(object_contexts.empty())
Sage Weil
03:37 PM Bug #8894 (Resolved): osd/ReplicatedPG.cc: 9281: FAILED assert(object_contexts.empty())
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-07-20_02:30:01-rados-next-testing-basic-plana/371110
...
Sage Weil
04:03 PM Bug #8701: osd: scrub found obsolete rollback obj
Greg Farnum
01:23 PM Messengers Fix #8892 (New): msgr: clean up local Session invocation functions
We are fairly inconsistent about delivering connect() setups for the loopback Connection in the SimpleMessenger. We s... Greg Farnum
10:13 AM rgw Bug #8311 (Fix Under Review): No pool name error in ubuntu-2014-05-06_21:02:54-upgrade:dumpling-d...
https://github.com/ceph/ceph-qa-suite/pull/63 Sage Weil
09:33 AM Bug #8891 (Resolved): rados bench hang during thrashing
teuthology-2014-07-20_02:30:01-rados-next-testing-basic-plana/371201
we thrash and recover but the rados bench get...
Sage Weil
09:27 AM Bug #8890 (Resolved): osd: very slow valgrind+rados/test.sh+thrashing run
teuthology-2014-07-20_02:30:01-rados-next-testing-basic-plana/371464
not sure if this is typical or not. either w...
Sage Weil
07:00 AM CephFS Bug #8876: kcephfs: hang on read of length 0
probably fixed by https://github.com/ceph/ceph-client/commit/d0d0db2268cc343c2361c83510d8e9711021fcce Zheng Yan
06:41 AM CephFS Feature #4583: libcephfs: add test that kills a client and verifies mds cleans it up
Just need mds_client_recovery into ceph-qa-suite before closing this John Spray

07/20/2014

09:51 PM Bug #8752: firefly: scrub/repair stat mismatch
Upgraded cluster to 0.80.4, restarted all components (previously MDS 0.80.2 could be still running), copied some data... Dmitry Smirnov
08:48 PM Bug #8886 (Closed): Miss some folders in PG's folder
When putting objects into a cluster, I checked the contents of directory /var/lib/ceph/osd/current/pg.xxx and found a probl... Jingjing Zhao
06:18 PM rbd Bug #8000: SLAB: Unable to allocate memory on node 0
I'm getting confident that this kernel bug always hits during deep-scrub.
I reproduced it several times just by start...
Dmitry Smirnov
02:18 PM Bug #8174 (Resolved): rados put of a long object name crashes the OSD process
Sage Weil
09:36 AM CephFS Bug #8878: mds lock cycle (wip-objecter)
This is going to be a bit of a project:
- fix every completion to take mds_lock
- .. and shunt every one off to...
Sage Weil
07:31 AM Bug #8885 (Resolved): SIGABRT in TrackedOp::dump() via dump_ops_in_flight()
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-07-19_23:15:03-rados-wip-sage-testing-testing-basic-plana/37... Sage Weil
07:27 AM Bug #8884 (Can't reproduce): osd/OSD.cc: 6317: FAILED assert(p->second.empty()) in consume_map()
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-07-19_23:15:03-rados-wip-sage-testing-testing-basic-plana/37... Sage Weil

07/19/2014

03:19 PM Bug #8701 (Fix Under Review): osd: scrub found obsolete rollback obj
the rgw suite, which reliably triggered this, is now passing. wip-8701 ready for review!
Sage Weil
03:00 PM rgw Bug #8676: md5sum check failed during readwrite.py
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-07-19_13:59:16-rgw-wip-8701-testing-basic-plana/370152 Sage Weil
09:30 AM Bug #8882 (Resolved): osd: osd tier remove ... leaves incomplete clones behind, confusing scrub
ubuntu@teuthology:/a/teuthology-2014-07-18_02:32:01-rados-master-testing-basic-plana/368480
rados/thrash/{clusters/f...
Sage Weil
09:29 AM Bug #8881 (Duplicate): scrub 85.0 cf2b2318/foo15/3/test-rados-api-plana35-13313-11/85 expected cl...
ubuntu@teuthology:/a/teuthology-2014-07-18_02:32:01-rados-master-testing-basic-plana/368448
ubuntu@teuthology:/a/teu...
Sage Weil
09:24 AM Messengers Bug #8880: msg/Pipe.cc: 1538: FAILED assert(0 == "old msgs despite reconnect_seq feature")
Which daemon was this?
Looks like that commit does include the fix for #8504... :(
Greg Farnum
09:10 AM Messengers Bug #8880 (Resolved): msg/Pipe.cc: 1538: FAILED assert(0 == "old msgs despite reconnect_seq featu...
ubuntu@teuthology:/a/teuthology-2014-07-18_02:32:01-rados-master-testing-basic-plana/368391... Sage Weil
06:43 AM Bug #8680 (Resolved): crushtool should not send it's output to stderr
This has been resolved in the master branch. Wido den Hollander
02:43 AM devops Bug #8330: repodata on rpm repos do not list latest ceph-deploy (1.5.2)
Still the same for ceph-deploy 1.5.9 on the rhel7 ceph-noarch repo.
ceph-deploy is the only package available. ceph-...
Simon Ironside

07/18/2014

09:12 PM CephFS Bug #8878 (Resolved): mds lock cycle (wip-objecter)
... Sage Weil
03:34 PM rgw Bug #8858 (Fix Under Review): NextMarker, Prefix missing from bucket list results
Yehuda Sadeh
02:06 PM Bug #8851: Mon crash after update to 0.80.4
Can you upload the full log of startup with crash?
By "temporarily resolved", do you mean it's working now, or does ...
Greg Farnum
01:37 PM rgw Bug #8702 (Pending Backport): RadosGW incorrectly converting + to space in URLs
Yehuda Sadeh
11:16 AM rgw Bug #8702: RadosGW incorrectly converting + to space in URLs
ok, I see it now. Yehuda Sadeh
10:43 AM rgw Bug #8702: RadosGW incorrectly converting + to space in URLs
We talked about this on the mailing list. I can reproduce this at will on apache as well.
Can you take a look at ...
Brian Rak
10:40 AM rgw Bug #8702: RadosGW incorrectly converting + to space in URLs
The problem seems to be with the web server itself here, which doesn't send the REQUEST_URI URL-encoded as it should... Yehuda Sadeh
07:27 AM rgw Bug #8702: RadosGW incorrectly converting + to space in URLs
See https://github.com/ceph/ceph/pull/2117 Brian Rak
01:28 PM CephFS Bug #8876: kcephfs: hang on read of length 0
got debug output from a resend, but not very helpful. i think the bug is in the striped read code, which happened lo... Sage Weil
01:27 PM CephFS Bug #8876 (Resolved): kcephfs: hang on read of length 0
... Sage Weil
01:13 PM Documentation #8875 (Resolved): `ceph-deploy new` needs to be called for every node, not just the...
Hi guys,
Running into an issue adding monitors. The initial monitor gets created successfully, but I get the below error ad...
Bobby Yakov
12:50 PM devops Feature #8868: Update Fedora to 0.80.5 packages with ceph-common
The ceph-common split is now in the latest firefly. There are a few other cleanups (reducing dependencies) pending i... Sage Weil
11:07 AM devops Feature #8868 (Resolved): Update Fedora to 0.80.5 packages with ceph-common
In order to push the Ceph client packages up into RHEL 7.1, we need to have the latest packages available in Fedora a... Neil Levine
11:44 AM devops Feature #8871 (Resolved): modify ceph-deploy to only install repo file and not install packages
As a user, I want to update my repo files but not automatically install the packages.
Proposal: Add a switch to th...
Neil Levine
11:14 AM CephFS Feature #8869 (Resolved): MDS: support standby-replay on old-format journals
Right now if we see an old-format journal and we're in standby-replay, we just hang around waiting for it to be conve... Greg Farnum
11:14 AM rbd Bug #8821 (Fix Under Review): rbd: ceph.conf "rbd default format" woes
https://github.com/ceph/ceph/pull/2112 Josh Durgin
10:01 AM Feature #7988 (Fix Under Review): Logs: Log every administrative action taken by a user
https://github.com/ceph/ceph/pull/2118 Joao Eduardo Luis
08:22 AM Bug #5195: "ceph-deploy mon create" fails when adding additional monitors
Still having this issue with firefly; is it possible it was re-introduced?
See Support #8861, just opened.
Bobby Yakov
07:03 AM Bug #8865 (Resolved): cep osd setmaxosd doesn't check if osds exist
this lets you destroy whole swaths of osds. i think we should make you 'ceph osd rm ...' first Sage Weil
05:38 AM Bug #8346: OSD crashes on master (FAILED assert(ip_op.waiting_for_commit.count(from)))
It reproduced.
Setup details:
3 osd nodes (3 osds in each node)
3 monitors
rebooted the node with osds:6,7,...
Sahana Lokeshappa
12:04 AM rgw Bug #8864 (Resolved): radosgw help doesn't seem to display some debug options
Looks like radosgw has options like `--debug-rgw` & `--log-file`, but these don't seem to appear in the help or docum... Abhishek Lekshmanan

07/17/2014

08:20 PM Bug #8863 (Resolved): osd: second reservation rejection -> crash
I found bug #7624 resolved this problem (http://tracker.ceph.com/issues/7642) and the source code of OSDMonitor in 0.80... shaojun ruan
07:09 PM rgw Bug #8311: No pool name error in ubuntu-2014-05-06_21:02:54-upgrade:dumpling-dumpling-testing-bas...
Sage Weil wrote:
> Ok, it installs dumpling, upgrades to v0.80.1, then runs radosgw. Is there a way to work around ...
Yuri Weinstein
04:45 PM rgw Bug #8311: No pool name error in ubuntu-2014-05-06_21:02:54-upgrade:dumpling-dumpling-testing-bas...
Ok, it installs dumpling, upgrades to v0.80.1, then runs radosgw. Is there a way to work around this bug (which is i... Sage Weil
09:11 AM rgw Bug #8311 (New): No pool name error in ubuntu-2014-05-06_21:02:54-upgrade:dumpling-dumpling-testi...
Marking as "new" as I still see this problem in http://pulpito.front.sepia.ceph.com/teuthology-2014-07-16_19:12:01-upgr... Yuri Weinstein
04:49 PM Bug #8701 (In Progress): osd: scrub found obsolete rollback obj
Sage Weil
04:39 PM rgw Bug #8846 (Pending Backport): radosgw on 0.80.4 crashes when doing a multi-part upload
oops, we still need to do dumpling Sage Weil
04:32 PM rgw Bug #8846 (Resolved): radosgw on 0.80.4 crashes when doing a multi-part upload
Sage Weil
03:03 PM Bug #8835: rados mkpool doesn't error out for pools which are existing
Hmm. That's probably fine, assuming it's wired up correctly (e.g., doesn't get blocked if the osdmap is up to date). ... Greg Farnum
01:22 PM rbd Bug #8821: rbd: ceph.conf "rbd default format" woes
Now works as expected; thanks, Josh.
Dmitry Smirnov
01:13 PM devops Support #8861 (Rejected): Deploying additional monitors fails.
Hi guys,
Pretty new to Ceph, need help in troubleshooting install.
Using Ubuntu 14.04 and Ceph firefly.
When ru...
Bobby Yakov
12:28 PM devops Bug #8849 (Pending Backport): rpm restarts daemons on upgrade
Sage Weil
11:26 AM Bug #8860 (Resolved): ceph-disk issues with custom cluster name
ceph-disk and the init script in some places ignore the custom cluster name... Alfredo Deza
08:24 AM Bug #8851: Mon crash after update to 0.80.4
It can be temporarily resolved by this command:
-------------------------------------------------
ceph-kvstore-tool...
shaojun ruan
07:10 AM rbd Bug #8859 (Closed): krbd crash while serving linux-lio iscsi: rbd_assert(img_request != NULL);
We have Linux-HA configuring a pair of nodes to make highly-available iSCSI targets with Linux-LIO, and so it maps th... Walter Huf
06:23 AM CephFS Bug #8811: Journal corruption during upgrade to 0.82 with standby-replay daemons
https://github.com/ceph/ceph/pull/2115 John Spray
06:21 AM CephFS Bug #8811 (Fix Under Review): Journal corruption during upgrade to 0.82 with standby-replay daemons
John Spray
06:20 AM Bug #8857 (Resolved): mon: mds newfs command is not idempotent
Looks good to me. We may get questions from anyone who relied on the old behaviour of newfs to 'reset' a filesystem,... John Spray
04:49 AM rgw Bug #7796: RGW Keystone token auth fails with '411 Length Required' when Keystone using Apache/WSGI
I also ran into this while trying to set up a test cluster. Never could figure out what went wrong until I finally st... Abhishek Lekshmanan
04:41 AM Bug #8801: Ceph monitors do not start after server restart
We were able to reproduce the issue with the monitors by restarting the physical server. The Ceph configuration had d... AltScale Inc

07/16/2014

08:57 PM Bug #8835: rados mkpool doesn't error out for pools which are existing
Greg, the check is actually present in Objecter::create_pool(), which is not hit (maybe due to a stale osdmap?). I hav... Pavan Rallabhandi
10:41 AM Bug #8835: rados mkpool doesn't error out for pools which are existing
This is basically intended behavior; under some circumstances the message to the monitors can get "replayed" and ther... Greg Farnum
03:35 AM Bug #8835: rados mkpool doesn't error out for pools which are existing
Have a fix in place; will send out a pull request soon. Pavan Rallabhandi
07:48 PM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
looks like an ABBA deadlock between ceph_connection->mutex and ceph_osd_client->request_mutex.... Zheng Yan
04:34 PM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
The problem is very repeatable at our site as well. Attached is the kern.log file after running the requested c... Greg Wilson
11:52 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
Hi Zheng,
No problem, in our setup it's pretty easy to reproduce the error.
Please find the output attached. (...
Xavier Trilla
06:11 PM rgw Bug #8858 (Resolved): NextMarker, Prefix missing from bucket list results
S3 returns these when listing buckets; RGW does not. This breaks clients like GoodSync.
The NextMarker parameter ...
Sage Weil
05:23 PM Bug #8857 (Fix Under Review): mon: mds newfs command is not idempotent
Sage Weil
05:14 PM Bug #8857 (Resolved): mon: mds newfs command is not idempotent
... Sage Weil
04:44 PM CephFS Bug #8811 (In Progress): Journal corruption during upgrade to 0.82 with standby-replay daemons
This may be the result of a bug in the journal reformatting that occurs during upgrade, affecting systems using stand... John Spray
04:18 PM rgw Bug #8846: radosgw on 0.80.4 crashes when doing a multi-part upload
Even with the default chunk size, this can be triggered by performing a multipart upload consisting of a single small... Benjamin Gilbert
02:07 PM rgw Bug #8846 (Fix Under Review): radosgw on 0.80.4 crashes when doing a multi-part upload
Yehuda Sadeh
09:42 AM rgw Bug #8846: radosgw on 0.80.4 crashes when doing a multi-part upload
Oh yes, I've raised it to 5M to avoid having each part of a multi-part generate 2 objects in rados. Sylvain Munaut
09:40 AM rgw Bug #8846: radosgw on 0.80.4 crashes when doing a multi-part upload
Are you by any chance using a non-default chunk size? Yehuda Sadeh
08:11 AM rgw Bug #8846: radosgw on 0.80.4 crashes when doing a multi-part upload
Yes, it happens since ea68b9372319fd0bab40856db26528d36359102e as I reported on the ML. (and now realize I forgot to ... Sylvain Munaut
08:08 AM rgw Bug #8846: radosgw on 0.80.4 crashes when doing a multi-part upload
Did that happen before (e.g., 0.80.3)? can you add:
debug ms = 1
debug rgw = 20
Yehuda Sadeh
07:06 AM rgw Bug #8846: radosgw on 0.80.4 crashes when doing a multi-part upload
For more info, I'm doing a multipart upload and it crashes at the last part of the file. The first two parts are 5M and th... Sylvain Munaut
05:36 AM rgw Bug #8846: radosgw on 0.80.4 crashes when doing a multi-part upload
... Sylvain Munaut
05:31 AM rgw Bug #8846 (Resolved): radosgw on 0.80.4 crashes when doing a multi-part upload

This is the tracelog (from a self compiled version since I started debugging this myself. However same exact issue ...
Sylvain Munaut
03:25 PM rgw Bug #8632 (Need More Info): rgw: bucket listing with delimiter doesn't scale well
Neil Levine
02:20 PM Bug #8174 (Fix Under Review): rados put of a long object name crashes the OSD process
Sage Weil
02:17 PM Bug #8701: osd: scrub found obsolete rollback obj
This is going to require some thought.
The basic problem is that because of filename length limitations, for long ...
Samuel Just
01:46 PM Bug #8852 (Need More Info): submodules not cecking out the right branch, jerasure does not compile
Could you please add the log of the commands and their output? It works for me on the current master:... Loïc Dachary
10:53 AM Bug #8852 (Won't Fix): submodules not cecking out the right branch, jerasure does not compile
I noticed that after doing a "./do_autogen.sh" the compilation process breaks saying that "galois_init_default_field"... Lluis PJ
01:33 PM devops Bug #8849 (Fix Under Review): rpm restarts daemons on upgrade
Pull request opened https://github.com/ceph/ceph/pull/2109 Alfredo Deza
12:39 PM devops Bug #8849 (In Progress): rpm restarts daemons on upgrade
Greg: that might be Suse, as we are specifically looking for it for certain restart-related things in the Spec file. ... Alfredo Deza
10:14 AM devops Bug #8849: rpm restarts daemons on upgrade
I'm all for changing this, but we want to be careful when doing so. It sounds familiar to me and I think maybe we set... Greg Farnum
09:25 AM devops Bug #8849 (Resolved): rpm restarts daemons on upgrade
Sage Weil
01:30 PM CephFS Bug #8177 (Resolved): Client: seg fault in verify_reply_trace on traceless reply
I believe the fix that actually went into master is commit:334c43f54d31131c4970f43d7e43ebb43e6cd22d. Greg Farnum
12:59 PM CephFS Bug #8576: teuthology: nfs tests failing on umount
http://qa-proxy.ceph.com/teuthology/teuthology-2014-07-09_23:10:02-knfs-next-testing-basic-plana/353010/
http://qa-p...
Greg Farnum
12:34 PM Documentation #8854 (Closed): Clarify potential problems from ceph-deploy purgedata command when ...
When running the ceph-deploy purgedata command on a storage node the command will end up making later installations a... JuanJose Galvez
12:17 PM devops Bug #8813 (Resolved): ceph-disk list displays INFO messages rendering output hard to read
in master, backported to firefly Sage Weil
10:50 AM rbd Bug #8821 (In Progress): rbd: ceph.conf "rbd default format" woes
Josh Durgin
09:59 AM Bug #8851 (Resolved): Mon crash after update to 0.80.4
When I updated the mon from 0.80.3 to 0.80.4 and restarted it, it crashed
--------------------------------------------------...
shaojun ruan
09:41 AM devops Bug #8850 (Can't reproduce): ceph-deploy tests fail during tar due to file changed; incomplete sh...
ubuntu@teuthology:/a/teuthology-2014-07-15_19:08:01-ceph-deploy-dumpling-testing-basic-plana/363933
and others.
<...
Sage Weil
09:27 AM devops Bug #7391 (Resolved): ceph-deploy should pass the verbose flag to ceph-disk
merged commit 7b0056b into ceph:master Alfredo Deza
06:39 AM devops Bug #7391 (Fix Under Review): ceph-deploy should pass the verbose flag to ceph-disk
Pull request opened https://github.com/ceph/ceph-deploy/pull/216 Alfredo Deza
08:42 AM rgw Bug #8848 (Resolved): "adjust-ulimits: command not found" in upgrade:firefly-firefly-testing-basi...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-07-15_19:12:01-upgrade:firefly-firefly-testing-basic-... Yuri Weinstein
08:22 AM rgw Bug #8847 (Can't reproduce): "Error initializing cluster client" in upgrade:firefly-firefly-testi...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-07-15_19:12:01-upgrade:firefly-firefly-testing-basic-... Yuri Weinstein
05:23 AM Bug #8797: "ceph status" do not exit with python_2.7.8
I believe that we should attempt to replicate the problem first as I know the Python ticket will get ignored unless t... Alfredo Deza
05:08 AM devops Bug #7627: ceph-disk: does not start daemons properly under systemd
Some possibly related feedback from running master (aeaac69) on Fedora 20:
* Mons don't come up because they're tr...
John Spray
04:18 AM rbd Bug #8845 (Resolved): Flattening Clones of clone, results in command failure
1. Created a clone of a clone in the manner below
Create a pool, i.e. pool1
Create an rbd, i.e. rbd1
create ...
Ramakrishnan Periyasamy

07/15/2014

09:32 PM rgw Bug #8311: No pool name error in ubuntu-2014-05-06_21:02:54-upgrade:dumpling-dumpling-testing-bas...
I still see it on firefly;
http://pulpito.front.sepia.ceph.com/ubuntu-2014-07-15_21:01:54-upgrade:firefly-firefly-te...
Yuri Weinstein
12:52 PM rgw Bug #8311 (Resolved): No pool name error in ubuntu-2014-05-06_21:02:54-upgrade:dumpling-dumpling-...
should be fixed now Sage Weil
09:07 PM Bug #8769 (Rejected): osd.3 crashed in upgrade:dumpling-x:stress-split-firefly---basic-multi suite
not much to go on without the osd log; let's wait for it to reproduce. Sage Weil
07:35 PM Linux kernel client Bug #8798 (Won't Fix): The kernel of a server with Ceph hangs
Zheng Yan
07:30 PM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
when this happens again, execute 'echo t > /proc/sysrq-trigger' and upload the kernel message.
By the way, are the...
Zheng Yan
07:23 PM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
Hi,
We are experiencing exactly the same problem when running several concurrent dd operations to a kernel mounte...
Xavier Trilla
06:11 PM devops Bug #7627: ceph-disk: does not start daemons properly under systemd
i think we fixed this by doing systemd-run from the init script... Sage Weil
06:00 PM Bug #8752: firefly: scrub/repair stat mismatch
Samuel Just wrote:
> Just fyi, this is a relatively harmless stat counting error. It shouldn't cause corruption.
...
Dmitry Smirnov
01:23 PM Bug #8752: firefly: scrub/repair stat mismatch
Just fyi, this is a relatively harmless stat counting error. It shouldn't cause corruption. Not that I know how to ... Samuel Just
05:18 AM Bug #8752: firefly: scrub/repair stat mismatch
If #8830 affects only XFS-based OSDs, it is definitely not my case. All my OSDs are on Btrfs...
Objects from affected ...
Dmitry Smirnov
05:29 PM Feature #8844 (Resolved): asserts to log message to ceph log
There are a number of outstanding issues with stability of Ceph components: for example it is not unusual for OSDs to g... Dmitry Smirnov
05:09 PM RADOS Feature #8843 (New): ceph pg {deep-}scrub 20.\*
Similar to command ... Dmitry Smirnov
05:08 PM Bug #7804 (Duplicate): backfill racing with a hitset object remove
This looks like a dup of #7983, where we already fix backfill vs hit_set issues by deferring any hit_set_persist or t... Sage Weil
04:10 PM Linux kernel client Feature #8842: CephFS kernel module for RHEL7.0 GA
The kernel modules are available in the firefly rhel7 as well as rpm-testing.
Getting cephfs working is not someth...
Sandon Van Ness
04:01 PM Linux kernel client Feature #8842 (Resolved): CephFS kernel module for RHEL7.0 GA
Looks like we only have the libceph and RBD kernel modules for RHEL 7.0 GA at rpm-testing/rhel7.
We need to have ...
Neil Levine
03:31 PM CephFS Feature #8634 (Resolved): mds: admin commands list, evict, etc session
commit:911038ecdbad5c19bd20ac0bd5a03dae53aa3175 Greg Farnum
03:30 PM Bug #8701: osd: scrub found obsolete rollback obj
wip-8701 adds a test to store_test which reproduces the problem. The issue appears to be with collection_move_rename... Samuel Just
02:30 PM Bug #8701: osd: scrub found obsolete rollback obj
Now I wonder whether it's the long object name. Samuel Just
02:23 PM Bug #8701: osd: scrub found obsolete rollback obj
Actually, I appear to have already correctly handled the thing I mentioned above. Must be something else. Samuel Just
12:14 PM Bug #8701: osd: scrub found obsolete rollback obj
more of these in the latest rgw master run...
/var/lib/teuthworker/archive/teuthology-2014-07-14_23:02:01-rgw-mast...
Sage Weil
02:21 PM devops Bug #6703 (Resolved): OSDs with dmcrypt fail to start at boot
Merged into Ceph master branch with hash 31eefeb Alfredo Deza
02:12 PM devops Bug #7486 (Rejected): python-backports needs fixing for rhel
Sage Weil
02:10 PM devops Bug #8513 (Can't reproduce): s3tests failed at bootstrap in the nightlies
Sage Weil
02:08 PM devops Bug #8788 (Resolved): Rhel 7 ceph=deploy v1.5.7 for firefly fails to retrieve correct package - i...
Alfredo Deza
02:04 PM devops Bug #8374 (Won't Fix): redhat-lsb is not recognized as a dependency in FC19
Sage Weil
01:50 PM Bug #8588: In the erasure-coded pool, primary OSD will crash at decoding if any data chunk's size...
Yeah, this needs to be handled better. The biggest problem is that the crash is on the primary rather than the repli... Samuel Just
01:44 PM Bug #8777: osd/PGLog.h: 88: FAILED assert(rollback_info_trimmed_to_riter == log.rbegin())
this happened 3x on my wip-msgr run, too: ubuntu@teuthology:/a/sage-2014-07-12_17:17:39-rados:thrash-wip-msgr-testing... Sage Weil
01:41 PM Bug #8726: (firefly command on dumpling issue?) Error "'adjust-ulimits ceph-coverage /home/ubuntu...
the thrashosds needs the primary-affinity: false (i think? check that syntax) in the yaml Sage Weil
01:41 PM Bug #8801: Ceph monitors do not start after server restart
Can you provide logs for the monitor that doesn't start? Ideally with 'debug mon = 10'. Joao Eduardo Luis
01:39 PM Linux kernel client Bug #8806: libceph: must use new tid when watch is resent
Sage Weil
01:37 PM CephFS Bug #8834 (Rejected): ceph client hang when copy files
Sage Weil
12:43 AM CephFS Bug #8834: ceph client hang when copy files
looks like the readdir memory allocation fail bug. It should be fixed in 3.14 Zheng Yan
12:26 AM CephFS Bug #8834: ceph client hang when copy files
the client was running on ubuntu 14.04
Linux tc-host-2 3.13.0-24-generic #46-Ubuntu SMP Thu Apr 10 19:11:08 UTC 2014...
Wen Wei
12:23 AM CephFS Bug #8834 (Rejected): ceph client hang when copy files
The ceph client hangs when copying files.
I'm not familiar with Ceph; I hope the attached syslog will give you some hints.
Wen Wei
01:37 PM Bug #8747: OSD crash on scrub:osd/ReplicatedPG.cc: 5297: FAILED assert(soid < scrubber.start || s...
Yeah, 8011 seems to be less dead than we thought, reopening. Samuel Just
01:36 PM Bug #8747 (Duplicate): OSD crash on scrub:osd/ReplicatedPG.cc: 5297: FAILED assert(soid < scrubbe...
see #8011 Sage Weil
05:27 AM Bug #8747: OSD crash on scrub:osd/ReplicatedPG.cc: 5297: FAILED assert(soid < scrubber.start || s...
Although it takes up to an hour to reproduce, I seem to have a reliable way to do so.
I shall be happy to capture de...
Dmitry Smirnov
01:36 PM Bug #8011: osd/ReplicatedPG.cc: 5244: FAILED assert(soid < scrubber.start || soid >= scrubber.end)
see #8747 for a log of this happening on 0.80.3 Sage Weil
01:34 PM Bug #8646 (Resolved): OSD: assert in share_map() when marked down by an OSDMap
Sage Weil
01:33 PM Bug #8714: we do not block old clients from breaking cache pools
how about we return EPERM or EOPNOTSUPP on osd ops from clients w/o the caching features? Sage Weil
01:31 PM Bug #8642 (Duplicate): After Upgrade from Emperor to Firefly osd start (seemingly randomly) crashing
dup of #8738, now fixed Sage Weil
01:29 PM Bug #8584 (Duplicate): OSD Crashing on firefly - Timeouts on starting again
this looks like it was #8738 Sage Weil
01:26 PM Bug #8694: OSD crashed (assertion failure) at FileStore::_collection_move_rename
This is probably a dup of 8733. Samuel Just
01:20 PM Bug #8691: osd: PG::_lock, OSD::pg_map_lock lock cycle
Sage Weil
01:19 PM Bug #8643 (Closed): 0.80.1: OSD crash: osd/ECBackend.cc: 529: FAILED assert(pop.data.length() == ...
Samuel Just
01:18 PM Bug #8532 (Need More Info): 0.80.1: OSD crash (domino effect), same as BUG #8229
Samuel Just
01:12 PM Feature #7288 (Resolved): Deep-scrub throttle
everything but the idea that the scrub timing could be randomized has been implemented. the prioritization will get ... Sage Weil
01:11 PM Feature #8580: Decrease disk thread's IO priority and/or make it configurable
oh, we did backport the io priority Sage Weil
01:10 PM Feature #8580 (Resolved): Decrease disk thread's IO priority and/or make it configurable
would rather not backport the ioprio stuff to dumpling. the sleep is there. Sage Weil
12:35 PM Bug #8830 (Resolved): deep scrub mismatches on rbd workload with alloc hints
Sage Weil
12:20 PM Bug #8797: "ceph status" do not exit with python_2.7.8
Fascinating info so far, Dmitry, thanks for your work on this. Anxious to see what the Python team thinks of the ass... Dan Mick
02:40 AM Bug #8797: "ceph status" do not exit with python_2.7.8
http://bugs.python.org/issue21963 Dmitry Smirnov
11:04 AM devops Bug #7391: ceph-deploy should pass the verbose flag to ceph-disk

Example output with the changeset...
Alfredo Deza
09:36 AM devops Bug #8813: ceph-disk list displays INFO messages rendering output hard to read
Started work on #7391 to address this from ceph-deploy's end.
Alfredo Deza
06:16 AM devops Bug #8813 (Fix Under Review): ceph-disk list displays INFO messages rendering output hard to read
PR opened https://github.com/ceph/ceph/pull/2106 Alfredo Deza
08:46 AM devops Bug #8831 (Duplicate): ice1.2 on precise:ceph-deploy purge reports error
Closing as duplicate of #8730
Was resolved in ceph-deploy 1.5.9
Alfredo Deza
07:14 AM CephFS Feature #4583 (In Progress): libcephfs: add test that kills a client and verifies mds cleans it up
Sage Weil
07:11 AM CephFS Bug #8257 (Resolved): 0.80~rc1: MDS segmentation fault
... John Spray
07:03 AM CephFS Bug #8118 (Closed): MDS crashes
This got a non-zero response from the OSD while writing out a directory. That's generally not an MDS bug, and if it w... Greg Farnum
06:55 AM CephFS Bug #6609 (Can't reproduce): teuthology rsync workunit failure
Sage Weil
06:52 AM CephFS Bug #7613 (Need More Info): mds/MDCache.cc: 216: FAILED assert(inode_map.count(in->vino()) == 0)
Sage Weil
05:08 AM Bug #8835 (Resolved): rados mkpool doesn't error out for pools which are existing
'rados mkpool' doesn't seem to throw an error for pools that already exist.
<snip>
root#: rados mkpool ...
Pavan Rallabhandi

07/14/2014

04:04 PM devops Bug #8831 (Duplicate): ice1.2 on precise:ceph-deploy purge reports error
... Tamilarasi muthamizhan
03:41 PM Bug #8752: firefly: scrub/repair stat mismatch
This appears unrelated to 8830. Probably a stat miscounting bug somewhere in the cache/tiering code. Samuel Just
03:40 PM Bug #8752 (New): firefly: scrub/repair stat mismatch
Actually, maybe not. The naive interpretation doesn't have #8830 causing differences in file sizes...but maybe it cou... Greg Farnum
03:19 PM Bug #8752 (Duplicate): firefly: scrub/repair stat mismatch
almost certainly a dup of #8830. fix will hit the firefly branch shortly! Sage Weil
03:16 PM Bug #8752: firefly: scrub/repair stat mismatch
No improvement with 0.80.3. I'm getting ~20 inconsistent PGs after every cycle of full "deep-scrub" (i.e. `ceph osd d... Dmitry Smirnov
03:08 PM Bug #8815 (Resolved): mon: scrub error (osdmap encoding mismatch?) upgrading from 0.80 to ~0.80.2
Sage Weil
03:07 PM Bug #8747: OSD crash on scrub:osd/ReplicatedPG.cc: 5297: FAILED assert(soid < scrubber.start || s...
No improvement with 0.80.3 -- I'm still getting those crashes frequently on "deep-scrub" and "repair".
Sometimes two...
Dmitry Smirnov
01:28 PM Bug #8830 (Resolved): deep scrub mismatches on rbd workload with alloc hints
Samuel Just
12:33 PM devops Bug #8813 (In Progress): ceph-disk list displays INFO messages rendering output hard to read
Alfredo Deza
12:09 PM rgw Bug #8702: RadosGW incorrectly converting + to space in URLs
Looks like this is ultimately caused by line 1227 in rgw_rest.cc... Brian Rak
10:41 AM Bug #8823 (Resolved): Failing LibRadosTierPP.HitSetRead,Write,Trim
Sage Weil
09:36 AM Support #8826 (Rejected): Attempt to set PG_NUM and PGP_NUM to 8192 on pool rbd causes OSDs to go...
Based on other analysis in the private ticket, it looks like it's just hitting the fd limit; I think that's well-docu... Greg Farnum
07:51 AM Support #8826 (Rejected): Attempt to set PG_NUM and PGP_NUM to 8192 on pool rbd causes OSDs to go...
The customer is willing to modify pool rbd for its production usage but doing so generates an error message when issu... Jean-Charles Lopez
05:30 AM CephFS Bug #8811: Journal corruption during upgrade to 0.82 with standby-replay daemons
Hmmm. Aside from is_readable() giving inconsistent results, seems like this could happen if there was a bug that cau... John Spray

07/13/2014

10:19 PM Bug #8824 (Can't reproduce): osd: hung MOSDECSubOpWrite
ubuntu@teuthology:/a/teuthology-2014-07-13_02:30:04-rados-next-testing-basic-plana/357668
see osd.2's log...
Sage Weil
08:52 PM Bug #8823: Failing LibRadosTierPP.HitSetRead,Write,Trim
ah, it looks like these are from the mon change that prevents fetching/setting hit_set_* fields on non-tier pools. Sage Weil
08:51 PM Bug #8823 (Resolved): Failing LibRadosTierPP.HitSetRead,Write,Trim
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-07-13_13:58:27-rados-master-testing-basic-plana/357922
and ...
Sage Weil
08:46 PM Bug #8822: osd: hang on shutdown, spinlocks
... Sage Weil
08:45 PM Bug #8822 (Resolved): osd: hang on shutdown, spinlocks
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-07-13_02:30:04-rados-next-testing-basic-plana/357857
...
Sage Weil
02:12 AM rbd Bug #8821 (Resolved): rbd: ceph.conf "rbd default format" woes
`/usr/bin/rbd` has a few errors with "rbd default format".
With the following in "/etc/ceph/ceph.conf":...
Dmitry Smirnov

07/12/2014

06:13 AM Feature #8538 (Resolved): Functionality to have rbdmap also mount after mapping an image
Merged. Thanks, Sage. Dmitry Smirnov
 
