Activity

From 09/15/2014 to 10/14/2014

10/14/2014

07:32 PM Bug #8620: rest/test.py occasional failure (dumpling)
ubuntu@teuthology:/a/teuthology-2014-10-13_19:00:01-rados-dumpling-distro-basic-multi/545881 Sage Weil
07:30 PM Bug #8851: Mon crash after update to 0.80.4
In our product env, we use 0.83. We came across this problem too.
Try this patch https://github.com/ceph/ceph/pull/2...
wei li
07:22 PM Bug #9765: CachePool flush -> OSD Failed
... Loïc Dachary
02:58 AM Bug #9765: CachePool flush -> OSD Failed
I'm sorry!
*[root@ct3 ~]# ceph --version
ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)*
Irek Fasikhov
01:24 AM Bug #9765: CachePool flush -> OSD Failed
in addition:... Irek Fasikhov
01:19 AM Bug #9765 (Duplicate): CachePool flush -> OSD Failed
Hi, all.
I encountered a problem flushing the data before deleting CachePool.
My crushmap:...
Irek Fasikhov
06:55 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
https://github.com/ceph/ceph/pull/2724 Loïc Dachary
06:42 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
Indeed ! Thanks ! Loïc Dachary
06:32 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
I don't think this is the patch you want; see c776a89880fdac270e6334ad8e49fa616d05d0d4 and acfe62e0aa45bff208e38aeedad... Mark Kirkwood
06:27 PM Bug #9073 (Fix Under Review): OSD with device/partition journals down after fresh deploy or upgra...
* firefly backport https://github.com/ceph/ceph/pull/2724 Loïc Dachary
06:22 PM Bug #9073 (Pending Backport): OSD with device/partition journals down after fresh deploy or upgra...
The fix for this bug is https://github.com/ceph/ceph/commit/c776a89880fdac270e6334ad8e49fa616d05d0d4 and needs backpo... Loïc Dachary
06:31 PM Bug #9785 (Resolved): /etc/ceph/dmcrypt-keys and key contents are created world-readable
get_or_create_dmcrypt_key in ceph-disk creates the key_dir and key_files, but does not set any specific permissions o... David Clarke
06:23 PM Bug #9768 (Duplicate): ceph-osd mkfs hangs
Loïc Dachary
06:00 PM Bug #9768: ceph-osd mkfs hangs
Created with ceph-disk prepare --fs-type=ext4 and ceph-disk activate /dev/loop3p1 Loïc Dachary
04:46 PM Bug #9768: ceph-osd mkfs hangs
On ubuntu-14.04 the logs of a ceph-osd mkfs on 0.80.5 that completes successfully. Loïc Dachary
04:04 PM Bug #9768: ceph-osd mkfs hangs
Loïc Dachary
07:21 AM Bug #9768: ceph-osd mkfs hangs
I browsed the patches to aio from 3.12.7 until now and saw nothing that could be related to this problem https://www.kerne... Loïc Dachary
07:15 AM Bug #9768: ceph-osd mkfs hangs
... Loïc Dachary
06:01 AM Bug #9768: ceph-osd mkfs hangs
Although https://github.com/ceph/ceph/commit/2f11631f3144f2cc0e04d718e40e716540c8af19 seems related, the log shows Fi... Loïc Dachary
05:45 AM Bug #9768 (Duplicate): ceph-osd mkfs hangs
h3. Workaround for Firefly <= 0.80.7
If it shows with...
Loïc Dachary
06:15 PM CephFS Bug #9674: nightly failed multiple_rsync.sh
rsync asks us to see previous errors;) yes, I think sudo should work Zheng Yan
02:36 PM CephFS Bug #9674: nightly failed multiple_rsync.sh
Well, that would make sense. How did you find those in the log?
We should probably just run this as sudo or someth...
Greg Farnum
06:30 AM CephFS Bug #9674: nightly failed multiple_rsync.sh
... Zheng Yan
05:52 PM devops Bug #9783: upgrade ceph-common (0.80.7-1trusty) over (0.80.5-0ubuntu0.14.04.1) fails
looks like 17732dc0c8878ea58813ad543c5359cb811079cc which probably should have included some other package control he... Dan Mick
04:55 PM devops Bug #9783 (Rejected): upgrade ceph-common (0.80.7-1trusty) over (0.80.5-0ubuntu0.14.04.1) fails
This happens when switching from Ubuntu / Debian repositories to Ceph repositories:... Loïc Dachary
05:31 PM Bug #9784 (Resolved): All tools should be named consistently and argument parsing should be better

Slowly some of the tools like ceph_objectstore_tool have migrated to have underscores in the name. But I noticed s...
David Zafman
04:42 PM Bug #9769: upgrade/firefly: latest_dumpling_release.yaml always fails
Tests are still failing, but at a different point.
See http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-14_14:22:47-u...
Yuri Weinstein
09:17 AM Bug #9769: upgrade/firefly: latest_dumpling_release.yaml always fails
From rerun with verbose:... Yuri Weinstein
08:21 AM Bug #9769 (In Progress): upgrade/firefly: latest_dumpling_release.yaml always fails
Running with verbose is on for job - http://qa-proxy.ceph.com/teuthology/sage-2014-10-13_20:46:44-upgrade:firefly-fir... Yuri Weinstein
06:01 AM Bug #9769 (Resolved): upgrade/firefly: latest_dumpling_release.yaml always fails
... Sage Weil
04:00 PM Bug #9408 (Fix Under Review): erasure-code: misalignment
Loïc Dachary
03:57 PM Bug #9700 (Resolved): cephtool mon_osd intermittent failure
I've not seen errors since this patch, except for firefly builds because this was not backported. Feel free to re-ope... Loïc Dachary
03:06 PM Bug #9388 (Duplicate): osd/PG.cc: 2945: FAILED assert(r == 0) in update_snap_map
David Zafman
03:01 PM Bug #9390 (Duplicate): EEXIST on split due to import/export
David Zafman
03:00 PM Bug #7588 (Resolved): OSD Seg fault in string assign ObjectOperation::C_ObjectOperation_copyget::...
Sage Weil
02:59 PM Bug #9729: "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x:parallel-giant-dis...
The corresponding line of code in master branch for test/librados/misc.cc was changed by Josh in Feb:
7a019b38 src/t...
David Zafman
02:51 PM Bug #9729: "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x:parallel-giant-dis...
Same issue in run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-13_17:10:01-upgrade:dumpling-firefly-x:paral... Yuri Weinstein
02:59 PM rgw Bug #9774 (Won't Fix): multi-version: giant rgw throws 500 with dumpling osds
Sage Weil
09:42 AM rgw Bug #9774 (Won't Fix): multi-version: giant rgw throws 500 with dumpling osds
... Sage Weil
02:58 PM Bug #9757: mon: loops on osd pool create
final commit is cf4e30095e8149d1df0f2c9b4c93c9df0779ec84 Sage Weil
02:57 PM Bug #9757 (Resolved): mon: loops on osd pool create
Loïc Dachary
02:31 PM Bug #9757 (Fix Under Review): mon: loops on osd pool create
Loïc Dachary
02:28 PM Bug #9757: mon: loops on osd pool create
This bug is dated October 12th with https://github.com/ceph/ceph/commit/0c1eafd7ab6f7d2a5eccd10ce267bde5e90932c5 whic... Loïc Dachary
01:51 PM Bug #9757: mon: loops on osd pool create
* https://github.com/ceph/ceph/commit/fe43202449e3caf60e796f1205ef4303e905659d does not need to be backported because... Loïc Dachary
01:40 PM Bug #9757: mon: loops on osd pool create
"mon/OSDMonitor : Use user provided ruleset for replicated pool":https://github.com/ceph/ceph/commit/cf4e30095e8149d1... Loïc Dachary
01:18 PM Bug #9757: mon: loops on osd pool create
... Loïc Dachary
12:54 PM Bug #9757: mon: loops on osd pool create
This was run using the following backport https://github.com/ceph/ceph/commits/wip-9757 Loïc Dachary
02:47 PM Feature #9781 (Resolved): ceph_objectstore_tool: On import handle splits

Once we have OSDMap information we need to check for splits during pg import:
Sam:
Upon import, if we detect a ...
David Zafman
02:45 PM Feature #9780 (Resolved): ceph_objectstore_tool: Add OSDMap information to pg export
Gather appropriate OSDMap information and include in export data.
David Zafman
02:43 PM Fix #7711: OpTracker output doesn't include op size for subops
I didn't do this back then. We should get to it, though. Greg Farnum
02:14 PM Linux kernel client Feature #9779: libceph: sync up with objecter
Make sure not to break existing (correct!) behavior: we need to resend watch or notify when *any* member of the actin... Ilya Dryomov
02:10 PM Linux kernel client Feature #9779 (Resolved): libceph: sync up with objecter
- the way we resend lingering requests isn't quite the same
- __map_request() is too aggressive about resending:
...
Ilya Dryomov
02:04 PM Linux kernel client Bug #8806 (Resolved): libceph: must use new tid when watch is resent
Ilya Dryomov
02:02 PM Fix #9778 (New): forbid erasure code profile modifications that can modify data encoding
even if --force is set in erasure-code-profile set because it can corrupt the content of the erasure coded pool. For ... Loïc Dachary
01:25 PM Feature #9449 (Resolved): mon: make ceph -s break more things onto multiple lines (health blurbs,...
Sage Weil
01:24 PM Feature #9598 (Fix Under Review): re-enable Objecter fast dispatch
Sage Weil
01:24 PM Fix #9194 (In Progress): librados/osd: watch reconnect needs to be exclusive to detect possibly m...
Sage Weil
01:12 PM Feature #9776 (New): try to make address sanitizer work
Samuel Just
12:44 PM Feature #9198 (Fix Under Review): librados: notify callback includes gid of notifier
Sage Weil
12:44 PM Feature #9197 (Fix Under Review): librados/osd: notify reply payload
Sage Weil
12:43 PM Feature #8899 (Resolved): Kerberos/LDAP Support:: mon: define mon role capabilities
Sage Weil
12:29 PM rgw Bug #9763 (Resolved): firefly upgrade tests fail s3tests, apache goes away
https://github.com/ceph/s3-tests/commit/7e7457e1af8481cf111f25edab198d7498e18551 Sage Weil
12:19 PM rgw Bug #9763: firefly upgrade tests fail s3tests, apache goes away
looks like a bad test in s3-tests Sage Weil
11:13 AM Bug #5925: hung ceph_test_rados_delete_pools_parallel
David Zafman
08:47 AM Bug #5925: hung ceph_test_rados_delete_pools_parallel
I've been reproducing this reliably with wip-9321.giant. Hung job in plana12.... Joao Eduardo Luis
11:12 AM Bug #9696 (Resolved): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(wan...
Sage Weil
10:30 AM rbd Bug #5977 (Resolved): librbd: python bindings need docstrings to show up in online docs
commit:7022679e2c76c707d3d28c052045d11736582b3a Josh Durgin
08:11 AM rbd Bug #5977: librbd: python bindings need docstrings to show up in online docs
PR: https://github.com/ceph/ceph/pull/2720 Jason Dillaman
08:10 AM rbd Bug #5977 (In Progress): librbd: python bindings need docstrings to show up in online docs
Jason Dillaman
09:28 AM rbd Fix #7787: rbd diff takes longer as images grow larger
Dependent on issue #4087 Jason Dillaman
09:26 AM rbd Feature #7746: Capacity Management: rbd df
Dependent on issue #4087 Jason Dillaman
09:25 AM rbd Feature #7746 (In Progress): Capacity Management: rbd df
Jason Dillaman
09:07 AM rbd Feature #7746 (Fix Under Review): Capacity Management: rbd df
Ian Colle
09:13 AM rbd Bug #8329 (Need More Info): qemu-img rpm provided breaks snapshooting functionality on centos
Andrija, according to Bugzilla, the availability of the "-s" option in qemu-img was a backporting bug and was effecti... Jason Dillaman
09:12 AM Linux kernel client Feature #190 (Resolved): krbd: DISCARD support
Ian Colle
09:09 AM rbd Feature #8902 (Fix Under Review): rbd mirroring: librbd: funnel snapshot, resize events via lock ...
Josh Durgin
09:08 AM rgw Cleanup #9772 (In Progress): rgw: reorganize RGWRados
Yehuda Sadeh
09:07 AM rgw Cleanup #9772 (Resolved): rgw: reorganize RGWRados
need to clean up the different states, separate access to system objects vs data objects. Yehuda Sadeh
09:07 AM rbd Feature #8900 (Fix Under Review): rbd mirroring: librbd:making image locking mandatory
Ian Colle
09:07 AM rbd Feature #4087 (Fix Under Review): rbd: bitmaps for tracking object existence
Ian Colle
09:06 AM rgw Feature #9013: rgw: set civetweb as a default frontend
Done, commit:63d0ec7b2c00b7f9515d492009115d87414a77ab. Yehuda Sadeh
09:02 AM rbd Bug #9771 (Won't Fix): Segmentation fault after upgrade v0.80.5 -> v0.80.6
This is a new test that upgrades from v0.80.4 -> v0.80.5 -> v0.80.4->firefly and runs different workloads after each step.
...
Yuri Weinstein
07:32 AM Bug #9731: Ceph 0.80.6 OSD crashes
And here is another core file from another server. The backtrace in the log looks like a different path to me. Brad House
07:13 AM Bug #9731: Ceph 0.80.6 OSD crashes
The Debian Wheezy build server doesn't seem to be online yet so I haven't been able to test your patch.
However, I...
Brad House
06:20 AM rbd Feature #9733: Separate rbd listing into CAP
I apologize, I thought I mentioned that you need to use RBD image format 2, but re-reading my comments I seemed to ha... Jason Dillaman
06:20 AM CephFS Feature #9755: Fence late clients during reconnect timeout
There can be certain cases where a client can reconnect after being evicted, e.g. if:
* the client didn't hold an...
John Spray
03:53 AM Fix #9767 (New): do not leak ceph-disk activate lock to the OSD
The "activate_lock":https://github.com/ceph/ceph/blob/giant/src/ceph-disk#L1997 will leak to the OSD. This is harmles... Loïc Dachary
03:19 AM rgw Bug #9766 (Rejected): s3tests: test_100_continue failing
Trying out s3tests on a ceph (0.80.5) cluster with ... Abhishek Lekshmanan
01:38 AM Bug #9761: ceph-osd: segfault at 654c30 ip 00007f00dc5f1f07 sp 00007f00c5642e00 error 7 in ld-2.1...
I cannot repeat this voluntarily. And debug.20 eats up space. Pavel Veretennikov
01:34 AM Bug #9761: ceph-osd: segfault at 654c30 ip 00007f00dc5f1f07 sp 00007f00c5642e00 error 7 in ld-2.1...
Pavel Veretennikov wrote:
> Just found this error in the logs. Ceph 0.80.6, Ubuntu 14.04, kernel 3.13.0-36-generic
...
Irek Fasikhov
01:18 AM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
Looking at the ceph-osd.2.log uploaded by Sahana.
Prior to the reported problem, there was one more crash while merg...
Varada Kari
12:42 AM Linux kernel client Bug #9749 (Resolved): kcephfs: kernel divide-by-zero crash in __validate_layout (fs/ceph/ioctl.c)
fixed by "ceph: fix divide-by-zero in __validate_layout()" in the testing branch Zheng Yan
12:03 AM rgw Feature #8316: Ceilometer support for RGW Swift statistics
If we want to support ceilometer, can't we support the statistics of both S3 & swift APIs? Abhishek Lekshmanan

10/13/2014

08:45 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
I added a new test (#9758) and am testing it on ceph-qa-suites branch 'wip_9758' which is doing step upgrades v0.80.4-v0... Yuri Weinstein
10:52 AM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
wip-9731-firefly does not have this patch. Samuel Just
05:40 PM rgw Bug #9763 (Resolved): firefly upgrade tests fail s3tests, apache goes away
... Sage Weil
04:50 PM CephFS Feature #414 (Fix Under Review): ceph-fuse: implement file locking
Zheng Yan
04:42 PM devops Bug #9747: ceph.spec.in will always use 95-ceph-osd-alt.rules
running gitbuilder Loïc Dachary
04:40 PM devops Bug #9747 (Fix Under Review): ceph.spec.in will always use 95-ceph-osd-alt.rules
* backported to giant already
* firefly backport https://github.com/ceph/ceph/pull/2717
Loïc Dachary
08:16 AM devops Bug #9747 (Pending Backport): ceph.spec.in will always use 95-ceph-osd-alt.rules
Sage Weil
03:05 PM rbd Feature #9733: Separate rbd listing into CAP
Using those caps does not allow the kernel client to mount the image:
[root@nodezz ~]# ceph auth caps client.rdleb...
Robert LeBlanc
02:20 PM rbd Feature #9733: Separate rbd listing into CAP
I've looked over that document a few times, but I'm not finding specifics about things like "object_prefix", "rbd_hea... Robert LeBlanc
01:59 PM rbd Feature #9733: Separate rbd listing into CAP
Yes, you should be able to use different users within Nova, Cinder, and Glance config files. The capability grammar ... Jason Dillaman
01:16 PM rbd Feature #9733: Separate rbd listing into CAP
Let me try this and see if it will do what we think. I don't know enough about the OpenStack side, but I hope we can... Robert LeBlanc
01:08 PM rbd Feature #9733: Separate rbd listing into CAP
The RBD image directory is stored within an object named 'rbd_directory' in each pool. You could create a capspec wh... Jason Dillaman
12:52 PM CephFS Feature #9755: Fence late clients during reconnect timeout
Hmm, I like the basic thrust of this, but I'm a little concerned as well — we have other tickets to let clients recon... Greg Farnum
03:39 AM CephFS Feature #9755 (Resolved): Fence late clients during reconnect timeout

During reconnect, MDSs terminate the sessions of any clients which fail to reconnect within the window. Because wh...
John Spray
12:45 PM Linux kernel client Feature #190: krbd: DISCARD support
This should go upstream to Linus in the next day or two (for 3.18-rc1). Sage Weil
11:29 AM Linux kernel client Feature #190: krbd: DISCARD support
Alphe Salas wrote:
> I agree with Kyle and Brian. This feature is necessary. I would like to have more information a...
Alphe Salas
11:27 AM Linux kernel client Feature #190: krbd: DISCARD support
I agree with Kyle and Brian. This feature is necessary. I would like to have more information about the status of thi... Alphe Salas
12:28 PM rbd Fix #7787 (In Progress): rbd diff takes longer as images grow larger
Jason Dillaman
12:17 PM Bug #9761 (Rejected): ceph-osd: segfault at 654c30 ip 00007f00dc5f1f07 sp 00007f00c5642e00 error ...
Just found this error in the logs. Ceph 0.80.6, Ubuntu 14.04, kernel 3.13.0-36-generic
Nothing special in the ceph...
Pavel Veretennikov
12:03 PM devops Bug #9760 (Rejected): librados2 fails to install from ceph-qa
Which is causing lots of failures in ceph-deploy's test runs
Example full log of one of the failures: http://qa-pr...
Alfredo Deza
10:48 AM Bug #9731: Ceph 0.80.6 OSD crashes
Samuel Just
10:19 AM Linux kernel client Bug #9749: kcephfs: kernel divide-by-zero crash in __validate_layout (fs/ceph/ioctl.c)
Ilya Dryomov
09:59 AM Bug #9714 (Duplicate): Dead jobs in upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-mu...
i think this is a dup of #9757 Sage Weil
09:48 AM Bug #9714: Dead jobs in upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-multi run
Sam, can you take a look at this?
Still an issue in one off run - http://qa-proxy.ceph.com/teuthology/teuthology-2...
Yuri Weinstein
09:35 AM rbd Bug #9742: `rbd map lun` fails with: (2) No such file or directory on kernel 3.14.14 w/ udev-216 ...
Is this running inside a container? This looks like a problem with the authentication keys and there is a known issu... Sage Weil
09:32 AM Bug #9744 (Won't Fix): cephx: verify_reply couldn't decrypt with error: error decoding block for ...
this happens when clocks are very skewed. Sage Weil
08:59 AM rbd Bug #9602 (Closed): rbd export -> nc ->rbd import = memory leak
Irek, thanks for the update. Closing as not a bug. Jason Dillaman
04:43 AM rbd Bug #9602: rbd export -> nc ->rbd import = memory leak
Jason Dillaman wrote:
> I quickly attempted to reproduce this on the same version w/o success. Can you attach /etc/...
Irek Fasikhov
08:47 AM Documentation #9730: ceph-deploy mon create-inital, does not take arguments
merged commit eb27245 into master Alfredo Deza
08:46 AM Documentation #9730 (Resolved): ceph-deploy mon create-inital, does not take arguments
Merged the pull request. John Wilkins
08:21 AM Documentation #9730 (In Progress): ceph-deploy mon create-inital, does not take arguments
PR opened https://github.com/ceph/ceph/pull/2714 Alfredo Deza
08:34 AM Bug #9757: mon: loops on osd pool create
also breaking teuthology-2014-10-09_19:30:01-upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-multi Sage Weil
06:31 AM Bug #9757 (Resolved): mon: loops on osd pool create
http://pulpito.ceph.com/sage-2014-10-12_09:13:46-upgrade:dumpling-x-wip-sam-firefly-testing-distro-basic-multi/541361... Sage Weil
07:22 AM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
I have submitted the following patches:
Update s3-tests with the new small size multipart tests:
https://github.c...
Luis Pabon
06:56 AM Cleanup #9756: Issues found by Clang
start with
https://github.com/ceph/autobuild-ceph/blob/master/build-ceph.sh
and make a build-ceph-clang.sh. ...
Sage Weil
04:50 AM Cleanup #9756 (In Progress): Issues found by Clang
I again [1] used Clang with -Weverything [2] to compile the Ceph repository [3].
There is still a huge amount of ser...
Daniel Hofmann
06:22 AM rbd Bug #8329: qemu-img rpm provided breaks snapshooting functionality on centos
Any info on this? Can we at least define some preferred way of enabling qemu-img/kvm to speak to Ceph (Do it your self... Andrija Panic
03:16 AM CephFS Feature #9754 (Resolved): A 'fence and evict' client eviction command

Currently the "session evict" operation on the MDS admin socket will terminate the session, and release any capabil...
John Spray
01:07 AM Linux kernel client Bug #9355 (Closed): rbd: map fails with EINVAL inside a container
Opened #9753. Ilya Dryomov
01:06 AM Linux kernel client Feature #9753 (Resolved): libceph: allow custom network namespaces
See the bottom of #9355. Ilya Dryomov
12:57 AM Linux kernel client Bug #9192: krbd: poor read (about 10%) vs write performance
Hi Eric,
Thanks for doing this. I was concerned about this being a regression after the queueing changes, but it ...
Ilya Dryomov

10/12/2014

10:07 PM Bug #9614: PG stuck with remapped
The original fix was not clean, just added a new pull request: https://github.com/ceph/ceph/pull/2711 Guang Yang
09:06 PM Bug #9215: Ceph Firefly 0.80.5 : OSD flapping too frequently
karan singh wrote:
> You can close this case , problem has been solved after applying fix (0.80.5-1-gc4b77d2)
May...
Wang Qiang
06:43 PM Bug #9731: Ceph 0.80.6 OSD crashes
I already saw the commit in branch. The status of this issue should not be New. Wang Qiang
03:23 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Results for the run teuthology-2014-10-11_19:00:02-upgrade:dumpling-x-wip-9731-firefly-distro-basic-multi
Still jo...
Yuri Weinstein
02:44 PM Linux kernel client Bug #9192: krbd: poor read (about 10%) vs write performance
I was able to get some dedicated test time on one of our Ceph test clusters to rerun the kernel RBD read/write tests ... Eric Eastman
12:09 PM Bug #9752 (Resolved): acting in past intervals contains primary and up_primary (looks like duplic...
In a 0.80.6 in the context of http://tracker.ceph.com/issues/9750 the following showed up (the full output can be fou... Loïc Dachary
11:20 AM Bug #9751: ceph tell osd.6 version hangs
here is the gdb output of the OSD process that fails to answer to ceph tell Loïc Dachary
10:18 AM Bug #9751: ceph tell osd.6 version hangs
attaching the log with lockdep = true, starting from when the osd boots up to the point where ceph tell blocks forever Loïc Dachary
10:10 AM Bug #9751: ceph tell osd.6 version hangs
greg : the log is from when the osd started up to the point where ceph tell hangs Loïc Dachary
09:51 AM Bug #9751: ceph tell osd.6 version hangs
Maybe similar to #9748 and #9714 ? Yuri Weinstein
09:49 AM Bug #9751: ceph tell osd.6 version hangs
Was the OSD already "hanging" when you generated this log? Greg Farnum
09:21 AM Bug #9751 (Rejected): ceph tell osd.6 version hangs
... Loïc Dachary
10:05 AM Bug #9718 (Fix Under Review): osd_types: check_new_interval: min_size check needs to consider CRU...
Sage Weil
09:36 AM Bug #9750: pg incomplete
I guess it's not a bug indeed, only the logical outcome of something going wrong. What is probably a bug is having th... Loïc Dachary
09:32 AM Bug #9750 (Won't Fix): pg incomplete
Loïc Dachary
09:07 AM Bug #9750: pg incomplete
So you don't actually think it's a bug?
In any case, if you still have the disk accessible, you probably want to u...
Greg Farnum
08:54 AM Bug #9750: pg incomplete
osd.3 has failed recently (this morning) because btrfs turned it read-only. it is likely that it contains the missing... Loïc Dachary
08:45 AM Bug #9750: pg incomplete
What did you *do* to this cluster? I don't think these PGs are supposed to have historical acting sets that look like... Greg Farnum
08:01 AM Bug #9750 (Won't Fix): pg incomplete
... Loïc Dachary

10/11/2014

10:33 PM Linux kernel client Bug #9749 (Resolved): kcephfs: kernel divide-by-zero crash in __validate_layout (fs/ceph/ioctl.c)
Our UC-KLEE tool discovered a Linux kernel divide-by-zero crash in the Ceph
client driver. I found the bug on kernel...
David Ramos
03:52 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Sage has scheduled a run on wip-9731-firefly http://pulpito.front.sepia.ceph.com/teuthology-2014-10-10_16:50:01-upgrade... Yuri Weinstein
03:35 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
pre-firefly mons I think would also suffice to cause this bug. Actually, if you upgrade the osds from pre-firefly to... Samuel Just
03:29 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Can you rerun with wip-sam-firefly-testing? (actually, ignore the firefly branch for the moment and use wip-sam-firef... Samuel Just
01:05 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
I guess it's expected as backport is still pending.
Update:
In the run http://pulpito.front.sepia.ceph.com/teutho...
Yuri Weinstein
01:10 PM Bug #7588 (Pending Backport): OSD Seg fault in string assign ObjectOperation::C_ObjectOperation_c...
This actually doesn't seem to have been backported to firefly. I think it might be causing some of the cache/tiering... Samuel Just
01:01 PM Bug #9748 (Rejected): Dead jobs in upgrade:dumpling-x-firefly-distro-basic-multi run
Jobs '537916', '537917'
Logs are in:
http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-10_19:00:01-upgrade...
Yuri Weinstein
12:54 PM Bug #9714: Dead jobs in upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-multi run
Same problem in run - http://pulpito.front.sepia.ceph.com/teuthology-2014-10-10_19:00:01-upgrade:dumpling-x-firefly-d... Yuri Weinstein
09:27 AM devops Bug #9747 (Fix Under Review): ceph.spec.in will always use 95-ceph-osd-alt.rules
Loïc Dachary
09:22 AM devops Bug #9747: ceph.spec.in will always use 95-ceph-osd-alt.rules
https://github.com/ceph/ceph/pull/2706 Loïc Dachary
09:12 AM devops Bug #9747 (Resolved): ceph.spec.in will always use 95-ceph-osd-alt.rules
In ceph.spec.in *%if (0%{?rhel} || 0%{?rhel} < 7)* "see sources":https://github.com/ceph/ceph/blob/giant/ceph.spec.in... Loïc Dachary
09:08 AM Bug #9746 (Resolved): reconcile upstream ceph.spec.in with other ceph.spec (SuSE, EPEL, etc)
There are many differences between the "epel ceph.spec":https://dl.fedoraproject.org/pub/epel/7/SRPMS/c/ceph-0.80.5-8... Loïc Dachary
08:32 AM devops Bug #9721 (Rejected): partx -a should be called after creating the data partition
The diagnosis is incorrect. At the time the data partition is created it does not make sense to try to activate it b... Loïc Dachary

10/10/2014

08:51 PM Bug #9716 (Resolved): Warning in API headers when compiling with -Wstrict-prototypes
commit:d98b75530b0ea8f243a4dc8e1881bc6da2bca99d Josh Durgin
02:10 PM Bug #9716 (Fix Under Review): Warning in API headers when compiling with -Wstrict-prototypes
https://github.com/ceph/ceph/pull/2701 Adam Crume
01:44 PM Bug #9716: Warning in API headers when compiling with -Wstrict-prototypes
Forgot to mention that qemu uses -Werror by default, hence the errors. Adam Crume
08:45 PM Bug #8983 (Resolved): rados bench -b option does not take orders of magnitude (k,M,..) but also d...
commit:3b9dcff7755a3ffcb9df8a06e6d0e525e77de641 Josh Durgin
02:13 PM Bug #8983: rados bench -b option does not take orders of magnitude (k,M,..) but also does not thr...
https://github.com/ceph/ceph/pull/2678 Adam Crume
08:12 PM Bug #9143 (Rejected): Incorrect key sequence in encoding object name to key for GenericObjectMap
Haomai Wang
07:21 PM Bug #9731: Ceph 0.80.6 OSD crashes
wip-firefly-9696-9731 should have a fix for this as well as 9696. Let me know whether that helps. Samuel Just
03:11 PM Bug #9731: Ceph 0.80.6 OSD crashes
I think this is a bug in PGLog::IndexedLog::trim(). Making patch. Samuel Just
11:44 AM Bug #9731: Ceph 0.80.6 OSD crashes
Attached ceph OSD log from crash with debugging turned on. Brad House
11:03 AM Bug #9731: Ceph 0.80.6 OSD crashes
I will add those to my configuration and restart ceph on each node.
Luckily this is just my test environment.
Brad House
10:47 AM Bug #9731: Ceph 0.80.6 OSD crashes
Can you reproduce either of these with logging?
debug osd = 20
debug filestore = 20
debug ms = 1
Samuel Just
09:45 AM Bug #9731 (Can't reproduce): Ceph 0.80.6 OSD crashes
I received 2 different crashes on 2 different OSDs on different nodes within 30s of each other on 0.80.6. I just upgr... Brad House
06:41 PM Bug #9744: cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
I think I found the problem: new node (with new OSD) had incorrect time.
Everything returned to normal after correct...
Dmitry Smirnov
06:10 PM Bug #9744: cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
Found the following in the logs of the new OSD:... Dmitry Smirnov
05:57 PM Bug #9744 (Won't Fix): cephx: verify_reply couldn't decrypt with error: error decoding block for ...
Shortly after upgrading from 0.80.5 to 0.80.6, the cluster became slow and then almost completely stopped
with several OSDs exhi...
Dmitry Smirnov
05:45 PM rgw Bug #9307 (Resolved): "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-fir...
above errors from yuri are #9169.. something else Sage Weil
08:10 AM rgw Bug #9307: "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-firefly-x-mast...
Same issues in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-09_19:00:01-upgrade:dumpling-x-firefly-distro-b... Yuri Weinstein
05:00 PM Fix #9199 (In Progress): librados: watch linger pings need to verify pg mapping hasn't changed
Sage Weil
04:59 PM Fix #9196 (In Progress): librados: watch_check() to synchronous verify we haven't missed notifies
Sage Weil
04:59 PM Fix #8905: msgr: encode osd epoch in nonce to avoid misc OSD reconnect races
Sage Weil
04:56 PM Bug #9706 (Fix Under Review): osdc/Objecter.cc: 1570: FAILED assert(op->session)
Sage Weil
09:26 AM Bug #9706 (In Progress): osdc/Objecter.cc: 1570: FAILED assert(op->session)
tick() locking is broken Sage Weil
04:55 PM rgw Bug #7796 (Pending Backport): RGW Keystone token auth fails with '411 Length Required' when Keyst...
Sage Weil
04:11 PM CephFS Bug #9679: Ceph hadoop terasort job failure
I do believe that Hadoop kills the clients after they reach a point that the run-time believes everything has been fl... Noah Watkins
02:02 PM CephFS Bug #9679: Ceph hadoop terasort job failure
Looking at the bad client (11139), the first thing I notice is that the messaging is way backed up. What's the networ... Greg Farnum
09:13 AM CephFS Bug #9679: Ceph hadoop terasort job failure
Here is the directory listing. All of the files should be the same size.... Noah Watkins
03:07 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
commit:82175ec94acc89dc75da0154f86187fb2e4dbf5e Josh Durgin
03:06 PM rbd Bug #9513 (Pending Backport): rbd_cache=true default setting is degading librbd performance ~10X ...
Josh Durgin
01:50 PM rbd Bug #9742 (Resolved): `rbd map lun` fails with: (2) No such file or directory on kernel 3.14.14 w...
when trying to map a standard rbd image as a block device, the command fails with (2) No such file or directory.
I...
Adeel N
01:27 PM Feature #9741 (Closed): teuthology-suite: allow scheduling sub-suites
Samuel Just
01:18 PM Bug #9443: btrfs pwrite returns EEXIST on journal FileJournal::write_bl
Don't run on that kernel. :(
(My understanding is that they have a fix in testing and it shouldn't be in an actual r...
Greg Farnum
01:04 PM Bug #9443: btrfs pwrite returns EEXIST on journal FileJournal::write_bl
Is there a workaround? Loïc Dachary
01:03 PM Bug #9740 (Duplicate): FileJournal::do_write assert(0)
Loïc Dachary
12:59 PM Bug #9740 (Duplicate): FileJournal::do_write assert(0)

http://pulpito.ceph.com/loic-2014-10-10_08:45:20-rados:thrash-erasure-code-isa-master-testing-basic-vps/536207/
...
Loïc Dachary
01:00 PM Bug #9696 (Pending Backport): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED as...
Samuel Just
10:11 AM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Ok, can you reproduce with the logging above? Samuel Just
01:09 AM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Also, could either Loïc or Sam explain what exact combination of circumstances causes this assert to trigger? I can't... Florian Haas
12:59 AM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Sam, I can confirm with certainty that this did *not* happen during an upgrade from dumpling. All nodes were running ... Florian Haas
12:09 PM Bug #9739 (Won't Fix): rados cli: listsnaps does not list snaps
To reproduce:... Adam Crume
12:04 PM Bug #9738 (Won't Fix): rados cli: objects not present in a snapshot are listed anyway
To reproduce:... Adam Crume
11:54 AM Bug #9737 (Resolved): rados cli: --snapid (not --snap) option is broken
Running "rados --pool mypool --snapid 1 ls" (assuming 1 is a valid snap number) crashes without printing or returning... Adam Crume
11:34 AM RADOS Bug #9736 (New): rados cli doesn't print specific usage errors
If a user executes e.g. "rados lssnap", the command prints out the usage information. However, it does not say that ... Adam Crume
11:23 AM Bug #9735 (Resolved): "rados lock list" doesn't output final end-of-line
The rados command-line utility doesn't output an end-of-line character at the end of the output of the "lock list" co... Adam Crume
11:10 AM Bug #9729: "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x:parallel-giant-dis...
David, is this related to some code you were working on? Please take a look and reassign if necessary. Yuri Weinstein
09:06 AM Bug #9729 (Resolved): "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x:paralle...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-10_08:19:51-upgrade:dumpling-firefly-x:parallel-gi... Yuri Weinstein
10:59 AM Feature #6258: ceph-disk: zap should wipefs
A user in the #ceph-devel channel had issues: it didn't matter that he tried to zap the disk, the filesystem was st... Alfredo Deza
10:25 AM rbd Feature #9733 (New): Separate rbd listing into CAP
We are concerned that if the key is compromised in our OpenStack environment, then all images in the pool can be list... Robert LeBlanc
09:46 AM Bug #9732 (Resolved): ReplicatedPG::hit_set_trim osd/ReplicatedPG.cc: 11006: FAILED assert(obc)
The timezone of the machine was incorrect: CDT instead of CEST. All other machines (MON and OSD) are on CEST.
On a ...
Loïc Dachary
09:26 AM Documentation #9730 (Resolved): ceph-deploy mon create-inital, does not take arguments
It uses the same hosts that were passed into `ceph-deploy new {HOSTS}`
But this section says the user should pas...
Alfredo Deza
09:15 AM Feature #9728: erasure-code: jerasure support for NEON
https://github.com/ceph/ceph/pull/2694 Loïc Dachary
09:02 AM Feature #9728 (Resolved): erasure-code: jerasure support for NEON
Work done by Janne Grunau @ https://github.com/jannau/ceph/compare/neon . It will be available in Hammer. Loïc Dachary
08:50 AM rbd Feature #8902: rbd mirroring: librbd: funnel snapshot, resize events via lock holder
WIP branch: https://github.com/ceph/ceph/compare/wip-8902 Jason Dillaman
08:46 AM Bug #9702: "MaxWhileTries: 'wait_until_healthy'reached maximum tries" in upgrade:firefly-x-giant-...
Same in run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-08_19:20:02-upgrade:firefly-x-giant-distro-basic-m... Yuri Weinstein
08:46 AM Bug #9703: "Segmentation fault" in upgrade:firefly-x-giant-distro-basic-multi run
Same in run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-08_19:20:02-upgrade:firefly-x-giant-distro-basic-m... Yuri Weinstein
08:45 AM devops Bug #9724: VPS machines not being locked "No route to host"
Why does this keep happening? Zack Cerza
08:44 AM devops Bug #9724: VPS machines not being locked "No route to host"
correct URL:
http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-09_19:00:01-upgrade:dumpling-x-firefly-distro-ba...
Zack Cerza
07:59 AM devops Bug #9724 (Rejected): VPS machines not being locked "No route to host"
In the run http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-09_19:00:01-upgrade:dumpling-x-firefly-distro-basic... Yuri Weinstein
08:27 AM Bug #9727 (Duplicate): 0.86 EC+ KV OSDs crashing
Hi, testing our tiering setup with EC+KV backend a bit further on the latest dev release, our OSDs started to crash a... Kenneth Waegeman
08:03 AM devops Bug #9725 (Won't Fix): Error "'sudo yum install ceph-radosgw-0.67.11 -y'"in upgrade:dumpling-x-fi...
In run http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-09_19:00:01-upgrade:dumpling-x-firefly-distro-basic-vps... Yuri Weinstein
07:18 AM CephFS Bug #9692 (Resolved): ACL workunit syntax error
Zheng Yan
07:05 AM rgw Feature #9723 (New): Support metering info
Add object storage metering support similar to OpenStack Swift Ceilometer. It should be pluggable with OpenStack ce... Swami Reddy
05:39 AM devops Bug #9721 (Fix Under Review): partx -a should be called after creating the data partition
https://github.com/ceph/ceph/pull/2648 and https://github.com/dachary/ceph/commit/81d6c5b5a33de745ae4a23536409de0c0e7... Loïc Dachary
05:26 AM devops Bug #9721: partx -a should be called after creating the data partition
Loïc Dachary
05:18 AM devops Bug #9721 (Rejected): partx -a should be called after creating the data partition
In the following udev is racing with the creation of the partition:... Loïc Dachary
03:23 AM Feature #9720 (Resolved): erasure-code: non regression should test jerasure variants
check that content encoded with one variant exactly matches content encoded with another variant Loïc Dachary
12:36 AM rgw Feature #8052: Support for Keystone Identity API v3
From swift 2.2.0 changelog:
* Added support for Keystone v3 auth.
Keystone v3 introduced the concept ...
Dag Stenstad

10/09/2014

05:14 PM Bug #9718 (Resolved): osd_types: check_new_interval: min_size check needs to consider CRUSH_ITEM_...
Samuel Just
04:40 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
wip-9696-firefly removes the assert on firefly; it's not valid for the compat case. Samuel Just
04:32 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
https://github.com/ceph/ceph/pull/2684/files Samuel Just
04:31 PM Bug #9696 (Fix Under Review): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED as...
Samuel Just
04:31 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Can you restart one of the crashing osds with
debug osd = 20
debug filestore = 20
debug ms = 1 ?
As far as we...
Samuel Just
04:26 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Loïc Dachary
04:25 PM Bug #9696 (Fix Under Review): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED as...
https://github.com/ceph/ceph/pull/2684 Loïc Dachary
04:20 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Samuel Just
04:16 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
running in gitbuilder under the branch wip-9696-compat-acting Loïc Dachary
03:56 PM Bug #9696 (Fix Under Review): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED as...
Loïc Dachary
03:38 PM Bug #9696 (In Progress): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(...
https://github.com/ceph/ceph/pull/2682 Loïc Dachary
03:01 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
It actually failed a new test case AFTER it went out into a stable release version. Ian Colle
02:28 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Whoa, wait -- Loïc, are you saying this actually failed a test case and still made it into a release in a stable vers... Florian Haas
02:23 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
For the record http://tracker.ceph.com/issues/9715 hits the same assert in similar conditions in teuthology and the f... Loïc Dachary
03:46 PM rgw Bug #7796 (Fix Under Review): RGW Keystone token auth fails with '411 Length Required' when Keyst...
Yehuda Sadeh
03:32 PM Bug #9715: assert(want_acting_backfill.size() - want_backfill.size() == num_want_acting) firefly
sjust: I think it's due to the compatibility thing where we include the backfill peer in the acting set if there are ... Loïc Dachary
03:11 PM Bug #9715: assert(want_acting_backfill.size() - want_backfill.size() == num_want_acting) firefly
I see the change (92cfd370) that added the assert and didn't consider "compat_mode." In older OSDs we only have one ... David Zafman
02:20 PM Bug #9715 (Duplicate): assert(want_acting_backfill.size() - want_backfill.size() == num_want_acti...
Loïc Dachary
02:08 PM Bug #9715: assert(want_acting_backfill.size() - want_backfill.size() == num_want_acting) firefly
" assert(want_acting_backfill.size() - want_backfill.size() == num_want_acting);":https://github.com/ceph/ceph/blob/f... Loïc Dachary
09:51 AM Bug #9715 (Duplicate): assert(want_acting_backfill.size() - want_backfill.size() == num_want_acti...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-08_19:30:01-upgrade:dumpling-firefly-x:stress-spli... Yuri Weinstein
03:12 PM Bug #8983 (Fix Under Review): rados bench -b option does not take orders of magnitude (k,M,..) bu...
Adam Crume
02:26 PM devops Bug #9712: ceph.com is not accessible from IPv6 only environments
Works for me (native v6 ip).
Thanks!
Gleb Borisov
12:23 PM devops Bug #9712 (Resolved): ceph.com is not accessible from IPv6 only environments
This is me. Somehow the ipv6 ip address became unassigned to the ceph.com dedicated server in DH's database. Not sure... Sandon Van Ness
10:29 AM devops Bug #9712: ceph.com is not accessible from IPv6 only environments
Apologies if somebody else should be handling this, but I think it's yours? :) Greg Farnum
07:48 AM devops Bug #9712 (Resolved): ceph.com is not accessible from IPv6 only environments
Some time ago we found that we can't connect to ceph.com:443. Moreover, we can't ping it either.... Gleb Borisov
01:14 PM Bug #9711 (Duplicate): 'cache' osd crash on ceph 0.86
Loïc Dachary
01:29 AM Bug #9711 (Duplicate): 'cache' osd crash on ceph 0.86
In a tiering setup (cache + EC on KV), one cache OSD crashed after about 12 hours of testing with rados bench.
Stackt...
Kenneth Waegeman
01:13 PM Bug #9480: OSD is crashing while object deletion
It's back http://tracker.ceph.com/issues/9711 Loïc Dachary
12:07 PM CephFS Bug #9679: Ceph hadoop terasort job failure
empty fs:... Noah Watkins
08:21 AM CephFS Bug #9679: Ceph hadoop terasort job failure
Thanks Huamin. Yeah, it looks like some writes are being lost, probably due to an unclean shutdown. I'll get some trac... Noah Watkins
08:06 AM CephFS Bug #9679: Ceph hadoop terasort job failure
For comparison, teragen files on CephFS
./hadoop/bin/hadoop fs -ls /in-dir-3
14/10/09 08:05:05 WARN util.NativeC...
Huamin Chen
07:04 AM CephFS Bug #9679: Ceph hadoop terasort job failure
Ran the same tests on HDFS 2.4.1, though on a different setup. Terasort finished without any problem.
Cmd:
./hado...
Huamin Chen
11:46 AM rbd Feature #2467 (In Progress): qemu: implement bdrv_invalidate_cache
Patch sent to qemu-devel@nongnu.org. Adam Crume
10:50 AM Bug #9716 (Resolved): Warning in API headers when compiling with -Wstrict-prototypes
Configuring qemu fails because of the compile errors:... Adam Crume
09:40 AM Bug #9714 (Duplicate): Dead jobs in upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-mu...
Run http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-08_19:30:01-upgrade:dumpling-firefly-x:stress-split-giant-... Yuri Weinstein
09:30 AM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
Thanks for the update, Ilya! You actually gave me a hint as to a workaround - run the container with `--net host` so ... Chris Armstrong
08:59 AM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
The... Ilya Dryomov
09:23 AM rbd Feature #8902 (In Progress): rbd mirroring: librbd: funnel snapshot, resize events via lock holder
... also include flatten. Jason Dillaman
09:22 AM rbd Feature #8900 (In Progress): rbd mirroring: librbd:making image locking mandatory
Jason Dillaman
08:28 AM Bug #9610: Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)...
Ian Colle
08:11 AM Bug #9610: Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)...
Still an issue: http://pulpito.front.sepia.ceph.com/teuthology-2014-10-08_23:20:03-multi-version-giant-distro-basic-m... Yuri Weinstein
08:16 AM rgw Bug #9612 (New): "ERROR: test suite for <module 's3tests.functional'" in multi-version-giant-test...
Still an issue: http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-08_23:20:03-multi-version-giant-distro-basic-m... Yuri Weinstein
08:09 AM Bug #9705 (Duplicate): "RadosModel.h: 829: FAILED assert(0)" in multi-version-giant-distro-basic-...
Yuri Weinstein
07:06 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
I've just updated pull request 2419 with a more complete fix for the issue.
I was now able to reproduce 100% when my...
Sebastien Ponce
02:52 AM Bug #9327: Usability Issue: Ceph-deploy does not print all the commands which it is executing
This issue was seen in the 0.84 build; can we cross-check once? Ramakrishnan P
02:30 AM Feature #9420 (Fix Under Review): erasure-code: tools and archive to check for non regression of ...
The gitbuilders have been updated; it is ready for review. Loïc Dachary
12:28 AM Bug #9077: Cluster is up in MON node even if Ceph is uninstalled in OSD node
What will be the state of the OSD in a 3-node cluster? In a 3-node cluster there will be other OSDs running on other nodes, so... Ramakrishnan P

10/08/2014

11:08 PM CephFS Bug #9679: Ceph hadoop terasort job failure
missing one of these?... Noah Watkins
10:46 PM CephFS Bug #9679: Ceph hadoop terasort job failure
My bet at this point is on the generation of the input data set. Teragen creates a file with X 100-byte entries. When ... Noah Watkins
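A rough way to sanity-check generated input against the 100-byte entry size mentioned above. This is only a sketch: it assumes each output file holds whole 100-byte entries and nothing else, and the helper names (`expected_size`, `truncated_bytes`) are hypothetical, not part of Hadoop or Ceph.

```python
ROW_BYTES = 100  # size of one Teragen entry, per the comment above


def expected_size(rows):
    """Bytes a file should occupy if it holds `rows` complete entries."""
    return rows * ROW_BYTES


def truncated_bytes(size_bytes):
    """Bytes belonging to a trailing partial entry (0 if the file is whole)."""
    return size_bytes % ROW_BYTES
```

A non-zero result from `truncated_bytes` for any input file would suggest that file ends in a partial entry, which would be consistent with writes being lost.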
07:57 PM Bug #9559: ?off-by-one vulnerability?ceph-0.80.5/src/common/fd.cc dump_open_fds() function
Please give me a CVE ID, thanks. qinghao tang
04:23 PM Bug #9630 (Need More Info): osd: leaked pg refs on shutdown (dumpling)
I'm out of ideas but happy to keep exploring if someone has a lead. If this happens again cross referencing the logs ... Loïc Dachary
04:12 PM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
"OSD::shutdown":https://github.com/ceph/ceph/blob/dumpling/src/osd/OSD.cc#L1521 clear() the "finished":https://github... Loïc Dachary
03:21 PM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
... Loïc Dachary
01:53 PM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
The last thing that happened to pg 2.15 was... Loïc Dachary
01:35 PM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
Log lines related to pg 2.15... Loïc Dachary
11:38 AM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
It could not be in_progress_splits: the logs do not contain the word *split* Loïc Dachary
11:18 AM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
It is the same assert as http://tracker.ceph.com/issues/7891 but the PGBackend did not exist at the time, therefore t... Loïc Dachary
11:03 AM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
The osd.2 actually crashed with:... Loïc Dachary
10:07 AM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
The full valgrind report from remote/vpm180/log/valgrind/osd.2.log.gz... Loïc Dachary
03:58 PM Bug #8595: osd: client op blocks until backfill starts (dumpling)
It seems that we need to backport the update_range/scan_range changes (intended to avoid backfill related flushes) fr... Samuel Just
01:17 PM Bug #8595: osd: client op blocks until backfill starts (dumpling)
I think the least distasteful solution is to actually backport the last_backfill_started modifications. I'll start t... Samuel Just
03:38 PM rbd Feature #7272 (Duplicate): rbd: import performance
Josh Durgin
03:14 PM rbd Feature #7272: rbd: import performance
Reads are single-threaded, but writes are asynchronous, so multiple could be in flight at once. (In rbd.cc, do_impor... Adam Crume
11:53 AM rbd Bug #9513 (Fix Under Review): rbd_cache=true default setting is degading librbd performance ~10X ...
Adam Crume
11:06 AM Bug #9496 (Resolved): mon: pg scrub timestamps must be populated at pg creation
Samuel Just
11:04 AM Bug #9128 (Resolved): Newly-restarted OSD may suicide itself after hitting suicide time out value...
Samuel Just
11:04 AM Bug #9419 (Pending Backport): dumpling->firefly upgrade, sending setallochint?
Samuel Just
10:59 AM rbd Feature #8900: rbd mirroring: librbd:making image locking mandatory
WIP branch: https://github.com/ceph/ceph/compare/wip-8900 Jason Dillaman
09:37 AM rbd Bug #9642: Errors in test_rbd.test_* tests in upgrade:dumpling-firefly-x:parallel-giant-distro-ba...
Yuri Weinstein
08:49 AM Bug #9706 (Resolved): osdc/Objecter.cc: 1570: FAILED assert(op->session)
This was actually on wip-sam-testing, but does not appear related to any of the patches.
ubuntu@teuthology:/a/samu...
Samuel Just
08:43 AM rgw Bug #9307 (New): "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-firefly-...
I see the same issues on giant run:
http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-07_15:54:57-upgrade:dum...
Yuri Weinstein
08:31 AM Bug #9705 (Duplicate): "RadosModel.h: 829: FAILED assert(0)" in multi-version-giant-distro-basic-...
Looks similar to #9528 (no root issue mentioned)
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-1...
Yuri Weinstein
08:05 AM Bug #9703 (Resolved): "Segmentation fault" in upgrade:firefly-x-giant-distro-basic-multi run
I see coredump on mira076 client.1 (@*/531751/remote/mira076/@), but could not get any info about it.
Logs are in ...
Yuri Weinstein
08:00 AM Bug #9702 (Duplicate): "MaxWhileTries: 'wait_until_healthy'reached maximum tries" in upgrade:fire...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-06_19:20:01-upgrade:firefly-x-giant-distro-basic-m... Yuri Weinstein
07:56 AM Bug #9700: cephtool mon_osd intermittent failure
Waiting about a week to see if it shows up again. Loïc Dachary
07:30 AM Bug #9700 (Fix Under Review): cephtool mon_osd intermittent failure
https://github.com/ceph/ceph/pull/2670... Loïc Dachary
07:23 AM Bug #9700: cephtool mon_osd intermittent failure
The osd 1 goes down during the following. Reading the script and what it does I can imagine why. Unless osd.1 dies be... Loïc Dachary
06:57 AM Bug #9700: cephtool mon_osd intermittent failure
ENXIO is expected when ceph tell tries to join an osd that is not ready and it should be treated as EAGAIN. If it hap... Loïc Dachary
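The idea of treating ENXIO like EAGAIN can be sketched as a small retry wrapper. This is an illustration only, not the actual cephtool test code or the change in the pull request: `call_with_retry` and `RETRYABLE` are hypothetical names, and the default of five attempts simply mirrors the 5-time retry mentioned in the original report.

```python
import errno
import time

# Assumption: errnos that mean "not ready yet, try again".
RETRYABLE = {errno.EAGAIN, errno.ENXIO}


def call_with_retry(fn, attempts=5, delay=1.0):
    """Call fn(), retrying when it raises OSError with a retryable errno.

    Hypothetical helper for illustration; not taken from the Ceph tree.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except OSError as err:
            # Give up on non-retryable errors or once attempts run out.
            if err.errno not in RETRYABLE or attempt == attempts:
                raise
            time.sleep(delay)
```

Any other errno is re-raised immediately, so a genuine failure is not hidden behind the retries.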
06:11 AM Bug #9700 (Resolved): cephtool mon_osd intermittent failure

Hit this one time on a gitbuilder: it's not clear to me why we have a 5-time retry here: some timeout raciness in t...
John Spray
07:28 AM CephFS Feature #9437 (Resolved): make 'ceph tell mds.* ...' work, deprecate 'ceph mds tell * ...'
... John Spray
05:29 AM devops Support #8861: Deploying additional monitors fails.
My workaround was to declare all monitors before install, and install all monitors at once. Pretty sure if I ne... Bobby Yakov
02:39 AM devops Support #8861: Deploying additional monitors fails.
As per my update in #5195:
Same here. I have run through the latest quick start documentation and am using Ubuntu ...
Matthew Rees
05:16 AM devops Bug #9697 (Rejected): exitcode of gatherkeys has changed the latests versions
Hi,
We've been using ceph-deploy in a deployment component, and we also use the gatherkeys function.
In some earl...
Kenneth Waegeman
02:37 AM Bug #5195: "ceph-deploy mon create" fails when adding additional monitors
Same here. I have run through the latest quick start documentation and am using Ubuntu 14.04.1 and Ceph firefly with ... Matthew Rees
02:15 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
Finally I can reproduce it! I know, I've already said that and was wrong...
Actually I still don't manage with the ...
Sebastien Ponce
12:35 AM Bug #9696 (Resolved): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(wan...
After an upgrade from 0.80.5 to 0.80.6, almost *all* OSDs went down after hitting the following failed assertion:
...
Florian Haas
12:31 AM Bug #9408 (In Progress): erasure-code: misalignment
Loïc Dachary
12:15 AM Bug #9677 (Resolved): osd_disk_thread_ioprio_class is ignored
Loïc Dachary

10/07/2014

10:01 PM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
Josh Durgin wrote:
> The issue was reported on firefly - does it have the same behavior as master, or is there somet...
Luis Pabon
05:48 PM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
The issue was reported on firefly - does it have the same behavior as master, or is there something that should be ba... Josh Durgin
05:30 PM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
Here is the response from the gateway:... Luis Pabon
12:56 PM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
Maybe the problem is that we don't send the xml body with the appropriate error? Yehuda Sadeh
12:52 PM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
I have added a test to s3-test to check for EntityTooSmall and it *passes* on the current code. According to AWS an ... Luis Pabon
09:50 PM Feature #7104: rest-api: support commands requiring 'w' cap without 'rw' cap
I'm happy to redefine the permissions if and when that becomes an option/requirement, but until then, it seems like t... Dan Mick
09:42 PM Feature #7104: rest-api: support commands requiring 'w' cap without 'rw' cap
The immediate issue was resolved by switching it to rw (or so my code check and utter lack of memory tells me). But I... Greg Farnum
09:28 PM Feature #7104: rest-api: support commands requiring 'w' cap without 'rw' cap
Well, hang on a minute...the question is about the nature of the command, which is totally mds-specific, not rest-api... Dan Mick
07:07 AM Feature #7104: rest-api: support commands requiring 'w' cap without 'rw' cap
I don't know that this is still a bug, but since it was a REST api issue I don't think it belongs in the MDS tracker ... Greg Farnum
07:28 PM CephFS Bug #9692 (Resolved): ACL workunit syntax error
http://pulpito.ceph.com/gregf-2014-10-06_19:59:42-kcephfs-wip-9628-testing-basic-multi/531900... Greg Farnum
07:26 PM CephFS Bug #9628 (Resolved): mds: race between ms_handle_accept() and ms_handle_reset()
Merged to master in commit:1b7fae7b2953649564a9e226b4abedad0ce652cc Greg Farnum
05:51 PM rbd Bug #9513 (In Progress): rbd_cache=true default setting is degading librbd performance ~10X in Giant
The regression was introduced in commit 4fc9fffc494abedac0a9b1ce44706343f18466f1 (according to git bisect). This is ... Adam Crume
04:33 PM RADOS Bug #9606: mon: ambiguous error_status returned to user when type is wrong in a command
Regardless of this being properly parsed on the client or not, the monitor should not rely on client argument validat... Joao Eduardo Luis
04:28 PM Bug #9496 (Fix Under Review): mon: pg scrub timestamps must be populated at pg creation
https://github.com/ceph/ceph/pull/2663
also in wip-sam-testing
Joao Eduardo Luis
01:46 PM Bug #9496: mon: pg scrub timestamps must be populated at pg creation
Kind of odd, last scrub timestamp should never be 0. Samuel Just
02:35 PM Bug #9416 (Duplicate): ods crash in upgrade:dumpling-dumpling-distro-basic-vps run
Samuel Just
02:35 PM Fix #9689 (New): ceph df reports % of global size used instead of MAX AVAIL 0.80.6
Running the ceph df command returns a much lower %USED than expected. Instead of reporting %USED of MAX AVAIL, which ... Heath Jepson
02:33 PM Bug #9503 (Pending Backport): Dumpling: removing many snapshots in a short time makes OSDs go ber...
Samuel Just
01:57 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
https://github.com/ceph/ceph/pull/2659 Samuel Just
02:08 PM Bug #9203 (Resolved): ceph_test_rados: ObjectDesc::iterator::advance(bool): Assertion `pos < limi...
Sage Weil
11:25 AM Bug #9203 (Fix Under Review): ceph_test_rados: ObjectDesc::iterator::advance(bool): Assertion `po...
Samuel Just
02:06 PM Bug #9113 (Pending Backport): osd: snap trimming eats memory, linearly
Sage Weil
11:25 AM Bug #9113 (Fix Under Review): osd: snap trimming eats memory, linearly
Samuel Just
02:04 PM Bug #9626 (Pending Backport): PG: cancel backfill reservations if we get a cancel during backfill
Sage Weil
11:25 AM Bug #9626 (Fix Under Review): PG: cancel backfill reservations if we get a cancel during backfill
Samuel Just
01:57 PM Bug #7368: ceph osd repair * blocks after some minutes and prevent other ceph pg repair commands
Loic, If I understand correctly, #9566 is "normal" backfilling, and Sage's explanation is clear. In my case, I had lo... Yann Dupont
01:40 PM Bug #7368 (Can't reproduce): ceph osd repair * blocks after some minutes and prevent other ceph p...
Samuel Just
01:51 PM Bug #9467 (Won't Fix): Delete default erasure coded profile getting succeeded
This looks like exactly how it is supposed to work? Samuel Just
01:49 PM Bug #9434: rbd rm hangs
Are your pgs clean? (ceph -s) Samuel Just
01:43 PM Bug #9551 (Duplicate): "Segmentation fault" in upgrade:firefly-firefly-testing-basic-vps run
Samuel Just
01:42 PM Bug #2848 (Won't Fix): OSDMap: pool_id is 64-bit, but pool_max is 32-bit
Samuel Just
01:32 PM Bug #9515 (Duplicate): "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:...
Ok, hopefully fixed this time. Samuel Just
01:28 PM Bug #8822 (Resolved): osd: hang on shutdown, spinlocks
Samuel Just
01:25 PM Bug #9181: Osd: segv in OpTracker::unregister_inflight_op
Somnath Roy wrote:
> Sam,
> This core is different and happening on Firefly. The other optracker fixes should also ...
Somnath Roy
01:25 PM Bug #9181: Osd: segv in OpTracker::unregister_inflight_op
Sam,
This core is different and happening on Firefly. The other optracker port should also be backported to Firefly ...
Somnath Roy
01:15 PM Bug #9181 (Resolved): Osd: segv in OpTracker::unregister_inflight_op
I think this got fixed with the other optracker fix? Samuel Just
01:22 PM Bug #9661 (Resolved): ceph_objectstore_tool doesn't work with memstore
6067f295e7bc571b43aa891f5560d96933721b19 David Zafman
01:20 PM Bug #9682 (Duplicate): "os/FileJournal.cc: 1677: FAILED assert(0)" in upgrade:firefly-firefly-dis...
Samuel Just
10:55 AM Bug #9682 (Duplicate): "os/FileJournal.cc: 1677: FAILED assert(0)" in upgrade:firefly-firefly-dis...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-05_10:00:04-upgrade:firefly-firefly-distro-basic-m... Yuri Weinstein
01:19 PM Bug #9683 (Duplicate): "Segmentation fault" in upgrade:firefly-firefly-distro-basic-multi run
Samuel Just
10:58 AM Bug #9683 (Duplicate): "Segmentation fault" in upgrade:firefly-firefly-distro-basic-multi run
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-05_10:00:04-upgrade:firefly-firefly-distro-basic-m... Yuri Weinstein
01:16 PM Bug #8333 (Can't reproduce): ceph_test_rados_delete_pools_parallel: Received fewer notifies than ...
Samuel Just
01:12 PM Bug #9128: Newly-restarted OSD may suicide itself after hitting suicide time out value because it...
Samuel Just
01:10 PM Bug #9322 (In Progress): OSDMap updates from pgmap can be delayed indefinitely
Joao Eduardo Luis
01:10 PM Bug #9321 (In Progress): pgmap updates from OSDMap can be delayed indefinitely
Joao Eduardo Luis
01:09 PM Bug #6101 (Can't reproduce): ceph-osd crash on corrupted store
Samuel Just
12:28 PM Bug #9582 (Resolved): librados: segmentation fault on timeout
I believe all patches affecting firefly and dumpling have been backported. Sage Weil
11:41 AM Bug #9582 (Pending Backport): librados: segmentation fault on timeout
Sage Weil
11:42 AM Bug #9650 (Resolved): RWTimer cancel_event is racy
Sage Weil
11:29 AM Bug #8520 (Can't reproduce): osd: segv in PushOp::print()
Samuel Just
11:28 AM Bug #9008 (Pending Backport): Objecter: pg listing can deadlock when throttling is in use
Samuel Just
11:28 AM Bug #9417 (Duplicate): "Segmentation fault" in upgrade:dumpling-giant-x-master-distro-basic-vps run
Samuel Just
11:26 AM Bug #9614: PG stuck with remapped
Samuel Just
11:03 AM Bug #9684 (Can't reproduce): "Scrubbing terminated" in upgrade:firefly-firefly-distro-basic-multi...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-05_10:00:04-upgrade:firefly-firefly-distro-basic-m... Yuri Weinstein
10:28 AM rbd Bug #9642: Errors in test_rbd.test_* tests in upgrade:dumpling-firefly-x:parallel-giant-distro-ba...
Should be fixed by https://github.com/ceph/ceph-qa-suite/pull/169 Yuri Weinstein
09:54 AM Messengers Fix #9678 (Rejected): errno shadowed in Pipe.cc
Greg Farnum
07:41 AM Messengers Fix #9678: errno shadowed in Pipe.cc
If it is expected to see an error message when there is nothing to read, then I was mistaken.
Not retrieving the ...
Loïc Dachary
07:13 AM Messengers Fix #9678: errno shadowed in Pipe.cc
Where's it being reset? That error message is admittedly strange but it actually happens because the underlying funct... Greg Farnum
05:54 AM Messengers Fix #9678 (Rejected): errno shadowed in Pipe.cc
In some places errno is used after it has been reset and the original error code does not show in the message. For in... Loïc Dachary
09:54 AM rbd Bug #6926 (Resolved): rbd: diff output includes previously non-existent objects as zeroed extents
commit:9a1ab95176fe4d200a83b7b4f7e2b3097d541a7a Josh Durgin
09:54 AM CephFS Bug #9679: Ceph hadoop terasort job failure
https://issues.apache.org/jira/browse/MAPREDUCE-2018 Noah Watkins
09:53 AM CephFS Bug #9679: Ceph hadoop terasort job failure
https://svn.apache.org/repos/asf/hadoop/common/branches/MAPREDUCE-233/src/examples/org/apache/hadoop/examples/terasor... Noah Watkins
09:39 AM CephFS Bug #9679: Ceph hadoop terasort job failure
Teragen command:
./hadoop/bin/hadoop jar ./hadoop-2.4.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar t...
Huamin Chen
09:22 AM CephFS Bug #9679: Ceph hadoop terasort job failure
Thanks for adding this. What command did you use to generate the input? Noah Watkins
09:04 AM CephFS Bug #9679 (Closed): Ceph hadoop terasort job failure
Hadoop version: 2.4.1
Ceph version:
ceph --version
ceph version 0.85-986-g031ef05 (031ef0551ebc98d824075558e884...
Huamin Chen
09:36 AM rbd Bug #9680 (Duplicate): Errors in test_rbd.* in upgrade:dumpling-firefly-x:parallel-giant-distro-b...
Josh Durgin
09:28 AM rbd Bug #9680 (Duplicate): Errors in test_rbd.* in upgrade:dumpling-firefly-x:parallel-giant-distro-b...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-06_17:20:35-upgrade:dumpling-firefly-x:parallel-gi... Yuri Weinstein
09:32 AM Linux kernel client Bug #4689: libceph: don't have alloc_msg methods limit length
Related to #9560, #9561? Ilya Dryomov
09:28 AM rbd Bug #5768 (Resolved): rbd-fuse: leak in enumerate_images()
commit:9132ca47959ae1a9a658971b0c8f4fe6e8d0cad3 Josh Durgin
09:26 AM rbd Bug #7385: Objectcacher setting max object counts too low
Josh Durgin
09:24 AM rbd Bug #9391 (Need More Info): fio rbd driver rewrites same blocks
Josh Durgin
09:24 AM rbd Bug #9146 (Can't reproduce): EPERM from image_read.sh
Josh Durgin
09:22 AM rbd Bug #9602 (Need More Info): rbd export -> nc ->rbd import = memory leak
Josh Durgin
09:20 AM rgw Bug #9254 (Fix Under Review): rgw: civetweb requires explicit \r\n for http headers
Yehuda Sadeh
09:15 AM rgw Bug #9039 (Pending Backport): Using COPY on radosgw to copy object from one bucket to another tha...
Yehuda Sadeh
09:13 AM rgw Bug #8587 (Pending Backport): rgw: subuser object not created correctly
Yehuda Sadeh
09:13 AM rgw Bug #9155 (Resolved): Swift Subuser - 403 Forbidden - during upload/post
Fixed (#8587) Yehuda Sadeh
09:11 AM rgw Bug #5595 (Fix Under Review): object has a Content-Type, but its content_type property is not sho...
Josh Durgin
09:09 AM Bug #9677 (Pending Backport): osd_disk_thread_ioprio_class is ignored
Loïc Dachary
05:10 AM Bug #9677: osd_disk_thread_ioprio_class is ignored
https://github.com/ceph/ceph/pull/2654 Loïc Dachary
04:48 AM Bug #9677 (Resolved): osd_disk_thread_ioprio_class is ignored
The "osd_disk_thread_ioprio_class configuration option":http://ceph.com/docs/giant/rados/configuration/osd-config-ref... Loïc Dachary
09:00 AM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
Ilya Dryomov
08:54 AM Bug #9635 (Resolved): mon/Paxos.cc: 1033: FAILED assert(mon->is_leader())
Joao Eduardo Luis
07:03 AM CephFS Bug #9636 (Duplicate): segfault in CInode::get_caps_allowed_for_client
Greg Farnum
07:02 AM CephFS Bug #9562 (Resolved): Lockdep assertion in Filer purge
Backported to giant:... John Spray
07:02 AM CephFS Bug #8576 (Need More Info): teuthology: nfs tests failing on umount
Greg Farnum
06:50 AM Bug #6756: journal full hang on startup
... Sage Weil
06:37 AM Bug #6003: journal Unable to read past sequence 406 ...
... Sage Weil
06:32 AM Bug #9418 (Pending Backport): mon: drop internal-purpose messages from clients without proper caps
Sage Weil
03:55 AM Bug #9077: Cluster is up in MON node even if Ceph is uninstalled in OSD node
as per this document "http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/", that mon will get 3... Ramakrishnan P
03:53 AM Bug #9077: Cluster is up in MON node even if Ceph is uninstalled in OSD node
as per this document "http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/", that mon will get 3... Ramakrishnan P
01:15 AM Bug #9676 (Resolved): disk thread ioprio class misses osd
Loïc Dachary
01:10 AM Bug #9676 (Fix Under Review): disk thread ioprio class misses osd
https://github.com/ceph/ceph/pull/2653 Loïc Dachary
01:06 AM Bug #9676 (Resolved): disk thread ioprio class misses osd
http://ceph.com/docs/master/rados/configuration/osd-config-ref/ Loïc Dachary
12:28 AM Bug #9675: splitting a pool doesn't start when rule_id != ruleset_id
Sorry for formatting... should be like this:... Dan van der Ster
12:24 AM Bug #9675 (Resolved): splitting a pool doesn't start when rule_id != ruleset_id
commit:78e84f34da83abf5a62ae97bb84ab70774b164a6
Dumpling 0.67.10
Rule is like this:
{ "rule_id": 6,
...
Dan van der Ster

10/06/2014

10:53 PM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
Note that we only see the debug output when we're trying to write to the RBD bus directly on the host - from within t... Chris Armstrong
10:32 PM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
Here's some debugging after disabling auth.
As root on the CoreOS host, echoing directly into the RBD bus also doe...
Chris Armstrong
10:04 PM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
For posterity, recording my conversation with Josh here. http://irclogs.ceph.widodh.nl/index.php?date=2014-09-04
<...
Chris Armstrong
10:03 PM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
Seeing the same issue on a 3.16.2 kernel: ... Chris Armstrong
08:55 PM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
A fellow member of the CoreOS community is also running into this: https://groups.google.com/forum/#!topic/coreos-use... Chris Armstrong
06:27 PM CephFS Bug #9674: nightly failed multiple_rsync.sh
rsync return codes aren't standard error codes. The man page says that 23 means... Greg Farnum
05:59 PM CephFS Bug #9674: nightly failed multiple_rsync.sh
#define ENFILE 23 /* File table overflow */
maybe we should adjust ulimit
Zheng Yan
02:23 PM CephFS Bug #9674 (Resolved): nightly failed multiple_rsync.sh
http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-03_23:04:01-fs-giant-distro-basic-multi/527949/... Greg Farnum
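The two readings of 23 in this thread can be compared side by side. This is a minimal Python sketch, not part of the test suite: the table copies a few exit values from the rsync man page, `describe_exit` is a hypothetical helper name, and the errno reading assumes a Linux host.

```python
import errno

# A few exit values as listed in the rsync man page (not the full table).
RSYNC_EXIT_CODES = {
    0: "Success",
    23: "Partial transfer due to error",
    24: "Partial transfer due to vanished source files",
}


def describe_exit(code):
    """Return both interpretations of a numeric code.

    as_rsync: meaning if the number is an rsync exit status.
    as_errno: symbolic name if the number were a C errno value.
    """
    return {
        "as_rsync": RSYNC_EXIT_CODES.get(code, "unknown rsync exit code"),
        "as_errno": errno.errorcode.get(code, "unknown errno"),
    }


if __name__ == "__main__":
    print(describe_exit(23))
```

An exit status from rsync should be looked up in the rsync table; the errno name only applies when the number comes from a failed system call.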
05:52 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
Good to know that you are able to reproduce this.
I think the log entries you mentioned are there in Firefly as well...
Somnath Roy
03:09 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
I double-checked, and I had multiple versions of librbd on my path. (I forgot about installing one of them.) I remo... Adam Crume
01:14 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
Hopefully, you made sure the librbd/librados libraries fio_rbd is loading are from the giant. As I said, replacing th... Somnath Roy
11:55 AM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
Thanks for the details. Unfortunately, I'm still unable to reproduce the issue. Was your cluster created in Firefly... Adam Crume
03:52 PM devops Bug #9658 (Resolved): upgrade from dumpling to firefly is broken
Sage Weil
03:06 PM devops Bug #9658: upgrade from dumpling to firefly is broken
ubuntu@teuthology:/a/teuthology-2014-10-06_14:06:56-upgrade:dumpling-firefly-x:parallel-wip-9658-firefly-distro-basic... Tamilarasi muthamizhan
12:51 PM devops Bug #9658: upgrade from dumpling to firefly is broken
ubuntu@teuthology:/a/teuthology-2014-10-06_10:31:05-upgrade:dumpling-firefly-x:parallel-wip-9658-distro-basic-vps/529856 Tamilarasi muthamizhan
10:12 AM devops Bug #9658: upgrade from dumpling to firefly is broken
sure, am testing it now Tamilarasi muthamizhan
09:55 AM devops Bug #9658: upgrade from dumpling to firefly is broken
Tamil, this should be fixed in the wip-9658 branch.. can you test please? The firefly backport will be a bit differe... Sage Weil
09:48 AM devops Bug #9658: upgrade from dumpling to firefly is broken
Sage Weil
03:23 PM Bug #9203: ceph_test_rados: ObjectDesc::iterator::advance(bool): Assertion `pos < limit' failed.
Samuel Just
02:54 PM Bug #9419: dumpling->firefly upgrade, sending setallochint?
Notes on using feature bits already present. The problem is that CEPH_FEATURE_MSGR_KEEPALIVE2 was back ported, so we... David Zafman
02:17 PM Bug #9385 (Duplicate): ceph_test_rados: incorrect buffer at pos ...
Samuel Just
01:49 PM Documentation #9673 (Closed): Document ceph df numbers
We need to just write down what they mean. It's one of the first questions, and it's one of the hardest ones to answ... Dan Mick
11:29 AM Bug #9664 (Rejected): mon: ceph osd metada failure on centos7
it fails because the dockerized centos has a fake systemd http://jperrin.github.io/centos/2014/09/25/centos-docker... Loïc Dachary
11:02 AM Bug #9657 (Resolved): MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
No backport is needed; this is done. (commit:25bcc39bb809e2d13beea1529e4ab92d1b61fa5b) Greg Farnum
09:56 AM devops Tasks #9669 (Resolved): teuthology.front needs an upgrade
We need a newer libvirt version on the machine, and Ubuntu precise just doesn't contain what we need. I can't even se... Zack Cerza
09:39 AM devops Bug #9654 (Duplicate): "error: subprocess paste was killed by signal (Broken pipe)" in upgrade:du...
Sage Weil
09:39 AM devops Bug #9656: Remove conditional statement in ceph-radosgw startup script log section
Hmm yeah. I think the better solution would be to fix the /var/log/ceph (/var/log/radosgw?) permissions so that log ... Sage Weil
09:37 AM Bug #9663 (Resolved): Objecter assertion failure
Sage Weil
09:34 AM devops Bug #9667 (Duplicate): Missing packages in upgrade:dumpling-firefly-x:parallel-giant-distro-basic...
#9658 Sage Weil
08:45 AM devops Bug #9667 (Duplicate): Missing packages in upgrade:dumpling-firefly-x:parallel-giant-distro-basic...
This looks like a dupe of #9640 but a different version.
Job: http://qa-proxy.ceph.com/teuthology/teuthology-2014-...
Yuri Weinstein
09:33 AM Bug #9668 (Rejected): osd killed by ABRT from FAILED assert
this is almost certainly either the max files ulimit (ulimit -n , see max open files = ... in ceph.conf) or /proc/sys... Sage Weil
09:00 AM Bug #9668 (Rejected): osd killed by ABRT from FAILED assert
-----------
[Mon Oct 6 15:09:03 2014] init: ceph-osd (ceph/46) main process (3058) killed by ABRT signal
---------...
Sheldon Mustard
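The open-files limit mentioned in the reply above can be inspected from inside a process. This is a minimal Python sketch, assuming a Unix host; `open_file_limits` and `has_headroom` are hypothetical helper names, and the numbers in the usage line are illustrative, not taken from this report.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) RLIMIT_NOFILE values for this process.

    The soft value is what `ulimit -n` reports in a shell.
    """
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def has_headroom(needed, soft):
    """True if `needed` descriptors fit under the soft limit."""
    return soft == resource.RLIM_INFINITY or needed <= soft


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("soft:", soft, "hard:", hard)
    # Illustrative requirement only; a real daemon's needs depend on its workload.
    print("room for 4096 descriptors:", has_headroom(4096, soft))
```

A daemon that needs more descriptors than the soft limit allows will see open or socket calls fail, which is one way to end up in an assert.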
09:32 AM Bug #9655 (Resolved): tests: qa/workunits/cephtool/test.sh fails ENXIO
Loïc Dachary
08:38 AM devops Fix #9666 (Resolved): ceph-disk error when activate is missing an argument is cryptic
When the device argument is missing:... Loïc Dachary
08:14 AM devops Bug #9665: ceph-disk zap should call partprobe
https://github.com/ceph/ceph/pull/2648 Loïc Dachary
07:43 AM devops Bug #9665 (Resolved): ceph-disk zap should call partprobe
h3. User description
Symptoms:
* A disk is used by an OSD
* The OSD is no longer useful and the disk is clear...
Loïc Dachary
08:09 AM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
I believe the fact that the commit message for 255b430a87201c7d0cf8f10a3c1e62cbe8dd2d93 said @Backfill@ where it shou... Florian Haas
08:04 AM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
Hi Sam,
I think this is fixed in master/giant.. correct? Just a gentle reminder that we'd appreciate a backport in d...
Dan van der Ster
08:04 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
Hi Sam,
Same as for #9503
I think this is fixed in master/giant.. correct? Just a gentle reminder that we'd appreci...
Dan van der Ster
12:51 AM RADOS Bug #9492 (Resolved): Crush Mapper crashes when number of replicas is less than total number of o...
Loïc Dachary

10/05/2014

01:12 PM Bug #9663: Objecter assertion failure
Probably this... Noah Watkins
12:41 PM Bug #9663 (Resolved): Objecter assertion failure
In latest Giant build. This bug appears to be related to http://tracker.ceph.com/issues/9067... Noah Watkins
01:00 PM Bug #9664 (Rejected): mon: ceph osd metada failure on centos7
"qa/workunits/cephtool/test.sh":https://github.com/ceph/ceph/blob/master/qa/workunits/cephtool/test.sh#L665 consisten... Loïc Dachary

10/04/2014

12:50 PM RADOS Bug #9492 (Fix Under Review): Crush Mapper crashes when number of replicas is less than total num...
* firefly https://github.com/ceph/ceph/pull/2643
* giant https://github.com/ceph/ceph/pull/2642
running on http:/...
Loïc Dachary
12:33 PM RADOS Bug #9492: Crush Mapper crashes when number of replicas is less than total number of osds to be s...
Pull req info :
fix for firstn rules: https://github.com/ceph/ceph/pull/2568
fix for indep rules : https://githu...
Johnu George
04:41 AM RADOS Bug #9492 (Pending Backport): Crush Mapper crashes when number of replicas is less than total num...
I think both patches should be backported to giant and firefly. Would you like to do that ? It essentially means you ... Loïc Dachary
02:46 AM RADOS Bug #9492 (Resolved): Crush Mapper crashes when number of replicas is less than total number of o...
Loïc Dachary
02:38 AM Bug #9655 (Fix Under Review): tests: qa/workunits/cephtool/test.sh fails ENXIO
https://github.com/ceph/ceph/pull/2641 Loïc Dachary

10/03/2014

06:36 PM Bug #9657: MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
tested with wip-9657, fix works fine.
logs are copied to vpm102.front.sepia.ceph.com:/home/ubuntu/wip-9657
Tamilarasi muthamizhan
05:02 PM Bug #9657 (Pending Backport): MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
fix looks right. merged it into giant branch Sage Weil
04:11 PM Bug #9657: MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
https://github.com/ceph/ceph/pull/2640
Tamil will put it through the upgrade suite.
Greg Farnum
04:07 PM Bug #9657 (Fix Under Review): MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
Greg Farnum
04:05 PM Bug #9657: MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
Okay, it's because Message::encode() transmutes a compat_version of 0 into compat_version == HEAD_VERSION, and we are... Greg Farnum
03:59 PM Bug #9657 (In Progress): MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
Well, good news and bad news:
This is not a monitor bug, and my initial guess is that it will only affect clusters r...
Greg Farnum
11:11 AM Bug #9657 (Resolved): MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-01_19:20:01-upgrade:firefly-x-giant-distro-basic-m... Yuri Weinstein
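The compat_version behaviour described in this thread can be modelled in a few lines. This is a simplified Python sketch, not the Ceph code: `encode_header` and `can_decode` are hypothetical names, the receiver-side rule (reject a message whose compat_version exceeds the newest version the receiver understands) is an assumption, and the version numbers are illustrative.

```python
def encode_header(head_version, compat_version):
    """Model of the sender side: a compat_version of 0 is replaced
    by head_version, as described in the comment above."""
    if compat_version == 0:
        compat_version = head_version
    return head_version, compat_version


def can_decode(receiver_max_version, compat_version):
    """Assumed receiver rule: decoding works only if the receiver
    understands at least compat_version."""
    return compat_version <= receiver_max_version


if __name__ == "__main__":
    # Sender at version 3 that left compat_version unset (0).
    _, compat = encode_header(3, 0)
    print("older receiver can decode:", can_decode(2, compat))
    # Same sender with an explicit, lower compat_version.
    _, compat = encode_header(3, 1)
    print("older receiver can decode:", can_decode(2, compat))
```

Under these assumptions, leaving compat_version at 0 makes a newer message unreadable to an older peer, while setting it explicitly keeps the message decodable.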
05:04 PM devops Bug #9658: upgrade from dumpling to firefly is broken
possible debian fix in wip-9658. asked ceph-maintainers and branto for review and help with the spec file change. Sage Weil
03:56 PM devops Bug #9658 (In Progress): upgrade from dumpling to firefly is broken
Tamilarasi muthamizhan
03:52 PM devops Bug #9658 (New): upgrade from dumpling to firefly is broken
sandon: looks like the problem is that a file that was in python-ceph was moved to ceph and apt is bailing due to over-wri... Tamilarasi muthamizhan
03:52 PM devops Bug #9658 (In Progress): upgrade from dumpling to firefly is broken
this is broken by commit:eb0f6e347969b40c0655d3165a6c4531c6b595a3, which is post 0.80.6. phew! yay testing. Sage Weil
02:49 PM devops Bug #9658 (Resolved): upgrade from dumpling to firefly is broken
This is definitely blocking the upgrade testing for giant.
logs: http://qa-proxy.ceph.com/teuthology/teuthology-20...
Tamilarasi muthamizhan
03:16 PM Bug #9661 (Fix Under Review): ceph_objectstore_tool doesn't work with memstore
David Zafman
03:07 PM Bug #9661 (Resolved): ceph_objectstore_tool doesn't work with memstore
A CephContext* isn't passed to ObjectStore::create() so MemStore::mount() crashes.
MemStore::set_allow_sharded_o...
David Zafman
03:04 PM devops Bug #9640: Missing packages in multi-version-giant-testing-basic-multi
Looks like there are two different issues here,
Update on "sudo apt-get update..."
Running the command line on a ma...
Yuri Weinstein
11:20 AM devops Bug #9640: Missing packages in multi-version-giant-testing-basic-multi
4 jobs failed in http://pulpito.front.sepia.ceph.com/teuthology-2014-10-01_23:20:03-multi-version-giant-distro-basic-... Yuri Weinstein
02:53 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
Please make sure you are following these steps..
1. Build the latest giant package both in cluster and client side...
Somnath Roy
01:35 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
In my test cluster, I'm getting the same performance with "rbd cache = false" and "rbd cache = true". Could you post... Adam Crume
02:50 PM CephFS Feature #9659 (Duplicate): MDS: support cache eviction
It would be really useful when writing certain kinds of tests (eg, for scrubbing) to be able to know that a particula... Greg Farnum
12:01 PM Bug #9653 (Resolved): ceph-disk: bootstrap-osd keyring ignores --statedir
Loïc Dachary
06:54 AM Bug #9653: ceph-disk: bootstrap-osd keyring ignores --statedir
* giant https://github.com/ceph/ceph/pull/2635
* firefly https://github.com/ceph/ceph/pull/2634
Loïc Dachary
05:13 AM Bug #9653 (Fix Under Review): ceph-disk: bootstrap-osd keyring ignores --statedir
https://github.com/ceph/ceph/pull/2633 Loïc Dachary
04:54 AM Bug #9653 (Resolved): ceph-disk: bootstrap-osd keyring ignores --statedir
... Loïc Dachary
11:40 AM Fix #9245 (Resolved): remove Monitor::osdmonitor_prepare_command
Loïc Dachary
10:01 AM Fix #9245 (Fix Under Review): remove Monitor::osdmonitor_prepare_command
giant backport https://github.com/ceph/ceph/pull/2637 Loïc Dachary
09:27 AM Fix #9245 (Pending Backport): remove Monitor::osdmonitor_prepare_command
Sage Weil
07:15 AM Fix #9245: remove Monitor::osdmonitor_prepare_command
https://github.com/ceph/ceph/pull/2636 Loïc Dachary
11:06 AM devops Bug #9656 (Rejected): Remove conditional statement in ceph-radosgw startup script log section
The startup script has a conditional statement to determine if a log file exists, and will touch and chown the log fi... Tupper Cole
10:43 AM Bug #9655 (Resolved): tests: qa/workunits/cephtool/test.sh fails ENXIO
... Loïc Dachary
10:34 AM Bug #8083: erasure-code: fix static code analysis errors found in gf-complete
For the record these are minor fixes and I expect to see them used when NEON is merged upstream and we update the jer... Loïc Dachary
10:12 AM Bug #8083 (Resolved): erasure-code: fix static code analysis errors found in gf-complete
merged https://bitbucket.org/jimplank/gf-complete/pull-request/24/static-code-analysis-fixes
Loïc Dachary
09:04 AM devops Bug #9654 (Duplicate): "error: subprocess paste was killed by signal (Broken pipe)" in upgrade:du...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-01_19:10:01-upgrade:dumpling-firefly-x:parallel-gi... Yuri Weinstein
07:36 AM rbd Feature #9374 (Resolved): rbd: use a rolling average for bench-write
commit:b47fdd400e14bd1b5e5bea9d18f895c92b8050be Jason Dillaman
07:17 AM Bug #9644 (Can't reproduce): ceph-disk not playing nice with test/erasure-code/test-erasure-code.sh
I tried with latest master and I'm no longer hitting it. I'm not sure if this was due to an environment issue or som... Joao Eduardo Luis
04:32 AM Bug #9644: ceph-disk not playing nice with test/erasure-code/test-erasure-code.sh
The CEPH_CONF and CEPH_ARGS are "taken care of when the test starts":https://github.com/ceph/ceph/blob/giant/src/test... Loïc Dachary
04:17 AM Bug #9644: ceph-disk not playing nice with test/erasure-code/test-erasure-code.sh
Could you also include the error you get? One idea that comes to mind is that test-erasure-code.sh does require au... Loïc Dachary
06:52 AM CephFS Bug #9636: segfault in CInode::get_caps_allowed_for_client
looks like it's the same as #9628 Zheng Yan
12:37 AM Bug #9619 (Can't reproduce): excessive mon memory usage when rbd rm 1PB
At 83% completion (rbd rm big)... Loïc Dachary

10/02/2014

08:53 PM Bug #9625 (Need More Info): firefly: memory corruption
Sage Weil
07:32 PM rgw Bug #9039: Using COPY on radosgw to copy object from one bucket to another that's in another pool...
sent a pull request now, #2632 Yehuda Sadeh
05:48 PM rgw Bug #9039: Using COPY on radosgw to copy object from one bucket to another that's in another pool...
not sure what the state of this bug is then.. yehuda? Sage Weil
06:01 PM Bug #9544 (Resolved): osd: pg deletion vs create race leads to EEXIST on mkcoll (dumpling)
Sage Weil
06:00 PM rbd Bug #6494 (Resolved): High memory consumption of qemu/librbd with enabled cache
ok did dumpling too Sage Weil
05:51 PM rbd Bug #6494: High memory consumption of qemu/librbd with enabled cache
backported to firefly. josh, should we do dumpling too? Sage Weil
05:46 PM rgw Bug #8621 (Resolved): civetweb frontend fails authentication if URL has special chars
a953b313f1e2f884be6ee2ce356780f4f70849dd Sage Weil
05:46 PM rgw Bug #8718 (Resolved): CORS OPTIONS request fails for presigned urls
6fee71154d838868807fd9824d829c8250d9d2eb Sage Weil
05:45 PM rgw Bug #8784 (Resolved): rgw: completion leak
b0d08aab837808f18708a4f8ced0503c0fce2fec Sage Weil
05:44 PM rgw Bug #9089 (Resolved): rgw: copy_obj_data() does not stripe target object
Sage Weil
05:44 PM rgw Feature #9200 (Resolved): rgw: log civetweb access
Sage Weil
05:42 PM rgw Bug #9206 (Resolved): rgw: cross rgw message headers filtered by apache 2.4
Sage Weil
05:41 PM rgw Bug #9353 (Resolved): Log files created under /var/log/radosgw/ do not have the .log extension
Sage Weil
05:37 PM rgw Bug #9148 (Resolved): rgw: multiregion tests failing, s3tests.functional.test_s3.test_region_copy...
Sage Weil
05:37 PM rgw Bug #9226 (Resolved): rgw: crash when copying specific objects
Sage Weil
05:36 PM rgw Bug #9208 (Resolved): rgw: civetweb does not drain request buffer correctly
Sage Weil
05:36 PM rgw Bug #9201 (Resolved): rgw: bad object with different pool alignment
Sage Weil
05:23 PM Feature #8391 (Resolved): sysvinit does not support custom cluster names
Sage Weil
05:22 PM Feature #8203 (Resolved): Replica setting values in df output
Sage Weil
05:22 PM Feature #7792 (Closed): leveldb 1.12.0 for rhel
Sage Weil
05:21 PM Feature #7344 (Resolved): osd: add additional heartbeat on cluster interface
Sage Weil
05:20 PM Feature #6261 (Resolved): ceph-filestore-dump use cases for disaster recovery
Sage Weil
05:18 PM Feature #5614 (Resolved): mon: enable moving pools to HASHPSPOOL mode
Sage Weil
05:15 PM Feature #4914 (Resolved): rados tool: read xattr from file / stdin
Sage Weil
05:14 PM Feature #4005: Add perftools to the kernel debian package script
Sage Weil
05:13 PM Feature #3345 (Resolved): support multiple clusters with sysvinit
Sage Weil
05:13 PM Feature #3340 (New): refuse to accept "cluster=foo" in ceph.conf
Sage Weil
05:13 PM Feature #3340 (Rejected): refuse to accept "cluster=foo" in ceph.conf
Sage Weil
05:13 PM Feature #3288 (Resolved): docs: document the chooseleaf command in crush
Sage Weil
05:12 PM Feature #3086 (Resolved): workqueue: dynamically adjust number of threads
Sage Weil
05:12 PM Feature #2894 (Resolved): cli: help command for ceph subsystems
Sage Weil
05:11 PM Feature #1880 (Rejected): osd: optionally log all request latencies
Sage Weil
05:11 PM Messengers Feature #1851 (Rejected): SimpleMessenger: use non-blocking io
Sage Weil
05:10 PM Feature #1267 (Rejected): osd: rgw class to do acl check
Sage Weil
05:09 PM RADOS Feature #84 (Rejected): mon: auto adjust pg_num as pool grows
Sage Weil
05:08 PM Feature #2222 (Resolved): osd: distinguish between 'degraded' and 'misplaced'
Sage Weil
05:08 PM Feature #5907 (Resolved): permanently log all administrative actions
Sage Weil
05:07 PM Feature #3849 (Resolved): Track slow PGs and times OSDs marked down
Sage Weil
04:28 PM Feature #8560 (Resolved): mon: instrument paxos
Sage Weil
03:58 PM rgw Bug #9651 (Duplicate): RGW: Object Removal Atomicity
The issue appears when a system goes down while there are pending object deletions. The object can be removed but will... Tyler Brekke
03:30 PM Bug #9650: RWTimer cancel_event is racy
Sage Weil
03:30 PM Bug #9650 (Fix Under Review): RWTimer cancel_event is racy
wip-rwtimer Sage Weil
02:57 PM Bug #9650: RWTimer cancel_event is racy
The issue is that we execute events under a shared (read) lock, and we allow you to cancel them under a shared (read)... Sage Weil
01:55 PM Bug #9650 (Resolved): RWTimer cancel_event is racy
(in safe mode) we carry the rwlock for the callback. but we use a separate mutex to protect the events. and we can
...
Sage Weil
03:30 PM Bug #9582 (Fix Under Review): librados: segmentation fault on timeout
Sage Weil
10:23 AM Bug #9582 (In Progress): librados: segmentation fault on timeout
hmm, several failures on giant
ubuntu@teuthology:/var/lib/teuthworker/archive/samuelj-2014-10-01_18:59:42-rados-gi...
Sage Weil
03:29 PM rgw Bug #9307 (Resolved): "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-fir...
Sage Weil
09:31 AM rgw Bug #9307: "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-firefly-x-mast...
suite:upgrade:dumpling-x
In run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-01_19:00:02-upgrade:dumplin...
Yuri Weinstein
02:15 PM CephFS Bug #9514 (Resolved): ceph-fuse pjd test is failing in giant nightlies
Dumpling commit:5f601f099be98c2b061cc94fb06917e7543f3efe
Firefly commit:9fee8de25ab5c155cd6a3d32a71e45630a5ded15
Greg Farnum
01:56 PM Bug #8752: firefly: scrub/repair stat mismatch
I think I found where it is happening. For a while I was using Btrfs-based OSDs with journals on SSD-based ext4. For ... Dmitry Smirnov
11:58 AM Bug #9559: ?off-by-one vulnerability?ceph-0.80.5/src/common/fd.cc dump_open_fds() function
This was fixed in version 0.83 in commit 046c9769fc4eaffc1dd4a21b61c1c5696d537def, although I'm sure it could be back... Adam Crume
11:42 AM Bug #9649 (Can't reproduce): OSD hang in op_tp
ubuntu@teuthology:/a/samuelj-2014-10-01_18:59:42-rados-giant-wip-testing-old-vanilla-basic-multi/524982
valgrind, ...
Samuel Just
11:30 AM Bug #9626: PG: cancel backfill reservations if we get a cancel during backfill
Samuel Just
11:20 AM Feature #9647 (New): osd: hard cap on PGs per OSD
Sage Weil
11:00 AM devops Feature #9411: remove qemu symlink for librbd on rhel7.1 (and later)
This ticket is inaccurate.
The version of qemu-kvm that ships with base RHEL 6.x or 7.x does not and has no plans ...
Neil Levine
10:56 AM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
ubuntu@teuthology:/a/samuelj-2014-10-01_18:59:42-rados-giant-wip-testing-old-vanilla-basic-multi/524988 Samuel Just
10:51 AM devops Feature #3161 (Rejected): make gcov website public, via proxy on gitbuilder.sepia.ceph.com
Ian Colle
10:49 AM devops Feature #2663 (Closed): crowbar: UI for setting generic ceph.conf values
Neil Levine
10:48 AM devops Feature #2910 (Closed): crowbar: Use JBOD mode for ceph-osd
Neil Levine
10:46 AM devops Feature #8037 (Closed): Test leveldb 1.12 (or newer) and package as necessary
Ian Colle
10:46 AM devops Feature #3023 (Closed): juju: automated QA of OpenStack RBD integration
Neil Levine
10:46 AM devops Feature #3022 (Closed): juju: automated QA of Ceph
Neil Levine
10:46 AM devops Feature #2695 (Closed): crowbar: Automated QA
Neil Levine
10:45 AM devops Feature #3017 (Closed): juju: dev env setup
Neil Levine
10:45 AM devops Feature #3018 (Closed): juju: test deploy of openstack
Neil Levine
10:45 AM devops Feature #3020 (Closed): juju: change nova to use rbd
Neil Levine
10:44 AM devops Feature #7925 (In Progress): Feature: create new download.ceph.com site
Ian Colle
10:42 AM devops Feature #3021 (Closed): juju: change glance to use rbd
Neil Levine
09:42 AM Bug #9644 (Can't reproduce): ceph-disk not playing nice with test/erasure-code/test-erasure-code.sh
I haven't seen anyone complaining about this so either 1) no one is running this test, or 2) I'm the only one hitting... Joao Eduardo Luis
09:23 AM devops Bug #9643 (Rejected): Error "install ceph-devel-0.67.11 -y" in -upgrade:dumpling-x-firefly-distro...
In run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-01_19:00:02-upgrade:dumpling-x-firefly-distro-basic-vps... Yuri Weinstein
08:39 AM rbd Bug #9642 (Resolved): Errors in test_rbd.test_* tests in upgrade:dumpling-firefly-x:parallel-gian...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-01_15:06:04-upgrade:dumpling-firefly-x:parallel-gi... Yuri Weinstein
07:55 AM Bug #9619: excessive mon memory usage when rbd rm 1PB
The mon memory indeed grows but after 30 minutes running I'm not sure it is related. And it's growing slowly.... Loïc Dachary
06:45 AM Bug #9619 (New): excessive mon memory usage when rbd rm 1PB
Checking the OSD memory usage when the problem is MON growth is not a good idea. Loïc Dachary
06:35 AM Bug #9619 (Can't reproduce): excessive mon memory usage when rbd rm 1PB
With a vstart cluster with one monitor and three OSDs and... Loïc Dachary
07:42 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
I'm able to reproduce the problem with 0daddfbf1164d6ba3f38eee29d2f11acfa62f2b6 from your tree https://github.com/spo... Loïc Dachary
07:28 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
Damn... I was a bit too fast when I thought I was reproducing the issue !
I was indeed reproducing the original one,...
Sebastien Ponce
05:40 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
I've finally managed to reproduce it, thanks to Loic : the trick was Ubuntu + debug mode. Maybe you also need more th... Sebastien Ponce
05:34 AM Bug #8011 (Resolved): osd/ReplicatedPG.cc: 5244: FAILED assert(soid < scrubber.start || soid >= s...
I'm unable to reproduce it any more, assuming fixed. Dmitry Smirnov
05:33 AM Bug #8747: OSD crash on scrub:osd/ReplicatedPG.cc: 5297: FAILED assert(soid < scrubber.start || s...
I can't reproduce any more on 0.80.5 + Firefly HEAD as of 2014-09-16... Dmitry Smirnov

10/01/2014

05:14 PM Bug #9625: firefly: memory corruption
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-bug-9625-e/521446 Sage Weil
05:12 PM Bug #9625: firefly: memory corruption
hit it again (or something very similar):... Sage Weil
04:14 PM Bug #9617 (Pending Backport): objecter shutdown races with msg dispatch
Sage Weil
02:05 PM Bug #9617 (Fix Under Review): objecter shutdown races with msg dispatch
https://github.com/ceph/ceph/pull/2621 Josh Durgin
03:34 PM Feature #5035 (Resolved): rados: smarter localized reads
https://github.com/ceph/ceph/commit/22df77325165157c47bc782476e0e3ab9cf652c4 Loïc Dachary
03:17 PM devops Bug #9640 (Rejected): Missing packages in multi-version-giant-testing-basic-multi
In run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-01_14:25:21-multi-version-giant-distro-basic-multi/
...
Yuri Weinstein
02:18 PM Bug #9537 (Resolved): OSD crash after writing 10GB file onto EC Pool: FAILED assert(hinfo.get_tot...
Loïc Dachary
01:41 PM rbd Bug #8187 (Resolved): librbd: list_children() reports duplicates with cache pools
Adam Crume
11:56 AM rbd Bug #8187 (Fix Under Review): librbd: list_children() reports duplicates with cache pools
https://github.com/ceph/ceph/pull/2619 Adam Crume
12:53 PM Bug #9572 (Resolved): erasure-code: BlaumRoth default encoding regression
Loïc Dachary
12:51 PM Bug #9620 (Resolved): tests: qa/workunits/cephtool/test.sh race condition
Loïc Dachary
12:47 PM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
Demoting to Normal because it only happens in debug mode. Loïc Dachary
06:45 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
Loïc Dachary
02:45 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
Using ... Loïc Dachary
12:30 PM Bug #9570 (Need More Info): osd crash in FileJournal::WriteFinisher::entry() aio
I'm out of ideas. Hopefully there is enough background information to help with diagnosis if it resurfaces. Loïc Dachary
12:10 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
A theory that does not explain the problem, for the record.... Loïc Dachary
11:35 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
What probably happens is that "the aio_info":https://github.com/ceph/ceph/blob/giant/src/os/FileJournal.cc#L1303 that... Loïc Dachary
09:23 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
"linux 3.12.7":https://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.12.7 was released in January 2014. The patc... Loïc Dachary
09:17 AM Bug #9570 (In Progress): osd crash in FileJournal::WriteFinisher::entry() aio
debian/changelog does not show anything that suggests a bug was fixed in libaio after 0.3.109-2 which could relate to ... Loïc Dachary
08:52 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
linux kernel is 3.12.7 and libaio is 0.3.109-2ubuntu1 Sheldon Mustard
07:29 AM Bug #9570 (Need More Info): osd crash in FileJournal::WriteFinisher::entry() aio
Loïc Dachary
06:40 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
The "aio: v4 ensure access to ctx->ring_pages is correctly serialised for migration":https://git.kernel.org/cgit/linu... Loïc Dachary
06:22 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
Although "aio: protect reqs_available updates from changes in interrupt handlers":https://git.kernel.org/cgit/linux/k... Loïc Dachary
05:45 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
It turns out that the alignment requirement has to be enforced indeed. On a 3.13 linux kernel the following:... Loïc Dachary
10:42 AM CephFS Bug #9636 (Duplicate): segfault in CInode::get_caps_allowed_for_client

While doing ad-hoc killing of clients stuck on full cluster: unchecked dereference of session connection....
John Spray
09:52 AM Feature #8188 (Resolved): librados: interface to inspect pool properties
Sage Weil
09:39 AM rbd Feature #4454 (Closed): openstack: support volume migration in Cinder
tracking via https://blueprints.launchpad.net/cinder/+spec/generic-volume-migration Josh Durgin
09:36 AM rbd Feature #7921 (Resolved): Openstack: live migration for ephemeral volumes
Josh Durgin
09:35 AM rbd Feature #7920 (Resolved): Openstack: cloning for rbd ephemeral disks
Josh Durgin
09:29 AM rbd Feature #5138 (Closed): LIO Support
Being tracked in Red Hat Bugzilla Neil Levine
09:25 AM rbd Feature #4087 (In Progress): rbd: bitmaps for tracking object existence
Josh Durgin
09:24 AM rbd Feature #4804 (Rejected): tgt: switch to aio
iSCSI work now focused on LIO Neil Levine
09:23 AM Feature #9302 (Resolved): mon: 'ceph osd pool ls' command
Joao Eduardo Luis
09:22 AM rgw Feature #9013: rgw: set civetweb as a default frontend
https://github.com/ceph/ceph/pull/2381 Josh Durgin
08:53 AM devops Fix #5900 (In Progress): Create a Python package for ceph Python bindings
Alfredo Deza
08:24 AM rbd Bug #9602: rbd export -> nc ->rbd import = memory leak
I quickly attempted to reproduce this on the same version w/o success. Can you attach /etc/ceph/big.conf? How large... Jason Dillaman
06:59 AM Bug #8942: Bad JSON output in ceph osd tree
I see this in ceph version 0.80.5 (38b73c67d375a2552d8ed67843c8a65c2c0feba6) and it is indeed a problem.
I *think* ...
Wyllys Ingersoll
06:49 AM devops Feature #9133: create ceph user/group; run daemons as ceph (non-root)
Indeed a lot of packaging updates and probably many difficulties to properly upgrade daemons :/
Anyone working on ...
Sébastien Han
06:18 AM CephFS Feature #7317 (In Progress): mds: behave with fs fills (e.g., allow deletion)
John Spray
06:15 AM CephFS Feature #9437 (Fix Under Review): make 'ceph tell mds.* ...' work, deprecate 'ceph mds tell * ...'
John Spray

09/30/2014

11:57 PM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
Working in the container... Loïc Dachary
11:54 PM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
To make sure this is not an environmental problem, I cloned a clean copy from your branch and removed .ccache entirely. Loïc Dachary
11:19 PM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
Running the test in the container still fails. ... Loïc Dachary
11:06 PM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
I reproduced the above valgrind output a few minutes ago on my development laptop. After upgrading from... Loïc Dachary
10:55 PM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
Using the same source tree with the same kernel but inside an ubuntu 14.04 docker container, I was not able to reprod... Loïc Dachary
05:24 PM rgw Bug #8587 (Resolved): rgw: subuser object not created correctly
commit:1441ffe8103f03c6b2f625f37adbb2e1cfec66bb Josh Durgin
05:19 PM Bug #9635: mon/Paxos.cc: 1033: FAILED assert(mon->is_leader())
Sage Weil
05:19 PM Bug #9635 (Fix Under Review): mon/Paxos.cc: 1033: FAILED assert(mon->is_leader())
From the log it looks like this happened during shutdown. See wip-9635. Sage Weil
04:54 PM Bug #9635 (Resolved): mon/Paxos.cc: 1033: FAILED assert(mon->is_leader())
... Sage Weil
04:58 PM rgw Bug #9575: s3tests.functional.test_s3.test_region_copy_object fails (races with radosgw-agent?)
hmm, these seem to always happen with valgrind! Sage Weil
04:52 PM rgw Bug #9575: s3tests.functional.test_s3.test_region_copy_object fails (races with radosgw-agent?)
ubuntu@teuthology:/a/teuthology-2014-09-29_23:02:01-rgw-giant-testing-basic-multi/519792 Sage Weil
03:32 PM Bug #9459 (Need More Info): osd: blocked request
Sage Weil
03:31 PM Bug #9288 (Duplicate): "Assertion `nlock == 0' failed" in upgrade:firefly-firefly-testing-basic-v...
see #9040 Sage Weil
03:09 PM Bug #8997 (Can't reproduce): ceph_test_rados_watch_notify hangs
I suspect the watch resend fix (commit:1349383ac416673cb6df2438729fd2182876a7d1 for #9220) fixed some of these. (It ... Sage Weil
03:06 PM Bug #8595: osd: client op blocks until backfill starts (dumpling)
The simple fixes here seem insufficient (fail in qa). Haven't seen anybody else hitting this, which surprises me a b... Sage Weil
01:22 PM Feature #9198 (In Progress): librados: notify callback includes gid of notifier
Sage Weil
01:22 PM Feature #9197 (In Progress): librados/osd: notify reply payload
Sage Weil
01:13 PM Feature #8899 (Fix Under Review): Kerberos/LDAP Support:: mon: define mon role capabilities
Sage Weil
01:03 PM RADOS Feature #9632 (New): testing: test CrushWrapper::get_full_location_ordered()
A recent backport of changes to get_full_location_ordered() passed all the make check and RADOS suite tests, but caus... Greg Farnum
12:56 PM Feature #9031: List RADOS namespaces and list all objects in all namespaces
Samuel Just
11:52 AM Bug #8822 (Need More Info): osd: hang on shutdown, spinlocks
Sage Weil
11:51 AM Bug #8822: osd: hang on shutdown, spinlocks
valgrind is 1:3.10~20140411-0ubuntu1
3.10.0 release notes claim to have fixed
336435 Valgrind hangs in pthread...
Sage Weil
11:39 AM Bug #8822: osd: hang on shutdown, spinlocks
http://stackoverflow.com/questions/24558914/valgrind-hangs-in-pthread-spin-lock-consuming-100-cpu
valgrind bug?
Sage Weil
11:38 AM Bug #8822: osd: hang on shutdown, spinlocks
happened again:... Sage Weil
11:26 AM Bug #9617: objecter shutdown races with msg dispatch
wip-objecter-shutdown Josh Durgin
11:17 AM rgw Feature #8911 (In Progress): RGW doesn't return 'x-timestamp' in header which is used by 'View De...
Ian Colle
10:29 AM CephFS Bug #9562 (Pending Backport): Lockdep assertion in Filer purge
This is popping up in Giant as well, which I believe has the new code that was the proximate cause. :) Greg Farnum
10:27 AM CephFS Bug #9514 (Pending Backport): ceph-fuse pjd test is failing in giant nightlies
In giant as commit:0ea20a668cf859881c49b33d1b6db4e636eda18a.
Needs to go to firefly as well.
Greg Farnum
09:58 AM devops Tasks #8366 (In Progress): Update ceph.com/docs to default to the latest major release (0.80)
Ian Colle
09:47 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
https://github.com/ceph/ceph/pull/2611 seems like a good candidate for backport. Loïc Dachary
09:40 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
https://github.com/ceph/ceph/commit/66a9fbe2c7ba59b7cd034c17865adce3432cd2cb and https://github.com/ceph/ceph/commit/... Loïc Dachary
08:41 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
None of the commits in FileJournal.cc from dumpling to master fix something that could cause a problem of that nature. Loïc Dachary
09:40 AM Bug #9630 (Resolved): osd: leaked pg refs on shutdown (dumpling)
... Sage Weil
08:40 AM Bug #9610: Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)...
9/30/14 update - Still waiting in queue http://pulpito.front.sepia.ceph.com/teuthology-2014-09-29_23:20:02-multi-vers... Yuri Weinstein
08:36 AM rgw Bug #9612 (Resolved): "ERROR: test suite for <module 's3tests.functional'" in multi-version-giant...
PR https://github.com/ceph/ceph-qa-suite/pull/154 Yuri Weinstein
12:08 AM CephFS Bug #9628: mds: race between ms_handle_accept() and ms_handle_reset()
https://github.com/ceph/ceph/pull/2596 Zheng Yan
12:08 AM CephFS Bug #9628 (Resolved): mds: race between ms_handle_accept() and ms_handle_reset()
ceph version 0.85-1003-g3ae673c (3ae673c764a4fac6e554e05722f0179566ed3fb3)
1: (ceph::BackTrace::BackTrace(int)+0x2...
Zheng Yan

09/29/2014

11:40 PM Bug #9582: librados: segmentation fault on timeout
Thanks for your investigations and the quick fix! We have not been able to test this fix yet, but I will report back ... Matthias Kiefer
01:13 PM Bug #9582: librados: segmentation fault on timeout
in giant, dumpling. still need to merge firefly backport. Sage Weil
01:10 PM Bug #9582 (Pending Backport): librados: segmentation fault on timeout
Greg Farnum
08:16 AM Bug #9582 (Fix Under Review): librados: segmentation fault on timeout
Sage Weil
10:09 PM Bug #9459: osd: blocked request
saw something similar on another cluster, ... Sage Weil
09:18 PM rbd Bug #9602: rbd export -> nc ->rbd import = memory leak
As a suggestion, prohibit the use of the cache during RBD imports. Irek Fasikhov
09:03 PM rbd Bug #9602: rbd export -> nc ->rbd import = memory leak
Hi, Sage.
I'm sorry, I had set the wrong parameter: rbd_cache size
The problem is not confirmed.
Irek Fasikhov
08:18 PM CephFS Bug #9562 (Resolved): Lockdep assertion in Filer purge
Zheng Yan
06:12 PM rgw Bug #9615 (Resolved): "ERROR: test suite for <module 's3tests.functional'" in upgrade:dumpling-du...
Fixed typo https://github.com/ceph/ceph-qa-suite/pull/157 Yuri Weinstein
05:45 PM rgw Bug #9615: "ERROR: test suite for <module 's3tests.functional'" in upgrade:dumpling-dumpling-dist...
Interesting. How is that possible if I only added one yaml file, v0.67.11.yaml? Will look. Yuri Weinstein
05:36 PM rgw Bug #9615: "ERROR: test suite for <module 's3tests.functional'" in upgrade:dumpling-dumpling-dist...
this yaml has no 'rgw' task... that's why it gets connection refused. Sage Weil
06:11 PM rgw Bug #9616: upgrade test restarts rgw, test gets 500
Even if we do somehow get it to retry (might require changes to the fastcgi module), we'll still get 500s from reques... Yehuda Sadeh
05:47 PM rgw Bug #9616: upgrade test restarts rgw, test gets 500
Yehuda Sadeh wrote:
> Not sure what the test is doing exactly, but the 500 is because the rgw process was restarted ...
Sage Weil
05:36 PM rgw Bug #9616: upgrade test restarts rgw, test gets 500
Not sure what the test is doing exactly, but the 500 is because the rgw process was restarted in the middle of the te... Yehuda Sadeh
05:52 PM rgw Bug #9612: "ERROR: test suite for <module 's3tests.functional'" in multi-version-giant-testing-ba...
pls update with new test... this one was specifying firefly Sage Weil
05:45 PM rgw Bug #9169: 100-continue broken for centos/rhel
maybe we are lacking the apache or mod_fastcgi packages here? Sage Weil
05:41 PM rgw Bug #9169: 100-continue broken for centos/rhel
Yuri Weinstein wrote:
> Similar issue in suite:upgrade:firefly
>
> http://pulpito.front.sepia.ceph.com/teuthology...
Sage Weil
05:32 PM Bug #9617 (In Progress): objecter shutdown races with msg dispatch
Sage Weil
04:21 PM Bug #9617: objecter shutdown races with msg dispatch
... Sage Weil
05:28 PM Feature #8960 (Resolved): filestore: store backend type persisently
Sage Weil
05:24 PM Bug #9142 (Can't reproduce): [ RUN ] LibRadosTwoPoolsPP.PromoteSnapScrub hang
Sage Weil
05:24 PM Bug #9141 (Can't reproduce): [ RUN ] LibRadosAio.IsCompletePP hang
Sage Weil
04:49 PM Bug #6301: ceph-osd hung by XFS using linux 3.10
fwiw, after upgrading the performance test nodes from Ubuntu 13.10 to Fedora Core 20, I appear to be hitting this und... Mark Nelson
04:44 PM Feature #9580: ceph-disk, ceph-osd: make journal [partition] creation conditional based on osd_ob...
Mark Kirkwood wrote:
> While we are thinking about this, note that some of the keyvalue backends have facility to ha...
Sage Weil
04:43 PM CephFS Bug #9341: MDS: very slow rejoin
John Spray wrote:
> The userspace change and test for this are merged into master. Is the kernel side all done too?...
Dmitry Smirnov
01:07 PM CephFS Bug #9341: MDS: very slow rejoin
The userspace change and test for this are merged into master. Is the kernel side all done too? John Spray
04:33 PM CephFS Bug #9514: ceph-fuse pjd test is failing in giant nightlies
Greg Farnum
03:49 PM CephFS Bug #9514: ceph-fuse pjd test is failing in giant nightlies
So here's a question: why does the client (temporarily) remember its ctime as being 2014-09-26 19:22:06.889397, but n... Greg Farnum
02:58 PM CephFS Bug #9514 (In Progress): ceph-fuse pjd test is failing in giant nightlies
Hah, we got the failure with logs in /a/sage-2014-09-26_17:51:11-smoke-giant-distro-basic-multi/513914
All of the ...
Greg Farnum
04:26 PM Bug #9614: PG stuck with remapped
Thanks Loic for following up.
After talking to other engineers, the backfilling seems like due to he removed O...
Guang Yang
12:32 PM Bug #9614: PG stuck with remapped
It looks like you are on the right track :-) Loïc Dachary
12:23 PM Bug #9614: PG stuck with remapped
... Loïc Dachary
12:13 PM Bug #9614: PG stuck with remapped
Could you attach the full output of pg query 3.1ee7, please? The ceph osd tree output would also help to get an idea why... Loïc Dachary
02:21 AM Bug #9614: PG stuck with remapped
There are still two issues:
# Some PGs are stuck with active+remapped forever (for both replicated pool and EC pool)...
Guang Yang
02:07 AM Bug #9614: PG stuck with remapped
Guang Yang wrote:
> Another observation is that even the pg dump result for such PG:
> [...]
>
> Even there is a...
Guang Yang
01:53 AM Bug #9614: PG stuck with remapped
Attaching CRUSH / EC profile / OSD dump. Guang Yang
01:28 AM Bug #9614: PG stuck with remapped
Loic Dachary wrote:
> [...]
> The *2147483647* here shows mapping failed. Is this something you expect ?
As there ...
Guang Yang
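For reference, 2147483647 is 0x7fffffff, the value CRUSH uses (as CRUSH_ITEM_NONE) to mark a slot it could not fill. A minimal Python sketch of how such holes could be spotted in a pg mapping follows; the helper name and the acting set are illustrative assumptions, not taken from this report.

```python
# 0x7fffffff == 2147483647: placeholder for a slot with no OSD mapped.
CRUSH_ITEM_NONE = 0x7FFFFFFF


def unmapped_slots(osds):
    """Return the positions in an up/acting list that have no OSD mapped.

    Hypothetical helper for illustration; it is not part of any Ceph tool.
    """
    return [i for i, osd in enumerate(osds) if osd == CRUSH_ITEM_NONE]


# Illustrative acting set, not the one from the report.
acting = [12, 2147483647, 45, 7]
print(unmapped_slots(acting))
```

A non-empty result means CRUSH failed to find an OSD for that position, which is the situation the quoted question is asking about.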
01:22 AM Bug #9614: PG stuck with remapped
... Loïc Dachary
04:24 PM Bug #9113: osd: snap trimming eats memory, linearly
There's another piece. The trimmer is constantly requeueing. Samuel Just
02:02 PM Bug #9113 (Pending Backport): osd: snap trimming eats memory, linearly
Sage Weil
01:59 PM Bug #9113 (Fix Under Review): osd: snap trimming eats memory, linearly
Samuel Just
04:15 PM Bug #9620: tests: qa/workunits/cephtool/test.sh race condition
I will verify the result when they are ready but I'm not too concerned ;-) Loïc Dachary
04:15 PM Bug #9620: tests: qa/workunits/cephtool/test.sh race condition
Loïc Dachary
02:42 PM Bug #9620 (Resolved): tests: qa/workunits/cephtool/test.sh race condition
i jumped the gun and merged, oops! Sage Weil
01:52 PM Bug #9620: tests: qa/workunits/cephtool/test.sh race condition
gitbuilder running Loïc Dachary
01:51 PM Bug #9620 (Fix Under Review): tests: qa/workunits/cephtool/test.sh race condition
https://github.com/ceph/ceph/pull/2603 Loïc Dachary
08:18 AM Bug #9620 (Pending Backport): tests: qa/workunits/cephtool/test.sh race condition
Sage Weil
04:53 AM Bug #9620 (Fix Under Review): tests: qa/workunits/cephtool/test.sh race condition
https://github.com/ceph/ceph/pull/2594 Loïc Dachary
04:36 AM Bug #9620: tests: qa/workunits/cephtool/test.sh race condition
The *ceph osd thrash* command will randomly "mark osds down and up":https://github.com/ceph/ceph/blob/firefly/src/mon... Loïc Dachary
03:29 AM Bug #9620: tests: qa/workunits/cephtool/test.sh race condition
The following sequence happens:
* ceph osd dump finds 3 osd "down"
* ceph osd dump finds no osd "down"
* ceph os...
Loïc Dachary
03:24 AM Bug #9620 (Resolved): tests: qa/workunits/cephtool/test.sh race condition
"osd are marked down":https://github.com/ceph/ceph/blob/master/qa/workunits/cephtool/test.sh#L604 and a loop checking... Loïc Dachary
03:46 PM RADOS Bug #9606: mon: ambiguous error_status returned to user when type is wrong in a command
This may be easier if/when ceph_argparse gets made into a proper Python package; I hear there is renewed interest in ... Dan Mick
02:57 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
Exploring the idea that maybe the buffers pointed to by the iovec are overwritten or mixed up. Loïc Dachary
08:28 AM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
Reading the buffer.{h,cc} code it looks like the caller is protected from a situation where a bufferptr leftover can ... Loïc Dachary
02:53 PM Bug #9626 (Resolved): PG: cancel backfill reservations if we get a cancel during backfill
Samuel Just
02:36 PM Fix #9566: osd: prioritize recovery of OSDs with most work to do
Factor the number of backfill (or backfill_wait) pgs on the OSD into the recovery priority. Make sure this accounts ... Sage Weil
02:14 PM Bug #9574 (Pending Backport): Backfill: recheck full status once reservation is granted
Sage Weil
01:51 PM Bug #9574 (Fix Under Review): Backfill: recheck full status once reservation is granted
Samuel Just
02:00 PM Bug #9388: osd/PG.cc: 2945: FAILED assert(r == 0) in update_snap_map
This is the one with the import/export racing with split Samuel Just
01:59 PM Bug #9503 (Fix Under Review): Dumpling: removing many snapshots in a short time makes OSDs go ber...
Samuel Just
01:54 PM Bug #9545 (Resolved): filestore stuck in journal->should_commit_now() loop on shutdown
Samuel Just
01:52 PM Bug #8629 (Pending Backport): cache_evict needs to prevent make_writeable from creating a snapdir
Samuel Just
01:45 PM Bug #9480 (Resolved): OSD is crashing while object deletion
Samuel Just
01:30 PM Bug #9625: firefly: memory corruption
/a/samuelj-2014-09-23_14:40:50-rados-firefly-wip-testing-old-vanilla-basic-multi/507058 another example Samuel Just
10:44 AM Bug #9625: firefly: memory corruption
ubuntu@teuthology:/a/sage-2014-09-27_20:55:12-rados-firefly-distro-basic-multi/515818
ubuntu@teuthology:/a/sage-2014...
Samuel Just
10:43 AM Bug #9625 (Resolved): firefly: memory corruption
I am guessing that these two coredumps are related.
#0 0x00007f1918142f07 in _dl_map_object_deps (map=map@entry=0...
Samuel Just
01:15 PM CephFS Bug #8576: teuthology: nfs tests failing on umount
Trying the sync on Sage's go-ahead. :)
commit:56223ce98b659fe7b25b55161ef8163495f438fc in teuthology.
Greg Farnum
10:45 AM CephFS Bug #8576: teuthology: nfs tests failing on umount
Is there any chance that just running a sync on the node prior to trying to "exportfs -au" might prevent this? I'm he... Greg Farnum
12:51 PM devops Fix #9017 (Fix Under Review): [paddles] implement validation across all controller methods
Pull request opened https://github.com/ceph/paddles/pull/46 Alfredo Deza
10:30 AM Bug #9623 (Won't Fix): On cluster with 3 mons, stopping 2 mons made cluster in-accessible, with I...
This is expected and intended behavior. The monitors are a Paxos system and require a quorum of *more than* half to b... Greg Farnum
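The "more than half" rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function names are hypothetical and do not correspond to anything in the monitor code.

```python
def quorum_size(num_mons):
    """Smallest number of monitors that is strictly more than half."""
    return num_mons // 2 + 1


def has_quorum(num_mons, num_up):
    """True if the monitors that are still up can form a majority."""
    return num_up >= quorum_size(num_mons)


# With 3 monitors and 2 stopped, only 1 is left, which is not a majority.
print(quorum_size(3))
print(has_quorum(3, 1))
```

Under this rule a 3-monitor cluster tolerates the loss of one monitor but not two, which matches the behaviour described in the report.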
07:30 AM Bug #9623: On cluster with 3 mons, stopping 2 mons made cluster in-accessible, with IO's hung/pause
Removing myself as I may not have time to deal with this right now. Loïc Dachary
06:25 AM Bug #9623 (Won't Fix): On cluster with 3 mons, stopping 2 mons made cluster in-accessible, with I...
A cluster with "n" monitor nodes becomes inaccessible if "n-1" monitors are down.
It's been obser...
Mallikarjun Biradar
09:55 AM devops Bug #6461 (Rejected): ceph-deploy should at least issue a warning if there are parser errors read...
`ConfigParser` will not have errors reading a config file that has duplicate sections.
In Python2.X a duplicate se...
Alfredo Deza
08:25 AM Bug #9613 (Duplicate): "Segmentation fault" in upgrade:dumpling-giant-x:parallel-giant-distro-bas...
#9582 Sage Weil
07:50 AM devops Bug #6489 (Can't reproduce): ceph-deploy: get_nonlocal_ip() should filter ipv6 addrs
Alfredo Deza
07:44 AM devops Bug #7483 (Rejected): ceph-deploy should fetch keyrings always
There isn't a reasonable way to implement this. The use case is deploying to a new node and having stale files in the... Alfredo Deza
06:00 AM Bug #9408: erasure-code: misalignment
Running under the branch wip-9408-buffer-alignment in http://ceph.com/gitbuilder.cgi Loïc Dachary
05:58 AM Bug #9408: erasure-code: misalignment
New pull request https://github.com/ceph/ceph/pull/2595 Loïc Dachary
02:06 AM Bug #9572: erasure-code: BlaumRoth default encoding regression
A brute-force check of w=7 with all possible values of k proves that it allows recovery in all scenarios. ... Loïc Dachary
01:53 AM rbd Bug #9391: fio rbd driver rewrites same blocks
Could you provide your fio job file / config to verify the issue? Danny Al-Gaaf

09/28/2014

11:38 PM Bug #9592 (Resolved): librados: Not able to create Large Files with Librados
Loïc Dachary
03:46 PM Bug #9592: librados: Not able to create Large Files with Librados
Extend the checks to librados.hpp and aio_* https://github.com/ceph/ceph/pull/2590 Loïc Dachary
11:30 PM Bug #9304 (Resolved): pool create with invalid crush rule name succeeds
Loïc Dachary
11:02 PM Bug #6003: journal Unable to read past sequence 406 ...
... Shambhu Rajak
09:26 PM Fix #9566: osd: prioritize recovery of OSDs with most work to do
The recovery slows simply because there are fewer PGs left degraded and the per-pg (or per-osd) recovery rate is limi... Sage Weil
12:42 PM Bug #9619 (Can't reproduce): excessive mon memory usage when rbd rm 1PB
Steps to reproduce:
* create a 1 petabyte rbd image
* remove the image
the mon memory usage will grow over 10GB
Loïc Dachary
12:37 PM rgw Bug #9307: "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-firefly-x-mast...
Also in http://pulpito.front.sepia.ceph.com/teuthology-2014-09-28_08:42:11-upgrade:dumpling-firefly-giant:parallel-gi... Yuri Weinstein
12:32 PM Bug #9618 (Won't Fix): kernel 3.14 in Debian Jessie : XFS bug
For the record: the 3.14 kernel that was (until today) the default for Debian Jessie exhibited the following XFS bug ... Loïc Dachary
08:37 AM Bug #9617 (Resolved): objecter shutdown races with msg dispatch
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-27_19:10:02-upgrade:firefly-giant-x:parallel-giant... Yuri Weinstein
08:12 AM Bug #9515 (New): "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:parall...
Still see in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-27_18:40:01-upgrade:dumpling-giant-x:parallel-gia... Yuri Weinstein
07:57 AM rgw Bug #9615: "ERROR: test suite for <module 's3tests.functional'" in upgrade:dumpling-dumpling-dist...
Appears to be only on @1-dumpling-install/v0.67.11.yaml@ Yuri Weinstein
07:48 AM rgw Bug #9615 (Resolved): "ERROR: test suite for <module 's3tests.functional'" in upgrade:dumpling-du...
In http://pulpito.front.sepia.ceph.com/teuthology-2014-09-27_18:45:01-upgrade:dumpling-dumpling-distro-basic-vps/ run... Yuri Weinstein
07:52 AM rgw Bug #9616 (Resolved): upgrade test restarts rgw, test gets 500
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-27_18:45:01-upgrade:dumpling-dumpling-distro-basic... Yuri Weinstein
04:29 AM Bug #9614: PG stuck with remapped
Another observation is that even the pg dump result for such PG:... Guang Yang
03:45 AM Bug #9614 (Resolved): PG stuck with remapped
In our pre-production cluster, we observed that the cluster starts backfilling even with the OSD noout flag set when ther... Guang Yang

09/27/2014

10:48 PM rbd Bug #9595: librbd: internal methods can operate on extra objects when non-default striping is used
https://github.com/ceph/ceph/pull/2588 Xinxin Shu
10:48 PM rbd Bug #9595 (Fix Under Review): librbd: internal methods can operate on extra objects when non-defa...
Xinxin Shu
04:42 PM Bug #9613: "Segmentation fault" in upgrade:dumpling-giant-x:parallel-giant-distro-basic-multi run
Looks similar to #9508 Yuri Weinstein
04:40 PM Bug #9613 (Duplicate): "Segmentation fault" in upgrade:dumpling-giant-x:parallel-giant-distro-bas...
Two failures in http://pulpito.front.sepia.ceph.com/teuthology-2014-09-26_18:44:02-upgrade:dumpling-giant-x:parallel-... Yuri Weinstein
11:40 AM rgw Bug #9612: "ERROR: test suite for <module 's3tests.functional'" in multi-version-giant-testing-ba...
i suspect the giant rgw won't work with firefly osds? Sage Weil
08:56 AM rgw Bug #9612 (Rejected): "ERROR: test suite for <module 's3tests.functional'" in multi-version-giant...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-26_23:20:01-multi-version-giant-testing-basic-mult... Yuri Weinstein
11:39 AM Bug #9610 (Resolved): Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::Cal...
pushed fix to dumpling branch, commit:503f865d6432bead72aac0ffba0539d807f078c4 Sage Weil
08:33 AM Bug #9610: Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)...
Another similar crash in job http://pulpito.front.sepia.ceph.com/teuthology-2014-09-26_23:20:01-multi-version-giant-t... Yuri Weinstein
08:29 AM Bug #9610 (Resolved): Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::Cal...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-26_23:20:01-multi-version-giant-testing-basic-mult... Yuri Weinstein
11:36 AM devops Bug #9611 (Rejected): Missing packages in multi-version-giant-testing-basic-multi
Doesn't look like a 'next' branch exists any longer so no way to fix this. Sandon Van Ness
08:52 AM devops Bug #9611: Missing packages in multi-version-giant-testing-basic-multi
In run http://pulpito.front.sepia.ceph.com/teuthology-2014-09-26_23:20:01-multi-version-giant-testing-basic-multi/
...
Yuri Weinstein
08:50 AM devops Bug #9611 (Rejected): Missing packages in multi-version-giant-testing-basic-multi
Yuri Weinstein
09:16 AM Bug #9592: librados: Not able to create Large Files with Librados
Looking at librados.hpp Loïc Dachary
01:57 AM Bug #9592 (Fix Under Review): librados: Not able to create Large Files with Librados
https://github.com/ceph/ceph/pull/2584 should be enough. Unless there is a good reason to write an object with chunks... Loïc Dachary
05:53 AM Bug #7648 (Resolved): ceph-mon corner case denial of service
Sage Weil
02:32 AM Bug #7648 (Fix Under Review): ceph-mon corner case denial of service
emperor backport https://github.com/ceph/ceph/pull/2585 Loïc Dachary
02:22 AM Bug #7648 (Pending Backport): ceph-mon corner case denial of service
the backport needs to be on emperor also Loïc Dachary
04:27 AM RADOS Bug #9606: mon: ambiguous error_status returned to user when type is wrong in a command
ceph.in "uses ceph_argparse":https://github.com/ceph/ceph/blob/giant/src/ceph.in#L67 to validate the arguments client... Loïc Dachary
12:02 AM RADOS Bug #9492 (Need More Info): Crush Mapper crashes when number of replicas is less than total numbe...
What happens with indep? Loïc Dachary

09/26/2014

07:13 PM rgw Bug #9588: Keystone s3 auth integration lacking access_key = tenant:user ability supported by swi...
So, actually talking to a swift s3 proxy with:
access_key = 'demo:demo'
secret_key = 'password'
results in:
...
Mark Kirkwood
06:14 PM Bug #7648 (Fix Under Review): ceph-mon corner case denial of service
https://github.com/ceph/ceph/pull/2583 Sage Weil
08:49 AM Bug #7648 (In Progress): ceph-mon corner case denial of service
works for any osd that exists but is not in the crush map, it seems Sage Weil
05:53 PM Bug #9570 (In Progress): osd crash in FileJournal::WriteFinisher::entry() aio
Sage Weil
03:32 PM CephFS Bug #8427: ceph-fuse: Dumpling "cache still has 0+1 items, waiting (for caps to release?)" on shu...
Sage believes this is a bug with readahead that got fixed in subsequent releases. Greg Farnum
06:51 AM CephFS Bug #8427 (Won't Fix): ceph-fuse: Dumpling "cache still has 0+1 items, waiting (for caps to relea...
Sage Weil
03:26 PM RADOS Bug #9492 (Resolved): Crush Mapper crashes when number of replicas is less than total number of o...
Loïc Dachary
01:56 PM RADOS Bug #9492: Crush Mapper crashes when number of replicas is less than total number of osds to be s...
Ran valgrind with the patch and no errors were found with different rule combinations of num_rep and number of osds t... Johnu George
12:44 PM Bug #9417: "Segmentation fault" in upgrade:dumpling-giant-x-master-distro-basic-vps run
Same issue in job http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-26_10:44:24-upgrade:dumpling-giant-x:paralle... Yuri Weinstein
12:03 PM devops Bug #9607 (Resolved): wrong epel-release version present in misc-ceph repo
That epel release 7 RPM should never have been put in that repo. It has been removed and moved to its correct location and cep... Sandon Van Ness
11:31 AM devops Bug #9607 (Resolved): wrong epel-release version present in misc-ceph repo
In a CentOS 6 box where we run `yum install epel-release` it now sees that it needs to update to use the epel-release... Alfredo Deza
11:35 AM devops Bug #9603: No package ceph-debuginfo-0.67.10 available in upgrade:dumpling-firefly-x-giant-distro...
Would be helpful to include:... Zack Cerza
11:08 AM devops Bug #9603: No package ceph-debuginfo-0.67.10 available in upgrade:dumpling-firefly-x-giant-distro...
Same issue in suite:upgrade:dumpling-giant-x
http://pulpito.front.sepia.ceph.com/teuthology-2014-09-26_10:44:24-up...
Yuri Weinstein
08:12 AM devops Bug #9603 (Rejected): No package ceph-debuginfo-0.67.10 available in upgrade:dumpling-firefly-x-g...
In run http://pulpito.front.sepia.ceph.com/teuthology-2014-09-25_19:25:02-upgrade:dumpling-firefly-x-giant-distro-bas... Yuri Weinstein
11:34 AM rbd Feature #2466 (Resolved): librbd: add invalidate_cache function to interface
This was added a while back in commit:5d340d26dd70192eb0e4f3f240e3433fb9a24154 Josh Durgin
11:18 AM RADOS Bug #9606 (New): mon: ambiguous error_status returned to user when type is wrong in a command
... Christina Meno
10:10 AM Bug #9592: librados: Not able to create Large Files with Librados
Nice catch Pavan Rallabhandi ;-) I had trouble reproducing the problem because I forgot the "LD_LIBRARY_PATH=.libs" ... Loïc Dachary
09:37 AM Bug #9592: librados: Not able to create Large Files with Librados
The minimal script... Loïc Dachary
08:05 AM Bug #9592 (In Progress): librados: Not able to create Large Files with Librados
Loïc Dachary
04:44 AM Bug #9592: librados: Not able to create Large Files with Librados
... Pavan Rallabhandi
10:01 AM devops Bug #9548 (Rejected): ceph mon creation failed for centOS
Alfredo Deza
09:39 AM Feature #9302 (Fix Under Review): mon: 'ceph osd pool ls' command
https://github.com/ceph/ceph/pull/2581 Joao Eduardo Luis
09:34 AM devops Bug #9232: disk zap doesnt remove the dmcrypt settings on disk
I think that `disk zap` would certainly have to clear the dmcrypt flags on the disk.
Can you make sure that it doe...
Alfredo Deza
08:56 AM rgw Bug #9605 (Won't Fix): rgw: need to have shadow objects named after head object
Yehuda Sadeh
08:55 AM rgw Feature #9604 (Resolved): rgw: create a tool for orphaned objects cleanup
Yehuda Sadeh
08:07 AM devops Bug #9567 (New): Missing packages in upgrade:dumpling-firefly-x-giant-distro-basic-vps run
Still see in today's run:
http://pulpito.front.sepia.ceph.com/teuthology-2014-09-25_19:25:02-upgrade:dumpling-fire...
Yuri Weinstein
07:41 AM Fix #9601: erasure-code: ErasureCode::encode overhead is too high
The handling got more complicated due to the updated padding handling.
It's a little bit faster. jerasure_matrix_e...
Janne Grunau
05:17 AM Fix #9601: erasure-code: ErasureCode::encode overhead is too high
The overhead has shifted but looks globally the same with https://github.com/ceph/ceph/pull/2558
!{width: 100%}jannau...
Loïc Dachary
03:52 AM Fix #9601: erasure-code: ErasureCode::encode overhead is too high
Applying https://github.com/ceph/ceph/pull/2558 and benchmarking again Loïc Dachary
03:34 AM Fix #9601 (New): erasure-code: ErasureCode::encode overhead is too high
When encoding 4KB buffers it is ~15% of the total CPU being used although it is only preparing the buffers.
!{width:...
Loïc Dachary
05:26 AM rbd Bug #9602 (Closed): rbd export -> nc ->rbd import = memory leak
I see a memory leak when importing a raw device.
Export Scheme:
[rbd@rbdbackup ~]$ rbd --no-progress -n client.rbdb...
Irek Fasikhov
01:58 AM Cleanup #9600 (New): rework bufferlist::*aligned* functions
The align function should allow 32 byte alignment (for SIMD instructions) or page alignment (for I/O). There should b... Loïc Dachary
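The two alignment targets mentioned above reduce to the same power-of-two arithmetic. The following Python sketch only illustrates that arithmetic; the function names are hypothetical, the 4096-byte page size is an assumption, and nothing here reflects the actual bufferlist interface.

```python
SIMD_ALIGN = 32    # bytes, the SIMD alignment named in the entry
PAGE_ALIGN = 4096  # bytes, an assumed page size for I/O alignment


def _check_power_of_two(alignment):
    """Reject alignments the bit-mask arithmetic below cannot handle."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")


def is_aligned(offset, alignment):
    """True if offset is a multiple of a power-of-two alignment."""
    _check_power_of_two(alignment)
    return offset & (alignment - 1) == 0


def align_up(offset, alignment):
    """Round offset up to the next multiple of a power-of-two alignment."""
    _check_power_of_two(alignment)
    return (offset + alignment - 1) & ~(alignment - 1)


print(align_up(33, SIMD_ALIGN))
print(is_aligned(4096, PAGE_ALIGN))
```

Taking the alignment as a parameter, rather than hard-coding the page size, is what lets one helper serve both the SIMD and the I/O case.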
01:00 AM Bug #8592: sgdisk no longer likes `--change-name` when creating partitions
I fixed this by adding the --zap-disk option; I hope this helps you. Gao Jiangmiao
12:44 AM Bug #9419: dumpling->firefly upgrade, sending setallochint?
Thanks for explaining. Since alloc hint is optional it does not matter if it is activated and deactivated later. Loïc Dachary
12:22 AM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
Please disregard #15. I just fell victim to inaccurate documentation about the @incomplete@ PG state.
-Sam's hunch...
Florian Haas

09/25/2014

11:59 PM Bug #9592: librados: Not able to create Large Files with Librados
A modified script to debug this issue:-
####################################
import rados
import sys
try:
cluste...
pushpesh sharma
10:02 AM Bug #9592: librados: Not able to create Large Files with Librados
If I were to guess, something in the stack is converting the size value down to an int32 and then back up to int64, s... Greg Farnum
09:51 AM Bug #9592 (Can't reproduce): librados: Not able to create Large Files with Librados
... Loïc Dachary
06:28 AM Bug #9592 (Resolved): librados: Not able to create Large Files with Librados
I found this issue while I was trying to run a 1GB write Cosbench workload using librados. (My 1MB write & read run was... pushpesh sharma
11:08 PM rgw Bug #9588: Keystone s3 auth integration lacking access_key = tenant:user ability supported by swi...
Despite asking for swift, I am actually getting the nova object store doing the s3 stuff, it seems. I'll comment again ... Mark Kirkwood
10:16 PM rgw Bug #9588: Keystone s3 auth integration lacking access_key = tenant:user ability supported by swi...
Hmm - maybe not tested enough, as it looks like the way devstack sets up the swift s3 layer is a bit screwy, and almo... Mark Kirkwood
07:45 PM Documentation #9542: Error link:"Ceph Object Gateway"->"Manual Install"
I know it "*is the way the doc is generated*", and I know "*it's not a bug in a link*", too. (Guess it's Sphinx?) But ... Aaron Chen
09:39 AM Documentation #9542 (Won't Fix): Error link:"Ceph Object Gateway"->"Manual Install"
This is the way the doc is generated, it's not a bug in a link. And it actually makes more logical sense to jump from... Loïc Dachary
06:23 PM CephFS Feature #541 (Resolved): mds: tempsync
this is implemented... TSYN and related states Sage Weil
06:21 PM Feature #1092 (Rejected): mon: checkpointing
Sage Weil
06:19 PM Feature #131 (Resolved): bring wireshark plugin is up to date
Sage Weil
05:47 PM CephFS Feature #630 (Resolved): release caps on inodes unlinked by other clients
Zheng Yan
05:47 PM CephFS Feature #630: release caps on inodes unlinked by other clients
dup of #5039. already fixed by commit f8a947d92 client: trim deleted inode Zheng Yan
05:03 PM Feature #9568 (Resolved): Add test case to test #9419 (ceph wip-9419)
Yuri Weinstein
04:34 PM CephFS Bug #9514: ceph-fuse pjd test is failing in giant nightlies
This hasn't reproduced since we turned on debug logging. :(
But I did see it on a run without any logging: /a/gregf-...
Greg Farnum
04:09 PM Feature #9580: ceph-disk, ceph-osd: make journal [partition] creation conditional based on osd_ob...
While we are thinking about this, note that some of the keyvalue backends have facility to have their "wal" aka journ... Mark Kirkwood
04:05 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
I don't see how it could be related to a problem in align_bl or bufferlist::rebuild_align. The worst these could do i... Loïc Dachary
03:28 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
Maybe "iterating the bufferptr":https://github.com/ceph/ceph/blob/dumpling/src/os/FileJournal.cc#L1297 can return buf... Loïc Dachary
03:03 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
... Loïc Dachary
02:43 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
Sheldon, could you upload the full log somewhere if you still have it ? Loïc Dachary
01:53 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
* align_bl related pull request https://github.com/ceph/ceph/pull/2501
* rebuild_align fix (back from 2013) https:/...
Loïc Dachary
02:14 PM Bug #9203 (In Progress): ceph_test_rados: ObjectDesc::iterator::advance(bool): Assertion `pos < l...
Sage Weil
02:14 PM Feature #9598 (Resolved): re-enable Objecter fast dispatch
We had to nix fast dispatch on the Objecter because it could deadlock in conjunction with mark_down() calls.
Fixin...
Greg Farnum
02:06 PM Bug #9536: erasure-code: ISA plugin alignment must be constant
(parts of) this will need to be backported with the rest of the ISA plugin stuff Sage Weil
02:05 PM Bug #9536 (Pending Backport): erasure-code: ISA plugin alignment must be constant
Sage Weil
01:52 PM Bug #9389 (Need More Info): ec pg stuck peering, did not send query for one shard
commit:d851c3f2338e8d17dfd78d631b9f7977365356aa adds better debug output (and cleans up a bit) Sage Weil
01:21 PM rbd Bug #9595 (Resolved): librbd: internal methods can operate on extra objects when non-default stri...
... Josh Durgin
01:04 PM Bug #9295 (Resolved): osd/OSD.cc: 5501: FAILED assert(session) in ms_fast_dispatch
Sage Weil
01:03 PM Bug #9295 (Duplicate): osd/OSD.cc: 5501: FAILED assert(session) in ms_fast_dispatch
Sage Weil
01:03 PM Bug #9295: osd/OSD.cc: 5501: FAILED assert(session) in ms_fast_dispatch
dup of #9462 Sage Weil
01:01 PM Bug #9462 (Resolved): msgr deadlock: osd reply vs mark_down vs fault
Sage Weil
12:44 PM Bug #8910 (Duplicate): ceph_test_objectstore: ObjectStore/StoreTest.ManyObjectTest/0 failure on f...
pretty sure this is a dup of #8395 Sage Weil
12:27 PM Bug #9582: librados: segmentation fault on timeout
i'm going to see if we can just skip the rx_buffers zero-copy paths when a timeout is present Sage Weil
12:20 PM Bug #9388: osd/PG.cc: 2945: FAILED assert(r == 0) in update_snap_map
import/export related Samuel Just
11:44 AM Bug #9571 (Resolved): rocksdb testing with powercycling fails on trusty
this was an issue with the code fix and not a product bug.
resolved now.
Tamilarasi muthamizhan
11:14 AM Bug #9419: dumpling->firefly upgrade, sending setallochint?

On any change of pg configuration peering happens, so a new collection of feature bits from the peers is collected....
David Zafman
10:37 AM Bug #9419 (Fix Under Review): dumpling->firefly upgrade, sending setallochint?
David Zafman
12:46 AM Bug #9419: dumpling->firefly upgrade, sending setallochint?
What happens if
* all OSDs in a PG support setallochint
* one secondary OSD goes down
* the secondary is replac...
Loïc Dachary
11:09 AM Bug #8395: ceph-test-objectstore doesn't clean up
backported to firefly branch Sage Weil
11:08 AM Feature #9594 (New): stop backfill when osd becomes too full
We will currently refuse the reservation, but we don't actually stop backfill once it is started. Samuel Just
10:45 AM Bug #9480: OSD is crashing while object deletion
Samuel Just
10:42 AM Bug #9390 (In Progress): EEXIST on split due to import/export
Sage Weil
10:38 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
Samuel Just
10:37 AM Bug #9584: OpTracker segfault on shutdown (firefly)
shutdown race is not so important Sage Weil
10:17 AM devops Tasks #8366: Update ceph.com/docs to default to the latest major release (0.80)
John Wilkins wrote:
> We need to review this a bit further. Pointing to the latest major release is fine, but we nee...
Sage Weil
10:09 AM Bug #9593 (Resolved): osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked()) (firefly)
Sage Weil
10:02 AM Bug #9593 (Fix Under Review): osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked()) (fir...
https://github.com/ceph/ceph/pull/2576 Sage Weil
09:57 AM Bug #9593 (Resolved): osdc/Objecter.cc: 1225: FAILED assert(client_lock.is_locked()) (firefly)
... Sage Weil
09:24 AM Feature #9532 (Duplicate): rados.py should export omap interface
#6114 Loïc Dachary
06:31 AM rgw Bug #9469: RadosGW performance degrades with high concurrency workload.
Debugging further, I was able to narrow down the root cause. I enabled debug logs for radosgw (20/20), enabled acces... pushpesh sharma
03:31 AM CephFS Bug #9562 (Fix Under Review): Lockdep assertion in Filer purge
https://github.com/ceph/ceph/pull/2572 John Spray
02:58 AM Bug #9579: Default parameters are not getting initialized for EC profile using isa EC plugin
Documentation update on k/m, good catch ! https://github.com/ceph/ceph/pull/2571 Loïc Dachary
02:52 AM Bug #9579: Default parameters are not getting initialized for EC profile using isa EC plugin
This is confusing and I added http://tracker.ceph.com/issues/9589 to work on improving the user experience. Thanks fo... Loïc Dachary
02:49 AM Bug #9579: Default parameters are not getting initialized for EC profile using isa EC plugin
From the code, it seems the default values of k & m for the "isa" profile are 7 & 3 respectively.
class ErasureCodeIsaDefau...
Mallikarjun Biradar
02:41 AM Bug #9579 (Won't Fix): Default parameters are not getting initialized for EC profile using isa EC...
There are no defaults for k/m for the isa plugin, the parameters need to be set explicitly as documented at http://ce... Loïc Dachary
02:49 AM Feature #9589 (Resolved): erasure-code: query plugin for erasure-code-profile defaults
When a parameter is missing from an erasure-code-profile (ruleset-failure-domain for instance) it falls back to the d... Loïc Dachary
02:14 AM Bug #8863: osd: second reservation rejection -> crash
Hi Sage,
We are still getting this issue, even though the commit is included in our build. Any updates?
Sahana Lokeshappa
01:24 AM RADOS Bug #9492: Crush Mapper crashes when number of replicas is less than total number of osds to be s...
Running in debug mode with https://github.com/ceph/ceph/pull/2568 (using the crushmap created as in the description):... Loïc Dachary
12:56 AM CephFS Bug #9563 (Resolved): kcephfs crash in ceph_mdsc_do_request
Zheng Yan
12:55 AM CephFS Bug #9564 (Resolved): kcephfs crash in _nfs4_do_open
The bug is fixed by upstream commit f39c0104 (NFS: remove BUG possibility in nfs4_open_and_get_state). I rebased the tes... Zheng Yan
12:19 AM Bug #9485: Monitor crash due to wrong crush rule set
Thanks so much.
BTW:
I reproduced this in my dev environment with 60 OSDs on one host. I created 6 virtual racks. (you...
Dong Lei
12:09 AM Bug #9485: Monitor crash due to wrong crush rule set
Thanks for the detailed instructions. I'll try them to repeat the problem. Loïc Dachary

09/24/2014

10:23 PM rgw Bug #9588 (Rejected): Keystone s3 auth integration lacking access_key = tenant:user ability suppo...
For instance according to http://docs.openstack.org/grizzly/openstack-object-storage/admin/content/configuring-openst... Mark Kirkwood
08:40 PM Bug #9485: Monitor crash due to wrong crush rule set
K=8 M=4 doesn't work.
I rebuilt the cluster and did the following steps.
(delete all pools)
1. create a profile...
Dong Lei
04:37 AM Bug #9485: Monitor crash due to wrong crush rule set
Could you please let me know if it always works with *K=8 M=4* ? Loïc Dachary
01:17 AM Bug #9485: Monitor crash due to wrong crush rule set
I know that I need 11 and the rule provides 12, and it looks like CRUSH will do the truncation.
It doesn't seem to be an iss...
Dong Lei
12:42 AM Bug #9485: Monitor crash due to wrong crush rule set
You have *K=8 M=3* which means your pool needs 11 OSDs. However the rule you defined will always provide 12 OSDs and ... Loïc Dachary
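The arithmetic in this comment can be sketched as follows: an erasure-coded pool needs K + M OSDs per placement group, so K=8 M=3 needs 11 while the rule described here yields 12. The function names are hypothetical and nothing below calls Ceph; whether the mismatch is what crashes the monitor is not established by the truncated comment.

```python
def chunks_needed(k, m):
    """Number of OSDs an erasure-coded pool needs: k data + m coding chunks."""
    return k + m


def rule_matches_profile(k, m, osds_from_rule):
    """True when the rule provides exactly as many OSDs as the profile needs.

    Hypothetical check for illustration; not a Ceph API.
    """
    return osds_from_rule == chunks_needed(k, m)


print(chunks_needed(8, 3))             # 11
print(rule_matches_profile(8, 3, 12))  # False: the rule provides one OSD too many
print(rule_matches_profile(8, 4, 12))  # True
```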
12:22 AM Bug #9485: Monitor crash due to wrong crush rule set
The profile used for the ecpool is K=8 M=3.
If I set the min_size = 3, max_size = 12 (as default), the monitor cras...
Dong Lei
12:04 AM Bug #9485: Monitor crash due to wrong crush rule set
Could you also attach the log of monitor crash you are seeing ? Note that if you change a crush rule that is currentl... Loïc Dachary
08:16 PM Bug #9585: ceph assertion using rocksdb store in master branch
It looks like powercycle makes the header's bitmap inconsistent with the actual data keys. Haomai Wang
11:10 AM Bug #9585 (Can't reproduce): ceph assertion using rocksdb store in master branch
ceph version 0.85-980-gc5906ec (c5906eca2ffa837891ba7d84775ece7b91f6c5c8)
ceph assertion when rocksdb is used for ...
Tamilarasi muthamizhan
07:47 PM CephFS Bug #6613: samba is crashing in teuthology
Still happening
/a/teuthology-2014-09-22_23:14:01-samba-giant-testing-basic-multi/50607
Greg Farnum
07:43 PM CephFS Bug #8427: ceph-fuse: Dumpling "cache still has 0+1 items, waiting (for caps to release?)" on shu...
/a/teuthology-2014-09-22_19:06:01-fs-dumpling-testing-basic-multi/505408
Grabbed all the logs out of /var/log/ceph...
Greg Farnum
04:21 PM Bug #6697 (Resolved): strncmp(3) must not be used on binary data
Loïc Dachary
07:02 AM Bug #6697 (Fix Under Review): strncmp(3) must not be used on binary data
https://github.com/ceph/ceph/pull/2567 Loïc Dachary
06:51 AM Bug #6697: strncmp(3) must not be used on binary data
Loïc Dachary
03:53 PM Bug #8910 (In Progress): ceph_test_objectstore: ObjectStore/StoreTest.ManyObjectTest/0 failure on...
reopening this bug as it seems to happen in the nightlies,
log: http://qa-proxy.ceph.com/teuthology/teuthology-...
Tamilarasi muthamizhan
02:51 PM devops Bug #9489 (Rejected): --zap-disk does not clear enough
... Loïc Dachary
10:22 AM devops Bug #9489 (Can't reproduce): --zap-disk does not clear enough
Alfredo Deza
10:19 AM devops Bug #9489: --zap-disk does not clear enough
I believe the original cause of the report was likely an error unrelated to ceph-disk. Loic, you had mentioned you might ... Brian Andrus
09:43 AM devops Bug #9489 (Need More Info): --zap-disk does not clear enough
A bit more context is needed here, how/what doesn't work as expected? Is it possible to reproduce?
When zap disk d...
Alfredo Deza
02:27 PM Fix #3180: use of strerror() for possibly-negative return values
Yeah, I actually fixed this, and forgot the bug still existed. Dan Mick
05:51 AM Fix #3180 (Rejected): use of strerror() for possibly-negative return values
I could not find an instance where strerror is used instead of cpp_strerror in the current master... Loïc Dachary
02:27 PM Feature #4611: cephtool: set-quota, no get-quota
heh, bug 4611 duplicates bug 8523, does it? :)
Dan Mick
05:18 AM Feature #4611 (Duplicate): cephtool: set-quota, no get-quota
Loïc Dachary
02:22 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
2014-09-22 16:00:20.680448 7fee6abcf700 0 -- 10.10.10.7:6808/25820 >> 10.10.10.16:0/1007485 pipe(0xba12a00 sd=628 :6... Samuel Just
02:21 PM rgw Bug #9587 (Resolved): ceph-radosgw sysvinit script on EL6 cannot set ulimit
The script tries to set ulimit -n 32768 as the apache user. It fails with:
bash: line 0: ulimit: open files: cannot m...
Alexandre Marangone
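One plausible reading of this report is that 32768 exceeds the hard open-files limit for the apache user, which an unprivileged process cannot raise. This is an assumption, since the error text above is truncated. The sketch below shows how that condition could be checked. `fits_hard_limit` is a hypothetical helper, and the `resource` module is available on Unix only.

```python
import resource


def fits_hard_limit(target, hard):
    """True if an unprivileged process may set its soft limit to target.

    Hypothetical helper: a soft limit can be raised only up to the hard
    limit, unless the hard limit is unlimited (RLIM_INFINITY).
    """
    return hard == resource.RLIM_INFINITY or target <= hard


# Inspect the limits of the current process (values vary per system).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft:", soft, "hard:", hard)
print("32768 allowed:", fits_hard_limit(32768, hard))
```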
02:18 PM CephFS Bug #8576: teuthology: nfs tests failing on umount
https://github.com/ceph/teuthology/pull/336 Greg Farnum
01:58 PM Bug #9113: osd: snap trimming eats memory, linearly
Samuel Just
01:57 PM Feature #9568: Add test case to test #9419 (ceph wip-9419)
Test case:
@0-cluster / start.yaml@...
Yuri Weinstein
12:36 PM Bug #9582: librados: segmentation fault on timeout
Okay, looks like this is another race:
1) The message is coming in over the wire, and the Pipe grabs a preallocated ...
Greg Farnum
07:38 AM Bug #9582 (Resolved): librados: segmentation fault on timeout
Summary: If you configure librados with rados_osd_op_timeout, timeouts will sometimes result in a segmentation fault.... Matthias Kiefer
10:52 AM Bug #9584: OpTracker segfault on shutdown (firefly)
/a/samuelj-2014-09-23_14:40:50-rados-firefly-wip-testing-old-vanilla-basic-multi/507309 (once it times out) Samuel Just
10:52 AM Bug #9584 (Can't reproduce): OpTracker segfault on shutdown (firefly)
#0 0x00007f5ec74baf07 in _dl_map_object_deps (map=map@entry=0x7f5ec76bc4e8, preloads=preloads@entry=0x0, npreloads=n... Samuel Just
10:37 AM Messengers Bug #1803 (New): msgr: behave better when ending TCP connections
This has been greatly improved with the addition of our socket timeouts and things, but I don't think it's properly r... Greg Farnum
03:12 AM Messengers Bug #1803 (Resolved): msgr: behave better when ending TCP connections
Not sure at which point this problem was fixed but it is doubtful that it stayed around for the past three years unno... Loïc Dachary
10:21 AM Bug #9554 (Can't reproduce): "FAILED assert(0 == "hit suicide timeout")" in upgrade:firefly-firef...
Looks like just an overloaded node. David Zafman
10:17 AM RADOS Feature #4650: osd: separate OSD names from their IDs
We expose OSD IDs in lots of places — like error reporting. But users can't specify those IDs (although they could on... Greg Farnum
05:25 AM RADOS Feature #4650: osd: separate OSD names from their IDs
From a system administration point of view there is no need to know about the OSD id. Naming the OSDs with human read... Loïc Dachary
10:05 AM RADOS Bug #8984 (Won't Fix): creating erasure-code pool when not having a root item default
The recommended way to deal with the absence of a *default* root is to define an erasure-code-profile that "specifies... Loïc Dachary
10:00 AM Bug #8942: Bad JSON output in ceph osd tree
Loïc Dachary
10:00 AM CephFS Cleanup #2378 (Resolved): "ceph -s" MDS output is confusing
We don't print mds status if there's not an FS any more. Greg Farnum
09:42 AM RADOS Feature #6114: Complete python binding interfaces for librados
It went stale as I couldn't keep up with the changes to the modules themselves as the modifications were significant... Alfredo Deza
08:34 AM RADOS Feature #6114: Complete python binding interfaces for librados
What has become of https://github.com/ceph/ceph/commits/wip-5900 ? Is there a reason why it was not merged ? Or am I ... Loïc Dachary
09:40 AM Bug #9556 (Duplicate): Segmentation fault in upgrade:dumpling-firefly-x-giant-distro-basic-multi ...
Joao Eduardo Luis
09:39 AM Bug #9556: Segmentation fault in upgrade:dumpling-firefly-x-giant-distro-basic-multi run
From Sam's advice to look for something related to "Read Timeout" and from the log, this seems to be a duplicate of #... Joao Eduardo Luis
09:29 AM RADOS Feature #6421: FileStore: Op unit tests
change the %Done to reflect the fact that there is work done already. Loïc Dachary
09:16 AM devops Fix #8508: packaging: deb repository key should be @redhat.com
The deb repository key just needs to be re-created with a @redhat.com email Loïc Dachary
09:08 AM Bug #8323 (Duplicate): mon_osd_allow_primary_affinity Can not be Injected
Loïc Dachary
08:36 AM Feature #5511 (Duplicate): rados.py support for object locking
#6114
Loïc Dachary
08:26 AM Bug #7843 (Can't reproduce): OSD fails to start
Feel free to re-open if you have a HOWTO reproduce the issue. If you figured out what was wrong, it would be nice if ... Loïc Dachary
08:16 AM rgw Feature #7680: Use new civetweb git repo for ceph
The repository copy is useful when fixes are needed. They can diverge from upstream while the change is proposed. Loïc Dachary
08:12 AM Feature #7664 (Resolved): systemd service files
https://github.com/ceph/ceph/tree/giant/systemd Loïc Dachary
08:10 AM Bug #7623 (Resolved): local 'best' uninitialized in Objecter
Fixed by 605e645026487519d4195358330832b3369b531d Loïc Dachary
08:05 AM Bug #6101: ceph-osd crash on corrupted store
Bumping so it does not get to the bottom of the list for the next bug scrub. Loïc Dachary
08:01 AM Bug #7368: ceph osd repair * blocks after some minutes and prevent other ceph pg repair commands
Another mention of things slowing down when repair is almost complete : http://tracker.ceph.com/issues/9566 . Not sur... Loïc Dachary
07:52 AM Bug #7409 (Can't reproduce): "make check" doesn't work without --with-radosgw
... Loïc Dachary
07:43 AM Bug #9362: librados, rados_read corrupts memory on timeout
Update: The patch branch I used did not contain the complete code that has been merged to the dumpling branch. Using ... Matthias Kiefer
07:29 AM Feature #7340 (Duplicate): rados.py does not expose object locking
Loïc Dachary
07:18 AM Cleanup #7105 (Closed): There are three different ways to retrieve an authentication key
It is not necessary indeed. However, now that it has been published it would not be backward compatible to remove any... Loïc Dachary
07:09 AM Bug #6834 (Can't reproduce): nightlies: monitor crashed in emperor
It either showed up again and has been associated with another issue or it has been fixed. Loïc Dachary
06:59 AM rgw Feature #9581 (New): Ability to move objects to a second storage tier based on policy
To be compatible with AWS S3 API features like bucket lifecycle, ceph should have the ability to move the object from standard ... Swami Reddy
06:48 AM Feature #6687: Ability to set up/down/in/out based on CRUSH hierarchy
+1 Loïc Dachary
06:47 AM Feature #3604 (Resolved): print lookup path when reporting -ENOENT to user-space
Loïc Dachary
06:45 AM Feature #6567 (Rejected): emit warning on unknown/ invalid configuration directives
This is unfortunately not possible as there is no central place to query to know what is a valid option and what is n... Loïc Dachary
06:26 AM Bug #6371 (Duplicate): rados bench segfaults when read --block-size < write --block-size
Loïc Dachary
06:09 AM devops Bug #9506: Pass monitor SSH addresses via CLI flag
This will be *very* tricky to do with CLI flags, so after discussing this with Kyle, it was decided that using the ce... Alfredo Deza
05:28 AM devops Bug #9506: Pass monitor SSH addresses via CLI flag
The use case is ceph-deploy is being executed on a management node, homed on a management network. The monitors are m... Kyle Bader
06:09 AM Feature #5521: Enhance PGLS or new op to list all namespace/objects in a pool.
Loïc Dachary
06:08 AM Feature #5521 (Duplicate): Enhance PGLS or new op to list all namespace/objects in a pool.
Loïc Dachary
06:02 AM Feature #9580 (Resolved): ceph-disk, ceph-osd: make journal [partition] creation conditional base...
For example, with keyvaluestore-dev, ceph-disk makes a journal partition and generally screws things up. See http://artic... Sage Weil
05:46 AM Bug #9579 (Won't Fix): Default parameters are not getting initialized for EC profile using isa EC...

When creating an EC profile using the erasure code plugin "isa", the default values for parameters k, m and technique are ...
Mallikarjun Biradar
05:43 AM Feature #4771 (Rejected): Snippet / included configuration
Loic Dachary wrote:
> The ceph.conf file tends to disappear almost entirely. The mons can contain all the information...
Wido den Hollander
05:33 AM Feature #4771: Snippet / included configuration
The ceph.conf file tends to disappear almost entirely. The mons can contain all the information and are a central poin... Loïc Dachary
05:36 AM Feature #4230 (Resolved): librados: node.js bindings
https://github.com/ksperis/node-rados Loïc Dachary
05:25 AM devops Bug #9510 (Closed): ceph-deploy: Move mon keyring generation 'mon create-initial'
Kyle Bader
05:12 AM Feature #2158 (Duplicate): cephtool: helpful error/timeout when no monitor quorum
Loïc Dachary
04:54 AM Subtask #4306 (Resolved): make the new snap trimmer design work with split
Loïc Dachary
04:46 AM Feature #2147 (Resolved): objclass: add CLS_ERR macro
https://github.com/ceph/ceph/blob/giant/src/objclass/objclass.h#L31 Loïc Dachary
03:18 AM Feature #4005: Add perftools to the kernel debian package script
Any progress ? Loïc Dachary
03:17 AM Feature #1810 (Resolved): monclient: timeouts?
Implemented by 671a76d64bc50e4f15f4c2804d99887e22dcdb69 Loïc Dachary
03:04 AM Bug #4206 (Resolved): concurrent rados bench processes don't work well for seq reads
Implemented by 308758b7878c48ab64caf71ff646e057c2c1c5aa Loïc Dachary
03:01 AM Fix #4202: osd: pg delete
a command that deletes a designated pg ? If so it would help to have a use case. Loïc Dachary
02:56 AM Support #3902 (Closed): S3-tests need to cleanup after themselves
Tests are run on short lived machines and this won't be an issue. Loïc Dachary
02:54 AM Feature #3855 (Resolved): Making Scrubs Nicer
Loïc Dachary
02:52 AM Documentation #3846 (Resolved): Debian install has incorrect gitbuilder URL
The install pages have been reworked. Loïc Dachary
02:49 AM Feature #3202 (Resolved): tools: coverity clean
An ongoing effort by Danny Al-Gaaf Loïc Dachary
02:42 AM Feature #3241 (Resolved): qa: integration tests for mon, osd, and mds caps
There now are caps tests run by teuthology : https://github.com/ceph/ceph/blob/giant/qa/workunits/mon/caps.py https:/... Loïc Dachary
02:29 AM Feature #3095 (Resolved): rbd tool resize improvements
... Loïc Dachary
02:27 AM Feature #3083 (Resolved): Provide separate APT repos for argonaut, bobtail, etc; stable would alw...
Not as suggested but the stable repositories are organized in a sensible way. Loïc Dachary
02:23 AM Feature #2953 (Resolved): append() in librados is not exposed to python API
Implemented by 39bf68c3ceee3f62960d0866f35835325cca5660 Loïc Dachary
02:19 AM Bug #2848: OSDMap: pool_id is 64-bit, but pool_max is 32-bit
"still valid":https://github.com/ceph/ceph/blob/giant/src/osd/OSDMap.h#L206 Loïc Dachary
02:16 AM Feature #2812 (Resolved): automated CentOS testing
RPM based operating systems are now part of the teuthology runs. Loïc Dachary
02:14 AM Feature #2776 (Resolved): rados tool: bulk removal of objects
Implemented by cc8df29e19a1fc441ad903aeeb59f7d3e15a5e7c Loïc Dachary
02:08 AM Feature #2755 (Resolved): ceph-conftool: optionally return the default for a config option if no ...
Marking as resolved since there now is a way to get the default value, although not as suggested.... Loïc Dachary
02:00 AM Cleanup #2671 (Resolved): buffer.h: do efficient buffer comparisons
Resolved by 2a46564158ebf519ae6e7ee318b97c61cf032692 with content_equals Loïc Dachary
01:53 AM Tasks #2529 (Resolved): debian: Merge packaging changes from Ubuntu 12.04
There is no longer a difference. Loïc Dachary
01:50 AM Feature #2519 (Resolved): rados: allow setting pg_num and pgp_num when creating a pool
Using a mon cmd to create the pool instead of the specialized function supports setting pg_num / pgp_num. Loïc Dachary
01:40 AM Bug #2154 (Resolved): rados: bench seq should not segfault when blocksize doesn't match write blo...
... Loïc Dachary
01:32 AM Feature #2112 (Resolved): msgr fault injection
Starting with 90f66980bfb1f2541dcb11be2c358a9832a291b1 in November 2012 a number of *OPTION(ms_inject_...* options have be... Loïc Dachary
01:07 AM Feature #1583 (Resolved): osd: bound pg log memory usage
Memory consumption has improved/changed a lot since this ticket was opened and I believe this issue is no longer relevant. Loïc Dachary
01:04 AM Feature #1619 (Resolved): libvirt: test with selinux/apparmour enabled
I believe this has been extensively tested in the context of OpenStack Loïc Dachary
12:59 AM Feature #1525 (Resolved): qa: check out fio, add to ceph-qa-suite if it's good
https://github.com/ceph/ceph-qa-suite/blob/giant/suites/tgt/basic/tasks/fio.yaml and https://github.com/ceph/ceph/blo... Loïc Dachary
12:52 AM Tasks #1418: set up a no-atomic-ops gitbuilder
gitbuilders currently use *--with-libatomic-ops* Loïc Dachary
12:34 AM Feature #543 (Resolved): PG::search_for_missing: don't iterate over all missing
The code base changed significantly and does not have this problem anymore. Loïc Dachary
12:30 AM Feature #1091 (Duplicate): librados: support pgls filter
http://tracker.ceph.com/issues/9262 Loïc Dachary
12:24 AM Cleanup #1042: need const iterator for bufferlist
"still valid":https://github.com/ceph/ceph/blob/giant/src/include/buffer.h#L240 Loïc Dachary

09/23/2014

11:58 PM Bug #9485: Monitor crash due to wrong crush rule set
What probably happens is that you created an erasure code profile with k+m that is lower than the number of OSDs prov... Loïc Dachary
07:03 PM Bug #9485: Monitor crash due to wrong crush rule set
Because the monitor crashed and cannot be restarted, I currently cannot get "ceph osd dump".
I checked the i...
Dong Lei
08:40 AM Bug #9485: Monitor crash due to wrong crush rule set
Could you also please add the output of *ceph osd dump* ? It looks like you have run into http://tracker.ceph.com/iss... Loïc Dachary
08:38 PM Bug #9558: Both op threads and dispatcher threads get hung even for few minutes during peering stage
More info:
When an OSD daemon/host is down, some PGs become active+degraded, while others are still active+clean. As ...
Zhi Zhang
06:16 PM CephFS Bug #9562: Lockdep assertion in Filer purge
can we just unlock the PurgeRange/Probe locks before using the objecter? Zheng Yan
06:21 AM CephFS Bug #9562 (In Progress): Lockdep assertion in Filer purge
John Spray
06:21 AM CephFS Bug #9562: Lockdep assertion in Filer purge

So I think this bug already existed with the Probe lock, but it was triggered by the new PurgeRange lock, because t...
John Spray
05:48 PM Bug #9528: RadosModel assertion failure in firefly
sam, please mention the parent bug. Tamilarasi muthamizhan
01:22 PM Bug #9528 (Duplicate): RadosModel assertion failure in firefly
Samuel Just
05:07 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
fix is in wip-9487 and wip-sam-testing Sage Weil
02:28 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
I'm not 100% sure, so I thought I'd ask: what's the exact reason for the PG being marked incomplete here? Is it the... Florian Haas
02:05 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
Samuel Just
01:58 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
https://github.com/ceph/ceph/pull/2525
The num_trimmed does not seem to be reset. I think you are not trimming at...
Samuel Just
06:12 AM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
Whoa. While a cluster with this patch applied doesn't spin like crazy in snap_trim anymore, killing an OSD seems to i... Florian Haas
04:06 PM Bug #9113: osd: snap trimming eats memory, linearly
It's not just dumpling, the repops set in the snap trimmer is just wonky. We need to trim a bounded set of objects, ... Samuel Just
03:47 PM Bug #9554: "FAILED assert(0 == "hit suicide timeout")" in upgrade:firefly-firefly-testing-basic-v...

The crashed osd.5 was on a node that had a load average of 38. osd.1 didn't see ping responses although it saw o...
David Zafman
03:27 PM rbd Bug #8187: librbd: list_children() reports duplicates with cache pools
Never mind, figured it out. Apparently it's not enough to set pool2 up as a tier of pool1, it also has to be an over... Adam Crume
02:51 PM rbd Bug #8187: librbd: list_children() reports duplicates with cache pools
Josh, I'm having trouble reproducing this. Do you have a test case? Adam Crume
10:55 AM rbd Bug #8187 (In Progress): librbd: list_children() reports duplicates with cache pools
Adam Crume
02:31 PM CephFS Bug #9564: kcephfs crash in _nfs4_do_open
/a/teuthology-2014-09-22_23:10:02-knfs-giant-testing-basic-multi/506055/teuthology.log John Spray
02:26 PM Bug #9462: msgr deadlock: osd reply vs mark_down vs fault
Finally got through a suite run and it looks pretty good, but need to check the few failures:
http://pulpito.ceph.co...
Greg Farnum
02:26 PM devops Bug #9268 (Resolved): Recipe errors in rgw:multifs-dumpling-testing-basic-vps
Fixed this in ceph-qa-chef. I thought there was another issue open also in teuthology and assigned to me; this was maybe... Sandon Van Ness
02:23 PM devops Bug #9267 (Resolved): "Gem::DependencyError" in upgrade:dumpling-dumpling-distro-basic-vps
Problematic images now include chef. Sandon Van Ness
02:20 PM devops Bug #9489: --zap-disk does not clear enough
Ian Colle
02:15 PM devops Bug #9567: Missing packages in upgrade:dumpling-firefly-x-giant-distro-basic-vps run
This was caused when moving to the new rhel7 gitbuilder: firefly was committed but not built on the new one when the old one was... Sandon Van Ness
02:14 PM devops Bug #9567 (Resolved): Missing packages in upgrade:dumpling-firefly-x-giant-distro-basic-vps run
Sandon Van Ness
02:09 PM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
I had some comments on that pull request. Samuel Just
02:08 PM Bug #9545: filestore stuck in journal->should_commit_now() loop on shutdown
Samuel Just
02:07 PM devops Bug #9548 (Need More Info): ceph mon creation failed for centOS
The command `mon create-initial` does not take any hosts as arguments. It doesn't take any arguments at all.
It will look at...
Alfredo Deza
02:07 PM devops Bug #8976 (Resolved): httpd on RHEL7 (RHEL repo) incompatible with mod_fastcgi (ceph repo)
Closing as Tamil tested and said it was good. Sandon Van Ness
02:07 PM Bug #8629: cache_evict needs to prevent make_writeable from creating a snapdir
Samuel Just
02:05 PM Bug #9285: osd: promoted object can get evicted before promotion completes
I left a comment on a simpler approach. Samuel Just
02:00 PM Linux kernel client Bug #8568: libceph: kernel BUG at net/ceph/osd_client.c:885
BUG_ON(!list_empty(&req->r_req_lru_item)) in __kick_osd_requests()
Can't reproduce but need to look harder into ho...
Ilya Dryomov
01:39 PM Bug #9472 (Duplicate): osd crash in -upgrade:dumpling-dumpling-distro-basic-vps suite
Samuel Just
01:38 PM Bug #9476 (Duplicate): "Segmentation fault (core dumped)" in upgrade:dumpling-giant-x:parallel-gi...
Samuel Just
01:35 PM Bug #9570: osd crash in FileJournal::WriteFinisher::entry() aio
what was the assert? Samuel Just
01:33 PM Bug #9501 (Rejected): Assertion in FileJournal::do_write
Samuel Just
01:27 PM Bug #9008: Objecter: pg listing can deadlock when throttling is in use
Samuel Just
01:26 PM Bug #9422 (Can't reproduce): librados: client.admin authentication error (110) Connection timed out
Samuel Just
01:25 PM Bug #9274 (Can't reproduce): "AssertionError: failed to recover before timeout expired" in upgrad...
Samuel Just
01:21 PM Bug #9544 (Pending Backport): osd: pg deletion vs create race leads to EEXIST on mkcoll (dumpling)
Samuel Just
01:18 PM rgw Bug #8587 (Fix Under Review): rgw: subuser object not created correctly
Yehuda Sadeh
01:15 PM Bug #9418: mon: drop internal-purpose messages from clients without proper caps
Joao Eduardo Luis
01:14 PM Bug #9546 (Rejected): LibRadosWatchNotify.WatchNotifyTest failure
Sage Weil
01:13 PM Bug #9293 (Pending Backport): _collection_move_rename EEXIST
Samuel Just
01:12 PM Bug #9293 (Fix Under Review): _collection_move_rename EEXIST
Samuel Just
01:12 PM rgw Feature #7467 (Fix Under Review): Make radosgw work with multiple hostnames
Yehuda Sadeh
09:08 AM rgw Feature #7467 (In Progress): Make radosgw work with multiple hostnames
Yehuda Sadeh
01:11 PM rgw Bug #5595: object has a Content-Type, but its content_type property is not shown in Swift object ...
Needs review, can't set status on this tracker. Yehuda Sadeh
11:21 AM rgw Bug #5595: object has a Content-Type, but its content_type property is not shown in Swift object ...
I think this happens if the object was created before, and then its metadata was modified. It's similar to another is... Yehuda Sadeh
01:06 PM Bug #9574: Backfill: recheck full status once reservation is granted
Samuel Just
12:07 PM Bug #9574 (Resolved): Backfill: recheck full status once reservation is granted
Otherwise, we queue many backfill reservations while we are not full and then each one is granted in turn without che... Samuel Just
01:05 PM Bug #9443 (Rejected): btrfs pwrite returns EEXIST on journal FileJournal::write_bl
Not our bug. Samuel Just
12:57 PM rgw Bug #9575: s3tests.functional.test_s3.test_region_copy_object fails (races with radosgw-agent?)
teuthology@teuthology:/a/teuthology-2014-09-22_23:02:01-rgw-giant-testing-basic-multi/505881 Sage Weil
12:56 PM rgw Bug #9575: s3tests.functional.test_s3.test_region_copy_object fails (races with radosgw-agent?)
teuthology@teuthology:/a/teuthology-2014-09-22_23:02:01-rgw-giant-testing-basic-multi/505875 Sage Weil
12:48 PM rgw Bug #9575: s3tests.functional.test_s3.test_region_copy_object fails (races with radosgw-agent?)
Seems to me like it is timing out due to the slow EC backend. Yehuda Sadeh
12:34 PM rgw Bug #9575 (Duplicate): s3tests.functional.test_s3.test_region_copy_object fails (races with rados...
... Sage Weil
12:49 PM rgw Bug #9576 (Resolved): rgw: update object content-length doesn't work correctly
This only applies to the swift POST object metadata api call. Yehuda Sadeh
11:51 AM devops Tasks #8366 (Fix Under Review): Update ceph.com/docs to default to the latest major release (0.80)
We need to review this a bit further. Pointing to the latest major release is fine, but we need to have a way to cher... John Wilkins
11:43 AM Bug #8885 (Resolved): SIGABRT in TrackedOp::dump() via dump_ops_in_flight()
Somnath Roy
11:08 AM Bug #9547 (Resolved): python rados aio_read truncates returned buffer on \000
Loïc Dachary
10:33 AM Bug #9482 (Resolved): osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log....
Samuel Just
10:33 AM Bug #9339 (Resolved): ReplicatedPG crash in hitset_create
Samuel Just
10:32 AM Bug #8777 (Resolved): osd/PGLog.h: 88: FAILED assert(rollback_info_trimmed_to_riter == log.rbegin())
Samuel Just
10:32 AM Bug #9054 (Resolved): ceph_test_rados: FAILED assert(!old_value.deleted())
Samuel Just
10:32 AM Bug #9326 (Resolved): osd crash in upgrade:dumpling-firefly-x-master-distro-basic-vps suite
Does not need to be backported! Samuel Just
10:30 AM Bug #9240 (Resolved): osd_max_backfills = 1 can cause reserver deadlock for EC
Samuel Just
10:30 AM Bug #9179 (Resolved): unfound objects, recovery timeout
Samuel Just
10:30 AM Bug #9481 (Resolved): osd/PGLog.h: 87: FAILED assert(rollback_info_trimmed_to == head)
Samuel Just
10:30 AM Bug #9497 (Resolved): choose_acting has to let the pg be down any time acting < min_size even if ...
Samuel Just
09:39 AM Linux kernel client Bug #9573 (New): krbd: investigate a dd-in-a-loop slowdown
Reported at the bottom of #8818. Ilya Dryomov
09:38 AM rbd Bug #5768 (Fix Under Review): rbd-fuse: leak in enumerate_images()
https://github.com/ceph/ceph/pull/2524 Adam Crume
09:34 AM rbd Bug #6926 (Fix Under Review): rbd: diff output includes previously non-existent objects as zeroed...
https://github.com/ceph/ceph/pull/2523 Adam Crume
09:27 AM Feature #8188: librados: interface to inspect pool properties
https://github.com/ceph/ceph/pull/2552 Adam Crume
09:27 AM Feature #8188 (Fix Under Review): librados: interface to inspect pool properties
Adam Crume
09:26 AM rgw Bug #7796 (Won't Fix): RGW Keystone token auth fails with '411 Length Required' when Keystone usi...
The recommendation is to work around the issue using the aforementioned apache configuration. Yehuda Sadeh
09:14 AM rgw Bug #8676: md5sum check failed during readwrite.py
This might have been fixed, downgrading it for now until it's dis/proved. Yehuda Sadeh
08:59 AM rgw Bug #8676: md5sum check failed during readwrite.py
There's a chance this one is the same as #9307 Yehuda Sadeh
09:07 AM rgw Bug #6611 (Won't Fix): RGW: Using underscores when setting headers returns 403
The cgi interface prevents us from doing anything about it. With civetweb it'd be different, but at this point there'... Yehuda Sadeh
09:02 AM devops Bug #6592 (Can't reproduce): 3.8 kernel + /dev/cciss/c0d1 + precise : fail to show in /dev/disk/b...
I lost access to the hardware before being able to properly reproduce / diagnose this border case. Loïc Dachary
08:59 AM rgw Bug #9307 (Pending Backport): "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dump...
Should have been fixed by commit:d41c3e858c6f215792c67b8c2a42312cae07ece9
Note that when backporting also need to ...
Yehuda Sadeh
08:57 AM Bug #9408: erasure-code: misalignment
gitbuilder is all green Loïc Dachary
08:56 AM Bug #9408 (Fix Under Review): erasure-code: misalignment
Corresponding pull request https://github.com/ceph/ceph/pull/2558 Loïc Dachary
08:52 AM rgw Bug #9529 (Resolved): ./common/ceph_crypto.h: 83: FAILED assert(s == SECSuccess)
Yehuda Sadeh
08:52 AM rgw Bug #9529: ./common/ceph_crypto.h: 83: FAILED assert(s == SECSuccess)
Fixed by commit:7b137246b49a9f0b4d8b8d5cebfa78cc1ebd14e7 Yehuda Sadeh
08:45 AM Bug #9381 (Resolved): "jerasure load dlopen(/usr/lib64/ceph/erasure-code/libec_lrc.so)" error in ...
All rpm packages were eventually updated. Loïc Dachary
08:42 AM Bug #9224 (Can't reproduce): osd: segv in dlopen
Loïc Dachary
08:29 AM Bug #9470 (Resolved): daemon pid file is not being created when running service ceph
Loïc Dachary
08:29 AM Bug #9509 (Resolved): init script cannot stop OSDs
Loïc Dachary
08:15 AM Bug #9572 (Fix Under Review): erasure-code: BlaumRoth default encoding regression
Loïc Dachary
02:45 AM Bug #9572: erasure-code: BlaumRoth default encoding regression
https://github.com/ceph/ceph/pull/2556 Loïc Dachary
02:35 AM Bug #9572 (In Progress): erasure-code: BlaumRoth default encoding regression
Loïc Dachary
02:10 AM Bug #9572 (Resolved): erasure-code: BlaumRoth default encoding regression
Fixing the "bug on BlaumRoth w constraint":https://github.com/ceph/ceph/commit/9e2d04f7631cc7cd8444e7329890c2429a2d94... Loïc Dachary
06:31 AM Feature #9420: erasure-code: tools and archive to check for non regression of encoding
Loïc Dachary
04:37 AM Feature #9343 (Resolved): erasure-code: allow upgrades for lrc and isa plugins
Loïc Dachary
12:13 AM devops Bug #9506 (Rejected): Pass monitor SSH addresses via CLI flag
There probably is something to be done to clarify the confusion between mon id and hostnames but it is another topic ;-) Loïc Dachary

09/22/2014

10:43 PM CephFS Bug #9563: kcephfs crash in ceph_mdsc_do_request
The bug came from "ceph: use pagelist to present MDS request data". I force-updated the testing branch; please test it. Zheng Yan
05:04 AM CephFS Bug #9563 (Resolved): kcephfs crash in ceph_mdsc_do_request

From serial console:...
John Spray
07:50 PM Bug #9571 (Resolved): rocksdb testing with powercycling fails on trusty
This is when osd_objectstore is using rocksdb,... Tamilarasi muthamizhan
07:19 PM Bug #9503 (Fix Under Review): Dumpling: removing many snapshots in a short time makes OSDs go ber...
Sage Weil
07:32 AM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
OK, that seems to have done it. After installing the updated autobuild with Dan's patch and keeping the snap_trim lim... Florian Haas
07:18 PM Bug #9502 (Pending Backport): mon: does not verify disk is not full on startup
Sage Weil
07:16 PM Bug #9455 (Resolved): mon: audit log read events should be debug level
Sage Weil
03:57 PM devops Feature #9050: Calamari builds for ceph.com
Yes, we need a ceph.com/<something>/calamari repo which contains the various packages.
What needs some discussion...
Neil Levine
03:55 PM Bug #9570 (Rejected): osd crash in FileJournal::WriteFinisher::entry() aio
h3. Workaround
Try with a kernel newer than 3.13 - as new as the environment allows.
h3. Collect more informati...
Sheldon Mustard
02:18 PM Feature #9420: erasure-code: tools and archive to check for non regression of encoding
* Created the repository https://github.com/ceph/ceph-erasure-code-corpus
* Asked Sandon if having such a reposito...
Loïc Dachary
01:37 PM Feature #9568 (Resolved): Add test case to test #9419 (ceph wip-9419)
Yuri Weinstein
12:03 PM Bug #9538 (Resolved): mon crashes on some --format=plain commands
Loïc Dachary
10:32 AM devops Bug #9567: Missing packages in upgrade:dumpling-firefly-x-giant-distro-basic-vps run
and http://pulpito.front.sepia.ceph.com/teuthology-2014-09-21_19:25:01-upgrade:dumpling-firefly-x-giant-distro-basic-... Yuri Weinstein
09:10 AM devops Bug #9567 (Rejected): Missing packages in upgrade:dumpling-firefly-x-giant-distro-basic-vps run
In run http://pulpito.front.sepia.ceph.com/teuthology-2014-09-21_19:25:01-upgrade:dumpling-firefly-x-giant-distro-bas... Yuri Weinstein
10:13 AM Feature #9343 (Fix Under Review): erasure-code: allow upgrades for lrc and isa plugins
Rebased the pull request against giant https://github.com/ceph/ceph/pull/2551 Loïc Dachary
07:45 AM Feature #9343: erasure-code: allow upgrades for lrc and isa plugins
The "logs of the failed test":http://qa-proxy.ceph.com/teuthology/ubuntu-2014-09-19_04:50:17-rados:monthrash-wip-9343... Loïc Dachary
07:33 AM Feature #9343: erasure-code: allow upgrades for lrc and isa plugins
The "monthrash against giant":http://pulpito.ceph.com/ubuntu-2014-09-20_00:35:01-rados:monthrash-giant-testing-basic-... Loïc Dachary
10:08 AM Bug #9482: osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)
Samuel Just wrote:
> I'd need the corresponding logs from osd.5 to be sure, but I believe the problem is that osd.5,...
Aaron T
10:05 AM Bug #9515 (Duplicate): "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:...
Loïc Dachary
10:04 AM Bug #9515: "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:parallel-gia...
Seems to be related to http://tracker.ceph.com/issues/9508 and recently resolved Loïc Dachary
10:01 AM Bug #9515: "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:parallel-gia...
The stack trace is:... Loïc Dachary
09:17 AM Bug #9515: "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:parallel-gia...
Also seeing in suite:upgrade:dumpling-firefly-x
http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-21_19:25:01...
Yuri Weinstein
07:53 AM Bug #9515: "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:parallel-gia...
Also shows in http://tracker.ceph.com/issues/9343#note-9 Loïc Dachary
08:52 AM Feature #9161: Cache warmup and ejection
I started to work on this.
Is there a chance it could go into Hammer release?
Anonymous
08:51 AM Feature #9161: Cache warmup and ejection
I started to work on this.
Is there a chance it could go into Hammer release?
Anonymous
08:43 AM Fix #9566 (Need More Info): osd: prioritize recovery of OSDs with most work to do

Assume 72 hours for host replacement/reprovisioning SLA. When host goes down (hardware failure), we expect complete...
Sheldon Mustard
07:36 AM devops Bug #9510: ceph-deploy: Move mon keyring generation 'mon create-initial'
Would adding a separate command for keyring creation be better?
Would moving it to `create-initial` mean that it ...
Alfredo Deza
05:12 AM devops Bug #9506 (Need More Info): Pass monitor SSH addresses via CLI flag
Could you give me a use case? In what context would something like this happen, and at what point in the deployment p... Alfredo Deza
05:08 AM CephFS Bug #9564: kcephfs crash in _nfs4_do_open
http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-19_23:10:01-knfs-giant-testing-basic-multi/500158/... John Spray
05:07 AM CephFS Bug #9564 (Resolved): kcephfs crash in _nfs4_do_open
John Spray
04:47 AM CephFS Bug #9562: Lockdep assertion in Filer purge
... John Spray
04:46 AM CephFS Bug #9562 (Resolved): Lockdep assertion in Filer purge
John Spray
04:08 AM Linux kernel client Bug #8979 (Resolved): GPF kernel panics - auth?
Landed in 3.17-rc5. Opened #9560 and #9561 for the issues mentioned above. Ilya Dryomov
04:04 AM Linux kernel client Bug #9561 (Rejected): libceph: do not crash if auth reply is not understood
Ilya Dryomov
04:02 AM Linux kernel client Bug #9560 (Rejected): libceph: msg kmalloc failure handling on the reply path
Ilya Dryomov
02:55 AM Bug #9077: Cluster is up in MON node even if Ceph is uninstalled in OSD node
Issue reproduced, find the following info
Attaching mon and dmesg log of monitor node
Executed following comman...
Ramakrishnan P
12:51 AM rbd Bug #8000: SLAB: Unable to allocate memory on node 0
RAM frequency, interesting. Something to keep in mind.. Ilya Dryomov

09/21/2014

11:56 PM Bug #9559: [off-by-one vulnerability] ceph-0.80.5/src/common/fd.cc dump_open_fds() function
ceph-0.80.5/src/common/fd.cc dump_open_fds() function allows attackers to cause buffer overflow via vectors related t... qinghao tang
11:47 PM Bug #9559 (Resolved): [off-by-one vulnerability] ceph-0.80.5/src/common/fd.cc dump_open_fds() func...
ceph-0.80.5/src/common/fd.cc dump_open_fds() function allows attackers to cause buffer overflow via vectors related... qinghao tang
11:28 PM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
Please ignore the previous update. Here is the correct one:
While some osds were in a nearfull situation, shutdown ...
Sahana Lokeshappa
10:27 PM Bug #8863: osd: second reservation rejection -> crash
Yes Sage it is included.
commit 2b13de16c522754e30a0a55fb9d072082dac455e
Author: Sage Weil <sage@redhat.com>
Dat...
Sahana Lokeshappa
10:24 PM Bug #9558 (Can't reproduce): Both op threads and dispatcher threads get hung even for few minutes...
During peering stage, op threads will handle peering event and check the missing objects in this function: bool PG::M... Zhi Zhang
09:56 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
Thanks a lot! I'll report back once there is an update to share. Florian Haas
07:26 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
That log shows me PGs with huge snap_trimq, which is very unfriendly to the snap trimmer. I've added Dan's patch on t... Sage Weil
12:51 PM Bug #9503 (Need More Info): Dumpling: removing many snapshots in a short time makes OSDs go berserk
Sage Weil
07:16 PM Bug #8752: firefly: scrub/repair stat mismatch
Sage Weil wrote:
> Is it possible the inconsistencies are correlated with the kernel (vs userspace) client? That wo...
Dmitry Smirnov
07:03 PM Bug #8752: firefly: scrub/repair stat mismatch
Dmitry Smirnov wrote:
> On 0.80.5 inconsistencies disappear from pool 20 (CephFS caching pool) although I also stopp...
Sage Weil
06:12 PM Bug #8752: firefly: scrub/repair stat mismatch
On 0.80.5 inconsistencies disappear from pool 20 (CephFS caching pool) although I also stopped using kernel FS client... Dmitry Smirnov
07:11 PM rbd Bug #8000 (Closed): SLAB: Unable to allocate memory on node 0
No particular access pattern seems to provoke this issue and frankly I have no clue what's causing it apart from "dee... Dmitry Smirnov
04:26 PM CephFS Feature #9557 (Resolved): mds: verify backtrace on fetch_dir
Verify that the backtrace is valid when we finish fetch_dir. That is, that we would have been able to locate the dir... Sage Weil
04:13 PM Bug #9285 (Fix Under Review): osd: promoted object can get evicted before promotion completes
Sage Weil
04:03 PM Bug #8629 (Fix Under Review): cache_evict needs to prevent make_writeable from creating a snapdir
https://github.com/ceph/ceph/pull/2550
Sage Weil
02:25 PM Bug #9556 (Duplicate): Segmentation fault in upgrade:dumpling-firefly-x-giant-distro-basic-multi ...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-21_10:14:47-upgrade:dumpling-firefly-x-giant-distr... Yuri Weinstein
01:42 PM Bug #9545 (Fix Under Review): filestore stuck in journal->should_commit_now() loop on shutdown
https://github.com/ceph/ceph/pull/2549 Sage Weil
12:50 PM Bug #9389: ec pg stuck peering, did not send query for one shard
At least on that one, looks like do_queries doesn't send the query. That can happen if the osd is down as of the osd... Samuel Just
12:41 PM Bug #9389: ec pg stuck peering, did not send query for one shard
/a/samuelj-2014-09-20_19:00:23-rados-wip-sam-testing-firefly2-wip-testing-old-vanilla-basic-multi/501557
probably ...
Samuel Just
11:06 AM Bug #9555 (Resolved): msg/Pipe.cc: 1513: FAILED assert(0 == "old msgs despite reconnect_seq featu...
firefly
/a/samuelj-2014-09-20_19:00:23-rados-wip-sam-testing-firefly2-wip-testing-old-vanilla-basic-multi/501749/r...
Samuel Just
10:08 AM Bug #9293: _collection_move_rename EEXIST
Samuel Just
05:44 AM Bug #9547: python rados aio_read truncates returned buffer on \000
firefly backport https://github.com/ceph/ceph/pull/2548 Loïc Dachary
03:18 AM Bug #9547: python rados aio_read truncates returned buffer on \000
The example from the description was not right but fixing it to have the expected length does not change the result o... Loïc Dachary
03:16 AM Bug #9547 (Pending Backport): python rados aio_read truncates returned buffer on \000
Loïc Dachary

09/20/2014

06:46 PM Bug #9554 (Can't reproduce): "FAILED assert(0 == "hit suicide timeout")" in upgrade:firefly-firef...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-20_15:08:15-upgrade:firefly-firefly-testing-basic-... Yuri Weinstein
05:53 PM devops Bug #9460: mira004, mira036. mira017 unresponsive
mira004 is bad again - 2014-09-20T17:31:32.251 INFO:teuthology.provision:Downburst completed on ubuntu@vpm024.front.s... Yuri Weinstein
03:29 PM Linux kernel client Bug #9432: kcephfs: null pointer deref in posix_acl_create
Zheng Yan
03:04 PM Bug #9551 (Duplicate): "Segmentation fault" in upgrade:firefly-firefly-testing-basic-vps run
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-20_13:44:11-upgrade:firefly-firefly-testing-basic-... Yuri Weinstein
06:32 AM devops Bug #9548 (Rejected): ceph mon creation failed for centOS
Trying to deploy Ceph on CentOS, but every time I execute the command below I get a failed response.
[ceph@ceph-...
Subhadip Bagui
04:15 AM Bug #9547 (Fix Under Review): python rados aio_read truncates returned buffer on \000
Loïc Dachary
04:15 AM Bug #9547: python rados aio_read truncates returned buffer on \000
running wip-9547-python-rados-truncate from https://github.com/ceph/ceph/pull/2545 on http://ceph.com/gitbuilder.cgi Loïc Dachary
03:44 AM Bug #9547: python rados aio_read truncates returned buffer on \000
"need firefly backport":https://github.com/ceph/ceph/blob/firefly/src/pybind/rados.py#L1093 Loïc Dachary
03:40 AM Bug #9547: python rados aio_read truncates returned buffer on \000
Proposed fix https://github.com/ceph/ceph/pull/2544 Loïc Dachary
03:36 AM Bug #9547 (Resolved): python rados aio_read truncates returned buffer on \000
... Loïc Dachary
02:16 AM Bug #9535 (Duplicate): monitor crashed after restarting
Joao Eduardo Luis
02:14 AM Bug #9455 (Fix Under Review): mon: audit log read events should be debug level
https://github.com/ceph/ceph/pull/2538 Joao Eduardo Luis
02:14 AM Bug #9502 (Fix Under Review): mon: does not verify disk is not full on startup
https://github.com/ceph/ceph/pull/2538 Joao Eduardo Luis
12:37 AM Feature #9343: erasure-code: allow upgrades for lrc and isa plugins
What "was supposed to be the baseline":http://pulpito.ceph.com/sage-2014-09-18_17:42:51-rados:monthrash-wip-9301-dist... Loïc Dachary

09/19/2014

09:02 PM CephFS Bug #9178 (Resolved): samba: ENOTEMPTY on "rm -rf"
Zheng Yan
08:58 PM Bug #9546 (Rejected): LibRadosWatchNotify.WatchNotifyTest failure
... Sage Weil
05:32 PM Bug #9419: dumpling->firefly upgrade, sending setallochint?
David Zafman
05:18 PM Bug #9545: filestore stuck in journal->should_commit_now() loop on shutdown
... Sage Weil
05:18 PM Bug #9545: filestore stuck in journal->should_commit_now() loop on shutdown
sync_entry is looping on the same seq while the main thread waits for umount. journal should_commit_now() is stuck r... Sage Weil
05:18 PM Bug #9545 (Resolved): filestore stuck in journal->should_commit_now() loop on shutdown
Sage Weil
04:45 PM Bug #9390: EEXIST on split due to import/export
Not precisely sure how to approach this. We can make the OSD robust to this situation or we can adjust the test to a... Samuel Just
04:44 PM Bug #9390: EEXIST on split due to import/export
Tricky. I think that we saw the following sequence:
stop osd N
export pg X at epoch e
split pg X at epoch e+3
...
Samuel Just
04:43 PM Bug #8011 (Can't reproduce): osd/ReplicatedPG.cc: 5244: FAILED assert(soid < scrubber.start || so...
Pinged Dmitry to see if he is still seeing this or has a log Sage Weil
04:35 PM Bug #9384 (Resolved): OSD is crashing while io is running and querying with admin socket
Sage Weil
04:13 PM Bug #9502: mon: does not verify disk is not full on startup
Sage Weil
04:03 PM Bug #9544: osd: pg deletion vs create race leads to EEXIST on mkcoll (dumpling)
wip-sharedptr-registry-backport Samuel Just
03:39 PM Bug #9544 (Resolved): osd: pg deletion vs create race leads to EEXIST on mkcoll (dumpling)
... Sage Weil
03:42 PM Bug #7120 (Duplicate): osd: EEXIST on mkcoll on dumpling
Samuel Just
03:34 PM Bug #7120: osd: EEXIST on mkcoll on dumpling
/a/sage-2014-09-18_22:33:58-rados-dumpling-distro-basic-multi/496304/remote Samuel Just
03:34 PM CephFS Bug #9539 (Resolved): struct PurgeRange in Filer.cc needs lock to protect
Zheng Yan
06:32 AM CephFS Bug #9539 (Resolved): struct PurgeRange in Filer.cc needs lock to protect
send two requests to delete 1000026dfe3.00000067, but no request to 1000026dfe3.00000068... Zheng Yan
02:50 PM rgw Bug #9543 (Rejected): AssertionError(s) in upgrade:dumpling-dumpling-distro-basic-vps run
All in http://pulpito.front.sepia.ceph.com/teuthology-2014-09-19_11:48:54-upgrade:dumpling-dumpling-distro-basic-vps/... Yuri Weinstein
02:47 PM CephFS Bug #8576: teuthology: nfs tests failing on umount
Been playing around with this some. Greg Farnum
02:47 PM CephFS Bug #9177 (Resolved): ceph-fuse: failing MPI mdtest runs
John fixed this by updating mdtest in ceph-qa-suite as of commit:b1365a80982dba4160e861c28d887b066ca451b6. Greg Farnum
02:27 PM Bug #9301 (Pending Backport): paxos: off by one w/ versions in forming quorum
Sage Weil
12:42 PM Bug #9537: OSD crash after writing 10GB file onto EC Pool: FAILED assert(hinfo.get_total_chunk_si...
Please note that while this ought to work in the technical sense, you are unlikely to be happy with RADOS if you make... Greg Farnum
11:51 AM Bug #9537: OSD crash after writing 10GB file onto EC Pool: FAILED assert(hinfo.get_total_chunk_si...
OSD log of the primary OSD which crashed Mallikarjun Biradar
07:17 AM Bug #9537 (Fix Under Review): OSD crash after writing 10GB file onto EC Pool: FAILED assert(hinfo...
https://github.com/ceph/ceph/pull/2534 to be confirmed by the OSD logs Loïc Dachary
07:05 AM Bug #9537 (Need More Info): OSD crash after writing 10GB file onto EC Pool: FAILED assert(hinfo.g...
Could you please attach the last 20,000 (twenty thousand) lines of the logs of the crashed primary OSD ? Loïc Dachary
03:52 AM Bug #9537: OSD crash after writing 10GB file onto EC Pool: FAILED assert(hinfo.get_total_chunk_si...
Config:
OSD nodes: 3
Monitor nodes: 2
Number of OSD's: 24
Number of OSDs: 24
Mallikarjun Biradar
03:33 AM Bug #9537 (Resolved): OSD crash after writing 10GB file onto EC Pool: FAILED assert(hinfo.get_tot...

On freshly created cluster, created an EC pool with default ec profile.
Wrote 5MB of object file using rados put...
Mallikarjun Biradar
12:16 PM CephFS Feature #9284 (Resolved): mds: warn when clients are not responding to cache pressure
Merged in giant... John Spray
12:03 PM rbd Feature #6228: image name metavariable
Glad to see this feature added. Thank you, Mark, Adam, Sage, and Loic!
Assuming it wouldn't be too difficult, coul...
Mike Dawson
09:47 AM rbd Feature #6228 (Pending Backport): image name metavariable
Sage Weil
02:01 AM rbd Feature #6228 (Fix Under Review): image name metavariable
Loïc Dachary
11:04 AM Bug #9482: osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)
I'd need the corresponding logs from osd.5 to be sure, but I believe the problem is that osd.5, due to 9497 and this ... Samuel Just
10:51 AM Bug #9482: osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)
wip-sam-testing-firefly Samuel Just
10:51 AM Bug #9497: choose_acting has to let the pg be down any time acting < min_size even if there are b...
wip-sam-testing-firefly Samuel Just
10:50 AM Bug #9481: osd/PGLog.h: 87: FAILED assert(rollback_info_trimmed_to == head)
wip-sam-testing-firefly Samuel Just
10:50 AM Bug #9326 (Pending Backport): osd crash in upgrade:dumpling-firefly-x-master-distro-basic-vps suite
Samuel Just
10:50 AM Bug #9240: osd_max_backfills = 1 can cause reserver deadlock for EC
wip-sam-testing-firefly Samuel Just
10:50 AM Bug #9293: _collection_move_rename EEXIST
wip-sam-testing-firefly Samuel Just
10:49 AM Bug #9179: unfound objects, recovery timeout
wip-sam-testing-firefly Samuel Just
10:49 AM Bug #8777: osd/PGLog.h: 88: FAILED assert(rollback_info_trimmed_to_riter == log.rbegin())
wip-sam-testing-firefly Samuel Just
10:49 AM Bug #9054: ceph_test_rados: FAILED assert(!old_value.deleted())
wip-sam-testing-firefly Samuel Just
10:49 AM Bug #9339: ReplicatedPG crash in hitset_create
wip-sam-testing-firefly Samuel Just
09:23 AM Documentation #9542 (Won't Fix): Error link:"Ceph Object Gateway"->"Manual Install"
In the page "Install Ceph Object Gateway"(http://ceph.com/docs/master/install/install-ceph-gateway/index.html), the "... Aaron Chen
09:05 AM Documentation #8995 (Resolved): Preflight Checklist Clarifications
http://ceph.com/docs/master/start/ Preflight was revamped significantly to address the comments and anticipate others. John Wilkins
09:05 AM CephFS Bug #9540 (Rejected): Crash during FS upgrade: assert(o->get_num_ref() == 0)
Never mind, seems like this was just another manifestation of the original segment reference bug -- giant HEAD is OK. John Spray
06:37 AM CephFS Bug #9540: Crash during FS upgrade: assert(o->get_num_ref() == 0)
The crash hits at the last ceph.restart (after upgrade from firefly to 83bd3430e3a17b77265e696095904b7a9032d2ee).
...
John Spray
06:33 AM CephFS Bug #9540 (Rejected): Crash during FS upgrade: assert(o->get_num_ref() == 0)
... John Spray
08:08 AM Bug #8863: osd: second reservation rejection -> crash
does your build include commit:2b13de16c522754e30a0a55fb9d072082dac455e ? Sage Weil
07:44 AM RADOS Bug #9492 (Fix Under Review): Crush Mapper crashes when number of replicas is less than total num...
https://github.com/ceph/ceph/pull/2528 Loïc Dachary
07:35 AM Bug #9470 (Pending Backport): daemon pid file is not being created when running service ceph
firefly backport : https://github.com/ceph/ceph/pull/2535 Loïc Dachary
06:58 AM Linux kernel client Bug #9533 (Duplicate): kcephfs: fail to send requests initiated during mds restart
this was an old bug, patch was missing from running kernel.
ceph: fix kick_requests()
Sage Weil
06:56 AM Bug #9362: librados, rados_read corrupts memory on timeout
I did another test today using the build from http://gitbuilder.ceph.com/ceph-deb-wheezy-x86_64-basic/ref/wip-dumplin... Matthias Kiefer
06:36 AM Bug #9538: mon crashes on some --format=plain commands
Checked all other uses of new_formatter allocated pointer in OSDMonitor Loïc Dachary
06:31 AM Bug #9538 (Fix Under Review): mon crashes on some --format=plain commands
https://github.com/ceph/ceph/pull/2533 Loïc Dachary
06:11 AM Bug #9538: mon crashes on some --format=plain commands
Loïc Dachary
05:28 AM Bug #9538 (Resolved): mon crashes on some --format=plain commands

Mentioned by bens on IRC, creating ticket in case we forget:...
John Spray
06:33 AM rgw Feature #8911: RGW doesn't return 'x-timestamp' in header which is used by 'View Details' of Open...
Hello Luis, et al.
I have a customer who's requesting status for this Feature. They view it as a bug since it c...
Michael Kidd
05:17 AM Bug #9536 (Fix Under Review): erasure-code: ISA plugin alignment must be constant
* giant backport https://github.com/ceph/ceph/pull/2531 Loïc Dachary
05:07 AM Bug #9536 (In Progress): erasure-code: ISA plugin alignment must be constant
Loïc Dachary
02:57 AM Bug #9536 (Resolved): erasure-code: ISA plugin alignment must be constant

commit:28c2b6e4f2bc6d77b9150fcf9a917d85c69c9ed1
"EC_ISA_VECTOR_OP_WORDSIZE":https://github.com/ceph/ceph/blob/ma...
Loïc Dachary
04:52 AM Feature #9343: erasure-code: allow upgrades for lrc and isa plugins
scheduled "a monthrash":http://pulpito.ceph.com/ubuntu-2014-09-19_04:50:17-rados:monthrash-wip-9343-erasure-code-feat... Loïc Dachary
04:43 AM Fix #8914 (Resolved): osd crashed at assert ReplicatedBackend::build_push_op
Loïc Dachary
02:59 AM Bug #9485: Monitor crash due to wrong crush rule set
Hi loic:
log, "ceph osd tree" output and crush map added.
log:
0> 2014-09-19 09:43:08.462737 7f92d9674700 -...
Dong Lei
01:58 AM Bug #9485: Monitor crash due to wrong crush rule set
Hi,
It should not crash, it should give you an error of some kind maybe. Could you please attach to this ticket a ...
Loïc Dachary
02:11 AM Bug #9408: erasure-code: misalignment
Running under the branch wip-9408-buffer-alignment in http://ceph.com/gitbuilder.cgi Loïc Dachary
01:55 AM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
It reproduced. Please find the logs attached.
here is the snippet:
2014-09-19 10:27:02.228364 7f86d73a2700 0 log_channel(de...
Sahana Lokeshappa
01:41 AM Bug #9535 (Duplicate): monitor crashed after restarting
Recently, when I restarted my Ceph cluster, the monitor crashed. Below is the output of the monitor log:
2014-09-19 ...
Xinxin Shu
01:39 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
Hi Sage, Thanks for the quick patch. I tried wip-9487-dumpling on our test cluster and now there is no snap trimming ... Dan van der Ster

09/18/2014

09:43 PM Linux kernel client Bug #9533 (Duplicate): kcephfs: fail to send requests initiated during mds restart
mds sees... Sage Weil
09:37 PM Bug #9202: Performance degradation during recovering and backfilling
New ticket here - http://tracker.ceph.com/issues/9523 Guang Yang
01:53 AM Bug #9202: Performance degradation during recovering and backfilling
Hi Samuel,
Thanks for the short-term fix of tuning those 2 backfill scan parameters. With tuning other backfill/...
Zhi Zhang
09:16 PM Bug #9481: osd/PGLog.h: 87: FAILED assert(rollback_info_trimmed_to == head)
In a Ceph cluster with 8 OSD nodes, each having 64 OSDs, a few OSDs were crashing with this assert. As one node had timestamp... Sahana Lokeshappa
11:01 AM Bug #9481 (Pending Backport): osd/PGLog.h: 87: FAILED assert(rollback_info_trimmed_to == head)
Sage Weil
09:44 AM Bug #9481 (Fix Under Review): osd/PGLog.h: 87: FAILED assert(rollback_info_trimmed_to == head)
Samuel Just
08:42 PM Bug #9482: osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)
Captured debug log with wip-log-crash-firefly branch and attached. Aaron T
01:18 PM Bug #9482: osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)
sjust believes I may have hit the same bug, running 0.80.5. Attached is the log from an OSD with settings:... Aaron T
12:55 PM Bug #9482 (Pending Backport): osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head...
Sage Weil
09:44 AM Bug #9482 (Fix Under Review): osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head...
Can't set info.last_epoch_started there, going to just use history.last_epoch_started as lower bound. Samuel Just
07:16 PM Bug #9485: Monitor crash due to wrong crush rule set
Hi, loic.
Currently I'm running some tests on my dev environment; after the tests are finished, I will reproduce i...
Dong Lei
06:42 PM CephFS Feature #9189 (Resolved): Expose client identifying metadata to MDS, e.g. hostname
Zheng Yan
06:33 PM Feature #9532 (Duplicate): rados.py should export omap interface
It would be nice to be able to manipulate omap values with Python Dan Mick
06:03 PM Feature #8188 (In Progress): librados: interface to inspect pool properties
Adam Crume
05:09 PM rgw Bug #9529 (Resolved): ./common/ceph_crypto.h: 83: FAILED assert(s == SECSuccess)
... Sage Weil
05:01 PM Bug #9487 (Fix Under Review): dumpling: snaptrimmer causes slow requests while backfilling. osd_s...
wip-9487
wip-9487-dumpling for backport
Sage Weil
03:23 PM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
Never mind, I've reproduced it! Sage Weil
03:21 PM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
Thanks Sage. There's a log with debug_osd=20 attached to this issue. I'll try tomorrow to get one with debug_ms=1 too. Dan van der Ster
03:08 PM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
Okay, I can't seem to reproduce this.
Dan or Florian, can you attach a log? What I need is debug ms = 1 and debug...
Sage Weil
02:45 PM Bug #9487 (In Progress): dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_t...
Dan van der Ster wrote:
> I also noticed that before the snap trimmer starts, purge_snaps is [] for 5.318. Is that n...
Sage Weil
02:52 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
Please comment on https://github.com/ceph/ceph/pull/2516.
Thanks!
Dan van der Ster
04:17 PM Bug #9528: RadosModel assertion failure in firefly
also this one,
log: http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-11_23:20:03-multi-version-master-testi...
Tamilarasi muthamizhan
04:14 PM Bug #9528 (Duplicate): RadosModel assertion failure in firefly
This is basically firefly client running against the dumpling cluster.
logs: http://qa-proxy.ceph.com/teuthology/t...
Tamilarasi muthamizhan
03:43 PM Bug #9517 (Resolved): Errors in test_rbd.* tests in upgrade:dumpling-giant-x:parallel-giant-distr...
This was due to ceph-qa-suite updates not being on the giant branch. Josh Durgin
03:29 PM rbd Feature #6228: image name metavariable
Yeah, that is probably a good idea anyway.. we've had uniqueness issues like this before! That is an easy thing and ... Sage Weil
03:21 PM rbd Feature #6228: image name metavariable
It's not perfect, but we could add a cctid variable so users could specify something like "admin socket = /var/run/ce... Adam Crume
03:11 PM rbd Feature #6228: image name metavariable
This assumes that each image has its own cct, but a process could have multiple images open in one cct. (In fact, co... Adam Crume
02:07 PM rbd Feature #6228 (In Progress): image name metavariable
Adam Crume
03:01 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
Sage, a log is at https://www.dropbox.com/s/f2xyx12y2zr7fid/ceph-osd.14.log.xz -- behold the awesomeness of xz; that ... Florian Haas
02:49 PM Bug #9503 (Duplicate): Dumpling: removing many snapshots in a short time makes OSDs go berserk
Florian Haas wrote:
> Sage, I do have logs (@debug osd=20@, though not @debug ms=1@), but after the discussion with ...
Sage Weil
02:46 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
Sage, I do have logs (@debug osd=20@, though not @debug ms=1@), but after the discussion with Dan on the -devel list,... Florian Haas
02:35 PM Bug #9503 (Need More Info): Dumpling: removing many snapshots in a short time makes OSDs go berserk
Hi Florian-
Can you generate some OSD logs (debug ms = 1, debug osd = 20) and attach them to the bug? The message...
Sage Weil
02:23 PM Bug #9301: paxos: off by one w/ versions in forming quorum
Sage Weil
02:07 PM rbd Bug #5768: rbd-fuse: leak in enumerate_images()
Adam Crume
01:49 PM rbd Bug #5768 (In Progress): rbd-fuse: leak in enumerate_images()
Adam Crume
02:05 PM Bug #9462: msgr deadlock: osd reply vs mark_down vs fault
Sage Weil
12:55 PM Bug #9453 (Resolved): ceph_objectstore_tool incorrect log tail output for --op log
Sage Weil
09:45 AM Bug #9453 (Fix Under Review): ceph_objectstore_tool incorrect log tail output for --op log
Samuel Just
11:51 AM rbd Feature #7746 (In Progress): Capacity Management: rbd df
see wip-7746 Josh Durgin
11:46 AM Feature #9526 (Resolved): mon: 'osd crush rename-bucket <old> <new>'
Sage Weil
11:42 AM Bug #9497 (Pending Backport): choose_acting has to let the pg be down any time acting < min_size ...
Sage Weil
09:43 AM Bug #9497 (Fix Under Review): choose_acting has to let the pg be down any time acting < min_size ...
Samuel Just
11:26 AM rgw Bug #9525 (Duplicate): Deleted object shows in object listing

What appears to happen is that a request to delete an object comes in while the cluster is in a terrible state perf...
Tupper Cole
11:16 AM rgw Bug #9169: 100-continue broken for centos/rhel
Similar issue in suite:upgrade:firefly
http://pulpito.front.sepia.ceph.com/teuthology-2014-09-17_19:00:01-upgrade:...
Yuri Weinstein
11:04 AM rgw Bug #9479: ETag is not included in the XML response to put object copy operation
This is under v0.67.10 JuanJose Galvez
11:04 AM rgw Bug #9478: Incorrect content type in response header
This is under v0.67.10 JuanJose Galvez
11:00 AM Bug #8315 (Pending Backport): osd: watch callback vs callback funky
Sage Weil
09:40 AM Bug #8315 (Fix Under Review): osd: watch callback vs callback funky
Samuel Just
09:40 AM rbd Bug #6926 (In Progress): rbd: diff output includes previously non-existent objects as zeroed extents
Adam Crume
09:40 AM Bug #9326 (Fix Under Review): osd crash in upgrade:dumpling-firefly-x-master-distro-basic-vps suite
Samuel Just
09:37 AM Feature #7767 (Resolved): messenger:buffer reads
Samuel Just
07:58 AM CephFS Feature #9437 (In Progress): make 'ceph tell mds.* ...' work, deprecate 'ceph mds tell * ...'
John Spray
06:16 AM CephFS Feature #9477: Handle kclient shutdown with dead network more gracefully
In the general case (e.g. root filesystem is cephfs) there's nothing we can do: the system can't shut down until the ... John Spray
05:56 AM CephFS Bug #9518 (Resolved): client metadata get lost after mds restart
... John Spray
02:30 AM CephFS Bug #9518 (Fix Under Review): client metadata get lost after mds restart
Well, I also shouldn't have missed it while writing the code :-)
https://github.com/ceph/ceph/pull/2515
John Spray
04:01 AM RADOS Bug #9523 (Closed): Both op threads and dispatcher threads could be stuck at acquiring the budget...
When an OSD is rejoining and peering, we still see some slow requests and performance degradation in about 5 to 10 min... Zhi Zhang
02:08 AM Bug #8863: osd: second reservation rejection -> crash
Two OSDs were down and out due to that crash, and I was not able to start them again. So I removed those OSDs and add... Sahana Lokeshappa
01:13 AM Linux kernel client Bug #9507: calling llistxattr(2) on a symlink crashes the client
... Zheng Yan

09/17/2014

11:59 PM CephFS Bug #9504 (Duplicate): failed to decode message of type 24 v2: buffer::end_of_buffer
looks like this is duplicate of #9458 Zheng Yan
08:23 AM CephFS Bug #9504 (Duplicate): failed to decode message of type 24 v2: buffer::end_of_buffer
root@burnupi21:~# less /var/log/upstart/ceph-mds-ceph_burnupi21.log
...
Sage Weil
11:57 PM Linux kernel client Bug #9458: client wrongly fenced
Is the client using a 3.16 kernel? Possibly due to missing the following commit... Zheng Yan
02:45 PM Linux kernel client Bug #9458: client wrongly fenced
The kernel client is definitely doing something wrong here, but I don't know what — the userspace messenger is not in... Greg Farnum
02:38 PM Linux kernel client Bug #9458: client wrongly fenced
The MDS went into reconnect at 4:59:50... Greg Farnum
11:09 AM Linux kernel client Bug #9458: client wrongly fenced
Taking a look; luckily we have at least *some* of the logging... Greg Farnum
08:17 AM Linux kernel client Bug #9458: client wrongly fenced
mds restarted and teuthology failed to reconnect again, 07:30:34.485721 Sage Weil
07:18 AM Linux kernel client Bug #9458: client wrongly fenced
teuthology was fenced again. not sure it was during a mds restart this time, either. notably the monitors went offl... Sage Weil
10:52 PM Bug #8863: osd: second reservation rejection -> crash
I also got the above crash when a few OSDs were in a nearfull state.
Snippet of logs:
2014-09-17 17:29:41.69...
Sahana Lokeshappa
08:46 PM CephFS Bug #9518: client metadata get lost after mds restart
Dur, shouldn't have missed that in review. :( Greg Farnum
07:44 PM CephFS Bug #9518 (Resolved): client metadata get lost after mds restart
Zheng Yan
04:25 PM Bug #9517 (Resolved): Errors in test_rbd.* tests in upgrade:dumpling-giant-x:parallel-giant-distr...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-17_13:53:14-upgrade:dumpling-giant-x:parallel-gian... Yuri Weinstein
03:31 PM Bug #9452 (Resolved): All tests failed in upgrade:dumpling-giant-x:parallel-master-distro-basic-m...
Looks like we passed those issues
#9515 might be related
Yuri Weinstein
03:29 PM Bug #9515 (Duplicate): "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-17_13:53:14-upgrade:dumpling-giant-x:parallel-gian... Yuri Weinstein
03:05 PM CephFS Bug #9514 (Resolved): ceph-fuse pjd test is failing in giant nightlies
commit:0ea20a668cf859881c49b33d1b6db4e636eda18a
http://qa-proxy.ceph.com/teuthology/sage-2014-09-14_18:23:49-smoke...
Greg Farnum
02:43 PM rbd Bug #9513 (Resolved): rbd_cache=true default setting is degrading librbd performance ~10X in Giant
We are experiencing severe librbd performance degradation in Giant over firefly release. Here is the experiment we di... Somnath Roy
02:33 PM Bug #8885: SIGABRT in TrackedOp::dump() via dump_ops_in_flight()
It's the same issue as #9384
Here is the pull request for the same.
https://github.com/ceph/ceph/pull/2440
Somnath Roy
01:17 PM Bug #9508 (Resolved): objecter: segv on timeout/cancel (LibRadosIo ReadTimeout)
commit:cef34f429972267061fc0e730ef976887ccb78a9 Sage Weil
10:22 AM Bug #9508 (Fix Under Review): objecter: segv on timeout/cancel (LibRadosIo ReadTimeout)
https://github.com/ceph/ceph/pull/2498 Sage Weil
09:59 AM Bug #9508 (Resolved): objecter: segv on timeout/cancel (LibRadosIo ReadTimeout)
... Sage Weil
11:05 AM Bug #9509: init script cannot stop OSDs
Yep, it needs to be backported to Firefly Alexandre Marangone
11:01 AM Bug #9509 (Pending Backport): init script cannot stop OSDs
See #9470. I guess the commit probably needs to be backported to firefly? Greg Farnum
10:57 AM Bug #9509: init script cannot stop OSDs
Let me redo the last sentence...
One user reported the issue on CentOS 7 and I managed to reproduce it. I assume i...
Alexandre Marangone
10:48 AM Bug #9509 (Resolved): init script cannot stop OSDs
Running a @service ceph stop osd@ will not stop OSDs.
It seems the problem is that the OSDs are launched with the ...
Alexandre Marangone
11:04 AM devops Bug #9510 (Closed): ceph-deploy: Move mon keyring generation 'mon create-initial'
Right now the monitor keyring is generated with 'ceph-deploy new', in cases where an admin wants to use a pre-existin... Kyle Bader
10:23 AM Bug #9501: Assertion in FileJournal::do_write
Don't worry, Sam says this is some kernel bug in btrfs, but he hasn't told the rest of us about it yet. Greg Farnum
04:03 AM Bug #9501: Assertion in FileJournal::do_write
Urgh, I have stupidly just killed that job before making a copy of the logs. John Spray
03:56 AM Bug #9501 (Rejected): Assertion in FileJournal::do_write
... John Spray
09:48 AM Linux kernel client Bug #9507 (Resolved): calling llistxattr(2) on a symlink crashes the client
The code hits a "BUG();" line at https://github.com/ceph/ceph-client/blob/7e8a295295775ec9e05411cefc578ff4bfc94740/fs... Kevin Lamontagne
09:33 AM devops Bug #9506 (Rejected): Pass monitor SSH addresses via CLI flag
In some network configurations it is desirable to have ceph-deploy access monitors from one network, and use another ... Kyle Bader
08:51 AM Linux kernel client Bug #9505 (Duplicate): kcephfs: client gets stuck in reconnect loop?
... Sage Weil
08:37 AM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
Added issue #9487 as *possibly* related. Florian Haas
07:52 AM Bug #9503 (Resolved): Dumpling: removing many snapshots in a short time makes OSDs go berserk
Back in March, there was a report from Craig Lewis on the users list that mentioned several OSDs going to 100% CPU fo... Florian Haas
07:10 AM Bug #9502 (Resolved): mon: does not verify disk is not full on startup
mira040... Sage Weil
06:09 AM rgw Feature #9359: rgw: Export user stats in get-user-info Adminops API
Updated PR with a new commit to resolve Yehuda's comments. Please help to review it. Xiangyu Lv
06:08 AM Bug #9490 (Rejected): crushtool crash if --num-rep is missing
The root of the problem is #9492 : when --num-rep is missing it defaults to the range defined in the rule and does th... Loïc Dachary
05:58 AM Bug #9490 (In Progress): crushtool crash if --num-rep is missing
Loïc Dachary
06:03 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
I also noticed that before the snap trimmer starts, purge_snaps is [] for 5.318. Is that normal, or should (the compl... Dan van der Ster
05:48 AM CephFS Feature #9189: Expose client identifying metadata to MDS, e.g. hostname
Userspace part merged:... John Spray
05:47 AM CephFS Feature #9375 (Resolved): Send single 'many clients' health warning instead of N warnings for N c...
... John Spray
03:39 AM rgw Bug #9500 (Duplicate): 0.80.5 on CentOS 6.5: radosgw-admin fails to correctly name subuser object
System info: Firefly (0.80.5 on CentOS 6.5). radosgw is configured and working fine with s3cmd.
Symptom: despite t...
Florian Haas
01:23 AM Bug #8083 (In Progress): erasure-code: fix static code analysis errors found in gf-complete
A number of fixes already are in gf-complete master and "added two":https://bitbucket.org/jimplank/gf-complete/pull-r... Loïc Dachary

09/16/2014

11:27 PM Bug #9488 (Rejected): Writing object onto EC pool created with customized ec profile getting hung
k=1 m=1 is not supposed to work; it won't do anything useful. k=5 m=3 needs 8 OSDs in total and you only have 6, hence it blocks. Loïc Dachary
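The arithmetic in the comment above can be sketched as follows. This is only an illustration, assuming each of the k data chunks and m coding chunks must land on a distinct OSD; the function names are hypothetical and are not part of any Ceph API.

```python
def chunks_required(k: int, m: int) -> int:
    """Total chunks an erasure-code profile produces: k data + m coding."""
    return k + m


def can_place(k: int, m: int, available_osds: int) -> bool:
    """True if there are enough distinct OSDs to hold every chunk.

    Assumption: one chunk per OSD (failure domain = OSD).
    """
    return available_osds >= chunks_required(k, m)


if __name__ == "__main__":
    # Profile from the ticket: k=5, m=3 on a cluster with 6 OSDs.
    print(chunks_required(5, 3))
    print(can_place(5, 3, 6))
```

With a stricter failure domain (for example one chunk per host), the same check would apply to the number of hosts rather than the number of OSDs.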
08:32 PM Bug #9488: Writing object onto EC pool created with customized ec profile getting hung
Hi Loic,
I have 3 OSD hosts and a total of 6 OSDs.
ems@rack6-client-5:~$ sudo ceph osd crush rule dump
[
{...
Mallikarjun Biradar
08:36 AM Bug #9488 (Need More Info): Writing object onto EC pool created with customized ec profile gettin...
It is the normal behavior when there are not enough hosts to satisfy the crush rules. Do you have 22 hosts available ... Loïc Dachary
05:14 AM Bug #9488: Writing object onto EC pool created with customized ec profile getting hung
Attaching logs Mallikarjun Biradar
05:09 AM Bug #9488: Writing object onto EC pool created with customized ec profile getting hung
This issue is observed on ceph 0.84 Mallikarjun Biradar
05:07 AM Bug #9488 (Rejected): Writing object onto EC pool created with customized ec profile getting hung
Writing an object to an EC pool created with a customized EC profile hangs.
But, writing object onto EC pool wit...
Mallikarjun Biradar
11:15 PM Bug #9219 (Resolved): lost_unfound test got ENOENT: i don't have pgid 1.e
Greg Farnum
05:38 PM Bug #9219: lost_unfound test got ENOENT: i don't have pgid 1.e
Merged into giant by commit:782848af596fdb0be57daa68481b3976b7119141. Greg Farnum
10:14 PM devops Bug #9499 (Can't reproduce): osds do not start after reboot (centos7, dm-crypt)
most osds do not come up after reboot; only one does.
adding a 'sleep 10 ; ceph-disk activate-all' to /etc/rc.loca...
Sage Weil
09:57 PM devops Bug #9498 (Resolved): el7 still using crappy el6 udev rules
Sage Weil
08:33 PM Bug #9497: choose_acting has to let the pg be down any time acting < min_size even if there are b...
Samuel Just
08:33 PM Bug #9497 (Resolved): choose_acting has to let the pg be down any time acting < min_size even if ...
Otherwise, build_prior won't realize that the interval was maybe_went_rw. Samuel Just
07:05 PM RADOS Bug #9492: Crush Mapper crashes when number of replicas is less than total number of osds to be s...
The issue is that the CRUSH temporary buffers (scratch array) are allocated according to the num_replica size configured by the ... Johnu George
05:28 PM RADOS Bug #9492: Crush Mapper crashes when number of replicas is less than total number of osds to be s...
Seg fault log:
CRUSH*** Caught signal (Segmentation fault) **
in thread 7f3dcb0007c0
ceph version 0.85-778-gb285...
Johnu George
12:37 PM RADOS Bug #9492 (Resolved): Crush Mapper crashes when number of replicas is less than total number of o...
1. ./crushtool --outfn crushmap --build --num_osds 100 host straw 4 rack straw 10 default straw 0
2. ./crushtool -d c...
Johnu George
06:36 PM Bug #9496 (Resolved): mon: pg scrub timestamps must be populated at pg creation
logs: ubuntu@teuthology:/a/teuthology-2014-09-15_16:05:01-upgrade:firefly-giant-x:parallel-giant-distro-basic-multi/4... Tamilarasi muthamizhan
05:37 PM CephFS Fix #9435 (Resolved): prevent use of cache pools as metadata or data pools
Merged into giant branch in commit:eb1b2e0072bf605095f4104c2b6c2abfba216dbe Greg Farnum
02:57 AM CephFS Fix #9435 (Fix Under Review): prevent use of cache pools as metadata or data pools
https://github.com/ceph/ceph/pull/2507 John Spray
03:46 PM Bug #9480: OSD is crashing while object deletion
Created the following pull request for the fix.
https://github.com/ceph/ceph/pull/2510
Somnath Roy
02:50 PM rgw Feature #9493 (Resolved): Ability to disable keystone revocation polling when using UUID keystone...
When using a UUID keystone provider revocation is handled by deleting the token from the persistence backend (ie. no ... Kyle Bader
02:16 PM CephFS Feature #9466: kclient: Extend CephFSTestCase tests to cover kclient
Got these passing at least once by hand using IPMI to work around #9477, suite scheduled:
http://pulpito.front.sep...
John Spray
11:49 AM Documentation #8995: Preflight Checklist Clarifications
Addressed clarifications and also added preflight material for other distributions. John Wilkins
11:47 AM Documentation #9475 (Resolved): Broken links on downloads page
Resolved by Ross Turk. John Wilkins
11:36 AM Documentation #9491 (Closed): Radosgw docs incorrectly state to disable print continue on centos ...
https://ceph.com/docs/master/radosgw/config/ states:
"On CentOS/RHEL distributions, turn off print continue. If yo...
Ben Hines
11:31 AM Bug #9490 (Fix Under Review): crushtool crash if --num-rep is missing
https://github.com/ceph/ceph/pull/2508 Loïc Dachary
11:15 AM Bug #9490: crushtool crash if --num-rep is missing
crash occurs when num-rep takes the value 1 Johnu George
11:00 AM Bug #9490 (Rejected): crushtool crash if --num-rep is missing
... Loïc Dachary
10:13 AM devops Bug #9489: --zap-disk does not clear enough
it's worth noting that the OSD worked fine in the cluster after initial deployment, it's not until the node is reboot... Loïc Dachary
10:05 AM devops Bug #9489 (Rejected): --zap-disk does not clear enough
Sometimes the partitions are resurrected Loïc Dachary
10:07 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
Andrei,
No, I haven't, but plan to try harder. I am however seeing an extreme slowdown, will open a ticket to tak...
Ilya Dryomov
02:49 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
Ilya,
I was wondering if you've managed to verify my findings? Has anyone else experienced similar behaviour?
...
Andrei Mikhailovsky
10:00 AM devops Bug #5929 (Resolved): debian: python-ceph should depend on libcephfs1
This was committed quite a long time ago during a bug scrub, I believe. Sandon Van Ness
09:12 AM RADOS Fix #6109: pg <pgid> mark_unfound_lost fails if a completely-gone OSD still in map
I'm having a similar issue, I have one unfound object that I can't delete. I'm also getting the "Error EINVAL: pg has... Sébastien Han
08:46 AM Bug #9438: librados API generated doc broken
I'd be happy to review: which pull request / branch is it? Loïc Dachary
08:38 AM Bug #9485 (Need More Info): Monitor crash due to wrong crush rule set
Could you add the stack trace of the mon crash to the ticket? I remember the discussion we had on the mailing list a... Loïc Dachary
03:29 AM Bug #9434: rbd rm hangs
Loic Dachary wrote:
> Version 0.71 was a development version. Are you observing the same version on a stable release...
Yi Li
02:37 AM Bug #9304: pool create with invalid crush rule name succeeds
"rebased against giant":https://github.com/ceph/ceph/pull/2506 Loïc Dachary
02:33 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
Here is a bit more... I checked for "snap_trimmer entry" on other OSDs this morning. There were a few others, but all... Dan van der Ster
01:59 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
> I was able to isolate the cause of the backfilling to one single OSD
typo.. I was able to isolate the cause of ...
Dan van der Ster
01:47 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
In case it wasn't clear, there is nothing special about osd.11. Each time I reweight 2 OSDs the slow requests are cau... Dan van der Ster
01:44 AM Bug #9487 (Resolved): dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim...
Hi,
using dumpling 0.67.10...
We are doing quite a bit of backfilling these days in order to make room for some n...
Dan van der Ster

09/15/2014

08:37 PM Bug #9485 (Resolved): Monitor crash due to wrong crush rule set
I created a customized CRUSH rule for an EC pool:
1 set take default
2 choose firstn 6 type rack
3 chooseleaf firstn ...
Dong Lei
08:04 PM Feature #9222: annotate config options
If we want to think in an internationalization direction, perhaps the right thing is to msg-catalog the help informat... Dan Mick
03:05 PM Feature #9222: annotate config options
Yeah. I would also love to see min/max values for the numeric options. Sage Weil
02:56 PM Feature #9222: annotate config options
If a fourth argument is set to a description string in config_opts.h, ceph.in could get access to it via a pybind/com... Loïc Dachary
05:39 PM Fix #9484: OSD: block until we have the same map as the client on pg commands
Instead of blocking for *every* tell command (or even a subset), we can add one new command 'get_latest_osdmap' or si... Sage Weil
05:11 PM Fix #9484 (New): OSD: block until we have the same map as the client on pg commands
Right now, if a client has a newer map than we do and sends a PG command (like list_missing, #9219) we can reply ENOE... Greg Farnum
05:14 PM Bug #9219 (Fix Under Review): lost_unfound test got ENOENT: i don't have pgid 1.e
I created a few other tickets for the specific pg command issue, and created a PR so the OSD will subscribe to any os... Greg Farnum
04:49 PM Bug #9219: lost_unfound test got ENOENT: i don't have pgid 1.e
Okay, so at the time osdmap 19 was created, we had two of three OSDs running (osd.1 was down and out, and teuthology ... Greg Farnum
05:08 PM Feature #9483 (Resolved): OSD: add a get_newest_map command to the admin socket
This could be useful in testing and to "unstick" clusters in some odd situations we've seen before. Greg Farnum
04:20 PM Bug #9482: osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)
Yeah, pretty sure that's right, even if we only find backfill peers, we want to let those determine the min acceptabl... Samuel Just
04:06 PM Bug #9482: osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)
Actually, I'm not sure that's right. Thinking. Samuel Just
04:02 PM Bug #9482 (Resolved): osd/PGLog.cc: 544: FAILED assert(log.head >= olog.tail && olog.head >= log....
backfill peers are not setting info.last_epoch_started allowing subsequent primaries to erroneously conclude that it ... Samuel Just
03:39 PM Bug #9481 (Resolved): osd/PGLog.h: 87: FAILED assert(rollback_info_trimmed_to == head)
Bug is PGLog::claim_log_and_clear_rollback_info sets rollback_info_trimmed_to before setting head. Samuel Just
03:27 PM Bug #9480: OSD is crashing while object deletion
I have root-caused it; it seems to be happening because of one of my earlier changes :-( .. Here is the root cause.
1....
Somnath Roy
03:00 PM Bug #9480 (Resolved): OSD is crashing while object deletion
Steps to reproduce:
1. Run a command something like this.
rados bench -p rbench 200 write -t 32 -b 1024
The O...
Somnath Roy
03:12 PM Bug #9109: ceph CLI: Help is missing -k keyring option
Initial pull request:
https://github.com/ceph/ceph/pull/2483
Need to design a solution so that all clients can ...
Johnu George
03:05 PM Bug #9109: ceph CLI: Help is missing -k keyring option
not a low hanging fruit after all, johnu will try another ;-) Loïc Dachary
12:51 PM Bug #9109: ceph CLI: Help is missing -k keyring option
So, really, this applies to all the "Ceph global" options that the frontend doesn't have reason to do anything specia... Dan Mick
02:02 PM CephFS Bug #9444 (Resolved): "unmatched rstat" exception after firefly->master upgrade
if mds_verify_scatter isn't enabled, the MDS will fix rstat mismatch atomically. Zheng Yan
10:45 AM CephFS Bug #9444: "unmatched rstat" exception after firefly->master upgrade
I think you're right, John. I'm not sure why we never saw this before though — Zheng, what changed that we're looking... Greg Farnum
02:45 AM CephFS Bug #9444: "unmatched rstat" exception after firefly->master upgrade
Is this actually fixed, in the case of filesystems created using old code? It seems like the patch prevents creating... John Spray
01:34 PM Bug #9452: All tests failed in upgrade:dumpling-giant-x:parallel-master-distro-basic-multi run
The main source of these problems should be fixed by commit:cdb7675a21c9107e3596c90c2b1598def3c6899f Josh Durgin
01:33 PM rbd Bug #6494: High memory consumption of qemu/librbd with enabled cache
FTR the commits fixing this are commit:4fc9fffc494abedac0a9b1ce44706343f18466f1 and commit:cdb7675a21c9107e3596c90c2b... Josh Durgin
01:04 PM CephFS Fix #9435: prevent use of cache pools as metadata or data pools
First half here: https://github.com/ceph/ceph/tree/wip-9435 (no handling of tiering updates yet) John Spray
12:47 PM CephFS Fix #9435 (In Progress): prevent use of cache pools as metadata or data pools
John Spray
12:39 PM rgw Bug #9479 (Resolved): ETag is not included in the XML response to put object copy operation
User performs a put object copy operation, and the ETag is not included in the XML response. Tupper Cole
12:37 PM rgw Bug #9478 (Resolved): Incorrect content type in response header
User performs a put object copy operation, and seeing the content-type in the response header returned as "binary/oct... Tupper Cole
12:32 PM CephFS Feature #9477: Handle kclient shutdown with dead network more gracefully

Ah, this *only* happens if I have some dirty state from userspace at the time. In this instance it's my Mount.open...
John Spray
11:59 AM CephFS Feature #9477 (Closed): Handle kclient shutdown with dead network more gracefully
... John Spray
10:44 AM Bug #9476 (Duplicate): "Segmentation fault (core dumped)" in upgrade:dumpling-giant-x:parallel-gi...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-14_15:05:01-upgrade:dumpling-giant-x:parallel-gian... Yuri Weinstein
10:40 AM Linux kernel client Bug #4614 (Can't reproduce): Root cephfs does not mount at boot on Ubuntu 12.04
Greg Farnum
10:38 AM Documentation #9475 (Resolved): Broken links on downloads page
The "View installation docs for..." links at the bottom of http://ceph.com/resources/downloads/ are broken, presumabl... John Spray
10:35 AM Bug #9470 (Resolved): daemon pid file is not being created when running service ceph
This was fixed by commit:bccb0eb64891f65fd475e96b6386494044cae8c1, which will be in Giant. Greg Farnum
05:01 AM Bug #9470 (Resolved): daemon pid file is not being created when running service ceph
Hi,
We have been seeing some strange issues with the latest version(s) of ceph. I'm testing on 0.85 right now, an...
Kenneth Waegeman
10:14 AM CephFS Bug #9423 (Resolved): failure in client_recovery task
John Spray
10:14 AM CephFS Bug #9423: failure in client_recovery task

Fix merged to giant....
John Spray
08:07 AM CephFS Bug #9423: failure in client_recovery task
John Spray
09:50 AM CephFS Feature #9466 (In Progress): kclient: Extend CephFSTestCase tests to cover kclient
John Spray
03:43 AM CephFS Feature #9466: kclient: Extend CephFSTestCase tests to cover kclient
kclient instrumentation to enable implementing KernelClient::get_global_id (mapping local mount to the ID we see on t... John Spray
03:38 AM CephFS Feature #9466 (Resolved): kclient: Extend CephFSTestCase tests to cover kclient

Currently the mds_client_recovery and mds_client_limits tasks in ceph-qa-suite only run against the fuse client, be...
John Spray
09:42 AM devops Feature #9474 (Resolved): unify init-radosgw versions
There is a sysv version and a regular version. Keep these in sync.
Even better would be to unify with init-ceph ....
Sage Weil
08:33 AM Feature #9343: erasure-code: allow upgrades for lrc and isa plugins
running "monthrash against master":http://pulpito.ceph.com/loic-2014-09-15_08:31:19-rados:monthrash-master-testing-ba... Loïc Dachary
08:33 AM Bug #9472 (Duplicate): osd crash in -upgrade:dumpling-dumpling-distro-basic-vps suite
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-14_17:00:01-upgrade:dumpling-dumpling-distro-basic... Yuri Weinstein
08:21 AM devops Bug #9460: mira004, mira036, mira017 unresponsive
For mira017 see : http://qa-proxy.ceph.com/teuthology/teuthology-2014-09-14_17:00:01-upgrade:dumpling-dumpling-distro... Yuri Weinstein
08:06 AM CephFS Bug #9177: ceph-fuse: failing MPI mdtest runs
https://github.com/ceph/ceph-qa-suite/pull/140 John Spray
08:04 AM CephFS Bug #9177: ceph-fuse: failing MPI mdtest runs
John Spray
07:16 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
I have now installed Ubuntu 14.04 but still could not make it fail.
Even valgrind gives perfectly clean output.
I ...
Sebastien Ponce
06:50 AM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
Starting to look into this... Luis Pabon
06:42 AM Bug #9408 (In Progress): erasure-code: misalignment
Now I see it, thanks for your patience. Loïc Dachary
06:12 AM Bug #9408: erasure-code: misalignment
Hi Loic, I think Janne Grunau is right. The memory alignment depends on bufferlist::c_str.
Using this patch:
...
jianpeng ma
04:13 AM Bug #9408: erasure-code: misalignment
With the following applied on dcc608d5d3f701315eaf0edee6f0a4796a4d97e1... Loïc Dachary
03:20 AM Bug #9408: erasure-code: misalignment
jianpeng ma wrote:
> Can you tell me the result for this situation? I ran your command but it looks good.
I...
Janne Grunau
04:52 AM Bug #9008: Objecter: pg listing can deadlock when throttling is in use
Please help to review: https://github.com/ceph/ceph/pull/2489 Guang Yang
04:33 AM rgw Bug #9469 (Rejected): RadosGW performance degrades with high concurrency workload.
I am running COSBench as a performance benchmarking tool on a Ceph cluster (Swift API). Setup details are as follows:-... pushpesh sharma
04:29 AM Bug #9468 (Won't Fix): Unable to delete crush rule with blank space
I am not sure how a crush rule with a blank space at the beginning got created, but I am not able to delete it.
ems@rack...
Mallikarjun Biradar
04:15 AM Bug #9467 (Won't Fix): Delete default erasure coded profile getting succeeded
Deleting the default erasure-coded profile succeeds.
Also, re-creating erasure coded profile "default" with ...
Mallikarjun Biradar
03:24 AM Fix #6754 (Resolved): erasure-code: jerasure plugin does not check parameters properly
Loïc Dachary
 
