Activity

From 03/10/2014 to 04/08/2014

04/08/2014

11:51 PM Bug #8047 (Closed): 0.79: new OSD crashed within minutes
On 0.79 I added a new OSD (on btrfs). Shortly after re-balancing began, the newly added OSD crashed:... Dmitry Smirnov
10:12 PM Bug #8011: osd/ReplicatedPG.cc: 5244: FAILED assert(soid < scrubber.start || soid >= scrubber.end)
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-04-08_14:01:14-rados:thrash-wip-7891-testing-basic-plana/178972 Sage Weil
08:49 PM Bug #8028 (Fix Under Review): /lib/lsb/init-functions does not exist in latest firefly rc
Sage Weil
08:49 PM Bug #8028: /lib/lsb/init-functions does not exist in latest firefly rc
The gitbuilders like this fine. Alfredo, do you want to do a final test with ceph-deploy to make sure yum install fi... Sage Weil
05:57 PM Bug #8028: /lib/lsb/init-functions does not exist in latest firefly rc
Sage Weil
05:56 PM Bug #8028 (Fix Under Review): /lib/lsb/init-functions does not exist in latest firefly rc
Sage Weil
05:03 PM Bug #8028: /lib/lsb/init-functions does not exist in latest firefly rc
This is because of missing depends on "lsb-base". I reckon it (or similar package) should be available on RHEL/CentOS... Dmitry Smirnov
12:38 PM Bug #8028: /lib/lsb/init-functions does not exist in latest firefly rc
but those do not export the same functions... what is it that we needed from init-functions that we had to add that l... Alfredo Deza
09:34 AM Bug #8028: /lib/lsb/init-functions does not exist in latest firefly rc
the offending commit by the way is commit:012bb5fb5bbc76e5a2c5037dc0c6558f0b1b0a45 Sage Weil
09:34 AM Bug #8028: /lib/lsb/init-functions does not exist in latest firefly rc
... Sage Weil
08:16 AM Bug #8028: /lib/lsb/init-functions does not exist in latest firefly rc
Ok so this looks like this gets installed with the redhat-lsb package that has this file in it:... Alfredo Deza
07:04 AM Bug #8028: /lib/lsb/init-functions does not exist in latest firefly rc
Commenting out that one line allows me to start the monitors but I understand this has side-effects that are not enti... Alfredo Deza
06:07 AM Bug #8028 (Resolved): /lib/lsb/init-functions does not exist in latest firefly rc
... Alfredo Deza
08:25 PM CephFS Bug #8004: LibCephFS.HardlinkNoOriginal hang
Zheng Yan
06:49 PM rbd Bug #5469 (Fix Under Review): qemu-io: segfault when tried IO with invalid arguments
https://github.com/ceph/ceph/pull/1632 Josh Durgin
06:36 PM Bug #8046: osd/ReplicatedPG.h: 666: FAILED assert(got) in get_rw_locks()
Sage Weil
06:31 PM Bug #8046: osd/ReplicatedPG.h: 666: FAILED assert(got) in get_rw_locks()
The wr locks are held on both head and snapset due to a previous op (delete) that is committed but not yet applied. Sage Weil
05:35 PM Bug #8046 (Resolved): osd/ReplicatedPG.h: 666: FAILED assert(got) in get_rw_locks()
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-04-08_02:30:14-rados-firefly-distro-basic-plana/178780... Sage Weil
05:40 PM Bug #8045: osd: deadlock from osdmap feature update
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-04-08_02:30:14-rados-firefly-distro-basic-plana/178740 Sage Weil
05:30 PM Bug #8045 (Fix Under Review): osd: deadlock from osdmap feature update
Sage Weil
05:18 PM Bug #8045 (Resolved): osd: deadlock from osdmap feature update
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-04-08_02:30:14-rados-firefly-distro-basic-plana/178670... Sage Weil
05:01 PM Bug #8044 (Duplicate): osd/ReplicatedPG.cc: 2276: FAILED assert(p != snapset.clones.end())
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-04-08_02:30:14-rados-firefly-distro-basic-plana/178733... Sage Weil
04:11 PM Bug #8043 (Resolved): until we fix it more better, we should disallow split on cache pools
Samuel Just
03:52 PM Bug #8042 (Resolved): mon: crash decoding incremental osdmap on split firefly/dumpling
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-04-07_22:35:16-upgrade:dumpling-x:stress-split-firefl... Yuri Weinstein
03:24 PM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
Kevin Greenan proposes to expose a static function to reduce the amount of code required from the plugin : https://bi... Loïc Dachary
10:14 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
Sage Weil
10:02 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
the above config.yaml ran 6 times without failing Loïc Dachary
05:47 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
Running tests against the proposed change, using the previous config.yaml. Not really hoping that it will fail. Ju... Loïc Dachary
04:30 AM Bug #7914 (Fix Under Review): osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer ...
Loïc Dachary
03:04 PM Feature #8041 (Resolved): ceph uses GCC-specific strerror_r; easy to make more portable
GCC's strerror_r returns the string; the POSIX version returns success (the string is returned in the supplied buffer... Dan Mick
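The difference between the two strerror_r variants can be sketched as below. This is a minimal illustration, not the change made for this ticket: the helper name portable_strerror is hypothetical, and the preprocessor test assumes that the char*-returning variant is only exposed by glibc when _GNU_SOURCE is defined, with every other case getting the POSIX int-returning variant.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* request the POSIX strerror_r declaration */
#endif
#include <assert.h>
#include <errno.h>   /* errno constants such as ENOENT, for callers */
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the Ceph tree): writes the message for
 * errnum into buf and always returns buf, whichever strerror_r is in use.
 */
const char *portable_strerror(int errnum, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
#if defined(__GLIBC__) && defined(_GNU_SOURCE)
    /* GNU variant: returns a char* that may or may not point at buf. */
    const char *msg = strerror_r(errnum, buf, len);
    if (msg != buf) {
        /* The message lives elsewhere; copy it into the caller's buffer. */
        snprintf(buf, len, "%s", msg);
    }
#else
    /* POSIX variant: returns 0 on success, with the message in buf. */
    if (strerror_r(errnum, buf, len) != 0) {
        /* Unknown errnum or buffer too small: use a generic message. */
        snprintf(buf, len, "Unknown error %d", errnum);
    }
#endif
    return buf;
}
```

A caller would pass its own buffer, for example `char buf[128]; portable_strerror(ENOENT, buf, sizeof buf);`, and never needs to know which variant the platform provides.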
02:49 PM devops Bug #5835 (Resolved): Change text in package builds
Sage Weil
02:19 PM devops Bug #5835 (Fix Under Review): Change text in package builds
Sage Weil
02:46 PM devops Bug #6779 (Resolved): fix typo on the modfastcgi repo for fedora18
Simple typo, fixed in ceph-qa-chef commit 1e8ba35 Sandon Van Ness
02:36 PM devops Feature #8039 (Closed): move to libgoogle-perftools4
From James on ceph-maintainers:... Sage Weil
02:35 PM devops Bug #7552 (Fix Under Review): dregs of mkcephfs still live on
Sage Weil
02:22 PM devops Feature #8037 (Closed): Test leveldb 1.12 (or newer) and package as necessary
Ian Colle
02:22 PM devops Bug #7918 (Won't Fix): Mon hangs at start after upgrading to leveldb-1.12.0-3.fc18.x86_64 from th...
we're not going to worry about fc18 at this point; let's focus on making sure the fc10 and centos/rhel stuff works. Sage Weil
02:13 PM Feature #6258 (New): ceph-disk: zap should wipefs
Ian Colle
02:08 PM Bug #8036: levedb: throws std::bad_allow on 14.04
... Sage Weil
02:03 PM Bug #8036 (Can't reproduce): levedb: throws std::bad_allow on 14.04
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-04-07_22:35:16-upgrade:dumpling-x:stress-split-firefl... Yuri Weinstein
02:04 PM Bug #7744: osd: assert(last_e.version.version < e.version.version)
Sage Weil
01:56 PM Bug #3652 (Duplicate): split should not mess up stats
Ian Colle
01:55 PM Bug #4253 (Can't reproduce): radosgw: segfault in lockdep register
Sage Weil
01:54 PM Bug #5634: auth startup reports "ObjectNotFound" when keyring file is unreadable
Sage Weil
01:54 PM Bug #6629 (Won't Fix): fd cache and external changes to recently-modified files don't behave nicely
Samuel Just
01:53 PM Bug #6686 (Resolved): segfault in prioritized queue dequeue
commit:44028983 Sage Weil
01:52 PM Bug #6684 (Rejected): osd/PGLog.cc: 402: FAILED assert(log.head >= olog.tail && olog.head >= log....
Samuel Just
01:51 PM Bug #6826 (Duplicate): Non-equal performance of 'freshly joined' OSDs
Samuel Just
01:50 PM Bug #6826: Non-equal performance of 'freshly joined' OSDs
Probably related to snap trimming on newly clean new osds. Samuel Just
01:50 PM Bug #6944: objecter: localized read for missing objects polls replicas instead of waiting for pri...
Can you ensure this is fixed in your current objecter work? Ian Colle
01:49 PM Bug #6833 (Can't reproduce): `/etc/init.d/ceph status` occasionally exists silently
Sage Weil
01:49 PM Bug #3569 (Can't reproduce): Monitor & OSD failures when an OSD clock is wrong
Samuel Just
01:48 PM Bug #7044 (Can't reproduce): Segmentation fault rados suite on master
Sage Weil
01:48 PM Bug #6797 (Won't Fix): ceph osd out does not migrate properly
Sage Weil
01:47 PM Bug #7161 (Can't reproduce): rados api test LibRadosMisc.Exec failed on next branch
Sage Weil
01:46 PM Bug #7335 (Won't Fix): librbd does not raise "Object Not Found", instead returning NUL bytes
Sage Weil
01:45 PM Bug #7350 (Won't Fix): osd: scrub does not detect recently touched and then renamed backend files
Sage Weil
01:43 PM Bug #7593 (Resolved): Disk saturation during PG folder splitting
Sage Weil
01:41 PM Bug #7688 (Won't Fix): warn at fs/btrfs/extent-tree.c:5748 __btrfs_free_extent+0x9ce/0xa20
Sage Weil
01:41 PM Bug #7520 (Resolved): Lock contention during scrubbing which could potentially hang the OSD for a...
Sage Weil
01:41 PM Bug #7868 (Can't reproduce): "failed to recover before timeout expired" in powercycle-firefly---b...
Sage Weil
01:39 PM Bug #7936 (Can't reproduce): "failed: rados" in upgrade:dumpling-x:parallel-firefly-distro-basic-...
Sage Weil
01:37 PM Bug #7957 (Resolved): "[ERR] scrub mismatch" in upgrade:dumpling-emperor-x:parallel-firefly-testi...
Sage Weil
01:36 PM Bug #7968 (Won't Fix): ImportError occurred when run command 'ceph -v'
In order to get the python stuff installed properly you should use a package, or install the stuff in ceph.git/src/p... Sage Weil
01:34 PM rgw Bug #8016 (Resolved): "testPrefixAndLimit (test.functional.tests.TestContainerUTF8) ... ERROR" in...
Sage Weil
09:44 AM rgw Bug #8016: "testPrefixAndLimit (test.functional.tests.TestContainerUTF8) ... ERROR" in upgrade:du...
I think the 120s time just isn't long enough. Let's make it 300s (here and in the other thrashing/upgrade test). git... Sage Weil
01:33 PM Bug #7218 (Resolved): Displaying wrong number of pools with ceph -s after removeing a pool
Sage Weil
01:33 PM Bug #6689 (Resolved): osd: remove_redundant_pg_temp() can be slow on big clusters
Sage Weil
01:32 PM Bug #7549 (Won't Fix): Mon deadlock
This is most likely a bug in the older libgoogle-perftools* which is part of the Precise Ubuntu distribution. Either... David Zafman
01:21 PM rgw Bug #7815 (Can't reproduce): Test failed in upgrade:dumpling-x:parallel-firefly-testing-basic-pla...
Sage Weil
01:19 PM Bug #7068 (Can't reproduce): os/FileStore.cc: 4035: FAILED assert(omap_attrs.size() == omap_aset....
Samuel Just
01:17 PM Bug #6756: journal full hang on startup
Samuel Just
01:14 PM Bug #7398: osd: ERANGE from clone
might be dup of #7916 Sage Weil
01:14 PM Bug #6003 (Need More Info): journal Unable to read past sequence 406 ...
Sage Weil
01:13 PM Bug #7858 (Resolved): agent with snaps ceph_test_rados error
Samuel Just
01:13 PM Bug #7916 (Need More Info): ceph_test_rados got ENOENT on ec pool + thrashing
Sage Weil
01:08 PM Bug #7659 (Resolved): osd/ReplicatedPG.cc: 6751: FAILED assert(attrs || !pg_log.get_missing().is_...
David Zafman
01:08 PM Bug #8019 (Pending Backport): os/JournalingObjectStore.cc: 121: FAILED assert(op > committed_seq)...
Sage Weil
01:07 PM Bug #8019 (Resolved): os/JournalingObjectStore.cc: 121: FAILED assert(op > committed_seq) on wheezy
Sage Weil
12:13 PM Bug #8019 (Fix Under Review): os/JournalingObjectStore.cc: 121: FAILED assert(op > committed_seq)...
Sage Weil
01:06 PM Bug #7710 (In Progress): Multiple rados bench instance will overwrite the metadata object
This went through some review and is waiting on a respin. Greg Farnum
12:49 PM Bug #7891 (In Progress): osd: leaked pg refs on shutdown
Sage Weil
12:35 PM RADOS Fix #8035 (New): OSD: must guarantee we are newer than Objecter reply send epochs
Because OSDs are now normal clients of each other in some circumstances, we've broken our map synchronization guarant... Greg Farnum
12:33 PM devops Bug #6726: Official packages do not appear to be available for Saucy
There was a problem with our repo generator script for release builds which was causing even the new releases to not ... Sandon Van Ness
02:57 AM devops Bug #6726: Official packages do not appear to be available for Saucy
There should be as saucy/trusty was built for that release as I was involved with it today. I will find out what happ... Sandon Van Ness
02:00 AM devops Bug #6726: Official packages do not appear to be available for Saucy
Sandon Van Ness wrote:
> Since this ticket was opened our release build has been changed to include saucy/trusty so ...
Tim Bishop
01:32 AM devops Bug #6726: Official packages do not appear to be available for Saucy
There might be some confusion here. We have 'gitbuilders' that yes have been building them for some time that do nigh... Sandon Van Ness
01:24 AM devops Bug #6726: Official packages do not appear to be available for Saucy
I understand but that's not the issue. They have always been built, but the problem is that they are not being publis... Tom Verdaat
12:50 AM devops Bug #6726: Official packages do not appear to be available for Saucy
All new releases are being built for trusty/saucy. It is not super high priority at the moment to rebuild all our old... Sandon Van Ness
12:43 AM devops Bug #6726: Official packages do not appear to be available for Saucy
Any progress on this bug Sandon? Tom Verdaat
12:26 PM Bug #8001 (Fix Under Review): hung recovery; pg 3.f disappeared
Sage Weil
12:20 PM devops Bug #8034 (Resolved): ceph-deploy should run sudo yum clean all after installing ceph-release rpm
Otherwise you will get an unexpected old version from a previous ceph-release install after someone attempts to do a ... Sandon Van Ness
12:12 PM CephFS Bug #8026 (Resolved): shared pointer completely break multiple mds
Sage Weil
07:04 AM CephFS Bug #8026: shared pointer completely break multiple mds
Zheng Yan
12:04 AM CephFS Bug #8026 (Resolved): shared pointer completely break multiple mds
Zheng Yan
12:06 PM Feature #7437: EC: add adapt unittest teuthology task and add to nightly
David Zafman
12:05 PM devops Bug #5338 (Resolved): need rpm packages built for libapache-mod-fastcgi
This is complete. We have been doing these in jenkins for a while now. Sandon Van Ness
11:52 AM devops Bug #5338: need rpm packages built for libapache-mod-fastcgi
Ian Colle
12:04 PM rgw Feature #6678 (Resolved): rgw: reject writes to secondary zones
Pushed to dumpling, commit:b29238729f87c73dfdcf16dddcf293577678dea2 Yehuda Sadeh
11:56 AM devops Feature #7925: Feature: create new download.ceph.com site
I believe after a discussion with neil/Ian it was decided this was going to be on the back-burner for a bit so unassi... Sandon Van Ness
11:55 AM Bug #8031 (Fix Under Review): osd/ReplicatedPG.cc: 405: FAILED assert(needs_recovery)
Sage Weil
09:37 AM Bug #8031: osd/ReplicatedPG.cc: 405: FAILED assert(needs_recovery)
Sage Weil
08:28 AM Bug #8031 (Resolved): osd/ReplicatedPG.cc: 405: FAILED assert(needs_recovery)
ubuntu@teuthology:/a/teuthology-2014-04-07_23:01:05-kcephfs-master-testing-basic-plana/178021... Sage Weil
11:53 AM Feature #7792: leveldb 1.12.0 for rhel
After building 1.12 we later decided to take the package down. Until I get updates on what we are doing about this, it is... Sandon Van Ness
11:50 AM devops Feature #6098: put teuthology.front.sepia.ceph.com apache configuration files under source control
Ian Colle
10:47 AM Feature #8033 (New): Epic: Kerberos/LDAP Support
Users with existing LDAP or AD systems would like to integrate them into the cephx system so authentication and autho... Neil Levine
10:34 AM Bug #5818: leveldb 1.12: hang on shutdown (mon)
observed this again on leveldb 1.12:... Sage Weil
10:32 AM Bug #8007: osd: hang on shutdown with valgrind on trusty
nevermind, the second instance is a mon shutdown and it is a leveldb 1.12 compaction vs shutdown race. Sage Weil
10:20 AM Bug #8007: osd: hang on shutdown with valgrind on trusty
saw this on precise:
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-04-07_23:00:18-rgw-master-test...
Sage Weil
10:09 AM Bug #6789: cannot remove the leader when there only are two monitors
Cool :-) Loïc Dachary
09:34 AM Bug #6789 (Fix Under Review): cannot remove the leader when there only are two monitors
Joao Eduardo Luis
09:32 AM Bug #6789: cannot remove the leader when there only are two monitors
https://github.com/ceph/ceph/pull/1624 Joao Eduardo Luis
09:28 AM Bug #6789: cannot remove the leader when there only are two monitors
Also, it's relevant to mention that this does not happen only with the leader. Any monitor that is removed from the ... Joao Eduardo Luis
09:23 AM Bug #6789 (In Progress): cannot remove the leader when there only are two monitors
I was wrong. This does happen on current, and emperor, and dumpling.
The monitor has this feature that allows hi...
Joao Eduardo Luis
09:49 AM Bug #8022 (Duplicate): coredumps found in librbd tests
Sage Weil
09:49 AM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
ubuntu@teuthology:/a/teuthology-2014-04-06_01:10:23-ceph-deploy-firefly-distro-basic-vps/173432 Sage Weil
09:40 AM Bug #8021: osd: ENOENT on clone on dumpling
i think this has been fixed on firefly but not dumpling.... Sage Weil
09:27 AM Linux kernel client Bug #7954: misdirected op
teuthology-2014-04-05_23:05:02-krbd-firefly-testing-basic-plana/173341
ubuntu-2014-04-05_23:18:18-kcephfs-master-tes...
Sage Weil
09:18 AM rbd Bug #8030 (Duplicate): krbd,kcephfs: misdirected request
Sage Weil
09:17 AM rbd Bug #8030: krbd,kcephfs: misdirected request
To be explicit: the error is -ENXIO, returned by the OSD when the write was not sent to the correct primary (and the ... Josh Durgin
08:19 AM rbd Bug #8030 (Duplicate): krbd,kcephfs: misdirected request
ubuntu@teuthology:/a/teuthology-2014-04-07_23:00:55-krbd-master-testing-basic-plana/178000... Sage Weil
09:10 AM rgw Bug #7799: Errors in upgrade:dumpling-x:stress-split-firefly---basic-plana suite
Yuri - have we seen this in a while? We believe it's fixed, but want to confirm. Ian Colle
09:06 AM devops Feature #7171 (Resolved): rbdmap should be part of ceph-common
commit:17732dc0c8878ea58813ad543c5359cb811079cc Josh Durgin
09:04 AM rbd Bug #6480: librbd crashed qemu-system-x86_64
Ian Colle
07:55 AM devops Bug #8027 (Can't reproduce): Ceph v0.79 Firefly RC :: erasure-code-profile command set not presen...
Alfredo Deza
07:53 AM devops Bug #8027: Ceph v0.79 Firefly RC :: erasure-code-profile command set not present for CentOS RPM
This looks like a problem with an unclean upgrade of sorts because I cannot replicate this problem at all.
Package...
Alfredo Deza
03:07 AM devops Bug #8027 (Can't reproduce): Ceph v0.79 Firefly RC :: erasure-code-profile command set not presen...
I have been using 0.78 in order to test EC and TP; with 0.78 I was not able to test the erasure code profile feature ( ... karan singh
07:52 AM Bug #7991: ceph-mon crash
There is no evidence of a crash in the logs.
One of the monitors appears to be working fine.
The other monitor ...
Joao Eduardo Luis
06:59 AM CephFS Bug #3424: java: Add the correct JUnit package dependencies on supported platforms and ensure the...
Ian Colle
06:51 AM CephFS Bug #8025: nfs-on-kclient: rm -r failed
Zheng Yan

04/07/2014

10:21 PM Linux kernel client Bug #8024: kclient: misdirected osd request
Maybe caused by:... Zheng Yan
09:55 PM Linux kernel client Bug #8024 (Resolved): kclient: misdirected osd request
teuthology:/a/teuthology-2014-04-06_23:01:04-kcephfs-master-testing-basic-plana/175847... Greg Farnum
10:08 PM CephFS Bug #8025 (Resolved): nfs-on-kclient: rm -r failed
teuthology-2014-04-06_23:01:11-knfs-master-testing-basic-plana/175859/... Greg Farnum
09:26 PM CephFS Bug #7958 (Resolved): ceph-fuse+fsx umount hang on leaked inode reference
Sage Weil
09:20 PM Bug #7666 (Duplicate): librados: lock cycle on shutdown
Sage Weil
09:20 PM Bug #7376 (Resolved): mon: >10s spent in remove_redundant_pg_temp
208959a0dcacba40116730702021090a24865eb3 Sage Weil
09:17 PM CephFS Feature #7319 (Resolved): qa: multimds, no failure
Sage Weil
09:17 PM Feature #5437 (Resolved): ceph-mon performance on ARM
Sage Weil
09:17 PM Feature #2088 (Rejected): msgr: refactor 2 threads to one
Sage Weil
09:13 PM CephFS Bug #7739 (Resolved): mds: uninitialized field in message
Sage Weil
08:55 PM Bug #7997 (Resolved): handle_get_version returns old map epochs
Sage Weil
08:14 AM Bug #7997 (Pending Backport): handle_get_version returns old map epochs
Sage Weil
08:53 PM rgw Bug #6889 (Resolved): rgw: usage log: don't log system user operations
Sage Weil
08:53 PM rgw Bug #7099 (Resolved): Strange Comportments with media files
Sage Weil
08:49 PM Bug #7994 (Resolved): OSD: share map when sending subops to peers
Sage Weil
11:02 AM Bug #7994 (Pending Backport): OSD: share map when sending subops to peers
Sage Weil
10:58 AM Bug #7994: OSD: share map when sending subops to peers
The simple fix was merged into master in commit:1a9952c60570aa308410c69db0289160f44969b1. Greg Farnum
10:57 AM Bug #7994 (Resolved): OSD: share map when sending subops to peers
Sage Weil
08:49 PM Bug #7736 (Resolved): mon: can expose stale state
Sage Weil
08:49 PM Bug #7738 (Resolved): osd: journal crash on startup on wheezy
Sage Weil
08:49 PM Bug #6992 (Resolved): OSD assert fails after it found it was marked as down by monitor during hig...
Sage Weil
08:49 PM Bug #6909 (Resolved): Incomplete state should retry on Notify
Sage Weil
06:03 PM Bug #8022 (Duplicate): coredumps found in librbd tests
Coredump link http://qa-proxy.ceph.com/teuthology/teuthology-2014-04-06_01:10:23-ceph-deploy-firefly-distro-basic-vps... Alfredo Deza
04:43 PM Bug #7996: 0.78: OSD is not suspend-friendly (unresponsive cluster on OSD crash)
Sage Weil wrote:
> I suspect that the mon on that machine is the key factor at play here. There was a fix that went...
Dmitry Smirnov
09:43 AM Bug #7996: 0.78: OSD is not suspend-friendly (unresponsive cluster on OSD crash)
I suspect that the mon on that machine is the key factor at play here. There was a fix that went in just after 0.78 ... Sage Weil
04:42 PM devops Bug #8017: Redhat Dependencies Unmet
From John's log it looks like the dumpling ceph-release rpm (http://ceph.com/rpm-dumpling/el6/noarch/ceph-release-1-0... Josh Durgin
04:39 PM devops Bug #8017: Redhat Dependencies Unmet
RH packages are dependent upon EPEL. Need to build all missing packages and include in repo. Ian Colle
02:57 PM devops Bug #8017 (Duplicate): Redhat Dependencies Unmet
Ian Colle
02:42 PM devops Bug #8017: Redhat Dependencies Unmet
This has been verified as a problem when installing dumpling too.
john@admin-host:~/rgw-validation-cluster$ ceph-...
John Wilkins
02:06 PM devops Bug #8017 (Resolved): Redhat Dependencies Unmet
When attempting to install Ceph Cuttlefish on a bare metal installation of Redhat 6.5, I encountered a series of unme... John Wilkins
04:31 PM Bug #8009 (New): librados failing tests for APILock
I'm totally confused. The test in question *only* runs if osd_max_attr_size != 0, which it is not (defaults to 0, no... Sage Weil
10:53 AM Bug #8009 (In Progress): librados failing tests for APILock
Sage Weil
07:15 AM Bug #8009 (Closed): librados failing tests for APILock
A few failures on ceph-deploy nightly tests:... Alfredo Deza
04:18 PM CephFS Bug #8005 (Rejected): fuse hang
no error in client log, looks like mds was killed by someone Zheng Yan
04:08 PM Bug #8003 (Resolved): head eviction can race with clone promotion
Sage Weil
03:59 PM Bug #7999: osd: pgs share info that hasn't been persisted
Sage Weil
01:20 PM Bug #7999: osd: pgs share info that hasn't been persisted
osd.0 starts a repop:... Sage Weil
03:58 PM Bug #7975 (Resolved): osd: handle inconsistent stats in the osd post split
Sage Weil
03:56 PM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
"work in progress":https://github.com/ceph/ceph/pull/1621 Loïc Dachary
01:16 PM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
Kevin Greenan writes:
> Using the doc and a few special functions in jerasure, you can ensure that the underlying ...
Loïc Dachary
11:52 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
"gfp_array is a global variable":https://bitbucket.org/jimplank/jerasure/src/80fc5d1d95f06ea4732717b06b42177099cc93c9... Loïc Dachary
11:33 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
Interesting : it is in *decode* this time, not *encode* Loïc Dachary
10:54 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
... Sage Weil
12:38 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
Ran 27 times without triggering the problem. Will keep running it 100 times more. Loïc Dachary
03:41 PM Bug #8021 (Duplicate): osd: ENOENT on clone on dumpling
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-04-06_22:35:23-upgrade:dumpling-x:stress-split-firefl... Yuri Weinstein
03:30 PM Bug #8020 (Resolved): evenly split stats on split
At least it's better than what we currently do. Samuel Just
03:30 PM Bug #7967 (Resolved): finish_promote needs to handle the omap flag
Samuel Just
03:03 PM Bug #8019 (Resolved): os/JournalingObjectStore.cc: 121: FAILED assert(op > committed_seq) on wheezy
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-04-06_22:35:23-upgrade:dumpling-x:stress-split-firefl... Yuri Weinstein
02:22 PM Documentation #7886: What's the policy on URL stability for public documentation?
We could certainly have a "latest-release" link on the website and legislate that to be our 'stable' doc link... Dan Mick
02:15 PM Documentation #7886: What's the policy on URL stability for public documentation?
Getting off master and using the last named release branch name as the default docs root makes total sense. Does sphi... Neil Levine
02:01 PM Documentation #7886: What's the policy on URL stability for public documentation?
The Ceph.com site uses the master branch by default, which changes somewhat frequently. A more stable approach would ... John Wilkins
02:22 PM RADOS Fix #8018 (New): OSD: check if messages are actually handled in ms_dispatch
OSD::ms_dispatch returns "true" no matter what happens. The _dispatch() function which does the real work doesn't eve... Greg Farnum
02:17 PM Bug #7937 (Resolved): [ERR] deep-scrub 5.ds0 79d5820d/burnupi0838757-23/1f7//5 expected clone
Samuel Just
02:16 PM Bug #7985 (Rejected): 2014-04-02T20:36:41.677 INFO:teuthology.task.rados.rados.0.err:[10.214.131....
Samuel Just
02:02 PM rgw Bug #8016 (Resolved): "testPrefixAndLimit (test.functional.tests.TestContainerUTF8) ... ERROR" in...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-04-06_22:35:23-upgrade:dumpling-x:stress-split-firefl... Yuri Weinstein
02:02 PM Bug #7904 (Resolved): osd/ReplicatedPG.cc: 10661: FAILED assert(is_active())
Sage Weil
02:01 PM Bug #7964 (Resolved): ceph_test_rados with snaps and caching stat errors
Sage Weil
12:57 PM CephFS Bug #8010: It's impossible to remove unused filesystem pools from a cluster
It's happening with
ceph version 0.78 (f6c746c314d7b87b8419b6e584c94bfe4511dbd4)
Linux access.car.dot.com 3.13.0-...
Maxym Kutsevol
10:07 AM CephFS Bug #8010 (Resolved): It's impossible to remove unused filesystem pools from a cluster
We've inadvertently made it impossible to remove a filesystem from a Ceph cluster. If there is no data in the FS, it... Greg Farnum
10:38 AM Bug #8011 (Resolved): osd/ReplicatedPG.cc: 5244: FAILED assert(soid < scrubber.start || soid >= s...
osd/ReplicatedPG.cc: 5244: FAILED assert(soid < scrubber.start || soid >= scrubber.end)
ceph version 0.78-600-g19...
Samuel Just
08:37 AM Bug #7706 (Resolved): osd: PrioritizedQueue can starve
Sage Weil
08:36 AM Bug #7977 (Resolved): cephx has embedded byte-order dependency
Sage Weil
06:26 AM Bug #8008 (Resolved): osd/ReplicatedPG.cc: 258: FAILED assert(missing_loc.needs_recovery(hoid)) d...
Here is the log from crashed OSD:... Dmitry Smirnov

04/06/2014

09:22 PM CephFS Bug #8005: fuse hang
still looks like MDS was dead Zheng Yan
01:44 PM CephFS Bug #8005 (Rejected): fuse hang
... Sage Weil
05:54 PM Bug #8007: osd: hang on shutdown with valgrind on trusty
... Sage Weil
05:52 PM Bug #8007 (Can't reproduce): osd: hang on shutdown with valgrind on trusty
have seen several runs hang on osd shutdown. trusty. valgrind. here is the gdb thread dump:... Sage Weil
05:40 PM CephFS Bug #8006 (Rejected): fuse hang on flush (icache branch)
The flush hang is because ceph-fuse was unmounting (received signal). Unmounting can't finish because the MDS was dead at th... Zheng Yan
01:49 PM CephFS Bug #8006 (Rejected): fuse hang on flush (icache branch)
... Sage Weil
05:29 PM CephFS Bug #8004: LibCephFS.HardlinkNoOriginal hang
oh, and the 32-bit pointer thing is because ceph-fuse is running under valgrind. Sage Weil
05:27 PM CephFS Bug #8004: LibCephFS.HardlinkNoOriginal hang
seems easy to reproduce, just hit this again with... Sage Weil
01:41 PM CephFS Bug #8004 (Resolved): LibCephFS.HardlinkNoOriginal hang
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-04-05_15:44:13-multimds:verify-wip-ms-dump-testing-basic-pla... Sage Weil
04:32 PM Bug #8002 (Resolved): osds down, but not advancing osdmaps
Samuel Just
04:03 PM Bug #8002 (Fix Under Review): osds down, but not advancing osdmaps
Sage Weil
03:53 PM Bug #8002: osds down, but not advancing osdmaps
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-04-06_02:30:05-rados-master-testing-basic-plana/173853 Sage Weil
04:05 PM Bug #7986: 3.1s0 scrub stat mismatch, got 2041/2044 objects, 0/0 clones, 2041/2044 dirty, 0/0
ubuntu@teuthology:/var/lib/teuthworker/archive/gregf-2014-04-04_22:05:49-rados-wip-7994-testing-basic-plana
Sage Weil
04:05 PM Bug #7998 (Duplicate): 3.2s1 scrub stat mismatch, got 2000/2001 objects, 0/0 clones, 2000/2001 di...
#7986 Sage Weil
01:39 PM Bug #8003 (Resolved): head eviction can race with clone promotion
_verify_no_head_clones will check copy_ops. Samuel Just
01:22 PM CephFS Bug #7739: mds: uninitialized field in message
Sage Weil
06:39 AM rbd Bug #6480: librbd crashed qemu-system-x86_64
Josh, I am now running wip-6480-0.67.7 across our whole infrastructure. No issues yet. Because the race is rare, I th... Mike Dawson
03:03 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
The workload is slightly different from the previous one:... Loïc Dachary
02:58 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
The teuthology config.yaml... Loïc Dachary
02:39 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
The plana08 machine has been recycled and there is no core archived. See the attachment:ceph-osd.0.log.gz for the ful... Loïc Dachary
02:36 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
SSE4 plugin selected and ECX = *029ee3ff* (this is consistent)... Loïc Dachary
02:34 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
/var/lib/teuthworker/archive/gregf-2014-04-04_22:05:49-rados-wip-7994-testing-basic-plana/170763/remote/ubuntu@plana0... Loïc Dachary

04/05/2014

09:07 PM Bug #8002 (Resolved): osds down, but not advancing osdmaps
ubuntu@teuthology:/var/lib/teuthworker/archive/gregf-2014-04-04_22:05:49-rados-wip-7994-testing-basic-plana/170941
...
Sage Weil
08:58 PM Bug #8001 (Resolved): hung recovery; pg 3.f disappeared
ubuntu@teuthology:/var/lib/teuthworker/archive/gregf-2014-04-04_22:05:49-rados-wip-7994-testing-basic-plana/170892
...
Sage Weil
08:48 PM Bug #7891: osd: leaked pg refs on shutdown
ubuntu@teuthology:/var/lib/teuthworker/archive/gregf-2014-04-04_22:05:49-rados-wip-7994-testing-basic-plana/171097 Sage Weil
08:33 PM rbd Bug #8000 (Closed): SLAB: Unable to allocate memory on node 0
I'm getting the following kernel errors with ext4 on rbd:... Dmitry Smirnov
05:09 PM Bug #7999: osd: pgs share info that hasn't been persisted
... Sage Weil
05:09 PM Bug #7999 (Resolved): osd: pgs share info that hasn't been persisted
ubuntu@teuthology:/a/gregf-2014-04-04_22:05:49-rados-wip-7994-testing-basic-plana/170880 Sage Weil
04:59 PM Bug #7997 (Fix Under Review): handle_get_version returns old map epochs
Sage Weil
04:45 PM Bug #7997 (In Progress): handle_get_version returns old map epochs
Sage Weil
03:44 PM Bug #7997 (Resolved): handle_get_version returns old map epochs
ubuntu@teuthology:/var/lib/teuthworker/archive/gregf-2014-04-04_22:05:49-rados-wip-7994-testing-basic-plana/170649... Sage Weil
03:55 PM Bug #7998 (Duplicate): 3.2s1 scrub stat mismatch, got 2000/2001 objects, 0/0 clones, 2000/2001 di...
... Sage Weil
03:45 PM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
ubuntu@teuthology:/var/lib/teuthworker/archive/gregf-2014-04-04_22:05:49-rados-wip-7994-testing-basic-plana/170763 Sage Weil

04/04/2014

08:43 PM Bug #7996 (Won't Fix): 0.78: OSD is not suspend-friendly (unresponsive cluster on OSD crash)
One machine running MON and OSD got suspended.
Shortly after (within seconds) the whole cluster got unresponsive for...
Dmitry Smirnov
07:54 PM Bug #7975 (Fix Under Review): osd: handle inconsistent stats in the osd post split
Sage Weil
02:08 PM Bug #7975 (In Progress): osd: handle inconsistent stats in the osd post split
Sage Weil
07:28 PM CephFS Bug #7980: 0.78: MDS crash (segmentation fault) on client wake-up from suspend.
Works as expected, problem solved, thank you. Dmitry Smirnov
12:18 AM CephFS Bug #7980: 0.78: MDS crash (segmentation fault) on client wake-up from suspend.
Very nice, thank you. I'll test and confirm. Dmitry Smirnov
12:01 AM CephFS Bug #7980 (Resolved): 0.78: MDS crash (segmentation fault) on client wake-up from suspend.
fixed by https://github.com/ceph/ceph/commit/fb72330fb3514be690dc60598242036aa560e023 Zheng Yan
06:07 PM Bug #7993 (Resolved): ceph-post-file can only accept one option
Sage Weil
03:26 PM Bug #7993 (Fix Under Review): ceph-post-file can only accept one option
Dan Mick
03:21 PM Bug #7993 (In Progress): ceph-post-file can only accept one option
Dan Mick
03:02 PM Bug #7993 (Resolved): ceph-post-file can only accept one option
currently only looks once at options, uses $1 where it means $2, etc. It needs some getopts love. Dan Mick
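The general fix described here, looping over every option instead of inspecting the arguments once, can be sketched as follows. This is a Python illustration of the getopts pattern, not the ceph-post-file script itself; the option letters `-d` and `-u` and the function name `parse_args` are hypothetical.

```python
import getopt


def parse_args(argv):
    """Parse every option in argv, not just the first one."""
    # Hypothetical option set for illustration: -d <description>, -u <user>.
    opts, files = getopt.getopt(argv, "d:u:")
    settings = {"description": None, "user": None}
    # getopt returns one (flag, value) pair per option found, so the loop
    # handles any number of options in any order.
    for flag, value in opts:
        if flag == "-d":
            settings["description"] = value
        elif flag == "-u":
            settings["user"] = value
    # Whatever is left after the options is the list of positional arguments.
    return settings, files


settings, files = parse_args(
    ["-d", "osd crash logs", "-u", "alice", "a.log", "b.log"]
)
```

Because each option's value is taken from the parsed pair rather than from a fixed positional slot, the "$1 where it means $2" class of mistake does not arise.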
05:10 PM Bug #7995 (Can't reproduce): osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs...
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-04-04_02:30:19-rados-master-testing-basic-plana/168602... Sage Weil
05:08 PM Bug #7891: osd: leaked pg refs on shutdown
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-04-04_02:30:19-rados-master-testing-basic-plana/168600 Sage Weil
11:07 AM Bug #7891: osd: leaked pg refs on shutdown
ubuntu@teuthology:/a/teuthology-2014-04-02_02:30:02-rados-master-testing-basic-plana/161303/remote$
Samuel Just
04:30 PM Bug #7994: OSD: share map when sending subops to peers
The quick-and-dirty fix is in wip-7994; I'll run it through the suite as soon as it builds. A (at least slightly) mor... Greg Farnum
04:28 PM Bug #7994 (Resolved): OSD: share map when sending subops to peers
Right now, the OSD doesn't preemptively share maps when sending subops. Fix it. Greg Farnum
04:21 PM Bug #7984 (Duplicate): osd/ReplicatedPG.cc: 2273: FAILED assert(p != snapset.clones.end())
Samuel Just
04:06 PM Bug #7984 (In Progress): osd/ReplicatedPG.cc: 2273: FAILED assert(p != snapset.clones.end())
Samuel Just
10:29 AM Bug #7984: osd/ReplicatedPG.cc: 2273: FAILED assert(p != snapset.clones.end())
ubuntu@teuthology:/a/teuthology-2014-04-02_02:30:02-rados-master-testing-basic-plana/161266/remote Samuel Just
10:29 AM Bug #7984 (Duplicate): osd/ReplicatedPG.cc: 2273: FAILED assert(p != snapset.clones.end())

ceph version 0.78-522-gedb8a59 (edb8a5965e72b6173d3f88d1a63c8b3ca1b9235c)
1: (ReplicatedPG::trim_object(hobject_...
Samuel Just
03:16 PM Bug #7983 (Resolved): osd: erroneously present object
Samuel Just
02:07 PM Bug #7983 (Fix Under Review): osd: erroneously present object
Sage Weil
01:13 PM Bug #7983 (In Progress): osd: erroneously present object
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-04-04_02:30:19-rados-master-testing-basic-plana/168472 Sage Weil
08:31 AM Bug #7983 (Resolved): osd: erroneously present object
ubuntu@teuthology:/a/teuthology-2014-04-03_02:30:03-rados-firefly-distro-basic-plana/166349 Sage Weil
02:53 PM Bug #7992 (Resolved): ceph-post-file keys are wrong
Sage Weil
02:37 PM Bug #7992 (Resolved): ceph-post-file keys are wrong
Snippet of install script in Makefile.am installs the known_hosts file to all three
of known_hosts, id_dsa, and id_d...
Dan Mick
02:28 PM Bug #7991: ceph-mon crash
Logs have been uploaded via cephdrop@ceph.com in issue7991 folder. Thanks.
Cluster details:
Ubuntu 12.04 - 3 x ...
Andrei Mikhailovsky
02:11 PM Bug #7991 (Rejected): ceph-mon crash
I've had an issue with crashing ceph-mon. It happened twice over the course of the last two weeks. Attached are the ceph-... Andrei Mikhailovsky
01:26 PM rgw Feature #7990 (New): RGW: Ldap Integration
For users with existing LDAP systems, they would like to be able to configure RGW so that authentication and authoriz... Neil Levine
01:21 PM Feature #7988 (Resolved): Logs: Log every administrative action taken by a user
Many enterprise users have strict security policies which require that all events generated by a user are explicitly ... Neil Levine
12:59 PM Fix #7919: mon: prevent clients with a read cap from reading the full keyring
merged to master/firefly. backported to emperor too, along with a release note. Sage Weil
12:55 PM Fix #7919 (Resolved): mon: prevent clients with a read cap from reading the full keyring
Sage Weil
11:59 AM Bug #7987 (Duplicate): osd: backfill/recovery makes no progress
ubuntu@teuthology:/a/teuthology-2014-04-02_02:30:02-rados-master-testing-basic-plana/161007/remote$
At least, tha...
Samuel Just
11:04 AM Bug #6756: journal full hang on startup
ubuntu@teuthology:/a/teuthology-2014-04-02_02:30:02-rados-master-testing-basic-plana/161291/remote Samuel Just
10:50 AM Bug #7986: 3.1s0 scrub stat mismatch, got 2041/2044 objects, 0/0 clones, 2041/2044 dirty, 0/0
Base pool, issue with copy_from? Samuel Just
10:50 AM Bug #7986 (Can't reproduce): 3.1s0 scrub stat mismatch, got 2041/2044 objects, 0/0 clones, 2041/2...
duration: 1915.8093299865723
failure_reason: '"2014-04-02 21:31:21.281062 osd.4 10.214.133.26:6812/11657 29 : [ERR]
...
Samuel Just
10:45 AM Bug #7985 (Rejected): 2014-04-02T20:36:41.677 INFO:teuthology.task.rados.rados.0.err:[10.214.131....
ubuntu@teuthology:/a/teuthology-2014-04-02_02:30:02-rados-master-testing-basic-plana/160986/remote
2014-04-02T20:3...
Samuel Just
09:47 AM devops Bug #7981 (Resolved): chef fails to install libleveldb1
This was fixed a few hours after that test ran on the 2nd. Sandon Van Ness
06:17 AM devops Bug #7981 (Resolved): chef fails to install libleveldb1
Not sure if this is because the package is just not there or because there was a network hiccup when the tests ran
...
Alfredo Deza
09:23 AM Bug #7576 (Fix Under Review): osd: large skew in pg epochs (dumpling)
Sage Weil
09:21 AM Bug #7922 (Resolved): osd: multi-backfill reservation does not release on reject
Sage Weil
08:58 AM Support #7609: http://tracker.ceph.com/account/register returns 500 Internal error
someone ran into it again Loïc Dachary
08:30 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
This didn't trigger at all in yesterday's run. :/ Sage Weil
08:15 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
To find the failed ones:... Loïc Dachary
08:29 AM Bug #7892: osd/ReplicatedPG.cc: 7881: FAILED assert((data_included.empty() && data.length() == 0)...
ubuntu@teuthology:/a/teuthology-2014-04-03_02:30:03-rados-firefly-distro-basic-plana Sage Weil
05:17 AM rbd Bug #6480: librbd crashed qemu-system-x86_64
This may help somehow - recently I've started to collect perf samples via an endless loop on selected hosts and some hos... Andrey Korolyov
12:36 AM rbd Bug #5876: Assertion failure in rbd_img_obj_callback() : rbd_assert(which >= img_request->next_co...
Just in case, the fix you are running with is now in 3.14. However we
are still working on a better fix, so we'll k...
Ilya Dryomov
12:18 AM rbd Bug #5876: Assertion failure in rbd_img_obj_callback() : rbd_assert(which >= img_request->next_co...
I haven't had this problem anymore; it seems really stable for me now. Thanks!
I think the issue can be marked as r...
Olivier Bonvalet

04/03/2014

11:47 PM CephFS Bug #7980 (Resolved): 0.78: MDS crash (segmentation fault) on client wake-up from suspend.
MDS crashes (segmentation fault) when I wake-up machine with CephFS (mounted using kernel client) from suspend:
...
Dmitry Smirnov
10:54 PM Bug #7922 (Fix Under Review): osd: multi-backfill reservation does not release on reject
David Zafman
06:08 PM Bug #7922: osd: multi-backfill reservation does not release on reject
... Sage Weil
03:29 PM Bug #7922: osd: multi-backfill reservation does not release on reject
i reproduced this on my first try with the patch in wip-7922.... Sage Weil
06:30 PM Bug #7977 (Pending Backport): cephx has embedded byte-order dependency
Sage Weil
01:56 PM Bug #7977 (Resolved): cephx has embedded byte-order dependency
Calculation of the original session key is byte-order-dependent; cephx_calc_client_server_challenge gets a message di... Dan Mick
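A byte-order dependency of this kind can be illustrated with a small sketch. The XOR-folding scheme below is an assumption chosen for illustration and is not claimed to be what `cephx_calc_client_server_challenge` does; `xor_fold` and the sample buffer are hypothetical.

```python
import struct


def xor_fold(data: bytes, byteorder: str) -> int:
    """XOR-fold data into one 64-bit value, reading words in the given byte order."""
    fmt = ("<" if byteorder == "little" else ">") + "Q"
    key = 0
    # Only whole 8-byte words are folded; a trailing partial word is ignored.
    for pos in range(0, len(data) - len(data) % 8, 8):
        (word,) = struct.unpack_from(fmt, data, pos)
        key ^= word
    return key


# Stand-in for an encrypted challenge buffer (hypothetical data).
digest = bytes(range(1, 17))

# The same bytes yield different keys depending on how words are read,
# which is what happens when each host uses its native byte order.
little = xor_fold(digest, "little")
big = xor_fold(digest, "big")
```

If both peers fix the byte order explicitly (for example, always little-endian) instead of reading words natively, they derive the same value regardless of architecture.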
05:49 PM Bug #7964: ceph_test_rados with snaps and caching stat errors
Samuel Just
01:04 PM Bug #7964 (Rejected): ceph_test_rados with snaps and caching stat errors
Samuel Just
05:43 PM rgw Bug #7978 (Resolved): rgw: infinite loop when iterating multipart object
Sage Weil
03:17 PM rgw Bug #7978 (Fix Under Review): rgw: infinite loop when iterating multipart object
Yehuda Sadeh
03:13 PM rgw Bug #7978 (Resolved): rgw: infinite loop when iterating multipart object
This happens when the object created has a final placement rule with part_size > 0, e.g.,... Yehuda Sadeh
01:55 PM Bug #7976 (Duplicate): 4.8 missing primary copy of ..., unfound (dumpling)
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-04-02_22:35:24-upgrade:dumpling-x:stress-split-firefl... Yuri Weinstein
01:38 PM rbd Bug #7973 (Resolved): rbd ls returns error code of 1 on empty pool
This is the same as #6693. Backported the fix to the dumpling branch in commit:c66b61f9dcad217429e4876d27881d9fb2e7666f Josh Durgin
11:55 AM rbd Bug #7973 (Resolved): rbd ls returns error code of 1 on empty pool
Using version 0.67.7 rbd ls is returning an error code of 1 on an empty pool. This is causing OpenStack Havana proble... JuanJose Galvez
01:28 PM Bug #7975 (Resolved): osd: handle inconsistent stats in the osd post split
Samuel Just
01:01 PM Bug #7967: finish_promote needs to handle the omap flag
Samuel Just
10:24 AM Fix #7919 (Fix Under Review): mon: prevent clients with a read cap from reading the full keyring
pull request 1597 https://github.com/ceph/ceph/pull/1597 Joao Eduardo Luis
06:53 AM CephFS Bug #7958: ceph-fuse+fsx umount hang on leaked inode reference
Zheng Yan
05:32 AM CephFS Bug #7958: ceph-fuse+fsx umount hang on leaked inode reference
I guess it's introduced by commit f1c7b4ef0 (client: pin Inode during readahead). Readahead raced with truncate. Objec... Zheng Yan
12:43 AM Bug #7968 (Won't Fix): ImportError occurred when run command 'ceph -v'
I. How to reproduce:
1. Clone the latest ceph master code from github
2. From my laptop, install the pre-require...
HouMing Wang

04/02/2014

10:00 PM CephFS Bug #7958: ceph-fuse+fsx umount hang on leaked inode reference
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-04-01_23:00:33-fs-firefly-distro-basic-plana/160589 Sage Weil
10:23 AM CephFS Bug #7958 (Resolved): ceph-fuse+fsx umount hang on leaked inode reference
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-03-31_23:00:36-fs-master-testing-basic-plana/157636
...
Sage Weil
09:56 PM Bug #7922: osd: multi-backfill reservation does not release on reject
logs on flab:/more2/t/7922
on osd.9, we see:...
Sage Weil
03:21 PM Bug #7922 (In Progress): osd: multi-backfill reservation does not release on reject
Sage Weil
01:19 PM Bug #7922: osd: multi-backfill reservation does not release on reject
All highly-verbose logfiles have been uploaded at Sage's request (4.3GiB): http://www.aarontc.com/logs/ceph-logs-chek... Aaron T
08:07 PM Bug #7965 (Resolved): osd: SEGV in handle_recovery_read_complete
Ian Colle
04:47 PM Bug #7965: osd: SEGV in handle_recovery_read_complete
reliably reproduced with lockdep enabled with ceph_test_rados_api_tier. appears to be due to multiple initialization... Sage Weil
04:46 PM Bug #7965 (Fix Under Review): osd: SEGV in handle_recovery_read_complete
Sage Weil
04:08 PM Bug #7965 (Resolved): osd: SEGV in handle_recovery_read_complete
... Sage Weil
07:34 PM Bug #4185: Python multiprocessing exhibiting odd behaviour with librados
Some notes for the next guy that comes across this issue:
You can use the multiprocessing managers to push all the...
Evan Felix
05:28 PM Bug #7967 (Resolved): finish_promote needs to handle the omap flag
Samuel Just
05:22 PM Bug #7689 (Duplicate): librados: ENOENT on ioctx create
Sage Weil
05:21 PM Bug #7689: librados: ENOENT on ioctx create
oh, i think this is a dup of #7736 Sage Weil
05:19 PM Bug #6429 (Can't reproduce): msg/Pipe.cc: 1029: FAILED assert(m)
Sage Weil
05:17 PM Bug #7776 (Resolved): client lockdep crash
Sage Weil
04:28 PM Bug #7776 (In Progress): client lockdep crash
Sage Weil
03:33 PM Bug #7776: client lockdep crash
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-04-01_02:30:19-rados-firefly-distro-basic-plana/158403... Sage Weil
03:32 PM Bug #7776: client lockdep crash
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-04-01_02:30:19-rados-firefly-distro-basic-plana/158409... Sage Weil
04:13 PM Bug #7915 (Resolved): ./include/interval_set.h: 385: FAILED assert(_size >= 0)
Samuel Just
04:04 PM Bug #7915 (Fix Under Review): ./include/interval_set.h: 385: FAILED assert(_size >= 0)
Sage Weil
11:55 AM Bug #7915 (Need More Info): ./include/interval_set.h: 385: FAILED assert(_size >= 0)
we've added more debug to master/firefly Sage Weil
04:12 PM CephFS Bug #7966 (Resolved): ceph-mds respawn doesn't always work
... Sage Weil
03:55 PM rgw Bug #7450: "radosgw-admin key create" ignores specified access key when subuser specified
@Yehuda: That code change looks good. I'll try to test by Monday and get back to you with confirmation of it working. Robin Johnson
07:31 AM rgw Bug #7450: "radosgw-admin key create" ignores specified access key when subuser specified
I pushed some work into wip-7450. It takes care of the default access key generation in the case of subusers, and som... Yehuda Sadeh
03:54 PM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
Compiled TestErasureCodeJerasure.cc with m=1 https://github.com/ceph/ceph/blob/master/src/test/erasure-code/TestErasu... Loïc Dachary
02:23 PM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
... Loïc Dachary
10:57 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
mira087 has the first 2 failures, mira089 has the earlier one. they all seem to crash in the same place:... Sage Weil
10:19 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
... Loïc Dachary
09:38 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
Ran test for both gf-complete and jerasure and they are valgrind clean ( https://bitbucket.org/jimplank/jerasure/pull... Loïc Dachary
09:34 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
/a/teuthology-2014-03-31_02:30:03-rados-master-testing-basic-plana/155875/teuthology.log is the only ec_pool test tha... Loïc Dachary
09:10 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
also happened on ubuntu@teuthology:/a/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154379 on commi... Sage Weil
03:31 PM Bug #7964 (Resolved): ceph_test_rados with snaps and caching stat errors
Samuel Just
03:21 PM Bug #6681 (Resolved): osd recovery hung
backported to dumpling Sage Weil
01:52 PM Bug #6681 (Pending Backport): osd recovery hung
/var/lib/teuthworker/archive/teuthology-2014-04-01_19:00:30-rados-dumpling-testing-basic-plana/159217 Samuel Just
03:15 PM Bug #7576 (In Progress): osd: large skew in pg epochs (dumpling)
wip-7576 Sage Weil
03:07 PM Bug #7917 (Resolved): "ERROR: test_rbd."* in upgrade:dumpling-x:parallel-firefly-distro-basic-vps
Just needed to cherry-pick the fix for #6368 to dumpling. Josh Durgin
01:38 PM RADOS Feature #7962 (New): allow the user to query inconsistent object status and specify repair strategy
Samuel Just
12:56 PM Messengers Bug #7888 (Resolved): msgr: keepalive is insufficient
Sage Weil
10:56 AM devops Feature #7960 (Resolved): backport rpm creation of /usr/lib64/qemu/librbd.so.1 symlink to dumpling
Backport #7293 to dumpling so it works seamlessly with rhev and rhel-osp as well. Josh Durgin
10:51 AM Bug #7939 (Resolved): pg role wrong for replicated pools
Sage Weil
10:42 AM Bug #7805 (Pending Backport): emperor can go active with < min_size non-incomplete peers since we...
Sage Weil
08:50 AM Bug #7805 (Fix Under Review): emperor can go active with < min_size non-incomplete peers since we...
Sage Weil
10:41 AM Bug #7907 (Resolved): osd: rollback to head didn't mark_unrollbackable
Sage Weil
09:45 AM Bug #7393: osd: scrub stat mismatch, got 9/9 objects, 0/0 clones, 9/4 dirty, 0/0 whiteouts, 26738...
... Anonymous
09:36 AM Bug #7949 (Duplicate): "s3tests.functional.test_s3"* errors in upgrade:dumpling-x:parallel-firefl...
Dup of 7935 Ian Colle
09:33 AM Bug #7951 (Duplicate): "test_rbd."* tests failed in upgrade:dumpling-x:parallel-firefly-distro-ba...
Ian Colle
09:07 AM Bug #7957 (Resolved): "[ERR] scrub mismatch" in upgrade:dumpling-emperor-x:parallel-firefly-testi...
Logs are in http://qa-proxy.ceph.com/teuthology/wusui-2014-04-02_01:54:27-upgrade:dumpling-emperor-x:parallel-firefly... Yuri Weinstein
07:53 AM RADOS Feature #7956 (New): osd: implement posix_fadvise/POSIX_FADV_DONTNEED to prevent data caching
Running Ceph OSDs on commodity hardware often means that servers are not used exclusively for OSDs and may have other... Dmitry Smirnov
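The system call named in this feature request can be sketched from user space as follows. This is a minimal illustration using Python's `os.posix_fadvise`, not a proposal for how the OSD should implement it; the function name `write_and_drop_cache` is hypothetical.

```python
import os
import tempfile


def write_and_drop_cache(path: str, payload: bytes) -> bool:
    """Write payload, flush it to disk, then hint that its pages need not stay cached.

    Returns True if the hint was issued, False if the platform lacks posix_fadvise.
    """
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()
        # Dirty pages cannot be dropped, so force them to stable storage first.
        os.fsync(f.fileno())
        if not hasattr(os, "posix_fadvise"):
            return False
        # offset=0, len=0 covers the file from the start to its end.
        os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)
        return True


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "object.bin")
    hinted = write_and_drop_cache(target, b"x" * 8192)
    with open(target, "rb") as f:
        readback = f.read()
```

The call is only advisory: the kernel may ignore it, and the data remains readable afterwards, as the read-back shows.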
07:24 AM Bug #7931 (Can't reproduce): setcrushmap crashing monitor
Greg Farnum
06:36 AM Bug #7931: setcrushmap crashing monitor
Hi Greg,
I don't think that was the issue, however it has since been working. The crushmap I've uploaded was the r...
Luis Periquito

04/01/2014

09:46 PM Bug #7952: After aio_read() completes a call to return_value() doesn't return bytes read
This might be the issue underlying #7822 Dan Mick
06:02 PM Bug #7952 (Resolved): After aio_read() completes a call to return_value() doesn't return bytes read

When using the rados_aio_read() interface the c.bl bufferlist stores the read data. So the code below will set rva...
David Zafman
09:38 PM Bug #7922: osd: multi-backfill reservation does not release on reject
Here's a partial log from osd.3 around the problematic time: http://www.aarontc.com/logs/ceph-osd.3.log.bz2
Please...
Aaron T
09:22 PM Bug #7922 (Need More Info): osd: multi-backfill reservation does not release on reject
it would help to see the log from the primary sending the dup backfill request. it is not entirely trivial to determ... Sage Weil
09:18 PM Bug #7922: osd: multi-backfill reservation does not release on reject
We got a duplicate backfill reservation request from osd.3:... Sage Weil
08:44 PM Linux kernel client Bug #7954 (Resolved): misdirected op
Test Run: teuthology-2014-03-31_23:02:09-kcephfs-master-testing-basic-plana
======================================...
Sage Weil
06:59 PM rbd Bug #6480: librbd crashed qemu-system-x86_64
I found another source of race conditions, and hopefully fixed them in the wip-6480-0.67.7 branch. I'm running tests ... Josh Durgin
04:40 PM Bug #7941: caching needs to be able to enforce snap context on flush even with pool snaps
Samuel Just
01:11 PM Bug #7941 (Resolved): caching needs to be able to enforce snap context on flush even with pool snaps
Samuel Just
04:40 PM Bug #7942: promote uses cloneid, but backend may have a different cloneid
Samuel Just
01:12 PM Bug #7942 (Resolved): promote uses cloneid, but backend may have a different cloneid
Samuel Just
04:20 PM Bug #7949: "s3tests.functional.test_s3"* errors in upgrade:dumpling-x:parallel-firefly-distro-bas...
A pretty good chance that this is #7935. Yehuda Sadeh
03:28 PM Bug #7949 (Duplicate): "s3tests.functional.test_s3"* errors in upgrade:dumpling-x:parallel-firefl...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-31_19:33:26-upgrade:dumpling-x:parallel-firefly-di... Yuri Weinstein
04:17 PM Bug #7950 (Duplicate): "FAIL: s3tests.functional.test_s3.test_multipart_upload_contents" in upgra...
Duplicate of #7935, failed due to a new test added. Yehuda Sadeh
03:46 PM Bug #7950: "FAIL: s3tests.functional.test_s3.test_multipart_upload_contents" in upgrade:dumpling-...
May be a duplicate of 7949 Yuri Weinstein
03:45 PM Bug #7950 (Duplicate): "FAIL: s3tests.functional.test_s3.test_multipart_upload_contents" in upgra...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-31_19:33:26-upgrade:dumpling-x:parallel-firefly-di... Yuri Weinstein
03:59 PM Bug #7805: emperor can go active with < min_size non-incomplete peers since we check acting size
wip-7805-emperor Sage Weil
03:59 PM Bug #7938 (Resolved): Coredump generated by upgrade test
this was a broken commit:b097a237e11b48c47d3fd5484f3449e683e95db0. reverted. Sage Weil
11:23 AM Bug #7938 (Resolved): Coredump generated by upgrade test
The following yaml consistently generates coredumps.... Anonymous
03:55 PM Bug #7951 (Duplicate): "test_rbd."* tests failed in upgrade:dumpling-x:parallel-firefly-distro-ba...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-31_19:33:26-upgrade:dumpling-x:parallel-firefly-di... Yuri Weinstein
03:29 PM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
The only difference is that ext4 is used in teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154491 an... Loïc Dachary
03:15 PM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
Running 30 more, sequentially. Loïc Dachary
03:09 PM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
Contrary to what I thought (reading the wrong number), the plana machines do have SSE3 and the jerasure plugin including SSE4 ... Loïc Dachary
02:59 PM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
ran through 20 iterations of the above yaml and did not reproduce. Sage Weil
09:35 AM Bug #7914 (In Progress): osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer lengt...
I will run the workload many times, hoping to reproduce the crash. Loïc Dachary
08:34 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
Another run with success, complete logs in attachment:run2.txt
Loïc Dachary
06:49 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
I don't see why not locking the erasure-code plugin registry could be a problem in practice but it is a problem in th... Loïc Dachary
06:26 AM Bug #7914 (Can't reproduce): osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer l...
The above teuthology workload ran on two plana machines and did not create a core dump. Loïc Dachary
06:05 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
... Loïc Dachary
05:34 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
Trying to trigger the problem again with:... Loïc Dachary
05:16 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
/a/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154491... Loïc Dachary
03:29 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
both errors show with "thrashers/pggrow.yaml" (one with btrfs, the other with ext4) Loïc Dachary
02:34 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
the /a/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154491 radosbench completed successfully
...
Loïc Dachary
02:18 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
Trying to reproduce the problem from the source tree on the current master with... Loïc Dachary
01:49 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
cpuinfo on plana17 (the machine on which the core dump occurred) shows:... Loïc Dachary
01:43 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
*-msse3* was not used although it should have and this was fixed "march 30th":https://github.com/ceph/ceph/commit/1c9... Loïc Dachary
01:39 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
"latest jerasure / gf-complete":https://github.com/ceph/ceph/commit/1c92453f748aea48084e57c9c721ee8080caeeb6 submodul... Loïc Dachary
01:23 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
... Loïc Dachary
12:57 AM Bug #7914 (In Progress): osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer lengt...
Loïc Dachary
02:46 PM Bug #7937: [ERR] deep-scrub 5.ds0 79d5820d/burnupi0838757-23/1f7//5 expected clone
Sage Weil
11:05 AM Bug #7937 (Fix Under Review): [ERR] deep-scrub 5.ds0 79d5820d/burnupi0838757-23/1f7//5 expected c...
Sage Weil
10:48 AM Bug #7937: [ERR] deep-scrub 5.ds0 79d5820d/burnupi0838757-23/1f7//5 expected clone
038e94b51e4945380c4ba771c88c953b6628d0f7 (wip-sam-testing) Samuel Just
10:46 AM Bug #7937 (Resolved): [ERR] deep-scrub 5.ds0 79d5820d/burnupi0838757-23/1f7//5 expected clone
2014-03-31 17:00:27,792.792 INFO:teuthology.task.internal:Removing archive directory...
2014-03-31 17:00:27,840.840 ...
Samuel Just
02:46 PM rgw Bug #7935 (Resolved): rgw: multipart upload failure with s3cmd
commit:ed5a5e075544662a12d94e472da55aeb2f0efe5d Josh Durgin
12:31 PM rgw Bug #7935: rgw: multipart upload failure with s3cmd
This issue happens when not all parts are the same size. Extended the s3 functional test to check this. Yehuda Sadeh
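Why unequal part sizes matter can be shown with a small sketch: mapping a byte offset to a part by dividing by a single part size only works when every part is that size. The code below is illustrative and is not RGW's implementation; `locate` and `locate_uniform` are hypothetical names.

```python
from bisect import bisect_right
from itertools import accumulate


def locate(part_sizes, offset):
    """Return (part_index, offset_within_part) for a byte offset in the assembled object."""
    if not part_sizes:
        raise ValueError("object has no parts")
    ends = list(accumulate(part_sizes))  # exclusive end offset of each part
    if not 0 <= offset < ends[-1]:
        raise ValueError("offset outside object")
    # The first part whose end lies beyond the offset contains it.
    idx = bisect_right(ends, offset)
    start = ends[idx - 1] if idx else 0
    return idx, offset - start


def locate_uniform(part_size, offset):
    """Naive lookup that assumes every part has the same size."""
    return divmod(offset, part_size)


parts = [5, 3, 7]                  # three parts of different sizes
correct = locate(parts, 8)         # offset 8 is the first byte of the third part
naive = locate_uniform(5, 8)       # the uniform assumption lands in the wrong part
```

With cumulative end offsets the lookup stays correct for any mix of part sizes, whereas the uniform version silently returns a wrong part index once sizes differ.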
10:55 AM rgw Bug #7935 (Fix Under Review): rgw: multipart upload failure with s3cmd
Yehuda Sadeh
10:05 AM rgw Bug #7935 (Resolved): rgw: multipart upload failure with s3cmd
From ceph-users mailing list:... Yehuda Sadeh
02:44 PM devops Feature #7171 (Fix Under Review): rbdmap should be part of ceph-common
https://github.com/ceph/ceph/pull/1581 Sage Weil
02:25 PM Bug #7907: osd: rollback to head didn't mark_unrollbackable
the original modify:... Sage Weil
02:14 PM devops Feature #7947 (Duplicate): Create separate ceph and ceph-common packages for EL6 and EL7 builds
No reason why rpm builds shouldn't be the same as for debs... Neil Levine
02:07 PM Fix #7919: mon: prevent clients with a read cap from reading the full keyring
My guess is that this behavior kicked in when we started matching client caps with the expected caps on a per-command... Joao Eduardo Luis
01:27 PM Fix #7919: mon: prevent clients with a read cap from reading the full keyring
That sounds like the best solution to me. Somebody on the mailing list reported that Dumpling is not exposing user da... Greg Farnum
01:25 PM Fix #7919: mon: prevent clients with a read cap from reading the full keyring
My vote is to make 'auth' special and require * to access it (or an explicit grant of auth rw or something). This is ... Sage Weil
01:23 PM Fix #7919: mon: prevent clients with a read cap from reading the full keyring
Here's the thing: obviously, allowing anyone with read permission to simply grab the whole keyring is not a good thin... Joao Eduardo Luis
01:10 PM Fix #7919 (In Progress): mon: prevent clients with a read cap from reading the full keyring
Okay, I was the one at fault here. I had failed to provide a keyring with the client's key, thus getting permission de... Joao Eduardo Luis
09:12 AM Fix #7919: mon: prevent clients with a read cap from reading the full keyring
Hum, I didn't validate the issue so maybe it was user error. Should probably poke him on the mailing list to check. *... Greg Farnum
05:26 AM Fix #7919: mon: prevent clients with a read cap from reading the full keyring
doesn't seem to happen on latest emperor. checking if anything changed in between 0.72.2 and latest emperor, or if I... Joao Eduardo Luis
01:26 PM Feature #7459 (Rejected): ceph-rest-api: sysvinit and upstart scripts
Sage Weil
01:06 PM Feature #7459 (Fix Under Review): ceph-rest-api: sysvinit and upstart scripts
Samuel Just
01:09 PM Feature #7940 (Resolved): add pool snaps to ceph_test_rados
Samuel Just
01:00 PM Bug #7939: pg role wrong for replicated pools
Samuel Just
12:16 PM Bug #7939 (Resolved): pg role wrong for replicated pools
Samuel Just
10:30 AM Bug #7892 (New): osd/ReplicatedPG.cc: 7881: FAILED assert((data_included.empty() && data.length()...
Sage Weil
10:23 AM Bug #7892 (Duplicate): osd/ReplicatedPG.cc: 7881: FAILED assert((data_included.empty() && data.le...
probably dups #7916 Sage Weil
10:29 AM Bug #7576: osd: large skew in pg epochs (dumpling)
We looked at this in standup today. There is a queue_null on every PG in OSD::consume_map(), so they should be gettin... Greg Farnum
10:24 AM Bug #7588 (Can't reproduce): OSD Seg fault in string assign ObjectOperation::C_ObjectOperation_co...
Sage Weil
10:23 AM Bug #7936 (Can't reproduce): "failed: rados" in upgrade:dumpling-x:parallel-firefly-distro-basic-...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-31_19:33:26-upgrade:dumpling-x:parallel-firefly-di... Yuri Weinstein
10:21 AM Bug #7916 (In Progress): ceph_test_rados got ENOENT on ec pool + thrashing
Sage Weil
09:43 AM Bug #7934 (Resolved): ceph_test_rados_watch_notify doesn't clean-up all pools it creates

$ ./rados lspools
data
metadata
rbd
$ ./ceph_test_rados_watch_notify
...
$ ./rados lspools
data
metadata
r...
David Zafman
09:41 AM Bug #7519 (Can't reproduce): upgrade: osd crash on cuttlefish -> v0.67.1 -> emperor
Sage Weil
09:35 AM devops Bug #7918: Mon hangs at start after upgrading to leveldb-1.12.0-3.fc18.x86_64 from the ceph-extra...
We've pulled those packages until we can figure what's causing this. Ian Colle
09:34 AM Bug #7926 (Resolved): "[ERR] scrub 45.0" in upgrade:dumpling-x:parallel-firefly-distro-basic-vps ...
commit:e672c52b4f8b945a516f2eec006e33665a08f045 Sage Weil
09:31 AM rbd Bug #6257 (Resolved): rbd: cp on sparse image allocates objects in dest
Sage Weil
09:21 AM rbd Feature #7921: Openstack: live migration for ephemeral volumes
Also need to ensure that code is backported to Icehouse-based distro (Ubuntu, RDO, RHEL-OSP) products. Neil Levine
09:19 AM rbd Feature #7921: Openstack: live migration for ephemeral volumes
Sage Weil
09:15 AM rbd Feature #7921: Openstack: live migration for ephemeral volumes
Neil Levine
09:21 AM rbd Feature #7920: Openstack: cloning for rbd ephemeral disks
Also need to ensure that code is backported to Icehouse-based distro (Ubuntu, RDO, RHEL-OSP) products. Neil Levine
09:18 AM rbd Feature #7920: Openstack: cloning for rbd ephemeral disks
Sage Weil
09:14 AM rbd Feature #7920: Openstack: cloning for rbd ephemeral disks
Neil Levine
09:15 AM rbd Feature #7924: Openstack: make long-running operations async in cinder
Neil Levine
09:15 AM rbd Feature #7923: Openstack: backup from in-use volume instead of from detached volume
Neil Levine
09:14 AM Bug #7931: setcrushmap crashing monitor
You should do this with "debug mon = 20" set, but it appears to be crashing because your crush map is somehow invalid... Greg Farnum
08:08 AM Bug #7931 (Can't reproduce): setcrushmap crashing monitor
Following the guides I've created a new crushmap. When I submit this new crushmap the monitor crashes with some infor... Luis Periquito
09:14 AM rbd Feature #7895: krbd: test cloning, discard, plus regular I/O via fsx
Neil Levine
09:14 AM Linux kernel client Feature #190: krbd: DISCARD support
Neil Levine
09:10 AM rbd Feature #7455 (Resolved): krbd,kcephfs: support primary-affinity
Sage Weil
09:05 AM rgw Feature #7932 (Resolved): Create design for object versioning, including subtasks and estimates
Ian Colle
02:09 AM Feature #7928 (Rejected): erasure-code : no SSE3 specific code
Is it worth compiling with *-msse3* and detecting it at runtime when no HAVE_SSE3 code is being compiled conditionall... Loïc Dachary

03/31/2014

09:44 PM devops Bug #7879 (Resolved): sentry-db is down
Migration complete! Zack Cerza
09:38 PM Bug #7927 (Duplicate): Removed pools still show up in "ceph pg dump" output
Duplicate #7912
This should be fixed as of commit 70d2e1353ecb9d31a394fdac333dbb0de93339d3.
Greg Farnum
09:32 PM Bug #7927 (Duplicate): Removed pools still show up in "ceph pg dump" output

Creating new pools 3 and 4 then removing them leaves them behind in ceph pg dump output. Minutes later it is still...
David Zafman
04:16 PM rbd Bug #6257 (Fix Under Review): rbd: cp on sparse image allocates objects in dest
Flatten ignores empty objects since commit:bfa106694dc4db97f58c623eafc3c2d0f9a8bff1, which is in dumpling and emperor... Josh Durgin
03:55 PM Bug #7917: "ERROR: test_rbd."* in upgrade:dumpling-x:parallel-firefly-distro-basic-vps
Looks like the same issue on
os_type: rhel
os_version: '6.5'
Logs are in http://qa-proxy.ceph.com/teuthology/teuth...
Yuri Weinstein
09:15 AM Bug #7917 (Resolved): "ERROR: test_rbd."* in upgrade:dumpling-x:parallel-firefly-distro-basic-vps
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-29_19:33:24-upgrade:dumpling-x:parallel-firefly-di... Yuri Weinstein
03:19 PM Bug #7926 (Resolved): "[ERR] scrub 45.0" in upgrade:dumpling-x:parallel-firefly-distro-basic-vps ...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-29_19:33:24-upgrade:dumpling-x:parallel-firefly-di... Yuri Weinstein
02:59 PM Bug #7912 (Resolved): Wrong pool count when using "ceph -s" and "ceph -w"
Sage Weil
05:36 AM Bug #7912 (Resolved): Wrong pool count when using "ceph -s" and "ceph -w"
Hi,
running ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)
I created a bunch of pools and delet...
Volker Voigt
02:58 PM devops Feature #7925 (Rejected): Feature: create new download.ceph.com site
See https://docs.google.com/a/inktank.com/document/d/1K8pUEZpN5-t1wd0t81MXgooB1Xz_Cgi4jkvo25dysiU/edit Neil Levine
02:57 PM Messengers Bug #7888 (Pending Backport): msgr: keepalive is insufficient
Sage Weil
02:21 PM Bug #7922: osd: multi-backfill reservation does not release on reject
After the additional logging directives were added and osd.0 was restarted, it quickly crashed again. Here is the log... Aaron T
01:55 PM Bug #7922 (Resolved): osd: multi-backfill reservation does not release on reject
Several OSDs are crashing quite frequently with this error, at least on version 0.77 and 0.78 (possibly earlier as we... Aaron T
02:15 PM rbd Bug #7577 (Resolved): rbd info displays extra random char in block prefix
Sage Weil
02:03 PM Bug #7915: ./include/interval_set.h: 385: FAILED assert(_size >= 0)
Looking at the first failure:
- PGPool::cached_removed_snaps is empty
- info.purged_snaps = 1~1
hence the fail...
Sage Weil
09:10 AM Bug #7915: ./include/interval_set.h: 385: FAILED assert(_size >= 0)
ubuntu@teuthology:/a/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154663 Sage Weil
09:07 AM Bug #7915: ./include/interval_set.h: 385: FAILED assert(_size >= 0)
ubuntu@teuthology:/a/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154509 Sage Weil
09:06 AM Bug #7915 (Duplicate): ./include/interval_set.h: 385: FAILED assert(_size >= 0)
ubuntu@teuthology:/a/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154635... Sage Weil
01:56 PM rbd Feature #7924 (Closed): Openstack: make long-running operations async in cinder
- deleting an image takes a long time, and blocks cinder-volume
from doing anything else during that time
...
Neil Levine
01:55 PM rbd Feature #7923 (Resolved): Openstack: backup from in-use volume instead of from detached volume
Currently, you can take a backup of a detached volume and send to a Swift/RBD backend.
We want to be able to take ba...
Neil Levine
01:55 PM Bug #7804 (Duplicate): backfill racing with a hitset object remove
Samuel Just
01:55 PM Bug #7893 (Duplicate): osd/ReplicatedPG.cc: 10190: FAILED assert(0 == "erroneously present object")
Samuel Just
09:16 AM Bug #7893: osd/ReplicatedPG.cc: 10190: FAILED assert(0 == "erroneously present object")
ubuntu@teuthology:/a/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154529 Sage Weil
01:55 PM Bug #7894 (Duplicate): osd: missing hitset object in cluster log
Samuel Just
01:51 PM rbd Feature #7921 (Resolved): Openstack: live migration for ephemeral volumes
- live migration for rbd ephemeral disks
- live migration can easily be truly live with shared storage
- easier mai...
Neil Levine
01:50 PM rbd Feature #7920 (Resolved): Openstack: cloning for rbd ephemeral disks
As per https://blueprints.launchpad.net/nova/+spec/rbd-clone-image-handler Neil Levine
12:42 PM Bug #7849 (Resolved): ceph-conf create empty log files
commit:fc1a424e837bee139726eec333c9efd65e2abb6a Josh Durgin
09:39 AM Bug #7849: ceph-conf create empty log files
Josh - please review the wip branch Ian Colle
11:58 AM Linux kernel client Feature #3837: krbd: support format 2 striping
Neil Levine
10:54 AM Bug #7875: osd: pg_pool_t hitset fields incompat
If you have a mix of OSDs tracking hitsets in the cluster, your data tracking isn't going to make any sense...what di... Greg Farnum
10:39 AM Fix #7919 (Resolved): mon: prevent clients with a read cap from reading the full keyring
From the mailing list thread "[ceph-users] Security Hole?"... Greg Farnum
09:45 AM Bug #7626 (Closed): After updating ceph from 0.75 to 0.77 one of the three monitors can't start
as far as I can tell, Sage is right. Nothing else seems off. Closing the ticket. Joao Eduardo Luis
02:11 AM Bug #7626: After updating ceph from 0.75 to 0.77 one of the three monitors can't start
Jasper Siero wrote:
> I updated all nodes to -0.88- 0.78 and removed the monitor and created a new one ;-)
Jasper Siero
02:10 AM Bug #7626: After updating ceph from 0.75 to 0.77 one of the three monitors can't start
I updated all nodes to 0.88 and removed the monitor and created a new one Jasper Siero
09:36 AM devops Bug #7918 (Won't Fix): Mon hangs at start after upgrading to leveldb-1.12.0-3.fc18.x86_64 from th...
I had a working 0.72.2 installation using the standard Fedora 18 RPMs.
I upgraded leveldb from 1.7.0-4.fc18 to 1.1...
Jens Kristian Søgaard
09:33 AM Bug #7908 (Resolved): "osd.3 ... [ERR] scrub" in upgrade:dumpling-x:parallel-firefly-distro-basic...
Sage Weil
09:09 AM Bug #7914: osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length from 4096 to...
ubuntu@teuthology:/a/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154491 Sage Weil
09:03 AM Bug #7914 (Resolved): osd: SEGV on ec write, ErasureCodeJerasure: encode adjusted buffer length f...
ubuntu@teuthology:/a/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154379... Sage Weil
09:09 AM Bug #7916 (Can't reproduce): ceph_test_rados got ENOENT on ec pool + thrashing
ubuntu@teuthology:/a/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154424 Sage Weil
08:58 AM Bug #7776: client lockdep crash
ubuntu@teuthology:/a/teuthology-2014-03-30_02:30:11-rados-master-testing-basic-plana/154676 Sage Weil

03/30/2014

09:17 AM Bug #7909 (Resolved): warnings during gf-complete/jerasure build
Sage Weil
04:33 AM Bug #7909: warnings during gf-complete/jerasure build
"work in progress":https://github.com/ceph/ceph/pull/1568 cherry pick of the above pending pull requests Loïc Dachary
01:25 AM Bug #7909 (In Progress): warnings during gf-complete/jerasure build
* https://bitbucket.org/jimplank/gf-complete/pull-request/12/fix-void-arithmetic-compilation-warning
* https://bitbu...
Loïc Dachary

03/29/2014

09:52 PM Bug #7849 (Fix Under Review): ceph-conf create empty log files
Sage Weil
09:39 PM Bug #7909 (Resolved): warnings during gf-complete/jerasure build
http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-tarball-saucy-amd64-basic/log.cgi?log=c3292e48483d861148322590ea1f05... Sage Weil
09:12 PM Bug #7908 (Resolved): "osd.3 ... [ERR] scrub" in upgrade:dumpling-x:parallel-firefly-distro-basic...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-29_19:33:24-upgrade:dumpling-x:parallel-firefly-di... Yuri Weinstein
04:59 PM Bug #7907 (Resolved): osd: rollback to head didn't mark_unrollbackable
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-28_22:35:22-upgrade:dumpling-x:stress-split-firefl... Yuri Weinstein
04:18 PM Bug #7744: osd: assert(last_e.version.version < e.version.version)
Two of my osds are crashing with the same signature:
osd/PGLog.cc: 672: FAILED assert(last_e.version.version < e.v...
Jake Young
09:19 AM CephFS Bug #7880 (Resolved): multimds: directory gets rsynced twice
Sage Weil
05:28 AM Subtask #7548 (Resolved): Basic docs for Erasure Coding
Loïc Dachary

03/28/2014

07:11 PM rbd Bug #6480 (In Progress): librbd crashed qemu-system-x86_64
Mike provided a core dump in which structures appear to be normal, though it's hard to tell with the ceph::buffer::ra... Josh Durgin
06:31 PM Bug #7849: ceph-conf create empty log files
Socket part was introduced later -- perhaps only socket part shall be reversed?
Log-related change seems safe...
Dmitry Smirnov
06:05 PM Bug #7849 (In Progress): ceph-conf create empty log files
oops, the previous fix breaks ceph.py, which does ceph-conf --show-config-value to get admin_socket. Sage Weil
05:23 PM Bug #7626: After updating ceph from 0.75 to 0.77 one of the three monitors can't start
Yes, I'm pretty sure it is.. this bug affected 0.77 and was fixed for 0.78. If I remember correctly, the full osdmap... Sage Weil
05:20 PM Bug #7626: After updating ceph from 0.75 to 0.77 one of the three monitors can't start
This sounds like it could be commit:14ea8157eb2883b9f53c234044fe002153212ef8 Sage Weil
10:18 AM Bug #7626: After updating ceph from 0.75 to 0.77 one of the three monitors can't start
The store attached to the ticket shows the latest 7 full osdmaps as being unable to be decoded, which would explain t... Joao Eduardo Luis
05:18 PM Bug #7902 (Pending Backport): osd/PG.cc: 6803: FAILED assert(!pg->actingbackfill.empty())
Sage Weil
12:54 PM Bug #7902 (Resolved): osd/PG.cc: 6803: FAILED assert(!pg->actingbackfill.empty())
this is a split dumpling/firefly cluster
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-03-27_22:35:2...
Sage Weil
05:17 PM Bug #7881 (Resolved): osd/PGLog.cc: 430: FAILED assert(to != olog.log.end() || (olog.head == info...
Sage Weil
03:37 PM Bug #7881: osd/PGLog.cc: 430: FAILED assert(to != olog.log.end() || (olog.head == info.last_update))
Samuel Just
03:16 PM Bug #7881 (In Progress): osd/PGLog.cc: 430: FAILED assert(to != olog.log.end() || (olog.head == i...
Samuel Just
09:52 AM Bug #7881: osd/PGLog.cc: 430: FAILED assert(to != olog.log.end() || (olog.head == info.last_update))
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-03-27_22:35:20-upgrade:dumpling-x:stress-split-firefly... Sage Weil
04:58 PM rgw Bug #7903 (Resolved): radosgw-admin: failing to set default region
Sage Weil
01:27 PM rgw Bug #7903 (Fix Under Review): radosgw-admin: failing to set default region
Yehuda Sadeh
12:59 PM rgw Bug #7903 (Resolved): radosgw-admin: failing to set default region
http://pulpito.ceph.com/teuthology-2014-03-27_23:00:17-rgw-firefly-distro-basic-plana/149617/ Yehuda Sadeh
04:00 PM rgw Bug #7837 (Resolved): s3tests test_object_metadata_replaced_on_put fails on an erasure coded pool
Sage Weil
03:56 PM CephFS Bug #7867 (Resolved): client/Client.cc: 2087: FAILED assert(!unclean)
Sage Weil
03:47 PM CephFS Bug #7867 (Pending Backport): client/Client.cc: 2087: FAILED assert(!unclean)
Sage Weil
03:52 PM Bug #7906 (Duplicate): "adjust-ulimits ... --rgw-region zero'''" filed in rgw-firefly-distro-basi...
Duplicates #7903 Yehuda Sadeh
03:36 PM Bug #7906 (Duplicate): "adjust-ulimits ... --rgw-region zero'''" filed in rgw-firefly-distro-basi...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-27_23:00:17-rgw-firefly-distro-basic-plana/149644/... Yuri Weinstein
02:07 PM Bug #7874 (Resolved): clone_range from clone on recovery fails if clone has been evicted
Sage Weil
02:05 PM Bug #7828 (Resolved): osd/ReplicatedPG.cc: 4984: FAILED assert(ctx->new_obs.exists == ctx->new_sn...
Sage Weil
02:05 PM Bug #7835 (Resolved): make_writeable needs to fill in ssc on new clone
Sage Weil
01:25 PM Bug #7805 (Resolved): emperor can go active with < min_size non-incomplete peers since we check a...
Sage Weil
01:21 PM Bug #7904 (Resolved): osd/ReplicatedPG.cc: 10661: FAILED assert(is_active())
We need to dump agent_state even on primary->primary
-69> 2014-03-28 13:07:10.884747 7fdc1660b700 15 journal pr...
Samuel Just
10:43 AM Fix #7890: erasure-code: last stripe is not truncated
At present, this is by design. Samuel Just
01:05 AM Fix #7890 (New): erasure-code: last stripe is not truncated
When encoding a 10-byte object with osd_pool_erasure_code_stripe_width = 2048 and k=2 + m=1, the last stripe should b... Loïc Dachary
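The padding described in Fix #7890 can be worked through numerically. The sketch below is a simplified model, not Ceph's implementation: it assumes the object is padded up to a whole number of stripes, that each stripe is split into k equal data chunks, and that m coding chunks of the same size are added per stripe. The function name and the shape of its return value are hypothetical.

```python
def padded_layout(object_size, stripe_width, k, m):
    """Return (padded_size, chunk_size, total_stored) for an untruncated last stripe.

    Simplified model (an assumption, not Ceph code): every stripe is stored
    in full, even if the object only fills part of the last one.
    """
    if stripe_width % k != 0:
        raise ValueError("stripe_width must be a multiple of k")
    stripes = -(-object_size // stripe_width)  # ceiling division
    padded_size = stripes * stripe_width       # logical size after padding
    chunk_size = stripe_width // k             # bytes per chunk per stripe
    total_stored = stripes * chunk_size * (k + m)  # data plus coding chunks
    return padded_size, chunk_size, total_stored


# The case from the report: 10 bytes, stripe width 2048, k=2, m=1.
print(padded_layout(10, 2048, 2, 1))
```

Under these assumptions the 10-byte object occupies one full 2048-byte stripe, which is the behaviour the report asks about and the reply calls intentional.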
10:05 AM rbd Feature #7895 (Resolved): krbd: test cloning, discard, plus regular I/O via fsx
This could be another fork of src/test/librbd/fsx.c in ceph.git, or modifications to it to abstract out the I/O more ... Josh Durgin
09:21 AM Bug #7894 (Duplicate): osd: missing hitset object in cluster log
failure_reason: '"2014-03-27 16:02:47.849800 osd.4 10.214.131.17:6800/13622 33 : [ERR]
4.0 shard (4,255) missing 0...
Sage Weil
09:20 AM Bug #7893 (Duplicate): osd/ReplicatedPG.cc: 10190: FAILED assert(0 == "erroneously present object")
... Sage Weil
09:19 AM Bug #7892 (Duplicate): osd/ReplicatedPG.cc: 7881: FAILED assert((data_included.empty() && data.le...
... Sage Weil
09:18 AM Bug #7891 (Resolved): osd: leaked pg refs on shutdown
... Sage Weil
08:53 AM Bug #5884: negative num_objects_degraded in pool stats
Oops, replace 1.3 with 1.30 in previous message.... John Spray
08:50 AM Bug #5884: negative num_objects_degraded in pool stats

Seen on a cluster that's been running for the past 2 weeks on the firefly branch.
Potentially noteworthy things ...
John Spray
08:25 AM CephFS Bug #7780 (Resolved): When full flag is set, even MDS writes are blocked
Fix was merged at c647a03fffb2e1e997dbdb0ff128eeb6efc47deb John Spray

03/27/2014

10:26 PM CephFS Bug #7880: multimds: directory gets rsynced twice
Zheng Yan
06:54 AM CephFS Bug #7880 (Resolved): multimds: directory gets rsynced twice
probably the mtime doesn't get set properly the first time?... Sage Weil
09:31 PM Messengers Bug #7888: msgr: keepalive is insufficient
wip-7888 handles this for MonClient. We can do the same with Objecter, but this is less critical because we will fin... Sage Weil
09:24 PM Messengers Bug #7888 (Fix Under Review): msgr: keepalive is insufficient
Sage Weil
06:01 PM Messengers Bug #7888 (In Progress): msgr: keepalive is insufficient
Sage Weil
04:44 PM Messengers Bug #7888 (Resolved): msgr: keepalive is insufficient
the current keepalive behavior relies on writes triggering a tcp timeout/error, which does not actually happen in many... Sage Weil
05:58 PM Bug #7824 (Resolved): LibRadosList.ListObjectsNS failure
ugh, the test wasn't restarting after the cuttlefish->dumpling upgrade step.
ceph-qa-suite.git commit 5651ee813170...
Sage Weil
05:40 PM devops Bug #7889: IPv6 support with ceph-deploy
socket.gethostbyname supports only ipv4 and should be replaced with socket.getaddrinfo
next the inet_aton is also ...
Miha Zidar
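The replacement suggested in Bug #7889 could look roughly like the following. This is a minimal sketch, not the actual ceph-deploy patch: the helper names are hypothetical, and `is_ip_literal` (built on `socket.inet_pton`) is an illustrative addition, since the rest of the comment is truncated.

```python
import socket


def resolve_host(hostname):
    """Return the sorted unique addresses (IPv4 and IPv6) for hostname.

    Raises ValueError if the name cannot be resolved at all.
    """
    try:
        # AF_UNSPEC asks for both address families; socket.gethostbyname
        # only ever returns IPv4 results.
        infos = socket.getaddrinfo(
            hostname, None, socket.AF_UNSPEC, socket.SOCK_STREAM
        )
    except socket.gaierror as exc:
        raise ValueError("hostname: %s is not resolvable" % hostname) from exc
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both families.
    return sorted({info[4][0] for info in infos})


def is_ip_literal(text):
    """Return True if text parses as an IPv4 or IPv6 address literal."""
    for family in (socket.AF_INET, socket.AF_INET6):
        try:
            socket.inet_pton(family, text)
            return True
        except OSError:
            continue
    return False
```

A validator could call `is_ip_literal` first and fall back to `resolve_host` for names, so that IPv6-only hosts are no longer rejected as unresolvable.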
05:36 PM devops Bug #7889 (Resolved): IPv6 support with ceph-deploy
In ceph-deploy/util/arg_validators.py the hostname function fails with "hostname: X is not resolvable" when using IPv6 Miha Zidar
05:14 PM Bug #7885 (Resolved): scrub must not take offence if evicted/unpromoted clones are missing on a c...
Samuel Just
11:54 AM Bug #7885: scrub must not take offence if evicted/unpromoted clones are missing on a cache pool
Samuel Just
11:41 AM Bug #7885 (Resolved): scrub must not take offence if evicted/unpromoted clones are missing on a c...
Samuel Just
04:41 PM Bug #7887 (Resolved): W: shlib-with-executable-stack
Lintian produced the following warnings after building Debian packages of ceph-0.78:... Dmitry Smirnov
04:29 PM Bug #7875 (Resolved): osd: pg_pool_t hitset fields incompat
Sage Weil
10:40 AM Bug #7875 (In Progress): osd: pg_pool_t hitset fields incompat
Sage Weil
02:48 PM Documentation #7886 (New): What's the policy on URL stability for public documentation?
We're trying to decide whether to include links to documentation on ceph.com within the UI of the Calamari product. W... Yan-Fa Li
01:24 PM Bug #7805 (Fix Under Review): emperor can go active with < min_size non-incomplete peers since we...
Samuel Just
01:24 PM Bug #7858: agent with snaps ceph_test_rados error
Samuel Just
12:37 PM Bug #7849 (Resolved): ceph-conf create empty log files
commit:acc31e75a3e7115c00f9980609948455e3b2d49e Josh Durgin
11:36 AM Bug #7849 (Fix Under Review): ceph-conf create empty log files
Sage Weil
11:16 AM Bug #7849: ceph-conf create empty log files
This should be just a matter of passing the right flags to common_init or global_init Sage Weil
11:34 AM Feature #7884 (New): investigate having the messenger (or dispatch q?) in the osd limit the numbe...
Samuel Just
11:14 AM rgw Bug #7876 (Resolved): rgw: > on char* in rgw_rest_user
Sage Weil
11:13 AM rgw Bug #7876 (Fix Under Review): rgw: > on char* in rgw_rest_user
Yehuda Sadeh
11:13 AM Bug #7826 (Resolved): osd: illegal instruction in jerasure
Sage Weil
03:23 AM Bug #7826 (Fix Under Review): osd: illegal instruction in jerasure
Loïc Dachary
10:40 AM CephFS Bug #7867: client/Client.cc: 2087: FAILED assert(!unclean)
Sage Weil
10:37 AM Fix #7560 (Closed): mon: add compat set feature to mark an upgraded pg format in order to disallo...
The feature was introduced for Dumpling. The only version that does not support it is Cuttlefish, and it will assert... Joao Eduardo Luis
10:20 AM Bug #7860 (Resolved): LibRadosTwoPoolsPP.PromoteSnap failed at line 311
Sage Weil
08:54 AM Bug #7881 (Resolved): osd/PGLog.cc: 430: FAILED assert(to != olog.log.end() || (olog.head == info...
mixed dumpling/firefly cluster:... Sage Weil
08:54 AM rbd Bug #6628: krbd: BUG during ceph_osdc_stop() sometimes when rbd_add() fails
Ian Colle
07:18 AM devops Bug #7879: sentry-db is down
Let's get these services moved over today. Zack Cerza
06:48 AM devops Bug #7879 (Resolved): sentry-db is down
Zack Cerza

03/26/2014

09:55 PM Bug #7824 (In Progress): LibRadosList.ListObjectsNS failure
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-03-25_19:33:06-upgrade:dumpling-x:parallel-firefly---b... Sage Weil
07:38 PM Feature #7437 (In Progress): EC: add adapt unittest teuthology task and add to nightly
David Zafman
07:00 PM CephFS Bug #5787 (Duplicate): client/Client.cc: 2081: FAILED assert(!unclean) in put_inode
Zheng Yan
07:00 PM CephFS Bug #5787: client/Client.cc: 2081: FAILED assert(!unclean) in put_inode
dup #7867 Zheng Yan
06:09 PM rgw Bug #7837: s3tests test_object_metadata_replaced_on_put fails on an erasure coded pool
Samuel Just
05:44 PM rgw Bug #7837: s3tests test_object_metadata_replaced_on_put fails on an erasure coded pool
wip-7837 Samuel Just
05:28 PM rgw Bug #7837: s3tests test_object_metadata_replaced_on_put fails on an erasure coded pool
looking Samuel Just
06:05 PM rgw Bug #7876 (Resolved): rgw: > on char* in rgw_rest_user
... Sage Weil
05:57 PM CephFS Bug #7867: client/Client.cc: 2087: FAILED assert(!unclean)
pushed a simpler version, wip-7867-b, that just pins the Inode* for the duration of the readahead. This avoids a pos... Sage Weil
09:13 AM CephFS Bug #7867 (Resolved): client/Client.cc: 2087: FAILED assert(!unclean)
here is the last bit of log, starting with the incomplete readahead that caused the crash... Sage Weil
05:54 PM Bug #7875 (Resolved): osd: pg_pool_t hitset fields incompat
If the hitset fields get used in pg_pool_t the encoding is marked as incompatible. This breaks old clients that don'... Sage Weil
05:27 PM Bug #7874: clone_range from clone on recovery fails if clone has been evicted
wip-7874 disables clone subsets for cache pools. Testing Samuel Just
05:26 PM Bug #7874: clone_range from clone on recovery fails if clone has been evicted
Samuel Just
05:13 PM Bug #7874 (Resolved): clone_range from clone on recovery fails if clone has been evicted
Options
1) don't allow clone_range from clone on recovery if cache pool
2) detect when the clone has been evicted a...
Samuel Just
05:26 PM Bug #7828: osd/ReplicatedPG.cc: 4984: FAILED assert(ctx->new_obs.exists == ctx->new_snapset.head_...
Samuel Just
05:25 PM Feature #7831: OSD: track objects with omap entries and don't count toward caps
Samuel Just
05:18 PM Bug #6910 (Resolved): don't query empty osds for unfound
Sage Weil
05:15 PM Feature #7871 (Resolved): ceph_test_rados: allow no-omap to be specified seperately from ec-pool
Sage Weil
11:43 AM Feature #7871 (Resolved): ceph_test_rados: allow no-omap to be specified seperately from ec-pool
Samuel Just
05:13 PM Bug #7870 (Resolved): only return ENOTSUP on omap write ops for EC pools
Sage Weil
02:04 PM Bug #7870 (Fix Under Review): only return ENOTSUP on omap write ops for EC pools
Samuel Just
11:39 AM Bug #7870 (Resolved): only return ENOTSUP on omap write ops for EC pools
This way redirected ops on objects without omap entries will work. Samuel Just
04:59 PM rgw Feature #7589 (Resolved): rgw: configurable chunk size
commit:98654092fc5a18ef542d294d5696cac86d96229f Josh Durgin
04:10 PM Bug #7860: LibRadosTwoPoolsPP.PromoteSnap failed at line 311
David Zafman
03:33 PM Bug #7826: osd: illegal instruction in jerasure
For the record, the illegal instruction was on an AVX instruction. It was part of the flags being set at compile time... Loïc Dachary
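A check along these lines can show whether a host advertises the instruction sets a binary was built for. It is only a sketch under stated assumptions: the function names are hypothetical, it relies on the Linux x86 `/proc/cpuinfo` text format, and it is not how Ceph or jerasure detects CPU features.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # The first "flags" line is enough; all cores normally match.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text, required):
    """Return the required features the CPU does not advertise, sorted."""
    return sorted(set(required) - cpu_flags(cpuinfo_text))


# Illustrative input only; on a real host, read /proc/cpuinfo instead.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 pni ssse3\n"
print(missing_features(sample, ["sse2", "avx"]))
```

On a real machine the text would come from `open("/proc/cpuinfo").read()`. Note that Linux lists SSE3 under the name `pni`, so a check for `sse3` needs to look for that flag.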
09:53 AM Bug #7826: osd: illegal instruction in jerasure
Loïc Dachary
03:32 AM Bug #7826 (In Progress): osd: illegal instruction in jerasure
"work in progress":https://github.com/ceph/ceph/pull/1534 Loïc Dachary
02:17 PM Feature #7873 (Resolved): pg query: dump peer_info, peer_missing in all states
This debug info is useful for recovering from incomplete situations. Samuel Just
01:21 PM Bug #7872 (Duplicate): PG: all_unfound_are_queried_or_lost must skip dne replicas
Samuel Just
01:21 PM Bug #7872: PG: all_unfound_are_queried_or_lost must skip dne replicas
actually, 6910 Samuel Just
01:17 PM Bug #7872 (Duplicate): PG: all_unfound_are_queried_or_lost must skip dne replicas
Samuel Just
11:48 AM Bug #7823 (Resolved): osd: copy-get from ec pool returns wrong size
Samuel Just
11:22 AM Cleanup #7869 (Resolved): arch: use cpuid.h when possible
Milosz Tanski <milosz@adfin.com>:
Instead of doing cpuid manually you can use builtins provided in gcc
(and in...
Loïc Dachary
09:47 AM Bug #7868 (Can't reproduce): "failed to recover before timeout expired" in powercycle-firefly---b...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-23_23:55:02-powercycle-firefly---basic-plana/14419... Yuri Weinstein
01:32 AM rbd Bug #7282: Unresponsive rbd-backed Qemu domain causes libvirtd to stall on all connections
Thanks! Florian Haas
12:15 AM devops Bug #6592: 3.8 kernel + /dev/cciss/c0d1 + precise : fail to show in /dev/disk/by-partuuid
We did not get to the bottom of this and the hardware is still available. It's cold but not dead ;-) Loïc Dachary

03/25/2014

11:51 PM RADOS Cleanup #7865 (New): erasure-code: fine grain SSE support
make use of "ifunc":http://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Function-Attributes.html#index-g_t_0040code_007bifunc... Loïc Dachary
05:48 PM rgw Bug #7676 (Resolved): rgw: multi-part upload incompatible with EC backend
commit:7989cbd418ed8d51348851a39ffa84ac2224f4fe Josh Durgin
05:17 PM RADOS Feature #7770: "ceph osd crush set" should handle ingestion of non-compiled crush maps
Stating the obvious: If a user is going to manually modify the crush map, there is always a chance of botching it and... Brian Andrus
01:11 PM RADOS Feature #7770: "ceph osd crush set" should handle ingestion of non-compiled crush maps
We probably want a better idea of what they are trying to accomplish. Samuel Just
05:03 PM Tasks #7864 (New): please clarify copyright and the license
Please confirm the licensing information for the following files:
Files: src/test/librbd/fsx.c
Copyright: ??1991,...
Dmitry Smirnov
04:44 PM rgw Bug #7818 (Resolved): syncdaemon: error geting op state: list index out of range
fixed by commit:812e48a14837cc0173a15aafa6fa563bf9fdd6d4 in teuthology.git Josh Durgin
02:44 PM rgw Bug #7818: syncdaemon: error geting op state: list index out of range
this actually looks like a test issue after all - trying to run some of the data sync tests across regions Josh Durgin
04:43 PM rgw Bug #7820 (Resolved): "AssertionError: 404 != 301" in rgw-firefly-distro-basic-plana suite
fixed by commit:0cb00b1fb9bf6c59d787f340dd50ae16b1473f82 in teuthology.git Josh Durgin
04:23 PM rgw Bug #7099 (Pending Backport): Strange Comportments with media files
This one was also already merged, in commit:0427f61544529ab4e0792b6afbb23379fe722de1 Josh Durgin
04:11 PM rgw Bug #7271 (Resolved): container create via swift doesn't register ACL
This was merged a while ago as commit:bf38bfb2e6511217eaa720811af826fe5498a461 Josh Durgin
04:10 PM rbd Bug #6480: librbd crashed qemu-system-x86_64
One more:... Mike Dawson
09:05 AM rbd Bug #6480: librbd crashed qemu-system-x86_64
Ian Colle
04:03 PM Feature #7862 (Resolved): allow backfill/recovery while below min_size
Samuel Just
04:03 PM Feature #7861 (Closed): osd: allow writes on degraded objects
Samuel Just
03:57 PM Bug #7860 (Resolved): LibRadosTwoPoolsPP.PromoteSnap failed at line 311

This is even before setting up the tiering doing the following sequence:
Create/write "hi there" to object "baz"...
David Zafman
03:45 PM Bug #7828: osd/ReplicatedPG.cc: 4984: FAILED assert(ctx->new_obs.exists == ctx->new_snapset.head_...
nvm Samuel Just
02:31 PM Bug #7828: osd/ReplicatedPG.cc: 4984: FAILED assert(ctx->new_obs.exists == ctx->new_snapset.head_...
finish_ctx incorrectly sets head_exists for a clone evict operation. Samuel Just
03:36 PM Bug #7858 (Resolved): agent with snaps ceph_test_rados error
2014-03-25 15:04:11,699.699 INFO:teuthology.run:Summary data:
{duration: 474.998064994812, failure_reason: 'Command ...
Samuel Just
03:22 PM Bug #7824: LibRadosList.ListObjectsNS failure
And it looks like the majority of upgrade:dumpling-x:parallel-firefly tests failed like this Yuri Weinstein
02:59 PM Bug #7824: LibRadosList.ListObjectsNS failure
Still an issue. If you need fresh logs, log in to the yw box (via teuthology); logs are in /home/ubuntu/logs/142441 Yuri Weinstein
01:14 PM Bug #7824: LibRadosList.ListObjectsNS failure
Is this the one we are backporting? Samuel Just
02:42 PM rgw Bug #7821 (Duplicate): 'S3ResponseError: 301 Moved Permanently' in rgw-firefly-distro-basic-plana...
Josh Durgin
02:31 PM Bug #7843: OSD fails to start
this is the trace when it fails
ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)
1: /usr/bin/ceph...
gustavo panizzo
09:57 AM Bug #7843: OSD fails to start
#6101 has nothing to do with this. :)
Looks like something has gone wrong with the OSD classes or some data passed t...
Greg Farnum
07:04 AM Bug #7843 (Can't reproduce): OSD fails to start
One of our OSDs suddenly crashed; after that it no longer starts. The OSD was new to the cluster so it was recovering.... gustavo panizzo
02:14 PM devops Bug #7645 (Resolved): mira020 is having problems
It had a hardware issue and has long since been fixed. Sandon Van Ness
02:13 PM devops Bug #4887: ceph-disk list: Doesn't list properly devices using dm-mapper
Alfredo - please confirm this issue no longer exists in current ceph-disk Ian Colle
02:11 PM devops Bug #5306: Xen based OSDs fail to start ceph-osd process
Yan is this still an issue? Ian Colle
02:10 PM devops Bug #5811 (Resolved): gperftools-{devel,libs}-2.0.11.el6 appears to be broken (centos6/rhel6)
Ian Colle
02:09 PM devops Bug #6592: 3.8 kernel + /dev/cciss/c0d1 + precise : fail to show in /dev/disk/by-partuuid
Loic - is this still "In Progress"? Ian Colle
02:07 PM devops Bug #5193: RHEL6 does not ship with xfsprogs
Should eventually live in http://www.ceph.com/rpm/rhel6/x86_64/ Ian Colle
02:05 PM devops Bug #5193: RHEL6 does not ship with xfsprogs
Build and package xfsprogs. Ian Colle
02:04 PM rgw Bug #7837: s3tests test_object_metadata_replaced_on_put fails on an erasure coded pool
I've got osd and rgw logs on metropolis:/home/joshd/teuthology/archive/rgw-ec-pool-7837 Josh Durgin
11:54 AM rgw Bug #7837: s3tests test_object_metadata_replaced_on_put fails on an erasure coded pool
This looks like an EC related regression, was working before.
The test first creates an object + xattrs. Then over...
Yehuda Sadeh
02:01 PM devops Bug #5479: Append our built packages with some sort of inktank/ceph identifier
Ian Colle
02:01 PM devops Bug #7312 (Resolved): ERROR: Running exception handlers
This was a temporary issue that has not repeated. Sandon Van Ness
01:41 PM Bug #7804: backfill racing with a hitset object remove
This appears to have been caused by a backfill racing with a hitset object remove -- probably easiest to block hitset... Samuel Just
01:27 PM Bug #7781 (Duplicate): "FAILED assert" in rados-firefly-distro-basic-plana suite
Samuel Just
11:55 AM Bug #7781: "FAILED assert" in rados-firefly-distro-basic-plana suite
Looks like a Ceph issue. Yuri Weinstein
01:21 PM Bug #7495 (Resolved): ENOTEMPTY on collection remove
Samuel Just
01:20 PM Bug #6003: journal Unable to read past sequence 406 ...
Samuel Just
01:19 PM Bug #7728 (Resolved): osd/ReplicatedPG.cc: 4999: FAILED assert(got)
Samuel Just
01:04 PM Bug #7706: osd: PrioritizedQueue can starve
ms pq max tokens per priority: 16777216 is a good work around Samuel Just
12:34 PM Bug #7207: Lock contention at filestore I/O (FileStore::lfn_open) during filestore folder splitti...
As a performance patch that only impacts splitting, I don't think this is really appropriate for a backport. Greg Farnum
11:43 AM Bug #7849 (Resolved): ceph-conf create empty log files
`ceph-conf` creates empty log files in "/var/log/ceph". This is unexpected, undesirable (and undocumented) behaviour... Dmitry Smirnov
10:50 AM CephFS Feature #3863 (Resolved): implement a tool to lookup inode numbers without holding their path
Merged into master in commit:83661c273e759837816d5cd7ba27233ded898455 Greg Farnum
10:00 AM Bug #7822: examples/librados/hello_world.cc broken on master
The read isn't supposed to be null terminated; it's a bufferlist. I was sloppy in giving that to an ostream, but doin... Greg Farnum
09:57 AM Bug #7826: osd: illegal instruction in jerasure
... Loïc Dachary
09:47 AM Bug #7826: osd: illegal instruction in jerasure
ceph/src/unittest_erasure_code_jerasure fails consistently... Loïc Dachary
09:32 AM Bug #7826: osd: illegal instruction in jerasure
reproduced on https://github.com/dachary/ceph/tree/wip-sse-fix Loïc Dachary
08:01 AM Bug #7826: osd: illegal instruction in jerasure
Reproduced the bug on master in a kvm without SSE instructions :-)... Loïc Dachary
07:40 AM Bug #7826: osd: illegal instruction in jerasure
I thought that maybe ... Loïc Dachary
05:27 AM Bug #7826: osd: illegal instruction in jerasure
On master ( afcf016a8a814167ced178d8a4275fa2a6f94a4d ) it fails in the same way using the set of command above, with ... Loïc Dachary
02:32 AM Bug #7826: osd: illegal instruction in jerasure
... Loïc Dachary
02:29 AM Bug #7826: osd: illegal instruction in jerasure
... Loïc Dachary
09:25 AM rbd Bug #5876 (In Progress): Assertion failure in rbd_img_obj_callback() : rbd_assert(which >= img_re...
Ian Colle
09:24 AM rbd Bug #7282 (Resolved): Unresponsive rbd-backed Qemu domain causes libvirtd to stall on all connect...
Marking resolved since Wido's patches landed upstream. Thanks! Josh Durgin
09:24 AM rbd Bug #7577 (In Progress): rbd info displays extra random char in block prefix
Ian Colle
09:21 AM rgw Bug #7526 (Resolved): "ERROR:radosgw_agent.worker:syncing entries for shard 59" in rgw-firefly-di...
Yehuda Sadeh
09:21 AM rgw Bug #6889: rgw: usage log: don't log system user operations
commit:42ef8ba543c7bf13c5aa3b6b4deaaf8a0f9c58b6 Yehuda Sadeh
09:11 AM rgw Feature #6678 (Pending Backport): rgw: reject writes to secondary zones
Ian Colle
09:08 AM rgw Bug #7502 (Rejected): S3 API - deleting object always returns 204 regardless of object is existin...
Yehuda Sadeh
06:46 AM CephFS Feature #7761 (In Progress): journal-tool: forwards-search through corrupt regions
John Spray
06:45 AM CephFS Feature #7759 (In Progress): journal-tool: roll in resetter/dumper from MDS
John Spray
06:35 AM devops Feature #7239 (Closed): ceph-deploy: install cephfs java bindings
Noah, ceph-deploy does have the ability to install a package on remote nodes:... Alfredo Deza
06:25 AM rbd Bug #7790: Kernel panic when creating ZFS pools on CEPH RBD devices
Steps to reproduce bug:
- create a zpool with default options (@zpool create mypool /dev/rbd/rbdpool/rbddev@)
- w...
Andrea Ieri

03/24/2014

07:59 PM Bug #7779: osd: object file can have too many xattrs, get E2BIG
What we'll actually need to repair these objects will involve pulling the rgw xattrs out as well as the RADOS ones — ... Greg Farnum
05:31 PM Bug #7779: osd: object file can have too many xattrs, get E2BIG
actually, the salvage tool is useless. since we are throwing out the rados user attrs, we can just do... Sage Weil
05:23 PM Bug #7779: osd: object file can have too many xattrs, get E2BIG
wip-7779 has a reproducer program, and a 'salvage' target that will copy the known important ceph xattrs to a replace... Sage Weil
04:41 PM Bug #7779: osd: object file can have too many xattrs, get E2BIG
good news: scrub on a pg with an object with too many xattrs:... Sage Weil
06:23 PM rgw Bug #7837 (Resolved): s3tests test_object_metadata_replaced_on_put fails on an erasure coded pool
Using this configuration:... Josh Durgin
03:36 PM Bug #7835 (Resolved): make_writeable needs to fill in ssc on new clone
-1> 2014-03-24 15:29:50.770570 7fac0522e700 10 filestore(/var/lib/ceph/osd/ceph-4) _finish_op 0x5ac6a00 seq 4456 ... Samuel Just
02:55 PM rgw Bug #7742 (Resolved): FAIL: s3tests.functional.test_s3.test_region_bucket_create_master_access_re...
commit:0cb00b1fb9bf6c59d787f340dd50ae16b1473f82 in teuthology.git Josh Durgin
02:41 PM Bug #7826: osd: illegal instruction in jerasure
running this teuthology workload on http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-deb-precise-amd64-basic/log.cgi?... Loïc Dachary
08:34 AM Bug #7826: osd: illegal instruction in jerasure
this is what i triggered it with. note that you need to adjust the branch to be something that doesn't include the w... Sage Weil
08:05 AM Bug #7826: osd: illegal instruction in jerasure
fixed a "few things in gf_complete":https://bitbucket.org/jimplank/gf-complete/pull-request/4/defer-the-decision-to-u... Loïc Dachary
07:40 AM Bug #7826: osd: illegal instruction in jerasure
"work in progress":https://github.com/ceph/ceph/tree/wip-sse-fix to run teuthology workloads Loïc Dachary
02:09 AM Bug #7826: osd: illegal instruction in jerasure
It would be helpful to have the config yaml used to trigger this... Loïc Dachary
01:15 AM Bug #7826: osd: illegal instruction in jerasure
"workaround":https://github.com/ceph/ceph/pull/1524 Loïc Dachary
02:21 PM RADOS Fix #7832: PG: do not double-queue pg log entries
Okay, that code diagnosis is probably not correct, since append_log doesn't set the dirty_from member which _write_lo... Greg Farnum
02:10 PM RADOS Fix #7832 (New): PG: do not double-queue pg log entries
PG::_write_log() sends each of the entries in the pg log to the omap. (obviously) Irritatingly, PG::append_log does t... Greg Farnum
02:20 PM Feature #7438 (Resolved): EC: adapt watch/notify stress test for EC and add to nightly
Two commits resolve this issue:
01b99668abd9aaa65e8864c2dad392d35dd892e1
5a3f6c7c8a01c002e9ff7ad5b49afaf3ae041ead
David Zafman
02:17 PM Bug #6789: cannot remove the leader when there only are two monitors
This doesn't currently happen on latest. Haven't tested yet with latest dumpling and latest emperor. Joao Eduardo Luis
12:14 PM Feature #7831 (Resolved): OSD: track objects with omap entries and don't count toward caps
Samuel Just
10:03 AM Bug #7827 (Resolved): osd: weird slow request warnings
Merged into master in commit:afcf016a8a814167ced178d8a4275fa2a6f94a4d Greg Farnum
08:27 AM Bug #7626 (In Progress): After updating ceph from 0.75 to 0.77 one of the three monitors can't start
Joao Eduardo Luis
07:34 AM Bug #6850 (Closed): mon: 'ceph health detail' with formatted output doesn't report low space on d...
As of today this is no longer valid. Joao Eduardo Luis
07:06 AM Feature #6732 (Rejected): mon: 'mon_status' should provide as much insight as 'ping'
'mon_status' has a different scope than that of 'ping'.
'mon_status' reports on monitor's status, via the ceph too...
Joao Eduardo Luis

03/23/2014

10:33 PM Bug #7826: osd: illegal instruction in jerasure
Sage Weil
10:32 PM Bug #7826 (Fix Under Review): osd: illegal instruction in jerasure
Sage Weil
06:38 PM Bug #7826 (Resolved): osd: illegal instruction in jerasure
... Sage Weil
10:12 PM Bug #7828 (Resolved): osd/ReplicatedPG.cc: 4984: FAILED assert(ctx->new_obs.exists == ctx->new_sn...
... Sage Weil
09:34 PM Bug #7827 (Fix Under Review): osd: weird slow request warnings
Sage Weil
09:27 PM Bug #7827 (Resolved): osd: weird slow request warnings
... Sage Weil
11:48 AM rgw Feature #6513 (Resolved): rgw: dr: Service scripts for meta/data sync agents
Added to radosgw-agent.git commit:c8d243f503b39e67a8daec6df40b58fb993cc6e6 and the following few to fix up packaging. Josh Durgin
05:25 AM Bug #5804: mon: binds to 0.0.0.0:6800something port
Also, for those affected by this bug, the best solution is to remove the affected monitor from the cluster ... Joao Eduardo Luis
05:23 AM Bug #5804 (Resolved): mon: binds to 0.0.0.0:6800something port
My suspicions were right, at least as far as the VMs are concerned. Whereas using 0.67.2 or 0.72.2 would hit this ev... Joao Eduardo Luis

03/22/2014

08:44 PM Bug #7824 (Resolved): LibRadosList.ListObjectsNS failure
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-03-22_19:33:22-upgrade:dumpling-x:parallel-firefly---b... Sage Weil
03:41 PM Bug #7823: osd: copy-get from ec pool returns wrong size
wip-ec-copy-get Sage Weil
03:19 PM Bug #7823 (Resolved): osd: copy-get from ec pool returns wrong size
reproduced w/ replicated cache in front of an ec pool and ceph_test_rados Sage Weil

03/21/2014

09:50 PM Bug #7822 (Resolved): examples/librados/hello_world.cc broken on master
Using the example program to test librados ports, I notice that it's currently failing (slightly) on master:... Dan Mick
07:30 PM devops Bug #6726: Official packages do not appear to be available for Saucy
I propose the following:
1. Ubuntu LTS (12.04, 14.04, etc) can (should) stop at the Inktank-supported Ceph release...
Peter Matulis
05:11 PM devops Bug #6726: Official packages do not appear to be available for Saucy
Been waiting for a while, one more weekend won't hurt ;)
Don't know about the others but I'll be very happy with e...
Tom Verdaat
05:01 PM devops Bug #6726: Official packages do not appear to be available for Saucy
Builds take quite a while and we are actually doing a release today so at the very least this will need to wait until... Sandon Van Ness
12:01 PM devops Bug #6726: Official packages do not appear to be available for Saucy
OK, I was able to successfully build for saucy on jenkins. Trusty is giving me trouble as the machines I was using are... Sandon Van Ness
05:22 PM rgw Bug #7821 (Duplicate): 'S3ResponseError: 301 Moved Permanently' in rgw-firefly-distro-basic-plana...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-21_14:31:01-rgw-firefly-distro-basic-plana/140962/... Yuri Weinstein
05:16 PM rgw Bug #7820 (Resolved): "AssertionError: 404 != 301" in rgw-firefly-distro-basic-plana suite
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-21_14:31:01-rgw-firefly-distro-basic-plana/140967/... Yuri Weinstein
03:00 PM Feature #7757 (Resolved): erasure-code: enable jerasure-2 SSE optimizations
Loïc Dachary
07:27 AM Feature #7757: erasure-code: enable jerasure-2 SSE optimizations
"The problem with enabling that flag is that it apparently also gives the compiler license to use SSE2 instructions i... Loïc Dachary
07:01 AM Feature #7757 (Fix Under Review): erasure-code: enable jerasure-2 SSE optimizations
"work in progress":https://github.com/ceph/ceph/pull/1512 Loïc Dachary
04:58 AM Feature #7757: erasure-code: enable jerasure-2 SSE optimizations
Waiting for review, K. Greenan may do it this week-end Loïc Dachary
02:51 PM Bug #7212 (Resolved): monitor fails to start
Sage Weil
01:51 PM rgw Bug #7816 (Resolved): 403 Forbidden while accessing https://github.com/ceph/radosgw-agent.git/inf...
teuthology.git 752a76fb4860afa8cd385e7103fc2832cdbb4098 Sage Weil
01:30 PM rgw Bug #7816 (Resolved): 403 Forbidden while accessing https://github.com/ceph/radosgw-agent.git/inf...
... Sage Weil
01:41 PM Bug #7735 (Resolved): osd: priorityqueue debug dump crashes
Sage Weil
01:40 PM rgw Bug #7818 (Resolved): syncdaemon: error geting op state: list index out of range
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-03-21_13:07:49-rgw-firefly-distro-basic-plana/140905... Sage Weil
01:34 PM rgw Feature #7817 (Resolved): rgw: some support for copy multipart part
The minimum support should be returning a NotImplemented error code, rather than NoSuchKey. Yehuda Sadeh
01:29 PM Bug #7659: osd/ReplicatedPG.cc: 6751: FAILED assert(attrs || !pg_log.get_missing().is_missing(soi...
David Zafman
12:07 PM rgw Bug #7815 (Can't reproduce): Test failed in upgrade:dumpling-x:parallel-firefly-testing-basic-pla...
Logs are in http://qa-proxy.ceph.com/teuthology/ubuntu-2014-03-21_11:05:47-upgrade:dumpling-x:parallel-firefly-testin... Yuri Weinstein
09:19 AM Feature #7438 (Fix Under Review): EC: adapt watch/notify stress test for EC and add to nightly
David Zafman
06:46 AM CephFS Feature #7810 (Resolved): libcephfs: add a test that freezes + unfreezes a client, and then verif...
Sage Weil

03/20/2014

05:56 PM Bug #7735 (Fix Under Review): osd: priorityqueue debug dump crashes
Sage Weil
04:54 PM rgw Bug #7702 (Resolved): osd thrashing + rgw = timeouts
Sage Weil
04:39 PM rgw Bug #7808 (Duplicate): "ERROR: testContainerSerializedInfo" in upgrade:dumpling-x:stress-split-fi...
#7799 Yuri Weinstein
04:37 PM rgw Bug #7808 (Duplicate): "ERROR: testContainerSerializedInfo" in upgrade:dumpling-x:stress-split-fi...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-20_01:35:01-upgrade:dumpling-x:stress-split-firefl... Yuri Weinstein
04:25 PM rbd Documentation #7807 (Closed): fix hostname on docu
http://ceph.com/docs/master/install/upgrading-ceph
ceph-deploy install --stable {stable release} ceph-node1[ ceph-...
Rens Reinders
04:23 PM Bug #7805 (Resolved): emperor can go active with < min_size non-incomplete peers since we check a...
Samuel Just
03:54 PM Bug #7804 (Duplicate): backfill racing with a hitset object remove
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-20_02:30:02-rados-firefly-distro-basic-plana/13947... Yuri Weinstein
12:47 PM devops Bug #7585 (Resolved): ceph-deploy should handle requiretty failures
Pull request opened https://github.com/ceph/ceph-deploy/pull/171
And merged into ceph-deploy's master branch with ...
Alfredo Deza
10:04 AM devops Bug #7585 (In Progress): ceph-deploy should handle requiretty failures
Alfredo Deza
10:03 AM devops Bug #7585: ceph-deploy should handle requiretty failures
To replicate the issue, in `/etc/sudoers` the following needs to be present:... Alfredo Deza
11:40 AM rgw Bug #7786: civetweb segfaults with file uploads larger than 2GB
The patch has been merged. Chris Holcombe
11:17 AM rbd Bug #7790: Kernel panic when creating ZFS pools on CEPH RBD devices
updated source to support. Sheldon Mustard
10:35 AM Bug #7801 (Resolved): warn on ext4 remount journal replay INFO line
fixed by teuthology commit 7088885ecd739a4006418eb9e7464d2f0128ea5d Sage Weil
10:23 AM Bug #7801 (Resolved): warn on ext4 remount journal replay INFO line
Logs are in http://qa-proxy.ceph.com/teuthology/ubuntu-2014-03-19_17:33:45-powercycle-firefly-testing-basic-plana/138... Yuri Weinstein
10:33 AM Bug #7208: CEPH_FEATURE_CRUSH_V2 feature mismatch
oh, commit:08fa34d94e40f2f7230dec4dabad72107cdee27b removed the default erasure pool from the OSDMap which fixed this... Sage Weil
10:25 AM Bug #7208: CEPH_FEATURE_CRUSH_V2 feature mismatch
workarounds:
ceph osd crush tunables bobtail
works with any kernel > 3.9. or,
ceph osd crush tunables fir...
Sage Weil
10:20 AM rbd Bug #5469: qemu-io: segfault when tried IO with invalid arguments
we've got the same assertion failure today on 0.67.3. It's rare and random.
osdc/Striper.cc: In function 'static ...
Pawel Stefanski
09:36 AM rgw Bug #7799 (Can't reproduce): Errors in upgrade:dumpling-x:stress-split-firefly---basic-plana suite
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-20_01:35:01-upgrade:dumpling-x:stress-split-firefl... Yuri Weinstein
09:23 AM Bug #7755 (Resolved): auth_log selection depends on the primary identity and can cause a loop
Sage Weil
09:19 AM Bug #7733 (Resolved): start_new_interval: always notify if !primary
Sage Weil
09:18 AM Bug #7777 (Resolved): 2014-03-18T12:48:30.368 INFO:teuthology.task.rados.rados.0.err:[10.214.132....
Sage Weil
08:17 AM rgw Bug #7796 (Resolved): RGW Keystone token auth fails with '411 Length Required' when Keystone usin...
When using keystone behind apache/wsgi in the following config, radosgw is unable to authenticate tokens:
cat /etc...
Michael Kidd
07:28 AM Feature #7792 (Closed): leveldb 1.12.0 for rhel
It appears libleveldb1 version 1.12.0 is provided for debian/ubuntu here:
http://ceph.com/debian-dumpling/pool/mai...
Sheldon Mustard
05:51 AM devops Bug #7677 (Resolved): Troubleshoot ceph-setup-nightly Jenkins failures
Alfredo Deza

03/19/2014

04:39 PM Bug #5804: mon: binds to 0.0.0.0:6800something port
Not wanting to commit to a full diagnosis just yet, this is looking a lot like a mild case of #7212 with a weird resu... Joao Eduardo Luis
03:39 PM rgw Feature #7791 (Rejected): radosgw-agent should show statistics
radosgw-agent might be a good method of offering statistics and reporting, such as log version between gateways, spee... Brian Andrus
03:37 PM rbd Bug #7790 (Resolved): Kernel panic when creating ZFS pools on CEPH RBD devices
Creating a ZFS pool on top of krbd causes a kernel panic.
From the ZFSonLinux bug tracker (https://github.com/zfso...
Chris Dunlop
01:18 PM rbd Fix #7787 (Resolved): rbd diff takes longer as images grow larger
rbd diff will currently take a VERY long time to complete for large images. This is not ideal. Perhaps a counter of s... Brian Andrus
12:24 PM rgw Bug #7786 (Resolved): civetweb segfaults with file uploads larger than 2GB
I submitted a pull request to fix this. I'm waiting to hear back from the maintainers of the project.
https://githu...
Chris Holcombe
11:46 AM Feature #7784: mon osd down out interval = 0 should prevent ceph health from reporting ok
Perhaps this config option should be converted into the noout flag; that's already plumbed up for such things. Greg Farnum
11:13 AM Feature #7784 (Resolved): mon osd down out interval = 0 should prevent ceph health from reporting ok
Samuel Just
11:45 AM Bug #7779: osd: object file can have too many xattrs, get E2BIG
Sounds like this is a manual repair job to me, then...
I guess we could write tools that know all the xattr name pat...
Greg Farnum
10:51 AM Bug #7393 (Duplicate): osd: scrub stat mismatch, got 9/9 objects, 0/0 clones, 9/4 dirty, 0/0 whit...
We don't seem to be seeing this any more, might have been related to the object_info_t leaking. Marking duplicate un... Samuel Just
10:36 AM Bug #7781 (Duplicate): "FAILED assert" in rados-firefly-distro-basic-plana suite
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-18_02:30:01-rados-firefly-distro-basic-plana/13740... Yuri Weinstein
08:28 AM CephFS Bug #7780 (Fix Under Review): When full flag is set, even MDS writes are blocked
John Spray
05:12 AM CephFS Bug #7780: When full flag is set, even MDS writes are blocked
Just tried commenting out the check in objecter and the MDS writes get through, so not too much work to do here. John Spray
04:05 AM CephFS Bug #7780 (Resolved): When full flag is set, even MDS writes are blocked

We're meant to allow the MDS to write even when the OSDs think they're full, so that we can journal file deletions ...
John Spray

03/18/2014

09:28 PM CephFS Bug #7708 (Resolved): mds: null deref in issue_caps
Sage Weil
08:56 PM Bug #7779: osd: object file can have too many xattrs, get E2BIG
Yep. This particular object was written back in November, though, so it predates the fix by some time (and in fact m... Sage Weil
08:52 PM Bug #7779: osd: object file can have too many xattrs, get E2BIG
This exact bug is why we made the changes to xattr handling and no longer store unlimited numbers in the filesystem. ... Greg Farnum
06:44 PM Bug #7779 (Resolved): osd: object file can have too many xattrs, get E2BIG
if an object has too many xattrs on it, you get E2BIG from listxattr. one such object:... Sage Weil
08:07 PM Feature #7438 (In Progress): EC: adapt watch/notify stress test for EC and add to nightly
David Zafman
03:57 PM Bug #7776: client lockdep crash
ubuntu@teuthology:/a/teuthology-2014-03-18_02:30:01-rados-firefly-distro-basic-plana/137408 Samuel Just
03:03 PM Bug #7776 (Resolved): client lockdep crash
2014-03-18T14:50:40.839 INFO:teuthology.task.workunit.client.0.err:[10.214.131.15]: -3> 2014-03-18 14:48:00.14746... Samuel Just
03:45 PM Bug #7777 (Resolved): 2014-03-18T12:48:30.368 INFO:teuthology.task.rados.rados.0.err:[10.214.132....
is_missing_object(snapdir)->wait_for_unreadable
2014-03-18T12:48:27.770 INFO:teuthology.task.thrashosds.thrasher:T...
Samuel Just
02:32 PM rgw Bug #7702: osd thrashing + rgw = timeouts
this affects the dumpling-x/stress-split tests, and *would* affect rados thrashing if we had an rgw workload in there. Sage Weil
02:28 PM rgw Documentation #7434 (In Progress): rgw: doc user/group quota
I have a local branch where this is in progress. I'll try and put it up at the end of the week after I finish the rel... John Wilkins
02:25 PM Bug #7398: osd: ERANGE from clone
Seems that there is a stale ssc from a previous interval (?) mucking up the calc_clone_subsets process. Testing a br... Samuel Just
12:36 PM Bug #7398: osd: ERANGE from clone
ubuntu@teuthology:/a/teuthology-2014-03-18_02:30:01-rados-firefly-distro-basic-plana/137259/remote
With logging.
Samuel Just
01:41 PM Bug #6633: osd: pgls vs osd restart/peering race misses objects
backported to dumpling. Sage Weil
12:40 PM rgw Feature #7774 (Resolved): rgw: cache decoded user and bucket info
Instead of having to access raw objects and then decode them for each request. Needs an invalidation mechanism that i... Yehuda Sadeh
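The idea in the entry above (decode once, reuse the decoded form, drop it when it becomes stale) can be sketched roughly as follows. This is a minimal illustration, not rgw code: `DecodedCache`, `fetch_raw`, and the JSON encoding are all hypothetical stand-ins, and how invalidation would actually be triggered is not described in the excerpt.

```python
import json
from typing import Any, Callable, Dict


class DecodedCache:
    """Keep decoded records in memory so repeated requests skip fetch + decode.

    All names here are hypothetical; the excerpt does not describe the real
    rgw data structures or its invalidation mechanism.
    """

    def __init__(self, fetch_raw: Callable[[str], str],
                 decode: Callable[[str], Any] = json.loads) -> None:
        self._fetch_raw = fetch_raw
        self._decode = decode
        self._entries: Dict[str, Any] = {}

    def get(self, key: str) -> Any:
        # Fetch and decode only on a miss; hits return the cached object.
        if key not in self._entries:
            self._entries[key] = self._decode(self._fetch_raw(key))
        return self._entries[key]

    def invalidate(self, key: str) -> None:
        # Called when the underlying record changes; the next get() refetches.
        self._entries.pop(key, None)
```

A caller would invoke `invalidate(key)` whenever it learns the stored record has changed; without that step the cache would keep serving stale user or bucket info.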
12:34 PM Bug #7733: start_new_interval: always notify if !primary
Samuel Just
12:30 PM Bug #7728: osd/ReplicatedPG.cc: 4999: FAILED assert(got)
Samuel Just
12:30 PM Bug #7755: auth_log selection depends on the primary identity and can cause a loop
Samuel Just
11:40 AM Bug #7755: auth_log selection depends on the primary identity and can cause a loop
Samuel Just
12:05 PM Feature #7767 (Fix Under Review): messenger:buffer reads
Yehuda Sadeh
08:43 AM Feature #7767 (Resolved): messenger:buffer reads
The way we read messages involves multiple smallish reads, which can have a big performance impact. Need to buffe... Yehuda Sadeh
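To illustrate why many small reads are costly and what buffering changes, here is a small sketch using Python's standard `io` module. It is not the Ceph messenger: `CountingRaw` is a hypothetical in-memory stand-in for a socket, and the field sizes are made up for the example.

```python
import io


class CountingRaw(io.RawIOBase):
    """In-memory stand-in for a socket that counts low-level read calls."""

    def __init__(self, payload: bytes) -> None:
        super().__init__()
        self._src = io.BytesIO(payload)
        self.read_calls = 0

    def readable(self) -> bool:
        return True

    def readinto(self, b) -> int:
        # Each call here models one system call on a real socket.
        self.read_calls += 1
        return self._src.readinto(b)


def read_fields(stream, sizes):
    """Read a sequence of fixed-size fields, one read() per field."""
    return [stream.read(n) for n in sizes]


# Hypothetical message layout: tag, header, body, footer.
SIZES = [4, 8, 16, 4]
PAYLOAD = bytes(range(32))

# Unbuffered: every field costs one low-level read.
unbuffered = CountingRaw(PAYLOAD)
fields_unbuffered = read_fields(unbuffered, SIZES)

# Buffered: one larger low-level read fills a buffer that serves the fields.
raw = CountingRaw(PAYLOAD)
buffered = io.BufferedReader(raw, buffer_size=4096)
fields_buffered = read_fields(buffered, SIZES)

print(unbuffered.read_calls, raw.read_calls)
```

Both paths return the same fields; the buffered path simply reaches the underlying source fewer times, which is the effect the feature request is after.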
11:22 AM RADOS Feature #7770 (New): "ceph osd crush set" should handle ingestion of non-compiled crush maps
"ceph osd crush dump" will output a decompiled version of the crush map in a format of choice. It would be nice to be... Brian Andrus
10:49 AM Bug #6806 (Pending Backport): mon: audit cmd_getval() calls to make sure they handle failures cor...
Sage Weil
09:42 AM Bug #6806 (Fix Under Review): mon: audit cmd_getval() calls to make sure they handle failures cor...
Joao Eduardo Luis
10:08 AM Bug #7756 (Rejected): cannot get through configuration when configure.ac check xfs.h
Yep, you need the xfs-dev headers to build Ceph now. Greg Farnum
01:45 AM Bug #7756: cannot get through configuration when configure.ac check xfs.h
invalidate this bug report Xinxin Shu
01:38 AM Bug #7756: cannot get through configuration when configure.ac check xfs.h
To avoid these errors, I think we should check for the xfs library, not the xfs header file Xinxin Shu
01:36 AM Bug #7756 (Rejected): cannot get through configuration when configure.ac check xfs.h
When I used configure.ac to configure Ceph, I got an error "configure: error: xfs/xfs.h not found (--without-libxfs to d... Xinxin Shu
09:24 AM rbd Bug #5876: Assertion failure in rbd_img_obj_callback() : rbd_assert(which >= img_request->next_co...
Hi,
well, there is nothing else in dmesg, except several hours before this hang.
These servers are running about...
Olivier Bonvalet
09:07 AM rbd Bug #5876: Assertion failure in rbd_img_obj_callback() : rbd_assert(which >= img_request->next_co...
Hi Olivier,
Can you attach the entire dmesg or at least a few minutes worth of
messages prior to the assertion fa...
Ilya Dryomov
09:00 AM rbd Bug #5876: Assertion failure in rbd_img_obj_callback() : rbd_assert(which >= img_request->next_co...
and in a 3.13.5 kernel too :... Olivier Bonvalet
08:47 AM rbd Bug #5876: Assertion failure in rbd_img_obj_callback() : rbd_assert(which >= img_request->next_co...
Hi,
same thing with a 3.13.5 kernel :...
Olivier Bonvalet
08:17 AM Bug #7765: Bogus bandwidth statistics during pg creation
On the attachment, around 09:40 is some actual client I/O maxing out the cluster. Around 09:55 is where I start crea... John Spray
08:13 AM Bug #7765 (Can't reproduce): Bogus bandwidth statistics during pg creation

Noticed this while growing the number of PGs in an existing pool on a cluster running firefly branch....
John Spray
07:18 AM devops Bug #7627: ceph-disk: does not start daemons properly under systemd
About the systemd not playing well with ceph-deploy, I am still doubtful because the way ceph-deploy is working with ... Alfredo Deza
07:00 AM devops Bug #7627: ceph-disk: does not start daemons properly under systemd
I tried to replicate this with ceph-deploy and boy was it a nightmare to get there.
FC19 panics a few seconds afte...
Alfredo Deza
05:28 AM CephFS Feature #7764 (Resolved): InoTable/SessionMap/ manipulator (cephfs-table-tool)

Some bugs might create a situation where the metadata/journal may be okay but recovery fails because one of the glo...
John Spray
05:25 AM CephFS Feature #7763 (Resolved): journal-tool: import
Given a binary journal dump, insert it back into the metadata pool.
John Spray
05:23 AM CephFS Feature #7762 (Rejected): journal-tool: backwards-search after corrupt regions
When a corrupt region is encountered, use the start pointers to walk from the end of the log back to the latest valid... John Spray
05:22 AM CephFS Feature #7352: mds: make classes encode/decode-able
related, on journal-tool branch have got to a point where dencoder can handle getting the inodes out of .inode object... John Spray
05:20 AM CephFS Feature #7761 (Resolved): journal-tool: forwards-search through corrupt regions

When corruption is encountered, search forward byte by byte for the next sentinel word.
John Spray
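The forward search described in the entry above can be sketched as a byte-by-byte scan. This is only an illustration under stated assumptions: the 8-byte `SENTINEL` value and the function name are hypothetical, since the excerpt does not give the real journal format.

```python
from typing import Optional

# Hypothetical sentinel value; the real journal sentinel is not given here.
SENTINEL = b"\xde\xad\xbe\xef\xca\xfe\xf0\x0d"


def find_next_sentinel(log: bytes, start: int,
                       sentinel: bytes = SENTINEL) -> Optional[int]:
    """Return the first offset >= start where the sentinel occurs, else None."""
    offset = start
    last = len(log) - len(sentinel)
    while offset <= last:
        if log[offset:offset + len(sentinel)] == sentinel:
            return offset
        offset += 1  # advance a single byte: corrupt data has no alignment
    return None
```

On hitting a corrupt region at offset `pos`, a tool could call `find_next_sentinel(log, pos + 1)` and resume parsing from the returned offset, treating `None` as "no further valid events".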
05:18 AM CephFS Feature #7760 (Resolved): journal-tool: implement splice

Simple implementation based on new ENoOp event type that allows us to selectively blank out regions in existing log.
John Spray
05:17 AM CephFS Feature #7759 (Resolved): journal-tool: roll in resetter/dumper from MDS

To provide the capabilities in one convenient place and remove a few LOC from the MDS.
John Spray
05:15 AM CephFS Feature #7758 (Resolved): journal-tool: complete filtering
Already done:
--path
--range
Remaining to do:
--by-inode <inodeno>
--by-dirfrag <frag id>:[dentry]
--by-...
John Spray
02:52 AM Feature #7757 (Resolved): erasure-code: enable jerasure-2 SSE optimizations
gf-complete, the library used by jerasure-2, decides to use SSE instructions based on CPU features found at compile t... Loïc Dachary
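A common alternative to deciding at compile time is to pick the implementation at run time from the features the CPU actually reports. The sketch below shows only that dispatch pattern; it is not gf-complete or jerasure code, the function names are hypothetical, and `xor_wide` is merely a stand-in for an accelerated path.

```python
from typing import Callable, Iterable


def xor_generic(a: bytes, b: bytes) -> bytes:
    """Portable fallback: byte-wise XOR over the common length."""
    return bytes(x ^ y for x, y in zip(a, b))


def xor_wide(a: bytes, b: bytes) -> bytes:
    """Stand-in for an accelerated path: XOR the buffers as big integers."""
    n = min(len(a), len(b))
    value = int.from_bytes(a[:n], "big") ^ int.from_bytes(b[:n], "big")
    return value.to_bytes(n, "big")


def select_xor(cpu_flags: Iterable[str]) -> Callable[[bytes, bytes], bytes]:
    """Choose an implementation from flags seen at run time, not build time."""
    return xor_wide if "sse2" in set(cpu_flags) else xor_generic
```

With this shape, a binary built on a machine with SSE still runs on one without it, because the fallback is chosen whenever the flag is absent.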

03/17/2014

09:48 PM Bug #7207: Lock contention at filestore I/O (FileStore::lfn_open) during filestore folder splitti...
Hello Greg,
Can we backport this fix to dumpling?
Guang Yang
08:52 PM Feature #7599 (Resolved): erasure-code : upgrade to jerasure-2
Sage Weil
08:38 PM Bug #7755 (Resolved): auth_log selection depends on the primary identity and can cause a loop
calc_ec_acting assumes the first valid osd will be primary, this is not necessarily so with primary affinity. We can... Samuel Just
04:33 PM Feature #7662 (Resolved): add a "ec profiles" map to OSDMap
Loïc Dachary
09:12 AM Feature #7662 (Fix Under Review): add a "ec profiles" map to OSDMap
"work in progress":https://github.com/ceph/ceph/pull/1477 Loïc Dachary
04:22 PM Bug #7736 (Pending Backport): mon: can expose stale state
Sage Weil
03:29 PM Bug #7736: mon: can expose stale state
Sage Weil
04:45 AM Bug #7736: mon: can expose stale state
Joao Eduardo Luis
04:43 AM Bug #7736: mon: can expose stale state
This works for me.
Also noticed that a custom crafted message could trigger a commit with the old code. Apparentl...
Joao Eduardo Luis
04:12 PM Bug #7748 (Resolved): rados bench timeouts: reset timeout if recovery rate is non-zero
Samuel Just
10:48 AM Bug #7748 (Resolved): rados bench timeouts: reset timeout if recovery rate is non-zero
Samuel Just
04:04 PM Bug #7738 (Pending Backport): osd: journal crash on startup on wheezy
Sage Weil
04:04 PM Bug #7738 (Resolved): osd: journal crash on startup on wheezy
Sage Weil
03:40 PM Bug #7738 (Fix Under Review): osd: journal crash on startup on wheezy
As far as I can tell the problem is that the journal symlink isn't always present when we reopen the journal for writ... Sage Weil
03:43 PM Bug #7713 (Duplicate): "Error: finished tid 3 when last_acked_tid was 4" in upgrade:dumpling-x:pa...
Sage Weil
10:04 AM Bug #7713: "Error: finished tid 3 when last_acked_tid was 4" in upgrade:dumpling-x:parallel-firef...
actually, the problem is probably that dumpling still has the bug. Sage Weil
10:03 AM Bug #7713 (Duplicate): "Error: finished tid 3 when last_acked_tid was 4" in upgrade:dumpling-x:pa...
this was #7709, now fixed Sage Weil
03:37 PM rgw Bug #7741 (Resolved): sync.py: not enough arguments for format string
commit:5e7a98911e6a6db525d65981034d28065a045e47 Josh Durgin
01:53 PM rbd Feature #7746: Capacity Management: rbd df
Confirmed that this doesn't just affect RBD pools but affects any pool, as the data is generated from reading the off... Neil Levine
10:14 AM rbd Feature #7746 (Resolved): Capacity Management: rbd df
Currently, ceph df shows two sets of statistics: global and pool.
For global, the output is generated from aggrega...
Neil Levine
01:22 PM Bug #7751 (Closed): os: LevelDBStore: needs to check iter->status() during get()
We are ignoring the status of the iterator during 'LevelDBStore::get()' -- we're also ignoring the state of the itera... Joao Eduardo Luis
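The distinction at stake is between "key not present" and "the iterator failed". The sketch below models it with a hypothetical `FakeIterator` that loosely mirrors a LevelDB-style interface (`seek`, `valid`, `key`, `value`, `status`); it is not the LevelDBStore code, and `StoreError` is an invented exception name.

```python
from typing import Dict, Optional


class StoreError(Exception):
    """Raised when the iterator reports an error instead of a clean miss."""


class FakeIterator:
    """Hypothetical iterator loosely modelled on a LevelDB-style API."""

    def __init__(self, data: Dict[bytes, bytes], error: Optional[str] = None):
        self._items = sorted(data.items())
        self._pos = len(self._items)  # starts out invalid
        self._error = error

    def seek(self, key: bytes) -> None:
        if self._error is not None:
            self._pos = len(self._items)  # an errored iterator is never valid
            return
        self._pos = next(
            (i for i, (k, _) in enumerate(self._items) if k >= key),
            len(self._items),
        )

    def valid(self) -> bool:
        return self._pos < len(self._items)

    def key(self) -> bytes:
        return self._items[self._pos][0]

    def value(self) -> bytes:
        return self._items[self._pos][1]

    def status(self) -> Optional[str]:
        return self._error  # None means OK


def get(it: FakeIterator, key: bytes) -> Optional[bytes]:
    """Return the value, None for a clean miss, or raise on iterator error."""
    it.seek(key)
    if it.valid() and it.key() == key:
        return it.value()
    err = it.status()
    if err is not None:
        raise StoreError(err)  # without this check an I/O error looks like a miss
    return None
```

Skipping the `status()` check would make a read error indistinguishable from a missing key, which appears to be the concern the report raises.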
01:04 PM CephFS Bug #7750 (Can't reproduce): Attempting to mount a kNFS export of a sub-directory of a CephFS fil...
Hi,
I have run into the following issue when trying to export a subset of a CephFS filesystem via NFS:
Machine ...
David McBride
01:03 PM rgw Bug #7743 (Can't reproduce): Max retries exceeded with url: /admin/opstate?client-id=radosgw-agen...
This looks like network issues - the run was killed, so there are no logs from radosgw to confirm. Josh Durgin
01:00 PM rgw Bug #7749 (Resolved): radosgw-agent: worker.DataWorker.wait_for_object() should bail on requests....
Fixed in teuthology commit:ef2edcd3a96a2118d793b7a102688f60fa100f52 Josh Durgin
11:58 AM rgw Bug #7749: radosgw-agent: worker.DataWorker.wait_for_object() should bail on requests.ConnectionE...
A different approach: https://github.com/ceph/teuthology/pull/227 Zack Cerza
11:23 AM rgw Bug #7749 (Fix Under Review): radosgw-agent: worker.DataWorker.wait_for_object() should bail on r...
Zack Cerza
11:22 AM rgw Bug #7749: radosgw-agent: worker.DataWorker.wait_for_object() should bail on requests.ConnectionE...
https://github.com/ceph/radosgw-agent/pull/9 Zack Cerza
11:16 AM rgw Bug #7749 (Resolved): radosgw-agent: worker.DataWorker.wait_for_object() should bail on requests....
Currently, the method will do this forever if a connection goes down:... Zack Cerza
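The behaviour requested in the title (stop polling when the connection itself is gone) might look roughly like this. It is a sketch, not the radosgw-agent implementation: the signature, `TransientError`, and the attempt limit are hypothetical, and Python's built-in `ConnectionError` stands in for `requests.ConnectionError`.

```python
import time
from typing import Callable


class TransientError(Exception):
    """Hypothetical error class for failures worth retrying."""


def wait_for_object(check_state: Callable[[], bool],
                    max_attempts: int = 10,
                    delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll until check_state() is true; return the number of attempts used.

    The built-in ConnectionError is used here as a stand-in for
    requests.ConnectionError (an assumption made for self-containment).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            if check_state():
                return attempt
        except ConnectionError:
            raise  # the peer is unreachable: bail out instead of looping
        except TransientError:
            pass  # retryable: fall through to the sleep below
        sleep(delay)
    raise TimeoutError("object did not reach the expected state")
```

Bounding the loop with `max_attempts` is an extra safeguard added for the sketch, so that even retryable failures cannot keep the caller waiting forever.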
10:40 AM Bug #7712 (Resolved): osd: ENOENT following copy-from
Samuel Just
10:09 AM Bug #7715 (Resolved): Errors in cluster log "[ERR] mon.0" in in upgrade:dumpling-x:parallel-fire...
whitelisted this Sage Weil
10:03 AM Feature #7717: Per CRUSH bucket used/available space
Neil Levine
09:22 AM Fix #6780 (Closed): monitor errors when checking for quorum status
Joao Eduardo Luis
09:13 AM Fix #6780: monitor errors when checking for quorum status
Can no longer reproduce this on firefly. Any objections to closing? Joao Eduardo Luis
08:19 AM Bug #7611 (Resolved): All mon nodes crash when running "ceph tell osd.X" and using the "version" ...
Sage Weil
08:00 AM Bug #7611 (Fix Under Review): All mon nodes crash when running "ceph tell osd.X" and using the "v...
Joao Eduardo Luis

03/16/2014

05:32 PM Feature #7698 (Resolved): Add EC handling to ceph_filestore_dump
31a667918069ac72c70080dcacd6ff3b6d58f07b David Zafman
03:26 PM Feature #7662 (In Progress): add a "ec profiles" map to OSDMap
Loïc Dachary
10:36 AM Feature #7662 (Fix Under Review): add a "ec profiles" map to OSDMap
*ABANDONED* "work in progress":https://github.com/ceph/ceph/pull/1475
I've set it to urgent so someone can decide...
Loïc Dachary
12:54 PM Bug #7744 (Can't reproduce): osd: assert(last_e.version.version < e.version.version)
I currently have 2 OSDs that won't start and it's preventing my
cluster from running my VMs.
My cluster is running ...
Kevinsky Dy
12:11 PM Bug #7740 (Resolved): resurrected pgs have lb=hobject_t::min(), but we respond to queries on dele...
Sage Weil
12:10 PM rgw Bug #7743 (Can't reproduce): Max retries exceeded with url: /admin/opstate?client-id=radosgw-agen...
ubuntu@teuthology:/a/teuthology-2014-03-15_23:00:27-rgw-firefly-distro-basic-plana/133582... Sage Weil
12:09 PM rgw Bug #7742 (Resolved): FAIL: s3tests.functional.test_s3.test_region_bucket_create_master_access_re...
ubuntu@teuthology:/a/teuthology-2014-03-15_23:00:27-rgw-firefly-distro-basic-plana/133587
this may be a dup of the...
Sage Weil
12:08 PM rgw Bug #7741 (Resolved): sync.py: not enough arguments for format string
ubuntu@teuthology:/a/teuthology-2014-03-15_23:00:27-rgw-firefly-distro-basic-plana/133587... Sage Weil
09:36 AM Bug #7719 (Resolved): failed to become clean
Sage Weil

03/15/2014

08:58 PM Bug #7728: osd/ReplicatedPG.cc: 4999: FAILED assert(got)
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-03-15_17:09:08-rados:thrash-firefly-distro-basic-plana... Sage Weil
07:39 PM CephFS Bug #7684 (Resolved): failed cfuse_workunit_kernel_untar_build.yaml test
Sage Weil
05:39 PM Bug #7740 (Resolved): resurrected pgs have lb=hobject_t::min(), but we respond to queries on dele...
Samuel Just
05:07 PM Bug #7736 (Fix Under Review): mon: can expose stale state
Sage Weil
02:15 PM Bug #7736: mon: can expose stale state
This *is* committed state. Looks like we need to wait for everybody to ack or for leases to expire before responding ... Greg Farnum
09:52 AM Bug #7736 (Resolved): mon: can expose stale state
teuthology-2014-03-14_19:00:49-rados-dumpling-testing-basic-plana/130941
here is a deletion on the primary:...
Sage Weil
04:17 PM rgw Bug #7730 (Rejected): Content-length should not required for initiate multipart upload
Yehuda Sadeh
03:30 PM Bug #7732 (Resolved): stuck recovering with unfound -- build_might_have_unfound bug
Sage Weil
03:30 PM Bug #7718 (Resolved): osd/PG.cc: 6062: FAILED assert(pg->want_acting.size())
Sage Weil
11:15 AM CephFS Bug #7739 (Resolved): mds: uninitialized field in message
ubuntu@teuthology:/a/teuthology-2014-03-14_23:01:36-multimds-master-testing-basic-plana/131675
hard to map this to...
Sage Weil
10:10 AM Bug #7738 (Resolved): osd: journal crash on startup on wheezy
ubuntu@teuthology:/a/sage-wheezy/130813... Sage Weil
10:09 AM Bug #7737 (Resolved): osd: deletes vs backfill makes degraded go negative
this is the tail end of a rados bench run that was outpacing backfill. once it started to delete objects, the degrad... Sage Weil
12:09 AM devops Bug #7627: ceph-disk: does not start daemons properly under systemd
I'm 90% sure this is systemd killing everything in the cgroup when ceph-deploy's ssh session closes. The log file ju... Sage Weil

03/14/2014

08:26 PM Bug #7735: osd: priorityqueue debug dump crashes
wip-pq Sage Weil
08:21 PM Bug #7735 (Resolved): osd: priorityqueue debug dump crashes
discovered when doing backfill testing with wip-pq:... Mark Nelson
06:03 PM Bug #7732: stuck recovering with unfound -- build_might_have_unfound bug
Samuel Just
05:58 PM Bug #7732 (Resolved): stuck recovering with unfound -- build_might_have_unfound bug
Samuel Just
05:59 PM Bug #7733 (Resolved): start_new_interval: always notify if !primary
Samuel Just
05:29 PM rgw Bug #7730: Content-length should not required for initiate multipart upload
I grabbed and compiled the patched version of FastCGI as a DSO from here:
https://github.com/ceph/mod_fastcgi.git
...
Neil Soman
05:03 PM rgw Bug #7730: Content-length should not required for initiate multipart upload
I'm not sure we actually require it. You're not using the modified fastcgi module and therefore chunked encoding is n... Yehuda Sadeh
03:52 PM rgw Bug #7730 (Rejected): Content-length should not required for initiate multipart upload
We are using the S3 SDK and I am getting a return from RGW claiming that Content-Length must be set. However, accordi... Neil Soman
05:05 PM rbd Bug #7577: rbd info displays extra random char in block prefix
Dan Mick wrote:
> I'm not working on this, just calling imprecations from the bleachers. :) I think the issue is t...
Danny Al-Gaaf
04:52 PM Bug #7712: osd: ENOENT following copy-from
it looks like the requeues get in all sorts of weird orders:
grep --color=auto 3\\.2\( remote/*/log/*osd.5* | egre...
Sage Weil
04:45 PM Bug #7712: osd: ENOENT following copy-from
grep dequeue_op remote/*/log/*osd.5* | grep plana2926152-290
osd.5.3:79 and osd.1.3:681 get reordered
when they...
Sage Weil
10:33 AM Bug #7712 (Resolved): osd: ENOENT following copy-from
2014-03-14T09:47:00.894 INFO:teuthology.task.rados.rados.0.out:[10.214.131.11]: 2625: delete oid 290 current snap is ... Sage Weil
04:36 PM Bug #7659 (In Progress): osd/ReplicatedPG.cc: 6751: FAILED assert(attrs || !pg_log.get_missing()....
David Zafman
04:20 PM rgw Documentation #7731 (Resolved): Warning about "rgw print continue" should be added to radosgw con...
Here,
http://ceph.com/docs/dumpling/radosgw/config/
Just as there is a warning to turn off FastCgiWrapper on Ce...
Neil Soman
03:59 PM Documentation #6465: admin/build-doc should have some kind of build check for broken links
Actually, it seems like the :ref: syntax at least catches a file name mismatch. We should probably begin converting t... John Wilkins
03:57 PM Documentation #7558 (Resolved): broken link in install/manual-deployment/
Fixed. The file name was incorrect. Probably should be using :ref: syntax instead. John Wilkins
03:13 PM Documentation #7729 (Closed): CAPS with dot (.) notation can be unquoted
Authentication documentation should indicate that CAPS with dot notation do not have to be quoted any longer.
htt...
John Wilkins
03:10 PM Bug #7720 (Duplicate): osd/ReplicatedPG.cc: 4991: FAILED assert(got)
Sage Weil
01:28 PM Bug #7720 (Duplicate): osd/ReplicatedPG.cc: 4991: FAILED assert(got)
... Sage Weil
02:51 PM Bug #7728 (Resolved): osd/ReplicatedPG.cc: 4999: FAILED assert(got)

ceph version 0.77-859-g0bf5b52 (0bf5b52ae035d9893b5d079abee05bb3114ef51e)
1: (ReplicatedPG::finish_ctx(Replicate...
Samuel Just
02:47 PM Bug #7718: osd/PG.cc: 6062: FAILED assert(pg->want_acting.size())
Samuel Just
02:46 PM Bug #7718: osd/PG.cc: 6062: FAILED assert(pg->want_acting.size())
issue_repop sets the last_update for stored peer_info to repop->v which is eversion_t() for temp objects. Testing fix. Samuel Just
12:58 PM Bug #7718 (Resolved): osd/PG.cc: 6062: FAILED assert(pg->want_acting.size())
... Sage Weil
02:37 PM Feature #7698: Add EC handling to ceph_filestore_dump
Ready to go into either master or firefly branches. David Zafman
02:17 PM Documentation #7727 (Resolved): CRUSH "chassis" bucket type
We should note that there is a new "chassis" default bucket type. It might go here:
http://ceph.com/docs/master/ra...
John Wilkins
02:09 PM Documentation #7726 (Resolved): Quick Start Requires Updating
With size = 3 by default, the cluster as defined by the quick start guide will never get to an active + clean state u... John Wilkins
02:06 PM Documentation #7725 (Resolved): osd pool default size need updating
The default value for size is now 3.
http://ceph.com/docs/master/rados/configuration/pool-pg-config-ref/
John Wilkins
02:04 PM Documentation #7724: filestore xattr use omap removed
It's a good idea to doc it, but in general we don't support reverting versions and should not give the appearance of ... Greg Farnum
02:01 PM Documentation #7724 (Resolved): filestore xattr use omap removed
Ceph Documentation needs to indicate that filestore xattr use omap was removed and specify the version. Users reverti... John Wilkins
01:59 PM Feature #7723 (New): Cancel RADOS Bench writes and still do reads
Right now in RADOS bench if you write some data out with the --no-cleanup flag but ctrl+c or kill the process before ... Mark Nelson
01:55 PM devops Cleanup #7722 (Resolved): Make /admin/build-doc distro independent
The script for building Ceph documentation under admin/build-doc is currently Debian/Ubuntu only. This limits communi... John Wilkins
01:52 PM Documentation #7721 (Closed): Document the process to migrate from RBD format 1 to format 2 images
There is a mention of that in the RBD man page, but I think we should also document it on ceph.com with some warning ... Alexandre Marangone
01:08 PM Bug #7719 (Resolved): failed to become clean
Stuck in remapped due to deleted pg shard clobbering want_pg_temp from primary.
2014-03-14T08:07:26.245 INFO:teuth...
Samuel Just
01:00 PM rbd Bug #6480: librbd crashed qemu-system-x86_64
Still getting these crashes multiple times a week across a dozen guests on separate hosts. These guests are video sur... Mike Dawson
12:51 PM Feature #7717 (Closed): Per CRUSH bucket used/available space
When using multiple pools spread across different failure domains it can be difficult to know how much space is avail... Alexandre Marangone
12:30 PM devops Feature #7716 (Resolved): Build debug packages for EL6
The debug packages for el6 are not available for releases >0.67.3.
http://ceph.com/rpm-dumpling/el6/x86_64/
http:...
Alexandre Marangone
12:07 PM Bug #7715 (Resolved): Errors in cluster log "[ERR] mon.0" in in upgrade:dumpling-x:parallel-fire...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-13_19:33:10-upgrade:dumpling-x:parallel-firefly---... Yuri Weinstein
11:51 AM Bug #7216 (Can't reproduce): ASSERT AuthMonitor::update_from_paxos on 0.72.2
Joao Eduardo Luis
09:43 AM Bug #7216 (New): ASSERT AuthMonitor::update_from_paxos on 0.72.2
Ian Colle
11:35 AM Bug #7709 (Resolved): osd: RWORDERED does not order reads ack vs write commit
Sage Weil
11:28 AM Bug #7696 (Resolved): osd/ECUtil.cc: 23: FAILED assert(i->second.length() == total_chunk_size)
Sage Weil
11:22 AM Bug #7713 (Duplicate): "Error: finished tid 3 when last_acked_tid was 4" in upgrade:dumpling-x:pa...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-13_19:33:10-upgrade:dumpling-x:parallel-firefly---... Yuri Weinstein
11:05 AM Bug #7692 (Resolved): mon: monitor fails to form quorum
Sage Weil
05:50 AM Bug #7692: mon: monitor fails to form quorum
We are now seeing a more generalized failure where no monitor is able to form quorum (from ceph-deploy):... Alfredo Deza
10:51 AM Bug #7489 (Resolved): `ceph-mon` is silent after non-zero exit status
Sage Weil
06:02 AM Bug #7489: `ceph-mon` is silent after non-zero exit status
Here's the thing: as is, ceph-mon is being run as a daemon. That entails dissociating it from its running terminal, ... Joao Eduardo Luis
05:10 AM Bug #7489: `ceph-mon` is silent after non-zero exit status
Tools like 'ceph-mon' are used by other tools to abstract ceph management and those tools will almost always check ex... Alfredo Deza
04:48 AM Bug #7489: `ceph-mon` is silent after non-zero exit status
This only happens when the monitor is running as a daemon, which I would think is the expected result: all output is ... Joao Eduardo Luis
10:09 AM devops Bug #7627: ceph-disk: does not start daemons properly under systemd
Sage Weil
09:59 AM rbd Bug #7465 (Can't reproduce): krbd: size of disk read or set incorrectly
Sage Weil
09:57 AM Bug #7576: osd: large skew in pg epochs (dumpling)
That doesn't seem like it's addressing the issue the right way. We've deliberately set it so that PGs which don't get... Greg Farnum
09:40 AM Bug #7576: osd: large skew in pg epochs (dumpling)
How about this: in OSDService, add
Mutex pg_epoch_lock;
Cond pg_epoch_cond;
multiset<epoch_t> pg_epochs;
ma...
Sage Weil
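The proposal above is cut off, so the following is only a rough sketch of one way such a structure could work, not the change that was actually made. It assumes the goal is to track every PG's current epoch in a multiset so that the minimum is cheap to read and a caller can block until it advances. The class name PgEpochTracker, its method names, and the 32-bit epoch_t are all hypothetical, and std::mutex / std::condition_variable stand in for Ceph's own Mutex and Cond.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <set>

using epoch_t = std::uint32_t;  // assumed width, for illustration only

// Hypothetical helper: keeps one multiset entry per PG, keyed by that PG's epoch.
class PgEpochTracker {
public:
    // Register a PG that is currently at epoch e.
    void add(epoch_t e) {
        std::lock_guard<std::mutex> l(lock_);
        epochs_.insert(e);
    }

    // Move one PG from old_e to new_e and wake any waiters.
    void update(epoch_t old_e, epoch_t new_e) {
        std::lock_guard<std::mutex> l(lock_);
        erase_one(old_e);
        epochs_.insert(new_e);
        cond_.notify_all();
    }

    // Forget a PG that was at epoch e and wake any waiters.
    void remove(epoch_t e) {
        std::lock_guard<std::mutex> l(lock_);
        erase_one(e);
        cond_.notify_all();
    }

    // Store the smallest tracked epoch in *out; returns false if nothing is tracked.
    bool min_epoch(epoch_t* out) const {
        std::lock_guard<std::mutex> l(lock_);
        if (epochs_.empty()) return false;
        *out = *epochs_.begin();
        return true;
    }

    // Block until no tracked PG is older than `need` (or nothing is tracked).
    void wait_min(epoch_t need) {
        std::unique_lock<std::mutex> l(lock_);
        cond_.wait(l, [&] { return epochs_.empty() || *epochs_.begin() >= need; });
    }

private:
    // Erase exactly one entry equal to e; erase(e) by value would drop every duplicate.
    void erase_one(epoch_t e) {
        auto it = epochs_.find(e);
        assert(it != epochs_.end());
        epochs_.erase(it);
    }

    mutable std::mutex lock_;
    std::condition_variable cond_;
    std::multiset<epoch_t> epochs_;
};
```

A multiset is used here because several PGs can sit at the same epoch, and the smallest element is always available at begin().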
09:50 AM Fix #7711 (New): OpTracker output doesn't include op size for subops
Been discussing the OpTracker dumps on the mailing list and realized the subops don't contain transaction size inform... Greg Farnum
09:49 AM rgw Bug #7703 (Resolved): rgw: fail to copy object > 512k between buckets
d0d21fae483cfaf5556b18b2f0e113c7d418e90c Landed to Firefly Ian Colle
09:44 AM Bug #7626: After updating ceph from 0.75 to 0.77 one of the three monitors can't start
Sage Weil
09:44 AM Bug #7626 (New): After updating ceph from 0.75 to 0.77 one of the three monitors can't start
Ian Colle
09:42 AM Bug #7497 (Can't reproduce): timeout waiting to go clean
Ian Colle
09:34 AM Bug #6003 (Need More Info): journal Unable to read past sequence 406 ...
I think we need a log for this.. hopefully we will hit it now that things are cranked up on the nightlies? Sage Weil
09:21 AM Bug #7710 (Fix Under Review): Multiple rados bench instance will overwrite the metadata object
Ian Colle
06:53 AM Bug #7710: Multiple rados bench instance will overwrite the metadata object
Please help to review the pull request - https://github.com/ceph/ceph/pull/1457 Guang Yang
06:48 AM Bug #7710 (Resolved): Multiple rados bench instance will overwrite the metadata object
rados bench is useful for benchmarking the cluster; however, there is one use case it does not support: read after write... Guang Yang
07:45 AM rbd Bug #7282: Unresponsive rbd-backed Qemu domain causes libvirtd to stall on all connections
So don't forget the patch I got accepted into libvirt a few weeks ago.
If you are running a RBD storage pool it wi...
Wido den Hollander
04:37 AM CephFS Bug #7684: failed cfuse_workunit_kernel_untar_build.yaml test
Zheng Yan

03/13/2014

10:41 PM Bug #7709 (Fix Under Review): osd: RWORDERED does not order reads ack vs write commit
Sage Weil
10:02 PM Bug #7709: osd: RWORDERED does not order reads ack vs write commit
Sage Weil
09:58 PM Bug #7709 (Resolved): osd: RWORDERED does not order reads ack vs write commit
We just fixed ceph_test_rados to wait for commit, not ack. It turns out that the RWORDERED flag is ordering a
wr...
Sage Weil
09:10 PM Bug #7498 (Resolved): stuck in recovery
Sage Weil
09:09 PM Bug #7210: mon: does not validate snapshot removal commands
I would like to see this in the wild before doing the backport. Sage Weil
09:08 PM Bug #7626: After updating ceph from 0.75 to 0.77 one of the three monitors can't start
Jasper Siero wrote:
> I submitted the new log with debug mon = 10 added to the ceph.conf.
> The two processes belo...
Sage Weil
05:09 AM Bug #7626: After updating ceph from 0.75 to 0.77 one of the three monitors can't start
I submitted the new log with debug mon = 10 added to the ceph.conf.
The two processes below also keep running afte...
Jasper Siero
09:04 PM devops Bug #7641 (Resolved): packaging: ceph upgrade from cuttlefish to emperor is incomplete
Sage Weil
09:01 PM CephFS Bug #7708 (Resolved): mds: null deref in issue_caps
... Sage Weil
08:28 PM CephFS Bug #7684: failed cfuse_workunit_kernel_untar_build.yaml test
... Zheng Yan
11:01 AM CephFS Bug #7684: failed cfuse_workunit_kernel_untar_build.yaml test
And again:
http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-12_23:00:26-fs-firefly-testing-basic-plana/127977/...
Greg Farnum
04:35 PM Bug #7648 (Won't Fix): ceph-mon corner case denial of service
This only works on osd.0; for other osds, both dumpling and emperor behave (with bogus output)... Sage Weil
04:28 PM Bug #7706 (Pending Backport): osd: PrioritizedQueue can starve
Sage Weil
01:58 PM Bug #7706 (Resolved): osd: PrioritizedQueue can starve
starvation caused by pq max token limit Samuel Just
04:27 PM Bug #7704 (Resolved): "[ERR] scrub mismatch" in upgrade:dumpling-x:parallel-firefly---basic-pla...
Sage Weil
09:30 AM Bug #7704 (In Progress): "[ERR] scrub mismatch" in upgrade:dumpling-x:parallel-firefly---basic-...
I think this is a manifestation of the full osdmap encoding features, which aren't specified for dumpling and can thu... Sage Weil
08:59 AM Bug #7704: "[ERR] scrub mismatch" in upgrade:dumpling-x:parallel-firefly---basic-plana suite
Also see in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-12_17:27:09-upgrade:dumpling-x:parallel-firefly---... Yuri Weinstein
08:55 AM Bug #7704 (Resolved): "[ERR] scrub mismatch" in upgrade:dumpling-x:parallel-firefly---basic-pla...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-12_17:27:09-upgrade:dumpling-x:parallel-firefly---... Yuri Weinstein
04:10 PM Bug #7635 (Duplicate): failed to recover before timeout expired
Samuel Just
03:32 PM Bug #7208 (Resolved): CEPH_FEATURE_CRUSH_V2 feature mismatch
This was fixed when I fixed up the default crush tunables to be firefly, not optimal. Sage Weil
03:27 PM CephFS Bug #7565 (Resolved): Failed assert in check_rstats
Sage Weil
06:53 AM CephFS Bug #7565: Failed assert in check_rstats
ubuntu@teuthology:/a/teuthology-2014-03-12_04:55:11-multimds-master-testing-basic-plana/127691
failed on trivial_s...
Sage Weil
02:59 PM Bug #7705 (Resolved): ./test/osd/RadosModel.h: 837: FAILED assert(0 == "racing read got wrong ver...
Sage Weil
11:39 AM Bug #7705: ./test/osd/RadosModel.h: 837: FAILED assert(0 == "racing read got wrong version")
wip-7705 Sage Weil
10:01 AM Bug #7705 (Resolved): ./test/osd/RadosModel.h: 837: FAILED assert(0 == "racing read got wrong ver...
... Sage Weil
01:14 PM devops Feature #7239: ceph-deploy: install cephfs java bindings
I deployed Ceph to a cluster that I intended to run Hadoop on. In this case, a typical setup is to have the cephfs cl... Noah Watkins
12:51 PM devops Feature #7239 (Need More Info): ceph-deploy: install cephfs java bindings
Could you please elaborate why you think ceph-deploy should be in charge of installing a dependency like java binding... Alfredo Deza
11:54 AM devops Bug #7598 (Need More Info): ceph-disk-activate error with ceph-deploy
I can't replicate this behavior *at all*. I do know why that is the error that comes about though.
What happens is...
Alfredo Deza
11:20 AM rgw Bug #7703 (Fix Under Review): rgw: fail to copy object > 512k between buckets
Yehuda Sadeh
10:00 AM rgw Bug #7702 (Need More Info): osd thrashing + rgw = timeouts
Sage Weil
09:41 AM Bug #7699 (Duplicate): "failed: thrashosds" in upgrade:dumpling-x:stress-split-firefly---basic-pl...
Sage Weil
09:40 AM Bug #7260 (Can't reproduce): rados api test LibRadosList.ListObjectsNS failed
reopen if we see this now that the enumerator/iterator changes are in place Sage Weil
09:38 AM Bug #7259 (Resolved): ceph mon crash in master branch
Sage Weil
06:29 AM Bug #7593: Disk saturation during PG folder splitting
Thanks Sage very much for the comments.
To begin with, I propose a change here - https://github.com/ceph/ceph/pull...
Guang Yang

03/12/2014

09:02 PM rgw Bug #7526: "ERROR:radosgw_agent.worker:syncing entries for shard 59" in rgw-firefly-distro-basic-...
This fails due to issue #7703. Yehuda Sadeh
08:59 PM rgw Bug #7703 (Resolved): rgw: fail to copy object > 512k between buckets
The new manifest does not hold bucket marker info which is needed, as when we're copying object between buckets we ne... Yehuda Sadeh
06:37 PM Bug #7696: osd/ECUtil.cc: 23: FAILED assert(i->second.length() == total_chunk_size)
Samuel Just
10:07 AM Bug #7696 (Resolved): osd/ECUtil.cc: 23: FAILED assert(i->second.length() == total_chunk_size)
Primary and replica disagree on last_backfill for some reason.
-3> 2014-03-12 03:31:58.711369 7feaff6c9700 10 ...
Samuel Just
06:32 PM Bug #7649 (Resolved): ec ceph_test_rados stuck recovering
Samuel Just
05:32 PM Bug #7626 (Need More Info): After updating ceph from 0.75 to 0.77 one of the three monitors can't...
Sage Weil
05:09 PM Bug #7671 (Resolved): watch should return ENOENT if the object does not exist
Sage Weil
04:04 PM rgw Bug #7702 (Resolved): osd thrashing + rgw = timeouts
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-12_01:35:02-upgrade:dumpling-x:stress-split-firefl... Yuri Weinstein
03:45 PM Bug #7682 (Resolved): osd/ReplicatedPG.cc: 6268: FAILED assert(waiting_for_ondisk.begin()->first ...
Sage Weil
02:12 PM Bug #7682: osd/ReplicatedPG.cc: 6268: FAILED assert(waiting_for_ondisk.begin()->first == repop->v)
Samuel Just
02:46 PM Feature #7701 (Resolved): Filterable querying interface for PGMonitor

On larger clusters with 100,000s or millions of PGs, dumping everything for external tools becomes inefficient.
...
John Spray
02:38 PM Feature #7700 (New): Create a health severity between OK and WARN

I think the name we liked best so far was NOTICE.
For each WARN "health check" a la #7192, reconsider its severi...
John Spray
02:04 PM Bug #7699 (Duplicate): "failed: thrashosds" in upgrade:dumpling-x:stress-split-firefly---basic-pl...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-12_01:35:02-upgrade:dumpling-x:stress-split-firefl... Yuri Weinstein
01:25 PM rbd Bug #7577: rbd info displays extra random char in block prefix
I'm not working on this, just calling imprecations from the bleachers. :) I think the issue is that, if the string ... Dan Mick
09:55 AM rbd Bug #7577: rbd info displays extra random char in block prefix
Dan van der Ster wrote:
> Danny Al-Gaaf wrote:
> > Dan van der Ster wrote:
> > > The other solution would be to co...
Danny Al-Gaaf
09:32 AM rbd Bug #7577: rbd info displays extra random char in block prefix
Danny Al-Gaaf wrote:
> Dan van der Ster wrote:
> > The other solution would be to copy block_name_prefix to a local...
Dan van der Ster
09:29 AM rbd Bug #7577: rbd info displays extra random char in block prefix
Dan van der Ster wrote:
> The other solution would be to copy block_name_prefix to a local null terminated in the fu...
Danny Al-Gaaf
09:24 AM rbd Bug #7577: rbd info displays extra random char in block prefix
The other solution would be to copy block_name_prefix to a local null terminated in the function where it is printed ... Dan van der Ster
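A minimal sketch of the idea in the comment above, assuming the prefix lives in a fixed-size char array that may have no terminating NUL when the name fills the whole buffer. The struct ImageInfo, the constant kPrefixSize, and the helper prefix_to_string are hypothetical stand-ins for illustration, not the librbd definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Assumed buffer size, for illustration only.
constexpr std::size_t kPrefixSize = 24;

// Hypothetical stand-in for an info struct with a fixed-size name field.
struct ImageInfo {
    char block_name_prefix[kPrefixSize];
};

// Copy the prefix into a std::string without reading past the array:
// stop at the first NUL if there is one, otherwise take all kPrefixSize bytes.
std::string prefix_to_string(const ImageInfo& info) {
    const void* nul = std::memchr(info.block_name_prefix, '\0', kPrefixSize);
    std::size_t len = kPrefixSize;
    if (nul != nullptr) {
        len = static_cast<std::size_t>(
            static_cast<const char*>(nul) - info.block_name_prefix);
    }
    assert(len <= kPrefixSize);
    return std::string(info.block_name_prefix, len);
}
```

Streaming the raw array to cout treats it as a C string and keeps reading until it finds a zero byte, which would explain a stray trailing character only when the prefix is at maximum length. Printing the bounded std::string avoids that.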
09:20 AM rbd Bug #7577: rbd info displays extra random char in block prefix
Are you taking a look at it (or do you already have a fix?), or should I?
The problem is that block_name_prefix in rbd...
Danny Al-Gaaf
09:08 AM rbd Bug #7577: rbd info displays extra random char in block prefix
Yeah, but I think it's data-dependent; must be max len and not have a zero byte directly after... Dan Mick
03:17 AM rbd Bug #7577: rbd info displays extra random char in block prefix
Dan Mick wrote:
> Why I feel some guilt here: :)
>
> commit 9a45ffb769c7c821196a8471009d0f9a4216c0d4
> Author: D...
Danny Al-Gaaf
03:05 AM rbd Bug #7577: rbd info displays extra random char in block prefix
I just rebuilt from current git dumpling head on my machine -- cannot reproduce. So I wonder if this is something subtle... Dan van der Ster
01:24 PM rgw Bug #7502: S3 API - deleting object always returns 204 regardless of object is existing or not
Neil Soman wrote:
> So I actually don't think this is a bug. AWS now returns 204 unconditionally. In fact, you get a...
Neil Soman
01:21 PM rgw Bug #7502: S3 API - deleting object always returns 204 regardless of object is existing or not
Neil Soman wrote:
> So I actually don't think this is a bug. AWS now returns 204 unconditionally. In fact, you get a...
Neil Soman
01:21 PM rgw Bug #7502: S3 API - deleting object always returns 204 regardless of object is existing or not
So I actually don't think this is a bug. AWS now returns 204 unconditionally. In fact, you get a 204 even if you atte... Neil Soman
11:57 AM Bug #7695 (Resolved): ceph: ./admin/build-doc check for required tools fails for non debian envs
Sage Weil
09:50 AM Bug #7695 (Resolved): ceph: ./admin/build-doc check for required tools fails for non debian envs
./admin/build-doc runs even if e.g. ditaa or ant isn't installed on the system. The script part checking for the bina... Danny Al-Gaaf
11:31 AM Feature #7192: An easier-to-process health report
This sounds great to me. I assume we'd keep the 'detail' section as optional as it can get quite (!) big. Sage Weil
11:02 AM Feature #7698 (Resolved): Add EC handling to ceph_filestore_dump
David Zafman
10:46 AM Bug #7681 (Resolved): osd/SnapMapper.cc: 270: FAILED assert(check(oid))
Sage Weil
10:45 AM Bug #7256 (Duplicate): ceph osd crashed at ReplicatedPG::trim_object on next
Samuel Just
10:44 AM Bug #7650 (Resolved): rados bench seq tests not performed at correct IO sizes
Sage Weil
10:06 AM Feature #7662: add a "ec profiles" map to OSDMap
s/properties/ec profile/g Loïc Dachary
09:24 AM rbd Bug #7693 (Closed): virsh domblkinfo fails with 'Bad file descriptor'
Found that the 'domblkinfo' command fails while 'domblklist' and 'domblkstat' commands function normally.
## Envir...
Michael Kidd
06:35 AM Bug #7692: mon: monitor fails to form quorum
At first glance, the problem has nothing to do with the 0.0.0.0 IP.
This looks like it's early deployment, therefore t...
Joao Eduardo Luis
06:09 AM Bug #7692 (Resolved): mon: monitor fails to form quorum
For some reason it is looking at 0.0.0.0 and of course that fails:... Alfredo Deza
01:13 AM CephFS Bug #1318: directories disappear across multiple rsyncs
please check if there was file recovery involved (MDCache::do_file_recover ) Zheng Yan

03/11/2014

07:30 PM devops Bug #7641 (Fix Under Review): packaging: ceph upgrade from cuttlefish to emperor is incomplete
It's sort of the other way around. ceph doesn't depend on version of the stuff in ceph-common at all, but ceph-commo... Sage Weil
07:24 PM Bug #7689: librados: ENOENT on ioctx create
Downgrading this weirdness until I see it again. Sage Weil
06:05 PM Bug #7689: librados: ENOENT on ioctx create
Actually, no mon_command creating the pool appears in the logs. This does not jibe with the teuthology log. Sage Weil
04:54 PM Bug #7689 (Duplicate): librados: ENOENT on ioctx create
I think this is a race with pool creation, mon quorum changes, and new client startup getting a not-quite-fresh osdma... Sage Weil
07:23 PM rgw Bug #6889 (Pending Backport): rgw: usage log: don't log system user operations
Sage Weil
09:31 AM rgw Bug #6889 (Fix Under Review): rgw: usage log: don't log system user operations
Yehuda Sadeh
06:22 PM rgw Bug #7687 (Resolved): rgw: bucket creation time is not set
Sage Weil
06:14 PM rgw Bug #7687 (Fix Under Review): rgw: bucket creation time is not set
Yehuda Sadeh
06:09 PM rgw Bug #7687: rgw: bucket creation time is not set
actually it's set, but then it's overwritten. Yehuda Sadeh
03:09 PM rgw Bug #7687 (Resolved): rgw: bucket creation time is not set
Yehuda Sadeh
06:20 PM Bug #7654 (Duplicate): "AssertionError: failed to recover before timeout expired" in teuthology-2...
7635 Sage Weil
06:19 PM Bug #7631 (Resolved): command 'dump_ops_in_flight' not found
fixed by teuthology commit:57259b54beb746b31b70bb3a92de18d604002b0a Sage Weil
06:15 PM Bug #7631 (In Progress): command 'dump_ops_in_flight' not found
David Zafman
06:10 PM Bug #7406 (Duplicate): Seg fault in find_object_context()in recent master rados run
Samuel Just
05:17 PM Bug #7682 (In Progress): osd/ReplicatedPG.cc: 6268: FAILED assert(waiting_for_ondisk.begin()->fir...
David Zafman
12:59 PM Bug #7682 (Resolved): osd/ReplicatedPG.cc: 6268: FAILED assert(waiting_for_ondisk.begin()->first ...
http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-10_02:30:02-rados-firefly-testing-basic-plana/124761/teutholog... David Zafman
04:41 PM Bug #7688 (Won't Fix): warn at fs/btrfs/extent-tree.c:5748 __btrfs_free_extent+0x9ce/0xa20
... Sage Weil
04:38 PM Bug #7663 (Resolved): 2014-03-08T20:39:23.812 INFO:teuthology.task.rados.rados.1.err:[10.214.134....
Sage Weil
01:01 PM Bug #7663 (Fix Under Review): 2014-03-08T20:39:23.812 INFO:teuthology.task.rados.rados.1.err:[10....
Samuel Just
04:17 PM devops Bug #7617: ceph-deploy uninstall should document why it doesn't remove all relevant packages
The problem is that qemu-kvm depends on librbd1, and if you remove the latter, that means removing the former. That ... Dan Mick
04:11 PM rbd Bug #7577: rbd info displays extra random char in block prefix
Why I feel some guilt here: :)
commit 9a45ffb769c7c821196a8471009d0f9a4216c0d4
Author: Dan Mick <dan.mick@inktank...
Dan Mick
04:10 PM rbd Bug #7577: rbd info displays extra random char in block prefix
True, I don't see any attempt to add a nul. Oddly, it looks like there was half a plan to do so in librbd/internal.c... Dan Mick
01:18 PM rbd Bug #7577: rbd info displays extra random char in block prefix
I'm not current on my C++, but I'm pretty sure Danny's correct. Looking at the code, rbd.cc just does a cout of the 2... Dan van der Ster
01:13 PM rbd Bug #7577: rbd info displays extra random char in block prefix
Sent this in email, but: u007f sounds like maybe a 0x7f character?.. Which should be legal 7-bit ASCII, but perhaps i... Dan Mick
07:22 AM rbd Bug #7577: rbd info displays extra random char in block prefix
Here's a bit more with json format:
{"name":"volume-f529978c-0981-4eba-a5b5-7ba8ecc05e1b","size":5368709120,"objec...
Dan van der Ster
06:40 AM rbd Bug #7577: rbd info displays extra random char in block prefix
So this is not a recent regression. I can reproduce on el6.5 with 0.67-0
[root@dvtest1 ceph]# ceph --version
ce...
Dan van der Ster
06:24 AM rbd Bug #7577: rbd info displays extra random char in block prefix
Dan van der Ster wrote:
> My earlier diagnosis 0.67.4 vs 0.67.7 was incorrect -- actually that was an el6 vs ubuntu di...
Dan van der Ster
06:23 AM rbd Bug #7577: rbd info displays extra random char in block prefix
My earlier diagnosis 0.67.4 vs 0.67.7 was incorrect -- actually that was an el6 vs ubuntu difference.
On my ubuntu 0...
Dan van der Ster
04:11 AM rbd Bug #7577: rbd info displays extra random char in block prefix
Tried to reproduce, but couldn't. Looks to me like missing '\0' termination. Do you get this from all of your 0.67.7 ... Danny Al-Gaaf
04:05 PM Bug #7658 (Resolved): PGLog: _merge_object_divergent_entries: fix the case where prior_version ==...
Sage Weil
01:03 PM Bug #7658 (Fix Under Review): PGLog: _merge_object_divergent_entries: fix the case where prior_ve...
Samuel Just
04:05 PM Bug #7657 (Resolved): PGLog: fix proc_replica_log divergent entry selection
Sage Weil
01:04 PM Bug #7657 (Fix Under Review): PGLog: fix proc_replica_log divergent entry selection
Samuel Just
03:15 PM rgw Bug #7676 (Fix Under Review): rgw: multi-part upload incompatible with EC backend
Yehuda Sadeh
03:13 PM Bug #7635: failed to recover before timeout expired
It did eventually recover, as it turns out.
2014-03-06 05:40:41.795123 mon.0 10.214.132.32:6789/0 521 : [INF] pgmap ...
Samuel Just
03:02 PM Bug #7679 (Resolved): mds: stuck on TMAP2OMAP check incorrectly
Sage Weil
12:07 PM Bug #7679: mds: stuck on TMAP2OMAP check incorrectly
commit:b4fbe4f81348be74c654f3dae1c20a961b99c895 and a later commit fixed feature forwarding, which is needed for the ... Sage Weil
12:04 PM Bug #7679: mds: stuck on TMAP2OMAP check incorrectly
... Sage Weil
10:09 AM Bug #7679 (Resolved): mds: stuck on TMAP2OMAP check incorrectly
ubuntu@teuthology:/a/teuthology-2014-03-10_10:33:21-upgrade:dumpling-x:parallel-firefly---basic-plana/124951
mds r...
Sage Weil
03:00 PM Bug #7686 (Duplicate): osd spinning in agent_work
Sage Weil
02:37 PM Bug #7686: osd spinning in agent_work
agent_entry() is hitting a case where there appears to be work to do, but we never evict a hit set archive. In the l... David Zafman
02:08 PM Bug #7686: osd spinning in agent_work
FYI, this was/is on burnupiX after setting an OSD out and bringing it back in. Mark Nelson
02:05 PM Bug #7686 (Duplicate): osd spinning in agent_work
Probably related to the num_dirty/archived hitset issue?
2014-03-11 16:05:15.959964 7fc99340a700 20 osd.0 212 agen...
Samuel Just
02:44 PM devops Feature #7047 (Resolved): rhel7: build process for rbd.ko, ceph.ko kernel modules
The kmods are built as a part of the Red Hat kmod packaging (#6986) at https://github.com/ceph/ceph-kmod-rpm
Speci...
Ken Dreyer
02:40 PM Bug #7681: osd/SnapMapper.cc: 270: FAILED assert(check(oid))
Samuel Just
12:52 PM Bug #7681 (Resolved): osd/SnapMapper.cc: 270: FAILED assert(check(oid))

http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-10_02:30:02-rados-firefly-testing-basic-plana/124677/teuthol...
David Zafman
02:21 PM rgw Bug #7661 (Resolved): rgw: s3_multipart_upload.pl and s3_user_quota.pl tests fail
Sage Weil
02:21 PM rgw Bug #7375 (Resolved): s3_user_quota.pl fails
Sage Weil
09:07 AM rgw Bug #7375: s3_user_quota.pl fails
Sage Weil
02:20 PM rgw Bug #7374 (Resolved): s3_multipart_upload.pl fails
Sage Weil
09:07 AM rgw Bug #7374: s3_multipart_upload.pl fails
Sage Weil
02:01 PM Bug #7646 (Duplicate): osd/PGLog.cc: 291: FAILED assert(i->prior_version == last)
Samuel Just
01:59 PM Bug #7673 (Duplicate): "reached maximum tries" in /teuthology-2014-03-09_03:00:01-rados-firefly-t...
Samuel Just
01:59 PM Bug #7673: "reached maximum tries" in /teuthology-2014-03-09_03:00:01-rados-firefly-testing-basic...
I've turned up the timeout in the tests. Samuel Just
01:59 PM Bug #7673: "reached maximum tries" in /teuthology-2014-03-09_03:00:01-rados-firefly-testing-basic...
The non-ec ones are probably just an inadequate timeout -- the cleanup is likely to take longer than the writeout. T... Samuel Just
01:57 PM CephFS Bug #7685 (Can't reproduce): hung/failed teuthology test: cfuse_workunit_misc
http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-07_23:00:50-fs-firefly-testing-basic-plana/122094
http://qa-p...
Greg Farnum
01:46 PM CephFS Bug #7684 (Resolved): failed cfuse_workunit_kernel_untar_build.yaml test
http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-09_23:00:25-fs-firefly-testing-basic-plana/124157/
The teut...
Greg Farnum
01:37 PM Bug #7093 (Resolved): osd: peering can send messages prior to auth
Backported to Emperor and Dumpling Ian Colle
01:35 PM Bug #7468 (Duplicate): "scrub stat mismatch" error in rbd-master-testing-basic-plana suite
Samuel Just
01:31 PM Bug #7653 (Duplicate): "FAILED assert(!old_value.has_contents())" in rados-firefly-testing-basic-...
Samuel Just
01:24 PM Bug #5823 (Can't reproduce): cpu load on cluster node is very high, client can't get data on pg ...
Samuel Just
01:20 PM Bug #7626: After updating ceph from 0.75 to 0.77 one of the three monitors can't start
Can you please rerun the monitor with 'debug mon = 10' and attach the resulting log? Joao Eduardo Luis
01:20 PM Bug #7448 (Duplicate): os/FileJournal.cc: FAILED assert(fd >= 0)
Samuel Just
01:11 PM Bug #7545 (Duplicate): rados: notify was not received in ceph_test_rados_watch_notify with thrash...
Samuel Just
01:10 PM Bug #7611 (In Progress): All mon nodes crash when running "ceph tell osd.X" and using the "versio...
Joao Eduardo Luis
11:43 AM Bug #7674 (Resolved): Cache pool configuration stalls
Sage Weil
11:00 AM Bug #7674 (Fix Under Review): Cache pool configuration stalls
Sage Weil
10:44 AM rgw Bug #7450: "radosgw-admin key create" ignores specified access key when subuser specified
No, subusers are useful for S3 as well. Maybe this was by accident, but I'm relying on it already in production, for ... Robin Johnson
09:09 AM rgw Bug #7450: "radosgw-admin key create" ignores specified access key when subuser specified
Robin, is that still an issue (considering my previous comment), or should we close this one? Yehuda Sadeh
10:42 AM rbd Bug #7125: Assertion failure in rbd_img_obj_callback()
Probably related are:
http://tracker.ceph.com/issues/5876
http://tracker.ceph.com/issues/5662
http://tracker.cep...
Ilya Dryomov
10:29 AM Linux kernel client Bug #7069 (Resolved): CephFS hang when using fscache - several "blocked for more than 120 seconds...
This is fixed in 8fb883f3e30065529e4f35d4b4f355193dcdb7a2, according to milosz. Fixed in kernel 3.13. Sage Weil
09:48 AM Linux kernel client Bug #7069 (Need More Info): CephFS hang when using fscache - several "blocked for more than 120 s...
There are several fscache fixes in the testing branch; perhaps they address this? Sage Weil
10:27 AM Bug #7672 (Resolved): PG::choose_acting: run recoverable_predicate without CRUSH_ITEM_NONE
Sage Weil
10:18 AM Bug #7592 (Resolved): hit_set_trim() removal races with backfill
Sage Weil
10:09 AM Bug #7592 (Fix Under Review): hit_set_trim() removal races with backfill
David Zafman
10:15 AM rgw Feature #7680 (Resolved): Use new civetweb git repo for ceph
It seems that the original repo from which the ceph civetweb repo was forked is outdated. The current README on https://g... Danny Al-Gaaf
09:51 AM Linux kernel client Bug #2759: libceph: crush tree algorithm is not understood
We should be testing CRUSH with multiple algorithms Ian Colle
09:44 AM Linux kernel client Bug #2224 (Rejected): Oops in __cfh_to_dentry
Sage Weil
09:44 AM Linux kernel client Bug #5244 (Rejected): btrfs hang on tree lock, 3.9 kernel
Sage Weil
09:37 AM rbd Bug #4045 (Resolved): snap unprotect on a snapshot that is already unprotected throws inappropria...
e91fb910653a672560867d4a81aa30f9d5dc0af8 Ian Colle
09:35 AM rbd Bug #7582 (Resolved): "FAIL: test_rbd.TestImage.test_remove_with_watcher" in upgrade:dumpling-x-f...
Sage Weil
09:34 AM rbd Bug #7583 (Resolved): "librbd::ImageCtx: error finding header" in upgrade:dumpling-x-firefly---ba...
Sage Weil
09:33 AM rbd Bug #7625 (Resolved): ceph_test_rados_api_tier: not found
Sage Weil
09:32 AM rbd Bug #6577 (Can't reproduce): arm testing: rbd test segfaults at FAILED assert((bool)_front == (bo...
Sage Weil
09:21 AM rgw Bug #7543 (Resolved): rgw: off-by-one bug in rgw_trim_whitespace()
Sage Weil
09:18 AM rgw Bug #6913 (Duplicate): valgrind issues when running rgw tests
Sage Weil
09:17 AM rgw Bug #6802 (Rejected): ARM: rgw_swift failure (internal server error, 500)
Sage Weil
09:17 AM rgw Bug #6696 (Can't reproduce): Upgrade rgw failure in nightly tests. (/home/ubuntu/cephtest/s3-test...
Sage Weil
09:15 AM rgw Bug #6765 (Rejected): ARM: RGW s3 tests fail.
Sage Weil
09:13 AM rgw Feature #7589: rgw: configurable chunk size
Josh - please review wip-7589 Ian Colle
09:12 AM rgw Feature #7589 (Fix Under Review): rgw: configurable chunk size
Sage Weil
09:10 AM devops Bug #6453: libapache2-mod-fastcgi Packages for Debian Squeeze have incorrect dependencies
Sage Weil
09:08 AM rgw Bug #6911: rgw test failure on the arm set up
This sounds like a 32-bit int (size_t?) overflow. Sage Weil
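As a rough illustration of the kind of overflow suspected here, a byte count held in a 32-bit unsigned integer wraps modulo 2^32. This is a sketch only: the helper name `wrap_u32` and the 5 GiB size are made up for the example and are not taken from the bug report or from the rgw code.

```python
U32_MODULUS = 1 << 32  # number of distinct values a 32-bit unsigned integer can hold


def wrap_u32(value: int) -> int:
    """Return the value as a 32-bit unsigned integer would store it."""
    return value % U32_MODULUS


# Hypothetical object size of 5 GiB, chosen only for illustration.
size_bytes = 5 * 1024**3
stored = wrap_u32(size_bytes)

# On a platform where size_t is 32 bits wide, the stored length is no longer
# the real length once the real length reaches 4 GiB.
print(f"real size: {size_bytes} bytes, stored in 32 bits: {stored} bytes")
```

If the real cause is of this kind, the symptom would typically appear only on 32-bit builds (such as the ARM setup named in the title) and only for sizes of 4 GiB or more.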
09:07 AM rgw Bug #7524 (Duplicate): "scrub stat mismatch" error in rgw-firefly-distro-basic-plana suite
Sage Weil
09:06 AM rgw Bug #7597 (Duplicate): hang in rados/test.sh
the mon assert is a dup Sage Weil
07:18 AM CephFS Bug #7474 (Won't Fix): Kernel oops with cephfs [ceph_write_begin -> *x8 -> wait_on_page_read]
This looks like it's the writeback deadlock when trying to flush from the client to the OSD on a single memory-constr... Greg Farnum
07:17 AM CephFS Bug #6599 (Resolved): client: invalid iterator dereference in Client::trim_caps
Sage Weil
07:12 AM CephFS Feature #5486: kclient: make it work with selinux
Hmm, Sage notes that maybe it'll work now that we support ACLs. Or maybe we can use a special mount option? Greg Farnum
07:00 AM CephFS Bug #2187 (Can't reproduce): pjd chown/00.t failed test 97
Sage Weil
06:59 AM CephFS Bug #2740 (Resolved): mds: crash in Objecter when shutting down too early
Sage Weil

03/10/2014

05:11 PM Bug #7593: Disk saturation during PG folder splitting
At a high level, sure, if you know ahead of time how many objects per PG you expect you can pre-hash the PG directori... Sage Weil
01:26 AM Bug #7593: Disk saturation during PG folder splitting
Hi Sage,
If we would like to make the following changes:
# Bring in a new configuration flag which can be used to...
Guang Yang
05:08 PM devops Bug #7677: Troubleshoot ceph-setup-nightly Jenkins failures
Alfredo - please review. Ian Colle
04:57 PM devops Bug #7677 (Fix Under Review): Troubleshoot ceph-setup-nightly Jenkins failures
Here is the code change to skip the Debian package diffs if they are not present: https://github.com/ceph/ceph-build/... Ken Dreyer
04:55 PM devops Bug #7677 (Resolved): Troubleshoot ceph-setup-nightly Jenkins failures
h3. Background:
Ceph builds in Jenkins are broken up into three separate jobs: ceph-setup, ceph-build, and ceph-pa...
Ken Dreyer
05:05 PM devops Tasks #7678 (Resolved): f20 Jenkins slave
In #7094, we got a new Fedora 20 gitbuilder. We need to add this host to Jenkins as well.
I don't think there are ...
Ken Dreyer
04:38 PM Bug #7575 (Resolved): osd/ReplicatedPG.cc: 10600: FAILED assert(r >= 0): hit_set_persist() races ...
135c27ec74be352416d06a9d0ad78e63cf477433 David Zafman
04:28 PM rgw Bug #7676 (Resolved): rgw: multi-part upload incompatible with EC backend
Multipart upload uses omap on the data object. Need to provide a solution for this. Either we switch the data format for ... Yehuda Sadeh
04:16 PM devops Cleanup #7675: clean up Gary Lowell's WIP branches
*wip-doc-prereq* (was "5a811a":https://github.com/ceph/ceph/commit/5a811a0894db2619f5f916c0be85459d0f481265) has been... Ken Dreyer
03:44 PM devops Cleanup #7675 (In Progress): clean up Gary Lowell's WIP branches
Ken Dreyer
03:26 PM devops Cleanup #7675: clean up Gary Lowell's WIP branches
*wip-build-doc* has been rebased and submitted for merging here: https://github.com/ceph/ceph/pull/1415 Ken Dreyer
03:18 PM devops Cleanup #7675: clean up Gary Lowell's WIP branches
For the record, today I've reviewed and deleted the following wip branches from Gary.
*wip-lazy-cuttlefish-gl* (wa...
Ken Dreyer
03:13 PM devops Cleanup #7675 (Resolved): clean up Gary Lowell's WIP branches
Gary Lowell had contributed a couple of work-in-progress branches that are still outstanding. Here are the seven that... Ken Dreyer
02:20 PM Bug #7393: osd: scrub stat mismatch, got 9/9 objects, 0/0 clones, 9/4 dirty, 0/0 whiteouts, 26738...
I believe the error is in the cache pool. Samuel Just
02:20 PM Bug #7393: osd: scrub stat mismatch, got 9/9 objects, 0/0 clones, 9/4 dirty, 0/0 whiteouts, 26738...
[ERR] 4.3 scrub stat mismatch, got 48/48 objects, 0/0 clones, 21/23 dirty, 9/9
We are overestimating dirty now, ...
Samuel Just
02:17 PM Bug #7674: Cache pool configuration stalls
This was on 0.77-735-gd171418-1saucy which is a couple of days old. I can retest on firefly from today if there have... Mark Nelson
02:15 PM Bug #7674 (Resolved): Cache pool configuration stalls
In testing the following cache pool configuration:... Mark Nelson
02:09 PM Bug #7672: PG::choose_acting: run recoverable_predicate without CRUSH_ITEM_NONE
Samuel Just
01:32 PM Bug #7672 (Resolved): PG::choose_acting: run recoverable_predicate without CRUSH_ITEM_NONE
Samuel Just
02:09 PM Bug #7671: watch should return ENOENT if the object does not exist
Samuel Just
12:56 PM Bug #7671 (Resolved): watch should return ENOENT if the object does not exist
Currently, new_obs->exists doesn't get updated, so we create the log entry as a delete which messes everything up. Samuel Just
02:08 PM Bug #7673: "reached maximum tries" in /teuthology-2014-03-09_03:00:01-rados-firefly-testing-basic...
There seem to be several of those, so 'major' Yuri Weinstein
02:07 PM Bug #7673 (Duplicate): "reached maximum tries" in /teuthology-2014-03-09_03:00:01-rados-firefly-t...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-09_03:00:01-rados-firefly-testing-basic-plana/1238... Yuri Weinstein
12:58 PM CephFS Bug #1318: directories disappear across multiple rsyncs
By “this” I meant files with different timestamps from what they were last set to, as in the first paragraph of comme... Alexandre Oliva
12:51 PM CephFS Bug #1318: directories disappear across multiple rsyncs
I'm afraid this still occurs quite often with ceph 0.77 and ceph.ko 3.13.6-gnu. I have a slightly better understandi... Alexandre Oliva
10:23 AM Bug #7592 (In Progress): hit_set_trim() removal races with backfill
David Zafman
10:09 AM Bug #7651 (Resolved): "HEALTH_WARN mds a is laggy" timed-out after 900 sec in -upgrade:dumpling-x...
Sage Weil
09:40 AM Cleanup #7668: remove custom googletest (gtest) copy from source tree
I have no expertise here and am down with making this change if it works out better, but when we initially grabbed gt... Greg Farnum
07:43 AM Cleanup #7668 (Resolved): remove custom googletest (gtest) copy from source tree
Remove the potentially outdated gtest code from the source tree to keep it up-to-date:
Needed steps:
- since ...
Danny Al-Gaaf
09:32 AM devops Bug #7669 (New): obscure traceback when a partition of type TOBE_UUID can't be mounted
Steps to reproduce:... Loïc Dachary
03:20 AM devops Bug #7605: startup script /etc/init.d/ceph has incorrect slash
I meant:
the correct line should be:
r = sprintf("%.2f", d) instead of
r = sprintf(\"%.2f\", d)
Jan-Willem Michels
02:33 AM Bug #7667 (Resolved): Fix ceph code to build with llvm (clang/clang++)
Currently the code doesn't compile with clang++ due to non-standard C++ usage of VLAs (http://clang.llvm.org/compatibil... Danny Al-Gaaf
12:45 AM rgw Bug #7526: "ERROR:radosgw_agent.worker:syncing entries for shard 59" in rgw-firefly-distro-basic-...
Fixing the agent problem did not make the test pass. I don't think the remainder is an agent issue, but haven't looke... Josh Durgin
 
