Activity

From 10/02/2014 to 10/31/2014

10/31/2014

05:27 PM Feature #9981: osd: cache: proxy writes (instead of unconditionally promoting)
One thing we'll need to be careful about when not promoting is how we handle snapshots. I don't remember exactly how ... Greg Farnum
01:55 PM Feature #9981 (Resolved): osd: cache: proxy writes (instead of unconditionally promoting)
This should work similar to the read recency checks that don't always promote on first read, but give the cache osd a... Sage Weil
05:16 PM Bug #9985 (Resolved): osd: incorrect atime calculation
https://github.com/ceph/ceph/pull/2816 should be backported Sage Weil
05:10 PM CephFS Tasks #3680 (Rejected): deduplication in ceph
we should discuss this on the email list Sage Weil
04:38 PM RADOS Bug #9984: lttng_probe_unregister hangs on shutdown
maybe related to dynamic+static linking of lttng? Josh Durgin
04:16 PM RADOS Bug #9984 (New): lttng_probe_unregister hangs on shutdown
... Sage Weil
04:31 PM Bug #9976 (Resolved): ceph cli injectargs parsing broken
Sage Weil
02:25 PM Bug #9976: ceph cli injectargs parsing broken
Dan Mick
02:25 PM Bug #9976: ceph cli injectargs parsing broken
close; needs "if injectargs and ..", but that seems good Dan Mick
02:17 PM Bug #9976: ceph cli injectargs parsing broken
Maybe as simple as... Dan Mick
09:11 AM Bug #9976 (Resolved): ceph cli injectargs parsing broken
looks like it was the recent -- handling that broke?... Sage Weil
02:21 PM phprados Feature #424: Stream wrappers
Charles du Jeu wrote:
> Hi! Maybe I'm totally at the wrong place, if so, sorry for that.
> Was there any work done...
Wido den Hollander
08:27 AM phprados Feature #424: Stream wrappers
Hi! Maybe I'm totally at the wrong place, if so, sorry for that.
Was there any work done on that (streamwrapper imp...
Charles du Jeu
02:14 PM Bug #9983 (Resolved): Cleanup boost optionals for boost 1.56
This patch cleans up fatal errors with boost 1.56 when implicitly converting optionals to non-optional values.
It ...
William Kennington
01:58 PM RADOS Feature #9982 (New): osd: cache: make writes in readonly mode invalidate and then forward 
Sage Weil
01:36 PM Feature #9980 (Resolved): osd: cache: proxy reads during promote
wip-promote-forward may be a useful base, although I think it is not quite correct (we should proxy reads, not forwar... Sage Weil
01:35 PM Feature #9979 (Resolved): osd: cache: proxy reads (instead of redirect)
Sage Weil
01:12 PM Bug #9974 (Won't Fix): Osd-s bind only to 1st network in "public network"
OSDs bind and listen on only a single IP by design. Changing it would require major changes to how we handle identity... Greg Farnum
04:58 AM Bug #9974 (Won't Fix): Osd-s bind only to 1st network in "public network"
OSD daemons bind only to first network in ceph.conf "public network" parameter.
ceph.conf:
[global]
cluster netw...
elder one
10:57 AM Bug #9978 (Closed): keyvaluestore: void ECBackend::handle_sub_read
On 0.87 "Giant" I'm repeatedly hit by the following assert, typically crashing 4 OSDs at once:... Dmitry Smirnov
10:48 AM CephFS Bug #9977 (Resolved): cephfs-journal-tool falsely reports invalid start_ptr

This is happening when the journal expire_pos isn't at an object boundary. The expected start_ptr counter is being...
John Spray
10:03 AM CephFS Feature #1398: qa: multiclient file io test
... Anonymous
09:20 AM RADOS Bug #9911 (Rejected): ceph not placing replicas to OSDs on same host as down/out OSD
ah, it's because the vary_r tunable is false. we fixed this bug in firefly. switching to firefly tunables will reso... Sage Weil
08:35 AM Bug #9752 (Resolved): acting in past intervals contains primary and up_primary (looks like duplic...
Sage Weil
02:27 AM Bug #9752 (Fix Under Review): acting in past intervals contains primary and up_primary (looks lik...
* firefly https://github.com/ceph/ceph/pull/2847
* giant https://github.com/ceph/ceph/pull/2846
Loïc Dachary
08:28 AM devops Tasks #8366: Update ceph.com/docs to default to the latest major release (0.80)
John any updates on this? It is a bummer that we have all the infrastructure/services ready to deal with the redirect... Alfredo Deza
08:26 AM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
Any news on the backport? Christopher Thorjussen
08:19 AM Linux kernel client Bug #9894: kcephfs: rm -r left files behind
Merged the userspace version of this; is there a separate ticket for that? Greg Farnum
06:19 AM devops Feature #8303: ceph-extra packages for newer Ubuntu versions
Bump.. ceph-extras still not available for Trusty 14.04. David Moreau Simard
04:32 AM rgw Bug #9973 (Resolved): Validation of Swift DLO manifest object ETag doesn't match OpenStack Swift ...
The way the RGW Swift API validates the ETag on DLO manifest objects does not match the way the OpenStack Swift imple... Mike Dorman

10/30/2014

09:32 PM Bug #9971 (Duplicate): OSD crashes again after restarting due to op thread time out at writing pg...
This crash was observed when one OSD was restarted after being down for a long time; it crashed again
because its op t...
Zhi Zhang
08:39 PM Feature #9954: buffer: method to ensure an extent is contiguous
Haomai Wang wrote:
> Hmm, just another approach.
> Maybe we can use another interface called "get_range" for the ...
Sage Weil
06:49 PM Feature #9954: buffer: method to ensure an extent is contiguous
Hmm, just another approach.
Maybe we can use another interface called "get_range" for the same goal.
| 1M byte...
Haomai Wang
01:34 PM Feature #9954 (Resolved): buffer: method to ensure an extent is contiguous
Add a method to ensure that an extent in a bufferlist is contiguous. Something like
bufferlist bl;
...
char *...
Sage Weil
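The idea in #9954 can be modelled outside Ceph. The sketch below is not the bufferlist API (the C++ signature proposed in the ticket is elided above); it treats a buffer list as a plain Python list of `bytes` segments and coalesces every segment that touches the requested extent into one. The function name `make_contiguous` and its parameters are hypothetical.

```python
def make_contiguous(segments, offset, length):
    """Return a new segment list in which bytes [offset, offset + length)
    live inside a single segment (hypothetical model, not Ceph code)."""
    end = offset + length
    total = sum(len(s) for s in segments)
    if offset < 0 or length <= 0 or end > total:
        raise ValueError("extent out of range")

    before, touching, after = [], [], []
    pos = 0
    for seg in segments:
        seg_start, seg_end = pos, pos + len(seg)
        pos = seg_end
        if seg_end <= offset:
            before.append(seg)      # entirely before the extent
        elif seg_start >= end:
            after.append(seg)       # entirely after the extent
        else:
            touching.append(seg)    # overlaps the extent: must be merged

    # Joining copies the overlapping segments into one contiguous buffer.
    return before + [b"".join(touching)] + after


spanning = make_contiguous([b"abc", b"def", b"ghi"], 2, 3)
already_contiguous = make_contiguous([b"abc", b"def", b"ghi"], 3, 2)
```

In this model, an extent that already sits in one segment leaves the list unchanged, so the copy is only paid when the extent crosses a segment boundary.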
08:33 PM Feature #9966: librados: set user_version operation
My recollection is that we preserve them when moving objects in/out of the cache tier. I assume we want them to also... Sage Weil
06:38 PM Feature #9966: librados: set user_version operation
What's the purpose of this? User versions are "user" only in the sense that they're the versions we expose to them as... Greg Farnum
02:08 PM Feature #9966 (New): librados: set user_version operation
Sage Weil
08:26 PM Feature #9953: osd: efficient ObjectStore::Transaction encoding
haomai's slides Sage Weil
01:31 PM Feature #9953 (Resolved): osd: efficient ObjectStore::Transaction encoding
Haomai and Dong proposed a vastly improved Transaction encoding during CDS. Video is here:
https://www.youtube.co...
Sage Weil
06:33 PM Bug #9752 (Pending Backport): acting in past intervals contains primary and up_primary (looks lik...
Greg Farnum
04:53 PM Bug #9752 (Fix Under Review): acting in past intervals contains primary and up_primary (looks lik...
https://github.com/ceph/ceph/pull/2843 Loïc Dachary
02:36 PM Bug #9752: acting in past intervals contains primary and up_primary (looks like duplicates but is...
On a cluster running from sources with... Loïc Dachary
01:52 PM Bug #9752: acting in past intervals contains primary and up_primary (looks like duplicates but is...
... Loïc Dachary
01:44 PM Bug #9752: acting in past intervals contains primary and up_primary (looks like duplicates but is...
Fortunately I saved an entire osd directory from which I was able to extract osdmaps with duplicates related to attac... Loïc Dachary
10:36 AM Bug #9752: acting in past intervals contains primary and up_primary (looks like duplicates but is...
... Loïc Dachary
10:33 AM Bug #9752: acting in past intervals contains primary and up_primary (looks like duplicates but is...
Could it be that "the acting vector":https://github.com/ceph/ceph/blob/giant/src/osd/osd_types.h#L1391 size is no... Loïc Dachary
03:26 PM Bug #9970 (Resolved): document erasure coded pool simple operations
Move part of http://ceph.com/docs/master/dev/erasure-coded-pool/#interface to the rados operation guide and fix the i... Loïc Dachary
03:01 PM Bug #9969 (Can't reproduce): osd: crash in delete, tcmalloc, PGLog::write_log (dumpling)
... Sage Weil
02:17 PM Feature #8633 (Duplicate): allow writes before recovering a replica
see #7861 Sage Weil
02:10 PM RADOS Feature #9967 (New): rados: pool rollback
roll back an entire pool to a previous snapshot. this is O(n): we enumerate objects and call rollback() on each one. Sage Weil
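The O(n) shape described in #9967 can be sketched as a loop over an object listing. This is only an illustration under stated assumptions: `list_objects` and `rollback` are hypothetical callables standing in for whatever enumeration and per-object rollback calls the real tool would use, and no librados call is implied.

```python
def rollback_pool(list_objects, rollback, snap_name):
    """Call rollback(name, snap_name) once per object; return how many
    objects were visited. Both callables are hypothetical stand-ins."""
    visited = 0
    for name in list_objects():
        rollback(name, snap_name)   # one rollback per object, hence O(n)
        visited += 1
    return visited


# Tiny in-memory stand-in for a pool, used only to exercise the loop.
calls = []
count = rollback_pool(
    lambda: iter(["obj-a", "obj-b", "obj-c"]),
    lambda name, snap: calls.append((name, snap)),
    "snap1",
)
```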
02:07 PM Feature #9965 (New): rados: new import from pipe/file
- use file format from ceph_objectstore_tool and new export (#9964)
- take care to preserve snapshot state
- preser...
Sage Weil
02:05 PM Feature #9964 (Resolved): rados: new export [range] to pipe/file
- export a range of hash values (or the entire pool) to stdout (or a file).
- use the same format that ceph_objectst...
Sage Weil
02:03 PM Feature #9963 (Fix Under Review): librados: improve get_objects and get_position interfaces
The requirement is that export (or some other user) needs to be able to
#. partition the hash space into N segment...
Sage Weil
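The first requirement listed in #9963, partitioning the hash space into N segments, might look like the sketch below. The rest of the requirement list is truncated above, so only this step is illustrated. The 32-bit width is an assumption of the sketch, and `partition_hash_space` is a hypothetical name, not a librados function.

```python
def partition_hash_space(n, bits=32):
    """Split [0, 2**bits) into n contiguous half-open ranges whose sizes
    differ by at most one (hypothetical helper; bit width is assumed)."""
    if n <= 0:
        raise ValueError("n must be positive")
    total = 1 << bits
    # Integer arithmetic keeps the boundaries exact and gap-free.
    bounds = [(i * total) // n for i in range(n + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(n)]


segments = partition_hash_space(4)
```

Each range could then be handed to a separate worker, since together the ranges cover the space with no overlap.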
01:59 PM Feature #9962 (In Progress): osd: kill 'category' in stats and public API
Sage Weil
01:58 PM Feature #9962 (Resolved): osd: kill 'category' in stats and public API
Sage Weil
01:56 PM Feature #9961 (Resolved): osd: new MOSDClientSubOp and Reply
Discussed during CDS here:
http://pad.ceph.com/p/hammer-fixed_memory_layout
http://youtu.be/CTp4eP9kPok
Create...
Sage Weil
01:48 PM Feature #9960 (Resolved): osd: adjust hint(s) for replica vs primary writes
We should generally DONTNEED on replicas, regardless of what the client asked us to do. Sage Weil
01:48 PM Feature #9959 (Resolved): osd: pass client fadvise hints through to objecstore
Sage Weil
01:47 PM Feature #9958 (Resolved): osd: add fadvise op to Objectstore::Transaction
Add fadvise op to ObjectStore::Transaction. Mirror posix_fadvise(2).
See #9957.
Sage Weil
01:45 PM Feature #9957 (Resolved): librados: add fadvise op
Add an fadvise operation to ObjectOperation. Mirror posix_fadvise(2).
Add it right around here: https://github.co...
Sage Weil
01:45 PM Feature #9956 (Resolved): osd: reenable alloc hints if kernel is known to be safe
Sage Weil
01:42 PM Bug #9480 (Resolved): OSD is crashing while object deletion
Samuel Just
01:37 PM Feature #9955 (Resolved): osd: allow encoded bufferlist to be used in place of map<K,V> for kv APIs
This will avoid encode/reencode overhead to convert things to an STL structure. Eventually, once we pass through the... Sage Weil
01:31 PM RADOS Feature #9952: osd: smarter choice of primary to minimize recovery disruption
We currently choose the first up osd as the primary unless it is impossible to do so. But, we can do better: other o... Sage Weil
01:30 PM RADOS Feature #9952 (New): osd: smarter choice of primary to minimize recovery disruption
Sage Weil
01:29 PM Feature #7862 (In Progress): allow backfill/recovery while below min_size
Sage Weil
01:28 PM Feature #9951 (New): librados, osd: per-object scrub operation
librados operation to scrub a single object. Sage Weil
01:27 PM Feature #9950 (New): rados: add ability to read a specific replica/shard from CLI
Sage Weil
01:25 PM Feature #9949 (New): librados: add ability to read a specific replica or shard
Part of making scrub/repair work is being able to explicitly fetch any copy or shard of an object.  Extend librados to ... Sage Weil
01:24 PM Feature #9948 (New): osd: add scrub result query interface
This will use the admin interface (ceph tell <pgid> ...), similar to 'ceph tell <pgid> query'. results in json. see... Sage Weil
01:24 PM Feature #9947 (New): osd: store scrub error state in kv store; clear on peering event 
Sage Weil
11:39 AM Bug #9944 (Pending Backport): objecter: pool dne checks not correct
Sage Weil
10:56 AM Bug #9944 (Fix Under Review): objecter: pool dne checks not correct
https://github.com/ceph/ceph/pull/2839
Sage Weil
09:06 AM Bug #9944 (Resolved): objecter: pool dne checks not correct
... Sage Weil
11:08 AM Bug #9942 (Won't Fix): Debian armhf packages are missing in latest repo updates for Debian in Fir...
we don't build (and never have built) armhf packages for ceph.com.
we do have a bunch of armv7l hardware and did build...
Sage Weil
04:26 AM Bug #9942 (Won't Fix): Debian armhf packages are missing in latest repo updates for Debian in Fir...
I'm trying to install Ceph with ceph-deploy on an armhf cluster but it failed:
[MS0][ERROR ] RuntimeError: command ...
Jasper Siero
10:35 AM Bug #9750: pg incomplete
... Loïc Dachary
10:11 AM CephFS Feature #1398: qa: multiclient file io test
A task that implements this could be useful for testing calamari as well (I manually did some of the things needed he... Anonymous
10:08 AM CephFS Feature #1398 (In Progress): qa: multiclient file io test
Anonymous
10:04 AM Bug #9945 (Resolved): giant: MClientSession COMPAT_VERSION is 2, should be 1
yup! Sage Weil
09:52 AM Bug #9945 (Fix Under Review): giant: MClientSession COMPAT_VERSION is 2, should be 1
https://github.com/ceph/ceph/pull/2837
https://github.com/ceph/ceph/pull/2838
John Spray
09:41 AM Bug #9945 (Resolved): giant: MClientSession COMPAT_VERSION is 2, should be 1
John Spray
09:37 AM CephFS Feature #9881 (In Progress): mds: admin command to flush the mds journal
John Spray
07:55 AM Bug #9916: osd: crash in check_ops_in_flight
The crash happened with radosgw as the client, so I guess it is formed by objecter - https://github.com/ceph/ceph/blo... Guang Yang
04:36 AM Feature #9943 (In Progress): osd: mark pg and use replica on EIO from client read
Copy the below email thread and open an issue to track the enhancement.... Guang Yang
02:56 AM Bug #9941 (Rejected): rados command line crashes when trying to copy pool snapshot
We are exploring options to regularly preserve the contents of the pools backing our rados gateways. For that we crea... Daniel Schneller
12:53 AM Bug #8797: "ceph status" do not exit with python_2.7.8
Just a note that people are hitting this in fedora 21, now:
https://bugzilla.redhat.com/show_bug.cgi?id=1155335
Boris Ranto

10/29/2014

09:34 PM CephFS Feature #9940: uclient: be more robust when dealing with outstanding RADOS IO and stale caps
While in the general case it is necessary to fence clients that have become unresponsive to the MDS, this type of "so... John Spray
09:23 PM CephFS Feature #9940 (New): uclient: be more robust when dealing with outstanding RADOS IO and stale caps
If we've given IO to the Objecter and our caps go stale, we need to do something to handle it. Greg Farnum
09:06 PM CephFS Bug #1666 (Resolved): hadoop: time-related meta-data problems
We now take client timestamps for almost everything, so this should no longer be a problem and I'm closing it unless ... Greg Farnum
07:13 PM Bug #9939 (Resolved): "giant" no longer log scrub errors
Scrubbing problematic PGs no longer reports found errors: there are no more records of discovered errors in ... Dmitry Smirnov
02:49 PM Bug #9916 (Need More Info): osd: crash in check_ops_in_flight
how is the OSDOp being formed? this looks like a bug on the client side to me. the attr ops should have name_len by... Sage Weil
02:45 PM Bug #9910 (Pending Backport): ceph_test_rados: out of order, probably due to message delay logic
Sage Weil
11:22 AM Bug #9910 (Fix Under Review): ceph_test_rados: out of order, probably due to message delay logic
https://github.com/ceph/ceph/pull/2832 Sage Weil
01:16 PM Feature #9776: try to make address sanitizer work
Ok, so the gcc version required to make this work is only a month or two old (dynamic linking bug fix). So, we're go... Samuel Just
01:11 PM Bug #9875 (Pending Backport): stuck recovering due to unfound hit_set object
Samuel Just
11:44 AM rbd Bug #9936: Exporting images larger than 2GB fails
PR: https://github.com/ceph/ceph/pull/2828 Jason Dillaman
11:43 AM rbd Bug #9936 (Resolved): Exporting images larger than 2GB fails
An lseek64 result code is copied into an int32, causing an overflow for large images. Jason Dillaman
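The overflow described in #9936 can be reproduced in miniature. The sketch below is not the rbd code; it only models what happens when a 64-bit offset is stored in a signed 32-bit integer. The helper `as_int32` is a name made up for this illustration.

```python
import ctypes


def as_int32(value):
    """Model storing a 64-bit value in a signed 32-bit int (hypothetical helper)."""
    # ctypes integer types do no overflow checking, so the value wraps silently.
    return ctypes.c_int32(value).value


small_offset = 1 * 1024**3   # 1 GiB fits in an int32
large_offset = 3 * 1024**3   # 3 GiB does not

small_result = as_int32(small_offset)
large_result = as_int32(large_offset)
```

Past 2 GiB the stored value goes negative, which a caller checking for `< 0` would read as an error even though the seek itself succeeded.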
11:37 AM RADOS Bug #9911: ceph not placing replicas to OSDs on same host as down/out OSD
Sorry, forgot that the majority agreement does not work with two replicas. Everything is ok now. Andrey Korolyov
10:44 AM RADOS Bug #9911: ceph not placing replicas to OSDs on same host as down/out OSD
Andrey Korolyov wrote:
> Can confirm placement mess on giant: I am backfilling one node from another one within two-...
Sage Weil
10:41 AM RADOS Bug #9911: ceph not placing replicas to OSDs on same host as down/out OSD
Can confirm placement mess on giant: I am backfilling one node from another one within two-node cluster. After today`... Andrey Korolyov
11:17 AM Linux kernel client Bug #9928: kernel BUG at fs/ceph/caps.c:2307!
The very first error message is:... Zheng Yan
10:37 AM Linux kernel client Bug #9928: kernel BUG at fs/ceph/caps.c:2307!
... Sage Weil
08:30 AM Linux kernel client Bug #9928: kernel BUG at fs/ceph/caps.c:2307!
MDS cache dump at ~/jcsp/9928/cachedump.1870.mds0 on teuthology.
This was taken at around 0800 local, long after t...
John Spray
07:55 AM Linux kernel client Bug #9928 (Resolved): kernel BUG at fs/ceph/caps.c:2307!

Client's view of its operations:...
John Spray
11:04 AM CephFS Bug #9935: client: segfault on ceph_rmdir path "/"
Yes, EBUSY is what a local filesystem gives you, so that sounds right to me. John Spray
10:48 AM CephFS Bug #9935 (Resolved): client: segfault on ceph_rmdir path "/"
A segfault occurs when removing the root directory. What is the expected behavior? I think -EBUSY is what makes sense. Noah Watkins
10:00 AM Bug #9891: "Assertion: os/DBObjectMap.cc: 1214: FAILED assert(0)" in upgrade:firefly-x-giant-dist...
does not appear to be a ceph issue.. either bad disk or leveldb corruption or something. lowering priority. Sage Weil
09:54 AM rgw Documentation #9934 (Closed): rgw: document backing pool capabilities and API usage
Document what RGW is capable of in terms of defining multiple backing RADOS pools and how they can be used via the S3... Sage Weil
09:52 AM rgw Feature #9933 (New): rgw: implement S3 RR (reduced redundancy) API
- mark a particular backing pool as the 'rr' one
- make RGW understand the S3 API for RR and use that backing pool f...
Sage Weil
09:51 AM rgw Feature #9932 (Resolved): rgw: map swift X-Storage-Policy header to rgw pools
This will let people use the new Swift "storage policies" API to use the preexisting RGW functionality Sage Weil
09:29 AM Subtask #9931 (New): create selinux policies for ceph-mon, ceph-osd, ceph-mds
From an internal red hat discussion:
There are probably three distinct things we need to do to get cephs and
SELi...
Sage Weil
09:27 AM Cleanup #9930 (New): gtest: update, move to submodule
the version we have is very old. update to a newer version, and possibly/probably move to a submodule. Sage Weil
05:25 AM Bug #9927: RHEL: selinux-policy-targeted rpm update triggers slow requests
Here's a solution:... Dan van der Ster
03:46 AM Bug #9927: RHEL: selinux-policy-targeted rpm update triggers slow requests
It is triggered by fixfiles -C /etc/selinux/targeted/contexts/files/file_contexts.pre restore... Dan van der Ster
03:35 AM Bug #9927 (Can't reproduce): RHEL: selinux-policy-targeted rpm update triggers slow requests
We observe slow requests while updating a server to RHEL6.6. The upgrade includes selinux-policy-targeted, which runs... Dan van der Ster
12:11 AM Bug #9919 (Resolved): tests: qa/workunits/cephtool/test.sh injectargs instability
Loïc Dachary

10/28/2014

10:52 PM Feature #9926 (Resolved): AsyncMessenger: Support kqueue interface for BSD and mac osx OS
Haomai Wang
09:14 PM Bug #9910: ceph_test_rados: out of order, probably due to message delay logic
... Sage Weil
09:08 PM Bug #9910: ceph_test_rados: out of order, probably due to message delay logic
wip-9910 Sage Weil
09:00 PM Bug #9910: ceph_test_rados: out of order, probably due to message delay logic
yeah, almost certain this is a bug with delayed messages. testing a fix.
ubuntu@teuthology:/a/sage-bug-9910-a/576723
Sage Weil
04:25 PM Bug #9910 (In Progress): ceph_test_rados: out of order, probably due to message delay logic
reproducing with client ms logs Sage Weil
05:35 PM Bug #9752: acting in past intervals contains primary and up_primary (looks like duplicates but is...

I happened to notice the issue because I happened to look at this guy's pastebin.  I didn't interact with him at all.  N...
David Zafman
03:46 PM Bug #9752: acting in past intervals contains primary and up_primary (looks like duplicates but is...
It is unfortunately gone... Loïc Dachary
03:00 PM Bug #9752: acting in past intervals contains primary and up_primary (looks like duplicates but is...
First thing we want to get is an osdmap from the misbehaving epoch.
Loic: you can get the osdmap for a particular ...
Samuel Just
02:21 PM Bug #9752: acting in past intervals contains primary and up_primary (looks like duplicates but is...
Actually, that thread is the same instance as david's. Samuel Just
02:10 PM Bug #9752: acting in past intervals contains primary and up_primary (looks like duplicates but is...
See the thread "[ceph-users] Troubleshooting Incomplete PGs" for another instance of this (and there are several more... Greg Farnum
05:08 PM Bug #9921: msgr/osd/pg dead lock giant
https://github.com/ceph/ceph/pull/2825 Greg Farnum
04:56 PM Bug #9921 (Fix Under Review): msgr/osd/pg dead lock giant
wip-9921, totally untested. Greg Farnum
02:51 PM Bug #9921: msgr/osd/pg dead lock giant
From what I recall, none of these are simple locks to get rid of. I'm not actually sure how to go about it; even some... Greg Farnum
02:14 PM Bug #9921: msgr/osd/pg dead lock giant
SimpleMessenger lock is held by an accepting Pipe trying to replace an old Pipe:... Greg Farnum
01:50 PM Bug #9921: msgr/osd/pg dead lock giant
nvm, different deadlock Samuel Just
01:49 PM Bug #9921 (Duplicate): msgr/osd/pg dead lock giant
just kidding, this appears to be 9898 Samuel Just
11:03 AM Bug #9921 (Resolved): msgr/osd/pg dead lock giant
commit:2d6980570af2226fdee0edfcfe5a8e7f60fae615
/a/teuthology-2014-10-27_02:32:02-rados-giant-distro-basic-multi/5...
Samuel Just
03:42 PM Bug #9750: pg incomplete
I'm afraid these maps are lost... Loïc Dachary
03:22 PM Bug #9750: pg incomplete
Yeah, you'll want maps from back when the acting set was wonky. Might want to look into the past intervals code perh... Samuel Just
02:25 PM Bug #9919 (Fix Under Review): tests: qa/workunits/cephtool/test.sh injectargs instability
https://github.com/ceph/ceph/pull/2823 Loïc Dachary
09:42 AM Bug #9919 (Resolved): tests: qa/workunits/cephtool/test.sh injectargs instability
By modifying *osd_debug_drop_ping_probability = '444'* it introduces a side effect on the cluster that can create pro... Loïc Dachary
12:43 PM CephFS Bug #9900 (Duplicate): Failure in multiple_rsync (directories wrongly appear changed)
I imagine this is a dup of #9894? Greg Farnum
12:24 PM Linux kernel client Bug #5429: libceph: rcu stall, null deref in osd_reset->__reset_osd->__remove_osd
I bet there is another trace of this somewhere, no rcu stall, just plain NULL deref in rb_erase(). Will try to inves... Ilya Dryomov
11:36 AM Linux kernel client Bug #5429: libceph: rcu stall, null deref in osd_reset->__reset_osd->__remove_osd
Got reports of the 2nd trace (http://tracker.ceph.com/issues/5429#note-7) occurring on a kernel with the notify fixes. Josh Durgin
12:18 PM CephFS Bug #9800 (Pending Backport): client-limits test is not passing
I don't know that we need/want to try and push this in before release (although since it's all guarded inside of a br... Greg Farnum
05:29 AM CephFS Bug #9800 (Resolved): client-limits test is not passing
... John Spray
11:21 AM Bug #9920: admin socket check hang, osd appears fine
Hmm, osd.4 seems fine, not sure why the admin socket check didn't work. Samuel Just
10:00 AM Bug #9920 (Can't reproduce): admin socket check hang, osd appears fine
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-27_17:18:01-upgrade:firefly-x-giant-distro-basic-v... Yuri Weinstein
11:12 AM CephFS Bug #8255 (Fix Under Review): mds: directory with missing object cannot be removed
https://github.com/ceph/ceph/pull/2821 Zheng Yan
08:54 AM Bug #9288 (Resolved): "Assertion `nlock == 0' failed" in upgrade:firefly-firefly-testing-basic-vp...
Yuri Weinstein
08:52 AM rgw Bug #9866 (Resolved): "test_s3.test_multipart_upload ... ERROR" in upgrade:firefly:older-firefly-...
Yuri Weinstein
04:23 AM rgw Bug #9918: RGW-Swift: SubUser access permissions, does not seems to work
2014-10-28 16:43:28.776693 7f5cd87c0700 1 civetweb: 0x7f5d2c0093f0: 127.0.0.1 - - [28/Oct/2014:16:43:28 +0530] "GET ... pushpesh sharma
04:18 AM rgw Bug #9918 (Resolved): RGW-Swift: SubUser access permissions, does not seems to work
Create users and sub-users in generic development env:-
This is relevant json DS:-
{ "user_id": "user1",
"disp...
pushpesh sharma
03:58 AM rgw Bug #9917: RADOSGW: Not able to create Swift objects with erasure coded pool
2014-10-28 15:59:41.468515 7f0863fef700 20 RGWEnv::set(): HTTP_HOST: localhost:8000
2014-10-28 15:59:41.468583 7f086...
pushpesh sharma
03:58 AM rgw Bug #9917: RADOSGW: Not able to create Swift objects with erasure coded pool
able to create rados object:-
#./ceph osd pool create mypool 20 20 erasure
DEVELOPER MODE: setting PATH, PYTHONPA...
pushpesh sharma
03:56 AM rgw Bug #9917 (Won't Fix): RADOSGW: Not able to create Swift objects with erasure coded pool
ceph@Ubuntu14:~/ceph-0.86/src$ MON=3 MDS=0 RGW=1 OSD=3 ./vstart.sh -d -n -x -r
going verbose **
[./fetch_config /tm...
pushpesh sharma

10/27/2014

10:21 PM Bug #9916 (Resolved): osd: crash in check_ops_in_flight
Assertion failure:... Guang Yang
07:44 PM Bug #9915 (Resolved): osd: eviction logic reversed
commit:622c5ac Sage Weil
06:17 PM CephFS Feature #4138 (Fix Under Review): MDS: forward scrub: add functionality to verify disk data is co...
This bit at least has been isolated and put into a PR:
https://github.com/ceph/ceph/pull/2814
Greg Farnum
04:56 PM Bug #9910: ceph_test_rados: out of order, probably due to message delay logic
another one: ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-10-27_02:32:02-rados-giant-distro-basic-m... Sage Weil
01:15 PM Bug #9910 (Resolved): ceph_test_rados: out of order, probably due to message delay logic
/a/samuelj-2014-10-24_23:51:24-rados-wip-sam-testing-wip-testing-vanilla-fixes-basic-multi/571220
* commit:f7431cc...
Samuel Just
04:23 PM CephFS Bug #9870 (Resolved): kernel: not handling cap_flush_ack messages properly
Zheng Yan
03:33 PM Linux kernel client Bug #9894: kcephfs: rm -r left files behind
Zheng Yan
03:25 PM rbd Bug #9391: fio rbd driver rewrites same blocks
@Mark: I have to take a look at fio for that. Is this all about sequential writes only? Do you see a different behavi... Danny Al-Gaaf
02:44 PM Bug #9891: "Assertion: os/DBObjectMap.cc: 1214: FAILED assert(0)" in upgrade:firefly-x-giant-dist...
2014-10-25 18:56:23.243456 7fefdcc3e700 20 filestore dbobjectmap: seq is 485
2014-10-25 18:56:23.243559 7fefdd43f700...
Samuel Just
02:31 PM Bug #9913 (Resolved): mon: audit log entires for forwarded requests lack info
... Sage Weil
02:27 PM Bug #9912 (Won't Fix): ceph osd up # not a valid command in 0.80.7
there is no way to administratively make an osd 'up'.  the daemon needs to go through its startup procedure and join... Sage Weil
02:24 PM Bug #9912 (Won't Fix): ceph osd up # not a valid command in 0.80.7
There is a valid command for setting an osd down:... Mark Nelson
02:16 PM RADOS Bug #9911: ceph not placing replicas to OSDs on same host as down/out OSD
ceph -s output with an OSD down and type host:... Mark Nelson
02:11 PM RADOS Bug #9911 (Rejected): ceph not placing replicas to OSDs on same host as down/out OSD
On a 3 node firefly cluster with 6 OSDs per host and 3x replication, when noup is set and 1 OSD is marked down/out, a... Mark Nelson
01:38 PM Feature #9598: re-enable Objecter fast dispatch
Sage Weil
01:13 PM Bug #9909 (Resolved): lost_unfound test/rados tool flawed, EEXIST when putting empty object
ubuntu@teuthology:/a/samuelj-2014-10-24_23:51:24-rados-wip-sam-testing-wip-testing-vanilla-fixes-basic-multi/571037
...
Samuel Just
01:09 PM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
ubuntu@teuthology:/a/samuelj-2014-10-24_23:51:24-rados-wip-sam-testing-wip-testing-vanilla-fixes-basic-multi/571474/r... Samuel Just
11:31 AM rgw Bug #9877: In some cases it's possible for rgw to segfault on http COPY
You mean #9226 ? Anonymous
11:16 AM rgw Bug #9907 (Resolved): radosgw-admin: can't disable max_size quota
From pull request, by Dong Lei:... Yehuda Sadeh
11:05 AM Linux kernel client Feature #9906 (Resolved): Inline data support

Currently the fuse client supports CEPH_FEATURE_MDS_INLINE_DATA but the kernel client does not.
John Spray
10:28 AM CephFS Bug #9904 (Resolved): Don't crash MDS on clients sending messages with bad seq
Currently in Server::handle_client_session, we do this:... John Spray
10:14 AM CephFS Feature #9903 (Resolved): Recover lost dirfrag via data pool

[While the MDS cluster is offline and journal has been flushed if necessary]
Given that a particular dirfrag obj...
John Spray
10:10 AM Bug #9731: Ceph 0.80.6 OSD crashes
Ok, let me know what happens. Samuel Just
10:09 AM Bug #9731: Ceph 0.80.6 OSD crashes
Nothing reported from valgrind. Also haven't seen crashes lately. At this point I'm thinking the issues were corrup... Brad House
10:06 AM Feature #9902 (Duplicate): Tool for RADOS import/export pool to file

To assist with CephFS disaster recovery, provide the ability to dump an entire pool (the cephfs metadata pool) to a...
John Spray
10:00 AM Support #9901 (New): libgoogle-perftools4: tcmalloc performance regression on armhf
Just to keep track of https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=766986 Loïc Dachary
09:36 AM CephFS Bug #9900 (Duplicate): Failure in multiple_rsync (directories wrongly appear changed)

http://pulpito.ceph.com/teuthology-2014-10-24_23:08:01-kcephfs-giant-testing-basic-multi/570840/
http://pulpito.ce...
John Spray
09:34 AM rgw Bug #9892: radosgw_admin.py: failed len(out['entries']) == 0 on usage show
Seems like a broken test. We write an object here:... Yehuda Sadeh
09:09 AM rgw Bug #9148: rgw: multiregion tests failing, s3tests.functional.test_s3.test_region_copy_object
Seems that the slow_backend param has not been applied on the s3tests giant branch. Yehuda Sadeh
09:08 AM rgw Bug #9148: rgw: multiregion tests failing, s3tests.functional.test_s3.test_region_copy_object
in latest run, still trying to copy the 100M:... Yehuda Sadeh
09:09 AM devops Bug #9747 (Resolved): ceph.spec.in will always use 95-ceph-osd-alt.rules
Loïc Dachary
08:24 AM Bug #9702: "MaxWhileTries: 'wait_until_healthy'reached maximum tries" in upgrade:firefly-x-giant-...
Update in run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-26_18:13:01-upgrade:firefly-x-giant-distro-basic... Yuri Weinstein
06:05 AM CephFS Bug #9800: client-limits test is not passing
https://github.com/ceph/ceph/pull/2809
http://pulpito.front.sepia.ceph.com/john-2014-10-27_13:05:29-fs:recovery-wip-...
John Spray

10/26/2014

07:54 PM Bug #9895 (Duplicate): Master/giant branch: OSD deadlock during recovery
#9898 Sage Weil
11:24 AM Bug #9895 (Duplicate): Master/giant branch: OSD deadlock during recovery
Given an eight-OSD, two-node cluster (node01 and node04), three mons (node01, node04, twin2). OSDs placed on node04 acts... Andrey Korolyov
04:51 PM rbd Bug #9391: fio rbd driver rewrites same blocks
Hi Guys,
This is all on the fio side. From what I remember, when you are doing sequential writes and specify mult...
Mark Nelson
03:33 PM rgw Bug #9899 (Resolved): Error "coverage ceph osd pool get '' pg_num" in upgrade:dumpling-dumpling-d...
Seems related to rgw and 3-upgrade-sequence/upgrade-osd-mon-mds.yaml configurations... Yuri Weinstein
02:33 PM Messengers Bug #9898: osd: fast dispatch deadlock in mark_down (giant)
Looks like the same as I reported some hours before: #9895. Please close mine or this one as a duplicate. Andrey Korolyov
12:19 PM Messengers Bug #9898: osd: fast dispatch deadlock in mark_down (giant)
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-10-24_21:12:40-rados-wip-sam-testing-distro-basic-multi/570144 Sage Weil
12:18 PM Messengers Bug #9898: osd: fast dispatch deadlock in mark_down (giant)
full backtrace Sage Weil
12:17 PM Messengers Bug #9898 (Resolved): osd: fast dispatch deadlock in mark_down (giant)
this is basically a dup of the issue we saw with fast dispatch in the objecter, but with the osd.... Sage Weil
11:49 AM rbd Bug #9855 (Resolved): rbd "Segmentation fault" in upgrade:firefly:singleton-firefly-distro-basic-...
fixed test Sage Weil
11:48 AM Linux kernel client Bug #9896: krbd: EPERM from map-snapshot-io.sh
ubuntu@teuthology:/a/teuthology-2014-10-24_23:06:01-krbd-giant-testing-basic-multi/570827 too Sage Weil
11:48 AM Linux kernel client Bug #9896 (Resolved): krbd: EPERM from map-snapshot-io.sh
... Sage Weil
11:24 AM Linux kernel client Bug #9894 (Resolved): kcephfs: rm -r left files behind
... Sage Weil
11:21 AM rgw Bug #9148: rgw: multiregion tests failing, s3tests.functional.test_s3.test_region_copy_object
also
ubuntu@teuthology:/a/teuthology-2014-10-24_23:02:01-rgw-giant-distro-basic-multi/570719
ubuntu@teuthology:/a/t...
Sage Weil
11:16 AM rgw Bug #9148: rgw: multiregion tests failing, s3tests.functional.test_s3.test_region_copy_object
teuthology-2014-10-24_23:02:01-rgw-giant-distro-basic-multi/570701 fails with slow_backend:true on giant.... Sage Weil
11:19 AM rgw Bug #9892 (Resolved): radosgw_admin.py: failed len(out['entries']) == 0 on usage show
... Sage Weil
08:42 AM Bug #9891 (Resolved): "Assertion: os/DBObjectMap.cc: 1214: FAILED assert(0)" in upgrade:firefly-x...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-25_18:13:01-upgrade:firefly-x-giant-distro-basic-m... Yuri Weinstein
05:03 AM Subtask #9890: mon: VIRT usage 2.4G larger than tcmalloc's VIRT stats (dumpling, centos6.3)
forgot to mention that leveldb stores for all mons are several GB large, even after compaction:... Joao Eduardo Luis
04:57 AM Subtask #9890: mon: VIRT usage 2.4G larger than tcmalloc's VIRT stats (dumpling, centos6.3)
mon.c (in quorum) is being the synchronization provider for mon.b (restarted with valgrind memcheck).
mon.c's spik...
Joao Eduardo Luis
04:39 AM Subtask #9890 (Can't reproduce): mon: VIRT usage 2.4G larger than tcmalloc's VIRT stats (dumpling...

* centos 6.3
* ceph version 0.67.11 (bc8b67bef6309a32361be76cd11fb56b057ea9d2)
* Stressing the monitors with qa...
Joao Eduardo Luis
04:26 AM Bug #9889 (Closed): mon: leveldb weirdness
Inquiries on leveldb on the monitors and weirdness sometimes associated.
This ticket is being used to track severa...
Joao Eduardo Luis

10/25/2014

11:04 AM Feature #9888: AsyncMessenger: Async event threads can shared by all AsyncMessenger
+1 Sage Weil
07:32 AM Feature #9888 (Resolved): AsyncMessenger: Async event threads can shared by all AsyncMessenger
Now, each AsyncMessenger will create "ms_async_op_threads" threads which will process incoming/outgoing connections.... Haomai Wang

10/24/2014

09:27 PM Bug #9727: 0.86 EC+ KV OSDs crashing
Not sure; I'm still waiting for a crash on the master branch. Haomai Wang
06:10 AM Bug #9727: 0.86 EC+ KV OSDs crashing
Will this issue be solved in Giant? Kenneth Waegeman
02:38 PM Bug #9746: reconcile upstream ceph.spec.in with other ceph.spec (SuSE, EPEL, etc)
https://build.opensuse.org/package/show/home:netsroth/ceph Loïc Dachary
02:06 PM Bug #9731: Ceph 0.80.6 OSD crashes
Still no crashes under valgrind? How many osds are running under valgrind? We should probably leave it running for ... Samuel Just
01:20 PM Bug #9480 (Pending Backport): OSD is crashing while object deletion
Sage Weil
11:40 AM rgw Bug #9866: "test_s3.test_multipart_upload ... ERROR" in upgrade:firefly:older-firefly-distro-basi...
yuri, please close it when we get a pass on the nightlies. Tamilarasi muthamizhan
11:35 AM rgw Bug #9886 (Resolved): rgw: apache 2.4 does not send http status reason string
There's an issue with certain apache 2.4 versions, where it doesn't send back the http status reason in the response.... Yehuda Sadeh
11:34 AM rgw Bug #9878 (Pending Backport): rhel7 s3-tests fail due to missing reason
commit:a9dd4af Sage Weil
11:26 AM rbd Bug #8912: librbd segfaults when creating new image (rbd-ephemeral-clone-stable-icehouse)
For better searchability, the backtrace for this crash is:... Josh Durgin
11:24 AM rbd Bug #9513 (Pending Backport): rbd_cache=true default setting is degading librbd performance ~10X ...
reverted the backport for now as fully fixing the ObjectCacher is too large a change close to the giant release Josh Durgin
11:14 AM CephFS Bug #9884: too many files in /usr for multiple_rsync.sh
Yeah, just cutting it down to a more predictable/smaller directory sounds good to me. Greg Farnum
10:50 AM CephFS Bug #9884: too many files in /usr for multiple_rsync.sh
one failure http://pulpito.ceph.com/teuthology-2014-10-20_23:04:01-fs-giant-distro-basic-multi/562537/ Zheng Yan
10:49 AM CephFS Bug #9884 (Closed): too many files in /usr for multiple_rsync.sh
for example, plana81 has 60k files in /usr, but plana90 has 90k files in /usr. perhaps multiple_rsync should /usr/src... Zheng Yan
10:51 AM Bug #9873 (Resolved): rados bench crash
Sage Weil
10:15 AM Bug #9873: rados bench crash
ubuntu@teuthology:/a/samuelj-2014-10-23_17:44:53-rados-wip-sam-testing-wip-testing-vanilla-fixes-basic-multi/567665 Samuel Just
09:38 AM Bug #9873 (Fix Under Review): rados bench crash
https://github.com/ceph/ceph/pull/2795 Sage Weil
09:07 AM Bug #9873 (In Progress): rados bench crash
Sage Weil
09:53 AM CephFS Feature #3882 (Rejected): Hide snapshot directory name in mount/mtab
we can now restrict snap access by uid... Sage Weil
09:49 AM CephFS Feature #9883 (Resolved): journal-tool: smarter scavenge (conditionally update dir objects)
Sage Weil
09:42 AM CephFS Feature #9881 (Resolved): mds: admin command to flush the mds journal
Sage Weil
09:41 AM CephFS Feature #9880 (Resolved): mds: more gracefully handle EIO on missing dir object
Sage Weil
08:53 AM rgw Bug #9877: In some cases it's possible for rgw to segfault on http COPY
looks like #9266. Yehuda Sadeh

10/23/2014

09:35 PM rgw Bug #9878: rhel7 s3-tests fail due to missing reason
Sage Weil
06:10 PM rgw Bug #9878 (Resolved): rhel7 s3-tests fail due to missing reason
commit:a9dd4af401328e8f9071dee52470a0685ceb296b Sage Weil
06:08 PM rgw Bug #9169 (Resolved): 100-continue broken for centos/rhel
Sage Weil
04:58 PM rgw Bug #9877 (Resolved): In some cases it's possible for rgw to segfault on http COPY

on 0.80.4
-81> 2014-10-23 22:22:05.586898 7f83547f8700 1 ====== starting new request req=0x7f8368013400 ==...
Anonymous
03:03 PM Bug #9876 (Resolved): failed pull needs to allow mark_unfound_lost revert eventually
Samuel Just
01:50 PM rgw Bug #9616 (Resolved): upgrade test restarts rgw, test gets 500
Sage Weil
01:47 PM CephFS Bug #9869 (Pending Backport): Client: not handling cap_flush_ack messages properly
I tested this manually by applying a patch that sets the starting tid value to 65535 and looking at the logs. That causes im... Greg Farnum
01:47 PM rbd Bug #9854: librbd: reads contending for cache space can cause livelock
Reads thrashing the cache can be reproduced with:... Josh Durgin
01:44 PM Bug #9821 (Pending Backport): failed to recover before timeout expired
Sage Weil
09:41 AM Bug #9821 (Fix Under Review): failed to recover before timeout expired
Samuel Just
12:47 PM CephFS Bug #9870: kernel: not handling cap_flush_ack messages properly
Zheng Yan
12:43 PM Bug #9372: injectarg boolean option is discarded
There is a workaround (using --), not sure it deserves backporting. Loïc Dachary
12:41 PM Bug #9372 (Resolved): injectarg boolean option is discarded
Loïc Dachary
11:38 AM rbd Feature #9733: Separate rbd listing into CAP
Is the list of OSD class methods documented somewhere? Robert LeBlanc
11:37 AM Bug #9731: Ceph 0.80.6 OSD crashes
Other details as per sjustwork on irc:
* 3-node ceph cluster, 2 OSDs per node (1ssd 1hdd). All ssds are assigned ...
Brad House
10:34 AM Bug #9731: Ceph 0.80.6 OSD crashes
backtrace from last core... Brad House
10:19 AM Bug #9731: Ceph 0.80.6 OSD crashes
Forgot to attach latest core file from the crash prior to testing with valgrind when running wip-9731 Brad House
11:30 AM Bug #9836: mon unit tests use the wrong id
Although it could be backported to giant and firefly, it does not create actual problems. Only some tests use the mon... Loïc Dachary
11:28 AM Bug #9836 (Resolved): mon unit tests use the wrong id
Loïc Dachary
09:59 AM Bug #9408 (Pending Backport): erasure-code: misalignment
It can't be easily cherry picked because the code has changed. That can happen on firefly too. Backporting would make... Loïc Dachary
09:44 AM Bug #9874: ceph_test_rados, out of order ops
- exec:
client.0:
- ceph osd pool create base 4
- ceph osd pool create cache 4
- ceph osd tier ad...
Samuel Just
08:54 AM Bug #9874 (Duplicate): ceph_test_rados, out of order ops
2014-10-22T17:06:21.115 INFO:tasks.rados.rados.0.burnupi60.stderr:Error: finished tid 3 when last_acked_tid was 7
20...
Samuel Just
09:21 AM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
ubuntu@teuthology:/a/samuelj-2014-10-22_14:27:22-rados-wip-sam-testing-wip-testing-vanilla-fixes-basic-multi/566853/r... Samuel Just
09:07 AM Bug #9875 (Resolved): stuck recovering due to unfound hit_set object
The hitset creation log entries have the same version for version and prior_version. This causes divergent entry det... Samuel Just
08:50 AM Bug #9873 (Resolved): rados bench crash
2014-10-23T00:25:06.570 INFO:tasks.radosbench.radosbench.0.mira034.stderr:osdc/Objecter.cc: 3971: FAILED assert(!tick... Samuel Just
08:49 AM devops Fix #5900: Create a Python package for ceph Python bindings
https://github.com/ceph/ceph/compare/wip-5900 Loïc Dachary
04:01 AM rgw Feature #8562 (Fix Under Review): rgw: Conditional PUT on ETag
Xiangyu Lv

10/22/2014

09:15 PM Documentation #9872 (Closed): erasure-code: document the LRC per layer plugin configuration
It is possible to set the profile on a per layer basis using the low level configuration http://ceph.com/docs/master/... Loïc Dachary
06:16 PM Bug #9731: Ceph 0.80.6 OSD crashes
We don't really want leak-check; it is likely slowing down the osds more than necessary. Samuel Just
05:23 PM Bug #9731: Ceph 0.80.6 OSD crashes
so far no luck replicating this with... Sage Weil
04:45 PM Bug #9731: Ceph 0.80.6 OSD crashes
We probably want to let them run under valgrind overnight if possible. Samuel Just
03:32 PM Bug #9731: Ceph 0.80.6 OSD crashes
Right, I couldn't get 3/3 under valgrind to ever come up to a good health, probably because of the load on it. Howev... Brad House
03:27 PM Bug #9731: Ceph 0.80.6 OSD crashes
(Last I heard, 2/3 were running valgrind, cluster is healthy)
Question: what version are the clients?
Samuel Just
08:16 AM Bug #9731: Ceph 0.80.6 OSD crashes
the 3rd OSD won't join, it is now always aborting at startup. log attached. Perhaps all the starting/stopping has c... Brad House
08:01 AM Bug #9731: Ceph 0.80.6 OSD crashes
after installing wip-9731 but before running under valgrind, I received a crash at 2014-10-22 10:44:42.326583 log at... Brad House
07:51 AM Bug #9731: Ceph 0.80.6 OSD crashes
I've got ceph updated to the wip-9731, and am attempting to start the OSDs under valgrind. However, the first one ap... Brad House
05:34 PM CephFS Bug #9870 (Resolved): kernel: not handling cap_flush_ack messages properly
This is the analogue to #9869, which Zheng tells me is also a problem in the kernel. We need to downcast the message ... Greg Farnum
05:30 PM CephFS Bug #9869: Client: not handling cap_flush_ack messages properly
Waiting for this to build so it can be tested. Greg Farnum
05:28 PM CephFS Bug #9869 (Resolved): Client: not handling cap_flush_ack messages properly
We saw a log segment that contained this:... Greg Farnum
04:47 PM Fix #9566 (Fix Under Review): osd: prioritize recovery of OSDs with most work to do
Here is a draft for review: https://github.com/ceph/ceph/pull/2778 if this sounds reasonable I'll write tests. Otherw... Loïc Dachary
02:47 PM Documentation #9867 (Closed): PGs per OSD documentation needs clarification
Documentation in question:
http://ceph.com/docs/master/rados/operations/placement-groups/
http://ceph.com/docs/mast...
Michael Kidd
02:37 PM rgw Bug #9169: 100-continue broken for centos/rhel
The problem seems to be unrelated to the fastcgi module. The actual issue is that we're running the apache with mpm co... Yehuda Sadeh
01:15 PM Bug #9864: osd doesn't report new stats for 3 hours when running test LibCephFS.MulticlientSimple
I think it's a monitor bug. It took about two hours to commit an update... Zheng Yan
11:06 AM Bug #9864: osd doesn't report new stats for 3 hours when running test LibCephFS.MulticlientSimple
let's add debug to the test yaml so that we have logs next time? Sage Weil
10:59 AM Bug #9864: osd doesn't report new stats for 3 hours when running test LibCephFS.MulticlientSimple
there is no mds log or client log. but ceph.log on both burnupi58 and burnupi58 look strange... Zheng Yan
09:31 AM Bug #9864 (Can't reproduce): osd doesn't report new stats for 3 hours when running test LibCephFS...
... Sage Weil
12:53 PM Bug #9480: OSD is crashing while object deletion
Samuel Just
12:32 PM rgw Bug #9866 (Fix Under Review): "test_s3.test_multipart_upload ... ERROR" in upgrade:firefly:older-...
https://github.com/ceph/ceph-qa-suite/pull/209 Sage Weil
10:30 AM rgw Bug #9866 (Resolved): "test_s3.test_multipart_upload ... ERROR" in upgrade:firefly:older-firefly-...
Run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-21_18:40:01-upgrade:firefly:older-firefly-distro-basic-vps... Yuri Weinstein
12:17 PM rbd Bug #9854 (In Progress): librbd: reads contending for cache space can cause livelock
Jason Dillaman
11:41 AM rbd Bug #9854: librbd: reads contending for cache space can cause livelock
Update:
Run teuthology-2014-10-21_23:17:01-upgrade:firefly:newer-firefly-distro-basic-vps
Job: ['565380']
Logs...
Yuri Weinstein
11:35 AM Bug #9859 (Resolved): Commit 2ac2a96 appears to break OSD creation
Sage Weil
10:43 AM Bug #9859: Commit 2ac2a96 appears to break OSD creation
Problem has been identified.
This went unnoticed as vstart.sh, even with cephx disabled, always creates a keyring,...
Joao Eduardo Luis
10:18 AM Bug #9859: Commit 2ac2a96 appears to break OSD creation
also, 2ac2a96 is the merge commit for the branch of c0e3bc9a Joao Eduardo Luis
10:11 AM Bug #9859: Commit 2ac2a96 appears to break OSD creation
Yesterday I figured as far as the monitor not handling 'MMonGetMap' messages from the OSD during mkfs because the OSD... Joao Eduardo Luis
09:59 AM Bug #9859 (In Progress): Commit 2ac2a96 appears to break OSD creation
Joao Eduardo Luis
11:29 AM rgw Bug #9865 (Resolved): "Assertion: osdc/ObjectCacher.cc" in upgrade:firefly:older-firefly-distro-b...
pushed fix to giant and firefly branches of ceph-qa-suite Sage Weil
11:19 AM rgw Bug #9865: "Assertion: osdc/ObjectCacher.cc" in upgrade:firefly:older-firefly-distro-basic-vps run
thrasher needs to not thrash primary affinity in this case. client connects before the primary-affinity is set so th... Sage Weil
11:10 AM rgw Bug #9865 (In Progress): "Assertion: osdc/ObjectCacher.cc" in upgrade:firefly:older-firefly-distr...
Sage Weil
10:22 AM rgw Bug #9865 (Resolved): "Assertion: osdc/ObjectCacher.cc" in upgrade:firefly:older-firefly-distro-b...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-21_18:40:01-upgrade:firefly:older-firefly-distro-b... Yuri Weinstein
10:49 AM Bug #9752: acting in past intervals contains primary and up_primary (looks like duplicates but is...
Full logs from pastebin to survive expiration. Loïc Dachary
10:32 AM Bug #8885: SIGABRT in TrackedOp::dump() via dump_ops_in_flight()
/a/samuelj-2014-10-21_16:45:57-rados-wip-sam-testing-wip-testing-vanilla-fixes-basic-multi/564093/remote Samuel Just
10:29 AM Bug #9675: splitting a pool doesn't start when rule_id != ruleset_id
Note that this patch will not change existing crushmaps, it will just make new rules using matching ruleset_id == rul... Loïc Dachary
10:00 AM Bug #9675 (Pending Backport): splitting a pool doesn't start when rule_id != ruleset_id
pending backports awaiting review/merge on
* dumpling: https://github.com/ceph/ceph/pull/2775
* emperor: https://...
Joao Eduardo Luis
10:05 AM Bug #9851: crash on journal/filestore shutdown on firefly
Running http://pulpito.ceph.com/loic-2014-10-22_10:04:57-upgrade:firefly-x-giant-testing-basic-vps/ which is s/branch... Loïc Dachary
09:14 AM Bug #9851: crash on journal/filestore shutdown on firefly
Running http://pulpito.ceph.com/loic-2014-10-22_20:28:41-rados:thrash-wip-9851-testing-basic-vps/ Loïc Dachary
09:09 AM Bug #9851: crash on journal/filestore shutdown on firefly
Loïc Dachary
09:05 AM Bug #9851: crash on journal/filestore shutdown on firefly
I wonder how to re-run http://pulpito.ceph.com/teuthology-2014-10-18_19:22:02-upgrade:firefly-x-giant-distro-basic-mu... Loïc Dachary
10:01 AM Bug #9852 (Fix Under Review): mon: monitor asserts on 'ceph mds add_data_pool X' if X is an ID th...
https://github.com/ceph/ceph/pull/2773 Joao Eduardo Luis
09:46 AM rbd Bug #9857 (Resolved): rbd readahead division by zero exception
Jason Dillaman
09:45 AM rbd Bug #9857: rbd readahead division by zero exception
PR: https://github.com/ceph/ceph/pull/2770 Jason Dillaman
08:53 AM devops Bug #9860: grub/os-prober launch kills most ceph OSD
And sda1 which is the ext4 mounted disk of osd.13
Oct 22 07:42:00 stri os-prober: debug: running /usr/lib/os-probe...
Laurent GUERBY
08:45 AM devops Bug #9860: grub/os-prober launch kills most ceph OSD
Logs detailing what os-prober was doing when one of the OSD crashed, sda2 is the journal partition of osd.13 who got ... Laurent GUERBY
08:42 AM devops Bug #9860: grub/os-prober launch kills most ceph OSD
Loïc Dachary
08:25 AM devops Bug #9860: grub/os-prober launch kills most ceph OSD
Adding more complete log lines with ASSERT references
<guerby> 2014-10-22 07:42:07.369785 7f6edf0b5700 0 -- 192.1...
Laurent GUERBY
12:12 AM devops Bug #9860 (Fix Under Review): grub/os-prober launch kills most ceph OSD
h3. Workaround
Disable os-probe with ...
Laurent GUERBY
08:09 AM Bug #9858 (Rejected): osd crush rule create-erasure idempotency failure
This was a side effect of process being killed at random. It was possible to reproduce it consistently until https://... Loïc Dachary
04:10 AM Bug #5925: hung ceph_test_rados_delete_pools_parallel
this was fun though.
I'll stop with the noise now and test this with the patch from #9845.
Joao Eduardo Luis
04:08 AM Bug #5925 (Can't reproduce): hung ceph_test_rados_delete_pools_parallel
and then I read David's comments on this ticket and I felt dumb. Joao Eduardo Luis
04:06 AM Bug #5925: hung ceph_test_rados_delete_pools_parallel
My last statement about the tick event was inaccurate.
gdb tells me that 'tick_event' is still set by the time we i...
Joao Eduardo Luis
03:48 AM Bug #5925: hung ceph_test_rados_delete_pools_parallel
Hit this again while testing a mon patch. Setting this to 'Verified' again until I check with David or Sam on what t... Joao Eduardo Luis
02:58 AM Bug #9585: ceph assertion using rocksdb store in master branch
Hi Tamilarasi, is it still broken on the master branch? Could you give a link to the corresponding job on pulpito.ceph.com? Haomai Wang
02:56 AM Bug #9814 (Resolved): FAILED assert(0) In function 'GenericObjectMap::Header GenericObjectMap::lo...
Haomai Wang
01:58 AM Bug #9761: ceph-osd: segfault at 654c30 ip 00007f00dc5f1f07 sp 00007f00c5642e00 error 7 in ld-2.1...
No. Pavel Veretennikov

10/21/2014

09:32 PM Bug #9859: Commit 2ac2a96 appears to break OSD creation
Specifically, this is with osd creation where the monmap isn't specified (similar to how vstart does it, but not ceph... Mark Nelson
09:09 PM Bug #9859 (Resolved): Commit 2ac2a96 appears to break OSD creation
Narrowed this down through Joao's comments and bisecting to hit this commit. Not sure if this only happens under spec... Mark Nelson
06:18 PM rgw Bug #9169: 100-continue broken for centos/rhel
Running a simplified yaml, see https://gist.github.com/yuriw/1603e536ee33a28f93a4
Note: Moved clients to separate ...
Yuri Weinstein
04:53 PM rgw Bug #9169: 100-continue broken for centos/rhel
Running a simplified yaml, see https://gist.github.com/yuriw/1603e536ee33a28f93a4
Note: Moved clients to separate ...
Yuri Weinstein
10:16 AM rgw Bug #9169: 100-continue broken for centos/rhel
See the dupe #9825 for latest run info Yuri Weinstein
10:07 AM rgw Bug #9169: 100-continue broken for centos/rhel
yuri to make a minimal test case Sage Weil
05:50 PM rgw Bug #9587 (Fix Under Review): ceph-radosgw sysvinit script on EL6 cannot set ulimit
https://github.com/ceph/ceph/pull/2771
This could use a manual test as well to ensure the limit is properly set on...
Sage Weil
05:23 PM Bug #9858: osd crush rule create-erasure idempotency failure
reproduced with *while make -j8 check ; do : ; done* after ~30 minutes (i.e. ~15 runs). Loïc Dachary
05:03 PM Bug #9858 (Rejected): osd crush rule create-erasure idempotency failure
The *./ceph osd crush rule create-erasure ruleset3* command run by test/mon/osd-crush.sh sometimes fails to notice the... Loïc Dachary
05:20 PM Bug #9837 (Duplicate): rbd crash when upgrading from v0.80.5 to firefly
this could be the same as bug # 9288, modified the upgrade:firefly suite to NOT upgrade clients when workload is in progr... Tamilarasi muthamizhan
05:19 PM rbd Feature #9733: Separate rbd listing into CAP
It sounds like Nova is configured to use RBD as the backing store for its ephemeral disk images instead of the local ... Jason Dillaman
11:51 AM rbd Feature #9733: Separate rbd listing into CAP
OK, putting the pool argument first does work. We have consequently found out that Nova does require list permissions... Robert LeBlanc
10:54 AM rbd Feature #9733: Separate rbd listing into CAP
Try placing the "pool=test" argument before the "object_prefix XYZ" portion of the cap:... Jason Dillaman
05:16 PM Bug #9610 (Resolved): Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::Cal...
fixed in multi-version suite already - commit b966da7b71c8aee22ff8e58b3b0c105b1d7ca4bf
fixed in upgrade:firefly/ol...
Tamilarasi muthamizhan
02:06 PM Bug #9610: Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)...
New ceph_test_rados is too picky for dumpling osds. We only want to use dumpling ceph_test_rados against clusters wi... Samuel Just
02:06 PM Bug #9610: Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)...
also: ubuntu@teuthology:/a/teuthology-2014-10-20_18:40:02-upgrade:firefly:older-firefly-distro-basic-vps/561550
<...
Tamilarasi muthamizhan
12:53 PM Bug #9610: Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)...
seeing this on the upgrade test from v0.67.11 to firefly [v0.80.7]... Tamilarasi muthamizhan
04:54 PM Bug #9752: acting in past intervals contains primary and up_primary (looks like duplicates but is...

"kit" on #ceph was in a situation of having incomplete pg. They sent the pg query output and it showed strange pas...
David Zafman
04:44 PM rbd Bug #9857 (Fix Under Review): rbd readahead division by zero exception
Jason Dillaman
03:53 PM rbd Bug #9857 (In Progress): rbd readahead division by zero exception
Jason Dillaman
02:42 PM rbd Bug #9857 (Resolved): rbd readahead division by zero exception
When using old-format RBD images, the RBD readahead block alignments are initialized to zero because the stripe param... Jason Dillaman
04:07 PM rbd Bug #9855: rbd "Segmentation fault" in upgrade:firefly:singleton-firefly-distro-basic-vps run
Tamilarasi muthamizhan wrote:
> I think this issue could be related to bug # 9288, upgrading clients when workload i...
Sage Weil
02:04 PM rbd Bug #9855: rbd "Segmentation fault" in upgrade:firefly:singleton-firefly-distro-basic-vps run
I think this issue could be related to bug # 9288, upgrading clients when workload is in progress.
Tamilarasi muthamizhan
02:02 PM rbd Bug #9855: rbd "Segmentation fault" in upgrade:firefly:singleton-firefly-distro-basic-vps run
more logs:
ubuntu@teuthology:/a/teuthology-2014-10-20_18:40:02-upgrade:firefly:older-firefly-distro-basic-vps/561562
Tamilarasi muthamizhan
11:11 AM rbd Bug #9855: rbd "Segmentation fault" in upgrade:firefly:singleton-firefly-distro-basic-vps run
logs: ubuntu@teuthology:/a/teuthology-2014-10-20_19:10:01-upgrade:firefly:newer-firefly-distro-basic-vps/561993 Tamilarasi muthamizhan
11:07 AM rbd Bug #9855 (Resolved): rbd "Segmentation fault" in upgrade:firefly:singleton-firefly-distro-basic-...
On:
os_type: rhel
os_version: '6.4'
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-20_20:50:...
Yuri Weinstein
03:32 PM rgw Bug #9612: "ERROR: test suite for <module 's3tests.functional'" in multi-version-giant-testing-ba...
... Tamilarasi muthamizhan
03:24 PM rgw Bug #9612 (Rejected): "ERROR: test suite for <module 's3tests.functional'" in multi-version-giant...
that's giant rgw and dumpling osds, shouldn't work. Yehuda Sadeh
03:22 PM CephFS Feature #9557 (Fix Under Review): mds: verify backtrace on fetch_dir
Zheng Yan
10:44 AM CephFS Feature #9557 (In Progress): mds: verify backtrace on fetch_dir
Greg Farnum
03:19 PM Feature #7104: rest-api: support commands requiring 'w' cap without 'rw' cap
Please stop throwing this bug in the FS tracker just because it has the word MDS in it... Greg Farnum
02:58 PM rgw Bug #9616: upgrade test restarts rgw, test gets 500
fixed... Tamilarasi muthamizhan
02:33 PM Bug #9823 (Won't Fix): ceph-osd mkfs or ceph auth add : exit -9
It did not show up after the fix. The tests use a lot more than the default 1024 file descriptors allowed. Marking wo... Loïc Dachary
02:25 PM devops Bug #9665: ceph-disk zap should call partprobe
Running http://pulpito.ceph.com/loic-2014-10-21_14:25:31-ceph-deploy:singleton-wip-9665-ceph-disk-partprobe-testing-b... Loïc Dachary
02:19 PM Bug #9731: Ceph 0.80.6 OSD crashes
Sorry, still out sick today. Hoping to be in tomorrow. Brad House
11:10 AM Bug #9731: Ceph 0.80.6 OSD crashes
Brad House wrote:
> Sorry, I only have access during the week to the test system, and I'm out sick today. Hopefully...
Sage Weil
02:17 PM devops Bug #9807 (Duplicate): Missing radosgw packages in various upgrade suites
This was basically the same issue that was thought to be fixed and centos still had issues in issue #9824 but should ... Sandon Van Ness
02:13 PM devops Bug #9747: ceph.spec.in will always use 95-ceph-osd-alt.rules
gitbuilder was clean but got trimmed Loïc Dachary
01:53 PM Bug #9288: "Assertion `nlock == 0' failed" in upgrade:firefly-firefly-testing-basic-vps suite
fixed it ... Tamilarasi muthamizhan
11:17 AM Bug #9288: "Assertion `nlock == 0' failed" in upgrade:firefly-firefly-testing-basic-vps suite
the job is upgrading client.0 (vpm072) in that test too.
i think
- install.upgrade:
all:
bran...
Sage Weil
01:47 PM Bug #9408: erasure-code: misalignment
Samuel Just
01:34 PM Bug #9485 (In Progress): Monitor crash due to wrong crush rule set
Did not forget about it, just busy with other things. Loïc Dachary
01:33 PM Bug #9684 (Can't reproduce): "Scrubbing terminated" in upgrade:firefly-firefly-distro-basic-multi...
no log or core Sage Weil
01:32 PM Bug #9434 (Can't reproduce): rbd rm hangs
Samuel Just
01:30 PM Bug #9702 (Duplicate): "MaxWhileTries: 'wait_until_healthy'reached maximum tries" in upgrade:fire...
probably dup of #9835 Sage Weil
01:29 PM Bug #9703 (Resolved): "Segmentation fault" in upgrade:firefly-x-giant-distro-basic-multi run
Samuel Just
01:27 PM Bug #9739 (Won't Fix): rados cli: listsnaps does not list snaps
because you haven't written to it yet! Samuel Just
01:19 PM Bug #9761: ceph-osd: segfault at 654c30 ip 00007f00dc5f1f07 sp 00007f00c5642e00 error 7 in ld-2.1...
Has this happened more than once? Samuel Just
01:18 PM Bug #9794 (Resolved): vstart.sh crashes MON with --paxos-propose-interval=0.01 and one MDS
checked firefly; no need to backport. Joao Eduardo Luis
01:15 PM Bug #9794: vstart.sh crashes MON with --paxos-propose-interval=0.01 and one MDS
Samuel Just
01:15 PM Bug #9794 (Resolved): vstart.sh crashes MON with --paxos-propose-interval=0.01 and one MDS
Samuel Just
01:11 PM Bug #9419 (Resolved): dumpling->firefly upgrade, sending setallochint?
Samuel Just
01:09 PM Bug #9649 (Can't reproduce): OSD hang in op_tp
Samuel Just
01:07 PM Bug #9559 (Resolved): ?off-by-one vulnerability?ceph-0.80.5/src/common/fd.cc dump_open_fds() func...
Samuel Just
11:43 AM CephFS Bug #8809 (Can't reproduce): uclient: memory leak
maybe fixed by 2313ce1d024361fd7f4d2cbca789010f0fe0faad Zheng Yan
11:34 AM Bug #9856 (Duplicate): osd crashed after upgrade from v0.80.5 to firefly
#9851 Sage Weil
11:26 AM Bug #9856: osd crashed after upgrade from v0.80.5 to firefly
more jobs: ubuntu@teuthology:/a/teuthology-2014-10-20_19:10:01-upgrade:firefly:newer-firefly-distro-basic-vps/561999
...
Tamilarasi muthamizhan
11:23 AM Bug #9856 (Duplicate): osd crashed after upgrade from v0.80.5 to firefly
osd crashed after upgrading from ceph v0.80.5 to firefly and during thrashing,
logs: ubuntu@teuthology:/a/teutholo...
Tamilarasi muthamizhan
11:10 AM Linux kernel client Bug #9507 (Resolved): calling llistxattr(2) on a symlink crashes the client
Zheng Yan
10:55 AM CephFS Bug #9674: nightly failed multiple_rsync.sh
commit:477073aba1da880dfd0b8c82f4792788579f28b9 in master and commit:44ce33c12443909b02c7ee451ad45400f55d53c9 in giant Greg Farnum
10:38 AM Bug #9845 (Resolved): hung ceph_test_rados_delete_pools_parallel
Sage Weil
12:59 AM Bug #9845 (Fix Under Review): hung ceph_test_rados_delete_pools_parallel
David Zafman
12:48 AM Bug #9845 (Resolved): hung ceph_test_rados_delete_pools_parallel
... David Zafman
09:57 AM rgw Bug #9575 (Duplicate): s3tests.functional.test_s3.test_region_copy_object fails (races with rados...
Sage Weil
09:43 AM rgw Bug #3896 (Resolved): rest-bench common/WorkQueue.cc: 54: FAILED assert(_threads.empty())
Sage Weil
09:42 AM rgw Bug #1673 (Won't Fix): rgw: mod_fastcgi needs to be backward compatible
Sage Weil
09:41 AM rgw Bug #8251: radosgw-agent does not sync objects uploaded to recreated buckets
closed and obsolete: https://github.com/ceph/ceph/pull/2765 Sage Weil
09:40 AM rgw Bug #8550 (Resolved): rgw: need to reduce calls to rgw_obj.set_obj()
Sage Weil
09:38 AM rgw Bug #9043 (Duplicate): rgw:Cannot add object to Ceph using Openstack Dashboard(Horizon) in firefly
Sage Weil
09:31 AM rgw Bug #9525 (Duplicate): Deleted object shows in object listing
Sage Weil
09:29 AM rgw Bug #9576 (Fix Under Review): rgw: update object content-length doesn't work correctly
Sage Weil
09:27 AM rgw Bug #9500 (Duplicate): 0.80.5 on CentOS 6.5: radosgw-admin fails to correctly name subuser object
Sage Weil
09:27 AM rgw Bug #9500: 0.80.5 on CentOS 6.5: radosgw-admin fails to correctly name subuser object
unlikely to be ubuntu vs centos. this looks like #8587 or related issues (pending backport to firefly) Sage Weil
09:25 AM rgw Bug #9469 (Rejected): RadosGW performance degrades with high concurrency workload.
please send an email about this to ceph-devel; that is a better forum to discuss performance issues. Sage Weil
09:23 AM rgw Bug #9543 (Rejected): AssertionError(s) in upgrade:dumpling-dumpling-distro-basic-vps run
Sage Weil
09:23 AM rgw Bug #9588 (Rejected): Keystone s3 auth integration lacking access_key = tenant:user ability suppo...
thanks Mark! Sage Weil
09:21 AM rgw Bug #9766 (Rejected): s3tests: test_100_continue failing
this is almost certainly a configuration error. need rgw print continue = true and patched mod_fastcgi Sage Weil
09:20 AM rgw Bug #9002 (Duplicate): Creating swift key with --gen-secret in separate step from subuser creatio...
#8587 Sage Weil
09:19 AM rgw Bug #8676 (Duplicate): md5sum check failed during readwrite.py
this appears to be resolved by #9307 Sage Weil
09:17 AM rgw Bug #9307 (Resolved): "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-fir...
Sage Weil
09:17 AM rgw Bug #9307: "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-firefly-x-mast...
Sage Weil
09:16 AM rbd Bug #9854 (Resolved): librbd: reads contending for cache space can cause livelock
As a result of accounting for reads properly with #9513. Using qemu-io (a test program) is one way to trigger this - ... Josh Durgin
09:13 AM rgw Bug #9039 (Resolved): Using COPY on radosgw to copy object from one bucket to another that's in a...
Yehuda Sadeh
09:07 AM Bug #9675 (In Progress): splitting a pool doesn't start when rule_id != ruleset_id
Joao Eduardo Luis
09:06 AM rbd Bug #9513 (Resolved): rbd_cache=true default setting is degrading librbd performance ~10X in Giant
backported in commit:65be257e9295619b960b49f6aa80ecdf8ea4d16a Josh Durgin
09:04 AM Bug #9813 (Resolved): cryptopp dependency missing for deb-based systems
Sage Weil
08:45 AM Bug #9813: cryptopp dependency missing for deb-based systems
Already addressed by [1], cheers!
[1] https://github.com/ceph/ceph/pull/2761
Federico Gimenez Nieto
08:54 AM Bug #9853 (Duplicate): coredump in upgrade:firefly-x-giant-distro-basic-vps run
#9851 Sage Weil
08:21 AM Bug #9853 (Duplicate): coredump in upgrade:firefly-x-giant-distro-basic-vps run
On:
os_type: rhel
os_version: '6.5'
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-20_15...
Yuri Weinstein
08:52 AM rgw Bug #9825 (Duplicate): s3tests failing on rhel 6.4 and 6.5 in upgrade:dumpling-firefly-x:parallel...
this is #9169 Sage Weil
08:33 AM rgw Bug #9825: s3tests failing on rhel 6.4 and 6.5 in upgrade:dumpling-firefly-x:parallel-giant-distr...
In the run teuthology/teuthology-2014-10-20_15:01:14-upgrade:firefly-x-giant-distro-basic-vps , jobs ['560024', '5600... Yuri Weinstein
08:12 AM Bug #9073 (Pending Backport): OSD with device/partition journals down after fresh deploy or upgra...
Loïc Dachary
07:50 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
https://github.com/ceph/ceph/pull/2764 is a better fix. The isolated patch made sense to me at the time but it looks ... Loïc Dachary
08:09 AM Linux kernel client Bug #9561 (Rejected): libceph: do not crash if auth reply is not understood
I believe the code is correct as is and I misdiagnosed the original issue. :( Sage Weil
08:09 AM Linux kernel client Bug #9560 (Rejected): libceph: msg kmalloc failure handling on the reply path
I believe the code is correct as is and I misdiagnosed the original issue. Sage Weil
07:35 AM Bug #9852 (Resolved): mon: monitor asserts on 'ceph mds add_data_pool X' if X is an ID that DNE
... Joao Eduardo Luis
06:58 AM Bug #9851 (Fix Under Review): crash on journal/filestore shutdown on firefly
https://github.com/ceph/ceph/pull/2764 Sage Weil
06:42 AM Bug #9851 (Resolved): crash on journal/filestore shutdown on firefly
saw this on several runs, e.g. /var/lib/teuthworker/archive/teuthology-2014-10-18_19:22:02-upgrade:firefly-x-giant-di... Sage Weil
01:48 AM devops Bug #9840: Monitor hung when add new osd
using valgrind to find errors:
==17554== Thread 7:
==17554== Invalid read of size 4
==17554== at 0x3168A0C380: ...
qiu shanggao
12:50 AM Bug #5925 (Can't reproduce): hung ceph_test_rados_delete_pools_parallel

Filed #9845 to describe the recent occurrence. This bug was probably something else, so I'm setting it back to "Can...
David Zafman

10/20/2014

11:51 PM Bug #5925: hung ceph_test_rados_delete_pools_parallel
I don't think this would have happened when safe_callbacks was true. It was set to false in a fix for #9582. See al... David Zafman
07:02 PM Bug #5925: hung ceph_test_rados_delete_pools_parallel

Yup, there is a shutdown race. Thread 1 is waiting for the timer thread while holding the Objecter::rwlock in writ...
David Zafman
11:22 PM Fix #9834 (Rejected): osd_scrub_load_threshold should be checked during scrubbing
Loïc Dachary
10:14 AM Fix #9834: osd_scrub_load_threshold should be checked during scrubbing
I'm not sure this is something we should do. We attempt to schedule scrubs during periods of low disk usage, but if t... Greg Farnum
07:25 AM Fix #9834 (Rejected): osd_scrub_load_threshold should be checked during scrubbing
"osd_scrub_load_threshold":https://github.com/ceph/ceph/blob/firefly/src/common/config_opts.h#L515 is "considered":ht... Loïc Dachary
11:20 PM Bug #9844 (Won't Fix): "initiating reconnect" (log) race; crash of multiple OSDs (domino effect)
On 0.87 I watch "ceph osd tree" and notice that one OSD (leveldb/keyvaluestore-dev) is "down".
In its log I see
...
Dmitry Smirnov
10:45 PM Bug #9356 (Resolved): ceph_test_rados_striper_api_aio Segmentation faults
https://github.com/ceph/ceph/pull/2419 Loïc Dachary
09:38 PM Bug #9839 (Rejected): ErasureCodePluginSelectJerasure: generic plugin : abort
... Loïc Dachary
03:23 PM Bug #9839 (Need More Info): ErasureCodePluginSelectJerasure: generic plugin : abort
Loïc Dachary
03:23 PM Bug #9839: ErasureCodePluginSelectJerasure: generic plugin : abort
When trying to run manually *ceph-osd -i 0* it hangs at the same point. Loïc Dachary
03:02 PM Bug #9839 (Rejected): ErasureCodePluginSelectJerasure: generic plugin : abort
It fails when pre-loading the plugin in a context where erasure-code is not used.
http://pulpito.ceph.com/teutholo...
Loïc Dachary
06:27 PM devops Bug #9840: Monitor hung when add new osd
tried again, the monitor still hangs
Thread 25 (Thread 0x7f93e5ec0700 (LWP 13652)):
#0 0x0000003168a0b5bc in pthread_co...
qiu shanggao
06:17 PM devops Bug #9840 (Rejected): Monitor hung when add new osd
ceph version: 0.80.6
Platform: Redhat RHLS 6.5
We want to test the disk replacement case.
Operator steps:
1. ceph o...
qiu shanggao
06:02 PM Bug #9419: dumpling->firefly upgrade, sending setallochint?
This is done and a new case was added - PR https://github.com/ceph/ceph-qa-suite/pull/198 Yuri Weinstein
06:01 PM Feature #9568: Add test case to test #9419 (ceph wip-9419)
This is done and a new case was added - PR https://github.com/ceph/ceph-qa-suite/pull/198 Yuri Weinstein
02:14 PM Feature #9568: Add test case to test #9419 (ceph wip-9419)
this seems to require clients upgraded first running workloads against upgraded monitors and mixed versions of osds, ... Tamilarasi muthamizhan
04:14 PM rbd Feature #9733: Separate rbd listing into CAP
OK, so one more question. This looks like it allows access to any pool. Is there a way to limit this to a particular ... Robert LeBlanc
03:04 PM Bug #9389 (Duplicate): ec pg stuck peering, did not send query for one shard
Samuel Just
03:04 PM Bug #9822 (Resolved): failed to become clean before timeout expired
Samuel Just
02:29 PM Bug #9822: failed to become clean before timeout expired
Samuel Just
02:18 PM Bug #9821: failed to recover before timeout expired
in wip-sam-testing Samuel Just
02:09 PM Bug #9821: failed to recover before timeout expired
working on patch Samuel Just
01:55 PM Bug #9835 (Fix Under Review): osd: bug in misdirected op checks (firefly)
https://github.com/ceph/ceph/pull/2760 Sage Weil
10:12 AM Bug #9835: osd: bug in misdirected op checks (firefly)
Maybe we need to adjust how we're handling waiting_for_pg, but I don't think that this particular check is a bug — th... Greg Farnum
09:33 AM Bug #9835 (Resolved): osd: bug in misdirected op checks (firefly)
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-10-18_19:22:02-upgrade:firefly-x-giant-distro-basic-mu... Sage Weil
01:32 PM Bug #9806: Objecter: resend linger ops on split
Josh Durgin
01:23 PM CephFS Feature #414 (Resolved): ceph-fuse: implement file locking
Zheng Yan
01:22 PM CephFS Bug #8576: teuthology: nfs tests failing on umount
teuthology commit:4f2957c42d0f76a399cb26c660ede9243c095779 runs those commands as well as the previous ones. Greg Farnum
01:02 PM CephFS Bug #9679 (Closed): Ceph hadoop terasort job failure
Fixed in cephfs-hadoop repo. Noah Watkins
11:31 AM Bug #9288: "Assertion `nlock == 0' failed" in upgrade:firefly-firefly-testing-basic-vps suite
logs: ubuntu@teuthology:/a/teuthology-2014-10-17_23:30:01-upgrade:firefly:newer-firefly-distro-basic-vps/555356
<p...
Tamilarasi muthamizhan
11:20 AM Bug #9288 (New): "Assertion `nlock == 0' failed" in upgrade:firefly-firefly-testing-basic-vps suite
this looks different from bug #9040. Tamilarasi muthamizhan
11:29 AM Bug #9837: rbd crash when upgrading from v0.80.5 to firefly
... Tamilarasi muthamizhan
11:27 AM Bug #9837 (Duplicate): rbd crash when upgrading from v0.80.5 to firefly
logs: ubuntu@teuthology:/a/teuthology-2014-10-17_23:30:01-upgrade:firefly:newer-firefly-distro-basic-vps/555359
<p...
Tamilarasi muthamizhan
11:22 AM Bug #9836 (Fix Under Review): mon unit tests use the wrong id
https://github.com/ceph/ceph/pull/2759 Loïc Dachary
11:19 AM Bug #9836: mon unit tests use the wrong id
It impacts
* "osd-erasure-code-profile.sh":https://github.com/ceph/ceph/blob/giant/src/test/mon/osd-erasure-code-...
Loïc Dachary
11:13 AM Bug #9836 (Resolved): mon unit tests use the wrong id
the mon id is incorrect for mon tests using "the call_TEST_functions":https://github.com/ceph/ceph/blob/firefly/src/t... Loïc Dachary
11:15 AM CephFS Bug #9800: client-limits test is not passing

Same failure:
http://pulpito.front.sepia.ceph.com/teuthology-2014-10-17_23:04:02-fs-giant-distro-basic-multi/555...
John Spray
11:07 AM Linux kernel client Bug #9458 (Resolved): client wrongly fenced
Zheng Yan
11:06 AM Linux kernel client Bug #1513 (Resolved): kclient: cap migration can race with cap addition on client
now cap import/export are ordered.
(commit 186e4f7a4b1883f3f46aa15366c0bcebc28fdda7, 4ee6a914edbbd2543884f0ad7d58ea4...
Zheng Yan
10:46 AM Bug #9820 (Resolved): mon connection hang on cephtool/test.sh
Sage Weil
10:38 AM Bug #9372: injectarg boolean option is discarded
"fails on precise":http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-tarball-precise-i386-basic/log.cgi?log=14ed21f9ad... Loïc Dachary
10:06 AM Bug #9826 (Rejected): ceph osd crush rule ls should use the pending crush, if any
Loïc Dachary
08:59 AM Bug #9826: ceph osd crush rule ls should use the pending crush, if any
... Loïc Dachary
09:18 AM rgw Bug #9825: s3tests failing on rhel 6.4 and 6.5 in upgrade:dumpling-firefly-x:parallel-giant-distr...
I am wondering if it's related to changes in s3tests? Yuri Weinstein
08:41 AM Bug #9819 (Won't Fix): EBUSY during scrub
this is expected and harmless. we just report the failure and move on. it happens when paxos is busy when we reques... Sage Weil
08:38 AM Bug #9731: Ceph 0.80.6 OSD crashes
Sorry, I only have access during the week to the test system, and I'm out sick today. Hopefully I'll be able to cont... Brad House
04:02 AM rgw Feature #8562: rgw: Conditional PUT on ETag
Closed the previous out-of-synced PR and submitted a new one: https://github.com/ceph/ceph/pull/2756 Xiangyu Lv
01:38 AM rgw Feature #8562: rgw: Conditional PUT on ETag
Here is a PR for discussion purpose: https://github.com/ceph/ceph/pull/2755
We may need to elaborate a bit on it aft...
Xiangyu Lv
03:46 AM Bug #9816: mon exits unexpectedly and gracefully
just a hunch: feels like you're capturing only stdout from the monitor, and the monitor may have hit the 'mon data av... Joao Eduardo Luis
01:42 AM Linux kernel client Bug #9749: kcephfs: kernel divide-by-zero crash in __validate_layout (fs/ceph/ioctl.c)
I guess we are just not used to doing it - I think we haven't filed any CVEs for ceph kernel bits (and kcephfs in par... Ilya Dryomov

10/19/2014

08:29 PM Bug #9731: Ceph 0.80.6 OSD crashes
Any update? Sage Weil
07:20 PM CephFS Bug #9341 (Pending Backport): MDS: very slow rejoin
Hmm, we didn't put this in Giant initially because we were trying not to perturb it. Master hasn't been run through t... Greg Farnum
06:45 PM CephFS Bug #9341 (Fix Under Review): MDS: very slow rejoin
Please include this fix to 0.87 which is affected just as badly as 0.80.x.
On 0.87 MDS stuck in "rejoin" for hours a...
Dmitry Smirnov
07:13 PM Bug #9826 (Fix Under Review): ceph osd crush rule ls should use the pending crush, if any
Loïc Dachary
07:13 PM Bug #9826: ceph osd crush rule ls should use the pending crush, if any
https://github.com/ceph/ceph/pull/2754 Loïc Dachary
07:07 PM Bug #9826 (Rejected): ceph osd crush rule ls should use the pending crush, if any
The following is racy:... Loïc Dachary
05:03 PM Bug #9823: ceph-osd mkfs or ceph auth add : exit -9
Maybe it runs out of file descriptors because of the parallel runs. Since the erasure code test is the one using the more d... Loïc Dachary
03:12 PM Bug #9823: ceph-osd mkfs or ceph auth add : exit -9
The error matching the mon log is different: auth add exits with -9 instead of mkfs.... Loïc Dachary
03:02 PM Bug #9823: ceph-osd mkfs or ceph auth add : exit -9
it was reproduced with a change to the script to keep the logs. Loïc Dachary
12:53 PM Bug #9823 (Won't Fix): ceph-osd mkfs or ceph auth add : exit -9
While running src/test/erasure-code/test-erasure-code.sh in a loop, the following happened. The -9 exit code suggests... Loïc Dachary
04:27 PM Bug #9796: osd: crash on blacklisted watcher reconnect (dumpling)
Observed similar crash in suite:upgrade:dumpling
Run http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-18_17:00...
Yuri Weinstein
04:09 PM rgw Bug #9825: s3tests failing on rhel 6.4 and 6.5 in upgrade:dumpling-firefly-x:parallel-giant-distr...
Same problems:
suite:upgrade:dumpling-x
Run http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-17_19:13:01-up...
Yuri Weinstein
03:19 PM rgw Bug #9825: s3tests failing on rhel 6.4 and 6.5 in upgrade:dumpling-firefly-x:parallel-giant-distr...
Log for rhel 6.5 job - http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-18_17:15:02-upgrade:dumpling-firefly-x:... Yuri Weinstein
03:18 PM rgw Bug #9825 (Duplicate): s3tests failing on rhel 6.4 and 6.5 in upgrade:dumpling-firefly-x:parallel...
Looks similar to #9763
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-18_17:15:02-upgrade:dump...
Yuri Weinstein
02:26 PM Linux kernel client Bug #9749: kcephfs: kernel divide-by-zero crash in __validate_layout (fs/ceph/ioctl.c)
This bug appears to be exploitable by unprivileged local users and will cause a machine-wide DoS. Is there some reaso... David Ramos
10:17 AM Bug #9822 (Resolved): failed to become clean before timeout expired
logs: ubuntu@teuthology:/a/teuthology-2014-10-17_02:32:01-rados-giant-distro-basic-multi/553345... Tamilarasi muthamizhan
10:15 AM Bug #9821: failed to recover before timeout expired
ubuntu@teuthology:/a/teuthology-2014-10-17_02:32:01-rados-giant-distro-basic-multi/553255 Tamilarasi muthamizhan
10:11 AM Bug #9821 (Resolved): failed to recover before timeout expired
logs: ubuntu@teuthology:/a/teuthology-2014-10-17_02:32:01-rados-giant-distro-basic-multi/553125... Tamilarasi muthamizhan
09:44 AM Feature #9817 (Fix Under Review): display X.XX deep-scrub starts
https://github.com/ceph/ceph/pull/2752 Loïc Dachary
08:15 AM Feature #9817 (Resolved): display X.XX deep-scrub starts
It would be convenient to have a message in the logs when deep-scrub starts... Loïc Dachary
09:40 AM Bug #9820 (Resolved): mon connection hang on cephtool/test.sh
log: ubuntu@teuthology:/a/teuthology-2014-10-17_02:32:01-rados-giant-distro-basic-multi/553035... Tamilarasi muthamizhan
09:28 AM Bug #9819 (Won't Fix): EBUSY during scrub
logs: ubuntu@teuthology:/a/teuthology-2014-10-17_02:32:01-rados-giant-distro-basic-multi/552986... Tamilarasi muthamizhan
08:40 AM Bug #9818 (Resolved): ENXIO qa/workunits/cephtool/test.sh:test_osd_bench
It looks like the OSD crashed but there is no more information than the following log at the moment. It was created w... Loïc Dachary
08:12 AM Bug #9816 (Can't reproduce): mon exits unexpectedly and gracefully
... Loïc Dachary

10/18/2014

09:25 PM Bug #9814 (Fix Under Review): FAILED assert(0) In function 'GenericObjectMap::Header GenericObjec...
https://github.com/ceph/ceph/pull/2710 Haomai Wang
05:10 PM Bug #9814 (Resolved): FAILED assert(0) In function 'GenericObjectMap::Header GenericObjectMap::lo...
LevelDB-based OSD (i.e. "keyvaluestore-dev") crashed as follows on 0.87 during backfill:... Dmitry Smirnov
07:54 PM Feature #9815 (Fix Under Review): run make check in parallel
https://github.com/ceph/ceph/pull/2750 Loïc Dachary
05:46 PM Feature #9815 (Resolved): run make check in parallel
Individual tests run by make check may bind fixed ports or use identical files or subdirectories to store temporary d... Loïc Dachary
05:06 PM Bug #9744: cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
Sage Weil wrote:
> this happens when clocks are very skewed.
Are we OK with such a vulnerability that allows one to brin...
Dmitry Smirnov
03:27 AM Bug #9813 (Resolved): cryptopp dependency missing for deb-based systems
Hi, when following [1] from a trusty64 box I've noticed that the libcrypto++-dev entry is missing from deps.deb.txt. ... Federico Gimenez Nieto

10/17/2014

08:30 PM Bug #9810 (Duplicate): dout_emergency is silenced in ceph-osd
"ceph-osd closes stderr":https://github.com/ceph/ceph/blob/giant/src/ceph_osd.cc#L499 and this may be the reason why ... Loïc Dachary
05:03 PM Bug #9809 (Rejected): common/perf_counters.cc: 105: FAILED assert(idx < m_upper_bound)
I changed the code and introduced the problem and then forgot I changed the code. Reverting the change fixes the prob... Loïc Dachary
04:50 PM Bug #9809 (Rejected): common/perf_counters.cc: 105: FAILED assert(idx < m_upper_bound)
Steps to reproduce
* modify vstart.sh with ...
Loïc Dachary
04:27 PM Bug #9808 (Rejected): PG stuck in active+undersized+degraded+remapped+backfill_toofull
The disk was 90% full ... hence the block. Loïc Dachary
04:21 PM Bug #9808: PG stuck in active+undersized+degraded+remapped+backfill_toofull
The "scheduled":https://github.com/ceph/ceph/blob/giant/src/osd/PG.cc#L5674 "RequestBackfill":https://github.com/cep... Loïc Dachary
04:13 PM Bug #9808 (Rejected): PG stuck in active+undersized+degraded+remapped+backfill_toofull
Steps to reproduce
* modify vstart.sh with ...
Loïc Dachary
04:21 PM Bug #9731: Ceph 0.80.6 OSD crashes
Just to check, there isn't anything interesting in dmesg, right? Samuel Just
03:07 PM Bug #9731: Ceph 0.80.6 OSD crashes
Oh, and the
--00:00:06:05.108 2312-- WARNING: unhandled syscall: 306
--00:00:06:05.108 2312-- You may be able to ...
Samuel Just
02:45 PM Bug #9731: Ceph 0.80.6 OSD crashes
Looks like in our testing we invoke valgrind as:
valgrind --suppressions=<suppression_file> --num-callers=50 --xml...
Samuel Just
02:14 PM Bug #9731: Ceph 0.80.6 OSD crashes
wheezy gitbuilder should be working. Samuel Just
02:13 PM Bug #9731: Ceph 0.80.6 OSD crashes
yeah, -f I think. Samuel Just
01:58 PM Bug #9731: Ceph 0.80.6 OSD crashes
valgrind appears to detach from the console when running with ceph-osd, is there some other flag I need to pass to ce... Brad House
01:55 PM Bug #9731: Ceph 0.80.6 OSD crashes
Sage also pushed a wip-9731 based on 0.80.7 with a piece of debugging which would be handy. Reproducing with that wo... Samuel Just
09:23 AM Bug #9731: Ceph 0.80.6 OSD crashes
Brad House wrote:
> sure, just tell me the best command line to use as I haven't ever tried to run ceph-osd outside o...
Sage Weil
03:22 PM Bug #9788 (Rejected): "Assertion: common/HeartbeatMap.cc: 79" placeholder for "hit suicide timeou...
Two osds, both on mira076 timed out:
osd5: a stat in the op_tp took 3 minutes (completed, surprisingly, right before...
Samuel Just
03:03 PM devops Bug #9807: Missing radosgw packages in various upgrade suites
looks like we are hitting a lot of failures in upgrade tests because of this issue. Tamilarasi muthamizhan
03:01 PM devops Bug #9807 (Duplicate): Missing radosgw packages in various upgrade suites
In teuthology-2014-10-16_19:00:01-upgrade:dumpling-x-firefly-distro-basic-vps... Yuri Weinstein
02:57 PM Bug #9220 (Resolved): objecter doesn't reconnect watch on interval change w/ same primary
This did not need backporting to dumpling after all, since it was broken after dumpling by commit:860d72770cdf092c027... Josh Durgin
11:13 AM Bug #9220 (Pending Backport): objecter doesn't reconnect watch on interval change w/ same primary
Josh Durgin
11:20 AM Bug #9806 (Resolved): Objecter: resend linger ops on split
Otherwise, we can lose notifies.
commit:cb9262abd7fd5f0a9f583bd34e4c425a049e56ce
Samuel Just
10:50 AM Bug #9419: dumpling->firefly upgrade, sending setallochint?
Samuel Just
10:49 AM Bug #9419: dumpling->firefly upgrade, sending setallochint?
next step is to add a test for this to the upgrade suites. Samuel Just
10:43 AM Bug #9073 (Resolved): OSD with device/partition journals down after fresh deploy or upgrade to 0.83
Samuel Just
10:39 AM Bug #9614 (Pending Backport): PG stuck with remapped
Samuel Just
10:38 AM Bug #9718 (Pending Backport): osd_types: check_new_interval: min_size check needs to consider CRU...
Samuel Just
10:32 AM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
ubuntu@teuthology:/a/samuelj-2014-10-15_20:19:09-rados-wip-sam-testing-wip-testing-vanilla-fixes-basic-multi/551397/r... Samuel Just
09:33 AM Documentation #9804 (Resolved): kvm and qemu do not document ceph/rbd support
* looking for ceph or rbd in http://www.linux-kvm.org/page/Special:Search?search=ceph&go=Go : zero match
* on qemu.o...
Loïc Dachary
09:10 AM Bug #6756 (Fix Under Review): journal full hang on startup
https://github.com/ceph/ceph/pull/2745
(rebased and retested old patch)
Sage Weil
07:48 AM Bug #9729 (Resolved): "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x:paralle...
Yuri Weinstein
07:47 AM rbd Bug #9642 (Resolved): Errors in test_rbd.test_* tests in upgrade:dumpling-firefly-x:parallel-gian...
Yuri Weinstein
07:46 AM rbd Bug #9642: Errors in test_rbd.test_* tests in upgrade:dumpling-firefly-x:parallel-giant-distro-ba...
Fixed, tests passed on bare metal.
Last results - http://pulpito.front.sepia.ceph.com/teuthology-2014-10-16_17:10:01...
Yuri Weinstein
05:16 AM Bug #9794: vstart.sh crashes MON with --paxos-propose-interval=0.01 and one MDS
I confirm that
* the problem can be reproduced 100% of the time on my laptop,
* that cherry-pick c84a13ae87eed555...
Loïc Dachary
04:17 AM Bug #9794: vstart.sh crashes MON with --paxos-propose-interval=0.01 and one MDS
Loic, try this patch with the same conditions in which you triggered it: c84a13ae87eed5550bafda394d983a8e843cc08c
...
Joao Eduardo Luis
01:52 AM Feature #9802 (New): When replaced a disk, the CRUSH weight of the related host changed
In a disk replacement test, after adding a disk to the cluster, the osd tree looks as shown
below:...
Jingjing Zhao

10/16/2014

10:27 PM Bug #9801 (Won't Fix): ceph 0.80.7 build rpm packages in centos 7 error
ceph 0.80.7 build rpm packages in centos 7 error... wei li
06:30 PM Bug #8629: cache_evict needs to prevent make_writeable from creating a snapdir
https://github.com/ceph/ceph/pull/2737 Sage Weil
05:24 PM Fix #9566 (In Progress): osd: prioritize recovery of OSDs with most work to do
Loïc Dachary
05:11 PM Fix #9566: osd: prioritize recovery of OSDs with most work to do
Related commits:
* "osd: prioritize backfill based on *how* degraded":https://github.com/ceph/ceph/commit/0985ae71bc...
Loïc Dachary
05:04 PM Bug #9769 (Resolved): upgrade/firefly: latest_dumpling_release.yaml always fails
Sage Weil
10:56 AM Bug #9769: upgrade/firefly: latest_dumpling_release.yaml always fails
It is fixed; testing now. Here is the run that passed:... Yuri Weinstein
04:59 PM Bug #9765 (Duplicate): CachePool flush -> OSD Failed
I'm pretty sure this is because #8629 has not yet been backported to firefly. It should be in 0.80.8. I'll prepare ... Sage Weil
05:48 AM Bug #9765: CachePool flush -> OSD Failed
The 'forward' mode means we will modify cached objects in place but forward any 'misses'. It is also possible that t... Sage Weil
04:58 PM Bug #9731: Ceph 0.80.6 OSD crashes
sure, just tell me the best command line to use as I haven't ever tried to run ceph-osd outside of the standard init s... Brad House
04:52 PM Bug #9731: Ceph 0.80.6 OSD crashes
Would it be possible to run the osds in question under valgrind? Samuel Just
01:49 PM Bug #9731: Ceph 0.80.6 OSD crashes
core file for last crash as requested by Samuel Just Brad House
01:38 PM Bug #9731: Ceph 0.80.6 OSD crashes
Samuel Just
12:48 PM Bug #9731: Ceph 0.80.6 OSD crashes
Another crash from another node, this time with debug increased. Will attach log, here is the backtrace from gdb:
<...
Brad House
10:42 AM Bug #9731: Ceph 0.80.6 OSD crashes
Another backtrace from a different machine, definitely different:... Brad House
10:33 AM Bug #9731: Ceph 0.80.6 OSD crashes
backtrace from last core file:... Brad House
10:02 AM Bug #9731: Ceph 0.80.6 OSD crashes
Can you reproduce with
debug osd = 20
debug ms = 1
debug filestore = 20
?
Samuel Just
06:47 AM Bug #9731: Ceph 0.80.6 OSD crashes
0.80.7 segfault core file and log. Happened immediately at startup after rebooting after update. Brad House
06:42 AM Bug #9731: Ceph 0.80.6 OSD crashes
I just upgraded to 0.80.7, and got a crash on startup of one of my OSDs. I'll grab the log and core dump and attach ... Brad House
04:04 PM Bug #9794: vstart.sh crashes MON with --paxos-propose-interval=0.01 and one MDS
Reverting to 128 PG on master makes the problem disappear. 92 PG also works. 64 PG fails. Loïc Dachary
03:43 PM Bug #9794: vstart.sh crashes MON with --paxos-propose-interval=0.01 and one MDS
... Loïc Dachary
01:54 PM Bug #9794: vstart.sh crashes MON with --paxos-propose-interval=0.01 and one MDS
It works on v0.85, bisecting Loïc Dachary
01:46 PM Bug #9794: vstart.sh crashes MON with --paxos-propose-interval=0.01 and one MDS
reproduced on a fresh ubuntu 14.04 with v0.86-408-gad2514d Loïc Dachary
02:59 PM Feature #9799: ceph tell {daemon}.{id} config set etc.
Two things to consider:
The authentication model is pretty different for a network connection to the daemon vs. a ...
Dan Mick
01:19 PM Feature #9799 (Resolved): ceph tell {daemon}.{id} config set etc.
It would be nice to be able to send asok commands to a daemon using ceph tell instead of logging in to the machine and usi... Loïc Dachary
02:19 PM Bug #9729 (Fix Under Review): "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x...
Yuri Weinstein
02:19 PM Bug #9729: "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x:parallel-giant-dis...
Backport to master - https://github.com/ceph/ceph-qa-suite/pull/195 Yuri Weinstein
10:59 AM Bug #9729: "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x:parallel-giant-dis...
Passed on nightlies:
http://pulpito.front.sepia.ceph.com/teuthology-2014-10-15_17:10:01-upgrade:dumpling-firefly-x...
Yuri Weinstein
01:54 PM CephFS Bug #9800 (Resolved): client-limits test is not passing
/a/teuthology-2014-10-13_23:04:01-fs-giant-distro-basic-multi/547170
The client isn't dropping its caps:...
Greg Farnum
01:15 PM rbd Bug #9595 (Resolved): librbd: internal methods can operate on extra objects when non-default stri...
commit:7b66ee4928d934d684b361602de783b927988503 Josh Durgin
10:50 AM CephFS Feature #4137: MDS: Implement a forward-scrubbing mechanism.
I realized today that we probably want to optionally scrub directories that were renamed into place following a scrub... Greg Farnum
09:11 AM Bug #9675: splitting a pool doesn't start when rule_id != ruleset_id
See also the ceph-user thread "NO pg created for erasure-coded pool" where rule_id != ruleset on firefly. Loïc Dachary
05:57 AM Bug #9796 (Won't Fix): osd: crash on blacklisted watcher reconnect (dumpling)
... Sage Weil

10/15/2014

11:29 PM Bug #9372 (Fix Under Review): injectarg boolean option is discarded
https://github.com/ceph/ceph/pull/2733 Loïc Dachary
10:03 PM Bug #9765: CachePool flush -> OSD Failed

I do not understand why data is written to the cache pool in forward mode.
[root@ct3 ~]# ceph osd tier cache-mode cach...
Irek Fasikhov
09:31 PM Bug #9765: CachePool flush -> OSD Failed
On
[root@ct01 ~]# ceph --version
ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
the same error.
Irek Fasikhov
08:22 PM Feature #9526 (Resolved): mon: 'osd crush rename-bucket <old> <new>'
Loïc Dachary
05:23 PM Feature #9526 (Fix Under Review): mon: 'osd crush rename-bucket <old> <new>'
https://github.com/ceph/ceph/pull/2732 Loïc Dachary
01:53 PM Feature #9526 (In Progress): mon: 'osd crush rename-bucket <old> <new>'
... Loïc Dachary
08:17 PM Bug #9790: ceph auth get does not always show auid
Since it's a minor inconvenience there probably is no need to backport. Loïc Dachary
08:15 PM Bug #9790 (Resolved): ceph auth get does not always show auid
Since it's a minor inconvenience there probably is no need to backport. Loïc Dachary
11:08 AM Bug #9790: ceph auth get does not always show auid
https://github.com/ceph/ceph/pull/2731 Loïc Dachary
11:00 AM Bug #9790: ceph auth get does not always show auid
And it never shows with *ceph auth list* whatever the format Loïc Dachary
10:59 AM Bug #9790 (Resolved): ceph auth get does not always show auid
The "auid shows in plain format":https://github.com/ceph/ceph/blob/giant/src/auth/KeyRing.cc#L251 but "not otherwise"... Loïc Dachary
06:17 PM devops Bug #9795 (Rejected): [fedora20] yum fails when install qemu-kvm
Hi.
When I try to install qemu-kvm, yum reports a problem with dependencies.
ceph made the following repositories:
...
Maximiliano Federico Osorio Banados
06:09 PM Bug #9794 (Resolved): vstart.sh crashes MON with --paxos-propose-interval=0.01 and one MDS
It must be something specific to my environment that triggers it because this is exactly the same command that "is ru... Loïc Dachary
06:01 PM devops Bug #9793 (Rejected): Fedora 20 ceph-extras Repo missing
Hi
I am trying to install qemu-kvm on Fedora 20 with ceph firefly, following http://ceph.com/docs/firefly/install/in...
Maximiliano Federico Osorio Banados
05:15 PM Bug #9487 (Pending Backport): dumpling: snaptrimmer causes slow requests while backfilling. osd_s...
Sage Weil
05:14 PM Bug #9718: osd_types: check_new_interval: min_size check needs to consider CRUSH_ITEM_NONE
Sage Weil
02:55 PM Bug #9729: "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x:parallel-giant-dis...
Testing changes in https://github.com/ceph/ceph-qa-suite/pull/192
NOTE: this has to sync up with the test suite on...
Yuri Weinstein
01:27 PM rbd Feature #9733: Separate rbd listing into CAP
Yes, working with Format 2 images, it indeed does mount without allowing listing. I've confirmed that our OpenStack e... Robert LeBlanc
12:49 PM Feature #9792 (New): make it harder to remove cluster data pools
- require a special key for this command
- pool removal doesn't apply immediately, but rather first switches pool to...
Yehuda Sadeh
12:45 PM rgw Feature #9791 (New): radosgw-agent: report sync delay
Summarizing this is a bit complicated since there are multiple kinds of logs and each is sharded, but there could be ... Josh Durgin
11:32 AM Bug #9788: "Assertion: common/HeartbeatMap.cc: 79" placeholder for "hit suicide timeout" issues
suite:upgrade:dumpling
run: http://pulpito.front.sepia.ceph.com/teuthology-2014-10-14_17:00:01-upgrade:dumpling-dump...
Yuri Weinstein
07:48 AM Bug #9788 (Closed): "Assertion: common/HeartbeatMap.cc: 79" placeholder for "hit suicide timeout"...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-13_19:30:01-upgrade:dumpling-firefly-x:stress-spli... Yuri Weinstein
10:54 AM Bug #9731: Ceph 0.80.6 OSD crashes
We included the patch I think should fix this in 80.7. Let me know what happens. Samuel Just
10:46 AM Bug #9706 (Resolved): osdc/Objecter.cc: 1570: FAILED assert(op->session)
Sage Weil
07:38 AM rbd Bug #9742: `rbd map lun` fails with: (2) No such file or directory on kernel 3.14.14 w/ udev-216 ...
no, this is running on bare metal. should i try re-generating auth keys for everything? Adeel N
03:07 AM devops Bug #9783: upgrade ceph-common (0.80.7-1trusty) over (0.80.5-0ubuntu0.14.04.1) fails
... Loïc Dachary
12:42 AM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Adding links for commits fixing this issue here for reference:
https://github.com/ceph/ceph/commit/9b18d99817c8b54...
Florian Haas
12:39 AM CephFS Bug #8576: teuthology: nfs tests failing on umount
I notice that if I execute 'service nfs stop' first, unmounting cephfs always succeeds. 'service nfs stop' runs two c... Zheng Yan

10/14/2014

07:32 PM Bug #8620: rest/test.py occasional failure (dumpling)
ubuntu@teuthology:/a/teuthology-2014-10-13_19:00:01-rados-dumpling-distro-basic-multi/545881 Sage Weil
07:30 PM Bug #8851: Mon crash after update to 0.80.4
In our production environment we use 0.83, and we are coming across this problem too.
Try this patch https://github.com/ceph/ceph/pull/2...
wei li
07:22 PM Bug #9765: CachePool flush -> OSD Failed
... Loïc Dachary
02:58 AM Bug #9765: CachePool flush -> OSD Failed
I'm sorry!
*[root@ct3 ~]# ceph --version
ceph version 0.80.6 (f93610a4421cb670b08e974c6550ee715ac528ae)*
Irek Fasikhov
01:24 AM Bug #9765: CachePool flush -> OSD Failed
in addition:... Irek Fasikhov
01:19 AM Bug #9765 (Duplicate): CachePool flush -> OSD Failed
Hi all,
I encountered a problem flushing the data before deleting the CachePool.
My crushmap:...
Irek Fasikhov
06:55 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
https://github.com/ceph/ceph/pull/2724 Loïc Dachary
06:42 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
Indeed ! Thanks ! Loïc Dachary
06:32 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
I don't think this is the patch you want; see c776a89880fdac270e6334ad8e49fa616d05d0d4 and acfe62e0aa45bff208e38aeedad... Mark Kirkwood
06:27 PM Bug #9073 (Fix Under Review): OSD with device/partition journals down after fresh deploy or upgra...
* firefly backport https://github.com/ceph/ceph/pull/2724 Loïc Dachary
06:22 PM Bug #9073 (Pending Backport): OSD with device/partition journals down after fresh deploy or upgra...
The fix for this bug is https://github.com/ceph/ceph/commit/c776a89880fdac270e6334ad8e49fa616d05d0d4 and needs backpo... Loïc Dachary
06:31 PM Bug #9785 (Resolved): /etc/ceph/dmcrypt-keys and key contents are created world-readable
get_or_create_dmcrypt_key in ceph-disk creates the key_dir and key_files, but does not set any specific permissions o... David Clarke
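A minimal sketch of the kind of restriction this report asks for, assuming owner-only access (0700 for the directory, 0600 for the key file) is the desired outcome. The helper names `write_private_key` and `mode_of` are hypothetical and are not taken from ceph-disk.

```python
import os
import stat
import tempfile


def write_private_key(key_dir, name, key_bytes):
    """Hypothetical helper: store key material readable by the owner only."""
    # The mode passed to makedirs is filtered by the umask and is not
    # applied to a directory that already exists, so chmod explicitly.
    os.makedirs(key_dir, mode=0o700, exist_ok=True)
    os.chmod(key_dir, 0o700)

    path = os.path.join(key_dir, name)
    # Create the file with restrictive permissions from the start instead
    # of writing it world-readable and tightening it afterwards.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as f:
        os.fchmod(f.fileno(), 0o600)  # also fixes a pre-existing file
        f.write(key_bytes)
    return path


def mode_of(path):
    """Return only the permission bits of a path."""
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    key_path = write_private_key(os.path.join(base, "dmcrypt-keys"), "example", b"secret")
    print(oct(mode_of(os.path.dirname(key_path))), oct(mode_of(key_path)))
```

Opening the file with mode 0600 at creation time avoids a window in which the key would be readable by other users.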
06:23 PM Bug #9768 (Duplicate): ceph-osd mkfs hangs
Loïc Dachary
06:00 PM Bug #9768: ceph-osd mkfs hangs
Created with ceph-disk prepare --fs-type=ext4 and ceph-disk activate /dev/loop3p1 Loïc Dachary
04:46 PM Bug #9768: ceph-osd mkfs hangs
On ubuntu-14.04 the logs of a ceph-osd mkfs on 0.80.5 that completes successfully. Loïc Dachary
04:04 PM Bug #9768: ceph-osd mkfs hangs
Loïc Dachary
07:21 AM Bug #9768: ceph-osd mkfs hangs
I browsed the aio patches from 3.12.7 until now and saw nothing that could be related to this problem https://www.kerne... Loïc Dachary
07:15 AM Bug #9768: ceph-osd mkfs hangs
... Loïc Dachary
06:01 AM Bug #9768: ceph-osd mkfs hangs
Although https://github.com/ceph/ceph/commit/2f11631f3144f2cc0e04d718e40e716540c8af19 seems related, the log shows Fi... Loïc Dachary
05:45 AM Bug #9768 (Duplicate): ceph-osd mkfs hangs
h3. Workaround for Firefly <= 0.80.7
If it shows with...
Loïc Dachary
06:15 PM CephFS Bug #9674: nightly failed multiple_rsync.sh
rsync asks us to see previous errors;) yes, I think sudo should work Zheng Yan
02:36 PM CephFS Bug #9674: nightly failed multiple_rsync.sh
Well, that would make sense. How did you find those in the log?
We should probably just run this as sudo or someth...
Greg Farnum
06:30 AM CephFS Bug #9674: nightly failed multiple_rsync.sh
... Zheng Yan
05:52 PM devops Bug #9783: upgrade ceph-common (0.80.7-1trusty) over (0.80.5-0ubuntu0.14.04.1) fails
looks like 17732dc0c8878ea58813ad543c5359cb811079cc which probably should have included some other package control he... Dan Mick
04:55 PM devops Bug #9783 (Rejected): upgrade ceph-common (0.80.7-1trusty) over (0.80.5-0ubuntu0.14.04.1) fails
This happens when switching from Ubuntu / Debian repositories to Ceph repositories:... Loïc Dachary
05:31 PM Bug #9784 (Resolved): All tools should be named consistently and argument parsing should be better

Slowly some of the tools like ceph_objectstore_tool have migrated to have underscores in the name. But I noticed s...
David Zafman
04:42 PM Bug #9769: upgrade/firefly: latest_dumpling_release.yaml always fails
Tests are still failing, but at a different point.
See http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-14_14:22:47-u...
Yuri Weinstein
09:17 AM Bug #9769: upgrade/firefly: latest_dumpling_release.yaml always fails
From rerun with verbose:... Yuri Weinstein
08:21 AM Bug #9769 (In Progress): upgrade/firefly: latest_dumpling_release.yaml always fails
Running with verbose is on for job - http://qa-proxy.ceph.com/teuthology/sage-2014-10-13_20:46:44-upgrade:firefly-fir... Yuri Weinstein
06:01 AM Bug #9769 (Resolved): upgrade/firefly: latest_dumpling_release.yaml always fails
... Sage Weil
04:00 PM Bug #9408 (Fix Under Review): erasure-code: misalignment
Loïc Dachary
03:57 PM Bug #9700 (Resolved): cephtool mon_osd intermittent failure
I've not seen errors since this patch, except for firefly builds because this was not backported. Feel free to re-ope... Loïc Dachary
03:06 PM Bug #9388 (Duplicate): osd/PG.cc: 2945: FAILED assert(r == 0) in update_snap_map
David Zafman
03:01 PM Bug #9390 (Duplicate): EEXIST on split due to import/export
David Zafman
03:00 PM Bug #7588 (Resolved): OSD Seg fault in string assign ObjectOperation::C_ObjectOperation_copyget::...
Sage Weil
02:59 PM Bug #9729: "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x:parallel-giant-dis...
The corresponding line of code in master branch for test/librados/misc.cc was changed by Josh in Feb:
7a019b38 src/t...
David Zafman
02:51 PM Bug #9729: "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x:parallel-giant-dis...
Same issue in run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-13_17:10:01-upgrade:dumpling-firefly-x:paral... Yuri Weinstein
02:59 PM rgw Bug #9774 (Won't Fix): multi-version: giant rgw throws 500 with dumpling osds
Sage Weil
09:42 AM rgw Bug #9774 (Won't Fix): multi-version: giant rgw throws 500 with dumpling osds
... Sage Weil
02:58 PM Bug #9757: mon: loops on osd pool create
final commit is cf4e30095e8149d1df0f2c9b4c93c9df0779ec84 Sage Weil
02:57 PM Bug #9757 (Resolved): mon: loops on osd pool create
Loïc Dachary
02:31 PM Bug #9757 (Fix Under Review): mon: loops on osd pool create
Loïc Dachary
02:28 PM Bug #9757: mon: loops on osd pool create
This bug is dated October 12th with https://github.com/ceph/ceph/commit/0c1eafd7ab6f7d2a5eccd10ce267bde5e90932c5 whic... Loïc Dachary
01:51 PM Bug #9757: mon: loops on osd pool create
* https://github.com/ceph/ceph/commit/fe43202449e3caf60e796f1205ef4303e905659d does not need to be backported because... Loïc Dachary
01:40 PM Bug #9757: mon: loops on osd pool create
"mon/OSDMonitor : Use user provided ruleset for replicated pool":https://github.com/ceph/ceph/commit/cf4e30095e8149d1... Loïc Dachary
01:18 PM Bug #9757: mon: loops on osd pool create
... Loïc Dachary
12:54 PM Bug #9757: mon: loops on osd pool create
This was run using the following backport https://github.com/ceph/ceph/commits/wip-9757 Loïc Dachary
02:47 PM Feature #9781 (Resolved): ceph_objectstore_tool: On import handle splits

Once we have OSDMap information we need to check for splits during pg import:
Sam:
Upon import, if we detect a ...
David Zafman
02:45 PM Feature #9780 (Resolved): ceph_objectstore_tool: Add OSDMap information to pg export
Gather appropriate OSDMap information and include in export data.
David Zafman
02:43 PM Fix #7711: OpTracker output doesn't include op size for subops
I didn't do this back then. We should get to it, though. Greg Farnum
02:14 PM Linux kernel client Feature #9779: libceph: sync up with objecter
Make sure not to break existing (correct!) behavior: we need to resend watch or notify when *any* member of the actin... Ilya Dryomov
02:10 PM Linux kernel client Feature #9779 (Resolved): libceph: sync up with objecter
- the way we resend lingering requests isn't quite the same
- __map_request() is too aggressive about resending:
...
Ilya Dryomov
02:04 PM Linux kernel client Bug #8806 (Resolved): libceph: must use new tid when watch is resent
Ilya Dryomov
02:02 PM Fix #9778 (New): forbid erasure code profile modifications that can modify data encoding
even if --force is set in erasure-code-profile set because it can corrupt the content of the erasure coded pool. For ... Loïc Dachary
01:25 PM Feature #9449 (Resolved): mon: make ceph -s break more things onto multiple lines (health blurbs,...
Sage Weil
01:24 PM Feature #9598 (Fix Under Review): re-enable Objecter fast dispatch
Sage Weil
01:24 PM Fix #9194 (In Progress): librados/osd: watch reconnect needs to be exclusive to detect possibly m...
Sage Weil
01:12 PM Feature #9776 (New): try to make address sanitizer work
Samuel Just
12:44 PM Feature #9198 (Fix Under Review): librados: notify callback includes gid of notifier
Sage Weil
12:44 PM Feature #9197 (Fix Under Review): librados/osd: notify reply payload
Sage Weil
12:43 PM Feature #8899 (Resolved): Kerberos/LDAP Support:: mon: define mon role capabilities
Sage Weil
12:29 PM rgw Bug #9763 (Resolved): firefly upgrade tests fail s3tests, apache goes away
https://github.com/ceph/s3-tests/commit/7e7457e1af8481cf111f25edab198d7498e18551 Sage Weil
12:19 PM rgw Bug #9763: firefly upgrade tests fail s3tests, apache goes away
looks like a bad test in s3-tests Sage Weil
11:13 AM Bug #5925: hung ceph_test_rados_delete_pools_parallel
David Zafman
08:47 AM Bug #5925: hung ceph_test_rados_delete_pools_parallel
I've been reproducing this reliably with wip-9321.giant. Hung job in plana12.... Joao Eduardo Luis
11:12 AM Bug #9696 (Resolved): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(wan...
Sage Weil
10:30 AM rbd Bug #5977 (Resolved): librbd: python bindings need docstrings to show up in online docs
commit:7022679e2c76c707d3d28c052045d11736582b3a Josh Durgin
08:11 AM rbd Bug #5977: librbd: python bindings need docstrings to show up in online docs
PR: https://github.com/ceph/ceph/pull/2720 Jason Dillaman
08:10 AM rbd Bug #5977 (In Progress): librbd: python bindings need docstrings to show up in online docs
Jason Dillaman
09:28 AM rbd Fix #7787: rbd diff takes longer as images grow larger
Dependent on issue #4087 Jason Dillaman
09:26 AM rbd Feature #7746: Capacity Management: rbd df
Dependent on issue #4087 Jason Dillaman
09:25 AM rbd Feature #7746 (In Progress): Capacity Management: rbd df
Jason Dillaman
09:07 AM rbd Feature #7746 (Fix Under Review): Capacity Management: rbd df
Ian Colle
09:13 AM rbd Bug #8329 (Need More Info): qemu-img rpm provided breaks snapshooting functionality on centos
Andrija, according to Bugzilla, the availability of the "-s" option in qemu-img was a backporting bug and was effecti... Jason Dillaman
09:12 AM Linux kernel client Feature #190 (Resolved): krbd: DISCARD support
Ian Colle
09:09 AM rbd Feature #8902 (Fix Under Review): rbd mirroring: librbd: funnel snapshot, resize events via lock ...
Josh Durgin
09:08 AM rgw Cleanup #9772 (In Progress): rgw: reorganize RGWRados
Yehuda Sadeh
09:07 AM rgw Cleanup #9772 (Resolved): rgw: reorganize RGWRados
need to clean up the different states, separate access to system objects vs data objects. Yehuda Sadeh
09:07 AM rbd Feature #8900 (Fix Under Review): rbd mirroring: librbd:making image locking mandatory
Ian Colle
09:07 AM rbd Feature #4087 (Fix Under Review): rbd: bitmaps for tracking object existence
Ian Colle
09:06 AM rgw Feature #9013: rgw: set civetweb as a default frontend
Done, commit:63d0ec7b2c00b7f9515d492009115d87414a77ab. Yehuda Sadeh
09:02 AM rbd Bug #9771 (Won't Fix): Segmentation fault after upgrade v0.80.5 -> v0.80.6
This is a new test that upgrades from v0.80.4 -> v0.80.5 -> v0.80.4->firefly and runs different workloads after each step.
...
Yuri Weinstein
07:32 AM Bug #9731: Ceph 0.80.6 OSD crashes
And here is another core file from another server. The backtrace in the log looks like a different path to me. Brad House
07:13 AM Bug #9731: Ceph 0.80.6 OSD crashes
The Debian Wheezy build server doesn't seem to be online yet so I haven't been able to test your patch.
However, I...
Brad House
06:20 AM rbd Feature #9733: Separate rbd listing into CAP
I apologize, I thought I mentioned that you need to use RBD image format 2, but re-reading my comments I seemed to ha... Jason Dillaman
06:20 AM CephFS Feature #9755: Fence late clients during reconnect timeout
There can be certain cases where a client can reconnect after being evicted, e.g. if:
* the client didn't hold an...
John Spray
03:53 AM Fix #9767 (New): do not leak ceph-disk activate lock to the OSD
The "activate_lock":https://github.com/ceph/ceph/blob/giant/src/ceph-disk#L1997 will leak to the OSD. This is harmles... Loïc Dachary
03:19 AM rgw Bug #9766 (Rejected): s3tests: test_100_continue failing
Trying out s3tests on a ceph (0.80.5) cluster with ... Abhishek Lekshmanan
01:38 AM Bug #9761: ceph-osd: segfault at 654c30 ip 00007f00dc5f1f07 sp 00007f00c5642e00 error 7 in ld-2.1...
I cannot reproduce this at will, and debug.20 eats up space. Pavel Veretennikov
01:34 AM Bug #9761: ceph-osd: segfault at 654c30 ip 00007f00dc5f1f07 sp 00007f00c5642e00 error 7 in ld-2.1...
Pavel Veretennikov wrote:
> Just found this error in the logs. Ceph 0.80.6, Ubuntu 14.04, kernel 3.13.0-36-generic
...
Irek Fasikhov
01:18 AM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
Looking at the ceph-osd.2.log uploaded by Sahana.
Prior to the reported problem, there was one more crash while merg...
Varada Kari
12:42 AM Linux kernel client Bug #9749 (Resolved): kcephfs: kernel divide-by-zero crash in __validate_layout (fs/ceph/ioctl.c)
fixed by "ceph: fix divide-by-zero in __validate_layout()" in the testing branch Zheng Yan
12:03 AM rgw Feature #8316: Ceilometer support for RGW Swift statistics
If we want to support ceilometer, can't we support the statistics of both S3 & swift APIs? Abhishek Lekshmanan

10/13/2014

08:45 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
I added a new test (#9758) and testing it on ceph-qa-suites branch 'wip_9758' which is doing step upgrades v0.80.4-v0... Yuri Weinstein
10:52 AM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
wip-9731-firefly does not have this patch. Samuel Just
05:40 PM rgw Bug #9763 (Resolved): firefly upgrade tests fail s3tests, apache goes away
... Sage Weil
04:50 PM CephFS Feature #414 (Fix Under Review): ceph-fuse: implement file locking
Zheng Yan
04:42 PM devops Bug #9747: ceph.spec.in will always use 95-ceph-osd-alt.rules
running gitbuilder Loïc Dachary
04:40 PM devops Bug #9747 (Fix Under Review): ceph.spec.in will always use 95-ceph-osd-alt.rules
* backported to giant already
* firefly backport https://github.com/ceph/ceph/pull/2717
Loïc Dachary
08:16 AM devops Bug #9747 (Pending Backport): ceph.spec.in will always use 95-ceph-osd-alt.rules
Sage Weil
03:05 PM rbd Feature #9733: Separate rbd listing into CAP
Using those caps does not allow the kernel client to mount the image:
[root@nodezz ~]# ceph auth caps client.rdleb...
Robert LeBlanc
02:20 PM rbd Feature #9733: Separate rbd listing into CAP
I've looked over that document a few times, but I'm not finding specifics about things like "object_prefix", "rbd_hea... Robert LeBlanc
01:59 PM rbd Feature #9733: Separate rbd listing into CAP
Yes, you should be able to use different users within Nova, Cinder, and Glance config files. The capability grammar ... Jason Dillaman
01:16 PM rbd Feature #9733: Separate rbd listing into CAP
Let me try this and see if it will do what we think. I don't know enough about the Open Stack side, but I hope we can... Robert LeBlanc
01:08 PM rbd Feature #9733: Separate rbd listing into CAP
The RBD image directory is stored within an object named 'rbd_directory' in each pool. You could create a capspec wh... Jason Dillaman
12:52 PM CephFS Feature #9755: Fence late clients during reconnect timeout
Hmm, I like the basic thrust of this, but I'm a little concerned as well — we have other tickets to let clients recon... Greg Farnum
03:39 AM CephFS Feature #9755 (Resolved): Fence late clients during reconnect timeout

During reconnect, MDSs terminate the sessions of any clients which fail to reconnect within the window. Because wh...
John Spray
12:45 PM Linux kernel client Feature #190: krbd: DISCARD support
This should go upstream to Linus in the next day or two (for 3.18-rc1). Sage Weil
11:29 AM Linux kernel client Feature #190: krbd: DISCARD support
Alphe Salas wrote:
> I agree with Kyle and Brian. This feature is necessary. I would like to have more information a...
Alphe Salas
11:27 AM Linux kernel client Feature #190: krbd: DISCARD support
I agree with Kyle and Brian. This feature is necessary. I would like to have more information about the status of thi... Alphe Salas
12:28 PM rbd Fix #7787 (In Progress): rbd diff takes longer as images grow larger
Jason Dillaman
12:17 PM Bug #9761 (Rejected): ceph-osd: segfault at 654c30 ip 00007f00dc5f1f07 sp 00007f00c5642e00 error ...
Just found this error in the logs. Ceph 0.80.6, Ubuntu 14.04, kernel 3.13.0-36-generic
Nothing special in the ceph...
Pavel Veretennikov
12:03 PM devops Bug #9760 (Rejected): librados2 fails to install from ceph-qa
Which is causing lots of failures in ceph-deploy's test runs
Example full log of one of the failures: http://qa-pr...
Alfredo Deza
10:48 AM Bug #9731: Ceph 0.80.6 OSD crashes
Samuel Just
10:19 AM Linux kernel client Bug #9749: kcephfs: kernel divide-by-zero crash in __validate_layout (fs/ceph/ioctl.c)
Ilya Dryomov
09:59 AM Bug #9714 (Duplicate): Dead jobs in upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-mu...
i think this is a dup of #9757 Sage Weil
09:48 AM Bug #9714: Dead jobs in upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-multi run
Sam, can you take a look at this?
Still an issue in one off run - http://qa-proxy.ceph.com/teuthology/teuthology-2...
Yuri Weinstein
09:35 AM rbd Bug #9742: `rbd map lun` fails with: (2) No such file or directory on kernel 3.14.14 w/ udev-216 ...
is this running inside a container? this looks like a problem with the authentication keys and there is a known issu... Sage Weil
09:32 AM Bug #9744 (Won't Fix): cephx: verify_reply couldn't decrypt with error: error decoding block for ...
this happens when clocks are very skewed. Sage Weil
08:59 AM rbd Bug #9602 (Closed): rbd export -> nc ->rbd import = memory leak
Irek, thanks for the update. Closing as not a bug. Jason Dillaman
04:43 AM rbd Bug #9602: rbd export -> nc ->rbd import = memory leak
Jason Dillaman wrote:
> I quickly attempted to reproduce this on the same version w/o success. Can you attach /etc/...
Irek Fasikhov
08:47 AM Documentation #9730: ceph-deploy mon create-inital, does not take arguments
merged commit eb27245 into master Alfredo Deza
08:46 AM Documentation #9730 (Resolved): ceph-deploy mon create-inital, does not take arguments
Merged the pull request. John Wilkins
08:21 AM Documentation #9730 (In Progress): ceph-deploy mon create-inital, does not take arguments
PR opened https://github.com/ceph/ceph/pull/2714 Alfredo Deza
08:34 AM Bug #9757: mon: loops on osd pool create
also breaking teuthology-2014-10-09_19:30:01-upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-multi Sage Weil
06:31 AM Bug #9757 (Resolved): mon: loops on osd pool create
http://pulpito.ceph.com/sage-2014-10-12_09:13:46-upgrade:dumpling-x-wip-sam-firefly-testing-distro-basic-multi/541361... Sage Weil
07:22 AM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
I have submitted the following patches:
Update s3-tests with the new small size multipart tests:
https://github.c...
Luis Pabon
06:56 AM Cleanup #9756: Issues found by Clang
start with
https://github.com/ceph/autobuild-ceph/blob/master/build-ceph.sh
and make a build-ceph-clang.sh. ...
Sage Weil
04:50 AM Cleanup #9756 (In Progress): Issues found by Clang
I again [1] used Clang with -Weverything [2] to compile the Ceph repository [3].
There is still a huge amount of ser...
Daniel Hofmann
06:22 AM rbd Bug #8329: qemu-img rpm provided breaks snapshooting functionality on centos
Any info on this? At least can we define some preferred way of enabling qemu-img/kvm to speak to CEPH (Do it your self... Andrija Panic
03:16 AM CephFS Feature #9754 (Resolved): A 'fence and evict' client eviction command

Currently the "session evict" operation on the MDS admin socket will terminate the session, and release any capabil...
John Spray
01:07 AM Linux kernel client Bug #9355 (Closed): rbd: map fails with EINVAL inside a container
Opened #9753. Ilya Dryomov
01:06 AM Linux kernel client Feature #9753 (Resolved): libceph: allow custom network namespaces
See the bottom of #9355. Ilya Dryomov
12:57 AM Linux kernel client Bug #9192: krbd: poor read (about 10%) vs write performance
Hi Eric,
Thanks for doing this. I was concerned about this being a regression after the queueing changes, but it ...
Ilya Dryomov

10/12/2014

10:07 PM Bug #9614: PG stuck with remapped
The original fix was not clean, just added a new pull request: https://github.com/ceph/ceph/pull/2711 Guang Yang
09:06 PM Bug #9215: Ceph Firefly 0.80.5 : OSD flapping too frequently
karan singh wrote:
> You can close this case, problem has been solved after applying fix (0.80.5-1-gc4b77d2)
May...
Wang Qiang
06:43 PM Bug #9731: Ceph 0.80.6 OSD crashes
I already saw the commit in the branch. The status of this issue should not be New. Wang Qiang
03:23 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Results for the run teuthology-2014-10-11_19:00:02-upgrade:dumpling-x-wip-9731-firefly-distro-basic-multi
Still jo...
Yuri Weinstein
02:44 PM Linux kernel client Bug #9192: krbd: poor read (about 10%) vs write performance
I was able to get some dedicated test time on one of our Ceph test clusters to rerun the kernel RBD read/write tests ... Eric Eastman
12:09 PM Bug #9752 (Resolved): acting in past intervals contains primary and up_primary (looks like duplic...
In a 0.80.6 in the context of http://tracker.ceph.com/issues/9750 the following showed up (the full output can be fou... Loïc Dachary
11:20 AM Bug #9751: ceph tell osd.6 version hangs
here is the gdb output of the OSD process that fails to answer to ceph tell Loïc Dachary
10:18 AM Bug #9751: ceph tell osd.6 version hangs
attaching the log with lockdep = true, starting from when the osd boots up to the point where ceph tell blocks forever Loïc Dachary
10:10 AM Bug #9751: ceph tell osd.6 version hangs
greg : the log is from when the osd started up to the point where ceph tell hangs Loïc Dachary
09:51 AM Bug #9751: ceph tell osd.6 version hangs
Maybe similar to #9748 and #9714 ? Yuri Weinstein
09:49 AM Bug #9751: ceph tell osd.6 version hangs
Was the OSD already "hanging" when you generated this log? Greg Farnum
09:21 AM Bug #9751 (Rejected): ceph tell osd.6 version hangs
... Loïc Dachary
10:05 AM Bug #9718 (Fix Under Review): osd_types: check_new_interval: min_size check needs to consider CRU...
Sage Weil
09:36 AM Bug #9750: pg incomplete
I guess it's not a bug indeed, only the logical outcome of something going wrong. What is probably a bug is having th... Loïc Dachary
09:32 AM Bug #9750 (Won't Fix): pg incomplete
Loïc Dachary
09:07 AM Bug #9750: pg incomplete
So you don't actually think it's a bug?
In any case, if you still have the disk accessible, you probably want to u...
Greg Farnum
08:54 AM Bug #9750: pg incomplete
osd.3 has failed recently (this morning) because btrfs turned it read-only. it is likely that it contains the missing... Loïc Dachary
08:45 AM Bug #9750: pg incomplete
What did you *do* to this cluster? I don't think these PGs are supposed to have historical acting sets that look like... Greg Farnum
08:01 AM Bug #9750 (Won't Fix): pg incomplete
... Loïc Dachary

10/11/2014

10:33 PM Linux kernel client Bug #9749 (Resolved): kcephfs: kernel divide-by-zero crash in __validate_layout (fs/ceph/ioctl.c)
Our UC-KLEE tool discovered a Linux kernel divide-by-zero crash in the Ceph
client driver. I found the bug on kernel...
David Ramos
03:52 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Sage has scheduled a run on wip-9731-firefly http://pulpito.front.sepia.ceph.com/teuthology-2014-10-10_16:50:01-upgrade... Yuri Weinstein
03:35 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
pre-firefly mons I think would also suffice to cause this bug. Actually, if you upgrade the osds from pre-firefly to... Samuel Just
03:29 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Can you rerun with wip-sam-firefly-testing? (actually, ignore the firefly branch for the moment and use wip-sam-firef... Samuel Just
01:05 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
I guess it's expected, as the backport is still pending.
Update:
In the run http://pulpito.front.sepia.ceph.com/teutho...
Yuri Weinstein
01:10 PM Bug #7588 (Pending Backport): OSD Seg fault in string assign ObjectOperation::C_ObjectOperation_c...
This actually doesn't seem to have been backported to firefly. I think it might be causing some of the cache/tiering... Samuel Just
01:01 PM Bug #9748 (Rejected): Dead jobs in upgrade:dumpling-x-firefly-distro-basic-multi run
Jobs '537916', '537917'
Logs are in:
http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-10_19:00:01-upgrade...
Yuri Weinstein
12:54 PM Bug #9714: Dead jobs in upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-multi run
Same problem in run - http://pulpito.front.sepia.ceph.com/teuthology-2014-10-10_19:00:01-upgrade:dumpling-x-firefly-d... Yuri Weinstein
09:27 AM devops Bug #9747 (Fix Under Review): ceph.spec.in will always use 95-ceph-osd-alt.rules
Loïc Dachary
09:22 AM devops Bug #9747: ceph.spec.in will always use 95-ceph-osd-alt.rules
https://github.com/ceph/ceph/pull/2706 Loïc Dachary
09:12 AM devops Bug #9747 (Resolved): ceph.spec.in will always use 95-ceph-osd-alt.rules
In ceph.spec.in *%if (0%{?rhel} || 0%{?rhel} < 7)* "see sources":https://github.com/ceph/ceph/blob/giant/ceph.spec.in... Loïc Dachary
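The quoted condition can be checked by hand. Below is a minimal Python model of it, assuming that `0%{?rhel}` expands to `0` followed by the RHEL major version (or to plain `0` when the macro is undefined) and that the expression treats any nonzero integer as true. `spec_condition` is a hypothetical name used only for illustration.

```python
def spec_condition(rhel):
    """Model `0%{?rhel} || 0%{?rhel} < 7`.

    `rhel` is the RHEL major version, or None when the macro is undefined.
    """
    # "0" + "" -> 0 when undefined, "0" + "6" -> 6, "0" + "7" -> 7, ...
    value = int("0" + ("" if rhel is None else str(rhel)))
    # `<` binds tighter than `||`, so this reads: value || (value < 7)
    return bool(value) or value < 7


if __name__ == "__main__":
    for rhel in (None, 5, 6, 7, 8):
        print(rhel, spec_condition(rhel))
```

Under these assumptions the left operand is true whenever `rhel` is defined, and the right operand is true whenever it is not, so the branch is taken on every platform, which matches the title of this report.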
09:08 AM Bug #9746 (Resolved): reconcile upstream ceph.spec.in with other ceph.spec (SuSE, EPEL, etc)
There are many differences between the "epel ceph.spec":https://dl.fedoraproject.org/pub/epel/7/SRPMS/c/ceph-0.80.5-8... Loïc Dachary
08:32 AM devops Bug #9721 (Rejected): partx -a should be called after creating the data partition
The diagnosis is incorrect. At the time the data partition is created it does not make sense to try to activate it b... Loïc Dachary

10/10/2014

08:51 PM Bug #9716 (Resolved): Warning in API headers when compiling with -Wstrict-prototypes
commit:d98b75530b0ea8f243a4dc8e1881bc6da2bca99d Josh Durgin
02:10 PM Bug #9716 (Fix Under Review): Warning in API headers when compiling with -Wstrict-prototypes
https://github.com/ceph/ceph/pull/2701 Adam Crume
01:44 PM Bug #9716: Warning in API headers when compiling with -Wstrict-prototypes
Forgot to mention that qemu uses -Werror by default, hence the errors. Adam Crume
08:45 PM Bug #8983 (Resolved): rados bench -b option does not take orders of magnitude (k,M,..) but also d...
commit:3b9dcff7755a3ffcb9df8a06e6d0e525e77de641 Josh Durgin
02:13 PM Bug #8983: rados bench -b option does not take orders of magnitude (k,M,..) but also does not thr...
https://github.com/ceph/ceph/pull/2678 Adam Crume
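As an illustration of what the title of #8983 describes, here is a minimal sketch of a size parser that accepts `k`/`M`/`G` suffixes and raises an error on input it cannot interpret instead of silently accepting it. It assumes binary multiples (1k = 1024); `parse_size` is a hypothetical helper and does not reflect how the linked pull request implements the change.

```python
def parse_size(text):
    """Hypothetical parser for sizes such as '4096', '4k' or '4M'."""
    units = {"k": 1 << 10, "m": 1 << 20, "g": 1 << 30}
    s = text.strip()
    if not s:
        raise ValueError("empty size")

    multiplier = 1
    suffix = s[-1].lower()
    if suffix in units:
        multiplier = units[suffix]
        s = s[:-1]

    # Reject anything that is not a plain non-negative integer, so that a
    # typo such as '4x' is reported rather than silently misread.
    if not s.isdigit():
        raise ValueError("invalid size: %r" % text)
    return int(s) * multiplier


if __name__ == "__main__":
    for arg in ("4096", "4k", "4M"):
        print(arg, parse_size(arg))
```

Failing loudly on malformed input is the design choice here: a benchmark run with an unintended block size is harder to notice than an immediate error.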
08:12 PM Bug #9143 (Rejected): Incorrect key sequence in encoding object name to key for GenericObjectMap
Haomai Wang
07:21 PM Bug #9731: Ceph 0.80.6 OSD crashes
wip-firefly-9696-9731 should have a fix for this as well as 9696. Let me know whether that helps. Samuel Just
03:11 PM Bug #9731: Ceph 0.80.6 OSD crashes
I think this is a bug in PGLog::IndexedLog::trim(). Making patch. Samuel Just
11:44 AM Bug #9731: Ceph 0.80.6 OSD crashes
Attached ceph OSD log from crash with debugging turned on. Brad House
11:03 AM Bug #9731: Ceph 0.80.6 OSD crashes
I will add those to my configuration and restart ceph on each node.
Luckily this is just my test environment.
Brad House
10:47 AM Bug #9731: Ceph 0.80.6 OSD crashes
Can you reproduce either of these with logging?
debug osd = 20
debug filestore = 20
debug ms = 1
Samuel Just
09:45 AM Bug #9731 (Can't reproduce): Ceph 0.80.6 OSD crashes
I received 2 different crashes on 2 different OSDs on different nodes within 30s of each other on 0.80.6. I just upgr... Brad House
06:41 PM Bug #9744: cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
I think I found the problem: new node (with new OSD) had incorrect time.
Everything returned to normal after correct...
Dmitry Smirnov
06:10 PM Bug #9744: cephx: verify_reply couldn't decrypt with error: error decoding block for decryption
Found the following in the logs of the new OSD:... Dmitry Smirnov
05:57 PM Bug #9744 (Won't Fix): cephx: verify_reply couldn't decrypt with error: error decoding block for ...
Shortly after upgrade 0.80.5 to 0.80.6 cluster became slow and then almost completely stopped
with several OSDs exhi...
Dmitry Smirnov
05:45 PM rgw Bug #9307 (Resolved): "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-fir...
above errors from Yuri are #9169, something else Sage Weil
08:10 AM rgw Bug #9307: "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-firefly-x-mast...
Same issues in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-09_19:00:01-upgrade:dumpling-x-firefly-distro-b... Yuri Weinstein
05:00 PM Fix #9199 (In Progress): librados: watch linger pings need to verify pg mapping hasn't changed
Sage Weil
04:59 PM Fix #9196 (In Progress): librados: watch_check() to synchronous verify we haven't missed notifies
Sage Weil
04:59 PM Fix #8905: msgr: encode osd epoch in nonce to avoid misc OSD reconnect races
Sage Weil
04:56 PM Bug #9706 (Fix Under Review): osdc/Objecter.cc: 1570: FAILED assert(op->session)
Sage Weil
09:26 AM Bug #9706 (In Progress): osdc/Objecter.cc: 1570: FAILED assert(op->session)
tick() locking is broken Sage Weil
04:55 PM rgw Bug #7796 (Pending Backport): RGW Keystone token auth fails with '411 Length Required' when Keyst...
Sage Weil
04:11 PM CephFS Bug #9679: Ceph hadoop terasort job failure
I do believe that Hadoop kills the clients after they reach a point that the run-time believes everything has been fl... Noah Watkins
02:02 PM CephFS Bug #9679: Ceph hadoop terasort job failure
Looking at the bad client (11139), the first thing I notice is that the messaging is way backed up. What's the networ... Greg Farnum
09:13 AM CephFS Bug #9679: Ceph hadoop terasort job failure
Here is the directory listing. All of the files should be the same size.... Noah Watkins
03:07 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
commit:82175ec94acc89dc75da0154f86187fb2e4dbf5e Josh Durgin
03:06 PM rbd Bug #9513 (Pending Backport): rbd_cache=true default setting is degading librbd performance ~10X ...
Josh Durgin
01:50 PM rbd Bug #9742 (Resolved): `rbd map lun` fails with: (2) No such file or directory on kernel 3.14.14 w...
when trying to map a standard rbd image as a block device, the command fails with (2) No such file or directory.
I...
Adeel N
01:27 PM Feature #9741 (Closed): teuthology-suite: allow scheduling sub-suites
Samuel Just
01:18 PM Bug #9443: btrfs pwrite returns EEXIST on journal FileJournal::write_bl
Don't run on that kernel. :(
(My understanding is that they have a fix in testing and it shouldn't be in an actual r...
Greg Farnum
01:04 PM Bug #9443: btrfs pwrite returns EEXIST on journal FileJournal::write_bl
Is there a workaround? Loïc Dachary
01:03 PM Bug #9740 (Duplicate): FileJournal::do_write assert(0)
Loïc Dachary
12:59 PM Bug #9740 (Duplicate): FileJournal::do_write assert(0)

http://pulpito.ceph.com/loic-2014-10-10_08:45:20-rados:thrash-erasure-code-isa-master-testing-basic-vps/536207/
...
Loïc Dachary
01:00 PM Bug #9696 (Pending Backport): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED as...
Samuel Just
10:11 AM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Ok, can you reproduce with the logging above? Samuel Just
01:09 AM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Also, could either Loïc or Sam explain what exact combination of circumstances causes this assert to trigger? I can't... Florian Haas
12:59 AM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Sam, I can confirm with certainty that this did *not* happen during an upgrade from dumpling. All nodes were running ... Florian Haas
12:09 PM Bug #9739 (Won't Fix): rados cli: listsnaps does not list snaps
To reproduce:... Adam Crume
12:04 PM Bug #9738 (Won't Fix): rados cli: objects not present in a snapshot are listed anyway
To reproduce:... Adam Crume
11:54 AM Bug #9737 (Resolved): rados cli: --snapid (not --snap) option is broken
Running "rados --pool mypool --snapid 1 ls" (assuming 1 is a valid snap number) crashes without printing or returning... Adam Crume
11:34 AM RADOS Bug #9736 (New): rados cli doesn't print specific usage errors
If a user executes e.g. "rados lssnap", the command prints out the usage information. However, it does not say that ... Adam Crume
11:23 AM Bug #9735 (Resolved): "rados lock list" doesn't output final end-of-line
The rados command-line utility doesn't output an end-of-line character at the end of the output of the "lock list" co... Adam Crume
11:10 AM Bug #9729: "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x:parallel-giant-dis...
David, is it related to some code you were working on? Please take a look and reassign if necessary. Yuri Weinstein
09:06 AM Bug #9729 (Resolved): "LibRadosMisc.Operate1PP" test failed in upgrade:dumpling-firefly-x:paralle...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-10_08:19:51-upgrade:dumpling-firefly-x:parallel-gi... Yuri Weinstein
10:59 AM Feature #6258: ceph-disk: zap should wipefs
A user in the #ceph-devel channel had issues, it wouldn't matter that he tried to zap the disk, the filesystem was st... Alfredo Deza
10:25 AM rbd Feature #9733 (New): Separate rbd listing into CAP
We are concerned that if the key is compromised in our OpenStack environment, then all images in the pool can be list... Robert LeBlanc
09:46 AM Bug #9732 (Resolved): ReplicatedPG::hit_set_trim osd/ReplicatedPG.cc: 11006: FAILED assert(obc)
The timezone of the machine was incorrect: CDT instead of CEST. All other machines (MON and OSD) are on CEST.
On a ...
Loïc Dachary
09:26 AM Documentation #9730 (Resolved): ceph-deploy mon create-inital, does not take arguments
It uses the same hosts that were passed into `ceph-deploy new {HOSTS}`
But this section says the user should pas...
Alfredo Deza
09:15 AM Feature #9728: erasure-code: jerasure support for NEON
https://github.com/ceph/ceph/pull/2694 Loïc Dachary
09:02 AM Feature #9728 (Resolved): erasure-code: jerasure support for NEON
Work done by Janne Grunau @ https://github.com/jannau/ceph/compare/neon . It will be available in Hammer. Loïc Dachary
08:50 AM rbd Feature #8902: rbd mirroring: librbd: funnel snapshot, resize events via lock holder
WIP branch: https://github.com/ceph/ceph/compare/wip-8902 Jason Dillaman
08:46 AM Bug #9702: "MaxWhileTries: 'wait_until_healthy'reached maximum tries" in upgrade:firefly-x-giant-...
Same in run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-08_19:20:02-upgrade:firefly-x-giant-distro-basic-m... Yuri Weinstein
08:46 AM Bug #9703: "Segmentation fault" in upgrade:firefly-x-giant-distro-basic-multi run
Same in run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-08_19:20:02-upgrade:firefly-x-giant-distro-basic-m... Yuri Weinstein
08:45 AM devops Bug #9724: VPS machines not being locked "No route to host"
Why does this keep happening? Zack Cerza
08:44 AM devops Bug #9724: VPS machines not being locked "No route to host"
correct URL:
http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-09_19:00:01-upgrade:dumpling-x-firefly-distro-ba...
Zack Cerza
07:59 AM devops Bug #9724 (Rejected): VPS machines not being locked "No route to host"
In the run http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-09_19:00:01-upgrade:dumpling-x-firefly-distro-basic... Yuri Weinstein
08:27 AM Bug #9727 (Duplicate): 0.86 EC+ KV OSDs crashing
Hi, testing our Tiering setup with EC+KV backend a bit further on the latest dev release, our OSDS started to crash a... Kenneth Waegeman
08:03 AM devops Bug #9725 (Won't Fix): Error "'sudo yum install ceph-radosgw-0.67.11 -y'"in upgrade:dumpling-x-fi...
In run http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-09_19:00:01-upgrade:dumpling-x-firefly-distro-basic-vps... Yuri Weinstein
07:18 AM CephFS Bug #9692 (Resolved): ACL workunit syntax error
Zheng Yan
07:05 AM rgw Feature #9723 (New): Support metering info
Add object storage metering support similar to openstack swift ceilometer. It should be pluggable with openstack ce... Swami Reddy
05:39 AM devops Bug #9721 (Fix Under Review): partx -a should be called after creating the data partition
https://github.com/ceph/ceph/pull/2648 and https://github.com/dachary/ceph/commit/81d6c5b5a33de745ae4a23536409de0c0e7... Loïc Dachary
05:26 AM devops Bug #9721: partx -a should be called after creating the data partition
Loïc Dachary
05:18 AM devops Bug #9721 (Rejected): partx -a should be called after creating the data partition
In the following udev is racing with the creation of the partition:... Loïc Dachary
03:23 AM Feature #9720 (Resolved): erasure-code: non regression should test jerasure variants
check that content encoded with one variant exactly matches content encoded with another variant Loïc Dachary
12:36 AM rgw Feature #8052: Support for Keystone Identity API v3
From swift 2.2.0 changelog:
* Added support for Keystone v3 auth.
Keystone v3 introduced the concept ...
Dag Stenstad

10/09/2014

05:14 PM Bug #9718 (Resolved): osd_types: check_new_interval: min_size check needs to consider CRUSH_ITEM_...
Samuel Just
04:40 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
wip-9696-firefly removes the assert on firefly; it's not valid for the compat case. Samuel Just
04:32 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
https://github.com/ceph/ceph/pull/2684/files Samuel Just
04:31 PM Bug #9696 (Fix Under Review): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED as...
Samuel Just
04:31 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Can you restart one of the crashing osds with
debug osd = 20
debug filestore = 20
debug ms = 1 ?
As far as we...
Samuel Just
04:26 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Loïc Dachary
04:25 PM Bug #9696 (Fix Under Review): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED as...
https://github.com/ceph/ceph/pull/2684 Loïc Dachary
04:20 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Samuel Just
04:16 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
running in gitbuilder under the branch wip-9696-compat-acting Loïc Dachary
03:56 PM Bug #9696 (Fix Under Review): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED as...
Loïc Dachary
03:38 PM Bug #9696 (In Progress): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(...
https://github.com/ceph/ceph/pull/2682 Loïc Dachary
03:01 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
It actually failed a new test case AFTER it went out into a stable release version. Ian Colle
02:28 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
Whoa, wait -- Loïc, are you saying this actually failed a test case and still made it into a release in a stable vers... Florian Haas
02:23 PM Bug #9696: Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(want_acting_ba...
For the record http://tracker.ceph.com/issues/9715 hits the same assert in similar conditions in teuthology and the f... Loïc Dachary
03:46 PM rgw Bug #7796 (Fix Under Review): RGW Keystone token auth fails with '411 Length Required' when Keyst...
Yehuda Sadeh
03:32 PM Bug #9715: assert(want_acting_backfill.size() - want_backfill.size() == num_want_acting) firefly
sjust: I think it's due to the compatibility thing where we include the backfill peer in the acting set if there are ... Loïc Dachary
03:11 PM Bug #9715: assert(want_acting_backfill.size() - want_backfill.size() == num_want_acting) firefly
I see the change (92cfd370) that added the assert and didn't consider "compat_mode." In older OSDs we only have one ... David Zafman
02:20 PM Bug #9715 (Duplicate): assert(want_acting_backfill.size() - want_backfill.size() == num_want_acti...
Loïc Dachary
02:08 PM Bug #9715: assert(want_acting_backfill.size() - want_backfill.size() == num_want_acting) firefly
" assert(want_acting_backfill.size() - want_backfill.size() == num_want_acting);":https://github.com/ceph/ceph/blob/f... Loïc Dachary
09:51 AM Bug #9715 (Duplicate): assert(want_acting_backfill.size() - want_backfill.size() == num_want_acti...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-08_19:30:01-upgrade:dumpling-firefly-x:stress-spli... Yuri Weinstein
03:12 PM Bug #8983 (Fix Under Review): rados bench -b option does not take orders of magnitude (k,M,..) bu...
Adam Crume
02:26 PM devops Bug #9712: ceph.com is not accessible from IPv6 only environments
Works for me (native v6 ip).
Thanks!
Gleb Borisov
12:23 PM devops Bug #9712 (Resolved): ceph.com is not accessible from IPv6 only environments
This is me. Somehow the ipv6 ip address became unassigned to the ceph.com dedicated server in DH's database. Not sure... Sandon Van Ness
10:29 AM devops Bug #9712: ceph.com is not accessible from IPv6 only environments
Apologies if somebody else should be handling this, but I think it's yours? :) Greg Farnum
07:48 AM devops Bug #9712 (Resolved): ceph.com is not accessible from IPv6 only environments
Some time ago we found that we can't connect to ceph.com:443. Moreover, we can't ping it either.... Gleb Borisov
01:14 PM Bug #9711 (Duplicate): 'cache' osd crash on ceph 0.86
Loïc Dachary
01:29 AM Bug #9711 (Duplicate): 'cache' osd crash on ceph 0.86
In a tiering setup cache+ EC on KV, one cache OSD has crashed after about 12hours testing with rados bench.
Stackt...
Kenneth Waegeman
01:13 PM Bug #9480: OSD is crashing while object deletion
It's back http://tracker.ceph.com/issues/9711 Loïc Dachary
12:07 PM CephFS Bug #9679: Ceph hadoop terasort job failure
empty fs:... Noah Watkins
08:21 AM CephFS Bug #9679: Ceph hadoop terasort job failure
Thanks Huamin. Yeah, it looks like some writes are being lost, probably due to an unclean shutdown. I'll get some trac... Noah Watkins
08:06 AM CephFS Bug #9679: Ceph hadoop terasort job failure
For comparison, teragen files on CephFS
./hadoop/bin/hadoop fs -ls /in-dir-3
14/10/09 08:05:05 WARN util.NativeC...
Huamin Chen
07:04 AM CephFS Bug #9679: Ceph hadoop terasort job failure
Ran the same tests on HDFS 2.4.1, though on a different setup. Terasort finished without any problem.
Cmd:
./hado...
Huamin Chen
11:46 AM rbd Feature #2467 (In Progress): qemu: implement bdrv_invalidate_cache
Patch sent to qemu-devel@nongnu.org. Adam Crume
10:50 AM Bug #9716 (Resolved): Warning in API headers when compiling with -Wstrict-prototypes
Configuring qemu fails because of the compile errors:... Adam Crume
09:40 AM Bug #9714 (Duplicate): Dead jobs in upgrade:dumpling-firefly-x:stress-split-giant-distro-basic-mu...
Run http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-08_19:30:01-upgrade:dumpling-firefly-x:stress-split-giant-... Yuri Weinstein
09:30 AM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
Thanks for the update, Ilya! You actually gave me a hint as to a workaround - run the container with `--net host` so ... Chris Armstrong
08:59 AM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
The... Ilya Dryomov
09:23 AM rbd Feature #8902 (In Progress): rbd mirroring: librbd: funnel snapshot, resize events via lock holder
... also include flatten. Jason Dillaman
09:22 AM rbd Feature #8900 (In Progress): rbd mirroring: librbd:making image locking mandatory
Jason Dillaman
08:28 AM Bug #9610: Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)...
Ian Colle
08:11 AM Bug #9610: Crash "RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)...
Still an issue: http://pulpito.front.sepia.ceph.com/teuthology-2014-10-08_23:20:03-multi-version-giant-distro-basic-m... Yuri Weinstein
08:16 AM rgw Bug #9612 (New): "ERROR: test suite for <module 's3tests.functional'" in multi-version-giant-test...
Still an issue: http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-08_23:20:03-multi-version-giant-distro-basic-m... Yuri Weinstein
08:09 AM Bug #9705 (Duplicate): "RadosModel.h: 829: FAILED assert(0)" in multi-version-giant-distro-basic-...
Yuri Weinstein
07:06 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
I've just updated pull request 2419 with a more complete fix for the issue.
I was now able to reproduce 100% when my...
Sebastien Ponce
02:52 AM Bug #9327: Usability Issue: Ceph-deploy does not print all the commands which it is executing
This issue was seen in the 0.84 build; can we cross-check once? Ramakrishnan P
02:30 AM Feature #9420 (Fix Under Review): erasure-code: tools and archive to check for non regression of ...
The gitbuilders have been updated, it is ready for review. Loïc Dachary
12:28 AM Bug #9077: Cluster is up in MON node even if Ceph is uninstalled in OSD node
What will be the state of the OSD in a 3-node cluster? In a 3-node cluster there will be other OSDs running on other nodes, so... Ramakrishnan P

10/08/2014

11:08 PM CephFS Bug #9679: Ceph hadoop terasort job failure
missing one of these?... Noah Watkins
10:46 PM CephFS Bug #9679: Ceph hadoop terasort job failure
My bet at this point is on the generation of the input data set. Teragen creates a file with X 100-byte entries. When ... Noah Watkins
07:57 PM Bug #9559: ?off-by-one vulnerability?ceph-0.80.5/src/common/fd.cc dump_open_fds() function
Please give me a CVE ID, thanks. qinghao tang
04:23 PM Bug #9630 (Need More Info): osd: leaked pg refs on shutdown (dumpling)
I'm out of ideas but happy to keep exploring if someone has a lead. If this happens again cross referencing the logs ... Loïc Dachary
04:12 PM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
"OSD::shutdown":https://github.com/ceph/ceph/blob/dumpling/src/osd/OSD.cc#L1521 clear() the "finished":https://github... Loïc Dachary
03:21 PM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
... Loïc Dachary
01:53 PM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
The last thing that happened to pg 2.15 was... Loïc Dachary
01:35 PM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
Log lines related to pg 2.15... Loïc Dachary
11:38 AM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
It could not be in_progress_splits : the logs do not contain the word *split* Loïc Dachary
11:18 AM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
It is the same assert as http://tracker.ceph.com/issues/7891 but the PGBackend did not exist at the time, therefore t... Loïc Dachary
11:03 AM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
The osd.2 actually crashed with:... Loïc Dachary
10:07 AM Bug #9630: osd: leaked pg refs on shutdown (dumpling)
The full valgrind report from remote/vpm180/log/valgrind/osd.2.log.gz... Loïc Dachary
03:58 PM Bug #8595: osd: client op blocks until backfill starts (dumpling)
It seems that we need to backport the update_range/scan_range changes (intended to avoid backfill related flushes) fr... Samuel Just
01:17 PM Bug #8595: osd: client op blocks until backfill starts (dumpling)
I think the least distasteful solution is to actually backport the last_backfill_started modifications. I'll start t... Samuel Just
03:38 PM rbd Feature #7272 (Duplicate): rbd: import performance
Josh Durgin
03:14 PM rbd Feature #7272: rbd: import performance
Reads are single-threaded, but writes are asynchronous, so multiple could be in flight at once. (In rbd.cc, do_impor... Adam Crume
11:53 AM rbd Bug #9513 (Fix Under Review): rbd_cache=true default setting is degading librbd performance ~10X ...
Adam Crume
11:06 AM Bug #9496 (Resolved): mon: pg scrub timestamps must be populated at pg creation
Samuel Just
11:04 AM Bug #9128 (Resolved): Newly-restarted OSD may suicide itself after hitting suicide time out value...
Samuel Just
11:04 AM Bug #9419 (Pending Backport): dumpling->firefly upgrade, sending setallochint?
Samuel Just
10:59 AM rbd Feature #8900: rbd mirroring: librbd:making image locking mandatory
WIP branch: https://github.com/ceph/ceph/compare/wip-8900 Jason Dillaman
09:37 AM rbd Bug #9642: Errors in test_rbd.test_* tests in upgrade:dumpling-firefly-x:parallel-giant-distro-ba...
Yuri Weinstein
08:49 AM Bug #9706 (Resolved): osdc/Objecter.cc: 1570: FAILED assert(op->session)
This was actually on wip-sam-testing, but does not appear related to any of the patches.
ubuntu@teuthology:/a/samu...
Samuel Just
08:43 AM rgw Bug #9307 (New): "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-firefly-...
I see the same issues on giant run:
http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-07_15:54:57-upgrade:dum...
Yuri Weinstein
08:31 AM Bug #9705 (Duplicate): "RadosModel.h: 829: FAILED assert(0)" in multi-version-giant-distro-basic-...
Looks similar to #9528 (no root issue mentioned)
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-1...
Yuri Weinstein
08:05 AM Bug #9703 (Resolved): "Segmentation fault" in upgrade:firefly-x-giant-distro-basic-multi run
I see coredump on mira076 client.1 (@*/531751/remote/mira076/@), but could not get any info about it.
Logs are in ...
Yuri Weinstein
08:00 AM Bug #9702 (Duplicate): "MaxWhileTries: 'wait_until_healthy'reached maximum tries" in upgrade:fire...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-06_19:20:01-upgrade:firefly-x-giant-distro-basic-m... Yuri Weinstein
07:56 AM Bug #9700: cephtool mon_osd intermittent failure
Waiting about a week to see if it shows up again. Loïc Dachary
07:30 AM Bug #9700 (Fix Under Review): cephtool mon_osd intermittent failure
https://github.com/ceph/ceph/pull/2670... Loïc Dachary
07:23 AM Bug #9700: cephtool mon_osd intermittent failure
The osd 1 goes down during the following. Reading the script and what it does I can imagine why. Unless osd.1 dies be... Loïc Dachary
06:57 AM Bug #9700: cephtool mon_osd intermittent failure
ENXIO is expected when ceph tell tries to join an osd that is not ready and it should be treated as EAGAIN. If it hap... Loïc Dachary
06:11 AM Bug #9700 (Resolved): cephtool mon_osd intermittent failure

Hit this one time on a gitbuilder: it's not clear to me why we have a 5-time retry here: some timeout raciness in t...
John Spray
07:28 AM CephFS Feature #9437 (Resolved): make 'ceph tell mds.* ...' work, deprecate 'ceph mds tell * ...'
... John Spray
05:29 AM devops Support #8861: Deploying additional monitors fails.
My workaround was to declare all monitors before install, and install all monitors at once. Pretty sure if I ne... Bobby Yakov
02:39 AM devops Support #8861: Deploying additional monitors fails.
As per my update in #5195:
Same here. I have run through the latest quick start documentation and am using Ubuntu ...
Matthew Rees
05:16 AM devops Bug #9697 (Rejected): exitcode of gatherkeys has changed the latests versions
Hi,
We've been using ceph-deploy in a deployment component, and we also use the gatherkeys function.
In some earl...
Kenneth Waegeman
02:37 AM Bug #5195: "ceph-deploy mon create" fails when adding additional monitors
Same here. I have run through the latest quick start documentation and am using Ubuntu 14.04.1 and Ceph firefly with ... Matthew Rees
02:15 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
Finally I can reproduce it! I know, I've already said that and was wrong...
Actually I still don't manage with the ...
Sebastien Ponce
12:35 AM Bug #9696 (Resolved): Upgrade from 0.80.5 to 0.80.6 causes OSDs to go down with FAILED assert(wan...
After an upgrade from 0.80.5 to 0.80.6, almost *all* OSDs went down after hitting the following failed assertion:
...
Florian Haas
12:31 AM Bug #9408 (In Progress): erasure-code: misalignment
Loïc Dachary
12:15 AM Bug #9677 (Resolved): osd_disk_thread_ioprio_class is ignored
Loïc Dachary

10/07/2014

10:01 PM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
Josh Durgin wrote:
> The issue was reported on firefly - does it have the same behavior as master, or is there somet...
Luis Pabon
05:48 PM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
The issue was reported on firefly - does it have the same behavior as master, or is there something that should be ba... Josh Durgin
05:30 PM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
Here is the response from the gateway:... Luis Pabon
12:56 PM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
Maybe the problem is that we don't send the xml body with the appropriate error? Yehuda Sadeh
12:52 PM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
I have added a test to s3-test to check for EntityTooSmall and it *passes* on the current code. According to AWS an ... Luis Pabon
09:50 PM Feature #7104: rest-api: support commands requiring 'w' cap without 'rw' cap
I'm happy to redefine the permissions if and when that becomes an option/requirement, but until then, it seems like t... Dan Mick
09:42 PM Feature #7104: rest-api: support commands requiring 'w' cap without 'rw' cap
The immediate issue was resolved by switching it to rw (or so my code check and utter lack of memory tells me). But I... Greg Farnum
09:28 PM Feature #7104: rest-api: support commands requiring 'w' cap without 'rw' cap
Well, hang on a minute...the question is about the nature of the command, which is totally mds-specific, not rest-api... Dan Mick
07:07 AM Feature #7104: rest-api: support commands requiring 'w' cap without 'rw' cap
I don't know that this is still a bug, but since it was a REST api issue I don't think it belongs in the MDS tracker ... Greg Farnum
07:28 PM CephFS Bug #9692 (Resolved): ACL workunit syntax error
http://pulpito.ceph.com/gregf-2014-10-06_19:59:42-kcephfs-wip-9628-testing-basic-multi/531900... Greg Farnum
07:26 PM CephFS Bug #9628 (Resolved): mds: race between ms_handle_accept() and ms_handle_reset()
Merged to master in commit:1b7fae7b2953649564a9e226b4abedad0ce652cc Greg Farnum
05:51 PM rbd Bug #9513 (In Progress): rbd_cache=true default setting is degading librbd performance ~10X in Giant
The regression was introduced in commit 4fc9fffc494abedac0a9b1ce44706343f18466f1 (according to git bisect). This is ... Adam Crume
04:33 PM RADOS Bug #9606: mon: ambiguous error_status returned to user when type is wrong in a command
Regardless of this being properly parsed on the client or not, the monitor should not rely on client argument validat... Joao Eduardo Luis
04:28 PM Bug #9496 (Fix Under Review): mon: pg scrub timestamps must be populated at pg creation
https://github.com/ceph/ceph/pull/2663
also in wip-sam-testing
Joao Eduardo Luis
01:46 PM Bug #9496: mon: pg scrub timestamps must be populated at pg creation
Kind of odd, last scrub timestamp should never be 0. Samuel Just
02:35 PM Bug #9416 (Duplicate): ods crash in upgrade:dumpling-dumpling-distro-basic-vps run
Samuel Just
02:35 PM Fix #9689 (New): ceph df reports % of global size used instead of MAX AVAIL 0.80.6
Running the ceph df command returns a much lower %USED than expected. Instead of reporting %USED of MAX AVAIL, which ... Heath Jepson
02:33 PM Bug #9503 (Pending Backport): Dumpling: removing many snapshots in a short time makes OSDs go ber...
Samuel Just
01:57 PM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
https://github.com/ceph/ceph/pull/2659 Samuel Just
02:08 PM Bug #9203 (Resolved): ceph_test_rados: ObjectDesc::iterator::advance(bool): Assertion `pos < limi...
Sage Weil
11:25 AM Bug #9203 (Fix Under Review): ceph_test_rados: ObjectDesc::iterator::advance(bool): Assertion `po...
Samuel Just
02:06 PM Bug #9113 (Pending Backport): osd: snap trimming eats memory, linearly
Sage Weil
11:25 AM Bug #9113 (Fix Under Review): osd: snap trimming eats memory, linearly
Samuel Just
02:04 PM Bug #9626 (Pending Backport): PG: cancel backfill reservations if we get a cancel during backfill
Sage Weil
11:25 AM Bug #9626 (Fix Under Review): PG: cancel backfill reservations if we get a cancel during backfill
Samuel Just
01:57 PM Bug #7368: ceph osd repair * blocks after some minutes and prevent other ceph pg repair commands
Loic, If I understand correctly, #9566 is "normal" backfilling, and Sage's explanation is clear. In my case, I had lo... Yann Dupont
01:40 PM Bug #7368 (Can't reproduce): ceph osd repair * blocks after some minutes and prevent other ceph p...
Samuel Just
01:51 PM Bug #9467 (Won't Fix): Delete default erasure coded profile getting succeeded
This looks like exactly how it is supposed to work? Samuel Just
01:49 PM Bug #9434: rbd rm hangs
Are your pgs clean? (ceph -s) Samuel Just
01:43 PM Bug #9551 (Duplicate): "Segmentation fault" in upgrade:firefly-firefly-testing-basic-vps run
Samuel Just
01:42 PM Bug #2848 (Won't Fix): OSDMap: pool_id is 64-bit, but pool_max is 32-bit
Samuel Just
01:32 PM Bug #9515 (Duplicate): "Segmentation fault (ceph_test_rados_api_io)" in upgrade:dumpling-giant-x:...
Ok, hopefully fixed this time. Samuel Just
01:28 PM Bug #8822 (Resolved): osd: hang on shutdown, spinlocks
Samuel Just
01:25 PM Bug #9181: Osd: segv in OpTracker::unregister_inflight_op
Somnath Roy wrote:
> Sam,
> This core is different and happening on Firefly. The other optracker fixes should also ...
Somnath Roy
01:25 PM Bug #9181: Osd: segv in OpTracker::unregister_inflight_op
Sam,
This core is different and happening on Firefly. The other optracker port should also be backported to Firefly ...
Somnath Roy
01:15 PM Bug #9181 (Resolved): Osd: segv in OpTracker::unregister_inflight_op
I think this got fixed with the other optracker fix? Samuel Just
01:22 PM Bug #9661 (Resolved): ceph_objectstore_tool doesn't work with memstore
6067f295e7bc571b43aa891f5560d96933721b19 David Zafman
01:20 PM Bug #9682 (Duplicate): "os/FileJournal.cc: 1677: FAILED assert(0)" in upgrade:firefly-firefly-dis...
Samuel Just
10:55 AM Bug #9682 (Duplicate): "os/FileJournal.cc: 1677: FAILED assert(0)" in upgrade:firefly-firefly-dis...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-05_10:00:04-upgrade:firefly-firefly-distro-basic-m... Yuri Weinstein
01:19 PM Bug #9683 (Duplicate): "Segmentation fault" in upgrade:firefly-firefly-distro-basic-multi run
Samuel Just
10:58 AM Bug #9683 (Duplicate): "Segmentation fault" in upgrade:firefly-firefly-distro-basic-multi run
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-05_10:00:04-upgrade:firefly-firefly-distro-basic-m... Yuri Weinstein
01:16 PM Bug #8333 (Can't reproduce): ceph_test_rados_delete_pools_parallel: Received fewer notifies than ...
Samuel Just
01:12 PM Bug #9128: Newly-restarted OSD may suicide itself after hitting suicide time out value because it...
Samuel Just
01:10 PM Bug #9322 (In Progress): OSDMap updates from pgmap can be delayed indefinitely
Joao Eduardo Luis
01:10 PM Bug #9321 (In Progress): pgmap updates from OSDMap can be delayed indefinitely
Joao Eduardo Luis
01:09 PM Bug #6101 (Can't reproduce): ceph-osd crash on corrupted store
Samuel Just
12:28 PM Bug #9582 (Resolved): librados: segmentation fault on timeout
I believe all patches affecting firefly and dumpling have been backported. Sage Weil
11:41 AM Bug #9582 (Pending Backport): librados: segmentation fault on timeout
Sage Weil
11:42 AM Bug #9650 (Resolved): RWTimer cancel_event is racy
Sage Weil
11:29 AM Bug #8520 (Can't reproduce): osd: segv in PushOp::print()
Samuel Just
11:28 AM Bug #9008 (Pending Backport): Objecter: pg listing can deadlock when throttling is in use
Samuel Just
11:28 AM Bug #9417 (Duplicate): "Segmentation fault" in upgrade:dumpling-giant-x-master-distro-basic-vps run
Samuel Just
11:26 AM Bug #9614: PG stuck with remapped
Samuel Just
11:03 AM Bug #9684 (Can't reproduce): "Scrubbing terminated" in upgrade:firefly-firefly-distro-basic-multi...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-05_10:00:04-upgrade:firefly-firefly-distro-basic-m... Yuri Weinstein
10:28 AM rbd Bug #9642: Errors in test_rbd.test_* tests in upgrade:dumpling-firefly-x:parallel-giant-distro-ba...
Should be fixed by https://github.com/ceph/ceph-qa-suite/pull/169 Yuri Weinstein
09:54 AM Messengers Fix #9678 (Rejected): errno shadowed in Pipe.cc
Greg Farnum
07:41 AM Messengers Fix #9678: errno shadowed in Pipe.cc
If it is expected to see an error message when there is nothing to read, then I was mistaken.
Not retrieving the ...
Loïc Dachary
07:13 AM Messengers Fix #9678: errno shadowed in Pipe.cc
Where's it being reset? That error message is admittedly strange but it actually happens because the underlying funct... Greg Farnum
05:54 AM Messengers Fix #9678 (Rejected): errno shadowed in Pipe.cc
In some places errno is used after it has been reset and the original error code does not show in the message. For in... Loïc Dachary
09:54 AM rbd Bug #6926 (Resolved): rbd: diff output includes previously non-existent objects as zeroed extents
commit:9a1ab95176fe4d200a83b7b4f7e2b3097d541a7a Josh Durgin
09:54 AM CephFS Bug #9679: Ceph hadoop terasort job failure
https://issues.apache.org/jira/browse/MAPREDUCE-2018 Noah Watkins
09:53 AM CephFS Bug #9679: Ceph hadoop terasort job failure
https://svn.apache.org/repos/asf/hadoop/common/branches/MAPREDUCE-233/src/examples/org/apache/hadoop/examples/terasor... Noah Watkins
09:39 AM CephFS Bug #9679: Ceph hadoop terasort job failure
Teragen command:
./hadoop/bin/hadoop jar ./hadoop-2.4.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar t...
Huamin Chen
09:22 AM CephFS Bug #9679: Ceph hadoop terasort job failure
Thanks for adding this. What command did you use to generate the input? Noah Watkins
09:04 AM CephFS Bug #9679 (Closed): Ceph hadoop terasort job failure
Hadoop version: 2.4.1
Ceph version:
ceph --version
ceph version 0.85-986-g031ef05 (031ef0551ebc98d824075558e884...
Huamin Chen
09:36 AM rbd Bug #9680 (Duplicate): Errors in test_rbd.* in upgrade:dumpling-firefly-x:parallel-giant-distro-b...
Josh Durgin
09:28 AM rbd Bug #9680 (Duplicate): Errors in test_rbd.* in upgrade:dumpling-firefly-x:parallel-giant-distro-b...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-06_17:20:35-upgrade:dumpling-firefly-x:parallel-gi... Yuri Weinstein
09:32 AM Linux kernel client Bug #4689: libceph: don't have alloc_msg methods limit length
Related to #9560, #9561? Ilya Dryomov
09:28 AM rbd Bug #5768 (Resolved): rbd-fuse: leak in enumerate_images()
commit:9132ca47959ae1a9a658971b0c8f4fe6e8d0cad3 Josh Durgin
09:26 AM rbd Bug #7385: Objectcacher setting max object counts too low
Josh Durgin
09:24 AM rbd Bug #9391 (Need More Info): fio rbd driver rewrites same blocks
Josh Durgin
09:24 AM rbd Bug #9146 (Can't reproduce): EPERM from image_read.sh
Josh Durgin
09:22 AM rbd Bug #9602 (Need More Info): rbd export -> nc ->rbd import = memory leak
Josh Durgin
09:20 AM rgw Bug #9254 (Fix Under Review): rgw: civetweb requires explicit \r\n for http headers
Yehuda Sadeh
09:15 AM rgw Bug #9039 (Pending Backport): Using COPY on radosgw to copy object from one bucket to another tha...
Yehuda Sadeh
09:13 AM rgw Bug #8587 (Pending Backport): rgw: subuser object not created correctly
Yehuda Sadeh
09:13 AM rgw Bug #9155 (Resolved): Swift Subuser - 403 Forbidden - during upload/post
Fixed (#8587) Yehuda Sadeh
09:11 AM rgw Bug #5595 (Fix Under Review): object has a Content-Type, but its content_type property is not sho...
Josh Durgin
09:09 AM Bug #9677 (Pending Backport): osd_disk_thread_ioprio_class is ignored
Loïc Dachary
05:10 AM Bug #9677: osd_disk_thread_ioprio_class is ignored
https://github.com/ceph/ceph/pull/2654 Loïc Dachary
04:48 AM Bug #9677 (Resolved): osd_disk_thread_ioprio_class is ignored
The "osd_disk_thread_ioprio_class configuration option":http://ceph.com/docs/giant/rados/configuration/osd-config-ref... Loïc Dachary
09:00 AM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
Ilya Dryomov
08:54 AM Bug #9635 (Resolved): mon/Paxos.cc: 1033: FAILED assert(mon->is_leader())
Joao Eduardo Luis
07:03 AM CephFS Bug #9636 (Duplicate): segfault in CInode::get_caps_allowed_for_client
Greg Farnum
07:02 AM CephFS Bug #9562 (Resolved): Lockdep assertion in Filer purge
Backported to giant:... John Spray
07:02 AM CephFS Bug #8576 (Need More Info): teuthology: nfs tests failing on umount
Greg Farnum
06:50 AM Bug #6756: journal full hang on startup
... Sage Weil
06:37 AM Bug #6003: journal Unable to read past sequence 406 ...
... Sage Weil
06:32 AM Bug #9418 (Pending Backport): mon: drop internal-purpose messages from clients without proper caps
Sage Weil
03:55 AM Bug #9077: Cluster is up in MON node even if Ceph is uninstalled in OSD node
as per this document "http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/", that mon will get 3... Ramakrishnan P
03:53 AM Bug #9077: Cluster is up in MON node even if Ceph is uninstalled in OSD node
as per this document "http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/", that mon will get 3... Ramakrishnan P
01:15 AM Bug #9676 (Resolved): disk thread ioprio class misses osd
Loïc Dachary
01:10 AM Bug #9676 (Fix Under Review): disk thread ioprio class misses osd
https://github.com/ceph/ceph/pull/2653 Loïc Dachary
01:06 AM Bug #9676 (Resolved): disk thread ioprio class misses osd
http://ceph.com/docs/master/rados/configuration/osd-config-ref/ Loïc Dachary
12:28 AM Bug #9675: splitting a pool doesn't start when rule_id != ruleset_id
Sorry for formatting... should be like this:... Dan van der Ster
12:24 AM Bug #9675 (Resolved): splitting a pool doesn't start when rule_id != ruleset_id
commit:78e84f34da83abf5a62ae97bb84ab70774b164a6
Dumpling 0.67.10
Rule is like this:
{ "rule_id": 6,
...
Dan van der Ster

10/06/2014

10:53 PM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
Note that we only see the debug output when we're trying to write to the RBD bus directly on the host - from within t... Chris Armstrong
10:32 PM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
Here's some debugging after disabling auth.
As root on the CoreOS host, echoing directly into the RBD bus also doe...
Chris Armstrong
10:04 PM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
For posterity, recording my conversation with Josh here. http://irclogs.ceph.widodh.nl/index.php?date=2014-09-04
<...
Chris Armstrong
10:03 PM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
Seeing the same issue on a 3.16.2 kernel: ... Chris Armstrong
08:55 PM Linux kernel client Bug #9355: rbd: map fails with EINVAL inside a container
A fellow member of the CoreOS community is also running into this: https://groups.google.com/forum/#!topic/coreos-use... Chris Armstrong
06:27 PM CephFS Bug #9674: nightly failed multiple_rsync.sh
rsync return codes aren't standard error codes. The man page says that 23 means... Greg Farnum
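For illustration, the same number can be read two ways: as an rsync exit status or as a C errno. This is a minimal sketch, not part of the test suite. The `RSYNC_EXIT_CODES` table and both helper functions are hypothetical names; the table holds only a few entries copied from the EXIT VALUES section of the rsync man page, and the errno reading assumes Linux numbering.

```python
import errno
import os

# A few entries from the EXIT VALUES section of the rsync man page.
# Hypothetical helper table for illustration; it is not exhaustive.
RSYNC_EXIT_CODES = {
    0: "Success",
    23: "Partial transfer due to error",
    24: "Partial transfer due to vanished source files",
}


def describe_rsync_exit(code):
    """Interpret a number as an rsync exit status (not as an errno)."""
    return RSYNC_EXIT_CODES.get(code, "unknown rsync exit code %d" % code)


def describe_errno(code):
    """Interpret a number as a C errno value on the local platform."""
    name = errno.errorcode.get(code, "?")
    return "%s: %s" % (name, os.strerror(code))


if __name__ == "__main__":
    # Same number, two unrelated meanings.
    print("as rsync exit status:", describe_rsync_exit(23))
    print("as errno:            ", describe_errno(23))
```

Reading an rsync failure through the errno table is what makes 23 look like a file-table overflow, so the rsync table is the one to consult for this job.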
05:59 PM CephFS Bug #9674: nightly failed multiple_rsync.sh
#define ENFILE 23 /* File table overflow */
maybe we should adjust ulimit
Zheng Yan
02:23 PM CephFS Bug #9674 (Resolved): nightly failed multiple_rsync.sh
http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-03_23:04:01-fs-giant-distro-basic-multi/527949/... Greg Farnum
05:52 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
Good to know that you are able to reproduce this.
I think the log entries you mentioned are there in Firefly as well...
Somnath Roy
03:09 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
I double-checked, and I had multiple versions of librbd on my path. (I forgot about installing one of them.) I remo... Adam Crume
01:14 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
Hopefully, you made sure the librbd/librados libraries fio_rbd is loading are from the giant. As I said, replacing th... Somnath Roy
11:55 AM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
Thanks for the details. Unfortunately, I'm still unable to reproduce the issue. Was your cluster created in Firefly... Adam Crume
03:52 PM devops Bug #9658 (Resolved): upgrade from dumpling to firefly is broken
Sage Weil
03:06 PM devops Bug #9658: upgrade from dumpling to firefly is broken
ubuntu@teuthology:/a/teuthology-2014-10-06_14:06:56-upgrade:dumpling-firefly-x:parallel-wip-9658-firefly-distro-basic... Tamilarasi muthamizhan
12:51 PM devops Bug #9658: upgrade from dumpling to firefly is broken
ubuntu@teuthology:/a/teuthology-2014-10-06_10:31:05-upgrade:dumpling-firefly-x:parallel-wip-9658-distro-basic-vps/529856 Tamilarasi muthamizhan
10:12 AM devops Bug #9658: upgrade from dumpling to firefly is broken
sure, am testing it now Tamilarasi muthamizhan
09:55 AM devops Bug #9658: upgrade from dumpling to firefly is broken
Tamil, this should be fixed in the wip-9658 branch.. can you test please? The firefly backport will be a bit differe... Sage Weil
09:48 AM devops Bug #9658: upgrade from dumpling to firefly is broken
Sage Weil
03:23 PM Bug #9203: ceph_test_rados: ObjectDesc::iterator::advance(bool): Assertion `pos < limit' failed.
Samuel Just
02:54 PM Bug #9419: dumpling->firefly upgrade, sending setallochint?
Notes on using feature bits already present. The problem is that CEPH_FEATURE_MSGR_KEEPALIVE2 was backported, so we... David Zafman
02:17 PM Bug #9385 (Duplicate): ceph_test_rados: incorrect buffer at pos ...
Samuel Just
01:49 PM Documentation #9673 (Closed): Document ceph df numbers
We need to just write down what they mean. It's one of the first questions, and it's one of the hardest ones to answ... Dan Mick
11:29 AM Bug #9664 (Rejected): mon: ceph osd metada failure on centos7
It fails because the dockerized CentOS has a fake systemd http://jperrin.github.io/centos/2014/09/25/centos-docker... Loïc Dachary
11:02 AM Bug #9657 (Resolved): MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
No backport is needed; this is done. (commit:25bcc39bb809e2d13beea1529e4ab92d1b61fa5b) Greg Farnum
09:56 AM devops Tasks #9669 (Resolved): teuthology.front needs an upgrade
We need a newer libvirt version on the machine, and Ubuntu precise just doesn't contain what we need. I can't even se... Zack Cerza
09:39 AM devops Bug #9654 (Duplicate): "error: subprocess paste was killed by signal (Broken pipe)" in upgrade:du...
Sage Weil
09:39 AM devops Bug #9656: Remove conditional statement in ceph-radosgw startup script log section
Hmm yeah. I think the better solution would be to fix the /var/log/ceph (/var/log/radosgw?) permissions so that log ... Sage Weil
09:37 AM Bug #9663 (Resolved): Objecter assertion failure
Sage Weil
09:34 AM devops Bug #9667 (Duplicate): Missing packages in upgrade:dumpling-firefly-x:parallel-giant-distro-basic...
#9658 Sage Weil
08:45 AM devops Bug #9667 (Duplicate): Missing packages in upgrade:dumpling-firefly-x:parallel-giant-distro-basic...
This looks like a dupe of #9640 but a different version.
Job: http://qa-proxy.ceph.com/teuthology/teuthology-2014-...
Yuri Weinstein
09:33 AM Bug #9668 (Rejected): osd killed by ABRT from FAILED assert
this is almost certainly either the max files ulimit (ulimit -n , see max open files = ... in ceph.conf) or /proc/sys... Sage Weil
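A quick way to check the first of these, the per-process open-files limit (what `ulimit -n` reports), is sketched below. It is a minimal illustration that uses only Python's standard `resource` module, and it reports the limit of the process that runs it. Checking a running ceph-osd would require inspecting that process instead. The function name `open_file_limits` is hypothetical.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors.

    The soft limit is the value that `ulimit -n` prints in a shell;
    the hard limit is the ceiling the soft limit may be raised to.
    """
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("max open files: soft=%d hard=%d" % (soft, hard))
```

If the soft limit is small compared with the number of files and sockets an OSD needs, raising it (for example through the `max open files` setting mentioned above) is the first thing to try.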
09:00 AM Bug #9668 (Rejected): osd killed by ABRT from FAILED assert
-----------
[Mon Oct 6 15:09:03 2014] init: ceph-osd (ceph/46) main process (3058) killed by ABRT signal
---------...
Sheldon Mustard
09:32 AM Bug #9655 (Resolved): tests: qa/workunits/cephtool/test.sh fails ENXIO
Loïc Dachary
08:38 AM devops Fix #9666 (Resolved): ceph-disk error when activate is missing an argument is cryptic
When the device argument is missing:... Loïc Dachary
08:14 AM devops Bug #9665: ceph-disk zap should call partprobe
https://github.com/ceph/ceph/pull/2648 Loïc Dachary
07:43 AM devops Bug #9665 (Resolved): ceph-disk zap should call partprobe
h3. User description
Symptoms:
* A disk is used by an OSD
* The OSD is no longer useful and the disk is clear...
Loïc Dachary
08:09 AM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
I believe the fact that the commit message for 255b430a87201c7d0cf8f10a3c1e62cbe8dd2d93 said @Backfill@ where it shou... Florian Haas
08:04 AM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
Hi Sam,
I think this is fixed in master/giant.. correct? Just a gentle reminder that we'd appreciate a backport in d...
Dan van der Ster
08:04 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
Hi Sam,
Same as for #9503
I think this is fixed in master/giant.. correct? Just a gentle reminder that we'd appreci...
Dan van der Ster
12:51 AM RADOS Bug #9492 (Resolved): Crush Mapper crashes when number of replicas is less than total number of o...
Loïc Dachary

10/05/2014

01:12 PM Bug #9663: Objecter assertion failure
Probably this... Noah Watkins
12:41 PM Bug #9663 (Resolved): Objecter assertion failure
In latest Giant build. This bug appears to be related to http://tracker.ceph.com/issues/9067... Noah Watkins
01:00 PM Bug #9664 (Rejected): mon: ceph osd metada failure on centos7
"qa/workunits/cephtool/test.sh":https://github.com/ceph/ceph/blob/master/qa/workunits/cephtool/test.sh#L665 consisten... Loïc Dachary

10/04/2014

12:50 PM RADOS Bug #9492 (Fix Under Review): Crush Mapper crashes when number of replicas is less than total num...
* firefly https://github.com/ceph/ceph/pull/2643
* giant https://github.com/ceph/ceph/pull/2642
running on http:/...
Loïc Dachary
12:33 PM RADOS Bug #9492: Crush Mapper crashes when number of replicas is less than total number of osds to be s...
Pull req info :
fix for firstn rules: https://github.com/ceph/ceph/pull/2568
fix for indep rules : https://githu...
Johnu George
04:41 AM RADOS Bug #9492 (Pending Backport): Crush Mapper crashes when number of replicas is less than total num...
I think both patches should be backported to giant and firefly. Would you like to do that? It essentially means you ... Loïc Dachary
02:46 AM RADOS Bug #9492 (Resolved): Crush Mapper crashes when number of replicas is less than total number of o...
Loïc Dachary
02:38 AM Bug #9655 (Fix Under Review): tests: qa/workunits/cephtool/test.sh fails ENXIO
https://github.com/ceph/ceph/pull/2641 Loïc Dachary

10/03/2014

06:36 PM Bug #9657: MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
tested with wip-9657, fix works fine.
logs are copied to vpm102.front.sepia.ceph.com:/home/ubuntu/wip-9657
Tamilarasi muthamizhan
05:02 PM Bug #9657 (Pending Backport): MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
fix looks right. merged it into giant branch Sage Weil
04:11 PM Bug #9657: MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
https://github.com/ceph/ceph/pull/2640
Tamil will put it through the upgrade suite.
Greg Farnum
04:07 PM Bug #9657 (Fix Under Review): MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
Greg Farnum
04:05 PM Bug #9657: MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
Okay, it's because Message::encode() transmutes a compat_version of 0 into compat_version == HEAD_VERSION, and we are... Greg Farnum
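The effect described here can be sketched in a few lines. This is a hypothetical model, not the actual Ceph code: `encode_header`, `decode_header` and `DecodeError` are invented names, the receiver-side check is an assumption about how an older peer rejects messages, and the supported version of 2 is made up for the example. Only the value 3 comes from the ticket title.

```python
HEAD_VERSION = 3  # newest encoding the sender knows (3 as in the ticket title)


class DecodeError(Exception):
    """Raised when a receiver cannot understand a message header."""


def encode_header(head_version, compat_version):
    # Models the behaviour described above: a compat_version of 0 is
    # rewritten to head_version before the message is sent.
    if compat_version == 0:
        compat_version = head_version
    return {"version": head_version, "compat_version": compat_version}


def decode_header(header, supported_version):
    # Assumed receiver rule: refuse any message whose compat_version is
    # newer than the newest version this receiver understands.
    if header["compat_version"] > supported_version:
        raise DecodeError(
            "failure to decode; compat_version = %d" % header["compat_version"]
        )
    return header


if __name__ == "__main__":
    older_peer_supports = 2  # hypothetical value for an older monitor

    # Sender leaves compat_version at 0, so it becomes HEAD_VERSION.
    implicit = encode_header(HEAD_VERSION, 0)
    try:
        decode_header(implicit, older_peer_supports)
    except DecodeError as err:
        print("older peer rejects message:", err)

    # Sender states an explicit, older compat_version instead.
    explicit = encode_header(HEAD_VERSION, 1)
    print("older peer accepts message:", decode_header(explicit, older_peer_supports))
```

Under these assumptions, leaving compat_version unset makes every message look as if it requires the newest decoder, which is why an explicit value matters during a rolling upgrade.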
03:59 PM Bug #9657 (In Progress): MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
Well, good news and bad news:
This is not a monitor bug, and my initial guess is that it will only affect clusters r...
Greg Farnum
11:11 AM Bug #9657 (Resolved): MMDSBeacon: failure to decode; compat_version = 3 on Firefly monitor
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-01_19:20:01-upgrade:firefly-x-giant-distro-basic-m... Yuri Weinstein
05:04 PM devops Bug #9658: upgrade from dumpling to firefly is broken
possible debian fix in wip-9658. asked ceph-maintainers and branto for review and help with the spec file change. Sage Weil
03:56 PM devops Bug #9658 (In Progress): upgrade from dumpling to firefly is broken
Tamilarasi muthamizhan
03:52 PM devops Bug #9658 (New): upgrade from dumpling to firefly is broken
sandon: looks like the problem is a file that was in python-ceph was moved to ceph and apt is bailing due to over-wri... Tamilarasi muthamizhan
03:52 PM devops Bug #9658 (In Progress): upgrade from dumpling to firefly is broken
this is broken by commit:eb0f6e347969b40c0655d3165a6c4531c6b595a3, which is post 0.80.6. phew! yay testing. Sage Weil
02:49 PM devops Bug #9658 (Resolved): upgrade from dumpling to firefly is broken
This is definitely blocking the upgrade testing for giant.
logs: http://qa-proxy.ceph.com/teuthology/teuthology-20...
Tamilarasi muthamizhan
03:16 PM Bug #9661 (Fix Under Review): ceph_objectstore_tool doesn't work with memstore
David Zafman
03:07 PM Bug #9661 (Resolved): ceph_objectstore_tool doesn't work with memstore

A CephContext* isn't passed to ObjectStore::create() so MemStore::mount() crashes.
MemStore::set_allow_sharded_o...
David Zafman
03:04 PM devops Bug #9640: Missing packages in multi-version-giant-testing-basic-multi
Looks like there are two different issues here,
Update on "sudo apt-get update..."
Running the command line on a ma...
Yuri Weinstein
11:20 AM devops Bug #9640: Missing packages in multi-version-giant-testing-basic-multi
4 jobs failed in http://pulpito.front.sepia.ceph.com/teuthology-2014-10-01_23:20:03-multi-version-giant-distro-basic-... Yuri Weinstein
02:53 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
Please make sure you are following these steps..
1. Build the latest giant package both in cluster and client side...
Somnath Roy
01:35 PM rbd Bug #9513: rbd_cache=true default setting is degading librbd performance ~10X in Giant
In my test cluster, I'm getting the same performance with "rbd cache = false" and "rbd cache = true". Could you post... Adam Crume
02:50 PM CephFS Feature #9659 (Duplicate): MDS: support cache eviction
It would be really useful when writing certain kinds of tests (eg, for scrubbing) to be able to know that a particula... Greg Farnum
12:01 PM Bug #9653 (Resolved): ceph-disk: bootstrap-osd keyring ignores --statedir
Loïc Dachary
06:54 AM Bug #9653: ceph-disk: bootstrap-osd keyring ignores --statedir
* giant https://github.com/ceph/ceph/pull/2635
* firefly https://github.com/ceph/ceph/pull/2634
Loïc Dachary
05:13 AM Bug #9653 (Fix Under Review): ceph-disk: bootstrap-osd keyring ignores --statedir
https://github.com/ceph/ceph/pull/2633 Loïc Dachary
04:54 AM Bug #9653 (Resolved): ceph-disk: bootstrap-osd keyring ignores --statedir
... Loïc Dachary
11:40 AM Fix #9245 (Resolved): remove Monitor::osdmonitor_prepare_command
Loïc Dachary
10:01 AM Fix #9245 (Fix Under Review): remove Monitor::osdmonitor_prepare_command
giant backport https://github.com/ceph/ceph/pull/2637 Loïc Dachary
09:27 AM Fix #9245 (Pending Backport): remove Monitor::osdmonitor_prepare_command
Sage Weil
07:15 AM Fix #9245: remove Monitor::osdmonitor_prepare_command
https://github.com/ceph/ceph/pull/2636 Loïc Dachary
11:06 AM devops Bug #9656 (Rejected): Remove conditional statement in ceph-radosgw startup script log section
The startup script has a conditional statement to determine if a log file exists, and will touch and chown the log fi... Tupper Cole
10:43 AM Bug #9655 (Resolved): tests: qa/workunits/cephtool/test.sh fails ENXIO
... Loïc Dachary
10:34 AM Bug #8083: erasure-code: fix static code analysis errors found in gf-complete
For the record these are minor fixes and I expect to see them used when NEON is merged upstream and we update the jer... Loïc Dachary
10:12 AM Bug #8083 (Resolved): erasure-code: fix static code analysis errors found in gf-complete
merged https://bitbucket.org/jimplank/gf-complete/pull-request/24/static-code-analysis-fixes
Loïc Dachary
09:04 AM devops Bug #9654 (Duplicate): "error: subprocess paste was killed by signal (Broken pipe)" in upgrade:du...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-01_19:10:01-upgrade:dumpling-firefly-x:parallel-gi... Yuri Weinstein
07:36 AM rbd Feature #9374 (Resolved): rbd: use a rolling average for bench-write
commit:b47fdd400e14bd1b5e5bea9d18f895c92b8050be Jason Dillaman
07:17 AM Bug #9644 (Can't reproduce): ceph-disk not playing nice with test/erasure-code/test-erasure-code.sh
I tried with latest master and I'm no longer hitting it. I'm not sure if this was due to an environment issue or som... Joao Eduardo Luis
04:32 AM Bug #9644: ceph-disk not playing nice with test/erasure-code/test-erasure-code.sh
The CEPH_CONF and CEPH_ARGS are "taken care of when the test starts":https://github.com/ceph/ceph/blob/giant/src/test... Loïc Dachary
04:17 AM Bug #9644: ceph-disk not playing nice with test/erasure-code/test-erasure-code.sh
Could you also include the error you get? One idea that comes to mind is that test-erasure-code.sh does require au... Loïc Dachary
06:52 AM CephFS Bug #9636: segfault in CInode::get_caps_allowed_for_client
looks like it's the same as #9628 Zheng Yan
12:37 AM Bug #9619 (Can't reproduce): excessive mon memory usage when rbd rm 1PB
At 83% completion (rbd rm big)... Loïc Dachary

10/02/2014

08:53 PM Bug #9625 (Need More Info): firefly: memory corruption
Sage Weil
07:32 PM rgw Bug #9039: Using COPY on radosgw to copy object from one bucket to another that's in another pool...
sent a pull request now, #2632 Yehuda Sadeh
05:48 PM rgw Bug #9039: Using COPY on radosgw to copy object from one bucket to another that's in another pool...
not sure what the state of this bug is then.. yehuda? Sage Weil
06:01 PM Bug #9544 (Resolved): osd: pg deletion vs create race leads to EEXIST on mkcoll (dumpling)
Sage Weil
06:00 PM rbd Bug #6494 (Resolved): High memory consumption of qemu/librbd with enabled cache
ok did dumpling too Sage Weil
05:51 PM rbd Bug #6494: High memory consumption of qemu/librbd with enabled cache
backported to firefly. josh, should we do dumpling too? Sage Weil
05:46 PM rgw Bug #8621 (Resolved): civetweb frontend fails authentication if URL has special chars
a953b313f1e2f884be6ee2ce356780f4f70849dd Sage Weil
05:46 PM rgw Bug #8718 (Resolved): CORS OPTIONS request fails for presigned urls
6fee71154d838868807fd9824d829c8250d9d2eb Sage Weil
05:45 PM rgw Bug #8784 (Resolved): rgw: completion leak
b0d08aab837808f18708a4f8ced0503c0fce2fec Sage Weil
05:44 PM rgw Bug #9089 (Resolved): rgw: copy_obj_data() does not stripe target object
Sage Weil
05:44 PM rgw Feature #9200 (Resolved): rgw: log civetweb access
Sage Weil
05:42 PM rgw Bug #9206 (Resolved): rgw: cross rgw message headers filtered by apache 2.4
Sage Weil
05:41 PM rgw Bug #9353 (Resolved): Log files created under /var/log/radosgw/ do not have the .log extension
Sage Weil
05:37 PM rgw Bug #9148 (Resolved): rgw: multiregion tests failing, s3tests.functional.test_s3.test_region_copy...
Sage Weil
05:37 PM rgw Bug #9226 (Resolved): rgw: crash when copying specific objects
Sage Weil
05:36 PM rgw Bug #9208 (Resolved): rgw: civetweb does not drain request buffer correctly
Sage Weil
05:36 PM rgw Bug #9201 (Resolved): rgw: bad object with different pool alignment
Sage Weil
05:23 PM Feature #8391 (Resolved): sysvinit does not support custom cluster names
Sage Weil
05:22 PM Feature #8203 (Resolved): Replica setting values in df output
Sage Weil
05:22 PM Feature #7792 (Closed): leveldb 1.12.0 for rhel
Sage Weil
05:21 PM Feature #7344 (Resolved): osd: add additional heartbeat on cluster interface
Sage Weil
05:20 PM Feature #6261 (Resolved): ceph-filestore-dump use cases for disaster recovery
Sage Weil
05:18 PM Feature #5614 (Resolved): mon: enable moving pools to HASHPSPOOL mode
Sage Weil
05:15 PM Feature #4914 (Resolved): rados tool: read xattr from file / stdin
Sage Weil
05:14 PM Feature #4005: Add perftools to the kernel debian package script
Sage Weil
05:13 PM Feature #3345 (Resolved): support multiple clusters with sysvinit
Sage Weil
05:13 PM Feature #3340 (New): refuse to accept "cluster=foo" in ceph.conf
Sage Weil
05:13 PM Feature #3340 (Rejected): refuse to accept "cluster=foo" in ceph.conf
Sage Weil
05:13 PM Feature #3288 (Resolved): docs: document the chooseleaf command in crush
Sage Weil
05:12 PM Feature #3086 (Resolved): workqueue: dynamically adjust number of threads
Sage Weil
05:12 PM Feature #2894 (Resolved): cli: help command for ceph subsystems
Sage Weil
05:11 PM Feature #1880 (Rejected): osd: optionally log all request latencies
Sage Weil
05:11 PM Messengers Feature #1851 (Rejected): SimpleMessenger: use non-blocking io
Sage Weil
05:10 PM Feature #1267 (Rejected): osd: rgw class to do acl check
Sage Weil
05:09 PM RADOS Feature #84 (Rejected): mon: auto adjust pg_num as pool grows
Sage Weil
05:08 PM Feature #2222 (Resolved): osd: distinguish between 'degraded' and 'misplaced'
Sage Weil
05:08 PM Feature #5907 (Resolved): permanently log all administrative actions
Sage Weil
05:07 PM Feature #3849 (Resolved): Track slow PGs and times OSDs marked down
Sage Weil
04:28 PM Feature #8560 (Resolved): mon: instrument paxos
Sage Weil
03:58 PM rgw Bug #9651 (Duplicate): RGW: Object Removal Atomicity
The issue appears when a system goes down while there are pending object deletions. The object can be removed but will... Tyler Brekke
03:30 PM Bug #9650: RWTimer cancel_event is racy
Sage Weil
03:30 PM Bug #9650 (Fix Under Review): RWTimer cancel_event is racy
wip-rwtimer Sage Weil
02:57 PM Bug #9650: RWTimer cancel_event is racy
The issue is that we execute events under a shared (read) lock, and we allow you to cancel them under a shared (read)... Sage Weil
01:55 PM Bug #9650 (Resolved): RWTimer cancel_event is racy
(In safe mode) we carry the rwlock for the callback, but we use a separate mutex to protect the events, and we can
...
Sage Weil
03:30 PM Bug #9582 (Fix Under Review): librados: segmentation fault on timeout
Sage Weil
10:23 AM Bug #9582 (In Progress): librados: segmentation fault on timeout
hmm, several failures on giant
ubuntu@teuthology:/var/lib/teuthworker/archive/samuelj-2014-10-01_18:59:42-rados-gi...
Sage Weil
03:29 PM rgw Bug #9307 (Resolved): "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-fir...
Sage Weil
09:31 AM rgw Bug #9307: "s3.test_multipart_upload_multiple_sizes ... ERROR" in upgrade:dumpling-firefly-x-mast...
suite:upgrade:dumpling-x
In run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-01_19:00:02-upgrade:dumplin...
Yuri Weinstein
02:15 PM CephFS Bug #9514 (Resolved): ceph-fuse pjd test is failing in giant nightlies
Dumpling commit:5f601f099be98c2b061cc94fb06917e7543f3efe
Firefly commit:9fee8de25ab5c155cd6a3d32a71e45630a5ded15
Greg Farnum
01:56 PM Bug #8752: firefly: scrub/repair stat mismatch
I think I found where it is happening. For a while I was using Btrfs-based OSDs with journals on SSD-based ext4. For ... Dmitry Smirnov
11:58 AM Bug #9559: ?off-by-one vulnerability?ceph-0.80.5/src/common/fd.cc dump_open_fds() function
This was fixed in version 0.83 in commit 046c9769fc4eaffc1dd4a21b61c1c5696d537def, although I'm sure it could be back... Adam Crume
11:42 AM Bug #9649 (Can't reproduce): OSD hang in op_tp
ubuntu@teuthology:/a/samuelj-2014-10-01_18:59:42-rados-giant-wip-testing-old-vanilla-basic-multi/524982
valgrind, ...
Samuel Just
11:30 AM Bug #9626: PG: cancel backfill reservations if we get a cancel during backfill
Samuel Just
11:20 AM Feature #9647 (New): osd: hard cap on PGs per OSD
Sage Weil
11:00 AM devops Feature #9411: remove qemu symlink for librbd on rhel7.1 (and later)
This ticket is inaccurate.
The version of qemu-kvm that ships with base RHEL 6.x or 7.x does not and has no plans ...
Neil Levine
10:56 AM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
ubuntu@teuthology:/a/samuelj-2014-10-01_18:59:42-rados-giant-wip-testing-old-vanilla-basic-multi/524988 Samuel Just
10:51 AM devops Feature #3161 (Rejected): make gcov website public, via proxy on gitbuilder.sepia.ceph.com
Ian Colle
10:49 AM devops Feature #2663 (Closed): crowbar: UI for setting generic ceph.conf values
Neil Levine
10:48 AM devops Feature #2910 (Closed): crowbar: Use JBOD mode for ceph-osd
Neil Levine
10:46 AM devops Feature #8037 (Closed): Test leveldb 1.12 (or newer) and package as necessary
Ian Colle
10:46 AM devops Feature #3023 (Closed): juju: automated QA of OpenStack RBD integration
Neil Levine
10:46 AM devops Feature #3022 (Closed): juju: automated QA of Ceph
Neil Levine
10:46 AM devops Feature #2695 (Closed): crowbar: Automated QA
Neil Levine
10:45 AM devops Feature #3017 (Closed): juju: dev env setup
Neil Levine
10:45 AM devops Feature #3018 (Closed): juju: test deploy of openstack
Neil Levine
10:45 AM devops Feature #3020 (Closed): juju: change nova to use rbd
Neil Levine
10:44 AM devops Feature #7925 (In Progress): Feature: create new download.ceph.com site
Ian Colle
10:42 AM devops Feature #3021 (Closed): juju: change glance to use rbd
Neil Levine
09:42 AM Bug #9644 (Can't reproduce): ceph-disk not playing nice with test/erasure-code/test-erasure-code.sh
I haven't seen anyone complaining about this so either 1) no one is running this test, or 2) I'm the only one hitting... Joao Eduardo Luis
09:23 AM devops Bug #9643 (Rejected): Error "install ceph-devel-0.67.11 -y" in -upgrade:dumpling-x-firefly-distro...
In run http://pulpito.front.sepia.ceph.com/teuthology-2014-10-01_19:00:02-upgrade:dumpling-x-firefly-distro-basic-vps... Yuri Weinstein
08:39 AM rbd Bug #9642 (Resolved): Errors in test_rbd.test_* tests in upgrade:dumpling-firefly-x:parallel-gian...
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-01_15:06:04-upgrade:dumpling-firefly-x:parallel-gi... Yuri Weinstein
07:55 AM Bug #9619: excessive mon memory usage when rbd rm 1PB
The mon memory indeed grows but after 30 minutes running I'm not sure it is related. And it's growing slowly.... Loïc Dachary
06:45 AM Bug #9619 (New): excessive mon memory usage when rbd rm 1PB
Checking the OSD memory usage when the problem is MON growth is not a good idea. Loïc Dachary
06:35 AM Bug #9619 (Can't reproduce): excessive mon memory usage when rbd rm 1PB
With a vstart cluster with one monitor and three OSDs and... Loïc Dachary
07:42 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
I'm able to reproduce the problem with 0daddfbf1164d6ba3f38eee29d2f11acfa62f2b6 from your tree https://github.com/spo... Loïc Dachary
07:28 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
Damn... I was a bit too fast when I thought I was reproducing the issue!
I was indeed reproducing the original one,...
Sebastien Ponce
05:40 AM Bug #9356: ceph_test_rados_striper_api_aio Segmentation faults
I've finally managed to reproduce it, thanks to Loic : the trick was Ubuntu + debug mode. Maybe you also need more th... Sebastien Ponce
05:34 AM Bug #8011 (Resolved): osd/ReplicatedPG.cc: 5244: FAILED assert(soid < scrubber.start || soid >= s...
I'm unable to reproduce it any more, assuming fixed. Dmitry Smirnov
05:33 AM Bug #8747: OSD crash on scrub:osd/ReplicatedPG.cc: 5297: FAILED assert(soid < scrubber.start || s...
I can't reproduce any more on 0.80.5 + Firefly HEAD as of 2014-09-16... Dmitry Smirnov
 
