Activity

From 07/31/2014 to 08/29/2014

08/29/2014

10:08 AM Bug #9260 (Resolved): hadoop fs gets EINVAL
... John Spray
09:59 AM Bug #9252 (Fix Under Review): Worker thread to advance MDS in absence of messages
John Spray
09:59 AM Bug #9151 (Fix Under Review): mds should log/error/warn when segments are NOT getting trimmed
John Spray
07:18 AM Feature #9287 (Rejected): qa: hadoop: add big top tests to suite
Sage Weil
07:18 AM Feature #9286 (Rejected): qa: hadoop: test 2.x with teuthology
Sage Weil
07:02 AM Feature #9284 (Resolved): mds: warn when clients are not responding to cache pressure
Sage Weil
07:00 AM Feature #9283 (New): mds: limit inodes with caps to <100% of cache
Sage Weil
07:00 AM Feature #9282 (Resolved): mds: warn (and kill?) sessions to clients which aren't revoking caps
We need better ways of dealing with clients who aren't following our instructions. The most obvious of them is to sim... Greg Farnum
06:37 AM Bug #9280 (Resolved): valgrind failures in ceph-fuse

/a/john-2014-08-29_03:49:04-fs-wip-jcsp-test-testing-basic-multi
Valgrind: client (Leak_DefinitelyLost, Leak_Ind...
John Spray
12:43 AM Bug #9123: kceph: had 130k+ inodes with write caps
Zheng Yan
12:43 AM Bug #9123: kceph: had 130k+ inodes with write caps
I saw 10.214.137.25 in the mds log; 10.214.137.25 is gitbuilder-archive if the IP hasn't changed. Maybe the issue and #89... Zheng Yan
12:15 AM Bug #8962: kcephfs: client does not release revoked cap
... Zheng Yan

08/28/2014

05:46 PM Bug #9266 (Resolved): ceph_test_libcephfs pool name failures
... John Spray
11:59 AM Bug #9266 (In Progress): ceph_test_libcephfs pool name failures
John Spray
09:20 AM Bug #9266 (Resolved): ceph_test_libcephfs pool name failures

http://pulpito.front.sepia.ceph.com/teuthology-2014-08-25_23:04:01-fs-master-testing-basic-multi/451157/
Also fa...
John Spray
05:36 PM Bug #9276 (New): Client::get_file_extent_osds asserts in object_locator_to_pg if osd map is out o...

This would happen if files in the filesystem had layouts referring to pools that were not in the OSD map, which can ...
John Spray
04:21 PM Bug #9264 (Duplicate): mds: occasionally log segments can't trim
Sage Weil
04:20 PM Bug #9264: mds: occasionally log segments can't trim
... Sage Weil
04:16 PM Bug #9264: mds: occasionally log segments can't trim
... Sage Weil
07:00 AM Bug #9264 (Duplicate): mds: occasionally log segments can't trim
it happened with latest lab mds restart yesterday; we have the logs (for another 6 days or so)... Sage Weil
11:59 AM Bug #9260: hadoop fs gets EINVAL
Could be related to #9266, a recurrence of something trying to look up pool names before osdmap is loaded in client. ... John Spray
10:08 AM Bug #9260: hadoop fs gets EINVAL
Duh, that last exception was just libcephfs-java not being installed. John Spray
09:46 AM Bug #9260: hadoop fs gets EINVAL

Hmm, apparently there's more than one way this can fail:...
John Spray
11:31 AM Bug #9178: samba: ENOTEMPTY on "rm -rf"
It might not need to if there's a client bug somewhere. (Or some other issue?) Greg Farnum
06:40 AM Bug #9178: samba: ENOTEMPTY on "rm -rf"
The strange thing is that the MDS never replies with ENOTEMPTY. Zheng Yan
09:17 AM Bug #9173 (Resolved): Crash in Server::_session_logged
John Spray
08:09 AM Bug #8962: kcephfs: client does not release revoked cap
... Sage Weil
06:19 AM Bug #9252: Worker thread to advance MDS in absence of messages
Testing on wip-jcsp-test John Spray
06:19 AM Bug #9151: mds should log/error/warn when segments are NOT getting trimmed
Testing on wip-jcsp-test John Spray

08/27/2014

05:02 PM Bug #9260 (Resolved): hadoop fs gets EINVAL
This will fail on hadoop fs -put with EINVAL. No apparent problems in the libcephfs log.... Sage Weil
02:37 PM Feature #4583 (Resolved): libcephfs: add test that kills a client and verifies mds cleans it up
... John Spray
02:22 PM Feature #7810 (Resolved): libcephfs: add a test that freezes + unfreezes a client, and then verif...
... John Spray
02:21 PM Feature #4886 (Resolved): teuthology: add tests that use the MDS dumper
John Spray
02:21 PM Feature #4886: teuthology: add tests that use the MDS dumper
... John Spray
10:03 AM Bug #8962: kcephfs: client does not release revoked cap
... Sage Weil
06:51 AM Bug #9252 (Resolved): Worker thread to advance MDS in absence of messages

As we move dispatchers outside of the MDS (first Objecter, now Beacon), there are some cases that don't progress pr...
John Spray

08/26/2014

04:07 PM Bug #9105: ~ObjectCacher behaves poorly on EBLACKLISTED
Sage Weil
04:04 PM Bug #9105: ~ObjectCacher behaves poorly on EBLACKLISTED
NB: the handling for this case in rbd landed with wip-objecter; keep this ticket open for general-purpose ObjectCacher ... John Spray
04:05 PM Bug #9238: Floating point exception in Locker::calc_new_client_ranges
Loic: if the original change was on firefly too then yes John Spray
11:34 AM Bug #9238: Floating point exception in Locker::calc_new_client_ranges
Should this be backported to firefly? Loïc Dachary
11:05 AM Bug #9238 (Resolved): Floating point exception in Locker::calc_new_client_ranges
Sage Weil
09:50 AM Bug #9238 (Fix Under Review): Floating point exception in Locker::calc_new_client_ranges
https://github.com/ceph/ceph/pull/2331 John Spray
09:29 AM Bug #9238 (Resolved): Floating point exception in Locker::calc_new_client_ranges

On master, MDS starts fine first time and then crashes on second start.
Floating point error on Locker::calc_new...
John Spray
06:26 AM Bug #9212: mon election delays mds beacon
mds sent beacon... Zheng Yan
06:09 AM Bug #9152 (Fix Under Review): mds: beacon needs to not take mds_lock
John Spray

08/25/2014

05:07 PM Bug #8878 (Resolved): mds lock cycle (wip-objecter)
Sage Weil
03:56 PM Bug #9216: mds may regard active clients as stale due to slow pg recovery
I haven't got that far yet, but if I had to guess I'd say it is not about caps, since when this happens, all existing... Alexandre Oliva
10:26 AM Bug #9216: mds may regard active clients as stale due to slow pg recovery
Interesting. Did you establish the mechanism by which the clients are becoming stale? Do they have a renew caps request ... Greg Farnum
01:59 AM Bug #9216 (New): mds may regard active clients as stale due to slow pg recovery
I occasionally get fuse and ceph.ko mounts into weird states, and I can generally track them down to the mds's decidi... Alexandre Oliva
10:29 AM Bug #9212: mon election delays mds beacon
Sage Weil
10:21 AM Bug #9212: mon election delays mds beacon
Did we identify why it was blacklisted? I don't think we have any tests that should make it that slow or whatever. Greg Farnum
09:30 AM Bug #9212 (Rejected): mon election delays mds beacon
EBLACKLISTED Sage Weil

08/24/2014

09:25 AM Bug #9212 (Won't Fix): mon election delays mds beacon
ubuntu@teuthology:/a/teuthology-2014-08-22_23:04:01-fs-master-testing-basic-multi/444359... Sage Weil

08/22/2014

02:57 AM Bug #4545: error creating empty object store. Invalid argument.
I may have found the problem.
Before you run mkcephfs, you should ensure the directory (/var/lib/ceph/osd/ceph-0) is empty.
once i wr...
cache china

08/21/2014

04:44 PM Bug #5762 (Resolved): teuthology: Failed MPI runs lead to a hung test instead of a failure
Sage Weil
09:55 AM Bug #9152 (In Progress): mds: beacon needs to not take mds_lock
wip-9152 John Spray
09:50 AM Bug #9177: ceph-fuse: failing MPI mdtest runs
The compiler is spitting out a warning about getcwd -- no evidence that that's what it's actually hitting in this ins... John Spray
08:53 AM Bug #9177: ceph-fuse: failing MPI mdtest runs
http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-20_23:04:01-fs-next-testing-basic-multi/439228/ Greg Farnum
08:29 AM Bug #9177: ceph-fuse: failing MPI mdtest runs
How did you track it down to getcwd? If that is the issue there are a bunch of avenues of attack here, and we should ... Greg Farnum
06:31 AM Bug #9177: ceph-fuse: failing MPI mdtest runs
mdtest has a getcwd call into an unzeroed buffer that it doesn't check the error of. If fuse is failing the getcwd f... John Spray
06:56 AM Bug #9151 (In Progress): mds should log/error/warn when segments are NOT getting trimmed
John Spray
05:56 AM Feature #9189 (Resolved): Expose client identifying metadata to MDS, e.g. hostname

Currently, when doing e.g. a "session ls" on an MDS's admin socket, we get client IDs and IP addresses. It would b...
John Spray
05:35 AM Bug #9173 (Fix Under Review): Crash in Server::_session_logged

https://github.com/ceph/ceph/pull/2297
John Spray

08/20/2014

10:47 AM Bug #9173: Crash in Server::_session_logged
Better log. John Spray
06:30 AM Bug #9173 (Resolved): Crash in Server::_session_logged

Hit by mds_client_recovery task...
John Spray
10:33 AM Bug #9178: samba: ENOTEMPTY on "rm -rf"
http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-10_23:14:02-samba-next-testing-basic-plana/415869/
Greg Farnum
10:30 AM Bug #9178 (Resolved): samba: ENOTEMPTY on "rm -rf"
... Greg Farnum
10:14 AM Bug #9177 (Resolved): ceph-fuse: failing MPI mdtest runs
... Greg Farnum

08/19/2014

09:13 AM Bug #9152: mds: beacon needs to not take mds_lock
Hmm, the beacon send code doesn't need to hold the lock on its own, but it's triggered by the SafeTimer, which is jus... Greg Farnum
09:05 AM Bug #9151: mds should log/error/warn when segments are NOT getting trimmed
What kind of logging do we want? I assume you mean journal segments, and this is a bog standard operation...
If it's...
Greg Farnum
06:49 AM Fix #4286: SLES 11 - cfuse: disable 'big_writes' and 'atomic_o_trunc'
Ian Colle

08/17/2014

01:00 PM Bug #9152 (Resolved): mds: beacon needs to not take mds_lock
any random task that holds the mds lock for a long time prevents beacons, which will trigger a failover Sage Weil
12:48 PM Bug #9151 (Resolved): mds should log/error/warn when segments are NOT getting trimmed
Sage Weil

08/16/2014

03:42 PM Bug #8574 (Resolved): teuthology: NFS mounts on trusty are failing
chef adds a dummy export and restarts nfs-kernel-server now Sage Weil
02:41 PM Bug #8574: teuthology: NFS mounts on trusty are failing
root@mira055:~# service nfs-kernel-server restart
* Stopping NFS kernel daemon ...
Sage Weil

08/15/2014

11:08 AM Feature #8869 (Resolved): MDS: support standby-replay on old-format journals
This merged a couple of weeks ago in https://github.com/ceph/ceph/commit/440c820cce2c262570ab78e352bed8a630d41be5 John Spray
04:45 AM Bug #9105: ~ObjectCacher behaves poorly on EBLACKLISTED
Punting on a general purpose fix for ObjectCacher for the time being, and just fixing this in librbd teardown. John Spray
04:44 AM Bug #9105 (Fix Under Review): ~ObjectCacher behaves poorly on EBLACKLISTED
https://github.com/ceph/ceph/pull/2263 John Spray

08/14/2014

04:14 PM Bug #9101: multimds: unlinked file is not pruned from replica mds caches
Sage Weil
03:20 PM Bug #9123 (Can't reproduce): kceph: had 130k+ inodes with write caps
in #9121 the client had more than 130k inodes open for write, resulting in a huge file recovery queue. there definit... Sage Weil
02:37 PM Bug #9121 (In Progress): mds: inode stuck recovering after client restart
recovery is working.. there are just a lot of inodes queued:
2014-08-14 14:40:06.695087 7fd45f757700 10 mds.0.cach...
Sage Weil
02:10 PM Bug #9121 (Resolved): mds: inode stuck recovering after client restart
... Sage Weil
01:51 PM Bug #9105: ~ObjectCacher behaves poorly on EBLACKLISTED
John Spray wrote:
> This is happening when the librbd-using client is blacklisted, ObjectCacher fails to flush when ...
Sage Weil
10:16 AM Bug #9105: ~ObjectCacher behaves poorly on EBLACKLISTED
This is happening when the librbd-using client is blacklisted, ObjectCacher fails to flush when requested, and ImageC... John Spray
09:44 AM Bug #9105: ~ObjectCacher behaves poorly on EBLACKLISTED
Started failing in 061c8e93f76dc4fd6290d6d15723d76e73267444 where rbd_cache and rbd_cache_writethrough_until_flush we... John Spray
06:34 AM Bug #8725 (Resolved): mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic-plana
Sage Weil

08/13/2014

03:35 PM Bug #8964 (Resolved): kcephfs: client does not resend requests on mds restart
Sage Weil
03:13 PM Bug #8725 (Fix Under Review): mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic...
https://github.com/ceph/ceph/pull/2254 Sage Weil
02:15 PM Bug #9105 (New): ~ObjectCacher behaves poorly on EBLACKLISTED

In ceph master 78dc4df
http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-11_23:00:01-rbd-master-testing-bas...
John Spray
01:44 PM Bug #8962: kcephfs: client does not release revoked cap
... Sage Weil
01:19 PM Bug #8962: kcephfs: client does not release revoked cap
... Sage Weil
01:39 PM Bug #9101: multimds: unlinked file is not pruned from replica mds caches
looks like the problem is that another mds has the inode in its cache and isn't trimming it (or being asked to trim i... Sage Weil
01:13 PM Bug #9101 (Fix Under Review): multimds: unlinked file is not pruned from replica mds caches
https://github.com/ceph/ceph/pull/2250 Sage Weil
11:36 AM Bug #9101: multimds: unlinked file is not pruned from replica mds caches
Here is the debug data when using a ceph-fuse client.
We did reproduce the problem
Stephane Boisvert
11:15 AM Bug #9101 (New): multimds: unlinked file is not pruned from replica mds caches
as a result, deleted files stay pinned for a long time and space does not get removed. Sage Weil
12:38 PM Feature #9029 (Resolved): min/max uid for snapshot creation
Sage Weil

08/12/2014

07:31 AM Bug #9056: fuse kmod + ceph-fuse triggers "BUG: sleeping function called from invalid context"
... John Spray
06:51 AM Bug #9056 (Resolved): fuse kmod + ceph-fuse triggers "BUG: sleeping function called from invalid ...
Sage Weil
05:10 AM Bug #9056: fuse kmod + ceph-fuse triggers "BUG: sleeping function called from invalid context"
This is supposed to be fixed upstream in v3.16-rc6 by commit c55a01d360af, will close this when we've seen a clean fs... John Spray
06:56 AM Bug #8648: Standby MDS leaks memory over time
Any chance you can run one of these in standby under massif for a while? That will tell us what is leaking! Sage Weil
06:55 AM Bug #8651 (Won't Fix): crashing mds in an active-active mds setup
this MDS got blacklisted. there is an open issue somewhere to make the shutdown more friendly, but the behavior is ... Sage Weil
06:50 AM Bug #8725: mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic-plana
we probably have to do a reencoding trick like we do in MOSDMap? Sage Weil
06:48 AM Bug #8876 (Resolved): kcephfs: hang on read of length 0
Sage Weil

08/11/2014

04:09 AM Bug #8878 (In Progress): mds lock cycle (wip-objecter)
I think all these are OK now in wip-mds-contexts: remaining failures on that branch are all outside MDS. John Spray

08/10/2014

02:16 AM Bug #8725: mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic-plana
"same error":http://pulpito.ceph.com/loic-2014-08-10_09:59:49-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping... Loïc Dachary
12:53 AM Bug #8725: mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic-plana
Another "similar crash":http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chun... Loïc Dachary
12:39 AM Bug #8725: mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic-plana
And the same trace at "upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-08_12:13:20-upgrade:firef... Loïc Dachary
12:33 AM Bug #8725: mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic-plana
Looks like a similar problem at "upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-08_12:13:20-upg... Loïc Dachary

08/09/2014

05:50 PM Bug #9056: fuse kmod + ceph-fuse triggers "BUG: sleeping function called from invalid context"

http://pulpito.front.sepia.ceph.com/john-2014-08-09_14:56:53-fs-wip-mds-contexts-testing-basic-plana/409236/
http:...
John Spray
05:48 PM Bug #9056 (Resolved): fuse kmod + ceph-fuse triggers "BUG: sleeping function called from invalid ...

kernel 5f740d7e1531099b888410e6bab13f68da9b1a4d
wip-mds-contexts (aka wip-objecter) 7be59771bff09e2b46b5467627cb...
John Spray

08/07/2014

06:41 AM Feature #9029: min/max uid for snapshot creation
Wido den Hollander

08/06/2014

12:25 PM Feature #9029 (Resolved): min/max uid for snapshot creation
On shared systems like shared hosting it might be useful to prevent regular users from creating snapshots on CephFS.
...
Wido den Hollander
07:11 AM Feature #9026 (Resolved): client: vxattr support for rctime, rsize, etc.
Sage Weil

08/03/2014

09:11 PM Bug #8962: kcephfs: client does not release revoked cap
another similar hang:... Sage Weil

08/01/2014

03:51 PM Bug #8622 (Resolved): erasure-code: rados command does not enforce alignement constraints
commit:7a58da53ebfcaaf385c21403b654d1d2f1508e1a Sage Weil
12:40 AM Bug #8962: kcephfs: client does not release revoked cap
... Zheng Yan

07/31/2014

08:54 PM Bug #8962: kcephfs: client does not release revoked cap
Zheng Yan wrote:
> Sage Weil wrote:
> > Zheng Yan wrote:
> > > no clue what happened. please dump the mds cache wh...
Sage Weil
07:32 PM Bug #8962: kcephfs: client does not release revoked cap
Sage Weil wrote:
> Zheng Yan wrote:
> > no clue what happened. please dump the mds cache when it happens next time
...
Zheng Yan
10:11 AM Bug #8962: kcephfs: client does not release revoked cap
Zheng Yan wrote:
> no clue what happened. please dump the mds cache when it happens next time
We have a dump, act...
Sage Weil
 
