Activity

From 01/14/2013 to 02/12/2013

02/12/2013

10:23 PM devops Documentation #4111: installing RPMs - documented install release key command fails
It may need to point to the github repository instead of the ceph mirror. I found that worked ok with github:
[ub...
Anonymous
05:04 PM devops Documentation #4111 (Resolved): installing RPMs - documented install release key command fails
attempting a bobtail rpm installation for centos got the following when executing the suggested command
[qauser...
Ken Franklin
09:50 PM devops Feature #3920 (New): ceph-deploy: support other deb-based distros
Sage Weil
09:48 PM devops Feature #3907 (Resolved): ceph-deploy: be verbose about what is run and what is done (with -q)
Sage Weil
09:48 PM devops Feature #4008: ceph-deploy: make sure new version works with old ceph-disk_*
Sage Weil
09:45 PM devops Feature #3913 (Resolved): ceph-deploy: break mon into create/destroy
Sage Weil
09:45 PM devops Feature #3912 (Resolved): ceph-deploy: break osd into create/destroy
Sage Weil
06:17 PM devops Feature #3157 (Fix Under Review): upstart: move mds scripts to ceph-mds package.
Sage Weil
06:16 PM devops Feature #4112 (Resolved): ceph-deploy: support mds
Sage Weil
06:04 PM Feature #4075: osd: move pg log into leveldb
Proposed code for this change is in branch wip-4075. This was originally developed by Sage in the wip-pginfo branch ... David Zafman
05:34 PM rbd Bug #4100 (Resolved): rbd: unprotecting a snapshot in the "UNPROTECTING" state fails with EINVAL
commit:fe283813b44a7c45def6768ea0788a3a0635957e and commit:bfb4482c4596759b464caf45f8f30368898519d8 in bobtail. Josh Durgin
09:37 AM rbd Bug #4100: rbd: unprotecting a snapshot in the "UNPROTECTING" state fails with EINVAL
Dan - please review the wip branch. Ian Colle
05:20 PM Feature #3891 (Resolved): osd: move purged_snaps out of info
David Zafman
05:20 PM Feature #3892 (Resolved): osd: move pg info into leveldb
David Zafman
04:27 PM rgw Bug #3682: valgrind errors seen when running rgw tests in nightlies
ubuntu@teuthology:/a/teuthology-2013-02-12_01:00:03-regression-next-testing-basic/5247 Tamilarasi muthamizhan
04:24 PM Bug #4110 (Resolved): assertion in DeleteOp::_begin
log: ubuntu@teuthology:/a/teuthology-2013-02-12_01:00:03-regression-next-testing-basic/5133... Tamilarasi muthamizhan
03:52 PM rbd Bug #3958: rbd fsx fails with EBUSY
ubuntu@teuthology:/a/teuthology-2013-02-11_20:00:06-regression-bobtail-master-basic/4941 Josh Durgin
03:43 PM Bug #4109 (Duplicate): incorrect degraded count
ubuntu@teuthology:/a/teuthology-2013-02-11_20:00:06-regression-bobtail-master-basic/4909$
2013-02-11T22:08:22.398 ...
Samuel Just
03:23 PM rgw Feature #4108 (Duplicate): rgw: optionally put bucket index data in separate pool
Yehuda Sadeh
03:09 PM CephFS Bug #4105: mds: fix up the Dumper
Greg Farnum
02:00 PM CephFS Bug #4105 (Resolved): mds: fix up the Dumper
The messenger/objecter locking is wrong, and my quick stab at a fix resulted in lockdep warnings and things. Spend so... Greg Farnum
03:01 PM Feature #4107 (Duplicate): Usage quota for rados pools
The ability to set a quota (either by percentage or cap) on how much data can be stored in a rados pool using the sam... Tyler Brekke
02:51 PM Feature #4106 (Resolved): Monitor free space on crush pools
The ability to monitor free space on individual crush pools. Currently rados df does not provide this information. Tyler Brekke
02:44 PM Bug #3595: ceph-osd and ceph-mds crash on Debian Squeeze
Yeah, if it's working for some other people I don't think we want to change the current config without taking the tim... Greg Farnum
02:40 PM Bug #3595: ceph-osd and ceph-mds crash on Debian Squeeze
It's also working great for at least one customer on squeeze. But I don't think we can prioritize digging in given s... Sage Weil
01:04 PM Bug #3595: ceph-osd and ceph-mds crash on Debian Squeeze
It was most noticeable with the MDS, but I bet we'd see it a lot more with our present OSD design as well now and it ... Greg Farnum
12:57 PM Bug #3595: ceph-osd and ceph-mds crash on Debian Squeeze
IIRC this was mostly a problem for ceph-mds. And probably working at all is better than inflated memory usage in the... Sage Weil
12:42 PM Bug #3595: ceph-osd and ceph-mds crash on Debian Squeeze
How bizarre:... Greg Farnum
12:13 PM Bug #3595: ceph-osd and ceph-mds crash on Debian Squeeze
> dpkg -s ceph
Version: 0.56.2-1~bpo60+1
> dpkg -s libgoogle-perftools0
Version: 1.5-1
> dpkg -s libtcmalloc-...
Jörg Blank
11:21 AM Bug #3595: ceph-osd and ceph-mds crash on Debian Squeeze
I actually run these on Debian pretty often and don't have any issues, so I'm a bit confused. Can you grab a backtrac... Greg Farnum
02:40 PM CephFS Bug #4060 (Resolved): mds: vxattr ceph.file.layout.pool doesn't check latest osdmap
commit:a04c01f6822b165bf339d41eda29fcc5fa555f53 Sage Weil
02:23 PM CephFS Bug #4061: mds crashed at LogEvent::decode
Hmm, actually this one might be different from the other. It's a client cap update event, and the event on disk claim... Greg Farnum
01:59 PM CephFS Bug #4061: mds crashed at LogEvent::decode
Probably the same Tamil, yes. This should be a little easier to debug if we get it again in the future following last... Greg Farnum
11:10 AM CephFS Bug #4061: mds crashed at LogEvent::decode
There was no load yet. I was attempting the ceph-fuse mount test after the update.
Today I have ceph version 0....
Ken Franklin
11:05 AM CephFS Bug #4061: mds crashed at LogEvent::decode
looks like it is same as bug#3773 Tamilarasi muthamizhan
09:44 AM CephFS Bug #4061: mds crashed at LogEvent::decode
Ken, what was the workload you were running on this before the crash? Greg Farnum
02:11 PM rbd Feature #2770: krbd: define tasks to add osd_client compound class op support
The way the osd client handles an object class method right now
assumes that outbound data (headed from the client t...
Alex Elder
10:18 AM rbd Feature #2770: krbd: define tasks to add osd_client compound class op support
... Alex Elder
10:16 AM rbd Feature #2770: krbd: define tasks to add osd_client compound class op support
... Alex Elder
11:42 AM rbd Feature #4104: osd_client: support passing page array as data for CALL op
I guess this needs to be considered an rbd task if it is
to show up as a subtask for 2770.
Alex Elder
11:41 AM rbd Feature #4104 (Resolved): osd_client: support passing page array as data for CALL op
The rbd object "copyup" operation is defined as a class method
operation. Currently when a class method needs to su...
Alex Elder
11:18 AM Bug #3972: new boost dependency: libboost-program-options
David Zafman
11:10 AM rgw Feature #3991 (In Progress): rgw: dr: region mgt changes: define datastructures
Yehuda Sadeh
11:08 AM Bug #3434: Unknown variables in test_xattr_support
Giving to Ian for proper distribution. :) Greg Farnum
10:59 AM Feature #3403 (In Progress): librados: expose a list of watchers on an object
David Zafman
10:56 AM Bug #3433: Error: Store.__init__() takes no parameters
I'm not really sure where obsync is tracked — Ian, can you figure that out? Greg Farnum
10:49 AM Bug #3172 (Resolved): ceph::buffer::bad_alloc downloading a large object using rados
I believe this got fixed in commit:234becd3447a679a919af458440bc31c8bd6b84f.
Previously it was trying to read a full...
Greg Farnum
10:41 AM CephFS Bug #733: cmds crash: mds/LogEvent.cc:88: FAILED assert(p.end())
This is at least the same crash as #4061, although it'd be nice to get one of these with logging on the caused end in... Greg Farnum
10:15 AM Documentation #3711 (Resolved): crush-map.rst: choose firstn talks about "N", but does not clearl...
John Wilkins
10:09 AM Bug #4103 (Duplicate): mon: Single-Paxos: on MonitorDBStore, segfault during sync
... Joao Eduardo Luis
08:32 AM Documentation #4102 (Resolved): doc: in crush-map-rules, wrong spec for step take

At: http://ceph.com/docs/master/rados/operations/crush-map/#crush-map-rules
It says:
step take <bucket-type>
...
Sam Lang
07:27 AM Cleanup #4101 (Rejected): buffer::list::iterator constructor should be private
The ... Loïc Dachary
12:49 AM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
Journal is 1G in RAM. Since there are no writers at all yet, I don't think that journal is filled and causes device s... Ivan Kudryavtsev
12:47 AM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
Tried on regular 1G ethernet (other client server). Works well. Speed is stable and no such effect.
I don't under...
Ivan Kudryavtsev

02/11/2013

10:15 PM CephFS Feature #3543 (Closed): mds: new encoding
After an fs suite run passed all tests except for libcephfs-java (which was known bad at the branch point), iogen whi... Greg Farnum
09:04 PM Bug #4065: Crash of 0.56.2 OSD on Ubuntu 12.04 LTS
Last 1000 lines of the log, no problem. Matthias Babisch
02:16 PM Bug #4065 (Need More Info): Crash of 0.56.2 OSD on Ubuntu 12.04 LTS
Samuel Just
02:15 PM Bug #4065: Crash of 0.56.2 OSD on Ubuntu 12.04 LTS
I suspect the log included more output. Can you attach the previous 1000 lines?
-Sam
Samuel Just
07:46 PM Documentation #3432 (Resolved): move explanation for rbd on libvirt to new docs
I've put together a full procedure including using virt-manager. There are notes to use "virsh edit" and to specifica... John Wilkins
07:25 PM rbd Bug #4100 (Fix Under Review): rbd: unprotecting a snapshot in the "UNPROTECTING" state fails with...
wip-snap-unprotect Josh Durgin
05:34 PM rbd Bug #4100 (Resolved): rbd: unprotecting a snapshot in the "UNPROTECTING" state fails with EINVAL
As reported on ceph-users, an unprotect earlier was taking too long (possibly due to inactive pgs) and was killed. Th... Josh Durgin
04:44 PM rgw Feature #4099 (Resolved): rgw: Object Expiration
would like to see an implementation for object expiration for the RadosGW.
JuanJose Galvez
04:41 PM rgw Feature #4098 (Resolved): rgw: multi-site: Global Bucket Namespace
a feature request, basically they want to support multiple regions. I've copied the request below.
> We would like...
JuanJose Galvez
04:29 PM rgw Feature #4097 (Resolved): rgw: s3 static website
would like to see root domain support so that customers can run their static sites directly from rgw.
http://aws.a...
JuanJose Galvez
03:35 PM rbd Feature #4095 (Rejected): rbd: 2-phase commit for snapshot creation
To ensure snapshots are created as close as possible to when the user intended, instead of just waiting for a notify,... Josh Durgin
02:55 PM Bug #2803: filer: probe crash
turned debugging on and the logs are placed in ubuntu@burnupi06:~/log_2803 Tamilarasi muthamizhan
10:30 AM Bug #2803: filer: probe crash
Ian Colle
02:33 PM rbd Subtask #4092: rbd: re-read header when watch is re-established
Yes, I'll create corresponding tasks for krbd once I get the rest of the general ones in. Josh Durgin
02:09 PM rbd Subtask #4092: rbd: re-read header when watch is re-established
This same issue would apply to the kernel rbd client also,
right?
Alex Elder
01:32 PM rbd Subtask #4092 (Resolved): rbd: re-read header when watch is re-established
This avoids races that would result in a snapshot not being created correctly, like:... Josh Durgin
01:55 PM CephFS Bug #4061: mds crashed at LogEvent::decode
sorry Greg, I pulled the information from ken and filed this bug. Please let me know if you need more info. Tamilarasi muthamizhan
01:36 PM CephFS Bug #4061: mds crashed at LogEvent::decode
IIRC I was waiting on some other info from Ken for this. Is that coming? :) Greg Farnum
01:39 PM CephFS Feature #3730: Support replication factor in Hadoop
Initial set of tests are in the Hadoop tree and working. Need to add them to Teuthology test thingy. There are now tw... Noah Watkins
01:38 PM Bug #2890: monitor: "recognize" heap commands
I've had this sitting in my queue for a long time; can you check if it's still an issue and do something appropriate ... Greg Farnum
01:34 PM Bug #4071 (Resolved): osd: snap coll not created on scrub trim repop
4 patches culminating in commit:31e911b63d326bdd06981ec4029ad71b7479ed70 Sage Weil
01:24 PM rbd Subtask #4091 (Resolved): ObjectCacher: optionally make readx/writex calls never block
The idea is to prevent any aio calls from blocking client (i.e. qemu) threads.
This was what Sage was thinking as ...
Josh Durgin
12:41 PM rbd Subtask #4090 (New): rbd: investigate sources of client-side latency
Some possible sources:
* lock contention
* unnecessary data copying
* contention on queues
* throttling in messen...
Josh Durgin
12:39 PM CephFS Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
The only way I can think of that a 32-bit client would be different is in the inode assignment; could it be running i... Greg Farnum
05:37 AM CephFS Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
Sage Weil
12:00 AM CephFS Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
Tried after updating to 0.56.2. Found no trouble, but the environment actually changed, since I removed 32-bit kernel clie... Ivan Kudryavtsev
12:39 PM rbd Feature #4089 (Closed): rbd: improve small I/O performance
Root task for general improvement. Josh Durgin
12:38 PM Bug #3690 (In Progress): osd crashed in FileStore::_do_transaction
recent log: ubuntu@teuthology:/a/teuthology-2013-02-10_01:00:02-regression-master-testing-gcov/4059... Tamilarasi muthamizhan
12:36 PM rbd Feature #4088 (Resolved): rbd: optionally copy-on-read instead of copy-on-write
This can be beneficial in some use cases (such as when there's high latency to the original pool, but not the new pool). Josh Durgin
12:31 PM rbd Feature #4087: rbd: bitmaps for tracking object existence
Also, I wonder at what point it becomes worthwhile to
use something different from bitmaps (such as extents
that de...
Alex Elder
12:29 PM rbd Feature #4087: rbd: bitmaps for tracking object existence
I was thinking this weekend of creating this issue exactly.
Are you envisioning keeping these with the image heade...
Alex Elder
12:27 PM rbd Feature #4087 (Resolved): rbd: bitmaps for tracking object existence
This would improve layered image performance, and enable quick, conservative usage for an image. It should be possibl... Josh Durgin
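A minimal sketch of the idea in #4087, assuming one bit per object in an image: a set bit means the object may exist, so a clear bit lets a reader skip the lookup, and counting set bits gives a quick, conservative usage figure. The class and method names are hypothetical and are not part of librbd.

```python
class ObjectExistenceBitmap:
    """Hypothetical per-image bitmap: bit i is set if object i may exist."""

    def __init__(self, num_objects: int) -> None:
        self.num_objects = num_objects
        # One bit per object, rounded up to whole bytes.
        self.bits = bytearray((num_objects + 7) // 8)

    def _check(self, index: int) -> None:
        if not 0 <= index < self.num_objects:
            raise IndexError(index)

    def mark_exists(self, index: int) -> None:
        """Record that object `index` has been written at least once."""
        self._check(index)
        self.bits[index // 8] |= 1 << (index % 8)

    def may_exist(self, index: int) -> bool:
        """False means the object was never marked, so a read can skip it."""
        self._check(index)
        return bool(self.bits[index // 8] & (1 << (index % 8)))

    def conservative_usage(self, object_size: int) -> int:
        """Upper bound on bytes used: every marked object counts at full size."""
        marked = sum(bin(byte).count("1") for byte in self.bits)
        return marked * object_size
```

The usage figure is an upper bound because a marked object may be only partially written; that matches the "conservative" wording in the request.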
12:27 PM Bug #4079: osd: journal aio deadlock
Below is a chart with test number, an example of a test
run with results more in line with what's expected, and
the t...
Alex Elder
11:48 AM Bug #4079 (Resolved): osd: journal aio deadlock
I don't really understand this yet. I have seen it occurring
with the new request code. I thought there could be a...
Alex Elder
12:24 PM rbd Feature #4086 (Resolved): rbd: rate-limiting
Enforce policies like max-iops from a client point of view. The objecter throttling is too low-level, especially when... Josh Durgin
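One common way to enforce a client-side max-iops policy like the one requested in #4086 is a token bucket. The sketch below is an assumption about how such a limit could work, not the design chosen for rbd; `IopsLimiter` and its parameters are hypothetical names.

```python
import time


class IopsLimiter:
    """Hypothetical token-bucket limiter: refills `max_iops` tokens per second."""

    def __init__(self, max_iops, burst=None, clock=time.monotonic):
        self.rate = float(max_iops)
        # The bucket capacity bounds how many ops may be issued in a burst.
        self.capacity = float(burst if burst is not None else max_iops)
        self.tokens = self.capacity
        self.clock = clock
        self.last = clock()

    def try_acquire(self, ops=1):
        """Return True and consume tokens if `ops` operations may proceed now."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= ops:
            self.tokens -= ops
            return True
        return False
```

A caller would check `try_acquire()` before submitting each I/O and queue or delay the request when it returns False. The injectable `clock` is there only to make the behaviour easy to test.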
12:15 PM rbd Feature #4085 (New): qemu-rbd: allow storing snapshot of ram associated with snapshot of disk
This way the entire state of a VM can be restored, instead of just the disks. Josh Durgin
12:12 PM rbd Feature #4084 (Resolved): rbd: incremental backups
Root task for the feature in general Josh Durgin
12:06 PM rbd Feature #4083 (New): rbd-fuse: expose snapshots (and maybe other pools)
Maybe only do so for an image when asked, but optionally always show all pools and snapshots? Need to think about the ... Josh Durgin
12:03 PM rbd Feature #4082 (Rejected): rbd-fuse: improve performance
Use aio, different IoCtxs per image, and the C++ librbd api for fewer data copies. Josh Durgin
12:02 PM rbd Feature #4081 (New): rbd-fuse: improve usage, make consistent with other ceph tools
Use the common command line/conf file/env var parsing. Rewrite internals in C++ instead of C as needed.
Josh Durgin
12:01 PM Documentation #4080 (Closed): We should document what needs to be touched when adding a build dep...
Recently the build started depending on a new package (boost-program-options); when it was added, the
gitbuilders we...
Dan Mick
11:58 AM Bug #2761: osd: failed to recover before timeout expired
recent log: ubuntu@teuthology:/a/teuthology-2013-02-09_20:00:03-regression-bobtail-master-basic/3922 Tamilarasi muthamizhan
11:55 AM rgw Bug #3682 (In Progress): valgrind errors seen when running rgw tests in nightlies
recent log: ubuntu@teuthology:/a/teuthology-2013-02-09_20:00:03-regression-bobtail-master-basic/3989 Tamilarasi muthamizhan
11:51 AM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
What size is the journal on your osds? You may just be seeing a slowdown when the journals fill up, and must be flush... Josh Durgin
07:57 AM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
It may be a network issue as well. Is it easy for you by chance to try with regular ethernet (say, gig instead of 10... Sage Weil
07:43 AM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
> 3.7.2-ceph
That tells me a lot, and in fact makes me suspect it might
not be rbd that's the cause.
What can ...
Alex Elder
07:41 AM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
During slow periods iostat shows no operations on OSDs. First of all I've thought about scheduler and iowait problems,... Ivan Kudryavtsev
07:38 AM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
Linux hosting-cloud1-s1.zzzing.ru 3.7.2-ceph #1 SMP Wed Jan 16 23:25:11 NOVT 2013 x86_64 GNU/Linux
Kernel config: ...
Ivan Kudryavtsev
04:29 AM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
I'm sorry if I missed it, but can you tell me the version
of the kernel you are working with? Running "uname -a"
m...
Alex Elder
11:51 AM rbd Bug #3664: osdc/ObjectCacher.cc: 517: FAILED assert(!i->size())
recent log: ubuntu@teuthology:/a/teuthology-2013-02-09_20:00:03-regression-bobtail-master-basic/3977... Tamilarasi muthamizhan
11:30 AM rbd Feature #2770 (In Progress): krbd: define tasks to add osd_client compound class op support
At our sprint planning meeting we discussed this. The task
was too large and unknown to provide a meaningful estima...
Alex Elder
11:17 AM Bug #2947 (In Progress): osd: out of order reply
recent logs: ubuntu@teuthology:/a/teuthology-2013-02-10_01:00:02-regression-master-testing-gcov/4134... Tamilarasi muthamizhan
11:08 AM rbd Bug #4033 (Resolved): krbd: add barriers near done flag operations
I just committed this to the ceph-client/testing branch:
commit 4ad2b189c1a52ce8ae1d6d2528c512021a2f1654
Author: ...
Alex Elder
11:08 AM rbd Bug #4010 (Resolved): krbd: turn off interrupts for open/remove locking
I just committed this to the ceph-client/testing branch.
commit 4cfc31e59fc6521ee0950782a028eccb3f5c9096
Author: ...
Alex Elder
10:49 AM CephFS Feature #4073: qa: add message delay injection to test suite
I'm rewording the title, because it's similar to #3570, which has tests already in the marginal suite. The purpose of... Sam Lang
10:00 AM CephFS Feature #4073 (Resolved): qa: add message delay injection to test suite
Sage Weil
10:35 AM CephFS Bug #4068 (Closed): libcephfs: if client->init() fails, shutdown() erroneously calls client->shut...
Commit 133295ed001a950e3296f4e88a916ab2405be0cc resolves this issue. The failure case no longer throws an assert but ... Anonymous
10:00 AM CephFS Bug #4068: libcephfs: if client->init() fails, shutdown() erroneously calls client->shutdown(), r...
Reviewed by Sage and Slang. Ian Colle
09:34 AM CephFS Bug #4068 (Fix Under Review): libcephfs: if client->init() fails, shutdown() erroneously calls cl...
Josh - can you please review Joe's wip branch? Ian Colle
10:35 AM Feature #4076: ceph-disk-prepare/activate: basic dm-crypt support
Initial implementation will:
- keep keys in /etc/ceph somewhere (/etc/ceph/disk-keys/* ?)
- identify keys by GPT...
Sage Weil
10:33 AM Feature #4076 (Resolved): ceph-disk-prepare/activate: basic dm-crypt support
Sage Weil
10:25 AM Feature #4075 (Resolved): osd: move pg log into leveldb
Ian Colle
10:01 AM CephFS Feature #4074 (Resolved): qa: add traceless reply test to fs suite
Sage Weil
09:48 AM CephFS Feature #4002 (In Progress): mds: design fsck
Sage Weil
09:39 AM Bug #4051 (Duplicate): osd: inconsistent snapcolls on argonaut
Sage Weil
09:38 AM CephFS Bug #4035 (Need More Info): Ceph doesn't recover from fault on Opensuse (cfuse tests & rbd-cli te...
the fault message itself is nothing to worry about; just a socket error that we normally recover from. can you clari... Sage Weil
09:36 AM Bug #4052: OSD High memory usage (8-12GB) right after start
Just to clarify: please try the latest bobtail branch. Ignore wip_bobtail_f. Thanks! Sage Weil
09:36 AM Bug #4059 (Duplicate): osd: ENOTEMPTY unhandled for remove op
Ian Colle
09:04 AM Bug #3979 (Rejected): Ceph 0.56.2 RPM does not install
Pretty weird! I wonder if the permissions/mode on /etc/ceph are what RPM expects if it would skip the chown entirely... Sage Weil
08:55 AM Bug #3979: Ceph 0.56.2 RPM does not install
Got bored; I took an strace of this. Looks like it's failing trying to chown /etc/ceph:
5589 chown("/etc/ceph", 0...
Steven Presser
08:03 AM rgw Feature #4072: Allow binding on a tcp port
here is my pull request:
https://github.com/ceph/ceph/pull/46
Guilhem Lettron
08:01 AM rgw Feature #4072 (Resolved): Allow binding on a tcp port
For the moment, rgw can only bind on an unix socket.
It's very good for performance when using a local apache2.
I...
Guilhem Lettron
07:41 AM Bug #4070 (Resolved): memory leak
commit:749218f155969fd87a6194b26acd00a9332d522d Josh Durgin

02/10/2013

11:56 PM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
eth1 and ib0 connected to ceph cloud. Ivan Kudryavtsev
11:53 PM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
Ifstat output looks like:... Ivan Kudryavtsev
10:16 PM Bug #3955 (In Progress): Configure should explicity check for c++ compiler.
Anonymous
10:15 PM Bug #4030 (Resolved): Missing Fedora 18 release packages
The fedora18 gitbuilder has been updated to the released version, packages for daily builds can be found on gitbuilde... Anonymous
05:04 PM Bug #4071 (Fix Under Review): osd: snap coll not created on scrub trim repop
wip-4071 Sage Weil
11:32 AM Bug #4071 (Resolved): osd: snap coll not created on scrub trim repop
on replica:... Sage Weil
09:27 AM Bug #4069 (Resolved): OSD crashed and failed to start again
This issue was fixed in commit:3293b31b44c9adad2b5e37da9d5342a6e4b72ade, released in v0.56.2. There were some other ... Sage Weil
03:48 AM Bug #4069: OSD crashed and failed to start again
Removed and formatted again to XFS and started well but a lot of data should be replicated from other OSDs. Ivan Kudryavtsev
02:33 AM Bug #4069 (Resolved): OSD crashed and failed to start again
I was using 0.56 and found that all the OSDs on one host are down. When I tried to restart I got nothing successful, b... Ivan Kudryavtsev
06:37 AM Bug #4070: memory leak
Here is the proposed fix https://github.com/ceph/ceph/pull/44 Loïc Dachary
06:29 AM Bug #4070 (Resolved): memory leak
The "assignment operator of buffer::ptr":https://github.com/ceph/ceph/blob/master/src/common/buffer.cc#L286 leaks whe... Loïc Dachary
06:19 AM Tasks #4066: unit tests for src/include/buffer.h
https://github.com/ceph/ceph/pull/41 Loïc Dachary

02/09/2013

09:22 PM Bug #4064: osd: filestore assert on FORREMOVAL_* collection removal
this was the job:... Sage Weil
09:15 PM Bug #4064: osd: filestore assert on FORREMOVAL_* collection removal
Related (i think) crash, on latest next:... Sage Weil
09:11 PM Bug #4052: OSD High memory usage (8-12GB) right after start
quick update: chasing a related issue, hold off on upgrading a bit longer. Sage Weil
06:51 PM CephFS Bug #4068: libcephfs: if client->init() fails, shutdown() erroneously calls client->shutdown(), r...
branch wip-4068-buck has one possible fix for this issue (but maybe not the cleanest option). Anonymous
06:27 PM CephFS Bug #4068: libcephfs: if client->init() fails, shutdown() erroneously calls client->shutdown(), r...
title should be "erroneously calls" and not the inverse. Anonymous
06:24 PM CephFS Bug #4068 (Closed): libcephfs: if client->init() fails, shutdown() erroneously calls client->shut...
In src/libcephfs.cc mount() function:
if the client->init() call fails, shutdown() gets called. Assuming that the ...
Anonymous
04:36 PM Bug #4067: Argonaut fails to build on fedora18
Argonaut for new users is a low priority. As long as bobtail and master are working we can focus elsewhere. Sage Weil
12:08 PM Bug #4067 (Won't Fix): Argonaut fails to build on fedora18
The fedora18 argonaut build fails on a reference to boost-system. This is likely an issue with configure finding out... Anonymous
04:26 AM Feature #3850 (Closed): Add json output for ceph pg dump and ceph osd tree
This has been idle for about a month now. Previous update gives insight on the ticket's request. Feel free to reopen ... Joao Eduardo Luis
01:28 AM Tasks #4066 (Resolved): unit tests for src/include/buffer.h
The "tests":https://github.com/ceph/ceph/blob/fadb5ae75ab70f55ce039113c856d09cf20699cb/src/test/bufferlist.cc for "bu... Loïc Dachary

02/08/2013

11:38 PM Bug #4065 (Can't reproduce): Crash of 0.56.2 OSD on Ubuntu 12.04 LTS
Hi.
I am new to this and new to ceph. So please bear with me...
I tried to setup a ceph cluster here at home to t...
Matthias Babisch
11:33 PM Bug #4052: OSD High memory usage (8-12GB) right after start
Simon Frerichs wrote:
> Hi Sage,
>
> i checked two osds one start with this branch each and they crashed.
> I'll...
Sage Weil
11:30 PM Bug #4052: OSD High memory usage (8-12GB) right after start
Hi Sage,
i checked two osds one start with this branch each and they crashed.
I'll do another check later. Our cl...
Simon Frerichs
11:20 PM Bug #4052: OSD High memory usage (8-12GB) right after start
Hi Simon-
Is the osd crashing on startup *every* time? From the trace it looks like there is an invalid xattr set...
Sage Weil
05:13 PM Bug #4052: OSD High memory usage (8-12GB) right after start
I'll add a heap dump soon.
i just restarted another osd with wip_bobtail_f, also crashing:...
Simon Frerichs
02:22 PM Bug #4052: OSD High memory usage (8-12GB) right after start
hi Josh, burnupi57 running on wip-f branch might help.
we have this running from last week for the memory leak testing.
Tamilarasi muthamizhan
02:01 PM Bug #4052: OSD High memory usage (8-12GB) right after start
Yeah, if you can still reproduce it, a heap profile of an osd that's using excessive memory would be great.... Josh Durgin
01:42 PM Bug #4052: OSD High memory usage (8-12GB) right after start
Josh Durgin wrote:
> I haven't been able to reproduce this locally.
do you need more information / log output?
Simon Frerichs
01:39 PM Bug #4052: OSD High memory usage (8-12GB) right after start
I haven't been able to reproduce this locally. Josh Durgin
02:00 AM Bug #4052: OSD High memory usage (8-12GB) right after start
... Simon Frerichs
01:50 AM Bug #4052: OSD High memory usage (8-12GB) right after start
as requested current cluster status:
2013-02-08 10:48:40.125733 mon.0 [INF] pgmap v25369962: 2112 pgs: 1460 acti...
Simon Frerichs
01:43 AM Bug #4052 (Can't reproduce): OSD High memory usage (8-12GB) right after start
Hi,
some of our osds need 8-12GB RAM right after startup.
Sage mentioned wip_bobtail_f might fix it but this bra...
Simon Frerichs
09:51 PM Bug #4064: osd: filestore assert on FORREMOVAL_* collection removal
... Sage Weil
09:50 PM Bug #4064 (Resolved): osd: filestore assert on FORREMOVAL_* collection removal
ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2013-02-08_12:11:07-rados-master-testing-basic/2870$ ... Sage Weil
05:21 PM CephFS Bug #4060: mds: vxattr ceph.file.layout.pool doesn't check latest osdmap
Also, I've tested this fix with a basic script:
set -e
mnt=$1
touch ${mnt}/foo.$$
rados mkpool foo.$$
poolid=$...
Sam Lang
05:20 PM CephFS Bug #4060 (Fix Under Review): mds: vxattr ceph.file.layout.pool doesn't check latest osdmap
Pushed a proposed fix to wip-4060. Needs review. Sam Lang
03:18 PM CephFS Bug #4060 (In Progress): mds: vxattr ceph.file.layout.pool doesn't check latest osdmap
Even with add_data_pool I get EINVAL, so I'm reopening this. I've verified that (as before), the osdmap on the mds ... Sam Lang
02:57 PM CephFS Bug #4060 (Rejected): mds: vxattr ceph.file.layout.pool doesn't check latest osdmap
You need to add it to the MDSMap first ("ceph mds add datapool [x]" or something), and the server does at least wait ... Greg Farnum
02:53 PM CephFS Bug #4060 (Resolved): mds: vxattr ceph.file.layout.pool doesn't check latest osdmap
touch foo
create pool d2
setfattr -n ceph.file.layout.pool -v d2 foo
> returns EINVAL
The problem is that the p...
Sam Lang
05:19 PM Bug #3883: osd: leaks memory (possibly triggered by scrubbing) on argonaut
the bit that looks fishy here is m_flush_mutex.__nusers. can you see what that thread is doing in gdb?
maybe it's...
Sage Weil
04:58 PM Bug #3883: osd: leaks memory (possibly triggered by scrubbing) on argonaut
... Josh Durgin
04:55 PM Bug #3883: osd: leaks memory (possibly triggered by scrubbing) on argonaut
A log dump shows nothing, so I'm guessing the log is corrupted such that it keeps logging to more and more memory wit... Josh Durgin
04:49 PM Bug #3883: osd: leaks memory (possibly triggered by scrubbing) on argonaut
Something like that (or some kind of bug in the logging system that only gets hit with syslog or when not logging) wo... Greg Farnum
04:48 PM Bug #3883: osd: leaks memory (possibly triggered by scrubbing) on argonaut
Oh, how interesting...I wonder if this is syslog not having enough network bandwidth? Or (in the more general sense) ... Greg Farnum
04:45 PM Bug #3883: osd: leaks memory (possibly triggered by scrubbing) on argonaut
On wip-f, one osd grew to consume 70% of ram. The heap profiler tells us:... Josh Durgin
06:16 AM Bug #3883: osd: leaks memory (possibly triggered by scrubbing) on argonaut
I've also started to see this and will try to get some heap profiling done to report back.
* ceph version 0.56.1 (...
Wido den Hollander
05:08 PM Bug #4006: osd: repeating 'wrong node' message in log
I am also seeing this message in the radosgw.log file for .56-623. The error appears when restarting rgw and again d... Ken Franklin
05:08 PM Bug #4063 (Duplicate): filer: probe crash on wip-bobtail-osd-msgr branch
Tamilarasi muthamizhan
05:02 PM Bug #4063: filer: probe crash on wip-bobtail-osd-msgr branch
restarting the mds/all daemons in the cluster does not help, still hitting the same issue again.
leaving the clust...
Tamilarasi muthamizhan
04:56 PM Bug #4063: filer: probe crash on wip-bobtail-osd-msgr branch
repasting the core dump... Tamilarasi muthamizhan
04:56 PM Bug #4063 (Duplicate): filer: probe crash on wip-bobtail-osd-msgr branch
ceph version 0.56.2-15-g2ebf4d0 [wip-bobtail-osd-msgr]
test set up: burnupi06, burnupi07
hit this when running ...
Tamilarasi muthamizhan
03:47 PM devops Feature #4062 (Rejected): Add data collection to the gitbuilders
Need to track how long the builds are taking. Anonymous
03:42 PM CephFS Bug #4061 (Can't reproduce): mds crashed at LogEvent::decode
hit this on burnupi60, when upgrading from ceph v0.56-598-gb970d05 to 0.56.2-12-gcc16791 on 4 feb and it seems to be ... Tamilarasi muthamizhan
01:58 PM CephFS Bug #4044 (Rejected): replay failure between MDS and client
Never mind, this turned out to be another encoding issue. Phew! Greg Farnum
01:28 PM Documentation #3432 (In Progress): move explanation for rbd on libvirt to new docs
John Wilkins
01:27 PM Documentation #4058 (Resolved): fstab documentation has invalid or misleading options
Removed extraneous/contradictory options, committed and pushed. Fix should be up shortly. John Wilkins
11:26 AM Documentation #4058 (Resolved): fstab documentation has invalid or misleading options
http://ceph.com/docs/master/cephfs/fstab/ states "the Ceph file system will mount automatically on startup". However... Dan Reif
01:22 PM CephFS Feature #3543: mds: new encoding
Okay, that was an easy bug to fix. Hurray!
(The LogEvent encoding was off a little bit.)
Running it through anoth...
Greg Farnum
01:17 PM Documentation #4046 (Resolved): Typo in ceph.com docs webpage
Fixed and checked in. Should appear shortly. John Wilkins
01:12 PM Bug #4059 (Duplicate): osd: ENOTEMPTY unhandled for remove op
This occurred on wip_bobtail_f in a local vstart 11-osd cluster which I was trying to use to reproduce #4052 by causi... Josh Durgin
11:19 AM rgw Cleanup #4057 (New): Update Admin API spec with comments
Incorporate received comments into admin API specification caleb miles
09:54 AM Bug #3895: librados test hang during mon thrashing
commit:17827769f1fe6d7c4838253fcec3b3a4ad288f41 Sage Weil
09:20 AM Bug #3895 (Resolved): librados test hang during mon thrashing
Sage Weil
09:05 AM Bug #3895: librados test hang during mon thrashing
wip-mon-eagain looks good Joao Eduardo Luis
09:36 AM Bug #4043: osd: validate/scrub collections
Ian Colle
09:22 AM rgw Feature #3973 (New): rgw: Handle requests sent in non-UTC time
Moved to a feature for possible future consideration. Ian Colle
01:18 AM rgw Feature #3973: rgw: Handle requests sent in non-UTC time
Yehuda, I admit that it looks like the client is sending the wrong date, although it would be nice if radosgw could co... Moritz Krinke
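A minimal sketch of what "coping" with a non-UTC request date could look like, assuming the date arrives as an RFC 2616 / RFC 822 style header string. This is an illustration in Python using only the standard library, not radosgw's actual code; the function name normalize_http_date is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def normalize_http_date(header_value):
    """Parse an HTTP-style date header and return it as an aware UTC datetime."""
    parsed = parsedate_to_datetime(header_value)
    if parsed.tzinfo is None:
        # No usable zone information: assume (hypothetically) the client meant UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Convert any explicit offset (e.g. +0100) to UTC.
    return parsed.astimezone(timezone.utc)


if __name__ == "__main__":
    print(normalize_http_date("Thu, 07 Feb 2013 10:00:00 +0100"))
    print(normalize_http_date("Thu, 07 Feb 2013 09:00:00 GMT"))
```

Both example headers describe the same instant, so a server that normalizes before comparing timestamps would treat them identically.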
09:07 AM Linux kernel client Bug #3997 (Resolved): xfs: insert memory barriers before wake_up_bit()
Our work is done! Thanks! Sage Weil
09:03 AM Linux kernel client Bug #3997: xfs: insert memory barriers before wake_up_bit()
Ben has committed my fix to the upstream XFS tree.
I'm not sure when it will hit Linus' tree, but I
think we can ca...
Alex Elder
08:36 AM rbd Cleanup #4053: ceph: cleanup ceph page vector functions
Apparently for cleanup there is no "need review" so I'm
marking this "Feedback". I've posted a series of patches
t...
Alex Elder
08:30 AM rbd Cleanup #4053 (Resolved): ceph: cleanup ceph page vector functions
This is just documenting some cleanup activity I've done
that I'm about to post for review.
- delete bogus (re)decl...
Alex Elder
08:21 AM rbd Subtask #4007 (Fix Under Review): libceph: support STAT osd operation
A patch implementing this has been posted to the
ceph-devel mailing list for review.
[PATCH] libceph: allow STAT ...
Alex Elder

02/07/2013

11:38 PM Bug #4051 (Duplicate): osd: inconsistent snapcolls on argonaut
latest run:... Sage Weil
11:04 PM Bug #3895 (Fix Under Review): librados test hang during mon thrashing
tracked this down; see wip-mon-eagain
qa run against rados api tests seems to confirm that this fixes it (previous...
Sage Weil
01:07 PM Bug #3895: librados test hang during mon thrashing
Attached mon logs from a recent run after the rados test seemed to hang for a bit (100 mon elections or so).  The log... Sam Lang
12:49 PM Bug #3895: librados test hang during mon thrashing
Attached log files for this from hung runs (librados and kernel untar). Sam Lang
09:48 PM Bug #4050 (Resolved): recovery assert failure, osd/PG.cc: 6255: FAILED assert(query.query.type ==...
2013-02-07 20:58:49.461754 7f518f18c700 -1 osd/PG.cc: In function 'boost::statechart::result PG::RecoveryState::Repli... Samuel Just
08:31 PM Feature #3891 (Fix Under Review): osd: move purged_snaps out of info
David Zafman
06:10 PM Documentation #3432: move explanation for rbd on libvirt to new docs
The secondary issue is only without cephx, it's true, but the bigger issue of "we *really*
need this documentation i...
Dan Mick
06:08 PM Documentation #4049 (Resolved): public/cluster network doc should mention that multiple subnets a...
public network and cluster network allow comma-separated (at least) lists of subnets. It is of course assumed
that ...
Dan Mick
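To illustrate the comma-separated subnet lists mentioned in #4049, here is a small Python sketch that splits such a value and checks whether an address falls in any listed subnet. The subnet values and function names are hypothetical examples, and this is not how Ceph itself parses the option.

```python
import ipaddress


def parse_subnets(value):
    """Split a comma-separated list of CIDR subnets into network objects."""
    return [ipaddress.ip_network(part.strip())
            for part in value.split(",") if part.strip()]


def in_any_subnet(addr, subnets):
    """Return True if addr belongs to at least one of the given subnets."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in subnets)


if __name__ == "__main__":
    # Hypothetical value, in the style of a "public network" setting.
    nets = parse_subnets("10.0.0.0/24, 10.0.1.0/24")
    print(in_any_subnet("10.0.1.7", nets))
    print(in_any_subnet("10.0.2.7", nets))
```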
05:16 PM rgw Feature #2941 (Resolved): rgw: improve streaming read performance
Merged, commit:8a2de334fed5c56919063bba8c60a3c73bd6534c Yehuda Sadeh
05:11 PM rgw Bug #4048 (Resolved): API mismatch between RGW and Swift
As discussed with Yehuda, when using RadosGW with a delimiter:
curl -H 'x-auth-token: 909e3793e499425fb90364738107da...
Alexandre Marangone
05:08 PM rbd Bug #4047 (Resolved): removing a non-existing rbd image logs error in osd logs
Removing a non-existing rbd image floods the osd logs with errors even when debug is turned off. This can be avoided.
ubu...
Tamilarasi muthamizhan
04:31 PM Documentation #4046 (Resolved): Typo in ceph.com docs webpage
In this section:
http://ceph.com/docs/master/rados/operations/operating/#stopping-a-cluster
the example command:...
Anonymous
04:22 PM rbd Bug #4045 (Resolved): snap unprotect on a snapshot that is already unprotected throws inappropria...
ceph version 0.56.2-7-gc3468f7 (c3468f76a5e68a6426f03e508d8ecf26950fca2a)
Trying to unprotect a snapshot, that is ...
Tamilarasi muthamizhan
04:09 PM Feature #3982 (Resolved): Performance tests on branches that change the way pg info is stored
David Zafman
02:54 PM rgw Feature #3667 (Resolved): rgw: support extra canned acl params
Merged commit:e345dfe04a64fcd0d37c9e0717b6714038c302ae Yehuda Sadeh
02:14 PM CephFS Bug #4044 (Rejected): replay failure between MDS and client
While testing #3543 (but that shouldn't be related to this issue), I restarted the MDS and ran into a case where the ... Greg Farnum
02:11 PM CephFS Feature #3543: mds: new encoding
Still haven't gotten it on teuthology (soon!), but I did some local upgrade testing. I was able to upgrade from mast... Greg Farnum
01:58 PM Bug #4042: osd crash in recovery state: FAILED assert(0 == "we got a bad state machine event")
Nope. I've looked at it when reporting this issue, but I couldn't find a core file. I'd expected one to be in /, but ... Wido den Hollander
01:51 PM Bug #4042 (Need More Info): osd crash in recovery state: FAILED assert(0 == "we got a bad state m...
Hey Wido- Do you have the core by chance? Sage Weil
08:40 AM Bug #4042 (Resolved): osd crash in recovery state: FAILED assert(0 == "we got a bad state machine...
I just rebooted a couple of my 0.56.2 nodes and out of 12 OSDs one went down with:... Wido den Hollander
01:55 PM rgw Bug #4039 (Resolved): rgw: bucket info discrepencies
Fixed, commit:9d006ec40ced9d97b590ee07ca9171f0c9bec6e9.
Recovery tool, commit:9cb6c33f0e2281b66cc690a28e08459f2e62ca...
Yehuda Sadeh
11:06 AM rgw Bug #4039 (In Progress): rgw: bucket info discrepencies
Ian Colle
01:49 PM rbd Bug #4003 (Resolved): rbd: EBUSY errors from rbd unmap
closing this. phew! Sage Weil
01:45 PM Bug #4043 (Resolved): osd: validate/scrub collections
check that the set of existent collections is correct.
one option is to just do this during startup (along with some optional...
Sage Weil
11:23 AM Bug #4036: init-ceph: assumes write access to /var/run/ceph
I was mistaken about vstart clusters; it's restarting them just fine. Changed the bug description to more correctly d... Greg Farnum
10:09 AM CephFS Cleanup #1499: mds: clean up directory layouts
I've rebased on top of the wip-mds-encode-rebased branch as wip-1499-mds-layouts, although I notice it's failing some... Greg Farnum
10:09 AM CephFS Bug #1435: mds: loss of layout policies upon mds restart
I've been totally unable to come up with a scenario for how this could happen via code inspection, so I think I'm jus... Greg Farnum
09:54 AM Bug #3995: OSD heartbeat-crashes during startup
All right, I'll try to confirm if I see the problem again.
Thank you.
Artem Grinblat
09:47 AM Bug #3995 (Resolved): OSD heartbeat-crashes during startup
Artem Grinblat wrote:
> Sage, no, as I've said in comment #1, after a couple of restarts the OSD returned to normal....
Sage Weil
09:43 AM Bug #3995: OSD heartbeat-crashes during startup
Sage, no, as I've said in comment #1, after a couple of restarts the OSD returned to normal. Artem Grinblat
09:39 AM Bug #3995 (Need More Info): OSD heartbeat-crashes during startup
Artem, does it do this on every startup? Can you test wip_bobtail_f and see if it resolves the problem?
Thanks!
Sage Weil
09:52 AM rgw Feature #3973: rgw: Handle requests sent in non-UTC time
from RFC 2616:... Yehuda Sadeh
09:39 AM rgw Feature #3973: rgw: Handle requests sent in non-UTC time
Ian Colle
12:26 AM rgw Feature #3973: rgw: Handle requests sent in non-UTC time
Ian, I don't think this is a client issue. Checking the AWS documentation (http://docs.aws.amazon.com/AmazonS3/latest... Moritz Krinke
08:14 AM Bug #4041 (Can't reproduce): mon: Single-Paxos: on Paxos, leader didn't trim old versions
Possibly after being killed at some point, the leader ignored earlier versions when it trimmed its state, such that t... Joao Eduardo Luis
07:44 AM Bug #4040: mon: Single-Paxos: on PGMonitor, FAILED assert(0 == "update_from_paxos: error parsing ...
Triggered again, same symptoms, and it appears as if the issue is a skipped version on the store:
from the origina...
Joao Eduardo Luis
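The "skipped version on the store" symptom described for #4040 can be illustrated with a short Python sketch that looks for gaps in a list of stored version numbers. This is only a generic illustration with a hypothetical function name; it does not reflect how the monitor store is actually laid out or inspected.

```python
def find_skipped_versions(versions):
    """Return the version numbers missing between the lowest and highest stored version."""
    present = set(versions)
    if not present:
        return []
    return [v for v in range(min(present), max(present) + 1)
            if v not in present]


if __name__ == "__main__":
    # Hypothetical stored versions with one gap.
    print(find_skipped_versions([10, 11, 13, 14]))
```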
04:25 AM Bug #4040: mon: Single-Paxos: on PGMonitor, FAILED assert(0 == "update_from_paxos: error parsing ...
Also, I suspect this might be causing the same problem described on #4026 Joao Eduardo Luis
04:24 AM Bug #4040: mon: Single-Paxos: on PGMonitor, FAILED assert(0 == "update_from_paxos: error parsing ...
Something got messed up when updating the 'last_committed' version on mon.f, which by the way has fallen some 10 vers... Joao Eduardo Luis
04:01 AM Bug #4040 (Resolved): mon: Single-Paxos: on PGMonitor, FAILED assert(0 == "update_from_paxos: err...
... Joao Eduardo Luis

02/06/2013

11:46 PM Bug #3945: osd: dynamically link to leveldb
The current version of leveldb that is being used by ceph is 1.2. The wip-leveldb has version 1.9 which is the lates... Anonymous
04:48 PM CephFS Bug #4038 (Resolved): ceph-fuse: various hangs
He says it fixed the problem, and it's in master now. (commit: 46d7dbd3472f26926c6d048bfc3c150074bfd283) Greg Farnum
04:32 PM CephFS Bug #4038: ceph-fuse: various hangs
There's a shortcut return in CInode::_flush() that wasn't setting the new completion to done (when called from _fsync... Greg Farnum
04:01 PM CephFS Bug #4038 (Resolved): ceph-fuse: various hangs
... Sage Weil
04:41 PM rgw Bug #4039 (Resolved): rgw: bucket info discrepencies
bucket (re)creation ends up clobbering the bucket info stored under user's info. Yehuda Sadeh
03:22 PM Bug #4037 (Resolved): mon: Single-Paxos: on Paxos, FAILED assert(begin->last_committed == last_co...
... Joao Eduardo Luis
01:38 PM CephFS Feature #3626 (Resolved): mds: debug mode to generate traceless replies to clients
Greg Farnum
01:38 PM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
Merged into master in commit:08b82b3ef6b43283e35fd4e56eb5c78651345bea. Greg Farnum
01:26 PM CephFS Feature #3626 (Fix Under Review): mds: debug mode to generate traceless replies to clients
wip-4036 (commit:4ebba50a15584c89e0c5e4c6e48618055ceb96d8). Testing it now with pjd on a vstart cluster with no trace... Greg Farnum
12:52 PM Bug #4036 (Resolved): init-ceph: assumes write access to /var/run/ceph
I noticed this when using init-ceph on a vstart cluster:... Greg Farnum
12:26 PM rgw Bug #4011 (Resolved): rgw: multipart upload complete does not clean up parts from index
Fixed, commit:b663c097d1e6f41aed9abeadaae80f66fc71f5ec
Recovery tool, commit:2d8faf8e5f15e833e6b556b0f3c4ac92e4a4151...
Yehuda Sadeh
11:49 AM rbd Subtask #4007: libceph: support STAT osd operation
This has turned out to be a simple change.  It was needed in
rbd as well, and I'll just add support to both under this...
Alex Elder
09:19 AM rbd Subtask #4007: libceph: support STAT osd operation
It wasn't really possible to know this up front but
it looks like this is trivial. I've basically
completed it but...
Alex Elder
11:32 AM rgw Feature #3973 (Need More Info): rgw: Handle requests sent in non-UTC time
Moritz - this seems like an issue with aws-sdk-ruby not reporting time in UTC, rather than our inability to handle a... Ian Colle
11:01 AM CephFS Bug #4035 (Rejected): Ceph doesn't recover from fault on Opensuse (cfuse tests & rbd-cli tests)
I'm not sure if this is exclusive to fs but on an opensuse, single node cluster, when running cfuse and rbd tests a f... Ken Franklin
10:56 AM rbd Bug #3697 (In Progress): rbd copy.sh test failing in nightly
Tamilarasi muthamizhan
10:38 AM Bug #3768 (Resolved): perl is required for logrotate, we need to include Perl as a dependency
commit:0aea4dba040b8caaeb5c4079728078541e5bb2c1 Sage Weil
10:08 AM CephFS Fix #4034 (Resolved): mds: fix replayed ino creation extra_bl
I haven't tested this, but I noticed during code inspection for other things that I believe all our recent fixes for ... Greg Farnum
09:59 AM Bug #4026 (In Progress): mon: Single-Paxos: abort on LogMonitor::update_from_paxos
Joao Eduardo Luis
09:59 AM Bug #4026: mon: Single-Paxos: abort on LogMonitor::update_from_paxos
Haven't been able to reproduce this nor to find an obvious cause for this to have happened.
After inspecting the s...
Joao Eduardo Luis
09:37 AM devops Feature #4032: ceph-disk-prepare should allow the definition of an OSD id
Ah, right. I was thinking we could get into badness over that disagreement, but of course everything checks the real ... Greg Farnum
09:27 AM devops Feature #4032: ceph-disk-prepare should allow the definition of an OSD id
Greg Farnum wrote:
> I don't think we want to do this. The problem is that if we plug in a new OSD that has the same...
Sage Weil
09:21 AM devops Feature #4032: ceph-disk-prepare should allow the definition of an OSD id
I don't think we want to do this. The problem is that if we plug in a new OSD that has the same ID as the previous on... Greg Farnum
05:55 AM devops Feature #4032 (Rejected): ceph-disk-prepare should allow the definition of an OSD id
When replacing disks in existing boxes, sometimes it's useful to keep the existing OSD numbering, rather than start a... Faidon Liambotis
08:56 AM rbd Bug #3958: rbd fsx fails with EBUSY
this is causing several failures on master runs.. something has changed.
latest:
ubuntu@teuthology:/a/sage-2013-...
Sage Weil
08:55 AM Bug #3810 (Resolved): btrfs corrupts file size on 3.7
Sage Weil
08:31 AM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
The testing I've been doing now has shown no problems
now that teuthology has been updated.
The two other issues ...
Alex Elder
06:16 AM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
Seems to have done the trick! The kernel_untar_build.sh
task just finished for me without error, and it failed
rel...
Alex Elder
05:06 AM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
That sounds promising, I hope it works!
This was actually the last thing I was looking at last
night while waitin...
Alex Elder
08:18 AM Bug #3854: mon: clock skew tests failing on master
This was fixed by commit:d74b31b24db647a8b7c80d1552fa6f0b02c54ba4 and commit:c54781618569680898e77e151dd7364f22ac4aa1 Joao Eduardo Luis
07:20 AM rbd Bug #4033 (Fix Under Review): krbd: add barriers near done flag operations
A fix for this has been posted for review.
[PATCH] rbd: add barriers near done flag operations
Alex Elder
06:15 AM rbd Bug #4033 (Resolved): krbd: add barriers near done flag operations
I fixed this problem while investigating the rbd hangs
in http://tracker.ceph.com/issues/4003.
Somehow, I missed ...
Alex Elder
05:53 AM devops Bug #4031 (Won't Fix): ceph-disk-activate hardcodes journal path, ignores configuration
I have my ceph.conf configured to store journals in a different place, like:
[osd]
osd journal = /var/lib/ceph...
Faidon Liambotis

02/05/2013

11:56 PM Bug #4030 (Resolved): Missing Fedora 18 release packages
In the docs on OS recommendations:
http://ceph.com/docs/master/install/os-recommendations/
It is mentioned that...
Jens Kristian Søgaard
11:43 PM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
This was backing up qa stuff because the rbd.py qa task wasn't unmounting during cleanup. That bit is now fixed. I ... Sage Weil
10:54 PM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
new theory:
the reason umount hangs is because nuke is killing the client and osds at the same time. the umount i...
Sage Weil
10:41 PM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
I found that unmount was hanging too. I think somehow the
completion of the I/O is not getting propagated up when
...
Alex Elder
10:33 PM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
aha:... Sage Weil
10:15 PM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
btw i am able to reproduce the EBUSY with just... Sage Weil
08:28 PM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
I've added some instrumentation and find that the rbd
client is not dropping its watch at the end of the
kernel_unt...
Alex Elder
12:51 PM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
The interrupt issue has been fixed, but the other issue
(rbd device can't be unmapped because EBUSY) remains.
I h...
Alex Elder
11:35 AM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
I ran the kernel_untar_build.sh workunit using the
ceph "master" branch and the ceph-client "testing"
branch and go...
Alex Elder
11:13 AM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
I think I found *a* problem, possibly not *the* problem.
This commit:
bc7a62ee5 rbd: prevent open for image ...
Alex Elder
11:04 AM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
I am able to reproduce this problem by running
the kernel_untar_build.sh workunit.
I ran the test using the ceph ...
Alex Elder
08:53 AM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
> Alex, unless there is another high priority regression, can you
> look at this first?
Yes I will.
Alex Elder
08:52 AM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
Sam Lang wrote:
> I was able to verify that this happens with an older version of teuthology, one without the change...
Sage Weil
08:41 AM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
I was able to verify that this happens with an older version of teuthology, one without the changes I've made recentl... Sam Lang
05:18 AM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
I had the impression this might be a problem that
is holding up completion of the nightly test suite.
But I'm not...
Alex Elder
10:07 PM Bug #4028 (Duplicate): rbd: qa runs failing to remove image after unmap
sorry, this is actually a dup of #4003. Sage Weil
09:55 PM Bug #4028: rbd: qa runs failing to remove image after unmap
Is this the fault of the rbd task? What should
be removing the image?
Alex Elder
08:57 PM Bug #4028 (Duplicate): rbd: qa runs failing to remove image after unmap
Pretty consistently reproducible with kernel rbd tasks against master branch, and either master or testing kernel. Sage Weil
08:47 PM Bug #3971: can't attach rbd image volume to instance
1) The log shows an attempt to open volume-ade3b6fb-2386-4d10-9472-16cd4f955faa; this isn't the same volume you show ... Khanh Nguyen Dang Quoc
08:41 PM Bug #3971: can't attach rbd image volume to instance
The log shows it trying to access an rbd_header.volume-ade3b6fb-2386-4d10-9472-16cd4f955faa object without looking at... Josh Durgin
08:15 PM Bug #3971: can't attach rbd image volume to instance
1) The log shows an attempt to open volume-ade3b6fb-2386-4d10-9472-16cd4f955faa; this isn't the same volume you show ... Dan Mick
07:48 PM Bug #3971: can't attach rbd image volume to instance
yes sure, restarted all.
Please refer to the attached file for more detail.
Thanks.
Khanh Nguyen Dang Quoc
03:49 PM Bug #3971: can't attach rbd image volume to instance
Did you restart the monitors and osds after you set auth supported = none in the global section of every /etc/ceph/ce... Josh Durgin
08:16 PM Bug #4027 (Resolved): ceph-fuse on opensuse12 has the wrong requirement name for libfuse dependency
Instead of fuse-libs it should require libfuse2. This is likely specific to opensuse, but should double check others... Anonymous
05:33 PM RADOS Feature #3807 (Resolved): crush: simple commands to create common rules
commit:9eff2ee13dc03f245a11c91f4ed7d5bc15c55aef Sage Weil
05:31 PM rgw Bug #4011: rgw: multipart upload complete does not clean up parts from index
actually does not affect bobtail. Will still need to port the fix tool to bobtail. Yehuda Sadeh
12:55 PM rgw Bug #4011 (Resolved): rgw: multipart upload complete does not clean up parts from index
Yehuda Sadeh
05:17 PM Bug #4026 (Resolved): mon: Single-Paxos: abort on LogMonitor::update_from_paxos
While running teuthology with 20+ monitors, the monitor workloadgen with 10 osds, and mon thrasher, we triggered the ... Joao Eduardo Luis
04:22 PM CephFS Feature #3626 (In Progress): mds: debug mode to generate traceless replies to clients
Server::set_trace_dist() sets several things on the reply:
*snapbl
*head.is_dentry
*head.is_target
*trace_bl
H...
Greg Farnum
02:35 PM CephFS Bug #1435: mds: loss of layout policies upon mds restart
Wait, never mind. Too excited and didn't look closely enough at the projected node struct! :) Greg Farnum
01:56 PM CephFS Bug #4023 (New): kclient: d_revalidate is abusing d_parent
See Viro's email to linux-fsdevel, http://marc.info/?l=linux-fsdevel&m=135968126020360&w=2 .
We probably need t...
Sage Weil
01:52 PM CephFS Bug #2753 (Resolved): Writes to mounted Ceph FS fail silently if client has no write capability o...
wip-2753-fsync errors merged and pushed in commit:b3ffc718c93b7daa75841778b5d50ea3bc5fcc53 and fsync works properly o... Greg Farnum
01:47 PM CephFS Feature #4022 (New): client: qa: test non-cached operation (force sync mode)
Right now it's possible to run the client without going through the cacher. This isn't tested at all right now. It's ... Greg Farnum
01:47 PM rbd Feature #4021 (Resolved): rbd: openstack: add ability to copy volume to image for rbd
Ian Colle
01:46 PM rbd Subtask #4020 (Resolved): rbd: openstack: simplify volume booting with new api: make image boot b...
Ian Colle
01:44 PM rbd Subtask #4019 (Resolved): rbd: openstack: simplify volume booting with new api: add boot option t...
Ian Colle
01:44 PM rbd Subtask #4018 (Resolved): rbd: openstack: simplify volume booting with new api: modify boot panel...
Ian Colle
01:42 PM rbd Feature #4017 (Resolved): rbd: openstack: simplify volume booting with new api
Ian Colle
01:42 PM rbd Feature #4013 (In Progress): rbd: openstack: extend nova boot api to support going from image to ...
Ian Colle
01:24 PM rbd Feature #4013 (Resolved): rbd: openstack: extend nova boot api to support going from image to volume
Ian Colle
01:41 PM rbd Subtask #4016 (Resolved): rbd: openstack: extend nova boot api: modify libvirt driver to support ...
Ian Colle
01:40 PM rbd Subtask #4015 (Resolved): rbd: openstack: extend nova boot api: add block_dev_mapping_v2 to nova-...
Ian Colle
01:40 PM rbd Subtask #4014 (Resolved): rbd: openstack: extend nova boot api: add block_dev_mapping_v2 to nova-api
Ian Colle
01:13 PM rbd Bug #4012 (Won't Fix): rbd: image creation behaviour has to be uniform across bobtail and argonau...
rbd allows images to be created with size 0 in bobtail, but it fails in argonaut.
similarly, while in bobtail it do...
Tamilarasi muthamizhan
12:52 PM rbd Bug #4010 (Fix Under Review): krbd: turn off interrupts for open/remove locking
Posted for review.
[PATCH] rbd: turn off interrupts for open/remove locking
Alex Elder
12:49 PM rbd Bug #4010 (Resolved): krbd: turn off interrupts for open/remove locking
This fix is done. The problem was discovered while
investigating http://tracker.ceph.com/issues/4003.
This commi...
Alex Elder
11:40 AM Bug #4009 (Duplicate): osd reports map e6 wrongly marked me down
... Tamilarasi muthamizhan
10:37 AM Bug #3683 (In Progress): mon: leak of MMonPaxos
ubuntu@teuthology:/a/teuthology-2013-02-04_20:00:03-regression-bobtail-master-basic/15658 Tamilarasi muthamizhan
10:34 AM devops Feature #4008 (Resolved): ceph-deploy: make sure new version works with old ceph-disk_*
Sage Weil
10:12 AM rbd Bug #3697: rbd copy.sh test failing in nightly
recent log : ubuntu@teuthology:/a/teuthology-2013-02-04_20:00:03-regression-bobtail-master-basic/15773 Tamilarasi muthamizhan
09:49 AM Linux kernel client Bug #3997 (Fix Under Review): xfs: insert memory barriers before wake_up_bit()
The first patch was ACK'd by Dave Chinner.
The second one he explained wasn't needed,
because an atomic increment a...
Alex Elder
07:42 AM rbd Subtask #4007 (Resolved): libceph: support STAT osd operation
In order to do layered writes we need to check whether
an object to be written exists before issuing the write.
Thi...
Alex Elder
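The existence check described in #4007 can be sketched as follows. This is a toy Python model in which dictionaries stand in for object stores and a membership test stands in for the STAT operation; the copy-from-parent step is an assumption about what a layered write does next, and none of the names correspond to libceph functions.

```python
def layered_write(child, parent, name, offset, data):
    """Write data into child[name], first copying the object from parent if the child lacks it."""
    if name not in child:
        # Stand-in for STAT: the object does not exist in the child yet,
        # so start from the parent's copy (or an empty object).
        child[name] = bytearray(parent.get(name, b""))
    obj = child[name]
    end = offset + len(data)
    if len(obj) < end:
        # Zero-fill so the write range fits inside the object.
        obj.extend(b"\0" * (end - len(obj)))
    obj[offset:end] = data
    return bytes(obj)


if __name__ == "__main__":
    parent = {"obj": b"hello world"}
    child = {}
    print(layered_write(child, parent, "obj", 0, b"J"))
    print(parent["obj"])  # the parent object is left untouched
```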

02/04/2013

10:23 PM Bug #4006 (Resolved): osd: repeating 'wrong node' message in log
Two users now (paravoid and xiaoxi in #ceph) have reported seeing repeated "... - wrong node!" messages in the osd log... Sage Weil
07:52 PM rgw Bug #3294: Ceph S3 API test
I researched this error many times but the results were so bad.
Thanks to "lollipop king", you are very good :D
--
Ta...
tuan ta ba
07:28 PM Bug #3979: Ceph 0.56.2 RPM does not install
Gary,
I did that on a freshly kickstarted system already. I'm unsure how much fresher I can get the system without ...
Steven Presser
06:57 PM Bug #3979: Ceph 0.56.2 RPM does not install
Hi Steven -
I looked at the kickstart file and I did not see anything that looked suspicious. At the moment we're...
Anonymous
06:50 PM Bug #3768 (Fix Under Review): perl is required for logrotate, we need to include Perl as a depend...
Anonymous
06:50 PM Bug #3736 (Resolved): kernel build: failures starting in 3.8-rc1

The problem that resulted in this bug being opened originally has been solved with the update patch. I've created...
Anonymous
06:45 PM Feature #4005 (New): Add perftools to the kernel debian package script
Currently on the kernel gitbuilder we install a patch to the debian package script in order to build the performance ... Anonymous
06:41 PM Bug #4004 (In Progress): Intermittent kernel build failures
Anonymous
06:39 PM Bug #4004 (Can't reproduce): Intermittent kernel build failures
From time to time the kernel builds will fail in the packaging step with a gzip internal error, usually EINVAL on a w... Anonymous
06:35 PM Bug #3788 (Resolved): debian source packages are missing
Resolved with the following commits:
commit e3e0b40f1b44e2458e47f31bedaa91408dc294c9
Author: Gary Lowell <gary.lo...
Anonymous
05:53 PM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
I really can't tell who's got a watch on the header
object. It should be getting removed when the object
gets unma...
Alex Elder
05:02 PM rbd Bug #4003: rbd: EBUSY errors from rbd unmap
There is clearly something that is keeping the rbd image
from getting removed. I reproduced this with just running
...
Alex Elder
04:12 PM rbd Bug #4003 (In Progress): rbd: EBUSY errors from rbd unmap
This sounds familiar, but I'm going to look a little
more closely to see if I can learn why it's happening.
Alex Elder
04:03 PM rbd Bug #4003 (Resolved): rbd: EBUSY errors from rbd unmap
From the teuthology kernel untar task on rbd, we get EBUSY trying to unmap. I'm investigating that this isn't someho... Sam Lang
04:48 PM CephFS Bug #2753 (Fix Under Review): Writes to mounted Ceph FS fail silently if client has no write capa...
Okay, I've checked that the kernel client deals correctly with an fsync (it'll return EPERM). The client branch wip-2... Greg Farnum
03:49 PM CephFS Feature #3540: mds: maintain per-file backpointers on first file object
Comments on Github for this. Greg Farnum
02:21 PM CephFS Feature #3540: mds: maintain per-file backpointers on first file object
I've pushed the additional changes for rename and sentinel to the wip-bt2 branch. Those bits are still untested, but... Sam Lang
10:45 AM CephFS Feature #3540 (In Progress): mds: maintain per-file backpointers on first file object
The initial review happened last week; Sam has some updates for the rename and sentinel object infrastructure now but... Greg Farnum
02:58 PM CephFS Cleanup #1499: mds: clean up directory layouts
there is an old branch that does this. note that this code changed with the wip-vxattrs work, so the rebase needs to ... Sage Weil
02:34 PM CephFS Feature #3953: kclient: get/set layout via virtual xattrs
in testing branch, passing tests. yay! Sage Weil
02:33 PM CephFS Feature #3953 (Resolved): kclient: get/set layout via virtual xattrs
Sage Weil
02:30 PM Bug #3810: btrfs corrupts file size on 3.7
I believe the btrfs patch fixed this issue. I consider this bug closed. Mike Lowe
02:15 PM rgw Cleanup #3777: rgw: audit code for reading NULL env variables
commit:9019fbbe8f84f530b6a8700dfe99dfeb03e0ed3d Yehuda Sadeh
01:41 PM rgw Cleanup #3777 (Resolved): rgw: audit code for reading NULL env variables
Fixes merged into bobtail, next, master. Yehuda Sadeh
01:24 PM rgw Feature #2941: rgw: improve streaming read performance
Bunch of comments on Github for this.
Given some of them it also needs more testing before going into master. :)
Greg Farnum
12:36 PM CephFS Feature #4002 (Resolved): mds: design fsck
Ian Colle
10:40 AM Bug #3787: Ceph OSD crashes on ceph tell osd.x
Pushed to bobtail, commit:55687240b2de20185524de07e67f42c3b1ae6592 Sage Weil
10:37 AM Bug #3787: Ceph OSD crashes on ceph tell osd.x
Should this be backported to Bobtail? Ian Colle
10:35 AM CephFS Feature #3242 (New): samba: push plugin upstream
I believe we decided to hold off on putting more effort into this. Greg Farnum
10:34 AM CephFS Feature #3542 (Duplicate): mds: migration path for existing anchors, anchortables, etc.
Closing in favor of #4000 and #4001. Greg Farnum
10:34 AM Bug #3938 (Can't reproduce): ceph-mon crashed on mixed bobtail-argonaut cluster (2 argonaut mons,...
After a couple of days trying to reproduce this issue (and massively failing at it), and given the lack of debug info... Joao Eduardo Luis
10:33 AM CephFS Feature #4001 (Resolved): Implement the migration path from using the AnchorTable to using lookup...
Actually do whatever #4000 specifies. Greg Farnum
10:31 AM CephFS Feature #4000 (Resolved): Design a migration path from using the AnchorTable to using lookup-by-ino
We're currently engaged in work to do lookup-by-ino when we get an ino we don't recognize. However, any old installs ... Greg Farnum
10:27 AM CephFS Feature #3999 (Resolved): update CDir encoding
Either following or as part of #2177, we should update CDir on-disk encoding (and possibly wire encoding) to be versi... Greg Farnum
10:25 AM CephFS Cleanup #3998 (Resolved): mds: split up mdstypes
Right now we have mdstypes and it contains both MDS-exclusive and client-shared structs. Split it up into "metadata_t... Greg Farnum
10:03 AM CephFS Feature #3543: mds: new encoding
Got an early review from Sage; now waiting for a merge review and for test results from the FS suites, which are dela... Greg Farnum
09:59 AM CephFS Bug #3951 (Resolved): ceph-fuse: permissions error on create
Fixed in master commit:cf7c3f7d3fc7b8dc3a08a4fbe4ca1c10f2cb6054 and tested that it solves the problem. Greg Farnum
09:45 AM CephFS Bug #3935 (Need More Info): kclient: Big directory access bugs (multiple), mixed 32- and 64-bit c...
Sage Weil
08:18 AM Linux kernel client Bug #3997: xfs: insert memory barriers before wake_up_bit()
And here is something Sage provided that led me to believe
this could be the source of the problem. I'm not sure ho...
Alex Elder
08:17 AM Linux kernel client Bug #3997: xfs: insert memory barriers before wake_up_bit()
Sorry, I meant to include these in the last one:
[PATCH 1/2] xfs: memory barrier before wake_up_bit()
[PATCH 2/2]...
Alex Elder
08:16 AM Linux kernel client Bug #3997: xfs: insert memory barriers before wake_up_bit()
I have posted two patches to the XFS mailing list for review.
I am also waiting for a build to complete before doing...
Alex Elder
08:03 AM Linux kernel client Bug #3997 (Resolved): xfs: insert memory barriers before wake_up_bit()
I looked at this briefly last week and found what could explain
a hang on an osd node due to a bug in XFS. I ran it...
Alex Elder

02/03/2013

03:28 PM Bug #3995: OSD heartbeat-crashes during startup
Good news: the OSD has recovered after a couple of restarts. Artem Grinblat
01:58 PM Bug #3995 (Resolved): OSD heartbeat-crashes during startup
OSD can't start, it does something then crashes with a heartbeat assertion.
Debian, ceph version 0.56.2 (586538e22af...
Artem Grinblat
02:55 PM Bug #3996 (Resolved): mon: 'ceph mon add' results in dubious return message
Disclaimer: this might be only present on the current single-paxos branch (maybe due to some mistaken conflict resolu... Joao Eduardo Luis
02:11 AM Bug #3948: problems from leveldb static linkage and leveldb downgrade
It's not really urgent, but being able to upgrade to latest argonaut (and if that works for 2-3 days) to latest bobta... Corin Langosch

02/02/2013

08:56 PM Bug #3948: problems from leveldb static linkage and leveldb downgrade
Corin Langosch wrote:
> After the downgrade my cluster is still stable and no osd crashed so far.
>
> What can I ...
Sage Weil
02:55 AM Bug #3948: problems from leveldb static linkage and leveldb downgrade
After the downgrade my cluster is still stable and no osd crashed so far.
What can I do to upgrade to latest argon...
Corin Langosch
04:41 PM Bug #3966 (Resolved): osdthrasher: does tell on osd just after restarting it
fixed in teuthology commit:fadc22c0b9e1755b1d1826fcfe8be71e28574bc9 Sage Weil
04:40 PM Bug #3854 (Resolved): mon: clock skew tests failing on master
Sage Weil
04:40 PM Bug #3994 (Closed): ceph-osd crash under little to no load
... Sage Weil
02:10 PM Bug #3994: ceph-osd crash under little to no load
Also potentially of interest is the kernel log having some btrfs checksum failures:
btrfs csum failed ino 583798 ext...
Matthew Via
02:09 PM Bug #3994: ceph-osd crash under little to no load
It died again, here is the log output:
https://pastee.org/fbgch
Matthew Via
02:07 PM Bug #3994 (Closed): ceph-osd crash under little to no load
One of my OSDs crashed a number of times in a row, and was repeatable enough that I had time to set the debugging le... Matthew Via

02/01/2013

04:33 PM devops Feature #3907 (Fix Under Review): ceph-deploy: be verbose about what is run and what is done (wit...
Sage Weil
04:33 PM devops Feature #3913 (Fix Under Review): ceph-deploy: break mon into create/destroy
Sage Weil
04:33 PM devops Feature #3920 (Fix Under Review): ceph-deploy: support other deb-based distros
Sage Weil
04:32 PM devops Feature #3918 (Fix Under Review): ceph-deploy: osd create HOST:DIR[:JOURNAL]
Sage Weil
04:32 PM devops Feature #3993 (Resolved): upstart/sysvinit: control whether crush position is readjusted on start
Sage Weil
02:52 PM rgw Feature #3667 (Fix Under Review): rgw: support extra canned acl params
Ian Colle
02:50 PM rgw Feature #3992 (Resolved): rgw: refactor internal user API for RGW Admin
Ian Colle
02:43 PM rgw Feature #3991 (Resolved): rgw: dr: region mgt changes: define datastructures
Ian Colle
02:42 PM rgw Feature #3990 (Resolved): rgw: dr: implement new version objclass
Ian Colle
02:40 PM rgw Feature #3989 (Resolved): rgw: dr: region mgt changes: radosgw admin changes
Ian Colle
02:38 PM rgw Feature #3988 (Resolved): rgw: dr: region mgt changes: define/implement internal API
Ian Colle
02:36 PM rgw Feature #3987 (In Progress): rgw: dr: region mgt changes: extend json parser with json decoder
Ian Colle
02:36 PM rgw Feature #3987 (Resolved): rgw: dr: region mgt changes: extend json parser with json decoder
Ian Colle
02:31 PM Linux kernel client Feature #3974 (Resolved): libceph: use data length rather than nr_pages
commit 012d5bda1c0f229494c67098d00edfa24c531ea5
Author: Alex Elder <elder@inktank.com>
Date: Thu Jan 31 16:02:00 ...
Alex Elder
02:18 PM rbd Subtask #3741 (Resolved): krbd: rework request tracking code
commit 9ac90ea3d8dd6ab82f3665a132ca29e6ada56ad8
Author: Alex Elder <elder@inktank.com>
Date: Thu Nov 22 00:00:08 ...
Alex Elder
02:17 PM rbd Feature #3754 (Closed): krbd: use new request tracking code for notify ack
commit 1c8c3c5c571607a188203142020d80aa58e5e280
Author: Alex Elder <elder@inktank.com>
Date: Fri Nov 30 17:53:04 ...
Alex Elder
02:16 PM rbd Tasks #3755: krbd: use new request tracking code for sync object operations
commit 5d08568324f53368f927cc10927b1b105533c044
Author: Alex Elder <elder@inktank.com>
Date: Thu Jan 17 12:25:27 ...
Alex Elder
01:44 PM rbd Tasks #3755 (Resolved): krbd: use new request tracking code for sync object operations
commit 304819b1a49937753ee01aa7ccf8d66547a0be36
Author: Alex Elder <elder@inktank.com>
Date: Sat Jan 19 00:30:28 ...
Alex Elder
02:11 PM rbd Feature #3877 (Closed): krbd: don't wait for notify ack to complete
commit a8a34efcac7a33e7631fe8bf25530bd4be0417f8
Author: Alex Elder <elder@inktank.com>
Date: Thu Jan 17 12:18:46 ...
Alex Elder
01:57 PM devops Feature #3909 (Resolved): ceph-deploy: update install for bobtail/argonaut urls
Dan Mick
01:56 PM devops Feature #3923 (Resolved): ceph-deploy: discover HOST
Dan Mick
01:48 PM Subtask #3986 (Rejected): Send to ceph-dev for review
Ian Colle
01:48 PM Subtask #3985 (Rejected): api: Send to DH for Review
Ian Colle
01:47 PM Feature #3984 (Resolved): api: Send Out DRAFT REST API for Review
Ian Colle
01:45 PM Feature #3983 (Resolved): api: create initial DRAFT REST API Design
Ian Colle
01:38 PM rbd Bug #3940 (Resolved): krbd: decrement obj request count when deleting
commit 150fde1984ec8454c163e4f89a50416cd68edbc4
Author: Alex Elder <elder@inktank.com>
Date: Fri Jan 25 17:08:55 ...
Alex Elder
01:38 PM rbd Bug #3937 (Resolved): krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
commit 8d93192992301f8c3a288c8cf4dc8598ac4b8427
Author: Alex Elder <elder@inktank.com>
Date: Fri Jan 25 17:08:55 ...
Alex Elder
01:37 PM rbd Bug #3427 (Resolved): krbd: unmap does not remove block device properly
commit bc7a62ee52cffc735cb8383b6d26648883f1a01e
Author: Alex Elder <elder@inktank.com>
Date: Mon Jan 14 12:43:31 ...
Alex Elder
01:37 PM Linux kernel client Bug #3800 (Resolved): libceph: check compatibility between ceph modules
commit 4f6e0e37c103675df11c8e3d836d64cd24b31734
Author: Alex Elder <elder@inktank.com>
Date: Wed Jan 30 11:13:33 ...
Alex Elder
01:36 PM Linux kernel client Bug #3799 (Resolved): libceph/rbd: bio refs are messed up
commit dfcc01f9f093ea960289e40ca2e73334c708c0f2
Author: Alex Elder <elder@inktank.com>
Date: Wed Jan 30 11:13:33 ...
Alex Elder
01:36 PM Linux kernel client Bug #3798 (Resolved): libceph/rbd: take reference to all bio's in list
commit dfcc01f9f093ea960289e40ca2e73334c708c0f2
Author: Alex Elder <elder@inktank.com>
Date: Wed Jan 30 11:13:33 ...
Alex Elder
01:35 PM Linux kernel client Bug #3976: libceph: add some #ifdef CONFIG_BLOCK in messenger
Sorry, I made a mistake and had to rebase.
commit 1eded6f9903ff388e7af08b2037fc3f3981cdfb2
Author: Alex Elder <el...
Alex Elder
01:32 PM Linux kernel client Bug #3976 (Resolved): libceph: add some #ifdef CONFIG_BLOCK in messenger
commit a88b6b32770dc97b303cda7eade2feade3b945df
Author: Alex Elder <elder@inktank.com>
Date: Thu Jan 31 16:02:01 ...
Alex Elder
01:35 PM Linux kernel client Bug #3875: osd_client: don't use r_num_pages for bio requests
Sorry, I made a mistake and had to rebase.
commit 012d5bda1c0f229494c67098d00edfa24c531ea5
Author: Alex Elder <el...
Alex Elder
01:32 PM Linux kernel client Bug #3875 (Resolved): osd_client: don't use r_num_pages for bio requests
commit 06224afd90f261256b1e0a0db2334f39c21872a9
Author: Alex Elder <elder@inktank.com>
Date: Thu Jan 31 16:02:00 ...
Alex Elder
12:54 PM Feature #3982: Performance tests on branches that change the way pg info is stored
Includes creating teuthology tasks. Ian Colle
12:48 PM Feature #3982: Performance tests on branches that change the way pg info is stored
Need to look at xfs and btrfs (possibly ext4) with small IOs to determine whether to put the fixed sized chunk of the... Ian Colle
12:37 PM Feature #3982 (Resolved): Performance tests on branches that change the way pg info is stored
David Zafman
12:48 PM rbd Bug #1740 (Resolved): krbd: don't return head data when reading from a non-existent snapshot
This was fixed a while ago. Josh Durgin
12:32 PM Feature #3891 (In Progress): osd: move purged_snaps out of info
Ian Colle
12:07 PM rgw Feature #3981 (New): rgw: handle really large buckets
Yehuda Sadeh
11:58 AM rbd Bug #3980 (Won't Fix): rbd image created with size zero on a mixed cluster crashes rbd
Creating an rbd image with size 0 is allowed in bobtail but not in argonaut.
on a mixed cluster running argonaut[bu...
Tamilarasi muthamizhan
11:13 AM Bug #3979: Ceph 0.56.2 RPM does not install
Nope, pretty vanilla install, other than the kernel.
I've attached the kickstart file. The cluster is managed by ...
Steven Presser
10:51 AM Bug #3979: Ceph 0.56.2 RPM does not install
Still failing on the chown of /etc/ceph? Are you by any chance using SELinux features, or anything that might cause... Anonymous
10:19 AM Bug #3979: Ceph 0.56.2 RPM does not install
Nope, issue persists on a fresh install of the node. I'm not sure what information would be helpful, but if you let ... Steven Presser
09:44 AM Bug #3979: Ceph 0.56.2 RPM does not install
Nope, Running CentOS 6.3 with a custom kernel (3.6.9-vanilla at the moment). Give me about half an hour and I'll kic... Steven Presser
09:42 AM Bug #3979: Ceph 0.56.2 RPM does not install
Hi Steven -
Are you running Arch Linux? If so, can you tell me the version, and also the versions of the rpm and...
Anonymous
09:39 AM Bug #3979: Ceph 0.56.2 RPM does not install
Hi,
I installed and upgraded Ceph RPMS on my fresh CentOS VM without issue. I can provide more information if needed.
Anonymous
09:29 AM Bug #3979 (In Progress): Ceph 0.56.2 RPM does not install
Anonymous
09:00 AM Bug #3979 (Rejected): Ceph 0.56.2 RPM does not install
Hey all,
I run a local Ceph mirror for a cluster. I mirrored the 0.56.2 RPMS this morning and went to update my nod...
Steven Presser
10:53 AM rgw Bug #3620 (Resolved): rgw:improve multiple user access keys scalability
Fix merged into master, commit:0797be3f86df8b413256d69e3770ec39ed6e6912. Yehuda Sadeh
09:50 AM Feature #3890 (Resolved): osd: create tool to extract pg info and pg log from filestore
David Zafman

01/31/2013

08:51 PM rbd Bug #3978 (Resolved): krbd qa: concurrent.sh test leaves something read-only
I don't know what exactly is happening here, but it appears
that after running the "rbd/concurrent.sh" workunit, if
...
Alex Elder
08:01 PM Bug #3948: problems from leveldb static linkage and leveldb downgrade
Well, after downgrading them they seem to work stable again. If it's related to leveldb, then upgrading leveldb as th... Corin Langosch
07:50 PM Bug #3948: problems from leveldb static linkage and leveldb downgrade
Both osd.7 and osd.15 have corrupted leveldb state. It's likely related to downgrading and then upgrading leveldb. Samuel Just
07:40 PM Bug #3948: problems from leveldb static linkage and leveldb downgrade
Hi Sage!
Today I was brave and upgraded two more nodes (one has 1 osd, the other 3 osds). I worked for some time b...
Corin Langosch
07:18 PM Bug #3971: can't attach rbd image volume to instance
Does 'rbd ls volumes' show volume-5529a8cd-28db-4a72-a0f0-f7b2a221cf8d?
-> Yes, I can see it.
Khanh Nguyen Dang Quoc
07:01 PM Bug #3971: can't attach rbd image volume to instance
+This is all the information needed to verify:
root@master:~# dpkg -l | grep librbd
ii librbd1 ...
Khanh Nguyen Dang Quoc
01:44 PM Bug #3971: can't attach rbd image volume to instance
Does 'rbd ls volumes' show volume-5529a8cd-28db-4a72-a0f0-f7b2a221cf8d?
If so, could you provide a few more detail...
Josh Durgin
02:20 AM Bug #3971 (Rejected): can't attach rbd image volume to instance
my env:
libvirt-bin: 0.9.13-0ubuntu12.1~cloud0
ceph : 0.56.1
+ I tried disabling the apparmor module on the system.
+...
Khanh Nguyen Dang Quoc
05:57 PM Cleanup #3977 (Resolved): Do a great stream operator const cleanup!
I just spent a little while trying to figure out why the compiler couldn't resolve operator<< (the stream operator) o... Greg Farnum
05:37 PM Documentation #3960: [Document bug]MON and MDS do not need a ssd for data storage.
John Wilkins:
What do you mean by:
>One way Ceph accelerates filesystem performance is to segregate the storage of ...
Xiaoxi Chen
04:35 PM Documentation #3960 (Resolved): [Document bug]MON and MDS do not need a ssd for data storage.
I removed the reference to monitors, added detail on sequential write throughput, and a link to an example for a CRUS... John Wilkins
02:45 PM Linux kernel client Bug #3976 (Resolved): libceph: add some #ifdef CONFIG_BLOCK in messenger
There are two spots in the messenger code that would
cause a build failure if CONFIG_BLOCK weren't defined.
I've a...
Alex Elder
02:21 PM rbd Bug #3975 (Rejected): librbd: xfstests 008 failed inside qemu
This one's not a problem. This test pokes random holes in a
file (or maybe fills random spots). And when done it s...
Alex Elder
02:05 PM rbd Bug #3975 (Rejected): librbd: xfstests 008 failed inside qemu
From xfstests output in ubuntu@teuthology:/a/teuthology-2013-01-29_20:00:04-regression-bobtail-master-basic/7794/remo... Josh Durgin
02:12 PM devops Feature #3912 (Fix Under Review): ceph-deploy: break osd into create/destroy
Sage Weil
02:12 PM devops Feature #3923 (Fix Under Review): ceph-deploy: discover HOST
commit:56b996b76f37fb6a7c3ffc812e87a8cbd6f8c3b8 Sage Weil
02:12 PM devops Feature #3909 (Fix Under Review): ceph-deploy: update install for bobtail/argonaut urls
commit:3c1e4d1d73556560e06686843ed1010174b5ffda Sage Weil
02:01 PM Bug #3970 (Resolved): cls_log should be declared with __attribute__(format..) so -Wformat validat...
commit:3f53c3f016ab0db1a33848ac406239dc07204ea2
Dan Mick
01:59 PM Linux kernel client Feature #3974 (Resolved): libceph: use data length rather than nr_pages
While looking at http://tracker.ceph.com/issues/3875 I learned
that the nr_pages field in a ceph message is never re...
Alex Elder
01:00 PM Bug #3966: osdthrasher: does tell on osd just after restarting it
pushed fix to master, fadc22c0b9e1755b1d1826fcfe8be71e28574bc9 (teuthology) Samuel Just
11:22 AM Bug #3906: ceph-mon leaks memory during peering
So, today I upgraded my whole cluster to 0.56.2, then added a bunch more OSDs (from 84 -> 144). At peering time monit... Faidon Liambotis
09:09 AM rgw Feature #3973 (New): rgw: Handle requests sent in non-UTC time
Executing an S3 request using the following Date header... Moritz Krinke
07:35 AM Bug #3972 (Resolved): new boost dependency: libboost-program-options
libboost-program-options is now required to build master, but this prerequisite is not mentioned in the documentation. caleb miles
05:18 AM Bug #3883: osd: leaks memory (possibly triggered by scrubbing) on argonaut
I disabled scrubbing using
> ceph osd tell \* injectargs '--osd-scrub-min-interval 1000000'
> ceph osd tell \* in...
Sylvain Munaut

01/30/2013

05:02 PM Bug #3970: cls_log should be declared with __attribute__(format..) so -Wformat validates the form...
Dan Mick
04:57 PM Bug #3970 (Resolved): cls_log should be declared with __attribute__(format..) so -Wformat validat...
It'll involve some changes to callers to fix all the harmless errors, but may find some significant
ones and avoid a...
Dan Mick
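The idea in #3970 can be illustrated with a minimal sketch. It assumes GCC or Clang, and `demo_log` and `log_image_size` are hypothetical stand-ins, not the real `cls_log()` declaration. The `format` attribute marks which parameter is the printf-style format string, so `-Wformat` can flag call sites that pass a 64-bit value to `%d`, the kind of mismatch described in #3961.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical logger, not the real cls_log(). The attribute says that
 * parameter 3 is a printf-style format string and that the variadic
 * arguments begin at parameter 4, so -Wformat can check every caller. */
__attribute__((format(printf, 3, 4)))
static int demo_log(char *out, size_t out_len, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(out, out_len, fmt, ap);
    va_end(ap);
    return n;
}

/* Formats a 64-bit size portably on both 32-bit and 64-bit builds. */
static int log_image_size(char *out, size_t out_len, uint64_t size)
{
    /* Writing "size=%d" here would now produce a -Wformat warning. */
    return demo_log(out, out_len, "size=%" PRIu64, size);
}
```

Using `PRIu64` from `<inttypes.h>` keeps the conversion specifier matched to `uint64_t` regardless of the platform's `int` or `long` width.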
04:49 PM Bug #3938 (In Progress): ceph-mon crashed on mixed bobtail-argonaut cluster (2 argonaut mons, 1 b...
Have a cluster set-up and ready to start trying to reproduce this in the morning. Joao Eduardo Luis
02:13 PM Bug #3938: ceph-mon crashed on mixed bobtail-argonaut cluster (2 argonaut mons, 1 bobtail)
No, didn't have it set up. I could probably reproduce if necessary. Samuel Just
04:32 PM devops Feature #3965: upstart: ulimit -n hardcoded; doesn't use 'max open files' config setting
I guess there are settings in the upstart config files, but they aren't derived from ceph.conf.
I imagine there are w...
Dan Mick
11:09 AM devops Feature #3965 (Rejected): upstart: ulimit -n hardcoded; doesn't use 'max open files' config setting
3900 tweaked the setting of ulimit -n "max open files" on all daemons in the cluster, but,
at present, we only have...
Dan Mick
03:37 PM CephFS Feature #3540 (Fix Under Review): mds: maintain per-file backpointers on first file object
Initial implementation in wip-bt. Needs review. Sam Lang
02:13 PM Bug #3883: osd: leaks memory (possibly triggered by scrubbing) on argonaut
The burnupi57 cluster (wip-f) does not appear to be leaking after all, the osds seem to have leveled off at around 35... Samuel Just
02:10 PM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
The patch is reviewed and ready to push to the testing
branch, and I will do that in a day or so.
I'm going to le...
Alex Elder
02:09 PM rgw Feature #3968 (Resolved): https should work for rest-bench
Trying to set the protocol to https by using the --protocol=https flag does not work. ... Kevin Horan
02:08 PM rbd Bug #3940: krbd: decrement obj request count when deleting
Reviewed and ready to push to master. Will do that in a day or so. Alex Elder
02:07 PM rbd Bug #3427: krbd: unmap does not remove block device properly
Reviewed and ready to push to the ceph-client "testing" branch.
I'm going to wait a day or two before pushing this...
Alex Elder
01:34 PM Linux kernel client Bug #3967 (Resolved): libceph: complete linger requests only once
Currently if a linger request gets resubmitted by the osd
client, its callback function (if provided) will get calle...
Alex Elder
01:05 PM Documentation #3960 (In Progress): [Document bug]MON and MDS do not need a ssd for data storage.
You are correct. The machines and processes would only boot a bit faster. The way to accelerate metadata servers is t... John Wilkins
12:12 PM Bug #3966 (Resolved): osdthrasher: does tell on osd just after restarting it
figured out where the thrasher errors are coming from:... Sage Weil
11:31 AM rbd Bug #3964: krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd image with sn...
...and to answer your other question Alex, there's now a workunit test Sage just added
in c782d2ac531cbb7650968e62f0...
Dan Mick
11:00 AM rbd Bug #3964: krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd image with sn...
Josh thinks 32-bitness probably doesn't matter, and remembers problems with snapshots that were fixed long ago; I gue... Dan Mick
10:55 AM rbd Bug #3964: krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd image with sn...
I don't know if Sage tested 32-bit, or if it matters, and no, that script was just a reproduction scenario; as far as... Dan Mick
06:25 AM rbd Bug #3964: krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd image with sn...
So is this then a request to port whatever it was that
fixed the problem back to 3.2?
If so, how do we prioritize...
Alex Elder
01:10 AM rbd Bug #3964: krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd image with sn...
added test to suite, commit:c782d2ac531cbb7650968e62f0b24e6136a64359 Sage Weil
12:15 AM rbd Bug #3964: krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd image with sn...
This works fine on current testing 3.6.0-00210-g8cc17ca Sage Weil
11:16 AM rbd Bug #3961 (Resolved): 32-bit cls_rbd tries cls_log with %d for 64-bit int, segfaults
commit:e253830abac76af03c63239302691f7fac1af381 on next
Dan Mick
09:37 AM rbd Subtask #3741: krbd: rework request tracking code
My testing on this code is nearly complete. However, I'm going
to hold off on pushing this (along with the changes ...
Alex Elder
06:34 AM rbd Subtask #3741: krbd: rework request tracking code
Alex Elder
09:30 AM Linux kernel client Bug #3740 (Resolved): ceph-client: change to be based on 3.8-rc2
I have finished my testing and have now updated the
ceph-client "testing" branch to be based on 3.8-rc5,
with the p...
Alex Elder
06:14 AM Linux kernel client Bug #3740: ceph-client: change to be based on 3.8-rc2
I discussed this with Sage yesterday. We're now up to
Linux 3.8-rc5. Merging our testing branch into v3.5-rc5
pro...
Alex Elder
08:56 AM Linux kernel client Bug #3798 (In Progress): libceph/rbd: take reference to all bio's in list
It looks like the extra reference that the osd client requires
of the first bio on the list isn't necessary. Nor wo...
Alex Elder
08:41 AM Linux kernel client Bug #3800: libceph: check compatibility between ceph modules
Sage, I already implemented the fix, and it's pretty trivial,
and it's generally useful. By "won't fix" do you mean...
Alex Elder
07:40 AM Linux kernel client Bug #3799 (In Progress): libceph/rbd: bio refs are messed up
Looking at the code here, the osd client isn't really doing
anything with the bio pointer. It is simply a middleman...
Alex Elder
06:47 AM rbd Bug #3927 (Closed): krbd: I/O errors (ENXIO) during rbd/kernel.sh workunit
It turns out this new behavior is a good thing, we're just
reporting errors now where we apparently did not previous...
Alex Elder
06:47 AM rbd Bug #3745 (Rejected): krbd: individual response errors are ignored
I no longer believe this is a problem. Although there is no
aggregate result value for a collection of osd requests...
Alex Elder
06:36 AM Linux kernel client Bug #3959 (Duplicate): krbd: decrement img_request->obj_request_count when deleting
Found it! http://tracker.ceph.com/issues/3940
already documents this.
Alex Elder
06:35 AM rbd Feature #3877: krbd: don't wait for notify ack to complete
Alex Elder
06:35 AM rbd Tasks #3755: krbd: use new request tracking code for sync object operations
Alex Elder
06:35 AM rbd Feature #3754: krbd: use new request tracking code for notify ack
Alex Elder
03:11 AM Bug #3948: problems from leveldb static linkage and leveldb downgrade
Hi Sage,
does it matter that the OSD is now down for around 1-2 days or will it just pickup any changes made to th...
Corin Langosch
02:19 AM Bug #3595: ceph-osd and ceph-mds crash on Debian Squeeze
root@cluster:~# ceph-osd
Segmentation fault
root@cluster:~# ceph-osd -h
Segmentation fault
root@cluster:~# ceph-...
Jörg Blank
01:03 AM RADOS Feature #3807 (Fix Under Review): crush: simple commands to create common rules
see wip-osd-commands Sage Weil

01/29/2013

11:40 PM rbd Bug #3964: krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd image with sn...
... Dan Mick
11:28 PM rbd Bug #3964 (Won't Fix): krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd i...
fghaas reported, I reproduced on a precise 32-bit system:
create an image, map, writes work fine, even with dd ofl...
Dan Mick
11:06 PM Bug #3963 (Won't Fix): cls_log should check should_gather before vsnprintf()
1) faster
2) would have allowed workaround for 3961
Dan Mick
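A rough sketch of the ordering proposed in #3963, under stated assumptions: `demo_should_gather`, `demo_threshold`, and `demo_log` are hypothetical names for illustration and do not reflect the actual Ceph logging API. The point is only that the verbosity check runs before any formatting, so a suppressed message never reaches `vsnprintf()`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static int demo_threshold = 5;    /* hypothetical configured verbosity */
static int demo_format_calls = 0; /* counts how often formatting really ran */

/* Hypothetical stand-in for a should_gather-style check. */
static bool demo_should_gather(int level)
{
    return level <= demo_threshold;
}

static int demo_log(int level, char *out, size_t out_len, const char *fmt, ...)
{
    if (!demo_should_gather(level))
        return 0; /* message is suppressed: skip vsnprintf() entirely */

    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(out, out_len, fmt, ap);
    va_end(ap);
    demo_format_calls++;
    return n;
}
```

With this ordering, lowering the verbosity also avoids evaluating the format string at all, which is why the ticket notes it could have served as a workaround for #3961.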
11:04 PM Bug #2481 (Won't Fix): ceph tell has almost no error reporting
this should get cleaned up with whatever refactor we do with the api work, but not worth spending time on individuall... Sage Weil
11:01 PM Bug #3577 (Can't reproduce): osd missing reported by osd_recovery.test_incomplete_pgs workload
we fixed several things that could explain this. Sage Weil
11:01 PM Bug #3595 (Need More Info): ceph-osd and ceph-mds crash on Debian Squeeze
Is this still a problem with the bobtail packages? Sage Weil
10:58 PM Bug #2721 (Resolved): Ceph status does not work in 0.48 even if it is still documented
wrong monitor version was running Sage Weil
10:57 PM Bug #2647 (Can't reproduce): osd: old request, waiting for subops
Sage Weil
10:56 PM Bug #2500 (Resolved): osd: unprotected ::decodes in ReplicatedPG::do_osd_ops
cleaned up ages ago Sage Weil
10:55 PM Bug #1197 (Resolved): osd: make inconsistent state durable
this got fixed in commit:2475066c3247774a2ad048a2e32968e47da1b0f5 Sage Weil
10:54 PM Bug #3646 (Resolved): pg_temp with two down/out osds
commit:6122a9f62f9eeae1410d1703fecb8939a35fb03f Sage Weil
10:46 PM rbd Bug #3961 (Resolved): 32-bit cls_rbd tries cls_log with %d for 64-bit int, segfaults
32-bit system: rbd create i -s 1; rbd rm i causes death of osd in cls_log();
presumably this is because of cls_log(%...
Dan Mick
10:10 PM RADOS Feature #3807: crush: simple commands to create common rules
ceph osd crush rule list
ceph osd crush rule create-simple <name> <root> <failure domain>
ceph osd crush rule create-...
Sage Weil
09:33 PM Documentation #3960 (Resolved): [Document bug]MON and MDS do not need a ssd for data storage.
From: http://ceph.com/docs/master/install/hardware-recommendations/#data-storage
it says:
Since the storage requi...
Xiaoxi Chen
08:38 PM Linux kernel client Bug #3959 (Duplicate): krbd: decrement img_request->obj_request_count when deleting
Each image request keeps a count of its object requests.
Adding an object request to or deleting one from an image
r...
Alex Elder
08:34 PM Feature #2472: osd: add opaque 'class <name> <foo>' cap that class can interpret/enforce
Sage Weil
08:34 PM CephFS Bug #1946 (Resolved): snapshot inherits timestamp/size/etc from modified trunk dir upon mds restart
commit:7842bb50c7814cc16c22589bf41df7db1f7492eb Sage Weil
08:33 PM Feature #3890 (Fix Under Review): osd: create tool to extract pg info and pg log from filestore
In final review to merge from wip-3890 branch. David Zafman
08:33 PM Bug #3126 (Can't reproduce): mds crashed bool CDir::check_rstats()
we'll see if this comes up with all of yan's fixes in now. Sage Weil
08:33 PM rbd Bug #3566 (Resolved): log max new = 1 can cause hang on process exit
fixed a few weeks ago, commit:813787af3dbb99e42f481af670c4bb0e254e4432 and a few prior commits Sage Weil
08:32 PM Bug #3125 (Resolved): Assertion Error in peer.py - failure from the nightly run
this is fixed up now, most recent commit was 3772d437dd4c562a6490f84124eb4757e22eca92 Sage Weil
08:26 PM rbd Bug #3958 (Resolved): rbd fsx fails with EBUSY
... Sage Weil
07:41 PM CephFS Bug #3553 (Won't Fix): MDS core dumped running 0.48.2argonaut
if/when we see this on bobtail or later, we'll investigate. Sage Weil
07:32 PM Bug #3878 (Rejected): osd: nobackfill flag doesn't work
it works. it just doesn't leave the pg in backfill_wait, as i was expecting. Sage Weil
07:30 PM Bug #3836 (Resolved): osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
in bobtail, commit:e6bceeedb0b77d23416560bd951326587470aacb Sage Weil
07:24 PM rgw Bug #3365: Broken metadata (duplicated as CSV)
Sage Weil wrote:
> Aaron Schulz wrote:
> > Ian Colle wrote:
> > > Aaron are you still seeing this?
> >
> > Sorr...
Aaron Schulz
12:31 PM rgw Bug #3365 (Can't reproduce): Broken metadata (duplicated as CSV)
Thanks for trying to reproduce this on Bobtail, Aaron. I'm moving it to Can't Reproduce. Ian Colle
12:26 PM rgw Bug #3365: Broken metadata (duplicated as CSV)
I'm having a hard time reproducing this on bobtail. If I remove the metadata normalization code in the MediaWiki/Clou... Aaron Schulz
07:07 PM Bug #3938: ceph-mon crashed on mixed bobtail-argonaut cluster (2 argonaut mons, 1 bobtail)
is there a core for this? Sage Weil
06:43 PM Bug #3957 (Resolved): new #include breaks assert.h (again)
Dan Mick
06:40 PM Bug #3957 (Resolved): new #include breaks assert.h (again)
#include <boost/lexical_cast.hpp> in mds/Server.cc apparently re-includes the system assert.h,
blowing up dout(). F...
Dan Mick
06:42 PM Bug #3956 (Resolved): ceph auth add/del entity name parameter check
commit:25e9a0be63fdad9fd8f7909585c9270a3729dc44 Sage Weil
06:00 PM Bug #3956 (Resolved): ceph auth add/del entity name parameter check
It's currently (as of v0.56.1) possible to run "ceph auth add" without any further parameters. This results in the ad... Alex Moore
05:27 PM Bug #3955 (Resolved): Configure should explicitly check for c++ compiler.
If no c++ compiler is installed, configure fails with a misleading message when checking for boost libraries. Anonymous
05:11 PM Bug #3747 (Closed): PGs stuck in active+remapped
I think this was probably related to the lagging pg peering workqueue.. is there anything to suggest that isn't the c... Sage Weil
05:09 PM Bug #3948 (Need More Info): problems from leveldb static linkage and leveldb downgrade
Corin-
Just restart the osd. And check dmesg for any kernel malfeasance... that is usually what triggers this. A...
Sage Weil
04:51 PM Bug #3900 (Resolved): init-ceph should do ulimit -n's with do_root_cmd
commit:84a024b647c0ac2ee5a91bacdd4b8c966e44175c in next, cherry-pick -x'ed to bobtail
Dan Mick
03:21 PM Bug #3900 (Fix Under Review): init-ceph should do ulimit -n's with do_root_cmd
Dan Mick
04:37 PM Subtask #3840 (Resolved): osd: ack push after apply+commit
as part of #3833 Sage Weil
04:36 PM Feature #3732 (Resolved): osd/mon: report recovery rate (bytes and objects per sec)
commit:c2e50e580d18107162d2d101c5c243c665e56124 Sage Weil
04:33 PM CephFS Feature #3953 (Resolved): kclient: get/set layout via virtual xattrs
Sage Weil
04:32 PM CephFS Feature #1236 (Resolved): libceph: set layout via virtual xattrs (libceph/cfuse)
commit:1564c3a0a3efbde5a326001586238fde8f6648ad for userspace bits.
the kernel bits still need review.. opening se...
Sage Weil
03:11 PM rbd Bug #3952 (Resolved): krbd: no need for object header version
The header object watch operation had a sort of half implemented
use of the version of the object. It apparently is...
Alex Elder
03:08 PM rbd Bug #3946 (Resolved): rbd fsx failing in nightly
Just an extra delete in a code path in flush_set that wasn't exercised before. Fixed by commit:3bc21143552b35698c9916... Josh Durgin
02:44 PM rbd Bug #3946: rbd fsx failing in nightly
Reproducing locally seems to confirm this, since there was a recent change to replace commit_set() with flush_set():
...
Josh Durgin
12:06 PM rbd Bug #3946: rbd fsx failing in nightly
I'm guessing these are related to recent objectcacher changes, since they didn't affect runs without caching. The cor... Josh Durgin
02:48 PM rbd Feature #3949 (Resolved): krbd: create test script that exercises concurrent operations
I just committed the test script to the ceph master branch.
The script is located here: qa/workunits/rbd/concurrent...
Alex Elder
09:16 AM rbd Feature #3949: krbd: create test script that exercises concurrent operations
Well the script is really nice. And I just got a new
crash while running it on a real machine (rather than
my UML ...
Alex Elder
08:22 AM rbd Feature #3949 (Resolved): krbd: create test script that exercises concurrent operations
I suggested doing this in http://tracker.ceph.com/issues/3427.
That issue is about a bug where an image unmapping ca...
Alex Elder
01:50 PM rgw Bug #3941 (Resolved): s3tests crash on bobtail
Crash fixed, commit:f41010c44b3a4489525d25cd35084a168dc5f537.
Also, pushed a change to s3-tests.git, setting a requi...
Yehuda Sadeh
01:27 PM Bug #3268: osd: localize reads handling is incorrect
Yes, the OSDs will serve replica reads as things stand. Greg Farnum
01:11 PM Bug #3268: osd: localize reads handling is incorrect
I'm starting on this bug now. Before fixing the flag handling described in the ticket, I want to make sure that the O... Noah Watkins
12:43 PM Bug #3810: btrfs corrupts file size on 3.7
I'm making an attempt. Mike Lowe
12:36 PM Bug #3810: btrfs corrupts file size on 3.7
Mike, Bill: are you able to test Josef's patch? Sage Weil
11:30 AM CephFS Bug #3951: ceph-fuse: permissions error on create
I've got a question in for Sam, but other than that this looks good to me! Greg Farnum
09:37 AM CephFS Bug #3951 (Resolved): ceph-fuse: permissions error on create
Reported by Greg Farnum:
gregf@kai:~/ceph/src [master]$ cd mnt/
gregf@kai:~/ceph/src/mnt$ sudo chown gregf.gregf ...
Sam Lang
11:10 AM rbd Bug #3950: krbd: new assertion failure running concurrent rbd test
OK, I do have the osd request pointer now. It was available
in register R14. And with a little work I can determin...
Alex Elder
10:35 AM rbd Bug #3950: krbd: new assertion failure running concurrent rbd test
The object being operated on is the rbd header image, in
this case named "image.5X5ZNB.rbd". The object request typ...
Alex Elder
10:06 AM rbd Bug #3950: krbd: new assertion failure running concurrent rbd test
Weird. It looks to me like the object request that's
just completing is already done, meaning we got
a callback fr...
Alex Elder
09:19 AM rbd Bug #3950 (Can't reproduce): krbd: new assertion failure running concurrent rbd test
(I think this is a new issue, I haven't investigated it yet.)
I hit an assertion failure while running my new test...
Alex Elder
10:34 AM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
I've opened a new issue that has symptoms similar to this
but not identical:
http://tracker.ceph.com/issues/395...
Alex Elder
09:41 AM Bug #3768 (In Progress): perl is required for logrotate, we need to include Perl as a dependency
Putting back to in-progress. The preferred solution is to replace the perl filter line with sed or python and remove... Anonymous
09:38 AM Bug #3930 (Resolved): ceph.spec: udev rule for rbd not in rpms
Branch: refs/heads/master
Home: https://github.com/ceph/ceph
Commit: 0b66994c180b1ce5856a38518423d82fbebc8a2e
...
Anonymous
09:15 AM rbd Bug #3427: krbd: unmap does not remove block device properly
I have opened this to cover developing that test script
http://tracker.ceph.com/issues/3949
Alex Elder
07:53 AM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
...yes, yes it is. I've been working in FUSE so far. *sigh* Well, it needed the fix too. Greg Farnum
07:26 AM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
I don't see wip-2753-fsync-errors in the repo. Also, note that this problem was reported on the cephfs kernel client... Sam Lang

01/28/2013

10:50 PM Bug #3948 (Resolved): problems from leveldb static linkage and leveldb downgrade
Two days ago I upgraded one of my osds to 0.48.3 (see http://tracker.ceph.com/issues/3797) and everything worked fine... Corin Langosch
09:51 PM Bug #3930 (In Progress): ceph.spec: udev rule for rbd not in rpms
Anonymous
09:50 PM Bug #3945 (In Progress): osd: dynamically link to leveldb
Anonymous
04:56 PM Bug #3945 (Resolved): osd: dynamically link to leveldb
We hit a problem with quantal that underscored the danger of linking statically to libleveldb. After some discussion... Sage Weil
09:21 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
wip-2753-fsync-errors has a patch which makes fsync return an error if the client gets back an error from the Objecte... Greg Farnum
05:32 AM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
Looked at this briefly; I see that the way we do fsyncs is attached to a "FIXME: this could starve" comment, and I be... Greg Farnum
09:18 PM rbd Bug #3947 (Resolved): krbd: read zeroing freed bio?
This happened to me once before but I wasn't sure what
I did. Now I think I do know. This is with the new
request...
Alex Elder
08:45 PM Feature #3833: osd: improve recovery throttling
commit:d6db239ce5134a9c410554fb292c54981375c628 Sage Weil
08:20 PM Feature #3833: osd: improve recovery throttling
Commit? Ian Colle
07:32 PM Feature #3833 (Resolved): osd: improve recovery throttling
Sage Weil
06:08 PM RADOS Documentation #3830: crush-map.rst: chooseleaf doesn't include 'firstn|indep', and 'aggregates' i...
Can we get something moving on this bug, or give it to John to research? (and btw, firstn|indep has
been addressed u...
Dan Mick
05:20 PM Bug #3906 (Won't Fix): ceph-mon leaks memory during peering
This isn't something that's worth dealing with on the monitor side right now. Sage Weil
05:19 PM Bug #3797 (Duplicate): osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48....
see #3376 Sage Weil
04:43 PM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
The conclusion:
- quantal had a newer libleveldb than we built statically into our debs
- downgrading made compac...
Sage Weil
05:02 PM rbd Bug #3946 (Resolved): rbd fsx failing in nightly
... Sage Weil
04:49 PM Bug #3905 (Can't reproduce): incomplete & stale (lost?) PGs
This appears to be something that was triggered and exacerbated by now-fixed issues. Until we can trigger it, I'm in... Sage Weil
08:04 AM Bug #3905: incomplete & stale (lost?) PGs
Due to some other issues and after a chat with Sage, I restarted all of my osds and this disappeared since. So I'm af... Faidon Liambotis
04:28 PM Bug #3944 (Resolved): ceph tool should prevent --admin-socket
Misremembering, I tried several 'ceph --admin-socket' commands rather than 'ceph --admin-daemon'; the result was that... Dan Mick
04:12 PM Bug #3810: btrfs corrupts file size on 3.7
sent a report to linux-btrfs Sage Weil
03:41 PM Bug #3810: btrfs corrupts file size on 3.7
Ok, this looks like a btrfs bug to me. On osd.3, the write extends the file size to 4194304, but the later stat sees... Sage Weil
10:59 AM Bug #3810: btrfs corrupts file size on 3.7
I ran part of the workload and found an inconsistent pg. I've uploaded ceph.log and logs from the primary and second... Mike Lowe
03:50 PM Documentation #3711: crush-map.rst: choose firstn talks about "N", but does not clearly define wh...
Things I mentioned in comment 4 are still present; I'd like to either change them or update here why we're not. Dan Mick
02:11 PM rbd Bug #3427 (Fix Under Review): krbd: unmap does not remove block device properly
I have posted two patches for review, the second of which
should fix this problem. I have not actually reproduced
...
Alex Elder
12:50 PM devops Feature #3479: ceph-deploy: uninstall
commit:93082e82df56b01c524d0195e20068f6a6c8ca26 Sage Weil
12:49 PM devops Feature #3910: ceph-deploy: uninstall purge
ceph-deploy commit:93082e82df56b01c524d0195e20068f6a6c8ca26
Sage Weil
12:48 PM devops Feature #3341: ceph-disk-activate: Make --mount the default
I made it autodetect whether to mount or not based on whether you pass a directory or block device in. Simpler all a... Sage Weil
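The autodetection described in the comment above (mount when given a block device, skip mounting when given a directory) can be sketched roughly as follows. This is an illustration only, not the actual ceph-disk-activate code; the function name needs_mount is hypothetical.

```python
import os
import stat


def needs_mount(path):
    """Decide whether `path` has to be mounted before activation.

    Hypothetical helper: a block device must be mounted first,
    while a directory is assumed to be usable as-is.
    """
    mode = os.stat(path).st_mode
    if stat.S_ISBLK(mode):
        return True   # block device: mount it, then activate
    if stat.S_ISDIR(mode):
        return False  # directory: activate in place
    raise ValueError("%s is neither a block device nor a directory" % path)
```

Rejecting anything that is neither a block device nor a directory keeps a mistyped path from being silently treated as one or the other.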
12:48 PM rgw Bug #3365: Broken metadata (duplicated as CSV)
Aaron Schulz wrote:
> Ian Colle wrote:
> > Aaron are you still seeing this?
>
> Sorry I need to get the time to ...
Sage Weil
10:24 AM Feature #3890 (In Progress): osd: create tool to extract pg info and pg log from filestore
Ian Colle
10:10 AM rgw Cleanup #3777 (In Progress): rgw: audit code for reading NULL env variables
reopening, see #3941. Yehuda Sadeh
10:09 AM rgw Bug #3941: s3tests crash on bobtail
Yeah, similar to that other issue (#3777)... Yehuda Sadeh
09:21 AM CephFS Feature #3540 (In Progress): mds: maintain per-file backpointers on first file object
Sam Lang

01/27/2013

03:59 PM Bug #3810: btrfs corrupts file size on 3.7
I can do that, it will take somewhere between 12 and 24 hours to run. Mike Lowe
03:34 PM Bug #3810: btrfs corrupts file size on 3.7
Mike, would it be possible to reproduce this with debug file store = 20? That will tell us if what Ceph thinks it di... Sage Weil
02:10 PM Bug #3810: btrfs corrupts file size on 3.7
I deleted the rbd's with inconsistent pg's, recreated the rbd's, ran rsync with the same data set, made sure no btrfs... Mike Lowe
12:58 PM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Ok, everything still looks good :). Last question: should I upgrade my whole cluster to this version or will a new ar... Corin Langosch
12:01 PM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Ok, after around 10 minutes of runtime everything seems normal. Thanks for the fast and great help! :-)
ceph versi...
Corin Langosch
12:00 PM Bug #3797 (Fix Under Review): osd takes 100% cpu after upgrading from 0.48.2argonaut to the lates...
Sage Weil
11:56 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
that fixed it, it seems. we could
- update argonaut and bobtail to newer leveldb :/
- link dynamically for quant...
Sage Weil
11:08 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
looks like leveldb spinning on background compaction.
his .2 package is quantals, which is leveldb 1.5.. newer than...
Sage Weil
10:56 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Output of gdb /usr/bin/ceph-osd $pid, then 'thread apply all bt' Corin Langosch
10:25 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Hi Sage, here we go. Is it enough data or do you need more? I didn't disable the logging yet... Corin Langosch
10:00 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Hi Corin-
Can you enable 'debug osd = 20' for a bit and attach that log? I think this is related to commit:830b8f...
Sage Weil
08:31 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Just another small update - nothing changed so far. The cluster is still healthy, but the osd is still using 100% of ... Corin Langosch
05:50 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Just a small update - nothing changed so far. The cluster is still healthy, but the osd is still using 100% of one co... Corin Langosch
05:06 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Here's a nice graph to see the difference before/ after upgrade of disk activity....
The cluster is clean, no reco...
Corin Langosch
05:03 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Hi Sage,
sorry for the delay. I just shut down the osd, upgraded it and started it again. It's again using almost 1...
Corin Langosch
10:29 AM rgw Bug #3941 (Resolved): s3tests crash on bobtail
... Sage Weil
01:24 AM devops Feature #3479 (Resolved): ceph-deploy: uninstall
Sage Weil
01:24 AM devops Feature #3910 (Resolved): ceph-deploy: uninstall purge
Sage Weil

01/26/2013

08:58 PM devops Feature #3917 (Fix Under Review): ceph-dir-prepare command
Sage Weil
08:58 PM devops Feature #3915 (Rejected): ceph-disk-prepare: support sysvinit or upstart
init system is a property of the host, not the disk.. doesn't belong in ceph-disk-prepare. Sage Weil
08:57 PM devops Feature #3911 (Fix Under Review): sysvinit: allow daemon enumeration via dirs
Sage Weil
08:57 PM devops Feature #3914 (Fix Under Review): ceph-disk-activate: support sysvinit
Sage Weil
08:54 PM devops Feature #3341 (Rejected): ceph-disk-activate: Make --mount the default
Sage Weil
08:53 PM devops Bug #3898 (Resolved): ceph-deploy: problems with >1 mon
ceph-deploy commit:8067dd0afa19ff7b7ca75f984dedc4213d3a4be8 Sage Weil
05:21 PM rgw Bug #3365: Broken metadata (duplicated as CSV)
Ian Colle wrote:
> Aaron are you still seeing this?
Sorry I need to get the time to try and reproduce this (and o...
Aaron Schulz
12:44 PM rbd Bug #3937 (Fix Under Review): krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
A patch resolving this has been posted for review.
[PATCH 4/4] rbd: don't drop watch requests on completion
Alex Elder
12:43 PM rbd Bug #3940 (Fix Under Review): krbd: decrement obj request count when deleting
A patch resolving this has been posted for review. Alex Elder
08:05 AM rbd Bug #3940 (Resolved): krbd: decrement obj request count when deleting
The obj_request_count value keeps track of how many object requests
are associated with an image request. It is inc...
Alex Elder
07:57 AM rbd Bug #3939 (Duplicate): krbd: circular locking report in sysfs code
I intended to write this up before but don't think I did.
I'm getting a "possible circular locking dependency detect...
Alex Elder
01:22 AM Bug #3938 (Can't reproduce): ceph-mon crashed on mixed bobtail-argonaut cluster (2 argonaut mons,...
7:09:03.310220 7f652087e700 1 mon.a@1(peon).osd e72 e72: 20 osds: 20 up, 20 in ... Samuel Just

01/25/2013

08:16 PM Linux kernel client Bug #3860: rbd: problems if watch setup returns ERANGE
Just to close this out...
The fix (not repeating on ERANGE) has been committed:
commit c04306471ad93f1daf60771a...
Alex Elder
06:27 AM Linux kernel client Bug #3860: rbd: problems if watch setup returns ERANGE
Josh rejected this. But since he said that the
change I proposed--to not do the loop--was OK
I suggest this bug sh...
Alex Elder
05:38 PM CephFS Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
I will be able to reproduce after Feb 8. Will do if nobody reproduces it before then. Ivan Kudryavtsev
04:33 PM CephFS Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
please set 'debug mds = 10' and upload mds log. To minimize mds log size, please truncate the mds log before executin... Zheng Yan
09:54 AM CephFS Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
I made a mistake in the initial post: the number of files in the directory is 3.5K, not 35K. It's my netflow for last years, ... Ivan Kudryavtsev
12:11 AM CephFS Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
At #3936 I'm providing some benchmarks to show that IOPS/speed is OK for my installation and my hands are not perform... Ivan Kudryavtsev
04:25 PM Documentation #3222 (Resolved): DOC: Get an Object from a Primary OSD
Added a full exercise toward the end here: http://ceph.com/docs/master/rados/operations/monitoring-osd-pg/ John Wilkins
08:50 AM Documentation #3222 (In Progress): DOC: Get an Object from a Primary OSD
John Wilkins
04:24 PM Documentation #3333 (Resolved): doc: Explain "degraded" more
More extensive discussion here: http://ceph.com/docs/master/rados/operations/monitoring-osd-pg/ John Wilkins
04:24 PM Documentation #3331 (Resolved): doc: Where is my data placed?
Provided an entire exercise toward the end of this document: http://ceph.com/docs/master/rados/operations/monitoring-... John Wilkins
04:22 PM Documentation #3320 (Resolved): doc: What persistency does Ceph guarantee
Added more extensive discussions.
Here: http://ceph.com/docs/master/rados/operations/monitoring-osd-pg/ and
Her...
John Wilkins
03:25 PM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
OK, with Josh's help I finally managed to reproduce the
problem intentionally to check my fix.
I'm building it no...
Alex Elder
11:11 AM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
I have confirmed that every time a request registered to linger
is re-submitted the osd client will call the callbac...
Alex Elder
08:07 AM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
I've decoded the osd request that's been provided to
rbd_osd_req_callback(). Its contents look completely
legitima...
Alex Elder
06:54 AM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
Adding two things:
- this occurred during test 190 of the third consecutive pass
of xfstests with this in the teuth...
Alex Elder
05:04 AM rbd Bug #3937 (Resolved): krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
Looking at a crash this morning in the new request code due
to this failed assertion in rbd_osd_req_callback():
...
Alex Elder
03:14 PM rgw Bug #3620: rgw:improve multiple user access keys scalability
Ian Colle
01:51 PM Subtask #3840: osd: ack push after apply+commit
Ian Colle
01:50 PM Feature #3833: osd: improve recovery throttling
Ian Colle
11:48 AM Bug #3836: osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
pushed to master, still need to backport Samuel Just
11:40 AM Bug #3836 (Fix Under Review): osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
D'oh. sharedptr_registry.hpp has an extraneous Mutex::Locker l(lock) declaration in the retry loop. It only actually... Samuel Just
11:41 AM Documentation #3711 (Resolved): crush-map.rst: choose firstn talks about "N", but does not clearl...
John Wilkins
11:40 AM Documentation #3390 (Resolved): doc: add detail on different bucket algorithms
John Wilkins
11:12 AM rgw Feature #3669 (In Progress): rgw: support acl grants through http headers
Ian Colle
11:09 AM rgw Cleanup #3777 (Resolved): rgw: audit code for reading NULL env variables
Merged into master, commit: b3a2e7e955547a863d29566aab62bcc480e27a65 caleb miles
11:07 AM rgw Feature #3667 (In Progress): rgw: support extra canned acl params
Ian Colle
10:55 AM Bug #3928 (Resolved): osd: peering workqueue tryings to advance through *all* past osdmaps in one...
The timeout should be fixed by e0511f4f4773766d04e845af2d079f82f3177cb6. Samuel Just
10:55 AM rgw Bug #3778 (Resolved): document procedure for enabling subdomain S3 api calls
Added info for subdomain call. John Wilkins
10:33 AM rgw Bug #3778 (In Progress): document procedure for enabling subdomain S3 api calls
John Wilkins
09:54 AM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
It's pretty likely that this is a server-side behavior rather than a client-side one. Keep that in mind when reproduc... Greg Farnum
12:00 AM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
rados -p rbd bench 120 write -t 16
shows about 90-110 MB/sec.
Ivan Kudryavtsev
09:52 AM rbd Bug #3654 (Resolved): libvirt: colons in ipv6 monitor addresses are not escaped when sent to qemu
Upstream commit c1509ab47edf61e9f20d11922526b9fca518d238 Josh Durgin
09:34 AM rbd Bug #3927: krbd: I/O errors (ENXIO) during rbd/kernel.sh workunit
Yes, the ENXIO is expected. Assuming it's being propagated out to dd, and the test passes (outputs OK at the end of k... Josh Durgin
05:55 AM rbd Bug #3427: krbd: unmap does not remove block device properly
We had some discussion about whether an atomic bit
operation for this was sufficient, or whether a memory
barri...
Alex Elder
04:40 AM CephFS Bug #1878: ceph.ko doesn't setattr (lchown, utimes) on symlinks
Heh. Funny markup. The numbered list came out of #s used for comments.
Anyway, I've just verified that the issue...
Alexandre Oliva
04:34 AM CephFS Bug #1878: ceph.ko doesn't setattr (lchown, utimes) on symlinks
I've just verified that the problem is still present in 3.7.3, and I have a much simpler reproducer too.
mount -t ...
Alexandre Oliva
02:05 AM Bug #3810: btrfs corrupts file size on 3.7
Kernel was 3.7.1
Ran btrfsck on the partitions when the error first occurred with nothing found.
Tried your fix o...
Bill Kenworthy

01/24/2013

11:59 PM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
I also tried to do:
dd if=/dev/rbd/rbd/test of=/dev/null bs=4M - the same situation.
Ivan Kudryavtsev
11:57 PM rbd Bug #3936 (Rejected): rbd: Strange dd speed behaviour (server side issue?)
I have 3 node/15 osds (5 on each), every on separate drive installation (with SSD cache), journal in RAMFS. XFS as ba... Ivan Kudryavtsev
11:46 PM CephFS Bug #3935 (Can't reproduce): kclient: Big directory access bugs (multiple), mixed 32- and 64-bit ...
I have the following directory structure in ceph fs:... Ivan Kudryavtsev
09:41 PM Bug #3810: btrfs corrupts file size on 3.7
Bill Kenworthy wrote:
> Version was 55.1 when created and the error occurred, now updated to 56.1 (on gentoo) after ...
Sage Weil
09:36 PM Bug #3810: btrfs corrupts file size on 3.7
Version was 55.1 when created and the error occurred, now updated to 56.1 (on gentoo) after error
It's organised as 5...
Bill Kenworthy
08:55 PM Bug #3810: btrfs corrupts file size on 3.7
Bill Kenworthy wrote:
> I have been hit by the same thing ... is there any information you need before I try and fix...
Sage Weil
06:18 PM Bug #3810: btrfs corrupts file size on 3.7
I have been hit by the same thing ... is there any information you need before I try and fix it further?
I've tried...
Bill Kenworthy
01:35 PM Bug #3810: btrfs corrupts file size on 3.7
How about this object instead:
2013-01-23 18:41:31.336722 osd.7 149.165.228.11:6800/28046 159 : [ERR] 2.202 osd.0: s...
Mike Lowe
01:16 PM Bug #3810: btrfs corrupts file size on 3.7
the going theory is that this is triggered by btrfs scrub. can we confirm this somehow? Sage Weil
11:03 AM Bug #3810: btrfs corrupts file size on 3.7
Samuel Just wrote:
> I need a dump of the xattrs on the d0c18e1d/605.00000000/head//1 object in pg 1.1d on osd 7 and...
Mike Lowe
10:17 AM Bug #3810: btrfs corrupts file size on 3.7
Additional info, btrfs scrubs were done while the osd's were active which may or may not have had a negative effect. ... Mike Lowe
08:55 PM rgw Bug #3724 (Resolved): docs refer to non-implemented features of the radosgw-admin rest api
commit d95b4313de1614fd85265879e6d7ddadd5268af2
Dan Mick
08:45 PM rgw Bug #3724: docs refer to non-implemented features of the radosgw-admin rest api
Since the docs are in wip-admin-api, this amounts to rolling doc/radosgw/admin/adminops.rst back to its state as of 0... Dan Mick
01:41 PM rgw Bug #3724: docs refer to non-implemented features of the radosgw-admin rest api
Sage Weil
01:38 PM rgw Bug #3724: docs refer to non-implemented features of the radosgw-admin rest api
John - any update? Ian Colle
08:43 PM Bug #3885: osd: osd-recovery-incomplete qa test failing
(the above commit is in the teuthology code) Dan Mick
04:28 PM Bug #3885 (Resolved): osd: osd-recovery-incomplete qa test failing
fixed, mostly by commit:20af01f23ba932cb97cb40bba89bff546e10c461, which may fix up some of the other spurious failure... Sage Weil
11:13 AM Bug #3885 (In Progress): osd: osd-recovery-incomplete qa test failing
Sage Weil
04:54 PM devops Bug #3934: ceph-deploy new should require at least one host name
If no hosts are specified on the command line, a ceph.conf file is created without any monitors listed. No errors or... Anonymous
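One way to enforce "at least one host name" at argument-parsing time is sketched below, using Python's standard argparse with nargs="+". This is not taken from the ceph-deploy source; the parser and the names parse_new_args and MON are assumptions made for illustration.

```python
import argparse


def parse_new_args(argv):
    """Parse arguments for a hypothetical 'new' subcommand.

    nargs="+" makes argparse reject an empty host list with a usage
    error instead of silently continuing with no monitors.
    """
    parser = argparse.ArgumentParser(prog="deploy-sketch new")
    parser.add_argument(
        "mon",
        metavar="MON",
        nargs="+",
        help="initial monitor hostname(s)",
    )
    return parser.parse_args(argv)
```

With this sketch, parse_new_args(["ceph1", "ceph2"]) yields both host names in the mon attribute, while an empty argument list makes argparse print a usage message and exit with a non-zero status.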
04:51 PM devops Bug #3934 (Resolved): ceph-deploy new should require at least one host name
Anonymous
04:14 PM devops Bug #3933: ceph-deploy gatherkeys silently fails if no host is specified
If no host is specified and ceph.conf exists gatherkeys will fail, but not report any error. Anonymous
04:12 PM devops Bug #3933 (Resolved): ceph-deploy gatherkeys silently fails if no host is specified
Anonymous
02:52 PM Bug #3930 (Resolved): ceph.spec: udev rule for rbd not in rpms
The udev rule for kernel rbd (udev/50-rbd.rules in ceph.git) should be packaged. It's already in the debs: debian/lib... Josh Durgin
01:41 PM rgw Bug #3778: document procedure for enabling subdomain S3 api calls
Sage Weil
01:39 PM rgw Bug #3778: document procedure for enabling subdomain S3 api calls
Any update? Ian Colle
01:41 PM rgw Bug #3450: WRITE permission only doesn't allow proper multi-part upload
Sage Weil
01:33 PM rgw Bug #3450: WRITE permission only doesn't allow proper multi-part upload
Needs to be part of larger overall discussion about the intent of subusers. Ian Colle
01:41 PM rgw Bug #3706: rgw functional test testSlashInName failed in nightly
Sage Weil
01:38 PM rgw Bug #3706: rgw functional test testSlashInName failed in nightly
Need to see if it happens again and then find a reproducer. Ian Colle
01:41 PM rgw Feature #2804: rgw: disallow running multiple gateways on the same fastcgi socket
Sage Weil
01:41 PM rgw Feature #3074: radosgw needs --help support
Sage Weil
01:41 PM rgw Bug #2366: rgw: bucket index update rely on pg state
Sage Weil
01:41 PM rgw Bug #2650: rgw: swift key creation overrides subuser access mask
Sage Weil
01:41 PM rgw Bug #1777: rgw: user info modification is not atomic
Sage Weil
01:41 PM rgw Bug #1779: rgw: swift auth returns wrong error code when unexisting user is given
Sage Weil
01:14 PM rgw Bug #1779: rgw: swift auth returns wrong error code when unexisting user is given
Work in course with other swift changes, but not a driver. Ian Colle
01:40 PM rgw Feature #3366: rgw: dr: define management api
Caleb to get out updated document for review. Ian Colle
01:37 PM rgw Bug #3620: rgw:improve multiple user access keys scalability
Caleb to review. Ian Colle
01:36 PM rgw Bug #3682 (Resolved): valgrind errors seen when running rgw tests in nightlies
Increased time in tests and has not occurred. Ian Colle
01:35 PM rgw Bug #3628 (Resolved): rgw: leak of object parts on partial upload
Fixed in bobtail Ian Colle
01:34 PM rgw Bug #3485 (In Progress): rgw: unique user emails not enforced
Ian Colle
01:34 PM Bug #3906: ceph-mon leaks memory during peering
the logs indicate this may be related to failed auth connection attempts spamming the monitor. Sage Weil
11:43 AM Bug #3906: ceph-mon leaks memory during peering
we need to reproduce this on a large internal cluster, with many osds and even more pgs. Sage Weil
09:38 AM Bug #3906: ceph-mon leaks memory during peering
I believe this to be related to #3609 Joao Eduardo Luis
01:32 PM rgw Bug #3073: radosgw-admin: is not a daemon, should not have -d/-f options
commit:40ae8ceab58b4c05e01dc9f7809728a592cc4f0d actually Sage Weil
01:30 PM rgw Bug #3073 (Resolved): radosgw-admin: is not a daemon, should not have -d/-f options
commit:b878b2c6e9ee41de25faf4dfdd7285dcb01b36e8 Sage Weil
01:26 PM rgw Bug #3073: radosgw-admin: is not a daemon, should not have -d/-f options
Change common init Ian Colle
01:30 PM rgw Bug #3365: Broken metadata (duplicated as CSV)
Aaron are you still seeing this? Ian Colle
01:29 PM rgw Bug #3365 (Need More Info): Broken metadata (duplicated as CSV)
Ian Colle
01:21 PM rgw Feature #2490: rgw-admin: only register watch when needed
Performance improvement. Ian Colle
01:21 PM CephFS Bug #1878: ceph.ko doesn't setattr (lchown, utimes) on symlinks
This is still present in 3.6.11 (I'll know about 3.7.* soon). I suspect this may have to do with failing to mark met... Alexandre Oliva
01:18 PM rgw Bug #2482 (Rejected): rgw: duplicate content-length results in 400
Apache issue. Ian Colle
01:14 PM rgw Bug #1906 (Can't reproduce): rgw: total_time isn't logged consistently
Ian Colle
01:13 PM Documentation #3831 (Resolved): ceph osd crush set command needs correction in the doc
John Wilkins
01:10 PM rgw Bug #1673: rgw: mod_fastcgi needs to be backward compatible
Ian Colle
01:10 PM rgw Bug #1673: rgw: mod_fastcgi needs to be backward compatible
Canonical cannot take our changes upstream until we solve this issue. Ian Colle
11:16 AM rgw Cleanup #3929 (New): s3-tests: refactor all test_post_* tests
These tests mostly do the same thing, can be cleaned up, no need to duplicate the same code across all. Yehuda Sadeh
10:58 AM CephFS Feature #3821 (In Progress): qa: run backuppc as part of qa suite
Ekapol Rojpiboonphun wrote:
> Just to make sure that I will be on this along the line of what you might already have...
Sage Weil
10:52 AM CephFS Feature #3821: qa: run backuppc as part of qa suite
Just to make sure that I will be on this along the line of what you might already have in mind. (More details please ... Anonymous
09:56 AM CephFS Feature #3821: qa: run backuppc as part of qa suite
Download/install backuppc and get it into suite. Ian Colle
10:32 AM Bug #3928 (In Progress): osd: peering workqueue tryings to advance through *all* past osdmaps in ...
Samuel Just
10:02 AM Bug #3928 (Resolved): osd: peering workqueue tryings to advance through *all* past osdmaps in one...
Sage Weil
10:10 AM Bug #3905: incomplete & stale (lost?) PGs
Sounds like a combination of crush map and rules that aren't behaving well together — "incomplete" means the PG doesn... Greg Farnum
09:42 AM Bug #3801: Cascading OSD failures beginning with common/HeartbeatMap.cc: 78: FAILED assert(0 == "...
The olog stuff is fixed in bobtail, and won't be backported to argonaut.
I'm not sure what the root cause of the h...
Sage Weil
08:42 AM Bug #3854: mon: clock skew tests failing on master
Happened again on QA, reopening while testing a new patch. Joao Eduardo Luis
08:15 AM rbd Bug #3927: krbd: I/O errors (ENXIO) during rbd/kernel.sh workunit
Hey! I just looked at the test, and here's how it ends:
# remove snapshot and detect error from mapped snapshot
...
Alex Elder
08:15 AM rbd Bug #3927: krbd: I/O errors (ENXIO) during rbd/kernel.sh workunit
This is the relevant portion of the yaml file:
- workunit:
clients:
all:
- rbd/map-unmap.sh
...
Alex Elder
08:09 AM rbd Bug #3927 (Closed): krbd: I/O errors (ENXIO) during rbd/kernel.sh workunit
I'm seeing ENXIO errors at what I believe to be the "rbd/kernel.sh
teuthology workunit while testing the new request co...
Alex Elder
05:49 AM rbd Feature #3926 (Resolved): krbd: use slab allocation for common data structures
There are some common data structures--like image and object
requests--that are very frequently allocated and would ...
Alex Elder
05:29 AM rbd Bug #3925 (Resolved): krbd: sysfs write lockdep warnings
... Alex Elder

01/23/2013

10:00 PM devops Feature #3229 (Resolved): Support clean ceph-fuse fstab automounting
implemented this already; /sbin/mount.fuse.ceph is in bobtail. Sage Weil
09:59 PM devops Feature #3924 (Resolved): ceph-deploy: package it
Sage Weil
09:57 PM devops Feature #3923 (Resolved): ceph-deploy: discover HOST
somewhat similar to new, except we pull the ceph.conf from a remote host. Sage Weil
09:57 PM devops Feature #3922 (Resolved): ceph-deploy: version command
Sage Weil
09:57 PM devops Feature #3921 (Resolved): ceph-deploy: support RPM-based distros
Sage Weil
09:57 PM devops Feature #3920 (Resolved): ceph-deploy: support other deb-based distros
Sage Weil
09:56 PM devops Feature #3919 (Resolved): ceph-deploy: remove upstart dependency
eliminate whatever remaining upstart dependencies are in ceph-deploy, so that upstart and sysvinit are both viable. Sage Weil
09:55 PM devops Feature #3918 (Resolved): ceph-deploy: osd create HOST:DIR[:JOURNAL]
trigger ceph-dir-prepare instead of ceph-disk-prepare. Sage Weil
09:54 PM devops Feature #3917 (Resolved): ceph-dir-prepare command
ceph-dir-prepare <dir> [journal] or similar
somewhat similar to ceph-disk-prepare, but simpler.
- allocate osd ...
Sage Weil
09:54 PM devops Feature #3916 (Resolved): ceph-disk-activate: non-upstart trigger (udev?)
Sage Weil
09:53 PM devops Feature #3915 (Rejected): ceph-disk-prepare: support sysvinit or upstart
Sage Weil
09:53 PM devops Feature #3914 (Resolved): ceph-disk-activate: support sysvinit
Sage Weil
09:52 PM devops Feature #3913 (Resolved): ceph-deploy: break mon into create/destroy
Sage Weil
09:52 PM devops Feature #3912 (Resolved): ceph-deploy: break osd into create/destroy
Actually, we want
ceph-deploy osd prepare HOST:DEV[:JOURNAL]
ceph-deploy osd activate HOST:DEVORDIR
and perh...
Sage Weil
09:52 PM devops Feature #3911 (Resolved): sysvinit: allow daemon enumeration via dirs
Sage Weil
09:52 PM devops Feature #3910 (Resolved): ceph-deploy: uninstall purge
Sage Weil
09:52 PM devops Feature #3909 (Resolved): ceph-deploy: update install for bobtail/argonaut urls
Sage Weil
09:51 PM devops Feature #3907 (Resolved): ceph-deploy: be verbose about what is run and what is done (with -q)
Sage Weil
08:10 PM Bug #3904: FAILED assert(want_acting.empty())
I have a theory:
reset
started
primary
getinfo
got infos
getlog
calc_acting succeeds, choose_acting fails,...
Sage Weil
02:48 PM Bug #3904 (Resolved): FAILED assert(want_acting.empty())
Ceph 0.56.1 on Ubuntu 12.04, standard ceph.com packages. Multiple OSDs started getting marked down/crashing out, this... Faidon Liambotis
06:48 PM CephFS Bug #3832 (Resolved): client: does not observe O_SYNC
commit:64b9dd088d8f20019d6c1042895676b2ec57077e Sage Weil
06:42 PM Feature #3888 (Resolved): osd: stop heartbeating peers when internal heartbeat fails
Sage Weil
06:42 PM Feature #3888: osd: stop heartbeating peers when internal heartbeat fails
commit:62579eefba057eea200d8a9a3f6b3d8bca29b8b4 Sage Weil
06:31 PM Bug #3906 (Won't Fix): ceph-mon leaks memory during peering
I've done multiple OSD swaps with both 0.55 & 0.56/0.56.1 on a cluster with > 16k PGs. In those, I've noticed multipl... Faidon Liambotis
06:27 PM Bug #3905 (Can't reproduce): incomplete & stale (lost?) PGs
I added a bunch of new OSDs into my Ceph cluster (0.56.1 on Ubuntu 12.04 LTS) about 72h ago. Simultaneously, I marked... Faidon Liambotis
02:43 PM Bug #3903 (Resolved): OSDMap::raw_pg_to_pps causes pools to have similar mappings
The pool should be added in a way to ensure that different pools have independent mappings. Samuel Just
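To illustrate the concern above: if the pool id is simply added to the placement seed, neighbouring pools produce heavily overlapping seeds, whereas mixing the pool id through a hash keeps them independent. The sketch below is an assumption-laden toy, not Ceph's code; additive_seed and hashed_seed are hypothetical names, and SHA-256 merely stands in for whatever hash the real implementation uses.

```python
import hashlib
import struct


def additive_seed(pool, ps):
    """Toy model of combining pool and placement seed by addition."""
    return ps + pool


def hashed_seed(pool, ps):
    """Toy model of combining pool and placement seed through a hash.

    SHA-256 is a stand-in chosen for illustration only.
    """
    digest = hashlib.sha256(struct.pack("<QI", pool, ps)).digest()
    return struct.unpack("<I", digest[:4])[0]


# Seeds for two adjacent pools with 64 placement seeds each.
additive = {additive_seed(p, ps) for p in (1, 2) for ps in range(64)}
hashed = {hashed_seed(p, ps) for p in (1, 2) for ps in range(64)}
```

In the additive model, pool 1 with seed 5 and pool 2 with seed 4 collide on the value 6, so the two pools share almost all of their seeds; the hashed model spreads the same 128 inputs over far more distinct values.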
12:31 PM Support #3902 (Closed): S3-tests need to cleanup after themselves
On Congress, DHO has hit the max number of users due to s3-tests not cleaning up after execution. Could we have the s... JuanJose Galvez
11:27 AM rbd Tasks #2853 (In Progress): krbd: read path
With my patches for the basic new request code now
out for initial review, I've started working on this
feature. I...
Alex Elder
11:20 AM rbd Subtask #2852 (In Progress): krbd: open parent on open
The many patches have now been posted for review.
Included in that is a small, temporary patch that enables
this ...
Alex Elder
05:21 AM rbd Fix #3665: librbd: deadlock during flatten
possibly here: ... Sage Weil
12:23 AM Bug #3900: init-ceph should do ulimit -n's with do_root_cmd
I think he's right, except it should be do_root_cmd, and I'm not certain if that echoes the result of the command cor... Dan Mick
12:11 AM Bug #3900 (Resolved): init-ceph should do ulimit -n's with do_root_cmd
Chen Xiaoxi points out on ceph-devel:
Here is part of /etc/init.d/ceph script:
case "$command" in
s...
Dan Mick

01/22/2013

09:23 PM Feature #3888 (Fix Under Review): osd: stop heartbeating peers when internal heartbeat fails
wip-osd-hb Sage Weil
03:09 PM Feature #3888: osd: stop heartbeating peers when internal heartbeat fails
backport to bobtail! Sage Weil
08:12 AM Feature #3888 (Resolved): osd: stop heartbeating peers when internal heartbeat fails
if our internal thread heartbeats fail, stop replying to pings from peers. Sage Weil
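The behaviour requested in #3888 can be sketched roughly as follows. This is a minimal Python illustration only, not the OSD's actual code; the class name `HeartbeatResponder`, its methods, and the `grace` parameter are hypothetical names chosen for the example.

```python
import time


class HeartbeatResponder:
    """Hypothetical sketch: reply to peer pings only while internal heartbeats are fresh."""

    def __init__(self, grace, clock=time.monotonic):
        self.grace = grace  # seconds an internal heartbeat stays valid (assumed knob)
        self.clock = clock
        self.last_internal = clock()

    def internal_heartbeat(self):
        # Called by internal worker threads to show they are still making progress.
        self.last_internal = self.clock()

    def is_healthy(self):
        return self.clock() - self.last_internal <= self.grace

    def handle_ping(self, peer):
        # Staying silent when unhealthy lets peers notice and report the failure.
        if not self.is_healthy():
            return None
        return ("ping_reply", peer)
```

Passing the clock in as a parameter is just a convenience so the timeout can be exercised without sleeping.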
07:37 PM Bug #3899 (Won't Fix): osd: failed to decode object_info_t
This happened after moving a journal from a file to an ssd, and changing filestore xattr use omap from true to false,... Josh Durgin
07:36 PM Bug #3836: osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
ubuntu@teuthology:/a/teuthology-2013-01-22_07:00:04-regression-bobtail-master-basic/3235... Tamilarasi muthamizhan
07:19 PM devops Bug #3898 (Resolved): ceph-deploy: problems with >1 mon
If you try "ceph-deploy new ceph1 ceph2" then it correctly creates the ceph.conf and then spits out "Cluster config e... Greg Farnum
05:37 PM CephFS Bug #3404: oops in strlen() from set_request_path_attr()
I found the same bug in the Bobtail release with the NFS kernel server and a 3.7.3 kernel
[70205.985665] BUG: unable to ha...
Ivan Kudryavtsev
05:35 PM Bug #3513 (Resolved): rgw log show error
Dan Mick
05:35 PM Bug #3513: rgw log show error
Nope, I had it wrong; the required params are: object *or* all three of date, bucket, and bucket-id.
Message change ...
Dan Mick
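The parameter rule described in #3513 (object, or all three of date, bucket, and bucket-id) and the De Morgan swap mentioned in the same ticket can be illustrated with a short Python sketch. The function names are hypothetical and this is not the radosgw-admin source; it only shows how the `&&`/`||` operators swap when the test is inverted.

```python
def log_show_args_ok(obj=None, date=None, bucket=None, bucket_id=None):
    """Valid when an object is given, or when date, bucket and bucket-id are all given."""
    return bool(obj) or (bool(date) and bool(bucket) and bool(bucket_id))


def log_show_args_missing(obj=None, date=None, bucket=None, bucket_id=None):
    """De Morgan negation of the rule above: 'and' and 'or' trade places."""
    return (not obj) and ((not date) or (not bucket) or (not bucket_id))
```

For every combination of arguments the two functions return opposite answers, which is the property an error check written in the negated form has to preserve.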
02:37 PM Bug #3513: rgw log show error
Actually I guess the && should be || and the || should be && (the old DeMorgan's rule) Dan Mick
02:30 PM Bug #3513: rgw log show error
I experienced this also on ubuntu 12.10 0.56.1-1
root@dlcephgw01:~# radosgw-admin log show --bucket=chris --date...
Chris Holcombe
05:04 PM rgw Bug #3896 (Resolved): rest-bench common/WorkQueue.cc: 54: FAILED assert(_threads.empty())
It seems rest-bench doesn't like to exit cleanly while cleaning up after itself.... I did test at low concurrency bu... Bill Reid
04:31 PM Bug #3895 (Resolved): librados test hang during mon thrashing
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-01-21_19:00:03-regression-master-testing-gcov/2929
...
Sage Weil
04:05 PM Feature #3651 (Resolved): osd: deep scrub should hash omap
David Zafman
02:58 PM Bug #3894 (Closed): monclient: --keyring failed despite presence of file
While going over install basics with Gary, we got "ERROR: missing keyring, cannot use cephx for authentication" when ... Greg Farnum
02:40 PM rbd Feature #3877 (Fix Under Review): krbd: don't wait for notify ack to complete
I've posted this code for review. I continue to do testing. Alex Elder
02:39 PM rbd Subtask #3741 (Fix Under Review): krbd: rework request tracking code
I've posted this code for review. I continue to do testing. Alex Elder
02:39 PM rbd Tasks #3755 (Fix Under Review): krbd: use new request tracking code for sync object operations
I've posted this code for review. I continue to do testing. Alex Elder
02:39 PM rbd Feature #3754 (Fix Under Review): krbd: use new request tracking code for notify ack
I've posted this code for review. I continue to do testing. Alex Elder
02:19 PM rbd Feature #3893 (Rejected): krbd: document the new request code
There are bits and pieces of the new request code
documented for the kernel rbd client--in the comments
and in the ...
Alex Elder
02:09 PM CephFS Bug #3832: client: does not observe O_SYNC
Fixed a bug in objectcacher::flush_set. Branch wip-3832-oc-flushrange has been updated, and passes the accompanying ... Sam Lang
01:09 PM Subtask #2659: mon: Single-Paxos: ceph tool -w subscriptions not being updated
Can't recall if this was fixed at some point, or if the root cause was even related.
This must be tested again onc...
Joao Eduardo Luis
01:06 PM Subtask #2622 (Resolved): mon: Single-Paxos: convert existing, old MonitorStore to a brand new Mo...
This was implemented both as an offline tool as well as integrated in ceph-mon. The ceph-mon will attempt to open the... Joao Eduardo Luis
01:02 PM Subtask #3069: mon: Single-Paxos: messaging: log MMonSync messages for offline matching
If we really want to do offline matching, this can be done using just the logs. This could be interesting however fo... Joao Eduardo Luis
12:54 PM Subtask #3843 (Rejected): osd: move purged_snaps out of info
Sage Weil
12:54 PM Subtask #3844 (Rejected): osd: move info and log into leveldb
Sage Weil
12:54 PM Subtask #3842 (Rejected): osd: create tool to extract pg info and pg log from filestore
Sage Weil
12:54 PM Feature #3841 (Rejected): osd: avoid seeks for log and info writes on client writes
broke out subtasks and top level features Sage Weil
12:53 PM Feature #3892 (Resolved): osd: move pg info into leveldb
Sage Weil
12:53 PM Feature #3891 (Resolved): osd: move purged_snaps out of info
Sage Weil
12:53 PM Feature #3890 (Resolved): osd: create tool to extract pg info and pg log from filestore
Sage Weil
10:38 AM Feature #2580 (Resolved): perf: investigate poor performance at 10 osds per node
This was probably unique to the burnupi cluster and/or older ceph. Performance is fine on the SC847a now with lots o... Mark Nelson
10:27 AM rbd Bug #3889 (Won't Fix): krbd: handle zero-length requests
I'm pretty sure there are some special zero-length
requests (like flush) that can come down from the
block layer. ...
Alex Elder
07:07 AM Linux kernel client Bug #3887 (Closed): kernel client: small object memory leak
In testing my new request code for rbd (issue 3741 and related)
I tried paying special attention to Linux slab usage...
Alex Elder
04:11 AM Linux kernel client Bug #3886: Futher testing result for the issue "ceph: avoid 32-bit page index overflow"
https://SizableSend.com/0g9dwn/ceph_mds.a.log Mohamed Pakkeer
04:06 AM Linux kernel client Bug #3886 (New): Futher testing result for the issue "ceph: avoid 32-bit page index overflow"
We raised an issue in the following ticket and the ticket has been resolved
http://tracker.newdrea...
Mohamed Pakkeer

01/21/2013

10:20 PM Feature #3848: osd: gracefully handle cluster network heartbeat failure
One option: do not mark ourselves back up (after being wrongly marked down) unless we are able to successfully ping a... Sage Weil
10:12 PM Bug #3885 (Resolved): osd: osd-recovery-incomplete qa test failing
ubuntu@teuthology:/a/teuthology-2013-01-21_19:00:03-regression-master-testing-gcov$ teuthology-ls --archive-dir . | g... Sage Weil
10:08 PM Feature #3833 (In Progress): osd: improve recovery throttling
Sage Weil
09:59 PM Bug #2655: scrub slows writes more than it should
This ticket predates the chunky scrub work that went into ~0.54 or thereabouts. Sage Weil
09:15 PM Bug #2655 (Resolved): scrub slows writes more than it should
Sage Weil
09:12 PM Bug #2357 (Can't reproduce): mds takes down ceph
Sage Weil
09:11 PM Bug #3854 (Resolved): mon: clock skew tests failing on master
pushed to master Sage Weil
01:20 PM Fix #3884 (Resolved): osd: resurrect partially deleted PGs
If a PG is in the process of getting removed and we repeer and discover we want to keep it, we currently block waitin... Sage Weil
12:30 PM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Corin-
Have you tried 0.48.3 again since then? I'd like to get to the bottom of this, if possible... :)
Sage Weil
09:35 AM rbd Bug #3737: Higher ping-latency observed in qemu with rbd_cache=true during disk-write
Hi Josh,
according to our conversation I did some testing.
I started the dd if=/dev... of=/tmp/doof.dat bs=4k cou...
Oliver Francke

01/20/2013

11:01 PM CephFS Feature #1236 (Fix Under Review): libceph: set layout via virtual xattrs (libceph/cfuse)
wip-vxattr (ceph.git) and wip-vxattrs (ceph-client.git). There's a test script that passes on both fuse and kclient.... Sage Weil
10:58 PM CephFS Feature #1236: libceph: set layout via virtual xattrs (libceph/cfuse)
Greg Farnum wrote:
> How large would a simple "layout" xattr actually be in comparison to the shipped inodes? I'm no...
Sage Weil
03:12 PM CephFS Feature #1236: libceph: set layout via virtual xattrs (libceph/cfuse)
How large would a simple "layout" xattr actually be in comparison to the shipped inodes? I'm not sure the size is so ... Greg Farnum
08:26 PM rbd Feature #3877: krbd: don't wait for notify ack to complete
I have implemented this in the new request code.
It will be posted for review along with the rest
of that new code ...
Alex Elder
08:14 PM rbd Feature #3877 (In Progress): krbd: don't wait for notify ack to complete
Ian points out that "I've already implemented this change"
suggests that the status of this issue should at least
b...
Alex Elder
08:26 PM rbd Subtask #3741 (In Progress): krbd: rework request tracking code
Considering this "is actually work that's mostly complete"
I'm (finally) marking it "In Progress."
This code is f...
Alex Elder
08:22 PM rbd Feature #3754 (In Progress): krbd: use new request tracking code for notify ack
I have completed implementing sending synchronous acknowledgement
in response to a watch request notification. It i...
Alex Elder
08:19 PM rbd Tasks #3755 (In Progress): krbd: use new request tracking code for sync object operations
I have completed implementing all of these in the new request
code:
- synchronous object read (for v1 header object...
Alex Elder
04:12 PM Bug #3879 (Resolved): ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
thanks! commit:17160843d0c523359d8fa934418ff2c1f7bffb25 Sage Weil
03:51 PM Bug #3879: ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
Looks good to me. Samuel Just
09:58 AM Bug #3879 (Fix Under Review): ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
wip-3879 Sage Weil
09:06 AM Bug #3879: ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
Output from the following attached:
ceph osd getmap 554 -o 554
Jens Kristian Søgaard
08:46 AM Bug #3879 (In Progress): ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
Jens Kristian Søgaard wrote:
> Output from the following attached:
>
> ceph osd getmap 555 -o 555
> ceph osd get...
Sage Weil
12:49 AM Bug #3879: ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
Output from the following attached:
ceph osd getmap 555 -o 555
ceph osd getmap 556 -o 556
Jens Kristian Søgaard
11:15 AM Bug #3883 (Won't Fix): osd: leaks memory (possibly triggered by scrubbing) on argonaut
100MB/day reported by multiple users, both on 0.48 and 0.56.1.
Some correlation with scrubbing. Possibly specific...
Sage Weil
09:55 AM CephFS Feature #3882: Hide snapshot directory name in mount/mtab
It seems like a better (or perhaps just "more important") fix is to restrict access to .snap in the first place.
FWI...
Sage Weil
07:14 AM CephFS Feature #3882 (Rejected): Hide snapshot directory name in mount/mtab
The idea is to prevent users from seeing which snapshot directory name was chosen during mount.
This is useful if we want to...
Ivan Kudryavtsev
09:51 AM CephFS Bug #3881 (Rejected): Wrong ip network to exchange data between kernel ceph and MDS
Ivan Kudryavtsev wrote:
> Hm. It seems that I'm wrong about the way it works. It connects to OSDs via OSD-defined pu...
Sage Weil
09:44 AM CephFS Bug #3881: Wrong ip network to exchange data between kernel ceph and MDS
Hm. It seems that I'm wrong about the way it works. It connects to OSDs via OSD-defined public network. It seems that... Ivan Kudryavtsev
07:03 AM CephFS Bug #3881 (Rejected): Wrong ip network to exchange data between kernel ceph and MDS
I'm using ceph installation with three networks:
1st is Infiniband network for OSD exchange and replication
2nd i...
Ivan Kudryavtsev

01/19/2013

02:24 PM Bug #3879: ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
full log at http://bit.ly/11Hn7BN
Sage Weil
02:04 PM Bug #3879 (Resolved): ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
... Sage Weil
12:58 PM Bug #3878 (Rejected): osd: nobackfill flag doesn't work
on current master, bobtail Sage Weil
11:43 AM Feature #3833: osd: improve recovery throttling
see wip-3833 for push Sage Weil
08:40 AM rbd Feature #3877 (Closed): krbd: don't wait for notify ack to complete
When we receive notification of a change to an rbd image's header
object we need to refresh our information about th...
Alex Elder

01/18/2013

11:21 PM Documentation #3711: crush-map.rst: choose firstn talks about "N", but does not clearly define wh...
Dan Mick
11:20 PM Documentation #3711: crush-map.rst: choose firstn talks about "N", but does not clearly define wh...
Sorry, I think this is still wrong; the descriptions of {num} only apply if firstn is supplied, correct? Otherwise {... Dan Mick
11:12 PM Bug #3869: ceph osd pool get doesn't support everything set does
Added tests with commit:2491f976e4cd6eca5c30f7c184038364e4fe1873
Dan Mick
01:22 PM Bug #3869: ceph osd pool get doesn't support everything set does
how about a quick bash test script that gets and sets some of these values? Sage Weil
12:49 PM Bug #3869 (Resolved): ceph osd pool get doesn't support everything set does
commit:1f911fd0616c3fb45d5d36de7947a1914190017b
Dan Mick
12:27 PM Bug #3869 (Fix Under Review): ceph osd pool get doesn't support everything set does
Dan Mick
12:15 PM Bug #3869: ceph osd pool get doesn't support everything set does
This was noted on #ceph overnight. Dan Mick
12:14 PM Bug #3869 (Resolved): ceph osd pool get doesn't support everything set does
...for no apparently good reason. Adding the missing info is easy. Dan Mick
11:11 PM RADOS Bug #3872 (Resolved): You can put negative weights on OSDs
commit:aea898db2b56878b50f09dcbbf52347f4cc5c754
Dan Mick
05:39 PM RADOS Bug #3872: You can put negative weights on OSDs
Dan Mick
04:01 PM RADOS Bug #3872 (Fix Under Review): You can put negative weights on OSDs
Dan Mick
02:32 PM RADOS Bug #3872 (Resolved): You can put negative weights on OSDs
DHO reports that negative weights can be assigned to an OSD. Tested on Alexandria running 0.56-20-g9aecacd-1precise.
...
JuanJose Galvez
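One way to guard against the problem reported in #3872 is to validate the weight before it is applied. The Python sketch below is an assumption about what such a check could look like; it is not the change made in the Ceph commit, and `parse_osd_weight` is a hypothetical helper.

```python
import math


def parse_osd_weight(text):
    """Parse a weight given as text, rejecting negative, non-finite or non-numeric values."""
    try:
        weight = float(text)
    except ValueError:
        raise ValueError(f"weight {text!r} is not a number")
    if not math.isfinite(weight) or weight < 0:
        raise ValueError(f"weight {text!r} must be a finite value >= 0")
    return weight
```

Rejecting the value at parse time keeps a bad weight from ever reaching the map, rather than trying to clamp it later.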
09:16 PM Linux kernel client Bug #3875 (Resolved): osd_client: don't use r_num_pages for bio requests
There is an osd request field "r_num_pages" that's used
to record the number of pages supplied with the request.
Fo...
Alex Elder
05:35 PM CephFS Feature #1236: libceph: set layout via virtual xattrs (libceph/cfuse)
Translating any ceph.* setxattrs into a sync setxattr and handling it on the MDS seems like an easy win. I can't thi... Sage Weil
01:34 PM CephFS Feature #1236: libceph: set layout via virtual xattrs (libceph/cfuse)
We're still thinking through the implications of the best way to implement this. Nonetheless there are people using h... Greg Farnum
05:01 PM CephFS Bug #3832: client: does not observe O_SYNC
Current status: the iozone-sync.sh test script is causing a segfault (sometimes at hang). Needs more testing! Segf... Sam Lang
04:46 PM Documentation #3808 (In Progress): Block device quick start page need update
John Wilkins
03:58 PM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
It had sounded to me like the trend was towards eliminating the Perl usage rather than adding it as a dependency. Did... Greg Farnum
03:56 PM Feature #3815 (Duplicate): osd: move pg_info_t back into the xattr; avoid writing pginfo file whe...
Sage Weil
03:49 PM Bug #3870 (Resolved): osd: make pg removal more friendly
commit:684a8f8f84312d4d9c6cdeb8d6d9fad792bd5a6d Sage Weil
01:44 PM Bug #3870 (Resolved): osd: make pg removal more friendly
wip-pg-removal needs cleanup and merge Sage Weil
03:49 PM Bug #3806 (Won't Fix): OSDs stuck in active+degraded after changing replication from 2 to 3
Thanks. I was trying to figure out where the conflict could come from, and actually it does make sense: The single-os... Greg Farnum
03:45 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
Sure, it's attached... Ben Poliakoff
03:40 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
@Josh: Even with the new CRUSH tunables it's still a matter of probability, so if you give it a particularly challeng... Greg Farnum
03:31 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
OK, it looks like I may have simply given CRUSH a challenging assignment, given the resources of the cluster.
I ...
Ben Poliakoff
02:58 PM Bug #3873 (Duplicate): Ceph cli tool allows setting negative weights
Ian Colle
02:54 PM Bug #3873 (Duplicate): Ceph cli tool allows setting negative weights
Setting OSD weights to negative values:... Kyle Bader
02:46 PM Bug #1807 (Can't reproduce): CentOS compile error in perfglue/heap_profiler.cc
Anonymous
02:01 PM CephFS Feature #3570 (Resolved): teuthology: mds thrasher
Sage Weil
02:01 PM rbd Bug #3871 (Resolved): krbd: initial header read may be out of date
Currently krbd uses the version parameter of a watch operation to try to prevent this, but that was never implemented... Josh Durgin
01:55 PM Linux kernel client Bug #3860 (Rejected): rbd: problems if watch setup returns ERANGE
Josh Durgin
01:54 PM Linux kernel client Bug #3860: rbd: problems if watch setup returns ERANGE
ERANGE is never actually returned - it was never implemented (#2592). The real fix for the race it was intended to pr... Josh Durgin
08:08 AM Linux kernel client Bug #3860 (Rejected): rbd: problems if watch setup returns ERANGE
When rbd sets up the watch request for a newly-mapped rbd image
it loops and tries again if the request returns ERAN...
Alex Elder
12:49 PM CephFS Feature #3865 (Duplicate): mds: implement lookup-by-ino based on inode backtraces
#3541. Whoops! Greg Farnum
11:02 AM CephFS Feature #3865 (Duplicate): mds: implement lookup-by-ino based on inode backtraces
Following #3862 and #3863, implement the lookup-by-ino algorithm described in http://www.spinics.net/lists/ceph-devel... Greg Farnum
12:49 PM CephFS Feature #3541: mds: robust ino lookup using file backpointers
We have a design now! Greg Farnum
12:48 PM CephFS Feature #3862 (Duplicate): mds: add file backtraces to data objects
#3540. Whoops! Greg Farnum
10:26 AM CephFS Feature #3862 (Duplicate): mds: add file backtraces to data objects
Add backtraces to each file object, as described at http://www.spinics.net/lists/ceph-devel/msg11872.html. This ticke... Greg Farnum
12:48 PM CephFS Feature #3540: mds: maintain per-file backpointers on first file object
We have a design now! Greg Farnum
11:09 AM CephFS Feature #3727: mds: refactor EMetablob encoding paths
What is this bug about? Greg Farnum
11:08 AM CephFS Feature #3867 (Resolved): optionally do not use an anchor table
Following #3865 and #3866, we should introduce a config option that, when set, does not make use of the Anchor table ... Greg Farnum
11:07 AM CephFS Feature #3866 (New): mds: Add lazily-updated backtraces to hard links
As described in http://www.spinics.net/lists/ceph-devel/msg11872.html, we want hard links to contain lazily-updated b... Greg Farnum
10:55 AM CephFS Feature #3863: implement a tool to lookup inode numbers without holding their path
+1 for just adding the libcephfs function, and a test in test_libcephfs. Sam Lang
10:41 AM CephFS Feature #3863 (Resolved): implement a tool to lookup inode numbers without holding their path
This should just be a small wrapper around Client.cc*, but we need to be able to generate inode lookups without knowi... Greg Farnum
10:41 AM Feature #3769: osd: scrub should verify snap collection existence, membership
In master, sha-1 7b6fe03208c507b55517abe45cdff5c96d91904a
Needs backport when we are happy with the testing (if it's...
Samuel Just
10:15 AM rbd Tasks #3755: krbd: use new request tracking code for sync object operations
The sync header read operation was another one that was needed.
That's basically done too.
All of this will be re...
Alex Elder
10:09 AM rbd Tasks #3755: krbd: use new request tracking code for sync object operations
I have been looking in detail at how the watch requests are
implemented and in the process identified a few potentia...
Alex Elder
10:14 AM Linux kernel client Bug #3751 (Resolved): krbd: fix type of snap_id local variable
... Alex Elder
10:11 AM Bug #3854 (Fix Under Review): mon: clock skew tests failing on master
Joao Eduardo Luis
10:07 AM Bug #3854: mon: clock skew tests failing on master
teuthology's wip-3854 commit:1d8640860441dc27e8342788c1ae17f5c1b3ccc0 fixes this issue. Joao Eduardo Luis
09:00 AM Bug #3816: osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
commit:98a763123240803741ac9f67846b8f405f1b005b
When the osd does a "mark myself back up" it takes care to rebind ...
Sage Weil
08:58 AM rbd Feature #3861 (Resolved): rbd: consider splitting rbd_osd_req_op_create()
When it was out for review, Josh suggested that it might
be better to have separate (type-checking) functions for
b...
Alex Elder
08:25 AM CephFS Bug #3845: mds: standby_for_rank not getting cleared on takeover
+1 clearing it for cosmetic reasons. Sam Lang
08:07 AM rbd Bug #3859 (Resolved): osd_client: define ceph_osdc_clear_request_linger()
There is a ceph_osdc_set_request_linger() function that
sets a flag on a request and takes an additional reference.
...
Alex Elder
08:04 AM rbd Bug #3858 (Resolved): osd_client: ceph_osdc_wait_request() seems wrong
The only error wait_for_completion_interruptible() will
return is ERESTARTSYS. So if that gets returned inside
cep...
Alex Elder
07:14 AM rbd Feature #1491: qemu: make qemu-img convert fast
This was rejected because the feature is not relevant anymore. At the time, when I was looking at it there was some obvio... Yehuda Sadeh
12:43 AM devops Feature #2885 (Resolved): doc: mon initial members requirements, functioning, admin steps to take
This was done some time ago. Step 9 here: http://ceph.com/docs/master/rados/deployment/chef/#configure-your-ceph-envi... John Wilkins
12:35 AM Documentation #3062 (Resolved): doc: osd tuning config options
This was completed some time ago. John Wilkins
12:28 AM Documentation #3329 (Resolved): doc: What metrics should be used to set node weight
Discussion was primarily starting with 1TB as a weight of 1.00 with additional consideration for throughput. If this ... John Wilkins
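The starting point mentioned in #3329 (1TB of capacity as a weight of 1.00) amounts to simple arithmetic, sketched below in Python. The function name is hypothetical, and the throughput adjustment is deliberately left out because the ticket summary does not say how it would be applied.

```python
def baseline_osd_weight(capacity_tb):
    """Return a starting weight where 1 TB of capacity maps to a weight of 1.00."""
    if capacity_tb < 0:
        raise ValueError("capacity must be non-negative")
    # Throughput considerations would adjust this figure; that part is not modelled here.
    return round(capacity_tb / 1.0, 2)
```

Under this convention a 3TB drive would start at 3.00 and a 500GB drive at 0.50.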
12:27 AM Tasks #3779 (In Progress): update osd config ref as appropriate
John Wilkins
12:26 AM Bug #3776 (Resolved): Need doc describing how to alter our log rotation
John Wilkins

01/17/2013

11:41 PM Documentation #3711 (Resolved): crush-map.rst: choose firstn talks about "N", but does not clearl...
John Wilkins
11:41 PM Documentation #3389 (Resolved): doc: crush docs could use a full example crushmap
John Wilkins
11:40 PM Documentation #3709 (Resolved): crush-map.rst: claims 'types' are default, not true (must be spec...
John Wilkins
11:40 PM Documentation #3707 (Resolved): crush-map.rst: syntax error in example
John Wilkins
11:28 PM Feature #3505 (Resolved): default to libnss
This was done for RPMs with the commit listed below. Debians already had the --with-nss flag in the rules file.
...
Anonymous
11:21 PM Bug #2176 (In Progress): dependencies not checked by autoconf
All these are listed as build requirements for the rpm and debian packages. I'll add the missing ones to configure.ac. Anonymous
11:16 PM devops Tasks #3512 (In Progress): Publish our fastcgi packages
The approach is to pick up the latest debian and rpm packages for mod_fastcgi, apply the ceph patch, and build manual... Anonymous
11:13 PM Bug #3736: kernel build: failures starting in 3.8-rc1
The immediate kernel build problems have been solved by recreating the patch that is applied to the debian package bu... Anonymous
11:09 PM Bug #3736: kernel build: failures starting in 3.8-rc1
Branch: refs/heads/master
Home: https://github.com/ceph/autobuild-ceph
Commit: 0ff4f9a9ce82b37288b3bbcc5b5d65b5...
Anonymous
10:54 PM Bug #3768 (Resolved): perl is required for logrotate, we need to include Perl as a dependency
Branch: refs/heads/master
Home: https://github.com/ceph/ceph
Commit: bebdc70b4254a78d9fe86af9c645e828fd11e2b2
...
Anonymous
10:16 PM Documentation #3831 (In Progress): ceph osd crush set command needs correction in the doc
John Wilkins
10:14 PM CephFS Feature #1236: libceph: set layout via virtual xattrs (libceph/cfuse)
Sage Weil
10:02 PM CephFS Feature #3857: mds: enforce unique mds names in mdsmap
see wip-mds-names Sage Weil
09:36 PM CephFS Feature #3857 (Resolved): mds: enforce unique mds names in mdsmap
Currently mds's are uniquely identified by their addr (i.e., a unique instance of the process). The name is useful on... Sage Weil
06:37 PM rbd Bug #3413 (Resolved): rbd bench-write fails with assert when rbd caching turned on
commit:d81ac8418f9e6bbc9adcc69b2e7cb98dd4db6abb Josh Durgin
01:39 PM rbd Bug #3413 (Fix Under Review): rbd bench-write fails with assert when rbd caching turned on
branch wip-rbd-bench-write Josh Durgin
06:00 PM rgw Feature #3856 (Resolved): rgw: list buckets S3 api should be paginated
The S3 api (unlike swift) does not define marker, max when listing buckets (probably due to the fact that max buckets... Yehuda Sadeh
05:25 PM Bug #3836: osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
... Sage Weil
08:52 AM Bug #3836 (Resolved): osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
... Sage Weil
04:55 PM Bug #3279: mon/caps: cap comparison in get-or-create is based on a string literal
This affects the chef mon recipe. I am able to correct this error by joining lines 96-99.
[Thu, 17 Jan 2013 16:5...
Kraig Amador
04:44 PM Feature #3850: Add json output for ceph pg dump and ceph osd tree
'pg dump' and 'osd dump' both have 'json' support since argonaut, but argonaut does not support outputting json on 'o... Joao Eduardo Luis
03:20 PM Feature #3850: Add json output for ceph pg dump and ceph osd tree
It already exists for pg dump and osd dump too. osd tree was recent though, maybe it's not in the version he's using? Josh Durgin
03:02 PM Feature #3850 (Closed): Add json output for ceph pg dump and ceph osd tree
Kyle Bader has requested json output for the following commands:
ceph pg dump
ceph osd tree
Sage Comment:
th...
Ian Colle
04:32 PM Feature #3855 (Resolved): Making Scrubs Nicer
As requested from DHO:
Currently scrubs are not very nice, Sage referred to these issues and it would be nice if t...
JuanJose Galvez
04:26 PM Bug #3854 (Resolved): mon: clock skew tests failing on master
... Sage Weil
04:21 PM Feature #3853 (Resolved): qa: include iogen in qa suite
Sage Weil
04:10 PM Bug #3827 (Resolved): crushtool --test: claims to want -o, really wants --output-csv or --show-*
commit:60db6e3e394df1e4110eefa5951657b648b02006
Dan Mick
04:10 PM RADOS Bug #3834 (Resolved): crushtool really really hates \r
commit:e776b63dd5c540a6f49b03b67e72a1f4636a74fd Dan Mick
11:06 AM RADOS Bug #3834: crushtool really really hates \r
Well isspace() would catch newline too, which I think we don't want, so it'd be iswhite(c) && c != '\n', which I'm no... Dan Mick
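The predicate discussed in #3834 (treat whitespace such as `\r` as skippable, but keep newline significant) can be illustrated in Python. This is only a sketch of the idea; crushtool itself is not written in Python, and `is_blank_not_newline` is a hypothetical name.

```python
def is_blank_not_newline(c):
    """True for whitespace such as space, tab or carriage return, but not for newline."""
    return c.isspace() and c != "\n"
```

With this test a stray carriage return from a Windows-edited file is treated like a space, while line boundaries stay visible to the parser.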
04:06 PM devops Bug #3852 (Resolved): chef recipes don't try to start OSDs
I wasn't aware the chef recipes were this incomplete, but it appears as though, unless
you're running Crowbar, osd.r...
Dan Mick
04:05 PM devops Bug #3851 (Resolved): chef recipes don't enable upstart
Since upstart management of daemons now explicitly looks for an upstart tag file, Chef
doesn't start the monitors co...
Dan Mick
03:17 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
I presume we're planning to backport this to bobtail after it passes some nights of testing? Maybe we should leave th... Greg Farnum
03:03 PM Bug #3785 (Resolved): ceph: default crush rule does not suit multi-OSD deployments
commit:f358cb1d2b0a3a78bf59c4fd085906fcb5541bbe Sage Weil
02:58 PM Feature #3849 (Resolved): Track slow PGs and times OSDs marked down
Kyle Bader:
"Over the weekend of 01/02/13 we encountered an issue that we had not yet
encountered. One of our cephs...
Ian Colle
02:54 PM Feature #3848 (Resolved): osd: gracefully handle cluster network heartbeat failure
From Kyle Bader
"Back in October we had a switch failure on our cluster (backend) network.
This was not noticed b...
Ian Colle
02:24 PM rbd Bug #3847 (Resolved): rbd: figure out correct byte order for watch version
In the process of refactoring rbd code that builds up osd
operations I noticed that for NOTIFY_ACK and WATCH operati...
Alex Elder
01:40 PM Documentation #3846 (Resolved): Debian install has incorrect gitbuilder URL
From http://ceph.com/docs/master/install/debian/ :...
Anonymous
12:32 PM rbd Feature #1491 (Rejected): qemu: make qemu-img convert fast
Yehuda Sadeh
12:28 PM CephFS Bug #3832 (Fix Under Review): client: does not observe O_SYNC
Implemented in wip-3832. Needs review. Sam Lang
12:17 PM CephFS Bug #3845: mds: standby_for_rank not getting cleared on takeover
I don't think it matters. It's a fixed lifecycle from standby -> active -> dead, so the leftover standby_ just te... Sage Weil
12:13 PM CephFS Bug #3845: mds: standby_for_rank not getting cleared on takeover
This is a monitor thing; the MDS is only involved in relaying the config setting over on boot-up. Greg Farnum
11:38 AM CephFS Bug #3845 (Closed): mds: standby_for_rank not getting cleared on takeover
This is the mdsmap after mds.a was active and given rank 0, then killed, and another mds (mds.b-s-r0) that had standb... Sam Lang
11:34 AM CephFS Feature #3730: Support replication factor in Hadoop
Sage Weil wrote:
> If there are more such cases, that is a separate bug!
It was a bug I had introduced in wip-cli...
Noah Watkins
09:51 AM CephFS Feature #3730: Support replication factor in Hadoop
Noah Watkins wrote:
> In Client, osdmap is protected by client_lock? If so, new version of branch isn't broken..
...
Sage Weil
08:55 AM CephFS Feature #3730: Support replication factor in Hadoop
In Client, osdmap is protected by client_lock? If so, new version of branch isn't broken.. Noah Watkins
10:45 AM Subtask #3844 (Rejected): osd: move info and log into leveldb
Samuel Just
10:45 AM Subtask #3843 (Rejected): osd: move purged_snaps out of info
the purged_snaps set is really a property of the local pg instance rather than a global property and does not get upd... Samuel Just
10:42 AM Subtask #3842 (Rejected): osd: create tool to extract pg info and pg log from filestore
Once these are moved into leveldb, it will be much more difficult to manually extract these structures. Samuel Just
10:41 AM Feature #3841 (Rejected): osd: avoid seeks for log and info writes on client writes
Probable approach is to move log and info into leveldb. Samuel Just
10:38 AM Subtask #3840 (Resolved): osd: ack push after apply+commit
This will prevent the primary from shoving another push before the first has completed. Alternately, make the number... Samuel Just
10:28 AM Documentation #3839 (Resolved): SSD crushmap example will not compile
The SSD CRUSH map example (http://ceph.com/docs/master/rados/operations/crush-map/#placing-different-pools-on-differe... Alexandre Marangone
10:24 AM CephFS Bug #1435: mds: loss of layout policies upon mds restart
wip-mds-layout2
needs to be rebased, reviewed, and tested!
Sage Weil
10:13 AM Bug #3835 (Resolved): mon: timecheck: hits FAILED assert(m->epoch == timecheck_epoch) when monito...
pushed to master, commit:c6f8010b1c8e4d54f9fb24b2e4e25ff8a2bde778 Joao Eduardo Luis
09:34 AM Bug #3835 (Fix Under Review): mon: timecheck: hits FAILED assert(m->epoch == timecheck_epoch) whe...
Ian Colle
08:51 AM Bug #3835: mon: timecheck: hits FAILED assert(m->epoch == timecheck_epoch) when monitors are seve...
This issue is fixed on wip-3835, commit:785a2bc3e9271607b1ddf25390056e9dd9c72b21 Joao Eduardo Luis
07:47 AM Bug #3835 (Resolved): mon: timecheck: hits FAILED assert(m->epoch == timecheck_epoch) when monito...
The leader schedules a new 'ping' to the monitors in the quorum as soon as the pings are all sent.
This allows for...
Joao Eduardo Luis
10:04 AM Bug #3820: osdmaptool - user cannot specify pool
85eb8e382a26dfc53df36ae1a473185608b282aa Samuel Just
09:58 AM Bug #3816 (Resolved): osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
Sage Weil
09:50 AM rbd Feature #3838 (New): krbd: use common functions for striping calculations
With the STRIPINGV2 feature bit, format 2 striping has the same parameters as cephfs striping. Re-work the rbd object... Josh Durgin
09:29 AM Linux kernel client Feature #3837 (Resolved): krbd: support format 2 striping
Format 2 images with the STRIPINGV2 feature bit set (created with rbd create --stripe-count X --stripe-unit Y --order... Josh Durgin
09:12 AM rbd Feature #3754: krbd: use new request tracking code for notify ack
Yay! Sage Weil
04:52 AM rbd Feature #3754: krbd: use new request tracking code for notify ack
Yeehah! All tests passed, including the previously-failing
blogbench.sh, fsstress, and two passes through xfstests.
Alex Elder
09:11 AM Bug #2843: filestore: replay failure on xfs
The post-v0.50 version of this bug was just fixed, commit:66eb93b83648b4561b77ee6aab5b484e6dba4771, which is backport... Sage Weil
02:38 AM Bug #2843: filestore: replay failure on xfs
Hi,
We have exactly the same problem on 1 of our osd (bobtail 0.56.1).
[[https://gist.github.com/4555135]]
Wha...
Guilhem Lettron
09:08 AM CephFS Bug #3261 (Rejected): mds crashes in EMetaBlob::replay
Understood. I'm sorry we weren't able to dig in when it happened. When you do get around to retesting we should be ... Sage Weil
02:09 AM CephFS Bug #3261: mds crashes in EMetaBlob::replay
should i test the same btrfs volume with a new ceph? if so i might get to it in the next month. please close with ins... Tobias Florek

01/16/2013

09:41 PM RADOS Bug #3834: crushtool really really hates \r
Ha! Sorry about that. Maybe iswhite() (or whatever that helper is) would be best here? Sage Weil
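The suggestion above, treating a carriage return like any other whitespace, can be illustrated with a minimal sketch. This is not crushtool's parser; the tokenize function and the sample map text are hypothetical and only show why a file with DOS line endings should yield the same tokens as its Unix counterpart.

```python
def tokenize(text):
    """Split map text into tokens, treating '\\r' like any other whitespace.

    Hypothetical helper for illustration; not crushtool's actual tokenizer.
    """
    # str.split() with no arguments splits on runs of whitespace,
    # which includes ' ', '\t', '\n', and '\r'.
    return text.split()


# Made-up sample input, saved once with Unix and once with DOS line endings.
unix_map = "device 0 osd.0\ndevice 1 osd.1\n"
dos_map = unix_map.replace("\n", "\r\n")

print(tokenize(unix_map) == tokenize(dos_map))
```

A parser that splits only on spaces, tabs, and newlines would leave a trailing '\r' glued to the last token of each line, which is one way two visually identical files can behave differently.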
09:36 PM RADOS Bug #3834 (Resolved): crushtool really really hates \r
Spent a long time trying to figure out why a crush map wouldn't compile; finally got it to no differences at all, eve... Dan Mick
09:23 PM rbd Feature #3754: krbd: use new request tracking code for notify ack
OK, that quick fix wasn't enough.
I had a spinlock protecting the check for something being
complete. But that w...
Alex Elder
08:13 PM rbd Feature #3754: krbd: use new request tracking code for notify ack
Well that's unfortunate. I hit the same problem. I'll
need to take a closer look I guess.
Alex Elder
07:39 PM rbd Feature #3754: krbd: use new request tracking code for notify ack
Seems to be working better. It may end up being an
atomic rather than protecting with a spinlock, but
either way, ...
Alex Elder
03:15 PM rbd Feature #3754: krbd: use new request tracking code for notify ack
I've pretty much implemented this feature but having done
this I'm looking at a crash that happened with this code
...
Alex Elder
06:37 PM rgw Bug #3813: radosgw doesn't have a logrotate script
Let's go with /var/log/radosgw and a separate logrotate script. Simpler! Sage Weil
09:06 AM rgw Bug #3813: radosgw doesn't have a logrotate script
Given that radosgw gets installed without ceph, it seems like the viable options are putting the logrotate config in ... Sage Weil
04:14 AM rgw Bug #3813: radosgw doesn't have a logrotate script
Note that the official docs suggest to put "log file = /var/log/ceph/radosgw.log" too. If "ceph" isn't installed, thi... Faidon Liambotis
04:02 AM rgw Bug #3813 (Resolved): radosgw doesn't have a logrotate script
Currently there's no logrotate configuration for radosgw at all. Even if one sets "log file" to /var/log/ceph/somethi... Faidon Liambotis
06:35 PM Feature #3833 (Resolved): osd: improve recovery throttling
Sage Weil
06:24 PM Bug #3810 (Need More Info): btrfs corrupts file size on 3.7
I need a dump of the xattrs on the d0c18e1d/605.00000000/head//1 object in pg 1.1d on osd 7 and osd 0 Samuel Just
05:59 PM CephFS Bug #3832 (Resolved): client: does not observe O_SYNC
if the file was opened with O_SYNC we need to flush the io on every write call. Sage Weil
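The behaviour described above can be sketched with a toy write-back buffer. This is only an illustration under stated assumptions: BufferedFile and its fields are hypothetical and are not the Ceph client's data structures; the one point shown is that a write on a handle opened with O_SYNC flushes before returning.

```python
import os

# os.O_SYNC exists on Unix-like systems; the fallback value is a
# placeholder so the sketch still runs elsewhere.
O_SYNC = getattr(os, "O_SYNC", 1 << 20)


class BufferedFile:
    """Toy write-back buffer (hypothetical, not the Ceph client)."""

    def __init__(self, flags):
        self.flags = flags
        self.dirty = []    # buffered writes, not yet durable
        self.durable = []  # stands in for data flushed to storage

    def flush(self):
        # Move everything buffered so far to "storage".
        self.durable.extend(self.dirty)
        self.dirty.clear()

    def write(self, data):
        self.dirty.append(data)
        # With O_SYNC, every write call flushes before returning.
        if self.flags & O_SYNC:
            self.flush()
        return len(data)


sync_file = BufferedFile(os.O_WRONLY | O_SYNC)
sync_file.write(b"hello")
print(sync_file.durable, sync_file.dirty)
```

Without the flag, the same write call leaves the data in the dirty list until something else triggers a flush.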
05:49 PM Bug #3795 (Resolved): loadgen task gets into msgr loop
Sage Weil
05:44 PM rgw Feature #3207 (Resolved): qa: swift functional tests in nightly
Yehuda Sadeh
05:34 PM CephFS Feature #3730: Support replication factor in Hadoop
Oh right, libcephfs is not built on top of librados. Never mind, that's a whole different discussion we start occasio... Greg Farnum
05:15 PM CephFS Feature #3730: Support replication factor in Hadoop
I don't think libcephfs will give up an instance of the rados client, if that's what you mean by grant access to rado... Noah Watkins
04:33 PM CephFS Feature #3730: Support replication factor in Hadoop
Sorry to back this up a little, but I can't recall — does using libcephfs automatically grant a user access to the RA... Greg Farnum
04:30 PM CephFS Feature #3730: Support replication factor in Hadoop
This interface update is up for review in wip-client-pool-api Noah Watkins
09:52 AM CephFS Feature #3730: Support replication factor in Hadoop
From stand-up, stick with int64_t for userspace, and enforce 32-bit range. Noah Watkins
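The decision above (a wide type in the interface, with values restricted to a 32-bit range) might look like the following sketch. The function name check_pool_id is hypothetical, and the assumption that the range is the signed 32-bit range is mine; the comment does not say whether the limit is signed or unsigned.

```python
# Assumption: "32-bit range" means the signed 32-bit range.
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def check_pool_id(pool_id):
    """Accept an integer pool id only if it fits in 32 bits.

    Hypothetical helper for illustration; not part of libcephfs.
    """
    if not INT32_MIN <= pool_id <= INT32_MAX:
        raise ValueError(f"pool id {pool_id} is outside the 32-bit range")
    return pool_id


print(check_pool_id(3))
```

Keeping the wider type in the signature while rejecting out-of-range values at the boundary means the interface would not need to change if the limit were lifted later.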
09:43 AM CephFS Feature #3730: Support replication factor in Hadoop
The move from int32 -> int64 was misguided, and incomplete. At this point it's not really worth the effort to move a... Sage Weil
07:31 AM CephFS Feature #3730: Support replication factor in Hadoop
It looks like in OSDMap there is some mixed usage of int64 and int for pool id, too. In Client::_create pool id is e... Noah Watkins
06:40 AM CephFS Feature #3730: Support replication factor in Hadoop
Can we change the type in libcephfs to uint64? We're the only ones calling ceph_get_file_pool() right now as far as ... Sam Lang
05:33 PM Bug #3820 (Resolved): osdmaptool - user cannot specify pool
Samuel Just
02:24 PM Bug #3820 (Resolved): osdmaptool - user cannot specify pool
Samuel Just
05:23 PM Documentation #3831 (Resolved): ceph osd crush set command needs correction in the doc
ceph osd crush set command has different parameters in different places.
http://ceph.com/docs/master/rados/operat...
Tamilarasi muthamizhan
05:21 PM rgw Bug #3802 (Resolved): x-amz-acl header ignored on copy operation
Fixed, commit:ccfefe3097a51b49885f2ed5d9334e85b497d963. Fix was pushed to both argonaut and bobtail branches. Yehuda Sadeh
11:17 AM rgw Bug #3802: x-amz-acl header ignored on copy operation
ok, affects both argonaut and bobtail. Actual bug is when copying object, if x-amz-metadata-directive is set to COPY ... Yehuda Sadeh
10:01 AM rgw Bug #3802: x-amz-acl header ignored on copy operation
On what version? Yehuda Sadeh
05:16 PM RADOS Documentation #3830 (Closed): crush-map.rst: chooseleaf doesn't include 'firstn|indep', and 'aggr...
1) I think chooseleaf should also include [firstn|indep] like choose does.
2) I'm not certain I understand just wh...
Dan Mick
05:15 PM Bug #3829 (Can't reproduce): new osd added to the cluster is not receiving data
ceph version: 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
1. Initially, had a cluster[burnupi21,burnupi22,b...
Tamilarasi muthamizhan
04:12 PM CephFS Bug #3828 (Rejected): seeing error: fault, server, going to standby whenever I run a ceph-syn loa...
This is showing up on your MDS, about 15 minutes after a client completes accesses, right? This is associated with th... Greg Farnum
04:01 PM CephFS Bug #3828 (Rejected): seeing error: fault, server, going to standby whenever I run a ceph-syn loa...
while validating bug 520, i saw an interesting error. it may be a red herring, as I am seeing no problem with the wr... Anonymous
03:47 PM CephFS Bug #520 (Closed): mds: change ifile state mix->sync on (many) lookups?
3 Node Cluster:
ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
# cat /etc/ceph/ceph.conf
[global]...
Anonymous
02:51 PM CephFS Bug #520: mds: change ifile state mix->sync on (many) lookups?
csyn is now called ceph-syn
and --debug-ms 1 to see those messages go by!
Sage Weil
03:26 PM CephFS Bug #3261: mds crashes in EMetaBlob::replay
This looks like a problem with what's in the journal, but so much MDS code has changed since then that I don't think... Sage Weil
03:24 PM CephFS Bug #1760 (Resolved): multiple_rsync workunit cannot remove non-empty directory intermittently
this also looks like the tmap problem, commit:e52ebacb73747ef642aabdb3cc3cb2a328687a4c and preceding 4 commits. Sage Weil
03:23 PM CephFS Bug #2380 (Rejected): kclient: aufs over a cephfs mount fails with Stale NFS file handle
this is a generic problem with lookup by ino, see #3541 and other features Sage Weil
03:23 PM CephFS Bug #2092 (Can't reproduce): BUG at fs/ceph/caps.c:999
commit:561cf283173360c39db19dc735da4a319be68ff6 fixes the multi-mds case. we haven't seen this again for single-mds..... Sage Weil
03:21 PM Bug #3827 (Resolved): crushtool --test: claims to want -o, really wants --output-csv or --show-*
The error message is wrong, apparently, for crushtool's test mode; it looks like it wants either
--output-csv (in wh...
Dan Mick
03:11 PM CephFS Feature #3826 (Resolved): uclient: Be more aggressive about checking for pools we can't write to
Right now the client will happily buffer up writes to a pool that it can't actually write to. #2753 is going to make ... Greg Farnum
03:06 PM CephFS Bug #3746 (Rejected): kclient mmap doesn't zero past EOF
Run against bad code. Greg Farnum
03:03 PM CephFS Bug #2444 (Can't reproduce): null pointer deference in ceph_d_prune inside kvm
Sage Weil
03:00 PM CephFS Bug #2071 (Can't reproduce): kclient: pjd mkfifo failures
Sage Weil
02:59 PM CephFS Bug #1770 (Can't reproduce): directory nonexistent on kernel_untar_build.sh
Sage Weil
02:58 PM CephFS Bug #1749 (Can't reproduce): nonexistent directory in kclient_workunit_kernel_untar_build
Sage Weil
02:55 PM CephFS Bug #1318 (Resolved): directories disappear across multiple rsyncs
commit:e52ebacb73747ef642aabdb3cc3cb2a328687a4c and 4 preceding patches fix up the TMAP bug that is the likely cause... Sage Weil
02:55 PM CephFS Bug #1511: fsstress failure with 3 active mds
Sam thinks this works now! Adding to QA suite. Greg Farnum
02:50 PM CephFS Bug #3625 (Resolved): client: EEXIST error on multiple clients to create
commit:b4d3bd06d4083d780755f6ef506df1643932fa2f Sage Weil
02:49 PM CephFS Bug #3625: client: EEXIST error on multiple clients to create
Maybe you already handled this? Greg Farnum
02:11 PM CephFS Bug #3625 (Fix Under Review): client: EEXIST error on multiple clients to create
Sam Lang
06:16 AM CephFS Bug #3625: client: EEXIST error on multiple clients to create
The kernel side has been reviewed and tested, but needs to be merged. The fuse side has been tested, but I think it ... Sam Lang
02:48 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
we should return an error code on fsync().. that is the quick fix.
a more polite feature will be opened to return ...
Sage Weil
09:19 AM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
This is clearly a bug, bureaucracy or not. It should not be a feature. We can do new development to fix a bug. If you... Ian Colle
02:47 PM Bug #3812 (Resolved): rados.cc listomapvals usage is wrong, <key> <val> are ignored and not needed
Samuel Just
02:47 PM Bug #3811 (Resolved): rados.cc getomapval implementation is broken, should use omap_get_vals_by_keys
Samuel Just
02:46 PM CephFS Bug #3544: ./configure checks CFLAGS for jni.h if --with-hadoop is specified but also needs to ch...
I think this can be closed. There are a bunch of autoconf changes for Java that have been or will be merged. Noah Watkins
02:41 PM CephFS Bug #3544: ./configure checks CFLAGS for jni.h if --with-hadoop is specified but also needs to ch...
I just ran ./configure using CPPFLAGS to indicate where the jni headers were, and that worked just fine. Using C... Anonymous
02:45 PM CephFS Bug #3254: mds: Replica inode's parent snaprealms are not open
Multi-mds, currently low priority. Greg Farnum
02:44 PM CephFS Bug #3637 (In Progress): client: not issuing caps for with clients doing shared writes
Sage Weil
02:43 PM CephFS Bug #3637 (Fix Under Review): client: not issuing caps for with clients doing shared writes
Sage Weil
02:40 PM CephFS Bug #3498 (Resolved): mds: mds assert failure during untar_kernel
this was a msgr bug, long since fixed. commit:36c0fd220ef02b1ffd7a3ae0d98e0fdec6b55a5b or thereabouts Sage Weil
02:39 PM CephFS Bug #1666: hadoop: time-related meta-data problems
http://www.mail-archive.com/ceph-devel@vger.kernel.org/msg10334.html
Also wip-mtime-incr in the ceph repo.
Sam Lang
02:38 PM CephFS Bug #2218: CephFS "mismatch between child accounted_rstats and my rstats!"
Greg Farnum
02:32 PM CephFS Feature #3821 (New): qa: run backuppc as part of qa suite
Sage Weil
02:32 PM CephFS Bug #2494 (Can't reproduce): mds: Cannot remove directory despite it being empty.
The dupe inode suggests this is the problem fixed by Yan's tmap fixes. Greg Farnum
02:29 PM CephFS Bug #2019 (Can't reproduce): mds: CInode::filelock stuck in sync->mix
Presumably we'll see this again, but it hasn't turned up in our testing lately and we need more info to debug it. Greg Farnum
02:27 PM CephFS Bug #1811 (Duplicate): 2 pjd chown tests failed on cfuse
Ian Colle
02:22 PM CephFS Bug #1537 (Resolved): cmds 100% when copying lots of files, mds_cache_size and mds_bal_frag
This is an optimization issue, which we'll get to! Sage Weil
02:22 PM Bug #3816: osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
Interesting, but where did this actually come from?
And why didn't it get triggered when I started the OSDs again? ...
Wido den Hollander
01:08 PM Bug #3816 (Fix Under Review): osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
-5678> 2013-01-15 17:18:24.509093 7f5a10cec700 1 accepter.accepter.rebind avoid 6812
-5677> 2013-01-15 17:18:24.5...
Sage Weil
12:43 PM Bug #3816: osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
As requested on the mailing list, I'm attaching the logfiles from osd.0 to osd.3
There is indeed an osd_map log line...
Wido den Hollander
09:59 AM Bug #3816 (Resolved): osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
... Sage Weil
02:21 PM CephFS Feature #3819 (Resolved): mds: re-add snaptests to qa suite
Sage Weil
02:02 PM CephFS Bug #3818 (Duplicate): kclient: fsx fails in mapread

With the fix in #3681, fsx fails in mapread with bad data. It looks like this is unrelated to the fix, and is a se...
Sam Lang
01:56 PM Bug #3786: osd: scrub is deferred indefinitely if load is high
Fixed by https://github.com/ceph/ceph/commit/299548024acbf8123a4e488424c06e16365fba5a Ian Colle
01:38 PM Bug #3786 (Resolved): osd: scrub is deferred indefinitely if load is high
Sage Weil
01:38 PM Bug #3774 (Resolved): osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
Sage Weil
11:38 AM rbd Feature #3817 (Resolved): librbd: make cache write-through until a flush is encountered
Writeback caching is unsafe if higher layers don't send flushes. qemu can be accidentally misconfigured to not send f... Josh Durgin
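The policy named in the feature title can be sketched as a small state machine: stay in write-through mode until the first flush arrives, then switch to write-back. The class below is a hypothetical illustration, not librbd's cache; the lists stand in for the backing store and the dirty buffer.

```python
class FlushAwareCache:
    """Toy cache that is write-through until the first flush is seen.

    Hypothetical sketch for illustration; not librbd's implementation.
    """

    def __init__(self):
        self.seen_flush = False
        self.backing = []  # stands in for data safely on the backing store
        self.dirty = []    # buffered data that only a flush makes durable

    def write(self, data):
        if self.seen_flush:
            # The layer above has shown it sends flushes: buffer the write.
            self.dirty.append(data)
        else:
            # No flush seen yet: write straight through to the backing store.
            self.backing.append(data)

    def flush(self):
        self.backing.extend(self.dirty)
        self.dirty.clear()
        self.seen_flush = True


cache = FlushAwareCache()
cache.write("a")   # written through
cache.flush()      # first flush enables write-back
cache.write("b")   # buffered until the next flush
print(cache.backing, cache.dirty)
```

The point of the design is that a guest which never sends flushes never leaves write-through mode, so it cannot lose buffered data it had no way to make durable.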
11:09 AM CephFS Feature #3543 (In Progress): mds: new encoding
Oh, this has been in progress all week. Greg Farnum
10:35 AM CephFS Bug #3773 (Can't reproduce): mds crashed at LogEvent::decode
I have been trying to reproduce this but have not hit it yet.
Will reopen the bug when needed.
Tamilarasi muthamizhan
10:34 AM Bug #3801 (New): Cascading OSD failures beginning with common/HeartbeatMap.cc: 78: FAILED assert(...
Ian Colle
10:28 AM Bug #3801: Cascading OSD failures beginning with common/HeartbeatMap.cc: 78: FAILED assert(0 == "...
Sage Weil wrote:
> The osd.40 error means the fs returned EIO on a read operation. Check your kern.log; there is p...
Justin Lott
09:39 AM Bug #3801 (Need More Info): Cascading OSD failures beginning with common/HeartbeatMap.cc: 78: FAI...
The osd.40 error means the fs returned EIO on a read operation. Check your kern.log; there is probably a bad disk, ... Sage Weil
09:41 AM Feature #3815 (Duplicate): osd: move pg_info_t back into the xattr; avoid writing pginfo file whe...
see wip-pginfo for a hacky prototype.
did some testing, and it looks good:...
Sage Weil
09:39 AM Linux kernel client Bug #3800 (Won't Fix): libceph: check compatibility between ceph modules
Sage Weil
07:03 AM Feature #3805: log: detect dup messages
The one that comes to mind is "no heartbeat from osd.foo since timestamp bar" messages. We could try to identify the... Sam Lang
06:04 AM CephFS Bug #3601: client: With multiple clients, file remove doesn't free up space
Yeah, it's that the LRU doesn't have a timeout.
The mds could send an "enable timeout" message to clients once it se...
Sam Lang

01/15/2013

09:39 PM Bug #3811 (Fix Under Review): rados.cc getomapval implementation is broken, should use omap_get_v...
Samuel Just
09:21 PM Bug #3811 (Resolved): rados.cc getomapval implementation is broken, should use omap_get_vals_by_keys
Samuel Just
09:38 PM Bug #3812 (Fix Under Review): rados.cc listomapvals usage is wrong, <key> <val> are ignored and n...
Samuel Just
09:22 PM Bug #3812 (Resolved): rados.cc listomapvals usage is wrong, <key> <val> are ignored and not needed
Samuel Just
08:53 PM CephFS Feature #3728 (Resolved): mds: draft design for lookup by ino
Sage Weil
08:38 PM CephFS Feature #3730: Support replication factor in Hadoop
pool ids are currently exposed via libcephfs from ceph_file_layout, which uses a 32bit integer for pool id. However, ... Noah Watkins
08:34 PM CephFS Feature #3730: Support replication factor in Hadoop
Someone could toss a 'ceph osd pool set size' Hadoop's way, so a static mapping between pg pool size and pool name co... Noah Watkins
07:51 PM rbd Feature #3754: krbd: use new request tracking code for notify ack
I'm not sure yet whether the problem has to do with this
or whether it's in the existing "new request" code. But
I...
Alex Elder
06:23 PM Documentation #3808: Block device quick start page need update
Fixed description formatting. Also, 3784 is in master now (e94b06a19218decaf7d2d7b009bd862040f20285) Dan Mick
04:46 PM Documentation #3808: Block device quick start page need update
The current writeup also assumes that the mount is local to the cluster so it hides (for the beginner) important deta... Ken Franklin
03:38 PM Documentation #3808: Block device quick start page need update
-c and --secret aren't needed if you're using the default ceph.conf and your keyring can be found based on your ceph.... Josh Durgin
03:30 PM Documentation #3808 (Resolved): Block device quick start page need update
The instructions don't match well with the bobtail release.
- should include a note that ceph-common needs to be ins...
Ken Franklin
06:21 PM Feature #3805: log: detect dup messages
I tend to think there aren't very many dups we could usefully compress. It's pretty easy to add a one-string buffer ... Dan Mick
02:25 PM Feature #3805: log: detect dup messages
What kind of dups are we trying to detect?
This sounds to me like a wishlist item that requires much more work to...
Greg Farnum
02:17 PM Feature #3805 (New): log: detect dup messages
If a log message comes through and is a dup of the previous, increment a counter or something and only log it once wi... Sage Weil
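The idea in this request, counting consecutive duplicates instead of writing each one, can be sketched as follows. This is a minimal illustration under assumptions: DupSuppressor, its emit callback, and the wording of the summary line are all hypothetical and are not part of Ceph's logging code.

```python
class DupSuppressor:
    """Collapse runs of identical consecutive log messages.

    Hypothetical sketch; 'emit' is any callable that writes one line.
    """

    def __init__(self, emit):
        self.emit = emit
        self.last = None
        self.repeats = 0

    def _flush_repeats(self):
        # Report how many duplicates were swallowed, if any.
        if self.repeats:
            self.emit(f"last message repeated {self.repeats} times")
            self.repeats = 0

    def log(self, msg):
        if msg == self.last:
            # Same as the previous message: count it, write nothing.
            self.repeats += 1
            return
        self._flush_repeats()
        self.emit(msg)
        self.last = msg

    def close(self):
        # Emit any pending repeat count before shutting down.
        self._flush_repeats()


lines = []
log = DupSuppressor(lines.append)
for message in ["a", "a", "a", "b"]:
    log.log(message)
log.close()
print(lines)
```

This compares whole strings only, so messages that differ in a timestamp or counter would not be collapsed, which matches the concern raised in the follow-up comments about how many duplicates could usefully be compressed.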
05:35 PM CephFS Bug #3254: mds: Replica inode's parent snaprealms are not open
No. So far I've focused on stabilizing basic fs functionality for multi-MDS setups, completely ignoring snapshots. Zheng Yan
03:28 PM CephFS Bug #3254: mds: Replica inode's parent snaprealms are not open
Hmm, did this get fixed by some of Zheng's later patches? I remember things about snaprealms and migration... Greg Farnum
05:33 PM Bug #3810 (Resolved): btrfs corrupts file size on 3.7
After creating a new ceph cluster pg's become inconsistent after using the qemu client. Logs indicate that the prima... Mike Lowe
04:54 PM Bug #3809 (Won't Fix): crush compiler errors are not helpful
Small, or large, errors in the CRUSH input are apparently all treated the same by crushtool -c:
error: parse error a...
Dan Mick
04:44 PM CephFS Feature #3289: ceph-fuse: somehow exert pressure on the VFS to remove dentries from the cache
#3575 should be kept in mind while doing this/instead of this — there's a forget_multi as well. Greg Farnum
04:44 PM CephFS Bug #3601 (New): client: With multiple clients, file remove doesn't free up space
Whoops, didn't mean to change that status. Greg Farnum
04:43 PM CephFS Bug #3601 (Duplicate): client: With multiple clients, file remove doesn't free up space
The LRU actually already exists; check out Client::lru. (Unless I'm misunderstanding something?) So we might want to ... Greg Farnum
04:37 PM CephFS Bug #925: mds: update replica snaprealm on rename
De-prioritizing multi-MDS issues... Greg Farnum
04:34 PM CephFS Bug #1117: mds: rename rollback broken on slaves during replay
De-prioritizing multi-mds issues for now. Greg Farnum
04:27 PM CephFS Bug #1435: mds: loss of layout policies upon mds restart
I'm guessing we want to move this up the queue; will discuss in bug scrub tomorrow! Greg Farnum
04:23 PM CephFS Bug #1511: fsstress failure with 3 active mds
De-prioritizing multi-mds failures at this time. Greg Farnum
04:23 PM CephFS Bug #1535: concurrent creating and removing directories crashes cmds
De-prioritizing multi-MDS bugs at this time. Greg Farnum
03:51 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
Fair enough, but if I can just make a suggestion, perhaps you might want to explain these procedures somewhere in the... Florian Haas
03:45 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
I agree it's a bug, but given the procedures we have now (ack! changing procedures coming alert!) I don't think we wa... Greg Farnum
03:43 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
No, please. A write pretending to succeed while actually not writing data _is_ a bug. The filesystem _not lying to it... Florian Haas
03:33 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
This is a great suggestion but falls into feature rather than bug-fix category. My initial thought is keeping a list ... Greg Farnum
03:42 PM CephFS Bug #1675 (Can't reproduce): mds: failed rstat assert
The logs are long gone. This will presumably pop up again; it's a pretty common failure mode, but there's nothing in ... Greg Farnum
03:38 PM CephFS Bug #1938: mds: snaptest-2 doesn't pass with 3 MDS system
De-prioritizing all multi-MDS bugs for now. Greg Farnum
03:27 PM CephFS Bug #3267: Multiple active MDSes stall when listing freshly created files
Currently de-prioritizing multi-MDS bugs. Greg Farnum
03:23 PM Bug #3537: Logs can run root out of space and crash ceph cluster (need more aggressive log rotation)
Not an FS bug, and #3775 has a lot more conversation on this subject. Greg Farnum
03:22 PM Bug #3552: After ceph-deploy installation a reboot breaks OSDs
Whoops, not an FS bug!
I've put this in the main Ceph project for now, but it might also belong in devops. We need...
Greg Farnum
03:18 PM CephFS Bug #3625: client: EEXIST error on multiple clients to create
I know you guys did a couple rounds on this one, what's the status? Greg Farnum
02:39 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
Yes, the question is why they're 'getting unlucky'. Josh Durgin
02:22 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
Haven't looked into this, but my guess is a couple PGs are getting unlucky with their replica selection. I assume you... Greg Farnum
02:17 PM Bug #3806 (Won't Fix): OSDs stuck in active+degraded after changing replication from 2 to 3
Small 3 node cluster running 0.56.1-1~bpo60+1 on Debian/Squeeze, with "tuneables" enabled
I recently changed the r...
Ben Poliakoff
02:27 PM RADOS Feature #3807 (Resolved): crush: simple commands to create common rules
These should be in CrushWrapper or similar, and available via crushtool and via some 'ceph osd crush ...' commands.
...
Sage Weil
02:16 PM Feature #3775: log: stop logging in statfs reports usage above some threshold
I agree. If there are lots of log messages at the default levels, that is the problem. I don't think there is much ... Sage Weil
01:59 PM Feature #3775 (Need More Info): log: stop logging in statfs reports usage above some threshold
So I suggest we split this into two issues:
1) the documentation examples show an awfully-high logging value for s...
Dan Mick
12:03 PM Feature #3775: log: stop logging in statfs reports usage above some threshold
so, a couple ideas of what can be done.
if we do set size and frequency (or inform the user how to), then it could...
Anonymous
11:39 AM Feature #3775: log: stop logging in statfs reports usage above some threshold
So a couple of thoughts:
1) changing size in logrotate.conf doesn't help unless we also change frequency
2) with ...
Dan Mick
02:15 PM Documentation #3804 (Resolved): Logging section recommends fairly high levels, doesn't stress how...
3775 introduced the observation that logs can fill very quickly and bury a small root disk.
Our documentation could ...
Dan Mick
02:03 PM rbd Feature #3635: rbd cli: call "udevadm settle" after use of add/remove kernel interface
commit:15bb00cafc31305cacf3c4684a429c2c9ee6f804 in master
Dan Mick
02:03 PM rbd Feature #3635 (Resolved): rbd cli: call "udevadm settle" after use of add/remove kernel interface
Dan Mick
02:02 PM rbd Feature #3784: rbd: issue modprobe when rbd map is called
commit:e94b06a19218decaf7d2d7b009bd862040f20285 in master
Dan Mick
02:01 PM rbd Feature #3784 (Resolved): rbd: issue modprobe when rbd map is called
Dan Mick
01:47 PM Bug #3803 (Resolved): rados parsing error with hostnames in mon_host
nevermind.. this is fixed in v0.48.3argonaut too. Sage Weil
01:45 PM Bug #3803: rados parsing error with hostnames in mon_host
Responded to the upstream bug. This is fixed in master and bobtail, but not backported to argonaut. Should we? Sage Weil
08:37 AM Bug #3803 (Resolved): rados parsing error with hostnames in mon_host
In /etc/ceph/ceph.conf, if I set hostnames in the mon_host variable and separate them with spaces, the parsing algori... Ian Colle
01:25 PM CephFS Bug #3637: client: not issuing caps for with clients doing shared writes
Sage has a different proposed fix than what's in the branch. Still needs to be tested. Sam Lang
12:50 PM CephFS Bug #3637: client: not issuing caps for with clients doing shared writes
I don't remember where this ended up. Was the proposed fix problematic, or did it never get looked at? Greg Farnum
01:16 PM Bug #3770: OSD crashes on boot
Yeah, I just pushed a work-around branch (which I haven't tested much, so ideally you would try it on a node you can ... Samuel Just
12:08 PM rbd Subtask #3741: krbd: rework request tracking code
I found the source of my trouble, and in the process understood
a little more about some subtlety in bio reference c...
Alex Elder
11:39 AM CephFS Bug #3718: multi-client dbench gets stuck over NFS exported cephfs
This apparently is only a problem under re-export, which I believe we are not focusing on right now. Greg Farnum
11:35 AM CephFS Bug #3553: MDS core dumped running 0.48.2argonaut
Given what we know so far (the Op got sent to the wrong OSD) this is a bug in the Objecter, not the MDS. Or possibly ... Greg Farnum
11:17 AM Bug #3771: ceph does not have startup scripts in Centos
Not an FS bug! :) Greg Farnum
10:17 AM Bug #3771 (In Progress): ceph does not have startup scripts in Centos
Anonymous
11:16 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
Whoops, this was never an FS bug. :) Greg Farnum
10:15 AM Bug #3768 (In Progress): perl is required for logrotate, we need to include Perl as a dependency
Anonymous
10:54 AM Bug #3747: PGs stuck in active+remapped
No I didn't, just the CRUSH rule. Faidon Liambotis
10:46 AM Bug #3747 (Need More Info): PGs stuck in active+remapped
Faidon: did you also change the replication level of pool 3 (.rgw.buckets) ? Samuel Just
10:18 AM Feature #3505 (In Progress): default to libnss
This may already have been done. Will double check. Anonymous
10:16 AM Feature #3733 (In Progress): osd: update leveldb submodule
Anonymous
10:10 AM Bug #3797 (Need More Info): osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest ...
Ian Colle
07:09 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Can you try reupgrading one of the nodes and starting it with debug file store = 20? That will tell us what it is writing. Sage Weil
02:49 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
I just downgraded to 0.48.2argonaut and everything seems to be running normally again now:
Before downgrade:
ii ...
Corin Langosch
02:28 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Here's the output of dstat http://pastie.org/5687470.text
I'm not sure why it is writing so much now, before the ...
Corin Langosch
02:17 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
I just noticed the second osd is now consuming 100% cpu too. Before it was properly running for around 15 minutes. Gu... Corin Langosch
02:14 AM Bug #3797 (Duplicate): osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48....
I just upgraded one of my production servers (2 osds) from 0.48.2argonaut to the latest 0.48.3argonaut and now of the... Corin Langosch
08:33 AM rgw Bug #3802 (Resolved): x-amz-acl header ignored on copy operation
When copying an object, the x-amz-acl header is ignored. To replicate: copy a private object and send the 'x-amz-acl' ... JuanJose Galvez
07:43 AM Bug #3801 (Won't Fix): Cascading OSD failures beginning with common/HeartbeatMap.cc: 78: FAILED a...
0.48.2argonaut
Relevant logs are attached. Core dumps are available if needed....
Justin Lott
07:25 AM Linux kernel client Bug #3800: libceph: check compatibility between ceph modules
You're right, as long as you are using matching
code it's fine.
If it occurred, it's a serious problem. It just
...
Alex Elder
07:17 AM Linux kernel client Bug #3800: libceph: check compatibility between ceph modules
Is this really a problem? It seems like this could only bite someone building mixed versions out of tree. Sage Weil
06:57 AM Linux kernel client Bug #3800 (Resolved): libceph: check compatibility between ceph modules
It's possible for semantic changes to occur in one of the
ceph modules (fs/ceph, net/libceph, or block/rbd) that is
...
Alex Elder
06:58 AM Linux kernel client Bug #3799: libceph/rbd: bio refs are messed up
Because this suggests a semantically-incompatible change
between modules, this should probably be completed first:
...
Alex Elder
06:56 AM Linux kernel client Bug #3799 (Resolved): libceph/rbd: bio refs are messed up
There is an ugly reference counting dance that occurs with bio
pointers in the kernel osd I/O path, and it needs to ...
Alex Elder
06:57 AM Linux kernel client Bug #3798: libceph/rbd: take reference to all bio's in list
The other bug related to this is:
http://tracker.newdream.net/issues/3799
Alex Elder
06:56 AM Linux kernel client Bug #3798 (Resolved): libceph/rbd: take reference to all bio's in list
In a separate bug ("libceph/rbd: bio refs are messed up") I
describe how reference counting of bio's interact betwee...
Alex Elder

01/14/2013

10:07 PM Bug #3748: ceph osd dump --format=json includes non-JSON line
oh *fine*. :) Dan Mick
10:04 PM Bug #3748: ceph osd dump --format=json includes non-JSON line
Funny you should mention it: that is step #1 (or maybe 2 or 3) for the management API work, IMHO. :) Sage Weil
09:41 PM Bug #3748: ceph osd dump --format=json includes non-JSON line
I sorta think we ought to clean up how the various output channels are used in this code in general. This fixes the ... Dan Mick
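A minimal consumer-side sketch for the symptom named in #3748, assuming the stray text appears as whole lines before the JSON document. The function name and the sample input are hypothetical and are not taken from the ceph code or its actual output. This is a workaround for a script that has to parse such output, not the fix discussed above.

```python
import json


def parse_json_with_noise(text):
    """Return the first JSON document in text, skipping leading non-JSON lines."""
    lines = text.splitlines()
    decoder = json.JSONDecoder()
    for i, line in enumerate(lines):
        # Only try lines that could plausibly open a JSON object or array.
        if line.lstrip().startswith(("{", "[")):
            candidate = "\n".join(lines[i:]).lstrip()
            try:
                # raw_decode parses one document and ignores trailing text.
                value, _end = decoder.raw_decode(candidate)
                return value
            except json.JSONDecodeError:
                continue
    raise ValueError("no JSON document found in output")


if __name__ == "__main__":
    # Hypothetical sample: one stray line followed by a JSON object.
    sample = 'stray status line\n{"epoch": 1, "osds": []}\n'
    print(parse_json_with_noise(sample))
```

Skipping lines rather than searching for the first brace anywhere in the text keeps a brace inside a stray message from being mistaken for the start of the document, at the cost of assuming the JSON begins on its own line.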
08:02 PM rbd Subtask #3741: krbd: rework request tracking code
OK, I ran a test and got a crash. The bio built for
an object request gets handed off to an osd request.
I need to...
Alex Elder
07:32 PM rbd Subtask #3741: krbd: rework request tracking code
I spent the day trying to find the memory leak and finally
found it. The structure being leaked was a bio. It was
...
Alex Elder
06:48 AM rbd Subtask #3741: krbd: rework request tracking code
For some reason my tests started hanging on Friday when
I added memory debug code for catching leaks and reuses.
I ...
Alex Elder
07:49 PM CephFS Bug #3544: ./configure checks CFLAGS for jni.h if --with-hadoop is specified but also needs to ch...
Is this still an issue? Noah Watkins
04:54 PM Bug #3752: fsync-tester script needs to be fixed to run in the nightlies
Josh just pinged me that there was a typo in the chmod patch, and nobody's noticed so apparently it still hasn't been... Greg Farnum
04:24 PM Bug #3795: loadgen task gets into msgr loop
I looked a bit more and I see some failures before that, and also some passes after, e.g. teuthology-2013-01-11_07:00... Sage Weil
11:35 AM Bug #3795: loadgen task gets into msgr loop
taking a look again at the nightly runs, looks like this issue has been happening on next branch from 01-01-2013 whic... Tamilarasi muthamizhan
08:13 AM Bug #3795: loadgen task gets into msgr loop
going to see if the recent msgr changes are to blame.. bisecting! Sage Weil
08:04 AM Bug #3795: loadgen task gets into msgr loop
This appears to be a simple cycle:
- objecter has lots of requests outstanding
- there is a fault (msgr failure i...
Sage Weil
03:04 PM CephFS Documentation #3796 (Resolved): FUSE mount documentation needs some corrections for v0.56.x
The FUSE instructions need to be updated for v0.56 and later
currently:
> http://ceph.com/docs/master/cephfs/fuse...
Anonymous
01:35 PM Bug #3772 (Can't reproduce): osd: osd_disk_threads = 5 seems to hang recovery
I also don't seem to be able to reproduce on bobtail, marking can't reproduce. Samuel Just
12:58 PM Bug #3772 (New): osd: osd_disk_threads = 5 seems to hang recovery
I don't seem to be able to reproduce this on master. Samuel Just
10:37 AM Bug #3772: osd: osd_disk_threads = 5 seems to hang recovery
didn't reproduce with simple test, trying something more complicated. (roles/8882.yaml + osd disk threads : 10, teste... Samuel Just
01:28 PM CephFS Feature #3749 (Resolved): Remove forced synchronization from Java bindings
Noah Watkins
12:57 PM Feature #3769 (Fix Under Review): osd: scrub should verify snap collection existence, membership
wip_snap_scrub Samuel Just
11:55 AM rbd Bug #2871 (Resolved): rbd export command hangs when trying to export an image of size 0 to a loca...
Not certain which recent fix resolved this, but it works now.
Dan Mick
11:32 AM rbd Bug #3585 (Closed): Image import via QEMU-IMG results in a corrupt rbd
Great, glad to hear it's fixed. Josh Durgin
11:09 AM rbd Bug #3427: krbd: unmap does not remove block device properly
Patch posted for review. I'm not sure I'll be able to test
the scenario very well but hopefully it can be seen by
...
Alex Elder
09:56 AM rbd Bug #3427: krbd: unmap does not remove block device properly
Implementing the change I described now. Alex Elder
11:01 AM Bug #2691: osd/ReplicatedPG.cc: 5888: FAILED assert(latest->is_update())
for reference, ubuntu@teuthology:/a/teuthology-2013-01-10_07:00:03-regression-argonaut-master-basic/38145 Tamilarasi muthamizhan
10:50 AM Bug #2691: osd/ReplicatedPG.cc: 5888: FAILED assert(latest->is_update())
This has shown up once in argonaut, probably not worth backporting unless it becomes more of a problem? Samuel Just
09:42 AM Bug #3629 (Resolved): test_mon_workloadgen.cc: 766: FAILED assert(m->fsid == monc.get_fsid())
commit:3610e72e4f9117af712f34a2e12c5e9537a5746f Joao Eduardo Luis
07:00 AM CephFS Bug #2187: pjd chown/00.t failed test 97
Happened again on Friday. Time to add the delay injection to the nightlies?
2013-01-11T07:32:37.489 INFO:teutholo...
Sam Lang
05:43 AM Bug #3770: OSD crashes on boot
So, my (very basic) understanding of this suggests that the fix is that the trim wouldn't happen in the first place.
...
Faidon Liambotis
 
