Activity

From 05/22/2013 to 06/20/2013

06/20/2013

11:13 PM Bug #5301: mon: leveldb crash in tcmalloc
Maciej Galkiewicz wrote:
> debian wheezy (7.0)
ok now it sounds a lot like #5239. i'm not able to reproduce this...
Sage Weil
11:01 PM Feature #3273 (Need More Info): mon: simple dm-crypt key management
http://marc.info/?l=ceph-devel&m=137179443405614&w=2 Sage Weil
10:40 PM rbd Feature #4550: Create Qemu+RBD rpm package for RHEL+CentOS 6.3 on ceph.com
Following the packaging discussions, the redhat packages were respun with the latest redhat sources + the ceph rados ... Anonymous
09:51 PM rgw Feature #4340 (In Progress): rgw: dr: data sync agent: implement full sync
Yehuda Sadeh
09:51 PM rgw Feature #5358 (In Progress): rgw: RESTful api for intra-region copy state
Yehuda Sadeh
09:50 PM rgw Feature #5356 (In Progress): rgw: RESTful api for bucket upstream zone + marker info
Yehuda Sadeh
09:50 PM rgw Feature #5341 (Fix Under Review): rgw: keep state for cross-rgw copy operations
Yehuda Sadeh
09:33 PM CephFS Fix #5399: timestamp changes on replayed mds request (pjd link 71)
probably need to extend the replayed request message to include the timestamps we got for the inode and dir so that t... Sage Weil
09:33 PM CephFS Fix #5399: timestamp changes on replayed mds request (pjd link 71)
- we send a create to mds
- get an ack, but it isn't journaled
- pjd stats the mtime/ctime/etc.
- mds restarts
- w...
Sage Weil
09:12 PM CephFS Bug #5290: mds: crash whilst trying to reconnect
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-06-20_13:32:57-fs-master-testing-basic/41231
logs in ...
Sage Weil
06:45 PM CephFS Bug #5333 (Fix Under Review): mds: segfault in MDLog::standby_trim_segments
wip-5333
this looks like a simple matter of not crashing if the segment list is empty. that at least covers this ...
Sage Weil
12:53 PM CephFS Bug #5333: mds: segfault in MDLog::standby_trim_segments
Just a note: maybe we missed a spot, but I remember doing a re-read head object, retry journal read whenever we get a... Greg Farnum
12:47 PM CephFS Bug #5333: mds: segfault in MDLog::standby_trim_segments
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-06-20_01:00:49-fs-next-testing-basic/40965
with ful...
Sage Weil
06:15 PM CephFS Bug #5380 (Resolved): osdc/Filer.cc: 163: FAILED assert(probe->known_size[p->oid] <= shouldbe)
Sage Weil
12:30 PM CephFS Bug #5380: osdc/Filer.cc: 163: FAILED assert(probe->known_size[p->oid] <= shouldbe)
Sage Weil
05:42 PM Bug #5412 (Resolved): doc bug: incorrect reference to monitor quorum requirements
http://ceph.com/docs/master/rados/deployment/ceph-deploy-mon/... Greg Farnum
02:56 PM Bug #4004 (Can't reproduce): Intermittent kernel build failures
Anonymous
02:52 PM Bug #4004: Intermittent kernel build failures
Closing since we haven't seen any problems for a couple months. Anonymous
02:55 PM devops Cleanup #5106 (Resolved): ceph_deploy: install/compile error on wheezy
This was happening due to syntax in the test programs that wasn't supported on Python 2.6. Not shipping the test direc... Anonymous
02:51 PM Bug #2176 (Resolved): dependencies not checked by autoconf
We've got all the current dependencies in the configure.ac checks and in the rpm or debian requirements. Anonymous
02:42 PM CephFS Bug #5411 (Resolved): teuthology: bad object dereference
... Greg Farnum
01:30 PM CephFS Fix #5268: mds: fix/clean up file size/mtime recovery code
See also #4485. Greg Farnum
01:30 PM CephFS Feature #4485: Improve "needsrecover" handling
See also #5268. Greg Farnum
01:24 PM CephFS Feature #1693 (In Progress): libcephfs: Support TRIM (hole punching)
See "[PATCH] Ceph-fuse: Punch hole support" from Li Wang. Greg Farnum
01:17 PM CephFS Feature #3541 (In Progress): mds: robust ino lookup using file backpointers
A bunch of this got done, but Sage isn't sure if the client -> LOOKUPINO messages are wired up to that infrastructure... Greg Farnum
12:58 PM Feature #4929: Erasure encoded placement group
the pad is only archived for so long, keep a "pad backup":http://pad.ceph.com/p/Erasure_encoding_as_a_storage_backend... Loïc Dachary
11:31 AM Bug #5409 (Resolved): mon: log command does not wait for commit
Sage Weil
10:43 AM Bug #5409 (Resolved): mon: log command does not wait for commit
the mon replies immediately, and may lose the msg if it restarts.... Sage Weil
10:35 AM rgw Feature #5408 (Resolved): rgw: turn off dr/geo logging
Yehuda Sadeh
10:26 AM Bug #5407 (Resolved): mon: is_writeable doesn't match wait_for_writeable on cuttlefish
fixes in master.. need a minimal cuttlefish backport. Sage Weil
09:58 AM rgw Feature #5406 (Resolved): rgw: a RESTful api to dump region map
Yehuda Sadeh
09:42 AM devops Bug #5405 (Resolved): ceph-deploy: transient pushy exception on install
... Sage Weil
08:37 AM devops Feature #5403 (Resolved): make ceph.com repos mirrorable
Sage Weil
08:01 AM Fix #5388: osd: localized reads (from replicas) reordered wrt writes
Since disabling localized reads I've not seen the problem occur, so thanks :) Mike Bryant
07:32 AM Bug #5401: cuttlefish osd recovery slow
Full backtrace (while recovering):
http://pastebin.com/raw.php?i=DWGHiNP6
2nd full backtrace:
http://pastebin.co...
Stefan Priebe
07:18 AM Bug #5401: cuttlefish osd recovery slow
Not sure if this helps:
# /etc/init.d/ceph stop osd.24; sleep 15; /etc/init.d/ceph start osd.24; sleep 10; inotif...
Stefan Priebe
06:59 AM Bug #5401: cuttlefish osd recovery slow
Some more information. While recovering i see nearly no CPU load. If i look at the disk activity i see a HUGE amount o... Stefan Priebe
06:53 AM Bug #5401: cuttlefish osd recovery slow
Lowering osd recovery max active makes it even worse, as the overall recovery takes longer. So it's not the I/O ... Stefan Priebe
01:01 AM Bug #5401 (Can't reproduce): cuttlefish osd recovery slow
While the peering is fine now (Bug #5232) (latest upstream/cuttlefish) even without wip_cuttlefish_compact_on_startup... Stefan Priebe
06:59 AM Bug #5292: mon: monitor crashing due to not being in the monmap (no monmap to be in)
Thanks. Jeff Moskow
06:56 AM Bug #5292 (Resolved): mon: monitor crashing due to not being in the monmap (no monmap to be in)
You hit #5205 -- not the same issue, thus closing this ticket again. Joao Eduardo Luis
06:48 AM Bug #5292: mon: monitor crashing due to not being in the monmap (no monmap to be in)
Here you go. Please let me know if you need anything else.
Jeff
Jeff Moskow
06:42 AM Bug #5292 (Need More Info): mon: monitor crashing due to not being in the monmap (no monmap to be...
Okay, can you post the monitor's logs with 'debug mon = 20' ? Joao Eduardo Luis
02:46 AM Bug #5292: mon: monitor crashing due to not being in the monmap (no monmap to be in)
I did a reboot, just to make sure :-(
# ceph -v
ceph version 0.61.4 (1669132fcfc27d0c0b5e5bb93ade59d147e23404)
...
Jeff Moskow
06:48 AM rgw Bug #5402 (Resolved): rgw compilation problem on wip-rgw-geo-2 branch
The wip-rgw-geo-2 branch does not compile from a2cf14fe27a2da54e44b12a373b15b29c89d31b9.
In fact the method encode...
Christophe Courtaut
12:06 AM Fix #5232: osd: slow peering due to pg log rewrites
I'm fridudad ;-) Peering works fine but recovery does not. In my initial text of this tracker i also mentioned recove... Stefan Priebe

06/19/2013

11:14 PM Subtask #5213 (Resolved): unit tests for src/osd/PGLog.{cc,h}
Loïc Dachary
10:48 PM CephFS Bug #5289: mds closing stale session
Sage Weil wrote:
> this is caused when the client is not talking to the mds. can you verify the network is working, ...
chen atrmat
08:08 PM CephFS Bug #5380: osdc/Filer.cc: 163: FAILED assert(probe->known_size[p->oid] <= shouldbe)
The patch only fixes the root cause. It doesn't help if objects already have wrong size. Zheng Yan
06:06 PM rbd Bug #5222 (Resolved): krbd: use per-rbd_dev mutex to protect header updates
Sage Weil
06:06 PM rbd Bug #3925 (Resolved): krbd: sysfs write lockdep warnings
Sage Weil
06:04 PM Bug #5398 (Resolved): PGLog::rewind_divergent_log dereferencing iterator "p" past the end of its ...
Sage Weil
01:58 PM Bug #5398 (Fix Under Review): PGLog::rewind_divergent_log dereferencing iterator "p" past the end...
"pull request":https://github.com/ceph/ceph/pull/366 Loïc Dachary
12:59 PM Bug #5398 (Resolved): PGLog::rewind_divergent_log dereferencing iterator "p" past the end of its ...
Loïc Dachary
06:03 PM devops Bug #5161 (Resolved): daemons should create /var/run/ceph if it doesn't already exist
Sage Weil
06:02 PM Bug #5227 (Can't reproduce): ARM set up: rados test failed
Sage Weil
07:45 AM Bug #5227: ARM set up: rados test failed
Been trying to reproduce this on the talas but no joy so far. Still hammering cuttlefish. Joao Eduardo Luis
06:00 PM devops Bug #5387 (Resolved): ceph-disk: lockfile does not detect stale locks (dead parent process)
Sage Weil
06:00 PM devops Bug #5390 (Pending Backport): ceph-deploy osd create hangs
Sage Weil
05:28 PM devops Bug #5390: ceph-deploy osd create hangs
bah, trivial fcntl(2) is all we need here. Sage Weil
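A minimal sketch of the fcntl(2) approach mentioned in the comment above, assuming the goal is a lock that the kernel drops automatically when its owner exits, so that a dead process cannot leave a stale lock behind. The helper names `acquire_lock` and `release_lock` and the lock path are hypothetical; this is not the code that went into ceph-disk.

```python
import fcntl
import os


def acquire_lock(path):
    """Try to take an exclusive, non-blocking fcntl lock on path.

    Returns the open file descriptor on success, or None if another
    process already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # lockf() is implemented with fcntl(2) record locks.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd


def release_lock(fd):
    """Drop the lock and close the descriptor."""
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)


if __name__ == "__main__":
    # Hypothetical lock path, for illustration only.
    fd = acquire_lock("/tmp/example-ceph-disk.lock")
    if fd is None:
        print("lock is held by another process")
    else:
        print("lock acquired")
        release_lock(fd)
```

Because these locks belong to the process, the kernel releases them when the process exits for any reason. One caveat: closing any descriptor for the same file within the owning process also releases the lock.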
05:00 PM rgw Feature #4335 (Fix Under Review): rgw: dr: sync processing state: define datastructures
We're well beyond "defining" the data structures at this point; there's code and it's undergoing review. But this is ... Greg Farnum
04:12 PM devops Bug #5211 (Pending Backport): ceph-disk prepare: list_partitions() shouldn't return disks
Sage Weil
09:25 AM devops Bug #5211 (Fix Under Review): ceph-disk prepare: list_partitions() shouldn't return disks
pushed to wip-ceph-disk Sage Weil
04:02 PM CephFS Fix #5399 (New): timestamp changes on replayed mds request (pjd link 71)
Hmm, Sage points out this might be something else; reopening. Greg Farnum
03:56 PM CephFS Fix #5399 (Rejected): timestamp changes on replayed mds request (pjd link 71)
It's a time stamp check for things going backwards, and is failing due to out-of-sync clocks (over a network) being h... Greg Farnum
03:44 PM CephFS Fix #5399 (Resolved): timestamp changes on replayed mds request (pjd link 71)
teuthology-2013-06-19_10:46:59-fs-cuttlefish-master-basic 40138 40141 Sage Weil
03:50 PM Fix #5232: osd: slow peering due to pg log rewrites
This won't get backported. Some mitigating patches did go into cuttlefish. Also, there is wip_cuttlefish_compact_on... Samuel Just
03:45 PM Fix #5232 (Resolved): osd: slow peering due to pg log rewrites
Samuel Just
03:44 PM RADOS Tasks #5243: osd testing: create peering speed test
peering_speed_test.py, still needs to be added to ceph-qa-suite somewhere appropriate. Samuel Just
03:43 PM Fix #5278: osd: smarter recovery for small objects
wip-small-object-recovery, in progress Samuel Just
03:26 PM Bug #5389 (Resolved): osd: op_tp timeout on big cluster + radosmodel
Samuel Just
03:11 PM Bug #5389: osd: op_tp timeout on big cluster + radosmodel
lfn_unfound Samuel Just
03:21 PM devops Bug #5306: Xen based OSDs fail to start ceph-osd process
Here's the output of udevadm test
~ sudo udevadm test --action=add /sys/devices/vbd-51728/block/xvdb/xvdb1
run_c...
Yan-Fa Li
03:04 PM devops Bug #5306: Xen based OSDs fail to start ceph-osd process
OK, I updated using ceph-deploy to all my nodes to this version:
ceph version 0.61.3-29-g08304a7 (08304a7c46da7517...
Yan-Fa Li
11:43 AM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start
I'm still using the cluster with the modified ceph-mds program, it still works. I caused another power outage (this i... Jérôme Poulin
11:25 AM rgw Bug #5346: rgw: invalid read from RGWFormatter_Plain::write_data
well, swift is the only user of the plain formatter I guess. Yehuda Sadeh
11:14 AM rgw Bug #5346: rgw: invalid read from RGWFormatter_Plain::write_data
this appears to be triggered by the swift test.. doesn't happen with s3tests or readwrite etc
also present on cutt...
Sage Weil
11:15 AM Bug #4976 (Resolved): osd powercycle triggers object corruption on xfs
Sage Weil
09:09 AM Bug #4976: osd powercycle triggers object corruption on xfs
What do you mean "remove fadvise"? And is this a known upstream issue? Greg Farnum
10:37 AM rbd Documentation #3220 (Resolved): doc: more detail on QEMU+RBD page
http://ceph.com/docs/master/rbd/qemu-rbd/ John Wilkins
09:56 AM Bug #5292: mon: monitor crashing due to not being in the monmap (no monmap to be in)
It was backported to the branch that should soon become 0.61.4. Until then, you'll be able to find it on the gitbuil... Joao Eduardo Luis
08:36 AM Bug #5292: mon: monitor crashing due to not being in the monmap (no monmap to be in)
I just tried apt-get update and it didn't pull down any cuttlefish updates. Have they been released? Do I need to d... Jeff Moskow
07:43 AM Bug #5292 (Resolved): mon: monitor crashing due to not being in the monmap (no monmap to be in)
Fix for this went into next and cuttlefish branches as of last night; see #5256. Joao Eduardo Luis
08:59 AM devops Feature #5397 (New): terminate ceph-create-keys when its mon process dies
Right now, it's easy to build up a bunch of ceph-create-keys processes on a node because it is started when the monit... Greg Farnum
08:38 AM Bug #5375 (Resolved): squeeze tcmalloc leaks
i'll send a note to the email list. thanks for tracking this down! Sage Weil

06/18/2013

11:38 PM Bug #5375: squeeze tcmalloc leaks
OK this def. fixes it for me. So the squeeze perftool package seems to have a memory leak. Stefan Priebe
12:39 PM Bug #5375: squeeze tcmalloc leaks
Sage Weil
12:17 PM Bug #5375: squeeze tcmalloc leaks
OK backported: google-perftools from wheezy to squeeze, recompiled leveldb and ceph to reflect new google-perftools v... Stefan Priebe
11:18 AM Bug #5375: squeeze tcmalloc leaks
yeah, try wheezy. they won't update squeeze at this point anyway. Sage Weil
11:17 AM Bug #5375: squeeze tcmalloc leaks
The Debian Maintainer is: Daigo Moriwaki <daigo at debian.org>
Should i first try to use the one from wheezy on sq...
Stefan Priebe
11:14 AM Bug #5375: squeeze tcmalloc leaks
Hmm, looks like maybe we need to send a bug to upstream (Debian and/or libgoogle-perftools devs).
Sage, any ideas ...
Greg Farnum
11:02 AM Bug #5375: squeeze tcmalloc leaks
no change. Should i update my tcmalloc on debian squeeze?
[: ~]# pmap -x 11783|tail -n1
total kB 176290...
Stefan Priebe
10:56 AM Bug #5375: squeeze tcmalloc leaks
Yes. In particular the "heap release" bit is trying to more aggressively give memory back to the OS. We've observed i... Greg Farnum
10:48 AM Bug #5375: squeeze tcmalloc leaks
not should be now Stefan Priebe
10:48 AM Bug #5375: squeeze tcmalloc leaks
not it's using 1GB should i run these commands again? Stefan Priebe
10:57 PM rgw Feature #4335 (In Progress): rgw: dr: sync processing state: define datastructures
Yehuda Sadeh
10:57 PM rgw Feature #4338 (In Progress): rgw: multisite: metadata sync agent: implement delta changes sync
Yehuda Sadeh
10:56 PM rgw Feature #5341 (In Progress): rgw: keep state for cross-rgw copy operations
Yehuda Sadeh
10:56 PM rgw Bug #5357 (Fix Under Review): rgw: set and retrieve intra-region copy operation state
Yehuda Sadeh
10:33 PM devops Bug #5390: ceph-deploy osd create hangs
starting with the mercurial lock implementation, which uses a pid. see wip-ceph-disk-lock, tho still incomplete. Sage Weil
08:45 AM devops Bug #5390 (Fix Under Review): ceph-deploy osd create hangs
care to review the top patch in wip-ceph-disk?
alternatively, do you know of a replacement for lockfile that will ...
Sage Weil
08:38 AM devops Bug #5390 (In Progress): ceph-deploy osd create hangs
see also #5387. and i'll add the sigint handler to reduce the probability of this happening! Sage Weil
07:36 AM devops Bug #5390 (Resolved): ceph-deploy osd create hangs
On Ubuntu 13.04 with ceph 0.61.3 .
It hangs when creating a new osd using ceph-deploy.
ceph@ceph-node4:~/mycluste...
Da Chun Wu
10:24 PM Fix #5279 (In Progress): pipeline large object recovery
Sage Weil
10:24 PM Feature #4200 (In Progress): mon: break pgmap into separate leveldb keys
Sage Weil
10:08 PM Bug #5256 (Resolved): Upgraded bobtail->cuttlefish mon crashes, then can't resume the conversion
Sage Weil
09:45 PM Bug #4976 (Fix Under Review): osd powercycle triggers object corruption on xfs
the problem is that sync_file_range(2) and posix_fadvise(..DONTNEED) break xfs's internal write and zero ordering. ... Sage Weil
09:19 PM Bug #5372 (Duplicate): osd/SnapMapper.cc: 270: FAILED assert(check(oid))
I think this is caused by the same thing as 5269. Samuel Just
09:18 PM Bug #5320 (Resolved): osd/ReplicatedPG.cc: 4753: FAILED assert(!pg_log.get_missing().is_missing(s...
Samuel Just
09:00 PM Fix #5388: osd: localized reads (from replicas) reordered wrt writes
... Sage Weil
10:21 AM Fix #5388: osd: localized reads (from replicas) reordered wrt writes
Sage Weil
09:33 AM Fix #5388: osd: localized reads (from replicas) reordered wrt writes
I've just reproduced it with those log levels.
There was 1 master, 1 regionserver.
So I think the writing and readi...
Mike Bryant
08:04 AM Fix #5388 (Need More Info): osd: localized reads (from replicas) reordered wrt writes
Hi Mike,
I gather that the same data was just written by a different node in the cluster? And this is right near/...
Sage Weil
03:40 AM Fix #5388 (New): osd: localized reads (from replicas) reordered wrt writes
I'm using hbase, with the hadoop-cephfs bindings, on top of a ceph 0.61 cluster.
I'm seeing instances where reading ...
Mike Bryant
08:37 PM Bug #5395 (Can't reproduce): arm: osd: big performance differential between read/write
-- arm --
Raw /dev/rbd device
$ sudo dd if=/dev/zero of=/dev/rbd1 bs=4M count=128 conv=fdatasync
128+0 record...
Sage Weil
05:43 PM Bug #5084: osd: slow peering after osd restart (bobtail)
I tried wip_cuttlefish_compact_on_startup today. First, I upgraded one box to 0.61.3-47-g47f1bed-1precise.
Then, w...
Faidon Liambotis
05:02 PM devops Bug #5266 (Closed): the apt-get install instructions are missing an update
Verified fixes, thanks. Yan-Fa Li
02:32 PM devops Bug #5266 (Resolved): the apt-get install instructions are missing an update
See:
http://ceph.com/docs/master/start/quick-start-preflight/#install-ceph-deploy
http://ceph.com/docs/master/rado...
John Wilkins
04:59 PM devops Bug #5306: Xen based OSDs fail to start ceph-osd process
I will do this tomorrow. The xen box is temporarily down. Yan-Fa Li
01:46 PM devops Bug #5306 (Need More Info): Xen based OSDs fail to start ceph-osd process
can you retest on latest cuttlefish branch? (ceph-deploy install --dev=cuttlefish) Sage Weil
04:07 PM Documentation #3391: doc: add instructions on snapshot reversion
Actually, this isn't a documentation oversight. There's no means of rolling back an entire snapshot on a pool visible... John Wilkins
03:29 PM Bug #5226: Some PG stay in "incomplete" state
Well, I have pools on that cluster which are fine (thanks to 3 copies); so how can I recover a HEALTH_OK status, s... Olivier Bonvalet
01:05 PM Bug #5226 (Won't Fix): Some PG stay in "incomplete" state
nothing much to be done here if 2 disks were replaced/failed Sage Weil
03:18 PM rgw Documentation #5178 (Resolved): rgw: fix keystone openssl to nss conversion
See http://ceph.com/docs/master/radosgw/config/#integrating-with-openstack-keystone John Wilkins
03:17 PM devops Bug #5211: ceph-disk prepare: list_partitions() shouldn't return disks
Came up with this: https://gist.github.com/alram/33ea3360d5aa6a86e8a4
Alexandre Marangone
02:38 PM devops Bug #5211: ceph-disk prepare: list_partitions() shouldn't return disks
i think the right way to do this is to look to see if /sys/block/$disk/$part exist (e.g., /sys/block/sda/sda1) to tel... Sage Weil
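The sysfs check suggested in the comment above could be sketched roughly as follows. This assumes the intent is to tell partitions apart from whole disks, as the bug title suggests. The function name `is_partition` and the `sys_block` parameter are hypothetical and exist only to make the idea testable; this is not the ceph-disk implementation.

```python
import os


def is_partition(dev_name, sys_block="/sys/block"):
    """Return True if dev_name (e.g. 'sda1') is a partition of some disk.

    A partition shows up as a subdirectory of its parent disk, for
    example /sys/block/sda/sda1, while a whole disk such as 'sda'
    only appears directly under /sys/block.
    """
    try:
        disks = os.listdir(sys_block)
    except OSError:
        return False
    for disk in disks:
        # Partition names extend the parent disk's name (sda -> sda1).
        if dev_name == disk or not dev_name.startswith(disk):
            continue
        if os.path.isdir(os.path.join(sys_block, disk, dev_name)):
            return True
    return False


if __name__ == "__main__":
    for name in ("sda", "sda1"):
        print(name, is_partition(name))
```

The prefix check keeps other sysfs subdirectories of a disk (for example `queue`) from being mistaken for partitions.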
01:43 PM devops Bug #5211: ceph-disk prepare: list_partitions() shouldn't return disks
Alexandre, please implement one of the suggestions you mentioned. Ian Colle
02:21 PM devops Bug #5182 (Won't Fix): ceph-disk looks like it tries to mark preexisting OSD partitions with the ...
this is correct.. if you do ceph-disk prepare /dev/sdb1 (a partition) we don't touch the partition type. if you do /... Sage Weil
02:14 PM devops Bug #5161 (Pending Backport): daemons should create /var/run/ceph if it doesn't already exist
Sage Weil
02:02 PM devops Bug #5338 (In Progress): need rpm packages built for libapache-mod-fastcgi
Sage Weil
01:53 PM devops Bug #5263 (Resolved): Python Error While Installing ceph-deply on debian wheezy
Sage Weil
01:52 PM devops Bug #5299 (Won't Fix): ceph-deploy fails with cryptic error message if expected directories not f...
Sage Weil
01:52 PM Bug #5392: osd: unfound objects from thrashing
There seem to be a lot of threads waiting on throttle. Unfortunately, the test timed out before I could get more inf... Samuel Just
01:50 PM Bug #5392: osd: unfound objects from thrashing
... Samuel Just
11:21 AM Bug #5392 (Resolved): osd: unfound objects from thrashing
... Sage Weil
01:51 PM devops Bug #5342 (Resolved): Make tcmalloc default on ARM
Sage Weil
01:51 PM devops Bug #5339 (Resolved): ceph-deploy suite failures, 'insufficient osds'
Sage Weil
01:50 PM devops Bug #5359 (Resolved): ceph-deploy: install and purge commands on rhel sometimes errors out though...
Sage Weil
01:49 PM devops Bug #5066 (Resolved): Problems with ceph-deploy debs
Sage Weil
01:48 PM devops Bug #5199 (Resolved): ceph-deploy: on fedora18, osd create command doesnt seem to mount the disks
Sage Weil
01:47 PM Bug #5301: mon: leveldb crash in tcmalloc
debian wheezy (7.0) Maciej Galkiewicz
01:01 PM Bug #5301: mon: leveldb crash in tcmalloc
what distro are you using? this sounds a bit like #5239 Sage Weil
01:47 PM devops Bug #5258 (Resolved): ceph-deploy: forgetkeys command could delete existing keyring files without...
commit:953bee3cc66d19ef9b201299fc82c270587936a9 Sage Weil
01:46 PM devops Bug #4916 (Resolved): ceph-deploy: mon create fails on bobtail branch in centos 6.3
Sage Weil
01:45 PM devops Bug #5334 (Resolved): ceph-deploy: "modules not installed"
Sage Weil
01:40 PM devops Bug #5345 (Need More Info): ceph-disk: handle less common device names
Sage Weil
01:32 PM Bug #5389 (Resolved): osd: op_tp timeout on big cluster + radosmodel
Samuel Just
12:18 PM Bug #5389: osd: op_tp timeout on big cluster + radosmodel
... Samuel Just
07:26 AM Bug #5389 (Resolved): osd: op_tp timeout on big cluster + radosmodel
no errors in kern.log, so we can't blame this on the kernel.... Sage Weil
01:29 PM Bug #4268 (Can't reproduce): mon: timecheck: teuthology task fails due to unreported timecheck fr...
Joao Eduardo Luis
01:28 PM Bug #4189 (Resolved): osd/ReplicatedPG.cc: 4994: FAILED assert(log.objects.count(soid) ...
Samuel Just
01:28 PM Bug #4265 (Won't Fix): ceph-deploy new doesn't support multiple monitors on one host.
Sage Weil
01:28 PM Bug #4216 (Resolved): osd: dbojectmap incorrectly skipping ops
Sage Weil
01:27 PM Bug #3683 (Resolved): mon: leak of MMonPaxos
Sage Weil
01:26 PM Bug #3723 (Can't reproduce): ceph osd down command reports incorrectly
Sage Weil
01:26 PM Bug #3607 (Resolved): FileStore::_write conditional code for HAVE_SYNC_FILE_RANGE seems wrong
Sage Weil
01:25 PM Bug #3593 (Can't reproduce): MDS crash in MDCache.cc _recovered()
Sage Weil
01:25 PM Bug #2563 (Resolved): leveldb corruption
Samuel Just
01:24 PM Bug #3576 (Resolved): scripe scripts broken after upgrade to 0.55
Sage Weil
01:24 PM Bug #3182 (Can't reproduce): No JSON object could be decoded - failure in the nightly run
Sage Weil
01:24 PM Bug #3287 (Resolved): OSD dies when using zfs
Sage Weil
01:23 PM Bug #3458 (Can't reproduce): aio enabled but not used
Sage Weil
01:23 PM Bug #3644 (Resolved): ObjectCacher: discard_set ignores waiters
Sage Weil
01:23 PM Bug #3771 (Resolved): ceph does not have startup scripts in Centos
Sage Weil
01:23 PM Bug #3537 (Won't Fix): Logs can run root out of space and crash ceph cluster (need more aggressiv...
Sage Weil
01:21 PM Bug #4041 (Can't reproduce): mon: Single-Paxos: on Paxos, leader didn't trim old versions
Sage Weil
01:20 PM Bug #2896 (Won't Fix): ceph pg dump has empty hb_out field
it's vestigial. Sage Weil
01:18 PM Bug #4523 (Duplicate): osd: read stats not updated
Sage Weil
01:18 PM Bug #4723 (Can't reproduce): FAILED assert(!db->create_and_open(std::cerr)) after IO Error.
Ian Colle
01:15 PM Bug #5052 (Duplicate): kclient_workunit_misc test failed in the nightlies
Sage Weil
01:14 PM Bug #5074 (Can't reproduce): nightlies: timed out waiting for admin socket of restarted osd
Sage Weil
01:13 PM Bug #5059 (Won't Fix): PGs can get stuck degraded if OSD removed before being out
Sage Weil
01:11 PM Bug #5082 (Can't reproduce): OSD wrongly marked as down
Sage Weil
01:10 PM Bug #4856 (Won't Fix): monitor: upgrades produce "client did not provide supported auth type" in log
Sage Weil
01:07 PM Bug #3143 (Won't Fix): Obsync object verification takes too long
https://github.com/dreamhost/obsync Sage Weil
01:07 PM Bug #5173 (Can't reproduce): ceph scrub found missing pg object
Sage Weil
01:04 PM Bug #5205: mon: FAILED assert(ret == 0) on config's set_val_or_die() from pick_addresses()
Sage Weil
01:04 PM Bug #5292 (In Progress): mon: monitor crashing due to not being in the monmap (no monmap to be in)
Monitor is not in the monmap because there is no monmap. This should be due to a sync bug (related to #5256) that re... Joao Eduardo Luis
12:57 PM Bug #5343 (Resolved): mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
Sage Weil
12:51 PM CephFS Bug #5289 (Can't reproduce): mds closing stale session
this is caused when the client is not talking to the mds. can you verify the network is working, and ceph-fuse is hea... Sage Weil
12:50 PM Bug #5288 (Resolved): ceph.py: catch rados errors and print them nicely
Sage Weil
12:49 PM Bug #4179 (Resolved): osd: memory leak during deep scrub on bobtail
Sage Weil
12:49 PM Bug #5163 (Can't reproduce): filestore: ENOTEMPTY on object removal
Samuel Just
12:48 PM Bug #5246 (Resolved): mon crashing on pool/pg creation with wip-mon
Sage Weil
12:48 PM Bug #5157 (Resolved): install: unable to pull ceph rpm packages on fedora18
Sage Weil
12:44 PM Bug #3829 (Can't reproduce): new osd added to the cluster is not receiving data
Sage Weil
12:43 PM Bug #4764 (Can't reproduce): ceph -w sometimes does not reflect clean pgs
Sage Weil
12:42 PM Bug #5072 (Can't reproduce): mon: segfault on leveldb::Table::Open() during monitor start
Sage Weil
12:42 PM Bug #4791 (Can't reproduce): osd/ReplicatedPG.cc: 7053: FAILED assert(r >= 0) in scan_range
Sage Weil
12:35 PM Bug #5238 (Resolved): osd: slow recovery (uselessly dirtying pg logs during peering)
Sage Weil
12:01 PM devops Feature #5393 (Rejected): ceph-disk: prepare should warn when using partitions
When using ceph-disk prepare with already created partitions, we do not set the partition uuid, thus the udev rules a... Alexandre Marangone
11:11 AM Bug #5383 (Resolved): arm write EFBIG
6b52acc8502ec16e2d0b89d8caf6235ec45778cb Samuel Just
10:54 AM Bug #5069: monitor crashed during mon thrash in nightlies
Forgot to mention that the sync flag is set on the store. Sage pointed out that the real issue here is that we're al... Joao Eduardo Luis
10:32 AM Bug #5069: monitor crashed during mon thrash in nightlies
I've been able to reproduce this on some locked nodes that were hammering the monitors pretty hard for the past week.... Joao Eduardo Luis
09:34 AM CephFS Bug #5379 (Resolved): mds/ceph-fuse hang on mount
Sage Weil
08:26 AM rbd Bug #5391 (Duplicate): krbd: crash in rbd_obj_request_create -> strlen
... Sage Weil
07:48 AM Bug #5272 (Can't reproduce): Updating ceph from 0.61.2 to 0.61.3 obviously changes tunables of ex...
Sage Weil
12:47 AM Bug #5272: Updating ceph from 0.61.2 to 0.61.3 obviously changes tunables of existing cluster
As I re-encountered the same issue without upgrading, just restarting MDS daemon, I think this tracker issue may be c... To Pro

06/17/2013

11:59 PM Bug #5375: squeeze tcmalloc leaks
[: ~]# pmap -x 11783|tail -n1
total kB 1547412 688752 685152
[: ~]# ceph -m 10.255.0.100:6789 heap stat...
Stefan Priebe
09:40 AM Bug #5375: squeeze tcmalloc leaks
Stefan, could you please try (for your monitor's IP and PORT):... Joao Eduardo Luis
06:31 AM Bug #5375 (Resolved): squeeze tcmalloc leaks
While running cuttlefish 0.61.3 or 08304a7c46da7517319b7db0b64d1c4f54771472
i'm seeing high memory usage of ceph-mo...
Stefan Priebe
09:15 PM CephFS Bug #5381: ceph-fuse: stuck with disconnected inodes on shutdown
This is different from #4850. In issue #4850, disconnected inodes have no cap. In this issue, all disconnected inodes... Zheng Yan
01:32 PM CephFS Bug #5381: ceph-fuse: stuck with disconnected inodes on shutdown
Good chance this is a duplicate of #4850 (though that's fsstress, so maybe not). Greg Farnum
01:22 PM CephFS Bug #5381 (Resolved): ceph-fuse: stuck with disconnected inodes on shutdown
Seen this at least 2x in the last few days:... Sage Weil
08:37 PM rgw Bug #5357 (In Progress): rgw: set and retrieve intra-region copy operation state
Yehuda Sadeh
08:36 PM rgw Bug #5351 (Resolved): rgw: make sure wip-rgw-geo passes gitbuilder
Yehuda Sadeh
08:35 PM devops Bug #5387 (Resolved): ceph-disk: lockfile does not detect stale locks (dead parent process)
python lockfile class does not detect when the prior lock owner process is gone. we should switch to a class that do... Sage Weil
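One way to detect the situation described above is to record the owner's pid in the lock file and check whether that process still exists. The sketch below assumes a lock file whose entire content is the owner's pid; that file format and the names `pid_is_alive` and `lock_is_stale` are hypothetical, and nothing here is taken from the Python lockfile package or from ceph-disk.

```python
import os


def pid_is_alive(pid):
    """Return True if a process with this pid currently exists."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 performs the checks without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def lock_is_stale(lock_path):
    """Return True if the lock file names an owner pid that no longer exists."""
    try:
        with open(lock_path) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        # Missing or unparsable lock file: not treated as stale in this sketch.
        return False
    return not pid_is_alive(pid)


if __name__ == "__main__":
    # Hypothetical lock path, for illustration only.
    print(lock_is_stale("/tmp/example.lock"))
```

A pid check like this can be fooled if the pid has been reused by an unrelated process, which is one reason a kernel-managed lock may be preferable.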
05:43 PM CephFS Bug #5380: osdc/Filer.cc: 163: FAILED assert(probe->known_size[p->oid] <= shouldbe)
see commit a41bad1a9b(ceph: re-calculate truncate_size for strip object) Zheng Yan
01:18 PM CephFS Bug #5380 (Resolved): osdc/Filer.cc: 163: FAILED assert(probe->known_size[p->oid] <= shouldbe)
on mds shutdown... Sage Weil
05:34 PM rgw Feature #5354 (Fix Under Review): rgw: intra-region object copy should also set mtime on object
Yehuda Sadeh
04:44 PM CephFS Bug #5379: mds/ceph-fuse hang on mount
Sage Weil
12:52 PM CephFS Bug #5379 (Resolved): mds/ceph-fuse hang on mount
have observed several times ceph-fuse hanging on getattr(#1). latest job was... Sage Weil
03:55 PM Bug #5373 (Can't reproduce): osd: dump_stuck test fails on tell
Sage Weil
03:49 PM devops Bug #5194 (Resolved): udev does not start osd after reboot on wheezy or el6 or fedora
Sage Weil
02:54 PM devops Bug #5194 (Fix Under Review): udev does not start osd after reboot on wheezy or el6 or fedora
now works on rhel, centos, wheezy, precise. f18 still has the mon start issue. Sage Weil
02:16 PM Bug #5383 (Resolved): arm write EFBIG
2013-06-17 15:05:31.066237 a6919420 0 -- 10.214.156.115:6800/7870
submit_message osd_op_reply(30
...
Samuel Just
02:09 PM CephFS Bug #5382: mds: failed objecter assert on shutdown
Sorry, logs at /a/teuthology-2013-06-15_01:00:44-fs-next-testing-basic/36375 Greg Farnum
02:07 PM CephFS Bug #5382 (Can't reproduce): mds: failed objecter assert on shutdown
I haven't been through this completely, but it looks like the mds went laggy, and then it received a SIGTERM (the tes... Greg Farnum
01:26 PM Bug #5269 (Resolved): osd: EEXIST on mkcoll
Samuel Just
10:08 AM Bug #5269: osd: EEXIST on mkcoll
ubuntu@teuthology:/a/teuthology-2013-06-17_01:00:05-rados-master-testing-basic/37637 Sage Weil
12:36 PM rgw Bug #5362: rgw: failure when listing objects with prefix that starts with underscore
I confirmed that this was tested, and I built it on all the branches:
next as of commit:d582ee2438a3bd307324c5f44491...
Greg Farnum
12:35 PM rgw Bug #4600: rgw: list bucket broken when marker start with underscore
Cherry-picked this commit into bobtail as well, in commit:a8f9d57a15ad7a69d53aa8fc6090fd1b394b616a. It got missed in ... Greg Farnum
12:25 PM Bug #5366 (Resolved): assert in ODSMap::is_blacklisted()
Sage Weil
09:40 AM Bug #5366: assert in ODSMap::is_blacklisted()
Sam, please review. Ian Colle
12:24 PM CephFS Bug #5368 (Resolved): ceph-fue: fsx-mpi hangs in _sync_read
commit:ee40c217e373b538e227f7218b09c1c794b4124a Sage Weil
11:49 AM rbd Bug #4446: librbd: crash from opensolaris vm
I just upgraded to KVM 1.4.2 -- same problem. Jeff Moskow
11:04 AM rgw Bug #5378 (Resolved): make radosgw-admin user rm idempotent
It would be extremely useful for radosgw-admin user rm to be idempotent, specifically so that it will return success ... JuanJose Galvez
09:42 AM Bug #5340 (Resolved): Bad arguments to zero will cause OSD to crash
Sage Weil
05:42 AM rgw Bug #5374 (Resolved): Avoid relying on keystone's admin token
The current Keystone integration requires knowledge of the keystone admin token. The keystone admin token is for Keys... Soren Hansen

06/16/2013

08:36 PM Bug #5269: osd: EEXIST on mkcoll
Running with logging overnight to reproduce. Samuel Just
08:08 PM Bug #5269: osd: EEXIST on mkcoll
and... Sage Weil
07:58 PM Bug #5269: osd: EEXIST on mkcoll
don't think this was #5270.. just hit it on... Sage Weil
08:09 PM Bug #5373 (Can't reproduce): osd: dump_stuck test fails on tell
... Sage Weil
04:52 PM Bug #5372 (Duplicate): osd/SnapMapper.cc: 270: FAILED assert(check(oid))
... Sage Weil
10:04 AM Bug #5370 (Resolved): ceph tool occasionally hangs
Sage Weil
10:01 AM Bug #5370: ceph tool occasionally hangs
fixed by ceph-qa-suite commit:73413642d7a1a1aa09cfa240cadba925b1ba812d Sage Weil
05:50 AM CephFS Bug #5367: multiclient tests: kernel mount gets EPERM
kclient and MDS never return -EACCES. was ior executed with root privilege? Zheng Yan

06/15/2013

10:10 PM rgw Feature #4310 (Fix Under Review): rgw: multisite: radosgw changes: copy across regions
Yehuda Sadeh
10:09 PM rgw Bug #5362 (Fix Under Review): rgw: failure when listing objects with prefix that starts with unde...
Yehuda Sadeh
10:09 PM rgw Feature #5352 (Fix Under Review): rgw: metadata get should also dump mtime
Yehuda Sadeh
10:08 PM rgw Feature #5353 (Fix Under Review): rgw: metadata put should apply mtime if set
Yehuda Sadeh
08:49 PM Bug #5366: assert in OSDMap::is_blacklisted()
commit:f25f212027294e5107fc9938e67d31879c171088 merged to fix the weekend qa runs. still should get a review. Sage Weil
09:10 AM Bug #5366 (Resolved): assert in OSDMap::is_blacklisted()
wip pushed Sage Weil
08:46 PM Bug #5371 (Resolved): idempotent filestore test failure
... Sage Weil
08:10 PM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Sage Weil
08:09 PM devops Bug #5363 (Resolved): specfile: ceph does not start on reboot
Sage Weil
08:09 PM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
update:
* wheezy is working well.
* fedora is failing only because the mon doesn't start on boot. see #5369
* r...
Sage Weil
07:57 PM Bug #5370 (Resolved): ceph tool occasionally hangs
"description": "/var/lib/teuthworker/archive/teuthology-2013-06-15_01:00:11-rados-next-testing-basic/36197",
...
Sage Weil
07:50 PM devops Bug #5369 (Resolved): fedora18: sysvinit doesn't start mon on reboot
mon log indicates it can't bind to the ip, suggesting it is starting before the network. however, note that... Sage Weil
07:46 PM CephFS Bug #5367: multiclient tests: kernel mount gets EPERM
mpi-fsx also gets EPERM. Sage Weil
07:15 PM CephFS Bug #5367 (Resolved): multiclient tests: kernel mount gets EPERM
... Sage Weil
07:45 PM CephFS Bug #5368 (Resolved): ceph-fuse: fsx-mpi hangs in _sync_read
infinite loop in _sync_read() due to a short read. see wip-client-sync. Sage Weil
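The failure mode named here, a read loop that never terminates after a short read, can be illustrated generically. The following Python sketch is not the ceph-fuse client code or the wip-client-sync fix; `read_fully`, `read_at`, and `make_short_reader` are hypothetical names used only to show a loop that advances by the bytes actually returned and stops at end of file.

```python
def read_fully(read_at, offset, length):
    """Read up to `length` bytes starting at `offset`.

    `read_at(offset, length)` is a hypothetical callback that may return
    fewer bytes than requested (a short read) or b"" at end of file.
    """
    chunks = []
    remaining = length
    while remaining > 0:
        data = read_at(offset, remaining)
        if not data:
            # Nothing more to read: stop instead of retrying forever.
            break
        chunks.append(data)
        # Advance by what was actually returned, not by what was asked for.
        offset += len(data)
        remaining -= len(data)
    return b"".join(chunks)


def make_short_reader(payload, max_chunk):
    """Build a reader that never returns more than `max_chunk` bytes per call."""
    def read_at(offset, length):
        return payload[offset:offset + min(length, max_chunk)]
    return read_at
```

The two exit conditions (request satisfied, or an empty read) are what keep the loop finite.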
08:19 AM Bug #5365 (Rejected): Massive OSD flaps
Note that the current development releases include more robust heartbeat checks and a backoff behavior that prevents ... Sage Weil
03:10 AM Bug #5365: Massive OSD flaps
I found a networking bug (not full connectivity). The ticket can be closed.
The reason was that new osd host was unable ...
Ivan Kudryavtsev
03:05 AM Bug #5365: Massive OSD flaps
During upgrade I restarted services on all nodes. Ivan Kudryavtsev
02:55 AM Bug #5365: Massive OSD flaps
I upgraded the full cluster to
new: ceph version 0.56.6 (95a0bda7f007a33b0dc7adf4b330778fa1e5d70c)
but it still flap...
Ivan Kudryavtsev
02:31 AM Bug #5365 (Rejected): Massive OSD flaps
Hi, all.
Today I added one more node to my Ceph cluster and it became unstable; I mean that it's unable to work with ...
Ivan Kudryavtsev

06/14/2013

11:28 PM rgw Feature #5349 (Fix Under Review): rgw: intra-region object copy
Yehuda Sadeh
11:01 AM rgw Feature #5349 (Resolved): rgw: intra-region object copy
This should also include the ability to copy namespaced objects (to be able to copy multipart upload parts). Yehuda Sadeh
06:11 PM rgw Bug #5348 (Fix Under Review): rgw: missing copy constraints checks for inter region user object copy
Yehuda Sadeh
11:00 AM rgw Bug #5348 (Resolved): rgw: missing copy constraints checks for inter region user object copy
Yehuda Sadeh
06:04 PM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
rhel seems to be working, fedora18 is acting very strange. Sage Weil
02:06 PM devops Bug #5194 (In Progress): udev does not start osd after reboot on wheezy or el6 or fedora
thanks- i now see the problem (and can reproduce it here, yay!). testing a fix Sage Weil
01:09 PM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
Hi Sage,
attached is the current syslog.
I started "partprobe /dev/sdb" at Jun 14 21:57:06 and "partprobe /dev/...
Robert Sander
01:04 PM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
Can you generate and attach a udev log after the reboot? Actually, ideally,
- reboot
- note the time
- run part...
Sage Weil
12:59 PM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
Sage Weil wrote:
> Can you grab
>
> https://github.com/ceph/ceph/blob/master/src/ceph-disk and copy it to /usr/...
Robert Sander
12:43 PM devops Bug #5194 (Need More Info): udev does not start osd after reboot on wheezy or el6 or fedora
Hi Robert,
Can you grab
https://github.com/ceph/ceph/blob/master/src/ceph-disk and copy it to /usr/sbin
https:...
Sage Weil
12:42 PM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
Sage Weil
05:48 PM Bug #5343: mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
nope wrong ticket; ignore Sage Weil
05:32 PM Bug #5343: mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
Sage, was that reply intended for this ticket? If it was I'm surely missing something... Joao Eduardo Luis
01:03 PM Bug #5343: mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
Can you generate and attach a udev log after the reboot? Actually, ideally,
- reboot
- note the time
- run part...
Sage Weil
12:44 PM Bug #5343 (Pending Backport): mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
pushed.. will backport once we have done more testing Sage Weil
10:45 AM Bug #5343: mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
I ran the following test for an already existing single-monitor setup:
* generate monmap with random fsid
* injec...
Joao Eduardo Luis
09:28 AM Bug #5343: mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
Greg pointed out that it's likely the fsid issue results from messing around with the monmap's fsid. Setting up a te... Joao Eduardo Luis
09:01 AM Bug #5343: mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
Running gdb, looks like the 2810's incremental fsid is different from the OSDMap's fsid:... Joao Eduardo Luis
07:41 AM Bug #5343 (In Progress): mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
Joao Eduardo Luis
07:33 AM Bug #5343 (Resolved): mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
A user on ceph-users shared a log containing a most interesting behavior happening on OSDMonitor::update_from_paxos()... Joao Eduardo Luis
03:44 PM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
sandon put wheezy on these mira for us to test this locally: mira09[456] Sage Weil
03:04 PM devops Bug #5363 (Resolved): specfile: ceph does not start on reboot
testing fix Sage Weil
02:54 PM rgw Bug #5347 (Fix Under Review): rgw: bucket marker should include original zone name
Yehuda Sadeh
11:00 AM rgw Bug #5347 (Resolved): rgw: bucket marker should include original zone name
To avoid marker collisions Yehuda Sadeh
02:41 PM rgw Bug #5362 (Resolved): rgw: failure when listing objects with prefix that starts with underscore
Yehuda Sadeh
02:40 PM Bug #5062 (Can't reproduce): mon: 0.61.2 asserts on AuthMonitor during monitor start
Sage Weil
02:39 PM devops Feature #5361 (Resolved): ceph-all should start after networking but before runlevel [2345]
just in case other system services rely on it being up. Sage Weil
02:38 PM devops Bug #5248 (Resolved): upstart: ceph-all job is starting too soon
hmm opening a separate bug for the 'start earlier than this' part. Sage Weil
12:39 PM devops Bug #5248: upstart: ceph-all job is starting too soon
changing this to runlevel [2345] for now. Sage Weil
02:38 PM devops Feature #3302 (Resolved): ceph-disk: activate-journal, and matching udev rule
Sage Weil
02:23 PM devops Feature #3302: ceph-disk: activate-journal, and matching udev rule
commit:a2a78e8d16db0a71b13fc15457abc5fe0091c84c Sage Weil
02:18 PM devops Bug #5189 (Resolved): ceph-deploy disk prepare fails silently
this is now working with the fixes from #4984. Sage Weil
02:14 PM devops Bug #4984 (Resolved): ceph_deploy: osd create succeeds with an error message (partprobe returns e...
woot! tested and backported to cuttlefish!
still issues on reboot with wheezy... #5194
Sage Weil
01:08 PM Bug #5326 (Resolved): mon: osd crush add ... command broken
commit:9a7ed0b3f8df5bd74133f216bad61ae71eab0816, tho this actual error was a problem with the ceph cli sometime in te... Sage Weil
12:50 PM CephFS Bug #5360 (Rejected): ceph-fuse: failing smbtorture tests
We're failing the maxfid test when samba is backed by a ceph-fuse mount. It seems to be an inconsistent (this is the ... Greg Farnum
11:39 AM devops Bug #5359 (Resolved): ceph-deploy: install and purge commands on rhel sometimes errors out though...
install command on rhel platform errors out though the command is successful and ceph is installed,
the error mess...
Tamilarasi muthamizhan
11:09 AM rgw Feature #5358 (Resolved): rgw: RESTful api for intra-region copy state
Yehuda Sadeh
11:08 AM rgw Bug #5357 (Resolved): rgw: set and retrieve intra-region copy operation state
Yehuda Sadeh
11:07 AM rgw Feature #5356 (Rejected): rgw: RESTful api for bucket upstream zone + marker info
Yehuda Sadeh
11:07 AM rgw Feature #5355 (Rejected): rgw: get and set bucket upstream zone + marker info
Yehuda Sadeh
11:06 AM rgw Feature #5354 (Resolved): rgw: intra-region object copy should also set mtime on object
Yehuda Sadeh
11:05 AM rgw Feature #5353 (Resolved): rgw: metadata put should apply mtime if set
Yehuda Sadeh
11:05 AM rgw Feature #5352 (Resolved): rgw: metadata get should also dump mtime
Yehuda Sadeh
11:04 AM rgw Bug #5351 (Resolved): rgw: make sure wip-rgw-geo passes gitbuilder
Yehuda Sadeh
11:03 AM rgw Feature #5350 (New): rgw: copy object metadata should include omap data for object
That's needed for copying multipart head objects Yehuda Sadeh
10:56 AM devops Bug #5339: ceph-deploy suite failures, 'insufficient osds'
changing the priority as this has nothing to do with ceph-deploy,
leaving it in this state until the nightlies succ...
Tamilarasi muthamizhan
10:18 AM Bug #5252 (Resolved): osd: EINVAL from truncate causes osd to crash
commit:f1b6bd7988ab964c9167eff7bea51a49573f5175 Sage Weil
08:50 AM rgw Bug #5346 (Resolved): rgw: invalid read from RGWFormatter_Plain::write_data
ubuntu@teuthology:/a/teuthology-2013-06-14_01:00:36-rgw-master-testing-basic/35856$ zless ./remote/ubuntu@plana63.fro... Sage Weil
08:35 AM devops Bug #5345 (Resolved): ceph-disk: handle less common device names
/dev/sdaa*
/dev/cciss/c0d0p1
etc.
Sage Weil
08:21 AM rgw Bug #5344 (Resolved): rgw: make list of bucket placement pools index configurable
The object containing the list of placement pools is hard coded, make it configurable (through ceph.conf). Yehuda Sadeh

06/13/2013

07:43 PM CephFS Bug #5333: mds: segfault in MDLog::standby_trim_segments
I think it's an old race. The standby MDS gets the pos of journal head, then reads the corresponding journal object. ... Zheng Yan
02:02 PM CephFS Bug #5333: mds: segfault in MDLog::standby_trim_segments
I see that Yan changed one line in this function recently (which shouldn't have had any impact), but other than that ... Greg Farnum
05:41 PM devops Bug #5339: ceph-deploy suite failures, 'insufficient osds'
modified ceph-deploy task to throw appropriate exceptions in case of failures.
most of the ceph-deploy tests have ...
Tamilarasi muthamizhan
10:48 AM devops Bug #5339 (Resolved): ceph-deploy suite failures, 'insufficient osds'
The cluster is NOT operational due to insufficient OSDs Sage Weil
04:37 PM devops Bug #5342 (Resolved): Make tcmalloc default on ARM
tcmalloc usage needs to be enabled on ARM. While packages are not available on all platforms yet, the locally compil... Anonymous
04:03 PM devops Feature #3302 (Fix Under Review): ceph-disk: activate-journal, and matching udev rule
this was causing unreliable ubuntu activation, at least in my case Sage Weil
03:01 PM rgw Feature #5341 (Resolved): rgw: keep state for cross-rgw copy operations
Need to implement a new class that'd index the data. Yehuda Sadeh
01:52 PM devops Bug #5283: Ceph-deploy can't handle /dev/disk/by-* device paths
With the by-id path which does not have embedded colons:... Anonymous
12:52 PM devops Bug #5283: Ceph-deploy can't handle /dev/disk/by-* device paths
glowell@gary-ubuntu-01:~/ceph-deploy$ ./ceph-deploy osd create gary-ubuntu-01:/dev/disk/by-path/pci-0000:00:07.0-scsi... Anonymous
12:28 PM devops Bug #5309 (Closed): ceph-deploy mon create fails to start monitor daemon
Issue is no longer occurring after recent commits to ceph-deploy. Not sure which one fixed it but around 10 June. Anonymous
10:49 AM devops Bug #4984 (In Progress): ceph_deploy: osd create succeeds with an error message (partprobe return...
Sage Weil
10:49 AM Bug #5329 (Resolved): ceph osd tell * injectargs broken
commit:3abd2d8bc94ab77364345e3f830cfb83124df31d Sage Weil
10:49 AM Bug #5340 (Resolved): Bad arguments to zero will cause OSD to crash
Check offset/len arguments for zero operation so that later fallocate() error doesn't cause OSD to crash. David Zafman
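The idea of validating the range up front, so a bad request is rejected with an error code instead of failing later, can be sketched as follows. This is an illustration in Python rather than the OSD's actual C++ check; `check_zero_args`, `max_object_size`, and the choice of EINVAL for every rejection are assumptions.

```python
import errno


def check_zero_args(offset, length, max_object_size):
    """Validate a zero (punch-hole) request before touching the backing file.

    Hypothetical helper: returns 0 if the range is acceptable, otherwise a
    negative errno value that the caller can hand back to the client.
    """
    if offset < 0 or length < 0:
        return -errno.EINVAL
    if offset + length > max_object_size:
        # Range runs past the largest object this store accepts.
        return -errno.EINVAL
    return 0
```

Returning an error to the client here means the later system call never sees arguments it would reject.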
10:41 AM devops Bug #5338 (Resolved): need rpm packages built for libapache-mod-fastcgi
We currently have libapache-mod-fastcgi packages built for debs. It would be nice to have them built for rpms as well... Tamilarasi muthamizhan
10:23 AM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
Hi Sage,
this was a clean reboot of the cluster node.
As the filesystems have not been mounted automatically no...
Robert Sander
09:16 AM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
I see it starting osd.5 and osd.2:... Sage Weil
08:40 AM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
Hi,
attached is /var/log/syslog after booting the machine with udev debug level logging.
The filesystems have n...
Robert Sander
10:06 AM Bug #5227 (Need More Info): ARM set up: rados test failed
This sure looks a lot like #4879 which would have been fixed by 0.61. I thought I had grabbed the stores and the logs... Joao Eduardo Luis
09:37 AM devops Bug #5334: ceph-deploy: "modules not installed"
Update. I was able to get it installed correctly with the `ceph-deploy-1.0-0.noarch.rpm` package, but my understandin... Noah Watkins
09:28 AM Bug #5301: mon: leveldb crash in tcmalloc
Okay, regarding the crash, although I've been unable to figure out what or who (us or leveldb) may be causing it, the... Joao Eduardo Luis
08:28 AM devops Bug #5189: ceph-deploy disk prepare fails silently
Hi Sage,
We are currently testing with some Debian wheezy VMs on a VMware ESXi host.
root@ceph01-test:~# lsb_re...
Robert Sander
08:02 AM Bug #5336 (Can't reproduce): osd crash triggered by 'rbd rm ...'
Reported by Florian Wiessner on ML
looks like a stall in the op_tp.. requested detailed logs.
Sage Weil
07:48 AM Bug #5256: Upgraded bobtail->cuttlefish mon crashes, then can't resume the conversion
Okay, here's what is the likely order of events in this case:
* the monitor was converting when it was killed for ...
Joao Eduardo Luis
01:24 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Argh. I spoke too soon. We just had another crash this morning while deleting the benchmark pool. Using the staticall... Emil Renner Berthing

06/12/2013

08:37 PM rbd Feature #5335 (New): qa: test that kernel rbd and librbd can read images written by each other
This test would have caught an issue with format 2 object names being different in librbd and the kernel driver. Josh Durgin
06:17 PM Bug #5329 (Fix Under Review): ceph osd tell * injectargs broken
Sage Weil
12:29 PM Bug #5329 (Resolved): ceph osd tell * injectargs broken
... Sage Weil
06:13 PM Bug #5331 (Resolved): objecter: osd_command doesn't handle dne/down osd properly
commit:8808ca57c652502d9cf803b0dc53673ca9dd62af Sage Weil
01:02 PM Bug #5331 (Resolved): objecter: osd_command doesn't handle dne/down osd properly
we return an error but don't trigger the callback or clean up... Sage Weil
05:51 PM devops Bug #5259 (Duplicate): osd create command fails inconsistently on ubuntu
i think we should call this a dup of the other bug.. this is all about udev vs partprobe vs udevadm settle races. se... Sage Weil
04:13 PM devops Bug #5334 (Resolved): ceph-deploy: "modules not installed"
Using cuttlefish RPM install for CentOS 6.4. Ceph-deploy is installed on all the nodes. I get the following:
<pre...
Noah Watkins
02:00 PM Bug #5327 (Resolved): cephtool/test.sh fails
commit:701943a27857fcad7fbb405cf95a59c945fea815 Sage Weil
11:44 AM Bug #5327 (Resolved): cephtool/test.sh fails
... Sage Weil
01:49 PM Bug #5238 (Pending Backport): osd: slow recovery (uselessly dirtying pg logs during peering)
Sage Weil
01:46 PM Bug #5238: osd: slow recovery (uselessly dirtying pg logs during peering)
Maybe something different; I have this one:
http://tracker.ceph.com/issues/5232
and it makes a HUGE difference regar...
Stefan Priebe
01:44 PM Bug #5238: osd: slow recovery (uselessly dirtying pg logs during peering)
For what it's worth, I also tried it (wip_5238_cuttlefish specifically) per Sam's suggestion while troubleshooting #5... Faidon Liambotis
01:33 PM Bug #5238: osd: slow recovery (uselessly dirtying pg logs during peering)
we are going to test it a bit more in master before putting it in the cuttlefish branch. good to know this is helpin... Sage Weil
01:28 PM Bug #5238: osd: slow recovery (uselessly dirtying pg logs during peering)
Is this one missing in upstream/cuttlefish? It helps a lot. Stefan Priebe
01:39 PM Bug #5332 (Resolved): boost::get: key stuckops is not type std::vector<std::string, std::allocato...
commit:de1723834cf2cfe51cc991ece1b53624ff56d7d5 Sage Weil
01:05 PM Bug #5332 (Resolved): boost::get: key stuckops is not type std::vector<std::string, std::allocato...
2013-06-12T02:25:15.786 INFO:teuthology.task.ceph.mon.a.err:2013-06-12 02:26:06.734468 7f2e3ef1e700 -1 bad boost::get... Sage Weil
01:23 PM CephFS Bug #5333 (Resolved): mds: segfault in MDLog::standby_trim_segments
... Sage Weil
12:35 PM Bug #5330 (Resolved): ceph daemon <name> ... broken
it uses ceph-conf to get admin_socket, but that doesn't work. this does:
ubuntu@plana38:~$ ceph-osd -n osd.0 --s...
Sage Weil
12:23 PM rbd Feature #5168: openstack: cinder: rbd as a backup target
https://blueprints.launchpad.net/cinder/+spec/cinder-backup-to-ceph Josh Durgin
12:23 PM rbd Feature #5167: openstack: cinder: differential backups
https://blueprints.launchpad.net/cinder/+spec/cinder-backup-to-ceph Josh Durgin
11:42 AM Bug #5326 (Resolved): mon: osd crush add ... command broken
... Sage Weil
10:21 AM devops Bug #4984: ceph_deploy: osd create succeeds with an error message (partprobe returns error)
it should have been wip-4984 :) Tamilarasi muthamizhan
09:33 AM Bug #5312: Skip EXT4StoreTest._detect_fs test if DISK or MOUNTPOINT environment variables not set
1577e203f08c3f94c36fd128dda14e8bceeca7a9 Ian Colle
09:32 AM Bug #5311 (Resolved): Existence of parent directories for admin and bootstrap keys in ceph-create...
Sage Weil
08:18 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Ok, I tried the ubuntu leveldb package but in ubuntu leveldb is only built as a static library. So what I did was to ... Emil Renner Berthing
06:10 AM CephFS Bug #5290: mds: crash whilst trying to reconnect
Hi Zheng,
Is this what you mean?
Damien Churchill

06/11/2013

08:54 PM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
We need to gather some udev logs to diagnose this... can you change the level in /etc/udev/udev.conf to 'debug', rest... Sage Weil
08:50 PM Bug #4698 (Won't Fix): osd suicide timed out after 150
this was an ext4 bug:... Sage Weil
08:45 PM Bug #5062: mon: 0.61.2 asserts on AuthMonitor during monitor start
Do we have any logs or recent occurrences of this bug to go on, or mon logs of it happening?
If not, I think this ...
Sage Weil
08:43 PM devops Bug #5189 (Need More Info): ceph-deploy disk prepare fails silently
Hi Robert-
Are you still having this problem? Can you share a bit more information about the environment? What d...
Sage Weil
07:20 PM rgw Bug #5324 (Resolved): radosgw-admin --help missing the --shard-id option
The new 'mdlog trim' call requires a --shard-id option be specified but that option is not listed in the --help output. Anonymous
06:32 PM devops Bug #4984: ceph_deploy: osd create succeeds with an error message (partprobe returns error)
pushed wip-4948. works ok on centos/rhel, but we should verify it also behaves on ubuntu and debian. Sage Weil
06:19 PM rgw Bug #5323: trim data log lists dates as optional, enforced as required in the current code
I believe that the offending line is in ceph/src/rgw/rgw_rest_log.cc in the function RGWOp_MDLog_Delete::execute().
...
Anonymous
06:17 PM rgw Bug #5323 (Resolved): trim data log lists dates as optional, enforced as required in the current ...
In the wip-rgw-geo branch, the
DELETE /admin/log?id=<shard id>
call lists start-time and end-time as optional. How...
Anonymous
05:41 PM Linux kernel client Bug #4854 (Rejected): read more than they should
this is due to readahead. readahead can be disabled by posix_fadvise(2) Zheng Yan
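A minimal sketch of the posix_fadvise(2) hint mentioned above, using Python's `os.posix_fadvise` wrapper (Unix only). The helper name `open_without_readahead` is hypothetical, and POSIX_FADV_RANDOM is assumed to be the intended advice value; how far a given filesystem honors the hint is up to the kernel.

```python
import os


def open_without_readahead(path):
    """Open `path` read-only and advise the kernel that access is random.

    Hypothetical helper. POSIX_FADV_RANDOM is advisory: it asks the kernel
    not to read ahead on this file descriptor.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0 and length=0 apply the advice to the whole file.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
    except OSError:
        os.close(fd)
        raise
    return fd
```

The caller owns the returned descriptor and should close it with `os.close(fd)` when done.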
04:49 PM Bug #5310 (Resolved): StoreTest.ColSplitTest1 hits assert in _split_collection()
Samuel Just
11:13 AM Bug #5310 (Resolved): StoreTest.ColSplitTest1 hits assert in _split_collection()
$ ./ceph_test_filestore
...
[ RUN ] StoreTest.ColSplitTest1
2013-06-11 11:06:49.332610 7f38942e4780 1 filest...
David Zafman
04:30 PM Bug #5176 (Resolved): leveldb: Compaction makes things time-out yielding spurious elections
Sylvain Munaut wrote:
> I can try to do this tomorrow.
>
> But in the mean time I played with the paxos trimming ...
Sage Weil
11:16 AM Bug #5176: leveldb: Compaction makes things time-out yielding spurious elections
I can try to do this tomorrow.
But in the mean time I played with the paxos trimming values and made it go away.
...
Sylvain Munaut
08:12 AM Bug #5176 (Need More Info): leveldb: Compaction makes things time-out yielding spurious elections
Can you capture a debug mon = 20, debug paxos = 20, debug ms = 1 log that includes an election and send us the set of... Sage Weil
12:59 AM Bug #5176: leveldb: Compaction makes things time-out yielding spurious elections
fyi, I just upgraded from wip-5176 to 0.61.3 and those spurious elections are back. Sylvain Munaut
03:43 PM Bug #5320 (Resolved): osd/ReplicatedPG.cc: 4753: FAILED assert(!pg_log.get_missing().is_missing(s...
-901> 2013-06-11 14:02:22.138530 7f9bd4913700 5 filestore(/var/lib/ceph/osd/ceph-1) _do_op 0x1d4bfa0 seq 68202 osr... Samuel Just
02:56 PM Linux kernel client Bug #4614: Root cephfs does not mount at boot on Ubuntu 12.04
I can confirm this problem occurs on Ubuntu 12.04 as well. sam beckwith
01:46 PM Bug #5311: Existence of parent directories for admin and bootstrap keys in ceph-create-keys not c...
Yes, the packages do it right after the installation. But this does not mean that these dirs still exist when you run... Peter Wienemann
01:08 PM Bug #5311: Existence of parent directories for admin and bootstrap keys in ceph-create-keys not c...
Aren't these directories supposed to be installed by the packages? *Something* is doing it in the normal case or thes... Greg Farnum
01:06 PM Bug #5311: Existence of parent directories for admin and bootstrap keys in ceph-create-keys not c...
A fix is available as pull request #355. Peter Wienemann
12:56 PM Bug #5311 (Resolved): Existence of parent directories for admin and bootstrap keys in ceph-create...
The ceph-create-key script does not check the existence of the parent directories in which the admin and the bootstra... Peter Wienemann
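The kind of check described here can be sketched as creating the parent directory before writing the key file. This is a generic Python illustration, not the change in the pull request referenced in this ticket; `write_keyring` and its arguments are hypothetical.

```python
import os


def write_keyring(path, contents):
    """Write `contents` to `path`, creating the parent directory if needed.

    Hypothetical helper: a real tool would probably also set restrictive
    permissions on both the directory and the file.
    """
    parent = os.path.dirname(path)
    if parent:
        # exist_ok=True makes this a no-op when the directory is already there.
        os.makedirs(parent, exist_ok=True)
    with open(path, "w") as f:
        f.write(contents)
```

With this shape, a missing directory no longer turns into an unhandled error at write time.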
01:32 PM Bug #5312 (Resolved): Skip EXT4StoreTest._detect_fs test if DISK or MOUNTPOINT environment variab...
I disabled the ColSplitTest1/ColSplitTest2 tests (see bug #5310).
Currently, this test case just crashes with uncl...
David Zafman
10:57 AM devops Bug #5309 (Closed): ceph-deploy mon create fails to start monitor daemon
This is with current master: 0.63-572-g0948624-1
It appears that somewhere between ceph-deploy and the ceph-mon...
Anonymous
10:53 AM Bug #5307 (Resolved): ceph_test_filestore crashes
Needs --filestore-xattr-use-omap=true Samuel Just
10:33 AM Bug #5307 (Resolved): ceph_test_filestore crashes
$ ./ceph_test_filestore
[==========] Running 11 tests from 2 test cases.
[----------] Global test environment set-u...
David Zafman
10:53 AM devops Bug #5300: ceph-deploy purgedata should give warning if ceph still installed
I'll retest, I might not have been paying attention to purge vs purge data. In any event the test system was left in... Anonymous
09:37 AM devops Bug #5300: ceph-deploy purgedata should give warning if ceph still installed
purge is supposed to remove the package files *and* any config files... Sage Weil
10:53 AM Bug #5269 (Duplicate): osd: EEXIST on mkcoll
This is probably the same thing as 5270. Samuel Just
10:52 AM Bug #5240 (Resolved): run_seed_to_range failed, probably fdcache
Samuel Just
10:27 AM devops Bug #5306 (Can't reproduce): Xen based OSDs fail to start ceph-osd process
After a clean install and ceph-deploy prepare and activate the osd process is running on the node.
After a reboot th...
Yan-Fa Li
10:26 AM Bug #5305 (Resolved): ceph-deploy gatherkeys fails (ceph-create-keys)
When invoked with ceph-deploy ceph-create-keys fails silently and the only indication of a problem is that the subsqu... Anonymous
10:21 AM Bug #5305 (Resolved): ceph-deploy gatherkeys fails (ceph-create-keys)
glowell@gary-ubuntu-01:~/ceph-deploy$ sudo /usr/sbin/ceph-create-keys --cluster=ceph -i gary-ubuntu-01
INFO:ceph-cre...
Anonymous
09:56 AM rgw Bug #5302: rest-bench breaks with XmlParseFailure
what fastcgi module is being used here? Maybe try:
rgw print continue = false
in your ceph.conf.
Yehuda Sadeh
07:26 AM rgw Bug #5302 (Can't reproduce): rest-bench breaks with XmlParseFailure
This was reported on the mailing list when trying to run rest-bench:... Mark Nelson
09:39 AM devops Bug #5299: ceph-deploy fails with cryptic error message if expected directories not found
/etc/ceph should be installed by the package.
did you by chance run purgedata without running purge first? that mi...
Sage Weil
09:36 AM Bug #5301 (New): mon: leveldb crash in tcmalloc
Ian Colle
09:29 AM Bug #5301: mon: leveldb crash in tcmalloc
Well I could try to reproduce but I am not going to do this because it is my production cluster. I have also experien... Maciej Galkiewicz
08:21 AM Bug #5301: mon: leveldb crash in tcmalloc
Hi-
The 3.8.y kernel is EOL, but I pushed a branch that has the patch that (I believe) fixes this problem: linux-3...
Sage Weil
06:10 AM Bug #5301 (Can't reproduce): mon: leveldb crash in tcmalloc
Hello
I have replaced my crushmap:...
Maciej Galkiewicz
08:55 AM CephFS Bug #5303 (Resolved): OSD segfaults on SIGINT
This was a missed backport for an old fix. I pushed it to the cuttlefish branch and it will be included in .4. Thanks! Sage Weil
08:41 AM CephFS Bug #5303: OSD segfaults on SIGINT
Without debugger:... Jérôme Poulin
08:38 AM CephFS Bug #5303 (Resolved): OSD segfaults on SIGINT
This is not the first time but interrupting the OSD with SIGINT (CTRL+C) causes a segmentation fault.
Cuttlefish 0...
Jérôme Poulin
08:39 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Ah. Can you please try the ubuntu leveldb package and see if the problem persists? Thanks! Sage Weil
07:43 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
I just looked into LevelDB packaging in wheezy and precise. Again it seems that debian ships a newer version of Level... Emil Renner Berthing
01:06 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Yes, now we seem to have provoked two different errors. Both of them have happened at least twice each but on differen... Emil Renner Berthing
08:34 AM Bug #5292: mon: monitor crashing due to not being in the monmap (no monmap to be in)
I think that this is what you want, if not, just let me know.
Jeff
Jeff Moskow
08:21 AM Bug #5292: mon: monitor crashing due to not being in the monmap (no monmap to be in)
Can you share the monitor's logs with 'debug mon = 20' set? Joao Eduardo Luis
07:19 AM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start
Removing the assert worked around the problem:... Jérôme Poulin
06:32 AM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start
I noticed that resetting the MDS journal using ceph-mds -i 1 --reset-journal 0 -d hangs there.... Jérôme Poulin
01:40 AM Fix #5232: osd: slow peering due to pg log rewrites
Is this one missing from the cuttlefish backports? Stefan Priebe

06/10/2013

11:34 PM Bug #5272: Updating ceph from 0.61.2 to 0.61.3 obviously changes tunables of existing cluster
I'm afraid that as long as no one else encounters this issue I am not able to provide more detailed information. The ... To Pro
05:53 PM Bug #5272 (Need More Info): Updating ceph from 0.61.2 to 0.61.3 obviously changes tunables of exi...
I went through a diff and there's nothing obvious between those two versions that could have caused these feature bit... Greg Farnum
11:07 PM devops Bug #5283 (In Progress): Ceph-deploy can't handle /dev/disk/by-* device paths
The fix for this will actually be in ceph-disk, ceph-deploy pretty much passes the device unmodified.
Anonymous
10:28 PM CephFS Bug #5290: mds: crash whilst trying to reconnect
looks like session map corruption.
Damien, please upload the session map. you can find where it is by "ceph osd ma...
Zheng Yan
02:16 AM CephFS Bug #5290 (Can't reproduce): mds: crash whilst trying to reconnect
Hi,
Recently I experienced an issue with the mds servers in my cluster, the cluster storage would be absolutely fi...
Damien Churchill
10:15 PM devops Bug #5300 (Resolved): ceph-deploy purgedata should give warning if ceph still installed
Purge will remove directories needed for continued operation. Probably need to issue a warning in this case since if ... Anonymous
10:10 PM devops Bug #5299 (Won't Fix): ceph-deploy fails with cryptic error message if expected directories not f...
In this case it's /etc/ceph
glowell@gary-ubuntu-01:~/ceph-deploy$ ./ceph-deploy mon create gary-ubuntu-01
Traceba...
Anonymous
05:51 PM RADOS Bug #5298 (New): mon: "setting" CRUSH tunables to their current values creates a map
Maybe this is adding pointless churn, maybe it's blocking the user longer than necessary, or maybe it's a great way t... Greg Farnum
05:24 PM Bug #5297 (Resolved): Slow requests after restarting an OSD (post peering)
On my Cuttlefish 0.61.3, when I restart an OSD, besides the effects of #5084, I see a bunch of "slow request" message... Faidon Liambotis
05:22 PM Bug #5084: osd: slow peering after osd restart (bobtail)
Just for the record:
We did a troubleshooting/log collecting session with Sam last week. It seems that the issue i...
Faidon Liambotis
05:17 PM Bug #5270 (Resolved): osd: crash in PG::peek_map_epoch()
Samuel Just
02:06 AM Bug #5270: osd: crash in PG::peek_map_epoch()
I've got the same error when some pginfo files have been lost due to XFS corruption. Removing pg collection helped to... Sergey Fionov
04:50 PM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
any luck? Sage Weil
08:06 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Ok, all our OSD nodes are now running v0.61.3, but built --without-tcmalloc.
We'll try different workloads during ...
Emil Renner Berthing
04:24 PM devops Bug #5295 (Resolved): mon keyring path in mon.py not checked properly
commit:dd9392023da4773c7006ec1fb86fee07a862d8f9 Sage Weil
02:06 PM devops Bug #5295 (Resolved): mon keyring path in mon.py not checked properly
In the file mon.py, line 37 ff., of the ceph-deploy code the mon keyring path is not checked properly. Prior to writi... Peter Wienemann
04:20 PM devops Bug #4916: ceph-deploy: mon create fails on bobtail branch in centos 6.3
commit:96c001021e6dd06b43686de7040f78c484869344 fixes the mkdir -p thing. Does that fix the centos problem too? Sage Weil
01:48 PM devops Bug #4916: ceph-deploy: mon create fails on bobtail branch in centos 6.3
I am having the same problem on Debian wheezy. After some debugging I found that the cause of the problem is in the f... Peter Wienemann
04:15 PM Subtask #5213: unit tests for src/osd/PGLog.{cc,h}
"related thread":http://thread.gmane.org/gmane.comp.file-systems.ceph.devel/15499... Loïc Dachary
01:29 PM Bug #5294 (Closed): mon upgrade issue 0.61.2 -> 0.61.3
This was reported on the mailing list by Nelson Jeppesen at Disney. Joao, any idea if we've seen anything else like ... Mark Nelson
11:31 AM devops Documentation #5293 (Rejected): ceph-osd needs ulimit value to be set otherwise won't start
I needed to add the following line to my /etc/security/limits.conf otherwise the osd didn't start up correctly and th... Yan-Fa Li
11:24 AM Bug #5291: Bug with client naming for Cinder-Volume usage
The defaults everywhere are client.admin. Perhaps you've got the CEPH_ARGS environment variable specifying --id volum... Josh Durgin
02:42 AM Bug #5291 (Can't reproduce): Bug with client naming for Cinder-Volume usage
Hello!
It seems there is a bug with client naming for Cinder-Volume usage.
According to this documentation http://...
Igor Laskovy
09:42 AM CephFS Bug #5287 (Resolved): the permission of file in CephFS
Ian Colle
06:53 AM rbd Bug #4446: librbd: crash from opensolaris vm
I've upgraded to Cuttlefish and the newest Proxmox (KVM 1.4.1) and still have the same problem. The kvm command is:
...
Jeff Moskow
06:48 AM Bug #5292 (Resolved): mon: monitor crashing due to not being in the monmap (no monmap to be in)
I run a 4 node CEPH cluster (all are currently running 0.61.3 - upgraded to cuttlefish a few weeks ago) and (3 nodes ... Jeff Moskow
04:29 AM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
Sage Weil wrote:
> what happens if you do 'ceph-disk-active /dev/sdb1' (or whatever the xfs partition is)? what abou...
Robert Sander

06/09/2013

01:54 AM CephFS Bug #5289 (Can't reproduce): mds closing stale session
Hi all,
I found a stale session in MDS.
$ceph -w
\ health HEALTH_OK
..................
.....................
chen atrmat

06/08/2013

11:00 PM CephFS Support #5285 (Closed): cephfs give permission to write files
dup #5287 Zheng Yan
10:37 PM CephFS Bug #5287: the permission of file in CephFS
so far the only solution is chmod Zheng Yan
07:55 PM CephFS Bug #5287: the permission of file in CephFS
Zheng Yan wrote:
> The short answer is that there is no better solution so far. If a given node can mount the FS, it can access...
chen atrmat
06:24 PM CephFS Bug #5287: the permission of file in CephFS
The short answer is that there is no better solution so far. If a given node can mount the FS, it can access the data pool direc... Zheng Yan
01:43 AM CephFS Bug #5287 (Resolved): the permission of file in CephFS
Hi all,
I used CephFS v0.56.3 to store VMs. There are 8 nodes in my cluster, and I mount CephFS on every node...
chen atrmat
10:24 PM Bug #5200 (Resolved): mon: valgrind leaks
Sage Weil
10:23 PM CephFS Bug #4832 (Resolved): mds: failed auth_unpin assert
Sage Weil
09:39 PM Bug #5286 (Resolved): LibRadosCmd.PGCmd fails pg command test
simpler fix in commit:81a786e9e52ad5168bb7024145ba11f98e35229b Sage Weil
08:43 AM Bug #5288 (Resolved): ceph.py: catch rados errors and print them nicely
ubuntu@plana30:~$ ceph health
Traceback (most recent call last):
File "/usr/bin/ceph", line 1541, in <module>
...
Sage Weil
01:14 AM Linux kernel client Bug #5267: Kernal 3.2.0-23 crashed
Thanks very much; getting a reply so quickly made me glad. Maybe the kernel is too old; we will update ASAP.
I forgot to upd...
roman luo

06/07/2013

11:14 PM Bug #5286 (Fix Under Review): LibRadosCmd.PGCmd fails pg command test
wip-5286 Sage Weil
10:45 PM Bug #5286 (Resolved): LibRadosCmd.PGCmd fails pg command test
... Sage Weil
10:04 PM CephFS Bug #4832: mds: failed auth_unpin assert
aie.. thanks Sage Weil
09:36 PM CephFS Bug #4832: mds: failed auth_unpin assert
that commit breaks filelock eval gather Zheng Yan
05:23 PM CephFS Bug #4832 (Resolved): mds: failed auth_unpin assert
commit:a08d62045657713bf0a5372bf14136082ec3b17e Sage Weil
08:13 PM Linux kernel client Bug #5267: Kernal 3.2.0-23 crashed
Thanks very much; getting a reply so quickly made me glad. Maybe the kernel is too old; we will update ASAP.
I forgot to upd...
roman luo
09:35 AM Linux kernel client Bug #5267 (Won't Fix): Kernal 3.2.0-23 crashed
please try kernel 3.4 or later.. we aren't backporting fixes as far back as 3.2! Sage Weil
07:39 PM CephFS Support #5285 (Closed): cephfs give permission to write files
Hi all,
I used CephFS v0.56.3 to store VMs. There are 8 nodes in my cluster, and I mount CephFS on every n...
chen atrmat
06:05 PM Bug #4698: osd suicide timed out after 150
log: ubuntu@teuthology:/a/teuthology-2013-06-07_01:30:04-upgrade-master-testing-basic/32963... Tamilarasi muthamizhan
05:58 PM Bug #4179: osd: memory leak during deep scrub on bobtail
Sage Weil
08:38 AM Bug #4179 (Fix Under Review): osd: memory leak during deep scrub on bobtail
Sage Weil
05:41 PM Bug #5273 (Rejected): osd: ops waiting a long time for osdmaps
sam points out that the 'waiting for osdmap' status is misleading here.. Sage Weil
10:24 AM Bug #5273 (Rejected): osd: ops waiting a long time for osdmaps
mark nelson is observing this.. diagnose and track down. Sage Weil
05:38 PM devops Bug #5248 (Need More Info): upstart: ceph-all job is starting too soon
waiting to hear back from jamespage ... he's conferring with the upstart people Sage Weil
05:37 PM devops Bug #5194 (Need More Info): udev does not start osd after reboot on wheezy or el6 or fedora
can you confirm whether 'partprobe /dev/...' will start the osd? Sage Weil
05:34 PM CephFS Bug #5236 (Resolved): mds assert when starting file scan
no more failures, yay! Sage Weil
05:24 PM Documentation #5284: crushtool's manpage is very out of date
see crushtool --help Dan Mick
05:24 PM Documentation #5284 (Closed): crushtool's manpage is very out of date
Dan Mick
03:39 PM devops Bug #5283 (Won't Fix): Ceph-deploy can't handle /dev/disk/by-* device paths
If you try to create a new osd with ceph-deploy using /dev/disk/by-* path instead of the /dev/* path the osd creation... Andrei Mikhailovsky
03:37 PM devops Feature #5282 (Closed): Get Dumpling into EPEL
Neil Levine
03:35 PM devops Feature #4515 (Duplicate): packaging: create qemu packages with rbd enabled for centos 6
Duplicates 4550 Ian Colle
03:33 PM devops Documentation #5253 (Resolved): Update Pre-Flight docs to use ceph-deploy package
Ian Colle
03:31 PM devops Feature #5015 (Resolved): ceph-deploy: push packages to all ceph repos
Ian Colle
03:29 PM devops Feature #5019 (Resolved): arm: gitbuilder for ARM
Ian Colle
03:28 PM devops Feature #5018: arm: ceph-deploy: push packages to ARM
Neil Levine
03:26 PM rbd Feature #4834 (Resolved): Recompile/package qemu with new version of librbd to enable asynchronou...
Ian Colle
02:56 PM devops Feature #5089 (Resolved): ceph-deploy install fails on arm
It works.
Needed python-pushy and ceph-deploy built on arm added to the repos.
Anonymous
02:54 PM devops Feature #5016: ceph-deploy: gitbuilders for release packages
Opened ticket #5281 for the gitbuilder vms. This task may have fallen off the radar. Anonymous
02:25 PM RADOS Feature #5280 (New): osd/client: messages should be tagged with the earliest sane map
A client at epoch e should not have to wait for an osd to catch up to epoch e unless the mapping changed in epoch e. ... Samuel Just
02:17 PM Fix #5279 (In Progress): pipeline large object recovery
currently pushes for large objects are synchronous: push->reply->push etc.
should be push->push->push
...
Samuel Just
02:17 PM rgw Bug #5262 (Resolved): rgw: can't access buckets with names that start with 'auth'
Backported to cuttlefish in commit:bd12e81e48014024171c55f5984c9183c8e363cb and commit:c75760e39d8df5b1971343e9f9186f... Greg Farnum
01:59 PM rgw Bug #5262 (Pending Backport): rgw: can't access buckets with names that start with 'auth'
Fixed in next, commit:8d55b87f95d59dbfcfd0799c4601ca37ebb025f5. Fixed a related issue as well, commit:ad3934e335399f7... Greg Farnum
02:15 PM Fix #4567 (Resolved): mon: refactor mon caps; allow restriction of key/value storage by prefix
Sage Weil
02:15 PM Feature #3273: mon: simple dm-crypt key management
- make sure ceph-deploy and chef can use this Sage Weil
02:06 PM Fix #5278 (Resolved): osd: smarter recovery for small objects
1) avoid collection move for single write pushes
2) maybe package multiple small objects at once?
Samuel Just
02:05 PM rgw Bug #5261 (Resolved): rgw: 'cors' is not regarded as a sub-resource
Backported to cuttlefish in commit:b1d436e752c9c20e7dbff91b769cb2ba47383571 Greg Farnum
01:58 PM rgw Bug #5261 (Pending Backport): rgw: 'cors' is not regarded as a sub-resource
Fixed in next branch, commit:9a0a9c205b8c24ca9c1e05b0cf9875768e867a9e.
Will backport to cuttlefish and update with c...
Greg Farnum
10:46 AM rgw Bug #5261: rgw: 'cors' is not regarded as a sub-resource
Yeah, tested it. I created a new functional test for it. Also, there's no CORS in bobtail, so we don't need it there. Yehuda Sadeh
10:29 AM rgw Bug #5261: rgw: 'cors' is not regarded as a sub-resource
Well, that's a simple enough fix. Have you tested it yet?
And it's marked as needing a backport to cuttlefish, but...
Greg Farnum
02:03 PM Fix #4840 (Resolved): mon: transition from old-style allow command to new command descriptions
Sage Weil
02:02 PM Feature #5147 (Resolved): Display unique cluster ID in ceph status
Sage Weil
01:54 PM Bug #5200 (In Progress): mon: valgrind leaks
Sage Weil
01:35 PM Bug #5270: osd: crash in PG::peek_map_epoch()
Very odd. That xattr is written atomically on pg collection creation and never overwritten thereafter. Samuel Just
01:29 PM rbd Feature #5005: cinder: switch rbd driver to use librbd instead of the cli tool
Review: https://review.openstack.org/30792
Commit: http://github.com/openstack/cinder/commit/e2d0e1f479a56d60dc09ae9...
Josh Durgin
11:32 AM rbd Feature #5005 (Resolved): cinder: switch rbd driver to use librbd instead of the cli tool
Ian Colle
01:28 PM rbd Feature #5004: cinder: make rbd configuration easier to use
Review: https://review.openstack.org/30791
Commit: http://github.com/openstack/cinder/commit/483b84e42b90f2ffe0a09f5...
Josh Durgin
11:32 AM rbd Feature #5004 (Resolved): cinder: make rbd configuration easier to use
Ian Colle
01:14 PM rgw Feature #5164: rgw: multisite: metadata push notifications: design blueprint
Neil Levine
01:13 PM rgw Feature #4098 (Fix Under Review): rgw: multi-site: Global Bucket Namespace
Ian Colle
01:13 PM rgw Feature #4329 (Fix Under Review): rgw: dr: updated buckets log: RESTful API
Ian Colle
01:08 PM rgw Feature #4715: rgw: Add support for OPTIONS HTTP method
They are waiting a bit before the upgrade, however they recently reported back that they put together some custom rul... JuanJose Galvez
01:03 PM rgw Feature #4715: rgw: Add support for OPTIONS HTTP method
I'd rather not do a backport. Do we have a bobtail customer asking for this who can't/won't upgrade to Cuttlefish? Neil Levine
01:04 PM rgw Feature #5136 (Need More Info): rgw: revise user stats
? Neil Levine
01:00 PM rgw Feature #5169: Do not list swift containers when enumerating buckets using S3 API
Neil Levine
12:58 PM rgw Feature #5218: rgw: make bucket removal "atomic"
Neil Levine
11:51 AM rbd Documentation #5212: doc: link to recommended kernel version from pages that describe using kerne...
Neil Levine
11:42 AM rbd Feature #4013 (In Progress): rbd: openstack: extend nova boot api to support going from image to ...
Ian Colle
11:42 AM rbd Feature #4017 (In Progress): rbd: openstack: simplify volume booting with new api
Ian Colle
11:30 AM rbd Feature #5275 (Resolved): openstack: port always_use_volumes option to grizzly
The folsom version is git://github.com/jdurgin/nova wip-volumes. Josh Durgin
10:52 AM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start
I'll try commenting out the assert, and yes, we tried the snapshots feature of the MDS hours before the shutdown. Jérôme Poulin
09:44 AM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start
were you using the mds snapshots? Sage Weil
09:42 AM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start
probably the workaround is to comment out that assert.. Sage Weil
07:56 AM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start
Is it useful for me to keep the FS in this state much longer? Right now the FS is unusable. Is it possible to clear ... Jérôme Poulin
10:20 AM devops Bug #5242 (Resolved): ceph-deploy: reports purgedata as invalid command when purge is not successful
fixed the mirror! Sage Weil
09:47 AM devops Bug #5242: ceph-deploy: reports purgedata as invalid command when purge is not successful
any news here, tamil?
Sage Weil
09:56 AM Bug #5272 (Duplicate): Updating ceph from 0.61.2 to 0.61.3 obviously changes tunables of existing...
I'm running a ceph cluster with three server nodes, each running one MON, one MDS and three OSDs to provide CEPHFS st... To Pro
09:49 AM devops Bug #5263 (In Progress): Python Error While Installing ceph-deply on debian wheezy
The version 1.0 packages should not have included the test directory. I'm double checking the repos to ensure they... Anonymous
09:35 AM Bug #5260 (Resolved): mon: FAILED assert(other->is_writeable()) from MDSMonitor on 0.61.2
Sage Weil
08:47 AM Bug #4999 (Can't reproduce): monitor sync failure
Sage Weil
08:34 AM Bug #5257 (Resolved): Ceph OSD bobtail->cuttlefish upgrade goes backward in time with osdmap
The prefork fix is backported to cuttlefish, so closing this one out then. Sage Weil
07:34 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Running without tcmalloc would be a very helpful data point, yes. You can get non-tcmalloc packages built for precis... Sage Weil
07:16 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
It turns out that the Debian wheezy libgoogle-perftools-dev package and ceph packages depend on libgoogle-perftools4... Emil Renner Berthing
05:44 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Sorry. s/Gary/Sage/ Emil Renner Berthing
05:43 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Would it be helpful to try and build packages that don't use tcmalloc (using the --without-tcmalloc configure option)... Emil Renner Berthing
12:40 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
No, unfortunately the latest cuttlefish branch didn't fix it. We had another crash about 6 hours after we upgraded.
...
Emil Renner Berthing

06/06/2013

11:01 PM devops Feature #5018: arm: ceph-deploy: push packages to ARM
Is this by any chance a duplicate? Anonymous
10:59 PM devops Feature #5089 (In Progress): ceph-deploy install fails on arm
Needs a bit more testing. Anonymous
10:57 PM devops Feature #5091 (In Progress): google-perftools for arm
Need to verify that new upstream package build will work for us. Anonymous
10:56 PM devops Feature #5092 (Closed): libatomic-ops for arm; or use gcc atomics instead
This does not appear to be an issue. libatomicops is not supported on some arm architectures, but v7 is ok. Early ... Anonymous
10:53 PM devops Feature #5015: ceph-deploy: push packages to all ceph repos
ceph-deploy is being added to all the testing and named releases.
There is still some automation that could be appli...
Anonymous
10:51 PM devops Feature #5088 (Resolved): ceph-deploy packages need to install on arm
Completed. Arm version of ceph-deploy built and added to the repo. Anonymous
10:49 PM devops Feature #5090 (Resolved): ceph-build: Need to support arm in the repos.
Completed. It was just a matter of adding armhf to the architectures in the repo config. Anonymous
10:48 PM devops Feature #5016 (In Progress): ceph-deploy: gitbuilders for release packages
Waiting for gitbuilder VMs to be instantiated. Anonymous
10:35 PM Bug #5270 (Resolved): osd: crash in PG::peek_map_epoch()
... Sage Weil
10:33 PM Bug #5269 (Resolved): osd: EEXIST on mkcoll
... Sage Weil
09:53 PM CephFS Bug #4832: mds: failed auth_unpin assert
full log attached for posterity. see wip-4832 Sage Weil
06:27 PM CephFS Bug #4832: mds: failed auth_unpin assert
... Sage Weil
07:23 AM CephFS Bug #4832: mds: failed auth_unpin assert
... Sage Weil
09:38 PM CephFS Fix #5268 (Closed): mds: fix/clean up file size/mtime recovery code
from diagnosing #4832 (see the attached log) it looks like this code needs an overhaul:
* i don't think we should ...
Sage Weil
08:20 PM Linux kernel client Bug #5267 (Won't Fix): Kernal 3.2.0-23 crashed
I don't know how to describe it. The kernel crashed and the last output on the screen is attached. Who can tell me... roman luo
05:00 PM devops Bug #5266 (Closed): the apt-get install instructions are missing an update
http://ceph.com/docs/master/start/quick-start-preflight/
This section is missing the update:
wget -q -O- 'https...
Yan-Fa Li
04:53 PM devops Documentation #5265: node-name is confusing. hostname is probably more accurate
I think it might be helpful to have a section of the QSG that describes the basic networking requirements (i.e., host... Ross Turk
04:49 PM devops Documentation #5265 (Closed): node-name is confusing. hostname is probably more accurate
http://ceph.com/docs/master/start/quick-ceph-deploy/
ceph-deploy new {node-name}
ceph-deploy new ceph-node
nod...
Yan-Fa Li
04:23 PM Bug #4179: osd: memory leak during deep scrub on bobtail
found it (probably):... Sage Weil
04:06 PM Bug #4179: osd: memory leak during deep scrub on bobtail
... Sage Weil
04:17 PM devops Bug #5263: Python Error While Installing ceph-deply on debian wheezy
Adding package list just in case:
root@ceph-server:/mnt/my-cluster# dpkg -l
Desired=Unknown/Install/Remove/Purge/...
Yan-Fa Li
04:14 PM devops Bug #5263 (Resolved): Python Error While Installing ceph-deply on debian wheezy
While trying to install ceph-deploy on a new Debian 7.0/Wheezy with all the latest updates I got the following error:... Yan-Fa Li
03:46 PM Bug #5084: osd: slow peering after osd restart (bobtail)
I updated my cluster from 0.61.2 to 0.61.3 and can tell a noticeable improvement. There are still some I/O stalls whi... John Nielsen
11:51 AM Bug #5084: osd: slow peering after osd restart (bobtail)
I've uploaded slowpeer-ceph-osd.2.log.bz2 (--debug-ms=1 --debug-filestore=5 --debug-osd=20) & slowpeer-osd2-ceph.log ... Faidon Liambotis
08:21 AM Bug #5084: osd: slow peering after osd restart (bobtail)
So, I've upgraded my whole cluster to cuttlefish git (7d549cb), mainly to address this issue. The tree I've installe... Faidon Liambotis
02:16 PM rgw Bug #5262 (Resolved): rgw: can't access buckets with names that start with 'auth'
Yehuda Sadeh
11:47 AM Bug #5257: Ceph OSD bobtail->cuttlefish upgrade goes backward in time with osdmap
Sage Weil wrote:
> the problem:
>
> mon.1 and .2 had newer data, mon.0 had older data.
> mon.0 converts, waits t...
Joao Eduardo Luis
11:16 AM Bug #5257: Ceph OSD bobtail->cuttlefish upgrade goes backward in time with osdmap
the mon.1 log snippet... Sage Weil
11:15 AM Bug #5257: Ceph OSD bobtail->cuttlefish upgrade goes backward in time with osdmap
the problem:
mon.1 and .2 had newer data, mon.0 had older data.
mon.0 converts, waits to join quorum
mon.1 conve...
Sage Weil
06:31 AM Bug #5257: Ceph OSD bobtail->cuttlefish upgrade goes backward in time with osdmap
cephdrop:ceph-5257-mondirs.tar.bz2, fetched earlier today. Faidon Liambotis
10:51 AM Feature #4982 (In Progress): OSD: namespaces pt 1 (librados/osd, not caps)
David Zafman
10:09 AM rgw Bug #5261 (In Progress): rgw: 'cors' is not regarded as a sub-resource
Ian Colle
10:04 AM rgw Bug #5261 (Resolved): rgw: 'cors' is not regarded as a sub-resource
'cors' needs to be regarded as a sub-resource, otherwise auth signing is not being done correctly. Yehuda Sadeh
09:51 AM Bug #4976: osd powercycle triggers object corruption on xfs
ubuntu@teuthology:/a/teuthology-2013-06-05_10:57:29-rados-cuttlefish-master-basic/31967 Tamilarasi muthamizhan
09:37 AM Bug #5154 (Resolved): osd/SnapMapper.cc: 270: FAILED assert(check(oid))
Samuel Just
09:29 AM Bug #4731 (Resolved): PG: don't write out pg epoch on every map activation
Samuel Just
08:10 AM Bug #5246: mon crashing on pool/pg creation with wip-mon
comments on gh Joao Eduardo Luis
06:28 AM Bug #5255 (Resolved): 0.56.6 -> cuttlefish tip (to be .3), mon crashes on boot
Joao Eduardo Luis
06:19 AM Bug #5260 (Resolved): mon: FAILED assert(other->is_writeable()) from MDSMonitor on 0.61.2
Lack of logging doesn't help that much in assessing what may be going on, but the stack trace might prove itself usef... Joao Eduardo Luis
06:09 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
All our OSD nodes have now been updated to packages built from the latest cuttlefish branch, commit 7d549cb82ab8e..
...
Emil Renner Berthing
01:27 AM Feature #3527 (Resolved): osd: blacklist should cancel outstanding watches from blacklisted client
commit:8f9b1470dd50bab9fa85450306c274b1a70a672c David Zafman

06/05/2013

09:21 PM CephFS Bug #4832: mds: failed auth_unpin assert
log is here: flab:/home/sage/tmp/4832
Sage Weil
09:21 PM CephFS Bug #4832: mds: failed auth_unpin assert
it's getting recovered twice:... Sage Weil
09:02 PM Bug #5257: Ceph OSD bobtail->cuttlefish upgrade goes backward in time with osdmap
Faidon: can you send a tarball of your mon dirs? IIRC the old files are still present post-conversion, so we should ... Sage Weil
06:19 PM Bug #5257: Ceph OSD bobtail->cuttlefish upgrade goes backward in time with osdmap
from the logs on cephdrop, this looks like a non-deterministic store conversion maybe? the quorum 0,1 is happily chu... Sage Weil
06:16 PM Bug #5257: Ceph OSD bobtail->cuttlefish upgrade goes backward in time with osdmap
Sage Weil
05:39 PM Bug #5257: Ceph OSD bobtail->cuttlefish upgrade goes backward in time with osdmap
I just upgraded another box and I'm not observing the same behavior. OSDs are now down while PGs are upgrading. This ... Faidon Liambotis
05:22 PM Bug #5257 (Need More Info): Ceph OSD bobtail->cuttlefish upgrade goes backward in time with osdmap
do you have a complete mon log for this? if not, can you capture one the next time around? that osd should have bee... Sage Weil
04:48 PM Bug #5257: Ceph OSD bobtail->cuttlefish upgrade goes backward in time with osdmap
There seems to have been a monitor election (without me doing anything) exactly before the "141 up":... Faidon Liambotis
04:36 PM Bug #5257: Ceph OSD bobtail->cuttlefish upgrade goes backward in time with osdmap
I'm sure nodown wasn't set. I didn't restart all mons at once, just 12 (one box) out of 141.
This is what "grep os...
Faidon Liambotis
04:30 PM Bug #5257: Ceph OSD bobtail->cuttlefish upgrade goes backward in time with osdmap
are you sure 'nodown' wasn't set? this upgrade happens in load_pgs(), long before the osd sends a message to the mon... Sage Weil
01:05 PM Bug #5257: Ceph OSD bobtail->cuttlefish upgrade goes backward in time with osdmap
Just to give a sense of the size of the issue:... Faidon Liambotis
12:54 PM Bug #5257: Ceph OSD bobtail->cuttlefish upgrade goes backward in time with osdmap
This happened after a while:... Faidon Liambotis
11:38 AM Bug #5257 (Resolved): Ceph OSD bobtail->cuttlefish upgrade goes backward in time with osdmap
I upgraded ceph on one of my boxes (12 osds) and the osds came up, printing "183140 PGs are upgrading". During that t... Faidon Liambotis
07:05 PM devops Feature #5214: Kernel gitbuilders for rpm distros
This needed centos 6.4. I know it said 6.3 or 6.4 but we already needed 6.4 cloud-init support for other things so it... Sandon Van Ness
05:59 PM Bug #5256: Upgraded bobtail->cuttlefish mon crashes, then can't resume the conversion
Cephdrop now has 5256-ceph-mon.ms-be1005.log.gz, 5256-ceph-mon.ms-fe1001.log.gz, 5256-ceph-mon.ms-fe1003.log.gz.
T...
Faidon Liambotis
01:50 PM Bug #5256 (In Progress): Upgraded bobtail->cuttlefish mon crashes, then can't resume the conversion
Joao Eduardo Luis
11:29 AM Bug #5256 (Resolved): Upgraded bobtail->cuttlefish mon crashes, then can't resume the conversion
... Faidon Liambotis
03:53 PM devops Bug #5259 (Duplicate): osd create command fails inconsistently on ubuntu
ubuntu@teuthology:/a/teuthology-2013-06-05_01:01:15-ceph-deploy-master-testing-basic/31847... Tamilarasi muthamizhan
03:43 PM devops Bug #4924: ceph-deploy: gatherkeys fails on raring (cuttlefish)
Okay so I tried duplicating this again today. And now I can't. I think it was due to an iptables issue at first, but ... Greg Poirier
03:11 PM rgw Feature #5218: rgw: make bucket removal "atomic"
Don't know since we don't have a design; but probably not as I suspect it will require a (very minor) format change/e... Greg Farnum
03:06 PM rgw Feature #5218: rgw: make bucket removal "atomic"
When fixed, will this be backported to bobtail? JuanJose Galvez
02:21 PM devops Bug #5258 (Resolved): ceph-deploy: forgetkeys command could delete existing keyring files without...
From an admin point of view, it would be nice to have 'forgetkeys' command to delete only existing keyring files and ... Tamilarasi muthamizhan
01:51 PM Bug #5240: run_seed_to_range failed, probably fdcache
2013-06-05T04:21:14.657 INFO:teuthology.orchestra.run.err:2013-06-05 04:21:58.389650 7faabeffd700 10 filestore(b) tru... Samuel Just
12:31 PM Bug #4179: osd: memory leak during deep scrub on bobtail
... Tamilarasi muthamizhan
12:30 PM Bug #4179: osd: memory leak during deep scrub on bobtail
as Sam requested, here is the perf dump for each osds on the cluster... Tamilarasi muthamizhan
11:40 AM Bug #5255: 0.56.6 -> cuttlefish tip (to be .3), mon crashes on boot
Just tested this and it works as expected. Faidon Liambotis
10:39 AM Bug #5255: 0.56.6 -> cuttlefish tip (to be .3), mon crashes on boot
backported with commit:7d549cb82ab8ebcf1cc104fc557d601b486c7635 Joao Eduardo Luis
10:29 AM Bug #5255 (Pending Backport): 0.56.6 -> cuttlefish tip (to be .3), mon crashes on boot
Sage had already created a patch for this but it's only on next (commit:ce67c58db7d3e259ef5a8222ef2ebb1febbf7362).
...
Joao Eduardo Luis
10:27 AM Bug #5255 (In Progress): 0.56.6 -> cuttlefish tip (to be .3), mon crashes on boot
Ian Colle
10:04 AM Bug #5255 (Resolved): 0.56.6 -> cuttlefish tip (to be .3), mon crashes on boot
I upgraded my first mon from 0.56.6 to cuttlefish tip as of now (8544ea7) and it crashes on boot with:... Faidon Liambotis
11:32 AM Bug #5238: osd: slow recovery (uselessly dirtying pg logs during peering)
For the slow peering case, I think the first problem is that we unconditionally dirty the log in activate(). Since m... Samuel Just
07:51 AM Bug #5238: osd: slow recovery (uselessly dirtying pg logs during peering)
Looking more closely it appears that for the qa job the problem is just that the recovery gets very low priority due ... Sage Weil
07:50 AM Bug #5238: osd: slow recovery (uselessly dirtying pg logs during peering)
Stefan Priebe wrote:
> Hi sage is this related to my one? http://tracker.ceph.com/issues/5232
Only sort of.. one ...
Sage Weil
10:32 AM Fix #5232 (In Progress): osd: slow peering due to pg log rewrites
Ian Colle
07:34 AM Bug #4999: monitor sync failure
No, I meant I had hit the original issue again, where a sync failed
due to timeout (see updates 2,3)
I haven't be...
Jim Schutt
04:14 AM Bug #5205: mon: FAILED assert(ret == 0) on config's set_val_or_die() from pick_addresses()
Thanks Adam, this provides great insight on what's going on. Joao Eduardo Luis

06/04/2013

11:20 PM Bug #5238: osd: slow recovery (uselessly dirtying pg logs during peering)
Hi sage is this related to my one? http://tracker.ceph.com/issues/5232 Stefan Priebe
04:49 PM Bug #5238: osd: slow recovery (uselessly dirtying pg logs during peering)
the health checks were a red herring. wait_for_recovery calls assert, but the other thread(s) finish before we see th... Sage Weil
09:26 AM Bug #5238: osd: slow recovery (uselessly dirtying pg logs during peering)
I think this might be a teuthology problem: i can't find any ceph process running on the cluster when it hangs. tryi... Sage Weil
10:11 PM Bug #5205: mon: FAILED assert(ret == 0) on config's set_val_or_die() from pick_addresses()
I've also encountered this problem, running 0.61.2 on CentOS 6.4 (uname 2.6.32-220.el6.x86_64 #1 SMP Tue Dec 6 19:48:... Adam Compton
07:40 PM CephFS Bug #3681: kclient fsx fails nightly
I think this has already been fixed (a cap revoke bug in the MDS code). When handling truncate request, current MDS ... Zheng Yan
05:44 PM Bug #4999: monitor sync failure
Jim, you mean you hit the leveldb error again? can you post a complete log for that? The one in the original report... Sage Weil
05:13 PM Bug #5246 (Fix Under Review): mon crashing on pool/pg creation with wip-mon
pushed a simplification of the is_readable/writeable checks to wip-mon Sage Weil
08:33 AM Bug #5246: mon crashing on pool/pg creation with wip-mon
Postponed but not forgotten. Joao Eduardo Luis
07:46 AM Bug #5246 (Resolved): mon crashing on pool/pg creation with wip-mon
this is using wip-mon when the cluster is first being setup during pool creation. OSDs were (possibly unrelated) goi... Mark Nelson
04:40 PM Bug #5233 (Resolved): python rados tests induce bad filestore truncate on arm
commit:051f477 Sage Weil
01:59 PM Bug #5233 (Fix Under Review): python rados tests induce bad filestore truncate on arm
Added #5252 for the osd error handling part. Josh Durgin
03:56 PM Feature #5147: Display unique cluster ID in ceph status
oh.. yeah, it's a uuid, e.g. "3cbff3a6-18f6-42e8-8940-febea7eb4282"
also, i didn't backport the change to cuttlefi...
Sage Weil
03:55 PM Feature #5147 (Need More Info): Display unique cluster ID in ceph status
Can you please confirm the format of the unique string? PS have requested it being something easy to communicate over... Neil Levine
03:09 PM devops Documentation #5253 (Resolved): Update Pre-Flight docs to use ceph-deploy package
update pre-flight info at http://ceph.com/docs/master/start/ to instruct users to download ceph-deploy package, which... Neil Levine
03:04 PM Bug #5225 (Rejected): arm: rbd fsx test failed on the arm set up
fsx allocates the entire image size in memory. We just need to decrease the image size to make it work on these machi... Josh Durgin
01:59 PM Bug #5252 (Resolved): osd: EINVAL from truncate causes osd to crash
If a rados client sends a truncate operation that exceeds the maximum file size, truncate/ftruncate(2) will return EI... Josh Durgin
01:15 PM Bug #4976: osd powercycle triggers object corruption on xfs
this is looking more like an xfs bug to me.. sent something to the list.
i also think it is new in 3.9. need to tr...
Sage Weil
12:49 PM Bug #5239 (Need More Info): osd: Segmentation fault in ceph-osd / tcmalloc
Sage Weil
09:36 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
this is either heap corruption, or a buggy tcmalloc, i think.
are there known problems with wheezy's tcmalloc versi...
Sage Weil
09:36 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Gary, can you please take a look at this? Ian Colle
12:34 PM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start
I'm running a single MDS on the same server as a MON and an OSD. We're not using the FS very much, just testing, this ... Jérôme Poulin
12:16 PM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start
Can you provide the output of "ceph -s" as well, please. And start up an MDS daemon after setting "debug mds = 20" an... Greg Farnum
11:19 AM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start
Full log at pastebin.com : http://pastebin.com/9YPMjw0t Jérôme Poulin
11:18 AM CephFS Bug #5250 (Can't reproduce): ceph-mds 0.61.2 aborts on start
After rebooting the whole cluster using the "shut the breaker off" method, I had some BTRFS corruption which was fixed... Jérôme Poulin
12:32 PM Bug #5247 (Resolved): upgrade suite is hanging
tested on '0.63-229-g64b3e83-1precise' [sha1: 64b3e833f62f2538ffd7bd565d968decf6584691] Tamilarasi muthamizhan
12:19 PM Bug #5247: upgrade suite is hanging
error seen is ... Tamilarasi muthamizhan
10:47 AM Bug #5247: upgrade suite is hanging
Sage Weil
09:27 AM Bug #5247 (Resolved): upgrade suite is hanging
has gotten hung the last 2-3 nights Sage Weil
12:27 PM Bug #5251 (Can't reproduce): wrong node messages in mds log
when upgrading from bobtail to next branch, seeing repeated wrong node messages in the osd logs.... Tamilarasi muthamizhan
10:57 AM Bug #5163: filestore: ENOTEMPTY on object removal
Can we get a recursive ls of 2.363_head on that osd? Samuel Just
10:50 AM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
what happens if you do 'ceph-disk-activate /dev/sdb1' (or whatever the xfs partition is)? what about 'partprobe /dev/sd... Sage Weil
10:44 AM Bug #5240 (Resolved): run_seed_to_range failed, probably fdcache
Samuel Just
10:26 AM RADOS Feature #5249 (Resolved): mon: support leader election configuration
Right now, monitor election is handled by selecting the monitor with the lowest IP that can reach enough peers. This ... Greg Farnum
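The election rule described in #5249 can be sketched roughly as below. This is an illustration under stated assumptions, not Ceph code: "enough peers" is taken to mean a strict majority of the configured monitors (counting the monitor itself), addresses are compared numerically rather than as strings, and `pick_leader` and the `reachable` mapping are hypothetical names.

```python
import ipaddress


def pick_leader(reachable):
    """Pick the lowest-IP monitor that can reach a majority (hypothetical helper).

    `reachable` maps each monitor's IP string to the set of peer IPs it can
    currently reach. Returns the winning IP string, or None if no monitor
    can reach enough peers.
    """
    total = len(reachable)
    quorum = total // 2 + 1  # assumption: "enough peers" == strict majority

    candidates = []
    for ip, peers in reachable.items():
        # Count the monitor itself plus every distinct peer it can reach.
        if len(set(peers) - {ip}) + 1 >= quorum:
            candidates.append(ip)

    if not candidates:
        return None
    # Compare numerically so that 10.0.0.9 sorts before 10.0.0.10.
    return min(candidates, key=ipaddress.ip_address)


if __name__ == "__main__":
    monitors = {
        "10.0.0.1": set(),          # isolated, cannot form a quorum
        "10.0.0.2": {"10.0.0.3"},
        "10.0.0.3": {"10.0.0.2"},
    }
    print(pick_leader(monitors))
```

Because the winner is determined entirely by address and reachability, an operator has no say in which monitor leads, which appears to be the motivation for making it configurable.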
10:02 AM devops Bug #5248 (Resolved): upstart: ceph-all job is starting too soon
The current ceph-all job specifies the following:
start on (local-filesystems and net-device-up IFACE!=lo)
This c...
Alexandre Marangone
09:49 AM Bug #5237 (Duplicate): filestore idempotent tester failure
Samuel Just
09:39 AM rbd Bug #5220 (Resolved): test_ls_snaps segfaults on the arm test setup
Ian Colle
09:37 AM CephFS Bug #5236: mds assert when starting file scan
Sage Weil
09:33 AM devops Bug #5242: ceph-deploy: reports purgedata as invalid command when purge is not successful
this is definitely using the wrong version of ceph-deploy.. discover is not a command any more. somehow pulling from... Sage Weil
08:42 AM rgw Bug #5245: Frequent 500s from radosgw
Yes, there is a single radosgw process:... Jiri Brunclik
07:58 AM rgw Bug #5245: Frequent 500s from radosgw
Can you verify that you only have a single gateway running on that socket, and that the process id does not change wh... Yehuda Sadeh
07:51 AM rgw Bug #5245: Frequent 500s from radosgw
This is my Apache config:... Jiri Brunclik
07:30 AM rgw Bug #5245: Frequent 500s from radosgw
Could it be that you let apache spawn the gateways by itself? Or maybe running multiple gateways over the same socket... Yehuda Sadeh
02:32 AM rgw Bug #5245 (Can't reproduce): Frequent 500s from radosgw
Hi,
I have roughly 30 clients talking simultaneously to radosgw over 1Gbps link. I use boto library on the client ...
Jiri Brunclik
08:33 AM Bug #5215 (Resolved): mon: hang during sync with mon thrashing
commit:eb6d5fcf994d2a25304827d7384eee58f40939af Sage Weil
07:17 AM Bug #5215 (In Progress): mon: hang during sync with mon thrashing
Managed to trigger this using the following job:... Joao Eduardo Luis

06/03/2013

10:08 PM Cleanup #4809 (Resolved): MMonProbe extra fields
Sage Weil
09:53 PM Feature #5147 (Resolved): Display unique cluster ID in ceph status
don't think we need to backport this one. Sage Weil
09:53 PM Bug #5062: mon: 0.61.2 asserts on AuthMonitor during monitor start
could this simply be:
- start sync
- sync last_committed
- crash before reaching osdmap_$lastcommitted
- osd re...
Sage Weil
09:50 PM CephFS Bug #5236: mds assert when starting file scan
commit:2d655bde8de9ad255d63718768558399cacd7068
thanks!
Sage Weil
05:53 PM CephFS Bug #5236: mds assert when starting file scan
looks like I forgot to initialize MDCache::rejoins_pending Zheng Yan
02:17 PM CephFS Bug #5236: mds assert when starting file scan
Yan, I got as far as identifying that the problem is that rejoin_gather_finish->identify_files_to_recovery is getting... Sage Weil
10:00 AM CephFS Bug #5236: mds assert when starting file scan
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-06-03_01:00:48-fs-master-testing-basic/30161 Sage Weil
07:52 AM CephFS Bug #5236 (Resolved): mds assert when starting file scan
... Sage Weil
05:25 PM Linux kernel client Bug #5244 (Rejected): btrfs hang on tree lock, 3.9 kernel
... Sage Weil
04:44 PM RADOS Tasks #5243 (New): osd testing: create peering speed test
Create a teuthology task which generates a deterministic number of pg remaps and summarizes the peering "speed".
Th...
Samuel Just
04:23 PM rgw Feature #5170: RGW: Object restriping tool to fix large objects from argonaut.
Neil Levine
04:09 PM devops Bug #5242: ceph-deploy: reports purgedata as invalid command when purge is not successful
ubuntu@teuthology:/a/teuthology-2013-06-02_01:00:44-fs-master-testing-basic/29298 Tamilarasi muthamizhan
04:08 PM devops Bug #5242 (Resolved): ceph-deploy: reports purgedata as invalid command when purge is not successful
... Tamilarasi muthamizhan
03:50 PM CephFS Fix #5241: MDS: not valgrind (leak) clean
teuthology-2013-06-03_01:00:48-fs-master-testing-basic:
30170, 30172, 30174
Greg Farnum
03:43 PM CephFS Fix #5241 (New): MDS: not valgrind (leak) clean
Valgrind info at /a/teuthology-2013-06-01_01:00:43-fs-next-testing-basic/28691/remote/ubuntu@plana85.front.sepia.ceph... Greg Farnum
03:50 PM Bug #5240: run_seed_to_range failed, probably fdcache
Looks like the tester will place objects with the same name into different collections, fixing test. Samuel Just
03:39 PM Bug #5240 (Resolved): run_seed_to_range failed, probably fdcache
2013-06-03T04:26:53.232 INFO:teuthology.orchestra.run.err:2013-06-03 04:27:34.948984 7fa652ef5780 0 filestore_diff d... Samuel Just
03:40 PM Bug #4976: osd powercycle triggers object corruption on xfs
two writes to the object, at offset A~B and C~D, then read the whole thing. the original write appears intact, but a... Sage Weil
03:35 PM Bug #5156 (Duplicate): OSD: split followed by pg resurrection might leave an object in two collec...
Samuel Just
02:55 PM Bug #5226: Some PG stay in "incomplete" state
Well, if I look /var/lib/ceph/osd/ceph-19/current/4.5c_head or /var/lib/ceph/osd/ceph-19/current/4.0_head for example... Olivier Bonvalet
09:35 AM Bug #5226 (Need More Info): Some PG stay in "incomplete" state
it sounds as though osd.19 was also missing the data prior to osd.25 going away. can you look for the pg subdirector... Sage Weil
01:07 PM Feature #4107: Usage quota for rados pools
Duplicated by 4465 and 4466. Ian Colle
12:26 PM devops Bug #5211: ceph-disk prepare: list_partitions() shouldn't return disks
One way to do that would be to use lsblk /dev/<disk> and look for the word "part". I'm not sure lsblk is on every dis... Alexandre Marangone
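The lsblk idea in the comment above could look something like the following. It is a minimal sketch, not the ceph-disk implementation: `parse_lsblk` and `partitions_via_lsblk` are hypothetical names, the `lsblk -l -n -o NAME,TYPE` invocation is an assumption about how the tool would be called, and (as the comment notes) lsblk may not be installed on every distribution.

```python
import subprocess


def parse_lsblk(text):
    """Return the names whose TYPE column is exactly 'part'.

    Expects two whitespace-separated columns per line (NAME, TYPE), as
    produced by a flat, header-less lsblk listing. Whole disks (TYPE
    'disk') and anything else are skipped.
    """
    partitions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1] == "part":
            partitions.append(fields[0])
    return partitions


def partitions_via_lsblk(device):
    """Hypothetical wrapper: ask lsblk about one device and parse the result."""
    # Assumed flags: -l flat list, -n no header line, -o select columns.
    out = subprocess.check_output(
        ["lsblk", "-l", "-n", "-o", "NAME,TYPE", device], text=True
    )
    return parse_lsblk(out)


if __name__ == "__main__":
    sample = "sda  disk\nsda1 part\nsda2 part\n"
    print(parse_lsblk(sample))
```

Keeping the parser separate from the subprocess call makes it possible to test the filtering logic on machines that do not have lsblk.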
09:42 AM devops Bug #5211: ceph-disk prepare: list_partitions() shouldn't return disks
the python code that does this is pretty kludgey.. any suggestions for a more robust enumeration strategy should be p... Sage Weil
09:26 AM devops Bug #5211: ceph-disk prepare: list_partitions() shouldn't return disks
More info:
A customer has its OS installed on /dev/sdak.
When running ceph-disk prepare /dev/sda, ceph-disk-pre...
Alexandre Marangone
10:19 AM rbd Bug #5220 (In Progress): test_ls_snaps segfaults on the arm test setup
Josh Durgin
09:55 AM rgw Feature #4310 (In Progress): rgw: multisite: radosgw changes: copy across regions
Ian Colle
09:55 AM rgw Feature #4337 (In Progress): rgw: multisite: metadata sync agent: implement full sync
Ian Colle
09:39 AM Bug #5176 (Resolved): leveldb: Compaction makes things time-out yielding spurious elections
Sage Weil
08:42 AM Fix #5232: osd: slow peering due to pg log rewrites
Should I post the config? Stefan Priebe
08:37 AM Bug #5239 (Can't reproduce): osd: Segmentation fault in ceph-osd / tcmalloc
We're still experiencing segmentation faults in the ceph-osd daemons from the 0.61.2-1~bpo70+1 debian packages.
It a...
Emil Renner Berthing
08:28 AM Bug #5238 (Resolved): osd: slow recovery (uselessly dirtying pg logs during peering)
seeing several failures due to slow recovery. it looks like the health checks stop, and teuthology continues on for ... Sage Weil
08:26 AM Bug #5237 (Duplicate): filestore idempotent tester failure
... Sage Weil

06/02/2013

07:16 PM Fix #5232: osd: slow peering due to pg log rewrites
No config changes except the minimum right now. Before I just had changed the osd op thread count.
4096 pgs 24 osd...
Stefan Priebe
02:40 PM Fix #5232: osd: slow peering due to pg log rewrites
Stefan, are there any non-default options in your ceph.conf that might affect pg log size? How many pgs do you h... Sage Weil
01:45 PM Fix #5232: osd: slow peering due to pg log rewrites
even so, it seems like a lot of time is spent just in the removal phase.. perhaps there is something not quite right ... Sage Weil
01:43 PM Fix #5232: osd: slow peering due to pg log rewrites
This looks to me like a lot of time is being spent in leveldb clearing and rewriting the pglog. This is probably jus... Sage Weil
11:33 AM Fix #5232: osd: slow peering due to pg log rewrites
May be a hint or just luck i could reduce the effect and time to recover by lowering osd op threads to 2 (default) in... Stefan Priebe
11:32 AM Fix #5232: osd: slow peering due to pg log rewrites
Attached you'll find a log with debugging enabled in between and two new gdb thread all traces. Stefan Priebe
09:39 AM Fix #5232: osd: slow peering due to pg log rewrites
stefan: can you also do
ceph --admin-daemon /var/run/ceph/ceph-osd.NNN.asok config set debug_ms 1
ceph --admin-...
Sage Weil
09:38 AM Fix #5232: osd: slow peering due to pg log rewrites
this thread?... Sage Weil
05:08 AM Fix #5232 (Resolved): osd: slow peering due to pg log rewrites
I noticed that since cuttlefish the osd recovery process is extremely slow. Also client I/O gets stalled to the recov... Stefan Priebe
03:34 PM Bug #5163: filestore: ENOTEMPTY on object removal
moved tamil's issue to #5233. and mike, i see the output now, but it doesn't make much sense. a more complete log w... Sage Weil
03:26 PM Bug #5163: filestore: ENOTEMPTY on object removal
Tamil- I see, it's the python rados tests. Is this reproducible? Sage Weil
03:18 PM Bug #5163 (Need More Info): filestore: ENOTEMPTY on object removal
Tamil- Yours looks like a different (and easier) bug. what was the workload? It appears to just be a bad truncation... Sage Weil
03:29 PM Bug #5233 (Resolved): python rados tests induce bad filestore truncate on arm
see #5163
filestore saw...
Sage Weil
06:29 AM Bug #5226: Some PG stay in "incomplete" state
After replacing OSD.25, nearly all incomplete PGs are [19, 25] or [25, 19]:
> $ ceph health detail
> HEALTH_WARN 1...
Olivier Bonvalet

06/01/2013

01:36 PM Bug #4976 (In Progress): osd powercycle triggers object corruption on xfs
ubuntu@teuthology:/a/teuthology-2013-05-31_20:00:08-rados-cuttlefish-master-basic/28270
trying to reproduce this w...
Sage Weil

05/31/2013

07:28 PM rbd Bug #5040 (Resolved): krbd: record that an parent info refresh has failed
The following has been committed to the ceph-client
"testing" branch:
93e85fb rbd: clean up a few things in the r...
Alex Elder
07:27 PM rbd Bug #3094 (Resolved): krbd: race between finding existing client and creating new one
The following has been committed to the "testing" branch
of the ceph-client git repository.
601e01d rbd: protect ...
Alex Elder
06:23 PM rbd Bug #5222 (Fix Under Review): krbd: use per-rbd_dev mutex to protect header updates
This patch has been posted for review:
0004-rbd-use-rwsem-to-protect-header-updates.patch
Alex Elder
02:40 PM rbd Bug #5222 (Resolved): krbd: use per-rbd_dev mutex to protect header updates
Currently updating header information for an rbd device
is protected by the control lock, which precludes
concurren...
Alex Elder
06:22 PM rbd Bug #3925 (Fix Under Review): krbd: sysfs write lockdep warnings
I found that avoiding taking the ctl_lock when updating
getting or putting device references got rid of the
problem...
Alex Elder
10:58 AM rbd Bug #3925: krbd: sysfs write lockdep warnings
That sequence reproduces the problem, even in the latest
version of the "testing" branch. (Not all of it may be
re...
Alex Elder
06:14 PM rgw Bug #5228 (Duplicate): radosgw-admin bucket list no longer shows all buckets
It can still list the buckets owned by a specific user when --uid is specified.
The bug was introduced by the foll...
Jan Harkes
06:11 PM Bug #5227 (Can't reproduce): ARM set up: rados test failed
rados_workunit_loadgen_mostlyread.yaml test failed in the ARM test setup [tala002, tala003, tala004]... Tamilarasi muthamizhan
04:21 PM Bug #5226 (Won't Fix): Some PG stay in "incomplete" state
Hi,
With bobtail I first lost OSD.25: the OSD process was crashing, and when its data were balanced on other...
Olivier Bonvalet
03:55 PM Bug #4855: peek map assert
root@ceph2:/var/log/ceph# ceph -v
ceph version 0.61.2 (fea782543a844bb277ae94d3391788b76c5bee60)

Hit this rep...
Nigel Williams
03:35 PM Bug #5225 (Closed): arm: rbd fsx test failed on the arm set up
rbd fsx test failed with core dump on the client.
logs are copied to ubuntu@burnupi24:/home/ubuntu/arm_testing_lo...
Tamilarasi muthamizhan
03:19 PM Bug #5163: filestore: ENOTEMPTY on object removal
The teuthology logs are copied to ubuntu@burnupi24.front.sepia.ceph.com:/home/ubuntu/bug5163/testing_logs_rados_python Tamilarasi muthamizhan
03:16 PM Bug #5163: filestore: ENOTEMPTY on object removal
This happened when running rados_python test on the arm test setup.... Tamilarasi muthamizhan
03:17 PM Bug #4579 (Resolved): kclient + ffsb workload makes osds mark themselves down
e21f8df1eb0c459d12911785c69f7427d1ad5689 Samuel Just
03:16 PM Bug #5216 (Resolved): restarted or failed osd resulted in a lot of caller_ops.size error messages...
Samuel Just
11:25 AM Bug #5216: restarted or failed osd resulted in a lot of caller_ops.size error messages and stalle...
The stalled I/O seems to come from the freshly started OSDs. They seem to tell ceph hey i can handle I/O but they're ... Stefan Priebe
11:24 AM Bug #5216: restarted or failed osd resulted in a lot of caller_ops.size error messages and stalle...
this is the backport:
commit 2af3f1d40b9c64f58d1a05232c52b2a47426fef5
Author: Samuel Just <sam.just@inktank.com>
...
Stefan Priebe
11:12 AM Bug #5216 (Pending Backport): restarted or failed osd resulted in a lot of caller_ops.size error ...
pushed fix to master, needs backport to cuttlefish
Note, this probably did not cause the IO hang.
Samuel Just
06:51 AM Bug #5216: restarted or failed osd resulted in a lot of caller_ops.size error messages and stalle...
So i get the caller_ops.size 3002 > log size 3001 messages while the osd is offline and i get the slow request messag... Stefan Priebe
06:36 AM Bug #5216: restarted or failed osd resulted in a lot of caller_ops.size error messages and stalle...
To me it seems that the osd sets itself online / available before it is really ready which then results in slow I/O.
...
Stefan Priebe
05:33 AM Bug #5216: restarted or failed osd resulted in a lot of caller_ops.size error messages and stalle...
Then the whole ceph storage became unstable until the osd was up and running again and had recovered. Stefan Priebe
05:32 AM Bug #5216 (Resolved): restarted or failed osd resulted in a lot of caller_ops.size error messages...
I'm running upstream/cuttlefish 85ad65e294f2b3d4bd1cfef6ae613e31d1cea635
I've seen the following today while just ...
Stefan Priebe
03:16 PM Bug #5223 (Resolved): ./osd/OSDMap.h: 387: FAILED assert(exists(osd))
Samuel Just
02:59 PM Bug #5223 (Resolved): ./osd/OSDMap.h: 387: FAILED assert(exists(osd))
13-05-31 03:07:57.486103 7fe8cc625700 0 -- 10.214.132.10:6801/30895 >> 10.214.131.23:6805/9730 pipe(0x211cc80 sd=70 ... Samuel Just
03:16 PM Bug #5224 (Resolved): too many open fds
Samuel Just
03:11 PM Bug #5224 (Resolved): too many open fds
Samuel Just
02:37 PM devops Bug #4924: ceph-deploy: gatherkeys fails on raring (cuttlefish)
:/
0.61.2
[root@test-ceph-1001 ~]# yum list ceph
Loaded plugins: security
Installed Packages
ceph.x86_64 ...
Greg Poirier
09:36 AM devops Bug #4924: ceph-deploy: gatherkeys fails on raring (cuttlefish)
This fix landed in 0.61.1. Please try that (or a newer) version and see if you're still hitting it.
Ian Colle
09:22 AM devops Bug #4924: ceph-deploy: gatherkeys fails on raring (cuttlefish)
I hate to kick a dead horse, but did this make it into 0.63 or will it be available in a later release? Ran into this... Greg Poirier
02:21 PM rbd Bug #5220: test_ls_snaps segfaults on the arm test setup
recopying the yaml... Tamilarasi muthamizhan
02:20 PM rbd Bug #5220 (Resolved): test_ls_snaps segfaults on the arm test setup
Test setup: Tala002, Tala003, Tala004
this happens when trying to run rbd/workloads/c_api_tests.yaml on the arm te...
Tamilarasi muthamizhan
01:17 PM rgw Bug #5197: Bucket shows up when listing buckets but does not exist anywhere else.
And #5219 covers the "user check" not cleaning up. Greg Farnum
01:09 PM rgw Bug #5197 (Resolved): Bucket shows up when listing buckets but does not exist anywhere else.
Okay, so the bucket rm didn't work because the object's not on disk, so the initial stat fails, and the radosgw-admin... Greg Farnum
01:17 PM rgw Feature #5219 (New): "radosgw-admin user check" should handle non-existent buckets in index
Right now, if "radosgw-admin user check" encounters a bucket whose object doesn't exist it uses default values (becau... Greg Farnum
01:04 PM rgw Feature #5218 (New): rgw: make bucket removal "atomic"
Right now, bucket removal consists of two steps:
1) Remove the bucket object (making sure the bucket index doesn't l...
Greg Farnum
12:04 PM devops Feature #5019 (In Progress): arm: gitbuilder for ARM
The arm kernel gitbuilder is now building bootable kernels. No debug yet. Sandon Van Ness
12:00 PM Fix #3188 (In Progress): osd: close read hole
Samuel Just
11:33 AM Bug #5084: osd: slow peering after osd restart (bobtail)
Some more details about my setup:
Hosts are CentOS 6.4 + elrepo kernel-ml. Ceph is cuttlefish (0.61.2) from official...
John Nielsen
10:54 AM Bug #5084: osd: slow peering after osd restart (bobtail)
I just want to add that I am definitely seeing this behavior on Cuttlefish. We run a number of VM's atop RBD. Any tim... John Nielsen
11:01 AM rgw Bug #5209 (Resolved): rgw: crash when head contains unexpected data (when getting range of bytes)
Fix is reviewed and in the next branch, commit:c5fc52ae0fc851444226abd54a202af227d7cf17. Cherry-picked back to cuttle... Greg Farnum
11:01 AM Bug #4813 (Resolved): pgs stuck creating
Samuel Just
11:00 AM rgw Bug #5204 (Resolved): rgw: copy object leaks tail
Fix is reviewed and in next branch, commit:b1312f94edc016e604f1d05ccfe2c788677f51d1. Cherry-picked to cuttlefish and ... Greg Farnum
09:58 AM devops Bug #5193: RHEL6 does not ship with xfsprogs

As a workaround, the xfsprogs rpm is available from the CentOS 6 repository, however installing that may result in...
Anonymous
09:35 AM devops Feature #5217 (Rejected): Add "Ceph" to all Ceph package descriptions
A number of the Ceph packages such as librbd and librados do not have "Ceph" in the package title. This makes it har... Anonymous
07:06 AM Bug #4357 (Can't reproduce): osd: FAILED assert("join on thread that was never started" == 0)
I'm closing this one for now. It hasn't popped up anymore, when it does, I'll re-open. Wido den Hollander
05:19 AM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
Then a component is missing on my test system (Debian 7 wheezy).
After rebooting the filesystem is not mounted whe...
Robert Sander

05/30/2013

10:54 PM CephFS Bug #4753 (Resolved): mds/Locker.cc: 4167: FAILED assert(0)
fixed this in commit:482733e9603e47a3a427b17bfb9b9189dacd5109 Sage Weil
10:23 PM Bug #5171 (Resolved): After crash monitor trying to bind to address of other monitor
Denis reports that #5203 fix should resolve this one as well. Thanks! Sage Weil
10:22 PM Bug #5203 (Resolved): mon: backup monmap for sync appears to drop correct monitor names?
fix is merged, commit:626de387e617db457d6d431c16327c275b0e8a34, and backported to cuttlefish.
Denis, can you open ...
Sage Weil
10:20 PM Bug #5203: mon: backup monmap for sync appears to drop correct monitor names?
Good. Looks like a solution for #5171 too (unsure about all cases, but I'm still too disturbed to remember precisely - happe... Denis kaganovich
10:24 AM Bug #5203 (Fix Under Review): mon: backup monmap for sync appears to drop correct monitor names?
Joao Eduardo Luis
10:23 AM Bug #5203: mon: backup monmap for sync appears to drop correct monitor names?
Joao Eduardo Luis
10:19 AM Bug #5203: mon: backup monmap for sync appears to drop correct monitor names?
proposed fix in wip-5203 Joao Eduardo Luis
09:15 AM Bug #5203: mon: backup monmap for sync appears to drop correct monitor names?
Edit: crash log had nothing to do with this bug. It's an entirely different issue regarding pick_addresses(). Joao Eduardo Luis
08:43 AM Bug #5203: mon: backup monmap for sync appears to drop correct monitor names?
Verified by forcing a monitor to sync and to assert out before actually synchronizing (using --mon-sync-requester-kil... Joao Eduardo Luis
08:17 AM Bug #5203 (Resolved): mon: backup monmap for sync appears to drop correct monitor names?
Came across this one while debugging one of saaby's mon crashes.
Apparently, saaby (@ #ceph) recreated a monitor u...
Joao Eduardo Luis
10:18 PM Bug #5177 (Rejected): logrotate.conf: "which /etc/init.d/ceph reload"
Ah, ok. Thanks! Sage Weil
10:08 PM Bug #5177: logrotate.conf: "which /etc/init.d/ceph reload"
Oh, sorry, this is not your bug. This is the result of the Gentoo ebuild's "sed". Denis kaganovich
09:38 PM Bug #5215: mon: hang during sync with mon thrashing
yeah, this one too:
ubuntu@teuthology:/a/teuthology-2013-05-30_18:12:14-rados-next-testing-basic/26830$
Sage Weil
09:37 PM Bug #5215 (Resolved): mon: hang during sync with mon thrashing
mon syncs for a while and then stops/gets stuck. i think this job failed yesterday, too, so it is likely easy to repr... Sage Weil
09:28 PM rbd Bug #3925 (In Progress): krbd: sysfs write lockdep warnings
Well shit.
I unmapped my image and I got a lockdep error.
I'll look some more tomorrow....
Alex Elder
09:26 PM rbd Bug #3925 (Resolved): krbd: sysfs write lockdep warnings
I have my answer. The problem does not show up
now that the snapshot sysfs files are gone.
I'm marking this bug ...
Alex Elder
08:48 PM rbd Bug #3925: krbd: sysfs write lockdep warnings
Well that was fun. I reproduced the problem immediately with:... Alex Elder
07:14 PM rbd Bug #3925 (In Progress): krbd: sysfs write lockdep warnings
Since I've been unable to reproduce this problem with
current code, I'm going to try reproducing it using
code that...
Alex Elder
04:42 PM rbd Bug #3925: krbd: sysfs write lockdep warnings
I just committed the following change to the
rbd/kernel.sh workunit in the "master" branch
of the ceph git reposito...
Alex Elder
09:05 AM rbd Bug #3925: krbd: sysfs write lockdep warnings
Oh, now I know what's happening. The "kernel.sh" script
was looking at the snapshot sysfs files, which are no
long...
Alex Elder
08:56 AM rbd Bug #3925: krbd: sysfs write lockdep warnings
I have tried to reproduce this a bunch of times, both
manually (as I described, using the refresh sysfs file)
and u...
Alex Elder
06:10 PM Bug #5198 (Duplicate): osd: powercycle testing triggers corrupt object data on xfs
oh, this is a dup of #4976 Sage Weil
05:40 PM Feature #3848 (Resolved): osd: gracefully handle cluster network heartbeat failure
Sage Weil
05:27 PM devops Bug #5210 (Resolved): ceph_deploy: purge and purgedata fails on ceph master branch
daemons weren't getting stopped. fixed as of commit:cf9aa7a0037e56eada8b3c1bb59d59d0bfe7bba5 Sage Weil
12:53 PM devops Bug #5210 (Resolved): ceph_deploy: purge and purgedata fails on ceph master branch
test set up: plana08... Tamilarasi muthamizhan
05:26 PM Bug #5206 (Resolved): debian: daemons stopped on upgrade
fixed as of commit:cf9aa7a0037e56eada8b3c1bb59d59d0bfe7bba5 Sage Weil
09:30 AM Bug #5206 (Resolved): debian: daemons stopped on upgrade
wip-deb-removal Sage Weil
05:17 PM devops Feature #5214 (Resolved): Kernel gitbuilders for rpm distros
Need kernel gitbuilders for centos 6.3 or 6.4, Fedora18, OpenSuse 12.2 or 12.3 and sles11sp2.
The centos and fedora ...
Anonymous
04:27 PM Bug #5200 (Resolved): mon: valgrind leaks
Sage Weil
11:24 AM Bug #5200 (Fix Under Review): mon: valgrind leaks
Sage Weil
10:02 AM Bug #5200 (In Progress): mon: valgrind leaks
Sage Weil
03:29 PM Subtask #5213 (Resolved): unit tests for src/osd/PGLog.{cc,h}
"work in progress":https://github.com/dachary/ceph/tree/wip-5213
Focus on the functions related to log merging ( m...
Loïc Dachary
03:06 PM rbd Documentation #5212 (Closed): doc: link to recommended kernel version from pages that describe us...
Default kernels like 3.2 in ubuntu precise are missing a lot of bug fixes for rbd and cephfs.
The docs recommend k...
Josh Durgin
03:04 PM Bug #5176 (Pending Backport): leveldb: Compaction makes things time-out yielding spurious elections
merged into next, commit:3cc0f3d803c376167175dd9082dc24f76ee1bd7a Sage Weil
11:29 AM Bug #5176: leveldb: Compaction makes things time-out yielding spurious elections
sylvain reports:... Sage Weil
03:04 PM rgw Bug #5197 (In Progress): Bucket shows up when listing buckets but does not exist anywhere else.
Looking at the cluster indicates that indeed, there's an orphaned omap entry on the <user>.buckets object, that doesn... Greg Farnum
12:28 PM rgw Bug #5197: Bucket shows up when listing buckets but does not exist anywhere else.
This was an empty bucket created under argonaut. It was deleted normally while an argonaut->bobtail upgrade was "in p... Greg Farnum
02:39 PM rgw Feature #4715: rgw: Add support for OPTIONS HTTP method
Yes, but not trivially. Yehuda Sadeh
02:31 PM rgw Feature #4715: rgw: Add support for OPTIONS HTTP method
Neil Levine wrote:
> Yehuda, can we close this?
Can this be backported to bobtail?
JuanJose Galvez
02:08 PM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
udev should trigger ceph-disk activate after the reboot to bring the osd back up; no fstab entry should be necessary (p... Sage Weil
12:53 AM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
Something like... Robert Sander
01:55 PM devops Bug #5211 (Resolved): ceph-disk prepare: list_partitions() shouldn't return disks
@# ceph-disk-prepare /dev/sda
ceph-disk: Error: Device is mounted: /dev/sdak1@
list_partitions('/dev/sda') will o...
Alexandre Marangone
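The error above (preparing /dev/sda trips over /dev/sdak1) is what one would expect if partitions are matched by name prefix. The report is truncated, so that cause is an assumption rather than a confirmed diagnosis; the sketch below only contrasts a prefix match with a stricter pattern. Both function names and the device list are made up for illustration.

```python
import re


def naive_partitions(disk, names):
    """Prefix match: also picks up other disks whose name merely starts with `disk`."""
    return [n for n in names if n.startswith(disk) and n != disk]


def strict_partitions(disk, names):
    """Accept only `disk` followed by a partition number, e.g. sda1, sda12.

    Simplification: devices that use a 'p' separator before the partition
    number (such as nvme0n1p1) are not handled here.
    """
    pattern = re.compile(r"^" + re.escape(disk) + r"\d+$")
    return [n for n in names if pattern.match(n)]


if __name__ == "__main__":
    # Hypothetical block device names on a host with many disks.
    names = ["sda", "sda1", "sda2", "sdak", "sdak1", "sdb"]
    print(naive_partitions("sda", names))   # includes sdak and sdak1
    print(strict_partitions("sda", names))  # only sda1 and sda2
```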
01:45 PM Bug #5157 (In Progress): install: unable to pull ceph rpm packages on fedora18
The install is failing because the epel repository is not configured.
The workaround is to configure the repo:
...
Anonymous
01:41 PM Bug #5188 (Resolved): ceph-deploy nightlies failing
tested this locally and it works fine. Tamilarasi muthamizhan
01:14 PM Bug #5188: ceph-deploy nightlies failing
related to bug#5210 Tamilarasi muthamizhan
01:14 PM Bug #5188: ceph-deploy nightlies failing
modified the yaml for ceph-deploy to pick cuttlefish branch instead of master. Tamilarasi muthamizhan
01:18 PM rgw Bug #5209 (In Progress): rgw: crash when head contains unexpected data (when getting range of bytes)
Yehuda Sadeh
12:45 PM rgw Bug #5209 (Resolved): rgw: crash when head contains unexpected data (when getting range of bytes)
We ended up with a multipart object that had head with data (some old argonaut issue?). A request to retrieve only pa... Yehuda Sadeh
11:38 AM devops Bug #5208 (Resolved): Debian Wheezy Needs the 'ca-certificates' package before you can wget the p...
'ceph-deploy install...' needs the ca-certificates or you get :
pushy.protocol.proxy.ExceptionProxy: Command 'wget -...
Steve H.
11:23 AM Bug #5183 (Resolved): occasional failure of rbd DiffIterateStress test
Sage Weil
11:09 AM rgw Feature #5207 (New): rgw: make listing non-standard bucket names through S3 api configurable
Buckets that were created through the swift api and do not conform to the S3 naming requirements can be listed. Make ... Yehuda Sadeh
09:37 AM rgw Bug #5204 (In Progress): rgw: copy object leaks tail
Ian Colle
09:03 AM rgw Bug #5204 (Resolved): rgw: copy object leaks tail
Problem is that we end up overriding the copied object tag with the original tag. Yehuda Sadeh
09:24 AM Bug #5205 (Resolved): mon: FAILED assert(ret == 0) on config's set_val_or_die() from pick_address...
This is the crash's log (from saaby @ #ceph):... Joao Eduardo Luis
08:14 AM rbd Bug #3978: krbd qa: concurrent.sh test leaves something read-only
The following has been committed to the ceph-qa-suite
"master" branch:
2957d68 rbd_concurrent: add new task t...
Alex Elder
08:11 AM rbd Bug #3978: krbd qa: concurrent.sh test leaves something read-only
The following has been committed to the ceph "master" branch:
f402568 rbd/concurrent.sh: probe rbd module at s...
Alex Elder
07:40 AM rbd Bug #3978: krbd qa: concurrent.sh test leaves something read-only
The cleanup routine run when concurrent.sh exits is
run after a call to "wait", so all background tasks
should be d...
Alex Elder
05:50 AM rbd Bug #3978 (In Progress): krbd qa: concurrent.sh test leaves something read-only
I've been running this test this morning and am finding it
is *not* exhibiting the problem that I originally reporte...
Alex Elder
07:35 AM devops Documentation #5202 (Rejected): "ceph osd stop" not available
The documentation at http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/#stopping-w-out-rebalancin... Robert Sander
05:21 AM Bug #5062: mon: 0.61.2 asserts on AuthMonitor during monitor start
There has been another iteration of this bug happening on PGMonitor (from user saaby on IRC):... Joao Eduardo Luis
05:06 AM Feature #2283: The ceph command should time out
I'm hitting this too ... I wanted to monitor the health more closely with the recent mon issues unfortunately it oft... Sylvain Munaut
12:04 AM devops Feature #5019: arm: gitbuilder for ARM
Took some futzing but I got the builds working. Unfortunately it did build an armel image as it looks like the auto d... Sandon Van Ness

05/29/2013

10:54 PM Bug #5201 (Resolved): osd: valgrind leaks
Sage Weil
10:54 PM Bug #5200 (Resolved): mon: valgrind leaks
Sage Weil
09:01 PM Bug #5157: install: unable to pull ceph rpm packages on fedora18
[ubuntu@burnupi23 ~]$ su -c 'rpm -Uvh http://gitbuilder.ceph.com/ceph-rpm-fc18-x86_64-basic/ref/cuttlefish/RPMS/x86_6... Tamilarasi muthamizhan
08:51 PM devops Bug #4641 (Resolved): ceph-deploy install fails on fedora 18
tested and it works fine on ceph version 0.61.2
Tamilarasi muthamizhan
08:50 PM devops Bug #5199 (Resolved): ceph-deploy: on fedora18, osd create command doesnt seem to mount the disks
test setup: burnupi22
while osd create command succeeds with no error, the osd disks are not mounted and the osd p...
Tamilarasi muthamizhan
08:48 PM Linux kernel client Bug #4646 (Need More Info): kcephfs: writeback pagevec pool size vs stripe unit limit
I'd like someone (like Sage) to determine whether
we should just mark this "won't fix."
Alex Elder
05:55 PM Linux kernel client Bug #4646: kcephfs: writeback pagevec pool size vs stripe unit limit
I implemented a fix for this, and got all the way to
the end of describing it, when I realized the math
makes this ...
Alex Elder
03:56 PM Linux kernel client Bug #4646: kcephfs: writeback pagevec pool size vs stripe unit limit
I think an easy fix for now is just to allocate the pagevec_pool
to have objects sufficient to hold pages that would...
Alex Elder
02:35 PM Linux kernel client Bug #4646: kcephfs: writeback pagevec pool size vs stripe unit limit
On the osd, it looks to me like CEPH_MDS_OP_SETLAYOUT uses
ceph_file_layout_is_valid() to verify the layout supplied...
Alex Elder
08:23 PM Linux kernel client Feature #4770: krbd: consider including write data with layered existence check
Removing myself as assignee, I won't have time to complete this. Alex Elder
08:22 PM Linux kernel client Bug #4869: libceph: osd_client: get_reply() generalize for more ops
Removing myself as assignee, I won't have time to complete this. Alex Elder
05:25 PM Bug #5198 (Duplicate): osd: powercycle testing triggers corrupt object data on xfs
... Sage Weil
04:53 PM Bug #4967 (Resolved): Misbehaving OSD sets over half of the cluster as down despite "osd min down...
committed to next, backported to cuttlefish Sage Weil
04:15 PM Bug #4967: Misbehaving OSD sets over half of the cluster as down despite "osd min down reporters ...
Sage Weil
04:40 PM rgw Bug #5197 (Resolved): Bucket shows up when listing buckets but does not exist anywhere else.

There is a bucket which shows up when buckets are listed through the api but exists nowhere else. We need to get th...
JuanJose Galvez
04:37 PM Feature #5147: Display unique cluster ID in ceph status
Sage Weil
03:07 PM Feature #4782 (Resolved): osd: build writeback model to replace async flusher
Samuel Just
02:47 PM Bug #5195 (Resolved): "ceph-deploy mon create" fails when adding additional monitors
When trying to add another monitor to an existing cluster with "ceph-deploy mon create <hostname>" the operation fail... Robert Sander
02:35 PM Bug #4603: ceph: writeback pagevec pool is created incorrectly
Whoops, meant to update http://tracker.ceph.com/issues/4646.
Alex Elder
02:33 PM Bug #4603: ceph: writeback pagevec pool is created incorrectly
On the osd, it looks to me like CEPH_MDS_OP_SETLAYOUT uses
ceph_file_layout_is_valid() to verify the layout supplied...
Alex Elder
01:59 PM devops Bug #5194 (Resolved): udev does not start osd after reboot on wheezy or el6 or fedora
ceph-deploy creates a partition with a filesystem (XFS by default) and mounts it to /var/lib/ceph/osd/<clustername>-<... Robert Sander
01:53 PM rbd Bug #5040 (Fix Under Review): krbd: record that an parent info refresh has failed
The following has been posted for review:
[PATCH] rbd: clean up a few things in the refresh path
Alex Elder
01:51 PM rbd Bug #3094 (Fix Under Review): krbd: race between finding existing client and creating new one
The following has been posted for review:
[PATCH] rbd: protect against duplicate client creation
Alex Elder
08:49 AM rbd Bug #3094 (In Progress): krbd: race between finding existing client and creating new one
I've been able to reproduce this problem by simply running
five instances of an "rbd map" command for the same image...
Alex Elder
01:43 PM Feature #3848 (Fix Under Review): osd: gracefully handle cluster network heartbeat failure
Sage Weil
01:39 PM Bug #4801 (Duplicate): osd class path broken on fedora 18?
Tamilarasi muthamizhan
01:35 PM devops Bug #4984 (New): ceph_deploy: osd create succeeds with an error message (partprobe returns error)
Tamilarasi muthamizhan
01:35 PM devops Bug #4984: ceph_deploy: osd create succeeds with an error message (partprobe returns error)
yes, the problem still exists.... Tamilarasi muthamizhan
01:23 PM devops Bug #5193 (Resolved): RHEL6 does not ship with xfsprogs
The following commit adds an rpm package dependency on xfsprogs,
https://github.com/ceph/ceph/commit/b2501e91bb8...
Jan Harkes
12:42 PM rgw Bug #5192 (Won't Fix): RGW: radosgw-admin user rm --access-key not working on bobtail
access-key should still be able to look up the uid, but the command is failing.
radosgw-admin user rm --access-key=$...
Tyler Brekke
12:11 PM devops Bug #5047 (Closed): ceph build needs libboost 1.50 for debian sid
The boost library issue seems to have been resolved upstream. Anonymous
11:48 AM devops Feature #5191 (Rejected): Create gitbuilder for Hadoop v2.x compatible Ceph plugin
We need a gitbuilder to build the Hadoop / Ceph plugin that is compatible with the Hadoop 2.x line (this is distinct ... Anonymous
11:47 AM rbd Feature #4834 (In Progress): Recompile/package qemu with new version of librbd to enable asynchro...
QEMU packages built against bobtail (0.56.6) and cuttlefish (0.61.2) with and without the async flush patch are avail... Anonymous
11:45 AM devops Feature #5190 (Rejected): Create Apache Hadoop 2.x gitbuilder
We need another gitbuilder for the Apache Hadoop 2.x line so that we can develop and test against it.
Let's call it ...
Anonymous
11:22 AM Bug #5084: osd: slow peering after osd restart (bobtail)
Faidon, it shouldn't affect cuttlefish as much, though that is not clear. That patch would need to be installed on a... Samuel Just
11:17 AM Bug #5084: osd: slow peering after osd restart (bobtail)
Igor, a main problem is that we are writing out the pg epoch to the filestore when we don't need to. The second prob... Samuel Just
10:40 AM Bug #5084: osd: slow peering after osd restart (bobtail)
We wrote a test that sequentially reads 1 MB blocks spaced at 64 MB offsets (i.e. from different placement groups) an... Igor Lukyanov
11:19 AM Bug #5183 (Pending Backport): occasional failure of rbd DiffIterateStress test
Sage Weil
09:45 AM Bug #5183: occasional failure of rbd DiffIterateStress test
Looks good. Samuel Just
11:18 AM rbd Feature #5005: cinder: switch rbd driver to use librbd instead of the cli tool
Ian Colle
11:18 AM rbd Feature #5004: cinder: make rbd configuration easier to use
Ian Colle
11:18 AM rbd Feature #5003: cinder/nova: don't require ceph.conf on a compute host / support multiple clusters
Ian Colle
10:07 AM devops Feature #5019: arm: gitbuilder for ARM
Need various flavors of kernel gitbuilders for ARM - both debug and performance Ian Colle
10:05 AM devops Bug #5189: ceph-deploy disk prepare fails silently
When I add another disk to the test VM (/dev/sdc) and create a partition /dev/sdc1 ceph-deploy succeeds with:
ceph...
Robert Sander
09:10 AM devops Bug #5189: ceph-deploy disk prepare fails silently
Ceph was installed from the cuttlefish Debian/Ubuntu repo (including ceph-deploy).
ceph-deploy was used to create ...
Robert Sander
09:09 AM devops Bug #5189 (Resolved): ceph-deploy disk prepare fails silently
$ ceph-deploy disk list ceph01-test
/dev/sda :
/dev/sda1 other, ext2, mounted on /boot
/dev/sda2 other
/dev/sd...
Robert Sander
09:58 AM Bug #5176: leveldb: Compaction makes things time-out yielding spurious elections
Sylvain, I have a wip-5176 branch that makes us compact in a background thread, and over smaller ranges. Can you giv... Sage Weil
09:35 AM Bug #4179 (In Progress): osd: memory leak during deep scrub on bobtail
Sage Weil
08:56 AM Bug #5188 (Resolved): ceph-deploy nightlies failing
Sage Weil
08:47 AM rbd Feature #5187 (Resolved): rbd: allow unmap using mapped image name
The umount(8) command has a very useful feature that allows
one to specify *either* the device *or* the directory th...
Alex Elder
08:44 AM rbd Bug #5186 (Won't Fix): krbd: mapping same image produces ambiguous /dev file
Since it's possible to map the same image more than once,
the mechanism of putting an entry in /dev/rbd/rbd/<image>
...
Alex Elder
08:38 AM rbd Bug #5185 (Closed): rbd: nothing prevents concurrent write mappings
While attempting to test http://tracker.ceph.com/issues/3094
I learned that nothing prevented me from mapping the sa...
Alex Elder
08:16 AM rbd Bug #5184 (Resolved): libceph: create_singlethread_workqueue() error handling
In ceph_osdc_init() there are these lines of code:... Alex Elder
07:39 AM rbd Bug #5146 (Resolved): krbd: wait for safe callback for writes
The following has been committed to the ceph-client
"testing" branch:
70c725f rbd: wait for safe callback for...
Alex Elder
07:39 AM rbd Bug #3859 (Resolved): osd_client: define ceph_osdc_clear_request_linger()
The following has been committed to the ceph-client
"testing" branch:
ebd8324 libceph: add lingering request ...
Alex Elder
07:33 AM Bug #4999: monitor sync failure
I've been unable to reproduce this while using the debugging info patch.
Finally, yesterday I tried the cuttlefish...
Jim Schutt
07:32 AM rbd Bug #4777 (Resolved): krbd: verify a few things in the zeroing routines
The following has been committed to the "testing" branch
of the ceph-client git repository:
81d7ac5 rbd: flu...
Alex Elder
03:26 AM Feature #4929: Erasure encoded placement group
maybe use erasure encoding from rozofs (https://github.com/rozofs/rozofs) Loïc Dachary
03:24 AM Subtask #5046: Factor out PG logs, PG missing
Write tests for pg_missing_t (https://github.com/dachary/ceph/tree/wip-pg_missing_t-tests) Loïc Dachary

05/28/2013

10:17 PM Feature #685 (Duplicate): libcephmon: interact with ceph monitors via a library
Sage Weil
08:48 PM Bug #5183 (Resolved): occasional failure of rbd DiffIterateStress test
wip-osd-obc-snapdir Sage Weil
08:45 PM Bug #5172 (Resolved): wrongly marked down heartbeat issues
commit:b6be785775442af1999b2543bd07a0d28391dbc5 Sage Weil
04:39 PM devops Bug #5182 (Won't Fix): ceph-disk looks like it tries to mark preexisting OSD partitions with the ...
ceph-disk prepare_dev says, near the end: if not is_partition(data), mark the partition as an OSD
type, udevadm set...
Dan Mick
04:38 PM Bug #5176 (Fix Under Review): leveldb: Compaction makes things time-out yielding spurious elections
wip-5176 Sage Weil
04:37 PM Documentation #5181 (Closed): need to explain what does and doesn't work with ceph-deploy and pre...
ceph-deploy with preexisting partitions is weird; first, they may not be GPT, in which case
ceph-disk activate from ...
Dan Mick
01:15 PM Bug #5180 (Resolved): start_split, start_col_split, start_merge must fsync after tagging the in p...
Samuel Just
11:10 AM Bug #5180 (Resolved): start_split, start_col_split, start_merge must fsync after tagging the in p...
Samuel Just
09:55 AM rgw Feature #5169: Do not list swift containers when enumerating buckets using S3 API
Can you provide some more logs for this issue, just to make sure that what we think happens actually happens? Yehuda Sadeh
09:49 AM Bug #5177 (Need More Info): logrotate.conf: "which /etc/init.d/ceph reload"
I can't figure out which version has this problem... where do you see the broken reload line?
Thanks!
Sage Weil
09:08 AM Bug #5177 (Fix Under Review): logrotate.conf: "which /etc/init.d/ceph reload"
Anonymous
08:57 AM Bug #5171: After crash monitor trying to bind to address of other monitor
Okay, so you have a 15G monitor store? Is that it? If so, you might have been bit by #4895 and restarting the monito... Joao Eduardo Luis
07:51 AM Bug #5171: After crash monitor trying to bind to address of other monitor
PPS
1) fix: 15G->30G->15G;
2) In theory, can be fixed by "--inject-monmap", but repair is slow or infinite...
Denis kaganovich
07:05 AM Bug #5171: After crash monitor trying to bind to address of other monitor
PS: One more issue (I will not open a new ticket, for the same reason): while 2 of 3 monitors are up and repairing after (or during) t... Denis kaganovich
06:43 AM Bug #5171: After crash monitor trying to bind to address of other monitor
OK, now I see: sync is too slow, but it seems it will finish eventually. Can somebody comment on this sync speed (fix or wan... Denis kaganovich
03:33 AM Bug #5171: After crash monitor trying to bind to address of other monitor
No. First I tried to purge/recreate the monitor. Now it is syncing endlessly and not coming up. I am just in a panic (I have ticke... Denis kaganovich
08:51 AM rgw Documentation #5178 (Resolved): rgw: fix keystone openssl to nss conversion
As specified here:
http://thread.gmane.org/gmane.comp.file-systems.ceph.user/1637
Yehuda Sadeh
06:08 AM Bug #4895: leveldb: mon workload makes store.db grow without bound
See https://code.google.com/p/leveldb/issues/detail?id=158 and the discussion https://groups.google.com/forum/#!msg/... Sylvain Munaut
04:28 AM Bug #4895: leveldb: mon workload makes store.db grow without bound
I just disabled compact-on-trim, and it doesn't look good :( It grew about 1GB in 2 hours.
On the plus side, there i...
Sylvain Munaut
01:31 AM Documentation #3808 (Resolved): Block device quick start page need update
This was verified as working with cuttlefish and ceph-deploy. John Wilkins
01:29 AM rgw Documentation #2990 (Resolved): doc: expand/complete RGW S3 API reference
This is complete now. Todo: A path between Quick Start and using the APIs. S3 subdomain configuration still needs to ... John Wilkins

05/27/2013

07:55 PM Bug #5172 (Fix Under Review): wrongly marked down heartbeat issues
or wip-5172, don't see wip_5172 :) Sage Weil
02:54 PM Fix #3188: osd: close read hole
pushed wip-osd-readhole with some old incomplete work on this. here's a brain dump of where my thinking is/was on th... Sage Weil
01:44 PM Bug #5175: leveldb: LOG and MANIFEST file grow without bound (LOG being _text_ log !)
i wonder if turning off the compaction will make this grow slowly enough to not be an issue. strangely, i still get ... Sage Weil
08:16 AM Bug #5175: leveldb: LOG and MANIFEST file grow without bound (LOG being _text_ log !)
Workaround for LOG is to use this config:... Sylvain Munaut
07:29 AM Bug #5175 (Resolved): leveldb: LOG and MANIFEST file grow without bound (LOG being _text_ log !)
leveldb has two files that seem to grow without bound and are only cleared on db open.
The first is the LOG file w...
Sylvain Munaut
01:24 PM Bug #4895 (Resolved): leveldb: mon workload makes store.db grow without bound
awesome. Sylvain, can you try setting 'mon compact on trim = false' and seeing if it continues to not grow? the ori... Sage Weil
01:22 AM Bug #4895: leveldb: mon workload makes store.db grow without bound
I've been testing this for the last 5 days and I haven't seen any uncontrolled/fast growth of the mon store like I us... Sylvain Munaut
01:19 PM devops Bug #5174 (Resolved): df: ‘/media/osd.0/.’: No such file or directory
fixed by commit:d81d0ea5c442699570bd93a90bea0d97a288a1e9, backported to cuttlefish branch, but not yet in a cuttlefis... Sage Weil
12:29 PM Bug #5171 (Need More Info): After crash monitor trying to bind to address of other monitor
Do you have the full log for this monitor? Joao Eduardo Luis
10:25 AM Bug #5084: osd: slow peering after osd restart (bobtail)
> As we can assume client ops are waiting for new OSD map that is issued only after peering finishes.
> It seems tha...
Igor Lukyanov
08:28 AM Bug #5177 (Rejected): logrotate.conf: "which /etc/init.d/ceph reload"
logrotate.conf: "which /etc/init.d/ceph reload". It is always false (if there is no file "reload" in "."). The new log is always zero.
...
Denis kaganovich
07:56 AM Bug #5176 (Resolved): leveldb: Compaction makes things time-out yielding spurious elections
It seems that compaction can take a few seconds (despite running on 10k SAS disks) and can cause peons to not renew t... Sylvain Munaut
07:05 AM Tasks #4560 (Closed): unit tests for src/os/LFNIndex.cc
There is still more work to be done but another ticket can be re-opened if someone wants to work on it. Loïc Dachary
04:51 AM CephFS Bug #5162: File is locked unexpected and not released anymore
I tried restarting all ceph services by issuing # /etc/init.d/ceph -a restart but it didn't solve the problem. However I di... joe huang
12:44 AM CephFS Bug #5105: mds/CInode.cc: 1996: FAILED assert(auth_pins >= 0)
I think uncommenting MDS_AUTHPIN_SET in src/mds/mdstypes.h would help
Zheng Yan

05/26/2013

10:43 PM CephFS Bug #5162: File is locked unexpected and not released anymore
>ceph: check_caps ffff880117288848 file_want pFscr used p dirty - flushing - issued pAsLsXsFcb revoking - retain pAsx... Zheng Yan
08:09 PM CephFS Bug #5162: File is locked unexpected and not released anymore
Hi Zheng,
Sorry for the late reply. Here is the kernel message.
[ 219.824078] ceph: mdsc delayed_work
[ 219.82...
joe huang
07:45 PM devops Bug #5174 (Resolved): df: ‘/media/osd.0/.’: No such file or directory
In my cluster, there are two machines:
host1: mon/mds
host2: two osd/mon
When I exec 'service ceph -a start' on h...
jianpeng ma
08:14 AM Subtask #5046: Factor out PG logs, PG missing
Ceph placement groups backfilling (http://dachary.org/?p=2009) Loïc Dachary
04:16 AM Bug #5173: ceph scrub found missing pg object
Run ceph pg repair 2.df
Finally, I umounted all osds one by one and checked XFS and mounted back with barriers (we...
Ivan Kudryavtsev
03:27 AM Bug #5173: ceph scrub found missing pg object
All files have the same md5 sum:... Ivan Kudryavtsev
02:34 AM Bug #5173 (Can't reproduce): ceph scrub found missing pg object
I'm using ceph version 0.56.4 (63b0f854d1cef490624de5d6cf9039735c7de5ca)
All data is 3-times replicated (pools Size ...
Ivan Kudryavtsev

05/25/2013

11:17 PM Bug #4608 (Resolved): Incorrect RGW apache conf example
http://ceph.com/docs/master/start/quick-rgw/#create-a-gateway-configuration-file John Wilkins
07:27 AM rbd Bug #3737: Higher ping-latency observed in qemu with rbd_cache=true during disk-write
Update: seems to work fine if I turn writeback caching back on again (previously turned off before patching). Edwin Peer
12:24 AM rbd Bug #3737: Higher ping-latency observed in qemu with rbd_cache=true during disk-write
Using ceph 0.61.2 and qemu 1.4.2 or earlier versions with the patch:
The following hangs after a few iterations:
...
Edwin Peer

05/24/2013

05:56 PM Bug #5172 (In Progress): wrongly marked down heartbeat issues
wip_5172, going to test later Samuel Just
05:22 PM Bug #5172: wrongly marked down heartbeat issues
2013-05-23 10:35:22.200882 7fe1668a1700 1 -- 10.214.131.15:0/14951 <== osd.4 10.214.131.14:6803/14730 11 ==== osd_pi... Samuel Just
05:20 PM Bug #5172 (Resolved): wrongly marked down heartbeat issues
ubuntu@teuthology:/a/samuelj-2013-05-23_10:20:47-rados-wip_osd_throttle-master-basic/20593/remote
Despite the bran...
Samuel Just
04:23 PM Bug #5160 (Resolved): Enable good pool behavior FLAG_HASHPSPOOL
Samuel Just
04:02 PM Bug #5171: After crash monitor trying to bind to address of other monitor
After that, the other 2 monitors work (answer) only after restarting mon.3 Denis kaganovich
03:58 PM Bug #5171 (Resolved): After crash monitor trying to bind to address of other monitor
Rebooted cluster (3 nodes, 3 monitors). After that, one of the monitors syncs, updates its database, then dies (assertion) and then... Denis kaganovich
02:43 PM Cleanup #4828 (Rejected): dan: don't respond to e-mail via your phone in the bathroom
I claim this is not a bug, and the behavior is expected. Closing as Rejected; submitter can appeal if he wants to co... Dan Mick
02:36 PM Feature #4457: api: add JSON schema/output protocol to rados.py
Dan Mick
02:36 PM Feature #4547 (In Progress): api: implement self-description for --admin-daemon commands
Dan Mick
02:36 PM Feature #4548: api: implement self-description for osd/mon tell commands
Dan Mick
02:33 PM Feature #4455: api: move '--format' into just another command argument
Dan Mick
02:32 PM Feature #4839: api: make new CLI send old version of commands to old monitors during upgrade
Dan Mick
02:31 PM Feature #4315: api: create python CLI wrapper for ceph tool; read command descriptions and valida...
Dan Mick
02:30 PM Feature #4314: api: modify ceph tool to describe own commands
Ian Colle
02:29 PM rgw Feature #5170 (Resolved): RGW: Object restriping tool to fix large objects from argonaut.
DHO needs an object restriping tool to read in the extremely large objects from argonaut and write them back into rado... Tyler Brekke
02:28 PM Feature #3849: Track slow PGs and times OSDs marked down
Neil Levine
02:26 PM Bug #3143: Obsync object verification takes too long
Apparently DH are maintaining the obsync code more than Inktank now. Perhaps check with them if they want to fix thi... Neil Levine
02:26 PM Feature #4982: OSD: namespaces pt 1 (librados/osd, not caps)
The CRUSH bit of this is just for including the namespace when choosing the PG an object hashes into, right? (ie, the... Greg Farnum
02:17 PM Feature #4982: OSD: namespaces pt 1 (librados/osd, not caps)
crush needs to take into account the namespace argument in such a way that hobjects with an empty namespace hash the ... Samuel Just
02:03 PM rgw Feature #5169 (New): Do not list swift containers when enumerating buckets using S3 API
If a user has created containers over swift protocol, they show up in bucket listing over S3, causing problems for an... JuanJose Galvez
02:03 PM Feature #4782: osd: build writeback model to replace async flusher
Ian Colle
12:03 PM rbd Feature #4454: openstack: support volume migration in Cinder
https://blueprints.launchpad.net/cinder/+spec/volume-migration Neil Levine
11:59 AM rbd Feature #4085: qemu-rbd: allow storing snapshot of ram associated with snapshot of disk
Neil Levine
11:59 AM Linux kernel client Feature #4888: krbd: support boot from root file system on an rbd image
Neil Levine
11:50 AM rbd Feature #5167: openstack: cinder: differential backups
Neil Levine
11:26 AM rbd Feature #5167 (Resolved): openstack: cinder: differential backups
Update the backup service in cinder to support a differential backup format, and the rbd driver to output differentials. Josh Durgin
11:49 AM rbd Documentation #5009: doc: explain how to get qemu packages for each distro
Neil Levine
11:49 AM rbd Documentation #5006: doc: openstack configuration changes for havana
Neil Levine
11:48 AM rbd Feature #5168: openstack: cinder: rbd as a backup target
Neil Levine
11:30 AM rbd Feature #5168 (Resolved): openstack: cinder: rbd as a backup target
This would allow using a different pool as a backup instead of an object store. Josh Durgin
11:17 AM rgw Documentation #5166 (Resolved): rgw: dr: async repl and DR documentation
Ian Colle
11:17 AM rgw Documentation #5165 (Resolved): rgw: multisite: regions and global namespace documentation
Ian Colle
11:12 AM rgw Feature #4335: rgw: dr: sync processing state: define datastructures
This is so that agents which get restarted have durable information about what work they were doing before restart, a... Greg Farnum
11:10 AM rgw Feature #5164 (Closed): rgw: multisite: metadata push notifications: design blueprint
Ian Colle
10:49 AM rgw Feature #4334 (Fix Under Review): rgw: dr: bucket index log API: implement RESTful API
Yehuda Sadeh
10:49 AM rgw Feature #4333 (Fix Under Review): rgw: multisite: metadata-changes log: implement RESTful API
Yehuda Sadeh
10:49 AM rgw Feature #5008 (Fix Under Review): rgw: bucket metadata changes should be reflected in mdlog
Yehuda Sadeh
09:53 AM devops Cleanup #5106 (In Progress): ceph_deploy: install/compile error on wheezy
commit 61b610fbc841c6943e41f23569ed5f6835d8caed
Author: Gary Lowell <glowell@inktank.com>
Date: Thu May 23 09:23:...
Anonymous
08:12 AM Linux kernel client Cleanup #2438 (Closed): ceph-client: use BUG_ON() for null auth_client->ops pointers
Sage added this commit:
27859f9 libceph: wrap auth ops in wrapper functions
...which neatened up the auth calls...
Alex Elder
07:59 AM rbd Bug #3889 (Won't Fix): krbd: handle zero-length requests
OK, after a little more discussion... We're going to
go the easy route and just close this issue. We'll
continue ...
Alex Elder
07:54 AM Bug #5163: filestore: ENOTEMPTY on object removal
The object in question was part of an rbd image that a vm was doing a fstrim on when the crash happened. Mike Lowe
07:46 AM Bug #5163 (Can't reproduce): filestore: ENOTEMPTY on object removal
I had an osd crash during normal operation; this could possibly be related to 4927. I was able to restart the osd a... Mike Lowe
07:23 AM rbd Bug #5146: krbd: wait for safe callback for writes
Josh has reviewed this patch and the two others I posted
with it. I was testing the three of them together yesterda...
Alex Elder
01:52 AM CephFS Bug #5162: File is locked unexpected and not released anymore
Looks like the client's fault. Try the following command on client.5898 and upload debug.txt:
# echo module ceph +p >/sys/...
Zheng Yan

05/23/2013

11:16 PM CephFS Bug #5162 (Can't reproduce): File is locked unexpected and not released anymore
I deployed a ceph cluster and mounted cephfs via the kernel module. A few days later, when I ls a particular f... joe huang
08:11 PM Bug #5159 (Resolved): OSD: reset heartbeat timer for each read chunk in deep scrub
Samuel Just
03:26 PM Bug #5159 (Resolved): OSD: reset heartbeat timer for each read chunk in deep scrub
Samuel Just
07:49 PM devops Bug #5161 (Resolved): daemons should create /var/run/ceph if it doesn't already exist
I wanted to add a new mon into a cluster. But when I exec "ceph-mon -i majianpeng --mkfs --monmap map --keyring key", ... jianpeng ma
03:31 PM Bug #5160 (Resolved): Enable good pool behavior FLAG_HASHPSPOOL
Samuel Just
03:08 PM Feature #5158 (New): Objecter: support multi-read-from-replica
Sam pointed out that we could turn our reads into essentially an Available (versus Consistent) model just by turning ... Greg Farnum
01:18 PM rgw Bug #5152 (Resolved): rgw: usage iteration by user doesn't skip to correct epoch
Landed to Next, Bobtail, and Cuttlefish. Ian Colle
10:24 AM rgw Bug #5152: rgw: usage iteration by user doesn't skip to correct epoch
Reviewed-by. Greg Farnum
09:33 AM rgw Bug #5152: rgw: usage iteration by user doesn't skip to correct epoch
Please review - needs to go into Next, Cuttlefish, and Bobtail Ian Colle
08:57 AM rgw Bug #5152 (Fix Under Review): rgw: usage iteration by user doesn't skip to correct epoch
Sage Weil
01:13 PM Bug #5084: osd: slow peering after osd restart (bobtail)
We reproduce the same bug on both Bobtail and Cuttlefish deployments just by calling osd in/out/reweight.
Peering compl...
Igor Lukyanov
12:33 PM rbd Feature #4236 (Duplicate): krbd: properly handle flush commands
Marking this duplicate of: http://tracker.ceph.com/issues/3889
I looked into the zero-length request stuff more ge...
Alex Elder
11:39 AM Bug #5157 (Resolved): install: unable to pull ceph rpm packages on fedora18
test set up: burnupi23... Tamilarasi muthamizhan
10:59 AM Bug #5156 (Duplicate): OSD: split followed by pg resurrection might leave an object in two collec...
remove a
split a -> a, c
create c
resurrect a
will leave objects in both a and c.
Samuel Just
10:35 AM CephFS Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
I have attached the logs from two nodes of my MDS cluster.
I started mds.0 first. When I started mds.1, mds.0 crashed.
Walter Huf
05:54 AM CephFS Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
Sage Weil wrote:
> Argh.. i don't have a log after all.
>
> Yan, dropping the assert avoids the crash, but it see...
Zheng Yan
10:29 AM rbd Bug #3889: krbd: handle zero-length requests
We'll discuss details at our standup, but here is an update.
Unless I misunderstand him, Sage believes that reques...
Alex Elder
08:58 AM rbd Bug #3889 (In Progress): krbd: handle zero-length requests
I just sent the following in an e-mail to Josh and Sage,
but thought I might as well document it here. If we want
...
Alex Elder
10:26 AM devops Bug #4641: ceph-deploy install fails on fedora 18
[ubuntu@burnupi22 ceph-deploy]$ ./ceph-deploy install burnupi22
########################################
Error: Pac...
Tamilarasi muthamizhan
10:22 AM Bug #5102 (Resolved): mon: assert(is_active()) on propose_pending()
Sage Weil
10:14 AM rbd Bug #5040 (In Progress): krbd: record that an parent info refresh has failed
I've implemented these fixes and will post them for review
after they've gone through a teuthology run this afternoon.
Alex Elder
07:58 AM rbd Bug #5040: krbd: record that an parent info refresh has failed
... Alex Elder
10:09 AM Bug #5069 (Need More Info): monitor crashed during mon thrash in nightlies
Sage thought there were logs on teuthology, on the directory from the initial report from Tamil. That run must have ... Joao Eduardo Luis
09:57 AM Bug #5069 (In Progress): monitor crashed during mon thrash in nightlies
der, we have a complete log from the teuthology failure. Sage Weil
09:55 AM Bug #5069 (Need More Info): monitor crashed during mon thrash in nightlies
i think the blind sync is still the way forward.. the more we make it aware of what is on top the harder it is to do ... Sage Weil
09:32 AM Bug #5069: monitor crashed during mon thrash in nightlies
This is definitely something wrong in the store: 'version' contains the last committed version on the store, while md... Joao Eduardo Luis
10:07 AM rbd Bug #5070: rbd map failed and stalled in "D"
Quick summary: I don't think this is an rbd problem.... Alex Elder
01:05 AM rbd Bug #5070: rbd map failed and stalled in "D"
I added one more RBD device and mapped it, and it mapped OK; after that I tried to map the previous one again and it is suc... Ivan Kudryavtsev
12:55 AM rbd Bug #5070: rbd map failed and stalled in "D"
I use vanilla 3.7.2 and built it with debian make-kpkg env.
root@tsk-vps-node-04:/sys# grep blk_trace_attr_group /...
Ivan Kudryavtsev
09:40 AM Bug #5140 (Resolved): ceph init script failed to determine correct hostname for remote osd
Sage Weil
09:35 AM devops Bug #4984: ceph_deploy: osd create succeeds with an error message (partprobe returns error)
Please confirm if this is still happening or not. Ian Colle
09:32 AM devops Bug #5150 (Resolved): How many memory need if we compile ceph?
we put ~4 gigs on our VMs for building... sometimes 6. Sage Weil
09:20 AM CephFS Bug #4832: mds: failed auth_unpin assert
ubuntu@teuthology:/a/teuthology-2013-05-23_01:00:08-rados-next-testing-basic/20276 Sage Weil
09:17 AM Bug #5154 (Resolved): osd/SnapMapper.cc: 270: FAILED assert(check(oid))
... Sage Weil
08:40 AM Bug #4357: osd: FAILED assert("join on thread that was never started" == 0)
a variation of this i just fixed in commit:c2e262fc9493b4bb22c2b7b4990aa1ee7846940e. but note the
2013-05-22 09:0...
Sage Weil
02:23 AM Bug #4357: osd: FAILED assert("join on thread that was never started" == 0)
No, since the upgrade to Cuttlefish I haven't seen the same one, BUT my cluster crashed yesterday with a different ba... Wido den Hollander
05:16 AM rbd Bug #4777 (Fix Under Review): krbd: verify a few things in the zeroing routines
The following patch has been posted for review. It's one of three
new patches available in the "review/wip-rbd" bra...
Alex Elder
05:16 AM rbd Bug #3859 (Fix Under Review): osd_client: define ceph_osdc_clear_request_linger()
The following patch has been posted for review. It's one of three
new patches available in the "review/wip-rbd" bra...
Alex Elder
05:15 AM rbd Bug #5146 (Fix Under Review): krbd: wait for safe callback for writes
The following patch has been posted for review. It's one of three
new patches available in the "review/wip-rbd" bra...
Alex Elder
12:38 AM rbd Bug #5099: io performance / ceph block device
dd if=/dev/zero of=/dev/rbd1 bs=1M count=1000 oflag=direct
but I tried to test the same command above on an nfs file sy...
Khanh Nguyen Dang Quoc
12:36 AM devops Bug #5066: Problems with ceph-deploy debs
Probably this is fixed by now (see http://article.gmane.org/gmane.comp.file-systems.ceph.user/1419) but I did not tes... Peter Wienemann

05/22/2013

11:35 PM rgw Feature #5153 (New): rgw: usage log trim is unbounded
Yehuda Sadeh
09:06 PM rgw Bug #5152 (Resolved): rgw: usage iteration by user doesn't skip to correct epoch
Instead of starting to iterate from the correct timestamp, we iterate from the beginning of time (if a user was spec... Yehuda Sadeh
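Note on #5152: a minimal sketch of the general idea of seeking to the requested epoch rather than scanning a user's usage records from the start. It assumes records are kept sorted by (user, epoch); the record layout and the function name usage_entries_from are hypothetical and are not taken from the rgw code.

```python
from bisect import bisect_left


def usage_entries_from(entries, user, start_epoch):
    """Return the records for `user` with epoch >= start_epoch.

    `entries` is assumed to be a list of (user, epoch, bytes_used) tuples
    sorted by (user, epoch). This layout is hypothetical.
    """
    # Binary-search for the first record at or after (user, start_epoch)
    # instead of walking every record for the user from the beginning.
    first = bisect_left(entries, (user, start_epoch))
    result = []
    for record in entries[first:]:
        if record[0] != user:
            break  # ran past this user's records
        result.append(record)
    return result
```

The seek costs O(log n) in the number of stored records, so the work done afterwards depends only on how many records fall in the requested range.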
06:54 PM Bug #5140: ceph init script failed to determine correct hostname for remote osd
A new commit has been submitted, thanks
https://github.com/ceph/ceph/pull/307
Xiaoxi Chen
05:35 PM rbd Bug #4777: krbd: verify a few things in the zeroing routines
I'm going to ignore the whole IRC issue for now.
I'm almost certain it's not needed, and in fact
I'm pretty sure it...
Alex Elder
05:27 PM devops Bug #4641: ceph-deploy install fails on fedora 18
[ubuntu@burnupi22 ceph-deploy]$ ./ceph-deploy install burnupi22
ceph-deploy: Platform is not supported: Fedora Spher...
Tamilarasi muthamizhan
05:21 PM devops Bug #5150 (Resolved): How many memory need if we compile ceph?
My machine hardware:
memory: 2G
Intel(R) Core(TM) i3-2120 CPU @ 3.30GHz
But when i compile ceph, it will cause O...
jianpeng ma
05:07 PM rbd Bug #4870 (Resolved): rbd: watch request error handling bugs
The first problem listed will be addressed by changes
done for http://tracker.ceph.com/issues/3859.
The second pr...
Alex Elder
04:56 PM devops Bug #4864 (Resolved): ceph-deploy: mon create command seems to output info about the first node only
Tamilarasi muthamizhan
04:55 PM devops Bug #4864 (Closed): ceph-deploy: mon create command seems to output info about the first node only
tested on centos machines and it works fine,
from ceph.log:
2013-05-22 16:54:58,117 ceph_deploy.mon DEBUG Deplo...
Tamilarasi muthamizhan
04:50 PM rbd Bug #3859 (In Progress): osd_client: define ceph_osdc_clear_request_linger()
Alex Elder
04:47 PM rbd Bug #3859: osd_client: define ceph_osdc_clear_request_linger()
Once again, rather than doing what I thought might work,
I've decided on a better fix.
Right now the osd client t...
Alex Elder
04:12 PM rbd Bug #3859: osd_client: define ceph_osdc_clear_request_linger()
As described initially, it's not really valid to
call ceph_osdc_unregister_linger_request() until
after the origina...
Alex Elder
04:06 PM rbd Bug #3859: osd_client: define ceph_osdc_clear_request_linger()
I have implemented a change that waits for a WATCH
request (as well as "normal" data write requests)
to get an indi...
Alex Elder
04:44 PM Feature #5141: Some clone errors aren't repaired
If I only deleted the head or clone data from one of the OSDs of a 3 replica pool, repair did work. So this is a VER... David Zafman
04:24 PM Bug #5139 (Resolved): Seg fault if listsnaps request with missing clones
To trigger this you have to delete all copies of a clone or the head. We aren't going to handle that gracefully, but... David Zafman
10:22 AM Bug #5139 (Fix Under Review): Seg fault if listsnaps request with missing clones
David Zafman
03:59 PM rbd Bug #5146: krbd: wait for safe callback for writes
I have this implemented and will post a patch for
review after I've tested. It was easier than
expected.
Note,...
Alex Elder
12:57 PM rbd Bug #5146 (Resolved): krbd: wait for safe callback for writes
Right now rbd only waits for the acknowledgement callback
for all osd requests. This means that an rbd client may
...
Alex Elder
03:47 PM Feature #5148 (New): repair should handle snapset/clone discrepancies
This should handle the following cases:
- head/snapdir are missing
- clone is missing
Samuel Just
03:20 PM Feature #5147 (Resolved): Display unique cluster ID in ceph status
Some customers will be running more than one Ceph cluster.
In addition to associate the output of ceph status to a s...
Neil Levine
02:35 PM Bug #5084: osd: slow peering after osd restart (bobtail)
Something that's not clear to me: does this need to be in all peers to have an effect? Or in other words, to fix this... Faidon Liambotis
02:20 PM Bug #5084: osd: slow peering after osd restart (bobtail)
This could be a result of writing out pg epochs on each map change. I have a branch which should greatly reduce the ... Samuel Just
02:12 PM CephFS Bug #5031 (Need More Info): mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
Sage Weil
02:12 PM CephFS Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
Walter: can you produce a log? 'debug mds = 20', 'debug ms = 1', restart the mds and wait for it to crash.
I have...
Sage Weil
02:10 PM CephFS Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
Argh.. i don't have a log after all.
Yan, dropping the assert avoids the crash, but it seems like the real issue i...
Sage Weil
01:21 PM Bug #4967: Misbehaving OSD sets over half of the cluster as down despite "osd min down reporters ...
pushed a patch to bobtail branch that logs on the only other osd down path in the mon. Sage Weil
11:35 AM Bug #4967: Misbehaving OSD sets over half of the cluster as down despite "osd min down reporters ...
No, I'm not. This is a standard Ubuntu/bobtail config. Faidon Liambotis
11:16 AM Bug #4967: Misbehaving OSD sets over half of the cluster as down despite "osd min down reporters ...
I don't suppose you're using syslog? That will lose log messages quite easily. Greg Farnum
10:37 AM Bug #4967 (Need More Info): Misbehaving OSD sets over half of the cluster as down despite "osd mi...
Sage Weil
06:27 AM Bug #4967: Misbehaving OSD sets over half of the cluster as down despite "osd min down reporters ...
There was no shut down of those OSDs at the time:... Faidon Liambotis
12:42 PM Bug #2891 (New): heap profiler hangs when trying to start it up on the mon
Precise is still the latest LTS release. :/
Maybe we don't want to invest the effort in fixing it (if we can, since ...
Greg Farnum
11:19 AM Bug #2891 (Closed): heap profiler hangs when trying to start it up on the mon
This no longer seems to be an issue on Quantal, and I haven't seen people complaining about it for a while.
So, I'm ...
Joao Eduardo Luis
12:20 PM rbd Bug #3858 (Resolved): osd_client: ceph_osdc_wait_request() seems wrong
tl;dr: This is no longer a bug, so marking it resolved.
First of all, based on how it's used, if an
error occurs...
Alex Elder
10:08 AM rbd Bug #3858: osd_client: ceph_osdc_wait_request() seems wrong
I honestly don't know when it happened, but I now find
that rbd is not susceptible to this problem. All rbd
reques...
Alex Elder
11:54 AM rbd Bug #5070: rbd map failed and stalled in "D"
This is somewhat old code, and there are a few bugs
that have since been fixed that could be contributing
to this. ...
Alex Elder
11:13 AM Bug #4856: monitor: upgrades produce "client did not provide supported auth type" in log
This comes from the AuthMonitor, and it *should* inhibit functionality to some extent, as the client should have rece... Joao Eduardo Luis
11:13 AM Bug #4645 (Resolved): osd: Adding osd causes long stall without restart
this should be fixed... Sage Weil
11:11 AM Bug #4357: osd: FAILED assert("join on thread that was never started" == 0)
Wido, do you still hit this problem? Sage Weil
11:11 AM Bug #3440 (Resolved): Running OSDs on ZFS on Linux
Sage Weil
11:10 AM Bug #3609 (Resolved): mon: track down the Monitor's memory consumption sources
Sage Weil
11:07 AM Bug #4816 (Can't reproduce): Monitor crashed with signal Aborted in MMonSubscribe::~MMonSubscribe()
Sage Weil
11:03 AM Bug #3552 (Resolved): After ceph-deploy installation a reboot breaks OSDs
Sage Weil
11:02 AM Bug #5068 (Won't Fix): ceph_test_rados gets SIGFPE when run with no args
Sage Weil
10:58 AM Bug #5082: OSD wrongly marked as down
still need a log to track down the osd marked down issue, if you have it Sage Weil
10:57 AM Bug #5100 (Can't reproduce): teuthology kclient (?): fails to unmount after tiobench
Sage Weil
10:54 AM Bug #5102 (Fix Under Review): mon: assert(is_active()) on propose_pending()
Sage Weil
09:17 AM Bug #5102: mon: assert(is_active()) on propose_pending()
new fix on wip-5102, comprised of simply ripping out the prepare_bootstrap() stuff. Joao Eduardo Luis
10:53 AM Bug #5118 (Rejected): osd version reporting incorrectly when client libs installed
Sage Weil
10:52 AM Bug #5114 (Rejected): os/FileStore.cc: 2225: FAILED assert(0 == "_close_replay_guard failed")
Sage Weil
10:47 AM Bug #5054 (Resolved): deep scrub reports 1 inconsistent object
Samuel Just
10:46 AM Bug #4910 (Duplicate): journal Unable to read past sequence 337 but header indicates the journal ...
Sage Weil
10:46 AM Bug #4910 (Resolved): journal Unable to read past sequence 337 but header indicates the journal h...
Samuel Just
10:45 AM Bug #4521 (Can't reproduce): mon: starting a new osd crashes all mon's
Sage Weil
10:44 AM Bug #4855 (Can't reproduce): peek map assert
Samuel Just
10:42 AM Bug #4602 (Can't reproduce): osd/ReplicatedPG.cc: 6487: FAILED assert(latest->is_update())
Sage Weil
10:42 AM Bug #4801: osd class path broken on fedora 18?
verify that ceph-osd --show-config | grep class shows the right class path that matches the rpm contents (/var/libsom... Sage Weil
10:41 AM Bug #4686 (Can't reproduce): corrupt or missing osdmap on load_pgs
Sage Weil
10:40 AM Bug #3829: new osd added to the cluster is not receiving data
Sage Weil
10:40 AM Bug #3829 (Can't reproduce): new osd added to the cluster is not receiving data
Samuel Just
10:39 AM Cleanup #4507 (Resolved): mon: drop atomic_t
merged into master; commit:2c58b790ff1dc7578325ae47c2ad0380c3310040 Joao Eduardo Luis
10:39 AM Bug #4937 (Can't reproduce): osd/ReplicatedPG.cc: 1379: FAILED assert(0)
This was caused by corruption of some kind. That corruption may have been a bug. Samuel Just
10:14 AM Bug #4937: osd/ReplicatedPG.cc: 1379: FAILED assert(0)
> Updated by Olivier Bonvalet 2 days ago
>
> I also have scrub errors with this message : "found clone without he...
David Zafman
10:36 AM Bug #5020 (Resolved): osd: 2.5 deep-scrub stat mismatch, got 710/665 objects, 0/0 clones, 2908160...
Samuel Just
10:34 AM Bug #4228 (Resolved): mon uses pick_addresses if invoked with mkfs or without mon addr; fails if ...
Sage Weil
10:24 AM Bug #4813 (Pending Backport): pgs stuck creating
Samuel Just
09:43 AM Bug #5145 (Resolved): make check fails on "ceph osd lost"
just fixed it Sage Weil
09:42 AM Bug #5145: make check fails on "ceph osd lost"
It is introduced by a "recent change in the usage":https://github.com/ceph/ceph/commit/132d5bf7f9af7de9e2028e20c95ba9... Loïc Dachary
09:39 AM Bug #5145 (Resolved): make check fails on "ceph osd lost"
... Loïc Dachary
09:08 AM Bug #4895: leveldb: mon workload makes store.db grow without bound
wip-4895-cuttlefish has a backport of the proposed fix. anyone experiencing growth, please test! Sage Weil
05:26 AM Bug #4895: leveldb: mon workload makes store.db grow without bound
Just a quick update for those following the bug and not on IRC:
joao found out that when there is an election whil...
Sylvain Munaut
07:04 AM rgw Tasks #5144 (New): rgw: incorporate greg's comment to the log objclass
Yehuda Sadeh
06:59 AM Feature #5143 (New): objclass: maintain global namespaces
Currently index data of different objclasses may overlap so it's a real problem using multiple classes on a single ob... Yehuda Sadeh
 
