Activity

From 06/04/2012 to 07/03/2012

07/03/2012

10:36 PM Bug #2707 (Can't reproduce): mkcephfs failing on v0.48 "argonaut"
Firstly, well done guys on achieving this version milestone. I successfully upgraded to the 0.48 format uneventfully ... Paul Pettigrew
04:45 PM RADOS Feature #2706 (Resolved): crush: update kernel code to decode tunables
Sage Weil
04:44 PM RADOS Feature #2705 (Resolved): crush: graceful transition to new default tunables
Sage Weil
04:44 PM RADOS Bug #2214 (Resolved): crush: pgs only mapped to 2 devices with replication level 3
Sage Weil
04:44 PM RADOS Bug #2047 (Resolved): crush: with a rack->host->device hierarchy, several down devices are likely...
Sage Weil
04:43 PM RADOS Bug #187 (Rejected): crush: high variance, latency for straw buckets
Sage Weil
04:43 PM RADOS Feature #2422 (Resolved): crush: test that mapping result is uncorrelated
Sage Weil
04:39 PM rgw Bug #2106: failed s3tests.functional.test_s3.test_100_continue
recent logs from the nightly run: /a/teuthology-2012-07-03_00:00:09-regression-next-testing-basic/5054
Tamilarasi muthamizhan
04:34 PM CephFS Bug #1947: mds: SIGBUS during _mark_dirty
Tamilarasi muthamizhan wrote:
> latest logs:
> /a/teuthology-2012-07-03_00:00:09-regression-next-testing-basic/5019...
Tamilarasi muthamizhan
04:33 PM CephFS Bug #1947: mds: SIGBUS during _mark_dirty
latest logs:
/a/teuthology-2012-07-03_00:00:09-regression-next-testing-basic/5019
config.yaml:
++++++++++++
k...
Tamilarasi muthamizhan
03:53 PM devops Feature #2704 (Closed): sepia: Use ``names`` as resolver on plana, burnupi, vercoi
Anonymous
03:45 PM Feature #2702 (Resolved): gitbuilder: sync each build as it completes
Sage Weil
03:27 PM devops Feature #2549: ceph-disk-prepare: take fstype, mkfs and mount options from ceph.conf
As of commit ad97415ef72b55934adfa5024fd9af8fd1f0f82d, this now needs mount options too. Anonymous
03:26 PM devops Feature #2547 (Resolved): ceph-disk-prepare: handle partitioning and mkfs
commit ad97415ef72b55934adfa5024fd9af8fd1f0f82d
Author: Tommi Virtanen <tv@inktank.com>
Date: 2012-07-03 15:24:26...
Anonymous
02:24 PM rbd Bug #2457 (Resolved): libvirt: migration fails with rbd in 0.9.11 and 0.9.12
Fixed by upstream libvirt commit 78290b1641e95304c862062ee0aca95395c5926c. Josh Durgin
02:08 PM rbd Bug #2457: libvirt: migration fails with rbd in 0.9.11 and 0.9.12
Fixed in 0.9.12-3 (Debian naming) and later. Recent reports on the list say the same, so the issue may be closed safely. Andrey Korolyov
02:17 PM rgw Bug #2701 (Resolved): rgw: don't keep bucket info indexed by bucket_id
Yehuda Sadeh
02:15 PM rbd Bug #2700 (Resolved): blkdeviotune method at libvirt doesn't work on RBD volumes
Since qemu implemented its own i/o limiting mechanism rather than cgroups, all block backends may be controlled over ... Andrey Korolyov
12:17 PM Messengers Bug #2569: msgr: connect_rank crash
i've merged fix for this into master, commit:204bc594be1a6046d1b362693d086b49294c2a27 (with possible side-effects fro... Sage Weil
12:16 PM Bug #2682 (Resolved): config lockdep error (recursive lock?) in LibRadosAio.SimpleWritePP
Sage Weil
10:48 AM devops Feature #2699 (Rejected): crowbar: change barclamp-glance to use rbd
Anonymous
10:38 AM devops Feature #2698: crowbar: Guide for using "front" network
We need an easy way to drop a "dhclient eth1" upstart job into a crowbar server installation. Just a sudo tee /etc/in... Anonymous
10:28 AM devops Feature #2698 (Closed): crowbar: Guide for using "front" network
Anonymous
10:26 AM devops Feature #2697 (In Progress): crowbar: ISO generation, reproducible in a cloud image vm
Anonymous
10:16 AM devops Feature #2697 (Resolved): crowbar: ISO generation, reproducible in a cloud image vm
Anonymous
10:12 AM devops Feature #2696 (Rejected): chef: Automated QA
Use downburst vms on vercoi to automatically bring up ceph clusters, do basic RADOS/RBD functionality testing, tear d... Anonymous
10:11 AM devops Feature #2695 (Closed): crowbar: Automated QA
Use downburst vms on vercoi to automatically bring up ceph clusters, do basic RADOS/RBD functionality testing and Ope... Anonymous
10:10 AM rgw Bug #2642 (Resolved): rgw: show/trim usage using also time (not just date)
Done, commit:80a939a99db64f7802a4a3c1320316c91720f5d9 Yehuda Sadeh
10:08 AM rgw Bug #2658 (Resolved): rgw-admin: usage show fails when specifying hour > 12
Fixed, commit:c5d19b6df0bcb238e5e68732b4d252b06f2d9e56. Yehuda Sadeh
10:05 AM devops Feature #2584 (Resolved): sepia: provide networking, DHCP for dynamic virtual machines
Anonymous
10:05 AM devops Feature #2584: sepia: provide networking, DHCP for dynamic virtual machines
Split the DNS part to #2694, this is already providing value to users. Anonymous
09:59 AM devops Feature #2584: sepia: provide networking, DHCP for dynamic virtual machines
Status update: missing DNS updates, all the strictly required components are there; vms attached to the front network... Anonymous
10:04 AM devops Feature #2553: crowbar: open question: What's the correct way to add RBD support to the Nova barc...
(Wrong ticket, ignore) Anonymous
10:04 AM devops Feature #2694 (Closed): sepia: provide DNS for dynamic vms
Anonymous
09:24 AM devops Feature #2546 (Resolved): ceph-disk-prepare: take fsid from ceph.conf (support --cluster=name)
commit 4e774fbcb38fd6883232b72352512a5f8e4a66e8
Author: Tommi Virtanen <tv@inktank.com>
Date: 2012-07-03 09:22:28...
Anonymous
08:04 AM Bug #2693 (Resolved): osd/ReplicatedPG.cc: 4293: FAILED assert(info.last_update <= active_rep_scr...
... Sage Weil

07/02/2012

09:25 PM Feature #2692 (Resolved): stable testing debian repos
Sage Weil
06:49 PM rbd Bug #2689 (In Progress): qemu iozone test hangs
Josh Durgin
02:51 PM rbd Bug #2689 (Resolved): qemu iozone test hangs
... Sage Weil
05:07 PM Bug #2691: osd/ReplicatedPG.cc: 5888: FAILED assert(latest->is_update())
took down osd.2 and osd.3 with the same crash. coredumps are on the hosts. Sage Weil
05:06 PM Bug #2691 (Won't Fix): osd/ReplicatedPG.cc: 5888: FAILED assert(latest->is_update())
... Sage Weil
04:40 PM Bug #2690 (Won't Fix): mon: persist quorum features
currently the non-leaders do not know the quorum features, and encode everything with a minimal (0) feature set.
...
Sage Weil
02:26 PM Linux kernel client Bug #2688 (Duplicate): lockup on ffsb + thrashing
... Sage Weil
12:54 PM Bug #2687: FileStore crashes when "osd_journal_size" is larger than the filesystem
for files, i think the right approach is to fallocate(), which will reserve the space. we shouldn't have to look at ... Sage Weil
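The fallocate() approach mentioned above can be sketched as follows. This is a minimal illustration, not the FileStore code: the function name `reserve_journal` and its return convention are assumptions made for this example, and `os.posix_fallocate` is the Python wrapper available on Linux and most other Unix systems.

```python
import errno
import os


def reserve_journal(path, size_bytes):
    """Try to reserve size_bytes of space for the file at path.

    Hypothetical helper for illustration only. Returns True when the
    space was reserved, and False when the filesystem reports that it
    cannot hold a file of that size.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Ask the filesystem to allocate the blocks up front, so that a
        # too-large size fails here instead of during a later write.
        os.posix_fallocate(fd, 0, size_bytes)
        return True
    except OSError as e:
        if e.errno in (errno.ENOSPC, errno.EFBIG):
            return False
        raise
    finally:
        os.close(fd)
```

A caller could then refuse to start, with a clear error message, when `reserve_journal` returns False.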
12:47 PM Bug #2687 (Resolved): FileStore crashes when "osd_journal_size" is larger than the filesystem
See: http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/7282
If a user (on tmpfs, in this case) specifies...
Greg Farnum
12:49 PM Bug #2476: osd: watch timeout depends on operations to an object
fix qa/workunits/rbd/copy.sh when this is fixed !!! Sage Weil
12:36 PM rbd Feature #2556: rbd tool: break image locks
The current progress is in wip-rbd-locking. Still needs tests and docs, plus a small cleanup as noted on github. Josh Durgin
12:32 PM rbd Feature #2686 (Resolved): rbd: let users specify a usage for shared locks
If existing lockers have the same usage, the lock succeeds. Otherwise, it fails. This could let you use locks with e.... Josh Durgin
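The rule described above (a shared lock succeeds only if the existing lockers have the same usage) can be modeled in a few lines. This is a toy in-memory sketch, not the rbd implementation; the class `SharedLock` and its method names are hypothetical.

```python
class SharedLock:
    """Toy model of a shared lock tagged with a usage string."""

    def __init__(self):
        self.usage = None      # usage shared by all current lockers
        self.lockers = set()   # identifiers of current lock holders

    def acquire(self, locker, usage):
        # With existing lockers, the usage must match theirs.
        if self.lockers and usage != self.usage:
            return False
        self.usage = usage
        self.lockers.add(locker)
        return True

    def release(self, locker):
        self.lockers.discard(locker)
        # Once the last holder leaves, any usage may take the lock.
        if not self.lockers:
            self.usage = None
```

For example, two clients locking with usage "backup" both succeed, while a third client asking for a different usage is refused until the first two release.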
11:28 AM rbd Feature #2685 (Rejected): Support QEMU migration with caching enabled
This is a libvirt problem, it's not related to qemu at all. I already looked into and tested whether qemu was doing f... Josh Durgin
11:21 AM rbd Feature #2685 (Rejected): Support QEMU migration with caching enabled
See http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/7524
Apparently newer versions of QEMU refuse to...
Greg Farnum
09:44 AM Documentation #2684 (Won't Fix): doc: ceph and all daemons take --show-config
Quoting Sage:
For future reference, you can get a dump of all these values with
ceph-osd -i 123 --show-...
Anonymous
09:30 AM Bug #2593: logmonitor: decode failure
Do we know if the log in question actually existed on disk or not? Greg Farnum
07:28 AM Bug #2593: logmonitor: decode failure
saw this again on next:... Sage Weil
07:37 AM Bug #2683: ceph-fuse: crash during fsstress
... Sage Weil
07:31 AM Bug #2022 (Need More Info): osd: misdirectect request
apparently there is a different cause for this:... Sage Weil
05:57 AM Subtask #2621 (In Progress): mon: Single-Paxos: synchronize the MonitorDBStore of oblivious monitor
Joao Eduardo Luis

07/01/2012

09:46 PM Feature #2651: mon: race calling tick() when doing slurping
making this a cleanup so that it stops confusing me :) Sage Weil
08:57 PM Bug #2683 (Can't reproduce): ceph-fuse: crash during fsstress
... Sage Weil
07:48 PM Bug #2682 (Resolved): config lockdep error (recursive lock?) in LibRadosAio.SimpleWritePP
... Sage Weil
03:06 PM CephFS Bug #2681: client: got push without mds session
this was with 'ms inject socket failure = 200' Sage Weil
03:06 PM CephFS Bug #2681 (Resolved): client: got push without mds session
... Sage Weil
02:41 PM Bug #2599 (Can't reproduce): osd: crash in ReplicatedPG::C_OSD_OndiskWriteUnlock::finish
chalking this up to the bugs in next a couple weeks back Sage Weil
09:22 AM Feature #2680 (Resolved): osd: report backfill progress via query
... Sage Weil
07:09 AM CephFS Bug #2679 (Can't reproduce): POSIX file lock not released on process termination
I obtained a POSIX file lock with the following code:
> --- snip ---
>
> ...
> std::string x = "/tmp/ceph_mount...
Daniel Godas-Lopez

06/30/2012

10:52 PM rbd Documentation #2670: Docs shouldn't direct users to echo to /sys/bus/rbd for normal use
Sage Weil
10:51 PM rbd Feature #2279 (Resolved): rbd: trivial layering design doc
Sage Weil
11:34 AM Bug #2675: osd: segfault during log trim
and ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2012-06-28_19:00:12-regression-master-testing-gcov/3450 Tamilarasi muthamizhan

06/29/2012

09:44 PM Bug #2675: osd: segfault during log trim
and ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2012-06-28_19:00:12-regression-master-testing-gcov/3441... Tamilarasi muthamizhan
03:39 PM Bug #2675: osd: segfault during log trim
and ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2012-06-28_19:00:12-regression-master-testing-gcov/3435 Sage Weil
03:37 PM Bug #2675: osd: segfault during log trim
and ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2012-06-28_19:00:12-regression-master-testing-gcov/3437 Sage Weil
03:33 PM Bug #2675: osd: segfault during log trim
also:... Sage Weil
03:30 PM Bug #2675 (Resolved): osd: segfault during log trim
... Sage Weil
09:02 PM Feature #2471 (Resolved): osd: add prefix match to OSDCaps
Sage Weil
09:00 PM Feature #2678 (Rejected): osd, objecter: redirect misdirected requests
Generic mechanism to refer the client to the correct osd when they misdirect their requests. This will allow the clu... Sage Weil
08:59 PM Bug #2676 (Resolved): mon: cannot create pool with old renamed name
commit:5a9355091296121823156de7d3160de45328a0cc Sage Weil
04:46 PM Bug #2676 (Resolved): mon: cannot create pool with old renamed name
renaming a pool name, and then trying to create a new pool with the old name fails. Yehuda Sadeh
07:27 PM rbd Bug #2677 (Resolved): librbd: create does not clean up well
A create that fails part way through does not remove objects it created or undo modifications it does, for example ad... Josh Durgin
07:23 PM rbd Feature #2279 (Fix Under Review): rbd: trivial layering design doc
See wip-rbd-layering-doc Josh Durgin
03:26 PM Messengers Bug #2569: msgr: connect_rank crash
fix for this is in wip-msgr, still testing Sage Weil
02:16 PM RADOS Feature #2541 (Resolved): crush: move command to adjust non-leaf node position
Sage Weil
12:54 PM Feature #2575 (Resolved): perf: 0.48 numbers
Mark Nelson
12:53 PM Feature #2582 (Resolved): set up chart.io + mysql (or equivalent) infrastructure for tracking perf
Mark Nelson
12:51 PM Feature #2577 (Resolved): teuthology: blktrace task
Mark Nelson
12:29 PM Subtask #2674: mon: Single-Paxos: mon commits suicide after remove&add
Tried this on master. Although at first I triggered something else, the bottom line is that this works, and the monit... Joao Eduardo Luis
12:14 PM Subtask #2674 (Rejected): mon: Single-Paxos: mon commits suicide after remove&add
Yep. Makes sense. I was afraid this was caused by my changes.
Rejecting it then.
Joao Eduardo Luis
11:30 AM Subtask #2674: mon: Single-Paxos: mon commits suicide after remove&add
Yeah.. basically we're changing the mon's ip by removing and re-adding it, and the mon isn't smart enough to realize ... Sage Weil
11:12 AM Subtask #2674: mon: Single-Paxos: mon commits suicide after remove&add
I believe this is intended behavior, note the last line:... Greg Farnum
03:07 AM Subtask #2674 (Rejected): mon: Single-Paxos: mon commits suicide after remove&add
Pre-conditions:
3 mons: a=127.0.0.1:6789 ; b=127.0.0.1:6790 ; c=127.0.0.1:6791
* remove 'c' with ./ceph mon rem...
Joao Eduardo Luis
11:09 AM Bug #2646: mon:update_from_paxos: error parsing incremental update: buffer::end_of_buffer
commit:840ae244499496d543d634713bdee7c7884ce527
The tick happened at the same time as slurping, which meant the di...
Greg Farnum
10:54 AM Bug #2646 (Resolved): mon:update_from_paxos: error parsing incremental update: buffer::end_of_buffer
Sage Weil
10:53 AM Bug #2264 (Can't reproduce): mon: failed assert in bump_epoch
Sage Weil
06:19 AM Bug #2618: error: unable to open OSD superblock
Thanks, but that didn't help.
I did notice that drives get mounted a little weird.
Don't know if that's a problem...
John S

06/28/2012

10:06 PM Bug #2664 (Resolved): osd: extra attr _path, extra attr snapset from scrub
Sage Weil
11:29 AM Bug #2673 (Resolved): ReplicatedPG::prepare_transaction: don't crash on empty ops
Samuel Just
11:26 AM Cleanup #2672 (Rejected): PG::find_best_info cleanup
see 253033cd720db86e7c8372fd4184de7d4c43bce2 Samuel Just
11:26 AM Cleanup #2671 (Resolved): buffer.h: do efficient buffer comparisons
Samuel Just
10:15 AM rbd Documentation #2670 (Resolved): Docs shouldn't direct users to echo to /sys/bus/rbd for normal use
A naive user looking for "rbd map" will instead find this:
http://ceph.com/docs/master/rbd/rados-rbd-cmds/
with...
Anonymous
10:04 AM Linux kernel client Bug #2260: libceph: null pointer dereference at try_write+0x638+0xfb0
Lots of work on the messenger client, but still not completely
clear this particular bug is fixed. There are a few ...
Alex Elder
09:42 AM Linux kernel client Bug #147: lockdep: possible irq lock inversion dependency w/ osdc->request_mutex and con->mutex
I suppose this really ought to get fixed at some point.
For now, it looks like Sage has implemented a workaround
th...
Alex Elder
09:41 AM rbd Bug #1070: krbd: ^C doesn't work
No progress on this. None expected unless it gets
reprioritized and planned.
Alex Elder
09:40 AM Linux kernel client Feature #1699: debug symbols in autobuilt (sepia) kernels
No progress on this. I have a vague memory that someone
else might have looked at this problem a while back (Dan?)....
Alex Elder
09:39 AM Feature #2127: Save kernel core dumps on all of our test machines
My work on this was pretty much complete a few months ago.
It included a shell script that leverages Ubuntu kdump
...
Alex Elder
09:32 AM Linux kernel client Bug #2261 (Can't reproduce): paging error in libceph after crashed osd comes back online
the osd_client refcounting bug fix may explain this one, too... commit:0d47766f14211a73eaf54cab234db134ece79f49
an...
Sage Weil
09:16 AM Linux kernel client Bug #2261: paging error in libceph after crashed osd comes back online
No progress on this.
There has been a lot of work on the messenger code since this bug was
reported. One change ...
Alex Elder
09:31 AM Linux kernel client Cleanup #2130: ceph: xattr: complete cleanups following review
No progress on this, but I still have the patches. I'll
try to sneak them in as I'm working on RBD. I believe
the...
Alex Elder
09:29 AM Linux kernel client Cleanup #2131: ceph: xattr: use the generic kernel xattr code
No progress on this. It should be put on our roadmap as a task
to complete, maybe within the next 6 months.
Alex Elder
09:12 AM Bug #2267 (Closed): Ceph client crashed after shutting down one mds and osd
A recent fix supplied by Zheng Yan of Intel seems to have fixed
this problem, so I'm closing this bug.
rbd: C...
Alex Elder
09:05 AM rbd Feature #2326 (In Progress): krbd: use new class interfaces, new image format
I've finally begun work on this, following some in-person discussion
with Josh, Dan, and Sage this week.
I will u...
Alex Elder
09:00 AM Linux kernel client Feature #2374: ceph-client: start laying the groundwork for Linux tracepoints
No progress on this yet.
However, I got this e-mail from Jim Schutt shortly after creating
this bug, and just wan...
Alex Elder
08:44 AM Bug #2386: xfstests: failed #34
I've been trying to find out whether this is still a problem or
if it was transient. But teuthology has had a strin...
Alex Elder
07:41 AM Linux kernel client Bug #2424 (Resolved): ceph-client: messenger: badness in prepare_write_connect()
This bug was fixed in May, by a small series of changes that
culminated in this one:
commit 3da54776e2c0385c3...
Alex Elder
07:37 AM Linux kernel client Cleanup #2432: ceph-client: messenger: refactor to simplify state model
I had worked out on paper some notes about a longer-term state/event
model that could be used for the client messeng...
Alex Elder
07:33 AM Linux kernel client Cleanup #2432: ceph-client: messenger: refactor to simplify state model
I worked on doing this for a good month but the job really isn't
complete. Nevertheless I think there was some prog...
Alex Elder
07:23 AM Linux kernel client Cleanup #2438: ceph-client: use BUG_ON() for null auth_client->ops pointers
Touching all my bugs today. This one's a good idea but
very low priority.
Alex Elder
07:20 AM rbd Bug #2608: rbd: hung xfstest 270
Just to summarize what I just added...
There are some recent XFS problems that might explain this,
irrespective o...
Alex Elder
07:16 AM rbd Bug #2608: rbd: hung xfstest 270
I looked at this on Tuesday, and sent a note to Sage that should
have instead been put here. Here it is.
I w...
Alex Elder
04:54 AM Feature #2668 (Resolved): Build linux-tools-common package for perf
It'd be really nice if we built linux-tools-common with our gitbuilder kernels so we can install perf on our test box... Mark Nelson

06/27/2012

06:10 PM Bug #2618: error: unable to open OSD superblock
I noticed an issue in your ceph.conf - you have keyring = /etc/ceph/keyring.admin in the global section, and the osd ... Josh Durgin
05:19 PM rbd Bug #2667 (Won't Fix): librbd: create_snap on a closed image segfaults
I wrote silly code, and in reordering it, managed to attempt rbd_snap_create() on an
image that I had rbd_close()d. ...
Dan Mick
05:13 PM Feature #2651: mon: race calling tick() when doing slurping
oops, stronger fix, yes! Sage Weil
05:13 PM Feature #2651 (Resolved): mon: race calling tick() when doing slurping
Sage Weil
05:01 PM Feature #2661 (Resolved): mon: do not allow monitors to be added to the map with port 0
Merged into dho and next. Thanks Joao! Greg Farnum
11:25 AM Feature #2661 (Resolved): mon: do not allow monitors to be added to the map with port 0
Last week, somebody used the "ceph mon add" command without specifying a port, and it defaulted to port 0. This cause... Greg Farnum
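One way to reject such addresses is to validate the port before the address reaches the map. The sketch below is an assumption about how such a check might look, not the monitor's actual code; `parse_mon_addr` is a hypothetical helper and handles only `host:port` strings (no IPv6 bracket syntax).

```python
def parse_mon_addr(addr):
    """Split 'host:port' and reject a missing or zero port.

    Hypothetical helper for illustration only.
    """
    host, sep, port_text = addr.rpartition(":")
    if not sep or not host:
        raise ValueError("monitor address must include a port: %r" % addr)
    port = int(port_text)  # raises ValueError for non-numeric text
    if not 0 < port < 65536:
        raise ValueError("invalid monitor port %d in %r" % (port, addr))
    return host, port
```

Whether the real fix rejects the command or substitutes a default port is not stated in this entry; the sketch simply rejects.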
04:48 PM Feature #2666 (Resolved): rados tool: copy pool
A new operation to copy the entire content of a pool into a different pool. For each object we'd copy the locator, da... Yehuda Sadeh
04:04 PM rgw Bug #2665 (Resolved): rest-bench hangs periodically
rest-bench seems to hang periodically with the following spit out to the console on a regular basis:
plana83: 2012-06...
Mark Nelson
04:04 PM Bug #2656 (Rejected): rados-bench hangs periodically
Mark Nelson
04:03 PM Bug #2656: rados-bench hangs periodically
gah,
this is what I get for submitting bugs at the end of the day. You are correct, rest-bench.
Mark Nelson
03:29 PM devops Feature #2587 (Resolved): sepia: isolated networking on vercoi (manual, a handful)
Anonymous
03:28 PM devops Feature #2587: sepia: isolated networking on vercoi (manual, a handful)
Confirmed: isolated0..isolated9 work even if Crowbar wants to put VLANs in them. They pass between vercoi as packets ... Anonymous
02:17 PM devops Feature #2662: crowbar: Make barclamp-ceph set mon initial members, monitor-secret, fsid
More on where that snippet should live:
- for standalone chef deployment, we want the admin to run something similar,...
Anonymous
02:14 PM devops Feature #2662: crowbar: Make barclamp-ceph set mon initial members, monitor-secret, fsid
This python snippet creates ceph keys in the right format (for now). Where it should live is still an open question.
...
Anonymous
01:38 PM devops Feature #2662 (Resolved): crowbar: Make barclamp-ceph set mon initial members, monitor-secret, fsid
Without this, multi-mon bring-up is racy.
At proposal save time, the barclamp should inspect the roles, and assign...
Anonymous
02:12 PM Bug #2664: osd: extra attr _path, extra attr snapset from scrub
full logs at metropolis:~sage/bug-2664 Sage Weil
02:11 PM Bug #2664 (Resolved): osd: extra attr _path, extra attr snapset from scrub
... Sage Weil
01:43 PM devops Feature #2663 (Closed): crowbar: UI for setting generic ceph.conf values
This needs to be some sort of an extensible list of key: value pairs.
Do we need to support sections too? Probably...
Anonymous
01:17 PM devops Feature #2589 (Resolved): crowbar: Update barclamp-ceph for Essex, new ceph-cookbooks
Tyler reported success as of b2c5d3307eef0ca44fd4b001136e9af043b322bd. Anonymous
01:16 PM devops Feature #2588: downburst: multiple, configurable networks to libvirt
For historical value: https://github.com/ceph/downburst/commit/de494eeefad0f0c72916d5dab8ba015b441a94f0 Anonymous
11:30 AM devops Feature #2588 (Resolved): downburst: multiple, configurable networks to libvirt
Anonymous
11:26 AM Linux kernel client Bug #2590: possible irq lock inversion dependency with con->mutex and osdc->request_mutex
Recent log location: /a/teuthology-2012-06-27_00:00:07-regression-next-testing-basic/3076
2012-06-27T01:25:05.11...
Tamilarasi muthamizhan
10:17 AM rbd Feature #2660 (New): qa: test resizing an rbd image while a vm has it open
Make sure the resize is visible to the guest. This works with the virtio driver after doing e.g. 'echo 1 | sudo tee /... Josh Durgin
10:02 AM Subtask #2659 (Can't reproduce): mon: Single-Paxos: ceph tool -w subscriptions not being updated
how to reproduce:... Joao Eduardo Luis

06/26/2012

05:16 PM rgw Bug #2658 (Resolved): rgw-admin: usage show fails when specifying hour > 12
using the wrong modifier for parsing it. Yehuda Sadeh
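This class of bug is easy to reproduce with strptime-style parsing. The entry does not say which modifier was involved; the sketch below assumes the common mix-up between `%I` (12-hour clock) and `%H` (24-hour clock), and `parse_usage_time` is a hypothetical helper, not radosgw-admin code.

```python
from datetime import datetime


def parse_usage_time(text, hour_directive="%H"):
    """Parse 'YYYY-MM-DD hh:mm:ss' using the given hour directive."""
    fmt = "%Y-%m-%d " + hour_directive + ":%M:%S"
    return datetime.strptime(text, fmt)


# With %H, an afternoon hour parses normally.
afternoon = parse_usage_time("2012-06-26 17:16:00")

# With %I, the same input is rejected because %I only accepts 01-12.
try:
    parse_usage_time("2012-06-26 17:16:00", hour_directive="%I")
    rejected = False
except ValueError:
    rejected = True
```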
05:11 PM Bug #2453: osd/OSD.h: 840: FAILED assert(last_scrub_pg.count(p))
possibly fixed by commit:0d8970fc813b33e7c6ba2484fbc43cce947d3f4d Sage Weil
04:31 PM CephFS Bug #2657 (Resolved): kclient: direct io write larger than 8MiB fails
Writes larger than 8MiB get EFAULT, e.g.:... Josh Durgin
02:13 PM Bug #2656: rados-bench hangs periodically
rados-bench or rest-bench? Yehuda Sadeh
01:27 PM Bug #2656 (Rejected): rados-bench hangs periodically
rados-bench seems to hang periodically with the following spit out to the console on a regular basis:
plana83: 2012-0...
Mark Nelson
01:45 PM Bug #2563 (Can't reproduce): leveldb corruption
It looks like one of the leveldb store files was corrupted, possibly by the filesystem. It may be possible to recove... Samuel Just
09:36 AM Bug #2655 (Resolved): scrub slows writes more than it should
Samuel Just
09:34 AM Subtask #2616 (Closed): mon: Single-Paxos: AuthMonitor: key_server has no entries
Joao Eduardo Luis
09:34 AM Subtask #2616 (Resolved): mon: Single-Paxos: AuthMonitor: key_server has no entries
Joao Eduardo Luis
09:33 AM Subtask #2620 (Closed): mon: Single-Paxos: MDSMonitor: MMDSBeacon from entity with insufficient p...
Note: turns out this was the same bug as #2643
Had to do with the AuthMonitor losing some information when reading versi...
Joao Eduardo Luis
09:32 AM Subtask #2643 (Closed): mon: Single-Paxos: mds: Strange message behavior on peon
Had to do with the AuthMonitor losing some information when reading versions from the store.
This is fixed.
Joao Eduardo Luis
09:01 AM Linux kernel client Bug #2523: xfs: xfs_iolock_reclaimable
... Sage Weil
06:15 AM rbd Bug #2654 (Won't Fix): Stale rbd volume cannot be unmaped
/dev/rbd0 exists in system but /dev/rbd/winnie-test/postgresql not... Maciej Galkiewicz

06/25/2012

10:01 PM rbd Bug #2608: rbd: hung xfstest 270
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2012-06-23_00:00:02-regression-next-testing-basic/1471
m...
Sage Weil
09:56 PM Bug #2536 (Need More Info): librados crashed while getting stat of an object
Sage Weil
09:56 PM Bug #2536: librados crashed while getting stat of an object
Have you seen this problem since then? It looks like it could be due to racing with rados startup or shutdown... Sage Weil
09:41 PM Bug #2346 (Resolved): xfs filesystem on top of rbd volume corrupts
No news is good news! Sage Weil
09:40 PM Bug #2602 (Resolved): osd: push failed because local copy is X
Sage Weil
05:09 PM Messengers Bug #2569: msgr: connect_rank crash
All three mon nodes and a client node on the second aging cluster died over the weekend (kernel and all). Looks like ... Mark Nelson
10:25 AM Messengers Bug #2569: msgr: connect_rank crash
Saw the following while debugging my aging test scripts. Seems to have happened when the mon was started. No core d... Mark Nelson
03:33 PM Bug #2649: osd: log bound mismatch
... Sage Weil
03:31 PM CephFS Bug #1947: mds: SIGBUS during _mark_dirty
moved test to marginal suite; move back to regression when this is resolved! Sage Weil
03:31 PM CephFS Bug #1947: mds: SIGBUS during _mark_dirty
ubuntu@teuthology:/a/teuthology-2012-06-24_00:00:07-regression-next-testing-basic$ ... Sage Weil
03:28 PM Bug #2593: logmonitor: decode failure
I wonder if this is also due to tick() colliding with slurping — the first one definitely could be (not sure about th... Greg Farnum
03:27 PM Bug #2653 (Resolved): Web docs point to obsolete "fusermount" page
The page http://ceph.com/docs/master/man/8/mount.ceph/ has a link at the bottom that points to "fusermount" descript... Ken Franklin
03:21 PM Bug #2618: error: unable to open OSD superblock
attaching my ceph.conf.
Can't get to IRC from work - I'll try in the evenings.
thanks
John S
02:54 PM rgw Bug #2652 (Resolved): Segmentation fault in rest-bench
This happened while running rest-bench during aging tests on the burnupi cluster.
--
plana83: *** Caught signal...
Mark Nelson
02:48 PM Bug #2022 (Resolved): osd: misdirectect request
Sage Weil
02:40 PM Feature #2651 (Rejected): mon: race calling tick() when doing slurping
Right now the monitor calls tick() on all the PaxosService implementations when it's doing slurping. This introduces ... Greg Farnum
09:19 AM rgw Bug #2650 (Resolved): rgw: swift key creation overrides subuser access mask
# radosgw-admin subuser create --uid=johndoe --subuser=johndoe:swift
--access=full
{ "user_id": "johndoe",
"rados...
Yehuda Sadeh

06/23/2012

04:56 PM Bug #2649 (Resolved): osd: log bound mismatch
... Sage Weil

06/22/2012

07:14 PM Bug #2648 (Resolved): removing a monitor from the map while it's running causes a crash
... Greg Farnum
05:27 PM Bug #2647 (Can't reproduce): osd: old request, waiting for subops
primary:... Yehuda Sadeh
11:43 AM Bug #2618: error: unable to open OSD superblock
John, can we see your ceph.conf file? If you have time, try chatting in #ceph on irc.oftc.net as well; perhaps we ca... Dan Mick
11:30 AM Bug #2646 (Resolved): mon:update_from_paxos: error parsing incremental update: buffer::end_of_buffer
... Yehuda Sadeh
08:17 AM Subtask #2645 (Rejected): mon: Single-Paxos: Could not decrypt ticket info (immediately after run...
There was a lingering monitor still running, from a previous install.
Apparently, holding the wrong keys will lead...
Joao Eduardo Luis
08:09 AM Subtask #2645 (Rejected): mon: Single-Paxos: Could not decrypt ticket info (immediately after run...
... Joao Eduardo Luis
12:24 AM Bug #2602: osd: push failed because local copy is X
Hi Sage,
just updated to your wip_rolling_upgrade branch.
FileStore update worked ( 100GB => 30 minutes on XFS ) ...
Simon Frerichs

06/21/2012

06:55 PM rbd Feature #2566 (Duplicate): teuthology: task to run rbd workunits in a vm
Same as #1713. Josh Durgin
06:53 PM rbd Feature #1713 (Resolved): teuthology: qemu tasks, tests
Basic teuthology task is done in 38f6a78c71910a39b7f1890316c0a134ced8b0ec. Making a gitbuilder for qemu seems less im... Josh Durgin
06:52 PM rbd Feature #2644 (Rejected): qa: gitbuilder for qemu
This should build qemu with rbd support for regression testing new versions of qemu. Josh Durgin
06:49 PM rbd Feature #2567 (Resolved): qa: add qemu+rbd jobs to qa suite
Added in 94a6ab8ff3637f68c03261cf845b402d6bfa8e76 Josh Durgin
04:30 PM Subtask #2643: mon: Single-Paxos: mds: Strange message behavior on peon
This is what can be seen on the Leader:... Joao Eduardo Luis
04:24 PM Subtask #2643 (Closed): mon: Single-Paxos: mds: Strange message behavior on peon
Just for future reference.
When checking how things were going with the monitors, we noticed that the following sn...
Joao Eduardo Luis
03:37 PM Subtask #2633: mon: Single-Paxos: ceph tool unable to connect to monitor
Has something changed in the last five hours that you think fixed this? Greg Farnum
03:28 PM Subtask #2633 (Closed): mon: Single-Paxos: ceph tool unable to connect to monitor
It appears to be fixed.
The ceph tool is able to obtain the status from the monitors.
The 'watch' command doesn...
Joao Eduardo Luis
10:11 AM Subtask #2633 (Closed): mon: Single-Paxos: ceph tool unable to connect to monitor
This is what usually happens on the monitor side. Every now and then, the ceph tool is able to connect, but we haven'... Joao Eduardo Luis
01:54 PM rgw Bug #2642 (Resolved): rgw: show/trim usage using also time (not just date)
Yehuda Sadeh
01:42 PM Feature #2577 (In Progress): teuthology: blktrace task
Sage Weil
01:41 PM Feature #2581 (Resolved): perf: investigate 0.47.2 precise vs 0.46 oneiric discrepancy
Sage Weil
01:40 PM Feature #2576 (Resolved): perf: 0.48 on long-term clusters
Sage Weil
01:17 PM Linux kernel client Bug #2302 (Can't reproduce): xfs: warning at mutex_remove_waiter
Sage Weil
12:38 PM Bug #2550 (Resolved): logrotate: SIGHUP upstart jobs too, not just sysvinit
Sage Weil
12:06 PM rbd Feature #2641 (Duplicate): qa: regression tests for rbd openstack volume driver
This should include:
* booting a vm from an rbd device
* attaching/detaching an rbd device to a running guest
* ad...
Josh Durgin
11:30 AM rbd Feature #2640 (Duplicate): qa: regression tests for rbd glance backend
This should run against development versions of openstack to verify that the glance backend continues to work. Namely... Josh Durgin
11:26 AM Bug #2042 (Duplicate): mon: crash in LogMonitor::update_from_paxos
Indeed! Sage Weil
11:21 AM Bug #2042: mon: crash in LogMonitor::update_from_paxos
Hrm, I think that this is duplicated by #2593? Greg Farnum
11:16 AM Bug #2042 (Can't reproduce): mon: crash in LogMonitor::update_from_paxos
Sage Weil
11:13 AM Cleanup #2623 (Resolved): filestore btrfs trans should be removed
Sage Weil
11:07 AM Feature #1494 (Resolved): openstack: vm can boot off rbd
This has been possible for a long time. Josh Durgin
11:03 AM Bug #2638 (Resolved): mon: make pool ops idempotent
for example, deleting a pool fails with ENOENT (or ENODATA :/) if the pool doesn't exist, but if we lose our mon sess... Sage Weil
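The entry above is truncated, so the exact failure scenario is not recoverable here. The general idea of an idempotent delete, where repeating the operation is harmless, can be sketched with a toy in-memory model. `PoolMap` and its methods are hypothetical and do not reflect the monitor's data structures.

```python
import errno


class PoolMap:
    """Toy pool table used to contrast strict and idempotent deletes."""

    def __init__(self):
        self.pools = set()

    def create(self, name):
        self.pools.add(name)
        return 0

    def delete_strict(self, name):
        # A repeated delete fails, so a client that resends the request
        # cannot tell "already deleted by me" from "never existed".
        if name not in self.pools:
            return -errno.ENOENT
        self.pools.remove(name)
        return 0

    def delete_idempotent(self, name):
        # The end state is "pool absent" either way, so report success.
        self.pools.discard(name)
        return 0
```

With the idempotent variant, a client that resends a delete after a dropped connection sees success both times.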
11:02 AM rbd Feature #2637 (New): teuthology: task for running a vm using libvirt
This should have similar semantics to the qemu task that runs qemu directly, but configure and run the vm via libvirt... Josh Durgin
10:59 AM rbd Feature #2636 (New): qa: regression tests for qemu monitor commands
Test attach/detach of rbd devices and snapshot operations executed directly by the qemu monitor. This is probably eas... Josh Durgin
10:40 AM Bug #2602 (Need More Info): osd: push failed because local copy is X
Hi Simon-
This looks like something that could be caused by the broken rolling osd upgrade support in the branch y...
Sage Weil
10:14 AM rgw Feature #2635 (New): benchmark for measuring rgw metadata operations
We need to come up with a benchmark that will measure the following operations:
* Service:
1. List buckets
*...
Yehuda Sadeh
10:11 AM rbd Feature #2634 (Resolved): teuthology: add networking to qemu task
Let the guest speak to the outside world so test scripts can e.g. check out git repos and download test programs to c... Josh Durgin
06:32 AM Bug #2618: error: unable to open OSD superblock
I manually created the directory.
Then I ran the mkcephfs command.
The directory has some files in it (journal, mag...
John S

06/20/2012

09:39 PM Feature #2631 (Resolved): mon: kill rm -rf --mkfs behavior
Sage Weil
09:02 PM Bug #2593: logmonitor: decode failure
... Sage Weil
06:55 PM rbd Feature #2630 (Resolved): teuthology: add task to run qemu-iotests against rbd
qemu-iotests are included in upstream qemu.git. They exercise qemu's block layer to test correctness. They use existi... Josh Durgin
06:44 PM rbd Feature #2629 (New): qa: test performance during live migration
This could be done after #2628 by running iozone during the migration, parsing its output, and checking that throughp... Josh Durgin
06:41 PM rbd Feature #2628 (New): qa: test live migration with qemu
Run something like fsstress in the vm during the migration, and verify that it completes successfully. To do this we'... Josh Durgin
06:37 PM rbd Feature #2627 (New): qa: regression tests for libvirt rbd storage pool
Libvirt storage pools allow you to create, delete, and list volumes. Wido wrote a backend that uses librbd to do this... Josh Durgin
06:30 PM rbd Feature #2626 (New): qa: regression tests for basic rbd libvirt integration (disks)
Test using rbd disks with vms through libvirt.
This includes:
* booting a vm backed only by rbd
* attaching rb...
Josh Durgin
06:15 PM rbd Feature #2625 (Rejected): qa: gitbuilder for libvirt
Create a gitbuilder for libvirt packages so we can regression test rbd against upstream releases. Base this on the ub... Josh Durgin
05:36 PM Bug #2600: osd: crazy long watch timeout?
In another recurrence, there are no objecter requests:... Josh Durgin
04:34 PM Bug #2524 (Won't Fix): librados crashed while connecting to cluster
Sage Weil
04:34 PM Bug #2456 (Resolved): librbd: failed LibRBD.TestIOToSnapshot
Haven't seen this in a while. Maybe some of the race cleanups fixed it... Sage Weil
04:32 PM Documentation #2624: OpenStack creation instructions should recommend non-default number of pg's ...
It'll have to be ceph osd pool create <pool> <num_pgs> until #2519 is done. Josh Durgin
04:25 PM Documentation #2624 (Resolved): OpenStack creation instructions should recommend non-default numb...
http://ceph.com/docs/master/rbd/rbd-openstack/ recommends
sudo rados mkpool nova
This should probably be
su...
Dan Mick
03:46 PM Cleanup #2623 (Resolved): filestore btrfs trans should be removed
On Wed, 20 Jun 2012, Stefan Priebe - Profihost AG wrote:
> Hello list,
>
> i've looked at the wiki (http://ceph.co...
Dan Mick
03:01 PM Subtask #2622 (Resolved): mon: Single-Paxos: convert existing, old MonitorStore to a brand new Mo...
The new monitor design does not support the old MonitorStore, nor does it store the versions and their values in the ... Joao Eduardo Luis
02:58 PM Subtask #2621 (Resolved): mon: Single-Paxos: synchronize the MonitorDBStore of oblivious monitor
*Objective:* synchronize monitor stores over the network whenever a given monitor mon.X falls too far behind.
*Sol...
Joao Eduardo Luis
02:50 PM Subtask #2615 (Closed): mon: Single-Paxos: MDSMap::get_health() asserting
Joao Eduardo Luis
02:49 PM Subtask #2615: mon: Single-Paxos: MDSMap::get_health() asserting
This issue stopped popping up after we changed the criteria to propose queued proposals and restarted testing with a ... Joao Eduardo Luis
03:59 AM Subtask #2615 (Closed): mon: Single-Paxos: MDSMap::get_health() asserting
MDSMap infos, dumped on MDSMap::get_health() just before the assert is triggered:... Joao Eduardo Luis
02:47 PM Subtask #2616: mon: Single-Paxos: AuthMonitor: key_server has no entries
Appears to be fixed.
The ceph tool is able to connect to the cluster and obtain status information.
However, th...
Joao Eduardo Luis
11:01 AM Subtask #2616: mon: Single-Paxos: AuthMonitor: key_server has no entries
Although this appears to be fixed, we still are unable to authenticate clients.
My current suspicion is that we ar...
Joao Eduardo Luis
09:00 AM Subtask #2616: mon: Single-Paxos: AuthMonitor: key_server has no entries
We were encoding an empty "full version" of the key server during AuthMonitor::encode_pending(), alongside the ... Joao Eduardo Luis
08:36 AM Subtask #2616: mon: Single-Paxos: AuthMonitor: key_server has no entries
The problem appears to affect all mon clients, and it may be the reason why our OSDs do not work as well.
Log snip...
Joao Eduardo Luis
08:00 AM Subtask #2616 (Closed): mon: Single-Paxos: AuthMonitor: key_server has no entries
The Monitor's key_server has no entries, even though we made sure to populate mon.X/keyring with every single service... Joao Eduardo Luis
02:45 PM Subtask #2614: Single Paxos instance shared across the existing services
Joao Eduardo Luis
03:48 AM Subtask #2614 (Closed): Single Paxos instance shared across the existing services
One Paxos to propose them all. Joao Eduardo Luis
02:44 PM Subtask #2620 (Closed): mon: Single-Paxos: MDSMonitor: MMDSBeacon from entity with insufficient p...
... Joao Eduardo Luis
02:06 PM Bug #2550: logrotate: SIGHUP upstart jobs too, not just sysvinit
Please mention https://bugs.launchpad.net/upstart/+bug/1012938 in the "sucks" comment, so someone can some day nicely... Anonymous
01:47 PM Bug #2550: logrotate: SIGHUP upstart jobs too, not just sysvinit
repushed upstart-vs-logrotate branch Sage Weil
12:25 PM Bug #2550: logrotate: SIGHUP upstart jobs too, not just sysvinit
yeah, that'll work. only solves the logrotate case, but that's fine by me. Sage Weil
11:39 AM Bug #2550: logrotate: SIGHUP upstart jobs too, not just sysvinit
That killall thing is hideous, and I'm utterly unconvinced having even more upstart jobs for Ceph is helpful in any w... Anonymous
12:30 PM Feature #2619 (Resolved): filejournal: instrument with perfcounters
Sage Weil
12:09 PM Bug #2618: error: unable to open OSD superblock
Hi John,
Did you create the /data/ceph/osd0 directory? mkcephfs doesn't do it for you because of the potential for...
Josh Durgin
11:31 AM Bug #2618 (Can't reproduce): error: unable to open OSD superblock
I am new at this.
I installed ceph.
When I do a service ceph start, mon.0 and mds.(machine name) seem ok.
When it ...
John S
11:13 AM Bug #2022: osd: misdirectect request
here is the smoking gun. note that the pgid goes to 0.0 when linger tid 1 is resending the watch op 4:... Sage Weil
03:45 AM Subtask #2613: Sandbox PaxosServices accesses to the store
I messed up the formatting and don't seem to be able to edit it. So here goes a decent version of it.... Joao Eduardo Luis
03:41 AM Subtask #2613 (Resolved): Sandbox PaxosServices accesses to the store
Each service used to have direct access to the MonitorStore, and they could mess around wherever they wanted, allowin... Joao Eduardo Luis
03:25 AM Subtask #2612 (Resolved): Monitor key/value store
Create a key/value store, with transaction support, to be used on the monitor subsystem.
Its interface should refl...
Joao Eduardo Luis
03:21 AM Feature #2611 (Resolved): mon: Single-Paxos
The ceph-mon is (roughly) composed of a Monitor class, responsible for all things monitor-ish, and several monitor se... Joao Eduardo Luis

06/19/2012

07:04 PM rbd Feature #2556: rbd tool: break image locks
Argh. I don't seem to be getting my email notifications from you and Josh on Github, and I don't know why. Greg Farnum
06:57 PM rbd Feature #2556: rbd tool: break image locks
https://github.com/ceph/ceph/commit/3c05629691deb800e3c6e62e81f444a748e8857c#src-rbd-cc-P108
just making sure i un...
Sage Weil
06:48 PM rbd Feature #2556: rbd tool: break image locks
Your commits look good to me (sorry I missed the cli tests; I need to get into the habit of running those), but I don... Greg Farnum
05:46 PM rbd Feature #2556: rbd tool: break image locks
rebase, fixed up ENOENT vs ENOEXEC behavior. one clarification about the purpose/scope of 'rbd lock', but otherwise ... Sage Weil
03:13 PM rbd Feature #2556 (Fix Under Review): rbd tool: break image locks
wip-rbd-locking has this now, but it also merges in wip-clsrbd for an unrelated change, so you might want to wait to ... Greg Farnum
05:06 PM Bug #2610 (Resolved): osd: pg stuck at scrubbing
Happened on congress, pg was stuck at scrubbing state for two and a half days.... Yehuda Sadeh
04:20 PM rbd Feature #2558 (Resolved): cls_rbd: child/parent methods
Sage Weil
04:05 PM devops Feature #2584 (In Progress): sepia: provide networking, DHCP for dynamic virtual machines
Sage Weil
04:04 PM Feature #2576 (In Progress): perf: 0.48 on long-term clusters
Sage Weil
04:04 PM Feature #2575 (In Progress): perf: 0.48 numbers
Sage Weil
03:52 PM rbd Feature #2609 (Resolved): librbd: new image name -> image head indirection
To prevent rename from disrupting clients with images open,
* put header in rbd_head.$id
* put $id in rbd_id.$nam...
Sage Weil
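A rough sketch of the name -> id -> header indirection described in #2609. The object names follow the excerpt, but the second one is truncated there, so the `rbd_id.<name>` key below is an assumption. The dict-backed store and all function names are hypothetical, not librbd's API:

```python
def create_image(store, name, image_id, header):
    # The name object only records the id; the header lives under the id.
    store[f"rbd_id.{name}"] = image_id        # assumed key layout
    store[f"rbd_head.{image_id}"] = header


def open_image(store, name):
    # Resolve the name once at open time; the client keeps the id.
    return store[f"rbd_id.{name}"]


def read_header(store, image_id):
    return store[f"rbd_head.{image_id}"]


def rename_image(store, old_name, new_name):
    # Only the small name object moves; the header object is untouched,
    # so clients that already hold the id are not disrupted.
    store[f"rbd_id.{new_name}"] = store.pop(f"rbd_id.{old_name}")
```

A client that opened the image before a rename keeps reading `rbd_head.<id>` through the id it already holds.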
02:32 PM rgw Feature #2516 (Resolved): rgw: new bandwidth-only per-user log
Sage Weil
02:28 PM rbd Bug #2608 (Closed): rbd: hung xfstest 270
Logs are available in ubuntu@teuthology:/a/teuthology-2012-06-19_00:00:09-regression-next-testing-basic/1792
2012-...
Tamilarasi muthamizhan
01:25 PM Bug #2022: osd: misdirectect request
latest run log: ubuntu@teuthology:/a/teuthology-2012-06-18_19:00:05-regression-master-testing-gcov/1586 Tamilarasi muthamizhan
12:54 PM CephFS Bug #1947: mds: SIGBUS during _mark_dirty
ubuntu@teuthology:/a/teuthology-2012-06-18_19:00:05-regression-master-testing-gcov/1579 Tamilarasi muthamizhan
11:31 AM Messengers Bug #1985: msgr: creating new Pipe for pre-existing connection leaks Pipe if they don't replace
I've still got this sitting around in my workspace. Since we seem to have pushed back a messenger re-do, perhaps we s... Greg Farnum
09:57 AM rbd Feature #2607 (Resolved): librbd: copyup helper
copyup helper to perform a copyup from parent to child. will be used by both the rbd command-line copyup command, an... Sage Weil
09:57 AM rbd Subtask #2606 (Resolved): librbd layering: copyup on missing child object
Sage Weil
09:57 AM rbd Subtask #2605 (Resolved): librbd layering: guard writes
Sage Weil
09:56 AM rbd Subtask #2604 (Resolved): librbd layering: read path
Sage Weil
09:56 AM rbd Subtask #2603 (Resolved): librbd layering: open parent on open
Sage Weil

06/18/2012

10:07 PM Bug #2550 (Fix Under Review): logrotate: SIGHUP upstart jobs too, not just sysvinit
Sigh. See branch upstart-vs-logrotate. Sage Weil
08:57 PM rbd Feature #2556: rbd tool: break image locks
Greg Farnum wrote:
> Team RBD needs more to do! Pulling this forward. :)
Go team! :)
Sage Weil
06:26 PM rbd Feature #2556 (In Progress): rbd tool: break image locks
Team RBD needs more to do! Pulling this forward. :) Greg Farnum
05:56 PM rbd Feature #2585 (In Progress): rbd: clone command
Dan Mick
05:34 PM rbd Feature #2585: rbd: clone command
Dan Mick
05:35 PM rbd Feature #2559: cls_rbd: copyup method
Dan Mick
01:50 PM rbd Feature #2601: rbd: Show image size with an "ls"
We've also heard from others that having a better estimate of rbd usage and expected usage would be good; taking into... Dan Mick
06:09 AM rbd Feature #2601 (Resolved): rbd: Show image size with an "ls"
A request came in on the mailing list asking whether the "rbd" tool could be modified to not only show image names when doing an ls... Wido den Hollander
01:34 PM rgw Bug #2542 (Resolved): rgw: support S3 update of metadata
Yehuda Sadeh
01:32 PM rgw Bug #2542: rgw: support S3 update of metadata
Resolved, commit:343cc792e847ca8901f6c08e41799a2fbbd2ca92 Yehuda Sadeh
11:04 AM Bug #2602: osd: push failed because local copy is X
Updated another osd to 'next' and same errors happened.
I've attached the log with debug osd = 20 set.
Simon Frerichs
08:46 AM Bug #2602: osd: push failed because local copy is X
Is this reproducible with 'debug osd = 20'? Sage Weil
08:44 AM Bug #2602 (Resolved): osd: push failed because local copy is X
Hi,
filestore updated completed.
When i start the "updated" OSD the whole cluster starts lagging.
Is the next br...
Sage Weil
08:45 AM Bug #2598: filestore: error during upgrade
Simon Frerichs wrote:
> Hi,
>
> filestore updated completed.
> When i start the "updated" OSD the whole cluster ...
Sage Weil
08:42 AM Bug #2598 (Resolved): filestore: error during upgrade
Thanks! Sage Weil
01:29 AM Bug #2598: filestore: error during upgrade
Hi,
filestore updated completed.
When i start the "updated" OSD the whole cluster starts lagging.
Is the next br...
Simon Frerichs
12:56 AM Bug #2598: filestore: error during upgrade
Thanks.
The bug seems to be fixed.
Simon Frerichs
08:43 AM Bug #2595: filestore: error creating filestore during mkcephfs
2012-06-18 17:42:16.232924 7f54292fb780 -1 filestore(/srv/osd.20) could not find 23c2fcde/osd_superblock/0//-1 in ind... Stefan Priebe
08:29 AM Bug #2599: osd: crash in ReplicatedPG::C_OSD_OndiskWriteUnlock::finish
commit:5efaa8d7799347dfae38333b1fd6e1a87dc76b28 Sage Weil
07:25 AM CephFS Bug #2596: mds: spinning on restart
gdb is not helpful here, process seems to be spinning in syscall:
(gdb) thread apply all bt
Thread 1 (process 148...
Amon Ott

06/17/2012

10:40 PM Bug #2600: osd: crazy long watch timeout?
Possibly related to #2476 Josh Durgin
09:37 PM Bug #2600 (Resolved): osd: crazy long watch timeout?
... Sage Weil
09:34 PM CephFS Bug #1737: ceph-fuse crash in xlist::remove
see ubuntu@teuthology:/a/teuthology-2012-06-17_19:00:03-regression-master-testing-gcov/1303 for a failure with logs! Sage Weil
02:33 PM RADOS Feature #2422 (In Progress): crush: test that mapping result is uncorrelated
Sage Weil
02:32 PM Bug #2598: filestore: error during upgrade
Ah... should have tested on another filesystem. Samuel Just
02:21 PM Bug #2598: filestore: error during upgrade
Oh, der.. pretty sure commit:82cb3d61ff4f200e0a9040e6381a9eed32db9de1 fixes this. Sage Weil
02:29 PM Bug #2022: osd: misdirectect request
Last two failures were the rados api tests:... Sage Weil
06:50 AM CephFS Bug #2385: max mds = 2, mds hang and crash
... Yavuz Selim Komur
06:45 AM CephFS Bug #2385: max mds = 2, mds hang and crash
... Yavuz Selim Komur

06/16/2012

11:34 AM Bug #2598: filestore: error during upgrade
That's odd, it's updating the omap directory as a collection. list_collections should not have returned omap as a co... Samuel Just
08:04 AM Bug #2598 (Resolved): filestore: error during upgrade
from ML:... Sage Weil
08:25 AM Bug #2462 (Resolved): osd/PG.cc: 402: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)
I'm going to optimistically call this resolved. If we see this crash again, though, we'll need to reopen, and hopefu... Sage Weil
08:24 AM rbd Bug #2535: rbd: random data corruption in vm
We've disabled fiemap, which appears to be the culprit. Josh is still tracking down which kernel releases are affect... Sage Weil
08:21 AM Bug #2599 (Can't reproduce): osd: crash in ReplicatedPG::C_OSD_OndiskWriteUnlock::finish
from ml:... Sage Weil
07:59 AM Bug #2595 (Resolved): filestore: error creating filestore during mkcephfs
Sage Weil
07:59 AM Bug #2595: filestore: error creating filestore during mkcephfs
commit:1e899d08e61bbba0af6f3600b6bc9a5fc9e5c2e9 Sage Weil
06:40 AM Bug #2595: filestore: error creating filestore during mkcephfs
Yes Stefan Priebe

06/15/2012

05:58 PM rbd Feature #1480 (Resolved): librbd: image locking
Okay, discussed and merged in commit:dac9f223598c5f67b228403e514f202280d56488 Greg Farnum
05:49 PM rbd Feature #1480: librbd: image locking
And after thorough review from Josh, this should be ready for merge (commit:5b1b02b60a253092700f364dca77bb6b1065e3e0)... Greg Farnum
02:40 PM rgw Bug #1643 (Rejected): radosgw-admin log show should accept --time
Yehuda Sadeh
02:03 PM Bug #2595: filestore: error creating filestore during mkcephfs
Oh, it looks like it's just noise from checking the journal. The mkcephfs succeeded, right? Sage Weil
01:57 PM Bug #2595: filestore: error creating filestore during mkcephfs

> Can you reproduce with 'debug filestore = 20' and attach the log to this
> bug?
Log:...
Stefan Priebe
10:32 AM Bug #2595: filestore: error creating filestore during mkcephfs
FYI, I saw this once when I was working on the OSD hotplug code paths. Mine might have been caused by a missing "osd ... Anonymous
09:29 AM Bug #2595 (Resolved): filestore: error creating filestore during mkcephfs
from ML:... Sage Weil
11:48 AM rbd Bug #2597 (Resolved): Import of image from file appears to succeed, but image not present in the ...
I have been testing with storing an image file, a basic QCOW2 image of latest Ubuntu distro on a pool, which is used ... Sam Zaydel
10:44 AM rbd Feature #2558: cls_rbd: child/parent methods
wip-clsrbd Sage Weil
10:44 AM rbd Feature #2558 (Fix Under Review): cls_rbd: child/parent methods
Sage Weil
09:44 AM CephFS Bug #2596 (Can't reproduce): mds: spinning on restart
from ML:... Sage Weil

06/14/2012

09:02 PM Linux kernel client Bug #2389 (Duplicate): rbd: hung xfstest 67
Sage Weil
09:01 PM Linux kernel client Bug #2359 (Can't reproduce): xfstest 62 failing
haven't seen this in a while Sage Weil
05:55 PM Feature #2571 (Resolved): sepia: enable virtualization
Dan Mick
11:34 AM Feature #2571 (In Progress): sepia: enable virtualization
BIOS settings changed on all plana; one reboot test shows good results. One can tell if
virtualization is enabled w...
Dan Mick
04:12 PM Bug #2593 (Resolved): logmonitor: decode failure
Saw this while trying to reproduce #2569. Sadly teuthology cleaned everything up before I could get to the data.
<pr...
Greg Farnum
03:24 PM Feature #2581 (In Progress): perf: investigate 0.47.2 precise vs 0.46 oneiric discrepancy
Sage Weil
03:13 PM devops Feature #2415 (Resolved): upstart: support radosgw
Sage Weil
03:06 PM rbd Bug #2534: librbd: make sure watch is established on same header version as initial read was
Okay, this is blocked by #2592. Greg Farnum
03:06 PM Bug #2563: leveldb corruption
It's triggerable without ceph, I've filed a bug below with leveldb and I'm continuing to look into it.
http://code...
Samuel Just
03:05 PM Bug #2592: osd and all clients: watch version parameter is ignored
Alternatively, maybe the OSD should just enforce the version with those checks when setting a watch? It looks to me a... Greg Farnum
03:01 PM Bug #2592 (Resolved): osd and all clients: watch version parameter is ignored
Watch operations have a version parameter that is supposed to act like an assert_version op. This could easily be done i... Josh Durgin
02:38 PM Feature #2471 (In Progress): osd: add prefix match to OSDCaps
you can have this one too, given your wip-osdcap branch. Greg Farnum
02:37 PM rbd Feature #1480 (Fix Under Review): librbd: image locking
wip-rbd-locking Greg Farnum
02:09 PM rgw Feature #2517 (Resolved): rgw: limit number of buckets per user (configurable per user)
added teuth tests, in master, backported to dho Sage Weil
02:04 PM rgw Bug #2591 (Resolved): misc rgw s3tests failures
Should be ok for now. I've set boto to 2.4.1, we can change that later once upstream fixes its issues. Yehuda Sadeh
10:15 AM rgw Bug #2591: misc rgw s3tests failures
boto 2.5.0 issue. For some reason it doesn't set the error.reason on 400 responses. Yehuda Sadeh
07:57 AM rgw Bug #2591 (Resolved): misc rgw s3tests failures
2012-06-13T12:51:42.657 INFO:teuthology.orchestra.run.err:s3tests.functional.test_headers.test_bucket_create_bad_auth... Sage Weil
12:59 PM rbd Bug #2535: rbd: random data corruption in vm
Sage Weil wrote:
> Just a bit of context: rbd without caching does a 'sparse-read' operation, which uses FIEMAP to d...
Sage Weil
12:52 PM rbd Bug #2535: rbd: random data corruption in vm
Just a bit of context: rbd without caching does a 'sparse-read' operation, which uses FIEMAP to determine which parts... Sage Weil
12:50 PM rbd Bug #2535: rbd: random data corruption in vm
Let's try a different tack: I pushed an osd-verify-sparse-read-holes branch to ceph.git (based on 0.47.2) that reads ... Sage Weil
09:09 AM rbd Bug #2535: rbd: random data corruption in vm
Status update:
I tried modifying the iotester so that it would work directly on the block device, in the hopes I c...
Guido Winkelmann
10:14 AM Feature #2472: osd: add opaque 'class <name> <foo>' cap that class can interpret/enforce
wip-osdcap is doing this way better than I was, although I'm happy to take it back to do the OSD changes if need be. Greg Farnum
09:09 AM rbd Bug #2410: hung xfstest #68
disabled 68 in qa for the time being. Sage Weil
09:03 AM rbd Bug #2522: xfstest #219
Sigh.. took a quick look and it's non-obvious why the repquota output doesn't match. Disabling this for now, but lea... Sage Weil

06/13/2012

08:57 PM Linux kernel client Bug #2590 (New): possible irq lock inversion dependency with con->mutex and osdc->request_mutex
i thought this was #147, but on closer inspection it's something else;... Sage Weil
07:17 PM Bug #2550: logrotate: SIGHUP upstart jobs too, not just sysvinit
Filed upstream: https://bugs.launchpad.net/upstart/+bug/1012938 Anonymous
06:17 PM rgw Feature #2516: rgw: new bandwidth-only per-user log
I think the last thing we need here is to add it to the radosgw-admin test so that we don't break these commands in t... Sage Weil
12:26 PM rgw Feature #2516 (In Progress): rgw: new bandwidth-only per-user log
Sage Weil
05:04 PM rgw Feature #2473: rgw: revisit operation logging
Not the top priority, but we can have an async flush, similar to the one we have for the usage logging. Yehuda Sadeh
04:20 PM Linux kernel client Bug #2573: libceph: many "socket closed" messages
In that case, if you want to run this with the osd messenger debug at 5 and can gather logs next time I'll be happy t... Greg Farnum
02:35 PM Linux kernel client Bug #2573: libceph: many "socket closed" messages
The test takes on the order of a minute to complete one pass
of test 049. During that time I typically see 10-20 so...
Alex Elder
10:46 AM Linux kernel client Bug #2573: libceph: many "socket closed" messages
The sockets have a default timeout of 15 minutes, after which they will close — the idea being that if the socket is ... Greg Farnum
10:40 AM Linux kernel client Bug #2573 (Resolved): libceph: many "socket closed" messages
While trying to reproduce a null pointer problem in the client
messenger code I was running xfstests #049 over RBD d...
Alex Elder
02:31 PM Feature #988 (Duplicate): librbd: trivial layering
replaced by other tasks Sage Weil
02:31 PM Feature #988 (Rejected): librbd: trivial layering
Sage Weil
02:31 PM devops Feature #2589 (Resolved): crowbar: Update barclamp-ceph for Essex, new ceph-cookbooks
Anonymous
02:30 PM devops Feature #2588 (Resolved): downburst: multiple, configurable networks to libvirt
Right now, it hardcodes that a vm only has the "default" network. Make that configurable. Anonymous
02:29 PM devops Feature #2587 (Resolved): sepia: isolated networking on vercoi (manual, a handful)
One-time switch & linux configuration for a handful of VLANs, manually allocated to people who want to run Crowbar. Anonymous
02:28 PM rbd Feature #2586 (Rejected): rbd: check/take locks on --lock
if you pass --lock to rbd, take an exclusive lock, do whatever, unlock Sage Weil
02:20 PM rbd Feature #2585 (Resolved): rbd: clone command
A command for the rbd tool to create a child image from a parent. Example:
rbd clone --parent pool/image@snap pool...
Sage Weil
01:56 PM rbd Feature #2467 (Rejected): qemu: implement bdrv_invalidate_cache
I've tested migration with caching, and read the code, and it looks like this is unnecessary. qemu is doing a flush b... Josh Durgin
01:47 PM devops Feature #2584 (Resolved): sepia: provide networking, DHCP for dynamic virtual machines
downburst can provision them really nicely, but right now only static networking works. To fix that, we need DNS to w... Anonymous
01:40 PM devops Feature #2583 (Resolved): crowbar: change barclamp-nova to use rbd
The nova proposal needs to point to a ceph proposal. Look at how nova&glance use mysql.
barclamp-chef should inclu...
Anonymous
01:25 PM Feature #1964 (Rejected): ferro: Create a cloud-init OVF config that reimages a machine
Dell's vMedia functionality is awfully buggy, aborting this plan (for now?). Anonymous
01:25 PM Feature #1965 (Rejected): ferro: Machine management state machine (fake actions)
Dell's vMedia functionality is awfully buggy, aborting this plan (for now?). Anonymous
01:24 PM Feature #1966 (Rejected): ferro: Connect actions to state machine
Anonymous
01:20 PM Feature #1966: ferro: Connect actions to state machine
Dell's vMedia functionality is awfully buggy, aborting this plan (for now?). Anonymous
01:21 PM Feature #1967 (Rejected): ferro: Single API endpoint that delegates to machine managers
Anonymous
01:20 PM Feature #1967: ferro: Single API endpoint that delegates to machine managers
Dell's vMedia functionality is awfully buggy, aborting this plan (for now?). Anonymous
01:20 PM Feature #1968 (Rejected): ferro: Batch resource allocation (not fair, no quotas yet)
Dell's vMedia functionality is awfully buggy, aborting this plan (for now?). Anonymous
01:20 PM rbd Bug #2522: xfstest #219
The problem here appears to be that the output of the repquota
command is not what's expected. I think the group qu...
Alex Elder
01:17 PM Feature #1962 (Rejected): ferro: Trigger vMedia boot via IPMI/DRAC
Dell's vMedia functionality is awfully buggy, aborting this plan (for now?). Anonymous
01:16 PM Feature #1961 (Rejected): ferro: Python wrapper for vmcli (using gevent)
Dell's vMedia functionality is awfully buggy, aborting this plan. Anonymous
01:12 PM Feature #1963 (Closed): ferro: OVF Environment creation as a library
downburst actually ended up containing this logic, not OVF but still cloud-init. Anonymous
01:04 PM rgw Feature #2517 (Fix Under Review): rgw: limit number of buckets per user (configurable per user)
Sage Weil
01:03 PM Feature #2582 (Resolved): set up chart.io + mysql (or equivalent) infrastructure for tracking perf
Sage Weil
12:44 PM Linux kernel client Bug #2287 (Resolved): rbd: crashes with 10Gbit network and fio
This looks like the bio->iter problem, which is now fixed by commit:43643528cce60ca184fe8197efa8e8da7c89a037 in ceph-... Sage Weil
12:38 PM Feature #2581 (Resolved): perf: investigate 0.47.2 precise vs 0.46 oneiric discrepancy
Sage Weil
12:37 PM Feature #2580 (Resolved): perf: investigate poor performance at 10 osds per node
Sage Weil
12:32 PM Feature #2578 (New): rados ager
aging function that is invoked (probably) similarly to rados bench, ideally using the same bencher abstraction so tha... Sage Weil
12:30 PM Feature #2577 (Resolved): teuthology: blktrace task
* run blktrace on the osds' disks.
* put results in the archive dir
* maybe an optional start delay, duration, ...
Sage Weil
12:30 PM Feature #2576 (Resolved): perf: 0.48 on long-term clusters
Sage Weil
12:29 PM Feature #2575 (Resolved): perf: 0.48 numbers
populate the spreadsheet with values from 0.48 Sage Weil
11:43 AM Messengers Bug #2569 (Need More Info): msgr: connect_rank crash
I'm attempting to reproduce this, but what's available right now is just the teuthology log — it didn't pull off any ... Greg Farnum
09:57 AM Messengers Bug #2569 (Resolved): msgr: connect_rank crash
... Sage Weil
11:31 AM devops Feature #2574 (Resolved): crowbar: use data disks automatically, journal inside data directory
Crowbar sets node['crowbar']['disks'] to an array of disks. First one is used for the OS, and disk['usage'] is set to... Anonymous
11:20 AM rbd Cleanup #2347 (Resolved): The rbd help text is misleading on required arguments
commit:67710a65c7cd1173c73c40241572d615dd7da1f3 Sage Weil
11:06 AM devops Feature #2415 (Fix Under Review): upstart: support radosgw
Sage Weil
11:02 AM Cleanup #2331 (Resolved): Makefile.am:182: `lib/libgtest.a' is not a standard libtool library name
commit:66553d25f09f0d0cea735a862a228060b72c0ce6 Sage Weil
10:30 AM rbd Bug #2572 (Resolved): krbd: writeback errors?
While trying to reproduce a null pointer messenger problem,
I kept hitting messages like this after some (fairly ran...
Alex Elder
10:29 AM Feature #2571 (Resolved): sepia: enable virtualization
Sage Weil
10:27 AM rbd Bug #2535: rbd: random data corruption in vm
Sage Weil wrote:
> Guido Winkelmann wrote:
> > Sage Weil wrote:
> > > Are there multiple partitions or is LVM on t...
Guido Winkelmann
10:03 AM Linux kernel client Bug #2389: rbd: hung xfstest 67
ubuntu@teuthology:/a/nightly_coverage_2012-06-13-a/7559 Sage Weil
09:55 AM Linux kernel client Bug #2389: rbd: hung xfstest 67
ubuntu@teuthology:/a/master-2012-06-12_16:17:15/7465 Sage Weil
10:02 AM Linux kernel client Bug #147: lockdep: possible irq lock inversion dependency w/ osdc->request_mutex and con->mutex
ubuntu@teuthology:/a/nightly_coverage_2012-06-13-a/7579
ubuntu@teuthology:/a/nightly_coverage_2012-06-13-a/7587
<...
Sage Weil
09:59 AM CephFS Bug #1947: mds: SIGBUS during _mark_dirty
ubuntu@teuthology:/a/nightly_coverage_2012-06-13-a/7526 Sage Weil
09:23 AM rbd Feature #2568 (Resolved): qa: run xfstests on qemu+rbd
This will build on #2566:
* stage xfstests on vdb, like a regular workunit, and:
* map additional rbd images to r...
Sage Weil
09:21 AM rbd Feature #2567 (Resolved): qa: add qemu+rbd jobs to qa suite
Add a bunch of workunits to the qa suite that will run on top of rbd inside a vm. Sage Weil
09:20 AM rbd Feature #2566 (Duplicate): teuthology: task to run rbd workunits in a vm
teuthology task that will:
* download workunit vm
* create and format rbd image
* mount, stage a workunit in rbd...
Sage Weil

06/12/2012

08:48 PM Feature #2564 (Resolved): teuthology: install kernels from local dir
Sage Weil
02:58 PM Bug #2462 (Need More Info): osd/PG.cc: 402: FAILED assert(log.head >= olog.tail && olog.head >= l...
f822c0257e4c7fad181332cd149205ad15a8b9db
See the commit description. Unfortunately, I don't really have evidence ...
Samuel Just
02:55 PM Bug #2563 (Resolved): leveldb corruption
This was also mentioned once in the mailing list.
ceph version 0.47.2 (commit:8bf9fde89bd6ebc4b0645b2fe02dadb1c17a...
Samuel Just
02:40 PM rbd Feature #2561: rbd: copyup command
What? How does a class function of any kind provide atomicity in cross-OSD data copies? Greg Farnum
02:37 PM rbd Feature #2561: rbd: copyup command
'rbd copyup pool/image' command to copy any missing objects up from the parent. simple O(n) operation that leverages... Sage Weil
02:11 PM rbd Feature #2561 (Resolved): rbd: copyup command
'rbd copyup pool/image' command to copy any missing objects up from the parent. simple O(n) operation that leverages ... Sage Weil
02:39 PM rbd Feature #2562 (Resolved): librbd: open parent images, read path, write path
- when we open an image, open the parent image too.
- make reads fall through to parent
- guard writes beyond paren...
Sage Weil
02:05 PM rbd Feature #2560 (Resolved): rbd: safe parent deletion
- maintain map of parent/child pairs in each child pool... Sage Weil
02:04 PM rbd Feature #2531: rbd: fencing broken clients
As I see it, we have two options that we need to choose between.
1) We can add fencing to librbd and let anybody do ...
Greg Farnum
01:58 PM Bug #2550: logrotate: SIGHUP upstart jobs too, not just sysvinit
The instance jobs make this a bit trickier. Either process "initctl list" output or copy the logic that walks the /va... Anonymous
01:04 PM Bug #2550 (Resolved): logrotate: SIGHUP upstart jobs too, not just sysvinit
Anonymous
01:55 PM rbd Feature #2559 (Resolved): cls_rbd: copyup method
- client provides object content
- if object exists, fail with EEXIST (or 0, or something)
- if object does not exi...
Sage Weil
01:54 PM rbd Feature #2558 (Resolved): cls_rbd: child/parent methods
On the new image header:
- set_parent(poolid, image (maybe id, maybe name), snapid)
On the per-pool child list:
...
Sage Weil
01:52 PM rbd Feature #2557 (Rejected): QEMU support for image locking
We should convert QEMU to make use of rbd cooperative locking, once it's done (#1480).
And any other appropriate c...
Greg Farnum
01:50 PM rbd Feature #2556 (Resolved): rbd tool: break image locks
Once #1480 is done, expose lock breaking via the rbd tool. Greg Farnum
01:47 PM devops Feature #2555 (Rejected): chef: SECURITY: Re-evaluate where configuration & key handoff gets stored
The current setting seems to mean root on all chef nodes (even ones not running Ceph), and all knife users, have full... Anonymous
01:44 PM devops Feature #2554 (Rejected): chef: open question: How do we discover what disks we should use as Cep...
For Crowbar, see #2574.
- This is somewhat a dangerous operation, run accidentally it will clobber a lot of data. ...
Anonymous
01:43 PM devops Feature #2553 (Closed): crowbar: open question: What's the correct way to add RBD support to the ...
We'll need to set --volume-driver etc. in nova.conf,
glance-api.conf, etc. So I guess we need to (temporarily) fo...
Anonymous
01:36 PM devops Feature #2415 (In Progress): upstart: support radosgw
Sage Weil
01:21 PM devops Feature #2552 (Rejected): chef: admin tool to generate config in json (uuid, secret)
The environment needs things like... Anonymous
01:12 PM Bug #2551 (Rejected): leveldb broke "make distcheck"
... Anonymous
01:03 PM devops Feature #2549 (Resolved): ceph-disk-prepare: take fstype, mkfs and mount options from ceph.conf
See #2548 for similar need. Anonymous
01:02 PM devops Feature #2548 (Resolved): ceph-disk-activate: take mount options from ceph.conf
Anonymous
01:02 PM devops Feature #2547 (Resolved): ceph-disk-prepare: handle partitioning and mkfs
spawn gdisk in a subprocess.
How much protection do admins need to avoid ceph-disk-prepare /dev/sda mistakes?
Anonymous
01:00 PM devops Feature #2546 (Resolved): ceph-disk-prepare: take fsid from ceph.conf (support --cluster=name)
Anonymous
12:49 PM devops Feature #2498 (Resolved): standardize keyring locations for daemons
Sage Weil
10:56 AM Bug #2545 (Resolved): init-ceph: stops if one instance fails to start
Sage Weil
10:52 AM Bug #2543 (Resolved): crush: invalid pointer when outputting local retry histogram for large rang...
caleb miles
10:10 AM rbd Bug #2535: rbd: random data corruption in vm
Guido Winkelmann wrote:
> Sage Weil wrote:
> > Are there multiple partitions or is LVM on the disk, or is the file ...
Sage Weil
10:07 AM rbd Bug #2535: rbd: random data corruption in vm
Sage Weil wrote:
> Are there multiple partitions or is LVM on the disk, or is the file system on the raw device?
...
Guido Winkelmann
09:29 AM rbd Bug #2535: rbd: random data corruption in vm
Are there multiple partitions or is LVM on the disk, or is the file system on the raw device? Sage Weil
05:32 AM rbd Bug #2535: rbd: random data corruption in vm
On Monday, 11 June 2012, 09:30:42, Sage Weil wrote:
> If you can reproduce it with 'debug filestore = 20' too, tha...
Guido Winkelmann
05:29 AM rbd Bug #2535: rbd: random data corruption in vm
The bug also does not seem to have any effect with the setting "filestore fiemap = false" in ceph.conf. Guido Winkelmann
02:27 AM Bug #2544 (Closed): Help text for "usage show" identical to "usage trim"
cerr << " usage show show usage (by user, date range)\n";
cerr << " usage trim ...
Dan Mick

06/11/2012

09:17 PM Bug #2543 (Resolved): crush: invalid pointer when outputting local retry histogram for large rang...
buggered the memory when we are generating the histogram for a large range of x. caleb miles
06:57 PM rgw Bug #2542 (Resolved): rgw: support S3 update of metadata
S3 metadata update is being done by copying of an object to itself with new metadata info. Yehuda Sadeh
04:22 PM Feature #1772 (Resolved): rbd: define new on-disk header format
Sage Weil
11:31 AM Feature #1772 (In Progress): rbd: define new on-disk header format
Sage Weil
03:17 PM Bug #2540 (Resolved): "ceph osd crush set" should treat "foo=" as if foo wasn't mentioned on the ...
Sage Weil
03:12 PM Bug #2540 (In Progress): "ceph osd crush set" should treat "foo=" as if foo wasn't mentioned on t...
Sage Weil
03:09 PM Bug #2540 (Resolved): "ceph osd crush set" should treat "foo=" as if foo wasn't mentioned on the ...
The current behavior, using an empty string as the name, is quite confusing.
Instead of an error message, a better...
Anonymous
03:13 PM RADOS Feature #2541 (Resolved): crush: move command to adjust non-leaf node position
the add or update function is intentionally limited to leaves. allow the hierarchy to be adjusted using a different ... Sage Weil
03:08 PM Feature #2510 (Resolved): update on-disk hobject_t encoding to include pool and namespace fields
Sage Weil
02:13 PM Feature #2539 (Duplicate): ceph should issue timeout message when it can't connect to mon
I forgot to start the ceph service before issuing ceph -s to check its status. The tool happily
waited forever to c...
Dan Mick
11:31 AM Feature #2496 (Resolved): reinstall pudgy
Sage Weil
10:32 AM RADOS Feature #2521 (Resolved): crush: control bucket vs device mark-down probabilities independently
Sage Weil
09:50 AM Linux kernel client Bug #2392: First read of symlink after ceph filesystem mounted gives error
This is going to be easy to fix once the atomic_open stuff is merged. Real Soon Now. Sage Weil
09:40 AM Linux kernel client Bug #2537 (Won't Fix): bad header for RHEL6-like kernels
That backports tree is very old and not maintained. Assuming you do get it working, you'll have 1-2 year old code. ... Sage Weil
05:07 AM Linux kernel client Bug #2537: bad header for RHEL6-like kernels
Sorry,
I forgot to mention that it implies caps.c and super.h files.
For detecting that kernel is RHEL it is mayb...
Yannick Perret
04:28 AM Linux kernel client Bug #2537 (Won't Fix): bad header for RHEL6-like kernels
Hello,
I tried to compile the kernel module (kclient-0.20) and get a problem with ceph_write_inode:
it is declared ...
Yannick Perret
09:33 AM Feature #1773 (Resolved): rbd: class interface for header interaction
Sage Weil
04:01 AM Bug #2536 (Can't reproduce): librados crashed while getting stat of an object
librados crashed while getting stat of an object:... Xiaopong Tran

06/10/2012

09:58 PM Feature #1400 (Resolved): throw exceptions on unknown encoding
Sage Weil
09:46 PM Feature #2088: msgr: refactor 2 threads to one
Sage Weil
09:46 PM Feature #2149: osd: use omap for snap collections
Sage Weil
09:22 PM Feature #1772: rbd: define new on-disk header format
Sage Weil
05:47 PM Feature #1773: rbd: class interface for header interaction
Sage Weil
05:47 PM Feature #1773 (Fix Under Review): rbd: class interface for header interaction
Sage Weil
05:41 PM Linux kernel client Bug #2389: rbd: hung xfstest 67
nightly_coverage_2012-06-10-a 6787 Sage Weil
11:05 AM CephFS Bug #2444: null pointer deference in ceph_d_prune inside kvm
hi,
same bug here on native x86 and amd64 machines.
It affects debian wheezy and ubuntu 12.04 LTS.
I did not check...
Christian Krafft

06/09/2012

08:06 PM rbd Bug #2535: rbd: random data corruption in vm
The information that *should* let us fully diagnose:
* set
debug osd = 20
debug filestore = 20
debug ms = ...
Sage Weil
08:04 PM rbd Bug #2535 (Resolved): rbd: random data corruption in vm
From ML:... Sage Weil
04:27 PM CephFS Bug #1947 (Need More Info): mds: SIGBUS during _mark_dirty
It looks like this one still lives on:... Sage Weil

06/08/2012

11:14 PM Bug #2524: librados crashed while connecting to cluster
Thanks for the update. Yes, we do have different models, including a pool of set number of rados_t instances, etc. Bu... Xiaopong Tran
10:37 PM Bug #2524: librados crashed while connecting to cluster
Xiaopong Tran wrote:
> This is on my system:
> [...]
>
> Does it create a thread to every configured osd or only one...
Sage Weil
09:27 PM Bug #2524: librados crashed while connecting to cluster
I bumped up the threads-max to:... Xiaopong Tran
07:40 PM Bug #2524: librados crashed while connecting to cluster
This is on my system:... Xiaopong Tran
07:17 AM Bug #2524: librados crashed while connecting to cluster
Sage Weil wrote:
> can you cat /proc/sys/kernel/threads-max ? on my system it's only 127837.
Yeah, for each libr...
Sage Weil
07:09 AM Bug #2524: librados crashed while connecting to cluster
can you cat /proc/sys/kernel/threads-max ? on my system it's only 127837. Sage Weil
03:17 AM Bug #2524: librados crashed while connecting to cluster
Ah, formatting... sorry... Xiaopong Tran
03:15 AM Bug #2524: librados crashed while connecting to cluster
Alright, more information. I was thinking, maybe it was the max number of open files, or the stack size is too low, s... Xiaopong Tran
11:04 PM Feature #2496 (In Progress): reinstall pudgy
Sage Weil
09:03 PM Feature #2337 (Resolved): rgw and rados performance numbers
Sage Weil
10:14 AM Feature #2337: rgw and rados performance numbers
Actually, the specific sprint test is here:
https://docs.google.com/a/inktank.com/spreadsheet/ccc?key=0AnmmfpoQ1_9...
Mark Nelson
09:53 AM Feature #2337: rgw and rados performance numbers
Results are being posted here:
https://docs.google.com/a/inktank.com/folder/d/0B3mmfpoQ1_94amRLQW5YT3l3OG8/edit
Mark Nelson
04:42 PM rbd Bug #2534 (Resolved): librbd: make sure watch is established on same header version as initial re...
Right now there's a race where it doesn't. Greg Farnum
11:16 AM Bug #2533 (Duplicate): osd: watchers tracked by entity_name_t, not by cookie
In the object info, watchers are tracked in a map<entity_name_t, watch_info_t>, but if there are multiple watchers fr... Josh Durgin
10:43 AM Feature #1711 (Resolved): chef: multiple monitor support
Works as of ceph-cookbook.git commit b5cc21bf5b9c3f59474a7dfe38e04ee01b584fa3 and ceph.git commit 7332e9c717fb627d51e... Anonymous
10:12 AM rbd Feature #2531: rbd: fencing broken clients
I talked to Sam about the combination of blacklisting, bad client writes, and changing primaries that we discussed an... Greg Farnum
10:11 AM Linux kernel client Feature #26 (Rejected): statlite
Sage Weil
10:09 AM Linux kernel client Cleanup #2093 (Resolved): ceph-client: messenger: the "to" parameter to read_partial() needs to go
Sage Weil
10:08 AM Linux kernel client Bug #2395 (Resolved): kernel crash after unmap a rdb device while the cluster is down
I'm going to assume this is running the older code and close it. If not, let us know! Sage Weil
10:06 AM rbd Bug #2478 (New): krbd: unmap on 3.4.0: scheduling while atomic...
Sage Weil
10:04 AM Linux kernel client Feature #949 (Rejected): rbd: async writes, flush/barrier
Sage Weil
10:04 AM Linux kernel client Bug #2243 (Resolved): btrfs: warning in orphan_commit_root
Sage Weil
09:51 AM rbd Bug #2532: rbd command allows passing in -K </path/to/secret>, but long version of (--secret) doe...
That's probably best. It is always easier though when all subcommands under the main command, rbd in this case used o... Sam Zaydel
09:00 AM rbd Bug #2532: rbd command allows passing in -K </path/to/secret>, but long version of (--secret) doe...
Oh, I see.
I think the right fix is to make '--secret' a synonym for '--keyfile', and fix up rbd to use the conf...
Sage Weil
08:20 AM rbd Bug #2532: rbd command allows passing in -K </path/to/secret>, but long version of (--secret) doe...
When I try to use --keyfile=<file> with map, it seemingly fails, but using --secret=<file> succeeds. ... Sam Zaydel
08:13 AM rbd Bug #2532: rbd command allows passing in -K </path/to/secret>, but long version of (--secret) doe...
This is part of the rbd cmd helper message. It seems that for the map command one uses --secret.... Sam Zaydel
07:00 AM rbd Bug #2532: rbd command allows passing in -K </path/to/secret>, but long version of (--secret) doe...
the option is --keyfile <file>... where did you see --secret <file> documented? Sage Weil
05:49 AM rbd Bug #2532 (Resolved): rbd command allows passing in -K </path/to/secret>, but long version of (--...
While rolling back a snapshot I succeed when I pass in -K with location of key file, but it looks like I fail when I... Sam Zaydel

06/07/2012

09:38 PM rbd Feature #2531 (Resolved): rbd: fencing broken clients
Sage Weil
06:45 PM Bug #2524: librados crashed while connecting to cluster
objdump on the NIF shared library. Xiaopong Tran
06:29 PM Bug #2524: librados crashed while connecting to cluster
This is weird, if the problem is caused by resource exhaustion. I run this app on a machine with i7 CPU (with 8 cores... Xiaopong Tran
09:24 AM Bug #2524: librados crashed while connecting to cluster
This assert means that either a malloc or a call to pthread_create failed. It's probably resource exhaustion of some ... Greg Farnum
04:23 AM Bug #2524 (Won't Fix): librados crashed while connecting to cluster
Librados crashed while connecting to the cluster.
Here is some log information. Unfortunately, I don't have more i...
Xiaopong Tran
04:25 PM rbd Documentation #2530 (Closed): Doc: rbd manpage doesn't mention watch; usage does, and it works
Dan Mick
04:20 PM Tasks #2529 (Resolved): debian: Merge packaging changes from Ubuntu 12.04
The package in ubuntu is split to ceph-fs-common (mount helpers), ceph-mds (not in main), etc. Merge what makes sense. Anonymous
03:10 PM rbd Bug #2528 (Resolved): Mounted RBD image appears to go read-only after a snapshot is created
I have been able to repeat this a number of times. Essentially, I create a small rbd device, using the map command in... Sam Zaydel
01:54 PM Bug #2526 (Resolved): ceph-mon $mon_data_dir/keyring is world readable
gah... commit:7332e9c717fb627d51efcaa3f31473a2c129e876 Sage Weil
01:25 PM Bug #2526 (Resolved): ceph-mon $mon_data_dir/keyring is world readable
Keys to the kingdom, for anyone to grab. ceph-mon --mkfs creates this file, it should enforce the access mode.
ubu...
Anonymous
01:52 PM rgw Bug #2527 (Resolved): RGW may return 409 Conflict when deleting a bucket
If a bucket delete call occurs immediately after running a delete operation on the final remaining object in that buc... Jeremy Hanmer
12:53 PM Bug #2525 (Resolved): librados: some functions are not thread-safe
Some functions are accessing the osdmap without any locks. There are probably other cases like this. Find and fix all... Josh Durgin

06/06/2012

09:07 PM Feature #1422 (Resolved): libvirt: rbd storage pool
Sage Weil
09:06 PM Feature #2486 (Resolved): crush: evaluate local retry behavior
Sage Weil
09:06 PM Feature #2493 (Resolved): teuthology-lock --status
Sage Weil
09:05 PM devops Feature #2498 (Fix Under Review): standardize keyring locations for daemons
Sage Weil
03:57 PM Messengers Cleanup #2150 (Resolved): repair the Simple/Messenger interface
Sage Weil
02:06 PM Feature #2497 (Resolved): mon: new cluster logging strategy
commit:47b202ecfdc00996b085a0c0d557564fbaa8bdfe Sage Weil
12:28 PM Feature #2497 (Fix Under Review): mon: new cluster logging strategy
Sage Weil
12:28 PM Feature #2497: mon: new cluster logging strategy
see wip-2497 Sage Weil
01:27 PM Linux kernel client Bug #2523 (Resolved): xfs: xfs_iolock_reclaimable
... Sage Weil
01:22 PM rbd Bug #2522: xfstest #219
ubuntu@teuthology:/a/nightly_coverage_2012-06-05-b Sage Weil
01:21 PM rbd Bug #2522 (Closed): xfstest #219
... Sage Weil
11:30 AM Bug #2518 (Resolved): mon: limit size of paxos log event
Sage Weil
11:29 AM RADOS Feature #2521: crush: control bucket vs device mark-down probabilities independently
Sage Weil
11:27 AM RADOS Feature #2521 (Resolved): crush: control bucket vs device mark-down probabilities independently
--mark-down-ratio -- probability that a device (in eligible bucket) will be marked down
--mark-down-bucket...
Sage Weil
11:27 AM RADOS Feature #2421 (Resolved): crush: quantitatively validate mapping quality
Sage Weil
09:16 AM Bug #2520 (Duplicate): iozone random read/write with 4k block size hangs
http://thread.gmane.org/gmane.comp.file-systems.ceph.devel/6777/focus=6856
User reports iozone random read/write (...
Anonymous
04:20 AM Bug #2508: osdc/ObjectCacher.cc:761: void ObjectCacher::bh_write_commit(int64_t, sobject_t, loff_...
Hi Josh,
I've increased osd_min_pg_log_entries to 5000. Let's see if it fixes the problem.
Simon
Simon Frerichs

06/05/2012

01:36 PM Feature #2519 (Resolved): rados: allow setting pg_num and pgp_num when creating a pool
Right now rados mkpool creates a pool with 8 pgs, which is almost always too few. 'ceph osd pool create' accepts pg_n... Josh Durgin
01:04 PM Bug #2518: mon: limit size of paxos log event
Sage Weil
01:03 PM Bug #2518 (Resolved): mon: limit size of paxos log event
dho was having trouble with a 400MB paxos event/record. make LogMonitor limit an individual paxos event to something... Sage Weil
11:42 AM rgw Feature #2517 (Resolved): rgw: limit number of buckets per user (configurable per user)
Yehuda Sadeh
11:37 AM rgw Feature #2516 (Resolved): rgw: new bandwidth-only per-user log
- orthogonal to operations logs
- only aggregate user bandwidth usage (read, write) per date
- rgw sends a perio...
Yehuda Sadeh
11:02 AM Bug #2508: osdc/ObjectCacher.cc:761: void ObjectCacher::bh_write_commit(int64_t, sobject_t, loff_...
Hi Simon,
If this is at all reproducible, could you try setting osd_min_pg_log_entries higher on all your osds, sa...
Josh Durgin
07:47 AM Bug #2508 (Resolved): osdc/ObjectCacher.cc:761: void ObjectCacher::bh_write_commit(int64_t, sobje...
Hi,
we have random KVM VPS crashes with the following error:...
Simon Frerichs
10:32 AM Feature #2510: update on-disk hobject_t encoding to include pool and namespace fields
Samuel Just
10:15 AM Feature #2510 (Resolved): update on-disk hobject_t encoding to include pool and namespace fields
This will allow hobject_t's to be globally unique in the filestore. That is, there will be a 1-to-1 inode to hobject... Samuel Just
10:31 AM Subtask #2515: allow collection upgrade to use more than one transaction
Samuel Just
10:31 AM Subtask #2515 (Resolved): allow collection upgrade to use more than one transaction
Samuel Just
10:31 AM Subtask #2514: Implement DBObjectMap upgrade from old version
Samuel Just
10:30 AM Subtask #2514 (Resolved): Implement DBObjectMap upgrade from old version
Samuel Just
10:31 AM Subtask #2513: Update DBObjectMap implementation to ignore collection
Samuel Just
10:30 AM Subtask #2513 (Resolved): Update DBObjectMap implementation to ignore collection
This allows us to remove the (coll_t,hobject_t)->seq mapping and directly store the leaf nodes keyed by hobject_t. Samuel Just
10:31 AM Subtask #2512: implement upgrade process for collections
Samuel Just
10:29 AM Subtask #2512 (Resolved): implement upgrade process for collections
also upgrade object_info and pg log encodings Samuel Just
10:31 AM Subtask #2511: Change hobject_t encoding
Samuel Just
10:16 AM Subtask #2511 (Resolved): Change hobject_t encoding
Samuel Just
10:17 AM CephFS Bug #733: cmds crash: mds/LogEvent.cc:88: FAILED assert(p.end())
ok here is a logfile with the following config:
[mds]
debug = 20
debug ms = 1
debug md...
Eric Dold
10:08 AM Subtask #2402 (In Progress): audit calls into osd from pg for locking correctness
Samuel Just
10:07 AM Subtask #2509 (Resolved): create OSDService to limit pg/osd interface
Samuel Just
10:06 AM Subtask #2430: simplify pg removal
Samuel Just
10:06 AM Subtask #2403: remove osd pointer from PG
Samuel Just
10:06 AM Subtask #2333: create queueing for peering messages
Samuel Just
10:06 AM Subtask #825: osd: remove pg map updating from handle_osd_map
Samuel Just
10:06 AM Subtask #2332: move pg queueing into pgs
Samuel Just
10:06 AM Subtask #2282: Handle map updates on a per-pg basis
Samuel Just
09:56 AM rbd Feature #1480: librbd: image locking
lock(entity)
unlock(entity)
new code should lock before open, unlock on close.
the rbd map tool have 'lock lis...
Sage Weil
12:28 AM CephFS Bug #1047: mds: crash on anchor table query
No, I am not sure about that. Only saw the same assert message and a similar trace, so I assumed it to be the same bug. Amon Ott

06/04/2012

04:09 PM rgw Bug #2503 (Resolved): rgw: ungraceful failure when cannot create unix domain socket
Fixed, commit:5087997a1c90ecd1244dc1047a17858607c940f9. Yehuda Sadeh
03:09 PM rgw Bug #2503: rgw: ungraceful failure when cannot create unix domain socket
No, another problem. This refers to the 'rgw socket path' that is being used for fastcgi. Yehuda Sadeh
06:26 AM rgw Bug #2503: rgw: ungraceful failure when cannot create unix domain socket
There was a stupid error in master for a few days that was making noise about the admin socket.. is that what this wa... Sage Weil
03:56 PM Bug #2507 (Resolved): auth: "ceph auth get-or-create-key" argument validation is lacking
This should probably have errored out:
ubuntu@inst01:~$ sudo ceph auth get-or-create-key client.foo borkbork
AQBW...
Anonymous
01:08 PM CephFS Bug #1047: mds: crash on anchor table query
Amon, are you sure you're hitting exactly this bug with your users? This particular one requires hard links to be in ... Greg Farnum
01:04 PM CephFS Bug #733: cmds crash: mds/LogEvent.cc:88: FAILED assert(p.end())
Aww, the actual debug line that's interesting here is generic_dout().
Can you do it again, this time adding "debug =...
Greg Farnum
10:05 AM Messengers Cleanup #2150: repair the Simple/Messenger interface
I scheduled another test run but I don't anticipate any problems — this should be reviewed for merge! Greg Farnum
09:23 AM CephFS Bug #2494: mds: Cannot remove directory despite it being empty.
Note that this was triggered frequently by backuppc runs:
http://thread.gmane.org/gmane.comp.file-systems.ceph.devel...
Anonymous
09:23 AM Linux kernel client Bug #2506: ceph: ceph_add_cap: couldn't find snap realm NNN
Note that this was triggered frequently by backuppc runs:
http://thread.gmane.org/gmane.comp.file-systems.ceph.devel...
Anonymous
06:33 AM Bug #2487 (Resolved): rgw: (re)creating a suspended bucket succeeds
Sage Weil
06:29 AM Bug #2491 (Resolved): watch/notify: racing notify and unwatch
Sage Weil
01:35 AM Bug #2346: xfs filesystem on top of rbd volume corrupts
I am not 100% sure but it looks like kernel 3.2.17-1 fixed the problem. Let's wait 4 weeks to make sure of it. Maciej Galkiewicz
 
