Activity
From 06/04/2012 to 07/03/2012
07/03/2012
- 10:36 PM Bug #2707 (Can't reproduce): mkcephfs failing on v0.48 "argonaut"
- Firstly, well done guys on achieving this version milestone. I successfully upgraded to the 0.48 format uneventfully ...
- 04:45 PM RADOS Feature #2706 (Resolved): crush: update kernel code to decode tunables
- 04:44 PM RADOS Feature #2705 (Resolved): crush: graceful transition to new default tunables
- 04:44 PM RADOS Bug #2214 (Resolved): crush: pgs only mapped to 2 devices with replication level 3
- 04:44 PM RADOS Bug #2047 (Resolved): crush: with a rack->host->device hierarchy, several down devices are likely...
- 04:43 PM RADOS Bug #187 (Rejected): crush: high variance, latency for straw buckets
- 04:43 PM RADOS Feature #2422 (Resolved): crush: test that mapping result is uncorrelated
- 04:39 PM rgw Bug #2106: failed s3tests.functional.test_s3.test_100_continue
- recent logs from the nightly run: /a/teuthology-2012-07-03_00:00:09-regression-next-testing-basic/5054
- 04:34 PM CephFS Bug #1947: mds: SIGBUS during _mark_dirty
- Tamilarasi muthamizhan wrote:
> latest logs:
> /a/teuthology-2012-07-03_00:00:09-regression-next-testing-basic/5019...
- 04:33 PM CephFS Bug #1947: mds: SIGBUS during _mark_dirty
- latest logs:
/a/teuthology-2012-07-03_00:00:09-regression-next-testing-basic/5019
config.yaml:
++++++++++++
k...
- 03:53 PM devops Feature #2704 (Closed): sepia: Use ``names`` as resolver on plana, burnupi, vercoi
- 03:45 PM Feature #2702 (Resolved): gitbuilder: sync each build as it completes
- 03:27 PM devops Feature #2549: ceph-disk-prepare: take fstype, mkfs and mount options from ceph.conf
- As of commit ad97415ef72b55934adfa5024fd9af8fd1f0f82d, this now needs mount options too.
- 03:26 PM devops Feature #2547 (Resolved): ceph-disk-prepare: handle partitioning and mkfs
- commit ad97415ef72b55934adfa5024fd9af8fd1f0f82d
Author: Tommi Virtanen <tv@inktank.com>
Date: 2012-07-03 15:24:26...
- 02:24 PM rbd Bug #2457 (Resolved): libvirt: migration fails with rbd in 0.9.11 and 0.9.12
- Fixed by upstream libvirt commit 78290b1641e95304c862062ee0aca95395c5926c.
- 02:08 PM rbd Bug #2457: libvirt: migration fails with rbd in 0.9.11 and 0.9.12
- Fixed in 0.9.12-3 (Debian naming) and later. Recent reports on the list say the same, so the issue may be closed safely.
- 02:17 PM rgw Bug #2701 (Resolved): rgw: don't keep bucket info indexed by bucket_id
- 02:15 PM rbd Bug #2700 (Resolved): blkdeviotune method at libvirt doesn`t work on RBD volumes
- Since qemu implemented its own i/o limiting mechanism rather than cgroups, all block backends may be controlled over ...
- 12:17 PM Messengers Bug #2569: msgr: connect_rank crash
- i've merged fix for this into master, commit:204bc594be1a6046d1b362693d086b49294c2a27 (with possible side-effects fro...
- 12:16 PM Bug #2682 (Resolved): config lockdep error (recursive lock?) in LibRadosAio.SimpleWritePP
- 10:48 AM devops Feature #2699 (Rejected): crowbar: change barclamp-glance to use rbd
- 10:38 AM devops Feature #2698: crowbar: Guide for using "front" network
- We need an easy way to drop a "dhclient eth1" upstart job into a crowbar server installation. Just a sudo tee /etc/in...
- 10:28 AM devops Feature #2698 (Closed): crowbar: Guide for using "front" network
- 10:26 AM devops Feature #2697 (In Progress): crowbar: ISO generation, reproducible in a cloud image vm
- 10:16 AM devops Feature #2697 (Resolved): crowbar: ISO generation, reproducible in a cloud image vm
- 10:12 AM devops Feature #2696 (Rejected): chef: Automated QA
- Use downburst vms on vercoi to automatically bring up ceph clusters, do basic RADOS/RBD functionality testing, tear d...
- 10:11 AM devops Feature #2695 (Closed): crowbar: Automated QA
- Use downburst vms on vercoi to automatically bring up ceph clusters, do basic RADOS/RBD functionality testing and Ope...
- 10:10 AM rgw Bug #2642 (Resolved): rgw: show/trim usage using also time (not just date)
- Done, commit:80a939a99db64f7802a4a3c1320316c91720f5d9
- 10:08 AM rgw Bug #2658 (Resolved): rgw-admin: usage show fails when specifying hour > 12
- Fixed, commit:c5d19b6df0bcb238e5e68732b4d252b06f2d9e56.
- 10:05 AM devops Feature #2584 (Resolved): sepia: provide networking, DHCP for dynamic virtual machines
- 10:05 AM devops Feature #2584: sepia: provide networking, DHCP for dynamic virtual machines
- Split the DNS part to #2694, this is already providing value to users.
- 09:59 AM devops Feature #2584: sepia: provide networking, DHCP for dynamic virtual machines
- Status update: missing DNS updates, all the strictly required components are there; vms attached to the front network...
- 10:04 AM devops Feature #2553: crowbar: open question: What's the correct way to add RBD support to the Nova barc...
- (Wrong ticket, ignore)
- 10:04 AM devops Feature #2694 (Closed): sepia: provide DNS for dynamic vms
- 09:24 AM devops Feature #2546 (Resolved): ceph-disk-prepare: take fsid from ceph.conf (support --cluster=name)
- commit 4e774fbcb38fd6883232b72352512a5f8e4a66e8
Author: Tommi Virtanen <tv@inktank.com>
Date: 2012-07-03 09:22:28...
- 08:04 AM Bug #2693 (Resolved): osd/ReplicatedPG.cc: 4293: FAILED assert(info.last_update <= active_rep_scr...
- ...
07/02/2012
- 09:25 PM Feature #2692 (Resolved): stable testing debian repos
- 06:49 PM rbd Bug #2689 (In Progress): qemu iozone test hangs
- 02:51 PM rbd Bug #2689 (Resolved): qemu iozone test hangs
- ...
- 05:07 PM Bug #2691: osd/ReplicatedPG.cc: 5888: FAILED assert(latest->is_update())
- took down osd.2 and osd.3 with same crash. coredumps are on the hosts..
- 05:06 PM Bug #2691 (Won't Fix): osd/ReplicatedPG.cc: 5888: FAILED assert(latest->is_update())
- ...
- 04:40 PM Bug #2690 (Won't Fix): mon: persist quorum features
- currently the non-leaders do not know the quorum features, and encode everything with a minimal (0) feature set.
...
- 02:26 PM Linux kernel client Bug #2688 (Duplicate): lockup on ffsb + thrashing
- ...
- 12:54 PM Bug #2687: FileStore crashes when "osd_journal_size" is larger than the filesystem
- for files, i think the right approach is to fallocate(), which will reserve the space. we shouldn't have to look at ...
- 12:47 PM Bug #2687 (Resolved): FileStore crashes when "osd_journal_size" is larger than the filesystem
- See: http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/7282
If a user (on tmpfs, in this case) specifies...
- 12:49 PM Bug #2476: osd: watch timeout depends on operations to an object
- fix qa/workunits/rbd/copy.sh when this is fixed !!!
- 12:36 PM rbd Feature #2556: rbd tool: break image locks
- The current progress is in wip-rbd-locking. Still needs tests and docs, plus a small cleanup as noted on github.
- 12:32 PM rbd Feature #2686 (Resolved): rbd: let users specify a usage for shared locks
- If existing lockers have the same usage, the lock succeeds. Otherwise, it fails. This could let you use locks with e....
- 11:28 AM rbd Feature #2685 (Rejected): Support QEMU migration with caching enabled
- This is a libvirt problem, it's not related to qemu at all. I already looked into and tested whether qemu was doing f...
- 11:21 AM rbd Feature #2685 (Rejected): Support QEMU migration with caching enabled
- See http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/7524
Apparently newer versions of QEMU refuse to...
- 09:44 AM Documentation #2684 (Won't Fix): doc: ceph and all daemons take --show-config
- Quoting Sage:
For future reference, you can get a dump of all these values with
ceph-osd -i 123 --show-...
- 09:30 AM Bug #2593: logmonitor: decode failure
- Do we know if the log in question actually existed on disk or not?
- 07:28 AM Bug #2593: logmonitor: decode failure
- saw this again on next:...
- 07:37 AM Bug #2683: ceph-fuse: crash during fsstress
- ...
- 07:31 AM Bug #2022 (Need More Info): osd: misdirectect request
- apparently there is a different cause for this:...
- 05:57 AM Subtask #2621 (In Progress): mon: Single-Paxos: synchronize the MonitorDBStore of oblivious monitor
07/01/2012
- 09:46 PM Feature #2651: mon: race calling tick() when doing slurping
- making this a cleanup so that it stops confusing me :)
- 08:57 PM Bug #2683 (Can't reproduce): ceph-fuse: crash during fsstress
- ...
- 07:48 PM Bug #2682 (Resolved): config lockdep error (recursive lock?) in LibRadosAio.SimpleWritePP
- ...
- 03:06 PM CephFS Bug #2681: client: got push without mds session
- this was with 'ms inject socket failure = 200'
- 03:06 PM CephFS Bug #2681 (Resolved): client: got push without mds session
- ...
- 02:41 PM Bug #2599 (Can't reproduce): osd: crash in ReplicatedPG::C_OSD_OndiskWriteUnlock::finish
- chalking this up to the bugs in next a couple weeks back
- 09:22 AM Feature #2680 (Resolved): osd: report backfill progress via query
- ...
- 07:09 AM CephFS Bug #2679 (Can't reproduce): POSIX file lock not released on process termination
- I obtained a POSIX file lock with the following code:
> --- snip ---
>
> ...
> std::string x = "/tmp/ceph_mount...
06/30/2012
- 10:52 PM rbd Documentation #2670: Docs shouldn't direct users to echo to /sys/bus/rbd for normal use
- 10:51 PM rbd Feature #2279 (Resolved): rbd: trivial layering design doc
- 11:34 AM Bug #2675: osd: segfault during log trim
- and ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2012-06-28_19:00:12-regression-master-testing-gcov/3450
06/29/2012
- 09:44 PM Bug #2675: osd: segfault during log trim
- and ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2012-06-28_19:00:12-regression-master-testing-gcov/3441...
- 03:39 PM Bug #2675: osd: segfault during log trim
- and ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2012-06-28_19:00:12-regression-master-testing-gcov/3435
- 03:37 PM Bug #2675: osd: segfault during log trim
- and ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2012-06-28_19:00:12-regression-master-testing-gcov/3437
- 03:33 PM Bug #2675: osd: segfault during log trim
- also:...
- 03:30 PM Bug #2675 (Resolved): osd: segfault during log trim
- ...
- 09:02 PM Feature #2471 (Resolved): osd: add prefix match to OSDCaps
- 09:00 PM Feature #2678 (Rejected): osd, objecter: redirect misdirected requests
- Generic mechanism to refer the client to the correct osd when they misdirect their requests. This will allow the clu...
- 08:59 PM Bug #2676 (Resolved): mon: cannot create pool with old renamed name
- commit:5a9355091296121823156de7d3160de45328a0cc
- 04:46 PM Bug #2676 (Resolved): mon: cannot create pool with old renamed name
- renaming a pool name, and then trying to create a new pool with the old name fails.
- 07:27 PM rbd Bug #2677 (Resolved): librbd: create does not clean up well
- A create that fails part way through does not remove objects it created or undo modifications it does, for example ad...
- 07:23 PM rbd Feature #2279 (Fix Under Review): rbd: trivial layering design doc
- See wip-rbd-layering-doc
- 03:26 PM Messengers Bug #2569: msgr: connect_rank crash
- fix for this is in wip-msgr, still testing
- 02:16 PM RADOS Feature #2541 (Resolved): crush: move command to adjust non-leaf node position
- 12:54 PM Feature #2575 (Resolved): perf: 0.48 numbers
- 12:53 PM Feature #2582 (Resolved): set up chart.io + mysql (or equivalent) infrastructure for tracking perf
- 12:51 PM Feature #2577 (Resolved): teuthology: blktrace task
- 12:29 PM Subtask #2674: mon: Single-Paxos: mon commits suicide after remove&add
- Tried this on master. Although at first I triggered something else, the bottom line is that this works, and the monit...
- 12:14 PM Subtask #2674 (Rejected): mon: Single-Paxos: mon commits suicide after remove&add
- Yep. Makes sense. I was afraid this was caused by my changes.
Rejecting it then.
- 11:30 AM Subtask #2674: mon: Single-Paxos: mon commits suicide after remove&add
- Yeah.. basically we're changing the mon's ip by removing and re-adding it, and the mon isn't smart enough to realize ...
- 11:12 AM Subtask #2674: mon: Single-Paxos: mon commits suicide after remove&add
- I believe this is intended behavior, note the last line:...
- 03:07 AM Subtask #2674 (Rejected): mon: Single-Paxos: mon commits suicide after remove&add
- Pre-conditions:
3 mons: a=127.0.0.1:6789 ; b=127.0.0.1:6790 ; c=127.0.0.1:6791
* remove 'c' with ./ceph mon rem...
- 11:09 AM Bug #2646: mon:update_from_paxos: error parsing incremental update: buffer::end_of_buffer
- commit:840ae244499496d543d634713bdee7c7884ce527
The tick happened at the same time as slurping, which meant the di...
- 10:54 AM Bug #2646 (Resolved): mon:update_from_paxos: error parsing incremental update: buffer::end_of_buffer
- 10:53 AM Bug #2264 (Can't reproduce): mon: failed assert in bump_epoch
- 06:19 AM Bug #2618: error: unable to open OSD superblock
- Thanks, but that didn't help.
I did notice that drives get mounted a little weird.
Don't know if that's a problem...
06/28/2012
- 10:06 PM Bug #2664 (Resolved): osd: extra attr _path, extra attr snapset from scrub
- 11:29 AM Bug #2673 (Resolved): ReplicatedPG::prepare_transaction: don't crash on empty ops
- 11:26 AM Cleanup #2672 (Rejected): PG::find_best_info cleanup
- see 253033cd720db86e7c8372fd4184de7d4c43bce2
- 11:26 AM Cleanup #2671 (Resolved): buffer.h: do efficient buffer comparisons
- 10:15 AM rbd Documentation #2670 (Resolved): Docs shouldn't direct users to echo to /sys/bus/rbd for normal use
- A naive user looking for "rbd map" will instead find this:
http://ceph.com/docs/master/rbd/rados-rbd-cmds/
with...
- 10:04 AM Linux kernel client Bug #2260: libceph: null pointer dereference at try_write+0x638+0xfb0
- Lots of work on the messenger client, but still not completely
clear this particular bug is fixed. There are a few ...
- 09:42 AM Linux kernel client Bug #147: lockdep: possible irq lock inversion dependency w/ osdc->request_mutex and con->mutex
- I suppose this really ought to get fixed at some point.
For now, it looks like Sage has implemented a workaround
th...
- 09:41 AM rbd Bug #1070: krbd: ^C doesn't work
- No progress on this. None expected unless it gets
reprioritized and planned.
- 09:40 AM Linux kernel client Feature #1699: debug symbols in autobuilt (sepia) kernels
- No progress on this. I have a vague memory that someone
else might have looked at this problem a while back (Dan?)....
- 09:39 AM Feature #2127: Save kernel core dumps on all of our test machines
- My work on this was pretty much complete a few months ago.
It included a shell script that leverages Ubuntu kdump
...
- 09:32 AM Linux kernel client Bug #2261 (Can't reproduce): paging error in libceph after crashed osd comes back online
- the osd_client refcounting bug fix may explain this one, too... commit:0d47766f14211a73eaf54cab234db134ece79f49
an...
- 09:16 AM Linux kernel client Bug #2261: paging error in libceph after crashed osd comes back online
- No progress on this.
There has been a lot of work on the messenger code since this bug was
reported. One change ...
- 09:31 AM Linux kernel client Cleanup #2130: ceph: xattr: complete cleanups following review
- No progress on this, but I still have the patches. I'll
try to sneak them in as I'm working on RBD. I believe
the...
- 09:29 AM Linux kernel client Cleanup #2131: ceph: xattr: use the generic kernel xattr code
- No progress on this. It should be put on our roadmap as a task
to complete, maybe within the next 6 months.
- 09:12 AM Bug #2267 (Closed): Ceph client crashed after shutting down one mds and osd
- A recent fix supplied by Zheng Yan of Intel seems to have fixed
this problem, so I'm closing this bug.
rbd: C...
- 09:05 AM rbd Feature #2326 (In Progress): krbd: use new class interfaces, new image format
- I've finally begun work on this, following some in-person discussion
with Josh, Dan, and Sage this week.
I will u...
- 09:00 AM Linux kernel client Feature #2374: ceph-client: start laying the groundwork for Linux tracepoints
- No progress on this yet.
However, I got this e-mail from Jim Schutt shortly after creating
this bug, and just wan...
- 08:44 AM Bug #2386: xfstests: failed #34
- I've been trying to find out whether this is still a problem or
if it was transient. But teuthology has had a strin...
- 07:41 AM Linux kernel client Bug #2424 (Resolved): ceph-client: messenger: badness in prepare_write_connect()
- This bug was fixed in May, by a small series of changes that
culminated in this one:
commit 3da54776e2c0385c3...
- 07:37 AM Linux kernel client Cleanup #2432: ceph-client: messenger: refactor to simplify state model
- I had worked out on paper some notes about a longer-term state/event
model that could be used for the client messeng...
- 07:33 AM Linux kernel client Cleanup #2432: ceph-client: messenger: refactor to simplify state model
- I worked on doing this for a good month but the job really isn't
complete. Nevertheless I think there was some prog...
- 07:23 AM Linux kernel client Cleanup #2438: ceph-client: use BUG_ON() for null auth_client->ops pointers
- Touching all my bugs today. This one's a good idea but
very low priority.
- 07:20 AM rbd Bug #2608: rbd: hung xfstest 270
- Just to summarize what I just added...
There are some recent XFS problems that might explain this,
irrespective o...
- 07:16 AM rbd Bug #2608: rbd: hung xfstest 270
- I looked at this on Tuesday, and sent a note to Sage that should
have instead been put here. Here it is.
I w...
- 04:54 AM Feature #2668 (Resolved): Build linux-tools-common package for perf
- It'd be really nice if we built linux-tools-common with our gitbuilder kernels so we can install perf on our test box...
06/27/2012
- 06:10 PM Bug #2618: error: unable to open OSD superblock
- I noticed an issue in your ceph.conf - you have keyring = /etc/ceph/keyring.admin in the global section, and the osd ...
- 05:19 PM rbd Bug #2667 (Won't Fix): librbd: create_snap on a closed image segfaults
- I wrote silly code, and in reordering it, managed to attempt rbd_snap_create() on an
image that I had rbd_close()d. ...
- 05:13 PM Feature #2651: mon: race calling tick() when doing slurping
- oops, stronger fix, yes!
- 05:13 PM Feature #2651 (Resolved): mon: race calling tick() when doing slurping
- 05:01 PM Feature #2661 (Resolved): mon: do not allow monitors to be added to the map with port 0
- Merged into dho and next. Thanks Joao!
- 11:25 AM Feature #2661 (Resolved): mon: do not allow monitors to be added to the map with port 0
- Last week, somebody used the "ceph mon add" command without specifying a port, and it defaulted to port 0. This cause...
- 04:48 PM Feature #2666 (Resolved): rados tool: copy pool
- A new operation to copy the entire content of a pool into a different pool. For each object we'd copy the locator, da...
- 04:04 PM rgw Bug #2665 (Resolved): rest-bench hangs periodically
- rest-bench seems to hang periodically, with the following spit out to the console on a regular basis:
plana83: 2012-06...
- 04:04 PM Bug #2656 (Rejected): rados-bench hangs periodically
- 04:03 PM Bug #2656: rados-bench hangs periodically
- gah,
this is what I get for submitting bugs at the end of the day. You are correct, rest-bench.
- 03:29 PM devops Feature #2587 (Resolved): sepia: isolated networking on vercoi (manual, a handful)
- 03:28 PM devops Feature #2587: sepia: isolated networking on vercoi (manual, a handful)
- Confirmed: isolated0..isolated9 work even if Crowbar wants to put VLANs in them. They pass between vercoi as packets ...
- 02:17 PM devops Feature #2662: crowbar: Make barclamp-ceph set mon initial members, monitor-secret, fsid
- More on where that snippet should live:
- for standalone chef deployment, we want the admin run something similar,...
- 02:14 PM devops Feature #2662: crowbar: Make barclamp-ceph set mon initial members, monitor-secret, fsid
- This python snippet creates ceph keys in the right format (for now). Where it should live is still an open question.
...
- 01:38 PM devops Feature #2662 (Resolved): crowbar: Make barclamp-ceph set mon initial members, monitor-secret, fsid
- Without this, multi-mon bring-up is racy.
At proposal save time, the barclamp should inspect the roles, and assign...
- 02:12 PM Bug #2664: osd: extra attr _path, extra attr snapset from scrub
- full logs at metropolis:~sage/bug-2664
- 02:11 PM Bug #2664 (Resolved): osd: extra attr _path, extra attr snapset from scrub
- ...
- 01:43 PM devops Feature #2663 (Closed): crowbar: UI for setting generic ceph.conf values
- This needs to be some sort of an extensible list of key: value pairs.
Do we need to support sections too? Probably...
- 01:17 PM devops Feature #2589 (Resolved): crowbar: Update barclamp-ceph for Essex, new ceph-cookbooks
- Tyler reported success as of b2c5d3307eef0ca44fd4b001136e9af043b322bd.
- 01:16 PM devops Feature #2588: downburst: multiple, configurable networks to libvirt
- For historical value: https://github.com/ceph/downburst/commit/de494eeefad0f0c72916d5dab8ba015b441a94f0
- 11:30 AM devops Feature #2588 (Resolved): downburst: multiple, configurable networks to libvirt
- 11:26 AM Linux kernel client Bug #2590: possible irq lock inversion dependency with con->mutex and osdc->request_mutex
- Recent log location: /a/teuthology-2012-06-27_00:00:07-regression-next-testing-basic/3076
2012-06-27T01:25:05.11...
- 10:17 AM rbd Feature #2660 (New): qa: test resizing an rbd image while a vm has it open
- Make sure the resize is visible to the guest. This works with the virtio driver after doing e.g. 'echo 1 | sudo tee /...
- 10:02 AM Subtask #2659 (Can't reproduce): mon: Single-Paxos: ceph tool -w subscriptions not being updated
- how to reproduce:...
06/26/2012
- 05:16 PM rgw Bug #2658 (Resolved): rgw-admin: usage show fails when specifying hour > 12
- Using the wrong modifier for parsing it.
- 05:11 PM Bug #2453: osd/OSD.h: 840: FAILED assert(last_scrub_pg.count(p))
- possibly fixed by commit:0d8970fc813b33e7c6ba2484fbc43cce947d3f4d
- 04:31 PM CephFS Bug #2657 (Resolved): kclient: direct io write larger than 8MiB fails
- Writes larger than 8MiB get EFAULT, e.g.:...
- 02:13 PM Bug #2656: rados-bench hangs periodically
- rados-bench or rest-bench?
- 01:27 PM Bug #2656 (Rejected): rados-bench hangs periodically
- rados-bench seems to hang periodically, with the following spit out to the console on a regular basis:
plana83: 2012-0...
- 01:45 PM Bug #2563 (Can't reproduce): leveldb corruption
- It looks like one of the leveldb store files was corrupted, possibly by the filesystem. It may be possible to recove...
- 09:36 AM Bug #2655 (Resolved): scrub slows writes more than it should
- 09:34 AM Subtask #2616 (Closed): mon: Single-Paxos: AuthMonitor: key_server has no entries
- 09:34 AM Subtask #2616 (Resolved): mon: Single-Paxos: AuthMonitor: key_server has no entries
- 09:33 AM Subtask #2620 (Closed): mon: Single-Paxos: MDSMonitor: MMDSBeacon from entity with insufficient p...
- Note: turns out this was the same bug as #2643
Had to do with the AuthMonitor losing some infos when reading versi...
- 09:32 AM Subtask #2643 (Closed): mon: Single-Paxos: mds: Strange message behavior on peon
- Had to do with the AuthMonitor losing some infos when reading versions from the store.
This is fixed.
- 09:01 AM Linux kernel client Bug #2523: xfs: xfs_iolock_reclaimable
- ...
- 06:15 AM rbd Bug #2654 (Won't Fix): Stale rbd volume cannot be unmaped
- /dev/rbd0 exists in system but /dev/rbd/winnie-test/postgresql not...
06/25/2012
- 10:01 PM rbd Bug #2608: rbd: hung xfstest 270
- ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2012-06-23_00:00:02-regression-next-testing-basic/1471
m...
- 09:56 PM Bug #2536 (Need More Info): librados crashed while getting stat of an object
- 09:56 PM Bug #2536: librados crashed while getting stat of an object
- Have you seen this problem since then? It looks like it could be due to racing with rados startup or shutdown...
- 09:41 PM Bug #2346 (Resolved): xfs filesystem on top of rbd volume corrupts
- No news is good news!
- 09:40 PM Bug #2602 (Resolved): osd: push failed because local copy is X
- 05:09 PM Messengers Bug #2569: msgr: connect_rank crash
- All three mon nodes and a client node on the second aging cluster died over the weekend (kernel and all). Looks like ...
- 10:25 AM Messengers Bug #2569: msgr: connect_rank crash
- Saw the following while debugging my aging test scripts. Seems to have happened when the mon was started. No core d...
- 03:33 PM Bug #2649: osd: log bound mismatch
- ...
- 03:31 PM CephFS Bug #1947: mds: SIGBUS during _mark_dirty
- moved test to marginal suite; move back to regression when this is resolved!
- 03:31 PM CephFS Bug #1947: mds: SIGBUS during _mark_dirty
- ubuntu@teuthology:/a/teuthology-2012-06-24_00:00:07-regression-next-testing-basic$ ...
- 03:28 PM Bug #2593: logmonitor: decode failure
- I wonder if this is also due to tick() colliding with slurping — the first one definitely could be (not sure about th...
- 03:27 PM Bug #2653 (Resolved): Web docs point to obsolete "fusermount" page
- The page http://ceph.com/docs/master/man/8/mount.ceph/ has a link at the bottom that points to "fusermount" descript...
- 03:21 PM Bug #2618: error: unable to open OSD superblock
- attaching my ceph.conf.
Can't get to IRC from work - I'll try in the evenings.
thanks
- 02:54 PM rgw Bug #2652 (Resolved): Segmentation fault in rest-bench
- This happened while running rest-bench during aging tests on the burnupi cluster.
--
plana83: *** Caught signal...
- 02:48 PM Bug #2022 (Resolved): osd: misdirectect request
- 02:40 PM Feature #2651 (Rejected): mon: race calling tick() when doing slurping
- Right now the monitor calls tick() on all the PaxosService implementations when it's doing slurping. This introduces ...
- 09:19 AM rgw Bug #2650 (Resolved): rgw: swift key creation overrides subuser access mask
- # radosgw-admin subuser create --uid=johndoe --subuser=johndoe:swift
--access=full
{ "user_id": "johndoe",
"rados...
06/23/2012
- 04:56 PM Bug #2649 (Resolved): osd: log bound mismatch
- ...
06/22/2012
- 07:14 PM Bug #2648 (Resolved): removing a monitor from the map while it's running causes a crash
- ...
- 05:27 PM Bug #2647 (Can't reproduce): osd: old request, waiting for subops
- primary:...
- 11:43 AM Bug #2618: error: unable to open OSD superblock
- John, can we see your ceph.conf file? If you have time, try chatting in #ceph on irc.oftc.net as well; perhaps we ca...
- 11:30 AM Bug #2646 (Resolved): mon:update_from_paxos: error parsing incremental update: buffer::end_of_buffer
- ...
- 08:17 AM Subtask #2645 (Rejected): mon: Single-Paxos: Could not decrypt ticket info (immediately after run...
- There was a lingering monitor still running, from a previous install.
Apparently, holding the wrong keys will lead...
- 08:09 AM Subtask #2645 (Rejected): mon: Single-Paxos: Could not decrypt ticket info (immediately after run...
- ...
- 12:24 AM Bug #2602: osd: push failed because local copy is X
- Hi Sage,
just updated to your wip_rolling_upgrade branch.
FileStore update worked ( 100GB => 30 minutes on XFS ) ...
06/21/2012
- 06:55 PM rbd Feature #2566 (Duplicate): teuthology: task to run rbd workunits in a vm
- Same as #1713.
- 06:53 PM rbd Feature #1713 (Resolved): teuthology: qemu tasks, tests
- Basic teuthology task is done in 38f6a78c71910a39b7f1890316c0a134ced8b0ec. Making a gitbuilder for qemu seems less im...
- 06:52 PM rbd Feature #2644 (Rejected): qa: gitbuilder for qemu
- This should build qemu with rbd support for regression testing new versions of qemu.
- 06:49 PM rbd Feature #2567 (Resolved): qa: add qemu+rbd jobs to qa suite
- Added in 94a6ab8ff3637f68c03261cf845b402d6bfa8e76
- 04:30 PM Subtask #2643: mon: Single-Paxos: mds: Strange message behavior on peon
- This is what can be seen on the Leader:...
- 04:24 PM Subtask #2643 (Closed): mon: Single-Paxos: mds: Strange message behavior on peon
- Just for future reference.
When checking how things were going with the monitors, we noticed that the following sn...
- 03:37 PM Subtask #2633: mon: Single-Paxos: ceph tool unable to connect to monitor
- Has something changed in the last five hours that you think fixed this?
- 03:28 PM Subtask #2633 (Closed): mon: Single-Paxos: ceph tool unable to connect to monitor
- It appears to be fixed.
The ceph tool is able to obtain the status from the monitors.
The 'watch' command doesn...
- 10:11 AM Subtask #2633 (Closed): mon: Single-Paxos: ceph tool unable to connect to monitor
- This is what usually happens on the monitor side. Every now and then, the ceph tool is able to connect, but we haven'...
- 01:54 PM rgw Bug #2642 (Resolved): rgw: show/trim usage using also time (not just date)
- 01:42 PM Feature #2577 (In Progress): teuthology: blktrace task
- 01:41 PM Feature #2581 (Resolved): perf: investigate 0.47.2 precise vs 0.46 oneiric discrepancy
- 01:40 PM Feature #2576 (Resolved): perf: 0.48 on long-term clusters
- 01:17 PM Linux kernel client Bug #2302 (Can't reproduce): xfs: warning at mutex_remove_waiter
- 12:38 PM Bug #2550 (Resolved): logrotate: SIGHUP upstart jobs too, not just sysvinit
- 12:06 PM rbd Feature #2641 (Duplicate): qa: regression tests for rbd openstack volume driver
- This should include:
* booting a vm from an rbd device
* attaching/detaching an rbd device to a running guest
* ad...
- 11:30 AM rbd Feature #2640 (Duplicate): qa: regression tests for rbd glance backend
- This should run against development versions of openstack to verify that the glance backend continues to work. Namely...
- 11:26 AM Bug #2042 (Duplicate): mon: crash in LogMonitor::update_from_paxos
- Indeed!
- 11:21 AM Bug #2042: mon: crash in LogMonitor::update_from_paxos
- Hrm, I think that this is duplicated by #2593?
- 11:16 AM Bug #2042 (Can't reproduce): mon: crash in LogMonitor::update_from_paxos
- 11:13 AM Cleanup #2623 (Resolved): filestore btrfs trans should be removed
- 11:07 AM Feature #1494 (Resolved): openstack: vm can boot off rbd
- This has been possible for a long time.
- 11:03 AM Bug #2638 (Resolved): mon: make pool ops idempotent
- for example, deleting a pool fails with ENOENT (or ENODATA :/) if the pool doesn't exist, but if we lose our mon sess...
- 11:02 AM rbd Feature #2637 (New): teuthology: task for running a vm using libvirt
- This should have similar semantics to the qemu task that runs qemu directly, but configure and run the vm via libvirt...
- 10:59 AM rbd Feature #2636 (New): qa: regression tests for qemu monitor commands
- Test attach/detach of rbd devices and snapshot operations executed directly by the qemu monitor. This is probably eas...
- 10:40 AM Bug #2602 (Need More Info): osd: push failed because local copy is X
- Hi Simon-
This looks like something that could be caused by the broken rolling osd upgrade support in the branch y...
- 10:14 AM rgw Feature #2635 (New): benchmark for measuring rgw metadata operations
- We need to come up with a benchmark that will measure the following operations:
* Service:
1. List buckets
*...
- 10:11 AM rbd Feature #2634 (Resolved): teuthology: add networking to qemu task
- Let the guest speak to the outside world so test scripts can e.g. check out git repos and download test programs to c...
- 06:32 AM Bug #2618: error: unable to open OSD superblock
- I manually created the directory.
Then I ran the mkcephfs command.
The directory has some files in it (journal, mag...
06/20/2012
- 09:39 PM Feature #2631 (Resolved): mon: kill rm -rf --mkfs behavior
- 09:02 PM Bug #2593: logmonitor: decode failure
- ...
- 06:55 PM rbd Feature #2630 (Resolved): teuthology: add task to run qemu-iotests against rbd
- qemu-iotests are included in upstream qemu.git. They exercise qemu's block layer to test correctness. They use existi...
- 06:44 PM rbd Feature #2629 (New): qa: test performance during live migration
- This could be done after #2628 by running iozone during the migration, parsing its output, and checking that throughp...
- 06:41 PM rbd Feature #2628 (New): qa: test live migration with qemu
- Run something like fsstress in the vm during the migration, and verify that it completes successfully. To do this we'...
- 06:37 PM rbd Feature #2627 (New): qa: regression tests for libvirt rbd storage pool
- Libvirt storage pools allow you to create, delete, and list volumes. Wido wrote a backend that uses librbd to do this...
- 06:30 PM rbd Feature #2626 (New): qa: regression tests for basic rbd libvirt integration (disks)
- Test using rbd disks with vms through libvirt.
This includes:
* booting a vm backed only by rbd
* attaching rb...
- 06:15 PM rbd Feature #2625 (Rejected): qa: gitbuilder for libvirt
- Create a gitbuilder for libvirt packages so we can regression test rbd against upstream releases. Base this on the ub...
- 05:36 PM Bug #2600: osd: crazy long watch timeout?
- In another recurrence, there are no objecter requests:...
- 04:34 PM Bug #2524 (Won't Fix): librados crashed while connecting to cluster
- 04:34 PM Bug #2456 (Resolved): librbd: failed LibRBD.TestIOToSnapshot
- Haven't seen this in a while. Maybe some of the race cleanups fixed it...
- 04:32 PM Documentation #2624: OpenStack creation instructions should recommend non-default number of pg's ...
- It'll have to be ceph osd pool create <pool> <num_pgs> until #2519 is done.
- 04:25 PM Documentation #2624 (Resolved): OpenStack creation instructions should recommend non-default numb...
- http://ceph.com/docs/master/rbd/rbd-openstack/ recommends
sudo rados mkpool nova
This should probably be
su...
- 03:46 PM Cleanup #2623 (Resolved): filestore btrfs trans should be removed
- On Wed, 20 Jun 2012, Stefan Priebe - Profihost AG wrote:
> Hello list,
>
> i've looked at the wiki (http://ceph.co...
- 03:01 PM Subtask #2622 (Resolved): mon: Single-Paxos: convert existing, old MonitorStore to a brand new Mo...
- The new monitor design does not support the old MonitorStore, nor does it store the versions and their values in the ...
- 02:58 PM Subtask #2621 (Resolved): mon: Single-Paxos: synchronize the MonitorDBStore of oblivious monitor
- *Objective:* synchronize monitor stores over the network whenever a given monitor mon.X falls too far behind.
*Sol...
- 02:50 PM Subtask #2615 (Closed): mon: Single-Paxos: MDSMap::get_health() asserting
- 02:49 PM Subtask #2615: mon: Single-Paxos: MDSMap::get_health() asserting
- This issue stopped popping up after we changed the criteria to propose queued proposals and restarted testing with a ...
- 03:59 AM Subtask #2615 (Closed): mon: Single-Paxos: MDSMap::get_health() asserting
- MDSMap infos, dumped on MDSMap::get_health() just before the assert is triggered:...
- 02:47 PM Subtask #2616: mon: Single-Paxos: AuthMonitor: key_server has no entries
- Appears to be fixed.
The ceph tool is able to connect to the cluster and obtain status information.
However, th...
- 11:01 AM Subtask #2616: mon: Single-Paxos: AuthMonitor: key_server has no entries
- Although this appears to be fixed, we still are unable to authenticate clients.
My current suspicion is that we ar...
- 09:00 AM Subtask #2616: mon: Single-Paxos: AuthMonitor: key_server has no entries
- We were encoding an empty "full version" of the key server during AuthMonitor::encode_pending(), alongside the ...
- 08:36 AM Subtask #2616: mon: Single-Paxos: AuthMonitor: key_server has no entries
- The problem appears to affect all mon clients, and it may be the reason why our OSDs do not work as well.
Log snip...
- 08:00 AM Subtask #2616 (Closed): mon: Single-Paxos: AuthMonitor: key_server has no entries
- The Monitor's key_server has no entries, even though we made sure to populate mon.X/keyring with every single service...
- 02:45 PM Subtask #2614: Single Paxos instance shared across the existing services
- 03:48 AM Subtask #2614 (Closed): Single Paxos instance shared across the existing services
- One Paxos to propose them all.
- 02:44 PM Subtask #2620 (Closed): mon: Single-Paxos: MDSMonitor: MMDSBeacon from entity with insufficient p...
- ...
- 02:06 PM Bug #2550: logrotate: SIGHUP upstart jobs too, not just sysvinit
- Please mention https://bugs.launchpad.net/upstart/+bug/1012938 in the "sucks" comment, so someone can some day nicely...
- 01:47 PM Bug #2550: logrotate: SIGHUP upstart jobs too, not just sysvinit
- repushed upstart-vs-logrotate branch
- 12:25 PM Bug #2550: logrotate: SIGHUP upstart jobs too, not just sysvinit
- yeah, that'll work. only solves the logrotate case, but that's fine by me.
- 11:39 AM Bug #2550: logrotate: SIGHUP upstart jobs too, not just sysvinit
- That killall thing is hideous, and I'm utterly unconvinced having even more upstart jobs for Ceph is helpful in any w...
- 12:30 PM Feature #2619 (Resolved): filejournal: instrument with perfcounters
- 12:09 PM Bug #2618: error: unable to open OSD superblock
- Hi John,
Did you create the /data/ceph/osd0 directory? mkcephfs doesn't do it for you because of the potential for...
- 11:31 AM Bug #2618 (Can't reproduce): error: unable to open OSD superblock
- I am new at this.
I installed ceph.
When I do a service ceph start, mon.0 and mds.(machine name) seem ok.
When it ...
- 11:13 AM Bug #2022: osd: misdirectect request
- here is the smoking gun. note that the pgid goes to 0.0 when linger tid 1 is resending the watch op 4:...
- 03:45 AM Subtask #2613: Sandbox PaxosServices accesses to the store
- I messed up the formatting and don't seem to be able to edit it. So here goes a decent version of it....
- 03:41 AM Subtask #2613 (Resolved): Sandbox PaxosServices accesses to the store
- Each service used to have direct access to the MonitorStore, and they could mess around wherever they wanted, allowin...
- 03:25 AM Subtask #2612 (Resolved): Monitor key/value store
- Create a key/value store, with transaction support, to be used on the monitor subsystem.
Its interface should refl...
- 03:21 AM Feature #2611 (Resolved): mon: Single-Paxos
- The ceph-mon is (roughly) composed of a Monitor class, responsible for all things monitor-ish, and several monitor se...
06/19/2012
- 07:04 PM rbd Feature #2556: rbd tool: break image locks
- Argh. I don't seem to be getting my email notifications from you and Josh on Github, and I don't know why.
- 06:57 PM rbd Feature #2556: rbd tool: break image locks
- https://github.com/ceph/ceph/commit/3c05629691deb800e3c6e62e81f444a748e8857c#src-rbd-cc-P108
just making sure i un...
- 06:48 PM rbd Feature #2556: rbd tool: break image locks
- Your commits look good to me (sorry I missed the cli tests; I need to get into the habit of running those), but I don...
- 05:46 PM rbd Feature #2556: rbd tool: break image locks
- rebase, fixed up ENOENT vs ENOEXEC behavior. one clarification about the purpose/scope of 'rbd lock', but otherwise ...
- 03:13 PM rbd Feature #2556 (Fix Under Review): rbd tool: break image locks
- wip-rbd-locking has this now, but it also merges in wip-clsrbd for an unrelated change, so you might want to wait to ...
- 05:06 PM Bug #2610 (Resolved): osd: pg stuck at scrubbing
- Happened on congress, pg was stuck at scrubbing state for two and a half days....
- 04:20 PM rbd Feature #2558 (Resolved): cls_rbd: child/parent methods
- 04:05 PM devops Feature #2584 (In Progress): sepia: provide networking, DHCP for dynamic virtual machines
- 04:04 PM Feature #2576 (In Progress): perf: 0.48 on long-term clusters
- 04:04 PM Feature #2575 (In Progress): perf: 0.48 numbers
- 03:52 PM rbd Feature #2609 (Resolved): librbd: new image name -> image head indirection
- To prevent rename from disrupting clients with images open,
* put header in rbd_head.$id
* put $id in rbd_id.$nam...
- 02:32 PM rgw Feature #2516 (Resolved): rgw: new bandwidth-only per-user log
- 02:28 PM rbd Bug #2608 (Closed): rbd: hung xfstest 270
- Logs are available in ubuntu@teuthology:/a/teuthology-2012-06-19_00:00:09-regression-next-testing-basic/1792
2012-...
- 01:25 PM Bug #2022: osd: misdirectect request
- latest run log: ubuntu@teuthology:/a/teuthology-2012-06-18_19:00:05-regression-master-testing-gcov/1586
- 12:54 PM CephFS Bug #1947: mds: SIGBUS during _mark_dirty
- ubuntu@teuthology:/a/teuthology-2012-06-18_19:00:05-regression-master-testing-gcov/1579
- 11:31 AM Messengers Bug #1985: msgr: creating new Pipe for pre-existing connection leaks Pipe if they don't replace
- I've still got this sitting around in my workspace. Since we seem to have pushed back a messenger re-do, perhaps we s...
- 09:57 AM rbd Feature #2607 (Resolved): librbd: copyup helper
- copyup helper to perform a copyup from parent to child. will be used by both the rbd command-line copyup command, an...
- 09:57 AM rbd Subtask #2606 (Resolved): librbd layering: copyup on missing child object
- 09:57 AM rbd Subtask #2605 (Resolved): librbd layering: guard writes
- 09:56 AM rbd Subtask #2604 (Resolved): librbd layering: read path
- 09:56 AM rbd Subtask #2603 (Resolved): librbd layering: open parent on open
06/18/2012
- 10:07 PM Bug #2550 (Fix Under Review): logrotate: SIGHUP upstart jobs too, not just sysvinit
- Sigh. See branch upstart-vs-logrotate.
- 08:57 PM rbd Feature #2556: rbd tool: break image locks
- Greg Farnum wrote:
> Team RBD needs more to do! Pulling this forward. :)
Go team! :)
- 06:26 PM rbd Feature #2556 (In Progress): rbd tool: break image locks
- Team RBD needs more to do! Pulling this forward. :)
- 05:56 PM rbd Feature #2585 (In Progress): rbd: clone command
- 05:34 PM rbd Feature #2585: rbd: clone command
- 05:35 PM rbd Feature #2559: cls_rbd: copyup method
- 01:50 PM rbd Feature #2601: rbd: Show image size with an "ls"
- We've also heard from others that having a better estimate of rbd usage and expected usage would be good; taking into...
- 06:09 AM rbd Feature #2601 (Resolved): rbd: Show image size with an "ls"
- On the mailinglist the request came if the "rbd" tool could be modified to not only show image names when doing an ls...
- 01:34 PM rgw Bug #2542 (Resolved): rgw: support S3 update of metadata
- 01:32 PM rgw Bug #2542: rgw: support S3 update of metadata
- Resolved, commit:343cc792e847ca8901f6c08e41799a2fbbd2ca92
- 11:04 AM Bug #2602: osd: push failed because local copy is X
- Updated another osd to 'next' and same errors happened.
I've attached the log with debug osd = 20 set.
- 08:46 AM Bug #2602: osd: push failed because local copy is X
- Is this reproducible with 'debug osd = 20'?
- 08:44 AM Bug #2602 (Resolved): osd: push failed because local copy is X
- Hi,
filestore updated completed.
When i start the "updated" OSD the whole cluster starts lagging.
Is the next br...
- 08:45 AM Bug #2598: filestore: error during upgrade
- Simon Frerichs wrote:
> Hi,
>
> filestore updated completed.
> When i start the "updated" OSD the whole cluster ...
- 08:42 AM Bug #2598 (Resolved): filestore: error during upgrade
- Thanks!
- 01:29 AM Bug #2598: filestore: error during upgrade
- Hi,
filestore updated completed.
When i start the "updated" OSD the whole cluster starts lagging.
Is the next br...
- 12:56 AM Bug #2598: filestore: error during upgrade
- Thanks.
The bug seems to be fixed.
- 08:43 AM Bug #2595: filestore: error creating filestore during mkcephfs
- 2012-06-18 17:42:16.232924 7f54292fb780 -1 filestore(/srv/osd.20) could not find 23c2fcde/osd_superblock/0//-1 in ind...
- 08:29 AM Bug #2599: osd: crash in ReplicatedPG::C_OSD_OndiskWriteUnlock::finish
- commit:5efaa8d7799347dfae38333b1fd6e1a87dc76b28
- 07:25 AM CephFS Bug #2596: mds: spinning on restart
- gdb is not helpful here, process seems to be spinning in syscall:
(gdb) thread apply all bt
Thread 1 (process 148...
06/17/2012
- 10:40 PM Bug #2600: osd: crazy long watch timeout?
- Possibly related to #2476
- 09:37 PM Bug #2600 (Resolved): osd: crazy long watch timeout?
- ...
- 09:34 PM CephFS Bug #1737: ceph-fuse crash in xlist::remove
- see ubuntu@teuthology:/a/teuthology-2012-06-17_19:00:03-regression-master-testing-gcov/1303 for a failure with logs!
- 02:33 PM RADOS Feature #2422 (In Progress): crush: test that mapping result is uncorrelated
- 02:32 PM Bug #2598: filestore: error during upgrade
- Ah... should have tested on another filesystem.
- 02:21 PM Bug #2598: filestore: error during upgrade
- Oh, der.. pretty sure commit:82cb3d61ff4f200e0a9040e6381a9eed32db9de1 fixes this.
- 02:29 PM Bug #2022: osd: misdirectect request
- Last two failures were the rados api tests:...
- 06:50 AM CephFS Bug #2385: max mds = 2, mds hang and crash
- ...
- 06:45 AM CephFS Bug #2385: max mds = 2, mds hang and crash
- ...
06/16/2012
- 11:34 AM Bug #2598: filestore: error during upgrade
- That's odd, it's updating the omap directory as a collection. list_collections should not have returned omap as a co...
- 08:04 AM Bug #2598 (Resolved): filestore: error during upgrade
- from ML:...
- 08:25 AM Bug #2462 (Resolved): osd/PG.cc: 402: FAILED assert(log.head >= olog.tail && olog.head >= log.tail)
- I'm going to optimistically call this resolved. If we see this crash again, though, we'll need to reopen, and hopefu...
- 08:24 AM rbd Bug #2535: rbd: random data corruption in vm
- We've disabled fiemap, which appears to be the culprit. Josh is still tracking down which kernel releases are affect...
- 08:21 AM Bug #2599 (Can't reproduce): osd: crash in ReplicatedPG::C_OSD_OndiskWriteUnlock::finish
- from ml:...
- 07:59 AM Bug #2595 (Resolved): filestore: error creating filestore during mkcephfs
- 07:59 AM Bug #2595: filestore: error creating filestore during mkcephfs
- commit:1e899d08e61bbba0af6f3600b6bc9a5fc9e5c2e9
- 06:40 AM Bug #2595: filestore: error creating filestore during mkcephfs
- Yes
06/15/2012
- 05:58 PM rbd Feature #1480 (Resolved): librbd: image locking
- Okay, discussed and merged in commit:dac9f223598c5f67b228403e514f202280d56488
- 05:49 PM rbd Feature #1480: librbd: image locking
- And after thorough review from Josh, this should be ready for merge (commit:5b1b02b60a253092700f364dca77bb6b1065e3e0)...
- 02:40 PM rgw Bug #1643 (Rejected): radosgw-admin log show should accept --time
- 02:03 PM Bug #2595: filestore: error creating filestore during mkcephfs
- Oh, it looks like it's just noise from checking the journal. The mkcephfs succeeded, right?
- 01:57 PM Bug #2595: filestore: error creating filestore during mkcephfs
> Can you reproduce with 'debug filestore = 20' and attach the log to this
> bug?
Log:...
- 10:32 AM Bug #2595: filestore: error creating filestore during mkcephfs
- FYI, I saw this once when I was working on the OSD hotplug code paths. Mine might have been caused by a missing "osd ...
- 09:29 AM Bug #2595 (Resolved): filestore: error creating filestore during mkcephfs
- from ML:...
- 11:48 AM rbd Bug #2597 (Resolved): Import of image from file appears to succeed, but image not present in the ...
- I have been testing with storing an image file, a basic QCOW2 image of latest Ubuntu distro on a pool, which is used ...
- 10:44 AM rbd Feature #2558: cls_rbd: child/parent methods
- wip-clsrbd
- 10:44 AM rbd Feature #2558 (Fix Under Review): cls_rbd: child/parent methods
- 09:44 AM CephFS Bug #2596 (Can't reproduce): mds: spinning on restart
- from ML:...
06/14/2012
- 09:02 PM Linux kernel client Bug #2389 (Duplicate): rbd: hung xfstest 67
- 09:01 PM Linux kernel client Bug #2359 (Can't reproduce): xfstest 62 failing
- haven't seen this in a while
- 05:55 PM Feature #2571 (Resolved): sepia: enable virtualization
- 11:34 AM Feature #2571 (In Progress): sepia: enable virtualization
- BIOS settings changed on all plana; one reboot test shows good results. One can tell if
virtualization is enabled w...
- 04:12 PM Bug #2593 (Resolved): logmonitor: decode failure
- Saw this while trying to reproduce #2569. Sadly teuthology cleaned everything up before I could get to the data.
<pr...
- 03:24 PM Feature #2581 (In Progress): perf: investigate 0.47.2 precise vs 0.46 oneiric discrepancy
- 03:13 PM devops Feature #2415 (Resolved): upstart: support radosgw
- 03:06 PM rbd Bug #2534: librbd: make sure watch is established on same header version as initial read was
- Okay, this is blocked by #2592.
- 03:06 PM Bug #2563: leveldb corruption
- It's triggerable without ceph, I've filed a bug below with leveldb and I'm continuing to look into it.
http://code...
- 03:05 PM Bug #2592: osd and all clients: watch version parameter is ignored
- Alternatively, maybe the OSD should just enforce the version with those checks when setting a watch? It looks to me a...
- 03:01 PM Bug #2592 (Resolved): osd and all clients: watch version parameter is ignored
- Watch operations have a version parameter that is supposed to act like an assert_version op. This could easily be done i...
- 02:38 PM Feature #2471 (In Progress): osd: add prefix match to OSDCaps
- you can have this one too, given your wip-osdcap branch.
- 02:37 PM rbd Feature #1480 (Fix Under Review): librbd: image locking
- wip-rbd-locking
- 02:09 PM rgw Feature #2517 (Resolved): rgw: limit number of buckets per user (configurable per user)
- added teuth tests, in master, backported to dho
- 02:04 PM rgw Bug #2591 (Resolved): misc rgw s3tests failures
- Should be ok for now. I've set boto to 2.4.1, we can change that later once upstream fixes its issues.
- 10:15 AM rgw Bug #2591: misc rgw s3tests failures
- boto 2.5.0 issue. For some reason it doesn't set the error.reason on 400 responses.
- 07:57 AM rgw Bug #2591 (Resolved): misc rgw s3tests failures
- 2012-06-13T12:51:42.657 INFO:teuthology.orchestra.run.err:s3tests.functional.test_headers.test_bucket_create_bad_auth...
- 12:59 PM rbd Bug #2535: rbd: random data corruption in vm
- Sage Weil wrote:
> Just a bit of context: rbd without caching does a 'sparse-read' operation, which uses FIEMAP to d...
- 12:52 PM rbd Bug #2535: rbd: random data corruption in vm
- Just a bit of context: rbd without caching does a 'sparse-read' operation, which uses FIEMAP to determine which parts...
- 12:50 PM rbd Bug #2535: rbd: random data corruption in vm
- Let's try a different tack: I pushed a osd-verify-sparse-read-holes branch to ceph.git (based on 0.47.2) that reads ...
- 09:09 AM rbd Bug #2535: rbd: random data corruption in vm
- Status update:
I tried modifying the iotester so that it would work directly on the block device, in the hopes I c...
- 10:14 AM Feature #2472: osd: add opaque 'class <name> <foo>' cap that class can interpret/enforce
- wip-osdcap is doing this way better than I was, although I'm happy to take it back to do the OSD changes if need be.
- 09:09 AM rbd Bug #2410: hung xfstest #68
- disabled 68 in qa for the time being.
- 09:03 AM rbd Bug #2522: xfstest #219
- Sigh.. took a quick look and it's non-obvious why the repquota output doesn't match. Disabling this for now, but lea...
06/13/2012
- 08:57 PM Linux kernel client Bug #2590 (New): possible irq lock inversion dependency with con->mutex and osdc->request_mutex
- i thought this was #147, but on closer inspection it's something else;...
- 07:17 PM Bug #2550: logrotate: SIGHUP upstart jobs too, not just sysvinit
- Filed upstream: https://bugs.launchpad.net/upstart/+bug/1012938
- 06:17 PM rgw Feature #2516: rgw: new bandwidth-only per-user log
- I think the last thing we need here is to add it to the radosgw-admin test so that we don't break these commands in t...
- 12:26 PM rgw Feature #2516 (In Progress): rgw: new bandwidth-only per-user log
- 05:04 PM rgw Feature #2473: rgw: revisit operation logging
- Not the top priority, but we can have an async flush, similar to the one we have for the usage logging.
- 04:20 PM Linux kernel client Bug #2573: libceph: many "socket closed" messages
- In that case, if you want to run this with the osd messenger debug at 5 and can gather logs next time I'll be happy t...
- 02:35 PM Linux kernel client Bug #2573: libceph: many "socket closed" messages
- The test takes on the order of a minute to complete one pass
of test 049. During that time I typically see 10-20 so...
- 10:46 AM Linux kernel client Bug #2573: libceph: many "socket closed" messages
- The sockets have a default timeout of 15 minutes, after which they will close — the idea being that if the socket is ...
- 10:40 AM Linux kernel client Bug #2573 (Resolved): libceph: many "socket closed" messages
- While trying to reproduce a null pointer problem in the client
messenger code I was running xfstests #049 over RBD d...
- 02:31 PM Feature #988 (Duplicate): librbd: trivial layering
- replaced by other tasks
- 02:31 PM Feature #988 (Rejected): librbd: trivial layering
- 02:31 PM devops Feature #2589 (Resolved): crowbar: Update barclamp-ceph for Essex, new ceph-cookbooks
- 02:30 PM devops Feature #2588 (Resolved): downburst: multiple, configurable networks to libvirt
- Right now, it hardcodes that a vm only has the "default" network. Make that configurable.
- 02:29 PM devops Feature #2587 (Resolved): sepia: isolated networking on vercoi (manual, a handful)
- One-time switch & linux configuration for a handful of VLANs, manually allocated to people who want to run Crowbar.
- 02:28 PM rbd Feature #2586 (Rejected): rbd: check/take locks on --lock
- if you pass --lock to rbd, take an exclusive lock, do whatever, unlock
- 02:20 PM rbd Feature #2585 (Resolved): rbd: clone command
- A command for the rbd tool to create a child image from a parent. Example:
rbd clone --parent pool/image@snap pool...
- 01:56 PM rbd Feature #2467 (Rejected): qemu: implement bdrv_invalidate_cache
- I've tested migration with caching, and read the code, and it looks like this is unnecessary. qemu is doing a flush b...
- 01:47 PM devops Feature #2584 (Resolved): sepia: provide networking, DHCP for dynamic virtual machines
- downburst can provision them really nicely, but right now only static networking works. To fix that, we need DNS to w...
- 01:40 PM devops Feature #2583 (Resolved): crowbar: change barclamp-nova to use rbd
- The nova proposal needs to point to a ceph proposal. Look at how nova&glance use mysql.
barclamp-chef should inclu...
- 01:25 PM Feature #1964 (Rejected): ferro: Create a cloud-init OVF config that reimages a machine
- Dell's vMedia functionality is awfully buggy, aborting this plan (for now?).
- 01:25 PM Feature #1965 (Rejected): ferro: Machine management state machine (fake actions)
- Dell's vMedia functionality is awfully buggy, aborting this plan (for now?).
- 01:24 PM Feature #1966 (Rejected): ferro: Connect actions to state machine
- 01:20 PM Feature #1966: ferro: Connect actions to state machine
- Dell's vMedia functionality is awfully buggy, aborting this plan (for now?).
- 01:21 PM Feature #1967 (Rejected): ferro: Single API endpoint that delegates to machine managers
- 01:20 PM Feature #1967: ferro: Single API endpoint that delegates to machine managers
- Dell's vMedia functionality is awfully buggy, aborting this plan (for now?).
- 01:20 PM Feature #1968 (Rejected): ferro: Batch resource allocation (not fair, no quotas yet)
- Dell's vMedia functionality is awfully buggy, aborting this plan (for now?).
- 01:20 PM rbd Bug #2522: xfstest #219
- The problem here appears to be that the output of the repquota
command is not what's expected. I think the group qu...
- 01:17 PM Feature #1962 (Rejected): ferro: Trigger vMedia boot via IPMI/DRAC
- Dell's vMedia functionality is awfully buggy, aborting this plan (for now?).
- 01:16 PM Feature #1961 (Rejected): ferro: Python wrapper for vmcli (using gevent)
- Dell's vMedia functionality is awfully buggy, aborting this plan.
- 01:12 PM Feature #1963 (Closed): ferro: OVF Environment creation as a library
- downburst actually ended up containing this logic, not OVF but still cloud-init.
- 01:04 PM rgw Feature #2517 (Fix Under Review): rgw: limit number of buckets per user (configurable per user)
- 01:03 PM Feature #2582 (Resolved): set up chart.io + mysql (or equivalent) infrastructure for tracking perf
- 12:44 PM Linux kernel client Bug #2287 (Resolved): rbd: crashes with 10Gbit network and fio
- This looks like the bio->iter problem, which is now fixed by commit:43643528cce60ca184fe8197efa8e8da7c89a037 in ceph-...
- 12:38 PM Feature #2581 (Resolved): perf: investigate 0.47.2 precise vs 0.46 oneiric discrepancy
- 12:37 PM Feature #2580 (Resolved): perf: investigate poor performance at 10 osds per node
- 12:32 PM Feature #2578 (New): rados ager
- aging function that is invoked (probably) similarly to rados bench, ideally using the same bencher abstraction so tha...
- 12:30 PM Feature #2577 (Resolved): teuthology: blktrace task
- * run blktrace on the osds' disks.
* put results in the archive dir
* maybe an optional start delay, duration, ...
- 12:30 PM Feature #2576 (Resolved): perf: 0.48 on long-term clusters
- 12:29 PM Feature #2575 (Resolved): perf: 0.48 numbers
- populate the spreadsheet with values from 0.48
- 11:43 AM Messengers Bug #2569 (Need More Info): msgr: connect_rank crash
- I'm attempting to reproduce this, but what's available right now is just the teuthology log — it didn't pull off any ...
- 09:57 AM Messengers Bug #2569 (Resolved): msgr: connect_rank crash
- ...
- 11:31 AM devops Feature #2574 (Resolved): crowbar: use data disks automatically, journal inside data directory
- Crowbar sets node['crowbar']['disks'] to an array of disks. First one is used for the OS, and disk['usage'] is set to...
- 11:20 AM rbd Cleanup #2347 (Resolved): The rbd help text is misleading on required arguments
- commit:67710a65c7cd1173c73c40241572d615dd7da1f3
- 11:06 AM devops Feature #2415 (Fix Under Review): upstart: support radosgw
- 11:02 AM Cleanup #2331 (Resolved): Makefile.am:182: `lib/libgtest.a' is not a standard libtool library name
- commit:66553d25f09f0d0cea735a862a228060b72c0ce6
- 10:30 AM rbd Bug #2572 (Resolved): krbd: writeback errors?
- While trying to reproduce a null pointer messenger problem,
I kept hitting messages like this after some (fairly ran...
- 10:29 AM Feature #2571 (Resolved): sepia: enable virtualization
- 10:27 AM rbd Bug #2535: rbd: random data corruption in vm
- Sage Weil wrote:
> Guido Winkelmann wrote:
> > Sage Weil wrote:
> > > Are there multiple partitions or is LVM on t...
- 10:03 AM Linux kernel client Bug #2389: rbd: hung xfstest 67
- ubuntu@teuthology:/a/nightly_coverage_2012-06-13-a/7559
- 09:55 AM Linux kernel client Bug #2389: rbd: hung xfstest 67
- ubuntu@teuthology:/a/master-2012-06-12_16:17:15/7465
- 10:02 AM Linux kernel client Bug #147: lockdep: possible irq lock inversion dependency w/ osdc->request_mutex and con->mutex
- ubuntu@teuthology:/a/nightly_coverage_2012-06-13-a/7579
ubuntu@teuthology:/a/nightly_coverage_2012-06-13-a/7587
<...
- 09:59 AM CephFS Bug #1947: mds: SIGBUS during _mark_dirty
- ubuntu@teuthology:/a/nightly_coverage_2012-06-13-a/7526
- 09:23 AM rbd Feature #2568 (Resolved): qa: run xfstests on qemu+rbd
- This will build on #2566:
* stage xfstests on vdb, like a regular workunit, and:
* map additional rbd images to r...
- 09:21 AM rbd Feature #2567 (Resolved): qa: add qemu+rbd jobs to qa suite
- Add a bunch of workunits to the qa suite that will run on top of rbd inside a vm.
- 09:20 AM rbd Feature #2566 (Duplicate): teuthology: task to run rbd workunits in a vm
- teuthology task that will:
* download workunit vm
* create and format rbd image
* mount, stage a workunit in rbd...
06/12/2012
- 08:48 PM Feature #2564 (Resolved): teuthology: install kernels from local dir
- 02:58 PM Bug #2462 (Need More Info): osd/PG.cc: 402: FAILED assert(log.head >= olog.tail && olog.head >= l...
- f822c0257e4c7fad181332cd149205ad15a8b9db
See the commit description. Unfortunately, I don't really have evidence ...
- 02:55 PM Bug #2563 (Resolved): leveldb corruption
- This was also mentioned once in the mailing list.
ceph version 0.47.2 (commit:8bf9fde89bd6ebc4b0645b2fe02dadb1c17a...
- 02:40 PM rbd Feature #2561: rbd: copyup command
- What? How does a class function of any kind provide atomicity in cross-OSD data copies?
- 02:37 PM rbd Feature #2561: rbd: copyup command
- 'rbd copyup pool/image' command to copy any missing objects up from the parent. simple O(n) operation that leverages...
- 02:11 PM rbd Feature #2561 (Resolved): rbd: copyup command
- 'rbd copyup pool/image' command to copy any missing objects up from the parent. simple O(n) operation that leverages ...
- 02:39 PM rbd Feature #2562 (Resolved): librbd: open parent images, read path, write path
- - when we open an image, open the parent image too.
- make reads fall through to parent
- guard writes beyond paren...
- 02:05 PM rbd Feature #2560 (Resolved): rbd: safe parent deletion
- - maintain map of parent/child pairs in each child pool...
- 02:04 PM rbd Feature #2531: rbd: fencing broken clients
- As I see it, we have two options that we need to choose between.
1) We can add fencing to librbd and let anybody do ...
- 01:58 PM Bug #2550: logrotate: SIGHUP upstart jobs too, not just sysvinit
- The instance jobs make this a bit trickier. Either process "initctl list" output or copy the logic that walks the /va...
- 01:04 PM Bug #2550 (Resolved): logrotate: SIGHUP upstart jobs too, not just sysvinit
- 01:55 PM rbd Feature #2559 (Resolved): cls_rbd: copyup method
- - client provides object content
- if object exists, fail with EEXIST (or 0, or something)
- if object does not exi...
- 01:54 PM rbd Feature #2558 (Resolved): cls_rbd: child/parent methods
- On the new image header:
- set_parent(poolid, image (maybe id, maybe name), snapid)
On the per-pool child list:
...
- 01:52 PM rbd Feature #2557 (Rejected): QEMU support for image locking
- We should convert QEMU to make use of rbd cooperative locking, once it's done (#1480).
And any other appropriate c...
- 01:50 PM rbd Feature #2556 (Resolved): rbd tool: break image locks
- Once #1480 is done, expose lock breaking via the rbd tool.
- 01:47 PM devops Feature #2555 (Rejected): chef: SECURITY: Re-evaluate where configuration & key handoff gets stored
- The current setting seems to mean root on all chef nodes (even ones not running Ceph), and all knife users, have full...
- 01:44 PM devops Feature #2554 (Rejected): chef: open question: How do we discover what disks we should use as Cep...
- For Crowbar, see #2574.
- This is a somewhat dangerous operation; run accidentally, it will clobber a lot of data. ...
- 01:43 PM devops Feature #2553 (Closed): crowbar: open question: What's the correct way to add RBD support to the ...
- We'll need to set --volume-driver etc in nova.conf,
glance-api.conf, etc. So I guess we need to (temporarily) fo...
- 01:36 PM devops Feature #2415 (In Progress): upstart: support radosgw
- 01:21 PM devops Feature #2552 (Rejected): chef: admin tool to generate config in json (uuid, secret)
- The environment needs things like...
- 01:12 PM Bug #2551 (Rejected): leveldb broke "make distcheck"
- ...
- 01:03 PM devops Feature #2549 (Resolved): ceph-disk-prepare: take fstype, mkfs and mount options from ceph.conf
- See #2548 for similar need.
- 01:02 PM devops Feature #2548 (Resolved): ceph-disk-activate: take mount options from ceph.conf
- 01:02 PM devops Feature #2547 (Resolved): ceph-disk-prepare: handle partitioning and mkfs
- spawn gdisk in a subprocess.
How much protection do admins need to avoid ceph-disk-prepare /dev/sda mistakes?
- 01:00 PM devops Feature #2546 (Resolved): ceph-disk-prepare: take fsid from ceph.conf (support --cluster=name)
- 12:49 PM devops Feature #2498 (Resolved): standardize keyring locations for daemons
- 10:56 AM Bug #2545 (Resolved): init-ceph: stops if one instance fails to start
- 10:52 AM Bug #2543 (Resolved): crush: invalid pointer when outputting local retry histogram for large rang...
- 10:10 AM rbd Bug #2535: rbd: random data corruption in vm
- Guido Winkelmann wrote:
> Sage Weil wrote:
> > Are there multiple partitions or is LVM on the disk, or is the file ...
- 10:07 AM rbd Bug #2535: rbd: random data corruption in vm
- Sage Weil wrote:
> Are there multiple partitions or is LVM on the disk, or is the file system on the raw device?
...
- 09:29 AM rbd Bug #2535: rbd: random data corruption in vm
- Are there multiple partitions or is LVM on the disk, or is the file system on the raw device?
- 05:32 AM rbd Bug #2535: rbd: random data corruption in vm
- On Monday, 11 June 2012, 09:30:42, Sage Weil wrote:
> If you can reproduce it with 'debug filestore = 20' too, tha...
- 05:29 AM rbd Bug #2535: rbd: random data corruption in vm
- The bug also does not seem to have any effect with the setting "filestore fiemap = false" in ceph.conf.
- 02:27 AM Bug #2544 (Closed): Help text for "usage show" identical to "usage trim"
- cerr << " usage show show usage (by user, date range)\n";
cerr << " usage trim ...
06/11/2012
- 09:17 PM Bug #2543 (Resolved): crush: invalid pointer when outputting local retry histogram for large rang...
- buggered the memory when we are generating the histogram for a large range of x.
- 06:57 PM rgw Bug #2542 (Resolved): rgw: support S3 update of metadata
- An S3 metadata update is done by copying an object to itself with new metadata.
- 04:22 PM Feature #1772 (Resolved): rbd: define new on-disk header format
- 11:31 AM Feature #1772 (In Progress): rbd: define new on-disk header format
- 03:17 PM Bug #2540 (Resolved): "ceph osd crush set" should treat "foo=" as if foo wasn't mentioned on the ...
- 03:12 PM Bug #2540 (In Progress): "ceph osd crush set" should treat "foo=" as if foo wasn't mentioned on t...
- 03:09 PM Bug #2540 (Resolved): "ceph osd crush set" should treat "foo=" as if foo wasn't mentioned on the ...
- The current behavior, using an empty string as the name, is quite confusing.
Instead of an error message, a better...
- 03:13 PM RADOS Feature #2541 (Resolved): crush: move command to adjust non-leaf node position
- The add or update function is intentionally limited to leaves. Allow the hierarchy to be adjusted using a different ...
- 03:08 PM Feature #2510 (Resolved): update on-disk hobject_t encoding to include pool and namespace fields
- 02:13 PM Feature #2539 (Duplicate): ceph should issue timeout message when it can't connect to mon
- I forgot to start the ceph service before issuing ceph -s to check its status. The tool happily
waited forever to c...
- 11:31 AM Feature #2496 (Resolved): reinstall pudgy
- 10:32 AM RADOS Feature #2521 (Resolved): crush: control bucket vs device mark-down probabilities independently
- 09:50 AM Linux kernel client Bug #2392: First read of symlink after ceph filesystem mounted gives error
- This is going to be easy to fix once the atomic_open stuff is merged. Real Soon Now.
- 09:40 AM Linux kernel client Bug #2537 (Won't Fix): bad header for RHEL6-like kernels
- That backports tree is very old and not maintained. Assuming you do get it working, you'll have 1-2 year old code. ...
- 05:07 AM Linux kernel client Bug #2537: bad header for RHEL6-like kernels
- Sorry,
I forgot to mention that it implies caps.c and super.h files.
For detecting that kernel is RHEL it is mayb...
- 04:28 AM Linux kernel client Bug #2537 (Won't Fix): bad header for RHEL6-like kernels
- Hello,
I tried to compile the kernel module (kclient-0.20) and get a problem with ceph_write_inode:
it is declared ...
- 09:33 AM Feature #1773 (Resolved): rbd: class interface for header interaction
- 04:01 AM Bug #2536 (Can't reproduce): librados crashed while getting stat of an object
- librados crashed while getting stat of an object:...
06/10/2012
- 09:58 PM Feature #1400 (Resolved): throw exceptions on unknown encoding
- 09:46 PM Feature #2088: msgr: refactor 2 threads to one
- 09:46 PM Feature #2149: osd: use omap for snap collections
- 09:22 PM Feature #1772: rbd: define new on-disk header format
- 05:47 PM Feature #1773: rbd: class interface for header interaction
- 05:47 PM Feature #1773 (Fix Under Review): rbd: class interface for header interaction
- 05:41 PM Linux kernel client Bug #2389: rbd: hung xfstest 67
- nightly_coverage_2012-06-10-a 6787
- 11:05 AM CephFS Bug #2444: null pointer deference in ceph_d_prune inside kvm
- hi,
same bug here on native x86 and amd64 machines.
It affects debian wheezy and ubuntu 12.04 LTS.
I did not check...
06/09/2012
- 08:06 PM rbd Bug #2535: rbd: random data corruption in vm
- The information that *should* let us fully diagnose:
* set
debug osd = 20
debug filestore = 20
debug ms = ...
- 08:04 PM rbd Bug #2535 (Resolved): rbd: random data corruption in vm
- From ML:...
- 04:27 PM CephFS Bug #1947 (Need More Info): mds: SIGBUS during _mark_dirty
- It looks like this one still lives on:...
06/08/2012
- 11:14 PM Bug #2524: librados crashed while connecting to cluster
- Thanks for the update. Yes, we do have different models, including a pool of set number of rados_t instances, etc. Bu...
- 10:37 PM Bug #2524: librados crashed while connecting to cluster
- Xiaopong Tran wrote:
> This is on my system:
> [...]
>
> Does it create a thread to every configured osd or only one...
- 09:27 PM Bug #2524: librados crashed while connecting to cluster
- I bumped up the threads-max to:...
- 07:40 PM Bug #2524: librados crashed while connecting to cluster
- This is on my system:...
- 07:17 AM Bug #2524: librados crashed while connecting to cluster
- Sage Weil wrote:
> can you cat /proc/sys/kernel/threads-max ? on my system it's only 127837.
Yeah, for each libr...
- 07:09 AM Bug #2524: librados crashed while connecting to cluster
- can you cat /proc/sys/kernel/threads-max ? on my system it's only 127837.
- 03:17 AM Bug #2524: librados crashed while connecting to cluster
- Ah, formatting... sorry...
- 03:15 AM Bug #2524: librados crashed while connecting to cluster
- Alright, more information. I was thinking, maybe it was the max number of open files, or the stack size is too low, s...
- 11:04 PM Feature #2496 (In Progress): reinstall pudgy
- 09:03 PM Feature #2337 (Resolved): rgw and rados performance numbers
- 10:14 AM Feature #2337: rgw and rados performance numbers
- Actually, the specific sprint test is here:
https://docs.google.com/a/inktank.com/spreadsheet/ccc?key=0AnmmfpoQ1_9...
- 09:53 AM Feature #2337: rgw and rados performance numbers
- Results are being posted here:
https://docs.google.com/a/inktank.com/folder/d/0B3mmfpoQ1_94amRLQW5YT3l3OG8/edit
- 04:42 PM rbd Bug #2534 (Resolved): librbd: make sure watch is established on same header version as initial re...
- Right now there's a race where it doesn't.
- 11:16 AM Bug #2533 (Duplicate): osd: watchers tracked by entity_name_t, not by cookie
- In the object info, watchers are tracked in a map<entity_name_t, watch_info_t>, but if there are multiple watchers fr...
- 10:43 AM Feature #1711 (Resolved): chef: multiple monitor support
- Works as of ceph-cookbook.git commit b5cc21bf5b9c3f59474a7dfe38e04ee01b584fa3 and ceph.git commit 7332e9c717fb627d51e...
- 10:12 AM rbd Feature #2531: rbd: fencing broken clients
- I talked to Sam about the combination of blacklisting, bad client writes, and changing primaries that we discussed an...
- 10:11 AM Linux kernel client Feature #26 (Rejected): statlite
- 10:09 AM Linux kernel client Cleanup #2093 (Resolved): ceph-client: messenger: the "to" parameter to read_partial() needs to go
- 10:08 AM Linux kernel client Bug #2395 (Resolved): kernel crash after unmap a rdb device while the cluster is down
- I'm going to assume this is running the older code and close it. If not, let us know!
- 10:06 AM rbd Bug #2478 (New): krbd: unmap on 3.4.0: scheduling while atomic...
- 10:04 AM Linux kernel client Feature #949 (Rejected): rbd: async writes, flush/barrier
- 10:04 AM Linux kernel client Bug #2243 (Resolved): btrfs: warning in orphan_commit_root
- 09:51 AM rbd Bug #2532: rbd command allows passing in -K </path/to/secret>, but long version of (--secret) doe...
- That's probably best. It is always easier though when all subcommands under the main command, rbd in this case used o...
- 09:00 AM rbd Bug #2532: rbd command allows passing in -K </path/to/secret>, but long version of (--secret) doe...
- Oh, I see.
I think the right fix is to make '--secret' a synonym for '--keyfile', and fix up rbd to use the conf...
- 08:20 AM rbd Bug #2532: rbd command allows passing in -K </path/to/secret>, but long version of (--secret) doe...
- When I try to use --keyfile=<file> with map, it seemingly fails, but using --secret=<file> succeeds. ...
- 08:13 AM rbd Bug #2532: rbd command allows passing in -K </path/to/secret>, but long version of (--secret) doe...
- This is part of the rbd cmd helper message. It seems that for the map command one uses --secret....
- 07:00 AM rbd Bug #2532: rbd command allows passing in -K </path/to/secret>, but long version of (--secret) doe...
- the option is --keyfile <file>... where did you see --secret <file> documented?
- 05:49 AM rbd Bug #2532 (Resolved): rbd command allows passing in -K </path/to/secret>, but long version of (--...
- While rolling back a snapshot I succeed when I pass in -K with the location of the key file, but it looks like I fail when I...
06/07/2012
- 09:38 PM rbd Feature #2531 (Resolved): rbd: fencing broken clients
- 06:45 PM Bug #2524: librados crashed while connecting to cluster
- objdump on the NIF shared library.
- 06:29 PM Bug #2524: librados crashed while connecting to cluster
- This is weird, if the problem is caused by resource exhaustion. I run this app on a machine with i7 CPU (with 8 cores...
- 09:24 AM Bug #2524: librados crashed while connecting to cluster
- This assert means that either a malloc or a call to pthread_create failed. It's probably resource exhaustion of some ...
- 04:23 AM Bug #2524 (Won't Fix): librados crashed while connecting to cluster
- Librados crashed while connecting to the cluster.
Here is some log information. Unfortunately, I don't have more i...
- 04:25 PM rbd Documentation #2530 (Closed): Doc: rbd manpage doesn't mention watch; usage does, and it works
- 04:20 PM Tasks #2529 (Resolved): debian: Merge packaging changes from Ubuntu 12.04
- The package in ubuntu is split to ceph-fs-common (mount helpers), ceph-mds (not in main), etc. Merge what makes sense.
- 03:10 PM rbd Bug #2528 (Resolved): Mounted RBD image appears to go read-only after a snapshot is created
- I have been able to repeat this a number of times. Essentially, I create a small rbd device, using the map command in...
- 01:54 PM Bug #2526 (Resolved): ceph-mon $mon_data_dir/keyring is world readable
- gah... commit:7332e9c717fb627d51efcaa3f31473a2c129e876
- 01:25 PM Bug #2526 (Resolved): ceph-mon $mon_data_dir/keyring is world readable
- Keys to the kingdom, for anyone to grab. ceph-mon --mkfs creates this file, it should enforce the access mode.
ubu...
- 01:52 PM rgw Bug #2527 (Resolved): RGW may return 409 Conflict when deleting a bucket
- If a bucket delete call occurs immediately after running a delete operation on the final remaining object in that buc...
- 12:53 PM Bug #2525 (Resolved): librados: some functions are not thread-safe
- Some functions are accessing the osdmap without any locks. There are probably other cases like this. Find and fix all...
06/06/2012
- 09:07 PM Feature #1422 (Resolved): libvirt: rbd storage pool
- 09:06 PM Feature #2486 (Resolved): crush: evaluate local retry behavior
- 09:06 PM Feature #2493 (Resolved): teuthology-lock --status
- 09:05 PM devops Feature #2498 (Fix Under Review): standardize keyring locations for daemons
- 03:57 PM Messengers Cleanup #2150 (Resolved): repair the Simple/Messenger interface
- 02:06 PM Feature #2497 (Resolved): mon: new cluster logging strategy
- commit:47b202ecfdc00996b085a0c0d557564fbaa8bdfe
- 12:28 PM Feature #2497 (Fix Under Review): mon: new cluster logging strategy
- 12:28 PM Feature #2497: mon: new cluster logging strategy
- see wip-2497
- 01:27 PM Linux kernel client Bug #2523 (Resolved): xfs: xfs_iolock_reclaimable
- ...
- 01:22 PM rbd Bug #2522: xfstest #219
- ubuntu@teuthology:/a/nightly_coverage_2012-06-05-b
- 01:21 PM rbd Bug #2522 (Closed): xfstest #219
- ...
- 11:30 AM Bug #2518 (Resolved): mon: limit size of paxos log event
- 11:29 AM RADOS Feature #2521: crush: control bucket vs device mark-down probabilities independently
- 11:27 AM RADOS Feature #2521 (Resolved): crush: control bucket vs device mark-down probabilities independently
- --mark-down-ratio -- probability that a device (in eligible bucket) will be marked down
--mark-down-bucket...
- 11:27 AM RADOS Feature #2421 (Resolved): crush: quantitatively validate mapping quality
- 09:16 AM Bug #2520 (Duplicate): iozone random read/write with 4k block size hangs
- http://thread.gmane.org/gmane.comp.file-systems.ceph.devel/6777/focus=6856
User reports iozone random read/write (...
- 04:20 AM Bug #2508: osdc/ObjectCacher.cc:761: void ObjectCacher::bh_write_commit(int64_t, sobject_t, loff_...
- Hi Josh,
I've increased osd_min_pg_log_entries to 5000. Let's see if it fixes the problem.
Simon
06/05/2012
- 01:36 PM Feature #2519 (Resolved): rados: allow setting pg_num and pgp_num when creating a pool
- Right now rados mkpool creates a pool with 8 pgs, which is almost always too few. 'ceph osd pool create' accepts pg_n...
- 01:04 PM Bug #2518: mon: limit size of paxos log event
- 01:03 PM Bug #2518 (Resolved): mon: limit size of paxos log event
- dho was having trouble with a 400MB paxos event/record. make LogMonitor limit an individual paxos event to something...
- 11:42 AM rgw Feature #2517 (Resolved): rgw: limit number of buckets per user (configurable per user)
- 11:37 AM rgw Feature #2516 (Resolved): rgw: new bandwidth-only per-user log
- - orthogonal to operations logs
- only aggregate user bandwidth usage (read, write) per date
- rgw sends a perio...
- 11:02 AM Bug #2508: osdc/ObjectCacher.cc:761: void ObjectCacher::bh_write_commit(int64_t, sobject_t, loff_...
- Hi Simon,
If this is at all reproducible, could you try setting osd_min_pg_log_entries higher on all your osds, sa...
- 07:47 AM Bug #2508 (Resolved): osdc/ObjectCacher.cc:761: void ObjectCacher::bh_write_commit(int64_t, sobje...
- Hi,
we have random KVM VPS crashes with the following error:...
- 10:32 AM Feature #2510: update on-disk hobject_t encoding to include pool and namespace fields
- 10:15 AM Feature #2510 (Resolved): update on-disk hobject_t encoding to include pool and namespace fields
- This will allow hobject_t's to be globally unique in the filestore. That is, there will be a 1-to-1 inode to hobject...
- 10:31 AM Subtask #2515: allow collection upgrade to use more than one transaction
- 10:31 AM Subtask #2515 (Resolved): allow collection upgrade to use more than one transaction
- 10:31 AM Subtask #2514: Implement DBObjectMap upgrade from old version
- 10:30 AM Subtask #2514 (Resolved): Implement DBObjectMap upgrade from old version
- 10:31 AM Subtask #2513: Update DBObjectMap implementation to ignore collection
- 10:30 AM Subtask #2513 (Resolved): Update DBObjectMap implementation to ignore collection
- This allows us to remove the (coll_t,hobject_t)->seq mapping and directly store the leaf nodes keyed by hobject_t.
- 10:31 AM Subtask #2512: implement upgrade process for collections
- 10:29 AM Subtask #2512 (Resolved): implement upgrade process for collections
- also upgrade object_info and pg log encodings
- 10:31 AM Subtask #2511: Change hobject_t encoding
- 10:16 AM Subtask #2511 (Resolved): Change hobject_t encoding
- 10:17 AM CephFS Bug #733: cmds crash: mds/LogEvent.cc:88: FAILED assert(p.end())
- ok here is a logfile with the following config:
[mds]
debug = 20
debug ms = 1
debug md...
- 10:08 AM Subtask #2402 (In Progress): audit calls into osd from pg for locking correctness
- 10:07 AM Subtask #2509 (Resolved): create OSDService to limit pg/osd interface
- 10:06 AM Subtask #2430: simplify pg removal
- 10:06 AM Subtask #2403: remove osd pointer from PG
- 10:06 AM Subtask #2333: create queueing for peering messages
- 10:06 AM Subtask #825: osd: remove pg map updating from handle_osd_map
- 10:06 AM Subtask #2332: move pg queueing into pgs
- 10:06 AM Subtask #2282: Handle map updates on a per-pg basis
- 09:56 AM rbd Feature #1480: librbd: image locking
- lock(entity)
unlock(entity)
new code should lock before open, unlock on close.
the rbd map tool have 'lock lis...
- 12:28 AM CephFS Bug #1047: mds: crash on anchor table query
- No, I am not sure about that. Only saw the same assert message and a similar trace, so I assumed it to be the same bug.
06/04/2012
- 04:09 PM rgw Bug #2503 (Resolved): rgw: ungraceful failure when cannot create unix domain socket
- Fixed, commit:5087997a1c90ecd1244dc1047a17858607c940f9.
- 03:09 PM rgw Bug #2503: rgw: ungraceful failure when cannot create unix domain socket
- No, another problem. This refers to the 'rgw socket path' that is being used for fastcgi.
- 06:26 AM rgw Bug #2503: rgw: ungraceful failure when cannot create unix domain socket
- There was a stupid error in master for a few days that was making noise about the admin socket. Is that what this wa...
- 03:56 PM Bug #2507 (Resolved): auth: "ceph auth get-or-create-key" argument validation is lacking
- This should probably have errored out:
ubuntu@inst01:~$ sudo ceph auth get-or-create-key client.foo borkbork
AQBW...
- 01:08 PM CephFS Bug #1047: mds: crash on anchor table query
- Amon, are you sure you're hitting exactly this bug with your users? This particular one requires hard links to be in ...
- 01:04 PM CephFS Bug #733: cmds crash: mds/LogEvent.cc:88: FAILED assert(p.end())
- Aww, the actual debug line that's interesting here is generic_dout().
Can you do it again, this time adding "debug =...
- 10:05 AM Messengers Cleanup #2150: repair the Simple/Messenger interface
- I scheduled another test run but I don't anticipate any problems; this should be reviewed for merge!
- 09:23 AM CephFS Bug #2494: mds: Cannot remove directory despite it being empty.
- Note that this was triggered frequently by backuppc runs:
http://thread.gmane.org/gmane.comp.file-systems.ceph.devel...
- 09:23 AM Linux kernel client Bug #2506: ceph: ceph_add_cap: couldn't find snap realm NNN
- Note that this was triggered frequently by backuppc runs:
http://thread.gmane.org/gmane.comp.file-systems.ceph.devel...
- 06:33 AM Bug #2487 (Resolved): rgw: (re)creating a suspended bucket succeeds
- 06:29 AM Bug #2491 (Resolved): watch/notify: racing notify and unwatch
- 01:35 AM Bug #2346: xfs filesystem on top of rbd volume corrupts
- I am not 100% sure but it looks like kernel 3.2.17-1 fixed the problem. Let's wait 4 weeks to make sure of it.