Activity
From 12/15/2012 to 01/13/2013
01/13/2013
- 10:11 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- Nope.. which leads me to realize that that setting needs to go in teuthology's ceph.conf. Doing that now, and then I...
- 10:01 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- *sigh*
This also looks good to me, and I like it better (should have suggested this the first time around). But no...
- 10:05 PM Bug #3774 (Fix Under Review): osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
- wip-scrub
- 10:05 PM Bug #3786 (Fix Under Review): osd: scrub is deferred indefinitely if load is high
- wip-scrub
- 07:04 AM Revision 410906e0 (ceph): mon: OSDMonitor: don't output to stdout in plain text if json is specified
- Fixes: #3748
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
01/12/2013
- 11:05 PM Bug #3748 (Resolved): ceph osd dump --format=json includes non-JSON line
- commit:410906e04936c935903526f26fb7db16c412a711
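Bug #3748 matters to anything that feeds `ceph osd dump --format=json` into a parser: a single stray plain-text line makes the whole output invalid JSON. A minimal sketch of that failure mode follows. The sample strings are made up for illustration (the stray line and the JSON fields are assumptions, not captured from a real cluster).

```python
import json

# Hypothetical outputs; the stray line and the fields are assumptions.
CLEAN = '{"epoch": 12, "osds": []}'
MIXED = "some plain-text status line\n" + CLEAN


def parses_as_json(text):
    """Return True if the whole text is one valid JSON document."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


print(parses_as_json(CLEAN))
print(parses_as_json(MIXED))
```

A consumer cannot safely work around this by skipping lines, which is presumably why the fix suppresses the plain-text output at the source.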
- 11:03 PM Bug #3795 (Resolved): loadgen task gets into msgr loop
- ...
- 11:01 PM Bug #3785 (Fix Under Review): ceph: default crush rule does not suit multi-OSD deployments
- der, broke vstart. can you review wip-3785?
- 08:01 AM CephFS Feature #3749: Remove forced synchronization from Java bindings
- In libcephfs mount/unmount race against each other, and the test of the API (e.g. unmount racing against write). In C...
- 01:10 AM Revision 7ea5d84f (ceph): osdmap: spread replicas across hosts with default crush map
- This is more often the case than not, and we don't have a good way to
magically know what size of cluster the user wi...
- 01:09 AM Revision 3610e72e (ceph): mon: OSDMonitor: only share osdmap with up OSDs
- Try to share the map with a randomly picked OSD; if the picked monitor is
not 'up', then try to find the nearest 'up'...
- 12:25 AM Revision 1f721804 (ceph): rbd: Fix tabs
- Signed-off-by: Dan Mick <dan.mick@inktank.com>
01/11/2013
- 11:56 PM Revision 34138993 (ceph): doc: Updates to CRUSH paper.
- fixes: 3329, 3707, 3711, 3389
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 10:28 PM Revision 15bb00ca (ceph): rbd: call udevadm settle on map/unmap
- When we map/unmap devices, udev gets called to manage device nodes;
this will allow the command to wait for those man...
- 10:28 PM Revision e94b06a1 (ceph): rbd: make 'add' modprobe rbd so it has a chance of success
- Check for existence of /sys/bus/rbd first to avoid unnecessary calls
Fixes: #3784
Signed-off-by: Dan Mick <dan.mick@...
- 08:17 PM Revision 66eb93b8 (ceph): OSD: only trim up to the oldest map still in use by a pg
- map_cache.cached_lb() provides us with a lower bound across
all pgs for in-use osdmaps. We cannot trim past this sin...
- 08:15 PM Revision 8cf79f25 (ceph): OSD: check for empty command in do_command
- Fixes: #3878
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
- 08:09 PM Revision 3e147295 (ceph): Merge pull request #32 from imjustmatthew/imjustmatthew_docs
- Correct typo in mon docs 'ceph.com' to 'ceph.conf'
- 07:59 PM Revision 0f161f1e (ceph): Correct typo in mon docs 'ceph.com' to 'ceph.conf'
- 06:49 PM Revision aeb02061 (ceph): qa/run_xfstests.sh: use cloned xfstests repository
- Use our own copy of the xfstests repository rather than hitting
the upstream one repeatedly.
Signed-off-by: Alex Eld...
- 06:15 PM Revision 8d0fa15e (ceph): mon: Monitor: only schedule a timecheck after election if we are not alone
- Fixes: #3790
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
- 05:51 PM Bug #3785 (Resolved): ceph: default crush rule does not suit multi-OSD deployments
- Merged to master in commit:7ea5d84fa3d0ed3db61eea7eb9fa8dbee53244b6 and cherry-picked to bobtail in commit:503917f004...
- 05:45 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- good question. let's start with bobtail.
- 05:39 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- Looks good to me. What branches do we want to cherry-pick it on?
- 05:24 PM Bug #3785 (Fix Under Review): ceph: default crush rule does not suit multi-OSD deployments
- wip-3785
- 01:59 PM Bug #3785 (New): ceph: default crush rule does not suit multi-OSD deployments
- dang! wrong bug. opening this one back up.
sorry all!
- 12:34 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- I think maybe Deb's comments and closure were meant for another bug (perhaps 3789?)
- 11:34 AM Bug #3785 (Won't Fix): ceph: default crush rule does not suit multi-OSD deployments
- This comment should have been in bug 3789
caused by a lack of resources on the system.
have increased the memory fro...
- 11:32 AM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- This comment should have been in bug 3789
upping the memory on these VMs from 512M to 2G
since it appears it was a...
- 10:55 AM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- I agree with Ian, I have seen *very bad things* happen when crush chooses two OSDs on one host, rather than distribute...
- 10:11 AM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- The issue here is that CRUSH maps which behave well on multi-host deployments behave quite poorly on one or two host ...
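The concern in this thread can be illustrated with a toy placement function. This is a sketch only: it is not CRUSH, the host and OSD names are hypothetical, and the hashing scheme is an assumption chosen for determinism. It compares picking replicas per OSD with picking one OSD per host.

```python
import hashlib

# Toy layout: 2 hosts with 3 OSDs each (hypothetical names).
HOSTS = {"host-a": ["osd.0", "osd.1", "osd.2"],
         "host-b": ["osd.3", "osd.4", "osd.5"]}


def _rank(key, item):
    """Deterministic pseudo-random score for a (key, item) pair."""
    return hashlib.sha256(f"{key}:{item}".encode()).hexdigest()


def place_by_osd(pg, replicas):
    """Pick the top-ranked OSDs, ignoring which host they live on."""
    osds = [o for members in HOSTS.values() for o in members]
    return sorted(osds, key=lambda o: _rank(pg, o))[:replicas]


def place_by_host(pg, replicas):
    """Pick distinct hosts first, then one OSD inside each host."""
    hosts = sorted(HOSTS, key=lambda h: _rank(pg, h))[:replicas]
    return [min(HOSTS[h], key=lambda o: _rank(pg, o)) for h in hosts]


def host_of(osd):
    """Return the host that contains the given OSD."""
    return next(h for h, members in HOSTS.items() if osd in members)


def colocated(placement):
    """True if at least two replicas share a host."""
    hosts = [host_of(o) for o in placement]
    return len(set(hosts)) < len(hosts)


pgs = [f"pg.{i}" for i in range(100)]
print(sum(colocated(place_by_osd(pg, 2)) for pg in pgs), "PGs share a host (per-OSD)")
print(sum(colocated(place_by_host(pg, 2)) for pg in pgs), "PGs share a host (per-host)")
```

In this toy, per-host placement never puts two replicas on one machine, but it returns fewer replicas than requested when there are fewer hosts than replicas. That is roughly the one-or-two-host trade-off raised above.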
- 05:46 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
- Yes, Greg. The test passed in the recent runs.
- 05:34 PM Bug #3752 (Resolved): fsync-tester script need to be fixed to run in the nightlies
- This appears to be passing now, right Tamil?
Since I'm not seeing anything else breaking I'm inclined to leave the...
- 04:25 PM Bug #3772 (In Progress): osd: osd_disk_threads = 5 seems to hang recovery
- 03:53 PM Documentation #3330 (In Progress): doc: How to troubleshoot unbalanced CRUSH
- 03:51 PM Documentation #3329 (In Progress): doc: What metrics should be used to set node weight
- 02:45 PM CephFS Bug #3793: wrong size reported in some distributions/toolchains
- That makes this sound like a simple fix... we need to swap the frsize and bsize fields. Except that right now we ar...
- 02:39 PM CephFS Bug #3793: wrong size reported in some distributions/toolchains
- I spent a bit of time with gregaf trying to find authoritative sources for what the different values denote. While `...
- 01:40 PM CephFS Bug #3793: wrong size reported in some distributions/toolchains
- This coreutils commit may have useful data:
http://git.savannah.gnu.org/cgit/coreutils.git/commit/src?id=0863f018f0f...
- 01:38 PM CephFS Bug #3793 (Resolved): wrong size reported in some distributions/toolchains
- In ceph_statfs we set f_bsize to be 1MB in order to report very large available spaces. However, nowadays it is appar...
- 02:38 PM CephFS Feature #3749: Remove forced synchronization from Java bindings
- This needs more thought than just removing synchronization. We'd like to be segfault free in Java, even though you co...
- 02:26 PM Bug #3789: OSD core dump and down OSD on CentOS cluster
- There is 'ceph health', and a nagios plugin that runs it. A similarly trivial plugin can probably be written for oth...
- 02:01 PM Bug #3789 (Won't Fix): OSD core dump and down OSD on CentOS cluster
- dmesg shows it was a lack of resources.
upping the memory on these VMs from 512M to 2G
since it appears it ...
- 10:28 AM Bug #3789: OSD core dump and down OSD on CentOS cluster
- Deb Barba wrote:
> all core files have similar backtrace.
> again, Sage, looks like you are right, low resources
>...
- 10:27 AM Bug #3789: OSD core dump and down OSD on CentOS cluster
- all core files have similar backtrace.
again, Sage, looks like you are right, low resources
dmesg:
hrtimer: inte...
- 10:23 AM Bug #3789: OSD core dump and down OSD on CentOS cluster
- looks from dmesg, you are right Sage, low on resources
centos1 core# gdb /usr/bin/ceph-osd core.0.26177
Core wa...
- 10:16 AM Bug #3789: OSD core dump and down OSD on CentOS cluster
- backtrace of core.0.14401 from centos3:
Core was generated by `/usr/bin/ceph-osd -i 8 --pid-file /var/run/ceph/osd....
- 09:37 AM Bug #3789 (Need More Info): OSD core dump and down OSD on CentOS cluster
- check dmesg, or VM responsiveness. this triggers when a call to sync(2) takes more than... 2 minutes? i forget how l...
- 09:13 AM Bug #3789 (Won't Fix): OSD core dump and down OSD on CentOS cluster
- Running a CentOS VM cluster. Running v0.56.1
I had written a bit of data, and stopped writing about 4pm yesterday...
- 02:17 PM rbd Subtask #3741: krbd: rework request tracking code
- Unfortunately my system crashed after an hour or so. The
crash was in the network driver, and a little analysis
su...
- 10:45 AM rbd Subtask #3741: krbd: rework request tracking code
- My full test run isn't complete but I seem to have resolved
whatever problem I was hitting yesterday. I have not ye...
- 01:39 PM CephFS Bug #3794 (Resolved): uclient: reports sizes wrong in some cases
- This is the counterpart to kernel bug #3793. See Client::statfs, in which we set f_bsize to 1MB but f_frsize to 4KB. ...
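The size mismatch behind #3793 and #3794 comes down to which field a consumer multiplies the block count by. The sketch below uses the 1MB `f_bsize` and 4KB `f_frsize` values quoted above. The 1 TiB filesystem, and the assumption that the block count was computed in 1 MiB units, are illustrative rather than taken from the Ceph code.

```python
from collections import namedtuple

# Field names follow POSIX statvfs; the values are illustrative assumptions.
StatVfs = namedtuple("StatVfs", "f_bsize f_frsize f_blocks")

KiB = 1 << 10
MiB = 1 << 20
GiB = 1 << 30


def total_bytes_frsize(st):
    """POSIX statvfs: f_blocks is counted in units of f_frsize."""
    return st.f_blocks * st.f_frsize


def total_bytes_bsize(st):
    """What a consumer gets if it multiplies by f_bsize instead."""
    return st.f_blocks * st.f_bsize


# A 1 TiB filesystem whose block count was computed in 1 MiB units,
# while f_frsize is reported as 4 KiB.
st = StatVfs(f_bsize=MiB, f_frsize=4 * KiB, f_blocks=(1 << 40) // MiB)

print(total_bytes_bsize(st) // GiB, "GiB via f_bsize")
print(total_bytes_frsize(st) // GiB, "GiB via f_frsize")
```

With these numbers the two readings differ by a factor of 256, so any tool that follows the `f_frsize` convention would badly under-report the filesystem size.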
- 12:22 PM Bug #3787 (Resolved): Ceph OSD crashes on ceph tell osd.x
- 8cf79f252a1bcea5713065390180a36f31d66dfd
- 11:12 AM Bug #3787 (Fix Under Review): Ceph OSD crashes on ceph tell osd.x
- wip_3787
- 09:33 AM Bug #3787: Ceph OSD crashes on ceph tell osd.x
- verified this happens on master. should be an easy fix. thanks for the report!
- 12:17 AM Bug #3787 (Resolved): Ceph OSD crashes on ceph tell osd.x
- I recently set up a small test cluster with 2 nodes to test the 0.48.3 -> 0.56.1 upgrade. After Upgrading one of the ...
- 12:22 PM Bug #3770 (Resolved): OSD crashes on boot
- 66eb93b83648b4561b77ee6aab5b484e6dba4771
- 11:16 AM Bug #3770 (Fix Under Review): OSD crashes on boot
- wip_3770
- 11:03 AM Bug #3770: OSD crashes on boot
- The fault is in OSD::handle_osd_map where we trim old maps. Prior to 0.50, the pgs would have processed up to the cu...
- 09:59 AM Bug #3770: OSD crashes on boot
- I'm seeing this same assert failure when trying to startup 3 of my OSDs. Happy to provide feedback for the debugging ...
- 09:43 AM Bug #3770: OSD crashes on boot
- sjust said that we're done collecting information and that I could rm the pg directory/log/info, which I did. Unfortu...
- 09:41 AM Bug #3770: OSD crashes on boot
- 12:04 PM Bug #3788: debian source packages are missing
- Gary Lowell wrote:
> It looks like the Sources file has been zero length in past releases as well. Still investigat...
- 12:03 PM Bug #3788: debian source packages are missing
- My favorite use case when source packages are available would be...
- 11:33 AM Bug #3788: debian source packages are missing
- I think we should build source packages too (in addition to tarballs, etc.).
- 10:47 AM Bug #3788: debian source packages are missing
- We are not currently building debian or rpm source packages. We do put out a source tarball corresponding to the rel...
- 09:56 AM Bug #3788 (In Progress): debian source packages are missing
- It looks like the Sources file has been zero length in past releases as well. Still investigating.
- 02:20 AM Bug #3788: debian source packages are missing
- Proposed fix at https://github.com/ceph/ceph-build/pull/1
- 01:44 AM Bug #3788: debian source packages are missing
- http://ceph.com/debian/conf/distributions is created from https://github.com/ceph/ceph-build/blob/master/gen_reprepro...
- 01:35 AM Bug #3788 (Resolved): debian source packages are missing
- Following the instructions at http://ceph.com/docs/master/install/debian/ to add the ...
- 10:52 AM CephFS Bug #3773: mds crashed at LogEvent::decode
- Sure Sage. I was running bonnie from client during upgrade.
I had debug ms=1 set, i will try to reproduce this with...
- 09:41 AM CephFS Bug #3773 (Need More Info): mds crashed at LogEvent::decode
- Tamil, I wonder if you can try to reproduce this with mds logging turned up from the start (debug mds = 20, debug ms ...
- 10:34 AM Messengers Bug #2569: msgr: connect_rank crash
- yes, you are right, Greg. I just wanted to put a note of this somewhere, so chose to update the bug itself :)
- 10:23 AM Bug #3748 (Fix Under Review): ceph osd dump --format=json includes non-JSON line
- wip-3748 has a fix, commit:0edb53f02231fb83f33d3bc5f58b37b14cd5df82
- 10:20 AM Bug #3695 (Resolved): monitor crashed after an upgrade in Monitor::timecheck
- 10:16 AM Bug #3790 (Resolved): Mon crash after update to ceph version 0.56-209-g310112f
- looks good, merged into master. commit:8d0fa15e6aa3847e89de5d5adfca0a863e8da976
- 10:06 AM Bug #3790: Mon crash after update to ceph version 0.56-209-g310112f
- Had a redundant check on the previous commit; fixed and rebased it and the new commit can be found on wip-3790 commit...
- 10:02 AM Bug #3790: Mon crash after update to ceph version 0.56-209-g310112f
- This patch fixes it.
- 09:31 AM Bug #3790 (In Progress): Mon crash after update to ceph version 0.56-209-g310112f
- My fault. Forgot a check on win_election().
Any chance you can test 6104629d95207f3dfd3a744d81b011b6a714070e on wi...
- 09:18 AM Bug #3790: Mon crash after update to ceph version 0.56-209-g310112f
- Previous installed version was .56-193.
- 09:14 AM Bug #3790 (Resolved): Mon crash after update to ceph version 0.56-209-g310112f
- I have a single node cluster on burnupi60 updated each morning to the latest Master branch. After the update this mo...
- 09:16 AM Bug #3774 (In Progress): osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
- 09:16 AM Bug #3774: osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
- wip-scrub-sched for the argonaut version. should look very similar for master/bobtail.
- 02:05 AM Revision 310112f7 (ceph): Merge remote-tracking branch 'gh/wip-3633'
- Reviewed-by: Sage Weil <sage@inktank.com>
- 02:04 AM Revision 9e4a3f03 (ceph): Merge remote-tracking branch 'gh/wip-3633'
- 02:03 AM Revision 305cb54a (ceph): suites: rados: multimon: add mon clock skews task yaml files
- Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
- 12:58 AM Revision 2fa5d23b (ceph): test: Hadoop cluster and task config.
- Add a 3-node cluster specification and a
task for running wordcount with Hadoop on Ceph.
Signed-off-by: Joe Buck <jb...
- 12:44 AM Revision aa40de90 (ceph): messages: add MTimeCheck
- Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
- 12:44 AM Revision 684d4ba2 (ceph): mon: Monitor: add timecheck infrastructure to detect clock skews
- Fixes: #3633
Fixes: #3695
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inkt...
- 12:44 AM Revision ff1c254b (ceph): mon: Monitor: reduce indentation level; make code more readable
- Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
- 12:44 AM Revision 7a7fff57 (ceph): mon: Monitor: move a couple of if's together on handle_command()
- Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
- 12:44 AM Revision bc57c7a9 (ceph): mon: Monitor: use 'else if' on handle_command instead of bunches of 'if'
- ... when the options are mutually exclusive.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
- 12:44 AM Revision 58e03ecb (ceph): mon: Monitor: unify 'ceph health' and 'ceph status'; add json output
- Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
- 12:03 AM Revision e6f284e9 (ceph): doc: Added -a option. Should work without from server, as described.
- fixes: #3750
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
01/10/2013
- 11:59 PM Revision de6633f9 (ceph): doc: Normalized to term "drive" rather than disk. Changed "(Manual)" en...
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 11:06 PM Revision 7a8ec194 (ceph): Merge branch 'next'
- 09:54 PM Revision 988f3597 (ceph): rados: add truncate support
- Signed-off-by: Samuel Just <sam.just@inktank.com>
Revewed-by: Greg Farnum <greg@inktank.com>
- 09:04 PM Bug #3786 (Resolved): osd: scrub is deferred indefinitely if load is high
- If the load is above the threshold, we will never scrub. For some environments, this is normal (e.g., mixed OSD and ...
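One way to avoid deferring scrubs forever is to cap how long a high load can postpone them. The sketch below shows such a policy under stated assumptions: the function, its parameter names, and the sample values are hypothetical, and it is not the code on the wip-scrub branch.

```python
def should_scrub(now, last_scrub, load, load_threshold, min_interval, max_interval):
    """Decide whether to scrub now (all times in seconds).

    Hypothetical policy, not the wip-scrub implementation:
    - never scrub more often than min_interval;
    - between min_interval and max_interval, scrub only if load is low;
    - once max_interval has passed, scrub regardless of load.
    """
    elapsed = now - last_scrub
    if elapsed < min_interval:
        return False
    if elapsed >= max_interval:
        return True
    return load < load_threshold


DAY = 24 * 60 * 60

# Busy host: load stays at 4.0 against a threshold of 0.5.
print(should_scrub(2 * DAY, 0, 4.0, 0.5, DAY, 7 * DAY))  # still deferred
print(should_scrub(8 * DAY, 0, 4.0, 0.5, DAY, 7 * DAY))  # forced by max_interval
```

With only the load check and no upper bound, the second call would also return False, which is the indefinite deferral this bug describes.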
- 08:23 PM rbd Bug #3585: Image import via QEMU-IMG results in a corrupt rbd
- This seems to be fixed in QEMU 1.3.0 and Ceph 0.56.1
I've tried QED -> Raw -> Ceph -> Raw then QED -> Ceph -> Raw an...
- 07:56 PM Bug #3785 (Resolved): ceph: default crush rule does not suit multi-OSD deployments
- Version: 0.48.2-0ubuntu2~cloud0
Our Ceph deployments typically involve multiple OSDs per host with no disk redunda...
- 07:10 PM rbd Feature #3635 (In Progress): rbd cli: call "udevadm settle" after use of add/remove kernel interface
- 07:10 PM Revision 44625d44 (ceph): config_opts.h: default osd_recovery_delay_start to 0
- This setting was intended to prevent recovery from overwhelming peering traffic
by delaying the recovery_wq until osd...
- 07:09 PM rbd Feature #3784 (In Progress): rbd: issue modprobe when rbd map is called
- 06:04 PM rbd Feature #3784 (Resolved): rbd: issue modprobe when rbd map is called
- rbd map will not work unless the rbd kernel module is loaded, and this must be done manually. Add code to rbd to cau...
- 07:02 PM Revision 830b8ffa (ceph): ReplicatedPG: fix snapdir trimming
- The previous logic was both complicated and not correct. Consequently,
we have been tending to drop snapcollection l...
- 06:34 PM Revision 0f42c373 (ceph): ReplicatedPG: fix snapdir trimming
- The previous logic was both complicated and not correct. Consequently,
we have been tending to drop snapcollection l...
- 06:24 PM Bug #3774: osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
- 06:14 PM Revision 035caac5 (ceph): Revert "rgw: fix handler leak in handle_request"
- This reverts commit eba314a811cd98a79f483dc7a9128fe76c722c78.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
- 06:11 PM rgw Feature #3402 (Fix Under Review): rgw: improve tests for multipart upload
- 06:10 PM rgw Feature #3634 (Fix Under Review): rgw: improve teuthology radosgw-admin test
- 06:09 PM Bug #3633 (Resolved): mon: clock drift errors not reported by ceph status
- commit:310112f702d14294e6ba48f8af41a306288cba65
- 06:09 PM Revision eb997e25 (ceph): Merge pull request #31 from chrisglass/expose_cluster_stats_to_python
- Added python wrapper to rados_cluster_stat
- 05:59 PM rbd Bug #3518 (Can't reproduce): rbd import file --format 2 creates an image named '--format'
- 05:59 PM rbd Bug #3518: rbd import file --format 2 creates an image named '--format'
- It seems that this no longer happens as of e6f284e945f45e39c57921149d4551d9e78557a5,
so closing non-reproducible.
- 05:06 PM CephFS Bug #3773: mds crashed at LogEvent::decode
- Okay, I gathered up a core file, a high-debug MDS log, and the log with the bad event (and the bad event itself) in t...
- 02:05 PM CephFS Bug #3773: mds crashed at LogEvent::decode
- I'll at least start this off.
- 04:54 PM Revision c8f3fd6e (ceph): marginal: Remove broken symlinks
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 04:47 PM Messengers Bug #2569: msgr: connect_rank crash
- I believe this was caused by some issues which we decided not to backport the fixes for due to their size; Sage can c...
- 04:43 PM Messengers Bug #2569: msgr: connect_rank crash
- hit this on a mixed cluster running argonaut v0.48.3 and v0.56 [ ceph version 0.56-193-g00898c1]
monitors,mds,osds...
- 04:37 PM rbd Bug #3688 (Won't Fix): rbd allows image of size 0 to be created
- I claim that zero-sized images are legal, if not particularly useful in that size...but one might well want to create...
- 04:15 PM Bug #3770: OSD crashes on boot
- root@ms-be1003:/var/lib/ceph/osd/ceph-27# find current/meta/ | tee ~/ceph-osd.27.meta | wc -l
42992
Attached.
- 04:02 PM Bug #3770: OSD crashes on boot
- root@ms-be1003:/var/lib/ceph/osd/ceph-27/current/4.f9_head# attr -lq $PWD | while read attr; do echo $attr; attr -q -...
- 02:27 PM Bug #3770 (Need More Info): OSD crashes on boot
- From the backtrace:
pgid = {m_pool = 4, m_seed = 249, m_preferred = -1}
Based on the info attr, we try to...
- 04:04 PM Bug #3750 (Resolved): Possible Ceph 5-minute quick start guide typo
- Documentation described making the call from the server console, which should work as described. Added -a so that it ...
- 03:52 PM Bug #3780 (Won't Fix): pg_num inappropriately low on new pools
- Version: 0.48.2-0ubuntu2~cloud0
On a Ceph cluster with 18 OSDs, new object pools are being created with a pg_num o...
- 03:08 PM rgw Bug #3778: document procedure for enabling subdomain S3 api calls
- The documentation should note that the
@rgw dns name = {hostname}@
option must be set in the
@[client.radosgw.g...
- 11:13 AM rgw Bug #3778 (Resolved): document procedure for enabling subdomain S3 api calls
- The process for setting up a server that handles subdomain API requests is not documented. If possible we should add ...
- 03:07 PM Documentation #3711 (In Progress): crush-map.rst: choose firstn talks about "N", but does not cle...
- 03:05 PM devops Documentation #2886 (In Progress): doc: crush location tricks, ceph.conf, automatic host=
- 02:23 PM rbd Subtask #3741: krbd: rework request tracking code
- I am leaving shortly for a few hours. In reviewing this
new code I find a few things that make it a little hard
ma...
- 01:00 PM rbd Subtask #3741: krbd: rework request tracking code
- I did some testing yesterday and found that I got I/O errors
while running xfstests. This was unexpected; I thought...
- 01:43 PM Revision 797b3db3 (ceph): Added python wrapper to rados_cluster_stat
- The new get_cluster_stats() method on the rados.Rados object calls
the rados_cluster_stat() function in the librados ...
- 12:51 PM Bug #2533 (Duplicate): osd: watchers tracked by entity_name_t, not by cookie
- 12:48 PM Feature #3769: osd: scrub should verify snap collection existence, membership
- Written, just needs to be ported to Bobtail
- 09:40 AM Feature #3769 (In Progress): osd: scrub should verify snap collection existence, membership
- 12:47 PM Bug #3736 (In Progress): kernel build: failures starting in 3.8-rc1
- 12:02 PM Bug #3736: kernel build: failures starting in 3.8-rc1
- The remaining issue is that the patch we apply to scripts/package/builddeb to build the perf tools is out of date. I...
- 12:45 PM Bug #3702 (New): OSD SIGABRT during startup
- 12:40 PM Bug #3617 (Resolved): Ceph doesn't support > 65536 PGs(?) and fails silently
- 09:35 AM Bug #3617: Ceph doesn't support > 65536 PGs(?) and fails silently
- How's the testing coming along, Sage?
- 12:39 PM Bug #3695: monitor crashed after an upgrade in Monitor::timecheck
- Believed fixed by patch to 3633
684d4ba242b26828bd7927860226bfc8a0cfcc2b
- 12:35 PM Bug #3650 (Can't reproduce): osd: crash in Reset state -> start_peering_interval -> on_change -> ...
- Looked into the core dump, can't see how this happened.
- 12:30 PM Bug #3591 (Closed): auth: could not find secret_id=0
- 12:30 PM Bug #3591 (Resolved): auth: could not find secret_id=0
- Resolved by Sage's fix above.
- 12:29 PM Bug #3563 (Closed): osd crashed with error "auth: could not find secret_id=2"
- 12:29 PM Bug #3563 (Resolved): osd crashed with error "auth: could not find secret_id=2"
- Resolved by fix to 3591
- 12:20 PM Bug #3467 (Closed): osd: bad state machine event in start_recoverY_ops
- 12:20 PM Bug #3467 (Won't Fix): osd: bad state machine event in start_recoverY_ops
- If encountered, restart OSD.
- 12:13 PM Bug #3300: ceph::buffer::end_of_buffer isn't caught
- Josh - Is this just a case where the documentation needs to be updated?
- 11:46 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
- The same issue exists with the debian packages. We have an explicit dependency on python, but not on perl. I don't ...
- 10:55 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
- Can we check to ensure perl is not used elsewhere?
Are there guidelines that are provided to the developers that spe...
- 10:06 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
- I hate to see a dependency like perl get added for a oneliner perl regex. Is this the only place perl is used? Can ...
- 09:43 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
- backport to bobtail
- 11:26 AM Tasks #3779 (Resolved): update osd config ref as appropriate
- I'm not sure what our update policies on the docs are, but the defaults named in http://ceph.com/docs/master/rados/co...
- 11:11 AM rgw Cleanup #3777 (Resolved): rgw: audit code for reading NULL env variables
- Similar to the issue that triggered #3735
- 10:25 AM Bug #3647 (Can't reproduce): forgot the auth options for Cephx and added them later: Get msg: 7f...
- 10:19 AM rgw Bug #3735 (Closed): rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
- 10:19 AM rgw Bug #3735 (Resolved): rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
- 10:00 AM rgw Bug #3735: rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
- commit:e1da85f286838cdd3a6329840cec748c6a11fd26
- 09:57 AM Bug #3747: PGs stuck in active+remapped
- Sage Weil wrote:
> commit:f83fcf63a928fdb8ab4d604bdce596c0c4afd854
oops, wrong bug!
- 09:45 AM Bug #3747 (Resolved): PGs stuck in active+remapped
- commit:f83fcf63a928fdb8ab4d604bdce596c0c4afd854
- 09:55 AM CephFS Feature #3621 (Closed): qa: add knfsd reexport tests to qa suite
- 09:52 AM CephFS Feature #3621: qa: add knfsd reexport tests to qa suite
- commit:aaa03bbcd2549a38f962a61fc63be16cca3a6d90 in teuthology.git
- 09:34 AM Bug #3776 (Resolved): Need doc describing how to alter our log rotation
- If a user has a small to moderate size of root disk, they will probably have to modify the log rotation process for c...
- 09:32 AM Bug #3661 (Resolved): mon: idle/empty osds marked down after 15 min
- 08:34 AM Feature #3775: log: stop logging in statfs reports usage above some threshold
- Sam,
That is a cool idea. I will open a doc bug for that. Providing instructions for those with smaller root dri...
- 06:32 AM Feature #3775: log: stop logging in statfs reports usage above some threshold
- The easiest solution for this might be to adjust the default logrotate script (src/logrotate.conf) to use the size pa...
- 03:52 AM Revision 59aad347 (ceph): configure.ac: check for org.junit.rules.ExternalResource
- Check for org.junit.rules.ExternalResource if build with
--enable-cephfs-java and --with-debug. Checking for junit4
i...
- 01:13 AM Revision 12af11a1 (ceph): src/java/Makefile.am: fix default java dir
- Fix default javadir in src/java/Makefile.am to $(datadir)/java
since this is the common data dir for java files.
Sig...
- 01:13 AM Revision 9b167b46 (ceph): ceph.spec.in: fix handling of java files
- Fix handling of JAVA (jar) files. Don't move the files around in the install
section since the related Makefile is fi...
- 01:13 AM Revision f027d025 (ceph): ceph.spec.in: rename libcephfs-java package to cephfs-java
- Rename the libcephfs-java package to cephfs-java since the package
contains no (classic) library and RPMLINT complain...
- 01:13 AM Revision d8c4fc5e (ceph): ceph.spec.in: fix libcephfs-jni package name
- Rename libcephfs-jni to libcephfs_jni1 to reflect the SO name/version of
the library and to prevent RPMLINT to compla...
- 01:13 AM Revision aedbb97f (ceph): configure.ac: remove AC_PROG_RANLIB
- Remove already comment out AC_PROG_RANLIB to get rid of warning:
libtoolize: `AC_PROG_RANLIB' is rendered obsolete b...
- 01:13 AM Revision 61437ee2 (ceph): configure.ac: change junit4 handling
- Change handling of --with-debug and junit4. Add a new conditional HAVE_JUNIT4
to be able to build ceph-test package a... - 12:11 AM Revision 00898c18 (ceph): rbd: allow copy of zero-length images. Includes simple test.
- Fixes: #3765
Signed-off-by: Dan Mick <dan.mick@inktank.com>
- 12:10 AM Revision 1c3d6840 (ceph): doc/install/debian.rst: fix typo in link ref; broke doc build
- Signed-off-by: Dan Mick <dan.mick@inktank.com>
01/09/2013
- 11:11 PM Revision 133e4e34 (ceph): Merge branch 'next'
- Want to get various rbd-related fixes together for upgrade testing
- 10:40 PM Revision 48f13946 (ceph): ReplicatedPG: increment scrubber.errors rather than errors
- Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
- 05:37 PM Bug #3705 (Resolved): osd: crash in scrub finalize [argonaut]
- commit:5b12b514b047a8a46cc5549bd94b398289b9b5f6
- 05:08 PM rbd Bug #3766 (Resolved): rbd resize command fails on a mixed node cluster when it is a copied rbd im...
- I'm calling this fixed, then.
- 04:54 PM rbd Bug #3766: rbd resize command fails on a mixed node cluster when it is a copied rbd image and whe...
- This works fine on the master branch that has a fix for it :
ceph version 0.56-193-g00898c1 (00898c1860e8ae95b52192...
- 01:44 PM rbd Bug #3766 (Need More Info): rbd resize command fails on a mixed node cluster when it is a copied ...
- I think this might be e1776809031c6dad441cfb2b9fac9612720b9083, which is still in next. Can you try an rbd client fr...
- 04:35 PM Feature #3775: log: stop logging in statfs reports usage above some threshold
- Deb Barba <deb.barba@inktank.com>
3:13 PM (1 hour ago)
to Dan
so, as I explained in chat.
i am again seeing ...
- 04:34 PM Feature #3775 (New): log: stop logging in statfs reports usage above some threshold
- Add a 'log stop on utilization = .95' option that will make the log code print one last line like
--- suspending l...
- 04:31 PM Bug #3774 (Resolved): osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
- These should get put at the top of the scrub queue in a way that still honors all the scheduling.
The problem is t... - 04:27 PM rbd Bug #3765 (Resolved): rbd cp of a zero sized image succeeds with error
- 04:27 PM rbd Bug #3765: rbd cp of a zero sized image succeeds with error
- Fixed, test added, in master:
commit:00898c1860e8ae95b5219257d1635b15ccdce5c1
- 11:44 AM rbd Bug #3765: rbd cp of a zero sized image succeeds with error
- 02:58 PM CephFS Bug #3773 (Can't reproduce): mds crashed at LogEvent::decode
- ceph version: 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
I had a cluster [burnupi06, burnupi07, burnupi08] ...
- 02:32 PM rbd Bug #3753 (Resolved): rbd copy command reports error even though copy is successful on a mixed no...
- I believe this to have been fixed by the fix for #3744.
- 01:47 PM rbd Bug #3753: rbd copy command reports error even though copy is successful on a mixed node cluster
- Tamil, does this still happen with the fix in wip-no-cls-lock (and now in next) for 3744?
- 02:14 PM Bug #3772 (Can't reproduce): osd: osd_disk_threads = 5 seems to hang recovery
- reported on IRC, should be easy to reproduce.
we may want to change the default to 2 in order to avoid hiding thes...
- 01:51 PM rbd Bug #3697 (Can't reproduce): rbd copy.sh test failing in nightly
- unable to reproduce so far
- 12:05 PM CephFS Feature #3570 (In Progress): teuthology: mds thrasher
- 11:47 AM rbd Feature #2256: rbd: parallelize deletions
- 11:46 AM rbd Feature #2297: ObjectCacher: mark buffers mergeable for ksm
- 11:46 AM rbd Bug #3518: rbd import file --format 2 creates an image named '--format'
- 11:46 AM rbd Feature #3635: rbd cli: call "udevadm settle" after use of add/remove kernel interface
- 11:42 AM Bug #3744 (Resolved): librbd: need to handle older OSDs that don't have cls_lock
- commit:4483285c9fb16f09986e2e48b855cd3db869e33c in next
- 11:28 AM Bug #3771: ceph does not have startup scripts in Centos
- Gary found that the installation script was commented out 2011-10-17
> commit 9baf5ef4f35c38d7fbaa70bde8f2c9383b2f...
- 11:13 AM Bug #3771 (Resolved): ceph does not have startup scripts in Centos
- I did a basic ceph v0.56 installation on Centos 6.3
I have rebooted my nodes, and find that ceph is not startup up a...
- 10:58 AM CephFS Bug #3681: kclient fsx fails nightly
- Proposed fix to set i_size before the setattr request:
This will resolve the above issue, because the cap flush on...
- 09:59 AM Bug #3683 (Can't reproduce): mon: leak of MMonPaxos
- 09:58 AM Bug #3683: mon: leak of MMonPaxos
- I can't for the life of me get to reproduce this leak. In the meantime, Sage submitted a patch to msg/Pipe.cc [1] tha...
- 07:17 AM Bug #3695: monitor crashed after an upgrade in Monitor::timecheck
- I've been unable to reproduce this bug, but the cause was pretty obvious, so I pushed a fix that should deal with thi...
- 03:39 AM Revision 62e721a9 (ceph): librados: add aio stat tests
- Implement simple write-stat test, and a write-stat-remove-stat test cycle.
Signed-off-by: Filippos Giannakos <philip... - 03:38 AM Revision 879578c1 (ceph): librados: implement aio_stat
- Implement aio stat and also export this functionality to the C API.
Signed-off-by: Filippos Giannakos <philipgian@gr... - 02:32 AM Revision 5b12b514 (ceph): osd: make missing head non-fatal during scrub
- If we encounter a scrub without a preceding head, warn instead of
crashing. Note that this is still something we ca... - 02:29 AM Revision e1da85f2 (ceph): rgw: Fix crash when FastCGI frontend doesn't set SCRIPT_URI
- Fixes: #3735
Signed-off-by: caleb miles <caleb.miles@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com> - 02:28 AM Revision eba314a8 (ceph): rgw: fix handler leak in handle_request
- Fixes: #3682
Signed-off-by: caleb miles <caleb.miles@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com> - 02:25 AM Revision 4483285c (ceph): librbd: Allow get_lock_info to fail
- If the lock class isn't present, EOPNOTSUPP is returned for lock calls
on newer OSDs, but sadly EIO on older; we need... - 02:21 AM Revision 77ddf276 (ceph): doc/release-notes: v0.48.3argonaut
- Signed-off-by: Sage Weil <sage@inktank.com>
- 12:23 AM Bug #3770 (Resolved): OSD crashes on boot
- One of my 0.56.1 OSDs crashed and couldn't boot: it was reaching tp_op heartbeats, and even after increasing that I w...
01/08/2013
- 10:21 PM Feature #3769 (Resolved): osd: scrub should verify snap collection existence, membership
- and, hopefully, backport this to argonaut
- 09:39 PM Feature #3651 (In Progress): osd: deep scrub should hash omap
- 07:57 PM Revision 573f5315 (ceph): marginal/multiclient: Matching tests for kclient
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 07:54 PM Revision 14385a66 (ceph): marginal/multiclient: Add three client cluster
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 07:51 PM Revision a4df5238 (ceph): marginal/multiclient: Adding ior test to marginal
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 06:36 PM Revision 1e03fe18 (ceph): marginal/multiclient: Add a test for fsx-mpi
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 06:23 PM Revision c07a4cb6 (ceph): marginal/multiclient: New task to run mdtest
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 06:11 PM Revision f17847e5 (ceph): task/kclient: chmod root to 1777.
- Signed-off-by: Greg Farnum <greg@inktank.com>
- 05:27 PM rbd Bug #3765: rbd cp of a zero sized image succeeds with error
- I looked into this; it happens because clip_io() (called from read_iterate()) tries to validate
that writing at offs... - 03:23 PM rbd Bug #3765 (Resolved): rbd cp of a zero sized image succeeds with error
- ceph version 0.56-131-gd283abd (d283abdf50b1e4429b775680bfae1bb20c75306b)
while I am still surprised about why we ne... - 04:45 PM Bug #3768 (Resolved): perl is required for logrotate, we need to include Perl as a dependency
- logrotate for ceph (/etc/logrotate.d/ceph) uses perl commands
if perl is not installed, logrotate fails
if logrotat... - 04:29 PM CephFS Bug #3597: ceph-fuse: denying root access
- Is root actually a member of the fuse group? If not that would be correct behavior.
- 04:07 PM Revision f8958463 (ceph): task/mpi: Allow working directory to be specified
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 03:46 PM rbd Bug #3766 (Resolved): rbd resize command fails on a mixed node cluster when it is a copied rbd im...
- ubuntu@burnupi24:/var/log/ceph$ ceph -v
ceph version 0.56-131-gd283abd (d283abdf50b1e4429b775680bfae1bb20c75306b)
... - 03:42 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
- I think so.
But first let's verify it passes. - 12:43 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
- Should we revert that teuthology commit, then?
- 12:31 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
- There was a bug in the kernel in O_CREAT permission checking for non-root users. It's fixed in the testing branch. ...
- 10:49 AM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
- This is weird. Tamil says this one has never passed, but we can both run it locally fine and it passes in the ceph-fu...
- 09:39 AM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
- I made a change to the cfuse task to chmod 1777 the ceph root dir after it's mounted. I think we should do the same f...
- 09:21 AM Bug #3752 (Resolved): fsync-tester script need to be fixed to run in the nightlies
- log: ubuntu@teuthology:/a/teuthology-2013-01-05_22:28:52-regression-next-testing-basic/35949
35949: (190s) collect... - 03:34 PM Revision 16248121 (ceph): task: A task to setup mpi
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 03:33 PM Revision e88c0fc8 (ceph): task/ceph-fuse: chmod root to 1777
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 03:32 PM Revision 4ed20ae8 (ceph): task/pexec: Add barrier capability
- This patch adds the ability to barrier between
parallel exec tasks so that all tasks will perform
the following step ... - 03:31 PM Revision 35320083 (ceph): task/pexec: More fixes for all case, exec on hosts
- We don't want to do an exec per role, but per-host. We
were already doing an exec per host, but the names were confu... - 03:29 PM Revision 081a80f8 (ceph): task/pexec: Fix when 'all' is used
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 03:25 PM Revision d44fb147 (ceph): radosgw-admin.py: Increase test coverage to current admin feature set.
- Signed-off-by: caleb miles <caleb.miles@inktank.com>
- 12:58 PM Feature #3760: osd: maintain checksum on collection contents
- It wasn't clear to me from the description, but we are of course talking about maintaining in the HashIndex a checksu...
- 12:13 PM Feature #3760 (Rejected): osd: maintain checksum on collection contents
- Currently, there is no way for an OSD to detect erroneously missing objects in a pg collection. A scrub, therefore, ...
- 12:33 PM RADOS Feature #3764 (New): osd: async replicas
- The following is more a topic for conversation than a feature:
Currently, latency on any operation is limited by t... - 12:23 PM rbd Feature #3763 (Resolved): krbd: handle flattening of mapped image
- An rbd client receives notice if the snapshot context for
a mapped rbd image has changed. It is possible for the
s... - 12:19 PM Linux kernel client Bug #3762 (Duplicate): kernel osd client: verify support for multiple ops per request
- In order to support layered rbd images, the osd client needs
to support multiple ops in a single osd request.
Loo... - 12:15 PM rbd Feature #3761 (Resolved): kernel messenger: need to support multiple ops per request
- The kernel messenger currently gets message data from either
a bio list or a page vector. That is one or the other,... - 12:13 PM Bug #3759 (Duplicate): osd: maintain checksum on collection contents
- 12:11 PM Bug #3759 (Duplicate): osd: maintain checksum on collection contents
- Currently, there is no way for an OSD to detect erroneously missing objects in a pg collection. A scrub, therefore, ...
- 12:08 PM rbd Tasks #2853: krbd: read path
- This task depends on the completion of the following others
before it can be completed:
3741 krbd: rework request ... - 12:07 PM Feature #3758 (Rejected): osd: incremental object checksumming
- Currently, scrub can only compare the checksums between replicas. If an inconsistency is found between two replicas,...
- 12:07 PM rbd Subtask #2854: krbd: write path
- Work on this won't really begin until the read path work
has completed (http://tracker.newdream.net/issues/2853).
- 12:06 PM rbd Subtask #2854: krbd: write path
- OK, I'm going to interpret this as:
Any write operation on a layered image will be preceded
by an existence c... - 12:04 PM CephFS Feature #626 (Closed): qa: add IOR, rompio, or other parallel workloads suite
- Added tests to the _marginal_ qa suite that run IOR, mdtest, and fsx-mpi.
- 11:48 AM Feature #3756 (Duplicate): Watch/Notify cleanup
- 11:41 AM Feature #3756 (Duplicate): Watch/Notify cleanup
- The current design is rather fragile particularly with respect to the locking and ref counting.
The result of this... - 11:47 AM Feature #3757 (Resolved): osd: Watch/Notify cleanup
- The current design is rather fragile particularly with respect to the locking and ref counting.
The result of this... - 11:24 AM Bug #3744: librbd: need to handle older OSDs that don't have cls_lock
- Actually, rados lock list should continue to fail.
- 11:10 AM Documentation #3322: doc: Explain multi-tenant CephFS
- Where is this located? I wasn't able to find it.
- 11:00 AM rbd Tasks #3755 (Resolved): krbd: use new request tracking code for sync object operations
- The last request type still using the old request tracking code
is for handling synchronous operations. There are t... - 10:58 AM rbd Feature #3754 (Closed): krbd: use new request tracking code for notify ack
- Two request types remain that still use the old request
tracking mechanism. One of them is sending acknowledgements... - 09:54 AM rbd Bug #3753 (Resolved): rbd copy command reports error even though copy is successful on a mixed no...
- ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
On a mixed node cluster running argonaut[burnupi21,... - 09:39 AM CephFS Feature #3543: mds: new encoding
- I'm going to get started on this (mostly just figuring out current state, probably) today.
- 09:28 AM Bug #3695: monitor crashed after an upgrade in Monitor::timecheck
- 06:54 AM Bug #3695 (In Progress): monitor crashed after an upgrade in Monitor::timecheck
- 08:47 AM Linux kernel client Bug #3751: krbd: fix type of snap_id local variable
- I have a fix for this and I'll post it for review later
today.... - 08:47 AM Linux kernel client Bug #3751 (Resolved): krbd: fix type of snap_id local variable
- The type of the snap_id local variable in rbd_dev_v2_snap_info()
is defined with the wrong byte order. - 06:43 AM Bug #3748: ceph osd dump --format=json includes non-JSON line
- One other option would be to provide "standard" fields for status output when using json, regardless of any other exp...
- 05:08 AM Revision 920f82e8 (ceph): v0.48.3argonaut
- 04:51 AM Bug #3750 (Resolved): Possible Ceph 5-minute quick start guide typo
- I believe that the Ceph quick start guide should specify
@sudo service ceph -a start@
instead of the current
@... - 04:51 AM Revision f07921be (ceph): doc/install: new URLs for argonaut vs bobtail
- Also restructure the document a bit to make the choice of packages more
clear.
Signed-off-by: Sage Weil <sage@inktan... - 04:46 AM Revision 72674ad4 (ceph): doc/release-notes: v0.56.1
- Signed-off-by: Sage Weil <sage@inktank.com>
- 03:40 AM Bug #3747: PGs stuck in active+remapped
- I did a "ceph osd out 0; sleep 30; ceph osd in 0" and out of those 61 active+remapped pgs, 5 went into active+remappe...
- 12:14 AM Revision 1b194b25 (ceph): Merge branch 'wip-stripe-gran'
- Reviewed-by: Greg Farnum <greg@inktank.com>
01/07/2013
- 11:50 PM Revision 26e8438a (ceph): test: enforce -ENOTCONN contract in libcephfs
- Tests all relevant calls for -ENOTCONN when used with an unmounted
ceph_mount_info param.
Signed-off-by: Noah Watkin... - 11:49 PM Revision 5c58aa96 (ceph): libcephfs: return -ENOTCONN when call unmounted
- Adds -ENOTCONN return value for stat, fchmod, fchown, lchown.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> - 11:16 PM Revision f83fcf63 (ceph): PG: set DEGRADED in Active AdvMap handler based on pool size
- Otherwise, if the acting set does not change, the pg might
not show up as degraded if the pool size now exceeds the
a... - 11:04 PM Revision c4121093 (ceph): libcephfs: clarify interface return value
- Document that ceph_get_stripe_unit_granularity may return an error code
(e.g. -ENOTCONN). The interface requires a mo... - 09:33 PM Revision e4a54162 (ceph): v0.56.1
- 09:12 PM Revision c8f8c7e6 (ceph): Merge branch 'next'
- 09:08 PM Revision 9aecacda (ceph): msg/Pipe: prepare Message data for wire under pipe_lock
- We cannot trust the Message bufferlists or other structures to be
stable without pipe_lock, as another Pipe may claim... - 09:08 PM Revision 299dbad4 (ceph): msgr: update Message envelope in encode, not write_message
- Fill out the Message header, footer, and calculate CRCs during
encoding, not write_message(). This removes most modi... - 09:08 PM Revision 35d2f583 (ceph): msg/Pipe: encode message inside pipe_lock
- This modifies bufferlists in the Message struct, and it is possible
for multiple instances of the Pipe to get referen... - 09:08 PM Revision 9b23f195 (ceph): msg/Pipe: associate sending msgs to con inside lock
- Associate a sending message with the connection inside the pipe_lock.
This way if a racing thread tries to steal thes... - 09:08 PM Revision 6229b5a0 (ceph): msg/Pipe: fix msg leak in requeue_sent()
- The sent list owns a reference to each message.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from comm... - 09:04 PM Revision 1b39b316 (ceph): Merge branch 'wip-3678-b' into next
- Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com> - 09:02 PM Revision 40706afc (ceph): msgr: update Message envelope in encode, not write_message
- Fill out the Message header, footer, and calculate CRCs during
encoding, not write_message(). This removes most modi... - 09:02 PM Revision d16ad926 (ceph): msg/Pipe: prepare Message data for wire under pipe_lock
- We cannot trust the Message bufferlists or other structures to be
stable without pipe_lock, as another Pipe may claim... - 09:01 PM Revision 6a00ce0d (ceph): osdc/Objecter: fix linger_ops iterator invalidation on pool deletion
- The call to check_linger_pool_dne() may unregister the linger request,
invalidating the iterator. To avoid this, inc... - 08:58 PM Revision 62586884 (ceph): osdc/Objecter: fix linger_ops iterator invalidation on pool deletion
- The call to check_linger_pool_dne() may unregister the linger request,
invalidating the iterator. To avoid this, inc... - 06:39 PM Revision 213e3559 (ceph): osd: fix race in do_recovery()
- Verify that the PG is still RECOVERING or BACKFILL when we take the pg
lock in the recovery thread. This prevents a ... - 06:38 PM Revision e410d1a0 (ceph): ReplicatedPG: requeue waiting_for_ondisk in apply_and_flush_repops
- Fixes: #3722
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com> - 06:34 PM Revision 4c9f4c3c (ceph): ceph-fuse: rename ceph_ll_* to fuse_ll_*
- To not conflict with future linuxbox pull for nfs-ganesha.
Signed-off-by: David Zafman <david.zafman@inktank.com>
Re... - 04:04 PM CephFS Feature #3749 (Resolved): Remove forced synchronization from Java bindings
- Remove "synchronized" keyword from native interface. This was originally added when we were seeing some pthread mutex...
- 03:58 PM Bug #3748 (Resolved): ceph osd dump --format=json includes non-JSON line
- ceph osd dump --format=json includes the non-JSON "dumped osdmap epoch N" at the top of the output, which of course b...
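A minimal sketch of how a script consuming this output might cope until the fix lands, assuming the only problem is one or more plain-text status lines (such as "dumped osdmap epoch N") ahead of the JSON document. The helper name `parse_osd_dump` is hypothetical and not part of Ceph.

```python
import json


def parse_osd_dump(raw):
    """Parse text captured from `ceph osd dump --format=json`.

    Hypothetical helper: skips leading lines until the remainder of the
    text parses as a single JSON document.
    """
    lines = raw.splitlines()
    for i in range(len(lines)):
        try:
            # Try to parse everything from line i onward as JSON.
            return json.loads("\n".join(lines[i:]))
        except ValueError:
            # Not valid JSON yet; drop one more leading line and retry.
            continue
    raise ValueError("no JSON document found in output")
```

Well-formed output parses on the first attempt, so the workaround stays harmless once the stray line is gone.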
- 03:42 PM Bug #3747 (Closed): PGs stuck in active+remapped
- About a week ago I doubled the number of OSDs in my cluster from 24 to 48 and, in the same day, adjusted CRUSH's defa...
- 03:35 PM rbd Subtask #2854: krbd: write path
- rbd write path: 'guard' in the sense that the write includes a check to verify the object already exists.
- 03:22 PM rbd Subtask #2854: krbd: write path
- Pretty sure this is about the rbd locking and fencing.
- 03:11 PM rbd Subtask #2854: krbd: write path
- I'm about to mark bug 3418 as a duplicate of this one.
I'm adding the following from that bug here first.
I did... - 03:11 PM rbd Subtask #2854: krbd: write path
- I'm not sure what "guard writes" is supposed to mean.
But I'm going to interpret it as simply implementing the
writ... - 03:26 PM CephFS Bug #3746 (Rejected): kclient mmap doesn't zero past EOF
- Error coming from fsx:
INFO:teuthology.orchestra.run.out:Mapped Write: non-zero data past EOF (0xb826) page offset... - 03:14 PM rbd Feature #3419 (Duplicate): krbd: copy-up on write to clone
- This is a duplicate of http://tracker.newdream.net/issues/2855.
- 03:14 PM rbd Subtask #2855: krbd: copy-up on write to clone
- I don't know how to change the one-line bug description or I
would.
I need some clarification about the intended ... - 03:12 PM rbd Feature #3418 (Duplicate): krbd: write path (layering)
- This is a duplicate of http://tracker.newdream.net/issues/2854.
- 03:07 PM rbd Feature #3417 (Duplicate): krbd: read path (layering)
- This is a duplicate of http://tracker.newdream.net/issues/2854.
- 03:06 PM rbd Tasks #2853: krbd: read path
- I'm about to mark bug 3417 as a duplicate of this.
I'm putting this bit of info from there here first.
Work o... - 03:05 PM rbd Feature #3416 (Duplicate): krbd: open parent on open
- Marking this as a duplicate of http://tracker.newdream.net/issues/2852.
- 02:51 PM rbd Bug #3743: krbd: errors on submitted requests are ignored
- If I could figure out how, I'd change the title of this
to say "krbd" rather than "rbd" to help make it clear
which... - 02:27 PM rbd Bug #3743 (Won't Fix): krbd: errors on submitted requests are ignored
- When a Linux request comes down to the rbd driver via rbd_rq_fn(),
rbd_dev_do_request() is called after validating t... - 02:50 PM rbd Bug #3745 (Rejected): krbd: individual response errors are ignored
- A Linux I/O request on an rbd image is broken into one or
more rbd requests, one request directed to each osd object... - 02:41 PM Bug #3744 (Resolved): librbd: need to handle older OSDs that don't have cls_lock
- Older OSDs didn't have libcls_lock, and will fail lock operations; this means
virtually all rbd operations and rados... - 01:22 PM Bug #3722 (Resolved): osd: indefinitely hung request on stable cluster
- commit:e410d1a066b906cad3103a5bbfa5b4509be9ac37
- 01:22 PM Bug #3736: kernel build: failures starting in 3.8-rc1
- Sure enough, this is the commit that causes the problem:
af3df2c perf tools: Try to build Documentation when insta... - 11:48 AM Bug #3736: kernel build: failures starting in 3.8-rc1
- Looks like commit 6ca2a9c is the first one in that branch
that fails. It has a parent ce37f40 that succeeds.
I'v... - 10:24 AM Bug #3736: kernel build: failures starting in 3.8-rc1
- Heard back from Neil as well as Vlad Yasevich about my
proposed fix and they both ack'd it. Linus was in on
the di... - 09:07 AM Bug #3736: kernel build: failures starting in 3.8-rc1
- Despite a working build of the *kernel*, the package build
overall is still failing. It has something to do with bu... - 08:52 AM Bug #3736: kernel build: failures starting in 3.8-rc1
- Neil Horman sent a response to my message and suggested
three possible alternatives to fix the underlying problem,
... - 05:42 AM Bug #3736: kernel build: failures starting in 3.8-rc1
- I changed our config file, found in the git repository
autobuild-ceph in the file "kernel-config" in the way
descri... - 05:40 AM Bug #3736: kernel build: failures starting in 3.8-rc1
- I'm retroactively updating this so a bit about what's been
done gets documented.
The problem was in the Kconfig f... - 05:35 AM Bug #3736 (Resolved): kernel build: failures starting in 3.8-rc1
- Kernels as of version 3.8-rc1 are not properly building in
autobuilder. The initial symptom was that the config pha... - 01:16 PM Bug #3678 (Resolved): osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
- 01:16 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
- commit:1b39b31678aea8c5bbdb38811b3919525228d10f
- 01:01 PM Bug #3734 (Resolved): osd/objecter: misdirected op in librados api tests
- 12:19 PM CephFS Cleanup #3742 (Resolved): Remove old Hadoop wrappers and configuration options
- I think it's likely that the current Hadoop shim is at least at feature parity with the old wrappers.
- 12:16 PM Bug #3702: OSD SIGABRT during startup
- Dan Mick wrote:
> Is this related to rbd, or should it be in category 'ceph'?
Ah, yes, it should. Thank you for c... - 11:31 AM Bug #3702: OSD SIGABRT during startup
- Is this related to rbd, or should it be in category 'ceph'?
- 12:07 PM rbd Subtask #3741: krbd: rework request tracking code
- ...
- 11:54 AM rbd Subtask #3741 (Resolved): krbd: rework request tracking code
- This is actually work that's mostly complete, but it never
got a bug assigned to it.
In order to handle layering ... - 11:26 AM Bug #3632 (Resolved): occasional testrados failure: process_8 exited with a signal
- this is probably #3734, now fixed.
- 11:09 AM rbd Subtask #2852: krbd: open parent on open
- This work is essentially done, and has been since
October 2012 (or even earlier). However I held off
posting it fo... - 11:00 AM Linux kernel client Bug #3740 (Resolved): ceph-client: change to be based on 3.8-rc2
- Our current ceph-client tree is based on Linux 3.6.
That is fairly old code (late September, 2012). We
should upda... - 10:12 AM Feature #3739 (Resolved): osd: repair object size vs object_info_t mismatches
- If the object_info_t size doesn't match the on-disk file/object size, we need to repair it. This means proposing a s...
- 10:02 AM CephFS Bug #3726 (Resolved): Enforce Ceph's minimum stripe size in the java bindings
- 10:02 AM CephFS Bug #3726 (Closed): Enforce Ceph's minimum stripe size in the java bindings
- 09:21 AM CephFS Bug #3738 (Resolved): kclient fsx truncate/write multi-client race
- This bug is similar to #3681, but occurs only in the non-exclusive case (multiple clients), where a truncate doesn'... - 09:09 AM CephFS Bug #3681: kclient fsx fails nightly
- The race here is between a truncate down, and completion of osd write ops triggering a cap flush. The exact order th...
- 06:30 AM rbd Bug #3737 (Resolved): Higher ping-latency observed in qemu with rbd_cache=true during disk-write
- Hi Josh,
as per our short conversation in IRC-#ceph there is an issue with latency/responsiveness with rbd_cache e... - 04:38 AM Revision 4cfc4903 (ceph): msg/Pipe: encode message inside pipe_lock
- This modifies bufferlists in the Message struct, and it is possible
for multiple instances of the Pipe to get referen... - 04:38 AM Revision a058f161 (ceph): msg/Pipe: associate sending msgs to con inside lock
- Associate a sending message with the connection inside the pipe_lock.
This way if a racing thread tries to steal thes... - 04:38 AM Revision 2a1eb466 (ceph): msg/Pipe: fix msg leak in requeue_sent()
- The sent list owns a reference to each message.
Signed-off-by: Sage Weil <sage@inktank.com> - 04:18 AM rgw Bug #3735: rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
- Here's the fix I used on my system to fix the problem. The S3 service is set at the root of the virtual server so "" ...
- 03:07 AM rgw Bug #3735 (Closed): rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
- I'm using lighttpd as a Fast CGI front end for radosgw and it doesn't set SCRIPT_URI environment variable.
So the ...
01/06/2013
- 10:50 PM Bug #3734 (Fix Under Review): osd/objecter: misdirected op in librados api tests
- wip-3734
- 10:41 PM Bug #3734: osd/objecter: misdirected op in librados api tests
- epoch 328:...
- 10:15 PM Bug #3734 (Resolved): osd/objecter: misdirected op in librados api tests
- ...
- 03:10 PM Bug #3715 (Duplicate): Crash during 0.55 -> 0.56 upgrade
- this was #3731
- 02:38 PM Bug #3722: osd: indefinitely hung request on stable cluster
- 02:34 PM Bug #3678 (Fix Under Review): osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MN...
- YAY, wip-3678 is consistently passing now.
- 05:37 AM Revision a10950f9 (ceph): os/FileJournal: include limits.h
- Needed for IOV_MAX.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit ce49968938ca3636f48fe5431... - 04:54 AM Revision ce499689 (ceph): os/FileJournal: include limits.h
- Needed for IOV_MAX.
Signed-off-by: Sage Weil <sage@inktank.com>
01/05/2013
- 09:32 PM Feature #3733 (Closed): osd: update leveldb submodule
- 07:17 PM Revision e9efa332 (ceph): java: add stripe unit granularity tests
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
- 07:12 PM Revision ececcf57 (ceph): java: update javadoc comments
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
- 07:10 PM Revision cdd138da (ceph): java: fix whitespace
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
- 07:08 PM Revision abcda95b (ceph): libcephfs: expose stripe unit granularity
- Assists clients in choosing layout parameters.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> - 07:08 PM Revision 6954bf33 (ceph): java: add support for get_stripe_unit_granularity
- Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Noah Watkins <noahwatkins@gmail.com> - 06:47 PM Documentation #3389 (In Progress): doc: crush docs could use a full example crushmap
- 10:02 AM Bug #3731: rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible protocol change
- Do we have a test that checks our interfaces to
automatically catch inadvertent protocol changes?
If not, we should. - 09:04 AM Bug #3731 (Resolved): rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible prot...
- commit:988a52173522e9a410ba975a4e8b7c25c7801123
- 09:04 AM Bug #3721 (Resolved): filestore: op_seq written in wrong order on non-btrfs
- commit:28d59d374b28629a230d36b93e60a8474c902aa5
- 09:03 AM Bug #3698 (Resolved): filestore: ENOENT on clone
- commit:e89b6ade63cdad315ab754789de24008cfe42b37
- 08:27 AM Feature #3732 (Resolved): osd/mon: report recovery rate (bytes and objects per sec)
- Report the rate of recovery (objects and bytes per second) via the monitor, presumably via 'ceph -w' and similar inte...
- 04:48 AM Revision 415294c0 (ceph): Merge branch 'next'
- 04:47 AM Revision cd194ef3 (ceph): osd: special case CALL op to not have RD bit effects
- In commit 20496b8d2b2c3779a771695c6f778abbdb66d92a we treat a CALL as
different from a normal "read", but we did not ... - 04:47 AM Revision 921e06de (ceph): Revert "OSD: remove RD flag from CALL ops"
- This reverts commit 91e941aef9f55425cc12204146f26d79c444cfae.
We cannot change this op code without breaking compati... - 04:46 AM Revision 988a5217 (ceph): osd: special case CALL op to not have RD bit effects
- In commit 20496b8d2b2c3779a771695c6f778abbdb66d92a we treat a CALL as
different from a normal "read", but we did not ... - 04:46 AM Revision d3abd0fe (ceph): Revert "OSD: remove RD flag from CALL ops"
- This reverts commit 91e941aef9f55425cc12204146f26d79c444cfae.
We cannot change this op code without breaking compati... - 03:51 AM Revision 3a940874 (ceph): libcephfs: delete client after messenger shutdown
- Prevents race between messages being dispatched to the client after the
client has been free'd.
Signed-off-by: Noah ... - 02:02 AM Revision 0978dc49 (ceph): rbd: Don't call ProgressContext's finish() if there's an error.
- do_copy was different from the others; call pc.fail() on error and
do not call pc.finish().
Fixes: #3729
Signed-off-...
01/04/2013
- 09:45 PM Revision 7513e971 (ceph): ReplicatedPG: remove old-head optization from push_to_replica
- This optimization allowed the primary to push a clone as a single push in the
case that the head object on the replic... - 09:44 PM Revision e89b6ade (ceph): ReplicatedPG: remove old-head optization from push_to_replica
- This optimization allowed the primary to push a clone as a single push in the
case that the head object on the replic... - 09:37 PM Revision 6a3d475c (ceph): Merge remote branch 'origin/wip-rbd-watch'
- Reviewed-by: Dan Mick <dan.mick@inktank.com>
- 08:32 PM Revision cd5f2bfd (ceph): ObjectCacher: fix off-by-one error in split
- This error left a completion that should have been attached
to the right BufferHead on the left BufferHead, which wou... - 07:54 PM CephFS Bug #3666 (Resolved): Segfault running test_libcephfs
- commit:3a9408742a8a6cbc870cba543a208285f1a6cec1
- 03:25 PM CephFS Bug #3666: Segfault running test_libcephfs
- I pushed a new wip-client-shutdown. This switches the clean-up order of client/messenger in libcephfs, rather than mo...
- 01:36 PM CephFS Bug #3666: Segfault running test_libcephfs
- Right, I think your fix will work, but it breaks the interface abstraction (messenger is created above the client, de...
- 01:16 PM CephFS Bug #3666: Segfault running test_libcephfs
- This is what I'm running to reproduce the error. It's been running now for an hour on wip-client-shutdown without any...
- 12:57 PM CephFS Bug #3666: Segfault running test_libcephfs
- Rather than moving messenger shutdown into client shutdown?
- 12:48 PM CephFS Bug #3666: Segfault running test_libcephfs
- A similar issue was just handled in the ceph_fuse.cc code. There we just delay deleting the client till the end. Yo...
- 10:41 AM CephFS Bug #3666: Segfault running test_libcephfs
- During unmount, the client is shut down and freed before the messenger. If any messages are delivered after the clien...
- 07:07 PM Revision 802c486f (ceph): config: change default log_max_recent to 10,000
- Commit c34e38bcdc0460219d19b21ca7a0554adf7f7f84 meant to do this but got
the wrong number of zeros.
Signed-off-by: S... - 06:18 PM Revision d6496abf (ceph): remove rbd_header_race test
- This no longer works since export does not do a watch, and the race is
being closed a different way not detectable by... - 06:16 PM Revision 620dd551 (ceph): task: mon_clock_skew_check.py: Check for clock skews on the monitors
- Will run for as long as teuthology runs. By default, fails if any clock
skews higher than 0.05 seconds are detected, ... - 06:11 PM rbd Bug #3729 (Resolved): rbd cp command reports 100% completion even on failure
- commit:0978dc4963fe441fb67afecb074bc7b01798d59d
- 03:12 PM rbd Bug #3729 (Resolved): rbd cp command reports 100% completion even on failure
- ceph version 0.56-109-gd8940d1 (d8940d15c330d05c8a198ff7dde16df748938b65)
when trying to copy rbd image to an alre... - 06:06 PM Bug #3702: OSD SIGABRT during startup
- Sage Weil wrote:
> Was the monitor also running 0.48.2argonaut when osd.131 originally crashed? Or something else?
... - 09:42 AM Bug #3702 (Need More Info): OSD SIGABRT during startup
- 05:54 PM Revision 1a878611 (ceph): regression: include nfs suite
- 05:50 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
- got msgr logs in ubuntu@teuthology:/a/sage-a3/34724, but the crash looked different from the earlier ones (whose logs...
- 05:40 PM Bug #3731 (Fix Under Review): rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompati...
- see wip-3731
- 05:19 PM Bug #3731: rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible protocol change
- Agreed. And let's make sure it's fixed for 0.56.1.
- 05:15 PM Bug #3731: rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible protocol change
- Discussed this with Dan and Sam and I think we just want to roll this patch back and tell people not to use v0.56 for...
- 04:34 PM Bug #3731 (Resolved): rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible prot...
- CEPH_OSD_OP_CALL changed to remove the CEPH_OSD_OP_MODE_RD bit in
91e941aef9f55425cc12204146f26d79c444cfae; however,... - 05:03 PM Revision e88b909a (ceph): task: ceph_manager: add 'get_mon_health' function
- Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
- 03:29 PM CephFS Feature #3730 (Closed): Support replication factor in Hadoop
- In order to support per-file replication values in Hadoop we need to specify that a new file should be generated in a...
- 02:38 PM rbd Bug #3642 (Resolved): librbd: watch is sent with assert version, which fails on resends
- commit:6a3d475cf08eb3051e8cdbce10b17b53c92b9cb5
- 11:31 AM rbd Bug #3642 (Fix Under Review): librbd: watch is sent with assert version, which fails on resends
- in branch wip-rbd-watch
- 01:54 PM CephFS Bug #3726: Enforce Ceph's minimum stripe size in the java bindings
- Also, name it something along the lines of get_stripe_granularity() and not .._min(imum)_ as that isn't entirely accu...
- 01:40 PM CephFS Bug #3726: Enforce Ceph's minimum stripe size in the java bindings
- After a discussion on jabber, the decision is to go with exposing a function call in libcephfs and then using that in...
- 11:09 AM CephFS Bug #3726 (Resolved): Enforce Ceph's minimum stripe size in the java bindings
- The Hadoop bindings are using the blocksize as the stripe size. If a block size is explicitly passed down, it ends up...
- 01:00 PM CephFS Bug #3718: multi-client dbench gets stuck over NFS exported cephfs
- Heads up, Zheng Yan's patches on the mds fix issues related to running multiclient dbench tests.
- 12:24 PM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
- Hmm, okay. I wasn't real clear on the previous bugs so I'll need to look at it more if I end up taking this, but soun...
- 11:46 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
- Greg Farnum wrote:
> Hurray, it is. Nobody except the client looks at the trace_bl and setting that is the only thin...
- 11:35 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
- Hurray, it is. Nobody except the client looks at the trace_bl and setting that is the only thing set_trace() does. Ex...
- 11:17 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
- Greg Farnum wrote:
> Am I reading it correctly that this is just going to be doing the config and wrapper work to no...
- 09:01 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
- Am I reading it correctly that this is just going to be doing the config and wrapper work to not call set_trace() in ...
- 12:20 PM CephFS Feature #3543: mds: new encoding
- 12:20 PM CephFS Feature #3728: mds: draft design for lookup by ino
- 12:14 PM CephFS Feature #3728 (Resolved): mds: draft design for lookup by ino
- 12:20 PM CephFS Feature #3570: teuthology: mds thrasher
- 12:06 PM CephFS Feature #3727 (Resolved): mds: refactor EMetablob encoding paths
- Right now, the EMetaBlob sub-structures — for performance reasons — use an encoding pattern that doesn't match anythi...
- 11:42 AM CephFS Cleanup #89: mds: put inode dirty fields in dirty_bits_t to reduce memory footprint
- Greg Farnum wrote:
> I briefly scanned the CInode and inode_t structs and it wasn't obvious to me what this should e...
- 09:34 AM CephFS Cleanup #89: mds: put inode dirty fields in dirty_bits_t to reduce memory footprint
- I briefly scanned the CInode and inode_t structs and it wasn't obvious to me what this should encompass. Are you talk...
- 11:41 AM CephFS Subtask #547: mds: define fsck strategy, required metadata
- This was a whiteboard discussion 2 years ago. Nothing was written down. We should reopen new and more detailed issu...
- 09:29 AM CephFS Subtask #547: mds: define fsck strategy, required metadata
- Where are the results of this bug? It's marked resolved but I don't see any fsck references in the git tree, and ther...
- 11:39 AM Feature #685: libcephmon: interact with ceph monitors via a library
- BTW it may make sense to push the client command stuff in the ceph tool into MonClient, and then wrap that in libceph...
- 11:38 AM CephFS Cleanup #3677: libcephfs, mds: test creation/addition of data pools, create policy
- Greg Farnum wrote:
> Do we have a separate bug for the library calls this needs?
#685, which would take the clien...
- 09:27 AM CephFS Cleanup #3677: libcephfs, mds: test creation/addition of data pools, create policy
- Do we have a separate bug for the library calls this needs?
- 11:36 AM CephFS Feature #3244: qa: integrate Ganesha into teuthology testing to regularly exercise Ganesha CephFS...
- Greg Farnum wrote:
> And for this one as well: setting up Ganesha in teuthology, run tests against it? Not using the...
- 09:24 AM CephFS Feature #3244: qa: integrate Ganesha into teuthology testing to regularly exercise Ganesha CephFS...
- And for this one as well: setting up Ganesha in teuthology, run tests against it? Not using the Ceph shim or anything...
- 11:35 AM CephFS Feature #3243: qa: test samba reexport via libcephfs vfs plugin in teuthology
- Greg Farnum wrote:
> Is this a matter of setting up (via teuthology) a Samba server which sits on top of a Ceph moun...
- 09:24 AM CephFS Feature #3243: qa: test samba reexport via libcephfs vfs plugin in teuthology
- Is this a matter of setting up (via teuthology) a Samba server which sits on top of a Ceph mount and then running tes...
- 11:34 AM CephFS Feature #3426: ceph-fuse: build/run on os x
- Greg Farnum wrote:
> Noah has done some work on this in the wip-osx branch; last I heard you could compile and get a...
- 09:22 AM CephFS Feature #3426: ceph-fuse: build/run on os x
- Noah has done some work on this in the wip-osx branch; last I heard you could compile and get a cluster going with vs...
- 11:32 AM CephFS Feature #3542: mds: migration path for existing anchors, anchortables, etc.
- Greg Farnum wrote:
> What all does this encompass? Design? Implementation? Does it need to be an online switch or ca...
- 09:13 AM CephFS Feature #3542: mds: migration path for existing anchors, anchortables, etc.
- What all does this encompass? Design? Implementation? Does it need to be an online switch or can it be an offline job?
- 11:30 AM CephFS Feature #3541: mds: robust ino lookup using file backpointers
- Greg Farnum wrote:
> Is this bug supposed to encompass the anchor table replacement work as well? I wouldn't expect ...
- 09:12 AM CephFS Feature #3541: mds: robust ino lookup using file backpointers
- Is this bug supposed to encompass the anchor table replacement work as well? I wouldn't expect so, but the presence o...
- 11:23 AM rbd Bug #3725 (Resolved): rbd_header_race script to be fixed in the nightlies
- 10:32 AM rbd Bug #3725 (Resolved): rbd_header_race script to be fixed in the nightlies
- log: ubuntu@teuthology:/a.old/teuthology-2013-01-02_19:00:03-regression-next-testing-basic/33734...
- 11:23 AM CephFS Feature #3540: mds: maintain per-file backpointers on first file object
- Greg Farnum wrote:
> Do we have any kind of design for this? We've talked about it some and it's conceptually simple...
- 09:08 AM CephFS Feature #3540: mds: maintain per-file backpointers on first file object
- Do we have any kind of design for this? We've talked about it some and it's conceptually simple, but splitting up the...
- 11:15 AM CephFS Feature #626 (In Progress): qa: add IOR, rompio, or other parallel workloads suite
- Yeah, that's what slang's working on to enable this. Assigning this to him.
- 08:57 AM CephFS Feature #626: qa: add IOR, rompio, or other parallel workloads suite
- SamL has done some work on getting MPI going under teuthology, and on running some multi-client FS tests. I'm not sur...
- 11:14 AM Bug #3722: osd: indefinitely hung request on stable cluster
- the trigger is a brief osd reset due to an intermittent network outage. no actual ceph-osd daemons restart.
<pr...
- 09:39 AM Bug #3722 (Need More Info): osd: indefinitely hung request on stable cluster
- 08:36 AM Bug #3722 (Resolved): osd: indefinitely hung request on stable cluster
- 0.48.2argonaut, rbd workload.
occasional requests are blocked indefinitely.
*may* be osd down/up cycles (due to...
- 11:13 AM CephFS Feature #3621 (Resolved): qa: add knfsd reexport tests to qa suite
- 10:53 AM Bug #3723: ceph osd down command reports incorrectly
- similarly for "ceph osd in" command as well
ubuntu@burnupi06:/etc/ceph$ sudo ceph osd in 2 -k /etc/ceph/ceph.key...
- 09:33 AM Bug #3723 (Can't reproduce): ceph osd down command reports incorrectly
- issuing the command: "sudo ceph osd down 2" reports osd.2 is already down but sudo ceph osd stat reports all are up.
...
- 10:21 AM Bug #3698 (In Progress): filestore: ENOENT on clone
- 09:43 AM Bug #3699 (Resolved): osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
- commit:4ae4dce5c5bb547c1ff54d07c8b70d287490cae9
- 09:43 AM CephFS Feature #3399: java: add accessor to Ceph version numbers
- Oh, those are librados specific numbers, aren't they. So this bug is to create and expose a libceph version, then. Wh...
- 09:35 AM CephFS Feature #3399: java: add accessor to Ceph version numbers
- In libcephfs there is a call to get Ceph version (yes, just expose this). But, I recall Sage mentioning that it might...
- 09:19 AM CephFS Feature #3399: java: add accessor to Ceph version numbers
- This is just exposing the librados version() function to Java, right?
- 09:41 AM rgw Bug #3724 (Resolved): docs refer to non-implemented features of the radosgw-admin rest api
- The only radosgw-admin API calls currently are *get usage* and *trim usage*. The docs at
http://ceph.com/doc...
- 09:41 AM CephFS Cleanup #660: mds: use helpers in mknod, mkdir, openc paths
- What kind of helpers are you talking about with this? inode fetchers and lock grabbers? In a quick scan over handle_c...
- 09:36 AM CephFS Feature #603: mds: repair directory hierarchy
- This is part of #82 fsck, right? Do we have a more detailed algorithm anywhere?
- 05:02 AM Revision 39a734fb (ceph): os/FileStore: fix non-btrfs op_seq commit order
- The op_seq file is the starting point for journal replay. For stable btrfs
commit mode, which is using a snapshot as...
- 04:17 AM devops Documentation #3686: install prerequisites (Debian)
- Greg Farnum wrote:
> Nat, you should be able to install either of libtcmalloc-minimal or libgoogle-perftools - are...
- 03:40 AM Revision c63c6646 (ceph): os/FileStore: fix non-btrfs op_seq commit order
- The op_seq file is the starting point for journal replay. For stable btrfs
commit mode, which is using a snapshot as...
- 03:00 AM Revision acfa0c9a (ceph): mds: optimize C_MDC_RetryOpenRemoteIno
- When opening remote inode, C_MDC_RetryOpenRemoteIno is used as onfinish
context for discovering remote inode. When it...
- 02:45 AM Revision b03eab22 (ceph): mds: forbid creating file in deleted directory
- Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
- 02:45 AM Revision 59953257 (ceph): mds: keep dentry lock in sync state as much as possible
- Unlike locks of other types, dentry lock in unreadable state can block path
traverse, so it should be in sync state a...
- 02:45 AM Revision f9280cb6 (ceph): mds: fix replica state for LOCK_MIX_LOCK
- LOCK_MIX_LOCK state is for gathering local locks and caps, so replica state
should be LOCK_MIX.
Signed-off-by: Yan, ...
- 02:45 AM Revision 248e4ab8 (ceph): mds: fix cap mask for ifile lock
- ifile lock has 8 cap bits, so its cap mask should be 0xff
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
- 02:45 AM Revision 420f3355 (ceph): mds: rdlock prepended dest trace when handling rename
- rdlock prepended dest trace to prevent them from being xlocked by
someone else.
Signed-off-by: Yan, Zheng <zheng.z.y...
- 02:45 AM Revision ea2fd127 (ceph): mds: check null context in CDir::fetch()
- Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
- 02:45 AM Revision 3705c7ca (ceph): mds: drop locks when opening remote dentry
- Opening remote dentry while holding locks may cause deadlock. For example,
'discover' is blocked by an xlocked dentry...
- 02:45 AM Revision ca4dc4db (ceph): mds: check if stray dentry is needed
- The necessity of stray dentry can change before the request acquires
all locks.
Signed-off-by: Yan, Zheng <zheng.z.y...
- 02:45 AM Revision acbe6d97 (ceph): mds: don't issue caps while inode is exporting caps
- If caps are issued while the inode is exporting caps, the client will drop the
caps soon when it receives the CAP_OP_EXPORT me...
- 02:45 AM Revision d379ac8e (ceph): mds: disable concurrent remote locking
- Current code allows multiple MDRequests to concurrently acquire a
remote lock. But a lock ACK message wakes all reque...
- 01:15 AM Revision 28d59d37 (ceph): os/FileStore: fix non-btrfs op_seq commit order
- The op_seq file is the starting point for journal replay. For stable btrfs
commit mode, which is using a snapshot as...
- 12:23 AM Revision 49416619 (ceph): log: broadcast cond signals
- We were using a single cond, and only signalling one waiter. That means
that if the flusher and several logging thre...
- 12:13 AM Revision f1e0305f (ceph): doc: Removed the --without-tcmalloc flag until further advised.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 12:07 AM Revision 19df2086 (ceph): Merge pull request #30 from rca/master
- Minor clarification in docs.
01/03/2013
- 11:04 PM Revision 5ce47c2a (ceph): ssh_keys.py: pull the keys out of targets entry
- rather than the hosts known hosts file.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Sam Lang <sam.lang@i...
- 10:51 PM Revision 88af7d18 (ceph): doc: Added defaults for PGs, links to recommended settings, and updated...
- Fixes: #3555
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 10:32 PM Revision b8f061dc (ceph): OSD: for old osds, dispatch peering messages immediately
- Normally, we batch up peering messages until the end of
process_peering_events to allow us to combine many notifies, ...
- 10:18 PM Revision 4ae4dce5 (ceph): OSD: for old osds, dispatch peering messages immediately
- Normally, we batch up peering messages until the end of
process_peering_events to allow us to combine many notifies, ...
- 09:30 PM Revision 73bc8ffc (ceph): doc: Added comments on --without-tcmalloc option when building Ceph.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 09:30 PM Revision 37b57cdf (ceph): Update doc/rados/configuration/filesystem-recommendations.rst
- Clarified when it's necessary to use the setting:
filestore xattr use omap = true
- 09:29 PM Revision 43ef6772 (ceph): doc: Added some packages to the copyable line.
- Fixes: #3686
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 09:28 PM Revision 333ae82c (ceph): doc: Fixed syntax error.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 08:57 PM Revision aaa03bbc (ceph): qa: Add knfsd reexport suite
- Feature http://tracker.newdream.net/issues/3621
Signed-off-by: David Zafman <david.zafman@inktank.com>
- 08:55 PM Revision 67968d11 (ceph): osd: move common active vs booting code into consume_map
- Push osdmaps to PGs in separate method from activate_map() (whose name
is becoming less and less accurate).
Signed-o...
- 08:54 PM Revision 34266e6b (ceph): osd: let pgs process map advances before booting
- The OSD deliberately consumes and processes most OSDMaps from while it
was down before it marks itself up, as this is c...
- 08:53 PM Revision 4034f6c8 (ceph): log: broadcast cond signals
- We were using a single cond, and only signalling one waiter. That means
that if the flusher and several logging thre...
- 08:53 PM Revision 7e94f6f1 (ceph): Merge remote-tracking branch 'gh/wip-3714-b' into next
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 08:44 PM Revision 224a33bb (ceph): qa/workunit: Add dbench-short.sh for nfs suite
- A multi-client dbench run doesn't work over NFS,
see bug #3718. Make single client dbench available.
Signed...
- 08:13 PM Documentation #3709 (In Progress): crush-map.rst: claims 'types' are default, not true (must be s...
- 02:32 PM Documentation #3709: crush-map.rst: claims 'types' are default, not true (must be specified); spe...
- These are "defaults" in the sense that they're generated as part of the default OSD Map. Apparently that needs to be ...
- 07:57 PM Documentation #3707 (In Progress): crush-map.rst: syntax error in example
- 05:54 PM Bug #3702: OSD SIGABRT during startup
- Was the monitor also running 0.48.2argonaut when osd.131 originally crashed? Or something else?
- 05:45 PM Bug #3721: filestore: op_seq written in wrong order on non-btrfs
- 04:02 PM Bug #3721 (Resolved): filestore: op_seq written in wrong order on non-btrfs
- see wip-fsync
- 05:23 PM Revision f8bb4814 (ceph): log: fix locking typo/stupid for dump_recent()
- We weren't locking m_flush_mutex properly, which in turn was leading to
racing threads calling dump_recent() and garb...
- 05:14 PM Revision eee795c0 (ceph): rbd_xfstests.yaml: drop test 186
- Stop running test 186. It keeps failing in nightly runs, unable
to unmount the scratch file system during setup. As...
- 04:47 PM rgw Documentation #2993 (Resolved): doc: write quick RGW guide (if feasible)
- 04:45 PM devops Feature #2884: doc: osd hotplugging
- I believe the hotplug event was added, but will confirm.
- 04:43 PM devops Documentation #2974: doc: update chef docs for mon key distribution
- I believe this is done. Will verify.
- 04:13 PM devops Documentation #3686: install prerequisites (Debian)
- Greg Farnum wrote:
> John, can you remove that --without-tcmalloc bit until we hear more?
>
> Nat, you should be ...
- 02:48 PM devops Documentation #3686 (In Progress): install prerequisites (Debian)
- John, can you remove that --without-tcmalloc bit until we hear more?
Nat, you should be able to install either of ...
- 02:45 PM devops Documentation #3686: install prerequisites (Debian)
- Eek. We really, really want people to be using tcmalloc (memory behavior without it is astonishingly atrocious). I kn...
- 01:31 PM devops Documentation #3686 (Resolved): install prerequisites (Debian)
- Added packages to the copyable lines. Modified the build page to include --without-tcmalloc.
- 03:50 PM Bug #3698: filestore: ENOENT on clone
- Ok. The recovery_qos stuff can allow a client op to reorder past a push. This is a problem since the push might be ...
- 07:53 AM Bug #3698: filestore: ENOENT on clone
- another instance with logs: ubuntu@teuthology:/a/sage-a2/33879
- 02:52 PM Documentation #3555 (Resolved): {page-num} in ceph osd pool create is not optional
- Updated the document to add "required," the default values, a link to calculating PG values, clarification about PGP,...
- 02:49 PM Bug #3633: mon: clock drift errors not reported by ceph status
- The OSD clocks are actually fairly unimportant. Everything they use that requires precise timing should be based enti...
- 10:12 AM Bug #3633: mon: clock drift errors not reported by ceph status
- The objective here was to make sure that clock skews on the monitors were detected and reported, as said skews might ...
- 08:46 AM Bug #3633: mon: clock drift errors not reported by ceph status
- Reading the patch, it looks like only the clocks of the mons are checked. So the clocks of the osds are not important to ce...
- 02:34 PM Bug #3720: Ceph Reporting Negative Number of Degraded objects
- Per Josh D's suggestion, I set the tunables and it resolved the issue.
# ceph osd getcrushmap -o /tmp/crush
# cru...
- 01:02 PM Bug #3720 (Duplicate): Ceph Reporting Negative Number of Degraded objects
- Changed the replication of two pools from 2x to 3x. Cluster rebalanced to nearly HEALTH_OK but got stuck at:
HEALT...
- 02:32 PM rbd Bug #3697: rbd copy.sh test failing in nightly
- When reproducing with lots of error logging to stderr, the error occurs on snapshots because the snap rm/snap info te...
- 01:59 PM CephFS Bug #3597: ceph-fuse: denying root access
- I believe that we can reproduce this error. We are running Ubuntu 12.04 LTS Server on both the client and on the Cep...
- 12:56 PM CephFS Bug #3719 (Can't reproduce): pjd test 145 failed in the nightly runs
- logs: ubuntu@teuthology:/a/teuthology-2013-01-02_19:00:03-regression-next-testing-basic/33621...
- 12:53 PM Bug #3714 (Resolved): osd: new peering code does not consume osdmaps prior to booting
- commit:7e94f6f1a7b7a865433edacd6a521f6ea1170eac
- 10:28 AM Bug #3714 (Fix Under Review): osd: new peering code does not consume osdmaps prior to booting
- 12:48 PM CephFS Bug #3718 (Rejected): multi-client dbench gets stuck over NFS exported cephfs
- When running qa/workunit dbench.sh the dbench 1 passes, but the dbench 10 gets hung up.
We should check this with ...
- 12:28 PM CephFS Feature #3621 (In Progress): qa: add knfsd reexport tests to qa suite
- 09:49 AM RADOS Feature #3717 (New): osd: Make Rebalancing Smarter
- From Corin Langosch - During recovery/rebalancing it can happen that an osd receives lots of new data before data tha...
- 09:45 AM Bug #3716: recovery should take osd usage into account
- 1. My cluster already uses the tuned crushmap "crushtool -i /tmp/crush --set-choose-local-tries 0 --set-choose-local-...
- 09:36 AM Bug #3716 (Closed): recovery should take osd usage into account
- #1: this is a matter of adjusting the crush tunables. see http://ceph.com/docs/master/rados/operations/crush-map/?hig...
- 09:08 AM Bug #3716 (Closed): recovery should take osd usage into account
- Using argonaut 0.48.2. Yesterday one osd crashed (disk io error) and recovery started as expected. All osds had an us...
- 09:44 AM Bug #3550: mon: Ceph fails to work when IP address is changed on the host
- Joao,
thanks for the update.
Since mine came about due to a testing environment build on DHCP, I did not have the ...
- 09:32 AM CephFS Bug #3681: kclient fsx fails nightly
- It's most likely all the same bug, but fsx fails in different ways each time (always because of a truncate down). The...
- 09:27 AM CephFS Feature #3543: mds: new encoding
- right. about 80% complete, see wip-mds-encoding.
- 09:22 AM CephFS Feature #3543: mds: new encoding
- What is this task? Switching to use our versioned encoding scheme?
- 09:17 AM rbd Bug #3685: xfs test 186 fails in the nightlies
- I just disabled test 186 from the list run for the nightly
tests. It's defined in the ceph-qa-suite git repository,...
- 06:39 AM Revision a32d6c5d (ceph): osd: move common active vs booting code into consume_map
- Push osdmaps to PGs in separate method from activate_map() (whose name
is becoming less and less accurate).
Signed-o...
- 06:20 AM Revision 0bfad8ef (ceph): osd: let pgs process map advances before booting
- The OSD deliberately consumes and processes most OSDMaps from while it
was down before it marks itself up, as this is c...
- 06:04 AM Revision 5fc94e89 (ceph): osd: drop oldest_last_clean from activate_map
- Signed-off-by: Sage Weil <sage@inktank.com>
- 06:04 AM Revision 67f7ee67 (ceph): osd: drop unused variables from activate_map
- Signed-off-by: Sage Weil <sage@inktank.com>
- 05:09 AM Revision a14a36ed (ceph): OSDMap: fix modifed -> modified typo
- Signed-off-by: Sage Weil <sage@inktank.com>
- 04:44 AM Revision 9ca69e73 (ceph): ceph: malloc check =3 means we hear on stderr too
- 03:58 AM Revision 2141454e (ceph): log: fix locking typo/stupid for dump_recent()
- We weren't locking m_flush_mutex properly, which in turn was leading to
racing threads calling dump_recent() and garb...
- 02:13 AM Revision 6b5a89d2 (ceph): Merge remote-tracking branch 'gh/next'
- 01:01 AM Revision 43cba617 (ceph): log: fix locking typo/stupid for dump_recent()
- We weren't locking m_flush_mutex properly, which in turn was leading to
racing threads calling dump_recent() and garb...
01/02/2013
- 11:59 PM Revision 29ff87a5 (ceph): Merge branch 'master' of https://github.com/ceph/ceph
- 11:58 PM Revision 64d2760a (ceph): doc: Added a memory profiling section. Ported from the wiki.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 11:57 PM Revision 5066abf1 (ceph): doc: Added memory profiling to the index.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 11:08 PM Revision 0e9a0cd7 (ceph): qa/workunit: Update pjd script to use new tarball
- The pjd script now uses the latest version of pjd
with an additional test for opening a non-existent
file.
Signed-of...
- 11:07 PM Bug #3715: Crash during 0.55 -> 0.56 upgrade
- is someone sending an MOSDOp that has no ops? init_op_flags() is called before can_*(), so this sounds like an empty...
- 10:05 PM Bug #3715 (Duplicate): Crash during 0.55 -> 0.56 upgrade
- I started upgrading my 0.55.1 cluster to 0.56 and at one point in the middle of the upgrade, all 0.55.1 OSDs started ...
- 10:38 PM Revision d8940d15 (ceph): fuse: Fix cleanup code path on init failure
- With the changes from 856f32ab, the cfuse.init call returns
a _positive_ errno, which was getting ignored. Also, if ...
- 10:15 PM Revision c4370ff0 (ceph): librbd: establish watch before reading header
- This eliminates a window in which a race could occur when we have an
image open but no watch established. The previou...
- 09:56 PM rbd Bug #3697: rbd copy.sh test failing in nightly
- Reproduces OK on plana cluster, indeed. This seems to point toward some sort of OSD bug where committed state isn't ...
- 09:39 AM rbd Bug #3697 (In Progress): rbd copy.sh test failing in nightly
- 09:42 PM Revision 93656013 (ceph): test_filejournal: optionally specify journal filename as an argument
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 483c6f76adf960017614a8641c4dcdbd7902ce33)
- 09:42 PM Revision be0473bb (ceph): test_filejournal: test journaling bl with >IOV_MAX segments
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit c461e7fc1e34fdddd8ff8833693d067451df906b)
- 09:42 PM Revision de619327 (ceph): os/FileJournal: limit size of aio submission
- Limit size of each aio submission to IOV_MAX-1 (to be safe). Take care to
only mark the last aio with the seq to sig...
- 09:42 PM Revision ded454c6 (ceph): os/FileJournal: logger is optional
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 076b418c7f03c5c62f811fdc566e4e2b776389b7)
- 09:42 PM Revision 9a1cf518 (ceph): Merge branch 'wip-journal-aio' into next
- Reviewed-by: Samuel Just <sam.just@inktank.com>
Backport: bobtail
- 09:39 PM Revision dda7b651 (ceph): os/FileJournal: limit size of aio submission
- Limit size of each aio submission to IOV_MAX-1 (to be safe). Take care to
only mark the last aio with the seq to sig...
- 09:39 PM Revision c461e7fc (ceph): test_filejournal: test journaling bl with >IOV_MAX segments
- Signed-off-by: Sage Weil <sage@inktank.com>
- 09:39 PM Revision 483c6f76 (ceph): test_filejournal: optionally specify journal filename as an argument
- Signed-off-by: Sage Weil <sage@inktank.com>
- 09:34 PM Bug #3714 (Resolved): osd: new peering code does not consume osdmaps prior to booting
- Previously when we handled the old osdmaps catching up (pre-MOSDBoot) we'd do advance_map and the pgs would update th...
- 08:32 PM Revision e0858fa8 (ceph): Revert "librbd: ensure header is up to date after initial read"
- Using assert version for linger ops doesn't work with retries,
since the version will change after the first send.
Th...
- 08:31 PM Revision 06310994 (ceph): ceph: enable malloc debugging for ceph-osd
- 07:49 PM Revision 3686371e (ceph): rados: add test_filejournal
- This writes to /tmp by default; should be ok on plana, since it's / and not
tmpfs.
- 07:24 PM Revision 82297706 (ceph): doc: Minor edits.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 07:15 PM Revision d3b9803e (ceph): doc: Fixed typo, clarified usage.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 05:23 PM rbd Bug #3685: xfs test 186 fails in the nightlies
- It is possible for umount() to return EBUSY. However from
what I can tell that only occurs when the device being
u...
- 02:34 PM rbd Bug #3685: xfs test 186 fails in the nightlies
- OK I've tried reproducing it manually (on a teuthology node, but
running it using a command line while in an "intera...
- 12:06 PM rbd Bug #3685: xfs test 186 fails in the nightlies
- Test 184 doesn't touch the scratch device. Looks like the next
one back is 167, which exercises unwritten extent co...
- 11:56 AM rbd Bug #3685: xfs test 186 fails in the nightlies
- I thought I had updated this but I have not.
Test 186 is exercising activities that at one time caused a
bug in x...
- 05:15 PM Bug #3699: osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
- reproduced this on burnupi21.
- 05:00 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
- with glibc malloc and debug enabled:...
- 08:57 AM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
- another one with full osd logs:...
- 04:13 PM Documentation #3687 (Resolved): Documentation needs a "memory profiling" section
- This has been ported. I haven't added a valgrind use case yet.
- 01:20 PM Documentation #3687 (In Progress): Documentation needs a "memory profiling" section
- 03:51 PM Feature #3713 (Rejected): ceph osd tree should show disk usage
- As ceph seems to already monitor the disk usage of each osd it'd be great to have it displayed in "ceph osd tree".
- 03:08 PM rbd Bug #3619: librbd: read_iterate sparse behavior broken
- Mitigated somewhat by sparsification efforts in rbd import/export, but still librbd
should be fixed.
- 02:11 PM devops Feature #3712 (New): Ceph Commands should provide appropriate responses, when Ceph Service is not...
- When ceph service is not running, running other ceph command should give a response that makes sense instead of just ...
- 02:02 PM Cleanup #2078: ceph tool: only output response data to stdout
- i think we need to phase out all of the first-line nonsense.
- 01:48 PM Cleanup #2078: ceph tool: only output response data to stdout
- This also affects things like ceph pg dump --format=json. You can't pipe it to a pretty printer without ignoring the ...
- 01:52 PM Documentation #3711 (Resolved): crush-map.rst: choose firstn talks about "N", but does not clearl...
- The implication is that 'N' is "the number of buckets of type 'type' available", but Sam believes it must really be "...
- 01:40 PM Bug #3684 (Resolved): filejournal: aio vector size is not limited
- 01:34 PM rbd Feature #3456 (Closed): make exit code of ceph status commands status dependent
- 01:29 PM rbd Documentation #2992 (Resolved): doc: RBD parent/child snapshot
- 01:26 PM rbd Documentation #2992: doc: RBD parent/child snapshot
- This should be resolved.
- 01:24 PM Documentation #3710 (Closed): crush-map.rst: talks about 'step choose' but does not document it
- 01:23 PM Documentation #3411 (Resolved): doc: add introductory detail to the main doc page (index.rst)
- 01:21 PM rgw Feature #3207 (In Progress): qa: swift functional tests in nightly
- 01:21 PM rgw Feature #3366 (In Progress): rgw: dr: define management api
- 01:18 PM Documentation #2980 (Resolved): doc: write upgrading Ceph version
- This was checked in and also reviewed by Josh and Sage.
- 01:16 PM Documentation #3322 (Resolved): doc: Explain multi-tenant CephFS
- This has been added to the end of the Ceph Configuration file section. It may benefit from review, as I believe the...
- 01:12 PM Feature #647 (Duplicate): mon: refactor paxos interaction
- 01:11 PM Feature #183 (Resolved): qa: xfstests workunit
- 01:10 PM Documentation #3709 (Resolved): crush-map.rst: claims 'types' are default, not true (must be spec...
- crush-map.rst claims that the bucket type defaults are as appear in the table, but they're
not defaults; they must b...
- 01:09 PM Feature #3376 (Duplicate): use external leveldb package for default builds
- 01:08 PM Documentation #3707 (Resolved): crush-map.rst: syntax error in example
- example includes:
item ceph-osd-server-1 2.00
this must have 'weight' explicitly in the line:
...
- 01:03 PM Feature #3425 (Resolved): mon workload generator
- 12:39 PM Bug #3702: OSD SIGABRT during startup
- Attempting to start osd.131 (which was down due to the above noted problems) today resulted in quorum loss. Essential...
- 12:03 PM rgw Bug #3706 (Resolved): rgw functional test testSlashInName failed in nightly
- logs: ubuntu@teuthology:/a/teuthology-2013-01-01_19:00:03-regression-next-testing-basic/33224...
- 11:25 AM Revision a79493da (ceph): mds: skip frozen inode when assimilating dirty inodes' rstat
- CDir::assimilate_dirty_rstat_inodes() may encounter frozen inodes that
are being renamed. Skip these frozen inodes be...
- 11:25 AM Revision 2f96b472 (ceph): mds: fix anchor table commit race
- Anchor table updates for a given inode are fully serialized on the client side.
But due to network latency, two commit req...
- 11:25 AM Revision 7e04504d (ceph): mds: fix on-going two phrase commits tracking
- The slaves for two-phase commit should be mdr->more()->witnessed
instead of mdr->more()->slaves. mdr->more()->slaves...
- 11:25 AM Revision b3796f46 (ceph): mds: indroduce DROPLOCKS slave request
- In some rare cases, Locker::acquire_locks() drops all acquired locks
in order to auth pin new objects. But Locker::dro...
- 11:25 AM Revision b2d5005a (ceph): mds: fix lock state transition check
- Locker::simple_excl() and Locker::scatter_mix() miss the is_rdlocked
check; Locker::file_excl() misses the is_rdlocked check an...
- 11:25 AM Revision fe5936b1 (ceph): mds: remove unnecessary is_xlocked check
- Locker::foo_eval() is always called for stable locks, so no need to
check if the lock is xlocked.
Signed-off-by: Yan...
- 11:25 AM Revision f5ea5c36 (ceph): mds: don't defer processing caps if inode is auth pinned
- We should not defer processing caps if the inode is auth pinned by MDRequest,
because the MDRequest may change lock s...
- 11:25 AM Revision 5e8642a8 (ceph): mds: call maybe_eval_stray after removing a replica dentry
- MDCache::handle_cache_expire() processes dentries after inodes, so the
MDCache::maybe_eval_stray() in MDCache::inode_...
- 11:25 AM Revision 84224743 (ceph): mds: fix rename inode exportor check
- Using "srcdn->is_auth() && destdnl->is_primary()" to check if the MDS is
the inode exporter of a rename operation is not reli...
- 11:25 AM Revision 26279574 (ceph): mds: don't trigger assertion when discover races with rename
- Discover reply that adds replica dentry and inode can race with rename
if slave request for rename sends discover and...
- 11:25 AM Revision 5ae715be (ceph): mds: xlock stray dentry when handling rename or unlink
- This prevents MDS from reintegrating stray before rename/unlink finishes
Signed-off-by: Yan, Zheng <zheng.z.yan@inte...
- 11:25 AM Revision 7a520168 (ceph): mds: don't journal null dentry for overwrited remote linkage
- Server::_rename_prepare() adds null dest dentry to the EMetaBlob if
the rename operation overwrites a remote linkage....
- 11:25 AM Revision fcb9f988 (ceph): mds: use null dentry to find old parent of renamed directory
- When replaying a directory rename operation, the MDS needs to find the old parent of
the renamed directory to adjust auth sub...
- 11:25 AM Revision d9d71473 (ceph): mds: don't trim ambiguous imports in MDCache::trim_non_auth_subtree
- Trimming ambiguous imports in MDCache::trim_non_auth_subtree() confuses
MDCache::disambiguate_imports() and causes in...
- 11:25 AM Revision 3b13d3dc (ceph): mds: only export directory fragments in stray to their auth MDS
- Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
- 11:25 AM Revision 61da9b18 (ceph): mds: mark rename inode as ambiguous auth on all involved MDS
- When handling cross authority rename, the master first sends OP_RENAMEPREP
slave requests to witness MDS, then sends ...
- 11:09 AM Linux kernel client Bug #2764 (Closed): xfstest hang; osd socket closed messages
- The fix for the warning messages is:
28362986f8743124b3a0fda20a8ed3e80309cce1
libceph: report connection ...
- 10:54 AM Bug #3698: filestore: ENOENT on clone
- recent log: ubuntu@teuthology:/a/teuthology-2013-01-01_19:00:03-regression-next-testing-basic/33152
- 09:45 AM CephFS Bug #3700: mds: FAILED assert(!item_session_list.is_on_list())
- fixed by revert of bad fix, see commit:6711a4c4038dbdf843f9dfe42c7809c5c37ae534
- 09:37 AM CephFS Bug #3700 (Resolved): mds: FAILED assert(!item_session_list.is_on_list())
- 09:41 AM rbd Bug #3692 (Won't Fix): OSD's abort with "./common/Mutex.h: 89: FAILED assert(nlock == 0)"
- This is a known problem with argonaut, but the fix is a rewrite of the whole module and we've chosen not to backport ...
- 09:09 AM Bug #3705 (Resolved): osd: crash in scrub finalize [argonaut]
- ...
- 08:28 AM Feature #3704 (Resolved): mon: add min log level to send cluster msgs to syslog
- e.g., WARN and above only, but not INFO. This is for the mon/LogMonitor.cc submission path, not log/Log.cc (for debu...
- 05:55 AM Revision e10267b5 (ceph): mds: fix Locker::simple_eval()
- Locker::simple_eval() checks if the loner wants CEPH_CAP_GEXCL to
decide if it should change the lock to EXCL state, ...
- 05:54 AM Revision 7e23321b (ceph): mds: don't renew revoking lease
- MDS may receive a lease renew request while the lease is being revoked,
just ignore the renew request.
Signed-off-by: Yan...
01/01/2013
- 06:36 PM Revision eb02eaed (ceph): Merge remote-tracking branch 'gh/wip-bobtail-docs'
- 05:35 AM Revision f1196c7e (ceph): Merge branch 'master' of https://github.com/ceph/ceph
- 05:31 AM Revision 5dd6b199 (ceph): Merge branch 'next'
- 02:37 AM Revision 8f77ec7d (ceph): Merge branch 'next'
- 02:36 AM Revision 94a5dd6b (ceph): Merge remote-tracking branch 'gh/wip-3675'
- Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
- 01:10 AM Revision 1a32f0a0 (ceph): v0.56
12/31/2012
- 11:28 PM Revision 49ebe1ee (ceph): client: fix _create created ino condition
- We get 8 bytes back for the created ino.
Signed-off-by: Sage Weil <sage@inktank.com>
- 11:26 PM Revision a10054bc (ceph): libcephfs: choose more unique nonce
- We were using a per-process counter combined with the pid. A short
running process can easily loop through and reuse...
- 11:26 PM Revision e2fef38d (ceph): client: fix _create
- make_request() clears out req->reply and frees req; we can't inspect
it here.
Instead, just assume that extra_bl is t...
- 06:35 PM rbd Bug #3697: rbd copy.sh test failing in nightly
- FWIW I ran this in a loop and reproduced it after 7 iterations (well, a slightly different error actually, when it re...
- 05:42 PM rbd Bug #3697 (Can't reproduce): rbd copy.sh test failing in nightly
- 05:08 PM rbd Bug #3697: rbd copy.sh test failing in nightly
- Hm, doesn't reproduce on local vstart cluster. Pondering possible failure modes.
- 04:23 PM rbd Bug #3697: rbd copy.sh test failing in nightly
- Trying to reproduce now
- 06:17 PM Revision 7d70dd11 (ceph): Revert "kernel: move fsync test to marginal suite until it works"
- This reverts commit acb91f7d0d4882d7393a99b142aec8687b9b4bb7.
Now fixed in master branch, commit b4d3bd06d4083d78075...
- 06:16 PM Revision b4d3bd06 (ceph): Merge remote-tracking branch 'gh/wip-3625'
- 05:38 PM rbd Bug #3703: osd: crash while encrypting
- This is an osd crash....
- 02:55 PM rbd Bug #3703 (Can't reproduce): osd: crash while encrypting
- logs: ubuntu@teuthology:/a/teuthology-2012-12-30_19:00:03-regression-next-testing-basic/32113...
- 04:11 PM Revision ed586c1b (ceph): task: ceph: don't wait for 'healthy' if 'wait-for-healthy' is false.
- This new config option obviously defaults to 'true' in order to not only
maintain compatibility, but because it makes...
- 02:58 PM Bug #3699: osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
- Bringing the marked-out osd.1 back in on burnupi06 while running the IO hit the following:
2012-12-31 14:26:26.6...
- 02:30 PM Messengers Feature #3509 (Resolved): msgr: delay injection
- 10:18 AM Bug #3689 (Resolved): osd: bad peering state machine event with mixed v0.52 and next cluster
- 09:06 AM Bug #3702 (Can't reproduce): OSD SIGABRT during startup
- After conversion of OSD's from btrfs to XFS, some OSD's SIGABRT during their first startup on XFS:
2012-12-29 05:0...
- 08:55 AM Bug #3683: mon: leak of MMonPaxos
- recent logs: ubuntu@teuthology:/a/teuthology-2012-12-29_19:00:03-regression-next-testing-basic/31414
- 08:37 AM rbd Bug #3701 (Can't reproduce): qemu xfstest hung BUG: unable to handle kernel NULL pointer derefere...
- logs: ubuntu@teuthology:/a/teuthology-2012-12-30_03:00:06-regression-master-testing-gcov/31929...
12/30/2012
- 11:29 PM Revision ec5288a3 (ceph): Merge remote-tracking branch 'gh/wip-rbd-unprotect' into next
- Reviewed-by: Sage Weil <sage@inktank.com>
- 07:18 PM Revision 82cec48e (ceph): doc: add-or-rm-mons.rst: Add 'Changing Monitor's IPs' section
- Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 07:17 PM Revision 379f0792 (ceph): doc: add-or-rm-mons.rst: Clarify what the monitor name/id is.
- Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
- 06:08 PM CephFS Fix #3630: mds: broken closed connection cleanup
- ...
- 06:06 PM CephFS Fix #3630: mds: broken closed connection cleanup
- The con re-use looks like this:
- client connects
- mds ms_verify_authorizer creates a new session
- msgr see ex...
- 06:04 PM CephFS Bug #3696 (Resolved): mds: FAILED assert(session_map.count(s->inst.name) == 0)
- see #3630..let's fix this properly.
- 08:06 AM Revision 85e9d4f0 (ceph): cls_rbd: get_children does not need write permission
- This prevented a read-only user from being able to unprotect a
snapshot without write permission on all pools. This w...
- 08:06 AM Revision 91e941ae (ceph): OSD: remove RD flag from CALL ops
- 20496b8d2b2c3779a771695c6f778abbdb66d92a forgot to do this. Without
this change, all class methods required regular r...
- 08:06 AM Revision c67c789d (ceph): librbd: add {rbd_}open_read_only()
- Since 58890cfad5f7bee933baa599a68e6c65993379d4, regular {rbd_}open()
would fail with -EPERM if the user did not have ...
- 08:06 AM Revision 47bf5195 (ceph): librbd: open parent as read-only during clone
- We never write to the parent, and don't need to watch it during this process.
Signed-off-by: Josh Durgin <josh.durgi...
- 08:06 AM Revision 958addc0 (ceph): rbd: open (source) image as read-only
- This allows users without write access to copy, export and list
information about an image.
Signed-off-by: Josh Durg...
- 08:06 AM Revision d0a14d11 (ceph): librbd: fix race between unprotect and clone
- Clone needs to actually re-read the header to make sure the image is
still protected before returning. Additionally, ...
- 08:06 AM Revision 8bbb4a36 (ceph): doc: fix rbd permissions for unprotect
- Unprotect examines all pools, so use blanket x before 0.54. After
that, use class-read restricted by object_prefix to...
- 05:00 AM Revision 7b0dbeb0 (ceph): doc/install/upgrading: edits to upgrade document
- Signed-off-by: Sage Weil <sage@inktank.com>
- 05:00 AM Revision 4aa6af76 (ceph): doc/release-notes: link to upgrade doc
- Signed-off-by: Sage Weil <sage@inktank.com>
12/29/2012
- 08:04 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
- Possibly the same bug in teuthology:/a/joshd-3631-12-28-12_08.55/30739...
- 07:45 PM Bug #3698: filestore: ENOENT on clone
- 07:44 PM Bug #3698: filestore: ENOENT on clone
- Can you add 'debug osd = 20' to the job you're running?
- 04:30 PM Bug #3698: filestore: ENOENT on clone
- Happened again in teuthology:/a/joshd-3631-12-28-12_08.53/30681
- 09:50 AM Bug #3698 (Resolved): filestore: ENOENT on clone
- Full logs in teuthology:/a/joshd-3631-12-27-12_22.21/29826...
- 04:38 PM Revision 6711a4c4 (ceph): Revert "mds: replace closed sessions on connect"
- This reverts commit 8b599083705c2495810c00f9f5fd5bb8ace7f32e.
This fix is not correct. See #3696.
- 04:28 PM Revision bb4a2c55 (ceph): rgw: enable logging in ceph.conf
- 02:39 PM CephFS Bug #3700 (Resolved): mds: FAILED assert(!item_session_list.is_on_list())
- logs: ubuntu@teuthology:/a/teuthology-2012-12-29_03:00:03-regression-master-testing-gcov/30039...
- 02:32 PM CephFS Bug #3696: mds: FAILED assert(session_map.count(s->inst.name) == 0)
- ubuntu@teuthology:/a/teuthology-2012-12-29_03:00:03-regression-master-testing-gcov/30036
- 09:43 AM CephFS Bug #3696: mds: FAILED assert(session_map.count(s->inst.name) == 0)
- reverted the broken fix, reproducing the original problem again.
- 02:19 PM Bug #3699 (Resolved): osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
- cluster: burnupi06 [running osd.1 on v0.55.1] , burnupi07[running osd.3, osd.4, mon.b on argonaut], burnupi08[running...
- 08:37 AM rbd Bug #3697 (Duplicate): rbd copy.sh test failing in nightly
- ...
- 01:21 AM Revision a5d692a7 (ceph): msgr: inject delays at inconvenient times
- Exercise some rare races by injecting delays before taking locks
via the 'ms inject internal delays' option.
Signed-...
- 01:21 AM Revision 82f8bcdd (ceph): msg/Pipe: use state_closed atomic_t for _lookup_pipe
- We shouldn't look at Pipe::state in SimpleMessenger::_lookup_pipe() without
holding pipe_lock. Instead, use an atomi...
- 01:21 AM Revision 7bf0b085 (ceph): msgr: atomically queue first message with connect_rank
- Atomically queue the first message on the new pipe, without dropping
and retaking pipe_lock.
Signed-off-by: Sage Wei...
- 01:21 AM Revision 6339c5d4 (ceph): msgr: don't queue message on closed pipe
- If we have a con that refs a pipe but it is closed, don't use it. If
the ref is still there, it is only because we a...
- 01:21 AM Revision e99b4a30 (ceph): msgr: fix race on Pipe removal from hash
- When a pipe is faulting and shutting down, we have to drop pipe_lock to
take msgr lock and then remove the entry. Th...
- 01:19 AM Revision 83c8025d (ceph): Merge remote-tracking branch 'gh/next'
- 01:19 AM Revision c2a75253 (ceph): test: mon: workloadgen: debug when message fsid != monmap fsid
- Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
- 01:19 AM Revision b30ab517 (ceph): test: mon: workloadgen: assert if monmap's fsid is zero after authenticate
- Fixes: #3629
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
- 01:19 AM Revision 35836847 (ceph): doc: update Hadoop documentation
- Updates configuration option names, and adds object.size,
localize.reads, and root.dir control options.
Signed-off-b...
- 01:12 AM Revision 942c7145 (ceph): init-ceph: ok, 8K files
- 16K might be a bit many.
Signed-off-by: Sage Weil <sage@inktank.com>
- 01:10 AM Revision 0a5d6d87 (ceph): msg/Pipe: remove broken cephs signing requirement check
- Remove the special-case check, which does not inform the peer what
protocol features are missing. It also enforces t...
- 12:00 AM Revision 65b787ea (ceph): msg/Pipe: include remote socket addr in debug output
- Signed-off-by: Sage Weil <sage@inktank.com>
12/28/2012
- 11:55 PM Revision 9e5e08f8 (ceph): doc: Added a new upgrade document.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 11:55 PM Revision 1553267e (ceph): doc: Minor edit.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 11:54 PM Revision 02b8bcd0 (ceph): doc: Added upgrade link to index.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 11:44 PM Revision 076b418c (ceph): os/FileJournal: logger is optional
- Signed-off-by: Sage Weil <sage@inktank.com>
- 11:14 PM Revision 3debf0cf (ceph): client: fix fh leak in non-create case
- We may take the O_CREAT path and get an fh from _create, but created can
still be false. In that case, skip the fina...
- 11:10 PM Revision 7f35e5dd (ceph): client: Make ll_create use _create
- This is a fix for bug #3625, where multiple clients race to create a
file, and the loser returns EEXIST instead of a ...
- 11:10 PM Revision 67bc849c (ceph): mds: Return created inode in mds reply to create
- If multiple clients race to create a file, multiple clients will send a
create request and get back a valid dentry+in...
- 11:08 PM Revision 813787af (ceph): log: broadcast cond signals
- We were using a single cond, and only signalling one waiter. That means
that if the flusher and several logging thre...
- 11:03 PM Revision ca34fc4d (ceph): osd: allow RecoveryDone self-transition in RepNotRecovering
- In a mixed cluster where some OSDs support the recovery reservations and
some don't, the replica may be new code in R...
- 10:15 PM Revision 0f5383f4 (ceph): Merge remote-tracking branch 'origin/wip-gl-docs'
- Update release process documentation.
- 10:05 PM Revision 1867b818 (ceph): docs: fix typo in release-process doc
- Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
- 10:00 PM Linux kernel client Bug #3519: rbd map hang during system startup
- yay!
- 06:08 AM Linux kernel client Bug #3519 (Resolved): rbd map hang during system startup
- Done, pushed to master, and soon to be included in a pull request
to Linus for 3.8.
- 09:47 PM Revision 3a8bf3af (ceph): doc/release-notes: document new 'max open files' default
- Signed-off-by: Sage Weil <sage@inktank.com>
- 09:11 PM CephFS Bug #3696: mds: FAILED assert(session_map.count(s->inst.name) == 0)
- 06:42 PM CephFS Bug #3696 (Resolved): mds: FAILED assert(session_map.count(s->inst.name) == 0)
- This occurred shortly after startup when trying to reproduce another bug on the master branch:...
- 08:34 PM Revision ea13ecc2 (ceph): osd: less noise about inefficient tmap updates
- Signed-off-by: Sage Weil <sage@inktank.com>
- 08:12 PM Revision 9483a032 (ceph): init-ceph: fix status version check across machines
- The local state isn't propagated into the backtick shell, resulting in
'unknown' for all remote daemons. Avoid backt...
- 08:12 PM Revision 8fef9360 (ceph): init-ceph: use SSH in "service ceph status -a" to get version
- When running "service ceph status -a", a version number was never
returned for remote hosts, only for the local. Thi...
- 08:11 PM Revision 672c56b1 (ceph): init-ceph: default to 16K max_open_files
- Signed-off-by: Sage Weil <sage@inktank.com>
- 07:58 PM Revision 948e7524 (ceph): ceph-fuse: Avoid doing handle cleanup in dtor
- The CephFuse::Handle class needs the client
pointer to be valid for finalizing, so don't finalize
in the destructor (...
- 07:10 PM Revision ff2d4abb (ceph): ceph-fuse: Pass client handle as userdata
- The fuse lowlevel API isn't getting the client
handle when it gets initialized, resulting
in a null pointer for ...
- 06:21 PM CephFS Fix #3630: mds: broken closed connection cleanup
- 05:57 PM Bug #3695 (Resolved): monitor crashed after an upgrade in Monitor::timecheck
- ceph version : 0.55.1-329-g01376d4 (01376d44d73189080d207f701fc7e38cf55c738d)
cluster:
burnupi15[running osd.1, ...
- 05:09 PM Bug #3675 (Resolved): osd: hang during intial peering
- 04:55 PM Bug #3690 (Resolved): osd crashed in FileStore::_do_transaction
- the problem was old ceph-osd daemons on other hosts trying to connect. running code that didn't include commit:4d20b...
- 12:27 PM Bug #3690: osd crashed in FileStore::_do_transaction
- made the default fd limit much higher in commit:672c56b18de3b02606e47013edfc2e8b679d8797
- 10:39 AM Bug #3690: osd crashed in FileStore::_do_transaction
- ...
- 10:32 AM Bug #3690: osd crashed in FileStore::_do_transaction
- ...
- 04:40 PM Bug #3684 (Fix Under Review): filejournal: aio vector size is not limited
- wip-journal-aio
- 04:09 PM Revision acb91f7d (ceph): kernel: move fsync test to marginal suite until it works
- 04:08 PM Revision 02e4eeff (ceph): kernel: move fsx to marginal suite until it passese
- 04:06 PM Documentation #3694 (Closed): doc: how to use the admin socket interface
- A couple pages in the docs mention specific commands, but there's no overall explanation of what it is, and how you c...
- 01:56 PM Bug #3691: Lock issue in librados resulting in application hang
- This affects small clusters more because a single osd is a larger proportion of the whole cluster. In bobtail, there ...
- 01:46 PM Bug #3691: Lock issue in librados resulting in application hang
- Well, this is an even worse issue. We are adding new osds (just 8 now), and the cluster has been staying "unhealthy" ...
- 01:04 PM Bug #3691 (Rejected): Lock issue in librados resulting in application hang
- You're calling the synchronous version of write, and the spot where it's 'hung' is just waiting for the response from...
- 04:53 AM Bug #3691 (Rejected): Lock issue in librados resulting in application hang
- We ran into a nasty lock issue in librados: it's trying to write some data, and hangs there for many seconds unt...
- 12:46 PM RADOS Bug #3693 (Duplicate): crushtool compile fails with unhelpful message, diagnosis quite difficult
- A user tried to create his own crushmap as follows:...
- 12:18 PM rbd Bug #2689 (In Progress): qemu iozone test hangs
- This seems to still be a problem. I'll try to get more information about what's going on. It looks like there's an er...
- 12:12 PM devops Documentation #2774: doc: ceph-disk man page
- These would be useful. Someone on irc was confused earlier by the undocumented requirement to set --cluster-uuid (or ...
- 12:08 PM rbd Bug #3692: OSD's abort with "./common/Mutex.h: 89: FAILED assert(nlock == 0)"
- Chronology of events (UTC) in the latest example of this happening, in case it's relevant:
15:50:46 mon.b is s...
- 12:01 PM rbd Bug #3692 (Won't Fix): OSD's abort with "./common/Mutex.h: 89: FAILED assert(nlock == 0)"
- I've seen this happen twice:
- Reboot a node running a number of OSD's
- Within a short period of time, seemingly...
- 11:42 AM Bug #3689: osd: bad peering state machine event with mixed v0.52 and next cluster
- wip-3689 has a fix; please test!
- 10:58 AM rgw Bug #3682: valgrind errors seen when running rgw tests in nightlies
- ubuntu@teuthology:/a/teuthology-2012-12-27_19:00:03-regression-next-testing-basic/28728
- 10:53 AM Bug #3631: osdc/ObjectCacher.cc: 834: FAILED assert(ob->last_commit_tid < tid) during librbd_fsx
- ubuntu@teuthology:/a/teuthology-2012-12-27_19:00:03-regression-next-testing-basic/28662...
- 10:34 AM rbd Bug #3600: rbd: assert in objectcacher destructor after flatten
- recent log: ubuntu@teuthology:/a/teuthology-2012-12-27_19:00:03-regression-next-testing-basic/28713
- 06:10 AM Bug #3657 (Resolved): rbd: crash mapping image
- Done, pushed to master, and soon to be included in a pull request
to Linus for 3.8.
- 06:08 AM Revision 9967cf24 (ceph): release-notes: rgw logging now off by default
- Signed-off-by: Sage Weil <sage@inktank.com>
- 06:03 AM Revision 1c3e12a2 (ceph): doc: warn about using caching without QEMU knowing
- Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
- 06:02 AM Revision f6ce5dda (ceph): rgw: disable ops and usage logging by default
- Most users don't need this, and having it on will just fill their clusters
with objects that will need to be cleaned ...
- 04:28 AM Bug #3683 (In Progress): mon: leak of MMonPaxos
- 04:26 AM Bug #3633: mon: clock drift errors not reported by ceph status
- wip-3633 now has a couple of patches that introduce a mechanism to keep track of clock skews on the monitors.
If s...
- 01:24 AM Revision 64b845f6 (ceph): features is uint64_t
- This won't bite us for a while yet (we're on bit 26), but it will soon!
Signed-off-by: Sage Weil <sage@inktank.com>
...
- 01:15 AM Revision 2fbe3e17 (ceph): Merge remote-tracking branch 'gh/next'
- 12:55 AM Revision 856f32ab (ceph): ceph-fuse: Split main into init/main/finalize
- With the invalidate callback enabled for fuse, the Client::unmount
call requires the fuse channel and session objects...
- 12:39 AM Revision c0fe3815 (ceph): java: remove deprecated libcephfs
- Removes ceph_set_default_*
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
- 12:32 AM Revision 6c7b667b (ceph): init-ceph: fix status version check across machines
- The local state isn't propagated into the backtick shell, resulting in
'unknown' for all remote daemons. Avoid backt...
12/27/2012
- 11:39 PM Revision 774a54cb (ceph): docs: update release process documentation.
- Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
- 10:23 PM Bug #3689: osd: bad peering state machine event with mixed v0.52 and next cluster
- This looks like a compatibility issue with recovery queueing:...
- 05:26 PM Bug #3689: osd: bad peering state machine event with mixed v0.52 and next cluster
- Log from crashing osd with greater debug level https://dl.dropbox.com/u/5820195/ceph-osd.1.log.gz.
- 05:09 PM Bug #3689 (Resolved): osd: bad peering state machine event with mixed v0.52 and next cluster
- Reported by mgalkiewicz in #ceph. https://gist.github.com/raw/4393494/f3ae88406350b74ac6d608b8b75960f85435e85e/gist...
- 09:40 PM Revision af37cc3a (ceph): Merge remote-tracking branch 'gh/wip-mds'
- 09:26 PM Revision 63567392 (ceph): osd: fix recovery assert for pg repair case
- In the case of PG repair, this assert is not valid. Disable it for now.
Signed-off-by: Sage Weil <sage@inktank.com>
- 09:09 PM Revision 1fa8c83d (ceph): Merge branch 'wip-osd-flags'
- 09:07 PM Revision 207e93ab (ceph): Merge remote-tracking branch 'gh/wip-mds-pool'
- Reviewed-by: Sam Lang <sam.lang@inktank.com>
- 08:12 PM Revision 03f6dfa4 (ceph): osd: move rmw_flags to OpRequest, out of MOSDOp
- It was very sloppy to put a server-side processing state inside the
message. Move it to the OpRequestRef instead.
...
- 08:12 PM Revision f1dfd64f (ceph): messages/MOSDOpReply: remove misleading may_read/may_write
- These are OpRequest properties, calculated/enforced at the OSD. They don't
belong in the MOSDOp or MOSDOpReply messa...
- 08:12 PM Revision f2306038 (ceph): osd: only calculate OpRequest rmw flags once
- Signed-off-by: Sage Weil <sage@inktank.com>
- 08:04 PM Linux kernel client Bug #3519: rbd map hang during system startup
- Nick reports:
I have some exciting news. After 215 test runs, no hung processes
were detected. I think we may...
- 07:58 PM Bug #3657: rbd: crash mapping image
- I'm currently testing two patches related to this bug, and
while I haven't pushed them to the testing branch yet I
...
- 07:27 PM Revision 998f7194 (ceph): dropping xfs test 186 due to bug: 3685
- Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
- 07:14 PM Revision 98e7b598 (ceph): docs: remove extra release-process2 file.
- This file mostly duplicated the existing release documentation. Differences
have been merged into the primary file.
...
- 07:12 PM Revision 82c71716 (ceph): osd: drop 'osd recovery max active' back to previous default (5)
- Having this too large means that queues get too deep on the OSDs during
backfill and latency is very high. In my tes...
- 07:11 PM Revision 6f1f03c7 (ceph): journal: reduce journal max queue size
- Keep the journal queue size smaller than the filestore queue size.
Keeping this small also means that we can lower t...
- 07:09 PM Revision 0d2ad2f2 (ceph): mds: use set to store MDSMap data pools
- Signed-off-by: Sage Weil <sage@inktank.com>
- 06:53 PM Revision 80bcaa29 (ceph): rados: add filestore_idempotent test with journal aio = true
- 05:55 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
- this reproduced once out of ~60 runs on the fsx task....
- 05:36 PM Revision 2137d5cd (ceph): mds: wait for client's mdsmap when specifying data pool
- The client may have a newer map than we do; make sure we wait for it lest
we inadvertently reply because we think the...
- 05:33 PM Bug #3690: osd crashed in FileStore::_do_transaction
- leaving the cluster as it is for someone to take a look at it.
- 05:33 PM Bug #3690 (Resolved): osd crashed in FileStore::_do_transaction
- ceph version: 0.55.1-360-g6356739 (635673928a6b4dae6d4712cacad81cbac6412dc3)
I had a cluster[burnupi15, burnupi19,...
- 05:33 PM Revision 9da6d882 (ceph): doc: document mds config options
- Signed-off-by: Sage Weil <sage@inktank.com>
- 04:52 PM rbd Bug #3688 (Won't Fix): rbd allows image of size 0 to be created
- ceph version : 0.55.1-360-g6356739 (635673928a6b4dae6d4712cacad81cbac6412dc3)
rbd allows images created with zero ...
- 04:45 PM Documentation #3687 (Resolved): Documentation needs a "memory profiling" section
- While debugging what I thought was a Ceph memory leak, I was pointed to
http://ceph.com/deprecated/Memory_Profiling
...
- 04:37 PM devops Documentation #3686 (Resolved): install prerequisites (Debian)
- On http://ceph.com/docs/master/install/build-prerequisites/ , in the "On Debian/Squeeze, execute aptitude install ......
- 12:32 PM rbd Bug #3427: krbd: unmap does not remove block device properly
- I am going to assume that the racing open is the cause of
the problem reported by Nikola Kotur.
To fix it, I will...
- 12:17 PM rbd Bug #3427: krbd: unmap does not remove block device properly
- > For RBD, wasn't the use_count something we just added? Would it cover this situation?
No.
The first warning i...
- 08:53 AM rbd Bug #3427: krbd: unmap does not remove block device properly
- For cephfs, the vfs normally handles that.
For RBD, wasn't the use_count something we *just* added? Would it cove...
- 08:37 AM rbd Bug #3427: krbd: unmap does not remove block device properly
- I also note, having taken a little closer look at Nikola Kotur's
kernel log that both an open and a close appear to ...
- 08:31 AM rbd Bug #3427: krbd: unmap does not remove block device properly
- It looks to me like the osd client code has nothing in place
to protect itself from one of its users (ceph client, m... - 12:01 PM Bug #3546: CEPH 0.48.2 OSD crashed causing kernel RBD clients to reboot
- There aren't known leaks in argonaut. If you can reproduce with valgrind massif and see where the heap is going, tha...
- 11:28 AM rbd Bug #2689: qemu iozone test hangs
- Testing again since some possible causes were fixed.
- 10:54 AM rbd Bug #3685 (Closed): xfs test 186 fails in the nightlies
- logs: ubuntu@teuthology:/a/teuthology-2012-12-26_19:00:03-regression-next-testing-basic/28039
...
- 09:19 AM Bug #3684 (Resolved): filejournal: aio vector size is not limited
- FileJournal::write_aio_bl does not limit the size of the iov to IOV_MAX.
- 08:23 AM Bug #3683 (Resolved): mon: leak of MMonPaxos
- ubuntu@teuthology:/a/teuthology-2012-12-22_19:00:02-regression-next-testing-basic/22989
saw it a few days earlier,...
- 01:34 AM Revision 916d1cf6 (ceph): doc: journaler config options
- Signed-off-by: Sage Weil <sage@inktank.com>
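Bug #3684 above notes that FileJournal::write_aio_bl does not limit the size of the iov to IOV_MAX. As a rough illustration of the general remedy (splitting one long vector into batches no larger than the platform limit), here is a minimal Python sketch. It is not the Ceph fix: the helper names `iov_limit` and `split_iov` and the fallback value of 1024 are assumptions made for this example only.

```python
import os


def iov_limit(default=1024):
    """Return the platform's IOV_MAX, or `default` if it cannot be queried.

    The fallback of 1024 is an assumption for this sketch, not a Ceph setting.
    """
    try:
        limit = os.sysconf("SC_IOV_MAX")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing on some platforms, or the name is unknown.
        return default
    # sysconf may report -1 when the limit is indeterminate.
    return limit if limit > 0 else default


def split_iov(buffers, limit=None):
    """Split a list of buffers into batches of at most `limit` entries each."""
    if limit is None:
        limit = iov_limit()
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return [buffers[i:i + limit] for i in range(0, len(buffers), limit)]
```

Each batch returned by `split_iov` could then be submitted as its own vectored write, so that no single call carries more entries than the limit.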
12/26/2012
- 10:27 PM Revision c34e38bc (ceph): log: 10,000 recent log entries
- This is what we were (wrongly) doing before, so there are no memory
utilization surprises.
Signed-off-by: Sage Weil ...
- 10:27 PM Revision 4daede79 (ceph): log: fix log_max_recent config
- <facepalm>
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 4de7748b72d4f90eb1197a70015c199c15...
- 10:27 PM Revision fdae0552 (ceph): log: fix flush/signal race
- We need to signal the cond in the same interval where we hold the lock
*and* modify the queue. Otherwise, we can hav...
- 08:54 PM Revision cedea139 (ceph): docs: Merge changes from release-process2 document.
- 07:58 PM Revision 850a056b (ceph): mds: add waiting_for_mdsmap queue
- Defer events until we get a specific MDSMap epoch.
Signed-off-by: Sage Weil <sage@inktank.com>
- 07:58 PM Revision c764935d (ceph): mds: do not check for pool existence in osdmap
- We don't have a wait mechanism to ensure the MDSMap has the latest osdmap
here. Just trust the MDSMap.
Signed-off-b...
- 06:55 PM Revision 4929fc7d (ceph): qa: remove xfstests 172 and 173 from qemu testing
- These seem to require newer xfs.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
- 05:42 PM Revision f5403f94 (ceph): doc/man/8/mkcephfs: update --mkfs a bit
- Document that 'devs' and 'osd mkfs type' must be defined.
Signed-off-by: Sage Weil <sage@inktank.com>
- 04:23 PM rgw Bug #3682: valgrind errors seen when running rgw tests in nightlies
- log: ubuntu@teuthology:/a/teuthology-2012-12-26_03:00:10-regression-master-testing-gcov/27925
- 04:20 PM rgw Bug #3682 (Resolved): valgrind errors seen when running rgw tests in nightlies
- Logs: ubuntu@teuthology:/a/teuthology-2012-12-26_03:00:10-regression-master-testing-gcov/27924
ubuntu@teuthology:/...
- 03:48 PM Bug #3378 (Can't reproduce): common/HeartbeatMap.cc: 78: FAILED assert(0 == "hit suicide timeout")
- The suicide timeout is the symptom only. Usually it means the thread is blocked by a hung syscall. In your case, Ma...
- 02:38 PM rbd Bug #3427: krbd: unmap does not remove block device properly
- I haven't spent time on this in almost a month so wanted to just
provide an update. We have been looking at and try... - 01:02 PM Bug #3546: CEPH 0.48.2 OSD crashed causing kernel RBD clients to reboot
- We are using 0.48.2 for the OSDs and our plan is to upgrade to 0.56 (or the next stable release) when it comes out.
- 11:45 AM Bug #3546 (Won't Fix): CEPH 0.48.2 OSD crashed causing kernel RBD clients to reboot
- The crash is a known problem with pre-3.4 kernels. Fixes have been backported to 3.4 stable and 3.6 stable kernels, ...
- 11:36 AM Bug #3546: CEPH 0.48.2 OSD crashed causing kernel RBD clients to reboot
- At the time, the clients were running 3.2.0-32, but we have since upgraded to 3.6.9 per another ceph bug.
We have... - 11:25 AM Bug #3546: CEPH 0.48.2 OSD crashed causing kernel RBD clients to reboot
- What kernel version are you running?
- 11:05 AM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
- another run:...
- 09:38 AM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
- another one:...
- 09:59 AM CephFS Bug #3681 (Resolved): kclient fsx fails nightly
- ...
- 08:44 AM Bug #3676: osd keeps crashing at ReplicatedPG::scan_range()
- Xiaopong Tran wrote:
> I'm using xfs, with no specific mount options, just the default.
>
> I added the debug set... - 02:22 AM Bug #3676: osd keeps crashing at ReplicatedPG::scan_range()
- I'm using xfs, with no specific mount options, just the default.
I added the debug settings, and got a large log f... - 08:39 AM CephFS Feature #3679 (Closed): Any API to get metadata?
- Yep! See libcephfs. There is...
- 01:08 AM CephFS Feature #3679 (Closed): Any API to get metadata?
- Hello there.
I am wondering if there is any API to get the metadata of a file.
I have the ceph file system run by ... - 07:20 AM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
- err... that should have been each monitor's ip and port.
as in... - 01:10 AM CephFS Tasks #3680 (Rejected): deduplication in ceph
- I am wondering how to do deduplication in ceph...the big problem is how to get the metadata of a file
and how to mod...
12/25/2012
- 08:35 PM Bug #3378: common/HeartbeatMap.cc: 78: FAILED assert(0 == "hit suicide timeout")
- Saw this show up during parametric sweep testing on EXT4 with 8 concurrent OSD disk threads. Ceph build is from gitb...
12/24/2012
- 02:58 PM CephFS Feature #1448 (In Progress): test hadoop on sepia
- 02:58 PM CephFS Cleanup #814 (Resolved): hadoop: refactor hadoop shim in terms of java libceph bindings
- 02:56 PM rbd Feature #3580 (Resolved): rbd import from stdin could try harder to sparsify images
- 02:54 PM rgw Feature #1950: rgw: create S3/Swift ACL interoperability suite
- 12:27 PM Bug #3676 (Need More Info): osd keeps crashing at ReplicatedPG::scan_range()
- ...
- 12:04 PM Bug #3678 (Resolved): osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
- ...
- 09:22 AM rbd Bug #3654 (Fix Under Review): libvirt: colons in ipv6 monitor addresses are not escaped when sent...
- 08:45 AM rbd Fix #3665: librbd: deadlock during flatten
- the problem is that we are holding the snap_lock and then waiting for io. but we mostly use snap_lock as a tight inne...
- 04:01 AM Revision d18f3c2d (ceph): mds: don't force in->first == dn->first
- The fullbit sets it now. For multiversion inodes, its "first" can be in
the future, since this dentry may not have ... - 04:01 AM Revision 8b599083 (ceph): mds: replace closed sessions on connect
- If a connection comes and there is a closed session attached, remove it.
This is probably a failure of an old session... - 04:01 AM Revision a3e70aed (ceph): mds: always send discover if want_xlocked is true
- If want_xlocked is true, we can not rely on previously sent discover
because it's likely the previous discover is blo... - 04:01 AM Revision 96f48aa0 (ceph): mds: re-issue caps after importing caps
- The imported caps may prevent unstable locks from entering stable
states. So we should call Locker::eval_gather() wit... - 04:01 AM Revision dd441576 (ceph): mds: take export lock set before sending MExportDirDiscover
- Migrator::export_dir() only checks if it can lock the export lock set
but does not take the lock set. So someone else can c... - 04:01 AM Revision 1174dd31 (ceph): mds: don't retry readdir request after issuing caps
- If remote linkage without inode is encountered after some caps are
issued, Server::handle_client_readdir() should sen... - 04:01 AM Revision f5e86ecb (ceph): mds: delay processing cache expire when state >= EXPORT_EXPORTING
- It's possible that MDS receives cache expire in EXPORT_LOGGINGFINISH
and EXPORT_NOTIFYING states.
Signed-off-by: Yan... - 04:01 AM Revision efbca31d (ceph): mds: fix file existing check in Server::handle_client_openc()
- Creating new file needs to be handled by directory fragment's auth
MDS, opening existing file in write mode needs to ... - 04:01 AM Revision 00025462 (ceph): mds: fix race between send_dentry_link() and cache expire
- MDentryLink message can race with cache expire. When it arrives at
the target MDS, it's possible there is no correspo... - 04:01 AM Revision a1485f95 (ceph): mds: compare sessionmap version before replaying imported sessions
- Otherwise we may wrongly increase mds->sessionmap.version, which
will confuse future journal replays that involve s... - 04:01 AM Revision 48d8ae58 (ceph): mds: allow handle_client_readdir() fetching freezing dir.
- At that point, the request already auth pins and locks some objects.
So CDir::fetch() should ignore the can_auth_pin ... - 04:01 AM Revision 0ab0744e (ceph): mds: properly mark dirfrag dirty
- If predirty_journal_parents() does not propagate changes in dir's
fragstat into corresponding inode's dirstat, it sho... - 04:01 AM Revision b7e698a5 (ceph): mds: no bloom filter for replica dir
- We should delete dir fragment's bloom filter after exporting the dir
fragment to other MDS. Otherwise the residual bl... - 04:01 AM Revision e6b8f0a6 (ceph): mds: set want_base_dir to false for MDCache::discover_ino()
- When frozen inode is encountered, MDCache::handle_discover() sends
reply immediately if the reply message is not empt... - 04:01 AM Revision 69f9f024 (ceph): mds: fix error handling in MDCache::handle_discover_reply()
- The error handling code in MDCache::handle_discover_reply() has two
main issues. MDCache::handle_discover_reply() doe... - 03:59 AM Revision d9673ca3 (ceph): Merge branch 'wip-create-layout'
- Reviewed-by: Greg Farnum <greg@inktank.com>
The functional tests for the create operations should add and specify no... - 03:39 AM Revision d2f5890f (ceph): client, libcephfs: add method to get the pool name for an open file
- Signed-off-by: Sage Weil <sage@inktank.com>
- 03:39 AM Revision 8efcf54d (ceph): mds: *_pg_pool -> *_pool
- Signed-off-by: Sage Weil <sage@inktank.com>
- 03:39 AM Revision 697ed23c (ceph): client: remove set_default_*() methods
- This is a poor interface. The hadoop stuff is shifting to specify this
information on file creation instead.
Signed... - 03:39 AM Revision 99d9e1da (ceph): mds: allow data pool to be specified on create
- Reuse old preferred_pg field. Only use if the new CREATEPOOLID feature
is present, and the value is >= 0.
Verify th... - 03:39 AM Revision 3f458217 (ceph): mds: verify that the pool id is valid on SET[DIR]LAYOUT
- Make sure the data pool exists and is part of the MDSMap data pools list.
Signed-off-by: Sage Weil <sage@inktank.com> - 03:39 AM Revision 32ab274a (ceph): client: specify data pool on create operations
- Fill in the data pool field if specified by the client, or set to -1.
Signed-off-by: Sage Weil <sage@inktank.com>
12/23/2012
- 11:21 PM Revision 61d43af7 (ceph): osd: make MOSDFailure output more sensible
- Signed-off-by: Sage Weil <sage@inktank.com>
- 11:21 PM Revision 850d1d54 (ceph): osd: fix dup failure cancellations
- If we had a pending failure report, and send a cancellation, take it
out of our pending list so that we don't keep re... - 11:11 PM Revision 9df522e9 (ceph): mon: make osd failure report log msgs sensible
- Signed-off-by: Sage Weil <sage@inktank.com>
- 10:42 PM Revision 1290671f (ceph): Merge branch 'wip-scrub' into next
- Reviewed-by: Sage Weil <sage@inktank.com>
Conflicts:
src/osd/PG.cc - 09:53 PM Revision 8362e640 (ceph): monclient: fix get_monmap_privately retry interval
- Use mon_client_hunt_interval (default 3) instead of hardcoding 1 second.
Signed-off-by: Sage Weil <sage@inktank.com> - 09:53 PM Revision d843a64a (ceph): Makefile: fix 'base' rule
- Signed-off-by: Sage Weil <sage@inktank.com>
- 09:12 PM CephFS Cleanup #3677 (Closed): libcephfs, mds: test creation/addition of data pools, create policy
- the create data pool argument is tested only with the default pools. once a lib is in place for the unit/functional...
- 09:06 PM CephFS Bug #3663 (Rejected): ceph kernel client is getting stuck on xstat* operations
- No worries. Let us know if you do come across behavior that looks like a bug!
- 08:59 PM CephFS Bug #3663: ceph kernel client is getting stuck on xstat* operations
- Hi Sage,
I am very sorry for taking your time with this issue, I feel like an idiot :(
The buggy client is runnin... - 07:19 PM Revision 00b89c3f (ceph): Merge branch 'next'
- 07:18 PM Revision a09f5b1b (ceph): init-ceph,mkcephfs: default inode64 for mounting xfs
- According to hch this is now the default for new kernels.
Signed-off-by: Sage Weil <sage@inktank.com> - 03:22 PM Bug #3675 (Fix Under Review): osd: hang during initial peering
- wip-3675
12/22/2012
- 09:39 PM Bug #3675: osd: hang during initial peering
- ok, this is actually also a race that can cause the register_pipe assert. the locking needs to be reworked here. pu...
- 09:28 PM Bug #3675: osd: hang during initial peering
- ...
- 08:54 PM Bug #3675: osd: hang during initial peering
- it took about 1500 iterations of this job to reproduce the hang:...
- 08:53 PM Bug #3675: osd: hang during initial peering
- ubuntu@teuthology:/a/sage-peer1/21827
- 08:52 PM Bug #3675: osd: hang during initial peering
- this is a messenger bug. if there is a socket error at the end of accept(), after the register_pipe(), we then fail ...
- 08:11 AM Bug #3675 (Resolved): osd: hang during initial peering
- the initial wait for healthy blocked on 2 pgs. ms inject socket failures = 500. everything was up.
no logs, so it... - 07:10 PM Revision 5f25f9f8 (ceph): init-ceph: default osd_data path
- Signed-off-by: Sage Weil <sage@inktank.com>
- 02:40 PM Bug #3657 (In Progress): rbd: crash mapping image
- I got a response from Ugis. The patches I supplied to him
did stop the crashes he was seeing. So we'll want to get... - 09:22 AM Bug #3676 (Can't reproduce): osd keeps crashing at ReplicatedPG::scan_range()
- This specific osd (osd.17) keeps crashing at the same location, as I tried to bring it back. It would start peering a...
- 07:04 AM Documentation #3674 (Resolved): Deployment documentation is confusing
- As a new user who spent hours googling and reading source code to decipher what each tool does, I thought of giving s...
- 06:29 AM devops Feature #3255: ceph-disk: allow prepare without activate (for spares)
- Couldn't ceph-disk-prepare take a lock by e.g. writing a file (or even flock()ing it) in /var/lib/ceph/ before it sta...
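The locking idea suggested in this comment can be sketched roughly as follows. This is only an illustration of advisory locking with flock(), not how ceph-disk-prepare is implemented; the helper names and the lock file location are hypothetical, and the caller would pick a path such as a file under /var/lib/ceph/.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def prepare_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block.

    lock_path is a hypothetical lock file chosen by the caller.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until any other holder releases
        yield fd
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def try_lock(lock_path):
    """Try to take the lock without blocking; return the fd, or None if it is held."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd
```

A prepare step would wrap its partitioning work in `with prepare_lock(path):`, while an activate step could call `try_lock(path)` and skip the device when it returns None. The lock is advisory, so it only helps if every cooperating tool takes it.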
- 04:37 AM Revision ad9bcc70 (ceph): PG: don't use a self-transition for WaitRemoteRecoveryReserved
- Previously, using the state on active worked, but now we might
go back through WaitRemoteRecoveryReserved without res... - 04:37 AM Revision f6b2ca8b (ceph): OSD: always do a deep scrub when repairing
- Otherwise, errors turned up in a deep-scrub will be
swept under the rug without being repaired.
Signed-off-by: Samue... - 04:35 AM Revision 2e96bb18 (ceph): PG: Handle repair once in scrub_finish
- We don't want to change missing sets during a chunky
scrub since it would cause !is_clean() and derail
the rest of th... - 02:47 AM devops Feature #3673 (Rejected): ceph-disk-prepare should provide an option for SSD alignment
- ceph-disk-prepare takes an option to use an external disk as a journal. It is commonly suggested that the journal is ...
- 01:12 AM Revision bdcf6647 (ceph): .gitignore: Add ar-lib to ignore list
- 01:03 AM Revision 4a558048 (ceph): librbd: move buf_is_zero() to new common/util.cc and include/util.h
- Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com> - 01:03 AM Revision 410903fe (ceph): rbd: check for all-zero buf in export, seek output if so
- Use buf_is_zero in common/util.cc
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durg... - 01:03 AM Revision 5905d7fa (ceph): rbd: harder-working sparse import from stdin
- Try to accumulate image-sized blocks when importing from stdin, even if
each read is shorter than requested; if we ge... - 01:03 AM Revision 6325a480 (ceph): import_export.sh: sparse import export
- Add tests for:
- sparse import makes expected sparse images
- sparse export makes expected sparse files
- sp... - 12:55 AM Revision 51a900cf (ceph): autogen.sh: Create m4 directory for leveldb
- Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
- 12:47 AM Revision 8f5de156 (ceph): osd: fix pg stat msgs vs timeout
- We can get a pattern like so:
- new mon session
- after say 120 seconds, we decide to send a stats msg
- outstanding... - 12:19 AM Revision 74473bb6 (ceph): leveldb: Update submodule
- Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
- 12:14 AM Revision 2bf4f42b (ceph): doc: Added new journaler page to CephFS section. Needs descriptions.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 12:14 AM Revision 53afac1a (ceph): doc: Added Journaler Configuration to toc tree.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 12:09 AM Revision 757902d6 (ceph): doc: Added --mkfs options.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 12:08 AM Revision 46d03344 (ceph): doc: Added running multiple clusters. Per Tommi.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 12:07 AM Revision e3d07566 (ceph): doc: Updated the Configuration File section.
- - Replaced ceph.conf with Ceph configuration to clarify
when running multiple clusters on the same hardware.
- Adde...
12/21/2012
- 11:20 PM Revision 00ed6657 (ceph): PG::scrub_compare_maps increment scrubber.fixed for missing repairs
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 11:16 PM Revision c9e05174 (ceph): PG::_compare_scrubmaps: increment scrubber.errors on missing object
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 11:15 PM Revision b564fdb8 (ceph): release-notes: remove warning about osd caps
- This was only an issue from 0.49-0.52 upgrading to 0.53+
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> - 11:15 PM Revision 3076e459 (ceph): release-notes: pgnum is required now
- This should have been in the 0.55 release notes.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> - 11:15 PM Revision 048567e0 (ceph): release-notes: fix typos
- Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
- 11:15 PM Revision b39928df (ceph): release-notes: remove bug fix that does not affect argonaut
- Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
- 11:15 PM Revision 4a039393 (ceph): release-notes: add more user-visible changes
- These are from looking through the shortlog from 0.48.2..next.
The description of the min_size defaults could probabl... - 10:54 PM Revision 09d4f036 (ceph): doc: Added sudo the ceph health for when cephx is on.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 10:53 PM Revision 085992f6 (ceph): doc: minor fix to syntax.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 10:23 PM Revision 206ffcd8 (ceph): mkcephfs: error out if 'devs' defined but 'osd fs type' not defined
- We can infer btrfs if they use btrfs devs, but if they use devs there is
no default fs.
Signed-off-by: Sage Weil <sa... - 10:04 PM Revision 4a40067d (ceph): doc: update ceph.conf examples about btrfs default
- Signed-off-by: Sage Weil <sage@inktank.com>
- 10:00 PM Revision 677a7a5a (ceph): rgw: add swift tasks
- Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
- 09:56 PM Revision 11fb3141 (ceph): Merge remote-tracking branch 'gh/wip-scrub' into next
- 09:45 PM Revision 47145d80 (ceph): Merge remote-tracking branch 'gh/wip-3643' into next
- Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
- 09:44 PM Revision 999ba1b2 (ceph): monc: only warn about missing keyring if we fail to authenticate
- This avoids the situation where a librados or other user with the default
of 'cephx,none' and no keyring is authentic... - 09:10 PM Revision 5d5a42bc (ceph): osd: clear CLEAN on exit from Clean state
- This means we can drop the scrub repair state_clear() call. We probably
can drop others, but let's leave that for ano... - 08:19 PM Revision b3e62ad6 (ceph): auth: use none auth if keyring not found
- If both cephx and none are accepted auth methods, and
cephx keyring cannot be found then resort to using
none, instea... - 07:37 PM Revision ae044e64 (ceph): osd: allow transition from Clean -> WaitLocalRecoveryReserved for repair
- If we do a scrub repair, we need to go from clean to recovery again to
copy objects around.
This fixes a simple repa... - 07:36 PM Revision 7c56d8fa (ceph): PG::sched_scrub: return true if scrub newly kicked off
- The previous return value wasn't really what OSD::sched_scrub
wanted to know.
Signed-off-by: Samuel Just <sam.just@i... - 07:36 PM Revision 4d661e0d (ceph): PG::sched_scrub: only set PG_STATE_DEEP_SCRUB once reserved
- Otherwise we would have +DEEP before we have +SCRUB.
Signed-off-by: Samuel Just <sam.just@inktank.com> - 07:29 PM Revision 19e44bff (ceph): osd: clear scrub state if queued scrub doesn't start
- We set SCRUBBING when we queue a pg for scrub. If we dequeue and
call scrub() but abort for some reason (!active, de... - 07:29 PM Revision 670afc6c (ceph): PG: in sched_scrub() set PG_STATE_DEEP_SCRUB not scrubber.deep
- scrubber.deep gets reset in scrub() to match
state_test(PG_STATE_DEEP_SCRUB).
Signed-off-by: Samuel Just <sam.just@i... - 06:20 PM Revision c02d34dc (ceph): task/swift: change upstream repository url
- Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
- 06:20 PM Revision 2f829870 (ceph): task/swift: change upstream repository url
- Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
- 06:15 PM Revision feb0aad2 (ceph): doc: Moved path to individual OSD entries.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 04:41 PM Bug #3661: mon: idle/empty osds marked down after 15 min
- commit:8f5de156056de78c90f1dc7bf7c5a131c32c1bb8
- 03:58 PM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
- ubuntu@ceph3:/etc/ceph$ ceph -m ip:port -s
server name not found: ip (servname not supported for ai_socktype)
unabl... - 11:10 AM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
- Pat, just a little triage before I dive into this full head on, could you please try the following for each monitor?
... - 02:39 PM CephFS Documentation #3672 (Resolved): doc: how to mount ceph-fuse from fstab
- There's a new mount helper in bobtail for this. It contains these comments:...
- 02:36 PM Bug #3662: mkcephfs --mkfs is not inserting any default settings
- anywhere i inserted a hash "#", this lovely program made them into numbered columns, so if you see a block with
x... - 02:23 PM Bug #3662: mkcephfs --mkfs is not inserting any default settings
- ok, looks like my conf settings got munged.
let's try this again
i was trying to get the mkcephfs to create a defau... - 02:26 PM rgw Feature #3671 (Resolved): Request for x-amz-grant-full-control support
- DH is requesting support for x-amz-grant-full-control:
"With Amazon S3, you can do specific grants like
x-amz-g... - 02:23 PM rgw Feature #3670 (Resolved): Request for bucket-owner-read and bucket-owner-full-control grants
- From DH, they'd like to see two types of requests which we currently ignore.
"Amazon has bucket-owner-read and buc... - 01:59 PM Linux kernel client Bug #1492 (Can't reproduce): fsx failure on kclient
- 01:55 PM rgw Feature #3669 (Resolved): rgw: support acl grants through http headers
- support x-amz-grant-* http header fields.
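For illustration, the x-amz-grant-* headers could be turned into grant tuples along these lines. This is a minimal sketch of the S3 header format as publicly documented (comma-separated `id="..."`, `uri="..."` or `emailAddress="..."` grantees), not the rgw implementation; the function name and the tuple layout are assumptions, and grantee values containing commas are not handled.

```python
import re

GRANT_PREFIX = "x-amz-grant-"
# One grantee looks like: id="...", uri="..." or emailAddress="..."
_GRANTEE_RE = re.compile(r'\s*(id|uri|emailAddress)\s*=\s*"([^"]*)"\s*')


def parse_grant_headers(headers):
    """Return (permission, grantee_type, grantee_value) tuples from request headers.

    Hypothetical helper: header names are matched case-insensitively and
    x-amz-grant-read-acp becomes the permission READ_ACP.
    """
    grants = []
    for name, value in headers.items():
        lname = name.lower()
        if not lname.startswith(GRANT_PREFIX):
            continue
        permission = lname[len(GRANT_PREFIX):].upper().replace("-", "_")
        for part in value.split(","):
            match = _GRANTEE_RE.fullmatch(part)
            if match is None:
                raise ValueError("malformed grantee in %s: %r" % (name, part))
            grants.append((permission, match.group(1), match.group(2)))
    return grants
```

Headers that do not start with the prefix are ignored, so the whole request header dictionary can be passed in unchanged.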
- 01:48 PM Bug #3643 (Resolved): default authentication on the client does not work without a config file or...
- commit:47145d800951db396785560df4e6d5d344af97dd
- 12:17 PM Bug #3643 (Fix Under Review): default authentication on the client does not work without a config...
- 11:18 AM Bug #3658 (Resolved): osd/mon: stops processing pg stat messages
- pretty sure this was caused by the log bug and 'log max new = 1', fixed by commit:50914e7a429acddb981bc3344f51a793280...
- 11:03 AM Bug #3657: rbd: crash mapping image
- hmm. yeah, it probably means we should set the required features during negotiation to include MSG_AUTH instead of ...
- 10:56 AM Bug #3657: rbd: crash mapping image
- There is another thing that came from the two crash logs Ugis
just supplied. They both contained lines like this:
... - 10:47 AM Bug #3657: rbd: crash mapping image
- Ugis supplied two more images containing captured crash
stack traces. Both contained lines like this:
[ 32... - 10:27 AM rgw Feature #3668 (Resolved): rgw: support CORS
- 10:21 AM rgw Feature #3667 (Resolved): rgw: support extra canned acl params
- bucket-owner-read, bucket-owner-full-control
- 10:20 AM CephFS Bug #3666 (Resolved): Segfault running test_libcephfs
- ...
- 10:18 AM Bug #3650 (In Progress): osd: crash in Reset state -> start_peering_interval -> on_change -> proc...
- 09:36 AM Bug #3650: osd: crash in Reset state -> start_peering_interval -> on_change -> process_event Reset
- 10:03 AM rbd Fix #3665 (Resolved): librbd: deadlock during flatten
- Ran into this trying to reproduce #3631.
The test_librbd_fsx process is still running on plana34 for debugging.
... - 08:36 AM CephFS Bug #3655 (Can't reproduce): client: hang in fsstress
- I ran this test throughout the day yesterday and couldn't reproduce it, with message delays enabled. Marking as can'...
- 08:32 AM rbd Bug #3664 (Resolved): osdc/ObjectCacher.cc: 517: FAILED assert(!i->size())
- ...
- 07:52 AM CephFS Bug #3663: ceph kernel client is getting stuck on xstat* operations
- Hi Roman-
The logging levels are right, but in both mds logs neither mds was ever active; both were in the up:stan... - 05:45 AM Revision e765dcb4 (ceph): osd: only dec_scrubs_active if we were active
- This fixes a bug that puts scrubs_active negative.
Signed-off-by: Sage Weil <sage@inktank.com> - 05:44 AM Revision ada3e27f (ceph): osd: reintroduce inc_scrubs_active helper
- This mostly generates nice debug output. It also slightly simplifies
code and makes things symmetric.
Signed-off-by... - 01:43 AM Revision ae26432d (ceph): Merge remote-tracking branch 'gh/next'
- 12:49 AM Revision bc4f74c7 (ceph): ceph.spec.in: Fedora builds debuginfo by default.
- Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
- 12:24 AM Revision accce830 (ceph): Merge remote-tracking branch 'upstream/wip_notify' into next
- Reviewed-by: Sage Weil <sage@inktank.com>
12/20/2012
- 11:51 PM Revision 129a49ad (ceph): cephtool: mention ceph osd ls, fix ceph osd tell N bench
- Add ceph osd ls to help; make help for ceph osd tell N bench look
more like injectargs, which says <osd-id or *> to m... - 11:32 PM Revision a36d1db1 (ceph): rgw: remove noisy log message
- No need for that log message.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com> - 11:30 PM Revision 5b5a19ac (ceph): rgw: fix daemonize initialization
- Just call the common daemonize function. Otherwise we end up
not initializing stdout / stderr correctly.
Signed-off-b... - 11:10 PM Revision 754fc200 (ceph): release notes: Mention new cephtool commands
- ceph osd ls and ceph tell osd.N version are new. Mention their use
for verifying that all OSDs are upgraded in the n... - 10:19 PM CephFS Bug #3663: ceph kernel client is getting stuck on xstat* operations
- Hello Sage,
added 4 logs:
screen output from console of the laggy client. it ends up on 'jroger@pr02:~/data$ cp... - 09:07 PM CephFS Bug #3663 (Need More Info): ceph kernel client is getting stuck on xstat* operations
- Hmm. It's actually just saying it's the oldest client; it's not actually too old (yet). The looping connect attempts...
- 08:48 PM CephFS Bug #3663 (Rejected): ceph kernel client is getting stuck on xstat* operations
- there are 2 kernel clients happily working with ceph. as soon as I try mounting ceph from the third client, it's gett...
- 09:54 PM Bug #3661 (In Progress): mon: idle/empty osds marked down after 15 min
- 04:57 PM Bug #3661 (Resolved): mon: idle/empty osds marked down after 15 min
- wip-mon
- 09:48 PM Revision 50914e7a (ceph): log: fix flush/signal race
- We need to signal the cond in the same interval where we hold the lock
*and* modify the queue. Otherwise, we can hav... - 09:29 PM Revision c0e23712 (ceph): ReplicatedPG::remove_notify : don't leak the notify object
- Following remove_notify, there are no other references to
notif, delete it.
Signed-off-by: Samuel Just <sam.just@ink... - 09:27 PM Revision b5031a22 (ceph): OSD,ReplicatedPG: do not track notifies on the session
- handle_notify_timeout and remove_notify currently do not clean up this
state leaving dangling Notification*. Further... - 08:59 PM Revision 719679ea (ceph): doc: Added package and repo links for Apache and FastCGI. Added SSL ena...
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 08:59 PM Revision 04eb1e73 (ceph): doc: Fixed restructuredText usage.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 07:42 PM Bug #3662: mkcephfs --mkfs is not inserting any default settings
- The algorithm appears to be
1) if 'devs' is not defined, look for 'btrfs devs'; if that's defined, use those for ... - 05:00 PM Bug #3662 (Won't Fix): mkcephfs --mkfs is not inserting any default settings
- It was my understanding that "sudo mkcephfs -a -c ceph.conf -k ceph.keyring --mkfs" would format a device with btrfs ...
- 07:39 PM Revision ea9fc87d (ceph): doc: Removed foo. Apparently myimage was added and foo not removed.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 07:07 PM Revision 9f67c450 (ceph): Merge branch 'next'
- 07:04 PM Revision 17c627b5 (ceph): Merge remote-tracking branch 'gh/wip-cephtool' into next
- 06:58 PM Revision 0953ce53 (ceph): rados: add cephtool test
- 06:49 PM Revision f38d8911 (ceph): Merge branch 'wip-build-fixes' into next
- 06:46 PM Bug #3627 (Resolved): osd: segfault in ~MOSDSubOp during thrashing+rbd_fsx
- accce830514c6b099eb0e00a8ae34396d14565a3 should fix it.
- 06:45 PM Bug #3659 (Resolved): complete_notify crash
- accce830514c6b099eb0e00a8ae34396d14565a3 should take care of it.
- 12:24 PM Bug #3659: complete_notify crash
- Saw on alexandria
- 12:23 PM Bug #3659 (Resolved): complete_notify crash
- l=0).accept connect_seq 58 vs existing 57 state standby
2012-12-19 18:20:42.186013 7f3b1afe7700 0 <cls> cls/rgw/cls... - 06:13 PM Revision a803159b (ceph): rgw: configurable exit timeout
- Fixes: #3638
rgw exit timeout secs : number of seconds to wait for process
to exit cleanly before forcing exit. If s... - 06:07 PM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
- I just did an "scp" to burnupi40.front.sepia.ceph.com:/home/ubuntu/3647.vm.tgz
- 04:32 PM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
- I am in Sunnyvale and the VMs reside on my desktop. I have snapshotted and created a tar file of my 3 node cluster. ...
- 07:32 AM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
- Pat, do you still have the VMs in this state? If so, can I take a look?
- 05:45 PM Revision 92b59e90 (ceph): rgw: don't try to assign content type if not found
- Fixes: #3648
Cannot assign a NULL pointer into stl string. This is only
relevant to swift, when uploading an object w... - 04:53 PM Revision c02e9062 (ceph): Merge remote-tracking branch 'gh/wip-crushtool' into next
- Reviewed-by: Caleb Miles <caleb.miles@inktank.com>
- 03:49 PM rbd Bug #3524: test_librbd_fsx: crash after flatten
- Sam saw this come up again in: ubuntu@teutholog:/a/sam-ooo3/19022
It's a different cause of the same symptom. In t... - 02:52 PM Bug #3660 (Resolved): osd: marking objects lost invalidates pg stats
- If you lose an object, the pg stats become invalid, and the next scrub will report a problem.
We could mark the st... - 02:20 PM Linux kernel client Bug #3519: rbd map hang during system startup
- We've learned a few things since my last update, but the main
thing is that Nick tried the latest thing I offered an... - 11:41 AM Bug #3496 (Resolved): doc: have old URLs redirect to new ones
- 11:41 AM Documentation #3564 (Resolved): doc: many broken links since rearrangement
- 11:40 AM rgw Documentation #2989 (Resolved): doc: write RGW troubleshooting
- 11:40 AM Bug #3656 (Resolved): docs: "foo" doesn't mean anything in rbd example
- Apparently foo was the image name, and myimage was added and foo not removed.
- 11:36 AM Bug #3656 (In Progress): docs: "foo" doesn't mean anything in rbd example
- 05:29 AM Bug #3656: docs: "foo" doesn't mean anything in rbd example
- Whoops, I forgot to assign it.
- 05:28 AM Bug #3656 (Resolved): docs: "foo" doesn't mean anything in rbd example
- Someone named "Ugis" on the mailing list was having trouble
with the rbd command. One of the things this person men... - 10:10 AM rgw Bug #3638 (Resolved): rgw: configurable exit timeout
- Fixed, commit:04e7a5ca1364166a6b93e6cd0fcf58faf629a01c
- 09:47 AM rgw Bug #3648 (Resolved): rgw: swift put object with empty mime type crashes
- Fixed, commit:92b59e90590aee501ae090adebf58978912f9dd3.
- 09:42 AM Bug #3658: osd/mon: stops processing pg stat messages
- see /a/sage-ooo2, /a/sam-ooo3
- 09:42 AM Bug #3658 (Resolved): osd/mon: stops processing pg stat messages
- ...
- 09:37 AM Feature #3622 (Rejected): RADOS pools should support more than 65535 PGs
- kernel limit only
- 07:40 AM Bug #3633: mon: clock drift errors not reported by ceph status
- 'HEALTH_OK' and 'HEALTH_WARN' are assessed in a way that makes it non-trivial to leverage the existing way of doing t...
- 06:03 AM Revision 799c59ae (ceph): rgw: remove useless configurable, fix swift auth error handling
- Fixes: #3649
No need to have an extra configurable to use keystone. Use keystone
whenever keystone url has been speci... - 06:03 AM Revision 08c64249 (ceph): rgw: don't initialize keystone if not set up
- Fixes: #3653
No need to initialize keystone, including the keystone
revocation thread which was verbose if keystone ... - 05:56 AM Bug #3657 (Resolved): rbd: crash mapping image
- I'm just creating this to track some activity from someone
on the mailing list reporting kernel crashes when attempt... - 01:07 AM Revision 3ed2d59e (ceph): rgw: fix error handling with swift
- Fixes: #3649
verify_swift_token returns a bool and not an int.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com> - 12:51 AM Revision 9a9778fb (ceph): Merge remote-tracking branch 'upstream/wip_pg_temp' into next
- Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Luis <joao.luis@inktank.com>
12/19/2012
- 11:19 PM CephFS Bug #3655 (Can't reproduce): client: hang in fsstress
- fsstress stuck in _read_sync()
#0 pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_6... - 10:22 PM Revision 5497d228 (ceph): doc: Modified the demo configuration file for Bobtail.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 10:02 PM Revision 40fdd773 (ceph): doc: Added Gateway Quick Start.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 10:02 PM Revision 5281ee24 (ceph): doc: Added Gateway Quick Start configuration file.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 10:01 PM rgw Bug #3649 (Resolved): rgw: swift list buckets returns empty result
- Fixed, commit:799c59ae89c9a70f08d9bf2e7624d25e6641d41f.
- 05:13 PM rgw Bug #3649 (Fix Under Review): rgw: swift list buckets returns empty result
- 05:02 PM rgw Bug #3649: rgw: swift list buckets returns empty result
- A backport is required for the bad error handling that triggered the symptoms.
- 04:59 PM rgw Bug #3649: rgw: swift list buckets returns empty result
- This was happening when trying to use keystone, but without specifying 'rgw swift use keystone'. Ended up shortcuttin...
- 07:39 AM rgw Bug #3649 (Resolved): rgw: swift list buckets returns empty result
- 10:01 PM Revision 84fb371d (ceph): Updated Getting Started index to include Gateway Quick Start.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 10:01 PM rgw Bug #3653 (Resolved): In bobtail, turn off keystone errors in radosgw.log when not applicable
- done, commit:08c64249eb8cd7922de5c398a9426538918db77c.
- 05:13 PM rgw Bug #3653 (Fix Under Review): In bobtail, turn off keystone errors in radosgw.log when not applic...
- 01:30 PM rgw Bug #3653 (Resolved): In bobtail, turn off keystone errors in radosgw.log when not applicable
- In bobtail, when radosgw is installed and configured on a cluster node, we see the following errors in radosgw.log, w...
- 10:00 PM Revision 5e955103 (ceph): doc: Added REST Gateway link to 5-minute Quick Start.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 09:52 PM Revision c2b231e4 (ceph): doc: Updated the 5-minute Quick Start for Bobtail.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 09:47 PM Revision f596cee7 (ceph): doc: Updated Block Device Quick Start for Bobtail.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 09:46 PM Revision 60b2857d (ceph): doc: Updated CephFS Quick Start for Bobtail.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 09:45 PM Revision d17bd384 (ceph): doc: Added authentication and mkcephfs settings for Bobtail.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 09:36 PM Revision cd5c82db (ceph): doc: Added javascript code block tag.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 06:33 PM Revision 6122a9f6 (ceph): OSDMonitor: remove temp pg mappings with no up pgs
- Otherwise, the pg won't be validly mapped until one of the temp
pgs comes back up.
Signed-off-by: Samuel Just <sam.j...
- 06:32 PM Revision 2395af9f (ceph): OSDMap: make apply_incremental take a const argument
- This requires us to copy bufferlists in two cases since bufferlist
does not have a const iterator at this time.
Sig...
- 05:17 PM rgw Tasks #3152 (Resolved): rgw: document usage testing
- Done, commit:2f73c07511dce200b5dd298c6f86e03fbb9b3dd1
- 05:16 PM rgw Feature #3494 (Closed): ceph S3 upload slowly
- Closing, need more info about the specific user problem.
- 05:15 PM rgw Bug #3620 (Fix Under Review): rgw:improve multiple user access keys scalability
- 05:13 PM rgw Bug #3648 (Fix Under Review): rgw: swift put object with empty mime type crashes
- 07:39 AM rgw Bug #3648: rgw: swift put object with empty mime type crashes
- https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1092137
- 07:38 AM rgw Bug #3648 (Resolved): rgw: swift put object with empty mime type crashes
- 04:38 PM rbd Bug #3654 (Resolved): libvirt: colons in ipv6 monitor addresses are not escaped when sent to qemu
- Given xml like:...
- 04:37 PM Revision 2e49d5c4 (ceph): cephtool: add qa workunit
- A few basic sanity checks, including a tell on a down osd.
Signed-off-by: Sage Weil <sage@inktank.com>
- 04:03 PM CephFS Bug #3637: client: not issuing caps for with clients doing shared writes
- Proposed fix in wip-3637. The client's max size request in MClientCaps gets dropped if the file lock is in a non-sta...
- 03:10 PM Linux kernel client Bug #3519: rbd map hang during system startup
- I found a possible explanation of the problem, and have
created and pushed a fix on top of the code that I most
rec...
- 09:06 AM Linux kernel client Bug #3519: rbd map hang during system startup
- Nick provided more information:
https://gist.github.com/raw/4330223/2f131ee312ee43cb3d8c307a9bf2f454a7edfe57/rbd...
- 03:00 PM Bug #3624: BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last function: xfs_...
- Dave Chinner has confirmed my explanation. The bug no
longer exists (in its current form) in the latest code,
so w...
- 10:46 AM Bug #3624 (Won't Fix): BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last fu...
- I'm fairly sure this is an XFS problem, so as suggested by
Ian I'm marking this "Won't Fix" (again). If new evidenc...
- 06:27 AM Bug #3624: BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last function: xfs_...
- Dave Chinner responded to my note with a few questions
requesting more information. I spent some time this
morning...
- 12:30 PM CephFS Bug #3625: client: EEXIST error on multiple clients to create
- Pushed fixes to wip-3625 (ceph and ceph-client repos) that implement #3 (mds sends back the created flag in reply to ...
- 12:29 PM CephFS Bug #3625: client: EEXIST error on multiple clients to create
- David and I have posted comments on github about the fix to allow multiple
clients opening the same file to get a va...
- 12:11 PM Bug #3652 (Duplicate): split should not mess up stats
- this will be replaced with bugs corresponding to a design of some kind
- 12:10 PM Feature #3651 (Resolved): osd: deep scrub should hash omap
- 11:47 AM rbd Bug #3611 (Resolved): rbd.py: segfault with many snapshots
- This was caused by c3107009f66bc06b5e14c465142e14120f9a4412. Reverting it fixes the problem. There is a corrected imp...
- 11:44 AM Bug #3632: occasional testrados failure: process_8 exited with a signal
- This still occurs with the wip-3611 branch, so it is a different problem.
- 11:15 AM Bug #3633: mon: clock drift errors not reported by ceph status
- Here's my config: http://pastie.org/5554031
I'm pretty sure there was no warning when I did 'ceph -w', because I w...
- 08:25 AM Bug #3633 (In Progress): mon: clock drift errors not reported by ceph status
- I'm looking into an adequate way to make 'ceph -s' return a warning when the clocks have drifted.
However, 'ceph -...
- 10:29 AM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
- Below works, but "ceph -s" does not
ubuntu@ceph1:~$ ceph health
2012-12-19 18:27:51.090414 mon <- [health]
2012-...
- 10:22 AM Bug #2784 (Resolved): osd hit suicide timeout
- Not actually a bug in the renzhi case.
- 08:18 AM rgw Feature #3207: qa: swift functional tests in nightly
- from James' last bug report:...
- 08:02 AM Bug #3650: osd: crash in Reset state -> start_peering_interval -> on_change -> process_event Reset
- that line of code is...
- 07:52 AM Bug #3650 (Can't reproduce): osd: crash in Reset state -> start_peering_interval -> on_change -> ...
- ...
- 05:00 AM Revision d9c2396b (ceph): ceph.spec.in: Improve finding location of jni.h for sles11.
- Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
- 04:08 AM Revision b2eb8bd2 (ceph): osd: implement 'version' tell command
- Signed-off-by: Sage Weil <sage@inktank.com>
- 03:40 AM Revision 46344105 (ceph): ceph.spec.in: Add packages for libcephfs-jni and libcephfs-java
- Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
- 03:21 AM Revision 85763f09 (ceph): ceph: report error string to stderr, not stdout
- If we return an error, send the message to stderr. This makes things
more easily scriptable because error messages w...
- 03:20 AM Revision 5f24e23b (ceph): ceph: fix error reporting when tell target is invalid or down
- Signed-off-by: Sage Weil <sage@inktank.com>
- 03:11 AM Revision b00eb6fd (ceph): mon: 'ceph osd ls'
- List osd ids that exist.
Signed-off-by: Sage Weil <sage@inktank.com>
- 01:00 AM Revision 212f6b56 (ceph): OSDMap::dump: tag pg_temp mappings with pgid
- Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
12/18/2012
- 10:12 PM Revision 04e7a5ca (ceph): rgw: configurable exit timeout
- Fixes: #3638
rgw exit timeout secs : number of seconds to wait for process
to exit cleanly before forcing exit. If s...
- 08:59 PM CephFS Bug #3637: client: not issuing caps for with clients doing shared writes
- The hang occurs because a client requests a max size increase, but doesn't have write caps, so the mds puts it on the...
- 07:53 AM CephFS Bug #3637 (Resolved): client: not issuing caps for with clients doing shared writes
- With 3 clients running ceph-fuse, running the ior command:
/tmp/cephtest/binary/usr/local/bin/ior -e -w -r -W -b 1...
- 07:40 PM Bug #3646: pg_temp with two down/out osds
- sounds exactly right
- 07:29 PM Bug #3646: pg_temp with two down/out osds
- It actually does that already. OSDMonitor::remove_redundant_pg_temp(). I'll hook in around there for the fix, doing ...
- 06:42 PM Bug #3646: pg_temp with two down/out osds
- Good point. We can also remove mappings that match the crush result. Although that is a more expensive scan by the m...
- 04:41 PM Bug #3646 (Resolved): pg_temp with two down/out osds
- Encountered on MassEffect, osdmap is attached.
{ "pgid": "2.25",
"osds": [
30,...
- 06:08 PM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
- added output for dmesg on ceph1
- 06:04 PM Bug #3647 (Can't reproduce): forgot the auth options for Cephx and added them later: Get msg: 7f...
- Seeing errors when setting up ceph from scratch with the options in the ceph.conf file. I forgot the auth options f...
- 04:23 PM CephFS Feature #3645 (Resolved): Requesting the ability to rename CephFS snapshots inside the ".snap"-di...
- I believe the ability to rename CephFS snapshots can come in handy in many cases. For example, if one wants to imple...
- 02:26 PM Bug #3644 (Resolved): ObjectCacher: discard_set ignores waiters
- IO in flight contained entirely in a discarded section will not be acked to the caller, since the waiters are removed...
- 01:41 PM Bug #3643 (Resolved): default authentication on the client does not work without a config file or...
- On a single-node bobtail cluster, the ceph-auth setting is as shown below:
ubuntu@burnupi09:/etc/ceph$ sudo cat...
- 01:26 PM rbd Bug #3642 (Resolved): librbd: watch is sent with assert version, which fails on resends
- Instead of using an assert version op, establish the watch before reading the header. This hasn't actually caused any...
- 12:01 PM CephFS Bug #3639 (Duplicate): kclient: hit EOF prematurely
- Moved to #3641
- 10:56 AM CephFS Bug #3639 (Duplicate): kclient: hit EOF prematurely
- Failures seen when running IOR on the kernel client:
WARNING: Task 1 requested transfer of 1048576 bytes,
...
- 12:00 PM CephFS Bug #3641 (Resolved): kclient: hit EOF prematurely
- Failures seen when running IOR on the kernel client:
WARNING: Task 1 requested transfer of 1048576 bytes,
...
- 11:57 AM CephFS Bug #3640 (Duplicate): kclient: hang and kernel panic
- Creating a placeholder for the following issue reported by Eric Renfro on the mailing list:
http://thread.gmane....
- 11:17 AM rgw Feature #2941 (Fix Under Review): rgw: improve streaming read performance
- 10:36 AM rgw Bug #3638 (Resolved): rgw: configurable exit timeout
- Currently the exit timeout is 5 seconds; we should make it configurable, and probably give it a higher default.
- 09:46 AM Bug #2784: osd hit suicide timeout
- This bug popped up again on v0.55.1
renzhi on IRC stumbled upon it after upgrading from v0.48.2, and has been unable ...
- 08:54 AM rbd Bug #3611: rbd.py: segfault with many snapshots
- This survived overnight testing (with the python librbd tests) with 56 passes.
- 08:03 AM Linux kernel client Bug #3519: rbd map hang during system startup
- I looked through the latest log message supplied by Nick
Bartos. I scanned through it to look only at the rbd
acti...
- 07:39 AM Linux kernel client Bug #3519: rbd map hang during system startup
- There has been quite a lot of activity on this bug but it's
all been recorded on the mailing list rather than here.
...
- 06:40 AM Bug #3624 (In Progress): BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last ...
- Answer to my question, based on evidence in this bug:
The control (yaml) file contains this:
overrides:
...
- 04:42 AM Bug #3617: Ceph doesn't support > 65536 PGs(?) and fails silently
- Note how this was on a cluster with *very* few OSDs (4 at the time!) as I originally mentioned and this may play a fa...
- 01:12 AM Revision dbe6fb72 (ceph): crushtool: only dump usage on -h|--help
- Instead, output a useful error message.
Fix error code to be a success.
Add test for the output usage.
Signed-off-...
- 01:12 AM Revision 6c7ec2d4 (ceph): crushtool: nicer error message on extra args
- Signed-off-by: Sage Weil <sage@inktank.com>
- 12:51 AM Revision 0dd13025 (ceph): Merge remote-tracking branch 'gh/testing' into next
- 12:38 AM Revision fd482a27 (ceph): ceph.spec.in: Update pre-reqs for ceph-fuse pacakge.
- 12:29 AM Revision 1b67a438 (ceph): Revert "objecter: don't use new tid when retrying notifies"
- This reverts commit c3107009f66bc06b5e14c465142e14120f9a4412.
This appears to be causing problems in the objecter by...
- 12:14 AM Feature #3288: docs: document the chooseleaf command in crush
- Commit 9f0510 added docs for multiple crush hierarchies and the examples use chooseleaf, which is still undocumented.
12/17/2012
- 10:59 PM rbd Bug #3611 (Fix Under Review): rbd.py: segfault with many snapshots
- wip-3611 contains a respin of the bad commit. It's passing test_stress_watch with failure injection and the python te...
- 11:09 AM rbd Bug #3611: rbd.py: segfault with many snapshots
- also, ubuntu@teuthology:/a/teuthology-2012-12-15_19:00:04-regression-next-testing-basic/16289
- 11:08 AM rbd Bug #3611: rbd.py: segfault with many snapshots
- recent log: ubuntu@teuthology:/a/teuthology-2012-12-15_19:00:04-regression-next-testing-basic/16281
- 10:53 PM rbd Bug #3585: Image import via QEMU-IMG results in a corrupt rbd
- Thanks for the logs. All the differences there are zeroes where actual data should be, but the librbd debug log shows...
- 10:41 PM Revision bdc998ef (ceph): mon: OSDMonitor: add option 'mon_max_pool_pg_num' and limit 'pg_num' ac...
- Instead of having a hardcoded default, use a configurable one. It is
limited to 65536 until future testing guarantees...
- 10:39 PM Revision 21c47c6a (ceph): osd: debug EMSGSIZE / OSD_WRITETOOBIG
- Signed-off-by: Sage Weil <sage@inktank.com>
- 10:39 PM Revision f81ca898 (ceph): doc/release-notes: don't use format 2 rbd images until after osds upgrade
- Signed-off-by: Sage Weil <sage@inktank.com>
- 07:14 PM Revision 3c246226 (ceph): crushtool: add --set-chooseleaf-descend-once to help
- We forgot to update this in 88f218181a9e6d2292e2697fc93797d0f6d6e5dc.
Signed-off-by: Sage Weil <sage@inktank.com>
- 06:53 PM Revision 874b2732 (ceph): doc/release-notes: 'mon max pool pg num'
- Signed-off-by: Sage Weil <sage@inktank.com>
- 04:30 PM Feature #1655: gitbuilder aggregator page
- ...
- 04:25 PM Feature #1655: gitbuilder aggregator page
- I'm not sure if anyone else has asked, but any chance of sharing the updated server side cgi script which now has aja...
- 04:10 PM Bug #3617 (In Progress): Ceph doesn't support > 65536 PGs(?) and fails silently
- Looking closer, I have a feeling this was a large # of pgs making a different bug surface. Jim has been running his ...
- 04:03 PM Bug #3617: Ceph doesn't support > 65536 PGs(?) and fails silently
- Note how your commit changed the (default) limit from 65535 to 65536.
- 04:01 PM Bug #3617: Ceph doesn't support > 65536 PGs(?) and fails silently
- The default is now 65536, and can be adjusted using the option 'mon max pool pg num' if higher values are desired.
- 03:57 PM Revision e8b8531e (ceph): doc: fix typo in config file
- The option is host, not hostname
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
- 02:35 PM Bug #3636 (Resolved): sub_op_modify assert(!missing.is_missing(soid));
- Encountered in Alexandria, fixed in 047aecd90f1dbfb172f48f9d10b67e82b3a8ce15, may it rest in peace.
- 12:40 PM rbd Feature #3635: rbd cli: call "udevadm settle" after use of add/remove kernel interface
- Trivial change. Biggest decision is which libc routine to use to spawn the command...
- 11:42 AM rbd Feature #3635 (Resolved): rbd cli: call "udevadm settle" after use of add/remove kernel interface
- The rbd command line interface creates mappings by sending
output to the /sys/bus/rbd/add file system entry, and rem...
- 11:14 AM rgw Feature #3634 (Resolved): rgw: improve teuthology radosgw-admin test
- 11:10 AM rgw Bug #3620: rgw:improve multiple user access keys scalability
- Possibly impacts interoperability
- 11:09 AM rgw Bug #3628: rgw: leak of object parts on partial upload
- Appears to only be in Argonaut
- 09:31 AM rgw Bug #3628: rgw: leak of object parts on partial upload
- Actually, per user, this affects older versions (argonaut), but does not happen in newer versions. Looking at the code...
- 10:53 AM rbd Bug #3600: rbd: assert in objectcacher destructor after flatten
- Tried to reproduce this behavior to no avail.
There are operations on the test that do hang for a long time, but a...
- 09:49 AM Bug #3624: BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last function: xfs_...
- When XFS gets an I/O error, there is not a lot it can do.
If it happens to involve user data blocks it could continu...
- 09:42 AM Bug #3624 (Won't Fix): BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last fu...
- XFS bug
- 09:42 AM Bug #3599 (Resolved): mkcephfs should fail out when ceph.conf has an error
- 09:38 AM Bug #3632: occasional testrados failure: process_8 exited with a signal
- Possibly related to 3611
- 09:14 AM Bug #3629: test_mon_workloadgen.cc: 766: FAILED assert(m->fsid == monc.get_fsid())
- I've gone through the logs again and again, as well as through the code. The logs only show the last couple hundred l...
- 08:08 AM Linux kernel client Bug #2764: xfstest hang; osd socket closed messages
- I have posted a fix for the "socket closed" messages, and it has
been reviewed and will fairly soon be pushed to the...
- 07:36 AM Bug #3633 (Resolved): mon: clock drift errors not reported by ceph status
- Using argonaut 0.48.2. Today all ceph commands were randomly slow. So I checked all hosts, all monitors (3) and osds (...
12/16/2012
- 08:49 PM Bug #3632: occasional testrados failure: process_8 exited with a signal
- ...
- 08:48 PM Bug #3632 (Resolved): occasional testrados failure: process_8 exited with a signal
- seen several of these in qa, e.g....
- 08:29 PM Revision e9231fe6 (ceph): Makefiles: Two new packages needed in the debian build depdencies.
- The ceph test programs that are now being built by default require the junit
and libboost-program-options packages. ...
- 08:29 PM Revision bc9d9d8a (ceph): Refactor rule file to separate arch/indep builds.
- Prior to the ceph fs java bindings, all packages were
architecture dependent so the packaging rules file
worked OK;...
- 12:44 PM rbd Bug #2872 (Resolved): RBD resize command allows image size -1
- 11:02 AM rbd Bug #2689: qemu iozone test hangs
- let's retest this with all of the recent caching fixes?
- 10:46 AM Bug #3609: mon: track down the Monitor's memory consuption sources
- Which in memory maps? Nothing should grow without bound, except perhaps some of the intern monitor messages...
- 04:30 AM Bug #3609: mon: track down the Monitor's memory consuption sources
- Attaching 3 heap profiles from the monitors.
The monitors were under load from the mon workload gen, as well as so...
- 10:12 AM Bug #3631 (Resolved): osdc/ObjectCacher.cc: 834: FAILED assert(ob->last_commit_tid < tid) during ...
- old symptom, presumably new bug....
- 09:48 AM CephFS Fix #3630 (Resolved): mds: broken closed connection cleanup
- Consider:
- client->mds REQUEST_CLOSE
- mds->client CLOSE
- client closes con
- mds see fault, goes to stan...
- 07:25 AM Bug #3629 (Fix Under Review): test_mon_workloadgen.cc: 766: FAILED assert(m->fsid == monc.get_fsi...
- Pushed a fix to wip-3629.
After looking into what the OSD does in this case and going through the code, I realized th...
- 01:45 AM Revision 4bf90782 (ceph): osdc/Objecter: prevent pool dne check from invalidating scan_requests i...
- We iterate over ops and, if the pool dne and other conditions are true,
we will immediately return ENOENT and cancel ...
12/15/2012
- 09:00 PM Bug #3629 (Resolved): test_mon_workloadgen.cc: 766: FAILED assert(m->fsid == monc.get_fsid())
- ...
- 08:33 PM rgw Bug #3628 (Resolved): rgw: leak of object parts on partial upload
- 08:23 PM Bug #3613 (Resolved): Objecter::scan_requests crash
- commit:4bf9078286d58c2cd4e85cb8b31411220a377092
passed 100 iterations of the test (previously failed after ~15).
- 05:44 PM Bug #3613: Objecter::scan_requests crash
- the pool dne check invalidated the iterator. switching to map<> and incrementing the iterator at the top of the loop
- 03:54 PM Bug #3627 (Resolved): osd: segfault in ~MOSDSubOp during thrashing+rbd_fsx
- ...
- 09:59 AM Bug #3458: aio enabled but not used
- I didn't compile it, it's the version from the ubuntu quantal repository. Is there any way to see which feature have ...
- 09:46 AM Bug #3458 (Need More Info): aio enabled but not used
- this probably means that libaio wasn't found when you compiled the code?
- 09:45 AM rbd Fix #3588 (In Progress): rbd.py's clone should take stripe parms, call rbd_clone2
- 09:45 AM rbd Feature #2601 (Resolved): rbd: Show image size with an "ls"
- 09:44 AM rbd Feature #2634 (Resolved): teuthology: add networking to qemu task
- 09:43 AM rbd Bug #3619 (In Progress): librbd: read_iterate sparse behavior broken
- 09:42 AM rbd Bug #2689: qemu iozone test hangs
- 09:00 AM Bug #3616 (Resolved): osd/ReplicatedPG.cc: 4534: FAILED assert(!missing.is_missing(soid))
- 09:00 AM Bug #3603 (Resolved): osd/msgr: mutex assert failure in try_get_pipe
- 09:00 AM Bug #3221 (Resolved): disconnect_session_watchers missing pg
- 09:00 AM Bug #2954 (Resolved): osd: scrub stat mismatch, got 18/19 objects, 14/15 clones, 22478527/2538528...
- 01:08 AM Revision 601a6c93 (ceph): Merge remote-tracking branch 'gh/next'
- 12:56 AM Revision 6f978aa5 (ceph): doc: draft bobtail release notes
- Signed-off-by: Sage Weil <sage@inktank.com>