Activity

From 10/12/2011 to 11/10/2011

11/10/2011

11:07 PM Revision b600ec2a (ceph): v0.38
Sage Weil
11:05 PM Revision 2a7fbe0c (ceph): common: return null if mc.init() unsuccessful
Prevents ceph.cc from segfaulting on missing keyring.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just
11:05 PM Revision a177a702 (ceph): rbd.py: fix list when there are no images
It should return [], not [''].
Reported-by: Eric Chen <Eric_YH_Chen@wistron.com>
Signed-off-by: Josh Durgin <josh.du...
Josh Durgin
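
The empty-list fix above can be illustrated with a minimal sketch. It assumes the binding receives image names as one NUL-separated byte buffer; the helper name `split_image_names` is hypothetical and is not taken from rbd.py.

```python
from typing import List


def split_image_names(buf: bytes) -> List[str]:
    """Split a NUL-separated name buffer into a list of image names.

    Hypothetical helper for illustration only. A plain split of an empty
    buffer yields one empty piece, which is how [''] can appear; dropping
    empty pieces returns [] when there are no images.
    """
    return [piece.decode("utf-8") for piece in buf.split(b"\0") if piece]
```

Filtering empty pieces also drops the trailing empty entry that a terminating NUL would otherwise produce.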
11:05 PM Revision 27bb48c5 (ceph): mon: overwrite in put_bl
This fixes a situation where we accept a large value, there is some failure
and recovery, and then we commit a smalle...
Sage Weil
11:05 PM Revision 2f97a222 (ceph): PG: mark scrubmap entry as not absent when we see an update
Previously, there would be an assert failure in _scan_list if we see an
object deleted and then recreated.
Signed-of...
Samuel Just
10:58 PM Revision 87941128 (ceph): rgw: implement swift copy, fix copy auth
Yehuda Sadeh
10:13 PM Revision 77c977c1 (ceph): misc: allow >1 monitor per role in get_mon_names()
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:09 PM Revision 704644bc (ceph): PG: gen_prefix: use osdmap_ref rather than osd->osdmap
Otherwise, the debug output might not match the map used by
the pg logic.
Signed-off-by: Samuel Just <samuel.just@dr...
Samuel Just
10:09 PM Revision 7fb182a1 (ceph): OSD: sync_and_flush after mkfs to create first snap
Previously, if we kill the OSD process before the filestore
does its first sync, we end up replaying the journal on t...
Samuel Just
09:41 PM Bug #1670 (Can't reproduce): osd: crash in update_heartbeat_peers
Sage Weil
09:38 PM Bug #213 (Resolved): non-idempotent transactions (clone) under ext3 may not replay correct result
commit:dae6c956543276e103a272eb1e897db17b840348 Sage Weil
08:54 PM Bug #1530: osd crash during build_inc_scrub_map
Sage Weil
05:29 PM Bug #1530: osd crash during build_inc_scrub_map
We just found surprisingly similar stack traces in three of last night's failures:
nightly_coverage_2011-11-10/1740/...
Anonymous
06:45 PM Feature #1516 (Resolved): openstack: single node dev environment
Josh Durgin
05:06 PM rgw Feature #1717 (Resolved): rgw: support json input
Yehuda Sadeh
05:06 PM Feature #1653 (Resolved): librados: python binding nose tests
Fixed by commit:ea42e02ca2fd3655dbaf2e720e31d78da5022e21. Josh Durgin
05:05 PM rgw Cleanup #1716 (Closed): rgw: remove curl use
Yehuda Sadeh
05:05 PM Bug #1577 (Resolved): rados.py: Snap.get_timestamp does not work
Fixed by commit:25cde7f98ac195b0458830a3e345db54a994384b. Josh Durgin
04:57 PM Feature #1539 (Duplicate): libvirt: make sure snapshots work
Sage Weil
04:11 PM rgw Feature #1715 (Rejected): rgw: use RENAME osd operation to avoid slow CLONE operations
add to osd too Sage Weil
04:03 PM rbd Feature #1713 (Resolved): teuthology: qemu tasks, tests
gitbuilder
teuthology task
some tests that run in it
Sage Weil
03:29 PM CephFS Bug #1702: Ceph MDS crash + client mount problem
Gokul Krishnan wrote:
> Thank you for reverting back so quickly.
>
> Well in my scenario, i just have one Ceph se...
Sage Weil
03:29 PM CephFS Bug #1702: Ceph MDS crash + client mount problem
Gokul Krishnan wrote:
> by the way,
> you have assigned a target version as v0.39...but in the site i can find only...
Sage Weil
01:50 AM CephFS Bug #1702: Ceph MDS crash + client mount problem
by the way,
you have assigned a target version as v0.39...but in the site i can find only the source for v0.37...
e...
Gokul Krishnan
12:45 AM CephFS Bug #1702: Ceph MDS crash + client mount problem
Thank you for reverting back so quickly.
Well in my scenario, i just have one Ceph server running. And yes, every ...
Gokul Krishnan
03:29 PM rgw Feature #1712 (Resolved): rgw: support swift manifest objects
Yehuda Sadeh
03:22 PM Feature #1711 (Resolved): chef: multiple monitor support
Sage Weil
03:22 PM Bug #1669 (Resolved): linux 32 bit kernel client ld libraries and rm issue
Sage Weil
03:14 PM Feature #1709 (Resolved): specfile: merge suse spec file changes
Sage Weil
03:00 PM rgw Bug #1706 (Resolved): rgw: copy object auth verification (probably) broken
Yehuda Sadeh
02:59 PM rgw Bug #1706: rgw: copy object auth verification (probably) broken
Fixed, commit:87941128b60608d66dc5327038f099a1fb2a99c3. Yehuda Sadeh
02:59 PM rgw Bug #1705 (Resolved): rgw: swift copy is broken
Fixed, commit:87941128b60608d66dc5327038f099a1fb2a99c3. Yehuda Sadeh
02:57 PM CephFS Feature #1448: test hadoop on sepia
The following benchmark, TestDFSIO, is for 12 OSDs, 1 MDS/MON. There is a single ext4 disk per node dedicated to Ceph... Noah Watkins
02:46 PM Bug #1632 (Can't reproduce): osd: crash in dequeue_op
Sage Weil
01:54 PM Bug #1708 (In Progress): mon/PGMonitor.cc: 218: FAILED assert(paxos->get_version() + 1 == pending...
Sage Weil
01:45 PM Bug #1708 (Resolved): mon/PGMonitor.cc: 218: FAILED assert(paxos->get_version() + 1 == pending_in...
Running ceph version from git: a3dd5bd67ba19aae51a51318138ef10213a91449
Slaves are all ubuntu 11.10, 3.0.0-12
Files...
Josh Pieper
12:06 PM Bug #1707 (Resolved): After fresh install, OSD initialization fails with: error error 17: File ex...
Running ceph from git @ a3dd5bd6 with btrfs
Ubuntu 11.10, 3.0.0-12 on all machines
After installing my compiled c...
Josh Pieper
01:17 AM Revision a3dd5bd6 (ceph): PG: update info.history even if lastmap is absent
Previously, we did not update same_interval_since etc if
we do not have the previous map.
Signed-off-by: Samuel Just...
Samuel Just
12:36 AM Revision 023ff590 (ceph): Makefile: add MMonProbe.h
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:33 AM Revision fd5fb993 (ceph): osd: remove useless proc_replica_log() side-effect
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil

11/09/2011

11:38 PM Revision 78ad144a (ceph): hadoop: update patch and Readme.
Patch generated by Noah Watkins <noahwatkins@gmail.com>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum
11:30 PM Revision 386c0db3 (ceph): rgw: swift guesses mime type if not specified
Yehuda Sadeh
10:50 PM Revision 78ccb2a9 (ceph): osd: comment PG::lock*(), whitespace
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:46 PM Revision 87318389 (ceph): Merge branch 'master' of github.com:NewDreamNetwork/ceph
Conflicts:
src/osd/PG.cc
Sage Weil
10:32 PM Revision 5fa8df1e (ceph): osd: improve last_peering_reset debugging
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
10:32 PM Revision 383dfa33 (ceph): crypto: make crypto handlers non-static
These were static in auth/Crypto.cc, which was mostly fine, except when
we got a signal shutting everything down for ...
Sage Weil
10:15 PM Revision 9db994a5 (ceph): PG: always add backlog entry
Previously, we did not add a backlog entry if the object already had an
entry in the log along with an entry for that...
Samuel Just
10:15 PM Revision 0dffddf3 (ceph): osd/: change type of osd::osdmap to a shared_ptr
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
10:15 PM Revision 5df28ece (ceph): OSDMap,CrushWrapper: const cleanup on OSDMap
The osd's cached maps are not actually modified once cached. Marking
these methods const (which they should be) allo...
Samuel Just
10:15 PM Revision b41b1fa5 (ceph): PG: cache read-only reference to the current osdmap on pg lock
Previously, we needed to grab an osd_map read lock to send messages,
among other things. Now, we grab a reference to...
Samuel Just
10:04 PM Revision 15da4787 (ceph): rbd: Fix the showmapped cmd usage
If the rbd showmapped cmd is given any extra arguments, rbd will fail
with "assert(0)". Fix it by exiting with "usage...
Stratos Psomadakis
09:37 PM Revision 303e863d (ceph): add hammer.sh
simple script to repeat a test until it fails. can probably do something much more sophisticated
here, but this works.
Sage Weil
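
The repeat-until-failure idea behind hammer.sh could be sketched roughly as follows. This is not the script from the commit; the function name `hammer` and the `max_runs` parameter are assumptions made for illustration.

```python
import subprocess


def hammer(cmd, max_runs=None):
    """Run cmd repeatedly until it exits non-zero.

    Returns the 1-based number of the run that failed, or None if
    max_runs runs completed without a failure. With max_runs=None the
    loop only ends on failure.
    """
    run = 0
    while max_runs is None or run < max_runs:
        run += 1
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return run
    return None
```

For example, `hammer(["./some_test.sh"])` would keep running the (hypothetical) test script and report which attempt first failed.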
09:28 PM Revision 33549333 (ceph): hadoop: return all replica hostnames
Updates CephFileSystem to return all replica locations,
and in addition attempts to use reverse DNS to convert
the OS...
Noah Watkins
09:23 PM Revision e6035a62 (ceph): hadoop: make listStatus quiet
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
09:23 PM Revision d7f911fb (ceph): hadoop: handle new ceph_get_file_stripe_address
Updates the Hadoop JNI/CephFileSystem to handle
the new version of ceph_get_file_stripe_address
which returns the loc...
Noah Watkins
09:23 PM Revision 619430a7 (ceph): client: return stripe address replicas
Changes ceph_get_file_stripe_address to return a
vector of entity_addr_t's for the primary and the
replicas. libcephf...
Noah Watkins
09:15 PM Revision c5c50377 (ceph): client: fix bad perfcounter fset callers
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:50 PM Revision 808c6442 (ceph): Improve use of syncfs.
Test syncfs return value and fallback to btrfs sync and then sync.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unic...
Alexandre Oliva
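
The fallback order described above (syncfs, then btrfs sync, then sync) can be sketched generically. The callables here are placeholders supplied by the caller, not the real system calls, and the convention that 0 means success is an assumption mirroring C-style return values.

```python
def sync_with_fallback(strategies):
    """Try each (name, callable) pair in order.

    Returns the name of the first strategy whose callable returns 0.
    A callable may raise OSError to signal that it is unsupported on
    this system, in which case the next strategy is tried.
    """
    for name, attempt in strategies:
        try:
            if attempt() == 0:
                return name
        except OSError:
            pass  # unsupported here; fall through to the next strategy
    raise RuntimeError("no sync strategy succeeded")
```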
08:48 PM Revision c51e2f72 (ceph): osd: fix perfcounter typo
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
07:43 PM Revision 1ac6b47c (ceph): os: rename and make use of the split_threshold parameter.
This was accidentally left out of the must_split calculation. Put it
in, and rename it to split_multiplier (as that i...
Greg Farnum
07:03 PM Revision 09455eea (ceph): perfcounters: fix users of fset on averages
I forgot to audit these before merging the assert and they popped up
in teuthology and stuff. :(
Signed-off-by: Greg...
Greg Farnum
06:49 PM Revision afa56f16 (ceph): nuke: increase reboot timeout
Some sepia nodes are very slow to reboot. Josh Durgin
05:35 PM Bug #1690: osd re-created from scratch will crash on start-up
I was using v0.37; in order to debug this, I first build top of the tree stable (b8979f4d292f6a739daac81ce8e59aa084e1... Alexandre Oliva
05:11 PM rgw Bug #1706 (Resolved): rgw: copy object auth verification (probably) broken
Looking at RGWCopyObj::verify_permission(), we don't look at the source acl, but rather at the source bucket's acl. Yehuda Sadeh
05:07 PM rgw Bug #1705 (Resolved): rgw: swift copy is broken
Swift can accept alternative HTTP COPY method (with src/dest transposed). Yehuda Sadeh
04:38 PM Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct result
Sage Weil
02:55 PM Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct result
Update: the current first pass plan is to initiate a FileStore sync after any non-idempotent operation. This updates... Sage Weil
03:35 PM Linux kernel client Bug #1701: krbd: limits and constants are not consistent in kernel and userspace
Also related: we have MAX_POOL_NAME_SIZE and MAX_SNAP_NAME_SIZE as 128 in qemu right now. Josh Durgin
02:37 PM Linux kernel client Bug #1701: krbd: limits and constants are not consistent in kernel and userspace
Stratos Psomadakis wrote:
> Instead of opening a new issue, I think I can add it here.
>
> Besides those limits o...
Sage Weil
02:18 PM Linux kernel client Bug #1701: krbd: limits and constants are not consistent in kernel and userspace
Instead of opening a new issue, I think I can add it here.
Besides those limits on the RBD images, there's also a ...
Stratos Psomadakis
12:44 PM Linux kernel client Bug #1701 (New): krbd: limits and constants are not consistent in kernel and userspace
There are a few things that exist in the kernel but not userspace:
* SNAP_NAME_LEN
* (MIN|MAX)_OBJECT_ORDER
Also...
Josh Durgin
03:00 PM CephFS Bug #1702: Ceph MDS crash + client mount problem
Ok, so generally speaking, the only time you should see fsid mismatches like that is if you have daemons from multipl... Sage Weil
02:55 PM CephFS Bug #1702: Ceph MDS crash + client mount problem
Hello,
thank you for the reply.
no, unfortunately i am not able to reproduce the error using debug ms = 20(for MD...
Gokul Krishnan
01:23 PM CephFS Bug #1702 (Need More Info): Ceph MDS crash + client mount problem
Are you able to reproduce this with 'debug mds = 20' and 'debug ms = 20' in your ceph.conf [mds section]?
Not sure...
Sage Weil
12:51 PM CephFS Bug #1702 (Can't reproduce): Ceph MDS crash + client mount problem
Hello,
i have configured ceph using a configuration as shown here[[http://pastebin.com/sQb8WZbx]].
The Ceph serve...
Gokul Krishnan
02:43 PM Bug #1684 (Duplicate): mon: crash in CryptoKey::encrypt
Sage Weil
02:42 PM Bug #1633 (Resolved): osd crash in CryptoKey::decrypt
should be fixed by commit:383dfa33682abeae7348655fc103dd80c41b7ba7 Sage Weil
02:39 PM Linux kernel client Feature #962 (Resolved): d_prune
Sage Weil
02:39 PM Linux kernel client Bug #850 (Resolved): make NULL lookup using I_COMPLETE work
Sage Weil
02:39 PM Linux kernel client Bug #851 (Resolved): make dcache readdir with I_COMPLETE work
Sage Weil
02:38 PM Linux kernel client Bug #1704 (Resolved): oid limited to 40 chars, rbd images can be longer
From Stratos Psomadakis:
"Besides those limits on the RBD images, there's also a hardcoded limit in
libceph (mess...
Sage Weil
02:27 PM rgw Bug #1698: radosgw-admin log list returns invalid json when a log object was created with a name ...
This is my vote for "let's not allow radosgw clients to create artifacts with non-utf8 names in the first place". Anonymous
02:19 PM Bug #1530 (Resolved): osd crash during build_inc_scrub_map
Samuel Just
02:08 PM Bug #1703 (Resolved): rbd: showmapped cmd fails, when extra args are present
Sage Weil
02:00 PM Bug #1703 (Resolved): rbd: showmapped cmd fails, when extra args are present
rbd showmapped cmd will fail with assert(0), when given any extra arguments.
Patch to fix it attached (exiting wit...
Stratos Psomadakis
01:02 PM Bug #1695 (Rejected): wrong path to ceph's libs / bash scripts in /etc/init.d/ceph
Serge Rittscher wrote:
> ok, the output is:
> @
> rm -f init-ceph init-ceph.tmp
> sed -e 's|@bindir[@]|/usr/local...
Sage Weil
11:39 AM Bug #1695: wrong path to ceph's libs / bash scripts in /etc/init.d/ceph
ok, the output is:
@
rm -f init-ceph init-ceph.tmp
sed -e 's|@bindir[@]|/usr/local/bin|g' -e 's|@libdir[@]|/usr/lo...
Serge Rittscher
11:04 AM Bug #1695: wrong path to ceph's libs / bash scripts in /etc/init.d/ceph
oops, 'touch init-ceph.in' first, then 'make init-ceph' Sage Weil
12:49 AM Bug #1695: wrong path to ceph's libs / bash scripts in /etc/init.d/ceph
@make init-ceph@
returns:
@make: `init-ceph' is up to date.@
Serge Rittscher
11:10 AM Bug #1700 (Resolved): osd: invalid perfcounter usage
Should be fixed in commit:09455eeac4fb37c31998202ad9503901f53c21dc. My bad! Greg Farnum
10:14 AM Bug #1700 (Resolved): osd: invalid perfcounter usage
During dbench, two osds crashed on this assert:... Josh Durgin
11:09 AM Bug #1694 (Resolved): monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
Sage Weil
11:09 AM Bug #1694: monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
oh nevermind, didn't see that second comment. the fix is commit:0bcdd4f3b2a2dba405639122b84f7aad978f347b, which come... Sage Weil
11:06 AM Bug #1694: monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
Great. Can you attach (or email) the ceph.conf you're using?
Thanks!
Sage Weil
07:55 AM Bug #1694: monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
The monitor that was generating the osdmap was running commit:5bd029ef01fcb59bea9170af563c3499cce1e8c4 and that faile... Wido den Hollander
02:25 AM Bug #1694: monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
Ok, I've run those commands and it gives me:... Wido den Hollander
07:19 AM CephFS Bug #1472: cfuse hangs with v0.34
Some of the hangs we've been seeing on the client may have been related to having two nics on each node. We had seen... Sam Lang
06:17 AM Revision 6d39cc11 (ceph): ceph: keep ceph.conf at ctx.ceph.conf
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:17 AM Revision 60863f70 (ceph): ceph_manager: manipulate monitors
Sage Weil
06:17 AM Revision 6618a027 (ceph): mon_recovery: add task to test monitor cluster failure recovery
Some simple tests to start with. We still need some sort of mon cluster
thrashing.
Signed-off-by: Sage Weil <sage@n...
Sage Weil
06:16 AM Revision 9acea7a6 (ceph): multimon mon_recovery tests on variously sized monitor clusters
Sage Weil
06:11 AM Revision 6ab14874 (ceph): Merge branch 'wip-mon'
Sage Weil
05:58 AM Revision 87634ce1 (ceph): osd: don't open deleted map from generate_past_intervals
The first get_map() call needs to be avoided when stop < last_epoch. This
fixes a crash like
2011-11-08 21:51:09.04...
Sage Weil
05:13 AM Revision 20cf1e96 (ceph): automake: enable 'make V=0'
Enables silent mode for automake generated Makefiles,
and silent mode is _off_ by default. Using V=0 the output
is mu...
Sage Weil
12:45 AM Revision 4b0cf89b (ceph): Add rbd python binding test.
Josh Durgin
12:24 AM Revision 1bc1a244 (ceph): mon: handle active -> electing transition properly
If we are already active, make sure we reset things properly before going
into an election.
Signed-off-by: Sage Weil...
Sage Weil
12:09 AM Revision 5d32bcae (ceph): Add nuke-on-error option.
This lets automated jobs nuke and unlock machines after failed
tests. Each machine is nuked individually, so one down ...
Josh Durgin
12:09 AM Revision 006a0dd4 (ceph): Remove unused imports and variable.
Josh Durgin

11/08/2011

10:21 PM Feature #1007 (Resolved): qa: osd failure and cluster recovery test(s)
yay thrashing Sage Weil
10:20 PM Bug #1694 (Need More Info): monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
Sage Weil
09:28 PM Bug #1694: monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
Can you try this and see if there is a mismatch?... Sage Weil
10:06 AM Bug #1694: monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
Aha! Read that wrong, tnx.
I used mkcephfs to generate the crushmap, I did not write my own.
Wido den Hollander
09:17 AM Bug #1694: monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
max_osd in the osdmap needs to be >= the max_devices in the crush map. how did you set up the cluster? did mkcephfs... Sage Weil
07:18 AM Bug #1694: monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
I just made a small adjustment to crushtool so it would print max_devices:... Wido den Hollander
07:01 AM Bug #1694 (Resolved): monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
I just did a fresh install of my cluster and after starting I saw my monitors go down with:... Wido den Hollander
10:18 PM Feature #1646 (Resolved): mon: catch up on committed items before attempting to join quorum
Sage Weil
10:17 PM Revision 7a32cc60 (ceph): rgw: swift bucket report returns both bytes size and actual size
Yehuda Sadeh
10:17 PM Revision 76090324 (ceph): rgw: don't return partial content response with bad header
Yehuda Sadeh
10:17 PM Revision a04afd09 (ceph): rgw: abort early on incorrect method
Yehuda Sadeh
09:33 PM Bug #1695: wrong path to ceph's libs / bash scripts in /etc/init.d/ceph
What is the output if you... Sage Weil
09:06 AM Bug #1695 (Rejected): wrong path to ceph's libs / bash scripts in /etc/init.d/ceph
After installing Ceph from sources (version ceph-0.37.tar.gz) on Ubuntu by executing
$ ./autogen.sh
$ ./configure...
Serge Rittscher
09:09 PM Revision 2fb73bdd (ceph): paxos: fix race between active and commit
If paxos reproposes an old learned value, we have a C_Active waiter, and
also a commit in progress.
When we reach qu...
Sage Weil
08:56 PM Revision 1ffb7b97 (ceph): mon: add 'quorum_status' command
Show status of the current quorum. Block until there is one.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
08:52 PM Revision a8b28ee5 (ceph): mon: do not participate in the election unless we are in electing state
If we participate, we may be included in the quorum, even tho we are
probing, slurping, whatever.
Signed-off-by: Sag...
Sage Weil
07:50 PM Revision 64350c0b (ceph): rgw: guard perfcounter accesses in rgw_cache.
This gets called by radosgw-admin, so it needs to handle
perfcounter being a null pointer.
Signed-off-by: Greg Farnu...
Greg Farnum
07:28 PM Revision 42f5f024 (ceph): rgw: initialize all the perfcounters, in order
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
06:42 PM Revision e952e10f (ceph): ReplicatedPG: use finc, not fset, on average counters
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
06:42 PM Revision 29e091b5 (ceph): mon: 'mon_status' command to dump individual mon state
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:04 PM Revision f0b9a331 (ceph): rgw: use l_rgw_qactive perfcounter
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
05:58 PM Revision 9035ffb2 (ceph): mon: add probe+slurp timeouts
A short timeout on probe, so we can form new quorums quickly.
A longer timeout on slurp, so we will tolerate a slow ...
Sage Weil
05:50 PM Revision 0fe0f9db (ceph): rgw: create and tear down a radosgw perfcounter
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Sage Weil
05:50 PM Revision d0b226e7 (ceph): perfcounter: assert when you try and set an average.
If you're trying to set an average, you're probably doing it wrong.
Signed-off-by: Greg Farnum <gregory.farnum@dream...
Greg Farnum
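
Why setting an average is "probably doing it wrong" can be shown with a small sketch. It assumes an average counter is stored as a running sum plus a sample count; the class and method names are hypothetical and only loosely echo the finc/fset naming in the commits above.

```python
class AverageCounter:
    """Hypothetical average counter kept as a running sum and a sample count."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def inc(self, value):
        # Each sample adds to the sum and bumps the count, so the
        # average stays consistent with everything recorded so far.
        self.total += value
        self.count += 1

    def set(self, value):
        # Overwriting the value would discard the sample count, so the
        # sketch refuses outright, as the assert in the commit does.
        raise AssertionError("set() is not meaningful on an average; use inc()")

    def average(self):
        return self.total / self.count if self.count else 0.0
```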
05:50 PM Revision 57b60b8a (ceph): perfcounter: add some minimal documentation.
The data model is a bit obtuse if you're just looking at the code.
Signed-off-by: Greg Farnum <gregory.farnum@dreamh...
Greg Farnum
05:50 PM Revision cf566550 (ceph): rgw: implement perfcounters
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
04:59 PM Linux kernel client Bug #1696: kclient: crash in ceph_d_prune
Here is the code:... Sage Weil
11:50 AM Linux kernel client Bug #1696 (Resolved): kclient: crash in ceph_d_prune
During the 11/08 nightly, several suites:
1606 autotest dbench
1607 workunit direct_io
1608 workunit kc...
Anonymous
04:57 PM Bug #1684: mon: crash in CryptoKey::encrypt
This happened on an mds during a thrashing run:... Josh Durgin
04:29 PM Linux kernel client Feature #1699 (Resolved): debug symbols in autobuilt (sepia) kernels
We need debug symbols in the .ko objects:... Sage Weil
03:49 PM rgw Bug #1698: radosgw-admin log list returns invalid json when a log object was created with a name ...
The two preceding days show similar errors as well. Matthew Wodrich
03:48 PM rgw Bug #1698: radosgw-admin log list returns invalid json when a log object was created with a name ...
The description above is malformed for whatever reason, so I'll try again:
radosgw-admin log list is producing bad J...
Matthew Wodrich
03:44 PM rgw Bug #1698 (Resolved): radosgw-admin log list returns invalid json when a log object was created w...
2011-11-07-12-0-<80>.. Matthew Wodrich
02:34 PM rgw Feature #1697 (Resolved): s3-tests: test bucket headers
Sage Weil
12:04 PM rgw Feature #1591 (Resolved): rgw: instrument with perfcounter
Finally sat down and did this. Merged in commit:64350c0b4d3ba2061cebed87f4cd6f513d2ba6ed and passed s3tests. Greg Farnum
06:46 AM Revision 2523b70e (ceph): mon: slurp latest state from active monitors before joining quorum
If a monitor has been down and is behind, and joins the quorum, the
other nodes will try to send it all of the needed...
Sage Weil
06:41 AM Revision c2fc986e (ceph): monmap: simplify constructor
Explicitly set created, last_changed where appropriate.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
06:41 AM Revision 279661f3 (ceph): paxos: last_consumed == latest_stashed; behave accordingly
Initialize on startup.
Don't re-read off of disk on every trim_to() call.
Signed-off-by: Sage Weil <sage.weil@dreamh...
Sage Weil
06:41 AM Revision 100fba8e (ceph): mon: fix osdmap trim
We can raise the floor even when min_last_epoch_clean is very close to
the current version, as long as it is still ab...
Sage Weil
04:40 AM Revision 628de548 (ceph): mon: don't call out to mon->call_election for internal election restarts
This lets us drop the is_new kludge.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
04:40 AM Revision 18941dd0 (ceph): mon: rename election_starting -> restart
These callbacks reset monitor/paxos/paxosesrvice state, which used to
happen when an election started, but will now n...
Sage Weil
04:40 AM Revision 2f46e8cd (ceph): mon: revamp monitor states
starting -> probing, electing
some cleanup
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
04:40 AM Revision 40843eb3 (ceph): rgw: fix warning
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
01:08 AM Revision 2836104a (ceph): rgw: fix accept-range for suffix format, other related issues
Yehuda Sadeh

11/07/2011

11:04 PM Revision 2f881e12 (ceph): Timer.cc: remove global thread variable
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
11:04 PM Revision d4ef9215 (ceph): common: return null if mc.init() unsuccessful
Prevents ceph.cc from segfaulting on missing keyring.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just
09:05 PM Revision c764b247 (ceph): Fix leftover orchestra import clause.
This seems to be a leftover from
a2372fce12b6bd1818e155d1d8ed5134dbd8fd4a,
no idea how it stayed hidden this long.
Tommi Virtanen
05:27 PM Revision 480b8260 (ceph): rbd: add showmapped to clitests and rst man page
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
05:27 PM Revision 4e518ed3 (ceph): rbd: Document the rbd showmapped cmd
Document the rbd showmapped cmd in rbd.usage(), and rbd's man page,
and add it to the bash completion script.
Signed...
Stratos Psomadakis
05:10 PM Revision 34d80397 (ceph): rbd.py: fix list when there are no images
It should return [], not [''].
Reported-by: Eric Chen <Eric_YH_Chen@wistron.com>
Signed-off-by: Josh Durgin <josh.du...
Josh Durgin
03:35 PM Bug #1690: osd re-created from scratch will crash on start-up
I seem to be having some trouble reproducing this. What version are you running? Could you repeat the procedure wit... Samuel Just
10:33 AM Bug #1690 (Can't reproduce): osd re-created from scratch will crash on start-up
Some time ago, it was possible to re-create an osd after its filesystem failed as simply as running “cosd -i # --mkfs... Alexandre Oliva
02:59 PM CephFS Feature #1693: libcephfs: Support TRIM (hole punching)
Kernelside ceph.ko ticket is #591. Let this ticket stand for the userspace libcephfs (and ceph-fuse) support. Anonymous
02:12 PM CephFS Feature #1693 (Resolved): libcephfs: Support TRIM (hole punching)
Anonymous
02:57 PM Feature #1692: librbd: Support TRIM (hole punching) (userspace client)
Kernel-side rbd.ko ticket is #190. Let this ticket stand for the librbd (userspace) support. Anonymous
02:11 PM Feature #1692 (Duplicate): librbd: Support TRIM (hole punching) (userspace client)
Anonymous
01:56 PM Bug #1691 (Can't reproduce): rados export failures
... Sage Weil
11:36 AM Linux kernel client Bug #1667 (Resolved): BUG at fs/inode.c line 1375
Sage Weil
11:17 AM rbd Feature #1662 (In Progress): libvirt: obscure qemu/rbd secrets
Sage Weil

11/06/2011

03:08 PM Linux kernel client Bug #1667: BUG at fs/inode.c line 1375
Sage Weil

11/05/2011

09:37 PM Linux kernel client Bug #1686 (Resolved): directory not empty errors
fixed commit:c6ffe10015f4e6fba8a915318b319c43aed1836f clear helper Sage Weil
09:37 PM Linux kernel client Bug #1687 (Resolved): directory existence failures
fixed commit:c6ffe10015f4e6fba8a915318b319c43aed1836f clear helper Sage Weil
01:38 AM Revision ae41f323 (ceph): OSD: write_info/log before dropping lock in generate_backlog
Bug #1530
This should fix the following race:
1) osd->generate_backlog does pg->assemble_backlog
2) osd->generate_ba...
Samuel Just
12:30 AM Revision fb70f5cc (ceph): FileJournal: stop using sync_file_range
Using sync_file_range means that neither any required metadata gets committed,
nor the disk cache gets flushed. Stop ...
Christoph Hellwig
12:29 AM Revision 585a46c5 (ceph): monclient: simplify auth_supported set
Use AuthSupported class instead of repopulating it ourselves.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:23 AM Revision a38c0054 (ceph): test_libcephfs
Greg Farnum
12:21 AM Revision 10141673 (ceph): Makefile: use static add for test_libcephfs_readdir.
Otherwise it doesn't seem to play nicely with teuthology/sepia
due to requiring the host to have gtest installed.
Si...
Greg Farnum

11/04/2011

09:57 PM Revision 5b4e9d31 (ceph): RadosModel: add DeleteOp to test object deletions
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
08:40 PM Revision 280a4d1d (ceph): rgw: fix tmp objects leakage
Yehuda Sadeh
08:13 PM Revision 8d914f0e (ceph): rgw: list system buckets through rados api
Yehuda Sadeh
08:13 PM Revision fc6522a8 (ceph): rgw: don't purge pools in any case
Yehuda Sadeh
06:44 PM Bug #1530: osd crash during build_inc_scrub_map
ae41f3232a39dbf33487ab02cbac292f58debea8 Samuel Just
04:59 PM Bug #1530: osd crash during build_inc_scrub_map
My best guess about this bug goes something like this:
1) osd->generate_backlog does pg->assemble_backlog
2) osd->g...
Samuel Just
05:20 PM Linux kernel client Bug #1686: directory not empty errors
this is probably due to the d_prune stuff i just pushed to master. need to do some serious debugging here.
the re...
Sage Weil
01:43 PM Linux kernel client Bug #1686 (Resolved): directory not empty errors
Today, many of the kclient ceph fs tests failed due to problems removing directories. This did not happen with yester... Josh Durgin
04:52 PM Bug #1689 (Can't reproduce): osd: segfault in recover_primary
This happened in run 1497, thrashing with the snaps workload, on 3 osds.... Josh Durgin
04:50 PM Bug #1529: cosd: os/FileStore.cc: 2390: FAILED assert(0 == "ENOENT on clone suggests osd bug")
Thrashing with the snaps workload triggered this on several osds in run 1497 today. Josh Durgin
03:21 PM Bug #1683: librados: list objects should also return locator key
Apparently, I implemented this about 2 months ago but didn't merge it... Samuel Just
01:19 PM Bug #1683 (Resolved): librados: list objects should also return locator key
Yehuda Sadeh
02:47 PM CephFS Bug #1472: cfuse hangs with v0.34
We're seeing similar hangs again. One thing I didn't mention in my previous posts, we are always adjusting the repli... Sam Lang
02:43 PM Bug #1688 (Closed): Benjamin: pg stuck in scrub
Looks like the bug is related to last_update_applied not getting up to last_update on primary. No further scrubbing ... Samuel Just
01:46 PM Linux kernel client Bug #1687 (Resolved): directory existence failures
Some benchmarks today failed to cd to directories. These worked yesterday.
From blogbench and ffsb:...
Josh Durgin
01:40 PM rgw Bug #1685 (Resolved): rgw: tmp objects leakage
Yes, but the problem was elsewhere. Fixed, commit:280a4d1ded4b83974805c60bcd410ee00ccc3884. Yehuda Sadeh
01:38 PM rgw Bug #1685: rgw: tmp objects leakage
This is probably due to #1683, as tmp objects are all placed using locators, right? Greg Farnum
01:27 PM rgw Bug #1685 (Resolved): rgw: tmp objects leakage
After running radosgw-admin temp remove, we're still left with objects from the tmp namespace. Either we fail to ... Yehuda Sadeh
01:21 PM Bug #1684 (Duplicate): mon: crash in CryptoKey::encrypt
From teuthology:~teuthworker/archive/nightly_coverage_2011-11-04/1472/teuthology.log:... Josh Durgin
01:17 PM rgw Bug #1672 (Resolved): rgw: support chunked transfer encoding
Done. Yehuda Sadeh
12:55 PM CephFS Bug #1682 (Resolved): mds: segfault in CInode::authority
From teuthology:~teuthworker/archive/nightly_coverage_2011-11-04/1469/teuthology.log:... Josh Durgin
12:30 PM rgw Bug #1681 (Resolved): rgw: user rm with --purge doesn't remove data
I just disabled it as it did it incorrectly Yehuda Sadeh
10:25 AM Feature #1618: libvirt: make sure migration works
Mike Lowe emailed me and mentioned it works for him on Oneiric with a custom kvm 0.15.1, no other changes. I still wa... Anonymous
09:53 AM CephFS Feature #1680 (New): support reflink (cheap file copy/clone)
It seems the API is still fs-specific ioctls, but there's repeated discussion about reflink(2).
If a nice common API...
Anonymous
08:19 AM Bug #1679: assertion failure is_replica()
Upon trying to restart the failed osds, other osds (7) fail:
*** Caught signal (Aborted) **
in thread 0x7fcceb...
Sam Lang
08:12 AM Bug #1679 (Can't reproduce): assertion failure is_replica()
3 boxes, 12 osds per box. 4 osds (9,11,20,24) crashed at the following assertion. This was triggered by first setti... Sam Lang

11/03/2011

11:01 PM Revision 0f98006c (ceph): rgw: fix PUT without content length (non chunked)
Yehuda Sadeh
10:46 PM Revision 256ac72a (ceph): rbd: document --order and list required args where they're necessary
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
09:43 PM Revision 0df3f036 (ceph): Merge remote branch 'nwatkins/for-master'
Greg Farnum
09:11 PM Revision 90249069 (ceph): Merge branch 'wip-getdir'
Greg Farnum
08:59 PM Revision b8733476 (ceph): gitignore: just ignore all test_ files
We don't want to add a new ignore for each test!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum
08:55 PM Revision d4faf588 (ceph): qa: workunit to run test_libcephfs_readder
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
08:49 PM Revision 120c3fbd (ceph): test: write a test to try and check on Client::readdir_r_cb.
It's made difficult by having to go through libcephfs, but it's better
than nothing and should catch most of the erro...
Greg Farnum
08:39 PM Feature #1678 (Resolved): rados tool: ability to specify object locator
We need to be able to access objects with non-default locators. Yehuda Sadeh
08:27 PM Revision 4f3b1138 (ceph): ceph_manager: log ceph -s output so progress is visible in the logs
Josh Durgin
08:08 PM Revision 0b451f94 (ceph): Keep each ssh connection alive.
With long-running jobs like thrashing, ssh connections were timing
out.
Josh Durgin
08:07 PM Revision 6e3e0d7c (ceph): connection: allow the caller to specify whether keep-alive should be used
Josh Durgin
06:45 PM Revision 58eb8c5e (ceph): rgw: fix null deref, cleanups
Yehuda Sadeh
06:29 PM Revision 0d4987d9 (ceph): rgw: fix crash when accessing swift auth without user
Yehuda Sadeh
06:29 PM Revision 7726e78d (ceph): rgw: add support for chunked upload
Yehuda Sadeh
06:29 PM Revision b1a0c1ad (ceph): locker: fix race in locking
The isolation level is lower than I thought. This made it possible for
two clients to think they both locked the same...
Josh Durgin
04:39 PM CephFS Bug #1663 (Resolved): Hadoop: file ownership/permission not available in hadoop
This is still a pretty cheap fix :), but I think it's enough to close out this bug. Greg Farnum
04:12 PM CephFS Bug #1663: Hadoop: file ownership/permission not available in hadoop
a79b7e17ebbc70cedae80216986ae5fd52a1c0b7 provides an OK fix for now. Basically it makes any file look like the curren... Noah Watkins
04:08 PM CephFS Bug #1666: hadoop: time-related meta-data problems
Bummer. Well... for the time being it may be sufficient to force FileStatus.getModificationTime() to go directly to t... Noah Watkins
03:58 PM CephFS Bug #1666: hadoop: time-related meta-data problems
Yeah, it's not impossible, I just would have thought that one of the other updates would have prompted the server to ... Greg Farnum
03:52 PM CephFS Bug #1666: hadoop: time-related meta-data problems
Do you mean that you are surprised that client-1's inode didn't get updated from the server's change before the stat ... Noah Watkins
03:49 PM CephFS Bug #1666: hadoop: time-related meta-data problems
If that's the case then I'm surprised the mtime didn't get updated at an earlier time. If nothing else we can probabl... Greg Farnum
03:44 PM CephFS Bug #1666: hadoop: time-related meta-data problems
Greg Farnum wrote:
> So the "bad" mtime is the same time the inode was created on the MDS server?
I think so. Her...
Noah Watkins
03:35 PM CephFS Bug #1666: hadoop: time-related meta-data problems
So the "bad" mtime is the same time the inode was created on the MDS server? Greg Farnum
03:30 PM CephFS Bug #1666: hadoop: time-related meta-data problems
If Client-1 is seeing a cached copy of the inode's mtime, then the following server-side scenario may explain what's ... Noah Watkins
02:44 PM CephFS Bug #1666: hadoop: time-related meta-data problems
Grepping for the inode number got me this:... Greg Farnum
01:20 PM CephFS Bug #1666: hadoop: time-related meta-data problems
Sage Weil wrote:
> If you can generate client logs for C1 and C2 (debug ms = 1, debug client = 10) that should tell ...
Noah Watkins
11:44 AM CephFS Bug #1666: hadoop: time-related meta-data problems
If you can generate client logs for C1 and C2 (debug ms = 1, debug client = 10) that should tell us everything. Sage Weil
11:07 AM CephFS Bug #1666: hadoop: time-related meta-data problems
Just ran a little experiment that may shed some light on this.... Noah Watkins
03:49 PM CephFS Bug #1677: mds interval_set.h: 385: FAILED assert(p->first <= start)
Here is the log from the MDS that caused this. I have from the other mds's, mon, and osd if it is relevant -- but not... Noah Watkins
03:44 PM CephFS Bug #1677 (Resolved): mds interval_set.h: 385: FAILED assert(p->first <= start)
Noah got this and sent it to the mailing list on Oct 28, 2011:... Greg Farnum
02:15 PM Bug #1617 (New): pgs stuck down and peering with only one osd down and out
Happened again today in teuthology:~teuthworker/archive/nightly_coverage_2011-11-03/1433:... Josh Durgin
02:06 PM Messengers Bug #1674: daemons crash when sent random data
This is actually going to be pretty unpleasant. Removing the asserts that deliberately crash on unexpected types is e... Greg Farnum
06:29 AM Messengers Bug #1674 (Can't reproduce): daemons crash when sent random data
mon seem to crash every time, osd seem to take a few attempts (similar stack trace). not tested mds... John Leach
12:04 PM Bug #1676 (Resolved): stats mismatch during snaps workunit
It looks like this started failing between 10-20 and 10-24.... Josh Durgin
11:54 AM CephFS Bug #1675 (Can't reproduce): mds: failed rstat assert
This happened during the multiple_rsync workunit.
From teuthology:~teuthworker/archive/nightly_coverage_2011-11-03/1...
Josh Durgin
11:29 AM Bug #1671: rgw: access to swift auth url without user info crashes gateway
Ah, failed to push. Rebased commit:0d4987d990e9795fda75d9e7903ba2d449b11fec. Yehuda Sadeh
02:52 AM Revision 376dad92 (ceph): hadoop: remove unused fs_default_name
The variable fs_default_name is effectively unused
and the same effect is achieved by treating paths
in a standard wa...
Noah Watkins
02:51 AM Revision 3191e0db (ceph): hadoop: FileSystem.rename should not return FileNotFound
This fixes several unit test failure cases.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Noah Watkins
02:51 AM Revision 60e1e148 (ceph): hadoop: ENOTDIR should be negative
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
02:51 AM Revision 6deea1c2 (ceph): hadoop: fix unit test: testWorkingDirectory
The working directory should be set in initialize() and
is expected by the unit tests to be fully qualified (i.e.
wit...
Noah Watkins
02:51 AM Revision ccb08e21 (ceph): hadoop: remove deprecation warning
The routine cannot be fully removed yet because it
still exists as an abstract function in FileSystem class.
Signed-...
Noah Watkins
02:51 AM Revision 1c24fc7a (ceph): hadoop: remove deprecated isDirectory()
Uses the suggested getFileStatus() method for
replacing the deprecated isDirectory(). This is
only marginally slower ...
Noah Watkins
02:51 AM Revision a407da0e (ceph): hadoop: remove statistics initialization
This is already handled by super.initialize()
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Noah Watkins
02:51 AM Revision dcf2d629 (ceph): hadoop: remove unused variable
Remove CephFileSystem.debug as log4j is now
used for debug level control.
Signed-off-by: Noah Watkins <noahwatkins@g...
Noah Watkins
02:51 AM Revision 9e8fa029 (ceph): hadoop: remove initialization check
The initialization check is removed because
it is part of Hadoop's treatment of file systems
that initialize() is cal...
Noah Watkins
02:51 AM Revision 3006c6e5 (ceph): hadoop: simplify workingDir handling; add home directory
1. Simplifies the handling of paths by allowing them to be passed
around and manipulated in their fully qualified for...
Noah Watkins
02:50 AM Revision a79b7e17 (ceph): hadoop: emulate Ceph file owner as current user
Make CephFileSystem tell Hadoop that the owner
of all files is the current user. This provides
zero security or isola...
Noah Watkins
02:49 AM Revision e9adf735 (ceph): hadoop: use standard log4j logging facility
Replace ceph.debug(msg, level) with LOG.level(msg)
provided by the log4j facility used by Hadoop. The
level can now b...
Noah Watkins
02:06 AM Bug #1529: cosd: os/FileStore.cc: 2390: FAILED assert(0 == "ENOENT on clone suggests osd bug")
Sorry for the slow response! Somehow I didn't get an e-mail update.
I do have logs preceding the crash, but they a...
Wido den Hollander

11/02/2011

08:45 PM CephFS Bug #1666: hadoop: time-related meta-data problems
Something like this would make the most sense to me. (I'd have to check the specifics of mtime updating to see exactly.) Greg Farnum
08:30 PM CephFS Bug #1666: hadoop: time-related meta-data problems
Formatting oops:... Noah Watkins
08:29 PM CephFS Bug #1666: hadoop: time-related meta-data problems
You're right about that last point Greg, it doesn't quite add up--not thinking straight today.
Here is what happen...
Noah Watkins
07:46 PM CephFS Bug #1666: hadoop: time-related meta-data problems
I'd have to look at the specifics again -- but it probably can't be done. If the client buffers a write and then flus... Greg Farnum
06:39 PM CephFS Bug #1666: hadoop: time-related meta-data problems
So, I think I've got this nailed down. The good news is that the error was a clock sync issue. The bad news is that i... Noah Watkins
06:51 PM Revision c861ee10 (ceph): PG: mark scrubmap entry as not absent when we see an update
Previously, there would be an assert failure in _scan_list if we see an
object deleted and then recreated.
Signed-of...
Samuel Just
06:33 PM Revision a2f406ef (ceph): testrados: set CEPH_CLIENT_ID without a ;
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
04:23 PM Bug #1633: osd crash in CryptoKey::decrypt
Happened again today. I put the core and tarball on the gcov gitbuilder in ~ubuntu/bug_1633. Josh Durgin
03:45 PM Revision 78111d07 (ceph): Merge branch 'wip-freebsd'
Conflicts:
src/osd/OSD.cc
Sage Weil
03:44 PM Revision 47b70367 (ceph): debian: update VCS sources
Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu> Laszlo Boszormenyi
03:44 PM Revision 0b0f65a4 (ceph): add missingok to logrotate
When ceph is not running, it has no logs. Thus logrotate has nothing to
rotate. The missingok directive handles this ...
Laszlo Boszormenyi
03:44 PM Revision f4971328 (ceph): debian: empty dependency_libs in *.la files
Per policy and multiarch support.
Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu>
Laszlo Boszormenyi
03:44 PM Revision 26787ce3 (ceph): debian: add watch
Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu> Laszlo Boszormenyi
03:44 PM Revision ee34e09c (ceph): debian: fix libceph1 -> libcephfs1 rename
Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu> Laszlo Boszormenyi
02:34 PM rgw Bug #1673 (Won't Fix): rgw: mod_fastcgi needs to be backward compatible
The changes we introduced for 100-continue break the protocol; we need to make that optional one way or another. Yehuda Sadeh
02:18 PM rgw Bug #1672 (Resolved): rgw: support chunked transfer encoding
This is required for swift support. Currently mod_fastcgi doesn't support chunked transfer and we can't just use mod_... Yehuda Sadeh
01:33 PM Bug #1530: osd crash during build_inc_scrub_map
Alright, in irc, slb seems to have hit a related bug...with logging! Samuel Just
11:49 AM Bug #1530: osd crash during build_inc_scrub_map
c861ee105475b3f20f64f51b8611f9b69207ca8c should take care of the assert(!o.negative) error. Still trying to reproduc... Samuel Just
09:02 AM Bug #1530: osd crash during build_inc_scrub_map
Possibly related: the snaps workunit failed yesterday and today with bad stats:... Josh Durgin
08:53 AM Bug #1530: osd crash during build_inc_scrub_map
Two more tests hit this last night, and two other osds crashed due to an assert in build_inc_scrub_map:... Josh Durgin
12:41 PM Bug #1671 (Resolved): rgw: access to swift auth url without user info crashes gateway
Fixed, commit:add8f59df9b6ef63a8431d3415e791b14ce1fe3c. Yehuda Sadeh
12:36 PM Bug #1671 (Resolved): rgw: access to swift auth url without user info crashes gateway
Yehuda Sadeh
11:31 AM Bug #1657 (Resolved): teuthology: testrados failed to find conf
Forgot to include my fix for that, pushed: a2f406ef49a1e5ec31d90957122e14addf56901c. Samuel Just
08:58 AM Bug #1657 (New): teuthology: testrados failed to find conf
Failed due to escaped env setting:... Josh Durgin
09:35 AM Bug #1670 (Can't reproduce): osd: crash in update_heartbeat_peers
... Sage Weil
04:20 AM Revision 2fc01b52 (ceph): osdmaptool: test --create-with-conf with racks
Make sure we generate a map that will map (and not assert about bad
max_osd/max_device mismatch).
Signed-off-by: Sag...
Sage Weil
04:14 AM Revision 885d7148 (ceph): osdmap: assert that osdmap max_osds >= crushmap max_devices
This will catch potential array overruns before they happen.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
04:14 AM Revision 0bcdd4f3 (ceph): osdmap: fix off-by-one in build_simple_from_conf
maxosd is the highest osd id. set_max_osd(that + 1), since that is
setting the array size. This fixes references of...
Sage Weil
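The off-by-one described in this commit message can be sketched as follows. This is a minimal illustration, not the actual osdmap code: `SimpleOsdMap` and `size_map_for_ids` are hypothetical names, and the only idea taken from the message is that `set_max_osd` sets an array size, so it must be given the highest osd id plus one.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an OSD map: one state slot per osd id.
struct SimpleOsdMap {
    std::vector<int> osd_state;

    // Sets the number of slots (an array size, not the highest id).
    void set_max_osd(int n) { osd_state.resize(static_cast<std::size_t>(n)); }

    int get_max_osd() const { return static_cast<int>(osd_state.size()); }

    // An id is addressable only if it is strictly below the array size.
    bool is_valid_id(int id) const { return id >= 0 && id < get_max_osd(); }
};

// Sizes the map so that the highest id in `ids` is addressable.
// Returns the highest id seen, or -1 if `ids` is empty.
inline int size_map_for_ids(SimpleOsdMap& map, const std::vector<int>& ids) {
    int highest = -1;
    for (int id : ids) {
        if (id > highest) highest = id;
    }
    // Passing `highest` here instead of `highest + 1` would leave the
    // last osd without a slot, which is the off-by-one being described.
    map.set_max_osd(highest + 1);
    return highest;
}
```

With ids 0, 1 and 5 the map ends up with six slots, so id 5 is valid and id 6 is not.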
03:04 AM Revision b66847ea (ceph): osd: fix assert include
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil

11/01/2011

11:07 PM Bug #1669: linux 32 bit kernel client ld libraries and rm issue
Yes it is much better. I used a git version of the kernel and its version is 3.1.0+. It seems ldconfig and rm are ... Hong Cho
08:59 PM Bug #1669: linux 32 bit kernel client ld libraries and rm issue
There was a recent fix for 32-bit ino generation that will avoid this problem most of the time, although in theory yo... Sage Weil
07:00 PM Bug #1669 (Resolved): linux 32 bit kernel client ld libraries and rm issue
I am running ceph on 64 bit OS (Debian linux-image-3.0.0-2-x86_64). It is on two machines each of them having 1 mon,... Hong Cho
11:02 PM Revision 219141e9 (ceph): rgw: swift prefix and path params fixes
Yehuda Sadeh
08:12 PM Revision 143c572b (ceph): .gitignore: test_str_list
Sage Weil
08:10 PM Revision aa5f697f (ceph): Makefile: include/compat.h in tarball
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
07:35 PM Revision 9252dccc (ceph): Merge branch 'master' into wip-freebsd
Sage Weil
06:49 PM Revision b3b45bf9 (ceph): Merge remote-tracking branch 'gh/wip-auth'
Sage Weil
06:43 PM Revision 79d9718d (ceph): common: make get_str_list work with other delimiters, and skip the
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:43 PM Revision 99bcd7b5 (ceph): common: get_str_list unit tests
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:19 PM Revision ba8c345b (ceph): monclient: fail fast when our auth protocols aren't supported
This handles the case where the server does not support any of the
authentication protocols that the client does. Pre...
Josh Durgin
06:19 PM Revision 7a4c232f (ceph): monclient: fix else formatting
If one branch has braces, the other should too.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin
06:16 PM Revision d1e95134 (ceph): PG: set_last_peering_reset in Reset constructor
If an osd in the prior set comes up, we can restart peering without a
new peering interval starting. However, we sti...
Samuel Just
05:46 PM Revision e15177ab (ceph): monclient: fail fast when our auth protocols aren't supported
This handles the case where the server does not support any of the
authentication protocols that the client does. Pre...
Josh Durgin
05:46 PM Revision ef51f0fa (ceph): monclient: fix else formatting
If one branch has braces, the other should too.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin
02:56 PM Bug #1633: osd crash in CryptoKey::decrypt
have a core but no matching binary :(. need to reproduce again, and save the build tarball. Sage Weil
01:03 PM devops Feature #1668 (New): collectd: push ceph plugin upstream
Rebase the perfcounter ceph plugin in the dho collectd repo against mainline collectd and push upstream. Sage Weil
11:09 AM Bug #1530: osd crash during build_inc_scrub_map
can someone work on reproducing this? see metropolis:~sage/src/teuthology/j.1530 and hammer.sh Sage Weil
10:11 AM Bug #1530: osd crash during build_inc_scrub_map
This happened again in teuthology:~teuthworker/archive/nightly_coverage_2011-11-01/1254/remote/ubuntu@sepia68.ceph.dr... Josh Durgin
11:08 AM CephFS Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
Someone needs to try to reproduce this with logs. fwiw metropolis:~sage/src/teuthology/hammer.sh is what i've been u... Sage Weil
10:22 AM CephFS Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
This happened after the misc workunit today. Josh Durgin
08:49 AM Linux kernel client Bug #1667 (Resolved): BUG at fs/inode.c line 1375
... Sage Weil

10/31/2011

10:03 PM Revision 9ea02239 (ceph): osd: kill unused on_osd_failure() hook
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
10:00 PM Revision 1d9e8065 (ceph): RadosModel.h: use default conf location
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
09:54 PM Revision 810cae1a (ceph): testrados: specify CEPH_CONF directly
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
09:02 PM Revision b9a0b2b7 (ceph): Revert "PG: call set_last_peering_reset in Started contructor"
Unfortunately, the Started constructor doesn't occur until map
activation. We need to reset last_peering_reset exact...
Samuel Just
06:15 PM Revision f9b7ecdb (ceph): hadoop: Return NULL when the path does not exist.
Although unspecified in the declaration header, other file
systems return a single result when the path is a file.
T...
Noah Watkins
05:53 PM Bug #1633: osd crash in CryptoKey::decrypt
Another occurrence in teuthology:~teuthology/archive/nightly_coverage_2011-10-28/1170/remote/ubuntu@sepia50.ceph.drea... Josh Durgin
05:32 PM CephFS Bug #1666: hadoop: time-related meta-data problems
It looks like the check is equality of timestamps. So, I think Hadoop is setting an explicit timestamp, and sometime ... Noah Watkins
05:30 PM CephFS Bug #1666: hadoop: time-related meta-data problems
All of the local clocks on the nodes look good. The code is comparing timestamps (I assume since epoch), so maybe the... Noah Watkins
05:06 PM CephFS Bug #1666: hadoop: time-related meta-data problems
Neither of these errors is in code that's remotely familiar to me. So my first favorite question is:
Are your clock...
Greg Farnum
04:55 PM CephFS Bug #1666 (Resolved): hadoop: time-related meta-data problems
The following exceptions are being thrown. It looks like something related to lstat?
java.io.IOException: Th...
Noah Watkins
02:59 PM Bug #1657 (Resolved): teuthology: testrados failed to find conf
Should work now
ceph: 1d9e8065c835c343608930585c2853984cde2fa8
teuthology: 810cae1a1d03138abfa54cd31059723ec0c22ab1
Samuel Just
02:04 PM Bug #1665 (Resolved): osd: last_peering_reset incorrect on stray?
b9a0b2b7a4d3b5a7db1f942af0158712199377a8 reverted 6d123067ce1ba99522281d5c72623bd5ba3e0fc8 Samuel Just
12:09 PM Bug #1665: osd: last_peering_reset incorrect on stray?
This is why. The interval starts at 150, and that is when the query is sent. On the stray, we hit it in 151:... Sage Weil
11:46 AM Bug #1665 (Resolved): osd: last_peering_reset incorrect on stray?
on alexandria,... Sage Weil
01:55 PM Bug #1588 (Can't reproduce): blogbench on kclient possibly made machine die
I think this is fixed - the nightly tests haven't hit it in the past week, since 339573406737461cfb17bebabf7ba536a302... Josh Durgin
11:35 AM CephFS Bug #1661 (Resolved): Hadoop: expected system directories not present
Apparently this was actually the result of an API mismatch. Fixed by Noah's patch in commit:f9b7ecdb5bba1439dc4c13005... Greg Farnum
11:26 AM Feature #1618: libvirt: make sure migration works
Braindump of what I did for the earlier libvirt migration demo:
- on each vm host, install kvm 0.15 (0.14 is too o...
Anonymous
09:13 AM Bug #1415 (Duplicate): cosd assertion: existing->state == STATE_CONNECTING || existing->state ==...
Sage Weil
09:11 AM rgw Feature #1664 (Resolved): rgw: pass swift tests
Sage Weil
09:06 AM Messengers Feature #1648 (Duplicate): msgr: choose ip to bind to based on network
Sage Weil
09:02 AM Messengers Feature #1648: msgr: choose ip to bind to based on network
duplicates #1487 Sage Weil
07:58 AM Bug #1529: cosd: os/FileStore.cc: 2390: FAILED assert(0 == "ENOENT on clone suggests osd bug")
Sage Weil wrote:
> Do you have the odd log preferring the restart?
Er, osd log preceding ...
Sage Weil
07:54 AM Bug #1529: cosd: os/FileStore.cc: 2390: FAILED assert(0 == "ENOENT on clone suggests osd bug")
Do you have the odd log preferring the restart? Sage Weil
06:46 AM Bug #1529: cosd: os/FileStore.cc: 2390: FAILED assert(0 == "ENOENT on clone suggests osd bug")
I'm still seeing this one. All my 6 OSDs went down and after starting them most of them would crash:... Wido den Hollander

10/30/2011

12:42 AM Revision 5bd029ef (ceph): osdmap: fix g_ceph_context reference
Use cct.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil

10/28/2011

10:48 PM Revision 0fa86182 (ceph): ReplicatedPG: check for peering restart before share_pg_info
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
09:33 PM Revision 199e04ab (ceph): mkcephfs: build initial osdmap from information in ceph.conf
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:32 PM Revision 3f678931 (ceph): crush: make insert_item take float for weight
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:32 PM Revision 07c9de83 (ceph): osdmaptool: build initial map from ceph.conf
This builds the initial osd and crush maps from what is in the ceph.conf,
taking advantage of host or rack tags that a...
Sage Weil
09:25 PM Revision ef4b95c8 (ceph): ReplicatedPG: Clean up old snap links when recovering a clone
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
09:25 PM Revision bd3223f9 (ceph): PG: Create new snap directories independently on replica
Previously, we shipped over the collection creation as part
of the transaction. However, the snap directory on the
r...
Samuel Just
09:04 PM Revision b497b385 (ceph): rgw: canonical resource should use unencoded url
Yehuda Sadeh
08:00 PM Revision 5fe8e00a (ceph): Merge pull request #4 from vzctl/master
fix error: 'snprintf' was not declared in this scope Sage Weil
06:49 PM Revision a8450005 (ceph): rgw: cleanup, remove unused user_id
Some access methods required user_id param, but that was never really used. At
this point we should just remove them.
Yehuda Sadeh
06:42 PM Revision 7ee0747c (ceph): mkcephfs: skip non-btrfs osds even with --mkbtrfs
This lets you mix btrfs and non-btrfs file systems.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:39 PM Revision 2bb283ba (ceph): Merge branch 'stable'
Sage Weil
05:38 PM Revision 3a17f023 (ceph): debian: break redundant dependencies
They confuse APT it seems.
ceph-common -> librbd1 -> librados2
radosgw -> ceph-common -> librados2
Signed-off-by:...
Sage Weil
05:05 PM Revision b8979f4d (ceph): MOSDMap: do not leave {oldest,newest}_map uninitialized when decoding o...
This leads to badness like
osd_map(295..296 src has 74308224..0) v1
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
03:46 PM CephFS Bug #1661: Hadoop: expected system directories not present
Blindly creating directories is definitely not the proper solution. Somebody will need to take the time to figure out... Greg Farnum
03:32 PM CephFS Bug #1661: Hadoop: expected system directories not present
In this particular instance it is a map-reduce specific directory. I suspect that MapReduce is responsible for this, ... Noah Watkins
03:22 PM CephFS Bug #1661: Hadoop: expected system directories not present
Sounds to me like CephFileSystem should just create the directory if it doesn't exist. Sage Weil
03:13 PM CephFS Bug #1661: Hadoop: expected system directories not present
Good to know. I think at this point I need to paper over many things, but want to record all these issues. I'll just ... Noah Watkins
03:08 PM CephFS Bug #1661: Hadoop: expected system directories not present
I remember running into this issue when developing things and deciding to just paper over it at the time -- I couldn'... Greg Farnum
03:05 PM CephFS Bug #1661: Hadoop: expected system directories not present
Adding: when this directory is created by hand before map reduce starts the error is gone. Noah Watkins
03:04 PM CephFS Bug #1661 (Resolved): Hadoop: expected system directories not present
Hadoop complains that directories within the file system that are expected to be present are not present. Hadoop may ... Noah Watkins
03:24 PM CephFS Bug #1663: Hadoop: file ownership/permission not available in hadoop
Noah Watkins wrote:
> This is a very simple hack that will make hadoop ignore the permission for the time being:
...
Noah Watkins
03:23 PM CephFS Bug #1663: Hadoop: file ownership/permission not available in hadoop
This is a very simple hack that will make hadoop ignore the permission for the time being:
diff --git a/src/mapred...
Noah Watkins
03:16 PM CephFS Bug #1663 (Resolved): Hadoop: file ownership/permission not available in hadoop
Hadoop complains about incorrect file ownership. An 'ls' via Hadoop FS interface reveals no permission information, b... Noah Watkins
03:08 PM rbd Feature #1662 (Resolved): libvirt: obscure qemu/rbd secrets
Sage Weil
02:36 PM Feature #1067 (Resolved): mkcephfs: magically group osds on same host into subtrees in the genera...
commit:199e04aba1bd3d0c5a2a0e13e4500bef9cc206cf Sage Weil
01:46 PM Revision 6353d7b5 (ceph): include stdio in order to fix snprintf compilation error
Signed-off-by: Alexey Lapitsky <lex@realisticgroup.com> Alexey Lapitsky
12:08 PM rgw Bug #1645 (Resolved): rgw bucket suspended broken
Fixed, commit:6752babdfda1be0524d82b84adfa4663aded32f6. Also added a teuthology test. Yehuda Sadeh
09:30 AM rgw Feature #829 (Resolved): rgw: support swift POST
We actually now support swift POST for metadata changes. For ACL changes there's issue #830. Yehuda Sadeh
09:28 AM rgw Bug #1643: radosgw-admin log show should accept --time
The problem is that the logs are indexed by date, and not by time. Filtering by time means that we need to scan the o... Yehuda Sadeh
04:04 AM Revision 46bb82f5 (ceph): client: fix return value for _readdir_cache_cb
Return 0 for end of directory here, too.
Clarify some comments.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
03:28 AM Revision 943893e8 (ceph): ceph: fix snprintf warning
warning: tools/ceph.cc:146: format not a string literal and no format arguments
Signed-off-by: Sage Weil <sage.weil@...
Sage Weil
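For context, compilers emit the "format not a string literal and no format arguments" warning when a variable is passed as the format argument of a printf-style function. The sketch below shows the usual remedy, a literal `"%s"` format. It is not the actual change made in tools/ceph.cc, and `format_message` is a hypothetical helper.

```cpp
#include <cstdio>
#include <string>

// Copies `msg` into a fixed-size buffer without treating it as a format
// string. The warned-about form would be:
//     snprintf(buf, sizeof(buf), msg);
// where any '%' in msg is interpreted as a conversion specifier.
inline std::string format_message(const char* msg) {
    char buf[64];
    // The literal "%s" format prints msg verbatim, truncated to fit buf.
    std::snprintf(buf, sizeof(buf), "%s", msg);
    return std::string(buf);
}
```

A message that contains conversion characters, such as "100%d done", comes back unchanged because it is only ever used as an argument.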
01:12 AM Revision 64992113 (ceph): auth: return unknown if no supported auth is found
If NONE is supported, it will already be in the list of supported
protocols, so there's no need to default to it here...
Josh Durgin
01:01 AM Bug #1659 (Can't reproduce): Upgrade from 0.27 -> 0.37 going wrong, OSDs miss map updates
Hi,
Like I mentioned on IRC, I had some problems with upgrading my cluster from 0.27 to 0.37.
It was a big step...
Wido den Hollander
12:24 AM Revision 1a4eec20 (ceph): uclient: fix _getdents and add some documentation.
If readdir_r_cb returns 0, that means SUCCESS, regardless of how
many entries it actually wrote.
If it returns <0, th...
Greg Farnum

10/27/2011

11:15 PM Revision 27ec04e7 (ceph): cfuse: remove unneeded loop.
The only time this was looping previously was completely unnecessary
anyway, as 1 meant the same thing as 0: there ar...
Greg Farnum
11:15 PM Revision e37ab416 (ceph): uclient: align readdirplus_r with readdir_r.
The only user of this code expects to get 1 on a successfully-filled
value, 0 on a successful non-fill, or -errno oth...
Greg Farnum
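A caller written against the return convention quoted in this commit message (1 for a filled entry, 0 for a successful non-fill, negative errno on failure) might look like the sketch below. `FakeDir`, `read_next` and `list_all` are hypothetical stand-ins rather than the uclient API, and treating 0 as end of directory is an assumption.

```cpp
#include <cerrno>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory directory used only to demonstrate the convention.
struct FakeDir {
    std::vector<std::string> names;
    std::size_t pos = 0;
    bool broken = false;
};

// Returns 1 and fills `out` when an entry is available, 0 when there is
// nothing left to fill, and a negative errno value on failure.
inline int read_next(FakeDir& d, std::string& out) {
    if (d.broken) return -EIO;
    if (d.pos >= d.names.size()) return 0;
    out = d.names[d.pos++];
    return 1;
}

// Collects every entry. Returns 0 on a clean end of directory or the
// negative errno reported by read_next.
inline int list_all(FakeDir& d, std::vector<std::string>& out) {
    std::string name;
    int r;
    while ((r = read_next(d, name)) == 1) {
        out.push_back(name);
    }
    return r;
}
```

Keeping the three outcomes distinct lets the loop stop on anything other than 1 and pass the result straight back to its own caller.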
11:15 PM Revision 55aace73 (ceph): uclient: readdir_r_cb documentation, and it only returns 0 or -errno.
Returning 0 or 1 in different situations that were effectively the
same is useless and confusing.
Signed-off-by: Gre...
Greg Farnum
09:35 PM Revision 354055f8 (ceph): rgw: swift related adjustments
Yehuda Sadeh
09:26 PM Revision 713a4428 (ceph): Merge branch 'master' of github.com:NewDreamNetwork/ceph
Sage Weil
09:04 PM Revision ed839f5a (ceph): fixed graphic reference and headings
Sondra.Menthers
09:00 PM Revision 2c4eb075 (ceph): fixed image reference
Sondra.Menthers
08:54 PM Revision b42443ec (ceph): fixed architecture document
Sondra.Menthers
08:43 PM Revision c57ed06c (ceph): add images for documentation
Sondra.Menthers
07:51 PM Revision 7a022029 (ceph): rgw: handle swift PUT with incorrect etag
Sondra.Menthers
07:44 PM Revision cae7d5a0 (ceph): rgw: handle swift PUT with incorrect etag
Sondra.Menthers
07:44 PM Revision 697bba39 (ceph): rgw: handle swift PUT with incorrect etag
Sondra.Menthers
07:11 PM Revision 10c35087 (ceph): rgw: add user suspend/enable test
Yehuda Sadeh
06:32 PM Revision 86aa940f (ceph): rgw: log-to-stderr is now a binary flag
Yehuda Sadeh
06:20 PM Revision a817a38e (ceph): rgw: handle swift PUT with incorrect etag
Sondra.Menthers
06:16 PM Revision d9dfd147 (ceph): rgw: handle swift PUT with incorrect etag
Sondra.Menthers
06:02 PM Revision 87224c08 (ceph): rgw: handle swift PUT with incorrect etag
Sondra.Menthers
05:02 PM Revision e4dcbd03 (ceph): ceph: refactor for generic --admin-daemon <sock> <cmd> too
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:50 PM Revision 6979eaa0 (ceph): filejournal: journal_replay_from
Force journal replay from a point other than the op_seq recorded by the
fs. This is useful if you want to skip bad e...
Sage Weil
04:50 PM Revision 89dccc0e (ceph): ceph: --dump-perf-counters[-schema] sockpath
Quick and dirty way to dump perfcounters stats. Not documenting this until
we decide this is where it should live.
...
Sage Weil
04:26 PM Revision a9b75f21 (ceph): Merge branch 'stable'
Sage Weil
04:26 PM Revision b3e1e3e1 (ceph): rados: improve error message
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:46 AM CephFS Bug #1549 (Need More Info): mds: zeroed root CDir* vtable in scatter_writebehind_finish
bleh. need logs... i'll start this up in a loop again. Sage Weil
10:33 AM CephFS Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
This happened again today after fsstress. From teuthology:~teuthworker/archive/nightly_coverage_2011-10-27/1083/teuth... Josh Durgin
09:26 AM Feature #1658 (Resolved): osd: backfill instead of backlog
Sage Weil
08:59 AM Feature #1646: mon: catch up on committed items before attempting to join quorum
Not sure exactly what you mean, but that sounds a bit like the behavior when the encoding changes and the monitors ar... Sage Weil
03:55 AM Feature #1646: mon: catch up on committed items before attempting to join quorum
Any chance this is related to an issue I noticed last night, in which the primary mon was receiving and displaying ... Alexandre Oliva
04:20 AM Revision 11691a71 (ceph): radosgw-admin: fix key create check
Also fixes warning
warning: rgw/rgw_admin.cc:812: suggest parentheses around ‘&&’ within ‘||’
Signed-off-by: Sage W...
Sage Weil
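The compiler warning quoted above concerns operator precedence: `&&` binds more tightly than `||`, so `a || b && c` is parsed as `a || (b && c)`. The sketch below spells out both possible groupings with hypothetical function names. It does not reproduce the actual condition in rgw_admin.cc.

```cpp
// How the compiler reads `a || b && c` when no parentheses are written.
inline bool grouped_as_parsed(bool a, bool b, bool c) {
    return a || (b && c);
}

// The other grouping, which must be requested with explicit parentheses.
inline bool grouped_or_first(bool a, bool b, bool c) {
    return (a || b) && c;
}
```

The two groupings disagree whenever `c` is false and `a` is true: the first yields true, the second false. Writing the parentheses explicitly silences the warning and makes the intended check unambiguous.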
12:24 AM Revision 921ce53d (ceph): osd: guard checks for writes
fa722de6708d3e92037df6289cc29ece12c8ea66 moved these checks, and
accidentally removed the may_write() guard. This cau...
Josh Durgin
12:20 AM Revision 0c78f0dc (ceph): rgw: handle swift PUT with incorrect etag
Yehuda Sadeh
12:00 AM Revision 213eb13d (ceph): Revert "hadoop: get hadoop bindings to build again" and fix.
It's just wrong. The Java code is still passing a String along
regardless of what you ask the C to do! Fix it by grab...
Greg Farnum

10/26/2011

11:07 PM Revision e8e10158 (ceph): rgw: rgw-admin --skip-zero-entries
Yehuda Sadeh
11:00 PM Revision 180c744b (ceph): perfcounters: fix accessor name
FreakingCamelCaps
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
11:00 PM Revision 1a0a732e (ceph): objecter: instrument with perfcounter
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
10:34 PM Revision e747456c (ceph): rgw: rgw-admin generate-key/access-key=false fix
Yehuda Sadeh
10:34 PM Revision 9386a7b5 (ceph): rgw: rgw-admin can show log summation
Yehuda Sadeh
09:56 PM Revision 7fbf28a9 (ceph): osd: read_log: only list the collection once
After upgrading we may need to list the collection to recover the hash
value when upgrading an old collection.
Signe...
Sage Weil
09:30 PM Revision 6752babd (ceph): rgw: fix bucket suspension
Yehuda Sadeh
05:46 PM Bug #1654 (Resolved): snaps workunit failed on cfuse
Fixed by 921ce53d6efc3f1bf7056f05467aff5c3104dcc8. Josh Durgin
03:24 PM Bug #1654: snaps workunit failed on cfuse
And the librados selfmanaged snaps tests also failed with an unexpected EINVAL when reading from a snapshot. Josh Durgin
11:39 AM Bug #1654: snaps workunit failed on cfuse
There might have been a bug introduced in snapshot contexts - two rbd tests got EINVAL when setting a snapshot, meani... Josh Durgin
11:35 AM Bug #1654 (Resolved): snaps workunit failed on cfuse
... Josh Durgin
05:31 PM Bug #1657 (Resolved): teuthology: testrados failed to find conf
From teuthology:~teuthworker/archive/nightly_coverage_2011-10-26/1037/teuthology.log:... Josh Durgin
04:11 PM rgw Feature #773: rgw: efficient list-objects filtering
With the new osd-class index, this should be pretty straightforward. Yehuda Sadeh
04:09 PM rgw Feature #1641 (Rejected): radosgw-admin log show --bandwidth-only
commit:9386a7b5e57de4994ff3ad4987ef309cb8275392 added data aggregation, so there's no need to dump the entire log now... Yehuda Sadeh
04:06 PM rgw Feature #1642 (Resolved): radosgw-admin log show --nonzero-only
Fixed, commit:e8e101580ea04628713f51171e9af58aec1acbd2.
rgw-admin accepts --skip-zero-entries now.
Yehuda Sadeh
04:03 PM CephFS Bug #1656: Hadoop client unit test failures
Sounds good to me -- which patches we want to keep in the tree are probably a management decision but I'm happy to pu... Greg Farnum
03:55 PM CephFS Bug #1656: Hadoop client unit test failures
Alright, so I think at this point I'd like to see two patches:
1) A patch against the downloadable tarball (much e...
Noah Watkins
03:49 PM CephFS Bug #1656: Hadoop client unit test failures
I believe the patch was made against the then-current svn 0.21 branch (which is now very dead). I pushed changes to t... Greg Farnum
03:39 PM CephFS Bug #1656: Hadoop client unit test failures
This was hadoop-0.20.205.0 with the latest Ceph master branch.
It looked like the patch in src/client/hadoop was o...
Noah Watkins
03:30 PM CephFS Bug #1656: Hadoop client unit test failures
What versions of the systems were you running when these failed?
I don't remember how they're set up but they migh...
Greg Farnum
01:59 PM CephFS Bug #1656 (Won't Fix): Hadoop client unit test failures
The Ceph Hadoop File System passes nearly all its tests except a few. I've included the test log below that shows the... Noah Watkins
03:38 PM Bug #1555 (Resolved): radosgw_admin --gen-access-key=false and --gen-secret=false flags appear to...
Fixed, commit:e747456c9f6cc8cc0367bb80e757b1b24e098de1. Yehuda Sadeh
01:49 PM Feature #1655 (Resolved): gitbuilder aggregator page
single page that has 1 line per gitbuilder, with instance name and then the top line of the gitbuilder status screen ... Sage Weil
10:13 AM Bug #1590 (Duplicate): occasionally excessive mon memory footprint
Sage Weil
10:12 AM Bug #1590: occasionally excessive mon memory footprint
this will go away with #1646. Sage Weil
10:11 AM Bug #1634 (Can't reproduce): osd: crash decoding non-existent object_info_t
going to see if this comes up again after this last round of osd fixes Sage Weil
09:58 AM Feature #1653 (Resolved): librados: python binding nose tests
Sage Weil
04:34 AM Revision f197e845 (ceph): rgw: fix uninitialized variable warnings
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil

10/25/2011

11:39 PM Revision 952be11a (ceph): hadoop: bring back Java changes.
These convert the Hadoop stuff to work on the branch-0.20 API.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost....
Greg Farnum
11:29 PM Revision 71fd8302 (ceph): Merge branch 'master' of ssh://github.com/NewDreamNetwork/ceph
Conflicts:
src/rgw/rgw_rados.cc
Yehuda Sadeh
11:23 PM Revision d9f73605 (ceph): rgw: fix attr cache
Yehuda Sadeh
10:35 PM Bug #1628 (Resolved): segfault attempting to map an rbd snapshot
Sage Weil
10:33 PM Bug #1099 (Closed): osd: handle recovery of lost objects
this has been reimplemented (at least the revert case). Sage Weil
10:32 PM Cleanup #146 (Rejected): Complete build options for Pthread API
Sage Weil
10:29 PM Feature #641 (Rejected): allow logs to be piped to an external program
works for me. Sage Weil
10:28 PM Bug #250 (Resolved): mon: delete old states to avoid filling disk
Sage Weil
10:28 PM Feature #875 (Resolved): osd: clean up old osdmaps
Sage Weil
10:24 PM Feature #1649 (Resolved): osd: make replay interval a per-pool setting
Sage Weil
10:08 PM Revision 5151a8af (ceph): common/ceph_extattr.[ch] > common/xattr.[ch]
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:54 PM Revision 46f330d0 (ceph): Merge branch 'master' into wip-freebsd
Sage Weil
09:15 PM Revision ef48183a (ceph): fix osdmaptool clitests
Sage Weil
09:02 PM Revision 8ae02dab (ceph): Merge branch 'wip-pools'
Sage Weil
05:52 PM Revision 6287ccf6 (ceph): mon: reencode routed messages
The message encoding may depend on the target features. Clear the
payload so that the Message gets reencoded appropr...
Sage Weil
05:51 PM Revision 72e0ca02 (ceph): MOSDMap: reencode full map embedded in Incremental, as needed
The Incremental may have a bufferlist containing a full map; reencode
that too if we are reencoding for old clients.
...
Sage Weil
05:13 PM Revision cd6d7009 (ceph): Merge remote-tracking branch 'gh/wip-rbd-tool'
Sage Weil
04:53 PM Revision 6ca99060 (ceph): mon: parse 0 values properly
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:53 PM Revision 90f0429f (ceph): mon: fix rare races with pool updates
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:48 PM CephFS Bug #1114 (Need More Info): NFS export extreme slowdown
Need to reproduce this on the current trunk and fully characterize what is going on.
- the nfs server in sync ...
Sage Weil
04:46 PM Bug #1194 (Resolved): kclient: NFS reexport does not survive ceph fs remount
going to assume the above fixed it until we hear otherwise :) Sage Weil
03:50 PM CephFS Bug #1585 (Can't reproduce): mds crash during shutdown
Sage Weil
03:38 PM Bug #1629 (Can't reproduce): pgs stuck degraded (only mapped to 1 osd)
pre-prior set refactor and current round of thrashing fixes. Sage Weil
03:34 PM Bug #1624 (Resolved): osd crash in HearbeatMap::_check
going to chalk these up to the infinite loop fixed in that previous patch. Sage Weil
03:33 PM Bug #1617 (Rejected): pgs stuck down and peering with only one osd down and out
non-specific, and pre-prior set refactor. Sage Weil
03:31 PM Bug #1311 (Closed): qa: TestSnaps: stuck in active
ancient and presumably covered by current thrashing tests Sage Weil
03:30 PM Bug #1292 (Closed): qa: bench & thrashosd PG won't go clean
this is ancient and presumably covered by the new thrashing tests. Sage Weil
03:29 PM Bug #1609 (Resolved): osd: failed assert(info.last_complete == info.last_update)
lots of stuff, mainly commit:03ad5a28eee2328eb2419c48a14df1a3624fc4c7 Sage Weil
10:31 AM Bug #1526 (Resolved): log bound mismatch after thrashing with bonnie
Sage Weil
05:51 AM Revision 43aa33a2 (ceph): Merge remote branch 'gh/wip-osd-queue'
Sage Weil
05:50 AM Revision 7de2f7a9 (ceph): osd: print useful debug info from choose_acting
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:50 AM Revision c30ab1e2 (ceph): osd: MOSDPGNotify: print prettier
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:50 AM Revision 12b3b2d5 (ceph): osd: fix generate_past_intervals maybe_went_rw on oldest interval
We stop working backwards when we hit last_epoch_clean, which means for the
oldest interval first_epoch may not be th...
Sage Weil
05:50 AM Revision 03ad5a28 (ceph): osd: fix last_complete adjustment after recovering an object
After we recover each object, we try to raise the last_complete value
(and matching complete_to iterator). If our lo...
Sage Weil
05:50 AM Revision e2f3c20b (ceph): osd: make proc_replica_log missing dump include useful information
I needed to see have/need to debug a weird unfound issue turned up by
thrashing.
Signed-off-by: Sage Weil <sage@newd...
Sage Weil
05:21 AM Revision f8e92896 (ceph): osd: fix/simplify op discard checks
Use a helper to determine when we should discard an op due to the client
being disconnected. Use this when the op is...
Sage Weil
05:13 AM Revision fa722de6 (ceph): osd: move queue checks into enqueue_op, kill _handle_ helpers
This simplifies things, and renames the checks to make it clear that we are
doing validation checks only, with no sid...
Sage Weil
04:59 AM Revision 3a2dc656 (ceph): osd: move op cap check into helper
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
04:54 AM Revision b17c9ca5 (ceph): osd: handle missing/degraded in op thread
The _handle_op() method (and friends) are called when an op is initially
queued and when it is requeued. In the requ...
Sage Weil
04:54 AM Revision b1de9131 (ceph): osd: drop ability to disable op queue entirely
This is pretty useless, and broken wrt requeueing anyway.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
04:54 AM Revision 662414d7 (ceph): osd: drop useless PG hooks
These no longer need to be exposed to the generic OSD code.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
03:54 AM Revision 7aa0d89b (ceph): osd: set reqid on push/pull ops
Not strictly necessary, but makes logs easier to follow.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
03:42 AM Revision e2766bd8 (ceph): mon: remove compatset cruft
The CompatSet is built on demand; it's no longer static.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil

10/24/2011

11:54 PM Revision 6f1b65c6 (ceph): ReplicatedPG: fix snapshot directory handling in snap_trimmer
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
11:54 PM Revision 024bcc4b (ceph): FileStore: ignore EEXIST on clones and collection creation !btrfs_snap
We need to ignore EEXIST on btrfs also when m_filestore_btrfs_snap is
disabled.
Signed-off-by: Samuel Just <samuel.j...
Samuel Just
11:43 PM Revision 4d884040 (ceph): rgw: fix rgw_obj compare function
Yehuda Sadeh
10:34 PM Revision df2967a6 (ceph): rgw: use a uint64_t instead of a size_t for storing the size
librados uses uint64_t so that 32-bit architectures aren't hobbled.
Signed-off-by: Greg Farnum <gregory.farnum@dream...
Greg Farnum
10:32 PM Revision 4b10cad8 (ceph): rbd: check command before opening the image
Now map/unmap won't use librbd, and commands that don't take --snap
will give an error when it's used.
Signed-off-by...
Josh Durgin
10:32 PM Revision 8c6db18d (ceph): rbd: specify which commands take --snap in usage
Maybe this will be less confusing.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin
10:32 PM Revision 46bb4122 (ceph): rbd: let all commands use the pool/image@snapshot format
This way you aren't forced to use '-p' or '--snap' to specify a pool
or snapshot for some commands.
Signed-off-by: J...
Josh Durgin
10:32 PM Revision afa34794 (ceph): librbd: show correct size for snapshots
header.size is the current size of the image.
ImageCtx::get_image_size() already does the right thing for
snapshots.
...
Josh Durgin
10:32 PM Revision f4aa69a8 (ceph): workunit: check that rbd info returns the right size for snapshots
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
10:32 PM Revision e2296c3a (ceph): clitests: add rbd usage and invalid snap usage tests
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
10:32 PM Revision 93ccccd7 (ceph): rbd: remove unnecessary condition
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
10:32 PM Revision bfb5ceb2 (ceph): workunits: add rbd rollback and snapshot removal tests
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
10:32 PM Revision 315ab94e (ceph): librbd: propagate error from snap_set
Previously rbd_snap_set always returned 0, even when the snapshot did
not exist.
Signed-off-by: Josh Durgin <josh.du...
Josh Durgin
10:32 PM Revision a5a8a9cf (ceph): test_rbd: add a test for rolling back after resizing
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
10:32 PM Revision ae91911c (ceph): librbd: resize if necessary before rolling back
This is a partial fix for test_rbd.TestImage.test_rollback_with_resize
Signed-off-by: Josh Durgin <josh.durgin@dream...
Josh Durgin
10:32 PM Revision 2af32a41 (ceph): librados: use stored snap context for all operations
Using an empty snap context led to the failure of
test_rbd.TestImage.test_rollback_with_resize, since clones weren't
...
Josh Durgin
10:32 PM Revision b7aa57ff (ceph): rbd.py: update python bindings for new copy interface
It was changed to return 0 on success in d7f7a213546b599d2eec4c6617593d232b43a7d6
Signed-off-by: Josh Durgin <josh.d...
Josh Durgin
10:32 PM Revision e161ce15 (ceph): workunits: test rbd python bindings
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
10:15 PM Revision 2be3999d (ceph): Add btrfs dimension to thrash tasks
Thrash tasks will now also run with and without btrfs.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just
09:30 PM Revision 2ad6545a (ceph): Add testrados based thrashing tasks
readwrite.yaml runs a read/write workload against a set of objects.
snaps.yaml adds snaps and rollback.
Signed-off-b...
Samuel Just
09:25 PM Revision 8d0a7c59 (ceph): testrados: rename testsnaps to testrados and make snap testing optional
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
08:52 PM Revision a1249d07 (ceph): workunit: set PYTHONPATH so we can test python bindings
Josh Durgin
06:46 PM Revision 88905b3a (ceph): test/osd: Add TestReadWrite
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
06:27 PM Revision 5e4e7972 (ceph): mon: allow adjustment of per-pool crash_replay_interval
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:12 PM Revision 40b7b572 (ceph): Merge branch 'rgw-dir-cleanup'
Greg Farnum
05:06 PM Revision f57c33df (ceph): rgw: fix check_disk_state; add a strip_namespace function.
Use copies of the IoCtx rather than references so that
we can set locators without breaking stuff, and make use of th...
Greg Farnum
05:04 PM Revision 0da45ca6 (ceph): rgw: rename translate_raw_obj to translate_raw_obj_to_obj_in_ns
And document it. Because the naming is so bad that neither I nor
the author noticed it wasn't doing what we wanted it...
Greg Farnum
05:04 PM Revision 927c3577 (ceph): rgw: add locators to the directory objects, and functions handling them
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
03:55 PM Linux kernel client Bug #1652 (Resolved): rbd: rollback correctly after resizing
I just fixed this bug in librbd, but it seems the kernel has it too. If you take a snapshot, resize the image, then r... Josh Durgin
11:38 AM rgw Bug #1567 (Resolved): rgw [list|delete]_bucket should clean up
Merged into master in commit:40b7b57239515bd0794ef5da2477a2c5eb7a85e4.
Passed s3tests with a greatly-reduced timeo...
Greg Farnum
10:53 AM Feature #1651 (Resolved): command line tool to interact with admin socket
Maybe something like 'ceph --socket /var/run/ceph/osd.0.asok foo'? Sage Weil
04:07 AM Revision f37b08f8 (ceph): librados: behave if shutdown is called twice
On failure, we shut ourselves down. If the caller calls shutdown again,
don't crash.
Fixes: #1650
Signed-off-by: Sa...
Sage Weil
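The double-shutdown guard described in f37b08f8 can be sketched as follows. This is not the librados code: the `Client` class and its `connect`/`shutdown` methods are hypothetical names, used only to illustrate a shutdown that is safe to call again after a failed connect has already shut the client down.

```python
class Client:
    """Hypothetical client object; not the librados API."""

    def __init__(self):
        self.running = False
        self.shutdown_calls = 0  # counts shutdowns that actually did work

    def connect(self, ok=True):
        self.running = True
        if not ok:
            # On failure we shut ourselves down, as the commit message says.
            self.shutdown()
            # -110 is the error printed in bug #1650 (ETIMEDOUT on Linux).
            return -110
        return 0

    def shutdown(self):
        if not self.running:
            # Already shut down: a second call is a harmless no-op.
            return
        self.running = False
        self.shutdown_calls += 1
```

A caller that sees the connect error and calls `shutdown()` anyway then hits the early return instead of tearing down state a second time.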
04:05 AM Revision c15e62aa (ceph): mon: need to print pool id for output to be useful
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
03:40 AM Revision 8a087729 (ceph): mon: PGMap::dump: fix order in totals
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
02:01 AM Revision 1b941390 (ceph): osd: make osd dump slightly more concise
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:13 AM Revision 34c2f6a4 (ceph): osd: pg_pool_t: set crash_replay_interval on data pool when decoding old
We want to preserve the crash_replay_interval on old clusters being
upgraded. Kludge this by setting it to 60 (the o...
Sage Weil

10/23/2011

11:26 PM Revision 6779eb39 (ceph): osd: make osd replay interval a per-pool property
Change the config value to only control the interval set when the data
pool is first created (presumably during mkfs)...
Sage Weil
11:26 PM Revision 8bb8e85d (ceph): Merge remote-tracking branch 'gh/master' into n
Conflicts:
src/osd/OSDMap.h
Sage Weil
11:24 PM Revision f2816a1e (ceph): osd: pg_pool_t: normalize encoding
Normalize encoding to be less awkward. Use a FEATURE bit to indicate
whether the new encoding is supported, and enco...
Sage Weil
11:24 PM Revision 7cb4d25d (ceph): osd: pg_pool_t: introduce flags, crash_replay_interval
Introduce a per-pool crash_replay_interval so we can control whether
the OSD waits for replayed ACKed but not COMMITt...
Sage Weil
09:35 PM Bug #1650 (Resolved): “rados df” joins on thread never started with mons down or laggy (regressio...
fixed by commit:f37b08f821a54263847e2c5c095bba5750908f86 Sage Weil
07:56 PM Bug #1650 (Resolved): “rados df” joins on thread never started with mons down or laggy (regressio...
If rados's attempt to connect to the mons times out, it prints:
# rados df
couldn't connect to cluster! error -110
c...
Alexandre Oliva
05:30 PM Revision 61cbb321 (ceph): ceph.conf: python parser doesn't like ; comments
Sage Weil
05:16 AM Revision 3ed06562 (ceph): ceph.conf: more frequent osd scrubbing; remove old cruft
Sage Weil
03:44 AM Revision 54e28263 (ceph): scratchtool[pp]: fix rados_conf_set/get test of log_to_stderr
Fix this warning
warning: scratchtool.c:142: comparison with string literal results in unspecified behavior
and fli...
Sage Weil
03:41 AM Revision 9323f25a (ceph): osd: fix PG::Log::copy_after wrt backlogs (again)
Commit 68fe748fc2d703623050e8f2a448a0fd31ca8a0f fixed half of this problem,
but set this->tail incorrectly. If we re...
Sage Weil

10/22/2011

10:13 PM Bug #1530: osd crash during build_inc_scrub_map
I'm going to up the scrub frequency in the teuthology conf to help shake out these problems. There was another bug r... Sage Weil
10:07 PM Bug #1616 (Resolved): crash in is_supported_auth
Sage Weil
10:06 PM Bug #1631 (Need More Info): osd: failed assert(repop_queue.front() == repop)
need an osd log on this one Sage Weil
10:05 PM Cleanup #1644 (Resolved): osd: prior_set refactor
Sage Weil
01:01 PM Bug #1471: osd: destroy_collection on non-empty dir
I'm actually hitting the same bug with v0.37
It was time to upgrade my old (and good running!) 0.27 cluster to the...
Wido den Hollander

10/21/2011

11:36 PM Revision 1b846f43 (ceph): radosgw: drop useless/broken set_val daemonize
Not sure what the intent was here anyway... but it is broken (the func
takes a string, not a bool).
Signed-off-by: S...
Sage Weil
11:35 PM Revision 1f7cb757 (ceph): config: separate --log-to-stderr and --err-to-stderr
Instead of having magic values (1 == errors only to stderr, 2 =
everything), have two booleans.
Signed-off-by: Sage ...
Sage Weil
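The change from one magic value to two booleans in 1f7cb757 can be sketched as a small mapping. The function name `split_log_to_stderr` is hypothetical, and two details are assumptions not stated in the commit message: that the old value 0 meant nothing went to stderr, and that "everything" (2) implies both flags are on.

```python
def split_log_to_stderr(legacy):
    """Map the old magic value to a (log_to_stderr, err_to_stderr) pair.

    Hypothetical helper for illustration; not part of the Ceph config code.
    """
    if legacy == 2:
        # "everything" goes to stderr (assumed to include errors)
        return (True, True)
    if legacy == 1:
        # errors only
        return (False, True)
    if legacy == 0:
        # assumption: 0 disabled stderr output entirely
        return (False, False)
    raise ValueError("unknown legacy log_to_stderr value: %r" % (legacy,))
```

With two independent flags, each setting says what it does on its own, and there is no need to remember what a given integer meant.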
11:14 PM Revision e98cbc43 (ceph): rgw: fix xattrs cache
Yehuda Sadeh
10:24 PM Revision cf6a9404 (ceph): osd: eliminate CRASHED state
This was an intermediate state that indicated that replay would be needed.
It was poorly named, and not very useful. ...
Sage Weil
10:24 PM Revision 03593019 (ceph): mon: fix last_clean_interval calculation
This up_from == first check is old and wrong. It may have been correct at
the time, when the OSD had a defined shutdo...
Sage Weil
10:24 PM Revision 600bda47 (ceph): osd: fix last_clean interval bounds
It was _first and _last, inclusive, but the epochs are really points in
time, so _last should have been non-inclusive...
Sage Weil
10:24 PM Revision 249ed569 (ceph): osd: move may_need_replay calculation out of PriorSet
Although they both depend on past intervals, they are unrelated. Factor
out the may_need_replay calculation from Pri...
Sage Weil
10:24 PM Revision 30c34ab8 (ceph): osd: trim past intervals when we complete recovery.
We weren't trimming at all, which meant these would just accumulate
indefinitely.
Signed-off-by: Sage Weil <sage@new...
Sage Weil
10:14 PM Revision d6661f93 (ceph): ReplicatedPG: Include pg version in MOSDOpReply on error
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
08:23 PM Revision f8afd8bf (ceph): rgw: reduce rados bucket stats (and getxattrs)
we didn't pass the context, and some other issue with the context map Yehuda Sadeh
05:54 PM Revision b8beff3d (ceph): ceph_manager: count active+clean+<something else> as active+clean
In my case, one pg was active+clean+scrubbing.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:32 PM Revision a1756c5e (ceph): rgw: object removal should remove object from index anyway
even if object doesn't exist. Index might have the wrong info. Yehuda Sadeh
04:56 PM Revision dd5087fa (ceph): osd: simplify finalizing scrub on replica
We can simply call osr.flush() (with pg lock held) to ensure that prior
writes are visible and scrubbable. This avoi...
Sage Weil
04:29 PM Feature #1649 (Resolved): osd: make replay interval a per-pool setting
Most pools don't need it. Make it a per-pool thing.
This involves a feature bit and refactor of the pg_pool_t e...
Sage Weil
04:14 PM Revision 29899de5 (ceph): osd: PriorSet: acting/up membership implies still alive
If the osd is in the acting or up sets, we can assume they are still alive,
even though we don't know that for sure, ...
Sage Weil
03:58 PM Revision a1ddec2a (ceph): Merge remote branch 'gh/master' into wip-prior
Conflicts:
src/osd/PG.cc
Sage Weil
03:37 PM Messengers Feature #1648 (Duplicate): msgr: choose ip to bind to based on network
Currently we bind to an explicit address or to any, and learn what address to advertise by looking at our first outgo... Sage Weil
03:34 PM Feature #1647 (Resolved): mon: robust bootstrap
Currently mkfs looks like:
- create initial states on each monitor independently
- start them up and they'll fo...
Sage Weil
03:29 PM Feature #1646 (Resolved): mon: catch up on committed items before attempting to join quorum
This will prevent a mon that is way behind from dragging down the mon cluster when it comes back online. Sage Weil
12:45 PM rgw Bug #1645 (Resolved): rgw bucket suspended broken
code still looks at the pool auid, which is obviously broken Yehuda Sadeh
11:07 AM Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct result
I think the simplest solution would be:
- for all operations, set an xattr with the last op_seq to write to that ...
Sage Weil
10:59 AM Bug #1632 (Need More Info): osd: crash in dequeue_op
need osd logs Sage Weil
10:59 AM CephFS Bug #1640 (Need More Info): mds: failed assert(trim_to > trimming_pos)
need logs with 'debug journaler = 20' and 'debug ms = 1' on the mds for this one Sage Weil
10:58 AM Bug #1624 (Need More Info): osd crash in HearbeatMap::_check
Sage Weil
10:57 AM CephFS Bug #1509 (Need More Info): cfuse sometimes hangs after unmount
Sage Weil
10:56 AM CephFS Bug #1596 (Need More Info): mds crash during ffsb on kernel client in CInode::is_frozen
Sage Weil
10:55 AM Bug #1609 (Need More Info): osd: failed assert(info.last_complete == info.last_update)
Sage Weil
10:55 AM Bug #1598 (Resolved): osd: fix lost objects
merged, along with the teuthology tests Sage Weil
10:52 AM CephFS Bug #1603 (Need More Info): ceph-fuse crash during unmount
have this one going in a loop to catch it with logs Sage Weil
10:51 AM Bug #1530 (Need More Info): osd crash during build_inc_scrub_map
Sage Weil
10:51 AM Bug #1432 (In Progress): libvirt: fix definition for rbd params/sources/etc
Sage Weil
10:51 AM Bug #1508 (Need More Info): iozone stuck on kernel rbd mount
Sage Weil
10:10 AM Cleanup #1644 (Resolved): osd: prior_set refactor
Sage Weil
09:39 AM rgw Bug #1643 (Rejected): radosgw-admin log show should accept --time
Yehuda Sadeh
09:39 AM rgw Feature #1642 (Resolved): radosgw-admin log show --nonzero-only
Have another flag for radosgw-admin log show like --nonzero-only that only prints a log entry if it will have a nonze... Yehuda Sadeh
09:38 AM rgw Feature #1641 (Rejected): radosgw-admin log show --bandwidth-only
Have a flag for radosgw-admin log show like --bandwidth-only that reduces a log line down to {'bytes_sent':<number>, ... Yehuda Sadeh
12:20 AM Revision f94a44e6 (ceph): OSDMonitor: reweight towards average utilization
The existing reweight-by-utilization calculation did not take into
account the current weight of an OSD, and depended...
Josh Durgin

10/20/2011

11:28 PM Revision 409c5717 (ceph): coverage: don't remove ceph tarball
We want to keep it for examining core files, and we're already
fetching it here, once per suite run.
Josh Durgin
10:56 PM Revision 49b6c118 (ceph): osd: PG::PriorSet: make debug_pg arg const
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:51 PM Revision fa66e65c (ceph): osd: PgPriorSet -> PriorSet
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:50 PM Revision 7bc855a8 (ceph): osd: PgPriorSet: rename prior_set_affected -> affected_by_map
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:47 PM Revision 78236e4e (ceph): osd: PgPriorSet: remove obsolete comment
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:46 PM Revision 2a870c14 (ceph): osd: PgPriorSet: move prior_set_affected into PgPriorSet
This is really where it belongs.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
10:46 PM Revision c2e66fbd (ceph): osd: PgPriorSet: kill whoami; make PG arg strictly optional
It is only used for the debug output prefix. Make it so we can leave it
out entirely (e.g. for unit tests).
We don'...
Sage Weil
09:12 PM Revision 47e938c0 (ceph): Merge branch 'stable'
Sage Weil
09:12 PM Revision 2b3bdea9 (ceph): osd: fix requeue_ops
The ls argument passed to requeue_ops() is a reference, and one of the
methods we call (say, _handle_op) might want t...
Sage Weil
08:59 PM Revision 3b76f9fc (ceph): perfcounters: remove dout
We can't use this because we're part of libglobal and there is no
g_ceph_context. And i'm too lazy to use cct.
Sign...
Sage Weil
08:58 PM Revision 863e5b04 (ceph): perfcounters: fix unit test
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
08:48 PM Revision 1d002a1e (ceph): Merge remote branch 'gh/wip-unfound'
Sage Weil
08:16 PM Revision 28df1e91 (ceph): filestore: measure commit interval, latency, journal full count
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:45 PM Revision d2dbae97 (ceph): osd: clean up perfcounter names
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:43 PM Revision d31e78f6 (ceph): filestore: simplify perfcounter lifecycle
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:43 PM Revision b000e4d4 (ceph): filestore: simplify, clean up perfcounters
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
06:34 PM Revision d26488bc (ceph): perfcounters: fix addition/removal
We are not responsible for deleting removed perfcounters.
Add debugging.
Signed-off-by: Sage Weil <sage.weil@dreamh...
Sage Weil
06:33 PM Revision 7207b819 (ceph): filestore: fix perfcounter definition
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
05:59 PM Revision 53ad579e (ceph): filestore: fix logger start
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
05:54 PM Revision d3366ccc (ceph): Merge remote-tracking branch 'github/master' into wip-swift-fix
Yehuda Sadeh
05:52 PM Feature #1420 (Resolved): build release rpms
this is done to the extent that i am willing to spend time on it. hopefully suse will show up at some point and improve... Sage Weil
05:45 PM Bug #1636 (Resolved): reweight-by-utilization does not choose good weights
The existing reweight-by-utilization code didn't make sense - commit:f94a44e688883f2db0971435a5333a8b60c77dec fixes t... Josh Durgin
04:11 AM Bug #1636 (Resolved): reweight-by-utilization does not choose good weights
there's a problem distributing the data evenly over all devices.
i'm using v0.36 and have a test setup with two host...
pille palle
05:22 PM CephFS Bug #1640 (Resolved): mds: failed assert(trim_to > trimming_pos)
This happened with bonnie++ on cfuse in teuthology:~teuthworker/archive/nightly_coverage_2011-10-20/729/remote/ubuntu... Josh Durgin
05:09 PM RADOS Feature #1639 (New): osd: guard against bad objects in cls map functions
Got this when I accidentally set a bad locator:... Greg Farnum
04:21 PM Revision 288ccc88 (ceph): perfcounters: clean up interface a bit
No logger_ prefix necessary.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
04:21 PM Revision daea03ef (ceph): perfcounters: use simple names
We don't need to uniquely identify ourselves in the global namespace with
the PerfCounter name.. only in the current ...
Sage Weil
02:15 PM Bug #1624: osd crash in HearbeatMap::_check
commit:2b3bdea9f7bcf9e9f8d4328f62d82ff43e996b3a fixes at least some of these.... Sage Weil
01:45 PM Bug #1624: osd crash in HearbeatMap::_check
running this in a loop with logs to try to catch it Sage Weil
12:02 PM Bug #1624: osd crash in HearbeatMap::_check
And teuthology:~teuthworker/archive/nightly_coverage_2011-10-20/730/remote/ubuntu@sepia50.ceph.dreamhost.com/log/osd.... Josh Durgin
12:00 PM Bug #1624: osd crash in HearbeatMap::_check
And teuthology:~teuthworker/archive/nightly_coverage_2011-10-20/727/remote/ubuntu@sepia27.ceph.dreamhost.com/log/osd.... Josh Durgin
11:57 AM Bug #1624: osd crash in HearbeatMap::_check
Happened again in teuthology:~teuthworker/archive/nightly_coverage_2011-10-20/726/remote/ubuntu@sepia41.ceph.dreamhos... Josh Durgin
02:01 PM Feature #1630 (Resolved): Monitor journal fullness (bytes used, size) via perfcounters
Sage Weil
01:45 PM Bug #1635 (Duplicate): osd hit suicide timeout in heartbeat_map thread
Sage Weil
01:28 PM Bug #1588: blogbench on kclient possibly made machine die
Happened again today - just more transactions timing out in the logs. Josh Durgin
01:23 PM Bug #1633: osd crash in CryptoKey::decrypt
Happened again while thrashing in teuthology:~teuthworker/archive/nightly_coverage_2011-10-20/732/remote/ubuntu@sepia... Josh Durgin
11:52 AM Bug #1530: osd crash during build_inc_scrub_map
This happened again during cfuse on ffsb (teuthology:~teuthworker/archive/nightly_coverage_2011-10-20/694). Josh Durgin
09:27 AM Bug #1638 (Won't Fix): Can't create object with large xattrs in a single operation (on extN)
A single compound operation that does:
- create
- setxattr (small enough to fit but large enough to fill in the e...
Yehuda Sadeh
04:45 AM Revision f25879ac (ceph): encoding: add optional features
Update encode macros to allow a feature bitmask to be passed through
to a classes encode() method.
Signed-off-by: Sa...
Sage Weil
04:22 AM Feature #1637 (Duplicate): OSDs running full take down other OSDs
this issue has a relation to #1636.
in my test setup of v0.36 when one OSD runs full it gets taken down.
this start...
pille palle
04:14 AM Revision 0aa40ea0 (ceph): assert: no 0x before thread id
There's no 0x prefix in the log lines either. This makes it easier to
copy/paste word and search.
Signed-off-by: Sa...
Sage Weil
03:48 AM Revision 0f0c5947 (ceph): osdmap: uninline big stuff
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:46 AM Revision a71455c8 (ceph): rgw: properly handle cleaning up of listings
If a listing you get back from the OSD consists only of
non-existent entries, you still need to handle it and resume ...
Greg Farnum
12:46 AM Revision 470742d8 (ceph): cls_rgw: move stat update code after error checks in complete_op
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
12:46 AM Revision 6dc0da4c (ceph): cls_rgw: implement a dir_suggest_changes function.
This takes a bufferlist of suggested changes to the directory, trims
out any sufficiently old tags, and then applies ...
Greg Farnum
12:46 AM Revision 952ebbae (ceph): cls_rgw: add constructors to data structs; don't leak tags on races
We were leaking tags on races before, since we cut out of the function
before clearing the tag. We don't do that any ...
Greg Farnum
12:45 AM Revision 9496732d (ceph): rgw: write and use the check_disk_state function
This is used to check the actual on-disk state, and encode
suggested updates for the index.
Then cls_bucket_list send...
Greg Farnum

10/19/2011

11:47 PM Revision a5ada568 (ceph): rgw: fix bad snprintf
Yehuda Sadeh
10:35 PM Revision 5de847f3 (ceph): .gitignore: add test_filestore_idempotent
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
10:35 PM Revision b57e8967 (ceph): test_filestore_idempotent: initialize var
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
09:58 PM Revision 7f5f1ec1 (ceph): rgw: implement swift metadata POST
Yehuda Sadeh
04:14 PM Revision cf333152 (ceph): Merge branch 'stable'
Conflicts:
src/mon/OSDMonitor.cc
src/osd/OSD.cc
Sage Weil
03:15 PM Bug #1526: log bound mismatch after thrashing with bonnie
Another occurrence, running on swab. This may have led to pg version reset.... Yehuda Sadeh
03:02 PM Bug #1635 (Duplicate): osd hit suicide timeout in heartbeat_map thread
This was while thrashing with radosbench, during peering, with osds 3 and 6 marked out.
From teuthology:~teuthworker...
Josh Durgin
02:46 PM Bug #1634 (Can't reproduce): osd: crash decoding non-existent object_info_t
From teuthology:~teuthworker/archive/nightly_coverage_2011-10-19/680/remote/ubuntu@sepia28.ceph.dreamhost.com/log/osd... Josh Durgin
12:05 PM Bug #1633 (Resolved): osd crash in CryptoKey::decrypt
From teuthology:~teuthworker/archive/nightly_coverage_2011-10-19/682/remote/ubuntu@sepia72.ceph.dreamhost.com/log/osd... Josh Durgin
11:59 AM Bug #1632 (Can't reproduce): osd: crash in dequeue_op
During ffsb:... Josh Durgin
11:26 AM Bug #1631 (Can't reproduce): osd: failed assert(repop_queue.front() == repop)
This happened on two osds during a multiple_rsync workunit (teuthology:~teuthworker/archive/nightly_coverage_2011-10-... Josh Durgin
10:51 AM Feature #1630 (Resolved): Monitor journal fullness (bytes used, size) via perfcounters
Anonymous
05:33 AM Revision b297d1ed (ceph): osdmap: make encoding based on features
Instead of relying on the caller to decide whether encode_old_client()
is appropriate, pass in the feature set and en...
Sage Weil
05:26 AM Revision 6e2018ce (ceph): osd: normalize encoding of pg_pool_t
Instead of using a cumbersome C struct, move members into pg_pool_t and
use normal encode/decode methods.
Signed-off-...
Sage Weil
05:26 AM Revision cee1b27f (ceph): crush: clean up encoder/decoder
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:25 AM Revision 954d3f9b (ceph): use WRITE_CLASS_ENCODER macro when possible
Sage Weil
05:25 AM Revision 9d93bfce (ceph): encoding: WRITE_CLASS_ENCODER_MEMBER -> WRITE_CLASS_MEMBER_ENCODER
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil

10/18/2011

11:52 PM Revision 83914d22 (ceph): test_filestore_idempotent: simple tool to generate a workload of non-id...
Generate a workload of operations that are non-idempotent. These are:
transaction {
clone A -> A.($n-1)
writ...
Sage Weil
11:52 PM Revision 3e92aace (ceph): filestore: tolerate EEXIST on mkcoll when not-btrfs
For non-btrfs file systems we should tolerate EEXIST because we may
replay the event more than once.
Signed-off-by: ...
Sage Weil
11:41 PM Revision ba165fec (ceph): rgw: some swift obj metadata related fixes
Yehuda Sadeh
11:18 PM Revision 13b0bbb3 (ceph): mds: handle xattrs on inode creation
Allow mknod, mkdir, symlink, create to provide xattrs for the new
inode. This will be used by the kclient to set ACL...
Sage Weil
11:04 PM Revision 7ea07832 (ceph): radosgw-admin: fix conflict with KeyType in libnss
rgw/rgw_admin.cc:459:6: error: using typedef-name 'KeyType' after 'enum'
/usr/include/nss3/keythi.h:69:3: error: 'Key...
Sage Weil
08:28 PM Revision ed5e4341 (ceph): rgw: add content-type to index dirent
Yehuda Sadeh
06:42 PM Revision da6cdfdd (ceph): osd: PgPriorSet: cur -> probe
Rename cur to probe, the set of OSDs we need to probe in order to
successfully peer.
Signed-off-by: Sage Weil <sage@...
Sage Weil
06:40 PM Revision 4e5242e0 (ceph): osd: PgPriorSet: restructure lost checks for prior set
When we add down osds to the cur set, we block peering because there
are OSDs that may have data we need and they are...
Sage Weil
06:01 PM Revision 298dbbe6 (ceph): rgw: workqueue suicide timeout is infinity
Yehuda Sadeh
04:49 PM Bug #1629 (Can't reproduce): pgs stuck degraded (only mapped to 1 osd)
From teuthology:~teuthworker/archive/nightly_coverage_2011-10-18/636/teuthology.log:... Josh Durgin
04:42 PM Bug #1624: osd crash in HeartbeatMap::_check
argh, the tarball is already gone:
# wget http://ceph.newdream.net/gitbuilder/output/sha1/e6dbd7141bd8b4403f3b931f...
Sage Weil
12:00 PM Bug #1624 (Resolved): osd crash in HeartbeatMap::_check
Logs with debugging are in vit:~joshd/thrash_stuck_active4. This happened on osds 0 and 4:... Josh Durgin
03:35 PM Bug #1628: segfault attempting to map an rbd snapshot
This is a bug in the rbd command line tool - it accepts snapname but doesn't use it for map/unmap. Additionally, it d... Josh Durgin
03:18 PM Bug #1628 (Resolved): segfault attempting to map an rbd snapshot
... John Leach
03:08 PM Bug #1626: ceph-mon HA not working right; all must be up
Sorry to dribble this in: it seems with one mon down and voted out, "ceph -s" takes <1sec 66% of the time, ~3sec 33% ... Anonymous
03:07 PM Bug #1626: ceph-mon HA not working right; all must be up
Oh sorry, what I see with vstart is a 10-second timeout until the mons vote mon.c out. This is *not* what Carl report... Anonymous
03:05 PM Bug #1626: ceph-mon HA not working right; all must be up
Carl saw it originally. Easy to repro with vstart:... Anonymous
02:48 PM Bug #1626: ceph-mon HA not working right; all must be up
Where did you see this? Sage Weil
02:28 PM Bug #1626 (Can't reproduce): ceph-mon HA not working right; all must be up
If mon.gamma is down, "ceph -s" hangs trying to connect to all three ceph-mon. The paxos majority rule system does no... Anonymous
02:52 PM Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct result
FWIW even if we know what not to replay, we could still be screwed with ext4 (which does not commit everything in ord... Sage Weil
02:25 PM Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct result
Tommi Virtanen wrote:
> Isn't the idempotency in that case "clone foo_head -> foo_2 IFF foo_2 does not exist" ?
T...
Sage Weil
02:06 PM Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct result
Isn't the idempotency in that case "clone foo_head -> foo_2 IFF foo_2 does not exist" ? Anonymous
02:37 PM Bug #1627 (Can't reproduce): ceph-mon memleak if ceph-osd cluster ip is not reachable, but public...
... Anonymous
02:25 PM Feature #1625 (Rejected): changing ceph-mon ip address needs monmap change on every mon machine
Moving mon.{alpha,beta,gamma} to new IP addresses was a fairly convoluted process. This would be nice if it was simpl... Anonymous
02:15 PM Feature #641: allow logs to be piped to an external program
This is feature creep. If you want to process the logs asap in another process, just have it get ceph stdout as stdin... Anonymous
11:19 AM Bug #1620 (Resolved): rgw suicide due to heartbeat timeout
Fixed, commit:298dbbe64f8b0738ec58db43782813d0686717c7. Basically a 0 value for the rgw suicide timeout should do the... Yehuda Sadeh
11:01 AM Bug #1588: blogbench on kclient possibly made machine die
This happened again yesterday and today with different machines. Both times, the only unusual thing in kern.log was t... Josh Durgin
01:59 AM Revision 0a027599 (ceph): osd: PgPriorSet: simplify (and change) CRASHED logic
Any single OSD from a given interval surviving is sufficient to ensure
that an ACKed write during that interval was c...
Sage Weil
01:57 AM Revision f7ef94d3 (ceph): osd: PgPriorSet: update comment terms a bit
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:51 AM Revision bbb06d34 (ceph): osd: do not short-cut up_thru update for new PGs
Commit e731885d2550ee985bf875ab5bb5faf28f1693eb made it possible for
a new PG to go active without forcing the OSDs u...
Sage Weil
12:44 AM Revision 57e0ab74 (ceph): osd: PgPriorSet: clean up per-interval var names
We don't actually use any_lost_now, but it makes the logic easier
to understand to have it there.
Signed-off-by: Sag...
Sage Weil
12:44 AM Revision 53381364 (ceph): osd: PgPriorSet: clean up comments a bit
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:44 AM Revision 33b33f7e (ceph): osd: PgPriorSet: remove unused PG member
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:44 AM Revision 113b7833 (ceph): osd: PgPriorSet: revert start_since_joining check
Commit 5b78f5db8c200edcc949033e1badae70fecd2e08 added a check to
prevent some sort of badness when osds were marked l...
Sage Weil
12:44 AM Revision 3dda4465 (ceph): osd: PgPriorSet: remove up_thru crap
This was added way back in 1cf9bebc8e5063f5f311d33e7735bcc9286e98ce,
but as far as I can tell it didn't make any sens...
Sage Weil
12:44 AM Revision f89f4d9b (ceph): osd: PgPriorSet: do not include UP osds in prior.cur
The up osds are not (directly) relevant since they are not necessarily
members of the PG. We only care about acting ...
Sage Weil
12:09 AM Revision 9dfa1105 (ceph): rgw: fix swift account and containers listing limits
Yehuda Sadeh

10/17/2011

11:48 PM Revision c5638b70 (ceph): osd: PgPriorSet: any_survived -> any_is_alive_now
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
11:13 PM Revision e6dbd714 (ceph): doc: Change diagram to have radosgw closer to direct rados access.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com> Tommi Virtanen
10:35 PM Revision 3c90c0d4 (ceph): add singleton lost-unfound
Sage Weil
10:32 PM Revision 4ec37b23 (ceph): add lost_unfound task
Also some misc useful bits to ceph_manager. Sage Weil
10:21 PM Revision edcd4d97 (ceph): rgw: some more swift fixes
Yehuda Sadeh
09:45 PM Revision 83cf3fef (ceph): Expect 'wrongly marked me down' messages during thrashing
Josh Durgin
09:42 PM Revision bcded7f1 (ceph): ceph: add whitelist for cluster log errors
Some messages are expected when thrashing osds or creating unfound
objects.
Fixes: #1622
Josh Durgin
09:13 PM Revision 0bad37e3 (ceph): streamtest: do mkfs
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:12 PM Revision 525a610f (ceph): streamtest: print to stdout
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:49 PM Revision 9c956049 (ceph): mkcephfs: copy ceph.conf to /etc/ceph/ceph.conf (when -a)
You can disable this with --no-copy-conf.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:40 PM Revision fba220ec (ceph): nuke: reset syslog configuration after rebooting
Previously we removed a file and rebooted without syncing, so the file
was never deleted.
Josh Durgin
04:56 PM Bug #1623 (Can't reproduce): ceph-osd fails to bind socket
... Yehuda Sadeh
03:51 PM Revision 9baf5ef4 (ceph): ceph.spec: don't chkconfig
This was fighting with suse insserv. Still needs some cleanup.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
03:50 PM Revision 21d941e8 (ceph): ceph.spec: work around build.opensuse.org
The redhat-rpm-config isn't installed on build.opensuse.org, which means
the processor is set to i386 instead of some...
Sage Weil
03:49 PM Revision 195a484b (ceph): ceph.spec: capitalize first letter to make rpmlint happy
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
03:35 PM Revision a6f3bbb7 (ceph): v0.37
Sage Weil
03:27 PM Bug #1473 (Resolved): osd assert failure: FAILED assert(0 == "oi disagrees with stat, or error c...
Samuel Just
03:26 PM Bug #1473: osd assert failure: FAILED assert(0 == "oi disagrees with stat, or error code on stat")
At least the recent instances of this were probably caused by the btrfs xattr bug (#1612). Samuel Just
03:25 PM Bug #1486 (Resolved): osd: 0-length meta/pginfo_* files
Sage Weil
10:10 AM Bug #1486: osd: 0-length meta/pginfo_* files
I saw this on alexandria, and it was caused by:
1- EMFILE (too many open files)
2- filestore wasn't asserting on...
Sage Weil
03:20 PM Bug #1612 (Resolved): osd/PG.cc: 3839: FAILED assert(missing[oid].need <= v)
This was caused by a btrfs xattr bug. I got a patch back from josef and pushed it to the dho kernel. Samuel Just
03:05 PM Feature #1622 (Resolved): teuthology: whitelist ceph.log entries
Implemented in teuthology and whitelisted 'wrongly marked me down' messages for thrashing jobs in the suite. Josh Durgin
10:08 AM Feature #1622 (Resolved): teuthology: whitelist ceph.log entries
Need to be able to do this to make certain tests pass, notably thrashing and the new lost_unfound. Sage Weil
03:37 AM Revision ca8f6036 (ceph): osd: fix assemble_backlog
This was written assuming that le->prior_version wouldn't be the version
that we have locally on disk. Not always tr...
Sage Weil
03:37 AM Revision 2fdec7b8 (ceph): osd: fix add_next_event Missing::item::have
The missing set should be accurate up to the current point in the log. The
log_tail has no bearing on that, nor does...
Sage Weil

10/15/2011

05:56 AM Revision c1cabf56 (ceph): ceph: don't crash when sending message to !up osd
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:03 AM Revision 615689a9 (ceph): osd: implement lost_revert
Roll back to the last available version of an object. If there is no
available version, delete it.
Leave the door o...
Sage Weil
04:03 AM Revision 3a046774 (ceph): osd: pull old version to revert to
If we are the primary, and are doing a LOST_REVERT, pull the old version
of the object and update the version when we...
Sage Weil
04:02 AM Revision 03cd1088 (ceph): osd: adjust LOST log entry types; simplify log entry type strings
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:02 AM Revision ad39d814 (ceph): osd: all_unfound_are_queried_or_lost
The check to make isn't whether all locations are lost, but whether all
locations are either lost or have been querie...
Sage Weil
04:02 AM Revision 81f36c2d (ceph): osd: remove superfluous write_info calls
- merge_log() will write_info (and log) as needed
- Activate() will do the same
Signed-off-by: Sage Weil <sage@newdr...
Sage Weil
04:02 AM Revision c3fa0783 (ceph): messages/MOSDPG*: clean up output a bit
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:02 AM Revision a3be6651 (ceph): osd: fix share_pg_log()
We need to handle a log message in the ReplicaActive state. And set the
epoch properly when we send it.
Signed-off-...
Sage Weil
04:02 AM Revision 22684f25 (ceph): osd: pass version explicitly to pull
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:02 AM Revision efe5abfc (ceph): osd: make C_OSD_CommittedPushedObject::op optional
This lets us reuse this helper for committing recovery ops that aren't a
result of a push.
Signed-off-by: Sage Weil <...
Sage Weil
04:02 AM Revision 51158820 (ceph): osd: factor out recover_primary_got() helper
This handles the missing set and last_complete adjustment when we recover
an object on the primary.
Signed-off-by: S...
Sage Weil
04:02 AM Revision 43bd49d8 (ceph): osd: fix up PG::Missing methods a bit
Pass in iterators when possible. Stack methods instead of duplicating
functionality.
Signed-off-by: Sage Weil <sage...
Sage Weil
04:02 AM Revision 7c05c1fe (ceph): osd: simplify share_pg_log
Use Log::copy_after(). Drop the useless argument. Strip out the broken
LOST logic.
Signed-off-by: Sage Weil <sage@...
Sage Weil
04:02 AM Revision a8760e50 (ceph): osd: fix up mark_all_unfound_lost so that it actually works
Well, it works given our weak definition of LOST.
- use ObjectContexts properly
- move into ReplicatedPG
- no need f...
Sage Weil
03:43 AM Revision 35dab57f (ceph): msg: add MCommand, MCommandReply message types
These are similar to MMonCommand[Ack], but aren't PaxosServiceMessage
children, don't include the command in the repl...
Sage Weil
03:43 AM Revision beaca74d (ceph): msg: entity_name_t::parse()
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
03:43 AM Revision a37d6d03 (ceph): cephtool: ability to send commands directly to osds
This makes commands beginning with 'tell <target>' magic in that they go
to the given target instead of to the monito...
Sage Weil
03:43 AM Revision 7f687fca (ceph): osd: handle (and reply to) direct MCommands
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
03:43 AM Revision f868e382 (ceph): osd: remove some pg stats debug cruft
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
03:43 AM Revision b35d96d5 (ceph): mon: feed MPGStats tids back through the MPGStatsAck
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
03:43 AM Revision 1cbcc953 (ceph): osd: process commands in a workqueue
This lets us do commands that can potentially block. For example:
- flush pg stats to osd
- request (and wait for...
Sage Weil
03:43 AM Revision 84a6f6e7 (ceph): osd: implement 'flush_pg_stats' command
This flushes the current pg stats to the monitor, and blocks until the
monitor commits it.
Signed-off-by: Sage Weil ...
Sage Weil
03:21 AM Revision 502fbba5 (ceph): paxos: trim extra state dirs
OSDMonitor, for instance, stores both an "osdmap" and "osdmap_full" for
each state. Trim them both.
Signed-off-by: ...
Sage Weil
03:20 AM Revision 6d123067 (ceph): PG: call set_last_peering_reset in Started constructor
Calling it here should cover all possible replica and primary peering
resets.
Signed-off-by: Samuel Just <samuel.jus...
Samuel Just

10/14/2011

11:49 PM Revision b5c60623 (ceph): filestore: assert on any unexpected error
Right now, the only errors we expect out of the underlying filesystem are
-ENOENT, -ENODATA, or (as a workaround for ...
Sage Weil
08:31 PM Revision ba41e6c7 (ceph): osd: send full map if we don't have sufficiently old incremental
If the peer has a really old map, send a full map instead of crashing
because we are missing the needed incremental.
...
Sage Weil
08:30 PM Revision 607043ed (ceph): osd: send full map if we don't have sufficiently old incremental
If the peer has a really old map, send a full map instead of crashing
because we are missing the needed incremental.
...
Sage Weil
08:30 PM Revision 0cc7da2f (ceph): osd: share oldest_map info with peers
This helps OSDs trim their old maps even when they don't get MOSDMap
messages directly from the monitor.
It also fee...
Sage Weil
08:30 PM Revision 818cf8c8 (ceph): mon: make number of old paxos states configurable
Currently settable on osdmaps, pgmaps, and log. Still need MDSMap and
authmap trimming.
Signed-off-by: Sage Weil <s...
Sage Weil
08:30 PM Revision 474e368d (ceph): osd: remove old osdmaps
When the monitor removes old maps, we should too.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
08:30 PM Revision 3acb9197 (ceph): paxos: trim extra state dirs
OSDMonitor, for instance, stores both an "osdmap" and "osdmap_full" for
each state. Trim them both.
Signed-off-by: ...
Sage Weil
08:20 PM Revision dd3282c1 (ceph): rgw: multiple swift fixes and cleanups
Yehuda Sadeh
08:03 PM Revision b3c68a51 (ceph): PG: call set_last_peering_reset in Started constructor
Calling it here should cover all possible replica and primary peering
resets.
Signed-off-by: Samuel Just <samuel.jus...
Samuel Just
07:57 PM Revision ef30e69c (ceph): PG: Fix log.empty confusion
Previously, log.empty meant that the log.head was eversion_t(). However,
it was in a few places used to mean that log...
Samuel Just
05:17 PM Revision fccd28df (ceph): PG: Fix log.empty confusion
Previously, log.empty meant that the log.head was eversion_t(). However,
it was in a few places used to mean that log...
Samuel Just
01:35 PM Feature #1604 (Resolved): kclient: handle osdmap discontinuity
Sage Weil
01:00 PM Bug #1449 (Resolved): osd: FAILED assert(0 == "we got a bad state machine event")
b3c68a514135318e0dfda9f929f15f26340cd664 Samuel Just
12:42 PM Bug #1620 (Resolved): rgw suicide due to heartbeat timeout
Happens around an hour after osd went down:... Yehuda Sadeh
10:14 AM Bug #1607 (Resolved): osd: failed assert(missing.is_missing(oe.soid))
a50fbe2b982e5d19040f4ae5795455dde3a9a02e Samuel Just
10:13 AM Bug #1599 (Resolved): osd assert fail (new_tail >= ondisklog.tail)
fccd28df371dceffaf6ff7a50422b6a5b1ee126c should take care of it. Samuel Just
03:03 AM Revision f658cb4a (ceph): makefile changes for interval tree
Added unit test case for interval tree to the makefile template.
Signed-off-by: Jojy George Varghese <jvarghese@scal...
Jojy George Varghese
03:02 AM Revision d516f9b5 (ceph): mds: Unit tests for interval tree
Provides usage scenarios and test cases for interval tree
implementation.
Tests include:
- testing addInterval inte...
Jojy George Varghese
03:02 AM Revision 72d50fa5 (ceph): mds: Interval tree implementation
Interval tree is an optimized data structure for representing and
querying intervals. Elementary intervals are repres...
Jojy George Varghese

10/13/2011

11:02 PM Revision 87f8389e (ceph): rgw: more swift fixes and adjustments
Yehuda Sadeh
08:28 PM Revision b6d9ed94 (ceph): auth: remove global instance of auth_supported
Wrap it in a class.
Instantiate locally, or keep a copy around if we'll need it often.
Factor out the protocol sele...
Sage Weil
04:53 PM Revision 1f3b12e0 (ceph): osd: bound generate_past_intervals() by oldest map
The oldest osdmap we maintain is a lower bound on last_epoch_clean for the
entire system (assuming the monitor is doi...
Sage Weil
04:35 PM Revision 0167e824 (ceph): cls_rgw: rewrite rgw_bucket_complete_op to use update.
Unfortunately we can't do multiple writes via the interface -- the
second one will clobber the first one. So use the ...
Greg Farnum
04:35 PM Revision 45ebaf70 (ceph): cls_rgw: remove the write_bucket_dir function.
It's no longer called anywhere. Hurray, we don't do our own
read-modify-write cycle any more (and can exploit the pow...
Greg Farnum
04:33 PM Revision 75f7e546 (ceph): cls_rgw: refactor rgw_bucket_complete_op in terms of TMAP
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
04:33 PM Revision 2592e41a (ceph): cls_rgw: refactor rgw_bucket_prepare_op in terms of tmap
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
04:33 PM Revision 83504c42 (ceph): cls_rgw: refactor rgw_bucket_init_index in terms of tmap
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
04:33 PM Revision 15a3df84 (ceph): cls_rgw: refactor read_bucket_dir in terms of tmap.
This function won't be called often once refactoring is done, but
its functionality will be needed for listing, if no...
Greg Farnum
04:32 PM Revision 583e16d9 (ceph): objclass: add map interfaces.
Right now, they implement the TMAP functions, plus a few obvious
extras to read/write select keys and the header. In ...
Greg Farnum
04:29 PM Feature #1619 (Resolved): libvirt: test with SELinux/AppArmor enabled
There are probably checks that assume the image is a file. Josh Durgin
04:29 PM Revision c98e1c57 (ceph): ReplicatedPG: remove unused tmap implementation.
If it's surrounded by an if(0), it shouldn't still be in the code.
Signed-off-by: Greg Farnum <gregory.farnum@dreamh...
Greg Farnum
04:28 PM Feature #1618 (Resolved): libvirt: make sure migration works
I think there's a small patch needed since it assumes the image is a file. Josh Durgin
04:18 PM Bug #1617: pgs stuck down and peering with only one osd down and out
Happened in run 494 as well. These were both rados bench with thrashing. Josh Durgin
03:42 PM Bug #1617 (Won't Fix): pgs stuck down and peering with only one osd down and out
From teuthology:~teuthworker/archive/nightly_coverage_2011-10-13/491/teuthology.log:... Josh Durgin
01:35 PM Bug #1616: crash in is_supported_auth
Hopefully this is resolved by commit:b6d9ed9412cb046747bb0d0713c286613757bfcf
I confess I don't see why exactly th...
Sage Weil
12:52 PM Bug #1616: crash in is_supported_auth
This happened again in run 493. Josh Durgin
11:54 AM Bug #1616 (Resolved): crash in is_supported_auth
From teuthology:~teuthworker/archive/nightly_coverage_2011-10-13/490/remote/ubuntu@sepia29.ceph.dreamhost.com/log/osd... Josh Durgin
12:41 PM rgw Bug #1584 (Resolved): rgw: swift key management is busted
We can now hold multiple swift keys, and multiple S3 keys. There's one swift key per subuser, and we can specify key ... Yehuda Sadeh
11:15 AM rgw Bug #1568 (Rejected): rgw: add object_locator to bucket index
The only locator we use is on shadow and temporary objects, and these are located by their associated actual object. ... Greg Farnum
10:35 AM Linux kernel client Bug #1615 (Can't reproduce): null pointer dereference in ceph_msg_new
This happened during a blogbench run:... Josh Durgin
10:02 AM Bug #1599: osd assert fail (new_tail >= ondisklog.tail)
Finally reproduced this with debugging - logs and pg and osd dump will be in vit:~joshd/thrash_stuck_active3 in a bit... Josh Durgin
09:58 AM rgw Bug #1570 (Resolved): rgw: use tmap for bucket index objects
Pushed to master in commit:45ebaf705d1e37f6b0af84f27767c141496c2f1e
Passes S3 tests.
Greg Farnum
09:58 AM Feature #1569 (Resolved): osd: create a tmap class api
Pushed to master in commit:583e16d9591391c834cd17154571926bffc05abc Greg Farnum

10/12/2011

11:26 PM Revision 42c8ae77 (ceph): test_librbd: expect copy to succeed
0 is the success return code. These were accidentally changed in the
conversion to gtest.
Signed-off-by: Josh Durgin...
Josh Durgin
11:26 PM Revision d0d265bf (ceph): librbd: return errors when read_iterate fails during copy
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
10:44 PM Revision a50fbe2b (ceph): PG: merge_old_entry: merged delete might not be in missing
If the new log does not contain an entry for that oid, it might not yet
be in missing, and we would need to add it.
...
Samuel Just
10:37 PM Revision 493596a7 (ceph): radosgw-admin: test swift keys creation/removal
Yehuda Sadeh
09:46 PM Revision 42bbea89 (ceph): rgw: swift key removal
Yehuda Sadeh
06:14 PM Revision 05dae94f (ceph): Revert "config: base default libdir, sysconfdir off autoconf values"
This reverts commit 7e5dee907a8218647a88d1c7d3316cc277e1c44b. Sage Weil
06:09 PM Revision 1216eb2d (ceph): rgw: some swift api fixes
Yehuda Sadeh
04:34 PM Revision 7e5dee90 (ceph): config: base default libdir, sysconfdir off autoconf values
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
03:42 PM Bug #1449 (In Progress): osd: FAILED assert(0 == "we got a bad state machine event")
This happened again, with a log being received in GetInfo. This was during radosbench and fast thrashing.
From teuth...
Josh Durgin
12:54 PM Bug #1613 (Resolved): mon crash
Greg Farnum
11:44 AM Bug #1613: mon crash
Excellent. I got it running again using second monitor's data on first monitor.
Yes I am running kernel client on...
Hong Cho
11:11 AM Bug #1613: mon crash
The mon data dir is specified in your ceph.conf.
This backtrace though makes it look like you're running the kerne...
Greg Farnum
11:02 AM Bug #1613: mon crash
Unfortunately I didn't get a chance to record the OOPS. I'll try to get them next time. In the syslog I found this ... Hong Cho
09:25 AM Bug #1613: mon crash
Did you record the OOPS somewhere? It looks as though the monitor is pulling bad data off disk. You should be able to... Greg Farnum
11:24 AM Bug #1594 (Resolved): pgs stuck degraded or active after 3 hours
The bug in the second reproduced case was fixed by commit:af6a9f30696c900a2a8bd7ae24e8ed15fb4964bb. Josh Durgin
09:36 AM Bug #1614 (Resolved): default rados class location needs to depend on autoconf libdir
Sage Weil
09:20 AM Bug #1614 (Duplicate): default rados class location needs to depend on autoconf libdir
it's /usr/lib64/... on many platforms. Sage Weil
09:14 AM CephFS Bug #1435: mds: loss of layout policies upon mds restart
Greg Farnum wrote:
> I would assume this is just the IFILE lock state thing you talked about earlier?
>
> There w...
Sage Weil
09:09 AM CephFS Bug #1435: mds: loss of layout policies upon mds restart
I would assume this is just the IFILE lock state thing you talked about earlier?
There were a few other bugs that ...
Greg Farnum
09:05 AM CephFS Bug #1435 (In Progress): mds: loss of layout policies upon mds restart
Can you do a bit of legwork and help us get a process to reproduce this? Once we have that it's easy to fix.
Prob...
Sage Weil
07:54 AM CephFS Bug #1435: mds: loss of layout policies upon mds restart
I lied :-(
I had been running with a single mds for a while, and even though it restarted a number of times, it di...
Alexandre Oliva
 
