Activity

From 05/07/2011 to 06/05/2011

06/05/2011

11:34 PM Bug #1127: RBD got silent after 1 month
Hi Josh, fix confirmed. Please commit the fix upstream. Yoshi Tamura
03:16 AM Bug #1143 (Resolved): mon addr without port breaks a new setup
When you create a new cephFS with a config that doesn't specify a port on the mon addr line, it creates an unusable c... Bernard Grymonpon

06/04/2011

01:45 AM Revision 69f90874 (ceph): dumpjnl: call msgr->register_entity before start
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
01:15 AM Revision 4abdf6fd (ceph): simple_spin: use file-scope global not function
function-scoped globals are protected by a mutex, and taking a mutex
inside a spin lock implementation kind of defeat...
Colin Patrick McCabe
12:17 AM Revision b198e5ac (ceph): messages: fix missing bit
Yehuda Sadeh
12:13 AM Revision 53adde03 (ceph): fix the MonClient problems for --with-debug programs.
Still doesn't compile, though.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum
12:08 AM Revision b0afaacb (ceph): messages: fix test for multi op
Yehuda Sadeh

06/03/2011

11:36 PM Revision ec18be53 (ceph): simple_spin: fix linker error
For some stupid reason the linker is pulling this in twice, resulting in
/bin/sh ../libtool --tag=CXX --mode=link ...
Sage Weil
10:19 PM Revision a082747c (ceph): osd: make CLONERANGE src oid encoding more sane
Encode the src_oid in the OSDOp data space, but put it in a separate easy
to access member. This avoids changing the...
Sage Weil
10:16 PM Revision a635a9cb (ceph): rgw: multipart complete upload
Yehuda Sadeh
10:15 PM Revision 8e55e186 (ceph): librados: remove useless reference holding
Yehuda Sadeh
09:57 PM Revision 740eea1e (ceph): Refactor MonClient, KeyRing
MonClient should contain a KeyRing and a RotatingKeyRing. All the
MonClient users, except possibly csyn, don't want t...
Colin Patrick McCabe
09:49 PM Revision 7f393379 (ceph): Prettify exception handling.
Display exception type (e.g. "RuntimeError").
Don't re-display the traceback.
Tommi Virtanen
09:48 PM Revision 08607692 (ceph): Remove dead code.
Tommi Virtanen
09:47 PM Revision b6e22436 (ceph): Prettify config debug printing.
Tommi Virtanen
09:47 PM Revision f2f2f42e (ceph): osd: src src_oids oloc check
We need to ensure that the src and dst objects are always in the same pg.
That is true if
- both oloc.keys match, or...
Sage Weil
09:47 PM Revision 57f979f1 (ceph): Refactor for modularity.
New style: run "./virtualenv/bin/teuthology -v interactive.yaml". Tommi Virtanen
09:47 PM Revision 90b53543 (ceph): dout:remove stream from dout_emerg_streams earlier
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
09:47 PM Revision ed41f29a (ceph): remove g_keyring
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
09:47 PM Revision 98226c22 (ceph): DoutStreambuf: de-globalize dout lock
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
09:47 PM Revision 6ed9a583 (ceph): Add simple_spin
Add simple spinlock implementation that is safe to use from anywhere.
Signed-off-by: Colin McCabe <colin.mccabe@drea...
Colin Patrick McCabe
09:47 PM Revision 5b7049c8 (ceph): DoutStreambuf: de-globalize emergency logging
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
09:44 PM Revision 0d975b5b (ceph): Record Remote in RemoteProcess.remote, for caller convenience.
Tommi Virtanen
09:29 PM Revision 70d77095 (ceph): Revert "cfuse.cc: use safe_write"
This reverts commit e8ac5aa2a4c4e3ce84ed553dbebfb1cccf5679a9.
This commit is just erroneous. It adds checks on a pip...
Greg Farnum
09:09 PM Revision 73ea844a (ceph): librados: get reference to the io context for the pending async ops
Yehuda Sadeh
09:09 PM Revision 1aee7f98 (ceph): rgw: use clone_range for multi upload completion
Yehuda Sadeh
08:28 PM Revision befcff02 (ceph): SimpleMessenger: Keep a disposable flag for use in reset
pipes marked disposable must not inherit the lossy policy on reconnect.
Also, in Pipe::writer, when sent.empty() && c...
Samuel Just
07:36 PM CephFS Bug #1084: blogbench won't finish: waiting for Fr cap forever
I haven't tested it for a while. I'll give it a try after the holidays. Henry Chang
04:34 PM CephFS Bug #1084 (In Progress): blogbench won't finish: waiting for Fr cap forever
Henry, have you seen this lately? I get the impression you were seeing it very easily and I've not been able to repro... Greg Farnum
07:13 PM Revision 7bd016f9 (ceph): rados_bencher: re-add written objects constraint to read benchmark.
Somehow, in the last major change, the constraints that kept the
bencher from trying to read non-existent objects got...
Greg Farnum
07:04 PM Bug #1142 (Resolved): dumpjournal crashes without dumping the journal
I reproduced this on change commit:6fd694c3942a12a3730a30d059b51b37d3f7536f, before the wip-815 branch was merged in.... Colin McCabe
06:53 PM Revision b4eb5efa (ceph): rados_bencher: re-add written objects constraint to read benchmark.
Somehow, in the last major change, the constraints that kept the
bencher from trying to read non-existent objects got...
Greg Farnum
06:22 PM Revision a97451f6 (ceph): librados: support clone_range
Yehuda Sadeh
05:49 PM Revision d1d3e26c (ceph): mds: remove now-erroneous comment
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
05:12 PM Revision 4ef74308 (ceph): Merge branch 'next'
Conflicts:
src/mds/Server.cc
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum
04:53 PM Revision 19949f6d (ceph): mds: Clean up _rename_prepare journaling
This has been broken for a while in terms of journaling
things the MDS isn't auth for. This patch should fix that, an...
Greg Farnum
04:48 PM Revision 4689073c (ceph): mds: _rename_prepaer should only journal dest if auth for it
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
04:48 PM Revision 97ff24c0 (ceph): Un-hardcode tasks.
Tommi Virtanen
04:40 PM Revision 3be4b482 (ceph): Make autotest show debug messages.
Otherwise it's very quiet for a long time. Tommi Virtanen
04:40 PM Revision 1634f3e4 (ceph): Move autotest running into a task.
Tommi Virtanen
04:14 PM Revision 44fe80ab (ceph): Merge branch 'osd_clonerange' into rgw-multipart
Yehuda Sadeh
02:34 PM Bug #1141 (Closed): osd: misc snap bugs
Sage Weil
01:46 PM CephFS Bug #1139 (Resolved): cfuse crashes on exit
Fixed in master by commit:e8ac5aa2a4c4e3ce84ed553dbebfb1cccf5679a9.
The bug doesn't do anything meaningful except ...
Greg Farnum
10:20 AM CephFS Bug #1139 (Resolved): cfuse crashes on exit
I believe this is a new bug. All the FUSE components exit properly but it leaves behind a core dump from the assert i... Greg Farnum
11:33 AM Bug #1118 (Resolved): Crash OSD after upgrdae from 0.28.1 to 0.28.2
If it's xfs it sounds like that was it! Sage Weil
09:31 AM Bug #1118: Crash OSD after upgrdae from 0.28.1 to 0.28.2
Yes, I use xfs.
Sorry, I cannot reproduce this error; I already reformatted the cluster.
Fyodor Ustinov
11:32 AM Linux kernel client Bug #1140 (Resolved): balance_dirty_pages makes Fw cap revocation slow
See comments for #1110 Sage Weil
11:31 AM CephFS Bug #1110 (Resolved): mds: ls -l hangs on concurrent writer
I'm going to open a separate kclient issue to deal with the balance_dirty_pages issue. Sage Weil
11:29 AM Bug #1121 (Resolved): rados: rados bench read aborts with an error
Pushed to master in commit:b4eb5efaf87d8213f89dee0d9bb156171fcd18e1 and stable in commit:7bd016f97691919689a84b4bd27e... Greg Farnum
10:29 AM Cleanup #726 (Closed): Make libcommon self-sufficient
we can reopen if this causes any real problems Sage Weil
10:27 AM CephFS Bug #1047 (Can't reproduce): mds: crash on anchor table query
The log doesn't have enough info. If anyone sees this again, let's reopen! Sage Weil
10:25 AM Bug #1018: error on building ceph on red hat 5.5
Does this problem still exist on v0.28+? Have you looked at the Red Hat info in the wiki?
http://ceph.newdream.ne...
Sage Weil
10:23 AM Bug #1032: osd: Marked down and become zombies after killing
Wanted to check in on this one. Are you still seeing this problem? When the processes are zombies, are there any btr... Sage Weil
10:17 AM CephFS Bug #1137: MDS Crash
does this happen each time you try to start cmds?
If so, can you add
debug mds = 20
debug ms = 1
to [mds] sec...
Sage Weil
04:03 AM CephFS Bug #1137 (Can't reproduce): MDS Crash
... Damien Churchill
10:17 AM CephFS Bug #1041 (Resolved): standby-replay fails on multi-mds fsstress journals
Okay, after 3 or 4 more runs I've only seen #1128. Greg Farnum
09:33 AM CephFS Bug #1041: standby-replay fails on multi-mds fsstress journals
All right, I went over _rename_prepare pretty carefully and reworked a lot of the checks on journaling and now i have... Greg Farnum
04:59 AM Bug #1138 (Resolved): need to package rados.py in the debian .deb
need to package rados.py in the debian .deb
It's a little tricky because paths to 'site-python' vary based on pyth...
Colin McCabe
04:57 AM Bug #1134 (Resolved): rados export --delete-after can't clean up after a crash
Resolved by commit:0f3224e172a077155f64897c8a3665fea6d5d892 and commit:637dfc3ed3194fdb1f5235cd48c8023c7fb1cbda Colin McCabe
01:35 AM Revision c28b749b (ceph): uclient: don't use racy check for uncommitted data.
Previously we used a check for if there were CEPH_CAP_FILE_BUFFER refs,
but that was racy if we had other threads (th...
Greg Farnum
01:35 AM Revision cd5049dc (ceph): uclient: reset flushing_caps on (mds) cap import.
Previously, we could get stuck thinking that we'd flushed caps
(that went to the original MDS, waited on freeze for e...
Greg Farnum
01:35 AM Revision 39d50c13 (ceph): mds: fail out of path_traverse if we have a null dentry.
Previously if we had a null dentry which we were not auth for,
we would go into a loop of discover lookups on that de...
Greg Farnum
01:35 AM Revision 2c6b5600 (ceph): uclient: call the right function pointer on truncate
fixes 67533e14439e9b to do what it meant to.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum
01:31 AM Revision 350e6503 (ceph): mds: use XSYN state for rdlocks during EXCL
Move to XSYN state if we get an rdlock attempt from EXCL. This means that
when there is an EXCL client doing buffere...
Sage Weil
01:22 AM Revision bbaf0b57 (ceph): mds: add xsyn states
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
01:21 AM Revision 5fc6d921 (ceph): filestore: compare dentry->d_type against d_type constant
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
01:10 AM Revision ade2ccbe (ceph): osd, filestore: debug collection listing
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
01:10 AM Revision ea76ea50 (ceph): filestore: stat to test for file type if d_type is unsupported
This only affects list_collections. Previously, when using an FS that
does not support d_type, like xfs, load_pgs wou...
Josh Durgin
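The d_type fallback described in the entry above can be sketched as follows. This is a minimal illustration in Python, not the filestore code itself: `is_directory` is a hypothetical helper, and the `DT_*` values are the Linux `<dirent.h>` constants. When the filesystem reports `DT_UNKNOWN` (as xfs did, according to the commit message), the sketch falls back to a stat call.

```python
import os
import stat

# Values of the dirent d_type field on Linux (see <dirent.h>).
DT_UNKNOWN = 0
DT_DIR = 4


def is_directory(path, d_type):
    """Hypothetical helper: decide whether a directory entry is a directory.

    Trust d_type when the filesystem filled it in; otherwise fall back
    to lstat(), which works on every filesystem.
    """
    if d_type != DT_UNKNOWN:
        return d_type == DT_DIR
    # d_type is unsupported here, so inspect the inode mode instead.
    return stat.S_ISDIR(os.lstat(path).st_mode)
```

The fallback costs one extra system call per entry, so it is only taken when d_type carries no information.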

06/02/2011

11:59 PM Revision 637dfc3e (ceph): rados_sync: add test for temp file deletion, fix
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
11:58 PM Revision d4edd17c (ceph): rgw: multipart: use locator on created parts
Yehuda Sadeh
11:48 PM Revision 0f3224e1 (ceph): rados_sync: in export, download, then rename
Download files to a 'temporary' name and then rename them when they are
complete. If the download gets aborted halfwa...
Colin Patrick McCabe
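The download-then-rename approach described in the entry above can be sketched like this. It is a minimal Python illustration, not the rados_sync implementation: `export_object` and the `.tmp` suffix are assumptions made for the example. The point is that the final name only ever appears once the data is complete, so an aborted run leaves behind a temporary file rather than a truncated export.

```python
import os


def export_object(data, dest_path, tmp_suffix=".tmp"):
    """Hypothetical helper: write data to a temporary name, then rename.

    An interrupted write leaves only dest_path + tmp_suffix behind,
    never a partial file under the final name.
    """
    tmp_path = dest_path + tmp_suffix
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes are on disk before renaming
    # os.replace overwrites dest_path atomically on POSIX filesystems.
    os.replace(tmp_path, dest_path)
    return dest_path
```

A later run can then treat any leftover file with the temporary suffix as incomplete and safe to delete.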
10:24 PM Revision 37666185 (ceph): rgw: multipart additions and fixes
Yehuda Sadeh
10:14 PM Revision 6fd694c3 (ceph): Remove unneeded libcrush1 files
Laszlo Boszormenyi
10:13 PM Revision d6bbf3e5 (ceph): mds: journal parents of srci when srcdn is remote
If srcdn is a remote dentry, we will be journaling the src inode to update
the mtime, but we need to ensure the paren...
Sage Weil
10:04 PM Revision ce5f0e71 (ceph): Move interactive and cfuse into tasks.
Tommi Virtanen
09:51 PM Revision 806646b0 (ceph): journaler: also initialize safe_pos
on reread_head. Keep consistent across the two methods.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:50 PM Revision a13b6643 (ceph): journaler: fix trim crash after standby-replay -> active
The reread_head method needs to initialize trimming_pos (like read_head
does) or else we get confused later.
Signed-...
Sage Weil
09:14 PM Revision 7ca240bf (ceph): mds: cleanup rename_prepare a bit
Use *srci tmp.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
08:57 PM Revision 0bcd9ac7 (ceph): vstart.sh: turn down debug ms
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:45 PM Revision 4d03e159 (ceph): rgw: some more multipard upload list
Yehuda Sadeh
08:33 PM Revision 52bf3fce (ceph): rgw: extend multipart list parts response
Yehuda Sadeh
08:24 PM Revision a670b4b3 (ceph): osd: implement clonerange
Clone ranges of bytes between objects, provided
- src object locators match dest object
- src objects are not miss...
Sage Weil
08:24 PM Revision fc4cc399 (ceph): osd: give obc refs to RepGather
Just give the ref to RepGather instead of doing a get and put.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
08:16 PM Revision 4cf342a1 (ceph): mds: pin inode while one renamed_files list
Make sure the inode is pinned while it is on the LogSegment::renamed_files
list. Avoids a crash when an inode on tha...
Sage Weil
07:25 PM Bug #1098 (Closed): mds never coming "up:active" awaits in "up:creating"
Sage Weil
07:11 PM CephFS Bug #1110: mds: ls -l hangs on concurrent writer
Okay, there was an issue with the behavior with the MDS locks (they didn't do what I thought they did). I added a ne... Sage Weil
06:59 PM Linux kernel client Bug #1136 (Resolved): mempool_destroy failure on umount
on current master, i was doing umount on a sluggish cluster, and... Sage Weil
06:23 PM CephFS Bug #1117: mds: rename rollback broken on slaves during replay
It seems to also be broken on the master -- I've been testing cross-MDS rename ops and of course you see a lot of rol... Greg Farnum
06:15 PM Bug #1118: Crash OSD after upgrdae from 0.28.1 to 0.28.2
This looks like the same problem as #1127. If you're not using ext3/4 or btrfs, it almost certainly is. Can you try t... Josh Durgin
06:11 PM Bug #1127: RBD got silent after 1 month
Hi Yoshi, I put a fix in the filestore_debugging branch, which will generate a new package in about 1/2 an hour.
L...
Josh Durgin
05:39 PM Revision b152a93c (ceph): rgw: more cleanup
Yehuda Sadeh
05:28 PM Revision 3546cfdd (ceph): rgw: some cleanup
Yehuda Sadeh
04:49 PM Revision 50731646 (ceph): rgw: multipart upload parser test util
Yehuda Sadeh
04:39 PM Revision 2f3f36ab (ceph): rgw: fix multipart upload complete parser
Yehuda Sadeh
04:30 PM Revision 711a77cf (ceph): rgw: multipart complete fix
Yehuda Sadeh
04:19 PM Revision 0cce0a5e (ceph): filestore: allow clone_range to different offsets
The old OP_CLONERANGE would only clone a range of bytes at the same offset
in both objects. Add an OP_CLONERANGE2 op...
Sage Weil
04:17 PM Revision 502baeab (ceph): filestore: fix fallback/slow do_clone_range
We need to seek to the appropriate offsets on the src and destination
fd's for this to do the right thing.
Signed-of...
Sage Weil
04:17 PM Revision 6ca168ed (ceph): filestore: fix fallback/slow do_clone_range
We need to seek to the appropriate offsets on the src and destination
fd's for this to do the right thing.
Signed-of...
Sage Weil
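The clone-range entries above (different source and destination offsets, and a slow fallback that must position both file descriptors) can be illustrated with a short Python sketch. `clone_range` and its parameters are hypothetical names for this example, not the filestore API. The sketch uses `os.pread` and `os.pwrite` (Unix only), which take an explicit offset on each call and so avoid depending on the current file position.

```python
import os


def clone_range(src_fd, dst_fd, src_off, length, dst_off, chunk=64 * 1024):
    """Hypothetical helper: copy `length` bytes from src_fd at src_off
    to dst_fd at dst_off. Returns the number of bytes actually copied."""
    copied = 0
    while copied < length:
        want = min(chunk, length - copied)
        buf = os.pread(src_fd, want, src_off + copied)
        if not buf:
            break  # reached the end of the source early
        written = 0
        while written < len(buf):
            # pwrite may write fewer bytes than requested, so loop.
            written += os.pwrite(dst_fd, buf[written:], dst_off + copied + written)
        copied += len(buf)
    return copied
```

For example, cloning 4 bytes from offset 2 of a source holding `0123456789` to offset 5 of a destination holding `abcdefghij` leaves the destination as `abcde2345j`.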
04:09 PM Revision 95163e94 (ceph): Fetch ceph binary tarball independently on every node.
Avoids shuffling the bytes through the controlling node.
Use sha1 file to make sure everyone gets the same version.
Tommi Virtanen
03:58 PM Cleanup #1135 (Resolved): d_type cleanup
the codebase seems to have lots of this going on:
src/mds/CDir.cc:726: if (dn->get_linkage()->get_remote_d_ty...
Anonymous
03:21 PM Bug #1134: rados export --delete-after can't clean up after a crash
I guess I should add that manually removing that file from the exported directory makes it work again!
Also, it wo...
Jeremy Kitchen
03:16 PM Bug #1134 (Resolved): rados export --delete-after can't clean up after a crash
I was using rados export to dump out a pool and it was taking a long time so I ctrl-c'd it. Now when I do it on that ... Jeremy Kitchen
03:13 PM RADOS Bug #1129 (Won't Fix): sort out libcrush
meh, let's not worry about it until someone needs libcrush.so. Sage Weil
03:11 PM CephFS Bug #1132 (Resolved): mds: missing parent in rename metablob
commit:d6bbf3e5fbe1df26d1bfe6f695ca52cfbb3694b2 Sage Weil
01:19 PM CephFS Bug #1132 (Resolved): mds: missing parent in rename metablob
single mds, fsstress -p 30 workload... Sage Weil
03:07 PM CephFS Bug #1133 (Resolved): mds: journaler failed assertion on standby-replay -> replay
fixed by commit:a13b66436561bfe86f4907d18d2ea7762632d36d Sage Weil
02:04 PM CephFS Bug #1133 (Resolved): mds: journaler failed assertion on standby-replay -> replay
fsstress workload. kill master mds. standby crashes with:... Sage Weil
12:29 PM Bug #1131 (Resolved): OSD assert failure in update_heartbeat_peers()
Probably fixed in current stable: c5470e0f855b246cfbde6982ca90f565e7074600. Let us know if it persists! Samuel Just
12:20 PM Bug #1131 (Resolved): OSD assert failure in update_heartbeat_peers()
I'm not sure I can reproduce it, because my system state is a bit out of whack due to a previous bug (#1130), but I'v... Sam Lang
09:54 AM Linux kernel client Bug #1096 (Resolved): LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
Thanks Jeff! Sage Weil
02:51 AM Linux kernel client Bug #1096: LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
Hi, I applied the patch to verify this bug.
Ran "./fsstress -d /mnt/ceph/fstest -l 1 -n 10000 -p 1 -v": pass.
r...
changping Wu
04:20 AM Revision 7e2e4779 (ceph): mon: make sure osd paxos is writeable before doing timeouts
The osd paxos machine has to be writeable before we can update it.
Fixes: #1130
Signed-off-by: Sage Weil <sage.weil@...
Sage Weil
12:05 AM Revision c5470e0f (ceph): OSD: don't keep old connection over new one in update_heartbeat_peers
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just

06/01/2011

11:34 PM Revision 780322db (ceph): boto_tool: add get_bucket_acl
Signed-off-by: "Colin McCabe" <colin.mcccabe@dreamhost.com> Matthew Wodrich
11:28 PM Revision e11958b2 (ceph): Merge branch 'stable' into next
Sage Weil
11:23 PM Revision 59501e1d (ceph): Merge branch 'stable'
Sage Weil
11:04 PM Revision de0f0c72 (ceph): Refactor to use Cluster and Remote, to evaluate the new APIs.
Tommi Virtanen
10:13 PM Revision 65dc8411 (ceph): rgw: implement list multipart
still partially implemented Yehuda Sadeh
09:54 PM Bug #1130 (Resolved): monitor crash in PaxosService:propose_pending()
This should be fixed by commit:7e2e4779e4323429167af36e9a5fb9741c075e96. Thanks for the report! Sage Weil
04:36 PM Bug #1130 (Resolved): monitor crash in PaxosService:propose_pending()
While doing some failure testing, one of the ceph monitors crashed. I have 6 osds, 3 monitors, and 3 mds servers run... Sam Lang
06:42 PM Revision e340bfe1 (ceph): dout: use recursive mutex for dout
Using a recursive mutex for dout is desirable because it allows us to
survive situations like this:
> foo() { dout <...
Colin Patrick McCabe
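The commit's own example is truncated above and is not reconstructed here. As a separate, hedged illustration of why a recursive mutex helps a logging path, the Python sketch below uses `threading.RLock`; all names (`log`, `log_value`, `Noisy`) are hypothetical. Formatting the value re-enters the logger on the same thread while the lock is already held, which would deadlock with a plain `threading.Lock`.

```python
import threading

log_lock = threading.RLock()  # recursive: the owning thread may re-acquire it
lines = []


def log(message):
    """Append one line to the log while holding the lock."""
    with log_lock:
        lines.append(str(message))


class Noisy:
    """An object whose string conversion itself writes a log line."""

    def __str__(self):
        log("formatting Noisy")  # re-enters log() on the same thread
        return "Noisy()"


def log_value(value):
    """Hold the lock across formatting and output, as a logger might."""
    with log_lock:
        log("value=" + str(value))
```

Calling `log_value(Noisy())` records the inner line first and the outer line second, because the inner call completes while the outer one still holds the lock.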
06:21 PM Linux kernel client Tasks #1112: check all igrab at ceph-client,remove deadlock : spin_lock(&inode->i_lock) + igrab...
Hi,
ceph_set_page_dirty still contains an igrab call.
I merged the patch into ceph-client-standalone,
ran fsstress, and still hit...
changping Wu
06:18 PM Revision 44770df8 (ceph): lockdep: fix shadowed global, add printout
Fix a bug that was keeping lockdep from starting. Add a printout that
lets the user know that lockdep is enabled.
Si...
Colin Patrick McCabe
04:44 PM Revision 9b37f4fa (ceph): Allow embedded '\0' in bufferlists when copying to std::string.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com> Tommi Virtanen
04:36 PM RADOS Bug #1129 (Won't Fix): sort out libcrush
librados and libceph now statically link in crush code. Should it be a .so? Should we provide a .so anyway, for thi... Sage Weil
04:16 PM Linux kernel client Bug #1096: LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
This should be fixed by commit:85defe76f7e2a0b3d285a3be72fcffce96629b5c, pushed to the master branch. Can you test an... Sage Weil
11:35 AM Linux kernel client Bug #1096 (In Progress): LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
Scratch that, something a bit more subtle is going on. Sage Weil
11:14 AM Linux kernel client Bug #1096: LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
The problem is a short O_DIRECT read that hits EOF. This seems to fix it for me:... Sage Weil
12:19 AM Linux kernel client Bug #1096: LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
git ceph-client commit 98cc99822dac96710a8b64bdc2be4eccffc78956 ,
hand compiling , btrfs+ ubuntu 10.10+2.6.39+ ..
...
changping Wu
04:09 PM CephFS Bug #1128: clustered mds: failed verify_subtree_bounds
Oh right. Logs and core dump in:
kai:~gregf/logs/fstress/replay_bad_bounds
Greg Farnum
04:04 PM CephFS Bug #1128 (Resolved): clustered mds: failed verify_subtree_bounds
... Greg Farnum
09:14 AM Bug #1127: RBD got silent after 1 month
Yoshi, can you attach 'ceph osd dump -o - 26', 'ceph osd dump -o -', and 'ceph pg dump -o -' outputs? Sage Weil
08:43 AM Bug #1127: RBD got silent after 1 month
Looked into this a bit on irc yesterday. This part of the osd log looks problematic - there's only one osd, so the pg... Josh Durgin
12:03 AM Revision 7c6c6a9e (ceph): rados_sync: don't hash paths with periods
A period is not such a bad character.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Colin Patrick McCabe

05/31/2011

11:57 PM Revision 4870393a (ceph): test_rados_tool.sh: test hashed paths
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
11:45 PM Revision b4bc1c68 (ceph): rados export: better name mangling rules, fix test
Introduce a versioning scheme for name mangling, so that we can change
it in the future if we want to.
For names tha...
Colin Patrick McCabe
11:05 PM Revision 5dd0e122 (ceph): rgw: handle multipart completion
still wip Yehuda Sadeh
10:32 PM Revision d29b3b77 (ceph): rgw: parser for multi upload completion
Yehuda Sadeh
10:01 PM Revision 7a474b10 (ceph): Use orchesta.remote as a higher-level wrapper, stop worrying about host...
This changes just first caller in a series of many; the rest will change
once a role-based API is in place.
Tommi Virtanen
10:01 PM Revision 33c39ab5 (ceph): rados_sync: prefix user extended attributes
Start user extended attributes with USER_XATTR_PREFIX.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Colin Patrick McCabe
09:59 PM Revision 0806e65b (ceph): rgw: some more xml reshuffling
Yehuda Sadeh
09:59 PM Revision 9970b86c (ceph): Wrap Remote._runner in staticmethod() or it gets mistaken for a method.
It used to get an extra self argument, and mistook that as client. Tommi Virtanen
09:33 PM Revision dc9aaacf (ceph): Add a pretty wrapper on top of Paramiko and run.run.
Most importantly right now, it knows its name, and can
prettyprint it.
Tommi Virtanen
09:31 PM Revision f5d6be6e (ceph): rgw: move generic xml parsing code to some shared location
Yehuda Sadeh
09:31 PM Revision 5875f796 (ceph): Remove dead code.
Tommi Virtanen
09:28 PM Revision efee7466 (ceph): objecter, osd: clonerange operation
Add a src_oids field to MOSDOp, referenced by a new CLONERANGE osd op type
that will clone data from one object to an...
Sage Weil
08:58 PM Revision 07c1989a (ceph): librados: implement aio_flush
Implement a per-ioctx flush that blocks until all previously submitted
aio operations on the ioctx are safe. Each ai...
Sage Weil
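The flush semantics described in the entry above (block until previously submitted asynchronous operations have completed) can be sketched in Python with a pending counter and a condition variable. This is an assumption-laden illustration, not the librados API: `IoContext`, `submit`, and `flush` are hypothetical names, and each operation simply runs on its own thread.

```python
import threading


class IoContext:
    """Hypothetical I/O context that tracks in-flight async operations."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = 0

    def submit(self, op):
        """Start op() asynchronously and count it as pending."""
        with self._cond:
            self._pending += 1

        def run():
            try:
                op()
            finally:
                with self._cond:
                    self._pending -= 1
                    self._cond.notify_all()

        threading.Thread(target=run).start()

    def flush(self):
        """Block until no submitted operation is still in flight."""
        with self._cond:
            self._cond.wait_for(lambda: self._pending == 0)
```

As a simplification, this `flush` also waits for operations submitted by other threads while it is blocked; a stricter version would snapshot the set of operations pending at the time of the call.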
08:51 PM Revision 7d4bb120 (ceph): Initial import.
Currently hardcoded to run dbench, not modular, and the remote
execution API is clumsy.
Tommi Virtanen
08:46 PM Revision 6db2a4e2 (ceph): crushtool: error out if uniform weights vary
Fixes: #1075
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
08:28 PM Revision 35b19a41 (ceph): osd: fix ScrubFinalizeWQ::_clear condition
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
07:58 PM Revision 1528d2c4 (ceph): debian: depend on libboost-dev >= 1.34
for statechart. Partially fixes #1124.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:45 PM Bug #1127 (Resolved): RBD got silent after 1 month
RBD got silent after about 1 month running.
Although I restarted the daemons, the symptom doesn't go away.
Attached...
Yoshi Tamura
04:37 PM Revision 0cfa911f (ceph): osd: don't leak Connection reference
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:37 PM Revision 8aa67aa4 (ceph): osd: ignore old/stale heartbeat messages
If we get heartbeat messages from old epochs from peers that are not
current, drop them and mark the connection down....
Sage Weil
04:37 PM Revision e5c9100b (ceph): osd: fix map sharing due to heartbeats
- share the map with the cluster addr
- use the new {note,get}_peer_epoch helpers to do it sanely
- don't share if we...
Sage Weil
02:26 PM RADOS Feature #1126 (Rejected): crush: extend rule definition
The current rule command structure does not allow you to do something like:
- pick 2 racks
- pick 2 devices under...
Sage Weil
01:57 PM Feature #511 (Resolved): librados: implement flush
Sage Weil
01:44 PM RADOS Feature #1075 (Resolved): crushtool: warn if uniform item weights vary
Sage Weil
01:15 PM rgw Subtask #1125 (Resolved): osd: support for merging/cloning several objects into one final object
Sage Weil
12:34 PM Bug #1124: Depend on new enough Boost
and
3. ceph.spec.in
Anonymous
11:47 AM Bug #1124 (Resolved): Depend on new enough Boost
Ensure that we depend on a new enough libboost to build successfully. Do this in
1. debian/control
2. autoconf
...
Anonymous
11:34 AM Feature #1123 (Resolved): qa: small but completely functional suite
Sage Weil
10:57 AM Bug #906 (Can't reproduce): clustered mds: lchown not setting uid/gid
Sage Weil
10:53 AM CephFS Bug #1111 (Resolved): file lock requests in wait queue not getting cleaned up after process exit
Sage Weil

05/30/2011

09:45 PM Linux kernel client Bug #1096: LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6

echo 'file fs/ceph/caps.c +p' > /sys/kernel/debug/dynamic_debug/control
Logs attached.
changping Wu
08:29 PM Linux kernel client Bug #1096: LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6

Hi, I checked out the ceph-client master branch:
commit 98cc99822dac96710a8b64bdc2be4eccffc78956
Author: Sage Weil <sage@ne...
changping Wu
09:14 PM Bug #906: clustered mds: lchown not setting uid/gid
I don't think that I ever did manage to reproduce it.
I haven't thought it through much, but it's also possible th...
Greg Farnum
08:49 PM Linux kernel client Bug #1109 (Closed): rbd: btrfs crash
this was on old code. Sage Weil
08:47 PM Bug #1122 (Resolved): kclient: async readahead
Many people now have noticed that sequential read performance is slower than writes. Is this simply a matter of adju... Sage Weil
08:46 PM Bug #1121 (Resolved): rados: rados bench read aborts with an error
Reported by multiple people now on ceph-devel. Probably easy to fix? Sage Weil
08:45 PM Feature #1120 (Resolved): qa: gcov metrics
generate total coverage statistics for the entire qa suite so we can measure overall coverage and improvements. we c... Sage Weil
08:44 PM Feature #1119 (Resolved): qa: gcov/lcov html output
generate browsable lcov pages for individual tests and/or the whole qa suite Sage Weil
07:37 PM Revision 5b7c8ae8 (ceph): osd: protect recovery_wq ops with the recovery lock
We were calling recovery_item.remove_myself() without holding the
recoveryWQ::lock. Naughty naughty!
Signed-off-by: ...
Greg Farnum
07:37 PM Revision b3fb58ea (ceph): crushtool: add -v verbose for --test mode
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
01:27 PM Bug #1118 (Resolved): Crash OSD after upgrdae from 0.28.1 to 0.28.2
I know, "after" is not always "because". :)
I stopped cosd, upgraded (via aptitude from your repository) and start aga...
Fyodor Ustinov
01:10 PM Bug #1116 (Resolved): RecoveryWQ assert failure
commit:5b7c8ae8bdc26e7593323c76527cb37912b9d833 Sage Weil

05/29/2011

10:55 PM Linux kernel client Tasks #1112: check all igrab at ceph-client,remove deadlock : spin_lock(&inode->i_lock) + igrab...
Hi,
I'm verifying the fsstress test with the ceph-client master branch:
commit 98cc99822dac96710a8b64bdc2be4eccffc78956
...
changping Wu
10:00 PM Revision 57ea5020 (ceph): Add content to obsync package
Laszlo Boszormenyi
09:42 PM RADOS Bug #1017 (Closed): ceph 0.26 ,mkcephfs --crushmap crush.new ,wait for very long time,mds stat i...
Looks like you need 'chooseleaf' instead of 'choose' in the crush rules. Sage Weil
09:42 PM RADOS Bug #1016 (Closed): ceph 0.26,crushmap change,mount fail.
Looks like you need 'chooseleaf' instead of 'choose' in the crush rules. Sage Weil

05/28/2011

04:14 PM Revision 23242045 (ceph): v0.28.2
Sage Weil

05/27/2011

09:46 PM Revision 7e1de380 (ceph): hadoop: track Hadoop API changes
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
09:22 PM Revision 232cd6b3 (ceph): rgw: generate random upload id
Yehuda Sadeh
09:05 PM Revision 4ddf8df8 (ceph): SimpleMessenger: allow multiple calls to shutdown
Fixes a case where radostool crashed on an error shutdown.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Colin Patrick McCabe
09:01 PM Revision 8490b784 (ceph): common/Thread.h: const cleanup
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
08:35 PM Revision a0d521b2 (ceph): rgw: fix signing for some requests
Yehuda Sadeh
07:50 PM Revision 818bfd15 (ceph): rgw: serve multipard init upload
still needs to generate a random hidden object, and use rados locator
for accessing it.
Yehuda Sadeh
05:59 PM Revision 7cfb3b6a (ceph): Merge branch 'wip-obsync'
Colin Patrick McCabe
04:01 PM Bug #1116: RecoveryWQ assert failure
Looks as though this patch has helped.
At least this osd has completed rebalancing.
Great! Thanks!
Fyodor Ustinov
12:17 PM Bug #1116: RecoveryWQ assert failure
Okay, checked this out. It turns out that the only function violating the locking was OSD::do_recovery. Simply adding... Greg Farnum
09:48 AM Bug #1116 (Resolved): RecoveryWQ assert failure
From Fyodor:... Greg Farnum
02:30 PM CephFS Bug #1117 (Resolved): mds: rename rollback broken on slaves during replay
Best I can tell it's just busted. The rollback object contains all the dentries and inodes, but on a slave it's entir... Greg Farnum
10:03 AM Bug #1052 (Resolved): obsync: add rados backend tests to test-obsync.py
Implemented. Colin McCabe
09:39 AM CephFS Bug #1041: standby-replay fails on multi-mds fsstress journals
Back from vacation, and I'm trying to remember what's still broken here. Looking through my logs:
1) MDS 1 gets requ...
Greg Farnum
09:24 AM Linux kernel client Tasks #1112: check all igrab at ceph-client,remove deadlock : spin_lock(&inode->i_lock) + igrab...
Jeff Wu wrote:
> static int ceph_set_page_dirty(struct page *page)
> {
> ...............................
> /* dir...
Sage Weil
08:01 AM Linux kernel client Tasks #1112: check all igrab at ceph-client,remove deadlock : spin_lock(&inode->i_lock) + igrab...

static int ceph_set_page_dirty(struct page *page)
{
...............................
/* dirty the head */
spin...
changping Wu
08:00 AM Linux kernel client Tasks #1112: check all igrab at ceph-client,remove deadlock : spin_lock(&inode->i_lock) + igrab...
Hi ,
I attached some of logs at bug #1096 http://tracker.newdream.net/issues/1096.
:ceph-client-fsstress log 1,2,3....
changping Wu
08:17 AM CephFS Bug #1110: mds: ls -l hangs on concurrent writer
> OK, thanks. I'll try out 2.6.39 tomorrow. Will keep you informed.
Now running 2.6.39 everywhere on freshly creat...
Andre Noll
04:37 AM Revision 574b58f3 (ceph): mkcephfs: pass config to osdmaptool
This lets OSDMap::create_simple() see g_conf.osd_pool_default_size when
creating the initial data, metadata, and rbd ...
Sage Weil
04:31 AM Revision d2ab764b (ceph): drop useless cm.txt
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
04:20 AM Revision 1292436b (ceph): osdmap: take default pool size from config
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil

05/26/2011

10:18 PM Revision 9e8484e8 (ceph): rgw: handle POST requests for s3
Yehuda Sadeh
10:07 PM Revision 9b8daa92 (ceph): crushtool: update help
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:00 PM rgw Feature #767 (In Progress): rgw: incremental/large file uploads
Sage Weil
09:11 PM Revision 6f704e33 (ceph): obysnc: rgw target: validate all users
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
08:17 PM Revision 22082c4f (ceph): mon: remove pg_temp mappings when we delete pools
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:17 PM Revision ae5bbc7b (ceph): Merge branch 'wip-obsync'
Colin Patrick McCabe
08:15 PM Revision e0cbb131 (ceph): test-obsync: test sync directly from s3->rgw
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
08:12 PM Revision a93c86e5 (ceph): crushtool: fix --add-item weight being zero when parent bucket(s) created
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:21 PM Revision 56d5d959 (ceph): obsync: fix bucket creation through rgw target
The rgw: target can now create buckets. Add a test.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Colin Patrick McCabe
06:04 PM Revision 9cefb56b (ceph): Merge branch 'stable'
Sage Weil
05:25 PM Revision b2c1bff8 (ceph): test-obsync: test big objects, user-defined xattr
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
05:22 PM Bug #1098 (In Progress): mds never coming "up:active" awaits in "up:creating"
shyamali mukherjee wrote:
> I have put OSD logfile and journal to ext3. osd data still comes from "btrfs".
>
> I ...
Sage Weil
11:26 AM Bug #1098: mds never coming "up:active" awaits in "up:creating"
I have put OSD logfile and journal to ext3. osd data still comes from "btrfs".
I have tried at least about 50 times...
shyamali mukherjee
10:14 AM Bug #1098: mds never coming "up:active" awaits in "up:creating"
You switched everything over to ext3?
It doesn't look like a user_xattr issue; the cosd daemon will error out and ...
Sage Weil
10:01 AM Bug #1098: mds never coming "up:active" awaits in "up:creating"
The cosd has blocked on a btrfs bug; it doesn't have much to do with Ceph.
Eventually your cluster should declare ...
Greg Farnum
09:50 AM Bug #1098: mds never coming "up:active" awaits in "up:creating"
Hi Sage,
I know you have closed the issue. But I could not attach the logfile as it is too huge. I have got few li...
shyamali mukherjee
05:19 PM Revision e9eeb161 (ceph): mkcephfs: set rdir for local mon setup
Fixes: #1113
Reported-by: Bernard Grymonpon <bernard@openminds.be>
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
04:55 PM Revision 5d51b8fd (ceph): init-ceph: ssh
Another bell/whistle.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
04:31 PM rgw Bug #1115: rgw allows users to "give away" s3 objects
I added a test for this to s3-tests. To run it, use:... Colin McCabe
03:36 PM rgw Bug #1115 (Resolved): rgw allows users to "give away" s3 objects
The Rados gateway should not allow the owner of an object to be changed through a PUTACL operation. Amazon doesn't al... Colin McCabe
03:42 PM Bug #1056 (Won't Fix): obsync: add warning when source owner is not the same as dest owner (after...
The owner of the object we create is determined by the access key and secret key supplied by the user. It can never b... Colin McCabe
03:37 PM Bug #1046 (Resolved): rgw: changing object owners
Filed bug #1115 because RGW's behavior does not match Amazon's. Colin McCabe
03:33 PM Bug #1046: rgw: changing object owners
The answer to question #1 is no, Amazon does not allow users to "give away" the ownership of objects. Colin McCabe
03:29 PM CephFS Bug #1114 (Rejected): NFS export extreme slowdown
Attached is debug mds 20 output.
Below is ceph -w output for a corresponding period.
Time synchronization is < 0.1s...
Brian Chrisman
03:28 PM Bug #906: clustered mds: lchown not setting uid/gid
Greg, what did you do before to reproduce this? Sage Weil
02:29 PM Bug #1050 (Won't Fix): obsync: implement --filter to allow certain objects in the source to be sk...
The original reason we wanted this feature was to skip objects with different owners.
This was handled by the creati...
Colin McCabe
02:28 PM Bug #1051 (Resolved): obsync: create a librgw to parse binary ACLs generated by RGW
Colin McCabe
01:22 PM Bug #960 (Resolved): obsync: support rados pool "buckets"
> - sync directly to/from librados
Implemented in the rgw: target.
> - copy amazon acl's into same xattr name tha...
Colin McCabe
12:53 PM CephFS Bug #1110: mds: ls -l hangs on concurrent writer
Sage Weil wrote:
> Andre Noll wrote:
> > Hm that does not seem to work. I had to compile a kernel with dynamic debu...
Andre Noll
09:32 AM CephFS Bug #1110: mds: ls -l hangs on concurrent writer
Andre Noll wrote:
> Hm that does not seem to work. I had to compile a kernel with dynamic debug enabled,
> but noth...
Sage Weil
02:30 AM CephFS Bug #1110: mds: ls -l hangs on concurrent writer
Hm that does not seem to work. I had to compile a kernel with dynamic debug enabled,
but nothing makes it to the log...
Andre Noll
11:16 AM CephFS Bug #1108: Large number of files in a directory makes things grind to a halt
Excellent thanks for the tips. It'll have to wait until Tuesday now for testing but I'll report back then. Going to u... Damien Churchill
10:10 AM CephFS Bug #1108: Large number of files in a directory makes things grind to a halt
If that turns out to be too unstable for you and you have gobs of RAM for your MDS, you could also bump up the MDS ca... Greg Farnum
10:20 AM Bug #1113 (Resolved): rdir is not set correctly for the mons
Fixed in stable branch. BTW in the future please add a Signed-off-by to your patches... see SubmittingPatches file i... Sage Weil
09:45 AM Bug #1113 (Resolved): rdir is not set correctly for the mons
In mkcephfs, rdir is used to keep the config in for remote hosts, and dir is used for localhost. However, when bootst... Bernard Grymonpon
09:53 AM Bug #1095: run "rados bench 10 seq -p data" print "error during benchmark: -5"
Actually, the write benchmark should record how many objects are left and the read benchmark isn't supposed to go pas... Greg Farnum
09:17 AM Linux kernel client Tasks #1112: check all igrab at ceph-client,remove deadlock : spin_lock(&inode->i_lock) + igrab...
Hi Jeff-
Are there actual cases of this that you're seeing? I've fixed several of these, but I'm not aware curren...
Sage Weil
12:48 AM Revision 05cfb4d5 (ceph): obysnc: fix content-type on RGWStore
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
12:36 AM Revision 6cf67a26 (ceph): test-obsync: compare_directory now compares xattrs
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe

05/25/2011

11:50 PM Linux kernel client Bug #1096: LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
Hi ,i git ceph-client master :
commit 35b0ed997b1a49ff73a6110cbd04681467dbe217
Author: Sage Weil <sage@newdream.n...
changping Wu
07:40 AM Linux kernel client Bug #1096: LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
I will build the kernel to verify it. Thanks. changping Wu
11:45 PM Linux kernel client Tasks #1112 (Resolved): check all igrab at ceph-client,remove deadlock : spin_lock(&inode->i_lock...
Hi, the igrab function already contains the code: spin_lock(&inode->i_lock);
if coding this:
spin_lock(&inode->i_...
changping Wu
10:55 PM Revision 4cae0ea8 (ceph): ceph-pybind-test: test embedded NULLs in data
Test embedded nulls in rados data. Fix a bug in rados.Object.__str__
Signed-off-by: Colin McCabe <colin.mccabe@dream...
Colin Patrick McCabe
10:49 PM Revision a2d35295 (ceph): obsync: more fixes for RgwStore
* Fix content-type handling
* add vvprint and use it in Object::equals.
* support RgwStore::prefix
* more tests
S...
Colin Patrick McCabe
10:48 PM Revision b76874f6 (ceph): pybind/rados: correctly return data with NULLs
Correctly handle returning data with embedded NULLs in it.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Colin Patrick McCabe
10:21 PM Revision 970897ce (ceph): pybind/rados.py: throw NoData on ENODATA
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
09:56 PM Revision 28c30265 (ceph): mds: fix canceled lock attempt
If client tries to lock a file, has to wait, and then cancels the attempt,
the client will send an unlock request to ...
Sage Weil
09:34 PM Revision 596a3d6a (ceph): librbd: make image contexts threadsafe
Use refresh_lock to protect the needs_refresh member, and
ImageContext::lock for the header and snapshot metadata.
S...
Josh Durgin
09:22 PM Revision d38001c7 (ceph): pybind/rados.py: rados.Object.key should be string
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
07:58 PM Revision b2554823 (ceph): obysnc: RgwStore: make sure destination users exist
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
07:36 PM Revision 5d865fb6 (ceph): obsync: fix DST_OWNER
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
07:33 PM Revision 73e28f2e (ceph): rgw: return EACCES if acl xattr doesn't exist
Yehuda Sadeh
07:05 PM Revision ea76712a (ceph): obsync: Add boto_retries, remove rgw_store.prefix
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
06:23 PM Revision e3dd77d8 (ceph): librbd: const cleanup
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
06:06 PM Revision 2aa9151e (ceph): librbd: clean up md_oid use a bit
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
06:01 PM Revision 0adaa6b6 (ceph): rados python bindings: handle xattrs with NULL
Handle extended attributes that contain NULL bytes correctly, rather
than treating everything as zero-terminated C st...
Colin Patrick McCabe
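The difference this commit message describes can be illustrated with Python's ctypes. This is a minimal sketch, not the code from the rados bindings; the payload and variable names are hypothetical.

```python
import ctypes

# Hypothetical xattr value containing an embedded NUL byte.
payload = b"ab\x00cd"
buf = ctypes.create_string_buffer(payload, len(payload))

# Treating the buffer as a zero-terminated C string stops at the first NUL.
as_c_string = buf.value

# Reading an explicit length returns every byte, including the NUL.
as_bytes = ctypes.string_at(buf, len(payload))
```

Passing the length explicitly is what keeps the bytes after the NUL from being silently dropped.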
05:54 PM Revision d4bfd964 (ceph): PG: fix race in _activate_committed
Previously, _activate_committed would access the osdmap epoch racing
with handle_osd_map's osdmap update. This would...
Samuel Just
05:50 PM Revision 7de7ba00 (ceph): RgwStore: fix some ACL issues
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
05:21 PM Revision 42f873e6 (ceph): Proper ACL support for rados targets
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
05:21 PM Revision 17053739 (ceph): test-obsync: refactor a little bit
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
05:21 PM Revision e4e098ba (ceph): Rename RadosStore to RgwStore
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
05:21 PM Revision 3f5f5620 (ceph): test-obysnc.py: support librgw testing
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
04:38 PM Revision 0aa18f32 (ceph): mds: do not shift to EXCL or MIX while rdlocked
There was an old change in file_eval() that was allowing us to switch from
SYNC to MIX or EXCL while there were rdloc...
Sage Weil
04:25 PM Messengers Bug #1107 (Resolved): msgr: old outgoing connection + mark_down leaves stale state on remote peer
Sage Weil
02:57 PM CephFS Bug #1111: file lock requests in wait queue not getting cleaned up after process exit
pushed kclient fix to ceph-client.git master branch and mds fix ceph.git master branch. Sage Weil
02:38 PM CephFS Bug #1111 (In Progress): file lock requests in wait queue not getting cleaned up after process exit
Sage Weil
12:12 PM CephFS Bug #1111: file lock requests in wait queue not getting cleaned up after process exit
Adding our test program. Brian Chrisman
11:37 AM CephFS Bug #1111: file lock requests in wait queue not getting cleaned up after process exit
Here's the notes on our bug and the related PIDs
NODE 1: 192.168.98.112 (client 4113)
- gets the lock and ho...
Brian Chrisman
11:36 AM CephFS Bug #1111 (Resolved): file lock requests in wait queue not getting cleaned up after process exit
Our interpretation of events:
1) proc1 requests lock
2) proc1 receives lock
3) proc2 requests lock
4) proc2 gets ...
Brian Chrisman
12:05 PM CephFS Bug #1110: mds: ls -l hangs on concurrent writer
The kclient isn't responding to a cap revocation message. I'm not seeing anything since 2.6.38 that would have fixed... Sage Weil
11:39 AM CephFS Bug #1110: mds: ls -l hangs on concurrent writer
mds log created with
ceph mds tell 0 injectargs '--debug-mds 20 --debug-ms 1'
The problem occurs at 2011-05...
Andre Noll
10:49 AM CephFS Bug #1110 (Resolved): mds: ls -l hangs on concurrent writer
... Sage Weil
10:57 AM Bug #1098 (Closed): mds never coming "up:active" awaits in "up:creating"
Sage Weil
10:56 AM Bug #1012 (Rejected): Autotest: Measure RADOS IO performance under read and write loads
Sage Weil
10:56 AM Feature #948 (Rejected): autotest: graph rbd performance
Sage Weil
08:59 AM CephFS Bug #1108: Large number of files in a directory makes things grind to a halt
Enabling directory fragmentation should fix this. Add
mds bal frag = true
to your [mds] section and restart the...
Sage Weil
04:18 AM CephFS Bug #1108 (Closed): Large number of files in a directory makes things grind to a halt
Whilst extracting a copy of our mail directories onto a 10 node cluster(3xmds, 3xmon, 10xosd) I found that there was ... Damien Churchill
08:55 AM Linux kernel client Bug #1109 (Closed): rbd: btrfs crash
... Sage Weil
04:14 AM Revision fe955881 (ceph): crushtool: clean up add-item a bit; don't add item to same bucket twice
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
04:05 AM Revision dd89ff44 (ceph): crushtool: fix remove-item
Scan all buckets instead of doing a tree traverse.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
03:30 AM Revision 1c334d1a (ceph): radosgw_admin: update clitest
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
01:58 AM CephFS Bug #1104: Segmentation fault when deleting a folder
Fixed!
Pulled in the latest changes, recompiled, and works like a charm now.
Bernard Grymonpon
01:16 AM Revision ab01d74e (ceph): mkcephfs.in: print out usage if no actions given
If the user didn't specify any actions, print out a usage message rather
than silently exiting.
Signed-off-by: Colin...
Colin Patrick McCabe
12:53 AM Revision f7ea7c98 (ceph): rgw: Fix RGWAccess::init_storage_provider
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
12:06 AM Revision c67dd164 (ceph): mkcephfs: error out on bad usage
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil

05/24/2011

11:40 PM Revision 5532f897 (ceph): make: fix build for rgw
Yehuda Sadeh
11:33 PM Revision 203a43bf (ceph): rgw_admin: clean warning
Yehuda Sadeh
10:58 PM Messengers Bug #1107 (Resolved): msgr: old outgoing connection + mark_down leaves stale state on remote peer
Peers A and B:
- A reached epoch 10, in which A and B should exchange heartbeats
- A advances to epoch 20, and ...
Sage Weil
10:30 PM Revision 95c594f6 (ceph): Merge commit 'origin/master' into rgw-multiuser
Yehuda Sadeh
09:29 PM Revision ab278b4b (ceph): rgw_admin: add key create
Yehuda Sadeh
09:19 PM Bug #1095 (Closed): run "rados bench 10 seq -p data" print "error during benchmark: -5"
Hi Jeff-
I think the problem here is just that the read phase is running out of data to read. Let the write phase...
Sage Weil
09:17 PM Revision bd0eb9a3 (ceph): rgw_admin: subuser and key removal
Yehuda Sadeh
08:38 PM Revision 0566de49 (ceph): Let callers specify that some arguments should not be quoted.
This lets you do things such as "test -e /foo && bar" or
"cd /tmp && blah". Remember that shell pipelines do not dete...
Tommi Virtanen
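One way to let callers opt out of quoting for individual arguments is a marker type, sketched below. This is an assumption about the general idea, not the code from this revision: the names `Raw` and `quote` are hypothetical, and only `shlex.quote` comes from the Python standard library.

```python
import shlex


class Raw:
    """Hypothetical marker for an argument that must not be quoted."""

    def __init__(self, value):
        self.value = value


def quote(args):
    """Join args into a shell command line, quoting all but Raw arguments."""
    return " ".join(
        a.value if isinstance(a, Raw) else shlex.quote(a) for a in args
    )


# The shell operator passes through unquoted; ordinary arguments are escaped.
command = quote(["test", "-e", "/foo", Raw("&&"), "echo", "it's there"])
```

Keeping quoting as the default and requiring an explicit marker to skip it means a caller cannot inject shell syntax by accident.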
08:29 PM Revision dad0a67a (ceph): Simple unit tests for shell quoting.
Tommi Virtanen
08:27 PM Revision be28e5bf (ceph): Refactor to extract shell quoting into utility function.
Tommi Virtanen
08:16 PM Revision 1a459dd7 (ceph): Depend on Paramiko 1.7.7 or newer to be able to read modern OpenSSH keys.
Tommi Virtanen
08:16 PM Revision 7330c3c4 (ceph): journaler: tolerate ENOENT when prezeroing
ENOENT is okay and expected.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
08:12 PM Revision 37c94af8 (ceph): Pyflakes cleanup.
Tommi Virtanen
08:07 PM Revision 5d5b1795 (ceph): Add a utility function run.wait to wait for processes to exit.
Tommi Virtanen
08:06 PM Revision 073a4bbc (ceph): Paramiko ChannelFile.close() didn't actually close the remote stdin.
Add a wrapper that calls shutdown on the channel itself,
to actually cause EOF. Add integration test using r...
Tommi Virtanen
08:01 PM Revision 6dd4774f (ceph): Log debug info of commands actually executed.
Tommi Virtanen
08:01 PM Revision 9c42fe6b (ceph): Cleanup dead code.
Tommi Virtanen
08:00 PM Revision f10668f5 (ceph): Allow easy writing to stdin of remote processes.
Tommi Virtanen
07:36 PM Revision bb13c92a (ceph): test_common.sh: skip rm before put
The rm before the put is unnecessary and actually incorrect now.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost....
Colin Patrick McCabe
07:34 PM Revision e42736ae (ceph): radostool: rados put should use write_full
If "rados put" uses write instead of write_full, the resulting object on
the server may be a mishmash of old and new o...
Colin Patrick McCabe
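The problem with a plain write can be seen in a toy in-memory model. This sketch does not call librados; `write` and `write_full` here are hypothetical stand-ins that only mimic the two semantics on a bytes value.

```python
def write(obj: bytes, data: bytes, offset: int = 0) -> bytes:
    """Model of a partial write: overwrite bytes in place, keep the rest."""
    padded = obj.ljust(offset, b"\x00")
    return padded[:offset] + data + padded[offset + len(data):]


def write_full(obj: bytes, data: bytes) -> bytes:
    """Model of a full write: the object becomes exactly the new data."""
    return data


old = b"old object contents"
new = b"NEW"

after_write = write(old, new)            # tail of the old object survives
after_write_full = write_full(old, new)  # only the new data remains
```

When the new data is shorter than the existing object, the partial write leaves the old tail behind, which is the mix of old and new contents the commit message warns about.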
07:22 PM Revision cfe372ec (ceph): Merge branch 'wip_ceph_context'
Colin Patrick McCabe
07:21 PM Revision 9ff7cc7c (ceph): Create a libcommon service thread
Create a libcommon service thread. Use it to handle SIGHUP.
Handle it by means of a flag that gets set. Using a queu...
Colin Patrick McCabe
05:00 PM Revision 29702685 (ceph): librados: len should be size_t
Unsigned, and size_t because it's a buffer size.
Fixes signedness warning in testrados.
Signed-off-by: Sage Weil <s...
Sage Weil
04:47 PM Revision ce04e3db (ceph): osd: add ability to explicitly mark unfound as lost
Instead of automatically marking unfound objects lost (once we've tried
every location we can think of), do it when t...
Sage Weil
04:42 PM Revision 87309e94 (ceph): osd: make automatically marking of unfound as lost optional
We may not want to do this automatically until we have more confidence in
the recovery code. Even then, possible not...
Sage Weil
04:27 PM Revision cea7b651 (ceph): mds: clean up get_or_create_stray
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:24 PM Revision 081acc4c (ceph): mds: initialize stray_index on startup
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:17 PM Revision 754cfaac (ceph): Merge branch 'stable'
Sage Weil
01:21 PM CephFS Bug #1104: Segmentation fault when deleting a folder
I'll try it first thing tomorrow, no more access to the machines now - everything is always updated completely on all... Bernard Grymonpon
12:55 PM CephFS Bug #1104: Segmentation fault when deleting a folder
cherry-picked commit:7330c3c473aa128b1e3ecb8752278f655bc79620 to stable. i'm a bit surprised you're seeing this on t... Sage Weil
12:50 PM CephFS Bug #1104: Segmentation fault when deleting a folder
There we go:
[Switching to Thread 0x7ffff5574700 (LWP 27162)]
0x00007ffff67c1165 in raise () from /lib/libc.so....
Bernard Grymonpon
12:18 PM CephFS Bug #1104: Segmentation fault when deleting a folder
I'll have to rebuild everything, "r" is optimized out in my build. This will take a little longer...
#6 0x0000...
Bernard Grymonpon
11:37 AM CephFS Bug #1104: Segmentation fault when deleting a folder
Can you check with gdb to see what the value of 'r' actually is? Sage Weil
11:32 AM CephFS Bug #1104: Segmentation fault when deleting a folder
Tried the stable branch (i'm at ce04e3dbaf2383a521b267585a860f772c4cc786), made debian packages, installed it all, st... Bernard Grymonpon
11:20 AM CephFS Bug #1104 (Resolved): Segmentation fault when deleting a folder
Yay! Thanks for your help testing. We'll do 0.28.2 in a few days. Sage Weil
11:19 AM CephFS Bug #1104: Segmentation fault when deleting a folder
Compiled from the latest master sources (sorry, forgot to switch to the stable branch); it does not have this trouble. Hooray? Maybe it mak... Fyodor Ustinov
10:15 AM CephFS Bug #1104: Segmentation fault when deleting a folder
Attached! You may have problems if your libraries don't match mine. There are also the autobuilt debian packages th... Sage Weil
09:56 AM CephFS Bug #1104: Segmentation fault when deleting a folder
Sage Weil wrote:
> the 'stable' branch has that fix, or you can apply it manually...
Published in your repository...
Fyodor Ustinov
09:28 AM CephFS Bug #1104: Segmentation fault when deleting a folder
the 'stable' branch has that fix, or you can apply it manually... Sage Weil
09:23 AM CephFS Bug #1104: Segmentation fault when deleting a folder
Sage Weil wrote:
> Can you try with this patch applied?
It's 0.28.1 or I should compile master branch?
Fyodor Ustinov
09:01 AM CephFS Bug #1104: Segmentation fault when deleting a folder
Can you try with this patch applied?... Sage Weil
01:40 AM CephFS Bug #1104: Segmentation fault when deleting a folder
I can not attach files to this issue.
http://blog.ufm.su/core.zip - core file
http://blog.ufm.su/mds.zip - log fi...
Fyodor Ustinov
12:50 PM Linux kernel client Feature #962: d_prune
Sage Weil
12:50 PM Linux kernel client Bug #851: make dcache readdir with I_COMPLETE work
Sage Weil
12:50 PM Linux kernel client Bug #850: make NULL lookup using I_COMPLETE work
Sage Weil
11:33 AM Bug #1099: osd: handle recovery of lost objects
For the time being I disabled automatic marking of lost objects. That makes dealing when "recovering" them less of a... Sage Weil
11:31 AM Linux kernel client Bug #1096: LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
This is a kclient bug due to multiple threads entering flush_dirty_caps, which is not reentrant due to commit:e9964c1... Sage Weil
09:53 AM Linux kernel client Bug #1096: LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
The 30k lines still doesn't have the last client_request arrival. I bumped the limit to 50mb. Can you grab a bigger ... Sage Weil
04:11 AM Revision d66c6ca1 (ceph): v0.28.1
Sage Weil
12:28 AM Revision 9a660ac9 (ceph): librads, libceph: store CephContext
Don't use the global g_ceph_context. Instead, store the CephContext in
the structures provided by the library user.
...
Colin Patrick McCabe
12:28 AM Revision 13aed89e (ceph): Add CephContext
A CephContext represents the context held by a single library user.
There can be multiple CephContexts in the same pr...
Colin Patrick McCabe
12:07 AM Revision 1c7b9821 (ceph): Split common_init_daemonize from common_init_finish
Split off common_init_daemonize from common_init_finish. cfuse is a
daemon that calls common_init_finish, but handles...
Colin Patrick McCabe

05/23/2011

11:52 PM Revision 478c6bbc (ceph): rgw_admin: make interface a bit more explicit
Yehuda Sadeh
10:12 PM Revision c167a28d (ceph): rgw: subuser permissions
Yehuda Sadeh
09:58 PM Revision 6360154d (ceph): mon: verify that crush max does not exceed osd max
- when injecting a new crushmap
- when adjusting osdmap max_osd
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
04:45 PM RADOS Bug #1106 (Resolved): crush/osd: inconsistent mapping values
This was because crush max_devices was osdmap.max_osd - 1. Need to add some loud warnings and checks for this. Sage Weil
03:24 PM Bug #1098: mds never coming "up:active" awaits in "up:creating"
The MDS isn't coming up because the OSD requests aren't completing because btrfs is wedged. Which kernel are you usin... Sage Weil
03:16 PM Feature #1105 (Resolved): have multiple access keys per user in rgw
Although the radosgw_admin interface needs a bit of polishing, it's implemented as of commit:c167a28d73b665f7239f8fe7... Yehuda Sadeh

05/22/2011

11:25 PM Revision 5d982803 (ceph): crushtool: add --reweight-item <name> <weight>
Reweight an individual item via crushtool.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil

05/21/2011

07:55 PM Revision e9754d88 (ceph): osdmaptool: fail --import-crush if crush max_devices > osdmap max_osd
Crush will spew non-deterministic badness if it walks off the end of
the osd_weight vector.
Signed-off-by: Sage Weil...
Sage Weil
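The kind of bounds check this commit describes might look like the sketch below. It is an illustration only, not the osdmaptool code: `check_crush_max` is a hypothetical helper, and the list stands in for the osd_weight vector, whose length plays the role of max_osd.

```python
def check_crush_max(crush_max_devices: int, osd_weight: list) -> None:
    """Reject a crush map that could index past the end of osd_weight."""
    max_osd = len(osd_weight)
    if crush_max_devices > max_osd:
        raise ValueError(
            f"crush max_devices {crush_max_devices} > osdmap max_osd {max_osd}"
        )


# Three OSDs and a crush map that refers to at most three devices: accepted.
check_crush_max(3, [1.0, 1.0, 1.0])
```

Failing early at import time is simpler to diagnose than mappings that differ from node to node.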
01:16 AM Revision ba7ef845 (ceph): config: delete after new
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:57 AM Revision 35ee7e64 (ceph): ceph_crypto: add assert_init
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
12:57 AM Revision 3a2acefe (ceph): common_init: don't init crypto until after fork
Get rid of the initialize-then-shutdown-crypto hack. We just initialize
crypto once, after it is safe to do so. There...
Colin Patrick McCabe
12:10 AM Revision 4cc83a68 (ceph): crush: fix signedness warnings
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil

05/20/2011

11:46 PM Revision 5baef8f6 (ceph): rgw_admin: able to create multiple keys/subusers
Yehuda Sadeh
11:45 PM Revision cc1737bd (ceph): crushtool: --remove-item name
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
11:45 PM Revision d287ade5 (ceph): crush: fix tree weight accessor, decompile
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
11:45 PM Revision 9a14402a (ceph): crush: fix tree bucket encoding
I wonder how long this has been broken!
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
10:44 PM Revision 127dcde1 (ceph): crushtool: default to hash 0 (rjenkins1)
Otherwise we get 255 which is undefined and get bad results!
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
10:15 PM Revision 2cf5048f (ceph): rgw: user info structure supports multiple subusers and keys
Yehuda Sadeh
10:15 PM Revision 27c0bce6 (ceph): mon: fix parsing of 'osd foo N ...' commands with multiple ids
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
10:15 PM Revision 726aebea (ceph): osd: rework peer map epoch caching
We try to keep track of which epochs our peers have so that we can be
semi-intelligent about which map incrementals w...
Sage Weil
10:15 PM Revision bc960ac1 (ceph): osd: show last_epoch_clean in PG::Info::History printer
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
10:15 PM Revision bac1021e (ceph): osd: only forget peer epochs if they are down AND no longer heartbeat p...
If we forget the peer epoch when we see them go down, we won't share the
map later in update_heartbeat_peers() to tel...
Sage Weil
10:15 PM Revision b5ebe6b5 (ceph): msgr: don't close close_on_empty until outgoing messages are acked
Otherwise, if we close the socket, we may lose in-flight data.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
10:15 PM Revision a51bf3e9 (ceph): osd: more heartbeat rework
A few things:
- track Connection* instead of entity_inst_t for hb peers
- we can only send maps over the cluster_me...
Sage Weil
10:15 PM Revision e3191b7d (ceph): osd: merge history when primary sends replica new pg info
This, among other things, lets us update last_epoch_started and
last_epoch_clean.
Signed-off-by: Sage Weil <sage.wei...
Sage Weil
10:15 PM Revision c22aca1f (ceph): osd: small cleanup
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
10:15 PM Revision 4a83de18 (ceph): osd: update last_epoch_clean in PG::Info::History::merge()
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
09:27 PM Revision 68021ce8 (ceph): dout: reopen log files on SIGHUP
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
09:23 PM Revision 277dc66f (ceph): dout: reopen log files on SIGHUP
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
07:39 PM Revision 960d2a36 (ceph): Add SignalSafeQueue
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
06:29 PM Revision 74691e7c (ceph): osd: clean up old _from target cleanup; fix one case; share map
Clean up the code to mirror the _to case.
Previously we would not mark down an old _from that is still a _to but wit...
Sage Weil
06:25 PM Revision 0f1be629 (ceph): osd: mark down old _to targets
If a peer remains a _to target but their address changes, we still want
to mark down the old connection.
Signed-off-...
Sage Weil
06:20 PM Revision 3811d8bf (ceph): osd: share map with old _to peers
Use new msgr hooks to do this cleanly.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
06:17 PM Revision f87e1dd5 (ceph): osd: clean up handle_osd_ping output
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
06:12 PM RADOS Bug #1106 (Resolved): crush/osd: inconsistent mapping values
I'm getting different results for the crush mapping on different nodes. md5sum of the on-disk osdmaps match up. the... Sage Weil
05:54 PM Revision 3a7931c7 (ceph): osd: ignore stale requests for heartbeats
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
05:43 PM Revision f9bea340 (ceph): osd: don't prioritize heartbeat requests
This could conceivably screw up ordering, and priority doesn't matter
anyway when this is the first message we send t...
Sage Weil
05:42 PM Revision 7a574d88 (ceph): osd: do not clobber explicitly requested heartbeat_to target addresss
Consider peer P.
- P goes down in, say, epoch 60, and back up in epoch 70
- P and requests a heartbeat, as_of 70
- W...
Sage Weil
04:29 PM Revision e1830dbd (ceph): osd: request proper log extent for missing
We can't blindly ask for everything since last_epoch_started because that
may mean we get some fragment of a backlog...
Sage Weil
03:48 PM Bug #1101 (Resolved): osd: osds don't immediately notice when they've been marked down
commit:a51bf3e9df027bb9ed58679666ee4207b4185961 Sage Weil
02:10 AM Bug #1101 (Resolved): osd: osds don't immediately notice when they've been marked down
I suspect this is related to the messenger changes (mark_down_on_empty etc). It takes ~20 seconds or more before ... Sage Weil
03:44 PM Revision ff031ce8 (ceph): osd: fix log bounds check
We weren't accounting for the case where we have
(foo,foo]+backlog
i.e., everything is backlog, and rbegin().versi...
Sage Weil
03:35 PM Revision 1dba8dd6 (ceph): osd: osd# is in log entry header/prefix
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
03:33 PM Revision d75f6237 (ceph): osd: log broken pg state to monitor on startup, activate
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
03:09 PM Revision b7b8127e (ceph): osd: fix proc_replica_log when peer log is empty
If the peer log is empty, and we break out of the loop on the first pass,
then clearly last_update has not been adjus...
Sage Weil
02:29 PM Bug #1102 (Resolved): SIGHUP log file reopen is broken
implemented by commit:277dc66f645f83552789cc6b314f59bdf75ba22d Colin McCabe
08:04 AM Bug #1102: SIGHUP log file reopen is broken
on stable branch (v0.28+) Sage Weil
08:04 AM Bug #1102 (Resolved): SIGHUP log file reopen is broken
Sage Weil
02:25 PM Revision f4001108 (ceph): osd: encode keyring as plaintext after --mkkey
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
02:25 PM Revision 93709f89 (ceph): keyring: make encode_plaintext method
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
12:15 PM Bug #1103 (Resolved): osd: log bound mismatch
commit:e1830dbd09784b7bddf2ab0657b50e6f293cef13 Sage Weil
08:43 AM Bug #1103 (Resolved): osd: log bound mismatch
A whole bunch of nodes have inconsistent PG::Info and log bounds. I don't have logs, so I'm not sure how it happened... Sage Weil
12:14 PM Bug #1100 (Resolved): osd: marking peers down
Sage Weil
01:57 AM Bug #1100 (Resolved): osd: marking peers down
I'm reliably seeing peers mark each other down when they shouldn't on benjamin. There are ~21 osds across 3 nodes, a... Sage Weil
10:35 AM Feature #1105 (Resolved): have multiple access keys per user in rgw
Yehuda Sadeh
09:50 AM CephFS Bug #1104: Segmentation fault when deleting a folder
Logfile from the first mds, as asked:
18:25 < sage> great. add
18:25 < sage> debug mds = 20
18:25 < sage> debu...
Bernard Grymonpon
09:47 AM CephFS Bug #1104 (Resolved): Segmentation fault when deleting a folder
got this after removing a just created folder:
2011-05-20 18:19:09.679553 7f8254c89700 mds0.18 handle_mds_map i am...
Bernard Grymonpon
09:05 AM Bug #1098: mds never coming "up:active" awaits in "up:creating"
Log file is too big. It cannot be attached (> 5 MB). 192.168.2.101:6800/8459 --> 192.168.2.107 means bz1 ( MDS no... shyamali mukherjee
07:45 AM Bug #1099: osd: handle recovery of lost objects
My hacky workaround was... Sage Weil
01:16 AM Bug #1099 (Closed): osd: handle recovery of lost objects
... Sage Weil
07:41 AM Revision 6995fd51 (ceph): Merge branch 'wip_choose_acting' into stable
Sage Weil
07:27 AM Revision bdc371e5 (ceph): osd: take remote log when it is clearly superior
I'm hitting a case where the primary is compensating for a replica's
last_complete < log.tail by sending a log+backlo...
Sage Weil
07:14 AM Revision 4c97cb5f (ceph): osd: fix compensation for bad last_complete
If the peer has a last_complete below their tail, we can get by with our
log (without backlog) if our tail is _before...
Sage Weil
06:48 AM Revision 332565f1 (ceph): osd: remove some build_prior stringstream cruft
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
06:46 AM Revision 45e8627c (ceph): osd: remove useless debug print
We dump this (and more) at the end of the PgPriorSet constructor.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
06:40 AM Revision a2cb690d (ceph): osd: include past acting osds if they were up
This fixes a bug where we were excluding up (but not acting) nodes from
past intervals, which in turn was triggering ...
Sage Weil
06:38 AM Revision d4b44f9e (ceph): osd: do not exclude me during build_prior
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
06:25 AM Revision f7e6b1c1 (ceph): osd: show final build_prior result
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
03:46 AM Revision 6f8708ba (ceph): mon: log mkfs as INFO with fs
The [ERR] log level is misleading.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
01:02 AM Revision dfe52d9e (ceph): OSD, PG: ignore peering messages from before the last peering restart
Check them before entering the state machine so we can
safely enter the Crashed state on unexpected messages
from the...
Josh Durgin
01:02 AM Revision 628665bc (ceph): OSD: decrement message refcount before returning
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
12:46 AM Linux kernel client Bug #1096: LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
tail -n 30k mds.alpha.log changping Wu
12:20 AM Revision 4404116b (ceph): mds: kick linklock on revoke_stale_caps
Also use the eval() method and issue caps instead of calling the individual
eval methods.
Signed-off-by: Sage Weil <...
Sage Weil

05/19/2011

11:15 PM Revision cef8eb9c (ceph): debian: no shlibs:Depends for obsync either
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
11:15 PM Revision 4e2c1f45 (ceph): debian: no shlibs:Depends for -dev packages
So says dpkg-gencontrol, at least:
warning: dpkg-gencontrol: Depends field of package librados-dev: unknown substitu...
Sage Weil
11:13 PM Revision 94433898 (ceph): librbd: don't need to link against crypto libs
All that is done by librados.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
10:59 PM Revision a71981c0 (ceph): PG: add_event, add_next_event: ignore prior_version on backlog events
We would not have the previous version if we are merging backlog events.
Signed-off-by: Samuel Just <samuel.just@dre...
Samuel Just
10:24 PM Revision 3471d41b (ceph): add ceph_readdir() to libceph
Signed-off-by: Brian Chrisman <brchrisman@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Brian Chrisman
10:24 PM Revision 922f7cc3 (ceph): expanding testceph to test open/readdir/telldir
Signed-off-by: Brian Chrisman <brchrisman@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Brian Chrisman
10:10 PM Revision f8f6bed6 (ceph): Add run.run option wait, this will make handling stdin easier soon.
Tommi Virtanen
10:04 PM Revision 3f43c78b (ceph): client: _flush should no-op if nothing to flush
If there are no FILE_BUFFER cap_refs, then we can bail out early.
Otherwise we will end up dropping refs we don't hav...
Sage Weil
10:04 PM Revision 67533e14 (ceph): client: be more careful with FILE_BUFFER cap refs
We should either hold a ref or not; whether we release one can't depend on
whether one is held because we can't assum...
Sage Weil
10:04 PM Revision 510f2dd7 (ceph): client: assert(in) on _flush
We should never arrive in _flush() and not have a reference to the inode
in question, because the presence of dirty b...
Sage Weil
10:04 PM Revision 838067d0 (ceph): client: clean up _flush callers
Have _flush return true if there are no dirty buffers. Clean up some
redundant conditionals in the callers
Signed-o...
Sage Weil
10:04 PM Revision 3df86c38 (ceph): client: hold FILE_BUFFER ref while waiting for dirty throttle
We may block in the write path because we've reached our dirty data limit.
Hold a reference to the FILE_BUFFER cap du...
Sage Weil
10:04 PM Revision 8549fc9a (ceph): Merge remote branch 'origin/stable'
Sage Weil
09:49 PM Revision 8f7d6c7e (ceph): librados: add python bindings for getxattrs
Add python bindings for getxattrs. Test getxattr, getxattrs, and
setxattr.
Signed-off-by: Colin McCabe <colin.mccabe...
Colin Patrick McCabe
09:33 PM Revision fe298f64 (ceph): OSD: send a log in response to a log query when the pg dne
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
09:33 PM Revision bc2c31e0 (ceph): PG: choose_log_location: prefer OSDs with a backlog
Without preferring an OSD with a backlog, PGs would get stuck in the
active state when acting != up and the backlog w...
Josh Durgin
08:47 PM Revision 93c2e17c (ceph): Return a structured result from run.run, to make capturing stdout/stder...
Tommi Virtanen
08:27 PM Revision 9a5c959b (ceph): Add integration tests for signals and connection loss.
Tommi Virtanen
08:05 PM Revision df84f4e0 (ceph): Check for errors on remote commands.
Tommi Virtanen
07:46 PM Revision 57f423ba (ceph): librados: add rados_getxattrs API for C bindings
Support getxattrs in the Rados C API.
Also add a test of getxattrs to testrados.c
Signed-off-by: Colin McCabe <colin...
Colin Patrick McCabe
07:24 PM Revision bcbcf302 (ceph): ReplicatedPG: wait_for_missing_object in _rollback_to
Previously, we failed if the relevant clone had not yet been recovered.
Signed-off-by: Samuel Just <samuel.just@drea...
Samuel Just
07:17 PM Linux kernel client Bug #1096: LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
Hi ,
The whole mds log is too large (about 5.4 GB), so I can't attach it here.
this web limit the a...
changping Wu
10:08 AM Linux kernel client Bug #1096: LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
Do you have a larger piece of the mds log you can attach? (Perhaps the whole thing?) Sage Weil
07:17 PM Revision 87d7192c (ceph): Add setup.py, install in devel mode into virtualenv.
Tommi Virtanen
07:16 PM Revision 492fa488 (ceph): Don't close file after copying stdout/stderr to it.
If a caller uses StringIO to capture the output, they
cannot call .getvalue() after the close.
This also lets you co...
Tommi Virtanen
06:49 PM Revision 40430595 (ceph): testrados: retab with C-style tabs
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
06:48 PM Revision 6a580bf2 (ceph): testrados: more getxattr / setxattr tests
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
06:33 PM Revision 1dd17431 (ceph): Remove libcrush from packaging
This removes libcrush from the packaging system, now that it's been
merged into libcommon.
Signed-off-by: Colin McCa...
Colin Patrick McCabe
05:20 PM Revision 37df5b1c (ceph): Refactor remote running to support more use cases.
Tommi Virtanen
05:19 PM Revision 5bfcec26 (ceph): Add debug logging to monkeypatching.
Tommi Virtanen
05:18 PM Revision 1ed70d78 (ceph): Silence paramiko transport logging.
Tommi Virtanen
05:17 PM Revision b397eb5d (ceph): Silence a Paramiko crypto deprecation.
Tommi Virtanen
05:17 PM Revision 85a28a23 (ceph): Make monkeypatching respect order.
Tommi Virtanen
05:14 PM Revision f16903d7 (ceph): client: do not retake lock in sync_write_commit
We already hold the lock from a few frames up the stack (ms_dispatch).
Reported-by: Simon Tian <aixt2006@gmail.com>
...
Sage Weil
05:13 PM Revision ce7f78d0 (ceph): ceph.spec.in: fix obsync description
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
04:41 PM Revision 4d39f1be (ceph): journaler: ENOENT is okay on trim
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:29 PM Revision ecb7c961 (ceph): mkcephfs: pick rdir based on whether current daemon is local or not
We need to pick $rdir as local or remote inside the for name loop.
Fixes: #1094
Signed-off-by: Sage Weil <sage@newdr...
Sage Weil
03:49 PM Bug #1098: mds never coming "up:active" awaits in "up:creating"
2011-05-18 16:06:18.643599 41ece940 -- 192.168.2.101:6800/8459 --> 192.168.2.107:6812/18794 -- osd_op(mds0.1:10 604.0... Sage Weil
11:30 AM Bug #1098 (Closed): mds never coming "up:active" awaits in "up:creating"
After upgrading to ceph 0.27 and the latest ceph-client-standalone tree, I am unable to mount the FS. Initial debugging in kern... shyamali mukherjee
03:04 PM CephFS Bug #1087 (Resolved): userspace Client readdir_r failing
Sage Weil
02:45 PM CephFS Bug #1097 (Resolved): client: failed assert in Client::sync_write_commit
commit:f16903d724150ce7ec6886972a1726509bdcb828 and commit:67533e14439e9b23ee4be5d62277bba6cd99895c Sage Weil
09:39 AM Bug #1094 (Resolved): "mkcephfs -c /etc/ceph.conf --allhosts --mkbtrfs" finds /tmp/mkcephfs.*...
Thanks for testing! Sage Weil
09:35 AM Bug #1094: "mkcephfs -c /etc/ceph.conf --allhosts --mkbtrfs" finds /tmp/mkcephfs.**** dire...
Sage,
Thanks! There is one more thing I had to change. But I see that it is fixed in your latest code.
maxosd=`$...
shyamali mukherjee
12:04 AM Revision 5d161aa0 (ceph): PG: make choose_acting a bit smarter
This change allows old strays that don't need backlogs
to stay acting until current members of the up set are caught ...
Samuel Just
12:04 AM Revision 8c6ce348 (ceph): osd: clean up choose_acting output
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
12:04 AM Revision 9b979797 (ceph): PG: prefer log with longer tail
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
12:04 AM Revision 0aeb8efb (ceph): PG: merge_log- fix extend log case
Previously, when extending an empty log with a log with the same
last_update, we would fail an assert since we would ...
Samuel Just
12:04 AM Revision dbb2c383 (ceph): PG: _remove_pg, reset info.last_update and info.log_tail on log zero
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
12:04 AM Revision 51daa435 (ceph): PG: choose_acting: we need best_info to have a backlog, not the primary
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
12:04 AM Revision 92706af3 (ceph): PG: reset pg_trim_to in clear_primary_state
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
12:04 AM Revision 524ab3a6 (ceph): PG: GetLog: don't fail if we get an outdated log
If we request a log from one osd, and then another member of our prior
set comes up with a later last_update, we shou...
Josh Durgin
12:04 AM Revision cad3dfae (ceph): PG: choose acting set and newest_update_osd based on a map of all osds
newest_update osd should be stable when the primary changes, to
prevent cycles of acting set choices. For the same re...
Josh Durgin
12:04 AM Revision 2452d415 (ceph): PG: include ourselves in the prior set
All acting OSDs should be in the prior set, since any of them may have
the newest update.
Signed-off-by: Josh Durgin...
Josh Durgin
12:04 AM Revision 2a0f0cd1 (ceph): PG: remove unused argument to adjust_need_up_thru
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin

05/18/2011

10:37 PM Revision d0752e81 (ceph): Merge branch 'move_crush_to_libcommon'
Colin Patrick McCabe
10:37 PM Revision 14a3f262 (ceph): Move crush into libcommon
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
09:46 PM Revision 0535e4df (ceph): Initial import
Tommi Virtanen
09:14 PM CephFS Bug #1087: userspace Client readdir_r failing
Yes.
I have added 'Client::readdir()' and ceph_readdir(), which call Client::readdir_r etc underneath.
This is work...
Brian Chrisman
04:36 PM CephFS Bug #1087: userspace Client readdir_r failing
Were you able to sort out the callback return value stuff? Sage Weil
08:39 PM Linux kernel client Bug #1096: LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
Hi ,
With both a single mds and two mds, the fsstress test hangs.
ceph.conf and single mds test log mds.alpha.log atta...
changping Wu
05:04 PM Linux kernel client Bug #1096: LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
Was this with a single mds? Fsstress is known to turn up clustered mds bugs Sage Weil
02:51 AM Linux kernel client Bug #1096: LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
The following steps make it easy to reproduce:
$./fsstress -d /mnt/ceph/mdstest -f write=freq -l 100 -n 10000 -p ...
changping Wu
02:46 AM Linux kernel client Bug #1096 (Resolved): LTP fsstress test always hang ,ceph 0.27.1+linux-2.6.38.6
Hi ,
I ran the fsstress test on ceph 0.27.1 + linux-2.6.38.6 + ubuntu 10.10:
$modprobe libceph
$modprobe ceph
$mount...
changping Wu
07:18 PM Revision d4588bae (ceph): Merge branch 'stable'
Sage Weil
06:48 PM Revision 2fc13de1 (ceph): Move crush into libcommon
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
06:45 PM Revision 0d79f1de (ceph): man: update cosd man page
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:17 PM Revision 071881d7 (ceph): v0.28
Sage Weil
05:15 PM Revision b060f5c8 (ceph): Revert "Makefile.am: link some utils with libcrush"
This reverts commit c26649861e4c154b1bedf6801342d0a8461a2d0a.
I'm not having any problems linking. I suspect this w...
Sage Weil
05:09 PM Revision f1c82aae (ceph): logclient: get rid of send_log; simplify monitor special casing
Change the SYNC flag to MON and send the Mlog synchronously in the do_log
call. This eliminates the send_log vestiga...
Sage Weil
05:07 PM Revision baba0a7a (ceph): msgr: fix signedness in alloc_aligned_buffer
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:01 PM Revision bd1995c2 (ceph): logclient: log synchronously to syslog
This is simpler. And there is no reason to delay logging to syslog.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
04:58 PM Revision 4237da88 (ceph): logclient: send entries once per mon session
We have a lossless session with the monitor! Only send log entries once.
Otherwise, if the mon is down or something,...
Sage Weil
04:27 PM Revision 38ba4762 (ceph): crush: fix clitest now that leading spaces are stripped
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:22 PM Revision 883a980a (ceph): Makefile: don't clean up some files
Yehuda Sadeh
02:36 PM Bug #1094: "mkcephfs -c /etc/ceph.conf --allhosts --mkbtrfs" finds /tmp/mkcephfs.**** dire...
I see the problem. Can you please test commit:0efd51dede578e2cc8c68e1a55d1468a06eef83e (the wip-mkcephfsb branch) and... Sage Weil
01:43 PM CephFS Bug #1097: client: failed assert in Client::sync_write_commit
Sage Weil
08:56 AM CephFS Bug #1097 (Resolved): client: failed assert in Client::sync_write_commit
2011/5/17 Simon Tian <aixt2006@gmail.com>:
> Hi folks,
>
> When I write and read a file in client A, open wi...
Sage Weil
12:25 PM Revision c2664986 (ceph): Makefile.am: link some utils with libcrush
Yehuda Sadeh
11:53 AM Revision e3841dc6 (ceph): Makefile: don't clean up some files
Yehuda Sadeh
11:25 AM Linux kernel client Bug #1071 (Resolved): rbd: mkfs.ext4 doesn't complete (but mke2fs -j does)
Sage Weil
10:20 AM Bug #943 (Resolved): 3-mon cluster won't start
Ok, this should be fixed by commit:4237da886e61c88935d7fb856b49a2d9676cbf9d. Subsequent patches have some further cl... Sage Weil
05:04 AM Revision 2b729875 (ceph): Merge remote branch 'origin/stable' into next
Sage Weil
05:00 AM Revision 2f9ff022 (ceph): page: redefine PAGE_* macros
Saw this on sid i386:
msg/SimpleMessenger.cc: In function 'void alloc_aligned_buffer(ceph::bufferlist&, int, int)':...
Sage Weil
05:00 AM Revision 09810cb2 (ceph): page: fix #ifdef guard
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:29 AM Revision ee7fa813 (ceph): mds: do not shift to EXCL or MIX while rdlocked
There was an old change in file_eval() that was allowing us to switch from
SYNC to MIX or EXCL while there were rdloc...
Sage Weil
04:08 AM Revision 9be71938 (ceph): vstart: simplify mds keyring add
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
04:08 AM Revision 812ce6e9 (ceph): Merge branch 'next'
Sage Weil
01:49 AM Revision 8ad346a3 (ceph): mon: 'auth caps <name> [svc value [svc2 value2 [...]]]'
Avoid having to futz with cauthtool if possible.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
01:49 AM Revision bfca7ac5 (ceph): osd: add --mkkey mkfs option
Optionally generate a new key as part of the mkfs step. This makes life
a bit easier for the admin.
Signed-off-by: ...
Sage Weil
12:45 AM Revision 660e6d52 (ceph): Merge remote branch 'origin/next'
Josh Durgin
12:30 AM Revision a22511db (ceph): PG: update same_acting_since when acting or up changes
This is a hack since we currently use same_up_since to denote the beginning of an interval.
We should probably change...
Josh Durgin
12:27 AM Revision 73b99163 (ceph): msgr: avoid clearing connection_state on pipe replacement
read_message and write_message both dereference connection-state, so avoid
clearing it when replacing a pipe.
read_m...
Sage Weil
12:27 AM Revision 45494b4d (ceph): crushtool: strip leading spaces from identifiers
No idea where these are coming from! Weird.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
12:27 AM Revision 50be4c46 (ceph): crush: allow - and _ in crushmap type/item names
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
12:05 AM Revision ec63ec3e (ceph): mon: 'osd tree [epoch]'
Dump crush map + osd state, displayed as a tree.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:05 AM Revision 108b2a69 (ceph): osdmaptool: print crush tree + osd state
Output osd state combined with crush tree placement. Note osds in tree
that do not exist and list osds that exist th...
Sage Weil

05/17/2011

11:29 PM Revision 0e3f0923 (ceph): librgw: be quiet by default
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
11:12 PM Bug #1095 (Closed): run "rados bench 10 seq -p data" print "error during benchmark: -5"
Hi ,
ceph 0.27 ,
running "rados bench 10 seq -p data" sometimes prints "error during benchmark: -5".
================...
changping Wu
10:04 PM Revision b5726e11 (ceph): librgw: make API reentrant
By passing in the configuration, we can use multiple librgw instances in
parallel-- or will be able to, once g_conf i...
Colin Patrick McCabe
10:04 PM Revision d4c4fe81 (ceph): pybind/rgw: fix python bindings for librgw
Use string_at to convert between librgw buffers and Python strings.
Signed-off-by: Colin McCabe <colin.mccabe@dreamh...
Colin Patrick McCabe
09:59 PM CephFS Bug #791 (Resolved): ls -al waits for writes to complete
commit:ee7fa813ef29890557f0b03bd3950d422484215d Sage Weil
04:48 PM CephFS Bug #791: ls -al waits for writes to complete
I reproduced some long stalls (~20 seconds) due to the loner flip-flopping. Need to analyze the logs (currently on v... Sage Weil
09:53 PM Messengers Bug #1093 (Resolved): msgr: race conditon with replaced pipe's connection_state
commit:73b99163aba7db77aa122eab99780c3d66f0aa91 Sage Weil
09:03 AM Messengers Bug #1093: msgr: race conditon with replaced pipe's connection_state
I was unclear: only one of the OSDs died due to this race. Running 10 on one disk just made this kind of race more li... Josh Durgin
09:26 PM Revision 28e175d6 (ceph): debian: obsync
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:26 PM Revision 34ffe738 (ceph): ceph.spec.in: add obsync
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:26 PM Revision 6d56c20f (ceph): obsync: no .py
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:19 PM Revision bbb1747a (ceph): PG: Replicas send Notifies in response to queries
Replicas only send Infos during activate.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just
09:12 PM Revision 4320cb15 (ceph): Merge branch 'wip-crush'
Sage Weil
09:09 PM Revision 9c8f30f1 (ceph): PG: choose_log_location, fix error when scanning up set
++up.begin() does not skip the primary. Primary might not be up[0].
Signed-off-by: Samuel Just <samuel.just@dreamhos...
Samuel Just
08:58 PM Revision d90458a9 (ceph): osdmap: set type 0 to 'osd'
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:58 PM Revision f6dc19e3 (ceph): crushtool: fix error handling for adding devices
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:35 PM Revision c73e37b0 (ceph): crushtool: fix unittest map
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:33 PM Revision c9a257e3 (ceph): crushtool: fix usage
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:30 PM Revision d40010bd (ceph): crush: add add_item and reweight functions
Insert a device at a particular point in the hierarchy, and adjust weights
as appropriate.
Signed-off-by: Sage Weil ...
Sage Weil
08:30 PM Revision e46804bb (ceph): osdmap: use straw buckets everywhere by default
We were using uniform for the leaf buckets. Use straw instead.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
08:03 PM Revision 042139d1 (ceph): crushtool: include cumulative bucket weight in decompile
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:03 PM Revision 9a2def6e (ceph): crush: fix up constness some
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:04 PM Revision 36fb0846 (ceph): Add Python bindings for librgw
Add some Python bindings for librgw.
Also add some more verbose error logging to librgw.
Signed-off-by: Colin McCabe...
Colin Patrick McCabe
04:58 PM Revision b13bbb06 (ceph): PG: PG can receive a log in WaitActingChange requested in GetLog
Discard logs requested during GetLog once we are in WaitActingChange.
Signed-off-by: Samuel Just <samuel.just@dreamh...
Samuel Just
03:44 PM Bug #943: 3-mon cluster won't start
This tarball contains the huge logm file I had to drop to recover the cluster in the logm.dropped directory, as well ... Alexandre Oliva
03:04 PM Bug #943 (In Progress): 3-mon cluster won't start
Can you attach a tarball of one of the mon directories with the big files? It's possible this is a side effect of a ... Sage Weil
02:27 PM Bug #943: 3-mon cluster won't start
It happened again, even on the Gbps network. After two mons failed, the third kept accumulating messages in logm fo... Alexandre Oliva
02:55 PM Bug #1094: "mkcephfs -c /etc/ceph.conf --allhosts --mkbtrfs" finds /tmp/mkcephfs.**** dire...
This happens because check_host fails to identify this host as "localhost".
Here is what happened:
On a dif...
shyamali mukherjee
12:36 PM Bug #1094: "mkcephfs -c /etc/ceph.conf --allhosts --mkbtrfs" finds /tmp/mkcephfs.**** dire...
Sage Weil
12:35 PM Bug #1094: "mkcephfs -c /etc/ceph.conf --allhosts --mkbtrfs" finds /tmp/mkcephfs.**** dire...
can you run mkcephfs with -x (bash -x mkcephfs <regular args>) so we can tell exactly what it's doing? Sage Weil
11:24 AM Bug #1094 (Resolved): "mkcephfs -c /etc/ceph.conf --allhosts --mkbtrfs" finds /tmp/mkcephfs.*...
I have used ceph 0.23 for quite some time. But now, after a fresh install and build of ceph 0.27.1,
I see that during ...
shyamali mukherjee
02:52 PM Feature #1089 (Resolved): obsync: deb/rpm package
Sage Weil
01:52 PM RADOS Feature #433 (Resolved): improve osd reweighting
commit:4320cb15d4840c88b6e5c91c9923fb82749f78f4 Sage Weil
01:39 PM Revision 8ed372c9 (ceph): rgw: ahrm.. now really fix logging
Yehuda Sadeh
01:18 PM Revision 0b6cb47d (ceph): rgw: fix logging
Yehuda Sadeh
11:24 AM Revision b7b47a02 (ceph): rgw: fix typo
Yehuda Sadeh
11:15 AM Revision 8836b844 (ceph): rgw: don't log operations on unexisting bucket
Yehuda Sadeh
10:57 AM Feature #1091: librados: support pgls filter
> I assume that we can define a base class for the pgls_filter iterators and
> specialize it according to the filte...
Colin McCabe
01:29 AM Feature #1091: librados: support pgls filter
The original librados list_filter() had the following:
void Rados::list_filter(Rados::ListCtx& ctx, bufferlist& fi...
Yehuda Sadeh
09:27 AM Bug #1088 (Closed): osd: assert(is_up) failed when sending queries
Samuel Just
09:26 AM Bug #1079 (Closed): pgs stuck peering or degraded
Samuel Just
05:00 AM rgw Feature #1027: rgw log operations on non-existent bucket
Starting at commit:8836b8447a3a70fc6dd647d070d763f283084ee7 we don't log operations on nonexistent buckets. Still need ... Yehuda Sadeh
12:12 AM Revision e0439626 (ceph): obsync: preserve user-defined metadata
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe

05/16/2011

11:51 PM Revision 24233f21 (ceph): obsync: filestore: separate xattr metadata nspace
User-defined s3 metadata lives in a separate namespace from regular S3
metadata like Content-Type, etc.
Signed-off-b...
Colin Patrick McCabe
11:26 PM Revision 265ab992 (ceph): PG: Don't use exit to call proc_master_log
exit is also invoked when transitioning to Reset...
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just
10:56 PM Revision f863862c (ceph): obysnc: preserve Content-Type
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
09:47 PM Revision a4bd854f (ceph): client: update ctime for auth, xattr
This mirrors the kclient fix in d8672d64. The client can have a newer
ctime due to auth or xattr excl caps. This fi...
Sage Weil
09:25 PM Revision 8e6b53fe (ceph): obsync: FileStore: test storing ACLs in xattrs
Update unit tests now that we're storing ACLs in xattrs. Fix a bug.
Signed-off-by: Colin McCabe <colin.mccabe@dreamh...
Colin Patrick McCabe
09:12 PM Messengers Bug #1093: msgr: race conditon with replaced pipe's connection_state
Wow, that's unexpected. If you look at the source you'll notice that the connection_state is referred to in Pipe::wri... Greg Farnum
05:38 PM Messengers Bug #1093 (Resolved): msgr: race conditon with replaced pipe's connection_state
When a non-lossy connection is replaced, the messenger sets its connection_state to NULL while holding the pipe_lock.... Josh Durgin
08:42 PM Revision 3865ca56 (ceph): mon: health WARN if monitor quorum is incomplete
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:02 PM Feature #1091: librados: support pgls filter
The Objecter is client-side. The filtering is done by the OSD, so as to preserve network bandwidth and reduce client ... Greg Farnum
04:54 PM Feature #1091: librados: support pgls filter
Yeah. If the filtering were done on the librados side, there would be little point to the API.
However, even thoug...
Colin McCabe
10:30 AM Feature #1091: librados: support pgls filter
Note that the filtering is being done on the osd side. Yehuda Sadeh
10:22 AM Feature #1091 (Duplicate): librados: support pgls filter
pgls_filter support was removed in the librados API redesign while we were converting everything to iterators. I stil... Colin McCabe
05:54 PM Revision a82e062e (ceph): obsync: FileStore: store ACLs in xattrs
Store the ACL XML in extended attributes rather than in side files.
Signed-off-by: Colin McCabe <colin.mccabe@dreamh...
Colin Patrick McCabe
05:42 PM Revision ac6afe06 (ceph): obsync: FileStore: test for xattr support
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
05:17 PM Bug #1051: obsync: create a librgw to parse binary ACLs generated by RGW
librgw exists and is passing unit tests. The thing to do now is to get the python bindings to librgw working, and the... Colin McCabe
05:16 PM Bug #1040 (Resolved): obsync: preserve content-type, misc metadata
implemented preserving user-defined metadata by commit:24233f210e1391454ae02140b65da41a30f71209 and commit:e043962601... Colin McCabe
04:00 PM Bug #1040: obsync: preserve content-type, misc metadata
We now preserve content-type by commit:f863862c67bd043ba8a7e61b4d776a0bd7ae924c
I'm working on preserving the othe...
Colin McCabe
02:48 PM Bug #906: clustered mds: lchown not setting uid/gid
An audit of the uclient vs kclient code turned up one difference, but it was a bug fix in kclient that was missing from ... Sage Weil
01:24 PM Revision 1db29a26 (ceph): rados: don't force order on params
Yehuda Sadeh
11:12 AM Feature #1092 (Rejected): mon: checkpointing
ability to checkpoint monitor state to facilitate rollback. To be used in combination with #1080. Sage Weil
11:11 AM Feature #1080 (Resolved): osd: cluster snapshot
Going to call mon checkpointing out of scope for now. We can do that later as needed. Sage Weil
11:10 AM Bug #1085 (Won't Fix): bug in cclass
cclass will be gone in v0.28, which will be out in the next day or two! Sage Weil
06:26 AM Bug #1090 (Resolved): broken param parsing in the rados tool
Fixed, commit:1db29a261016e64f2fba65d3b911991fa29f3d40. Yehuda Sadeh
06:14 AM Bug #1090 (Resolved): broken param parsing in the rados tool
'rados ls -p data' does not return what 'rados -p data ls' returns. Yehuda Sadeh
03:56 AM Revision e93c0fc0 (ceph): fix segfault introduced by commit de640d85fa3e0e5e5a31704eab5a8714a1ffe867
That commit introduces the line 'cur_con->put()' which has the possibility
of being called while cur_con is not initi...
root

05/15/2011

09:07 AM Feature #1089 (Resolved): obsync: deb/rpm package
Sage Weil

05/14/2011

09:07 PM Revision cd75a9d2 (ceph): osd: lazily close connections to down peers
If we hear from a peer that should be dead, tell them, but mark our
connection so that it will close after that messa...
Sage Weil
09:07 PM Revision a5b5aea4 (ceph): msgr: mark_down_on_empty and mark_disposable
Mark a connection to close when messages are sent, and to close on any
error. We can use this to tell people who sho...
Sage Weil
08:46 PM Revision 5ecc42b5 (ceph): PG: Remove downed osds from peer_missing and peer_info
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
08:42 PM Revision ba753630 (ceph): PG: Only pull the master log from a member of the prior_set
There must be a member of the prior_set such that no other
osd has a more recent last_update. This way, prior_set_af...
Samuel Just
12:26 AM Revision 6af0379e (ceph): rgw: Move rgw_log_level to md_config_t
Need to do this to get librgw to be usable as a standalone library
without unresolved symbols. Also, this makes it co...
Colin Patrick McCabe
12:16 AM Revision 56cab8ca (ceph): Makefile.am: add SimpleMessenger.cc to libcommon
libcommon depends on this file, and there's no other library that it
could go in. It is certainly silly to manually i...
Colin Patrick McCabe
12:10 AM Revision 924c000b (ceph): librgw: only include rgw_acl.cc and librgw.cc
Rather than putting all of RGW into librgw, only put rgw_acl.cc. Have
RGW use librgw instead of re-including the same...
Colin Patrick McCabe

05/13/2011

11:39 PM Revision 298e5c72 (ceph): rgw_acl: move constructors, destructors to .cc
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
11:28 PM Revision a17db024 (ceph): MDS: don't journal slave ops if we only have caps.
Previously we wanted to journal if we had caps on something. Now
that we're being strict about only journaling stuff ...
Greg Farnum
11:28 PM Revision b8ddecce (ceph): MDS: do journal on rename if we're auth for the inode.
We missed this case: we can be auth for the inode being moved without
being auth for the srcdn (first case) or owning...
Greg Farnum
11:28 PM Revision e8504c0b (ceph): uclient: do not accept max_size changes unless they're from auth mds.
Unlike most of the cap options, max_size is an inode member. This meant
that if we got a shared cap grant from a repl...
Greg Farnum
11:08 PM Revision 9847eb8b (ceph): rgw: put XML-to-bin translation into a librgw
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
11:08 PM Revision 883d1807 (ceph): librgw: small error handling fix
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
11:08 PM Revision 1c98da66 (ceph): librgw: use dout for logging
Use dout for logging so that the librgw library user can turn off or
redirect the logs if necessary.
Signed-off-by: ...
Colin Patrick McCabe
11:08 PM Revision 0dea92f6 (ceph): boto_tool.py: use s3-tests config file
boto_tool now grabs the configuration variables it needs from the
s3-tests config file, similar to s3-tests and test-...
Colin Patrick McCabe
11:08 PM Revision 50e41fbe (ceph): boto_tool.py: fix old-style argument-passing
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
11:08 PM Revision 6f4f702b (ceph): boto_tool.py: add --rmobjects, --rm_rf
Add some options to help destroy buckets.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Colin Patrick McCabe
11:08 PM Revision 12deaaa4 (ceph): obsync: add DST_CONSISTENCY
The DST_CONSISTENCY variable allows us to specify that the destination
is expected to use read-after-write consistenc...
Colin Patrick McCabe
11:08 PM Revision bf81df27 (ceph): obsync: fix eventual consistency handler
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
09:19 PM Revision 3d4971b3 (ceph): Merge remote branch 'origin/osd_snap' into stable
Sage Weil
09:07 PM Revision 6e0e5532 (ceph): PG: search_for_missing takes the other osd's missing set
Previously, search_for_missing was erroneously passed the
primary's missing in a few places.
Signed-off-by: Samuel J...
Samuel Just
08:56 PM Revision e0d83fe7 (ceph): PG: search_for_missing takes the other osd's missing set
Previously, search_for_missing was erroneously passed the
primary's missing in a few places.
Signed-off-by: Samuel J...
Samuel Just
08:13 PM Revision 89a821c6 (ceph): radosgw_admin: fix clitest
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
04:59 PM Revision 8161122b (ceph): fix null deref when callback invoked en route from readdir_r rather tha...
Signed-off-by: Brian Chrisman <brchrisman@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Brian Chrisman
04:59 PM Revision 72ca96e1 (ceph): add basic test case for readdir_r
Signed-off-by: Brian Chrisman <brchrisman@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Brian Chrisman
04:31 PM Bug #1088 (Closed): osd: assert(is_up) failed when sending queries
This happened when I was stress testing the peering code with 10 osds running off one disk, streaming writes, and mar... Josh Durgin
01:51 PM Linux kernel client Bug #1086 (Resolved): rbd: iozone failure
Sage Weil
01:51 PM Linux kernel client Bug #1086: rbd: iozone failure
fixed by commit:1fec70932d867416ffe620dd17005f168cc84eb5 Sage Weil
09:10 AM Linux kernel client Bug #1086: rbd: iozone failure
i just noticed this comment didn't post yesterday, no wonder yehuda didn't know what i was talking about :)
here:
...
Sage Weil
06:29 AM Linux kernel client Bug #1086: rbd: iozone failure
The problem is that our use of blk_end_request is wrong, as it assumes ordering on the requests completion. In most r... Yehuda Sadeh
10:23 AM CephFS Bug #1087: userspace Client readdir_r failing
Oh, I got it.
I don't really remember how the readdir works at this point, but if you follow the calls for libceph's...
Greg Farnum
10:16 AM CephFS Bug #1087: userspace Client readdir_r failing
Just thought about this some more... what I need to reconcile is any differences between ceph_ll_add_dirent and _read... Brian Chrisman
10:03 AM CephFS Bug #1087: userspace Client readdir_r failing
Yeah.. sorry about the context... this is in libceph testing.
ceph_readdir_r was already implemented when I looked a...
Brian Chrisman
09:33 AM CephFS Bug #1087: userspace Client readdir_r failing
Well, looking at this real quick I see:
29) lookup on readdir_r_test (succeeds, 0)
30) lookup on readdir_r_test/opene...
Greg Farnum
07:24 AM CephFS Bug #1087 (Resolved): userspace Client readdir_r failing
I chased this down a bit of a ways but there's a lot to look through.
This log is output from testceph with client d...
Brian Chrisman

05/12/2011

10:43 PM Revision 84644dc5 (ceph): uclient: compare _revoked_ caps when deciding whether to release.
cap->issued is already set to new_caps, so that branch was never taken!
Signed-off-by: Greg Farnum <gregory.farnum@d...
Greg Farnum
10:36 PM Revision 932f4eb0 (ceph): uclient: clear out cap->wanted when caps get revoked.
This ensures that we will send a response to the MDS letting it know
that we've revoked our caps.
Signed-off-by: Gre...
Greg Farnum
09:34 PM Revision 5e2b57d0 (ceph): uclient: be more careful about sending caps.
This should prevent us from "losing" caps off the dirty list. See
#1063. If we have dirty caps we don't want to short...
Greg Farnum
06:01 PM Revision 91a268ed (ceph): radosgw_admin: dump log by object
instead of only by date+bucket.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
04:13 PM CephFS Bug #1063 (Can't reproduce): dbench breaks if MDS and client times aren't synced
I won't be surprised if this comes back again, but I can't reproduce it and there've been several fixes for client ca... Greg Farnum
03:50 PM Revision 30491e8f (ceph): updated test to cover "." directory stat
Signed-off-by: Brian Chrisman <brchrisman@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Brian Chrisman
03:50 PM Revision 4456b6c3 (ceph): Add analogous special case for "." directory alongside ".." in _lookup
Signed-off-by: Brian Chrisman <brchrisman@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Brian Chrisman
11:12 AM Linux kernel client Bug #1086: rbd: iozone failure
My comment.
I set the starting file size to 4G (-n4g) because this server has 4G of memory. Into smaller files on the se...
Fyodor Ustinov
10:06 AM Linux kernel client Bug #1086: rbd: iozone failure
strangely, the file looks correct (before and after a remount):... Sage Weil
10:00 AM Linux kernel client Bug #1086 (Resolved): rbd: iozone failure
I was able to reproduce Fyodor's problem on rbd (latest kernel) and ext2:... Sage Weil
09:43 AM rgw Bug #1083 (Won't Fix): rgw: log by user, user+bucket
nevermind. we can just list the log objects directly from the .log pool. Sage Weil
09:11 AM Linux kernel client Bug #557 (Can't reproduce): BUG_ON(!session->s_num_cap_releases);
Sage Weil
09:10 AM Linux kernel client Bug #465 (Resolved): need to refresh osdmap when full flag is set
added bit to subscribe to next osdmap if current osdmap has full bit set. Sage Weil
08:55 AM Linux kernel client Bug #1071: rbd: mkfs.ext4 doesn't complete (but mke2fs -j does)
Blarg, I can't reproduce this consistently. That bisect is probably bogus. Sage Weil
08:54 AM Linux kernel client Bug #909 (Can't reproduce): ceph-client+ceph v0.25.1,iozone test, "libceph: tid 115358 timed out...
Sage Weil
04:21 AM Revision 935f7dc1 (ceph): mds: drop unneeded default arg
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
03:39 AM Bug #1085 (Won't Fix): bug in cclass
... Fyodor Ustinov
12:57 AM Revision 2655a2b5 (ceph): Merge branch 'osd_pgls'
Sage Weil
12:56 AM Revision a6417c6a (ceph): objecter: set pgls start_epoch field
For each pg, start out with start_epoch = 0 in the first request. For
subsequent requests, set it to the first reply...
Sage Weil
12:55 AM Revision 8a1644ef (ceph): osd: add pgls start_epoch field
If the pgls.start_epoch is set, the cookie is only considered valid if the
osd pg interval has not changed since then...
Sage Weil
12:51 AM Revision 222126e8 (ceph): rgw: in S3 PUT, don't crash on Content-Length == 0
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
12:47 AM Revision 5c382d35 (ceph): objecter: fix calc_op_budget bit mask checks
Use the helpers; we need to mask out several bits and compare.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
12:45 AM Revision b6cccc74 (ceph): Objecter: switch handle_osd_map op resending around
We need to order the resend by tid. We could do that in a
set with a special-purpose comparison function, but just
sw...
Greg Farnum
12:27 AM Revision 1d29cc7c (ceph): rgw: in S3 PUT, don't crash on Content-Length == 0
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe

05/11/2011

11:52 PM Revision 5e4f6bae (ceph): Objecter: implement operator<.
This will maintain ordering of Ops when they're in eg STL sets.
Previously Objecter::handle_osd_map would indiscrimin...
Greg Farnum
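The ordering idea in this commit message can be sketched on its own. The snippet below is an illustrative sketch, not the Objecter code: `Op` is a hypothetical stand-in with a single field, `resend_order` is an invented helper, and ordering by `tid` is an assumption drawn from the related "order the resend by tid" commit. It shows that once `operator<` is defined, a `std::set` iterates the ops in that order regardless of insertion order.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for an Objecter op; only the tid matters here.
struct Op {
    std::uint64_t tid;

    // Assumed ordering: ops compare by transaction id.
    bool operator<(const Op& other) const { return tid < other.tid; }
};

// Hypothetical helper: return the tids in the order a std::set<Op>
// would iterate them, which is ascending by operator<.
std::vector<std::uint64_t> resend_order(const std::vector<Op>& ops) {
    std::set<Op> ordered(ops.begin(), ops.end());
    std::vector<std::uint64_t> tids;
    for (const Op& op : ordered)
        tids.push_back(op.tid);
    return tids;
}
```

Calling `resend_order` on ops inserted with tids 7, 2, 5 yields them in ascending tid order.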
11:27 PM Revision c155a2b7 (ceph): osd: prepend missing objects to pgls results
This will prepend any missing objects to the set of objects returned by
a sequence of PGLS operations. Because recov...
Sage Weil
09:52 PM CephFS Bug #1084: blogbench won't finish: waiting for Fr cap forever
i initially thought something like this would work
diff --git a/src/mds/Locker.cc b/src/mds/Locker.cc
index 3c7...
Sage Weil
08:03 PM CephFS Bug #1084 (Resolved): blogbench won't finish: waiting for Fr cap forever
Run blogbench with kclient: blogbench -d /mnt/ceph/henry/b5/
Blogbench won't finish and keeps waiting for Fr caps of...
Henry Chang
09:00 PM Revision d9896b3c (ceph): obsync: handle eventual consistency issues
Handle eventual consistency issues so that obsync will be usable on more
S3 stores.
Signed-off-by: Colin McCabe <col...
Colin Patrick McCabe
08:58 PM Revision 7083777c (ceph): osd: remove weird commit_op_seq fast-forward
This doesn't serve any purpose that we can discern.
In fact, it might cause problems because it'd allow the journal ...
Sage Weil
08:58 PM Revision 82f9a923 (ceph): osd: key Missing::rmissing on version (not eversion)
This switches the key to the uint64_t (version_t) only, which is still
unique given a particular timeline (which is a...
Sage Weil
08:54 PM Revision f1af92fb (ceph): PG: choose_log_location, fix error when scanning up set
++up.begin() does not skip the primary. Primary might not be up[0].
Signed-off-by: Samuel Just <samuel.just@dreamhos...
Samuel Just
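The point of this commit message, that the primary cannot be skipped by position, can be illustrated with a small sketch. This is not the code from the commit: `up_without_primary` is a hypothetical helper, and representing OSDs as plain `int` ids is an assumption made for illustration. It filters the primary out by id instead of assuming it sits at `up[0]`.

```cpp
#include <vector>

// Hypothetical helper: collect every OSD in the up set except the primary,
// comparing ids rather than skipping the first position.
std::vector<int> up_without_primary(const std::vector<int>& up, int primary) {
    std::vector<int> others;
    for (int osd : up) {
        if (osd != primary)
            others.push_back(osd);
    }
    return others;
}
```

With an up set of 3, 1, 2 and primary 1, starting the scan at the second element would wrongly drop OSD 3 and keep the primary; filtering by id avoids that.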
08:39 PM Revision 326d01b2 (ceph): osd: support rollback to cluster snapshot
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:27 PM Revision 2dc891f6 (ceph): Clock: add new clock_offset config option, and use it in g_clock.now()
This way we can test clock drifts without needing to actually
drift the clocks.
Signed-off-by: Greg Farnum <gregory....
Greg Farnum
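The idea of testing clock drift through a configured offset can be sketched as follows. This is an assumption-laden illustration, not the `g_clock` implementation: `OffsetClock`, its `offset` field, and the `apply` method are hypothetical names, and the real time is passed in as a parameter so the behavior is easy to check.

```cpp
#include <chrono>

// Hypothetical clock wrapper: every reported time is the real time
// shifted by a configurable offset, which simulates a drifted clock.
struct OffsetClock {
    std::chrono::seconds offset{0};

    using time_point = std::chrono::system_clock::time_point;

    // Shift an externally supplied "real" time by the configured offset.
    time_point apply(time_point real) const { return real + offset; }

    // Convenience wrapper that reads the system clock.
    time_point now() const { return apply(std::chrono::system_clock::now()); }
};
```

Setting `offset` to a positive value makes the process behave as if its clock ran ahead, without touching the system clock.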
08:11 PM Revision 7aed34c2 (ceph): clock: remove cruft.
There were some odd pieces that are artifacts of a very old and
different use. Remove them to simplify the interface ...
Greg Farnum
08:11 PM Revision 88641b88 (ceph): osd: trigger a store snapshot when the osdmap says to
Move the OSDMap decoding up a bit so that we can either snapshot or flush.
We can't do it after we take map_lock or e...
Sage Weil
08:10 PM Revision 6db09bac (ceph): filestore: add a snapshot command to create a snapshot of the entire store
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:10 PM Revision 918eeaf0 (ceph): mon: add 'osd cluster_snap foo' command
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:10 PM Revision 2ced4e24 (ceph): osdmap: add cluster_snapshot field
Add a cluster_snapshot marker in the map that is valid for a single epoch
to do a coordinated snapshot of the entire ...
Sage Weil
07:12 PM rgw Bug #1081 (Resolved): rgw: 500 error with x-amz-acl instead of explicit xml
yay fixed! patch is in next branch (for v0.28). Sage Weil
05:59 PM rgw Bug #1081: rgw: 500 error with x-amz-acl instead of explicit xml
should be fixed by 1d29cc7c7627683ba0ae2aa064abab4ea942b4e8.
Just need to test.
Colin McCabe
04:37 PM rgw Bug #1081: rgw: 500 error with x-amz-acl instead of explicit xml
Setting canned ACLs works for me in the tests I am running.
I am running more s3-tests, so maybe that will unearth...
Colin McCabe
03:43 PM rgw Bug #1081: rgw: 500 error with x-amz-acl instead of explicit xml
Here's a tcpdump snippet from s3-tests that works, compare against this to find the cause. (But 500 => there's an rgw... Anonymous
03:41 PM rgw Bug #1081: rgw: 500 error with x-amz-acl instead of explicit xml
When running $Conn->put_bucket_acl('berlertestobsync1', '', { 'x-amz-acl' => 'public-read' }); where $Conn is an S3::... Matthew Wodrich
03:41 PM rgw Bug #1081: rgw: 500 error with x-amz-acl instead of explicit xml
here's the full Response object (it contains the request object, so you can see what produced it)
$VAR1 = bless( {...
Steven Berler
03:22 PM rgw Bug #1081 (Resolved): rgw: 500 error with x-amz-acl instead of explicit xml
if something like
'_headers' => bless( {
                                                               'user-agent...
Sage Weil
06:56 PM Revision d3aa0c1e (ceph): PG: Replicas send Notifies in response to queries
Replicas only send Infos during activate.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just
06:42 PM Revision 151bf29d (ceph): Merge branch 'next'
Sage Weil
06:37 PM Revision 8d201d4b (ceph): librbd: tolerate ENOENT when trying to delete an object.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
06:37 PM Revision 484e6e6f (ceph): rados_sync: tolerate ENOENT when deleting an object.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
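Tolerating ENOENT on delete, as these two commits describe, usually amounts to treating "already gone" as success. The sketch below assumes the negative-errno return convention; `tolerate_enoent` is a hypothetical helper and `remove_result` stands for the return code of whatever remove call the caller made. It is not the librbd or rados_sync code.

```cpp
#include <cerrno>

// Hypothetical wrapper: map -ENOENT from a remove call to success (0),
// and pass every other return code through unchanged.
int tolerate_enoent(int remove_result) {
    return remove_result == -ENOENT ? 0 : remove_result;
}
```

Other errors, such as `-EIO`, still propagate to the caller, so only the missing-object case is swallowed.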
06:37 PM Revision 140886c8 (ceph): mdcache: check return values when purging an inode.
Previously we weren't looking, and if there's a problem
we probably shouldn't be moving on.
Signed-off-by: Greg Farnu...
Greg Farnum
06:07 PM rgw Bug #1083: rgw: log by user, user+bucket
actually, we can scratch problem 2.. each log entry apparently has the bucket owner at the time of the request. Sage Weil
04:19 PM rgw Bug #1083 (Won't Fix): rgw: log by user, user+bucket
problem 1:
- user creates bucket at beginning of day
- pumps full of data
- removes bucket at end of day
- we...
Sage Weil
06:06 PM Bug #1030 (Resolved): osd: list pool/bucket contents excludes missing objects
Sage Weil
06:04 PM Revision 1429d776 (ceph): test-obsync.sh: fix obsync unit tests
Fix the obsync unit tests to take into account the new ACL changes.
ACLs must be either translated or ignored when co...
Colin Patrick McCabe
05:28 PM Revision 0f42099a (ceph): expand testceph to check xattrs
Signed-off-by: Brian Chrisman <brchrisman@gmail.com> Brian Chrisman
05:28 PM Revision b0e0c361 (ceph): client: support security. namespace
Signed-off-by: Brian Chrisman <brchrisman@gmail.com> Brian Chrisman
05:28 PM Revision 3521771c (ceph): support for xattrs in libceph
Signed-off-by: Brian Chrisman <brchrisman@gmail.com> Brian Chrisman
03:47 PM Feature #1082 (Rejected): obsync: swift support
Sage Weil
01:48 PM Bug #1079: pgs stuck peering or degraded
The ones stuck in degraded were likely caused by the bug fixed in f1af92fb3d3bdab5a74ef40744028001d1943203. Samuel Just
01:33 PM Feature #1080 (Resolved): osd: cluster snapshot
create a snapshot of all osds so we can rollback the state of the entire osd cluster
warping back the monitor will...
Sage Weil
01:24 PM CephFS Bug #1063: dbench breaks if MDS and client times aren't synced
Scratch that, I did manage to reproduce locally. It just took a bit longer. Greg Farnum
01:07 PM CephFS Bug #1063: dbench breaks if MDS and client times aren't synced
On the other hand, adding a clock skew option and setting the MDS into the future doesn't let me reproduce the broken... Greg Farnum
10:10 AM CephFS Bug #1063: dbench breaks if MDS and client times aren't synced
Well, job 576 completed successfully after TV time-synced the cluster. Looks like bad mtimes are somehow causing the ... Greg Farnum
11:01 AM Bug #1078 (Resolved): rados remove fails silently on non-existent objects
Okay, I checked through these. A lot of callers don't pay any attention to the return code from remove but I looked a... Greg Farnum
10:13 AM CephFS Bug #930 (Resolved): libceph not exporting getattr
commit:3521771cb6bdb8eb0cbec7dc27a9999ddb494ad0 Sage Weil
04:35 AM Revision 3a8f36f5 (ceph): journaler: tolerate ENOENT when prezeroing
ENOENT is okay and expected.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
04:32 AM Revision d2243e82 (ceph): osd: unlink of nonexistent object should return -ENOENT
fixes bug #1078. Yehuda Sadeh
04:31 AM Revision f114cf18 (ceph): monclient: fix crash on shutdown
cur_con may be null on shutdown.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
12:46 AM Revision 6f340456 (ceph): Revert "osd: unlink of nonexistent object should return -ENOENT"
This reverts commit a7f87965f5c49fa878dadd458f87b7974252ab6e.
This commit breaks at least how Filer does zeroing. We...
Greg Farnum

05/10/2011

11:58 PM Revision 59995908 (ceph): Merge branch 'wip-merge-radostool-with-radossync'
Colin Patrick McCabe
11:54 PM Revision de640d85 (ceph): monclient: maintain explicit session connection; ignore stray messages
Maintain an explicit Connection handle to send messages and mark_down old
monitor connections. Ignore any incoming m...
Sage Weil
11:45 PM Revision 3425a8e5 (ceph): rados tool: integrate rados_sync with rados tool
* integrate rados_sync with rados_tool
* Improve rados tool usage a bit
* Rename test_rados_sync.sh to test_rados_too...
Colin Patrick McCabe
11:25 PM Revision 15756550 (ceph): rados tool: change initial argument parsing a bit
Use the ceph_argparse functions. Prepare to integrate with rados_sync.
Signed-off-by: Colin McCabe <colin.mccabe@dre...
Colin Patrick McCabe
10:23 PM Revision 203edaca (ceph): librados: don't crash if we call connect twice
Fixes: #1034
Reported-by: Wido den Hollander <wido@widodh.nl>
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:49 PM Bug #1078: rados remove fails silently on non-existent objects
probably needs
diff --git a/src/osdc/Journaler.cc b/src/osdc/Journaler.cc
index 9322ad2..6fbe3f8 100644
--- a/sr...
Sage Weil
05:11 PM Bug #1078 (In Progress): rados remove fails silently on non-existent objects
Yeah, it definitely hits other things. The MDS no longer starts up due to error codes coming back on prezeroing.
I...
Greg Farnum
02:23 PM Bug #1078 (Resolved): rados remove fails silently on non-existent objects
Fixed by commit:a7f87965f5c49fa878dadd458f87b7974252ab6e. Yehuda Sadeh
01:11 PM Bug #1078: rados remove fails silently on non-existent objects
This will fix it, however, I'm not sure either whether it was done on purpose or was just an oversight and/or how it'... Yehuda Sadeh
11:56 AM Bug #1078 (Resolved): rados remove fails silently on non-existent objects
Trying to remove a non-existent object produces no error. If this is intentional, it needs to be documented in the li... Colin McCabe
09:33 PM Revision d67dba76 (ceph): Merge remote branch 'origin/stable'
Sage Weil
09:20 PM Revision a7f87965 (ceph): osd: unlink of nonexistent object should return -ENOENT
fixes bug #1078. Yehuda Sadeh
06:16 PM Revision 331c01e8 (ceph): rados_sync: implement --delete-after, fix bugs
Implement --delete-after for both export and import.
Fix DIR* leaks.
Signed-off-by: Colin McCabe <colin.mccabe@drea...
Colin Patrick McCabe
06:16 PM Revision 6e55b237 (ceph): rados_sync: support --force
Support --force, which re-copies all objects all the time.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Colin Patrick McCabe
06:16 PM Revision 001c18c1 (ceph): test_rados_sync: test --force
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
06:16 PM Revision 357910c5 (ceph): Allow dashes in ceph_argparse, etc.
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
05:40 PM Bug #1079: pgs stuck peering or degraded
It looks like the degraded ones are staying that way because they need backlogs, but we didn't populate peer_backlog_... Josh Durgin
04:49 PM Bug #1079 (Closed): pgs stuck peering or degraded
Using the same setup as in #1073, but with 10 osds, the cluster recovered most pgs, but a few were stuck degraded, an... Josh Durgin
04:50 PM Bug #1077 (Resolved): integrate rados_sync with rados tool
implemented in 3425a8e5031a4f0c9c0eb85e8a329b02d05b9420 Colin McCabe
11:43 AM Bug #1077 (Resolved): integrate rados_sync with rados tool
As we discussed. Colin McCabe
04:36 PM Bug #1033 (Resolved): osd: CephxClientHandler::handle_response
commit:de640d85fa3e0e5e5a31704eab5a8714a1ffe867 Sage Weil
03:40 PM CephFS Bug #1063: dbench breaks if MDS and client times aren't synced
Okay, hopefully we can rerun this with a time-synced cluster soon and see if that's what is causing the breakage.
...
Greg Farnum
03:22 PM Revision d006c6f2 (ceph): osd: initialize oi.oloc if on-disk value is bogus
If the on-disk locator is undefined (upgrade of an old cluster?) initialize
the oloc fields based on the PG::Info.
R...
Sage Weil
03:04 PM Bug #1034 (Resolved): librados: Calling connect twice causes a segfault
Sage Weil
12:15 PM Revision f00edf73 (ceph): rgw: switch bucket creation operations
First we create the pool, then we create the bucket object. This
should have the effect of making the bucket creation...
Yehuda Sadeh
11:19 AM Bug #1074 (Resolved): rados_sync: implement --delete
implemented by 331c01e847c471980c31980a618c3bce3face50e Colin McCabe
10:47 AM Feature #1068 (Resolved): rados: incremental import/export
Sage Weil
10:47 AM Feature #1069 (Resolved): rados: support attrs in import/export
Sage Weil
10:04 AM Bug #1076 (Resolved): avoid sparse read for small reads
It doesn't make sense to do a sparse read for reads that are smaller than the block size. This should either be in li... Yehuda Sadeh
09:43 AM Bug #1021 (Can't reproduce): osd: _process_pg_info FAILED assert(pg->log.tail <= pg->info.last_co...
Let's see if this comes up post-refactor. Sage Weil
09:43 AM Bug #1028 (Resolved): segfault in OSDMap::object_locator_to_pg
Sage Weil
04:53 AM Bug #1028: segfault in OSDMap::object_locator_to_pg
ok, it seems fixed. Now back to #1022 ar Fred
12:33 AM Bug #1028: segfault in OSDMap::object_locator_to_pg
Thank you for the patch, compiling right now.
This is indeed an old FS that got created approximately a year ago, ...
ar Fred
06:01 AM rgw Bug #1059 (Resolved): RGW consistency issues
Fixed now, commit:f00edf73284fc0f6e32973d16f58eb81f7b96bf8. However, this might have impact on performance. Yehuda Sadeh

05/09/2011

11:01 PM Revision 0ac419e0 (ceph): osd: drop bad warning
The stats won't match reality if there are any missing or if there are any
snapped objects.
Signed-off-by: Sage Weil...
Sage Weil
11:01 PM Revision 8e1e45c0 (ceph): osd: reset last_complete on mark_all_unfound_as_lost if no more missing
If we marked _all_ missing as lost, reset last_complete, since missing is
now empty!
Signed-off-by: Sage Weil <sage....
Sage Weil
11:01 PM Revision 70d8c994 (ceph): osd: simplify build_might_have_unfound
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
11:01 PM Revision 8a781f11 (ceph): osd: fix osd$foo typos
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
11:01 PM Revision 7a6b9b97 (ceph): osd: fix pollution of peer_info
The ++ postfix has no effect here! We really want +1.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
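The pitfall named in this commit message can be shown in isolation. The snippet below is an illustrative sketch, not the code from the commit: `next_postfix` and `next_plus_one` are hypothetical names. It shows why a postfix `++` applied to a by-value copy yields the old value where `+ 1` was intended.

```cpp
#include <cstdint>

// Hypothetical helpers, not taken from the Ceph source.

// Postfix ++ evaluates to the value *before* the increment, and the
// incremented local copy is discarded when the function returns.
std::uint64_t next_postfix(std::uint64_t v) {
    return v++;  // yields the original v
}

// The intended computation: the value after the given one.
std::uint64_t next_plus_one(std::uint64_t v) {
    return v + 1;
}
```

For an input of 5, the first helper gives back 5 and the second gives 6.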
11:01 PM Revision 7ada5cd6 (ceph): osd: wait for up_thru updates
Before the primary can go active we need to wait for the up_thru in the
osdmap to reflect that we were alive during t...
Sage Weil
11:01 PM Revision 6d70592d (ceph): osd: log debug output for Crashed state
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
11:01 PM Revision 8cb861c3 (ceph): osd: rename Pending -> WaitActingChange
We only use the Pending state while waiting for the acting set to change.
Rename the state and log it appropriately s...
Sage Weil
10:18 PM Revision d9ea95f2 (ceph): rados tool: remove import/export
rados_sync replaces rados import / rados export
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com>
Colin Patrick McCabe
04:46 PM RADOS Feature #1075 (Resolved): crushtool: warn if uniform item weights vary
uniform bucket weights are ignored (and presumed uniform, hence "uniform"). Sage Weil
04:29 PM Bug #1073 (Resolved): osd: failed assert: info.last_complete == info.last_update
fixed by commit:7ada5cd685fcf3cae4e1c5d2dd81ea1817cceee7 Sage Weil
09:51 AM Bug #998 (In Progress): qemu/librbd race condition
Christian Brunner had a similar error, but in aio_write during a yum upgrade. We should do more testing on this. Josh Durgin
09:05 AM Bug #1028: segfault in OSDMap::object_locator_to_pg
The problem is that the locator stored in the object_info_t on disk is wrong. Can you say anything about when the o... Sage Weil
05:28 AM rgw Bug #1035 (Resolved): incorrect rgw log data
This is already fixed, commit:a09eb0c33f6b05714bd4f780f79c70cb4529f840. Yehuda Sadeh

05/08/2011

11:30 PM Bug #1028: segfault in OSDMap::object_locator_to_pg
Cherry-picked 85292b367b0e6e6d8963de32ad198482500c887f into the stable branch, here are the logs... I kept the core f... ar Fred

05/07/2011

07:56 PM Revision 1cb611a0 (ceph): .gitignore: rados_sync
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
02:15 PM Bug #1073: osd: failed assert: info.last_complete == info.last_update
we only wait for up_thru updates if some_down:
if (prior.some_down) {
need_up_thru = true;
for (vect...
Sage Weil
01:51 PM Bug #1073: osd: failed assert: info.last_complete == info.last_update
this is wrong:
2011-05-06 17:33:48.284200 7f1466b56710 osd4 31 pg[0.12( v 21'17 (21'15,21'17] n=17 ec=2 les=11 31/...
Sage Weil
01:39 PM Bug #1073: osd: failed assert: info.last_complete == info.last_update
something is definitely going wrong here.. i see
2011-05-06 17:34:24.391722 7fe3aae59710 osd9 43 pg[0.12( v 21'17...
Sage Weil
05:10 AM Revision ea0a1395 (ceph): osd: fix compilation for some g++ versions
wasn't compiling on lenny, g++ ver 4.3.2. Might be that
it's also due to a different boost version.
Yehuda Sadeh
04:59 AM Revision 25bfb987 (ceph): osd: reassert our assert definition after including boostchart
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
04:59 AM Revision f9ed9885 (ceph): assert: make our assert clobber any others too
Two can play this game, /usr/include/assert.h!
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
04:42 AM Revision 7db49499 (ceph): rgw: opening bucket io ctx stats bucket info if failed
this should trigger request of a new osdmap if we were racing
with bucket creation.
Yehuda Sadeh
04:42 AM Revision 588fe672 (ceph): rgw: minor cleanup
Yehuda Sadeh
12:43 AM Revision 290668c0 (ceph): Merge branch 'wip-rados-sync'
Colin Patrick McCabe
12:42 AM Revision fbe0bd1b (ceph): test_rados_sync: check that second sync does nada
Signed-off-by: Colin McCabe <colin.mccabe@dreamhost.com> Colin Patrick McCabe
12:33 AM Revision 210c38d2 (ceph): rados_sync: more fixes
* separate BackedUpObject::from_path and BackedUpObject::from_file.
* librados functions return negative values on e...
Colin Patrick McCabe
 
