Activity

From 09/28/2010 to 10/27/2010

10/27/2010

10:15 PM Linux kernel client Tasks #480 (Resolved): rebase btrfs snapshot ioctls, resend to list
Sage Weil
09:43 PM Revision 745a8ee5 (ceph): clarify CDir/CInode content comments a little bit
Greg Farnum
08:21 PM Revision c1d07816 (ceph): filestore: can force use of stale snaps
also, overwrite the commit_seq with the current version in case we
forced stale snaps.
Yehuda Sadeh
06:39 PM Revision bcc068ea (ceph): filestore: read commit_seq before mounting (btrfs ioctls)
Yehuda Sadeh
06:39 PM Revision c1a6ee57 (ceph): filestore: don't revert to old snapshots on startup
This should fix bug #55 Yehuda Sadeh
05:14 PM Bug #522 (Resolved): osd: put potentially large pg info in a separate object, not xattr
I'm looking at the prior interval stuff, currently an attr on the head pg dir. This can be an object in meta/. Sage Weil
04:25 PM Bug #518: cfuse crashed on ls
compiled b5d9bec659daa8ba26810e7508ec473aba8ad287 but is still crashing on ls:... John Leach
01:21 PM Bug #55 (Resolved): osd: fix transition from snaps -> no snaps -> snaps
Fixed with commit:c1d078160a454c92fea899659d506e0b0ab7d92b. Yehuda Sadeh
11:45 AM Bug #521 (Resolved): objecter: crash in osdmap assert
... Sage Weil
06:28 AM Revision ae78ed42 (ceph): ceph.cc: delete deadcode
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:25 AM Revision 551711fb (ceph): Move ceph.cc to tools/
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
03:59 AM Revision a14dd819 (ceph): configure.ac: add ./configure option for gtk2
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
03:06 AM Revision 5fe0b5a0 (ceph): mds: fix split use after free; merge works
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
02:20 AM Revision b771ba89 (ceph): mds: simplify fragtree_t printer
val/bits^split
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
02:19 AM Revision 4afbc529 (ceph): mds: check/take wrlock on dirfragtreelock; unwind after freeze if needed
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
02:19 AM Revision 4c79f369 (ceph): mds: requeue dir if we can't split now due to dftlock
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
02:19 AM Revision 05fa106c (ceph): mds: implement frag.parse()
Sage Weil
02:19 AM Revision 7bd00b96 (ceph): mds: implement command 'merge_dir path frag'
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
02:19 AM Revision 0f8f02d3 (ceph): mds: add 'mds bal split bits' config option (default 3)
This is how many bits we fragment by, by default.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
02:19 AM Revision 2f9c9606 (ceph): client: fix dup entries in multifrag readdir
We need a next_offset of 0 for non-leftmost frags. Otherwise we set
our dentry offsets incorrectly and the next_offs...
Sage Weil
02:19 AM Revision 96d26e38 (ceph): mds: reimplement split_dir
Do not use an mdrequest; the old approach was totally broken wrt freezing,
locks, and deadlock.
First freeze, then l...
Sage Weil
02:19 AM Revision e1b53794 (ceph): mds: generalize split/merge call chain a bit
Still need work at the lower levels.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
02:19 AM Revision 332195a2 (ceph): mds: clean up merge() callchain
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
02:19 AM Revision a4b21449 (ceph): mds: don't replicate new frags (at least for now)
Leave commented out stubs in place. Sage Weil
02:19 AM Revision e79417ba (ceph): mds: move fragment checks into shared helper
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil

10/26/2010

11:28 PM Revision 96beaf6c (ceph): messenger: always unlock existing pipes, even if they're lossy
Greg Farnum
07:58 PM Revision b5d9bec6 (ceph): client: Initialize Inode::truncate_size to 0 instead of -1, and check p...
on truncation.
truncate_size needs to precisely match the defaults on the MDS, or we run into
problems when importin...
Greg Farnum
07:30 PM CephFS Bug #520 (Closed): mds: change ifile state mix->sync on (many) lookups?
I'm seeing this on csyn --syn makefiles 1000 1 0... Sage Weil
07:04 PM Revision 2ed57d2a (ceph): Merge remote branch 'origin/testing' into unstable
Conflicts:
configure.ac
src/rados.cc
Sage Weil
07:00 PM Revision ef90cb5e (ceph): filestore: some cleanup
Yehuda Sadeh
06:59 PM Revision 54fdd641 (ceph): filestore: escape the xattr chunk names
Yehuda Sadeh
06:41 PM Revision 84b85aa6 (ceph): osd::Missing: const cleanup
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:41 PM Revision 45f7110d (ceph): osd: move PG::Missing implementation to PG.cc
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:06 PM Revision 44202873 (ceph): filestore: some cleanup
Yehuda Sadeh
01:05 PM Bug #518: cfuse crashed on ls
commit:b5d9bec659daa8ba26810e7508ec473aba8ad287 in testing. Waiting to hear back before closing. Greg Farnum
10:34 AM Bug #518: cfuse crashed on ls
These numbers confused me. When I get confused, I like to generate logs. Like the attached one. Greg Farnum
09:47 AM Bug #518: cfuse crashed on ls
And while I've got it here's the inode printout:
$4 = {ino = {val = 1099511632800}, snapid = {val = 18446744073709...
Greg Farnum
09:39 AM Bug #518: cfuse crashed on ls
All right, got in and found:
Identical truncate_seqs of 2.
Identical truncate_sizes of 0.
prior_size of 209715200....
Greg Farnum
12:34 PM Feature #169 (Resolved): osd: start up despite corrupted pg log(s)
Sage Weil
04:52 AM Revision 2a3e73bb (ceph): Merge branch 'btrfs_snap_ioctls' into unstable
Sage Weil
04:52 AM Revision f131f429 (ceph): filestore: warn if btrfs_snaps enabled but no async snap create ioctl
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil

10/25/2010

11:51 PM Revision 5e453454 (ceph): Merge branch 'objectcacher' into unstable
Greg Farnum
11:50 PM Revision b15e3b48 (ceph): client: fix to handle new ObjectCacher pool requirements.
Greg Farnum
11:50 PM Revision 38d7ddf2 (ceph): osdc: Add pool awareness to the ObjectCacher, to prevent unfortunate co...
Greg Farnum
11:45 PM Revision 7bb31f75 (ceph): osdc: Fix release_all so it loops properly
Greg Farnum
11:44 PM Revision a8f6ba94 (ceph): add cephfs to deb, rpm
Sage Weil
11:44 PM Revision 00d54428 (ceph): mds: fix up mds_bal_frag options
Use the mds_bal_frag option to enable/disable. Make checks consistent.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
11:44 PM Revision e275e855 (ceph): mon: remove pg from deleted pools from pg_map
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
11:36 PM Revision e27f0b1e (ceph): filestore: escape the xattr chunk names
Yehuda Sadeh
10:24 PM Revision 961d3bc4 (ceph): PG::Log::Entry: remove unused snap_t field
The snap_t information is stored in the sobject_t field now.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe
10:11 PM Feature #359 (Resolved): osd: use new btrfs snapshot ioctls
looks like the ioctl checks were fine. merged in commit:2a3e73bb325f89b708df1fc1fa889de238f5edd7 Sage Weil
09:11 PM Feature #359: osd: use new btrfs snapshot ioctls
Now that we know what's going in this cycle, we just need to make sure the ioctl checks are correct (no more DESTROY_... Sage Weil
09:08 PM CephFS Feature #519 (Closed): mds: dirfrag merge
Sage Weil
08:42 PM Revision 61b3fc35 (ceph): makefile: add cephfs
Greg Farnum
07:31 PM Revision d4bbde5a (ceph): ./ceph osd setcrushmap: validate crushmap
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
07:08 PM Revision 394b0712 (ceph): crush: improve error handling in map decoding
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
05:10 PM Bug #518: cfuse crashed on ls
Unfortunately I can't use these because my libraries don't match....gdb is finicky. :(
Can you instead run gdb you...
Greg Farnum
04:36 PM Bug #518: cfuse crashed on ls
attached core dump and cfuse binary as requested on irc John Leach
10:59 AM Bug #518 (Resolved): cfuse crashed on ls
cluster: 2 monitors, 2 metadata servers, 3 osds.
cfuses segfault when ls is run on the mount point (df works, cd /...
John Leach
04:51 PM Bug #507 (Resolved): objectcacher mixes pool namespaces
Merged into unstable as of commit:5e453454f8cc539de46d0ee2666e7a98e71a27a6. Greg Farnum
03:56 PM Bug #507: objectcacher mixes pool namespaces
Pushed to the objectcacher branch. I think this is done but need to make sure it's not breaking anything with its han... Greg Farnum
12:35 PM Bug #517 (Resolved): monitors crashing on startup after injecting corrupt crush map
Fixed by commit:d4bbde5ab171b37d1ecefdd396b7b04c6d41d0d2 and commit:394b0712bc2c12cba6b6043f633a9670c46e4df7 Colin McCabe
09:57 AM Bug #517: monitors crashing on startup after injecting corrupt crush map
Need to decode the provided map in a try {} block to verify it is valid before using it. In OSDMonitor::prepare_comm... Sage Weil
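The validate-before-use idea in the note above can be sketched roughly as follows. This is only an illustration in Python, not the monitor's C++ code: the toy binary format, `MAGIC`, `decode_map`, and `set_map` are all hypothetical names invented for the example. The point is just that a candidate map is fully decoded inside a try block and is adopted only if decoding succeeds.

```python
import struct

MAGIC = 0x00010000  # hypothetical magic number for this toy format


def decode_map(blob: bytes) -> dict:
    """Decode a toy map: magic, item count, then that many 32-bit ids.

    Raises ValueError on malformed input instead of returning partial data.
    """
    if len(blob) < 8:
        raise ValueError("truncated header")
    magic, count = struct.unpack_from("<II", blob, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    if len(blob) != 8 + 4 * count:
        raise ValueError("length does not match item count")
    return {"items": list(struct.unpack_from(f"<{count}I", blob, 8))}


def set_map(current: dict, blob: bytes):
    """Return (map_in_use, status); keep the current map if decoding fails."""
    try:
        candidate = decode_map(blob)
    except ValueError as err:
        return current, f"rejected: {err}"
    return candidate, "accepted"


# A well-formed blob is adopted; a corrupt one leaves the old map in place.
good = struct.pack("<II2I", MAGIC, 2, 7, 9)
current = {"items": []}
current, status_good = set_map(current, good)
current, status_bad = set_map(current, b"garbage")
```

Because the corrupt blob is rejected before it replaces anything, a later restart would still load the last map that decoded cleanly.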

10/24/2010

09:28 PM Revision a869b35a (ceph): cap_reconnect_t: ignore embedded NULLs in the path
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:56 PM Linux kernel client Bug #498: reconnect sends string with NULL?
By change a869b35abdab37bd4505f435bf0f7ab1860b28cc, we no longer assert when there's a NULL in the path string.
I'...
Colin McCabe
08:53 AM Bug #517 (Resolved): monitors crashing on startup after injecting corrupt crush map
I followed the instructions at http://ceph.newdream.net/wiki/OSD_cluster_expansion/contraction to add a 3rd osd node ... John Leach

10/23/2010

08:47 PM Bug #516 (Closed): filestore: handle large xattrs on ext3
Sage Weil
08:46 PM Bug #515 (Can't reproduce): osd: recovery isn't completing
I'm seeing a few stray objects left over on sepia. Sage Weil
05:49 PM Revision e912e686 (ceph): v0.22.1
Sage Weil
05:17 PM Revision 96d46737 (ceph): Makefile: add errno.h
Sage Weil
05:17 PM Revision a974cfda (ceph): mds: be quiet about snaprealm push/pop
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil

10/22/2010

11:06 PM Revision 69078946 (ceph): filestore: ignore ENOSPC on setxattr pending a better workaround
This effectively reverts to old behavior (we weren't checking for ENOSPC
errors at all before). Log which object it ...
Sage Weil
10:55 PM Revision 6826ce4a (ceph): filestore: change xattr chunk size to 2048
Yehuda Sadeh
10:55 PM Revision 76157d91 (ceph): filestore: split xattrs to multiple chunks
Yehuda Sadeh
10:55 PM Revision 3ee37ee7 (ceph): rados: add getxattr, setxattr
Yehuda Sadeh
10:52 PM Revision 22bb2118 (ceph): filestore: change xattr chunk size to 2048
Yehuda Sadeh
10:51 PM Revision 557e7e34 (ceph): mds: Add new LOCK_MIX_STALE state to lock structs.
Greg Farnum
10:51 PM Revision 512a1da9 (ceph): mds: Check for LOCK_MIX_STALE along with LOCK_MIX
LOCK_MIX_STALE precludes writing to the protected data, but
in general cases it's an acceptable state whenever LOCK_M...
Greg Farnum
10:51 PM Revision f893a63b (ceph): mds: rename Locker::file_mixed to scatter_mix
Greg Farnum
10:51 PM Revision 372e8b3e (ceph): mds: Add bool "dirty" to ScatterLock, plus manipulation functions.
Also add is_dirty() to SimpleLock so we don't need typing in these checks.
This lets us set that a dirfrag's account...
Greg Farnum
10:51 PM Revision 47a5fc95 (ceph): mds: Whenever we set locks to state LOCK_MIX, check is_stale()
and set to state LOCK_MIX_STALE instead, if necessary. Greg Farnum
10:51 PM Revision db6759fe (ceph): mds: use set_stale() as appropriate:
1) When we update a lock but can't write its new data,
2) We load potentially-stale data off disk (ie, in restart).
Greg Farnum
10:51 PM Revision b4fd986a (ceph): mds: Remove scatter_pins.
We used these to prevent freezing a tree during gather-scatter ops,
but now we can just go stale on data when a scatt...
Greg Farnum
09:33 PM Revision 9d4f7b8e (ceph): librados: add rmxattr
Yehuda Sadeh
08:36 PM Revision 429b2d99 (ceph): Revert "messenger: Make sure to unlock existing->pipe_lock. There are a...
This reverts commit 96692d24c8cdf0fe88260949b67f8580e0c70696.
This patch accidentally got merged into the tree twice,...
Greg Farnum
06:50 PM Revision 242b5992 (ceph): test_lost.sh: put common functions in test_common
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:24 PM Revision 55fcbc64 (ceph): Merge branch 'msgr' into unstable
Greg Farnum
06:16 PM Revision 696da815 (ceph): messenger: If we error out of accept() but have messages in our queue, ...
This can occur if we're replacing another Pipe and hit an error
in the process.
Greg Farnum
06:15 PM Revision 49d8fd8a (ceph): messenger: If we're replacing an existing Pipe, steal queue when we kil...
Previously we could fail out after killing existing but before
splicing its queue into our own, which lost messages.
Greg Farnum
06:01 PM Revision bf0d347d (ceph): PG::peer: introduce prior_set_build flag
Just because we have prior_set.empty() doesn't mean that the prior set
wasn't built. Create a flag to represent this ...
Colin Patrick McCabe
05:16 PM Revision b8c0d3df (ceph): filestore: update btrfs_ioctl.h
Yehuda Sadeh
05:16 PM Revision 1a7a341d (ceph): filestore: use different encoding for snap async_create
Yehuda Sadeh
05:16 PM Revision bb451d20 (ceph): filestore: use SNAP_DESTROY_ASYNC ioctl if available
Sage Weil
05:16 PM Revision a3d8c1ff (ceph): filestore: remove stray async_snap_test if present
This cleans up if a prior instance failed to delete its
async_snap_test subvol.
Sage Weil
05:16 PM Revision 953ef1da (ceph): filestore: use new async btrfs ioctls
Sage Weil
05:11 PM Revision 78352b32 (ceph): osd: fix deadlock in map handler
To avoid deadlock,
- we need to drop osd_lock while we flush.
- we need to take map_lock _after_ we flush.
Signed-of...
Sage Weil
05:10 PM Revision 515efd5a (ceph): rados: add getxattr, setxattr
Yehuda Sadeh
05:10 PM Revision f96eb805 (ceph): filestore: split xattrs to multiple chunks
Yehuda Sadeh
04:44 PM CephFS Feature #340: large directories, directory fragmenting
We still need to add a wrlock of the dirfragtreelock.
Sage Weil
04:23 PM CephFS Cleanup #514 (Rejected): Optimize MIX/MIX_STALE reconnects, etc
Right now the MDS puts locks into the MIX_STALE state whenever it loads from disk. This is safe but unnecessary. Fix! Greg Farnum
04:11 PM CephFS Feature #495 (In Progress): mds: add MIX_STALE
A first pass is done and pushed to the mix_stable branch. Testing and debugging now, but that may take a while. Greg Farnum
11:42 AM Bug #55: osd: fix transition from snaps -> no snaps -> snaps
I think all we need to do is look at current/commit_op_seq. If it is greater than the newest snap, then that snap is... Sage Weil
11:25 AM Bug #505 (Resolved): osd assert on flab
Well, that was a Duh.
Fixed in commit:49d8fd8a21778d0f805176d670d5f63f14e36b47 and commit:696da81588621ac9ee256993a1...
Greg Farnum
04:45 AM Revision 6a88d572 (ceph): mds: implement 'fragment_dir path frag by' command
For testing dir fragmentation.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
04:38 AM Revision b4f82328 (ceph): Merge branch 'testing' into unstable
Sage Weil
12:31 AM Revision ce050ef6 (ceph): Create cpp_strerror to make error reporting easier
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:31 AM Revision dec5b787 (ceph): errno: add missing common/errno.h
Sage Weil
12:31 AM Revision 881bf02d (ceph): include/utime.h: should include include/types.h
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:31 AM Revision 399d31fa (ceph): test_lost.sh: ensure that recovery doesn't start.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:31 AM Revision 831075c4 (ceph): osd: PG::prior_set_affected: fix lost OSD detection
When looking for newly-lost OSDs, we should check prior_set_lost rather
than prior_set. Down OSDs often are in PG::pr...
Colin Patrick McCabe
12:31 AM Revision 17c615c0 (ceph): osd: build_prior: clean up started_since_joining
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:31 AM Revision 3cbeaa14 (ceph): prior_set_affected: log msg when we see a lost osd
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:31 AM Revision 7207476e (ceph): PG::recover_master_log: replace count with find
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:31 AM Revision 96692d24 (ceph): messenger: Make sure to unlock existing->pipe_lock. There are a few cas...
Greg Farnum
12:31 AM Revision ad270f91 (ceph): osd: test: Add script to test LOST state
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:31 AM Revision dc18e7a0 (ceph): osd mon: validate arguments before marking lost
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:31 AM Revision 3e4e73f2 (ceph): OSDMap::print: print osd_info_t using ostream op
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:31 AM Revision 794cf707 (ceph): osd: fix spacing in OSDMap::print
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:31 AM Revision e3a53bbf (ceph): osd: track prior_set_lost
In the placement group code, track prior_set_lost. This fixes a bug
where a new OSDMap updates an OSD's lost_at time,...
Colin Patrick McCabe
12:31 AM Revision dca856d1 (ceph): PG::build_prior: update comment
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:31 AM Revision f812f7eb (ceph): OSDMap: const cleanup
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:31 AM Revision 1d8d744e (ceph): test_lost.sh: update timeout, fix payload
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:31 AM Revision 10ec8ce5 (ceph): Timer.cc: add testtimers
Add testtimers to test the timer code.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe
12:31 AM Revision 17b8b0d7 (ceph): TestTimers: test SafeTimer as well as Timer
Test SafeTimer as well as Timer. Test timer shutdown.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe
12:31 AM Revision 9c59a6aa (ceph): debian: 0.22-4
Sage Weil
12:31 AM Revision 4d1b9e69 (ceph): makefile: simplify cdebugpack install rule
Sage Weil
12:31 AM Revision 3d94b6af (ceph): FileJournal: fix journal size calculation
If the journal is a raw block device, the user shouldn't need to give a
journal size argument most of the time-- it s...
Colin Patrick McCabe

10/21/2010

11:16 PM Revision acc2e4de (ceph): mds: show readdir frag
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
11:16 PM Revision e380fc2e (ceph): client: reset fg after _readdir_get_frag
The _readdir_get_frag may remap our frag; update the local variable
accordingly.
Signed-off-by: Sage Weil <sage@newd...
Sage Weil
11:15 PM Revision 0abf57b6 (ceph): client: fix skipped dentry on readdir chunk boundaries
The at_cache_name is the last name successfully passed to the caller.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
11:15 PM Revision 59426bdd (ceph): client: fix dcache removal during multiple frags
We remove unexpected dentries from our cache while processing mds results.
Results are ordered within a frag, but not...
Sage Weil
11:13 PM Revision 6c2f0f07 (ceph): client: show file offsets in hex
This makes it easy to pick out frags and offsets.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
10:50 PM Revision 28d89928 (ceph): messenger: a 0 timeout on ::poll really means don't wait
(as opposed to -1, which waits until an event occurs).
So, set the default timeout to -1, and convert ms_tcp_read_ti...
Greg Farnum
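The timeout semantics described in this commit message can be seen with a small sketch. It uses Python's `select.poll` (a wrapper over the POSIX call, so it assumes a POSIX system) rather than the messenger code itself; the pipe and variable names are made up for the demonstration.

```python
import os
import select
import time

# A pipe with nothing written to it: the read end is not readable yet.
read_fd, write_fd = os.pipe()
poller = select.poll()
poller.register(read_fd, select.POLLIN)

# Timeout 0: check once and return immediately, even though nothing is ready.
events_zero = poller.poll(0)

# Positive timeout (milliseconds): wait about that long, then give up.
start = time.monotonic()
events_timed = poller.poll(50)
waited = time.monotonic() - start

# Negative timeout: block until an event occurs. Data is written first
# so that this demonstration returns instead of hanging.
os.write(write_fd, b"x")
events_block = poller.poll(-1)

os.close(read_fd)
os.close(write_fd)
```

A default of 0 therefore turns a read wait into a busy check that never sleeps, which is presumably why the commit switches the default to -1.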
08:07 PM Revision 32ba7760 (ceph): mds: fix inodestat encoding when frags are present
Also simplify the max_size check calculation.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
07:48 PM Revision cb82eb59 (ceph): mds: do not finish_scatter_gather_update_accounted on dirfraglock
This needs to match finish_scatter_gather_update, and we don't
update/project the dirfrag there.
Signed-off-by: Sage...
Sage Weil
06:37 PM Revision 814f9dbd (ceph): objecter: reconnect on osd disconnect
If the connection closes to an OSD, we need to reconnect and resubmit our
ops. Otherwise we just hang. This is prob...
Sage Weil
06:18 PM Revision 34da1ac8 (ceph): rgw: return 204 on successful removal of bucket/object
Yehuda Sadeh
06:18 PM Revision 44c78634 (ceph): init-ceph: Make sure daemon_is_running() checks the correct instance
When starting multiple instances of a daemon on a single host,
for unknown reasons /var/run/ceph/$type.$id.pid can ho...
Jim Schutt
06:14 PM Revision 78660cd6 (ceph): objecter: pause writes when FULL flag is set
Also, subscribe to all osdmap updates while FULL flag is set, so that we
discover when it is unset.
Signed-off-by: S...
Sage Weil
06:14 PM Revision 66e493dd (ceph): objecter: always set READ or WRITE flag
We should set either (or both). Assert if we don't.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:56 PM Revision 58f2f375 (ceph): include/utime.h: should include include/types.h
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:53 PM Revision c1f2f9a1 (ceph): rgw: return 204 on successful removal of bucket/object
Yehuda Sadeh
04:07 PM Bug #505 (In Progress): osd assert on flab
The error that exposed this was introduced in commit:8528ebb0c6286eb6660773fcaf29d1cccd98d72c, but the root cause is ... Greg Farnum
03:36 PM Bug #513 (Closed): limited xattrs length
When we have to use really large xattrs, we hit a limitation of the underlying fs. Need to figure o... Yehuda Sadeh
03:04 PM Bug #512 (Resolved): rados_initialize returns 0 when ceph.conf contains no monitors
If ceph.conf contains no monitors, calling rados_initialize prints "unable to find any monitors in conf", doesn't act... John Leach
11:56 AM Feature #511 (Resolved): librados: implement flush
Just wait for any previous writes to complete. Sage Weil
11:35 AM Bug #506 (Resolved): objecter: handle disconnects from osds
Actually, it wasn't handling osd reconnects at all. Doh.
Fixed by commit:814f9dbdc57238d4e10c8e93fc298e9d3744516b
Sage Weil
11:16 AM Bug #510 (Resolved): objecter: (optionally) honor osdmap full flag
commit:78660cd6ebd9456a26df10c39a13226267061745 Sage Weil
10:18 AM Bug #510 (Resolved): objecter: (optionally) honor osdmap full flag
We don't want to honor it on the MDS, but we do for librados etc. Make it optional. Sage Weil
10:16 AM Bug #496 (Closed): osd: OSDMap::decode / PG::read_log
See #502. Closing this one out. Sage Weil
10:09 AM Feature #509 (Resolved): assimilate ceph gui code
Michael McThrow has written a simple ceph gui with similar functionality to 'ceph -w', but based on an old version of... Sage Weil
12:30 AM Revision a5f6da43 (ceph): errno: add missing common/errno.h
Sage Weil

10/20/2010

11:38 PM Revision 1127e47c (ceph): Create cpp_strerror to make error reporting easier
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
10:27 PM Revision 18b1f78b (ceph): FileJournal: fix journal size calculation
If the journal is a raw block device, the user shouldn't need to give a
journal size argument most of the time-- it s...
Colin Patrick McCabe
08:47 PM Revision 9e3607fe (ceph): debian: 0.22-4
Sage Weil
08:47 PM Revision 9b4ec49c (ceph): makefile: simplify cdebugpack install rule
Sage Weil
07:14 PM Revision 6620a5a8 (ceph): TestTimers: test SafeTimer as well as Timer
Test SafeTimer as well as Timer. Test timer shutdown.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe
07:13 PM Revision 1b0cf69b (ceph): Timer.cc: add testtimers
Add testtimers to test the timer code.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe
04:24 PM Revision 1c6349c9 (ceph): Merge remote branch 'origin/testing' into unstable
Greg Farnum
04:10 PM Revision 99013bad (ceph): mon: Don't force a wait of paxos_propose_interval seconds on every commit.
Instead, we wait
1) Until last_commit_time + paxos_propose_interval
2) If past paxos_propose_interval, for paxos_min_...
Greg Farnum
01:48 PM Tasks #508 (Closed): test hadoop on sepia
There are two options: the userland version (see http://ceph.newdream.net/wiki/Hadoop_FileSystem) and the kernel vers... Sage Weil
09:56 AM Bug #505: osd assert on flab
Sage thinks it's a problem in reconnect, where the messenger is dropping messages which causes the OSD assert. Greg Farnum
09:53 AM Bug #474 (Resolved): mon: improve paxos commit batching
Pushed a fix to unstable in commit:99013badb676986deb82757b77d91d0aa1f54cc9.
Instead of waiting g_conf.paxos_propose...
Greg Farnum

10/19/2010

10:55 PM Revision 197928c2 (ceph): Objecter::shutdown(): call SafeTimer::Join()
Objecter::shutdown() needs to call Timer::join() to ensure that
concurrently executing events in other threads get f...
Colin Patrick McCabe
10:33 PM Bug #507 (Resolved): objectcacher mixes pool namespaces
ObjectCacher uses a single object map for all objects, regardless of pool. Whoops. Sage Weil
09:44 PM Revision 5aca7285 (ceph): btrfs_ioc_test.c: added a unit test
Yehuda Sadeh
05:23 PM Bug #505: osd assert on flab
because:... Sage Weil
05:23 PM Bug #505: osd assert on flab
but osd.0 didn't see those two it skipped:... Sage Weil
05:20 PM Bug #505: osd assert on flab
osd.1 is replying out of order:... Sage Weil
12:14 PM Bug #505 (Resolved): osd assert on flab
When running a few tests with radostool, I hit an assert in the OSD:
(12:12:48 PM) colinm@newdream.net/: osd/Repli...
Colin McCabe
05:05 PM Bug #504 (Resolved): hang when using radostool
The second issue looks like a transient osd issue.
Closing this for now, but we should keep an eye out for it happ...
Sage Weil
04:03 PM Bug #504: hang when using radostool
Perhaps 197928c26cec52e0f3f91e930988b1e5767e355b will resolve the radostool shutdown race condition.
The second ba...
Colin McCabe
12:06 PM Bug #504 (Resolved): hang when using radostool
I was adding some objects using radostool, when I got an unexplained hang. It looked like this:
gdb -p 19724
(g...
Colin McCabe
05:05 PM Revision dac9ecd0 (ceph): SimpleMessenger::Pipe::Accept(): fix open
When not replacing an existing pipe, zero the 'existing' pointer.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe
04:36 PM Revision b55af75f (ceph): Revert "Revert "messenger: introduce a "halt_delivery" flag, checked by...
This reverts commit d44267c2d6a77d4a3cda1e44ec7c58a19be51cc4.
The problem with this code was that it's possible for t...
Greg Farnum
02:55 PM Bug #506 (Resolved): objecter: handle disconnects from osds
The kclient is smart about osd disconnect: if there are outstanding requests, it reopens the connection. Objecter do... Sage Weil
12:16 PM Bug #501: unexpected lockdep crash during vstart.sh
I believe that the second crash I saw should be fixed by dac9ecd0e05f75744fd0f10ae51ec1d92e9931c1.
Resolved.
Colin McCabe
11:12 AM Bug #479: ceph/mount crash badly when writing
What's the exact client version and kernel that you're running on?
Please do the following under ceph-client-standalo...
Yehuda Sadeh
01:02 AM Bug #479: ceph/mount crash badly when writing
Continuing with the ext4
I have set no-journal mode by commenting out the two lines in the ceph
; osd journal = /da...
DongJin Lee
10:20 AM Bug #484 (Resolved): msgr: crash on just-closed pipe
I think the issue is in SimpleMessenger::Pipe::connect()... Greg Farnum

10/18/2010

08:29 PM Revision c0db71fb (ceph): debian: update standards-version; fix ceph-client-tools-dbg
Sage Weil
08:29 PM Revision aab2a360 (ceph): debian: sign/publish specific deb version
Sage Weil
08:29 PM Revision fd42c852 (ceph): filestore: deliberate crash on ENOSPC or EIO
Neither of these are handled, so crash when we hit them. This ensures we
don't blindly continue on with a partially ...
Sage Weil
08:28 PM Revision 19c2c833 (ceph): filestore: deliberate crash on ENOSPC or EIO
Neither of these are handled, so crash when we hit them. This ensures we
don't blindly continue on with a partially ...
Sage Weil
08:02 PM Revision 781874f0 (ceph): messenger: Make sure to unlock existing->pipe_lock.
There are a few cases in the "open" section where we can go to
fail_unlocked while still holding existing->pipe_lock....
Greg Farnum
05:19 PM Revision 1b2e9927 (ceph): debian: update scripts to do packaging fixes
Sage Weil
04:59 PM Bug #479: ceph/mount crash badly when writing
Looks like some issue with the journal:
2010-10-19 11:42:43.918144 7ffae0acf710 journal room 3928063 max_size 1048...
Yehuda Sadeh
03:58 PM Bug #479: ceph/mount crash badly when writing
Update: more concise setup :)
I created simple four files; file1 (1MiB), file10 (10MiB), file100 (100MiB), file1000 ...
DongJin Lee
03:27 PM Bug #479: ceph/mount crash badly when writing
Thanks.
I've run the above lines, no crash. But there was nothing in the osdc.
I'm unsure what output to expect fro...
DongJin Lee
02:08 PM Bug #479: ceph/mount crash badly when writing
From the osdc.txt, it looks as if none of the IOs are actually flushing to disk. Can you do a simple test like
<pre...
Sage Weil
03:54 PM Bug #501: unexpected lockdep crash during vstart.sh
I applied the fix, but then I got a different crash in cmon:
#0 0x0000000000000000 in ?? ()
#1 0x0000000000719b...
Colin McCabe
11:24 AM Bug #501 (Resolved): unexpected lockdep crash during vstart.sh
Looks like it's caused by trying to pipe_lock.Lock() while holding existing->pipe_lock.
Should be fixed in commit:4b...
Greg Farnum
11:01 AM Bug #501 (Resolved): unexpected lockdep crash during vstart.sh
I was running the unstable branch, at commit 1190313ae954f12f9b5bc364e1226d6d2440880c.
To test, I was running "vstar...
Colin McCabe
03:13 PM Bug #354 (Resolved): Detect errors during transactions
EIO and ENOSPC now checked. Sage Weil
02:23 PM Linux kernel client Tasks #499 (Resolved): avoid dcache_lock inside i_lock, if possible
Sage Weil
11:54 AM Linux kernel client Tasks #499: avoid dcache_lock inside i_lock, if possible
see commit:95c9f6141d0d4af18dd41165cc4e5a1d0fc10f57 ? Sage Weil
08:51 AM Linux kernel client Tasks #499: avoid dcache_lock inside i_lock, if possible
Mmm, see this friendly thread: http://marc.info/?t=128721715100001&r=1&w=2 Sage Weil
01:41 PM Bug #503 (Closed): osd: query osds since last_epoch_clean before concluding objects lost?
We currently query prior_set osds through last_epoch_started. This gives us the latest log and version. But if we ar... Sage Weil
01:33 PM Linux kernel client Tasks #422 (Resolved): update ceph-client-standalone.git for multiple modules
Sage Weil
01:32 PM Linux kernel client Bug #502 (Won't Fix): honor osdmap FULL flag
We should return ENOSPC (presumably) if attempting to write to a full osd cluster.
This needs to go somewhere in o...
Sage Weil
01:29 PM Bug #496: osd: OSDMap::decode / PG::read_log
See commit:fd42c8527be21923d633b253a3260e1e600c1853 and commit:19c2c8332915c323defb2ff2e62bee2e7a3db845. These will ... Sage Weil
01:14 PM Bug #496: osd: OSDMap::decode / PG::read_log
This all looks like fallout from a full disk and failed writes.
The unstable branch has some code to handle corrup...
Sage Weil
12:47 PM Linux kernel client Bug #497 (Closed): (no request) in /sys/kernel/debug/ceph/*/mdsc?
The stalled reconnect was probably #498.
Sage Weil
10:19 AM Bug #484 (In Progress): msgr: crash on just-closed pipe
Apparently the playground was failing because nobody could connect to the monitors, and with this commit reverted it ... Greg Farnum
08:37 AM CephFS Bug #500 (Closed): mds: FAILED assert("shouldn't be called if we are already xlockable" == 0)
nevermind, old code. Sage Weil
08:36 AM CephFS Bug #500 (Closed): mds: FAILED assert("shouldn't be called if we are already xlockable" == 0)
... Sage Weil
03:15 AM Revision d44267c2 (ceph): Revert "messenger: introduce a "halt_delivery" flag, checked by queue_d...
This reverts commit 69be0df61d29a093dbeadf6dbcd4e18b429d0a22. Sage Weil
03:04 AM Revision 69b764a8 (ceph): mon: add 'mds rm <gid>' and 'mds rmfailed <id>' commands
For cleaning up the mds map when things get weird.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
03:00 AM Revision ce09cbdd (ceph): Merge remote branch 'origin/testing' into testing
Sage Weil

10/17/2010

09:18 PM Linux kernel client Tasks #499 (Resolved): avoid dcache_lock inside i_lock, if possible
... Sage Weil
08:50 PM Linux kernel client Bug #498 (Can't reproduce): reconnect sends string with NULL?
I saw this with the latest v0.22 in the mds log:... Sage Weil
08:48 PM Linux kernel client Bug #497 (Closed): (no request) in /sys/kernel/debug/ceph/*/mdsc?
saw this after an mds restart on ladder0:... Sage Weil

10/16/2010

01:59 AM Bug #496 (Closed): osd: OSDMap::decode / PG::read_log
This morning I found out that 4 of my 12 OSDs had crashed at almost exactly the same moment, all with the following ... Wido den Hollander

10/15/2010

11:43 PM Revision 1190313a (ceph): Merge branch 'rc' into unstable
Conflicts:
configure.ac
src/mds/ScatterLock.h
Sage Weil
10:34 PM Revision 2bc159e6 (ceph): debian: no libgoogle-perftools-dev on lenny
Sage Weil
10:34 PM Revision 8a7c95f6 (ceph): v0.22
Sage Weil
08:41 PM Revision d8ee92a6 (ceph): mds: take nestlock wrlock when projecting rstat into dirfrag
We were already checking that we _can_ wrlock before doing the rstat
projection (if we can't, we mark_dirty_rstat() o...
Sage Weil
08:41 PM Revision 0e472d4a (ceph): mds: use correct helper when pinning past snaprealm parent
The helper also updates the SnapRealm::open_past_parents, which is needed
for the have_past_parents_open() check.
Tha...
Sage Weil
08:41 PM Revision b8ab009a (ceph): mds: cleanup: print waiter masks in hex
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:41 PM Revision 180f4412 (ceph): mds: cleanup: clarify issue_seq in cap release debug output
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:21 PM Revision 8528ebb0 (ceph): messenger: introduce timeouts on pipes.
This will return read errors on a pipe if it gets no data
for the given period of time (default 15 minutes). In a sta...
Greg Farnum
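A minimal sketch of the idea behind a read timeout on a pipe, assuming a plain POSIX file descriptor. The helper name read_with_timeout and its ETIMEDOUT convention are hypothetical and are not taken from the Ceph messenger; the commit itself only states that reads fail after a period with no data.

```cpp
#include <cassert>
#include <cerrno>
#include <poll.h>
#include <unistd.h>

// Hypothetical helper, not the messenger's real read path: wait up to
// timeout_ms for data on fd, then read at most len bytes.
// Returns the byte count, 0 on EOF, or -1 with errno set
// (ETIMEDOUT when no data arrived within the timeout).
ssize_t read_with_timeout(int fd, char *buf, size_t len, int timeout_ms) {
  assert(buf != nullptr && timeout_ms >= 0);
  struct pollfd pfd;
  pfd.fd = fd;
  pfd.events = POLLIN;
  pfd.revents = 0;
  int r;
  do {
    // Note: a signal interruption restarts the full timeout in this sketch.
    r = poll(&pfd, 1, timeout_ms);
  } while (r < 0 && errno == EINTR);
  if (r < 0)
    return -1;            // poll itself failed; errno is already set
  if (r == 0) {
    errno = ETIMEDOUT;    // nothing readable within the timeout
    return -1;
  }
  return read(fd, buf, len);
}
```

With the 15 minute default mentioned in the commit, a caller would pass 15 * 60 * 1000 as timeout_ms and treat a -1 return as a failed pipe.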
05:41 PM Revision 6e1eeac3 (ceph): rgw: small cleanup
Yehuda Sadeh
05:41 PM Revision b378cb48 (ceph): Add RGW_PRINT_CONTINUE to control whether we print the 100-continue header
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Wido den Hollander
05:17 PM Revision 32e790cf (ceph): conf: only set sig handler if wasn't set already
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Yehuda Sadeh
04:12 PM Bug #481 (Resolved): cosd leaking messenger threads
Sage Weil
11:46 AM Feature #169: osd: start up despite corrupted pg log(s)
Sage Weil
11:15 AM CephFS Cleanup #493 (Rejected): mds: allow scatter_pinned inode to go from mix -> sync
We're going to kill scatter_pins instead, see #495 Sage Weil
11:13 AM CephFS Feature #495 (Resolved): mds: add MIX_STALE
... Sage Weil
10:40 AM rgw Bug #439 (Resolved): Duplicate "Status" headers being sent
Applied patch, commit:b378cb4899e78d1d1e5b81f376a2536e56fe54c4. Resolving. Yehuda Sadeh
10:16 AM Bug #494 (Resolved): reentrant sigabort handler?
Yehuda Sadeh
10:16 AM Bug #494: reentrant sigabort handler?
We set the signal handler multiple times (probably due to injectargs). Fixed by commit:32e790cf03c80b71cd224cf9c2e284... Yehuda Sadeh
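One way to avoid installing a handler more than once can be sketched as follows. This is an illustration only: install_handler_once and example_handler are hypothetical names, and checking for the default disposition is an assumption about how "already set" might be detected, not a description of what the referenced commit does.

```cpp
#include <cassert>
#include <signal.h>

// Hypothetical helper: install `handler` for `signum` only while the signal
// still has its default disposition. Returns true if the handler was installed.
bool install_handler_once(int signum, void (*handler)(int)) {
  assert(handler != nullptr);
  struct sigaction current;
  if (sigaction(signum, nullptr, &current) != 0)
    return false;                 // could not query the current disposition
  if (current.sa_handler != SIG_DFL)
    return false;                 // something is already installed; leave it
  struct sigaction sa = {};
  sa.sa_handler = handler;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = 0;
  return sigaction(signum, &sa, nullptr) == 0;
}

// Placeholder handler used only for illustration.
void example_handler(int) {}
```

A second call for the same signal becomes a no-op, so re-running configuration code would not replace or stack handlers.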
07:43 AM Bug #494 (Resolved): reentrant sigabort handler?
Something is amiss here? This was triggered by a regular assertion failure, on ceph version 0.22~rc (commit:60bfc670... Sage Weil
07:02 AM rbd Bug #489 (Closed): Memory leak when doing a lot of I/O
I've got two rsyncs running right now (Debian CD and kernel.org pub) without any problems at all. Memory usage is st... Wido den Hollander
05:47 AM rbd Bug #489: Memory leak when doing a lot of I/O
I'm positive I used the latest version. I just backported qemu-kvm from Ubuntu 10.10 (Maverick) which is Qemu-kvm ver... Wido den Hollander
06:01 AM Bug #490: Cluster stays in a degraded state
Thanks, this shows me a lot of information, but it's not clear what tells me which PG is degraded.
Just checked my OS...
Wido den Hollander
03:06 AM Revision dfc46f5e (ceph): mon: do not assert if paxosv < monmap->epoch
Signed-off-by: Sage Weil <sage@newdream.net> Henry C Chang
03:06 AM Revision 406648e1 (ceph): mon: do not delete mon->monmap which is not created by new
Signed-off-by: Sage Weil <sage@newdream.net> Henry C Chang

10/14/2010

10:18 PM Revision 94c96fa8 (ceph): Merge remote branch 'origin/osd_pglog_checksums' into unstable
Sage Weil
10:07 PM Revision 04189f84 (ceph): mds: fix can_scatter_pin() to be only SYNC and MIX
Those are the only states where the replica can effectively prevent the
lock from cycling in a way that would force a...
Sage Weil
10:06 PM Revision 9a8f1ad8 (ceph): object store: create OP_COLL_RENAME operation
The OP_COLL_RENAME operation is used to rename collections.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe
09:48 PM CephFS Bug #329: mds: mislinked dentry found during journal replay
I suspect the solution (for the clustered case) is something like:
- trim_non_auth and a _subtree_ when we replay...
Sage Weil
09:39 PM CephFS Bug #329: mds: mislinked dentry found during journal replay
This can come up with multiple MDSs. (Wido saw it with one MDS; not sure how that happened.)
With multiple MDSs, ...
Sage Weil
09:46 PM Revision 039a86f7 (ceph): doc: add object_store.dot
Add object_store.dot. This graph is a rough sketch of the dependencies
between modules in the object store.
Signed-o...
Colin Patrick McCabe
09:42 PM Revision 966a5b84 (ceph): conf: actually handle long long config options from conf file
Yehuda Sadeh
09:31 PM Tasks #417 (Resolved): update wiki article on mon cluster expansion for v0.22 and monitor naming ...
Sage Weil
07:09 PM Revision ad12d5d5 (ceph): Fix bug #487: osd: fix hang during mkfs
If the user has turned on journalling, but left osd_journal_size at 0,
normally we would use the existing size of the...
Colin Patrick McCabe
07:09 PM Revision 17de417f (ceph): FileJournal.h: add attribute __packed where needed
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:05 PM Revision 69be0df6 (ceph): messenger: introduce a "halt_delivery" flag, checked by queue_delivery.
Defaults to false, is set to true by destroy_queue. Greg Farnum
04:24 PM rbd Bug #489: Memory leak when doing a lot of I/O
I do see the memory going up when running on 0.12.3, but not when running with the original version (the one on the r... Yehuda Sadeh
04:16 PM rbd Bug #489: Memory leak when doing a lot of I/O
This patch doesn't compile (at least on my system). Are you sure you got the latest version running? Yehuda Sadeh
12:10 PM rbd Bug #489 (Closed): Memory leak when doing a lot of I/O
I have a virtual machine with the following configuration:... Wido den Hollander
03:14 PM Bug #484 (Resolved): msgr: crash on just-closed pipe
Sage Weil
11:10 AM Bug #484: msgr: crash on just-closed pipe
Okay, make that commit:69be0df61d29a093dbeadf6dbcd4e18b429d0a22.
Adds a halt_delivery flag instead.
Greg Farnum
10:39 AM Bug #484: msgr: crash on just-closed pipe
Pretty sure this was dealt with by commit:587d1d5b42c378ebc8ede04e9bc72d260ed04f93, which makes destroy_queue check f... Greg Farnum
03:07 PM CephFS Cleanup #493 (Rejected): mds: allow scatter_pinned inode to go from mix -> sync
Sage Weil
02:30 PM Bug #492 (Rejected): osd: do not remove divergent objects
Instead of blindly removing divergent objects, we should try to be smart about recovery. Overwrite them with a diffe... Sage Weil
01:02 PM Bug #490: Cluster stays in a degraded state
ceph pg dump -o -
should let you know which PGs are degraded. If you're still running Cephx and having issues betwee...
Greg Farnum
12:19 PM Bug #490 (Can't reproduce): Cluster stays in a degraded state
My cluster is staying in a degraded state for the last few days.... Wido den Hollander
12:26 PM Bug #491 (Can't reproduce): osd: pg incorrectly going active
This wiped out some data on ceph-playground:... Sage Weil
12:08 PM Bug #487 (Resolved): osd: fix hang during mkfs
Fixed by ad12d5d5be41ce740dfb8a6084484858d40898cc
cheers,
C.
Colin McCabe
11:15 AM Bug #487: osd: fix hang during mkfs
I can reproduce every time by not specifying any osd journal size in my ceph.conf. Colin McCabe
10:46 AM Feature #488 (Resolved): osd: prehash pg content into subcollections
We want to pre-hash pg content into subcollections (subdirs) based on the same hash we map objects into pgs with, so ... Sage Weil
10:36 AM Bug #482 (Closed): cephx assert
We decided this was caused by walking off into the weeds due to #484. Sage Weil

10/13/2010

08:29 PM Bug #487: osd: fix hang during mkfs
This was on the testing branch.
Need to confirm the source of the problem and fix in testing; we'll merge it into ...
Sage Weil
08:25 PM Bug #487 (Resolved): osd: fix hang during mkfs
Ted writes on ML:... Sage Weil
07:11 PM Revision 60bfc670 (ceph): osd: fix MOSDBoot versioning
1 is what it was before; make it 2.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
07:08 PM Bug #460: OSD crash: ReplicatedPG::push_to_replica / Rb_tree
Sage Weil wrote:
> This shouldn't ever happen...
I have it happening on quite a few OSDs in my test cluster. Gen...
Tony Butler
06:20 PM Revision 0ff6e41d (ceph): RadosClient: clean up Rados::client use
Forward declare RadosClient in librados.hpp so that we don't have to use
so many typecasts in class Rados.
Signed-of...
Colin Patrick McCabe
05:33 PM Revision 36b61da5 (ceph): mds: SimpleLock and subclasses: const cleanup
Const cleanup for SimpleLock, ScatterLock, and LocalLock.
Make SimpleLock::get_state_name() nonvirtual, since nobody...
Colin Patrick McCabe
05:33 PM Revision d5d45039 (ceph): lists templates: const cleanup
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
05:09 PM Revision 7f493a11 (ceph): qa: add ffsb
Sage Weil
04:40 PM Bug #484 (In Progress): msgr: crash on just-closed pipe
Okay, it looks like there is a race between dispatch_entry and discard_queue. I'll patch that today, but I'd like to ... Greg Farnum
01:17 PM Bug #484 (Resolved): msgr: crash on just-closed pipe
The log shows the pipe 0x7fea08000c30 was just marked down:... Sage Weil
03:50 PM Revision e6d28ce3 (ceph): prefix git sha1 with commit:
This just makes it into a link when pasted directly into redmine.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
03:34 PM Bug #479: ceph/mount crash badly when writing
Update.
re-ran again, this time capturing sys/kernel/debug/ceph/*/
briefly,
10:55:00 - start the ceph (sudo m...
DongJin Lee
03:21 AM Bug #479 (Can't reproduce): ceph/mount crash badly when writing
ceph version 0.23~rc (a7ed2ee05dc7453942018d7876401c28d3918214)
kclient master-backport
Linux ss1 2.6.36-020636rc7-...
DongJin Lee
02:56 PM Bug #481 (In Progress): cosd leaking messenger threads
The problem here is that tcp_read never times out, and OSDs don't write to sessions unless they're replying to someth... Greg Farnum
08:40 AM Bug #481: cosd leaking messenger threads
see ballpit3:/tmp/a
Sage Weil
08:40 AM Bug #481 (Resolved): cosd leaking messenger threads
600 threads on ballpit3, running 0.22~rc, almost all messenger threads. Sage Weil
02:19 PM Subtask #486 (Resolved): osd: make scrub not block writes
The overarching goal is to make scrub interact with writes. I think currently it holds the pg lock the whole time an... Sage Weil
02:13 PM Subtask #485 (Resolved): osd: cooperative scrub scheduling
Each OSD probably needs some concurrency target (max concurrent scrubs). And a counter that indicates how many are i... Sage Weil
08:55 AM CephFS Feature #483 (Resolved): mds: add timestamp to LogEvent
Would be nice if every log event had an mtime associated with it. Sage Weil
08:47 AM Bug #482 (Closed): cephx assert
commit:e5882981b55f3c74d6b8b22a2bf5fbec81b775e6... Sage Weil
08:02 AM Linux kernel client Tasks #480 (Resolved): rebase btrfs snapshot ioctls, resend to list
Sage Weil
01:55 AM CephFS Bug #478 (Can't reproduce): MDS crash: LogEvent::decode()
On both my MDSes I'm seeing the following crash:... Wido den Hollander

10/12/2010

10:26 PM Revision dc295a37 (ceph): mds: don't assert on mismatched rbytes
Sage Weil
10:15 PM Revision 53decffc (ceph): Merge branch 'testing' into rc
Sage Weil
10:15 PM Revision f35bdc28 (ceph): add rc to release.sh
Sage Weil
09:42 PM Revision 098a4931 (ceph): mdsmonitor: remove unused variable
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:35 PM Revision fbb5a457 (ceph): mon: add 'ceph health' command
Create MDSMonitor::get_health and OSDMonitor::get_health to check the
health of the MDSes and OSDes, respectively.
S...
Colin Patrick McCabe
08:59 PM Revision 219b4764 (ceph): mds: fix const-ness of is_dirty()
This was fixed before, got lost somehow.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
07:42 PM Revision df265a22 (ceph): mon: don't include endl on clock drift warning
Greg Farnum
06:17 PM Revision dead368d (ceph): Makefile: add cdebugpack.in to EXTRA_DIST
Sage Weil
02:49 PM Revision 53fe418d (ceph): mds: MDCache should adjust_nested_anchors once the op's been logged.
Fixes crashes from assert(nested_anchors >= 0) failures
when updating at the wrong point.
Greg Farnum
02:49 PM Revision c56ab53f (ceph): mds: Locker::local_wrlock_finish now calls finish_waiters!
Fixes a bug that could cause requests to hang since they were
put to sleep and never woken up.
Greg Farnum
02:49 PM Revision 4ba060cc (ceph): mds: CInode doesn't always call assimilate_dirty_rstate_inodes_finish
This was causing a mis-match in the projection code, since
assimilate_...finish() calls pop_and_dirty_projected_inode...
Greg Farnum
02:49 PM Revision b438b3d6 (ceph): mds: Fix projection in rename code paths.
We aren't actually projecting the inode unless destdn->is_auth(),
so check for that before projecting the snaprealm (...
Greg Farnum
02:37 PM Cleanup #430 (Resolved): make simple 'ceph mon stat' check syntax
Colin McCabe
02:36 PM Cleanup #430: make simple 'ceph mon stat' check syntax
Implemented "ceph health" in fbb5a457bacc656cd
The format is:
"HEALTH_OK|HEALTH_WARN|HEALTH_ERR <free-text-string...
Colin McCabe
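A monitoring script could consume that output roughly as sketched below. The format string above is truncated, so this assumes a status token followed by a single space and free text; HealthStatus, HealthReport, and parse_health_line are hypothetical names and not part of Ceph.

```cpp
#include <cassert>
#include <string>

// Hypothetical types for illustration only.
enum class HealthStatus { Ok, Warn, Err, Unknown };

struct HealthReport {
  HealthStatus status;
  std::string detail;   // free text after the status token, possibly empty
};

// Split one line of `ceph health` output into a status and a detail string.
// Assumes "<STATUS>[ <free text>]" on a single line.
HealthReport parse_health_line(const std::string &line) {
  assert(line.find('\n') == std::string::npos);
  const std::string::size_type sp = line.find(' ');
  const std::string token = line.substr(0, sp);
  HealthReport report;
  report.detail = (sp == std::string::npos) ? "" : line.substr(sp + 1);
  if (token == "HEALTH_OK")
    report.status = HealthStatus::Ok;
  else if (token == "HEALTH_WARN")
    report.status = HealthStatus::Warn;
  else if (token == "HEALTH_ERR")
    report.status = HealthStatus::Err;
  else
    report.status = HealthStatus::Unknown;  // unrecognized output
  return report;
}
```

Keeping an explicit Unknown value lets a caller alert on unexpected output instead of treating it as healthy.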
10:41 AM Linux kernel client Bug #473: Kernel panic: ceph_pagelist_append
It looks like it wasn't the master branch, but some outtake from the unstable branch, probably commit:53f05210b418eaa... Yehuda Sadeh
09:43 AM Linux kernel client Bug #473: Kernel panic: ceph_pagelist_append
commit:299ef41b70e26e6725073c2d0f85e5da7aa547d0 touches similar code, although it's not clear to me that it could cau... Sage Weil
10:08 AM Linux kernel client Bug #477 (Can't reproduce): kernel BUG at fs/inode.c:295
On the playground machine, kernel version 2.6.36-rc3.
client commit 5954ea853b08105190d960032aa33cc339b2a3f1
[601...
Yehuda Sadeh
09:41 AM Linux kernel client Bug #464 (Resolved): fix bdi warning
fixed upstream Sage Weil
04:25 AM Revision fc609846 (ceph): mds: avoid EXCL if mds_caps_wanted in _do_cap_update
The file_excl() trigger asserts mds_caps_wanted is empty. The caller
shouldn't call it if that's the case. If it is...
Sage Weil
04:13 AM Revision fa2c371f (ceph): mds: bump dirstat.version during link/unlink/mtime update
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
03:57 AM Revision 9e5a203d (ceph): mds: fix get_xlock() assert on slave xlock
If we do a slave request xlock, the state is LOCK, not XLOCK. Weaken
the SimpleLock::get_xlock() assert accordingly....
Sage Weil
03:32 AM Revision f9b102e0 (ceph): mds: bump rstat version in predirty_journal_parents
When we propagate the rstat to inode in predirty_journal_parents (because
we hold the nestlock), bump the rstat versi...
Sage Weil

10/11/2010

08:54 PM Bug #376 (Can't reproduce): File corruption after cluster crashes
Sage Weil
05:50 PM CephFS Bug #472: mds: fragstat crash
Well, this seems to have gotten rid of the first assert issue -- and made pjd last a bit longer -- and it's a bit mor... Greg Farnum
04:50 PM CephFS Bug #472: mds: fragstat crash
let's try... Sage Weil
04:32 PM CephFS Bug #472: mds: fragstat crash
Applied patch you gave me. Got new crash:
#0 0x0000000000000000 in ?? ()
#1 0x0000000000a1e317 in sigabrt_handler...
Greg Farnum
09:51 AM CephFS Bug #472: mds: fragstat crash
Similarly:
#0 0x0000000000000000 in ?? ()
#1 0x0000000000a1e2e7 in sigabrt_handler (signum=6) at config.cc:238
#...
Greg Farnum
04:46 PM Cleanup #430: make simple 'ceph mon stat' check syntax
or just 'ceph health' Sage Weil
01:10 PM Cleanup #430: make simple 'ceph mon stat' check syntax
* probably want to call it 'ceph mon health'
* should check status of all components, not just monitor
Colin McCabe
01:10 PM Tasks #476 (Resolved): wiki page for adding mds
Looks good. Changed a couple things. Sage Weil
10:45 AM Tasks #476: wiki page for adding mds
Something like this? http://ceph.newdream.net/wiki/MDS_cluster_expansion
I've also grouped the cluster expanding ac...
Wido den Hollander
09:28 AM Tasks #476 (Resolved): wiki page for adding mds
Sage Weil
10:09 AM Bug #475 (Resolved): failed to parse ceph_options
Fixed by 566292a5871686e612b30bee58481db489b27bfb Colin McCabe
10:05 AM Bug #326 (Resolved): OSD crash PG::IndexedLog::unindex
fixed by commit:6bcda253e593b1f59f62a16798f56a92bdbbe0ab Sage Weil
09:44 AM Linux kernel client Bug #434: mds: clustered mds pjd failures
To reproduce, you need to turn on mds thrashing (mds thrash exports = 1 in ceph.conf).
However, I've yet to get thes...
Greg Farnum
01:40 AM Linux kernel client Bug #473: Kernel panic: ceph_pagelist_append
I'm not completely sure, I see my vmlinuz is from 30-09-2010, so about 12 days old.
*vmlinuz-2.6.36-rc5-rbd-20014-...
Wido den Hollander

10/10/2010

11:54 PM Bug #475: failed to parse ceph_options
System: 2 x Intel Xeon E5630 (8 cores), 16GB Ram
OS: Linux ss1 2.6.36-020636rc7-generic #201010070908 SMP Thu Oct 7 ...
DongJin Lee
09:32 PM Bug #475 (Resolved): failed to parse ceph_options
from ML... Sage Weil
08:27 PM Bug #474 (Resolved): mon: improve paxos commit batching
We should commit immediately if we haven't committed in the last 2 seconds. Currently we delay 2 seconds from the fi... Sage Weil
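The proposed rule can be sketched as a small delay calculation. This is only an illustration: commit_delay is a hypothetical function, and waiting out the remainder of the window when a commit happened recently is an assumption, since the report only states the "commit immediately" case.

```cpp
#include <cassert>

// Hypothetical sketch: how long to wait before committing a newly proposed
// change, given the current time and the time of the last commit (seconds).
double commit_delay(double now, double last_commit, double interval = 2.0) {
  assert(interval >= 0.0 && now >= last_commit);
  const double since_last = now - last_commit;
  if (since_last >= interval)
    return 0.0;                     // quiet for a while: commit right away
  return interval - since_last;     // assumed: wait out the rest of the window
}
```

Under this rule an isolated update sees no added latency, while a burst of updates is still batched into at most one commit per interval.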
08:05 PM Linux kernel client Bug #473: Kernel panic: ceph_pagelist_append
Do you know the commit id the client was running? Sage Weil

10/09/2010

11:35 AM Linux kernel client Bug #473: Kernel panic: ceph_pagelist_append
I checked my mds log, this shows:... Wido den Hollander
05:22 AM Linux kernel client Bug #473 (Can't reproduce): Kernel panic: ceph_pagelist_append
I was just doing a rsync of kernel.org, debian and ubuntu (simultaneous) and my client got a kernel panic.
The dme...
Wido den Hollander
04:22 AM Tasks #417: update wiki article on mon cluster expansion for v0.22 and monitor naming changes
Something like this? http://ceph.newdream.net/wiki/Monitor_cluster_expansion Wido den Hollander
12:23 AM Revision d2175ee8 (ceph): filestore: don't start commit if nothing new is _applied_
We were starting a commit if we had started a new op, but that left a
window in which the op could be being journaled...
Sage Weil
12:10 AM Revision a7ed2ee0 (ceph): mon: const crusade
Make print_summary, print, dump, etc. functions const methods.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe

10/08/2010

08:55 PM Revision 55370d3a (ceph): cdebugpack: update Makefile.am, add missing line
Yehuda Sadeh
07:16 PM Revision 3d9a93ed (ceph): mount.ceph: make -v a little more verbose
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
07:07 PM Revision 8efef663 (ceph): mount.ceph: const cleanup
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:09 PM Revision 566292a5 (ceph): mount.ceph: allow the user to omit ceph_options
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
05:44 PM CephFS Bug #472 (Resolved): mds: fragstat crash
see pudgy:/home/gregf/logs/fragstat_assert... Sage Weil
12:40 PM CephFS Cleanup #468 (Resolved): mds: use enum for LOCK_* in mds/locks.h
Sage Weil
12:15 PM Linux kernel client Bug #471: NULL pointer dereference __list_add+0x42/0x89 kick_requests+0x24/0x9e
Here's the full dmesg, fwiw:... Sage Weil
12:04 PM Linux kernel client Bug #471 (Can't reproduce): NULL pointer dereference __list_add+0x42/0x89 kick_requests+0x24/0x9e
On commit:0d328c1... Sage Weil
06:21 AM Revision 0b26f315 (ceph): mon: class library encodes/decodes activated class
This fixes bug #470 Yehuda Sadeh
01:12 AM Revision 932cfcbe (ceph): mount.ceph: add usage message
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
01:07 AM Revision 35c08d5f (ceph): mount.ceph: argument parsing cleanup
* Functions that are local to the file are now static
* Don't modify the string argument to mount_resolve_src / pars...
Colin Patrick McCabe

10/07/2010

11:56 PM Bug #460: OSD crash: ReplicatedPG::push_to_replica / Rb_tree
node07 and node12 are online again (about 12 hours). Wido den Hollander
01:55 PM Bug #460: OSD crash: ReplicatedPG::push_to_replica / Rb_tree
The problem here is that we don't have the snapset attr. This happens when there is no _head and no _snapset object.... Sage Weil
01:08 AM Bug #460: OSD crash: ReplicatedPG::push_to_replica / Rb_tree
I just saw this crash again.
Used "cdebugpack" to gather the right files.
Added "issue_460_node02.tar.gz" to th...
Wido den Hollander
11:19 PM Bug #470 (Resolved): Class gets disactivated
Fixed by commit:0b26f3153f7aa06b70ebbab7aa61887bfe634909. Yehuda Sadeh
04:23 PM Bug #470 (Resolved): Class gets disactivated
From time to time we see cases where classes lost their 'active' status. Might happen after restarting the monitors. Yehuda Sadeh
11:17 PM Revision 6679c274 (ceph): osd: move to boot state if down OR wrong address in map
Saw an OSD that was up in the map, but the address didn't match. Caused
all kinds of strange behavior. I'm not sure...
Sage Weil
11:17 PM Revision 6bcda253 (ceph): osd: loosen caller_ops asserts
The problem is that merge_log adds new items to the log before it unindexes
divergent items, and that behavior is nee...
Sage Weil
11:17 PM Revision 873095be (ceph): osd: fix merge_log cut point
Look at the eversion.version field (not the whole eversion) when deciding
what is divergent. That way if we have
ou...
Sage Weil
06:16 PM Tasks #441: reconfigure sepia cluster
All the daemons are running now! I'm still testing the stability of everything, of course.
Also, the make install ...
Colin McCabe
01:56 PM Tasks #441 (Resolved): reconfigure sepia cluster
Sage Weil
06:14 PM CephFS Cleanup #468: mds: use enum for LOCK_* in mds/locks.h
Implemented in the cleanup branch.
C.
Colin McCabe
04:47 PM Revision 6545f3ca (ceph): cdebugpack: behave when /bin/sh is dash
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:38 PM Revision af749e62 (ceph): cdebugpack: man page
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:31 PM Revision 9805eb5b (ceph): cdebugpack: include cdebugpack.XXXX dir in tarball
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:31 PM Revision 2c49ac4d (ceph): cdebugpack: include .tar.gz in usage filename
Sage Weil
04:25 PM Revision 3b1b8f89 (ceph): cdebugpack: include in deb, rpm
Sage Weil
02:52 PM Revision f10906b3 (ceph): mds: respawn (instead of suicide) on being marked down
This makes temporarily laggy daemons restart and rejoin the cluster
in standby mode.
Signed-off-by: Sage Weil <...
Sage Weil
02:52 PM Revision a2bcb419 (ceph): debug: always append to log
We were truncating if we were in log_per_instance mode. But normally those
logs don't exist. And if they do, we pro...
Sage Weil
02:28 PM Revision a7deada2 (ceph): init-ceph: DTRT when cconf returns host = localhost
cconf behavior was just changed by bcf1bdef56a256d4857dd4f9d859acca631cc347
Signed-off-by: Sage Weil <sage@newdream....
Sage Weil
09:46 AM Feature #463 (Resolved): tool to capture debug info
commit:6545f3ca1c9d358870e643bb511bd318710f2b94 Sage Weil
12:51 AM Feature #463: tool to capture debug info
There is a little bug in "cdebugpack".
/bin/sh is used as interpreter. On Debian systems /bin/sh is symlinked to /b...
Wido den Hollander
09:18 AM Bug #469 (Rejected): Profiler detection is inaccurate
Upon further inspection, I don't think this is a problem with the detection scripts, since IsHeapProfilerRunning is r... Greg Farnum
08:09 AM Bug #469 (Rejected): Profiler detection is inaccurate
After the latest git update, i.e., 22nd-Sept (unstable)
The 'make' breaks down. Here's the last line.
> /bin/bash...
Greg Farnum
07:52 AM CephFS Feature #466 (Resolved): mds: respawn on suicide
commit:f10906b3fdb720ef822478c7221836d67becef2b Sage Weil
03:30 AM Revision a18213d6 (ceph): debugpack: add ceph-pg-dump
Yehuda Sadeh
03:04 AM Revision f6e49cbb (ceph): cdebugpack: save some more info
ceph.conf
ceph -s
ceph osd dump
ceph mds dump
Yehuda Sadeh

10/06/2010

11:42 PM Revision 8b716c6d (ceph): mds: Check the lock state, not the inode state!
This was causing a lot of slowdowns.
Additionally, pin the inode when exporting caps -- otherwise it could
disappear ...
Greg Farnum
11:06 PM Revision b778f830 (ceph): osd: on clearing corrupt logs, call pg::write_info
After changing PG::info, call PG::write_info to get the on-disk
information back in sync with the in-memory state.
S...
Colin Patrick McCabe
09:51 PM Revision 23bcc53a (ceph): Merge branch 'unstable' into osd_pglog_checksums
Colin Patrick McCabe
09:33 PM Revision 430377be (ceph): v0.23~rc (new unstable branch)
Sage Weil
08:42 PM Revision 48196f91 (ceph): Merge branch 'testing' into unstable
Conflicts:
src/osd/ReplicatedPG.cc
Sage Weil
08:41 PM Feature #463: tool to capture debug info
Yehuda can you make a quick man page? Sage Weil
08:40 PM Feature #463 (In Progress): tool to capture debug info
add to deb, rpm packages Sage Weil
08:29 PM Feature #463 (Resolved): tool to capture debug info
done with commit:a18213d6fab3910ed75c838a150573b5456d8cec. Yehuda Sadeh
04:09 PM Feature #463: tool to capture debug info
(04:08:30 PM) sage@newdream.net/slip: logs, binaries, core
(04:08:36 PM) sage@newdream.net/slip: /usr/lib/debug bina...
Sage Weil
08:21 PM Revision e5882981 (ceph): osd: fix pull completion tests, again
op->complete==false is inconclusive.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
08:21 PM Revision 47f2efb2 (ceph): osd: log error instead of crashing on failed pull attempt
If peering screws up and the primary mistakenly tries to pull an object
from us we don't have, log an error instead o...
Sage Weil
08:05 PM Revision a2806854 (ceph): osd: save corrupt pg_logs to a special collection
If the PG log is corrupt when we start up, save it to a special
collection so that we can examine it later.
Signed-o...
Colin Patrick McCabe
08:01 PM Revision f6b47e38 (ceph): osd: clean out redundant (and wrong) complete calculation
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:01 PM Revision 1bb60b45 (ceph): osd: make sparse data/clone push behave with partial object push
We can't error out if we don't get everything we want in one go now that
we support pushing objects in pieces. Remov...
Sage Weil
05:39 PM CephFS Cleanup #468 (Resolved): mds: use enum for LOCK_* in mds/locks.h
We just fixed a bug that (I think?) the compiler would have warned about.. in->get_state() == LOCK_MIX instead of loc... Sage Weil
04:45 PM Revision 5ef97562 (ceph): Merge branch 'osd_lost_objects' into unstable
Sage Weil
04:41 PM Linux kernel client Bug #459 (Resolved): bonnie++ is slow on clustered mds
Solved the most apparent issue, which is that if the kclient had already dropped caps for the MDS on an existing inod... Greg Farnum
04:23 PM Feature #169: osd: start up despite corrupted pg log(s)
Done. We put each corrupt PG log in a new collection. Colin McCabe
04:17 PM CephFS Bug #295 (Can't reproduce): mds: can't rmdir due to dir size underflow
Sage Weil
07:06 AM Revision ed3976ce (ceph): rgw: change default content type to binary/octet-stream
Yehuda Sadeh
05:04 AM Revision 1f94a8fe (ceph): monclient: fix leaks in build_initial_monmap address lookup
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:02 AM Revision 7935e30e (ceph): monclient: fix off-by-one buffer overrun
Still leaked, though.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:01 AM Revision 16f053f7 (ceph): addr_parsing: remove unused mount_path logic
This was breaking parsing if any of the hosts included a ":port" too.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:05 AM rgw Bug #467 (Resolved): change default content type
Yehuda Sadeh
12:05 AM rgw Bug #467: change default content type
Should be fixed with commit:ed3976ce562908a0df02828d7c8d3dc79fa6443e. Yehuda Sadeh
12:03 AM rgw Bug #467 (Resolved): change default content type
If content type was not specified we need to set it as 'binary/octet-stream' and not as 'text/plain'. Yehuda Sadeh

10/05/2010

11:47 PM Revision b2774979 (ceph): Merge remote branch 'origin/testing' into unstable
Sage Weil
11:47 PM Revision 6a53d733 (ceph): Merge branch 'unstable' of ssh://ceph.newdream.net/home/sage/ceph.newdr...
Sage Weil
11:26 PM Revision 109dcdf6 (ceph): cdebugpack: add a utility to generate a debug package
Yehuda Sadeh
10:47 PM Revision 4bc4cba5 (ceph): osd: ignore info queries on deleting pgs
Since we cancel deletion on pg change, we will only receive these from
old primaries, so we can safely ignore.
Signe...
Sage Weil
10:47 PM Revision a4eb5996 (ceph): osd: cancel deletion on pg change
If the primary changes, cancel deletion so that the new primary has the
benefit of considering whether they need anyt...
Sage Weil
10:47 PM Revision ed2eee54 (ceph): config: fix address list parsing
Skip past comma, whitespace.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
10:44 PM Revision 414bc4f9 (ceph): cmon: better error handling
If we can't create the mon0/magic file, show an error message rather
than calling assert(). These cases are probably ...
Colin Patrick McCabe
10:28 PM CephFS Feature #466 (Resolved): mds: respawn on suicide
Either that, or we need some wrapper that restarts the daemon. Otherwise a cmds that gets laggy and is replaced won'... Sage Weil
10:16 PM Linux kernel client Bug #465 (Resolved): need to refresh osdmap when full flag is set
Something as simple as calling
ceph_monc_request_next_osdmap(&osdc->client->monc);
before retur...
Sage Weil
10:02 PM Revision bcf1bdef (ceph): conf: cconf return default values from config.cc if not found
Yehuda Sadeh
07:38 PM Revision 12373a6e (ceph): mds: allow do_null_snapflush on multiversion inodes
The _do_snap_update() can handle a multiversion inode. Behave when
_do_null_snapflush() encounters one.
Signed-off-...
Sage Weil
07:26 PM Revision e064796b (ceph): signal handlers: be more elaborate about caught signals
Yehuda Sadeh
07:16 PM Revision 22c38466 (ceph): mds: don't call mrk_dirty_rstat for base/root inodes
Base inodes have no parent.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
07:05 PM Revision 3e56ac4b (ceph): dump backtrace when getting sigsegv and sigabrt
Yehuda Sadeh
06:54 PM Revision f5958ad5 (ceph): mds: set dir layout during replay
Need to copy layout from the EMetaBlob::fullbit into the inode.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
06:54 PM Revision 09b2db73 (ceph): mds: use helper to update inode from EMetaBlob during replay
Removes 3 copies of this code.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
06:54 PM Revision 11a24f5e (ceph): mds: set root dir_layout during mkfs
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:54 PM Revision d600596a (ceph): mds: fix EMetaBlob dir_layout lifecycle
Initialize, delete pointer.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
06:54 PM Revision 95e273a6 (ceph): mds: zero inode layout for dirs
These aren't used for anything.
Also rename the default_dir_layout to _log_, since that's all that we now
use it for...
Sage Weil
06:54 PM Revision 50d91f62 (ceph): osd: less chatty in log about caps
Sage Weil
06:54 PM Revision 994525ad (ceph): mds: fix typo in EMetaBlob encoder
This was wrongly setting the dir_layout_exists flag to true.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
06:54 PM Revision cdc2b898 (ceph): mds: set root inode default_file_layout on mkfs
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:20 PM Revision ede37634 (ceph): mds: fix LocalLock xlocking by replacing default
Greg Farnum
06:20 PM Revision e4d86f31 (ceph): client: Fix truncate_seq/truncate_length initialization.
Initializing to 0 was causing file_to_extents to get called on every inode
since the MDS initializes truncate_seq to ...
Greg Farnum
06:08 PM Revision 5febcb90 (ceph): osd: read_log: clear the pg log if it is corrupt
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:05 PM Revision e10f4607 (ceph): Merge branch 'unstable' into osd_pglog_checksums
Colin Patrick McCabe
05:12 PM Revision f4581e0d (ceph): mds: fix ESession/ESessions event id type again
Not sure how many times we've screwed this one up!
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
04:57 PM Revision ff463df5 (ceph): filestore: drop unused parse_coll() declaration
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:17 PM Feature #463: tool to capture debug info
commit:baa3772b1558af280a878c7b32b1d739c4054ed3 introduces cdebugpack. Generates a tar.gz (name needs to be specified... Yehuda Sadeh
10:48 AM Feature #463 (Resolved): tool to capture debug info
- /usr/bin/ binary
- /usr/lib/debug/usr/bin symbol binary (if any)
- core files (if any)
- logs
Maybe it should...
Sage Weil
01:39 PM CephFS Tasks #365 (Resolved): test snaptests against single mds failure
Sage Weil
12:37 PM Linux kernel client Bug #464 (Resolved): fix bdi warning
I'm seeing this on the unstable branch:... Sage Weil
12:24 PM Feature #446 (Resolved): dump stack to log on segfault
We'll keep the backtrace in the assertion code for now. Commit:e064796bea3985c088e74f75f35637225827bab8 adds some inf... Yehuda Sadeh
12:03 PM Feature #446: dump stack to log on segfault
Commit:3e56ac4b377a3f39f040556beffb7c58cc2baea4 adds the signal handling part. Need to decide whether we keep the cur... Yehuda Sadeh
10:27 AM CephFS Bug #362: mds: rejoin crashes on snaptest-2 workload
work on recovery in v0.23 Sage Weil
10:26 AM CephFS Bug #395 (Resolved): mds: interval_set assert(0) during journal replay
Sage Weil
10:26 AM CephFS Bug #426 (Resolved): mds: rstat propagation
Sage Weil
04:22 AM Bug #460: OSD crash: ReplicatedPG::push_to_replica / Rb_tree
I'm now seeing this crash on multiple OSDs.
Added some coredumps to the collection on the logger machine.
Wido den Hollander

10/04/2010

09:45 PM Bug #461: Hanging OSD during recovery
While testing #462, I restarted osd6 to see if the cephx problems went away.
During boot, osd6 started to hang to...
Wido den Hollander
09:34 PM Bug #461 (Closed): Hanging OSD during recovery
The OSD shut down after about 3 hours, it seems without any logging, so we probably won't find whatever caused the... Wido den Hollander
12:01 PM Bug #461 (Closed): Hanging OSD during recovery
While my cluster was recovering from a few OSD crashes, one of my OSD's.... Wido den Hollander
09:39 PM Bug #462 (Resolved): cephx: verify_authorizer_reply exception in decode_decrypt
Since I started using cephx on my cluster I started seeing these messages in my logfiles.
Now for example, I see...
Wido den Hollander
09:24 PM Bug #460: OSD crash: ReplicatedPG::push_to_replica / Rb_tree
I just tested if I could start the OSD again, but it crashed again, with almost the same backtrace:... Wido den Hollander
11:55 AM Bug #460 (Can't reproduce): OSD crash: ReplicatedPG::push_to_replica / Rb_tree
After my cluster recovered from the latest crashes, I wanted to check if my RBD data was still intact.
This cause...
Wido den Hollander
08:59 PM CephFS Bug #451: mds: replay error
Uhh...Sorry, I thought the log should be enough, so I re-deployed the cluster and destroyed everything...
Henry Chang
10:09 AM CephFS Bug #451: mds: replay error
Henry Chang wrote:
> OK.. I've put it on the gateway machine: /tmp/ceph_logs/mds.1.log.gz
Got it, thanks.
Okay...
Sage Weil
06:21 PM Revision c3d3b422 (ceph): Merge branch 'testing' into unstable
Conflicts:
src/mds/Locker.cc
Sage Weil
06:08 PM Revision 7aab70dd (ceph): Merge branch 'file_layouts' into unstable
Conflicts:
src/mds/CInode.cc
src/mds/CInode.h
src/mds/MDCache.cc
src/mds/SimpleLock.h
Greg Farnum
06:04 PM Revision 2b4eb4ab (ceph): add set layout ops to ceph_strings
Greg Farnum
06:04 PM Revision 45fa4a2f (ceph): mds: Conditionally encode default dir layout.
Previously we unconditionally encoded the standard layout, which
on a directory inode is meaningless. So, use that sp...
Greg Farnum
06:04 PM Revision 8938f271 (ceph): cephfs: Wrote and committed cephfs
Greg Farnum
06:04 PM Revision 212c1890 (ceph): client: update test_ioctls to test new stuff
Greg Farnum
05:50 PM Revision b5889832 (ceph): always throw by value; always catch by const ref
Always throw exceptions by value rather than as pointers. Always catch
exceptions as const references to avoid uneces...
Colin Patrick McCabe
05:42 PM Revision 2d194c67 (ceph): mds: If a projected inode has a dir_layout, we now encode it to disk.
Greg Farnum
05:42 PM Revision cb7b3601 (ceph): mds: misc fixes for dir default layout projection
Greg Farnum
05:42 PM Revision 64c3556d (ceph): mds: fix setlayout truncation check.
The trunc_seq is initialized to 1 in prepare_new_inode. Greg Farnum
05:42 PM Revision fbbf4481 (ceph): client: import ioctl header from ceph-client
Greg Farnum
05:42 PM Revision 79d18933 (ceph): mds: zero out the layout in handle_client_setlayout
Could have led to an invalid layout by mistake. Greg Farnum
05:42 PM Revision 42c7ed44 (ceph): mds: Implement op CEPH_MDS_OP_SETDIRLAYOUT.
Implement handler functions, add to inode projection machinery, etc. Greg Farnum
05:42 PM Revision 54e95fed (ceph): mds: Look for and make use of directory tree default layouts, if existent.
Greg Farnum
03:50 PM Revision 01ae1be2 (ceph): filestore: make list_collections() list all dirs
coll_t is now unstructured; list all dirs besides '.' and '..'.
The old coll_t::parse() was broken. Remove it. Fix...
Sage Weil
03:44 PM Revision 940354b9 (ceph): osd: make load_pgs verbose
Show what it's skipping and why.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
11:42 AM Bug #458 (Won't Fix): OSD::activate_pg
This is from the old (broken) recovery code attempting to forget lost objects. The bandaid is to just comment out th... Sage Weil
11:35 AM Bug #458 (Won't Fix): OSD::activate_pg
On one of my OSD's (osd7) I started to see:... Wido den Hollander
11:35 AM Linux kernel client Bug #459 (Resolved): bonnie++ is slow on clustered mds
We tracked it down to a problem with cap revocation while deleting inodes. The MDS is requesting that the kclient dro... Greg Farnum
11:32 AM Linux kernel client Bug #434 (In Progress): mds: clustered mds pjd failures
Looking at this now. Greg Farnum
11:27 AM Bug #428 (Resolved): osd: recovery stalls on mismatched snapset and object
There's a separate issue open for the remaining issue #453. Closing this one out. Sage Weil
11:19 AM CephFS Bug #447 (Resolved): mds: failed assert(cap) in void Locker::handle_client_caps(MClientCaps*)
I suspect this one is fixed by commit:113a9bcd957839f2838c0e0cb80c25108278fde2, which will be in v0.21.4 and v0.22. ... Sage Weil
11:17 AM Feature #185 (Resolved): mds: set file layout policy on directory hierarchy
Pushed in commit:7aab70ddc464355f068a143ea0e972183c155f24 (userspace) and commit:f670ee7872e51842e817e1606539e3c72e4b... Greg Farnum
11:04 AM Feature #457 (Rejected): osd: alphanumeric names
Sage Weil
11:03 AM Bug #450 (Won't Fix): osd named with leading/padding 0 gets stripped
Sage Weil
11:03 AM Bug #450: osd named with leading/padding 0 gets stripped
This is normal. The OSD ids are purely numeric (ints). We could add a layer of alphanumeric names at some point, bu... Sage Weil
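To illustrate the point above: because OSD ids are plain integers, any zero padding in a name is lost as soon as the id is parsed. A minimal Python sketch, not Ceph's actual parser (`osd_id_from_name` is a hypothetical helper introduced only for this example):

```python
def osd_id_from_name(name: str) -> int:
    """Hypothetical helper: extract the numeric id from a name like 'osd0101'."""
    prefix = "osd"
    if not name.startswith(prefix):
        raise ValueError(f"not an osd name: {name!r}")
    # int() discards leading zeros, so 'osd0101' and 'osd101' map to the same id.
    return int(name[len(prefix):])


padded = osd_id_from_name("osd0101")
plain = osd_id_from_name("osd101")
```

Both names resolve to the same integer id, which is why a padded name cannot be preserved without a separate naming layer.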
10:19 AM Feature #456 (Resolved): make dumpjournal functionality usable
It could be integrated into cmds? Maybe something like,... Sage Weil
09:14 AM Bug #455 (Resolved): OSD::_create_lock_pg
fixed by commit:01ae1be288bae196180ad03065e14be867b5e12e Sage Weil
12:40 AM Bug #455: OSD::_create_lock_pg
I just checked (haven't checked the cluster state for about a day and a half) and then found that osd11 crashed again w... Wido den Hollander

10/03/2010

08:03 PM CephFS Bug #451: mds: replay error
OK.. I've put it on the gateway machine: /tmp/ceph_logs/mds.1.log.gz Henry Chang

10/02/2010

02:02 AM Bug #455: OSD::_create_lock_pg
A bit later, osd11 crashed with the same backtrace.
I manually marked it "out", but that wouldn't trigger a recove...
Wido den Hollander
01:44 AM Bug #455 (Resolved): OSD::_create_lock_pg
This morning I upgraded to the latest unstable ( 0b7c1afc43202953123f335057b9a5da428bc9a2 ), but when doing so, 10 of... Wido den Hollander

10/01/2010

11:22 PM Revision 0b7c1afc (ceph): mds: fix setlayout truncation check.
The trunc_seq is initialized to 1 in prepare_new_inode. Greg Farnum
11:21 PM Revision c9e69559 (ceph): mds: zero out the layout in handle_client_setlayout
Could have led to an invalid layout by mistake. Greg Farnum
11:21 PM Revision 8a5008b8 (ceph): mds: remove unused CompatSet mds_features.
All the MDS features are stored in the MDSMap::mdsmap_compat Greg Farnum
10:55 PM Revision f389afc9 (ceph): mon: add 'mds fail N' command
Manually mark an mds rank as failed. The daemon should kill itself when
it finds out.
Note that this doesn't do any...
Sage Weil
09:12 PM Revision cdf43d54 (ceph): buffer::list::copy: complain about invalid strings
Raise an exception when someone feeds us a "string" that has embedded
NULL characters.
Signed-off-by: Colin McCabe <...
Colin Patrick McCabe
07:52 PM Revision e18001c1 (ceph): mds: fix stray replica push on _rename_prepare_witness()
We need to push all parents of the straydn to the target. This changed
a while back with the mdsdir stuff but this b...
Sage Weil
07:52 PM Revision e87f751b (ceph): mds: fix and use add_replica_stray() helper for handle_dentry_unlink
Eliminate duplicate code by using (and fixing) the helper.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
07:34 PM Revision 26511cf7 (ceph): osd: revamp forgetting lost objects
The old forget lost objects rewrote history in the PG log, which is asking
for all kinds of trouble. Instead, add ne...
Sage Weil
07:32 PM Revision 36067ea1 (ceph): osd: revamp forgetting lost objects
The old forget lost objects rewrote history in the PG log, which is asking
for all kinds of trouble. Instead, add ne...
Sage Weil
06:56 PM Revision 5e450300 (ceph): osd: move PG::Info::coll to PG::coll
It's best not to have data members in PG::Info that are not serialized
and sent over the wire. Cache coll directly in...
Colin Patrick McCabe
05:27 PM Revision e305ea01 (ceph): osd: cache coll_t in PG
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin McCabe
01:37 PM CephFS Bug #451: mds: replay error
Henry Chang wrote:
> I put the full log on:
>
> http://veqrya.bay.livefilestore.com/y1p7N29j9-l0ihB4N3FXU2n9__9Ly...
Sage Weil
01:28 PM Tasks #454 (Resolved): c++ coding standards document
Something along the lines of http://wiki.openstack.org/CppCodingStandards perhaps.
And stick it in the ceph wiki. ...
Sage Weil
12:52 PM CephFS Bug #452 (Resolved): mds: failed assert(root) in MDCache::adjust_subtree_auth()
fixed by commit:e87f751b3dc703f13e7580a24df49fbff1359536 Sage Weil
12:34 PM Feature #453 (Resolved): osd: return error (instead of blocking) on lost objects
We now track unfound objects. If we decide those objects are truly lost, we need to return errors when trying to rea... Sage Weil
10:41 AM Cleanup #435: osd: generalize coll_t to a string
> One small optimization we can make here is to make a coll_t member of PG,
> so that we don't construct a new one ...
Colin McCabe
05:00 AM Revision 5b798a3d (ceph): osd: fix recovery_primary loop on local clone
When we take the clone branch, we update the missing map. This invalidates
our current iterator, which can cause bad...
Sage Weil
01:13 AM Revision aaa58f5d (ceph): gitignore: Ignore cscope and vim temporary files
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin McCabe
12:52 AM Revision a4138c90 (ceph): osd: generalize coll_t to a string
coll_t is now a string. META_COLL and TEMP_COLL are just constants now.
Now there is a constructor that takes pgid_t...
Colin Patrick McCabe

09/30/2010

10:42 PM CephFS Bug #451: mds: replay error
I put the full log on:
http://veqrya.bay.livefilestore.com/y1p7N29j9-l0ihB4N3FXU2n9__9LyGY8Keb_yXs1KQFQD4zRyGPRL8G...
Henry Chang
09:59 PM CephFS Bug #451 (Closed): mds: replay error
... Sage Weil
10:14 PM CephFS Bug #452 (Resolved): mds: failed assert(root) in MDCache::adjust_subtree_auth()
... Sage Weil
09:54 PM Cleanup #435: osd: generalize coll_t to a string
One small optimization we can make here is to make a coll_t member of PG, so that we don't construct a new one for ev... Sage Weil
06:14 PM Cleanup #435 (Resolved): osd: generalize coll_t to a string
Implemented in commit a4138c905053cf79a03b50fa766c08ad718b8c58 Colin McCabe
05:54 PM Revision ea6286ac (ceph): Makefile: add missing include
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:54 PM Revision 0e67718a (ceph): mds: drop bad assert
Introduced by f1921c3a952726e025773979a7597de793897058. Should probably
audit this code.
Signed-off-by: Sage Weil <...
Sage Weil
05:23 PM Bug #325: writes starve reads
Hmm, I think I'm seeing this problem or an alternative instance of it, running a multi-mds cluster locally. bonnie++ ... Greg Farnum
02:30 PM Bug #450 (Won't Fix): osd named with leading/padding 0 gets stripped
I have hosts like cephdisk01 cephdisk02 cephdisk03 in my test cluster. I tried to name all my OSDs like osd0101...o... Tony Butler
01:53 PM Linux kernel client Feature #449 (Resolved): Support "secretfile" as an option
OK, closing this one out. We should consider whether it makes sense in the future for the rbd (or some other) tool to... Sage Weil
12:00 PM Linux kernel client Feature #449: Support "secretfile" as an option
The crashing part is fixed with commit:95ce358e10a20b31ad98724bf323e707e2b6ce86. Yehuda Sadeh
09:54 AM Linux kernel client Feature #449: Support "secretfile" as an option
Yehuda, can you look at the crashing part?
secretfile won't work here, btw, since the kernel can't read files. To t...
Sage Weil
02:07 AM Linux kernel client Feature #449 (Resolved): Support "secretfile" as an option
I just tried:... Wido den Hollander
01:31 PM Bug #428: osd: recovery stalls on mismatched snapset and object
Not knowing that you patched the binaries, I've overwritten them this morning when I installed my daily build of the ... Wido den Hollander
10:23 AM Bug #428: osd: recovery stalls on mismatched snapset and object
Okay, the cluster is now all active and clean. The rbd snapshot(s) are corrupted. I had to copy random data into pl... Sage Weil
11:31 AM Feature #444 (Resolved): ceph manager to run ioctls
commit:114004f6e616e0eb9f2e10a60449294e838cb3dd in file_layouts branch. currently called cephfs Greg Farnum
09:57 AM Linux kernel client Feature #448: support dns resolution in libceph
There is a (new) dns resolution framework in the kernel we could support. It'd take some coding, though. It should ... Sage Weil
01:34 AM Linux kernel client Feature #448 (Rejected): support dns resolution in libceph
I tried:... Wido den Hollander
02:03 AM Revision 7657a6d5 (ceph): interval_set: hide data members
This change makes interval_set::m and interval_set::_size private data
members in interval_set, instead of public. Th...
Colin McCabe

09/29/2010

09:43 PM CephFS Bug #447 (Resolved): mds: failed assert(cap) in void Locker::handle_client_caps(MClientCaps*)
on slide1:... Sage Weil
07:02 PM Revision b9f2816b (ceph): Makefile: add missing include
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:32 PM Revision 548df8ec (ceph): mon: Fix issue first addressed in 2c5a3d99aa3be5ce114072e84f73a0a6426e6...
We were properly falling out of the while loop when we reached end(), but
not checking for it in the following if-els...
Greg Farnum
03:47 PM Feature #446 (Resolved): dump stack to log on segfault
Should be doable if we trap SIGSEGV?
One thing to keep in mind is that the common/assert.cc code currently delibera...
Sage Weil
03:45 PM Revision aa04c8fb (ceph): osd: try to pull object from other replica(s) on EOF
If during recovery we are unable to pull from a replica due to reaching
EOF (e.g., zeroed out object), pull from the ...
Sage Weil
03:45 PM Revision 0523ce10 (ceph): osd: do not request backlog from peers with empty pg
This avoids stalling out peering, because the peer just responds with
another 'empty' PG::Info in response (which we ...
Sage Weil
12:58 PM Linux kernel client Bug #392 (Resolved): writes beyond 4GB wrap on 32 bit clients
Sage Weil
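The title describes a classic truncation symptom. As a hedged illustration only (this log does not show the actual kernel fix), the Python sketch below assumes the file offset was being squeezed into an unsigned 32-bit value somewhere on the write path; `truncate_to_u32` is a hypothetical name:

```python
FOUR_GIB = 1 << 32  # 4 GiB, the first offset that no longer fits in 32 bits


def truncate_to_u32(offset: int) -> int:
    """Simulate storing a 64-bit file offset in an unsigned 32-bit variable."""
    return offset & 0xFFFFFFFF


# A write intended for 4 GiB + 4 KiB wraps around and lands at 4 KiB instead.
intended = FOUR_GIB + 4096
actual = truncate_to_u32(intended)
```

Offsets below 4 GiB pass through unchanged, so the problem only shows up once a file grows past that boundary.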
12:47 PM Cleanup #440 (Resolved): interval_set<>::iterator
Sage Weil
12:00 PM phprados Feature #445 (New): Session handler
Add a session handler, so we can use RADOS as storage for our PHP sessions.... Wido den Hollander
02:25 AM Revision 3bab6ac1 (ceph): Add the setup-chroot.sh script
The setup-chroot.sh script is very handy for building the server in a
chroot environment. I thought I would share it ...
Colin Patrick McCabe

09/28/2010

07:31 PM Revision 2223b22d (ceph): osd: clarify comment in recovery code
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
07:26 PM Tasks #441: reconfigure sepia cluster
The sepia cluster is looking good. I put a new config file in place that doesn't refer to /home/sage and installed ev... Colin McCabe
12:54 PM Tasks #441: reconfigure sepia cluster
> Also need to symlink /var/log/ceph to /data/log.
I think this should be ok since we have /var/log symlinked to /...
Colin McCabe
06:45 PM Revision ab62aabf (ceph): msgr: Don't take over old pipes if they're lossy.
Fixes bug #443. Greg Farnum
04:29 PM Feature #444 (Resolved): ceph manager to run ioctls
Make a userland tool that uses the ioctls to set file layouts, layout policies, and view layouts. Greg Farnum
02:33 PM Feature #185: mds: set file layout policy on directory hierarchy
Wrote the ioctl and updated test_ioctls to test it, then debugged issues with it. Still don't have any kind of ceph-m... Greg Farnum
12:31 PM Cleanup #440: interval_set<>::iterator
Please do! Sage Weil
12:09 PM Cleanup #440: interval_set<>::iterator
interval_set::end() now returns a const_iterator, yay!
As a related issue, we should make the std::map<T,T> struct...
Colin McCabe
11:45 AM Bug #443 (Resolved): osd segfault due to pipe->connection_state is NULL.
Pushed a change in commit:ab62aabf1f71b21a8f64bd7985119f3341582ff5
Replacement pipes will only take over the old pip...
Greg Farnum
11:07 AM Bug #443 (Resolved): osd segfault due to pipe->connection_state is NULL.
Hi,
One of my OSDs failed due to a segfault.
gdb of the core dump shows:...
Henry Chang
 
