Activity
From 11/05/2010 to 12/04/2010
12/04/2010
- 08:53 PM Tasks #616 (Rejected): radosacl needs a man page
- 08:52 PM Bug #627 (Resolved): replace openssl with crypto++
- 08:38 PM CephFS Feature #630 (Resolved): release caps on inodes unlinked by other clients
- If client A writes a file, and client B unlinks it, client A needs to drop the inode sooner rather than later.
O...
- 03:34 AM Revision 15d8bdf3 (ceph): crypto: use crypto++ for aes instead of openssl
- need to implement it more efficiently, currently going through a string object
- 03:34 AM Revision 58f3ce4a (ceph): crypto: test for allocation failure, cleanup
- 03:34 AM Revision 6ec622c0 (ceph): common: use ceph_armor instead of openssl based functions
- also modify ceph_[un]armor to get dest buffer length
- 03:34 AM Revision 7fa9426c (ceph): makefile.am: most binaries (except rgw_*) don't link with openssl
- 03:34 AM Revision e135e924 (ceph): crypto: remove old openssl implementation
- 03:34 AM Revision 76e02c71 (ceph): common: remove base64.c
- 03:34 AM Revision 88213770 (ceph): crypto: change include
- 03:34 AM Revision a28b4494 (ceph): configure: check for the presence of libcrypto++ header files
- 03:34 AM Revision f2424dfb (ceph): rgw: get rid of openssl altogether
- 03:34 AM Revision e0059259 (ceph): rgw: null terminate armor result
- 03:34 AM Revision 23f37043 (ceph): ceph.spec.in: update dependency
12/03/2010
- 06:02 PM Revision a457cbb9 (ceph): mon: fix typo
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:02 PM Revision 378d13df (ceph): osd: remove poid/soid from ScrubMap::object; clean up callers
- The soid is in the key in the map; no need to store it in the value.
Update the scrub code appropriately.
Signed-off...
- 05:35 PM Revision a4cc929c (ceph): make: create log directories and tmp directories
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 05:10 PM Revision a5297388 (ceph): msgr: Correctly handle half-open connections.
- If poll() says a socket is ready for reading, but zero bytes
are read, that means that the peer has sent a FIN. Hand...
- 11:11 AM Bug #590 (In Progress): osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
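The half-open case described for revision a5297388 (socket reported readable, zero bytes read, so the peer has sent a FIN) can be illustrated with a minimal Python sketch. This is not the Ceph messenger code, which is C++; it uses `select.select` rather than `poll()` for portability, and `peer_closed` is a hypothetical helper name.

```python
import select
import socket


def peer_closed(sock, timeout=1.0):
    """Return True if sock is readable but a read yields zero bytes (peer sent FIN)."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return False  # nothing to read yet; no evidence of a close
    # MSG_PEEK leaves any real data in the receive buffer for the caller.
    data = sock.recv(1, socket.MSG_PEEK)
    return data == b""


# Demonstration on a local socket pair (assumption: stands in for a real peer).
a, b = socket.socketpair()
b.sendall(b"x")
alive = peer_closed(a)   # readable with data pending, so not closed
a.recv(1)                # consume the pending byte
b.close()                # peer closes its end
closed = peer_closed(a)  # readable, zero bytes: peer is gone
a.close()
```

A caller that sees `True` here would tear the connection down instead of polling again.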
- 11:11 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
- Hi Fred,
When you say you removed the PGs "by hand"... does that mean you used "rm -rf" on the object store while ...
- 07:33 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
- this is with ceph rc 39b42b21e9805b3ec838f8682420166fede719f2
I tried to solve the ENOSPC problem by removing PGs ...
- 09:36 AM CephFS Bug #623 (Resolved): MDS: MDSTable::load_2
- 12:32 AM CephFS Bug #623: MDS: MDSTable::load_2
- Yes, tried with the latest rc, works!
MDS starts and recovers, also mounting and using the FS goes fine.
- 09:35 AM Bug #625 (Resolved): make install should create dirs
- implemented by commit:a4cc929cedb0ee773a2fa68d691a9951221ae31a and commit:39b42b21e9805b3ec838f8682420166fede719f2
C.
- 01:35 AM Revision 39b42b21 (ceph): make: create /etc/ceph if it doesn't exist
- make: create /etc/ceph if it doesn't exist. On uninstall, remove the
directory if it's empty. (Never remove a user's ...
- 12:56 AM Revision da5ab7c9 (ceph): ost: object_info_t: decode old versions correctly
- object_info_t has one constructor that initializes everything from a
bufferlist. This means that the decode function ...
- 12:18 AM Revision 03eb4e7a (ceph): man: add man page for cephfs
- Add to Makefile, debian, and ceph.spec.in bits
12/02/2010
- 07:52 PM Revision 6518fae3 (ceph): watch: some more linger fixes
- 06:16 PM CephFS Feature #91: mds: up:shadow mode
- I have yet to implement trimming, but the basic restarting-replay bits are now in place along with hooks to make it s...
- 05:14 PM Bug #479: ceph/mount crash badly when writing
- Hi all:
Ok, so I gitted again, original/unstable,
- Linux ss1 2.6.36-02063601-generic #201011231330 SMP Tue Nov 2...
- 05:07 PM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
- 05:07 PM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
- Hi Fred,
I think the assertion you're seeing here was fixed very recently by commit:78a14622438addcd5c337c4924cce1...
- 05:02 PM Bug #629: cosd segfaults when deleting a pool containing degraded objects
- Looks like some kind of lifecycle issue related to deleting pools.
OSD::_remove_pg does a _put_pool, and that does...
- 04:54 PM Bug #629 (Resolved): cosd segfaults when deleting a pool containing degraded objects
- started a 4 node osd cluster. created some pools with some objects in them. killed one osd node. waited for it to be ...
- 04:52 PM CephFS Bug #623: MDS: MDSTable::load_2
- I think that commit:da5ab7c9a49f8996b41783175683d4b8b13ece4d should fix this issue.
wido, can you re-run with the ...
- 04:44 PM CephFS Bug #623: MDS: MDSTable::load_2
- root@noisy:/var/log/ceph# grep mark_all_unfound_as_lost *
[ no results ]
So we're not marking things as lost in...
- 11:52 AM CephFS Bug #623: MDS: MDSTable::load_2
- actually -23 is ENFILE, which I think is coming from the LOST code...but that should never trigger unless the admin ha...
- 05:12 AM CephFS Bug #623 (Resolved): MDS: MDSTable::load_2
- On a small test machine I have a Ceph RC cluster running (Which was running a old unstable before), after my upgrade ...
- 04:13 PM Tasks #617: cephfs needs a man page
- Already had it in the Makefile, put it in the other bits and updated the commit.
- 03:56 PM Tasks #617: cephfs needs a man page
- need to add filename to debian/ceph.install and ceph.spec.in too.
and to man/Makefile.am
- 02:07 PM Tasks #617 (Resolved): cephfs needs a man page
- Done in commit:6cdaa2f6a7670357313401ddbd322bdf529a1547 on the rc branch.
- 03:29 PM Bug #622 (Resolved): crushtool useless parse error
- Resolved-- the crushmap.txt was bad.
I created #628 for getting better error messages from crushtool.
- 01:19 PM Bug #622: crushtool useless parse error
- There is a more advanced error handling API for spirit described at:
http://www.boost.org/doc/libs/1_41_0/libs/spiri...
- 11:35 AM Bug #622: crushtool useless parse error
- Reposting the diff; hopefully clearer this time.
--- crushmap.txt.1 2010-12-02 11:38:43.816441440 -0800
++...
- 11:33 AM Bug #622: crushtool useless parse error
- I was able to get the crushmap.txt to work by deleting the word "domain" in the gb1 region.
We should definitely h...
- 03:07 AM Bug #622 (Resolved): crushtool useless parse error
- I can't decide whether this is a bug in crushtool or a bug in my crushmap but whichever it is, the error message isn'...
- 03:27 PM RADOS Feature #628 (New): crushtool: better error messages when parsing a crushmap.txt
- There is a more advanced error handling API for spirit described at:
http://www.boost.org/doc/libs/1_41_0/libs/spiri...
- 03:22 PM Bug #625: make install should create dirs
- Should be pretty straightforward. The only question is, should we remove those directories on an uninstall?
- 11:11 AM Bug #625 (Resolved): make install should create dirs
- /var/log/ceph
/var/lib/ceph/tmp
?
check debian/ceph.dirs to see what else gets created...
- 11:57 AM Bug #627 (Resolved): replace openssl with crypto++
- https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/684011
- 11:25 AM CephFS Feature #626 (Closed): qa: add IOR, rompio, or other parallel workloads suite
- We've had reports that rompio is just terrifically unstable, and shows serious scaling issues.
IOR is a more commo...
- 09:39 AM Feature #624: radostool: make 'put' write large objects in chunks
- Could we make the chunk size settable with an argument, for testing this kind of thing in the future?
- 09:38 AM Feature #624 (Resolved): radostool: make 'put' write large objects in chunks
- otherwise a put on a large (100mb+) file can fail because it exceeds the size of the osd journals. it's also clearly...
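The chunked 'put' requested in Feature #624 can be sketched as follows. This is a hypothetical Python illustration, not the radostool implementation: `put_in_chunks`, the `write_at(offset, chunk)` callback, and the 4 MiB default are all assumptions. The `chunk_size` parameter mirrors the request above to make the chunk size settable for testing.

```python
def put_in_chunks(data, write_at, chunk_size=4 * 1024 * 1024):
    """Write data through write_at(offset, chunk) in pieces of at most chunk_size bytes.

    Returns the total number of bytes written.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    offset = 0
    while offset < len(data):
        chunk = data[offset:offset + chunk_size]
        write_at(offset, chunk)  # each call stays below the per-write size limit
        offset += len(chunk)
    return offset


# Stand-in for an object store: a growable byte buffer (assumption for the demo).
store = bytearray()
calls = []


def write_to_store(offset, chunk):
    calls.append((offset, len(chunk)))
    store[offset:offset + len(chunk)] = chunk


written = put_in_chunks(b"abcdefghij", write_to_store, chunk_size=4)
```

With a 10-byte payload and a 4-byte chunk size, the callback is invoked three times, at offsets 0, 4, and 8.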
12/01/2010
- 11:40 PM Revision 78a14622 (ceph): osd: fix log tail vs last_complete assert on replica activation
- The last_complete may be below the log tail IFF we have a backlog.
Fixes 756918be3b24d8164699da301ddfbc8e6fd6b751.
...
- 11:11 PM Revision 63fab458 (ceph): rados_bencher.h:
- bench_write and bench_seq will now wait on any write/read
rather than the one least recently started.
bench_write ...
- 11:00 PM Revision 0ea601ab (ceph): Create SyslogStreambuf
- SyslogStreambuf is a kind of stream buffer that allows you to output
characters from an ostream to syslog. Most stand...
- 09:48 PM Revision a3d8c527 (ceph): filestore: call lower-level do_transactions() during journal replay
- We used to call apply_transactions, which avoided rejournaling anything
because the journal wasn't writeable yet, but...
- 09:46 PM Revision 9ecbc300 (ceph): filestore: do journal mode autodetect and sanity check _before_ replay
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:25 PM Tasks #617: cephfs needs a man page
- I'll get this tomorrow. I wrote the tool and have had a task in my private manager to do this ever since then.
- 07:05 PM Revision f9fa855a (ceph): filestore: fix journal locking on trailing mode
- We're already holding journal_lock due to the surrounding
op_submit_{start,finish}.
Signed-off-by: Sage Weil <sage@n...
- 06:20 PM Revision 0897edaf (ceph): Merge branch 'testing' into rc
- Conflicts:
configure.ac
- 06:20 PM Revision cbb56208 (ceph): rbd: use MIN instead of min()
- Not even sure where min() was coming from, but it seems to be missing on
i386 lucid.:
g++ -DHAVE_CONFIG_H -I. -W...
- 06:20 PM Revision 792b04ba (ceph): client: connect to export targets on cap EXPORT
- Also unconditionally connect on reconnect, even when there aren't any
outstanding requests.
Signed-off-by: Sage Weil...
- 06:03 PM Revision bde0c721 (ceph): filestore: do not autodetect BTRFS_IOC_SNAP_CREATE_ASYNC until interfac...
- Li has proposed an alternative V2 ioctl that looks nicer, so wait until
that is finalized.
Signed-off-by: Sage Weil ...
- 06:03 PM Revision 5bdae2af (ceph): ceph v0.23.2
- 05:44 PM Revision 4592c220 (ceph): client: fix cap export handler
- An EXPORT cap msg can race with a cap release; deal with that (realigning
this code with the kclient).
Signed-off-by...
- 05:24 PM Revision 15c272e8 (ceph): man: fix monmaptool man page
- I've found the manpage problem that I've noted before. It's about
monmaptool, the CLI says it's usage:
[--print] [--c...
- 03:17 PM Bug #611 (Resolved): OSD: OSDMap::get_cluster_inst
- 03:17 PM Bug #612 (Resolved): OSD: Crash during auto scrub
- 02:58 PM Linux kernel client Bug #564 (Resolved): Configuration via configfs instead of sysfs
- acked by greg kh, yay
- 02:58 PM rbd Bug #391 (Can't reproduce): snap create/delete caused corruption
- this is old
- 02:47 PM Bug #550 (Can't reproduce): mon: PGMonitor::update_from_paxos()
- haven't been able to reproduce this. commit:62716aa7 gives us useful error messages. if/when it comes up again we'l...
- 02:28 PM Linux kernel client Bug #436 (Can't reproduce): cmon: basic_string::_S_construct NULL not valid
- 02:28 PM Bug #460 (Can't reproduce): OSD crash: ReplicatedPG::push_to_replica / Rb_tree
- 09:46 AM Bug #621 (Resolved): error building unstable branch, rbd.cc:837: error: no matching function for ...
- should be fixed by commit:307404231ecb09fdd2f6dd6e50677e746bba4236
- 07:08 AM Bug #621 (Resolved): error building unstable branch, rbd.cc:837: error: no matching function for ...
- Building on i386 Ubuntu Lucid, it fails building rbd.
This is a build of unstable at commit bf784cdb4f605c467eb094...
- 09:14 AM CephFS Bug #344 (Resolved): cfuse should pass all qa tests
- Sage asked me to mark this resolved. I ran the bonnie test yesterday and it eventually crashed when the disk ran out ...
- 02:22 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
- I just tried the latest unstable: fe9fad7bea
osd log attached...
osd/OSD.cc: In function 'void OSD::_process_pg...
- 12:50 AM Revision 6d96104e (ceph): osd: simplify scrub sanity checks
- Signed-off-by: Sage Weil <sage@newdream.net>
- 12:50 AM Revision 76b55c8a (ceph): osd: only adjust osd scrub_pending if pg was reserved
- If for some reason we enter scrub() without scrub_reserved == true, don't
adjust the osd->scrubs_pending or we'll scr...
- 12:38 AM Revision 260840f5 (ceph): mds: fix import_reverse re-exporting of caps
- Make the import_reverse() set the pin/state before it clears them by using
the helper that sets them.
Signed-off-by:...
- 12:25 AM Revision fe9fad7b (ceph): v0.25~rc
- 12:25 AM Revision 109e3f18 (ceph): mds: turn off mds_bal_frag until resolve vs split/merge is fixed
- See #594
Signed-off-by: Sage Weil <sage@newdream.net>
- 12:11 AM Revision f216b020 (ceph): Merge remote branch 'origin/lost' into unstable
- Conflicts:
src/osd/osd_types.h
11/30/2010
- 11:48 PM Revision 0cc8d34e (ceph): osd: refactor object_info_t constructor a bit
- Create a copy constructor for object_info_t, since we often want to copy
an object_info_t and would rather not try to...
- 11:48 PM Revision c281e1e0 (ceph): osd: mark_all_unfound_as_lost: wake waiters
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 11:48 PM Revision d5e6cae2 (ceph): radostool: fix memleak in error path
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 11:48 PM Revision 55f7e567 (ceph): osd: mark_all_unfound_as_lost: set lost attr
- In mark_all_unfound_as_lost, we need to set the lost bit in the objects'
object_info_t.
Signed-off-by: Colin McCabe ...
- 11:48 PM Revision 5e243f3e (ceph): osd: create lost2 test
- This one verifies:
1. Client asks for an unfound object and gets put to sleep
2. Object gets declared lost
3. Client ...
- 11:48 PM Revision b46f847c (ceph): osd: mark_obj_as_lost: don't assume we have obj
- In PG::mark_obj_as_lost, we have to mark a missing object as lost. We
should not assume that we have an old version o...
- 11:48 PM Revision c29fbb12 (ceph): osd: mark_all_unfound_as_lost: bugfix, refactor
- mark_all_unfound_as_lost: just delete items from the rmissing set as we
find them, rather than using a multi-pass sys...
- 11:48 PM Revision e9ccd7eb (ceph): osd: mark_obj_as_lost: fix oloc init, eversion
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 11:48 PM Revision cee3cd51 (ceph): osd: share_pg_log: update peer_missing
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 11:48 PM Revision ad4e5f36 (ceph): osd: ReplicatedPG::do_op: error on read-from-lost
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 11:48 PM Revision b15a97c7 (ceph): test_lost: add lost1 test
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 11:47 PM Revision 136dfdeb (ceph): osd: don't mark objs as lost unless we're active
- We don't have enough information to mark objects as lost until we
activate the PG. might_have_unfound isn't even buil...
- 11:43 PM Revision 08bd4ead (ceph): mds: fix resolve for surviving observers
- Make all survivors participate in resolve stage, so that survivors can
properly determine the outcome of migrations t...
- 11:43 PM Revision fb4734be (ceph): (re)add mechanism for marking objects as lost
- In activate_map, we now mark objects that we know are unfindable as
lost. This relies on the might_have_unfound set i...
- 11:43 PM Revision 80f3ea10 (ceph): Add ./ceph dump pg debug degraded_pgs_exist
- ./ceph dump pg debug degraded_pgs_exist returns TRUE if some pgs are
degraded; false otherwise.
tests: move start_re...
- 11:43 PM Revision de094224 (ceph): osd: object_info_t: add lost field
- We can now permanently mark objects as lost by setting the lost bit in
their object_info_t. Rev the object_info_t str...
- 11:43 PM Revision e555899c (ceph): osd: active replicas process logs from primaries
- In _process_pg_info, if the primary sends us a PG::Log, a replica should
merge that log into its own.
mark_all_unfou...
- 11:43 PM Revision c0e60afe (ceph): test: dump_osd_store: sort dump output
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 09:21 PM Revision 1123b5c5 (ceph): osd, librados: misc fixes, linger related issues
- 08:57 PM Revision bf784cdb (ceph): osd: fix object_info_t() initialization of oloc
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:56 PM Revision 91a75590 (ceph): mds: add debug output to make completions easier to track
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:48 PM Revision ba1f3cb9 (ceph): osd: fix misuses of OLOC_BLANK
- Commit 6e2b594b fixed a bunch of bad get_object_context() calls, but even
with the parameter fixed some were still br...
- 08:23 PM Revision 2ad901b3 (ceph): Revert "mds: resolve cleanup"
- This reverts commit cd53719f3ce712a060e4ac80cab934c597531a5e.
We need this on surviving nodes too to resolve ambiguo...
- 08:19 PM Revision b39f0425 (ceph): Merge branch 'testing' into unstable
- Conflicts:
src/os/FileJournal.cc
- 07:43 PM Revision 1b06332d (ceph): osd: make recovery_oids debug list per-pg
- Otherwise we hit bad asserts if an object of the same name in different
pools is getting recovered simultaneously.
S...
- 06:56 PM Revision 05ad97b6 (ceph): client: Set the DirResult buffer to NULL when deleting it.
- This should fix a crash exposed by our bonnie workunit. Previously
the client would keep trying to read out of the (d...
- 05:22 PM Revision 559d4d20 (ceph): ceph.spec.in: include gui files
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:13 PM Revision 93601269 (ceph): debian: many many cleanups
- Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu>
- 04:55 PM Revision 5eb8ef7f (ceph): filejournal: fix throttle vs FULL behavior
- We don't want to add to the throttler if we aren't going to queue the
write, or else we'll never take it off again.
...
- 04:45 PM Bug #612: OSD: Crash during auto scrub
- this should be fixed by commit:76b55c8a121acd4e5e8b6f5dbb83c25926ac9f76
- 04:32 PM Revision 132f74c5 (ceph): Merge branch 'osd_journaling' into unstable
- 04:30 PM Revision 7af9ffdf (ceph): filestore: make sure blocked op_start's wake up in order
- If they wake up out of order (which, theoretically, they could before) we
can screw up journal submitting order in wr...
- 04:24 PM Revision fac7266d (ceph): filestore: assert op_submit_finish is called in order
- Verify/assert that we aren't screwing up the submission pipeline ordering.
Namely, we want to make sure that if op_ap...
- 04:20 PM CephFS Bug #594: mds: frag split/merge vs replay
- disabled in v0.24
- 04:01 PM Tasks #539 (Resolved): wiki: document pg expansion
- Documented on:
http://ceph.newdream.net/wiki/Changing_the_number_of_PGs
- 03:54 PM Revision 5e391db0 (ceph): filejournal: rework journal FULL behavior and fix throttling
- Keep distinct states for FULL, WAIT, and NOTFULL.
The old code was more or less correct at one point, but assumed th...
- 03:51 PM Revision 79419c33 (ceph): filestore: refactor op_queue/journal locking
- - Combine journal_lock and lock.
- Move throttling outside of the lock (this fixes potential deadlock in
parallel j...
- 03:22 PM Revision 0df9dd6e (ceph): filestore: do not throttle op_queue in queue_op()
- In parallel mode, queue_op is called while holding the journal lock, so it
is not okay to throttle there. Instead, t...
- 12:25 PM Feature #620 (Resolved): objecter: (optionally) read from replica if on localhost and primary is not
- This can either compare the ip address, or possibly have a netmask (set in g_conf) to determine 'locality' (where 255...
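The netmask-based locality test proposed in Feature #620 could look roughly like the sketch below. It is a hypothetical Python illustration, not the objecter code: `is_local` is an assumed name, and the all-ones default mask (which reduces to a plain address comparison) is an assumption rather than the documented g_conf default.

```python
import ipaddress


def is_local(client_ip, osd_ip, netmask="255.255.255.255"):
    """Return True if osd_ip lies in the same network as client_ip under netmask."""
    # strict=False masks off the host bits of client_ip instead of raising.
    client_net = ipaddress.ip_network(f"{client_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(osd_ip) in client_net


same_host = is_local("10.0.0.5", "10.0.0.5")
same_subnet = is_local("10.0.0.5", "10.0.0.9", "255.255.255.0")
other_subnet = is_local("10.0.0.5", "10.0.1.9", "255.255.255.0")
```

A wider mask treats every OSD on the same subnet as "local"; the all-ones mask only matches a replica on the same address.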
- 12:23 PM Feature #619 (Resolved): objecter: optionally read from replicas
- Add a read flag to allow reads to come from a random replica. If a replica replies with EAGAIN, retry the request, b...
- 12:21 PM Feature #618 (Resolved): osd: allow reads from replicas
- Allow osd to handle reads on a replica. If the replica is missing the object in question, reply with -EAGAIN to the ...
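The replica-read behaviour in Features #618 and #619 (a replica that lacks the object replies with -EAGAIN, and the client retries) can be sketched in Python. The dictionaries standing in for OSD stores and the function names are hypothetical, and retrying against the primary is an assumed policy, since the descriptions above are truncated.

```python
import errno


def read_from_replica(replica_store, name):
    """Replica-side read: (0, data), or (-EAGAIN, None) if the object is missing here."""
    if name not in replica_store:
        return -errno.EAGAIN, None
    return 0, replica_store[name]


def client_read(name, replica_store, primary_store):
    """Try the replica first; on -EAGAIN, retry against the primary (assumed policy)."""
    rc, data = read_from_replica(replica_store, name)
    if rc != -errno.EAGAIN:
        return rc, data
    if name not in primary_store:
        return -errno.ENOENT, None
    return 0, primary_store[name]


primary = {"a": b"1", "b": b"2"}
replica = {"a": b"1"}  # the replica has not yet recovered object "b"
hit = client_read("a", replica, primary)
retried = client_read("b", replica, primary)
missing = client_read("c", replica, primary)
```

The -EAGAIN reply distinguishes "not on this replica yet" from "does not exist", which is why the client retries rather than failing the read.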
- 11:41 AM Bug #613 (Resolved): OSD crash: FAILED assert(recovery_oids.count(soid) == 0)
- this was actually a problem with the debug sanity checks. fixed by commit:1b06332de69b332092d115451efbd29afec79269
- 10:04 AM Tasks #617 (Resolved): cephfs needs a man page
- 10:04 AM Tasks #616 (Rejected): radosacl needs a man page
- 08:36 AM Bug #615 (Resolved): osd: improve op+journal throttling
- Currently we block first, then take locks, then update the throttle accounting. This makes things racy, because a bu...
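One common way to avoid the race described in Bug #615 is to do the blocking wait and the accounting update under the same lock, so no other thread can slip in between them. The Python sketch below shows that pattern only; it is not the Ceph fix, and the `Throttle` class and its methods are hypothetical.

```python
import threading


class Throttle:
    """Budget of `limit` units; take() blocks until the request fits."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0
        self.cond = threading.Condition()

    def take(self, amount):
        with self.cond:
            # Wait and account under one lock: the check and the update are atomic.
            # An oversized request is admitted once the throttle is empty.
            while self.current > 0 and self.current + amount > self.limit:
                self.cond.wait()
            self.current += amount

    def put(self, amount):
        with self.cond:
            self.current -= amount
            self.cond.notify_all()


t = Throttle(limit=5)
t.take(3)
worker = threading.Thread(target=t.take, args=(4,))
worker.start()   # must wait: 3 + 4 exceeds the limit
t.put(3)         # releasing 3 units lets the worker proceed
worker.join(timeout=5)
final = t.current
```

Whether the worker starts before or after `put(3)`, the final count is the same, because the admission check and the increment cannot be interleaved with another thread's update.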
- 08:34 AM Bug #598 (Resolved): osd: journal reset in parallel mode acts weird
- fixed as of commit:132f74c56064fdb3c47943679c48aa2a6b98f4eb, along with a ton of other related issues with the io que...
- 02:49 AM Revision 8003915b (ceph): Makefile: add bloom_filter.hpp to noinst_HEADERs
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 01:16 AM Revision 62075f34 (ceph): Makefile: Fix VPATH builds
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 12:41 AM Revision 0bcdc84a (ceph): osd: osd_types.h: const cleanup
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 12:40 AM Revision 7ee50add (ceph): osd: don't try to load a PG in a nonexistent pool
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 12:38 AM Revision 6ab17236 (ceph): filestore: simplify apply_transactions
- Always use queue_transactions, even in no-journal case.
Signed-off-by: Sage Weil <sage@newdream.net>
11/29/2010
- 11:52 PM Revision c9f864a0 (ceph): osd: PG::trim: fix inverted conditional in assert
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 11:12 PM Revision b2bcf4b3 (ceph): common: prevent infinite recursion on SIGSEGV
- Install SIGSEGV / SIGABORT handlers with sigaction using SA_RESETHAND.
This will ensure that if the signal handler it...
- 10:12 PM Revision 85191813 (ceph): osd: Create pg_split test
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 09:35 PM Revision fb60e114 (ceph): logger: Fix a crash when the MDS shuts down cleanly.
- We weren't holding the lock on the logger_timer before calling shutdown.
- 09:35 PM Revision b4db4100 (ceph): Timer: add some asserts to catch certain errors.
- 08:56 PM Revision adbb5459 (ceph): osd: some notify simplifications and FIXMEs
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:56 PM Revision ec15c465 (ceph): osd: track unconnected_watchers and when they expire
- - set up an initial expiration when we load the obc off disk
- remove expiration when we connect to an existing watch...
- 08:55 PM Revision 376870fa (ceph): osd: add timeout to watch_info_t
- Allow the watch timeout be set on a per-watch basis. Still need to figure
out where that comes from.. the client? A...
- 08:55 PM Revision 239c0a12 (ceph): rbd: fix version renaming
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:55 PM Revision b3051531 (ceph): osd: fix up WATCH
- Separate various paths: registering new watch, reconnecting to existing
watch, removing watch, etc.
Signed-off-by: S...
- 08:55 PM Revision 2563905b (ceph): osd: some cleanup
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:55 PM Revision b722662e (ceph): osd: use pg_t to find PG's again
- The ceph_object_layout is approaching obsolete. Also, use a more general
lookup_lock_raw_pg() helper that doesn't ta...
- 08:54 PM Revision a61f6b5e (ceph): osd: add missing Watch.cc
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:54 PM Revision 0e62c421 (ceph): osdc: spell out version
- Cosmetic
Signed-off-by: Sage Weil <sage@newdream.net>
- 08:51 PM Revision 15ffbc8d (ceph): makefile: add missing MWatchNotify.h
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:50 PM Revision 4dca64b2 (ceph): osd: drop unused fields
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:18 PM Revision 463d624d (ceph): Makefile: Add --as-needed to LDFLAGS
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 07:51 PM Revision a77eb6bd (ceph): vstart.sh: don't specify journaling mode
- Let the autodetection kick in, or let the dev specify via -o '...'.
Signed-off-by: Sage Weil <sage@newdream.net>
- 07:41 PM Revision e0b927b2 (ceph): osd: PG::trim: add assert
- Assert that we're not trimming the PG log past last_complete.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 05:48 PM Revision 756918be (ceph): osd: _process_pg_info: add assert for replicas
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 05:06 PM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
- Fred, can you see if this reproduces on the latest unstable? Thanks.
-C
- 11:14 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
- I added the PG::trim assert. It seems to cause problems immediately with test_unfound.sh
The plot thickens...
- 10:36 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
- Argh yeah I was all wrong here. The recovery code looks ok.. I think the problem is that _before_ this the log was t...
- 09:21 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
- > The replicas only ever get messages from the primary, and the primary
> sends a log to activate. Never anything e...
- 04:51 PM Bug #614: SEGV loop on _open_lock_pg after rmpool
- Er, by that I mean:
load_pgs shouldn't try to load a PG that is in a nonexistent pool. This could only happen aft...
- 04:49 PM Bug #614 (Resolved): SEGV loop on _open_lock_pg after rmpool
- In OSD::load_pgs, we weren't checking to make sure that the pool existed when going through all the collections.
F...
- 02:23 PM Bug #614 (Resolved): SEGV loop on _open_lock_pg after rmpool
- discovered my cosd processes at 100%, possibly following some "rados rmpool" commands to delete some pools. Stopped ...
- 04:41 PM Bug #598: osd: journal reset in parallel mode acts weird
- bunch of problems here, not all related to a full journal.
- 12:18 PM Feature #568 (Resolved): debian: build with --as-needed?
- Implemented!
before:
cmccabe@flab:~/src/ceph2/src$ ldd .libs/rados
linux-vdso.so.1 => (0x00007fff4eff...
- 11:13 AM Bug #575 (Resolved): monmaptool terminates when input file is not a monmap
- 10:49 AM Bug #479 (Can't reproduce): ceph/mount crash badly when writing
- 10:15 AM CephFS Subtask #547 (Resolved): mds: define fsck strategy, required metadata
- 10:13 AM CephFS Bug #594: mds: frag split/merge vs replay
- needs to be fixed in 0.24, or g_conf.mds_frag needs to be disabled.
- 10:06 AM Bug #595 (Won't Fix): Autogen: not a literal
- seems to go away with latest automake
- 07:12 AM Bug #613 (Resolved): OSD crash: FAILED assert(recovery_oids.count(soid) == 0)
- I'm running a script that reads and writes random objects using librados (creating a new pool once in a while). Runn...
11/25/2010
- 07:36 AM Revision 3ab60091 (ceph): osd: dump_missing: also dump missing_loc
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 07:35 AM Revision da087e47 (ceph): osd: discover_all_missing fix
- Don't request information from an OSD unless it is up and part of the
might_have_unfound set. Add more logging.
Sign...
- 12:18 AM Bug #611: OSD: OSDMap::get_cluster_inst
- commit:da087e47c21190f9cbde4d24182b7dfe581cd069 should resolve this
11/24/2010
- 10:54 PM Bug #611: OSD: OSDMap::get_cluster_inst
- I'll take a look
- 10:18 PM Bug #611: OSD: OSDMap::get_cluster_inst
- Okay, I somehow commented/set this bug backwards with another one. Whoops, sorry guys!
This looks like the OSD is as...
- 10:38 AM Bug #611: OSD: OSDMap::get_cluster_inst
- Sam said he'd look at this since it's in the background scrubbing bits that he and Josh did.
- 05:11 AM Bug #611 (Resolved): OSD: OSDMap::get_cluster_inst
- After upgrading to the latest unstable, one OSD crashed. Before the upgrade, 10 of the 12 OSD's were online.
When ...
- 10:18 PM Bug #612: OSD: Crash during auto scrub
- Dunno how, but somehow commented/assigned this and another bug backwards. Meant to say:
Sam said he'd look at this s...
- 10:38 AM Bug #612: OSD: Crash during auto scrub
- This looks like the OSD is assembling a list of missing queries and then sending them out without bothering to check ...
- 05:28 AM Bug #612 (Resolved): OSD: Crash during auto scrub
- After I saw #611 my cluster started to crash. One after the other, the OSD's started to go down, all with a message a...
- 10:09 PM Feature #453 (Resolved): osd: return error (instead of blocking) on lost objects
- It's passing the lost1 and lost2 unit tests now.
- 09:41 PM rgw Bug #353: Handle non-ascii filenames
- Yeah, I agree with Amazon's approach here. UTF-8 makes sense. I think we could continue to use std::string internally...
- 02:03 AM Revision d6e8e8d1 (ceph): gui: some cleanup
- Rather than vectors of pointers, use vectors of NodeInfo structures.
This avoids the problem of freeing the NodeInfo ...
- 12:56 AM Revision 1b1e040e (ceph): osd: add a map for lingering messages
- 12:55 AM Revision 99e1e4de (ceph): librados: assert_version on sync operations
- 12:55 AM Revision c4b97953 (ceph): librados: last_objver is set on the pool, and not per thread
- 12:55 AM Revision 454ea06e (ceph): rbd: notify about header changes
- 12:55 AM Revision 520b523b (ceph): librados: fix unnecessary locking
- 12:55 AM Revision 4c8bdc53 (ceph): osd: don't notify notifier
- 12:54 AM Revision a76de3b2 (ceph): librados: complete C interface for watch/notify
- 12:54 AM Revision 38c8e383 (ceph): librados: rename cookie to handle in api
- 12:54 AM Revision 2954799a (ceph): librados: notify waits for completion
- 12:50 AM Revision e7184e6d (ceph): librados: start implementing watch/notify
- 12:50 AM Revision a4864bd8 (ceph): librados: enable object versioning
- 12:50 AM Revision f36677f8 (ceph): librados: update C api
- 12:49 AM Revision f8af4f2c (ceph): osd: add watch/notify timeout
- 12:49 AM Revision cc62f2eb (ceph): osd: fix bad mutex lock
- 12:49 AM Revision e0c548ad (ceph): osd: fix ms_handle_reset
- 12:49 AM Revision d5cc6732 (ceph): osd: some notify related cleanups
- 12:49 AM Revision 7272bfec (ceph): osd: send notify response from reset handler if needed
- 12:49 AM Revision d66b52e1 (ceph): osd: watch infrastructure
- third attempt
- 12:49 AM Revision 2b5e61ca (ceph): osd: send notification id
- 12:49 AM Revision 59e61d0e (ceph): osd: discard of disconnected watchers
- still need to add a timeout
- 12:49 AM Revision f5f33822 (ceph): osd: send notify reply if there are not watchers
- 12:49 AM Revision 9437ea84 (ceph): osd: add user_version field in obect_info_t
- 12:49 AM Revision 7bda45a1 (ceph): osd: reply with either user_version or at_version, depends on the op
- 12:49 AM Revision f7b7d67a (ceph): osd: check requested watch version number
- send appropriate status code if needed
- 12:47 AM Revision 2bce34e7 (ceph): osd: handle watch op, register client on object xattr
- 12:47 AM Revision 3110e361 (ceph): osd: basic watch/notify handling
- 12:47 AM Revision e493c7ae (ceph): osd: handle notify-ack
11/23/2010
- 11:39 PM Revision 2f13dd8e (ceph): gui: more reindenting
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 11:37 PM Revision 66a78c23 (ceph): gui: reindent a bunch of code
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 10:40 PM Revision d8652de6 (ceph): mdcache: in trim_non_auth, only print out path if it has a parent dentry.
- This should only occur with the root inode, but caused a segfault for
anybody running more than one MDS who restarted...
- 10:04 PM Revision 8768b52d (ceph): mds: Reply checking_lock while reading filelock
- Use checking_lock to replace lock_state in the extra buffer list so that the client gets the correct file lock reply.
- 09:59 PM Revision 4041bf0d (ceph): mds: fix set_state_rejoin auth_pin check
- We carry an auth pin IFF !stable AND auth.
Signed-off-by: Sage Weil <sage@newdream.net>
- 09:59 PM Revision 5ed06ffc (ceph): client: remove inode from flush_caps list when auth_cap changes
- Avoid confusing other code (e.g. kick_flushing_caps) by staying on the mds
flushign_caps list when we don't even have...
- 09:52 PM Revision 285cc946 (ceph): osd: fix is_all_uptodate()
- This should only return true when recovery is done, i.e., no more missing
objects. Nothing to do with unfound.
Sign...
- 09:52 PM Revision 36f703e1 (ceph): osd: removing unused variable, fix warning
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:52 PM Revision 413ecb0b (ceph): osd: only search_for_missing if there are unfound objects
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:52 PM Revision 671b1c09 (ceph): osd: add get_num_unfound() helper
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:52 PM Revision 7ea7a435 (ceph): osd: only discover_all_missing if unfound
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:52 PM Revision 5452dae6 (ceph): osd: recover_primary() until primary has all found objects
- The logic in that if was effectively reversed.
Signed-off-by: Sage Weil <sage@newdream.net>
- 09:52 PM Revision 5498c467 (ceph): osd: fix recover_replicas() unfound check
- missing_loc.count(soid) == 0 only means unfound if it's not missing on the
primary.
Signed-off-by: Sage Weil <sage@n...
- 09:52 PM Revision e97eae15 (ceph): init-ceph: tolerate failure in cleanalllogs
- Otherwise /var/log/ceph/stat makes rm -f error out and we fail.
Signed-off-by: Sage Weil <sage@newdream.net>
- 09:52 PM Revision 84612286 (ceph): Build might_have_unfound set at activation
- The might_have_unfound set is used by the primary OSD during recovery.
This set tracks the OSDs which might have unfo...
- 09:52 PM Revision 0e15da8d (ceph): Rename peer_summary_requested to peer_backlog_req
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 09:52 PM Revision c0c301d5 (ceph): osd: PG::read_log: don't be clever with lost xattr
- Formerly, we had a special case in read_log for dealing with objects
whose objects were present on the disk, but not ...
- 09:52 PM Revision 55570baf (ceph): osd: fix PG::is_all_uptodate
- In PG::is_all_uptodate, don't try to look for peer_missing[osd->whoami].
The primary keeps that in PG::missing!
Sign...
- 08:26 PM Revision 36c6569c (ceph): monmaptool: Return a non-zero error code and print a useful error
- message if unable to read the monmap file.
Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
- 06:14 PM Feature #610 (Resolved): gui: make PG view prettier
- The ceph -g GUI should display PGs in a list, rather than as icons that have to be clicked on. We should get rid of t...
- 06:13 PM Bug #604 (Resolved): Compiler warning: 'status' may be used uninitialized in this function
- Fixed by commit:d6e8e8d15d22b51ec86bc5687336c3d50d9b3a5d
We should change PG view on the GUI to be a list view at ...
- 05:43 PM Revision fc212548 (ceph): mds: allow for old fs's with stray instead of stray0
- New fs's get stray0, but we want to still behave with old ones.
Signed-off-by: Sage Weil <sage@newdream.net>
- 05:37 PM Revision de61991a (ceph): Merge branch 'testing' into unstable
- Conflicts:
configure.ac
- 03:00 PM Bug #531: Journaling Causes System Hang
- Awesome, thanks for the help. I will give these patches a shot towards the end of the week.
Thanks
- 02:43 PM Bug #599 (Resolved): recover_master_log, doesn't
- There were two problems here:
1) we were restarting the osds before the monitors, which in this case prevented a f...
- 02:01 PM Linux kernel client Bug #552: Samba with kernel oplocks=on produces lots of corrupt mds entries in dmesg
- Our friends at Tcloud just submitted patches for this today, which I've applied to the unstable branch of our kernel ...
- 11:46 AM CephFS Feature #593 (Rejected): mds: fsck: anchor table repair
- dup
- 11:42 AM Feature #609 (Resolved): osd: query pool/pg for objects with given xattr
- This will probably take the form of a pool class plugin?
It could start as just a hack, for now.
- 11:03 AM Bug #595: Autogen: not a literal
- This problem does not seem to occur using 2.68 on my local machine. Slider et al. seem to be using 2.67.
- 09:39 AM CephFS Bug #608 (Resolved): mds: MDCache::create_system_inode()
- this should be fixed by commit:fc212548aea1d7f001b56ba096a79ba54b8a92c3
Thanks! - 07:09 AM CephFS Bug #608 (Resolved): mds: MDCache::create_system_inode()
- On a small test cluster I saw that my MDS was not coming up after a fresh mkcephfs, this is what the log showed:
<... - 09:33 AM Tasks #584: do throughput scaling tests on sepia
- What was the variance in per-node throughput? Did we have one node dominating?
- 09:22 AM Tasks #584 (In Progress): do throughput scaling tests on sepia
- There's definitely a problem here; the total throughput should be scaling more or less linearly until we hit a bottle...
- 07:44 AM Bug #563: osd: btrfs, warning at inode.c ( btrfs_orphan_commit_root )
- I'll have to rebuild, since I didn't look at the messages that closely.
- 07:02 AM Revision 868665d5 (ceph): v0.23.1
- 06:41 AM Revision c327c6a2 (ceph): mon: always use send_reply for auth replies
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:41 AM Revision 61dd4f03 (ceph): mon: simplify send_reply code
- No need to specify destination in send_reply, as we always have the request
for reference.
Simplify MRoute construct...
- 01:37 AM Revision 2c71bd33 (ceph): osd: add assert to _process_pg_info
- When activating an inactive replica, assert that we are doing so based
on a message from the primary.
Signed-off-by:...
- 01:35 AM Revision a70943fd (ceph): osd: re-indent some code in _process_pg_info
- Re-indent the code and add a comment.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 12:12 AM Revision 71369541 (ceph): msgr: tolerate 0 bytes from tcp_read_nonblocking
- This can happen, I believe, when we get a signal or something.
Signed-off-by: Sage Weil <sage@newdream.net>
- 12:12 AM Revision 7ec0034b (ceph): init-ceph: fix (and test!) cleanlogs and cleanalllogs
- Signed-off-by: Sage Weil <sage@newdream.net>
- 12:03 AM Revision 7b4a801f (ceph): mds: fix rejoin_scour_survivor_replicas inode check
- We want to remove replicas that we don't ack, but those don't appear in
the strong_inode map; they're appended to the...
11/22/2010
- 11:08 PM Revision 8d95b5b6 (ceph): messenger: init rc to -1, removing compiler warning.
- This actually is initialized before all uses, but compilers tend to
have trouble with assignment in if-else branches,...
- 11:08 PM Revision dd11fe27 (ceph): types: Allow inodeno_t structs to alias.
- This removes a compiler warning that appeared in a gcc upgrade and
is apparently erroneous, about its usage violating...
- 10:56 PM Bug #540 (Resolved): CephxClientHandler::handle_response
- couldn't reproduce this, but fixed two smallish things that may have been responsible for this:
commit:61dd4f03e6e15...
- 10:35 PM Linux kernel client Bug #552: Samba with kernel oplocks=on produces lots of corrupt mds entries in dmesg
- From the reply dump, it looks like a ceph_mds_reply_head, a length 0 tracebl, a length 1 extrabl (containing a u8 == ...
- 09:25 PM Revision ac6b018a (ceph): Causes the MDSes to switch among a set of stray directories when
- switching to a new journal segment.
MDSCache:
The stray member has been replaced with strays, an array of inodes
r... - 09:16 PM Revision 3f8f5905 (ceph): Timer must be initialized in Client::init and shutdown in
- Client::shutdown.
Signed-off-by: Samuel Just <samuelj@hq.newdream.net> - 06:47 PM Revision 8eb4de9e (ceph): generate_past_intervals:generate back to lastclean
- PG::generate_past_intervals needs to generate all the intervals back to
history.last_epoch_clean, rather than just to... - 06:07 PM Revision 80f28235 (ceph): vstart.sh: 'init-ceph stop' instead of 'stop.sh'
- This just makes it easier to run multiple vstart sessions as the same user
on the same host.
Signed-off-by: Sage Wei... - 05:55 PM Revision 53d0650a (ceph): Merge branch 'osd_msgr' into unstable
- 05:55 PM Revision cd53719f (ceph): mds: resolve cleanup
- Only track ambiguous imports and such if we get a resolve message while in
the resolve state.
Signed-off-by: Sage We... - 05:55 PM Revision c0c81d53 (ceph): mds: trim exported subtree _after_ adjusting auth
- We need to set the subtree bounds before trimming it away, or else we may
throw out things we're still auth for.
Sig... - 05:55 PM Revision 9e15ade8 (ceph): mds: do not eval subtree root when replay|resolve
- This is nonsensical. And can lead to scatter_writebehind, which breaks
horribly.
Signed-off-by: Sage Weil <sage@new... - 05:55 PM Revision 27c6f217 (ceph): mds: remove bogus assert
- Causes problems during resolve finish.
Signed-off-by: Sage Weil <sage@newdream.net> - 05:49 PM Revision 924b1fcb (ceph): osd: bind to new cluster address when wrongly marked down
- If we come back up on the same address, there is a possible race. Other
nodes will mark_down when they see us go dow... - 05:45 PM Revision 19409763 (ceph): msgr: implement rebind() to pick a new port
- Closes out all old connections and binds to a _different_ port. This
ensures that someone doing mark_down on our old... - 05:09 PM Revision f7170f95 (ceph): client: only encode_cap_releases once per request.
- Accomplish this by making a list of cap releases in the (permanent)
MetaRequest, and then copying that into the (pote... - 04:36 PM Bug #607 (Rejected): osd: ReplicatedPG: sub_op_modify: fix creation of ObjectState
- There's a part of the ReplicatedPG::sub_op_modify code that goes like this:
> // do op
> ObjectStat... - 04:29 PM CephFS Feature #91: mds: up:shadow mode
- Updated Journaler to make new interface options asynchronous.
Presently working on how to disambiguate between a one... - 03:48 PM Tasks #584 (Resolved): do throughput scaling tests on sepia
- Results of running rados -p bench bench 20 write on <Nodes>. <Average Throughput> is the average of the Bandwidth st...
- 01:24 PM CephFS Feature #88 (Resolved): mds: change stray commit strategy to avoid rolling stray dir commits
- commit:ac6b018acbeaf8670f8c268db164cfb8a12c171d
- 12:59 PM Bug #563: osd: btrfs, warning at inode.c ( btrfs_orphan_commit_root )
- Is the stack trace you're getting now identical, or different? The FileStore.cc change _should_ have avoided the asy...
- 09:28 AM Bug #563: osd: btrfs, warning at inode.c ( btrfs_orphan_commit_root )
- Just to update the issue, Sage asked me to change something in FileStore.cc, tried that for some days, but that didn'...
- 12:47 PM CephFS Feature #606 (Duplicate): mds: optionally store parent attr on file objects
- The goal is to be able to find files contained in rebuilt directories (#603). We can store the same attrs we do for ...
- 12:45 PM CephFS Feature #605 (Rejected): mds: verify/repair anchor table
- - Make sure every item we encounter while traversing the that is anchored correctly appears in the anchor table.
- M... - 12:44 PM Bug #604 (Resolved): Compiler warning: 'status' may be used uninitialized in this function
- In gui.cc
The warning's location references are a bit off, but the function gen_node_info_from_icons declares a "sta... - 12:43 PM CephFS Feature #603 (Resolved): mds: repair directory hierarchy
- The goals are
- rebuild missing/corrupt directories
- repair multiple primary links to directories
We'll do so... - 12:40 PM CephFS Feature #602 (Resolved): mds: handle corrupt/missing journals
- This probably means
- shutting down current instances, resetting cluster membership
- throwing out journals (or m... - 12:37 PM CephFS Feature #601 (New): mds: order directory commits after rename
- When we rename something between directories, we should try to commit the target directory _before_ the source direct...
- 12:34 PM CephFS Feature #600 (Resolved): mds: store full trace on directories
- Currently we only store the immediate parent; store a full trace up to the root. This is CInode::encode_parent_mutat...
- 12:17 PM Bug #599: recover_master_log, doesn't
- Also, I have verified that osd3 and osd9 did NOT crash. They're still running, and they did receive the messages from...
- 12:13 PM Bug #599 (Resolved): recover_master_log, doesn't
- This is another peering bug. We found it on wido's cluster. Basically, peering never completes.
I just examined PG... - 09:52 AM Bug #592 (Resolved): osd: rebind cluster_messenger when wrongly marked down
- commit:53d0650a42cbfd2f02db2c708a570b6d9e116bb4
- 09:14 AM CephFS Bug #596 (Resolved): crash during mds reconnect
- Well, that seems to fix it. I added a releases vector to the MetaRequest so it will only encode the releases once, and...
- 08:49 AM Bug #598 (Resolved): osd: journal reset in parallel mode acts weird
- from ML:...
- 04:52 AM Revision 51abcaa2 (ceph): mon: clean up cluster_addr code a bit, better debug output
- Signed-off-by: Sage Weil <sage@newdream.net>
- 04:52 AM Revision 20313644 (ceph): osdmap: fix cluster_addr encoding; printing
- The cluster addrs were getting lost because we were checking v instead of
ev.
Signed-off-by: Sage Weil <sage@newdrea... - 04:52 AM Revision 28498a00 (ceph): osd: send correct ip addrs to monitor for cluster_, hb_addr
- Signed-off-by: Sage Weil <sage@newdream.net>
- 03:59 AM Revision ec434eda (ceph): osd: unconditionally set up separate msgr instance for osd<->osd msgs
- Always set up cluster_messenger (before we would only do so if there was
an explicit address configured for it). The... - 12:16 AM Revision 0dddf453 (ceph): filestore: only warn about disk write cache on kernels <2.6.33
- Signed-off-by: Sage Weil <sage@newdream.net>
- 12:15 AM Revision 0856f57e (ceph): osd: fix search_for_missing: old last_update implies object not present
- For example, if an osd sends an empty PG::Info (last_update = 0'0) and
empty missing, we should not conclude that the... - 12:09 AM Revision 6ef5c2f3 (ceph): init-ceph: fix cleanlogs for no log_sym_dir case
- Signed-off-by: Sage Weil <sage@newdream.net>
11/21/2010
- 07:55 PM Linux kernel client Bug #549 (Resolved): bonnie++ file stat failure
- commit:3105c19c450ac7c18ab28c19d364b588767261b3
- 03:50 PM Bug #592: osd: rebind cluster_messenger when wrongly marked down
- I think the cleanest solution here is to re-bind the cluster_messenger to a new port when we are marked down and go b...
- 03:38 PM Linux kernel client Bug #597 (Closed): Reproducible crash mounting multiple directories from a pool
- This bug was fixed in v2.6.36, commit:ca04d9c3ec721e474f00992efc1b1afb625507f5. Thanks for the report though! :)
- 03:34 PM Linux kernel client Bug #597: Reproducible crash mounting multiple directories from a pool
- Should have mentioned - this is with the Ubuntu 10.10 desktop kernel, which is 2.6.35-22, I think.
- 03:33 PM Linux kernel client Bug #597 (Closed): Reproducible crash mounting multiple directories from a pool
- When trying to mount a pool multiple times (with different subdirectories) I get a consistent system hang.
Steps t...
11/20/2010
- 05:06 PM Bug #531: Journaling Causes System Hang
- Please try out the patches in the filestore_throttle branch, commit:b28c0bf82ac28ded4fe85573d32fdc111c66e50b
It lo... - 03:15 AM Revision fc9b0976 (ceph): OSDMap: const cleanup
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 03:14 AM Revision 2a5c3893 (ceph): mds-dumper: Define Dumper::~Dumper()
- To fix compile error.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
11/19/2010
- 10:21 PM Revision 8566c5cd (ceph): ReplicatedPG::pull: fix test for unfound
- The test for unfound objects was reversed, leading us to try to pull
unfound objects and refrain from pulling objects... - 09:41 PM Revision 2f5502fa (ceph): osdmap: fix printing, again
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:21 PM CephFS Bug #596: crash during mds reconnect
- The encode_cap_releases can only be called _once_, the very first time we send the request. So at some level this is...
- 04:22 PM CephFS Bug #596 (Resolved): crash during mds reconnect
- While testing my Journaler changes, I got a cfuse segfault. My steps:
vstart with 1 of each daemon
mount cfuse
cop... - 06:17 PM Revision 4303820b (ceph): Merge remote branch 'origin/mds' into unstable
- 04:26 PM CephFS Feature #91 (In Progress): mds: up:shadow mode
- I've been getting some proper time in on this on and off over the last few days. Pushed the Journaler changes to the ...
- 03:52 PM Bug #531: Journaling Causes System Hang
- Okay,
More updates.
1) All the VMs deployed okay but it looks like towards the end of the deployments I hit the... - 02:49 PM Bug #531: Journaling Causes System Hang
- Okay,
I just started the deployment of 12 vms on a new cephfs with 3 osds in and ssd's for journals on all the sys... - 02:37 PM Bug #531: Journaling Causes System Hang
- I am working on getting the output now. We are having to work on several projects at once right now. Sorry for the de...
- 03:36 PM Bug #595 (Won't Fix): Autogen: not a literal
- We get this running on autoconf 2.67:
configure.ac:6: warning: AC_INIT: not a literal: Sage Weil <sage@newdream.net>... - 02:29 PM CephFS Bug #594 (Resolved): mds: frag split/merge vs replay
- Need to reconcile refragmenting with resolve stage. Currently handle_resolve assumes frags match, when in reality th...
- 12:11 PM Bug #585 (Resolved): OSD: ReplicatedPG::pull
- Fixed by commit:82f1de8c0d6e7817ca7d6dd710e3176b2a549e12
- 10:43 AM Bug #585 (In Progress): OSD: ReplicatedPG::pull
- need to see what's going on with this
- 11:47 AM Bug #503 (Closed): osd: query osds since last_epoch_clean before concluding objects lost?
- 11:39 AM Bug #515 (Can't reproduce): osd: recovery isn't completing
- with the recent changes i'm closing this one out, and will reopen with specifics if it comes up in testing over the nex...
- 10:14 AM CephFS Feature #545 (Resolved): mds: use bloom filter to supplement dirfrag COMPLETE flag
- merged commit:4303820b43721a8b46ef36d0e9ef4e1167857c80
- 09:38 AM CephFS Feature #593 (Rejected): mds: fsck: anchor table repair
- We need to be able to fix up the anchor table when there are problems, to avoid e.g....
- 05:13 AM Revision b91e14e1 (ceph): multi-dump.sh: add diff mode
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 04:57 AM Revision 9cab522e (ceph): Add multi-dump.sh
- This is a debug tool that can dump out Ceph information at various
epochs. For instance, it can show how the OSDmap c...
11/18/2010
- 11:05 PM Revision 6e2b594b (ceph): ReplicatedPG::get_object_context: fix broken calls
- ReplicatedPG::get_object_context takes three parameters. The last two
are "const object_locator_t& oloc" and "bool c... - 09:50 PM Bug #592: osd: rebind cluster_messenger when wrongly marked down
- Ah. Looks like you got it figured out.
I wasn't aware of what mark_down did.
Just in case anyone finds it useful... - 09:22 PM Bug #592: osd: rebind cluster_messenger when wrongly marked down
- ok, this is a problem with how the osd is interacting with the messenger. looking at the history of 0.5, we see
<pr... - 08:42 PM Bug #592: osd: rebind cluster_messenger when wrongly marked down
- i suspect 0.5 didn't get set up on osd1 or 2 before osd0 went down? do you have the full logs for the other instances?
- 05:07 PM Bug #592: osd: rebind cluster_messenger when wrongly marked down
- I should also add that Greg Farnum helped me examine the logs for this bug.
- 05:03 PM Bug #592 (Resolved): osd: rebind cluster_messenger when wrongly marked down
- This happened with commit:323565343071ce695f7d454ed29590688de64d5d on flab.ceph.dreamhost.com
While running test_u... - 08:50 PM Revision 43e0b267 (ceph): ReplicatedPG: call finish_recovery when needed
- Don't loop in ReplicatedPG::start_recovery_ops. There is already a loop
in both recover_replicas and recover_primary ... - 08:33 PM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
- Colin McCabe wrote:
> Another potential issue that I can see here is that the code in OSD::_process_pg_info doesn't ... - 12:43 PM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
- Another potential issue that I can see here is that the code in OSD::_process_pg_info doesn't check whether it got a ...
- 09:26 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
- Need to look at this more closely. Fred, pretty sure no data is lost here, but the recovery code needs some fixing.
... - 06:19 AM Bug #590 (Resolved): osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
- After upgrading to ceph 0.23, the cluster (3 osd, 3 mon, 3 non-clustered mds) worked for about 2 hours and then one c...
- 06:09 PM Revision ea5d1d66 (ceph): osd_resurrection_1_impl: turn on recovery at end
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 09:47 AM Feature #526 (Resolved): osd: unfound objects rework
- We now let the PG become active even when there are unfound objects. When the user tries to read one of those objects...
- 07:39 AM Linux kernel client Feature #591 (Resolved): implement FALLOC_FL_PUNCH_HOLE
- 12:52 AM Revision 4adfdee7 (ceph): Makefile: fix builddir weirdness
- Signed-off-by: Jim Schutt <jaschut@sandia.gov>
- 12:10 AM Bug #585: OSD: ReplicatedPG::pull
- Well, it did show up again:...
11/17/2010
- 10:37 PM Revision 7e9812b4 (ceph): osd: rev PG::Info encoding for last_epoch_clean change
- This was missed by 184fbf582b27c10b47101735a4495fe8c73ad186, so any fs
created between now and then won't decode prop... - 09:06 PM Revision c17e7da4 (ceph): Merge branch 'mds_frags' into unstable
- 09:06 PM Revision 7f6a2561 (ceph): mds: clear PIN_SUBTREE on split/merge in purge_strays
- This makes the helper work for merge as well as split. Remove the special
fixups in the caller that were making spli... - 09:06 PM Revision 66d43ac8 (ceph): mds: fix subtree map update on dirfrag merge
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:06 PM Revision b705be11 (ceph): mds: wrlock scatterlocks to prevent a gather racing with split/merge lo...
- We have the dirs split in our cache for some time while journaling it to
disk, before the fragment_notify goes out. ... - 09:06 PM Revision f6823a79 (ceph): mds: adjust dir_auth_pins on steal_dentry
- dir_auth_pins is a counter of dentry auth_pins in the current dir; those
need to be added in when stealing.
Signed-o... - 09:06 PM Revision cd5ee006 (ceph): mds: initialize PIN_SUBTREE on split
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:06 PM Revision d538817f (ceph): mds: flush log on fragment
- This makes request lock auth_pins expire, so the fragment moves along.
Otherwise we can end up waiting for the log fl... - 09:06 PM Revision 3777ff8a (ceph): mds: move dirty rstat inodes to new dir on refragment
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:06 PM Revision 669b5544 (ceph): mds: don't complete freeze while parent inode is frozen
- This makes maybe_finish_freeze() conditions match that of is_freezeable()
and avoids an assert.
Signed-off-by: Sage ... - 09:04 PM Revision b58b8d09 (ceph): mds: fix discover requests, tracking wrt fragments
- Track discover requests by tid. The old system of tracking outstanding
discovers was kludgey and somewhat broken. A... - 09:02 PM Revision a63c06c8 (ceph): mds: fix EFragment replay
- If the inode already exists in our cache, adjust our (existing) fragments.
But it might not. In that case, we just r... - 09:02 PM Revision a961049b (ceph): mds: don't fragment mdsdir or .ceph
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:48 PM Revision b54880e0 (ceph): Detect broken system linux/fiemap.h
- RedHat 5.5 has a /usr/include/linux/fiemap.h, but it is
broken because it does not itself include linux/types.h.
As a... - 06:24 PM Revision 29a9e668 (ceph): osdmap: don't include blacklist info in summary
- It's confusing users and isn't that important.
Signed-off-by: Sage Weil <sage@newdream.net> - 05:58 PM Revision c43455ce (ceph): client: Remove the I_COMPLETE flag from the parent directory in relink_...
- This papers over issues arising from the client's lack of proper support
for hard links, and lets it pass the snaptes... - 02:35 PM Bug #589 (Resolved): OSD: crash on startup, PG::read_state
- Ok, this is fixed by commit:7e9812b4a9bbf320a8b0bd0abec48c1c5d78fe66. Assuming your fs is old enough you should be o...
- 11:38 AM Bug #589 (Resolved): OSD: crash on startup, PG::read_state
- After upgrading to today's unstable all my OSD's crashed directly after startup, for example osd0:
Last loglines a... - 12:56 PM Bug #531: Journaling Causes System Hang
- Just pinging you on this one. If you can send the logs I'd like to sort this out. Thanks!
- 09:59 AM CephFS Bug #344: cfuse should pass all qa tests
- At this point the only test it's failing is bonnie. This one tends to fail on a SEGV that just keeps going through th...
- 09:57 AM CephFS Bug #583 (Resolved): cfuse fails snaptest-upchildrealms
- Okay, a proper fix for this is going to require a bit of work, since right now Inodes can only have one parent dentry...
- 09:52 AM CephFS Cleanup #588 (Resolved): Allow Inodes to have multiple parent Dentries
- Right now, cached Inodes can only have one parent Dentry. This is unfortunate when there are multiple hard links to a...
- 09:40 AM Tasks #587 (Rejected): install mpich2 on sepia*
- this will make management and testing easier
- 07:52 AM Bug #585 (Closed): OSD: ReplicatedPG::pull
- This one should also be fixed in the latest unstable. Probably. The recovery code is still being worked on a bit, b...
- 02:55 AM Bug #585 (Resolved): OSD: ReplicatedPG::pull
- On two OSDs (osd5 and osd10) I'm seeing the same crash; they crash almost directly after starting.
I cranked ...
- 07:19 AM Bug #586 (Resolved): OSD: Crash during scheduled scrub
- This was fixed in the commit right after what you were running, commit:556ba7397c352f5a6cb7fe03087c6e2f51dbce32
- 05:31 AM Bug #586 (Resolved): OSD: Crash during scheduled scrub
- After I reported #585 I didn't pay much attention to my cluster, until I found out that I had only one OSD left onlin...
- 12:09 AM Revision d57181d3 (ceph): config: added max_mds
- MDSMonitor: create_new_fs adapted to use the max_mds parameter
max_mds is now a configurable value and create_new_fs...
11/16/2010
- 09:00 PM Tasks #584 (Rejected): do throughput scaling tests on sepia
- Use rados bench on N nodes, scaling N, and see how the throughput scales.
- 08:09 PM Revision c4931265 (ceph): mds: make dirfrag thrashing join and split
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:09 PM Revision d1dcc035 (ceph): mds: allow frag merge on subtree root
- Fix purge_stolen and adjust_dir_fragments.
Signed-off-by: Sage Weil <sage@newdream.net> - 08:08 PM Revision 8f24919d (ceph): mds: add timestamp to LogEvents
- This just gives us a bit of useful info when debugging problems.
Signed-off-by: Sage Weil <sage@newdream.net> - 06:32 PM Revision 56b9e927 (ceph): osd: fix trailing + in pg state string rendering
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:21 PM CephFS Bug #583: cfuse fails snaptest-upchildrealms
- Looks like the problem is caused by linking b/bar to b/foo. The server response goes through insert_dentry_inode v...
- 06:17 PM CephFS Bug #583 (Resolved): cfuse fails snaptest-upchildrealms
- Fails to rm a/b, ENOTEMPTY.
- 06:11 PM Feature #582 (Closed): Make max_mds configurable
- 03:06 PM Feature #582 (Closed): Make max_mds configurable
- Right now the only way to set it is with the set_max_mds mon command. Add it to the config stuff and have create_new_...
- 06:10 PM Revision 2c9873f0 (ceph): Merge remote branch 'origin/unfound' into unstable
- 06:06 PM Revision d17f7444 (ceph): mds: be less noisy about cap imports
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:01 PM Revision 05bd6b07 (ceph): Merge branch 'mds_dir_hash' into unstable
- 06:01 PM Revision e146767e (ceph): mds: make dentry hash a dir layout property
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:01 PM Revision cc709df8 (ceph): mds: add DIRLAYOUTHASH feature bit
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:01 PM Revision be29e4c3 (ceph): mds: set mode before all the file type dependent inode initialization!
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:01 PM Revision 33580460 (ceph): mds: set dir hash on root inode
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:01 PM Revision 77c05fbc (ceph): mds/client: pass dir hash over the wire
- Add a feature bit DIRLAYOUTHASH.
Also fix client request routing for lookups (we were only hashing when
a Dentry poi... - 05:13 PM Bug #479: ceph/mount crash badly when writing
- Sorry Sage and Yehuda for the late update..
I was spending time experimenting, and just using the default btrfs with... - 01:48 PM Bug #538: Write performance does not scale over multiple computers
- Did you update your installed version of the rados tool as Sage said? If you did and are still getting poor performan...
- 12:48 PM Bug #518: cfuse crashed on ls
- Confirmed this is fixed in 0.23.1 (sorry for the huge delay in confirmation).
- 12:06 PM CephFS Feature #483 (Resolved): mds: add timestamp to LogEvent
- commit:8f24919d39734cf518f2bf6e50faf6f5266d6eff
- 11:52 AM CephFS Feature #560 (Resolved): mds: alternate directory hashing
- kernel part is done and in unstable branch, currently commit:9f62e3eaafd52875e1f2e4344e11e51ddb726f48
- 09:59 AM CephFS Feature #560: mds: alternate directory hashing
- commit:05bd6b078d743d6c235c0fcedda7ee4f64ab2ad5 has it working for the user client.
- 02:33 AM Revision 267cd845 (ceph): RadosClient::shutdown: call monclient::shutdown
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 02:22 AM Revision dfb78ebf (ceph): osd: don't stop recovery when there are unfound
- There are two phases in recovery: one where we get all the right objects
on to the primary, and another where we push... - 01:03 AM Revision d014acb6 (ceph): dumpjournal.cc: fix compile
- dumpjournal needs to create its own SafeTimers and pass them in to some
constructors.
Signed-off-by: Colin McCabe <c... - 12:44 AM Revision da2d5018 (ceph): rbd: fix rbd snap rm class handling
11/15/2010
- 10:59 PM Revision 250d414e (ceph): Merge remote branch 'origin/unfound_last_epoch_clean' into unstable
- 10:47 PM Revision c7075115 (ceph): Add ./ceph osd tell <osd-num> dump_missing <out>
- Add a command that tells the OSD to dump its missing set for all PGs to
a file. This should be useful for debugging m... - 10:38 PM Revision 755f5759 (ceph): search_for_missing:recalc stats if unfound changed
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 09:31 PM Revision d883a547 (ceph): mds: Use CDir bloom filter as appropriate.
- Add items to the bloom filter when trimming, and look for them
in the filter in the few places where a simple existen... - 09:31 PM Revision be2da00a (ceph): mds: Add bloom filter to CDir.
- You can now add items to a bloom filter and check for their existence.
This is intended to be used when trimming item... - 09:23 PM Revision 1fe31e18 (ceph): timer: make init/shutdown explicit
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:39 PM Revision d2af7b7e (ceph): test_unfound.sh: start recovery at end of test
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 08:31 PM Revision c293b9af (ceph): test_common.sh: add dump_osd_store
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 08:15 PM Revision 184fbf58 (ceph): osd: add last_epoch_clean to PG::Info
- This changes the encoding in a non-backwards compatible way.
Signed-off-by: Sage Weil <sage@newdream.net> - 08:15 PM Revision 873e9bf8 (ceph): osd: add incompat feature LEC for last_epoch_clean
- So an old binary will fail to mount a store with new Info encoding.
Signed-off-by: Sage Weil <sage@newdream.net> - 08:15 PM Revision b0c22bd5 (ceph): Add MOSDPGMissing
- Add MOSDPGMissing, a message which just contains the missing objects
information for a PG. We will request messages l... - 08:15 PM Revision d3cf4787 (ceph): PG::finish_recovery: set info.last_epoch_clean
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 08:15 PM Revision e768bbdf (ceph): Add stray_test to test_unfound.sh
- This test is designed to produce a stray that nonetheless has some
useful objects. The primary should be able to find... - 08:15 PM Revision 796ff1d1 (ceph): Fix bugs in search_for_missing, _process_pg_info
- PG::search_for_missing: fix a bug with the handling of MSG_OSD_PG_INFO
messages. Formerly, when processing these mess... - 08:15 PM Revision e3f65076 (ceph): osd: add discover_all_missing
- Add discover_all_missing. This function makes sure that we have messages
en route to any OSD that we think might have...
- 08:15 PM Revision 470b1990 (ceph): stray_test:don't use up/down. timeout extension
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 08:15 PM Revision 05a16d32 (ceph): test_unfound.sh: fix return codes
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 08:15 PM Revision 6a65cc4f (ceph): test_common.sh: remove messenger debug for now
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 08:06 PM Revision 873180aa (ceph): osd: skip unfound in recover_replicas
- This is moot currently, since we don't currently start recovering replicas
until the primary is complete.
Signed-off...
- 08:04 PM Revision d61bc3bf (ceph): osd: skip unfound objects in recover_primary()
- We also need to make sure we come back later when they are found.
Signed-off-by: Sage Weil <sage@newdream.net>
- 07:57 PM Revision 9ea1d8bb (ceph): osdmap: make printing a bit easier to read
- Signed-off-by: Sage Weil <sage@newdream.net>
- 07:50 PM Revision beae97f9 (ceph): objecter: don't dereference null op->outbl
- Signed-off-by: Sage Weil <sage@newdream.net>
- 07:36 PM Revision 089cd12d (ceph): include: Add bloom filter library to include/
- Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
- 07:25 PM Revision f2c080b3 (ceph): Merge remote branch 'origin/testing' into unstable
- 07:25 PM Revision 556ba739 (ceph): osd: unreg scrub when removing pg
- This fixes this crash:
osd/OSD.cc: In function 'PG* OSD::_lookup_lock_pg(pg_t)':
osd/OSD.cc:956: FAILED asse...
- 04:54 PM CephFS Feature #560: mds: alternate directory hashing
- almost there. need to fix/test uclient hashing.
then implement for kclient...
- 04:44 PM Bug #580 (Resolved): rbd rm snap is broken
- Fixed with commit:da2d50180dfdc0e30b4348f2acceb2be650f20b7
- 03:42 PM Bug #580 (Resolved): rbd rm snap is broken
- When doing 'rbd rm snap', the rbd image header gets corrupted.
- 01:49 PM Bug #535 (Resolved): cephtool hangs forever until a UNIX signal is received
- Sage spent some time on the messenger too, and I suspect we're done now.
- 01:39 PM CephFS Feature #545: mds: use bloom filter to supplement dirfrag COMPLETE flag
- Pushed it to branch "mds" (which I apparently created, but thought existed...weird!). Testing it now on a secondary i...
- 11:19 AM Bug #579 (Resolved): OSD::sched_scrub: FAILED assert(pg_map.count(pgid)
- commit:f46f674261bf65a6f7f6313fb688ec4773f526b5
- 10:56 AM Bug #579: OSD::sched_scrub: FAILED assert(pg_map.count(pgid)
- Some more information about this bug.
OSD1 and OSD2 have a PG named 0.6
OSD0 does not.
=====================
...
- 10:51 AM Bug #579 (Resolved): OSD::sched_scrub: FAILED assert(pg_map.count(pgid)
- On unfound_last_epoch_clean at commit commit:7201497f2feef6a2bbd0baf89e3a14b8a880e79f
I found this assert when run...
- 07:05 AM Bug #538: Write performance does not scale over multiple computers
- I set 'osd heartbeat grace=120' and that got rid of the chatter. My performance is now:...
- 04:48 AM Revision 7f38858c (ceph): Merge branch 'msgr_zerocopy_read' into unstable
- 04:39 AM Revision 7cb2d508 (ceph): msgr: use provided rx buffer if present
- This changes the read path so that we hold the Connection::lock mutex while
reading data off the socket. This ensure...
- 04:39 AM Revision e8132cd9 (ceph): objecter: post rx buffer to msgr if target bufferlist is present
- Signed-off-by: Sage Weil <sage@newdream.net>
- 04:39 AM Revision 975dd8fa (ceph): librados: pass provided buffer to objecter on rados_read
- This allows us to avoid the data copy if the objecter and msgr manage
to use it.
Signed-off-by: Sage Weil <sage@n...
- 04:23 AM Revision 2854dae8 (ceph): msgr: add Connection rx buffer interface
- Signed-off-by: Sage Weil <sage@newdream.net>
- 04:23 AM Revision c04ba725 (ceph): msgr: implement get_connection()
- Get a Connection* for the given destination. This mirrors submit_message,
but does not actually queue a message.
Si...
- 04:21 AM Revision 67852352 (ceph): buffer: implement list::iterator::get_current_ptr()
- Return a buffer::ptr for the ptr at the current position/offset, with the
length set to the remaining space in the cu...
11/14/2010
- 09:05 PM Messengers Feature #527 (Resolved): zero copy reads, msgr rx buffer infrastructure
- commit:7f38858c0c19db36c5ecf36cb4d333579981c811
- 07:29 PM Revision 4af14db4 (ceph): Objecter::shutdown: shut down timer.
- We have to explicitly shut down the timer in Objecter::shutdown.
Otherwise, we are relying on the destructor of SafeTi...
- 11:33 AM Bug #578 (Resolved): assert triggered on radostool shutdown
- 11:33 AM Bug #578: assert triggered on radostool shutdown
- Fixed by commit:4af14db424e770c2f3e99dad6fd2b6f2059feacd
A mutex lifecycle issue.
- 11:26 AM Bug #578 (Resolved): assert triggered on radostool shutdown
- I hit this assert when radostool was exiting.
./common/Mutex.h:97: FAILED assert(nlock == 0)
ceph version 0.24~r...
11/13/2010
- 08:46 PM Bug #574: timer: event cancellation apparently broken
- cancel_event always relied on the caller to take the SafeTimer lock, and then goes on to take the Timer lock. So it's...
- 08:39 PM Bug #535: cephtool hangs forever until a UNIX signal is received
- It looks good so far.
- 04:43 AM Revision f18609e8 (ceph): Merge remote branch 'origin/msgr' into testing
- 12:00 AM Revision 2be4215a (ceph): debug: don't print thread id twice
- Signed-off-by: Sage Weil <sage@newdream.net>
11/12/2010
- 11:59 PM Revision b61af6a7 (ceph): msgr: cleanup: make queue_received non-inline; some helpful debug
- Signed-off-by: Sage Weil <sage@newdream.net>
- 11:56 PM Revision f99c84e6 (ceph): msgr: do not clear halt_delivery
- We need to keep the halt_delivery plug set on failure/shutdown in order to
prevent a racing reader from queuing new m...
- 10:55 PM Revision 1071a9ab (ceph): msgr: protect pipe queue_item map with pipe_lock AND dispatch_queue lock
- Close a few different races here.
Also, assert that queue_items are not queued in ~Pipe().
Signed-off-by: Sage Weil...
- 10:55 PM Revision d4746ab5 (ceph): msgr: close enqueue/discard race
- We need to re-check halt_delivery after dropping and retaking pipe_lock.
Signed-off-by: Sage Weil <sage@newdream.net>
- 10:55 PM Revision 20937e88 (ceph): msgr: protect pipe queuing with _both_ pipe and dispatch_queue locks
- We want to make sure the pipe's queue item doesn't go away.
Also, make queue_received() require pipe_lock to be held...
- 10:55 PM Revision cbf154e1 (ceph): msgr: only close socket on reconnect or shutdown
- We can't modify 'sd' or (more importantly) close sd while any other thread
might be using it, or else we might race w...
- 10:55 PM Revision 70fe062f (ceph): msgr: add 'ms inject socket failures = foo'
- Where we fail roughly every foo'th socket operation.
Signed-off-by: Sage Weil <sage@newdream.net>
- 10:49 PM Revision 20affc65 (ceph): TestTimers: don't test (nonexistent) Timer
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 10:45 PM Revision d5032a05 (ceph): Rename PG::peer to PG::do_peer
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 03:59 PM Revision 46cf27d4 (ceph): Merge branch 'testing' into unstable
- 03:55 PM Revision c5b2d28b (ceph): uclient: insert lssnap results under snapdir, not live dir
- Put the readdir results (list of snapshots) in the right place in the
hierarchy; we were putting them in the parent d...
- 03:36 PM Revision 7ccdae8c (ceph): msg: fix buffer size for IPv6 address parsing
- Signed-off-by: Wido den Hollander <wido@widodh.nl>
- 02:20 PM Bug #577 (Resolved): unify PG creation code in OSD::handle_pg_notify and OSD::_process_PG_info
- unify PG creation code in OSD::handle_pg_notify and OSD::_process_PG_info
Duplicated code here. They're slightly d...
- 02:16 PM CephFS Feature #545: mds: use bloom filter to supplement dirfrag COMPLETE flag
- Trying to find a bloom filter library. Unfortunately there don't seem to be any available under a GPL-compatible lice...
- 01:16 PM Bug #490 (Can't reproduce): Cluster stays in a degraded state
- 01:15 PM CephFS Cleanup #514 (Rejected): Optimize MIX/MIX_STALE reconnects, etc
- mix_stale is no more
- 12:56 PM Linux kernel client Bug #576 (Can't reproduce): readdir returns too many results
- ...
- 11:02 AM Bug #535: cephtool hangs forever until a UNIX signal is received
- Pushed a potential fix to the msgr branch, waiting for Colin to report back on if it works or not. :)
- 07:56 AM CephFS Bug #561 (Resolved): snaptest-2 doesn't execute properly
- Figured this out. LSSNAPs was adding the snap dentries to the cache under the parent dir instead of the hidden .snap...
- 07:37 AM Messengers Bug #573 (Resolved): monmaptool fails to parse IPv6 address
- Thanks, applied as commit:7ccdae8cd44c143550234511a2a09bab38c6515e
- 04:56 AM Messengers Bug #573: monmaptool fails to parse IPv6 address
- After searching through the source I found it :)
Attached is a patch to fix the IPv6 address parsing. The buffer w...
- 05:12 AM Bug #575 (Resolved): monmaptool terminates when input file is not a monmap
- For example:...
- 03:30 AM Bug #540: CephxClientHandler::handle_response
- Just saw it again on the same cluster, this time osd2 crashed when upgrading to this morning's unstable:...
- 12:29 AM Bug #540: CephxClientHandler::handle_response
- I saw that on a test machine of mine. The 'ceph -w' command was hanging for about 10 seconds and then exited with thi...
- 12:38 AM Revision ce6d6394 (ceph): timer: rewrite mostly from scratch
- Just use the provided lock. This _vastly_ reduces the complexity because
we don't have to worry about races between ...
11/11/2010
- 11:31 PM Revision 54848991 (ceph): mds: hit inode created via CREATE
- We missed this path!
Signed-off-by: Sage Weil <sage@newdream.net>
- 10:28 PM Revision f8b3271f (ceph): Merge branch 'rc' into unstable
- Conflicts:
configure.ac
src/Makefile.am
- 05:47 PM Bug #531: Journaling Causes System Hang
- Sorry, I haven't been able to get the debug output yet. We have spent the last few days working with our production syste...
- 04:47 PM Linux kernel client Tasks #569 (Resolved): test dir frags
- a few fixes, mostly fine. commit:7b88dadc13e0004947de52df128dbd5b0754ed0a
- 04:43 PM Bug #574 (Resolved): timer: event cancellation apparently broken
- Looking into this, it appears that the problem was that the wrong lock was taken during cancel event. Or that the ev...
- 03:38 PM Bug #574 (Resolved): timer: event cancellation apparently broken
- Just saw this on latest unstable, commit:f8b3271f45cc4a87e3f3f212d22e3d34ff13da44
The monitor schedules a propose ...
- 03:09 PM CephFS Tasks #366 (New): test snaptests against clustered mds failures
- 03:08 PM CephFS Tasks #366 (Rejected): test snaptests against clustered mds failures
- 03:08 PM CephFS Bug #362 (Rejected): mds: rejoin crashes on snaptest-2 workload
- 02:45 PM Bug #540: CephxClientHandler::handle_response
- Wido just saw this:...
- 05:18 AM Revision 5d1d8d0c (ceph): v0.23
- 04:58 AM Revision 3d10b340 (ceph): mds: fix null_snapflush with multiple intervening snaps
- The client is allowed to not send a snapflush if there is no dirty metadata
to write for a given snap. However, the ...
- 02:17 AM Messengers Bug #573 (Resolved): monmaptool fails to parse IPv6 address
- I'm trying to setup a small cluster with IPv6, but mkcephfs fails:...
- 12:36 AM Revision 3d6e9155 (ceph): Merge remote branch 'origin/unfound' into unstable
- 12:31 AM Revision 4d941cf4 (ceph): osd: scrub: change cancel behavior
- Use explicit flag, so that scrub_reserved always indicates whether the
osd count includes us or not.
Signed-off-by: ...
- 12:31 AM Revision a87e8901 (ceph): osd: track last_scrubbed in PG::Info::History
- Share with peers and write to disk on scrub completion.
Signed-off-by: Sage Weil <sage@newdream.net>
- 12:31 AM Revision 6548fb65 (ceph): osd: do scrub schedule state changes inside scrub()
- Update these values under protection of pg lock iff we start scrubbing,
otherwise back out.
On scrub completion, unr...
- 12:31 AM Revision 815c3d56 (ceph): osd: fix sched_scrub
- Insert whoami into reserved set on primary, not 0! Also more cleanup of
sched state helpers.
Signed-off-by: Sage We...
- 12:31 AM Revision 92572910 (ceph): osd: call sched_scrub on reserve reply
- Otherwise we have to wait until the next time it's called by the timer, and
during that period we have a reservation ...
- 12:31 AM Revision c12829a2 (ceph): osd: don't scrub something we just scrubbed
- Signed-off-by: Sage Weil <sage@newdream.net>
- 12:31 AM Revision 85e08905 (ceph): osd: scrub least recently scrubbed pgs first; once a day
- Signed-off-by: Sage Weil <sage@newdream.net>
11/10/2010
- 10:50 PM Revision 231434af (ceph): pg_state_string: use an ostringstream
- Use an ostringstream for efficiency's sake.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 09:49 PM Revision d247616c (ceph): vstart: stop logging to /tmp/foo
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:46 PM CephFS Bug #561: snaptest-2 doesn't execute properly
- I ran the test again and didn't get an mds crash. There was one issue remaining:...
- 06:14 PM CephFS Bug #561 (In Progress): snaptest-2 doesn't execute properly
- I think I may have finally nailed this problem, or at least found a band-aid by more aggressively removing the I_COMP...
- 09:39 PM Revision 74be621c (ceph): osd: fix scrub reserved state when starting scrub
- Also document scrub scheduling/pending/active states.
Signed-off-by: Sage Weil <sage@newdream.net>
- 09:18 PM CephFS Bug #570 (Resolved): Locker::_do_null_snapflush assert failure
- 09:18 PM CephFS Bug #570: Locker::_do_null_snapflush assert failure
- Nice catch. Fixed by commit:3d10b340748e5bbff86b49ac7386da9efa27a070. Added a unit test too!
- 02:58 PM CephFS Bug #570 (Resolved): Locker::_do_null_snapflush assert failure
- Seen this a lot while working on the snaptest-2 issue, when shutting down cfuse....
- 09:16 PM Revision 8650418f (ceph): vstart: turn down msgr debugging
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:13 PM Revision 9e4027fb (ceph): monc: cancel timer events with lock held
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:23 PM Revision 07bb6756 (ceph): Wake up clients waiting for now-found objects
- PG::search_for_missing: when we find a previously unfound object, check
to see if there is an entry in waiting_for_mi...
- 07:46 PM Revision 8288a23a (ceph): PG::peer: don't block if objects are unfound
- Erase the code in PG::peer that used to keep us from becoming active
when objects were still unfound. Print out the n...
- 07:46 PM Revision 040c4bcd (ceph): PG::search_for_missing: minor refactoring, comment
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 07:46 PM Revision 5153ba5e (ceph): Add PG::Missing::have_missing()
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 07:46 PM Revision 85c4e6e6 (ceph): OSD::_process_pg_info:search_for_missing sometimes
- OSD::_process_pg_info: If we're the primary for this active PG, and we
have missing objects, call search_for_missing....
- 07:46 PM Revision 6a04ac52 (ceph): PG::recover_master_log: rename a local variable
- PG::recover_master_log: rename a local variable to avoid using the
overloaded term "missing".
Signed-off-by: Colin M...
- 07:46 PM Revision b5181133 (ceph): test_unfound.sh: shorter test
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 07:46 PM Revision 02ec7219 (ceph): Add num_objects_unfound to struct pg_stat_t
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 07:46 PM Revision fc605ced (ceph): test_unfound.sh: verify that we have unfound objs
- test_unfound.sh: verify that we have unfound objs.
Then, when we bring up the other OSD, verify that those unfound ob...
- 07:46 PM Revision b9191ddc (ceph): test_unfound.sh: test reading an unfound object.
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 07:46 PM Revision e6b6c539 (ceph): PG::peer: count/find cleanup
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 06:30 PM Revision b80f3e6a (ceph): PG: move ostream operator to .cpp file
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 06:30 PM Revision a46f15e7 (ceph): PG: nomenclature change: talk about unfound objs
- Describe objects as "unfound" when we don't know what OSD has them.
Signed-off-by: Colin McCabe <colinm@hq.newdream....
- 06:30 PM Revision ef1f8ecd (ceph): PG.h erase deadcode
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 06:16 PM Bug #535 (In Progress): cephtool hangs forever until a UNIX signal is received
- After checking the logs and conferring with Sage, I think I've found a possible cause. Designing and testing a fix no...
- 05:43 PM Revision 82aa79f8 (ceph): mds: fix inode->frag rstat projected with snaps
- The snapid 'first' value needs to be >= inode->first; move that into
the helper.
Signed-off-by: Sage Weil <sage@newd...
- 05:04 PM Revision 5deef243 (ceph): osdmap: break up asserts for easier debugging
- If we fail one of these it's helpful to know which one.
Signed-off-by: Sage Weil <sage@newdream.net>
- 05:03 PM Revision 586c9e7a (ceph): objecter: throttle before looking at lock protected state
- The take_op_budget() may drop our lock if we are in keep_balanced_budget
mode, so we need to do that _before_ we take...
- 04:50 PM Revision 57513739 (ceph): mon: drop unnecessary state checks
- We want to ignore all beacons from the mds regardless of what state they
are in.
Signed-off-by: Sage Weil <sage@newd...
- 04:46 PM Feature #567 (Resolved): osd: background scrub frequency, scheduling
- fixed up some scheduling problems, then added the interval and oldest-scrubs-first stuff.
- 04:45 PM Revision 84840ed7 (ceph): debian: don't explicitly depend on libgoogle-perftools0
- dpkg-buildpackage will autodetect the dependency. Except on lenny, where
it doesn't exist and we don't use it!
Sign...
- 04:14 PM Revision ca3693d8 (ceph): mds: Enable --journal_check mode.
- This replaces the old --shadow option, which didn't work.
It starts up the MDS daemon, then replays the journal for
a...
- 04:13 PM Revision 214b7269 (ceph): osdc: Fix bad assert in ~ObjectCacher.
- The objects data member is never empty on shutdown since it now consists
of a vector of pools. Instead, check each po...
- 03:43 PM Feature #572 (Resolved): Implement lingering osd requests
- For the watch/notify feature we need to implement lingering osd requests on the userspace client side. Lingering osd ...
- 03:42 PM Revision 5035c822 (ceph): uclient: only update inode if version increased
- This realigns the code with the kernel version, fixing a number of
problems when you have multiple MDSs returning inf...
- 03:21 PM Linux kernel client Bug #571 (Closed): client hangs after osd disconnection
- This happens on the rbd watch/notify sync branch. Probably related to lingering requests.
- 12:12 PM Bug #559 (Rejected): osd: dup requests can ack early
- nevermind, this is already done and merged!
- 11:01 AM Linux kernel client Tasks #569 (Resolved): test dir frags
- Make sure we behave with fragmented dirs, esp readdir. (probably need to mirror the recent cfuse fixes.)
- 09:43 AM Bug #521 (Resolved): objecter: crash in osdmap assert
- commit:586c9e7a80b425802ca77d8c09bb00da5c25d616
- 09:15 AM Feature #568 (Resolved): debian: build with --as-needed?
- Can we do this to limit dependencies? See #544.
And the current warnings like...
- 08:18 AM CephFS Feature #548 (Resolved): mds: shadowreplay one-shot mode
- commit:ca3693d8ffcdffc3ae95eaba506a72889829bcb5 makes minimal changes to the MDS and MDSMonitor code to enable the ne...
- 08:03 AM Revision 255e34af (ceph): decompile_crush_bucket: fix depth-first decomp
- We need to ensure that buckets are output after their dependencies. The
best way to do this is a depth-first traversa...
- 07:58 AM Revision d1f15daf (ceph): CrushWrapper:get_bucket: ret ENOENT for no bucket
- All the callers of CrushWrapper::get_bucket() check for error codes, but
not for NULL returns. So if there is no buck...
- 07:24 AM Bug #531: Journaling Causes System Hang
- What would be helpful in diagnosing this problem is:
- turn up osd logging, in [osd] section:
debug osd = 20
...
11/09/2010
- 11:56 PM Revision 11cfcfe8 (ceph): Merge branch 'sched_scrub' into unstable
- Conflicts:
src/osd/PG.cc
src/osd/PG.h
- 11:50 PM Revision e8ad6d26 (ceph): osd: small cleanup
- Signed-off-by: Sage Weil <sage@newdream.net>
- 11:46 PM Revision 28b44293 (ceph): osd: scrub: list objects without lock held
- We'll go back to get anything we missed later.
Signed-off-by: Sage Weil <sage@newdream.net>
- 11:46 PM Revision c2d6d05f (ceph): Merge branch 'scrub_no_lock' into unstable
- 11:34 PM Revision 966369aa (ceph): ps-ceph.pl: don't show self
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 11:04 PM Revision 6bc31511 (ceph): gui: add missing #include
- Signed-off-by: Sage Weil <sage@newdream.net>
- 10:50 PM Revision 58394828 (ceph): Merge branch 'rbd-fiemap' into unstable
- 10:49 PM Revision e991702e (ceph): objecter: set READ flag on new objecter mapext/read_sparse ops
- Signed-off-by: Sage Weil <sage@newdream.net>
- 10:48 PM Revision adac5163 (ceph): objecter: fix balancer for ops with length < 0
- Notably, mapext.
Signed-off-by: Sage Weil <sage@newdream.net>
- 10:36 PM Revision 20060548 (ceph): filestore: autodetect presense of FIEMAP ioctl
- If it's not there, assume the whole object is allocated.
Signed-off-by: Sage Weil <sage@newdream.net>
- 10:35 PM Revision e5488718 (ceph): fiemap: include linux fiemap.h header; unconditionally compile helper
- If the system doesn't have the header, use our copy.
Signed-off-by: Sage Weil <sage@newdream.net>
- 10:33 PM Revision 9f14dd25 (ceph): ps-ceph.pl: display Ceph tests
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 10:23 PM Revision 53b076d5 (ceph): Merge remote branch 'origin/rbd-fiemap' into unstable
- 10:06 PM Revision 2325a1a2 (ceph): Fix example config file
- We need to specify a journal size for the file-based journal we set up
in the example config file.
Signed-off-by: Co...
- 09:59 PM Revision 2947d19d (ceph): TimerThread:don't call pop_front before iter deref
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 09:30 PM Revision 1c7d8f1a (ceph): Makefile: use openssl module check
- This allows ceph to build with --as-needed.
Signed-off-by: Kacper Kowalik <xarthisius@gentoo.org>
- 09:17 PM Revision 954ad982 (ceph): osd: shut down if we do not exist
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:08 PM Revision ea56dfdc (ceph): osd: handle osds that no longer exist in prior_set_affected
- Consider no-longer-existent OSDs lost.
Signed-off-by: Sage Weil <sage@newdream.net>
- 08:05 PM Revision 29428b9b (ceph): Objecter: initialize timer in Objecter::init
- Just in case future users of Objecter want to create one before calling
Messenger::start as a daemon.
Signed-off-by:...
- 06:15 PM Revision ec4200b0 (ceph): Add test_crushtool.sh
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 06:06 PM Revision 019bb70e (ceph): mds: turn on mds_bal_frag (dir fragmentation) by default
- Let the fun begin!
Signed-off-by: Sage Weil <sage@newdream.net>
- 06:04 PM Revision ae13fc86 (ceph): osd: handle osds that no longer exist in build_prior
- Fix build_prior to handle OSDs that no longer exist in the current map.
Consider them lost.
Signed-off-by: Sage Weil...
- 06:04 PM Revision e15c9569 (ceph): mds: fix inode freeze auth pin allowance
- When we're renaming across nodes, we need to freeze the inode. This
requires that we allow for the auth_pins that _w...
- 06:03 PM Revision 3107944e (ceph): osdmap: cleanup: add parens
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:59 PM Revision f28b99b3 (ceph): CrushWrapper::get_bucket_item: bounds check
- Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 05:59 PM Revision 9b487256 (ceph): crushtool: don't create a dump we can't recompile
- In crushtool, dump buckets in tree order. Buckets which reference other
buckets must be dumped after their depedencie...
- 05:55 PM Revision e1588dc4 (ceph): mds: wipe out client sessions on startup
- For disaster recovery and such.
Signed-off-by: Sage Weil <sage@newdream.net>
- 05:55 PM Revision 05a47387 (ceph): mon: implement 'mds newfs <metapool> <datapool>' command
- Create a new fs (by creating a new MDSMap) using the given pools.
Signed-off-by: Sage Weil <sage@newdream.net>
- 05:55 PM Revision d80948ad (ceph): mds: use mdsmap data pool for root inode default layout
- The MDSMap may specify any random pool as the data pool; use that.
Signed-off-by: Sage Weil <sage@newdream.net>
- 05:55 PM Revision 8a21c6f6 (ceph): mds: add mds_skip_ino and mds_wipe_ino_prealloc options
- These are last-ditch recovery tools. Not particularly effective ones,
though.
Signed-off-by: Sage Weil <sage@newdre...
- 05:04 PM Linux kernel client Bug #549: bonnie++ file stat failure
- bonnie tests are running under ceph 5, 6, 8, and 9, logging to /data/qa/ on each machine.
- 04:28 PM Bug #535: cephtool hangs forever until a UNIX signal is received
- cephtool-hang-at-966369aad07461f2610b4dd2a9cdc770155c5a89.txt
- 03:08 PM Bug #535: cephtool hangs forever until a UNIX signal is received
- messenger-bug.txt
- 04:26 PM Bug #521: objecter: crash in osdmap assert
- Can you try with something like...
- 09:45 AM Bug #521: objecter: crash in osdmap assert
- latest from ML:...
- 03:59 PM Feature #567 (Resolved): osd: background scrub frequency, scheduling
- We should have some min interval such that the osds won't scrub the same osd more frequently than that.
Also, the ...
- 03:56 PM Feature #425 (Resolved): trigger osd scrub automatically
- 03:54 PM Subtask #485 (Resolved): osd: cooperative scrub scheduling
- merged by commit:11cfcfe87503e50c892178d9c5c5b55da3aac740
- 03:45 PM Subtask #486 (Resolved): osd: make scrub not block writes
- merged commit:28b44293e34c5e97f350b4c68becdf9e7767ed6f
- 02:52 PM Bug #248 (Resolved): rbdtool import should use fiemap
- 02:52 PM Bug #248: rbdtool import should use fiemap
- Merged by commit:58394828a01950d7b26430d61d32df91df5a5fb1, bringing it in line with the objecter changes over the las...
- 02:13 PM RADOS Bug #558 (Resolved): crushtool cannot always re-encode a crushmap that it's created
- Fixed by commit:9b48725614a880cf1f4bcad0bba2ceefdc76c167
C.
- 02:11 PM Bug #533 (Resolved): radostool hang on shutdown
- Should be fixed by timer-fixes.
C.
- 02:10 PM Bug #565 (Resolved): Example config file is broken
- Fixed by 2325a1a27b434cea7d7af832efff7a9257724fe6
C.
- 01:30 PM Bug #544 (Resolved): ceph-0.22.2: fails to build with --as-needed
- 01:16 PM Bug #566 (Resolved): osd: build_prior needs to be wary of nonexistent osds
- fixed by commit:954ad98230085c9c2a174fe15af24df237498977 commit:ea56dfdc663f8b0e19346bb63ffe3fec0c7759c4 commit:ae13f...
- 12:59 PM CephFS Bug #556 (Resolved): clustered mds: rename
- this wasn't too bad.. the locking auth_pin scheme changed a while ago and the auth_pin allowance didn't get adjusted ...
- 12:42 PM Linux kernel client Bug #546 (Resolved): direct i/o does not work when offset is not page-aligned
- See commit:c5c6b19d4b8f5431fca05f28ae9e141045022149. Passes my tests.
- 06:03 AM Revision aad3f7f2 (ceph): ceph.spec.in: don't strip rados classes
- Signed-off-by: Christian Brunner <christian@brunner-muc.de>
11/08/2010
- 10:49 PM Bug #535: cephtool hangs forever until a UNIX signal is received
- > Look, I know it's a pain, but work on this isn't going to progress unless
> we collect AT LEAST:
> 1) The state ...
- 01:05 PM Bug #535 (Can't reproduce): cephtool hangs forever until a UNIX signal is received
- Look, I know it's a pain, but work on this isn't going to progress unless we collect AT LEAST:
1) The state of each ...
- 10:35 AM Bug #535: cephtool hangs forever until a UNIX signal is received
- The process that is hung is 17181, cephtool.
- 10:35 AM Bug #535 (In Progress): cephtool hangs forever until a UNIX signal is received
- Reproduced again on the unfound branch, which is very close to what is in unstable now.
cmccabe@flab:~/src/ceph/...
- 09:22 PM Revision 64f95ad9 (ceph): mds: add missing Dumper.[h,cc]
- 09:18 PM Revision be9328ac (ceph): mds: tolerate/fix negative dir size counts
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:44 PM Revision d5515a8f (ceph): mds: add missing Dumper.[h,cc]
- 08:40 PM Bug #566 (Resolved): osd: build_prior needs to be wary of nonexistent osds
- ...
- 08:09 PM Bug #565 (Resolved): Example config file is broken
- The example config file (src/sample.ceph.conf) specifies the OSD journal as a file, but doesn't specify the size, whi...
- 05:45 PM Revision 1ab7c7ff (ceph): Replace ps-ceph.sh shell script with perl script
- A much faster version of ps-ceph.sh.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
- 04:17 PM Linux kernel client Bug #384 (Closed): crash in splice_dentry
- 03:07 PM Feature #80 (Resolved): uclient: readdir from cache
- He already did it, yay!
- 02:41 PM Feature #96 (Resolved): msgr: close idle connections?
- Yay, this got done with the recent SimpleMessenger changes!
- 02:15 PM Feature #276 (Resolved): Possibility to dump/list xattrs from RADOS object
- Yehuda says he did this!
- 01:24 PM Bug #531: Journaling Causes System Hang
- We've looked at this a bit more but decided today that Sage is taking it over since he's a lot more familiar with the...
- 12:49 PM Linux kernel client Bug #564 (Resolved): Configuration via configfs instead of sysfs
- Will allow creation of different devices and setting them up. Should be device oriented, and will create a sub direct...
- 11:05 AM Bug #563 (Closed): osd: btrfs, warning at inode.c ( btrfs_orphan_commit_root )
- I'm running the unstable branch and I'm seeing in my dmesg:...
- 09:32 AM CephFS Bug #561: snaptest-2 doesn't execute properly
- Okay, looks like this may be an issue with the test rather than Ceph. I just copied it into the root of the ceph moun...
- 09:07 AM CephFS Bug #561 (Resolved): snaptest-2 doesn't execute properly
- Checked it on cfuse and kclient:...
- 09:27 AM RADOS Bug #558: crushtool cannot always re-encode a crushmap that it's created
- Either the compiler part just needs to be updated to allow forward bucket references, or the dumper needs to dump by ...
- 09:26 AM Feature #562 (Closed): separate gui into separate binary, package
- This will mean refactoring common ceph.cc bits into a separate file and .a.
- 09:22 AM Linux kernel client Bug #434: mds: clustered mds pjd failures
- a few more fixes here on inode updates version check and mtime.
- 07:23 AM Linux kernel client Bug #434 (Resolved): mds: clustered mds pjd failures
- this was a kclient problem caused by bad uid/gid in resent requests. fixed by commit:cb4276cca4695670916a82e359f2e377...
- 09:20 AM Tasks #406 (Closed): push v0.20.2 to upstream debian, ubuntu maintainers
- 09:20 AM CephFS Cleanup #427 (Rejected): mds: tie scatter pins directly to freeze machinery
- no more scatterpins, yay!
- 09:19 AM Linux kernel client Bug #554 (Resolved): clustered mds: max_size not updated
- 07:39 AM CephFS Feature #560 (Resolved): mds: alternate directory hashing
- Currently dentries are hashed among dirfrags using the linux dcache's hash function, which is pretty trivial. The pr...
- 07:30 AM Bug #559: osd: dup requests can ack early
- The dup request check looks at the reqid in the log, and replies early. That request could still be in flight to dis...
- 07:28 AM Bug #559 (Rejected): osd: dup requests can ack early
11/07/2010
- 06:02 PM RADOS Bug #558 (Resolved): crushtool cannot always re-encode a crushmap that it's created
- When a CRUSH text map is encoded, the buckets are read in such a way that they must be defined before they are refere...
- 05:56 PM Revision 0feec2f4 (ceph): Merge remote branch 'origin/object_locator' into unstable
- Conflicts:
src/osd/OSD.cc
src/osd/ReplicatedPG.cc
src/osd/ReplicatedPG.h
src/osd/osd_types.h
- 05:45 PM Revision b7f578cf (ceph): Merge remote branch 'origin/timer-fixes' into unstable
- 05:44 PM Revision deb9ef76 (ceph): v0.24~rc
- 05:42 PM Revision 0b190920 (ceph): Merge remote branch 'origin/testing' into unstable
- 03:49 PM Revision a4674af5 (ceph): mds: eval: put scatter in MIX if replicated, otherwise LOCK
- Signed-off-by: Sage Weil <sage@newdream.net>
- 03:45 PM Revision 33c6e230 (ceph): mds: do not scatter_writebehind in MIX state
- Replicas might come in while we're flushing and get a MIX state with
the old state.
Signed-off-by: Sage Weil <sage@n...
- 11:29 AM Feature #231: Slow OSDs shouldn't destroy cluster performance
- Today I experienced a btrfs bug where *[btrfs-transacti]* got to status D, causing my OSD to hang (also go into st...
- 10:18 AM Linux kernel client Bug #554: clustered mds: max_size not updated
- fixed by commit:912a9b0319a8eb9e0834b19a25e01013ab2d6a9f. also commit:feb4cc9bb433bf1491ac5ffbba133f3258dacf06 for g...
- 10:15 AM Feature #524 (In Progress): object_locator_t
- Work so far merged by commit:0feec2f4f31aa3a259b2cdf885d6458995ce860b
Still need to update the on-wire protocol to...
- 10:08 AM CephFS Feature #495 (Resolved): mds: add MIX_STALE
- merged in commit:0b1909209800229f5098cdc848fc3901508c1e19. best part of this is MIX_STALE went away. yay!
- 10:05 AM Bug #248 (In Progress): rbdtool import should use fiemap
- whoops, this never got merged.
- 08:58 AM Linux kernel client Bug #557 (Can't reproduce): BUG_ON(!session->s_num_cap_releases);
- ...
- 08:11 AM CephFS Bug #556 (Resolved): clustered mds: rename
- various hangs with thrash-exports and pjd rename tests.
- 04:05 AM Revision 1bf8e732 (ceph): Merge branch 'unstable' into mix_stale
- 04:01 AM Revision 1eb94da2 (ceph): mds: introduce/use helpers to resync stale fragstat/rstat; update version
- Simplifies code.
Also, update the version when we resync!
Signed-off-by: Sage Weil <sage@newdream.net>
- 04:01 AM Revision c1ee560e (ceph): mds: don't fuss with versions when taking frag/rstat from frag; it's ne...
- Signed-off-by: Sage Weil <sage@newdream.net>
- 04:01 AM Revision bdc2fa5b (ceph): mds: remove MIX_STALE
- Yay, we don't need it!
If we can't update the frag on scatter, fine. The staleness of the frag
is implicit in the f...
- 04:00 AM Revision c2034829 (ceph): mds: ignore done_locking on slave requests' acquire_locks()
- Slave requests ask for each xlock one at a time. Don't bail out based on
the done_locking flag.
Signed-off-by: Sage...
- 04:00 AM Revision 51b6a863 (ceph): mds: don't use helper for rename srcdn
- The rdlock_path_xlock_dentry helper works for _auth_ dentries that we
create locally in an auth dirfrag. For the src...
- 04:00 AM Revision eb0a60d0 (ceph): mds: never complete a gather on a flushing lock
- The scatter_writebehind() takes a wrlock, but that may still allow the lock
to complete a gather to LOCK and even mov...
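Note on Bug #558 above: the entry says buckets in a CRUSH text map must be defined before they are referenced. One generic way to satisfy such a constraint when writing a map out is to emit buckets in dependency order. The Python sketch below shows that ordering under stated assumptions: the function name `dump_order` and the plain dict input are hypothetical, and this is not crushtool's code or necessarily the fix that was applied.

```python
def dump_order(buckets: dict[str, list[str]]) -> list[str]:
    """Order bucket names so each bucket follows the buckets it references.

    `buckets` maps a bucket name to the names of its items. Items that are
    not keys of `buckets` are treated as devices (leaves) and skipped.
    """
    order: list[str] = []
    state: dict[str, str] = {}  # name -> "visiting" | "done"

    def visit(name: str) -> None:
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError(f"cycle involving bucket {name!r}")
        state[name] = "visiting"
        for item in buckets[name]:
            if item in buckets:  # only child buckets need to come first
                visit(item)
        state[name] = "done"
        order.append(name)  # emitted only after all of its child buckets

    for name in buckets:
        visit(name)
    return order


example = {
    "root": ["rack0", "rack1"],
    "rack0": ["host0"],
    "rack1": ["host1"],
    "host0": ["osd.0"],
    "host1": ["osd.1"],
}
print(dump_order(example))  # ['host0', 'rack0', 'host1', 'rack1', 'root']
```

The alternative the tracker mentions, letting the compiler accept forward references, would remove the need for any particular dump order.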
11/06/2010
- 04:38 PM Revision bdf3bc5e (ceph): mds: update version when bring stale rstat back up to date
- Signed-off-by: Sage Weil <sage@newdream.net>
- 02:58 PM Revision a74054d1 (ceph): mds: simplify stale semantics a bit
- is_stale() => next MIX is MIX_STALE. Stale flag is then cleared. Then we
special case the import to preserve stale-n...
- 01:30 PM Bug #555 (Closed): debian/ubuntu: ceph-client-tools needs to depend on libgtkmm-2.4-1c2a
- Invalid report, it was due to an upgrade. When doing a fresh install of the packages they do depend on libgtkmm.
Cl...
- 11:52 AM Bug #555 (Closed): debian/ubuntu: ceph-client-tools needs to depend on libgtkmm-2.4-1c2a
- Right now, the building process depends on libgtkmm-2.4-dev, but when installing the packages and running 'ceph -g' y...
- 11:55 AM Linux kernel client Bug #434: mds: clustered mds pjd failures
- Just saw this again:...
- 11:18 AM Bug #553: Kernelmodule doesn't build under Debian lenny
- Ok, a backport-kernel works fine AFAIS. I updated the wiki-page.
- 10:10 AM Bug #553 (Won't Fix): Kernelmodule doesn't build under Debian lenny
- Unfortunately you're going to need to upgrade your kernel if you want the in-kernel client. Using the backports branc...
- 09:52 AM Bug #553 (Won't Fix): Kernelmodule doesn't build under Debian lenny
- Hello all,
the wiki-page [1] says that ceph runs under Debian lenny, but as far as I see that is not true because th...
- 11:16 AM Linux kernel client Bug #554 (Resolved): clustered mds: max_size not updated
- 3 mds, export thrashing, dbench 1 hang waiting on max_size.
- 04:52 AM Revision e27f111f (ceph): mds: preserve stale state on import; some cleanup
- Our new invariant is that MIX_STALE always implies is_stale(). And on
import, if is_stale(), MIX becomes MIX_STALE. ...
- 12:08 AM Revision a582345c (ceph): Merge branch 'mix_stale' into unstable
- 12:06 AM Revision 4126d1ce (ceph): mds: add more verify_scatter asserts
- For catching fragstat errors sooner.
Signed-off-by: Sage Weil <sage@newdream.net>
11/05/2010
- 10:24 PM Revision ae670c33 (ceph): mds: fix version check on resyncing stale rstat in predirty_journal_par...
- We're resyncing rstat, so check the rstat version (not fragstat!)
Signed-off-by: Sage Weil <sage@newdream.net>
- 07:45 PM Revision 4cee6ead (ceph): mds: Fix bad inode deref.
- Accidentally trying to print out the CInode after removing it in trim_non_auth!
Move the print to before it's been un...
- 07:20 PM Revision 93344fb2 (ceph): Revisit std::multimap decoder
- Previously I changed the std::multimap decoder to minimize the number of
constructor invocations. However, it could b...
- 06:34 PM Revision f015c989 (ceph): autogen.sh: check for pkg-config
- To avoid seeing confusing errors later in the configure process, in
autogen.sh, check to make sure the pkg-config pro...
- 05:57 PM Revision fd397aba (ceph): PG.cc: build_scrub_map now drops the PG lock while scanning the PG
- build_inc_scrub_map scans all files modified since the given
version number and creates an incremental scr...
- 05:38 PM Revision 989fa67d (ceph): mds: preserve version when recovering rstat from dirfrag in predirty_jo...
- We don't want to screw up the version here. This aligns the code with
other instances of this check.
Signed-off-by:...
- 02:50 PM Linux kernel client Bug #552 (Resolved): Samba with kernel oplocks=on produces lots of corrupt mds entries in dmesg
- With kernel oplocks = yes, samba fills up dmesg with those
[ 4472.504211] ceph: problem parsing dir contents -5
[...
- 01:56 PM Linux kernel client Bug #434: mds: clustered mds pjd failures
- Sage has taken over the clustered MDS stuff for now, so here's the bug!
- 01:55 PM CephFS Feature #495: mds: add MIX_STALE
- 01:36 PM Bug #521: objecter: crash in osdmap assert
- 01:02 PM CephFS Bug #551 (Can't reproduce): cfuse crash on quick mds restart
- Program terminated with signal 11, Segmentation fault.
#0 0x00000000004704ad in Client::kick_flushing_caps (this=0x...
- 12:29 PM Bug #550: mon: PGMonitor::update_from_paxos()
- While I thought it wasn't related to the MDS issue I'm seeing, it seems it might be:...
- 12:11 PM Bug #550 (Can't reproduce): mon: PGMonitor::update_from_paxos()
- One of my monitors crashed, got this backtrace:...
- 10:59 AM Linux kernel client Bug #549: bonnie++ file stat failure
- Terri, can you have the qa machines loop through _just_ the bonnie++ command he's having problems with? Something li...
- 10:57 AM Linux kernel client Bug #549 (Resolved): bonnie++ file stat failure
- From ML:...
- 10:49 AM Bug #531: Journaling Causes System Hang
- Hello,
1) Correct, we are running transparent 10GbE
2) From what I can tell monitoring dstat across the cluster ...
- 10:14 AM CephFS Feature #91: mds: up:shadow mode
- Update the journaler interface to allow the MDS to 'tail' the journal... periodically check to see if it's been exten...
- 10:10 AM CephFS Feature #548 (Resolved): mds: shadowreplay one-shot mode
- Make sure the current mechanism still works. Clean it up if needed.
- 09:19 AM CephFS Subtask #547 (Resolved): mds: define fsck strategy, required metadata
- 09:19 AM CephFS Feature #340 (Closed): large directories, directory fragmenting
- 09:19 AM CephFS Feature #519 (Closed): mds: dirfrag merge
- 06:20 AM Revision 9586e905 (ceph): mds: restructure finish_scatter_gather_update()
- Separate behavior into two dimensions: whether or not we are updating
the dirfrag, and whether or not the dirfrag is ...
- 06:15 AM Revision 669a8afa (ceph): mds: do not bump scatter stat lock in predirty_journal_parents
- If we're in the MIX state, we clearly can't touch this without screwing up
the delicate scatter/gather behavior. If ...
- 05:48 AM Revision 663b470f (ceph): mds: mark scatterlock stale on import of stale frag scatter stat
- When the lock scattered, if we didn't have an auth frag that was frozen,
we go into MIX state. Later, we may import ...
- 05:44 AM Revision 63c1ad84 (ceph): mds: match bottom half of assilate_dirty_rstat_inodes with a dir flag
- We only do the assimilate_dirty_rstat_inodes if we do an update AND the
frag rstat was non-stale, but the bottom half...
- 05:19 AM Revision 9b6d96e9 (ceph): mds: fix inode version used for inest in decode_lock_state
- We need to pass the inode rstat's version into finish_scatter_update, not
the shadowed local variable. Otherwise we ...