Activity

From 10/26/2010 to 11/24/2010

11/24/2010

10:54 PM Bug #611: OSD: OSDMap::get_cluster_inst
I'll take a look Colin McCabe
10:18 PM Bug #611: OSD: OSDMap::get_cluster_inst
Okay, I somehow commented/set this bug backwards with another one. Whoops, sorry guys!
This looks like the OSD is as...
Greg Farnum
10:38 AM Bug #611: OSD: OSDMap::get_cluster_inst
Sam said he'd look at this since it's in the background scrubbing bits that he and Josh did. Greg Farnum
05:11 AM Bug #611 (Resolved): OSD: OSDMap::get_cluster_inst
After upgrading to the latest unstable, one OSD crashed. Before the upgrade, 10 of the 12 OSD's were online.
When ...
Wido den Hollander
10:18 PM Bug #612: OSD: Crash during auto scrub
Dunno how, but somehow commented/assigned this and another bug backwards. Meant to say:
Sam said he'd look at this s...
Greg Farnum
10:38 AM Bug #612: OSD: Crash during auto scrub
This looks like the OSD is assembling a list of missing queries and then sending them out without bothering to check ... Greg Farnum
05:28 AM Bug #612 (Resolved): OSD: Crash during auto scrub
After I saw #611 my cluster started to crash. One after the other, the OSD's started to go down, all with a message a... Wido den Hollander
10:09 PM Feature #453 (Resolved): osd: return error (instead of blocking) on lost objects
It's passing the lost1 and lost2 unit tests now. Colin McCabe
09:41 PM rgw Bug #353: Handle non-ascii filenames
Yeah, I agree with Amazon's approach here. UTF-8 makes sense. I think we could continue to use std::string internally... Colin McCabe
02:03 AM Revision d6e8e8d1 (ceph): gui: some cleanup
Rather than vectors of pointers, use vectors of NodeInfo structures.
This avoids the problem of freeing the NodeInfo ...
Colin Patrick McCabe
12:56 AM Revision 1b1e040e (ceph): osd: add a map for lingering messages
Yehuda Sadeh
12:55 AM Revision 99e1e4de (ceph): librados: assert_version on sync operations
Yehuda Sadeh
12:55 AM Revision c4b97953 (ceph): librados: last_objver is set on the pool, and not per thread
Yehuda Sadeh
12:55 AM Revision 454ea06e (ceph): rbd: notify about header changes
Yehuda Sadeh
12:55 AM Revision 520b523b (ceph): librados: fix unnecessary locking
Yehuda Sadeh
12:55 AM Revision 4c8bdc53 (ceph): osd: don't notify notifier
Yehuda Sadeh
12:54 AM Revision a76de3b2 (ceph): librados: complete C interface for watch/notify
Yehuda Sadeh
12:54 AM Revision 38c8e383 (ceph): librados: rename cookie to handle in api
Yehuda Sadeh
12:54 AM Revision 2954799a (ceph): librados: notify waits for completion
Yehuda Sadeh
12:50 AM Revision e7184e6d (ceph): librados: start implementing watch/notify
Yehuda Sadeh
12:50 AM Revision a4864bd8 (ceph): librados: enable object versioning
Yehuda Sadeh
12:50 AM Revision f36677f8 (ceph): librados: update C api
Yehuda Sadeh
12:49 AM Revision f8af4f2c (ceph): osd: add watch/notify timeout
Yehuda Sadeh
12:49 AM Revision cc62f2eb (ceph): osd: fix bad mutex lock
Yehuda Sadeh
12:49 AM Revision e0c548ad (ceph): osd: fix ms_handle_reset
Yehuda Sadeh
12:49 AM Revision d5cc6732 (ceph): osd: some notify related cleanups
Yehuda Sadeh
12:49 AM Revision 7272bfec (ceph): osd: send notify response from reset handler if needed
Yehuda Sadeh
12:49 AM Revision d66b52e1 (ceph): osd: watch infrastructure
third attempt Yehuda Sadeh
12:49 AM Revision 2b5e61ca (ceph): osd: send notification id
Yehuda Sadeh
12:49 AM Revision 59e61d0e (ceph): osd: discard of disconnected watchers
still need to add a timeout Yehuda Sadeh
12:49 AM Revision f5f33822 (ceph): osd: send notify reply if there are not watchers
Yehuda Sadeh
12:49 AM Revision 9437ea84 (ceph): osd: add user_version field in object_info_t
Yehuda Sadeh
12:49 AM Revision 7bda45a1 (ceph): osd: reply with either user_version or at_version, depends on the op
Yehuda Sadeh
12:49 AM Revision f7b7d67a (ceph): osd: check requested watch version number
send appropriate status code if needed Yehuda Sadeh
12:47 AM Revision 2bce34e7 (ceph): osd: handle watch op, register client on object xattr
Yehuda Sadeh
12:47 AM Revision 3110e361 (ceph): osd: basic watch/notify handling
Yehuda Sadeh
12:47 AM Revision e493c7ae (ceph): osd: handle notify-ack
Yehuda Sadeh

11/23/2010

11:39 PM Revision 2f13dd8e (ceph): gui: more reindenting
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:37 PM Revision 66a78c23 (ceph): gui: reindent a bunch of code
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
10:40 PM Revision d8652de6 (ceph): mdcache: in trim_non_auth, only print out path if it has a parent dentry.
This should only occur with the root inode, but caused a segfault for
anybody running more than one MDS who restarted...
Greg Farnum
10:04 PM Revision 8768b52d (ceph): mds: Reply checking_lock while reading filelock
Use checking_lock to replace lock_state in the extra buffer list so the client can get the correct file lock reply. Herb Shiu
09:59 PM Revision 4041bf0d (ceph): mds: fix set_state_rejoin auth_pin check
We carry an auth pin IFF !stable AND auth.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:59 PM Revision 5ed06ffc (ceph): client: remove inode from flush_caps list when auth_cap changes
Avoid confusing other code (e.g. kick_flushing_caps) by staying on the mds
flushing_caps list when we don't even have...
Sage Weil
09:52 PM Revision 285cc946 (ceph): osd: fix is_all_uptodate()
This should only return true when recovery is done, i.e., no more missing
objects. Nothing to do with unfound.
Sign...
Sage Weil
09:52 PM Revision 36f703e1 (ceph): osd: removing unused variable, fix warning
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:52 PM Revision 413ecb0b (ceph): osd: only search_for_missing if there are unfound objects
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:52 PM Revision 671b1c09 (ceph): osd: add get_num_unfound() helper
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:52 PM Revision 7ea7a435 (ceph): osd: only discover_all_missing if unfound
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:52 PM Revision 5452dae6 (ceph): osd: recover_primary() until primary has all found objects
The logic in that if was effectively reversed.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:52 PM Revision 5498c467 (ceph): osd: fix recover_replicas() unfound check
missing_loc.count(soid) == 0 only means unfound if it's not missing on the
primary.
Signed-off-by: Sage Weil <sage@n...
Sage Weil
09:52 PM Revision e97eae15 (ceph): init-ceph: tolerate failure in cleanallogs
Otherwise /var/log/ceph/stat makes rm -f error out and we fail.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:52 PM Revision 84612286 (ceph): Build might_have_unfound set at activation
The might_have_unfound set is used by the primary OSD during recovery.
This set tracks the OSDs which might have unfo...
Colin Patrick McCabe
09:52 PM Revision 0e15da8d (ceph): Rename peer_summary_requested to peer_backlog_req
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:52 PM Revision c0c301d5 (ceph): osd: PG::read_log: don't be clever with lost xattr
Formerly, we had a special case in read_log for dealing with objects
whose objects were present on the disk, but not ...
Colin Patrick McCabe
09:52 PM Revision 55570baf (ceph): osd: fix PG::is_all_uptodate
In PG::is_all_uptodate, don't try to look for peer_missing[osd->whoami].
The primary keeps that in PG::missing!
Sign...
Colin Patrick McCabe
08:26 PM Revision 36c6569c (ceph): monmaptool: Return a non-zero error code and print a useful error
message if unable to read the monmap file.
Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
Samuel Just
06:14 PM Feature #610 (Resolved): gui: make PG view prettier
The ceph -g GUI should display PGs in a list, rather than as icons that have to be clicked on. We should get rid of t... Colin McCabe
06:13 PM Bug #604 (Resolved): Compiler warning: 'status' may be used uninitialized in this function
Fixed by commit:d6e8e8d15d22b51ec86bc5687336c3d50d9b3a5d
We should change PG view on the GUI to be a list view at ...
Colin McCabe
05:43 PM Revision fc212548 (ceph): mds: allow for old fs's with stray instead of stray0
New fs's get stray0, but we want to still behave with old ones.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:37 PM Revision de61991a (ceph): Merge branch 'testing' into unstable
Conflicts:
configure.ac
Sage Weil
03:00 PM Bug #531: Journaling Causes System Hang
Awesome, thanks for the help. I will give these patches a shot towards the end of the week.
Thanks
Bryan Tong
02:43 PM Bug #599 (Resolved): recover_master_log, doesn't
There were two problems here:
1) we were restarting the osds before the monitors, which in this case prevented a f...
Colin McCabe
02:01 PM Linux kernel client Bug #552: Samba with kernel oplocks=on produces lots of corrupt mds entries in dmesg
Our friends at Tcloud just submitted patches for this today, which I've applied to the unstable branch of our kernel ... Greg Farnum
11:46 AM CephFS Feature #593 (Rejected): mds: fsck: anchor table repair
dup Sage Weil
11:42 AM Feature #609 (Resolved): osd: query pool/pg for objects with given xattr
This will probably take the form of a pool class plugin?
It could start as just a hack, for now.
Sage Weil
11:03 AM Bug #595: Autogen: not a literal
This problem does not seem to occur using 2.68 on my local machine. Slider et al. seem to be using 2.67. Samuel Just
09:39 AM CephFS Bug #608 (Resolved): mds: MDCache::create_system_inode()
this should be fixed by commit:fc212548aea1d7f001b56ba096a79ba54b8a92c3
Thanks!
Sage Weil
07:09 AM CephFS Bug #608 (Resolved): mds: MDCache::create_system_inode()
On a small test cluster I saw that my MDS was not coming up after a fresh mkcephfs, this is what the log showed:
<...
Wido den Hollander
09:33 AM Tasks #584: do throughput scaling tests on sepia
What was the variance in per-node throughput? Did we have one node dominating? Greg Farnum
09:22 AM Tasks #584 (In Progress): do throughput scaling tests on sepia
There's definitely a problem here; the total throughput should be scaling more or less linearly until we hit a bottle... Sage Weil
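One way to check the "more or less linear" expectation is to compare each run's aggregate throughput against the smallest-run baseline. The sketch below is only an illustration: the function name and the sample numbers are hypothetical and are not results from sepia.

```python
def scaling_efficiency(throughput_by_nodes):
    """Return {n: efficiency}, where 1.0 means perfectly linear scaling.

    throughput_by_nodes maps node count -> aggregate throughput (any unit).
    The smallest node count present is used as the baseline.
    """
    base_n = min(throughput_by_nodes)
    per_node_base = throughput_by_nodes[base_n] / base_n
    return {
        n: total / (n * per_node_base)
        for n, total in sorted(throughput_by_nodes.items())
    }


# Hypothetical numbers, for illustration only (not measured on sepia).
sample = {1: 50.0, 2: 100.0, 4: 150.0}
efficiency = scaling_efficiency(sample)
```

An efficiency that drops well below 1.0 as N grows suggests a bottleneck somewhere other than the nodes being added.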
07:44 AM Bug #563: osd: btrfs, warning at inode.c ( btrfs_orphan_commit_root )
I'll have to rebuild, since I didn't look at the messages that closely. Wido den Hollander
07:02 AM Revision 868665d5 (ceph): v0.23.1
Sage Weil
06:41 AM Revision c327c6a2 (ceph): mon: always use send_reply for auth replies
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:41 AM Revision 61dd4f03 (ceph): mon: simplify send_reply code
No need to specify destination in send_reply, as we always have the request
for reference.
Simplify MRoute construct...
Sage Weil
01:37 AM Revision 2c71bd33 (ceph): osd: add assert to _process_pg_info
When activating an inactive replica, assert that we are doing so based
on a message from the primary.
Signed-off-by:...
Colin Patrick McCabe
01:35 AM Revision a70943fd (ceph): osd: re-indent some code in _process_pg_info
Re-indent the code and add a comment.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe
12:12 AM Revision 71369541 (ceph): msgr: tolerate 0 bytes from tcp_read_nonblocking
This can happen, I believe, when we get a signal or something.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:12 AM Revision 7ec0034b (ceph): init-ceph: fix (and test!) cleanlogs and cleanalllogs
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:03 AM Revision 7b4a801f (ceph): mds: fix rejoin_scour_survivor_replicas inode check
We want to remove replicas that we don't ack, but those don't appear in
the strong_inode map; they're appended to the...
Sage Weil

11/22/2010

11:08 PM Revision 8d95b5b6 (ceph): messenger: init rc to -1, removing compiler warning.
This actually is initialized before all uses, but compilers tend to
have trouble with assignment in if-else branches,...
Greg Farnum
11:08 PM Revision dd11fe27 (ceph): types: Allow inodeno_t structs to alias.
This removes a compiler warning that appeared in a gcc upgrade and
is apparently erroneous, about its usage violating...
Greg Farnum
10:56 PM Bug #540 (Resolved): CephxClientHandler::handle_response
couldn't reproduce this, but fixed two smallish things that may have been responsible for this:
commit:61dd4f03e6e15...
Sage Weil
10:35 PM Linux kernel client Bug #552: Samba with kernel oplocks=on produces lots of corrupt mds entries in dmesg
From the reply dump, it looks like a ceph_mds_reply_head, a length 0 tracebl, a length 1 extrabl (containing a u8 == ... Sage Weil
09:25 PM Revision ac6b018a (ceph): Causes the MDSes to switch among a set of stray directories when
switching to a new journal segment.
MDSCache:
The stray member has been replaced with strays, an array of inodes
r...
Samuel Just
09:16 PM Revision 3f8f5905 (ceph): Timer must be initialized in Client::init and shutdown in
Client::shutdown.
Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
Samuel Just
06:47 PM Revision 8eb4de9e (ceph): generate_past_intervals:generate back to lastclean
PG::generate_past_intervals needs to generate all the intervals back to
history.last_epoch_clean, rather than just to...
Colin Patrick McCabe
06:07 PM Revision 80f28235 (ceph): vstart.sh: 'init-ceph stop' instead of 'stop.sh'
This just makes it easier to run multiple vstart sessions as the same user
on the same host.
Signed-off-by: Sage Wei...
Sage Weil
05:55 PM Revision 53d0650a (ceph): Merge branch 'osd_msgr' into unstable
Sage Weil
05:55 PM Revision cd53719f (ceph): mds: resolve cleanup
Only track ambiguous imports and such if we get a resolve message while in
the resolve state.
Signed-off-by: Sage We...
Sage Weil
05:55 PM Revision c0c81d53 (ceph): mds: trim exported subtree _after_ adjusting auth
We need to set the subtree bounds before trimming it away, or else we may
throw out things we're still auth for.
Sig...
Sage Weil
05:55 PM Revision 9e15ade8 (ceph): mds: do not eval subtree root when replay|resolve
This is nonsensical. And can lead to scatter_writebehind, which breaks
horribly.
Signed-off-by: Sage Weil <sage@new...
Sage Weil
05:55 PM Revision 27c6f217 (ceph): mds: remove bogus assert
Causes problems during resolve finish.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:49 PM Revision 924b1fcb (ceph): osd: bind to new cluster address when wrongly marked down
If we come back up on the same address, there is a possible race. Other
nodes will mark_down when they see us go dow...
Sage Weil
05:45 PM Revision 19409763 (ceph): msgr: implement rebind() to pick a new port
Closes out all old connections and binds to a _different_ port. This
ensures that someone doing mark_down on our old...
Sage Weil
05:09 PM Revision f7170f95 (ceph): client: only encode_cap_releases once per request.
Accomplish this by making a list of cap releases in the (permanent)
MetaRequest, and then copying that into the (pote...
Greg Farnum
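The pattern this commit describes (encode once into the long-lived request, then copy into each outgoing message) can be sketched as follows. This is a simplified Python illustration, not the Ceph client code: `MetaRequest` and `encode_cap_releases` are named in the log, but the fields, `build_message`, and the byte-string payloads are assumptions made for the example.

```python
class MetaRequest:
    """Long-lived request record (hypothetical, simplified)."""

    def __init__(self, pending_releases):
        self._pending = list(pending_releases)
        self.cap_releases = None  # filled in exactly once
        self.encode_calls = 0     # counts how often real encoding ran

    def encode_cap_releases(self):
        # Only do the encoding work the first time the request is sent.
        if self.cap_releases is None:
            self.encode_calls += 1
            self.cap_releases = [r.encode() for r in self._pending]
        return self.cap_releases


def build_message(request):
    """Build a (possibly re-sent) message by copying the stored releases."""
    return {"releases": list(request.encode_cap_releases())}


req = MetaRequest(["cap-a", "cap-b"])
first = build_message(req)
resend = build_message(req)  # e.g. the same request sent a second time
```

Because each message gets its own copy, a re-sent message carries the same releases without encoding them a second time.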
04:36 PM Bug #607 (Rejected): osd: ReplicatedPG: sub_op_modify: fix creation of ObjectState
There's a part of the ReplicatedPG::sub_op_modify code that goes like this:
> // do op
> ObjectStat...
Colin McCabe
04:29 PM CephFS Feature #91: mds: up:shadow mode
Updated Journaler to make new interface options asynchronous.
Presently working on how to disambiguate between a one...
Greg Farnum
03:48 PM Tasks #584 (Resolved): do throughput scaling tests on sepia
Results of running rados -p bench bench 20 write on <Nodes>. <Average Throughput> is the average of the Bandwidth st... Samuel Just
01:24 PM CephFS Feature #88 (Resolved): mds: change stray commit strategy to avoid rolling stray dir commits
commit:ac6b018acbeaf8670f8c268db164cfb8a12c171d Sage Weil
12:59 PM Bug #563: osd: btrfs, warning at inode.c ( btrfs_orphan_commit_root )
Is the stack trace you're getting now identical, or different? The FileStore.cc change _should_ have avoided the asy... Sage Weil
09:28 AM Bug #563: osd: btrfs, warning at inode.c ( btrfs_orphan_commit_root )
Just to update the issue, Sage asked me to change something in FileStore.cc, tried that for some days, but that didn'... Wido den Hollander
12:47 PM CephFS Feature #606 (Duplicate): mds: optionally store parent attr on file objects
The goal is to be able to find files contained in rebuilt directories (#603). We can store the same attrs we do for ... Sage Weil
12:45 PM CephFS Feature #605 (Rejected): mds: verify/repair anchor table
- Make sure every item we encounter while traversing the that is anchored correctly appears in the anchor table.
- M...
Sage Weil
12:44 PM Bug #604 (Resolved): Compiler warning: 'status' may be used uninitialized in this function
In gui.cc
The warning's location references are a bit off, but the function gen_node_info_from_icons declares a "sta...
Greg Farnum
12:43 PM CephFS Feature #603 (Resolved): mds: repair directory hierarchy
The goals are
- rebuild missing/corrupt directories
- repair multiple primary links to directories
We'll do so...
Sage Weil
12:40 PM CephFS Feature #602 (Resolved): mds: handle corrupt/missing journals
This probably means
- shutting down current instances, resetting cluster membership
- throwing out journals (or m...
Sage Weil
12:37 PM CephFS Feature #601 (New): mds: order directory commits after rename
When we rename something between directories, we should try to commit the target directory _before_ the source direct... Sage Weil
12:34 PM CephFS Feature #600 (Resolved): mds: store full trace on directories
Currently we only store the immediate parent; store a full trace up to the root. This is CInode::encode_parent_mutat... Sage Weil
12:17 PM Bug #599: recover_master_log, doesn't
Also, I have verified that osd3 and osd9 did NOT crash. They're still running, and they did receive the messages from... Colin McCabe
12:13 PM Bug #599 (Resolved): recover_master_log, doesn't
This is another peering bug. We found it on wido's cluster. Basically, peering never completes.
I just examined PG...
Colin McCabe
09:52 AM Bug #592 (Resolved): osd: rebind cluster_messenger when wrongly marked down
commit:53d0650a42cbfd2f02db2c708a570b6d9e116bb4 Sage Weil
09:14 AM CephFS Bug #596 (Resolved): crash during mds reconnect
Well, that seems to fix it. I added a releases vector to the MetaRequest so it will only encode the releases once, and... Greg Farnum
08:49 AM Bug #598 (Resolved): osd: journal reset in parallel mode acts weird
from ML:... Sage Weil
04:52 AM Revision 51abcaa2 (ceph): mon: clean up cluster_addr code a bit, better debug output
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:52 AM Revision 20313644 (ceph): osdmap: fix cluster_addr encoding; printing
The cluster addrs were getting lost because we were checking v instead of
ev.
Signed-off-by: Sage Weil <sage@newdrea...
Sage Weil
04:52 AM Revision 28498a00 (ceph): osd: send correct ip addrs to monitor for cluster_, hb_addr
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
03:59 AM Revision ec434eda (ceph): osd: unconditionally set up separate msgr instance for osd<->osd msgs
Always set up cluster_messenger (before we would only do so if there was
an explicit address configured for it). The...
Sage Weil
12:16 AM Revision 0dddf453 (ceph): filestore: only warn about disk write cache on kernels <2.6.33
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:15 AM Revision 0856f57e (ceph): osd: fix search_for_missing: old last_update implies object not present
For example, if an osd sends an empty PG::Info (last_update = 0'0) and
empty missing, we should not conclude that the...
Sage Weil
12:09 AM Revision 6ef5c2f3 (ceph): init-ceph: fix cleanlogs for no log_sym_dir case
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil

11/21/2010

07:55 PM Linux kernel client Bug #549 (Resolved): bonnie++ file stat failure
commit:3105c19c450ac7c18ab28c19d364b588767261b3 Sage Weil
03:50 PM Bug #592: osd: rebind cluster_messenger when wrongly marked down
I think the cleanest solution here is to re-bind the cluster_messenger to a new port when we are marked down and go b... Sage Weil
03:38 PM Linux kernel client Bug #597 (Closed): Reproducible crash mounting multiple directories from a pool
This bug was fixed in v2.6.36, commit:ca04d9c3ec721e474f00992efc1b1afb625507f5. Thanks for the report though! :) Sage Weil
03:34 PM Linux kernel client Bug #597: Reproducible crash mounting multiple directories from a pool
Should have mentioned - this is with the Ubuntu 10.10 desktop kernel, which is 2.6.35-22, I think. Ravi Pinjala
03:33 PM Linux kernel client Bug #597 (Closed): Reproducible crash mounting multiple directories from a pool
When trying to mount a pool multiple times (with different subdirectories) I get a consistent system hang.
Steps t...
Ravi Pinjala

11/20/2010

05:06 PM Bug #531: Journaling Causes System Hang
Please try out the patches in the filestore_throttle branch, commit:b28c0bf82ac28ded4fe85573d32fdc111c66e50b
It lo...
Sage Weil
03:15 AM Revision fc9b0976 (ceph): OSDMap: const cleanup
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
03:14 AM Revision 2a5c3893 (ceph): mds-dumper: Define Dumper::~Dumper()
To fix compile error.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe

11/19/2010

10:21 PM Revision 8566c5cd (ceph): ReplicatedPG::pull: fix test for unfound
The test for unfound objects was reversed, leading us to try to pull
unfound objects and refrain from pulling objects...
Colin Patrick McCabe
09:41 PM Revision 2f5502fa (ceph): osdmap: fix printing, again
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:21 PM CephFS Bug #596: crash during mds reconnect
The encode_cap_releases can only be called _once_, the very first time we send the request. So at some level this is... Sage Weil
04:22 PM CephFS Bug #596 (Resolved): crash during mds reconnect
While testing my Journaler changes, I got a cfuse segfault. My steps:
vstart with 1 of each daemon
mount cfuse
cop...
Greg Farnum
06:17 PM Revision 4303820b (ceph): Merge remote branch 'origin/mds' into unstable
Sage Weil
04:26 PM CephFS Feature #91 (In Progress): mds: up:shadow mode
I've been getting some proper time in on this on and off over the last few days. Pushed the Journaler changes to the ... Greg Farnum
03:52 PM Bug #531: Journaling Causes System Hang
Okay,
More updates.
1) All the VMs deployed okay but it looks like towards the end of the deployments I hit the...
Bryan Tong
02:49 PM Bug #531: Journaling Causes System Hang
Okay,
I just started the deployment of 12 vms on a new cephfs with 3 osds in and ssd's for journals on all the sys...
Bryan Tong
02:37 PM Bug #531: Journaling Causes System Hang
I am working on getting the output now. We are having to work on several projects at once right now. Sorry for the de... Bryan Tong
03:36 PM Bug #595 (Won't Fix): Autogen: not a literal
We get this running on autoconf 2.67:
configure.ac:6: warning: AC_INIT: not a literal: Sage Weil <sage@newdream.net>...
Greg Farnum
02:29 PM CephFS Bug #594 (Resolved): mds: frag split/merge vs replay
Need to reconcile refragmenting with resolve stage. Currently handle_resolve assumes frags match, when in reality th... Sage Weil
12:11 PM Bug #585 (Resolved): OSD: ReplicatedPG::pull
Fixed by commit:82f1de8c0d6e7817ca7d6dd710e3176b2a549e12 Colin McCabe
10:43 AM Bug #585 (In Progress): OSD: ReplicatedPG::pull
need to see what's going on with this Colin McCabe
11:47 AM Bug #503 (Closed): osd: query osds since last_epoch_clean before concluding objects lost?
Sage Weil
11:39 AM Bug #515 (Can't reproduce): osd: recovery isn't completing
with the recent changes i'm closing this one out, and reopening with specifics if it comes up in testing over the nex... Sage Weil
10:14 AM CephFS Feature #545 (Resolved): mds: use bloom filter to supplement dirfrag COMPLETE flag
merged commit:4303820b43721a8b46ef36d0e9ef4e1167857c80 Sage Weil
09:38 AM CephFS Feature #593 (Rejected): mds: fsck: anchor table repair
We need to be able to fix up the anchor table when there are problems, to avoid e.g.... Sage Weil
05:13 AM Revision b91e14e1 (ceph): multi-dump.sh: add diff mode
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:57 AM Revision 9cab522e (ceph): Add multi-dump.sh
This is a debug tool that can dump out Ceph information at various
epochs. For instance, it can show how the OSDmap c...
Colin Patrick McCabe

11/18/2010

11:05 PM Revision 6e2b594b (ceph): ReplicatedPG::get_object_context: fix broken calls
ReplicatedPG::get_object_context takes three parameters. The last two
are "const object_locator_t& oloc" and "bool c...
Colin Patrick McCabe
09:50 PM Bug #592: osd: rebind cluster_messenger when wrongly marked down
Ah. Looks like you got it figured out.
I wasn't aware of what mark_down did.
Just in case anyone finds it useful...
Colin McCabe
09:22 PM Bug #592: osd: rebind cluster_messenger when wrongly marked down
ok, this is a problem with how the osd is interacting with the messenger. looking at the history of 0.5, we see
<pr...
Sage Weil
08:42 PM Bug #592: osd: rebind cluster_messenger when wrongly marked down
i suspect 0.5 didn't get set up on osd1 or 2 before osd0 went down? do you have the full logs for the other instances? Sage Weil
05:07 PM Bug #592: osd: rebind cluster_messenger when wrongly marked down
I should also add that Greg Farnum helped me examine the logs for this bug. Colin McCabe
05:03 PM Bug #592 (Resolved): osd: rebind cluster_messenger when wrongly marked down
This happened with commit:323565343071ce695f7d454ed29590688de64d5d on flab.ceph.dreamhost.com
While running test_u...
Colin McCabe
08:50 PM Revision 43e0b267 (ceph): ReplicatedPG: call finish_recovery when needed
Don't loop in ReplicatedPG::start_recovery_ops. There is already a loop
in both recover_replicas and recover_primary ...
Colin Patrick McCabe
08:33 PM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
Colin McCabe wrote:
> Another potential issue that I can see here is that the code in OSD::_process_pg_info doesn't ...
Sage Weil
12:43 PM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
Another potential issue that I can see here is that the code in OSD::_process_pg_info doesn't check whether it got a ... Colin McCabe
09:26 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
Need to look at this more closely. Fred, pretty sure no data is lost here, but the recovery code needs some fixing.
...
Sage Weil
06:19 AM Bug #590 (Resolved): osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
After upgrading to ceph 0.23, the cluster (3 osd, 3 mon, 3 non-clustered mds) worked for about 2 hours and then one c... ar Fred
06:09 PM Revision ea5d1d66 (ceph): osd_resurrection_1_impl: turn on recovery at end
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:47 AM Feature #526 (Resolved): osd: unfound objects rework
We now let the PG become active even when there are unfound objects. When the user tries to read one of those objects... Colin McCabe
07:39 AM Linux kernel client Feature #591 (Resolved): implement FALLOC_FL_PUNCH_HOLE
Sage Weil
12:52 AM Revision 4adfdee7 (ceph): Makefile: fix builddir weirdness
Signed-off-by: Jim Schutt <jaschut@sandia.gov> Jim Schutt
12:10 AM Bug #585: OSD: ReplicatedPG::pull
Well, it did show up again:... Wido den Hollander

11/17/2010

10:37 PM Revision 7e9812b4 (ceph): osd: rev PG::Info encoding for last_epoch_clean change
This was missed by 184fbf582b27c10b47101735a4495fe8c73ad186, so any fs
created between now and then won't decode prop...
Sage Weil
09:06 PM Revision c17e7da4 (ceph): Merge branch 'mds_frags' into unstable
Sage Weil
09:06 PM Revision 7f6a2561 (ceph): mds: clear PIN_SUBTREE on split/merge in purge_strays
This makes the helper work for merge as well as split. Remove the special
fixups in the caller that were making spli...
Sage Weil
09:06 PM Revision 66d43ac8 (ceph): mds: fix subtree map update on dirfrag merge
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:06 PM Revision b705be11 (ceph): mds: wrlock scatterlocks to prevent a gather racing with split/merge lo...
We have the dirs split in our cache for some time while journaling it to
disk, before the fragment_notify goes out. ...
Sage Weil
09:06 PM Revision f6823a79 (ceph): mds: adjust dir_auth_pins on steal_dentry
dir_auth_pins is a counter of dentry auth_pins in the current dir; those
need to be added in when stealing.
Signed-o...
Sage Weil
09:06 PM Revision cd5ee006 (ceph): mds: initialize PIN_SUBTREE on split
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:06 PM Revision d538817f (ceph): mds: flush log on fragment
This makes request lock auth_pins expire, so the fragment moves along.
Otherwise we can end up waiting for the log fl...
Sage Weil
09:06 PM Revision 3777ff8a (ceph): mds: move dirty rstat inodes to new dir on refragment
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:06 PM Revision 669b5544 (ceph): mds: don't complete freeze while parent inode is frozen
This makes maybe_finish_freeze() conditions match that of is_freezeable()
and avoids an assert.
Signed-off-by: Sage ...
Sage Weil
09:04 PM Revision b58b8d09 (ceph): mds: fix discover requests, tracking wrt fragments
Track discover requests by tid. The old system of tracking outstanding
discovers was kludgey and somewhat broken. A...
Sage Weil
09:02 PM Revision a63c06c8 (ceph): mds: fix EFragment replay
If the inode already exists in our cache, adjust our (existing) fragments.
But it might not. In that case, we just r...
Sage Weil
09:02 PM Revision a961049b (ceph): mds: don't fragment mdsdir or .ceph
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:48 PM Revision b54880e0 (ceph): Detect broken system linux/fiemap.h
RedHat 5.5 has a /usr/include/linux/fiemap.h, but it is
broken because it does not itself include linux/types.h.
As a...
Jim Schutt
06:24 PM Revision 29a9e668 (ceph): osdmap: don't include blacklist info in summary
It's confusing users and isn't that important.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:58 PM Revision c43455ce (ceph): client: Remove the I_COMPLETE flag from the parent directory in relink_...
This papers over issues arising from the client's lack of proper support
for hard links, and lets it pass the snaptes...
Greg Farnum
02:35 PM Bug #589 (Resolved): OSD: crash on startup, PG::read_state
Ok, this is fixed by commit:7e9812b4a9bbf320a8b0bd0abec48c1c5d78fe66. Assuming your fs is old enough you should be o... Sage Weil
11:38 AM Bug #589 (Resolved): OSD: crash on startup, PG::read_state
After upgrading to today's unstable all my OSD's crashed directly after startup, for example osd0:
Last loglines a...
Wido den Hollander
12:56 PM Bug #531: Journaling Causes System Hang
Just pinging you on this one. If you can send the logs I'd like to sort this out. Thanks! Sage Weil
09:59 AM CephFS Bug #344: cfuse should pass all qa tests
At this point the only test it's failing is bonnie. This one tends to fail on a SEGV that just keeps going through th... Greg Farnum
09:57 AM CephFS Bug #583 (Resolved): cfuse fails snaptest-upchildrealms
Okay, a proper fix for this is going to require a bit of work, since right now Inodes can only have one parent dentry... Greg Farnum
09:52 AM CephFS Cleanup #588 (Resolved): Allow Inodes to have multiple parent Dentries
Right now, cached Inodes can only have one parent Dentry. This is unfortunate when there are multiple hard links to a... Greg Farnum
09:40 AM Tasks #587 (Rejected): install mpich2 on sepia*
this will make management and testing easier Sage Weil
07:52 AM Bug #585 (Closed): OSD: ReplicatedPG::pull
This one should also be fixed in the latest unstable. Probably. The recovery code is still being worked on a bit, b... Sage Weil
02:55 AM Bug #585 (Resolved): OSD: ReplicatedPG::pull
On two OSDs (osd5 and osd10) I'm seeing the same crash, almost directly after starting them.
I cranked ...
Wido den Hollander
07:19 AM Bug #586 (Resolved): OSD: Crash during scheduled scrub
This was fixed in the commit right after what you were running, commit:556ba7397c352f5a6cb7fe03087c6e2f51dbce32 Sage Weil
05:31 AM Bug #586 (Resolved): OSD: Crash during scheduled scrub
After I reported #585 I didn't pay much attention to my cluster, until I found out that I had only one OSD left onlin... Wido den Hollander
12:09 AM Revision d57181d3 (ceph): config: added max_mds
MDSMonitor: create_new_fs adapted to use the max_mds parameter
max_mds is now a configurable value and create_new_fs...
Samuel Just

11/16/2010

09:00 PM Tasks #584 (Rejected): do throughput scaling tests on sepia
Use rados bench on N nodes, scaling N, and see how the throughput scales. Sage Weil
08:09 PM Revision c4931265 (ceph): mds: make dirfrag thrashing join and split
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:09 PM Revision d1dcc035 (ceph): mds: allow frag merge on subtree root
Fix purge_stolen and adjust_dir_fragments.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
08:08 PM Revision 8f24919d (ceph): mds: add timestamp to LogEvents
This just gives us a bit of useful info when debugging problems.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
06:32 PM Revision 56b9e927 (ceph): osd: fix trailing + in pg state string rendering
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:21 PM CephFS Bug #583: cfuse fails snaptest-upchildrealms
Looks like the problem is caused by linking b/bar to b/foo. The server response to goes through insert_dentry_inode v... Greg Farnum
06:17 PM CephFS Bug #583 (Resolved): cfuse fails snaptest-upchildrealms
Fails to rm a/b, ENOTEMPTY. Greg Farnum
06:11 PM Feature #582 (Closed): Make max_mds configurable
Samuel Just
03:06 PM Feature #582 (Closed): Make max_mds configurable
Right now the only way to set it is with the set_max_mds mon command. Add it to the config stuff and have create_new_... Greg Farnum
06:10 PM Revision 2c9873f0 (ceph): Merge remote branch 'origin/unfound' into unstable
Sage Weil
06:06 PM Revision d17f7444 (ceph): mds: be less noisy about cap imports
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:01 PM Revision 05bd6b07 (ceph): Merge branch 'mds_dir_hash' into unstable
Sage Weil
06:01 PM Revision e146767e (ceph): mds: make dentry hash a dir layout property
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:01 PM Revision cc709df8 (ceph): mds: add DIRLAYOUTHASH feature bit
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:01 PM Revision be29e4c3 (ceph): mds: set mode before all the file type dependent inode initialization!
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:01 PM Revision 33580460 (ceph): mds: set dir hash on root inode
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:01 PM Revision 77c05fbc (ceph): mds/client: pass dir hash over the wire
Add a feature bit DIRLAYOUTHASH.
Also fix client request routing for lookups (we were only hashing when
a Dentry poi...
Sage Weil
05:13 PM Bug #479: ceph/mount crash badly when writing
Sorry Sage and Yehuda for the late update..
I was spending time experimenting, and just using the default btrfs with...
DongJin Lee
01:48 PM Bug #538: Write performance does not scale over multiple computers
Did you update your installed version of the rados tool as Sage said? If you did and are still getting poor performan... Greg Farnum
12:48 PM Bug #518: cfuse crashed on ls
Confirmed this is fixed in 0.23.1 (sorry for the huge delay in confirmation). John Leach
12:06 PM CephFS Feature #483 (Resolved): mds: add timestamp to LogEvent
commit:8f24919d39734cf518f2bf6e50faf6f5266d6eff Sage Weil
11:52 AM CephFS Feature #560 (Resolved): mds: alternate directory hashing
kernel part is done and in unstable branch, currently commit:9f62e3eaafd52875e1f2e4344e11e51ddb726f48 Sage Weil
09:59 AM CephFS Feature #560: mds: alternate directory hashing
commit:05bd6b078d743d6c235c0fcedda7ee4f64ab2ad5 has it working for the user client. Sage Weil
02:33 AM Revision 267cd845 (ceph): RadosClient::shutdown: call monclient::shutdown
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
02:22 AM Revision dfb78ebf (ceph): osd: don't stop recovery when there are unfound
There are two phases in recovery: one where we get all the right objects
on to the primary, and another where we push...
Colin Patrick McCabe
01:03 AM Revision d014acb6 (ceph): dumpjournal.cc: fix compile
dumpjournal needs to create its own SafeTimers and pass them in to some
constructors.
Signed-off-by: Colin McCabe <c...
Colin Patrick McCabe
12:44 AM Revision da2d5018 (ceph): rbd: fix rbd snap rm class handling
Yehuda Sadeh

11/15/2010

10:59 PM Revision 250d414e (ceph): Merge remote branch 'origin/unfound_last_epoch_clean' into unstable
Sage Weil
10:47 PM Revision c7075115 (ceph): Add ./ceph osd tell <osd-num> dump_missing <out>
Add a command that tells the OSD to dump its missing set for all PGs to
a file. This should be useful for debugging m...
Colin Patrick McCabe
10:38 PM Revision 755f5759 (ceph): search_for_missing:recalc stats if unfound changed
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:31 PM Revision d883a547 (ceph): mds: Use CDir bloom filter as appropriate.
Add items to the bloom filter when trimming, and look for them
in the filter in the few places where a simple existen...
Greg Farnum
09:31 PM Revision be2da00a (ceph): mds: Add bloom filter to CDir.
You can now add items to a bloom filter and check for their existence.
This is intended to be used when trimming item...
Greg Farnum
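For readers unfamiliar with the data structure behind the two bloom filter commits above, here is a minimal sketch in Python. It is not the code added to CDir or to include/; the class name, sizes, and hashing scheme are all assumptions chosen for illustration. It shows the one property the commit messages rely on: a negative lookup is definite, while a positive lookup only means "maybe".

```python
import hashlib


class BloomFilter:
    """Toy bloom filter: false positives are possible, false negatives are not."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # num_bits and num_hashes are illustrative defaults, not Ceph's values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Hypothetical usage: remember a name when it is trimmed from a cache.
trimmed = BloomFilter()
trimmed.add("foo")
print("foo" in trimmed)
```

A name that was added is always reported as present, so a "not present" answer can safely be treated as proof that the name was never added.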
09:23 PM Revision 1fe31e18 (ceph): timer: make init/shutdown explicit
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:39 PM Revision d2af7b7e (ceph): test_unfound.sh: start recovery at end of test
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
08:31 PM Revision c293b9af (ceph): test_common.sh: add dump_osd_store
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
08:15 PM Revision 184fbf58 (ceph): osd: add last_epoch_clean to PG::Info
This changes the encoding in a non-backwards compatible way.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
08:15 PM Revision 873e9bf8 (ceph): osd: add incompat feature LEC for last_epoch_clean
So an old binary will fail to mount a store with new Info encoding.
Signed-off-by: Sage Weil <sage@newdream.net>
Colin Patrick McCabe
08:15 PM Revision b0c22bd5 (ceph): Add MOSDPGMissing
Add MOSDPGMissing, a message which just contains the missing objects
information for a PG. We will request messages l...
Colin Patrick McCabe
08:15 PM Revision d3cf4787 (ceph): PG::finish_recovery: set info.last_epoch_clean
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
08:15 PM Revision e768bbdf (ceph): Add stray_test to test_unfound.sh
This test is designed to produce a stray that nonetheless has some
useful objects. The primary should be able to find...
Colin Patrick McCabe
08:15 PM Revision 796ff1d1 (ceph): Fix bugs in search_for_missing, _process_pg_info
PG::search_for_missing: fix a bug with the handling of MSG_OSD_PG_INFO
messages. Formerly, when processing these mess...
Colin Patrick McCabe
08:15 PM Revision e3f65076 (ceph): osd: add discover_all_missing
Add discover_all_missing. This function makes sure that we have messages
en route to any OSD that we think might have...
Colin Patrick McCabe
08:15 PM Revision 470b1990 (ceph): stray_test:don't use up/down. timeout extension
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
08:15 PM Revision 05a16d32 (ceph): test_unfound.sh: fix return codes
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
08:15 PM Revision 6a65cc4f (ceph): test_common.sh: remove messenger debug for now
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
08:06 PM Revision 873180aa (ceph): osd: skip unfound in recover_replicas
This is moot currently, since we don't currently start recovering replicas
until the primary is complete.
Signed-off...
Sage Weil
08:04 PM Revision d61bc3bf (ceph): osd: skip unfound objects in recover_primary()
We also need to make sure we come back later when they are found.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
07:57 PM Revision 9ea1d8bb (ceph): osdmap: make printing a bit easier to read
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
07:50 PM Revision beae97f9 (ceph): objecter: don't dereference null op->outbl
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
07:36 PM Revision 089cd12d (ceph): include: Add bloom filter library to include/
Signed-off-by: Greg Farnum <gregf@hq.newdream.net> Greg Farnum
07:25 PM Revision f2c080b3 (ceph): Merge remote branch 'origin/testing' into unstable
Sage Weil
07:25 PM Revision 556ba739 (ceph): osd: unreg scrub when removing pg
This fixes this crash:
osd/OSD.cc: In function 'PG* OSD::_lookup_lock_pg(pg_t)':
osd/OSD.cc:956: FAILED asse...
Sage Weil
04:54 PM CephFS Feature #560: mds: alternate directory hashing
almost there. need to fix/test uclient hashing.
then implement for kclient...
Sage Weil
04:44 PM Bug #580 (Resolved): rbd rm snap is broken
Fixed with commit:da2d50180dfdc0e30b4348f2acceb2be650f20b7 Yehuda Sadeh
03:42 PM Bug #580 (Resolved): rbd rm snap is broken
When doing 'rbd rm snap', the rbd image header gets corrupted. Yehuda Sadeh
01:49 PM Bug #535 (Resolved): cephtool hangs forever until a UNIX signal is received
Sage spent some time on the messenger too, and I suspect we're done now. Greg Farnum
01:39 PM CephFS Feature #545: mds: use bloom filter to supplement dirfrag COMPLETE flag
Pushed it to branch "mds" (which I apparently created, but thought existed...weird!). Testing it now on a secondary i... Greg Farnum
11:19 AM Bug #579 (Resolved): OSD::sched_scrub: FAILED assert(pg_map.count(pgid)
commit:f46f674261bf65a6f7f6313fb688ec4773f526b5 Sage Weil
10:56 AM Bug #579: OSD::sched_scrub: FAILED assert(pg_map.count(pgid)
Some more information about this bug.
OSD1 and OSD2 have a PG named 0.6
OSD0 does not.
=====================
...
Colin McCabe
10:51 AM Bug #579 (Resolved): OSD::sched_scrub: FAILED assert(pg_map.count(pgid)
On unfound_last_epoch_clean at commit commit:7201497f2feef6a2bbd0baf89e3a14b8a880e79f
I found this assert when run...
Colin McCabe
07:05 AM Bug #538: Write performance does not scale over multiple computers
I set 'osd heartbeat grace=120' and that got rid of the chatter. My performance is now:... Ed Burnette
04:48 AM Revision 7f38858c (ceph): Merge branch 'msgr_zerocopy_read' into unstable
Sage Weil
04:39 AM Revision 7cb2d508 (ceph): msgr: use provided rx buffer if present
This changes the read path so that we hold the Connection::lock mutex while
reading data off the socket. This ensure...
Sage Weil
04:39 AM Revision e8132cd9 (ceph): objecter: post rx buffer to msgr if target bufferlist is present
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:39 AM Revision 975dd8fa (ceph): librados: pass provided buffer to objecter on rados_read
This allows us to avoid the data copy if the objecter and msgr manage
to use it.
Signed-off-by: Sage Weil <sage@n...
Sage Weil
04:23 AM Revision 2854dae8 (ceph): msgr: add Connection rx buffer interface
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:23 AM Revision c04ba725 (ceph): msgr: implement get_connection()
Get a Connection* for the given destination. This mirrors submit_message,
but does not actually queue a message.
Si...
Sage Weil
04:21 AM Revision 67852352 (ceph): buffer: implement list::iterator::get_current_ptr()
Return a buffer::ptr for the ptr at the current position/offset, with the
length set to the remaining space in the cu...
Sage Weil

11/14/2010

09:05 PM Messengers Feature #527 (Resolved): zero copy reads, msgr rx buffer infrastructure
commit:7f38858c0c19db36c5ecf36cb4d333579981c811 Sage Weil
07:29 PM Revision 4af14db4 (ceph): Objecter::shutdown: shut down timer.
We have to explicitly shut down the timer in Objecter::shutdown.
Otherwise, we are relying on the destructor of SafeTi...
Colin Patrick McCabe
11:33 AM Bug #578 (Resolved): assert triggered on radostool shutdown
Colin McCabe
11:33 AM Bug #578: assert triggered on radostool shutdown
Fixed by commit:4af14db424e770c2f3e99dad6fd2b6f2059feacd
A mutex lifecycle issue.
Colin McCabe
11:26 AM Bug #578 (Resolved): assert triggered on radostool shutdown
I hit this assert when radostool was exiting.
./common/Mutex.h:97: FAILED assert(nlock == 0)
ceph version 0.24~r...
Colin McCabe

11/13/2010

08:46 PM Bug #574: timer: event cancellation apparently broken
cancel_event always relied on the caller to take the SafeTimer lock, and then goes on to take the Timer lock. So it's... Colin McCabe
08:39 PM Bug #535: cephtool hangs forever until a UNIX signal is received
It looks good so far. Colin McCabe
04:43 AM Revision f18609e8 (ceph): Merge remote branch 'origin/msgr' into testing
Sage Weil
12:00 AM Revision 2be4215a (ceph): debug: don't print thread id twice
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil

11/12/2010

11:59 PM Revision b61af6a7 (ceph): msgr: cleanup: make queue_received non-inline; some helpful debug
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
11:56 PM Revision f99c84e6 (ceph): msgr: do not clear halt_delivery
We need to keep the halt_delivery plug set on failure/shutdown in order to
prevent a racing reader from queuing new m...
Sage Weil
10:55 PM Revision 1071a9ab (ceph): msgr: protect pipe queue_item map with pipe_lock AND dispatch_queue lock
Close a few different races here.
Also, assert that queue_items are not queued in ~Pipe().
Signed-off-by: Sage Weil...
Sage Weil
10:55 PM Revision d4746ab5 (ceph): msgr: close enqueue/discard race
We need to re-check halt_delivery after dropping and retaking pipe_lock.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
10:55 PM Revision 20937e88 (ceph): msgr: protect pipe queuing with _both_ pipe and dispatch_queue locks
We want to make sure the pipe's queue item doesn't go away.
Also, make queue_received() require pipe_lock to be held...
Sage Weil
10:55 PM Revision cbf154e1 (ceph): msgr: only close socket on reconnect or shutdown
We can't modify 'sd' or (more importantly) close sd while any other thread
might be using it, or else we might race w...
Sage Weil
10:55 PM Revision 70fe062f (ceph): msgr: add 'ms inject socket failures = foo'
Where we fail roughly every foo'th socket operation.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
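The "fail roughly every foo'th socket operation" behaviour described above can be sketched as a random check with probability 1/foo. This is only an illustration in Python under that assumption; the function name and the treatment of 0 as "disabled" are hypothetical and not taken from the messenger code.

```python
import random


def should_inject_failure(every_nth, rng=random):
    """Return True on roughly one call in every_nth; 0 or less disables injection."""
    if every_nth <= 0:
        return False
    # randrange(n) is uniform over 0..n-1, so this fires with probability 1/n.
    return rng.randrange(every_nth) == 0


# Count injected failures over many simulated socket operations.
rng = random.Random(0)
failures = sum(should_inject_failure(100, rng) for _ in range(10000))
print(failures)
```

Because the check is random rather than a strict counter, failures land at unpredictable points, which is useful for shaking out races.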
10:49 PM Revision 20affc65 (ceph): TestTimers: don't test (nonexistent) Timer
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
10:45 PM Revision d5032a05 (ceph): Rename PG::peer to PG::do_peer
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
03:59 PM Revision 46cf27d4 (ceph): Merge branch 'testing' into unstable
Sage Weil
03:55 PM Revision c5b2d28b (ceph): uclient: insert lssnap results under snapdir, not live dir
Put the readdir results (list of snapshots) in the right place in the
hierarchy; we were putting them in the parent d...
Sage Weil
03:36 PM Revision 7ccdae8c (ceph): msg: fix buffer size for IPv6 address parsing
Signed-off-by: Wido den Hollander <wido@widodh.nl> Wido den Hollander
02:20 PM Bug #577 (Resolved): unify PG creation code in OSD::handle_pg_notify and OSD::_process_PG_info
unify PG creation code in OSD::handle_pg_notify and OSD::_process_PG_info
Duplicated code here. They're slightly d...
Colin McCabe
02:16 PM CephFS Feature #545: mds: use bloom filter to supplement dirfrag COMPLETE flag
Trying to find a bloom filter library. Unfortunately there don't seem to be any available under a GPL-compatible lice... Greg Farnum
01:16 PM Bug #490 (Can't reproduce): Cluster stays in a degraded state
Sage Weil
01:15 PM CephFS Cleanup #514 (Rejected): Optimize MIX/MIX_STALE reconnects, etc
mix_stale is no more Sage Weil
12:56 PM Linux kernel client Bug #576 (Can't reproduce): readdir returns too many results
... Sage Weil
11:02 AM Bug #535: cephtool hangs forever until a UNIX signal is received
Pushed a potential fix to the msgr branch, waiting for Colin to report back on if it works or not. :) Greg Farnum
07:56 AM CephFS Bug #561 (Resolved): snaptest-2 doesn't execute properly
Figured this out. LSSNAPs was adding the snap dentries to the cache under the parent dir instead of the hidden .snap... Sage Weil
07:37 AM Messengers Bug #573 (Resolved): monmaptool fails to parse IPv6 address
Thanks, applied as commit:7ccdae8cd44c143550234511a2a09bab38c6515e Sage Weil
04:56 AM Messengers Bug #573: monmaptool fails to parse IPv6 address
After searching through the source I found it :)
Attached is a patch to fix the IPv6 address parsing. The buffer w...
Wido den Hollander
05:12 AM Bug #575 (Resolved): monmaptool terminates when input file is not a monmap
For example:... Wido den Hollander
03:30 AM Bug #540: CephxClientHandler::handle_response
Just saw it again on the same cluster, this time osd2 crashed when upgrading to this morning's unstable:... Wido den Hollander
12:29 AM Bug #540: CephxClientHandler::handle_response
I saw that on a test machine of mine. The 'ceph -w' command was hanging for about 10 seconds and then exited with thi... Wido den Hollander
12:38 AM Revision ce6d6394 (ceph): timer: rewrite mostly from scratch
Just use the provided lock. This _vastly_ reduces the complexity because
we don't have to worry about races between ...
Sage Weil

11/11/2010

11:31 PM Revision 54848991 (ceph): mds: hit inode created via CREATE
We missed this path!
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
10:28 PM Revision f8b3271f (ceph): Merge branch 'rc' into unstable
Conflicts:
configure.ac
src/Makefile.am
Sage Weil
05:47 PM Bug #531: Journaling Causes System Hang
Sorry, I haven't been able to get the debug output yet. We have spent the last few days working with our production syste... Bryan Tong
04:47 PM Linux kernel client Tasks #569 (Resolved): test dir frags
a few fixes, mostly fine. commit:7b88dadc13e0004947de52df128dbd5b0754ed0a Sage Weil
04:43 PM Bug #574 (Resolved): timer: event cancellation apparently broken
Looking into this, it appears that the problem was that the wrong lock was taken during cancel event. Or that the ev... Sage Weil
03:38 PM Bug #574 (Resolved): timer: event cancellation apparently broken
Just saw this on latest unstable, commit:f8b3271f45cc4a87e3f3f212d22e3d34ff13da44
The monitor schedules a propose ...
Sage Weil
03:09 PM CephFS Tasks #366 (New): test snaptests against clustered mds failures
Sage Weil
03:08 PM CephFS Tasks #366 (Rejected): test snaptests against clustered mds failures
Sage Weil
03:08 PM CephFS Bug #362 (Rejected): mds: rejoin crashes on snaptest-2 workload
Sage Weil
02:45 PM Bug #540: CephxClientHandler::handle_response
Wido just saw this:... Sage Weil
05:18 AM Revision 5d1d8d0c (ceph): v0.23
Sage Weil
04:58 AM Revision 3d10b340 (ceph): mds: fix null_snapflush with multiple intervening snaps
The client is allowed to not send a snapflush if there is no dirty metadata
to write for a given snap. However, the ...
Sage Weil
02:17 AM Messengers Bug #573 (Resolved): monmaptool fails to parse IPv6 address
I'm trying to setup a small cluster with IPv6, but mkcephfs fails:... Wido den Hollander
12:36 AM Revision 3d6e9155 (ceph): Merge remote branch 'origin/unfound' into unstable
Sage Weil
12:31 AM Revision 4d941cf4 (ceph): osd: scrub: change cancel behavior
Use explicit flag, so that scrub_reserved always indicates whether the
osd count includes us or not.
Signed-off-by: ...
Sage Weil
12:31 AM Revision a87e8901 (ceph): osd: track last_scrubbed in PG::Info::History
Share with peers and write to disk on scrub completion.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:31 AM Revision 6548fb65 (ceph): osd: do scrub schedule state changes inside scrub()
Update these values under protection of pg lock iff we start scrubbing,
otherwise back out.
On scrub completion, unr...
Sage Weil
12:31 AM Revision 815c3d56 (ceph): osd: fix sched_scrub
Insert whoami into reserved set on primary, not 0! Also more cleanup of
sched state helpers.
Signed-off-by: Sage We...
Sage Weil
12:31 AM Revision 92572910 (ceph): osd: call sched_scrub on reserve reply
Otherwise we have to wait until the next time it's called by the timer, and
during that period we have a reservation ...
Sage Weil
12:31 AM Revision c12829a2 (ceph): osd: don't scrub something we just scrubbed
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:31 AM Revision 85e08905 (ceph): osd: scrub least recently scrubbed pgs first; once a day
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
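As a rough illustration of the "least recently scrubbed first; once a day" policy named in the commit title above, the following Python sketch picks PGs whose last scrub is at least an interval old and orders them oldest first. The function name, the dictionary layout, and the 24-hour default are assumptions for the example, not the OSD's actual data structures.

```python
def scrub_candidates(last_scrubbed, now, min_interval=24 * 60 * 60):
    """Return PG ids due for a scrub, least recently scrubbed first.

    last_scrubbed maps a PG id to the timestamp (in seconds) of its last scrub.
    """
    due = [
        (stamp, pgid)
        for pgid, stamp in last_scrubbed.items()
        if now - stamp >= min_interval
    ]
    # Sorting by timestamp puts the PG with the oldest scrub at the front.
    return [pgid for stamp, pgid in sorted(due)]


day = 24 * 60 * 60
stamps = {"0.1": 0, "0.2": 2 * day, "0.6": day}
candidates = scrub_candidates(stamps, now=2 * day + 10)
print(candidates)
```

Here "0.2" was scrubbed ten seconds ago and is skipped, while "0.1" is scheduled ahead of "0.6" because its last scrub is older.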

11/10/2010

10:50 PM Revision 231434af (ceph): pg_state_string: use an ostringstream
Use an ostringstream for efficiency's sake.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe
09:49 PM Revision d247616c (ceph): vstart: stop logging to /tmp/foo
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:46 PM CephFS Bug #561: snaptest-2 doesn't execute properly
I ran the test again and didn't get an mds crash. There was one issue remaining:... Greg Farnum
06:14 PM CephFS Bug #561 (In Progress): snaptest-2 doesn't execute properly
I think I may have finally nailed this problem, or at least found a band-aid by more aggressively removing the I_COMP... Greg Farnum
09:39 PM Revision 74be621c (ceph): osd: fix scrub reserved state when starting scrub
Also document scrub scheduling/pending/active states.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:18 PM CephFS Bug #570 (Resolved): Locker::_do_null_snapflush assert failure
Sage Weil
09:18 PM CephFS Bug #570: Locker::_do_null_snapflush assert failure
Nice catch. Fixed by commit:3d10b340748e5bbff86b49ac7386da9efa27a070. Added a unit test too! Sage Weil
02:58 PM CephFS Bug #570 (Resolved): Locker::_do_null_snapflush assert failure
Seen this a lot while working on the snaptest-2 issue, when shutting down cfuse.... Greg Farnum
09:16 PM Revision 8650418f (ceph): vstart: turn down msgr debugging
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:13 PM Revision 9e4027fb (ceph): monc: cancel timer events with lock held
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:23 PM Revision 07bb6756 (ceph): Wake up clients waiting for now-found objects
PG::search_for_missing: when we find a previously unfound object, check
to see if there is an entry in waiting_for_mi...
Colin Patrick McCabe
07:46 PM Revision 8288a23a (ceph): PG::peer: don't block if objects are unfound
Erase the code in PG::peer that used to keep us from becoming active
when objects were still unfound. Print out the n...
Colin Patrick McCabe
07:46 PM Revision 040c4bcd (ceph): PG::search_for_missing: minor refactoring, comment
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
07:46 PM Revision 5153ba5e (ceph): Add PG::Missing::have_missing()
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
07:46 PM Revision 85c4e6e6 (ceph): OSD::_process_pg_info:search_for_missing sometimes
OSD::_process_pg_info: If we're the primary for this active PG, and we
have missing objects, call search_for_missing....
Colin Patrick McCabe
07:46 PM Revision 6a04ac52 (ceph): PG::recover_master_log: rename a local variable
PG::recover_master_log: rename a local variable to avoid using the
overloaded term "missing".
Signed-off-by: Colin M...
Colin Patrick McCabe
07:46 PM Revision b5181133 (ceph): test_unfound.sh: shorter test
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
07:46 PM Revision 02ec7219 (ceph): Add num_objects_unfound to struct pg_stat_t
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
07:46 PM Revision fc605ced (ceph): test_unfound.sh: verify that we have unfound objs
test_unfound.sh: verify that we have unfound objs.
Then, when we bring up the other OSD, verify that those unfound ob...
Colin Patrick McCabe
07:46 PM Revision b9191ddc (ceph): test_unfound.sh: test reading an unfound object.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
07:46 PM Revision e6b6c539 (ceph): PG::peer: count/find cleanup
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:30 PM Revision b80f3e6a (ceph): PG: move ostream operator to .cpp file
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:30 PM Revision a46f15e7 (ceph): PG: nomenclature change: talk about unfound objs
Describe objects as "unfound" when we don't know what OSD has them.
Signed-off-by: Colin McCabe <colinm@hq.newdream....
Colin Patrick McCabe
06:30 PM Revision ef1f8ecd (ceph): PG.h erase deadcode
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:16 PM Bug #535 (In Progress): cephtool hangs forever until a UNIX signal is received
After checking the logs and conferring with Sage, I think I've found a possible cause. Designing and testing a fix no... Greg Farnum
05:43 PM Revision 82aa79f8 (ceph): mds: fix inode->frag rstat projected with snaps
The snapid 'first' value needs to be >= inode->first; move that into
the helper.
Signed-off-by: Sage Weil <sage@newd...
Sage Weil
05:04 PM Revision 5deef243 (ceph): osdmap: break up asserts for easier debugging
If we fail one of these it's helpful to know which one.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:03 PM Revision 586c9e7a (ceph): objecter: throttle before looking at lock protected state
The take_op_budget() may drop our lock if we are in keep_balanced_budget
mode, so we need to do that _before_ we take...
Sage Weil
04:50 PM Revision 57513739 (ceph): mon: drop unnecessary state checks
We want to ignore all beacons from the mds regardless of what state they
are in.
Signed-off-by: Sage Weil <sage@newd...
Sage Weil
04:46 PM Feature #567 (Resolved): osd: background scrub frequency, scheduling
fixed up some scheduling problems, then added the interval and oldest-scrubs-first stuff. Sage Weil
04:45 PM Revision 84840ed7 (ceph): debian: don't explicitly depend on libgoogle-perftools0
dpkg-buildpackage will autodetect the dependency. Except on lenny, where
it doesn't exist and we don't use it!
Sign...
Sage Weil
04:14 PM Revision ca3693d8 (ceph): mds: Enable --journal_check mode.
This replaces the old --shadow option, which didn't work.
It starts up the MDS daemon, then replays the journal for
a...
Greg Farnum
04:13 PM Revision 214b7269 (ceph): osdc: Fix bad assert in ~ObjectCacher.
The objects data member is never empty on shutdown since it now consists
of a vector of pools. Instead, check each po...
Greg Farnum
03:43 PM Feature #572 (Resolved): Implement lingering osd requests
For the watch/notify feature we need to implement lingering osd requests on the userspace client side. Lingering osd ... Yehuda Sadeh
03:42 PM Revision 5035c822 (ceph): uclient: only update inode if version increased
This realigns the code with the kernel version, fixing a number of
problems when you have multiple MDSs returning inf...
Sage Weil
03:21 PM Linux kernel client Bug #571 (Closed): client hangs after osd disconnection
This happens on the rbd watch/notify sync branch. Probably related to lingering requests. Yehuda Sadeh
12:12 PM Bug #559 (Rejected): osd: dup requests can ack early
nevermind, this is already done and merged! Sage Weil
11:01 AM Linux kernel client Tasks #569 (Resolved): test dir frags
Make sure we behave with fragmented dirs, esp readdir. (probably need to mirror the recent cfuse fixes.) Sage Weil
09:43 AM Bug #521 (Resolved): objecter: crash in osdmap assert
commit:586c9e7a80b425802ca77d8c09bb00da5c25d616 Sage Weil
09:15 AM Feature #568 (Resolved): debian: build with --as-needed?
Can we do this to limit dependencies? See #544.
And the current warnings like...
Sage Weil
08:18 AM CephFS Feature #548 (Resolved): mds: shadowreplay one-shot mode
commit:ca3693d8ffcdffc3ae95eaba506a72889829bcb5 makes minimal changes to the MDS and MDSMonitor code to enable the ne... Greg Farnum
08:03 AM Revision 255e34af (ceph): decompile_crush_bucket: fix depth-first decomp
We need to ensure that buckets are output after their dependencies. The
best way to do this is a depth-first traversa...
Colin Patrick McCabe
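The ordering requirement in the commit above (a bucket must be output after the buckets it depends on) is what a post-order depth-first traversal produces. Below is a minimal Python sketch under the assumption that the hierarchy is acyclic; the function name and the bucket names are made up for illustration and do not come from crushtool.

```python
def dependency_order(children):
    """Return bucket names so that each bucket follows every bucket it references.

    children maps a bucket name to the list of bucket names it contains.
    Assumes the hierarchy has no cycles.
    """
    ordered, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for child in children.get(name, []):
            visit(child)
        # Post-order: a bucket is emitted only after all of its dependencies.
        ordered.append(name)

    for name in children:
        visit(name)
    return ordered


buckets = {"root": ["rack0", "rack1"], "rack0": ["host0"], "rack1": [], "host0": []}
order = dependency_order(buckets)
print(order)
```

Emitting in this order means a consumer reading the output top to bottom never meets a reference to a bucket it has not yet seen.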
07:58 AM Revision d1f15daf (ceph): CrushWrapper:get_bucket: ret ENOENT for no bucket
All the callers of CrushWrapper::get_bucket() check for error codes, but
not for NULL returns. So if there is no buck...
Colin Patrick McCabe
07:24 AM Bug #531: Journaling Causes System Hang
What would be helpful in diagnosing this problem is:
- turn up osd logging, in [osd] section:
debug osd = 20
...
Sage Weil

11/09/2010

11:56 PM Revision 11cfcfe8 (ceph): Merge branch 'sched_scrub' into unstable
Conflicts:
src/osd/PG.cc
src/osd/PG.h
Sage Weil
11:50 PM Revision e8ad6d26 (ceph): osd: small cleanup
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
11:46 PM Revision 28b44293 (ceph): osd: scrub: list objects without lock held
We'll go back to get anything we missed later.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
11:46 PM Revision c2d6d05f (ceph): Merge branch 'scrub_no_lock' into unstable
Sage Weil
11:34 PM Revision 966369aa (ceph): ps-ceph.pl: don't show self
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:04 PM Revision 6bc31511 (ceph): gui: add missing #include
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:50 PM Revision 58394828 (ceph): Merge branch 'rbd-fiemap' into unstable
Sage Weil
10:49 PM Revision e991702e (ceph): objecter: set READ flag on new objecter mapext/read_sparse ops
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:48 PM Revision adac5163 (ceph): objecter: fix balancer for ops with length < 0
Notably, mapext.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
10:36 PM Revision 20060548 (ceph): filestore: autodetect presence of FIEMAP ioctl
If it's not there, assume the whole object is allocated.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
10:35 PM Revision e5488718 (ceph): fiemap: include linux fiemap.h header; unconditionally compile helper
If the system doesn't have the header, use our copy.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
10:33 PM Revision 9f14dd25 (ceph): ps-ceph.pl: display Ceph tests
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
10:23 PM Revision 53b076d5 (ceph): Merge remote branch 'origin/rbd-fiemap' into unstable
Sage Weil
10:06 PM Revision 2325a1a2 (ceph): Fix example config file
We need to specify a journal size for the file-based journal we set up
in the example config file.
Signed-off-by: Co...
Colin Patrick McCabe
09:59 PM Revision 2947d19d (ceph): TimerThread:don't call pop_front before iter deref
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:30 PM Revision 1c7d8f1a (ceph): Makefile: use openssl module check
This allows ceph to build with --as-needed.
Signed-off-by: Kacper Kowalik <xarthisius@gentoo.org>
Kacper Kowalik
09:17 PM Revision 954ad982 (ceph): osd: shut down if we do not exist
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:08 PM Revision ea56dfdc (ceph): osd: handle osds that no longer exist in prior_set_affected
Consider no-longer-existent OSDs lost.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
08:05 PM Revision 29428b9b (ceph): Objecter: initialize timer in Objecter::init
Just in case future users of Objecter want to create one before calling
Messenger::start as a daemon.
Signed-off-by:...
Colin Patrick McCabe
06:15 PM Revision ec4200b0 (ceph): Add test_crushtool.sh
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:06 PM Revision 019bb70e (ceph): mds: turn on mds_bal_frag (dir fragmentation) by default
Let the fun begin!
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
06:04 PM Revision ae13fc86 (ceph): osd: handle osds that no longer exist in build_prior
Fix build_prior to handle OSDs that no longer exist in the current map.
Consider them lost.
Signed-off-by: Sage Weil...
Sage Weil
06:04 PM Revision e15c9569 (ceph): mds: fix inode freeze auth pin allowance
When we're renaming across nodes, we need to freeze the inode. This
requires that we allow for the auth_pins that _w...
Sage Weil
06:03 PM Revision 3107944e (ceph): osdmap: cleanup: add parens
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:59 PM Revision f28b99b3 (ceph): CrushWrapper::get_bucket_item: bounds check
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
05:59 PM Revision 9b487256 (ceph): crushtool: don't create a dump we can't recompile
In crushtool, dump buckets in tree order. Buckets which reference other
buckets must be dumped after their dependencie...
Colin Patrick McCabe
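The ordering constraint in the crushtool entry above (a bucket that references other buckets must be dumped after them) can be illustrated with a post-order walk. This is a minimal Python sketch under stated assumptions, not crushtool's actual code: the `dump_order` helper, the dict layout, and the bucket names are all hypothetical.

```python
def dump_order(children):
    """Return bucket names so that each bucket follows the buckets it references.

    `children` maps a bucket name to the names of the buckets it contains.
    This layout is an assumption for illustration, not the CRUSH map format.
    """
    order = []
    done = set()
    in_progress = set()

    def visit(name):
        if name in done:
            return
        if name in in_progress:
            raise ValueError("cycle at bucket %r" % name)
        in_progress.add(name)
        # Emit everything this bucket references before the bucket itself.
        for child in children.get(name, []):
            visit(child)
        in_progress.discard(name)
        done.add(name)
        order.append(name)

    for name in sorted(children):
        visit(name)
    return order


# Hypothetical hierarchy: root -> racks -> hosts.
buckets = {
    "root": ["rack0", "rack1"],
    "rack0": ["host0"],
    "rack1": ["host1"],
    "host0": [],
    "host1": [],
}
order = dump_order(buckets)
```

A text dump written in this order never contains a forward reference, so a compiler that requires buckets to be defined before use can read it back.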
05:55 PM Revision e1588dc4 (ceph): mds: wipe out client sessions on startup
For disaster recovery and such.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:55 PM Revision 05a47387 (ceph): mon: implement 'mds newfs <metapool> <datapool>' command
Create a new fs (by creating a new MDSMap) using the given pools.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:55 PM Revision d80948ad (ceph): mds: use mdsmap data pool for root inode default layout
The MDSMap may specify any random pool as the data pool; use that.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:55 PM Revision 8a21c6f6 (ceph): mds: add mds_skip_ino and mds_wipe_ino_prealloc options
These are last-ditch recovery tools. Not particularly effective ones,
though.
Signed-off-by: Sage Weil <sage@newdre...
Sage Weil
05:04 PM Linux kernel client Bug #549: bonnie++ file stat failure
bonnie tests are running under ceph 5, 6, 8, and 9, logging to /data/qa/ on each machine. Terri Haber
04:28 PM Bug #535: cephtool hangs forever until a UNIX signal is received
cephtool-hang-at-966369aad07461f2610b4dd2a9cdc770155c5a89.txt Colin McCabe
03:08 PM Bug #535: cephtool hangs forever until a UNIX signal is received
messenger-bug.txt Colin McCabe
04:26 PM Bug #521: objecter: crash in osdmap assert
Can you try with something like... Sage Weil
09:45 AM Bug #521: objecter: crash in osdmap assert
latest from ML:... Sage Weil
03:59 PM Feature #567 (Resolved): osd: background scrub frequency, scheduling
We should have some min interval such that the osds won't scrub the same osd more frequently than that.
Also, the ...
Sage Weil
03:56 PM Feature #425 (Resolved): trigger osd scrub automatically
Sage Weil
03:54 PM Subtask #485 (Resolved): osd: cooperative scrub scheduling
merged by commit:11cfcfe87503e50c892178d9c5c5b55da3aac740 Sage Weil
03:45 PM Subtask #486 (Resolved): osd: make scrub not block writes
merged commit:28b44293e34c5e97f350b4c68becdf9e7767ed6f Sage Weil
02:52 PM Bug #248 (Resolved): rbdtool import should use fiemap
Sage Weil
02:52 PM Bug #248: rbdtool import should use fiemap
Merged by commit:58394828a01950d7b26430d61d32df91df5a5fb1, bringing it in line with the objecter changes over the las... Sage Weil
02:13 PM RADOS Bug #558 (Resolved): crushtool cannot always re-encode a crushmap that it's created
Fixed by commit:9b48725614a880cf1f4bcad0bba2ceefdc76c167
C.
Colin McCabe
02:11 PM Bug #533 (Resolved): radostool hang on shutdown
Should be fixed by timer-fixes.
C.
Colin McCabe
02:10 PM Bug #565 (Resolved): Example config file is broken
Fixed by 2325a1a27b434cea7d7af832efff7a9257724fe6
C.
Colin McCabe
01:30 PM Bug #544 (Resolved): ceph-0.22.2: fails to build with --as-needed
Sage Weil
01:16 PM Bug #566 (Resolved): osd: build_prior needs to be wary of nonexistent osds
fixed by commit:954ad98230085c9c2a174fe15af24df237498977 commit:ea56dfdc663f8b0e19346bb63ffe3fec0c7759c4 commit:ae13f... Sage Weil
12:59 PM CephFS Bug #556 (Resolved): clustered mds: rename
this wasn't too bad.. the locking auth_pin scheme changed a while ago and the auth_pin allowance didn't get adjusted ... Sage Weil
12:42 PM Linux kernel client Bug #546 (Resolved): direct i/o does not work when offset is not page-aligned
See commit:c5c6b19d4b8f5431fca05f28ae9e141045022149. Passes my tests. Sage Weil
06:03 AM Revision aad3f7f2 (ceph): ceph.spec.in: don't strip rados classes
Signed-off-by: Christian Brunner <christian@brunner-muc.de> Christian Brunner

11/08/2010

10:49 PM Bug #535: cephtool hangs forever until a UNIX signal is received
> Look, I know it's a pain, but work on this isn't going to progress unless
> we collect AT LEAST:
> 1) The state ...
Colin McCabe
01:05 PM Bug #535 (Can't reproduce): cephtool hangs forever until a UNIX signal is received
Look, I know it's a pain, but work on this isn't going to progress unless we collect AT LEAST:
1) The state of each ...
Greg Farnum
10:35 AM Bug #535: cephtool hangs forever until a UNIX signal is received
The process that is hung is 17181, cephtool. Colin McCabe
10:35 AM Bug #535 (In Progress): cephtool hangs forever until a UNIX signal is received
Reproduced again on the unfound branch, which is very close to what is in unstable now.
cmccabe@flab:~/src/ceph/...
Colin McCabe
09:22 PM Revision 64f95ad9 (ceph): mds: add missing Dumper.[h,cc]
Sage Weil
09:18 PM Revision be9328ac (ceph): mds: tolerate/fix negative dir size counts
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:44 PM Revision d5515a8f (ceph): mds: add missing Dumper.[h,cc]
Sage Weil
08:40 PM Bug #566 (Resolved): osd: build_prior needs to be wary of nonexistent osds
... Sage Weil
08:09 PM Bug #565 (Resolved): Example config file is broken
The example config file (src/sample.ceph.conf) specifies the OSD journal as a file, but doesn't specify the size, whi... Ravi Pinjala
05:45 PM Revision 1ab7c7ff (ceph): Replace ps-ceph.sh shell script with perl script
A much faster version of ps-ceph.sh.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Andrew F
04:17 PM Linux kernel client Bug #384 (Closed): crash in splice_dentry
Sage Weil
03:07 PM Feature #80 (Resolved): uclient: readdir from cache
He already did it, yay! Greg Farnum
02:41 PM Feature #96 (Resolved): msgr: close idle connections?
Yay, this got done with the recent SimpleMessenger changes! Greg Farnum
02:15 PM Feature #276 (Resolved): Possibility to dump/list xattrs from RADOS object
Yehuda says he did this! Greg Farnum
01:24 PM Bug #531: Journaling Causes System Hang
We've looked at this a bit more but decided today that Sage is taking it over since he's a lot more familiar with the... Greg Farnum
12:49 PM Linux kernel client Bug #564 (Resolved): Configuration via configfs instead of sysfs
Will allow creation of different devices and setting them up. Should be device oriented, and will create a sub direct... Yehuda Sadeh
11:05 AM Bug #563 (Closed): osd: btrfs, warning at inode.c ( btrfs_orphan_commit_root )
I'm running the unstable branch and I'm seeing in my dmesg:... Wido den Hollander
09:32 AM CephFS Bug #561: snaptest-2 doesn't execute properly
Okay, looks like this may be an issue with the test rather than Ceph. I just copied it into the root of the ceph moun... Greg Farnum
09:07 AM CephFS Bug #561 (Resolved): snaptest-2 doesn't execute properly
Checked it on cfuse and kclient:... Greg Farnum
09:27 AM RADOS Bug #558: crushtool cannot always re-encode a crushmap that it's created
Either the compiler part just needs to be updated to allow forward bucket references, or the dumper needs to dump by ... Sage Weil
09:26 AM Feature #562 (Closed): separate gui into separate binary, package
This will mean refactoring common ceph.cc bits into a separate file and .a. Sage Weil
09:22 AM Linux kernel client Bug #434: mds: clustered mds pjd failures
a few more fixes here on inode updates version check and mtime. Sage Weil
07:23 AM Linux kernel client Bug #434 (Resolved): mds: clustered mds pjd failures
this was a kclient problem caused by bad uid/gid in resent requests. fixed by commit:cb4276cca4695670916a82e359f2e377... Sage Weil
09:20 AM Tasks #406 (Closed): push v0.20.2 to upstream debian, ubuntu maintainers
Sage Weil
09:20 AM CephFS Cleanup #427 (Rejected): mds: tie scatter pins directly to freeze machinery
no more scatterpins, yay! Sage Weil
09:19 AM Linux kernel client Bug #554 (Resolved): clustered mds: max_size not updated
Sage Weil
07:39 AM CephFS Feature #560 (Resolved): mds: alternate directory hashing
Currently dentries are hashed among dirfrags using the linux dcache's hash function, which is pretty trivial. The pr... Sage Weil
07:30 AM Bug #559: osd: dup requests can ack early
The dup request check looks at the reqid in the log, and replies early. That request could still be in flight to dis... Sage Weil
07:28 AM Bug #559 (Rejected): osd: dup requests can ack early
Sage Weil

11/07/2010

06:02 PM RADOS Bug #558 (Resolved): crushtool cannot always re-encode a crushmap that it's created
When a CRUSH text map is encoded, the buckets are read in such a way that they must be defined before they are refere... Ravi Pinjala
05:56 PM Revision 0feec2f4 (ceph): Merge remote branch 'origin/object_locator' into unstable
Conflicts:
src/osd/OSD.cc
src/osd/ReplicatedPG.cc
src/osd/ReplicatedPG.h
src/osd/osd_types.h
Sage Weil
05:45 PM Revision b7f578cf (ceph): Merge remote branch 'origin/timer-fixes' into unstable
Sage Weil
05:44 PM Revision deb9ef76 (ceph): v0.24~rc
Sage Weil
05:42 PM Revision 0b190920 (ceph): Merge remote branch 'origin/testing' into unstable
Sage Weil
03:49 PM Revision a4674af5 (ceph): mds: eval: put scatter in MIX if replicated, otherwise LOCK
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
03:45 PM Revision 33c6e230 (ceph): mds: do not scatter_writebehind in MIX state
Replicas might come in while we're flushing and get a MIX state with
the old state.
Signed-off-by: Sage Weil <sage@n...
Sage Weil
11:29 AM Feature #231: Slow OSDs shouldn't destroy cluster performance
Today I experienced a btrfs bug where *[btrfs-transacti]* went into status D, causing my OSD to hang (also go into st... Wido den Hollander
10:18 AM Linux kernel client Bug #554: clustered mds: max_size not updated
fixed by commit:912a9b0319a8eb9e0834b19a25e01013ab2d6a9f. also commit:feb4cc9bb433bf1491ac5ffbba133f3258dacf06 for g... Sage Weil
10:15 AM Feature #524 (In Progress): object_locator_t
Work so far merged by commit:0feec2f4f31aa3a259b2cdf885d6458995ce860b
Still need to update the on-wire protocol to...
Sage Weil
10:08 AM CephFS Feature #495 (Resolved): mds: add MIX_STALE
merged in commit:0b1909209800229f5098cdc848fc3901508c1e19. best part of this is MIX_STALE went away. yay! Sage Weil
10:05 AM Bug #248 (In Progress): rbdtool import should use fiemap
whoops, this never got merged. Sage Weil
08:58 AM Linux kernel client Bug #557 (Can't reproduce): BUG_ON(!session->s_num_cap_releases);
... Sage Weil
08:11 AM CephFS Bug #556 (Resolved): clustered mds: rename
various hangs with thrash-exports and pjd rename tests. Sage Weil
04:05 AM Revision 1bf8e732 (ceph): Merge branch 'unstable' into mix_stale
Sage Weil
04:01 AM Revision 1eb94da2 (ceph): mds: introduce/use helpers to resync stale fragstat/rstat; update version
Simplifies code.
Also, update the version when we resync!
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
04:01 AM Revision c1ee560e (ceph): mds: don't fuss with versions when taking frag/rstat from frag; it's ne...
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:01 AM Revision bdc2fa5b (ceph): mds: remove MIX_STALE
Yay, we don't need it!
If we can't update the frag on scatter, fine. The staleness of the frag
is implicit in the f...
Sage Weil
04:00 AM Revision c2034829 (ceph): mds: ignore done_locking on slave requests' acquire_locks()
Slave requests ask for each xlock one at a time. Don't bail out based on
the done_locking flag.
Signed-off-by: Sage...
Sage Weil
04:00 AM Revision 51b6a863 (ceph): mds: don't use helper for rename srcdn
The rdlock_path_xlock_dentry helper works for _auth_ dentries that we
create locally in an auth dirfrag. For the src...
Sage Weil
04:00 AM Revision eb0a60d0 (ceph): mds: never complete a gather on a flushing lock
The scatter_writebehind() takes a wrlock, but that may still allow the lock
to complete a gather to LOCK and even mov...
Sage Weil

11/06/2010

04:38 PM Revision bdf3bc5e (ceph): mds: update version when bring stale rstat back up to date
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
02:58 PM Revision a74054d1 (ceph): mds: simplify stale semantics a bit
is_stale() => next MIX is MIX_STALE. Stale flag is then cleared. Then we
special case the import to preserve stale-n...
Sage Weil
01:30 PM Bug #555 (Closed): debian/ubuntu: ceph-client-tools needs to depend on libgtkmm-2.4-1c2a
Invalid report, it was due to an upgrade. When doing a fresh install of the packages they do depend on libgtkmm.
Cl...
Wido den Hollander
11:52 AM Bug #555 (Closed): debian/ubuntu: ceph-client-tools needs to depend on libgtkmm-2.4-1c2a
Right now, the building process depends on libgtkmm-2.4-dev, but when installing the packages and running 'ceph -g' y... Wido den Hollander
11:55 AM Linux kernel client Bug #434: mds: clustered mds pjd failures
Just saw this again:... Sage Weil
11:18 AM Bug #553: Kernelmodule doesn't build under Debian lenny
Ok, a backport-kernel works fine AFAIS. I updated the wiki-page. DaB Punkt
10:10 AM Bug #553 (Won't Fix): Kernelmodule doesn't build under Debian lenny
Unfortunately you're going to need to upgrade your kernel if you want the in-kernel client. Using the backports branc... Greg Farnum
09:52 AM Bug #553 (Won't Fix): Kernelmodule doesn't build under Debian lenny
Hello all,
the wiki-page [1] says that ceph runs under Debian lenny, but as far as I see that is not true because th...
DaB Punkt
11:16 AM Linux kernel client Bug #554 (Resolved): clustered mds: max_size not updated
3 mds, export thrashing, dbench 1 hang waiting on max_size. Sage Weil
04:52 AM Revision e27f111f (ceph): mds: preserve stale state on import; some cleanup
Our new invariant is that MIX_STALE always implies is_stale(). And on
import, if is_stale(), MIX becomes MIX_STALE. ...
Sage Weil
12:08 AM Revision a582345c (ceph): Merge branch 'mix_stale' into unstable
Sage Weil
12:06 AM Revision 4126d1ce (ceph): mds: add more verify_scatter asserts
For catching fragstat errors sooner.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil

11/05/2010

10:24 PM Revision ae670c33 (ceph): mds: fix version check on resyncing stale rstat in predirty_journal_par...
We're resyncing rstat, so check the rstat version (not fragstat!)
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
07:45 PM Revision 4cee6ead (ceph): mds: Fix bad inode deref.
Accidentally trying to print out the CInode after removing it in trim_non_auth!
Move the print to before it's been un...
Greg Farnum
07:20 PM Revision 93344fb2 (ceph): Revisit std::multimap decoder
Previously I changed the std::multimap decoder to minimize the number of
constructor invocations. However, it could b...
Colin Patrick McCabe
06:34 PM Revision f015c989 (ceph): autogen.sh: check for pkg-config
To avoid seeing confusing errors later in the configure process, in
autogen.sh, check to make sure the pkg-config pro...
Colin Patrick McCabe
05:57 PM Revision fd397aba (ceph): PG.cc: build_scrub_map now drops the PG lock while scanning the PG
build_inc_scrub_map scans all files modified since the given
version number and creates an incremental scr...
Samuel Just
05:38 PM Revision 989fa67d (ceph): mds: preserve version when recovering rstat from dirfrag in predirty_jo...
We don't want to screw up the version here. This aligns the code with
other instances of this check.
Signed-off-by:...
Sage Weil
02:50 PM Linux kernel client Bug #552 (Resolved): Samba with kernel oplocks=on produces lots of corrupt mds entries in dmesg
With kernel oplocks = yes, samba fills up dmesg with those
[ 4472.504211] ceph: problem parsing dir contents -5
[...
Paul Komkoff
01:56 PM Linux kernel client Bug #434: mds: clustered mds pjd failures
Sage has taken over the clustered MDS stuff for now, so here's the bug! Greg Farnum
01:55 PM CephFS Feature #495: mds: add MIX_STALE
Sage Weil
01:36 PM Bug #521: objecter: crash in osdmap assert
Sage Weil
01:02 PM CephFS Bug #551 (Can't reproduce): cfuse crash on quick mds restart
Program terminated with signal 11, Segmentation fault.
#0 0x00000000004704ad in Client::kick_flushing_caps (this=0x...
Greg Farnum
12:29 PM Bug #550: mon: PGMonitor::update_from_paxos()
While I thought it wasn't related to the MDS issue I'm seeing, it might seem it is:... Wido den Hollander
12:11 PM Bug #550 (Can't reproduce): mon: PGMonitor::update_from_paxos()
One of my monitors crashed, got this backtrace:... Wido den Hollander
10:59 AM Linux kernel client Bug #549: bonnie++ file stat failure
Terri, can you have the QA machines loop through _just_ the bonnie++ command he's having problems with? Something li... Sage Weil
10:57 AM Linux kernel client Bug #549 (Resolved): bonnie++ file stat failure
From ML:... Sage Weil
10:49 AM Bug #531: Journaling Causes System Hang
Hello,
1) Correct, we are running transparent 10GbE
2) From what I can tell monitoring dstat across the cluster ...
Bryan Tong
10:14 AM CephFS Feature #91: mds: up:shadow mode
Update the journaler interface to allow the MDS to 'tail' the journal... periodically check to see if it's been exten... Sage Weil
10:10 AM CephFS Feature #548 (Resolved): mds: shadowreplay one-shot mode
Make sure the current mechanism still works. Clean it up if needed. Sage Weil
09:19 AM CephFS Subtask #547 (Resolved): mds: define fsck strategy, required metadata
Sage Weil
09:19 AM CephFS Feature #340 (Closed): large directories, directory fragmenting
Sage Weil
09:19 AM CephFS Feature #519 (Closed): mds: dirfrag merge
Sage Weil
06:20 AM Revision 9586e905 (ceph): mds: restructure finish_scatter_gather_update()
Separate behavior into two dimensions: whether or not we are updating
the dirfrag, and whether or not the dirfrag is ...
Sage Weil
06:15 AM Revision 669a8afa (ceph): mds: do not bump scatter stat lock in predirty_journal_parents
If we're in the MIX state, we clearly can't touch this without screwing up
the delicate scatter/gather behavior. If ...
Sage Weil
05:48 AM Revision 663b470f (ceph): mds: mark scatterlock stale on import of stale frag scatter stat
When the lock scattered, if we didn't have an auth frag that was frozen,
we go into MIX state. Later, we may import ...
Sage Weil
05:44 AM Revision 63c1ad84 (ceph): mds: match bottom half of assimilate_dirty_rstat_inodes with a dir flag
We only do the assimilate_dirty_rstat_inodes if we do an update AND the
frag rstat was non-stale, but the bottom half...
Sage Weil
05:19 AM Revision 9b6d96e9 (ceph): mds: fix inode version used for inest in decode_lock_state
We need to pass the inode rstat's version into finish_scatter_update, not
the shadowed local variable. Otherwise we ...
Sage Weil

11/04/2010

11:22 PM Revision 62716aa7 (ceph): PGMonitor::update_from_paxos: check for bad input
Be more robust against bad data coming in from the network.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe
11:04 PM Linux kernel client Bug #546: direct i/o does not work when offset is not page-aligned
Attached is the testing file. Henry Chang
10:55 PM Linux kernel client Bug #546 (Resolved): direct i/o does not work when offset is not page-aligned
When opening a file with O_DIRECT, seeking to offset 6656 and reading 512 bytes gets wrong data.
Below is a strace log...
Henry Chang
09:33 PM Revision 8f3672dc (ceph): Replace sprintf with snprintf
Replace sprintf with snprintf. This is especially critical when the
format string includes "%s".
Signed-off-by: Coli...
Colin Patrick McCabe
09:26 PM Revision 56179d12 (ceph): start_profiler/enable_profiler_options:fix memleak
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:11 PM Revision e6a751bd (ceph): Set HEAP_PROFILE_INUSE_INTERVAL based on conf
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:09 PM Revision 8c8bfdb3 (ceph): CInode::make_path_string: don't coerce ino
CInode::make_path_string: don't coerce the inode number to 32-bits.
Everyone else is treating it as 64 bits; this fun...
Colin Patrick McCabe
08:17 PM Revision f23ba003 (ceph): mds: verify single frag rstat on projection too
Currently we do a sanity check on gather; do the same check in
project_rstat_frag_to_inode().
Signed-off-by: Sage We...
Sage Weil
08:17 PM Revision 53f6ed16 (ceph): mds: mds debug scatterstat to print out projected rstat/fragstat
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:58 PM Revision 4df92ade (ceph): Merge branch 'dumpjournal' into unstable
Greg Farnum
06:41 PM Revision d3c2b9cb (ceph): cmds: Include journal dumper functionality.
Greg Farnum
06:41 PM Revision e0a5de25 (ceph): dumper: Add new Dumper class.
This lets you dump an MDS journal to a file. Greg Farnum
06:33 PM Revision 28f956ae (ceph): mds: fix optional frag asserts
We want these to trigger when mds_verify_scatter is true. Only one !.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
06:28 PM Revision 86d6e51e (ceph): objecter: add new wait_for_osd_map function.
Greg Farnum
06:13 PM Revision 8a41d096 (ceph): osd: clean up active <-> booting state transitions
Among other things, get rid of the 'wrongly marked down' log message on
normal startup.
Signed-off-by: Sage Weil <sa...
Sage Weil
05:24 PM Revision f917df79 (ceph): TestEncoding: count number of ctor invocations
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:00 PM Feature #456 (Resolved): make dumpjournal functionality usable
Pushed to branch dumpjournal, and merged into unstable in commit:4df92adea46730bdb7cb88290203cad2369ed895.
Tested ...
Greg Farnum
10:57 AM Bug #544: ceph-0.22.2: fails to build with --as-needed
Sage Weil wrote:
> Thanks! Can I add your Signed-off-by to this (as per SubmittingPatches)?
Sure.
Anonymous
10:04 AM Bug #544: ceph-0.22.2: fails to build with --as-needed
Thanks! Can I add your Signed-off-by to this (as per SubmittingPatches)? Sage Weil
12:57 AM Bug #544 (Resolved): ceph-0.22.2: fails to build with --as-needed
Due to the wrong linking order[1] of ceph's libraries, the whole package fails to build with LDFLAGS="-Wl,--as-needed".
Supp...
Anonymous
10:25 AM Bug #531: Journaling Causes System Hang
Bryan, can you give us a few more details about your cluster and the performance drop you're seeing here? Specific qu... Greg Farnum
10:09 AM Bug #538: Write performance does not scale over multiple computers
Ed Burnette wrote:
> I'll try that if I can get the servers to stay up long enough. ceph -w is swamped with chatter abou...
Sage Weil
09:56 AM CephFS Feature #545 (Resolved): mds: use bloom filter to supplement dirfrag COMPLETE flag
Currently we need the complete flag (or a cached negative dentry) to conclude a name does not exist in a frag before ... Sage Weil
05:28 AM Revision 1c934ebd (ceph): mds: wait for last_failure_osd_epoch before starting journal replay
This is extremely important, and it forces the MDS to get the osdmap that
includes the blacklist entry for its predec...
Sage Weil
05:28 AM Revision e90a3b62 (ceph): mds: dump corrupt events; optionally skip them
If we encounter a bad event in the journal, dump it to the log.
Optionally skip it, if 'mds log skip corrupt events ...
Sage Weil
05:28 AM Revision f5112866 (ceph): mon: blacklist and update last_failure_osd_epoch in all failure paths
This includes the pure failure in do_stop(), and the explicit admin
fail command.
Signed-off-by: Sage Weil <sage@new...
Sage Weil
05:28 AM Revision 6345fcda (ceph): mon: update mdsmap.last_failure_osd_epoch when blacklisting
We need to note the osdmap epoch the taking-over mds needs in the mdsmap.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:28 AM Revision 0fb22974 (ceph): mds: add last_failure_osd_epoch to extended section of mdsmap
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:00 AM Revision c4e56e9a (ceph): MonClient: start SafeTimer in MonClient::init()
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:55 AM Revision 8f33a415 (ceph): cosd: start SafeTimer in OSD::init()
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:40 AM Revision 1cf5bc74 (ceph): cephtool: fix timer init/destruction
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:40 AM Revision 2c7d293d (ceph): vstart.sh: turn on MDS debugging
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:40 AM Revision e4853fa8 (ceph): SafeTimer: delete contexts under the event_lock
SafeTimer: delete contexts under the event_lock.
Also add more debug printouts and create two convenience functions.
...
Colin Patrick McCabe
04:40 AM Revision b0e73746 (ceph): TestTimers: add test for out-of-order timer insert
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:40 AM Revision 0b9f2e23 (ceph): Timer: add verbose debugging when debug timer = 20
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:40 AM Revision 124d287a (ceph): Monitor: start timer thread in init(), not ctor
Don't start the SafeTimer when class Monitor is created. We want to hold off on
starting the thread until SimpleMesse...
Colin Patrick McCabe
04:40 AM Revision e6b8dbae (ceph): Timer: fix timer shutdown, efficiency issues
Rework Timer and SafeTimer to be more efficient and to handle shutdown
correctly. Document the API, especially what l...
Colin Patrick McCabe
04:40 AM Revision cd316651 (ceph): TestTimers: call common_init and parse argv
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:40 AM Revision d840e4f0 (ceph): TestTimers: test cancelling single events
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:40 AM Revision 8279f14b (ceph): Timer.cc: clean up debug printouts
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:40 AM Revision 571e3750 (ceph): SafeTimer: clean up copy constructor declaration
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:40 AM Revision d3ead43a (ceph): Logger.cc: avoid creating SafeTimer in global-ctor
Don't create a SafeTimer at global constructor time. Timers
contain a Thread, and the library stuff may not have been...
Colin Patrick McCabe

11/03/2010

11:41 PM Revision 0d1bfe06 (ceph): client: print useful max_size waiting message
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
11:40 PM Revision fc9059e5 (ceph): Merge branch 'mix_stale' into unstable
Sage Weil
11:40 PM Revision 4f24fcbc (ceph): debian: add gtk build-depends
For ceph -g.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
11:30 PM Bug #479: ceph/mount crash badly when writing
Hi DongJin,
Any luck on this issue? Has the problem gone away, or do you have time to help us track it down?
T...
Sage Weil
11:27 PM CephFS Bug #478 (Can't reproduce): MDS crash: LogEvent::decode()
Sage Weil
11:27 PM CephFS Bug #478: MDS crash: LogEvent::decode()
From the mds dump in the debugpacks, it looks like there were MDS daemons on two different nodes. I'm inclined to ch... Sage Weil
10:49 PM CephFS Bug #542 (Resolved): mds journal corruption
Sage Weil
10:49 PM CephFS Bug #542: mds journal corruption
commit:1c934ebd6ff3a3a7000671821a12e83c609f1e27 Sage Weil
10:24 PM CephFS Bug #542: mds journal corruption
Mystery solved.. this was actually a takeover:
- where the old mds was blacklisted
- new mds probed and read jour...
Sage Weil
09:38 PM CephFS Bug #542 (Resolved): mds journal corruption
I saw this on the playground.
The last bit of the replay log:...
Sage Weil
10:49 PM Bug #535: cephtool hangs forever until a UNIX signal is received
I should have written this at the top of the bug report, but this was on the unstable branch.
Anyway, I'll add mor...
Colin McCabe
02:11 PM Bug #535 (Rejected): cephtool hangs forever until a UNIX signal is received
This occurrence is a problem on the monitor side that reproduces in the timer-fixes branch, but not unstable. Greg Farnum
10:45 PM Feature #543 (Resolved): PG::search_for_missing: don't iterate over all missing
PG::search_for_missing processes a replica's missing map to determine if it has any objects that we need.
If the m...
Colin McCabe
09:47 PM Revision fd57f4de (ceph): mds: fix put_xlock() assert for slave masters
If we are a master of a slave, the state will be LOCK.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:47 PM Revision d0c29d7d (ceph): mds: add 'mds verify scatter' and re-add some scatter asserts
Check on ifile and inest gather that stats match single-frag dirs.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:47 PM Revision 563a9ba6 (ceph): mds: finish_scatter_update on auth dirfrags too
We can update the dirfrag accounted on auth dirfrags at scatter time too.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:47 PM Revision 8b9342c7 (ceph): mds: disable tempsync
Tempsync is not implemented in the filelock state machine. Never use it,
at least for now!
Signed-off-by: Sage Weil...
Sage Weil
09:47 PM Revision 4d669c8c (ceph): mds: request unscatter when MIX_STALE on replica
This means implementing REQUNSCATTER.
Eventually this should use TEMPSYNC, but that isn't fully implemented yet.
Si...
Sage Weil
09:47 PM Revision a98812f9 (ceph): mds: rename 'mix stale' => 'mix_stale'
For unambiguous debug output
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:12 PM CephFS Bug #472 (Resolved): mds: fragstat crash
Sage Weil
08:08 PM Revision 0e079bc8 (ceph): mds: use helper for scatter dirfrag update; use on local dirfrags
Any time we scatter is an opportunity to update the dirfrag with the
accounted scatter stat if it is out of date. We...
Sage Weil
07:52 PM Revision 77ec378d (ceph): Add the ps-ceph.sh tool
This allows you to see at a glance which ceph programs and tools you
have running.
Signed-off-by: Colin McCabe <coli...
Colin Patrick McCabe
07:19 PM Revision 4e586dd0 (ceph): encoding.h: fix compiler warning
Fix a compiler warning about an uninitialized variable. Basically, we
used to insert uninitialized values into a std:...
Colin Patrick McCabe
07:19 PM Revision c98b0268 (ceph): TestEncoding: add templated encode-then-decode fn
TestEncoding: add a templated encode-then-decode fn that can be used to
test encoding followed by decoding of any typ...
Colin Patrick McCabe
07:18 PM Revision 84e2da8d (ceph): Create TestEncoding to test serialization code
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:07 PM Revision 60c59aed (ceph): mds: add some scatterlock notes
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:03 PM Revision 0dc75a94 (ceph): ceph: remove bad assert for old frag stat
It's normal for old fragstat info to be mismatched (stat !=
accounted_stat).
Signed-off-by: Sage Weil <sage@newdream...
Sage Weil
05:51 PM Revision 34135185 (ceph): mds: match conditions in finish_scatter_gather_update_accounted
This needs to match the frozen check in finish_scatter_gather_update.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:12 PM Revision 33268e20 (ceph): mds: handle MIX_STALE on auth too
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:51 PM Revision 14f4d22c (ceph): mds: scatter_info_t ancestor for nest_info_t and frag_info_t
This will facilitate using generic code for the inest and ifile
scatterlocks.
Signed-off-by: Sage Weil <sage@newdrea...
Sage Weil
04:47 PM Revision cbacc1d4 (ceph): mds: only mark auth dirfrags stale in start_scatter
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:40 PM CephFS Feature #495 (Resolved): mds: add MIX_STALE
commit:fc9059e5270380c3266f7f958da6a8cc9b042f22 Sage Weil
04:05 PM CephFS Feature #495: mds: add MIX_STALE
Sage has been working on this today. Greg Farnum
03:42 PM CephFS Feature #541 (Resolved): mds: tempsync
Integrate this into the filelock state machine, and then use it when appropriate (namely, unscatter) Sage Weil
08:25 AM Bug #540 (Resolved): CephxClientHandler::handle_response
Saw this crash today after upgrading to the latest unstable:... Wido den Hollander
07:19 AM Bug #538: Write performance does not scale over multiple computers
Greg Farnum wrote:
> Just to be clear, do you have all 208 nodes running server daemons and the client? What's your ...
Ed Burnette
04:51 AM Revision 44574e86 (ceph): mds: mark scatterlock stale if any auth dirfrags appear stale
The auth needs to move to MIX_STALE for the same reasons a replica does:
if, on scatter, any dirfrags have an old acc...
Sage Weil
04:49 AM Revision 4a0f7312 (ceph): mds: do not update accounted_*stat if auth and frozen
The auth can't update a frozen dirfrag for the same reason a replica
can't.
Signed-off-by: Sage Weil <sage@newdream....
Sage Weil
12:52 AM Revision 839371cc (ceph): osd: Added load threshold for scrub scheduling
Josh Durgin

11/02/2010

11:34 PM Revision 3ae8c001 (ceph): osd: Make a per-pg sched_scrub, and remove non-active accounting from t...
Josh Durgin
11:28 PM Revision 9d1984e8 (ceph): mds: mark scatterlock stale if dir is frozen, not inode
It's the dir we're auth for and that might potentially be frozen.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
11:25 PM Revision 4838016d (ceph): Merge branch 'unstable' into mix_stale
Sage Weil
10:26 PM Revision e304a245 (ceph): rados: benchmark using unique object names
Include hostname and pid in object name, so that instances running on
different hosts write to unique objects.
Signe...
Sage Weil
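A minimal sketch of the idea in this commit, assuming a name built from hostname, pid, and a per-object index. The helper names and the exact name format are hypothetical and not taken from the rados tool; the entry above only says that hostname and pid are included.

```cpp
#include <sstream>
#include <string>
#include <unistd.h>

// Hypothetical helper: combine host, pid, and index into one object name.
// The "host_pid_objectN" layout is an assumption made for illustration.
std::string make_object_name(const std::string &host, long pid, int index) {
  std::ostringstream out;
  out << host << "_" << pid << "_object" << index;
  return out.str();
}

// Build a name for the current process, so that benchmark instances on
// different hosts (or several on one host) do not write the same objects.
std::string local_object_name(int index) {
  char host[256] = {0};
  if (gethostname(host, sizeof(host) - 1) != 0)
    host[0] = '\0';  // fall back to an empty hostname on error
  return make_object_name(host, static_cast<long>(getpid()), index);
}
```

With a fixed name such as "Object %d", every client writes the same objects; adding host and pid makes each instance's object set disjoint.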
09:37 PM Revision 38f96c65 (ceph): debian packaging: set --sbindir=/sbin
We want mkcephfs and mount.ceph to be under /sbin.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe
08:00 PM Revision 68f7fede (ceph): config: fix sigsegv handler
Fixed this with sigabrt, forgot to do sigsegv too.
See 7a688a9f999a6b9d3bcdcbebbd8cd984afc70e31.
Signed-off-by: Sag...
Sage Weil
06:10 PM Revision 235aa1c3 (ceph): filestore: disable 'filestore btrfs snap' when SNAP_DESTROY is missing
We want to enable the new snap stuff by default. But we also want to work
with the default configuration on old kern...
Sage Weil
05:43 PM Revision 4cfd198c (ceph): Makefile.am: include the libcrush headers when installing
Signed-off-by: Wido den Hollander <wido@widodh.nl> Wido den Hollander
05:10 PM Revision abb0b6d9 (ceph): Merge branch 'testing' into unstable
Greg Farnum
05:09 PM Revision 5310ab6e (ceph): uclient: Warn on truncate_[size|seq] changes for non-file inodes.
Greg Farnum
05:09 PM Revision 630db2a9 (ceph): mds: Init system CInodes to have a truncate_size of -1.
This should help with bug #518. Greg Farnum
05:09 PM Revision 524c8903 (ceph): client: match initialization with mds
(see Server::prepare_new_inode())
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:09 PM Revision 20e8a451 (ceph): client: only do truncate on regular files
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:45 PM Revision 905ff763 (ceph): debian: add pkg-config as build-depends
Signed-off-by: Wido den Hollander <wido@widodh.nl> Wido den Hollander
03:34 PM Bug #535 (In Progress): cephtool hangs forever until a UNIX signal is received
Okay, got in on a hang. The Pipe's been doing a disconnect/reconnect loop for about 4 minutes, it's currently in stat... Greg Farnum
03:24 PM Bug #538: Write performance does not scale over multiple computers
Oh, there is another issue: the rados bench command always writes to objects "Object %d". So all of your nodes are w... Sage Weil
12:32 PM Bug #538: Write performance does not scale over multiple computers
Just to be clear, do you have all 208 nodes running server daemons and the client? What's your configuration look lik... Greg Farnum
11:55 AM Bug #538: Write performance does not scale over multiple computers
I also tried setting the target pg count to 4,000 and got about the same numbers as 400, maybe a small amount faster.... Ed Burnette
11:24 AM Bug #538: Write performance does not scale over multiple computers
I set the target pg count to 400 and tried again. It helped some, up to 2x, but is still slower than I expected:
<...
Ed Burnette
10:19 AM Bug #538: Write performance does not scale over multiple computers
If the benchpool is a new pool you created, the problem is likely that it is too small. By default, new pools have o... Sage Weil
08:15 AM Bug #538 (Closed): Write performance does not scale over multiple computers
I have ceph 0.22.1 installed on a cluster of 208 lightly loaded 64-bit Linux nodes (RHEL5.5 ext3). The configuration i... Ed Burnette
02:48 PM Bug #537: debian/ubuntu: Build system broken after commit
oh yeah, and thanks for your patches, Wido. Good call with the libcrush headers.
C.
Colin McCabe
02:47 PM Bug #537 (Resolved): debian/ubuntu: Build system broken after commit
should be fixed by 38f96c658dee3e7e26a68a3c57eec2a5d8758e17
cheers,
C.
Colin McCabe
10:13 AM Bug #537: debian/ubuntu: Build system broken after commit
I applied #2, but for #1, we really do want those installed in /sbin (so say the debian/ubuntu guys). That's unfortu... Sage Weil
06:45 AM Bug #537 (Resolved): debian/ubuntu: Build system broken after commit
commit 1dd5042e655b80eae99f002047fe1dfb4cc46120 broke some things when building .deb packages, mainly because the loc... Wido den Hollander
10:46 AM Feature #389: Synchronize header modifications between clients
Still working on it. Major functionality that was implemented:
- new osd- watch/notify/notify-ack messages
- most...
Yehuda Sadeh
10:32 AM Linux kernel client Bug #69: ceph: ffff88001976ba50 auth cap (null) not mds0 ???
just saw this on ceph1, running commit:2f56f56ad991edd51ffd0baf1182245ee1277a04... Sage Weil
10:19 AM Tasks #539 (Resolved): wiki: document pg expansion
Sage Weil
10:18 AM CephFS Bug #529 (Resolved): Cfuse: Software caused connection abort
There was a sequence of commits in this, some of which were one step forward and two steps back. The testing branch ... Greg Farnum
05:51 AM CephFS Bug #529: Cfuse: Software caused connection abort
I was going to apply the patch to my version but I noted that my src/client/Client.h line 516 already says "truncate_... Ed Burnette
09:41 AM Bug #536 (Resolved): debian/ubuntu: Add pkg-config as a build dependency
applied commit:905ff7635297614633175f129f491a83c3b2f314, thanks! Sage Weil
02:37 AM Bug #536 (Resolved): debian/ubuntu: Add pkg-config as a build dependency
When trying to build today's unstable, I got the following message:... Wido den Hollander
05:44 AM Bug #532 (Closed): OSD: repop_queue.front() == repop
Indeed, my build system was still building the *rc* branch, oops! Wido den Hollander
05:09 AM Revision bc9bc4cb (ceph): init-ceph: make lockfile dir configuration (redhat)
Reported-by: Ed Burnette <ed.burnette@sas.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
02:31 AM Revision 85ba4f2d (ceph): object.h: const cleanup
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe

11/01/2010

10:36 PM CephFS Bug #451 (Can't reproduce): mds: replay error
Sage Weil
10:35 PM CephFS Bug #523: cfuse locks don't wake on mds reconnect?
This might be the same issue as #535 (which looks to me like it's waiting on tcp_read/poll?). Sage Weil
10:33 PM CephFS Bug #529: Cfuse: Software caused connection abort
Hey Greg, this looks like client truncation stuff again. This was biting me today, almost immediately. These two pa... Sage Weil
07:40 AM CephFS Bug #529 (Resolved): Cfuse: Software caused connection abort
After using ceph for a few minutes it gets into a state where I can no longer access the cfuse mount point. It also s... Ed Burnette
10:33 PM Revision ee3fc3bd (ceph): osd: Add scrub to the names of scrub scheduling-related things.
Josh Durgin
10:31 PM Revision 993ba1cd (ceph): osd: refactor OSD::sched_scrub
Take sched_scrub_lock sparingly, and push active/pending accounting to the work queue. Josh Durgin
10:31 PM Revision 8d200a7d (ceph): osd: Move pending/active scrub accounting into the scrub work queue.
Josh Durgin
10:30 PM Revision 378f84c1 (ceph): osd: Add the rest of infrastructure for scheduling scrubbing
Josh Durgin
10:29 PM Bug #531: Journaling Causes System Hang
Yeah,
I figured not running with journals wouldn't work right. As long as the block size of the writes is very lar...
Bryan Tong
10:22 PM Bug #531: Journaling Causes System Hang
It's expected that you'll get extremely slow performance without the journal.
I'll work on replicating this in o...
Sage Weil
12:08 PM Bug #531: Journaling Causes System Hang
I forgot a bit about the setup.
4 x OSD all with journals on separate drives. Each OSD is on a separate system.
B...
Bryan Tong
12:00 PM Bug #531 (Resolved): Journaling Causes System Hang
Hello,
It seems that when doing a large write once the journal fills up the system goes into a state of lock and h...
Bryan Tong
10:28 PM Bug #530 (Resolved): No way to override lock file path on RH
This look okay? commit:bc9bc4cb28376728e5428eff0ddb3ff301831e50 Sage Weil
07:57 AM Bug #530 (Resolved): No way to override lock file path on RH
init-ceph checks if /var/lock/subsys exists and if it does, tries to create a lock file there. In my case for various ... Ed Burnette
10:28 PM Revision 7b68a403 (ceph): osd: add variables to track scrub scheduling
Add OSD, PG, and config variables to track pending and active scrubs. Josh Durgin
05:12 PM Bug #535: cephtool hangs forever until a UNIX signal is received
Colin McCabe wrote:
> While running vstart.sh, I reproduced this bug with debug_ms = 20.
>
> Here's what the outp...
Colin McCabe
05:09 PM Bug #535: cephtool hangs forever until a UNIX signal is received
While running vstart.sh, I reproduced this bug with debug_ms = 20.
Here's what the output was. Since cephtool does...
Colin McCabe
04:42 PM Bug #535: cephtool hangs forever until a UNIX signal is received
> Perhaps this bug is caused by Nagle's algorithm?
>
As Sage pointed out, we're already running with TCP_NODELAY...
Colin McCabe
04:31 PM Bug #535 (Resolved): cephtool hangs forever until a UNIX signal is received
I just saw this twice in a row. cephtool hangs forever until a UNIX signal is received. That seems to break the logja... Colin McCabe
05:04 PM Revision 3d85a7b9 (ceph): logrotate: separate rule for stat/*.log
Logrotate seems to ignore the entire rule if any part of the file list
is not found. This happens on nodes with only...
Sage Weil
04:56 PM Bug #533: radostool hang on shutdown
I think I have a fix for this one. Colin McCabe
03:48 PM Bug #533 (Resolved): radostool hang on shutdown
radostool still seems to be hanging from time to time on shutdown.
Sending a signal resolves the issue.
For examp...
Colin McCabe
04:53 PM Revision 49153c2c (ceph): osd::PG: Update PG comments
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:50 PM Revision e6df8074 (ceph): test: create test_unfound.sh
Create test_unfound.sh to test handling unfound objects.
Move more test functions into test/test_common.sh to facili...
Colin Patrick McCabe
03:53 PM Linux kernel client Feature #534 (Resolved): support CEPH_FEATURE_RECONNECT_SEQ in klibceph
Sage Weil
02:07 PM Bug #532: OSD: repop_queue.front() == repop
This problem was in v0.22, but fixed in v0.22.1. Can you try with the latest testing (v0.22.2) or unstable? Sage Weil
12:52 PM Bug #532: OSD: repop_queue.front() == repop
I think I was a bit premature about that, since osd5 just crashed again with the same backtrace.... Wido den Hollander
12:45 PM Bug #532 (Closed): OSD: repop_queue.front() == repop
On two of my OSDs I had the following crash:... Wido den Hollander
03:43 AM Revision 1dd5042e (ceph): fix make distcheck, make uninstall
Make distclean was failing because make uninstall was broken. (There were
still leftover files after running make ins...
Colin Patrick McCabe
02:50 AM Bug #462: cephx: verify_authorizer_reply exception in decode_decrypt
I've just done a fresh mkcephfs on my cluster and then I started to see:... Wido den Hollander

10/30/2010

10:46 PM Revision 33e4d533 (ceph): Merge remote branch 'origin/mds_frags' into unstable
Sage Weil
10:22 PM Revision c044829c (ceph): filestore: automatically choose appropriate journaling mode
The three modes each get an explicit config option that defaults to false.
You can choose one explicitly by enabling ...
Sage Weil
10:04 PM Revision 6c69c259 (ceph): Merge remote branch 'origin/testing' into unstable
Conflicts:
configure.ac
Sage Weil
06:24 PM Revision 9f4fd4a6 (ceph): v0.22.2
Sage Weil
06:12 PM Revision 5b06ca1c (ceph): filestore: use updated btrfs ioctls
Switch to the interface finally merged for 2.6.37-rc1.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:32 PM Revision a831b2aa (ceph): btrfs: update ioctls.h
This is what was finally merged for 2.6.37-rc1.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
01:37 PM Bug #528 (Closed): cephx client: unknown request_type 20737
nevermind, this was ancient code. Sage Weil
01:36 PM Bug #528 (Closed): cephx client: unknown request_type 20737
... Sage Weil
12:04 AM Revision bb628d38 (ceph): Get "make dist" working, fix gui build issues
* Fix VPATH builds (i.e., builds where srcdir != builddir).
Don't assume that we can get a source files named blah wi...
Colin Patrick McCabe

10/29/2010

11:42 PM Revision 7e8fc103 (ceph): mds: detect small dirs that should be merged, and merge them
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:47 PM Revision 8a8e37b8 (ceph): mds: hit dir popularity on unlink
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:29 PM Revision 73c6e4cc (ceph): osd: write potentially large pg info to object, not xattr [format change]
Write past_intervals and snap_collections to a separate object instead of
an attr on the collection directory. This ...
Sage Weil
09:36 PM Revision ed89d9a2 (ceph): cephtool-gui : more helpful error on pixbuf fail
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:06 PM Revision 18bb14ea (ceph): osd: snap_trimmer: flush between collection sets
We need to make sure the objects whose collection sets we just adjusted
are reflected on disk when we make the next p...
Sage Weil
08:07 PM Revision 0ce1d509 (ceph): mds: Refactor need_snapflush into CInode helpers
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:07 PM Revision 440cc439 (ceph): mds: auth_pin head/snap pairs for all need_snapflush entries
This ensures that when snap metadata is flushed, we will be auth on both
inodes and be able to do the update properly...
Sage Weil
07:40 PM Revision 9f86a79d (ceph): configure.ac: default to --enable-gtk
Default to enabling gtk rather than disabling it. Gracefully handle
cases where the user tries to enable it but it ca...
Colin Patrick McCabe
07:06 PM Revision c2045286 (ceph): osd: fix decoding of legacy (v2) coll_t
It was u8, not int.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
06:42 PM Revision 4e180fef (ceph): mds: fix use-after-free iterator no-no in remove_inode_recursive()
We can't close_dirfrag and use the iterator if it's pointing to the
released element.
Signed-off-by: Sage Weil <sage...
Sage Weil
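The hazard described in this commit can be sketched with a standard container. This is a generic illustration, not the MDS code: `FragMap` and `remove_all` are hypothetical stand-ins, and the technique shown is simply advancing the iterator before the element it points to is released.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a container of dirfrags keyed by an id.
typedef std::map<int, std::string> FragMap;

// Remove every entry without touching an invalidated iterator.
// Returns the number of entries removed.
int remove_all(FragMap &frags) {
  int removed = 0;
  FragMap::iterator p = frags.begin();
  while (p != frags.end()) {
    FragMap::iterator cur = p++;  // step past the element first
    frags.erase(cur);             // only `cur` is invalidated; `p` stays valid
    ++removed;
  }
  return removed;
}
```

Erasing through `p` and then incrementing `p` would read a released node, which is the use-after-free pattern the commit title refers to.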
06:42 PM Revision 96e583d3 (ceph): mds: use list instead of vector for trim_unlinked_inodes
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:30 PM Revision 8906819a (ceph): Merge branch 'mds_journal' into unstable
Greg Farnum
06:29 PM Revision aa83e11c (ceph): mds: log if trim_non_auth does anything, since it shouldn't
be now except on rollbacks. Greg Farnum
06:29 PM Revision 404c83e3 (ceph): mds: Call trim_non_auth_subtree when appropriate.
Greg Farnum
06:29 PM Revision 20745218 (ceph): mds: add function MDCache::trim_non_auth_subtree
Trims the subtree rooted at the given dir from cache, except
for those portions linking to directories on other MDSes...
Greg Farnum
05:48 PM Revision 8255a671 (ceph): debian: fix changelog
(This was actually in the 0.22.1-1 package we built.)
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
04:41 PM Bug #374 (Can't reproduce): mon: osd will null addr added to map
Couldn't find anything with code inspection, and haven't been able to reproduce. Hopefully if/when this pops up agai... Sage Weil
04:39 PM CephFS Feature #340: large directories, directory fragmenting
split and merge rewritten and working. now for the stress testing. Sage Weil
03:24 PM Bug #522 (Resolved): osd: put potentially large pg info in separately object, not xattr
commit:73c6e4cca7d8265e1e478e83d97a638cc7fa6a24 Sage Weil
02:42 PM CephFS Bug #520: mds: change ifile state mix->sync on (many) lookups?
Nothing wrong on the client.. it's just that the mds has / (a subtree root) in the MIX state, and file_eval doesn't d... Sage Weil
01:02 PM CephFS Bug #360 (Resolved): mds: head/snapped snap_cap linkage may cross mdss
For now, let's just auth_pin(). resolved by commit:440cc43956f367e6c8fb1077c83693ff568c9d2c Sage Weil
12:43 PM Bug #518 (Resolved): cfuse crashed on ls
Checked the MDS side and haven't heard back yet, so I'm going to close this out unless I hear about more issues. Greg Farnum
12:38 PM CephFS Bug #525 (Closed): Audit CInode creation code for initialization
Looks all good now! Greg Farnum
09:16 AM CephFS Bug #525 (Closed): Audit CInode creation code for initialization
Specifically, it seems there are some times when truncate_size (and truncate_seq?) aren't getting set. Check if there ... Greg Farnum
11:31 AM CephFS Bug #329 (Resolved): mds: mislinked dentry found during journal replay
The multi-mds fix has been pushed to mds_journal branch commit:aa83e11c67165878e1ca1b0fe66ff9b8c3a906c8. Then merged ... Greg Farnum
10:38 AM Messengers Feature #527 (Resolved): zero copy reads, msgr rx buffer infrastructure
Currently all messages read off the wire (include read results) go into newly allocated buffers. This results in a d... Sage Weil
10:21 AM Feature #526 (Resolved): osd: unfound objects rework
We want to let the placement group go active even if there are some unfound objects. We will keep track of which objects a... Colin McCabe

10/28/2010

11:46 PM Revision ba6d931b (ceph): Mutex: add more checks to lockdep
When lockdep is enabled, use PTHREAD_MUTEX_ERRORCHECK instead of
PTHREAD_MUTEX_NORMAL for non-recursive mutexes.
Sig...
Colin Patrick McCabe
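A minimal sketch of what the mutex type choice buys, assuming plain POSIX threads. `init_mutex` and `relock_result` are hypothetical helpers, not Ceph's Mutex class; the `lockdep` flag here is just a function parameter standing in for the configuration setting.

```cpp
#include <cerrno>
#include <pthread.h>

// Hypothetical helper: initialise a non-recursive mutex, picking the
// error-checking type when `lockdep` is true and the normal type otherwise.
int init_mutex(pthread_mutex_t *m, bool lockdep) {
  pthread_mutexattr_t attr;
  int r = pthread_mutexattr_init(&attr);
  if (r != 0)
    return r;
  r = pthread_mutexattr_settype(
      &attr, lockdep ? PTHREAD_MUTEX_ERRORCHECK : PTHREAD_MUTEX_NORMAL);
  if (r == 0)
    r = pthread_mutex_init(m, &attr);
  pthread_mutexattr_destroy(&attr);
  return r;
}

// Lock an error-checking mutex twice from the same thread and return the
// result of the second lock call. A normal mutex would deadlock here.
int relock_result() {
  pthread_mutex_t m;
  if (init_mutex(&m, true) != 0)
    return -1;
  pthread_mutex_lock(&m);
  int r = pthread_mutex_lock(&m);  // reports an error instead of hanging
  pthread_mutex_unlock(&m);
  pthread_mutex_destroy(&m);
  return r;
}
```

With PTHREAD_MUTEX_ERRORCHECK, a relock by the owning thread returns EDEADLK, so the misuse can be reported rather than showing up as a silent hang.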
10:51 PM Revision 65fbd2ec (ceph): Add the Ceph monitoring GUI
This adds a graphical monitoring mode to the ceph cluster monitoring tool. Its
functionality is similar to ./ceph -w...
Michael McThrow
10:51 PM Revision c8839035 (ceph): cephtool gui: install and locate gui_resources
Make install now installs the gui resource files into
/usr/share/cephtool/gui_resources (or wherever we configure it ...
Colin Patrick McCabe
10:51 PM Revision c13183e7 (ceph): cephtool: join GUI thread before shutting down
Join GUI thread before shutting down.
Move open_icon function.
Delete unused get_widgets function.
Signed-off-by: ...
Colin Patrick McCabe
10:51 PM Revision 819ad635 (ceph): cephtool: fix initialization race
Call GuiMonitor::link_elements before GuiMonitor::connect_signals.
It doesn't seem safe to set up callbacks before t...
Colin Patrick McCabe
10:51 PM Revision 89273b7f (ceph): cephtool: only initialize the tokenizer once
Only initialize the tokenizer once. It gets cranky if we call
tok_init more than once.
Signed-off-by: Colin McCab...
Colin Patrick McCabe
10:51 PM Revision 329fbc24 (ceph): cephtool: gui: handle bad input in view_node
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
10:51 PM Revision 7f90cc27 (ceph): ceph: gui: update copyright foo
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
10:49 PM Revision 10466c52 (ceph): qa: add rbd test
Yehuda Sadeh
09:55 PM Revision b44901cb (ceph): SubmittingPatches: initial version
Largely based on Linux's version. Includes the Signed-off-by stuff at
the top, and a bit more modern description of ...
Sage Weil
09:32 PM Revision b6ffdf18 (ceph): qa: add basic rbd test
Yehuda Sadeh
09:10 PM Revision b434bb1a (ceph): osd: store locator with object_info; add incompat feature
Also: We adjust the get_object_context() et al helpers to take a locator.
We include a locator (sometimes) in the MOS...
Sage Weil
09:03 PM Revision ec8960ff (ceph): osd: make object_locator_t encodable
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:59 PM Bug #518: cfuse crashed on ls
Okay, commit:4fd49203b6c757b97455f1b85d5b93c76e20e199 is a partial revert of my initial (incorrect) fix, but keeps th... Greg Farnum
05:03 PM Bug #518 (In Progress): cfuse crashed on ls
Okay, never mind my previous statements. I just hit this while using vstart.sh -n -d on testing.
Looking over potent...
Greg Farnum
10:32 AM Bug #518 (Resolved): cfuse crashed on ls
Okay, best I can tell this is happening because of some weird interactions between a few different cfuse patchsets we... Greg Farnum
08:49 PM Revision 771c2c44 (ceph): mds: Migrator needs to add_dir_context all the way to root.
It was going to the default subtree root, which doesn't
work when we've just created a new subtree root out of the gi...
Greg Farnum
08:40 PM Revision 66e1d9fc (ceph): osd: fix unneeded get_object_context() (and leak) in _rollback_to
All we want is the name of the head sobject_t, which is 'soid' in the
parent frame.
Signed-off-by: Sage Weil <sage@n...
Sage Weil
08:30 PM Revision 4fe3ec91 (ceph): cephfs: remove unused variables
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
07:10 PM Revision ee27a61b (ceph): objecter: refactor interface with object_locator_t
This paves the way for a locator that lets the user specify an arbitrary
string to hash for placement (instead of the...
Sage Weil
07:10 PM Revision 7a688a9f (ceph): config: fix signal handler recursion
Avoid having old handler pointer match the new handler.
Avoid calling an old handler if its pointer is null.
Signed-...
Sage Weil
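The two guards named in this commit message can be sketched as follows, assuming a handler that chains to whatever was installed before it. All names here are hypothetical and the code is not taken from Ceph's config code; it only illustrates why the saved pointer must not be the new handler itself and why a null pointer must not be called.

```cpp
#include <signal.h>

static volatile sig_atomic_t g_calls = 0;
static void (*g_old_handler)(int) = 0;

// Hypothetical handler: count the call, then chain to a real previous handler.
static void handler(int signum) {
  ++g_calls;
  // Never call a null pointer or the SIG_DFL / SIG_IGN placeholder values.
  if (g_old_handler && g_old_handler != SIG_DFL && g_old_handler != SIG_IGN)
    g_old_handler(signum);
}

// Install `handler`, remembering the previous handler only if it is not
// `handler` itself. Saving our own address would make the chain call recurse.
void install_handler(int signum) {
  struct sigaction sa, old;
  sa.sa_handler = handler;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = 0;
  if (sigaction(signum, &sa, &old) != 0)
    return;
  if (old.sa_handler != handler)
    g_old_handler = old.sa_handler;
}

// Install twice, raise once, and report how many times the handler ran.
int calls_after_double_install() {
  install_handler(SIGUSR1);
  install_handler(SIGUSR1);
  raise(SIGUSR1);
  return g_calls;
}
```

Without the `old.sa_handler != handler` check, the second install would store the handler's own address and the chained call would never return.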
06:02 PM Revision 8f085108 (ceph): mds: pin NEEDSNAPFLUSH only when adding item
This is mainly paranoia.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
04:42 PM Feature #509 (Resolved): assimilate ceph gui code
It's in there. I integrated it with cephtool too.
cheers,
C.
Colin McCabe
02:20 PM Feature #524 (Resolved): object_locator_t
Sage Weil
02:16 PM Tasks #442 (Closed): reconfigure cosd cluster
did this. now set up like sepia (sudo make DESTDIR=/images/cosd). running latest unstable. Sage Weil
02:00 PM CephFS Bug #523 (Can't reproduce): cfuse locks don't wake on mds reconnect?
Don't know the exact cause, but I was running clustered mds tests using snaptest-2.sh and once the MDSes had failed a... Greg Farnum
04:27 AM Revision 1a0ac01f (ceph): osd: handle missing objects on snap read
The old check in handle_op doesn't work because we don't provide a snap
context on read, and we haven't loaded one of...
Sage Weil
04:27 AM Revision 8d37b280 (ceph): debian: change compat to 6 to match debhelper require
Reported-by: Laszlo Boszormenyi <gcs@debian.hu>
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:17 AM Revision 69f5ccdd (ceph): mds: store dir inode in separate object; fetch from both. incompat flag.
This avoids setting large xattrs. There's no reason the inode needs to be
on the same object as the dir(frag) data.
...
Sage Weil

10/27/2010

10:15 PM Linux kernel client Tasks #480 (Resolved): rebase btrfs snapshot ioctls, resend to list
Sage Weil
09:43 PM Revision 745a8ee5 (ceph): clarify CDir/CInode content comments a little bit
Greg Farnum
08:21 PM Revision c1d07816 (ceph): filestore: can force use of stale snaps
also, overwrite the commit_seq with the current version in case we
forced stale snaps.
Yehuda Sadeh
06:39 PM Revision bcc068ea (ceph): filestore: read commit_seq before mounting (btrfs ioctls)
Yehuda Sadeh
06:39 PM Revision c1a6ee57 (ceph): filestore: don't revert to old snapshots on startup
This should fix bug #55 Yehuda Sadeh
05:14 PM Bug #522 (Resolved): osd: put potentially large pg info in separately object, not xattr
I'm looking at the prior interval stuff, currently an attr on the head pg dir. This can be an object in meta/. Sage Weil
04:25 PM Bug #518: cfuse crashed on ls
compiled b5d9bec659daa8ba26810e7508ec473aba8ad287 but is still crashing on ls:... John Leach
01:21 PM Bug #55 (Resolved): osd: fix transition from snaps -> no snaps -> snaps
Fixed with commit:c1d078160a454c92fea899659d506e0b0ab7d92b. Yehuda Sadeh
11:45 AM Bug #521 (Resolved): objecter: crash in osdmap assert
... Sage Weil
06:28 AM Revision ae78ed42 (ceph): ceph.cc: delete deadcode
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:25 AM Revision 551711fb (ceph): Move ceph.cc to tools/
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
03:59 AM Revision a14dd819 (ceph): configure.ac: add ./configure option for gtk2
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
03:06 AM Revision 5fe0b5a0 (ceph): mds: fix split use after free; merge works
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
02:20 AM Revision b771ba89 (ceph): mds: simplify fragtree_t printer
val/bits^split
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
02:19 AM Revision 4afbc529 (ceph): mds: check/take wrlock on dirfragtreelock; unwind after freeze if needed
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
02:19 AM Revision 4c79f369 (ceph): mds: requeue dir if we can't split now due to dftlock
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
02:19 AM Revision 05fa106c (ceph): mds: implement frag.parse()
Sage Weil
02:19 AM Revision 7bd00b96 (ceph): mds: implement command 'merge_dir path frag'
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
02:19 AM Revision 0f8f02d3 (ceph): mds: add 'mds bal split bits' config option (default 3)
This is how many bits we fragment by, by default.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
02:19 AM Revision 2f9c9606 (ceph): client: fix dup entries in multifrag readdir
We need a next_offset of 0 for non-leftmost frags. Otherwise we set
our dentry offsets incorrectly and the next_offs...
Sage Weil
02:19 AM Revision 96d26e38 (ceph): mds: reimplement split_dir
Do not use an mdrequest; the old approach was totally broken wrt freezing,
locks, and deadlock.
First freeze, then l...
Sage Weil
02:19 AM Revision e1b53794 (ceph): mds: generalize split/merge call chain a bit
Still need work at the lower levels.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
02:19 AM Revision 332195a2 (ceph): mds: clean up merge() callchain
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
02:19 AM Revision a4b21449 (ceph): mds: don't replicate new frags (at least for now)
Leave commented-out stubs in place. Sage Weil
02:19 AM Revision e79417ba (ceph): mds: move fragment checks into shared helper
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil

10/26/2010

11:28 PM Revision 96beaf6c (ceph): messenger: always unlock existing pipes, even if they're lossy
Greg Farnum
07:58 PM Revision b5d9bec6 (ceph): client: Initialize Inode::truncate_size to 0 instead of -1, and check p...
on truncation.
truncate_size needs to precisely match the defaults on the MDS, or we run into
problems when importin...
Greg Farnum
07:30 PM CephFS Bug #520 (Closed): mds: change ifile state mix->sync on (many) lookups?
I'm seeing this on csyn --syn makefiles 1000 1 0... Sage Weil
07:04 PM Revision 2ed57d2a (ceph): Merge remote branch 'origin/testing' into unstable
Conflicts:
configure.ac
src/rados.cc
Sage Weil
07:00 PM Revision ef90cb5e (ceph): filestore: some cleanup
Yehuda Sadeh
06:59 PM Revision 54fdd641 (ceph): filestore: escape the xattr chunk names
Yehuda Sadeh
06:41 PM Revision 84b85aa6 (ceph): osd::Missing: const cleanup
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:41 PM Revision 45f7110d (ceph): osd: move PG::Missing implementation to PG.cc
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:06 PM Revision 44202873 (ceph): filestore: some cleanup
Yehuda Sadeh
01:05 PM Bug #518: cfuse crashed on ls
commit:b5d9bec659daa8ba26810e7508ec473aba8ad287 in testing. Waiting to hear back before closing. Greg Farnum
10:34 AM Bug #518: cfuse crashed on ls
These numbers confused me. When I get confused, I like to generate logs. Like the attached one. Greg Farnum
09:47 AM Bug #518: cfuse crashed on ls
And while I've got it here's the inode printout:
$4 = {ino = {val = 1099511632800}, snapid = {val = 18446744073709...
Greg Farnum
09:39 AM Bug #518: cfuse crashed on ls
All right, got in and found:
Identical truncate_seqs of 2.
Identical truncate_sizes of 0.
prior_size of 209715200....
Greg Farnum
12:34 PM Feature #169 (Resolved): osd: start up despite corrupted pg log(s)
Sage Weil
04:52 AM Revision 2a3e73bb (ceph): Merge branch 'btrfs_snap_ioctls' into unstable
Sage Weil
04:52 AM Revision f131f429 (ceph): filestore: warn if btrfs_snaps enabled but no async snap create ioctl
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
 
