Activity

From 11/17/2010 to 12/16/2010

12/16/2010

11:31 PM Bug #656 (Closed): ceph
I have 2 OSDs [2.6.36, ceph-0.23.1]
#ceph osd down 0
#ls /mnt/ceph {ceph's mount dir}
It hangs here,
after a while ...
longguang yue
10:57 PM Revision 30f752cd (ceph): gceph,ceph: replace cerr->derr
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
10:05 PM Revision c76379fd (ceph): cosd: replace cerr with derr
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:06 PM Revision 73669d87 (ceph): PG.cc:
sub_op_scrub must set finalizing_scrub on the replica
before waiting for last_update_applied to catch up to
info.la...
Samuel Just
06:32 PM Revision 4644247c (ceph): osd: FileJournal: use derr
Use derr to announce errors in FileJournal.
Handle EINTR where necessary (still haven't fixed
read/write/pread/pwrit...
Colin Patrick McCabe
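The EINTR handling mentioned in revision 4644247c follows a common pattern: retry a system call that was interrupted by a signal, and keep going after a short write. A minimal sketch in Python, not the FileJournal code itself; `write_fully` is a hypothetical helper name. Python 3.5 and later already retry interrupted calls (PEP 475), so the `except` branch is here only to make the pattern visible.

```python
import os

def write_fully(fd, data):
    """Write all of data to fd, retrying on EINTR and on short writes."""
    view = memoryview(data)
    written = 0
    while written < len(view):
        try:
            n = os.write(fd, view[written:])
        except InterruptedError:
            # EINTR: a signal arrived before any bytes were written; retry.
            continue
        written += n
    return written
```

The loop also covers the case where `os.write` accepts fewer bytes than requested, which is a separate failure mode from EINTR but is usually handled in the same place.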
06:32 PM Revision 4fc1af5e (ceph): logging: re-introduce derr
Re-introduce derr as a special log level (level -1) which will show up
in all logs, and on stderr. These messages are...
Colin Patrick McCabe
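One way to read the derr description in revision 4fc1af5e: a message at level -1 passes any non-negative verbosity threshold, so it reaches the log regardless of the configured debug level, and it is additionally copied to stderr. The sketch below assumes that reading; `MiniLog` and its parameters are hypothetical and are not Ceph's logging classes.

```python
import io

DERR_LEVEL = -1  # assumption: derr is modeled as log level -1

class MiniLog:
    def __init__(self, threshold, log_stream, err_stream):
        self.threshold = threshold      # messages above this level are dropped
        self.log_stream = log_stream
        self.err_stream = err_stream

    def emit(self, level, message):
        if level <= self.threshold:
            self.log_stream.write(message + "\n")
        if level == DERR_LEVEL:
            # derr-style messages are also copied to stderr
            self.err_stream.write(message + "\n")

log_file, stderr = io.StringIO(), io.StringIO()
log = MiniLog(threshold=0, log_stream=log_file, err_stream=stderr)
log.emit(10, "verbose detail")         # dropped: 10 > 0
log.emit(DERR_LEVEL, "journal error")  # kept in the log and sent to stderr
```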
05:10 PM Tasks #653: get playground radosgw up and running again
S3 gateway is up and running (I just recreated my user and tested it, it was working before). We still need to recrea... Yehuda Sadeh
03:02 PM Bug #655 (Resolved): class objects are being stripped (debian packages)
When installing the latest debian packages, the resulting /usr/lib/rados-classes/*.so are completely stripped, thus w... Yehuda Sadeh
02:56 PM Tasks #654 (Resolved): get playground ladder0 mounted
mounted, was just missing name=username on the mount command. Also created the users. Yehuda Sadeh
08:11 AM Bug #563: osd: btrfs, warning at inode.c ( btrfs_orphan_commit_root )
My rsync (196k files, 74GB data) finished successfully, but the btrfs warning repeated itself twice. Wido den Hollander
02:09 AM Bug #563: osd: btrfs, warning at inode.c ( btrfs_orphan_commit_root )
Compiled from the kernel GIT ( a4851d8f7d6351a395d36ae8fdcf41745a832d76 ) last night and then started a rsync this mo... Wido den Hollander
01:23 AM Revision 29480f42 (ceph): ReplicatedPG.cc:
_scrub must set head when it encounters a head snap
curclone counts down, not up
Signed-off-by: Samuel Just <samuel...
Samuel Just
12:33 AM Revision 1e490eff (ceph): osd: timed out watcher is added to unconnected map
Yehuda Sadeh
12:33 AM Revision c321620e (ceph): osd: send notify message only to unexpired watchers
Yehuda Sadeh
12:26 AM Revision 619b45ad (ceph): logging: close file when reloading global config
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe

12/15/2010

11:23 PM Revision 060fd428 (ceph): osd: fix watch timer, locking
Yehuda Sadeh
09:39 PM Revision 914f6dde (ceph): filestore: detect final version of async ioctl SNAP_CREATE_V2
Li's revised interface for the async snap ioctl is more flexible. Update
the ioctl call sites and detection code acc...
Sage Weil
09:15 PM Tasks #654 (Resolved): get playground ladder0 mounted
Get it mounted (it was being weird on me). And recreate the users (probably servicectl ladder0:user config from yakko?) Sage Weil
09:14 PM Tasks #653 (Resolved): get playground radosgw up and running again
I'm not sure what needs to be done to configure the radosgw pools..
Also we need to recreate the users/buckets fr...
Sage Weil
09:10 PM Bug #648 (Resolved): monclient: PGMap::apply_incremental
trimming changed by commit:89d5c91e7d207d646651f8959ee37a15ea199d1b Sage Weil
09:10 PM Bug #631 (Won't Fix): OSD: FileJournal::committed_thru
Sage Weil wrote:
> Okay, the first crash you saw is due to #645. I think it's a kernel bug causing that ioctl to fa...
Sage Weil
09:07 PM Revision 06a2d7a2 (ceph): mds: Save straydn in mdr so it's consistent across retry attempts.
Otherwise, we could choose new stray dirs and fail to get all
the locks we needed (while leaving old strays locked fo...
Greg Farnum
09:06 PM Cleanup #650 (Resolved): objecter: refactor request tracking to be per-osd instead of per-pg
commit:d54a854811a51a5730b548da712d59761057fa58 Sage Weil
08:44 PM Revision e31f0a47 (ceph): tools: don't start msgr thread before daemonize
Calling messenger->add_dispatcher_head() has the side-effect of starting
the messenger thread. So we must not do it b...
Colin Patrick McCabe
07:02 PM Revision d54a8548 (ceph): Merge branch 'objecter' into unstable
Sage Weil
07:02 PM Revision 065cdf52 (ceph): objecter: track pending requests by osd, not pg
This is a big cleanup. Also
- switch to keeping per-osd Connection *'s
- make requests time out independently (not...
Sage Weil
07:02 PM Revision f6dc5d9f (ceph): objecter: cleanup: rename op maps
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
07:02 PM Revision 32a8aed9 (ceph): objecter: add reopen_session helper
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
07:02 PM Revision 530083cc (ceph): objecter: check for pg mapping changes in each incremental; refactor mi...
We need to detect when a pg mapping changes but the primary stays the same.
That means we can't just look at the fina...
Sage Weil
07:01 PM Revision 5d44d599 (ceph): msgr: mark down by Connection*
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
07:01 PM Revision 07e593c4 (ceph): mds: fix inode ancestor attr encoding
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:15 PM Revision fdbd85e4 (ceph): automake: ignore rmdir errors during uninstall
We don't want to fail "make distcheck" for a silly reason.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe
06:01 PM Bug #652 (Resolved): ReplicatedPG _scrub missing clone
ReplicatedPG.cc:4016 = handle the missing clone case Samuel Just
03:18 PM Linux kernel client Bug #552 (Resolved): Samba with kernel oplocks=on produces lots of corrupt mds entries in dmesg
Closing this out unless we hear about more issues. Greg Farnum
01:38 PM Feature #643 (Resolved): filestore: update btrfs ioctl interface for soon-to-be-pushed SNAP_CREAT...
commit:914f6ddebd899667b1937dfe9d5f1a94537dc500 Sage Weil
01:01 PM Bug #563: osd: btrfs, warning at inode.c ( btrfs_orphan_commit_root )
I took the for-linus branch, since both next-rc and master wouldn't compile against 2.6.37-rc5. Due to this error:
...
Wido den Hollander
10:39 AM Bug #563: osd: btrfs, warning at inode.c ( btrfs_orphan_commit_root )
I've just hit the bug again, this time while I was running an rsync to my Ceph cluster.... Wido den Hollander
09:48 AM Feature #95: mon: adjust overload based on osd disk utilization
Implemented reweight-by-utilization in the overload branch.
C.
Colin McCabe
01:00 AM Revision 7b5e923c (ceph): osd: send pending notification for reconnected watcher
Yehuda Sadeh
12:28 AM Revision f9694648 (ceph): automake: add osd/Watch.h to noinst_HEADERS
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe

12/14/2010

09:55 PM Linux kernel client Bug #651 (Resolved): osd_client: need to recalculate request mapping for every osdmap incremental
Currently if we get an osdmap message with multiple incrementals, and a request maps to a different osd and then back... Sage Weil
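The title of bug #651 points at a general hazard: if several incrementals arrive in one message and only the final map is compared with the starting map, a mapping that changed and then changed back looks unchanged. A small sketch of that idea, assuming a plain dict from PG name to OSD id as a stand-in for the osdmap; all names are hypothetical and this is not the kernel client's code.

```python
def apply_incremental(mapping, incremental):
    """Return a new mapping with the incremental's changes applied."""
    updated = dict(mapping)
    updated.update(incremental)
    return updated

def changed_checking_final_only(mapping, incrementals, pg):
    """Compare only the starting map with the map after all incrementals."""
    final = mapping
    for incremental in incrementals:
        final = apply_incremental(final, incremental)
    return final[pg] != mapping[pg]

def changed_checking_each(mapping, incrementals, pg):
    """Re-check the mapping after every single incremental."""
    current = mapping
    changed = False
    for incremental in incrementals:
        following = apply_incremental(current, incremental)
        if following[pg] != current[pg]:
            changed = True
        current = following
    return changed

start = {"pg1": 0}
flip_flop = [{"pg1": 1}, {"pg1": 0}]  # moves to osd 1, then back to osd 0
missed = changed_checking_final_only(start, flip_flop, "pg1")
caught = changed_checking_each(start, flip_flop, "pg1")
```

Here `missed` is False and `caught` is True: only the per-incremental check notices that the request was briefly mapped elsewhere.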
08:47 PM Revision c8d9b20c (ceph): Merge branch 'sync2' into unstable
Sage Weil
07:50 PM Revision 89d5c91e (ceph): mon: trim pgmap less aggressively
This will make observer crashes due to missed states (#648) much harder to
hit. Eventually the pgmap state trim prob...
Sage Weil
07:02 PM Revision 056e91e0 (ceph): librados: drop watch_lock
Use the existing lock to protect all of this.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
07:00 PM Revision d4420a8a (ceph): objecter: drop linger_info_mutex
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:51 PM Revision b989087d (ceph): crypto: catch cryptopp decrypt/encrypt exceptions
Yehuda Sadeh
06:48 PM Revision 215f3320 (ceph): objecter: simplify linger register
Drop single-use helper; make unregister_linger part of the public
interface.
Signed-off-by: Sage Weil <sage@newdream...
Sage Weil
06:47 PM Revision b60a9abf (ceph): objecter: fix up linger ack/commit to trigger first time only
We only want the user-provided ack/commit callbacks to trigger the first
time we register the lingering op. Same goe...
Sage Weil
06:35 PM Revision 8a75086d (ceph): objecter: clean up linger interface
Put LingerOp on heap. Use xlist to attach to PGs. Add in/out bufferlists.
Signed-off-by: Sage Weil <sage@newdream....
Sage Weil
05:55 PM Revision 96b32382 (ceph): Merge remote branch 'origin/unstable' into sync2
Conflicts:
src/auth/Crypto.cc
src/osd/ReplicatedPG.cc
src/osd/ReplicatedPG.h
src/osd/osd_types.h
Sage Weil
03:03 PM Feature #562 (Closed): separate gui into separate binary, package
Colin McCabe
02:56 PM Revision 3e076c39 (ceph): logging: use Mutex::Locker
Use Mutex::Locker to make logging exception-safe. That is, if you are
doing "dout() << foo() << dendl;" and foo throw...
Colin Patrick McCabe
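The exception-safety argument in revision 3e076c39 can be sketched as follows. Python's `with` statement plays the role of a scoped guard such as `Mutex::Locker`: the lock is released even if building the message raises. The function names are hypothetical; this is an illustration, not Ceph's logging code.

```python
import threading

log_lock = threading.Lock()
log_lines = []

def foo():
    raise RuntimeError("foo failed while building the log message")

def log_unguarded(make_message):
    log_lock.acquire()
    log_lines.append(make_message())  # if this raises, release() is skipped
    log_lock.release()

def log_guarded(make_message):
    with log_lock:                    # released on normal exit and on exception
        log_lines.append(make_message())

def lock_left_held(log_function):
    """Return True if log_function leaves log_lock held after foo() raises."""
    try:
        log_function(foo)
    except RuntimeError:
        pass
    held = log_lock.locked()
    if held:
        log_lock.release()            # clean up so later calls do not deadlock
    return held
```

Calling `lock_left_held(log_unguarded)` reports a leaked lock, while `lock_left_held(log_guarded)` does not; with the unguarded version the next logging call would block forever.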
02:20 PM Revision bf31f3f1 (ceph): logger: Fix DoutStreambuf::create_rank_symlink
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
02:13 PM Revision 4c377199 (ceph): cephtool: rename tools files
Rename tools files to be more consistent. For example, the main()
function for ./ceph should be in ceph.cc.
Signed-o...
Colin Patrick McCabe
12:50 PM rbd Feature #341: libvirt bindings
A "network" disk type was introduced in "036ad5052b43fe9f0d197e89fd16715950408e1d":http://libvirt.org/git/?p=libvirt.... Josh Durgin
11:44 AM Bug #646: [OOPS] mount.ceph monip:6789: /mnt/ceph ,ceph-client-standalone.git + ceph 0.23
I'm going to get the -standalone.git repo updated to the latest code (it's a bit out of date) so we can confirm this ... Sage Weil
11:36 AM Cleanup #650 (Resolved): objecter: refactor request tracking to be per-osd instead of per-pg
Sage Weil
10:40 AM Bug #649 (Resolved): OSD: CryptoPP::StreamTransformationFilter::LastPut
Fixed in b989087ddf8775588ddbb6234d099398a2e18072. CryptoPP threw an exception when it failed to decode a message (probabl... Yehuda Sadeh
02:34 AM Bug #649 (Resolved): OSD: CryptoPP::StreamTransformationFilter::LastPut
This morning on my test machine (noisy.ceph.widodh.nl, 1 MON, 1 MDS, 3 OSD) all three OSDs died at exactly the same mo... Wido den Hollander
09:58 AM Revision a3fcf908 (ceph): logging: Fix use-before-access in debug.cc
Signed-off-by: Vangelis Koukis <vkoukis@cslab.ece.ntua.gr>
Signed-off-by: Constantinos Venetsanopoulos <cven@cslab.ec...
Vangelis Koukis
09:53 AM Revision 3932f084 (ceph): osd: PG::prior_set_affected: const cleanup
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:14 AM Bug #648: monclient: PGMap::apply_incremental
This is a known issue, caused by the pg state trimming. It'll go away eventually with #647. In the meantime, I'll m... Sage Weil
01:31 AM Bug #648 (Resolved): monclient: PGMap::apply_incremental
I left my laptop on last night with a 'ceph -w' on one of my test machines, this morning I saw:... Wido den Hollander

12/13/2010

08:23 PM Feature #647 (Duplicate): mon: refactor paxos interaction
We currently have a paxos instance per state machine, which is silly for a bunch of reasons. The big one is that a m... Sage Weil
07:28 PM Bug #646: [OOPS] mount.ceph monip:6789: /mnt/ceph ,ceph-client-standalone.git + ceph 0.23
master-backport: reproduces it.
unstable-backport: can't reproduce it.
I'm not sure whether this issue has been fixed.
changping Wu
07:00 PM Bug #646: [OOPS] mount.ceph monip:6789: /mnt/ceph ,ceph-client-standalone.git + ceph 0.23
/ceph-client-standalone$ git branch
master
* master-backport
changping Wu
06:45 PM Bug #646 (Can't reproduce): [OOPS] mount.ceph monip:6789: /mnt/ceph ,ceph-client-standalone.git +...
1. ceph-client-standalone: git from git://ceph.newdream.net/git/ceph-client-standalone.git
2. ceph: ceph-0.23
3.OS:...
changping Wu
10:18 AM Feature #640: support log rotation
I guess when I filed this I was thinking of a setup where there was a small tmpfs partition where the logs went to, w... Colin McCabe

12/12/2010

10:40 PM Revision 9add26be (ceph): mds: fix replay/resent vs completed request check
If it is a _replayed_ request, we should always send a simple ack if it is
completed, because the client doesn't not ...
Sage Weil
09:51 PM rbd Tasks #421 (Resolved): get rbd support into qemu upstream
Sage Weil
09:50 PM Linux kernel client Bug #473 (Can't reproduce): Kernel panic: ceph_pagelist_append
Sage Weil
09:49 PM Linux kernel client Bug #304 (Can't reproduce): GPF in writepages_finish
Sage Weil
09:47 PM Linux kernel client Feature #642 (Rejected): fill in s_uuid on superblock
Never mind, I misread the NFS thread. This is an extN thing. Sage Weil
02:33 PM CephFS Cleanup #638 (Resolved): mds: verify open+create resent/replayed event exception
commit:9add26b Sage Weil

12/11/2010

09:58 PM Revision 0e08cb0f (ceph): osd: return ENOSPC for non-mds if full flag is set in osdmap
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:09 PM Revision 239b7677 (ceph): Merge remote branch 'origin/syslog' into unstable
Conflicts:
src/mon/Paxos.cc
src/osd/PG.cc
Sage Weil
04:04 PM Revision 46242586 (ceph): Merge branch 'gceph' into unstable
Sage Weil
08:20 AM Feature #562: separate gui into separate binary, package
merged everything so far in commit:46242586eddcc948f71260f8c1ea2e8b1845a9f8 Sage Weil
07:48 AM Feature #562: separate gui into separate binary, package
Looks good! The one thing I'd change is to rename ceph.cc tool-common.cc or something along those lines, and cmd.cc ... Sage Weil
08:20 AM Feature #245 (Resolved): Logging to syslog
merged in commit:239b7677e7a9df86d35cbfb25226c3f1a06771c5 Sage Weil
08:09 AM CephFS Feature #630: release caps on inodes unlinked by other clients
Sage Weil wrote:
> Or, the MDS needs to delete file data as soon as a stray's wanted drops to 0.
That won't work,...
Sage Weil
08:06 AM RADOS Feature #433: improve osd reweighting
I think the thing to do here is extend the CrushWrapper interface (probably by wrapping something in mapper.c or buil... Sage Weil
04:34 AM Revision 292414c5 (ceph): gceph: Add gceph to rpm, deb
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:30 AM Revision 71a19a94 (ceph): gceph: run shutdown functions at exit
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:25 AM Revision bb82fd3d (ceph): gceph: fix compile
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:15 AM Revision 1a201f85 (ceph): gceph: add -h argument
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:12 AM Revision b4ceb194 (ceph): ceph tool: Create gceph
Put the gui into a separate binary.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe

12/10/2010

08:25 PM Feature #562: separate gui into separate binary, package
I implemented this in the gceph branch.
cheers,
Colin
Colin McCabe
04:28 PM Bug #631: OSD: FileJournal::committed_thru
Okay, the first crash you saw is due to #645. I think it's a kernel bug causing that ioctl to fail.
This is actuall...
Sage Weil
03:18 AM Bug #631: OSD: FileJournal::committed_thru
To answer your question, I never saw a crash with 2.6.37, I just rebooted back into 2.6.32
Just rebooting into 2.6...
Wido den Hollander
04:25 PM Bug #645 (Closed): intermittent failure of snap ioctl
We're occasionally getting back an EINVAL from the snap create ioctl.
extra debugging in place on the sepia test...
Sage Weil
04:15 PM Bug #644: rsync can be sloooow
Copy of my notes file:
FS->FS
gregf@kai:~/ceph/src$ time rsync -r /btrfs/gregf/ceph-client/ mnt/ceph-client
skippi...
Greg Farnum
04:15 PM Bug #644 (Closed): rsync can be sloooow
This is probably due to metadata ops being fairly expensive, but we should inspect an rsync run over Ceph to make sur... Greg Farnum
04:02 PM Feature #643 (Resolved): filestore: update btrfs ioctl interface for soon-to-be-pushed SNAP_CREAT...
Chris is going to push Li's revision of the async ioctl for the next -rc. Sage Weil

12/09/2010

11:57 PM Revision 49844738 (ceph): librados, objecter: fix unwatch operation
Yehuda Sadeh
10:38 PM Revision 346a2aac (ceph): rpm: update changelog
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
10:35 PM Revision e23d6200 (ceph): rpm: fix ceph.spec to work with gcephtool
Don't try to package gui_resources unless we are building the GUI.
Get GUI dependencies correct.
Signed-off-by: Coli...
Colin Patrick McCabe
09:55 PM Revision e5769b06 (ceph): objecter: resend_linger copies ops
Yehuda Sadeh
09:45 PM Revision 83612ef7 (ceph): Fix overflow in FileJournal::_open_file()
Vangelis Koukis
09:09 PM Revision 329ae1bc (ceph): ReplicatedPG: snap_trimmer now acquires a read lock on the osd map
before calling share_pg_info.
Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
Samuel Just
09:09 PM Revision d0fbc30a (ceph): ReplicatedPG.cc: Fixes a bug in snap_trimmer where a pointer to a stack
Cond is left in the mode.waiting_cond list.
Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
Samuel Just
07:43 PM Bug #631: OSD: FileJournal::committed_thru
Okay, pretty sure this was caused by a bug in 2.6.37-rc that was doing an async commit even for the sync snap/subvol ... Sage Weil
01:45 PM Bug #631: OSD: FileJournal::committed_thru
Okay, confirmed it was the same bug you were seeing:... Sage Weil
07:18 PM Revision f68e6e7d (ceph): rpm: don't try to package radosacl
radosacl is just a test binary, so unless we build with --with-debug, we
won't get it.
Signed-off-by: Colin McCabe <...
Colin Patrick McCabe
07:18 PM Revision 6722b0c8 (ceph): rpm: add pkgconfig to BuildRequires
You can't build without pkgconfig.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe
06:28 PM Revision 9df18d19 (ceph): rpm: set files-attr for radosgw
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:13 AM Bug #639: RHEL6 beta packaging breaks on 'libcls_rbd.so.1.0.0.debug'
The purpose of touching it is to prevent the installed .so from being stripped. Which means we don't need it in the ... Sage Weil
09:11 AM Linux kernel client Feature #642 (Rejected): fill in s_uuid on superblock
Sage Weil
09:09 AM Feature #640 (Closed): support log rotation
see logrotate.conf. it doesn't cap logs by size, but follows the standard scheme used by just about everything else ... Sage Weil
08:48 AM Feature #640 (Closed): support log rotation
Here I'm talking about the log messages generated by dout() and friends.
We should allow users to set up log rotat...
Colin McCabe
08:50 AM Feature #641 (Rejected): allow logs to be piped to an external program
We already support sending logs to syslog, to stdout, or to a file. We could pretty easily support a fourth option, w... Colin McCabe
02:10 AM Revision b4264fbb (ceph): filejournal: reset last_commited_seq if we find journal to be invalid
If we read an event that's later than our expected entry, we set read_pos
to -1 and discard the journal. If that hap...
Sage Weil
12:02 AM Revision cc78bbf1 (ceph): objecter: create a new op for resending lingering requests
Yehuda Sadeh

12/08/2010

09:51 PM Revision 027d5bfd (ceph): logger: tweak cmon log output a bit
Make the output of cmon on stderr a little bit less verbose.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe
08:25 PM Revision fdc7414e (ceph): logging: DoutStreambuf: handle daemonizing better
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
07:36 PM Revision 5cba1e63 (ceph): objecter: a few lingering fixes
Yehuda Sadeh
07:12 PM Revision a9c098df (ceph): mon: use helper for clock drift check; log relative instead of absolute...
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:54 PM Revision 986c2af4 (ceph): logging: debug.h: use DoutStreambuf
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
03:52 PM Cleanup #299: catch std::bad_alloc and die with helpful error in log on ENOMEM
This is kind of a tricky thing to really handle correctly.
In practice, most people run with memory overcommit tur...
Colin McCabe
01:56 PM Feature #245: Logging to syslog
Colin McCabe
11:19 AM Bug #639 (Resolved): RHEL6 beta packaging breaks on 'libcls_rbd.so.1.0.0.debug'
/usr/lib/debug/usr/lib64/rados-classes/libcls_rbd.so.1.0.0.debug is
installed but unpackaged.
In spec file I see ...
Brian Chrisman
11:11 AM CephFS Cleanup #638 (Resolved): mds: verify open+create resent/replayed event exception
These look a bit fishy to me. Sage Weil
11:11 AM CephFS Bug #637 (Resolved): mds: check replica scatterlock flush on rejoin
This needs to behave consistent with the start_flush/finish_flush hooks. Sage Weil
11:06 AM Bug #636 (Can't reproduce): RHEL6 beta packaging breaks on 'gui_resources'
'gui_resources' not getting into BUILDROOT correctly.
ceph.spec.in:128
%{_datadir}/ceph_tool/gui_resources/*
Co...
Brian Chrisman
10:31 AM Bug #635 (Resolved): RHEL6 beta packaging breaks unexpected characters
ceph.spec.in uses @VERSION@ directly, which chokes rpmbuild when ''Version:' uses certain characters (for example 0.2... Brian Chrisman
07:27 AM Revision c53ffafb (ceph): logging: Remove _dout_check_log
_dout_check_log is unneeded, since every invocation of dout makes the
same check.
Signed-off-by: Colin McCabe <colin...
Colin Patrick McCabe
07:17 AM Revision 8fdd0f44 (ceph): logging: debug.h: minor cleanup
Don't put std::ostream into the global namespace. Copyright update.
Signed-off-by: Colin McCabe <colinm@hq.newdream....
Colin Patrick McCabe
06:52 AM Revision aeba6bca (ceph): logging: eliminate dbeginl
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
03:09 AM Revision 0f0cb46a (ceph): logging: Implement rank symlinks
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
02:54 AM Revision 6c4a7d58 (ceph): logging: Support isym_path
Support instance symlinks, which are activated when we are using
g_conf.log_per_instance.
Signed-off-by: Colin McCab...
Colin Patrick McCabe
02:51 AM Revision e597d02d (ceph): logging: rename_output_file -> handle_pid_change
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
02:49 AM Revision 116478a3 (ceph): logging: _calculate_opath: use g_conf.log_dir
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
02:47 AM Revision ea3414d4 (ceph): logging: DoutStreambuf: better debug output
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
02:45 AM Revision 627399f7 (ceph): logging: create_symlink:sometimes use rel symlinks
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
02:43 AM Revision b00baab1 (ceph): logging: implement get_dirname, move get_basename
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
02:41 AM Revision ef223664 (ceph): logging: fix normalize_relative
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:44 AM Revision 2000f69e (ceph): mds: do not choose lock state on replicas
The lock state has already been set during rejoin.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:44 AM Revision fe103003 (ceph): mds: sync->mix replica state is sync->mix(2)
When auth first moves to sync->mix,
- auth sends AC_MIX to replicas
- replicas go to sync->mix
- replicas finish g...
Sage Weil
12:44 AM Revision 4f643994 (ceph): mds: introduce rejoin_invent_dirfrag() helper
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:44 AM Revision f97660ff (ceph): mds: fix LOOKUPHASH to avoid creating bogus replica CDir
We can't create the CDir if we are non-auth.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:44 AM Revision 681b010f (ceph): mds: clear EXPORTINGCAPS on export_reverse
We need to reverse the effects of encode_export_inode_caps(), which is just
the pin and state bit.
The original prob...
Sage Weil
12:44 AM Revision 9bbb33b4 (ceph): mds: send LOCKFLUSHED to trigger finish_flush on replicas
Since f741766a we have triggered start_flush and finish_flush on replicas.
The problem is that the finish_flush didn'...
Sage Weil
12:44 AM Revision c681ed75 (ceph): mds: explicitly pass scatterlock dirty flag to auth on gather
This ensures that if the replica thinks it is flushing something the
auth will always do a scatter_writebehind.
S...
Sage Weil
12:44 AM Revision 39c5933d (ceph): mds: add missing try_clear_more() to scatterlock
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:44 AM Revision b5fd2e4d (ceph): mds: open undef dirfrags during rejoin
Any invented dirfrags have a version of 0. This will cause problems later
if we pre_dirty() anything in that dir bec...
Sage Weil
12:44 AM Revision 2ea9b2d7 (ceph): mds: fix replay of already-journaled requests
Check for already-completed tids for both retried and replayed requests.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:44 AM Revision 9b9b8693 (ceph): mds: rev mds cluster internal protocol
The lock encoding changed with the dirty bit on scatterlocks.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:44 AM Revision 3825c4b8 (ceph): mds: small rejoin cleanup
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil

12/07/2010

11:47 PM Revision 42464fb7 (ceph): logging: Add symlink helper functions
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
10:00 PM Revision e2ba601b (ceph): logger: fix EINTR handling
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:57 PM Revision bacdd493 (ceph): logging: rename_output_file: fix bug
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:55 PM Revision d70851ef (ceph): logging: DoutStreambuf: Implement log-to-file
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:46 PM Revision 95211145 (ceph): logging: Add log_to_file option
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:40 PM CephFS Bug #15 (Resolved): mds rejoin: invented dirfrags (MDCache.cc:3469)
commit:b5fd2e4d4ee4bf02a993e75b756a3775b2d566e5 Sage Weil
08:11 PM Revision df5d4e62 (ceph): logging: DoutStreambuf improvements
Write to stdout_fileno directly rather than using a buffer, which we
would then have to flush. Fix a bug in the buffe...
Colin Patrick McCabe
06:56 PM Revision 1e2e4aa0 (ceph): automake: in scripts, use sysconfdir as-is
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:48 PM Revision 10b6887e (ceph): automake: in deb pkg, use --sysconfdir=/etc
When building the debian packages, use --sysconfdir=/etc.
Also, don't fudge sysconfdir in the init-ceph script.
Sig...
Colin Patrick McCabe
03:25 PM CephFS Feature #91: mds: up:shadow mode
Okay, this seems to be working now. Had to adjust how the Journaler treated read_pos and to fix a few of my new re-re... Greg Farnum
08:49 AM Linux kernel client Bug #634: Kernel client takes too long to recover after a MDS restart
It's also possible (though unlikely) that the client isn't getting an updated MDSMap quickly enough or that the MDS t... Greg Farnum
07:58 AM Linux kernel client Bug #634: Kernel client takes too long to recover after a MDS restart
The client doesn't 'reconnect' until the MDS reaches the up:reconnect state. That's preceded by up:replay (journal ... Sage Weil
07:53 AM Linux kernel client Bug #634 (Can't reproduce): Kernel client takes too long to recover after a MDS restart
[208292.940934] libceph: mds0 192.168.1.11:6800 socket closed
[208293.050282] libceph: mds0 192.168.1.11:6800 connec...
Ravi Pinjala
07:46 AM Revision d4043e81 (ceph): logging: add DoutStreambuf::set_prio
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
07:14 AM Revision 6c7735f6 (ceph): logging: DoutStreambuf must handle stdout + stderr
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
07:03 AM Revision 12544a49 (ceph): logging: Add log_to_syslog option
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
06:18 AM Revision 57bcdc54 (ceph): mkcephfs: require -k; update man page
Force users to specify keyring location; update man page accordingly.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil

12/06/2010

11:57 PM Revision 5ac581df (ceph): Rename SyslogStreambuf -> DoutStreambuf
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:53 PM Bug #631: OSD: FileJournal::committed_thru
Oh yes, there seems to be an issue with the IPv6 connectivity where 'noisy' is at.
I added debug journal = 20 (I in...
Wido den Hollander
10:30 PM Bug #631: OSD: FileJournal::committed_thru
I updated the above comment but I suspect you only looked at the email notification? In any case, can you reproduce ... Sage Weil
03:13 AM Bug #631: OSD: FileJournal::committed_thru
Yes, simply starting the OSD's again gave me the same crash on one OSD.
I've attached the log, but here are the la...
Wido den Hollander
11:38 PM Revision 9811fbd0 (ceph): logging: Replace derr with dout
derr was really just an alias for STDERR. Unfortunately, after we call
daemonize, STDERR is connected to /dev/null. S...
Colin Patrick McCabe
11:38 PM Revision c94e0d2d (ceph): logging: optimize with likely/unlikely macros
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:38 PM Revision d1e0a2ae (ceph): logging: debug.h: move some debug functions
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:38 PM Revision ab18aaec (ceph): logging: add g_conf.clog_to_syslog
Add a new configuration option that allows you to send central log
messages to syslog.
Signed-off-by: Colin McCabe <...
Colin Patrick McCabe
11:35 PM Revision ab61823e (ceph): logging: LogEntry: don't pass enums by reference
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:30 PM Revision 4ef069c3 (ceph): logging:Move LogEntry.h into common with LogClient
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:30 PM Revision 82fa7f2d (ceph): logging: LogClient: refactor handle_log_ack
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:30 PM Revision fcae8a7a (ceph): logging: MLog.h: const cleanup
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:29 PM Revision f2ead26e (ceph): logging: better syntax for LogClient
Rather than having to write logclient.log(LOG_ERROR, ss), coders can now
write clog.error() << "str". Auto-flushing, ...
Colin Patrick McCabe
11:25 PM Revision 87545d06 (ceph): configure: detect crypto++ library
Yehuda Sadeh
10:01 PM Revision ebcc9395 (ceph): osd: drop not-quite-copy constructor for object_info_t
Making a copy-like constructor that doesn't actually copy is confusing
and error prone. In this case, we initialized...
Sage Weil
08:49 PM Revision d69f3dd3 (ceph): MDS: Encode a full ancestor trace on inodes, not just the immediate par...
Greg Farnum
07:17 PM Revision 11c7dc03 (ceph): librados: fix the C++ interface init
Yehuda Sadeh
07:17 PM Revision b1afea51 (ceph): librados: fix error path in rados_deinitialize
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
07:16 PM Revision aa3dda61 (ceph): librados: fix the C++ interface init
Yehuda Sadeh
06:31 PM Revision 9a604816 (ceph): librados: fix C interface error handling in init code
Yehuda Sadeh
06:28 PM Revision 130b8b3f (ceph): librados: fix C interface error handling in init code
Yehuda Sadeh
05:59 PM Revision bf030ca2 (ceph): client: resync ioctl header from ceph-client.
Previous change to the CEPH_IOCTL_MAGIC in fbbf448 was incorrect!
Signed-off-by: Greg Farnum <gregf@hq.newdream.net>
Greg Farnum
12:42 PM CephFS Feature #600 (Resolved): mds: store full trace on directories
Done in commit:d69f3dd327730a61b614c9f41f6155626bc07686. Just loops through the parents and encodes them sequentially... Greg Farnum
11:04 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
If you're not running btrfs, you can also copy a pg directory to another disk and symlink it. Just be sure to preser... Sage Weil
11:01 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
Hi Fred,
We don't support deleting things manually from the object store while cosd is running.
Are you running...
Colin McCabe
12:52 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
At the time, when I said "by hand", it was even worse:
As the journal was full, doing a rm -Rf while the osd was s...
ar Fred
11:03 AM CephFS Bug #451 (Closed): mds: replay error
It looks like the cluster is running pretty old code (0.22~rc). This particular problem was fixed by 1c934ebd (0.23). Sage Weil
10:22 AM CephFS Bug #451 (In Progress): mds: replay error
Sage Weil
03:59 AM CephFS Bug #451: mds: replay error
<removed sensitive info> Henry Chang
10:25 AM Bug #633 (Resolved): librados crashes when init failed
Fixed with commit:9a60481681d86065979d4353305cdaad74fe1a01 Yehuda Sadeh
10:25 AM Bug #633 (Resolved): librados crashes when init failed
When the librados init fails, specifically when using the C interface, the subsequent call to deinitialize() crashes. Yehuda Sadeh
06:20 AM Revision 4e3c2011 (ceph): Tune Debian packaging for the upcoming v0.24 release.
Including switching the OpenSSL dependency to Crypto++, as it's being used instead of
the former; remove radosacl as it's not c...
Laszlo Boszormenyi

12/05/2010

09:03 PM Bug #631: OSD: FileJournal::committed_thru
Wido, any chance you can reproduce this with 'debug filestore = 20' and 'debug journal = 20'? Sage Weil
01:55 AM Bug #631 (Won't Fix): OSD: FileJournal::committed_thru
On a small cluster (1 MDS, 1 MON, 3 OSDs) I just saw 2 OSDs crashing with the same backtrace:... Wido den Hollander
08:32 PM CephFS Feature #601: mds: order directory commits after rename
What if another file is renamed in the other direction? Do a partial commit first? Tricky. Sage Weil
06:46 PM Bug #632 (Won't Fix): init script won't stop an instance that's been removed from config
If an instance of a Ceph daemon is removed from the config file, then the Ceph init script no longer knows how to kil... Ravi Pinjala
05:29 AM Revision 27b70eb5 (ceph): osd: search for unfound on osds in might_have_unfound
We were looking at 'up', which is just the set of OSDs we should be on in
the current epoch; nothing to do with where...
Sage Weil
04:45 AM Revision 8aa7b391 (ceph): Makefile: make radosacl build WITH_DEBUG only
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil

12/04/2010

08:53 PM Tasks #616 (Rejected): radosacl needs a man page
Sage Weil
08:52 PM Bug #627 (Resolved): replace openssl with crypto++
Sage Weil
08:38 PM CephFS Feature #630 (Resolved): release caps on inodes unlinked by other clients
If client A writes a file, and client B unlinks it, client A needs to drop the inode sooner rather than later.
O...
Sage Weil
03:34 AM Revision 15d8bdf3 (ceph): crypto: use crypto++ for aes instead of openssl
need to implement it more efficiently, currently going through a string object Yehuda Sadeh
03:34 AM Revision 58f3ce4a (ceph): crypto: test for allocation failure, cleanup
Yehuda Sadeh
03:34 AM Revision 6ec622c0 (ceph): common: use ceph_armor instead of openssl based functions
also modify ceph_[un]armor to get dest buffer length Yehuda Sadeh
03:34 AM Revision 7fa9426c (ceph): makefile.am: most binaries (except rgw_*) don't link with openssl
Yehuda Sadeh
03:34 AM Revision e135e924 (ceph): crypto: remove old openssl implementation
Yehuda Sadeh
03:34 AM Revision 76e02c71 (ceph): common: remove base64.c
Yehuda Sadeh
03:34 AM Revision 88213770 (ceph): crypto: change include
Yehuda Sadeh
03:34 AM Revision a28b4494 (ceph): configure: check for the presence of libcrypto++ header files
Yehuda Sadeh
03:34 AM Revision f2424dfb (ceph): rgw: get rid of openssl altogether
Yehuda Sadeh
03:34 AM Revision e0059259 (ceph): rgw: null terminate armor result
Yehuda Sadeh
03:34 AM Revision 23f37043 (ceph): ceph.spec.in: update dependency
Yehuda Sadeh

12/03/2010

06:02 PM Revision a457cbb9 (ceph): mon: fix typo
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:02 PM Revision 378d13df (ceph): osd: remove poid/soid from ScrubMap::object; clean up callers
The soid is in the key in the map; no need to store it in the value.
Update the scrub code appropriately.
Signed-off...
Sage Weil
05:35 PM Revision a4cc929c (ceph): make: create log directories and tmp directories
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
05:10 PM Revision a5297388 (ceph): msgr: Correctly handle half-open connections.
If poll() says a socket is ready for reading, but zero bytes
are read, that means that the peer has sent a FIN. Hand...
Jim Schutt
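The rule stated in revision a5297388 (poll() reports the socket readable, yet the read returns zero bytes, so the peer has sent a FIN) can be sketched with plain POSIX calls. This is an illustration only, not the messenger code: ReadState and poll_and_read are hypothetical names, and the EINTR retry loops are an added assumption.

```cpp
#include <poll.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstddef>

enum class ReadState { Data, PeerClosed, NotReady, Error };

// Wait up to timeout_ms for fd to become readable, then read once.
// A readable socket that yields 0 bytes means the peer closed its side.
ReadState poll_and_read(int fd, char* buf, size_t len, int timeout_ms,
                        ssize_t* got) {
    assert(buf != nullptr && got != nullptr);

    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    int rc;
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);
    if (rc < 0) return ReadState::Error;
    if (rc == 0) return ReadState::NotReady;  // timed out, nothing to read

    ssize_t n;
    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);
    if (n < 0) return ReadState::Error;
    if (n == 0) return ReadState::PeerClosed;  // FIN received

    *got = n;
    return ReadState::Data;
}
```

A caller that sees PeerClosed would tear the connection down rather than keep polling a socket that will stay "readable" forever.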
11:11 AM Bug #590 (In Progress): osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
Colin McCabe
11:11 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
Hi Fred,
When you say you removed the PGs "by hand"... does that mean you used "rm -rf" on the object store while ...
Colin McCabe
07:33 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
this is with ceph rc 39b42b21e9805b3ec838f8682420166fede719f2
I tried to solve the ENOSPC problem by removing PGs ...
ar Fred
09:36 AM CephFS Bug #623 (Resolved): MDS: MDSTable::load_2
Sage Weil
12:32 AM CephFS Bug #623: MDS: MDSTable::load_2
Yes, tried with the latest rc, works!
MDS starts and recovers, also mounting and using the FS goes fine.
Wido den Hollander
09:35 AM Bug #625 (Resolved): make install should create dirs
implemented by commit:a4cc929cedb0ee773a2fa68d691a9951221ae31a and commit:39b42b21e9805b3ec838f8682420166fede719f2
C.
Colin McCabe
01:35 AM Revision 39b42b21 (ceph): make: create /etc/ceph if it doesn't exist
make: create /etc/ceph if it doesn't exist. On uninstall, remove the
directory if it's empty. (Never remove a user's ...
Colin Patrick McCabe
12:56 AM Revision da5ab7c9 (ceph): ost: object_info_t: decode old versions correctly
object_info_t has one constructor that initializes everything from a
bufferlist. This means that the decode function ...
Colin Patrick McCabe
12:18 AM Revision 03eb4e7a (ceph): man: add man page for cephfs
Add to Makefile, debian, and ceph.spec.in bits Greg Farnum

12/02/2010

07:52 PM Revision 6518fae3 (ceph): watch: some more linger fixes
Yehuda Sadeh
06:16 PM CephFS Feature #91: mds: up:shadow mode
I have yet to implement trimming, but the basic restarting-replay bits are now in place along with hooks to make it s... Greg Farnum
05:14 PM Bug #479: ceph/mount crash badly when writing
Hi all:
Ok, so I gitted again, original/unstable,
- Linux ss1 2.6.36-02063601-generic #201011231330 SMP Tue Nov 2...
DongJin Lee
05:07 PM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
Colin McCabe
05:07 PM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
Hi Fred,
I think the assertion you're seeing here was fixed very recently by commit:78a14622438addcd5c337c4924cce1...
Colin McCabe
05:02 PM Bug #629: cosd segfaults when deleting a pool containing degraded objects
Looks like some kind of lifecycle issue related to deleting pools.
OSD::_remove_pg does a _put_pool, and that does...
Colin McCabe
04:54 PM Bug #629 (Resolved): cosd segfaults when deleting a pool containing degraded objects
started a 4 node osd cluster. created some pools with some objects in them. killed one osd node. waited for it to be ... John Leach
04:52 PM CephFS Bug #623: MDS: MDSTable::load_2
I think that commit:da5ab7c9a49f8996b41783175683d4b8b13ece4d should fix this issue.
wido, can you re-run with the ...
Colin McCabe
04:44 PM CephFS Bug #623: MDS: MDSTable::load_2
root@noisy:/var/log/ceph# grep mark_all_unfound_as_lost *
[ no results ]
So we're not marking things as lost in...
Colin McCabe
11:52 AM CephFS Bug #623: MDS: MDSTable::load_2
Actually -23 is ENFILE, which I think is coming from the LOST code...but that should never trigger unless the admin ha... Sage Weil
05:12 AM CephFS Bug #623 (Resolved): MDS: MDSTable::load_2
On a small test machine I have a Ceph RC cluster running (which was running an old unstable before), after my upgrade ... Wido den Hollander
04:13 PM Tasks #617: cephfs needs a man page
Already had it in the Makefile, put it in the other bits and updated the commit. Greg Farnum
03:56 PM Tasks #617: cephfs needs a man page
need to add filename to debian/ceph.install and ceph.spec.in too.
and to man/Makefile.am
Sage Weil
02:07 PM Tasks #617 (Resolved): cephfs needs a man page
Done in commit:6cdaa2f6a7670357313401ddbd322bdf529a1547 on the rc branch. Greg Farnum
03:29 PM Bug #622 (Resolved): crushtool useless parse error
Resolved-- the crushmap.txt was bad.
I created #628 for getting better error messages from crushtool.
Colin McCabe
01:19 PM Bug #622: crushtool useless parse error
There is a more advanced error handling API for spirit described at:
http://www.boost.org/doc/libs/1_41_0/libs/spiri...
Colin McCabe
11:35 AM Bug #622: crushtool useless parse error
Reposting the diff; hopefully clearer this time.
--- crushmap.txt.1 2010-12-02 11:38:43.816441440 -0800
++...
Colin McCabe
11:33 AM Bug #622: crushtool useless parse error
I was able to get the crushmap.txt to work by deleting the word "domain" in the gb1 region.
We should definitely h...
Colin McCabe
03:07 AM Bug #622 (Resolved): crushtool useless parse error
I can't decide whether this is a bug in crushtool or a bug in my crushmap but whichever it is, the error message isn'... John Leach
03:27 PM RADOS Feature #628 (New): crushtool: better error messages when parsing a crushmap.txt
There is a more advanced error handling API for spirit described at:
http://www.boost.org/doc/libs/1_41_0/libs/spiri...
Colin McCabe
03:22 PM Bug #625: make install should create dirs
Should be pretty straightforward. The only question is, should we remove those directories on an uninstall? Colin McCabe
11:11 AM Bug #625 (Resolved): make install should create dirs
/var/log/ceph
/var/lib/ceph/tmp
?
check debian/ceph.dirs to see what else gets created...
Sage Weil
11:57 AM Bug #627 (Resolved): replace openssl with crypto++
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/684011 Sage Weil
11:25 AM CephFS Feature #626 (Closed): qa: add IOR, rompio, or other parallel workloads suite
We've had reports that rompio is just terrifically unstable, and shows serious scaling issues.
IOR is a more commo...
Greg Farnum
09:39 AM Feature #624: radostool: make 'put' write large objects in chunks
Could the chunk size be settable with an argument, for testing this kind of thing in the future? John Leach
09:38 AM Feature #624 (Resolved): radostool: make 'put' write large objects in chunks
Otherwise a put on a large (100 MB+) file can fail because it exceeds the size of the osd journals. It's also clearly... Sage Weil
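The chunking requested in Feature #624 amounts to splitting one large write into (offset, length) pieces that each fit under some size limit. A minimal sketch of that arithmetic follows; chunk_ranges is a hypothetical helper, not part of the rados tool, and the chunk size is left as a parameter in the spirit of the follow-up comment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Split `total` bytes into consecutive (offset, length) pieces of at most
// `chunk` bytes each. The last piece may be shorter than the others.
std::vector<std::pair<uint64_t, uint64_t>> chunk_ranges(uint64_t total,
                                                        uint64_t chunk) {
    assert(chunk > 0);
    std::vector<std::pair<uint64_t, uint64_t>> out;
    for (uint64_t off = 0; off < total; off += chunk) {
        out.emplace_back(off, std::min(chunk, total - off));
    }
    return out;
}
```

Each pair would then drive one write at that offset, so no single operation has to fit the whole object into the journal.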

12/01/2010

11:40 PM Revision 78a14622 (ceph): osd: fix log tail vs last_complete assert on replica activation
The last_complete may be below the log tail IFF we have a backlog.
Fixes 756918be3b24d8164699da301ddfbc8e6fd6b751.
...
Sage Weil
11:11 PM Revision 63fab458 (ceph): rados_bencher.h:
bench_write and bench_seq will now wait on any write/read
rather than the one least recently started.
bench_write ...
Samuel Just
11:00 PM Revision 0ea601ab (ceph): Create SyslogStreambuf
SyslogStreambuf is a kind of stream buffer that allows you to output
characters from an ostream to syslog. Most stand...
Colin Patrick McCabe
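The idea behind revision 0ea601ab, a stream buffer that forwards ostream output to syslog, can be sketched as a std::streambuf subclass that collects characters and hands each completed line to a sink. This is not the SyslogStreambuf from the commit: LineSinkStreambuf and syslog_sink are hypothetical names, and the pluggable sink is an assumption made so the class can be exercised without a syslog daemon.

```cpp
#include <syslog.h>
#include <cassert>
#include <functional>
#include <ostream>
#include <streambuf>
#include <string>
#include <utility>

// Collects characters written through an ostream and passes each complete
// line (without the trailing newline) to a caller-supplied sink.
class LineSinkStreambuf : public std::streambuf {
public:
    explicit LineSinkStreambuf(std::function<void(const std::string&)> sink)
        : sink_(std::move(sink)) {
        assert(sink_);
    }

protected:
    // No put area is set, so every character arrives here.
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof())) {
            return traits_type::not_eof(ch);
        }
        const char c = traits_type::to_char_type(ch);
        if (c == '\n') {
            emit_line();
        } else {
            line_ += c;
        }
        return ch;
    }

    // std::flush / std::endl push out a partial line as well.
    int sync() override {
        if (!line_.empty()) emit_line();
        return 0;
    }

private:
    void emit_line() {
        sink_(line_);
        line_.clear();
    }

    std::function<void(const std::string&)> sink_;
    std::string line_;
};

// Example sink that forwards a line to syslog at LOG_INFO priority.
inline void syslog_sink(const std::string& line) {
    syslog(LOG_INFO, "%s", line.c_str());
}
```

Constructing LineSinkStreambuf buf(syslog_sink); and then std::ostream out(&buf); gives an ordinary ostream whose lines end up in syslog.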
09:48 PM Revision a3d8c527 (ceph): filestore: call lower-level do_transactions() during journal replay
We used to call apply_transactions, which avoided rejournaling anything
because the journal wasn't writeable yet, but...
Sage Weil
09:46 PM Revision 9ecbc300 (ceph): filestore: do journal mode autodetect and sanity check _before_ replay
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:25 PM Tasks #617: cephfs needs a man page
I'll get this tomorrow. I wrote the tool and have had a task in my private manager to do this ever since then. Greg Farnum
07:05 PM Revision f9fa855a (ceph): filestore: fix journal locking on trailing mode
We're already holding journal_lock due to the surrounding
op_submit_{start,finish}.
Signed-off-by: Sage Weil <sage@n...
Sage Weil
06:20 PM Revision 0897edaf (ceph): Merge branch 'testing' into rc
Conflicts:
configure.ac
Sage Weil
06:20 PM Revision cbb56208 (ceph): rbd: use MIN instead of min()
Not even sure where min() was coming from, but it seems to be missing on
i386 lucid:
g++ -DHAVE_CONFIG_H -I. -W...
Sage Weil
06:20 PM Revision 792b04ba (ceph): client: connect to export targets on cap EXPORT
Also unconditionally connect on reconnect, even when there aren't any
outstanding requests.
Signed-off-by: Sage Weil...
Sage Weil
06:03 PM Revision bde0c721 (ceph): filestore: do not autodetect BTRFS_IOC_SNAP_CREATE_ASYNC until interfac...
Li has proposed an alternative V2 ioctl that looks nicer, so wait until
that is finalized.
Signed-off-by: Sage Weil ...
Sage Weil
06:03 PM Revision 5bdae2af (ceph): ceph v0.23.2
Sage Weil
05:44 PM Revision 4592c220 (ceph): client: fix cap export handler
An EXPORT cap msg can race with a cap release; deal with that (realigning
this code with the kclient).
Signed-off-by...
Sage Weil
05:24 PM Revision 15c272e8 (ceph): man: fix monmaptool man page
I've found the manpage problem that I've noted before. It's about
monmaptool, the CLI says its usage:
[--print] [--c...
Laszlo Boszormenyi
03:17 PM Bug #611 (Resolved): OSD: OSDMap::get_cluster_inst
Sage Weil
03:17 PM Bug #612 (Resolved): OSD: Crash during auto scrub
Sage Weil
02:58 PM Linux kernel client Bug #564 (Resolved): Configuration via configfs instead of sysfs
acked by greg kh, yay Sage Weil
02:58 PM rbd Bug #391 (Can't reproduce): snap create/delete caused corruption
this is old Sage Weil
02:47 PM Bug #550 (Can't reproduce): mon: PGMonitor::update_from_paxos()
haven't been able to reproduce this. commit:62716aa7 gives us useful error messages. if/when it comes up again we'l... Sage Weil
02:28 PM Linux kernel client Bug #436 (Can't reproduce): cmon: basic_string::_S_construct NULL not valid
Sage Weil
02:28 PM Bug #460 (Can't reproduce): OSD crash: ReplicatedPG::push_to_replica / Rb_tree
Sage Weil
09:46 AM Bug #621 (Resolved): error building unstable branch, rbd.cc:837: error: no matching function for ...
should be fixed by commit:307404231ecb09fdd2f6dd6e50677e746bba4236 Sage Weil
07:08 AM Bug #621 (Resolved): error building unstable branch, rbd.cc:837: error: no matching function for ...
Building on i386 Ubuntu Lucid, it fails building rbd.
This is a build of unstable at commit bf784cdb4f605c467eb094...
John Leach
09:14 AM CephFS Bug #344 (Resolved): cfuse should pass all qa tests
Sage asked me to mark this resolved. I ran the bonnie test yesterday and it eventually crashed when the disk ran out ... Greg Farnum
02:22 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
I just tried the latest unstable: fe9fad7bea
osd log attached...
osd/OSD.cc: In function 'void OSD::_process_pg...
ar Fred
12:50 AM Revision 6d96104e (ceph): osd: simplify scrub sanity checks
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:50 AM Revision 76b55c8a (ceph): osd: only adjust osd scrub_pending if pg was reserved
If for some reason we enter scrub() without scrub_reserved == true, don't
adjust the osd->scrubs_pending or we'll scr...
Sage Weil
12:38 AM Revision 260840f5 (ceph): mds: fix import_reverse re-exporting of caps
Make the import_reverse() set the pin/state before it clears them by using
the helper that sets them.
Signed-off-by:...
Sage Weil
12:25 AM Revision fe9fad7b (ceph): v0.25~rc
Sage Weil
12:25 AM Revision 109e3f18 (ceph): mds: turn off mds_bal_frag until resolve vs split/merge is fixed
See #594
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:11 AM Revision f216b020 (ceph): Merge remote branch 'origin/lost' into unstable
Conflicts:
src/osd/osd_types.h
Sage Weil

11/30/2010

11:48 PM Revision 0cc8d34e (ceph): osd: refactor object_info_t constructor a bit
Create a copy constructor for object_info_t, since we often want to copy
an object_info_t and would rather not try to...
Colin Patrick McCabe
11:48 PM Revision c281e1e0 (ceph): osd: mark_all_unfound_as_lost: wake waiters
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:48 PM Revision d5e6cae2 (ceph): radostool: fix memleak in error path
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:48 PM Revision 55f7e567 (ceph): osd: mark_all_unfound_as_lost: set lost attr
In mark_all_unfound_as_lost, we need to set the lost bit in the objects'
object_info_t.
Signed-off-by: Colin McCabe ...
Colin Patrick McCabe
11:48 PM Revision 5e243f3e (ceph): osd: create lost2 test
This one verifies:
1. Client asks for an unfound object and gets put to sleep
2. Object gets declared lost
3. Client ...
Colin Patrick McCabe
11:48 PM Revision b46f847c (ceph): osd: mark_obj_as_lost: don't assume we have obj
In PG::mark_obj_as_lost, we have to mark a missing object as lost. We
should not assume that we have an old version o...
Colin Patrick McCabe
11:48 PM Revision c29fbb12 (ceph): osd: mark_all_unfound_as_lost: bugfix, refactor
mark_all_unfound_as_lost: just delete items from the rmissing set as we
find them, rather than using a multi-pass sys...
Colin Patrick McCabe
11:48 PM Revision e9ccd7eb (ceph): osd: mark_obj_as_lost: fix oloc init, eversion
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:48 PM Revision cee3cd51 (ceph): osd: share_pg_log: update peer_missing
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:48 PM Revision ad4e5f36 (ceph): osd: ReplicatedPG::do_op: error on read-from-lost
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:48 PM Revision b15a97c7 (ceph): test_lost: add lost1 test
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:47 PM Revision 136dfdeb (ceph): osd: don't mark objs as lost unless we're active
We don't have enough information to mark objects as lost until we
activate the PG. might_have_unfound isn't even buil...
Colin Patrick McCabe
11:43 PM Revision 08bd4ead (ceph): mds: fix resolve for surviving observers
Make all survivors participate in resolve stage, so that survivors can
properly determine the outcome of migrations t...
Sage Weil
11:43 PM Revision fb4734be (ceph): (re)add mechanism for marking objects as lost
In activate_map, we now mark objects that we know are unfindable as
lost. This relies on the might_have_unfound set i...
Colin Patrick McCabe
11:43 PM Revision 80f3ea10 (ceph): Add ./ceph dump pg debug degraded_pgs_exist
./ceph dump pg debug degraded_pgs_exist returns TRUE if some pgs are
degraded; false otherwise.
tests: move start_re...
Colin Patrick McCabe
11:43 PM Revision de094224 (ceph): osd: object_info_t: add lost field
We can now permanently mark objects as lost by setting the lost bit in
their object_info_t. Rev the object_info_t str...
Colin Patrick McCabe
11:43 PM Revision e555899c (ceph): osd: active replicas process logs from primaries
In _process_pg_info, if the primary sends us a PG::Log, a replica should
merge that log into its own.
mark_all_unfou...
Colin Patrick McCabe
11:43 PM Revision c0e60afe (ceph): test: dump_osd_store: sort dump output
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:21 PM Revision 1123b5c5 (ceph): osd, librados: misc fixes, linger related issues
Yehuda Sadeh
08:57 PM Revision bf784cdb (ceph): osd: fix object_info_t() initialization of oloc
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:56 PM Revision 91a75590 (ceph): mds: add debug output to make completions easier to track
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:48 PM Revision ba1f3cb9 (ceph): osd: fix misuses of OLOC_BLANK
Commit 6e2b594b fixed a bunch of bad get_object_context() calls, but even
with the parameter fixed some were still br...
Sage Weil
08:23 PM Revision 2ad901b3 (ceph): Revert "mds: resolve cleanup"
This reverts commit cd53719f3ce712a060e4ac80cab934c597531a5e.
We need this on surviving nodes too to resolve ambiguo...
Sage Weil
08:19 PM Revision b39f0425 (ceph): Merge branch 'testing' into unstable
Conflicts:
src/os/FileJournal.cc
Sage Weil
07:43 PM Revision 1b06332d (ceph): osd: make recovery_oids debug list per-pg
Otherwise we hit bad asserts if an object of the same name in different
pools is getting recovered simultaneously.
S...
Sage Weil
06:56 PM Revision 05ad97b6 (ceph): client: Set the DirResult buffer to NULL when deleting it.
This should fix a crash exposed by our bonnie workunit. Previously
the client would keep trying to read out of the (d...
Greg Farnum
05:22 PM Revision 559d4d20 (ceph): ceph.spec.in: include gui files
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:13 PM Revision 93601269 (ceph): debian: many many cleanups
Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu> Sage Weil
04:55 PM Revision 5eb8ef7f (ceph): filejournal: fix throttle vs FULL behavior
We don't want to add to the throttler if we aren't going to queue the
write, or else we'll never take it off again.
...
Sage Weil
04:45 PM Bug #612: OSD: Crash during auto scrub
this should be fixed by commit:76b55c8a121acd4e5e8b6f5dbb83c25926ac9f76 Sage Weil
04:32 PM Revision 132f74c5 (ceph): Merge branch 'osd_journaling' into unstable
Sage Weil
04:30 PM Revision 7af9ffdf (ceph): filestore: make sure blocked op_start's wake up in order
If they wake up out of order (which, theoretically, they could before) we
can screw up journal submitting order in wr...
Sage Weil
04:24 PM Revision fac7266d (ceph): filestore: assert op_submit_finish is called in order
Verify/assert that we aren't screwing up the submission pipeline ordering.
Namely, we want to make sure that if op_ap...
Sage Weil
04:20 PM CephFS Bug #594: mds: frag split/merge vs replay
disabled in v0.24 Sage Weil
04:01 PM Tasks #539 (Resolved): wiki: document pg expansion
Documented on:
http://ceph.newdream.net/wiki/Changing_the_number_of_PGs
Colin McCabe
03:54 PM Revision 5e391db0 (ceph): filejournal: rework journal FULL behavior and fix throttling
Keep distinct states for FULL, WAIT, and NOTFULL.
The old code was more or less correct at one point, but assumed th...
Sage Weil
03:51 PM Revision 79419c33 (ceph): filestore: refactor op_queue/journal locking
- Combine journal_lock and lock.
- Move throttling outside of the lock (this fixes potential deadlock in
parallel j...
Sage Weil
03:22 PM Revision 0df9dd6e (ceph): filestore: do not throttle op_queue in queue_op()
In parallel mode, queue_op is called while holding the journal lock, so it
is not okay to throttle there. Instead, t...
Sage Weil
12:25 PM Feature #620 (Resolved): objecter: (optionally) read from replica if on localhost and primary is not
This can either compare the ip address, or possibly have a netmask (set in g_conf) to determine 'locality' (where 255... Sage Weil
12:23 PM Feature #619 (Resolved): objecter: optionally read from replicas
Add a read flag to allow reads to come from a random replica. If a replica replies with EAGAIN, retry the request, b... Sage Weil
12:21 PM Feature #618 (Resolved): osd: allow reads from replicas
Allow osd to handle reads on a replica. If the replica is missing the object in question, reply with -EAGAIN to the ... Sage Weil
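Features #618 and #619 together describe a read that may go to a replica and is retried when the replica answers -EAGAIN. A rough sketch of that control flow is below. It is not the Objecter code: ReadFn and read_with_fallback are hypothetical names, and retrying against the primary is an assumption, since the ticket text is cut off before it says where the retry goes.

```cpp
#include <cassert>
#include <cerrno>
#include <functional>
#include <string>

// Hypothetical read callback: returns 0 and fills *out on success,
// or a negative errno value on failure.
using ReadFn = std::function<int(const std::string& oid, std::string* out)>;

// Try the replica first; if it reports -EAGAIN (for example because it
// does not have the object yet), retry once against the primary.
// Any other result from the replica is returned unchanged.
int read_with_fallback(const ReadFn& replica, const ReadFn& primary,
                       const std::string& oid, std::string* out) {
    assert(out != nullptr);
    const int rc = replica(oid, out);
    if (rc != -EAGAIN) {
        return rc;
    }
    return primary(oid, out);  // assumed retry target
}
```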
11:41 AM Bug #613 (Resolved): OSD crash: FAILED assert(recovery_oids.count(soid) == 0)
this was actually a problem with the debug sanity checks. fixed by commit:1b06332de69b332092d115451efbd29afec79269 Sage Weil
10:04 AM Tasks #617 (Resolved): cephfs needs a man page
Sage Weil
10:04 AM Tasks #616 (Rejected): radosacl needs a man page
Sage Weil
08:36 AM Bug #615 (Resolved): osd: improve op+journal throttling
Currently we block first, then take locks, then update the throttle accounting. This makes things racy, because a bu... Sage Weil
08:34 AM Bug #598 (Resolved): osd: journal reset in parallel mode acts weird
fixed as of commit:132f74c56064fdb3c47943679c48aa2a6b98f4eb, along with a ton of other related issues with the io que... Sage Weil
02:49 AM Revision 8003915b (ceph): Makefile: add bloom_filter.hpp to noinst_HEADERs
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
01:16 AM Revision 62075f34 (ceph): Makefile: Fix VPATH builds
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:41 AM Revision 0bcdc84a (ceph): osd: osd_types.h: const cleanup
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:40 AM Revision 7ee50add (ceph): osd: don't try to load a PG in a nonexistent pool
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
12:38 AM Revision 6ab17236 (ceph): filestore: simplify apply_transactions
Always use queue_transactions, even in no-journal case.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil

11/29/2010

11:52 PM Revision c9f864a0 (ceph): osd: PG::trim: fix inverted conditional in assert
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:12 PM Revision b2bcf4b3 (ceph): common: prevent infinite recursion on SIGSEGV
Install SIGSEGV / SIGABRT handlers with sigaction using SA_RESETHAND.
This will ensure that if the signal handler it...
Colin Patrick McCabe
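The SA_RESETHAND technique from revision b2bcf4b3 can be shown in isolation: the flag makes the kernel restore the default disposition as the handler is entered, so a second fault raised inside the handler is handled by the default action instead of re-entering it. The sketch below is not the Ceph handler; install_one_shot_handler, on_signal, and g_hits are hypothetical names.

```cpp
#include <signal.h>
#include <cassert>
#include <cstring>

// Counts handler invocations; sig_atomic_t is safe to touch in a handler.
static volatile sig_atomic_t g_hits = 0;

static void on_signal(int) { g_hits = g_hits + 1; }

// Install `handler` for `signum` so that it runs at most once: with
// SA_RESETHAND the disposition reverts to SIG_DFL on entry to the handler.
bool install_one_shot_handler(int signum, void (*handler)(int)) {
    assert(handler != nullptr);
    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND;
    return sigaction(signum, &sa, nullptr) == 0;
}
```

A real crash handler would call this for SIGSEGV and SIGABRT; trying it out with a harmless signal such as SIGUSR1 shows the one-shot behaviour without killing the process.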
10:12 PM Revision 85191813 (ceph): osd: Create pg_split test
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:35 PM Revision fb60e114 (ceph): logger: Fix a crash when the MDS shuts down cleanly.
We weren't holding the lock on the logger_timer before calling shutdown. Greg Farnum
09:35 PM Revision b4db4100 (ceph): Timer: add some asserts to catch certain errors.
Greg Farnum
08:56 PM Revision adbb5459 (ceph): osd: some notify simplifications and FIXMEs
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:56 PM Revision ec15c465 (ceph): osd: track unconnected_watchers and when they expire
- set up an initial expiration when we load the obc off disk
- remove expiration when we connect to an existing watch...
Sage Weil
08:55 PM Revision 376870fa (ceph): osd: add timeout to watch_info_t
Allow the watch timeout to be set on a per-watch basis. Still need to figure
out where that comes from.. the client? A...
Sage Weil
08:55 PM Revision 239c0a12 (ceph): rbd: fix version renaming
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:55 PM Revision b3051531 (ceph): osd: fix up WATCH
Separate various paths: registering new watch, reconnecting to existing
watch, removing watch, etc.
Signed-off-by: S...
Sage Weil
08:55 PM Revision 2563905b (ceph): osd: some cleanup
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:55 PM Revision b722662e (ceph): osd: use pg_t to find PG's again
The ceph_object_layout is approaching obsolete. Also, use a more general
lookup_lock_raw_pg() helper that doesn't ta...
Sage Weil
08:54 PM Revision a61f6b5e (ceph): osd: add missing Watch.cc
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:54 PM Revision 0e62c421 (ceph): osdc: spell out version
Cosmetic
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
08:51 PM Revision 15ffbc8d (ceph): makefile: add missing MWatchNotify.h
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:50 PM Revision 4dca64b2 (ceph): osd: drop unused fields
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:18 PM Revision 463d624d (ceph): Makefile: Add --as-needed to LDFLAGS
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
07:51 PM Revision a77eb6bd (ceph): vstart.sh: don't specify journaling mode
Let the autodetection kick in, or let the dev specify via -o '...'.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
07:41 PM Revision e0b927b2 (ceph): osd: PG::trim: add assert
Assert that we're not trimming the PG log past last_complete.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe
05:48 PM Revision 756918be (ceph): osd: _process_pg_info: add assert for replicas
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
05:06 PM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
Fred, can you see if this reproduces on the latest unstable? Thanks.
-C
Colin McCabe
11:14 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
I added the PG::trim assert. It seems to cause problems immediately with test_unfound.sh.
The plot thickens...
Colin McCabe
10:36 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
Argh yeah I was all wrong here. The recovery code looks ok.. I think the problem is that _before_ this the log was t... Sage Weil
09:21 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
> The replicas only ever get messages from the primary, and the primary
> sends a log to activate. Never anything e...
Colin McCabe
04:51 PM Bug #614: SEGV loop on _open_lock_pg after rmpool
Er, by that I mean:
load_pgs shouldn't try to load a PG that is in a nonexistent pool. This could only happen aft...
Colin McCabe
04:49 PM Bug #614 (Resolved): SEGV loop on _open_lock_pg after rmpool
In OSD::load_pgs, we weren't checking to make sure that the pool existed when going through all the collections.
F...
Colin McCabe
02:23 PM Bug #614 (Resolved): SEGV loop on _open_lock_pg after rmpool
discovered my cosd processes at 100%, possibly following some "rados rmpool" commands to delete some pools. Stopped ... John Leach
04:41 PM Bug #598: osd: journal reset in parallel mode acts weird
bunch of problems here, not all related to a full journal. Sage Weil
12:18 PM Feature #568 (Resolved): debian: build with --as-needed?
Implemented!
before:
cmccabe@flab:~/src/ceph2/src$ ldd .libs/rados
linux-vdso.so.1 => (0x00007fff4eff...
Colin McCabe
11:13 AM Bug #575 (Resolved): monmaptool terminates when input file is not a monmap
Samuel Just
10:49 AM Bug #479 (Can't reproduce): ceph/mount crash badly when writing
Sage Weil
10:15 AM CephFS Subtask #547 (Resolved): mds: define fsck strategy, required metadata
Sage Weil
10:13 AM CephFS Bug #594: mds: frag split/merge vs replay
needs to be fixed in 0.24, or g_conf.mds_frag needs to be disabled. Sage Weil
10:06 AM Bug #595 (Won't Fix): Autogen: not a literal
seems to go away with latest automake Sage Weil
07:12 AM Bug #613 (Resolved): OSD crash: FAILED assert(recovery_oids.count(soid) == 0)
I'm running a script that reads and writes random objects using librados (creating a new pool once in a while). Runn... John Leach

11/25/2010

07:36 AM Revision 3ab60091 (ceph): osd: dump_missing: also dump missing_loc
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
07:35 AM Revision da087e47 (ceph): osd: discover_all_missing fix
Don't request information from an OSD unless it is up and part of the
might_have_unfound set. Add more logging.
Sign...
Colin Patrick McCabe
12:18 AM Bug #611: OSD: OSDMap::get_cluster_inst
commit:da087e47c21190f9cbde4d24182b7dfe581cd069 should resolve this Colin McCabe

11/24/2010

10:54 PM Bug #611: OSD: OSDMap::get_cluster_inst
I'll take a look Colin McCabe
10:18 PM Bug #611: OSD: OSDMap::get_cluster_inst
Okay, I somehow commented/set this bug backwards with another one. Whoops, sorry guys!
This looks like the OSD is as...
Greg Farnum
10:38 AM Bug #611: OSD: OSDMap::get_cluster_inst
Sam said he'd look at this since it's in the background scrubbing bits that he and Josh did. Greg Farnum
05:11 AM Bug #611 (Resolved): OSD: OSDMap::get_cluster_inst
After upgrading to the latest unstable, one OSD crashed. Before the upgrade, 10 of the 12 OSDs were online.
When ...
Wido den Hollander
10:18 PM Bug #612: OSD: Crash during auto scrub
Dunno how, but somehow commented/assigned this and another bug backwards. Meant to say:
Sam said he'd look at this s...
Greg Farnum
10:38 AM Bug #612: OSD: Crash during auto scrub
This looks like the OSD is assembling a list of missing queries and then sending them out without bothering to check ... Greg Farnum
05:28 AM Bug #612 (Resolved): OSD: Crash during auto scrub
After I saw #611 my cluster started to crash. One after the other, the OSDs started to go down, all with a message a... Wido den Hollander
10:09 PM Feature #453 (Resolved): osd: return error (instead of blocking) on lost objects
It's passing the lost1 and lost2 unit tests now. Colin McCabe
09:41 PM rgw Bug #353: Handle non-ascii filenames
Yeah, I agree with Amazon's approach here. UTF-8 makes sense. I think we could continue to use std::string internally... Colin McCabe
02:03 AM Revision d6e8e8d1 (ceph): gui: some cleanup
Rather than vectors of pointers, use vectors of NodeInfo structures.
This avoids the problem of freeing the NodeInfo ...
Colin Patrick McCabe
12:56 AM Revision 1b1e040e (ceph): osd: add a map for lingering messages
Yehuda Sadeh
12:55 AM Revision 99e1e4de (ceph): librados: assert_version on sync operations
Yehuda Sadeh
12:55 AM Revision c4b97953 (ceph): librados: last_objver is set on the pool, and not per thread
Yehuda Sadeh
12:55 AM Revision 454ea06e (ceph): rbd: notify about header changes
Yehuda Sadeh
12:55 AM Revision 520b523b (ceph): librados: fix unnecessary locking
Yehuda Sadeh
12:55 AM Revision 4c8bdc53 (ceph): osd: don't notify notifier
Yehuda Sadeh
12:54 AM Revision a76de3b2 (ceph): librados: complete C interface for watch/notify
Yehuda Sadeh
12:54 AM Revision 38c8e383 (ceph): librados: rename cookie to handle in api
Yehuda Sadeh
12:54 AM Revision 2954799a (ceph): librados: notify waits for completion
Yehuda Sadeh
12:50 AM Revision e7184e6d (ceph): librados: start implementing watch/notify
Yehuda Sadeh
12:50 AM Revision a4864bd8 (ceph): librados: enable object versioning
Yehuda Sadeh
12:50 AM Revision f36677f8 (ceph): librados: update C api
Yehuda Sadeh
12:49 AM Revision f8af4f2c (ceph): osd: add watch/notify timeout
Yehuda Sadeh
12:49 AM Revision cc62f2eb (ceph): osd: fix bad mutex lock
Yehuda Sadeh
12:49 AM Revision e0c548ad (ceph): osd: fix ms_handle_reset
Yehuda Sadeh
12:49 AM Revision d5cc6732 (ceph): osd: some notify related cleanups
Yehuda Sadeh
12:49 AM Revision 7272bfec (ceph): osd: send notify response from reset handler if needed
Yehuda Sadeh
12:49 AM Revision d66b52e1 (ceph): osd: watch infrastructure
third attempt Yehuda Sadeh
12:49 AM Revision 2b5e61ca (ceph): osd: send notification id
Yehuda Sadeh
12:49 AM Revision 59e61d0e (ceph): osd: discard of disconnected watchers
still need to add a timeout Yehuda Sadeh
12:49 AM Revision f5f33822 (ceph): osd: send notify reply if there are not watchers
Yehuda Sadeh
12:49 AM Revision 9437ea84 (ceph): osd: add user_version field in object_info_t
Yehuda Sadeh
12:49 AM Revision 7bda45a1 (ceph): osd: reply with either user_version or at_version, depends on the op
Yehuda Sadeh
12:49 AM Revision f7b7d67a (ceph): osd: check requested watch version number
send appropriate status code if needed Yehuda Sadeh
12:47 AM Revision 2bce34e7 (ceph): osd: handle watch op, register client on object xattr
Yehuda Sadeh
12:47 AM Revision 3110e361 (ceph): osd: basic watch/notify handling
Yehuda Sadeh
12:47 AM Revision e493c7ae (ceph): osd: handle notify-ack
Yehuda Sadeh

11/23/2010

11:39 PM Revision 2f13dd8e (ceph): gui: more reindenting
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
11:37 PM Revision 66a78c23 (ceph): gui: reindent a bunch of code
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
10:40 PM Revision d8652de6 (ceph): mdcache: in trim_non_auth, only print out path if it has a parent dentry.
This should only occur with the root inode, but caused a segfault for
anybody running more than one MDS who restarted...
Greg Farnum
10:04 PM Revision 8768b52d (ceph): mds: Reply checking_lock while reading filelock
Use checking_lock to replace lock_state in the extra buffer list so that the client can get the correct file lock reply. Herb Shiu
09:59 PM Revision 4041bf0d (ceph): mds: fix set_state_rejoin auth_pin check
We carry an auth pin IFF !stable AND auth.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:59 PM Revision 5ed06ffc (ceph): client: remove inode from flush_caps list when auth_cap changes
Avoid confusing other code (e.g. kick_flushing_caps) by staying on the mds
flushing_caps list when we don't even have...
Sage Weil
09:52 PM Revision 285cc946 (ceph): osd: fix is_all_uptodate()
This should only return true when recovery is done, i.e., no more missing
objects. Nothing to do with unfound.
Sign...
Sage Weil
09:52 PM Revision 36f703e1 (ceph): osd: removing unused variable, fix warning
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:52 PM Revision 413ecb0b (ceph): osd: only search_for_missing if there are unfound objects
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:52 PM Revision 671b1c09 (ceph): osd: add get_num_unfound() helper
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:52 PM Revision 7ea7a435 (ceph): osd: only discover_all_missing if unfound
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:52 PM Revision 5452dae6 (ceph): osd: recover_primary() until primary has all found objects
The logic in that if was effectively reversed.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:52 PM Revision 5498c467 (ceph): osd: fix recover_replicas() unfound check
missing_loc.count(soid) == 0 only means unfound if it's not missing on the
primary.
Signed-off-by: Sage Weil <sage@n...
Sage Weil
09:52 PM Revision e97eae15 (ceph): init-ceph: tolerate failure in cleanalllogs
Otherwise /var/log/ceph/stat makes rm -f error out and we fail.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:52 PM Revision 84612286 (ceph): Build might_have_unfound set at activation
The might_have_unfound set is used by the primary OSD during recovery.
This set tracks the OSDs which might have unfo...
Colin Patrick McCabe
09:52 PM Revision 0e15da8d (ceph): Rename peer_summary_requested to peer_backlog_req
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:52 PM Revision c0c301d5 (ceph): osd: PG::read_log: don't be clever with lost xattr
Formerly, we had a special case in read_log for dealing with objects
whose objects were present on the disk, but not ...
Colin Patrick McCabe
09:52 PM Revision 55570baf (ceph): osd: fix PG::is_all_uptodate
In PG::is_all_uptodate, don't try to look for peer_missing[osd->whoami].
The primary keeps that in PG::missing!
Sign...
Colin Patrick McCabe
08:26 PM Revision 36c6569c (ceph): monmaptool: Return a non-zero error code and print a useful error
message if unable to read the monmap file.
Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
Samuel Just
06:14 PM Feature #610 (Resolved): gui: make PG view prettier
The ceph -g GUI should display PGs in a list, rather than as icons that have to be clicked on. We should get rid of t... Colin McCabe
06:13 PM Bug #604 (Resolved): Compiler warning: 'status' may be used uninitialized in this function
Fixed by commit:d6e8e8d15d22b51ec86bc5687336c3d50d9b3a5d
We should change PG view on the GUI to be a list view at ...
Colin McCabe
05:43 PM Revision fc212548 (ceph): mds: allow for old fs's with stray instead of stray0
New fs's get stray0, but we want to still behave with old ones.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:37 PM Revision de61991a (ceph): Merge branch 'testing' into unstable
Conflicts:
configure.ac
Sage Weil
03:00 PM Bug #531: Journaling Causes System Hang
Awesome, thanks for the help. I will give these patches a shot towards the end of the week.
Thanks
Bryan Tong
02:43 PM Bug #599 (Resolved): recover_master_log, doesn't
There were two problems here:
1) we were restarting the osds before the monitors, which in this case prevented a f...
Colin McCabe
02:01 PM Linux kernel client Bug #552: Samba with kernel oplocks=on produces lots of corrupt mds entries in dmesg
Our friends at Tcloud just submitted patches for this today, which I've applied to the unstable branch of our kernel ... Greg Farnum
11:46 AM CephFS Feature #593 (Rejected): mds: fsck: anchor table repair
dup Sage Weil
11:42 AM Feature #609 (Resolved): osd: query pool/pg for objects with given xattr
This will probably take the form of a pool class plugin?
It could start as just a hack, for now.
Sage Weil
11:03 AM Bug #595: Autogen: not a literal
This problem does not seem to occur using 2.68 on my local machine. Slider et al. seem to be using 2.67. Samuel Just
09:39 AM CephFS Bug #608 (Resolved): mds: MDCache::create_system_inode()
this should be fixed by commit:fc212548aea1d7f001b56ba096a79ba54b8a92c3
Thanks!
Sage Weil
07:09 AM CephFS Bug #608 (Resolved): mds: MDCache::create_system_inode()
On a small test cluster I saw that my MDS was not coming up after a fresh mkcephfs, this is what the log showed:
<...
Wido den Hollander
09:33 AM Tasks #584: do throughput scaling tests on sepia
What was the variance in per-node throughput? Did we have one node dominating? Greg Farnum
09:22 AM Tasks #584 (In Progress): do throughput scaling tests on sepia
There's definitely a problem here; the total throughput should be scaling more or less linearly until we hit a bottle... Sage Weil
07:44 AM Bug #563: osd: btrfs, warning at inode.c ( btrfs_orphan_commit_root )
I'll have to rebuild, since I didn't look at the messages that closely. Wido den Hollander
07:02 AM Revision 868665d5 (ceph): v0.23.1
Sage Weil
06:41 AM Revision c327c6a2 (ceph): mon: always use send_reply for auth replies
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:41 AM Revision 61dd4f03 (ceph): mon: simplify send_reply code
No need to specify destination in send_reply, as we always have the request
for reference.
Simplify MRoute construct...
Sage Weil
01:37 AM Revision 2c71bd33 (ceph): osd: add assert to _process_pg_info
When activating an inactive replica, assert that we are doing so based
on a message from the primary.
Signed-off-by:...
Colin Patrick McCabe
01:35 AM Revision a70943fd (ceph): osd: re-indent some code in _process_pg_info
Re-indent the code and add a comment.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe
12:12 AM Revision 71369541 (ceph): msgr: tolerate 0 bytes from tcp_read_nonblocking
This can happen, I believe, when we get a signal or something.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:12 AM Revision 7ec0034b (ceph): init-ceph: fix (and test!) cleanlogs and cleanalllogs
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:03 AM Revision 7b4a801f (ceph): mds: fix rejoin_scour_survivor_replicas inode check
We want to remove replicas that we don't ack, but those don't appear in
the strong_inode map; they're appended to the...
Sage Weil

11/22/2010

11:08 PM Revision 8d95b5b6 (ceph): messenger: init rc to -1, removing compiler warning.
This actually is initialized before all uses, but compilers tend to
have trouble with assignment in if-else branches,...
Greg Farnum
11:08 PM Revision dd11fe27 (ceph): types: Allow inodeno_t structs to alias.
This removes a compiler warning that appeared in a gcc upgrade and
is apparently erroneous, about its usage violating...
Greg Farnum
10:56 PM Bug #540 (Resolved): CephxClientHandler::handle_response
couldn't reproduce this, but fixed two smallish things that may have been responsible for this:
commit:61dd4f03e6e15...
Sage Weil
10:35 PM Linux kernel client Bug #552: Samba with kernel oplocks=on produces lots of corrupt mds entries in dmesg
From the reply dump, it looks like a ceph_mds_reply_head, a length 0 tracebl, a length 1 extrabl (containing a u8 == ... Sage Weil
09:25 PM Revision ac6b018a (ceph): Causes the MDSes to switch among a set of stray directories when
switching to a new journal segment.
MDSCache:
The stray member has been replaced with strays, an array of inodes
r...
Samuel Just
09:16 PM Revision 3f8f5905 (ceph): Timer must be initialized in Client::init and shutdown in
Client::shutdown.
Signed-off-by: Samuel Just <samuelj@hq.newdream.net>
Samuel Just
06:47 PM Revision 8eb4de9e (ceph): generate_past_intervals:generate back to lastclean
PG::generate_past_intervals needs to generate all the intervals back to
history.last_epoch_clean, rather than just to...
Colin Patrick McCabe
06:07 PM Revision 80f28235 (ceph): vstart.sh: 'init-ceph stop' instead of 'stop.sh'
This just makes it easier to run multiple vstart sessions as the same user
on the same host.
Signed-off-by: Sage Wei...
Sage Weil
05:55 PM Revision 53d0650a (ceph): Merge branch 'osd_msgr' into unstable
Sage Weil
05:55 PM Revision cd53719f (ceph): mds: resolve cleanup
Only track ambiguous imports and such if we get a resolve message while in
the resolve state.
Signed-off-by: Sage We...
Sage Weil
05:55 PM Revision c0c81d53 (ceph): mds: trim exported subtree _after_ adjusting auth
We need to set the subtree bounds before trimming it away, or else we may
throw out things we're still auth for.
Sig...
Sage Weil
05:55 PM Revision 9e15ade8 (ceph): mds: do not eval subtree root when replay|resolve
This is nonsensical. And can lead to scatter_writebehind, which breaks
horribly.
Signed-off-by: Sage Weil <sage@new...
Sage Weil
05:55 PM Revision 27c6f217 (ceph): mds: remove bogus assert
Causes problems during resolve finish.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:49 PM Revision 924b1fcb (ceph): osd: bind to new cluster address when wrongly marked down
If we come back up on the same address, there is a possible race. Other
nodes will mark_down when they see us go dow...
Sage Weil
05:45 PM Revision 19409763 (ceph): msgr: implement rebind() to pick a new port
Closes out all old connections and binds to a _different_ port. This
ensures that someone doing mark_down on our old...
Sage Weil
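
The rebind() idea described above (drop the old endpoint and come back on a _different_ port, so that peers still doing mark_down on the old address cannot hit the new one) can be illustrated with plain sockets. This is only a sketch under assumptions: the function name `rebind` mirrors the commit title, but the signature, the loopback default, and the use of Python's `socket` module are illustrative and are not Ceph's messenger code.

```python
import socket


def rebind(old_sock: socket.socket, host: str = "127.0.0.1") -> socket.socket:
    """Close the old listening socket and listen on a different port.

    Hypothetical helper for illustration; not the Ceph messenger API.
    """
    old_port = old_sock.getsockname()[1]
    old_sock.close()  # closes the old listening endpoint
    while True:
        new_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        new_sock.bind((host, 0))  # port 0: let the OS pick a free port
        if new_sock.getsockname()[1] != old_port:
            new_sock.listen()
            return new_sock
        new_sock.close()  # the OS handed back the old port; try again
```

A caller would then advertise the new address (here, `new_sock.getsockname()`) to its peers, which is what distinguishes the restarted endpoint from the one that was marked down.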
05:09 PM Revision f7170f95 (ceph): client: only encode_cap_releases once per request.
Accomplish this by making a list of cap releases in the (permanent)
MetaRequest, and then copying that into the (pote...
Greg Farnum
04:36 PM Bug #607 (Rejected): osd: ReplicatedPG: sub_op_modify: fix creation of ObjectState
There's a part of the ReplicatedPG::sub_op_modify code that goes like this:
> // do op
> ObjectStat...
Colin McCabe
04:29 PM CephFS Feature #91: mds: up:shadow mode
Updated Journaler to make new interface options asynchronous.
Presently working on how to disambiguate between a one...
Greg Farnum
03:48 PM Tasks #584 (Resolved): do throughput scaling tests on sepia
Results of running rados -p bench bench 20 write on <Nodes>. <Average Throughput> is the average of the Bandwidth st... Samuel Just
01:24 PM CephFS Feature #88 (Resolved): mds: change stray commit strategy to avoid rolling stray dir commits
commit:ac6b018acbeaf8670f8c268db164cfb8a12c171d Sage Weil
12:59 PM Bug #563: osd: btrfs, warning at inode.c ( btrfs_orphan_commit_root )
Is the stack trace you're getting now identical, or different? The FileStore.cc change _should_ have avoided the asy... Sage Weil
09:28 AM Bug #563: osd: btrfs, warning at inode.c ( btrfs_orphan_commit_root )
Just to update the issue, Sage asked me to change something in FileStore.cc, tried that for some days, but that didn'... Wido den Hollander
12:47 PM CephFS Feature #606 (Duplicate): mds: optionally store parent attr on file objects
The goal is to be able to find files contained in rebuilt directories (#603). We can store the same attrs we do for ... Sage Weil
12:45 PM CephFS Feature #605 (Rejected): mds: verify/repair anchor table
- Make sure every item we encounter while traversing the that is anchored correctly appears in the anchor table.
- M...
Sage Weil
12:44 PM Bug #604 (Resolved): Compiler warning: 'status' may be used uninitialized in this function
In gui.cc
The warning's location references are a bit off, but the function gen_node_info_from_icons declares a "sta...
Greg Farnum
12:43 PM CephFS Feature #603 (Resolved): mds: repair directory hierarchy
The goals are
- rebuild missing/corrupt directories
- repair multiple primary links to directories
We'll do so...
Sage Weil
12:40 PM CephFS Feature #602 (Resolved): mds: handle corrupt/missing journals
This probably means
- shutting down current instances, resetting cluster membership
- throwing out journals (or m...
Sage Weil
12:37 PM CephFS Feature #601 (New): mds: order directory commits after rename
When we rename something between directories, we should try to commit the target directory _before_ the source direct... Sage Weil
12:34 PM CephFS Feature #600 (Resolved): mds: store full trace on directories
Currently we only store the immediate parent; store a full trace up to the root. This is CInode::encode_parent_mutat... Sage Weil
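
As a rough illustration of the difference between storing only the immediate parent and storing a full trace up to the root, the sketch below walks parent links and collects every (parent, dentry name) pair. The `parents` mapping and the `full_trace` name are hypothetical and assume an acyclic hierarchy; this is not the CInode encoding itself.

```python
from typing import Dict, List, Tuple


def full_trace(ino: int, parents: Dict[int, Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Return (parent_ino, dentry_name) pairs from `ino` up to the root.

    `parents` maps an inode number to its immediate parent and dentry name;
    the root has no entry. Assumes the hierarchy contains no cycles.
    """
    trace = []
    while ino in parents:
        parent, name = parents[ino]
        trace.append((parent, name))
        ino = parent
    return trace
```

With only the immediate parent stored, a rebuild needs every ancestor's own record to be intact; with the full trace, a single surviving record is enough to recover the whole path.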
12:17 PM Bug #599: recover_master_log, doesn't
Also, I have verified that osd3 and osd9 did NOT crash. They're still running, and they did receive the messages from... Colin McCabe
12:13 PM Bug #599 (Resolved): recover_master_log, doesn't
This is another peering bug. We found it on wido's cluster. Basically, peering never completes.
I just examined PG...
Colin McCabe
09:52 AM Bug #592 (Resolved): osd: rebind cluster_messenger when wrongly marked down
commit:53d0650a42cbfd2f02db2c708a570b6d9e116bb4 Sage Weil
09:14 AM CephFS Bug #596 (Resolved): crash during mds reconnect
Well, that seems to fix it. I added a releases vector to the MetaRequest so it will only encode the releases once, and... Greg Farnum
08:49 AM Bug #598 (Resolved): osd: journal reset in parallel mode acts weird
from ML:... Sage Weil
04:52 AM Revision 51abcaa2 (ceph): mon: clean up cluster_addr code a bit, better debug output
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:52 AM Revision 20313644 (ceph): osdmap: fix cluster_addr encoding; printing
The cluster addrs were getting lost because we were checking v instead of
ev.
Signed-off-by: Sage Weil <sage@newdrea...
Sage Weil
04:52 AM Revision 28498a00 (ceph): osd: send correct ip addrs to monitor for cluster_, hb_addr
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
03:59 AM Revision ec434eda (ceph): osd: unconditionally set up separate msgr instance for osd<->osd msgs
Always set up cluster_messenger (before we would only do so if there was
an explicit address configured for it). The...
Sage Weil
12:16 AM Revision 0dddf453 (ceph): filestore: only warn about disk write cache on kernels <2.6.33
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:15 AM Revision 0856f57e (ceph): osd: fix search_for_missing: old last_update implies object not present
For example, if an osd sends an empty PG::Info (last_update = 0'0) and
empty missing, we should not conclude that the...
Sage Weil
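
The rule in the commit title (an old last_update implies the object is not present) can be sketched as a small predicate. Everything here is an assumption made for illustration: versions are modelled as (epoch, version) tuples so that 0'0 becomes (0, 0), and `peer_may_have` and its parameters are hypothetical names, not the actual search_for_missing code.

```python
from typing import Set, Tuple

Eversion = Tuple[int, int]  # (epoch, version); 0'0 is modelled as (0, 0)


def peer_may_have(need: Eversion, peer_last_update: Eversion,
                  peer_missing: Set[str], oid: str) -> bool:
    """Decide whether a peer can be treated as a source for `oid`.

    An empty missing set alone proves nothing: if the peer's last_update
    is older than the version we need, it cannot hold that version.
    """
    if peer_last_update < need:
        return False
    return oid not in peer_missing
```

For example, a peer reporting last_update (0, 0) with an empty missing set is rejected for any object whose needed version is newer than (0, 0).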
12:09 AM Revision 6ef5c2f3 (ceph): init-ceph: fix cleanlogs for no log_sym_dir case
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil

11/21/2010

07:55 PM Linux kernel client Bug #549 (Resolved): bonnie++ file stat failure
commit:3105c19c450ac7c18ab28c19d364b588767261b3 Sage Weil
03:50 PM Bug #592: osd: rebind cluster_messenger when wrongly marked down
I think the cleanest solution here is to re-bind the cluster_messenger to a new port when we are marked down and go b... Sage Weil
03:38 PM Linux kernel client Bug #597 (Closed): Reproducible crash mounting multiple directories from a pool
This bug was fixed in v2.6.36, commit:ca04d9c3ec721e474f00992efc1b1afb625507f5. Thanks for the report though! :) Sage Weil
03:34 PM Linux kernel client Bug #597: Reproducible crash mounting multiple directories from a pool
Should have mentioned - this is with the Ubuntu 10.10 desktop kernel, which is 2.6.35-22, I think. Ravi Pinjala
03:33 PM Linux kernel client Bug #597 (Closed): Reproducible crash mounting multiple directories from a pool
When trying to mount a pool multiple times (with different subdirectories) I get a consistent system hang.
Steps t...
Ravi Pinjala

11/20/2010

05:06 PM Bug #531: Journaling Causes System Hang
Please try out the patches in the filestore_throttle branch, commit:b28c0bf82ac28ded4fe85573d32fdc111c66e50b
It lo...
Sage Weil
03:15 AM Revision fc9b0976 (ceph): OSDMap: const cleanup
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
03:14 AM Revision 2a5c3893 (ceph): mds-dumper: Define Dumper::~Dumper()
To fix compile error.
Signed-off-by: Colin McCabe <colinm@hq.newdream.net>
Colin Patrick McCabe

11/19/2010

10:21 PM Revision 8566c5cd (ceph): ReplicatedPG::pull: fix test for unfound
The test for unfound objects was reversed, leading us to try to pull
unfound objects and refrain from pulling objects...
Colin Patrick McCabe
09:41 PM Revision 2f5502fa (ceph): osdmap: fix printing, again
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:21 PM CephFS Bug #596: crash during mds reconnect
The encode_cap_releases can only be called _once_, the very first time we send the request. So at some level this is... Sage Weil
04:22 PM CephFS Bug #596 (Resolved): crash during mds reconnect
While testing my Journaler changes, I got a cfuse segfault. My steps:
vstart with 1 of each daemon
mount cfuse
cop...
Greg Farnum
06:17 PM Revision 4303820b (ceph): Merge remote branch 'origin/mds' into unstable
Sage Weil
04:26 PM CephFS Feature #91 (In Progress): mds: up:shadow mode
I've been getting some proper time in on this on and off over the last few days. Pushed the Journaler changes to the ... Greg Farnum
03:52 PM Bug #531: Journaling Causes System Hang
Okay,
More updates.
1) All the VMs deployed okay but it looks like towards the end of the deployments I hit the...
Bryan Tong
02:49 PM Bug #531: Journaling Causes System Hang
Okay,
I just started the deployment of 12 vms on a new cephfs with 3 osds in and ssd's for journals on all the sys...
Bryan Tong
02:37 PM Bug #531: Journaling Causes System Hang
I am working on getting the output now. We are having to work on several projects at once right now. Sorry for the de... Bryan Tong
03:36 PM Bug #595 (Won't Fix): Autogen: not a literal
We get this running on autoconf 2.67:
configure.ac:6: warning: AC_INIT: not a literal: Sage Weil <sage@newdream.net>...
Greg Farnum
02:29 PM CephFS Bug #594 (Resolved): mds: frag split/merge vs replay
Need to reconcile refragmenting with resolve stage. Currently handle_resolve assumes frags match, when in reality th... Sage Weil
12:11 PM Bug #585 (Resolved): OSD: ReplicatedPG::pull
Fixed by commit:82f1de8c0d6e7817ca7d6dd710e3176b2a549e12 Colin McCabe
10:43 AM Bug #585 (In Progress): OSD: ReplicatedPG::pull
need to see what's going on with this Colin McCabe
11:47 AM Bug #503 (Closed): osd: query osds since last_epoch_clean before concluding objects lost?
Sage Weil
11:39 AM Bug #515 (Can't reproduce): osd: recovery isn't completing
with the recent changes i'm closing this one out, and reopening with specifics if it comes up in testing over the nex... Sage Weil
10:14 AM CephFS Feature #545 (Resolved): mds: use bloom filter to supplement dirfrag COMPLETE flag
merged commit:4303820b43721a8b46ef36d0e9ef4e1167857c80 Sage Weil
09:38 AM CephFS Feature #593 (Rejected): mds: fsck: anchor table repair
We need to be able to fix up the anchor table when there are problems, to avoid e.g.... Sage Weil
05:13 AM Revision b91e14e1 (ceph): multi-dump.sh: add diff mode
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
04:57 AM Revision 9cab522e (ceph): Add multi-dump.sh
This is a debug tool that can dump out Ceph information at various
epochs. For instance, it can show how the OSDmap c...
Colin Patrick McCabe

11/18/2010

11:05 PM Revision 6e2b594b (ceph): ReplicatedPG::get_object_context: fix broken calls
ReplicatedPG::get_object_context takes three parameters. The last two
are "const object_locator_t& oloc" and "bool c...
Colin Patrick McCabe
09:50 PM Bug #592: osd: rebind cluster_messenger when wrongly marked down
Ah. Looks like you got it figured out.
I wasn't aware of what mark_down did.
Just in case anyone finds it useful...
Colin McCabe
09:22 PM Bug #592: osd: rebind cluster_messenger when wrongly marked down
ok, this is a problem with how the osd is interacting with the messenger. looking at the history of 0.5, we see
<pr...
Sage Weil
08:42 PM Bug #592: osd: rebind cluster_messenger when wrongly marked down
i suspect 0.5 didn't get set up on osd1 or 2 before osd0 went down? do you have the full logs for the other instances? Sage Weil
05:07 PM Bug #592: osd: rebind cluster_messenger when wrongly marked down
I should also add that Greg Farnum helped me examine the logs for this bug. Colin McCabe
05:03 PM Bug #592 (Resolved): osd: rebind cluster_messenger when wrongly marked down
This happened with commit:323565343071ce695f7d454ed29590688de64d5d on flab.ceph.dreamhost.com
While running test_u...
Colin McCabe
08:50 PM Revision 43e0b267 (ceph): ReplicatedPG: call finish_recovery when needed
Don't loop in ReplicatedPG::start_recovery_ops. There is already a loop
in both recover_replicas and recover_primary ...
Colin Patrick McCabe
08:33 PM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
Colin McCabe wrote:
> Another potential issue that I can see here is that the code in OSD::_process_pg_info doesn't ...
Sage Weil
12:43 PM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
Another potential issue that I can see here is that the code in OSD::_process_pg_info doesn't check whether it got a ... Colin McCabe
09:26 AM Bug #590: osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
Need to look at this more closely. Fred, pretty sure no data is lost here, but the recovery code needs some fixing.
...
Sage Weil
06:19 AM Bug #590 (Resolved): osd/PG.cc:1645: FAILED assert(info.last_complete >= log.tail || log.backlog)
After upgrading to ceph 0.23, the cluster (3 osd, 3 mon, 3 non-clustered mds) worked for about 2 hours and then one c... ar Fred
06:09 PM Revision ea5d1d66 (ceph): osd_resurrection_1_impl: turn on recovery at end
Signed-off-by: Colin McCabe <colinm@hq.newdream.net> Colin Patrick McCabe
09:47 AM Feature #526 (Resolved): osd: unfound objects rework
We now let the PG become active even when there are unfound objects. When the user tries to read one of those objects... Colin McCabe
07:39 AM Linux kernel client Feature #591 (Resolved): implement FALLOC_FL_PUNCH_HOLE
Sage Weil
12:52 AM Revision 4adfdee7 (ceph): Makefile: fix builddir weirdness
Signed-off-by: Jim Schutt <jaschut@sandia.gov> Jim Schutt
12:10 AM Bug #585: OSD: ReplicatedPG::pull
Well, it did show up again:... Wido den Hollander

11/17/2010

10:37 PM Revision 7e9812b4 (ceph): osd: rev PG::Info encoding for last_epoch_clean change
This was missed by 184fbf582b27c10b47101735a4495fe8c73ad186, so any fs
created between now and then won't decode prop...
Sage Weil
09:06 PM Revision c17e7da4 (ceph): Merge branch 'mds_frags' into unstable
Sage Weil
09:06 PM Revision 7f6a2561 (ceph): mds: clear PIN_SUBTREE on split/merge in purge_strays
This makes the helper work for merge as well as split. Remove the special
fixups in the caller that were making spli...
Sage Weil
09:06 PM Revision 66d43ac8 (ceph): mds: fix subtree map update on dirfrag merge
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:06 PM Revision b705be11 (ceph): mds: wrlock scatterlocks to prevent a gather racing with split/merge lo...
We have the dirs split in our cache for some time while journaling it to
disk, before the fragment_notify goes out. ...
Sage Weil
09:06 PM Revision f6823a79 (ceph): mds: adjust dir_auth_pins on steal_dentry
dir_auth_pins is a counter of dentry auth_pins in the current dir; those
need to be added in when stealing.
Signed-o...
Sage Weil
09:06 PM Revision cd5ee006 (ceph): mds: initialize PIN_SUBTREE on split
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:06 PM Revision d538817f (ceph): mds: flush log on fragment
This makes request lock auth_pins expire, so the fragment moves along.
Otherwise we can end up waiting for the log fl...
Sage Weil
09:06 PM Revision 3777ff8a (ceph): mds: move dirty rstat inodes to new dir on refragment
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:06 PM Revision 669b5544 (ceph): mds: don't complete freeze while parent inode is frozen
This makes maybe_finish_freeze() conditions match that of is_freezeable()
and avoids an assert.
Signed-off-by: Sage ...
Sage Weil
09:04 PM Revision b58b8d09 (ceph): mds: fix discover requests, tracking wrt fragments
Track discover requests by tid. The old system of tracking outstanding
discovers was kludgey and somewhat broken. A...
Sage Weil
09:02 PM Revision a63c06c8 (ceph): mds: fix EFragment replay
If the inode already exists in our cache, adjust our (existing) fragments.
But it might not. In that case, we just r...
Sage Weil
09:02 PM Revision a961049b (ceph): mds: don't fragment mdsdir or .ceph
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:48 PM Revision b54880e0 (ceph): Detect broken system linux/fiemap.h
RedHat 5.5 has a /usr/include/linux/fiemap.h, but it is
broken because it does not itself include linux/types.h.
As a...
Jim Schutt
06:24 PM Revision 29a9e668 (ceph): osdmap: don't include blacklist info in summary
It's confusing users and isn't that important.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:58 PM Revision c43455ce (ceph): client: Remove the I_COMPLETE flag from the parent directory in relink_...
This papers over issues arising from the client's lack of proper support
for hard links, and lets it pass the snaptes...
Greg Farnum
02:35 PM Bug #589 (Resolved): OSD: crash on startup, PG::read_state
Ok, this is fixed by commit:7e9812b4a9bbf320a8b0bd0abec48c1c5d78fe66. Assuming your fs is old enough you should be o... Sage Weil
11:38 AM Bug #589 (Resolved): OSD: crash on startup, PG::read_state
After upgrading to today's unstable all my OSDs crashed directly after startup, for example osd0:
Last loglines a...
Wido den Hollander
12:56 PM Bug #531: Journaling Causes System Hang
Just pinging you on this one. If you can send the logs I'd like to sort this out. Thanks! Sage Weil
09:59 AM CephFS Bug #344: cfuse should pass all qa tests
At this point the only test it's failing is bonnie. This one tends to fail on a SEGV that just keeps going through th... Greg Farnum
09:57 AM CephFS Bug #583 (Resolved): cfuse fails snaptest-upchildrealms
Okay, a proper fix for this is going to require a bit of work, since right now Inodes can only have one parent dentry... Greg Farnum
09:52 AM CephFS Cleanup #588 (Resolved): Allow Inodes to have multiple parent Dentries
Right now, cached Inodes can only have one parent Dentry. This is unfortunate when there are multiple hard links to a... Greg Farnum
09:40 AM Tasks #587 (Rejected): install mpich2 on sepia*
this will make management and testing easier Sage Weil
07:52 AM Bug #585 (Closed): OSD: ReplicatedPG::pull
This one should also be fixed in the latest unstable. Probably. The recovery code is still being worked on a bit, b... Sage Weil
02:55 AM Bug #585 (Resolved): OSD: ReplicatedPG::pull
On two OSDs (osd5 and osd10) I'm seeing the same crash; they crash almost directly after starting.
I cranked ...
Wido den Hollander
07:19 AM Bug #586 (Resolved): OSD: Crash during scheduled scrub
This was fixed in the commit right after what you were running, commit:556ba7397c352f5a6cb7fe03087c6e2f51dbce32 Sage Weil
05:31 AM Bug #586 (Resolved): OSD: Crash during scheduled scrub
After I reported #585 I didn't pay much attention to my cluster, until I found out that I had only one OSD left onlin... Wido den Hollander
12:09 AM Revision d57181d3 (ceph): config: added max_mds
MDSMonitor: create_new_fs adapted to use the max_mds parameter
max_mds is now a configurable value and create_new_fs...
Samuel Just
 
