Activity
From 03/13/2012 to 04/11/2012
04/11/2012
- 11:00 PM Revision 119dd5ae (ceph): mkcephfs: update man page
- Signed-off-by: Sage Weil <sage@newdream.net>
- 11:00 PM Revision 4a4b7994 (ceph): ceph-authtool: update man page
- Signed-off-by: Sage Weil <sage@newdream.net>
- 10:53 PM Revision ab08fb8b (ceph): mkcephfs: note that btrfs (and --mkbtrfs) are optional and experimental
- And that --mkbtrfs will be deprecated soon.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 10:53 PM Revision ee39291a (ceph): ceph-authtool: add warning to man page
- - data is not encrypted over the wire
- intended for trusted environments
Signed-off-by: Sage Weil <sage.weil@dreamh...
- 10:40 PM Revision 11b93d3a (ceph): osd: disable localized pgs by default
- Signed-off-by: Sage Weil <sage@newdream.net>
- 10:40 PM Revision 8836b81f (ceph): mon: alloc pgp_num adjustment up and down
- Signed-off-by: Sage Weil <sage@newdream.net>
- 10:39 PM Revision 83e1260b (ceph): mon: set pgp_num == pg_num (by default) for new pools
- For when pg_num is specified but not pgp_num. Thanks Greg!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 10:39 PM Revision 58671a4c (ceph): mon: command to disable localized pgs for a pool
- ceph osd pool disable_lpgs <poolname> --yes-i-really-mean-it
Grr, these should be off by default. We can't adjust t...
- 08:35 PM Revision 7fdf25bc (ceph): debian: python-support -> dh_python2
- I followed the instructions on
http://wiki.debian.org/Python/TransitionToDHPython2
Signed-off-by: Sage Weil <sage@...
- 07:35 PM Revision ed0653b4 (ceph): COPYING: doc/ CC BY-SA
- Signed-off-by: Sage Weil <sage@newdream.net>
- 07:33 PM Revision 6e83e119 (ceph): README: update
- - refer to COPYING, SubmittingPatches
- a word about dependencies
- building packages
- drop the list of built binari...
- 06:42 PM Revision 838a7618 (ceph): ceph-rbdnamer: include in dist tarball and debs/rpms
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:42 PM Revision af502735 (ceph): obsync: include man page in tarball, packages
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:42 PM Revision 9678c097 (ceph): init-radosgw: start in runlevel 4
- Fixes lintian error
W: radosgw: init.d-script-missing-start etc/init.d/radosgw 4
Signed-off-by: Sage Weil <sage@new...
- 06:42 PM Revision 84efc554 (ceph): debian: drop unnecessary conflicts on librgw
- Cut and paste baggage from libcephfs, it looks like.
Signed-off-by: Sage Weil <sage@newdream.net>
- 06:18 PM Messengers Cleanup #2150 (In Progress): repair the Simple/Messenger interface
- I haven't done it, but I had enough time to glance over it and see at least a couple things that need fixing before t...
- 05:49 PM Feature #2113: objectcacher perfcounters
- Sage asked me to run it under an rbd mount and look at it. Need to get tests from Josh and then figure out how to do ...
- 04:30 PM Feature #2113 (Fix Under Review): objectcacher perfcounters
- Compile-tested.
- 10:51 AM Feature #2113 (In Progress): objectcacher perfcounters
- Yoink.
- 05:40 PM Revision 292898a8 (ceph): init-ceph: start at all runlevels
- This fixes lintian error:
W: ceph: init.d-script-missing-start etc/init.d/ceph 4
Signed-off-by: Sage Weil <sage@new...
- 05:03 PM Revision b1946290 (ceph): Merge branch 'stable'
- 04:30 PM Bug #2266 (Resolved): teuthology: nuke after failure is failing
- it fails, and then fails to unlock, and eats up machines.
for example, ubuntu@teuthology:/a/nightly_coverage_2012-...
- 03:08 PM Feature #2265 (Rejected): make sure objecter/kclient error out when localized pgs don't exist
- 11:02 AM Bug #2264 (Can't reproduce): mon: failed assert in bump_epoch
- During startup of a teuthology run on commit 1775301bb46379648f3f88914ef56aa1982db020 (before the cluster was healthy...
- 10:48 AM Bug #2263 (Resolved): obsync: move man page to section 1
- 09:25 AM Bug #2262 (Resolved): qa: osd-recovery tasks fails on flush_pg_stats
- consistently
- 08:09 AM Linux kernel client Bug #2260: libceph: null pointer dereference at try_write+0x638+0xfb0
- Looks like the problem arose while running fsstress on the xfs loop
mount on top of a file on the ext2 filesystem.
...
- 07:56 AM Linux kernel client Bug #2260: libceph: null pointer dereference at try_write+0x638+0xfb0
- FYI, xfstests 49 tests running XFS on a loop device. I have to wait for a
reboot in order to see if I can tell at w...
- 07:49 AM Linux kernel client Bug #2260: libceph: null pointer dereference at try_write+0x638+0xfb0
- Looks like xfstests #49 is a reproducer for this problem, at least
after running the tests that lead up to it first ...
- 05:47 AM Revision be5b25b6 (ceph): filestore: fix collection_move guard
- We had a sequence like:
1- write A block 1
2- write A block 2
3- write A block 3
4- write A block 4
5- move A -...
- 05:47 AM Revision 4bd9d1bb (ceph): filestore: fix collection_add guard
- If we crash between the link() and setting the guard, we will get
EEXIST. Tolerate that.
Signed-off-by: Sage Weil <...
- 05:47 AM Revision df4d7a47 (ceph): filestore: fix collection_rename guard
- If we crash between the rename and setting the guard, we can get EEXIST
or ENOTEMPTY on rename. Tolerate that.
Sign...
- 05:47 AM Revision 85db25e8 (ceph): filestore: fix fd leak on collection_rename
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
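The collection_add guard fix listed above tolerates EEXIST when a crash falls between link() and setting the guard. A minimal sketch of that idea in Python, not the FileStore code itself; the helper name `link_idempotent` and the file names are hypothetical:

```python
import errno
import os
import tempfile


def link_idempotent(src, dst):
    """Create a hard link, treating an already-existing dst as success.

    Returns True if the link was created by this call, False if dst was
    already present (for example, left behind by an interrupted attempt).
    """
    try:
        os.link(src, dst)
        return True
    except OSError as e:
        if e.errno == errno.EEXIST:
            return False  # replay after a crash: the link is already there
        raise  # any other error is still a real failure


with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "object")
    dst = os.path.join(d, "object.link")
    open(src, "w").close()
    first = link_idempotent(src, dst)   # performs the link
    second = link_idempotent(src, dst)  # simulated replay, tolerated
```

The second call stands in for replaying the same operation after a restart; only EEXIST is swallowed, so unrelated errors still surface.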
- 05:47 AM Revision c3e4c5b7 (ceph): filestore: cleanup: flip sense of replay guard check
- The other are all if (_check_replay_guard(..)) do_it;. Make this one
match.
Signed-off-by: Sage Weil <sage.weil@dre...
- 05:43 AM Revision 43de5e4f (ceph): FileStore: dumping transactions to a file
- Dump each queued transaction to a predefined file, specified with
--filestore-dump-file, in JSON format.
Signed-off...
- 05:43 AM Revision cd4a760e (ceph): osd: fix heartbeat set_port()
- set_port() fails an assert if it isn't an in4 or in6 address, which a
default entity_addr_t is not.
Signed-off-by: S...
- 05:29 AM Linux kernel client Bug #2261 (In Progress): paging error in libceph after crashed osd comes back online
- 05:22 AM Linux kernel client Bug #2261 (Can't reproduce): paging error in libceph after crashed osd comes back online
- ...
- 04:43 AM Revision 1775301b (ceph): osd: reenable clone on recovery
- This hasn't turned up problems in QA.
Fixes: #2002
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 02:25 AM Bug #2178: rbd: corruption of first block
- Well Sage,
I have a torture-test already :-D
OK, so it's independent from yours and that's good. It sounds, we ar...
04/10/2012
- 11:24 PM Feature #2223: Tracing facility on FileStore
- did some cleanup, changed the way the output is structured wrt the transaction lists, and tweaked a few other things....
- 11:23 PM Revision ddb98f77 (ceph): ceph_manager: don't try to start greenlet twice
- spawn already scheduled it. Trying to start it again hits an assert.
- 11:11 PM Revision 6fbac10d (ceph): osd: allow users to specify the osd heartbeat server address.
- Reported-by: Nick Bartos <nick@pistoncloud.com>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Reviewed-by...
- 10:23 PM Bug #2002 (Resolved): osd: racy push/pull for clones
- 10:19 PM Bug #2161 (Resolved): nonlinear scaling for PGMap::pg_stat encode
- commit:bd518e998c0ff12d611db19a8cff6da3622597cb
- 10:18 PM Bug #1953 (Resolved): teuthology: core files aren't archived when using valgrind
- it works!
- 10:10 PM Bug #2225 (Resolved): gitbuilder.ceph.com returning 503: Service Temporarily Unavailable.
- Yehuda found the bad apache option.. override it in the domain_service (maxconnperip=1000 param)
- 09:56 PM Revision 4f030e1b (ceph): osd_types: fix off by one error in is_temp
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Reviewed-by: Gregory Farnum <gregory.farnum@dreamhost.com>
- 09:49 PM Messengers Cleanup #2150 (Resolved): repair the Simple/Messenger interface
- 09:49 PM Feature #1044 (Fix Under Review): librbd: discard support
- 09:48 PM Revision 31f16a4c (ceph): rgw: list multipart response fix
- LastModified was formatted outside of the Part block.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 09:06 PM Revision 89fecda6 (ceph): Makefile.am: remove some clutter
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 09:04 PM Linux kernel client Bug #2260: libceph: null pointer dereference at try_write+0x638+0xfb0
- I'm going to have to look at this again in the morning, but I think
we're in this block of code:
#ifdef CONFIG_BL...
- 08:37 PM Linux kernel client Bug #2260: libceph: null pointer dereference at try_write+0x638+0xfb0
- Here's a disassembled block of the code where the fault occurred.
The address listed corresponds to offset 3468 belo...
- 08:10 PM Linux kernel client Bug #2260 (Resolved): libceph: null pointer dereference at try_write+0x638+0xfb0
- It's not an exact match but it's close enough that I wanted to reopen
bug 1793 or 1866, but found myself unable to. ...
- 08:41 PM Revision 1ac5554d (ceph): kernel: kludge around mysterious 0-byte .git/HEAD files
- No idea where these are coming from, but they break nodes with behavior
like
ubuntu@plana08:~$ sudo install -d -m075...
- 05:42 PM Revision 0aea1cb1 (ceph): v0.45
- 04:17 PM Revision 0d5918f8 (ceph): kernel: reset to remote firmware branch; don't pull
- Pull might merge if upstream rebases. Just make our branch match the
remote one.
- 04:12 PM Revision 9b755fd6 (ceph): kernel: change git incantation for firmware pull
- The 'git pull <uri>' seemed to consistently fail on some nodes. Can't be
sure this was really the problem with them ...
- 03:59 PM Revision 22b1f17f (ceph): ls: another newline
- 03:57 PM Revision 7757fbb9 (ceph): ls: remote stray newline
- 03:27 PM Feature #2246: force10s on sepia
- Fabric brought up by Networking group. Interfaces up, configured, and working (nuttcp shows 9.5GB/s or so with
defa...
- 01:26 PM Feature #2111: msgr workloads
- I think the messenger tester may be at a point where we can call this bug satisfied.
- 01:18 PM Bug #2178: rbd: corruption of first block
- the good news is i see the problem. the bad news is its the exact bug we thought we fixed. the other good news is w...
- 07:38 AM Bug #2178: rbd: corruption of first block
- Hi Sage,
just in case, the reply from yesterday did not reach you:
--- 8-< ---
Good morning,
it's already...
- 12:27 PM Feature #2258 (Resolved): use external leveldb package
- autoconf lets you use the installed library. not doing so by default to avoid the pain of building on older distros.
- 04:22 AM Revision 965f83d4 (ceph): Merge branch 'next'
- 04:20 AM Revision d348e1ab (ceph): configure: --with-system-leveldb
- Default to bundled leveldb. Optionally check.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 04:20 AM Revision 34cc308e (ceph): filestore: fix leveldb includes
- Signed-off-by: Sage Weil <sage@newdream.net>
- 03:23 AM Revision 0b2e1cd2 (ceph): cephfs: fix uninit var warning
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
04/09/2012
- 11:58 PM Revision 9906d5ed (ceph): Change to local mirror of linux-firmware repo to try to stop failures
- 11:17 PM Revision f79b95e5 (ceph): Makefile: add missing .h to tarball
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 10:56 PM Revision 8d5c87a8 (ceph): rgw: fix object name with slashes when vhost style bucket used
- Fixes issue #2259. The problem was that we were initializing the
object name, then in the case of a virtual host buck...
- 09:02 PM Revision 853b0458 (ceph): OSD: use per-pg temp collections, bug #2255
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 07:08 PM Revision 36d42dea (ceph): buffer: allow advance() to move an iterator backward
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:08 PM Revision bd518e99 (ceph): encoding: fix iterator use for struct_len copy_in
- The end() iterator position does not record an offset when the list is
modified.
Signed-off-by: Sage Weil <sage.weil...
- 04:30 PM rgw Bug #2259 (Resolved): rgw: object name cut after slash when virtual host style is used
- Fixed, commit:8d5c87a86e070b4e95ef0d58a469bdbbef4a826c.
- 03:42 PM rgw Bug #2259 (Resolved): rgw: object name cut after slash when virtual host style is used
- 09:32 AM Bug #2178: rbd: corruption of first block
- The missing piece of information is mapping the file offset to a block device offset. Can you, inside the VM,...
- 03:59 AM Revision 7951d7e4 (ceph): Merge remote branch 'gh/stable' into next
- 03:58 AM Revision dd8fd168 (ceph): configure: HAVE_FALLOCATE -> CEPH_HAVE_FALLOCATE
- /usr/include/linux/fs.h defines this on CentOS 5, even though it does not
in fact compile. This stupid workaround av...
04/08/2012
- 09:53 PM Feature #2258 (Resolved): use external leveldb package
- - make our configure take/require a --with-system-leveldb or similar to not use the bundled leveldb
- update the deb...
- 08:31 AM Bug #2178: rbd: corruption of first block
- Hi Sage and *Happy easter*,
yesterday I had some "luck" after 10 tries....
Here is what I have for you:
first ...
04/06/2012
- 09:27 PM Feature #1692 (Duplicate): librbd: Support TRIM (hole punching) (userspace client)
- dup of #1044
- 09:07 PM Revision 8e1cc8ab (ceph): init-ceph: manage pid_file from init script
- With upstart the daemon shouldn't manage the pid file itself. Move this
out of the default config and into the legac...
- 08:48 PM Revision 81d2cbeb (ceph): config: move /var/run and /var/log defaults to config_opts.h
- This flips the sense of the common_init defaults. Before, the alternate
defaults were filled in if it was a daemon. ...
- 08:39 PM Revision dfa043df (ceph): config: {osd,mon}_data default to /var/lib/ceph/$type/$cluster-$id
- Signed-off-by: Sage Weil <sage@newdream.net>
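The default data path in the entry above relies on the `$type`, `$cluster` and `$id` metavariables, with the cluster name defaulting to "ceph" as noted in the 04/05/2012 config entries. A rough illustration of that kind of expansion using Python's `string.Template`; Ceph performs this in its own config code, and the function name `expand_metavariables` is hypothetical:

```python
from string import Template


def expand_metavariables(value, cluster="ceph", **meta):
    """Substitute $name placeholders in a config value.

    Unknown placeholders are left untouched (safe_substitute) rather than
    raising, which is a choice made for this sketch only.
    """
    return Template(value).safe_substitute(cluster=cluster, **meta)


# Example values for $type and $id are made up for illustration.
osd_data = expand_metavariables(
    "/var/lib/ceph/$type/$cluster-$id", type="osd", id="0"
)
print(osd_data)
```

With a different cluster name, for example `cluster="foo"`, the same template would yield a separate directory per cluster.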
- 07:26 PM Revision 2ceda946 (ceph): Merge branch 'stable'
- 06:44 PM Revision 7680cdad (ceph): dencoder, rgw: make ceph-dencoder load much faster
- by avoiding linking with unneeded shared objects.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 05:00 PM Revision 98326968 (ceph): encoding: use iterator to copy_in encoded length
- This gives us a pointer to the position into the list where the final
length value will be copied. Previously we use...
- 03:47 PM rgw Feature #2257 (Rejected): rgw: detect fastcgi module 100-continue support automatically
- The current default that is used doesn't work with vanilla fastcgi module. It'd be great if that could be set automat...
- 02:46 PM rbd Feature #2256 (Resolved): rbd: parallelize deletions
- There are a few places where we delete things one at a time: resizing to a smaller size, deleting all snapshots, and ...
- 02:04 PM Feature #2240 (Fix Under Review): osd: new default locations
- wip-defaults
- 12:05 PM Bug #2161: nonlinear scaling for PGMap::pg_stat encode
- wip-encoding
- 09:18 AM Bug #2161: nonlinear scaling for PGMap::pg_stat encode
Ake van der Meer wrote:
> My ceph-osd processes run at 100% CPU for many minutes at a time doing this: http://pasteb...
- 08:25 AM Bug #2161: nonlinear scaling for PGMap::pg_stat encode
- My ceph-osd processes run at 100% CPU for many minutes at a time doing this: http://pastebin.com/wYnPKWeJ
In src/i...
- 10:05 AM Feature #2246 (In Progress): force10s on sepia
- Ports being mapped yesterday and today in preparation for switch config review.
- 09:21 AM Bug #2255 (Resolved): osd: fix object name collisions between pools in temp collection
- 08:28 AM Feature #2223: Tracing facility on FileStore
- Made some changes to the ObjectStore.cc, regarding code duplication of the transaction's dump methods. Feedback would...
04/05/2012
- 09:55 PM Revision 689ac5d7 (ceph): v0.44.2
- 09:53 PM Revision e0c4db9e (ceph): FileStore: do not check dbobjectmap without option set
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 09:25 PM Revision 38e24b1e (ceph): config: include /etc/ceph/$cluster.keyring in keyring search path
- mkcephfs and the docs etc still write to /etc/ceph/keyring.
Signed-off-by: Sage Weil <sage@newdream.net>
- 09:25 PM Revision 57dff032 (ceph): config: expand metavariables for --show-config, --show-config-value
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:08 PM Revision 90e88a08 (ceph): Merge branch 'wip-cluster'
- Reviewed-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
- 08:35 PM Revision cfee0333 (ceph): config: parse fsid uuid in config, not ceph_mon
- Use the new OPT_UUID type.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:35 PM Revision 2c14c8b2 (ceph): config: add distinct UUID type
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:32 PM Revision 2c0dc47e (ceph): global: add -C or --cluster early args to specify cluster name
- This will let you specify which cluster to talk to on the command line
(e.g., 'ceph -C foo ...' or when starting a da...
- 08:32 PM Revision 930a669a (ceph): config: add cluster name as metavariable; use for config locations
- Add a cluster name (default "ceph") to the config structure, and expand
$cluster in all config values.
Make the defa...
- 08:25 PM Revision bda562fb (ceph): config: implement --show-config and --show-config-value <option>
- Dump internal config value(s) to stdout and then exit.
Signed-off-by: Sage Weil <sage@newdream.net>
- 08:04 PM Revision f18b219a (ceph): test_workload_gen: fix logging
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:01 PM Revision 32b5d0f8 (ceph): config: remove obsolete bdev_* options
- These were part of ebofs.
Signed-off-by: Sage Weil <sage@newdream.net>
- 07:59 PM Revision 1b769535 (ceph): Merge remote-tracking branch 'gh/wip-log'
- 06:43 PM Revision 0e5d087c (ceph): README: update instructions
- Needed to add submodule instructions.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 03:49 PM Revision 3d7f1db7 (ceph): Kernel: Pull linux-firmware from git
- Signed-off-by: Mark Nelson <nhm@clusterfaq.org>
- 02:21 PM Feature #2248 (Resolved): cluster naming
- 02:20 PM Subtask #2236 (Resolved): filestore failure injection (3)
- wip-filestore-failure
I don't think enumerating/identifying the callers is needed here. For the idempotency teste...
- 01:19 PM Feature #2226: osd: better filestore idempotency test
- Thought about this a bit more. The filestore failure injection is easiest to implement with an _exit(1) or something,...
- 01:13 PM Feature #1890 (Resolved): log: async log writeout
- 01:13 PM Feature #1889 (Resolved): log: structure log records
- 12:30 PM Feature #2254 (Resolved): doc: cephx
- pending improved documentation:
* what is, is not protected
* how to convert/upgrade a non-cephx cluster to cephx (e...
- 12:22 PM Subtask #2235 (In Progress): generate deterministic sequence of transactions (5)
- 10:51 AM Bug #2178: rbd: corruption of first block
- Ok, my attempts to parse the log to find out of order replies is quickly snowballing. (complexity of dropped replies...
- 08:21 AM Bug #2178: rbd: corruption of first block
Oliver Francke wrote:
> Uhm...
>
> ... I thought, we were talking about the same issue since the very beginning.....
- 01:25 AM Bug #2178: rbd: corruption of first block
- Uhm...
... I thought, we were talking about the same issue since the very beginning... corruption of .rbd-blocks.....
04/04/2012
- 11:12 PM Revision 0df6fbd3 (ceph): rados: fix rados import
- This fixes issue #2253. Wrong param order to fread().
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 11:11 PM Feature #2248 (Fix Under Review): cluster naming
- 11:00 AM Feature #2248: cluster naming
- - new command line arg (-C, --cluster)
- controls default config files
- becomes another subst ($cluster) to be use...
- 10:38 AM Feature #2248 (Resolved): cluster naming
- 08:56 PM Revision ba0fb3ed (ceph): cleanup-and-unlock.sh: helper to nuke and then unlock a set of nodes
- I usually do something like
teuthology-lock --list-targets --owner scheduled_sage@metropolis > /tmp/b
./cleanup-an...
- 08:54 PM Revision 3adf2bf9 (ceph): schedule_suite.sh: helper to schedule a suite
- There's a bunch of stuff hardcoded in here, similar to the nightly, but
it's a useful starting point.
- 04:09 PM Bug #2253 (Resolved): rados import: uploaded objects are empty
- Fixed, commit:0df6fbd3a66741ad02c7556b0c4026dc3577d797.
- 03:37 PM Bug #2253 (Resolved): rados import: uploaded objects are empty
- 03:33 PM rgw Documentation #1813: doc: document radosgw api diffs with s3
- We'd like to have it for the current sprint, or at least no later than the next sprint. 5/1 as an upperbound target d...
- 12:45 PM Bug #2233: Throttle when there are lots of large conccurent IOs
- Yeah, it's the failing gracefully bit that I'm interested in. :)
- 12:38 PM Bug #2233: Throttle when there are lots of large conccurent IOs
- Just the rados bench tool itself is allocating 16GB to feed into librados.
Now that you mention it, librados might...
- 12:29 PM Bug #2233: Throttle when there are lots of large conccurent IOs
- Aha! The plana nodes appear to only have 8GB of ram and 8GB of swap.
Is the allocation of that memory part of libra...
- 11:20 AM Linux kernel client Bug #2242: rbd: spinlock on wrong cpu
- OK, I think this problem arises because of the switch to a spinlock to
protect the client list. Doing so was the ri...
- 09:53 AM Linux kernel client Bug #2242 (Resolved): rbd: spinlock on wrong cpu
- ...
- 11:19 AM Bug #2178: rbd: corruption of first block
Oliver Francke wrote:
> Hi Sage,
>
> I was talking about the verbose logfiles from monday. TBH, I don't expect Ba...
- 10:32 AM Bug #2178: rbd: corruption of first block
- Hi Sage,
I was talking about the verbose logfiles from monday. TBH, I don't expect BadThings without "rbd_writebac...
- 09:49 AM Bug #2178: rbd: corruption of first block
- Oliver Francke wrote:
> Whew, that was fast,
>
> after second run I had some errors in one file with:
> [osd]
>...
- 07:01 AM Bug #2178: rbd: corruption of first block
- Whew, that was fast,
after second run I had some errors in one file with:
[osd]
filestore fiemap threshol...
- 05:43 AM Bug #2178: rbd: corruption of first block
- Well Sage,
its harder these days to reproduce, cause I think the current version has made "something more stable"(...
- 10:57 AM Feature #2252 (Resolved): rgw long run kernels
- 10:54 AM Feature #2251 (Resolved): rgw long run workloads
- 10:53 AM Feature #2250 (Resolved): rgw long run raid config
- 10:47 AM Subtask #2249 (Resolved): teuthology task (3)
- 10:35 AM Feature #2246 (Resolved): force10s on sepia
- 10:32 AM Feature #2245 (Resolved): rgw long run ceph install
- 10:29 AM Messengers Feature #2244 (New): msgr: performance tester
- 09:54 AM Linux kernel client Bug #2243 (Resolved): btrfs: warning in orphan_commit_root
- 2012-04-04T01:02:59.191518-07:00 plana32 kernel: [ 8815.371555] ------------[ cut here ]------------
2012-04-04T01:0...
- 09:45 AM Feature #2241 (Rejected): upstart
- 09:45 AM Feature #2240 (Resolved): osd: new default locations
- 09:42 AM Subtask #2239 (New): install + configure package everywhere
- chef!
- 09:42 AM Subtask #2238 (Rejected): vm for coredump archive
- 09:41 AM Subtask #2237 (Resolved): failure+replay tester (8)
- 09:39 AM Subtask #2236 (Resolved): filestore failure injection (3)
- add a hook to operations that we want to potentially fail.
need to identify the caller so that the tester can pote...
- 09:38 AM Subtask #2235 (Resolved): generate deterministic sequence of transactions (5)
- 09:22 AM Bug #2234 (Resolved): Sometimes 'ceph -s' is unable to show pg data and crashes
- ceph -s / ceph -w sometimes gives me output as below:...
- 09:15 AM CephFS Feature #1237: mds caps limit mount to some subdir
- Nope — as with all the other MDS stuff, this is currently not a priority.
- 07:10 AM CephFS Feature #1237: mds caps limit mount to some subdir
- Is there any progress on this issue?
- 04:21 AM Revision 0921c062 (ceph): config: drop loud ERROR prefix
- This makes gitbuilder sad.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 12:03 AM Revision b9185bb2 (ceph): osdmap: allow row, room, datacenter, pool in conf for initial crush map
- These work just like host and rack, except that they are optional.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 12:02 AM Revision 4313a2d8 (ceph): crush: don't warn on skipped types
- It's perfectly okay to skip some.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 12:01 AM Revision 56a6aa7a (ceph): osdmap: set 'default' pool type correctly
- Got this wrong in e85961167eb1f37f80f263257799e4e901d17e74
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
04/03/2012
- 11:33 PM Revision dd7b84a5 (ceph): ceph-fuse: fix log reopen when -f is specified
- Don't restart if it wasn't stopped.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 10:56 PM Revision 1836d467 (ceph): Added assertion to check that targets > roles
- Signed-off-by: Mark Nelson <mark.nelson@dreamhost.com>
- 10:56 PM Revision 95294027 (ceph): nuke: don't run umount when no xargs args
- Gets rid of this noise:
INFO:teuthology.nuke:Unmount any osd data directories...
INFO:teuthology.orchestra.run.err:U...
- 10:40 PM Revision e8596116 (ceph): osd: define more crush types
- We don't use these by default, but this way they are there should someone
want to use them.
Signed-off-by: Sage Weil...
- 10:37 PM Messengers Bug #1674 (Need More Info): daemons crash when sent random data
- FWIW I was unable to reproduce this with the current code, with or without cephx enabled.
- 10:35 PM Revision 2dbdadbe (ceph): test_rewrite_latency: check return value
- Fixes warning
warning: test/test_rewrite_latency.cc:27:36: ignoring return value of ‘ssize_t pwrite(int, const void*...
- 10:28 PM Revision 493344fd (ceph): Makefile: add mssing header
- Signed-off-by: Sage Weil <sage@newdream.net>
- 10:07 PM Bug #1627 (Can't reproduce): ceph-mon memleak if ceph-osd cluster ip is not reachable, but public...
- 09:21 PM Revision d57d8af7 (ceph): rgw: throttle at num_threads * 2
- If we throttle at num_threads, then nothing gets into the workqueue until
a worker thread is idle, which means you pa...
- 08:44 PM Revision 1ef37ab8 (ceph): Merge remote-tracking branch 'gh/msgr-api-changes'
- Reviewed-by: Sage Weil <sage@newdream.net>
- 08:41 PM Revision a31efd9c (ceph): filestore: print Sequencer name in debug output
- And clean it up just a bit.
Signed-off-by: Sage Weil <sage@newdream.net>
- 08:22 PM Revision 756621d5 (ceph): msgr: clean up Pipe::do_sendmsg.
- Document it as with the tcp stuff, remove an if(0)'d debugging block,
and remove the useless "sd" parameter since it'...
- 08:22 PM Revision 9f10a991 (ceph): msgr: write minimal documentation for the tcp functions.
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 08:22 PM Revision e966c39d (ceph): msgr: make a bunch of stuff private.
- Why were all these data members public? They're accessed by Pipes
and the Accepter and stuff, so maybe that's why...b...
- 08:22 PM Revision 096971d4 (ceph): msg: update the Dispatcher and Messenger documentation
- Clarify what mark_down() and mark_down_on_empty() actually do.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost....
- 08:21 PM Revision 36ec8e93 (ceph): dispatcher: fix documentation for ms_handle_reset
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 08:21 PM Revision cbe13ab2 (ceph): msgr: rename set_ip() -> set_addr_unknowns()
- The generic interface shouldn't reference specifics like that.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost....
- 08:13 PM Revision 607f35e7 (ceph): msgr: Remove _my_name and ms_addr, replace with direct access to my_inst.
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 08:13 PM Revision 77f45667 (ceph): msgr: store the entity_inst_t in the Messenger.
- Convert ms_addr and _my_name to be references to their fields in
the entity_inst_t my_inst.
This way we can use const...
- 08:11 PM Revision 6374d064 (ceph): buffer: implement a contents_equal function on bufferlists
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
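The feed records only the title of the contents_equal change, so the following is an illustration of the general idea rather than the committed code. It assumes a bufferlist can be modeled as a sequence of byte segments and that two lists are equal when their concatenated bytes match, however they are segmented:

```python
from itertools import chain


def contents_equal(a, b):
    """Compare two segmented buffers (lists of bytes objects) by content.

    Segment boundaries are ignored; only the overall byte sequence counts.
    """
    # Cheap length check first, so mismatched sizes never need a byte walk.
    if sum(len(seg) for seg in a) != sum(len(seg) for seg in b):
        return False
    # Walk both byte streams in lockstep without concatenating them.
    return all(
        x == y
        for x, y in zip(chain.from_iterable(a), chain.from_iterable(b))
    )


same = contents_equal([b"ab", b"cd"], [b"a", b"bcd"])
different = contents_equal([b"ab", b"cd"], [b"ab", b"ce"])
```

Comparing in lockstep avoids copying both buffers into one contiguous block just to test equality.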
- 08:11 PM Revision 5681461b (ceph): msgr: change the signature of get_myaddr()
- Return a const reference to the actual address, instead of copying it.
All current users are happy with this, and I c...
- 08:11 PM Revision 45a76eaf (ceph): msgr: get_connection() is required to establish a connection if none ex...
- Making an allowance for lossy server connections is silly. Just don't
ask for the Connection in that case. (There are...
- 08:10 PM Revision e80126ea (ceph): test: fix monmaptool help text
- Broken by commit:15f0a3270fdcf09acce554313f2d0c0814a511e4
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 06:32 PM Revision e06436e9 (ceph): cls_rgw: guard decode
- there were a few cases where decode wasn't guarded.
Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
- 06:30 PM Revision ebb487a6 (ceph): cls_rgw: reset return code in some cases
- Beforehand the return code was ignored, so fixed the cases
where we erroneously return error instead of success.
Sig...
- 05:12 PM Revision a8938422 (ceph): librados: fix exec test
- Return for read operations is now returned correctly.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 04:52 PM rgw Bug #1681: rgw: user rm with --purge doesn't remove data
- Maybe we should disallow removal of user that has data? We can suspend it instead.
- 04:06 PM Revision 57f52479 (ceph): doc: disable broken 'doxygenclass' class in librados c++ doc
- This is the last remaining gitbuilder error. Add it back when the C++
docs actually build.
Signed-off-by: Sage Weil...
- 03:58 PM Revision 9d4fcd08 (ceph): Merge remote-tracking branch 'gh/stable'
- 03:57 PM Bug #1921 (Resolved): teuthology: silently continues when len(targets) != len(roles)
- 03:44 PM Revision e40cf8ca (ceph): test_workload_gen: fix Sequencer ctor
- Signed-off-by: Sage Weil <sage@newdream.net>
- 02:43 PM Feature #2226: osd: better filestore idempotency test
- 02:32 PM Documentation #2175 (Resolved): doc: fix doc build errors
- got this to yellow (only warnings), yay!
- 01:39 PM Feature #1890: log: async log writeout
- 01:39 PM Feature #1889: log: structure log records
- 10:45 AM Feature #2134 (Resolved): qa: smoke suite
- 10:31 AM Bug #2178: rbd: corruption of first block
- Hi Oliver,
I have two things to try:
- 'rbd writeback window = 0'. I know it's not what you want to run, but t...
- 10:29 AM Bug #2233: Throttle when there are lots of large conccurent IOs
- That is 16GB of RAM being allocated and used — I don't remember what hardware these are running on and have no idea w...
- 09:47 AM Bug #2233 (Won't Fix): Throttle when there are lots of large conccurent IOs
- When sending large amounts of data via a single client (ie 256 concurrent 64MB IOs) we can hit a bad_alloc on the cli...
- 09:15 AM Cleanup #2191 (Resolved): reexamine simple_spinlock
- 08:51 AM Feature #2087 (Resolved): lightweight filestore workload generator
- 05:04 AM Revision b5ca2fe0 (ceph): Merge remote-tracking branch 'gh/wip-name-sequencers'
- 05:03 AM Revision d70191a8 (ceph): Merge remote-tracking branch 'gh/wip-2087'
04/02/2012
- 08:24 PM Revision addc7446 (ceph): rgw: check for subuser existence
- This fixes #1856: looking up subuser that doesn't exist returns
user as long as subuser prefix defined existing user....
- 02:30 PM rgw Bug #1853 (Resolved): rgw: qa test to verify bucket recreation does not override bucket
- Implemented, commit:1551c5b08714b415c49fc759002b7c6a6d4d611a.
- 01:26 PM rgw Bug #1856 (Resolved): It is possible to look up an rgw user by a subuser that does not exist as l...
- Fixed, commit:addc744692f60885a747c4531cd12bf19b3a7f2a.
- 11:15 AM rgw Feature #2171: rgw: asynchronously calculate md5
- Thinking about it some more, it's probably not the best use of time and effort. We initiate the md5 calculation after...
- 08:29 AM Bug #2178: rbd: corruption of first block
- Hi Sage,
here we go again, with ceph-0.44.1-1-g41f84fa
One bad file with following infos:
20120402 171642.12...
- 12:04 AM Revision e792cd93 (ceph): filestore: fix ZERO fallback write
- It helps if we write zeros!
Signed-off-by: Sage Weil <sage@newdream.net>
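The ZERO fallback fix above is about the path that zeroes a range by writing explicit zero bytes. A sketch of such a fallback in Python, not the FileStore implementation; `zero_range_fallback` and the chunk size are hypothetical, and `os.pwrite` is assumed to be available (Unix):

```python
import os
import tempfile


def zero_range_fallback(fd, offset, length, chunk=4096):
    """Zero [offset, offset + length) by writing zero-filled buffers."""
    zeros = bytes(chunk)  # a buffer that really contains zeros
    done = 0
    while done < length:
        n = min(chunk, length - done)
        # pwrite may write fewer bytes than requested, so track progress.
        done += os.pwrite(fd, zeros[:n], offset + done)


with tempfile.TemporaryFile() as f:
    f.write(b"x" * 16)
    f.flush()
    zero_range_fallback(f.fileno(), 4, 8)
    f.seek(0)
    result = f.read()
```

Only the requested range is overwritten; the bytes before and after it keep their original contents.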
04/01/2012
- 11:24 PM Revision 8434caf5 (ceph): qa: test_rewrite_latency
- Tool to measure latency of overwriting a single block.
Signed-off-by: Sage Weil <sage@newdream.net> - 07:23 PM Bug #2221: Monitor setup bugs
- 2) ...
- 06:35 PM rbd Feature #2232: qemu: resize guest disk when rbd image is resized
- I tested this on Friday, and qemu rereads the size (at least when using virtio) when the guest requests it (i.e. echo...
- 04:21 PM rbd Feature #2232 (New): qemu: resize guest disk when rbd image is resized
- According to Christoph, this is probably just a matter of calling bdrv_truncate() with the new size. If that doesn't...
- 04:19 PM rbd Feature #2231 (Resolved): librbd: expose header change (resize?) via api
- we need a callback or something so that users (qemu) can be informed when the header changes. this will let them, sa...
03/31/2012
- 03:22 PM Feature #1655: gitbuilder aggregator page
- I took some inspiration from the updated aggregator script that is now at http://ceph.newdream.net/gitbuilder.cgi. I'...
- 03:31 AM Revision dbc70b9d (ceph): Merge remote branch 'gh/wip-mon_setup'
- Reviewed-by: Sage Weil <sage@newdream.net>
- 03:18 AM Revision f8a53869 (ceph): osd: fix error code return from class methods
- Don't shadow the result at function scope.
Fixes: #2148
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 12:22 AM Revision 15f0a327 (ceph): monmaptool: make clear you can set the fsid when making a new map.
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 12:07 AM Revision 208daeb3 (ceph): ceph_mon: fix fsid parsing.
- fsid is a field in the CephContext _conf structure and is parsed by
the standard options parsing library before it ge...
03/30/2012
- 11:15 PM Revision 9a69c3f3 (ceph): ceph.conf: enable 'osd recover clone overlap'
- to test the recovery cloning in qa. this was redone, but forgot to enable
it in qa. - 11:14 PM Revision aa31035e (ceph): osd: update_stats() on reads too
- Update pg stats on any op completion (read or write), not just writes. Do
the calls with log_op_stats() for consiste... - 11:11 PM Revision 28788654 (ceph): log: dump_recent in fatal signal handler
- Signed-off-by: Sage Weil <sage@newdream.net>
- 11:07 PM Revision f27acbc8 (ceph): Merge remote-tracking branch 'gh/wip-log'
- Conflicts:
src/common/config_opts.h - 11:00 PM Revision 374bef9c (ceph): Merge remote branch 'gh/wip-osd-hb'
- 10:37 PM Revision f7f65ebe (ceph): osd: fix typo in debug message
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:57 PM Revision 75e3b9b3 (ceph): Merge remote branch 'gh/wip-osd-recovery-sources'
- 09:23 PM Revision df5860fe (ceph): objectstore: name Sequencers
- Assign a (unique) name to each Sequencer. This will aid in debugging, and
can be useful when dumping traces of FileS... - 09:11 PM Cleanup #2230 (Resolved): deprecate 'btrfs devs'
- 09:00 PM rgw Feature #2229 (New): rgw: functional tests for rgw class
- A series of simple functional tests to verify the rgw class methods behave as they should.
- 08:58 PM Bug #2148 (Resolved): osd: class error return not propagated to client
- commit:f8a53869f6db4c76516ee525f00f87f930920692
- 06:57 PM Revision 29c01f25 (ceph): ceph_common.sh: Remove dead code.
- Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
- 06:27 PM Revision ba6bb4cf (ceph): man: Oops, update ceph-mon(8) for real. Sorry about that.
- Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
- 06:26 PM Revision 541a543c (ceph): man: Update ceph-mon(8) after reStructuredText syntax fixes.
- Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
- 06:16 PM Revision 2c542442 (ceph): doc: Remove duplicate anchor from (unused) overview doc.
- Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
- 06:11 PM Revision 1ec47db1 (ceph): doc: Convert the mailing list mention to not be a section heading.
- If toctree is inside a section, the subtree is inside the section too.
We don't want all of dev/* to be under "Mailin... - 06:11 PM Revision b162696b (ceph): doc: Fix reStructuredText syntax errors.
- Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
- 05:52 PM Revision 2d1a96d3 (ceph): add include/stringify.h
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:27 PM Bug #2221: Monitor setup bugs
- (1) is a problem due to options parsing collisions...fixed!
(2) is directly contradicted by my testing...?
(3) I ne... - 04:59 PM Revision b25817a5 (ceph): FileJournal: check pwrite return value when zeroing journal
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 04:52 PM Revision 41f84fac (ceph): filestore: set guard on collection_move
- During recovery we submit transactions like:
- delete a/foo
- move tmp/foo to a/foo
This prevents the EEXIST chec... - 04:25 PM Bug #2026 (Can't reproduce): osd: ceph::HeartbeatMap::check_touch_file
- 04:25 PM Bug #2045 (Can't reproduce): osd: dout_lock deadlock
- haven't seen this in a while.
also, this code is about to go away anyway with wip-log. - 04:16 PM Bug #2102 (Can't reproduce): osd: pg stuck in backfill
- 04:15 PM Bug #2102 (Duplicate): osd: pg stuck in backfill
- 04:14 PM Bug #2002: osd: racy push/pull for clones
- i take that back; this wasn't enabled in qa. adding to the teuthology ceph.conf file.
- 04:12 PM Bug #2002 (Resolved): osd: racy push/pull for clones
- haven't seen this in forever; looks fixed.
- 04:11 PM Bug #2209 (Resolved): osd: read kb stats not tracked?
- commit:aa31035e555129e56888320b84f16264f28bd7df
- 03:59 PM Bug #2116 (Resolved): Repeated messages of "heartbeat_check: no heartbeat from"
- fixed by commit:374bef9c97266600b4c6b83100485d7250363213
- 03:59 PM Bug #2165 (Resolved): osd: recovering ending with missing
- fixed with merge of commit:75e3b9b309e5365975e3e5855c065bd4fe28b64c
- 03:58 PM Bug #2178: rbd: corruption of first block
- 02:51 PM Bug #2178: rbd: corruption of first block
- Please build the current git stable branch, which includes 41f84fac1ae4b4c72bf9bfe07614c4066c916fd1. The version sho...
- 07:35 AM Bug #2178: rbd: corruption of first block
- Here the remaining timestamps from the other VM's with bad blocks:
VM-2:
20120330 105139.579830 filling block 171... - 07:12 AM Bug #2178: rbd: corruption of first block
- Hi *,
I needed a couple of runs, but managed now to provide some 81MiB/97MiB osd.X.log-files, where in between sh.... - 03:58 PM Bug #2164 (Resolved): osd: scrub missing _, snapset attrs
- commit:41f84fac1ae4b4c72bf9bfe07614c4066c916fd1
- 03:49 PM Revision f89f98df (ceph): osd: clear RECOVERING on start_peering_interval
- This prevents us from, say, getting into a recovering+stray state.
Signed-off-by: Sage Weil <sage@newdream.net> - 03:45 PM Revision 3cdd8d58 (ceph): osd: more heartbeat debug
- Signed-off-by: Sage Weil <sage@newdream.net>
- 03:45 PM Revision e1a58912 (ceph): osd: discard heartbeat_peer in note_down_osd
- Discard the heartbeat_peer as soon as we find out, along with queued
failures, or else the heartbeat_check may come a... - 03:45 PM Revision 21e6e2b8 (ceph): osd: ignore peer epoch of 0 on ping reply
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 03:45 PM Revision efc27f19 (ceph): osd: don't fail new heartbeat peers
- last_tx may be 0 because we just added this peer; don't mark them down
yet!
Signed-off-by: Sage Weil <sage.weil@drea... - 03:45 PM Revision 33b9187a (ceph): osd: rename hbin -> hbclient, hbout -> hbserver
- This is way less confusing.
Signed-off-by: Sage Weil <sage@newdream.net> - 03:44 PM Revision 4e2f0d14 (ceph): osd: simplify heartbeat logic
- Simplify heartbeats to use a simple request/reply model.
- avoid any weirdness with map update timing
- no from/to... - 03:44 PM Revision fe5f0331 (ceph): osd: send pings from hbin
- Fixes: #2212
Signed-off-by: Sage Weil <sage@newdream.net> - 02:32 PM Revision eebc9ec2 (ceph): test: test_workload_gen: Add callback for collection destruction.
- When we remove a collection, we must clean up after the coll_entry_t we
once had on the available collections set. For... - 01:53 PM Revision 424b5b07 (ceph): ceph: --concise by default, add --verbose option
- It's time.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 12:50 PM Feature #2227 (Closed): QA: create a test to verify operation with non-default layouts
- I submitted a patch that modified ceph_calc_file_object_mapping()
in the ceph client, and when reviewing it Sage poi... - 09:53 AM Feature #2226 (Resolved): osd: better filestore idempotency test
- ...
- 03:31 AM Revision 409b648b (ceph): config: drop old debug_* items
- ...and replace code references with conf->subsys.should_gather().
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 02:16 AM Revision 5d981b15 (ceph): rgw: add unittest just to verify we link
- This will flush out references to stuff in libglobal.la, among other
things.
Signed-off-by: Sage Weil <sage.weil@dre... - 02:06 AM Revision 69b01726 (ceph): config: fix librados, libcephfs unit tests
- No more g_conf->debug.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 01:08 AM Revision 394d8b1e (ceph): Add test for object source marked down
- 01:08 AM Revision b4aa098f (ceph): make Thrasher not inherit from Greenlet
- 01:02 AM Revision 1c8ec702 (ceph): PG,ReplicatedPG: update missing_loc_sources with missing_loc
- In some cases missing_loc was updated without missing_loc_sources
Signed-off-by: Samuel Just <samuel.just@dreamhost.... - 01:02 AM Revision 05ef3ba6 (ceph): ReplicatedPG: fix loop in check_recovery_sources
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 12:35 AM Bug #2211: osd: entity_inst_t OSDMap::get_inst(int) const
- I think I can be optimistic :)...
03/29/2012
- 10:06 PM Bug #2178: rbd: corruption of first block
- Okay, I suspect this is actually bug #2164, which was causing the _ xattr to get lost when ceph-osd restarts on non-b...
- 09:52 PM Bug #2225 (Resolved): gitbuilder.ceph.com returning 503: Service Temporarily Unavailable.
- I can't find any 503 in the apache logs on this machine. Could it be on the client side?
- 09:48 PM Bug #2211: osd: entity_inst_t OSDMap::get_inst(int) const
- Well, I fixed one problem, but I can't see how it could have resulted in the log you posted.
Pushed a few more pat... - 11:36 AM Bug #2211: osd: entity_inst_t OSDMap::get_inst(int) const
- I collected logs from 4 OSDs, they can be downloaded at: http://logger.ceph.widodh.nl/ceph/issues/2212/
At 10:13 t... - 09:21 AM Bug #2211: osd: entity_inst_t OSDMap::get_inst(int) const
- Der.. do you have a log you can attach/post?
- 02:59 AM Bug #2211: osd: entity_inst_t OSDMap::get_inst(int) const
- I reverted the extra debugging for the heartbeat stuff, but that didn't seem to consume all the CPU time.
The load... - 01:40 AM Bug #2211: osd: entity_inst_t OSDMap::get_inst(int) const
- I just installed the code on my cluster and things do not seem to behave yet.
The cluster is still jumping around... - 08:54 PM Linux kernel client Bug #1940 (Resolved): locking cycle in ceph_osdc_start_request
- commit:ab434b60ab07f8c44246b6fb0cddee436687a09a
- 08:15 PM Revision 41a09bea (ceph): Merge remote branch 'upstream/wip_latency'
- 07:53 PM Linux kernel client Bug #1793 (Can't reproduce): NULL pointer dereference at try_write+0x627/0x1060
- Marking this Can't Reproduce. Will reopen if it shows up again.
- 03:21 PM Linux kernel client Bug #1793: NULL pointer dereference at try_write+0x627/0x1060
- Another 100 iterations of kernel_untar_build.sh using the current
master branch (c666601a935b94cc0f3310339411b6940de... - 07:51 AM Linux kernel client Bug #1793: NULL pointer dereference at try_write+0x627/0x1060
- Bugs 1793 and 2081 have a signature of a page fault/bad memory reference
from process_one_work() -> con_work(), and ... - 07:53 PM Linux kernel client Bug #2069 (Can't reproduce): client crash during kernel_untar_build rm -r step
- I just finished at least 150 iterations of kernel_untar.sh and never
hit this using the current master branch of cep... - 07:51 PM Linux kernel client Bug #2081 (Can't reproduce): msgr: spinlock badness?
- Marking this Can't Reproduce. Will reopen if it happens again.
- 07:43 PM Linux kernel client Bug #2081: msgr: spinlock badness?
- Another 100 iterations of kernel_untar_build.sh using the current
master branch (c666601a935b94cc0f3310339411b6940de... - 07:51 AM Linux kernel client Bug #2081 (Need More Info): msgr: spinlock badness?
- Bugs 1793 and 2081 have a signature of a page fault/bad memory reference
from process_one_work() -> con_work(), and ... - 07:50 PM Linux kernel client Bug #2174 (Can't reproduce): rbd: iozone thrashing failure
- OK, I'll go ahead and state that I can't reproduce this...
- 07:46 PM Linux kernel client Bug #2174: rbd: iozone thrashing failure
- Status was Verified. Changing it to Need More Info because I can't even
seem to reproduce it at this point. (I sup... - 07:44 PM Linux kernel client Bug #2174: rbd: iozone thrashing failure
- Another 12 iterations of suites/iozone.sh using the current
master branch (c666601a935b94cc0f3310339411b6940de751ba)... - 07:59 AM Linux kernel client Bug #2174: rbd: iozone thrashing failure
- I don't know whether we've adequately captured the signature or symptoms
of this problem. I believe though that it ... - 07:20 AM Linux kernel client Bug #2174: rbd: iozone thrashing failure
- I have been trying to reproduce this using the latest testing/master/for-linus
branch (they're the same right now) a... - 02:34 PM Revision c39ed568 (ceph): test: test_workload_gen: Fixing a memleak.
- Apparently, the FileStore does not clean up after transactions once they
are applied, which may lead to huge memory le... - 09:27 AM Linux kernel client Bug #2224 (Rejected): Oops in __cfh_to_dentry
- I set up an HA pair of NFS servers which re-export Ceph to NFS clients.
The HA pair is in active/standby mode, using... - 07:42 AM Feature #2087: lightweight filestore workload generator
- Memory leak fixed.
Apparently, the FileStore does not clean up after transactions once they are applied, which may ... - 06:21 AM Feature #2087 (In Progress): lightweight filestore workload generator
- Looks like some memory must be leaking badly, such that valgrind hangs on exit.
==19080==
==19080== HEAP SUMMARY... - 07:24 AM Linux kernel client Bug #2064 (Resolved): ceph-client: messenger: nocrc flag not implemented correctly
- Linus pulled in the changes without any immediate trouble, so
I'm marking this and a few others resolved. - 07:12 AM Linux kernel client Bug #2157 (Resolved): ceph: xattr: fix nanosecond display on i_rctime
- Linus pulled in the changes without any immediate trouble, so
I'm marking this and a few others resolved. - 07:12 AM Linux kernel client Bug #2156 (Resolved): ceph: xattr: fix a possible buffer overrun bug
- Linus pulled in the changes without any immediate trouble, so
I'm marking this and a few others resolved. - 07:11 AM Linux kernel client Bug #2155 (Resolved): ceph: xattr: wrong value assumed for "no preferred PG"
- Linus pulled in the changes without any immediate trouble, so
I'm marking this and a few others resolved. - 05:56 AM Feature #2223 (Resolved): Tracing facility on FileStore
- Allow a user to specify a file in which to log the transactions that come through OSDs' FileStores.
This should all... - 05:47 AM Revision b3069e50 (ceph): ceph_argparse: drop useless declaration from unit test
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 12:16 AM Revision 4269f8d5 (ceph): ReplicatedPG: ctx might not contain an OpRequest
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 12:16 AM Revision 135a11ba (ceph): FileJournal: optionally zero journal on create
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 12:15 AM Revision 2486c61a (ceph): FileStore: Pass OpRequestRef into filestore in queue_transaction
- This allows us to track op progress through the filestore.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> - 12:15 AM Revision d026cdc7 (ceph): FileJournal: use DSYNC for directio path
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 12:15 AM Revision 533bbf7b (ceph): osd/: OpRequest implements TrackedOp for passing into filestore
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
03/28/2012
- 11:12 PM Bug #2211: osd: entity_inst_t OSDMap::get_inst(int) const
- Ah, I see the bug now. Pushed a fix to wip-osd-hb, thanks!
Let us know if this behaves for you.. if so I'll pull ... - 04:23 AM Bug #2211: osd: entity_inst_t OSDMap::get_inst(int) const
- It's quite large (222MB), so I uploaded the file, available at: http://logger.ceph.widodh.nl/ceph/osd.1.log_27-03-201...
- 10:51 PM Bug #2165: osd: recovering ending with missing
- see wip-osd-recovery-sources
- 10:46 PM CephFS Bug #1811: 2 pjd chown tests failed on cfuse
- ...
- 04:02 PM Revision 4f0d170a (ceph): test: test_workload_gen: Change CLI option and add '--help' usage.
- With this commit, we support the following options (and old ones are no
longer available):
--test-num-colls VAL ... - 03:34 PM Revision 18d219e5 (ceph): rgw: replace dout with ldout
- librgw can't use g_ceph_context
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> - 03:21 PM Feature #2222: osd: distinguish between 'degraded' and 'misplaced'
- We should pick a designator that doesn't make it sound like the objects are lost.
- 02:27 PM Feature #2222 (Resolved): osd: distinguish between 'degraded' and 'misplaced'
- normal data migration happens with an acting set > the up set, so that we never drop below N replicas, but we still ca...
- 02:45 PM Feature #2087: lightweight filestore workload generator
- 02:07 PM Bug #2221 (Resolved): Monitor setup bugs
- Carl reported several configuration issues when creating new monitors (based on the instructions at http://ceph.newdr...
- 01:59 PM Revision a3bdf055 (ceph): test: test_workload_gen: Default arguments, and minor changes.
- Besides adding support for default arguments, passed onto global_init(),
this commit fixes a conflict in Makefile.am,... - 01:32 PM Revision 37cdbcd4 (ceph): log: fix up unittest
- Fewer entries; compile.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 08:35 AM rgw Bug #2220 (Resolved): rgw: librgw dep on g_ceph_context
- Fixed, commit:18d219e512a8e0f427a2229a71e15869cac3b593.
- 07:16 AM rgw Bug #2220 (Resolved): rgw: librgw dep on g_ceph_context
- from last night's qa,...
- 04:37 AM Bug #2219: OSD's commit suicide with 0.44
- I accidentally removed the core file(s) :(
Hope this one pops up again so I have a core file. - 04:11 AM Linux kernel client Tasks #2138: rbd: run xfstests on a local XFS filesystem over RBD
- After setting up two rbd devices and making some fairly simple changes
to xfstests, then setting up appropriate envi... - 04:04 AM Linux kernel client Bug #2155: ceph: xattr: wrong value assumed for "no preferred PG"
- This got rebased: 3489b42a72a41d477665ab37f196ae9257180abb
This has been sent as part of a pull request to Linus ... - 04:04 AM Linux kernel client Bug #2156: ceph: xattr: fix a possible buffer overrun bug
- This got rebased: 3489b42a72a41d477665ab37f196ae9257180abb
This has been sent as part of a pull request to Linus ... - 04:03 AM Linux kernel client Bug #2157: ceph: xattr: fix nanosecond display on i_rctime
- This got rebased: 3489b42a72a41d477665ab37f196ae9257180abb
This has been sent as part of a pull request to Linus ... - 04:01 AM Linux kernel client Bug #2064: ceph-client: messenger: nocrc flag not implemented correctly
- It got rebased once more, and this should be the last:
37675b0f42a8f7699c3602350d1c3b2a1698a3d3
This has been s... - 03:52 AM Bug #2178: rbd: corruption of first block
- Hi,
I decided to upgrade to "latest-n-greatest" in the test-cluster, to make sure, that if I hit the error again w... - 02:58 AM Revision 94e3abf8 (ceph): Merge branch 'stable'
- 12:22 AM Revision 8948ad01 (ceph): test: test_workload_gen: CodeStyle compliance and cleanup.
- This commit aims at compliance with Ceph's CodeStyle, as well
as cleaning up some lingering unused code.
Also, n... - 12:22 AM Revision d172b40c (ceph): test: test_workload_gen: Destroy collections.
- 12:22 AM Revision 3770096a (ceph): test: test_workload_gen: Mimic an OSD's workload.
- In its current state, the workload generator will queue a lot of
transactions onto the FileStore, and will wait if n... - 12:18 AM Revision 749826c2 (ceph): allow use of a separate journal block device
03/27/2012
- 11:44 PM Revision ffc468f2 (ceph): osdmap: less noisy about osd additions during buildmap
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 11:44 PM Revision 36c2f27d (ceph): osdmaptool: fix clitest conf filename
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 11:37 PM Revision ca1f79b5 (ceph): dout: no newlines on dout_emergency
- Preserve old behavior to avoid breaking all the cli tests.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 11:27 PM Revision d5360968 (ceph): throttle: fix off by one issue
- We were blocking only if we exceeded max count, not if
we reached it.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdrea... - 11:23 PM Revision a52d048a (ceph): rgw: throttle incoming requests
- Don't accept more than the number of threads, otherwise if the cluster is
backed up for any reason we'd end up exhausting... - 11:16 PM Revision 30cadf01 (ceph): prebufferedstreambuf: fix typedef
- 'typename' not allowed here:
./common/PrebufferedStreambuf.h:27: error: using 'typename' outside of template
Signed... - 10:35 PM Revision 93ba4c00 (ceph): Merge branch 'wip-intent-fixes'
- 10:35 PM Revision ca4fab47 (ceph): Merge branch 'master' of ssh://github.com/ceph/ceph
- 10:35 PM Revision 16b60b3e (ceph): rgw: minor style fixes
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 10:12 PM Revision 4d74a7b2 (ceph): osd: fix handling of recovery sources when osds go down
- If a source osd goes down, we need to
- reset any pulls (already did that before)
- remove peer from missing_loc s... - 10:03 PM Revision 8fdde24c (ceph): osd: remove down osds from peer_*_requested maps
- This will leave less crap around to confuse recovery if a source osd goes
down and then up.
Signed-off-by: Sage Weil... - 10:02 PM Revision 1ee60873 (ceph): osd: maintain missing_loc_sources
- This is a superset of all missing_loc values... everywhere we might
pull an object from, or are currently pulling fro... - 09:37 PM Revision 5dbb9715 (ceph): rgw: all intent log operations are now async
- That includes removing a directory index object, and the removal of
the actual intent log object.
Signed-off-by: Yeh... - 09:20 PM Revision 0b1e3ed4 (ceph): osd: increase default heartbeat_interval to 6 seconds
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:12 PM Revision 69844496 (ceph): rgw: remove pool_list(), can't list_objects() on system buckets
- pool_list() was broken, replaced now with pool_iterate(). list_objects()
shouldn't be used any more with system bucke... - 09:04 PM Revision 2e9079cf (ceph): rgw: intent log processing uses new pool_iterate()
- instead of pool_list(), which is broken (it assumes pgls results are
sorted, which they are not).
Signed-off-by: Yehuda Sade... - 08:57 PM Revision 1814aac1 (ceph): Merge branch 'misc-fixes-for-review'
- 08:57 PM Revision d5c4015d (ceph): uclient: We want to release cache when we lose the CACHE cap, not gain it!
- Looks like this was detected as a problem back in
84644dc56183b67050793a1b8da07850508b29d6 but the fix wasn't complet... - 08:57 PM Revision c3b04644 (ceph): paxos: share_state sends every unknown value, including the stashed one
- Sage points out that the stashed object might not be the same as the
one we actually archive. For instance, OSDMonito... - 08:57 PM Revision 2acf4aea (ceph): mon: Paxos needs to store the latest version permanently on-disk.
- Previously it was only storing this m->latest_value in the stash,
which of course got overwritten. And then when some... - 08:57 PM Revision d0ba27ae (ceph): doc: add a short thing on kernel client troubleshooting.
- I just noticed this sitting uncommitted in my tree.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> - 08:02 PM Revision c89b7f22 (ceph): v0.44.1
- 06:35 PM Revision 6044c5b8 (ceph): hadoop: define subsystem, fix logging
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:31 PM CephFS Bug #2218: CephFS "mismatch between child accounted_rstats and my rstats!"
- The MDS log is at https://matthew.royhousehold.net/mds.a.log.1.gz (1505MB, md5 197ef232d50d27e2b7c2f62370c9c6b6)
- 02:45 PM CephFS Bug #2218 (Need More Info): CephFS "mismatch between child accounted_rstats and my rstats!"
- There's not enough info in the attached log to figure out what happened. I can tell you that your home directory beli...
- 06:20 PM Revision ce61a83f (ceph): log: throttle message submission, trim recent
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:20 PM Revision fe56818e (ceph): config: configure log thresholds
- - max new entries before we wait for flush
- max recent entries to keep around
Signed-off-by: Sage Weil <sage@newdre... - 06:05 PM Revision 339956df (ceph): log: don't spam -1 to syslog; add err_to_syslog for consistency
- This matches the stderr settings.
Signed-off-by: Sage Weil <sage@newdream.net> - 05:44 PM Revision 17a95c22 (ceph): log: use PrebufferedStreambuf
- It's faster than ostringstream!
Signed-off-by: Sage Weil <sage@newdream.net> - 05:44 PM Revision bfa2bcd7 (ceph): prebufferedstreambuf: fix get_str()
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:43 PM Revision 0e3c0c44 (ceph): bench_log: flush
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:42 PM Revision 3a87e452 (ceph): log/EntryQueue: no implicit trim
- dequeue() things explicitly if you want to remove them.
Signed-off-by: Sage Weil <sage@newdream.net> - 05:41 PM Revision f66e0750 (ceph): utime_t: sprintf() method
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:41 PM Revision 6ab85264 (ceph): do_autogen: control optimization level
- -O 2 -> -O2
Signed-off-by: Sage Weil <sage@newdream.net> - 05:41 PM Revision a4509273 (ceph): common: add PrebufferedStreambuf
- Simple streambuf that uses a preallocated buffer, and then spills over
into a std::string if necessary.
Signed-off-b... - 05:41 PM Revision 23f0af3c (ceph): test log performance with PreallocatedStreambuf
- - faster than ostringstream in optimistic case
- same as ostringstream + std::string assignment in worst case (use
... - 05:41 PM Revision 8c5046fa (ceph): bench_log: simple util to time how long it takes to log stuff
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:41 PM Revision 362ca19b (ceph): log: move create_entry() into Log interface
- This will let us be smarter than putting it on the heap.
Signed-off-by: Sage Weil <sage@newdream.net> - 05:41 PM Revision c7242bfe (ceph): log: flush on_exit
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:41 PM Revision abfadb9b (ceph): assert: dump recent log entries on failed assertions
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:41 PM Revision f41887e3 (ceph): log: new logging infrastructure
- - explicitly defined subsystems, and ceph_subsys_FOO enums to go with them
- modular log system with Entry object
- s... - 04:26 PM rgw Bug #2197 (Resolved): rgw: need to throttle incoming requests
- Fixed, commit:a52d048ac429c3d2b6a9286d96253308f6588762.
- 04:10 PM Bug #2178: rbd: corruption of first block
- The next step is to reproduce the corruption on the test cluster with logs:
debug osd = 20
debug ms = 1
debug... - 08:37 AM Bug #2178: rbd: corruption of first block
- Well,
one more comment:
my guess would be, it has something to do with expansion of the "sparse-file" while writin... - 05:24 AM Bug #2178: rbd: corruption of first block
- Good morning ;)
meanwhile I have not been lazy. I've managed - with current setup in test-cluster - to produce "in... - 04:07 PM Bug #2164: osd: scrub missing _, snapset attrs
- wip-2164
it's a problem with the collection_move guard (or lack thereof) - 03:40 PM rgw Bug #2208 (Resolved): rgw: radosgw-admin temp remove failure
- Fixed, merged at commit:93ba4c004a9269148a75b67da2522855cb1842a3.
- 02:19 PM Bug #2219 (Need More Info): OSD's commit suicide with 0.44
- Can you look at the core file and 'thread apply all bt'?
- 05:57 AM Bug #2219: OSD's commit suicide with 0.44
- ...
- 05:03 AM Bug #2219 (Can't reproduce): OSD's commit suicide with 0.44
- I noticed this myself today, but on IRC somebody else came along:...
- 02:03 PM Bug #2199 (Resolved): mon: get_bl osdmap_full/9583 No such file or directory
- Merged to master in commit:1814aac17593dee0fa4c774d5b462f277f6698da, reviewed by Sage — even though I forgot to add t...
- 12:25 PM Bug #2211: osd: entity_inst_t OSDMap::get_inst(int) const
- Can you attach the full osd.1 log?
- 12:36 AM Bug #2211: osd: entity_inst_t OSDMap::get_inst(int) const
- Overnight I saw 16 OSDs go down with the same backtrace.
All OSDs were running with debug ms/osd set to 1, this... - 09:07 AM Linux kernel client Bug #2174: rbd: iozone thrashing failure
- I've been off on other things, but this problem apparently recurred
even if the latest check-in (Josh's change) in p... - 08:38 AM CephFS Bug #2217: sync and O_DIRECT writes only write first extent in iov vector
- The code should not be written that way.
However I think it doesn't matter at this point, because the only caller
...
03/26/2012
- 11:48 PM Revision 974a2013 (ceph): objecter: don't call op_throttle_ops.take(1) unconditionally
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 11:17 PM Revision 679cd1fe (ceph): objecter: add in-flight ops throttling
- In addition to ops length, we also want to throttle it by
actual number of ops.
Signed-off-by: Yehuda Sadeh <yehuda@... - 10:02 PM Revision d6b0cbd4 (ceph): config: use our assert
- Signed-off-by: Sage Weil <sage@newdream.net>
- 07:40 PM Revision c3dc6a6e (ceph): msg: assert pipe->msgr == msgr
- Fixes: #2216
Signed-off-by: Sage Weil <sage@newdream.net> - 06:57 PM Revision e30b7710 (ceph): rbd: fix typo in default config
- pyflakes would have caught this if 'all' weren't a built-in function
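A minimal sketch of why a static checker stays quiet in this situation. It does not reproduce the actual typo in the rbd default config; the helper name `pick_targets` and its arguments are hypothetical. It only shows that a bare `all` is never an undefined name in Python, because lookup falls through to the built-in function.

```python
import builtins


def pick_targets(images, selection=None):
    """Hypothetical helper; the author meant to default to every image."""
    if selection is None:
        # Typo under illustration: no local or global named `all` exists,
        # so the name silently resolves to the built-in all() function.
        # An undefined-name check has nothing to report here.
        selection = all
    return selection


targets = pick_targets(["img1", "img2"])
# The "default" is the built-in function, not a list of images.
print(targets is builtins.all)
```

Had the stray name been anything that is not a built-in, the same line would raise NameError at run time and be reported as an undefined name by pyflakes.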
- 06:43 PM Revision 483fcf80 (ceph): doc: include crush in toctree
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:30 PM Revision 3bd1f18e (ceph): doc: few notes on manipulating the crush map
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:24 PM CephFS Bug #2218 (Resolved): CephFS "mismatch between child accounted_rstats and my rstats!"
- The mismatch is detected at 2012-03-26 18:39:54.306661...
- 05:15 PM Revision 6db77158 (ceph): doc/dev/peering.rst: fix typo
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 04:39 PM Revision 1a0360cb (ceph): osd/: OpRequest is no longer a RefCountedObject, remove puts/gets
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 04:38 PM Revision ea377a08 (ceph): osd/: Convert OpRequest* to OpRequestRef
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 04:38 PM Revision 2cb6c7d0 (ceph): OSD: Add typedef for shared_ptr<OpRequest>
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 04:36 PM Revision 3ed784c9 (ceph): osd/: add mark_event to OpRequest and move tracking into OpTracker
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 03:51 PM Bug #2192: ceph-mon hangs consuming 100% CPU
- It reproduced every time, on 0.44 as well. After I adjusted the cluster to have only one monitor, the problem went away. (U...
- 02:44 PM CephFS Bug #2217 (Resolved): sync and O_DIRECT writes only write first extent in iov vector
- static ssize_t ceph_aio_write(struct kiocb *iocb, const struct iovec *iov,
unsigned long nr_segs, loff_t po... - 01:34 PM Bug #2199 (Fix Under Review): mon: get_bl osdmap_full/9583 No such file or directory
- Re-pushed misc-fixes-for-review.
- 09:59 AM Bug #2199 (In Progress): mon: get_bl osdmap_full/9583 No such file or directory
- Sage pointed out the stash data structure isn't necessarily the same as the other stored data structures, so this nee...
- 12:47 PM Messengers Cleanup #2216 (Resolved): SimpleMessenger should make sure it owns passed-in Connections
- 10:50 AM Messengers Cleanup #2216 (Resolved): SimpleMessenger should make sure it owns passed-in Connections
- Otherwise we get weird issues like #2212.
- 12:38 PM Cleanup #2191: reexamine simple_spinlock
- my log branch drops this for the dout logging. the last user is the buffer.h debugging (enabled manually via a macro...
- 12:06 PM RADOS Bug #2047: crush: with a rack->host->device hierarchy, several down devices are likely to cause b...
- fwiw dropping the local search behavior fixes this bad behavior. the question is what probably was the local search ...
- 11:27 AM RADOS Bug #2047: crush: with a rack->host->device hierarchy, several down devices are likely to cause b...
- 11:27 AM Bug #2210 (Duplicate): osd: some PGs remains remapped or degraded
- this is actually a crush problem, see #2047.
- 09:45 AM Bug #2210: osd: some PGs remains remapped or degraded
- #2173 has some osd logs and related info for the same problem on a less clean cluster. Thanks for the detailed steps ...
- 10:36 AM CephFS Fix #2215 (Resolved): ceph-fuse does not invalidate page cache
- Right now the userspace client doesn't invalidate the page cache when it loses the cache capability on an inode. Appa...
- 09:58 AM Bug #2212 (Resolved): osd: FAILED assert(msgr->lock.is_locked())
- ah, i was using wrong msgr, fixing!
- 05:50 AM Bug #2212 (Resolved): osd: FAILED assert(msgr->lock.is_locked())
- With the new heartbeat code I noticed a couple of OSD's go down with:...
- 09:58 AM RADOS Bug #2214 (Resolved): crush: pgs only mapped to 2 devices with replication level 3
- This is from #2173. Note that all 3 osds are up....
- 09:43 AM Bug #2173 (Resolved): MDS crash when start with end of buffer
- 06:04 AM Feature #2213 (Resolved): rbd: shouldn't need config file to get help
- I just ran "rbd --help" on a pretty much un-configured machine and got:
global_init: unable to open config file.
...
- 05:22 AM Bug #2211 (Resolved): osd: entity_inst_t OSDMap::get_inst(int) const
- While trying out the new heartbeat code I encountered this crash:...
- 03:28 AM Revision e478a758 (ceph): vstart: enable omap for xattrs
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
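As an illustration of the failure mode named in CephFS Bug #2217 above (only the first extent of an iov vector being written), here is a minimal user-space sketch in Python. It is not the kernel fix: `write_all_segments` is a hypothetical helper, and an in-memory buffer stands in for the file. It only shows the intended behavior, where every segment is written at an advancing offset rather than just the first one.

```python
import io

def write_all_segments(f, segments, pos):
    """Write every segment of a scatter list to f, starting at offset pos.

    Hypothetical helper for illustration; returns the total bytes written.
    """
    total = 0
    for seg in segments:
        # Advance the file position by what has already been written,
        # so later segments are not dropped or written over earlier ones.
        f.seek(pos + total)
        total += f.write(seg)
    return total

buf = io.BytesIO()
written = write_all_segments(buf, [b"abc", b"de", b"fgh"], 0)
```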
03/25/2012
- 08:39 PM Bug #2173: MDS crash when start with end of buffer
- Shall we close this bug, as the MDS server was recovered by providing an empty session map and we cannot reproduce ...
- 08:39 PM Bug #2210 (Duplicate): osd: some PGs remains remapped or degraded
- Some PGs remain in 'remapped' or 'degraded' status after adding an OSD server.
The steps to reproduce the bug:
1....
- 03:05 PM Revision f4b2097a (ceph): Merge remote branch 'gh/wip-doc-peering'
- 02:57 PM Revision d3bcac24 (ceph): Makefile: fix modules that cannot find pk11pub.h when compiling with NS...
- Signed-off-by: Jim Schutt <jaschut@sandia.gov>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 02:57 PM Revision 3ab28950 (ceph): don't override CFLAGS
- leveldb adds -I flags to CFLAGS and CXXFLAGS, but if these macros are
overridden in the make command line, the flags ...
- 09:54 AM Feature #2087: lightweight filestore workload generator
- Pushed a new commit to [1], making the code compliant with the CodeStyle and with Sage's suggestions on github.
[1...
- 04:47 AM Revision ef17c8c9 (ceph): add smoke suite
- This could probably be collapsed into a bunch of singleton tasks to make
it simpler to track how many actual jobs res...
- 04:20 AM Revision b5641ef3 (ceph): rgw: don't #include fcgi from rgw_common.h
- ceph-dencoder #includes rgw_common.h, and needs to build even when
--without-radosgw is specified and libfcgi isn't i...
- 04:09 AM Revision 1c1192a9 (ceph): backfill: use 'rbd' pool instead of 'data'
- (data has a replay interval, which makes writes take longer to resume
after repeering)
- 04:09 AM Revision 397e7f2f (ceph): add osd_recovery task to test divergent osd logs
03/24/2012
- 11:07 PM Revision 24910c3b (ceph): add osd-recovery test
- 11:07 PM Revision 6bf9c957 (ceph): renamed backfill -> osd_backfill
- 11:05 PM Revision ca9a5a4a (ceph): rename backfill -> osd_backfill
- 10:36 PM Revision 22e80874 (ceph): put filestore xattr option in [global]
- ...for test_filestore_idempotent's benefit
- 09:41 PM Feature #2134: qa: smoke suite
- 09:04 PM Feature #1802 (Resolved): qa: test to exercise divergent osd logs
- 03:10 PM Bug #2192: ceph-mon hangs consuming 100% CPU
- Is this reproducible? Are you able to connect to the ceph-mon process with gdb?
- 03:06 PM Bug #2185 (Won't Fix): osd/ReplicatedPG.cc: 5938: FAILED assert(r >= 0) in ReplicatedPG::scan_ran...
- 08:13 AM Feature #2087: lightweight filestore workload generator
- Pushed a working version to ceph's git repository, branch wip-2087 [1]. Feedback would be appreciated.
[1] - https...
03/23/2012
- 08:27 PM Revision 2ec8f27f (ceph): rados_bench: generate_object_name now takes a buffer length
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
- 05:27 PM Bug #2209 (Resolved): osd: read kb stats not tracked?
- 01:21 PM Bug #2196: `rados bench` will write test objects with a constant oid, under-reporting performance.
- 2ec8f27f58adca40d125051a23547b639ee7d5f6
- 01:21 PM Bug #2196 (Resolved): `rados bench` will write test objects with a constant oid, under-reporting ...
- 12:53 PM rgw Bug #2208 (Resolved): rgw: radosgw-admin temp remove failure
- The radosgw-admin temp remove on congress goes into infinite loop when trying to list the .intent-log pool.
- 11:07 AM Bug #2200 (Can't reproduce): mon: not accepting new connections
- Yehuda's indicated that this might be tied in to networking issues that were ongoing at the time. Given the symptoms ...
- 11:04 AM Bug #2199 (Fix Under Review): mon: get_bl osdmap_full/9583 No such file or directory
- I believe this is fixed in misc-fixes-for-review commit:e08b489d094efe384c3db639af0be765665bee23. Sage needs to revie...
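Bug #2196 above reports that `rados bench` wrote its test objects under a constant oid, and revision 2ec8f27f changes `generate_object_name` to take a buffer length. The real function lives in the C++ benchmark code; the sketch below is only a Python illustration of the idea, with a made-up naming scheme (`<prefix>_object<seq>`) and a hypothetical `max_len` parameter standing in for the buffer length.

```python
def generate_object_name(prefix, seq, max_len):
    """Return a distinct object name per sequence number.

    The name format and max_len handling are assumptions for illustration,
    not the format used by rados bench.
    """
    name = f"{prefix}_object{seq}"
    # Mirror a fixed-size C buffer: leave room for the trailing NUL.
    if len(name) >= max_len:
        raise ValueError("object name does not fit in the buffer")
    return name

# Each write gets its own oid instead of reusing one constant name.
names = [generate_object_name("bench", i, 128) for i in range(4)]
```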
03/22/2012
- 11:09 PM Bug #2200: mon: not accepting new connections
- Okay, that appears to not be it (the connections established and terminated match for clients and are only off by 9 o...
- 10:09 PM Bug #2200: mon: not accepting new connections
- There's not a lot I can do to diagnose this with just logs; the Monitors don't refuse connections like that on their ...
- 09:42 AM Bug #2200 (Can't reproduce): mon: not accepting new connections
- Following a networking downtime and monitors restart (as described in #2199), and following a recovery process, all a...
- 10:00 PM Bug #2199 (In Progress): mon: get_bl osdmap_full/9583 No such file or directory
- Looks like the problem is that the Monitor got elected leader, and while it collected all the state it didn't write i...
- 10:00 AM Bug #2199: mon: get_bl osdmap_full/9583 No such file or directory
- My guess/hope is that this is one of the issues solved by the monitor slurp and other fixes since 0.41, but I haven't...
- 09:41 PM Revision 21a170e8 (ceph): doc: dev/peering.rst edits from Greg
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:12 PM Bug #2207 (Resolved): osd: crash when op length is greater than op input data
- This could happen due to a malicious or buggy client. I caused this with an accidentally empty request, with positive...
- 05:10 PM CephFS Documentation #2206 (Resolved): Need a control command to gracefully shutdown an active MDS prior...
- There is currently no way to gracefully shutdown an active MDS and allow a standby to activate or to transfer the act...
- 04:53 PM Bug #2205 (Won't Fix): mkcephfs throws "No such file or directory" errors when the pwd the script...
- When executing mkcephfs on a new cluster the script throws the message "bash: line 0: cd: /home/matthew/forCeph: No s...
- 03:33 PM Revision 8fa904a6 (ceph): doc: update dev/peering document
- - fix discussion of last epoch started
- define terms for current and past intervals
- describe role of pg info
- rem...
- 02:55 PM Revision de867632 (ceph): msgr: fix tcp.cc linkage
- Signed-off-by: Sage Weil <sage@newdream.net>
- 02:53 PM Revision fd9935b7 (ceph): cephtool: don't prefix log items
- This just makes it hard to read them.
Signed-off-by: Sage Weil <sage@newdream.net>
- 02:46 PM Subtask #2201: Document old design
- +1; I have no idea what this bug is for
- 11:39 AM Subtask #2201: Document old design
- Old design of...what?
(I see now that it's connected to the omap stuff, but if you could include a little more con...
- 11:34 AM Subtask #2201 (In Progress): Document old design
- 11:33 AM Subtask #2201 (In Progress): Document old design
- 01:08 PM Bug #2196 (In Progress): `rados bench` will write test objects with a constant oid, under-reporti...
- 11:34 AM Subtask #2204 (Rejected): implement upgrade from old design to new design
- 11:33 AM Subtask #2203 (In Progress): implement new design
- 11:33 AM Subtask #2202 (Rejected): Document new design
- 11:33 AM Feature #2149 (In Progress): osd: use omap for snap collections
- 11:17 AM Feature #2198: add an option to force a down osd to be marked immediately out
- Hmm, yeah, I forgot about that.
Somebody was asking about it; I'm not sure if they cared exactly but I'm sure there ...
- 11:08 AM Feature #2198: add an option to force a down osd to be marked immediately out
- Not really, a write will still go to N-1 replicas until the new one is backfilled up through the object's position.
...
- 11:00 AM Feature #2198: add an option to force a down osd to be marked immediately out
- It guarantees that you always have the set number of copies on-disk when you get a commit, instead of probably having...
- 10:47 AM Feature #2198: add an option to force a down osd to be marked immediately out
- What's the motivation for doing that? Is it any better than setting the out interval to be something very short?
- 09:14 AM Bug #2116: Repeated messages of "heartbeat_check: no heartbeat from"
- see new wip-osd-hb branch
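Bug #2207 above describes an OSD crash when an op's stated length exceeds the data actually supplied with it, which a buggy or malicious client can trigger. The general defense is to check the claimed length against the payload before reading. The sketch below is a Python illustration only; `read_op_payload` is a hypothetical name, and raising `ValueError` stands in for whatever error the OSD actually returns.

```python
def read_op_payload(claimed_len, data):
    """Return the first claimed_len bytes of data, or reject the op.

    Hypothetical helper: validates the client-supplied length before
    touching the buffer, so an empty or short request cannot overrun it.
    """
    if claimed_len < 0 or claimed_len > len(data):
        raise ValueError("op length exceeds supplied input data")
    return data[:claimed_len]

ok = read_op_payload(3, b"abcdef")
```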
03/21/2012
- 11:41 PM Revision 2e21adf2 (ceph): Objecter: resend linger_ops on any change
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
- 11:35 PM Revision b47454b6 (ceph): ObjectStore: add COLLECTION_MOVE to dump
- Signed-off-by: Samuel Just <rexludorum@gmail.com>
- 11:35 PM Revision 23313ee6 (ceph): FileStore: whitelist COLLECTION_MOVE on replay
- Signed-off-by: Samuel Just <rexludorum@gmail.com>
- 11:35 PM Revision ec52eeb2 (ceph): FileStore: remove src on EEXIST during collection_move replay
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 11:35 PM Revision 52aff487 (ceph): ObjectStore: Add collection_move to generate_instances
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 09:01 PM Revision 3caa4319 (ceph): ceph: define and use a shell_scripts Makefile variable
- Define a variable "shell_scripts" in the Makefile.in, and use it
along with some pattern rules to avoid some duplicat...
- 09:01 PM Revision 1b2a0669 (ceph): ceph-kdump-copy: add tools for saving kdumps
- This puts in place an init script and a command it runs to save a
kernel core dump to a remote server when a panic or...
- 08:41 PM Bug #2199: mon: get_bl osdmap_full/9583 No such file or directory
- kept logs for the failing monitor under /var/log/ceph/2199
- 08:26 PM Bug #2199 (Resolved): mon: get_bl osdmap_full/9583 No such file or directory
- Happened on congress (afair, off 0.41). One monitor is out for more than a month. Following network outage, both moni...
- 07:00 PM Revision 6f0f250b (ceph): suite: add missing print statement
- 06:58 PM Revision 8a9a5670 (ceph): suite: fix print statement when summary doesn't exist
- 04:59 PM Feature #2198 (New): add an option to force a down osd to be marked immediately out
- 02:25 PM rgw Bug #2197 (Resolved): rgw: need to throttle incoming requests
- In case we can't handle requests, we'd end up accepting requests indefinitely thus we consume fds endlessly. This wil...
- 01:30 PM Revision d0e8f148 (ceph): doc: update list of debian dists
- Signed-off-by: Sage Weil <sage@newdream.net>
- 01:28 PM Revision a608a8fe (ceph): Merge branch 'stable'
- 12:52 PM Bug #2196 (Resolved): `rados bench` will write test objects with a constant oid, under-reporting ...
- (As discussed on @#ceph@, 2012/03/21 -- with thanks to @joshd@)
The command @rados bench@ generates a sequence of ...
- 08:21 AM Bug #2178: rbd: corruption of first block
- The next object is whatever the MBR points to. You can find the object name from the sector offset that gdisk gives y...
- 02:55 AM Bug #2178: rbd: corruption of first block
- Hi Josh,
thanks for taking the time to investigate this... And yes, many others show the same behaviour. Is "the n...
- 06:50 AM Feature #2127: Save kernel core dumps on all of our test machines
- I seem to remember seeing a reference to 'mkcrashrd', a mkinitrd type script that generates the initrd image the cras...
- 02:00 AM Revision 91c08f6e (ceph): Add watch op to rados.py
- Signed-off-by: Samuel Just <sam.just@dreamhost.com>
- 12:51 AM Revision 72361784 (ceph): Objecter: resend linger_ops on any change
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
- 12:51 AM Revision 3019d460 (ceph): TestRados: Add watch
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
- 12:20 AM Revision 2998368a (ceph): rgw: remove unused definition
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 12:20 AM Revision 4760536f (ceph): rgw: keep pool placement info also in cacheable location
- Mirror the pools placement info, so that we can cache it.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 12:00 AM Revision f1563a66 (ceph): Revert "Objecter: add op->resend_on_any_change"
- This reverts commit c53194d75390dd6d5aa4a9a33f741cbd106e3338.
recalc_linger_op_target is used for linger_ops
Signed...
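rgw Bug #2197 above notes that without a limit, rgw keeps accepting requests it cannot handle and consumes fds endlessly. One common way to bound in-flight work is a counting semaphore. The sketch below is a generic Python illustration, not the rgw change; `RequestThrottle` and its method names are hypothetical.

```python
import threading

class RequestThrottle:
    """Cap the number of requests being handled at once (illustrative)."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def try_accept(self):
        # Non-blocking: returns False when all slots are taken, so the
        # caller can refuse or defer the request instead of queuing it.
        return self._slots.acquire(blocking=False)

    def done(self):
        # Free a slot once a request has been fully handled.
        self._slots.release()

throttle = RequestThrottle(2)
results = [throttle.try_accept() for _ in range(3)]
throttle.done()
after_release = throttle.try_accept()
```

Whether to reject, block, or stop accepting connections when the limit is hit is a separate design choice that the entry above does not specify.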
03/20/2012
- 11:11 PM Revision 2daff0e9 (ceph): ReplicatedPG: osd_max_notify_timeout -> osd_default_notify_timeout
- This setting should not override user specified timeout.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 11:11 PM Revision c53194d7 (ceph): Objecter: add op->resend_on_any_change
- lingers must be resent even if the primary does not change.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 11:11 PM Revision fc7a1bda (ceph): ReplicatedPG: return -EBUSY on delete for objects with watchers
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 10:15 PM Revision 6a5cbec3 (ceph): rgw: replace bucket_id generation
- bucket_id is now string: <global instance id>.<num> where
num is increasing monotonically within the current rgw
inst...
- 09:07 PM Feature #2127 (In Progress): Save kernel core dumps on all of our test machines
- I finally have crash dumps getting packaged and sent over to a
remote machine reliably. The problem is that it does...
- 06:59 PM Bug #2178: rbd: corruption of first block
- I looked at the block you attached, and compared it to the first 4MiB of my desktop's hard drive. It looks like it co...
- 03:58 AM Bug #2178: rbd: corruption of first block
- Hi *,
any update on this topic? Cause we are working for hours and days with three people to rescue as many images...
- 06:41 PM Revision cdd5298d (ceph): v0.44
- 05:59 PM Revision e42fbb70 (ceph): rgw: process default alt args before processing conf file
- this fixes #2189
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 05:52 PM Revision e0b8f7a0 (ceph): rgw: process default alt args before processing conf file
- this fixes #2189
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 05:37 PM Revision 51a07339 (ceph): rgw: incrase socket backlog
- 20 is too small
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 05:25 PM Revision 5b331987 (ceph): rgw: fix internal cache api
- This fixes issue #2190
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 05:14 PM rgw Bug #2193 (Resolved): rgw: .pools.avail is not cached
- Fixed, commit:4760536fe573c702bac8fb1d51213d76059e32dc.
We now mirror the info in the object. Still keeping the om...
- 09:28 AM rgw Bug #2193 (Resolved): rgw: .pools.avail is not cached
- Probably due to recent omap changes, we don't cache omap operations. Either we cache it, or just keep available pools...
- 03:10 PM rgw Feature #2194 (Resolved): rgw: replace bucket-marker-ver with better, fast, more scalable solution
- Fixed, commit:6a5cbec38b761d524e699e2a7410a340d093ccca.
- 09:32 AM rgw Feature #2194 (Resolved): rgw: replace bucket-marker-ver with better, fast, more scalable solution
- We use this object in order to create unique prefix for bucket objects (we do it at bucket creation). Instead of this...
- 02:50 PM Revision 815fc3e2 (ceph): suite: failed runs might not have durations
- This was one cause of emails not being sent - stale /tmp/cephtest dirs
fail without recording a duration.
- 10:47 AM rgw Bug #2189 (Resolved): rgw: can't change debug level through ceph.conf
- Fixed, commit:e0b8f7a0331b0ceee54a911bb9231cb168eb2d0f.
- 10:28 AM rgw Bug #2190 (Resolved): rgw: cache disabled
- Fixed, commit:5b3319870ea9d6c715c671e006e3a772008e3e78.
- 09:43 AM CephFS Feature #2195 (Resolved): Allow removal of last MDS if there's no filesystem
- Right now you can't remove the last MDS from your cluster, which means that if you aren't using it and it's off you w...
- 05:43 AM Bug #2192 (Won't Fix): ceph-mon hangs consuming 100% CPU
- I have a test setup of two nodes each running 0.43 mds, mon and osd. I mount ceph kernel filesystem at /srv/ceph on b...
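Revision 6a5cbec3 above describes the new bucket_id as a string of the form `<global instance id>.<num>`, with `num` increasing monotonically within the current rgw instance. A rough Python sketch of that shape follows; the class name, the starting value of 1, and the sample instance id are assumptions made for illustration and are not taken from the rgw code.

```python
import itertools

class BucketIdGenerator:
    """Produce ids shaped like '<global instance id>.<num>' (illustrative)."""

    def __init__(self, instance_id):
        self._instance_id = instance_id
        # Assumed to start at 1; the source only says num increases.
        self._counter = itertools.count(1)

    def next_id(self):
        return f"{self._instance_id}.{next(self._counter)}"

gen = BucketIdGenerator("4123")  # hypothetical instance id
ids = [gen.next_id() for _ in range(3)]
```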
03/19/2012
- 11:36 PM Cleanup #2191 (Resolved): reexamine simple_spinlock
- We've got a homebrewed spinlock implementation in src/common/simple_spin.h/cc. It was written so we could use dout in...
- 11:10 PM Revision f923b840 (ceph): OSD: do not hold obc lock in disconnect_session_watches
- ObjectContext::lock is used only for implementing read_lock and
write_lock. PG::lock is used to protect the ObjectCo...
- 09:16 PM Revision a65d4136 (ceph): suite, coverage: use absolute dirs for isdir checks
- This fixes the results to wait for all jobs to complete again.
- 06:57 PM Revision bdb72c28 (ceph): filestore_idempotent: get coverage and coredumps
- 06:31 PM Revision 6c8db1a8 (ceph): suite: more results logging
- 05:34 PM rgw Bug #2190 (Resolved): rgw: cache disabled
- in master branch only, due to internal api change.
- 05:33 PM rgw Bug #2189 (Resolved): rgw: can't change debug level through ceph.conf
- 05:12 PM Bug #2188 (Resolved): mon: mds rm should be harder to break things with
- If you run ceph mds rm 0 on a healthy cluster, it breaks the Monitor's world. I'm uncomfortable with the command exis...
- 04:04 PM Bug #2183 (Resolved): osd: lockdep cycle with obc lock and watch_lock
- pushed to master f923b840edec79df5791a7fb7fdec8b0b40f25f1
- 03:33 PM Bug #2183: osd: lockdep cycle with obc lock and watch_lock
- I believe it's inappropriate to hold obc->lock there anyway, pg lock serves that purpose.
- 11:07 AM Bug #2185: osd/ReplicatedPG.cc: 5938: FAILED assert(r >= 0) in ReplicatedPG::scan_range()
- In the wip-rbd-bid branch that I pushed last week I added an option to the rbd tool to create images using existing d...
- 11:01 AM Bug #2185: osd/ReplicatedPG.cc: 5938: FAILED assert(r >= 0) in ReplicatedPG::scan_range()
- should be pretty easy to rebuild the xattr, removing the object would corrupt the rbd image
03/18/2012
- 10:36 PM Bug #2173: MDS crash when start with end of buffer
- I have managed to start mds server after resetting the journal. So I can get my data back.
Thanks very much to all o...
- 06:56 PM Revision 7173a8af (ceph): ceph.conf: no comment
- 06:06 PM Revision 7de798f6 (ceph): ceph.conf: set 'filestore xattr use omap = true'
- 05:50 PM Revision 7d2e1056 (ceph): fix teuthology-ls isdir check
- 05:48 PM Revision 94f0ba1e (ceph): run valgrind with cwd set to /tmp/cephtest/archive/coredump
- This lets us capture the vgcore.* files, which always go to valgrind's
cwd.
Fixes: #1953
- 04:09 PM Revision fd851304 (ceph): ReplicatedPG: there should be no object_contexts during on_activate
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 04:08 PM Revision 6c17a7b3 (ceph): Merge branch 'next'
- 04:08 PM Revision 77c08f86 (ceph): osd: fix object_info.size mismatch file due to truncate_seq on new object
- If the first write that creates an object includes a truncate_seq and
truncate_size, we were taking the truncate patch...
- 01:46 PM CephFS Bug #2187 (Can't reproduce): pjd chown/00.t failed test 97
- on both ceph-fuse and kclient, nightly_coverage_2012-03-17-a,
> 1727 FAIL scheduled_teuthology@teuthology collection...
- 01:43 PM CephFS Bug #2159 (Resolved): ceph-fuse: big_writes option not recognized
- 12:09 PM Bug #2080 (Resolved): osd: scrub on disk size does not match object info size
- 12:09 PM Bug #1953: teuthology: core files aren't archived when using valgrind
- 12:07 PM Bug #2164: osd: scrub missing _, snapset attrs
- this was non-btrfs, right after the new idempotent replay stuff was fixed.
- 10:50 AM Bug #2186 (Can't reproduce): osd: shutdown race
- ...
- 10:07 AM Bug #2180 (Resolved): osd/ReplicatedPG.cc: 3381: FAILED assert(obc->watchers.size() == 0)
03/16/2012
- 11:59 PM Revision 619fe730 (ceph): .gitignore: xattr_bench
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 10:06 PM Revision 3a6c085e (ceph): heartbeatmap: use utimes(2) instead of futimens(2)
- For poor users with ancient glibc. We don't much care about rename races
here anyway.
Signed-off-by: Sage Weil <sag...
- 09:36 PM Revision 63ec06b3 (ceph): osd: remove special handline for head recovery from clone
- This breaks because:
- we don't have the head or current snapset
- get_object_context() creates a new snapset, whi...
- 08:49 PM Revision d8bcc1b3 (ceph): config: fix recursive locking of md_config_t::lock
- Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 08:30 PM Revision 58c5d5a0 (ceph): osd: ReplicatedPG::create_object_context()
- New helper that creates a new object context.
Signed-off-by: Sage Weil <sage@newdream.net>
- 08:30 PM Revision d4addf57 (ceph): osd: re-use create_object_context() in get_object_context()
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:30 PM Revision 15d85af4 (ceph): osd: explicitly create new object,snap contexts on push
- We specifically want to use this during recovery to avoid loading the obc
or ssc for a previous version of the object...
- 08:28 PM Revision 01924a22 (ceph): disable rbd thrash workload, #2174
- 08:04 PM Revision 96780bd1 (ceph): osd: create_snapset_context()
- Signed-off-by: Sage Weil <sage@newdream.net>
- 07:15 PM Revision 872bdd0d (ceph): osd: ensure we don't clobber other *contexts when registering new ones
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:52 PM Revision 9791035d (ceph): Merge branch 'wip_omap_xattrs'
- 06:44 PM Revision 07b97fe7 (ceph): suite: log results and coverage generation
- Need to figure out where and when results emails are failing.
- 06:40 PM Revision 2a593dda (ceph): RadosModel: test xattrs with omap
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 06:40 PM Revision a49a1972 (ceph): ReplicatedPG,FileStore: clone should copy xattrs as well
- _make_clone (called from make_writeable) and _rollback_to included
attr reads from head or a clone. In that case, an...
- 06:40 PM Revision 14506dc6 (ceph): FileStore: add support for omap xattrs
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 06:31 PM Revision a5f143d2 (ceph): Merge branch 'wip-msgr4'
- Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 06:29 PM Revision 983fd190 (ceph): ObjectMap: add interface for storing xattrs
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 06:29 PM Revision d8325e50 (ceph): DBObjectMap: implement xattr interface
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 06:29 PM Revision fdb92748 (ceph): test_object_map: update unit test for xattr
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 06:29 PM Revision 8fc43179 (ceph): config_opts.h: opts for omap_xattrs
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 06:29 PM Revision ecd875fe (ceph): tests/: Added xattr bench
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 06:29 PM Revision b09fb15d (ceph): ObjectMap: use Index object for locking rather than path object
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 06:28 PM Revision 9fd4a12a (ceph): DBObjectMap: add support for storing xattrs
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 06:18 PM Bug #2185: osd/ReplicatedPG.cc: 5938: FAILED assert(r >= 0) in ReplicatedPG::scan_range()
- strace indicated we had a missing xattr on
2268 stat("/data/osd0/current/164.2_head/rb.0.0.000000000000__head_DA6...
- 06:02 PM Bug #2185: osd/ReplicatedPG.cc: 5938: FAILED assert(r >= 0) in ReplicatedPG::scan_range()
- ...
- 03:33 PM Bug #2185: osd/ReplicatedPG.cc: 5938: FAILED assert(r >= 0) in ReplicatedPG::scan_range()
- Here is the output from osd.3 after the recent crash:
root@fcmsnode3:/data/osd3/current# find 0.0_head
0.0_head
0.0_head/10...
- 03:22 PM Bug #2185 (Won't Fix): osd/ReplicatedPG.cc: 5938: FAILED assert(r >= 0) in ReplicatedPG::scan_ran...
- ...
- 06:01 PM Bug #2173: MDS crash when start with end of buffer
- Talked more on irc, soft crack is trying to reset his journal since it looks like at least all his metadata objects a...
- 04:43 PM Bug #2173: MDS crash when start with end of buffer
- osd map file for 'ceph osd getmap 3212 -o /tmp/osdmap'
- 01:31 PM Bug #2173: MDS crash when start with end of buffer
- Greg: look at the osd dump above: all pools are rep size 3.
- 01:13 PM Bug #2173: MDS crash when start with end of buffer
- Did all the pools get set to 3x replication, or are the confused PGs all part of the metadata pool?
- 12:26 PM Bug #2173: MDS crash when start with end of buffer
- Could you attach the output of 'ceph osd dump 3212' and the binary version of that osdmap (ceph osd getmap 3212 -o /t...
- 09:56 AM Bug #2173: MDS crash when start with end of buffer
- Unfortunately we can see that this assert too is caused by ENOENT on an object that really ought to be there, which m...
- 08:19 AM Bug #2173: MDS crash when start with end of buffer
- Can you post an mds log with debug mds = 20 leading up to that last crash?
Resetting the journal is not something ...
- 08:15 AM Bug #2173: MDS crash when start with end of buffer
- I managed to insert an empty sessionmap. The server continued starting.
And I got an assert error:...
- 05:39 PM Revision 0904c7b7 (ceph): configure: fix warnings
- Finally!
Signed-off-by: Sage Weil <sage@newdream.net>
- 05:04 PM Revision f2e6b8d7 (ceph): ReplicatedPG: populate_object_context during handle_pull_response
- A cached objectcontext should always have its watchers populated.
Signed-off-by: Samuel Just <samuel.just@dreamhost....
- 04:43 PM Revision 4cfc34f8 (ceph): leveldb: .gitignore TAGS
- Signed-off-by: Sage Weil <sage@newdream.net>
- 04:40 PM Revision 5db6902b (ceph): leveldb: un-revert
- Accidentally reverted by c2af646b38995ba005140e748a21baba4263e53f.
Signed-off-by: Sage Weil <sage@newdream.net>
- 02:33 PM Bug #2080: osd: scrub on disk size does not match object info size
- wip-2080
- 01:33 PM Bug #2184 (Resolved): audit calls to populate_obc_watchers and add watch/notify to RadosModel
- 01:32 PM Feature #2125 (Resolved): osd: put large xattrs in leveldb
- 01:20 PM Bug #2183: osd: lockdep cycle with obc lock and watch_lock
- crashed it with this mutl...
- 01:18 PM Bug #2183 (Resolved): osd: lockdep cycle with obc lock and watch_lock
- ...
- 12:04 PM Bug #2180: osd/ReplicatedPG.cc: 3381: FAILED assert(obc->watchers.size() == 0)
- Hi Sage,
here is the corresponding log after upgrading and starting 0.43-1...
Hope it helps,
Oliver.
- 08:56 AM Bug #2180 (Resolved): osd/ReplicatedPG.cc: 3381: FAILED assert(obc->watchers.size() == 0)
- ...
- 10:58 AM Bug #2182 (Resolved): audit osd reads for reads from potentially unstable objects
- In particular, there are places we read object_info and snapset outside of the get_object_context and get_snapset_con...
- 10:50 AM Bug #2181 (Won't Fix): 4051: FAILED assert(!missing.is_missing(soid)) in ceph version 0.43-244-g9...
- v0.43 and this commit from master aren't compatible; the final v0.44 will have a protocol rev to prevent this problem.
- 10:21 AM Bug #2181 (Won't Fix): 4051: FAILED assert(!missing.is_missing(soid)) in ceph version 0.43-244-g9...
- Hi Sage,
here you are. This was the version that failed, too, after all the others didn't help either... Similar with al...
- 09:13 AM Bug #2132: FAILED assert(!missing.is_missing(soid))
- Oliver Francke wrote:
> Well,
>
> it's tagged as resolved, but today another node died...:
>
> osd/ReplicatedPG...
- 05:36 AM Bug #2132: FAILED assert(!missing.is_missing(soid))
- Well,
it's tagged as resolved, but today another node died...:
osd/ReplicatedPG.cc: In function 'void Replicated...
- 04:19 AM Bug #2178: rbd: corruption of first block
- Here is one of many, where the header is missing:
--- 8-< ---
fcms@fcmsnode3:~$ rbd ls 1320396354
vm-451-disk-1....
- 12:34 AM Revision 8fbd087d (ceph): results: make sure email is sent before anything else fails
03/15/2012
- 06:08 PM Bug #2173: MDS crash when start with end of buffer
- Sorry for the mistake.
ceph osd dump -o -:
2012-03-16 09:10:04.887611 mon <- [osd,dump]
2012-03-16 09:10:04.888161...
- 06:01 PM Bug #2173: MDS crash when start with end of buffer
- ceph -s:...
- 10:43 AM Bug #2173: MDS crash when start with end of buffer
- Well that's exciting; this means it's an OSD bug.
The meaning of that output is that of your 209 PGs, 185 are happy;...
- 05:35 PM Revision 89ccd95a (ceph): osd: maybe clear DEGRADED on recovery completion
- We set degraded if we don't have enough "active" replicas, which excludes
the backfill target. We need to recheck th...
- 05:32 PM Revision b4572351 (ceph): Revert "disable rbd thrash workload, #2174"
- This reverts commit 1bec416c7c7ff8a6462d94baaba8e7da73e88ab4.
Fixed with #2174
- 12:58 PM rgw Feature #1941 (Rejected): rgw: revisit bucket removal
- 12:57 PM rgw Feature #785 (Rejected): rgw: fix filesystem backend
- 10:29 AM Bug #2160 (Resolved): active+recovering+degraded+backfill becomes active+clean+degraded when reco...
- 09:49 AM Linux kernel client Bug #2174: rbd: iozone thrashing failure
- The test that reproduced the problem has now run once to completion
without hitting it. Therefore it's ready to shi...
- 08:35 AM Linux kernel client Bug #2174: rbd: iozone thrashing failure
- ...
- 07:57 AM Linux kernel client Bug #2174: rbd: iozone thrashing failure
- Thanks Alex. I remember thinking it fixed a race initially, but then going back later and being unable to find the ra...
- 07:43 AM Linux kernel client Bug #2174: rbd: iozone thrashing failure
- That's excellent Josh. I'll use it, it's basically what I was
thinking of doing anyway, now I'll just use yours. D...
- 07:38 AM Linux kernel client Bug #2174: rbd: iozone thrashing failure
- That analysis of the race looks correct to me. The first unapplied patch in wip-rbd would have fixed this (9a3e22a0ce...
- 07:14 AM Linux kernel client Bug #2174: rbd: iozone thrashing failure
- ...
- 07:12 AM Linux kernel client Bug #2174: rbd: iozone thrashing failure
- I think I can explain this:
[ 265.117432] INFO: trying to register non-static key.
[ 265.149933] the code is ...
- 12:16 AM Revision 826d30f1 (ceph): rgw: remove extra layer of RGWAccess
- Not needed, now that we got rid of RGWFS
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
03/14/2012
- 11:33 PM Revision 80e2a5e8 (ceph): msgr: switch all users over to abstract interface
- This will let us transparently swap implementations out.
Signed-off-by: Sage Weil <sage@newdream.net>
- 11:29 PM Revision 1e1453c1 (ceph): msgr: introduce static Messenger::create() function
- Create a new messenger, with whatever implementation is appropriate.
Signed-off-by: Sage Weil <sage@newdream.net>
- 11:29 PM Revision d26feffd (ceph): msgr: promote more methods to abstract Messenger interface
- This will be everything that people actually use.
Signed-off-by: Sage Weil <sage@newdream.net>
- 11:01 PM Revision c2af646b (ceph): rgw: put_obj() uses bufferlist instead of extra alloc/copy
- makes it cleaner.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 11:01 PM Revision 2b3bfd0c (ceph): rgw: remove fs backend
- was broken anyway
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> - 10:51 PM Revision 1bec416c (ceph): disable rbd thrash workload, #2174
- 08:53 PM Linux kernel client Bug #2174: rbd: iozone thrashing failure
- I tried reproducing the problem, and although I'm not sure I know
how to recognize it, my test did end in failure.
... - 09:54 AM Linux kernel client Bug #2174 (Can't reproduce): rbd: iozone thrashing failure
- consistently failing
- ceph:
log-whitelist:
- wrongly marked me down or wrong addr
- objects unfo... - 08:32 PM Revision e14d428c (ceph): Merge branch 'master' of github.com:ceph/teuthology
- 08:32 PM Revision 2b879905 (ceph): Merge branch 'master' of github.com:ceph/teuthology
- 08:01 PM Revision a81b23e2 (ceph): Merge branch 'next'
- 07:59 PM Revision bec47b57 (ceph): introduce CEPH_FEATURE_OMAP
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:55 PM Revision 8c96fd26 (ceph): leveldb: new .gitignore entry
- Signed-off-by: Sage Weil <sage@newdream.net>
- 07:14 PM Revision 20d11714 (ceph): osd: rev cluster internal protocol
- This covers:
- the push/pull changes in 0.43 (which we forgot to protect against; see
#2132)
- the new omap stuff ... - 06:23 PM Bug #2173: MDS crash when start with end of buffer
- Thanks for your responses.
I created this ceph file system with 1 mon, 1 osd, 1 mds. It works perfectly, and I wr... - 04:11 PM Bug #2173: MDS crash when start with end of buffer
- Huh. Is this a new filesystem? Have you had any problems with the RADOS cluster (the OSDs)?
What's happening now i... - 04:16 AM Bug #2173: MDS crash when start with end of buffer
- I also tried: 'ceph-mds -i 1 -d --reset-journal 0'.
It just freezes. - 04:14 AM Bug #2173 (Resolved): MDS crash when start with end of buffer
- My system is Ubuntu 11.10 64-bit. The MDS just crashes on startup.
I noticed the message: 'No such file or directory'... - 04:36 PM Revision a0bcab5a (ceph): ceph-fuse: make big_writes optional via 'fuse big writes'
- Fixes: #2159
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 04:35 PM rgw Bug #2001 (Resolved): radosgw memory leak
- At this point I can't see any other leak (I already fixed one). Doesn't mean that there isn't another one, but I'm re...
- 04:08 PM CephFS Bug #2179 (Resolved): mds: don't crash on nonexistent SessionMap
- Inspired by #2173. When the MDS tries to load the SessionMap it unconditionally decodes it, which causes a crash if t...
- 03:46 PM Feature #2127: Save kernel core dumps on all of our test machines
- http://linux.die.net/man/8/netdump
this mechanism looks simpler? - 02:55 PM Feature #2127: Save kernel core dumps on all of our test machines
- Wed Mar 14 11:14:50 CDT 2012
OK, I got kernel core dumps and crash working in Ubuntu 11.10.
A lot of what I use... - 02:54 PM Feature #2127: Save kernel core dumps on all of our test machines
- Oh, I forgot to mention I also wrote a little program that extracts
identifying information from a dump file that "k... - 02:48 PM Feature #2127: Save kernel core dumps on all of our test machines
- I have been able to generate a core dump on an Ubuntu system.
I have transferred the result using scp to another hos... - 02:16 PM Bug #2178 (Resolved): rbd: corruption of first block
- 01:02 PM Bug #2132 (Resolved): FAILED assert(!missing.is_missing(soid))
- 12:01 PM Bug #2132: FAILED assert(!missing.is_missing(soid))
- Aha, that explains it... the 0.42.2 and 0.43 interaction looks like the culprit here. We should have made them expli...
- 11:37 AM Bug #2132: FAILED assert(!missing.is_missing(soid))
- All cephfs workload. It could be a versioning issue, I don't have the syslogs anymore that would show when I updated ...
- 11:19 AM Bug #2132 (Need More Info): FAILED assert(!missing.is_missing(soid))
- Matthew Roy: What was the nature of the workload? rbd? ceph fs?
- 11:06 AM Bug #2132: FAILED assert(!missing.is_missing(soid))
- Josh Durgin wrote:
> stxShadow saw this as well.
It looks like in stxshadow's case, it was a version mismatch (cr... - 11:23 AM CephFS Cleanup #2177 (Resolved): mds: play nicely with omap
- Convert the MDS to use OMAP properly.
There is at least one specific thing: right now it has optimizations for whe... - 10:46 AM Bug #2176 (Resolved): dependencies not checked by autoconf
- I recently resurrected a build of the user-mode and kernel clients on CentOS and found that I was missing a few packa...
- 10:09 AM rgw Feature #2171: rgw: asynchronously calculate md5
- Actually, I think it'll be easier doing it the other way around. As we already write the object asynchronously we can...
- 10:08 AM Documentation #2175 (Resolved): doc: fix doc build errors
- e.g., http://ceph.newdream.net/gitbuilder-doc/log.cgi?log=a0bcab5a583e6c1fd87430252590ec902d1b6b98
It would be gre... - 09:56 AM Bug #2022: osd: misdirectect request
- Just saw this with a different workload:...
- 09:51 AM CephFS Bug #2071: kclient: pjd mkfifo failures
- hit this again:...
- 09:49 AM rgw Cleanup #2166 (Resolved): rgw: make sure librgw doesn't link against libfcgi
- Fixed, commit:e19417ef55c713e60c61edd0de7c2228953407a1.
- 09:48 AM rgw Bug #2170 (Resolved): librgw references g_ceph_context
- Fixed, commit:5912312c14a6214f4318fd7bfb6fd08714458b6f.
- 12:21 AM Revision 5912312c (ceph): rgw: remove some more globals from librgw
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 12:04 AM Revision 213a3f5e (ceph): rgw: fix identation
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 12:04 AM Revision d90298de (ceph): ceph-dencoder: don't use rgw types if configured without rgw
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
03/13/2012
- 11:40 PM rgw Feature #2172 (Resolved): rgw: get chunks asynchronously
- Chunks are read synchronously. We need to have a window of chunks that are read asynchronously (as with PUT).
- 11:38 PM rgw Feature #2171 (Rejected): rgw: asynchronously calculate md5
- When doing a PUT we calculate the md5 of the content (used later for the etag) synchronously. We need to be able to c...
- 11:23 PM Revision a9d18975 (ceph): Merge branch 'master' of github.com:ceph/ceph
- 11:22 PM Revision 60524aba (ceph): Added documentation for building the ceph documentation.
- 09:59 PM Revision b9097619 (ceph): rgw: get rid of references to g_ceph_context where required
- trickling down ceph context.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> - 08:48 PM Revision e6969258 (ceph): global: drop yellow warning on startup
- Fixes: #2143
Signed-off-by: Sage Weil <sage@newdream.net> - 08:48 PM Revision e455d388 (ceph): doc: update project status/stability blurb
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Mark Kampe <mark.kampe@dreamhost.com> - 07:55 PM Revision e5934f10 (ceph): qa: kclient/file_layout.sh: ...
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:46 PM Revision 0a2068fc (ceph): Merge branch 'librados-cleanup'
- Reviewed-by: Sage Weil <sage.weil@dreamhost.com>
- 06:46 PM Revision 8f278647 (ceph): librados: split into separate files and remove unnecessary headers
- Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
- 06:46 PM Revision 5f92f338 (ceph): librados: move methods that require an IoCtx to IoCtxImpl
- RadosClient still does a few different things, but at least it
no longer does all the work of an IoCtx.
Signed-off-b... - 06:46 PM Revision db126279 (ceph): ObjectCacher: remove unused and crufty atomic sync operations
- Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
- 06:46 PM Revision 095c3a0e (ceph): OSDMap: make get_pools() const
- Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
- 06:46 PM Revision 16f99606 (ceph): osd_types: use uint64_t for ObjectExtent offsets and lengths
- This is just client in-memory state, and allows us to address objects >4GiB,
to match the existing librados/Objecter ... - 05:49 PM Revision b90354db (ceph): thrash: put client on separate machine from osds
- This allows us to run kernel clients (kclient, rbd) against the thrashing
cluster. - 05:09 PM Revision 5c9acbd8 (ceph): gitbuilder: put flavor last
- in case we refine the field later
- 05:02 PM Revision 1a01ccaa (ceph): Pull from new gitbuilder.ceph.com locations.
- Simplifies the flavor stuff into a tuple of
<package,type,flavor,dist,arch>
where package is ceph, kernel, etc.
typ... - 01:56 PM Bug #2132: FAILED assert(!missing.is_missing(soid))
- stxShadow saw this as well.
- 01:45 PM Cleanup #2143 (Resolved): Remove ALL "don't use this product" warnings
- 01:31 PM Feature #2145 (Resolved): doc gitbuilder
- 12:28 PM Linux kernel client Bug #2064: ceph-client: messenger: nocrc flag not implemented correctly
- Update: the commit had to be rebased, so its ID is now: 4d3e7aa992
- 08:09 AM Linux kernel client Bug #2064: ceph-client: messenger: nocrc flag not implemented correctly
- This is fixed by this commit:
086da4c6f8 libceph: fix inverted crc option logic
That is now present in the c... - 12:26 PM Linux kernel client Bug #2157: ceph: xattr: fix nanosecond display on i_rctime
- This has been fixed in this commit:
260ac0e65b ceph: fix three bugs, two in ceph_vxattrcb_file_layout()
The comm... - 12:26 PM Linux kernel client Bug #2156: ceph: xattr: fix a possible buffer overrun bug
- This has been fixed in this commit:
260ac0e65b ceph: fix three bugs, two in ceph_vxattrcb_file_layout()
The comm... - 12:26 PM Linux kernel client Bug #2155: ceph: xattr: wrong value assumed for "no preferred PG"
- This has been fixed in this commit:
260ac0e65b ceph: fix three bugs, two in ceph_vxattrcb_file_layout()
The comm... - 11:01 AM rgw Bug #2170: librgw references g_ceph_context
- Ouch. Mostly through dout, but there are other references.
- 10:40 AM rgw Bug #2170 (Resolved): librgw references g_ceph_context
- 2012-03-13T00:48:30.009 INFO:teuthology.task.workunit.client.0.err:OSError: /tmp/cephtest/binary/usr/local/lib/librgw...
- 09:31 AM rgw Feature #2169 (Resolved): rgw: api to control bucket placement
- It'd be nice to be able to control which pool the bucket would be placed in when creating it.
- 12:02 AM Revision 98792e93 (ceph): rgw: add more meaningful tests instances of encoded objects
- this completes #2140
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>