Activity

From 11/30/2011 to 12/29/2011

12/29/2011

11:43 PM Revision 585fb5ce (ceph): clitests: update for new error format
This was changed in 1f434da8a3ca4db830d1f3b0d87e5df941d85f2d
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin
11:28 PM Revision cec2692e (ceph): clitests: update monmaptool test
e93961c11119942eae3a4cd14a79f779a5a4d277 changed output format.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin
09:09 PM Revision f04e2955 (ceph): teuthology rgw-admin: annotated test cases for inventory
this is not a nose suite, so I simply added test case
descriptions in csv format, and put a file to extract
the...
Mark Kampe
08:00 PM Revision 48df71c8 (ceph): init script: be LSB compliant for exit code on status
An exit code of 1 on status is defined in LSB as
"program is dead, but pid file exists". Check for existence
of this ...
Florian Haas
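As a rough illustration of the LSB convention this commit refers to, the exit codes of the status action can be modeled as below. This is only a sketch, not the init script's actual logic (the commit message is truncated here); the helper name and its two boolean inputs are hypothetical. The code values (0 for running, 1 for dead with a pid file present, 3 for not running) follow the LSB init script specification.

```python
# LSB-defined exit codes for the "status" action of an init script.
LSB_STATUS_RUNNING = 0         # program is running or service is OK
LSB_STATUS_DEAD_PIDFILE = 1    # program is dead, but pid file exists
LSB_STATUS_NOT_RUNNING = 3     # program is not running


def lsb_status_code(process_alive: bool, pid_file_exists: bool) -> int:
    """Hypothetical helper: map daemon state to an LSB status exit code."""
    if process_alive:
        return LSB_STATUS_RUNNING
    if pid_file_exists:
        # Only a stale pid file justifies exit code 1.
        return LSB_STATUS_DEAD_PIDFILE
    return LSB_STATUS_NOT_RUNNING
```

Under this model, a daemon that was never started and has no pid file reports 3 rather than 1.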
07:58 PM Revision 3b2ca7cf (ceph): keyring: print more useful errors to log/err
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
07:57 PM Revision eba235f2 (ceph): common: trigger all observers on startup
Among other things, this makes err-to-stderr and friends initialize
properly in the DoutStreamBuf.
Signed-off-by: Sa...
Sage Weil
07:24 PM Revision 1f434da8 (ceph): common: make cpp_strerror output prettier
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
07:24 PM Revision 04c8db00 (ceph): librados: check for monclient::init() error
I think this fixes #1835.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:59 PM Revision 1a59405c (ceph): rgw: turn on cache by default
Yehuda Sadeh
05:59 PM Revision 37013b6f (ceph): qa: load-gen-mix-small.sh
Sage Weil
05:41 PM Revision 959fd71f (ceph): osd: explicitly track leading edge of backfill
backfill_pos is the leading edge; last_backfill is the trailing edge.
Anything in between is either pushed, doesn't ex...
Sage Weil
05:31 PM CephFS Bug #1682: mds: segfault in CInode::authority
Not sure if this is the same underlying problem, but here's another CInode::authority crash from teuthology:~teuthwor... Josh Durgin
05:09 PM Revision d24ea235 (ceph): mds: assert if we get an EINVAL on our truncate
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:00 PM Revision 47013c28 (ceph): osd: get fsid from monmap, not osdmap
We may not have a valid OSDMap in all of these cases (notably, during
boot). Always take the fsid from the monmap, w...
Sage Weil
04:59 PM Revision 05cc4eb9 (ceph): monc: get latest monmap during authentication
Tell the monitor which monmap version we have in our initial auth message.
Make the monitor send the latest monmap if...
Sage Weil
04:44 PM Revision 5d5c9b6f (ceph): osdmap: add const markers to some unfixed functions
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
04:44 PM Revision 300c7584 (ceph): osd: catch authenticate error on startup
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
03:30 PM Linux kernel client Bug #1866 (Duplicate): null pointer dereference after osd went down
This was during a kernel_untar_build workunit on rbd:... Josh Durgin
12:15 PM Bug #1835: Monclient crash when keyring is not readable
should be fixed by commit:04c8db001a4ed02ef7335ed01ce73ce9ab28dc9d .. can you verify, Wido? Sage Weil
11:16 AM Feature #1863 (In Progress): qa: tester for osd op reply order
Josh Durgin
11:14 AM Bug #1865 (Duplicate): mon: need to disconnect clients when we drop out of quorum
From sepia4:/tmp/cephtest/archive/log/osd.0.log:... Josh Durgin
10:59 AM CephFS Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
Happened again in teuthology:~teuthworker/archive/nightly_coverage_2011-12-29-a/5318/remote/ubuntu@sepia60.ceph.dream... Josh Durgin
10:33 AM Bug #1490: cfuse assert failure: assert(ob->last_commit_tid < tid)
Happened again in teuthology:~teuthworker/archive/nightly_coverage_2011-12-28-b/5258/teuthology.log. Josh Durgin
10:04 AM Bug #1862: filestore: EINVAL on replay
Marco Aroldi wrote:
> Hmmm
> I have another problem: I've tried the patch in #1759 but I have an error at compile ti...
Sage Weil
10:01 AM Bug #1862: filestore: EINVAL on replay
Hmmm
I have another problem: I've tried the patch in #1759 but I have an error at compile time:
CXX libos_l...
Marco Aroldi
08:33 AM Bug #1862 (Duplicate): filestore: EINVAL on replay
Aha, this is actually #1759. If you apply the patch in that bug report it'll get your OSDs up and running again. Th... Sage Weil
03:33 AM Bug #1862: filestore: EINVAL on replay
Hi,
I've downloaded and compiled the latest code from the git repository.
I've issued a "ceph-osd -i 1 --debug_ms 2...
Marco Aroldi
09:53 AM Bug #1804: filestore: unexpected EINVAL
My money is that this is caused by #1759. Which hopefully means that the qa suite will eventually trigger the new as... Sage Weil
09:17 AM Bug #1741 (Can't reproduce): teuthology: failed to untar
Sage Weil
09:16 AM Bug #1759 (Need More Info): mds/client: truncate size overflow, fails with EINVAL
The OSD now returns EINVAL, the MDS asserts if it gets EINVAL, and we have some MDS-side assertions that should catch... Sage Weil
09:10 AM Bug #1846 (Resolved): Mds crash immediately after start (segmentation fault)
Great! Sage Weil
06:05 AM Bug #1846: Mds crash immediately after start (segmentation fault)
I have built debian package from master branch and upgraded ceph on both servers. Mds and osd started properly. Thank... Maciej Galkiewicz
09:08 AM Bug #1848 (Resolved): osd got zeroed out fsid
Sage Weil
09:08 AM Bug #1848: osd got zeroed out fsid
fixed by commit:47013c289e6ad6638b0f77152dafbc9f4723c032 and commit:05cc4eb93ce6d193c6aea4918144006fb4d1c187 Sage Weil
01:00 AM Revision e18b1c97 (ceph): rgw: removing swift user index when removing user
Yehuda Sadeh
12:50 AM Revision 997e35ae (ceph): rgw-admin: remove subuser index when required
Yehuda Sadeh
12:42 AM Revision 1f40031f (ceph): osd: fix push completion check
Only check backfill if we pushed to the backfill target. And avoid the hash
lookup in the general case.
Signed-off-...
Sage Weil
12:34 AM Revision 2dc90d03 (ceph): rgw: clone operation should only update index for main category
Yehuda Sadeh
12:33 AM Revision bb52b187 (ceph): rgw: fix cache interface (was not overloading method)
Yehuda Sadeh

12/28/2011

11:10 PM Revision 0db9a423 (ceph): rgw: fix bucket creation
Yehuda Sadeh
06:43 PM Feature #1709 (Resolved): specfile: merge suse spec file changes
Sage Weil
06:42 PM Feature #1678 (Resolved): rados tool: ability to specify object locator
Sage Weil
06:42 PM Bug #1683 (Resolved): librados: list objects should also return locator key
Sage Weil
06:41 PM Bug #1508 (Can't reproduce): iozone stuck on kernel rbd mount
haven't seen this recently Sage Weil
05:06 PM Bug #1848: osd got zeroed out fsid
Happened again in teuthology:~teuthworker/archive/nightly_coverage_2011-12-28/5238/remote/ubuntu@sepia56.ceph.dreamho... Josh Durgin
05:05 PM rgw Bug #1854 (Resolved): Deletion of an rgw user that has a subuser with a swift key leaves behind a...
Fixed, commit:e18b1c9734e88e3b779ba2d70cdd54f8fb94743d. Yehuda Sadeh
03:26 PM Bug #1846: Mds crash immediately after start (segmentation fault)
Oh, I think Henry sent in a fix for this. Can you apply commit:bfbeae68c045de76ede86ca4f72d2a760a19c84b (or use late... Sage Weil
02:45 PM Bug #1846: Mds crash immediately after start (segmentation fault)
... Maciej Galkiewicz
08:12 AM Bug #1846: Mds crash immediately after start (segmentation fault)
I looked at the attached monmap and didn't see anything odd. This is fully reproducible, I take it? That's good news.
...
Sage Weil
02:35 PM rgw Bug #1864 (Resolved): rgw: atomic bucket info
Yehuda Sadeh
01:46 PM Bug #1828 (Resolved): osd: preserve write order when ops wait for recovery of src_oids
Sage Weil
01:45 PM Feature #1863 (Resolved): qa: tester for osd op reply order
Out of order osd replies currently trigger an ObjectCacher assert. Presumably there are lots more of them than the one... Sage Weil
09:14 AM Bug #1862: filestore: EINVAL on replay
Hi Sage,
I'm sorry but I don't understand the steps requested.
Please, could you explain a little bit more?
Marco Aroldi
08:19 AM Bug #1862 (Need More Info): filestore: EINVAL on replay
Can you try running the latest master code and restart ceph-osd? Specifically, commit:7133a2faf0ae0710b7cbd9801c6476... Sage Weil
07:25 AM Bug #1862 (Duplicate): filestore: EINVAL on replay
Hello,
I'm testing ceph 0.39 on two VMs (Ubuntu 11.10) on Hyper-V with all Linux Integration Components installed.
2...
Marco Aroldi

12/27/2011

03:18 PM Bug #1846: Mds crash immediately after start (segmentation fault)
Do you have any suggestions on how to temporarily work around this problem? Maciej Galkiewicz

12/23/2011

08:47 PM Revision 4ac04e89 (ceph): rgw: write bucket info in one operation
Yehuda Sadeh
05:56 PM Revision 60bbf688 (ceph): Objecter: fix local reads one more time.
Document it a little since we've gotten it wrong so often.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum
04:59 PM Bug #1841: OSDs should disconnect from Monitor before their MOSDPGStat timeouts happen
Pushed a wip-osd-mon-communication branch that implements this. It's untested, though! Greg Farnum
02:50 PM RADOS Feature #1861 (New): qa: test OSD handling of misdirected operations
This will probably require some new facilities in a client in order to do automatically, but we had a regression (I t... Greg Farnum
02:48 PM Messengers Bug #1803 (New): msgr: behave better when ending TCP connections
Greg Farnum
02:47 PM Bug #1858: OSD needs to check for misdirected ops before putting non-existent PGs on hold
Pushed the branch osd-misdirected-checks. So far it's untested. Greg Farnum
07:56 AM Bug #1858: OSD needs to check for misdirected ops before putting non-existent PGs on hold
we should also include a read-from-replicas workload in the qa suite.. probably combined with osd thrashing. that ma... Sage Weil
10:56 AM rados-java Feature #1860 (New): qa: write tests for local reads and random replica reads
We need to test the client options that randomize the read load and attempt to read from local replicas. The local re... Greg Farnum
10:07 AM rgw Bug #1859 (Resolved): rgw: bucket creation is not atomic
The backing bucket object is being created in two operations: create and write. We need to combine these into a single... Yehuda Sadeh
01:41 AM Revision eb37637f (ceph): Merge remote branch 'upstream/master' into wip-backfill
Conflicts:
src/include/object.h
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just
01:38 AM Revision 1f02e34c (ceph): ReplicatedPG: objects currently being backfilled are degraded
pending_stat_updates has also been renamed to pending_backfill_updates.
Signed-off-by: Samuel Just <samuel.just@drea...
Samuel Just
01:24 AM Revision d2eb119a (ceph): ReplicatedPG: fill in backfill_peer in on_activate
Previously, there was a race between issue_repop/do_op and
start_recovery_ops.
Signed-off-by: Samuel Just <samuel.ju...
Samuel Just
01:17 AM Revision 517ddf84 (ceph): ReplicatedPG: only pull in one backfill peer at a time
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just

12/22/2011

11:25 PM Revision 855e93b6 (ceph): filestore: fix config observer
Actually, I don't think this was fully implemented to begin with, so it's
not a 'fix' per se. This will let you use ...
Sage Weil
11:18 PM Revision decdc363 (ceph): MOSDPGRepScrub: Fix typo in MOSDPGRepScrub
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
10:33 PM Revision db04d680 (ceph): ReplicatedPG: update last_backfill when pushes complete
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
10:33 PM Revision 3b90df0d (ceph): ReplicatedPG: init backfill infos to last_backfill
We can scan starting from last_backfill to avoid rescanning portions
of the collection recovered by normal recovery. ...
Samuel Just
10:33 PM Revision 298b1349 (ceph): PG: backfill info should be cleared on recovery reset
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
10:33 PM Revision 8b8aab84 (ceph): PG: update stats from master only if not backfilling
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
10:33 PM Revision fa6bd38a (ceph): ReplicatedPG: simplify recover_backfill
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
10:00 PM Revision 9fc060c0 (ceph): Merge branch 'wip-signal'
Sage Weil
09:48 PM Bug #1858 (Resolved): OSD needs to check for misdirected ops before putting non-existent PGs on hold
Right now it doesn't. If it had, then diagnosing Noah's local reads problem would have been much, much faster. :( Greg Farnum
08:33 PM Revision c7fee72d (ceph): MOSDRepScrub: use header.version for payload version
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
08:16 PM Revision 3cb53cc9 (ceph): Merge branch 'stable'
Sage Weil
08:15 PM Revision e93961c1 (ceph): monmap: iterate over addr_name when printing summary
The rank is now ordered by IP address. We should iterate over
addr_name.
Signed-off-by: Henry C Chang <henry.cy.chan...
Henry Chang
08:15 PM Revision bfbeae68 (ceph): monmap: clear addr_name map on calculating ranks
We should clear addr_name before filling it. Otherwise, the removed
mon will stay there and cause incorrect rank assi...
Henry Chang
08:15 PM Revision ea9f2f62 (ceph): interval_set: fix truncation of _size
_size is of type int64_t. Using int to store the value of _size
will cause value truncation.
Signed-off-by: Henry C Ch...
Henry Chang
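The truncation described in this commit can be mimicked as follows. This is a sketch under the assumption that `int` is 32 bits wide, which holds on common platforms but is not guaranteed by C++; the helper name is hypothetical and the snippet does not reproduce the interval_set code itself.

```python
import ctypes


def store_in_int32(value: int) -> int:
    """Mimic assigning a 64-bit value to a 32-bit signed int.

    ctypes integer types do no overflow checking, so the high bits
    are silently dropped, much like a narrowing conversion.
    """
    return ctypes.c_int32(value).value


# A size just past 4 GiB fits in int64_t but not in a 32-bit int.
size = 2**32 + 5
truncated = store_in_int32(size)
print(size, truncated)
```

Any size that needs more than 31 bits comes back as a different, possibly negative, number, which is why the stored type has to match the 64-bit `_size`.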
04:28 PM CephFS Bug #1047: mds: crash on anchor table query
Unfortunately there's not enough info in this log either. We're going to need a log with (at minimum) level 10 mds de... Greg Farnum
04:27 PM Bug #1850 (Duplicate): mds sometimes crashes removing trees with plenty of hardlinks
I'm pretty sure you're looking at #1047 here. :) Greg Farnum
12:45 PM Fix #1857 (Resolved): osd: reimplement shutdown()
on sigterm, go through OSD::shutdown() and try to clean things up in an orderly fashion. This will be useful for lea... Sage Weil
11:02 AM rgw Bug #1856 (Resolved): It is possible to look up an rgw user by a subuser that does not exist as l...
Matthew Wodrich
11:00 AM rgw Bug #1855 (Resolved): Creation of a subuser that appears to own an s3 key is possible, and removi...
Matthew Wodrich
10:58 AM rgw Bug #1854 (Resolved): Deletion of an rgw user that has a subuser with a swift key leaves behind a...
Creating an rgw user with a subuser and swift key and then deleting the user appears to orphan the object for that su... Matthew Wodrich

12/21/2011

10:21 PM Revision 9eee1ecb (ceph): osd: remove SIGTERM cruft
The default handler will exit(0). The got_sigterm stuff was dead code.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
10:17 PM Revision e04109a3 (ceph): mon: drop special SIGTERM handler
Default does exit(0).
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
10:17 PM Revision 2daa655f (ceph): mds: drop special SIGTERM handler
Default does exit(0).
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
10:17 PM Revision d1dbeaf5 (ceph): exit(0) on SIGTERM by default
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
07:20 PM Revision 07e11862 (ceph): ReplicatedPG: Initialize blocked_by in ObjectContext constructor
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
03:07 PM rgw Bug #1853 (Resolved): rgw: qa test to verify bucket recreation does not override bucket
Yehuda Sadeh
09:11 AM RADOS Feature #1852 (New): librados: don't do memory copies for the C interface
The current implementation of the librados C interface (well, the one I'm working on now) uses in-memory copies for a... Greg Farnum
09:10 AM Messengers Feature #1851 (Rejected): SimpleMessenger: use non-blocking io
This will allow some great stuff, like doing a revoke on sending messages. And decoupling threads from sockets. Etc.
...
Greg Farnum
04:11 AM Revision dcedda84 (ceph): Merge pull request #7 from kylemarsh/wip-obsync-swift-metadata
obsync: pull object metadata from swift store Sage Weil
02:58 AM Bug #1850 (Duplicate): mds sometimes crashes removing trees with plenty of hardlinks
rsync -aH /usr/share/zoneinfo/ /mnt/ceph-fuse/subdir/ (the H and a hardlink-plentiful /usr/share/zoneinfo are essenti... Alexandre Oliva
02:42 AM Bug #1849 (Duplicate): directories' timestamps in snapshots sometimes change when directory is mo...
When trying to load a series of backups from directory trees large and small, I noticed one particularly undesirabl... Alexandre Oliva
01:35 AM Revision 78030bc7 (ceph): ReplicatedPG: take references for blocked_by and blocking
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
01:08 AM Revision a85ab1ea (ceph): obsync: pull object metadata from swift store
Obsync wasn't pulling object metadata from swift stores and thus wasn't
syncing metadata when reading from a swift st...
Kyle Marsh
01:05 AM Revision fd3231c6 (ceph): Merge remote branch 'upstream/wip-backfill-ordering' into wip-backfill
Samuel Just
12:52 AM Revision 7eb28730 (ceph): PG: add some documentation
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
12:50 AM Revision ffd1b437 (ceph): ReplicatedPG: delay op while snapdir is missing/degraded
We cannot get/update a snapcontext if snapdir is missing/degraded.
Signed-off-by: Samuel Just <samuel.just@dreamhost...
Samuel Just

12/20/2011

11:28 PM Revision 45b9659f (ceph): ReplicatedPG: don't manage waiting_on_backfill in start/finish_recovery_op
Set waiting_on_backfill in recover_backfill and clear in do_scan.
Signed-off-by: Samuel Just <samuel.just@dreamhost....
Samuel Just
07:39 PM Revision 3bea1ed4 (ceph): rgw: fix subuser key name when purging subuser keys
Yehuda Sadeh
07:00 PM Revision 9ddb802c (ceph): radosgw-admin: add --purge-keys option
Yehuda Sadeh
06:53 PM Revision 5e9d1019 (ceph): ReplicatedPG: apply_repop: apply local_t before op_t
We create snap_collections in local_t and clone into them in op_t.
Signed-off-by: Samuel Just <samuel.just@dreamhost...
Samuel Just
04:25 PM CephFS Bug #1737: ceph-fuse crash in xlist::remove
Another occurrence today in teuthology:~teuthworker/archive/nightly_coverage_2011-12-20-b/4585/teuthology.log Josh Durgin
11:05 AM rgw Bug #1801 (Resolved): rgw: radosgw-admin remove subuser and related swift key in a single command
done, commit:9ddb802c72fc805ce400f9bf5cceffb88b0f3d47
radosgw-admin subuser rm --subuser=<name> --purge-keys
Yehuda Sadeh
10:09 AM Bug #1848 (Resolved): osd got zeroed out fsid
From teuthology:~teuthworker/archive/nightly_coverage_2011-12-20-a/4579/remote/ubuntu@sepia51.ceph.dreamhost.com/log/... Josh Durgin
10:00 AM rgw Feature #1847 (Resolved): rgw: revisit the way we store large objects
We should probably keep large objects in chunks, and not coalesce into a single large object. Chunks shouldn't use th... Yehuda Sadeh
08:38 AM Bug #1846: Mds crash immediately after start (segmentation fault)
I got the monmap from the machine with the working mds because the other one does not have the admin key. I hope that this is not a pro... Maciej Galkiewicz
08:34 AM Bug #1846: Mds crash immediately after start (segmentation fault)
can you 'ceph mon getmap -o /tmp/monmap' and attach that file to this bug? Sage Weil
03:25 AM Bug #1846: Mds crash immediately after start (segmentation fault)
The osd on this machine crashes in the same way. Maciej Galkiewicz
03:23 AM Bug #1846 (Resolved): Mds crash immediately after start (segmentation fault)
I have two mds' in my configuration. One of them works fine and the other crashes immediately after reboot:
@2011-...
Maciej Galkiewicz
02:03 AM Revision 97dd28c0 (ceph): librados: return -EROFS when trying to write to a snapshot
operate_read doesn't need this check because it does not write.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin
02:00 AM Revision 68ba1862 (ceph): librados: make getxattrs ENOMEM return negative
This is more consistent with the rest of librados.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin
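The two librados commits above (97dd28c0 and 68ba1862) both rely on the convention of returning a negative errno value on failure. A minimal sketch of that convention follows; the function names are hypothetical and this is not the librados API, only an illustration of how a caller might interpret such return values.

```python
import errno
import os


def write_to_pool(snapshot_selected: bool) -> int:
    """Hypothetical write path: 0 on success, negative errno on failure."""
    if snapshot_selected:
        # Snapshots are read-only, so a write is refused with -EROFS.
        return -errno.EROFS
    return 0


def describe(ret: int) -> str:
    """Turn a negative-errno style return value into a readable message."""
    return "ok" if ret >= 0 else os.strerror(-ret)


print(describe(write_to_pool(False)))
print(describe(write_to_pool(True)))
```

Keeping every error negative lets callers test `ret < 0` uniformly, which may be the consistency the getxattrs commit is after.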
12:26 AM Revision a798a85f (ceph): PG: Do not update_snap_collections for log entries > last_backfill
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
12:26 AM Revision 2401176b (ceph): PG: Fix stat debug output
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
12:23 AM Revision 1362d3e1 (ceph): calc_acting: Prefer up[0] as primary if possible
Previously, we could get into a state where although up[0] has been
fully backfilled, acting[0] could be selected as ...
Samuel Just
12:01 AM Revision 01f3f6a6 (ceph): rgw: add timeout to init path
Yehuda Sadeh

12/19/2011

10:57 PM Revision cc22f154 (ceph): MOSDRepScrub,ReplicatedPG: Add scrub_to to MOSDRepScrub
When scrub_from is set, also set scrub_to to the primary's
last_update_applied (which will also be the official last_...
Samuel Just
10:02 PM Revision 720bab94 (ceph): osd: EINVAL on truncate to huge object size
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:02 PM Revision ed780fdd (ceph): mds: misc assertions about truncation
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:02 PM Revision 2710bd85 (ceph): mon: update man page to document --mkfs stuff
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:00 PM Revision 29e6d6c8 (ceph): Merge pull request #6 from kylemarsh/wip-obsync-swift
Wip obsync swift Sage Weil
09:57 PM Revision 33cb2796 (ceph): rgw: remove temp context in prepare_get_obj
Yehuda Sadeh
09:57 PM Revision 5e739335 (ceph): rgw: fix xml parser internal structure leak
Yehuda Sadeh
09:57 PM Revision a72348ea (ceph): rgw: fix a leak of acl structure (in req_state)
Yehuda Sadeh
09:54 PM Revision 002eb581 (ceph): rgw: remove temp context in prepare_get_obj
Yehuda Sadeh
09:38 PM Revision 27da89f4 (ceph): rgw: fix xml parser internal structure leak
Yehuda Sadeh
09:38 PM Revision 3a8af0f7 (ceph): rgw: fix a leak of acl structure (in req_state)
Yehuda Sadeh
09:25 PM Revision 42980922 (ceph): Merge branch 'wip-osd-maybe-created'
Greg Farnum
09:24 PM Revision 98a4809a (ceph): Merge branch 'wip-osd-fsid'
Sage Weil
09:24 PM Revision 3af5fff5 (ceph): doc: fix typo
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:15 PM Revision dc977901 (ceph): osd: --get-journal-fsid
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:13 PM Revision c8c5e5d6 (ceph): filestore: make fsid uuid_d instead of uint64_t
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
04:13 PM Revision ae8fbb88 (ceph): filejournal: uuid for fsid
Decode old header struct, but encode new class using more normal encoding
style. Embed in a bufferlist for later ext...
Sage Weil
04:12 PM Revision dcceb8e8 (ceph): osd: include osd_fsid in OSDSuperblock
Generated during mkfs.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
04:12 PM Revision a5822095 (ceph): osd: store osd_fsid as text in osd_data dir
along with ceph_fsid (the cluster fsid) and a few other things.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
04:12 PM Revision c59eb8ca (ceph): osd: --get-osd-fsid and --get-cluster-fsid
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
04:12 PM Revision 237b19cd (ceph): osd: rename OSDSuperblock::fsid -> cluster_fsid
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
04:04 PM Revision cd909aca (ceph): doc: fix mon cluster expansion docs
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:03 PM Revision f2a95990 (ceph): mon: pull addr from ceph.conf, mon_host as needed when joining mon cluster
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
03:57 PM Revision d9593342 (ceph): mon: fix setting of mon addr when joining a cluster
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
02:02 PM rgw Bug #1844 (Resolved): radosgw memory leak
Fixed a few leaks, as of commit:33cb27961e1b20f188d2a83a764ae3f2fabeb141. Current run with massif looks flat. Yehuda Sadeh
11:15 AM rgw Bug #1844 (Resolved): radosgw memory leak
apparently radosgw is leaking. Yehuda Sadeh
01:38 PM Bug #1825 (Resolved): osd loses object deletes by some creates in the same transaction
Merged to master in commit:42980922f253ed29718bfac64e17c85cdf9805a6. Still haven't written tests but I have a persona... Greg Farnum
01:22 PM rgw Feature #1838 (Resolved): rgw: update man page
Sage Weil
01:21 PM Bug #1845 (Rejected): "recovery_ops" performance counter isn't decreased
Right, it should never decrement. Closing! Sage Weil
11:51 AM Bug #1845: "recovery_ops" performance counter isn't decreased
I'm using munin, wrote plugin myself and I don't divide this value by anything. If it shouldn't decrement please clos... Szymon Szypulski
11:47 AM Bug #1845: "recovery_ops" performance counter isn't decreased
The counters are counting events and never decrement. Normally collectd will divide the change by time to give you s... Sage Weil
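A monitoring plugin that reads such a counter would typically report a rate rather than the raw value. The sketch below follows the comment's description of dividing the change by elapsed time; the function name and sample values are hypothetical, and it does not claim to reproduce what collectd does internally.

```python
def counter_rate(prev_value: int, curr_value: int, interval_seconds: float) -> float:
    """Events per second between two samples of an ever-increasing counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (curr_value - prev_value) / interval_seconds


# Two hypothetical samples of "recovery_ops" taken 30 seconds apart.
rate = counter_rate(prev_value=100, curr_value=160, interval_seconds=30.0)
print(rate)
```

When recovery is idle the counter stops growing, so the computed rate falls to zero even though the raw counter never decreases.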
11:15 AM Bug #1845 (Rejected): "recovery_ops" performance counter isn't decreased
I'm generating osd statistics based on performance sockets like described here - http://ceph.newdream.net/wiki/Perfom... Szymon Szypulski
11:07 AM Bug #1758: OSD segfault in SimpleMessenger::send_message
For the life of me I cannot seem to get useful symbols out of this, though I'm not sure why. I've been using LD_LIBRA... Greg Farnum
10:49 AM Bug #1688: Benjamin: pg stuck in scrub
Still happening, I'm looking into an instance on benjamin now. Samuel Just
10:07 AM Bug #1530: osd crash during build_inc_scrub_map
This was the only failure in the run last night. Core at teuthology:~teuthworker/archive/nightly_coverage_2011-12-19-... Josh Durgin
10:00 AM Bug #1490 (New): cfuse assert failure: assert(ob->last_commit_tid < tid)
Happened again in teuthology:~teuthworker/archive/nightly_coverage_2011-12-15-b/4357/remote/ubuntu@sepia63.ceph.dream... Josh Durgin
08:09 AM Bug #1839 (Resolved): osd: assert in send_incremental_map_msg
this was the hobject_t::max initialization patch that wasn't in master. Sage Weil
08:08 AM Documentation #1840 (Resolved): doc: fix mon addition steps
Sage Weil

12/17/2011

03:34 PM Feature #1655 (Resolved): gitbuilder aggregator page
http://ceph.newdream.net/gitbuilder.cgi Sage Weil
06:21 AM Revision 37e7a521 (ceph): rgw: fix updating of object metadata
being used in swift POST. We were updating wrong object
size and etag
Yehuda Sadeh
06:21 AM Revision 44b4e029 (ceph): rgw: bucket cannot be recreated if already exists
Yehuda Sadeh
06:15 AM Revision e5f49104 (ceph): man: Update the configuration example for radosgw
Signed-off-by: Wido den Hollander <wido@widodh.nl> Wido den Hollander
06:15 AM Revision 83cf1b62 (ceph): man: It is capital -C instead of -c when creating a new keyring
Signed-off-by: Wido den Hollander <wido@widodh.nl> Wido den Hollander
06:04 AM Revision 3e323e6a (ceph): rgw: fix updating of object metadata
being used in swift POST. We were updating wrong object
size and etag
Yehuda Sadeh
02:09 AM Revision d0e90d71 (ceph): syslog checking: forgot a pipe
Josh Durgin
01:14 AM Revision 08f968f8 (ceph): rgw: bucket cannot be recreated if already exists
Yehuda Sadeh
12:07 AM Revision f54f4aa0 (ceph): obsync: add authurl to CLI
s3 connections require the hostname and swift connections require the
authurl. obsync treats these as equivalent int...
Kyle Marsh

12/16/2011

10:42 PM Revision bfbde5b1 (ceph): object.h: initialize max in hobject_t(sobject_t) constructor
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
10:09 PM rgw Bug #1830 (Resolved): RGW Swift Metadata Bug
Fixed, commit:3e323e6adbf87d794be39fd4f75c6626e8968ce1. Yehuda Sadeh
05:36 PM rgw Bug #1830: RGW Swift Metadata Bug
Ok, was able to reproduce it. Problem is in the swift specific update metadata operation. Fix should be pretty easy. Yehuda Sadeh
08:41 PM Revision 061e7619 (ceph): ReplicatedPG: fix handle_watch_timeout ctx->at_version
ctx->at_version should match the head of the new log entries
during issue_repop. This could cause the scrub hang bug...
Samuel Just
07:43 PM Revision 5274e88d (ceph): ReplicatedPG: add asserts to catch scrub error
If last_update_applied skipped over last_update, we would see
scrub hang.
Signed-off-by: Samuel Just <samuel.just@dr...
Samuel Just
06:39 PM Revision 3f3913c9 (ceph): doc: fix filename in mon addition process
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:28 PM rgw Bug #1843 (Resolved): rgw: recreation of bucket overrides old one
Fixed with commit 08f968f8cd74a2e782257eea91a97b52598ef6f1. Yehuda Sadeh
05:08 PM rgw Bug #1843 (Resolved): rgw: recreation of bucket overrides old one
Instead of returning success and not doing anything, we actually create a new bucket and override the old one. This i... Yehuda Sadeh
05:19 PM Revision 7d81a3b5 (ceph): filejournal: preallocate journal bytes on create
This should reduce fragmentation for large journals that are written
slowly the first time around.
Signed-off-by: Sa...
Sage Weil
05:08 PM Revision 92cb2a20 (ceph): Merge pull request #5 from homac/master
Minor fix for init files and cleaned up spec file. Please pull Sage Weil
03:20 PM Bug #1841 (In Progress): OSDs should disconnect from Monitor before their MOSDPGStat timeouts happen
Yep; it is easy enough to add a check in tick based on how long it's been since we sent a PGStat without getting an a... Greg Farnum
01:28 PM Bug #1841: OSDs should disconnect from Monitor before their MOSDPGStat timeouts happen
My memory is a bit fuzzy, but I think they're waiting on acks for the MOSDPGStat messages they're sending.. checking ... Sage Weil
11:23 AM Bug #1841 (Resolved): OSDs should disconnect from Monitor before their MOSDPGStat timeouts happen
Right now OSDs don't notice their monitor connection has dropped until after the (by default) 15 minute TCP connectio... Greg Farnum
03:18 PM Bug #1842 (Can't reproduce): osd: failed authorizations leak memory somehow?
I've got a log from todin showing lots of "fault initiating reconnect" that I suspect are on failed auths. Log in kai... Greg Farnum
01:03 PM rgw Feature #1838: rgw: update man page
I just submitted a patch on the ml with an updated version of the manpage. This works in my setup. Wido den Hollander
10:10 AM rgw Feature #1838 (Resolved): rgw: update man page
use current alexandria as a model, probably. minus the now-unneeded setenv stuff.
Sage Weil
10:51 AM Documentation #1840 (Resolved): doc: fix mon addition steps
--public-addr
make sure port is correct, too
Sage Weil
10:37 AM Bug #1839 (Resolved): osd: assert in send_incremental_map_msg
... Sage Weil
09:59 AM Linux kernel client Feature #1837 (New): krbd: freeze filesystem on snapshot
The block device can ask for an fs freeze (dm currently does this). We can do this with rbd when we see that the rbd... Sage Weil
09:22 AM Feature #1836 (Resolved): filejournal: use async directio to write to the journal
Currently we're doing a sync direct io write, which means we pay a full rotation between each io. Sage Weil
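To get a feel for the cost of paying a full rotation per write, the arithmetic can be sketched as below. The 7200 RPM figure is an assumption chosen for illustration, not something stated in the ticket, and the function name is hypothetical.

```python
def max_sync_writes_per_second(rpm: float) -> float:
    """Upper bound on writes/sec if each write waits one full platter rotation."""
    rotation_seconds = 60.0 / rpm
    return 1.0 / rotation_seconds


# Assumed drive speed; one rotation then takes 60 / 7200 seconds.
print(max_sync_writes_per_second(7200))
```

At the assumed 7200 RPM a rotation takes roughly 8.3 ms, which caps strictly serialized journal writes at about 120 per second regardless of their size.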
08:44 AM Bug #1833 (Resolved): mon: failed decode in LogMonitor::update_from_paxos
Yeah, this is one of the things I hit (and fixed) in a few different ways when doing the mon thrashing on the new code. Sage Weil
06:33 AM Bug #1835: Monclient crash when keyring is not readable
Btw, I know I can use the built-in 'secret' functions of libvirt, but I didn't modify my XML's yet. Wido den Hollander
06:32 AM Bug #1835 (Resolved): Monclient crash when keyring is not readable
I had some issues with my Qemu-RBD VM's to get them online, I saw Qemu segfault and started tracing this back with GD... Wido den Hollander
05:18 AM Bug #1834 (Closed): 'High' memory usage of monitors
Actually, I seem to be wrong here. My other monitor running on a 4GB box is using about 240MB of memory, I did a smal... Wido den Hollander
04:43 AM Bug #1834 (Closed): 'High' memory usage of monitors
I'm still hunting this one, but I'm seeing high memory usage of my monitors (three in total).
My monitor configura...
Wido den Hollander
04:34 AM phprados Feature #424: Stream wrappers
It took some time to find docs about this, but I'm currently on track. Wido den Hollander

12/15/2011

10:03 PM Revision 739fd9fe (ceph): man: clarify mount.ceph auth options
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
09:49 PM Revision e5a5ae12 (ceph): man: update rule definition for ceph-rbdnamer
This is the rule we install since 891025e539a92b5d75011e2e75c475fc0c272042.
Signed-off-by: Josh Durgin <josh.durgin@...
Josh Durgin
09:43 PM Revision 4eb83654 (ceph): authx -> cephx everywhere it's used
The term authx was in the mount.ceph man page, and got accidentally
copied into rbd help.
Signed-off-by: Josh Durgin...
Josh Durgin
09:24 PM Revision 7eec3094 (ceph): rountrip: add task
Yehuda Sadeh
09:15 PM Revision 41f64be0 (ceph): ReplicatedPG: calc_clone_subsets fix other clone_overlap case
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
09:15 PM Revision b5c32590 (ceph): ReplicatedPG: fix backfill mismatch error output
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
09:15 PM Revision 5b41c470 (ceph): OSD: use disk_tp.pause() without osd_lock
Previously, we called disk_tp.pause_new(). This can cause a race
where snap_trimmer queues more transactions after w...
Samuel Just
08:39 PM Revision 97cc6c29 (ceph): readwrite: fix task with default conf
Yehuda Sadeh
04:51 PM Revision ec776f4b (ceph): ceph.spec: Clean up and fix spec file and build for a couple of distrib...
Clean up and fix the spec file. This includes cleaning up of build and
installed system dependencies, LSB compliance ...
Holger Macht
04:49 PM Revision 0e0583f8 (ceph): init-ceph/init-radosgw: Don't use unspecified runlevel 4
Don't use runlevel 4 in init scripts. AFAIK, no distribution is using it
and at least the Open Build Service complain...
Holger Macht
02:32 PM Bug #1833 (Resolved): mon: failed decode in LogMonitor::update_from_paxos
Saw this on benjamin today. It was during catchup; mon.beta had been out for a day or more and was catching up. Perha... Greg Farnum
03:08 AM Revision 0c547046 (ceph): osd: preserve write order when waiting on src_oids
We need to preserve the order of write operations on each object. If we
have a write on X that needs to read from Y,...
Sage Weil
03:08 AM Revision ca2e8e5a (ceph): osd: EINVAL on mismatched locator without waiting for degraded
No reason to recover before returning an error.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
03:08 AM Revision 7a7aab25 (ceph): osd: wait for src_oid if it is on other side of last_backfill from oid
If the target object is before last_backfill, then the backfill_target
will be asked to apply the operation. If one ...
Sage Weil
01:43 AM Revision da286059 (ceph): client: fix logger deregistration
Only unregister logger if it is non-NULL (and thus registered) to avoid
running afoul of the cct assertions.
Signed-...
Sage Weil
01:14 AM Revision 659e66aa (ceph): readwrite: fix conf, task runs
Yehuda Sadeh
12:12 AM Revision 7d085ad9 (ceph): readwrite: add readwrite task
still not really running, but at least getting configured Yehuda Sadeh

12/14/2011

11:51 PM Revision 62c830f0 (ceph): ReplicatedPG: add_object_context_to_pg_stat, obc->ssc may be null
obc->ssc is not necessarily filled in by get_object_context.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just
11:37 PM Revision 5a400935 (ceph): obsync: add vvprint back in
Commit ebe5fc60d20f92a0037c53c1e7bd7ae512be3da4 removed the definition of
vvprint without removing all the places tha...
Kyle Marsh
11:19 PM Revision cda5f0d3 (ceph): PG: clear waiting_on_backfill during clear_recovery_state
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
11:17 PM Revision d32fd8c5 (ceph): ReplicatedPG: list snapid 0 on collection_list_partial for backfill
0 will list all objects, CEPH_NO_SNAP will list only head objects.
Signed-off-by: Samuel Just <samuel.just@dreamhost...
Samuel Just
10:10 PM Bug #1831: mon: should not accept (and should disconnect) session when not in quorum
There's two things here, the second being the monitor changes you're focusing on. I need to investigate further why t... Greg Farnum
07:03 PM Bug #1831: mon: should not accept (and should disconnect) session when not in quorum
I think there are two parts here:
- the mon shouldn't let sessions start if it is not in the quorum. that may ac...
Sage Weil
03:39 PM Bug #1831 (Resolved): mon: should not accept (and should disconnect) session when not in quorum
This happened on Benjamin. The OSDs ought to be failing the connection and going to a new monitor, but they failed to... Greg Farnum
07:40 PM Revision d9d05117 (ceph): Merge remote branch 'upstream/master' into wip_backfill_merged
Samuel Just
07:39 PM Revision 07b3ba81 (ceph): ReplicatedPG: collection_list_partial also takes a snapid
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
07:38 PM Revision 1430c8ab (ceph): doc: Make overview.rst valid reStructuredText, so I can stop seeing war...
It's still wrong, but now it won't clutter the output.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
Tommi Virtanen
07:33 PM Revision 53f7323c (ceph): doc: reStructuredText syntax fix.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com> Tommi Virtanen
07:33 PM Revision c1190740 (ceph): pybind: Add a description to docstring.
This avoids a Sphinx warning like this:
.../src/pybind/rbd.py:docstring of rbd.RBD.version:2: WARNING: Field list en...
Tommi Virtanen
07:32 PM Revision 9d633a4f (ceph): PG: A backfill osd can have last_complete < log_tail
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
07:32 PM Revision 51deeef6 (ceph): ReplicatedPG: calc_*_subsets must consider last_backfill
Objects yet to be backfilled do not show up in the missing set. Thus,
we cannot use an object past last_backfill to ...
Samuel Just
07:32 PM Revision 7832e17e (ceph): PG: activate, backfill replica can have last_complete < log_tail
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
07:32 PM Revision b9eea709 (ceph): osd: object_stat_sum_t::clear()
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:32 PM Revision 940a55e0 (ceph): osd: track backfill target pg stats
Maintain backfill target pg stats to be the summation over objects to
the left of last_backfill. Reflect this in the...
Sage Weil
07:32 PM Revision 7213c457 (ceph): PG: Ask for digest at most once at a time
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
07:32 PM Revision 9bb77b49 (ceph): osd: observe last_backfill in merge_log() and helpers
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:32 PM Revision e1006d76 (ceph): osd: more backfill changes
Always ship log for updates to backfill targets to preserve the repgather
ordering.
Fix up recover_backfill() bounds...
Sage Weil
07:32 PM Revision af7536d0 (ceph): hobject_t: fix hobject(sobject_t) constructor
Initialize max
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
07:32 PM Revision cd0c8fb3 (ceph): osd: add incomplete, backfill states; simplify calculation
Set/clear states in peering state machine state ctor/dtors where possible.
Set degraded if the number of non-backfil...
Sage Weil
07:32 PM Revision f83a787e (ceph): osd: some recover_backfill() comments
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:32 PM Revision f1caaa37 (ceph): osd: fix calc_acting()
Look at usable, not want.size(), so we don't count backfill targets.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
07:32 PM Revision 57baf9ef (ceph): osd: fix signed/unsigned comp
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:32 PM Revision 71893b0e (ceph): osd: remove bad !is_incomplete() assert
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:32 PM Revision 999846f7 (ceph): PG: fix phantom entry in peer_info
In GetLog, do not call pg->peer_info[newest_update_osd] if
newest_update_osd is osd->whoami.
Signed-off-by: Samuel J...
Samuel Just
07:32 PM Revision f483df15 (ceph): PG: there may now be backfill entries in the acting set
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
07:32 PM Revision f1ae9ed5 (ceph): objectstore: make list by hash *next > instead of >=
This means we should set it to a hash boundary or the last item of our
result set (not the next item we didn't includ...
Sage Weil
07:31 PM Revision f7a0b9c5 (ceph): hobject_t: fix sorting by hash key
Use get_effective_key() to return key (if explicit) or object name. Sort
by that within each hash value.
Clean up o...
Sage Weil
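The ordering described in f7a0b9c5 can be sketched roughly as below. This is a minimal Python illustration, not the Ceph C++ code: the Obj class and its name, hash, and key fields are hypothetical stand-ins for hobject_t, and only the name get_effective_key() comes from the commit message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    """Hypothetical stand-in for hobject_t; the field names are assumptions."""
    name: str
    hash: int
    key: str = ""  # explicit key; an empty string means "not set"

    def get_effective_key(self) -> str:
        # Return the explicit key if there is one, otherwise the object name.
        return self.key if self.key else self.name


def sort_key(o: Obj):
    # Primary order is the hash value; within one hash value,
    # objects are ordered by their effective key.
    return (o.hash, o.get_effective_key())


objs = [
    Obj("zeta", hash=2),
    Obj("beta", hash=1, key="omega"),
    Obj("gamma", hash=1),
    Obj("alpha", hash=2, key="aaa"),
]
ordered = sorted(objs, key=sort_key)
```

Within hash 1, "gamma" (no explicit key) sorts ahead of "beta" because the explicit key "omega" is what gets compared, not the name.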
07:31 PM Revision 9288f0e0 (ceph): osd: advance last_backfill by keys only
This ensures that transactions are never split by last_backfill.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
07:31 PM Revision 88ee86d0 (ceph): osd: keep backfill targets in acting set
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision b99e1358 (ceph): osd: make backfill (basically) work again
Still need to handle concurrent updates, log recovery vs backfill, etc.
Signed-off-by: Sage Weil <sage.weil@dreamhos...
Sage Weil
07:31 PM Revision de19a6bb (ceph): Revert "osd: don't keep push state on replicas"
This reverts commit 69c77e33f8530993dbc280525bd21218ea6f9ddb.
sub_op_pull() calls send_push_op directly, does not pa...
Sage Weil
07:31 PM Revision baa21c9b (ceph): osd: implement PG::copy_range()
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision c03c49ca (ceph): osd: initialize repop gather set in issue_repop instead of new_repop
Simpler. It will also make the last_backfill correction live in one
place.
Signed-off-by: Sage Weil <sage.weil@drea...
Sage Weil
07:31 PM Revision 5b558dc4 (ceph): osd: strip out some backlog logic
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision 82a23dbe (ceph): osd: strip backlog case out of merge_log
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision 3f5ced69 (ceph): osd: kill backlog_requested
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision 6d299552 (ceph): osd: strip backlog logic out of PG::activate()
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision e7514f75 (ceph): osd: state machine whitespace
I feel better now
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
07:31 PM Revision 257b85d8 (ceph): osd: remove log_backlog from PG::Info
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision 7521c51a (ceph): osd: remove backlog case from clean_up_local
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision 9ceecc89 (ceph): osd: kill PG::Info::backlog
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision d7f7bbdc (ceph): osd: remove recovery-from-backlog kludge last_update
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision 722ec7e5 (ceph): osd: kill unused PG_STATE_SCANNING
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision d84a9f6f (ceph): osd: cleanup
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision 693950bf (ceph): osd: cleanup lingering backlog refs
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision e63c595a (ceph): osd: kill unused PG::Log::copy_after_unless_divergent
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision b5de19b5 (ceph): osd: kill unused PG::trim_write_ahead
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision 0e7f4aff (ceph): osd: pg whitespace
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision 400c27da (ceph): osd: track backfill with last_backfill, not interval_set<>
We always fill from the bottom up anyway. Using an hobject_t also gives us
a precise bound. It also makes things co...
Sage Weil
07:31 PM Revision 91ee3375 (ceph): osd: osd_kill_backfill_at
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision 99c614fa (ceph): osd: don't keep push state on replicas
Primaries need this, but replicas don't: the primary will explicitly pull
the pieces of the object that it wants.
Si...
Sage Weil
07:31 PM Revision 2cdc6b4e (ceph): osd: rewrite choose_acting process
Consolidate callers, eliminate obsolete backlog ones.
New process:
- pick best log, with preferences for those that...
Sage Weil
07:31 PM Revision 9e51c639 (ceph): osd: MOSDPGScan
Message to query hash ranges of a PG.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
07:31 PM Revision 8f14a358 (ceph): osd: add PG::BackfillInterval type
Describe a range of objects for the purposes of backfilling a PG.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
07:31 PM Revision 55c24813 (ceph): osd: implement ReplicatedPG::_lookup_object_context
Look up an existing ObjectContext without taking a reference.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
07:31 PM Revision 92d290d6 (ceph): osd: implement ReplicatedPG::scan_range
Scan a range of the local collection.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
07:31 PM Revision 17b5d5c3 (ceph): osd: implement do_scan
Handle MOSDPGScan messages to request or send a digest of a range of
objects in a collection, sorted in hobject_t (ha...
Sage Weil
07:31 PM Revision 353195d6 (ceph): types: operator<< for multimaps
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision e4ab0e3b (ceph): osd: add MOSDPGBackfill message
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision 910398fe (ceph): osd: recover discontiguous peers using backfill instead of backlog
Instead of generating a huge list of objects to recover, and then pushing
them, iterate over the collection and copy ...
Sage Weil
07:31 PM Revision 4509e619 (ceph): test_backfill.sh
Sage Weil
07:31 PM Revision 004e7c92 (ceph): osd: add Incomplete peering state
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision 73d15e01 (ceph): osd: do not read backlog off disk
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision b0664856 (ceph): osd: remove backlog generation code
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision 6e9d135a (ceph): osd: simplify replica queries for finding divergent objects
No need to request backlog here, clearly, since those don't exist anymore.
Signed-off-by: Sage Weil <sage.weil@dream...
Sage Weil
07:31 PM Revision b8ee27a3 (ceph): osd: remove Query::BACKLOG processing
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision 78b64473 (ceph): osd: kill PG::Log::copy_non_backlog
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:31 PM Revision 10e481d1 (ceph): osd: fix push_to_replica typo
We are always pushing soid. If we are missing snapdir locally, that means
we can't do an informed efficient clone, a...
Sage Weil
07:19 PM Revision b7a5a6a6 (ceph): doc: More consistency on formatting placeholder names.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com> Tommi Virtanen
07:19 PM Revision 196d4273 (ceph): doc: Link to manpage when command is mentioned.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com> Tommi Virtanen
07:19 PM Revision 75fd16a5 (ceph): doc: Use todo directive, rescue list of missing commands from wiki.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com> Tommi Virtanen
07:19 PM Revision 81feae12 (ceph): doc: Add misc explanations of Ceph internals from email.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com> Tommi Virtanen
07:19 PM Revision 034dd58f (ceph): doc: Add more missing commands to control.
This is too unstructured, that will have to be fixed later.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost....
Tommi Virtanen
07:19 PM Revision f5cfdbb7 (ceph): doc: Split intro to talk about the DFS separately. Mention petabytes.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com> Tommi Virtanen
07:19 PM Revision bc16ac3b (ceph): doc: Fix sentence that ended too abruptly.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com> Tommi Virtanen
07:19 PM Revision d745ff8d (ceph): doc: "ceph -w" clarification.
Stop saying "watch cluster state" so many times.
Don't say stdout, that's the assumption.
Don't call showing things...
Tommi Virtanen
07:14 PM Revision 18d99637 (ceph): Merge branch 'wip-messenger'
Greg Farnum
07:11 PM Revision 55639dcd (ceph): msgr: unset did_bind in stop().
We use did_bind as a flag on whether or not to stop the Accepter thread
and we should clear it when we do the stoppin...
Greg Farnum
06:59 PM Revision 41049f30 (ceph): objecter: fix use-after-free
messenger consumes the m reference. Yay valgrind.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
06:51 PM Revision 041d0456 (ceph): client: move PerfCounter into Client
globals are evil.
Fixes: #1826
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
06:50 PM Revision e8e1e5df (ceph): swift: auth response returns X-Auth-Token instead of X-Storage-Token
Yehuda Sadeh
05:31 PM Revision c9d0e556 (ceph): osd: fix build_incremental_map_msg
We keep both the inc and the full for our oldest osdmap.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
05:27 PM Revision 1a473b7a (ceph): osd: clean up _delete_head
Might be fixing a subtle logic bug, but old flow was confusing, so not
sure. :)
Signed-off-by: Sage Weil <sage@newd...
Sage Weil
05:26 PM Revision 6c8f60f6 (ceph): osd: simplify creation logic in do_osd_ops
Drop the maybe_created variable, and track exists over the course of the
transaction.
Fixes: #1825
Signed-off-by: Sa...
Sage Weil
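As a rough illustration of "track exists over the course of the transaction", the idea might look like the sketch below. It is an assumption-laden Python toy, not the do_osd_ops code: the function name apply_ops and the op names "create", "write", and "delete" are made up for the example.

```python
from typing import Iterable


def apply_ops(exists: bool, ops: Iterable[str]) -> bool:
    """Return whether the object exists after applying ops in order.

    The op names are illustrative assumptions, not Ceph op codes.
    """
    for op in ops:
        if op in ("create", "write"):
            # Creating or writing leaves the object present.
            exists = True
        elif op == "delete":
            # A delete always wins over any earlier create in the same batch.
            exists = False
        else:
            raise ValueError(f"unknown op: {op}")
    return exists
```

Because the flag is updated per operation, a create followed by a delete in the same batch leaves the object absent, instead of a separate "maybe created" marker masking the delete.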
05:16 PM Bug #1832 (Closed): osd: size tracking discrepancy (scrub stat mismatch)
During fsstress on the kernel client, this occurred:... Josh Durgin
01:53 PM Bug #1759: mds/client: truncate size overflow, fails with EINVAL
Hi,
I've run into this precise problem on a small testing cluster that I'm running -- down to the large 64-bit tru...
David McBride
11:55 AM rgw Bug #1830 (Resolved): RGW Swift Metadata Bug
I believe the rados gateway has a bug in the way it's talking swift. When I ask it to list the objects in a container... Kyle Marsh
11:44 AM Feature #1782 (Resolved): mon: dump key cluster stats via perfcounter
Sage Weil
11:32 AM CephFS Bug #1788: msgr file descriptor leak
Forgot to update this. Haven't run into it yet and wip-messenger seemed to have fixed things. Thanks Greg! Noah Watkins
11:27 AM CephFS Bug #1788 (Resolved): msgr file descriptor leak
Haven't heard any new issues from Noah; merged to master in commit:18d996370efc2fc32d4973e9e6934901558bcbaf. Greg Farnum
11:26 AM Messengers Bug #1829 (Resolved): SimpleMessenger tries to shut down threads that aren't running
Oh, even simpler than I expected. Fixed in commit:55639dcd87fe985059355afe5fab787e4d139b11 (compile tested). Greg Farnum
11:12 AM Messengers Bug #1829 (Resolved): SimpleMessenger tries to shut down threads that aren't running
Saw this on benjamin yesterday. Looks like the OSD repeatedly restarted its messengers and was eventually unable to r... Greg Farnum
11:01 AM CephFS Cleanup #1826 (Resolved): client: kill static perfcounter
commit:041d04563e7cfdb837a345787a1569b07a064307 Sage Weil
10:54 AM rgw Bug #1780 (Resolved): swift: auth response should return X-Auth-Token instead of X-Storage-Token
Fixed, commit:e8e1e5dffbd25e2124331e607264e1bc4120676c. Yehuda Sadeh
10:12 AM Linux kernel client Bug #1793: NULL pointer dereference at try_write+0x627/0x1060
This happened again on sepia70 during the kernel untar build workunit on rbd. Josh Durgin
09:40 AM Bug #1804 (Need More Info): filestore: unexpected EINVAL
Sage Weil
09:39 AM Bug #1828 (Resolved): osd: preserve write order when ops wait for recovery of src_oids
This affects current code.
It will need a minor adjustment so that "recovery" includes both is_missing() and osd >...
Sage Weil
09:33 AM CephFS Bug #1549 (Need More Info): mds: zeroed root CDir* vtable in scatter_writebehind_finish
Sage Weil
09:32 AM Bug #1530: osd crash during build_inc_scrub_map
fixed that last thing with commit:c9d0e556c7ad294819c60ca4e3cd4d0191811f18, but i think it's unrelated to the rest of... Sage Weil
09:22 AM Bug #1825: osd loses object deletes by some creates in the same transaction
Fix looks good; I'm working on tests to verify and check regressions. Greg Farnum
02:08 AM Revision abecbc59 (ceph): OSDMonitor: remove useless check
Session was already verified to exist before this.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin
12:31 AM Revision 5804477b (ceph): qa: trivial_libceph test
This currently fails... see #1827
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:29 AM Revision c87f31e0 (ceph): client: return errors from init
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:29 AM Revision 2f281d1f (ceph): libceph: catch errors from Client::init()
And clean up error paths.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:29 AM Revision 207c40b0 (ceph): libceph: add missing #includes
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:16 AM Revision 31b5ccbf (ceph): coverage: use locally stored build instead of downloading from a gitbui...
Josh Durgin

12/13/2011

05:31 PM CephFS Bug #1827: libceph: hang on creating a file
see commit:5804477b20f89a2b02218b518a44e73073b393c9 for reproducer.
fwiw i ran with vstart and 'LD_PRELOAD=../../s...
Sage Weil
04:36 PM CephFS Bug #1827 (Resolved): libceph: hang on creating a file
Using trivial thinger from Noah. Sage Weil
05:15 PM Revision 6b425676 (ceph): objectstore: implement Transaction::dump()
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:15 PM Revision 7133a2fa (ceph): filestore: dump transaction to log if we hit an error
This will let us see which operation in the transaction failed.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:05 PM Revision 3d13f003 (ceph): objectstore: create Transaction::iterator class
Remove iterator state from Transaction itself.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
04:32 PM CephFS Cleanup #1826 (Resolved): client: kill static perfcounter
Make it a Client member. The CephContext stuff tracks "per-process" state now, so no need to be weird. Also, these ... Sage Weil
04:28 PM Revision 4da96ff3 (ceph): rados load-gen workunits
Sage Weil
04:19 PM Revision 6ff95e9d (ceph): qa: rados load-gen workunits
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
03:10 PM Bug #1825: osd loses object deletes by some creates in the same transaction
see wip-osd-maybe-created Sage Weil
02:11 PM Bug #1825 (Resolved): osd loses object deletes by some creates in the same transaction
We found a missing object in alexandria, caused by the gateway trying to delete an object that seems to not actually ... Greg Farnum
11:07 AM rgw Tasks #1823: radosgw should have internal timeouts
I think I wasn't clear enough. RGW doesn't need to do that in the I/O path. Anyway, we need to think of the functiona... Yehuda Sadeh
10:55 AM rgw Tasks #1823: radosgw should have internal timeouts
RGW ought to be able to grab information about IOs which are taking too long and figure out what OSD that IO resides ... Greg Farnum
10:52 AM rgw Tasks #1823: radosgw should have internal timeouts
We can have timeouts for the init process; for other operations I'm not sure it'll make sense doing it in the rgw laye... Yehuda Sadeh
10:44 AM rgw Tasks #1823 (Rejected): radosgw should have internal timeouts
Letting Apache time out the rados gateway makes admins sad, since there's no visibility into what is actually timing ... Greg Farnum
10:53 AM rgw Tasks #1824 (Resolved): ceph monitor status should be available and documented
I saw last night that I think we can run "ceph quorum_status" to see which monitors are in the quorum, "ceph mon_stat... Greg Farnum
10:49 AM Bug #1821: librados: rados_create_with_context is unusable
Josh Durgin wrote:
> The C++ variant librados::Rados::init_with_context is used by librbd, radosgw, and some command...
Sage Weil
10:44 AM Bug #1821: librados: rados_create_with_context is unusable
The C++ variant librados::Rados::init_with_context is used by librbd, radosgw, and some command line tools, but this ... Josh Durgin
10:49 AM Bug #1820: deprecate "ceph stop"
It's not being run because getting the parsing and isolating leaks is a pain, but there are teuthology tasks to run v... Greg Farnum
10:28 AM Bug #1820: deprecate "ceph stop"
none of this is tested anywhere.. it's for when you manually want to check for leaks, and need the osd to try to shut... Sage Weil
10:08 AM Bug #1820: deprecate "ceph stop"
I don't see anything in teuthology sending stop commands to the OSDs; I believe the valgrind stuff just uses SIGTERM. Greg Farnum
09:59 AM Bug #1820: deprecate "ceph stop"
exit(0) on SIGTERM is perfectly valid.
If we do need more than SIGUSR1 & SIGUSR2, the communication mechanism shou...
Anonymous
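For reference, a clean exit on SIGTERM, as the comment above describes, can be sketched like this. It is a generic Python illustration rather than anything from the Ceph daemons: run_daemon and shutdown_requested are hypothetical names, and the loop signals itself only so the example terminates on its own.

```python
import signal

shutdown_requested = False


def on_sigterm(signum, frame):
    # Only record the request; the main loop performs the actual shutdown.
    global shutdown_requested
    shutdown_requested = True


signal.signal(signal.SIGTERM, on_sigterm)


def run_daemon() -> int:
    """Hypothetical main loop: work until SIGTERM arrives, then return 0."""
    while not shutdown_requested:
        # Stand-in for real work: deliver SIGTERM to this process.
        signal.raise_signal(signal.SIGTERM)
    # Cleanup (flushing logs, releasing resources) would go here.
    return 0


status = run_daemon()
```

A supervisor that sees exit status 0 after sending SIGTERM can treat the stop as intentional rather than as a crash.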
09:38 AM Bug #1820: deprecate "ceph stop"
... Sage Weil
09:31 AM Bug #1820: deprecate "ceph stop"
gcov is already using SIGTERM. Anonymous
10:33 AM Bug #1530: osd crash during build_inc_scrub_map
I'm guessing this is the new incarnation of this issue?
From teuthology:~teuthworker/archive/nightly_coverage_2011-1...
Josh Durgin
10:31 AM CephFS Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
Happened again in teuthology:teuthworker~/archive/nightly_coverage_2011-12-13-a/4183/remote/ubuntu@sepia74.ceph.dream... Josh Durgin
10:12 AM rgw Bug #1822 (Closed): radosgw can be slow to respond to requests
The DHO admins are having problems where sometimes requests take so long that Apache issues an ISE 500. It's often bu... Greg Farnum
09:48 AM Bug #1789 (Need More Info): mon: failed assert(paxosv == pg_map.version)
have core, but no matching binary. not clear from code inspection what happened.
Sage Weil
09:30 AM Bug #1804: filestore: unexpected EINVAL
as of commit:7133a2faf0ae0710b7cbd9801c64767172d48faf we dump the failed transaction to the log. Sage Weil
08:28 AM Feature #1799 (Resolved): qa: add 'rados --load-gen' test(s)
Sage Weil
12:29 AM Revision c9e4504f (ceph): Ignore lockdep being turned off for now.
Some machines are hitting this udev issue:
http://marc.info/?l=linux-kernel&m=132033587908426&w=2 and lockdep is
turn...
Josh Durgin
12:00 AM Revision 6d5e5bdb (ceph): pybind/rados: add asynchronous write,append,read,write_full operations
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just

12/12/2011

10:31 PM Revision 78b7a255 (ceph): doc: Import the list of ceph subcommands from wiki.
This adds the content of the wiki page at
http://ceph.newdream.net/wiki/Monitor_commands
to doc/control.rst in orde...
Andre Noll
10:31 PM Revision 9aadd41b (ceph): doc: Add documentation of missing osd commands.
The set of OSD commands which was added by the previous commit is
incomplete. This patch adds documentation for the follo...
Andre Noll
10:31 PM Revision 1867a745 (ceph): doc: Document pause and unpause osd commands.
These two commands were undocumented so far. This patch adds a short
description.
Signed-Off-By: Andre Noll <maan@sy...
Andre Noll
10:31 PM Revision 7dce3e6f (ceph): doc: Update the list of fields for the pool set command.
This list was lacking a few fields: crash_replay_interval, pg_num,
pgp_num and crush_ruleset. Include these fields an...
Andre Noll
10:31 PM Revision db30716b (ceph): doc: Add missing documentation for osd pool get.
"osd pool set" was already documented, but the corresponding "get"
command was not. This patch adds the list of valid...
Andre Noll
10:31 PM Revision fb8fd186 (ceph): doc: Clarify documentation of reweight command.
This caused some discussions on the mailing list, so let's try to be clear
about the meaning of an OSD weight.
Signe...
Andre Noll
09:35 PM Bug #1821: librados: rados_create_with_context is unusable
i think radosgw uses it. it creates a CephContext by linking directly the ceph internals... Sage Weil
05:12 PM Bug #1821 (Resolved): librados: rados_create_with_context is unusable
There's no way to get a CephContext using the C api, so you can't pass one to rados_create_with_context. Maybe a rado... Josh Durgin
09:24 PM Revision 06046470 (ceph): SimpleMessenger: remove void send_keepalive.
Nobody uses this; they all call the version that returns an int.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhos...
Greg Farnum
09:24 PM Revision e6e66232 (ceph): mds: mark_disposable when closing a Client connection.
This is causing issues since the Client's ack of the MClientSession
is somehow not getting back to the MDS. We should...
Greg Farnum
09:24 PM Revision 1dd173a2 (ceph): messenger: fix up fault()'s "onconnect" parameter.
We should be setting this true when calling fault() from connect().
And rename it in the header -- it does produce le...
Greg Farnum
07:25 PM Bug #1820: deprecate "ceph stop"
Iirc the real purpose is to make the daemon shut down cleanly. This is important for gprof, valgrind memcheck, etc. ... Sage Weil
02:38 PM Bug #1820 (Resolved): deprecate "ceph stop"
A good daemon supervision system would try to restart any daemons that just exited. For "ceph stop" to work in the wo... Anonymous
05:29 PM Revision 5e215c7e (ceph): Merge branch 'wip-mon-stats'
Sage Weil
05:27 PM Revision 808a851d (ceph): mdsmap: rename get_num_*_mds() methods
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:27 PM Revision 711447d8 (ceph): mon: add mds, mon info to cluster_logger
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:24 PM Revision ac31d526 (ceph): mon: report basic cluster stats via perfcounters
These are basic point-in-time cluster stats.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:22 PM Revision 1f1b5fdf (ceph): crush: drop unused label
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:20 PM Revision 62b78de7 (ceph): Merge remote branch 'gh/stable'
Sage Weil
05:18 PM Revision 495307a1 (ceph): crush: fix force to behave with non-root TAKE
If the (first) TAKE in the crush rule is not the root, see if they picked
a point somewhere beneath the appropriate p...
Sage Weil
05:17 PM Revision 14f8f00e (ceph): crush: simplify force argument check
force isn't used past this point, only force_pos. Collapse the if
conditions.
Signed-off-by: Sage Weil <sage@newdre...
Sage Weil
04:45 PM Messengers Bug #1803: msgr: behave better when ending TCP connections
And I've flipped back and forth umpteen times today about what's going on. At this point I can conclude that nobody o... Greg Farnum
10:49 AM Messengers Bug #1803 (In Progress): msgr: behave better when ending TCP connections
From the little I'm reading in Unix Network Programming, it looks like we're just doing this wrong — we call shutdown... Greg Farnum
11:21 AM Documentation #1819 (Resolved): document librados python api
Josh Durgin
11:21 AM rbd Documentation #1818 (Closed): document librbd C++ api
Josh Durgin
11:20 AM Documentation #1817 (Closed): document librados C++ api
Josh Durgin
11:20 AM rbd Documentation #1816 (Closed): document librbd C api
Use similar examples to the python api docs. Josh Durgin
11:19 AM Documentation #1815 (Resolved): document librados C api
Document the librados C api with doxygen. Josh Durgin
10:00 AM Documentation #1814 (Resolved): doc: openstack + ceph install howto
Sage Weil
09:58 AM rgw Documentation #1813 (Resolved): doc: document radosgw api diffs with s3
move from google docs or wherever. clean up. maintain going forward. Sage Weil
09:50 AM Bug #1683 (In Progress): librados: list objects should also return locator key
Sage Weil
09:48 AM Bug #1744: teuthology: race with daemon shutdown?
any additional teuthology logging we can add to sort out what is happening? Sage Weil
09:47 AM RADOS Bug #1794 (Resolved): crush: creating/destroying buckets of zero items
fixed by commit:ca002a3389877f5e150659649e27e7ae59d7d402 Sage Weil
09:45 AM Feature #1782: mon: dump key cluster stats via perfcounter
Sage Weil
08:53 AM Bug #1758: OSD segfault in SimpleMessenger::send_message
Verify that last failure was running a commit that included the fix? Sage Weil
08:38 AM Linux kernel client Bug #1812 (Resolved): iput scheduling while atomic
iput can sleep, but is called with spinlocks held in some cases.... Sage Weil
08:34 AM Bug #1750 (In Progress): xattr errors silently ignored, cause trouble later
Sage Weil
08:31 AM Bug #1750: xattr errors silently ignored, cause trouble later
Shouldn't the FileStore have asserted on the -28? Sage Weil
03:19 AM Linux kernel client Bug #1795: break d_lock > s_cap_lock ordering
Seems fixed here now with git branch wip-d-lock. Amon Ott
03:18 AM Linux kernel client Bug #1762: i_lock vs s_cap_lock vs inodes_with_caps_lock lock ordering
Seems to be fixed here now with git commits be655596b3de5873f994ddbe205751a5ffb4de39 (for-linus) and 1a2fe05d296a35da... Amon Ott

12/10/2011

12:31 AM Revision cf279a8b (ceph): workunits: print tests pjd runs
This will tell us which ones actually failed within a test suite.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost....
Josh Durgin

12/09/2011

11:23 PM Revision 8064440d (ceph): Merge branch 'wip_pgls'
Samuel Just
11:22 PM Revision 864847b2 (ceph): pybind: add object locator support to pybind pool listing
list_objects returns Object(). Object therefore now has an optional
locator_key parameter which will set up the obje...
Samuel Just
09:44 PM Revision 111c12ce (ceph): ReplicatedPG: collection_list_handle_t is now an hobject_t
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
09:44 PM Revision 4ce7dd48 (ceph): rados.cc: add --object-locator and object locator output to ls
--object-locator locator causes io to use the specified locator. For
objects with non-empty locators, rados pool ls ...
Samuel Just
09:44 PM Revision 798ef38b (ceph): osd: delay pg list on a snapid until missing is empty
We cannot determine from the missing set whether an object existed
at a given snap.
Signed-off-by: Samuel Just <samu...
Samuel Just
04:53 PM CephFS Bug #1811 (Duplicate): 2 pjd chown tests failed on cfuse
From teuthology:~teuthworker/archive/nightly_coverage_2011-12-09-a/4061/teuthology.log:... Josh Durgin
04:32 PM Linux kernel client Bug #1793: NULL pointer dereference at try_write+0x627/0x1060
A disk error prevented me from getting logs before:... Josh Durgin
03:42 PM Linux kernel client Bug #1793: NULL pointer dereference at try_write+0x627/0x1060
Got the same trace on sepia18 while running mkfs.ext3 on an rbd image. Josh Durgin
03:18 PM Bug #1758 (New): OSD segfault in SimpleMessenger::send_message
This happened again yesterday. Core is in teuthology:~teuthworker/archive/nightly_coverage_2011-12-08-a/3954/remote/u... Josh Durgin
11:18 AM Messengers Bug #1803: msgr: behave better when ending TCP connections
I'm going to see if I can handle this in userspace today — fixing it in the kernel client will be another ticket. Greg Farnum
11:14 AM Feature #1810 (Resolved): monclient: timeouts?
It's been suggested that maybe certain categories of clients which are used for gathering statistics rather than comm... Greg Farnum
11:13 AM Messengers Feature #1809 (New): msgr: limit simultaneous connections
Right now SimpleMessenger has no mechanism for limiting the number of simultaneous connections it holds open. This is... Greg Farnum
11:10 AM Feature #1808 (Rejected): filestore: gracefully handle EMFILE
If the FileStore gets an EMFILE error it asserts out without attempting to handle the problem. I don't know whether t... Greg Farnum
09:34 AM Revision e2a94505 (ceph): obsync: add swift support to obsync
A single "url" doesn't make sense for a swift object store the way it does
for an S3 store or local file, so this com...
Kyle Marsh
07:15 AM Bug #1797: configure doesn't link to pthread on Fedora 14 on linking librados-config
I just found out it works when you call configure with
LIBS="-lpthread" ./configure
Still a bug, though, the c...
Guido Winkelmann
02:01 AM Revision d21f4abc (ceph): msgr: turn up socket debug printouts
These shouldn't be too common and will help in debugging
socket leaks.
Signed-off-by: Greg Farnum <gregory.farnum@dr...
Greg Farnum
01:47 AM Revision a768ad73 (ceph): coverage: don't generate html reports for each test
These can always be generated from the lcov files later, right now they just waste space. Josh Durgin
01:17 AM Revision 7b52dd14 (ceph): syslog: ignore 'task blocked' warnings
These will happen under heavy load (usually on the osd). Josh Durgin
12:36 AM Revision 891025e5 (ceph): udev: drop device number from name
The device number depends on how many rbd images have been
mapped. Removing it makes the name determined solely by th...
Josh Durgin

12/08/2011

11:35 PM Revision 6b8588b7 (ceph): Use btrfs for regression tests
Some of the tests (particularly the s3 tests) use very long filenames
which trigger bugs related to ext4 xattr handli...
Samuel Just
09:10 PM Revision a5606ca4 (ceph): pybind: trivial fix of missing argument
Signed-off-by: Henry C Chang <henry.cy.chang@gmail.com> Henry Chang
06:40 PM Bug #1805 (Rejected): OSD: fd leak
I was trying to figure out why the OSD was generating ~600 new sessions in the 4.5 seconds after starting up, when I ... Greg Farnum
06:20 PM Bug #1805 (Need More Info): OSD: fd leak
*sigh* It appears that I didn't manage to gather the correlated data that I thought I did. After an audit of who uses... Greg Farnum
02:10 PM Bug #1805 (Rejected): OSD: fd leak
There's an fd leak in the OSD. It looks like it's probably related to doing lots of OSDMap advancements at once, base... Greg Farnum
06:35 PM Bug #1807 (Can't reproduce): CentOS compile error in perfglue/heap_profiler.cc
on a CentOS system, I did a git fetch/merge followed by a make clean,
and got a compilation error in perf
CXX ...
Anonymous
05:59 PM Bug #1741: teuthology: failed to untar
Doesn't look like any other tests that day had the same machines locked while this was run. I think this might just b... Josh Durgin
05:40 PM Bug #1741: teuthology: failed to untar
It was 2662 that had this error. Josh Durgin
05:21 PM Feature #1800 (Resolved): qa: run osd tests on btrfs
Josh Durgin
04:42 PM Revision e4db1297 (ceph): crush: whitespace
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
04:41 PM Revision 808763ea (ceph): osdmap: initialize cluster_snapshot_epoch
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
04:41 PM Revision c94590ab (ceph): crush: set max_devices=0 for map with empty buckets
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
04:06 PM Revision ca002a33 (ceph): crush: fix stepping on unallocated memory
If size is 0 we can't write here.
Reported-by: pankaj singh <psingh.ait@gmail.com>
Signed-off-by: Sage Weil <sage.we...
Sage Weil
03:56 PM CephFS Bug #1806 (Can't reproduce): MDS won't start
ceph-mds fails to enter replay on start even though mon appears to instruct it to do so, all 3 mds processes remain i... Adam Jacob Muller
03:34 PM Bug #1750 (Rejected): xattr errors silently ignored, cause trouble later
I've updated the regression suite to use btrfs. Samuel Just
02:16 PM Bug #1750: xattr errors silently ignored, cause trouble later
I was able to reproduce this once with logging. It appears to be the ext4 xattr limitation.
2011-12-08 12:45:41.2...
Samuel Just
11:31 AM CephFS Bug #1788: msgr file descriptor leak
I guess this bug should be considered fixed by commit:8c4f4748e8b683f5b4ea939295793421c0ab7b61 in the wip-messenger b... Greg Farnum
05:19 AM Revision d940d68d (ceph): client: trim lru after flushing dirty data
Shouldn't matter, but it would be interesting to see if this affects
#1737.
Signed-off-by: Sage Weil <sage.weil@drea...
Sage Weil
05:19 AM Revision 1545d03c (ceph): client: unmount cleanup
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
05:19 AM Revision f3c90f8d (ceph): client: wait for sync writes even with cache enabled
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
05:19 AM Revision adbe3639 (ceph): client: send umount warnings to log, not stderr
stderr isn't usually open anyway.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil

12/07/2011

11:20 PM Revision e69057e4 (ceph): internal: check syslog for errors
This should catch lockdep warnings and mark tests with them as failed. Josh Durgin
07:40 PM Revision 9ab445a4 (ceph): ObjectStore: Add collection_list_partial for hash order
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Sage Weil
07:40 PM Revision 997265a2 (ceph): os/HashIndex: some minimal debug output
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:40 PM Revision 0807e7d5 (ceph): hobject_t: make filestore_hobject_key_t 64 bits
So we can return 0x100000000 when max=true.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
07:40 PM Revision 322f93a2 (ceph): hobject_t: encode max properly
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:40 PM Revision 717621f6 (ceph): librados,Objecter,PG: list objects now includes the locator key
Previously, there was no way to recover the locator key used to create
an object. Now, rados_objects_list_next and ...
Samuel Just
07:40 PM Revision 2d3721c6 (ceph): ObjectStore,ReplicatedPG: remove old collection_list_partial
No need for the old collection_list_partial instance: it's cleaner to
just use an hobject_t as the collection list ha...
Samuel Just
07:40 PM Revision 2026450b (ceph): hobject_t: define max value
Create a max value that is greater than all other values.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
07:40 PM Revision 348321a5 (ceph): hobject_t: sort by (max, hash, oid, snap)
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
07:40 PM Revision cada2f2e (ceph): object.h: Sort hobject_t by nibble reversed hash
To match the HashIndex ordering, we need to sort hobject_t by the nibble
reversed hash. We store objects in the file...
Samuel Just
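The hobject_t ordering changes listed at 07:40 PM can be illustrated with a minimal sketch. It assumes a 32-bit hash and combines the "(max, hash, oid, snap)" sort with the nibble-reversed hash; the function names and the `MAX_KEY` constant are hypothetical and are not taken from the Ceph source.

```python
MAX_KEY = 0x100000000  # one past the largest 32-bit value, so it needs a 64-bit key


def nibble_reverse32(h):
    """Reverse the order of the eight 4-bit nibbles in a 32-bit value."""
    h &= 0xFFFFFFFF
    out = 0
    for _ in range(8):
        out = (out << 4) | (h & 0xF)  # append the lowest remaining nibble
        h >>= 4
    return out


def sort_key(hash32, oid, snap, is_max=False):
    """Order by (max, nibble-reversed hash, oid, snap); max sorts last."""
    return (is_max, nibble_reverse32(hash32), oid, snap)


def filestore_key(hash32, is_max=False):
    """Assumed key shape: the reversed hash, or MAX_KEY for the max object."""
    return MAX_KEY if is_max else nibble_reverse32(hash32)
```

Under these assumptions, a hash of 0x00000010 sorts before 0x00000001, because their reversed forms are 0x01000000 and 0x10000000.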
07:40 PM Revision 63e3d864 (ceph): hobject_t: define explicit hash, operator<<; drop implicit sobject_t()
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
05:56 PM Bug #1804 (Closed): filestore: unexpected EINVAL
Core file and binary are on gitbuilder-gcov-amd64:~/bug_1804.
The data is still on sepia24 for inspection....
Josh Durgin
05:20 PM Messengers Bug #1803: msgr: behave better when ending TCP connections
This actually caused a deadlock with ffsb on the kernel client - ffsb ended up with 1006 connections in the CLOSING s... Josh Durgin
04:56 PM Messengers Bug #1803 (Won't Fix): msgr: behave better when ending TCP connections
TV is telling me that if we're not confirming that each side of the connection calls ::shutdown() on the socket, we'r... Greg Farnum
04:51 PM Bug #1791 (Resolved): osd: assert(0) in sub_op_modify
This looks like the objecter bug, fixed by commit:2f5bd5f737e831a03beb93c3928c74b59a59052e Sage Weil
03:38 PM Bug #1763 (Resolved): qa: need to run qa tests on kernel with lockdep enabled
Lockdep was already enabled, but we weren't marking runs as failed if errors appeared in syslog. Teuthology commit e6... Josh Durgin
01:49 PM CephFS Bug #1737: ceph-fuse crash in xlist::remove
This happened again from a different path in teuthology:~teuthworker/archive/nightly_coverage_2011-12-07-a/3843/remot... Josh Durgin
11:46 AM Feature #1802 (Resolved): qa: test to exercise divergent osd logs
- generate some write/overwrite workload with many concurrent writes
- extend ceph_manager to pause (kill -STOP) an ...
Sage Weil
11:18 AM rgw Bug #1801 (Resolved): rgw: radosgw-admin remove subuser and related swift key in a single command
Yehuda Sadeh
11:15 AM Feature #1800 (Resolved): qa: run osd tests on btrfs
I think all the code is there, but we need to make the nightly runs actually do it. Sage Weil
10:41 AM Feature #1799 (Resolved): qa: add 'rados --load-gen' test(s)
maybe a few tests with a range of options, if appropriate Sage Weil
10:41 AM Feature #1798 (Rejected): qa: add rados/librados tests (RadosModel)
Sage Weil
10:10 AM Feature #1784 (Duplicate): osd: redo pgls api
Sage Weil
09:27 AM rbd Feature #1790: rbd: have a way of establishing configured mappings at boot time
Single-file configuration is more annoying to handle with automated tools, file-per-device gives you good atomicity o... Anonymous
09:01 AM Bug #1778 (Resolved): Error after installing an iso-image via qemu / rbd-image
Hi Oliver,
You can use rbd to take live snapshots with the same consistency as with snapshotting images on nfs. Th...
Josh Durgin
03:31 AM Bug #1778: Error after installing an iso-image via qemu / rbd-image
Hi Josh,
well, the small fix does it, no more crashes.
But, of course I would love to have back my live-snapsho...
Oliver Francke
08:57 AM Bug #1797 (Resolved): configure doesn't link to pthread on Fedora 14 on linking librados-config
When building ceph 0.39 on Fedora 14, the build process fails with the
following messages:
CXXLD librados-con...
Guido Winkelmann
08:43 AM CephFS Bug #1796 (Resolved): mds: exit cleanly on EBLACKLISTED
... Sage Weil
08:31 AM Linux kernel client Bug #1795 (Resolved): break d_lock > s_cap_lock ordering
... Sage Weil
08:01 AM RADOS Bug #1794 (Resolved): crush: creating/destroying buckets of zero items
we still try to calloc the length zero array
and then try to free it later...
Sage Weil
07:32 AM CephFS Bug #1047: mds: crash on anchor table query
Got it again with 0.39. Still there. Amon Ott
12:16 AM Revision 95e63247 (ceph): workunit: set client id and secretfile env vars
These are used by the kernel rbd workunit to know how to map images.
Signed-off-by: Josh Durgin <josh.durgin@dreamho...
Josh Durgin

12/06/2011

11:56 PM rbd Feature #1790: rbd: have a way of establishing configured mappings at boot time
What if your image is not in the pool "rbd" ?
I was thinking about a 'rbdtab' file:...
Wido den Hollander
11:10 AM rbd Feature #1790 (Resolved): rbd: have a way of establishing configured mappings at boot time
We need to be careful about the config format, to make automatic editing easy (think Chef).
First draft:
/etc/c...
Anonymous
11:22 PM Revision 745be30f (ceph): gitignore: Ignore src/keyring, as created by vstart.sh
Commit 86c34ba9ee8c883b71a8449c3c261154365c35ae changed
the filename but not .gitignore.
Signed-off-by: Tommi Virtan...
Tommi Virtanen
10:44 PM Revision a1ebd725 (ceph): ReplicatedPG: don't crash on empty data_subset in sub_op_push
If data_subset is empty (i.e., the data we pulled is no longer useful),
we should mark complete false and continue ra...
Samuel Just
10:24 PM Revision 03b03553 (ceph): ReplicatedPG: do not ->put() scrub messages when adding to a WorkQueue.
This function is passing a reference from PG::active_rep_scrub to
the req_scrub_wq, not eliminating the reference (an...
Greg Farnum
10:20 PM Revision 8afa5a5d (ceph): workunits: fix secret file and temp file removal for kernel rbd
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
09:36 PM Revision bcd26fca (ceph): workunits: make rbd kernel workunit executable
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
08:13 PM Revision 2bdf9078 (ceph): doc: Reorganize pip calls to use a requirements file.
The conditional before running pip install was unnecessary,
"pip install" on already installed packages is fast (as l...
Tommi Virtanen
08:07 PM Revision 200d7c89 (ceph): doc: Switch diagram tools from dia to ditaa.
Now you can create diagrams easily with the ".. ditaa::"
directive in the Sphinx documents.
admin/build-doc now chec...
Tommi Virtanen
06:50 PM Revision 20b7af79 (ceph): doc: fix typo
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:50 PM Revision 33753c82 (ceph): filestore: send back op error to log, not stderr
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:31 PM Revision 66b6b1bf (ceph): workunits: add some tests for kernel rbd
This covers some snapshot and resize functions that aren't tested by fs benchmarks.
Signed-off-by: Josh Durgin <josh...
Josh Durgin
06:26 PM Revision 575f717f (ceph): rbd: allow snapshots to be mapped
unmap and showmapped already support snapshots. map should too.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin
06:26 PM Revision 01d30e6a (ceph): secret: fix error check
add_key will return -1 when an error occurs, which should be handled at a higher level and not printed here.
Signed-...
Josh Durgin
06:26 PM Revision 0ad0fbfe (ceph): secret: add is_kernel_secret function
This will let us know whether we can add a key mount option
if no secret is specified.
Signed-off-by: Josh Durgin <j...
Josh Durgin
06:26 PM Revision 274f4890 (ceph): rbd, mount.ceph: use pre-stored secret if available
If a secret is specified, store and use it, but otherwise
check for a pre-existing secret to use.
Signed-off-by: Jos...
Josh Durgin
06:26 PM Revision 16a211bf (ceph): ceph-rbdnamer: include snapshot name if present
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
06:26 PM Revision fd9556f0 (ceph): rbd: the showmapped command shouldn't connect to the cluster
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
06:02 PM Linux kernel client Bug #1793 (Can't reproduce): NULL pointer dereference at try_write+0x627/0x1060
Found in sepia50's console:... Josh Durgin
04:44 PM Bug #1778: Error after installing an iso-image via qemu / rbd-image
Josh Durgin
04:44 PM Bug #1778: Error after installing an iso-image via qemu / rbd-image
The bug is in the qemu driver - the fix is "in our qemu repo":https://github.com/NewDreamNetwork/qemu-kvm/commit/7ee2... Josh Durgin
09:28 AM Bug #1778: Error after installing an iso-image via qemu / rbd-image
Hi Oliver,
That gdb session is actually an entirely different crash - I'll take a closer look at both of these tod...
Josh Durgin
02:14 AM Bug #1778: Error after installing an iso-image via qemu / rbd-image
Well Josh,
being quite busy... and need to understand ( not a "real-coder" these days anymore ;-) ) how to configu...
Oliver Francke
04:34 PM Revision ddc11a8f (ceph): test_rados.py: clean up after EEXIST test
This extra pool caused subsequent pool tests to fail.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin
02:35 PM Bug #1758 (Resolved): OSD segfault in SimpleMessenger::send_message
I checked out a core dump, and the OSD is calling send_message with a null Connection* from PG::replica_scrub::2895. ... Greg Farnum
11:53 AM Bug #1758: OSD segfault in SimpleMessenger::send_message
And in teuthology:~teuthworker/archive/nightly_coverage_2011-12-06-a/3757/remote/ubuntu@sepia66.ceph.dreamhost.com/lo... Josh Durgin
11:52 AM Bug #1758: OSD segfault in SimpleMessenger::send_message
Happened again today in teuthology:~teuthworker/archive/nightly_coverage_2011-12-06-a/3772/remote/ubuntu@sepia66.ceph... Josh Durgin
02:01 PM CephFS Bug #1702 (Can't reproduce): Ceph MDS crash + client mount problem
Sage Weil
02:01 PM CephFS Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
I think the next step here is to run the mds under valgrind. Sage Weil
02:00 PM Bug #1490 (Resolved): cfuse assert failure: assert(ob->last_commit_tid < tid)
Sage Weil
11:34 AM CephFS Bug #1792 (Can't reproduce): crash in ceph-mds
This is the full log from teuthology:~teuthworker/archive/nightly_coverage_2011-12-01-b/3516/remote/ubuntu@sepia70.ce... Josh Durgin
11:25 AM Bug #1791 (Resolved): osd: assert(0) in sub_op_modify
From teuthology:~teuthworker/archive/nightly_coverage_2011-12-02-a/3569/remote/ubuntu@sepia6.ceph.dreamhost.com/log/o... Josh Durgin
11:19 AM Bug #1750 (New): xattr errors silently ignored, cause trouble later
Happened again after s3tests in teuthology:~teuthworker/archive/nightly_coverage_2011-12-02-b/3624/teuthology.log. Josh Durgin
11:09 AM CephFS Bug #1675: mds: failed rstat assert
Happened during fsstress in teuthology:~teuthworker/archive/nightly_coverage_2011-12-02-b/3593/remote/ubuntu@sepia92.... Josh Durgin
11:07 AM Bug #1789 (Resolved): mon: failed assert(paxosv == pg_map.version)
From teuthology:~teuthworker/archive/nightly_coverage_2011-12-02-b/3603/remote/ubuntu@sepia44.ceph.dreamhost.com/log/... Josh Durgin
10:54 AM Bug #1530: osd crash during build_inc_scrub_map
Another one crashed in PG::replica_scrub yesterday. core is in teuthology:~teuthworker/archive/nightly_coverage_2011-... Josh Durgin
06:01 AM CephFS Bug #1047: mds: crash on anchor table query
Updated Ceph to 0.39 and the bug seems to be gone. Amon Ott
01:33 AM Revision 54758abc (ceph): Merge remote branch 'gh/stable'
Sage Weil
12:16 AM Revision 9512aed5 (ceph): doc: fix rst syntax
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil

12/05/2011

10:07 PM Revision 7178f1ca (ceph): doc: document monitor cluster expansion/contraction
Pretty sure my rst syntax is wrong.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:33 PM Revision 16f79282 (ceph): cephtool: fix shutdown
Fix 'ceph -w' brokenness from commit ad13d0b7.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
07:21 PM Revision 019597e6 (ceph): filejournal: make FileJournal::open() arg slightly less weird
Pass in fs_op_seq (last_committed_seq), not the next expected seq, so we
can avoid subtracting and adding 1 in odd pl...
Sage Weil
07:21 PM Revision bfbc4324 (ceph): Merge branch 'stable'
Sage Weil
07:21 PM Revision 86c34ba9 (ceph): vstart.sh: .ceph_keyring -> keyring
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
07:15 PM CephFS Bug #1774: client: files become inaccessible in large directories (with snapshots?)
Some interesting findings... It appears that the problem has nothing to do with the mds, but with the fuse client. ... Alexandre Oliva
06:53 PM Revision 1e3da7ed (ceph): filejournal: remove bogus check in read_entry
It is perfectly fine to read events that are older than the fs's seq from
the journal; open() will skip them when pos...
Sage Weil
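The filejournal entries for this day treat last_committed_seq as the last seq committed to the fs. A minimal sketch of the skipping behaviour they describe follows, assuming journal entries are (seq, payload) pairs in increasing seq order; the function name is hypothetical and this is not the FileJournal code.

```python
def entries_to_replay(journal, last_committed_seq):
    """Return the journal entries the filesystem does not yet hold.

    journal: iterable of (seq, payload) tuples in increasing seq order.
    last_committed_seq: highest seq already committed to the fs.
    Hypothetical helper for illustration only.
    """
    # Entries at or below last_committed_seq are read but skipped, not rejected.
    return [(seq, payload) for seq, payload in journal
            if seq > last_committed_seq]
```

Taking the committed seq directly, rather than the next expected seq, keeps the comparison free of any plus-one or minus-one adjustment.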
06:08 PM Revision dbd7a3b4 (ceph): Rename "testrados" task to not begin with "test".
See commit e80c32c44293e6453cce1bf89ad3cf5b1b4917ab in
teuthology.git
Tommi Virtanen
06:07 PM Revision e80c32c4 (ceph): Rename "testrados" and "testswift" tasks to not begin with "test".
Anything "test*" looks like a unit test, and shouldn't be used for
actual code.
Tommi Virtanen
06:07 PM Revision 9598e479 (ceph): Rename "testrados" and "testswift" tasks to not begin with "test".
Anything "test*" looks like a unit test, and shouldn't be used for
actual code.
Tommi Virtanen
06:02 PM Revision 0dd4d69f (ceph): Fix unit tests for SSH keep-alive setting.
Commit 6e3e0d7cdcb5ba70f938f0850a8828aca2753ab5 failed to pass
unit tests.
Tommi Virtanen
05:37 PM Revision dc167bac (ceph): filejournal: set last_committed_seq based on fs, not journal
last_committed_seq is the last seq committed to the fs, not the journal.
Set it when we begin replay with the fs prov...
Sage Weil
04:15 PM CephFS Bug #1788 (Resolved): msgr file descriptor leak
With our Hadoop workload (lots of client connections), this problem occurs every couple hours -- although this is the... Noah Watkins
02:18 PM Bug #1786 (Resolved): ceph -w goes dead after 5 minutes
commit:16f79282cd0132c3633216f51fbbf0f93a0aec61 Sage Weil
11:13 AM Bug #1786 (Resolved): ceph -w goes dead after 5 minutes
Sage Weil
02:18 PM Bug #1785 (Resolved): osd: os/FileJournal.cc: 1011: FAILED assert(seq >= last_committed_seq)
commit:1e3da7edcf8881b10f35879e4b5b6be93167c636 Sage Weil
09:14 AM Bug #1785 (Resolved): osd: os/FileJournal.cc: 1011: FAILED assert(seq >= last_committed_seq)
Sage Weil
11:22 AM CephFS Bug #1787 (Closed): mds: laggy oneshot replays pollute mdsmap
... Sage Weil
10:53 AM Bug #1759: mds/client: truncate size overflow, fails with EINVAL
I lost my setup over the weekend, so I'm not going to be able to try the wip-truncate branch on the deployment to see... Sam Lang

12/03/2011

03:11 PM Feature #1784 (Duplicate): osd: redo pgls api
include locators
use hobject_t as iterator (and hopefully make the objecter split/merge coping logic less ugly in th...
Sage Weil
03:09 PM Feature #1783 (Resolved): osd: scrub incrementally across hash range using MOSDPGScan
Current scrub will not scale to large PGs. Sage Weil
01:01 AM CephFS Bug #1047: mds: crash on anchor table query
Attached a log of a full run up to the crash. MDS tries to recover from some problem, replays and crashes. Amon Ott

12/02/2011

11:35 PM Revision 4a0b00a0 (ceph): mon: stub perfcounters for monitor, cluster
The 'mon' perfcounter is for the local daemon and is always registered.
The 'cluster' perfcounter is for cluster sta...
Sage Weil
11:27 PM Revision 6dd81485 (ceph): osd: rename {take -> requeue}_object_waiters
It calls osd->requeue_ops(), so make naming more consistent and avoid
confusing people like me.
Signed-off-by: Sage ...
Sage Weil
11:27 PM Revision 8bbe576c (ceph): osd: safely requeue waiting_for_ondisk waiters on_role_change
This could conceivably cause the reply ordering mismatch seen in bug
#1490. Not sure why we didn't also fix this cal...
Sage Weil
09:38 PM Revision c8831004 (ceph): rados.py: add list_pools method
Signed-off-by: Eric Chen <Eric_YH_Chen@wistron.com>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin
08:06 PM Revision 6b4b6595 (ceph): Merge branch 'stable'
Sage Weil
07:28 PM Revision 06228716 (ceph): Doc: add a conceptual overview of the peering process
Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com> Mark Kampe
07:19 PM Revision c45a8491 (ceph): mds: remove obsolete doc
Sage Weil
06:52 PM Bug #1778: Error after installing an iso-image via qemu / rbd-image
Hi Oliver,
With snapshot=on data is never saved to the backing device - the original file is not modified unless y...
Josh Durgin
05:31 AM Bug #1778: Error after installing an iso-image via qemu / rbd-image
Well Josh,
attached you will find a crash, qemu-system... started without "-daemonize" to see what's going on ;-)
...
Oliver Francke
04:46 AM Bug #1778: Error after installing an iso-image via qemu / rbd-image
Hi Josh,
I have just made a session with savevm/loadvm, once without/with the snapshot-option, now with qemu-1.0. ...
Oliver Francke
05:58 PM Revision 0c183ec7 (ceph): crush: ignore forcefed input that doesn't exist
This might happen if, e.g., the file_layout specifies an osd that later
is removed from the cluster entirely. Just i...
Sage Weil
05:47 PM Revision faf5ce62 (ceph): Revert "CrushWrapper: ignore forcefeed if it does not exist"
This reverts commit 6fbab6da6942c238d40a6b4f1680a7e6da463289.
This fails a unit test.
And I change my mind.. I thin...
Sage Weil
05:01 PM Revision 321ecdab (ceph): v0.39
Sage Weil
05:00 PM Revision 75aff023 (ceph): OSDMap: build_simple_from_conf pg_num should not be 0 with one osd
Previously, pg_num would end up set to 0 if osd.0 is the only osd.
Signed-off-by: Samuel Just <samuel.just@dreamhost...
Samuel Just
03:51 PM Bug #1759: mds/client: truncate size overflow, fails with EINVAL
Sorry - haven't had a chance yet. I'll try it on Monday. Sam Lang
11:50 AM Bug #1759: mds/client: truncate size overflow, fails with EINVAL
Sam, did you get a chance to try this? Sage Weil
03:43 PM Bug #1490: cfuse assert failure: assert(ob->last_commit_tid < tid)
If we're lucky this was caused by taking waiters improperly, which Sage fixed in commit:8bbe576cab9ecdbfea939ad3d7866... Greg Farnum
03:40 PM Feature #1782: mon: dump key cluster stats via perfcounter
commit:4a0b00a0f29a87965925e0b44c997bece96b9936 stubs this out. just need to populate the perfcounter with the relev... Sage Weil
02:20 PM Feature #1782 (Resolved): mon: dump key cluster stats via perfcounter
This may be a minor abuse of the perfcounter intent, but it lets us get cluster stats using a common mechanism (via c... Sage Weil
03:22 PM Feature #390 (In Progress): Implement bdrv_snapshot_goto (Rollback), bdrv_snapshot_delete
Have some functions, trying to get a setup to test them with. Greg Farnum
01:54 PM Feature #1082 (Rejected): obsync: swift support
dho guys are doing this. Sage Weil
01:27 PM Feature #1781 (Resolved): qa: readwrite and roundtrip rgw tests in qa suite
Sage Weil
01:01 PM rgw Bug #1780 (Resolved): swift: auth response should return X-Auth-Token instead of X-Storage-Token
Yehuda Sadeh
11:56 AM Bug #1750 (Resolved): xattr errors silently ignored, cause trouble later
Sage Weil
11:54 AM Bug #1757 (Closed): oi disagrees with stat, or error code on stat
Sage Weil
11:52 AM Bug #1679 (Can't reproduce): assertion failure is_replica()
and old code, pending new code. Sage Weil
11:52 AM Bug #1688 (Won't Fix): Benjamin: pg stuck in scrub
old code. Sage Weil
11:50 AM Bug #1689 (Can't reproduce): osd: segfault in recover_primary
going to ignore this and see how the new backfill code fares. Sage Weil
11:48 AM CephFS Bug #1775 (Need More Info): mds startup: _replay journaler got error -22, aborting, possible regr...
Without logs, it's hard to say, but it looks like something caused the OSD to drop a write (or series of writes). No... Sage Weil
11:46 AM Bug #1617 (Won't Fix): pgs stuck down and peering with only one osd down and out
the new code will have an explicit 'incomplete' state when peering fails, instead of being 'stuck'. let's ignore thi... Sage Weil
09:44 AM CephFS Bug #1047 (Need More Info): mds: crash on anchor table query
Amon Ott just hit this one. Sage Weil
04:36 AM Revision 2f5bd5f7 (ceph): objecter: initialize global_op_flags to zero
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
12:13 AM Revision 813523a6 (ceph): Doc: delete gratuitous index.html
It was not an index, and seems to contain recommendations
for system configuration. I have renamed it to confusing.t...
Mark Kampe
12:12 AM Revision 48165af5 (ceph): Doc: complete reversion of architecture.rst
(abandon in progress improvements until everything works)
Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com>
Mark Kampe
12:12 AM Revision 3c7a82a6 (ceph): Doc: deleted gratuitous PlanningImplementation.html,
which was a copy of PlanningImplementation.txt
(and not html at all).
restored previous index.rst, which was overwri...
Mark Kampe
12:11 AM Revision fdf3f7bd (ceph): Doc: Restore the previous version of architecture.rst
it was accidentally overwritten with a version of the product
had a somewhat different audience/focus and a few sphin...
Mark Kampe
12:07 AM Revision 4cfe0815 (ceph): doc: change state model from .svg to .png
Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com> Mark Kampe

12/01/2011

10:41 PM Revision 1bbf9ae6 (ceph): fixed ubuntu version typo
Steve MacGregor
10:20 PM Revision 6fbab6da (ceph): CrushWrapper: ignore forcefeed if it does not exist
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
08:38 PM Revision 363ebb6c (ceph): librbd: report an error if rbd header does not match
This will fail on future incompatible versions of the header format.
Signed-off-by: Josh Durgin <josh.durgin@dreamho...
Josh Durgin
07:15 PM Revision cce67171 (ceph): Merge branch 'wip_local_reads'
Greg Farnum
07:15 PM Revision d4aef202 (ceph): hadoop: apache license.
We haven't made explicit that the Hadoop Java code is under the Apache
License. Do so (with permission from the other...
Greg Farnum
05:40 PM Messengers Bug #1747 (Need More Info): msgr: osd connection originates from wrong port
The blank address isn't a problem; it's due to the in_hbmsgr not being bound (deliberately). Unfortunately I've been ... Greg Farnum
05:17 PM Revision 348c71c4 (ceph): mds: fix blocking in standby replay thread
We need to hold mylock before waiting on the cond or else we get
./common/Cond.h: In function 'int Cond::Wait(Mutex&...
Sage Weil
05:17 PM Revision f6ee3699 (ceph): global: make daemon banner print explicit
This eliminates some flags and avoids annoying cases where the banner is
printed but we don't want to see it.
Signed...
Sage Weil
04:19 PM Revision 5828009e (ceph): mds: fix usage text
Filename is not optional.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
01:16 PM Bug #1778: Error after installing an iso-image via qemu / rbd-image
There's certainly a difference with the snapshot parameter - it doesn't store anything in the rbd image unless you us... Josh Durgin
12:09 PM Bug #1778: Error after installing an iso-image via qemu / rbd-image
Hi Josh,
at least my experience showed a different behaviour: no reliable snapshots and even crashes of qemu-syste...
Oliver Francke
10:54 AM Bug #1778: Error after installing an iso-image via qemu / rbd-image
You don't need any special qemu options to use snapshots - the snapshot option is confusingly named. The qemu 'snapsh... Josh Durgin
09:30 AM Bug #1778 (Resolved): Error after installing an iso-image via qemu / rbd-image
Hi *,
we are currently running:
ceph version 0.38 (commit:b600ec2ac7c0f2e508720f8e8bb87c3db15509b9) fro...
Oliver Francke
12:10 PM CephFS Bug #1775: mds startup: _replay journaler got error -22, aborting, possible regression?
stick a
continue;
after the set_read_pos() call to avoid the second crash.
Sage Weil
08:36 AM CephFS Bug #1775: mds startup: _replay journaler got error -22, aborting, possible regression?
No, I didn't have osd logging enabled. I'll provide you with the journal in a few minutes. Szymon Szypulski
08:26 AM CephFS Bug #1775: mds startup: _replay journaler got error -22, aborting, possible regression?
Can you dump the mds journal so we can get a closer look at the corruption? Something like
ceph-mds -i foo --dum...
Sage Weil
12:24 AM CephFS Bug #1775 (Resolved): mds startup: _replay journaler got error -22, aborting, possible regression?
ubuntu natty, kernel 3.2-rc2, ceph 0.38 (stable from git) with patch from #1756 and workaround for #1757
setup
s1...
Szymon Szypulski
10:13 AM rgw Bug #1779 (Resolved): rgw: swift auth returns wrong error code when nonexistent user is given
returns 404 instead of 403 Yehuda Sadeh
09:12 AM rgw Bug #1777 (Resolved): rgw: user info modification is not atomic
e.g., adding keys, etc.
I think it's more important to identify cases where operations left system in an inconsist...
Yehuda Sadeh
09:05 AM rgw Feature #1776 (Resolved): rgw: swift auth prefix should be configurable (and optional)
Yehuda Sadeh
01:07 AM Revision 50c4b312 (ceph): Handle interactive-on-error also when error is from contextmanager exit.
Closes: http://tracker.newdream.net/issues/1745 Tommi Virtanen

11/30/2011

07:21 PM CephFS Bug #1774 (Resolved): client: files become inaccessible in large directories (with snapshots?)
Taking snapshots of certain directories within ceph that hold backups of root filesystems of my openmoko phone causes... Alexandre Oliva
05:57 PM Revision 353ee000 (ceph): mds: adjust flock lock state on export
Looks like this was missed when flocklock was added. Did a quick grep and
it doesn't look like it is missing anywher...
Sage Weil
05:49 PM Feature #1773 (Resolved): rbd: class interface for header interaction
This will include:
* create(size, order, features)
* get_info(image)
* get_snapc
* snap_add
* later snap_add...
Josh Durgin
05:43 PM Feature #1772 (Resolved): rbd: define new on-disk header format
This should include several new things:
* CompatSet
* read-only flag
* parent_{pool, image_id, snap_id}
* list<...
Josh Durgin
05:28 PM Bug #1771 (Resolved): rbd: delete snapshots when image is deleted
Currently the snapshots are left around with no way to access them. Josh Durgin
05:23 PM CephFS Bug #1770 (Can't reproduce): directory nonexistent on kernel_untar_build.sh
... Sage Weil
05:18 PM CephFS Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
the tasks were in nightly_coverage_2011-11-30-a
3433: collection:basic clusters:fixed-3.yaml tasks:kclient_workuni...
Sage Weil
05:13 PM CephFS Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
Happened twice today:... Sage Weil
05:08 PM Feature #1745 (Closed): teuthology: make interactive-on-error stop further cleanup
... Anonymous
05:06 PM Bug #1690 (Can't reproduce): osd re-created from scratch will crash on start-up
Sage Weil
03:19 PM CephFS Bug #1753 (Won't Fix): ceph copy raw images from qemu incorrectly
Unfortunately, right now making Ceph report sparse files correctly would be prohibitively expensive. It can be done, ... Greg Farnum
02:57 PM CephFS Bug #1753: ceph copy raw images from qemu incorrectly
To create the sparse file qemu-img just calls ftruncate. It does nothing fs-specific, so this can be replicated with ... Josh Durgin
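The `ftruncate` behaviour described in the comment above can be reproduced with a short sketch. This assumes a Unix-like system whose filesystem supports sparse files; the function name and the 512-byte unit for `st_blocks` are assumptions for illustration and are not taken from the ticket.

```python
import os
import tempfile


def sparse_file_sizes(path: str, size: int) -> tuple[int, int]:
    """Create a file of `size` bytes via ftruncate and report its sizes.

    Returns (logical_size, allocated_bytes). No data is written, so on a
    filesystem with sparse-file support allocated_bytes is typically far
    smaller than logical_size.
    """
    with open(path, "wb") as f:
        os.ftruncate(f.fileno(), size)  # extend the file without writing data
    st = os.stat(path)
    # st_blocks is counted in 512-byte units on common Unix systems
    # (assumption); it is not available on Windows.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        logical, allocated = sparse_file_sizes(os.path.join(d, "img.raw"), 1 << 30)
        print(f"logical={logical} bytes, allocated={allocated} bytes")
```

A copy tool that reads such a file byte by byte sees the full logical size, which is why copy time can track the logical size rather than the allocated one.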
11:10 AM CephFS Bug #1753: ceph copy raw images from qemu incorrectly
The file copy took 3 minutes. That is fine for a 3 GB file, but not for a 100 KB file. max mikheev
09:43 AM CephFS Bug #1753: ceph copy raw images from qemu incorrectly
I'm a little confused here. Ceph has never reported only the used space for a file; doing so is prohibitively expensi... Greg Farnum
02:20 PM Messengers Bug #1747 (In Progress): msgr: osd connection originates from wrong port
The problem here is somewhere on osd.2 — osd.1 is using the address that osd.2 is providing, and you can see that osd... Greg Farnum
01:17 PM CephFS Bug #1756 (Resolved): mds crash right after successful recovery
Sage Weil
11:28 AM Linux kernel client Bug #1769 (New): osd_client: susceptibility to low memory deadlocks
We could be trying to flush the cache in order to free up memory, and find ourselves unable to allocate a ceph_osd or... Anonymous
11:21 AM Linux kernel client Cleanup #1768 (Closed): osd_client: gratuitous ceph_monc_request_next_osdmap calls
kick_requests() is called from within a loop that iterates through multiple OSD map updates ... which means that it m... Anonymous
11:15 AM Linux kernel client Bug #1767 (Resolved): osd_client: send_request() cannot fail
The static __send_request() routine is sure to succeed in queuing its request for the specified osd client, yet ceph_... Anonymous
11:12 AM Linux kernel client Bug #1766 (New): mon_client: sends request before authentication
The passed request is sent unconditionally, whether or not we have finished authenticating.
If we have not yet com...
Anonymous
10:11 AM Bug #1765 (Resolved): osd: 'call' op can return data even if op is modifying
Not sure if it'd actually return data, but in any case the api is ambiguous. If it does return data it breaks idempot... Yehuda Sadeh
10:07 AM Feature #1764 (Rejected): osd classes: add an optional source object
This can be very useful. Source object should have the same locator as the target object. Similar to clone-range. An ... Yehuda Sadeh
10:03 AM Bug #1490: cfuse assert failure: assert(ob->last_commit_tid < tid)
This didn't turn out to have anything to do with #1727, did it? Greg Farnum
09:36 AM Linux kernel client Bug #1762: i_lock vs s_cap_lock vs inodes_with_caps_lock lock ordering
Argh, this is a real pain. igrab() requires i_lock, which we use extensively to protect complicated changes. In the... Sage Weil
09:19 AM Linux kernel client Bug #1762 (Resolved): i_lock vs s_cap_lock vs inodes_with_caps_lock lock ordering
Reported by Amon Ott on ML.... Sage Weil
09:25 AM Bug #1763 (Resolved): qa: need to run qa tests on kernel with lockdep enabled
We need to catch lock ordering regressions like #1762 in our nightly runs. Sage Weil
02:14 AM Revision 2443878b (ceph): Objecter: loop the right direction when searching for local replicas
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
12:35 AM Revision 1c696b65 (ceph): doc: Add peering state diagram
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
12:20 AM Revision 2918b501 (ceph): Move kclient multiple_rsync workunit to stress collection.
Bug #1760 keeps being triggered by this. Josh Durgin
 
