Activity

From 11/07/2011 to 12/06/2011

12/06/2011

11:56 PM rbd Feature #1790: rbd: have a way of establishing configured mappings at boot time
What if your image is not in the pool "rbd" ?
I was thinking about a 'rbdtab' file:...
Wido den Hollander
11:10 AM rbd Feature #1790 (Resolved): rbd: have a way of establishing configured mappings at boot time
We need to be careful about the config format, to make automatic editing easy (think Chef).
First draft:
/etc/c...
Anonymous
11:22 PM Revision 745be30f (ceph): gitignore: Ignore src/keyring, as created by vstart.sh
Commit 86c34ba9ee8c883b71a8449c3c261154365c35ae changed
the filename but not .gitignore.
Signed-off-by: Tommi Virtan...
Tommi Virtanen
10:44 PM Revision a1ebd725 (ceph): ReplicatedPG: don't crash on empty data_subset in sub_op_push
If data_subset is empty (i.e., the data we pulled is no longer useful),
we should mark complete false and continue ra...
Samuel Just
10:24 PM Revision 03b03553 (ceph): ReplicatedPG: do not ->put() scrub messages when adding to a WorkQueue.
This function is passing a reference from PG::active_rep_scrub to
the req_scrub_wq, not eliminating the reference (an...
Greg Farnum
10:20 PM Revision 8afa5a5d (ceph): workunits: fix secret file and temp file removal for kernel rbd
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
09:36 PM Revision bcd26fca (ceph): workunits: make rbd kernel workunit executable
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
08:13 PM Revision 2bdf9078 (ceph): doc: Reorganize pip calls to use a requirements file.
The conditional before running pip install was unnecessary,
"pip install" on already installed packages is fast (as l...
Tommi Virtanen
08:07 PM Revision 200d7c89 (ceph): doc: Switch diagram tools from dia to ditaa.
Now you can create diagrams easily with the ".. ditaa::"
directive in the Sphinx documents.
admin/build-doc now chec...
Tommi Virtanen
06:50 PM Revision 20b7af79 (ceph): doc: fix typo
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:50 PM Revision 33753c82 (ceph): filestore: send back op error to log, not stderr
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:31 PM Revision 66b6b1bf (ceph): workunits: add some tests for kernel rbd
This covers some snapshot and resize functions that aren't tested by fs benchmarks.
Signed-off-by: Josh Durgin <josh...
Josh Durgin
06:26 PM Revision 575f717f (ceph): rbd: allow snapshots to be mapped
unmap and showmapped already support snapshots. map should too.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin
06:26 PM Revision 01d30e6a (ceph): secret: fix error check
add_key will return -1 when an error occurs, which should be handled at a higher level and not printed here.
Signed-...
Josh Durgin
06:26 PM Revision 0ad0fbfe (ceph): secret: add is_kernel_secret function
This will let us know whether we can add a key mount option
if no secret is specified.
Signed-off-by: Josh Durgin <j...
Josh Durgin
06:26 PM Revision 274f4890 (ceph): rbd, mount.ceph: use pre-stored secret if available
If a secret is specified, store and use it, but otherwise
check for a pre-existing secret to use.
Signed-off-by: Jos...
Josh Durgin
06:26 PM Revision 16a211bf (ceph): ceph-rbdnamer: include snapshot name if present
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
06:26 PM Revision fd9556f0 (ceph): rbd: the showmapped command shouldn't connect to the cluster
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
06:02 PM Linux kernel client Bug #1793 (Can't reproduce): NULL pointer dereference at try_write+0x627/0x1060
Found in sepia50's console:... Josh Durgin
04:44 PM Bug #1778: Error after installing an iso-image via qemu / rbd-image
Josh Durgin
04:44 PM Bug #1778: Error after installing an iso-image via qemu / rbd-image
The bug is in the qemu driver - the fix is "in our qemu repo":https://github.com/NewDreamNetwork/qemu-kvm/commit/7ee2... Josh Durgin
09:28 AM Bug #1778: Error after installing an iso-image via qemu / rbd-image
Hi Oliver,
That gdb session is actually an entirely different crash - I'll take a closer look at both of these tod...
Josh Durgin
02:14 AM Bug #1778: Error after installing an iso-image via qemu / rbd-image
Well Josh,
being quite busy... and need to understand ( not a "real-coder" these days anymore ;-) ) how to configu...
Oliver Francke
04:34 PM Revision ddc11a8f (ceph): test_rados.py: clean up after EEXIST test
This extra pool caused subsequent pool tests to fail.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin
02:35 PM Bug #1758 (Resolved): OSD segfault in SimpleMessenger::send_message
I checked out a core dump, and the OSD is calling send_message with a null Connection* from PG::replica_scrub::2895. ... Greg Farnum
11:53 AM Bug #1758: OSD segfault in SimpleMessenger::send_message
And in teuthology:~teuthworker/archive/nightly_coverage_2011-12-06-a/3757/remote/ubuntu@sepia66.ceph.dreamhost.com/lo... Josh Durgin
11:52 AM Bug #1758: OSD segfault in SimpleMessenger::send_message
Happened again today in teuthology:~teuthworker/archive/nightly_coverage_2011-12-06-a/3772/remote/ubuntu@sepia66.ceph... Josh Durgin
02:01 PM CephFS Bug #1702 (Can't reproduce): Ceph MDS crash + client mount problem
Sage Weil
02:01 PM CephFS Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
I think the next step here is to run the mds under valgrind. Sage Weil
02:00 PM Bug #1490 (Resolved): cfuse assert failure: assert(ob->last_commit_tid < tid)
Sage Weil
11:34 AM CephFS Bug #1792 (Can't reproduce): crash in ceph-mds
This is the full log from teuthology:~teuthworker/archive/nightly_coverage_2011-12-01-b/3516/remote/ubuntu@sepia70.ce... Josh Durgin
11:25 AM Bug #1791 (Resolved): osd: assert(0) in sub_op_modify
From teuthology:~teuthworker/archive/nightly_coverage_2011-12-02-a/3569/remote/ubuntu@sepia6.ceph.dreamhost.com/log/o... Josh Durgin
11:19 AM Bug #1750 (New): xattr errors silently ignored, cause trouble later
Happened again after s3tests in teuthology:~teuthworker/archive/nightly_coverage_2011-12-02-b/3624/teuthology.log. Josh Durgin
11:09 AM CephFS Bug #1675: mds: failed rstat assert
Happened during fsstress in teuthology:~teuthworker/archive/nightly_coverage_2011-12-02-b/3593/remote/ubuntu@sepia92.... Josh Durgin
11:07 AM Bug #1789 (Resolved): mon: failed assert(paxosv == pg_map.version)
From teuthology:~teuthworker/archive/nightly_coverage_2011-12-02-b/3603/remote/ubuntu@sepia44.ceph.dreamhost.com/log/... Josh Durgin
10:54 AM Bug #1530: osd crash during build_inc_scrub_map
Another one crashed in PG::replica_scrub yesterday. core is in teuthology:~teuthworker/archive/nightly_coverage_2011-... Josh Durgin
06:01 AM CephFS Bug #1047: mds: crash on anchor table query
Updated Ceph to 0.39 and the bug seems to be gone. Amon Ott
01:33 AM Revision 54758abc (ceph): Merge remote branch 'gh/stable'
Sage Weil
12:16 AM Revision 9512aed5 (ceph): doc: fix rst syntax
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil

12/05/2011

10:07 PM Revision 7178f1ca (ceph): doc: document monitor cluster expansion/contraction
Pretty sure my rst syntax is wrong.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:33 PM Revision 16f79282 (ceph): cephtool: fix shutdown
Fix 'ceph -w' brokenness from commit ad13d0b7.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
07:21 PM Revision 019597e6 (ceph): filejournal: make FileJournal::open() arg slightly less weird
Pass in fs_op_seq (last_committed_seq), not the next expected seq, so we
can avoid subtracting and adding 1 in odd pl...
Sage Weil
07:21 PM Revision bfbc4324 (ceph): Merge branch 'stable'
Sage Weil
07:21 PM Revision 86c34ba9 (ceph): vstart.sh: .ceph_keyring -> keyring
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
07:15 PM CephFS Bug #1774: client: files become inaccessible in large directories (with snapshots?)
Some interesting findings... It appears that the problem has nothing to do with the mds, but with the fuse client. ... Alexandre Oliva
06:53 PM Revision 1e3da7ed (ceph): filejournal: remove bogus check in read_entry
It is perfectly fine to read events that are older than the fs's seq from
the journal; open() will skip them when pos...
Sage Weil
06:08 PM Revision dbd7a3b4 (ceph): Rename "testrados" task to not begin with "test".
See commit e80c32c44293e6453cce1bf89ad3cf5b1b4917ab in
teuthology.git
Tommi Virtanen
06:07 PM Revision e80c32c4 (ceph): Rename "testrados" and "testswift" tasks to not begin with "test".
Anything "test*" looks like a unit test, and shouldn't be used for
actual code.
Tommi Virtanen
06:07 PM Revision 9598e479 (ceph): Rename "testrados" and "testswift" tasks to not begin with "test".
Anything "test*" looks like a unit test, and shouldn't be used for
actual code.
Tommi Virtanen
06:02 PM Revision 0dd4d69f (ceph): Fix unit tests for SSH keep-alive setting.
Commit 6e3e0d7cdcb5ba70f938f0850a8828aca2753ab5 failed to pass
unit tests.
Tommi Virtanen
05:37 PM Revision dc167bac (ceph): filejournal: set last_committed_seq based on fs, not journal
last_committed_seq is the last seq committed to the fs, not the journal.
Set it when we begin replay with the fs prov...
Sage Weil
04:15 PM CephFS Bug #1788 (Resolved): msgr file descriptor leak
With our Hadoop workload (lots of client connections), this problem occurs every couple hours -- although this is the... Noah Watkins
02:18 PM Bug #1786 (Resolved): ceph -w goes dead after 5 minutes
commit:16f79282cd0132c3633216f51fbbf0f93a0aec61 Sage Weil
11:13 AM Bug #1786 (Resolved): ceph -w goes dead after 5 minutes
Sage Weil
02:18 PM Bug #1785 (Resolved): osd: os/FileJournal.cc: 1011: FAILED assert(seq >= last_committed_seq)
commit:1e3da7edcf8881b10f35879e4b5b6be93167c636 Sage Weil
09:14 AM Bug #1785 (Resolved): osd: os/FileJournal.cc: 1011: FAILED assert(seq >= last_committed_seq)
Sage Weil
11:22 AM CephFS Bug #1787 (Closed): mds: laggy oneshot replays pollute mdsmap
... Sage Weil
10:53 AM Bug #1759: mds/client: truncate size overflow, fails with EINVAL
I lost my setup over the weekend, so I'm not going to be able to try the wip-truncate branch on the deployment to see... Sam Lang

12/03/2011

03:11 PM Feature #1784 (Duplicate): osd: redo pgls api
include locators
use hobject_t as iterator (and hopefully make the objecter split/merge coping logic less ugly in th...
Sage Weil
03:09 PM Feature #1783 (Resolved): osd: scrub incrementally across hash range using MOSDPGScan
Current scrub will not scale to large PGs. Sage Weil
01:01 AM CephFS Bug #1047: mds: crash on anchor table query
Attached a log of a full run up to the crash. MDS tries to recover from some problem, replays and crashes. Amon Ott

12/02/2011

11:35 PM Revision 4a0b00a0 (ceph): mon: stub perfcounters for monitor, cluster
The 'mon' perfcounter is for the local daemon and is always registered.
The 'cluster' perfcounter is for cluster sta...
Sage Weil
11:27 PM Revision 6dd81485 (ceph): osd: rename {take -> requeue}_object_waiters
It calls osd->requeue_ops(), so make naming more consistent and avoid
confusing people like me.
Signed-off-by: Sage ...
Sage Weil
11:27 PM Revision 8bbe576c (ceph): osd: safely requeue waiting_for_ondisk waiters on_role_change
This could conceivably cause the reply ordering mismatch seen in bug
#1490. Not sure why we didn't also fix this cal...
Sage Weil
09:38 PM Revision c8831004 (ceph): rados.py: add list_pools method
Signed-off-by: Eric Chen <Eric_YH_Chen@wistron.com>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin
08:06 PM Revision 6b4b6595 (ceph): Merge branch 'stable'
Sage Weil
07:28 PM Revision 06228716 (ceph): Doc: add a conceptual overview of the peering process
Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com> Mark Kampe
07:19 PM Revision c45a8491 (ceph): mds: remove obsolete doc
Sage Weil
06:52 PM Bug #1778: Error after installing an iso-image via qemu / rbd-image
Hi Oliver,
With snapshot=on data is never saved to the backing device - the original file is not modified unless y...
Josh Durgin
05:31 AM Bug #1778: Error after installing an iso-image via qemu / rbd-image
Well Josh,
attached you will find a crash, qemu-system... started without "-daemonize" to see what's going on ;-)
...
Oliver Francke
04:46 AM Bug #1778: Error after installing an iso-image via qemu / rbd-image
Hi Josh,
I have just made a session with savevm/loadvm, once without/with the snapshot-option, now with qemu-1.0. ...
Oliver Francke
05:58 PM Revision 0c183ec7 (ceph): crush: ignore forcefed input that doesn't exist
This might happen if, e.g., the file_layout specifies an osd that later
is removed from the cluster entirely. Just i...
Sage Weil
05:47 PM Revision faf5ce62 (ceph): Revert "CrushWrapper: ignore forcefeed if it does not exist"
This reverts commit 6fbab6da6942c238d40a6b4f1680a7e6da463289.
This fails a unit test.
And I change my mind.. I thin...
Sage Weil
05:01 PM Revision 321ecdab (ceph): v0.39
Sage Weil
05:00 PM Revision 75aff023 (ceph): OSDMap: build_simple_from_conf pg_num should not be 0 with one osd
Previously, pg_num would end up set to 0 if osd.0 is the only osd.
Signed-off-by: Samuel Just <samuel.just@dreamhost...
Samuel Just
03:51 PM Bug #1759: mds/client: truncate size overflow, fails with EINVAL
Sorry - haven't had a chance yet. I'll try it on Monday. Sam Lang
11:50 AM Bug #1759: mds/client: truncate size overflow, fails with EINVAL
Sam, did you get a chance to try this? Sage Weil
03:43 PM Bug #1490: cfuse assert failure: assert(ob->last_commit_tid < tid)
If we're lucky this was caused by taking waiters improperly, which Sage fixed in commit:8bbe576cab9ecdbfea939ad3d7866... Greg Farnum
03:40 PM Feature #1782: mon: dump key cluster stats via perfcounter
commit:4a0b00a0f29a87965925e0b44c997bece96b9936 stubs this out. just need to populate the perfcounter with the relev... Sage Weil
02:20 PM Feature #1782 (Resolved): mon: dump key cluster stats via perfcounter
This may be a minor abuse of the perfcounter intent, but it lets us get cluster stats using a common mechanism (via c... Sage Weil
03:22 PM Feature #390 (In Progress): Implement bdrv_snapshot_goto (Rollback), bdrv_snapshot_delete
Have some functions, trying to get a setup to test them with. Greg Farnum
01:54 PM Feature #1082 (Rejected): obsync: swift support
dho guys are doing this. Sage Weil
01:27 PM Feature #1781 (Resolved): qa: readwrite and roundtrip rgw tests in qa suite
Sage Weil
01:01 PM rgw Bug #1780 (Resolved): swift: auth response should return X-Auth-Token instead of X-Storage-Token
Yehuda Sadeh
11:56 AM Bug #1750 (Resolved): xattr errors silently ignored, cause trouble later
Sage Weil
11:54 AM Bug #1757 (Closed): oi disagrees with stat, or error code on stat
Sage Weil
11:52 AM Bug #1679 (Can't reproduce): assertion failure is_replica()
and old code; pending new code. Sage Weil
11:52 AM Bug #1688 (Won't Fix): Benjamin: pg stuck in scrub
old code. Sage Weil
11:50 AM Bug #1689 (Can't reproduce): osd: segfault in recover_primary
going to ignore this and see how the new backfill code fares. Sage Weil
11:48 AM CephFS Bug #1775 (Need More Info): mds startup: _replay journaler got error -22, aborting, possible regr...
Without logs, it's hard to say, but it looks like something caused the OSD to drop a write (or series of writes). No... Sage Weil
11:46 AM Bug #1617 (Won't Fix): pgs stuck down and peering with only one osd down and out
the new code will have an explicit 'incomplete' state when peering fails, instead of being 'stuck'. let's ignore thi... Sage Weil
09:44 AM CephFS Bug #1047 (Need More Info): mds: crash on anchor table query
Amon Ott just hit this one. Sage Weil
04:36 AM Revision 2f5bd5f7 (ceph): objecter: initialize global_op_flags to zero
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
12:13 AM Revision 813523a6 (ceph): Doc: delete gratuitous index.html
It was not an index, and seems to contain recommendations
for system configuration. I have renamed it to confusing.t...
Mark Kampe
12:12 AM Revision 48165af5 (ceph): Doc: complete reversion of architecture.rst
(abandon in progress improvements until everything works)
Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com>
Mark Kampe
12:12 AM Revision 3c7a82a6 (ceph): Doc: deleted gratuitous PlanningImplementation.html,
which was a copy of PlanningImplementation.txt
(and not html at all).
restored previous index.rst, which was overwri...
Mark Kampe
12:11 AM Revision fdf3f7bd (ceph): Doc: Restore the previous version of architecture.rst
it was accidentally overwritten with a version of the product
had a somewhat different audience/focus and a few sphin...
Mark Kampe
12:07 AM Revision 4cfe0815 (ceph): doc: change state model from .svg to .png
Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com> Mark Kampe

12/01/2011

10:41 PM Revision 1bbf9ae6 (ceph): fixed ubuntu version typo
Steve MacGregor
10:20 PM Revision 6fbab6da (ceph): CrushWrapper: ignore forcefeed if it does not exist
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
08:38 PM Revision 363ebb6c (ceph): librbd: report an error if rbd header does not match
This will fail on future incompatible versions of the header format.
Signed-off-by: Josh Durgin <josh.durgin@dreamho...
Josh Durgin
07:15 PM Revision cce67171 (ceph): Merge branch 'wip_local_reads'
Greg Farnum
07:15 PM Revision d4aef202 (ceph): hadoop: apache license.
We haven't made explicit that the Hadoop Java code is under the Apache
License. Do so (with permission from the other...
Greg Farnum
05:40 PM Messengers Bug #1747 (Need More Info): msgr: osd connection originates from wrong port
The blank address isn't a problem; it's due to the in_hbmsgr not being bound (deliberately). Unfortunately I've been ... Greg Farnum
05:17 PM Revision 348c71c4 (ceph): mds: fix blocking in standby replay thread
We need to hold mylock before waiting on the cond or else we get
./common/Cond.h: In function 'int Cond::Wait(Mutex&...
Sage Weil
05:17 PM Revision f6ee3699 (ceph): global: make daemon banner print explicit
This eliminates some flags and avoids annoying cases where the banner is
printed but we don't want to see it.
Signed...
Sage Weil
04:19 PM Revision 5828009e (ceph): mds: fix usage text
Filename is not optional.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
01:16 PM Bug #1778: Error after installing an iso-image via qemu / rbd-image
There's certainly a difference with the snapshot parameter - it doesn't store anything in the rbd image unless you us... Josh Durgin
12:09 PM Bug #1778: Error after installing an iso-image via qemu / rbd-image
Hi Josh,
at least my experience showed a different behaviour: no reliable snapshots and even crashes of qemu-syste...
Oliver Francke
10:54 AM Bug #1778: Error after installing an iso-image via qemu / rbd-image
You don't need any special qemu options to use snapshots - the snapshot option is confusingly named. The qemu 'snapsh... Josh Durgin
09:30 AM Bug #1778 (Resolved): Error after installing an iso-image via qemu / rbd-image
Hi *,
we are currently running:
ceph version 0.38 (commit:b600ec2ac7c0f2e508720f8e8bb87c3db15509b9) fro...
Oliver Francke
12:10 PM CephFS Bug #1775: mds startup: _replay journaler got error -22, aborting, possible regression?
stick a
continue;
after the set_read_pos() call to avoid the second crash.
Sage Weil
08:36 AM CephFS Bug #1775: mds startup: _replay journaler got error -22, aborting, possible regression?
No, I didn't have osd logging enabled; I'll provide you with the journal in a few minutes. Szymon Szypulski
08:26 AM CephFS Bug #1775: mds startup: _replay journaler got error -22, aborting, possible regression?
Can you dump the mds journal so we can get a closer look at the corruption? Something like
ceph-mds -i foo --dum...
Sage Weil
12:24 AM CephFS Bug #1775 (Resolved): mds startup: _replay journaler got error -22, aborting, possible regression?
ubuntu natty, kernel 3.2-rc2, ceph 0.38 (stable from git) with patch from #1756 and workaround for #1757
setup
s1...
Szymon Szypulski
10:13 AM rgw Bug #1779 (Resolved): rgw: swift auth returns wrong error code when unexisting user is given
returns 404 instead of 403 Yehuda Sadeh
09:12 AM rgw Bug #1777 (Resolved): rgw: user info modification is not atomic
e.g., adding keys, etc.
I think it's more important to identify cases where operations left system in an inconsist...
Yehuda Sadeh
09:05 AM rgw Feature #1776 (Resolved): rgw: swift auth prefix should be configurable (and optional)
Yehuda Sadeh
01:07 AM Revision 50c4b312 (ceph): Handle interactive-on-error also when error is from contextmanager exit.
Closes: http://tracker.newdream.net/issues/1745 Tommi Virtanen

11/30/2011

07:21 PM CephFS Bug #1774 (Resolved): client: files become inaccessible in large directories (with snapshots?)
Taking snapshots of certain directories within ceph that hold backups of root filesystems of my openmoko phone causes... Alexandre Oliva
05:57 PM Revision 353ee000 (ceph): mds: adjust flock lock state on export
Looks like this was missed when flocklock was added. Did a quick grep and
it doesn't look like it is missing anywher...
Sage Weil
05:49 PM Feature #1773 (Resolved): rbd: class interface for header interaction
This will include:
* create(size, order, features)
* get_info(image)
* get_snapc
* snap_add
* later snap_add...
Josh Durgin
05:43 PM Feature #1772 (Resolved): rbd: define new on-disk header format
This should include several new things:
* CompatSet
* read-only flag
* parent_{pool, image_id, snap_id}
* list<...
Josh Durgin
05:28 PM Bug #1771 (Resolved): rbd: delete snapshots when image is deleted
Currently the snapshots are left around with no way to access them. Josh Durgin
05:23 PM CephFS Bug #1770 (Can't reproduce): directory nonexistent on kernel_untar_build.sh
... Sage Weil
05:18 PM CephFS Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
the tasks were in nightly_coverage_2011-11-30-a
3433: collection:basic clusters:fixed-3.yaml tasks:kclient_workuni...
Sage Weil
05:13 PM CephFS Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
Happened twice today:... Sage Weil
05:08 PM Feature #1745 (Closed): teuthology: make interactive-on-error stop further cleanup
... Anonymous
05:06 PM Bug #1690 (Can't reproduce): osd re-created from scratch will crash on start-up
Sage Weil
03:19 PM CephFS Bug #1753 (Won't Fix): ceph copy raw images from qemu incorrectly
Unfortunately, right now making Ceph report sparse files correctly would be prohibitively expensive. It can be done, ... Greg Farnum
02:57 PM CephFS Bug #1753: ceph copy raw images from qemu incorrectly
To create the sparse file qemu-img just calls ftruncate. It does nothing fs-specific, so this can be replicated with ... Josh Durgin
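As an illustration of the point about ftruncate (this sketch is not part of the original comment, and the helper name and file name are hypothetical), a sparse file can be created from Python on a POSIX system by extending an empty file to the desired length without writing any data. The logical size reported by stat is then the full length, while the number of allocated blocks depends on the underlying filesystem.

```python
import os
import tempfile


def make_sparse_file(path, size):
    """Create a file whose logical size is `size` bytes without writing data.

    This mirrors the ftruncate-only approach described in the comment above;
    whether the result is actually stored sparsely is up to the filesystem.
    """
    with open(path, "wb") as f:
        os.ftruncate(f.fileno(), size)
    return os.stat(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # "disk.raw" is a made-up name used only for this example.
        st = make_sparse_file(os.path.join(d, "disk.raw"), 100 * 1024)
        print("logical size (bytes):", st.st_size)
        # st_blocks is counted in 512-byte units on POSIX systems.
        print("allocated (bytes):", st.st_blocks * 512)
```

Comparing the two printed values on a local filesystem and on a Ceph mount is one way to see the reporting difference discussed in this issue.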
11:10 AM CephFS Bug #1753: ceph copy raw images from qemu incorrectly
The file copy took 3 minutes. That is OK for a 3 GB file but not for a 100 KB file. max mikheev
09:43 AM CephFS Bug #1753: ceph copy raw images from qemu incorrectly
I'm a little confused here. Ceph has never reported only the used space for a file; doing so is prohibitively expensi... Greg Farnum
02:20 PM Messengers Bug #1747 (In Progress): msgr: osd connection originates from wrong port
The problem here is somewhere on osd.2 — osd.1 is using the address that osd.2 is providing, and you can see that osd... Greg Farnum
01:17 PM CephFS Bug #1756 (Resolved): mds crash right after successful recovery
Sage Weil
11:28 AM Linux kernel client Bug #1769 (New): osd_client: susceptibility to low memory deadlocks
We could be trying to flush the cache in order to free up memory, and find ourselves unable to allocate a ceph_osd or... Anonymous
11:21 AM Linux kernel client Cleanup #1768 (Closed): osd_client: gratuitous ceph_monc_request_next_osdmap calls
kick_requests() is called from within a loop that iterates through multiple OSD map updates ... which means that it m... Anonymous
11:15 AM Linux kernel client Bug #1767 (Resolved): osd_client: send_request() cannot fail
The static __send_request() routine is sure to succeed in queuing its request for the specified osd client, yet ceph_... Anonymous
11:12 AM Linux kernel client Bug #1766 (New): mon_client: sends request before authentication
The passed request is sent unconditionally, whether or not we have finished authenticating.
If we have not yet com...
Anonymous
10:11 AM Bug #1765 (Resolved): osd: 'call' op can return data even if op is modifying
Not sure if it'd actually return data, but in any case the api is ambiguous. If it does return data it breaks idempot... Yehuda Sadeh
10:07 AM Feature #1764 (Rejected): osd classes: add an optional source object
This can be very useful. Source object should have the same locator as the target object. Similar to clone-range. An ... Yehuda Sadeh
10:03 AM Bug #1490: cfuse assert failure: assert(ob->last_commit_tid < tid)
This didn't turn out to have anything to do with #1727, did it? Greg Farnum
09:36 AM Linux kernel client Bug #1762: i_lock vs s_cap_lock vs inodes_with_caps_lock lock ordering
Argh, this is a real pain. igrab() requires i_lock, which we use extensively to protect complicated changes. In the... Sage Weil
09:19 AM Linux kernel client Bug #1762 (Resolved): i_lock vs s_cap_lock vs inodes_with_caps_lock lock ordering
Reported by Amon Ott on ML.... Sage Weil
09:25 AM Bug #1763 (Resolved): qa: need to run qa tests on kernel with lockdep enabled
We need to catch lock ordering regressions like #1762 in our nightly runs. Sage Weil
02:14 AM Revision 2443878b (ceph): Objecter: loop the right direction when searching for local replicas
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
12:35 AM Revision 1c696b65 (ceph): doc: Add peering state diagram
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
12:20 AM Revision 2918b501 (ceph): Move kclient multiple_rsync workunit to stress collection.
Bug #1760 keeps being triggered by this. Josh Durgin

11/29/2011

11:36 PM Revision 30ede648 (ceph): Makefile: ipaddr.h, pick_address.h
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:05 PM rbd Cleanup #1761: krbd: make block/segment naming consistent
Segment refers to a partial range, a part of an object, so I think we should keep it in this context. So object shoul... Yehuda Sadeh
09:15 PM rbd Cleanup #1761 (Resolved): krbd: make block/segment naming consistent
pick consistent term for an object (segment or object, but not block) and use throughout. Sage Weil
09:31 PM Revision 77a62fdc (ceph): Makefile: add missing uuid.h to tarball
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:30 PM Revision ebb585d9 (ceph): Objecter: fix local reads in recalc_op_target
We want to use the actual OSD, not the index into the array!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum
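A hypothetical sketch of the class of bug this commit message describes: when scanning the list of OSDs that hold an object, the value stored in the list is the OSD id, while the loop index is only a position in that list. The function and argument names below are invented for illustration and are not taken from the Objecter code, which is C++.

```python
def pick_local_replica(acting, is_local):
    """Return the id of the first local OSD in `acting`, or None.

    `acting` is a list of OSD ids and `is_local` is a predicate on an OSD id.
    Both are assumptions made for this sketch.
    """
    for i, osd in enumerate(acting):
        if is_local(osd):
            # Return the OSD id itself. Returning `i` here would hand back
            # the position in the list, which is the mistake described above.
            return osd
    return None


if __name__ == "__main__":
    acting = [7, 3, 9]
    print(pick_local_replica(acting, lambda osd: osd == 9))
```

With the acting list `[7, 3, 9]`, the local OSD is 9, whereas its index in the list is 2; confusing the two would direct the read to an unrelated OSD.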
05:27 PM Bug #1759: mds/client: truncate size overflow, fails with EINVAL
Actually, maybe you could run with the wip-truncate branch on the mds and see if you trigger a failed assertion on the MDS... Sage Weil
05:19 PM Bug #1759: mds/client: truncate size overflow, fails with EINVAL
Do you by chance have the log preceding the first crash?
Working around this is probably a matter of patching wit...
Sage Weil
11:28 AM Bug #1759 (Resolved): mds/client: truncate size overflow, fails with EINVAL
My version of ceph is a minor variant of 0.38, running with ext4, and ceph-fuse. It looks like my fs has gotten corr... Sam Lang
05:07 PM CephFS Cleanup #814: hadoop: refactor hadoop shim in terms of java libceph bindings
http://www.debian.org/doc/packaging-manuals/java-policy/x105.html Sage Weil
04:28 PM Revision 8788a404 (ceph): osd: subscribe to next map if flagged FULL
This ensures the osd finds out when we become un-full in a timely manner.
Fixes: #1755
Signed-off-by: Sage Weil <sag...
Sage Weil
04:26 PM CephFS Bug #1760 (Resolved): multiple_rsync workunit cannot remove non-empty directory intermittently
This has occurred in half of the regression runs since 11/24: ... Josh Durgin
10:52 AM Bug #1757: oi disagrees with stat, or error code on stat
As we discussed in #ceph, I've updated the kernel to 3.2-rc2 and patched the osd with this workaround http://fpaste.org/PKwW/, n... Szymon Szypulski
08:25 AM Bug #1757: oi disagrees with stat, or error code on stat
The fix for #1612 is upstream kernel commit:ed3ee9f44ba55eb6acfbfc8caa881e0253710d2a. Does your kernel on the osds h... Sage Weil
01:52 AM Bug #1757 (Closed): oi disagrees with stat, or error code on stat
I've similar bugs #1334, #1473 which should be solved by #1612, but it doesn't help.
Ubuntu natty, ceph 0.38 with ...
Szymon Szypulski
09:05 AM Bug #1758 (Can't reproduce): OSD segfault in SimpleMessenger::send_message
in the 11/29 nightlies, cfuse_workunit_misc (3335) the osd on sepia5 seg-faulted.
The end of the osd log is:
2011-1...
Anonymous
08:59 AM Bug #1755 (Resolved): OSD: subscribe to map updates on FULL flag
commit:8788a404ae4a10cd10ec8048f0b32d473640a607 Sage Weil
08:25 AM Bug #1612: osd/PG.cc: 3839: FAILED assert(missing[oid].need <= v)
upstream kernel commit:ed3ee9f44ba55eb6acfbfc8caa881e0253710d2a Sage Weil
05:39 AM Revision c2889fef (ceph): mds: encode truncate_pending in inode
Otherwise we don't actually journal this value, and we get confused when
we replay a start_truncate and try to restar...
Sage Weil

11/28/2011

10:11 PM CephFS Bug #1756: mds crash right after successful recovery
This should let you restart your mds:... Sage Weil
09:28 AM CephFS Bug #1756 (Resolved): mds crash right after successful recovery
Ubuntu Natty, ceph 0.38, kernel 2.6.38-12-server, 2x separate mds daemons crashed in the middle of the night
* sho...
Szymon Szypulski
08:52 PM Revision 98e0a6fd (ceph): uclient: remove filer_flags and use Objecter::global_op_flags instead
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
08:52 PM Revision da2e0c3c (ceph): Objecter: add a new global_op_flags that is passed to every Op construc...
We can use this for a global use of LOCALIZE_READS (and are about
to do so!).
Signed-off-by: Greg Farnum <gregory.fa...
Greg Farnum
08:30 PM Revision 51385930 (ceph): Objecter: remove unused variable in op_submit
These flags are probably relics from when the function got split;
they belong in send_op now.
Signed-off-by: Greg Fa...
Greg Farnum
06:32 PM Revision 4974a9c2 (ceph): uclient: remove useless if-else based on snapid
These are the same command anyway!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum
05:01 PM Revision cef16732 (ceph): debian init: Do not stop or start daemons when installing or upgrading
Signed-off-by: Wido den Hollander <wido@widodh.nl> Wido den Hollander
03:49 PM CephFS Bug #1753: ceph copy raw images from qemu incorrectly
This is using the ceph filesystem, not rbd. Josh Durgin
11:12 AM CephFS Bug #1749: nonexistent directory in kclient_workunit_kernel_untar_build
This could have the same (unknown) root cause as #1741. Anonymous
09:46 AM Feature #1736 (Resolved): collectd: hacky script to generate types.db from perfcounter schema
Sage Weil
09:26 AM Bug #1755 (Resolved): OSD: subscribe to map updates on FULL flag
When the OSDs get a full flag they stop most of their activity, which shuts down the usual map propagation methods. T... Greg Farnum
09:14 AM Bug #1631: osd: failed assert(repop_queue.front() == repop)
Ok, pretty sure this is related to the reconnect. We need to put together a test that artificially triggers messenge... Sage Weil
12:11 AM Revision ce657227 (ceph): mon: search for local ip during mkfs
If an address isn't explicitly specified during mkfs, look for an unnamed
monitor in the (generated) monmap and see i...
Sage Weil
12:11 AM Revision 61b9db3a (ceph): pick_address: implement have_local_addr()
Check for a local ip from within a list of addresses.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:04 AM Revision 84b00597 (ceph): monclient: name nameless monitors noname-<foo>
This makes them easy to pick out as unnamed.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil

11/27/2011

10:50 PM Revision 7a453402 (ceph): pick_address: whitespace
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:44 PM Bug #1751: Copy in CEPH too slow
rbd only. There is no plan yet for reflink(2) in the ceph filesystem. Sage Weil
02:48 PM Bug #1751: Copy in CEPH too slow
Is clone for rbd only, or for files too?
Copy of files is slow too.
max mikheev
02:45 PM Bug #1751 (Duplicate): Copy in CEPH too slow
A 'clone' operation that does copy-on-write is coming in the next couple weeks. See #988 Sage Weil
05:39 PM Feature #1754 (Resolved): qa: run other suites nightly as well
stick suite name in mail subject?
run all suites nightly (not just regression)
Sage Weil
04:32 PM CephFS Bug #1746 (Resolved): PerfCounters::set segfault
Sage Weil
04:32 PM Bug #1727 (Resolved): osd: failed assert(pending_ops > 0) in dequeue_op
Sage Weil
04:30 PM Feature #1647 (Resolved): mon: robust bootstrap
Sage Weil

11/25/2011

02:08 PM CephFS Bug #1753 (Won't Fix): ceph copy raw images from qemu incorrectly
Hi,
Ceph cannot correctly handle raw images from qemu:
oneadmin@s2-8core:~/OpenNebula/var/images/tm...
max mikheev

11/24/2011

01:02 PM CephFS Bug #1752 (Can't reproduce): ceph-fuse isn't releasing caps without flushing data?
Xiaofei Du reported on the mailing list that running an "ls" on a directory with multiple writers takes a while (much... Greg Farnum
10:16 AM Bug #1751 (Duplicate): Copy in CEPH too slow
Hi,
The copy operations for files and for rbd images are too slow. The ceph is a copy on write system I think the c...
max mikheev

11/23/2011

11:56 PM Revision 30def38d (ceph): corrected variable (con) to be consistent with prior examples (cluster)
Signed-off-by: Mark Kampe <mark.kampe@dreamhost.com> Mark Kampe
10:07 PM Revision 934e1e52 (ceph): ReplicatedPG: Also count overlaps for snapsets on snapdirs
Previously, the overlaps for snapdirs would not be included in
cstat causing the computed total to be incorrect.
Sig...
Samuel Just
10:07 PM Revision 97d82ed9 (ceph): ReplicatedPG: Account for clone space usage in make_writeable
Previously, we accounted for clone space usage inconsistently in
write_update_size_and_usage etc when walking through...
Samuel Just
05:09 PM Bug #1631: osd: failed assert(repop_queue.front() == repop)
This happened again with the same workload in /var/lib/teuthworker/archive/nightly_coverage_2011-11-23-b/3034/remote/... Josh Durgin
05:06 PM Bug #1530: osd crash during build_inc_scrub_map
A new crash during scrub from /var/lib/teuthworker/archive/nightly_coverage_2011-11-23-b/3051/remote/ubuntu@sepia71.c... Josh Durgin
05:02 PM Bug #1676 (Resolved): stats mismatch during snaps workunit
97d82ed950b26cfaef4267ee44edd9ad927fb828 and 934e1e52514b6036c91c1c7db1c8b6727ac8c6d8 should take care of the size di... Samuel Just
09:41 AM Bug #1676: stats mismatch during snaps workunit
I do not know if this is likely to be related, but in the 11/23a nightlies, 3027 (rgw_s3tests)
1 Aborts found in 3...
Anonymous
05:00 PM Bug #1750 (Closed): xattr errors silently ignored, cause trouble later
Comment
I do not know if this is likely to be related, but in the 11/23a nightlies, 3027 (rgw_s3tests)
1 Aborts f...
Samuel Just
02:45 PM Revision 32a68378 (ceph): Merge branch 'wip-mon'
Sage Weil
02:44 PM Revision ad13d0b7 (ceph): ceph: fix shutdown race
Shut down MonClient before messenger, to avoid race with MonClient::tick()
and MonClient::shutdown().
Fixes
#0 __l...
Sage Weil
01:33 PM Bug #1744: teuthology: race with daemon shutdown?
Josh saw similar, it seems the ctx.daemons data structure loses entries / they never get added / something. So far, r... Anonymous
09:27 AM CephFS Bug #1749 (Can't reproduce): nonexistent directory in kclient_workunit_kernel_untar_build
In the 11/23a nightlies, 3003, there may have been
a transient directory access error:
... lots of stuff works
2...
Anonymous
09:11 AM CephFS Bug #1748 (Can't reproduce): mds segfault CDir::project_fnode
In the 11/23a nightlies, 2995/remote/ubuntu@sepia75.ceph.dreamhost.com/log/mds.0.log.gz
2011-11-22 23:59:14.857453 ...
Anonymous
07:16 AM Feature #1487 (Resolved): config: {cluster,public}_subnets
Sage Weil
04:52 AM Revision 414caa7d (ceph): common/pick_address: Fix IP address stringification.
Different sockaddr_* have the actual address (sin_addr, sin6_addr)
at different offsets, and sockaddr->sa_data just i...
Tommi Virtanen
12:28 AM Revision 9870e2f7 (ceph): mon: pick_addresses before common_init_finish
We can't modify g_conf->public_addr after that.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:22 AM Revision 036ad4c7 (ceph): mon: set default port if not specified...
...when looking for self in monmap during mkfs.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
12:04 AM Revision 0045c901 (ceph): monmap: assign rank by sorting addr, not name
This allows monitors to bootstrap knowing peer addrs but not their names,
as when we specify mon_host.
Signed-off-by...
Sage Weil
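A minimal Python sketch of the idea in the commit above: a monitor's rank is its position in the sorted list of addresses, so peers that know only each other's addresses can still agree on ranks. The function name `assign_ranks`, the IPv4 `host:port` string format, and the numeric sort key are assumptions for illustration; this log does not show how Ceph's C++ code actually compares addresses.

```python
import ipaddress


def assign_ranks(mon_addrs):
    """Return {addr: rank}, ranking monitors by sorted address.

    mon_addrs is a list of hypothetical IPv4 "host:port" strings.
    """
    def sort_key(addr):
        host, port = addr.rsplit(":", 1)
        # Compare numerically so 10.0.0.2 sorts before 10.0.0.10.
        return (int(ipaddress.ip_address(host)), int(port))

    ordered = sorted(mon_addrs, key=sort_key)
    return {addr: rank for rank, addr in enumerate(ordered)}
```

Because the result depends only on the set of addresses, every monitor that sees the same addresses computes the same ranks, whatever names the monitors have been given.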
12:04 AM Revision 36978a63 (ceph): mon: calculate rank by addr, not name
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil

11/22/2011

11:06 PM Revision ebe5fc60 (ceph): obsync: tear out rgw
Yehuda Sadeh
10:53 PM Revision 3a20b425 (ceph): mon: name self in monmap if --public-addr specified during mkfs
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:40 PM Messengers Bug #1747 (Resolved): msgr: osd connection originates from wrong port
osd.2 sends a couple messages to osd.1:... Sage Weil
06:31 PM Revision a859763b (ceph): rgw: don't remove tail of lru if that's what we touch
Yehuda Sadeh
06:09 PM Revision aeeeade6 (ceph): mon: mark down all connections when rank changes
The election and some other stuff depend on msg->get_source().num() to get
the peer rank, and that is part of the con...
Sage Weil
06:08 PM Revision bed3c472 (ceph): mon: handle rank change in bootstrap
The rank can change either because we probe and get a new monmap, or
because we get one via paxos. Move the checks t...
Sage Weil
05:53 PM Revision 8b464093 (ceph): mon: pick an address when joining an existing cluster
If we are joining an existing cluster, we can pick whatever address we
want (e.g., one specified by public_addr or pu...
Sage Weil
05:52 PM Revision 5ba356b3 (ceph): mon: remove unused myaddr
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:52 PM Revision 0c9724d6 (ceph): mon: simplify suicide when removed from map
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
03:02 PM rgw Feature #1697 (Resolved): s3-tests: test bucket headers
Fixed, added the following tests:
s3tests.functional.test_headers.test_bucket_put_bad_canned_acl
s3tests.function...
Yehuda Sadeh
10:33 AM rgw Bug #1719 (Resolved): rgw: crash in ObjectCache::touch_lru
should be fixed by commit:a859763b1cba844d0d56b861a372e5f63f87c607. Yehuda Sadeh
05:58 AM Revision 24ee09b0 (ceph): Revert "more logs (yuck) for #1682"
This reverts commit ea00114f08440563bce8e27ae2cd887bbc85aba5. Sage Weil
01:46 AM Revision eb8d91fe (ceph): PG: it's not necessary to call build_inc_scrub_map in build_scrub_map
Because we have called osr.flush(), it's safe to tag map.valid_through
as last_update. We will still have to catch ...
Samuel Just
12:17 AM Revision 0f4b59a4 (ceph): Merge remote branch 'gh/subnet'
Sage Weil
12:00 AM Revision c651c88e (ceph): Properly handle case where first error is inside a context manager __ex...
Closes: http://tracker.newdream.net/issues/1743 Tommi Virtanen
12:00 AM Revision fab1e55e (ceph): Merge remote branch 'gh/wip-mon'
Sage Weil

11/21/2011

10:27 PM Revision eec61b48 (ceph): common/ipaddr: Add utility function to parse ip/cidr style networks.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com> Tommi Virtanen
10:27 PM Revision 0477f238 (ceph): common/pickaddr: Pick cluster_addr/public_addr based on *_network.
Tommi Virtanen
10:27 PM Revision c066e926 (ceph): mds, osd, synclient: Pick cluster_addr/public_addr based on *_network.
Instead of specifying an IP address in ceph.conf like
[global]
cluster_addr = 10.1.2.3
you can now avoid the node...
Tommi Virtanen
10:27 PM Revision 0f748d4c (ceph): common/ipaddr: Find a configured IP address in given subnet.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com> Tommi Virtanen
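The subnet-based address selection described in the commits above can be sketched in Python with the standard `ipaddress` module. This is an illustrative sketch only: the function name `find_ip_in_subnet` and the sample addresses are hypothetical, and the real implementation is C++ code in common/ipaddr that this log does not show.

```python
import ipaddress


def find_ip_in_subnet(local_addrs, network):
    """Return the first locally configured address inside `network`.

    local_addrs: iterable of address strings configured on this host.
    network: an ip/cidr string such as "10.1.0.0/16".
    Returns None when no configured address falls in the subnet.
    """
    # strict=False accepts a host address with a prefix, e.g. 10.1.2.3/16.
    net = ipaddress.ip_network(network, strict=False)
    for addr in local_addrs:
        ip = ipaddress.ip_address(addr)
        if ip.version == net.version and ip in net:
            return addr
    return None
```

With something like this, one shared config can name a network, and each node resolves its own address from whatever interfaces it has.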
10:07 PM CephFS Bug #1549 (In Progress): mds: zeroed root CDir* vtable in scatter_writebehind_finish
Sage Weil
09:56 PM Bug #1490: cfuse assert failure: assert(ob->last_commit_tid < tid)
happened again on /var/lib/teuthworker/archive/nightly_coverage_2011-11-21-b/2818
This may be the same root cause ...
Sage Weil
09:37 PM Revision 2bae3506 (ceph): osd: Remove unused variable.
Tommi Virtanen
09:37 PM Revision 0f9a0605 (ceph): common/str_list: Make unused return value void.
Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com> Tommi Virtanen
09:37 PM Revision 97464bca (ceph): msg: Move public_addr use outside ->bind()
Tommi Virtanen
09:28 PM Revision 3c8fec2d (ceph): osd: fix 'stop' command
Special case. We can't join the command_tp thread from itself.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
09:23 PM Revision b47347bd (ceph): osd: protect handle_osd_map requeueing with queue lock
pending_ops was protected by osd_lock, but it tracks something in the
queue, which has its own lock. Messy. Also, ...
Sage Weil
07:15 PM Revision 70dfe8e9 (ceph): osd: lock pg when requeuing requests
The op queue is shut down, so this is mostly safe, unless someone comes
through and does requeue_ops() from a callbac...
Sage Weil
06:33 PM Revision 811145f7 (ceph): paxosservice: tolerate _active() call when not active
This can happen when multiple C_Active events are queued, and the first
does a propose_pending() (moving us into upda...
Sage Weil
05:19 PM Revision 88963a18 (ceph): objecter: simplify map request check
We should request a missing/intervening map if it appears to exist.
Otherwise, skip it.
Signed-off-by: Sage Weil <sa...
Sage Weil
05:19 PM Revision cd2e523f (ceph): objecter: cancel tick event on shutdown
Hopefully this is the root cause for
2011-11-20 23:57:41.555292 7f75dd743780 ceph version 0.38-205-g3b53b72
(commit:...
Sage Weil
05:01 PM rgw Bug #1719: rgw: crash in ObjectCache::touch_lru
I think what happens here is that the entry that we touch happens to be the one that we dispose of (at the tail of th... Yehuda Sadeh
04:02 PM Bug #1743 (Closed): teuthology: not exiting with error when ceph-fuse shutdown fails
commit c651c88eacf9c3bbf1f037be3a5dc0425308c730
Author: Tommi Virtanen <tv@eagain.net>
Date: 2011-11-21 16:00:19 ...
Anonymous
03:42 PM Bug #1743: teuthology: not exiting with error when ceph-fuse shutdown fails
This reproduced it nicely:
diff --git a/teuthology/task/internal.py b/teuthology/task/internal.py
index 58e7f14...
Anonymous
03:57 PM Bug #1744: teuthology: race with daemon shutdown?
Tommi Virtanen wrote:
> Was this using any one of the following?
>
> teuthology/task/lost_unfound.py
> teutholog...
Sage Weil
03:33 PM Bug #1744: teuthology: race with daemon shutdown?
Was this using any one of the following?
teuthology/task/lost_unfound.py
teuthology/task/mon_recovery.py
teuthol...
Anonymous
02:57 PM Bug #1741: teuthology: failed to untar
The path mentioned above is incorrect. Run nightly_coverage_2011-11-18-2/2663 failed because of network failure.
T...
Anonymous
02:52 PM Bug #1741: teuthology: failed to untar
This is exactly what would happen if someone nuked the machine, or locking failed and someone else ran a faster test ... Anonymous
01:29 PM Bug #1727: osd: failed assert(pending_ops > 0) in dequeue_op
hopefully fixed by commit:b47347bd7c377037f7fbc199f0c88b447c9626d1 Sage Weil
08:59 AM Bug #1727: osd: failed assert(pending_ops > 0) in dequeue_op
Happened again in the 11/21 nightlies - 2791, sepia33 Anonymous
09:53 AM Bug #1742 (Rejected): qa: s3-tests failed 100-continue test on sepia
This was due to an old entry in /etc/apt/sources.list - older versions of the apache packages were still used. The ch... Josh Durgin
09:43 AM rbd Feature #1713: teuthology: qemu tasks, tests
Sorry, comment #2 was meant for another bug.
Anonymous
09:42 AM rbd Feature #1713: teuthology: qemu tasks, tests
This is in the plans after the new sepia hardware is in place; current sepia re-install is too slow & painful to dare... Anonymous
09:23 AM CephFS Bug #1746: PerfCounters::set segfault
i think this is objecter event teardown. see commit:cd2e523fba1d6cf8d15e7a349ad700b744f24ecf Sage Weil
09:05 AM CephFS Bug #1746 (Resolved): PerfCounters::set segfault
In the 11/21 nightlies, while trying to run workunit/ffsb,
2779/remote/ubuntu@sepia57.ceph.dreamhost.com/log/mon.2.l...
Anonymous
08:57 AM Bug #1530: osd crash during build_inc_scrub_map
Both of the above described variants occurred in the 11/21 nightlies
(2775:sepia17, 2783:sepia81, 2805:sepia82)
Anonymous

11/20/2011

11:24 PM Revision ea00114f (ceph): more logs (yuck) for #1682
Sage Weil
10:26 PM Revision f6070282 (ceph): paxos: fix sharing of learned commits during collect/last
We can learn either an uncommitted or committed value during the
collect/last recovery phase. For the committed valu...
Sage Weil
09:18 PM Revision 3b53b722 (ceph): rgw: support alternative date formatting
being used by s3cmd Yehuda Sadeh
09:05 PM Feature #1745 (Closed): teuthology: make interactive-on-error stop further cleanup
It would be nice if a failure in cleanup would prevent further cleanup when interactive-on-error is true. For example... Sage Weil
09:03 PM Bug #1744 (Resolved): teuthology: race with daemon shutdown?
... Sage Weil
08:02 PM Bug #1743 (Closed): teuthology: not exiting with error when ceph-fuse shutdown fails
here's the log tail:... Sage Weil
03:23 PM CephFS Bug #1682: mds: segfault in CInode::authority
Hrm, this has me stumped.
The log leading up is...
Sage Weil
04:56 AM Revision 4b53288b (ceph): ceph_manager: %
Sage Weil
04:56 AM Revision 721c0e97 (ceph): nuke: don't specify full path
/tmp/cephtest/binary may have been removed; kill stray daemons by name
only. we really don't care about false positi...
Sage Weil
03:28 AM Revision dcab329b (ceph): fix conf thinko
'int' object has no attribute 'iteritems' Sage Weil

11/19/2011

10:30 PM Revision becfce35 (ceph): mon: share random osd map from update_from_paxos, not committed()
This will let us remove committed() entirely.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
10:30 PM Revision b521710f (ceph): mon: mdsmon: tick() from on_active() instead of committed()
Same effect, and avoids useless committed().
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
10:30 PM Revision 10fed791 (ceph): paxosservice: remove unused committed() callback
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:30 PM Revision 9aabd398 (ceph): paxosservice: consolidate _active and _commit
Use the same callback for when paxos goes active and for when it commits
something. The response in both cases is th...
Sage Weil
09:56 PM Revision 9920a168 (ceph): config: support --no-<foo> for bool options
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
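A small Python sketch of what a `--no-<foo>` convention for boolean options might look like. It is not Ceph's config parser, which is C++ and not shown in this log. The function name `parse_bool_flags` and the option name used in the usage note are assumptions chosen for illustration.

```python
def parse_bool_flags(argv, bool_options):
    """Split argv into parsed boolean options and remaining arguments.

    --<foo> sets option foo to True and --no-<foo> sets it to False.
    Dashes in the flag are mapped to underscores in the option name.
    bool_options is the set of option names known to be boolean.
    """
    values = {}
    rest = []
    for arg in argv:
        if arg.startswith("--no-") and arg[5:].replace("-", "_") in bool_options:
            values[arg[5:].replace("-", "_")] = False
        elif arg.startswith("--") and arg[2:].replace("-", "_") in bool_options:
            values[arg[2:].replace("-", "_")] = True
        else:
            # Not a known boolean flag: leave it for the other parsers.
            rest.append(arg)
    return values, rest
```

For example, `parse_bool_flags(["--no-log-to-stderr", "-c", "ceph.conf"], {"log_to_stderr"})` yields `log_to_stderr` set to `False` and passes the other arguments through untouched.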
09:56 PM Revision 1a468c7e (ceph): config: whitespace
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
09:56 PM Revision a08e7f12 (ceph): regression/basic/tasks/kclient_workunit_misc: turn on mds log
Hopefully will catch #1682 Sage Weil
09:45 PM Revision 13c98df9 (ceph): regression/basic/tasks/cfuse_dbench: turn up client debugging
Hopefully we'll hit #1737... Sage Weil
02:28 PM Bug #1732 (Can't reproduce): osdmap assert fail during rados bench
Sage Weil
02:03 PM Bug #1742 (Rejected): qa: s3-tests failed 100-continue test on sepia
/var/lib/teuthworker/archive/nightly_coverage_2011-11-18-2/2683
and the chef task _did_ run...
Sage Weil
01:59 PM Bug #1741 (Can't reproduce): teuthology: failed to untar
teuthology:/var/lib/teuthworker/archive/nightly_coverage_2011-11-18-2/2662... Sage Weil
01:54 PM CephFS Bug #1573 (Duplicate): mds crash during multiple_rsync workunit
Sage Weil
12:13 AM Revision cc5b5e17 (ceph): osdmon: set the maps-to-keep floor to be at least epoch 0
Looks like this conditional was just set backwards by mistake. There
have been a number of issues with OSDMap version...
Greg Farnum

11/18/2011

11:57 PM Revision 45cf89c1 (ceph): Revert "osd: simplify finalizing scrub on replica"
This reverts commit dd5087fabb2a743741a96ee4610379afa8431f68.
Calling osr.flush() is not quite enough since the onre...
Samuel Just
11:56 PM Revision 57ad8b2e (ceph): FileStore.cc: onreadable callbacks in OpSequencer order is enough
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
10:19 PM rbd Bug #1740: krbd: don't return head data when reading from a non-existent snapshot
The requests are made for the head version, since the removed snapid is not found when looking up the snapshot name i... Josh Durgin
08:58 PM rbd Bug #1740: krbd: don't return head data when reading from a non-existent snapshot
Hmm, what should they return? -ENXIO or -EIO or something? What is the OSD returning in this case?
Sage Weil
05:11 PM rbd Bug #1740 (Resolved): krbd: don't return head data when reading from a non-existent snapshot
If you have an rbd image mapped at a snapshot, and then delete the snapshot, any subsequent reads succeed and give yo... Josh Durgin
09:53 PM Revision 508f4f83 (ceph): Save summary after nuking machines.
This way you can tell when tests are entirely finished running. Josh Durgin
08:22 PM Revision 91cfdfea (ceph): Add an example overrides file for running regression tests.
Josh Durgin
06:21 PM Revision 7c8a7a89 (ceph): Move multimds tests to a new suite, 'experimental'.
This suite is for testing features that aren't expected to be stable yet. Josh Durgin
05:49 PM Revision 09c20c51 (ceph): objecter: trigger oncommit acks if the request returns an error code.
Many users only set oncommit acks, so if they get an error code
(which comes only as a CEPH_OSD_OP_ACK right now) the...
Greg Farnum
05:49 PM Revision dedf2c4a (ceph): osd: error responses should trigger all requested notifications.
There's no good reason I can find to limit error code responses to
the ACK.
Signed-off-by: Greg Farnum <gregory.farn...
Greg Farnum
05:49 PM Revision 9800faeb (ceph): paxos: do not create_pending if !active
This avoids a scenario like:
- _active()
- proposes value
- _commit()
- creates new pending, even though in upda...
Sage Weil
05:43 PM Revision fa587687 (ceph): Revert "mon: don't propose new state from update_from_paxos"
This reverts commit 66c628acc8be71a92e801179431e4b938b857b3d. Sage Weil
05:15 PM rgw Feature #1482 (Resolved): qa: swift-tests
testswift was added to teuthology. Yehuda Sadeh
05:14 PM rgw Feature #1664 (Resolved): rgw: pass swift tests
We pass most of the tests, other than a few which we don't intend to fix at this point (different enforced limits) an... Yehuda Sadeh
05:00 PM rgw Feature #1739 (Resolved): rgw: multipart upload should use manifest object
Yehuda Sadeh
04:39 PM RADOS Bug #1738 (Duplicate): bad crushmap behavior
./osdmaptool --test-map-pg 1.21 <attached osdmap>
pg 1.21 ends up mapped only to osd3 despite there being two othe...
Samuel Just
02:40 PM Bug #1530: osd crash during build_inc_scrub_map
Got a couple more of these today: teuthworker/archive/nightly_coverage_2011-11-18-2/2649/remote/ubuntu@sepia56.ceph.d... Josh Durgin
02:37 PM CephFS Bug #1682: mds: segfault in CInode::authority
Another crash in CInode::authority happened today, although with a different backtrace.
From teuthology:~teuthworker/arc...
Josh Durgin
02:35 PM CephFS Bug #1737 (Resolved): ceph-fuse crash in xlist::remove
From teuthology:~teuthworker/archive/nightly_coverage_2011-11-18-2/2645/remote/ubuntu@sepia13.ceph.dreamhost.com/log/... Josh Durgin
10:11 AM Bug #1351 (Resolved): rados bench should report errors
Fixed by commit:dedf2c4a066876bdab9a0b0154196194cefc1340. Greg Farnum
04:45 AM Revision 66c628ac (ceph): mon: don't propose new state from update_from_paxos
Proposing a new state from within update_from_paxos() confuses some callers,
like PaxosService::_active(). Instead, ...
Sage Weil
04:28 AM phprados Tasks #869 (Resolved): Update to new librados API
Ok, it took some time, but it's done.
v0.9.3 is updated to the librados2 API and wraps all the C functions into PHP.
Wido den Hollander
01:57 AM Revision 94100ad0 (ceph): Move collections into separate suites
For now, there are just two suites:
* regression - tests that should always pass
* stress - tests that have p...
Josh Durgin
01:26 AM Revision 42cecb5e (ceph): suite: put common config before facets
This lets you add tasks to the beginning of a run, like the chef task. Josh Durgin
01:16 AM Revision 044a88ce (ceph): suite: schedule a list of collections for running instead of a single s...
Josh Durgin
01:00 AM Revision d8fc1513 (ceph): Clean up C++isms.
Tommi Virtanen
12:55 AM Revision 6ae0f81e (ceph): rgw: if swift url is not set up, just use whatever client used
Yehuda Sadeh
12:53 AM Revision 23aae67a (ceph): testswift: fix config
Yehuda Sadeh
12:53 AM Revision 6236e7db (ceph): testswift: fix config
Yehuda Sadeh
12:49 AM Revision c5450948 (ceph): Add a task for easily running chef-solo on all the nodes.
Tommi Virtanen

11/17/2011

11:01 PM Revision ef5ca293 (ceph): fuse: fix readdir return code
Ignore ENOSPC generated by our own callback, as it is only used to
terminate the loop.
Broken by commit cd90061239a5...
Sage Weil
10:11 PM Revision d61ba644 (ceph): paxos: fix trimming when we skip over incrementals
Remove open-coded trimming of old states and use our method (that also
removes additional per-state files). Fixes ol...
Sage Weil
10:10 PM Revision 367ab142 (ceph): paxos: store stashed state _and_ incrementals
Paxos::share_state() may share a stashed state and incrementals that
follow; we need to store the same.
Signed-off-b...
Sage Weil
09:53 PM Revision 6bc9a544 (ceph): mon: elector: always start election via monitor
Don't go from active -> electing without passing (monitor) go.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:46 PM Revision 89f80412 (ceph): ceph_manager: fix logging
Sage Weil
09:23 PM Bug #1708 (Resolved): mon/PGMonitor.cc: 218: FAILED assert(paxos->get_version() + 1 == pending_in...
This latest variation should be fixed by commit:66c628acc8be71a92e801179431e4b938b857b3d. Thanks for the log! Sage Weil
05:18 PM Bug #1708: mon/PGMonitor.cc: 218: FAILED assert(paxos->get_version() + 1 == pending_inc.version)
Yes, I still get the problem with an updated master 6bc9a544b62bb21f6ee7ef51bfbe9111f7add9cb
I had monitor debuggi...
Josh Pieper
09:07 PM Revision f85f5dd7 (ceph): ceph: deep merge overrides, so e.g. log whitelists can be overridden
Josh Durgin
09:06 PM Revision a7632976 (ceph): misc: move deep_merge out of the MergeConfig class - it's generic
Josh Durgin
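The "deep merge" of overrides mentioned in the two commits above can be illustrated with a short Python sketch. This is not teuthology's actual `deep_merge`. The non-mutating behaviour, the choice to concatenate lists, and the sample keys in the usage note are assumptions made for the example.

```python
def deep_merge(base, override):
    """Recursively merge `override` into `base` and return the result.

    Dicts are merged key by key, lists are concatenated, and any other
    value in `override` replaces the one in `base`. Inputs are not modified.
    """
    if isinstance(base, dict) and isinstance(override, dict):
        merged = dict(base)
        for key, value in override.items():
            if key in merged:
                merged[key] = deep_merge(merged[key], value)
            else:
                merged[key] = value
        return merged
    if isinstance(base, list) and isinstance(override, list):
        return base + override
    return override
```

A shallow merge would replace the whole nested section. Here, merging `{"ceph": {"log-whitelist": ["b"]}}` into `{"ceph": {"log-whitelist": ["a"], "fs": "btrfs"}}` keeps `fs` and extends the whitelist.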
08:07 PM Revision 685450b7 (ceph): common: libraries should not log to stdout/stderr
Certainly not by default.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
07:57 PM Revision c6988a07 (ceph): Save config after locking nodes, so targets are included.
Josh Durgin
07:56 PM Revision f1dd56d9 (ceph): objecter: set skipped_map if we skip a map
This ensures that we resend _all_ requests, since we aren't sure which
may have mapped to a different primary and the...
Sage Weil
07:39 PM Revision 5afef020 (ceph): objecter: add is_locked() asserts
Sanity check.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
07:39 PM Revision bf91177e (ceph): objecter: send slow osd MPing via Connection*
This may address #1732 indirectly because we have a Connection* reference
here. However, it's still not clear how we...
Sage Weil
07:18 PM Revision 4e6cd55c (ceph): filestore_idempotent: remove unused import
Josh Durgin
07:16 PM Revision 7d51e3d3 (ceph): mon_recovery: remove unused code and import
Josh Durgin
07:11 PM Revision f4d527e7 (ceph): thrashosds: timeout for every clean check, not just the last one
Josh Durgin
07:05 PM Revision 9d12b720 (ceph): ceph_manager: add a default timeout of 5 minutes for mon quorum
Josh Durgin
06:45 PM Revision cb9ac089 (ceph): ceph_manager: log mon quorum status so the logs show progress (or lack ...
Josh Durgin
05:42 PM Bug #1351: rados bench should report errors
Quick skim analysis:
If there's an error, the OSD returns it as an ACK.
The objecter only sends back data on the re...
Greg Farnum
11:05 AM Bug #1351: rados bench should report errors
This is probably what caused #1734. Josh Durgin
05:03 PM Feature #1736 (Resolved): collectd: hacky script to generate types.db from perfcounter schema
... Sage Weil
04:48 PM rgw Bug #1729 (Resolved): test_object_create_bad_expect_empty
Sage Weil
03:22 PM rgw Bug #1729: test_object_create_bad_expect_empty
Yehuda thinks this was a problem with not having the right Apache package installed; I think he's right and I've seen... Greg Farnum
04:43 PM Feature #1387 (Closed): teuthology-nuke: don't fail on down nodes
Josh Durgin
04:36 PM Bug #1723 (Rejected): timeouts during ffsb
Sage Weil
04:36 PM Bug #1723: timeouts during ffsb
also didn't have the umount bug fix.
i think the osd timeouts are just sluggish server, not actual errors per se.....
Sage Weil
04:33 PM Bug #1724 (Resolved): timeout during tiobench test
this test ran commit:dfc3ddc8983fbc7c376394067335b360c68cd314, which did not include the root dentry fix in commit:77... Sage Weil
03:06 PM CephFS Bug #1728 (Resolved): multiple cfuse tests failing with non-empty directories
fixed by commit:ef5ca293a7eee6fd37c1ea8e8027a5f6d83b66da Sage Weil
02:13 PM CephFS Bug #1728: multiple cfuse tests failing with non-empty directories
My guess is the warning cleanup patch that added an error check in the readdir code, commit:cd90061239a598f6fca94326b... Sage Weil
02:41 PM Bug #1731 (Resolved): PAXOS assert(begin->last_committed == last_committed)
fixed by commit:367ab142d7bc938c5a8b40027acd2431a11c8022 Sage Weil
11:56 AM Bug #1732: osdmap assert fail during rados bench
with commit:bf91177e57a4fae54882d78aa6b2bcf1adccae5d this won't crash, but it's still not clear how we got an OSDSessi... Sage Weil
08:51 AM Bug #1732 (Can't reproduce): osdmap assert fail during rados bench
... Josh Durgin
11:39 AM Feature #1262 (Closed): teuthology: monitor health during run
Duplicate of #1240. Josh Durgin
11:06 AM Bug #1733 (Duplicate): rados bench duration can be ignored
Probably caused by #1351. Josh Durgin
09:05 AM Bug #1733: rados bench duration can be ignored
Is it generating new writes, or waiting for old writes to complete?
The time you give rados bench was never intend...
Greg Farnum
08:58 AM Bug #1733 (Duplicate): rados bench duration can be ignored
Sometimes a thrashing run with rados bench will continue indefinitely, with rados bench continuing to write after its... Josh Durgin
10:57 AM Bug #1730 (Rejected): mysterious compilation error
These were actually just warnings - the test passed. Josh Durgin
12:00 AM Revision f3c569ee (ceph): rgw: add swift task
still not completely working (for some reason it skips all the tests) Yehuda Sadeh
12:00 AM Revision 1dd607ca (ceph): rgw: add swift task
still not completely working (for some reason it skips all the tests) Yehuda Sadeh

11/16/2011

09:11 PM Revision fa4b0fb9 (ceph): osd: add pending_ops assert
Just a sanity check, hopefully helping us track down #1727.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
07:01 PM Revision 17fa1e0d (ceph): mon: renamed get_latest* -> get_stashed*
This makes e.g. get_latest_version() vs get_last_committed() less
confusing.
Signed-off-by: Sage Weil <sage@newdream...
Sage Weil
06:57 PM Revision b9d5fbe4 (ceph): mon: fix ver tracking for auth database
Local variable keys_ver needs to be updated when we slurp up latest stashed
version.
Signed-off-by: Sage Weil <sage@...
Sage Weil
06:54 PM Revision b425f6d6 (ceph): mon: always load stashed version when version doesn't match
The slurp process can happen after the monitor has started and has some
in-memory version of the state, and that proc...
Sage Weil
06:30 PM Bug #1731 (Resolved): PAXOS assert(begin->last_committed == last_committed)
In the 11/16 nightlies, there were numerous coredumps in:
sepia72 mon.{f,l,o,r,u}.log
sepia74 mon.q.log
All ...
Anonymous
06:23 PM Bug #1730 (Rejected): mysterious compilation error
In the 11/16 nightlies, 2071 rbd_dbench a compile failed ... with some warnings.
Has this worked in the past?
20...
Anonymous
06:19 PM rgw Bug #1729 (Resolved): test_object_create_bad_expect_empty
in the 11/16 nightly, 2080 rgw_s3tests
2011-11-16T00:51:18.914 INFO:teuthology.orchestra.run.err:s3tests.functional....
Anonymous
05:59 PM CephFS Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
This happened again on 11/16, 2056 kclient_workunit_kernel_untar_build
2011-11-16T00:36:30.996 INFO:teuthology.task....
Anonymous
05:51 PM CephFS Bug #1728 (Resolved): multiple cfuse tests failing with non-empty directories
All from the 11/16 nightlies:
2044 cfuse_workunit_snaps ...
2011-11-16T00:05:11.781 INFO:teuthology.task.workunit...
Anonymous
01:10 PM Bug #1727 (Resolved): osd: failed assert(pending_ops > 0) in dequeue_op
from ml:... Sage Weil

11/15/2011

04:55 PM Bug #1432 (Resolved): libvirt: fix definition for rbd params/sources/etc
Merged upstream. Josh Durgin
11:12 AM rgw Cleanup #1716: rgw: remove curl use
We might want to hold this until we figure out whether and how we want to support openstack keystone. Yehuda Sadeh
11:08 AM rgw Bug #1721: rgw: spurious multipart-upload failures
It seems that the osd is a bit sluggish when we see those errors. Basically the complete (or abort) multipart takes t... Yehuda Sadeh
11:04 AM rgw Feature #1726 (Rejected): rgw: improve multipart upload performance
Currently when the upload completes, for each part we do:
- prepare index
- remove object
- complete index
E...
Yehuda Sadeh
10:24 AM Bug #1725 (Rejected): osd: os/FileStore.cc: 2426: FAILED assert(0 == "unexpected error")
btrfs bug, fixable by http://article.gmane.org/gmane.comp.file-systems.btrfs/13630/match=large+xattr Samuel Just
07:00 AM Bug #1725 (Rejected): osd: os/FileStore.cc: 2426: FAILED assert(0 == "unexpected error")
Getting a crash on one OSD when it tries to start up after upgrading to 0.38.
Here is the log of start up to crash...
Damien Churchill
01:02 AM Revision 2e195500 (ceph): rgw: don't log entries with bad utf8
Yehuda Sadeh

11/14/2011

10:39 PM Revision 0276eab4 (ceph): rgw: adjust error code in swift copy failures
Yehuda Sadeh
09:55 PM Revision 1fe16923 (ceph): rgw: fix swift responses encoding
Yehuda Sadeh
09:23 PM Revision 2445fd84 (ceph): rgw: Fix some merge problems uncovered by gcc warnings:
* a refactor in e2100bce left the mod_ptr and unmod_ptr members set
incorrectly in RGWCopyObj::init_common
* a fi...
Josh Pieper
09:23 PM Revision cd900612 (ceph): Resolve gcc warnings.
These should have no functional changes:
* Check errors from functions that currently cannot return any
* Initializ...
Josh Pieper
08:15 PM Revision a5b8c851 (ceph): osd: remove dead osd_max_opq code
This is no longer used as of a while ago!
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
05:02 PM rgw Bug #1698 (Resolved): radosgw-admin log list returns invalid json when a log object was created w...
Fixed, commit:2e195500b5d3a8ab8512bcf2a219a6b7ff922c97. Not logging entries with non-utf8 bucket name. Yehuda Sadeh
04:30 PM Bug #1676: stats mismatch during snaps workunit
Still happening in 11/11 nightly
1812/remote/ubuntu@sepia69.ceph.dreamhost.com/log/osd.1.log.gz
Anonymous
04:27 PM Bug #1530: osd crash during build_inc_scrub_map
Still happening in 11/11 nightly
1814/remote/ubuntu@sepia55.ceph.dreamhost.com/log/osd.1.log.gz
Anonymous
04:24 PM Bug #1722: osd_class_dir must reflect autoconf libdir
the original commit is commit:7e5dee907a8218647a88d1c7d3316cc277e1c44b. iirc that approach didn't work because autom... Sage Weil
02:11 PM Bug #1722: osd_class_dir must reflect autoconf libdir
See also #1614, which for some reason doesn't let me edit it anymore. Anonymous
02:11 PM Bug #1722 (Resolved): osd_class_dir must reflect autoconf libdir
These two end up at different values for systems using /usr/lib64:
src/common/config_opts.h:285:OPTION(osd_class_d...
Anonymous
04:19 PM Bug #1614 (Duplicate): default rados class location needs to depend on autoconf libdir
Sage Weil
02:08 PM Bug #1614: default rados class location needs to depend on autoconf libdir
Sage Weil
04:18 PM Revision f418775d (ceph): workunits: rados python workunit should be executable
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
04:12 PM Bug #1659: Upgrade from 0.27 -> 0.37 going wrong, OSDs miss map updates
I saw a very similar stack trace in the 11/11 Nightly
1862/remote/ubuntu@sepia9.ceph.dreamhost.com/log/osd.5.log.gz
Anonymous
04:06 PM Revision b43981b8 (ceph): multimon: need at least 2 osds to go healthy
Josh Durgin
04:04 PM Bug #1724: timeout during tiobench test
(I should have said the other problem was filed as bug 1723) Anonymous
04:03 PM Bug #1724 (Resolved): timeout during tiobench test
During the 11/11 nightlies, the tiotest task blocked multiple times. The first stack trace
(from 1831/remote/ubuntu...
Anonymous
03:57 PM Bug #1723 (Rejected): timeouts during ffsb
During the 11/11 nightlies, in suite 1827, sepia65 experienced multiple timeout events.
The first (from 1827/remote/...
Anonymous
12:34 PM rgw Bug #1721 (Can't reproduce): rgw: spurious multipart-upload failures
Sage Weil
11:56 AM Bug #1707 (Resolved): After fresh install, OSD initialization fails with: error error 17: File ex...
great, thanks! Sage Weil
03:53 AM Bug #1707: After fresh install, OSD initialization fails with: error error 17: File exists not ha...
Yes, I tested after that revision and could not reproduce the problem. Josh Pieper

11/13/2011

10:18 PM Revision 102c4342 (ceph): crush: send debug output to dout, not stdout/err
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
10:16 PM Revision 25eee416 (ceph): test/run_cmd: use mkstemp instead of mkstemps
my box didn't have mkstemps
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
10:07 PM Revision 18009866 (ceph): ceph-authtool: fix clitests
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
02:20 PM Bug #1688: Benjamin: pg stuck in scrub
is this addressed by the pg lock vs transaction submit ordering changes? Sage Weil
02:13 PM Bug #1707: After fresh install, OSD initialization fails with: error error 17: File exists not ha...
I think this is fixed by commit:7fb182a17b703002c1bd098391fb688b5b1e2749. Can you retest against latest master? Sage Weil
02:06 PM Bug #1708 (Can't reproduce): mon/PGMonitor.cc: 218: FAILED assert(paxos->get_version() + 1 == pen...
I fixed a number of bugs in this area, and there was a big refactor. Can you retest the latest and see if you run in... Sage Weil
02:05 PM Feature #1720 (Duplicate): qa: rpm autobuilders
probably start with opensuse and fedora, but eventually we probably want
- fedora (+ rawhide)
- opensuse (+ tumbl...
Sage Weil

11/12/2011

11:17 PM Revision d476ae25 (ceph): test_str_list: make sure ' ' and ', ' separators work for str lists
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
10:55 PM Revision ecd713c5 (ceph): ceph-authtool: make error msg more helpful
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
10:55 PM Revision 4f39aaa7 (ceph): keyring: don't print auid if it is the default
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
10:55 PM Revision ee02a1e1 (ceph): mon: implement 'fsid' command
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
10:19 PM Revision 5a3004e2 (ceph): Merge branch 'stable'
Sage Weil
10:08 PM Revision 73f99a18 (ceph): mon: fix 'osd crush add ..' weight
This was changed to floating point in commit 3f67893.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
10:05 PM Revision 1b843e0e (ceph): osdmap: build_simple with normal osd/host/rack/pool hierarchy
This will be useful in the general case where the cluster is created with
an empty map and useful crush hierarchy.
S...
Sage Weil
10:04 PM Revision ec97c852 (ceph): mon: fix 'osd crush add ..' weight
This was changed to floating point in commit 3f67893.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
09:42 PM Revision 0349fa96 (ceph): vstart.sh: don't generate initial osdmap explicitly
This is simpler and exercises the monitor's ability to start with a generic
osdmap and build it out as new osds are ad...
Sage Weil
09:41 PM Revision 30ddc85e (ceph): mon: make initial osdmap optional
If an initial osdmap is not provided, we generate an empty one. The user
adds osds on their own after that.
Signed-o...
Sage Weil
09:41 PM Revision 0d812252 (ceph): osdmap: build_simple: create reasonable pools when numosd==0
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
09:16 PM Revision 8e150fb4 (ceph): mon: add '--fsid foo' arg for setting generated monmap fsid
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
05:04 AM Revision b51d817e (ceph): mon: take '--fsid foo' arg with --mkfs
This will set the seed monmap's fsid. This is useful if the monmap is
dynamically generated (e.g., based on ceph.con...
Sage Weil
05:04 AM Revision 0c731ed7 (ceph): osd: fix warnings
osd/ReplicatedPG.cc: In member function 'virtual void ReplicatedPG::remove_watchers_and_notifies()':
osd/ReplicatedPG...
Sage Weil
04:52 AM Revision 73705f66 (ceph): monmaptool: fix clitests
Initial map is epoch 0. Modifications still bump epoch by one.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
04:49 AM Revision 36241da4 (ceph): paxos: discard waiting_for_active events on reset
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:48 AM Revision 2253c016 (ceph): use libuuid for fsid
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
04:48 AM Revision 80ab6568 (ceph): monclient: use blank fsid (instead of epoch==0) for monmap checks
We can safely mkfs with an epoch=0 monmap as long as the fsid is set. And
that is what commit f31825cee5300c708800a0...
Sage Weil

11/11/2011

10:59 PM Revision 07950bb8 (ceph): crush: grammar: allow '.' in name token
These are now in the generated crush maps, so it seems appropriate to
recompile them :).
Reported-by: Martin Mailand...
Sage Weil
10:54 PM Revision cf0a53e1 (ceph): mon: fix seed monmap removal
Remove if we previously had no latest, not based on which map we now have.
It's possible we join when monmap epoch is s...
Sage Weil
10:52 PM Revision 6d370f3b (ceph): mon: allow monitor to automagically join cluster
If a monitor starts up with the correct fsid and auth keys, it will now
add itself to the monmap (and subsequently tr...
Sage Weil
08:52 PM Revision d56485a8 (ceph): osd: pass monclient::init errors up the stack
Fixes crash like
ceph version 0.38-149-gbf254de (commit:bf254de5cf8a17ce9467d166d87f3ab93170ae13)
1: (ceph::BackTr...
Sage Weil
08:37 PM Revision bf254de5 (ceph): mon: verify fsid during probe and election
This will keep mismatched fsids out of the same quorum.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
08:22 PM Revision f1a98fb8 (ceph): mon: tolerate won election while active
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:22 PM Revision cd736b9d (ceph): mon: clean up logic a bit
More explicit.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
08:22 PM Revision 2633d71d (ceph): mon: only re bootstrap if monmap actually changes
If we go thru here just to update latest, that's fine; no need to restart
the bootstrap process.
Signed-off-by: Sage...
Sage Weil
08:15 PM Revision 622fbadd (ceph): paxos: fix off-by-one in share_state
We hit this on adding a new monitor to an existing cluster.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
08:05 PM Revision 6c663d85 (ceph): mon: fix monmap update
It's on the stack; update in place.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
08:02 PM Revision 1134fdfe (ceph): mon: properly process monmaps even when i have the latest
We may get the latest monmap when we are doing our probing, but we still
need to process it in update_from_paxos(). ...
Sage Weil
07:55 PM Revision c097e634 (ceph): mon: fix up update_from_paxos() methods
Make sure they behave when the initial state is learned from paxos.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
07:41 PM Revision aea7563f (ceph): mon: create initial states after quorum is formed
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
07:41 PM Revision e545af2d (ceph): mon: remove empty monstore dirs
This is sloppy, but it works well enough since we mkdir dirs as needed
too.
Signed-off-by: Sage Weil <sage@newdream....
Sage Weil
07:41 PM Revision 65f797ea (ceph): mon: clean up mkfs seed data
And make sure the monmap/latest gets written properly.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
07:41 PM Revision f31825ce (ceph): monmaptool: new maps get epoch 0
Just for consistency's sake.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
06:45 PM Revision 1533f1c0 (ceph): mon: stage mkfs seed info in mkfs/ dir
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:34 PM Revision 9e941c43 (ceph): mon: eliminate PaxosService::init()
update_from_paxos() is sufficient
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
06:19 PM Revision 0a926ef5 (ceph): mon: include monmap dump in mon_status and quorum_status
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:15 PM Revision 8c3d872e (ceph): mon: pull initial monmap from monmap/latest OR mkfs/monmap
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:05 PM Revision 0ecae996 (ceph): mon: take explicit initial monmap -or- generate one via MonClient
This will simplify bootstrapping a cluster via e.g. mon_host.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
09:58 AM Linux kernel client Bug #1704 (Resolved): oid limited to 40 chars, rbd images can be longer
fixed by commit:224736d9113ab4a7cf3f05c05377492bd99b4b02
still need to do some cleanup here
Sage Weil
09:57 AM Linux kernel client Bug #1696 (Resolved): kclient: crash in ceph_d_prune
fixed by commit:774ac21da76f5c3018428725074e27a3fd40b128 Sage Weil
07:17 AM rgw Bug #1719 (Resolved): rgw: crash in ObjectCache::touch_lru
... Sage Weil
05:36 AM Revision 2bad0115 (ceph): filestore-idempotent
run filestore_idempotent.py task. Sage Weil
05:35 AM Revision c5f070b8 (ceph): filestore_idempotent.py: simple task to test non-idempotent osd ops
Write some non-idempotent events to the osd. Simulate a failure. Verify
the result is correct on replay.
This must...
Sage Weil
05:12 AM Revision 69cd3625 (ceph): filestore: sync after non-idempotent operations
This is a big hammer to fix journal replay on non-btrfs fs backends (extN,
xfs, whatever). The problem is that it is...
Sage Weil
05:12 AM Revision 09811120 (ceph): filestore: document the btrfs_* fields
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
05:12 AM Revision 8df0cd38 (ceph): filestore: make trigger_commit() wake up sync; adjust locking
We need to wake up the sync thread (duh).
Also, we need to obey the FileJournal::lock -> journal_lock locking
order....
Sage Weil
05:12 AM Revision 9f1673c1 (ceph): test_filestore_idempotent: transactions are individually idempotent
Make individual transactions idempotent, but their interactions
non-idempotent. I.e. A A A A is okay, but A B A is n...
Sage Weil
05:12 AM Revision add04d15 (ceph): filejournal: fix replay of non-idempotent ops
- start sync thread prior to replay, so that we can commit as we replay
operations
- keep applied_seq accurate
- pa...
Sage Weil
05:12 AM Revision dae6c956 (ceph): test_filestore_idempotent: detect commit cycles due to non-idempotent ops
If we do a non-idempotent op and it does a commit itself, we don't see
fs->is_committed() true ever. Also count full...
Sage Weil
04:50 AM Revision fa5047b3 (ceph): Merge remote branch 'gh/stable'
Sage Weil
01:15 AM Revision 1c1ebb4d (ceph): Add rados python tests.
Josh Durgin
01:10 AM Revision 2fb70297 (ceph): rgw: remove warning
Yehuda Sadeh
01:03 AM Revision 5407fa70 (ceph): workunits: add workunit for running rgw and rados python tests
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
12:52 AM Revision 71bfe897 (ceph): test/pybind: add test_rgw
Forgot to add this in the previous commit.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Josh Durgin
12:46 AM Revision ea42e02c (ceph): test/pybind: convert python rados and rgw tests to be runnable by nose
These tests can now be run automatically more easily.
Fixes: #1653
Signed-off-by: Josh Durgin <josh.durgin@dreamhost...
Josh Durgin
12:37 AM CephFS Bug #1702: Ceph MDS crash + client mount problem
Yes I am stopping the clients and remounting...but if im doing a mkcephfs, i make sure to umount all the clients befo... Gokul Krishnan
12:33 AM Revision 25cde7f9 (ceph): rados.py: fix Snap.get_timestamp
This now uses datetime, imports the right things, and calls the right function.
Fixes #1577
Signed-off-by: Josh Durg...
Josh Durgin

11/10/2011

11:07 PM Revision b600ec2a (ceph): v0.38
Sage Weil
11:05 PM Revision 2a7fbe0c (ceph): common: return null if mc.init() unsuccessful
Prevents ceph.cc from segfaulting on missing keyring.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just
11:05 PM Revision a177a702 (ceph): rbd.py: fix list when there are no images
It should return [], not [''].
Reported-by: Eric Chen <Eric_YH_Chen@wistron.com>
Signed-off-by: Josh Durgin <josh.du...
Josh Durgin
11:05 PM Revision 27bb48c5 (ceph): mon: overwrite in put_bl
This fixes a situation where we accept a large value, there is some failure
and recovery, and then we commit a smalle...
Sage Weil
11:05 PM Revision 2f97a222 (ceph): PG: mark scrubmap entry as not absent when we see an update
Previously, there would be an assert failure in _scan_list if we see an
object deleted and then recreated.
Signed-of...
Samuel Just
10:58 PM Revision 87941128 (ceph): rgw: implement swift copy, fix copy auth
Yehuda Sadeh
10:13 PM Revision 77c977c1 (ceph): misc: allow >1 monitor per role in get_mon_names()
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:09 PM Revision 704644bc (ceph): PG: gen_prefix: use osdmap_ref rather than osd->osdmap
Otherwise, the debug output might not match the map used by
the pg logic.
Signed-off-by: Samuel Just <samuel.just@dr...
Samuel Just
10:09 PM Revision 7fb182a1 (ceph): OSD: sync_and_flush after mkfs to create first snap
Previously, if we kill the OSD process before the filestore
does its first sync, we end up replaying the journal on t...
Samuel Just
09:41 PM Bug #1670 (Can't reproduce): osd: crash in update_heartbeat_peers
Sage Weil
09:38 PM Bug #213 (Resolved): non-idempotent transactions (clone) under ext3 may not replay correct result
commit:dae6c956543276e103a272eb1e897db17b840348 Sage Weil
08:54 PM Bug #1530: osd crash during build_inc_scrub_map
Sage Weil
05:29 PM Bug #1530: osd crash during build_inc_scrub_map
We just found surprisingly similar stack traces in three of last night's failures:
nightly_coverage_2011-11-10/1740/...
Anonymous
06:45 PM Feature #1516 (Resolved): openstack: single node dev environment
Josh Durgin
05:06 PM rgw Feature #1717 (Resolved): rgw: support json input
Yehuda Sadeh
05:06 PM Feature #1653 (Resolved): librados: python binding nose tests
Fixed by commit:ea42e02ca2fd3655dbaf2e720e31d78da5022e21. Josh Durgin
05:05 PM rgw Cleanup #1716 (Closed): rgw: remove curl use
Yehuda Sadeh
05:05 PM Bug #1577 (Resolved): rados.py: Snap.get_timestamp does not work
Fixed by commit:25cde7f98ac195b0458830a3e345db54a994384b. Josh Durgin
04:57 PM Feature #1539 (Duplicate): libvirt: make sure snapshots work
Sage Weil
04:11 PM rgw Feature #1715 (Rejected): rgw: use RENAME osd operation to avoid slow CLONE operations
add to osd too Sage Weil
04:03 PM rbd Feature #1713 (Resolved): teuthology: qemu tasks, tests
gitbuilder
teuthology task
some tests that run in it
Sage Weil
03:29 PM CephFS Bug #1702: Ceph MDS crash + client mount problem
Gokul Krishnan wrote:
> Thank you for reverting back so quickly.
>
> Well in my scenario, i just have one Ceph se...
Sage Weil
03:29 PM CephFS Bug #1702: Ceph MDS crash + client mount problem
Gokul Krishnan wrote:
> by the way,
> you have assigned a target version as v0.39...but in the site i can find only...
Sage Weil
01:50 AM CephFS Bug #1702: Ceph MDS crash + client mount problem
by the way,
you have assigned a target version as v0.39...but in the site i can find only the source for v0.37...
e...
Gokul Krishnan
12:45 AM CephFS Bug #1702: Ceph MDS crash + client mount problem
Thank you for reverting back so quickly.
Well in my scenario, i just have one Ceph server running. And yes, every ...
Gokul Krishnan
03:29 PM rgw Feature #1712 (Resolved): rgw: support swift manifest objects
Yehuda Sadeh
03:22 PM Feature #1711 (Resolved): chef: multiple monitor support
Sage Weil
03:22 PM Bug #1669 (Resolved): linux 32 bit kernel client ld libraries and rm issue
Sage Weil
03:14 PM Feature #1709 (Resolved): specfile: merge suse spec file changes
Sage Weil
03:00 PM rgw Bug #1706 (Resolved): rgw: copy object auth verification (probably) broken
Yehuda Sadeh
02:59 PM rgw Bug #1706: rgw: copy object auth verification (probably) broken
Fixed, commit:87941128b60608d66dc5327038f099a1fb2a99c3. Yehuda Sadeh
02:59 PM rgw Bug #1705 (Resolved): rgw: swift copy is broken
Fixed, commit:87941128b60608d66dc5327038f099a1fb2a99c3. Yehuda Sadeh
02:57 PM CephFS Feature #1448: test hadoop on sepia
The following benchmark, TestDFSIO, is for 12 OSDs, 1 MDS/MON. There is a single ext4 disk per node dedicated to Ceph... Noah Watkins
02:46 PM Bug #1632 (Can't reproduce): osd: crash in dequeue_op
Sage Weil
01:54 PM Bug #1708 (In Progress): mon/PGMonitor.cc: 218: FAILED assert(paxos->get_version() + 1 == pending...
Sage Weil
01:45 PM Bug #1708 (Resolved): mon/PGMonitor.cc: 218: FAILED assert(paxos->get_version() + 1 == pending_in...
Running ceph version from git: a3dd5bd67ba19aae51a51318138ef10213a91449
Slaves are all ubuntu 11.10, 3.0.0-12
Files...
Josh Pieper
12:06 PM Bug #1707 (Resolved): After fresh install, OSD initialization fails with: error error 17: File ex...
Running ceph from git @ a3dd5bd6 with btrfs
Ubuntu 11.10, 3.0.0-12 on all machines
After installing my compiled c...
Josh Pieper
01:17 AM Revision a3dd5bd6 (ceph): PG: update info.history even if lastmap is absent
Previously, we did not update same_interval_since etc if
we do not have the previous map.
Signed-off-by: Samuel Just...
Samuel Just
12:36 AM Revision 023ff590 (ceph): Makefile: add MMonProbe.h
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
12:33 AM Revision fd5fb993 (ceph): osd: remove useless proc_replica_log() side-effect
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil

11/09/2011

11:38 PM Revision 78ad144a (ceph): hadoop: update patch and Readme.
Patch generated by Noah Watkins <noahwatkins@gmail.com>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Greg Farnum
11:30 PM Revision 386c0db3 (ceph): rgw: swift guesses mime type if not specified
Yehuda Sadeh
10:50 PM Revision 78ccb2a9 (ceph): osd: comment PG::lock*(), whitespace
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
10:46 PM Revision 87318389 (ceph): Merge branch 'master' of github.com:NewDreamNetwork/ceph
Conflicts:
src/osd/PG.cc
Sage Weil
10:32 PM Revision 5fa8df1e (ceph): osd: improve last_peering_reset debugging
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
10:32 PM Revision 383dfa33 (ceph): crypto: make crypto handlers non-static
These were static in auth/Crypto.cc, which was mostly fine, except when
we got a signal shutting everything down for ...
Sage Weil
10:15 PM Revision 9db994a5 (ceph): PG: always add backlog entry
Previously, we did not add a backlog entry if the object already had an
entry in the log along with an entry for that...
Samuel Just
10:15 PM Revision 0dffddf3 (ceph): osd/: change type of osd::osdmap to a shared_ptr
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
10:15 PM Revision 5df28ece (ceph): OSDMap,CrushWrapper: const cleanup on OSDMap
The osd's cached maps are not actually modified once cached. Marking
these methods const (which they should be) allo...
Samuel Just
10:15 PM Revision b41b1fa5 (ceph): PG: cache read-only reference to the current osdmap on pg lock
Previously, we needed to grab an osd_map read lock to send messages,
among other things. Now, we grab a reference to...
Samuel Just
10:04 PM Revision 15da4787 (ceph): rbd: Fix the showmapped cmd usage
If the rbd showmapped cmd is given any extra arguments, rbd will fail
with "assert(0)". Fix it by exiting with "usage...
Stratos Psomadakis
09:37 PM Revision 303e863d (ceph): add hammer.sh
simple script to repeat a test until it fails. can probably do something much more sophisticated
here, but this works.
Sage Weil
09:28 PM Revision 33549333 (ceph): hadoop: return all replica hostnames
Updates CephFileSystem to return all replica locations,
and in addition attempts to use reverse DNS to convert
the OS...
Noah Watkins
09:23 PM Revision e6035a62 (ceph): hadoop: make listStatus quiet
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
09:23 PM Revision d7f911fb (ceph): hadoop: handle new ceph_get_file_stripe_address
Updates the Hadoop JNI/CephFileSystem to handle
the new version of ceph_get_file_stripe_address
which returns the loc...
Noah Watkins
09:23 PM Revision 619430a7 (ceph): client: return stripe address replicas
Changes ceph_get_file_stripe_address to return a
vector of entity_addr_t's for the primary and the
replicas. libcephf...
Noah Watkins
09:15 PM Revision c5c50377 (ceph): client: fix bad perfcounter fset callers
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
08:50 PM Revision 808c6442 (ceph): Improve use of syncfs.
Test syncfs return value and fallback to btrfs sync and then sync.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unic...
Alexandre Oliva
08:48 PM Revision c51e2f72 (ceph): osd: fix perfcounter typo
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
07:43 PM Revision 1ac6b47c (ceph): os: rename and make use of the split_threshold parameter.
This was accidentally left out of the must_split calculation. Put it
in, and rename it to split_multiplier (as that i...
Greg Farnum
07:03 PM Revision 09455eea (ceph): perfcounters: fix users of fset on averages
I forgot to audit these before merging the assert and they popped up
in teuthology and stuff. :(
Signed-off-by: Greg...
Greg Farnum
06:49 PM Revision afa56f16 (ceph): nuke: increase reboot timeout
Some sepia nodes are very slow to reboot. Josh Durgin
05:35 PM Bug #1690: osd re-created from scratch will crash on start-up
I was using v0.37; in order to debug this, I first build top of the tree stable (b8979f4d292f6a739daac81ce8e59aa084e1... Alexandre Oliva
05:11 PM rgw Bug #1706 (Resolved): rgw: copy object auth verification (probably) broken
Looking at RGWCopyObj::verify_permission(), we don't look at the source acl, but rather at the source bucket's acl. Yehuda Sadeh
05:07 PM rgw Bug #1705 (Resolved): rgw: swift copy is broken
Swift can accept alternative HTTP COPY method (with src/dest transposed). Yehuda Sadeh
04:38 PM Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct result
Sage Weil
02:55 PM Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct result
Update: the current first pass plan is to initiate a FileStore sync after any non-idempotent operation. This updates... Sage Weil
03:35 PM Linux kernel client Bug #1701: krbd: limits and constants are not consistent in kernel and userspace
Also related: we have MAX_POOL_NAME_SIZE and MAX_SNAP_NAME_SIZE as 128 in qemu right now. Josh Durgin
02:37 PM Linux kernel client Bug #1701: krbd: limits and constants are not consistent in kernel and userspace
Stratos Psomadakis wrote:
> Instead of opening a new issue, I think I can add it here.
>
> Besides those limits o...
Sage Weil
02:18 PM Linux kernel client Bug #1701: krbd: limits and constants are not consistent in kernel and userspace
Instead of opening a new issue, I think I can add it here.
Besides those limits on the RBD images, there's also a ...
Stratos Psomadakis
12:44 PM Linux kernel client Bug #1701 (New): krbd: limits and constants are not consistent in kernel and userspace
There are a few things that exist in the kernel but not userspace:
* SNAP_NAME_LEN
* (MIN|MAX)_OBJECT_ORDER
Also...
Josh Durgin
03:00 PM CephFS Bug #1702: Ceph MDS crash + client mount problem
Ok, so generally speaking, the only time you should see fsid mismatches like that is if you have daemons from multipl... Sage Weil
02:55 PM CephFS Bug #1702: Ceph MDS crash + client mount problem
Hello,
thank you for the reply.
no, unfortunately i am not able to reproduce the error using debug ms = 20(for MD...
Gokul Krishnan
01:23 PM CephFS Bug #1702 (Need More Info): Ceph MDS crash + client mount problem
Are you able to reproduce this with 'debug mds = 20' and 'debug ms = 20' in your ceph.conf [mds section]?
Not sure...
Sage Weil
12:51 PM CephFS Bug #1702 (Can't reproduce): Ceph MDS crash + client mount problem
Hello,
i have configured ceph using a configuration as shown here[[http://pastebin.com/sQb8WZbx]].
The Ceph serve...
Gokul Krishnan
02:43 PM Bug #1684 (Duplicate): mon: crash in CryptoKey::encrypt
Sage Weil
02:42 PM Bug #1633 (Resolved): osd crash in CryptoKey::decrypt
should be fixed by commit:383dfa33682abeae7348655fc103dd80c41b7ba7 Sage Weil
02:39 PM Linux kernel client Feature #962 (Resolved): d_prune
Sage Weil
02:39 PM Linux kernel client Bug #850 (Resolved): make NULL lookup using I_COMPLETE work
Sage Weil
02:39 PM Linux kernel client Bug #851 (Resolved): make dcache readdir with I_COMPLETE work
Sage Weil
02:38 PM Linux kernel client Bug #1704 (Resolved): oid limited to 40 chars, rbd images can be longer
From Stratos Psomadakis:
"Besides those limits on the RBD images, there's also a hardcoded limit in
libceph (mess...
Sage Weil
02:27 PM rgw Bug #1698: radosgw-admin log list returns invalid json when a log object was created with a name ...
This is my vote for "let's not allow radosgw clients to create artifacts with non-utf8 names in the first place". Anonymous
02:19 PM Bug #1530 (Resolved): osd crash during build_inc_scrub_map
Samuel Just
02:08 PM Bug #1703 (Resolved): rbd: showmapped cmd fails, when extra args are present
Sage Weil
02:00 PM Bug #1703 (Resolved): rbd: showmapped cmd fails, when extra args are present
rbd showmapped cmd will fail with assert(0), when given any extra arguments.
Patch to fix it attached (exiting wit...
Stratos Psomadakis
01:02 PM Bug #1695 (Rejected): wrong path to ceph's libs / bash scripts in /etc/init.d/ceph
Serge Rittscher wrote:
> ok, the output is:
> @
> rm -f init-ceph init-ceph.tmp
> sed -e 's|@bindir[@]|/usr/local...
Sage Weil
11:39 AM Bug #1695: wrong path to ceph's libs / bash scripts in /etc/init.d/ceph
ok, the output is:
@
rm -f init-ceph init-ceph.tmp
sed -e 's|@bindir[@]|/usr/local/bin|g' -e 's|@libdir[@]|/usr/lo...
Serge Rittscher
11:04 AM Bug #1695: wrong path to ceph's libs / bash scripts in /etc/init.d/ceph
oops, 'touch init-ceph.in' first, then 'make init-ceph' Sage Weil
12:49 AM Bug #1695: wrong path to ceph's libs / bash scripts in /etc/init.d/ceph
@make init-ceph@
returns:
@make: `init-ceph' is up to date.@
Serge Rittscher
11:10 AM Bug #1700 (Resolved): osd: invalid perfcounter usage
Should be fixed in commit:09455eeac4fb37c31998202ad9503901f53c21dc. My bad! Greg Farnum
10:14 AM Bug #1700 (Resolved): osd: invalid perfcounter usage
During dbench, two osds crashed on this assert:... Josh Durgin
11:09 AM Bug #1694 (Resolved): monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
Sage Weil
11:09 AM Bug #1694: monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
oh nevermind, didn't see that second comment. the fix is commit:0bcdd4f3b2a2dba405639122b84f7aad978f347b, which come... Sage Weil
11:06 AM Bug #1694: monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
Great. Can you attach (or email) the ceph.conf you're using?
Thanks!
Sage Weil
07:55 AM Bug #1694: monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
The monitor that was generating the osdmap was running commit:5bd029ef01fcb59bea9170af563c3499cce1e8c4 and that faile... Wido den Hollander
02:25 AM Bug #1694: monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
Ok, I've run those commands and it gives me:... Wido den Hollander
07:19 AM CephFS Bug #1472: cfuse hangs with v0.34
Some of the hangs we've been seeing on the client may have been related to having two nics on each node. We had seen... Sam Lang
06:17 AM Revision 6d39cc11 (ceph): ceph: keep ceph.conf at ctx.ceph.conf
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:17 AM Revision 60863f70 (ceph): ceph_manager: manipulate monitors
Sage Weil
06:17 AM Revision 6618a027 (ceph): mon_recovery: add task to test monitor cluster failure recovery
Some simple tests to start with. We still need some sort of mon cluster
thrashing.
Signed-off-by: Sage Weil <sage@n...
Sage Weil
06:16 AM Revision 9acea7a6 (ceph): multimon mon_recovery tests on variously sized monitor clusters
Sage Weil
06:11 AM Revision 6ab14874 (ceph): Merge branch 'wip-mon'
Sage Weil
05:58 AM Revision 87634ce1 (ceph): osd: don't open deleted map from generate_past_intervals
The first get_map() call needs to be avoided when stop < last_epoch. This
fixes a crash like
2011-11-08 21:51:09.04...
Sage Weil
05:13 AM Revision 20cf1e96 (ceph): automake: enable 'make V=0'
Enables silent mode for automake generated Makefiles,
and silent mode is _off_ by default. Using V=0 the output
is mu...
Sage Weil
12:45 AM Revision 4b0cf89b (ceph): Add rbd python binding test.
Josh Durgin
12:24 AM Revision 1bc1a244 (ceph): mon: handle active -> electing transition properly
If we are already active, make sure we reset things properly before going
into an election.
Signed-off-by: Sage Weil...
Sage Weil
12:09 AM Revision 5d32bcae (ceph): Add nuke-on-error option.
This lets automated jobs nuke and unlock machines after failed
tests. Each machine is nuked individually, so one down ...
Josh Durgin
12:09 AM Revision 006a0dd4 (ceph): Remove unused imports and variable.
Josh Durgin

11/08/2011

10:21 PM Feature #1007 (Resolved): qa: osd failure and cluster recovery test(s)
yay thrashing Sage Weil
10:20 PM Bug #1694 (Need More Info): monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
Sage Weil
09:28 PM Bug #1694: monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
Can you try this and see if there is a mismatch?... Sage Weil
10:06 AM Bug #1694: monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
Aha! Read that wrong, tnx.
I used mkcephfs to generate the crushmap, I did not write my own.
Wido den Hollander
09:17 AM Bug #1694: monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
max_osd in the osdmap needs to be >= the max_devices in the crush map. how did you set up the cluster? did mkcephfs... Sage Weil
07:18 AM Bug #1694: monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
I just made a small adjustment to crushtool so it would print max_devices:... Wido den Hollander
07:01 AM Bug #1694 (Resolved): monitor crash: FAILED assert(get_max_osd() >= crush.get_max_devices())
I just did a fresh install of my cluster and after starting I saw my monitors go down with:... Wido den Hollander
10:18 PM Feature #1646 (Resolved): mon: catch up on committed items before attempting to join quorum
Sage Weil
10:17 PM Revision 7a32cc60 (ceph): rgw: swift bucket report returns both bytes size and actual size
Yehuda Sadeh
10:17 PM Revision 76090324 (ceph): rgw: don't return partial content response with bad header
Yehuda Sadeh
10:17 PM Revision a04afd09 (ceph): rgw: abort early on incorrect method
Yehuda Sadeh
09:33 PM Bug #1695: wrong path to ceph's libs / bash scripts in /etc/init.d/ceph
What is the output if you... Sage Weil
09:06 AM Bug #1695 (Rejected): wrong path to ceph's libs / bash scripts in /etc/init.d/ceph
After installing Ceph from sources (version ceph-0.37.tar.gz) on Ubuntu by executing
$ ./autogen.sh
$ ./configure...
Serge Rittscher
09:09 PM Revision 2fb73bdd (ceph): paxos: fix race between active and commit
If paxos reproposes an old learned value, we have a C_Active waiter, and
also a commit in progress.
When we reach qu...
Sage Weil
08:56 PM Revision 1ffb7b97 (ceph): mon: add 'quorum_status' command
Show status of the current quorum. Block until there is one.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
08:52 PM Revision a8b28ee5 (ceph): mon: do not participate in the election unless we are in electing state
If we participate, we may be included in the quorum, even tho we are
probing, slurping, whatever.
Signed-off-by: Sag...
Sage Weil
07:50 PM Revision 64350c0b (ceph): rgw: guard perfcounter accesses in rgw_cache.
This gets called by radosgw-admin, so it needs to handle
perfcounter being a null pointer.
Signed-off-by: Greg Farnu...
Greg Farnum
07:28 PM Revision 42f5f024 (ceph): rgw: initialize all the perfcounters, in order
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
06:42 PM Revision e952e10f (ceph): ReplicatedPG: use finc, not fset, on average counters
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
06:42 PM Revision 29e091b5 (ceph): mon: 'mon_status' command to dump individual mon state
Signed-off-by: Sage Weil <sage@newdream.net> Sage Weil
06:04 PM Revision f0b9a331 (ceph): rgw: use l_rgw_qactive perfcounter
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
05:58 PM Revision 9035ffb2 (ceph): mon: add probe+slurp timeouts
A short timeout on probe, so we can form new quorums quickly.
A longer timeout on slurp, so we will tolerate a slow ...
Sage Weil
05:50 PM Revision 0fe0f9db (ceph): rgw: create and tear down a radosgw perfcounter
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Sage Weil
05:50 PM Revision d0b226e7 (ceph): perfcounter: assert when you try and set an average.
If you're trying to set an average, you're probably doing it wrong.
Signed-off-by: Greg Farnum <gregory.farnum@dream...
Greg Farnum
05:50 PM Revision 57b60b8a (ceph): perfcounter: add some minimal documentation.
The data model is a bit obtuse if you're just looking at the code.
Signed-off-by: Greg Farnum <gregory.farnum@dreamh...
Greg Farnum
05:50 PM Revision cf566550 (ceph): rgw: implement perfcounters
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Greg Farnum
04:59 PM Linux kernel client Bug #1696: kclient: crash in ceph_d_prune
Here is the code:... Sage Weil
11:50 AM Linux kernel client Bug #1696 (Resolved): kclient: crash in ceph_d_prune
During the 11/08 nightly, several suites:
1606 autotest dbench
1607 workunit direct_io
1608 workunit kc...
Anonymous
04:57 PM Bug #1684: mon: crash in CryptoKey::encrypt
This happened on an mds during a thrashing run:... Josh Durgin
04:29 PM Linux kernel client Feature #1699 (Resolved): debug symbols in autobuilt (sepia) kernels
We need debug symbols in the .ko objects:... Sage Weil
03:49 PM rgw Bug #1698: radosgw-admin log list returns invalid json when a log object was created with a name ...
The two preceding days show similar errors as well. Matthew Wodrich
03:48 PM rgw Bug #1698: radosgw-admin log list returns invalid json when a log object was created with a name ...
The description above is malformed for whatever reason, so I'll try again:
radosgw-admin log list is producing bad J...
Matthew Wodrich
03:44 PM rgw Bug #1698 (Resolved): radosgw-admin log list returns invalid json when a log object was created w...
2011-11-07-12-0-<80>.. Matthew Wodrich
02:34 PM rgw Feature #1697 (Resolved): s3-tests: test bucket headers
Sage Weil
12:04 PM rgw Feature #1591 (Resolved): rgw: instrument with perfcounter
Finally sat down and did this. Merged in commit:64350c0b4d3ba2061cebed87f4cd6f513d2ba6ed and passed s3tests. Greg Farnum
06:46 AM Revision 2523b70e (ceph): mon: slurp latest state from active monitors before joining quorum
If a monitor has been down and is behind, and joins the quorum, the
other nodes will try to send it all of the needed...
Sage Weil
06:41 AM Revision c2fc986e (ceph): monmap: simplify constructor
Explicitly set created, last_changed where appropriate.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Sage Weil
06:41 AM Revision 279661f3 (ceph): paxos: last_consumed == latest_stashed; behave accordingly
Initialize on startup.
Don't re-read off of disk on every trim_to() call.
Signed-off-by: Sage Weil <sage.weil@dreamh...
Sage Weil
06:41 AM Revision 100fba8e (ceph): mon: fix osdmap trim
We can raise the floor even when min_last_epoch_clean is very close to
the current version, as long as it is still ab...
Sage Weil
04:40 AM Revision 628de548 (ceph): mon: don't call out to mon->call_election for internal election restarts
This lets us drop the is_new kludge.
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
04:40 AM Revision 18941dd0 (ceph): mon: rename election_starting -> restart
These callbacks reset monitor/paxos/paxosservice state, which used to
happen when an election started, but will now n...
Sage Weil
04:40 AM Revision 2f46e8cd (ceph): mon: revamp monitor states
starting -> probing, electing
some cleanup
Signed-off-by: Sage Weil <sage@newdream.net>
Sage Weil
04:40 AM Revision 40843eb3 (ceph): rgw: fix warning
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> Sage Weil
01:08 AM Revision 2836104a (ceph): rgw: fix accept-range for suffix format, other related issues
Yehuda Sadeh

11/07/2011

11:04 PM Revision 2f881e12 (ceph): Timer.cc: remove global thread variable
Signed-off-by: Samuel Just <samuel.just@dreamhost.com> Samuel Just
11:04 PM Revision d4ef9215 (ceph): common: return null if mc.init() unsuccessful
Prevents ceph.cc from segfaulting on missing keyring.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Samuel Just
09:05 PM Revision c764b247 (ceph): Fix leftover orchestra import clause.
This seems to be a leftover from
a2372fce12b6bd1818e155d1d8ed5134dbd8fd4a,
no idea how it stayed hidden this long.
Tommi Virtanen
05:27 PM Revision 480b8260 (ceph): rbd: add showmapped to clitests and rst man page
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Josh Durgin
05:27 PM Revision 4e518ed3 (ceph): rbd: Document the rbd showmapped cmd
Document the rbd showmapped cmd in rbd.usage(), and rbd's man page,
and add it to the bash completion script.
Signed...
Stratos Psomadakis
05:10 PM Revision 34d80397 (ceph): rbd.py: fix list when there are no images
It should return [], not [''].
Reported-by: Eric Chen <Eric_YH_Chen@wistron.com>
Signed-off-by: Josh Durgin <josh.du...
Josh Durgin
03:35 PM Bug #1690: osd re-created from scratch will crash on start-up
I seem to be having some trouble reproducing this. What version are you running? Could you repeat the procedure wit... Samuel Just
10:33 AM Bug #1690 (Can't reproduce): osd re-created from scratch will crash on start-up
Some time ago, it was possible to re-create an osd after its filesystem failed as simply as running “cosd -i # --mkfs... Alexandre Oliva
02:59 PM CephFS Feature #1693: libcephfs: Support TRIM (hole punching)
Kernelside ceph.ko ticket is #591. Let this ticket stand for the userspace libcephfs (and ceph-fuse) support. Anonymous
02:12 PM CephFS Feature #1693 (Resolved): libcephfs: Support TRIM (hole punching)
Anonymous
02:57 PM Feature #1692: librbd: Support TRIM (hole punching) (userspace client)
Kernel-side rbd.ko ticket is #190. Let this ticket stand for the librbd (userspace) support. Anonymous
02:11 PM Feature #1692 (Duplicate): librbd: Support TRIM (hole punching) (userspace client)
Anonymous
01:56 PM Bug #1691 (Can't reproduce): rados export failures
... Sage Weil
11:36 AM Linux kernel client Bug #1667 (Resolved): BUG at fs/inode.c line 1375
Sage Weil
11:17 AM rbd Feature #1662 (In Progress): libvirt: obscure qemu/rbd secrets
Sage Weil
 
