Activity
From 01/31/2012 to 02/29/2012
02/29/2012
- 11:47 PM Revision a80246c1 (ceph): dump_stuck: note required ceph configuration
- 11:45 PM Revision b2bbede8 (ceph): dump-stuck: set pg stuck threshold to match test
- 10:46 PM Revision 86340655 (ceph): rgw: don't retry certain operations if we raced
- The atomic get/put scheme was retrying writes in case where it lost
races (head object was rewritten by another clien...
- 10:46 PM Revision 85d04c6c (ceph): rgw: don't check for ECANCELED in the _impl() functions
- We already check it in the outer functions.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 09:22 PM Bug #2022: osd: misdirectect request
- ...
- 09:22 PM Revision b1f26440 (ceph): msgr: fix race in learned_addr()
- - two connect() threads
- both hit if (need_addr) check
- one takes lock, sets addr, need_addr = false, unlocks
- con...
- 09:16 PM Bug #2080: osd: scrub on disk size does not match object info size
- hit this again, ...
- 08:28 PM Revision 8a2b7641 (ceph): msgr: print existing->state before failing assert
- May help with #1378.
Signed-off-by: Sage Weil <sage@newdream.net>
- 07:07 PM Revision cbb12809 (ceph): Merge remote-tracking branch 'gh/wip-2121'
- Reviewed-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
- 05:46 PM Revision 052d64e1 (ceph): osd: unregister signal handlers on shutdown
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:46 PM Revision db96831b (ceph): mon: unregister signal handlers on shutdown
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:45 PM Revision 8e9bf611 (ceph): mds: unregister SIGHUP too
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:45 PM Revision bb5c7640 (ceph): radosgw: handle SIGHUP
- Fixes: #2121
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:23 PM Revision 9c7b63e1 (ceph): init-radosgw: add 'reload' command to send SIGHUP
- Fixes: #2121
Signed-off-by: Sage Weil <sage@newdream.net>
- 05:21 PM Revision e8437665 (ceph): osd: fix typo is recovery_state query dump
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:17 PM Revision 0e03e9dd (ceph): osd: add missing space to scrub error
- [ERR] 18.5 osd.3: soid 8a5e37ad/rb.0.0.000000002b99/headextra attr _, extra attr snapset
Signed-off-by: Sage Weil <s...
- 05:12 PM Revision 2437ce02 (ceph): msgr: discard the local_pipe's queue on shutdown.
- To facilitate this, we do two things:
1) actually identify the number of special code values we pass around
2) use th...
- 05:10 PM Revision 7690f0b9 (ceph): osd: remove down OSDs from peer_info on reset
- If an OSD goes down, remove it from peer_info. In particular, I saw
2012-02-28 11:04:25.851038 12e53700 osd.5 3602 p...
- 02:57 PM Bug #2116: Repeated messages of "heartbeat_check: no heartbeat from"
- i'm hoping wip-2116 fixes it...
- 02:31 PM Bug #2116: Repeated messages of "heartbeat_check: no heartbeat from"
- Wido, are you able to reproduce this reliably? I have an idea what the problem is, but have never reproduced this. ...
- 02:17 PM Bug #2002: osd: racy push/pull for clones
- reenabling this in my thrashing tests. if all goes well, i'll reenable in master under the assumption that sam's cle...
- 02:16 PM Bug #1977 (Can't reproduce): mon: ceph command hang
- we can reopen if this ever pops up again
- 01:59 PM Feature #2111 (In Progress): msgr workloads
- What we're looking for here are basic tests like connect, send message, kill connection, send another message; and ve...
- 01:30 PM Messengers Bug #1747 (Resolved): msgr: osd connection originates from wrong port
- commit:b1f264406f93af35600786f58e75908c393cf2ed
- 12:21 PM Messengers Bug #1747: msgr: osd connection originates from wrong port
- wip-1747
- 11:25 AM Messengers Bug #1747: msgr: osd connection originates from wrong port
- just hit this again. osd.1:...
- 12:48 PM rgw Bug #2121 (Resolved): radosgw: reload command for init script
- 09:48 AM rgw Bug #2121: radosgw: reload command for init script
- 09:25 AM rgw Bug #2121 (Resolved): radosgw: reload command for init script
- 12:48 PM Bug #1458 (Resolved): Run ceph suite with valgrind enabled
- 11:13 AM Bug #1975: btrfs: EINVAL on snap create
- see also this thread: http://marc.info/?t=132768583600004&r=1&w=2
- 10:46 AM Bug #1975: btrfs: EINVAL on snap create
- the EINVAL seems to have come from...
- 10:44 AM Bug #1975: btrfs: EINVAL on snap create
- somehow we end up here in btrfs:...
- 10:39 AM Bug #1975: btrfs: EINVAL on snap create
- quick brain dump:
- last time this reproduced i narrowed it down to a case where there were racing rmdirs with the...
- 10:55 AM Bug #2115: OSD failed to start: Operation not permitted
- it looks like you may be having trouble authenticating with the monitor. can you reproduce this with 'debug ms = 1'? ...
- 10:28 AM Bug #2031 (Can't reproduce): paxos: failed assert (begin->last_committed == last_committed)
- 10:09 AM Messengers Bug #2086 (Resolved): msgr: msg/SimpleMessenger.h: 203: FAILED assert(!i->second->is_on_list())
- merged!
- 10:06 AM Messengers Bug #2086: msgr: msg/SimpleMessenger.h: 203: FAILED assert(!i->second->is_on_list())
- Sage suggested I could just add a local dispatch to the shutdown or wait functions to test this properly...I did, and...
- 09:18 AM Messengers Bug #2086: msgr: msg/SimpleMessenger.h: 203: FAILED assert(!i->second->is_on_list())
- 09:27 AM Bug #1873: crush_rule type is inconsistent
- It's __s16 or int so that a negative value can mean undefined/not specified. I'm inclined to just leave this as is...
- 09:18 AM Bug #2119 (Resolved): osd: do_query to !up osd
- 01:04 AM Revision fe94c041 (ceph): Merge branch 'next'
02/28/2012
- 10:05 PM Revision 23a0c039 (ceph): rgw: check for bucket swift permissions only if failed
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 09:55 PM Revision 85cc96c1 (ceph): dump_stuck: verify that 'ceph health' mentions the right number of inac...
- 09:53 PM Revision b9a675a2 (ceph): mon: report pgs stuck inactive/unclean/stale in health check
- Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Sage Weil <sage.weil@dreamhost.com>
- 09:31 PM Revision e73ab2cc (ceph): Merge branch 'master' into wip-swift-acls
- 09:29 PM Revision bc80ba1f (ceph): rgw: fix swift bucket acl verification
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 08:37 PM Revision cc935180 (ceph): rgw: implement swift public group
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 08:29 PM Revision d10e1f46 (ceph): mon: fix slurp_latest to fill in any missing incrementals
- Fixes #1789.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 06:39 PM Bug #2115: OSD failed to start: Operation not permitted
- See attachment please
- 09:17 AM Bug #2115: OSD failed to start: Operation not permitted
- Can you attach the actual log? I want to make sure there is no subtle difference in the output. Thanks!
- 01:40 AM Bug #2115: OSD failed to start: Operation not permitted
- ceph version 0.42.2 (commit:732f3ec94e39d458230b7728b2a936d431e19322)
- 01:38 AM Bug #2115 (Rejected): OSD failed to start: Operation not permitted
- I'm setting up a new ceph cluster on ubuntu 11.10 with kernel version 3.0.0-16-server x86_64. The osd server failed t...
- 05:57 PM Messengers Bug #2086: msgr: msg/SimpleMessenger.h: 203: FAILED assert(!i->second->is_on_list())
- To be clear, I didn't try and generate the actual failure condition that was causing an assert before — that should b...
- 05:55 PM Messengers Bug #2086: msgr: msg/SimpleMessenger.h: 203: FAILED assert(!i->second->is_on_list())
- wip-2086 should fix this.
Ran a simple test:...
- 05:27 PM Messengers Bug #2086 (In Progress): msgr: msg/SimpleMessenger.h: 203: FAILED assert(!i->second->is_on_list())
- 04:51 PM Messengers Bug #2086: msgr: msg/SimpleMessenger.h: 203: FAILED assert(!i->second->is_on_list())
- Okay, looks like the local_pipe doesn't get its message queue cleared...I'm checking the others and looking at how it...
- 05:50 PM Revision 999e2192 (ceph): peer: ignore +scrubbing portion of pg state
- It can cause the mon state and osd states to not match.
- 05:33 PM Revision 7b48cca1 (ceph): test_osd_types: fix unit test for new pg_t::is_split() prototype
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:30 PM Revision fd0712df (ceph): Makefile: drop separate libjson_spirit.la
- automake seems to have difficulty with the .la dependency on another .la.
Since libjson_spirit.la is only used by lib...
- 05:26 PM Revision edd35c04 (ceph): osd: drop useless ENOMEM check
- new throws exception; doesn't return NULL.
Signed-off-by: Sage Weil <sage@newdream.net>
- 05:11 PM Revision a7de459f (ceph): ceph-osd: clarify error messages
- So we know where the error came from. And use real error codes in init().
Signed-off-by: Sage Weil <sage@newdream.net>
- 05:10 PM Revision 97926e18 (ceph): init: Actually do start the daemons when 'service ceph start <type>' is...
- A bug in my previous patch prevented any daemon with auto_start set to false from starting.
This patch allows:
* /et...
- 04:55 PM rgw Bug #2120: rgw: atomic write guard doesn't scale well
- Implementing #1956 would solve this issue, and would make the entire atomic scheme simpler.
- 03:03 PM rgw Bug #2120: rgw: atomic write guard doesn't scale well
- This was reported by a user through the ml. We should figure out with that user whether it's a real issue, or a red h...
- 02:51 PM rgw Bug #2120: rgw: atomic write guard doesn't scale well
- Do we care? You can't do partial updates to objects IIRC, so many writers pretty much has to be wrong somehow or other.
- 02:35 PM rgw Bug #2120 (Resolved): rgw: atomic write guard doesn't scale well
- When there is a large number of writers to the same object.
- 04:48 PM rgw Bug #2106 (Resolved): failed s3tests.functional.test_s3.test_100_continue
- Machines were running wrong apache and fastcgi modules.
- 04:23 PM Bug #2116: Repeated messages of "heartbeat_check: no heartbeat from"
- This may be a messenger issue, but it's not losing that initial message — notice how osd5 tries to send a ping back t...
- 11:26 AM Bug #2116: Repeated messages of "heartbeat_check: no heartbeat from"
- the other side of this conversation is...
- 11:20 AM Bug #2116 (In Progress): Repeated messages of "heartbeat_check: no heartbeat from"
- looks like a msgr issue?...
- 07:35 AM Bug #2116 (Resolved): Repeated messages of "heartbeat_check: no heartbeat from"
- As discussed on the ml I gathered some logs.
Today I upgraded my whole cluster to 0.42.2 from 0.41.
Due to the ...
- 12:54 PM Bug #1789 (Resolved): mon: failed assert(paxosv == pg_map.version)
- Pushed to master in commit:d10e1f46df8cc252f2f1d57cf5e577ea38eee1ae
- 12:48 PM Bug #1789: mon: failed assert(paxosv == pg_map.version)
- Okay, figured it out. Our current slurp code pulls in all the incrementals, then sends off a request for latest_stash...
- 12:01 PM Bug #2119 (Resolved): osd: do_query to !up osd
- ...
- 11:09 AM Bug #2118: osd: flawed commit_op_seq check on startup
- 10:08 AM Bug #2118 (Resolved): osd: flawed commit_op_seq check on startup
- the check that current/commit_op_seq == newest snap is flawed because ceph-osd can write a new current/commit_op-seq ...
- 10:09 AM Bug #2104 (Won't Fix): teuthology: wait_for_clean doesn't wait for last_epoch_started to propagate
- 10:09 AM Bug #2107 (Resolved): teuthology: lost_unfound fails pg state assert
- 09:41 AM devops Feature #2117 (New): qa: gitbuilder that does ENCODE_DUMP
02/27/2012
- 11:41 PM Revision f317028f (ceph): doc: beginnings of documentation of stuck pgs and pg states
- Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
- 11:13 PM Revision 19170241 (ceph): filestore: make less noise on ENOENT
- Don't generate high-level log spam on every open error.
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed-by: Sa...
- 10:52 PM Revision 722af1a4 (ceph): no peer as part of lost_unfound
- 10:49 PM Revision 244b7029 (ceph): pg: use get_cluster_inst instead of get_inst in activate
- This was mistakenly broken in 4b3bb5ab37a05fa001d59f24da7d9c30d650321b
Signed-off-by: Greg Farnum <gregory.farnum@dr...
- 10:37 PM Revision f02195b4 (ceph): Merge branch 'wip-split2'
- Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
- 10:35 PM Revision b6a04174 (ceph): osd: pg_t::is_split(): make children out param a pointer, and optional
- Also unit test it.
Signed-off-by: Sage Weil <sage@newdream.net>
- 10:18 PM Revision 85ed06e9 (ceph): osd: bypass split code
- Until it is fully implemented. It's also disabled in the monitor
currently, but just in case it gets into the OSDMap...
- 10:16 PM Revision 15d53249 (ceph): osd: fix pg locking flags
- Two things we need to handle:
- callers who already hold map_lock (split_pg())
- callers who already hold another ...
- 10:04 PM Revision fc7b11a9 (ceph): osd: partially refactor pg split
- This partially refactors the OSD split code to do the split synchronously
when processing a new OSDMap. It is incomp...
- 07:44 PM Revision 6a081888 (ceph): osd: factor hobject key into child pgid calc during split
- When we calculate the object's new pg, take the locator key into
consideration, to avoid a crash like
osd/OSD.cc: In...
- 07:44 PM Revision d9cf3322 (ceph): osd: implement pg_t::is_split()
- Test to determine if a pg has split between two pool sizes, and if so,
what its children are.
Signed-off-by: Sage We...
- 07:39 PM Revision ee4d9909 (ceph): journaler: log on unexpected objecter error
- This will help with #2110, #1796, #1640.
Signed-off-by: Sage Weil <sage@newdream.net>
- 05:56 PM Revision 91b119a0 (ceph): osd: fix recursive map_lock via check_replay_queue()
- Also drop activate_pg() helper while we're at it, so it's clear that we
are the only user.
recursive lock of OSD::ma...
- 04:20 PM Messengers Bug #2086: msgr: msg/SimpleMessenger.h: 203: FAILED assert(!i->second->is_on_list())
- The guards for something like that shouldn't be too complicated to set up...actually, I thought they were at one poin...
- 04:19 PM Bug #1789 (In Progress): mon: failed assert(paxosv == pg_map.version)
- Iiiinteresting. This assert is the post-update check, after loading and running through all the incrementals. (Meanin...
- 01:41 PM Bug #1789: mon: failed assert(paxosv == pg_map.version)
- Shouldn't be related — this is a problem with a single monitor daemon and the other is a write problem that an MDS is...
- 12:35 PM Bug #1789: mon: failed assert(paxosv == pg_map.version)
- Core dump attached. Dumb thought: could this be related to http://tracker.newdream.net/issues/2110, they happened wit...
- 10:14 AM Bug #1789: mon: failed assert(paxosv == pg_map.version)
- Crash occurred on the third monitor when starting after being down for several hours shortly after cluster creation. ...
- 02:07 PM CephFS Bug #2110 (Duplicate): osdc/Journaler.cc: 360: FAILED assert(r >= 0)
- #1796
- 01:40 PM CephFS Bug #2110: osdc/Journaler.cc: 360: FAILED assert(r >= 0)
- can you attach ceph-mds too? or better yet, fire up gdb ceph-mds core and print out the value of r from that frame. ...
- 12:00 PM CephFS Bug #2110: osdc/Journaler.cc: 360: FAILED assert(r >= 0)
- Sage Weil wrote:
> Do you have a core file? I'm curious what the value of 'r' is.
Attached. Probably. (datetime ...
- 11:43 AM CephFS Bug #2110: osdc/Journaler.cc: 360: FAILED assert(r >= 0)
- Do you have a core file? I'm curious what the value of 'r' is.
- 11:40 AM CephFS Bug #2110 (Duplicate): osdc/Journaler.cc: 360: FAILED assert(r >= 0)
- Assert in MDS. This cluster was running a CephFS home directory workload with one active MDS and one MDS in standby r...
- 01:49 PM Bug #2045 (Need More Info): osd: dout_lock deadlock
- 01:33 PM Feature #2114 (Resolved): old sepia setup on new hardware
- 01:31 PM Feature #2113 (Resolved): objectcacher perfcounters
- 01:18 PM Feature #2112 (Resolved): msgr fault injection
- 01:18 PM Feature #2111 (Fix Under Review): msgr workloads
- Develop the interfaces which will allow us to break messenger sockets at precisely-defined points.
Allow comparison ...
- 11:38 AM Tasks #2109: qa/benchmark: Explore using Filebench for benchmarks / stress testing
- Justification and a good intro: http://cuddletech.com/blog/pivot/entry.php?id=949
- 11:36 AM Tasks #2109 (New): qa/benchmark: Explore using Filebench for benchmarks / stress testing
- http://filebench.sourceforge.net/
"Ships with more than 40 pre-defined personalities, including the one that descr...
- 11:05 AM Feature #2108 (New): track object states to inform error injection/testing
- 11:04 AM Feature #1412 (Resolved): qa: spec out messenger testing
- we now have a high-level plan on how to attack msgr testing.
- 10:03 AM Bug #1977: mon: ceph command hang
- Pretty sure you pushed changes the day you filed it (note reference in previous message), although I can't find the e...
- 09:51 AM rgw Bug #2106: failed s3tests.functional.test_s3.test_100_continue
- Strange, I can see the request in the apache logs, but not in the rgw logs....
- 09:12 AM Bug #2107 (Resolved): teuthology: lost_unfound fails pg state assert
- ubuntu@teuthology:/a/nightly_coverage_2012-02-27-a/14063...
- 04:56 AM Revision 402ece5e (ceph): init-ceph: stick with /var/run for the time being
- /run isn't present on older systems. Stick with the old location until it
is more pervasive, or we add an autoconf o...
- 04:47 AM Revision 41295b58 (ceph): debian: /var/run/ceph -> /run/ceph
- /run/ceph should exist for creating UNIX domain sockets
ceph uses UNIX domain sockets for internal communication. Cr...
- 04:45 AM Revision 0d8b5756 (ceph): debian: build-{indep,arch}
- Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu>
- 04:45 AM Revision 3ad6ccb4 (ceph): debian: sdparm|hdparm, new standards version
- Signed-off-by: Laszlo Boszormenyi <gcs@debian.hu>
- 01:09 AM Revision 9afafdf1 (ceph): move peer to separate test for now
02/26/2012
- 08:56 PM Bug #1977: mon: ceph command hang
- Hmm, I wonder if I somehow misdiagnosed this, or inadvertently fixed it: haven't seen this hang in weeks, and it happen...
- 05:09 PM rgw Bug #2106 (Resolved): failed s3tests.functional.test_s3.test_100_continue
- ...
- 05:02 PM Bug #2022: osd: misdirectect request
- ubuntu@teuthology:/a/nightly_coverage_2012-02-26-a/13876$ grep WRN ceph.log
2012-02-26 01:18:03.166529 osd.1 10.3.1...
- 11:19 AM Bug #2105 (Resolved): filestore: mkfs does not create initial snap
- This bug is almost the same as this bug: http://tracker.newdream.net/issues/1707
I followed the instruction:http://ceph....
- 05:35 AM Revision 6295578f (ceph): lost_unfound: do peer after, until wait_for_clean propagates last_epoch...
- The peer task does wait_for_clean, and then lost_unfound immediately marks
something down. But the PGs become clean ...
- 05:05 AM Revision 84cd4ed6 (ceph): peer: wait for peering to complete, or block
- We need to wait for peering to either complete, or block because it is
waiting for another PG. _Then_ look at all th...
02/25/2012
- 09:33 PM Bug #2104 (Won't Fix): teuthology: wait_for_clean doesn't wait for last_epoch_started to propagate
- 09:06 PM Bug #2103 (Resolved): osd: lockdep error on watch_lock
- ...
- 09:04 PM Bug #2102 (Can't reproduce): osd: pg stuck in backfill
- ...
- 05:39 AM Revision d944e7ee (ceph): fix lockdep.yaml conf syntax
- 01:01 AM Revision 266902a9 (ceph): rgw: initialize bucket_id in bucket structure
- might make valgrind a little bit less noisy.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 12:07 AM Revision 4a5a0911 (ceph): Merge branch 'master' of ssh://github.com/NewDreamNetwork/ceph
02/24/2012
- 11:32 PM Revision f8f6e4d8 (ceph): rgw: _exit(0) on SIGTERM
- We need to do something a bit smarter to get coverage information, but this
is a start.
Signed-off-by: Sage Weil <sa...
- 11:20 PM Revision 5d5a022c (ceph): run radosgw through valgrind for s3tests
- 11:05 PM Revision edbb41e1 (ceph): add peer task
- Force a pg to get stuck in 'down' state, verify we can query the peering
state, then start the OSD so it can recover.
- 11:04 PM Revision c9c1a4ab (ceph): do peer test along with lost_unfound
- 11:01 PM Revision b8739585 (ceph): peer: remove unused variable
- 10:56 PM Revision 62bda127 (ceph): misc: always return a usable result from get_valgrind_args
- 10:56 PM Revision e4801819 (ceph): rgw: simplify valgrind args
- 09:52 PM Revision 708be0a5 (ceph): Merge remote branch 'gh/wip-crush-adjust'
- Reviewed-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
- 09:48 PM Revision b0feba56 (ceph): Merge remote branch 'gh/wip-mds-resetter'
- Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 09:43 PM Revision 5c6e8b37 (ceph): Merge branch 'wip-pg-query'
- Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
- 09:22 PM Revision 008ce6b2 (ceph): Merge branch 'stable'
- 09:00 PM Revision 732f3ec9 (ceph): v0.42.2
- 09:00 PM Revision 321ba67f (ceph): Merge remote-tracking branch 'gh/stable' into stable
- 08:54 PM Revision be761149 (ceph): Merge branch 'stable'
- 08:49 PM Revision fc531a91 (ceph): rename valgrind -> verify, add in runs under lockdep
- 08:42 PM Revision c43e87d1 (ceph): ceph_manager: list_pg_missing
- List missing objects for the given pgid.
- 08:42 PM Revision 7ac04a42 (ceph): lost_unfound: list missing/unfound for each pg and verify the unfound c...
- This also tests the pg list_missing functionality.
- 08:40 PM Revision d85ed91c (ceph): osd: fix array index
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:40 PM Revision 722e9e59 (ceph): lockdep: don't make noise on startup
- Who cares!
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:40 PM Revision fdaed0a7 (ceph): formatter: fix trailing dump_stream()
- Flush a previous dump_stream() if it was the last thing prior to a
close_section().
Signed-off-by: Sage Weil <sage.w...
- 08:05 PM Revision 7bf64b73 (ceph): rgw: accept dict
- e.g.,
tasks:
...
- rgw:
    client.0:
    client.1:
- 08:05 PM Revision e2ea73d1 (ceph): rgw: add valgrind support
- tasks:
- ceph:
- rgw:
    client.a:
      valgrind: [--tool=memcheck]
- 08:05 PM Revision 7af6e46c (ceph): ceph: always try to process valgrind logs
- Check for errors in valgrind logs even if there is no valgrind option
the ceph task config stanza. Other tasks can r...
- 08:05 PM Revision 90fdc840 (ceph): ceph: always create valgrind logs dir
- Other tasks use it too. It's more annoying to conditionally create it.
- 08:05 PM Revision 9ec04722 (ceph): refactor all valgrind users to use a get_valgrind_args() helper
- This avoids much annoying, duplicated code.
- 08:05 PM Revision 3bfb8d69 (ceph): ceph, ceph-fuse: simplify valgrind argument additions
- 08:05 PM Revision c93a08ed (ceph): Whitespace and unnecessary formatting fixes
- 08:04 PM Revision 7ad35ce4 (ceph): osd: include timestamps in state json dumps
- Include the time we entered this state in the dump.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:00 PM Revision 185c6b1f (ceph): Merge branch 'wip-2007'
- Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
- 07:59 PM Revision e22adac2 (ceph): osd: use blocks for readability in list_missing
- Signed-off-by: Sage Weil <sage@newdream.net>
- 07:33 PM Revision e22a45a1 (ceph): osd: query recovery state machine
- For now, just append this to the end of the pg <pgid> query json dump.
We definitely want to do something smarter her...
- 07:33 PM Revision a7c8bfbe (ceph): osd: query Peering substates
- Signed-off-by: Sage Weil <sage@newdream.net>
- 07:33 PM Revision 6d90a6dd (ceph): osd: dump recovery_state states in json
- Use a formatter. Present a vector of states, inner to outer.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:24 PM Revision d3b203af (ceph): osd: add tunable for number of records in osd command replies
- e.g., 'pg <pgid> list_missing [offset]'.
Signed-off-by: Sage Weil <sage@newdream.net>
- 07:24 PM Revision 0361a3c4 (ceph): osd: pass in data to do_command
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:24 PM Revision 2677c72f (ceph): add libjson_spirit.la
- This is lightweight and relies on boost spirit, which we already use, so
there are no new dependencies.
There were s...
- 07:24 PM Revision 6c257c4d (ceph): hobject_t: decode json
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:24 PM Revision 91fbc687 (ceph): osd: 'pg <pgid> list_missing <json hobject_t offset>'
- Dump missing objects in json. If more key is non-zero, user should ask for
more by passing the last object as the of...
- 07:24 PM Revision c9416e61 (ceph): osd: 'tell osd.N mark_unfound_lost revert' -> 'pg <pgid> mark_unfound_l...
- More consistent interface.
Fixes: #2030
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
Reviewed-by: Josh Durgin ...
- 07:15 PM Revision 64038524 (ceph): lockdep: warn on stderr (via derr), not stdout
- Otherwise we screw up ceph-conf output and the like.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:15 PM Revision 804f243b (ceph): do_autogen.sh: -T for --without-tcmalloc
- Signed-off-by: Sage Weil <sage@newdream.net>
- 03:30 PM Feature #2054 (Resolved): teuthology: run radosgw through valgrind
- ok, this now works with yaml like...
- 01:52 PM Feature #2006 (Resolved): osd: report what is blocking peering completion
- commit:5c6e8b3795d0cf58814619bfc15cb0841e9a4f17
- 01:51 PM CephFS Bug #1792 (Can't reproduce): crash in ceph-mds
- even if we could, we would never know, since there isn't any distinguishing info here, and the teuth archive is gone.
- 01:48 PM RADOS Bug #2096 (Resolved): crush: adjust weight broken for tree, list buckets
- commit:708be0a5abef63a5da8409ad13719adb7bb744f8
- 01:47 PM RADOS Feature #2101 (Resolved): crushtool: check for weight overflow on reweight
- 11:56 AM Feature #2007 (Resolved): osd: enumerate unfound, lost objects, possible locations
- 09:52 AM Feature #2007: osd: enumerate unfound, lost objects, possible locations
- wip-2007
- 11:34 AM Feature #2030 (Resolved): osd: clean up mark_unfound api
- 10:34 AM Messengers Feature #2100 (Resolved): msgr: Prevent throttled clients from slowing down non-throttled connect...
- Right now, it seems a throttled connection will still receive a TCP receive buffer's worth of data, but because the u...
- 09:15 AM Linux kernel client Bug #2099: messenger: unexpected socket state (4)
- I don't think any of these other states are necessarily problematic, as long as the socket eventually ends up in CLOS...
- 08:49 AM Linux kernel client Bug #2099: messenger: unexpected socket state (4)
- This may be related to http://tracker.newdream.net/issues/1803 and http://permalink.gmane.org/gmane.comp.file-systems...
- 08:33 AM Linux kernel client Bug #2099: messenger: unexpected socket state (4)
- Adding that I see more of the same WARNING() messages in the log for
the same state, as well as others for state 5, ...
- 08:13 AM Linux kernel client Bug #2099 (Rejected): messenger: unexpected socket state (4)
- Running tests defined by the YAML file below. Note that branch
wip-messenger is 107a8aaf21d01ee6cbc7a638faf1328f2bd...
- 07:59 AM CephFS Bug #2092: BUG at fs/ceph/caps.c:999
- mdsc->mutex protects the globalish mds client state (request/session lists), which is different from session->s_mutex...
- 06:57 AM CephFS Bug #2092: BUG at fs/ceph/caps.c:999
- Just a quick look at this.
Here's the code:
static void __queue_cap_release(struct ceph_mds_session *session,
...
- 06:10 AM Bug #2091 (Can't reproduce): corrupt v5 inc osdmap
- logs don't go far enough back. :(
moral of the story: next time grab the full mon data dir immediately in case it...
- 05:57 AM Linux kernel client Bug #1907 (Resolved): rbd: don't reuse device ids while they're still in use elsewhere
- Committed a couple of weeks ago and has seen no bad effect during the
intervening testing. So I'm marking this one ...
- 04:22 AM Revision 5efa821c (ceph): rgw: swift read acls allow bucket listing
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 04:11 AM Revision f09fb870 (ceph): rgw: fix swift acl enforcement
- we'll also need to make it so that swift read acls allow bucket listing
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdr...
- 04:09 AM Revision d40a9b27 (ceph): lost_unfound: new mark_unfound_lost syntax
- 02:58 AM Revision 7c7349ef (ceph): ceph: fix help.t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 02:48 AM Revision 730b9ee0 (ceph): ceph-dencoder: man page
- Signed-off-by: Sage Weil <sage@newdream.net>
- 02:48 AM Revision f6e42a8b (ceph): ceph.spec.in: add ceph-dencoder
- Signed-off-by: Sage Weil <sage@newdream.net>
- 02:48 AM Revision 0281f1c6 (ceph): debian: add ceph-dencoder
- Signed-off-by: Sage Weil <sage@newdream.net>
- 02:48 AM Revision c3e1291d (ceph): v0.42.1
- 02:13 AM Revision cbf79a97 (ceph): ceph-tool: remove reference to "stop" command
- This doesn't exist any more, and I don't think it
ever "cleanly shut down the filesystem" -- certainly not
within my ...
- 02:13 AM Revision 3bad945b (ceph): mds: remove unused MDBalancer dump_pop_map() function.
- Commenting it out is not the right answer. ;)
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
Reviewed-by:...
- 01:22 AM Revision 4dfec574 (ceph): rgw: enforce swift acls
- doesn't work yet, but almost.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 01:07 AM Revision 81a46c46 (ceph): dump_stuck: flush stats before waiting for recovery/clean
- 12:35 AM Revision 159f2b86 (ceph): mds: fix Resetter locking
- We need to hold the lock for ms_dispatch, esp calls into objecter. We
should only drop it when blocking; use distinc...
- 12:35 AM Revision 065d6dd8 (ceph): mds: clean up useless block
- Signed-off-by: Sage Weil <sage@newdream.net>
02/23/2012
- 11:34 PM Revision f5bf9d9c (ceph): rgw: s3 only shows s3 acls
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 11:33 PM Revision c88da93e (ceph): Merge remote branch 'origin/wip-mds-old-inodes'
- Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 11:06 PM Revision 5aa60ce4 (ceph): Merge remote branch 'origin/wip-dencoder'
- Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 11:06 PM Revision db99217b (ceph): Merge remote branch 'origin/wip-1820'
- Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 11:05 PM Revision e8bc42ff (ceph): osd: only set CLEAN when we are not remapped (up == acting)
- If we have a temporary mapping for this PG, consider that unclean. This
makes CLEAN and REMAPPED mutually exclusive....
- 10:59 PM Revision 4d1d5229 (ceph): rgw: show swift ACLs
- 10:56 PM Revision d8df5655 (ceph): Merge remote-tracking branch 'gh/wip-pg-query'
- Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
- 10:55 PM Revision ddc99983 (ceph): osd: conditionally encode old pg_pool_t when no CEPH_FEATURE_OSDENC
- This fixes OSDMap compatibility between v0.42 and <v0.42.
For MOSDMap, reencode maps if OSDENC feature is missing. ...
- 10:38 PM Revision cd9f7df9 (ceph): Merge remote-tracking branch 'gh/wip-dump-ops-in-flight'
- Reviewed-by: Sage Weil <sage@newdream.net>
- 10:28 PM Revision 079dd6db (ceph): mon: mds "stop" -> "deactivate"
- See #1820.
Signed-off-by: Sage Weil <sage@newdream.net>
- 10:28 PM Revision a1544c0e (ceph): doc: 'deactivate mds' instead of 'stop mds'
- Signed-off-by: Sage Weil <sage@newdream.net>
- 10:28 PM Revision d85e9153 (ceph): mon: use pending_mdsmap for deactivate
- We should always look at the proposed map to avoid weird races.
Signed-off-by: Sage Weil <sage@newdream.net>
- 09:56 PM Revision 2824c07f (ceph): rgw: can use swift to set bucket permissions
- Currently only setting, not reading. Also, at the moment it's
setting the wrong permissions.
Signed-off-by: Yehuda S...
- 08:12 PM Revision 700fe079 (ceph): test: add basic test for the OSD's dump_ops_in_flight adminsocket command
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 08:12 PM Revision 5944016b (ceph): osd: add "dump_ops_in_flight" to the AdminSocket.
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 08:08 PM Revision 21c43133 (ceph): mon: refuse to stop mds if max_mds will make it rejoin
- Otherwise the MDS will leave the cluster and immediately rejoin, which is
useless and confusing to users. See #1820....
- 08:07 PM Feature #2030: osd: clean up mark_unfound api
- wip-2030
- 07:53 PM Revision 7700ea94 (ceph): crushtool: add --reweight-item cli tests
- Test list, tree, and straw buckets.
Signed-off-by: Sage Weil <sage@newdream.net> - 07:39 PM Revision 286df2db (ceph): crush: fix weight adjust for list, tree buckets
- Fix the typo. Code now matches that for straw buckets.
Reported-by: ZhuRongze <zrz4ceph@gmail.com>
Signed-off-by: S... - 07:16 PM Revision 963dec82 (ceph): Merge branch 'wip-2090'
- Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 07:15 PM Revision d1fe2f8f (ceph): mon: deprecate mon 'stop' command
- Send SIGTERM.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 07:15 PM Revision 62a113aa (ceph): mon: unlock mon before msgr shutdown
- The ceph_mon.cc main() will delete mon when the msgr dispatch thread
completes. Make sure we unlock before we shut d... - 07:14 PM Revision 962aa3ea (ceph): msgr: join dispatch_thread after it completes
- This is just for completeness. No change in behavior, since we don't
get here until the thread has signaled it is do... - 07:04 PM Revision d8192222 (ceph): Merge remote-tracking branch 'gh/wip-stop'
- 06:52 PM Messengers Bug #2086: msgr: msg/SimpleMessenger.h: 203: FAILED assert(!i->second->is_on_list())
- it did. probably a race with another thread in connect() or accept() reregistering a new Pipe.. connect() pbly
- 06:47 PM Messengers Bug #2086: msgr: msg/SimpleMessenger.h: 203: FAILED assert(!i->second->is_on_list())
- We sure this was run including commit:ebbfdefa120ae93b95780c67027ec9efd4b7b5cd?
- 05:51 PM Revision 86a54a6e (ceph): filestore: use IOC_CLONERANGE instead of IOC_CLONE ioctl
- This is functionally equivalent, except that valgrind doesn't complain
about a bad pointer passed to an ioctl.
Signe... - 05:43 PM Revision 49588e94 (ceph): osd: drop "stop" command
- Send SIGINT.
Fixes: #1820
Signed-off-by: Sage Weil <sage@newdream.net> - 05:42 PM Revision 560ddf46 (ceph): osd: drop unused "stop" check
- This is never reached: both callers handle "stop" explicitly.
Signed-off-by: Sage Weil <sage@newdream.net> - 05:39 PM Revision 64ca584d (ceph): osd: don't complete recovery if unfound
- Otherwise we fail the !needs_recovery() assert. Because we aren't
recovered. For example,
2012-02-21 16:16:13.1046... - 04:38 PM Feature #2006 (In Progress): osd: report what is blocking peering completion
- wip-pg-query
- 04:07 PM Bug #2098 (Resolved): xfs/ext4 non-idempotent transaction
- Forcing a sync after a non-idempotent transaction is not adequate to ensure correctness during journal replay.
Con... - 03:36 PM Bug #1820 (Resolved): deprecate "ceph stop"
- 02:37 PM Bug #1820: deprecate "ceph stop"
- ok, tested all this in wip-1820. 'deactivate' already moves the ceph-mds to standby (not exit), all good there.
n... - 11:30 AM Bug #1820: deprecate "ceph stop"
- yeah. i think the simplest is to make 'leave' refuse if it is < max_mds.
and we could drop max mds from the cep... - 11:22 AM Bug #1820: deprecate "ceph stop"
- Oh, I've talked of this before. It might be nice to have a "start ceph-mds only to process a leftover journal and han...
- 11:19 AM Bug #1820: deprecate "ceph stop"
- Changing docs is easy, and the branches already rip out "documented" commands. Let's just make it make sense.
I wo... - 11:04 AM Bug #1820: deprecate "ceph stop"
- It can easily go back into standby (via the respawn() -> execve() path) instead of shutting down. Then it's really "...
- 10:54 AM Bug #1820: deprecate "ceph stop"
- On termination the process exits. On receipt of a stop command it exports authority over the filesystem hierarchy to ...
- 10:52 AM Bug #1820: deprecate "ceph stop"
- Tommi Virtanen wrote:
> Greg, how is "ceph mds stop 0" different from that ceph-mds receiving a local request to ter... - 10:51 AM Bug #1820: deprecate "ceph stop"
- Greg, how is "ceph mds stop 0" different from that ceph-mds receiving a local request to terminate (e.g. SIGTERM)?
- 10:49 AM Bug #1820: deprecate "ceph stop"
- No, the important part is the hierarchy authority export. Then it shuts down; it's not a "go standby". I guess you co...
- 10:48 AM Bug #1820: deprecate "ceph stop"
- Which makes me think, is the concept of "go standby" of any value, if there's something that'll automatically say the...
- 10:44 AM Bug #1820: deprecate "ceph stop"
- It sounds like that does two things: move the MDS from active to standby, and terminate it. And we're removing the "r...
- 10:31 AM Bug #1820: deprecate "ceph stop"
- That one is a bit different.. it's instructing ceph-mds to export all of its metadata to another node and leave the ...
- 10:11 AM Bug #1820: deprecate "ceph stop"
- Yeah. I can't speak for the threading & locking changes, but the command removal is trivial.
That still leaves
... - 09:51 AM Bug #1820: deprecate "ceph stop"
- wip-stop and wip-2090
- 03:35 PM Bug #2095 (Resolved): osd: need feature bit for v0.42 osdmap encoding change
- commit:ddc99983228e761f754e0038aecbe341d7e2181f
- 09:27 AM Bug #2095: osd: need feature bit for v0.42 osdmap encoding change
- we had a feature bit already, we just needed to conditionally encode the old format, and tweak MOSDMap to reencode ma...
- 03:16 PM Bug #2094 (Resolved): osd: pgs remapped to down+out osd
- making remapped and clean mutually exclusive. commit:e8bc42ff435e5648b88b818775d8fa47989af5dc
- 10:43 AM Bug #2094: osd: pgs remapped to down+out osd
- Reproduced again with stats flushing. This seems to happen every time with this configuration (maybe having only 2 os...
- 03:14 PM Bug #2091: corrupt v5 inc osdmap
- ok.. yeah, it looks like the monitor may have published a bad inc update or something? unclear. i'll check with the...
- 03:11 PM Bug #2091: corrupt v5 inc osdmap
- OK, picking a few things out of the original corruption report.
The basic header stuff is the same as before, as e... - 02:48 PM Feature #2015 (Resolved): osd: dump in-flight ops via admin socket
- 02:37 PM CephFS Feature #2097 (Rejected): mds: 'ceph mds activate <gid>'
- ability to explicitly instruct a standby mds to join the active cluster.
- 12:04 PM Messengers Bug #1985 (Won't Fix): msgr: creating new Pipe for pre-existing connection leaks Pipe if they don...
- at least until we demonstrate the problem (after the msg leak fix). this will probably be moot after refactoring som...
- 12:01 PM RADOS Bug #2096: crush: adjust weight broken for tree, list buckets
- wip-crush-adjust
- 10:48 AM RADOS Bug #2096 (Resolved): crush: adjust weight broken for tree, list buckets
- ...
- 11:25 AM Bug #2090 (Resolved): mon: assertion failed on shutdown
- commit:963dec82880717054c760a745cf93cc7b43112df
- 09:06 AM Bug #2080 (Resolved): osd: scrub on disk size does not match object info size
- 05:24 AM Revision 3628f901 (ceph): mds: make EMetaBlob::fullbit::old_inodes non-ptr
- No need to put this separately on the heap, as a static map<> isn't much
more expensive than a pointer. Also, this e... - 05:21 AM Revision 7842bb50 (ceph): mds: Add old_inodes to emetablob
- Add information about old inodes to the mds journal.
Signed-off-by: Alexandre Oliva <oliva@lsd.ic.unicamp.br>
Signed... - 05:08 AM Revision 26b56753 (ceph): Fix ceph-mds --journal-reset
- Complete configuration initialization for special actions, and
hold Resetter lock while running reset.
Signed-off-by...
02/22/2012
- 10:12 PM Linux kernel client Cleanup #2093: ceph-client: messenger: the "to" parameter to read_partial() needs to go
- I think it's right as is... all of those read calls are non-blocking. So the first time around in_base_pos is 0 and ...
- 05:28 PM Linux kernel client Cleanup #2093 (Resolved): ceph-client: messenger: the "to" parameter to read_partial() needs to go
- I have been doing some refactoring of the net/ceph/messenger.c. One of
my aims was to understand the how (and why) ... - 09:33 PM Bug #2091: corrupt v5 inc osdmap
- the first badness in the log is below. once it missed one incremental, things probably got out of sync and the pg_te...
- 09:28 PM Bug #2091: corrupt v5 inc osdmap
- Oh.. that means the pg_temp mapping was inserted by a previous inc map, probably. we need to find the first instance...
- 06:23 PM Bug #2091: corrupt v5 inc osdmap
- I've manually decoded the entire ceph_osdmap dumped in the log and everything
therein looks fine. (This was overkil... - 01:20 PM Bug #2091: corrupt v5 inc osdmap
- I'm starting to look at this in detail but haven't concluded what went wrong yet.
Does it matter whether it was th... - 09:33 AM Bug #2091: corrupt v5 inc osdmap
- reencoded to old format (using latest ceph-dencoder) gives us...
- 09:28 AM Bug #2091 (Can't reproduce): corrupt v5 inc osdmap
- ...
- 09:20 PM Bug #2090: mon: assertion failed on shutdown
- ...
- 09:20 PM Bug #2090: mon: assertion failed on shutdown
- wip-2090
- 05:04 AM Bug #2090 (Resolved): mon: assertion failed on shutdown
- I was running repeated cycles of the kernel_untar_build.sh workunit
to try to reproduce a problem in the client and ... - 09:17 PM Bug #2095 (Resolved): osd: need feature bit for v0.42 osdmap encoding change
- 07:02 PM Bug #2094 (Resolved): osd: pgs remapped to down+out osd
- This is why the dump_stuck test fails on master. When one osd is marked out, the pg is remapped incorrectly:...
- 10:06 AM Feature #2005 (Resolved): mon: track timestamps on pg states
- 10:06 AM Feature #2058 (Resolved): ceph: query pg state
- 10:03 AM Feature #2054: teuthology: run radosgw through valgrind
- wip-valgrind
- 09:45 AM CephFS Bug #2092 (Can't reproduce): BUG at fs/ceph/caps.c:999
- ...
- 09:36 AM Bug #2022: osd: misdirected request
- hit this again:...
- 01:11 AM Revision 761ecc69 (ceph): Makefile: include encoding check scripts in dist tarball
- This makes 'make distcheck' happy. Well, more happy at least; it's still
cranky but I can't tell why.
Signed-off-by... - 12:21 AM Revision 52a52cf4 (ceph): Add test for 'ceph pg dump_stuck'
02/21/2012
- 11:44 PM Revision a6c7f999 (ceph): ceph-dencoder: man page
- Signed-off-by: Sage Weil <sage@newdream.net>
- 11:44 PM Revision cd5a8f7e (ceph): ceph.spec.in: add ceph-dencoder
- Signed-off-by: Sage Weil <sage@newdream.net>
- 11:44 PM Revision 7fab4fa0 (ceph): debian: add ceph-dencoder
- Signed-off-by: Sage Weil <sage@newdream.net>
- 11:24 PM Revision 8c48a8e0 (ceph): rgw: read correct acls for swift metadata update ops
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 11:12 PM Revision 995dc1f7 (ceph): Add a task for testing stuck pg visibility.
- 11:12 PM Revision 2a1c74c5 (ceph): Move duration calculation to an internal task
- This excludes all generic start up costs, like waiting for locks,
rebooting into a new kernel, etc. - 11:08 PM Revision e67c0ff0 (ceph): osd: make object_info_t::dump using hobject_t and object_locator_t dum...
- Makes the output more readable.
Signed-off-by: Sage Weil <sage@newdream.net> - 11:04 PM Revision eb434a50 (ceph): Add necessary imports for s3 tasks, and keep them alphabetical.
- 11:04 PM Revision 1ac4bb10 (ceph): Add necessary imports for s3 tasks, and keep them alphabetical.
- 10:46 PM Revision f7feded0 (ceph): Merge remote-tracking branch 'gh/wip-dump-stuck-pgs'
- Reviewed-by: Sage Weil <sage@newdream.net>
- 10:44 PM Revision 04c8e01d (ceph): Merge remote-tracking branch 'gh/wip-osd-write'
- Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
- 10:43 PM Revision 55a60651 (ceph): osdmap: dump embedded crush map in Incremental::dump()
- Signed-off-by: Sage Weil <sage@newdream.net>
- 10:39 PM Revision 2365c77a (ceph): rgw: maintain separate policies for object and bucket
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 10:39 PM Revision cc78fdaa (ceph): Merge branch 'wip-crush'
- Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
- 10:37 PM Revision d2335fab (ceph): crush: write CrushWrapper::dump()
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:58 PM Revision 174f6b84 (ceph): osd: refuse to return data payload if request wrote anything
- Write operations aren't allowed to return a data payload because
we can't do so reliably. If the client has to resend... - 09:58 PM Revision 27c8a3f4 (ceph): test/rados-api/misc: fix LibRadosMisc.Operate1PP test
- It's a mutation, so we get a result of 0 (or error).
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 09:51 PM Revision 270bb5cf (ceph): Merge branch 'wip-osdmap'
- Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
- 09:50 PM Revision 7cafa255 (ceph): osdmap: dump fullmap from dump()
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:43 PM Revision 80d86306 (ceph): Merge branch 'wip-1821'
- Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
- 08:23 PM Revision 11073e50 (ceph): s3roundtrip, s3readwrite: access key uses url safe chars
- Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
- 08:18 PM Revision 0e4367aa (ceph): rgw: accepted access key chars should be url safe
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 08:12 PM Revision 6e1b3a56 (ceph): rgw: access key uses url safe chars
- Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
- 08:12 PM Revision 92110e5a (ceph): rgw: access key uses url safe chars
- Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
- 06:02 PM Revision df5f5738 (ceph): add valgrind collection to regression suite
- Run a smaller set of tests with valgrind on the mon, osd, and mds.
Valgrind is currently ignoring leaks, but this wi... - 05:29 PM Revision 17d38700 (ceph): rgw: don't invalidate cache when adding xattrs
- 04:58 PM rgw Cleanup #2089 (Resolved): rgw: less dout(0) noise?
- i think that's where this is coming from:...
- 03:32 PM Feature #1932 (Resolved): mon: before accepting a new crushmap, monitor should validate and test ...
- 03:31 PM Feature #2088 (Rejected): msgr: refactor 2 threads to one
- 03:30 PM Feature #1412 (New): qa: spec out messenger testing
- 03:29 PM Feature #1412: qa: spec out messenger testing
- er, wrong bug!
- 12:22 PM rgw Bug #2083 (Resolved): rgw: test_object_raw_authenticated* fail (on xfs?)
- Should be fixed now. Updated relevant teuthology tests to use only url safe chars. Also updated rgw-admin to disallow...
- 10:34 AM rgw Bug #2083: rgw: test_object_raw_authenticated* fail (on xfs?)
- Not really related to xfs. The problem is that when generating authenticated urls, boto doesn't escape the access key...
- 10:55 AM Feature #2087 (Resolved): lightweight filestore workload generator
- simple program that uses FileStore and generates something that looks vaguely like what an OSD does. e.g.,
- stre... - 09:13 AM Bug #2084: segfault in tcmalloc
- and again (hammer b.yaml). right before the crash sched_scrub() was called......
- 04:40 AM Revision cedb3d73 (ceph): ceph: if 'pg <pgid> ..' doesn't parse a pgid, send to mon
- E.g., 'pg dump'. Sigh.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 12:03 AM Revision 9927671b (ceph): Makefile: fix misplaced unit tests
- These weren't run on make check because they were defined in the wrong
spot.
Signed-off-by: Sage Weil <sage@newdream... - 12:03 AM Revision 1ff75684 (ceph): hobject_t: remove unused back_up_to_bounding_key()
- This was a path not taken in the backfill code.
Signed-off-by: Sage Weil <sage@newdream.net>
02/20/2012
- 11:17 PM Revision c5688e65 (ceph): ceph: valgrind trumps coverage when picking a flavor
- valgrind will crash if we don't use notcmalloc; coverage will silently
fail to collect coverage info. - 10:54 PM Revision 5216d3c7 (ceph): ceph.conf: no lockdep by default
- 10:41 PM Revision 4d3de038 (ceph): osd: sched_scrub() outside of map_lock
- Inside sched_scrub() we call _lookup_lock_pg(), which takes
map_lock.get_read(). That's technically okay because RWL... - 10:38 PM Revision 0b7f6e39 (ceph): global: resurrect lockdep
- Add 'lockdep' config option, and initialize g_lockdep from that in
global_init().
Signed-off-by: Sage Weil <sage@new... - 09:38 PM Revision 5f9445c8 (ceph): suite.results: include test duration in output
- 09:00 PM Revision 44320370 (ceph): mon: disable pg_num adjustment
- Until #1515 is fixed/reimplemented.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 08:49 PM Revision 84bd876c (ceph): cfuse -> ceph-fuse
- 07:02 PM Revision 7d3ae375 (ceph): mon: use encode function for new Incremental
- When we encode an Incremental, use the encode wrapper function, so that
we can capture the encoded struct when buildi... - 06:56 PM Revision a4f2fdb5 (ceph): osdmap: add Incremental::dump()
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:56 PM Revision 1e407b4f (ceph): ceph-dencoder: add OSDMap::Incremental
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:56 PM Revision ebd29b65 (ceph): qa/btrfs/test_rmdir_async_snap
- Attempt to reproduce btrfs bug when rmdirs race with an async snap.
Unsuccessful. Best guess is that we need multipl... - 06:56 PM Revision f3020c4a (ceph): osdmap: use FEATURE encoder macro
- This generates encode/decode functions that pass feature bits into the
encoder, allowing us to encode old formats.
S... - 06:56 PM Revision f3a273a6 (ceph): osdmap: successfully decode short map
- When we send (old) maps to the kclient, we omit the extended section.  Let's
decode those (old, abbreviated maps) succ... - 05:40 PM Revision 76cc71b2 (ceph): osd: don't count SNAPDIR as a clone during backfill
- When we are backfilling, we add in objects as we push them. Do not count
the snapdir object as a clone, or else we'l... - 04:19 PM Messengers Bug #2086 (Resolved): msgr: msg/SimpleMessenger.h: 203: FAILED assert(!i->second->is_on_list())
- ...
- 03:12 PM Revision 71d0d97a (ceph): cfuse -> ceph-fuse
- 03:04 PM Revision 7ff9f044 (ceph): ceph: allow valgrind per-type (not just per-name)
- 02:54 PM Linux kernel client Cleanup #2085 (New): kclient: improve mtime update in page_mkwrite
- this should be done in the various helpers we call when we successfully mark a page dirty, not in the outer function.
- 02:40 PM Revision 24b470a9 (ceph): crush: fix CrushCompiler warning
- warning: crush/CrushCompiler.cc:595: ‘r’ may be used uninitialized in this function
Signed-off-by: Sage Weil <sage.w... - 02:29 PM Bug #1765 (Resolved): osd: 'call' op can return data even if op is modifying
- commit:afc1748db52911295708e4afbe7fd7884c97dbbf
- 02:28 PM Revision d74e0294 (ceph): test/encoding/readable.sh: sh, not dash
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 02:27 PM Bug #1821 (Resolved): librados: rados_create_with_context is unusable
- we could still add refcounting to the CephContext later.
- 02:27 PM Revision e33bf5af (ceph): crushtool: fix clitests
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 02:24 PM Bug #2084 (Can't reproduce): segfault in tcmalloc
- heap corruption?...
- 01:52 PM Linux kernel client Bug #2081: msgr: spinlock badness?
- ubuntu@teuthology:/a/nightly_coverage_2012-02-20-b/12984 with same trace on the console.
- 01:10 PM Bug #2080: osd: scrub on disk size does not match object info size
- 08:48 AM Bug #2080: osd: scrub on disk size does not match object info size
- reproduced with log. metropolis:~sage/bug-2080
- 06:20 AM Bug #2080: osd: scrub on disk size does not match object info size
- ubuntu@teuthology:/a/master-2012-02-19_19:50:05/12884
- 08:31 AM Cleanup #2021 (Resolved): fix signal handlers
- 06:29 AM rgw Bug #2083 (Resolved): rgw: test_object_raw_authenticated* fail (on xfs?)
- This fails sometimes, but not always. It seems to happen more often on xfs, but maybe that's my imagination....
- 03:40 AM Revision eb93fa74 (ceph): lost_unfound: mark osds in when we revive them
- so that we test what we meant to. It also lets us actually go clean at the
very end. - 03:37 AM Revision 0429aa79 (ceph): msgr: fix shutdown race again
- Only unlock once. Sigh.
Signed-off-by: Sage Weil <sage@newdream.net> - 03:36 AM Revision d6de0bb8 (ceph): Merge branch 'stable'
02/19/2012
- 11:30 PM Revision b205c64c (ceph): v0.42
- 10:52 PM Revision 76e88d10 (ceph): msgr: fix accept shutdown race fault
- Need to hold pipe_lock.
Signed-off-by: Sage Weil <sage@newdream.net> - 10:50 PM Revision ca04ee13 (ceph): mon: test injected crush map
- Run a bunch of inputs through an injected crush map to make sure it isn't
broken.
Fixes: #1932
Signed-off-by: Sage W... - 10:48 PM Revision 5dd24f9f (ceph): crush: move crushtool --test into CrushTester
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 10:44 PM Revision e42a0e9f (ceph): crush: move (de)compile into CrushCompiler class
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:44 PM Revision 2c2b3881 (ceph): mon: fix message discard on shutdown
- Return true, so the messenger is happy, and drop the message reference.
Avoids an assert like
2012-02-19T12:36:05.1... - 08:08 PM Revision 4dd8c354 (ceph): crush: uninline encode/decode
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:59 PM Revision 6b5be276 (ceph): crush: cleanup: use temp var for curstep
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 03:52 PM Bug #2082 (Resolved): osd: broken queuing during replay
- ...
- 03:49 PM Bug #1638 (Won't Fix): Can't create object with large xattrs in a single operation (on extN)
- 03:48 PM CephFS Bug #2018 (Resolved): mds: can't change file_max
- oh, i fixed this a week or two ago. the problem was that the file isn't open read/write, but Client was still trying ...
- 03:46 PM Bug #2032 (Resolved): paxos: somehow didn't update stash alongside new states
- 03:45 PM Bug #2044 (Resolved): osd: pg stuck in active+backfill
- 03:45 PM Feature #1412 (Can't reproduce): qa: spec out messenger testing
- this code has been refactored a bit.
the messenger tests won't directly trigger this, though we may the/an under... - 03:45 PM Bug #1631 (Can't reproduce): osd: failed assert(repop_queue.front() == repop)
- this code has been refactored a bit.
the messenger tests won't directly trigger this, though we may the/an under... - 03:41 PM Revision ff5178c8 (ceph): mds: use want_state to indicate shutdown
- State gets DNE when we receive the first map. And want_ makes more sense
anyway. Fixes MDS startup.
Signed-off-by:... - 03:40 PM Feature #1932: mon: before accepting a new crushmap, monitor should validate and test some inputs
- wip-crush
- 02:51 PM Bug #2080: osd: scrub on disk size does not match object info size
- ...
- 06:49 AM Revision 15016f02 (ceph): ceph: direct 'pg <pgid> ...' to primary osd for given pgid
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:49 AM Revision ffddb349 (ceph): osd: dispatch 'pg <pgid> ...' commands to PG::do_command()
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:49 AM Revision 481e629c (ceph): osd: implement 'pg <pgid> query'
- Dump a blob of json about the pg state.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 06:17 AM Revision 344c2022 (ceph): osd: fix up argument to PG::init()
- Commit cefa55b288b40e17ade9875493dd94de52ac22bf moved PG initialization
into init(), but passed acting for both up an... - 06:12 AM Revision 10016923 (ceph): mds: ignore all msgr callbacks on shutdown, not just dispatch
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:12 AM Revision 1f5e446d (ceph): msgr: promote SimpleMessenger::Policy to Messenger::Policy
- This is part of the generic interface, not specific to the implementation.
Signed-off-by: Sage Weil <sage.weil@dream... - 06:12 AM Revision 2500a9b6 (ceph): SimpleMessenger: drop unused sigint()
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:41 AM Revision 1f240ca4 (ceph): mon: discard messages while shutting down
- Add SHUTDOWN state. Ignore any msgr callbacks if set.
Fixes crash like
2012-02-18T21:57:58.912 INFO:teuthology.tas...
02/18/2012
- 11:13 PM Linux kernel client Bug #2081 (Can't reproduce): msgr: spinlock badness?
- captured this console fragment from a crashed qa run...
- 10:57 PM Bug #2070 (Duplicate): osd/ReplicatedPG.cc: 3627: FAILED assert(is_active())
- ok i didn't observe this crash and trace it back, but i'm almost certain it's the same as #2075.
commit:344c202203... - 01:54 PM Bug #2070: osd/ReplicatedPG.cc: 3627: FAILED assert(is_active())
- ubuntu@teuthology:/a/nightly_coverage_2012-02-18-a/12494
- 10:56 PM Bug #2075 (Resolved): osd: recover_got assert
- commit:344c20220345197c03fbaf46e2c1289d81a0a14f
- 02:01 PM Bug #2075: osd: recover_got assert
- ubuntu@teuthology:/a/nightly_coverage_2012-02-18-a/12489...
- 10:44 PM Revision 45b6189b (ceph): ceph_manager: ignore stale states when counting
- also remove assumptions about ordering of states
- 10:28 PM Revision 787dd170 (ceph): msgr: fix shutdown vs accept race
- This is a kludge. The real fix is to rewrite SimpleMessenger as a state
machine.
Fixes: #2073
Signed-off-by: Sage W... - 10:28 PM Revision c3a509a0 (ceph): mds: drop all messages during suicide
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 10:01 PM Feature #2074 (Rejected): teuthology: remove old kernel packages
- i did this manually on sepia. new teuth will reimage regularly.
- 10:00 PM Revision fe0859aa (ceph): Merge remote branch 'gh/wip-pg-states'
- 09:56 PM Revision b5668cf6 (ceph): thrashing: whitelist 'objects unfound and apparently lost' message
- This can happen when we mark OSDs down... if the objects are found when
the osds come back up then we're fine. if no... - 09:24 PM Messengers Bug #2073 (Resolved): msgr: shutdown can hang
- this appears to be fixed with commit:787dd1709797876dd9fa6004c6723df859003b59, unless there is some subtle difference...
- 03:51 PM Feature #2034 (Resolved): osd: refactor push code
- 03:50 PM Feature #2058: ceph: query pg state
- wip-pg-query
- 02:15 PM Bug #2061 (Resolved): osd: scrub mismatch
- pretty sure this was fixed by the recover refactor.. haven't hit it since then.
- 01:48 PM Bug #2080 (Resolved): osd: scrub on disk size does not match object info size
- ...
- 05:53 AM Revision 196d4a1f (ceph): wait_till_clean -> wait_for_clean and wait_for_recovery
- Clean now also means the correct number of replicas, whereas recovered
means we have done all the work we can do give... - 12:34 AM Revision bcb5059b (ceph): PGMap: fix else indentation
- Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
- 12:34 AM Revision 449d8702 (ceph): PGMap: extract method for outputting plain pg stats
- Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
- 12:34 AM Revision c0ab63e7 (ceph): mon: constify functions needed to use dout from a const function
- Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
- 12:34 AM Revision c08615e6 (ceph): mon: add dump_stuck command
- This will help monitoring transient pg states at a coarse level.
Fixes: #2005
Signed-off-by: Josh Durgin <josh.durgi... - 12:34 AM Revision 806285f6 (ceph): mon: fix STUCK_STALE check
- Look at last_unstale if STALE bit is not set.
Signed-off-by: Sage Weil <sage@newdream.net> - 12:24 AM Revision 06a2202b (ceph): osd: only complete/deregister repop once
- It's now possible to send the ack and deregister the repop before the
op_applied() happens. And when that happens, w... - 12:24 AM Revision 9e309c49 (ceph): filestore: hold journal_lock during
- Hold journal_lock during replay so that we don't stomp on variables like
op_seq and open_ops that the commit thre... - 12:24 AM Revision fb31f631 (ceph): osd: don't update_stats() on prec_replica_info
- Nothing changes here...
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 12:24 AM Revision 6e89d9ca (ceph): osd: update_stats() in GetInfo state start
- This is the first stage of peering.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
02/17/2012
- 10:31 PM Revision c1db9009 (ceph): Merge branch 'next'
- 10:27 PM Revision 4925e9c6 (ceph): man: regenerate man pages
- Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
- 10:27 PM Revision 304389ca (ceph): man: move man page fixes to rst
- 83cf1b62fde525d068bc292c4a1ccc42199657ae and
e5f49104ab62ba7bc42cf6ecf41c9257b46585f7 updated the nroff output
but no... - 10:27 PM Revision a446f323 (ceph): doc: fix snapshot creation/deletion syntax in rbd man page (trivial)
- Creating a snapshot requires using "rbd snap create",
as opposed to just "rbd create". Also for purposes of
clarifica... - 10:18 PM Revision ff822fbf (ceph): PGMap: fix dump header fields
- kilobytes were removed from the output by
625b0b0291543baf424fb3bae4c7a36d280df91e, and last_scrub_stamp was
added by... - 10:18 PM Revision 9baa4b62 (ceph): PGMap: add last_state_change to dump output
- Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
- 10:18 PM Revision d373f716 (ceph): PGMap: add indent settings header
- Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
- 09:48 PM Revision 7837c19b (ceph): osd: make op_commit imply op_applied for purposes of repop completion
- For repop completion, we want waitfor_ack and _commit to be empty. For
replicas, a commit reply implies ack, so ack ... - 09:46 PM Revision d6c76745 (ceph): osd: add REMAPPED state
- Set this bit whenever up != acting. This tells you that the OSDMap is
explicitly remapping the PG to different nodes... - 09:19 PM Revision 8e6f9ca8 (ceph): osd: refactor recovery completion
- - rename is_all_update() -> needs_recovery(), reverse logic.
- drop up != acting check; that has nothing to do with
... - 06:56 PM Revision 8c0e184c (ceph): osd: introduce RECOVERING pg state
- Since clean now means not degraded, we need some other indication that
recovery has completed and we are "done" (give... - 06:23 PM Revision db41bdda (ceph): paxos: fix is_consistent() check
- If our last_committed == 1, we don't need a separate stash. This is the
logic that slurp() follows, so fix is_consis... - 05:17 PM Revision d913e5e6 (ceph): osd: change nested iterator name
- Don't shadow the iterator variable.
Signed-off-by: Tom Callaway <spot@redhat.com>
Signed-off-by: David Nalley <david... - 05:17 PM Revision 2325da86 (ceph): add missing #includes to build on gcc 4.7
- Signed-off-by: Tom Callaway <spot@redhat.com>
Signed-off-by: David Nalley <david@gnsa.us> - 05:17 PM Revision d938246c (ceph): mds: comment out unused code in mds dump_pop_map
- Signed-off-by: Tom Callaway <spot@redhat.com>
Signed-off-by: David Nalley <david@gnsa.us> - 04:26 PM Bug #1975: btrfs: EINVAL on snap create
- We aren't triggering this any more, now that the filestore transaction bug is fixed.
- 03:13 PM Bug #2061: osd: scrub mismatch
- oooooh, these went away and i was confused. but then i just ran the regression suite against next and hit them again...
- 01:22 PM Bug #2068 (Resolved): osd: FAILED assert(infoevt.info.history.last_epoch_started >= pg->info.hist...
- 12:46 PM Bug #2079 (Duplicate): rbd: creating a snapshot with the same name doesn't return an error
- ...
- 12:37 PM Cleanup #2078 (Resolved): ceph tool: only output response data to stdout
- By default, "ceph osd getmap" or any other command that fetches binary data outputs it to stdout. However, other info...
- 10:32 AM Bug #2077 (Resolved): mon: assert in Paxos::is_consistent
- we don't need a stash for v == 1. make is_consistent() check match slurp() logic. commit:db41bdda7e02aedc42d14be635...
- 09:41 AM Bug #2077 (Resolved): mon: assert in Paxos::is_consistent
- I tripped across a bug when adding a new monitor into an existing cluster
(see attached). I was on GIT commit
4b3bb... - 09:36 AM Bug #2076 (Resolved): ceph fails to build with gcc 4.7
- commit:d913e5e670282c19a35c6cb420fc1d711c388cc4
- 09:30 AM Bug #2076: ceph fails to build with gcc 4.7
- That is indeed fine.
Thanks! - 09:25 AM Bug #2076: ceph fails to build with gcc 4.7
- Committing these, with both of your signed-off-by's.. I assume that's okay?
- 08:13 AM Bug #2076 (Resolved): ceph fails to build with gcc 4.7
- Fedora has moved to gcc 4.7 for the upcoming Fedora 17 release[1].
Currently Ceph fails to build with gcc 4.7.
... - 05:00 AM Revision 07504607 (ceph): Merge branch 'next'
- 05:00 AM Revision 95633b9b (ceph): osd: fix _activate_committed replica->primary message
- Normally we take a fresh map reference in PG::lock(). However,
_activate_committed needs to make sure the map hasn't...
02/16/2012
- 11:18 PM Revision 41425f6b (ceph): osd: skip threadpool pause on shutdown when blackholed
- We can't pause the threadpools if they're blocked on a blackholed
filestore. Instead, just call _exit().
Signed-off...
- 11:03 PM Revision 35db2ea4 (ceph): rgw: set default acls for certain swift operations
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 09:02 PM Revision a4ff47b0 (ceph): Revert "swift: auth response returns X-Auth-Token instead of X-Storage-...
- This reverts commit e8e1e5dffbd25e2124331e607264e1bc4120676c.
- 08:55 PM Bug #2070: osd/ReplicatedPG.cc: 3627: FAILED assert(is_active())
- ubuntu@teuthology:/a/nightly_coverage_2012-02-16-b/12294
- 11:32 AM Bug #2070: osd/ReplicatedPG.cc: 3627: FAILED assert(is_active())
- if i had to guess this is related to the pg init() refactor. not much to be found from the core, except that pg->sta...
- 09:39 AM Bug #2070: osd/ReplicatedPG.cc: 3627: FAILED assert(is_active())
- also hit this on ubuntu@teuthology:/a/nightly_coverage_2012-02-15-b/12169
- 09:36 AM Bug #2070 (Duplicate): osd/ReplicatedPG.cc: 3627: FAILED assert(is_active())
- ubuntu@teuthology:/a/nightly_coverage_2012-02-15-b/12164...
- 08:44 PM Bug #2075 (Resolved): osd: recover_got assert
- ...
- 08:37 PM Messengers Bug #2073: msgr: shutdown can hang
- here's the bt:...
- 04:15 PM Messengers Bug #2073 (Resolved): msgr: shutdown can hang
- saw this...
- 08:34 PM Revision bbdba468 (ceph): Merge branch 'master' of ssh://github.com/NewDreamNetwork/ceph into wip...
- 08:34 PM Revision 91afb38f (ceph): Merge branch 'master' of ssh://github.com/NewDreamNetwork/ceph
- 05:12 PM Revision 4b3bb5ab (ceph): osd: fix _activate_committed replica->primary message
- Normally we take a fresh map reference in PG::lock(). However,
_activate_committed needs to make sure the map hasn't...
- 04:36 PM Feature #2074 (Rejected): teuthology: remove old kernel packages
- sepia disks are filling up from all the old kernel packages (/lib/modules/$version is 1.3 GB each)
- 04:10 PM rgw Bug #2072 (Resolved): rgw: owner cannot change acl if it doesn't have bucket read permission
- rgw_op.cc:read_acls() tests for read permission, this is wrong.
- 03:11 PM CephFS Bug #2071: kclient: pjd mkfifo failures
- ubuntu@teuthology:/a/nightly_coverage_2012-02-16-b/12255
- 03:11 PM CephFS Bug #2071 (Can't reproduce): kclient: pjd mkfifo failures
- ...
02/15/2012
- 11:20 PM Revision 82eceb9a (ceph): osd: fix do not always clear DEGRADED/set CLEAN on recovery finish
- Clean means we have exactly the right number of replicas and recovery is
complete. Degraded means we do not have eno...
- 05:29 PM Revision 45701f5b (ceph): init: Only check if auto start is disabled when the issued command is "...
- This still makes sure daemons don't start on boot.
When auto start was disabled it would also prevent logrotate from...
- 05:28 PM Revision 543e8b98 (ceph): ceph.spec.in: Move libcls_*.so from -devel to base package
- OSDs (src/osd/ClassHandler.cc) specifically look for libcls_*.so in
/usr/$libdir/rados-classes, so libcls_rbd.so and ...
- 05:04 PM Revision 1a994bed (ceph): objclass: add debug_objclass knob, default to off
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:03 PM Revision ba0ef62f (ceph): osd: reduce watch/notify debug noise
- Signed-off-by: Sage Weil <sage@newdream.net>
- 04:21 PM Revision ebbfdefa (ceph): msgr: mark_all_down on shutdown
- This ensures we destroy all the Pipes and discard their messages. Among
other things, this can avoid
2012-02-15 03:...
- 04:21 PM Revision c1b6b218 (ceph): osd: do not sync_and_flush if blackholed
- If we have blackholed this will block forever. In that case don't bother.
Signed-off-by: Sage Weil <sage@newdream.net>
- 04:20 PM Revision e6ffe31b (ceph): workqueue: make pause/unpause count
- We can pause() multiple times, and we need as many unpause()s to actually
resume work.
This resolves problems where ...
- 03:28 PM Linux kernel client Bug #2069 (Can't reproduce): client crash during kernel_untar_build rm -r step
- this keeps happening:...
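A minimal sketch of the counted pause/unpause behaviour described for revision e6ffe31b above. It is written in Python rather than Ceph's C++, and `PausableQueue` and its method names are hypothetical, chosen only for illustration: each pause() increments a counter, and work becomes runnable again only after the same number of unpause() calls.

```python
import threading


class PausableQueue:
    """Hypothetical illustration of a pause counter; not Ceph's actual work queue code."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pause_count = 0  # number of pause() calls not yet matched by unpause()

    def pause(self):
        with self._cond:
            self._pause_count += 1

    def unpause(self):
        with self._cond:
            if self._pause_count == 0:
                raise RuntimeError("unpause() without a matching pause()")
            self._pause_count -= 1
            if self._pause_count == 0:
                # Only the last unpause() actually lets workers resume.
                self._cond.notify_all()

    def is_paused(self):
        with self._cond:
            return self._pause_count > 0

    def wait_until_runnable(self, timeout=None):
        """Block until no pauses are outstanding; returns False on timeout."""
        with self._cond:
            return self._cond.wait_for(lambda: self._pause_count == 0, timeout)
```

With this shape, two independent callers can each pause and unpause without one caller's unpause() resuming work that the other still expects to be stopped.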
- 03:24 PM Bug #2022: osd: misdirectect request
- weird, saw this twice a few days (maybe 18 runs apart), but wasn't able to reproduce after several hundred iterations...
- 03:20 PM Bug #2033 (Closed): osd: segfault in OSD::update_heartbeat_peers()
- I'm not totally sure how this happened, but the new heartbeat locking should avoid it..
- 03:18 PM Cleanup #2049 (Resolved): osd: improve heartbeat peer locking
- 03:18 PM Bug #2060 (Resolved): osd: lone osd is not marked degraded with replication level 2
- 02:11 PM Bug #2056 (Resolved): osd: unfound object during backfill qa test
- fixed in backfill task.. it was killing a second osd before waiting for things to peer/recover from the first failure.
- 12:01 PM Bug #2068: osd: FAILED assert(infoevt.info.history.last_epoch_started >= pg->info.history.same_in...
- Oh, i see the problem.. the osdmap ref is taken by lock().. this pg hasn't seen the new map yet.
just need to tag...
- 11:49 AM Bug #2068: osd: FAILED assert(infoevt.info.history.last_epoch_started >= pg->info.history.same_in...
- looking at the core file.
- we are primary
- replica is sending us an info message, with one record. it is therefo...
- 09:21 AM Bug #2068 (Resolved): osd: FAILED assert(infoevt.info.history.last_epoch_started >= pg->info.hist...
- ...
- 06:05 AM Revision 40802ae8 (ceph): osd: exit code 0 on SIGINT/SIGTERM
- This makes daemon-handler happy...
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:49 AM Revision bc0e4068 (ceph): add regression/multifs collection; run rgw tests under both xfs and btrfs
- 05:04 AM Revision 2aafdead (ceph): signals: check write(2) return values
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:03 AM Revision ec066829 (ceph): mds: remove some cruft
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:03 AM Revision 9cd09003 (ceph): osd: semi-clean shutdown on signal
- Make some effort to stop work in progress, remove pid file, and exit with
informative error code.
Note that this is ...
- 05:03 AM Revision ecd28025 (ceph): signals: implement safe async signal handler framework
- Based on http://evbergen.home.xs4all.nl/unix-signals.html.
Instead of his design, though, we write single bytes, and...
- 05:03 AM Revision 79513155 (ceph): signals: do not install default SIGHUP, SIGINT, SIGTERM handlers
- These should be app specific and async.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:03 AM Revision afa1f9e3 (ceph): signal: remove unused/obsolete handle_shutdown_signal
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
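The async signal handler framework in revision ecd28025 above is described as writing single bytes from the handler. A rough sketch of that general self-pipe pattern follows, in Python rather than Ceph's C++. The class name `AsyncSignalHandler`, the choice to send the signal number as the byte value, and the zero byte used as a stop sentinel are all assumptions made for illustration; the truncated commit message does not say how Ceph encodes or dispatches the bytes.

```python
import os
import signal
import threading


class AsyncSignalHandler:
    """Hypothetical self-pipe sketch: the handler writes one byte, a thread does the work."""

    def __init__(self):
        self._rfd, self._wfd = os.pipe()
        self._callbacks = {}
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def register(self, signum, callback, install=True):
        self._callbacks[int(signum)] = callback
        if install:
            # signal.signal() may only be called from the main thread.
            signal.signal(signum, self._on_signal)

    def _on_signal(self, signum, frame):
        # Keep the handler trivial: write a single byte and return.
        os.write(self._wfd, bytes([int(signum)]))

    def _loop(self):
        # Runs in an ordinary thread, so callbacks may take locks, log, and so on.
        while True:
            data = os.read(self._rfd, 1)
            if not data or data[0] == 0:  # zero byte is the assumed stop sentinel
                break
            callback = self._callbacks.get(data[0])
            if callback is not None:
                callback(data[0])

    def shutdown(self):
        os.write(self._wfd, b"\x00")
        self._thread.join()
        os.close(self._rfd)
        os.close(self._wfd)
```

A daemon would typically call something like `handler.register(signal.SIGTERM, on_term)` at startup and `handler.shutdown()` on exit, which mirrors the "unregister signal handlers on shutdown" idea without claiming to match Ceph's interface.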
- 05:03 AM Revision be704fe1 (ceph): mds: install async signal handlers for SIG{HUP,INT,TERM}
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:03 AM Revision e905564b (ceph): osd: install async signal handlers for SIG{HUP,INT,TERM}
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:03 AM Revision eafe8327 (ceph): mon: install async signal handlers for SIG{HUP,INT,TERM}
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:03 AM Revision bbe5cd75 (ceph): mon: do a clean shutdown on SIGINT/SIGTERM
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:03 AM Revision 395dc659 (ceph): mds: remove pidfile
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:03 AM Revision 4425f3b3 (ceph): libradospp: add config_t typedef
- Don't expose internal CephContext type name.
Signed-off-by: Sage Weil <sage@newdream.net>
- 01:03 AM Revision 06fa2685 (ceph): librados: use rados_config_t typedef instead of CephContext
- Signed-off-by: Sage Weil <sage@newdream.net>
02/14/2012
- 11:52 PM Revision e32668f8 (ceph): doc: Balance backticks.
- Signed-off-by: Tommi Virtanen <tommi.virtanen@dreamhost.com>
- 11:24 PM Revision ad9d7fb6 (ceph): backfill: wait for clean before writing+blackholing
- If we have straggler pgs and blackhole osd.1, we can deadlock because we
need info from that osd to repeer and contin...
- 11:23 PM Revision 50cc60f0 (ceph): nuke: nuke testrados too
- Slightly fewer nuke -r's
- 10:50 PM Bug #1765 (In Progress): osd: 'call' op can return data even if op is modifying
- the c++ librados api now separates these operations. osd now refuses to return any result data payload if op is mark...
- 10:01 PM Revision 8d19e735 (ceph): Merge branch 'wip-osd-hb'
- Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
- 09:59 PM Revision 2281a009 (ceph): librados: expose CephContext via C API
- We can already create rados cluster handles with an existing CephContext,
but that is only useful if you are building...
- 09:41 PM Revision bc4e78dd (ceph): mds: use new tmap_get pbl argument
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:39 PM Revision dd322858 (ceph): librados: need prval for tmap_get
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:37 PM Revision 7842bf12 (ceph): librados: add aio_operate for reads and tmap_get for ObjectWriteOp
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 09:36 PM Cleanup #2021 (In Progress): fix signal handlers
- 09:35 PM Revision 70450963 (ceph): osd: remove unused need_size
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:30 PM CephFS Bug #1991 (Duplicate): mds: crash during clean shutdown
- see #1549.. we are racing with exit(0) from the SIGTERM handler
- 09:28 PM Bug #2032: paxos: somehow didn't update stash alongside new states
- Can we close this one?
- 09:27 PM Bug #2037 (Resolved): mon: a crash in the middle of slurping is unrecoverable
- 09:03 PM Revision 34145d5d (ceph): Merge branch 'wip_push_refactor'
- Reviewed-by: Sage Weil <sage@newdream.net>
- 08:56 PM Revision a53a0174 (ceph): ReplicatedPG: pull() should return PULL_NONE, not false
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 08:55 PM Revision 5a3ef17c (ceph): ReplicatedPG: clean up push/pull
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 08:52 PM Revision f9b7529f (ceph): osd_types.h: Add constructors for ObjectRecovery*
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 07:53 PM Revision 7b1c144f (ceph): test_filestore_idempotent: fix test to create initial object
- Filestore now properly fails to clone a non-existent object, which means
we should create one.
Fixes: #2062
Signed-o...
- 05:06 PM Bug #2067 (Resolved): librados: we leak CephContext from rados_create()
- 05:06 PM Revision 6b30cd3b (ceph): libcephfs: define CEPH_SETATTR_*
- These are also defined internally in ceph_fs.h, so use a guard. Annoying,
but gives us consistent naming (ceph_*/CEP...
- 05:05 PM rgw Feature #2066 (Resolved): rgw: make list_objects efficient
- 04:58 PM Revision 3fbb5714 (ceph): rename fs files
- 03:14 PM Feature #1772: rbd: define new on-disk header format
- The other point that came up was, if rbd can't delete the parent volume until all children have been deleted, the gla...
- 03:13 PM Feature #1772: rbd: define new on-disk header format
- Being a little bit more explicit: the point of the UUIDs is to allow child images to add themselves to the parent's l...
- 01:07 PM Feature #1772: rbd: define new on-disk header format
- To get around the issue of a child image needing to update the parent image's header, Sage suggested only allowing ac...
- 02:07 PM Bug #1821: librados: rados_create_with_context is unusable
- see wip-1821
- 01:06 PM Feature #988: librbd: trivial layering
- To get around the issue of a child image needing to update the parent image's header, Sage suggested using only allow...
- 11:55 AM Linux kernel client Bug #2064 (Resolved): ceph-client: messenger: nocrc flag not implemented correctly
- The "nocrc" option is supposed to disable CRC32 calculation on messages
sent between ceph entities. The default is ...
- 11:55 AM rgw Bug #2063 (Resolved): rgw: access key shouldn't contain chars that need to be url encoded
- We see some issues in our tests that when generating signed url these chars aren't being encoded. We should try to av...
- 11:50 AM Bug #2062 (Resolved): filestore: idempotent test failed
- the test was broken. triggered by filestore now noticing clone could fail.
commit:7b1c144f21c3ccfe2dfd4342e3d5461...
- 11:36 AM Bug #2062 (Resolved): filestore: idempotent test failed
- ...
- 09:38 AM Bug #2026: osd: ceph::HeartbeatMap::check_touch_file
- in my case, this looks like #2045.
- 07:59 AM Bug #2026: osd: ceph::HeartbeatMap::check_touch_file
- I just hit this in qa, ubuntu@teuthology:/var/lib/teuthworker/archive/nightly_coverage_2012-02-14-a/11871.
- 09:37 AM Bug #2061 (Resolved): osd: scrub mismatch
- New one, "[ERR] 0.c scrub stat mismatch, got 6/6 objects, 2/5 clones, 13511948/13511948 bytes."
Workload was
...
- 09:31 AM Bug #2045: osd: dout_lock deadlock
- again, although this time there is a write that looks blocked somehow...
- 12:45 AM Revision 10a94d2b (ceph): regression/thrash on xfs and btrfs both
02/13/2012
- 11:29 PM Revision 04f3e445 (ceph): btrfs: 1 -> fs: btrfs
- 11:28 PM Revision 46b612ef (ceph): misc: make get_scratch_devices look for (almost) any disk that's not mo...
- 11:28 PM Revision 975d73a2 (ceph): nuke: nuke testrados and rados processes, too
- So that -r is needed slightly less often.
- 11:28 PM Revision af4ce442 (ceph): ceph: use any fs, not just btrfs, on scratch devices
- The
btrfs: true
syntax is replaced with
fs: btrfs
or ext4, xfs.
- 11:28 PM Revision 6f3abc6c (ceph): ceph_manager: mark in a bit more often than out
- Otherwise we can get into cases where many/most nodes are out, and things
don't work as well. e.g., crush may start ...
- 10:43 PM Revision b54bac30 (ceph): test/encoding/readable.sh: drop bashisms
- =, not ==!
Signed-off-by: Sage Weil <sage@newdream.net>
- 10:35 PM Revision ffa1de32 (ceph): filejournal: drop unused variable
- Signed-off-by: Sage Weil <sage@newdream.net>
- 10:32 PM Revision ccf8867f (ceph): filejournal: aio off by default
- For now, until we have a better handle on the ext4 bug, and demonstrate
that it is a clear performance win with the f...
- 10:31 PM Revision 12035cd4 (ceph): Merge remote-tracking branch 'gh/wip-journal-aio-rebased'
- 10:09 PM Revision 3d3237fe (ceph): Merge remote-tracking branch 'gh/wip-osd'
- Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
- 10:08 PM Revision 9fded38f (ceph): test/encoding/readable.sh: skip old version with known incompatibilities
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:41 PM Revision 3e1cc0b9 (ceph): ceph-dencoder: add osd_peer_stat_t
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:08 PM Revision 9065dbd3 (ceph): rgw: remove extra useless info in bucket entry encoding
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 08:07 PM Revision 1bf037bf (ceph): ReplicatedPG: refactor push and pull
- Now, push progress is represented by ObjectRecoveryProgress. In
particular, rather than tracking data_subset_*ing, w...
- 07:27 PM Revision fbbbd01b (ceph): add CEPH_FEATURE_OSDENC
- Require it for osd <-> osd and osd <-> mon communication.
This covers all the new encoding changes, except hobject_t...
- 07:26 PM Revision 94a198c8 (ceph): ReplicatedPG: is_degraded may return true for backfill
- If is_degraded returns true for backfill, the object may not be
in any replica's missing set. Only call start_recove...
- 07:26 PM Revision d0ccf280 (ceph): ReplicatedPG: add debugging for in flight backfill ops
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 07:26 PM Revision af38ce1f (ceph): ReplicatedPG: consider backfill_pos to be degraded
- A write may trigger via make_writeable the creation of a clone which
sorts before the object being written.
Signed-o...
- 07:24 PM Revision 2476dd71 (ceph): MOSDSubOp: Add new object recovery state
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 07:18 PM Revision d43d5d9f (ceph): ReplicatedPG: is_degraded may return true for backfill
- If is_degraded returns true for backfill, the object may not be
in any replica's missing set. Only call start_recove...
- 07:18 PM Revision 4785ae39 (ceph): ReplicatedPG: add debugging for in flight backfill ops
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 07:18 PM Revision f80e0c71 (ceph): ReplicatedPG: consider backfill_pos to be degraded
- A write may trigger via make_writeable the creation of a clone which
sorts before the object being written.
Signed-o...
- 07:06 PM Revision 389653e6 (ceph): osd: remove peer_stat from MOSDOp entirely
- We haven't used this feature for years and years, and don't plan to. It
was there to facilitate "read shedding", whe...
- 06:01 PM Revision b1162a37 (ceph): Merge remote-tracking branch 'gh/wip-mon-lag'
- Reviewed-by: Sage Weil <sage@newdream.net>
- 05:42 PM Revision 4dfa4dc2 (ceph): osd: new osd_peer_stat_t shell type
- We weren't using this, and it had broken (raw) encoding. The constructor
also didn't initialize fields properly.
Cl...
- 05:42 PM Revision 1f351cdb (ceph): qa/btrfs/.gitignore: ignore targets
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:37 PM Revision ac646c54 (ceph): Merge branch 'master' of ssh://github.com/NewDreamNetwork/ceph
- 04:33 PM Feature #2028 (Resolved): qa: allocate disks to btrfs on new hardware
- 01:23 PM Feature #2028 (In Progress): qa: allocate disks to btrfs on new hardware
- 03:47 PM Bug #2060 (Resolved): osd: lone osd is not marked degraded with replication level 2
- With only one osd in, 'ceph -s' and 'ceph health' should report that the cluster has degraded objects.
- 02:54 PM Feature #1836 (Resolved): filejournal: use async directio to write to the journal
- 02:50 PM rgw Feature #773 (Resolved): rgw: efficient list-objects filtering
- That was fixed when we introduced the bucket index.
- 02:40 PM rgw Bug #2048 (Resolved): rgw: multipart upload listing return key starting with _multipart_
- It seems that this has already been resolved, most likely by the fix for #2025.
- 01:14 PM Feature #2058 (Resolved): ceph: query pg state
- 01:12 PM Feature #2005 (In Progress): mon: track timestamps on pg states
- 01:11 PM Feature #2005 (Resolved): mon: track timestamps on pg states
- 01:06 PM Feature #1962 (In Progress): ferro: Trigger vMedia boot via IPMI/DRAC
- 01:06 PM Feature #1571 (In Progress): osd: non-trivial map object
- 12:11 PM rgw Cleanup #2036 (Resolved): rgw: bucket index tree contains the same info 3 times
- Ok, as of commit:9065dbd36d35b6e44c66293e74b6ba92031ca9ae it only appears twice. Removing another copy of the objec...
- 09:37 AM Bug #2056 (Resolved): osd: unfound object during backfill qa test
- ubuntu@teuthology:/a/nightly_coverage_2012-02-13-a/11793
it happened a couple days earlier, too.
- 04:48 AM Revision cefa55b2 (ceph): osd: move new pg initialization into PG::init()
- Move initialization of misc elements of the new pg from OSD.cc to a PG
method. No change in functionality.
Signed-o...
- 04:48 AM Revision 04f175f8 (ceph): osd: use PG::init() for newly local (but not created) PGs
- Use the helper for PGs that are newly instantiated on the local OSD.
This fixes the initialization of pg->info.stats...
- 04:48 AM Revision c97c14f6 (ceph): osd: use single helper for pg creation
- Take a bool so that we initialize the last_epoch_started properly on
newly created PGs. This gives us a single code ...
- 02:08 AM Revision 72a56108 (ceph): osd: protect per-pg heartbeat peers with inner lock
- Currently we update the overall heartbeat peers by looking directly at
per-pg state. This is potentially problematic...
02/12/2012
- 04:16 PM Bug #1759 (Resolved): mds/client: truncate size overflow, fails with EINVAL
- this is a problem with weird truncate_seq/size values in requests, that the osd is now cleaning up.
commit:0ded7e4da...
- 04:15 PM Bug #1688 (Closed): Benjamin: pg stuck in scrub
- 02:29 PM Bug #2022: osd: misdirectect request
- saw this again on rados_api_tests:...
- 06:43 AM Revision 508be8e3 (ceph): rgw: don't use SCRIPT_NAME and QUERY_STRING vars
- REQUEST_URI holds everything we need, and it's encoded correctly.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 05:47 AM Revision 3796c4ab (ceph): osd: flush pg on activate _after_ we queue our transaction
- We recently added a flush on activate, but we are still building the
transaction (the caller queues it), so calling o...
- 05:46 AM Revision 4d8e9a5e (ceph): osd: do OpRequest dispatch into PG::do_request
- This simplifies the external PG interface, and gives us a single path into
the PG...
Signed-off-by: Sage Weil <sage@...
- 05:24 AM Revision eba609be (ceph): filestore: make flush() block forever if blackholed
- If we are blackholing the disk, we need to make flush() wait forever, or
else the flush() logic will return (the IO w...
- 05:16 AM Revision 610da665 (ceph): Revert "rgw: don't treat plus as a space in url decode"
- This reverts commit a6d7629c177fbab722a7a0c7f861caf91ff92deb.
- 05:15 AM Revision 053dc33c (ceph): osd: emit useful scrub error on missing clone
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:14 AM Revision 43828dff (ceph): filestore: return error from CLONE
- Aie!
Signed-off-by: Sage Weil <sage@newdream.net>
02/11/2012
- 11:55 PM Revision 67d8daf0 (ceph): Merge branch 'master' of ssh://github.com/NewDreamNetwork/ceph
- 11:09 PM Revision 7c6dff48 (ceph): osd: filter trimming|purged snaps out of op SnapContext
- We can receive an op with an old SnapContext that includes snaps that we've
already trimmed or are in the process of ...
- 10:43 PM rgw Bug #2043 (Resolved): rgw: cannot use '+' in url
- commit:508be8e3b3b47b71035d07d26dead49b3b91463d hopefully fixes the issue. Also reverted previous fix.
- 09:42 PM rgw Bug #2043 (In Progress): rgw: cannot use '+' in url
- It's still broken. Certain clients use '+' as a space. I think that the apache rewrite rule makes things inconsistent.
- 10:32 PM Bug #2026: osd: ceph::HeartbeatMap::check_touch_file
- I guess a btrfs one. Right now I'm running a couple of virtual machines without any issues, so for now we can leave t...
- 09:57 AM Bug #2026: osd: ceph::HeartbeatMap::check_touch_file
- This looks like a btrfs or kernel issue to me. Have you seen it since?
- 10:32 PM Revision 02bda42f (ceph): mon: add {mon,quorum}_status admin socket commands
- These dump some json with the current monitor/quorum status.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 10:30 PM Revision e4258ce0 (ceph): mon: move quorum_status into helper
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 10:19 PM Revision 2adad559 (ceph): hammer.sh: assume path is set
- 10:10 PM Revision 60067f84 (ceph): mon: move mon_status into a helper
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:44 PM Revision a414fd51 (ceph): init-ceph, mkcephfs: try 'btrfs device scan' before 'btrfsctl -a'
- Fixes: #2023
Reported-by: Wido den Hollander <wido@widodh.nl>
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:40 PM Revision 4fad1317 (ceph): add snap thrashing covering a small number of objects
- The snaps-many-objects has a relatively low density of ops-per-object. This
hammers on a small number of them and doe...
- 09:39 PM Revision e841f9c7 (ceph): move snap thrashing back into regression suite
- 07:57 PM Revision a391b0d1 (ceph): osd: fix MOSDPGCreate version setting
- Signed-off-by: Sage Weil <sage@newdream.net>
- 07:48 PM Revision 6e0e33e7 (ceph): Merge remote branch 'gh/wip-osd-encoding'
- 07:48 PM Revision a0caa851 (ceph): osd: some cleanup
- Signed-off-by: Sage Weil <sage@newdream.net>
- 07:48 PM Revision 4834c4c7 (ceph): osd: check for valid snapc _before_ doing op work
- Check this early to avoid wasting effort, or causing side-effects from
do_osd_op_effects().
Signed-off-by: Sage Weil...
- 07:48 PM Revision e09c90fd (ceph): osd: queue pg removal under pg's epoch
- The PG may be doing work relative to a different epoch than what the osd
has. Make sure the PG removal message is qu...
- 05:49 PM Revision 7eff37be (ceph): mon: validate osmdap input
- And clean up some error return paths while we're here.
Fixes: #1493
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 03:08 PM Bug #1943 (Duplicate): osd: bad clone transaction on journal replay
- 03:07 PM Bug #1949 (Resolved): osd: ENOTEMPTY on collection removal from snaptrimmer
- fixed by commit:7c6dff487171deb37852e2fb059dcb6e3af65702
- 03:05 PM Feature #2038 (Rejected): mon: can't currently do commands/get status when not in quorum
- I think it's fine to fail (or actually, block/wait on) authentication if we are out of quorum. The client will retry...
- 02:22 PM Cleanup #2023 (Resolved): btrfs: Use btrfs device scan instead of btrfsctl -a
- commit:a414fd51c7c5ae5dbe9e3af7db6f17741a58c1a7
- 10:23 AM Bug #1758 (Can't reproduce): OSD segfault in SimpleMessenger::send_message
- Haven't seen this one in ages, either. Going to assume it's been fixed.
- 10:22 AM Bug #1992 (Can't reproduce): OSD::get_or_create_pg
- Hmm, we haven't been able to trigger this with our thrashing.
- 10:21 AM Bug #1493 (Resolved): cmon: nice error message on undecodable (osdmap, monmap) input
- commit:7eff37be494714febed4e6724237c03722b4e8c5
- 10:07 AM Feature #2055 (Duplicate): osd: fix up push cloning
- 01:05 AM Revision 7e32a3d4 (ceph): rgw: objects can contain '%'
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 12:40 AM Revision 6028b363 (ceph): move kclient_workunit_suites_blogbench.yaml to stress suite
- This is consistently failing due to an mds/kclient interaction.
02/10/2012
- 11:17 PM Revision bd1a9567 (ceph): mon: fix MMonElection encoding version
- Signed-off-by: Sage Weil <sage@newdream.net>
- 11:07 PM Revision 22eca410 (ceph): mon: remove the last_consumed setting in Paxos
- This was only ever used while initializing the Paxos machine, and it
doesn't need to be. Its existence is just an inv...
- 11:06 PM Revision 6e6c34f9 (ceph): objecter: LingerOp is refcounted
- this should fix Bug #2050, where a linger op was used after being freed.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newd...
- 11:02 PM Revision aecf4e02 (ceph): mon: handle inconsistent disk states on startup.
- This lets us recover from an interrupted slurp while still noticing
other corruption issues. Rather than running init...
- 10:45 PM Revision 0de1d550 (ceph): objecter: LingerOp is refcounted
- this should fix Bug #2050, where a linger op was used after being freed.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newd...
- 10:39 PM Revision 631650b1 (ceph): Merge branch 'wip-encoding'
- Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 10:39 PM Revision 3c5dcf89 (ceph): qa/btrfs/create_async_snap
- Stupid tool to call the async snap ioctl. Until the btrfs tool does it.
Signed-off-by: Sage Weil <sage@newdream.net>
- 10:38 PM Revision 7b5689ac (ceph): messages: populate header.version in constructor
- Define a HEAD_VERSION and COMPAT_VERSION for any versioned message. Pass
to Message constructor so that it is always...
- 09:37 PM Revision 0bd545f5 (ceph): mon: add a slurping flag to the Paxos state
- Set it before we start slurping, and clear it when we end slurping.
This allows us to differentiate between deliberat...
- 09:17 PM RADOS Bug #1738 (Duplicate): bad crushmap behavior
- 09:17 PM RADOS Bug #2047 (Duplicate): crush: with a rack->host->device hierarchy, several down devices are likel...
- 05:57 PM Revision e369ec15 (ceph): ReplicatedPG: don't put the op on -EAGAIN
- EAGAIN indicates that the op is
waiting_for_missing or waiting_for_degraded
Reviewed-by: Greg Farnum <greg.farnum@dr...
- 05:30 PM Bug #1949: osd: ENOTEMPTY on collection removal from snaptrimmer
- 01:22 PM Bug #1949: osd: ENOTEMPTY on collection removal from snaptrimmer
- another log, with filestore debugging, and the contents of the fs. There was...
- 05:16 PM Revision 3a7bb999 (ceph): mon: initialize paxos state in constructor
- These should all be initialized in init() anyway
(except accepted_pn_from, which is set in collect and handle_collect...
- 05:06 PM rgw Bug #2051 (Resolved): rgw: can't use '%' in object name
- Fixed, commit:7e32a3d4bc90d84970754350414c553e7ca01299.
- 02:48 PM rgw Bug #2051 (Resolved): rgw: can't use '%' in object name
- 04:34 PM Feature #2055 (Duplicate): osd: fix up push cloning
- 04:32 PM Feature #2054 (Resolved): teuthology: run radosgw through valgrind
- 04:13 PM Feature #2053 (Rejected): librados: caching
- 04:12 PM Feature #2052 (Resolved): librbd: caching
- 03:30 PM rgw Cleanup #2036: rgw: bucket index tree contains the same info 3 times
- the reason it is kept 3 times is that we index it by the bucket name, have the bucket name as one of the fields in th...
- 03:20 PM rgw Bug #2043 (Resolved): rgw: cannot use '+' in url
- Fixed, commit:a6d7629c177fbab722a7a0c7f861caf91ff92deb.
- 03:19 PM Bug #2050 (Resolved): rgw: crash at Objecter::_linger_commit()
- Fixed, commit:0de1d5502b0d9ab0f0809947a0664586d7754a08.
- 02:27 PM Bug #2050: rgw: crash at Objecter::_linger_commit()
- We think that what happens is this:
librados::linger()
->ack response
unregister_watcher()
->commit response
...
- 02:26 PM Bug #2050 (Resolved): rgw: crash at Objecter::_linger_commit()
- ubuntu@teuthology:/a/nightly_coverage_2012-02-09-a/11236$ cat ./remote/ubuntu@sepia72.ceph.dreamhost.com/log/rgw.stdo...
- 01:58 PM Cleanup #2049: osd: improve heartbeat peer locking
- need to move heartbeat peer stuff out from under osd_lock to facilitate pushing pg peering crap into the worker threads
- 01:57 PM Cleanup #2049 (Resolved): osd: improve heartbeat peer locking
- 06:09 AM Revision 811e6298 (ceph): msg: include compat_version in version header
- header.version is the version we encoded.
header.compat_version is the oldest version of code that can decode it.
If...
- 06:09 AM Revision 989d6786 (ceph): msg: populate compat_version for encoded messages
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:09 AM Revision 8d90856a (ceph): msg: check compat_version before decoding
- If the newly constructed message's version is older than the
compat_version, don't even try to decode; just fail.
Si...
- 06:06 AM Revision cb15eb88 (ceph): os: new encoding for hobject_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
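The compat_version rule stated in the msg revisions above (811e6298 and 8d90856a) can be sketched as follows. This is a hedged illustration in Python, not Ceph's C++ Message code: `Header`, `can_decode`, and `decode_or_fail` are hypothetical names, and the payload handling is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class Header:
    version: int         # version the sender encoded with
    compat_version: int  # oldest decoder version that can still read it


def can_decode(header, local_version):
    """True if a decoder at local_version is new enough for this message."""
    return local_version >= header.compat_version


def decode_or_fail(header, local_version, payload):
    """Refuse to decode when the local code is older than compat_version."""
    if not can_decode(header, local_version):
        raise ValueError(
            "message requires decoder version >= %d, have %d"
            % (header.compat_version, local_version)
        )
    return payload  # placeholder for the real per-type decode step
```

For example, a message with version 3 and compat_version 2 is accepted by a decoder at version 2 but rejected up front by a decoder at version 1, without attempting to parse the payload.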
- 06:06 AM Revision 5b8d0c73 (ceph): new encoding for Log{Entry,Summary}
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:05 AM Revision 7d85c481 (ceph): osd: new encoding for pg_create_t
- There was no version encoding previously, so this is an incompatible
change. Fortunately this type is only used in o...
- 05:58 AM Revision 92fc5f09 (ceph): osd: new encoding for pg_missing_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision 156c6ebe (ceph): osd: new encoding for SnapSet
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision c8cf0aea (ceph): osd: new encoding for watch_info_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision a4a9d520 (ceph): osd: new encoding for object_info_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision fa779dba (ceph): osd: new encoding for ScrubMap
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision 14d6ed49 (ceph): objectstore: new encoding for Transaction
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision 9f3f1197 (ceph): osd: new encoding for PG::OndiskLog
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision 00997f93 (ceph): osd: new encoding for PG::Interval
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision 88f1fbc1 (ceph): osd: include state timestamps, mapping_epoch in pg_stat_t
- Track the time when the pg state last changed (or was refreshed) in
interesting ways.
Also track the epoch when the ...
- 05:58 AM Revision a65586ca (ceph): mon: set last_unstale when marking PGs stale
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision 92a058aa (ceph): mon: set last_changed when creating new pgs
- This will help us identify PGs that are stuck in creating state.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision 7a68fd9f (ceph): osd: new ScrubMap::object encoding
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:58 AM Revision 4c3a41f7 (ceph): osd: new encoding for osd_reqid_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision 757e3b05 (ceph): osd: new encoding for object_locator_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision f9d67f1a (ceph): osd: new encoding for osd_stat_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision a03ff1b5 (ceph): osd: new encoding for OSDSuperblock
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision fc869dee (ceph): osd: new encoding for pool_snap_info_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision 5b6b5008 (ceph): osd: new encoding for pg_pool_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision bd829e35 (ceph): osd: new encoding for object_stat_sum_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision 3a547efe (ceph): osd: new encoding for object_stat_collection_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision 9016658c (ceph): osd: new encoding for pg_stat_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision 0d79b1bf (ceph): osd: new encoding for pool_stat_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision 538e8d12 (ceph): osd: new encoding for pg_history_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision 9304db00 (ceph): osd: new encoding for pg_history_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision 18c8861b (ceph): osd: new encoding for pg_log_entry_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:58 AM Revision 02dd0a85 (ceph): osd: new encoding for pg_log_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:45 AM Revision 7f10d5fa (ceph): osd: move object_locator_t to osd_types.{h,cc}
- That's a better home. Also add to ceph-dencoder.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:35 AM Revision e255044c (ceph): ceph-dencoder: add osd_reqid_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:30 AM Revision 01c7b3bd (ceph): ceph-dencoder: add hobject_t
- Move to a separate file in os/, since this is an ObjectStore related
object.
Signed-off-by: Sage Weil <sage.weil@dre...
- 05:17 AM Revision b49a55b9 (ceph): mon: add/use pg_create_t ctor
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:15 AM Revision 34158ed0 (ceph): ceph-dencoder: add pg_create_t (formerly MOSDPGCreate::create_rec)
- Signed-off-by: Sage Weil <sage@newdream.net>
- 01:20 AM Revision 7479828e (ceph): test/encoding/readable.sh: no \t
- Not sure why this sometimes works and sometimes doesn't. Maybe it's a
bashism?
Signed-off-by: Sage Weil <sage@newdr...
- 12:47 AM Revision fab2254c (ceph): Merge branch 'wip-corpus'
- 12:44 AM Revision e492ffd9 (ceph): ceph-dencoder: add Log{Entry,EntryKey,Summary}
- Signed-off-by: Sage Weil <sage@newdream.net>
- 12:40 AM Revision 57295654 (ceph): Merge remote branch 'gh/master' into wip-journal-aio-rebased
- 12:20 AM Revision 1009d1a0 (ceph): filestore: fix op queue quiesce during commit
- When I added the ordering constraint fix back in 259c509a I got the
check backwards. We want to wait if we are block...
- 12:20 AM Revision 93d7ef96 (ceph): filestore: wait to start op if other ops are in line
- We can have a sequence like:
- commit_start, blocked=true
- op_start thread A gets in line
- op_start thread B gets ...
02/09/2012
- 09:41 PM Bug #1974: osd: radosmodel crash on thrashing
- commit:359dfb9966d15d997f9e0351a5ed8de1faae62fe
- 09:41 PM Bug #1974 (Resolved): osd: radosmodel crash on thrashing
- 09:20 PM Bug #1975: btrfs: EINVAL on snap create
- I'm pretty sure this was triggered by #2046. There is still a btrfs bug, but we were doing the wrong thing if rmdir ...
- 09:18 PM Bug #2013 (Resolved): osd: messages for pgs we don't store are never freed
- 04:38 PM Bug #2046 (Resolved): filestore: do_op running during commit
- commit:1009d1a016f049e19ad729a0c00a354a3956caf7 and commit:93d7ef96316f30d3d7caefe07a5a747ce883ca2d
- 04:02 PM Bug #2046: filestore: do_op running during commit
- this was broken by commit:259c509a8941bf7cdad8bd4ede0ccd73ca8a83d3, way back in v0.25! Sigh. The wait condition for...
- 10:05 AM Bug #2046 (Resolved): filestore: do_op running during commit
- commit_start() is supposed to quiesce writes, but I see...
- 04:24 PM Bug #2044: osd: pg stuck in active+backfill
- This should be fixed by commit:f0334673ab8547807b961aae19a8e53531585e3f.
- 10:55 AM rgw Bug #2048 (Resolved): rgw: multipart upload listing return key starting with _multipart_
- reported by jdwilson over irc.
- 10:41 AM RADOS Bug #2047 (Resolved): crush: with a rack->host->device hierarchy, several down devices are likely...
- See http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/5166
Sage says the cause is down devices only tr...
- 10:02 AM Bug #2045: osd: dout_lock deadlock
- ubuntu@teuthology:/a/nightly_coverage_2012-02-09-a/11210
metropolis:~sage/bug-2045
- 09:56 AM Bug #2045 (Can't reproduce): osd: dout_lock deadlock
- a thread is blocked on dout_lock, can't tell who.
- 05:05 AM Revision 0a60fcf3 (ceph): Merge remote branch 'gh/wip-types'
- 04:43 AM Revision 143ad86b (ceph): Merge remote branch 'gh/wip-stuck-in-backfill'
- Reviewed-by: Sage Weil <sage.weil@dreamhost.com>
- 01:40 AM Revision f0334673 (ceph): ReplicatedPG: don't count deletions as ops
- Counting them as ops but not requeueing the pg for recovery causes
backfill to stall when only deletions are sent in
...
- 01:15 AM Revision 42db09b7 (ceph): osd: don't remove pg from recovery queue if not enough recovery ops sta...
- The pg has already been dequeued at the beginning of do_recovery(),
and it requeues itself only if it starts a new re...
- 01:11 AM Revision a6d7629c (ceph): rgw: don't treat plus as a space in url decode
- Any special character encoding should be done through %hex. The
plus sign is a valid character in object names, and i...
- 12:19 AM Revision 72bbaeac (ceph): osd: discard waiting ops when pg mapping changes
- If the pg mapping changes away from us, we can safely discard messages we
have waiting for the PG to be created.
Fix...
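Revision a6d7629c above (rgw: don't treat plus as a space in url decode) distinguishes two decoding conventions. As a rough illustration only, and not the rgw C++ code, Python's standard library happens to expose both behaviours: `unquote` decodes `%hex` escapes and leaves `+` alone, while `unquote_plus` also turns `+` into a space. The helper names below are hypothetical.

```python
from urllib.parse import unquote, unquote_plus


def decode_object_name(raw: str) -> str:
    """Decode %XX escapes only; a literal '+' stays a '+' (hypothetical helper)."""
    return unquote(raw)


def decode_form_style(raw: str) -> str:
    """Form-style decoding, where '+' is read as a space (the behaviour being avoided)."""
    return unquote_plus(raw)


if __name__ == "__main__":
    name = "photos/a+b%20c.jpg"
    print(decode_object_name(name))
    print(decode_form_style(name))
```

With the first helper, an object whose name contains a plus sign round-trips unchanged, and a client that really wants a space must send `%20`.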
02/08/2012
- 09:30 PM Linux kernel client Bug #1793: NULL pointer dereference at try_write+0x627/0x1060
- Hmm.. yeah, I don't think we have anything beyond these console dumps. And we don't capture any kind of kernel core ...
- 09:17 PM Linux kernel client Bug #1793: NULL pointer dereference at try_write+0x627/0x1060
- Is there a core file for this problem anywhere?
It would really be nice to poke around in the message, or the
con...
- 09:19 PM Revision 359dfb99 (ceph): osd: flush on activate
- PG::activate() can make lots of changes, most notably clean_up_local()
which deletes lots of local objects. Those ch...
- 09:17 PM Revision 6c4687fe (ceph): Makefile: check readability of object corpus on 'make check'
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:17 PM Revision e261317e (ceph): add ceph-object-corpus.git submodule
- 09:12 PM Revision 66867888 (ceph): ceph-dencoder: PGMap[::Incremental]
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision bc7fd210 (ceph): mon: uninline Monmap encode/decode
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision 8cf81ccf (ceph): ceph-dencoder: MonMap
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision b6d1c0c9 (ceph): mon: initialize [near]full_ratio during create_initial(), not ctor
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision d778cab0 (ceph): mon: make [near]full_ratio config options floats
- ratio implies a real number, not a percentage. Correct, though, if it is
> 1.0.
Signed-off-by: Sage Weil <sage@newd...
- 09:12 PM Revision dc5033f0 (ceph): osd: fix ScrubMap::object ctor
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision 4df4465c (ceph): osd: is_zero() method for stat structs
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision d20d5c10 (ceph): mon: refactor calc_stats()
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision 59a9e4eb (ceph): mon: fix PGMap::generate_test_instances()
- Apply an incremental instead of futzing directly with members.
Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision dfaa7fd7 (ceph): ceph-dencoder: MonCap[s]
- Need some better test instances for MonCaps...
Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision 3f94c15b (ceph): mon: better MonCaps test cases
- Move MonCaps to libcommon.la.
Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision 8e2ceb4e (ceph): mon: fix [near]full_ratio conf update
- Already a value in [0,1]. Interpret as a percentage if > 1.0.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:45 PM Bug #2044 (Resolved): osd: pg stuck in active+backfill
- jmlowe ran into this on his cluster several times. The primary doing backfill failed to requeue the pg for recovery.
...
- 07:08 PM Revision 5ce10979 (ceph): mon: waitlist new sessions trying to connect while we're out of quorum
- If we're stuck out of the quorum, we don't want clients connecting
to us. Instead, waitlist their requests; proces...
- 06:30 PM Revision cbf50eb2 (ceph): update amazonaws xmlns to correct url
- Signed-off-by: michael rodriguez <michael@newdream.net>
- 04:54 PM rgw Bug #2043 (Resolved): rgw: cannot use '+' in url
- Either in signed urls (e.g., as part of the uid), or in object names. Reason is that url_decode removes it. Relax url...
- 04:45 PM Bug #2042: mon: crash in LogMonitor::update_from_paxos
- ubuntu@teuthology:/a/nightly_coverage_2012-02-08-b/11127
- 04:45 PM Bug #2042: mon: crash in LogMonitor::update_from_paxos
- core + binary + tarball are at metropolis:~sage/bug-2042
- 04:43 PM Bug #2042 (Duplicate): mon: crash in LogMonitor::update_from_paxos
- ...
- 02:26 PM Linux kernel client Bug #1907: rbd: don't reuse device ids while they're still in use elsewhere
- 02:23 PM Linux kernel client Bug #1907: rbd: don't reuse device ids while they're still in use elsewhere
- After a few weeks of wandering around the code, figuring out how
things work and refactoring and fixing things as I ...
- 01:17 PM Cleanup #2041 (Resolved): osd: move peering into worker threads
- 10:52 AM Bug #1974: osd: radosmodel crash on thrashing
- Just hit this:
- clean_up_local removed an object (due to a 'delete' log entry)
- a read came in and read it befo...
- 06:41 AM rgw Feature #2040 (Resolved): rgw: disable rgw log through ceph.conf
- Currently the way to do it is through the apache conf.
- 06:07 AM Revision 1a028e5c (ceph): mds: remove IntervalTree code
- Not used, not tested.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:56 AM Revision 21a1dbd8 (ceph): trivial_libceph: need O_RDWR
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:36 AM Revision d63303de (ceph): client: -EINVAL write if not opened writable
- Fixes: #1827
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:31 AM Revision 4784b98f (ceph): client: clean up ctor a bit
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:28 AM Revision b5a5a4bf (ceph): client: initialize initialized
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:14 AM Revision 1f864354 (ceph): osd: always send scrub errors to cluster log
- Some errors were going to the cluster log, some weren't. Normalize the
output format and send them all.
Signed-off-...
- 12:27 AM Revision f28287f0 (ceph): mon: make PaxosService::update_from_paxos return void.
- You can't really recover from a failed update (as PGMonitor was trying
to do), and nothing in the system checks the r...
- 12:26 AM Revision 1125d71b (ceph): mon: call update_from_paxos() when we finish slurping updates.
- To aid in this, add a new get_paxos_service_by_name function.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
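The [near]full_ratio entries above (Revisions d778cab0 and 8e2ceb4e) describe one rule: the option is a ratio in [0,1], but a value greater than 1.0 is read as a percentage. A minimal sketch of that rule follows, assuming a plain float input; the function name is hypothetical and this is not the monitor's actual code.

```python
def normalize_full_ratio(value: float) -> float:
    """Return a ratio in [0, 1]; values above 1.0 are treated as percentages.

    Hypothetical helper illustrating the rule stated in the commit messages.
    """
    if value > 1.0:
        # e.g. 95 is read as 95%, i.e. 0.95
        return value / 100.0
    return value


if __name__ == "__main__":
    for configured in (0.85, 0.95, 85, 95):
        print(configured, "->", normalize_full_ratio(configured))
```

Under this rule both `0.95` and `95` in a configuration file end up as the same ratio.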
02/07/2012
- 10:56 PM Bug #2013 (In Progress): osd: messages for pgs we don't store are never freed
- see wip-pg-waiters?
- 10:46 PM CephFS Bug #1996 (Duplicate): mds: scatter_nudge() bad pointer on shutdown?
- this is the signal handler thing
- 10:45 PM Bug #1901 (Resolved): Missing files in ceph packages results in build failure of tests
- 10:43 PM rgw Bug #1721 (Can't reproduce): rgw: spurious multipart-upload failures
- 10:41 PM Bug #1626 (Can't reproduce): ceph-mon HA not working right; all must be up
- 10:37 PM CephFS Bug #1902 (Won't Fix): mds: unittest_interval_tree bad memory access
- 10:37 PM Bug #1659 (Can't reproduce): Upgrade from 0.27 -> 0.37 going wrong, OSDs miss map updates
- 10:35 PM Bug #1564 (Won't Fix): osd: osd should not be primary before data is replicated
- no more backlogs, so this problem is mostly moot. it can sort of still happen (to a vastly decreased degree), but it...
- 10:33 PM Bug #1529 (Can't reproduce): cosd: os/FileStore.cc: 2390: FAILED assert(0 == "ENOENT on clone sug...
- 10:31 PM Revision 675e4c41 (ceph): mon: drop election messages with bad rank
- The bad message came from old code pre-bfbeae68c045de76ede86ca4f72d2a760a19c84b.
Fixes: #1909
Signed-off-by: Sage We...
- 10:31 PM Bug #1797 (Resolved): configure doesn't link to pthread on Fedora 14 on linking librados-config
- I'm going to assume that using the automake pthread macros fix this (commit:c5144eed4eadf5cfaa0a41c0ced2a1cd3462289f)...
- 10:30 PM Cleanup #1899 (Resolved): use acx_pthread instead of hardcoding libs and cflags into build system
- applied this a while back, commit:c5144eed4eadf5cfaa0a41c0ced2a1cd3462289f
- 10:29 PM rgw Feature #2039 (Rejected): rgw: keep more than one bucket marker object
- We generate a unique bucket index id by leveraging the pg version returned on a write operation to a special bucket m...
- 10:28 PM CephFS Bug #1827 (Resolved): libceph: hang on creating a file
- finally looked at this. the problem is just that open wasn't passed O_WRONLY or O_RDWR, and ceph_write() wasn't retu...
- 10:20 PM Feature #2038 (Rejected): mon: can't currently do commands/get status when not in quorum
- For obvious reasons, the MonClient has to authenticate with a monitor before talking to it. Right now this is accompl...
- 10:05 PM Bug #2031: paxos: failed assert (begin->last_committed == last_committed)
- Made a new bug for that issue anyway. #2037
- 04:50 PM Bug #2031: paxos: failed assert (begin->last_committed == last_committed)
- oh, that was meant for #2032!
- 04:49 PM Bug #2031: paxos: failed assert (begin->last_committed == last_committed)
- I think that could happen, so I'll check and fix it if so, but it's not what happened here.
- 04:44 PM Bug #2031: paxos: failed assert (begin->last_committed == last_committed)
- oh.. maybe it was slurping, and crashed before it stashed. when it restarted it didn't go back into slurp, because t...
- 03:03 PM Bug #2031 (Can't reproduce): paxos: failed assert (begin->last_committed == last_committed)
- ...
- 10:04 PM Bug #2037 (Resolved): mon: a crash in the middle of slurping is unrecoverable
- If a monitor comes up and starts slurping, it will start adding incremental maps to its store and update [first|last]...
- 09:39 PM Bug #1547 (Resolved): client log doesn't go to stderr unless 'log file' specified
- fixed this a few releases back
- 09:38 PM Bug #1688: Benjamin: pg stuck in scrub
- is this old/fixed? haven't seen it in a while
- 09:03 PM Feature #2024 (Resolved): make gitbuilders time out when github is sucking
- 04:04 PM rgw Cleanup #2036 (Resolved): rgw: bucket index tree contains the same info 3 times
- This is apparent by running strings on the index objects. We should be able to reduce the excessive information (whic...
- 04:01 PM rgw Bug #2035 (Resolved): rgw: bucket removal fails
- bucket removal sometimes returns either 'access denied' or 'bucket not empty'
- 03:45 PM Bug #2033: osd: segfault in OSD::update_heartbeat_peers()
- ...
- 03:32 PM Bug #2033 (Closed): osd: segfault in OSD::update_heartbeat_peers()
- just hit this twice, on two different clusters, both under testrados workloads....
- 03:32 PM Feature #2034 (Resolved): osd: refactor push code
- 03:07 PM Bug #2032 (Resolved): paxos: somehow didn't update stash alongside new states
- lxo reported that on one monitor, after seeing #2031 and bringing the monitor back up (much later), the monitor faile...
- 02:28 PM Bug #1909 (Resolved): Two mons crash after starting the third one
- this really looks like the bug fixed in commit:bfbeae68c045de76ede86ca4f72d2a760a19c84b... the sender sent a message ...
- 02:19 PM Bug #1789: mon: failed assert(paxosv == pg_map.version)
- We only saw this the once, but we believe the bug and want to keep it open.
- 02:18 PM Messengers Bug #1747: msgr: osd connection originates from wrong port
- We only saw this the once, but we believe the bug and want to keep it open.
- 02:14 PM Bug #1631: osd: failed assert(repop_queue.front() == repop)
- We haven't seen this, but hope that the messenger tests now being designed will flush it out again.
- 02:11 PM CephFS Bug #1947 (Duplicate): mds: SIGBUS during _mark_dirty
- #1549
- 02:06 PM RADOS Feature #1639: osd: guard against bad objects in cls map functions
- the specific instance was fixed. can we in general catch any exception in the class methods? safely?
- 02:02 PM Bug #1530 (Can't reproduce): osd crash during build_inc_scrub_map
- 11:28 AM Feature #2030: osd: clean up mark_unfound api
ceph pg 1.2 mark_unfound_revert foo
NOT ceph tell osd.12 mark_unfound revert pgid objectname
- 11:27 AM Feature #2030 (Resolved): osd: clean up mark_unfound api
- 11:27 AM Feature #2007: osd: enumerate unfound, lost objects, possible locations
ceph pg 1.2 list_missing|list_unfound
- list of missing objects, locators, and known locations (if !unfound)
- 11:19 AM Feature #2007: osd: enumerate unfound, lost objects, possible locations
- PGLS_MISSING
(new pg op) using rados
- 11:27 AM Feature #2006: osd: report what is blocking peering completion
ceph pg 1.2 status|query
- peering status
- recovery status
- another interesting status
- 11:11 AM Feature #2006: osd: report what is blocking peering completion
- ceph ...
ceph tell <who> ....
ceph pg query 1.2
map pg, query osd directly with
['pg', 'query', '1.2']
- 11:07 AM Feature #2005: mon: track timestamps on pg states
- query list of stale/unpeered/whatever pgs
ceph pg dump_stuck [--format=json|plain]
- 10:06 AM Bug #1974: osd: radosmodel crash on thrashing
- Summary: An object was deleted, but after a recovery was found to be back ... which is almost surely indicative of a ...
- 12:10 AM Revision 6df25e53 (ceph): rgw: url_decode object name
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 12:10 AM Revision 0da793ba (ceph): rgw: cleanup url_decode usage
- we now url_decode the relevant strings at initialization,
thus it's clear whether we need to url_decode or not later ...
02/06/2012
- 11:19 PM Revision f859f25d (ceph): osd: re-take the osd lock in the init error path where it's not held
- The Mutex::Locker will unlock it once the function exits.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
- 09:34 PM Revision 2e84c1ec (ceph): ceph-dencoder: ScrubMap[::object]
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:34 PM Revision 0bf3c54b (ceph): osd: uninline osd_stat_t methods
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:34 PM Revision c0711b09 (ceph): ceph-dencoder: coll_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:34 PM Revision 8bf08abc (ceph): ceph-dencoder: pg_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:34 PM Revision 8791cb98 (ceph): ceph-dencoder: SnapSet
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:34 PM Revision c5a58420 (ceph): ceph-dencoder: SnapContext, SnapRealmInfo
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:34 PM Revision fe39d58d (ceph): move SnapContext, SnapRealmInfo to common/snap_types.{h,cc}
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:34 PM Revision f07d2835 (ceph): ceph-dencoder: filepath
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:33 PM Revision 1fbb8ebc (ceph): ceph-dencoder: CompatSet
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:33 PM Revision 5b423b60 (ceph): kill unused tstring
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:33 PM Revision cba2674b (ceph): kill useless [cn]string.h
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:12 PM Revision c23d217c (ceph): rgw: escape and list correctly objects that start with underscore
- This should fix bug #2025.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 06:36 PM Revision 8ded2647 (ceph): crush: don't BUG_ON
- Fail gracefully on map errors; only BUG on code errors.
Signed-off-by: Sage Weil <sage@newdream.net>
- 06:33 PM Revision 9895f0bf (ceph): crush: don't BUG_ON within crush_choose
- It's very hard to recover from an invalid crushmap if mons fail
assertions while processing the map, and osds crash w...
- 05:48 PM Bug #1975: btrfs: EINVAL on snap create
- RATIONALE:
We seem to be able to make this happen, and believe it to be a btrfs bug.
We are not calling it u...
- 05:44 PM Feature #1932: mon: before accepting a new crushmap, monitor should validate and test some inputs
- Users can create their own rules, so bad rules will happen, and we must do a better job of making the Monitors robust...
- 04:22 PM rgw Bug #2025 (Resolved): rgw: objects starting with underscore are badly listed
- Fixed, commit:c23d217c93bb6ed21c1b07e347710e18446a3abc.
- 04:22 PM rgw Bug #2029 (Resolved): rgw: space in object name is turned into a different character
- Fixed, commit:6df25e53abe37b19b38e5657dbf3b4c37f03d8e3.
- 02:37 PM rgw Bug #2029 (Resolved): rgw: space in object name is turned into a different character
- looks like we fail to use the url-decoded object name.
- 02:11 PM Feature #2028 (Resolved): qa: allocate disks to btrfs on new hardware
- root isn't consistently on /dev/sda, it seems. or on a consistent /dev/disk/by-path on the plana nodes.
- 02:04 PM Feature #1970 (Resolved): osd: migrate to new encoding schemes
- this is all done, but unmerged; it'll get pulled into a release with a bunch of other encoding updates.
- 10:52 AM rgw Bug #2027 (Can't reproduce): rgw -> apache miscommunication
- There were some mystery failures, where we've seen rgw getting requests from apache, processing them, sending respons...
- 10:10 AM Bug #1973 (Can't reproduce): osd: segfault in ReplicatedPG::remove_object_with_snap_hardlinks
- let's chalk this up to the bad object_info_t
- 10:06 AM Bug #1984 (Can't reproduce): osd: failed assert, got into finish_recovery_ops without any recover...
- 10:00 AM Bug #1490 (Resolved): cfuse assert failure: assert(ob->last_commit_tid < tid)
- 05:28 AM Revision 8427090a (ceph): filejournal: flush needn't abort on write_stop
- Flush should wait for things to flush, even if we are also shutting down.
Not sure this would ever trigger, but this ...
- 05:27 AM Revision 4b8374cc (ceph): filejournal: clean up check_aio_completion
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:25 AM Revision fffee825 (ceph): filejournal: get multiple aios at a time
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 03:36 AM Bug #2026 (Can't reproduce): osd: ceph::HeartbeatMap::check_touch_file
- After my data loss due to a btrfs bug I re-installed my whole cluster with 0.41 and kernel 3.2 (ceph-client with btrf...
02/05/2012
- 08:54 PM Bug #1975: btrfs: EINVAL on snap create
- ...
- 07:11 PM rgw Bug #2025 (Resolved): rgw: objects starting with underscore are badly listed
- 01:44 AM Revision b7c20e77 (ceph): streamtest: show total throughput, avg latencies
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:44 AM Revision 30a77acb (ceph): filejournal: implement aio for writes
- Implement aio for the journal writes.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:44 AM Revision 3f0a592a (ceph): debian: depend on libaio-dev
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:44 AM Revision 4842b3d2 (ceph): ceph.spec.in: buildrequires libaio-devel
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:44 AM Revision fb0e2a3e (ceph): configure: add --without-libaio option
- Use it by default; fail if it's not there. Unless --without-libaio is
specified.
Signed-off-by: Sage Weil <sage.wei...
- 01:44 AM Revision f3dd5832 (ceph): filejournal: print aio mode on open
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
02/04/2012
- 09:25 PM Revision a9a60461 (ceph): client: init/shutdown objecter in init/shutdown
- Not in mount/unmount, and don't do shutdown() twice!
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:02 PM Feature #2024 (Resolved): make gitbuilders time out when github is sucking
- 03:32 PM CephFS Bug #1945: blogbench hang on caps
- ubuntu@teuthology:/a/nightly_coverage_2012-02-04-a/10600
- 12:01 PM Cleanup #2023 (Resolved): btrfs: Use btrfs device scan instead of btrfsctl -a
- I just upgraded my btrfs userland tools and saw:...
02/03/2012
- 11:00 PM Bug #2022 (Resolved): osd: misdirectect request
- from rados_api_tests.yaml:
[WRN] client.4292 10.3.14.128:0/3016298 misdirected client.4292.0:4 0.0 to osd.1 not [0,1...
- 10:42 PM Revision ba12f26f (ceph): rgw: fix autobuilder errors
- librgw wasn't linking with some useless unit test
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 10:42 PM Revision dba22f8e (ceph): rgw: fix warning
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 10:29 PM Revision 90fe53c3 (ceph): rgw: fix acl cleanup related regression
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 09:27 PM Revision 06ea2f7f (ceph): doc: add the ceph mds stop command.
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 06:35 PM Revision caabec96 (ceph): mon: show full status in ceph health
- HEALTH_WARN when nearfull, HEALTH_ERROR when full.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Signed-off...
- 05:34 PM Revision 13c89137 (ceph): rgw: use request uri if script name is empty
- this was required for some nginx configuration
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 05:28 PM Revision 7641a0e1 (ceph): osd: signal dispatch_cond on ms_dispatch completion
- There may be another dispatch thread waiting on this cond; we do need to
signal it!
Signed-off-by: Sage Weil <sage@n...
- 05:27 PM Revision eaa46f50 (ceph): osd: reorder PG recovery_state initialization
- The state machine state constructors print stuff to the logs, and the
PG::gen_prefix() includes all kinds of PG field...
- 10:48 AM Cleanup #2021 (Resolved): fix signal handlers
- 10:45 AM Feature #2008 (Resolved): mon: include full/nearfull in health check
- 10:31 AM Feature #2004 (Resolved): qa: make deb gitbuilder faster
- 10:11 AM Feature #2020 (Duplicate): collectd: submit plugin upstream
- 05:06 AM Revision d7f61c8d (ceph): test/encoding/readable.sh: nicer output
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:06 AM Revision 5216eb07 (ceph): ceph-dencoder: more helpful error message for messages
- If the type doesn't match, share what it was vs what you expected.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:06 AM Revision 5103338c (ceph): messages: set type in default constructor
- ceph-dencoder wants this to verify it decoded the correct message type.
Not that it is likely to happen, but let's be...
- 03:24 AM Revision ae67c2de (ceph): pick object from random osd for primary recovery
- When recovering a primary, try the osds that have a copy of the object
in random order, rather than preferring the lo...
- 01:04 AM Revision 05f66c45 (ceph): msg: fix message leak on receipt of undecodable message
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:01 AM Revision c3eacb15 (ceph): Makefile: add test/encoding/types.h
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:00 AM Revision 0dcbc86c (ceph): Merge branch 'wip-encoding'
- Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
Conflicts:
src/msg/Message.h
src/osd/OSD.cc
src/osd/Replicat...
- 12:58 AM Revision 6e62fc48 (ceph): test/encoding/readable.sh: check all version
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 12:58 AM Revision 625a89d4 (ceph): test/encoding/readable.sh: nicer output
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 12:58 AM Revision d597dc2d (ceph): encoding: document ENCODE/DECODE macros
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
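Revision ae67c2de above (pick object from random osd for primary recovery) describes trying the OSDs that hold a copy in random order instead of a fixed preference. The idea can be sketched as follows; this is an illustration under assumptions, not the OSD's C++ logic, and `pull_order`, `pick_pull_source`, and the `is_usable` callback are hypothetical names.

```python
import random
from typing import Callable, Iterable, List, Optional


def pull_order(osds_with_copy: Iterable[int],
               rng: Optional[random.Random] = None) -> List[int]:
    """Return the candidate OSD ids in a random order (hypothetical helper)."""
    if rng is None:
        rng = random.Random()
    order = list(osds_with_copy)
    rng.shuffle(order)  # shuffle a copy so the caller's collection is untouched
    return order


def pick_pull_source(osds_with_copy: Iterable[int],
                     is_usable: Callable[[int], bool],
                     rng: Optional[random.Random] = None) -> Optional[int]:
    """Try candidates in random order and return the first usable one, or None."""
    for osd in pull_order(osds_with_copy, rng):
        if is_usable(osd):
            return osd
    return None


if __name__ == "__main__":
    # Over many recoveries the chosen source spreads across all holders
    # rather than always landing on the same OSD.
    print(pick_pull_source([0, 1, 2, 3], lambda osd: True))
```

Shuffling the candidates spreads recovery reads across all holders of the object, which is the load-balancing effect the commit message is after.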
02/02/2012
- 11:20 PM Revision e9b97c13 (ceph): osd: fix another repop->ctx->op deref
- Ok this time I actually looked for more and didn't see any.
Signed-off-by: Sage Weil <sage@newdream.net>
- 11:16 PM Revision bf5d7d05 (ceph): Merge remote branch 'gh/wip-objecter-initialized'
- Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
- 11:07 PM Revision 2f5ba8fb (ceph): osd: avoid null deref of repop->ctx->op
- It's optional.
Signed-off-by: Sage Weil <sage@newdream.net>
- 09:48 PM Revision a0dde422 (ceph): encoding: document ENCODE_DUMP throttling weirdness
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:43 PM Revision 96876097 (ceph): encoding: fix DECODE_START macro
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:43 PM Revision 9c2d779b (ceph): encoding: add DECODE_OLDEST macro
- So we can (gracefully) fail to decode very old encoded versions we no
longer support.
Signed-off-by: Sage Weil <sage...
- 09:31 PM Revision 690b9919 (ceph): osd: fix another issue_repop() ctx->op null deref
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:50 PM Revision d9261942 (ceph): check-generated.sh: do self-decode test first
- This way we get a helpful error instead of a silent failure later on.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:47 PM Revision efe77a8e (ceph): check-generated.sh: nicer output
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:46 PM Revision 153e89d2 (ceph): ceph-dencoder: print errors to stderr
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:36 PM Revision 2a262956 (ceph): osd: do not dereference ctx->op when NULL
- We may not have an OpRequest. Make the later check do the cast properly
when it is needed.
Signed-off-by: Sage Weil...
- 07:57 PM Bug #2016 (Resolved): OSD: pull should randomly choose a pull target
- applied, thanks!
- 12:00 PM Bug #2016: OSD: pull should randomly choose a pull target
- Here's a patch that fixes this.
- 11:13 AM Bug #2016 (Resolved): OSD: pull should randomly choose a pull target
- Currently, we choose the lowest-numbered osd to pull from. This biases the recovery load towards lower-numbered osds.
- 07:30 PM Revision 8623c64d (ceph): encoding: better DECODE_START_LEGACY_COMPAT_LEN
- - let you specify whether to decode compat and/or len
- put the argument order in the macro name so you know when you...
- 07:30 PM Revision 73e92b31 (ceph): buffer: iterator::get_remaining()
- It's helpful to know how much data is remaining.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:23 PM Messengers Bug #1985: msgr: creating new Pipe for pre-existing connection leaks Pipe if they don't replace
- Hacked up a small patch that should do it, but need to test and get some feedback on related protocol stuff I ran into.
- 06:55 PM Revision 04753e9a (ceph): Merge remote-tracking branch 'gh/wip-osd-op-tracking'
- Reviewed-by: Sage Weil <sage@newdream.net>
- 06:53 PM Revision cb754917 (ceph): osd: use obc for size in calc_head_subsets()
- No need to call stat(2) here; the caller has what we need.
Signed-off-by: Sage Weil <sage@newdream.net>
- 06:53 PM Revision c4ca1142 (ceph): osd: fix osd_recover_clone_overlap
- we need to populate data_subset
- add check in calc_head_subsets() too
Fixes 2116f012.
Signed-off-by: Sage Weil <...
- 06:41 PM Revision dab9f0f9 (ceph): Merge branch 'master' into wip-encoding
- Conflicts:
src/osd/OSD.cc
src/osd/PG.cc
src/osd/PG.h
- 06:38 PM Revision 83432af2 (ceph): common/Throttle: throttle in FIFO order
- Under heavy write load from many clients, many reader threads will
be waiting in the policy throttler, all on a singl...
- 06:07 PM Revision 36a4ca40 (ceph): filestore: remove obsolete fs type check
- This isn't a useful check. xfs and ext4 work too.
Fixes: #1995
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:36 PM Revision 0cd16cf0 (ceph): ceph: always add logger for daemons
- The extra log function added redundant info and didn't allow different
levels.
- 05:35 PM Revision 7af7c66b (ceph): ceph: rename type parameter to type_
- type is a built-in and shouldn't be aliased.
- 05:27 PM Revision 7146db92 (ceph): ceph: use the correct comparison operator
- 'is' compares identity (i.e., the address in CPython), not value.
- 05:26 PM Revision e7672b64 (ceph): ceph: sync before unmounting btrfs devices
- There may still be writes in flight, since the osds may not have
shut down cleanly. This should prevent EBUSY when unm...
- 05:26 PM Revision 1364b882 (ceph): ceph: delay raising exceptions until all daemons are stopped
- If a daemon crashes, the exception is raised when we stop it. This
caused some daemons to continue running during cle...
- 05:02 PM Revision 290730ee (ceph): objecter: track whether initialized; add asserts
- init() should be called when not initialized; shutdown() should not be
called unless initialized. No handle_* method...
- 05:02 PM Revision 33659521 (ceph): librados: discard incoming messages when DISCONNECTED
- If we are disconnected (probably shutting down, if we are receiving a
message) then ignore anything incoming. This a...
- 05:02 PM Revision 51ccce06 (ceph): client: let set_filer_flags clear flags, toos
- ceph-syn does this...
Signed-off-by: Sage Weil <sage@newdream.net>
- 05:02 PM Revision 824c3af7 (ceph): client: add initialized flag to client
- Do not call init() while initialized; do not call shutdown unless
initialized.
Drop incoming messages if not initial...
- 05:01 PM Revision 4ef4d3f1 (ceph): test_filejournal: fix warnings
- Signed-off-by: Sage Weil <sage@newdream.net>
- 04:50 PM Linux kernel client Bug #1990 (Resolved): rbd: null pointer dereference during map
- Problem seems to be gone now.
- 04:36 PM CephFS Bug #1945: blogbench hang on caps
- Happened again in /var/lib/teuthworker/archive/nightly_coverage_2012-02-02-a/10268 (also blogbench)
- 03:38 PM CephFS Bug #2019 (Resolved): mds: CInode::filelock stuck in sync->mix
- Reported by Kioob`Taff in irc. Some logging is available at gregf@kai:~/logs/kioob. Unfortunately not of the lock get...
- 03:33 PM CephFS Bug #2018 (Resolved): mds: can't change file_max
- http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/4612
Relevant MDS log snip (repeats):...
- 03:15 PM Bug #2014 (Resolved): librados shutdown race
- resolved by commit:33659521a92315f71040551b2699d9961acc07f7 and neighbors.
- 03:13 PM Linux kernel client Bug #2017: osd: segfault in snap trimmer
- Since Sage has fixed this, I've deleted the archive of /tmp/cephtest I had saved.
- 03:06 PM Linux kernel client Bug #2017 (Resolved): osd: segfault in snap trimmer
- pushed fix for this (and another similar bug) to master.
- 03:00 PM Linux kernel client Bug #2017: osd: segfault in snap trimmer
- I bundled up the /tmp/cephtest directory in its entirety. It is here:
flak.ops.newdream.net:~elder/tracker_2017...
- 02:59 PM Linux kernel client Bug #2017: osd: segfault in snap trimmer
- The segfault was from trying to dereference repop->ctx->op, which was NULL.
- 02:52 PM Linux kernel client Bug #2017 (Resolved): osd: segfault in snap trimmer
- Testing some reasonably solid changes to the rbd code I ran across an OSD crash.
It looks like it happ
The YAML fil...
- 11:03 AM Feature #2015 (Resolved): osd: dump in-flight ops via admin socket
- 11:03 AM Feature #1879 (Resolved): osd: track list of in-progress requests, log slow ones
- 10:09 AM Bug #1997 (Resolved): teuthology: wait for clean osd shutdown before umount
- This was different from #1744 - daemons are shut down without waiting for I/O to complete, which causes this issue wh...
- 10:07 AM Bug #1744 (Resolved): teuthology: race with daemon shutdown?
- This turned out to be uncaught exceptions that weren't logged until later when daemons crashed. Fixed by 1364b8826f3f...
- 10:06 AM Bug #1995 (Resolved): Turn down non-btrfs warning in FileStore
- commit:36a4ca40805a5b0665e749b2b928d94749a8dd87
- 01:17 AM Revision da02c40d (ceph): osd: d'oh again! Make this real exponential, not...ever-linear.
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 01:17 AM Revision 030ad872 (ceph): osd: mark_started() osd sub ops
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 01:17 AM Revision f7e6e18a (ceph): osd: OpRequest currently_* needs to look at latest, not hit.
- D'oh!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 01:01 AM Revision c81845b6 (ceph): rgw: fix crash related to cleanups
- there are still a few regressions, but getting there.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 12:34 AM Revision 00a2e84b (ceph): do_autogen.sh: -e <path> to dump encoded objects to a path
- Make it easy to build with encode dumping enabled. This is just a
convenient way to generate a large corpus of encod...
- 12:13 AM Revision 91073a6a (ceph): check-generated.sh: run on 'make check'
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 12:05 AM Revision 8870a676 (ceph): Merge remote branch 'origin/master' into wip-osd-op-tracking
- Conflicts:
src/osd/ReplicatedPG.h
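Two of the teuthology revisions listed for this day (7146db92 and 7af7c66b) turn on small Python points: 'is' tests object identity rather than value, and a parameter named type hides the built-in. The sketch below illustrates both. It is not taken from the teuthology source; the function names and sample values are hypothetical.

```python
# Illustrative sketch only; names here are hypothetical and do not come
# from the teuthology code base.

def same_value(a, b):
    """Compare by value, which is what most checks actually want."""
    return a == b


def same_object(a, b):
    """Compare by identity: true only if a and b are the same object."""
    return a is b


def describe(type_, value):
    """Naming the parameter type_ keeps the built-in type() usable here."""
    return "%s:%s" % (type_, type(value).__name__)


x = [1, 2, 3]
y = list(x)  # a distinct list object with equal contents

print(same_value(x, y))   # equal contents
print(same_object(x, y))  # different objects
print(describe("osd", 3))
```

Lists are used for the comparison because identity results for small integers and short strings depend on interpreter caching and can mislead.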
02/01/2012
- 11:54 PM Revision f125070b (ceph): osd: pg_stat_t: fix member initialization
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 11:54 PM Revision 71c59dae (ceph): osd: add check_ops_in_flight()
- By default it warns on requests that are more than 30 seconds old,
using an exponential backoff of that interval.
Als...
- 11:51 PM Revision 63ad89d2 (ceph): osd: fix PG::Interval member initialization
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 11:49 PM Revision 500f4c66 (ceph): osdmap: finalize crush after building simple map
- This ensures that max_devices gets calculated (and thus encoded) properly.
Signed-off-by: Sage Weil <sage.weil@dream...
- 11:49 PM Revision c41adacf (ceph): osdmap: make test instnaces deterministic
- current time can vary
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 11:41 PM Revision d4d1b64f (ceph): import-generated.sh: fix to use ceph-dencoder syntax
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 11:41 PM Revision 290c4b72 (ceph): ceph-dencoder: fix ctor
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 11:31 PM Revision 2d0da67d (ceph): ceph_context: initialize member var
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 11:28 PM Revision cb5f2708 (ceph): rgw: some more acls cleanup
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 11:02 PM Revision 544ea29d (ceph): osd: switch op passing interface to use OpRequest instead of raw Messages
- This doesn't handle the PG internals yet.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 11:02 PM Revision ba392e3d (ceph): PG: switch op passing interface to use OpRequest
- This is all the PG/ReplicatedPG internals and the few remaining OSD callers.
Signed-off-by: Greg Farnum <gregory.far...
- 11:02 PM Revision fd3108ee (ceph): osd: "mark" OpRequests as they move through the system.
- Right now these are just informational flags which can be read out. Later
they might extend to timing information, se...
- 11:02 PM Revision 4075d521 (ceph): osd: add new OpRequest struct and an xlist to track it
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 11:02 PM Revision 25c5daec (ceph): osd: PGLSResponse -> pg_ls_response_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 11:02 PM Revision ac1fbd18 (ceph): osd: PG::Missing -> pg_missing_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 10:28 PM Revision a41679a6 (ceph): osd: PG::Log -> pg_log_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 10:16 PM Revision 5263c12a (ceph): osd: PG::Log::Entry -> pg_log_entry_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:03 PM Revision 460a4622 (ceph): osd: PG::Query -> pg_query_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:58 PM Revision 3353f572 (ceph): cls_rgw: update bucket index when deleting object (with pending)
- Bug #2012. Racing delete with other operations (update or another
delete) failed to update the bucket index.
Signed-...
- 08:53 PM Revision 25293748 (ceph): osd: PG::Info -> pg_info_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:43 PM Revision 1c5370cd (ceph): osd: PG::Info::History -> pg_history_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:52 PM Revision 64bdd389 (ceph): osd: PG::Info[::History] dump, test instances
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:26 PM Revision 139db823 (ceph): ceph-dencoder: remove message type dups
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:25 PM Revision 2b05000d (ceph): ceph-dencoder: generate test instances on heap
- Some objects aren't copyable (OSDMap contains CrushWrapper), but we'd
still like to programmatically generate test in...
- 06:55 PM Revision cc405721 (ceph): Merge remote branch 'gh/wip-divergent-backfill'
- Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
- 06:49 PM Revision 1fdb5e5f (ceph): osd: dump, test instances for PG::Interval
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:46 PM Revision 19313fbe (ceph): osd: dump, instances for PG::OndiskLog
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:41 PM Revision 9307edd1 (ceph): ceph-dencoder: use g_ceph_context
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:41 PM Revision 19227b23 (ceph): ceph-dencoder: OSDMap and osd_info_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:41 PM Revision cbaf83d7 (ceph): osdmap: test instances for osd_info_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:40 PM Revision d0dbaaaf (ceph): osdmap: test instances
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:34 PM Revision ad77bb48 (ceph): osdmap: normalist encode/decode
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:09 PM Feature #1879: osd: track list of in-progress requests, log slow ones
- This is in the branch wip-osd-op-tracking. There are some ops that still need to get marked up; I have logs to go thr...
- 03:44 PM Feature #1836: filejournal: use async directio to write to the journal
- 01:36 PM Bug #1984: osd: failed assert, got into finish_recovery_ops without any recovery ops active?
- Hmm, we still haven't seen this in our thrashing in qa. I'll start thrashing on some of the new hardware.
- 01:35 PM Bug #1983 (Resolved): osd: failed assert, info does not match peer info
- 01:12 PM rgw Bug #2012 (Resolved): rgw: racing object creation and removal may lead to bad bucket accounting
- Fixed, commit:3353f572f84707fbc0e99a9af2dc48de2d0aa2c9.
- 12:49 PM Bug #2014: librados shutdown race
- ...
- 12:49 PM Bug #2014 (Resolved): librados shutdown race
- ...
- 12:36 PM Linux kernel client Bug #147: lockdep: possible irq lock inversion dependency w/ osdc->request_mutex and con->mutex
- Saw this again. It's been a while.....
- 08:45 AM Feature #1971 (Resolved): encoding: adapt to messages
- 08:45 AM Feature #1969 (Resolved): gitbuilder for 11.10, 12.04
- 07:46 AM Bug #1992: OSD::get_or_create_pg
- I was running the stock 3.0 kernel from Ubuntu 11.10
I tried with the latest ceph-client code (saw your post about...
- 06:48 AM Revision a5366c8b (ceph): ceph-dencoder: add all message types
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:48 AM Revision 32010d78 (ceph): msg: add missing #includes for messages
- And remove that unused max() macro.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:39 AM Revision 1cb39fac (ceph): msg: dump messages via build option
- Dump encoded messages to ENCODE_DUMP when it is defined, just as we do with
the regular encode function.
Signed-off-...
- 04:06 AM Revision 597e97a6 (ceph): osd: fix assignment in PG::rewind_divergent_log()
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 12:37 AM Revision 0b68dbca (ceph): add backfill test
- 12:25 AM Revision 0236dc0f (ceph): add backfill task
- This does a basic test of backfill functionality, including a divergent
log on a backfill target (#1983).
- 12:18 AM Revision 7cb561b4 (ceph): Merge remote-tracking branch 'gh/wip-journal-crc'
- Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
- 12:13 AM Revision e337c472 (ceph): ceph_manager: add manager.blackhole_kill_osd()
- This will suspend disk writes for a couple of seconds and then kill the
daemon. It helps us simulate a hardware failure.
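Revision 71c59dae above describes check_ops_in_flight() as warning on requests older than 30 seconds, with an exponential backoff of that interval. A minimal sketch of one way such a check could behave is shown below. It assumes the warning threshold doubles for an op each time that op is reported; the class, field names, and doubling rule are hypothetical and are not the Ceph C++ implementation.

```python
# Hypothetical sketch of a slow-request check with exponential backoff.
# The names and the doubling rule are assumptions for illustration only.

class TrackedOp:
    def __init__(self, received_at):
        self.received_at = received_at  # time the request arrived (seconds)
        self.warn_multiplier = 1        # grows each time we warn about this op


def check_ops_in_flight(ops, now, warn_age=30.0):
    """Return (op, age) pairs for ops that are due a slow-request warning.

    An op is reported once its age exceeds warn_age * warn_multiplier.
    After each report the multiplier doubles, so the same op is reported
    again only after 60s, 120s, 240s, and so on.
    """
    due = []
    for op in ops:
        age = now - op.received_at
        if age > warn_age * op.warn_multiplier:
            due.append((op, age))
            op.warn_multiplier *= 2
    return due


op = TrackedOp(received_at=0.0)
for t in (10.0, 31.0, 45.0, 61.0, 100.0, 121.0):
    print(t, len(check_ops_in_flight([op], now=t)))
```

Doubling the threshold per op keeps a single stuck request from flooding the log while still reporting it periodically.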
01/31/2012
- 11:41 PM Revision 9d385f52 (ceph): msgr: Document recv_stamp and add a dispatch_stamp and throttle_wait.
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 09:56 PM Feature #2004 (In Progress): qa: make deb gitbuilder faster
- 09:56 PM Feature #1885 (Resolved): identify top 10 expected failures and process to diagnose
- 09:00 PM Revision ba4aad48 (ceph): qa: test_backfill.sh: take osd.0 down
- Mark this down to
1- trigger the WaitActingChange vs osd down race, and
2- help trigger a divergent log when osd.2 is...
- 07:44 PM Revision f1c3538f (ceph): osd: fix divergent backfill targets
- During peering, a previous backfill target may have a slightly newer
last_update than the other options, but it will ...
- 07:44 PM Revision f4e44e43 (ceph): qa: test_backfill.sh: limit pg log length so we trigger backfill
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:44 PM Revision 747b3d46 (ceph): osd: use RecoveryContext transaction, finishers on recovery completion
- We should use the enclosing transaction and finisher list here.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:44 PM Revision 9dfa46ff (ceph): osd: rename recovery event NeedNewMap -> NeedActingChange
- This is more precise.
Signed-off-by: Sage Weil <sage@newdream.net>
- 07:44 PM Revision 5a544836 (ceph): osd: restart peering if requesting acting osd goes down
- If we request an acting set, we need to restart peering if one of the
requested nodes goes down. This prevents a dea...
- 06:56 PM Bug #2013: osd: messages for pgs we don't store are never freed
- I think the thing to do is check the waiting map on activate and discard pgs and their messages if they no longer map...
- 05:52 PM Bug #2013 (Resolved): osd: messages for pgs we don't store are never freed
- Once request timestamps are implemented, we could have a timeout period after which misdirected requests are dropped.
- 05:09 PM rgw Bug #2012 (Resolved): rgw: racing object creation and removal may lead to bad bucket accounting
- 04:05 PM Revision d7be7762 (ceph): Allow user to disable lock checking.
- The new plana hardware isn't in the old sepia lock database,
and the machine pools are risky to merge as nothing in t...
- 03:59 PM Revision 09bed164 (ceph): Allow user to provide flavor to use.
- With this, you can use Ubuntu 11.10 machines with teuthology by saying::
    tasks:
    - ceph:
        flavor: oneiric
...
- 03:23 PM Revision 9520ee78 (ceph): filestore: implement filestore_blackhole hook
- If true, we'll drop any new transactions on the floor. Useful for
triggering failure conditions (e.g., prior to killi...
- 02:19 PM Feature #2011 (Resolved): osd: do not backfill/recover to full osds
- 02:18 PM Feature #2010 (New): mon: check for slow performing osds
- 02:18 PM Feature #2009 (Resolved): osd: report performance to monitor
- 02:17 PM Feature #2008 (Resolved): mon: include full/nearfull in health check
- 02:17 PM Feature #2007 (Resolved): osd: enumerate unfound, lost objects, possible locations
- 02:16 PM Feature #2006 (Resolved): osd: report what is blocking peering completion
- 02:15 PM Feature #2005 (Resolved): mon: track timestamps on pg states
- 09:10 AM Bug #2002: osd: racy push/pull for clones
- sage@metropolis.ceph.dreamhost.com:osd.log.badpushpull
shows the (or similar) badness. workload was...
- 01:02 AM Revision 1fe75ee6 (ceph): rgw: should remove bucket dir instead of sending intent
- that was really useless, and also bucket cleanup was broken anyway.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream....
- 12:48 AM Revision 2b5bbe8e (ceph): librados: fix a leak
- watch notification message was missing a ->put()
Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>