Activity
From 01/11/2012 to 02/09/2012
02/09/2012
- 09:41 PM Bug #1974: osd: radosmodel crash on thrashing
- commit:359dfb9966d15d997f9e0351a5ed8de1faae62fe
- 09:41 PM Bug #1974 (Resolved): osd: radosmodel crash on thrashing
- 09:20 PM Bug #1975: btrfs: EINVAL on snap create
- I'm pretty sure this was triggered by #2046. There is still a btrfs bug, but we were doing the wrong thing if rmdir ...
- 09:18 PM Bug #2013 (Resolved): osd: messages for pgs we don't store are never freed
- 04:38 PM Bug #2046 (Resolved): filestore: do_op running during commit
- commit:1009d1a016f049e19ad729a0c00a354a3956caf7 and commit:93d7ef96316f30d3d7caefe07a5a747ce883ca2d
- 04:02 PM Bug #2046: filestore: do_op running during commit
- this was broken by commit:259c509a8941bf7cdad8bd4ede0ccd73ca8a83d3, way back in v0.25! Sigh. The wait condition for...
- 10:05 AM Bug #2046 (Resolved): filestore: do_op running during commit
- commit_start() is supposed to quiesce writes, but I see...
- 04:24 PM Bug #2044: osd: pg stuck in active+backfill
- This should be fixed by commit:f0334673ab8547807b961aae19a8e53531585e3f.
- 10:55 AM rgw Bug #2048 (Resolved): rgw: multipart upload listing return key starting with _multipart_
- reported by jdwilson over irc.
- 10:41 AM RADOS Bug #2047 (Resolved): crush: with a rack->host->device hierarchy, several down devices are likely...
- See http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/5166
Sage says the cause is down devices only tr...
- 10:02 AM Bug #2045: osd: dout_lock deadlock
- ubuntu@teuthology:/a/nightly_coverage_2012-02-09-a/11210
metropolis:~sage/bug-2045
- 09:56 AM Bug #2045 (Can't reproduce): osd: dout_lock deadlock
- a thread is blocked on dout_lock, can't tell who.
- 05:05 AM Revision 0a60fcf3 (ceph): Merge remote branch 'gh/wip-types'
- 04:43 AM Revision 143ad86b (ceph): Merge remote branch 'gh/wip-stuck-in-backfill'
- Reviewed-by: Sage Weil <sage.weil@dreamhost.com>
- 01:40 AM Revision f0334673 (ceph): ReplicatedPG: don't count deletions as ops
- Counting them as ops but not requeueing the pg for recovery causes
backfill to stall when only deletions are sent in
...
- 01:15 AM Revision 42db09b7 (ceph): osd: don't remove pg from recovery queue if not enough recovery ops sta...
- The pg has already been dequeued at the beginning of do_recovery(),
and it requeues itself only if it starts a new re...
- 01:11 AM Revision a6d7629c (ceph): rgw: don't treat plus as a space in url decode
- Any special character encoding should be done through %hex. The
plus sign is a valid character in object names, and i...
- 12:19 AM Revision 72bbaeac (ceph): osd: discard waiting ops when pg mapping changes
- If the pg mapping changes away from us, we can safely discard messages we
have waiting for the PG to be created.
Fix...
02/08/2012
- 09:30 PM Linux kernel client Bug #1793: NULL pointer dereference at try_write+0x627/0x1060
- Hmm.. yeah, I don't think we have anything beyond these console dumps. And we don't capture any kind of kernel core ...
- 09:17 PM Linux kernel client Bug #1793: NULL pointer dereference at try_write+0x627/0x1060
- Is there a core file for this problem anywhere?
It would really be nice to poke around in the message, or the
con...
- 09:19 PM Revision 359dfb99 (ceph): osd: flush on activate
- PG::activate() can make lots of changes, most notably clean_up_local()
which deletes lots of local objects. Those ch...
- 09:17 PM Revision 6c4687fe (ceph): Makefile: check readability of object corpus on 'make check'
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:17 PM Revision e261317e (ceph): add ceph-object-corpus.git submodule
- 09:12 PM Revision 66867888 (ceph): ceph-dencoder: PGMap[::Incremental]
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision bc7fd210 (ceph): mon: uninline Monmap encode/decode
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision 8cf81ccf (ceph): ceph-dencoder: MonMap
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision b6d1c0c9 (ceph): mon: initialize [near]full_ratio during create_initial(), not ctor
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision d778cab0 (ceph): mon: make [near]full_ratio config options floats
- ratio implies a real number, not a percentage. Correct, though, if it is
> 1.0.
Signed-off-by: Sage Weil <sage@newd...
- 09:12 PM Revision dc5033f0 (ceph): osd: fix ScrubMap::object ctor
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision 4df4465c (ceph): osd: is_zero() method for stat structs
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision d20d5c10 (ceph): mon: refactor calc_stats()
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision 59a9e4eb (ceph): mon: fix PGMap::generate_test_instances()
- Apply an incremental instead of futzing directly with members.
Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision dfaa7fd7 (ceph): ceph-dencoder: MonCap[s]
- Need some better test instances for MonCaps...
Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision 3f94c15b (ceph): mon: better MonCaps test cases
- Move MonCaps to libcommon.la.
Signed-off-by: Sage Weil <sage@newdream.net>
- 09:12 PM Revision 8e2ceb4e (ceph): mon: fix [near]full_ratio conf update
- Already a value in [0,1]. Interpret as a percentage if > 1.0.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:45 PM Bug #2044 (Resolved): osd: pg stuck in active+backfill
- jmlowe ran into this on his cluster several times. The primary doing backfill failed to requeue the pg for recovery.
...
- 07:08 PM Revision 5ce10979 (ceph): mon: waitlist new sessions trying to connect while we're out of quorum
- If we're stuck out of the quorum, we don't want clients connecting to
us. Instead, waitlist their requests; proces...
- 06:30 PM Revision cbf50eb2 (ceph): update amazonaws xmlns to correct url
- Signed-off-by: michael rodriguez <michael@newdream.net>
- 04:54 PM rgw Bug #2043 (Resolved): rgw: cannot use '+' in url
- Either in signed urls (e.g., as part of the uid), or in object names. Reason is that url_decode removes it. Relax url...
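The difference between the two decoding behaviours in this report can be sketched in Python. This is only an illustration using the standard library, not rgw's C++ `url_decode`; the helper name `decode_object_name` and the sample name are hypothetical.

```python
from urllib.parse import unquote, unquote_plus


def decode_object_name(raw: str) -> str:
    """Decode %hex escapes only; a literal '+' is left untouched."""
    return unquote(raw)


name = "photos/a+b%20c.jpg"

# Path-style decoding: '+' survives, '%20' becomes a space.
print(decode_object_name(name))

# Form-style decoding: '+' is also turned into a space, which loses the plus sign.
print(unquote_plus(name))
```

A client that really wants a space would send `%20`, and one that wants to be explicit about a plus sign can send `%2B`; both decode unambiguously under the first approach.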
- 04:45 PM Bug #2042: mon: crash in LogMonitor::update_from_paxos
- ubuntu@teuthology:/a/nightly_coverage_2012-02-08-b/11127
- 04:45 PM Bug #2042: mon: crash in LogMonitor::update_from_paxos
- core + binary + tarball are at metropolis:~sage/bug-2042
- 04:43 PM Bug #2042 (Duplicate): mon: crash in LogMonitor::update_from_paxos
- ...
- 02:26 PM Linux kernel client Bug #1907: rbd: don't reuse device ids while they're still in use elsewhere
- 02:23 PM Linux kernel client Bug #1907: rbd: don't reuse device ids while they're still in use elsewhere
- After a few weeks of wandering around the code, figuring out how
things work and refactoring and fixing things as I ...
- 01:17 PM Cleanup #2041 (Resolved): osd: move peering into worker threads
- 10:52 AM Bug #1974: osd: radosmodel crash on thrashing
- Just hit this:
- clean_up_local removed an object (due to a 'delete' log entry)
- a read came in and read it befo...
- 06:41 AM rgw Feature #2040 (Resolved): rgw: disable rgw log through ceph.conf
- Currently the way to do it is through the apache conf.
- 06:07 AM Revision 1a028e5c (ceph): mds: remove IntervalTree code
- Not used, not tested.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:56 AM Revision 21a1dbd8 (ceph): trivial_libceph: need O_RDWR
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:36 AM Revision d63303de (ceph): client: -EINVAL write if not opened writable
- Fixes: #1827
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:31 AM Revision 4784b98f (ceph): client: clean up ctor a bit
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:28 AM Revision b5a5a4bf (ceph): client: initialize initialized
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:14 AM Revision 1f864354 (ceph): osd: always send scrub errors to cluster log
- Some errors were going to the cluster log, some weren't. Normalize the
output format and send them all.
Signed-off-...
- 12:27 AM Revision f28287f0 (ceph): mon: make PaxosService::update_from_paxos return void.
- You can't really recover from a failed update (as PGMonitor was trying
to do), and nothing in the system checks the r...
- 12:26 AM Revision 1125d71b (ceph): mon: call update_from_paxos() when we finish slurping updates.
- To aid in this, add a new get_paxos_service_by_name function.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
02/07/2012
- 10:56 PM Bug #2013 (In Progress): osd: messages for pgs we don't store are never freed
- see wip-pg-waiters?
- 10:46 PM CephFS Bug #1996 (Duplicate): mds: scatter_nudge() bad pointer on shutdown?
- this is the signal handler thing
- 10:45 PM Bug #1901 (Resolved): Missing files in ceph packages results in build failure of tests
- 10:43 PM rgw Bug #1721 (Can't reproduce): rgw: spurious multipart-upload failures
- 10:41 PM Bug #1626 (Can't reproduce): ceph-mon HA not working right; all must be up
- 10:37 PM CephFS Bug #1902 (Won't Fix): mds: unittest_interval_tree bad memory access
- 10:37 PM Bug #1659 (Can't reproduce): Upgrade from 0.27 -> 0.37 going wrong, OSDs miss map updates
- 10:35 PM Bug #1564 (Won't Fix): osd: osd should not be primary before data is replicated
- no more backlogs, so this problem is mostly moot. it can sort of still happen (to a vastly decreased degree), but it...
- 10:33 PM Bug #1529 (Can't reproduce): cosd: os/FileStore.cc: 2390: FAILED assert(0 == "ENOENT on clone sug...
- 10:31 PM Revision 675e4c41 (ceph): mon: drop election messages with bad rank
- The bad message came from old code pre-bfbeae68c045de76ede86ca4f72d2a760a19c84b.
Fixes: #1909
Signed-off-by: Sage We...
- 10:31 PM Bug #1797 (Resolved): configure doesn't link to pthread on Fedora 14 on linking librados-config
- I'm going to assume that using the automake pthread macros fixes this (commit:c5144eed4eadf5cfaa0a41c0ced2a1cd3462289f)...
- 10:30 PM Cleanup #1899 (Resolved): use acx_pthread instead of hardcoding libs and cflags into build system
- applied this a while back, commit:c5144eed4eadf5cfaa0a41c0ced2a1cd3462289f
- 10:29 PM rgw Feature #2039 (Rejected): rgw: keep more than one bucket marker object
- We generate a unique bucket index id by leveraging the pg version returned on a write operation to a special bucket m...
- 10:28 PM CephFS Bug #1827 (Resolved): libceph: hang on creating a file
- finally looked at this. the problem is just that open wasn't passed O_WRONLY or O_RDWR, and ceph_write() wasn't retu...
- 10:20 PM Feature #2038 (Rejected): mon: can't currently do commands/get status when not in quorum
- For obvious reasons, the MonClient has to authenticate with a monitor before talking to it. Right now this is accompl...
- 10:05 PM Bug #2031: paxos: failed assert (begin->last_committed == last_committed)
- Made a new bug for that issue anyway. #2037
- 04:50 PM Bug #2031: paxos: failed assert (begin->last_committed == last_committed)
- oh, that was meant for #2032!
- 04:49 PM Bug #2031: paxos: failed assert (begin->last_committed == last_committed)
- I think that could happen, so I'll check and fix it if so, but it's not what happened here.
- 04:44 PM Bug #2031: paxos: failed assert (begin->last_committed == last_committed)
- oh.. maybe it was slurping, and crashed before it stashed. when it restarted it didn't go back into slurp, because t...
- 03:03 PM Bug #2031 (Can't reproduce): paxos: failed assert (begin->last_committed == last_committed)
- ...
- 10:04 PM Bug #2037 (Resolved): mon: a crash in the middle of slurping is unrecoverable
- If a monitor comes up and starts slurping, it will start adding incremental maps to its store and update [first|last]...
- 09:39 PM Bug #1547 (Resolved): client log doesn't go to stderr unless 'log file' specified
- fixed this a few releases back
- 09:38 PM Bug #1688: Benjamin: pg stuck in scrub
- is this old/fixed? haven't seen it in a while
- 09:03 PM Feature #2024 (Resolved): make gitbuilders time out when github is sucking
- 04:04 PM rgw Cleanup #2036 (Resolved): rgw: bucket index tree contains the same info 3 times
- This is apparent by running strings on the index objects. We should be able to reduce the excessive information (whic...
- 04:01 PM rgw Bug #2035 (Resolved): rgw: bucket removal fails
- bucket removal sometimes returns either 'access denied' or 'bucket not empty'
- 03:45 PM Bug #2033: osd: segfault in OSD::update_heartbeat_peers()
- ...
- 03:32 PM Bug #2033 (Closed): osd: segfault in OSD::update_heartbeat_peers()
- just hit this twice, on two different clusters, both under testrados workloads....
- 03:32 PM Feature #2034 (Resolved): osd: refactor push code
- 03:07 PM Bug #2032 (Resolved): paxos: somehow didn't update stash alongside new states
- lxo reported that on one monitor, after seeing #2031 and bringing the monitor back up (much later), the monitor faile...
- 02:28 PM Bug #1909 (Resolved): Two mons crash after starting the third one
- this really looks like the bug fixed in commit:bfbeae68c045de76ede86ca4f72d2a760a19c84b... the sender sent a message ...
- 02:19 PM Bug #1789: mon: failed assert(paxosv == pg_map.version)
- We only saw this the once, but we believe the bug and want to keep it open.
- 02:18 PM Messengers Bug #1747: msgr: osd connection originates from wrong port
- We only saw this the once, but we believe the bug and want to keep it open.
- 02:14 PM Bug #1631: osd: failed assert(repop_queue.front() == repop)
- We haven't seen this, but hope that the messenger tests now being designed will flush it out again.
- 02:11 PM CephFS Bug #1947 (Duplicate): mds: SIGBUS during _mark_dirty
- #1549
- 02:06 PM RADOS Feature #1639: osd: guard against bad objects in cls map functions
- the specific instance was fixed. can we in general catch any exception in the class methods? safely?
- 02:02 PM Bug #1530 (Can't reproduce): osd crash during build_inc_scrub_map
- 11:28 AM Feature #2030: osd: clean up mark_unfound api
ceph pg 1.2 mark_unfound_revert foo
NOT ceph tell osd.12 mark_unfound revert pgid objectname
- 11:27 AM Feature #2030 (Resolved): osd: clean up mark_unfound api
- 11:27 AM Feature #2007: osd: enumerate unfound, lost objects, possible locations
ceph pg 1.2 list_missing|list_unfound
- list of missing objects, locators, and known locations (if !unfound)
- 11:19 AM Feature #2007: osd: enumerate unfound, lost objects, possible locations
- PGLS_MISSING
(new pg op) using rados
- 11:27 AM Feature #2006: osd: report what is blocking peering completion
ceph pg 1.2 status|query
- peering status
- recovery status
- another interesting status
- 11:11 AM Feature #2006: osd: report what is blocking peering completion
- ceph ...
ceph tell <who> ....
ceph pg query 1.2
map pg, query osd directly with
['pg', 'query', '1.2']
- 11:07 AM Feature #2005: mon: track timestamps on pg states
- query list of stale/unpeered/whatever pgs
ceph pg dump_stuck [--format=json|plain]
- 10:06 AM Bug #1974: osd: radosmodel crash on thrashing
- Summary: An object was deleted, but after a recovery was found to be back ... which is almost surely indicative of a ...
- 12:10 AM Revision 6df25e53 (ceph): rgw: url_decode object name
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 12:10 AM Revision 0da793ba (ceph): rgw: cleanup url_decode usage
- we now url_decode the relevant strings at initialization,
thus it's clear whether we need to url_decode or not later ...
02/06/2012
- 11:19 PM Revision f859f25d (ceph): osd: re-take the osd lock in the init error path where it's not held
- The Mutex::Locker will unlock it once the function exits.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
- 09:34 PM Revision 2e84c1ec (ceph): ceph-dencoder: ScrubMap[::object]
- Signed-off-by: Sage Weil <sage@newdream.net>
- 09:34 PM Revision 0bf3c54b (ceph): osd: uninline osd_stat_t methods
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:34 PM Revision c0711b09 (ceph): ceph-dencoder: coll_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:34 PM Revision 8bf08abc (ceph): ceph-dencoder: pg_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:34 PM Revision 8791cb98 (ceph): ceph-dencoder: SnapSet
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:34 PM Revision c5a58420 (ceph): ceph-dencoder: SnapContext, SnapRealmInfo
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:34 PM Revision fe39d58d (ceph): move SnapContext, SnapRealmInfo to common/snap_types.{h,cc}
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:34 PM Revision f07d2835 (ceph): ceph-dencoder: filepath
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:33 PM Revision 1fbb8ebc (ceph): ceph-dencoder: CompatSet
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:33 PM Revision 5b423b60 (ceph): kill unused tstring
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:33 PM Revision cba2674b (ceph): kill useless [cn]string.h
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:12 PM Revision c23d217c (ceph): rgw: escape and list correctly objects that start with underscore
- This should fix bug #2025.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 06:36 PM Revision 8ded2647 (ceph): crush: don't BUG_ON
- Fail gracefully on map errors; only BUG on code errors.
Signed-off-by: Sage Weil <sage@newdream.net>
- 06:33 PM Revision 9895f0bf (ceph): crush: don't BUG_ON within crush_choose
- It's very hard to recover from an invalid crushmap if mons fail
assertions while processing the map, and osds crash w...
- 05:48 PM Bug #1975: btrfs: EINVAL on snap create
- RATIONALE:
We seem to be able to make this happen, and believe it to be a btrfs bug.
We are not calling it u...
- 05:44 PM Feature #1932: mon: before accepting a new crushmap, monitor should validate and test some inputs
- Users can create their own rules, so bad rules will happen, and we must do a better job of making the Monitors robust...
- 04:22 PM rgw Bug #2025 (Resolved): rgw: objects starting with underscore are badly listed
- Fixed, commit:c23d217c93bb6ed21c1b07e347710e18446a3abc.
- 04:22 PM rgw Bug #2029 (Resolved): rgw: space in object name is turned into a different character
- Fixed, commit:6df25e53abe37b19b38e5657dbf3b4c37f03d8e3.
- 02:37 PM rgw Bug #2029 (Resolved): rgw: space in object name is turned into a different character
- looks like we fail to use the url-decoded object name.
- 02:11 PM Feature #2028 (Resolved): qa: allocate disks to btrfs on new hardware
- root isn't consistently on /dev/sda, it seems. or on a consistent /dev/disk/by-path on the plana nodes.
- 02:04 PM Feature #1970 (Resolved): osd: migrate to new encoding schemes
- this is all done, but unmerged; it'll get pulled into a release with a bunch of other encoding updates.
- 10:52 AM rgw Bug #2027 (Can't reproduce): rgw -> apache miscommunication
- There were some mystery failures, where we've seen rgw getting requests from apache, processing them, sending respons...
- 10:10 AM Bug #1973 (Can't reproduce): osd: segfault in ReplicatedPG::remove_object_with_snap_hardlinks
- let's chalk this up to the bad object_info_t
- 10:06 AM Bug #1984 (Can't reproduce): osd: failed assert, got into finish_recovery_ops without any recover...
- 10:00 AM Bug #1490 (Resolved): cfuse assert failure: assert(ob->last_commit_tid < tid)
- 05:28 AM Revision 8427090a (ceph): filejournal: flush needn't abort on write_stop
- Flush should wait for things to flush, even if we are also shutting down.
Not sure this would ever trigger, but this ...
- 05:27 AM Revision 4b8374cc (ceph): filejournal: clean up check_aio_completion
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:25 AM Revision fffee825 (ceph): filejournal: get multiple aios at a time
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 03:36 AM Bug #2026 (Can't reproduce): osd: ceph::HeartbeatMap::check_touch_file
- After my data loss due to a btrfs bug I re-installed my whole cluster with 0.41 and kernel 3.2 (ceph-client with btrf...
02/05/2012
- 08:54 PM Bug #1975: btrfs: EINVAL on snap create
- ...
- 07:11 PM rgw Bug #2025 (Resolved): rgw: objects starting with underscore are badly listed
- 01:44 AM Revision b7c20e77 (ceph): streamtest: show total throughput, avg latencies
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:44 AM Revision 30a77acb (ceph): filejournal: implement aio for writes
- Implement aio for the journal writes.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:44 AM Revision 3f0a592a (ceph): debian: depend on libaio-dev
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:44 AM Revision 4842b3d2 (ceph): ceph.spec.in: buildrequires libaio-devel
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:44 AM Revision fb0e2a3e (ceph): configure: add --without-libaio option
- Use it by default; fail if it's not there. Unless --without-libaio is
specified.
Signed-off-by: Sage Weil <sage.wei...
- 01:44 AM Revision f3dd5832 (ceph): filejournal: print aio mode on open
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
02/04/2012
- 09:25 PM Revision a9a60461 (ceph): client: init/shutdown objecter in init/shutdown
- Not in mount/unmount, and don't do shutdown() twice!
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:02 PM Feature #2024 (Resolved): make gitbuilders time out when github is sucking
- 03:32 PM CephFS Bug #1945: blogbench hang on caps
- ubuntu@teuthology:/a/nightly_coverage_2012-02-04-a/10600
- 12:01 PM Cleanup #2023 (Resolved): btrfs: Use btrfs device scan instead of btrfsctl -a
- I just upgraded my btrfs userland tools and saw:...
02/03/2012
- 11:00 PM Bug #2022 (Resolved): osd: misdirected request
- from rados_api_tests.yaml:
[WRN] client.4292 10.3.14.128:0/3016298 misdirected client.4292.0:4 0.0 to osd.1 not [0,1...
- 10:42 PM Revision ba12f26f (ceph): rgw: fix autobuilder errors
- librgw wasn't linking with some useless unit test
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 10:42 PM Revision dba22f8e (ceph): rgw: fix warning
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 10:29 PM Revision 90fe53c3 (ceph): rgw: fix acl cleanup related regression
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 09:27 PM Revision 06ea2f7f (ceph): doc: add the ceph mds stop command.
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 06:35 PM Revision caabec96 (ceph): mon: show full status in ceph health
- HEALTH_WARN when nearfull, HEALTH_ERROR when full.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Signed-off...
- 05:34 PM Revision 13c89137 (ceph): rgw: use request uri if script name is empty
- this was required for some nginx configuration
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 05:28 PM Revision 7641a0e1 (ceph): osd: signal dispatch_cond on ms_dispatch completion
- There may be another dispatch thread waiting on this cond; we do need to
signal it!
Signed-off-by: Sage Weil <sage@n...
- 05:27 PM Revision eaa46f50 (ceph): osd: reorder PG recovery_state initialization
- The state machine state constructors print stuff to the logs, and the
PG::gen_prefix() includes all kinds of PG field...
- 10:48 AM Cleanup #2021 (Resolved): fix signal handlers
- 10:45 AM Feature #2008 (Resolved): mon: include full/nearfull in health check
- 10:31 AM Feature #2004 (Resolved): qa: make deb gitbuilder faster
- 10:11 AM Feature #2020 (Duplicate): collectd: submit plugin upstream
- 05:06 AM Revision d7f61c8d (ceph): test/encoding/readable.sh: nicer output
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:06 AM Revision 5216eb07 (ceph): ceph-dencoder: more helpful error message for messages
- If the type doesn't match, share what it was vs what you expected.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:06 AM Revision 5103338c (ceph): messages: set type in default constructor
- ceph-dencoder wants this to verify it decoded the correct message type.
Not that it is likely to happen, but let's be...
- 03:24 AM Revision ae67c2de (ceph): pick object from random osd for primary recovery
- When recovering a primary, try the osds that have a copy of the object
in random order, rather than preferring the lo...
- 01:04 AM Revision 05f66c45 (ceph): msg: fix message leak on receipt of undecodable message
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:01 AM Revision c3eacb15 (ceph): Makefile: add test/encoding/types.h
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:00 AM Revision 0dcbc86c (ceph): Merge branch 'wip-encoding'
- Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
Conflicts:
src/msg/Message.h
src/osd/OSD.cc
src/osd/Replicat...
- 12:58 AM Revision 6e62fc48 (ceph): test/encoding/readable.sh: check all version
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 12:58 AM Revision 625a89d4 (ceph): test/encoding/readable.sh: nicer output
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 12:58 AM Revision d597dc2d (ceph): encoding: document ENCODE/DECODE macros
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
02/02/2012
- 11:20 PM Revision e9b97c13 (ceph): osd: fix another repop->ctx->op deref
- Ok this time I actually looked for more and didn't see any.
Signed-off-by: Sage Weil <sage@newdream.net>
- 11:16 PM Revision bf5d7d05 (ceph): Merge remote branch 'gh/wip-objecter-initialized'
- Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
- 11:07 PM Revision 2f5ba8fb (ceph): osd: avoid null deref of repop->ctx->op
- It's optional.
Signed-off-by: Sage Weil <sage@newdream.net>
- 09:48 PM Revision a0dde422 (ceph): encoding: document ENCODE_DUMP throttling weirdness
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:43 PM Revision 96876097 (ceph): encoding: fix DECODE_START macro
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:43 PM Revision 9c2d779b (ceph): encoding: add DECODE_OLDEST macro
- So we can (gracefully) fail to decode very old encoded versions we no
longer support.
Signed-off-by: Sage Weil <sage...
- 09:31 PM Revision 690b9919 (ceph): osd: fix another issue_repop() ctx->op null deref
- Signed-off-by: Sage Weil <sage@newdream.net>
- 08:50 PM Revision d9261942 (ceph): check-generated.sh: do self-decode test first
- This way we get a helpful error instead of silent failure later on.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:47 PM Revision efe77a8e (ceph): check-generated.sh: nicer output
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:46 PM Revision 153e89d2 (ceph): ceph-dencoder: print errors to stderr
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:36 PM Revision 2a262956 (ceph): osd: do not dereference ctx->op when NULL
- We may not have an OpRequest. Make the later check do the cast properly
when it is needed.
Signed-off-by: Sage Weil...
- 07:57 PM Bug #2016 (Resolved): OSD: pull should randomly choose a pull target
- applied, thanks!
- 12:00 PM Bug #2016: OSD: pull should randomly choose a pull target
- Here's a patch that fixes this.
- 11:13 AM Bug #2016 (Resolved): OSD: pull should randomly choose a pull target
- Currently, we choose the lowest-numbered osd to pull from. This biases the recovery load towards lower-numbered osds.
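The bias described here, and the random alternative the bug title asks for, can be sketched as follows. This is a hypothetical Python illustration, not the OSD's C++ recovery code; `choose_pull_target` and the OSD ids are assumptions made up for the example.

```python
import random


def choose_pull_target(osds_with_copy, rng=random):
    """Return one OSD id picked uniformly at random from those holding the object."""
    if not osds_with_copy:
        raise ValueError("no OSD has a copy of the object")
    # Sort first so the result depends only on the set contents and the RNG state.
    return rng.choice(sorted(osds_with_copy))


holders = {0, 3, 7}

# Old behaviour: every pull lands on the lowest-numbered holder.
print(min(holders))

# Random behaviour: pulls spread across all holders over time.
rng = random.Random(0)
print([choose_pull_target(holders, rng) for _ in range(10)])
```

Passing the random generator in as a parameter keeps the choice reproducible in tests while leaving production callers free to use the default.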
- 07:30 PM Revision 8623c64d (ceph): encoding: better DECODE_START_LEGACY_COMPAT_LEN
- - let you specify whether to decode compat and/or len
- put the argument order in the macro name so you know when you...
- 07:30 PM Revision 73e92b31 (ceph): buffer: iterator::get_remaining()
- It's helpful to know how much data is remaining.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:23 PM Messengers Bug #1985: msgr: creating new Pipe for pre-existing connection leaks Pipe if they don't replace
- Hacked up a small patch that should do it, but need to test and get some feedback on related protocol stuff I ran into.
- 06:55 PM Revision 04753e9a (ceph): Merge remote-tracking branch 'gh/wip-osd-op-tracking'
- Reviewed-by: Sage Weil <sage@newdream.net>
- 06:53 PM Revision cb754917 (ceph): osd: use obc for size in calc_head_subsets()
- No need to call stat(2) here; the caller has what we need.
Signed-off-by: Sage Weil <sage@newdream.net>
- 06:53 PM Revision c4ca1142 (ceph): osd: fix osd_recover_clone_overlap
- - we need to populate data_subset
- add check in calc_head_subsets() too
Fixes 2116f012.
Signed-off-by: Sage Weil <...
- 06:41 PM Revision dab9f0f9 (ceph): Merge branch 'master' into wip-encoding
- Conflicts:
src/osd/OSD.cc
src/osd/PG.cc
src/osd/PG.h
- 06:38 PM Revision 83432af2 (ceph): common/Throttle: throttle in FIFO order
- Under heavy write load from many clients, many reader threads will
be waiting in the policy throttler, all on a singl...
- 06:07 PM Revision 36a4ca40 (ceph): filestore: remove obsolete fs type check
- This isn't a useful check. xfs and ext4 work too.
Fixes: #1995
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:36 PM Revision 0cd16cf0 (ceph): ceph: always add logger for daemons
- The extra log function added redundant info and didn't allow different
levels.
- 05:35 PM Revision 7af7c66b (ceph): ceph: rename type parameter to type_
- type is a built-in and shouldn't be aliased.
- 05:27 PM Revision 7146db92 (ceph): ceph: use the correct comparison operator
- is compares identity (i.e. address in cpython), not value.
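The distinction this commit message draws can be seen in a few lines of Python. The variable names are made up for illustration and are not taken from the commit itself.

```python
expected = ["osd", "mon"]
actual = list(expected)      # equal contents, but a separate list object

values_match = actual == expected    # == compares contents
same_object = actual is expected     # 'is' compares object identity

print(values_match, same_object)     # True False
```

Reserving `is` for singletons such as `None` and using `==` everywhere else avoids this class of bug.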
- 05:26 PM Revision e7672b64 (ceph): ceph: sync before unmounting btrfs devices
- There may still be writes in flight, since the osds may not have
shutdown cleanly. This should prevent EBUSY when unm...
- 05:26 PM Revision 1364b882 (ceph): ceph: delay raising exceptions until all daemons are stopped
- If a daemon crashes, the exception is raised when we stop it. This
caused some daemons to continue running during cle...
- 05:02 PM Revision 290730ee (ceph): objecter: track whether initialized; add asserts
- init() should be called when not initialized; shutdown() should not be
called unless initialized. No handle_* method...
- 05:02 PM Revision 33659521 (ceph): librados: discard incoming messages when DISCONNECTED
- If we are disconnected (probably shutting down, if we are receiving a
message) then ignore anything incoming. This a...
- 05:02 PM Revision 51ccce06 (ceph): client: let set_filer_flags clear flags, toos
- ceph-syn does this...
Signed-off-by: Sage Weil <sage@newdream.net>
- 05:02 PM Revision 824c3af7 (ceph): client: add initialized flag to client
- Do not call init() while initialized; do not call shutdown unless
initialized.
Drop incoming messages if not initial...
- 05:01 PM Revision 4ef4d3f1 (ceph): test_filejournal: fix warnings
- Signed-off-by: Sage Weil <sage@newdream.net>
- 04:50 PM Linux kernel client Bug #1990 (Resolved): rbd: null pointer dereference during map
- Problem seems to be gone now.
- 04:36 PM CephFS Bug #1945: blogbench hang on caps
- Happened again in /var/lib/teuthworker/archive/nightly_coverage_2012-02-02-a/10268 (also blogbench)
- 03:38 PM CephFS Bug #2019 (Resolved): mds: CInode::filelock stuck in sync->mix
- Reported by Kioob`Taff in irc. Some logging is available at gregf@kai:~/logs/kioob. Unfortunately not of the lock get...
- 03:33 PM CephFS Bug #2018 (Resolved): mds: can't change file_max
- http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/4612
Relevant MDS log snip (repeats):...
- 03:15 PM Bug #2014 (Resolved): librados shutdown race
- resolved by commit:33659521a92315f71040551b2699d9961acc07f7 and neighbors.
- 03:13 PM Linux kernel client Bug #2017: osd: segfault in snap trimmer
- Since Sage has fixed this, I've deleted the archive of /tmp/cephtest I had saved.
- 03:06 PM Linux kernel client Bug #2017 (Resolved): osd: segfault in snap trimmer
- pushed fix for this (and another similar bug) to master.
- 03:00 PM Linux kernel client Bug #2017: osd: segfault in snap trimmer
- I bundled up the /tmp/cephtest directory in its entirety. It is here:
flak.ops.newdream.net:~elder/tracker_2017...
- 02:59 PM Linux kernel client Bug #2017: osd: segfault in snap trimmer
- The segfault was from trying to dereference repop->ctx->op, which was NULL.
- 02:52 PM Linux kernel client Bug #2017 (Resolved): osd: segfault in snap trimmer
- Testing some reasonably solid changes to the rbd code I ran across an OSD crash.
It looks like it happ
The YAML fil...
- 11:03 AM Feature #2015 (Resolved): osd: dump in-flight ops via admin socket
- 11:03 AM Feature #1879 (Resolved): osd: track list of in-progress requests, log slow ones
- 10:09 AM Bug #1997 (Resolved): teuthology: wait for clean osd shutdown before umount
- This was different from #1744 - daemons are shut down without waiting for I/O to complete, which causes this issue wh...
- 10:07 AM Bug #1744 (Resolved): teuthology: race with daemon shutdown?
- This turned out to be uncaught exceptions that weren't logged until later when daemons crashed. Fixed by 1364b8826f3f...
- 10:06 AM Bug #1995 (Resolved): Turn down non-btrfs warning in FileStore
- commit:36a4ca40805a5b0665e749b2b928d94749a8dd87
- 01:17 AM Revision da02c40d (ceph): osd: d'oh again! Make this real exponential, not...ever-linear.
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 01:17 AM Revision 030ad872 (ceph): osd: mark_started() osd sub ops
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 01:17 AM Revision f7e6e18a (ceph): osd: OpRequest currently_* needs to look at latest, not hit.
- D'oh!
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 01:01 AM Revision c81845b6 (ceph): rgw: fix crash related to cleanups
- there are still a few regressions, but getting there.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 12:34 AM Revision 00a2e84b (ceph): do_autogen.sh: -e <path> to dump encoded objects to a path
- Make it easy to build with encode dumping enabled. This is just a
convenient way to generate a large corpus of encod...
- 12:13 AM Revision 91073a6a (ceph): check-generated.sh: run on 'make check'
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 12:05 AM Revision 8870a676 (ceph): Merge remote branch 'origin/master' into wip-osd-op-tracking
- Conflicts:
src/osd/ReplicatedPG.h
02/01/2012
- 11:54 PM Revision f125070b (ceph): osd: pg_stat_t: fix member initialization
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 11:54 PM Revision 71c59dae (ceph): osd: add check_ops_in_flight()
- By default it warns on requests that are more than 30 seconds old,
using an exponential backoff of that interval.
Als...
- 11:51 PM Revision 63ad89d2 (ceph): osd: fix PG::Interval member initialization
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 11:49 PM Revision 500f4c66 (ceph): osdmap: finalize crush after building simple map
- This ensures that max_devices gets calculated (and thus encoded) properly.
Signed-off-by: Sage Weil <sage.weil@dream...
- 11:49 PM Revision c41adacf (ceph): osdmap: make test instances deterministic
- current time can vary
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 11:41 PM Revision d4d1b64f (ceph): import-generated.sh: fix to use ceph-dencoder syntax
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 11:41 PM Revision 290c4b72 (ceph): ceph-dencoder: fix ctor
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 11:31 PM Revision 2d0da67d (ceph): ceph_context: initialize member var
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 11:28 PM Revision cb5f2708 (ceph): rgw: some more acls cleanup
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 11:02 PM Revision 544ea29d (ceph): osd: switch op passing interface to use OpRequest instead of raw Messages
- This doesn't handle the PG internals yet.
Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 11:02 PM Revision ba392e3d (ceph): PG: switch op passing interface to use OpRequest
- This is all the PG/ReplicatedPG internals and the few remaining OSD callers.
Signed-off-by: Greg Farnum <gregory.far...
- 11:02 PM Revision fd3108ee (ceph): osd: "mark" OpRequests as they move through the system.
- Right now these are just informational flags which can be read out. Later
they might extend to timing information, se...
- 11:02 PM Revision 4075d521 (ceph): osd: add new OpRequest struct and an xlist to track it
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 11:02 PM Revision 25c5daec (ceph): osd: PGLSResponse -> pg_ls_response_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 11:02 PM Revision ac1fbd18 (ceph): osd: PG::Missing -> pg_missing_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 10:28 PM Revision a41679a6 (ceph): osd: PG::Log -> pg_log_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 10:16 PM Revision 5263c12a (ceph): osd: PG::Log::Entry -> pg_log_entry_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:03 PM Revision 460a4622 (ceph): osd: PG::Query -> pg_query_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:58 PM Revision 3353f572 (ceph): cls_rgw: update bucket index when deleting object (with pending)
- Bug #2012. Racing delete with other operations (update or another
delete) failed to update the bucket index.
Signed-...
- 08:53 PM Revision 25293748 (ceph): osd: PG::Info -> pg_info_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:43 PM Revision 1c5370cd (ceph): osd: PG::Info::History -> pg_history_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:52 PM Revision 64bdd389 (ceph): osd: PG::Info[::History] dump, test instances
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:26 PM Revision 139db823 (ceph): ceph-dencoder: remove message type dups
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:25 PM Revision 2b05000d (ceph): ceph-dencoder: generate test instances on heap
- Some objects aren't copyable (OSDMap contains CrushWrapper), but we'd
still like to programmatically generate test in...
- 06:55 PM Revision cc405721 (ceph): Merge remote branch 'gh/wip-divergent-backfill'
- Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
- 06:49 PM Revision 1fdb5e5f (ceph): osd: dump, test instances for PG::Interval
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:46 PM Revision 19313fbe (ceph): osd: dump, instances for PG::OndiskLog
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:41 PM Revision 9307edd1 (ceph): ceph-dencoder: use g_ceph_context
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:41 PM Revision 19227b23 (ceph): ceph-dencoder: OSDMap and osd_info_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:41 PM Revision cbaf83d7 (ceph): osdmap: test instances for osd_info_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:40 PM Revision d0dbaaaf (ceph): osdmap: test instances
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:34 PM Revision ad77bb48 (ceph): osdmap: normalist encode/decode
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:09 PM Feature #1879: osd: track list of in-progress requests, log slow ones
- This is in the branch wip-osd-op-tracking. There are some ops that still need to get marked up; I have logs to go thr...
- 03:44 PM Feature #1836: filejournal: use async directio to write to the journal
- 01:36 PM Bug #1984: osd: failed assert, got into finish_recovery_ops without any recovery ops active?
- Hmm, we still haven't seen this in our thrashing in qa. I'll start thrashing on some of the new hardware.
- 01:35 PM Bug #1983 (Resolved): osd: failed assert, info does not match peer info
- 01:12 PM rgw Bug #2012 (Resolved): rgw: racing object creation and removal may lead to bad bucket accounting
- Fixed, commit:3353f572f84707fbc0e99a9af2dc48de2d0aa2c9.
- 12:49 PM Bug #2014: librados shutdown race
- ...
- 12:49 PM Bug #2014 (Resolved): librados shutdown race
- ...
- 12:36 PM Linux kernel client Bug #147: lockdep: possible irq lock inversion dependency w/ osdc->request_mutex and con->mutex
- Saw this again. It's been a while.....
- 08:45 AM Feature #1971 (Resolved): encoding: adapt to messages
- 08:45 AM Feature #1969 (Resolved): gitbuilder for 11.10, 12.04
- 07:46 AM Bug #1992: OSD::get_or_create_pg
- I was running the stock 3.0 kernel from Ubuntu 11.10
I tried with the latest ceph-client code (saw your post about...
- 06:48 AM Revision a5366c8b (ceph): ceph-dencoder: add all message types
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:48 AM Revision 32010d78 (ceph): msg: add missing #includes for messages
- And remove that unused max() macro.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:39 AM Revision 1cb39fac (ceph): msg: dump messages via build option
- Dump encoded messages to ENCODE_DUMP when it is defined, just as we do with
the regular encode function.
Signed-off-...
- 04:06 AM Revision 597e97a6 (ceph): osd: fix assignment in PG::rewind_divergent_log()
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 12:37 AM Revision 0b68dbca (ceph): add backfill test
- 12:25 AM Revision 0236dc0f (ceph): add backfill task
- This does a basic test of backfill functionality, including a divergent
log on a backfill target (#1983).
- 12:18 AM Revision 7cb561b4 (ceph): Merge remote-tracking branch 'gh/wip-journal-crc'
- Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
- 12:13 AM Revision e337c472 (ceph): ceph_manager: add manager.blackhole_kill_osd()
- This will suspend disk writes for a couple seconds and then kill the
daemon. It helps us simulate a hardware failure.
01/31/2012
- 11:41 PM Revision 9d385f52 (ceph): msgr: Document recv_stamp and add a dispatch_stamp and throttle_wait.
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 09:56 PM Feature #2004 (In Progress): qa: make deb gitbuilder faster
- 09:56 PM Feature #1885 (Resolved): identify top 10 expected failures and process to diagnose
- 09:00 PM Revision ba4aad48 (ceph): qa: test_backfill.sh: take osd.0 down
- Mark this down to
1- trigger the WaitActingChange vs osd down race, and
2- help trigger a divergent log when osd.2 is...
- 07:44 PM Revision f1c3538f (ceph): osd: fix divergent backfill targets
- During peering, a previous backfill target may have a slightly newer
last_update than the other options, but it will ...
- 07:44 PM Revision f4e44e43 (ceph): qa: test_backfill.sh: limit pg log length so we trigger backfill
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:44 PM Revision 747b3d46 (ceph): osd: use RecoveryContext transaction, finishers on recovery completion
- We should use the enclosing transaction and finisher list here.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:44 PM Revision 9dfa46ff (ceph): osd: rename recovery event NeedNewMap -> NeedActingChange
- This is more precise.
Signed-off-by: Sage Weil <sage@newdream.net>
- 07:44 PM Revision 5a544836 (ceph): osd: restart peering if requesting acting osd goes down
- If we request an acting set, we need to restart peering if one of the
requested nodes goes down. This prevents a dea...
- 06:56 PM Bug #2013: osd: messages for pgs we don't store are never freed
- I think the thing to do is check the waiting map on activate and discard pgs and their messages if they no longer map...
- 05:52 PM Bug #2013 (Resolved): osd: messages for pgs we don't store are never freed
- Once request timestamps are implemented, we could have a timeout period after which misdirected requests are dropped.
- 05:09 PM rgw Bug #2012 (Resolved): rgw: racing object creation and removal may lead to bad bucket accounting
- 04:05 PM Revision d7be7762 (ceph): Allow user to disable lock checking.
- The new plana hardware isn't in the old sepia lock database,
and the machine pools are risky to merge as nothing in t...
- 03:59 PM Revision 09bed164 (ceph): Allow user to provide flavor to use.
- With this, you can use Ubuntu 11.10 machines with teuthology by saying::
tasks:
- ceph:
    flavor: oneiric
...
- 03:23 PM Revision 9520ee78 (ceph): filestore: implement filestore_blackhole hook
- If true, we'll drop any new transactions on the floor. Useful for
triggering failure conditions (e.g., prior to killi...
- 02:19 PM Feature #2011 (Resolved): osd: do not backfill/recover to full osds
- 02:18 PM Feature #2010 (New): mon: check for slow performing osds
- 02:18 PM Feature #2009 (Resolved): osd: report performance to monitor
- 02:17 PM Feature #2008 (Resolved): mon: include full/nearfull in health check
- 02:17 PM Feature #2007 (Resolved): osd: enumerate unfound, lost objects, possible locations
- 02:16 PM Feature #2006 (Resolved): osd: report what is blocking peering completion
- 02:15 PM Feature #2005 (Resolved): mon: track timestamps on pg states
- 09:10 AM Bug #2002: osd: racy push/pull for clones
- sage@metropolis.ceph.dreamhost.com:osd.log.badpushpull
shows the (or similar) badness. workload was...
- 01:02 AM Revision 1fe75ee6 (ceph): rgw: should remove bucket dir instead of sending intent
- that was really useless, and also bucket cleanup was broken anyway.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream....
- 12:48 AM Revision 2b5bbe8e (ceph): librados: fix a leak
- watch notification message was missing a ->put()
Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
01/30/2012
- 10:27 PM Revision 2116f012 (ceph): osd: disable clone overlap for push/pull
- There is a bug in the push/pull code. Disable the recovery smarts by
default until we fix #2002.
There is currently...
- 09:42 PM Revision 9d246a43 (ceph): Merge remote branch 'gh/wip-warnings'
- 09:41 PM Revision 2adabbe5 (ceph): mon: make 'osd [out|in|down]' succeed if already whatever
- If we want something out and it is already out, succeed. This makes the
client command succeed if there is a transie...
- 09:30 PM Revision 591a8909 (ceph): ceph-dencoder: handle messages
- Dump for now uses the string rendering function, and that's it. Maybe
we'll write proper dump methods for all of the...
- 09:30 PM Revision 11de3f11 (ceph): msg: implement Message::dump()
- Just wrap print() for now.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:28 PM Revision 9987f8f9 (ceph): msg: go const-crazy on messages
- - get_type_name()
- print()
and all the random crap they call.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:07 PM Feature #2004 (Resolved): qa: make deb gitbuilder faster
- don't use pbuilder
- 08:32 PM Revision 9279619b (ceph): paxos: explicitly pass in send timestamp
- This is cleaner.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:28 PM Revision 0e8129ad (ceph): msg: no cct for decode_payload
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:28 PM Revision 0107aee8 (ceph): msg: use absolute times for message encoding
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:22 PM Revision 5436bf50 (ceph): msg: make decode cct optional
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:03 PM Revision 11f6a840 (ceph): msg: no cct needed for message encoding
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:58 PM Feature #2003: limit XFS extent fragmentation for rbd
- Hmm, I don't think that rbd_writeback_window will help much here; it doesn't affect the size of IOs sent to rados.
...
- 07:09 PM Feature #2003 (Rejected): limit XFS extent fragmentation for rbd
- A user with the handle pmjdebruijn was asking earlier today about XFS extent fragmentation due to ceph writes on the ...
- 07:07 PM Revision 04497a51 (ceph): mdsmap: move member initialization to monitor create_initial()
- The dependence on cct/conf here was totally wrong.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:34 PM Revision dabf1e48 (ceph): msg: use explicit feature argument instead of Connection*
- Use the new argument. Don't rely on Connection *connection being defined.
Signed-off-by: Sage Weil <sage.weil@dream...
- 06:34 PM Revision 79998762 (ceph): msg: pass features explicitly into message encoders
- Avoid using the connection reference; pass it in explicitly instead. This
will make ceph-dencoder's life a bit easie...
- 06:13 PM Bug #1992: OSD::get_or_create_pg
- What version of btrfs are you running? Have you tried the latest code?
There is a mount -o recover option for btr...
- 06:11 PM Bug #1983 (In Progress): osd: failed assert, info does not match peer info
- 05:23 PM Revision de2ec7c2 (ceph): Merge remote branch 'gh/master' into wip-encoding
- 04:57 PM Bug #1977: mon: ceph command hang
- a new monitor election could do it, or a socket error between the ceph command and monitor.
- 04:31 PM Bug #1977: mon: ceph command hang
- The proper behavior is more a question of what the command means, I think. I tend to think of them as being an action...
- 02:22 PM Bug #2002 (Resolved): osd: racy push/pull for clones
- There is currently a race where:
- an adjacent clone is missing
- we (calculate some clone overlap? and) start pull...
- 01:13 PM Feature #1969 (In Progress): gitbuilder for 11.10, 12.04
- oh, i'm dumb.. they're just building tarballs, not debs.
- 01:05 PM Feature #1969: gitbuilder for 11.10, 12.04
- Sorry I'm dumb, where are the debs? I see only natty and squeeze in http://ceph.newdream.net/debian-snapshot-amd64/wi...
- 11:02 AM Bug #1974: osd: radosmodel crash on thrashing
- Looks like in some cases we are finding objects which should have been deleted.
- 10:45 AM Bug #1997 (Duplicate): teuthology: wait for clean osd shutdown before umount
- As far as I can tell, this is exactly #1744, just failing at a different point because of timing differences. The two...
- 10:03 AM rgw Bug #2001: radosgw memory leak
- probably related to recent objecter changes? can you run it through massif?
- 10:02 AM rgw Bug #2001 (Resolved): radosgw memory leak
- Seems that there's a new leak. I ran s3-tests over the weekend and radosgw mem usage went up to ~10G. The version I w...
- 09:52 AM Feature #1993 (Resolved): mon: warn admin about down pgs
- 09:50 AM Bug #1986 (Resolved): objecter: segfault during osd op reply demux
- 09:48 AM Bug #1943: osd: bad clone transaction on journal replay
- I'm going to disable the clone stuff for snap objects until the push/pull code is rewritten (in a non-buggy way).
- 05:08 AM Revision d3590b5e (ceph): qa: test/gather fix warning
- warning: test/gather.cc:29:222: passing NULL to non-pointer argument 3 of ‘static testing::AssertionResult testing::i...
- 05:08 AM Revision 60ead1ee (ceph): qa: encoding: silence warning
- This is cheating, but we always use this class with int types, so it makes
this go away:
warning: test/encoding.cc:7...
- 04:54 AM Revision e9e212f8 (ceph): qa: test/rados-api/list fix warning
- warning: test/rados-api/list.cc:43:156: converting ‘false’ to pointer type for argument 1 of ‘char testing::internal:...
- 04:36 AM Revision 853c8b21 (ceph): test_ipaddr: reverse ASSERT_EQ order
- Make these warnings go away:
warning: test/test_ipaddr.cc:217:156: converting ‘false’ to pointer type for argument 1...
- 01:26 AM Revision 773acfdf (ceph): osd: remove unused var
- warning: osd/PG.cc:1331:20: variable 'plu' set but not used [-Wunused-but-set-variable]
Signed-off-by: Sage Weil <sa...
- 01:26 AM Revision 9454102a (ceph): admin_socket: fix uninit warning
- warning: common/admin_socket_client.cc:166:19: 'socket_fd' may be used uninitialized in this function [-Wuninitialize...
01/29/2012
- 09:46 PM Feature #1969 (Resolved): gitbuilder for 11.10, 12.04
- up and running for amd64. cleaning out some warnings in wip-warnings branch.
- 09:41 PM Bug #1952 (Resolved): rgw: test suite times out
- this is solved.. it was #1993
- 09:33 PM Bug #1975: btrfs: EINVAL on snap create
- hit this 2 days ago with ubuntu@teuthology:/var/lib/teuthworker/archive/nightly_coverage_2012-01-27-a/9261. thrashing...
- 09:32 PM Bug #1943: osd: bad clone transaction on journal replay
- Sage Weil wrote:
> hit this again on ubuntu@teuthology:/var/lib/teuthworker/archive/nightly_coverage_2012-01-27-a/92...
- 09:18 PM Bug #1977: mon: ceph command hang
- hrm.. I didn't manage to reproduce a hang, but I did reproduce a failure. A transient error made a command succeed b...
- 05:51 PM Bug #1490: cfuse assert failure: assert(ob->last_commit_tid < tid)
- ...
- 05:27 PM Revision 483c089c (ceph): mon: trim old auth states
- These aren't exposed outside the monitor, so we really only keep them
around to assist in mon recovery. Give ourselv...
- 04:48 PM Revision 9bb3875b (ceph): filestore: fix rollback when current/ missing entirely
- This can happen when we are starting, rolling back, remove current/, and
then fail before we snapshot a snap_ into pl...
- 03:29 PM Feature #2000 (Resolved): mon: trim old auth files
- commit:483c089c1bc035ddd26729bcb72d61f4a969f856
- 09:46 AM Feature #2000 (Resolved): mon: trim old auth files
- 12:51 PM Bug #1992: OSD::get_or_create_pg
- I see what might have gone wrong. I built the latest master a couple of days ago, ran the OSDs with that code for about 2...
- 09:03 AM Bug #1998 (Resolved): qa: admin socket test broken
- working now...
- 08:47 AM Bug #1999 (Resolved): osd: bad current/ version on osd restart
- commit:9bb3875b1671b89a74895f3c97a27845867b3941
- 08:31 AM Bug #1999 (Resolved): osd: bad current/ version on osd restart
- this was triggered by thrashing on
ubuntu@teuthology:/a/nightly_coverage_2012-01-29-a/9641...
- 05:11 AM Revision 9da01185 (ceph): make 6-osd-2-machine simpler... single monitor
01/28/2012
- 10:59 PM Revision 5e16974c (ceph): osd: reset pgstats timer when we reopen monitor session
- Otherwise we'll reopen every second from here on out, without giving the
new session a chance to start up and do its...
- 08:47 PM Feature #1836: filejournal: use async directio to write to the journal
- 08:09 PM Bug #1974: osd: radosmodel crash on thrashing
- another one:...
- 08:00 PM Bug #1998 (Resolved): qa: admin socket test broken
- 07:40 PM Revision 9e78d53d (ceph): clock: ignore clock_offset if cct is NULL
- This is helpful e.g. from assert.
Signed-off-by: Sage Weil <sage@newdream.net>
- 07:18 PM Revision 5938d177 (ceph): filejournal: add corruption test to check crc checking code
- Verify that the journal replay rejects a corrupted journal entry.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:17 PM Revision e2fbe439 (ceph): filejournal: include crc in entry header/footer
- Use the unused flags field for this. Previously it was always 0, so this
lets us skip old entries on old journals an...
- 07:17 PM Revision cc117210 (ceph): filejournal: assume gibberish flags imply none
- Old journals didn't properly initialize the flags (oops). Assume that
any bits besides the first 2 imply no flags.
...
- 07:16 PM Revision 6197b5a4 (ceph): qa: test_filejournal: test lots of small writes too
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 07:16 PM Revision 4ddca467 (ceph): qa: add test_filejournal
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:57 PM Revision db3b9ee5 (ceph): filejournal: fix header initialization
- Make sure it's zeros to start with. Currently flags might be gibberish!
Signed-off-by: Sage Weil <sage.weil@dreamho...
- 06:57 PM Revision 19f829da (ceph): filejournal: clean up some errno checks
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:56 PM Revision 7f474edb (ceph): filejournal: assert submit_entry gets >0 bytes
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:56 PM Revision 86738305 (ceph): filejournal: initialize header before writing
- Avoid writing uninitialized crap.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:56 PM Revision 8d439e93 (ceph): filejournal: move zero_buf allocation
- We need header.alignment to be defined.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:56 PM Revision f9620d7d (ceph): client: do not send release to down mds
- We can have a session with state where the mds is not up; don't blindly
send a message or we can get
./mds/MDSMap.h:...
- 06:04 PM Revision d43387c3 (ceph): Merge branch 'stable'
- 05:26 PM Revision e43db381 (ceph): signal: use _exit() on SIGTERM
- No need to call onexit handlers, static dtors, whatever.
This may help with #1996 and #1549.
Signed-off-by: Sage We...
- 02:40 PM Bug #1975: btrfs: EINVAL on snap create
- ...
- 02:08 AM Revision 06c8fdc9 (ceph): regression: add admin socket test for objecter requests.
- 01:13 AM Revision f84b4aa5 (ceph): Add admin socket task.
- This simply gets the output of an admin socket command, makes sure
it's json, and runs a user-provided test script on...
- 01:07 AM Revision c9cf7b71 (ceph): admin socket: add include guard
- Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
- 01:07 AM Revision 39f6c4c1 (ceph): admin socket: increase debug level for successful requests
- Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
- 01:07 AM Revision 097bc5cb (ceph): objecter: add an admin socket command to get in-flight requests
- Fixes: #1881
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
- 01:07 AM Revision 0f9c6b43 (ceph): test: add script for checking admin socket 'objecter_requests' output
- Just a couple internal consistency checks for now. More specific ones
would depend on workload.
Signed-off-by: Josh ...
- 01:07 AM Revision dbda1b6b (ceph): CephContext: add method for retrieving admin socket
- This is needed to allow higher layers in the stack to add admin socket
commands.
Signed-off-by: Josh Durgin <josh.du...
- 12:40 AM Revision d0a447d8 (ceph): Merge branch 'wip-pg-stale'
01/27/2012
- 09:27 PM Revision 56d164c8 (ceph): mon: stale pgs -> HEALTH_WARN
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:21 PM Revision 61c54a79 (ceph): mon: mark pgs stale in pg_map if primary osd is down
- This alerts the administrator when all OSDs for a PG have failed and the
monitor doesn't receive any further updates....
- 09:09 PM Bug #1997 (Resolved): teuthology: wait for clean osd shutdown before umount
- /a/master-2012-01-27_13:29:47/9361...
- 09:02 PM Revision 6e44af9f (ceph): osd: add STALE pg state bit
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:00 PM CephFS Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
- again,...
- 08:58 PM CephFS Bug #1996 (Duplicate): mds: scatter_nudge() bad pointer on shutdown?
- ...
- 08:35 PM Revision c1345f71 (ceph): v0.41
- 08:23 PM Revision 374fec47 (ceph): objecter: document Objecter::init_ops()
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 08:23 PM Revision 6d37d5c9 (ceph): objecter: fix out_* initialization
- This looks more like the real cause for #1986. Op ctor gets a vector of
ops but out_* aren't initialized to match.
...
- 07:21 PM Revision 995ff222 (ceph): osd: remove unused PG::block_if_wrlocked declaration
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 07:21 PM Revision 94729206 (ceph): Revert "common/Throttle: Remove unused return type on Throttle::get()"
- This reverts commit 4549501c9b0968ce4243e06ff7e9ef03b19de667.
We're about to use it to avoid a time lookup if possibl...
- 06:45 PM Revision 946da5a3 (ceph): filestore: dump offending transaction on any error
- Clean this code up to explicitly whitelist what is ok so that the flow is
less annoying to follow/maintain, and so th...
- 06:40 PM Revision 6453123c (ceph): objecter: warn when OSD returns mismatched op vector
- The osd shouldn't do this (even though we should tolerate it).
Signed-off-by: Sage Weil <sage@newdream.net>
Reviewed...
- 06:39 PM Revision 0cc26a94 (ceph): objecter: fix bounds checking on op reply demuxing
- We can't assume that the size of out_ops (from the reply) matches the
op->out_* vectors from our request state. In p...
- 06:28 PM Feature #1881 (Resolved): objecter: expose in-progress request state via admin socket
- Implemented in commit:097bc5cb1dbc83d8b09d4cb95c3c5abd1874de77 and added to the qa suite.
- 04:50 PM Feature #1881: objecter: expose in-progress request state via admin socket
- This is implemented in the wip-track-objecter-reqs branch in ceph.git, and testing is enabled by wip-admin-socket in ...
- 06:01 PM Revision 9b554d4c (ceph): mds: remove test assert
- Grr!
Signed-off-by: Sage Weil <sage@newdream.net>
- 02:32 PM Revision b8e6a6bd (ceph): assert: include timestamp
- Also drop quotes around thread id.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:28 PM Feature #1993: mon: warn admin about down pgs
- 11:18 AM Feature #1993 (Resolved): mon: warn admin about down pgs
- 01:21 PM Bug #1995 (Resolved): Turn down non-btrfs warning in FileStore
- ...
- 12:20 PM RADOS Feature #1994 (New): osd: expire objects using scrubbing
- We can set an attribute on the object that would set its expiration, and check that attribute when doing the scrubbing.
- 11:46 AM Bug #1992: OSD::get_or_create_pg
- Er, actually, the OSD is getting an MOSDPGLog with info DNE (presumably uninitialized). That appears to be non-kosher...
- 11:00 AM Bug #1992: OSD::get_or_create_pg
- The assert here is because the PG doesn't exist yet but the OSD is not the primary for that PG. It's getting into get...
- 07:26 AM Bug #1992 (Can't reproduce): OSD::get_or_create_pg
- I've just upgraded my 0.39 cluster to 0.40 and that didn't go that well.
The whole cluster started bouncing and cr...
- 09:50 AM Bug #1943: osd: bad clone transaction on journal replay
- hit this again on ubuntu@teuthology:/var/lib/teuthworker/archive/nightly_coverage_2012-01-27-a/9261. thrashing.
a...
- 06:49 AM Bug #1986: objecter: segfault during osd op reply demux
- wip-1986
01/26/2012
- 09:36 PM Bug #1986: objecter: segfault during osd op reply demux
- nevermind, wrong branch
- 09:34 PM Bug #1986: objecter: segfault during osd op reply demux
- I can't find 'if (*p)' anywhere in osdc/Objecter.cc... what commit was this on?
- 10:41 AM Bug #1988 (Won't Fix): osd: scrub stat mismatch
- removed num_kb code entirely.
- 10:12 AM Bug #1984: osd: failed assert, got into finish_recovery_ops without any recovery ops active?
- http://85.214.49.87/ceph/20120124/osd.0.log.bz2
- 08:17 AM Linux kernel client Bug #147: lockdep: possible irq lock inversion dependency w/ osdc->request_mutex and con->mutex
- I'll take this. I may have reordered something in my recent commits
to cause this to surface.
- 07:54 AM Revision b3c80bcb (ceph): rgw: acls cleanup wip
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 07:17 AM Bug #1974: osd: radosmodel crash on thrashing
- ...
01/25/2012
- 11:40 PM Revision 91b547b9 (ceph): osd: remove the unused require_current_map
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 11:09 PM Revision b5371403 (ceph): Merge branch 'master' into wip-encoding
- Conflicts:
src/osd/osd_types.h
- 10:07 PM Revision 2bc71056 (ceph): filestore: fix typo
- Grr
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 10:04 PM Revision fe2834f6 (ceph): remove snap thrashing from regression suite for time being
- 10:03 PM Revision 0b088eb5 (ceph): Merge branch 'wip-kb'
- Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
- 09:58 PM Revision 4454d391 (ceph): Merge remote branch 'upstream/wip-osd-clone-obc'
- 09:52 PM Revision ec7a1402 (ceph): filestore: zero btrfs vol_args prior to ioctl
- Just to be paranoid. Nothing we haven't set *should* affect the ABI,
but...
Always do this immediately after declar...
- 08:40 PM Revision 625b0b02 (ceph): osd: remove num_kb from object_stat_sum_t stats
- This is redundant--we can just use num_bytes. If we're worried about the
per-object overhead or rounding, we can fac...
- 08:40 PM Revision dedf5758 (ceph): mon: num_kb -> num_bytes in cluster perfcounters
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:15 PM CephFS Bug #1991 (Duplicate): mds: crash during clean shutdown
- Teuthology config:...
- 06:01 PM Linux kernel client Bug #147: lockdep: possible irq lock inversion dependency w/ osdc->request_mutex and con->mutex
- This happened in a teuthology run of iozone on rbd. From teuthology:~teuthworker/archive/master-2012-01-25_14:30:58/9...
- 05:56 PM Revision acb164c8 (ceph): osd: improve object context debug output
- Include pointer. This may help with #1979.
Signed-off-by: Sage Weil <sage@newdream.net>
- 03:59 PM Bug #1980 (Closed): osd/PG.cc: 3562: FAILED assert(p->second.need <= v)
- f16b38deaee3d0ed35229aecba4f8f12b8404f03 should take care of this.
- 03:13 PM Linux kernel client Bug #1990 (In Progress): rbd: null pointer dereference during map
- I was already looking at another commit which I found after
review to be suspect:
rbd: adequately protect rbd cli...
- 11:32 AM Linux kernel client Bug #1990 (Resolved): rbd: null pointer dereference during map
- These commands from teuthology:...
- 01:59 PM Bug #1978 (Resolved): osd: FAILED assert(!object_contexts.size())
- 10:28 AM Bug #1978: osd: FAILED assert(!object_contexts.size())
- this should be resolved by 44b11441ad3ef231ff207476bbb0d2e8ab130f26 once it's in master.
- 01:41 PM Feature #1885: identify top 10 expected failures and process to diagnose
- Mark Kampe wrote:
> Additional issues from Carl's list:
> * RGW request timeouts
That's a symptom, not a cause...
...
- 01:25 PM Feature #1885: identify top 10 expected failures and process to diagnose
- Additional issues from Carl's list:
* RGW request timeouts
* OSD file system timeouts
* OSD that is "down" but sti...
- 01:17 PM Bug #1949: osd: ENOTEMPTY on collection removal from snaptrimmer
- i have a full log (osd 20 filestore 20) leading up to this at metropolis:/home/sage/osd.enotempty.log
- 10:45 AM Bug #1981 (Won't Fix): pthread_create failed with error 11: common/Thread.cc: 140: FAILED assert(...
- Glad to hear it!
- 10:30 AM Bug #1982 (Closed): osd: failed assert (obc->watchers.size())
- should be fixed in b17736a12611a12461df26fb184acc5d85f82fea
- 10:29 AM Bug #1979 (Duplicate): osd: suicide timeout on recovery_tp... heap corruption?
- same as 1978
- 09:56 AM Bug #1979 (Need More Info): osd: suicide timeout on recovery_tp... heap corruption?
- pushed a patch to include pointer in get/put object_context to help narrow this down.
- 10:05 AM Bug #1989: teuthology: error in ceph.log didn't make teuthology return error code
- we whitelist log entries. it only prints that (and sets success=False) if it sees something unexpected...
- 10:02 AM Bug #1989: teuthology: error in ceph.log didn't make teuthology return error code
- I thought we turned this off on purpose because thrashing always triggered it. Am I remembering incorrectly?
- 09:50 AM Bug #1989 (Resolved): teuthology: error in ceph.log didn't make teuthology return error code
- in a bash while loop, I saw...
- 10:01 AM Bug #1988: osd: scrub stat mismatch
- I've never understood why we track them separately to begin with, myself. :)
- 09:48 AM Bug #1988: osd: scrub stat mismatch
- I suspect the problem is due to the rounding off when we are doing the complicated dance of keeping these stats corre...
- 09:35 AM Bug #1988 (Won't Fix): osd: scrub stat mismatch
- on thrash + radosmodel workload. bytes match, but kb don't:...
- 09:15 AM Bug #1975 (Need More Info): btrfs: EINVAL on snap create
- added btrfs printks to figure out where that EINVAL is coming from. also changed it to return EBADF if the fd is inva...
- 09:08 AM Bug #1975: btrfs: EINVAL on snap create
- ...
- 06:03 AM Revision f16b38de (ceph): osd: track obc for clone from log replay
- We need to keep an in-memory obc to track the state of the in-flight io
to disk. This is analogous to when an object...
- 05:34 AM Revision 44b11441 (ceph): osd: set object_info_t::oid properly when recovering clones
- I saw a case (#1973) where the clone had the oid set to the head. That is
clearly wrong. Not sure what damage this ...
- 05:19 AM Revision abc005a5 (ceph): Merge remote branch 'gh/wip-filestore-errors'
- 05:18 AM Revision eec87bb8 (ceph): package *.py* files
- Some post-install rpmbuild defaults byte-compile all packaged python
files, so don't bother removing the .pyc files, ...
- 01:08 AM Revision 2c2cc159 (ceph): librbd: don't infinite loop when header is too large
- Since snapshots are currently stored at the end of the header, having
many snapshots made the header larger than the ...
- 12:50 AM Revision 746a2302 (ceph): ReplicatedPG: data_subset may be empty during sub_op_push
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com>
01/24/2012
- 10:15 PM Revision 72cd1210 (ceph): rgw: acl changes compile
- and link. Doesn't work though.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 09:23 PM Revision f3d200c0 (ceph): filestore: fix non-::-prefixed close
- Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
- 09:22 PM Revision a49a53d7 (ceph): filestore: add debugging to each error case in lfn_open
- Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
- 09:16 PM Revision 0fd6ca9a (ceph): filestore: audit + clean up error checks
- - use temp var for errno
- in general return -errno from helpers
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:16 PM Revision a43937f0 (ceph): filestore: return -errno from lfn_open
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 09:16 PM Revision ae36f599 (ceph): filestore: TEMP_FAILURE_RETRY on ::close(2)
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
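The error-handling convention named in the filestore entries above (copy errno into a temporary, return -errno from helpers) can be sketched roughly as follows. This is a minimal illustration, not the FileStore code: `open_readonly` and `close_fd` are hypothetical helper names chosen for the example.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not taken from FileStore): open a path read-only and
// return either a valid fd (>= 0) or a negative errno value.
int open_readonly(const char *path) {
  int fd = ::open(path, O_RDONLY);
  if (fd < 0) {
    int err = errno;  // save right away; later calls may overwrite errno
    // Logging or cleanup could happen here without losing the error code.
    return -err;
  }
  return fd;
}

// Hypothetical helper: close an fd, reporting failure as -errno.
int close_fd(int fd) {
  if (::close(fd) < 0) {
    int err = errno;
    return -err;
  }
  return 0;
}
```

Callers can then test `r < 0` and pass `-r` to `strerror()` without worrying about whether an intervening call has modified errno.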
- 08:57 PM Revision 2835d402 (ceph): rgw: rgw_acl_s3.* compiles
- very much wip
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 07:28 PM Revision 4aa9ca45 (ceph): CephManager: base timeout on time since last change in active+clean
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 06:56 PM Bug #1981: pthread_create failed with error 11: common/Thread.cc: 140: FAILED assert(ret == 0)
- Greg Farnum wrote:
> Okay, that's a cross-version message incompatibility. You should be able to resolve your issue ...
- 06:18 PM Bug #1981: pthread_create failed with error 11: common/Thread.cc: 140: FAILED assert(ret == 0)
- Okay, that's a cross-version message incompatibility. You should be able to resolve your issue by just upgrading the ...
- 04:59 PM Bug #1981: pthread_create failed with error 11: common/Thread.cc: 140: FAILED assert(ret == 0)
- Greg Farnum wrote:
> Okay, I see the bug which is the immediate cause of the OOM[1], but I haven't yet tracked down ...
- 04:15 PM Bug #1981: pthread_create failed with error 11: common/Thread.cc: 140: FAILED assert(ret == 0)
- Okay, I see the bug which is the immediate cause of the OOM[1], but I haven't yet tracked down the actual triggering ...
- 02:02 PM Bug #1981: pthread_create failed with error 11: common/Thread.cc: 140: FAILED assert(ret == 0)
- Greg Farnum wrote:
> Actually James, could you tell me what actions you took that resulted in this log? It looks lik...
- 01:34 PM Bug #1981: pthread_create failed with error 11: common/Thread.cc: 140: FAILED assert(ret == 0)
- Actually James, could you tell me what actions you took that resulted in this log? It looks like:
started OSD at 10:...
- 11:54 AM Bug #1981: pthread_create failed with error 11: common/Thread.cc: 140: FAILED assert(ret == 0)
- That log contains enough to get started on; yesterday's won't matter. If you could make sure that your logs from the ...
- 11:44 AM Bug #1981: pthread_create failed with error 11: common/Thread.cc: 140: FAILED assert(ret == 0)
- I am attaching the logs for today. I did shut this box down yesterday to upgrade it, but due to build system delays, ...
- 11:42 AM Bug #1981: pthread_create failed with error 11: common/Thread.cc: 140: FAILED assert(ret == 0)
- Dirk Meister wrote:
> Greg Farnum wrote:
> > A 33GB core file is...very large! Did you have any logging enabled th...
- 11:37 AM Bug #1981: pthread_create failed with error 11: common/Thread.cc: 140: FAILED assert(ret == 0)
- Greg Farnum wrote:
> A 33GB core file is...very large! Did you have any logging enabled that might let us see what h...
- 11:25 AM Bug #1981: pthread_create failed with error 11: common/Thread.cc: 140: FAILED assert(ret == 0)
- A 33GB core file is...very large! Did you have any logging enabled that might let us see what happened?
When you s...
- 11:22 AM Bug #1981 (Won't Fix): pthread_create failed with error 11: common/Thread.cc: 140: FAILED assert(...
- I just upgraded one of my boxes from 0.39 to 0.40, and I am getting this shortly after starting the osds:...
- 05:28 PM Bug #1987 (Resolved): librbd: listing an image with more than ~200 snapshots infinite loops and c...
- Fixed by commit:2c2cc1596cd63b6368d13a2665c6c85d3d8ed532.
- 05:01 PM Bug #1987 (Resolved): librbd: listing an image with more than ~200 snapshots infinite loops and c...
- As reported in http://article.gmane.org/gmane.comp.file-systems.ceph.devel/5038.
- 04:58 PM Bug #1986 (Resolved): objecter: segfault during osd op reply demux
- This happened on master + an rbd fix when running 'rbd snap purge blah', when the blah image had > 200 snapshots. Cor...
- 04:25 PM Messengers Bug #1985 (Won't Fix): msgr: creating new Pipe for pre-existing connection leaks Pipe if they don...
- See #1981. Turns out that if we have an existing Pipe but don't replace it, we never clean ourselves up and we need to.
- 01:20 PM Bug #1979: osd: suicide timeout on recovery_tp... heap corruption?
- Ooh, I saw exactly that yesterday. IIRC there was a get_object_context for some object (refcount now one), and then ...
- 11:21 AM Bug #1979: osd: suicide timeout on recovery_tp... heap corruption?
- It looks like there was a use after free or heap corruption - the recovery thread was stuck waiting on an invalid obj...
- 09:52 AM Bug #1979 (Duplicate): osd: suicide timeout on recovery_tp... heap corruption?
- /var/lib/teuthworker/archive/nightly_coverage_2012-01-24-a/8882...
- 01:10 PM Bug #1984 (Can't reproduce): osd: failed assert, got into finish_recovery_ops without any recover...
- osd/PG.cc: In function 'void PG::finish_recovery_op(const hobject_t&, bool)', in thread '7f1fdab26700'
osd/PG.cc: 15...
- 01:09 PM Bug #1983 (Resolved): osd: failed assert, info does not match peer info
- ...
- 01:07 PM Bug #1982 (Closed): osd: failed assert (obc->watchers.size())
- ...
- 11:50 AM Feature #1885: identify top 10 expected failures and process to diagnose
- OSD:
* cascading failures
* single OSD failure
* failure to complete peering/recovery
* unfound objects after rec...
- 11:28 AM Bug #1976 (Closed): osd: timeout getting clean
- In the case of 8880 at least, it looked like the osds were still making progress. I've changed wait_till_clean to ti...
- 09:38 AM Bug #1976 (Closed): osd: timeout getting clean
- actually, this may be a monitor thing. seen it twice now:
/var/lib/teuthworker/archive/nightly_coverage_2012-01-2...
- 10:46 AM Bug #1975: btrfs: EINVAL on snap create
- wip-filestore-errors looks good to me except for one comment on github.
- 10:03 AM Bug #1975: btrfs: EINVAL on snap create
- wip-filestore-errors check should be reviewed+merged so we can see the actual error code. sadly,...
- 09:35 AM Bug #1975 (Won't Fix): btrfs: EINVAL on snap create
- /var/lib/teuthworker/archive/nightly_coverage_2012-01-24-a/8879...
- 09:52 AM Bug #1980 (Closed): osd/PG.cc: 3562: FAILED assert(p->second.need <= v)
- /var/lib/teuthworker/archive/nightly_coverage_2012-01-24-a/8882...
- 09:50 AM Bug #1978 (Resolved): osd: FAILED assert(!object_contexts.size())
- /var/lib/teuthworker/archive/nightly_coverage_2012-01-24-a/8882...
- 09:45 AM Bug #1977 (Can't reproduce): mon: ceph command hang
- /var/lib/teuthworker/archive/nightly_coverage_2012-01-24-a/8881...
- 09:31 AM Feature #1655: gitbuilder aggregator page
- It's just a quick perl hack. What it really should do is make javascript and <div>s to fetch the results for each bu...
- 04:47 AM Feature #1655: gitbuilder aggregator page
- Is the source available for the gitbuilders.cgi aggregator? It looks like a pretty useful script for other projects w...
01/23/2012
- 09:50 PM Revision 1e421093 (ceph): Merge commit '9dc7b9233b985bf859751fc89a5b02253e829836'
- Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 09:16 PM Bug #1974 (Resolved): osd: radosmodel crash on thrashing
- ...
- 09:11 PM Bug #1973: osd: segfault in ReplicatedPG::remove_object_with_snap_hardlinks
- this is the _9 snap object_info_t:...
- 09:02 PM Bug #1973 (Can't reproduce): osd: segfault in ReplicatedPG::remove_object_with_snap_hardlinks
- /var/lib/teuthworker/archive/nightly_coverage_2012-01-23-b/8775...
- 08:50 PM Revision 54a76734 (ceph): ceph: don't write output on error
- Accumulate all output, and write it at the end. This way we can avoid
writing it if any of the commands fail.
Fixes...
- 08:50 PM Revision cfe1d011 (ceph): ceph: bail out on first failing command
- Signed-off-by: Sage Weil <sage@newdream.net>
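The two ceph tool changes above (accumulate output, bail out on the first failing command) combine into a simple pattern: buffer everything in memory and only create the output file once every command has succeeded. The sketch below illustrates that idea under stated assumptions; `Command` and `run_and_write` are hypothetical names, and this is not the ceph tool's actual code.

```cpp
#include <cassert>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

// A command appends its output to `out` and returns true on success.
using Command = std::function<bool(std::string &out)>;

// Run every command, buffering output in memory. The file at `path` is only
// created after all commands have succeeded; we stop at the first failure.
bool run_and_write(const std::vector<Command> &cmds, const std::string &path) {
  std::string buffer;
  for (const auto &cmd : cmds) {
    if (!cmd(buffer))
      return false;  // bail out: nothing has been written to disk yet
  }
  std::ofstream f(path, std::ios::binary | std::ios::trunc);
  f << buffer;
  f.flush();
  return f.good();
}
```

With this shape, a request for something that does not exist leaves no zero-length file behind, which is the symptom described in Bug #1954.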
- 08:50 PM Revision 9dc7b923 (ceph): rgw: fix warning
- rgw/rgw_rest.cc:258: warning: comparison between signed and unsigned integer expressions
Signed-off-by: Sage Weil <s...
- 06:24 PM Revision c5e7a74c (ceph): .gitignore: ceph-dencoder
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:21 PM Revision 7ce544e6 (ceph): osd: ignore MInfoRec, MNotifyRec in WaitActingChange
- We should ignore logs, infos, and notifies while we are waiting for the
map to change. Peering has reached a dead-en...
- 05:53 PM Revision d9eedf53 (ceph): rgw: fix warning in 32bit arch
- 05:19 PM Revision 1a10b517 (ceph): ceph-dencoder: needs ceph_ver.h dependency
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:02 PM Revision 92a8f5e7 (ceph): pg: unindex entries when clearing or removing from the log
- Leaving the index around could cause use of the indexes to access
freed memory.
Signed-off-by: Josh Durgin <josh.dur...
- 05:02 PM Revision 5451d871 (ceph): osd: do not clobber log on backfill progress update
- This is unnecessary and counterproductive, since the log is used to detect
dup ops. It's an artifact of an earlier b...
- 03:58 PM Bug #1936: teuthology: github downtime -> failed runs
- Greg Farnum wrote:
> Is this going to be okay if the update hangs in the middle and then qa clones the repository?
...
- 02:18 PM Revision f002ed4c (ceph): features: #include ceph_features directly where needed
- Less rebuild time when touched.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:58 PM Bug #1954 (Resolved): ceph tool: don't create output files when an error occurs
- commit:54a76734b11c87f1edab993f15dc0a754b843019
- 12:44 PM Tasks #1923 (Resolved): document required properties and features for alternative backend file sy...
- Updating wiki also.
Ceph requirements for running over alternative backend filesystems (non btrfs):
1. support ...
- 11:20 AM Feature #1972 (Resolved): encoding: cross-version test repo, scripts
- 11:16 AM Feature #1971 (Resolved): encoding: adapt to messages
- 11:13 AM Feature #1970 (Resolved): osd: migrate to new encoding schemes
- 11:11 AM Feature #1934 (Closed): Get new Sepia machines into service
- 11:10 AM Feature #1969 (Resolved): gitbuilder for 11.10, 12.04
- 11:06 AM Feature #1968 (Rejected): ferro: Batch resource allocation (not fair, no quotas yet)
- 11:06 AM Feature #1967 (Rejected): ferro: Single API endpoint that delegates to machine managers
- 11:05 AM Feature #1966 (Rejected): ferro: Connect actions to state machine
- 11:05 AM Feature #1965 (Rejected): ferro: Machine management state machine (fake actions)
- 11:04 AM Feature #1964 (Rejected): ferro: Create a cloud-init OVF config that reimages a machine
- 11:04 AM Feature #1963 (Closed): ferro: OVF Environment creation as a library
- 11:03 AM Feature #1962 (Rejected): ferro: Trigger vMedia boot via IPMI/DRAC
- 11:03 AM Feature #1881 (In Progress): objecter: expose in-progress request state via admin socket
- 11:03 AM Feature #1961 (Rejected): ferro: Python wrapper for vmcli (using gevent)
- 10:29 AM Bug #1958 (Resolved): osd: crash during peering due to receiving an info msg in WaitActingChange
- commit:7ce544e640d45e901ef67e8268c963c958a66eff
- 06:59 AM Bug #1958: osd: crash during peering due to receiving an info msg in WaitActingChange
- fix pushed to commit:2f6205e57c7b8a21da72f0af8f1edd38a5989149
- 10:00 AM Cleanup #1960: You should be able to print daemon options without specifying a config file
- Greg Farnum wrote:
> Ew! This problem probably occurs with the mon and mds, but maybe not.
I can confirm this occ...
- 09:43 AM Cleanup #1960 (Resolved): You should be able to print daemon options without specifying a config ...
- ...
- 09:46 AM Bug #1959 (Resolved): qa: half of nightlies failing with chef+ruby error
- sepia29 has bad disk, marked out.
- 06:16 AM Revision e1044712 (ceph): osd: pg_stat_t generator
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:16 AM Revision c095c354 (ceph): osd: uninline pool_stat_t methods
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:16 AM Revision 37d38f77 (ceph): osd: pool_stat_t generator
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:16 AM Revision f09c01f7 (ceph): osd: watch_info_t generator
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:16 AM Revision abb0510b (ceph): objectstore: implement generator
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:16 AM Revision 58988920 (ceph): ceph-dencoder: reenable generated types
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:16 AM Revision 10b87ed0 (ceph): objectstore: drop unused Transaction::p
- This should have gone away when we added the iterator.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:16 AM Revision 99851450 (ceph): ceph-dencoder: fix up usage a bit
- - verb_noun
- "in-memory"
- 1-based test index
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:16 AM Revision 56c5e851 (ceph): osd: initialize fields in watch_info_t constructor
- Go-go unit tests!
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:16 AM Revision 0ced79db (ceph): objectstore: remove unused setattr variants
- No callers. Fugly.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:16 AM Revision 0b2cf7de (ceph): osd: pool_snap_info_t generator
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:16 AM Revision e07178fb (ceph): osd: pg_pool_t generator
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:16 AM Revision adcc9bf7 (ceph): osd: object_sum_stat_t generator
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:16 AM Revision ee823972 (ceph): osd: uninline object_stat_sum_t methods
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:16 AM Revision 8e9c229b (ceph): osd: generator for object_stat_collection_t; uninline too
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:16 AM Revision 89b189ab (ceph): osd: uninline pg_stat_t methods
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 06:16 AM Revision 40f59b80 (ceph): msg: entity_{name,addr}_t generators
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 04:43 AM Revision 7ece2787 (ceph): osd: dump and generators for OSDSuperblock
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 04:43 AM Revision f1af44f6 (ceph): test-generated.sh
- Test built-in test instances
Signed-off-by: Sage Weil <sage@newdream.net>
- 04:43 AM Revision 47e20068 (ceph): ceph-dencoder: implement 'version' command
- Signed-off-by: Sage Weil <sage@newdream.net>
- 04:43 AM Revision fa20be37 (ceph): qa: misc encoding scripts, in various states of usefulness.
- Signed-off-by: Sage Weil <sage@newdream.net>
- 04:43 AM Revision 2baa6c0d (ceph): encoding: adjust ENCODE/DECODE macros
- - make argument order consistent
- 'v' for code version
- 'compat' for compat version (lower bound)
- *_FINISH needs ...
- 04:41 AM Revision 354f2cbe (ceph): ceph-dencoder: encode/decode/dump test tool
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 04:41 AM Revision 942f302b (ceph): move feature bit definition to separate header file
- It was sloppy to have this in SimpleMessenger.h
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 04:41 AM Revision e674f2b4 (ceph): features: add missing features to default set
- UID, MONCLOCKCHECK. No practical impact here, since they only mattered to
the mon and that specified them explicitly...
- 04:41 AM Revision c0bdb071 (ceph): ceph-dencoder: support feature bits
- Print our version's feature bits with --get-features.
Specify bits to encode with via -f <val>.
Signed-off-by: Sage...
- 04:41 AM Revision 3f30a6f2 (ceph): objectstore: implement Transaction::dump(Formatter*)
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 04:41 AM Revision 7c0a4da3 (ceph): ceph-dencoder: ObjectStore::Transaction
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 04:41 AM Revision 7393ad19 (ceph): ceph-dencoder: fix build
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 04:41 AM Revision 7d4c3dbc (ceph): osd: osd_stat_t generator
- Generate some test object instances.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 04:41 AM Revision bbf18268 (ceph): ceph-dencoder: generate object instances from static method
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 04:41 AM Revision 03b6113a (ceph): ceph-dencoder: clean up usage
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 04:41 AM Revision 45649214 (ceph): osd: implement watch_info_t, object_info_t::dump()
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 04:41 AM Revision 80b80b05 (ceph): filejournal: use ::encode() wrapper func for Transaction
- So we can capture the output.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 04:41 AM Revision cdeeed6c (ceph): msgr: dump() entity_name_t and entity_addr_t
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 04:36 AM Revision a7106421 (ceph): encoding: instrument to dump encoded objects
- If built with -DENCODE_DUMP=path, dump encoded copies of objects to that
directory. Limit the copies of each class w...
- 04:36 AM Revision d3213935 (ceph): encoding: new {ENCODE,DECODE}_{START,FINISH} macros
- New macros to bracket encode/decode methods. The encoding scheme:
1 byte - version
1 byte - incompat version
4 b...
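The bracketing scheme described in revision d3213935 can be illustrated with a small standalone sketch. The entry above is cut off after "4 b...", so the third header field is treated here as a 4-byte little-endian payload length; that is an assumption for the example, as are the names `start_encode`, `finish_encode`, and `start_decode`. This is not the Ceph macros themselves, only an outline of the idea of a version byte, an incompat (lower bound) version byte, and a length that lets older decoders skip fields they do not understand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Buf = std::vector<uint8_t>;

// Begin a versioned struct: 1 byte version, 1 byte compat version, then a
// 4-byte length placeholder (ASSUMPTION: the log entry is truncated here).
// Returns the offset of the length field so finish_encode() can patch it.
size_t start_encode(Buf &b, uint8_t v, uint8_t compat) {
  b.push_back(v);
  b.push_back(compat);
  size_t len_pos = b.size();
  b.insert(b.end(), 4, 0);
  return len_pos;
}

// Patch the little-endian payload length once the body has been appended.
void finish_encode(Buf &b, size_t len_pos) {
  uint32_t len = static_cast<uint32_t>(b.size() - (len_pos + 4));
  for (int i = 0; i < 4; ++i)
    b[len_pos + i] = static_cast<uint8_t>(len >> (8 * i));
}

// Decode the header. Returns false if the buffer is too short or if the
// encoder requires a newer decoder (compat > supported). On success,
// payload_off and payload_len bound the body, so a decoder can skip any
// trailing fields added by a newer version.
bool start_decode(const Buf &b, uint8_t supported, uint8_t &v,
                  size_t &payload_off, uint32_t &payload_len) {
  if (b.size() < 6)
    return false;
  v = b[0];
  uint8_t compat = b[1];
  if (compat > supported)
    return false;
  payload_len = 0;
  for (int i = 0; i < 4; ++i)
    payload_len |= static_cast<uint32_t>(b[2 + i]) << (8 * i);
  payload_off = 6;
  return b.size() - payload_off >= payload_len;
}
```

A decoder that supports version 2 can therefore read a struct encoded as version 3 with compat 2, while one that only supports version 1 rejects it cleanly instead of misreading the payload.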
01/22/2012
- 02:55 PM Bug #1959 (Resolved): qa: half of nightlies failing with chef+ruby error
- 02:55 PM Bug #1936 (Resolved): teuthology: github downtime -> failed runs
- 02:54 PM Messengers Bug #1942 (Won't Fix): msgr: Address family not supported by protocol
- 02:44 PM Feature #1884 (Resolved): plan encoding strategy to test+facilitate non-disruptive upgrades
- 06:06 AM Bug #1849: directories' timestamps in snapshots sometimes change when directory is modified
- I'm not sure this was a dupe, but I can see why it would seem like it.
When I reported this, plenty of directories...
- 06:00 AM CephFS Bug #1435: mds: loss of layout policies upon mds restart
- I've been looking at the MDS implementation, and I have a theory now.
It was probably not the MDS restarts that we...
01/20/2012
- 08:56 PM Revision aea6a305 (ceph): rgw: read_user_buckets() fix redone
- The problem with the original fix is that it wasn't atomic. Going back
to the original inefficient (though atomic) me...
- 06:55 PM Revision fdaf91e2 (ceph): osd: implement --dump-journal
- Dump the contents of the journal to stdout in text form. Useful for
debugging.
Signed-off-by: Sage Weil <sage.weil@...
- 06:50 PM Revision a52762ac (ceph): rgw: read large bucket directory correctly
- Issue #1955. When there were too many buckets, we failed reading
the bucket directory.
Signed-off-by: Yehuda Sadeh <y...
- 03:48 PM Bug #1958 (Resolved): osd: crash during peering due to receiving an info msg in WaitActingChange
- This happened during a teuthology run with thrashing and reads/writes/deletes.
Logs are in vit:~joshd/bug_1958
...
- 03:30 PM CephFS Bug #1957: ceph-fuse: have "." and ".." entries consistently
- this is specifically me not knowing how to handle .. on the root directory with fuse. sshfs does it, though, so it's po...
- 02:56 PM CephFS Bug #1957 (Resolved): ceph-fuse: have "." and ".." entries consistently
- I was cleaning old emails and found this: http://marc.info/?l=ceph-devel&m=130688351921306&w=2
Quick experiment sa...
- 12:58 PM rgw Bug #1955 (Resolved): rgw: cannot list user buckets when number of buckets is large
- Issue at rgw code, not librados. read_user_buckets() was broken. Fixed with commit:aea6a305e61c1fa54828b71eff29070c3f...
- 09:29 AM rgw Bug #1955 (Resolved): rgw: cannot list user buckets when number of buckets is large
- could be an issue with librados::tmap_get().
- 11:10 AM Feature #1956 (Resolved): rgw: revisit atomic GET/PUT
- Discuss the following different options to simplify the process:
1. Instead of writing tmp object and then clone i...
- 11:02 AM Feature #1944 (Resolved): osd: dump journal
- 12:54 AM Revision 802acb11 (ceph): rgw: refactor acls, separate protocol dependent code
- Does not compile yet, part of swift acls work.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
01/19/2012
- 05:11 PM Revision 6c275c81 (ceph): rgw: fix warning
- Signed-off-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
- 01:36 PM Bug #1954 (Resolved): ceph tool: don't create output files when an error occurs
- Doing something like 'ceph osd getmap 1000000 -o osdmap' results in a 0 length file if epoch 1000000 doesn't exist.
- 11:15 AM Bug #1953 (Resolved): teuthology: core files aren't archived when using valgrind
- When daemons crash while running under valgrind, the core file is being saved in the home dir instead of /tmp/cephtes...
- 10:50 AM Bug #1952 (Resolved): rgw: test suite times out
- We have seen a number of instances where the s3-tests test suite (running via jenkins) does not complete after 20 min...
- 10:01 AM rgw Feature #830: rgw: swift per-object ACLs
- Decisions:
(a) buckets used only through S3 see strict S3 behavior
(b) buckets used only through Swift see strict...
- 07:51 AM Linux kernel client Bug #1907: rbd: don't reuse device ids while they're still in use elsewhere
- I can ask on fsdevel, but right now I feel the need to understand a
little better what's going on inside rbd in orde...
- 04:41 AM Revision 3650fd61 (ceph): Merge remote branch 'gh/wip-op-data-mux'
- Reviewed-by: Greg Farnum <greg.farnum@dreamhost.com>
Reviewed-by: Yehuda Sadeh <yehuda.sadeh@dreamhost.com>
- 01:49 AM Revision 29885f3e (ceph): kernel: ignore connection problems while waiting for reboot
01/18/2012
- 11:52 PM Revision e016cca9 (ceph): Convert mount.ceph to use KEY_SPEC_PROCESS_KEYRING
- having mount.ceph use KEY_SPEC_USER_KEYRING to pass keys to the kernel has
several disadvantages:
1) It leaves the k...
- 09:16 PM Cleanup #1886 (Resolved): objecter/osd: mux/demux in MOSDOpReply encoding
- 07:46 PM Revision f1f75dd4 (ceph): Merge branch 'wip-rgw-simplelog'
- 07:37 PM Revision 8a9252f9 (ceph): rgw: adjust high level debug level
- setting it to 2 instead of 1
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 07:25 PM Revision b890838c (ceph): Merge remote branch 'gh/wip-rgw-simplelog'
- * gh/wip-rgw-simplelog:
rgw: add timestamp to high level log
rgw: log host_bucket, http status
rgw: simple requ...
- 04:05 PM Bug #1849 (Duplicate): directories' timestamps in snapshots sometimes change when directory is mo...
- I believe this is now a duplicate of #1946?
- 03:41 PM Feature #1951 (Resolved): store history of configs and changes
- During a discussion today it occurred to me that there will be situations under which we'll be a lot happier if we ca...
- 02:57 PM rgw Feature #1950 (New): rgw: create S3/Swift ACL interoperability suite
- We need that in order to test swift and s3 ACLs interactions.
- 11:49 AM rgw Feature #1882 (Resolved): rgw: high-level log entries for request state transitions
- Done, merged at commit:f1f75dd4768e017c61088b44e7457bf96916d1a9. Log level 2 now dumps a plain request state with tim...
- 11:45 AM Bug #1949 (Resolved): osd: ENOTEMPTY on collection removal from snaptrimmer
- ...
- 11:27 AM Linux kernel client Bug #1907: rbd: don't reuse device ids while they're still in use elsewhere
- my gut feeling is also that 'echo > /sys/bus/rbd/remove' should return EBUSY (along with rbd unmap). if you can't te...
- 10:40 AM Linux kernel client Bug #1907: rbd: don't reuse device ids while they're still in use elsewhere
- Alex Elder wrote:
> From the linked message:
> > root <at> cephnode3:/# rbd unmap /dev/rbd0
> >
> < -> works wit...
- 09:51 AM Linux kernel client Bug #1907: rbd: don't reuse device ids while they're still in use elsewhere
- From the linked message:
> root <at> cephnode3:/# rbd unmap /dev/rbd0
>
< -> works without any error message (sho...
- 07:42 AM Revision 148031b7 (ceph): rgw: fix intent log processing
- Intent log processing was completely broken. First, it wasn't
parsing the date correctly (due to failure to initialize...
- 07:40 AM Revision 731c8832 (ceph): rgw: initialize tm before calling strptime
- strptime assumes tm is already initialized.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 05:59 AM Revision 0aab0890 (ceph): objecter: some helpful multiop result debug output
- Signed-off-by: Sage Weil <sage@newdream.net>
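The strptime fix in revision 731c8832 above rests on a general property of the call: strptime() fills in the fields named by the format string, and whether it touches the rest of the struct is not guaranteed, so a caller should zero the `struct tm` first. A minimal sketch follows; the helper name `parse_timestamp` and the date format are hypothetical and are not taken from the rgw intent log code.

```cpp
#include <cassert>
#include <cstring>
#include <time.h>

// Parse "YYYY-MM-DD HH:MM:SS" into `out`. Returns true only if the whole
// string matched the (hypothetical) format.
bool parse_timestamp(const char *s, struct tm *out) {
  // Zero the struct first: fields not named in the format may be left
  // untouched by strptime(), and would otherwise hold stale stack contents.
  std::memset(out, 0, sizeof(*out));
  const char *end = ::strptime(s, "%Y-%m-%d %H:%M:%S", out);
  return end != nullptr && *end == '\0';
}
```

Passing the result on to a function such as mktime() or timegm() is only reliable once every field has a defined value, which is why the initialization matters.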
- 05:57 AM Linux kernel client Bug #1795 (Resolved): break d_lock > s_cap_lock ordering
- I have verified under UML that no new problems arise with the
fix in place. I have not verified the lockdep warning...
- 05:32 AM Revision 6f35e322 (ceph): objecter: make getxattrs set rval on decode error
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:31 AM Revision f441adfd (ceph): objecter: add stat ops to op vector!
- They work better that way.
Signed-off-by: Sage Weil <sage@newdream.net>
- 05:10 AM Revision 1d5c8fd3 (ceph): objecter: gift reply data to outbl _after_ demuxing
- Divvy up the result bl first, then gift the whole shebang to outbl. If
we gift it first, there's nothing to demux (s...
- 01:33 AM Revision 2bffed3b (ceph): Merge remote branch 'gh/master' into wip-op-data-mux
- 01:33 AM Revision 905e8d80 (ceph): osd: make in/outdata split/merge helpers static OSDOp methods
- Avoid defining new global functions.
Also add basic doxygen descriptions.
Signed-off-by: Sage Weil <sage.weil@dream...
01/17/2012
- 11:10 PM Revision 1a7c8b49 (ceph): rgw: log_show_next() fix reading of the next buffer
- Bug #1939. Failed reading large logs.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 11:08 PM Revision 5bb9a9d6 (ceph): Add small cluster thrashing tasks
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 11:05 PM Revision 019a0d4c (ceph): Merge branch 'master' of ssh://github.com/NewDreamNetwork/ceph
- 10:23 PM Revision 956a4b43 (ceph): Merge remote branch 'gh/wip-backfill'
- Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
Conflicts:
src/ceph_mds.cc
src/ceph_osd.cc
- 10:21 PM Revision 06e7562f (ceph): filestore: overwrite fsid during --mkfs
- This mainly matters because read_fsid() now looks at the file size to
determine if it's an old- or new-style fsid, an...
- 09:42 PM Revision 4c6c4430 (ceph): rgw: reset timestamp when processing starts
- otherwise we'd count also the time waiting for the request.
Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 09:00 PM Revision bd8e32d9 (ceph): doc: update control file for setting pg num on pool create
- Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
- 09:00 PM Revision a85ea475 (ceph): hadoop: check for valid filehandler, before using in next calls
- In case of nonexistent file, calling Client::replication()
triggers assert.
Signed-off-by: Andrey Stepachev <octo@ya...
- 09:00 PM Revision 127bbd17 (ceph): hadoop: fix unix timestamp calculation in hadoop lib
- Hadoop always sees wrong dates because of the wrong timestamp calculation. Properly
convert nanoseconds to millis when adding....
- 07:43 PM Revision 94f55f48 (ceph): TestRados: fix {min,max}_stride_size initialization
- Signed-off-by: Sage Weil <sage@newdream.net>
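The hadoop timestamp entry above (revision 127bbd17) is truncated, so the exact change is not visible here. The arithmetic it refers to, combining a seconds value and a nanoseconds value into milliseconds, can be sketched as follows. The function name `to_millis` is hypothetical and the code is an illustration of the unit conversion, not the hadoop library's implementation.

```cpp
#include <cassert>
#include <cstdint>

// Convert a (seconds, nanoseconds) pair to milliseconds since the epoch.
// Seconds scale up by 1000; nanoseconds scale down by 1,000,000.
int64_t to_millis(int64_t sec, int64_t nsec) {
  return sec * 1000 + nsec / 1000000;
}
```

Adding the nanosecond field without dividing it first would inflate the result by up to about 10^9 ms (roughly 11.5 days), which is the kind of error that produces visibly wrong dates.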
- 06:54 PM Revision 0c2f2b76 (ceph): Merge branch 'master' of ssh://ceph.newdream.net/git/ceph
- 06:51 PM Revision 79d19320 (ceph): osd: fix bind error checks
- Signed-off-by: Sage Weil <sage@newdream.net>
- 06:44 PM Revision 7804046d (ceph): Makefile: fix testkeys non-tcmalloc linkage
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:56 PM Revision 241cbebe (ceph): rgw: add timestamp to high level log
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 05:56 PM Revision db65295b (ceph): rgw: log host_bucket, http status
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 05:56 PM Revision 0e8b12cd (ceph): rgw: simple request logging
- Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
- 05:36 PM Revision 63b94b6f (ceph): mds: abort startup if we fail to bind
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:36 PM Revision 4f70acfa (ceph): osd: abort on startup if we fail to bind to a port
- Signed-off-by: Sage Weil <sage@newdream.net>
- 05:24 PM Revision 45e4c924 (ceph): thrashosds: maxdead default to 0
- This avoids any possibility of blocking peering.
- 04:21 PM Revision 47db4d04 (ceph): ceph: fix "run_uml.sh" script
- Last-minute cleverness prior to checkin broke the "run-uml.sh" script.
Rearrange where a few definitions are done to m... - 03:54 PM rgw Bug #1948 (Resolved): rgw: need to read intent log in chunks
- 03:14 PM rgw Bug #1939 (Resolved): rgw: error processing large logs
- Fixed, commit:1a7c8b49f099268ee468877f7f1f7ad747995547.
- 02:58 PM Feature #1658 (Resolved): osd: backfill instead of backlog
- merged in commit:956a4b439759e46424fde3551971cd66b6d682e6
- 02:18 PM Messengers Bug #1942: msgr: Address family not supported by protocol
- commit:dcceb8e835cbf40173c334de18bd68c2cf7f3716 added the osd_fsid to the OSDSuperblock message and revved the version. ...
- 11:27 AM Messengers Bug #1942: msgr: Address family not supported by protocol
- We now report a connection fault instead of asserting, and during initialization we check for bind() errors. afaics...
- 11:47 AM CephFS Bug #1947 (Resolved): mds: SIGBUS during _mark_dirty
- This happened on umount after ffsb with the kernel client.
From teuthology:~teuthworker/archive/nightly_coverage_201... - 11:40 AM Bug #1936: teuthology: github downtime -> failed runs
- Is this going to be okay if the update hangs in the middle and then qa clones the repository?
- 11:25 AM Bug #1936: teuthology: github downtime -> failed runs
- I set up github mirrors for ceph.git, ceph-qa-chef.git, and s3-tests.git. They are at ceph.newdream.net/git, and upd...
- 11:34 AM CephFS Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
- Happened again in teuthology:~teuthworker/archive/nightly_coverage_2012-01-15-b/7721/remote/ubuntu@sepia6.ceph.dreamh...
- 11:28 AM Bug #1943 (Need More Info): osd: bad clone transaction on journal replay
- 11:23 AM CephFS Bug #1946: snapshot inherits timestamp/size/etc from modified trunk dir upon mds restart
- It looks to me like the problem is that CInode::old_inodes isn't included in EMetaBlob::fullbit. CInode::pick_old_in...
- 11:18 AM CephFS Bug #1946 (Resolved): snapshot inherits timestamp/size/etc from modified trunk dir upon mds restart
- mkdir .snap/name
ls -ld . .snap/name
# both have the same timestamp
touch .
ls -ld . .snap/name
# now . has a di... - 09:15 AM Bug #1638: Can't create object with large xattrs in a single operation (on extN)
- So now there's an assert on the ENOSPC. I triggered it by running s3tests under teuthology. Adding this here so we kno...
- 09:11 AM CephFS Bug #1945 (Can't reproduce): blogbench hang on caps
- ...
- 12:54 AM Revision 549b7806 (ceph): TestRados: implement max_seconds, reimplement argument parsing
- Signed-off-by: Sage Weil <sage@newdream.net>
- 12:53 AM Revision bf22a4fb (ceph): task/rados: use new usage for radosmodel tool
- 12:22 AM Revision 20f3f686 (ceph): RadosModel: prefix line with m_op
- So we can gauge progress...
Signed-off-by: Sage Weil <sage@newdream.net> - 12:16 AM Revision 7b2fd45b (ceph): mds: fix uninitialized value in MClientLease::h
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
01/16/2012
- 11:09 PM Revision b2c07d8a (ceph): add simple thrash workload to regression suite
- 11:05 PM Revision 8fc60869 (ceph): thrashosds: make actions less nonsensical
- Make marking OSD up/down and in/out totally orthogonal.
Signed-off-by: Sage Weil <sage@newdream.net> - 11:05 PM Revision 71390f97 (ceph): thrashosds: fix action selection
- I'm not sure what the old code was trying to do, but I'm pretty sure it
wasn't doing it correctly.. a .1 chance_down ... - 10:38 PM Revision ec6e57c9 (ceph): Merge remote branch 'gh/master' into wip-op-data-mux
- 10:06 PM Revision b5f8de7b (ceph): msgr: move operator<< for sockaddr_storage to msg_types.cc
- tcp.{cc,h} aren't built/linked cleanly.
Signed-off-by: Sage Weil <sage@newdream.net> - 09:26 PM Revision e93999f9 (ceph): qa/workunits/rados/load-gen-mix.sh
- 10k objects, not 100k!
Signed-off-by: Sage Weil <sage@newdream.net> - 09:25 PM Revision ba83e8c6 (ceph): qa: rados load-gen: use rbd pool
- No replay interval.
- 09:18 PM Revision 9419f583 (ceph): ls: include duration, less noise
- 09:18 PM Revision c5bbfffa (ceph): hammer.sh: new -nuke syntax
- 08:39 PM Revision 8fb115fe (ceph): include run duration in summary.yaml
- 07:08 PM Revision 8e126db1 (ceph): mon.0 -> mon.a
- 07:08 PM Revision 43da161d (ceph): mds.0 -> mds.a
- 06:47 PM Revision 7b47e49f (ceph): ls: fix extraneous newline
- 06:40 PM Revision b7a11026 (ceph): rados: load-gen: wake up on reply
- So we can send requests more than once per second.
Signed-off-by: Sage Weil <sage@newdream.net> - 06:40 PM Revision 51e402e3 (ceph): rados: fix load-gen 'max-ops'
- This was mixed up with min/max_op_len. And max_ops wasn't being used in
the initial object creation stage, flooding the... - 06:19 PM Revision 7d3b2c41 (ceph): librados: allow ObjectReadOperation::stat() to get time_t mtime
- We can't use the internal utime_t type here.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 06:16 PM Revision ecd6dec6 (ceph): Merge remote branch 'gh/master' into wip-op-data-mux
- Conflicts:
src/librados.cc
src/objclass/class_api.cc
src/rgw/rgw_rados.cc - 05:55 PM Revision b58f9560 (ceph): ceph: ignore all leaks
- unless/until we figure out where the DefinitelyLost records are coming
from.. at first glance they look bogus. - 05:46 PM Revision 706b6910 (ceph): osd: recover_primary_got() -> recover_got()
- This is called on primary and replicas alike.
Signed-off-by: Sage Weil <sage@newdream.net> - 05:34 PM Revision a4e2395f (ceph): osd: clear missing set on replica when restarting backfill
- The primary does the same in PG::activate().
Signed-off-by: Sage Weil <sage@newdream.net> - 05:22 PM Revision 40fb86ff (ceph): ceph: take single arg or list for valgrind args
- 06:54 AM Revision c88ec571 (ceph): combined mon, osd, mds starter functions
- 06:53 AM Revision f8ec23e7 (ceph): rbd: default to all:
- 06:52 AM Revision f7952614 (ceph): lost_unfound: make test work with backfill
- If we backfill, we fail to peer instead of having every object show up as
'unfound'. Avoid that by preventing log tr... - 06:52 AM Revision f70b158c (ceph): show host -> roles mapping on startup
- Less guessing when manually inspecting an in-progress or hung run.
- 06:52 AM Revision fbfa94bb (ceph): teuthology-ls: show pid, last line of output for running jobs
- 06:52 AM Revision 72057a9c (ceph): use local mirrors for (most) github urls
- A cronjob on ceph.newdream.net updates these every 15 minutes. Sigh.
- 06:52 AM Revision 709d9441 (ceph): use local mirrors for (most) github urls
- A cronjob on ceph.newdream.net updates these every 15 minutes. Sigh.
- 05:56 AM Revision a4642946 (ceph): msgr: don't assert on socket(2) failure
- This can happen if we're connecting to an invalid address. Generate an
error message instead of crashing.
See #1942...
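The msgr entry above (revision a4642946) replaces an assert on socket(2) failure with an error message. The messenger itself is C++ and its code is not shown here; the sketch below only illustrates the same pattern in Python, with a hypothetical helper name and return shape.

```python
import socket


def open_stream_socket(family):
    """Try to create a TCP socket for the given address family.

    Hypothetical helper: returns (sock, None) on success, or
    (None, message) when the OS rejects the request, instead of
    asserting and crashing the process.
    """
    try:
        return socket.socket(family, socket.SOCK_STREAM), None
    except OSError as exc:
        # e.g. "Address family not supported by protocol"
        return None, f"unable to create socket for family {family}: {exc}"


if __name__ == "__main__":
    # 9999 is not a valid address family, so this takes the error path.
    sock, err = open_stream_socket(9999)
    print(sock, err)
```

Returning the error to the caller lets it report a connection fault and carry on, which matches the behavior described for #1942.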
01/15/2012
- 10:59 PM Feature #390 (Resolved): Implement bdrv_snapshot_goto (Rollback), bdrv_snapshot_delete
- 10:57 PM Feature #1781 (Resolved): qa: readwrite and roundtrip rgw tests in qa suite
- 10:56 PM rgw Feature #1911 (Closed): rgw: plan handling for large and/or manifest objects, s3 and/or swift
- the plan is to do manifest objects for large s3 objects. that means the pieces won't get a locator and will be distr...
- 10:28 PM Messengers Bug #1942: msgr: Address family not supported by protocol
- Still not sure how the bad address made it into the map (or OSDBoot) message, but at least it won't crash now as of c...
- 05:16 AM Revision a6c06103 (ceph): msgr: uninline operator<< on sockaddr_storage
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
01/14/2012
- 10:01 PM Bug #1928 (Resolved): osd: scrub stat mismatch after fsstress on kernel client
- 09:51 PM Feature #1944: osd: dump journal
- wip-osd-dump-journal
- 06:47 PM Feature #1944 (Resolved): osd: dump journal
- dump text summary of journal contents. this would help debug #1943
- 06:46 PM Bug #1943 (Duplicate): osd: bad clone transaction on journal replay
- Martin got this with v0.40:...
- 02:29 PM Messengers Bug #1942 (Won't Fix): msgr: Address family not supported by protocol
- http://joshp.no-ip.com:8080/20120114-osd-family-error.log.bz2
- 11:01 AM rgw Feature #1941 (Rejected): rgw: revisit bucket removal
- We can try to look again at the steps we're doing when removing buckets, exploring ways to reduce osd operations that c...
- 01:13 AM Revision 6b02f9fa (ceph): osd: rev osd internal cluster protocol
- Prevent backfill code from talking to pre-backfill code.
Signed-off-by: Sage Weil <sage@newdream.net>
01/13/2012
- 11:57 PM Revision 8d271f43 (ceph): Merge branch 'stable'
- 11:08 PM Revision 7f123de8 (ceph): mds: require OSDREPLYMUX feature bit
- We use ObjectOperations now and need a new server to decompose replies
into their constituent components.
Signed-off... - 11:07 PM Revision 012a9855 (ceph): librados: require OSDREPLYMUX feature
- We need this since we now rely on the server telling us rvals and
payload_lens for each OSDOp.
Signed-off-by: Sage W... - 11:07 PM Revision 436f8cac (ceph): define new OSDREPLYMUX feature bit
- This corresponds to the OSD's ability to pass payload_len hints and
return values for each OSDOp in the MOSDOpReply me... - 10:55 PM Linux kernel client Bug #1940: locking cycle in ceph_osdc_start_request
- this was causing teuthology runs to fail.
patch in master, testing! - 10:28 PM Linux kernel client Bug #1940 (Resolved): locking cycle in ceph_osdc_start_request
- ...
- 10:50 PM Revision 9d0476c5 (ceph): objecter: fix add_*() calls to use proper helper
- The helper resizes the other vectors; need that everywhere.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 09:04 PM Revision 42a6cefe (ceph): ReplicatedPG: munge truncate_seq 1/truncate_size -1 to seq 0/size 0
- Truncate with seq 1 and size -1 is a noop.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Reviewed-by: Sage ... - 08:34 PM Revision 0ded7e4d (ceph): ReplicatedPG: munge truncate_seq 1/truncate_size -1 to seq 0/size 0
- Truncate with seq 1 and size -1 is a noop.
Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Reviewed-by: Sage ... - 08:19 PM Revision 44cb0763 (ceph): rgw: limit object PUT size
- 07:26 PM Revision 3bfa41cf (ceph): Use yaml.safe_dump so unicode doesn't mess up the yaml files.
- In general, yaml.dump is comparable to pickle, and my personal
coding standard says *never* use it. yaml.safe_dump is... - 07:14 PM Linux kernel client Bug #1795: break d_lock > s_cap_lock ordering
- I was hitting some odd behavior while testing. Will try again over
the weekend or early next week.
Also, a note:... - 02:28 PM Linux kernel client Bug #1795: break d_lock > s_cap_lock ordering
- OK, after several iterations and some discussion we
concluded the last two patches (turning these things
into atomi... - 11:54 AM Linux kernel client Bug #1795: break d_lock > s_cap_lock ordering
- I have posted a series of four proposed patches to the list to address
this, along with a few other issues identifie... - 05:06 PM Revision d5753374 (ceph): objecter: fix up stat, getxattrs handlers
- - try/catch
- stat mtime
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 04:36 PM Revision 7eea40ea (ceph): v0.40
- 04:35 PM Revision 224a65a7 (ceph): Merge remote branch 'gh/master' into wip-backfill
- 03:08 PM rgw Bug #1939 (Resolved): rgw: error processing large logs
- 02:06 PM Bug #1928: osd: scrub stat mismatch after fsstress on kernel client
- Looks like the fixes for this introduced a new bug - 3 runs so far today failed with a similar scrub stat mismatch:
... - 10:49 AM Bug #1928 (Closed): osd: scrub stat mismatch after fsstress on kernel client
- 4815cafddf46e968501ac3b96e593c5e8db6218b
- 11:40 AM Bug #1935: teuthology: readwrite/roundtrip jobs run manually, but not in suite
- teuthworker is updated.
- 11:31 AM Bug #1935 (Resolved): teuthology: readwrite/roundtrip jobs run manually, but not in suite
- Alright, I was reading the wrong file. s3readwrite.py and s3roundtrip.py do use the yaml format. The above cleanups w...
- 10:04 AM Bug #1935: teuthology: readwrite/roundtrip jobs run manually, but not in suite
- I'm wrong, ignore me for a while.
- 09:58 AM Bug #1935: teuthology: readwrite/roundtrip jobs run manually, but not in suite
- More specific plan for change:
- change s3tests/functional/__init__.py to read env var S3TEST_YAML, raise if it is... - 09:50 AM Bug #1935: teuthology: readwrite/roundtrip jobs run manually, but not in suite
- s3tests/functional (run via nosetests) reads a .ini -style configuration.
This was not flexible enough for all the... - 11:36 AM CephFS Bug #1938 (Resolved): mds: snaptest-2 doesn't pass with 3 MDS system
- run vstart, mount ceph-fuse, run snaptest-2; mds.a crashes:...
- 09:24 AM Cleanup #1886: objecter/osd: mux/demux in MOSDOpReply encoding
- 08:52 AM Feature #1937 (Resolved): teuthology: --unlock option for -nuke
- this'd make cleanup slightly less painful (no need to check back on old terminal to unlock nuked nodes)
- 08:38 AM CephFS Bug #1682: mds: segfault in CInode::authority
- happened again on /var/lib/teuthworker/archive/nightly_coverage_2012-01-13-a/7335...
- 01:50 AM Revision 81c0ad82 (ceph): librados: make new ObjectReadOperations arguments non-optional
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:50 AM Revision 7347538d (ceph): rgw: use new librados ObjectReadOperation method arguments
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 01:47 AM Revision 4815cafd (ceph): ReplicatedPG: Update stat accounting for truncate during write
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
Reviewed-by: Josh Durgin <josh.durgin@dreamhost.com> - 12:39 AM Revision 8ceb3883 (ceph): rgw: wrap cls_cxx_map_* with try/catch around decoding
- 12:23 AM Revision 876d829a (ceph): librados: add ObjectOperation::exec
- 12:23 AM Revision 05d8ecbe (ceph): rgw: bucket index creation and init in a single operation
- 12:14 AM Revision dc628a5b (ceph): secret: move null check before strlen(key_name) deref
- Coverity cid: 98
Signed-off-by: Sage Weil <sage@newdream.net> - 12:10 AM Revision d41ddcdf (ceph): osd: stat op, don't compare in memory state to object
- might be that object is being created by the current compound request.
01/12/2012
- 11:44 PM Revision 4f4b79cc (ceph): osd: include return code in OSDOp
- This will expose the per-operation return values to the caller.
Signed-off-by: Sage Weil <sage.weil@dreamhost.com> - 11:44 PM Revision a8558284 (ceph): osd: mux/demux OSDOp::outdata in MOSDOpReply
- Bump encoding, so that we don't try to demux old encoded messages, which
will likely have OSDOp::payload_len == indat... - 11:44 PM Revision ff55d2f3 (ceph): osd: put result data in OSDOp.outdata
- This removes an argument from do_osd_ops() and cleans up the surrounding
code a bit.
Signed-off-by: Sage Weil <sage@n... - 11:44 PM Revision fe077832 (ceph): objecter: specify read return values pointers in ObjectOperation methods
- This lets Objecter do the demuxing work for compound read operations.
Signed-off-by: Sage Weil <sage.weil@dreamhost... - 11:44 PM Revision 920bd568 (ceph): librados: specify read return value pointers in ObjectReadOperation met...
- This lets librados do the work of parsing the reply from compound
operations, instead of requiring callers to have kn... - 11:09 PM Revision f42c658d (ceph): osd: fill in empty item in peer_missing for strays
- If we search_for_missing() on a host, make a corresponding entry in our
peer_missing map (if it isn't already there).... - 11:06 PM Bug #1936 (Resolved): teuthology: github downtime -> failed runs
- we should use a github mirror for anything on github. maybe we can/should piggyback off whatever carl set up.
- 11:02 PM Revision 10b00316 (ceph): rgw: don't crash when copying a zero sized object
- 10:48 PM Revision 0da44591 (ceph): nuke: take config files from -t argument
- teuthology-lock and teuthology-updatekeys both use -t for this already
- 09:21 PM Revision 80f57f96 (ceph): ReplicatedPG: fix stat accounting error in CEPH_OSD_OP_WRITEFULL
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 09:21 PM Revision 845aa534 (ceph): ReplicatedPG: Do a write even for 0 length operation
- Otherwise, a 0 length write to an offset past the end of the file will
cause the internal accounting to reflect the f... - 09:02 PM Revision 96e89d30 (ceph): kernel: loop reconnecting in case we race with shutdown
- Previously, if we reconnected before shutdown completed we asserted
that the kernel did not boot into the new version... - 08:59 PM Revision cfa39bfb (ceph): qa/client/gen-1774.sh
- Capture Alexandre's script for reproducing #1774 here for posterity, until
we write a properly harnessed test for thi... - 07:46 PM Revision 6cf77532 (ceph): osd: fix PG::Log::copy_up_to() tail
- The tail needs to refer to the entry preceding the first entry in the
log. This updates copy_up_to() to match the b... - 07:07 PM Revision 805513be (ceph): osd: reset last_complete on backfill restart
- Since last_backfill is hobject_t(), we can set this equal to last_update.
This fixes a problem where last_complete pr... - 06:38 PM Revision 1e56367e (ceph): client: avoid taking inode ref in case of nonexistent dir
- Signed-off-by: Andrey Stepachev <octo@yandex-team.ru>
Signed-off-by: Sage Weil <sage@newdream.net> - 06:35 PM Revision cedd92be (ceph): Merge branch 'wip-makefile'
- 06:03 PM Revision 71131371 (ceph): COPYING: note licenses for all files, not just the default
- This (mostly) copies debian/copyright for now, but there are format
restrictions for that file. Suggestions for a cl... - 06:03 PM Revision 54e0dfc1 (ceph): debian/copyright: note acx_pthread.m4 license
- Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
- 05:42 PM Bug #1935 (Resolved): teuthology: readwrite/roundtrip jobs run manually, but not in suite
- I'm not sure what's going on here.. roundtrip and readwrite are failing when scheduled, but when i copy/paste the sam...
- 05:17 PM Revision 6b55b6b3 (ceph): Makefile: Add headers that were omitted in make dist and prevented test...
- Signed-off-by: Kacper Kowalik (Xarthisius) <xarthisius@gentoo.org>
- 05:17 PM Revision c9e028f4 (ceph): Makefile: Handle corner case of crypto++ correctly
- i.e. use c++ while compiling, append to CRYPTO_LIBS instead of LIBS
Signed-off-by: Kacper Kowalik (Xarthisius) <xart... - 05:17 PM Revision c5144eed (ceph): Makefile: Use ACX_PTHREAD in configure.ac and resulting flags in src/Ma...
- instead of hardcoded flags
Signed-off-by: Kacper Kowalik (Xarthisius) <xarthisius@gentoo.org> - 05:17 PM Revision 7bf01b11 (ceph): Makefile: Add recent acx_pthread.m4 that has a fix for nostdlib issue.
- See http://code.google.com/p/protobuf/issues/detail?id=188 for details
Signed-off-by: Kacper Kowalik (Xarthisius) <x... - 04:58 PM Bug #1759: mds/client: truncate size overflow, fails with EINVAL
- Ok, I reproduced this with osd debugging, but not mds unfortunately. The logs are at slider:/home/samuelj/archived_l...
- 04:45 PM Bug #1930 (Resolved): objclass: need to wrap cls_cxx_map_* with try/catch, protect against bad de...
- Fixed, commit:8ceb388396d02daefe53e3bdb68c08b4855ceaf7.
- 09:48 AM Bug #1930 (Resolved): objclass: need to wrap cls_cxx_map_* with try/catch, protect against bad de...
- 04:26 PM Bug #1931 (Resolved): rgw: bucket index should create and init index atomically
- Fixed, commit:05d8ecbe1cea8d7c7bd33f9b53f4cbf06b2c4e61.
- 09:56 AM Bug #1931 (Resolved): rgw: bucket index should create and init index atomically
- 04:25 PM Linux kernel client Bug #1795 (In Progress): break d_lock > s_cap_lock ordering
- Discussed this with Sage. The problem arose because dentry_lease_is_valid()
is using the MDS session's s_cap_lock f... - 04:13 PM Feature #1934 (Closed): Get new Sepia machines into service
- 03:47 PM rgw Bug #1933 (Resolved): rgw: crash in swift copy
- 03:08 PM rgw Bug #1933: rgw: crash in swift copy
- copy of zero sized objects indeed. Affects both S3 and swift.
Fixed, at commit:10b00316b7778f6aecbf46ec0aea2aca8b8... - 01:36 PM rgw Bug #1933: rgw: crash in swift copy
- Might be a copy of a zero sized object.
- 01:24 PM rgw Bug #1933 (Resolved): rgw: crash in swift copy
- ...
- 03:38 PM Bug #1924 (Resolved): teuthology: installing kernels can fail due to reconnecting too soon
- Fixed by 96e89d30ec5f912f3c1b4844328e0966a2266e05 in teuthology.git.
- 03:24 PM Bug #1909: Two mons crash after starting the third one
- I have reinstalled ceph mon like I wrote but it has a different IP address now. Even though I have changed DNS record...
- 01:13 PM Bug #1928: osd: scrub stat mismatch after fsstress on kernel client
- Samuel Just wrote:
> It seems that fsstress will do that: 2012-01-11T14:30:04.867 INFO:teuthology.task.workunit.clien... - 01:07 PM Bug #1928: osd: scrub stat mismatch after fsstress on kernel client
- It seems that fsstress will do that: 2012-01-11T14:30:04.867 INFO:teuthology.task.workunit.client.0.out:8/17: dwrite f...
- 12:58 PM Bug #1928: osd: scrub stat mismatch after fsstress on kernel client
- One possibility: in CEPH_OSD_OP_WRITE in ReplicatedPG::do_op we pass op.extent.offset and op.extent.length to write_u...
- 11:07 AM Linux kernel client Feature #1922 (Resolved): rbd: annotate for lockdep
- The fix was to initialize the semaphore in rbd_add().
I have verified that this eliminates the lockdep warning.
... - 10:56 AM Bug #1898 (Duplicate): very long scrub blocked write operation
- Even though this is not the same complaint as 1783, we plan on
fixing it with the same changes, so I am calling this... - 10:30 AM Feature #1932 (Resolved): mon: before accepting a new crushmap, monitor should validate and test ...
- 01:04 AM Cleanup #1899: use acx_pthread instead of hardcoding libs and cflags into build system
- Sage Weil wrote:
> Looks good.. can I add your Signed-off-by to this?
Sure
01/11/2012
- 09:50 PM Revision b93bf285 (ceph): PG: gen_prefix should grab a map reference atomically
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 09:46 PM Feature #1929 (Resolved): teuthology: log runtime
- - include run time in summary.yaml
- include run time in teuthology-ls output
this will help a bit in identifying... - 09:37 PM Revision 38b9b503 (ceph): rgw-admin: add pool rm and pools list
- 09:05 PM Revision e2c02543 (ceph): rgw-admin: clean up unused commands
- 09:04 PM Revision ac1e105e (ceph): osd: bound log we send when restarting backfill
- Use the new tunable from b1da5115aa0756aefa4f0aad36395911e82fce28.
Signed-off-by: Sage Weil <sage@newdream.net> - 08:54 PM Revision 6dae2f8a (ceph): thrasher: adjust min_dead default
- Make this 1, not 2. That's a bit more friendly. It doesn't strictly
matter, tho, since we revive osds before waitin... - 08:54 PM Revision 3c0346b4 (ceph): lost_unfound: typo
- 08:54 PM Revision 59369237 (ceph): thrasher: don't mark down osds out; tell monitor same
- Stopping ceph-osd doesn't make it out (immediately). Prevent monitor
from doing this after a delay too so we can kee... - 08:54 PM Revision 50463ffd (ceph): verify all osds start before checking health
- Just checking health isn't good enough, since it races with OSD startup:
we can have a healthy cluster with 0 (or som... - 08:54 PM Revision fb74b901 (ceph): thrasher: add max_dead
- Add max_dead, and revive osds prior to waiting for clean. Otherwise we
can leave too many OSDs down and the cluster ... - 08:22 PM Revision 79085ad0 (ceph): rados.py: avoid getting return value of void function
- rados_ioctx_locator_set_key is void. The return value seems to have
been uninitialized, so the tests failed rarely.
... - 08:12 PM Linux kernel client Feature #1922: rbd: annotate for lockdep
- I believe the problem is that when a new rbd device structure gets
initialized in rbd_add(), the rw_semaphore contai... - 07:13 PM Revision 85552cf8 (ceph): pg: remove unnecessary guard from calc_trim_to()
- The num_objects check doesn't make sense, and could only make trimming
happen more often than it should. Sage did not... - 07:13 PM Revision b1da5115 (ceph): pg: add a configurable lower bound on log size
- This helps prevent problems with retrying requests being detected as
duplicates. The problem occurs when the log is t... - 06:34 PM Revision 8a9dbc47 (ceph): Merge remote branch 'gh/master' into wip-backfill
- 05:29 PM Bug #1928 (Resolved): osd: scrub stat mismatch after fsstress on kernel client
- ...
- 04:24 PM CephFS Bug #1774: client: files become inaccessible in large directories (with snapshots?)
- This script (properly adjusted to actually mount and remount the ceph-fuse tree) should be enough to trigger the bug.
- 02:41 PM Revision 734737f3 (ceph): osd: limit size of log sent to reset backfill targets
- Need to replace magic number with new tunable, once that is merged.
Signed-off-by: Sage Weil <sage@newdream.net> - 01:47 PM Bug #1925 (Closed): osd: segfault during _scan_list
- b93bf285c9f05ab943e8e506ea2125af0f1f97ad should fix it.
- 06:58 AM Bug #1925 (Closed): osd: segfault during _scan_list
- ...
- 01:39 PM rgw Feature #1927 (Resolved): rgw: add radosgw-admin pool list
- Fixed, commit:38b9b5030747349acf657946133ef57736542310.
- 12:50 PM rgw Feature #1927 (Resolved): rgw: add radosgw-admin pool list
- listing the active set of pools.
- 01:38 PM rgw Feature #1926 (Resolved): rgw: add radosgw-admin pool remove
- Fixed, commit:38b9b5030747349acf657946133ef57736542310.
- 12:49 PM rgw Feature #1926 (Resolved): rgw: add radosgw-admin pool remove
- Would allow removing pool from the active set of pools.
- 01:17 PM Cleanup #1899: use acx_pthread instead of hardcoding libs and cflags into build system
- Looks good.. can I add your Signed-off-by to this?
- 04:25 AM Revision 8f9549f0 (ceph): client: start caching readdir results after readdir_start
- Use upper_bound rather than lower_bound to compute the initial pd within
insert_trace, so that we don't attempt to re... - 12:39 AM Revision 5d989608 (ceph): monclient: fix resolve_addrs() call
- This was broken in def36668a13459d9c0851e4d4da440a288f9a34f it looks like.
Passing uninitialized memory to resolve_ad... - 12:35 AM Revision f09b21ef (ceph): resolve_addrs: return ipv4 and ipv6 addrs
- Fixes: #1891
Signed-off-by: Sage Weil <sage@newdream.net> - 12:22 AM Revision 9e9b5c6f (ceph): ReplicatedPG: fix typo in stats accounting in _rollback_to
- Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
- 12:14 AM Revision d4815e5b (ceph): osd: send log with backfill restart
- This makes backfill restart less of a special case: we send an info AND
log, just like we do normally. Code paths ar... - 12:07 AM Revision f4883ebf (ceph): ceph: let the user running ceph-osd remove subvolumes
- This will prevent EPERM when using the SNAP_DESTROY ioctl,
so the filestore will use btrfs snaps.