Activity
From 12/31/2012 to 01/29/2013
01/29/2013
- 11:40 PM rbd Bug #3964: krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd image with sn...
- ...
- 11:28 PM rbd Bug #3964 (Won't Fix): krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd i...
- fghaas reported, I reproduced on a precise 32-bit system:
create an image, map, writes work fine, even with dd ofl...
- 11:06 PM Bug #3963 (Won't Fix): cls_log should check should_gather before vsnprintf()
- 1) faster
2) would have allowed workaround for 3961
- 11:04 PM Bug #2481 (Won't Fix): ceph tell has almost no error reporting
- this should get cleaned up with whatever refactor we do with the api work, but not worth spending time on individuall...
- 11:01 PM Bug #3577 (Can't reproduce): osd missing reported by osd_recovery.test_incomplete_pgs workload
- we fixed several things that could explain this.
- 11:01 PM Bug #3595 (Need More Info): ceph-osd and ceph-mds crash on Debian Squeeze
- Is this still a problem with the bobtail packages?
- 10:58 PM Bug #2721 (Resolved): Ceph status does not work in 0.48 even if it is still documented
- wrong monitor version was running
- 10:57 PM Bug #2647 (Can't reproduce): osd: old request, waiting for subops
- 10:56 PM Bug #2500 (Resolved): osd: unprotected ::decodes in ReplicatedPG::do_osd_ops
- cleaned up ages ago
- 10:55 PM Bug #1197 (Resolved): osd: make inconsistent state durable
- this got fixed in commit:2475066c3247774a2ad048a2e32968e47da1b0f5
- 10:54 PM Bug #3646 (Resolved): pg_temp with two down/out osds
- commit:6122a9f62f9eeae1410d1703fecb8939a35fb03f
- 10:46 PM rbd Bug #3961 (Resolved): 32-bit cls_rbd tries cls_log with %d for 64-bit int, segfaults
- 32-bit system: rbd create i -s 1; rbd rm i causes death of osd in cls_log();
presumably this is because of cls_log(%...
- 10:42 PM Revision 59ac4d35 (ceph): qa: add rbd/concurrent workunit
- This defines a new workunit shell script that performs a bunch of
rbd operations concurrently in order to exercise co...
- 10:35 PM Revision 3bc21143 (ceph): ObjectCacher: fix flush_set when no flushing is needed
- C_GatherBuilder takes ownership of the Context we pass it. Deleting it
in flush_set after constructing the C_GatherBu...
- 10:10 PM RADOS Feature #3807: crush: simple commands to create common rules
- ceph osd crush rule list
ceph osd crush rule create-simple <name> <root> <failure domain>
ceph osd crush rule create-...
- 10:04 PM Revision 19f42731 (ceph): peer: fix filtering out of scrub from pg state
- 09:59 PM Revision 95677fc5 (ceph): mon: OSDMonitor: only share osdmap with up OSDs
- Try to share the map with a randomly picked OSD; if the picked monitor is
not 'up', then try to find the nearest 'up'...
- 09:59 PM Revision e4d76cb8 (ceph): utime: fix narrowing conversion compiler warning in sleep()
- Fix compiler warning:
./include/utime.h: In member function 'void utime_t::sleep()':
./include/utime.h:139:50: warnin...
- 09:33 PM Documentation #3960 (Resolved): [Document bug]MON and MDS do not need a ssd for data storage.
- From :http://ceph.com/docs/master/install/hardware-recommendations/#data-storage
it says:
Since the storage requi...
- 09:17 PM Revision a8964107 (ceph): rgw: fix crash when missing content-type in POST object
- Fixes: #3941
This fixes a crash when handling S3 POST request and content type
is not provided.
Signed-off-by: Yehud...
- 08:38 PM Linux kernel client Bug #3959 (Duplicate): krbd: decrement img_request->obj_request_count when deleting
- Each image request keeps a count of its object requests.
Adding an object request to or deleting one from an image
r...
- 08:34 PM Feature #2472: osd: add opaque 'class <name> <foo>' cap that class can interpret/enforce
- 08:34 PM CephFS Bug #1946 (Resolved): snapshot inherits timestamp/size/etc from modified trunk dir upon mds restart
- commit:7842bb50c7814cc16c22589bf41df7db1f7492eb
- 08:33 PM Feature #3890 (Fix Under Review): osd: create tool to extract pg info and pg log from filestore
- In final review to merge from wip-3890 branch.
- 08:33 PM Bug #3126 (Can't reproduce): mds crashed bool CDir::check_rstats()
- we'll see if this comes up with all of yan's fixes in now.
- 08:33 PM rbd Bug #3566 (Resolved): log max new = 1 can cause hang on process exit
- fixed a few weeks ago, commit:813787af3dbb99e42f481af670c4bb0e254e4432 and a few prior commits
- 08:32 PM Bug #3125 (Resolved): Assertion Error in peer.py - failure from the nightly run
- this is fixed up now, most recent commit was 3772d437dd4c562a6490f84124eb4757e22eca92
- 08:26 PM rbd Bug #3958 (Resolved): rbd fsx fails with EBUSY
- ...
- 07:41 PM CephFS Bug #3553 (Won't Fix): MDS core dumped running 0.48.2argonaut
- if/when we see this on bobtail or later, we'll investigate.
- 07:32 PM Bug #3878 (Rejected): osd: nobackfill flag doesn't work
- it works. it just doesn't leave the pg in backfill_wait, as i was expecting.
- 07:30 PM Bug #3836 (Resolved): osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
- in bobtail, commit:e6bceeedb0b77d23416560bd951326587470aacb
- 07:24 PM rgw Bug #3365: Broken metadata (duplicated as CSV)
- Sage Weil wrote:
> Aaron Schulz wrote:
> > Ian Colle wrote:
> > > Aaron are you still seeing this?
> >
> > Sorr...
- 12:31 PM rgw Bug #3365 (Can't reproduce): Broken metadata (duplicated as CSV)
- Thanks for trying to reproduce this on Bobtail, Aaron. I'm moving it to Can't Reproduce.
- 12:26 PM rgw Bug #3365: Broken metadata (duplicated as CSV)
- I'm having a hard time reproducing this on bobtail. If I remove the metadata normalization code in the MediaWiki/Clou...
- 07:07 PM Bug #3938: ceph-mon crashed on mixed bobtail-argonaut cluster (2 argonaut mons, 1 bobtail)
- is there a core for this?
- 06:51 PM Revision 7cd4e50d (ceph): client: Wait for caps to flush when flushing metadata.
- Embarrassingly, this conditional has been backwards since
I committed it in 818e7939. But we want to do the wait when...
- 06:44 PM Revision 11e1f3ac (ceph): ReplicatedPG: make_snap_collection when moving snap link in snap_trimmer
- Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
(cherry...
- 06:43 PM Bug #3957 (Resolved): new #include breaks assert.h (again)
- 06:40 PM Bug #3957 (Resolved): new #include breaks assert.h (again)
- #include <boost/lexical_cast.hpp> in mds/Server.cc apparently re-includes the system assert.h,
blowing up dout(). F...
- 06:42 PM Bug #3956 (Resolved): ceph auth add/del entity name parameter check
- commit:25e9a0be63fdad9fd8f7909585c9270a3729dc44
- 06:00 PM Bug #3956 (Resolved): ceph auth add/del entity name parameter check
- It's currently (as of v0.56.1) possible to run "ceph auth add" without any further parameters. This results in the ad...
- 05:28 PM Revision 907c709c (ceph): mds: Send created ino in journaled_reply
- The MDS avoids sending an early reply if a request
triggered inode allocation (no preallocated inodes yet).
For creat...
- 05:27 PM Bug #3955 (Resolved): Configure should explicity check for c++ compiler.
- If no c++ compiler is installed, configure fails with a misleading message when checking for boost libraries.
- 05:11 PM Bug #3747 (Closed): PGs stuck in active+remapped
- I think this was probably related to the lagging pg peering workqueue.. is there anything to suggest that isn't the c...
- 05:09 PM Bug #3948 (Need More Info): problems from leveldb static linkage and leveldb downgrade
- Corin-
Just restart the osd. And check dmesg for any kernel malfeasance... that is usually what triggers this. A...
- 04:51 PM Bug #3900 (Resolved): init-ceph should do ulimit -n's with do_root_cmd
- commit:84a024b647c0ac2ee5a91bacdd4b8c966e44175c in next, cherry-pick -x'ed to bobtail
- 03:21 PM Bug #3900 (Fix Under Review): init-ceph should do ulimit -n's with do_root_cmd
- 04:37 PM Subtask #3840 (Resolved): osd: ack push after apply+commit
- as part of #3833
- 04:36 PM Feature #3732 (Resolved): osd/mon: report recovery rate (bytes and objects per sec)
- commit:c2e50e580d18107162d2d101c5c243c665e56124
- 04:33 PM CephFS Feature #3953 (Resolved): kclient: get/set layout via virtual xattrs
- 04:32 PM CephFS Feature #1236 (Resolved): libceph: set layout via virtual xattrs (libceph/cfuse)
- commit:1564c3a0a3efbde5a326001586238fde8f6648ad for userspace bits.
the kernel bits still need review.. opening se...
- 04:18 PM Revision cf7c3f7d (ceph): client: Don't use geteuid/gid for fuse ll_create
- Fixes a bug in ll_create where files that already exist at the MDS
don't get the created flag set on reply. This cau...
- 03:11 PM rbd Bug #3952 (Resolved): krbd: no need for object header version
- The header object watch operation had a sort of half implemented
use of the version of the object. It apparently is...
- 03:08 PM rbd Bug #3946 (Resolved): rbd fsx failing in nightly
- Just an extra delete in a code path in flush_set that wasn't exercised before. Fixed by commit:3bc21143552b35698c9916...
- 02:44 PM rbd Bug #3946: rbd fsx failing in nightly
- Reproducing locally seems to confirm this, since there was a recent change to replace commit_set() with flush_set():
...
- 12:06 PM rbd Bug #3946: rbd fsx failing in nightly
- I'm guessing these are related to recent objectcacher changes, since they didn't affect runs without caching. The cor...
- 02:48 PM rbd Feature #3949 (Resolved): krbd: create test script that exercises concurrent operations
- I just committed the test script to the ceph master branch.
The script is located here: qa/workunits/rbd/concurrent...
- 09:16 AM rbd Feature #3949: krbd: create test script that exercises concurrent operations
- Well the script is really nice. And I just got a new
crash while running it on a real machine (rather than
my UML ...
- 08:22 AM rbd Feature #3949 (Resolved): krbd: create test script that exercises concurrent operations
- I suggested doing this in http://tracker.ceph.com/issues/3427.
That issue is about a bug where an image unmapping ca...
- 01:50 PM rgw Bug #3941 (Resolved): s3tests crash on bobtail
- Crash fixed, commit:f41010c44b3a4489525d25cd35084a168dc5f537.
Also, pushed a change to s3-tests.git, setting a requi...
- 01:27 PM Bug #3268: osd: localize reads handling is incorrect
- Yes, the OSDs will serve replica reads as things stand.
- 01:11 PM Bug #3268: osd: localize reads handling is incorrect
- I'm starting on this bug now. Before fixing the flag handling described in the ticket, I want to make sure that the O...
- 12:43 PM Bug #3810: btrfs corrupts file size on 3.7
- I'm making an attempt.
- 12:36 PM Bug #3810: btrfs corrupts file size on 3.7
- Mike, Bill: are you able to test Josef's patch?
- 11:45 AM Revision e805b7d6 (ceph): admin_socket: don't bother remote executing if there is no test
- 11:30 AM CephFS Bug #3951: ceph-fuse: permissions error on create
- I've got a question in for Sam, but other than that this looks good to me!
- 09:37 AM CephFS Bug #3951 (Resolved): ceph-fuse: permissions error on create
- Reported by Greg Farnum:
gregf@kai:~/ceph/src [master]$ cd mnt/
gregf@kai:~/ceph/src/mnt$ sudo chown gregf.gregf ...
- 11:12 AM Revision c9201d0e (ceph): ReplicatedPG: correctly handle new snap collections on replica
- Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
(cherry...
- 11:10 AM rbd Bug #3950: krbd: new assertion failure running concurrent rbd test
- OK, I do have the osd request pointer now. It was available
in register R14. And with a little work I can determin...
- 10:35 AM rbd Bug #3950: krbd: new assertion failure running concurrent rbd test
- The object being operated on is the rbd header image, in
this case named "image.5X5ZNB.rbd". The object request typ...
- 10:06 AM rbd Bug #3950: krbd: new assertion failure running concurrent rbd test
- Weird. It looks to me like the object request that's
just completing is already done, meaning we got
a callback fr...
- 09:19 AM rbd Bug #3950 (Can't reproduce): krbd: new assertion failure running concurrent rbd test
- (I think this is a new issue, I haven't investigated it yet.)
I hit an assertion failure while running my new test...
- 10:34 AM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
- I've opened a new issue that has symptoms similar to this
but not identical:
http://tracker.ceph.com/issues/395...
- 09:41 AM Bug #3768 (In Progress): perl is required for logrotate, we need to include Perl as a dependency
- Putting back to in-progress. The preferred solution is to replace the perl filter line with sed or python and remove...
- 09:38 AM Bug #3930 (Resolved): ceph.spec: udev rule for rbd not in rpms
- Branch: refs/heads/master
Home: https://github.com/ceph/ceph
Commit: 0b66994c180b1ce5856a38518423d82fbebc8a2e
...
- 09:15 AM rbd Bug #3427: krbd: unmap does not remove block device properly
- I have opened this to cover developing that test script
http://tracker.ceph.com/issues/3949
- 07:53 AM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
- ...yes, yes it is. I've been working in FUSE so far. *sigh* Well, it needed the fix too.
- 07:26 AM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
- I don't see wip-2753-fsync-errors in the repo. Also, note that this problem was reported on the cephfs kernel client...
- 06:49 AM Revision 0b66994c (ceph): ceph.spec.in: package rbd udev rule
- Package udev/50-rbd.rules per bug 3930.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
- 04:22 AM Revision 1c311949 (ceph): osd_recovery: inject a recovery delay
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 04:22 AM Revision e33b425d (ceph): osd_recovery: use --no-cleanup for rados bench
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 03:53 AM Revision 3b27c9ec (ceph): osd_backfill: --no-cleanup for rados bench
- 03:46 AM Revision a7d15afb (ceph): mon: smooth pg stat rates over last N pgmaps
- This smooths the recovery and throughput stats over the last N pgmaps,
defaulting to 2.
Signed-off-by: Sage Weil <sa...
- 03:17 AM Revision 0f7a9e56 (ceph): Merge remote-tracking branch 'yan/wip-mds'
- Reviewed-by: Sage Weil <sage@inktank.com>
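Revision a7d15afb above smooths the recovery and throughput stats over the last N pgmaps, defaulting to 2. One way such smoothing could work is sketched below in C, assuming each pgmap update contributes a (bytes, seconds) delta and the reported rate is the window's total bytes divided by its total seconds. The names rate_window, rate_window_init, rate_window_add and rate_window_rate are hypothetical and are not taken from the Ceph source; the actual mon/PGMap code may compute this differently.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WINDOW 16  /* hypothetical upper bound on N */

/* Ring buffer of the last n per-pgmap deltas (illustrative only). */
struct rate_window {
    double bytes[MAX_WINDOW];  /* bytes moved since the previous pgmap */
    double secs[MAX_WINDOW];   /* seconds elapsed since the previous pgmap */
    size_t n;                  /* window size N */
    size_t count;              /* samples stored so far, at most n */
    size_t next;               /* slot that the next sample overwrites */
};

void rate_window_init(struct rate_window *w, size_t n)
{
    assert(n > 0 && n <= MAX_WINDOW);
    w->n = n;
    w->count = 0;
    w->next = 0;
}

/* Record one delta, evicting the oldest sample once the window is full. */
void rate_window_add(struct rate_window *w, double bytes, double secs)
{
    w->bytes[w->next] = bytes;
    w->secs[w->next] = secs;
    w->next = (w->next + 1) % w->n;
    if (w->count < w->n)
        w->count++;
}

/* Smoothed rate: total bytes over total seconds across the window. */
double rate_window_rate(const struct rate_window *w)
{
    double total_bytes = 0.0, total_secs = 0.0;
    for (size_t i = 0; i < w->count; i++) {
        total_bytes += w->bytes[i];
        total_secs += w->secs[i];
    }
    return total_secs > 0.0 ? total_bytes / total_secs : 0.0;
}
```

With a window of 2, a single noisy pgmap interval is averaged with its predecessor instead of being reported on its own, which is the kind of damping the commit message describes.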
- 03:03 AM Revision ecda1208 (ceph): doc: fix overly-big fixed-width text in Firefox
- Changed font size for ...
- 03:01 AM Revision d5008602 (ceph): btrfs.yaml: increase osd op thread timeout
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 02:50 AM Revision 4aea19ee (ceph): osd_types: add recovery counts to object_sum_stats_t
- Signed-off-by: Sage Weil <sage@inktank.com>
- 02:50 AM Revision a2495f65 (ceph): osd: track recovery ops in stats
- Signed-off-by: Sage Weil <sage@inktank.com>
- 02:50 AM Revision 76e9fe5f (ceph): mon/PGMap: include timestamp
- Signed-off-by: Sage Weil <sage@inktank.com>
- 02:50 AM Revision 208b02a7 (ceph): mon/PGMap: report recovery rates
- Signed-off-by: Sage Weil <sage@inktank.com>
- 02:50 AM Revision 3f6837e0 (ceph): mon/PGMap: report IO rates
- This does not appear to be very accurate; probably the stat values we're
displaying are not being calculated correctl...
- 02:49 AM Revision 193dbedb (ceph): rbd-fuse: fix warning
- Signed-off-by: Sage Weil <sage@inktank.com>
- 02:44 AM Revision 1e24ce22 (ceph): doc: Removed indep, and clarified explanation.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 02:17 AM Revision 0e9c8124 (ceph): mds: add projected rename's subtree bounds to ESubtreeMap
- Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
- 02:17 AM Revision e69e7e5d (ceph): mds: fix 'discover' handling in the rejoin stage
- If the MDS is the resolve stage, current MDCache::handle_discover() only handles
'discover' from MDS that it has alre...
- 02:17 AM Revision abc4c785 (ceph): mds: allow handling slave request in the clientreplay stage
- replaying a client request may need to create slave request and the slave
MDS can be also in the clientreplay stage.
...
- 02:17 AM Revision 58841776 (ceph): mds: mark export bounds for cross authority directory rename
- this guarantees that the importing MDS gets directory fragment's
up-to-date fragstat/rstat.
Signed-off-by: Yan, Zhen...
- 02:17 AM Revision 829aeba6 (ceph): mds: clear inode dirty when slave rename finishes.
- The inode is linked to a non-auth directory, so remove it from LogSegment's
dirty inode list.
Signed-off-by: Yan, Zh...
- 02:17 AM Revision c93cf2d2 (ceph): mds: fix for MDCache::disambiguate_imports
- In the resolve stage, if no MDS claims other MDS's disambiguous subtree
import, the subtree's dir_auth is undefined.
...
- 02:17 AM Revision 0cf5e4e5 (ceph): mds: journal inode's projected parent when doing link rollback
- Otherwise the journal entry will revert the effect of any on-going
rename operation for the inode.
Signed-off-by: Ya...
- 02:17 AM Revision 9a0cfcc5 (ceph): mds: don't journal opened non-auth inode
- If we journal opened non-auth inode, during journal replay, the corresponding
entry will add non-auth objects to the ...
- 02:17 AM Revision 4fc68a48 (ceph): mds: properly clear CDir::STATE_COMPLETE when replaying EImportStart
- when replaying EImportStart, we should set/clear directory's COMPLETE
flag according with the flag in the journal ent...
- 02:17 AM Revision 710bba3a (ceph): mds: move variables special to rename into MDRequest::more
- My previous patches add two pointers (ambiguous_auth_inode and
auth_pin_freeze) to class Mutation. They are both used...
- 02:17 AM Revision f4abf00a (ceph): mds: rejoin remote wrlocks and frozen auth pin
- Includes remote wrlocks and frozen authpin in cache rejoin strong message
Signed-off-by: Yan, Zheng <zheng.z.yan@int...
- 02:17 AM Revision 77946dcd (ceph): mds: fetch missing inodes from disk
- The problem of fetching missing inodes from replicas is that replicated inodes
does not have up-to-date rstat and fra...
- 02:17 AM Revision 9944d9fb (ceph): mds: don't journal non-auth rename source directory
- After replaying a slave rename, non-auth directory that we rename out of will
be trimmed. So there is no need to jour...
- 02:17 AM Revision 1a6626f0 (ceph): mds: preserve non-auth/unlinked objects until slave commit
- The MDS should not trim objects in non-auth subtree immediately after
replaying a slave rename. Because the slave ren...
- 02:17 AM Revision 844cd46c (ceph): mds: fix slave rename rollback
- The main issue of old slave rename rollback code is that it assumes
all affected objects are in the cache. The assump...
- 02:17 AM Revision a42a9187 (ceph): mds: split reslove into two sub-stages
- The resolve stage serves to disambiguate the fate of uncommitted slave
updates and resolve subtrees authority. The MD...
- 02:17 AM Revision 3a66656b (ceph): mds: send resolve messages after all MDS reach resolve stage
- Current code sends resolve messages when resolving MDS set changes.
There is no need to send resolve messages when so...
- 02:17 AM Revision 85294a59 (ceph): mds: always use {push,pop}_projected_linkage to change linkage
- Current code skips using {push,pop}_projected_linkage to modify replica
dentry's linkage. This confuses EMetaBlob::ad...
- 02:17 AM Revision e0aa64d0 (ceph): mds: don't replace existing slave request
- The MDS may receive a client request, but find there is an existing
slave request. It means other MDS is handling the...
- 02:17 AM Revision baa6bd6b (ceph): mds: fix for MDCache::adjust_bounded_subtree_auth
- After swallowing extra subtrees, subtree bounds may change, so it
should re-check.
Signed-off-by: Yan, Zheng <zheng....
- 02:17 AM Revision c9ff21a9 (ceph): mds: fix "had dentry linked to wrong inode" warning
- The reason of "had dentry linked to wrong inode" warning is that
Server::_rename_prepare() adds the destdir to the EM...
- 02:17 AM Revision ce431eb5 (ceph): mds: splits rename force journal check into separate function
- the function will be used by later patch that fixes rename rollback
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
- 02:17 AM Revision fb497135 (ceph): mds: force journal straydn for rename if necessary
- rename may overwrite an empty directory inode and move it into stray
directory. MDS who has auth subtree beneath the ...
- 02:17 AM Revision cd8d9107 (ceph): mds: don't set xlocks on dentries done when early reply rename
- _rename_finish() does not send dentry link/unlink message to replicas.
We should prevent dentries that are modified b...
- 02:15 AM Revision 87d85fa2 (ceph): Merge remote-tracking branch 'gh/next'
- 01:51 AM Revision e58fe519 (ceph): Merge branch 'master' of https://github.com/ceph/ceph
- 01:50 AM Revision b429a3a3 (ceph): doc: Updated to add indep and first n to chooseleaf. Num only used with...
- fixes: #3711
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 01:31 AM Revision f41010c4 (ceph): rgw: fix crash when missing content-type in POST object
- Fixes: #3941
This fixes a crash when handling S3 POST request and content type
is not provided.
Signed-off-by: Yehud...
- 01:22 AM Revision 26988038 (ceph): Merge branch 'wip-osd-down-out'
- Reviewed-by: Samuel Just <sam.just@inktank.com>
- 01:14 AM Revision 09522e5a (ceph): rgw: fix crash when missing content-type in POST object
- Fixes: #3941
This fixes a crash when handling S3 POST request and content type
is not provided.
Signed-off-by: Yehud...
- 01:13 AM Revision 75f6ba56 (ceph): crush: implement get_children(), get_immediate_parent_id()
- Signed-off-by: Sage Weil <sage@inktank.com>
- 01:13 AM Revision 2b8ba7ca (ceph): osdmap: implement subtree_is_down() and containing_subtree_is_down()
- Implement two methods to see if an entire subtree is down, and if the
containing parent node of type T of a given node...
- 01:13 AM Revision b955a599 (ceph): mon: set limit so that we do not an entire down subtree out
- Add new configurable 'mon osd down out subtree limit' so that you can
prevent marking out an entire subtree. If for ...
- 01:12 AM Revision 2efdfb41 (ceph): mon: Elector: reset the acked leader when the election finishes and we ...
- Failure to do so will mean that we will always ack the same leader during
an election started by another monitor. Th...
- 01:12 AM Revision 428ddb7d (ceph): Merge remote-tracking branch 'gh/wip-timecheck
- Reviewed-by: Sage Weil <sage@inktank.com>
- 12:58 AM Revision 81ed1bc7 (ceph): rados: add pool_ops workunit to cephtool test
- 12:53 AM Revision c79f7c6c (ceph): Merge branch 'wip-pool-delete'
- Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
- 12:52 AM Revision 97b78924 (ceph): doc: update ceph man page link
- It's not the wiki anymore, and the man page needed to be regenerated.
Signed-off-by: Josh Durgin <josh.durgin@inktan...
- 12:52 AM Revision 91a0bc89 (ceph): ceph, rados: update pool delete docs and usage
- Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
01/28/2013
- 11:25 PM Revision 1a6197a7 (ceph): qa: fix mon pool_ops workunit
- Use ! for clarity when commands are supposed to fail.
Check a few other cases that should fail, and correct deleting
...
- 10:54 PM Revision 826e5860 (ceph): cram: fix for runs with coverage enabled
- Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
- 10:50 PM Bug #3948 (Resolved): problems from leveldb static linkage and leveldb downgrade
- Two days ago I upgraded one of my osds to 0.48.3 (see http://tracker.ceph.com/issues/3797) and everything worked fine...
- 09:56 PM Revision 014fc6d6 (ceph): utime: fix narrowing conversion compiler warning in sleep()
- Fix compiler warning:
./include/utime.h: In member function 'void utime_t::sleep()':
./include/utime.h:139:50: warnin...
- 09:56 PM Revision fb85c7f6 (ceph): rbd: don't ignore return value of system()
- Check for the return value of system() and handle the error if needed
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bi...
- 09:56 PM Revision f74265b0 (ceph): configure: fix check for fuse_getgroups()
- Check for fuse_getgroups() only in case we have found libfuse already.
Moved the check to the check for --with-fuse.
...
- 09:56 PM Revision 21673e8b (ceph): rbd-fuse: fix usage of conn->want
- Fix usage of conn->want and FUSE_CAP_BIG_WRITES. Both need libfuse
version >= 2.8. Encapsulate the related code line ...
- 09:56 PM Revision 818e9a2c (ceph): rbd-fuse: fix printf format for off_t and size_t
- Fix printf format for off_t and size_t to print the same on 32 and 64bit
systems. Use PRI* macros from inttypes.h.
S...
- 09:51 PM Bug #3930 (In Progress): ceph.spec: udev rule for rbd not in rpms
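Revision 818e9a2c above switches rbd-fuse to the PRI* macros from inttypes.h so that off_t and size_t print the same on 32-bit and 64-bit systems; Bug #3961 in this feed reports a related mismatch, a %d conversion applied to a 64-bit integer. A minimal C sketch of the portable pattern follows. The helper format_extent is hypothetical and does not come from rbd-fuse or cls_rbd, and it assumes the caller casts off_t to int64_t before the call, since the width of off_t varies by platform and build flags.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only: format an offset/length pair
 * identically on 32-bit and 64-bit builds. Returns what snprintf returns.
 */
int format_extent(char *buf, size_t buflen, int64_t offset, size_t length)
{
    assert(buf != NULL);
    /*
     * PRId64 expands to the correct conversion for int64_t on the target
     * ("lld" on typical 32-bit ABIs, "ld" on typical 64-bit ones), and
     * %zu is the C99 conversion for size_t. A bare %d here would read
     * only part of a 64-bit argument, which is undefined behavior.
     */
    return snprintf(buf, buflen, "off=%" PRId64 " len=%zu", offset, length);
}
```

A caller holding an off_t would write format_extent(buf, sizeof buf, (int64_t)off, len); the explicit cast keeps the argument type and the conversion specifier in agreement regardless of how wide off_t is on the build host.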
- 09:50 PM Bug #3945 (In Progress): osd: dynamically link to leveldb
- 04:56 PM Bug #3945 (Resolved): osd: dynamically link to leveldb
- We hit a problem with quantal that underscored the danger of linking statically to libleveldb. After some discussion...
- 09:21 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
- wip-2753-fsync-errors has a patch which makes fsync return an error if the client gets back an error from the Objecte...
- 05:32 AM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
- Looked at this briefly; I see that the way we do fsyncs is attached to a "FIXME: this could starve" comment, and I be...
- 09:18 PM rbd Bug #3947 (Resolved): krbd: read zeroing freed bio?
- This happened to me once before but I wasn't sure what
I did. Now I think I do know. This is with the new
request...
- 08:52 PM Revision 4edef483 (ceph): Merge branch 'wip-java-api'
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Reviewed-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Sage Weil <...
- 08:45 PM Feature #3833: osd: improve recovery throttling
- commit:d6db239ce5134a9c410554fb292c54981375c628
- 08:20 PM Feature #3833: osd: improve recovery throttling
- Commit?
- 07:32 PM Feature #3833 (Resolved): osd: improve recovery throttling
- 07:27 PM Revision 0ded0fdf (ceph): mon: Monitor: rework timecheck code to clarify logic boundaries
- The initial timecheck implementation relied on a cleanup function to
clean the state each time we changed epochs (or ...
- 06:13 PM Revision 3a089420 (ceph): doc: fix rbd create syntax
- --dest-pool does not apply to create. Also remove extraneous
whitespace.
Signed-off-by: Josh Durgin <josh.durgin@ink...
- 06:08 PM RADOS Documentation #3830: crush-map.rst: chooseleaf doesn't include 'firstn|indep', and 'aggregates' i...
- Can we get something moving on this bug, or give it to John to research? (and btw, firstn|indep has
been addressed u...
- 05:20 PM Bug #3906 (Won't Fix): ceph-mon leaks memory during peering
- This isn't something that's worth dealing with on the monitor side right now.
- 05:19 PM Bug #3797 (Duplicate): osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48....
- see #3376
- 04:43 PM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- The conclusion:
- quantal had a newer libleveldb than we built statically into our debs
- downgrading made compac...
- 05:02 PM rbd Bug #3946 (Resolved): rbd fsx failing in nightly
- ...
- 04:49 PM Bug #3905 (Can't reproduce): incomplete & stale (lost?) PGs
- This appears to be something that was triggered and exacerbated by now-fixed issues. Until we can trigger it, I'm in...
- 08:04 AM Bug #3905: incomplete & stale (lost?) PGs
- Due to some other issues and after a chat with Sage, I restarted all of my osds and this disappeared since. So I'm af...
- 04:28 PM Bug #3944 (Resolved): ceph tool should prevent --admin-socket
- Misremembering, I tried several 'ceph --admin-socket' commands rather than 'ceph --admin-daemon'; the result was that...
- 04:12 PM Bug #3810: btrfs corrupts file size on 3.7
- sent a report to linux-btrfs
- 03:41 PM Bug #3810: btrfs corrupts file size on 3.7
- Ok, this looks like a btrfs bug to me. On osd.3, the write extends the file size to 4194304, but the later stat sees...
- 10:59 AM Bug #3810: btrfs corrupts file size on 3.7
- I ran part of the workload and found an inconsistent pg. I've uploaded ceph.log and logs from the primary and second...
- 03:50 PM Documentation #3711: crush-map.rst: choose firstn talks about "N", but does not clearly define wh...
- Things I mentioned in comment 4 are still present; I'd like to either change them or update here why we're not.
- 02:11 PM rbd Bug #3427 (Fix Under Review): krbd: unmap does not remove block device properly
- I have posted two patches for review, the second of which
should fix this problem. I have not actually reproduced
...
- 12:50 PM devops Feature #3479: ceph-deploy: uninstall
- commit:93082e82df56b01c524d0195e20068f6a6c8ca26
- 12:49 PM devops Feature #3910: ceph-deploy: uninstall purge
- ceph-deploy commit:93082e82df56b01c524d0195e20068f6a6c8ca26
- 12:48 PM devops Feature #3341: ceph-disk-activate: Make --mount the default
- I made it autodetect whether to mount or not based on whether you pass a directory or block device in. Simpler all a...
- 12:48 PM rgw Bug #3365: Broken metadata (duplicated as CSV)
- Aaron Schulz wrote:
> Ian Colle wrote:
> > Aaron are you still seeing this?
>
> Sorry I need to get the time to ...
- 10:24 AM Feature #3890 (In Progress): osd: create tool to extract pg info and pg log from filestore
- 10:10 AM rgw Cleanup #3777 (In Progress): rgw: audit code for reading NULL env variables
- reopening, see #3941.
- 10:09 AM rgw Bug #3941: s3tests crash on bobtail
- Yeah, similar to that other issue (#3777)...
- 09:21 AM CephFS Feature #3540 (In Progress): mds: maintain per-file backpointers on first file object
- 02:18 AM Revision 6bd676ea (ceph): mds: fix end check in Server::handle_client_readdir()
- commit 1174dd3188 (don't retry readdir request after issuing caps)
introduced an bug that wrongly marks 'end' in the ...
- 02:18 AM Revision 5176cb71 (ceph): mds: check deleted directory in Server::rdlock_path_xlock_dentry
- Commit b03eab22e4 (mds: forbid creating file in deleted directory)
is not complete, mknod, mkdir and symlink are miss...
- 02:18 AM Revision 919df3bf (ceph): mds: lock remote inode's primary dentry during rename
- commit 1203cd2110 (mds: allow open_remote_ino() to open xlocked dentry)
makes Server::handle_client_rename() xlocks r...
- 02:18 AM Revision 67144973 (ceph): mds: allow journaling multiple root inodes in EMetaBlob
- In some cases (rename, rmdir, subtree map), we may need journal multiple
root inodes (/, mdsdir) in one EMetaBlob. Th...
- 02:18 AM Revision 6daec530 (ceph): mds: introduce XSYN to SYNC lock state transition
- If lock is in XSYN state, Locker::simple_sync() firstly try changing
lock state to EXCL. If it fail to change lock st...
- 02:18 AM Revision 659d1a39 (ceph): mds: properly set error_dentry for discover reply
- If MDCache::handle_discover() receives an 'discover path' request but
can not find the base inode. It should properly...
01/27/2013
- 06:12 PM Revision c5478161 (ceph): mon: Elector: reset the acked leader when the election finishes and we ...
- Failure to do so will mean that we will always ack the same leader during
an election started by another monitor. Th...
- 03:59 PM Bug #3810: btrfs corrupts file size on 3.7
- I can do that, it will take somewhere between 12 and 24 hours to run.
- 03:34 PM Bug #3810: btrfs corrupts file size on 3.7
- Mike, would it be possible to reproduce this with debug file store = 20? That will tell us if what Ceph thinks it di...
- 02:10 PM Bug #3810: btrfs corrupts file size on 3.7
- I deleted the rbd's with inconsistent pg's, recreated the rbd's, ran rsync with the same data set, made sure no btrfs...
- 02:15 PM Revision d74b31b2 (ceph): mon: Monitor: force timecheck cleanup on finish_election()
- Fixes: #3854
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
- 12:58 PM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- Ok, everything still looks good :). Last question: should I upgrade my whole cluster to this version or will a new ar...
- 12:01 PM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- Ok, after around 10 minutes of runtime everything seems normal. Thanks for the fast and great help! :-)
ceph versi... - 12:00 PM Bug #3797 (Fix Under Review): osd takes 100% cpu after upgrading from 0.48.2argonaut to the lates...
- 11:56 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- that fixed it it seems. we could
- update argonaut and bobtail to newer leveldb :/
- link dynamically for quant... - 11:08 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- looks like leveldb spinning on background compaction.
his .2 package is quantals, which is leveldb 1.5.. newer than... - 10:56 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- Output of gdb /usr/bin/ceph-osd $pid, then 'thread apply all bt'
- 10:25 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- Hi Sage, here we go. Is it enough data or do you need more? I didn't disable the logging yet...
- 10:00 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- Hi Corin-
Can you enable 'debug osd = 20' for a bit and attach that log? I think this is related to commit:830b8f... - 08:31 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- Just another small update - nothing changed so far. The cluster is still healthy, but the osd is still using 100% of ...
- 05:50 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- Just a small update - nothing changed so far. The cluster is still healthy, but the osd is still using 100% of one co...
- 05:06 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- Here's a nice graph to see the difference before/ after upgrade of disk activity....
The cluster is clean, no reco... - 05:03 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- Hi Sage,
sorry for the delay. I just shut down the osd, upgraded it and started it again. It's again using almost 1... - 10:29 AM rgw Bug #3941 (Resolved): s3tests crash on bobtail
- ...
- 09:28 AM Revision f666c617 (ceph): Revert "librbd: ensure header is up to date after initial read"
- Using assert version for linger ops doesn't work with retries,
since the version will change after the first send.
Th... - 09:28 AM Revision 10053b14 (ceph): librbd: establish watch before reading header
- This eliminates a window in which a race could occur when we have an
image open but no watch established. The previou... - 09:28 AM Revision 76f93751 (ceph): rbd: Don't call ProgressContext's finish() if there's an error.
- do_copy was different from the others; call pc.fail() on error and
do not call pc.finish().
Fixes: #3729
Signed-off-... - 09:28 AM Revision a16c6f3d (ceph): rbd: fix bench-write infinite loop
- I/O was continuously submitted as long as there were few enough ops in
flight. If the number of 'threads' was high, or... - 08:58 AM Revision 575a5866 (ceph): os/FileStore: only adjust up op queue for btrfs
- We only need to adjust up the op queue limits during commit for btrfs,
because the snapshot initiation (async create)... - 08:47 AM Revision c9eb1b0a (ceph): common/HeartbeatMap: fix uninitialized variable
- Introduced by me in 132045ce085e8584a3e177af552ee7a5205b13d8. Thank you,
valgrind!
Signed-off-by: Sage Weil <sage@i... - 06:35 AM Revision fa421cf5 (ceph): configure: remove -m4_include(m4/acx_pthread.m4)
- Since we already use AC_CONFIG_MACRO_DIR, there is no need to include
m4/acx_pthread.m4 separately.
Signed-off-by: Danny Al-Gaaf <... - 06:34 AM Revision 32276e9a (ceph): configure: fix RPM_RELEASE
- Use git to get RPM_RELEASE only if this is a git repo
clone and if the git command is available on the system.
Signe... - 04:49 AM Revision 341e6760 (ceph): osdmaptool: fix clitests
- Signed-off-by: Sage Weil <sage@inktank.com>
- 03:33 AM Revision 54c392e0 (ceph): osd: dump/display pool min_size
- Signed-off-by: Sage Weil <sage@inktank.com>
- 01:24 AM devops Feature #3479 (Resolved): ceph-deploy: uninstall
- 01:24 AM devops Feature #3910 (Resolved): ceph-deploy: uninstall purge
01/26/2013
- 09:46 PM Revision 1ba4c80b (ceph): qa/workunits/rbd/copy.sh: use non-deprecated --image-format option
- --format is deprecated.
Signed-off-by: Sage Weil <sage@inktank.com> - 09:45 PM Revision bbb86ec7 (ceph): mon: safety interlock for pool deletion
- Require that the pool name be passed twice along with a force option
before we irreversibly delete an entire pool of... - 09:26 PM Revision 700bcede (ceph): Revert "mon: implement safety interlock for deleting pools"
- This reverts commit c993ac9b1fa4037f4cc2674455728ee38a7c978b.
This is too hard to test. Requiring the pool name twi... - 09:18 PM Revision 6c407943 (ceph): Added libexpat dependency
- 09:13 PM Revision b5f81636 (ceph): osdthrasher: inject pause on a live (on in) osd
- 08:58 PM devops Feature #3917 (Fix Under Review): ceph-dir-prepare command
- 08:58 PM devops Feature #3915 (Rejected): ceph-disk-prepare: support sysvinit or upstart
- init system is a property of the host, not the disk.. doesn't belong in ceph-disk-prepare.
- 08:57 PM devops Feature #3911 (Fix Under Review): sysvinit: allow daemon enumeration via dirs
- 08:57 PM devops Feature #3914 (Fix Under Review): ceph-disk-activate: support sysvinit
- 08:54 PM devops Feature #3341 (Rejected): ceph-disk-activate: Make --mount the default
- 08:53 PM devops Bug #3898 (Resolved): ceph-deploy: problems with >1 mon
- ceph-deploy commit:8067dd0afa19ff7b7ca75f984dedc4213d3a4be8
- 05:21 PM rgw Bug #3365: Broken metadata (duplicated as CSV)
- Ian Colle wrote:
> Aaron are you still seeing this?
Sorry I need to get the time to try and reproduce this (and o... - 12:44 PM rbd Bug #3937 (Fix Under Review): krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
- A patch resolving this has been posted for review.
[PATCH 4/4] rbd: don't drop watch requests on completion - 12:43 PM rbd Bug #3940 (Fix Under Review): krbd: decrement obj request count when deleting
- A patch resolving this has been posted for review.
- 08:05 AM rbd Bug #3940 (Resolved): krbd: decrement obj request count when deleting
- The obj_request_count value keeps track of how many object requests
are associated with an image request. It is inc... - 07:57 AM rbd Bug #3939 (Duplicate): krbd: circular locking report in sysfs code
- I intended to write this up before but don't think I did.
I'm getting a "possible circular locking dependency detect... - 05:27 AM Revision 7daf3724 (ceph): rbd-fuse: Original code from Andreas Bluemle
- Signed-off-by: Andreas Bluemle <andreas.bluemle@itxperts.de>
- 05:27 AM Revision 2a6dcabf (ceph): rbd-fuse: add simple RBD FUSE client
- Currently written in C on FUSE high-level interfaces, so error reporting
could be better. No serious work done for per... - 05:25 AM Revision aec2a474 (ceph): s3/php: update to 1.5? version of API
- Something like v1.5 of the Amazon PHP library requires the AmazonS3
constructor to be given an array of parameters ra... - 02:07 AM Revision b2a473be (ceph): workunit for iogen
- Signed-off-by: tamil <tamil.muthamizhan@inktank.com>
- 01:59 AM Revision b98da75a (ceph): Merge branch 'wip-osd-msgr'
- Reviewed-by: Samuel Just <sam.just@inktank.com>
- 01:58 AM Revision 17cd549a (ceph): mon: Monitor: timecheck: only output report to dout once
- Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com> - 01:56 AM Revision 13fb1726 (ceph): mon: Monitor: track timecheck round state and report on health
- Fixes: #3854
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com> - 01:56 AM Revision aa85d914 (ceph): task: mon_clock_skew_check: increase timeout and kick it off only on stop
- We were kicking-off the timeout as soon as we started; it's better however
to kick it off only when we are told to st... - 01:56 AM Revision 673101c7 (ceph): task: mon_clock_skew_check: distinguish between on-going and finished c...
- Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
- 01:24 AM Revision e6bceeed (ceph): sharedptr_registry: remove extaneous Mutex::Locker declaration
- For some reason, the lookup() retry loop (for when we happened to
race with a removal and grab an invalid WeakPtr) locke... - 01:24 AM Revision 60888caf (ceph): FileStore: ping TPHandle after each operation in _do_transactions
- Each completed operation in the transaction proves thread
liveness; a stuck thread should still trigger the timeouts.... - 01:24 AM Revision 6b8a673f (ceph): OSD: use TPHandle in peering_wq
- Implement _process overload with TPHandle argument and use
that to ping the hb map between pgs and between map epochs... - 01:24 AM Revision aa6d20aa (ceph): WorkQueue: add TPHandle to allow _process to ping the hb map
- Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 4f653d23999b24fc8c65a5... - 01:23 AM Revision e66a7505 (ceph): ReplicatedPG: handle omap > max_recovery_chunk
- span_of fails if len == 0.
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from c... - 01:23 AM Revision 44f0407a (ceph): ReplicatedPG: correctly handle omap key larger than max chunk
- Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit c3dec3e30a85ecad0090c7... - 01:23 AM Revision 50fd6ac9 (ceph): ReplicatedPG: start scanning omap at omap_recovered_to
- Previously, we started scanning omap after omap_recovered_to.
This is a problem since the break in the loop implies t... - 01:23 AM Revision 4b32eecb (ceph): ReplicatedPG: don't finish_recovery_op until the transaction completes
- Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 62a4b96831c1726043699db86a664dc6a0af8637) - 01:23 AM Revision da34c77b (ceph): ReplicatedPG: ack push only after transaction has completed
- Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 20278c4f77b890d5b2b95d2ccbeb4fbe106667ac) - 01:23 AM Revision f9381c74 (ceph): ObjectStore: add queue_transactions with oncomplete
- Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 4d6ba06309b80fb21de7bb5d12d5482e71de5f16) - 01:22 AM Revision e2560554 (ceph): common/HeartbeatMap: inject unhealthy heartbeat for N seconds
- This lets us test code that is triggered by an unhealthy heartbeat in a
generic way.
Signed-off-by: Sage Weil <sage@... - 01:22 AM Revision cbe8b5bc (ceph): os/FileStore: add stall injection into filestore op queue
- Allow admin to artificially induce a stall in the op queue. Forces the
thread(s) to sleep for N seconds. We pause f... - 01:22 AM Revision beb6ca44 (ceph): osd: do not join cluster if not healthy
- If our internal heartbeats are failing, do not send a boot message and try
to join the cluster.
Signed-off-by: Sage ... - 01:22 AM Revision 1ecdfca3 (ceph): osd: hold lock while calling start_boot on startup
- This probably doesn't strictly matter because start_boot doesn't need the
lock (currently) and few other threads shou... - 01:22 AM Bug #3938 (Can't reproduce): ceph-mon crashed on mixed bobtail-argonaut cluster (2 argonaut mons,...
- 7:09:03.310220 7f652087e700 1 mon.a@1(peon).osd e72 e72: 20 osds: 20 up, 20 in ...
- 01:21 AM Revision e120bf20 (ceph): osd: do not reply to ping if internal heartbeat is not healthy
- If we find that our internal threads are stalled, do not reply to ping
requests. If we do this long enough, peers wi... - 01:21 AM Revision 5f396e2b (ceph): osd: reduce op thread heartbeat default 30 -> 15 seconds
- If the thread stalls for 15 seconds, let our internal heartbeat fail.
This will let us internally respond more quickl... - 01:17 AM Revision fca288b7 (ceph): osd: improve sub_op flag points
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 73a969366c8bbd105579611320c43e2334907fef) - 01:17 AM Revision f13ddc8a (ceph): osd: refactor ReplicatedPG::do_sub_op
- PULL is the only case where we don't wait for active.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked fro... - 01:17 AM Revision d5e00f96 (ceph): osd: make last state for slow requests more informative
- Report on the last event string, and pass in important context for the
op event list, including:
- which peers were... - 01:17 AM Revision ab3a110c (ceph): osd: dump op priority queue state via admin socket
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 24d0d7eb0165c8b8f923f2d8896b156bfb5e0e60) - 01:17 AM Revision 43a65d04 (ceph): osd: simplify asok to single callback
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 33efe32151e04beaafd9435d7f86dc2eb046214d) - 01:16 AM Revision d0407986 (ceph): common/PrioritizedQueue: dump state to Formatter
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 514af15e95604bd241d2a98a97b938889c6876db) - 01:16 AM Revision 691fd505 (ceph): common/PrioritizedQueue: add min cost, max tokens per bucket
- Two problems.
First, we need to cap the tokens per bucket. Otherwise, a stream of
items at one priority over time w... - 01:16 AM Revision a2b03fe0 (ceph): common/PrioritizedQueue: buckets -> tokens
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit c549a0cf6fae78c8418a3b4b0702fd8a1e4ce482) - 01:16 AM Revision 612d75cd (ceph): note puller's max chunk in pull requests
- this lets us calculate a cost value
(cherry picked from commit 128fcfcac7d3fb66ca2c799df521591a98b82e05) - 01:16 AM Revision 2224e413 (ceph): osd: add OpRequest flag point when commit is sent
- With writeahead journaling in particular, we can get requests that
stay in the queue for a long time even after the c... - 01:16 AM Revision 5b5ca592 (ceph): osd: set PULL subop cost to size of requested data
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit a1bf8220e545f29b83d965f07b1abfbea06238b3) - 01:16 AM Revision 10651e4f (ceph): osd: use Message::get_cost() function for queueing
- The data payload is a decent proxy for cost in most cases, but not all.
Signed-off-by: Sage Weil <sage@inktank.com>
... - 01:16 AM Revision 9735c6b1 (ceph): osd: debug msg prio, cost, latency
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit bec96a234c160bebd9fd295df5b431dc70a2cfb3) - 01:15 AM Revision c48279da (ceph): filestore: filestore_queue_max_ops 500 -> 50
- Having a deep queue limits the effectiveness of the priority queues
above by adding additional latency.
Signed-off-b... - 01:15 AM Revision f47b2e8b (ceph): osd: target transaction size 300 -> 30
- Small transactions make pg removal nicer to the op queue. It also slows
down PG deletion a bit, which may exacerbate... - 01:15 AM Revision 4947f0ef (ceph): os/FileStore: allow filestore_queue_max_{ops,bytes} to be adjusted at r...
- The 'committing' ones too.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit cfe4b8519363f92f84... - 01:14 AM Revision ad6e6c91 (ceph): osd: make osd_max_backfills dynamically adjustable
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 101955a6b8bfdf91f4229f4ecb5d5b3da096e160) - 01:14 AM Revision 939b1855 (ceph): osd: make OSD a config observer
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 9230c863b3dc2bdda12c23202682a84c48f070a1)
Con... - 12:16 AM Revision b49440bc (ceph): doc: Added new, more comprehensive OSD/PG monitoring doc.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 12:15 AM Revision 5f210505 (ceph): doc: Trimmed some detail and added a x-ref to detailed osd/pg monitorin...
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 12:14 AM Revision 95cfdd46 (ceph): doc: Added osd/pg monitoring section to the index.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 12:14 AM Revision d36a208c (ceph): doc: Added x-ref links.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
01/25/2013
- 10:25 PM Revision 89386856 (ceph): Merge branch 'master' of https://github.com/ceph/ceph
- 10:24 PM Revision 1af3578e (ceph): doc: fixed description for pg in control section.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 08:48 PM Revision 248835d4 (ceph): doc: wider sidebar, larger font, cleaned tip CSS
- The sidebar is now about a hundred pixels wider and the fonts
are larger throughout. This works a lot better when yo... - 08:16 PM Linux kernel client Bug #3860: rbd: problems if watch setup returns ERANGE
- Just to close this out...
The fix (not repeating on ERANGE) has been committed:
commit c04306471ad93f1daf60771a... - 06:27 AM Linux kernel client Bug #3860: rbd: problems if watch setup returns ERANGE
- Josh rejected this. But since he said that the
change I proposed--to not do the loop--was OK
I suggest this bug sh... - 07:41 PM Revision 037900dc (ceph): sharedptr_registry: remove extaneous Mutex::Locker declaration
- For some reason, the lookup() retry loop (for when we happened to
race with a removal and grab an invalid WeakPtr) locke... - 06:54 PM Revision 8bd306b9 (ceph): doc: Added Subdomain section.
- fixes: #3778
Signed-off-by: John Wilkins <john.wilkins@inktank.com> - 05:40 PM Revision 8fef6fa3 (ceph): osd/PG: include map epoch in query results
- Currently you can only infer it from the info.history.* fields.
Signed-off-by: Sage Weil <sage@inktank.com> - 05:38 PM Revision e359a862 (ceph): osd: kill unused addr-based send_map()
- Not used, old API, bad.
Signed-off-by: Sage Weil <sage@inktank.com> - 05:38 PM Revision 5e2fab54 (ceph): osd: share incoming maps via Connection*, not addrs
- Kill a set of parallel methods that are using the old addr/inst-based
msgr APIs, and instead use Connection handles. ... - 05:38 PM Revision 1bc419a7 (ceph): osd: pass new maps to dead osds via existing Connection
- Previously we were sending these maps to dead osds via their old addrs
using a new outgoing connection and setting th... - 05:38 PM Revision 76705ace (ceph): osd: requeue osdmaps on heartbeat connections for cluster connection
- If we receive an OSDMap on the cluster connection, requeue it for the
cluster messenger, and process it there where w... - 05:38 PM Revision a7059eb3 (ceph): msgr: add get_loopback_connection() method
- Return the Connection* for ourselves, so we can queue messages for
ourselves.
Signed-off-by: Sage Weil <sage@inktank... - 05:38 PM CephFS Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
- I will be able to reproduce this after Feb 8. Will do so if nobody reproduces it before then.
- 04:33 PM CephFS Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
- please set 'debug mds = 10' and upload mds log. To minimize mds log size, please truncate the mds log before executin...
- 09:54 AM CephFS Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
- I made a mistake in the initial post: the number of files in the directory is 3.5K, not 35K. It's my netflow for last years, ...
- 12:11 AM CephFS Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
- At #3936 I'm providing some benchmarks to show that IOPS/speed is OK for my installation and my hands are not perform...
- 04:25 PM Documentation #3222 (Resolved): DOC: Get an Object from a Primary OSD
- Added a full exercise toward the end here: http://ceph.com/docs/master/rados/operations/monitoring-osd-pg/
- 08:50 AM Documentation #3222 (In Progress): DOC: Get an Object from a Primary OSD
- 04:24 PM Documentation #3333 (Resolved): doc: Explain "degraded" more
- More extensive discussion here: http://ceph.com/docs/master/rados/operations/monitoring-osd-pg/
- 04:24 PM Documentation #3331 (Resolved): doc: Where is my data placed?
- Provided an entire exercise toward the end of this document: http://ceph.com/docs/master/rados/operations/monitoring-...
- 04:22 PM Documentation #3320 (Resolved): doc: What persistency does Ceph guarantee
- Added more extensive discussions.
Here: http://ceph.com/docs/master/rados/operations/monitoring-osd-pg/ and
Her... - 03:25 PM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
- OK, with Josh's help I finally managed to reproduce the
problem intentionally to check my fix.
I'm building it no... - 11:11 AM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
- I have confirmed that every time a request registered to linger
is re-submitted the osd client will call the callbac... - 08:07 AM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
- I've decoded the osd request that's been provided to
rbd_osd_req_callback(). Its contents look completely
legitima... - 06:54 AM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
- Adding two things:
- this occurred during test 190 of the third consecutive pass
of xfstests with this in the teuth... - 05:04 AM rbd Bug #3937 (Resolved): krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
- Looking at a crash this morning in the new request code due
to this failed assertion in rbd_osd_req_callback():
... - 03:14 PM rgw Bug #3620: rgw:improve multiple user access keys scalability
- 01:51 PM Subtask #3840: osd: ack push after apply+commit
- 01:50 PM Feature #3833: osd: improve recovery throttling
- 11:48 AM Bug #3836: osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
- pushed to master, still need to backport
- 11:40 AM Bug #3836 (Fix Under Review): osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
- D'oh. sharedptr_registry.hpp has an extraneous Mutex::Locker l(lock) declaration in the retry loop. It only actually...
- 11:41 AM Documentation #3711 (Resolved): crush-map.rst: choose firstn talks about "N", but does not clearl...
- 11:40 AM Documentation #3390 (Resolved): doc: add detail on different bucket algorithms
- 11:12 AM rgw Feature #3669 (In Progress): rgw: support acl grants through http headers
- 11:09 AM rgw Cleanup #3777 (Resolved): rgw: audit code for reading NULL env variables
- Merged into master, commit: b3a2e7e955547a863d29566aab62bcc480e27a65
- 11:07 AM rgw Feature #3667 (In Progress): rgw: support extra canned acl params
- 10:55 AM Bug #3928 (Resolved): osd: peering workqueue tryings to advance through *all* past osdmaps in one...
- The timeout should be fixed by e0511f4f4773766d04e845af2d079f82f3177cb6.
- 10:55 AM rgw Bug #3778 (Resolved): document procedure for enabling subdomain S3 api calls
- Added info for subdomain call.
- 10:33 AM rgw Bug #3778 (In Progress): document procedure for enabling subdomain S3 api calls
- 09:54 AM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
- It's pretty likely that this is a server-side behavior rather than a client-side one. Keep that in mind when reproduc...
- 12:00 AM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
- rados -p rbd bench 120 write -t 16
shows about 90-110 MB/sec. - 09:52 AM rbd Bug #3654 (Resolved): libvirt: colons in ipv6 monitor addresses are not escaped when sent to qemu
- Upstream commit c1509ab47edf61e9f20d11922526b9fca518d238
- 09:34 AM rbd Bug #3927: krbd: I/O errors (ENXIO) during rbd/kernel.sh workunit
- Yes, the ENXIO is expected. Assuming it's being propagated out to dd, and the test passes (outputs OK at the end of k...
- 05:55 AM rbd Bug #3427: krbd: unmap does not remove block device properly
- We had some discussion about whether an atomic bit
operation for this was sufficient, or whether a memory
barri... - 05:48 AM Revision a6ed62e3 (ceph): common: fix cli tests on usage
- Signed-off-by: Sage Weil <sage@inktank.com>
- 05:06 AM Revision 38871e27 (ceph): os/FileStore: only adjust up op queue for btrfs
- We only need to adjust up the op queue limits during commit for btrfs,
because the snapshot initiation (async create)... - 05:06 AM Revision 5f9ab930 (ceph): Revert "filestore: disable extra committing queue allowance"
- This reverts commit 44dca5c8c5058acf9bc391303dc77893793ce0be.
The allowance is not only added for btrfs as of commit... - 05:00 AM Revision d95b4313 (ceph): adminops.rst: revert changes for as-yet-unimplemented features
- See wip-admin-api for the new specification
Fixes: #3724
Signed-off-by: Dan Mick <dan.mick@inktank.com> - 04:40 AM CephFS Bug #1878: ceph.ko doesn't setattr (lchown, utimes) on symlinks
- Heh. Funny markup. The numbered list came out of #s used for comments.
Anyway, I've just verified that the issue... - 04:34 AM CephFS Bug #1878: ceph.ko doesn't setattr (lchown, utimes) on symlinks
- I've just verified that the problem is still present in 3.7.3, and I have a much simpler reproducer too.
mount -t ... - 03:43 AM Revision bb860e49 (ceph): rados: remove unused "check_stdio" parameter
- Signed-off-by: Dan Mick <dan.mick@inktank.com>
- 02:05 AM Bug #3810: btrfs corrupts file size on 3.7
- Kernel was 3.7.1
Ran btrfsck on the partitions when the error first occurred with nothing found.
Tried your fix o... - 01:54 AM Revision 234becd3 (ceph): rados: obey op_size for 'get'
- Otherwise we try to read the whole object in one go, which doesn't bode
well for large objects (either non-optimal or... - 01:31 AM Revision 3a5c70b8 (ceph): ceph_manager: turn long stall injection off by default
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 01:25 AM Revision 4f653d23 (ceph): WorkQueue: add TPHandle to allow _process to ping the hb map
- Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com> - 01:25 AM Revision e0511f4f (ceph): OSD: use TPHandle in peering_wq
- Implement _process overload with TPHandle argument and use
that to ping the hb map between pgs and between map epochs... - 01:25 AM Revision 0c1cc687 (ceph): FileStore: ping TPHandle after each operation in _do_transactions
- Each completed operation in the transaction proves thread
liveness; a stuck thread should still trigger the timeouts.... - 12:24 AM Revision 006e7065 (ceph): osd_recovery: fix up incomplete test
- - stop rados bench from cleaning up
- flush pg stats
- fix sleep call
One or more of these helped fix this test, don... - 12:23 AM Revision 20af01f2 (ceph): ceph_manager: fix get_num_active_recovered()
- The states now have 'backfill' *or* 'recover' in them.
- 12:20 AM Revision 79d599cf (ceph): java: remove extra whitespace
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
01/24/2013
- 11:59 PM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
- I also tried to do:
dd if=/dev/rbd/rbd/test of=/dev/null bs=4M - the same situation.
- 11:57 PM rbd Bug #3936 (Rejected): rbd: Strange dd speed behaviour (server side issue?)
- I have a 3-node/15-osd installation (5 osds on each node), each osd on a separate drive (with SSD cache), journal in RAMFS. XFS as ba...
- 11:46 PM CephFS Bug #3935 (Can't reproduce): kclient: Big directory access bugs (multiple), mixed 32- and 64-bit ...
- I have the following directory structure in ceph fs:...
- 11:21 PM Revision b150e8e3 (ceph): workunit: pass java path as env variable
- The libcephfs-java test needs this.
- 11:13 PM Revision 6f0e1137 (ceph): libcephfs-java test: use provided environment
- Signed-off-by: Sage Weil <sage@inktank.com>
- 09:41 PM Bug #3810: btrfs corrupts file size on 3.7
- Bill Kenworthy wrote:
> Version was 55.1 when created and the error occurred, now updated to 56.1 (on gentoo) after ... - 09:36 PM Bug #3810: btrfs corrupts file size on 3.7
- Version was 55.1 when created and the error occurred, now updated to 56.1 (on gentoo) after error
It's organised as 5... - 08:55 PM Bug #3810: btrfs corrupts file size on 3.7
- Bill Kenworthy wrote:
> I have been hit by the same thing ... is there any information you need before I try and fix... - 06:18 PM Bug #3810: btrfs corrupts file size on 3.7
- I have been hit by the same thing ... is there any information you need before I try and fix it further?
I've tried... - 01:35 PM Bug #3810: btrfs corrupts file size on 3.7
- How about this object instead:
2013-01-23 18:41:31.336722 osd.7 149.165.228.11:6800/28046 159 : [ERR] 2.202 osd.0: s... - 01:16 PM Bug #3810: btrfs corrupts file size on 3.7
- the going theory is that this is triggered by btrfs scrub. can we confirm this somehow?
- 11:03 AM Bug #3810: btrfs corrupts file size on 3.7
- Samuel Just wrote:
> I need a dump of the xattrs on the d0c18e1d/605.00000000/head//1 object in pg 1.1d on osd 7 and... - 10:17 AM Bug #3810: btrfs corrupts file size on 3.7
- Additional info, btrfs scrubs were done while the osd's were active which may or may not have had a negative effect. ...
- 09:31 PM Revision 40ae8cea (ceph): common: only show -d, -f options for daemons
- Fixes: #3073
Signed-off-by: Sage Weil <sage@inktank.com> - 09:13 PM Revision 7e7130da (ceph): doc: Syntax fixes.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 08:58 PM Revision b51bfdf0 (ceph): doc: Updated usage for Bobtail.
- fixes: #3831
Signed-off-by: John Wilkins <john.wilkins@inktank.com> - 08:57 PM Revision 1d71d052 (ceph): doc: Updated usage for Bobtail.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 08:55 PM rgw Bug #3724 (Resolved): docs refer to non-implemented features of the radosgw-admin rest api
- commit d95b4313de1614fd85265879e6d7ddadd5268af2
- 08:45 PM rgw Bug #3724: docs refer to non-implemented features of the radosgw-admin rest api
- Since the docs are in wip-admin-api, this amounts to rolling doc/radosgw/admin/adminops.rst back to its state as of 0...
- 01:41 PM rgw Bug #3724: docs refer to non-implemented features of the radosgw-admin rest api
- 01:38 PM rgw Bug #3724: docs refer to non-implemented features of the radosgw-admin rest api
- John - any update?
- 08:54 PM Revision b0a5fe94 (ceph): java: support ceph_get_file_pool_name
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
- 08:50 PM Revision 6a859bcd (ceph): ceph_manager: use 80/70 as pause_long, pause_check_after defaults
- OSD::op_tp suicides after 150.
Signed-off-by: Samuel Just <sam.just@inktank.com> - 08:47 PM Revision 6b272e0f (ceph): Merge branch 'master' of https://github.com/ceph/ceph
- 08:46 PM Revision 42d92b73 (ceph): doc: Added example of ext4 user_xattr mount option.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 08:43 PM Bug #3885: osd: osd-recovery-incomplete qa test failing
- (the above commit is in the teuthology code)
- 04:28 PM Bug #3885 (Resolved): osd: osd-recovery-incomplete qa test failing
- fixed, mostly by commit:20af01f23ba932cb97cb40bba89bff546e10c461, which may fix up some of the other spurious failure...
- 11:13 AM Bug #3885 (In Progress): osd: osd-recovery-incomplete qa test failing
- 08:30 PM Revision b3a2e7e9 (ceph): rgw_rest: Make fallback uri configurable.
- Some HTTP servers, notably lighttpd, do not set SCRIPT_URI; make the fallback
string configurable.
Signed-off-by: ca... - 08:29 PM Revision b0f27a8f (ceph): librbd: Allow get_lock_info to fail
- If the lock class isn't present, EOPNOTSUPP is returned for lock calls
on newer OSDs, but sadly EIO on older; we need... - 07:33 PM Revision 0c6d5a9d (ceph): java: support fchmod
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
- 07:33 PM Revision 9cefa969 (ceph): java: add missing chmod unmounted test
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
- 07:33 PM Revision 487bacdb (ceph): java: fix exception name typo
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
- 07:33 PM Revision 352652b6 (ceph): libcephfs: document ERANGE rv for get_file_pool_name
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
- 07:27 PM Revision 4b3bcb92 (ceph): java: support stat()
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
- 06:52 PM Revision 00cfe1d3 (ceph): common/HeartbeatMap: fix uninitialized variable
- Introduced by me in 132045ce085e8584a3e177af552ee7a5205b13d8. Thank you,
valgrind!
Signed-off-by: Sage Weil <sage@i... - 06:41 PM Revision b9f58baa (ceph): libcephfs-java test: jar files are in /usr/local/share/java, it seems
- Signed-off-by: Sage Weil <sage@inktank.com>
- 06:35 PM Revision f9f31aae (ceph): wireshark: fix indention
- Fix indention.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de> - 06:35 PM Revision 3e9cc0d4 (ceph): wireshark: fix guint64 print format handling
- Use G_GUINT64_FORMAT to handle print format of guint64 correctly.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect... - 06:08 PM Revision 0f24dca2 (ceph): ceph_manager: use do_rados for rmpool
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 04:54 PM devops Bug #3934: ceph-deploy new should require at least one host name
- If no hosts are specified on the command line, a ceph.conf file is created without any monitors listed. No errors or...
- 04:51 PM devops Bug #3934 (Resolved): ceph-deploy new should require at least one host name
- 04:14 PM devops Bug #3933: ceph-deploy gatherkeys silently fails if no host is specified
- If no host is specified and ceph.conf exists gatherkeys will fail, but not report any error.
- 04:12 PM devops Bug #3933 (Resolved): ceph-deploy gatherkeys silently fails if no host is specified
- 02:52 PM Bug #3930 (Resolved): ceph.spec: udev rule for rbd not in rpms
- The udev rule for kernel rbd (udev/50-rbd.rules in ceph.git) should be packaged. It's already in the debs: debian/lib...
- 01:41 PM rgw Bug #3778: document procedure for enabling subdomain S3 api calls
- 01:39 PM rgw Bug #3778: document procedure for enabling subdomain S3 api calls
- Any update?
- 01:41 PM rgw Bug #3450: WRITE permission only doesn't allow proper multi-part upload
- 01:33 PM rgw Bug #3450: WRITE permission only doesn't allow proper multi-part upload
- Needs to be part of larger overall discussion about the intent of subusers.
- 01:41 PM rgw Bug #3706: rgw functional test testSlashInName failed in nightly
- 01:38 PM rgw Bug #3706: rgw functional test testSlashInName failed in nightly
- Need to see if happens again and then find reproducer.
- 01:41 PM rgw Feature #2804: rgw: disallow running multiple gateways on the same fastcgi socket
- 01:41 PM rgw Feature #3074: radosgw needs --help support
- 01:41 PM rgw Bug #2366: rgw: bucket index update rely on pg state
- 01:41 PM rgw Bug #2650: rgw: swift key creation overrides subuser access mask
- 01:41 PM rgw Bug #1777: rgw: user info modification is not atomic
- 01:41 PM rgw Bug #1779: rgw: swift auth returns wrong error code when unexisting user is given
- 01:14 PM rgw Bug #1779: rgw: swift auth returns wrong error code when unexisting user is given
- Work in course with other swift changes, but not a driver.
- 01:40 PM rgw Feature #3366: rgw: dr: define management api
- Caleb to get out updated document for review.
- 01:37 PM rgw Bug #3620: rgw:improve multiple user access keys scalability
- Caleb to review.
- 01:36 PM rgw Bug #3682 (Resolved): valgrind errors seen when running rgw tests in nightlies
- Increased time in tests and has not occurred.
- 01:35 PM rgw Bug #3628 (Resolved): rgw: leak of object parts on partial upload
- Fixed in bobtail
- 01:34 PM rgw Bug #3485 (In Progress): rgw: unique user emails not enforced
- 01:34 PM Bug #3906: ceph-mon leaks memory during peering
- the logs indicate this may be related to failed auth connection attempts spamming the monitor.
- 11:43 AM Bug #3906: ceph-mon leaks memory during peering
- we need to reproduce this on a large internal cluster, with many osds and even more pgs.
- 09:38 AM Bug #3906: ceph-mon leaks memory during peering
- I believe this to be related to #3609
- 01:32 PM rgw Bug #3073: radosgw-admin: is not a daemon, should not have -d/-f options
- commit:40ae8ceab58b4c05e01dc9f7809728a592cc4f0d actually
- 01:30 PM rgw Bug #3073 (Resolved): radosgw-admin: is not a daemon, should not have -d/-f options
- commit:b878b2c6e9ee41de25faf4dfdd7285dcb01b36e8
- 01:26 PM rgw Bug #3073: radosgw-admin: is not a daemon, should not have -d/-f options
- Change common init
- 01:30 PM rgw Bug #3365: Broken metadata (duplicated as CSV)
- Aaron are you still seeing this?
- 01:29 PM rgw Bug #3365 (Need More Info): Broken metadata (duplicated as CSV)
- 01:21 PM rgw Feature #2490: rgw-admin: only register watch when needed
- Performance improvement.
- 01:21 PM CephFS Bug #1878: ceph.ko doesn't setattr (lchown, utimes) on symlinks
- This is still present in 3.6.11 (I'll know about 3.7.* soon). I suspect this may have to do with failing to mark met...
- 01:18 PM rgw Bug #2482 (Rejected): rgw: duplicate content-length results in 400
- Apache issue.
- 01:14 PM rgw Bug #1906 (Can't reproduce): rgw: total_time isn't logged consistently
- 01:13 PM Documentation #3831 (Resolved): ceph osd crush set command needs correction in the doc
- 01:10 PM rgw Bug #1673: rgw: mod_fastcgi needs to be backward compatible
- 01:10 PM rgw Bug #1673: rgw: mod_fastcgi needs to be backward compatible
- Canonical can not take our changes up stream until we solve this issue.
- 11:16 AM rgw Cleanup #3929 (New): s3-tests: refactor all test_post_* tests
- These tests mostly do the same thing, can be cleaned up, no need to duplicate the same code across all.
- 10:58 AM CephFS Feature #3821 (In Progress): qa: run backuppc as part of qa suite
- Ekapol Rojpiboonphun wrote:
> Just to make sure that I will be on this along the line of what you might already have... - 10:52 AM CephFS Feature #3821: qa: run backuppc as part of qa suite
- Just to make sure that I will be on this along the line of what you might already have in mind. (More details please ...
- 09:56 AM CephFS Feature #3821: qa: run backuppc as part of qa suite
- Download/install backuppc and get it into suite.
- 10:32 AM Bug #3928 (In Progress): osd: peering workqueue trying to advance through *all* past osdmaps in ...
- 10:02 AM Bug #3928 (Resolved): osd: peering workqueue trying to advance through *all* past osdmaps in one...
- 10:10 AM Bug #3905: incomplete & stale (lost?) PGs
- Sounds like a combination of crush map and rules that aren't behaving well together — "incomplete" means the PG doesn...
- 09:42 AM Bug #3801: Cascading OSD failures beginning with common/HeartbeatMap.cc: 78: FAILED assert(0 == "...
- The olog stuff is fixed in bobtail, and won't be backported to argonaut.
I'm not sure what the root cause of the h... - 08:42 AM Bug #3854: mon: clock skew tests failing on master
- Happened again on QA, reopening while testing a new patch.
- 08:15 AM rbd Bug #3927: krbd: I/O errors (ENXIO) during rbd/kernel.sh workunit
- Hey! I just looked at the test, and here's how it ends:
# remove snapshot and detect error from mapped snapshot
... - 08:15 AM rbd Bug #3927: krbd: I/O errors (ENXIO) during rbd/kernel.sh workunit
- This is the relevant portion of the yaml file:
- workunit:
clients:
all:
- rbd/map-unmap.sh
... - 08:09 AM rbd Bug #3927 (Closed): krbd: I/O errors (ENXIO) during rbd/kernel.sh workunit
- I'm seeing ENXIO errors at what I believe to be the rbd/kernel.sh
teuthology workunit while testing the new request co... - 05:49 AM rbd Feature #3926 (Resolved): krbd: use slab allocation for common data structures
- There are some common data structures--like image and object
requests--that are very frequently allocated and would ... - 05:29 AM rbd Bug #3925 (Resolved): krbd: sysfs write lockdep warnings
- ...
- 03:42 AM Revision 2f192eaf (ceph): TestRados expects rollback, not snap_rollback
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 02:50 AM Revision 67c77577 (ceph): PendingReleaseNotes: pool removal cli changes
- Signed-off-by: Sage Weil <sage@inktank.com>
- 02:49 AM Revision 87fe35f6 (ceph): Merge remote-tracking branch 'gh/wip-rm-pool'
- Reviewed-by: Samuel Just <sam.just@inktank.com>
- 02:47 AM Revision 64b9dd08 (ceph): Merge remote-tracking branch 'gh/wip-3832-oc-flushrange'
- Reviewed-by: Sage Weil <sage@inktank.com>
- 02:43 AM Revision 9b56f367 (ceph): Merge remote-tracking branch 'gh/wip_heartbeat'
- 02:40 AM Revision 62579eef (ceph): Merge branch 'wip-osd-hb'
- Reviewed-by: Samuel Just <sam.just@inktank.com>
- 01:44 AM Revision ec5a1455 (ceph): ceph_manager: default chance_down to 0.4
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 01:40 AM Revision 566ae533 (ceph): ceph_manager: add filestore and heartbeat stalls
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 01:22 AM Revision 5d66c9ab (ceph): Use ceph git repo instead of github.
- This code change is so that instead of pulling the tarball of github
which can be unreliable at times it instead uses... - 12:55 AM Revision d6db239c (ceph): Merge remote-tracking branch 'upstream/wip_push_after_complete'
- Reviewed-by: Sage Weil <sage@inktank.com>
01/23/2013
- 10:00 PM devops Feature #3229 (Resolved): Support clean ceph-fuse fstab automounting
- implemented this already; /sbin/mount.fuse.ceph is in bobtail.
- 09:59 PM devops Feature #3924 (Resolved): ceph-deploy: package it
- 09:57 PM devops Feature #3923 (Resolved): ceph-deploy: discover HOST
- somewhat similar to new, except we pull the ceph.conf from a remote host.
- 09:57 PM devops Feature #3922 (Resolved): ceph-deploy: version command
- 09:57 PM devops Feature #3921 (Resolved): ceph-deploy: support RPM-based distros
- 09:57 PM devops Feature #3920 (Resolved): ceph-deploy: support other deb-based distros
- 09:56 PM devops Feature #3919 (Resolved): ceph-deploy: remove upstart dependency
- eliminate whatever remaining upstart dependencies are in ceph-deploy, so that upstart and sysvinit are both viable.
- 09:55 PM devops Feature #3918 (Resolved): ceph-deploy: osd create HOST:DIR[:JOURNAL]
- trigger ceph-dir-prepare instead of ceph-disk-prepare.
- 09:54 PM devops Feature #3917 (Resolved): ceph-dir-prepare command
- ceph-dir-prepare <dir> [journal] or similar
somewhat similar to ceph-disk-prepare, but simpler.
- allocate osd ... - 09:54 PM devops Feature #3916 (Resolved): ceph-disk-activate: non-upstart trigger (udev?)
- 09:53 PM devops Feature #3915 (Rejected): ceph-disk-prepare: support sysvinit or upstart
- 09:53 PM devops Feature #3914 (Resolved): ceph-disk-activate: support sysvinit
- 09:52 PM devops Feature #3913 (Resolved): ceph-deploy: break mon into create/destroy
- 09:52 PM devops Feature #3912 (Resolved): ceph-deploy: break osd into create/destroy
- Actually, we want
ceph-deploy osd prepare HOST:DEV[:JOURNAL]
ceph-deploy osd activate HOST:DEVORDIR
and perh... - 09:52 PM devops Feature #3911 (Resolved): sysvinit: allow daemon enumeration via dirs
- 09:52 PM devops Feature #3910 (Resolved): ceph-deploy: uninstall purge
- 09:52 PM devops Feature #3909 (Resolved): ceph-deploy: update install for bobtail/argonaut urls
- 09:51 PM devops Feature #3907 (Resolved): ceph-deploy: be verbose about what is run and what is done (with -q)
- 08:49 PM Revision 8a97eef1 (ceph): ReplicatedPG: handle omap > max_recovery_chunk
- span_of fails if len == 0.
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com> - 08:35 PM Revision c3dec3e3 (ceph): ReplicatedPG: correctly handle omap key larger than max chunk
- Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com> - 08:15 PM Revision 09c71f2f (ceph): ReplicatedPG: start scanning omap at omap_recovered_to
- Previously, we started scanning omap after omap_recovered_to.
This is a problem since the break in the loop implies t... - 08:10 PM Bug #3904: FAILED assert(want_acting.empty())
- I have a theory:
reset
started
primary
getinfo
got infos
getlog
calc_acting succeeds, choose_acting fails,... - 02:48 PM Bug #3904 (Resolved): FAILED assert(want_acting.empty())
- Ceph 0.56.1 on Ubuntu 12.04, standard ceph.com packages. Multiple OSDs started getting marked down/crashing out, this...
- 07:50 PM Revision 20278c4f (ceph): ReplicatedPG: ack push only after transaction has completed
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 07:50 PM Revision 62a4b968 (ceph): ReplicatedPG: don't finish_recovery_op until the transaction completes
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 07:50 PM Revision 4d6ba063 (ceph): ObjectStore: add queue_transactions with oncomplete
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 06:48 PM CephFS Bug #3832 (Resolved): client: does not observe O_SYNC
- commit:64b9dd088d8f20019d6c1042895676b2ec57077e
- 06:42 PM Feature #3888 (Resolved): osd: stop heartbeating peers when internal heartbeat fails
- 06:42 PM Feature #3888: osd: stop heartbeating peers when internal heartbeat fails
- commit:62579eefba057eea200d8a9a3f6b3d8bca29b8b4
- 06:31 PM Bug #3906 (Won't Fix): ceph-mon leaks memory during peering
- I've done multiple OSD swaps with both 0.55 & 0.56/0.56.1 on a cluster with > 16k PGs. In those, I've noticed multipl...
- 06:27 PM Bug #3905 (Can't reproduce): incomplete & stale (lost?) PGs
- I added a bunch of new OSDs into my Ceph cluster (0.56.1 on Ubuntu 12.04 LTS) about 72h ago. Simultaneously, I marked...
- 05:14 PM Revision a972fd40 (ceph): mds: fix end check in Server::handle_client_readdir()
- commit 1174dd3188 (don't retry readdir request after issuing caps)
introduced a bug that wrongly marks 'end' in the ... - 04:49 PM Revision c061e841 (ceph): rados: safety interlock on 'rmpool' command
- This is a very easy way for a user to do a lot of damage with no way back.
Make sure they mean it.
Signed-off-by: Sa... - 04:40 PM Revision c993ac9b (ceph): mon: implement safety interlock for deleting pools
- This is a very easy way for users to accidentally do a *lot* of damage.
Make it an annoying manual process to actuall... - 02:43 PM Bug #3903 (Resolved): OSDMap::raw_pg_to_pps causes pools to have similar mappings
- The pool should be added in a way to ensure that different pools have independent mappings.
- 02:27 PM Revision 022a5254 (ceph): osd: drop newlines from event descriptions
- These produce extra newlines in the log.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.j... - 02:22 PM Revision ebc93a87 (ceph): OSD: do deep_scrub for repair
- Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
(cherry picked... - 02:22 PM Revision 32527fa3 (ceph): ReplicatedPG: ignore snap link info in scrub if nlinks==0
- nlinks==0 implies that the replica did not send snap link information.
Signed-off-by: Samuel Just <sam.just@inktank.c... - 02:22 PM Revision 13e42265 (ceph): osd/PG: fix osd id in error message on snap collection errors
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 381e25870f26fad144ecc2fb99710498e3a7a1d4) - 02:22 PM Revision e3b6191f (ceph): osd/ReplicatedPG: validate ino when scrubbing snap collections
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 665577a88b98390b9db0f9991836d10ebdd8f4cf) - 02:21 PM Revision 353b7341 (ceph): ReplicatedPG: compare nlinks to snapcolls
- nlinks gives us the number of hardlinks to the object.
nlinks should be 1 + snapcolls.size(). This will allow
us to ... - 02:21 PM Revision 33d5cfc8 (ceph): ReplicatedPG/PG: check snap collections during _scan_list
- During _scan_list check the snapcollections corresponding to the
object_info attr on the object. Report inconsistenc... - 02:21 PM Revision bea783bd (ceph): osd_types: add nlink and snapcolls fields to ScrubMap::object
- Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit b85687475fa2ec74e5429d92ee64eda2051a256c) - 02:21 PM Revision 0c48407b (ceph): PG: move auth replica selection to helper in scrub
- Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 39bc65492af1bf1da481a8ea0a70fe7d0b4b17a3) - 02:21 PM Revision c3433ce6 (ceph): mon: note scrub errors in health summary
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 8e33a8b9e1fef757bbd901d55893e9b84ce6f3fc) - 02:21 PM Revision 90c6edd0 (ceph): osd: fix rescrub after repair
- We were rescrubbing if INCONSISTENT is set, but that is now persistent.
Add a new scrub_after_recovery flag that is r... - 02:21 PM Revision 0696cf57 (ceph): osd: note must_scrub* flags in PG operator<<
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit d56af797f996ac92bf4e0886d416fd358a2aa08e) - 02:21 PM Revision 1541ffe4 (ceph): osd: base INCONSISTENT pg state on persistent scrub errors
- This makes the state persistent across PG peering and OSD restarts.
This has the side-effect that, on recovery, we r... - 02:21 PM Revision 60910125 (ceph): osd: fix scrub scheduling for 0.0
- The initial value for pair<utime_t,pg_t> can match pg 0.0, preventing it
from being manually scrubbed. Fix!
Signed-... - 02:21 PM Revision 0961a3a8 (ceph): osd: note last_clean_scrub_stamp, last_scrub_errors
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 389bed5d338cf32ab14c9fc2abbc7bcc386b8a28) - 02:21 PM Revision 8d823045 (ceph): osd: add num_scrub_errors to object_stat_t
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 2475066c3247774a2ad048a2e32968e47da1b0f5) - 02:20 PM Revision 3a1cd6e0 (ceph): osd: add last_clean_scrub_stamp to pg_stat_t, pg_history_t
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit d738328488de831bf090f23e3fa6d25f6fa819df) - 02:20 PM Revision 7e5a899b (ceph): osd: fix object_stat_sum_t dump signedness
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 6f6a41937f1bd05260a8d70b4c4a58ecadb34a2f) - 02:20 PM Revision e252a313 (ceph): osd: change scrub min/max thresholds
- The previous 'osd scrub min interval' was mostly meaningless and useless.
Meanwhile, the 'osd scrub max interval' wou... - 02:20 PM Revision 33aa64ee (ceph): osd/PG: remove useless osd_scrub_min_interval check
- This was already a no-op: we don't call PG::scrub_sched() unless it has
been osd_scrub_max_interval seconds since we ... - 02:20 PM Revision fdd0c1ec (ceph): osd: move scrub schedule random backoff to separate helper
- Separate this from the load check, which will soon vary depending on the
PG.
Signed-off-by: Sage Weil <sage@inktank.c... - 02:20 PM Revision 9ffbe268 (ceph): osd/PG: trigger scrub via scrub schedule, must_ flags
- When a scrub is requested, flag it and move it to the front of the
scrub schedule instead of immediately queuing it. ... - 02:19 PM Revision cffb1b22 (ceph): osd/PG: introduce flags to indicate explicitly requested scrubs
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 1441095d6babfacd781929e8a54ed2f8a4444467) - 02:19 PM Revision 438e3dfc (ceph): osd/PG: move scrub schedule registration into a helper
- Simplifies callers, and will let us easily modify the decision of when
to schedule the PG for scrub.
Signed-off-by: ... - 01:40 PM Revision acb47e4d (ceph): os/FileStore: only flush inline if write is sufficiently large
- Honor filestore_flush_min in the inline flush case.
Backport: bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
Re... - 01:40 PM Revision 15a1ced8 (ceph): os/FileStore: fix compile when sync_file_range is missing;
- If sync_file_range is not present, we always close inline, and flush
via fdatasync(2).
Fixes compile on ancient plat... - 01:39 PM Revision 9dddb9d8 (ceph): osd: set pg removal transactions based on configurable
- Use the osd_target_transaction_size knob, and gracefully tolerate bogus
values (e.g., <= 0).
Signed-off-by: Sage Wei... - 01:38 PM Revision c30d231e (ceph): osd: make pg removal thread more friendly
- For a large PG these are saturating the filestore and journal queues. Do
them synchronously to make them more friend... - 01:38 PM Revision b2bc4b95 (ceph): os: move apply_transactions() sync wrapper into ObjectStore
- This has nothing to do with the backend implementation.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked f... - 01:38 PM Revision 6d161b57 (ceph): os: add apply_transaction() variant that takes a sequencer
- Also, move the convenience wrappers into the interface and funnel through
a single implementation.
Signed-off-by: Sa... - 12:31 PM Support #3902 (Closed): S3-tests need to cleanup after themselves
- On Congress, DHO has hit the max number of users due to s3-tests not cleaning up after execution. Could we have the s...
- 11:27 AM rbd Tasks #2853 (In Progress): krbd: read path
- With my patches for the basic new request code now
out for initial review, I've started working on this
feature. I... - 11:20 AM rbd Subtask #2852 (In Progress): krbd: open parent on open
- The many patches have now been posted for review.
Included in that is a small, temporary patch that enables
this ... - 05:21 AM rbd Fix #3665: librbd: deadlock during flatten
- possibly here: ...
- 05:20 AM Revision 657df852 (ceph): os/FileStore: add stall injection into filestore op queue
- Allow admin to artificially induce a stall in the op queue. Forces the
thread(s) to sleep for N seconds. We pause f... - 05:20 AM Revision 132045ce (ceph): common/HeartbeatMap: inject unhealthy heartbeat for N seconds
- This lets us test code that is triggered by an unhealthy heartbeat in a
generic way.
Signed-off-by: Sage Weil <sage@... - 02:03 AM Revision a4e78652 (ceph): osd: do not join cluster if not healthy
- If our internal heartbeats are failing, do not send a boot message and try
to join the cluster.
Signed-off-by: Sage ... - 02:01 AM Revision c406476c (ceph): osd: hold lock while calling start_boot on startup
- This probably doesn't strictly matter because start_boot doesn't need the
lock (currently) and few other threads shou... - 01:56 AM Revision ad6b2311 (ceph): osd: do not reply to ping if internal heartbeat is not healthy
- If we find that our internal threads are stalled, do not reply to ping
requests. If we do this long enough, peers wi... - 01:53 AM Revision 61eafffc (ceph): osd: reduce op thread heartbeat default 30 -> 15 seconds
- If the thread stalls for 15 seconds, let our internal heartbeat fail.
This will let us internally respond more quickl... - 12:54 AM Revision 371e6fbe (ceph): Merge pull request #35 from cholcombe973/master
- Making the usage details a little better.
- 12:23 AM Bug #3900: init-ceph should do ulimit -n's with do_root_cmd
- I think he's right, except it should be do_root_cmd, and I'm not certain if that echoes the result of the command cor...
- 12:11 AM Bug #3900 (Resolved): init-ceph should do ulimit -n's with do_root_cmd
- Chen Xiaoxi points out on ceph-devel:
Here is part of /etc/init.d/ceph script:
case "$command" in
s... - 12:19 AM Revision 0d172b95 (ceph): packaging: add smalliobenchrbd
- Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
- 12:13 AM Revision 8eee815f (ceph): Merge remote-tracking branch 'gh/wip-3833-b'
- Conflicts:
src/osd/OSD.cc
src/osd/OSD.h
Reviewed-by: Samuel Just <sam.just@inktank.com> - 12:07 AM Revision 9388f941 (ceph): Update src/rgw/rgw_admin.cc
- Improved the usage message.
01/22/2013
- 11:58 PM Revision eaf20fa9 (ceph): Merge branch 'wip-3651'
- 11:57 PM Revision 509a93e8 (ceph): osd: Add digest of omap for deep-scrub
- Add ScrubMap encode/decode v4 message with omap digest
Compute digest of header and key/value. Use bufferlist
to ref... - 11:57 PM Revision db48caf6 (ceph): osd: debug support for omap deep-scrub
- Deep-scrub test support through admin socket
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Sam... - 11:57 PM Revision cfb1aa80 (ceph): osd: Add missing unregister_command() in OSD::shutdown()
- Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com> - 11:48 PM Revision e714c778 (ceph): osd: Testing of deep-scrub omap changes
- Fix scrub_test.py and add omap corruption test
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: S... - 11:23 PM Revision e328fa6c (ceph): test/bench: add rbd backend to smalliobench
- Only supports format 1 images to start, and does not issue flushes, so
it's best used with caching off.
Signed-off-b... - 11:10 PM Revision 0ee5ec7e (ceph): common/Throttle: fix modeline, whitespace
- Signed-off-by: Sage Weil <sage@inktank.com>
- 11:10 PM Revision c3266ad1 (ceph): config: helper to identify internal fields we should be quiet about
- Signed-off-by: Sage Weil <sage@inktank.com>
- 11:01 PM Revision 89072fbb (ceph): test/bench: don't alias bl from above
- Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
- 11:01 PM Revision c50f5f52 (ceph): test/bench: use uint64_t for uniform distribution
- int is too small for rbd image sizes
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> - 10:55 PM Revision 451cc00a (ceph): doc: Modified usage for upgrade.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 10:47 PM Revision 73a96936 (ceph): osd: improve sub_op flag points
- Signed-off-by: Sage Weil <sage@inktank.com>
- 10:47 PM Revision 33efe321 (ceph): osd: simplify asok to single callback
- Signed-off-by: Sage Weil <sage@inktank.com>
- 10:47 PM Revision 24d0d7eb (ceph): osd: dump op priority queue state via admin socket
- Signed-off-by: Sage Weil <sage@inktank.com>
- 10:47 PM Revision a1137eb3 (ceph): osd: make last state for slow requests more informative
- Report on the last event string, and pass in important context for the
op event list, including:
- which peers were... - 10:47 PM Revision 23c02bce (ceph): osd: refactor ReplicatedPG::do_sub_op
- PULL is the only case where we don't wait for active.
Signed-off-by: Sage Weil <sage@inktank.com> - 10:47 PM Revision c549a0cf (ceph): common/PrioritizedQueue: buckets -> tokens
- Signed-off-by: Sage Weil <sage@inktank.com>
- 10:47 PM Revision 6e3363b2 (ceph): common/PrioritizedQueue: add min cost, max tokens per bucket
- Two problems.
First, we need to cap the tokens per bucket. Otherwise, a stream of
items at one priority over time w... - 10:47 PM Revision 514af15e (ceph): common/PrioritizedQueue: dump state to Formatter
- Signed-off-by: Sage Weil <sage@inktank.com>
- 10:47 PM Revision bec96a23 (ceph): osd: debug msg prio, cost, latency
- Signed-off-by: Sage Weil <sage@inktank.com>
- 10:47 PM Revision e8e0da1a (ceph): osd: use Message::get_cost() function for queueing
- The data payload is a decent proxy for cost in most cases, but not all.
Signed-off-by: Sage Weil <sage@inktank.com> - 10:47 PM Revision a1bf8220 (ceph): osd: set PULL subop cost to size of requested data
- Signed-off-by: Sage Weil <sage@inktank.com>
- 10:47 PM Revision b685f727 (ceph): osd: add OpRequest flag point when commit is sent
- With writeahead journaling in particular, we can get requests that
stay in the queue for a long time even after the c... - 10:47 PM Revision 128fcfca (ceph): note puller's max chunk in pull requests
- this lets us calculate a cost value
- 10:47 PM Revision cfe4b851 (ceph): os/FileStore: allow filestore_queue_max_{ops,bytes} to be adjusted at r...
- The 'committing' ones too.
Signed-off-by: Sage Weil <sage@inktank.com> - 10:47 PM Revision 44dca5c8 (ceph): filestore: disable extra committing queue allowance
- The motivation here is if there is a problem draining the op queue
during a sync. For XFS and ext4, this isn't gener... - 10:47 PM Revision 1233e861 (ceph): osd: target transaction size 300 -> 30
- Small transactions make pg removal nicer to the op queue. It also slows
down PG deletion a bit, which may exacerbate... - 10:47 PM Revision 40654d6d (ceph): filestore: filestore_queue_max_ops 500 -> 50
- Having a deep queue limits the effectiveness of the priority queues
above by adding additional latency.
Signed-off-b... - 10:47 PM Revision 9230c863 (ceph): osd: make OSD a config observer
- Signed-off-by: Sage Weil <sage@inktank.com>
- 10:47 PM Revision 101955a6 (ceph): osd: make osd_max_backfills dynamically adjustable
- Signed-off-by: Sage Weil <sage@inktank.com>
- 09:23 PM Feature #3888 (Fix Under Review): osd: stop heartbeating peers when internal heartbeat fails
- wip-osd-hb
- 03:09 PM Feature #3888: osd: stop heartbeating peers when internal heartbeat fails
- backport to bobtail!
- 08:12 AM Feature #3888 (Resolved): osd: stop heartbeating peers when internal heartbeat fails
- if our internal thread heartbeats fail, stop replying to pings from peers.
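The behaviour requested in Feature #3888 can be illustrated with a minimal sketch. This is not the OSD's C++ code: the `InternalHeartbeat` class, the `handle_ping` function, and the worker name are hypothetical, and the 15-second grace simply mirrors the op thread default mentioned in revision 61eafffc.

```python
class InternalHeartbeat:
    """Hypothetical tracker of when each worker thread last checked in."""

    def __init__(self, grace_seconds):
        self.grace_seconds = grace_seconds
        self.last_seen = {}

    def touch(self, worker, now):
        # A worker calls this whenever it makes progress.
        self.last_seen[worker] = now

    def is_healthy(self, now):
        # Healthy only if every worker checked in within the grace period.
        return all(now - seen <= self.grace_seconds
                   for seen in self.last_seen.values())


def handle_ping(heartbeat, now):
    """Reply to a peer ping only while the internal heartbeat is healthy."""
    if not heartbeat.is_healthy(now):
        return None  # stay silent so peers eventually report us as down
    return "ping_reply"
```

Staying silent rather than sending an explicit failure lets the existing peer timeout logic do the work, which seems to be the intent of the feature description.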
- 09:09 PM Revision b6e3edc6 (ceph): test: create /tmp/cephtest/mnt.{id}
- The workunit task assumes that a mount exists
at /tmp/cephtest/mnt.{id}
This patch creates the path if it doesn't
exi... - 09:05 PM Revision 6401abf8 (ceph): qa/workunit: Add iozone test script for sync
- The iozone-sync.sh script runs iozone testing
various sync flags, O_SYNC, O_DSYNC, O_RSYNC.
Signed-off-by: Sam Lang ... - 09:05 PM Revision 72147fd3 (ceph): objectcacher: Remove commit_set, use flush_set
- commit_set() and flush_set() are identical in functionality,
so use flush_set everywhere and remove commit_set from
t... - 08:43 PM Revision 00b11869 (ceph): testing: add workunit to run hadoop internal tests.
- This workunit runs the internal tests for our local branch of hadoop-common.
Requires ant be installed on the host ru... - 07:37 PM Bug #3899 (Won't Fix): osd: failed to decode object_info_t
- This happened after moving a journal from a file to an ssd, and changing filestore xattr use omap from true to false,...
- 07:36 PM Bug #3836: osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
- ubuntu@teuthology:/a/teuthology-2013-01-22_07:00:04-regression-bobtail-master-basic/3235...
- 07:19 PM devops Bug #3898 (Resolved): ceph-deploy: problems with >1 mon
- If you try "ceph-deploy new ceph1 ceph2" then it correctly creates the ceph.conf and then spits out "Cluster config e...
- 06:25 PM Revision 4a871b55 (ceph): Merge branch 'wip-config'
- Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
- 06:24 PM Revision 359d0e98 (ceph): config: report on log level changes
- Signed-off-by: Sage Weil <sage@inktank.com>
- 06:24 PM Revision c5e09517 (ceph): config: clean up output
- Report a simple list of key='value', without extra verbosity.
Signed-off-by: Sage Weil <sage@inktank.com> - 05:37 PM CephFS Bug #3404: oops in strlen() from set_request_path_attr()
- I found the same bug in the Bobtail release with the NFS kernel server and a 3.7.3 kernel
[70205.985665] BUG: unable to ha... - 05:35 PM Bug #3513 (Resolved): rgw log show error
- 05:35 PM Bug #3513: rgw log show error
- Nope, I had it wrong; the required params are: object *or* all three of date, bucket, and bucket-id.
Message change ... - 02:37 PM Bug #3513: rgw log show error
- Actually I guess the && should be || and the || should be && (the old DeMorgan's rule)
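The De Morgan point can be checked against the rule stated in the later 05:35 PM comment (an object, or all three of date, bucket, and bucket-id). The sketch below is illustrative only; the function and parameter names are hypothetical and are not taken from radosgw-admin.

```python
def log_show_args_ok(obj=None, date=None, bucket=None, bucket_id=None):
    """Accept either an object name or the full (date, bucket, bucket_id) triple."""
    have_triple = bool(date) and bool(bucket) and bool(bucket_id)
    return bool(obj) or have_triple


def log_show_args_missing(obj=None, date=None, bucket=None, bucket_id=None):
    """De Morgan negation of the rule above: || becomes && and && becomes ||."""
    return (not obj) and (not date or not bucket or not bucket_id)
```

For every combination of arguments the two functions return opposite answers, which is exactly what swapping the operators while negating each term should give.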
- 02:30 PM Bug #3513: rgw log show error
- I experienced this also on ubuntu 12.10 0.56.1-1
root@dlcephgw01:~# radosgw-admin log show --bucket=chris --date... - 05:04 PM rgw Bug #3896 (Resolved): rest-bench common/WorkQueue.cc: 54: FAILED assert(_threads.empty())
- It seems rest-bench doesn't like to exit cleanly while cleaning up after itself.... I did test at low concurrency bu...
- 04:31 PM Bug #3895 (Resolved): librados test hang during mon thrashing
- ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-01-21_19:00:03-regression-master-testing-gcov/2929
... - 04:05 PM Feature #3651 (Resolved): osd: deep scrub should hash omap
- 02:58 PM Bug #3894 (Closed): monclient: --keyring failed despite presence of file
- While going over install basics with Gary, we got "ERROR: missing keyring, cannot use cephx for authentication" when ...
- 02:40 PM rbd Feature #3877 (Fix Under Review): krbd: don't wait for notify ack to complete
- I've posted this code for review. I continue to do testing.
- 02:39 PM rbd Subtask #3741 (Fix Under Review): krbd: rework request tracking code
- I've posted this code for review. I continue to do testing.
- 02:39 PM rbd Tasks #3755 (Fix Under Review): krbd: use new request tracking code for sync object operations
- I've posted this code for review. I continue to do testing.
- 02:39 PM rbd Feature #3754 (Fix Under Review): krbd: use new request tracking code for notify ack
- I've posted this code for review. I continue to do testing.
- 02:19 PM rbd Feature #3893 (Rejected): krbd: document the new request code
- There are bits and pieces of the new request code
documented for the kernel rbd client--in the comments
and in the ... - 02:09 PM CephFS Bug #3832: client: does not observe O_SYNC
- Fixed a bug in objectcacher::flush_set. Branch wip-3832-oc-flushrange has been updated, and passes the accompanying ...
- 01:09 PM Subtask #2659: mon: Single-Paxos: ceph tool -w subscriptions not being updated
- Can't recall if this was fixed at some point, or if the root cause was even related.
This must be tested again onc... - 01:06 PM Subtask #2622 (Resolved): mon: Single-Paxos: convert existing, old MonitorStore to a brand new Mo...
- This was implemented both as an offline tool as well as integrated in ceph-mon. The ceph-mon will attempt to open the...
- 01:02 PM Subtask #3069: mon: Single-Paxos: messaging: log MMonSync messages for offline matching
- If we really want to do offline matching, this can be done using just the logs. This could be interesting however fo...
- 12:54 PM Subtask #3843 (Rejected): osd: move purged_snaps out of info
- 12:54 PM Subtask #3844 (Rejected): osd: move info and log into leveldb
- 12:54 PM Subtask #3842 (Rejected): osd: create tool to extract pg info and pg log from filestore
- 12:54 PM Feature #3841 (Rejected): osd: avoid seeks for log and info writes on client writes
- broke out subtasks and top level features
- 12:53 PM Feature #3892 (Resolved): osd: move pg info into leveldb
- 12:53 PM Feature #3891 (Resolved): osd: move purged_snaps out of info
- 12:53 PM Feature #3890 (Resolved): osd: create tool to extract pg info and pg log from filestore
- 10:38 AM Feature #2580 (Resolved): perf: investigate poor performance at 10 osds per node
- This was probably unique to the burnupi cluster and/or older ceph. Performance is fine on the SC847a now with lots o...
- 10:27 AM rbd Bug #3889 (Won't Fix): krbd: handle zero-length requests
- I'm pretty sure there are some special zero-length
requests (like flush) that can come down from the
block layer. ... - 07:07 AM Linux kernel client Bug #3887 (Closed): kernel client: small object memory leak
- In testing my new request code for rbd (issue 3741 and related)
I tried paying special attention to Linux slab usage... - 05:10 AM Revision 98cc1b83 (ceph): task: mon_clock_skew_check: add option to run at least one timecheck
- at-least-once Runs at least once, even if we are told to stop.
(default: True)
at... - 04:11 AM Linux kernel client Bug #3886: Futher testing result for the issue "ceph: avoid 32-bit page index overflow"
- https://SizableSend.com/0g9dwn/ceph_mds.a.log
- 04:06 AM Linux kernel client Bug #3886 (New): Futher testing result for the issue "ceph: avoid 32-bit page index overflow"
- We raised an issue in the following ticket and the ticket has been resolved
http://tracker.newdrea...
01/21/2013
- 11:09 PM Revision b7cb1b11 (ceph): rados/thrash: 3 monitors, so that we can thrash them
- 10:20 PM Feature #3848: osd: gracefully handle cluster network heartbeat failure
- One option: do not mark ourselves back up (after being wrongly marked down) unless we are able to successfully ping a...
- 10:12 PM Bug #3885 (Resolved): osd: osd-recovery-incomplete qa test failing
- ubuntu@teuthology:/a/teuthology-2013-01-21_19:00:03-regression-master-testing-gcov$ teuthology-ls --archive-dir . | g...
- 10:08 PM Feature #3833 (In Progress): osd: improve recovery throttling
- 09:59 PM Bug #2655: scrub slows writes more than it should
- This ticket predates the chunky scrub work that went into ~0.54 or thereabouts.
- 09:15 PM Bug #2655 (Resolved): scrub slows writes more than it should
- 09:12 PM Bug #2357 (Can't reproduce): mds takes down ceph
- 09:11 PM Bug #3854 (Resolved): mon: clock skew tests failing on master
- pushed to master
- 04:45 PM Revision d7d81922 (ceph): config: don't make noise about 'internal_safe_to_start_threads'
- This is set on start, and subsequently gets into the changed set.
Once any other config value is injected, it is the ... - 04:22 PM Revision 3399860d (ceph): Merge remote-tracking branch 'gh/next'
- 04:21 PM Revision 2e39dd5e (ceph): mds: fix default_file_layout constructor
- Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com> - 04:21 PM Revision e461f096 (ceph): mds: fix byte_range_t ctor
- I do not think we saw any bugs from this, but anything that involved
capability issues on restart or migrate might ha... - 01:20 PM Fix #3884 (Resolved): osd: resurrect partially deleted PGs
- If a PG is in the process of getting removed and we repeer and discover we want to keep it, we currently block waitin...
- 12:30 PM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- Corin-
Have you tried 0.48.3 again since then? I'd like to get to the bottom of this, if possible... :) - 09:35 AM rbd Bug #3737: Higher ping-latency observed in qemu with rbd_cache=true during disk-write
- Hi Josh,
according to our conversation I did some testing.
I started the dd if=/dev... of=/tmp/doof.dat bs=4k cou... - 12:11 AM Revision c5fe0965 (ceph): osd: calculate initial PG mapping from PG's osdmap
- The initial values of up/acting need to be based on the PG's osdmap, not
the OSD's latest. This can cause various co... - 12:11 AM Revision 17160843 (ceph): osd: calculate initial PG mapping from PG's osdmap
- The initial values of up/acting need to be based on the PG's osdmap, not
the OSD's latest. This can cause various co...
01/20/2013
- 11:01 PM CephFS Feature #1236 (Fix Under Review): libceph: set layout via virtual xattrs (libceph/cfuse)
- wip-vxattr (ceph.git) and wip-vxattrs (ceph-client.git). There's a test script that passes on both fuse and kclient....
- 10:58 PM CephFS Feature #1236: libceph: set layout via virtual xattrs (libceph/cfuse)
- Greg Farnum wrote:
> How large would a simple "layout" xattr actually be in comparison to the shipped inodes? I'm no... - 03:12 PM CephFS Feature #1236: libceph: set layout via virtual xattrs (libceph/cfuse)
- How large would a simple "layout" xattr actually be in comparison to the shipped inodes? I'm not sure the size is so ...
- 08:26 PM rbd Feature #3877: krbd: don't wait for notify ack to complete
- I have implemented this in the new request code.
It will be posted for review along with the rest
of that new code ... - 08:14 PM rbd Feature #3877 (In Progress): krbd: don't wait for notify ack to complete
- Ian points out that "I've already implemented this change"
suggests that the status of this issue should at least
b... - 08:26 PM rbd Subtask #3741 (In Progress): krbd: rework request tracking code
- Considering this "is actually work that's mostly complete"
I'm (finally) marking it "In Progress."
This code is f... - 08:22 PM rbd Feature #3754 (In Progress): krbd: use new request tracking code for notify ack
- I have completed implementing sending synchronous acknowledgement
in response to a watch request notification. It i... - 08:19 PM rbd Tasks #3755 (In Progress): krbd: use new request tracking code for sync object operations
- I have completed implementing all of these in the new request
code:
- synchronous object read (for v1 header object... - 04:12 PM Bug #3879 (Resolved): ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
- thanks! commit:17160843d0c523359d8fa934418ff2c1f7bffb25
- 03:51 PM Bug #3879: ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
- Looks good to me.
- 09:58 AM Bug #3879 (Fix Under Review): ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
- wip-3879
- 09:06 AM Bug #3879: ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
- Output from the following attached:
ceph osd getmap 554 -o 554 - 08:46 AM Bug #3879 (In Progress): ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
- Jens Kristian Søgaard wrote:
> Output from the following attached:
>
> ceph osd getmap 555 -o 555
> ceph osd get... - 12:49 AM Bug #3879: ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
- Output from the following attached:
ceph osd getmap 555 -o 555
ceph osd getmap 556 -o 556
- 11:15 AM Bug #3883 (Won't Fix): osd: leaks memory (possibly triggered by scrubbing) on argonaut
- 100MB/day reported by multiple users, both on 0.48 and 0.56.1.
Some correlation with scrubbing. Possibly specific... - 09:55 AM CephFS Feature #3882: Hide snapshot directory name in mount/mtab
- It seems like a better (or perhaps just "more important") fix is to restrict access to .snap in the first place.
FWI... - 07:14 AM CephFS Feature #3882 (Rejected): Hide snapshot directory name in mount/mtab
- The idea is to prevent users from seeing which snapshot directory name was chosen during mount.
This is useful if we want to... - 09:51 AM CephFS Bug #3881 (Rejected): Wrong ip network to exchange data between kernel ceph and MDS
- Ivan Kudryavtsev wrote:
> Hm. It seems that I'm wrong about the way it works. It connects to OSDs via OSD-defined pu... - 09:44 AM CephFS Bug #3881: Wrong ip network to exchange data between kernel ceph and MDS
- Hm. It seems that I'm wrong about the way it works. It connects to OSDs via OSD-defined public network. It seems that...
- 07:03 AM CephFS Bug #3881 (Rejected): Wrong ip network to exchange data between kernel ceph and MDS
- I'm using ceph installation with three networks:
1st is Infiniband network for OSD exchange and replication
2nd i...
01/19/2013
- 02:24 PM Bug #3879: ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
- full log at http://bit.ly/11Hn7BN
- 02:04 PM Bug #3879 (Resolved): ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
- ...
- 12:58 PM Bug #3878 (Rejected): osd: nobackfill flag doesn't work
- on current master, bobtail
- 11:43 AM Feature #3833: osd: improve recovery throttling
- see wip-3833 for push
- 08:40 AM rbd Feature #3877 (Closed): krbd: don't wait for notify ack to complete
- When we receive notification of a change to an rbd image's header
object we need to refresh our information about th... - 06:36 AM Revision 2491f976 (ceph): workunits/cephtool: add tests for ceph osd pool set/get
- Signed-off-by: Dan Mick <dan.mick@inktank.com>
- 04:57 AM Revision ea9628fb (ceph): Merge remote-tracking branch 'gh/next'
- 03:26 AM Revision 48308954 (ceph): Clarify journal size based on filestore max sync
- The docs had the recommended journal size based on the option
"filestore min sync interval" when it should have been
... - 02:32 AM Revision aea898db (ceph): ceph: reject negative weights at ceph osd <n> reweight
- Check the integer (fixed-point) value to avoid any worries
about floating-point rounding. Add tests for reweight < 0... - 02:32 AM Revision 7d9d7651 (ceph): workunit/cephtool: Use '! cmd' when expecting failure
- Signed-off-by: Dan Mick <dan.mick@inktank.com>
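For the journal sizing clarified in revision 48308954 above, the rule of thumb in the Ceph documentation is, as far as I recall, about twice the product of expected throughput and "filestore max sync interval". The sketch below works through that arithmetic; the function name and the sample figures are hypothetical.

```python
def journal_size_mb(expected_throughput_mb_s: float,
                    max_sync_interval_s: float) -> float:
    """Rule-of-thumb journal size in MB.

    Assumes the guideline: 2 * expected throughput * filestore max sync interval.
    """
    return 2 * expected_throughput_mb_s * max_sync_interval_s


# Hypothetical disk sustaining 100 MB/s with a 5 second max sync interval.
example_size = journal_size_mb(100, 5)
```

The factor of two leaves room for one full sync interval of writes to accumulate while the previous interval is still being flushed.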
- 12:55 AM Revision ee4a9f25 (ceph): marginal/mds_thrasher: Add tests for mds thrasher
- Adds a basic set of roles for testing the mds thrasher
with 1 active and 1 standby, and a few basic tests that
stress... - 12:40 AM Revision 6008b1d8 (ceph): osdmap: make replica separate in default crush map configurable
- Add 'osd crush chooseleaf type' option to control what the default
CRUSH rule separates replicas across. Default to ... - 12:17 AM Revision 8c0d702e (ceph): msg/Pipe: use state_closed atomic_t for _lookup_pipe
- We shouldn't look at Pipe::state in SimpleMessenger::_lookup_pipe() without
holding pipe_lock. Instead, use an atomi... - 12:17 AM Revision 5fb77bf1 (ceph): ceph: adjust crush tunables via 'ceph osd crush tunables <profile>'
- Make it easy to adjust crush tunables. Create profiles:
legacy: the legacy values
argonaut: the argonaut defaults... - 12:17 AM Revision 373f1671 (ceph): msgr: atomically queue first message with connect_rank
- Atomically queue the first message on the new pipe, without dropping
and retaking pipe_lock.
Signed-off-by: Sage Wei... - 12:17 AM Revision ae1882e7 (ceph): msgr: don't queue message on closed pipe
- If we have a con that refs a pipe but it is closed, don't use it. If
the ref is still there, it is only because we a... - 12:17 AM Revision 34e2d402 (ceph): msgr: fix race on Pipe removal from hash
- When a pipe is faulting and shutting down, we have to drop pipe_lock to
take msgr lock and then remove the entry. Th... - 12:17 AM Revision 8e0359c3 (ceph): msgr: inject delays at inconvenient times
- Exercise some rare races by injecting delays before taking locks
via the 'ms inject internal delays' option.
Signed-... - 12:01 AM Revision 0cb760f3 (ceph): OSD: do deep_scrub for repair
- Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
01/18/2013
- 11:45 PM Revision 684a8f8f (ceph): Merge branch 'wip-pg-removal'
- Reviewed-by: Samuel Just <sam.just@inktank.com>
- 11:44 PM Revision f6c69c3f (ceph): os: add apply_transaction() variant that takes a sequencer
- Also, move the convenience wrappers into the interface and funnel through
a single implementation.
Signed-off-by: Sa... - 11:44 PM Revision bc994045 (ceph): os: move apply_transactions() sync wrapper into ObjectStore
- This has nothing to do with the backend implementation.
Signed-off-by: Sage Weil <sage@inktank.com> - 11:44 PM Revision 4712e984 (ceph): osd: make pg removal thread more friendly
- For a large PG these are saturating the filestore and journal queues. Do
them synchronously to make them more friend... - 11:44 PM Revision 5e00af40 (ceph): osd: set pg removal transactions based on configurable
- Use the osd_target_transaction_size knob, and gracefully tolerate bogus
values (e.g., <= 0).
Signed-off-by: Sage Wei... - 11:33 PM Revision 82f22b38 (ceph): config_opts.h: default osd_recovery_delay_start to 0
- This setting was intended to prevent recovery from overwhelming peering traffic
by delaying the recovery_wq until osd... - 11:21 PM Documentation #3711: crush-map.rst: choose firstn talks about "N", but does not clearly define wh...
- 11:20 PM Documentation #3711: crush-map.rst: choose firstn talks about "N", but does not clearly define wh...
- Sorry, I think this is still wrong; the descriptions of {num} only apply if firstn is supplied, correct? Otherwise {...
- 11:12 PM Bug #3869: ceph osd pool get doesn't support everything set does
- Added tests with commit:2491f976e4cd6eca5c30f7c184038364e4fe1873
- 01:22 PM Bug #3869: ceph osd pool get doesn't support everything set does
- how about a quick bash test script that gets and sets some of these values?
- 12:49 PM Bug #3869 (Resolved): ceph osd pool get doesn't support everything set does
- commit:1f911fd0616c3fb45d5d36de7947a1914190017b
- 12:27 PM Bug #3869 (Fix Under Review): ceph osd pool get doesn't support everything set does
- 12:15 PM Bug #3869: ceph osd pool get doesn't support everything set does
- This was noted on #ceph overnight.
- 12:14 PM Bug #3869 (Resolved): ceph osd pool get doesn't support everything set does
- ...for no apparently good reason. Adding the missing info is easy.
- 11:11 PM RADOS Bug #3872 (Resolved): You can put negative weights on OSDs
- commit:aea898db2b56878b50f09dcbbf52347f4cc5c754
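The fix referenced here checks the integer (fixed-point) form of the weight rather than the float. A minimal sketch of that idea follows, assuming a 16.16 fixed-point encoding in which 0x10000 stands for a weight of 1.0; the function names are hypothetical and this is not the monitor's actual code.

```python
FIXED_POINT_ONE = 0x10000  # assumed scale: 0x10000 represents a weight of 1.0


def to_fixed_point(weight: float) -> int:
    """Convert a float weight to the assumed fixed-point integer form."""
    return int(weight * FIXED_POINT_ONE)


def validate_reweight(weight: float) -> int:
    """Return the fixed-point weight, rejecting negative values.

    The comparison is done on the integer so the decision does not
    depend on floating-point rounding near zero.
    """
    fixed = to_fixed_point(weight)
    if fixed < 0:
        raise ValueError(f"negative weight rejected: {weight}")
    return fixed
```

Comparing the converted integer means the value that is validated is exactly the value that would be stored.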
- 05:39 PM RADOS Bug #3872: You can put negative weights on OSDs
- 04:01 PM RADOS Bug #3872 (Fix Under Review): You can put negative weights on OSDs
- 02:32 PM RADOS Bug #3872 (Resolved): You can put negative weights on OSDs
- DHO reports that negative weights can be assigned to an OSD. Tested on Alexandria running 0.56-20-g9aecacd-1precise.
... - 09:48 PM Revision 53f22d94 (ceph): task/mds_thrasher: New task for thrashing the mds
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 09:43 PM Revision 4bdcfbff (ceph): client: Respect O_SYNC, O_DSYNC, and O_RSYNC
- If the file is opened with O_SYNC, O_DSYNC, or O_RSYNC, we need to
flush cached data (and metadata for O_SYNC) on a w... - 09:31 PM Revision b4e0f7ca (ceph): Merge remote-tracking branch 'gh/wip-client-pool-api'
- Reviewed-by: Sage Weil <sage@inktank.com>
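From an application's point of view, the O_SYNC behaviour addressed in revision 4bdcfbff is requested at open time. The sketch below shows a synchronous write using only the Python standard library; the helper names and the temporary file are illustrative, and it says nothing about how the Ceph client implements the flush internally.

```python
import os
import tempfile


def write_synchronously(path: str, payload: bytes) -> int:
    """Write payload to path with O_SYNC set, returning bytes written."""
    # O_SYNC is absent on some platforms; fall back to 0 (no extra flag) there.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_SYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        return os.write(fd, payload)
    finally:
        os.close(fd)


def demo() -> bytes:
    """Write to a temporary file synchronously and read the data back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sync-test.dat")
        write_synchronously(path, b"hello")
        with open(path, "rb") as handle:
            return handle.read()
```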
- 09:16 PM Linux kernel client Bug #3875 (Resolved): osd_client: don't use r_num_pages for bio requests
- There is an osd request field "r_num_pages" that's used
to record the number of pages supplied with the request.
Fo... - 09:02 PM Revision 609442da (ceph): Merge remote-tracking branch 'gh/wip-scrub-argonaut' into argonaut
- 08:42 PM Revision 1f911fd0 (ceph): ceph: allow osd pool get to get everything you can set
- osd pool get was missing size, min_size, crash_replay_interval,
and crush_ruleset; they're all easily added.
Fixes: ... - 08:21 PM Revision 045af959 (ceph): qa: remove xfstest 068 from qemu testing
- This tests fsfreeze, which sometimes hangs in xfs in linux 3.2
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> - 08:14 PM Revision 49726dcf (ceph): os/FileStore: only flush inline if write is sufficiently large
- Honor filestore_flush_min in the inline flush case.
Backport: bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
Re... - 08:14 PM Revision 8ddb55d3 (ceph): os/FileStore: fix compile when sync_file_range is missing;
- If sync_file_range is not present, we always close inline, and flush
via fdatasync(2).
Fixes compile on ancient plat... - 07:05 PM Revision b8d5e286 (ceph): doc/rados/operations/crush: need kernel v3.6 for first round of tunables
- Reported-by: rl219 in #ceph on irc.oftc.net
Signed-off-by: Sage Weil <sage@inktank.com> - 06:47 PM Revision dbc38eff (ceph): rbd.py: update scratch and test image sizes
- Test 167 was failing due to running out of space on the scratch
file system. The test reserves 21MB in a file, and r... - 06:45 PM Revision 7e8e6491 (ceph): os/: Add CollectionIndex::prep_delete
- If an unlink is interrupted between removing the file
and updating the subdir attribute, the attribute will
overestima... - 06:35 PM Revision 736966f3 (ceph): java: support get pool id/replication interface
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
- 06:33 PM Revision 40415d1c (ceph): libcephfs: add pool id/size lookup interface
- Adds new interfaces ceph_get_pool_id() and ceph_get_pool_replication()
to libcephfs.
Signed-off-by: Noah Watkins <no... - 05:35 PM CephFS Feature #1236: libceph: set layout via virtual xattrs (libceph/cfuse)
- Translating any ceph.* setxattrs into a sync setxattr and handling it on the MDS seems like an easy win. I can't thi...
- 01:34 PM CephFS Feature #1236: libceph: set layout via virtual xattrs (libceph/cfuse)
- We're still thinking through the implications of the best way to implement this. Nonetheless there are people using h...
- 05:01 PM CephFS Bug #3832: client: does not observe O_SYNC
- Current status: the iozone-sync.sh test script is causing a segfault (sometimes a hang). Needs more testing! Segf...
- 04:46 PM Documentation #3808 (In Progress): Block device quick start page need update
- 03:58 PM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
- It had sounded to me like the trend was towards eliminating the Perl usage rather than adding it as a dependency. Did...
- 03:56 PM Feature #3815 (Duplicate): osd: move pg_info_t back into the xattr; avoid writing pginfo file whe...
- 03:49 PM Bug #3870 (Resolved): osd: make pg removal more friendly
- commit:684a8f8f84312d4d9c6cdeb8d6d9fad792bd5a6d
- 01:44 PM Bug #3870 (Resolved): osd: make pg removal more friendly
- wip-pg-removal needs cleanup and merge
- 03:49 PM Bug #3806 (Won't Fix): OSDs stuck in active+degraded after changing replication from 2 to 3
- Thanks. I was trying to figure out where the conflict could come from, and actually it does make sense: The single-os...
- 03:45 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
- Sure, it's attached...
- 03:40 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
- @Josh: Even with the new CRUSH tunables it's still a matter of probability, so if you give it a particularly challeng...
- 03:31 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
- OK, it looks like I may have simply given CRUSH a challenging assignment, given the resources of the cluster.
I ... - 02:58 PM Bug #3873 (Duplicate): Ceph cli tool allows setting negative weights
- 02:54 PM Bug #3873 (Duplicate): Ceph cli tool allows setting negative weights
- Setting OSD weights to negative values:...
- 02:46 PM Bug #1807 (Can't reproduce): CentOS compile error in perfglue/heap_profiler.cc
- 02:01 PM CephFS Feature #3570 (Resolved): teuthology: mds thrasher
- 02:01 PM rbd Bug #3871 (Resolved): krbd: initial header read may be out of date
- Currently krbd uses the version parameter of a watch operation to try to prevent this, but that was never implemented...
- 01:55 PM Linux kernel client Bug #3860 (Rejected): rbd: problems if watch setup returns ERANGE
- 01:54 PM Linux kernel client Bug #3860: rbd: problems if watch setup returns ERANGE
- ERANGE is never actually returned - it was never implemented (#2592). The real fix for the race it was intended to pr...
- 08:08 AM Linux kernel client Bug #3860 (Rejected): rbd: problems if watch setup returns ERANGE
- When rbd sets up the watch request for a newly-mapped rbd image
it loops and tries again if the request returns ERAN... - 12:49 PM CephFS Feature #3865 (Duplicate): mds: implement lookup-by-ino based on inode backtraces
- #3541. Whoops!
- 11:02 AM CephFS Feature #3865 (Duplicate): mds: implement lookup-by-ino based on inode backtraces
- Following #3862 and #3863, implement the lookup-by-ino algorithm described in http://www.spinics.net/lists/ceph-devel...
- 12:49 PM CephFS Feature #3541: mds: robust ino lookup using file backpointers
- We have a design now!
- 12:48 PM CephFS Feature #3862 (Duplicate): mds: add file backtraces to data objects
- #3540. Whoops!
- 10:26 AM CephFS Feature #3862 (Duplicate): mds: add file backtraces to data objects
- Add backtraces to each file object, as described at http://www.spinics.net/lists/ceph-devel/msg11872.html. This ticke...
- 12:48 PM CephFS Feature #3540: mds: maintain per-file backpointers on first file object
- We have a design now!
- 11:09 AM CephFS Feature #3727: mds: refactor EMetablob encoding paths
- What is this bug about?
- 11:08 AM CephFS Feature #3867 (Resolved): optionally do not use an anchor table
- Following #3865 and #3866, we should introduce a config option that, when set, does not make use of the Anchor table ...
- 11:07 AM CephFS Feature #3866 (New): mds: Add lazily-updated backtraces to hard links
- As described in http://www.spinics.net/lists/ceph-devel/msg11872.html, we want hard links to contain lazily-updated b...
- 10:55 AM CephFS Feature #3863: implement a tool to lookup inode numbers without holding their path
- +1 for just adding the libcephfs function, and a test in test_libcephfs.
- 10:41 AM CephFS Feature #3863 (Resolved): implement a tool to lookup inode numbers without holding their path
- This should just be a small wrapper around Client.cc*, but we need to be able to generate inode lookups without knowi...
- 10:41 AM Feature #3769: osd: scrub should verify snap collection existence, membership
- In master, sha-1 7b6fe03208c507b55517abe45cdff5c96d91904a
Needs backport when we are happy with the testing (if it's... - 10:15 AM rbd Tasks #3755: krbd: use new request tracking code for sync object operations
- The sync header read operation was another one that was needed.
That's basically done too.
All of this will be re... - 10:09 AM rbd Tasks #3755: krbd: use new request tracking code for sync object operations
- I have been looking in detail at how the watch requests are
implemented and in the process identified a few potentia... - 10:14 AM Linux kernel client Bug #3751 (Resolved): krbd: fix type of snap_id local variable
- ...
- 10:11 AM Bug #3854 (Fix Under Review): mon: clock skew tests failing on master
- 10:07 AM Bug #3854: mon: clock skew tests failing on master
- teuthology's wip-3854 commit:1d8640860441dc27e8342788c1ae17f5c1b3ccc0 fixes this issue.
- 09:00 AM Bug #3816: osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
- commit:98a763123240803741ac9f67846b8f405f1b005b
When the osd does a "mark myself back up" it takes care to rebind ... - 08:58 AM rbd Feature #3861 (Resolved): rbd: consider splitting rbd_osd_req_op_create()
- When it was out for review, Josh suggested that it might
be better to have separate (type-checking) functions for
b... - 08:25 AM CephFS Bug #3845: mds: standby_for_rank not getting cleared on takeover
- +1 clearing it for cosmetic reasons.
- 08:25 AM Revision 76e715ba (ceph): doc: Added link to rotation section.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 08:25 AM Revision e1741ba6 (ceph): doc: Added hyperlink to log rotation section.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 08:24 AM Revision 612717af (ceph): doc: Added section on log rotation.
- fixes: #3776
Signed-off-by: John Wilkins <john.wilkins@inktank.com> - 08:07 AM rbd Bug #3859 (Resolved): osd_client: define ceph_osdc_clear_request_linger()
- There is a ceph_osdc_set_request_linger() function that
sets a flag on a request and takes an additional reference.
... - 08:04 AM rbd Bug #3858 (Resolved): osd_client: ceph_osdc_wait_request() seems wrong
- The only error wait_for_completion_interruptible() will
return is ERESTARTSYS. So if that gets returned inside
cep... - 07:33 AM Revision 48f41468 (ceph): Merge branch 'master' of https://github.com/ceph/ceph
- 07:32 AM Revision 83326588 (ceph): doc: Modified index to include mon-osd-interaction.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 07:31 AM Revision d6fc92df (ceph): doc: Added a section describing mon/osd interaction.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 07:14 AM rbd Feature #1491: qemu: make qemu-img convert fast
- This was rejected because the feature is not relevant anymore. At the time, when I was looking at it there was some obvio...
- 06:43 AM Revision bebdc70b (ceph): build: Add perl installation dependency to rpm and debian packages.
- There was already a dependency on python in the debian control file,
a similar dependency was added to the rpm spec f... - 06:13 AM Revision ff7c971f (ceph): doc: Added an admonishment for SSD write latency.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 06:00 AM Revision 6f28faf9 (ceph): mds: open mydir after replay
- In certain cases, we may replay the journal and not end up with the
dirfrag for mydir open. This is fine--we just ne... - 05:51 AM Revision dd7caf5f (ceph): mds: gracefully exit if newer gid replaces us by name
- If 'mds enforce unique name' is set, and another MDS with the same name
kicks us out of the MDSMap, gracefully exit i... - 05:45 AM Revision 2e112333 (ceph): mon: enforce unique name in mdsmap
- Add 'mds enforce unique name' option, defaulting to true.
If set, when an MDS boots, it will kick any previous mds w... - 05:27 AM Revision ca2d9ac9 (ceph): doc: Updated OSD configuration reference with backfill config options.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 05:25 AM Revision e330b7ec (ceph): mon: create fail_mds_gid() helper; make 'ceph mds rm ...' more generic
- Take a gid or a rank or a name. Use a nicer helper.
Signed-off-by: Sage Weil <sage@inktank.com> - 05:05 AM Revision 5a384f48 (ceph): Merge branch 'wip-mds'
- Reviewed-by: Sage Weil <sage@inktank.com>
- 05:00 AM Revision f41b5421 (ceph): add mon_thrash task to kernel and rados thrashers collections
- Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
- 04:57 AM Revision 626f6104 (ceph): Add a test for the truncate/osd-commit-reply race
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 04:54 AM Revision cc7bf1bd (ceph): rados: add osd reply delay injection
- 01:54 AM Revision d81ac841 (ceph): rbd: fix bench-write infinite loop
- I/O was continuously submitted as long as there were few enough ops in
flight. If the number of 'threads' was high, or... - 01:01 AM Revision 233d034d (ceph): Merge branch 'wip-cephx'
- Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
- 12:43 AM devops Feature #2885 (Resolved): doc: mon initial members requirements, functioning, admin steps to take
- This was done some time ago. Step 9 here: http://ceph.com/docs/master/rados/deployment/chef/#configure-your-ceph-envi...
- 12:35 AM Documentation #3062 (Resolved): doc: osd tuning config options
- This was completed some time ago.
- 12:28 AM Documentation #3329 (Resolved): doc: What metrics should be used to set node weight
- Discussion was primarily starting with 1TB as a weight of 1.00 with additional consideration for throughput. If this ...
- 12:27 AM Tasks #3779 (In Progress): update osd config ref as appropriate
- 12:26 AM Bug #3776 (Resolved): Need doc describing how to alter our log rotation
- 12:09 AM Revision e776b63d (ceph): crushtool: consolidate_whitespace() should eat everything except \n
- CRUSH map source with \r (like a DOS text file) failed to compile
with the usual nonuseful message; turns out that ea... - 12:09 AM Revision 60db6e3e (ceph): crushtool: warn usefully about missing output spec
- When running with --test, you must request output to CSV files or
specific types of output to --show-X; make the erro...
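The whitespace handling described for revision e776b63d (eat everything except \n) can be sketched as follows. This is a Python illustration, not the crushtool source, and it assumes that each run of non-newline whitespace collapses to a single space.

```python
import re

# Matches runs of whitespace characters other than "\n" (so "\r" and "\t" are included).
_NON_NEWLINE_WS = re.compile(r"[^\S\n]+")


def consolidate_whitespace(text: str) -> str:
    """Collapse each run of non-newline whitespace to one space, keeping newlines."""
    return _NON_NEWLINE_WS.sub(" ", text)
```

With this approach a DOS-style line ending such as "\r\n" becomes " \n", so the carriage return no longer reaches the parser.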
01/17/2013
- 11:41 PM Documentation #3711 (Resolved): crush-map.rst: choose firstn talks about "N", but does not clearl...
- 11:41 PM Documentation #3389 (Resolved): doc: crush docs could use a full example crushmap
- 11:40 PM Documentation #3709 (Resolved): crush-map.rst: claims 'types' are default, not true (must be spec...
- 11:40 PM Documentation #3707 (Resolved): crush-map.rst: syntax error in example
- 11:28 PM Feature #3505 (Resolved): default to libnss
- This was done for RPMs with the commit listed below. Debians already had the --with-nss flag in the rules file.
... - 11:21 PM Bug #2176 (In Progress): dependencies not checked by autoconf
- All these are listed as build requirements for the rpm and debian packages. I'll add the missing ones to configure.ac.
- 11:16 PM devops Tasks #3512 (In Progress): Publish our fastcgi packages
- The approach is to pick up the latest debian and rpm packages for mod_fastcgi, apply the ceph patch, and build manual...
- 11:13 PM Bug #3736: kernel build: failures starting in 3.8-rc1
- The immediate kernel build problems have been solved by recreating the patch that is applied to the debian package bu...
- 11:09 PM Bug #3736: kernel build: failures starting in 3.8-rc1
- Branch: refs/heads/master
Home: https://github.com/ceph/autobuild-ceph
Commit: 0ff4f9a9ce82b37288b3bbcc5b5d65b5... - 11:12 PM Revision efa595f5 (ceph): doc/rados/operations/authentication: update for cephx sig requirement o...
- Signed-off-by: Sage Weil <sage@inktank.com>
- 11:12 PM Revision 50db10dc (ceph): msg/Pipe: require MSG_AUTH feature on server if option is enabled
- If we
negotiate cephx AND
are a server AND
cephx require signatures = true
then require the MSG_AUTH feature ... - 11:12 PM Revision 91a573a4 (ceph): mon: enforce 'cephx require signatures' during negotiation
- If we are negotiating which auth protocol to use, and the client does not
support the MSG_AUTH feature, and the serve... - 11:11 PM Revision 4a49a09d (ceph): cephx: control signatures for service vs cluster
- Signed-off-by: Sage Weil <sage@inktank.com>
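The condition listed for revision 50db10dc above (cephx negotiated AND acting as server AND cephx require signatures = true) reads as a single predicate. The sketch below uses hypothetical parameter names and is not the msg/Pipe code.

```python
def must_require_msg_auth(negotiated_cephx: bool,
                          is_server: bool,
                          require_signatures: bool) -> bool:
    """Return True only when all three conditions from the commit message hold."""
    return negotiated_cephx and is_server and require_signatures
```

If any one of the three inputs is false, the MSG_AUTH feature is not demanded of the peer.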
- 11:01 PM Revision c236a51a (ceph): osdmap: make replica separate in default crush map configurable
- Add 'osd crush chooseleaf type' option to control what the default
CRUSH rule separates replicas across. Default to ... - 10:54 PM Bug #3768 (Resolved): perl is required for logrotate, we need to include Perl as a dependency
- Branch: refs/heads/master
Home: https://github.com/ceph/ceph
Commit: bebdc70b4254a78d9fe86af9c645e828fd11e2b2
... - 10:16 PM Documentation #3831 (In Progress): ceph osd crush set command needs correction in the doc
- 10:14 PM CephFS Feature #1236: libceph: set layout via virtual xattrs (libceph/cfuse)
- 10:02 PM CephFS Feature #3857: mds: enforce unique mds names in mdsmap
- see wip-mds-names
- 09:36 PM CephFS Feature #3857 (Resolved): mds: enforce unique mds names in mdsmap
- Currently mds's are uniquely identified by their addr (i.e., a unique instance of the process). The name is useful on...
- 08:27 PM Revision cd09be6a (ceph): ceph: pass ceph.conf to osdmaptool
- This ensures it sees the chooseleaf option and generates the proper
CRUSH rules. - 06:37 PM rbd Bug #3413 (Resolved): rbd bench-write fails with assert when rbd caching turned on
- commit:d81ac8418f9e6bbc9adcc69b2e7cb98dd4db6abb
- 01:39 PM rbd Bug #3413 (Fix Under Review): rbd bench-write fails with assert when rbd caching turned on
- branch wip-rbd-bench-write
- 06:11 PM Revision c6f8010b (ceph): mon: Monitor: drop messages from old timecheck epochs
- We were asserting when the message's timecheck epoch (which is mapped to
the election epoch) was older than the curre... - 06:08 PM Revision 81e8bb55 (ceph): osdmaptool: more fix cli test
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit b0162fab3d927544885f2b9609b9ab3dc4aaff74) - 06:08 PM Revision 2b5b2657 (ceph): osdmaptool: fix cli test
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 5bd8765c918174aea606069124e43c480c809943) - 06:08 PM Revision f739d123 (ceph): osdmaptool: allow user to specify pool for test-map-object
- Fixes: #3820
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Gregory Farnum <greg@in... - 06:07 PM Revision 00759ee0 (ceph): rados.cc: fix rmomapkey usage: val not needed
- Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <samuel.just@inktank.com>
(cherry pic... - 06:07 PM Revision 06b3270f (ceph): librados.hpp: fix omap_get_vals and omap_get_keys comments
- We list keys greater than start_after.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <... - 06:07 PM Revision 75072965 (ceph): rados.cc: use omap_get_vals_by_keys in getomapval
- Fixes: #3811
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
(... - 06:07 PM Revision a3c2980f (ceph): rados.cc: fix listomapvals usage: key,val are not needed
- Fixes: #3812
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
(... - 06:00 PM rgw Feature #3856 (Resolved): rgw: list buckets S3 api should be paginated
- The S3 api (unlike swift) does not define marker, max when listing buckets (probably due to the fact that max buckets...
- 05:25 PM Bug #3836: osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
- ...
- 08:52 AM Bug #3836 (Resolved): osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
- ...
- 04:55 PM Bug #3279: mon/caps: cap comparison in get-or-create is based on a string literal
- This affects the chef mon recipe. I am able to correct this error by joining lines 96-99.
[Thu, 17 Jan 2013 16:5... - 04:44 PM Feature #3850: Add json output for ceph pg dump and ceph osd tree
- 'pg dump' and 'osd dump' both have 'json' support since argonaut, but argonaut does not support outputting json on 'o...
- 03:20 PM Feature #3850: Add json output for ceph pg dump and ceph osd tree
- It already exists for pg dump and osd dump too. osd tree was recent though, maybe it's not in the version he's using?
- 03:02 PM Feature #3850 (Closed): Add json output for ceph pg dump and ceph osd tree
- Kyle Bader has requested json output for the following commands:
ceph pg dump
ceph osd tree
Sage Comment:
th... - 04:32 PM Feature #3855 (Resolved): Making Scrubs Nicer
- As requested from DHO:
Currently scrubs are not very nice, Sage referred to these issues and it would be nice if t... - 04:26 PM Bug #3854 (Resolved): mon: clock skew tests failing on master
- ...
- 04:21 PM Feature #3853 (Resolved): qa: include iogen in qa suite
- 04:10 PM Bug #3827 (Resolved): crushtool --test: claims to want -o, really wants --output-csv or --show-*
- commit:60db6e3e394df1e4110eefa5951657b648b02006
- 04:10 PM RADOS Bug #3834 (Resolved): crushtool really really hates \r
- commit:e776b63dd5c540a6f49b03b67e72a1f4636a74fd
- 11:06 AM RADOS Bug #3834: crushtool really really hates \r
- Well isspace() would catch newline too, which I think we don't want, so it'd be iswhite(c) && c != '\n', which I'm no...
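The whitespace test proposed in this comment can be sketched as follows. This is only an illustration in Python, not crushtool's code: `is_inline_space` is a hypothetical name, and Python's `str.isspace()` stands in for C's `isspace()`.

```python
def is_inline_space(c: str) -> bool:
    """Return True for any whitespace character except newline.

    Hypothetical helper: mirrors the 'whitespace && c != newline' idea,
    so a stray carriage return counts as blank space while '\n' still
    terminates a line.
    """
    return c.isspace() and c != "\n"


if __name__ == "__main__":
    # Show how each character of a CRLF-terminated line is classified.
    for ch in "step take root\r\n":
        print(repr(ch), is_inline_space(ch))
```

With a predicate like this, a map file saved with CRLF line endings would have its `\r` skipped as ordinary blank space rather than treated as part of a token.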
- 04:06 PM devops Bug #3852 (Resolved): chef recipes don't try to start OSDs
- I wasn't aware the chef recipes were this incomplete, but it appears as though, unless
you're running Crowbar, osd.r... - 04:05 PM devops Bug #3851 (Resolved): chef recipes don't enable upstart
- Since upstart management of daemons now explicitly looks for an upstart tag file, Chef
doesn't start the monitors co... - 03:17 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- I presume we're planning to backport this to bobtail after it passes some nights of testing? Maybe we should leave th...
- 03:03 PM Bug #3785 (Resolved): ceph: default crush rule does not suit multi-OSD deployments
- commit:f358cb1d2b0a3a78bf59c4fd085906fcb5541bbe
- 02:58 PM Feature #3849 (Resolved): Track slow PGs and times OSDs marked down
- Kyle Bader:
"Over the weekend of 01/02/13 we encountered an issue that we had not yet
encountered. One of our cephs... - 02:54 PM Feature #3848 (Resolved): osd: gracefully handle cluster network heartbeat failure
- From Kyle Bader
"Back in October we had a switch failure on our cluster (backend) network.
This was not noticed b... - 02:24 PM rbd Bug #3847 (Resolved): rbd: figure out correct byte order for watch version
- In the process of refactoring rbd code that builds up osd
operations I noticed that for NOTIFY_ACK and WATCH operati... - 01:40 PM Documentation #3846 (Resolved): Debian install has incorrect gitbuilder URL
- From http://ceph.com/docs/master/install/debian/ :... - 12:32 PM rbd Feature #1491 (Rejected): qemu: make qemu-img convert fast
- 12:28 PM CephFS Bug #3832 (Fix Under Review): client: does not observe O_SYNC
- Implemented in wip-3832. Needs review.
- 12:17 PM CephFS Bug #3845: mds: standby_for_rank not getting cleared on takeover
- I don't think it matters. It's a fixed lifecycle from standby -> active -> dead, so the leftover standby_ just te...
- 12:13 PM CephFS Bug #3845: mds: standby_for_rank not getting cleared on takeover
- This is a monitor thing; the MDS is only involved in relaying the config setting over on boot-up.
- 11:38 AM CephFS Bug #3845 (Closed): mds: standby_for_rank not getting cleared on takeover
- This is the mdsmap after mds.a was active and given rank 0, then killed, and another mds (mds.b-s-r0) that had standb...
- 11:34 AM CephFS Feature #3730: Support replication factor in Hadoop
- Sage Weil wrote:
> If there are more such cases, that is a separate bug!
It was a bug I had introduced in wip-cli... - 09:51 AM CephFS Feature #3730: Support replication factor in Hadoop
- Noah Watkins wrote:
> In Client, osdmap is protected by client_lock? If so, new version of branch isn't broken..
... - 08:55 AM CephFS Feature #3730: Support replication factor in Hadoop
- In Client, osdmap is protected by client_lock? If so, new version of branch isn't broken..
- 10:45 AM Subtask #3844 (Rejected): osd: move info and log into leveldb
- 10:45 AM Subtask #3843 (Rejected): osd: move purged_snaps out of info
- the purged_snaps set is really a property of the local pg instance rather than a global property and does not get upd...
- 10:42 AM Subtask #3842 (Rejected): osd: create tool to extract pg info and pg log from filestore
- Once these are moved into leveldb, it will be much more difficult to manually extract these structures.
- 10:41 AM Feature #3841 (Rejected): osd: avoid seeks for log and info writes on client writes
- Probable approach is to move log and info into leveldb.
- 10:38 AM Subtask #3840 (Resolved): osd: ack push after apply+commit
- This will prevent the primary from shoving another push before the first has completed. Alternately, make the number...
- 10:28 AM Documentation #3839 (Resolved): SSD crushmap example will not compile
- The SSD CRUSH map example (http://ceph.com/docs/master/rados/operations/crush-map/#placing-different-pools-on-differe...
- 10:24 AM CephFS Bug #1435: mds: loss of layout policies upon mds restart
- wip-mds-layout2
needs to be rebased, reviewed, and tested! - 10:13 AM Bug #3835 (Resolved): mon: timecheck: hits FAILED assert(m->epoch == timecheck_epoch) when monito...
- pushed to master, commit:c6f8010b1c8e4d54f9fb24b2e4e25ff8a2bde778
- 09:34 AM Bug #3835 (Fix Under Review): mon: timecheck: hits FAILED assert(m->epoch == timecheck_epoch) whe...
- 08:51 AM Bug #3835: mon: timecheck: hits FAILED assert(m->epoch == timecheck_epoch) when monitors are seve...
- This issue is fixed on wip-3835, commit:785a2bc3e9271607b1ddf25390056e9dd9c72b21
- 07:47 AM Bug #3835 (Resolved): mon: timecheck: hits FAILED assert(m->epoch == timecheck_epoch) when monito...
- The leader schedules a new 'ping' to the monitors in the quorum as soon as the pings are all sent.
This allows for... - 10:04 AM Bug #3820: osdmaptool - user cannot specify pool
- 85eb8e382a26dfc53df36ae1a473185608b282aa
- 09:58 AM Bug #3816 (Resolved): osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
- 09:50 AM rbd Feature #3838 (New): krbd: use common functions for striping calculations
- With the STRIPINGV2 feature bit, format 2 striping has the same parameters as cephfs striping. Re-work the rbd object...
- 09:29 AM Linux kernel client Feature #3837 (Resolved): krbd: support format 2 striping
- Format 2 images with the STRIPINGV2 feature bit set (created with rbd create --stripe-count X --stripe-unit Y --order...
- 09:12 AM rbd Feature #3754: krbd: use new request tracking code for notify ack
- Yay!
- 04:52 AM rbd Feature #3754: krbd: use new request tracking code for notify ack
- Yeehah! All tests passed, including the previously-failing
blogbench.sh, fsstress, and two passes through xfstests. - 09:11 AM Bug #2843: filestore: replay failure on xfs
- The post-v0.50 version of this bug was just fixed, commit:66eb93b83648b4561b77ee6aab5b484e6dba4771, which is backport...
- 02:38 AM Bug #2843: filestore: replay failure on xfs
- Hi,
We have exactly the same problem on one of our OSDs (bobtail 0.56.1).
[[https://gist.github.com/4555135]]
Wha... - 09:08 AM CephFS Bug #3261 (Rejected): mds crashes in EMetaBlob::replay
- Understood. I'm sorry we weren't able to dig in when it happened. When you do get around to retesting we should be ...
- 02:09 AM CephFS Bug #3261: mds crashes in EMetaBlob::replay
- Should I test the same btrfs volume with a new ceph? If so, I might get to it in the next month. Please close with ins...
- 05:19 AM Revision b0162fab (ceph): osdmaptool: more fix cli test
- Signed-off-by: Sage Weil <sage@inktank.com>
- 05:10 AM Revision 5bd8765c (ceph): osdmaptool: fix cli test
- Signed-off-by: Sage Weil <sage@inktank.com>
- 05:01 AM Revision 98a76312 (ceph): osd: leave osd_lock locked in shutdown()
- No callers expect the lock to be dropped.
Fixes: #3816
Signed-off-by: Sage Weil <sage@inktank.com> - 04:48 AM Revision 72db1a59 (ceph): When running teuthology with targets provisionned on OpenStack and kvm,...
- Signed-off-by: Loic Dachary <loic@dachary.org>
- 02:04 AM Revision faa62fa8 (ceph): radosgw: increate nofile ulimit in upstart
- The default ulimit for open file descriptors per process is 1024,
far too few for radosgw if you have lots of OSDs an... - 12:59 AM Revision df399da1 (ceph): rgw: copy object should not copy source acls
- Fixes: #3802
Backport: argonaut, bobtail
When using the S3 api and x-amz-metadata-directive is
set to COPY we used t... - 12:25 AM Revision 19ee2311 (ceph): ceph: adjust crush tunables via 'ceph osd crush tunables <profile>'
- Make it easy to adjust crush tunables. Create profiles:
legacy: the legacy values
argonaut: the argonaut defaults... - 12:19 AM Revision 85eb8e38 (ceph): osdmaptool: allow user to specify pool for test-map-object
- Fixes: #3820
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Gregory Farnum <greg@in...
01/16/2013
- 11:52 PM Revision 7b6fe032 (ceph): Merge branch 'wip_snap_scrub'
- Reviewed-by: Sage Weil <sage@inktank.com>
- 11:40 PM Revision 0946a78c (ceph): fix mon clock queue test syntax
- 11:30 PM Revision 20b27a1c (ceph): rgw: copy object should not copy source acls
- Fixes: #3802
Backport: argonaut, bobtail
When using the S3 api and x-amz-metadata-directive is
set to COPY we used t... - 11:22 PM Revision 37dbf7d9 (ceph): rgw: copy object should not copy source acls
- Fixes: #3802
Backport: argonaut, bobtail
When using the S3 api and x-amz-metadata-directive is
set to COPY we used t... - 10:42 PM Revision b8568747 (ceph): osd_types: add nlink and snapcolls fields to ScrubMap::object
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 10:42 PM Revision 57352351 (ceph): ReplicatedPG/PG: check snap collections during _scan_list
- During _scan_list check the snapcollections corresponding to the
object_info attr on the object. Report inconsistenc... - 10:42 PM Revision e65ea70e (ceph): ReplicatedPG: compare nlinks to snapcolls
- nlinks gives us the number of hardlinks to the object.
nlinks should be 1 + snapcolls.size(). This will allow
us to ... - 10:42 PM Revision 665577a8 (ceph): osd/ReplicatedPG: validate ino when scrubbing snap collections
- Signed-off-by: Sage Weil <sage@inktank.com>
- 10:42 PM Revision 381e2587 (ceph): osd/PG: fix osd id in error message on snap collection errors
- Signed-off-by: Sage Weil <sage@inktank.com>
- 10:42 PM Revision 70c35120 (ceph): ReplicatedPG: ignore snap link info in scrub if nlinks==0
- nlinks==0 implies that the replica did not send snap link information.
Signed-off-by: Samuel Just <sam.just@inktank.com> - 10:42 PM Revision 39bc6549 (ceph): PG: move auth replica selection to helper in scrub
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 10:42 PM Revision 9e44fca1 (ceph): ReplicatedPG: correctly handle new snap collections on replica
- Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com> - 10:35 PM Revision 88956e31 (ceph): ReplicatedPG: make_snap_collection when moving snap link in snap_trimmer
- Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com> - 10:33 PM Revision 3f0ad497 (ceph): librados.hpp: fix omap_get_vals and omap_get_keys comments
- We list keys greater than start_after.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <... - 10:33 PM Revision 625c3cb9 (ceph): rados.cc: fix rmomapkey usage: val not needed
- Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <samuel.just@inktank.com> - 10:33 PM Revision 44c45e52 (ceph): rados.cc: fix listomapvals usage: key,val are not needed
- Fixes: #3812
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com> - 10:33 PM Revision cb5e2be4 (ceph): rados.cc: use omap_get_vals_by_keys in getomapval
- Fixes: #3811
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com> - 09:57 PM Revision 3c67ee36 (ceph): rbd: add test for formatted output from rbd cli
- 09:41 PM RADOS Bug #3834: crushtool really really hates \r
- Ha! Sorry about that. Maybe iswhite() (or whatever that helper is) would be best here?
- 09:36 PM RADOS Bug #3834 (Resolved): crushtool really really hates \r
- Spent a long time trying to figure out why a crush map wouldn't compile; finally got it to no differences at all, eve...
- 09:29 PM Revision 333cc0d5 (ceph): Merge branch 'wip-rbd-formatted-output'
- Reviewed-by: Dan Mick <dan.mick@inktank.com>
Conflicts:
src/rbd.cc
src/test/cli/rbd/help.t - 09:23 PM rbd Feature #3754: krbd: use new request tracking code for notify ack
- OK, that quick fix wasn't enough.
I had a spinlock protecting the check for something being
complete. But that w... - 08:13 PM rbd Feature #3754: krbd: use new request tracking code for notify ack
- Well that's unfortunate. I hit the same problem. I'll
need to take a closer look I guess. - 07:39 PM rbd Feature #3754: krbd: use new request tracking code for notify ack
- Seems to be working better. It may end up being an
atomic rather than protecting with a spinlock, but
either way, ... - 03:15 PM rbd Feature #3754: krbd: use new request tracking code for notify ack
- I've pretty much implemented this feature but having done
this I'm looking at a crash that happened with this code
... - 09:17 PM Revision b59c27dd (ceph): Merge branch 'master' into wip-scrub
- 09:15 PM Revision fb4bb5d7 (ceph): osd: better error message for request on pool that dne
- If the request is sent when the pool didn't even exist, say so. This
would have made #3734 a bit easier to track dow... - 09:14 PM Revision 9a1f5742 (ceph): osd: drop newlines from event descriptions
- These produce extra newlines in the log.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.j... - 09:14 PM Revision 6934ac3f (ceph): rbd: move Formatter construction to main
- Each method that uses a formatter is doing the same thing.
Simplify by constructing and handling errors only once.
Al... - 09:14 PM Revision 8fea6dee (ceph): rbd: add --pretty-format option
- This is the same option the rados and radosgw-admin tool use for more
human-readable json/xml.
Signed-off-by: Josh D... - 09:14 PM Revision 4e5a07bc (ceph): XMLFormatter: fix pretty printing
- It used the wrong indentation level and did not add a newline after
closing a section. dump_stream() did not indent a... - 09:14 PM Revision d7cdcc0e (ceph): rbd: regenerate man page and cli test
- Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
- 09:14 PM Revision f6dabc83 (ceph): rbd: always output result for formatted output
- When there's nothing, return an empty array.
This way scripts don't have to special case this.
Signed-off-by: Josh D... - 09:14 PM Revision 0efb9c51 (ceph): test: add cram integration test for formatted output
- This can be used with the new teuthology cram task.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> - 09:14 PM Revision 84c5d857 (ceph): rbd: support plain/json/xml output formatting
- This patch renames the --format option to --image-format, for
specifying the RBD image format, and uses --format to s... - 09:14 PM Revision 98487b56 (ceph): rbd: fix long lines
- Several >80 characters have crept in recently.
The older ones generally don't have very useful history,
so I'm not wo... - 07:21 PM Revision a586966a (ceph): osd: fix rescrub after repair
- We were rescrubbing if INCONSISTENT is set, but that is now persistent.
Add a new scrub_after_recovery flag that is r... - 07:21 PM Revision 8e33a8b9 (ceph): mon: note scrub errors in health summary
- Signed-off-by: Sage Weil <sage@inktank.com>
- 07:17 PM Revision 476eb24b (ceph): Merge branch 'wip-rpm-update'
- Merges work around for odd AS_IF behaviour in configure.ac.
- 06:37 PM rgw Bug #3813: radosgw doesn't have a logrotate script
- Let's go with /var/log/radosgw and a separate logrotate script. Simpler!
- 09:06 AM rgw Bug #3813: radosgw doesn't have a logrotate script
- Given that radosgw gets installed without ceph, it seems like the viable options are putting the logrotate config in ...
- 04:14 AM rgw Bug #3813: radosgw doesn't have a logrotate script
- Note that the official docs suggest to put "log file = /var/log/ceph/radosgw.log" too. If "ceph" isn't installed, thi...
- 04:02 AM rgw Bug #3813 (Resolved): radosgw doesn't have a logrotate script
- Currently there's no logrotate configuration for radosgw at all. Even if one sets "log file" to /var/log/ceph/somethi...
- 06:35 PM Feature #3833 (Resolved): osd: improve recovery throttling
- 06:24 PM Bug #3810 (Need More Info): btrfs corrupts file size on 3.7
- I need a dump of the xattrs on the d0c18e1d/605.00000000/head//1 object in pg 1.1d on osd 7 and osd 0
- 05:59 PM CephFS Bug #3832 (Resolved): client: does not observe O_SYNC
- if the file was opened with O_SYNC we need to flush the io on every write call.
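The behaviour described in this report can be illustrated with a toy model. This is a minimal sketch, not the Ceph client code: `BufferedFile`, its `buffer` and `flushed` lists, and `flush()` are hypothetical stand-ins for a client file handle with a write-back buffer, and `os.O_SYNC` is assumed to be available (it is on Linux and other Unix platforms).

```python
import os


class BufferedFile:
    """Toy file handle with a write-back buffer (hypothetical)."""

    def __init__(self, flags: int):
        self.flags = flags
        self.buffer = []   # writes accepted but not yet flushed
        self.flushed = []  # writes that have been pushed out

    def flush(self) -> None:
        # Move everything buffered so far to the "stable" list.
        self.flushed.extend(self.buffer)
        self.buffer.clear()

    def write(self, data: bytes) -> int:
        self.buffer.append(data)
        # If the file was opened with O_SYNC, flush on every write call.
        if self.flags & os.O_SYNC:
            self.flush()
        return len(data)


if __name__ == "__main__":
    sync_file = BufferedFile(os.O_WRONLY | os.O_SYNC)
    sync_file.write(b"hello")
    print("pending after O_SYNC write:", sync_file.buffer)

    lazy_file = BufferedFile(os.O_WRONLY)
    lazy_file.write(b"hello")
    print("pending after buffered write:", lazy_file.buffer)
```

The only difference between the two handles is the flag check in `write()`; without it, an `O_SYNC` writer would see the same deferred behaviour as an ordinary buffered writer.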
- 05:49 PM Bug #3795 (Resolved): loadgen task gets into msgr loop
- 05:44 PM rgw Feature #3207 (Resolved): qa: swift functional tests in nightly
- 05:41 PM Revision c1a86ab1 (ceph): configure.ac: fix problem with --enable-cephfs-java
- The AS_IF used to cover java related checks via --enable-cephfs-java
didn't work correctly. Use a plain 'if/fi' inste... - 05:34 PM CephFS Feature #3730: Support replication factor in Hadoop
- Oh right, libcephfs is not built on top of librados. Never mind, that's a whole different discussion we start occasio...
- 05:15 PM CephFS Feature #3730: Support replication factor in Hadoop
- I don't think libcephfs will give up an instance of the rados client, if that's what you mean by grant access to rado...
- 04:33 PM CephFS Feature #3730: Support replication factor in Hadoop
- Sorry to back this up a little, but I can't recall — does using libcephfs automatically grant a user access to the RA...
- 04:30 PM CephFS Feature #3730: Support replication factor in Hadoop
- This interface update is up for review in wip-client-pool-api
- 09:52 AM CephFS Feature #3730: Support replication factor in Hadoop
- From stand-up, stick with int64_t for userspace, and enforce 32-bit range.
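One way to read "enforce 32-bit range" is a bounds check applied to the wider value before it is used. The sketch below assumes the signed 32-bit range is meant, which the comment does not state; `check_pool_id` and the two constants are hypothetical names, not part of libcephfs.

```python
# Assumed bounds: signed 32-bit range (an assumption, not stated in the ticket).
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def check_pool_id(pool_id: int) -> int:
    """Accept a 64-bit-wide pool id only if it fits in 32 bits (hypothetical)."""
    if not INT32_MIN <= pool_id <= INT32_MAX:
        raise ValueError(f"pool id {pool_id} is outside the 32-bit range")
    return pool_id


if __name__ == "__main__":
    print(check_pool_id(3))
    try:
        check_pool_id(2 ** 31)
    except ValueError as err:
        print("rejected:", err)
```

Keeping the wide type in the interface while rejecting out-of-range values means callers do not need to change if the range is widened later.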
- 09:43 AM CephFS Feature #3730: Support replication factor in Hadoop
- The move from int32 -> int64 was misguided, and incomplete. At this point it's not really worth the effort to move a...
- 07:31 AM CephFS Feature #3730: Support replication factor in Hadoop
- It looks like in OSDMap there is some mixed usage of int64 and int for pool id, too. In Client::_create pool id is e...
- 06:40 AM CephFS Feature #3730: Support replication factor in Hadoop
- Can we change the type in libcephfs to uint64? We're the only ones calling ceph_get_file_pool() right now as far as ...
- 05:33 PM Bug #3820 (Resolved): osdmaptool - user cannot specify pool
- 02:24 PM Bug #3820 (Resolved): osdmaptool - user cannot specify pool
- 05:23 PM Documentation #3831 (Resolved): ceph osd crush set command needs correction in the doc
- ceph osd crush set command has different parameters in different places.
http://ceph.com/docs/master/rados/operat... - 05:21 PM rgw Bug #3802 (Resolved): x-amz-acl header ignored on copy operation
- Fixed, commit:ccfefe3097a51b49885f2ed5d9334e85b497d963. Fix was pushed to both argonaut and bobtail branches.
- 11:17 AM rgw Bug #3802: x-amz-acl header ignored on copy operation
- ok, affects both argonaut and bobtail. Actual bug is when copying object, if x-amz-metadata-directive is set to COPY ...
- 10:01 AM rgw Bug #3802: x-amz-acl header ignored on copy operation
- On what version?
- 05:16 PM RADOS Documentation #3830 (Closed): crush-map.rst: chooseleaf doesn't include 'firstn|indep', and 'aggr...
- 1) I think chooseleaf should also include [firstn|indep] like choose does.
2) I'm not certain I understand just wh... - 05:15 PM Bug #3829 (Can't reproduce): new osd added to the cluster is not receiving data
- ceph version: 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
1. Initially , had a cluster[burnupi21,burnupi22,b... - 04:12 PM CephFS Bug #3828 (Rejected): seeing error: fault, server, going to standby whenever I run a ceph-syn loa...
- This is showing up on your MDS, about 15 minutes after a client completes accesses, right? This is associated with th...
- 04:01 PM CephFS Bug #3828 (Rejected): seeing error: fault, server, going to standby whenever I run a ceph-syn loa...
- While validating bug 520, I saw an interesting error. It may be a red herring, as I am seeing no problem with the wr...
- 03:47 PM CephFS Bug #520 (Closed): mds: change ifile state mix->sync on (many) lookups?
- 3 Node Cluster:
ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
# cat /etc/ceph/ceph.conf
[global]... - 02:51 PM CephFS Bug #520: mds: change ifile state mix->sync on (many) lookups?
- csyn is now called ceph-syn
and --debug-ms 1 to see those messages go by! - 03:43 PM Revision 1d50affc (ceph): mds: fix usage typo for ceph-mds
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 03:26 PM CephFS Bug #3261: mds crashes in EMetaBlob::replay
- This looks like a problem with what's in the journal, but so much MDS code has changed since then that I don't think...
- 03:24 PM CephFS Bug #1760 (Resolved): multiple_rsync workunit cannot remove non-empty directory intermittently
- this also looks like the tmap problem, commit:e52ebacb73747ef642aabdb3cc3cb2a328687a4c and preceding 4 commits.
- 03:23 PM CephFS Bug #2380 (Rejected): kclient: aufs over a cephfs mount fails with Stale NFS file handle
- this is a generic problem with lookup by ino, see #3541 and other features
- 03:23 PM CephFS Bug #2092 (Can't reproduce): BUG at fs/ceph/caps.c:999
- commit:561cf283173360c39db19dc735da4a319be68ff6 fixes the multi-mds case. we haven't seen this again for single-mds.....
- 03:21 PM Bug #3827 (Resolved): crushtool --test: claims to want -o, really wants --output-csv or --show-*
- The error message is wrong, apparently, for crushtool's test mode; it looks like it wants either
--output-csv (in wh... - 03:11 PM CephFS Feature #3826 (Resolved): uclient: Be more aggressive about checking for pools we can't write to
- Right now the client will happily buffer up writes to a pool that it can't actually write to. #2753 is going to make ...
- 03:06 PM CephFS Bug #3746 (Rejected): kclient mmap doesn't zero past EOF
- Run against bad code.
- 03:03 PM CephFS Bug #2444 (Can't reproduce): null pointer deference in ceph_d_prune inside kvm
- 03:00 PM CephFS Bug #2071 (Can't reproduce): kclient: pjd mkfifo failures
- 02:59 PM CephFS Bug #1770 (Can't reproduce): directory nonexistent on kernel_untar_build.sh
- 02:58 PM CephFS Bug #1749 (Can't reproduce): nonexistent directory in kclient_workunit_kernel_untar_build
- 02:55 PM CephFS Bug #1318 (Resolved): directories disappear across multiple rsyncs
- commit:e52ebacb73747ef642aabdb3cc3cb2a328687a4c and 4 preceding patches fix up the TMAP bug that is the likely cause...
- 02:55 PM CephFS Bug #1511: fsstress failure with 3 active mds
- Sam thinks this works now! Adding to QA suite.
- 02:50 PM CephFS Bug #3625 (Resolved): client: EEXIST error on multiple clients to create
- commit:b4d3bd06d4083d780755f6ef506df1643932fa2f
- 02:49 PM CephFS Bug #3625: client: EEXIST error on multiple clients to create
- Maybe you already handled this?
- 02:11 PM CephFS Bug #3625 (Fix Under Review): client: EEXIST error on multiple clients to create
- 06:16 AM CephFS Bug #3625: client: EEXIST error on multiple clients to create
- The kernel side has been reviewed and tested, but needs to be merged. The fuse side has been tested, but I think it ...
- 02:48 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
- We should return an error code on fsync(); that is the quick fix.
a more polite feature will be opened to return ... - 09:19 AM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
- This is clearly a bug, bureaucracy or not. It should not be a feature. We can do new development to fix a bug. If you...
- 02:47 PM Bug #3812 (Resolved): rados.cc listomapvals usage is wrong, <key> <val> are ignored and not needed
- 02:47 PM Bug #3811 (Resolved): rados.cc getomapval implementation is broken, should use omap_get_vals_by_keys
- 02:46 PM CephFS Bug #3544: ./configure checks CFLAGS for jni.h if --with-hadoop is specified but also needs to ch...
- I think this can be closed. There are a bunch of autoconf changes for Java that have been or will be merged.
- 02:41 PM CephFS Bug #3544: ./configure checks CFLAGS for jni.h if --with-hadoop is specified but also needs to ch...
- I just did a ./configure using CPPFLAGS to indicate where the jni headers were, and that worked just fine. Using C...
- 02:45 PM CephFS Bug #3254: mds: Replica inode's parent snaprealms are not open
- Multi-mds, currently low priority.
- 02:44 PM CephFS Bug #3637 (In Progress): client: not issuing caps for with clients doing shared writes
- 02:43 PM CephFS Bug #3637 (Fix Under Review): client: not issuing caps for with clients doing shared writes
- 02:40 PM CephFS Bug #3498 (Resolved): mds: mds assert failure during untar_kernel
- this was a msgr bug, long since fixed. commit:36c0fd220ef02b1ffd7a3ae0d98e0fdec6b55a5b or thereabouts
- 02:39 PM CephFS Bug #1666: hadoop: time-related meta-data problems
- http://www.mail-archive.com/ceph-devel@vger.kernel.org/msg10334.html
Also wip-mtime-incr in the ceph repo. - 02:38 PM CephFS Bug #2218: CephFS "mismatch between child accounted_rstats and my rstats!"
- 02:32 PM CephFS Feature #3821 (New): qa: run backuppc as part of qa suite
- 02:32 PM CephFS Bug #2494 (Can't reproduce): mds: Cannot remove directory despite it being empty.
- The dupe inode suggests this is the problem fixed by Yan's tmap fixes.
- 02:29 PM CephFS Bug #2019 (Can't reproduce): mds: CInode::filelock stuck in sync->mix
- Presumably we'll see this again, but it hasn't turned up in our testing lately and we need more info to debug it.
- 02:27 PM CephFS Bug #1811 (Duplicate): 2 pjd chown tests failed on cfuse
- 02:22 PM CephFS Bug #1537 (Resolved): cmds 100% when copying lots of files, mds_cache_size and mds_bal_frag
- This is an optimization issue, which we'll get to!
- 02:22 PM Bug #3816: osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
- Interesting, but where did this actually come from?
And why didn't it get triggered when I started the OSDs again? ... - 01:08 PM Bug #3816 (Fix Under Review): osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
- -5678> 2013-01-15 17:18:24.509093 7f5a10cec700 1 accepter.accepter.rebind avoid 6812
-5677> 2013-01-15 17:18:24.5... - 12:43 PM Bug #3816: osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
- As requested on the mailing list, I'm attaching the log files from osd.0 to osd.3
There is indeed an osd_map logline... - 09:59 AM Bug #3816 (Resolved): osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
- ...
- 02:21 PM CephFS Feature #3819 (Resolved): mds: re-add snaptests to qa suite
- 02:02 PM CephFS Bug #3818 (Duplicate): kclient: fsx fails in mapread
- With the fix in #3681, fsx fails in mapread with bad data. It looks like this is unrelated to the fix, and is a se... - 01:56 PM Bug #3786: osd: scrub is deferred indefinitely if load is high
- Fixed by https://github.com/ceph/ceph/commit/299548024acbf8123a4e488424c06e16365fba5a
- 01:38 PM Bug #3786 (Resolved): osd: scrub is deferred indefinitely if load is high
- 01:38 PM Bug #3774 (Resolved): osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
- 11:38 AM rbd Feature #3817 (Resolved): librbd: make cache write-through until a flush is encountered
- Writeback caching is unsafe if higher layers don't send flushes. qemu can be accidentally misconfigured to not send f...
- 11:09 AM CephFS Feature #3543 (In Progress): mds: new encoding
- Oh, this has been in progress all week.
- 10:35 AM CephFS Bug #3773 (Can't reproduce): mds crashed at LogEvent::decode
- I have been trying to reproduce this but have not hit it yet.
Will reopen the bug when needed. - 10:34 AM Bug #3801 (New): Cascading OSD failures beginning with common/HeartbeatMap.cc: 78: FAILED assert(...
- 10:28 AM Bug #3801: Cascading OSD failures beginning with common/HeartbeatMap.cc: 78: FAILED assert(0 == "...
- Sage Weil wrote:
> The osd.40 error means the fs returned EIO on a read operation. Check your kern.log; there is p... - 09:39 AM Bug #3801 (Need More Info): Cascading OSD failures beginning with common/HeartbeatMap.cc: 78: FAI...
- The osd.40 error means the fs returned EIO on a read operation. Check your kern.log; there is probably a bad disk, ...
- 09:41 AM Feature #3815 (Duplicate): osd: move pg_info_t back into the xattr; avoid writing pginfo file whe...
- see wip-pginfo for a hacky prototype.
did some testing, and it looks good:... - 09:39 AM Linux kernel client Bug #3800 (Won't Fix): libceph: check compatibility between ceph modules
- 07:03 AM Feature #3805: log: detect dup messages
- The one that comes to mind is "no heartbeat from osd.foo since timestamp bar" messages. We could try to identify the...
- 06:43 AM Revision 2dc2b480 (ceph): mds: use #defines for bits per cap
- Hard-coding 0xff in SimpleLock.h is too far away from where we add new cap
bits.
Signed-off-by: Sage Weil <sage@inkt... - 06:04 AM CephFS Bug #3601: client: With multiple clients, file remove doesn't free up space
- Yeah, it's that the LRU doesn't have a timeout.
The mds could send an "enable timeout" message to clients once it se... - 03:27 AM Revision 63e33c8a (ceph): osd: send forced scrub/repair through scrub scheduling
- This marks a PG for immediate scrub or repair. Adjust the sched_scrub()
code so that we handle these PGs even when s... - 03:26 AM Revision 27ad74b9 (ceph): osd: use helpers to queue a PG in the scrub LRU
- Move the duplicated reach into info.history.last_scrub_stamp into a helper
so we can control when we queue the PG for... - 03:25 AM Revision f8a649c0 (ceph): osd/ReplicatedPG: validate ino when scrubbing snap collections
- Signed-off-by: Sage Weil <sage@inktank.com>
- 03:25 AM Revision 8fb04813 (ceph): ReplicatedPG: compare nlinks to snapcolls
- nlinks gives us the number of hardlinks to the object.
nlinks should be 1 + snapcolls.size(). This will allow
us to ... - 03:24 AM Revision 4affecee (ceph): ReplicatedPG/PG: check snap collections during _scan_list
- During _scan_list check the snapcollections corresponding to the
object_info attr on the object. Report inconsistenc... - 03:21 AM Revision 40e0f2db (ceph): byteorder: fix gcc 4.7 warnings
- ./include/encoding.h: In function 'void encode(int64_t, ceph::bufferlist&, uint64_t)':
./include/encoding.h:101:1: wa... - 03:21 AM Revision dde83262 (ceph): osd_types: add nlink and snapcolls fields to ScrubMap::object
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 03:21 AM Revision f969f6b3 (ceph): osd_types: bring ScrubMap::object up to the 0.56.1 encoding
- We need to introduce some new fields here, so to maintain compatibility
we'll need to first bring the 48.* series up ... - 03:21 AM Revision b6561a2f (ceph): osd: make missing head non-fatal during scrub
- If we encounter a scrub without a preceding head, warn instead of
crashing. Note that this is still something we ca... - 02:00 AM Revision d882d053 (ceph): ReplicatedPG: fix snapdir trimming
- The previous logic was both complicated and not correct. Consequently,
we have been tending to drop snapcollection l... - 02:00 AM Revision 015a454a (ceph): osdmap: spread replicas across hosts with default crush map
- This is more often the case than not, and we don't have a good way to
magically know what size of cluster the user wi... - 02:00 AM Revision 55b7dd32 (ceph): mon: OSDMonitor: don't output to stdout in plain text if json is specified
- Fixes: #3748
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
(che... - 02:00 AM Revision 898a4b19 (ceph): Revert "osdmap: spread replicas across hosts with default crush map"
- This reverts commit 503917f0049d297218b1247dc0793980c39195b3.
This breaks vstart and teuthology configs. A better f... - 02:00 AM Revision 3293b31b (ceph): OSD: only trim up to the oldest map still in use by a pg
- map_cache.cached_lb() provides us with a lower bound across
all pgs for in-use osdmaps. We cannot trim past this sin...
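The trimming rule in the commit message above can be sketched as follows. This is a minimal Python illustration, not the OSD code: `trim_bound`, `epochs_to_trim`, `oldest_stored`, and `requested_trim_to` are hypothetical names, and `cached_lower_bound` stands in for the value that `map_cache.cached_lb()` is described as providing.

```python
def trim_bound(requested_trim_to: int, cached_lower_bound: int) -> int:
    """Return the first epoch that must be kept.

    requested_trim_to:  epoch the caller would like to trim up to (exclusive).
    cached_lower_bound: oldest osdmap epoch still in use by any PG
                        (assumed to be what map_cache.cached_lb() reports).
    """
    # Never trim past a map that some PG may still reference.
    return min(requested_trim_to, cached_lower_bound)


def epochs_to_trim(oldest_stored: int,
                   requested_trim_to: int,
                   cached_lower_bound: int) -> list:
    """List the stored epochs that are safe to remove."""
    bound = trim_bound(requested_trim_to, cached_lower_bound)
    # range() is empty when bound <= oldest_stored, so nothing is trimmed.
    return list(range(oldest_stored, bound))
```

With stored maps starting at epoch 10, a request to trim to 20, and a lower bound of 15, only epochs 10 through 14 would be removed under this sketch.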
01/15/2013
- 10:07 PM Revision c8a9a9a8 (ceph): Add cram task
- This runs cram tests, which are an easy way to test output
stays consistent. We already use cram for basic cli tests ... - 09:39 PM Bug #3811 (Fix Under Review): rados.cc getomapval implementation is broken, should use omap_get_v...
- 09:21 PM Bug #3811 (Resolved): rados.cc getomapval implementation is broken, should use omap_get_vals_by_keys
- 09:38 PM Bug #3812 (Fix Under Review): rados.cc listomapvals usage is wrong, <key> <val> are ignored and n...
- 09:22 PM Bug #3812 (Resolved): rados.cc listomapvals usage is wrong, <key> <val> are ignored and not needed
- 08:53 PM CephFS Feature #3728 (Resolved): mds: draft design for lookup by ino
- 08:41 PM Revision cf149c8c (ceph): Merge branch 'wip-rpm-update'
- Clean-up the handling of ceph java bindings in the rpm specfile and
configure.ac. - 08:38 PM CephFS Feature #3730: Support replication factor in Hadoop
- pool ids are currently exposed via libcephfs from ceph_file_layout, which uses a 32bit integer for pool id. However, ...
- 08:34 PM CephFS Feature #3730: Support replication factor in Hadoop
- Someone could toss a 'ceph osd pool set size' Hadoop's way, so a static mapping between pg pool size and pool name co...
- 07:51 PM rbd Feature #3754: krbd: use new request tracking code for notify ack
- I'm not sure yet whether the problem has to do with this
or whether it's in the existing "new request" code. But
I... - 06:23 PM Documentation #3808: Block device quick start page need update
- Fixed description formatting. Also, 3784 is in master now (e94b06a19218decaf7d2d7b009bd862040f20285)
- 04:46 PM Documentation #3808: Block device quick start page need update
- The current writeup also assumes that the mount is local to the cluster so it hides (for the beginner) important deta...
- 03:38 PM Documentation #3808: Block device quick start page need update
- -c and --secret aren't needed if you're using the default ceph.conf and your keyring can be found based on your ceph....
- 03:30 PM Documentation #3808 (Resolved): Block device quick start page need update
- The instructions don't match well with the bobtail release.
- should include a note that ceph-common needs to be ins... - 06:21 PM Feature #3805: log: detect dup messages
- I tend to think there aren't very many dups we could usefully compress. It's pretty easy to add a one-string buffer ...
- 02:25 PM Feature #3805: log: detect dup messages
- What kind of dups are we trying to detect?
This sounds to me like a wishlist item that requires much more work to... - 02:17 PM Feature #3805 (New): log: detect dup messages
- If a log message comes through and is a dup of the previous, increment a counter or something and only log it once wi...
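One way to read the request above is sketched below: keep the previous message in a one-string buffer, count repeats, and emit a summary line when a different message arrives. This is a Python illustration under assumptions, not Ceph's logging code; the class name `DupSuppressor`, the `emit` callback, and the wording of the summary line are all hypothetical.

```python
class DupSuppressor:
    """Collapse consecutive duplicate log messages into a repeat count."""

    def __init__(self, emit):
        self._emit = emit      # callable that actually writes a log line
        self._last = None      # previous message (the one-string buffer)
        self._repeats = 0      # how many times it was repeated since emitted

    def log(self, message):
        if message == self._last:
            # Same as the previous message: count it instead of writing it.
            self._repeats += 1
            return
        self.flush()
        self._emit(message)
        self._last = message

    def flush(self):
        """Write the pending repeat summary, if any."""
        if self._repeats:
            self._emit(f"last message repeated {self._repeats} times")
        self._repeats = 0
```

A caller would wrap its real sink, for example `DupSuppressor(print)`, and call `flush()` on shutdown so a trailing run of duplicates is not lost.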
- 05:35 PM CephFS Bug #3254: mds: Replica inode's parent snaprealms are not open
- No. So far I have focused on stabilizing basic fs functionality for multi-MDS setups and have completely ignored snapshots.
- 03:28 PM CephFS Bug #3254: mds: Replica inode's parent snaprealms are not open
- Hmm, did this get fixed by some of Zheng's later patches? I remember things about snaprealms and migration...
- 05:33 PM Bug #3810 (Resolved): btrfs corrupts file size on 3.7
- After creating a new Ceph cluster, PGs become inconsistent after using the qemu client. Logs indicate that the prima...
- 04:54 PM Bug #3809 (Won't Fix): crush compiler errors are not helpful
- Small, or large, errors in the CRUSH input are apparently all treated the same by crushtool -c:
error: parse error a... - 04:44 PM CephFS Feature #3289: ceph-fuse: somehow exert pressure on the VFS to remove dentries from the cache
- #3575 should be kept in mind while doing this/instead of this — there's a forget_multi as well.
- 04:44 PM CephFS Bug #3601 (New): client: With multiple clients, file remove doesn't free up space
- Whoops, didn't mean to change that status.
- 04:43 PM CephFS Bug #3601 (Duplicate): client: With multiple clients, file remove doesn't free up space
- The LRU actually already exists; check out Client::lru. (Unless I'm misunderstanding something?) So we might want to ...
- 04:37 PM CephFS Bug #925: mds: update replica snaprealm on rename
- De-prioritizing multi-MDS issues...
- 04:34 PM CephFS Bug #1117: mds: rename rollback broken on slaves during replay
- De-prioritizing multi-mds issues for now.
- 04:27 PM CephFS Bug #1435: mds: loss of layout policies upon mds restart
- I'm guessing we want to move this up the queue; will discuss in bug scrub tomorrow!
- 04:23 PM CephFS Bug #1511: fsstress failure with 3 active mds
- De-prioritizing multi-mds failures at this time.
- 04:23 PM CephFS Bug #1535: concurrent creating and removing directories crashes cmds
- De-prioritizing multi-MDS bugs at this time.
- 03:51 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
- Fair enough, but if I can just make a suggestion, perhaps you might want to explain these procedures somewhere in the...
- 03:45 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
- I agree it's a bug, but given the procedures we have now (ack! changing procedures coming alert!) I don't think we wa...
- 03:43 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
- No, please. A write pretending to succeed while actually not writing data _is_ a bug. The filesystem _not lying to it...
- 03:33 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
- This is a great suggestion but falls into feature rather than bug-fix category. My initial thought is keeping a list ...
- 03:42 PM CephFS Bug #1675 (Can't reproduce): mds: failed rstat assert
- The logs are long gone. This will presumably pop up again; it's a pretty common failure mode, but there's nothing in ...
- 03:38 PM CephFS Bug #1938: mds: snaptest-2 doesn't pass with 3 MDS system
- De-prioritizing all multi-MDS bugs for now.
- 03:27 PM CephFS Bug #3267: Multiple active MDSes stall when listing freshly created files
- Currently de-prioritizing multi-MDS bugs.
- 03:23 PM Bug #3537: Logs can run root out of space and crash ceph cluster (need more aggressive log rotation)
- Not an FS bug, and #3775 has a lot more conversation on this subject.
- 03:22 PM Bug #3552: After ceph-deploy installation a reboot breaks OSDs
- Whoops, not an FS bug!
I've put this in the main Ceph project for now, but it might also belong in devops. We need... - 03:18 PM CephFS Bug #3625: client: EEXIST error on multiple clients to create
- I know you guys did a couple rounds on this one, what's the status?
- 02:39 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
- Yes, the question is why they're 'getting unlucky'.
- 02:22 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
- Haven't looked into this, but my guess is a couple PGs are getting unlucky with their replica selection. I assume you...
- 02:17 PM Bug #3806 (Won't Fix): OSDs stuck in active+degraded after changing replication from 2 to 3
- Small 3-node cluster running 0.56.1-1~bpo60+1 on Debian/Squeeze, with "tunables" enabled
I recently changed the r... - 02:27 PM RADOS Feature #3807 (Resolved): crush: simple commands to create common rules
- These should be in CrushWrapper or similar, and available via crushtool and via some 'ceph osd crush ...' commands.
... - 02:16 PM Feature #3775: log: stop logging in statfs reports usage above some threshold
- I agree. If there are lots of log messages at the default levels, that is the problem. I don't think there is much ...
- 01:59 PM Feature #3775 (Need More Info): log: stop logging in statfs reports usage above some threshold
- So I suggest we split this into two issues:
1) the documentation examples show an awfully-high logging value for s... - 12:03 PM Feature #3775: log: stop logging in statfs reports usage above some threshold
- so, a couple ideas of what can be done.
if we do set size and frequency (or inform the user how to), then it could... - 11:39 AM Feature #3775: log: stop logging in statfs reports usage above some threshold
- So a couple of thoughts:
1) changing size in logrotate.conf doesn't help unless we also change frequency
2) with ... - 02:15 PM Documentation #3804 (Resolved): Logging section recommends fairly high levels, doesn't stress how...
- 3775 introduced the observation that logs can fill very quickly and bury a small root disk.
Our documentation could ... - 02:03 PM rbd Feature #3635: rbd cli: call "udevadm settle" after use of add/remove kernel interface
- commit:15bb00cafc31305cacf3c4684a429c2c9ee6f804 in master
- 02:03 PM rbd Feature #3635 (Resolved): rbd cli: call "udevadm settle" after use of add/remove kernel interface
- 02:02 PM rbd Feature #3784: rbd: issue modprobe when rbd map is called
- commit:e94b06a19218decaf7d2d7b009bd862040f20285 in master
- 02:01 PM rbd Feature #3784 (Resolved): rbd: issue modprobe when rbd map is called
- 01:47 PM Bug #3803 (Resolved): rados parsing error with hostnames in mon_host
- Never mind, this is fixed in v0.48.3argonaut too.
- 01:45 PM Bug #3803: rados parsing error with hostnames in mon_host
- Responded to the upstream bug. This is fixed in master and bobtail, but not backported to argonaut. Should we?
- 08:37 AM Bug #3803 (Resolved): rados parsing error with hostnames in mon_host
- In /etc/ceph/ceph.conf, if I set hostnames in the mon_host variable and separate them with spaces, the parsing algori...
- 01:25 PM CephFS Bug #3637: client: not issuing caps for with clients doing shared writes
- Sage has a different proposed fix than what's in the branch. Still needs to be tested.
- 12:50 PM CephFS Bug #3637: client: not issuing caps for with clients doing shared writes
- I don't remember where this ended up. Was the proposed fix problematic, or did it never get looked at?
- 01:16 PM Bug #3770: OSD crashes on boot
- Yeah, I just pushed a work-around branch (which I haven't tested much, so ideally you would try it on a node you can ...
- 12:08 PM rbd Subtask #3741: krbd: rework request tracking code
- I found the source of my trouble, and in the process understood
a little more about some subtlety in bio reference c... - 11:39 AM CephFS Bug #3718: multi-client dbench gets stuck over NFS exported cephfs
- This apparently is only a problem under re-export, which I believe we are not focusing on right now.
- 11:35 AM CephFS Bug #3553: MDS core dumped running 0.48.2argonaut
- Given what we know so far (the Op got sent to the wrong OSD) this is a bug in the Objecter, not the MDS. Or possibly ...
- 11:17 AM Bug #3771: ceph does not have startup scripts in Centos
- Not an FS bug! :)
- 10:17 AM Bug #3771 (In Progress): ceph does not have startup scripts in Centos
- 11:16 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
- Whoops, this was never an FS bug. :)
- 10:15 AM Bug #3768 (In Progress): perl is required for logrotate, we need to include Perl as a dependency
- 10:54 AM Bug #3747: PGs stuck in active+remapped
- No I didn't, just the CRUSH rule.
- 10:46 AM Bug #3747 (Need More Info): PGs stuck in active+remapped
- Faidon: did you also change the replication level of pool 3 (.rgw.buckets) ?
- 10:18 AM Feature #3505 (In Progress): default to libnss
- This may already have been done. Will double check.
- 10:16 AM Feature #3733 (In Progress): osd: update leveldb submodule
- 10:10 AM Bug #3797 (Need More Info): osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest ...
- 07:09 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- Can you try re-upgrading one of the nodes and starting it with debug file store = 20? That will tell us what it is writing.
- 02:49 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- I just downgraded to 0.48.2argonaut and everything seems to be running normally again now:
Before downgrade:
ii ... - 02:28 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- Here's the output of dstat http://pastie.org/5687470.text
I'm not sure why it is writing so much now, before the ... - 02:17 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
- I just noticed the second osd is now consuming 100% cpu too. Before it was properly running for around 15 minutes. Gu...
- 02:14 AM Bug #3797 (Duplicate): osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48....
- I just upgraded one of my production servers (2 osds) from 0.48.2argonaut to the latest 0.48.3argonaut and now of the...
- 08:33 AM rgw Bug #3802 (Resolved): x-amz-acl header ignored on copy operation
- When copying an object, the x-amz-acl header is ignored. To reproduce, copy a private object and send the 'x-amz-acl' ...
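For readers unfamiliar with the request shape involved, the sketch below builds the headers an S3-style copy (a PUT on the destination with `x-amz-copy-source`) would carry. It is illustrative only: the helper name `copy_request_headers`, the bucket and key names, and the `public-read` value are assumptions, since the report's exact reproduction steps are truncated above.

```python
from urllib.parse import quote


def copy_request_headers(src_bucket, src_key, acl=None):
    """Build the headers for an S3-style object copy request.

    The copy source is given as "/bucket/key" (URL-encoded); the optional
    canned ACL for the new object travels in x-amz-acl.
    """
    headers = {"x-amz-copy-source": quote(f"/{src_bucket}/{src_key}")}
    if acl is not None:
        # The report says this header is ignored by the gateway on copy.
        headers["x-amz-acl"] = acl
    return headers
```

Sending these headers and then reading back the ACL of the destination object would show whether the requested ACL was applied.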
- 07:43 AM Bug #3801 (Won't Fix): Cascading OSD failures beginning with common/HeartbeatMap.cc: 78: FAILED a...
- 0.48.2argonaut
Relevant logs are attached. Core dumps are available if needed.... - 07:25 AM Linux kernel client Bug #3800: libceph: check compatibility between ceph modules
- You're right, as long as you are using matching
code it's fine.
If it occurred, it's a serious problem. It just
... - 07:17 AM Linux kernel client Bug #3800: libceph: check compatibility between ceph modules
- Is this really a problem? It seems like this could only bite someone building mixed versions out of tree.
- 06:57 AM Linux kernel client Bug #3800 (Resolved): libceph: check compatibility between ceph modules
- It's possible for semantic changes to occur in one of the
ceph modules (fs/ceph, net/libceph, or block/rbd) that is
... - 06:58 AM Linux kernel client Bug #3799: libceph/rbd: bio refs are messed up
- Because this suggests a semantically-incompatible change
between modules, this should probably be completed first:
... - 06:56 AM Linux kernel client Bug #3799 (Resolved): libceph/rbd: bio refs are messed up
- There is an ugly reference counting dance that occurs with bio
pointers in the kernel osd I/O path, and it needs to ... - 06:57 AM Linux kernel client Bug #3798: libceph/rbd: take reference to all bio's in list
- The other bug related to this is:
http://tracker.newdream.net/issues/3799 - 06:56 AM Linux kernel client Bug #3798 (Resolved): libceph/rbd: take reference to all bio's in list
- In a separate bug ("libceph/rbd: bio refs are messed up") I
describe how reference counting of bio's interact betwee... - 03:20 AM Revision d56af797 (ceph): osd: note must_scrub* flags in PG operator<<
- Signed-off-by: Sage Weil <sage@inktank.com>
- 03:20 AM Revision 26a63df9 (ceph): osd: fix scrub scheduling for 0.0
- The initial value for pair<utime_t,pg_t> can match pg 0.0, preventing it
from being manually scrubbed. Fix!
Signed-... - 03:20 AM Revision 2baf1253 (ceph): osd: based INCONSISTENT pg state on persistent scrub errors
- This makes the state persistent across PG peering and OSD restarts.
This has the side-effect that, on recovery, we r... - 02:24 AM Revision 16d67c79 (ceph): osd/PG: remove useless osd_scrub_min_interval check
- This was already a no-op: we don't call PG::scrub_sched() unless it has
been osd_scrub_max_interval seconds since we ... - 02:24 AM Revision 29954802 (ceph): osd: change scrub min/max thresholds
- The previous 'osd scrub min interval' was mostly meaningless and useless.
Meanwhile, the 'osd scrub max interval' wou... - 02:24 AM Revision 6f6a4193 (ceph): osd: fix object_stat_sum_t dump signedness
- Signed-off-by: Sage Weil <sage@inktank.com>
- 02:24 AM Revision d7383284 (ceph): osd: add last_clean_scrub_stamp to pg_stat_t, pg_history_t
- Signed-off-by: Sage Weil <sage@inktank.com>
- 02:24 AM Revision 2475066c (ceph): osd: add num_scrub_errors to object_stat_t
- Signed-off-by: Sage Weil <sage@inktank.com>
- 02:24 AM Revision 389bed5d (ceph): osd: note last_clean_scrub_stamp, last_scrub_errors
- Signed-off-by: Sage Weil <sage@inktank.com>
- 02:24 AM Revision 796907e2 (ceph): osd/PG: move scrub schedule registration into a helper
- Simplifies callers, and will let us easily modify the decision of when
to schedule the PG for scrub.
Signed-off-by: ... - 02:24 AM Revision 1441095d (ceph): osd/PG: introduce flags to indicate explicitly requested scrubs
- Signed-off-by: Sage Weil <sage@inktank.com>
- 02:24 AM Revision 62ee6e09 (ceph): osd/PG: trigger scrub via scrub schedule, must_ flags
- When a scrub is requested, flag it and move it to the front of the
scrub schedule instead of immediately queuing it. ... - 02:24 AM Revision a1481207 (ceph): osd: move scrub schedule random backoff to separate helper
- Separate this from the load check, which will soon vary depending on the
PG.
Signed-off-by: Sage Weil <sage@inktank.com> - 12:25 AM Revision 123a2dc4 (ceph): rados: adjust socket injection rate down
- See #3795.
- 12:14 AM Revision 71097b7b (ceph): Revert "task/kclient: chmod root to 1777."
- This reverts commit f17847e537802671c6f90bd1a0cdaa0e9d1e6f7a. It had
a typo and we hopefully don't need it.
Signed-o...
01/14/2013
- 10:11 PM Revision be0c4b34 (ceph): ac_prog_javah.m4: Use AC_CANONICAL_TARGET instead of AC_CANONICAL_SYSTEM.
- 10:07 PM Bug #3748: ceph osd dump --format=json includes non-JSON line
- oh *fine*. :)
- 10:04 PM Bug #3748: ceph osd dump --format=json includes non-JSON line
- Funny you should mention it: that is step #1 (or maybe 2 or 3) for the management API work, IMHO. :)
- 09:41 PM Bug #3748: ceph osd dump --format=json includes non-JSON line
- I sorta think we ought to clean up how the various output channels are used in this code in general. This fixes the ...
- 09:23 PM Revision e182c1fd (ceph): Merge branch 'wip-java-sync'
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Reviewed-by: Joe Buck <jbbuck@gmail.com> - 09:11 PM Revision fb8a488e (ceph): java: remove create/release synchronization
- The constructor calls create, and finalize() calls release. Since each
of these can only happen once (enforced by Jav... - 09:11 PM Revision 2b9da45d (ceph): java: remove unnecessary synchronization
- The body of ceph_unmount is a call to a synchronized method.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> - 09:11 PM Revision 85c10357 (ceph): java: remove all intrinsic locks
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
- 09:11 PM Revision 13cb196e (ceph): java: add fine grained synchronization
- Adds r/w lock to protect against some races.
1. Mutual exclusion for mount/unmount prevents races between the two in... - 08:02 PM rbd Subtask #3741: krbd: rework request tracking code
- OK, I ran a test and got a crash. The bio built for
an object request gets handed off to an osd request.
I need to... - 07:32 PM rbd Subtask #3741: krbd: rework request tracking code
- I spent the day trying to find the memory leak and finally
found it. The structure being leaked was a bio. It was
... - 06:48 AM rbd Subtask #3741: krbd: rework request tracking code
- For some reason my tests started hanging on Friday when
I added memory debug code for catching leaks and reuses.
I ... - 07:49 PM CephFS Bug #3544: ./configure checks CFLAGS for jni.h if --with-hadoop is specified but also needs to ch...
- Is this still an issue?
- 04:54 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
- Josh just pinged me that there was a typo in the chmod patch, and nobody's noticed so apparently it still hasn't been...
- 04:24 PM Bug #3795: loadgen task gets into msgr loop
- I looked a bit more and I see some failures before that, and also some passes after, e.g. teuthology-2013-01-11_07:00...
- 11:35 AM Bug #3795: loadgen task gets into msgr loop
- taking a look again at the nightly runs, looks like this issue has been happening on next branch from 01-01-2013 whic...
- 08:13 AM Bug #3795: loadgen task gets into msgr loop
- going to see if the recent msgr changes are to blame.. bisecting!
- 08:04 AM Bug #3795: loadgen task gets into msgr loop
- This appears to be a simple cycle:
- objecter has lots of requests outstanding
- there is a fault (msgr failure i... - 03:37 PM Revision 017b6d63 (ceph): Revert "osdmap: spread replicas across hosts with default crush map"
- This reverts commit 7ea5d84fa3d0ed3db61eea7eb9fa8dbee53244b6.
This breaks teuthology and vstart both in its current ... - 03:04 PM CephFS Documentation #3796 (Resolved): FUSE mount documentation needs some corrections for v0,56x
- The FUSE instructions need to be updated for v0.56 and later
currently:
> http://ceph.com/docs/master/cephfs/fuse... - 01:35 PM Bug #3772 (Can't reproduce): osd: osd_disk_threads = 5 seems to hang recovery
- I also don't seem to be able to reproduce on bobtail, marking can't reproduce.
- 12:58 PM Bug #3772 (New): osd: osd_disk_threads = 5 seems to hang recovery
- I don't seem to be able to reproduce this on master.
- 10:37 AM Bug #3772: osd: osd_disk_threads = 5 seems to hang recovery
- didn't reproduce with simple test, trying something more complicated. (roles/8882.yaml + osd disk threads : 10, teste...
- 01:28 PM CephFS Feature #3749 (Resolved): Remove forced synchronization from Java bindings
- 12:57 PM Feature #3769 (Fix Under Review): osd: scrub should verify snap collection existence, membership
- wip_snap_scrub
- 11:55 AM rbd Bug #2871 (Resolved): rbd export command hangs when trying to export an image of size 0 to a loca...
- Not certain which recent fix resolved this, but it works now.
- 11:32 AM rbd Bug #3585 (Closed): Image import via QEMU-IMG results in a corrupt rbd
- Great, glad to hear it's fixed.
- 11:09 AM rbd Bug #3427: krbd: unmap does not remove block device properly
- Patch posted for review. I'm not sure I'll be able to test
the scenario very well but hopefully it can be seen by
... - 09:56 AM rbd Bug #3427: krbd: unmap does not remove block device properly
- Implementing the change I described now.
- 11:01 AM Bug #2691: osd/ReplicatedPG.cc: 5888: FAILED assert(latest->is_update())
- for reference, ubuntu@teuthology:/a/teuthology-2013-01-10_07:00:03-regression-argonaut-master-basic/38145
- 10:50 AM Bug #2691: osd/ReplicatedPG.cc: 5888: FAILED assert(latest->is_update())
- This has shown up once in argonaut, probably not worth backporting unless it becomes more of a problem?
- 09:42 AM Bug #3629 (Resolved): test_mon_workloadgen.cc: 766: FAILED assert(m->fsid == monc.get_fsid())
- commit:3610e72e4f9117af712f34a2e12c5e9537a5746f
- 07:00 AM CephFS Bug #2187: pjd chown/00.t failed test 97
- Happened again on Friday. Time to add the delay injection to the nightlies?
2013-01-11T07:32:37.489 INFO:teutholo... - 06:52 AM Revision 92a9d9c2 (ceph): ceph.conf: separate replicas across osds
- ceph.git master now separates across crush hosts without this setting.
For teuthology clusters, we don't want that (u... - 05:43 AM Bug #3770: OSD crashes on boot
- So, my (very basic) understanding of this suggests that the fix is that the trim wouldn't happen in the first place.
...
01/13/2013
- 10:11 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- Nope.. which leads me to realize that that setting needs to go in teuthology's ceph.conf. Doing that now, and then I...
- 10:01 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- *sigh*
This also looks good to me, and I like it better (should have suggested this the first time around). But no... - 10:05 PM Bug #3774 (Fix Under Review): osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
- wip-scrub
- 10:05 PM Bug #3786 (Fix Under Review): osd: scrub is deferred indefinitely if load is high
- wip-scrub
- 07:04 AM Revision 410906e0 (ceph): mon: OSDMonitor: don't output to stdout in plain text if json is specified
- Fixes: #3748
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
01/12/2013
- 11:05 PM Bug #3748 (Resolved): ceph osd dump --format=json includes non-JSON line
- commit:410906e04936c935903526f26fb7db16c412a711
- 11:03 PM Bug #3795 (Resolved): loadgen task gets into msgr loop
- ...
- 11:01 PM Bug #3785 (Fix Under Review): ceph: default crush rule does not suit multi-OSD deployments
- der, broke vstart. can you review wip-3785?
- 08:01 AM CephFS Feature #3749: Remove forced synchronization from Java bindings
- In libcephfs, mount/unmount race against each other and against the rest of the API (e.g. unmount racing against write). In C...
- 01:10 AM Revision 7ea5d84f (ceph): osdmap: spread replicas across hosts with default crush map
- This is more often the case than not, and we don't have a good way to
magically know what size of cluster the user wi... - 01:09 AM Revision 3610e72e (ceph): mon: OSDMonitor: only share osdmap with up OSDs
- Try to share the map with a randomly picked OSD; if the picked OSD is
not 'up', then try to find the nearest 'up'... - 12:25 AM Revision 1f721804 (ceph): rbd: Fix tabs
- Signed-off-by: Dan Mick <dan.mick@inktank.com>
01/11/2013
- 11:56 PM Revision 34138993 (ceph): doc: Updates to CRUSH paper.
- fixes: 3329, 3707, 3711, 3389
Signed-off-by: John Wilkins <john.wilkins@inktank.com> - 10:28 PM Revision 15bb00ca (ceph): rbd: call udevadm settle on map/unmap
- When we map/unmap devices, udev gets called to manage device nodes;
this will allow the command to wait for those man... - 10:28 PM Revision e94b06a1 (ceph): rbd: make 'add' modprobe rbd so it has a chance of success
- Check for existence of /sys/bus/rbd first to avoid unnecessary calls
Fixes: #3784
Signed-off-by: Dan Mick <dan.mick@... - 08:17 PM Revision 66eb93b8 (ceph): OSD: only trim up to the oldest map still in use by a pg
- map_cache.cached_lb() provides us with a lower bound across
all pgs for in-use osdmaps. We cannot trim past this sin... - 08:15 PM Revision 8cf79f25 (ceph): OSD: check for empty command in do_command
- Fixes: #3878
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com> - 08:09 PM Revision 3e147295 (ceph): Merge pull request #32 from imjustmatthew/imjustmatthew_docs
- Correct typo in mon docs 'ceph.com' to 'ceph.conf'
- 07:59 PM Revision 0f161f1e (ceph): Correct typo in mon docs 'ceph.com' to 'ceph.conf'
- 06:49 PM Revision aeb02061 (ceph): qa/run_xfstests.sh: use cloned xfstests repository
- Use our own copy of the xfstests repository rather than hitting
the upstream one repeatedly.
Signed-off-by: Alex Eld... - 06:15 PM Revision 8d0fa15e (ceph): mon: Monitor: only schedule a timecheck after election if we are not alone
- Fixes: #3790
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com> - 05:51 PM Bug #3785 (Resolved): ceph: default crush rule does not suit multi-OSD deployments
- Merged to master in commit:7ea5d84fa3d0ed3db61eea7eb9fa8dbee53244b6 and cherry-picked to bobtail in commit:503917f004...
- 05:45 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- good question. let's start with bobtail.
- 05:39 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- Looks good to me. Which branches do we want to cherry-pick it to?
- 05:24 PM Bug #3785 (Fix Under Review): ceph: default crush rule does not suit multi-OSD deployments
- wip-3785
- 01:59 PM Bug #3785 (New): ceph: default crush rule does not suit multi-OSD deployments
- dang! wrong bug. opening this one back up.
sorry all! - 12:34 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- I think maybe Deb's comments and closure were meant for another bug (perhaps 3789?)
- 11:34 AM Bug #3785 (Won't Fix): ceph: default crush rule does not suit multi-OSD deployments
- This comment should have been in bug 3789
caused by a lack of resources on the system.
have increased the memory fro... - 11:32 AM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- This comment should have been in bug 3789
upping the memory on these VMs from 512M to 2G
since it appears it was a... - 10:55 AM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- I agree with Ian, I have seen *very bad things* happen when CRUSH chooses two OSDs on one host, rather than distribute...
- 10:11 AM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
- The issue here is that CRUSH maps which behave well on multi-host deployments behave quite poorly on one or two host ...
- 05:46 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
- Yes, Greg. The test passed in the recent runs.
- 05:34 PM Bug #3752 (Resolved): fsync-tester script need to be fixed to run in the nightlies
- This appears to be passing now, right Tamil?
Since I'm not seeing anything else breaking I'm inclined to leave the... - 04:25 PM Bug #3772 (In Progress): osd: osd_disk_threads = 5 seems to hang recovery
- 03:53 PM Documentation #3330 (In Progress): doc: How to troubleshoot unbalanced CRUSH
- 03:51 PM Documentation #3329 (In Progress): doc: What metrics should be used to set node weight
- 02:45 PM CephFS Bug #3793: wrong size reported in some distributions/toolchains
- That makes this sound like a simple fix... we need to swap the frsize and bsize fields. Except that right now we ar...
- 02:39 PM CephFS Bug #3793: wrong size reported in some distributions/toolchains
- I spent a bit of time with gregaf trying to find authoritative sources for what the different values denote. While `...
- 01:40 PM CephFS Bug #3793: wrong size reported in some distributions/toolchains
- This coreutils commit may have useful data:
http://git.savannah.gnu.org/cgit/coreutils.git/commit/src?id=0863f018f0f... - 01:38 PM CephFS Bug #3793 (Resolved): wrong size reported in some distributions/toolchains
- In ceph_statfs we set f_bsize to be 1MB in order to report very large available spaces. However, nowadays it is appar...
- 02:38 PM CephFS Feature #3749: Remove forced synchronization from Java bindings
- This needs more thought than just removing synchronization. We'd like to be segfault free in Java, even though you co...
- 02:26 PM Bug #3789: OSD core dump and down OSD on CentOS cluster
- There is 'ceph health', and a nagios plugin that runs it. A similarly trivial plugin can probably be written for oth...
- 02:01 PM Bug #3789 (Won't Fix): OSD core dump and down OSD on CentOS cluster
- dmesg shows it was a lack of resources.
upping the memory on these VMs from 512M to 2G
since it appears it ... - 10:28 AM Bug #3789: OSD core dump and down OSD on CentOS cluster
- Deb Barba wrote:
> all core files have similar backtrace.
> again, Sage, looks like you are right, low resources
>... - 10:27 AM Bug #3789: OSD core dump and down OSD on CentOS cluster
- all core files have similar backtrace.
again, Sage, looks like you are right, low resources
dmesg:
hrtimer: inte... - 10:23 AM Bug #3789: OSD core dump and down OSD on CentOS cluster
- looks from dmesg, you are right Sage, low on resources
centos1 core# gdb /usr/bin/ceph-osd core.0.26177
Core wa... - 10:16 AM Bug #3789: OSD core dump and down OSD on CentOS cluster
- backtrace of core.0.14401 from centos3:
Core was generated by `/usr/bin/ceph-osd -i 8 --pid-file /var/run/ceph/osd.... - 09:37 AM Bug #3789 (Need More Info): OSD core dump and down OSD on CentOS cluster
- check dmesg, or VM responsiveness. this triggers when a call to sync(2) takes more than... 2 minutes? i forget how l...
- 09:13 AM Bug #3789 (Won't Fix): OSD core dump and down OSD on CentOS cluster
- Running a CentOS VM cluster. Running v0.56.1
I had written a bit of data, and stopped writing about 4pm yesterday... - 02:17 PM rbd Subtask #3741: krbd: rework request tracking code
- Unfortunately my system crashed after an hour or so. The
crash was in the network driver, and a little analysis
su... - 10:45 AM rbd Subtask #3741: krbd: rework request tracking code
- My full test run isn't complete but I seem to have resolved
whatever problem I was hitting yesterday. I have not ye... - 01:39 PM CephFS Bug #3794 (Resolved): uclient: reports sizes wrong in some cases
- This is the counterpart to kernel bug #3793. See Client::statfs, in which we set f_bsize to 1MB but f_frsize to 4KB. ...
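A rough illustration of why the f_bsize/f_frsize split matters. POSIX defines f_blocks in units of f_frsize, so a tool that multiplies by f_bsize instead reports a different total. The StatFs tuple, the total_bytes helper, and the block count below are hypothetical and exist only for this sketch; the two block sizes are the ones quoted in the report.

```python
from collections import namedtuple

# Hypothetical stand-in for the fields returned by a statfs/statvfs call.
StatFs = namedtuple("StatFs", ["f_bsize", "f_frsize", "f_blocks"])

def total_bytes(st, unit_field):
    """Total size when f_blocks is read in units of the named field."""
    return st.f_blocks * getattr(st, unit_field)

# f_bsize = 1 MB and f_frsize = 4 KB as in the report; f_blocks is made up.
st = StatFs(f_bsize=1 << 20, f_frsize=4 << 10, f_blocks=1000)

by_frsize = total_bytes(st, "f_frsize")  # blocks counted as 4 KB units
by_bsize = total_bytes(st, "f_bsize")    # blocks counted as 1 MB units
ratio = by_bsize // by_frsize            # how far the two readings diverge
```

With these values the two readings differ by a factor of 256, which is the kind of discrepancy a caller sees when it picks the other field.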
- 12:22 PM Bug #3787 (Resolved): Ceph OSD crashes on ceph tell osd.x
- 8cf79f252a1bcea5713065390180a36f31d66dfd
- 11:12 AM Bug #3787 (Fix Under Review): Ceph OSD crashes on ceph tell osd.x
- wip_3787
- 09:33 AM Bug #3787: Ceph OSD crashes on ceph tell osd.x
- verified this happens on master. should be an easy fix. thanks for the report!
- 12:17 AM Bug #3787 (Resolved): Ceph OSD crashes on ceph tell osd.x
- I recently set up a small test cluster with 2 nodes to test the 0.48.3 -> 0.56.1 upgrade. After Upgrading one of the ...
- 12:22 PM Bug #3770 (Resolved): OSD crashes on boot
- 66eb93b83648b4561b77ee6aab5b484e6dba4771
- 11:16 AM Bug #3770 (Fix Under Review): OSD crashes on boot
- wip_3770
- 11:03 AM Bug #3770: OSD crashes on boot
- The fault is in OSD::handle_osd_map where we trim old maps. Prior to 0.50, the pgs would have processed up to the cu...
- 09:59 AM Bug #3770: OSD crashes on boot
- I'm seeing this same assert failure when trying to startup 3 of my OSDs. Happy to provide feedback for the debugging ...
- 09:43 AM Bug #3770: OSD crashes on boot
- sjust said that we're done collecting information and that I could rm the pg directory/log/info, which I did. Unfortu...
- 09:41 AM Bug #3770: OSD crashes on boot
- 12:04 PM Bug #3788: debian source packages are missing
- Gary Lowell wrote:
> It looks like the Sources file has been zero length in past releases as well. Still investigat... - 12:03 PM Bug #3788: debian source packages are missing
- My favorite use case when source packages are available would be...
- 11:33 AM Bug #3788: debian source packages are missing
- I think we should build source packages too (in addition to tarballs, etc.).
- 10:47 AM Bug #3788: debian source packages are missing
- We are not currently building debian or rpm source packages. We do put out a source tarball corresponding to the rel...
- 09:56 AM Bug #3788 (In Progress): debian source packages are missing
- It looks like the Sources file has been zero length in past releases as well. Still investigating.
- 02:20 AM Bug #3788: debian source packages are missing
- Proposed fix at https://github.com/ceph/ceph-build/pull/1
- 01:44 AM Bug #3788: debian source packages are missing
- http://ceph.com/debian/conf/distributions is created from https://github.com/ceph/ceph-build/blob/master/gen_reprepro...
- 01:35 AM Bug #3788 (Resolved): debian source packages are missing
- Following the instructions at http://ceph.com/docs/master/install/debian/ to add the ...
- 10:52 AM CephFS Bug #3773: mds crashed at LogEvent::decode
- Sure Sage. I was running bonnie from client during upgrade.
I had debug ms=1 set, i will try to reproduce this with... - 09:41 AM CephFS Bug #3773 (Need More Info): mds crashed at LogEvent::decode
- Tamil, I wonder if you can try to reproduce this with mds logging turned up from the start (debug mds = 20, debug ms ...
- 10:34 AM Messengers Bug #2569: msgr: connect_rank crash
- yes, you are right, Greg. I just wanted to put a note of this somewhere, so chose to update the bug itself :)
- 10:23 AM Bug #3748 (Fix Under Review): ceph osd dump --format=json includes non-JSON line
- wip-3748 has a fix, commit:0edb53f02231fb83f33d3bc5f58b37b14cd5df82
- 10:20 AM Bug #3695 (Resolved): monitor crashed after an upgrade in Monitor::timecheck
- 10:16 AM Bug #3790 (Resolved): Mon crash after update to ceph version 0.56-209-g310112f
- looks good, merged into master. commit:8d0fa15e6aa3847e89de5d5adfca0a863e8da976
- 10:06 AM Bug #3790: Mon crash after update to ceph version 0.56-209-g310112f
- Had a redundant check on the previous commit; fixed and rebased it and the new commit can be found on wip-3790 commit...
- 10:02 AM Bug #3790: Mon crash after update to ceph version 0.56-209-g310112f
- This patch fixes it.
- 09:31 AM Bug #3790 (In Progress): Mon crash after update to ceph version 0.56-209-g310112f
- My fault. Forgot a check on win_election().
Any chance you can test 6104629d95207f3dfd3a744d81b011b6a714070e on wi... - 09:18 AM Bug #3790: Mon crash after update to ceph version 0.56-209-g310112f
- Previous installed version was .56-193.
- 09:14 AM Bug #3790 (Resolved): Mon crash after update to ceph version 0.56-209-g310112f
- I have a single node cluster on burnupi60 updated each morning to the latest Master branch. After the update this mo...
- 09:16 AM Bug #3774 (In Progress): osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
- 09:16 AM Bug #3774: osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
- wip-scrub-sched for the argonaut version. should look very similar for master/bobtail.
- 02:05 AM Revision 310112f7 (ceph): Merge remote-tracking branch 'gh/wip-3633'
- Reviewed-by: Sage Weil <sage@inktank.com>
- 02:04 AM Revision 9e4a3f03 (ceph): Merge remote-tracking branch 'gh/wip-3633'
- 02:03 AM Revision 305cb54a (ceph): suites: rados: multimon: add mon clock skews task yaml files
- Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
- 12:58 AM Revision 2fa5d23b (ceph): test: Hadoop cluster and task config.
- Add a 3-node cluster specification and a
task for running wordcount with Hadoop on Ceph.
Signed-off-by: Joe Buck <jb... - 12:44 AM Revision aa40de90 (ceph): messages: add MTimeCheck
- Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com> - 12:44 AM Revision 684d4ba2 (ceph): mon: Monitor: add timecheck infrastructure to detect clock skews
- Fixes: #3633
Fixes: #3695
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inkt... - 12:44 AM Revision ff1c254b (ceph): mon: Monitor: reduce indentation level; make code more readable
- Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
- 12:44 AM Revision 7a7fff57 (ceph): mon: Monitor: move a couple of if's together on handle_command()
- Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
- 12:44 AM Revision bc57c7a9 (ceph): mon: Monitor: use 'else if' on handle_command instead of bunches of 'if'
- ... when the options are mutually exclusive.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> - 12:44 AM Revision 58e03ecb (ceph): mon: Monitor: unify 'ceph health' and 'ceph status'; add json output
- Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
- 12:03 AM Revision e6f284e9 (ceph): doc: Added -a option. Should work without from server, as described.
- fixes: #3750
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
01/10/2013
- 11:59 PM Revision de6633f9 (ceph): doc: Normalized to term "drive" rather than disk. Changed "(Manual)" en...
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 11:06 PM Revision 7a8ec194 (ceph): Merge branch 'next'
- 09:54 PM Revision 988f3597 (ceph): rados: add truncate support
- Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com> - 09:04 PM Bug #3786 (Resolved): osd: scrub is deferred indefinitely if load is high
- If the load is above the threshold, we will never scrub. For some environments, this is normal (e.g., mixed OSD and ...
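A minimal sketch of the scheduling problem described here. The first function models a purely load-gated decision; the second is a hypothetical variant that stops deferring once the last scrub is older than some maximum interval. The function names, parameters, and the deadline policy are assumptions for illustration and are not taken from the Ceph source or from the fix for this ticket.

```python
def load_gated_scrub(load_avg, load_threshold):
    """Scrub only when the load average is below the threshold."""
    return load_avg < load_threshold

def scrub_with_deadline(load_avg, load_threshold, secs_since_scrub, max_interval):
    """Hypothetical variant: ignore load once the last scrub is too old."""
    if secs_since_scrub >= max_interval:
        return True
    return load_avg < load_threshold

DAY = 86400  # seconds
```

On a host whose load never drops below the threshold, load_gated_scrub always returns False, while scrub_with_deadline starts returning True once max_interval has elapsed.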
- 08:23 PM rbd Bug #3585: Image import via QEMU-IMG results in a corrupt rbd
- This seems to be fixed in QEMU 1.3.0 and Ceph 0.56.1
I've tried QED -> Raw -> Ceph -> Raw then QED -> Ceph -> Raw an... - 07:56 PM Bug #3785 (Resolved): ceph: default crush rule does not suit multi-OSD deployments
- Version: 0.48.2-0ubuntu2~cloud0
Our Ceph deployments typically involve multiple OSDs per host with no disk redunda... - 07:10 PM rbd Feature #3635 (In Progress): rbd cli: call "udevadm settle" after use of add/remove kernel interface
- 07:10 PM Revision 44625d44 (ceph): config_opts.h: default osd_recovery_delay_start to 0
- This setting was intended to prevent recovery from overwhelming peering traffic
by delaying the recovery_wq until osd... - 07:09 PM rbd Feature #3784 (In Progress): rbd: issue modprobe when rbd map is called
- 06:04 PM rbd Feature #3784 (Resolved): rbd: issue modprobe when rbd map is called
- rbd map will not work unless the rbd kernel module is loaded, and this must be done manually. Add code to rbd to cau...
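The request above could be sketched roughly as follows, assuming a Linux host where loaded modules are listed in /proc/modules and modprobe is on the PATH. The helper names module_loaded and ensure_module are hypothetical; this is not how the rbd tool itself is implemented.

```python
import subprocess

def module_loaded(name, proc_modules_text):
    """Return True if `name` is the first field of any /proc/modules-style line."""
    return any(line.split()[0] == name
               for line in proc_modules_text.splitlines() if line.strip())

def ensure_module(name):
    """Load the kernel module with modprobe unless it is already present."""
    with open("/proc/modules") as f:
        if module_loaded(name, f.read()):
            return
    # Requires root; raises CalledProcessError if modprobe fails.
    subprocess.check_call(["modprobe", name])
```

A caller would run ensure_module("rbd") before attempting the map. Matching on the first field only avoids a false positive when rbd merely appears in another module's dependency column.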
- 07:02 PM Revision 830b8ffa (ceph): ReplicatedPG: fix snapdir trimming
- The previous logic was both complicated and not correct. Consequently,
we have been tending to drop snapcollection l... - 06:34 PM Revision 0f42c373 (ceph): ReplicatedPG: fix snapdir trimming
- The previous logic was both complicated and not correct. Consequently,
we have been tending to drop snapcollection l... - 06:24 PM Bug #3774: osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
- 06:14 PM Revision 035caac5 (ceph): Revert "rgw: fix handler leak in handle_request"
- This reverts commit eba314a811cd98a79f483dc7a9128fe76c722c78.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com> - 06:11 PM rgw Feature #3402 (Fix Under Review): rgw: improve tests for multipart upload
- 06:10 PM rgw Feature #3634 (Fix Under Review): rgw: improve teuthology radosgw-admin test
- 06:09 PM Bug #3633 (Resolved): mon: clock drift errors not reported by ceph status
- commit:310112f702d14294e6ba48f8af41a306288cba65
- 06:09 PM Revision eb997e25 (ceph): Merge pull request #31 from chrisglass/expose_cluster_stats_to_python
- Added python wrapper to rados_cluster_stat
- 05:59 PM rbd Bug #3518 (Can't reproduce): rbd import file --format 2 creates an image named '--format'
- 05:59 PM rbd Bug #3518: rbd import file --format 2 creates an image named '--format'
- It seems that this no longer happens as of e6f284e945f45e39c57921149d4551d9e78557a5,
so closing non-reproducible. - 05:06 PM CephFS Bug #3773: mds crashed at LogEvent::decode
- Okay, I gathered up a core file, a high-debug MDS log, and the log with the bad event (and the bad event itself) in t...
- 02:05 PM CephFS Bug #3773: mds crashed at LogEvent::decode
- I'll at least start this off.
- 04:54 PM Revision c8f3fd6e (ceph): marginal: Remove broken symlinks
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 04:47 PM Messengers Bug #2569: msgr: connect_rank crash
- I believe this was caused by some issues which we decided not to backport the fixes for due to their size; Sage can c...
- 04:43 PM Messengers Bug #2569: msgr: connect_rank crash
- hit this on a mixed cluster running argonaut v0.48.3 and v0.56 [ ceph version 0.56-193-g00898c1]
monitors,mds,osds... - 04:37 PM rbd Bug #3688 (Won't Fix): rbd allows image of size 0 to be created
- I claim that zero-sized images are legal, if not particularly useful in that size...but one might well want to create...
- 04:15 PM Bug #3770: OSD crashes on boot
- root@ms-be1003:/var/lib/ceph/osd/ceph-27# find current/meta/ | tee ~/ceph-osd.27.meta | wc -l
42992
Attached. - 04:02 PM Bug #3770: OSD crashes on boot
- root@ms-be1003:/var/lib/ceph/osd/ceph-27/current/4.f9_head# attr -lq $PWD | while read attr; do echo $attr; attr -q -...
- 02:27 PM Bug #3770 (Need More Info): OSD crashes on boot
- From the backtrace:
pgid = {m_pool = 4, m_seed = 249, m_preferred = -1}
Based on the info attr, we try to... - 04:04 PM Bug #3750 (Resolved): Possible Ceph 5-minute quick start guide typo
- Documentation described making the call from the server console, which should work as described. Added -a so that it ...
- 03:52 PM Bug #3780 (Won't Fix): pg_num inappropriately low on new pools
- Version: 0.48.2-0ubuntu2~cloud0
On a Ceph cluster with 18 OSDs, new object pools are being created with a pg_num o... - 03:08 PM rgw Bug #3778: document procedure for enabling subdomain S3 api calls
- The documentation should note that the
@rgw dns name = {hostname}@
option must be set in the
@[client.radosgw.g... - 11:13 AM rgw Bug #3778 (Resolved): document procedure for enabling subdomain S3 api calls
- The process for setting up a server that handles subdomain API requests is not documented. If possible we should add ...
- 03:07 PM Documentation #3711 (In Progress): crush-map.rst: choose firstn talks about "N", but does not cle...
- 03:05 PM devops Documentation #2886 (In Progress): doc: crush location tricks, ceph.conf, automatic host=
- 02:23 PM rbd Subtask #3741: krbd: rework request tracking code
- I am leaving shortly for a few hours. In reviewing this
new code I find a few things that make it a little hard
ma... - 01:00 PM rbd Subtask #3741: krbd: rework request tracking code
- I did some testing yesterday and found that I got I/O errors
while running xfstests. This was unexpected; I thought... - 01:43 PM Revision 797b3db3 (ceph): Added python wrapper to rados_cluster_stat
- The new get_cluster_stats() method on the rados.Rados object calls
the rados_cluster_stat() function in the librados ... - 12:51 PM Bug #2533 (Duplicate): osd: watchers tracked by entity_name_t, not by cookie
- 12:48 PM Feature #3769: osd: scrub should verify snap collection existence, membership
- Written, just needs to be ported to Bobtail
- 09:40 AM Feature #3769 (In Progress): osd: scrub should verify snap collection existence, membership
- 12:47 PM Bug #3736 (In Progress): kernel build: failures starting in 3.8-rc1
- 12:02 PM Bug #3736: kernel build: failures starting in 3.8-rc1
- The remaining issue is that the patch we apply to scripts/package/builddeb to build the perf tools is out of date. I...
- 12:45 PM Bug #3702 (New): OSD SIGABRT during startup
- 12:40 PM Bug #3617 (Resolved): Ceph doesn't support > 65536 PGs(?) and fails silently
- 09:35 AM Bug #3617: Ceph doesn't support > 65536 PGs(?) and fails silently
- How's the testing coming along, Sage?
- 12:39 PM Bug #3695: monitor crashed after an upgrade in Monitor::timecheck
- Believed fixed by patch to 3633
684d4ba242b26828bd7927860226bfc8a0cfcc2b - 12:35 PM Bug #3650 (Can't reproduce): osd: crash in Reset state -> start_peering_interval -> on_change -> ...
- Looked into the core dump, can't see how this happened.
- 12:30 PM Bug #3591 (Closed): auth: could not find secret_id=0
- 12:30 PM Bug #3591 (Resolved): auth: could not find secret_id=0
- Resolved by Sage's fix above.
- 12:29 PM Bug #3563 (Closed): osd crashed with error "auth: could not find secret_id=2"
- 12:29 PM Bug #3563 (Resolved): osd crashed with error "auth: could not find secret_id=2"
- Resolved by fix to 3591
- 12:20 PM Bug #3467 (Closed): osd: bad state machine event in start_recoverY_ops
- 12:20 PM Bug #3467 (Won't Fix): osd: bad state machine event in start_recoverY_ops
- If encountered, restart OSD.
- 12:13 PM Bug #3300: ceph::buffer::end_of_buffer isn't caught
- Josh - Is this just a case where the documentation needs to be updated?
- 11:46 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
- The same issue exists with the debian packages. We have an explicit dependency on python, but not on perl. I don't ...
- 10:55 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
- Can we check to ensure perl is not used elsewhere?
Are there guidelines that are provided to the developers that spe... - 10:06 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
- I hate to see a dependency like perl get added for a oneliner perl regex. Is this the only place perl is used? Can ...
- 09:43 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
- backport to bobtail
- 11:26 AM Tasks #3779 (Resolved): update osd config ref as appropriate
- I'm not sure what our update policies on the docs are, but the defaults named in http://ceph.com/docs/master/rados/co...
- 11:11 AM rgw Cleanup #3777 (Resolved): rgw: audit code for reading NULL env variables
- Similar to the issue that triggered #3735
- 10:25 AM Bug #3647 (Can't reproduce): forgot the auth options for Cephx and added them later: Get msg: 7f...
- 10:19 AM rgw Bug #3735 (Closed): rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
- 10:19 AM rgw Bug #3735 (Resolved): rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
- 10:00 AM rgw Bug #3735: rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
- commit:e1da85f286838cdd3a6329840cec748c6a11fd26
- 09:57 AM Bug #3747: PGs stuck in active+remapped
- Sage Weil wrote:
> commit:f83fcf63a928fdb8ab4d604bdce596c0c4afd854
oops, wrong bug! - 09:45 AM Bug #3747 (Resolved): PGs stuck in active+remapped
- commit:f83fcf63a928fdb8ab4d604bdce596c0c4afd854
- 09:55 AM CephFS Feature #3621 (Closed): qa: add knfsd reexport tests to qa suite
- 09:52 AM CephFS Feature #3621: qa: add knfsd reexport tests to qa suite
- commit:aaa03bbcd2549a38f962a61fc63be16cca3a6d90 in teuthology.git
- 09:34 AM Bug #3776 (Resolved): Need doc describing how to alter our log rotation
- If a user has a small to moderate size of root disk, they will probably have to modify the log rotation process for c...
- 09:32 AM Bug #3661 (Resolved): mon: idle/empty osds marked down after 15 min
- 08:34 AM Feature #3775: log: stop logging in statfs reports usage above some threshold
- Sam,
That is a cool idea. I will open a doc bug for that. Providing instructions for those with smaller root dri... - 06:32 AM Feature #3775: log: stop logging in statfs reports usage above some threshold
- The easiest solution for this might be to adjust the default logrotate script (src/logrotate.conf) to use the size pa...
- 03:52 AM Revision 59aad347 (ceph): configure.ac: check for org.junit.rules.ExternalResource
- Check for org.junit.rules.ExternalResource if built with
--enable-cephfs-java and --with-debug. Checking for junit4
i... - 01:13 AM Revision 12af11a1 (ceph): src/java/Makefile.am: fix default java dir
- Fix default javadir in src/java/Makefile.am to $(datadir)/java
since this is the common data dir for java files.
Sig... - 01:13 AM Revision 9b167b46 (ceph): ceph.spec.in: fix handling of java files
- Fix handling of JAVA (jar) files. Don't move the files around in the install
section since the related Makefile is fi... - 01:13 AM Revision f027d025 (ceph): ceph.spec.in: rename libcephfs-java package to cephfs-java
- Rename the libcephfs-java package to cephfs-java since the package
contains no (classic) library and RPMLINT complain... - 01:13 AM Revision d8c4fc5e (ceph): ceph.spec.in: fix libcephfs-jni package name
- Rename libcephfs-jni to libcephfs_jni1 to reflect the SO name/version of
the library and to prevent RPMLINT to compla... - 01:13 AM Revision aedbb97f (ceph): configure.ac: remove AC_PROG_RANLIB
- Remove the already commented-out AC_PROG_RANLIB to get rid of the warning:
libtoolize: `AC_PROG_RANLIB' is rendered obsolete b... - 01:13 AM Revision 61437ee2 (ceph): configure.ac: change junit4 handling
- Change handling of --with-debug and junit4. Add a new conditional HAVE_JUNIT4
to be able to build ceph-test package a... - 12:11 AM Revision 00898c18 (ceph): rbd: allow copy of zero-length images. Includes simple test.
- Fixes: #3765
Signed-off-by: Dan Mick <dan.mick@inktank.com> - 12:10 AM Revision 1c3d6840 (ceph): doc/install/debian.rst: fix typo in link ref; broke doc build
- Signed-off-by: Dan Mick <dan.mick@inktank.com>
01/09/2013
- 11:11 PM Revision 133e4e34 (ceph): Merge branch 'next'
- Want to get various rbd-related fixes together for upgrade testing
- 10:40 PM Revision 48f13946 (ceph): ReplicatedPG: increment scrubber.errors rather than errors
- Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com> - 05:37 PM Bug #3705 (Resolved): osd: crash in scrub finalize [argonaut]
- commit:5b12b514b047a8a46cc5549bd94b398289b9b5f6
- 05:08 PM rbd Bug #3766 (Resolved): rbd resize command fails on a mixed node cluster when it is a copied rbd im...
- I'm calling this fixed, then.
- 04:54 PM rbd Bug #3766: rbd resize command fails on a mixed node cluster when it is a copied rbd image and whe...
- This works fine on the master branch that has a fix for it :
ceph version 0.56-193-g00898c1 (00898c1860e8ae95b52192... - 01:44 PM rbd Bug #3766 (Need More Info): rbd resize command fails on a mixed node cluster when it is a copied ...
- I think this might be e1776809031c6dad441cfb2b9fac9612720b9083, which is still in next. Can you try an rbd client fr...
- 04:35 PM Feature #3775: log: stop logging in statfs reports usage above some threshold
- Deb Barba <deb.barba@inktank.com>
3:13 PM (1 hour ago)
to Dan
so, as I explained in chat.
i am again seeing ... - 04:34 PM Feature #3775 (New): log: stop logging in statfs reports usage above some threshold
- Add a 'log stop on utilization = .95' option that will make the log code print one last line like
--- suspending l... - 04:31 PM Bug #3774 (Resolved): osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
- These should get put at the top of the scrub queue in a way that still honors all the scheduling.
The problem is t... - 04:27 PM rbd Bug #3765 (Resolved): rbd cp of a zero sized image succeeds with error
- 04:27 PM rbd Bug #3765: rbd cp of a zero sized image succeeds with error
- Fixed, test added, in master:
commit:00898c1860e8ae95b5219257d1635b15ccdce5c1 - 11:44 AM rbd Bug #3765: rbd cp of a zero sized image succeeds with error
- 02:58 PM CephFS Bug #3773 (Can't reproduce): mds crashed at LogEvent::decode
- ceph version: 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
I had a cluster [burnupi06, burnupi07, burnupi08] ... - 02:32 PM rbd Bug #3753 (Resolved): rbd copy command reports error even though copy is successful on a mixed no...
- I believe this to have been fixed by the fix for #3744.
- 01:47 PM rbd Bug #3753: rbd copy command reports error even though copy is successful on a mixed node cluster
- Tamil, does this still happen with the fix in wip-no-cls-lock (and now in next) for 3744?
- 02:14 PM Bug #3772 (Can't reproduce): osd: osd_disk_threads = 5 seems to hang recovery
- reported on IRC, should be easy to reproduce.
we may want to change the default to 2 in order to avoid hiding thes... - 01:51 PM rbd Bug #3697 (Can't reproduce): rbd copy.sh test failing in nightly
- unable to reproduce so far
- 12:05 PM CephFS Feature #3570 (In Progress): teuthology: mds thrasher
- 11:47 AM rbd Feature #2256: rbd: parallelize deletions
- 11:46 AM rbd Feature #2297: ObjectCacher: mark buffers mergeable for ksm
- 11:46 AM rbd Bug #3518: rbd import file --format 2 creates an image named '--format'
- 11:46 AM rbd Feature #3635: rbd cli: call "udevadm settle" after use of add/remove kernel interface
- 11:42 AM Bug #3744 (Resolved): librbd: need to handle older OSDs that don't have cls_lock
- commit:4483285c9fb16f09986e2e48b855cd3db869e33c in next
- 11:28 AM Bug #3771: ceph does not have startup scripts in Centos
- Gary found that the installation script was commented out 2011-10-17
> commit 9baf5ef4f35c38d7fbaa70bde8f2c9383b2f... - 11:13 AM Bug #3771 (Resolved): ceph does not have startup scripts in Centos
- I did a basic ceph v0.56 installation on Centos 6.3
I have rebooted my nodes, and find that ceph is not starting up a... - 10:58 AM CephFS Bug #3681: kclient fsx fails nightly
- Proposed fix to set i_size before the setattr request:
This will resolve the above issue, because the cap flush on... - 09:59 AM Bug #3683 (Can't reproduce): mon: leak of MMonPaxos
- 09:58 AM Bug #3683: mon: leak of MMonPaxos
- I can't for the life of me reproduce this leak. In the meantime, Sage submitted a patch to msg/Pipe.cc [1] tha...
- 07:17 AM Bug #3695: monitor crashed after an upgrade in Monitor::timecheck
- I've been unable to reproduce this bug, but the cause was pretty obvious, so I pushed a fix that should deal with thi...
- 03:39 AM Revision 62e721a9 (ceph): librados: add aio stat tests
- Implement simple write-stat test, and a write-stat-remove-stat test cycle.
Signed-off-by: Filippos Giannakos <philip... - 03:38 AM Revision 879578c1 (ceph): librados: implement aio_stat
- Implement aio stat and also export this functionality to the C API.
Signed-off-by: Filippos Giannakos <philipgian@gr... - 02:32 AM Revision 5b12b514 (ceph): osd: make missing head non-fatal during scrub
- If we encounter a scrub without a preceding head, warn instead of
crashing. Note that this is still something we ca... - 02:29 AM Revision e1da85f2 (ceph): rgw: Fix crash when FastCGI frontend doesn't set SCRIPT_URI
- Fixes: #3735
Signed-off-by: caleb miles <caleb.miles@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com> - 02:28 AM Revision eba314a8 (ceph): rgw: fix handler leak in handle_request
- Fixes: #3682
Signed-off-by: caleb miles <caleb.miles@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com> - 02:25 AM Revision 4483285c (ceph): librbd: Allow get_lock_info to fail
- If the lock class isn't present, EOPNOTSUPP is returned for lock calls
on newer OSDs, but sadly EIO on older; we need... - 02:21 AM Revision 77ddf276 (ceph): doc/release-notes: v0.48.3argonaut
- Signed-off-by: Sage Weil <sage@inktank.com>
- 12:23 AM Bug #3770 (Resolved): OSD crashes on boot
- One of my 0.56.1 OSDs crashed and couldn't boot: it was reaching tp_op heartbeats, and even after increasing that I w...
01/08/2013
- 10:21 PM Feature #3769 (Resolved): osd: scrub should verify snap collection existence, membership
- and, hopefully, backport this to argonaut
- 09:39 PM Feature #3651 (In Progress): osd: deep scrub should hash omap
- 07:57 PM Revision 573f5315 (ceph): marginal/multiclient: Matching tests for kclient
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 07:54 PM Revision 14385a66 (ceph): marginal/multiclient: Add three client cluster
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 07:51 PM Revision a4df5238 (ceph): marginal/multiclient: Adding ior test to marginal
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 06:36 PM Revision 1e03fe18 (ceph): marginal/multiclient: Add a test for fsx-mpi
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 06:23 PM Revision c07a4cb6 (ceph): marginal/multiclient: New task to run mdtest
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 06:11 PM Revision f17847e5 (ceph): task/kclient: chmod root to 1777.
- Signed-off-by: Greg Farnum <greg@inktank.com>
- 05:27 PM rbd Bug #3765: rbd cp of a zero sized image succeeds with error
- I looked into this; it happens because clip_io() (called from read_iterate()) tries to validate
that writing at offs... - 03:23 PM rbd Bug #3765 (Resolved): rbd cp of a zero sized image succeeds with error
- ceph version 0.56-131-gd283abd (d283abdf50b1e4429b775680bfae1bb20c75306b)
while I am still surprised about why we ne... - 04:45 PM Bug #3768 (Resolved): perl is required for logrotate, we need to include Perl as a dependency
- logrotate for ceph (/etc/logrotate.d/ceph) uses perl commands
if perl is not installed, logrotate fails
if logrotat... - 04:29 PM CephFS Bug #3597: ceph-fuse: denying root access
- Is root actually a member of the fuse group? If not that would be correct behavior.
- 04:07 PM Revision f8958463 (ceph): task/mpi: Allow working directory to be specified
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 03:46 PM rbd Bug #3766 (Resolved): rbd resize command fails on a mixed node cluster when it is a copied rbd im...
- ubuntu@burnupi24:/var/log/ceph$ ceph -v
ceph version 0.56-131-gd283abd (d283abdf50b1e4429b775680bfae1bb20c75306b)
... - 03:42 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
- I think so.
But first let's verify it passes. - 12:43 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
- Should we revert that teuthology commit, then?
- 12:31 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
- There was a bug in the kernel's O_CREAT permission checking for non-root users. It's fixed in the testing branch. ...
- 10:49 AM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
- This is weird. Tamil says this one has never passed, but we can both run it locally fine and it passes in the ceph-fu...
- 09:39 AM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
- I made a change to the cfuse task to chmod 1777 the ceph root dir after it's mounted. I think we should do the same f...
- 09:21 AM Bug #3752 (Resolved): fsync-tester script need to be fixed to run in the nightlies
- log: ubuntu@teuthology:/a/teuthology-2013-01-05_22:28:52-regression-next-testing-basic/35949
35949: (190s) collect... - 03:34 PM Revision 16248121 (ceph): task: A task to setup mpi
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 03:33 PM Revision e88c0fc8 (ceph): task/ceph-fuse: chmod root to 1777
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 03:32 PM Revision 4ed20ae8 (ceph): task/pexec: Add barrier capability
- This patch adds the ability to barrier between
parallel exec tasks so that all tasks will perform
the following step ... - 03:31 PM Revision 35320083 (ceph): task/pexec: More fixes for all case, exec on hosts
- We don't want to do an exec per role, but per-host. We
were already doing an exec per host, but the names were confu... - 03:29 PM Revision 081a80f8 (ceph): task/pexec: Fix when 'all' is used
- Signed-off-by: Sam Lang <sam.lang@inktank.com>
- 03:25 PM Revision d44fb147 (ceph): radosgw-admin.py: Increase test coverage to current admin feature set.
- Signed-off-by: caleb miles <caleb.miles@inktank.com>
- 12:58 PM Feature #3760: osd: maintain checksum on collection contents
- It wasn't clear to me from the description, but we are of course talking about maintaining in the HashIndex a checksu...
- 12:13 PM Feature #3760 (Rejected): osd: maintain checksum on collection contents
- Currently, there is no way for an OSD to detect erroneously missing objects in a pg collection. A scrub, therefore, ...
- 12:33 PM RADOS Feature #3764 (New): osd: async replicas
- The following is more a topic for conversation than a feature:
Currently, latency on any operation is limited by t... - 12:23 PM rbd Feature #3763 (Resolved): krbd: handle flattening of mapped image
- An rbd client receives notice if the snapshot context for
a mapped rbd image has changed. It is possible for the
s... - 12:19 PM Linux kernel client Bug #3762 (Duplicate): kernel osd client: verify support for multiple ops per request
- In order to support layered rbd images, the osd client needs
to support multiple ops in a single osd request.
Loo... - 12:15 PM rbd Feature #3761 (Resolved): kernel messenger: need to support multiple ops per request
- The kernel messenger currently gets message data from either
a bio list or a page vector. That is one or the other,... - 12:13 PM Bug #3759 (Duplicate): osd: maintain checksum on collection contents
- 12:11 PM Bug #3759 (Duplicate): osd: maintain checksum on collection contents
- Currently, there is no way for an OSD to detect erroneously missing objects in a pg collection. A scrub, therefore, ...
- 12:08 PM rbd Tasks #2853: krbd: read path
- This task depends on the completion of the following others
before it can be completed:
3741 krbd: rework request ... - 12:07 PM Feature #3758 (Rejected): osd: incremental object checksumming
- Currently, scrub can only compare the checksums between replicas. If an inconsistency is found between two replicas,...
- 12:07 PM rbd Subtask #2854: krbd: write path
- Work on this won't really begin until the read path work
has completed (http://tracker.newdream.net/issues/2853).
- 12:06 PM rbd Subtask #2854: krbd: write path
- OK, I'm going to interpret this as:
Any write operation on a layered image will be preceded
by an existence c... - 12:04 PM CephFS Feature #626 (Closed): qa: add IOR, rompio, or other parallel workloads suite
- Added tests to the _marginal_ qa suite that run IOR, mdtest, and fsx-mpi.
- 11:48 AM Feature #3756 (Duplicate): Watch/Notify cleanup
- 11:41 AM Feature #3756 (Duplicate): Watch/Notify cleanup
- The current design is rather fragile particularly with respect to the locking and ref counting.
The result of this... - 11:47 AM Feature #3757 (Resolved): osd: Watch/Notify cleanup
- The current design is rather fragile particularly with respect to the locking and ref counting.
The result of this... - 11:24 AM Bug #3744: librbd: need to handle older OSDs that don't have cls_lock
- Actually, rados lock list should continue to fail.
- 11:10 AM Documentation #3322: doc: Explain multi-tenant CephFS
- Where is this located? I wasn't able to find it.
- 11:00 AM rbd Tasks #3755 (Resolved): krbd: use new request tracking code for sync object operations
- The last request type still using the old request tracking code
is for handling synchronous operations. There are t... - 10:58 AM rbd Feature #3754 (Closed): krbd: use new request tracking code for notify ack
- Two request types remain that still use the old request
tracking mechanism. One of them is sending acknowledgements... - 09:54 AM rbd Bug #3753 (Resolved): rbd copy command reports error even though copy is successful on a mixed no...
- ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
On a mixed node cluster running argonaut[burnupi21,... - 09:39 AM CephFS Feature #3543: mds: new encoding
- I'm going to get started on this (mostly just figuring out current state, probably) today.
- 09:28 AM Bug #3695: monitor crashed after an upgrade in Monitor::timecheck
- 06:54 AM Bug #3695 (In Progress): monitor crashed after an upgrade in Monitor::timecheck
- 08:47 AM Linux kernel client Bug #3751: krbd: fix type of snap_id local variable
- I have a fix for this and I'll post it for review later
today.... - 08:47 AM Linux kernel client Bug #3751 (Resolved): krbd: fix type of snap_id local variable
- The type of the snap_id local variable in rbd_dev_v2_snap_info()
is defined with the wrong byte order. - 06:43 AM Bug #3748: ceph osd dump --format=json includes non-JSON line
- One other option would be to provide "standard" fields for status output when using json, regardless of any other exp...
- 05:08 AM Revision 920f82e8 (ceph): v0.48.3argonaut
- 04:51 AM Bug #3750 (Resolved): Possible Ceph 5-minute quick start guide typo
- I believe that the Ceph quick start guide should specify
@sudo service ceph -a start@
instead of the current
@... - 04:51 AM Revision f07921be (ceph): doc/install: new URLs for argonaut vs bobtail
- Also restructure the document a bit to make the choice of packages more
clear.
Signed-off-by: Sage Weil <sage@inktan... - 04:46 AM Revision 72674ad4 (ceph): doc/release-notes: v0.56.1
- Signed-off-by: Sage Weil <sage@inktank.com>
- 03:40 AM Bug #3747: PGs stuck in active+remapped
- I did a "ceph osd out 0; sleep 30; ceph osd in 0" and out of those 61 active+remapped pgs, 5 went into active+remappe...
- 12:14 AM Revision 1b194b25 (ceph): Merge branch 'wip-stripe-gran'
- Reviewed-by: Greg Farnum <greg@inktank.com>
01/07/2013
- 11:50 PM Revision 26e8438a (ceph): test: enforce -ENOTCONN contract in libcephfs
- Tests all relevant calls for -ENOTCONN when used with an unmounted
ceph_mount_info param.
Signed-off-by: Noah Watkin... - 11:49 PM Revision 5c58aa96 (ceph): libcephfs: return -ENOTCONN when call unmounted
- Adds -ENOTCONN return value for stat, fchmod, fchown, lchown.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> - 11:16 PM Revision f83fcf63 (ceph): PG: set DEGRADED in Active AdvMap handler based on pool size
- Otherwise, if the acting set does not change, the pg might
not show up as degraded if the pool size now exceeds the
a... - 11:04 PM Revision c4121093 (ceph): libcephfs: clarify interface return value
- Document that ceph_get_stripe_unit_granularity may return an error code
(e.g. -ENOTCONN). The interface requires a mo... - 09:33 PM Revision e4a54162 (ceph): v0.56.1
- 09:12 PM Revision c8f8c7e6 (ceph): Merge branch 'next'
- 09:08 PM Revision 9aecacda (ceph): msg/Pipe: prepare Message data for wire under pipe_lock
- We cannot trust the Message bufferlists or other structures to be
stable without pipe_lock, as another Pipe may claim... - 09:08 PM Revision 299dbad4 (ceph): msgr: update Message envelope in encode, not write_message
- Fill out the Message header, footer, and calculate CRCs during
encoding, not write_message(). This removes most modi... - 09:08 PM Revision 35d2f583 (ceph): msg/Pipe: encode message inside pipe_lock
- This modifies bufferlists in the Message struct, and it is possible
for multiple instances of the Pipe to get referen... - 09:08 PM Revision 9b23f195 (ceph): msg/Pipe: associate sending msgs to con inside lock
- Associate a sending message with the connection inside the pipe_lock.
This way if a racing thread tries to steal thes... - 09:08 PM Revision 6229b5a0 (ceph): msg/Pipe: fix msg leak in requeue_sent()
- The sent list owns a reference to each message.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from comm... - 09:04 PM Revision 1b39b316 (ceph): Merge branch 'wip-3678-b' into next
- Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com> - 09:02 PM Revision 40706afc (ceph): msgr: update Message envelope in encode, not write_message
- Fill out the Message header, footer, and calculate CRCs during
encoding, not write_message(). This removes most modi... - 09:02 PM Revision d16ad926 (ceph): msg/Pipe: prepare Message data for wire under pipe_lock
- We cannot trust the Message bufferlists or other structures to be
stable without pipe_lock, as another Pipe may claim... - 09:01 PM Revision 6a00ce0d (ceph): osdc/Objecter: fix linger_ops iterator invalidation on pool deletion
- The call to check_linger_pool_dne() may unregister the linger request,
invalidating the iterator. To avoid this, inc... - 08:58 PM Revision 62586884 (ceph): osdc/Objecter: fix linger_ops iterator invalidation on pool deletion
- The call to check_linger_pool_dne() may unregister the linger request,
invalidating the iterator. To avoid this, inc... - 06:39 PM Revision 213e3559 (ceph): osd: fix race in do_recovery()
- Verify that the PG is still RECOVERING or BACKFILL when we take the pg
lock in the recovery thread. This prevents a ... - 06:38 PM Revision e410d1a0 (ceph): ReplicatedPG: requeue waiting_for_ondisk in apply_and_flush_repops
- Fixes: #3722
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com> - 06:34 PM Revision 4c9f4c3c (ceph): ceph-fuse: rename ceph_ll_* to fuse_ll_*
- To not conflict with future linuxbox pull for nfs-ganesha.
Signed-off-by: David Zafman <david.zafman@inktank.com>
Re... - 04:04 PM CephFS Feature #3749 (Resolved): Remove forced synchronization from Java bindings
- Remove "synchronized" keyword from native interface. This was originally added when we were seeing some pthread mutex...
- 03:58 PM Bug #3748 (Resolved): ceph osd dump --format=json includes non-JSON line
- ceph osd dump --format=json includes the non-JSON "dumped osdmap epoch N" at the top of the output, which of course b...
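While the status line is still present, a consumer could skip everything before the first JSON token. A minimal sketch, assuming the output is a "dumped osdmap epoch N" line followed by a JSON document. The helper name and the sample text are made up for illustration and are not real command output.

```python
import json


def parse_json_after_preamble(text):
    """Parse JSON from text, skipping leading lines that are not JSON."""
    lines = text.splitlines()
    for index, line in enumerate(lines):
        # The first line that opens an object or array starts the document.
        if line.lstrip().startswith(("{", "[")):
            return json.loads("\n".join(lines[index:]))
    raise ValueError("no JSON document found in output")


if __name__ == "__main__":
    # Hypothetical sample; the field names are illustrative only.
    sample = 'dumped osdmap epoch 12\n{"epoch": 12, "osds": []}\n'
    print(parse_json_after_preamble(sample)["epoch"])
```

This only works around the problem on the consumer side; the fix the report asks for is to keep the non-JSON line out of the JSON output.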
- 03:42 PM Bug #3747 (Closed): PGs stuck in active+remapped
- About a week ago I doubled the number of OSDs in my cluster from 24 to 48 and, in the same day, adjusted CRUSH's defa...
- 03:35 PM rbd Subtask #2854: krbd: write path
- rbd write path: 'guard' in the sense that the write has a check to verify the object already exists.
- 03:22 PM rbd Subtask #2854: krbd: write path
- Pretty sure this is about the rbd locking and fencing.
- 03:11 PM rbd Subtask #2854: krbd: write path
- I'm about to mark bug 3418 as a duplicate of this one.
I'm adding the following from that bug here first.
I did... - 03:11 PM rbd Subtask #2854: krbd: write path
- I'm not sure what "guard writes" is supposed to mean.
But I'm going to interpret it as simply implementing the
writ... - 03:26 PM CephFS Bug #3746 (Rejected): kclient mmap doesn't zero past EOF
- Error coming from fsx:
INFO:teuthology.orchestra.run.out:Mapped Write: non-zero data past EOF (0xb826) page offset... - 03:14 PM rbd Feature #3419 (Duplicate): krbd: copy-up on write to clone
- This is a duplicate of http://tracker.newdream.net/issues/2855.
- 03:14 PM rbd Subtask #2855: krbd: copy-up on write to clone
- I don't know how to change the one-line bug description or I
would.
I need some clarification about the intended ... - 03:12 PM rbd Feature #3418 (Duplicate): krbd: write path (layering)
- This is a duplicate of http://tracker.newdream.net/issues/2854.
- 03:07 PM rbd Feature #3417 (Duplicate): krbd: read path (layering)
- This is a duplicate of tracker.newdream.net/issues/2854.
- 03:06 PM rbd Tasks #2853: krbd: read path
- I'm about to mark bug 3417 as a duplicate of this.
I'm putting this bit of info from there here first.
Work o... - 03:05 PM rbd Feature #3416 (Duplicate): krbd: open parent on open
- Marking this as a duplicate of http://tracker.newdream.net/issues/2852.
- 02:51 PM rbd Bug #3743: krbd: errors on submitted requests are ignored
- If I could figure out how, I'd change the title of this
to say "krbd" rather than "rbd" to help make it clear
which... - 02:27 PM rbd Bug #3743 (Won't Fix): krbd: errors on submitted requests are ignored
- When a Linux request comes down to the rbd driver via rbd_rq_fn(),
rbd_dev_do_request() is called after validating t... - 02:50 PM rbd Bug #3745 (Rejected): krbd: individual response errors are ignored
- A Linux I/O request on an rbd image is broken into one or
more rbd requests, one request directed to each osd object... - 02:41 PM Bug #3744 (Resolved): librbd: need to handle older OSDs that don't have cls_lock
- Older OSDs didn't have libcls_lock, and will fail lock operations; this means
virtually all rbd operations and rados... - 01:22 PM Bug #3722 (Resolved): osd: indefinitely hung request on stable cluster
- commit:e410d1a066b906cad3103a5bbfa5b4509be9ac37
- 01:22 PM Bug #3736: kernel build: failures starting in 3.8-rc1
- Sure enough, this is the commit that causes the problem:
af3df2c perf tools: Try to build Documentation when insta... - 11:48 AM Bug #3736: kernel build: failures starting in 3.8-rc1
- Looks like commit 6ca2a9c is the first one in that branch
that fails. It has a parent ce37f40 that succeeds.
I'v... - 10:24 AM Bug #3736: kernel build: failures starting in 3.8-rc1
- Heard back from Neil as well as Vlad Yasevich about my
proposed fix and they both ack'd it. Linus was in on
the di... - 09:07 AM Bug #3736: kernel build: failures starting in 3.8-rc1
- Despite a working build of the *kernel*, the package build
overall is still failing. It has something to do with bu... - 08:52 AM Bug #3736: kernel build: failures starting in 3.8-rc1
- Neil Horman sent a response to my message and suggested
three possible alternatives to fix the underlying problem,
... - 05:42 AM Bug #3736: kernel build: failures starting in 3.8-rc1
- I changed our config file, found in the git repository
autobuild-ceph in the file "kernel-config" in the way
descri... - 05:40 AM Bug #3736: kernel build: failures starting in 3.8-rc1
- I'm retroactively updating this so a bit about what's been
done gets documented.
The problem was in the Kconfig f... - 05:35 AM Bug #3736 (Resolved): kernel build: failures starting in 3.8-rc1
- Kernels as of version 3.8-rc1 are not properly building in
autobuilder. The initial symptom was that the config pha... - 01:16 PM Bug #3678 (Resolved): osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
- 01:16 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
- commit:1b39b31678aea8c5bbdb38811b3919525228d10f
- 01:01 PM Bug #3734 (Resolved): osd/objecter: misdirected op in librados api tests
- 12:19 PM CephFS Cleanup #3742 (Resolved): Remove old Hadoop wrappers and configuration options
- I think it's likely that the current Hadoop shim is at least at feature parity with the old wrappers.
- 12:16 PM Bug #3702: OSD SIGABRT during startup
- Dan Mick wrote:
> Is this related to rbd, or should it be in category 'ceph'?
Ah, yes, it should. Thank you for c... - 11:31 AM Bug #3702: OSD SIGABRT during startup
- Is this related to rbd, or should it be in category 'ceph'?
- 12:07 PM rbd Subtask #3741: krbd: rework request tracking code
- ...
- 11:54 AM rbd Subtask #3741 (Resolved): krbd: rework request tracking code
- This is actually work that's mostly complete, but it never
got a bug assigned to it.
In order to handle layering ... - 11:26 AM Bug #3632 (Resolved): occasional testrados failure: process_8 exited with a signal
- this is probably #3734, now fixed.
- 11:09 AM rbd Subtask #2852: krbd: open parent on open
- This work is essentially done, and has been since
October 2012 (or even earlier). However I held off
posting it fo... - 11:00 AM Linux kernel client Bug #3740 (Resolved): ceph-client: change to be based on 3.8-rc2
- Our current ceph-client tree is based on Linux 3.6.
That is fairly old code (late September, 2012). We
should upda... - 10:12 AM Feature #3739 (Resolved): osd: repair object size vs object_info_t mismatches
- If the object_info_t size doesn't match the on-disk file/object size, we need to repair it. This means proposing a s...
- 10:02 AM CephFS Bug #3726 (Resolved): Enforce Ceph's minimum stripe size in the java bindings
- 10:02 AM CephFS Bug #3726 (Closed): Enforce Ceph's minimum stripe size in the java bindings
- 09:21 AM CephFS Bug #3738 (Resolved): kclient fsx truncate/write multi-client race
- This bug is similar to #3681, but occurs only in the non-exclusive case (multiple clients), where a truncate doesn'... - 09:09 AM CephFS Bug #3681: kclient fsx fails nightly
- The race here is between a truncate down, and completion of osd write ops triggering a cap flush. The exact order th...
- 06:30 AM rbd Bug #3737 (Resolved): Higher ping-latency observed in qemu with rbd_cache=true during disk-write
- Hi Josh,
as per our short conversation in IRC-#ceph there is an issue with latency/responsiveness with rbd_cache e... - 04:38 AM Revision 4cfc4903 (ceph): msg/Pipe: encode message inside pipe_lock
- This modifies bufferlists in the Message struct, and it is possible
for multiple instances of the Pipe to get referen... - 04:38 AM Revision a058f161 (ceph): msg/Pipe: associate sending msgs to con inside lock
- Associate a sending message with the connection inside the pipe_lock.
This way if a racing thread tries to steal thes... - 04:38 AM Revision 2a1eb466 (ceph): msg/Pipe: fix msg leak in requeue_sent()
- The sent list owns a reference to each message.
Signed-off-by: Sage Weil <sage@inktank.com> - 04:18 AM rgw Bug #3735: rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
- Here's the fix I used on my system to fix the problem. The S3 service is set at the root of the virtual server so "" ...
- 03:07 AM rgw Bug #3735 (Closed): rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
- I'm using lighttpd as a Fast CGI front end for radosgw and it doesn't set SCRIPT_URI environment variable.
So the ...
01/06/2013
- 10:50 PM Bug #3734 (Fix Under Review): osd/objecter: misdirected op in librados api tests
- wip-3734
- 10:41 PM Bug #3734: osd/objecter: misdirected op in librados api tests
- epoch 328:...
- 10:15 PM Bug #3734 (Resolved): osd/objecter: misdirected op in librados api tests
- ...
- 03:10 PM Bug #3715 (Duplicate): Crash during 0.55 -> 0.56 upgrade
- this was #3731
- 02:38 PM Bug #3722: osd: indefinitely hung request on stable cluster
- 02:34 PM Bug #3678 (Fix Under Review): osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MN...
- YAY, wip-3678 is consistently passing now.
- 05:37 AM Revision a10950f9 (ceph): os/FileJournal: include limits.h
- Needed for IOV_MAX.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit ce49968938ca3636f48fe5431... - 04:54 AM Revision ce499689 (ceph): os/FileJournal: include limits.h
- Needed for IOV_MAX.
Signed-off-by: Sage Weil <sage@inktank.com>
01/05/2013
- 09:32 PM Feature #3733 (Closed): osd: update leveldb submodule
- 07:17 PM Revision e9efa332 (ceph): java: add stripe unit granularity tests
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
- 07:12 PM Revision ececcf57 (ceph): java: update javadoc comments
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
- 07:10 PM Revision cdd138da (ceph): java: fix whitespace
- Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
- 07:08 PM Revision abcda95b (ceph): libcephfs: expose stripe unit granularity
- Assists clients in choosing layout parameters.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> - 07:08 PM Revision 6954bf33 (ceph): java: add support for get_stripe_unit_granularity
- Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Noah Watkins <noahwatkins@gmail.com> - 06:47 PM Documentation #3389 (In Progress): doc: crush docs could use a full example crushmap
- 10:02 AM Bug #3731: rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible protocol change
- Do we have a test that checks our interfaces to
automatically catch inadvertent protocol changes?
If not, we should. - 09:04 AM Bug #3731 (Resolved): rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible prot...
- commit:988a52173522e9a410ba975a4e8b7c25c7801123
- 09:04 AM Bug #3721 (Resolved): filestore: op_seq written in wrong order on non-btrfs
- commit:28d59d374b28629a230d36b93e60a8474c902aa5
- 09:03 AM Bug #3698 (Resolved): filestore: ENOENT on clone
- commit:e89b6ade63cdad315ab754789de24008cfe42b37
- 08:27 AM Feature #3732 (Resolved): osd/mon: report recovery rate (bytes and objects per sec)
- Report the rate of recovery (objects and bytes per second) via the monitor, presumably via 'ceph -w' and similar inte...
- 04:48 AM Revision 415294c0 (ceph): Merge branch 'next'
- 04:47 AM Revision cd194ef3 (ceph): osd: special case CALL op to not have RD bit effects
- In commit 20496b8d2b2c3779a771695c6f778abbdb66d92a we treat a CALL as
different from a normal "read", but we did not ... - 04:47 AM Revision 921e06de (ceph): Revert "OSD: remove RD flag from CALL ops"
- This reverts commit 91e941aef9f55425cc12204146f26d79c444cfae.
We cannot change this op code without breaking compati... - 04:46 AM Revision 988a5217 (ceph): osd: special case CALL op to not have RD bit effects
- In commit 20496b8d2b2c3779a771695c6f778abbdb66d92a we treat a CALL as
different from a normal "read", but we did not ... - 04:46 AM Revision d3abd0fe (ceph): Revert "OSD: remove RD flag from CALL ops"
- This reverts commit 91e941aef9f55425cc12204146f26d79c444cfae.
We cannot change this op code without breaking compati... - 03:51 AM Revision 3a940874 (ceph): libcephfs: delete client after messenger shutdown
- Prevents race between messages being dispatched to the client after the
client has been free'd.
Signed-off-by: Noah ... - 02:02 AM Revision 0978dc49 (ceph): rbd: Don't call ProgressContext's finish() if there's an error.
- do_copy was different from the others; call pc.fail() on error and
do not call pc.finish().
Fixes: #3729
Signed-off-...
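The rule in this commit (call `pc.fail()` on error and do not call `pc.finish()`) can be sketched in Python. The class and wrapper below are hypothetical stand-ins written for illustration, not the rbd tool's C++ ProgressContext; the negative-return-code convention is an assumption.

```python
class ProgressContext:
    """Hypothetical stand-in that records how an operation ended."""

    def __init__(self):
        self.events = []

    def finish(self):
        # Reports completion; must only be reached on success.
        self.events.append("finish")

    def fail(self):
        # Reports failure instead of a misleading 100% completion.
        self.events.append("fail")


def run_with_progress(operation, pc):
    """Run operation(pc) and call exactly one of pc.fail() or pc.finish().

    A negative return value is treated as an error code (assumption).
    """
    result = operation(pc)
    if result < 0:
        pc.fail()
    else:
        pc.finish()
    return result


if __name__ == "__main__":
    pc = ProgressContext()
    run_with_progress(lambda ctx: -17, pc)
    print(pc.events)
```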
01/04/2013
- 09:45 PM Revision 7513e971 (ceph): ReplicatedPG: remove old-head optization from push_to_replica
- This optimization allowed the primary to push a clone as a single push in the
case that the head object on the replic... - 09:44 PM Revision e89b6ade (ceph): ReplicatedPG: remove old-head optization from push_to_replica
- This optimization allowed the primary to push a clone as a single push in the
case that the head object on the replic... - 09:37 PM Revision 6a3d475c (ceph): Merge remote branch 'origin/wip-rbd-watch'
- Reviewed-by: Dan Mick <dan.mick@inktank.com>
- 08:32 PM Revision cd5f2bfd (ceph): ObjectCacher: fix off-by-one error in split
- This error left a completion that should have been attached
to the right BufferHead on the left BufferHead, which wou... - 07:54 PM CephFS Bug #3666 (Resolved): Segfault running test_libcephfs
- commit:3a9408742a8a6cbc870cba543a208285f1a6cec1
- 03:25 PM CephFS Bug #3666: Segfault running test_libcephfs
- I pushed a new wip-client-shutdown. This switches the clean-up order of client/messenger in libcephfs, rather than mo...
- 01:36 PM CephFS Bug #3666: Segfault running test_libcephfs
- Right, I think your fix will work, but it breaks the interface abstraction (messenger is created above the client, de...
- 01:16 PM CephFS Bug #3666: Segfault running test_libcephfs
- This is what I'm running to reproduce the error. It's been running now for an hour on wip-client-shutdown without any...
- 12:57 PM CephFS Bug #3666: Segfault running test_libcephfs
- Rather than moving messenger shutdown into client shutdown?
- 12:48 PM CephFS Bug #3666: Segfault running test_libcephfs
- A similar issue was just handled in the ceph_fuse.cc code. There we just delay deleting the client till the end. Yo...
- 10:41 AM CephFS Bug #3666: Segfault running test_libcephfs
- During unmount, the client is shut down and freed before the messenger. If any messages are delivered after the clien...
- 07:07 PM Revision 802c486f (ceph): config: change default log_max_recent to 10,000
- Commit c34e38bcdc0460219d19b21ca7a0554adf7f7f84 meant to do this but got
the wrong number of zeros.
Signed-off-by: S... - 06:18 PM Revision d6496abf (ceph): remove rbd_header_race test
- This no longer works since export does not do a watch, and the race is
being closed a different way not detectable by... - 06:16 PM Revision 620dd551 (ceph): task: mon_clock_skew_check.py: Check for clock skews on the monitors
- Will run for as long as teuthology runs. By default, fails if any clock
skews higher than 0.05 seconds are detected, ... - 06:11 PM rbd Bug #3729 (Resolved): rbd cp command reports 100% completion even on failure
- commit:0978dc4963fe441fb67afecb074bc7b01798d59d
- 03:12 PM rbd Bug #3729 (Resolved): rbd cp command reports 100% completion even on failure
- ceph version 0.56-109-gd8940d1 (d8940d15c330d05c8a198ff7dde16df748938b65)
when trying to copy rbd image to an alre... - 06:06 PM Bug #3702: OSD SIGABRT during startup
- Sage Weil wrote:
> Was the monitor also running 0.48.2argonaut when osd.131 originally crashed? Or something else?
... - 09:42 AM Bug #3702 (Need More Info): OSD SIGABRT during startup
- 05:54 PM Revision 1a878611 (ceph): regression: include nfs suite
- 05:50 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
- got msgr logs in ubuntu@teuthology:/a/sage-a3/34724, but the crash looked different from the earlier ones (whose logs...
- 05:40 PM Bug #3731 (Fix Under Review): rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompati...
- see wip-3731
- 05:19 PM Bug #3731: rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible protocol change
- Agreed. And let's make sure it's fixed for 0.56.1.
- 05:15 PM Bug #3731: rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible protocol change
- Discussed this with Dan and Sam and I think we just want to roll this patch back and tell people not to use v0.56 for...
- 04:34 PM Bug #3731 (Resolved): rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible prot...
- CEPH_OSD_OP_CALL changed to remove the CEPH_OSD_OP_MODE_RD bit in
91e941aef9f55425cc12204146f26d79c444cfae; however,... - 05:03 PM Revision e88b909a (ceph): task: ceph_manager: add 'get_mon_health' function
- Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
- 03:29 PM CephFS Feature #3730 (Closed): Support replication factor in Hadoop
- In order to support per-file replication values in Hadoop we need to specify that a new file should be generated in a...
- 02:38 PM rbd Bug #3642 (Resolved): librbd: watch is sent with assert version, which fails on resends
- commit:6a3d475cf08eb3051e8cdbce10b17b53c92b9cb5
- 11:31 AM rbd Bug #3642 (Fix Under Review): librbd: watch is sent with assert version, which fails on resends
- in branch wip-rbd-watch
- 01:54 PM CephFS Bug #3726: Enforce Ceph's minimum stripe size in the java bindings
- Also, name it something along the lines of get_stripe_granularity() and not .._min(imum)_ as that isn't entirely accu...
- 01:40 PM CephFS Bug #3726: Enforce Ceph's minimum stripe size in the java bindings
- After a discussion on jabber, the decision is to go with exposing a function call in libcephfs and then using that in...
- 11:09 AM CephFS Bug #3726 (Resolved): Enforce Ceph's minimum stripe size in the java bindings
- The Hadoop bindings are using the blocksize as the stripe size. If a block size is explicitly passed down, it ends up...
- 01:00 PM CephFS Bug #3718: multi-client dbench gets stuck over NFS exported cephfs
- Heads up, Zheng Yan's patches on the mds fix issues related to running multiclient dbench tests.
- 12:24 PM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
- Hmm, okay. I wasn't real clear on the previous bugs so I'll need to look at it more if I end up taking this, but soun...
- 11:46 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
- Greg Farnum wrote:
> Hurray, it is. Nobody except the client looks at the trace_bl and setting that is the only thin... - 11:35 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
- Hurray, it is. Nobody except the client looks at the trace_bl and setting that is the only thing set_trace() does. Ex...
- 11:17 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
- Greg Farnum wrote:
> Am I reading it correctly that this is just going to be doing the config and wrapper work to no... - 09:01 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
- Am I reading it correctly that this is just going to be doing the config and wrapper work to not call set_trace() in ...
- 12:20 PM CephFS Feature #3543: mds: new encoding
- 12:20 PM CephFS Feature #3728: mds: draft design for lookup by ino
- 12:14 PM CephFS Feature #3728 (Resolved): mds: draft design for lookup by ino
- 12:20 PM CephFS Feature #3570: teuthology: mds thrasher
- 12:06 PM CephFS Feature #3727 (Resolved): mds: refactor EMetablob encoding paths
- Right now, the EMetaBlob sub-structures — for performance reasons — use an encoding pattern that doesn't match anythi...
- 11:42 AM CephFS Cleanup #89: mds: put inode dirty fields in dirty_bits_t to reduce memory footprint
- Greg Farnum wrote:
> I briefly scanned the CInode and inode_t structs and it wasn't obvious to me what this should e... - 09:34 AM CephFS Cleanup #89: mds: put inode dirty fields in dirty_bits_t to reduce memory footprint
- I briefly scanned the CInode and inode_t structs and it wasn't obvious to me what this should encompass. Are you talk...
- 11:41 AM CephFS Subtask #547: mds: define fsck strategy, required metadata
- This was a whiteboard discussion 2 years ago. Nothing was written down. We should reopen new and more detailed issu...
- 09:29 AM CephFS Subtask #547: mds: define fsck strategy, required metadata
- Where are the results of this bug? It's marked resolved but I don't see any fsck references in the git tree, and ther...
- 11:39 AM Feature #685: libcephmon: interact with ceph monitors via a library
- BTW it may make sense to push the client command stuff in the ceph tool into MonClient, and then wrap that in libceph...
- 11:38 AM CephFS Cleanup #3677: libcephfs, mds: test creation/addition of data pools, create policy
- Greg Farnum wrote:
> Do we have a separate bug for the library calls this needs?
#685, which would take the clien... - 09:27 AM CephFS Cleanup #3677: libcephfs, mds: test creation/addition of data pools, create policy
- Do we have a separate bug for the library calls this needs?
- 11:36 AM CephFS Feature #3244: qa: integrate Ganesha into teuthology testing to regularly exercise Ganesha CephFS...
- Greg Farnum wrote:
> And for this one as well: setting up Ganesha in teuthology, run tests against it? Not using the... - 09:24 AM CephFS Feature #3244: qa: integrate Ganesha into teuthology testing to regularly exercise Ganesha CephFS...
- And for this one as well: setting up Ganesha in teuthology, run tests against it? Not using the Ceph shim or anything...
- 11:35 AM CephFS Feature #3243: qa: test samba reexport via libcephfs vfs plugin in teuthology
- Greg Farnum wrote:
> Is this a matter of setting up (via teuthology) a Samba server which sits on top of a Ceph moun... - 09:24 AM CephFS Feature #3243: qa: test samba reexport via libcephfs vfs plugin in teuthology
- Is this a matter of setting up (via teuthology) a Samba server which sits on top of a Ceph mount and then running tes...
- 11:34 AM CephFS Feature #3426: ceph-fuse: build/run on os x
- Greg Farnum wrote:
> Noah has done some work on this in the wip-osx branch; last I heard you could compile and get a... - 09:22 AM CephFS Feature #3426: ceph-fuse: build/run on os x
- Noah has done some work on this in the wip-osx branch; last I heard you could compile and get a cluster going with vs...
- 11:32 AM CephFS Feature #3542: mds: migration path for existing anchors, anchortables, etc.
- Greg Farnum wrote:
> What all does this encompass? Design? Implementation? Does it need to be an online switch or ca... - 09:13 AM CephFS Feature #3542: mds: migration path for existing anchors, anchortables, etc.
- What all does this encompass? Design? Implementation? Does it need to be an online switch or can it be an offline job?
- 11:30 AM CephFS Feature #3541: mds: robust ino lookup using file backpointers
- Greg Farnum wrote:
> Is this bug supposed to encompass the anchor table replacement work as well? I wouldn't expect ... - 09:12 AM CephFS Feature #3541: mds: robust ino lookup using file backpointers
- Is this bug supposed to encompass the anchor table replacement work as well? I wouldn't expect so, but the presence o...
- 11:23 AM rbd Bug #3725 (Resolved): rbd_header_race script to be fixed in the nightlies
- 10:32 AM rbd Bug #3725 (Resolved): rbd_header_race script to be fixed in the nightlies
- log: ubuntu@teuthology:/a.old/teuthology-2013-01-02_19:00:03-regression-next-testing-basic/33734...
- 11:23 AM CephFS Feature #3540: mds: maintain per-file backpointers on first file object
- Greg Farnum wrote:
> Do we have any kind of design for this? We've talked about it some and it's conceptually simple... - 09:08 AM CephFS Feature #3540: mds: maintain per-file backpointers on first file object
- Do we have any kind of design for this? We've talked about it some and it's conceptually simple, but splitting up the...
- 11:15 AM CephFS Feature #626 (In Progress): qa: add IOR, rompio, or other parallel workloads suite
- Yeah, that's what slang's working on to enable this. Assigning this to him.
- 08:57 AM CephFS Feature #626: qa: add IOR, rompio, or other parallel workloads suite
- SamL has done some work on getting MPI going under teuthology, and on running some multi-client FS tests. I'm not sur...
- 11:14 AM Bug #3722: osd: indefinitely hung request on stable cluster
- the trigger is a brief osd reset due to an intermittent network outage. no actual ceph-osd daemons restart.
<pr... - 09:39 AM Bug #3722 (Need More Info): osd: indefinitely hung request on stable cluster
- 08:36 AM Bug #3722 (Resolved): osd: indefinitely hung request on stable cluster
- 0.48.2argonaut, rbd workload.
occasional requests are blocked indefinitely.
*may* be osd down/up cycles (due to... - 11:13 AM CephFS Feature #3621 (Resolved): qa: add knfsd reexport tests to qa suite
- 10:53 AM Bug #3723: ceph osd down command reports incorrectly
- similarly for "ceph osd in" command as well
ubuntu@burnupi06:/etc/ceph$ sudo ceph osd in 2 -k /etc/ceph/ceph.key... - 09:33 AM Bug #3723 (Can't reproduce): ceph osd down command reports incorrectly
- issuing the command: "sudo ceph osd down 2" reports osd.2 is already down but sudo ceph osd stat reports all are up.
... - 10:21 AM Bug #3698 (In Progress): filestore: ENOENT on clone
- 09:43 AM Bug #3699 (Resolved): osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
- commit:4ae4dce5c5bb547c1ff54d07c8b70d287490cae9
- 09:43 AM CephFS Feature #3399: java: add accessor to Ceph version numbers
- Oh, those are librados specific numbers, aren't they. So this bug is to create and expose a libceph version, then. Wh...
- 09:35 AM CephFS Feature #3399: java: add accessor to Ceph version numbers
- In libcephfs there is a call to get Ceph version (yes, just expose this). But, I recall Sage mentioning that it might...
- 09:19 AM CephFS Feature #3399: java: add accessor to Ceph version numbers
- This is just exposing the librados version() function to Java, right?
- 09:41 AM rgw Bug #3724 (Resolved): docs refer to non-implemented features of the radosgw-admin rest api
- The only radosgw-admin API calls currently are *get usage* and *trim usage*. The docs at
http://ceph.com/doc... - 09:41 AM CephFS Cleanup #660: mds: use helpers in mknod, mkdir, openc paths
- What kind of helpers are you talking about with this? inode fetchers and lock grabbers? In a quick scan over handle_c...
- 09:36 AM CephFS Feature #603: mds: repair directory hierarchy
- This is part of #82 fsck, right? Do we have a more detailed algorithm anywhere?
- 05:02 AM Revision 39a734fb (ceph): os/FileStore: fix non-btrfs op_seq commit order
- The op_seq file is the starting point for journal replay. For stable btrfs
commit mode, which is using a snapshot as... - 04:17 AM devops Documentation #3686: install prerequisites (Debian)
- Greg Farnum wrote:
> Nat, you should be able to install either of libtcmalloc-minimal or libgoogle-perftools — are... - 03:40 AM Revision c63c6646 (ceph): os/FileStore: fix non-btrfs op_seq commit order
- The op_seq file is the starting point for journal replay. For stable btrfs
commit mode, which is using a snapshot as... - 03:00 AM Revision acfa0c9a (ceph): mds: optimize C_MDC_RetryOpenRemoteIno
- When opening remote inode, C_MDC_RetryOpenRemoteIno is used as onfinish
context for discovering remote inode. When it... - 02:45 AM Revision b03eab22 (ceph): mds: forbid creating file in deleted directory
- Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
- 02:45 AM Revision 59953257 (ceph): mds: keep dentry lock in sync state as much as possible
- Unlike locks of other types, dentry lock in unreadable state can block path
traverse, so it should be in sync state a... - 02:45 AM Revision f9280cb6 (ceph): mds: fix replica state for LOCK_MIX_LOCK
- LOCK_MIX_LOCK state is for gathering local locks and caps, so replica state
should be LOCK_MIX.
Signed-off-by: Yan, ... - 02:45 AM Revision 248e4ab8 (ceph): mds: fix cap mask for ifile lock
- ifile lock has 8 cap bits, so its cap mask should be 0xff
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> - 02:45 AM Revision 420f3355 (ceph): mds: rdlock prepended dest trace when handling rename
- rdlock prepended dest trace to prevent them from being xlocked by
someone else.
Signed-off-by: Yan, Zheng <zheng.z.y... - 02:45 AM Revision ea2fd127 (ceph): mds: check null context in CDir::fetch()
- Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
- 02:45 AM Revision 3705c7ca (ceph): mds: drop locks when opening remote dentry
- Opening remote dentry while holding locks may cause deadlock. For example,
'discover' is blocked by a xlocked dentry... - 02:45 AM Revision ca4dc4db (ceph): mds: check if stray dentry is needed
- The necessity of stray dentry can change before the request acquires
all locks.
Signed-off-by: Yan, Zheng <zheng.z.y... - 02:45 AM Revision acbe6d97 (ceph): mds: don't issue caps while inode is exporting caps
- If caps are issued while the inode is exporting caps, the client will drop the
caps soon when it receives the CAP_OP_EXPORT me... - 02:45 AM Revision d379ac8e (ceph): mds: disable concurrent remote locking
- Current code allows multiple MDRequests to concurrently acquire a
remote lock. But a lock ACK message wakes all reque... - 01:15 AM Revision 28d59d37 (ceph): os/FileStore: fix non-btrfs op_seq commit order
- The op_seq file is the starting point for journal replay. For stable btrfs
commit mode, which is using a snapshot as... - 12:23 AM Revision 49416619 (ceph): log: broadcast cond signals
- We were using a single cond, and only signalling one waiter. That means
that if the flusher and several logging thre... - 12:13 AM Revision f1e0305f (ceph): doc: Removed the --without-tcmalloc flag until further advised.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 12:07 AM Revision 19df2086 (ceph): Merge pull request #30 from rca/master
- Minor clarification in docs.
01/03/2013
- 11:04 PM Revision 5ce47c2a (ceph): ssh_keys.py: pull the keys out of targets entry
- rather than the hosts known hosts file.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Sam Lang <sam.lang@i... - 10:51 PM Revision 88af7d18 (ceph): doc: Added defaults for PGs, links to recommended settings, and updated...
- Fixes: #3555
Signed-off-by: John Wilkins <john.wilkins@inktank.com> - 10:32 PM Revision b8f061dc (ceph): OSD: for old osds, dispatch peering messages immediately
- Normally, we batch up peering messages until the end of
process_peering_events to allow us to combine many notifies, ... - 10:18 PM Revision 4ae4dce5 (ceph): OSD: for old osds, dispatch peering messages immediately
- Normally, we batch up peering messages until the end of
process_peering_events to allow us to combine many notifies, ... - 09:30 PM Revision 73bc8ffc (ceph): doc: Added comments on --without-tcmalloc option when building Ceph.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 09:30 PM Revision 37b57cdf (ceph): Update doc/rados/configuration/filesystem-recommendations.rst
- Clarified when it's necessary to use the setting:
filestore xattr use omap = true - 09:29 PM Revision 43ef6772 (ceph): doc: Added some packages to the copyable line.
- Fixes: #3686
Signed-off-by: John Wilkins <john.wilkins@inktank.com> - 09:28 PM Revision 333ae82c (ceph): doc: Fixed syntax error.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 08:57 PM Revision aaa03bbc (ceph): qa: Add knfsd reexport suite
- Feature http://tracker.newdream.net/issues/3621
Signed-off-by: David Zafman <david.zafman@inktank.com> - 08:55 PM Revision 67968d11 (ceph): osd: move common active vs booting code into consume_map
- Push osdmaps to PGs in separate method from activate_map() (whose name
is becoming less and less accurate).
Signed-o... - 08:54 PM Revision 34266e6b (ceph): osd: let pgs process map advances before booting
- The OSD deliberately consumes and processes most OSDMaps from while it
was down before it marks itself up, as this is c... - 08:53 PM Revision 4034f6c8 (ceph): log: broadcast cond signals
- We were using a single cond, and only signalling one waiter. That means
that if the flusher and several logging thre... - 08:53 PM Revision 7e94f6f1 (ceph): Merge remote-tracking branch 'gh/wip-3714-b' into next
- Signed-off-by: Samuel Just <sam.just@inktank.com>
- 08:44 PM Revision 224a33bb (ceph): qa/workunit: Add dbench-short.sh for nfs suite
- A multi-client dbench run doesn't work over NFS,
see bug #3718. Make single client dbench available.
Signed... - 08:13 PM Documentation #3709 (In Progress): crush-map.rst: claims 'types' are default, not true (must be s...
- 02:32 PM Documentation #3709: crush-map.rst: claims 'types' are default, not true (must be specified); spe...
- These are "defaults" in the sense that they're generated as part of the default OSD Map. Apparently that needs to be ...
- 07:57 PM Documentation #3707 (In Progress): crush-map.rst: syntax error in example
- 05:54 PM Bug #3702: OSD SIGABRT during startup
- Was the monitor also running 0.48.2argonaut when osd.131 originally crashed? Or something else?
- 05:45 PM Bug #3721: filestore: op_seq written in wrong order on non-btrfs
- 04:02 PM Bug #3721 (Resolved): filestore: op_seq written in wrong order on non-btrfs
- see wip-fsync
- 05:23 PM Revision f8bb4814 (ceph): log: fix locking typo/stupid for dump_recent()
- We weren't locking m_flush_mutex properly, which in turn was leading to
racing threads calling dump_recent() and garb... - 05:14 PM Revision eee795c0 (ceph): rbd_xfstests.yaml: drop test 186
- Stop running test 186. It keeps failing in nightly runs, unable
to unmount the scratch file system during setup. As... - 04:47 PM rgw Documentation #2993 (Resolved): doc: write quick RGW guide (if feasible)
- 04:45 PM devops Feature #2884: doc: osd hotplugging
- I believe the hotplug event was added, but will confirm.
- 04:43 PM devops Documentation #2974: doc: update chef docs for mon key distribution
- I believe this is done. Will verify.
- 04:13 PM devops Documentation #3686: install prerequisites (Debian)
- Greg Farnum wrote:
> John, can you remove that --without-tcmalloc bit until we hear more?
>
> Nat, you should be ... - 02:48 PM devops Documentation #3686 (In Progress): install prerequisites (Debian)
- John, can you remove that --without-tcmalloc bit until we hear more?
Nat, you should be able to install either of ... - 02:45 PM devops Documentation #3686: install prerequisites (Debian)
- Eek. We really, really want people to be using tcmalloc (memory behavior without it is astonishingly atrocious). I kn...
- 01:31 PM devops Documentation #3686 (Resolved): install prerequisites (Debian)
- Added packages to the copyable lines. Modified the build page to include --without-tcmalloc.
- 03:50 PM Bug #3698: filestore: ENOENT on clone
- Ok. The recovery_qos stuff can allow a client op to reorder past a push. This is a problem since the push might be ...
- 07:53 AM Bug #3698: filestore: ENOENT on clone
- another instance with logs: ubuntu@teuthology:/a/sage-a2/33879
- 02:52 PM Documentation #3555 (Resolved): {page-num} in ceph osd pool create is not optional
- Updated the document to add "required," the default values, a link to calculating PG values, clarification about PGP,...
- 02:49 PM Bug #3633: mon: clock drift errors not reported by ceph status
- The OSD clocks are actually fairly unimportant. Everything they use that requires precise timing should be based enti...
- 10:12 AM Bug #3633: mon: clock drift errors not reported by ceph status
- The objective here was to make sure that clock skews on the monitors were detected and reported, as said skews might ...
- 08:46 AM Bug #3633: mon: clock drift errors not reported by ceph status
- Reading the patch, it looks like only the clocks of the mons are checked. So the clocks of the osds are not important to ce...
- 02:34 PM Bug #3720: Ceph Reporting Negative Number of Degraded objects
- Per Josh D's suggestion, I set the tunables and it resolved the issue.
# ceph osd getcrushmap -o /tmp/crush
# cru... - 01:02 PM Bug #3720 (Duplicate): Ceph Reporting Negative Number of Degraded objects
- Changed the replication of two pools from 2x to 3x. Cluster rebalanced to nearly HEALTH_OK but got stuck at:
HEALT... - 02:32 PM rbd Bug #3697: rbd copy.sh test failing in nightly
- When reproducing with lots of error logging to stderr, the error occurs on snapshots because the snap rm/snap info te...
- 01:59 PM CephFS Bug #3597: ceph-fuse: denying root access
- I believe that we can reproduce this error. We are running Ubuntu 12.04 LTS Server on both the client and on the Cep...
- 12:56 PM CephFS Bug #3719 (Can't reproduce): pjd test 145 failed in the nightly runs
- logs: ubuntu@teuthology:/a/teuthology-2013-01-02_19:00:03-regression-next-testing-basic/33621...
- 12:53 PM Bug #3714 (Resolved): osd: new peering code does not consume osdmaps prior to booting
- commit:7e94f6f1a7b7a865433edacd6a521f6ea1170eac
- 10:28 AM Bug #3714 (Fix Under Review): osd: new peering code does not consume osdmaps prior to booting
- 12:48 PM CephFS Bug #3718 (Rejected): multi-client dbench gets stuck over NFS exported cephfs
- When running qa/workunit dbench.sh the dbench 1 passes, but the dbench 10 gets hung up.
We should check this with ... - 12:28 PM CephFS Feature #3621 (In Progress): qa: add knfsd reexport tests to qa suite
- 09:49 AM RADOS Feature #3717 (New): osd: Make Rebalancing Smarter
- From Corin Langosch - During recovery/rebalancing it can happen that an osd receives lots of new data before data tha...
- 09:45 AM Bug #3716: recovery should take osd usage into account
- 1. My cluster already uses the tuned crushmap "crushtool -i /tmp/crush --set-choose-local-tries 0 --set-choose-local-...
- 09:36 AM Bug #3716 (Closed): recovery should take osd usage into account
- #1: this is a matter of adjusting the crush tunables. see http://ceph.com/docs/master/rados/operations/crush-map/?hig...
- 09:08 AM Bug #3716 (Closed): recovery should take osd usage into account
- Using argonaut 0.48.2. Yesterday one osd crashed (disk io error) and recovery started as expected. All osds had an us...
- 09:44 AM Bug #3550: mon: Ceph fails to work when IP address is changed on the host
- Joao,
thanks for the update.
Since mine came about due to a testing environment build on DHCP, I did not have the ... - 09:32 AM CephFS Bug #3681: kclient fsx fails nightly
- It's most likely all the same bug, but fsx fails in different ways each time (always because of a truncate down).  The...
- 09:27 AM CephFS Feature #3543: mds: new encoding
- right. about 80% complete, see wip-mds-encoding.
- 09:22 AM CephFS Feature #3543: mds: new encoding
- What is this task? Switching to use our versioned encoding scheme?
- 09:17 AM rbd Bug #3685: xfs test 186 fails in the nightlies
- I just disabled test 186 from the list run for the nightly
tests. It's defined in the ceph-qa-suite git repository,... - 06:39 AM Revision a32d6c5d (ceph): osd: move common active vs booting code into consume_map
- Push osdmaps to PGs in separate method from activate_map() (whose name
is becoming less and less accurate).
Signed-o... - 06:20 AM Revision 0bfad8ef (ceph): osd: let pgs process map advances before booting
- The OSD deliberately consumes and processes most OSDMaps from while it
was down before it marks itself up, as this is c... - 06:04 AM Revision 5fc94e89 (ceph): osd: drop oldest_last_clean from activate_map
- Signed-off-by: Sage Weil <sage@inktank.com>
- 06:04 AM Revision 67f7ee67 (ceph): osd: drop unused variables from activate_map
- Signed-off-by: Sage Weil <sage@inktank.com>
- 05:09 AM Revision a14a36ed (ceph): OSDMap: fix modifed -> modified typo
- Signed-off-by: Sage Weil <sage@inktank.com>
- 04:44 AM Revision 9ca69e73 (ceph): ceph: malloc check =3 means we hear on stderr too
- 03:58 AM Revision 2141454e (ceph): log: fix locking typo/stupid for dump_recent()
- We weren't locking m_flush_mutex properly, which in turn was leading to
racing threads calling dump_recent() and garb... - 02:13 AM Revision 6b5a89d2 (ceph): Merge remote-tracking branch 'gh/next'
- 01:01 AM Revision 43cba617 (ceph): log: fix locking typo/stupid for dump_recent()
- We weren't locking m_flush_mutex properly, which in turn was leading to
racing threads calling dump_recent() and garb...
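The "log: broadcast cond signals" revisions listed above describe a single condition variable on which only one waiter was being signalled. As a rough illustration of the general pattern (this is not Ceph's C++ logging code, and every name below is hypothetical), the following Python sketch has several threads wait on one shared condition and wakes all of them with a broadcast:

```python
import threading


def flush_and_wake(num_waiters):
    """Block several threads on one condition, then wake them all at once."""
    cond = threading.Condition()
    flushed = False
    woken = []

    def waiter(name):
        with cond:
            # Re-check the predicate after every wakeup to tolerate
            # spurious wakeups and a notification sent before we waited.
            while not flushed:
                cond.wait()
            woken.append(name)

    threads = [threading.Thread(target=waiter, args=(i,))
               for i in range(num_waiters)]
    for t in threads:
        t.start()

    with cond:
        flushed = True
        # notify() would wake a single waiter; notify_all() is the broadcast.
        cond.notify_all()

    for t in threads:
        t.join()
    return sorted(woken)
```

Calling `flush_and_wake(4)` returns once all four waiters have observed the state change. With a single `notify()` and waiters of different kinds sharing the condition, the one thread that wakes may not be the one that can make progress, which is the situation a broadcast avoids.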
01/02/2013
- 11:59 PM Revision 29ff87a5 (ceph): Merge branch 'master' of https://github.com/ceph/ceph
- 11:58 PM Revision 64d2760a (ceph): doc: Added a memory profiling section. Ported from the wiki.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 11:57 PM Revision 5066abf1 (ceph): doc: Added memory profiling to the index.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 11:08 PM Revision 0e9a0cd7 (ceph): qa/workunit: Update pjd script to use new tarball
- The pjd script now uses the latest version of pjd
with an additional test for opening a non-existent
file.
Signed-of... - 11:07 PM Bug #3715: Crash during 0.55 -> 0.56 upgrade
- is someone sending an MOSDOp that has no ops? init_op_flags() is called before can_*(), so this sounds like an empty...
- 10:05 PM Bug #3715 (Duplicate): Crash during 0.55 -> 0.56 upgrade
- I started upgrading my 0.55.1 cluster to 0.56 and at one point in the middle of the upgrade, all 0.55.1 OSDs started ...
- 10:38 PM Revision d8940d15 (ceph): fuse: Fix cleanup code path on init failure
- With the changes from 856f32ab, the cfuse.init call returns
a _positive_ errno, which was getting ignored. Also, if ... - 10:15 PM Revision c4370ff0 (ceph): librbd: establish watch before reading header
- This eliminates a window in which a race could occur when we have an
image open but no watch established. The previou... - 09:56 PM rbd Bug #3697: rbd copy.sh test failing in nightly
- Reproduces OK on plana cluster, indeed. This seems to point toward some sort of OSD bug where committed state isn't ...
- 09:39 AM rbd Bug #3697 (In Progress): rbd copy.sh test failing in nightly
- 09:42 PM Revision 93656013 (ceph): test_filejournal: optionally specify journal filename as an argument
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 483c6f76adf960017614a8641c4dcdbd7902ce33) - 09:42 PM Revision be0473bb (ceph): test_filejournal: test journaling bl with >IOV_MAX segments
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit c461e7fc1e34fdddd8ff8833693d067451df906b) - 09:42 PM Revision de619327 (ceph): os/FileJournal: limit size of aio submission
- Limit size of each aio submission to IOV_MAX-1 (to be safe). Take care to
only mark the last aio with the seq to sig... - 09:42 PM Revision ded454c6 (ceph): os/FileJournal: logger is optional
- Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 076b418c7f03c5c62f811fdc566e4e2b776389b7) - 09:42 PM Revision 9a1cf518 (ceph): Merge branch 'wip-journal-aio' into next
- Reviewed-by: Samuel Just <sam.just@inktank.com>
Backport: bobtail - 09:39 PM Revision dda7b651 (ceph): os/FileJournal: limit size of aio submission
- Limit size of each aio submission to IOV_MAX-1 (to be safe). Take care to
only mark the last aio with the seq to sig... - 09:39 PM Revision c461e7fc (ceph): test_filejournal: test journaling bl with >IOV_MAX segments
- Signed-off-by: Sage Weil <sage@inktank.com>
- 09:39 PM Revision 483c6f76 (ceph): test_filejournal: optionally specify journal filename as an argument
- Signed-off-by: Sage Weil <sage@inktank.com>
- 09:34 PM Bug #3714 (Resolved): osd: new peering code does not consume osdmaps prior to booting
- Previously when we handled the old osdmaps catching up (pre-MOSDBoot) we'd do advance_map and the pgs would update th...
- 08:32 PM Revision e0858fa8 (ceph): Revert "librbd: ensure header is up to date after initial read"
- Using assert version for linger ops doesn't work with retries,
since the version will change after the first send.
Th... - 08:31 PM Revision 06310994 (ceph): ceph: enable malloc debugging for ceph-osd
- 07:49 PM Revision 3686371e (ceph): rados: add test_filejournal
- This writes to /tmp by default; should be ok on plana, since it's / and not
tmpfs. - 07:24 PM Revision 82297706 (ceph): doc: Minor edits.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 07:15 PM Revision d3b9803e (ceph): doc: Fixed typo, clarified usage.
- Signed-off-by: John Wilkins <john.wilkins@inktank.com>
- 05:23 PM rbd Bug #3685: xfs test 186 fails in the nightlies
- It is possible for umount() to return EBUSY. However from
what I can tell that only occurs when the device being
u... - 02:34 PM rbd Bug #3685: xfs test 186 fails in the nightlies
- OK I've tried reproducing it manually (on a teuthology node, but
running it using a command line while in an "intera... - 12:06 PM rbd Bug #3685: xfs test 186 fails in the nightlies
- Test 184 doesn't touch the scratch device. Looks like the next
one back is 167, which exercises unwritten extent co... - 11:56 AM rbd Bug #3685: xfs test 186 fails in the nightlies
- I thought I had updated this but I have not.
Test 186 is exercising activities that at one time caused a
bug in x... - 05:15 PM Bug #3699: osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
- reproduced this on burnupi21.
- 05:00 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
- with glibc malloc and debug enabled:...
- 08:57 AM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
- another one with full osd logs:...
- 04:13 PM Documentation #3687 (Resolved): Documentation needs a "memory profiling" section
- This has been ported. I haven't added a valgrind use case yet.
- 01:20 PM Documentation #3687 (In Progress): Documentation needs a "memory profiling" section
- 03:51 PM Feature #3713 (Rejected): ceph osd tree should show disk usage
- As ceph seems to already monitor the disk usage of each osd, it'd be great to have it displayed in "ceph osd tree".
- 03:08 PM rbd Bug #3619: librbd: read_iterate sparse behavior broken
- Mitigated somewhat by sparsification efforts in rbd import/export, but still librbd
should be fixed. - 02:11 PM devops Feature #3712 (New): Ceph Commands should provide appropriate responses, when Ceph Service is not...
- When the ceph service is not running, running other ceph commands should give a response that makes sense instead of just ...
- 02:02 PM Cleanup #2078: ceph tool: only output response data to stdout
- i think we need to phase out all of the first-line nonsense.
- 01:48 PM Cleanup #2078: ceph tool: only output response data to stdout
- This also affects things like ceph pg dump --format=json. You can't pipe it to a pretty printer without ignoring the ...
- 01:52 PM Documentation #3711 (Resolved): crush-map.rst: choose firstn talks about "N", but does not clearl...
- The implication is that 'N' is "the number of buckets of type 'type' available", but Sam believes it must really be "...
- 01:40 PM Bug #3684 (Resolved): filejournal: aio vector size is not limited
- 01:34 PM rbd Feature #3456 (Closed): make exit code of ceph status commands status dependent
- 01:29 PM rbd Documentation #2992 (Resolved): doc: RBD parent/child snapshot
- 01:26 PM rbd Documentation #2992: doc: RBD parent/child snapshot
- This should be resolved.
- 01:24 PM Documentation #3710 (Closed): crush-map.rst: talks about 'step choose' but does not document it
- 01:23 PM Documentation #3411 (Resolved): doc: add introductory detail to the main doc page (index.rst)
- 01:21 PM rgw Feature #3207 (In Progress): qa: swift functional tests in nightly
- 01:21 PM rgw Feature #3366 (In Progress): rgw: dr: define management api
- 01:18 PM Documentation #2980 (Resolved): doc: write upgrading Ceph version
- This was checked in and also reviewed by Josh and Sage.
- 01:16 PM Documentation #3322 (Resolved): doc: Explain multi-tenant CephFS
- This has been added to the end of the Ceph Configuration file section. It may benefit from review, as I believe the...
- 01:12 PM Feature #647 (Duplicate): mon: refactor paxos interaction
- 01:11 PM Feature #183 (Resolved): qa: xfstests workunit
- 01:10 PM Documentation #3709 (Resolved): crush-map.rst: claims 'types' are default, not true (must be spec...
- crush-map.rst claims that the bucket type defaults are as appear in the table, but they're
not defaults; they must b... - 01:09 PM Feature #3376 (Duplicate): use external leveldb package for default builds
- 01:08 PM Documentation #3707 (Resolved): crush-map.rst: syntax error in example
- example includes:
item ceph-osd-server-1 2.00
this must have 'weight' explicitly in the line:
... - 01:03 PM Feature #3425 (Resolved): mon workload generator
- 12:39 PM Bug #3702: OSD SIGABRT during startup
- Attempting to start osd.131 (which was down due to the above noted problems) today resulted in quorum loss. Essential...
- 12:03 PM rgw Bug #3706 (Resolved): rgw functional test testSlashInName failed in nightly
- logs: ubuntu@teuthology:/a/teuthology-2013-01-01_19:00:03-regression-next-testing-basic/33224...
- 11:25 AM Revision a79493da (ceph): mds: skip frozen inode when assimilating dirty inodes' rstat
- CDir::assimilate_dirty_rstat_inodes() may encounter frozen inodes that
are being renamed. Skip these frozen inodes be... - 11:25 AM Revision 2f96b472 (ceph): mds: fix anchor table commit race
- Anchor table updates for a given inode are fully serialized on the client side.
But due to network latency, two commit req... - 11:25 AM Revision 7e04504d (ceph): mds: fix on-going two phrase commits tracking
- The slaves for two phrase commit should be mdr->more()->witnessed
instead of mdr->more()->slaves. mdr->more()->slaves... - 11:25 AM Revision b3796f46 (ceph): mds: indroduce DROPLOCKS slave request
- In some rare cases, Locker::acquire_locks() drops all acquired locks
in order to auth pin new objects. But Locker::dro... - 11:25 AM Revision b2d5005a (ceph): mds: fix lock state transition check
- Locker::simple_excl() and Locker::scatter_mix() miss is_rdlocked
check; Locker::file_excl() miss is_rdlocked check an... - 11:25 AM Revision fe5936b1 (ceph): mds: remove unnecessary is_xlocked check
- Locker::foo_eval() is always called for stable locks, so no need to
check if the lock is xlocked.
Signed-off-by: Yan... - 11:25 AM Revision f5ea5c36 (ceph): mds: don't defer processing caps if inode is auth pinned
- We should not defer processing caps if the inode is auth pinned by MDRequest,
because the MDRequest may change lock s... - 11:25 AM Revision 5e8642a8 (ceph): mds: call maybe_eval_stray after removing a replica dentry
- MDCache::handle_cache_expire() processes dentries after inodes, so the
MDCache::maybe_eval_stray() in MDCache::inode_... - 11:25 AM Revision 84224743 (ceph): mds: fix rename inode exportor check
- Use "srcdn->is_auth() && destdnl->is_primary()" to check if the MDS is
inode exportor of rename operation is not reli... - 11:25 AM Revision 26279574 (ceph): mds: don't trigger assertion when discover races with rename
- Discover reply that adds replica dentry and inode can race with rename
if slave request for rename sends discover and... - 11:25 AM Revision 5ae715be (ceph): mds: xlock stray dentry when handling rename or unlink
- This prevents MDS from reintegrating stray before rename/unlink finishes
Signed-off-by: Yan, Zheng <zheng.z.yan@inte... - 11:25 AM Revision 7a520168 (ceph): mds: don't journal null dentry for overwrited remote linkage
- Server::_rename_prepare() adds null dest dentry to the EMetaBlob if
the rename operation overwrites a remote linkage.... - 11:25 AM Revision fcb9f988 (ceph): mds: use null dentry to find old parent of renamed directory
- When replaying a directory rename operation, the MDS needs to find the old parent of
the renamed directory to adjust auth sub... - 11:25 AM Revision d9d71473 (ceph): mds: don't trim ambiguous imports in MDCache::trim_non_auth_subtree
- Trimming ambiguous imports in MDCache::trim_non_auth_subtree() confuses
MDCache::disambiguate_imports() and causes in... - 11:25 AM Revision 3b13d3dc (ceph): mds: only export directory fragments in stray to their auth MDS
- Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
- 11:25 AM Revision 61da9b18 (ceph): mds: mark rename inode as ambiguous auth on all involved MDS
- When handling cross authority rename, the master first sends OP_RENAMEPREP
slave requests to witness MDS, then sends ... - 11:09 AM Linux kernel client Bug #2764 (Closed): xfstest hang; osd socket closed messages
- The fix for the warning messages is:
28362986f8743124b3a0fda20a8ed3e80309cce1
libceph: report connection ... - 10:54 AM Bug #3698: filestore: ENOENT on clone
- recent log: ubuntu@teuthology:/a/teuthology-2013-01-01_19:00:03-regression-next-testing-basic/33152
- 09:45 AM CephFS Bug #3700: mds: FAILED assert(!item_session_list.is_on_list())
- fixed by revert of bad fix, see commit:6711a4c4038dbdf843f9dfe42c7809c5c37ae534
- 09:37 AM CephFS Bug #3700 (Resolved): mds: FAILED assert(!item_session_list.is_on_list())
- 09:41 AM rbd Bug #3692 (Won't Fix): OSD's abort with "./common/Mutex.h: 89: FAILED assert(nlock == 0)"
- This is a known problem with argonaut, but the fix is a rewrite of the whole module and we've chosen not to backport ...
- 09:09 AM Bug #3705 (Resolved): osd: crash in scrub finalize [argonaut]
- ...
- 08:28 AM Feature #3704 (Resolved): mon: add min log level to send cluster msgs to syslog
- e.g., WARN and above only, but not INFO. This is for the mon/LogMonitor.cc submission path, not log/Log.cc (for debu...
- 05:55 AM Revision e10267b5 (ceph): mds: fix Locker::simple_eval()
- Locker::simple_eval() checks if the loner wants CEPH_CAP_GEXCL to
decide if it should change the lock to EXCL state, ... - 05:54 AM Revision 7e23321b (ceph): mds: don't renew revoking lease
- MDS may receive a lease renew request while the lease is being revoked,
just ignore the renew request.
Signed-off-by: Yan...
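The "os/FileJournal: limit size of aio submission" revisions listed above cap each aio submission at IOV_MAX-1 segments and mark only the last aio with the seq. A minimal Python sketch of that batching idea follows; it is not the FileJournal implementation, the function name and dictionary layout are hypothetical, and the default of 1024 for IOV_MAX is an assumption (a common Linux value) rather than something stated in the commit:

```python
def split_for_aio(segments, seq, iov_max=1024):
    """Split a list of buffer segments into batches of at most iov_max - 1.

    Only the final batch carries the sequence number; earlier batches
    carry None, so completion of the whole write is tied to the last one.
    """
    limit = iov_max - 1  # stay one below the limit, "to be safe"
    if limit < 1:
        raise ValueError("iov_max must be at least 2")

    batches = []
    for start in range(0, len(segments), limit):
        batches.append({"iov": segments[start:start + limit], "seq": None})

    if batches:
        batches[-1]["seq"] = seq
    return batches
```

For example, `split_for_aio(list(range(10)), seq=7, iov_max=5)` yields three batches of 4, 4 and 2 segments, and only the third has `seq` set to 7.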
01/01/2013
- 06:36 PM Revision eb02eaed (ceph): Merge remote-tracking branch 'gh/wip-bobtail-docs'
- 05:35 AM Revision f1196c7e (ceph): Merge branch 'master' of https://github.com/ceph/ceph
- 05:31 AM Revision 5dd6b199 (ceph): Merge branch 'next'
- 02:37 AM Revision 8f77ec7d (ceph): Merge branch 'next'
- 02:36 AM Revision 94a5dd6b (ceph): Merge remote-tracking branch 'gh/wip-3675'
- Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
- 01:10 AM Revision 1a32f0a0 (ceph): v0.56
12/31/2012
- 11:28 PM Revision 49ebe1ee (ceph): client: fix _create created ino condition
- We get 8 bytes back for the created ino.
Signed-off-by: Sage Weil <sage@inktank.com> - 11:26 PM Revision a10054bc (ceph): libcephfs: choose more unique nonce
- We were using a per-process counter combined with the pid. A short
running process can easily loop through and reuse... - 11:26 PM Revision e2fef38d (ceph): client: fix _create
- make_request() clears out req->reply and frees req; we can't inspect
it here.
Instead, just assume that extra_bl is t... - 06:35 PM rbd Bug #3697: rbd copy.sh test failing in nightly
- FWIW I ran this in a loop and reproduced it after 7 iterations (well, a slightly different error actually, when it re...
- 05:42 PM rbd Bug #3697 (Can't reproduce): rbd copy.sh test failing in nightly
- 05:08 PM rbd Bug #3697: rbd copy.sh test failing in nightly
- Hm, doesn't reproduce on local vstart cluster. Pondering possible failure modes.
- 04:23 PM rbd Bug #3697: rbd copy.sh test failing in nightly
- Trying to reproduce now
- 06:17 PM Revision 7d70dd11 (ceph): Revert "kernel: move fsync test to marginal suite until it works"
- This reverts commit acb91f7d0d4882d7393a99b142aec8687b9b4bb7.
Now fixed in master branch, commit b4d3bd06d4083d78075... - 06:16 PM Revision b4d3bd06 (ceph): Merge remote-tracking branch 'gh/wip-3625'
- 05:38 PM rbd Bug #3703: osd: crash while encrypting
- This is an osd crash....
- 02:55 PM rbd Bug #3703 (Can't reproduce): osd: crash while encrypting
- logs: ubuntu@teuthology:/a/teuthology-2012-12-30_19:00:03-regression-next-testing-basic/32113...
- 04:11 PM Revision ed586c1b (ceph): task: ceph: don't wait for 'healthy' if 'wait-for-healthy' is false.
- This new config option obviously defaults to 'true' in order to not only
maintain compatibility, but because it makes... - 02:58 PM Bug #3699: osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
- bringing back the marked out osd.1 in on burnupi06 while running the io hit the following,
2012-12-31 14:26:26.6... - 02:30 PM Messengers Feature #3509 (Resolved): msgr: delay injection
- 10:18 AM Bug #3689 (Resolved): osd: bad peering state machine event with mixed v0.52 and next cluster
- 09:06 AM Bug #3702 (Can't reproduce): OSD SIGABRT during startup
- After conversion of OSDs from btrfs to XFS, some OSDs SIGABRT during their first startup on XFS:
2012-12-29 05:0... - 08:55 AM Bug #3683: mon: leak of MMonPaxos
- recent logs: ubuntu@teuthology:/a/teuthology-2012-12-29_19:00:03-regression-next-testing-basic/31414
- 08:37 AM rbd Bug #3701 (Can't reproduce): qemu xfstest hung BUG: unable to handle kernel NULL pointer derefere...
- logs: ubuntu@teuthology:/a/teuthology-2012-12-30_03:00:06-regression-master-testing-gcov/31929...