Activity

From 06/10/2013 to 07/09/2013

07/09/2013

10:12 PM Bug #5581: fix big suite failures
current run: osd.2 has stale pgs, flush_pg_stats hung. vapre a.yaml.out. didn't have logging on. Sage Weil
03:22 PM Bug #5581 (Resolved): fix big suite failures
Sage Weil
09:09 PM devops Bug #5345: ceph-disk: handle less common device names
Jing Yuan Luke wrote:
> Hi Sage,
>
> I had the following error:
>
> root@yyy:~/ceph-configure# ceph-deploy -v ...
Sage Weil
07:03 PM devops Bug #5345: ceph-disk: handle less common device names
Hi Sage,
I had the following error:
root@yyy:~/ceph-configure# ceph-deploy -v osd prepare xxx:cciss/c0d1
Prepa...
Jing Yuan Luke
09:05 PM Bug #5518 (Fix Under Review): osd: marking single osd down makes others go down (cuttlefish)
Sage Weil
05:58 PM Bug #5518: osd: marking single osd down makes others go down (cuttlefish)
... Sage Weil
05:04 PM Bug #5517: osd: stuck peering on cuttlefish
slider.ops.newdream.net:~samuelj/log_files/ Samuel Just
03:23 PM Bug #5517: osd: stuck peering on cuttlefish
osd.227 queries log on osd.21:
2013-07-08 11:49:10.723242 7f4afac64700 1 -- 10.81.144.107:6800/13176 --> 10.81.158....
Samuel Just
04:32 PM rgw Bug #5522 (Resolved): rgw: use of select for waiting on curl sockets
We now use curl_multi_wait() when available. When it's not available we force a timeout to the select() so that we do... Yehuda Sadeh
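The fallback described in this entry, a select() call that is given a timeout so it cannot block indefinitely, can be illustrated with a minimal sketch. This is not the rgw code: it uses Python's `select` module as a stand-in for select(2), and the helper name `wait_readable` and the 0.05 second timeout are assumptions made for illustration.

```python
import select
import socket


def wait_readable(socks, timeout_s):
    """Return the sockets that are readable, waiting at most timeout_s seconds.

    Passing a timeout means the call returns an empty list when nothing
    becomes ready, instead of blocking forever.
    """
    readable, _, _ = select.select(socks, [], [], timeout_s)
    return readable


# A connected pair of sockets stands in for the curl-managed sockets.
a, b = socket.socketpair()

idle = wait_readable([a], 0.05)   # nothing has been written yet
b.send(b"x")
ready = wait_readable([a], 0.05)  # one byte is now pending on `a`

a.close()
b.close()
```

With a timeout in place the caller regains control periodically even if no socket ever becomes ready, which is the property the entry relies on when curl_multi_wait() is unavailable.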
04:30 PM Feature #4983 (Resolved): OSD: namespaces pt 2 (caps)
David Zafman
12:39 PM Feature #4983: OSD: namespaces pt 2 (caps)
David Zafman
04:29 PM Feature #4982 (Resolved): OSD: namespaces pt 1 (librados/osd, not caps)
David Zafman
04:21 PM rgw Bug #5348 (Resolved): rgw: missing copy constraints checks for inter region user object copy
Yehuda Sadeh
04:21 PM rgw Bug #5344 (Resolved): rgw: make list of bucket placement pools index configurable
Yehuda Sadeh
04:19 PM rgw Feature #2169 (Resolved): rgw: api to control bucket placement
Yehuda Sadeh
04:19 PM rgw Feature #2169: rgw: api to control bucket placement
This is solved in Dumpling. It will be possible to set up placement targets for each region. At the zone level these ... Yehuda Sadeh
04:13 PM rgw Bug #5357 (Resolved): rgw: set and retrieve intra-region copy operation state
Yehuda Sadeh
04:09 PM rgw Feature #3991 (Resolved): rgw: dr: region mgt changes: define datastructures
Yehuda Sadeh
04:09 PM rgw Feature #3990 (Resolved): rgw: dr: implement new version objclass
Yehuda Sadeh
04:09 PM rgw Feature #3989 (Resolved): rgw: dr: region mgt changes: radosgw admin changes
Yehuda Sadeh
04:09 PM rgw Feature #3988 (Resolved): rgw: dr: region mgt changes: define/implement internal API
Yehuda Sadeh
04:09 PM rgw Feature #3987 (Resolved): rgw: dr: region mgt changes: extend json parser with json decoder
Yehuda Sadeh
04:08 PM rgw Feature #5134 (Resolved): rgw: RESTful api for datalog
Yehuda Sadeh
04:08 PM rgw Feature #5133 (Resolved): rgw: RESTful api to lock/unlock mdlog
Yehuda Sadeh
04:05 PM devops Feature #5091 (Resolved): google-perftools for arm
Sage Weil
03:49 PM rgw Tasks #5586 (Resolved): rgw: build test plan
Sage Weil
03:38 PM rgw Feature #5341 (Resolved): rgw: keep state for cross-rgw copy operations
Ian Colle
03:38 PM rgw Feature #5352 (Resolved): rgw: metadata get should also dump mtime
Sage Weil
03:38 PM rgw Feature #5349 (Resolved): rgw: intra-region object copy
Sage Weil
03:38 PM rgw Documentation #5166: rgw: dr: async repl and DR documentation
Neil Levine
03:38 PM rgw Feature #4310 (Resolved): rgw: multisite: radosgw changes: copy across regions
Ian Colle
03:38 PM rgw Feature #5354 (Resolved): rgw: intra-region object copy should also set mtime on object
Sage Weil
03:38 PM rgw Feature #5353 (Resolved): rgw: metadata put should apply mtime if set
Sage Weil
03:38 PM rgw Feature #4098 (Resolved): rgw: multi-site: Global Bucket Namespace
Ian Colle
03:38 PM rgw Feature #4334 (Resolved): rgw: dr: bucket index log API: implement RESTful API
Ian Colle
03:38 PM rgw Feature #4333 (Resolved): rgw: multisite: metadata-changes log: implement RESTful API
Ian Colle
03:38 PM rgw Feature #4329 (Resolved): rgw: dr: updated buckets log: RESTful API
Ian Colle
03:38 PM rgw Feature #5008 (Resolved): rgw: bucket metadata changes should be reflected in mdlog
Ian Colle
03:38 PM rgw Feature #4745 (Resolved): rgw: radosgw-admin command to stat object
Ian Colle
03:37 PM rgw Feature #5358 (Resolved): rgw: RESTful api for intra-region copy state
Sage Weil
03:37 PM rgw Feature #4613 (Resolved): Allow bucket data to reside in a separate pool to object data
Sage Weil
03:37 PM rgw Feature #4330 (Resolved): rgw: dr: updated buckets log: radosgw-admin changes
Ian Colle
03:37 PM rgw Feature #5417 (Resolved): rgw: separate bucket metadata object into pointer object and instance o...
Sage Weil
03:37 PM rgw Feature #4328 (Resolved): rgw: dr: updated buckets log: tie into internal bucket changes tracker
Ian Colle
03:37 PM rgw Feature #4327 (Resolved): rgw: dr: updated buckets log: create internal API
Ian Colle
03:37 PM rgw Feature #4346 (Resolved): rgw: dr: bucket index objclass: changes
Ian Colle
03:36 PM rgw Feature #4336 (Resolved): rgw: dr: sync processing state: implement internal RESTful API
Sage Weil
03:35 PM rgw Feature #5406 (Resolved): rgw: a RESTful api to dump region map
Sage Weil
03:35 PM rgw Feature #5408 (Resolved): rgw: turn off dr/geo logging
Sage Weil
03:35 PM rgw Feature #4347 (Resolved): rgw: dr: bucket index objclass: fetch changes log
Ian Colle
03:35 PM rgw Cleanup #5558 (In Progress): rgw: modify certain radosgw-admin operations interface
Sage Weil
10:23 AM rgw Cleanup #5558 (Resolved): rgw: modify certain radosgw-admin operations interface
We need to be more consistent with operations that read, write, list, and remove data:
region info ...
Yehuda Sadeh
03:34 PM rgw Feature #4309 (Resolved): rgw: multisite: metadata objects versioning
Ian Colle
03:34 PM rgw Feature #4311 (Resolved): rgw: dr: radosgw changes: internal bucket changes tracker
Sage Weil
03:34 PM rgw Feature #4331 (Resolved): rgw: multisite: metadata-changes log: create internal API
Sage Weil
03:34 PM rgw Feature #4312 (Resolved): rgw: multisite: log metadata changes
Ian Colle
03:34 PM rgw Feature #4332 (Resolved): rgw: multisite: metadata-changes log: tie into metadata update operations
Sage Weil
03:31 PM Tasks #5585 (Resolved): test large scale expansion and contraction
Sage Weil
03:30 PM Tasks #5584 (Resolved): measure peering performance
Sage Weil
03:29 PM Tasks #5583 (Resolved): populate test cluster with rgw data
Sage Weil
03:29 PM Tasks #5582 (Resolved): create large test cluster on burnupi and/or mira
Sage Weil
03:18 PM Bug #5507: osd: ENOENT on clone
Sage Weil
03:17 PM Messengers Bug #5508 (Need More Info): msg/SimpleMessenger.cc: 230: FAILED assert(!cleared)
Sage Weil
03:17 PM Bug #5519 (Pending Backport): mon/osd: trimming of old maps based on last_epoch_clean is broken b...
Sage Weil
03:14 PM Feature #3984 (Resolved): api: Send Out DRAFT REST API for Review
Sage Weil
03:14 PM Feature #3983 (Resolved): api: create initial DRAFT REST API Design
Sage Weil
03:12 PM Feature #3273: mon: simple dm-crypt key management
Piston expressed interest in hooking this up to a PKI system they are looking at. Waiting on details from them. Neil Levine
03:11 PM Feature #5421 (In Progress): mon: add formatter option for various mon commands
Sage Weil
03:10 PM Fix #5278 (Resolved): osd: smarter recovery for small objects
Sage Weil
01:51 PM Bug #4599: ceph auth import -i <file> is broken
Did it ever work from stdin? I don't see any code in the original tool that would
do that...
Dan Mick
01:40 PM Bug #5526 (Resolved): ceph health detail is broken for formatted output
LGTM. Dan Mick
09:32 AM Bug #5526 (Fix Under Review): ceph health detail is broken for formatted output
wip-rest commit commit:f810626a857dc34ff23d823d1b700488ff1798e8 Joao Eduardo Luis
12:32 PM Bug #5492 (In Progress): scripts installing into /usr/usr/sbin (with --prefix=/usr)
This fix was temporarily reverted. The ceph build process for debian packages relies on binaries being found in /usr... Anonymous
12:05 PM Bug #5524 (Resolved): df shows incorrect disk usage/size for cephfs mount
Sage Weil
12:03 PM Bug #5524: df shows incorrect disk usage/size for cephfs mount
I am using '3.8.0-19-generic' on Ubuntu 13.04.
If this was fixed in 3.9 then it is not really an issue going for...
Shain Miley
09:35 AM Bug #5524: df shows incorrect disk usage/size for cephfs mount
what kernel version?
this was fixed in 3.9 or so
Sage Weil
12:04 PM Bug #5392: osd: unfound objects from thrashing
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-07-08_20:00:15-rados-cuttlefish-testing-basic/59715
...
Sage Weil
11:47 AM Bug #5557 (Duplicate): /dev/disk/by-path not generated for ATA disks in later versions of udev
the wip-ceph-disk no longer uses the by-path directory at all... see #5354 Sage Weil
09:50 AM Bug #5557 (Duplicate): /dev/disk/by-path not generated for ATA disks in later versions of udev
The version of udev in Ubuntu Saucy and higher does not generate /dev/disk/by-path entries for ATA disks.
This ude...
James Page
11:25 AM devops Bug #4924: ceph-deploy: gatherkeys fails on raring (cuttlefish)
Actually, this looks like it is being caused by the monitor crashing, and ceph-create-keys not being able to connect ... Noah Watkins
11:19 AM devops Bug #4924: ceph-deploy: gatherkeys fails on raring (cuttlefish)
I am seeing this same problem. I am using the latest master version of ceph-deploy, and the target node is Ubuntu 12.... Noah Watkins
09:37 AM CephFS Feature #5520: osdc: should handle namespaces
This is needed before librbd or cephfs can use namespaces Ian Colle
09:14 AM rgw Bug #5478 (Resolved): rgw: fix init script for rgw daemon name
Sage Weil
12:46 AM rgw Bug #5478: rgw: fix init script for rgw daemon name
Should be closed as http://tracker.ceph.com/projects/ceph/repository/revisions/8b4cb8f37266183fe4e1925d07e50703e520f2... Christophe Courtaut
08:13 AM Bug #5528 (Resolved): Python bindings Object.get_xattrs() requires unnecessary xattr_name
get_xattrs() returns all xattrs for an object, but Object.get_xattrs() requires an xattr_name. This differs from Ioct... Johannes Erdfelt
05:41 AM Subtask #5510 (Fix Under Review): ObjectContext : replace ref with shared_ptr
Loïc Dachary
12:47 AM Subtask #5510 (In Progress): ObjectContext : replace ref with shared_ptr
Loïc Dachary
05:22 AM rgw Feature #3074: radosgw needs --help support
This issue should be closed. Christophe Courtaut
12:49 AM rgw Bug #5374: Avoid relying on keystone's admin token
The patch above needs review.
Please change ticket status.
Christophe Courtaut
12:47 AM rgw Bug #1779: rgw: swift auth returns wrong error code when unexisting user is given
I need some advice here: does this need a backport, and if so, where? Christophe Courtaut
12:10 AM Subtask #5527 (Resolved): unit tests for common/sharedptr_registry.hpp
"work in progress":https://github.com/dachary/ceph/tree/wip-5527
Loïc Dachary

07/08/2013

11:52 PM Bug #5526 (Resolved): ceph health detail is broken for formatted output
ceph health detail appends plain-text detail output whether or not a formatter is requested
Dan Mick
09:16 PM Bug #5515 (Pending Backport): mon: paxos allows reads before initial commit is done
Sage Weil
10:48 AM Bug #5515 (Fix Under Review): mon: paxos allows reads before initial commit is done
Sage Weil
09:31 AM Bug #5515 (In Progress): mon: paxos allows reads before initial commit is done
Sage Weil
09:14 AM Bug #5515 (Resolved): mon: paxos allows reads before initial commit is done
ubuntu@teuthology:/a/teuthology-2013-07-07_20:00:17-rados-cuttlefish-testing-basic/58476
got an auth failure befor...
Sage Weil
06:39 PM rgw Documentation #5525 (Resolved): Radosgw 'add the ceph keyring entries' section should be updated ...
When using ceph-deploy a key file named 'ceph.client.admin.keyring ' is generated.
When attempting to follow the s...
Shain Miley
06:27 PM Bug #5524 (Resolved): df shows incorrect disk usage/size for cephfs mount
After mounting a cephfs volume (ceph version 0.61.4) and filling it with 11GB of .wav files I used 'df' to list the m... Shain Miley
05:55 PM CephFS Bug #5036: `ls` hangs on random folder
Was your mds compiled from the newest source code? Was there an mds restart before you saw the hang? If there was, the b... Zheng Yan
01:03 PM CephFS Bug #5036: `ls` hangs on random folder
Yan,
Even after rebuilding my 3.10 kernel with the missing fix (libceph: call r_unsafe_callback when unsafe reply ...
Milosz Tanski
05:40 PM devops Feature #5523 (Resolved): libcurl 7.28+ packages
We need to provide libcurl 7.28 or newer packages for relevant architectures that don't have it. It's needed for rgw ... Yehuda Sadeh
04:49 PM devops Bug #5369 (Resolved): fedora18: sysvinit doesn't start mon on reboot
Doing a chkconfig --list wasn't listing the network manager so it threw me off a bit but it was indeed enabled. I tri... Sandon Van Ness
04:23 PM rgw Bug #5522: rgw: use of select for waiting on curl sockets
Solution that we discussed is to modify code to use curl_multi_wait() and to provide backported packages for the rele... Yehuda Sadeh
04:22 PM rgw Bug #5522 (Resolved): rgw: use of select for waiting on curl sockets
We should use curl_multi_wait() instead. The main issue is that this function is only available in more recent libcurl packa... Yehuda Sadeh
03:32 PM Feature #5521 (Duplicate): Enhance PGLS or new op to list all namespace/objects in a pool.

Sage pull request comment:
allow a 'any namespace' flag for PGLS.. maybe a different op code? (PGLSALL, or PGLS_...
David Zafman
03:10 PM Bug #5512 (Pending Backport): mon: missing full osdmaps after sync
Sage Weil
02:52 PM Bug #5492: scripts installing into /usr/usr/sbin (with --prefix=/usr)
the proposed fix breaks the deb builds Sage Weil
10:58 AM Bug #5492 (Resolved): scripts installing into /usr/usr/sbin (with --prefix=/usr)
Sage Weil
02:47 PM CephFS Feature #5520 (Rejected): osdc: should handle namespaces

As a follow on to 4982/4983 we should implement namespace handling in the ObjectCacher.
David Zafman
01:36 PM Bug #5519 (Fix Under Review): mon/osd: trimming of old maps based on last_epoch_clean is broken b...
Sage Weil
01:00 PM Bug #5519 (Resolved): mon/osd: trimming of old maps based on last_epoch_clean is broken by design
consider a cluster with one pg whose mapping does not change for 10000 epochs, while other pgs recover and then go cl... Sage Weil
12:21 PM Bug #5518 (Resolved): osd: marking single osd down makes others go down (cuttlefish)
Settings:
- paxos propose interval = 1
- debug ms = 1
- debug osd = 20
Log:
07:57: Cluster health OK.
07...
Sage Weil
12:18 PM Bug #5517 (Resolved): osd: stuck peering on cuttlefish
Settings:
- paxos propose interval = 1
- debug ms = 1
- debug osd = 20
- debug mon = 20
Log:
11:45: star...
Sage Weil
11:33 AM devops Bug #5345: ceph-disk: handle less common device names
Hi Luke, Tomas,
Are you able to test the latest version in this branch? https://raw.github.com/ceph/ceph/wip-ceph...
Sage Weil
11:20 AM rgw Bug #5516 (Resolved): rgw: update bucket relink teuthology test
In teuthology/task/radosgw-admin.py there is a test for relinking a bucket from one user to another. Here is the code... Anonymous
09:33 AM Messengers Bug #5508: msg/SimpleMessenger.cc: 230: FAILED assert(!cleared)
Sage Weil
09:32 AM Bug #5509 (Resolved): mon/Monitor.cc: 1395: FAILED assert(latest_monmap.epoch > 0)
Sage Weil
09:02 AM RADOS Bug #5514: mon: can get stuck in tight loop with old rotating keys
Isn't this going to clear itself up as soon as the mon generates new keys anyway?
Besides only being visible to peop...
Greg Farnum
08:58 AM RADOS Bug #5514 (New): mon: can get stuck in tight loop with old rotating keys
osd sees rotating keys are out of date, requests new ones
mon returns out of date keys
osd loops
this happened t...
Sage Weil

07/07/2013

08:40 PM Bug #5482: cephx: verify_reply coudln't decrypt with error: error decoding block for decryption
The kernel version doesn't matter if you are just running the ceph userspace.
If you are mounting cephfs via the k...
Sage Weil
07:42 PM Bug #5482: cephx: verify_reply coudln't decrypt with error: error decoding block for decryption
Sage Weil wrote:
> I actually meant 0.56.6, but you can move to 0.61.x (cuttlefish) too. See ceph.com/docs/master i...
chen atrmat
09:03 AM Bug #5513 (Can't reproduce): osd crashes consistently after startup
I can't get one of my OSDs to start up, it gives the following log output (with debug osd = 20): https://pastee.org/u... Matthew Via
08:36 AM Bug #5512 (Resolved): mon: missing full osdmaps after sync
full osdmaps are occasionally part of paxos transactions. if one of these overlaps with a sync, the synced monitor w... Sage Weil
06:03 AM Subtask #5510 (Fix Under Review): ObjectContext : replace ref with shared_ptr
Loïc Dachary

07/06/2013

07:43 AM Linux kernel client Feature #190: krbd: DISCARD support
This seems like a pretty important feature to me. Without this feature the storage necessary for a highly active (b... Kyle Tarplee
07:29 AM Feature #5511 (Duplicate): rados.py support for object locking
It seems like an easy task to expose to the python bindings the following functions:
rados_lock_exclusive()
rados_l...
Kyle Tarplee
05:27 AM Subtask #5510 (Resolved): ObjectContext : replace ref with shared_ptr
"work in progress":https://github.com/dachary/ceph/tree/wip-5510
"take 1":https://github.com/dachary/ceph/commit/a50...
Loïc Dachary

07/05/2013

04:21 PM Messengers Bug #5508 (In Progress): msg/SimpleMessenger.cc: 230: FAILED assert(!cleared)
Sage Weil
11:26 AM Messengers Bug #5508 (Resolved): msg/SimpleMessenger.cc: 230: FAILED assert(!cleared)
... Sage Weil
04:04 PM Bug #5509 (Fix Under Review): mon/Monitor.cc: 1395: FAILED assert(latest_monmap.epoch > 0)
Sage Weil
03:45 PM Bug #5509 (Resolved): mon/Monitor.cc: 1395: FAILED assert(latest_monmap.epoch > 0)
... Sage Weil
11:47 AM Bug #5502 (Fix Under Review): mon: long-running sync will restart (cuttlefish)
Sage Weil
11:40 AM Bug #5495 (Can't reproduce): ceph-mon and minus character in hostname
Sage Weil
08:40 AM Bug #5507 (Resolved): osd: ENOENT on clone
... Sage Weil
08:39 AM Bug #5445: random osd EPERM on journal
teuthology-2013-07-05_01:00:13-rados-master-testing-basic 55351: and 55360: Sage Weil
01:58 AM Bug #5504: osd stack on peereng for a long time
ceph version 0.56.6 (95a0bda7f007a33b0dc7adf4b330778fa1e5d70c)
6 mons at 6 servers
156 osds: 156 up, 156 in
648...
Dominik Mostowiec

07/04/2013

11:55 PM rgw Feature #5506 (Resolved): rgw: use Keystone to authenticate S3 requests
The idea is that there should be an alternative way to authenticate S3 requests. Currently we handle the S3 authentic... Yehuda Sadeh
09:31 PM Bug #5497: ceph features mis-reported
I'm pretty sure that a pre-fix OSD is going to report that it supports all features to a post-fix OSD — right? That's... Greg Farnum
08:57 PM Bug #5497: ceph features mis-reported
the new command syntax (json instead of vector<string>), and new MonCap encoding. the old protocol won't ever see a >... Sage Weil
07:49 PM Bug #5497: ceph features mis-reported
What was the mon protocol change covering (I missed it going in), and don't we need to preserve compatibility across ... Greg Farnum
07:52 PM RADOS Feature #5280: osd/client: messages should be tagged with the earliest sane map
Couldn't we pretty easily on the client side just keep track of the last map which changed the osd states or the pg_t... Greg Farnum
11:54 AM Bug #5505 (Resolved): mon health does not reflect slow requests
Sage Weil
06:56 AM rgw Bug #5416: --help output needs --rgw-zone option
I must be blind because I can't find it :-) Would you be so kind as to paste the github URL where it can be found ? Loïc Dachary
05:59 AM Bug #5504 (Duplicate): osd stack on peereng for a long time
PGs from one osd sometimes get stuck peering for a long time.
--
ceph health details
HEALTH_WARN 3 pgs peering; ...
Dominik Mostowiec
02:23 AM Bug #5495: ceph-mon and minus character in hostname
Sage Weil wrote:
> what version is this?
These are the official 0.61.4 packages from ceph.com
> can you strace ...
Robert Sander
12:00 AM Bug #5401: cuttlefish osd recovery slow
OK reverting them to the defaults seems to work fine too. Stefan Priebe

07/03/2013

11:49 PM Bug #5401: cuttlefish osd recovery slow
Sorry, solution is the wrong word - minimize the effect (it's still there but not as much as before). Stefan Priebe
11:47 PM Bug #5401: cuttlefish osd recovery slow
Nobody - i was just helpless and tried to find a solution for me.
Stefan Priebe
03:35 PM Bug #5401: cuttlefish osd recovery slow
What prompted you to set 'filestore min sync interval' and 'filestore max sync interval'? The defaults are 0.01 and ... Samuel Just
01:33 AM Bug #5401: cuttlefish osd recovery slow
Sage - any news / ideas? Stefan Priebe
10:42 PM Bug #5503 (Resolved): osd: ceph --admin-daemon interface doesn't handle spaces in names

The simple parsing mechanism string_to_vec() uses spaces to separate arguments. There is no processing of quotes. ...
David Zafman
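The parsing problem reported in #5503 can be sketched as follows. This is an illustration only, not the Ceph `string_to_vec()` code (which is C++): `naive_split` is a hypothetical stand-in that splits on spaces with no quote handling, the command string is made up, and Python's `shlex.split` is shown as one example of quote-aware splitting rather than as the fix that was adopted.

```python
import shlex


def naive_split(line):
    """Split on single spaces only; quotes receive no special treatment."""
    return line.split(" ")


# Hypothetical admin-socket command whose last argument contains a space.
cmd = 'config set name "my value"'

broken = naive_split(cmd)   # the quoted argument is torn in two
quoted = shlex.split(cmd)   # quotes group "my value" into one argument
```

Here `broken` ends with the two fragments `'"my'` and `'value"'`, while `quoted` ends with the single argument `'my value'`.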
10:36 PM Feature #4982: OSD: namespaces pt 1 (librados/osd, not caps)
David Zafman
10:13 PM devops Bug #5211 (Resolved): ceph-disk prepare: list_partitions() shouldn't return disks
Sage Weil
10:12 PM Bug #5330 (Resolved): ceph daemon <name> ... broken
commit:88f73c5a6363eebe5d51d2636bc1328fec40dd2c Sage Weil
10:11 PM CephFS Bug #4850 (Resolved): ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
i think this is resolved now... Sage Weil
10:02 PM RADOS Feature #5280: osd/client: messages should be tagged with the earliest sane map
i started coding this up for the objecter but realized it doesn't really help.. when we initiate a io we calculate th... Sage Weil
09:58 PM Feature #5137 (Resolved): osd: magically fall back to leveldb for xattrs
Sage Weil
09:58 PM Feature #4983 (Fix Under Review): OSD: namespaces pt 2 (caps)
Sage Weil
09:56 PM CephFS Bug #5453 (Resolved): kclient: multiple_rsync tee output partially zeroed
thanks, added this to the test suite Sage Weil
06:32 PM CephFS Bug #5453: kclient: multiple_rsync tee output partially zeroed
my reproducer... Zheng Yan
10:31 AM CephFS Bug #5453: kclient: multiple_rsync tee output partially zeroed
patch is in testing branch (tho i'm tracking down a different regression in that branch).
btw, yan, were you able ...
Sage Weil
06:03 AM CephFS Bug #5453: kclient: multiple_rsync tee output partially zeroed
Sage Weil wrote:
> in combination with the mds s/wrlock/xlock/ change?
the kclient patch fixes this issue alone. ...
Zheng Yan
09:46 PM devops Bug #5345 (Fix Under Review): ceph-disk: handle less common device names
Sage Weil
07:05 AM devops Bug #5345 (In Progress): ceph-disk: handle less common device names
Thanks, Luke--that was exactly the info I needed! Sage Weil
12:22 AM devops Bug #5345: ceph-disk: handle less common device names
Hi Sage,
Just tried the ceph-disk as per your suggestion, however I found the following error:
ceph-deploy osd ...
Jing Yuan Luke
09:46 PM Bug #5497 (Resolved): ceph features mis-reported
i don't think we need to backport this.. the mons have a protocol version change and can't talk from cuttlefish -> 0.... Sage Weil
01:22 PM Bug #5497 (Pending Backport): ceph features mis-reported
Samuel Just
11:26 AM Bug #5497 (Resolved): ceph features mis-reported
I observed some osd daemons with 32 bit ~0 as features while others have 64 bit ~0. Samuel Just
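To show why the difference between a 32-bit ~0 and a 64-bit ~0 matters, here is a minimal sketch of a feature-bit check. The helper `supports` and the choice of bit 40 are assumptions for illustration; they do not correspond to any specific Ceph feature flag.

```python
MASK32 = (1 << 32) - 1  # ~0 truncated to 32 bits: 0xffffffff
MASK64 = (1 << 64) - 1  # ~0 as a full 64-bit value


def supports(features, bit):
    """Return True if the given feature bit is set in the features mask."""
    return bool(features & (1 << bit))


# Bit 40 is a hypothetical feature that lives above the low 32 bits.
low_only = supports(MASK32, 40)
full = supports(MASK64, 40)
```

A daemon advertising only the 32-bit mask appears to lack every feature numbered 32 or higher, while one advertising the 64-bit mask appears to have them all.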
07:05 PM Bug #5502: mon: long-running sync will restart (cuttlefish)
the other odd behavior we saw: a mon that was in quorum was restarted, and had to sync. Sage Weil
05:58 PM Bug #5502 (Resolved): mon: long-running sync will restart (cuttlefish)
on a largish cluster we observe that a ~20 minute mon sync restarts after it finishes. it looks like a problem with ... Sage Weil
06:19 PM CephFS Bug #5036: `ls` hangs on random folder
I suggest trying test branch ceph-client. At least two bugs that can cause hang like this have been fixed. Zheng Yan
04:08 PM CephFS Bug #5036: `ls` hangs on random folder
... Milosz Tanski
02:25 PM CephFS Bug #5036: `ls` hangs on random folder
Milosz, I think you've run into #2019, which I've reopened. Quan might have seen the same issue but there's not the r... Greg Farnum
02:15 PM CephFS Bug #5036: `ls` hangs on random folder
I'm experiencing the same issue here when trying to ls one directory in our ceph cluster on nodes. Using both vanilla... Milosz Tanski
03:44 PM Bug #5460 (Resolved): v0.61.3 -> v0.65 upgrade: new OSDs mark old as down
commit:e8b42a6998653bde488502097eaa0a2fb834d964 Sage Weil
01:02 PM Bug #5460: v0.61.3 -> v0.65 upgrade: new OSDs mark old as down
pushed a patch to paravoid-test branch; can you give it a try? the best theory i have is that at one point an osd st... Sage Weil
01:48 AM Bug #5460: v0.61.3 -> v0.65 upgrade: new OSDs mark old as down
I started osd.0 with 0.65-188-g946a838 and --debug-ms 20 --debug-osd 20. I did have the same errors, although this ti... Faidon Liambotis
03:16 PM Bug #5495 (Need More Info): ceph-mon and minus character in hostname
what version is this?
can you strace -f ceph-mon and attach that output? that'll give a better hint as to where t...
Sage Weil
03:56 AM Bug #5495 (Can't reproduce): ceph-mon and minus character in hostname
It looks like ceph-mon does not cope with a - in the hostname:
# /usr/bin/ceph-mon --cluster=office -i test-uplink...
Robert Sander
02:53 PM CephFS Bug #2019: mds: CInode::filelock stuck in sync->mix
it's a kclient bug, probably already fixed by 'libceph: call r_unsafe_callback when unsafe reply is received' Zheng Yan
02:31 PM CephFS Bug #2019: mds: CInode::filelock stuck in sync->mix
I'm into this as part of bug: #5036... Milosz Tanski
02:27 PM CephFS Bug #2019: mds: CInode::filelock stuck in sync->mix
See #5036. Greg Farnum
02:45 PM Bug #5500: ceph CLI should validate, reject bad daemon commands
I should note that the admin socket already supports "get_command_descriptions" so this change is
all in the CLI.
Dan Mick
02:42 PM Bug #5500 (Resolved): ceph CLI should validate, reject bad daemon commands
The CLI rewrite has the ability to validate commands before sending; it does this with mon and
osd commands, but not...
Dan Mick
01:31 PM devops Bug #5499: ceph-deploy --cluster clustername osd prepare fails
ssh root@node "ceph-disk-prepare --cluster office -- /path/to/mountpoint"
succeeds
Robert Sander
01:30 PM devops Bug #5499 (Resolved): ceph-deploy --cluster clustername osd prepare fails
ceph-deploy --cluster clustername osd prepare node:/path/to/mountpoint fails with
ceph-disk-prepare -- /path/to/mo...
Robert Sander
12:48 PM CephFS Fix #5498 (New): client: report actual file count instead of object count
Right now we fill in the statvfs struct's f_files member with the number of files. Instead we should use the mount po... Greg Farnum
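For context on where the `f_files` value surfaces, the sketch below reads it the way a tool such as `df -i` would, using Python's `os.statvfs` on a POSIX system. The helper name `inode_usage` and the path `/` are assumptions for illustration; the sketch shows only how a caller consumes the field, not how the Ceph client fills it in.

```python
import os


def inode_usage(path):
    """Return (total, free) file-node counts that statvfs reports for path."""
    st = os.statvfs(path)
    return st.f_files, st.f_ffree


# Any mounted path works; "/" is used here only because it always exists.
total, free = inode_usage("/")
```

Whatever the filesystem stores in `f_files` is what users see as the file count, which is why the entry cares about which number the client reports there.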
12:16 PM rgw Bug #5374: Avoid relying on keystone's admin token
A pull request has been made for this issue https://github.com/ceph/ceph/pull/392
Might need testing.
Christophe Courtaut
11:32 AM devops Bug #5369: fedora18: sysvinit doesn't start mon on reboot
sandon, can you make sure our images or chef or whatever is updated to avoid this pitfall? Sage Weil
11:32 AM devops Bug #5369: fedora18: sysvinit doesn't start mon on reboot
this is a network.service vs NetworkManager problem.. the $network LSB line allegedly doesn't wait for networkmanager... Sage Weil
09:45 AM devops Bug #5496: Unable to install librados rpm on Fedora 18
This looks like it's an issue with fedora moving the libraries into a ceph-libs package. If we don't want to change ... Anonymous
09:40 AM devops Bug #5496 (In Progress): Unable to install librados rpm on Fedora 18
Anonymous
04:30 AM devops Bug #5496 (Resolved): Unable to install librados rpm on Fedora 18
I am unable to install librados library due to conflict in existing package (ceph-libs) already installed on fedora 1... Chris Howarth
09:38 AM Bug #5492 (Fix Under Review): scripts installing into /usr/usr/sbin (with --prefix=/usr)
Sage Weil
09:33 AM CephFS Bug #5458: mds: standby-replay -> replay takeover does not handle racing expire/trim
Saw this backtrace again at /a/teuthology-2013-07-02_01:00:48-fs-next-testing-basic/52429/teuthology.log Greg Farnum
08:56 AM rgw Bug #1779 (Pending Backport): rgw: swift auth returns wrong error code when unexisting user is given
Do we want to backport this anywhere? Greg Farnum
06:35 AM rgw Bug #1779: rgw: swift auth returns wrong error code when unexisting user is given
This issue should be closed as https://github.com/ceph/ceph/pull/385 was merged. Christophe Courtaut
07:30 AM rgw Bug #5416: --help output needs --rgw-zone option
That's for both radosgw, radosgw-admin. This specific option is already upstream (in cuttlefish) iirc. Yehuda Sadeh
07:28 AM rgw Bug #5324 (Resolved): radosgw-admin --help missing the --shard-id option
Yehuda Sadeh
12:21 AM rgw Bug #5324: radosgw-admin --help missing the --shard-id option
This issue should be closed.
Fixed by https://github.com/ceph/ceph/commit/f571bd38f6c6aff861566e14a44570291eb959a2
Christophe Courtaut
07:09 AM Bug #5482 (Can't reproduce): cephx: verify_reply coudln't decrypt with error: error decoding bloc...
I actually meant 0.56.6, but you can move to 0.61.x (cuttlefish) too. See ceph.com/docs/master in the upgrade sectio... Sage Weil
12:41 AM Bug #5482: cephx: verify_reply coudln't decrypt with error: error decoding block for decryption
Sage Weil wrote:
> i haven't seen this particular message before. 0.56.3 is a bit out of date, though; please upgra...
chen atrmat
12:40 AM Bug #5482: cephx: verify_reply coudln't decrypt with error: error decoding block for decryption
ok, i'll try to upgrade.
BTW, if i upgrade from 0.56.3 to 0.61.4, would my data stored on the OSDs be lost? or how to bac...
chen atrmat
07:06 AM rgw Bug #5228 (Duplicate): radosgw-admin bucket list no longer shows all buckets
Sage Weil
12:31 AM rgw Bug #5228: radosgw-admin bucket list no longer shows all buckets
Seems to be a duplicate of http://tracker.ceph.com/issues/5455 Christophe Courtaut
06:51 AM rgw Bug #5455: radosgw-admin buckets list regression
As discussed here https://github.com/ceph/ceph/pull/384, this issue should be closed. Christophe Courtaut
12:54 AM Bug #5481: Ceph OSD Connection refused
Sage Weil wrote:
> if the ceph-osd daemon isn't running, other daemons will get connection refused. look in the log...
chen atrmat

07/02/2013

09:28 PM rgw Feature #5406 (Fix Under Review): rgw: a RESTful api to dump region map
Yehuda Sadeh
07:53 PM devops Bug #5345: ceph-disk: handle less common device names
Hi Jing,
As far as I can tell the current ceph-disk supports these device names, but as I mentioned I don't have a...
Sage Weil
07:26 PM devops Bug #5345: ceph-disk: handle less common device names
I had similar problem as Thomas but mine are HP Blades (more specifically BL460 G1) with P200i controllers but I susp... Jing Yuan Luke
04:57 PM CephFS Bug #5453: kclient: multiple_rsync tee output partially zeroed
btw the mds change passed 11 iterations before it stopped because of a chef/network hiccup. Sage Weil
09:42 AM CephFS Bug #5453: kclient: multiple_rsync tee output partially zeroed
in combination with the mds s/wrlock/xlock/ change? Sage Weil
02:43 PM Bug #5490 (Resolved): osd won't start after ceph-deploy on Debian 7 - fix
commit:87c98e92d1375c8bc76196bbbf06f677bef95e64 Sage Weil
02:07 PM Bug #5490 (Fix Under Review): osd won't start after ceph-deploy on Debian 7 - fix
wip-5490
fixed the same thing in the upstart scripts. nice catch, thanks!
Sage Weil
02:31 PM Bug #5493 (Resolved): objecter: hang on osd_command to nonexistent target
Sage Weil
11:43 AM Bug #5493 (Resolved): objecter: hang on osd_command to nonexistent target
job was teuthology-2013-07-02_01:00:14-rados-next-testing-basic 52253 Sage Weil
10:48 AM Bug #5460 (Need More Info): v0.61.3 -> v0.65 upgrade: new OSDs mark old as down
Can you set 'debug ms = 20' and 'debug osd = 20', reproduce the problem, and attach the log on the marked-down osd an... Sage Weil
09:31 AM devops Bug #5369: fedora18: sysvinit doesn't start mon on reboot
(06:25:53 PM) mbiebl: systemctl enable NetworkManager-wait-online.service
here's the full exchange:...
Sage Weil
09:30 AM CephFS Support #5491 (Rejected): Use of RBD kernel module
Questions like this should be addressed to the ceph-users list, please. :)
http://ceph.com/resources/mailing-list-irc/
Greg Farnum
04:02 AM CephFS Support #5491 (Rejected): Use of RBD kernel module
I would be grateful if you could explain a couple of items as these are not immediately clear to me from the Ceph doc... Chris Howarth
08:53 AM Bug #5492 (Resolved): scripts installing into /usr/usr/sbin (with --prefix=/usr)
At least in some cases, scripts (ceph-disk*, ceph-create-keys) are installed into /usr/usr/sbin. This happened while "ce... Denis kaganovich
06:58 AM rgw Feature #3074: radosgw needs --help support
The -d option of radosgw seems to come from global_init function, which is located under src/global/global_init.cc, a... Christophe Courtaut
05:53 AM rgw Bug #5402: rgw compilation problem on wip-rgw-geo-2 branch
Fixed by https://github.com/ceph/ceph/commit/cfc1f2ee1f4345a88d7d72f33f6f9b838fa134ce Christophe Courtaut

07/01/2013

11:19 PM Bug #5460: v0.61.3 -> v0.65 upgrade: new OSDs mark old as down
I rechecked and confirm that all the old ones run 0.61.3. I've stopped the new 0.65+ ones since then, but you can see... Faidon Liambotis
05:46 PM Bug #5460: v0.61.3 -> v0.65 upgrade: new OSDs mark old as down
I updated a cluster from 0.61 to the same 0.65 SHA1 version.
$ ceph --version
ceph version 0.65-188-g946a838 (946...
David Zafman
11:05 PM Bug #5490 (Resolved): osd won't start after ceph-deploy on Debian 7 - fix
I created a cluster with ceph-deploy on a set of Debian 7 VMs for testing. Each VM has a physical device/partition at... Matt Thompson
10:19 PM devops Bug #5489: ceph-deploy: mon destroy throws inappropriate message
test setup: burnupi05, burnupi21, burnupi63
burnupi21 is the one, stuck up at ceph-create-keys.
Tamilarasi muthamizhan
10:19 PM devops Bug #5489 (Resolved): ceph-deploy: mon destroy throws inappropriate message
I tried to build a 3-node cluster using ceph-deploy on centos, but forgot to disable iptables service on one of the no... Tamilarasi muthamizhan
09:44 PM CephFS Bug #5453: kclient: multiple_rsync tee output partially zeroed
patch "ceph: fix pending vmtruncate race" should fix the issue. Zheng Yan
09:23 PM rgw Bug #5346 (Resolved): rgw: invalid read from RGWFormatter_Plain::write_data
Sage pushed a fix at commit:49ff63b1750789070a8c6fef830c9526ae0f6d9f Yehuda Sadeh
06:22 PM rgw Bug #5346 (Fix Under Review): rgw: invalid read from RGWFormatter_Plain::write_data
Sage Weil
03:39 PM rgw Bug #5346: rgw: invalid read from RGWFormatter_Plain::write_data
using a trivial implementation of strlen avoids this. unfortunately we can't whitelist the glibc strlen call because i... Sage Weil
06:36 PM rbd Bug #5468 (Resolved): qemu: reports snapshot not supported on rpms
Josh built some rpm packages by adding a qemu patch and we tested it. Works fine now.
The packages are at http://w...
Tamilarasi muthamizhan
05:00 PM rgw Feature #5408 (Fix Under Review): rgw: turn off dr/geo logging
Yehuda Sadeh
04:49 PM rbd Bug #5488: librbd: deadlock in image refresh
looks like bh_write is holding cache_lock (== ObjectCacher::lock) and trying to take snap_read, while ictx_refresh() ... Sage Weil
04:19 PM rbd Bug #5488 (Resolved): librbd: deadlock in image refresh
... Sage Weil
04:06 PM Subtask #5487 (In Progress): Factor out ObjectContext / ReplicatedPG::object_contexts
Loïc Dachary
12:52 PM Subtask #5487 (Closed): Factor out ObjectContext / ReplicatedPG::object_contexts
* "read/write locks unit tests":https://github.com/ceph/ceph/pull/407
Loïc Dachary
04:05 PM rbd Bug #5464: krbd: modifying mapped image also modifies snapshot
backport patch updated with mainline commit sha1 and sent to stable@, just waiting for it to land in the stable tree. Sage Weil
03:22 PM Bug #5484 (Resolved): mon/Paxos.cc: 51: FAILED assert(err == 0) (reapply_all_versions())
commit:935c27842e4549b32c1c8721e178e9f3a10f2ad4 Sage Weil
10:58 AM Bug #5484: mon/Paxos.cc: 51: FAILED assert(err == 0) (reapply_all_versions())
... Sage Weil
10:11 AM Bug #5484 (Resolved): mon/Paxos.cc: 51: FAILED assert(err == 0) (reapply_all_versions())
ubuntu@teuthology:/a/teuthology-2013-07-01_01:00:09-rados-master-testing-basic/51450$ cat summary.yaml
description:...
Samuel Just
02:56 PM Subtask #5433: Factor out the ReplicatedPG object replication and client IO logic as a PGBackend ...
Sam's comments:
* It is better to do the changes within the same file to better read the diffs
* The changes are in...
Loïc Dachary
02:44 PM rgw Bug #5402 (Resolved): rgw compilation problem on wip-rgw-geo-2 branch
Sage Weil
02:29 PM rgw Bug #5455 (Resolved): radosgw-admin buckets list regression
teuthology.git commit:c0bf24d770162fd2babfb806aacda4c78b23288d Sage Weil
01:42 PM Bug #5470 (Resolved): osd/PGLog.h: 199: FAILED assert(log.log.size() == log_keys_debug.size())
Sage Weil
10:01 AM Bug #5470: osd/PGLog.h: 199: FAILED assert(log.log.size() == log_keys_debug.size())
Looks right to me. Samuel Just
12:36 PM CephFS Feature #5486 (Resolved): kclient: make it work with selinux
see #5477 for the latest failed attempt Sage Weil
12:34 PM CephFS Bug #5477 (Resolved): Unable to create files on CephFS on Fedora 18 using kernel module
Sage Weil
12:19 PM CephFS Bug #5477: Unable to create files on CephFS on Fedora 18 using kernel module
Many thanks for the responses Sage and Greg. You were right - once I disabled SElinux this worked.
Chris
Chris Howarth
10:51 AM CephFS Bug #5485 (Can't reproduce): failed cifs mount
In teuthology, logs at /a/teuthology-2013-07-01_01:00:46-fs-master-testing-basic/51619... Greg Farnum
09:50 AM Bug #5154: osd/SnapMapper.cc: 270: FAILED assert(check(oid))
ubuntu@teuthology:/a/teuthology-2013-07-01_01:00:09-rados-master-testing-basic/51502/remote$ less ceph-osd.4.log
...
Samuel Just
09:47 AM Bug #5482 (Need More Info): cephx: verify_reply coudln't decrypt with error: error decoding block...
i haven't seen this particular message before. 0.56.3 is a bit out of date, though; please upgrade to 0.56.6 to get ... Sage Weil
09:45 AM Bug #5481 (Rejected): Ceph OSD Connection refused
ceph-users is a better forum for these sorts of questions. thanks! Sage Weil
09:43 AM Bug #5481: Ceph OSD Connection refused
if the ceph-osd daemon isn't running, other daemons will get connection refused. look in the log for the down osd to... Sage Weil
06:04 AM rgw Bug #1779: rgw: swift auth returns wrong error code when unexisting user is given
Here comes a fix for this issue.
https://github.com/ceph/ceph/pull/385
Christophe Courtaut
05:16 AM devops Bug #5483: ceph-deploy --cluster othername mon create does not work
It looks like other ceph-deploy commands also fail with a cluster name different than "ceph". Robert Sander
04:51 AM devops Bug #5483 (Resolved): ceph-deploy --cluster othername mon create does not work
ceph-deploy --cluster othername mon create nodename
does not work. It creates a directory /var/lib/ceph/mon/ceph-n...
Robert Sander
03:25 AM rgw Bug #5416: --help output needs --rgw-zone option
Joe Buck wrote:
> For the ./radosgw-admin command that is.
On which branch does the --rgw-zone exist? Is this radosgw-...
Christophe Courtaut

06/30/2013

12:40 PM Bug #5401: cuttlefish osd recovery slow
The following seems to help:
- osd recovery max active = 2
- filestore min sync interval = 1
- filestore max sync ...
Stefan Priebe

06/29/2013

07:46 PM Feature #5437: ceph-mon performance on ARM
Downgraded to Normal, due to landing of patch. Keeping open as a placeholder for any future issues identified for imp... Ian Colle
05:35 PM CephFS Bug #5453: kclient: multiple_rsync tee output partially zeroed
please check if the attached patch solves this issue Zheng Yan
02:09 PM Bug #5401: cuttlefish osd recovery slow
Just another point regarding my idea. If we have an osd having writes to avg 50 PGs/min then ceph now wants that this... Stefan Priebe
07:02 AM Bug #5460: v0.61.3 -> v0.65 upgrade: new OSDs mark old as down
I've just tried ... Faidon Liambotis
06:56 AM Bug #5482 (Can't reproduce): cephx: verify_reply coudln't decrypt with error: error decoding bloc...
Hi all,
i found an OSD down, my env is ceph 0.56.3 , the part of log is below:
==== 47+0+0 (2545873131 0 0) 0x6...
chen atrmat
06:45 AM Bug #5481 (Rejected): Ceph OSD Connection refused
Hi all,
i found an error in ceph osd log file. In my osd tree, there're 2 osd marked down, not running.
in osd...
chen atrmat

06/28/2013

10:03 PM Bug #5470 (Fix Under Review): osd/PGLog.h: 199: FAILED assert(log.log.size() == log_keys_debug.si...
Sage Weil
06:31 PM CephFS Bug #5453: kclient: multiple_rsync tee output partially zeroed
i hit it after just a couple iterations of the teuthology test. i'll capture the osd log... Sage Weil
06:08 PM CephFS Bug #5453: kclient: multiple_rsync tee output partially zeroed
I can't reproduce this locally. How difficult is it to reproduce this? What's the backend fs for osd? Zheng Yan
05:32 PM CephFS Bug #5411: teuthology: bad object dereference
IME that's what this kind of error from gevent/eventlet etc. means - once the thread exits in a certain abnormal way,... Josh Durgin
03:28 PM CephFS Bug #5411: teuthology: bad object dereference
Yeah, I am/somebody will need to spend some time digging into this when we have some time free. There's another issue... Greg Farnum
03:24 PM CephFS Bug #5411: teuthology: bad object dereference
I think this is just a symptom of the mds_thrasher crashing, but not logging the exception since this join happens bef... Josh Durgin
03:36 PM rbd Bug #5480 (Can't reproduce): libceph: unexpected old state in con_sock_state_change
This happened after running this job in a loop for a while:... Josh Durgin
03:19 PM devops Bug #5479 (Resolved): Append our built packages with some sort of inktank/ceph identifier
Seems like our packages should have something appended to them so we know they are the inktank/ceph ones. I ran into ... Sandon Van Ness
02:58 PM rgw Feature #5406 (In Progress): rgw: a RESTful api to dump region map
Yehuda Sadeh
02:58 PM rgw Feature #5408 (In Progress): rgw: turn off dr/geo logging
Yehuda Sadeh
02:57 PM rgw Feature #4336 (Fix Under Review): rgw: dr: sync processing state: implement internal RESTful API
Yehuda Sadeh
02:25 PM CephFS Bug #5381 (Pending Backport): ceph-fuse: stuck with disconnected inodes on shutdown
commit:946a838cffa0927d1237489e8c2c143e87d66892 Sage Weil
01:48 PM Bug #5401: cuttlefish osd recovery slow
mhm retested. Log looks different but the result is still the same. Strangely after processing around 100pgs the reco... Stefan Priebe
01:34 PM Bug #5401: cuttlefish osd recovery slow
oops, it's 'osd recover clone overlap = false' (no y) Sage Weil
12:12 PM Bug #5401: cuttlefish osd recovery slow
Oh i forgot to say that in last log i activated debug_filestore later on at 20:51:02. it was 0 before. Stefan Priebe
11:58 AM Bug #5401: cuttlefish osd recovery slow
Log still contains:
2013-06-28 20:51:29.928755 7fe5c33ba700 20 filestore(/ceph/osd.3/) _do_clone_range 3260416~8192 ...
Stefan Priebe
11:52 AM Bug #5401: cuttlefish osd recovery slow
new files here - still a lot of reading goes on /home/cephdrop/5401/overlap_false_ceph-osd.3.log.gz Stefan Priebe
11:44 AM Bug #5401: cuttlefish osd recovery slow
all of the read activity is from using the clone when recovering snapped objects. can you see how things perform wit... Sage Weil
11:38 AM Bug #5401: cuttlefish osd recovery slow
Just to note the log was done with (filestore_queue_max_ops = 50, filestore_queue_committing_max_ops = 50) I h... Stefan Priebe
11:28 AM Bug #5401: cuttlefish osd recovery slow
OK Log uploaded to:
/home/cephdrop/5401/debug_filestore_20_ceph-osd.3.log.gz
The high read rates started at 20:22...
Stefan Priebe
07:29 AM Bug #5401: cuttlefish osd recovery slow
Stefan Priebe wrote:
> I'm not really sure that this is problem cause it was working fine under bobtail. But i can't...
Sage Weil
07:14 AM Bug #5401: cuttlefish osd recovery slow
I'm not really sure that this is problem cause it was working fine under bobtail. But i can't guarantee that the load... Stefan Priebe
07:02 AM Bug #5401: cuttlefish osd recovery slow
It is a good idea. I have to think a bit more about how hard it would be to implement. Currently the pg_temp remapp... Sage Weil
06:47 AM Bug #5401: cuttlefish osd recovery slow
thanks sage - i already played with the threading but that's not the problem. The problem is 2x 10-15MB/s reading whi... Stefan Priebe
06:22 AM Bug #5401: cuttlefish osd recovery slow
Stefan Priebe wrote:
> While recovering i'm seeing two threads out of the OSD doing in parallel 10-15MB/s reading an...
Sage Weil
12:20 AM Bug #5401: cuttlefish osd recovery slow
Generally i think it would be best to optionally block/redirect client I/O on a recovering OSD. So it can recover ... Stefan Priebe
01:36 PM rgw Bug #5478 (Resolved): rgw: fix init script for rgw daemon name
after installing and configuring radosgw on centos6, found that the naming convention is a bit confusing for rgw daemon... Tamilarasi muthamizhan
10:54 AM devops Bug #5405: ceph-deploy: transient pushy exception on install
on centos 6.4,... Tamilarasi muthamizhan
10:51 AM Linux kernel client Bug #5429: libceph: rcu stall, null deref in osd_reset->__reset_osd->__remove_osd
plana72 still sitting in kdb. Sage Weil
10:51 AM Linux kernel client Bug #5429: libceph: rcu stall, null deref in osd_reset->__reset_osd->__remove_osd
hit this again, ubuntu@teuthology:/a/teuthology-2013-06-28_01:01:07-kernel-master-testing-basic/48683 Sage Weil
09:31 AM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start
Wow, that is a much simpler test case than I would expect to be required. I can reproduce with a single file and this... Greg Farnum
02:24 AM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start

This is all in the lab at present.
We have been doing some additional testing, and have now confirmed that this...
Chris Clayton
09:23 AM CephFS Bug #5477: Unable to create files on CephFS on Fedora 18 using kernel module
And you don't need any kernel support to run the Ceph daemons. You should also check the permissions — it's possible ... Greg Farnum
06:51 AM CephFS Bug #5477: Unable to create files on CephFS on Fedora 18 using kernel module
I suspect this is SElinux or something similar getting in the way... Sage Weil
04:47 AM CephFS Bug #5477 (Resolved): Unable to create files on CephFS on Fedora 18 using kernel module
I have mounted a CephFS filesystem on a Fedora 18 system, which succeeds as follows:
[root@e8c4-dl360g7-03 ceph]# ...
Chris Howarth
06:08 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Yes, boost1.46 from Ubuntu does seem to make a difference. The last build has been running for 5 days now with far fe... Emil Renner Berthing

06/27/2013

11:30 PM Bug #5401: cuttlefish osd recovery slow
While recovering i'm seeing two threads out of the OSD doing in parallel 10-15MB/s reading and 10-20Mb/s writing. The... Stefan Priebe
07:01 AM Bug #5401: cuttlefish osd recovery slow
mikedawson told me on irc that he has the same problem - since upgrading from bobtail to cuttlefish Stefan Priebe
09:39 PM CephFS Bug #5381 (Fix Under Review): ceph-fuse: stuck with disconnected inodes on shutdown
Sage Weil
09:22 AM CephFS Bug #5381: ceph-fuse: stuck with disconnected inodes on shutdown
Greg Farnum
06:05 PM Bug #4599: ceph auth import -i <file> is broken
As of now ./ceph auth import -i appears to work, but reading from stdin doesn't:
dzafman@ubuntu:~/ceph/src$ ./ceph...
David Zafman
05:03 PM Feature #4839 (Resolved): api: make new CLI send old version of commands to old monitors during u...
Ian Colle
05:03 PM Feature #4315 (Resolved): api: create python CLI wrapper for ceph tool; read command descriptions...
Ian Colle
05:03 PM Feature #4314 (Resolved): api: modify ceph tool to describe own commands
Ian Colle
03:45 PM Feature #4200 (Resolved): mon: break pgmap into separate leveldb keys
Sage Weil
03:25 PM devops Bug #5476 (Resolved): ceph-deploy: disk list command is broken on centos 6.4
This error was actually in ceph-disk, not in ceph-deploy. Now fixed.
cuttlefish commit:90f5c448abeb127ae5a5528a79b...
Greg Farnum
02:59 PM devops Bug #5476: ceph-deploy: disk list command is broken on centos 6.4
test setup: burnupi63 Tamilarasi muthamizhan
02:59 PM devops Bug #5476 (Resolved): ceph-deploy: disk list command is broken on centos 6.4
on the latest cuttlefish branch, disk list command is broken,
[ubuntu@burnupi63 ceph-deploy]$ sudo ceph -v
ceph v...
Tamilarasi muthamizhan
02:42 PM Feature #5437: ceph-mon performance on ARM
Latest wip-mon-pgmap tests got through PG creation without issue. The mons have been quite stable on ARM with 168 OS... Mark Nelson
02:21 PM Bug #5424 (Resolved): mon/Paxos.cc: 549: FAILED assert(begin->last_committed == last_committed)
Sage Weil
01:38 PM devops Bug #5475 (Closed): ceph-deploy deb install breaks on i386 due to missing python-pushy package
We're not building python-pushy for i386, so attempts to apt-get install ceph-deploy fail:
> rturk@ubuntu:~$ sudo ...
Ross Turk
01:34 PM Bug #5473 (Resolved): osd/ReplicatedPG.cc: 1379: FAILED assert(0) in trim_object() on master, cut...
... Sage Weil
10:08 AM rgw Bug #5415: rgw: failing valgrind leak checks
41708: (401s) collection:verify clusters:fixed-2.yaml fs:btrfs.yaml msgr-failures:few.yaml tasks:rgw_s3tests.yaml val... Ian Colle
01:50 AM rgw Bug #5415: rgw: failing valgrind leak checks
Sage Weil wrote:
> teuthology-2013-06-21_01:00:44-rgw-master-testing-basic/41708 and 41709
Hi, I'm willing to wor...
Christophe Courtaut
08:03 AM Bug #5471: mon: do not join a quorum if quorum's version is lower than ours
I have a simple patch for this that simply compares the quorum's version to our own paxos version and forces us to su... Joao Eduardo Luis
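A minimal sketch of the check described above, assuming the Paxos versions can be compared as plain integers. The function name and parameters are hypothetical illustrations, not taken from the actual patch, and what the monitor does after declining is left out because the comment is truncated:

```python
def should_join_quorum(own_paxos_version: int, quorum_paxos_version: int) -> bool:
    """Return True only if the quorum's Paxos version is not lower than ours.

    Hypothetical helper: the real monitor code is C++ and may differ.
    """
    return quorum_paxos_version >= own_paxos_version


# A monitor at version 120 should not join a quorum formed at version 100.
print(should_join_quorum(own_paxos_version=120, quorum_paxos_version=100))
# A monitor at the same version as the quorum may join.
print(should_join_quorum(own_paxos_version=100, quorum_paxos_version=100))
```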
07:58 AM Bug #5471 (Resolved): mon: do not join a quorum if quorum's version is lower than ours
With p being the monitor's Paxos version, consider:
* A - p:100 (at time quorum was formed)
* B - p:100 (at time ...
Joao Eduardo Luis
07:59 AM rgw Feature #4336 (In Progress): rgw: dr: sync processing state: implement internal RESTful API
Yehuda Sadeh
07:57 AM rgw Feature #5358 (Fix Under Review): rgw: RESTful api for intra-region copy state
Yehuda Sadeh
07:57 AM rgw Feature #4613 (Fix Under Review): Allow bucket data to reside in a separate pool to object data
Yehuda Sadeh
07:45 AM rgw Bug #5324 (In Progress): radosgw-admin --help missing the --shard-id option
Yehuda Sadeh
02:42 AM rgw Bug #5324: radosgw-admin --help missing the --shard-id option
Hi,
Here comes a fix for this bug.
https://github.com/ceph/ceph/pull/382
Best
Christophe Courtaut
07:38 AM rgw Feature #5420 (Rejected): rgw: integrate bucket metadata changes with bucket index log
We're not going to do that, decided on a different approach. Yehuda Sadeh
07:37 AM rgw Feature #5417 (Fix Under Review): rgw: separate bucket metadata object into pointer object and in...
Yehuda Sadeh
07:36 AM rgw Bug #5344 (Fix Under Review): rgw: make list of bucket placement pools index configurable
Yehuda Sadeh
06:05 AM Bug #5470 (Resolved): osd/PGLog.h: 199: FAILED assert(log.log.size() == log_keys_debug.size())
... Sage Weil

06/26/2013

11:30 PM Bug #5444 (Rejected): ceph df print error message
Sage Weil
11:15 PM CephFS Bug #5381: ceph-fuse: stuck with disconnected inodes on shutdown
this is sufficient to reproduce. i think this is a problem with unlinked inodes in the client cache not getting clea... Sage Weil
10:35 PM Bug #5460: v0.61.3 -> v0.65 upgrade: new OSDs mark old as down
Faidon Liambotis wrote:
> I did have mons upgraded and restarted first, sorry for not mentioning this earlier. I'll ...
Sage Weil
09:21 PM Bug #5460: v0.61.3 -> v0.65 upgrade: new OSDs mark old as down
I did have mons upgraded and restarted first, sorry for not mentioning this earlier. I'll try the workaround neverthe... Faidon Liambotis
09:07 PM Bug #5460 (Resolved): v0.61.3 -> v0.65 upgrade: new OSDs mark old as down
commit:fe6633172ea10b9f95c6d59bccdac01651195f25 Sage Weil
08:08 PM Bug #5460: v0.61.3 -> v0.65 upgrade: new OSDs mark old as down
Note the workaround here is to upgrade and restart Mons first. Sage Weil
07:10 PM Bug #5460 (Fix Under Review): v0.61.3 -> v0.65 upgrade: new OSDs mark old as down
Proposed fix in wip-5460.
In older release the osdmap did not specify a front-side interface. A couple of places ...
David Zafman
02:02 AM Bug #5460 (Resolved): v0.61.3 -> v0.65 upgrade: new OSDs mark old as down
I tried upgrading my v0.61.3 cluster to v0.65 today. All of the new (v0.65) OSDs are marking all of the old (v0.61.3)... Faidon Liambotis
10:08 PM CephFS Bug #5453: kclient: multiple_rsync tee output partially zeroed
putting the tee'd file in /tmp fixes the problem, implying this is a kclient/cephfs bug of some sort. moving this in... Sage Weil
09:35 PM rgw Bug #5455: radosgw-admin buckets list regression
committed, but still need the radosgw-admin.py test before we close the bug Sage Weil
06:55 PM rbd Bug #5469 (Resolved): qemu-io: segfault when tried IO with invalid arguments
tested this on rhel 6.3 and rhel6.4
tried qemu-io to perform IO on the rbd image by writing in a pattern and readi...
Tamilarasi muthamizhan
06:42 PM devops Bug #5390 (Resolved): ceph-deploy osd create hangs
Sage Weil
06:40 PM Bug #5424: mon/Paxos.cc: 549: FAILED assert(begin->last_committed == last_committed)
Sage Weil
06:39 PM rbd Bug #5464: krbd: modifying mapped image also modifies snapshot
Sage Weil
03:24 PM rbd Bug #5464: krbd: modifying mapped image also modifies snapshot
This was caused by bf0d5f503dc11d6314c0503591d258d60ee9c944, which was first included in 3.9. Josh Durgin
03:20 PM rbd Bug #5464 (Fix Under Review): krbd: modifying mapped image also modifies snapshot
The workunit that tests this case (rbd/kernel.sh) succeeds on both 3.9 and 3.10-rc7 with the patches in wip-snapc-3.9... Josh Durgin
11:08 AM rbd Bug #5464 (Resolved): krbd: modifying mapped image also modifies snapshot
http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/15815 Josh Durgin
06:36 PM Bug #5445 (Resolved): random osd EPERM on journal
Sage Weil
06:28 PM Bug #5445 (Pending Backport): random osd EPERM on journal
Sage Weil
07:52 AM Bug #5445 (Fix Under Review): random osd EPERM on journal
pushed wip-5445.
this normally wouldn't happen, except that teuthology does not define fsid in the ceph.conf, so c...
Sage Weil
05:13 PM Bug #5467: ceph auth add <entity> -i <file> broken
Added simple test at Sage's suggestion in commit:ca55c3416e02398991c9789b1590d721fbca212d Dan Mick
04:21 PM Bug #5467 (Resolved): ceph auth add <entity> -i <file> broken
commit:bfed2d60a59bf69863989ca3bc108079c1d37f4f
Dan Mick
03:48 PM Bug #5467 (Fix Under Review): ceph auth add <entity> -i <file> broken
Dan Mick
03:17 PM Bug #5467 (In Progress): ceph auth add <entity> -i <file> broken
Dan Mick
03:16 PM Bug #5467 (Resolved): ceph auth add <entity> -i <file> broken
In the Ceph CLI rewrite, I broke "auth add" without caps. auth add currently requires
both an entity name and at le...
Dan Mick
03:30 PM rbd Bug #5468 (Resolved): qemu: reports snapshot not supported on rpms
Tested qemu on the following packages "http://ceph.com/packages/qemu-kvm/redhat/x86_64/qemu-img-0.12.1.2-2.355.el6.2.... Tamilarasi muthamizhan
02:38 PM Bug #5466 (Can't reproduce): raring tarball builds mysteriously fail make check
I've noticed this intermittently: make check on raring tarball builds fails like this:... Dan Mick
11:32 AM Bug #5401: cuttlefish osd recovery slow
Hi sage very strange. I tried your suggestions but also setting recovery active to 20 and priority to 60 does not hel... Stefan Priebe
11:32 AM rbd Feature #5465 (Resolved): openstack: cinder: support resize with rbd
Cinder now has an api for growing volumes. Add support for it to the rbd driver Josh Durgin
10:09 AM Bug #5272 (Duplicate): Updating ceph from 0.61.2 to 0.61.3 obviously changes tunables of existing...
This was an issue with the MDS doing heavy reads off of the OSDs. See #4405 and the related caching issues. Greg Farnum
06:28 AM Bug #5195 (Resolved): "ceph-deploy mon create" fails when adding additional monitors
Updated documentation to add a note about needing the public network statement. Anonymous
03:54 AM Feature #5461 (Rejected): ceph-disk-prepare should support LVM volumes
It is very convenient at least for testing purposes to configure osds on LVM volumes. Unfortunately ceph-disk-prepare... Maciej Galkiewicz
01:47 AM rbd Bug #5454: krbd: assertion failure in rbd_img_obj_callback()
Hi Sage,
yes, dd was the program, bs=1024k
Regards
Andreas
Andreas Bluemle

06/25/2013

10:21 PM Bug #5445: random osd EPERM on journal
oh, the test in question isn't mounting a drive, but is storing the data directly in /var/lib/ceph/osd/ceph-$id. the... Sage Weil
09:35 PM Bug #5445 (In Progress): random osd EPERM on journal
this happens on tasks that don't use all available disks. a previous job with ceph-deploy leaves behind osd disks, s... Sage Weil
07:48 PM CephFS Bug #5450 (Resolved): mds: failed CDir::_fetched() assert
nice! cherry-picked to commit:ccb3dd5ad5533ca4e9b656b4e3df31025a5f2017 Sage Weil
07:08 PM CephFS Bug #5450: mds: failed CDir::_fetched() assert
0.61.4-5-gd572cf6 ? probably already fixed by commit:81d073fecb (mds: fix underwater dentry cleanup) Zheng Yan
10:20 AM CephFS Bug #5450 (Resolved): mds: failed CDir::_fetched() assert
... Greg Farnum
07:33 PM Bug #5459: ceph-mon failure using wip-mon-pgmap on ARM
this is almost certainly the max_open_files limit. add max open files = 16384 to ceph.conf and restart the mons. Sage Weil
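As a rough illustration of the suggested change, the sketch below adds "max open files = 16384" to the text of a ceph.conf using Python's standard configparser. Placing the option in a [global] section is an assumption (the comment does not say which section), the helper name is hypothetical, and configparser drops comment lines and rejects duplicate keys, so this is not a safe way to rewrite a production file:

```python
import configparser
import io


def with_max_open_files(conf_text: str, limit: int = 16384,
                        section: str = "global") -> str:
    """Return conf_text with 'max open files' set in the given section.

    Hypothetical helper; the [global] default is an assumption.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, "max open files", str(limit))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


print(with_max_open_files("[global]\nfsid = abc\n"))
```

The monitors still need to be restarted afterwards, as the comment says.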
07:28 PM Bug #5459 (Resolved): ceph-mon failure using wip-mon-pgmap on ARM
This happened on the mixed burnupi/calxeda cluster with wip-mon-pginfo. leveldb caching was set to 256MB.... Mark Nelson
07:19 PM CephFS Bug #5418: kceph: crash in remove_session_caps
Zheng Yan wrote:
> I still don't figure out that root cause of the crash, infinite loop in iterate_session_caps(), B...
Sage Weil
07:01 PM CephFS Bug #5418: kceph: crash in remove_session_caps
I still don't figure out the cause of the crash, infinite loop in iterate_session_caps(), BUG_ON(session->s_nr_caps >... Zheng Yan
01:00 PM CephFS Bug #5418: kceph: crash in remove_session_caps
ubuntu@teuthology:/a/teuthology-2013-06-25_01:00:47-kernel-next-testing-basic/45603 Sage Weil
06:01 PM CephFS Bug #5458 (Duplicate): mds: standby-replay -> replay takeover does not handle racing expire/trim
not sure this is the right diagnosis since i only looked at this briefly, but:... Sage Weil
05:30 PM Bug #5336: osd crash triggered by 'rbd rm ...'
no word from Florian on the ML; downgrading to high. Sage Weil
05:28 PM Bug #5401: cuttlefish osd recovery slow
hmm, i don't see anything awry here. the rops count does go over the max because we get client io requests to missing... Sage Weil
03:15 PM Bug #5401: cuttlefish osd recovery slow
OK this is a complete log from starting a stopped OSD, recover and then shut it down again. This one was done using t... Stefan Priebe
02:28 PM Bug #5401: cuttlefish osd recovery slow
debug osd = 20
debug ms = 1
thanks!
Sage Weil
01:57 PM Bug #5401: cuttlefish osd recovery slow
Will test soon. Can also provide a complete log. Just debug osd 20? everything else can be 0?
Stefan Priebe
01:35 PM Bug #5401: cuttlefish osd recovery slow
pushed another wip-5401 and resolves the race more thoroughly. tested lightly and it behaves. but i haven't been ab... Sage Weil
01:11 PM Bug #5401: cuttlefish osd recovery slow
Stefan, can you capture an osd log that covers the entire period from before recovery starts up until when it slows/s... Sage Weil
11:43 AM Bug #5401: cuttlefish osd recovery slow
thanks sage i tested wip-5401 branch no change ;-( Stefan Priebe
11:21 AM Bug #5401: cuttlefish osd recovery slow
Yes see comment #7 (log file uploaded to cephdrop folder 5401/ceph-osd.44.log.gz).
Yes peering is already fully co...
Stefan Priebe
11:19 AM Bug #5401: cuttlefish osd recovery slow
i've pushed wip-5021 which makes a small optimization that will improve recovery for small objects. it's not well te... Sage Weil
11:18 AM Bug #5401: cuttlefish osd recovery slow
Did you put the logs in cephdrop?
I'm mostly curious to see if we can figure out where the disk thrashing is coming ...
Greg Farnum
11:14 AM Bug #5401: cuttlefish osd recovery slow
Maybe what's missing. The backtraces i provided and the logs are from that timeframe. Also it's the timeframe i'm see... Stefan Priebe
09:59 AM Bug #5401: cuttlefish osd recovery slow
Stefan Priebe wrote:
> First the slow requests and VM hangs are results of a heavily stressed disk. I see a MASSIVE ...
Greg Farnum
12:03 AM Bug #5401: cuttlefish osd recovery slow
Yes generally this check needs to be fixed. But it does not solve the problem.
To be a bit more specific.
First...
Stefan Priebe
05:07 PM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
latest is that the precise version of boost appears to have resolved this.. so far. Sage Weil
04:47 PM rgw Bug #5422 (Resolved): object deletion should log the object tag
fixed in the wip-rgw-geo-2 branch. Yehuda Sadeh
04:46 PM rgw Bug #5441 (Resolved): bucket creation does not log the tag
this should be fixed now, reopen if you see it again. Yehuda Sadeh
04:43 PM rbd Bug #5454: krbd: assertion failure in rbd_img_obj_callback()
Hi Andreas,
How are you writing to the image? dd with bs=1M or something?
Sage Weil
12:15 PM rbd Bug #5454 (Resolved): krbd: assertion failure in rbd_img_obj_callback()
when working on a large rbd image, I am hitting an
assertion failure in rbd kernel module; the
assertion is in rbd_...
Andreas Bluemle
04:16 PM devops Bug #5345: ceph-disk: handle less common device names
can you please try the version of ceph-disk in the wip-ceph-disk branch? it has a bunch of changes to be smarter abo... Sage Weil
03:24 PM devops Bug #5345: ceph-disk: handle less common device names
I have several HP dl180's with p400 raid controllers. This is all standard hardware.
The disk paths are enumerated...
Tomas Lovato
02:15 PM rgw Bug #5455 (Resolved): radosgw-admin buckets list regression
With bobtail, it was possible to list every rgw bucket on the cluster with:
# radosgw-admin buckets list
[
"a...
Alexandre Marangone
01:59 PM Bug #5438 (Resolved): mon/Monitor.cc: 1888: FAILED assert(is_probing() || is_synchronizing())
Sage Weil
12:50 PM Bug #5440 (Resolved): osd: marked down due to no pgstats reports
broken test + test yaml, fixed in teuthology.git and ceph-qa-suite.git Sage Weil
08:55 AM Bug #5440: osd: marked down due to no pgstats reports
ubuntu@teuthology:/a/teuthology-2013-06-25_01:00:06-rados-next-testing-basic/45417
in mon log, osd msgs suddenly s...
Sage Weil
12:13 PM CephFS Bug #5453 (In Progress): kclient: multiple_rsync tee output partially zeroed
Sage Weil
12:09 PM CephFS Bug #5453 (Resolved): kclient: multiple_rsync tee output partially zeroed
latest run:... Sage Weil
11:55 AM devops Bug #5452 (Resolved): ceph-deploy: purge command reports unsupported platform on fedora18
commit f67486632cb40a34d089c9dcfa371383a5f3ab9c
fix worked fine!
Tamilarasi muthamizhan
11:15 AM devops Bug #5452 (Resolved): ceph-deploy: purge command reports unsupported platform on fedora18
while purge command does uninstall on rhel and centos, it reports, "unsupported platform" for fedora18.
We need to...
Tamilarasi muthamizhan
10:59 AM Bug #4907 (Resolved): rados python bindings: get_xattr() uses a fixed 4k buffer
commit:3016f46f53d4701ead1e30f2a3d67a39ca0050f8 Josh Durgin
07:15 AM Bug #4907 (Fix Under Review): rados python bindings: get_xattr() uses a fixed 4k buffer
proposed patch : https://github.com/ceph/ceph/pull/380 Loïc Dachary
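One common way to avoid a fixed-size buffer is to retry with a larger one when the value does not fit. The sketch below shows that general pattern only; it is not the patch from the pull request above. The read_fn callback, the ERANGE convention, and the size limits are all assumptions made for illustration:

```python
import errno


def read_with_growing_buffer(read_fn, initial_size=4096, max_size=1 << 20):
    """Call read_fn(size), doubling size while it raises OSError(ERANGE).

    read_fn is a hypothetical callback that returns the value as bytes,
    or raises OSError with errno.ERANGE if 'size' bytes are not enough.
    """
    size = initial_size
    while True:
        try:
            return read_fn(size)
        except OSError as exc:
            # Give up on any other error, or once the cap is reached.
            if exc.errno != errno.ERANGE or size >= max_size:
                raise
            size = min(size * 2, max_size)


# Usage with a fake backend holding a 10000-byte value.
value = b"x" * 10000


def fake_read(size):
    if size < len(value):
        raise OSError(errno.ERANGE, "buffer too small")
    return value


print(len(read_with_growing_buffer(fake_read)))
```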
10:49 AM Feature #4982 (Fix Under Review): OSD: namespaces pt 1 (librados/osd, not caps)
David Zafman
10:15 AM Bug #5312 (Resolved): Skip EXT4StoreTest._detect_fs test if DISK or MOUNTPOINT environment variab...
Sage Weil
06:14 AM Bug #5312 (Fix Under Review): Skip EXT4StoreTest._detect_fs test if DISK or MOUNTPOINT environmen...
ready for review : https://github.com/ceph/ceph/pull/379 Loïc Dachary
05:52 AM Bug #5312: Skip EXT4StoreTest._detect_fs test if DISK or MOUNTPOINT environment variables not set
TEST (EXT4StoreTest, _detect_fs) was introduced by https://github.com/ceph/ceph/commit/574051f8da0a30073a7d5da880878e... Loïc Dachary
09:31 AM Bug #5449 (Can't reproduce): osd crash immediately after booting up
Debian 7.1 (wheezy), kernel 3.2.41-2+deb7u2, ceph 0.61.4. I have three osds and only osd.1 crashes just after booting... Maciej Galkiewicz
09:27 AM Bug #5444: ceph df print error message
this is the fix:
$ git tag --contains 0c2b738d8d07994fee4c73dd076ac9364a64bdb2
v0.64
maybe your ceph-mon daemo...
Sage Weil
12:15 AM Bug #5444: ceph df print error message
ceph version 0.64-639-g37a2017 (37a20174fd22a79938ba9c93046e8830f4a3306f)
jianpeng ma
01:05 AM rbd Feature #1790: rbd: have a way of establishing configured mappings at boot time
For the record : "Add rc script for rbd map/unmap":https://github.com/ceph/ceph/commit/a4ddf704868832e119d7949e96fe35... Loïc Dachary

06/24/2013

09:01 PM Bug #5442 (Resolved): ceph-deploy: rados test failed in the nightlies
qa suite commit:2df1e209d3649a46aaf01c6cd961cb96638fe1f6 Sage Weil
11:50 AM Bug #5442 (Resolved): ceph-deploy: rados test failed in the nightlies
logs: ubuntu@teuthology:/a/teuthology-2013-06-24_01:01:13-ceph-deploy-master-testing-basic/44159... Tamilarasi muthamizhan
08:42 PM Bug #5431 (Resolved): osd: dump_stuck test fails with ENXIO
Sage Weil
08:35 PM Bug #5444 (Need More Info): ceph df print error message
what version is this? the units were off by a factor of 1000. this was recently fixed.. Sage Weil
07:16 PM Bug #5444 (Rejected): ceph df print error message
root@ubuntu:/media# ceph df
GLOBAL:
SIZE AVAIL RAW USED %RAW USED
3903M 2267M 1367M ...
jianpeng ma
07:31 PM Bug #5445 (Can't reproduce): random osd EPERM on journal
ubuntu@teuthology:/a/teuthology-2013-06-24_18:48:34-rados-cuttlefish-testing-basic/45071... Sage Weil
06:58 PM rgw Bug #5441: bucket creation does not log the tag
previous tag: a tag for the read_version Yehuda Sadeh
06:57 PM rgw Bug #5441: bucket creation does not log the tag
Not sure what you mean:
That's the tag:...
Yehuda Sadeh
11:46 AM rgw Bug #5441 (Resolved): bucket creation does not log the tag
In the wip-rgw-geo-2 branch, the mdlog entries for a bucket creation do not include the tag.
The tag should be incl...
Anonymous
06:52 PM Bug #5438 (Pending Backport): mon/Monitor.cc: 1888: FAILED assert(is_probing() || is_synchronizin...
Sage Weil
11:39 AM Bug #5438 (Resolved): mon/Monitor.cc: 1888: FAILED assert(is_probing() || is_synchronizing())
... Sage Weil
05:54 PM CephFS Bug #5381 (Need More Info): ceph-fuse: stuck with disconnected inodes on shutdown
Sage Weil
01:03 PM CephFS Bug #5381: ceph-fuse: stuck with disconnected inodes on shutdown
next time we see this (or any other ceph-fuse shutdown hang), grab the logs manually via scp before nuking, and note ... Sage Weil
05:49 PM rgw Bug #5443 (Resolved): el6: rgw init script has debug mode on
commit:31d6062076fdbcd2691c07a23b381b26abc59f65 Sage Weil
05:17 PM rgw Bug #5443 (Resolved): el6: rgw init script has debug mode on
rgw init script that comes with the rpm packages for centos has debug mode on.
bash -x is set. It would be nice to...
Tamilarasi muthamizhan
05:26 PM rgw Bug #5323: trim data log lists dates as optional, enforced as required in the current code
I wonder if they should be optional. It's easy to purge the entire log by mistake. Yehuda Sadeh
04:39 PM Bug #5401 (In Progress): cuttlefish osd recovery slow
that fix looks correct, but i'm not sure it is the problem here.. if i'm reading this right, blowing past the max wil... Sage Weil
04:25 PM Bug #5407 (Resolved): mon: is_writeable doesn't match wait_for_writeable on cuttlefish
bah, well, ended up just backporting the rest of the series from master. in the long run that will be easier to main... Sage Weil
06:21 AM Bug #5407 (Fix Under Review): mon: is_writeable doesn't match wait_for_writeable on cuttlefish
pushed to wip-mon-cuttlefish. Joao Eduardo Luis
04:24 PM Bug #5205 (Resolved): mon: FAILED assert(ret == 0) on config's set_val_or_die() from pick_address...
Sage Weil
12:53 PM Bug #5205 (Fix Under Review): mon: FAILED assert(ret == 0) on config's set_val_or_die() from pick...
Sage Weil
04:24 PM Bug #5195: "ceph-deploy mon create" fails when adding additional monitors
yeah, let's update the docs ("the new monitor needs to know what address to bind to") and close this bug. Yay! Sage Weil
04:22 PM Bug #5195: "ceph-deploy mon create" fails when adding additional monitors
After clean-up and re-installation of wip-5195 it worked.
2013-06-24 19:13:52.853138 7f3f92b9e700 0 log [INF] : m...
Anonymous
01:51 PM Bug #5195: "ceph-deploy mon create" fails when adding additional monitors
I ran a test with the wip-5195 branch. That fixed the issue with the assert.
There now seems to be an authenticati...
Anonymous
12:53 PM Bug #5195 (Fix Under Review): "ceph-deploy mon create" fails when adding additional monitors
Sage Weil
10:55 AM Bug #5195: "ceph-deploy mon create" fails when adding additional monitors
Gary Lowell wrote:
> With the public_network statement added to ceph.conf it looks like it gets further but hits an ...
Joao Eduardo Luis
10:20 AM Bug #5195: "ceph-deploy mon create" fails when adding additional monitors
With the public_network statement added to ceph.conf it looks like it gets further but hits an assert. Stack trace a... Anonymous
01:27 PM rgw Feature #4613: Allow bucket data to reside in a separate pool to object data
Implemented this one for dumpling. Made it possible to define different data and index pools for a bucket, and give u... Yehuda Sadeh
12:54 PM Bug #5427 (Resolved): mon: could not get service secret for auth subsystem
Sage Weil
10:20 AM Bug #5427 (Pending Backport): mon: could not get service secret for auth subsystem
Sage Weil
10:01 AM Bug #5427: mon: could not get service secret for auth subsystem
Looked at it yesterday, pull request 374. I believe it to be correct. Joao Eduardo Luis
09:26 AM Bug #5427: mon: could not get service secret for auth subsystem
Joao - please review. Ian Colle
11:44 AM Bug #5154: osd/SnapMapper.cc: 270: FAILED assert(check(oid))
just hit this on job... Sage Weil
11:43 AM Bug #5440 (Resolved): osd: marked down due to no pgstats reports
2013-06-24T02:04:34.124 INFO:teuthology.task.ceph.mon.b.err:2013-06-24 02:04:37.762017 7fe7e462b700 -1 mon.b@0(leader... Sage Weil
11:40 AM rgw Bug #5439 (Resolved): rgw: 500 returned on s3readwrite
teuthology@/a/teuthology-2013-06-24_01:00:42-rgw-master-testing-basic/44036:... Yehuda Sadeh
10:58 AM CephFS Bug #5333 (Resolved): mds: segfault in MDLog::standby_trim_segments
done, commit:f046dab88fcfeda23391bcd694abc65ff1ed8cd8 Sage Weil
10:12 AM CephFS Bug #5333 (Pending Backport): mds: segfault in MDLog::standby_trim_segments
I saw this crash under teuthology in the next branch as well; can we put it there? Greg Farnum
10:44 AM CephFS Bug #5411: teuthology: bad object dereference
#5333 is what I was referring to. There's a whole string of failures which are hitting both that and this. Greg Farnum
10:08 AM CephFS Bug #5411: teuthology: bad object dereference
Josh, I went back and looked at the first instance (/a/teuthology-2013-06-18_01\:00\:37-fs-next-testing-basic/38877/)... Greg Farnum
10:05 AM CephFS Bug #5411: teuthology: bad object dereference
Happened again... Greg Farnum
09:45 AM CephFS Bug #5411: teuthology: bad object dereference
If you look at the message from the first exception, it says the mds failed:... Josh Durgin
09:36 AM CephFS Bug #5382: mds: failed objecter assert on shutdown
/a/teuthology-2013-06-23_20:00:47-fs-cuttlefish-testing-basic/43843/teuthology.log Greg Farnum
09:10 AM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start
Unfortunately this is an area where CephFS needs some hardening and some recovery tools — part of why we don't recomm... Greg Farnum
05:49 AM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start
We have hit a very similar problem with V0.61.2. We are unable to start any MDS daemons following testing that involv... Chris Clayton
09:09 AM Bug #5432 (Pending Backport): msgr: bad locking mark_down_all
the crash was from the earlier changes that are only in master, and this whole series is just to fix Connection and *... Sage Weil
08:17 AM Feature #5437 (Resolved): ceph-mon performance on ARM
Scaling tests with 168 OSDs show bottlenecks when ceph-mon is running on ARM processors. Requests are not being proc... Mark Nelson
06:30 AM Subtask #2659 (Can't reproduce): mon: Single-Paxos: ceph tool -w subscriptions not being updated
Joao Eduardo Luis
06:28 AM Subtask #2621 (Resolved): mon: Single-Paxos: synchronize the MonitorDBStore of oblivious monitor
Joao Eduardo Luis
06:26 AM Bug #5294 (Closed): mon upgrade issue 0.61.2 -> 0.61.3
From ceph-users, it appears the monitors were just taking a long, long time compacting a 500GB+ store (which explains... Joao Eduardo Luis

06/23/2013

11:20 PM Bug #5195: "ceph-deploy mon create" fails when adding additional monitors
Yeah, I think as things currently stand though the Mon looks for that option to be defined. We can probably fix it in... Sage Weil
10:55 PM Bug #5195: "ceph-deploy mon create" fails when adding additional monitors
Sage Weil wrote:
> oh, right. in this case i think the thing to do is add 'public network = 1.2.3.0/24' or whatever...
Robert Sander
10:03 PM Bug #5195: "ceph-deploy mon create" fails when adding additional monitors
oh, right. in this case i think the thing to do is add 'public network = 1.2.3.0/24' or whatever to the ceph.conf so... Sage Weil
10:43 PM Bug #5432: msgr: bad locking mark_down_all
Merged into master with commit:134d08a9654f66634b893d493e4a92f38acc63cf. Does wip-msgr need any backports? I think th... Greg Farnum
03:10 PM Bug #5432 (Fix Under Review): msgr: bad locking mark_down_all
wip-msgr fixes this already, needs review! Sage Weil
11:03 AM Bug #5432 (Resolved): msgr: bad locking mark_down_all
... Sage Weil
10:12 PM CephFS Bug #5021: ceph-fuse: crash on traceless reply
btw wip-5021 still hasn't merged because it failed the smbtorture test. i'll rebase on master and retest to see wher... Sage Weil
10:11 PM devops Bug #5193: RHEL6 does not ship with xfsprogs
xfs support is ramping up for newer 6.x rhel releases, so think this goes away on its own... Sage Weil
10:09 PM CephFS Bug #5105 (Duplicate): mds/CInode.cc: 1996: FAILED assert(auth_pins >= 0)
#4832 Sage Weil
10:06 PM CephFS Bug #5333 (Resolved): mds: segfault in MDLog::standby_trim_segments
commit:abd0ff64e108b7670a062b3fa39baaf3d3e48fb3 Sage Weil
04:30 PM CephFS Bug #5430 (Duplicate): newfs makes ceph-mds segfault in suicide
#5432 Sage Weil
10:57 AM CephFS Bug #5430: newfs makes ceph-mds segfault in suicide
... Sage Weil
10:52 AM CephFS Bug #5430 (Duplicate): newfs makes ceph-mds segfault in suicide
... Sage Weil
04:22 PM Bug #5431 (Fix Under Review): osd: dump_stuck test fails with ENXIO
https://github.com/ceph/teuthology/pull/16 Sage Weil
11:00 AM Bug #5431 (Resolved): osd: dump_stuck test fails with ENXIO
... Sage Weil
01:30 PM Subtask #5433 (Rejected): Factor out the ReplicatedPG object replication and client IO logic as a...
"work in progress":https://github.com/dachary/ceph/tree/wip-5433
h3. Moving code PG <=> ReplicatedPG
Prior to d...
Loïc Dachary
12:20 PM Subtask #5085 (Rejected): PG::merge_log should not have side effects other than on the log & miss...
It's probably too early in the process to do that kind of enhancement / modification. Loïc Dachary
10:37 AM Bug #5427 (Fix Under Review): mon: could not get service secret for auth subsystem
Sage Weil
09:26 AM Bug #5427: mon: could not get service secret for auth subsystem
the leader never ticked while paxos was healthy, i think because of the clock skew.
see wip-5427
Sage Weil
08:59 AM Bug #5427 (Resolved): mon: could not get service secret for auth subsystem
... Sage Weil
10:24 AM Linux kernel client Bug #5429: libceph: rcu stall, null deref in osd_reset->__reset_osd->__remove_osd
leaving plana56 in kdb Sage Weil
10:23 AM Linux kernel client Bug #5429 (Resolved): libceph: rcu stall, null deref in osd_reset->__reset_osd->__remove_osd
... Sage Weil
10:15 AM rbd Bug #5428: libceph: null deref in ceph_auth_reset
leaving plana09 in kdb Sage Weil
10:12 AM rbd Bug #5428: libceph: null deref in ceph_auth_reset
first guess was a shutdown race, but ceph_monc_stop() is flushing the msgr wq. also, no other threads appear to be i... Sage Weil
10:02 AM rbd Bug #5428 (Can't reproduce): libceph: null deref in ceph_auth_reset
... Sage Weil
08:51 AM rbd Bug #5426 (Resolved): librbd: mutex assert in perfcounters::tinc in librbd::AioCompletion::comple...
... Sage Weil
08:08 AM rbd Bug #5425: krbd: xfstest 89 hang, 'read_partial_message skipping long message'
... Sage Weil
08:03 AM rbd Bug #5425 (Resolved): krbd: xfstest 89 hang, 'read_partial_message skipping long message'
... Sage Weil

06/22/2013

02:15 PM Bug #5401: cuttlefish osd recovery slow
This one seems to "hide" the problem:
commit 9fe5611fdd7374654ad58185fa3988e216c52f08
Author: Stefan Priebe <s.prie...
Stefan Priebe
02:04 PM Bug #5401: cuttlefish osd recovery slow
max may also be negative but there is only a check for == 0 in OSD.cc Stefan Priebe
01:03 PM Bug #5401: cuttlefish osd recovery slow
It seems locking and unlocking the mutex isn't working correctly. So multiple threads seem to increase recovery_ops_act... Stefan Priebe
12:31 PM Bug #5401: cuttlefish osd recovery slow
g_conf->osd_recovery_max_active is 5 but on a freshly started osd i'm seeing log messages like these ...2013-06-22 ..... Stefan Priebe
06:10 AM Bug #5401: cuttlefish osd recovery slow
It did defer the recovery again and again. Some example log output:
2013-06-22 15:07:20.187878 7f8c3f49c700 15 osd...
Stefan Priebe
04:11 AM Bug #5401: cuttlefish osd recovery slow
I could hide this problem by raising "osd recovery delay start" => 120 but then the overall recovery time is very high... Stefan Priebe
12:04 PM Bug #5424: mon/Paxos.cc: 549: FAILED assert(begin->last_committed == last_committed)
Greg Farnum wrote:
> Shouldn't that cause LevelDB to block or throw an error or something? I'm not quite sure how it...
Sage Weil
10:22 AM Bug #5424: mon/Paxos.cc: 549: FAILED assert(begin->last_committed == last_committed)
Shouldn't that cause LevelDB to block or throw an error or something? I'm not quite sure how it leads to us not readi... Greg Farnum
09:53 AM Bug #5424 (Resolved): mon/Paxos.cc: 549: FAILED assert(begin->last_committed == last_committed)
all peons died with the above assert. the leader did this:... Sage Weil

06/21/2013

11:50 PM Bug #5195 (In Progress): "ceph-deploy mon create" fails when adding additional monitors
The problem occurs when a monitor is added on a host that was not in the initial list of cluster members.
Sequence...
Anonymous
05:47 PM rgw Bug #5422 (Resolved): object deletion should log the object tag
object tags are used to tell one instance of an object from another with the same name (to differentiate a deleted ob... Anonymous
05:07 PM Bug #5414 (Resolved): qa jobs failing with /var/lib/ceph/osd/*/journal root-owned from prior clus...
Sage Weil
04:30 PM Bug #5414: qa jobs failing with /var/lib/ceph/osd/*/journal root-owned from prior clusters
Sage Weil
10:29 AM Bug #5414 (Resolved): qa jobs failing with /var/lib/ceph/osd/*/journal root-owned from prior clus...
this is causing various runs to fail. so far i see it on upgrade runs. maybe the package update is triggering somet... Sage Weil
02:33 PM rgw Feature #4335 (Resolved): rgw: dr: sync processing state: define datastructures
Oooh! Oooh! Merged into the integration branch. Greg Farnum
02:15 PM Feature #5421: mon: add formatter option for various mon commands
I've done "auth export". The tricky bit is understanding the naming and levels of containers (that end up unnamed in... Dan Mick
02:13 PM Feature #5421 (Resolved): mon: add formatter option for various mon commands
Sage Weil
02:14 PM Feature #4983: OSD: namespaces pt 2 (caps)
Sage Weil
02:08 PM Feature #3983 (In Progress): api: create initial DRAFT REST API Design
Dan Mick
02:08 PM Feature #4463 (In Progress): api: RESTful client: demonstrate remaining N-1 commands JSON or XML
Dan Mick
02:07 PM Feature #4462 (In Progress): api: RESTful client: implement remaining N-1 commands JSON or XML
Dan Mick
02:07 PM Feature #4459 (In Progress): api: RESTful client: implement remaining commands JSON only
Dan Mick
02:07 PM Feature #4460 (In Progress): api: RESTful client: demonstrate remaining N-1 commands JSON only
Dan Mick
02:07 PM Fix #5278 (In Progress): osd: smarter recovery for small objects
Sage Weil
02:06 PM Feature #4457 (Resolved): api: add JSON schema/output protocol to rados.py
Sage Weil
02:06 PM Feature #4458 (Resolved): api: RESTful client: prototype 1 command JSON only
Sage Weil
02:06 PM Feature #4461 (Resolved): api: RESTful client: prototype 1 command JSON or XML
Sage Weil
02:06 PM Feature #4547 (Resolved): api: implement self-description for --admin-daemon commands
Sage Weil
02:04 PM Feature #4548 (Resolved): api: implement self-description for osd/mon tell commands
Sage Weil
02:04 PM Feature #4455 (Resolved): api: move '--format' into just another command argument
Sage Weil
01:58 PM rgw Feature #5420 (Rejected): rgw: integrate bucket metadata changes with bucket index log
for the sake of correctness and robustness Yehuda Sadeh
01:43 PM devops Feature #5013 (In Progress): build internal openstack + ceph cluster out of some burnupi
Sage Weil
01:43 PM devops Feature #5214 (In Progress): Kernel gitbuilders for rpm distros
Sage Weil
12:48 PM Feature #5419 (New): cephtool: sanitize extra args before configuring cluster handle
The '--admin-socket' argument is not a valid ceph tool argument. The option one would actually want is '--admin-daemo... Noah Watkins
12:02 PM CephFS Bug #5418: kceph: crash in remove_session_caps
kdb dumpall attached Sage Weil
12:02 PM CephFS Bug #5418 (Resolved): kceph: crash in remove_session_caps
... Sage Weil
11:33 AM rgw Feature #5417 (Resolved): rgw: separate bucket metadata object into pointer object and instance o...
Instead of having a single bucket metadata object, we'll separate into a bucket 'head' object that will point at the ... Yehuda Sadeh
11:07 AM rgw Bug #5416: --help output needs --rgw-zone option
For the ./radosgw-admin command that is. Anonymous
11:07 AM rgw Bug #5416 (Resolved): --help output needs --rgw-zone option
The new --rgw-zone option needs to be added to the --help output. Anonymous
10:30 AM rgw Bug #5415 (Resolved): rgw: failing valgrind leak checks
teuthology-2013-06-21_01:00:44-rgw-master-testing-basic/41708 and 41709 Sage Weil
10:13 AM rgw Feature #5356 (Rejected): rgw: RESTful api for bucket upstream zone + marker info
As with #5353, we're not doing it at the moment, not clear if we really need it. Yehuda Sadeh
10:10 AM rgw Feature #5355 (Rejected): rgw: get and set bucket upstream zone + marker info
We're not doing it atm. Discussed it with Greg, and we think that the replica log already covers this info, so there'... Yehuda Sadeh
09:36 AM Bug #5413 (Resolved): osd: valgrind issue in watch code (cuttlefish?)
commit:17d2745f095e7bb640dece611d7824d370ea3b81 Sage Weil
08:17 AM Bug #5413 (Resolved): osd: valgrind issue in watch code (cuttlefish?)
teuthology-2013-06-20_20:00:11-rados-cuttlefish-testing-basic/41401... Sage Weil
09:25 AM rbd Feature #4550: Create Qemu+RBD rpm package for RHEL+CentOS 6.3 on ceph.com
These packages have not been through QA yet. Anonymous
09:13 AM rbd Feature #4550: Create Qemu+RBD rpm package for RHEL+CentOS 6.3 on ceph.com
Great, thanks Gary. Have these been QAd?
Neil Levine

06/20/2013

11:13 PM Bug #5301: mon: leveldb crash in tcmalloc
Maciej Galkiewicz wrote:
> debian wheezy (7.0)
ok now it sounds a lot like #5239. i'm not able to reproduce this...
Sage Weil
11:01 PM Feature #3273 (Need More Info): mon: simple dm-crypt key management
http://marc.info/?l=ceph-devel&m=137179443405614&w=2 Sage Weil
10:40 PM rbd Feature #4550: Create Qemu+RBD rpm package for RHEL+CentOS 6.3 on ceph.com
Following the packaging discussions, the redhat packages were respun with the latest redhat sources + the ceph rados ... Anonymous
09:51 PM rgw Feature #4340 (In Progress): rgw: dr: data sync agent: implement full sync
Yehuda Sadeh
09:51 PM rgw Feature #5358 (In Progress): rgw: RESTful api for intra-region copy state
Yehuda Sadeh
09:50 PM rgw Feature #5356 (In Progress): rgw: RESTful api for bucket upstream zone + marker info
Yehuda Sadeh
09:50 PM rgw Feature #5341 (Fix Under Review): rgw: keep state for cross-rgw copy operations
Yehuda Sadeh
09:33 PM CephFS Fix #5399: timestamp changes on replayed mds request (pjd link 71)
probably need to extend the replayed request message to include the timestamps we got for the inode and dir so that t... Sage Weil
09:33 PM CephFS Fix #5399: timestamp changes on replayed mds request (pjd link 71)
- we send a create to mds
- get an ack, but it isn't journaled
- pjd stats the mtime/ctime/etc.
- mds restarts
- w...
Sage Weil
09:12 PM CephFS Bug #5290: mds: crash whilst trying to reconnect
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-06-20_13:32:57-fs-master-testing-basic/41231
logs in ...
Sage Weil
06:45 PM CephFS Bug #5333 (Fix Under Review): mds: segfault in MDLog::standby_trim_segments
wip-5333
this looks like a simple matter of not crashing if the segment list is empty. that at least covers this ...
Sage Weil
12:53 PM CephFS Bug #5333: mds: segfault in MDLog::standby_trim_segments
Just a note: maybe we missed a spot, but I remember doing a re-read head object, retry journal read whenever we get a... Greg Farnum
12:47 PM CephFS Bug #5333: mds: segfault in MDLog::standby_trim_segments
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-06-20_01:00:49-fs-next-testing-basic/40965
with ful...
Sage Weil
06:15 PM CephFS Bug #5380 (Resolved): osdc/Filer.cc: 163: FAILED assert(probe->known_size[p->oid] <= shouldbe)
Sage Weil
12:30 PM CephFS Bug #5380: osdc/Filer.cc: 163: FAILED assert(probe->known_size[p->oid] <= shouldbe)
Sage Weil
05:42 PM Bug #5412 (Resolved): doc bug: incorrect reference to monitor quorum requirements
http://ceph.com/docs/master/rados/deployment/ceph-deploy-mon/... Greg Farnum
02:56 PM Bug #4004 (Can't reproduce): Intermittent kernel build failures
Anonymous
02:52 PM Bug #4004: Intermittent kernel build failures
Closing since we haven't seen any problems for a couple months. Anonymous
02:55 PM devops Cleanup #5106 (Resolved): ceph_deploy: install/compile error on wheezy
This was happening due to syntax in the test programs that wasn't supported on Python 2.6. Not shipping the test direc... Anonymous
02:51 PM Bug #2176 (Resolved): dependencies not checked by autoconf
We've got all the current dependencies in the configure.ac checks and in the rpm or debian requirements. Anonymous
02:42 PM CephFS Bug #5411 (Resolved): teuthology: bad object dereference
... Greg Farnum
01:30 PM CephFS Fix #5268: mds: fix/clean up file size/mtime recovery code
See also #4485. Greg Farnum
01:30 PM CephFS Feature #4485: Improve "needsrecover" handling
See also #5268. Greg Farnum
01:24 PM CephFS Feature #1693 (In Progress): libcephfs: Support TRIM (hole punching)
See "[PATCH] Ceph-fuse: Punch hole support" from Li Wang. Greg Farnum
01:17 PM CephFS Feature #3541 (In Progress): mds: robust ino lookup using file backpointers
A bunch of this got done, but Sage isn't sure if the client -> LOOKUPINO messages are wired up to that infrastructure... Greg Farnum
12:58 PM Feature #4929: Erasure encoded placement group
the pad is only archived for so long, keep a "pad backup":http://pad.ceph.com/p/Erasure_encoding_as_a_storage_backend... Loïc Dachary
11:31 AM Bug #5409 (Resolved): mon: log command does not wait for commit
Sage Weil
10:43 AM Bug #5409 (Resolved): mon: log command does not wait for commit
the mon replies immediately, and may lose the msg if it restarts.... Sage Weil
10:35 AM rgw Feature #5408 (Resolved): rgw: turn off dr/geo logging
Yehuda Sadeh
10:26 AM Bug #5407 (Resolved): mon: is_writeable doesn't match wait_for_writeable on cuttlefish
fixes in master.. need a minimal cuttlefish backport. Sage Weil
09:58 AM rgw Feature #5406 (Resolved): rgw: a RESTful api to dump region map
Yehuda Sadeh
09:42 AM devops Bug #5405 (Resolved): ceph-deploy: transient pushy exception on install
... Sage Weil
08:37 AM devops Feature #5403 (Resolved): make ceph.com repos mirrorable
Sage Weil
08:01 AM Fix #5388: osd: localized reads (from replicas) reordered wrt writes
Since disabling localized reads I've not seen the problem occur, so thanks :) Mike Bryant
07:32 AM Bug #5401: cuttlefish osd recovery slow
Full backtrace (while recovering):
http://pastebin.com/raw.php?i=DWGHiNP6
2nd full backtrace:
http://pastebin.co...
Stefan Priebe
07:18 AM Bug #5401: cuttlefish osd recovery slow
Not sure if this helps:
# /etc/init.d/ceph stop osd.24; sleep 15; /etc/init.d/ceph start osd.24; sleep 10; inotif...
Stefan Priebe
06:59 AM Bug #5401: cuttlefish osd recovery slow
Some more information. While recovering i see nearly no CPU load. If i look at the disk activity i see a HUGE amount o... Stefan Priebe
06:53 AM Bug #5401: cuttlefish osd recovery slow
Lowering osd recovery max active makes it even worse as the overall recovery takes longer. So it's not the I/O ... Stefan Priebe
01:01 AM Bug #5401 (Can't reproduce): cuttlefish osd recovery slow
While the peering is fine now (Bug #5232) (latest upstream/cuttlefish) even without wip_cuttlefish_compact_on_startup... Stefan Priebe
06:59 AM Bug #5292: mon: monitor crashing due to not being in the monmap (no monmap to be in)
Thanks. Jeff Moskow
06:56 AM Bug #5292 (Resolved): mon: monitor crashing due to not being in the monmap (no monmap to be in)
You hit #5205 -- not the same issue, thus closing this ticket again. Joao Eduardo Luis
06:48 AM Bug #5292: mon: monitor crashing due to not being in the monmap (no monmap to be in)
Here you go. Please let me know if you need anything else.
Jeff
Jeff Moskow
06:42 AM Bug #5292 (Need More Info): mon: monitor crashing due to not being in the monmap (no monmap to be...
Okay, can you post the monitor's logs with 'debug mon = 20' ? Joao Eduardo Luis
02:46 AM Bug #5292: mon: monitor crashing due to not being in the monmap (no monmap to be in)
I did a reboot, just to make sure :-(
# ceph -v
ceph version 0.61.4 (1669132fcfc27d0c0b5e5bb93ade59d147e23404)
...
Jeff Moskow
06:48 AM rgw Bug #5402 (Resolved): rgw compilation problem on wip-rgw-geo-2 branch
The wip-rgw-geo-2 branch does not compile from a2cf14fe27a2da54e44b12a373b15b29c89d31b9.
In fact the method encode...
Christophe Courtaut
12:06 AM Fix #5232: osd: slow peering due to pg log rewrites
I'm fridudad ;-) Peering works fine but recovery does not. In my initial text of this tracker i also mentioned recove... Stefan Priebe

06/19/2013

11:14 PM Subtask #5213 (Resolved): unit tests for src/osd/PGLog.{cc,h}
Loïc Dachary
10:48 PM CephFS Bug #5289: mds closing stale session
Sage Weil wrote:
> this is caused when the client is not talking to the mds. can you verify the network is working, ...
chen atrmat
08:08 PM CephFS Bug #5380: osdc/Filer.cc: 163: FAILED assert(probe->known_size[p->oid] <= shouldbe)
The patch only fixes the root cause. It doesn't help if objects already have wrong size. Zheng Yan
06:06 PM rbd Bug #5222 (Resolved): krbd: use per-rbd_dev mutex to protect header updates
Sage Weil
06:06 PM rbd Bug #3925 (Resolved): krbd: sysfs write lockdep warnings
Sage Weil
06:04 PM Bug #5398 (Resolved): PGLog::rewind_divergent_log dereferencing iterator "p" past the end of its ...
Sage Weil
01:58 PM Bug #5398 (Fix Under Review): PGLog::rewind_divergent_log dereferencing iterator "p" past the end...
"pull request":https://github.com/ceph/ceph/pull/366 Loïc Dachary
12:59 PM Bug #5398 (Resolved): PGLog::rewind_divergent_log dereferencing iterator "p" past the end of its ...
Loïc Dachary
06:03 PM devops Bug #5161 (Resolved): daemons should create /var/run/ceph if it doesn't already exist
Sage Weil
06:02 PM Bug #5227 (Can't reproduce): ARM set up: rados test failed
Sage Weil
07:45 AM Bug #5227: ARM set up: rados test failed
Been trying to reproduce this on the talas but no joy so far. Still hammering cuttlefish. Joao Eduardo Luis
06:00 PM devops Bug #5387 (Resolved): ceph-disk: lockfile does not detect stale locks (dead parent process)
Sage Weil
06:00 PM devops Bug #5390 (Pending Backport): ceph-deploy osd create hangs
Sage Weil
05:28 PM devops Bug #5390: ceph-deploy osd create hangs
bah, trivial fcntl(2) is all we need here. Sage Weil
05:00 PM rgw Feature #4335 (Fix Under Review): rgw: dr: sync processing state: define datastructures
We're well beyond "defining" the data structures at this point; there's code and it's undergoing review. But this is ... Greg Farnum
04:12 PM devops Bug #5211 (Pending Backport): ceph-disk prepare: list_partitions() shouldn't return disks
Sage Weil
09:25 AM devops Bug #5211 (Fix Under Review): ceph-disk prepare: list_partitions() shouldn't return disks
pushed to wip-ceph-disk Sage Weil
04:02 PM CephFS Fix #5399 (New): timestamp changes on replayed mds request (pjd link 71)
Hmm, Sage points out this might be something else; reopening. Greg Farnum
03:56 PM CephFS Fix #5399 (Rejected): timestamp changes on replayed mds request (pjd link 71)
It's a time stamp check for things going backwards, and is failing due to out-of-sync clocks (over a network) being h... Greg Farnum
03:44 PM CephFS Fix #5399 (Resolved): timestamp changes on replayed mds request (pjd link 71)
teuthology-2013-06-19_10:46:59-fs-cuttlefish-master-basic 40138 40141 Sage Weil
03:50 PM Fix #5232: osd: slow peering due to pg log rewrites
This won't get backported. Some mitigating patches did go into cuttlefish. Also, there is wip_cuttlefish_compact_on... Samuel Just
03:45 PM Fix #5232 (Resolved): osd: slow peering due to pg log rewrites
Samuel Just
03:44 PM RADOS Tasks #5243: osd testing: create peering speed test
peering_speed_test.py, still needs to be added to ceph-qa-suite somewhere appropriate. Samuel Just
03:43 PM Fix #5278: osd: smarter recovery for small objects
wip-small-object-recovery, in progress Samuel Just
03:26 PM Bug #5389 (Resolved): osd: op_tp timeout on big cluster + radosmodel
Samuel Just
03:11 PM Bug #5389: osd: op_tp timeout on big cluster + radosmodel
lfn_unfound Samuel Just
03:21 PM devops Bug #5306: Xen based OSDs fail to start ceph-osd process
Here's the output of udevadm test
~ sudo udevadm test --action=add /sys/devices/vbd-51728/block/xvdb/xvdb1
run_c...
Yan-Fa Li
03:04 PM devops Bug #5306: Xen based OSDs fail to start ceph-osd process
OK, I updated using ceph-deploy to all my nodes to this version:
ceph version 0.61.3-29-g08304a7 (08304a7c46da7517...
Yan-Fa Li
11:43 AM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start
I'm still using the cluster with the modified ceph-mds program, it still works. I caused another power outage (this i... Jérôme Poulin
11:25 AM rgw Bug #5346: rgw: invalid read from RGWFormatter_Plain::write_data
well, swift is the only user of the plain formatter I guess. Yehuda Sadeh
11:14 AM rgw Bug #5346: rgw: invalid read from RGWFormatter_Plain::write_data
this appears to be triggered by the swift test.. doesn't happen with s3tests or readwrite etc
also present on cutt...
Sage Weil
11:15 AM Bug #4976 (Resolved): osd powercycle triggers object corruption on xfs
Sage Weil
09:09 AM Bug #4976: osd powercycle triggers object corruption on xfs
What do you mean "remove fadvise"? And is this a known upstream issue? Greg Farnum
10:37 AM rbd Documentation #3220 (Resolved): doc: more detail on QEMU+RBD page
http://ceph.com/docs/master/rbd/qemu-rbd/ John Wilkins
09:56 AM Bug #5292: mon: monitor crashing due to not being in the monmap (no monmap to be in)
It was backported to the branch that should soon become 0.61.4. Until then, you'll be able to find it on the gitbuil... Joao Eduardo Luis
08:36 AM Bug #5292: mon: monitor crashing due to not being in the monmap (no monmap to be in)
I just tried apt-get update and it didn't pull down any cuttlefish updates. Have they been released? Do I need to d... Jeff Moskow
07:43 AM Bug #5292 (Resolved): mon: monitor crashing due to not being in the monmap (no monmap to be in)
Fix for this went into next and cuttlefish branches as of last night; see #5256. Joao Eduardo Luis
08:59 AM devops Feature #5397 (New): terminate ceph-create-keys when its mon process dies
Right now, it's easy to build up a bunch of ceph-create-keys processes on a node because it is started when the monit... Greg Farnum
08:38 AM Bug #5375 (Resolved): squeeze tcmalloc leaks
i'll send a note to the email list. thanks for tracking this down! Sage Weil

06/18/2013

11:38 PM Bug #5375: squeeze tcmalloc leaks
OK this def. fixes it for me. So the squeeze perftool package seems to have a memory leak. Stefan Priebe
12:39 PM Bug #5375: squeeze tcmalloc leaks
Sage Weil
12:17 PM Bug #5375: squeeze tcmalloc leaks
OK backported: google-perftools from wheezy to squeeze, recompiled leveldb and ceph to reflect new google-perftools v... Stefan Priebe
11:18 AM Bug #5375: squeeze tcmalloc leaks
yeah, try wheezy. they won't update squeeze at this point anyway. Sage Weil
11:17 AM Bug #5375: squeeze tcmalloc leaks
The Debian Maintainer is: Daigo Moriwaki <daigo at debian.org>
Should i first try to use the one from wheezy on sq...
Stefan Priebe
11:14 AM Bug #5375: squeeze tcmalloc leaks
Hmm, looks like maybe we need to send a bug to upstream (Debian and/or libgoogle-perftools devs).
Sage, any ideas ...
Greg Farnum
11:02 AM Bug #5375: squeeze tcmalloc leaks
no change. Should i update my tcmalloc on debian squeeze?
[: ~]# pmap -x 11783|tail -n1
total kB 176290...
Stefan Priebe
10:56 AM Bug #5375: squeeze tcmalloc leaks
Yes. In particular the "heap release" bit is trying to more aggressively give memory back to the OS. We've observed i... Greg Farnum
10:48 AM Bug #5375: squeeze tcmalloc leaks
not should be now Stefan Priebe
10:48 AM Bug #5375: squeeze tcmalloc leaks
not it's using 1GB should i run these commands again? Stefan Priebe
10:57 PM rgw Feature #4335 (In Progress): rgw: dr: sync processing state: define datastructures
Yehuda Sadeh
10:57 PM rgw Feature #4338 (In Progress): rgw: multisite: metadata sync agent: implement delta changes sync
Yehuda Sadeh
10:56 PM rgw Feature #5341 (In Progress): rgw: keep state for cross-rgw copy operations
Yehuda Sadeh
10:56 PM rgw Bug #5357 (Fix Under Review): rgw: set and retrieve intra-region copy operation state
Yehuda Sadeh
10:33 PM devops Bug #5390: ceph-deploy osd create hangs
starting with the mercurial lock implementation, which uses a pid. see wip-ceph-disk-lock, tho still incomplete. Sage Weil
08:45 AM devops Bug #5390 (Fix Under Review): ceph-deploy osd create hangs
care to review the top patch in wip-ceph-disk?
alternatively, do you know of a replacement for lockfile that will ...
Sage Weil
08:38 AM devops Bug #5390 (In Progress): ceph-deploy osd create hangs
see also #5387. and i'll add the sigint handler to reduce the probability of this happening! Sage Weil
07:36 AM devops Bug #5390 (Resolved): ceph-deploy osd create hangs
On Ubuntu 13.04 with ceph 0.61.3 .
It hangs when creating a new osd using ceph-deploy.
ceph@ceph-node4:~/mycluste...
Da Chun Wu
10:24 PM Fix #5279 (In Progress): pipeline large object recovery
Sage Weil
10:24 PM Feature #4200 (In Progress): mon: break pgmap into separate leveldb keys
Sage Weil
10:08 PM Bug #5256 (Resolved): Upgraded bobtail->cuttlefish mon crashes, then can't resume the conversion
Sage Weil
09:45 PM Bug #4976 (Fix Under Review): osd powercycle triggers object corruption on xfs
the problem is that sync_file_range(2) and posix_fadvise(..DONTNEED) break xfs's internal write and zero ordering. ... Sage Weil
09:19 PM Bug #5372 (Duplicate): osd/SnapMapper.cc: 270: FAILED assert(check(oid))
I think this is caused by the same thing as 5269. Samuel Just
09:18 PM Bug #5320 (Resolved): osd/ReplicatedPG.cc: 4753: FAILED assert(!pg_log.get_missing().is_missing(s...
Samuel Just
09:00 PM Fix #5388: osd: localized reads (from replicas) reordered wrt writes
... Sage Weil
10:21 AM Fix #5388: osd: localized reads (from replicas) reordered wrt writes
Sage Weil
09:33 AM Fix #5388: osd: localized reads (from replicas) reordered wrt writes
I've just reproduced it with those log levels.
There was 1 master, 1 regionserver.
So I think the writing and readi...
Mike Bryant
08:04 AM Fix #5388 (Need More Info): osd: localized reads (from replicas) reordered wrt writes
Hi Mike,
I gather that the same data was just written by a different node in the cluster? And this is right near/...
Sage Weil
03:40 AM Fix #5388 (New): osd: localized reads (from replicas) reordered wrt writes
I'm using hbase, with the hadoop-cephfs bindings, on top of a ceph 0.61 cluster.
I'm seeing instances where reading ...
Mike Bryant
08:37 PM Bug #5395 (Can't reproduce): arm: osd: big performance differential between read/write
-- arm --
Raw /dev/rbd device
$ sudo dd if=/dev/zero of=/dev/rbd1 bs=4M count=128 conv=fdatasync
128+0 record...
Sage Weil
05:43 PM Bug #5084: osd: slow peering after osd restart (bobtail)
I tried wip_cuttlefish_compact_on_startup today. First, I upgraded one box to 0.61.3-47-g47f1bed-1precise.
Then, w...
Faidon Liambotis
05:02 PM devops Bug #5266 (Closed): the apt-get install instructions are missing an update
Verified fixes, thanks. Yan-Fa Li
02:32 PM devops Bug #5266 (Resolved): the apt-get install instructions are missing an update
See:
http://ceph.com/docs/master/start/quick-start-preflight/#install-ceph-deploy
http://ceph.com/docs/master/rado...
John Wilkins
04:59 PM devops Bug #5306: Xen based OSDs fail to start ceph-osd process
I will do this tomorrow. The xen box is temporarily down. Yan-Fa Li
01:46 PM devops Bug #5306 (Need More Info): Xen based OSDs fail to start ceph-osd process
can you retest on latest cuttlefish branch? (ceph-deploy install --dev=cuttlefish) Sage Weil
04:07 PM Documentation #3391: doc: add instructions on snapshot reversion
Actually, this isn't a documentation oversight. There's no means of rolling back an entire snapshot on a pool visible... John Wilkins
03:29 PM Bug #5226: Some PG stay in "incomplete" state
Well, I have pools on that cluster which are fine (thanks to 3 copies); so how can I recover a HEALTH_OK status, s... Olivier Bonvalet
01:05 PM Bug #5226 (Won't Fix): Some PG stay in "incomplete" state
nothing much to be done here if 2 disks were replaced/failed Sage Weil
03:18 PM rgw Documentation #5178 (Resolved): rgw: fix keystone openssl to nss conversion
See http://ceph.com/docs/master/radosgw/config/#integrating-with-openstack-keystone John Wilkins
03:17 PM devops Bug #5211: ceph-disk prepare: list_partitions() shouldn't return disks
Came up with this: https://gist.github.com/alram/33ea3360d5aa6a86e8a4
Alexandre Marangone
02:38 PM devops Bug #5211: ceph-disk prepare: list_partitions() shouldn't return disks
i think the right way to do this is to look to see if /sys/block/$disk/$part exists (e.g., /sys/block/sda/sda1) to tel... Sage Weil
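A minimal sketch of the sysfs check suggested above, assuming a partition shows up as a subdirectory of its parent disk under /sys/block. The function name `is_partition` and the `sys_block` parameter are hypothetical and are not taken from ceph-disk itself.

```python
import os


def is_partition(dev_name, sys_block="/sys/block"):
    """Return True if dev_name (e.g. 'sda1') exists as /sys/block/<disk>/<dev_name>.

    Hypothetical helper: sys_block is a parameter only so the check can be
    pointed at a test directory instead of the real sysfs tree.
    """
    for disk in os.listdir(sys_block):
        # A whole disk has no /sys/block/<disk>/<disk> entry, so this is
        # only true when dev_name is a partition of <disk>.
        if os.path.isdir(os.path.join(sys_block, disk, dev_name)):
            return True
    return False
```

With this approach a whole disk such as sda is reported as not being a partition, which is the distinction list_partitions() needs.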
01:43 PM devops Bug #5211: ceph-disk prepare: list_partitions() shouldn't return disks
Alexandre, please implement one of the suggestions you mentioned. Ian Colle
02:21 PM devops Bug #5182 (Won't Fix): ceph-disk looks like it tries to mark preexisting OSD partitions with the ...
this is correct.. if you do ceph-disk prepare /dev/sdb1 (a partition) we don't touch the partition type. if you do /... Sage Weil
02:14 PM devops Bug #5161 (Pending Backport): daemons should create /var/run/ceph if it doesn't already exist
Sage Weil
02:02 PM devops Bug #5338 (In Progress): need rpm packages built for libapache-mod-fastcgi
Sage Weil
01:53 PM devops Bug #5263 (Resolved): Python Error While Installing ceph-deply on debian wheezy
Sage Weil
01:52 PM devops Bug #5299 (Won't Fix): ceph-deploy fails with cryptic error message if expected directories not f...
Sage Weil
01:52 PM Bug #5392: osd: unfound objects from thrashing
There seem to be a lot of threads waiting on throttle. Unfortunately, the test timed out before I could get more inf... Samuel Just
01:50 PM Bug #5392: osd: unfound objects from thrashing
... Samuel Just
11:21 AM Bug #5392 (Resolved): osd: unfound objects from thrashing
... Sage Weil
01:51 PM devops Bug #5342 (Resolved): Make tcmalloc default on ARM
Sage Weil
01:51 PM devops Bug #5339 (Resolved): ceph-deploy suite failures, 'insufficient osds'
Sage Weil
01:50 PM devops Bug #5359 (Resolved): ceph-deploy: install and purge commands on rhel sometimes errors out though...
Sage Weil
01:49 PM devops Bug #5066 (Resolved): Problems with ceph-deploy debs
Sage Weil
01:48 PM devops Bug #5199 (Resolved): ceph-deploy: on fedora18, osd create command doesnt seem to mount the disks
Sage Weil
01:47 PM Bug #5301: mon: leveldb crash in tcmalloc
debian wheezy (7.0) Maciej Galkiewicz
01:01 PM Bug #5301: mon: leveldb crash in tcmalloc
what distro are you using? this sounds a bit like #5239 Sage Weil
01:47 PM devops Bug #5258 (Resolved): ceph-deploy: forgetkeys command could delete existing keyring files without...
commit:953bee3cc66d19ef9b201299fc82c270587936a9 Sage Weil
01:46 PM devops Bug #4916 (Resolved): ceph-deploy: mon create fails on bobtail branch in centos 6.3
Sage Weil
01:45 PM devops Bug #5334 (Resolved): ceph-deploy: "modules not installed"
Sage Weil
01:40 PM devops Bug #5345 (Need More Info): ceph-disk: handle less common device names
Sage Weil
01:32 PM Bug #5389 (Resolved): osd: op_tp timeout on big cluster + radosmodel
Samuel Just
12:18 PM Bug #5389: osd: op_tp timeout on big cluster + radosmodel
... Samuel Just
07:26 AM Bug #5389 (Resolved): osd: op_tp timeout on big cluster + radosmodel
no errors in kern.log, so we can't blame this on the kernel.... Sage Weil
01:29 PM Bug #4268 (Can't reproduce): mon: timecheck: teuthology task fails due to unreported timecheck fr...
Joao Eduardo Luis
01:28 PM Bug #4189 (Resolved): osd/ReplicatedPG.cc: 4994: FAILED assert(log.objects.count(soid) ...
Samuel Just
01:28 PM Bug #4265 (Won't Fix): ceph-deploy new doesn't support multiple monitors on one host.
Sage Weil
01:28 PM Bug #4216 (Resolved): osd: dbojectmap incorrectly skipping ops
Sage Weil
01:27 PM Bug #3683 (Resolved): mon: leak of MMonPaxos
Sage Weil
01:26 PM Bug #3723 (Can't reproduce): ceph osd down command reports incorrectly
Sage Weil
01:26 PM Bug #3607 (Resolved): FileStore::_write conditional code for HAVE_SYNC_FILE_RANGE seems wrong
Sage Weil
01:25 PM Bug #3593 (Can't reproduce): MDS crash in MDCache.cc _recovered()
Sage Weil
01:25 PM Bug #2563 (Resolved): leveldb corruption
Samuel Just
01:24 PM Bug #3576 (Resolved): scripe scripts broken after upgrade to 0.55
Sage Weil
01:24 PM Bug #3182 (Can't reproduce): No JSON object could be decoded - failure in the nightly run
Sage Weil
01:24 PM Bug #3287 (Resolved): OSD dies when using zfs
Sage Weil
01:23 PM Bug #3458 (Can't reproduce): aio enabled but not used
Sage Weil
01:23 PM Bug #3644 (Resolved): ObjectCacher: discard_set ignores waiters
Sage Weil
01:23 PM Bug #3771 (Resolved): ceph does not have startup scripts in Centos
Sage Weil
01:23 PM Bug #3537 (Won't Fix): Logs can run root out of space and crash ceph cluster (need more aggressiv...
Sage Weil
01:21 PM Bug #4041 (Can't reproduce): mon: Single-Paxos: on Paxos, leader didn't trim old versions
Sage Weil
01:20 PM Bug #2896 (Won't Fix): ceph pg dump has empty hb_out field
it's vestigial. Sage Weil
01:18 PM Bug #4523 (Duplicate): osd: read stats not updated
Sage Weil
01:18 PM Bug #4723 (Can't reproduce): FAILED assert(!db->create_and_open(std::cerr)) after IO Error.
Ian Colle
01:15 PM Bug #5052 (Duplicate): kclient_workunit_misc test failed in the nightlies
Sage Weil
01:14 PM Bug #5074 (Can't reproduce): nightlies: timed out waiting for admin socket of restarted osd
Sage Weil
01:13 PM Bug #5059 (Won't Fix): PGs can get stuck degraded if OSD removed before being out
Sage Weil
01:11 PM Bug #5082 (Can't reproduce): OSD wrongly marked as down
Sage Weil
01:10 PM Bug #4856 (Won't Fix): monitor: upgrades produce "client did not provide supported auth type" in log
Sage Weil
01:07 PM Bug #3143 (Won't Fix): Obsync object verification takes too long
https://github.com/dreamhost/obsync Sage Weil
01:07 PM Bug #5173 (Can't reproduce): ceph scrub found missing pg object
Sage Weil
01:04 PM Bug #5205: mon: FAILED assert(ret == 0) on config's set_val_or_die() from pick_addresses()
Sage Weil
01:04 PM Bug #5292 (In Progress): mon: monitor crashing due to not being in the monmap (no monmap to be in)
Monitor is not in the monmap because there is no monmap. This should be due to a sync bug (related to #5256) that re... Joao Eduardo Luis
12:57 PM Bug #5343 (Resolved): mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
Sage Weil
12:51 PM CephFS Bug #5289 (Can't reproduce): mds closing stale session
this is caused when the client is not talking to the mds. can you verify the network is working, and ceph-fuse is hea... Sage Weil
12:50 PM Bug #5288 (Resolved): ceph.py: catch rados errors and print them nicely
Sage Weil
12:49 PM Bug #4179 (Resolved): osd: memory leak during deep scrub on bobtail
Sage Weil
12:49 PM Bug #5163 (Can't reproduce): filestore: ENOTEMPTY on object removal
Samuel Just
12:48 PM Bug #5246 (Resolved): mon crashing on pool/pg creation with wip-mon
Sage Weil
12:48 PM Bug #5157 (Resolved): install: unable to pull ceph rpm packages on fedora18
Sage Weil
12:44 PM Bug #3829 (Can't reproduce): new osd added to the cluster is not receiving data
Sage Weil
12:43 PM Bug #4764 (Can't reproduce): ceph -w sometimes does not reflect clean pgs
Sage Weil
12:42 PM Bug #5072 (Can't reproduce): mon: segfault on leveldb::Table::Open() during monitor start
Sage Weil
12:42 PM Bug #4791 (Can't reproduce): osd/ReplicatedPG.cc: 7053: FAILED assert(r >= 0) in scan_range
Sage Weil
12:35 PM Bug #5238 (Resolved): osd: slow recovery (uselessly dirtying pg logs during peering)
Sage Weil
12:01 PM devops Feature #5393 (Rejected): ceph-disk: prepare should warn when using partitions
When using ceph-disk prepare with already created partitions, we do not set the partition uuid, thus the udev rules a... Alexandre Marangone
11:11 AM Bug #5383 (Resolved): arm write EFBIG
6b52acc8502ec16e2d0b89d8caf6235ec45778cb Samuel Just
10:54 AM Bug #5069: monitor crashed during mon thrash in nightlies
Forgot to mention that the sync flag is set on the store. Sage pointed out that the real issue here is that we're al... Joao Eduardo Luis
10:32 AM Bug #5069: monitor crashed during mon thrash in nightlies
I've been able to reproduce this on some locked nodes that were hammering the monitors pretty hard for the past week.... Joao Eduardo Luis
09:34 AM CephFS Bug #5379 (Resolved): mds/ceph-fuse hang on mount
Sage Weil
08:26 AM rbd Bug #5391 (Duplicate): krbd: crash in rbd_obj_request_create -> strlen
... Sage Weil
07:48 AM Bug #5272 (Can't reproduce): Updating ceph from 0.61.2 to 0.61.3 obviously changes tunables of ex...
Sage Weil
12:47 AM Bug #5272: Updating ceph from 0.61.2 to 0.61.3 obviously changes tunables of existing cluster
As I re-encountered the same issue without upgrading, just restarting MDS daemon, I think this tracker issue may be c... To Pro

06/17/2013

11:59 PM Bug #5375: squeeze tcmalloc leaks
[: ~]# pmap -x 11783|tail -n1
total kB 1547412 688752 685152
[: ~]# ceph -m 10.255.0.100:6789 heap stat...
Stefan Priebe
09:40 AM Bug #5375: squeeze tcmalloc leaks
Stefan, could you please try (for your monitor's IP and PORT):... Joao Eduardo Luis
06:31 AM Bug #5375 (Resolved): squeeze tcmalloc leaks
While running cuttlefish 0.61.3 or 08304a7c46da7517319b7db0b64d1c4f54771472
i'm seeing high memory usage of ceph-mo...
Stefan Priebe
09:15 PM CephFS Bug #5381: ceph-fuse: stuck with disconnected inodes on shutdown
This is different from #4850. In issue #4850, disconnected inodes have no cap. In this issue, all disconnected inodes... Zheng Yan
01:32 PM CephFS Bug #5381: ceph-fuse: stuck with disconnected inodes on shutdown
Good chance this is a duplicate of #4850 (though that's fsstress, so maybe not). Greg Farnum
01:22 PM CephFS Bug #5381 (Resolved): ceph-fuse: stuck with disconnected inodes on shutdown
Seen this at least 2x in the last few days:... Sage Weil
08:37 PM rgw Bug #5357 (In Progress): rgw: set and retrieve intra-region copy operation state
Yehuda Sadeh
08:36 PM rgw Bug #5351 (Resolved): rgw: make sure wip-rgw-geo passes gitbuilder
Yehuda Sadeh
08:35 PM devops Bug #5387 (Resolved): ceph-disk: lockfile does not detect stale locks (dead parent process)
python lockfile class does not detect when the prior lock owner process is gone. we should switch to a class that do... Sage Weil
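One way to detect a dead lock owner is to store the owner's pid in the lock file and probe it with signal 0. The sketch below assumes a POSIX system and a lock file that contains only a decimal pid. The names `pid_alive` and `lock_is_stale` are hypothetical, and this is not the code from wip-ceph-disk-lock.

```python
import errno
import os


def pid_alive(pid):
    """Return True if a process with this pid currently exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without sending anything
    except OSError as e:
        # EPERM means the process exists but belongs to another user.
        return e.errno == errno.EPERM
    return True


def lock_is_stale(lock_path):
    """Return True if no live process can be identified as the lock owner.

    Assumption: a missing or unparsable lock file counts as stale.
    """
    try:
        with open(lock_path) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return True
    return not pid_alive(pid)
```

A pid can be reused by an unrelated process, so a check like this only reduces the chance of waiting forever on a dead owner; it does not eliminate it.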
05:43 PM CephFS Bug #5380: osdc/Filer.cc: 163: FAILED assert(probe->known_size[p->oid] <= shouldbe)
see commit a41bad1a9b(ceph: re-calculate truncate_size for strip object) Zheng Yan
01:18 PM CephFS Bug #5380 (Resolved): osdc/Filer.cc: 163: FAILED assert(probe->known_size[p->oid] <= shouldbe)
on mds shutdown... Sage Weil
05:34 PM rgw Feature #5354 (Fix Under Review): rgw: intra-region object copy should also set mtime on object
Yehuda Sadeh
04:44 PM CephFS Bug #5379: mds/ceph-fuse hang on mount
Sage Weil
12:52 PM CephFS Bug #5379 (Resolved): mds/ceph-fuse hang on mount
have observed several times ceph-fuse hanging on getattr(#1). latest job was... Sage Weil
03:55 PM Bug #5373 (Can't reproduce): osd: dump_stuck test fails on tell
Sage Weil
03:49 PM devops Bug #5194 (Resolved): udev does not start osd after reboot on wheezy or el6 or fedora
Sage Weil
02:54 PM devops Bug #5194 (Fix Under Review): udev does not start osd after reboot on wheezy or el6 or fedora
now works on rhel, centos, wheezy, precise. f18 still has the mon start issue. Sage Weil
02:16 PM Bug #5383 (Resolved): arm write EFBIG
2013-06-17 15:05:31.066237 a6919420 0 -- 10.214.156.115:6800/7870
submit_message osd_op_reply(30
...
Samuel Just
02:09 PM CephFS Bug #5382: mds: failed objecter assert on shutdown
Sorry, logs at /a/teuthology-2013-06-15_01:00:44-fs-next-testing-basic/36375 Greg Farnum
02:07 PM CephFS Bug #5382 (Can't reproduce): mds: failed objecter assert on shutdown
I haven't been through this completely, but it looks like the mds went laggy, and then it received a SIGTERM (the tes... Greg Farnum
01:26 PM Bug #5269 (Resolved): osd: EEXIST on mkcoll
Samuel Just
10:08 AM Bug #5269: osd: EEXIST on mkcoll
ubuntu@teuthology:/a/teuthology-2013-06-17_01:00:05-rados-master-testing-basic/37637 Sage Weil
12:36 PM rgw Bug #5362: rgw: failure when listing objects with prefix that starts with underscore
I confirmed that this was tested, and I built it on all the branches:
next as of commit:d582ee2438a3bd307324c5f44491...
Greg Farnum
12:35 PM rgw Bug #4600: rgw: list bucket broken when marker start with underscore
Cherry-picked this commit into bobtail as well, in commit:a8f9d57a15ad7a69d53aa8fc6090fd1b394b616a. It got missed in ... Greg Farnum
12:25 PM Bug #5366 (Resolved): assert in ODSMap::is_blacklisted()
Sage Weil
09:40 AM Bug #5366: assert in ODSMap::is_blacklisted()
Sam, please review. Ian Colle
12:24 PM CephFS Bug #5368 (Resolved): ceph-fue: fsx-mpi hangs in _sync_read
commit:ee40c217e373b538e227f7218b09c1c794b4124a Sage Weil
11:49 AM rbd Bug #4446: librbd: crash from opensolaris vm
I just upgraded to KVM 1.4.2 -- same problem. Jeff Moskow
11:04 AM rgw Bug #5378 (Resolved): make radosgw-admin user rm idempotent
It would be extremely useful for radosgw-admin user rm to be idempotent, specifically so that it will return success ... JuanJose Galvez
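As a rough illustration of what idempotent removal means here: deleting a user that is already gone is reported as success rather than as an error. The `users` dict and the `remove_user` function below are hypothetical stand-ins, not radosgw-admin internals.

```python
def remove_user(users, uid):
    """Remove uid from the users mapping and return 0 for success.

    Hypothetical sketch: a missing uid is not an error, so calling this
    twice in a row has the same result as calling it once.
    """
    users.pop(uid, None)  # no KeyError if the user was already removed
    return 0
```

This lets provisioning scripts call the removal step unconditionally without first checking whether the user exists.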
09:42 AM Bug #5340 (Resolved): Bad arguments to zero will cause OSD to crash
Sage Weil
05:42 AM rgw Bug #5374 (Resolved): Avoid relying on keystone's admin token
The current Keystone integration requires knowledge of the keystone admin token. The keystone admin token is for Keys... Soren Hansen

06/16/2013

08:36 PM Bug #5269: osd: EEXIST on mkcoll
Running with logging overnight to reproduce. Samuel Just
08:08 PM Bug #5269: osd: EEXIST on mkcoll
and... Sage Weil
07:58 PM Bug #5269: osd: EEXIST on mkcoll
don't think this was #5270.. just hit it on... Sage Weil
08:09 PM Bug #5373 (Can't reproduce): osd: dump_stuck test fails on tell
... Sage Weil
04:52 PM Bug #5372 (Duplicate): osd/SnapMapper.cc: 270: FAILED assert(check(oid))
... Sage Weil
10:04 AM Bug #5370 (Resolved): ceph tool occasionally hangs
Sage Weil
10:01 AM Bug #5370: ceph tool occasionally hangs
fixed by ceph-qa-suite commit:73413642d7a1a1aa09cfa240cadba925b1ba812d Sage Weil
05:50 AM CephFS Bug #5367: multiclient tests: kernel mount gets EPERM
kclient and MDS never return -EACCES. was ior executed with root privilege? Zheng Yan

06/15/2013

10:10 PM rgw Feature #4310 (Fix Under Review): rgw: multisite: radosgw changes: copy across regions
Yehuda Sadeh
10:09 PM rgw Bug #5362 (Fix Under Review): rgw: failure when listing objects with prefix that starts with unde...
Yehuda Sadeh
10:09 PM rgw Feature #5352 (Fix Under Review): rgw: metadata get should also dump mtime
Yehuda Sadeh
10:08 PM rgw Feature #5353 (Fix Under Review): rgw: metadata put should apply mtime if set
Yehuda Sadeh
08:49 PM Bug #5366: assert in ODSMap::is_blacklisted()
commit:f25f212027294e5107fc9938e67d31879c171088 merged to fix the weekend qa runs. still should get a review. Sage Weil
09:10 AM Bug #5366 (Resolved): assert in ODSMap::is_blacklisted()
wip pushed Sage Weil
08:46 PM Bug #5371 (Resolved): idempotent filestore test failure
... Sage Weil
08:10 PM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Sage Weil
08:09 PM devops Bug #5363 (Resolved): specfile: ceph does not start on reboot
Sage Weil
08:09 PM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
update:
* wheezy is working well.
* fedora is failing only because the mon doesn't start on boot. see #5369
* r...
Sage Weil
07:57 PM Bug #5370 (Resolved): ceph tool occasionally hangs
"description": "/var/lib/teuthworker/archive/teuthology-2013-06-15_01:00:11-rados-next-testing-basic/36197",
...
Sage Weil
07:50 PM devops Bug #5369 (Resolved): fedora18: sysvinit doesn't start mon on reboot
mon log indicates it can't bind to the ip, suggesting it is starting before the network. however, note that... Sage Weil
07:46 PM CephFS Bug #5367: multiclient tests: kernel mount gets EPERM
mpi-fsx also gets EPERM. Sage Weil
07:15 PM CephFS Bug #5367 (Resolved): multiclient tests: kernel mount gets EPERM
... Sage Weil
07:45 PM CephFS Bug #5368 (Resolved): ceph-fue: fsx-mpi hangs in _sync_read
infinite loop in _sync_read() due to a short read. see wip-client-sync. Sage Weil
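For illustration only, here is a generic read loop that cannot spin forever on short reads, because it advances by the number of bytes actually returned and stops at end of file. It is not the fix from wip-client-sync; `read_fully` and the `read_fn` callback are hypothetical names.

```python
def read_fully(read_fn, offset, length):
    """Read up to `length` bytes starting at `offset`.

    read_fn(offset, length) is a hypothetical callback that may return fewer
    bytes than requested (a short read) and returns b"" at end of file.
    """
    chunks = []
    remaining = length
    while remaining > 0:
        data = read_fn(offset, remaining)
        if not data:
            break  # EOF: stop instead of retrying the same request forever
        chunks.append(data)
        offset += len(data)     # advance by what was actually read
        remaining -= len(data)
    return b"".join(chunks)
```

The two properties that matter are that progress is measured in bytes actually read and that an empty read ends the loop.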
08:19 AM Bug #5365 (Rejected): Massive OSD flaps
Note that the current development releases include more robust heartbeat checks and a backoff behavior that prevents ... Sage Weil
03:10 AM Bug #5365: Massive OSD flaps
I found a networking bug (not full connectivity). The ticket can be closed.
The reason was that new osd host was unable ...
Ivan Kudryavtsev
03:05 AM Bug #5365: Massive OSD flaps
During upgrade I restarted services on all nodes. Ivan Kudryavtsev
02:55 AM Bug #5365: Massive OSD flaps
I upgraded full cluster to
new: ceph version 0.56.6 (95a0bda7f007a33b0dc7adf4b330778fa1e5d70c)
but it still flap...
Ivan Kudryavtsev
02:31 AM Bug #5365 (Rejected): Massive OSD flaps
Hi, all.
Today I added one more node to my CEPH and it became unstable, i mean here that it's unable to work with ...
Ivan Kudryavtsev

06/14/2013

11:28 PM rgw Feature #5349 (Fix Under Review): rgw: intra-region object copy
Yehuda Sadeh
11:01 AM rgw Feature #5349 (Resolved): rgw: intra-region object copy
This should also include the ability to copy namespaced objects (to be able to copy multipart upload parts). Yehuda Sadeh
06:11 PM rgw Bug #5348 (Fix Under Review): rgw: missing copy constraints checks for inter region user object copy
Yehuda Sadeh
11:00 AM rgw Bug #5348 (Resolved): rgw: missing copy constraints checks for inter region user object copy
Yehuda Sadeh
06:04 PM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
rhel seems to be working, fedora18 is acting very strange. Sage Weil
02:06 PM devops Bug #5194 (In Progress): udev does not start osd after reboot on wheezy or el6 or fedora
thanks- i now see the problem (and can reproduce it here, yay!). testing a fix Sage Weil
01:09 PM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
Hi Sage,
attached is the current syslog.
I started "partprobe /dev/sdb" at Jun 14 21:57:06 and "partprobe /dev/...
Robert Sander
01:04 PM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
Can you generate and attach a udev log after the reboot? Actually, ideally,
- reboot
- note the time
- run part...
Sage Weil
12:59 PM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
Sage Weil wrote:
> Can you grab
>
> https://github.com/ceph/ceph/blob/master/src/ceph-disk and copy it to /usr/...
Robert Sander
12:43 PM devops Bug #5194 (Need More Info): udev does not start osd after reboot on wheezy or el6 or fedora
Hi Robert,
Can you grab
https://github.com/ceph/ceph/blob/master/src/ceph-disk and copy it to /usr/sbin
https:...
Sage Weil
12:42 PM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
Sage Weil
05:48 PM Bug #5343: mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
nope wrong ticket; ignore Sage Weil
05:32 PM Bug #5343: mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
Sage, was that reply intended for this ticket? If it was I'm surely missing something... Joao Eduardo Luis
01:03 PM Bug #5343: mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
Can you generate and attach a udev log after the reboot? Actually, ideally,
- reboot
- note the time
- run part...
Sage Weil
12:44 PM Bug #5343 (Pending Backport): mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
pushed.. will backport once we have done more testing Sage Weil
10:45 AM Bug #5343: mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
I ran the following test for an already existing single-monitor setup:
* generate monmap with random fsid
* injec...
Joao Eduardo Luis
09:28 AM Bug #5343: mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
Greg pointed out that it's likely the fsid issue results from messing around with the monmap's fsid. Setting up a te... Joao Eduardo Luis
09:01 AM Bug #5343: mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
Running gdb, looks like the 2810's incremental fsid is different from the OSDMap's fsid:... Joao Eduardo Luis
07:41 AM Bug #5343 (In Progress): mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
Joao Eduardo Luis
07:33 AM Bug #5343 (Resolved): mon: infinite OSDMonitor::update_from_paxos() on single-monitor setup
A user on ceph-users shared a log containing a most interesting behavior happening on OSDMonitor::update_from_paxos()... Joao Eduardo Luis
03:44 PM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
sandon put wheezy on these mira for us to test this locally: mira09[456] Sage Weil
03:04 PM devops Bug #5363 (Resolved): specfile: ceph does not start on reboot
testing fix Sage Weil
02:54 PM rgw Bug #5347 (Fix Under Review): rgw: bucket marker should include original zone name
Yehuda Sadeh
11:00 AM rgw Bug #5347 (Resolved): rgw: bucket marker should include original zone name
To avoid marker collisions Yehuda Sadeh
02:41 PM rgw Bug #5362 (Resolved): rgw: failure when listing objects with prefix that starts with underscore
Yehuda Sadeh
02:40 PM Bug #5062 (Can't reproduce): mon: 0.61.2 asserts on AuthMonitor during monitor start
Sage Weil
02:39 PM devops Feature #5361 (Resolved): ceph-all should start after networking bug before runlevel [2345]
just in case other system services rely on it being up. Sage Weil
02:38 PM devops Bug #5248 (Resolved): upstart: ceph-all job is starting too soon
hmm opening a separate bug for the 'start earlier than this' part. Sage Weil
12:39 PM devops Bug #5248: upstart: ceph-all job is starting too soon
changing this to runlevel [2345] for now. Sage Weil
02:38 PM devops Feature #3302 (Resolved): ceph-disk: activate-journal, and matching udev rule
Sage Weil
02:23 PM devops Feature #3302: ceph-disk: activate-journal, and matching udev rule
commit:a2a78e8d16db0a71b13fc15457abc5fe0091c84c Sage Weil
02:18 PM devops Bug #5189 (Resolved): ceph-deploy disk prepare fails silently
this is now working with the fixes from #4984. Sage Weil
02:14 PM devops Bug #4984 (Resolved): ceph_deploy: osd create succeeds with an error message (partprobe returns e...
woot! tested and backported to cuttlefish!
still issues on reboot with wheezy... #5194
Sage Weil
01:08 PM Bug #5326 (Resolved): mon: osd crush add ... comamdn broken
commit:9a7ed0b3f8df5bd74133f216bad61ae71eab0816, tho this actual error was a problem with the ceph cli sometime in te... Sage Weil
12:50 PM CephFS Bug #5360 (Rejected): ceph-fuse: failing smbtorture tests
We're failing the maxfid test when samba is backed by a ceph-fuse mount. It seems to be an inconsistent (this is the ... Greg Farnum
11:39 AM devops Bug #5359 (Resolved): ceph-deploy: install and purge commands on rhel sometimes errors out though...
install command on rhel platform errors out though the command is successful and ceph is installed,
the error mess...
Tamilarasi muthamizhan
11:09 AM rgw Feature #5358 (Resolved): rgw: RESTful api for intra-region copy state
Yehuda Sadeh
11:08 AM rgw Bug #5357 (Resolved): rgw: set and retrieve intra-region copy operation state
Yehuda Sadeh
11:07 AM rgw Feature #5356 (Rejected): rgw: RESTful api for bucket upstream zone + marker info
Yehuda Sadeh
11:07 AM rgw Feature #5355 (Rejected): rgw: get and set bucket upstream zone + marker info
Yehuda Sadeh
11:06 AM rgw Feature #5354 (Resolved): rgw: intra-region object copy should also set mtime on object
Yehuda Sadeh
11:05 AM rgw Feature #5353 (Resolved): rgw: metadata put should apply mtime if set
Yehuda Sadeh
11:05 AM rgw Feature #5352 (Resolved): rgw: metadata get should also dump mtime
Yehuda Sadeh
11:04 AM rgw Bug #5351 (Resolved): rgw: make sure wip-rgw-geo passes gitbuilder
Yehuda Sadeh
11:03 AM rgw Feature #5350 (New): rgw: copy object metadata should include omap data for object
That's needed for multipart head object copy Yehuda Sadeh
10:56 AM devops Bug #5339: ceph-deploy suite failures, 'insufficient osds'
changing the priority as this has nothing to do with ceph-deploy,
leaving it in this state until the nightlies succ...
Tamilarasi muthamizhan
10:18 AM Bug #5252 (Resolved): osd: EINVAL from truncate causes osd to crash
commit:f1b6bd7988ab964c9167eff7bea51a49573f5175 Sage Weil
08:50 AM rgw Bug #5346 (Resolved): rgw: invalid read from RGWFormatter_Plain::write_data
ubuntu@teuthology:/a/teuthology-2013-06-14_01:00:36-rgw-master-testing-basic/35856$ zless ./remote/ubuntu@plana63.fro... Sage Weil
08:35 AM devops Bug #5345 (Resolved): ceph-disk: handle less common device names
/dev/sdaa*
/dev/cciss/c0d0p1
etc.
Sage Weil
08:21 AM rgw Bug #5344 (Resolved): rgw: make list of bucket placement pools index configurable
The object containing the list of placement pools is hard coded, make it configurable (through ceph.conf). Yehuda Sadeh

06/13/2013

07:43 PM CephFS Bug #5333: mds: segfault in MDLog::standby_trim_segments
I think it's an old race. The standby MDS gets the pos of journal head, then reads the corresponding journal object. ... Zheng Yan
02:02 PM CephFS Bug #5333: mds: segfault in MDLog::standby_trim_segments
I see that Yan changed one line in this function recently (which shouldn't have had any impact), but other than that ... Greg Farnum
05:41 PM devops Bug #5339: ceph-deploy suite failures, 'insufficient osds'
modified ceph-deploy task to throw appropriate exceptions in case of failures.
most of the ceph-deploy tests have ...
Tamilarasi muthamizhan
10:48 AM devops Bug #5339 (Resolved): ceph-deploy suite failures, 'insufficient osds'
The cluster is NOT operational due to insufficient OSDs Sage Weil
04:37 PM devops Bug #5342 (Resolved): Make tcmalloc default on ARM
tcmalloc usage needs to be enabled on ARM. While packages are not available on all platforms yet, the locally compil... Anonymous
04:03 PM devops Feature #3302 (Fix Under Review): ceph-disk: activate-journal, and matching udev rule
this was causing unreliable ubuntu activation, at least in my case Sage Weil
03:01 PM rgw Feature #5341 (Resolved): rgw: keep state for cross-rgw copy operations
Need to implement a new class that'd index the data. Yehuda Sadeh
01:52 PM devops Bug #5283: Ceph-deploy can't handle /dev/disk/by-* device paths
With the by-id path which does not have embedded colons:... Anonymous
12:52 PM devops Bug #5283: Ceph-deploy can't handle /dev/disk/by-* device paths
glowell@gary-ubuntu-01:~/ceph-deploy$ ./ceph-deploy osd create gary-ubuntu-01:/dev/disk/by-path/pci-0000:00:07.0-scsi... Anonymous
12:28 PM devops Bug #5309 (Closed): ceph-deploy mon create fails to start monitor daemon
Issue is no longer occurring after recent commits to ceph-deploy. Not sure which one fixed it but around 10 June. Anonymous
10:49 AM devops Bug #4984 (In Progress): ceph_deploy: osd create succeeds with an error message (partprobe return...
Sage Weil
10:49 AM Bug #5329 (Resolved): ceph osd tell * injectargs broken
commit:3abd2d8bc94ab77364345e3f830cfb83124df31d Sage Weil
10:49 AM Bug #5340 (Resolved): Bad arguments to zero will cause OSD to crash
Check offset/len arguments for zero operation so that later fallocate() error doesn't cause OSD to crash. David Zafman
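The idea in #5340 is to reject a bad range up front rather than let a later system call fail. A rough Python sketch of that kind of check follows. The OSD itself is not written in Python, and `check_zero_args`, the `max_object_size` limit, and the choice of `EINVAL` are all assumptions made for illustration.

```python
import errno


def check_zero_args(offset, length, max_object_size):
    """Return 0 if the range is acceptable, otherwise a negative errno.

    Illustrative only: the limit and the error code are assumptions.
    """
    if offset < 0 or length < 0:
        # negative values can never describe a valid byte range
        return -errno.EINVAL
    if offset + length > max_object_size:
        # range would extend past the assumed per-object size limit
        return -errno.EINVAL
    return 0
```

A caller would return the error to the client when the result is nonzero and only then proceed to the operation that touches the file.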
10:41 AM devops Bug #5338 (Resolved): need rpm packages built for libapache-mod-fastcgi
We currently have libapache-mod-fastcgi packages built for debs. It would be nice to have them built for rpms as well... Tamilarasi muthamizhan
10:23 AM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
Hi Sage,
this was a clean reboot of the cluster node.
As the filesystems have not been mounted automatically no...
Robert Sander
09:16 AM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
I see it starting osd.5 and osd.2:... Sage Weil
08:40 AM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
Hi,
attached is /var/log/syslog after booting the machine with udev debug level logging.
The filesystems have n...
Robert Sander
10:06 AM Bug #5227 (Need More Info): ARM set up: rados test failed
This sure looks a lot like #4879 which would have been fixed by 0.61. I thought I had grabbed the stores and the logs... Joao Eduardo Luis
09:37 AM devops Bug #5334: ceph-deploy: "modules not installed"
Update. I was able to get it installed correctly with the `ceph-deploy-1.0-0.noarch.rpm` package, but my understandin... Noah Watkins
09:28 AM Bug #5301: mon: leveldb crash in tcmalloc
Okay, regarding the crash, although I've been unable to figure out what or who (us or leveldb) may be causing it, the... Joao Eduardo Luis
08:28 AM devops Bug #5189: ceph-deploy disk prepare fails silently
Hi Sage,
We are currently testing with some Debian wheezy VMs on a VMware ESXi host.
root@ceph01-test:~# lsb_re...
Robert Sander
08:02 AM Bug #5336 (Can't reproduce): osd crash triggered by 'rbd rm ...'
Reported by Florian Wiessner on ML
looks like a stall in the op_tp.. requested detailed logs.
Sage Weil
07:48 AM Bug #5256: Upgraded bobtail->cuttlefish mon crashes, then can't resume the conversion
Okay, here's what is the likely order of events in this case:
* the monitor was converting when it was killed for ...
Joao Eduardo Luis
01:24 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Argh. I spoke too soon. We just had another crash this morning while deleting the benchmark pool. Using the staticall... Emil Renner Berthing

06/12/2013

08:37 PM rbd Feature #5335 (New): qa: test that kernel rbd and librbd can read images written by each other
This test would have caught an issue with format 2 object names being different in librbd and the kernel driver. Josh Durgin
06:17 PM Bug #5329 (Fix Under Review): ceph osd tell * injectargs broken
Sage Weil
12:29 PM Bug #5329 (Resolved): ceph osd tell * injectargs broken
... Sage Weil
06:13 PM Bug #5331 (Resolved): objecter: osd_command doesn't handle dne/down osd properly
commit:8808ca57c652502d9cf803b0dc53673ca9dd62af Sage Weil
01:02 PM Bug #5331 (Resolved): objecter: osd_command doesn't handle dne/down osd properly
we return an error but don't trigger the callback or clean up... Sage Weil
05:51 PM devops Bug #5259 (Duplicate): osd create command fails inconsistently on ubuntu
i think we should call this a dup of the other bug.. this is all about udev vs partprobe vs udevadm settle races. se... Sage Weil
04:13 PM devops Bug #5334 (Resolved): ceph-deploy: "modules not installed"
Using cuttlefish RPM install for CentOS 6.4. Ceph-deploy is installed on all the nodes. I get the following:
<pre...
Noah Watkins
02:00 PM Bug #5327 (Resolved): cephtool/test.sh fails
commit:701943a27857fcad7fbb405cf95a59c945fea815 Sage Weil
11:44 AM Bug #5327 (Resolved): cephtool/test.sh fails
... Sage Weil
01:49 PM Bug #5238 (Pending Backport): osd: slow recovery (uselessly dirtying pg logs during peering)
Sage Weil
01:46 PM Bug #5238: osd: slow recovery (uselessly dirtying pg logs during peering)
Maybe something different, but I have this one:
http://tracker.ceph.com/issues/5232
and it makes a HUGE difference regar...
Stefan Priebe
01:44 PM Bug #5238: osd: slow recovery (uselessly dirtying pg logs during peering)
For what it's worth, I also tried it (wip_5238_cuttlefish specifically) per Sam's suggestion while troubleshooting #5... Faidon Liambotis
01:33 PM Bug #5238: osd: slow recovery (uselessly dirtying pg logs during peering)
we are going to test it a bit more in master before putting it in the cuttlefish branch. good to know this is helpin... Sage Weil
01:28 PM Bug #5238: osd: slow recovery (uselessly dirtying pg logs during peering)
This one is missing in upstream/cuttlefish? It helps a lot. Stefan Priebe
01:39 PM Bug #5332 (Resolved): boost::get: key stuckops is not type std::vector<std::string, std::allocato...
commit:de1723834cf2cfe51cc991ece1b53624ff56d7d5 Sage Weil
01:05 PM Bug #5332 (Resolved): boost::get: key stuckops is not type std::vector<std::string, std::allocato...
2013-06-12T02:25:15.786 INFO:teuthology.task.ceph.mon.a.err:2013-06-12 02:26:06.734468 7f2e3ef1e700 -1 bad boost::get... Sage Weil
01:23 PM CephFS Bug #5333 (Resolved): mds: segfault in MDLog::standby_trim_segments
... Sage Weil
12:35 PM Bug #5330 (Resolved): ceph daemon <name> ... broken
it uses ceph-conf to get admin_socket, but that doesn't work. this does:
ubuntu@plana38:~$ ceph-osd -n osd.0 --s...
Sage Weil
12:23 PM rbd Feature #5168: openstack: cinder: rbd as a backup target
https://blueprints.launchpad.net/cinder/+spec/cinder-backup-to-ceph Josh Durgin
12:23 PM rbd Feature #5167: openstack: cinder: differential backups
https://blueprints.launchpad.net/cinder/+spec/cinder-backup-to-ceph Josh Durgin
11:42 AM Bug #5326 (Resolved): mon: osd crush add ... command broken
... Sage Weil
10:21 AM devops Bug #4984: ceph_deploy: osd create succeeds with an error message (partprobe returns error)
it should have been wip-4984 :) Tamilarasi muthamizhan
09:33 AM Bug #5312: Skip EXT4StoreTest._detect_fs test if DISK or MOUNTPOINT environment variables not set
1577e203f08c3f94c36fd128dda14e8bceeca7a9 Ian Colle
09:32 AM Bug #5311 (Resolved): Existence of parent directories for admin and bootstrap keys in ceph-create...
Sage Weil
08:18 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Ok, I tried the ubuntu leveldb package but in ubuntu leveldb is only built as a static library. So what I did was to ... Emil Renner Berthing
06:10 AM CephFS Bug #5290: mds: crash whilst trying to reconnect
Hi Zheng,
Is this what you mean?
Damien Churchill

06/11/2013

08:54 PM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
We need to gather some udev logs to diagnose this... can you change the level in /etc/udev/udev.conf to 'debug', rest... Sage Weil
08:50 PM Bug #4698 (Won't Fix): osd suicide timed out after 150
this was an ext4 bug:... Sage Weil
08:45 PM Bug #5062: mon: 0.61.2 asserts on AuthMonitor during monitor start
Do we have any logs or recent occurrences of this bug to go on, or mon logs of it happening?
If not, I think this ...
Sage Weil
08:43 PM devops Bug #5189 (Need More Info): ceph-deploy disk prepare fails silently
Hi Robert-
Are you still having this problem? Can you share a bit more information about the environment? What d...
Sage Weil
07:20 PM rgw Bug #5324 (Resolved): radosgw-admin --help missing the --shard-id option
The new 'mdlog trim' call requires a --shard-id option be specified but that option is not listed in the --help output. Anonymous
06:32 PM devops Bug #4984: ceph_deploy: osd create succeeds with an error message (partprobe returns error)
pushed wip-4948. works ok on centos/rhel, but we should verify it also behaves on ubuntu and debian. Sage Weil
06:19 PM rgw Bug #5323: trim data log lists dates as optional, enforced as required in the current code
I believe that the offending line is in ceph/src/rgw/rgw_rest_log.cc in the function RGWOp_MDLog_Delete::execute().
...
Anonymous
06:17 PM rgw Bug #5323 (Resolved): trim data log lists dates as optional, enforced as required in the current ...
In the wip-rgw-geo branch, the
DELETE /admin/log?id=<shard id>
call lists start-time and end-time as optional. How...
Anonymous
05:41 PM Linux kernel client Bug #4854 (Rejected): read more than they should
this is due to readahead. readahead can be disabled by posix_fadvise(2) Zheng Yan
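As a sketch of the suggestion above, Python exposes the same call as `os.posix_fadvise`. The helper name `open_without_readahead` is made up for this example, and it assumes a Unix platform where `POSIX_FADV_RANDOM` is honoured. On Linux that advice is generally understood to switch off readahead for the file descriptor, though how much a particular filesystem client respects it is not something this log confirms.

```python
import os


def open_without_readahead(path):
    """Open path read-only and advise the kernel that access is random."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0 and len=0 apply the advice to the whole file
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
    except OSError:
        # do not leak the descriptor if the advice is rejected
        os.close(fd)
        raise
    return fd
```

The caller then reads with `os.read(fd, n)` and closes the descriptor with `os.close(fd)` when done.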
04:49 PM Bug #5310 (Resolved): StoreTest.ColSplitTest1 hits assert in _split_collection()
Samuel Just
11:13 AM Bug #5310 (Resolved): StoreTest.ColSplitTest1 hits assert in _split_collection()
$ ./ceph_test_filestore
...
[ RUN ] StoreTest.ColSplitTest1
2013-06-11 11:06:49.332610 7f38942e4780 1 filest...
David Zafman
04:30 PM Bug #5176 (Resolved): leveldb: Compaction makes things time-out yielding spurious elections
Sylvain Munaut wrote:
> I can try to do this tomorrow.
>
> But in the mean time I played with the paxos trimming ...
Sage Weil
11:16 AM Bug #5176: leveldb: Compaction makes things time-out yielding spurious elections
I can try to do this tomorrow.
But in the mean time I played with the paxos trimming values and made it go away.
...
Sylvain Munaut
08:12 AM Bug #5176 (Need More Info): leveldb: Compaction makes things time-out yielding spurious elections
Can you capture a debug mon = 20, debug paxos = 20, debug ms = 1 log that includes an election and send us the set of... Sage Weil
12:59 AM Bug #5176: leveldb: Compaction makes things time-out yielding spurious elections
fyi, I just upgraded from wip-5176 to 0.61.3 and those spurious elections are back. Sylvain Munaut
03:43 PM Bug #5320 (Resolved): osd/ReplicatedPG.cc: 4753: FAILED assert(!pg_log.get_missing().is_missing(s...
-901> 2013-06-11 14:02:22.138530 7f9bd4913700 5 filestore(/var/lib/ceph/osd/ceph-1) _do_op 0x1d4bfa0 seq 68202 osr... Samuel Just
02:56 PM Linux kernel client Bug #4614: Root cephfs does not mount at boot on Ubuntu 12.04
I can confirm this problem occurs on Ubuntu 12.04 as well. sam beckwith
01:46 PM Bug #5311: Existence of parent directories for admin and bootstrap keys in ceph-create-keys not c...
Yes, the packages do it right after the installation. But this does not mean that these dirs still exist when you run... Peter Wienemann
01:08 PM Bug #5311: Existence of parent directories for admin and bootstrap keys in ceph-create-keys not c...
Aren't these directories supposed to be installed by the packages? *Something* is doing it in the normal case or thes... Greg Farnum
01:06 PM Bug #5311: Existence of parent directories for admin and bootstrap keys in ceph-create-keys not c...
A fix is available as pull request #355. Peter Wienemann
12:56 PM Bug #5311 (Resolved): Existence of parent directories for admin and bootstrap keys in ceph-create...
The ceph-create-key script does not check the existence of the parent directories in which the admin and the bootstra... Peter Wienemann
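The kind of check this report asks for can be sketched as follows: create the parent directory if it is missing before writing the key file. `write_key` is a hypothetical helper for illustration. It is not taken from ceph-create-keys or from the pull request mentioned in this thread.

```python
import errno
import os


def write_key(path, data):
    """Write data to path, creating missing parent directories first."""
    parent = os.path.dirname(path)
    if parent:
        try:
            os.makedirs(parent)
        except OSError as e:
            # an already existing directory is fine; anything else is not
            if e.errno != errno.EEXIST:
                raise
    with open(path, "w") as f:
        f.write(data)
```

Catching `EEXIST` rather than testing with `os.path.isdir` first avoids a race between the check and the creation.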
01:32 PM Bug #5312 (Resolved): Skip EXT4StoreTest._detect_fs test if DISK or MOUNTPOINT environment variab...
I disabled the ColSplitTest1/ColSplitTest2 tests (see bug #5310).
Currently, this test case just crashes with uncl...
David Zafman
10:57 AM devops Bug #5309 (Closed): ceph-deploy mon create fails to start monitor daemon
This is with current master: 0.63-572-g0948624-1
It appears that somewhere between ceph-deploy and the ceph-mon...
Anonymous
10:53 AM Bug #5307 (Resolved): ceph_test_filestore crashes
Needs --filestore-xattr-use-omap=true Samuel Just
10:33 AM Bug #5307 (Resolved): ceph_test_filestore crashes
$ ./ceph_test_filestore
[==========] Running 11 tests from 2 test cases.
[----------] Global test environment set-u...
David Zafman
10:53 AM devops Bug #5300: ceph-deploy purgedata should give warning if ceph still installed
I'll retest, I might not have been paying attention to purge vs purge data. In any event the test system was left in... Anonymous
09:37 AM devops Bug #5300: ceph-deploy purgedata should give warning if ceph still installed
purge is supposed to remove the package files *and* any config files... Sage Weil
10:53 AM Bug #5269 (Duplicate): osd: EEXIST on mkcoll
This is probably the same thing as 5270. Samuel Just
10:52 AM Bug #5240 (Resolved): run_seed_to_range failed, probably fdcache
Samuel Just
10:27 AM devops Bug #5306 (Can't reproduce): Xen based OSDs fail to start ceph-osd process
After a clean install and ceph-deploy prepare and activate the osd process is running on the node.
After a reboot th...
Yan-Fa Li
10:26 AM Bug #5305 (Resolved): ceph-deploy gatherkeys fails (ceph-create-keys)
When invoked with ceph-deploy, ceph-create-keys fails silently and the only indication of a problem is that the subsqu... Anonymous
10:21 AM Bug #5305 (Resolved): ceph-deploy gatherkeys fails (ceph-create-keys)
glowell@gary-ubuntu-01:~/ceph-deploy$ sudo /usr/sbin/ceph-create-keys --cluster=ceph -i gary-ubuntu-01
INFO:ceph-cre...
Anonymous
09:56 AM rgw Bug #5302: rest-bench breaks with XmlParseFailure
what fastcgi module is being used here? Maybe try:
rgw print continue = false
in your ceph.conf.
Yehuda Sadeh
07:26 AM rgw Bug #5302 (Can't reproduce): rest-bench breaks with XmlParseFailure
This was reported on the mailing list when trying to run rest-bench:... Mark Nelson
09:39 AM devops Bug #5299: ceph-deploy fails with cryptic error message if expected directories not found
/etc/ceph should be installed by the package.
did you by chance run purgedata without running purge first? that mi...
Sage Weil
09:36 AM Bug #5301 (New): mon: leveldb crash in tcmalloc
Ian Colle
09:29 AM Bug #5301: mon: leveldb crash in tcmalloc
Well I could try to reproduce but I am not going to do this because it is my production cluster. I have also experien... Maciej Galkiewicz
08:21 AM Bug #5301: mon: leveldb crash in tcmalloc
Hi-
The 3.8.y kernel is EOL, but I pushed a branch that has the patch that (I believe) fixes this problem: linux-3...
Sage Weil
06:10 AM Bug #5301 (Can't reproduce): mon: leveldb crash in tcmalloc
Hello
I have replaced my crushmap:...
Maciej Galkiewicz
08:55 AM CephFS Bug #5303 (Resolved): OSD segfaults on SIGINT
This was a missed backport for an old fix. I pushed it to the cuttlefish branch and it will be included in .4. Thanks! Sage Weil
08:41 AM CephFS Bug #5303: OSD segfaults on SIGINT
Without debugger:... Jérôme Poulin
08:38 AM CephFS Bug #5303 (Resolved): OSD segfaults on SIGINT
This is not the first time but interrupting the OSD with SIGINT (CTRL+C) causes a segmentation fault.
Cuttlefish 0...
Jérôme Poulin
08:39 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Ah. Can you please try the ubuntu leveldb package and see if the problem persists? Thanks! Sage Weil
07:43 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
I just looked into LevelDB packaging in wheezy and precise. Again it seems that debian ships a newer version of Level... Emil Renner Berthing
01:06 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Yes, now we seem to have provoked two different errors. Both of them have happened at least twice each but on differen... Emil Renner Berthing
08:34 AM Bug #5292: mon: monitor crashing due to not being in the monmap (no monmap to be in)
I think that this is what you want, if not, just let me know.
Jeff
Jeff Moskow
08:21 AM Bug #5292: mon: monitor crashing due to not being in the monmap (no monmap to be in)
Can you share the monitor's logs with 'debug mon = 20' set? Joao Eduardo Luis
07:19 AM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start
Removing the assert worked around the problem:... Jérôme Poulin
06:32 AM CephFS Bug #5250: ceph-mds 0.61.2 aborts on start
I noticed that resetting the MDS journal using ceph-mds -i 1 --reset-journal 0 -d hangs there.... Jérôme Poulin
01:40 AM Fix #5232: osd: slow peering due to pg log rewrites
This one misses cuttlefish for backport? Stefan Priebe

06/10/2013

11:34 PM Bug #5272: Updating ceph from 0.61.2 to 0.61.3 obviously changes tunables of existing cluster
I'm afraid that as long as no one else encounters this issue I am not able to provide more detailed information. The ... To Pro
05:53 PM Bug #5272 (Need More Info): Updating ceph from 0.61.2 to 0.61.3 obviously changes tunables of exi...
I went through a diff and there's nothing obvious between those two versions that could have caused these feature bit... Greg Farnum
11:07 PM devops Bug #5283 (In Progress): Ceph-deploy can't handle /dev/disk/by-* device paths
The fix for this will actually be in ceph-disk; ceph-deploy pretty much passes the device unmodified.
Anonymous
10:28 PM CephFS Bug #5290: mds: crash whilst trying to reconnect
looks like session map corruption.
Damien, please upload the session map. you can find where it is by "ceph osd ma...
Zheng Yan
02:16 AM CephFS Bug #5290 (Can't reproduce): mds: crash whilst trying to reconnect
Hi,
Recently I experienced an issue with the mds servers in my cluster, the cluster storage would be absolutely fi...
Damien Churchill
10:15 PM devops Bug #5300 (Resolved): ceph-deploy purgedata should give warning if ceph still installed
Purge will remove directories needed for continued operation. Probably need to issue a warning in this case since if ... Anonymous
10:10 PM devops Bug #5299 (Won't Fix): ceph-deploy fails with cryptic error message if expected directories not f...
In this case it's /etc/ceph
glowell@gary-ubuntu-01:~/ceph-deploy$ ./ceph-deploy mon create gary-ubuntu-01
Traceba...
Anonymous
05:51 PM RADOS Bug #5298 (New): mon: "setting" CRUSH tunables to their current values creates a map
Maybe this is adding pointless churn, maybe it's blocking the user longer than necessary, or maybe it's a great way t... Greg Farnum
05:24 PM Bug #5297 (Resolved): Slow requests after restarting an OSD (post peering)
On my Cuttlefish 0.61.3, when I restart an OSD, besides the effects of #5084, I see a bunch of "slow request" message... Faidon Liambotis
05:22 PM Bug #5084: osd: slow peering after osd restart (bobtail)
Just for the record:
We did a troubleshooting/log collecting session with Sam last week. It seems that the issue i...
Faidon Liambotis
05:17 PM Bug #5270 (Resolved): osd: crash in PG::peek_map_epoch()
Samuel Just
02:06 AM Bug #5270: osd: crash in PG::peek_map_epoch()
I've got the same error when some pginfo files have been lost due to XFS corruption. Removing pg collection helped to... Sergey Fionov
04:50 PM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
any luck? Sage Weil
08:06 AM Bug #5239: osd: Segmentation fault in ceph-osd / tcmalloc
Ok, all our OSD nodes are now running v0.61.3, but built --without-tcmalloc.
We'll try different workloads during ...
Emil Renner Berthing
04:24 PM devops Bug #5295 (Resolved): mon keyring path in mon.py not checked properly
commit:dd9392023da4773c7006ec1fb86fee07a862d8f9 Sage Weil
02:06 PM devops Bug #5295 (Resolved): mon keyring path in mon.py not checked properly
In the file mon.py, line 37 ff., of the ceph-deploy code the mon keyring path is not checked properly. Prior to writi... Peter Wienemann
04:20 PM devops Bug #4916: ceph-deploy: mon create fails on bobtail branch in centos 6.3
commit:96c001021e6dd06b43686de7040f78c484869344 fixes the mkdir -p thing. Does that fix the centos problem too? Sage Weil
01:48 PM devops Bug #4916: ceph-deploy: mon create fails on bobtail branch in centos 6.3
I am having the same problem on Debian wheezy. After some debugging I found that the cause of the problem is in the f... Peter Wienemann
04:15 PM Subtask #5213: unit tests for src/osd/PGLog.{cc,h}
"related thread":http://thread.gmane.org/gmane.comp.file-systems.ceph.devel/15499... Loïc Dachary
01:29 PM Bug #5294 (Closed): mon upgrade issue 0.61.2 -> 0.61.3
This was reported on the mailing list by Nelson Jeppesen at Disney. Joao, any idea if we've seen anything else like ... Mark Nelson
11:31 AM devops Documentation #5293 (Rejected): ceph-osd needs ulimit value to be set otherwise won't start
I needed to add the following line to my /etc/security/limits.conf otherwise the osd didn't start up correctly and th... Yan-Fa Li
11:24 AM Bug #5291: Bug with client naming for Cinder-Volume usage
The defaults everywhere are client.admin. Perhaps you've got the CEPH_ARGS environment variable specifying --id volum... Josh Durgin
02:42 AM Bug #5291 (Can't reproduce): Bug with client naming for Cinder-Volume usage
Hello!
It seems there is a bug with client naming for Cinder-Volume usage.
According to this documentation http://...
Igor Laskovy
09:42 AM CephFS Bug #5287 (Resolved): the permission of file in CephFS
Ian Colle
06:53 AM rbd Bug #4446: librbd: crash from opensolaris vm
I've upgraded to Cuttlefish and the newest Proxmox (KVM 1.4.1) and still have the same problem. The kvm command is:
...
Jeff Moskow
06:48 AM Bug #5292 (Resolved): mon: monitor crashing due to not being in the monmap (no monmap to be in)
I run a 4 node CEPH cluster (all are currently running 0.61.3 - upgraded to cuttlefish a few weeks ago) and (3 nodes ... Jeff Moskow
04:29 AM devops Bug #5194: udev does not start osd after reboot on wheezy or el6 or fedora
Sage Weil wrote:
> what happens if you do 'ceph-disk-active /dev/sdb1' (or whatever the xfs partition is)? what abou...
Robert Sander
 
