Activity
From 07/26/2014 to 08/24/2014
08/24/2014
- 04:10 PM Feature #8343 (Closed): please enable data integrity checking (by default) / silent data corruption
- 04:06 PM Bug #8349 (Resolved): env-vs-args unittest is racy
- Fixed by https://github.com/ceph/ceph/commit/3230060f07c738383cc1034a99d60d2ad369560f
- 03:32 PM Support #8462: related to integrity of objects
- 03:12 PM Feature #7238 (Fix Under Review): erasure code : implement LRC plugin
- The rados tests work (no thrashing).
- 02:57 PM Support #8310 (Closed): Most pgs stuck stale, no osds reporting them, repair ineffective
- 09:25 AM CephFS Bug #9212 (Won't Fix): mon election delays mds beacon
- ubuntu@teuthology:/a/teuthology-2014-08-22_23:04:01-fs-master-testing-basic-multi/444359...
- 08:36 AM Bug #9211 (Resolved): osdmap blacklist encoding order is nondeterministic
- ...
08/23/2014
- 05:00 PM Bug #9203: ceph_test_rados: ObjectDesc::iterator::advance(bool): Assertion `pos < limit' failed.
- fwiw the reproducer hits a crash on firefly, but not emperor or dumpling. A fair bit changed in ceph_test_rados for ...
- 03:13 PM Bug #9203: ceph_test_rados: ObjectDesc::iterator::advance(bool): Assertion `pos < limit' failed.
- So it turns out that ceph_test_rados is also crashy on master, as I found when I took my reproducer for this issue ...
- 03:53 PM rbd Bug #9210 (Resolved): osdc/ObjectCacher.cc: 529: FAILED assert(i->empty()) on fencing test shutdown
- ...
- 11:50 AM Feature #7238: erasure code : implement LRC plugin
- 11:25 AM Feature #7238 (Fix Under Review): erasure code : implement LRC plugin
- Although thrashing tests using an LRC pool fail, I believe this is due to the size of the pool rather than the plugin...
- 11:29 AM Bug #9209: osd/ECUtil.h: 66: FAILED assert(offset % stripe_width == 0)
- The same YAML file run against firefly 0.80.5-171-gca3ac90-1trusty instead of master succeeds.
- 11:23 AM Bug #9209 (Resolved): osd/ECUtil.h: 66: FAILED assert(offset % stripe_width == 0)
- Using ...
08/22/2014
- 06:26 PM rgw Bug #9208 (Resolved): rgw: civetweb does not drain request buffer correctly
- When radosgw returns an early error without reading the request content, we need civetweb to drain the buffer so that...
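A minimal sketch of the draining idea, assuming civetweb's mg_read API (illustrative only, not the actual radosgw fix):

    #include <cstddef>

    struct mg_connection;  // opaque; provided by civetweb
    extern "C" int mg_read(struct mg_connection *conn, void *buf, size_t len);

    // After answering early with an error, read and discard whatever is left
    // of the request body so a keep-alive connection is not left mid-body.
    void drain_request_body(struct mg_connection *conn) {
      char scratch[4096];
      while (mg_read(conn, scratch, sizeof(scratch)) > 0) {
        // discard
      }
    }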
- 05:24 PM Subtask #6478 (Rejected): ErasureCode : XOR plugin
- This has been obsoleted by the work on the ISA plugin.
- 05:22 PM Feature #7238: erasure code : implement LRC plugin
- Fixed a bug that made the plugin incorrectly claim it could not recover when the last OSD was out; running tests a...
- 03:09 PM Bug #9207 (Resolved): osdc/Objecter.cc: 1074: FAILED assert(op->get_nref() > 1)
- ubuntu@teuthology:/var/lib/teuthworker/archive/john-2014-08-22_10:24:47-rados-wip-objecter-testing-basic-multi/441988...
- 03:04 PM rgw Bug #9206 (Resolved): rgw: cross rgw message headers filtered by apache 2.4
- apache 2.4 filters out header fields that have underscores in them. Need to convert underscores into dashes.
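A small self-contained illustration of that conversion (the header name here is made up):

    #include <algorithm>
    #include <cassert>
    #include <string>

    // Sketch of the workaround: rewrite underscores to dashes in outgoing
    // header names so apache 2.4 does not filter them out.
    static std::string dash_header_name(std::string name) {
      std::replace(name.begin(), name.end(), '_', '-');
      return name;
    }

    int main() {
      assert(dash_header_name("X_RGW_REGION_SYNC") == "X-RGW-REGION-SYNC");
      return 0;
    }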
- 02:52 PM Bug #9205 (Resolved): osd: notify ops reordered
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-21_11:40:02-upgrade:dumpling-x:stress-split-master...
- 01:23 PM devops Feature #9136 (Resolved): ceph-deploy: use pre-existing ceph.conf
- merged commit 2781538 into ceph:master
- 12:44 PM devops Feature #9118 (Fix Under Review): ceph-deploy: Add pre-generated keys to a Monitor
- Pull request opened https://github.com/ceph/ceph-deploy/pull/235
- 12:02 PM Bug #9203: ceph_test_rados: ObjectDesc::iterator::advance(bool): Assertion `pos < limit' failed.
- Does not reproduce very often, but eventually caught in the act with debug turned up.
The oid in the asserting ope...
- 06:39 AM Bug #9203 (Resolved): ceph_test_rados: ObjectDesc::iterator::advance(bool): Assertion `pos < limit' failed.
http://pulpito.front.sepia.ceph.com/john-2014-08-22_02:21:21-rados-wip-objecter-testing-basic-multi/440722/
http:/...
- 11:28 AM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
- added patches to master that will dump the weak_refs on shutdown
- 06:32 AM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
- http://pulpito.front.sepia.ceph.com/john-2014-08-22_02:21:21-rados-wip-objecter-testing-basic-multi/440850/
http://p...
- 06:24 AM Bug #7995 (New): osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
- This is happening again:
http://pulpito.front.sepia.ceph.com/john-2014-08-22_02:21:21-rados-wip-objecter-testing-b...
- 11:15 AM Bug #8736: thrash and scrub combination lead to error
- This needs to be prioritized.
Confirmed, logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-21_11:...
- 10:19 AM Bug #8985: "[WRN] map e9 wrongly marked me down" in upgrade:dumpling-x-firefly---basic-vps suite
- 06:36 AM Fix #8914: osd crashed at assert ReplicatedBackend::build_push_op
- The stack trace created by the minimal script is different from the one reported above, but it fails at the same poin...
- 05:51 AM Fix #8914: osd crashed at assert ReplicatedBackend::build_push_op
- The problem does not show if we wait after the object is inserted. It is a race condition...
- 05:25 AM Fix #8914: osd crashed at assert ReplicatedBackend::build_push_op
- For the problem to show, the file being removed has to be on the primary.
- 05:06 AM Fix #8914: osd crashed at assert ReplicatedBackend::build_push_op
- Even simpler and does not require root privileges...
- 04:56 AM Fix #8914: osd crashed at assert ReplicatedBackend::build_push_op
- The following reproduces it reliably on my laptop:...
- 03:47 AM Fix #8914 (In Progress): osd crashed at assert ReplicatedBackend::build_push_op
- Thanks for the update, will try again :-)
- 02:57 AM CephFS Bug #4545: error creating empty object store. Invalid argument.
- I may have found the problem.
Before you run mkcephfs, you should ensure the dir (/var/lib/ceph/osd/ceph-0) is empty.
Once I wr...
- 02:32 AM Bug #9202 (Can't reproduce): Performance degradation during recovering and backfilling
- From recent tests and analysis, we find slow requests mainly happen in two patterns during recovery and backfilling.
...
08/21/2014
- 11:12 PM rgw Feature #8911: RGW doesn't return 'x-timestamp' in header which is used by 'View Details' of Open...
- Thanks Luis... actually it's a new feature request, not a bug. Since we want one-to-one header mapping between Swift a...
- 09:11 PM rgw Bug #9201 (Resolved): rgw: bad object with different pool alignment
- http://qa-proxy.ceph.com/teuthology/sage-2014-08-21_17:03:27-rgw-master-testing-basic-multi/440046/teuthology.log
...
- 05:28 PM Bug #9153 (Resolved): erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- 04:55 PM Feature #8147 (Resolved): osd: make split automatically trigger scrub
- 04:49 PM Bug #8998 (Resolved): osd: SEGV in OSD::heartbeat()
- no backport needed; this happened because update_osd_stats() was in OSDService but still using the other dout macro, but f...
- 04:49 PM rgw Feature #9200 (Resolved): rgw: log civetweb access
- Apache has an access log and civetweb has one too; however, we need to incorporate it into our logging system.
- 04:44 PM CephFS Bug #5762 (Resolved): teuthology: Failed MPI runs lead to a hung test instead of a failure
- 03:29 PM Feature #8639: mon: dispatch messages while blocked waiting for IO
- 03:29 PM Feature #7516 (Resolved): mon: reweight-by-pg
- 03:27 PM Fix #9199 (Resolved): librados: watch linger pings need to verify pg mapping hasn't changed
- at the same time, osds might want to push osdmap incrementals to client sessions with watchers to expedite things ...
- 03:22 PM Feature #9198 (Resolved): librados: notify callback includes gid of notifier
- 03:21 PM Feature #9197 (Resolved): librados/osd: notify reply payload
- 03:21 PM Fix #9196 (Resolved): librados: watch_check() to synchronous verify we haven't missed notifies
- 03:21 PM Fix #9195 (Resolved): librados: issue watch callback on (possibly) missed notifies
- 03:20 PM Fix #9194 (Resolved): librados/osd: watch reconnect needs to be exclusive to detect possibly miss...
- 03:18 PM Linux kernel client Bug #8806: libceph: must use new tid when watch is resent
- the watch resend needs to use a new tid to avoid the dup op detection in the osd. this is how librbd avoids this pro...
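A minimal sketch of the point, with stand-in types rather than the kernel client's real ones:

    #include <cassert>
    #include <cstdint>

    // The OSD treats a resent op with an already-seen (client, tid) as a
    // duplicate and drops it, so a watch must be resent under a fresh tid.
    struct Op { uint64_t tid = 0; };

    struct Session {
      uint64_t last_tid = 0;
      void send(Op &op) {
        if (op.tid == 0)
          op.tid = ++last_tid;   // first transmission
        // ... put op on the wire ...
      }
      void resend_watch(Op &op) {
        op.tid = ++last_tid;     // fresh tid: dodges dup-op detection
        // ... retransmit op ...
      }
    };

    int main() {
      Session s;
      Op watch;
      s.send(watch);
      uint64_t first_tid = watch.tid;
      s.resend_watch(watch);
      assert(watch.tid != first_tid);
      return 0;
    }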
- 02:55 PM Bug #9176 (Pending Backport): mon: leaked MMonGetVersion
- 01:08 PM Bug #9176 (Fix Under Review): mon: leaked MMonGetVersion
- https://github.com/ceph/ceph/pull/2301
- 02:49 PM rgw Bug #9160: rgw failures with 'NoneType' object has no attribute 'get_contents_as_string'
- http://pulpito.front.sepia.ceph.com/sage-2014-08-19_15:19:41-rgw-master-testing-basic-multi/435812/
http://pulpito.f...
- 02:43 PM rgw Bug #9160: rgw failures with 'NoneType' object has no attribute 'get_contents_as_string'
- http://pulpito.front.sepia.ceph.com/john-2014-08-20_19:21:46-rgw-wip-objecter-testing-basic-plana/438545/
- 01:56 PM Bug #9144 (Pending Backport): filestore: commit triggered during journal replay
- 01:21 PM Bug #9193: notify does not return an error code on timeout
- https://github.com/ceph/ceph/pull/2302
- 01:20 PM Bug #9193 (Resolved): notify does not return an error code on timeout
- commit:7c7bf5fee7be397ef141b947f532a2a0b3567b42
There is simply no error code passed back to the caller; the API c...
- 01:10 PM Bug #9150: osd/ECBackend.cc: 529: FAILED assert(pop.data.length() == sinfo.aligned_logical_offset...
- suspect this and #9135 to be a ghost due to misbehaving underlying fs
- 01:09 PM Bug #9145 (Resolved): recursive lock of CollectionIndex::access_lock (52)
- 12:51 PM Bug #9182 (Need More Info): osd deadlock after ms_handle_reset
- 12:50 PM Bug #9181 (Need More Info): Osd: segv in OpTracker::unregister_inflight_op
- no log, core isn't giving me good info :(
- 12:34 PM Bug #8885 (Can't reproduce): SIGABRT in TrackedOp::dump() via dump_ops_in_flight()
- 12:09 PM devops Feature #9136 (Fix Under Review): ceph-deploy: use pre-existing ceph.conf
- Pull request opened https://github.com/ceph/ceph-deploy/pull/234
- 12:07 PM devops Bug #9185: incorrect Centos 6.5 fastcgi package
- ok, the idle timeout is working fine.. i can pause the radosgw process (kill -STOP) and curl will block for well over...
- 10:27 AM devops Bug #9185 (In Progress): incorrect Centos 6.5 fastcgi package
- 09:52 AM devops Bug #9185: incorrect Centos 6.5 fastcgi package
- (09:51:57 AM) sagehm@newdream.net/montreal: mod_fastcgi-2.4.7-1.ceph.el6.x86_64
(09:52:15 AM) sagehm@newdream.net/mo...
- 11:43 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- Does fio complete eventually? Are there any other hung tasks in dmesg? A task blocking for more than 120 seconds is...
- 11:38 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- I applied http://gitbuilder.ceph.com/kernel-deb-precise-x86_64-basic/ref/wip-request-fn/linux-image-3.16.0-ceph-00037-g...
- 11:37 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- Ok, I've applied the "..." with Kernel 3.16.0 and the error continues:
...
Aug 21 14:38:45 mail02-old kernel: [ 7...
- 10:19 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- Eric is correct, the fix isn't in 3.16 stable yet, and unfortunately won't be in 3.15 at all - Linus pulled it into h...
- 10:10 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- The fix looks like it made it into 3.17rc1. I have been testing this kernel since Sunday, and have not triggered the ...
- 09:31 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- Upgrade to kernel: 3.16.0 and got the same problem:
...
[ 70.858716] Key type ceph registered
[ 70.858800] l...
- 11:18 AM Linux kernel client Bug #9192 (New): krbd: poor read (about 10%) vs write performance
- We started testing the 3.17rc1 kernel over the weekend, as it is the only Linus-released kernel that has the fix fo...
- 10:05 AM devops Feature #5773 (In Progress): ceph-deploy: should add more tests to ceph-deploy task
- 09:55 AM CephFS Bug #9152 (In Progress): mds: beacon needs to not take mds_lock
- wip-9152
- 09:50 AM CephFS Bug #9177: ceph-fuse: failing MPI mdtest runs
- The compiler is spitting out a warning about getcwd -- no evidence that that's what it's actually hitting in this ins...
- 08:53 AM CephFS Bug #9177: ceph-fuse: failing MPI mdtest runs
- http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-20_23:04:01-fs-next-testing-basic-multi/439228/
- 08:29 AM CephFS Bug #9177: ceph-fuse: failing MPI mdtest runs
- How did you track it down to getcwd? If that is the issue there are a bunch of avenues of attack here, and we should ...
- 06:31 AM CephFS Bug #9177: ceph-fuse: failing MPI mdtest runs
- mdtest has a getcwd call into an unzeroed buffer that it doesn't check the error of. If fuse is failing the getcwd f...
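The bug pattern being described, as a standalone sketch:

    #include <limits.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
      char buf[PATH_MAX];              // unzeroed, as described above
      // Buggy pattern: if getcwd() fails (e.g. the fuse mount errors out),
      // buf is never written and holds uninitialized garbage.
      getcwd(buf, sizeof(buf));
      printf("cwd (unchecked) = %s\n", buf);

      // Checked pattern:
      if (getcwd(buf, sizeof(buf)) == NULL) {
        perror("getcwd");
        return 1;
      }
      printf("cwd = %s\n", buf);
      return 0;
    }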
- 09:46 AM devops Bug #9190 (Resolved): idle times out do not work on ubuntu precise
- This may be similar to #9185
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-21_08:29:18-upgrade...
- 08:26 AM Bug #9188: make check fails for setmaxosd
- "make check" is passing on our gitbuilders (http://ceph.com/gitbuilder.cgi). Try updating and running it again? If th...
- 02:28 AM Bug #9188 (Rejected): make check fails for setmaxosd
- make check fails for setmaxosd. This is after a recent change in setmaxosd behavior to disallow shrinking of OSDs. He...
- 06:56 AM CephFS Bug #9151 (In Progress): mds should log/error/warn when segments are NOT getting trimmed
- 05:56 AM CephFS Feature #9189 (Resolved): Expose client identifying metadata to MDS, e.g. hostname
Currently, when doing e.g. a "session ls" on an MDS's admin socket, we get client IDs and IP addresses. It would b...
- 05:35 AM CephFS Bug #9173 (Fix Under Review): Crash in Server::_session_logged
https://github.com/ceph/ceph/pull/2297
- 03:28 AM Fix #8914: osd crashed at assert ReplicatedBackend::build_push_op
- Missed a step to mention.
Before I did a repair on the primary osd, I also did a scrub.
#:/build/ceph-firefly84/sr...
- 03:17 AM Fix #8914: osd crashed at assert ReplicatedBackend::build_push_op
- Hi Loic,
please find below the steps to reproduce the issue.
#:/build/ceph-firefly84/src# ./ceph -v
*** DEVE...
- 01:09 AM rgw Bug #9155: Swift Subuser - 403 Forbidden - during upload/post
- made a comment on your proposed fix.
08/20/2014
- 09:02 PM devops Bug #9187 (Resolved): osds down after fresh deploy in master branch of ceph
- 09:02 PM devops Bug #9187: osds down after fresh deploy in master branch of ceph
- this is fixed later today. it was the isa preload thing:
2014-08-20 21:04:58.845739 7f7369af2780 -1 load: jerasur...
- 04:37 PM devops Bug #9187 (Resolved): osds down after fresh deploy in master branch of ceph
- ceph version 0.84-367-gf71c889
test setup: mira023
ceph-deploy version: 1.5.11
created 4 osds, with a combi...
- 08:48 PM Bug #9180 (Resolved): keyvaluestore: bad op 2563
- done, commit:fdbab46852e74d405b5c747da98564a5866ec8a7 . thanks!!
- 08:07 PM Bug #9180: keyvaluestore: bad op 2563
- We need to backport commit c08adbc98ff5f380ecd215f8bd9cf3cab214913c (https://github.com/ceph/ceph/commit/c08adbc98ff5f...
- 10:39 AM Bug #9180 (Resolved): keyvaluestore: bad op 2563
- ...
- 05:33 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Plugging one of the 520s into a 3Gbit sata port makes no difference either.
- 04:58 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Updated the bios on the work machine. No difference.
- 04:08 PM Bug #9153 (In Progress): erasure-code: jerasure_matrix_dotprod segmentation fault due to package ...
- preloading jerasure is not enough: the plugin selects another plugin to be loaded depending on the CPU features (jer...
- 03:29 PM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- I still see this error in today's run http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-20_13:52:13-upgrade:dump...
- 10:07 AM Bug #9153 (Resolved): erasure-code: jerasure_matrix_dotprod segmentation fault due to package upg...
- 03:27 PM devops Bug #9185: incorrect Centos 6.5 fastcgi package
- fcgi? how does that even enter into it? I thought our work was only with fastcgi?
Is this on teuthology, or cust...
- 03:26 PM devops Bug #9185: incorrect Centos 6.5 fastcgi package
- So this problem is with the fcgi package not mod_fastcgi?
- 02:07 PM devops Bug #9185: incorrect Centos 6.5 fastcgi package
- This should fix #9169
- 01:54 PM devops Bug #9185 (Rejected): incorrect Centos 6.5 fastcgi package
- The fastcgi package that is being installed is, or is based off of, fcgi-2.4.0-10.el6.x86_64. Not 100% sure that it ...
- 02:33 PM Feature #9031: List RADOS namespaces and list all objects in all namespaces
- 02:31 PM Bug #9186 (Duplicate): erasure-code: conditionally preload isa plugin
- The isa plugin is only built on some platforms. When the OSD preloads plugins, it should not try to load plugins that...
- 02:05 PM rgw Bug #9169: 100-continue broken for centos/rhel
- This seems to be due to the idle timeout not working; should be fixed by #9185
- 01:27 PM devops Feature #9136 (In Progress): ceph-deploy: use pre-existing ceph.conf
- 10:54 AM Bug #9182: osd deadlock after ms_handle_reset
- ...and when I detached gdb, the osd saw it was marked down, and came back to life after that. :/
- 10:52 AM Bug #9182: osd deadlock after ms_handle_reset
- ...
- 10:51 AM Bug #9182 (Can't reproduce): osd deadlock after ms_handle_reset
- ubuntu@teuthology:/a/teuthology-2014-08-19_02:30:02-rados-firefly-distro-basic-multi/435572...
- 10:47 AM CephFS Bug #9173: Crash in Server::_session_logged
- Better log.
- 06:30 AM CephFS Bug #9173 (Resolved): Crash in Server::_session_logged
Hit by mds_client_recovery task...- 10:43 AM Bug #9181 (Resolved): Osd: segv in OpTracker::unregister_inflight_op
- ...
- 10:38 AM Bug #9179 (Resolved): unfound objects, recovery timeout
- 402/7722 unfound (
all osds up
ubuntu@teuthology:/a/teuthology-2014-08-19_02:30:02-rados-firefly-distro-basic-m...
- 10:33 AM CephFS Bug #9178: samba: ENOTEMPTY on "rm -rf"
- http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-10_23:14:02-samba-next-testing-basic-plana/415869/
- 10:30 AM CephFS Bug #9178 (Resolved): samba: ENOTEMPTY on "rm -rf"
- ...
- 10:14 AM CephFS Bug #9177 (Resolved): ceph-fuse: failing MPI mdtest runs
- ...
- 09:40 AM Bug #9176 (Resolved): mon: leaked MMonGetVersion
- ubuntu@teuthology:/a/teuthology-2014-08-19_02:30:02-rados-firefly-distro-basic-multi/435589
- 09:38 AM Bug #9175 (Duplicate): osd: stuck recovery
- ubuntu@teuthology:/a/teuthology-2014-08-19_02:30:02-rados-firefly-distro-basic-multi/435529
pgs stuck recovery, ne...
- 09:33 AM Feature #7238: erasure code : implement LRC plugin
- Reserved three machines and run the following job on them:...
- 09:32 AM rgw Subtask #9068 (In Progress): rgw: add rgw setup to vstart
- Pull request: https://github.com/ceph/ceph/pull/2292
- 09:31 AM rgw Documentation #9003: rgw: document development setup for rgw
- Abhishek L wrote:
> Luis Pabon wrote:
> > I have edited vstart.sh so that it can setup rgw automatically. I have a...
- 09:30 AM rgw Documentation #9003: rgw: document development setup for rgw
- patch has been submitted: https://github.com/ceph/ceph/pull/2292
- 05:21 AM rgw Documentation #9003: rgw: document development setup for rgw
- Luis Pabon wrote:
> I have edited vstart.sh so that it can setup rgw automatically. I have also documented most of ...
- 09:19 AM Bug #9128: Newly-restarted OSD may suicide itself after hitting suicide time out value because it...
- sounds like we need to use the TPHandle and tp.reset_tp_handle() inside the search_for_missing loop
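Roughly the shape of that fix, sketched with a stand-in handle type (Ceph's real ThreadPool::TPHandle exposes a similar reset_tp_timeout()):

    #include <chrono>
    #include <vector>

    // Stand-in for Ceph's ThreadPool::TPHandle: a worker thread refreshes
    // its heartbeat so the grace/suicide timeout watchdog can tell "busy
    // but alive" from "stuck".
    struct TPHandle {
      std::chrono::steady_clock::time_point deadline;
      void reset_tp_timeout() {
        deadline = std::chrono::steady_clock::now() + std::chrono::seconds(150);
      }
    };

    // Tick the handle every iteration so a long scan after a restart
    // cannot trip the suicide timeout.
    void search_for_missing(const std::vector<int> &missing, TPHandle &handle) {
      for (int obj : missing) {
        (void)obj;                  // ... per-object recovery bookkeeping ...
        handle.reset_tp_timeout();  // refresh the watchdog
      }
    }

    int main() {
      TPHandle h;
      std::vector<int> missing(100000, 0);
      search_for_missing(missing, h);
      return 0;
    }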
- 07:53 AM Documentation #9174: wrong picture on http://ceph.com/docs/master/cephfs/
- ...
- 07:46 AM Documentation #9174 (Closed): wrong picture on http://ceph.com/docs/master/cephfs/
- The picture on page http://ceph.com/docs/master/cephfs/
is not correct.
ceph.ko is not on top of libcephfs / librad...
- 03:11 AM devops Feature #8868 (Resolved): Update Fedora to 0.80.5 packages with ceph-common
- The updated packages with spec file synced up with the upstream spec file were pushed to epel 7, fedora 22, fedora 21...
08/19/2014
- 09:31 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- A related thought is that the Intel 520s are plugged into the sata 6Gbit ports on the motherboard, so if there are an...
- 06:52 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- It might be worth trying an Intel 530 if that is dramatically easier to source - as it is similar to the 520 in the m...
- 06:26 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- That should have said unpatched wip-9073.
- 06:25 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Doing a little more digging for the cause of 2/ (invalid argument error). Using unpatched ipw-0973 and changing the jo...
- 09:07 PM rgw Bug #9125 (Resolved): rgw: swift tests fail with civetweb
- 05:44 PM Feature #7238: erasure code : implement LRC plugin
- There is no need to test upgrade on a plugin that does not exist in LRC.
- 02:34 PM Feature #7238: erasure code : implement LRC plugin
- canceled the previous job because it did not have enough OSDs to complete (the LRC rule requires a minimum of 8 for ea...
- 12:22 PM Feature #7238: erasure code : implement LRC plugin
- Cancel the "teuthology run that did not contain any LRC workload":http://pulpito.ceph.com/loic-2014-08-19_20:27:09-up...
- 11:27 AM Feature #7238: erasure code : implement LRC plugin
- Fixed a few problems and running "a firefly upgrade suite":http://pulpito.ceph.com/loic-2014-08-19_20:27:09-upgrade:f...
- 03:08 PM Bug #9156: SWIFT tests failed in upgrade:dumpling:rgw-dumpling-distro-basic-vps suite
- Further analysis and chats with Loic and Yehuda revealed that in the apache access log we indeed have 30 sec, not 1200 se...
- 03:02 PM Bug #9156: SWIFT tests failed in upgrade:dumpling:rgw-dumpling-distro-basic-vps suite
- Suspected backport apache 2.4 issue, test branch wip-rgw-dumpling for ceph-qa-suite
Running now ... - 02:15 PM Fix #8914 (Need More Info): osd crashed at assert ReplicatedBackend::build_push_op
- I'm not able to reproduce the problem on *ceph version 0.84-343-g92b227e (92b227e1c0b1533c359e74c81de58140b483ee8e)* ...
- 01:15 PM rgw Bug #9155: Swift Subuser - 403 Forbidden - during upload/post
- I pushed a different fix to wip-8587, please take a look and see if you think it makes sense.
- 01:10 PM Feature #8155: Disallow changing cache_mode in nonsensical ways
- c3f403293c7f8d946f66a871aa015a558120ce78
- 01:10 PM Feature #8155 (Resolved): Disallow changing cache_mode in nonsensical ways
- 01:09 PM devops Feature #9050: Calamari builds for ceph.com
- Asking Ian and Neil, they confirm that what this means is "repos". The hard choice is going to be figuring out what ...
- 12:15 PM Bug #9170 (Resolved): erasure-code: preload erasure code plugins
- Whitelist the plugins to be preloaded.
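A sketch of what preloading buys, assuming dlopen-style plugin loading (paths and names are illustrative):

    #include <dlfcn.h>
    #include <cstdio>

    // If the daemon dlopen()s its plugins at startup, a package upgrade that
    // later replaces the .so files on disk cannot pull the rug out from under
    // the running process. The handle is deliberately never dlclose()d.
    static int preload_plugin(const char *path) {
      if (!dlopen(path, RTLD_NOW | RTLD_GLOBAL)) {
        fprintf(stderr, "preload of %s failed: %s\n", path, dlerror());
        return -1;
      }
      return 0;
    }

    int main() {
      // Whitelisted plugins only; e.g. skip isa where it is not built (#9186).
      const char *whitelist[] = {
        "/usr/lib/ceph/erasure-code/libec_jerasure.so",  // illustrative path
      };
      for (const char *p : whitelist)
        preload_plugin(p);
      return 0;
    }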
- 11:19 AM devops Feature #3019 (Closed): juju: modernize ceph charm, mon & osd bootstrap
- 11:11 AM rgw Bug #9169 (Resolved): 100-continue broken for centos/rhel
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-18_16:07:27-upgrade:dumpling-firefly-x-firefly-dis...
- 11:10 AM devops Feature #8868 (In Progress): Update Fedora to 0.80.5 packages with ceph-common
- 09:20 AM rgw Feature #8911: RGW doesn't return 'x-timestamp' in header which is used by 'View Details' of Open...
- I'll take a look. Seems like this is new functionality in RGW, not a bug, right?
- 09:13 AM CephFS Bug #9152: mds: beacon needs to not take mds_lock
- Hmm, the beacon send code doesn't need to hold the lock on its own, but it's triggered by the SafeTimer, which is jus...
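A minimal sketch of the alternative being discussed, with hypothetical names: beacons sent from a dedicated thread that never takes mds_lock:

    #include <atomic>
    #include <chrono>
    #include <mutex>
    #include <thread>

    std::mutex mds_lock;            // the contended "big lock"
    std::atomic<bool> stopping{false};

    void send_beacon() { /* build and send the beacon message */ }

    void beacon_loop() {
      while (!stopping) {
        send_beacon();  // deliberately does NOT take mds_lock
        std::this_thread::sleep_for(std::chrono::seconds(4));
      }
    }

    int main() {
      std::thread beacon(beacon_loop);
      {
        // A long-held mds_lock no longer starves the beacon.
        std::lock_guard<std::mutex> l(mds_lock);
        std::this_thread::sleep_for(std::chrono::seconds(10));
      }
      stopping = true;
      beacon.join();
      return 0;
    }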
- 09:07 AM rgw Documentation #9003: rgw: document development setup for rgw
- I have edited vstart.sh so that it can setup rgw automatically. I have also documented most of the steps needed by n...
- 09:02 AM rgw Documentation #9003 (In Progress): rgw: document development setup for rgw
- 09:05 AM CephFS Bug #9151: mds should log/error/warn when segments are NOT getting trimmed
- What kind of logging do we want? I assume you mean journal segments, and this is a bog standard operation...
If it's...
- 09:04 AM rgw Feature #8945: rgw: support swift /info api
- After spending some time on this call, I am going to have to break it down to smaller tasks. I am currently investig...
- 09:02 AM Bug #9143: Incorrect key sequence in encoding object name to key for GenericObjectMap
- How did you run across this? Is it feasible to fix it by typing the escaped strings and writing a custom comparator?
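A self-contained illustration of the ordering pitfall, using a hypothetical escape scheme (not GenericObjectMap's actual one):

    #include <cassert>
    #include <string>

    // Escaping can invert the order of two names, so keys built from escaped
    // names need a comparator consistent with the escaping.
    static std::string escape(const std::string &in) {
      std::string out;
      for (char c : in) {
        if (c == '.')      out += "%e";
        else if (c == '%') out += "%p";
        else               out += c;
      }
      return out;
    }

    int main() {
      const std::string a = "rb.data.123", b = "rb-123";
      assert(b < a);                  // raw names: '-' (0x2d) < '.' (0x2e)
      assert(escape(a) < escape(b));  // escaped: '%' (0x25) < '-' (0x2d)
      return 0;
    }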
- 07:47 AM Bug #9079: osd: bad learned_addr during send_boot
- "pending pull request":https://github.com/ceph/ceph/pull/2275
- 07:41 AM Feature #9167 (Resolved): erasure-code: check plugin version when loading it
- When loading the erasure code plugin, check the Ceph version against which it was built and fail if it does not match...
- 07:22 AM devops Bug #9166 (Closed): activate dmcrypt volumes via init script
- Hi,
I don't know if this is more a bug or a feature request.
I think it would be helpful if the activation of ceph ...
- 07:16 AM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- "firefly backport":https://github.com/ceph/ceph/pull/2286
- 07:10 AM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- The teuthology upgrade tests fail consistently with the same problem. Backporting to firefly seems to be the only way...
- 05:21 AM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- "Running upgrade:dumpling-firefly-x with the proposed fix":http://pulpito.ceph.com/loic-2014-08-19_14:23:09-upgrade:d...
- 06:49 AM CephFS Fix #4286: SLES 11 - cfuse: disable 'big_writes'and 'atomic_o_trunc
- 04:17 AM rbd Bug #9076: Can't completely remove a version 1 image on RHEL 7
- Ok it's better with ceph.com packages. You can close this :)
Thanks!
- 04:16 AM rbd Bug #9075: Can't create version 2 images on RHEL 7
- Ok it's better with ceph.com packages. You can close this :)
Thanks!
08/18/2014
- 11:21 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- FWIW - checked this myself on my home machine (which was *not* seeing this last issue recall, only the hang) by reboo...
- 07:48 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- My linux version is 3.2 && 3.5. I'll test on 3.13.0-32-generic to find out whether the kernel causes this bug.
- 07:00 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Excellent. Purely out of interest, any idea (now) why we only saw this bug on one particular system?
- 04:04 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Hmm, thanks very much! I'll send the patch.
Thanks again, Mark! - 03:44 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Retested with only debug-journal-header-3.diff on wip-9073. I did 200 test runs, good journal every time.
- 02:39 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- I think you should retest only using debug-journal-header-3.diff on wip-9073. And test more times to avoid the bug r...
- 02:36 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- I had your last debugging diff on there as well (I can retest without that if needed).
- 02:34 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Are you only applying debug-journal-header-3.diff on wip-9073 to test?
- 02:32 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Heh - sorry, means 'really fixed it well'!
- 02:30 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- What does 'nail it' mean? Sorry, I don't know.
- 02:21 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Lol, you certainly have - been a pleasure debugging this with you!
I actually applied the patch attached in this n...
- 02:01 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- growl, make that 3.13.0-32-generic, typed 'uname -a' in wrong (x)window before!
- 02:01 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- I have a thought. It's strange.
Using aio, the kernel uses user-space memory to write. But if, before the write to the journal, the u...
- 01:58 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- ...oh and kernel is 3.13.0-34-generic (sorry)!
- 01:52 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Yeah, disabling dio seems to get a consistently good header (10 consecutive runs)
- 01:22 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- From the latest ceph-osd.o.log. Before io_submit, the content is ok.
I found another issue.
2014-08-18 20:10:09.7...
- 01:10 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Seems I spoke too soon - a few more runs showed up:
$ hexdump -n8 journalblk-prestart--20864.txt
0000000 7000 033...
- 12:38 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- With *only* your latest patch applied to wip-9073 I'm seeing a good journal header:
$ hexdump -n8 journalblk-prest...
- 12:12 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Using my latest patch, is journal-header corrupt?
From my debug info, before io_submit and after aio completed, the ... - 09:44 PM rgw Bug #9155: Swift Subuser - 403 Forbidden - during upload/post
- Here's the pull request:
https://github.com/ceph/ceph/pull/2281
- 08:20 AM rgw Bug #9155: Swift Subuser - 403 Forbidden - during upload/post
- That's a duplicate of #8587; a pull request for your fix would be great.
- 07:49 AM rgw Bug #9155 (Resolved): Swift Subuser - 403 Forbidden - during upload/post
- Swift Upload fails with HTTP error 403 for a subuser that was created with the required permissions. This happens ge...
- 06:26 PM Bug #9062: Mon segfault in waitlist_or_zap_client
- the fix was merged in commit:321d4defd4a0f5a53a41276e6dc048479cb3084a
- 05:14 PM Bug #9145: recursive lock of CollectionIndex::access_lock (52)
- The fix Sam suggested is to name the CollectionIndex lock based on the collection names. This will make lockdep happy...
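A simplified sketch of that idea, with stand-in types for the lockdep-aware lock:

    #include <mutex>
    #include <string>

    // A lockdep-style checker keys its dependency graph on lock *names*, so
    // deriving the name from the collection makes index locks of different
    // collections distinct lock classes instead of one class apparently
    // locked recursively.
    struct NamedLock {
      std::string name;  // what a lockdep-style checker would track
      std::mutex m;
      explicit NamedLock(std::string n) : name(std::move(n)) {}
    };

    struct CollectionIndex {
      NamedLock access_lock;
      explicit CollectionIndex(const std::string &coll)
        : access_lock("CollectionIndex::access_lock::" + coll) {}
    };

    int main() {
      CollectionIndex a("pool0_head"), b("pool0_temp");
      std::lock_guard<std::mutex> la(a.access_lock.m);
      std::lock_guard<std::mutex> lb(b.access_lock.m);  // distinct class now
      return 0;
    }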
- 01:58 PM Bug #9145: recursive lock of CollectionIndex::access_lock (52)
- Sage,
Yes, I am able to reproduce this following the steps you suggested. But, this time I am hitting the issue in _...
- 04:51 PM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- "minimal fix":https://github.com/ceph/ceph/pull/2282
- 09:05 AM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- Stopping the daemons may not be the brightest idea because of http://tracker.ceph.com/issues/8849. Pre-loading the p...
- 08:09 AM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- 08:09 AM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- "proposed fix":https://github.com/ceph/ceph/pull/2278
- 07:27 AM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- Here is a possible scenario:
* ceph-osd-0.80.5 is running but did not load jerasure
* ceph-osd-0.83 is installed ...
- 07:09 AM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- Here is the part of the teuthology log dealing with the upgrade, which is immediately followed by a core dump from os...
- 06:43 AM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- Trying a manual upgrade...
- 06:25 AM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- The ceph-libs package is obsolete and the jerasure plugin now lives in the ceph package. The problem does not come fr...
- 06:18 AM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- It looks like the ceph-libs package is not upgraded, which explains the core dump: master cannot successfully load a...
- 05:31 AM Bug #9153 (Fix Under Review): erasure-code: jerasure_matrix_dotprod segmentation fault due to pac...
- "proposed fix":https://github.com/ceph/ceph/pull/2276
- 05:22 AM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- If the ceph-libs package is upgraded before the ceph package, it is entirely possible that the shared library is repl...
- 04:47 AM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- The upgrade sequence
* dumpling
* firefly -> installs and loads the jerasure plugin
* master -> installs an updat...
- 04:41 AM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- The stack trace is bizarre. ECUtil::decode calls ErasureCodeJerasure::encode_chunks, which makes no sense because a) de...
- 04:29 AM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- Got three VPS with rhel 6.5 installed, running the job on them with no "nuke-on-error"
- 03:43 AM Bug #9153 (In Progress): erasure-code: jerasure_matrix_dotprod segmentation fault due to package ...
- As soon as VPS are available, lock three and run the job again hoping to repeat it...
- 01:22 AM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- Ack
- 02:42 PM Feature #9161 (New): Cache warmup and ejection
- Initial access of an object in a high performance cache tier can have high latency as the object is fetched from the ...
- 02:20 PM rgw Bug #9160 (Closed): rgw failures with 'NoneType' object has no attribute 'get_contents_as_string'
Several jobs in this suite failed with this error:
http://pulpito.ceph.com/john-2014-08-18_16:28:28-rgw-wip-object...
- 01:56 PM rgw Bug #9125: rgw: swift tests fail with civetweb
- looks like the fix is merged to master, tested it on master branch and it worked fine.
will mark it as "Resolved"...
- 10:45 AM Bug #9158 (Duplicate): osd crashed in upgrade:dumpling-x:stress-split-master-distro-basic-vps suite
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-17_11:40:01-upgrade:dumpling-x:stress-split-master...
- 10:24 AM Bug #9072 (Resolved): error setting 'mon_pg_warn_min_objects' to '10K': (22) Invalid argument
- 09:23 AM Bug #9072: error setting 'mon_pg_warn_min_objects' to '10K': (22) Invalid argument
- I checked the firefly branch and Sage cherry-picked the required patches to it.
That ought to fix all issues with ...
- 09:08 AM devops Feature #9118: ceph-deploy: Add pre-generated keys to a Monitor
- Keith Schincke wrote:
> Can the precreated/populated keyring be propagated with the ceph-deploy command when the clu...
- 09:04 AM devops Feature #9118: ceph-deploy: Add pre-generated keys to a Monitor
- Can the precreated/populated keyring be propagated with the ceph-deploy command when the cluster is created?
- 08:23 AM Bug #9156 (Resolved): SWIFT tests failed in upgrade:dumpling:rgw-dumpling-distro-basic-vps suite
- 12 tests total failed in http://pulpito.front.sepia.ceph.com/teuthology-2014-08-17_12:05:01-upgrade:dumpling:rgw-dump...
- 05:17 AM Bug #9112 (Resolved): (wip-objecter) librados notify calls freezing
- No longer occurring after reinstating _recalc_linger_op_target and updating related bits of code
08/17/2014
- 11:52 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Meanwhile, I have been doing a little digging of my own: if I disable dio or aio via
[osd]
journal [d,a]io = fals...
- 11:40 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Here's the log with that patch applied.
- 07:27 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Hi Mark,
Could you test again? I add more debug message this time.
Thanks!
- 08:53 PM rbd Bug #8919 (Resolved): qemu-iotests fails to find common.env
- 05:15 PM Bug #9153: erasure-code: jerasure_matrix_dotprod segmentation fault due to package upgrade race
- Loic, can you take a look?
- 04:38 PM Bug #9153 (Resolved): erasure-code: jerasure_matrix_dotprod segmentation fault due to package upg...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-17_11:30:03-upgrade:dumpling-firefly-x-master-dist...
- 01:00 PM CephFS Bug #9152 (Resolved): mds: beacon needs to not take mds_lock
- any random task that holds the mds lock for a long time prevents beacons, which will trigger a failover
- 12:48 PM CephFS Bug #9151 (Resolved): mds should log/error/warn when segments are NOT getting trimmed
08/16/2014
- 10:01 PM rgw Bug #8621 (Pending Backport): civetweb frontend fails authentication if URL has special chars
- 09:55 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Sage's comment suggested I check something - reverting 4eb18dd487da4cb621dcbecfc475fc0871b356ac from wip-9073 and run...
- 08:59 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- I've reverted commit:4eb18dd487da4cb621dcbecfc475fc0871b356ac on next so we can release v0.84. once we sort this out...
- 12:47 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- jianpeng ma wrote:
> I read the #6003. I think they are not the same.
> You can see those two files (patch.diff ...
- 09:53 PM Feature #9030 (Resolved): mon: quickly identify 'problem' osds
- 09:26 PM Bug #9150 (Can't reproduce): osd/ECBackend.cc: 529: FAILED assert(pop.data.length() == sinfo.alig...
- ...
- 08:57 PM rgw Bug #9137 (Resolved): AH00534: apache2: Configuration error: No MPM loaded. (rpm distros)
- 04:56 PM rgw Bug #9137: AH00534: apache2: Configuration error: No MPM loaded. (rpm distros)
- works on el6 and el7. fc20 fails the ceph-qa-chef because of tiobench.
- 02:16 PM rgw Bug #9137: AH00534: apache2: Configuration error: No MPM loaded. (rpm distros)
- verified to work on precise and trusty.
still need to test on el6, el7, and fedora.
- 08:52 PM rgw Bug #9148 (Resolved): rgw: multiregion tests failing, s3tests.functional.test_s3.test_region_copy...
- ...
- 03:42 PM CephFS Bug #8574 (Resolved): teuthology: NFS mounts on trusty are failing
- chef adds a dummy export and restarts nfs-kernel-server now
- 02:41 PM CephFS Bug #8574: teuthology: NFS mounts on trusty are failing
- root@mira055:~# service nfs-kernel-server restart
* Stopping NFS kernel daemon ...
- 02:08 PM Linux kernel client Bug #9147 (Closed): krbd: run_xfstests.sh fails
- ...
- 02:07 PM rbd Bug #9146 (Can't reproduce): EPERM from image_read.sh
- ...
- 01:54 PM rgw Bug #9039: Using COPY on radosgw to copy object from one bucket to another that's in another pool...
- The restriping tool never made it to dumpling. It actually isn't even in firefly.
- 01:39 PM rgw Bug #9039 (Pending Backport): Using COPY on radosgw to copy object from one bucket to another tha...
- the restriping fix patches also need to go to dumpling...
- 01:46 PM Bug #8997: ceph_test_rados_watch_notify hangs
- ubuntu@teuthology:/a/sage-2014-08-15_21:44:35-rados-master-testing-basic-multi/427533 (probably)
- 01:43 PM Bug #9145 (Resolved): recursive lock of CollectionIndex::access_lock (52)
- ...
- 01:17 PM Feature #7238: erasure code : implement LRC plugin
- "running teuthology test run":http://pulpito.ceph.com/loic-2014-08-16_22:17:50-upgrade:firefly-x:stress-split-wip-723...
- 12:41 PM Bug #9144 (Fix Under Review): filestore: commit triggered during journal replay
- https://github.com/ceph/ceph/pull/2274
- 09:26 AM Bug #9144 (Resolved): filestore: commit triggered during journal replay
- ...
- 09:38 AM Feature #9033 (Resolved): erasure-code: simplified LRC
- "part of a larger pull request":https://github.com/dachary/ceph/commit/43b8f66797184b1138560184708573aa6930e8c4
- 09:15 AM Bug #9053 (Pending Backport): mon/Paxos.cc: 628: FAILED assert(begin->last_committed == last_comm...
- 07:47 AM Bug #9143 (Rejected): Incorrect key sequence in encoding object name to key for GenericObjectMap
- For example, two oids have the same hash and their names are:
A: "rb.data.123"
B: "rb-123"
In ghobject_t compare level, ...
- 06:02 AM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- "all green !":http://pulpito.ceph.com/loic-2014-08-16_10:42:43-upgrade:firefly-x:stress-split-wip-9025-chunk-remappin...
08/15/2014
- 07:50 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- I read the #6003. I think they are not the same.
You can see those two files (patch.diff (571 Bytes) ji...
- 06:19 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- This is starting to sound a lot like #6003!
- 01:56 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- hexdump -n8 journalblk-prestart.txt
0000000 3000 021d 0000 0000
- 12:09 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Can you paste the journal head after this command. Only first 8byte.
- 04:41 PM rgw Bug #9137: AH00534: apache2: Configuration error: No MPM loaded. (rpm distros)
- The MPM selection is supposed to be made in the default config.
- 01:23 PM rgw Bug #9137: AH00534: apache2: Configuration error: No MPM loaded. (rpm distros)
- Looking into this; my theory is two problems: 1) package structure changed in 2.4 and we might need to explicitly in...
- 11:46 AM rgw Bug #9137 (Resolved): AH00534: apache2: Configuration error: No MPM loaded. (rpm distros)
- ...
- 03:49 PM Bug #9130 (Resolved): (wip-objecter) FAILED assert(cur_con) in MonClient
- fix in wip-objecter
- 06:42 AM Bug #9130 (Resolved): (wip-objecter) FAILED assert(cur_con) in MonClient
http://pulpito.front.sepia.ceph.com/john-2014-08-15_03:34:51-rbd-wip-mds-contexts-testing-basic-multi/425519/
...
- 02:08 PM Bug #9119 (Pending Backport): READFORWARD ordering bug
- 02:03 PM RADOS Bug #8963 (Resolved): erasure coding crush rulset breaks rbd kernel clients on non-ec pools on Ub...
- backported to firefly
- 01:34 PM Bug #9142 (Can't reproduce): [ RUN ] LibRadosTwoPoolsPP.PromoteSnapScrub hang
- ubuntu@teuthology:/a/samuelj-2014-08-14_18:41:07-rados-wip-sam-testing-testing-basic-multi/425498
- 01:33 PM Bug #9140: [ FAILED ] LibRadosTwoPoolsPP.PromoteOn2ndRead (9913 ms)
- ubuntu@teuthology:/a/samuelj-2014-08-14_18:41:07-rados-wip-sam-testing-testing-basic-multi/425458
- 01:30 PM Bug #9140 (Duplicate): [ FAILED ] LibRadosTwoPoolsPP.PromoteOn2ndRead (9913 ms)
- 2014-08-15T05:48:20.619 INFO:tasks.workunit.client.0.plana16.stdout:[ OK ] LibRadosTwoPoolsPP.HitSetWrite (2908...
- 01:32 PM Bug #9141 (Can't reproduce): [ RUN ] LibRadosAio.IsCompletePP hang
- ubuntu@teuthology:/a/samuelj-2014-08-14_18:41:07-rados-wip-sam-testing-testing-basic-multi/425497
- 01:01 PM Bug #9139 (Rejected): ceph_test_rados reports incorrectly missing object
- ORDERSNAPS was fixing something important:
1) cache-primary sends DELETE on an object we are flushing
2) base-primary q...
- 11:28 AM devops Feature #9134 (Duplicate): ceph-deploy: add pre-generated client keys to MON
- 9118
- 11:22 AM devops Feature #9134 (Duplicate): ceph-deploy: add pre-generated client keys to MON
- User story: As an admin, I have already generated Ceph client keys and would like to add them to the cluster during t...
- 11:27 AM devops Feature #9136 (Resolved): ceph-deploy: use pre-existing ceph.conf
- User story: As an admin, I have already generated a ceph.conf file and would like to use it for a new cluster install...
- 11:26 AM Bug #9135 (Can't reproduce): ENOENT on collection_add
- ...
- 11:08 AM CephFS Feature #8869 (Resolved): MDS: support standby-replay on old-format journals
- This merged a couple of weeks ago in https://github.com/ceph/ceph/commit/440c820cce2c262570ab78e352bed8a630d41be5
- 10:49 AM devops Feature #9133 (Rejected): create ceph user/group; run daemons as ceph (non-root)
- this will involve lots of updates to packaging.
- 05:33 AM Feature #7238: erasure code : implement LRC plugin
- Teuthology job description:...
- 04:45 AM CephFS Bug #9105: ~ObjectCacher behaves poorly on EBLACKLISTED
- Punting on a general purpose fix for ObjectCacher for the time being, and just fixing this in librbd teardown.
- 04:44 AM CephFS Bug #9105 (Fix Under Review): ~ObjectCacher behaves poorly on EBLACKLISTED
- https://github.com/ceph/ceph/pull/2263
- 03:53 AM Bug #9128 (Resolved): Newly-restarted OSD may suicide itself after hitting suicide time out value...
- Stop one OSD daemon for a long time, like many hours even to 1 day, without marking it as out. During this time, ther...
- 03:40 AM Feature #9025 (Resolved): erasure-code: chunk remapping
- 03:38 AM Feature #9025: erasure-code: chunk remapping
- Teuthology job passes.
08/14/2014
- 11:25 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- The strace attached. So this is the mkfs...and wip-9073 with *just* the last patch applied.
- 11:20 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Can you use strace to catch the ceph-osd command? Please use strace -f to catch all child processes.
Thanks!
- 11:14 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Back to seeing the same error (invalid argument) with this latest patch :-(
- 10:58 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Mark, I can't find the reason. But I think this bug may be caused by my patch. So I modified my patch and hope the bug doesn't ...
- 10:58 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- ...suggests a memory overwrite problem - we really need to get the binaries running under valgrind!
- 08:11 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- With that last patch applied, journal header looks good every mkfs and osd is starting every time.
- 07:47 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Yes. It's a great step. A strange bug.
The attachment is a patch which adds read_header in some places. Can you try t...
- 07:41 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Very interesting: *sometimes* after the mkfs the header looks like:
0000000 b000 02b5 0000 0000 0001 0000 0000 00...
- 07:12 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Will do.
- 06:57 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- But from the code, when the osd starts, reading the journal header is the first thing done for the journal.
I don't know the command 's...
- 06:54 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Hmmm - just checked again and got:
$ hexdump journalblk-prestart.txt|head -1
0000000 3000 02a0 0000 0000 0001 000...
- 06:45 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Funny you should mention that, I had just check that myself:
So, just after the mkfs, journal header is:
$ hexd...
- 06:30 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Hi Mark,
- I used a different one on my side but I can't reproduce this.
From the deploy.sh, for the osd operation:
1: ceph-osd ...
- 03:33 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Can you use "strace -f ceph-osd .." to trace all syscalls?
We may find some clue from that info.
- 03:20 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- No, sorry,
$ sudo dd if=/dev/zero of=/dev/sdc1 bs=512
$ sudo ./deploy.sh
is the prescription. The result is os...
- 03:08 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Do you mean that if you zero the journal disk then the osd can start? Otherwise, it will hit this bug.
- 03:04 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Sure - I'm running the script attached initially - now using a minor variation thereof (attached again).
The only ot...
- 02:40 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- The first 8 bytes of the journal header are destroyed. But the debug info shows the content of the journal header is right.
Now ...
- 02:06 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Also, I note that running
$ sudo ceph-osd -i 0 --mkjournal
results in a journal state that lets the osd start, ...
- 01:38 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Shame about no 520, but here are the files.
- 06:06 PM rgw Bug #9125 (Resolved): rgw: swift tests fail with civetweb
- logs are copied to ubuntu@mira042.front.sepia.ceph.com:/home/ubuntu/civetweb_swift...
- 05:57 PM rgw Bug #8971 (Duplicate): rgw: s3 test failures with civetweb
- 05:56 PM rgw Bug #8971: rgw: s3 test failures with civetweb
- s3tests now pass on wip-8621 branch.
- 05:55 PM rgw Bug #8621: civetweb frontend fails authentication if URL has special chars
- s3tests passed with recent changes to wip-8621.
- 05:39 PM Bug #9058 (Need More Info): rest-api: long-running process may fail 'tell osd...' due to stale os...
- ok, my theory doesn't seem right.. Objecter is checking for a new map if it gets ENXIO or similar. enabled logging i...
- 05:36 PM devops Bug #8330 (Resolved): repodata on rpm repos do not list latest ceph-deploy (1.5.2)
- Thanks for verifying.
- 05:33 PM devops Bug #8976 (Fix Under Review): httpd on RHEL7 (RHEL repo) incompatible with mod_fastcgi (ceph repo)
- We have a new version available out at:
http://gitbuilder.ceph.com/apache2-rpm-rhel7-x86_64-basic/ref/master/
A...
- 05:13 PM Bug #8895: ceph osd pool stats (displayed incorrect values)
- Can probably close this as dupe of #5884?
- 04:14 PM CephFS Bug #9101: multimds: unlinked file is not pruned from replica mds caches
- 03:20 PM CephFS Bug #9123 (Can't reproduce): kceph: had 130k+ inodes with write caps
- in #9121 the client had more than 130k inodes open for write, resulting in a huge file recovery queue. there definit...
- 02:37 PM CephFS Bug #9121 (In Progress): mds: inode stuck recovering after client restart
- recovery is working.. there are just a lot of inodes queued:
2014-08-14 14:40:06.695087 7fd45f757700 10 mds.0.cach...
- 02:10 PM CephFS Bug #9121 (Resolved): mds: inode stuck recovering after client restart
- ...
- 01:51 PM CephFS Bug #9105: ~ObjectCacher behaves poorly on EBLACKLISTED
- John Spray wrote:
> This is happening when the librbd-using client is blacklisted, ObjectCacher fails to flush when ... - 10:16 AM CephFS Bug #9105: ~ObjectCacher behaves poorly on EBLACKLISTED
- This is happening when the librbd-using client is blacklisted, ObjectCacher fails to flush when requested, and ImageC...
- 09:44 AM CephFS Bug #9105: ~ObjectCacher behaves poorly on EBLACKLISTED
- Started failing in 061c8e93f76dc4fd6290d6d15723d76e73267444 where rbd_cache and rbd_cache_writethrough_until_flush we...
- 01:17 PM rgw Bug #8988 (Resolved): AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- 12:33 PM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- "the suite runs ok":http://pulpito.ceph.com/loic-2014-08-14_14:25:55-upgrade:firefly-x:stress-split-wip-9025-chunk-re...
- 05:55 AM rgw Bug #8988 (Fix Under Review): AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- "need review":https://github.com/ceph/ceph-qa-suite/pull/87
- 05:36 AM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- The reason why "the suite fails":http://pulpito.ceph.com/loic-2014-08-14_09:47:05-upgrade:firefly-x:stress-split-wip-...
- 12:53 AM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- It failed for the same reason. "Rescheduled once more, hoping the problem has been fixed":http://pulpito.ceph.com/loi...
- 01:13 PM Bug #8865 (Resolved): cep osd setmaxosd doesn't check if osds exist
- 12:37 PM Feature #9025: erasure-code: chunk remapping
- Now that the teuthology + MDS bugs are fixed, the following job will be scheduled to exercise remapping:...
- 11:10 AM Bug #9119 (Resolved): READFORWARD ordering bug
- READFORWARD is forwarding RWORDERED reads.
- 11:06 AM devops Feature #9118: ceph-deploy: Add pre-generated keys to a Monitor
- Any keys (client.admin or otherwise) in the keyring file passed to "ceph-mon --mkfs --keyring <foo>" will get seeded ...
- 10:56 AM devops Feature #9118 (Resolved): ceph-deploy: Add pre-generated keys to a Monitor
- ceph-authtool can be used to generate a key and keyring before a Ceph cluster is running, if a user has access to the...
- 10:54 AM Feature #9083 (Closed): Standalone script to generate Ceph keys
- Feature already exists in ceph-authtool
- 09:34 AM Bug #9113: osd: snap trimming eats memory, linearly
- a few notes:...
- 06:40 AM Bug #9113 (Resolved): osd: snap trimming eats memory, linearly
- - rados pool snapshot taken weekly
- trimmed when >30 days old
- trimming makes some osds consume memory linearly
... - 09:06 AM Bug #9054: ceph_test_rados: FAILED assert(!old_value.deleted())
- ubuntu@teuthology:/a/sage-2014-08-13_15:28:18-rados-next-testing-basic-multi/422862
- 09:05 AM Bug #9114: osd: segv in build_push_op
- note: i manually killed ceph_test_rados to make teuthology clean up
- 07:09 AM Bug #9114 (Duplicate): osd: segv in build_push_op
- ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-08-13_15:28:18-rados-next-testing-basic-multi/422759...
- 08:33 AM Bug #9102 (Resolved): ceph-disk has undefined variables
- 07:51 AM Bug #9102 (Fix Under Review): ceph-disk has undefined variables
- PR opened https://github.com/ceph/ceph/pull/2251
- 07:58 AM rgw Documentation #9116 (Resolved): rgw: broken link
- From Luis Pabon:...
- 07:21 AM devops Bug #9066 (Rejected): Need ceph-deploy to be able to run to JUST generate ceph.conf and keyring w...
- The initial issue was misunderstood; ceph-deploy is already able to create a ceph.conf and a mon keyring. Other requi...
- 06:47 AM Bug #9062 (Resolved): Mon segfault in waitlist_or_zap_client
- 06:40 AM Bug #9112 (In Progress): (wip-objecter) librados notify calls freezing
- 06:39 AM Bug #9112: (wip-objecter) librados notify calls freezing
- Client log with objecter and librados debug logging at 20 in teuthology:~/jcsp/9112
- 06:28 AM Bug #9112 (Resolved): (wip-objecter) librados notify calls freezing
Hitting this in rbd tests, periodically the ceph_test_rados_fsx process gets stuck inside IoCtxImpl::notify
...
- 06:34 AM CephFS Bug #8725 (Resolved): mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic-plana
- 06:16 AM devops Feature #9103: create a (generic) webservice to handle Sphinx documentation versions
- 1.- Adding something to the Sphinx build is non-trivial. Sphinx extensions (the right way to do this) are very comple...
- 02:48 AM Bug #9111: PG stuck with 'active+remapped' forever with cluster wide change (add/remove OSDs)
- Right after I filed this bug, I got a clue: I found the problem came from those removed OSDs (which have status DNE...
- 02:01 AM Bug #9111 (Won't Fix): PG stuck with 'active+remapped' forever with cluster wide change (add/remo...
- After adding/removing OSDs, some PGs stuck with 'active+remapped' forever.
1. ceph -s
-bash-4.1$ ceph -s...
- 04:41 PM rgw Bug #9137: AH00534: apache2: Configuration error: No MPM loaded. (rpm distros)
- Thanks Sage, the issue has been resolved; the cluster is healthy now.
08/13/2014
- 11:49 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Mark, I can't find the ssd in the lab.
And I also can't find the code. But from my two patches, I don't modify code which c...
- hexdump journalblk.txt
0000000 1000 03ce 0000 0000 0001 0000 0000 0000
0000010 bdb9 29ac 51d7 a343 3bbf 1114 622e...
- 06:51 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Here's the 4096 bytes of sdc1
- 06:41 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- For the code,there is a logic error.
int r = ::pread(fd, bp.c_str(), bp.length(), 0);
bl.push_back(bp);
try ...
- 06:21 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Can you read the first 4096 bytes of /dev/sdc1 and send them to me?
The journal header is in the first 4096 bytes.
- 06:12 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- The info for the Intel 520:
Re more journal debugging - sure, I already have the following set:
[osd]
debug os...
- 06:09 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- The script puts in symlinks (also note slightly different osd data path on the work machine):
$ ls -l /var/lib/cep...
- 06:04 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- From your message, I found:
14-08-14 10:58:01.735317 7f944f5e4800 20 journal _check_disk_write_cache: disk write cach...
- 05:36 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Can you send the 520 disk-info using hdparm to me?
I'll search the lab and try to find this SSD.
Thanks!
- 05:13 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Can you print more debug info about the journal?
From the messages:
journal read_header error decoding journal header
...
- 03:58 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Doing a secure erase of the 520s changes nothing. Still seeing problem 2/ 'invalid argument' opening the journal.
- 01:55 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- An aside - to rule out weird SSD-related stuff, I had performed a secure erase on the Crucial m4s while inves...
- 01:40 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- I'm happy to report that wip-9073 definitely fixes problem 1/ (the hang).
- 01:04 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- to (hopefully) clarify the errors:
- Home machine: osd mkfs hangs (which I've called 1/)
- Work machine: osd mkfs...
- 12:56 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Interesting... I'm just building wip-9073 on my home machine now, will update you with what I find.
The issue *mig...
- 09:17 PM rgw Feature #8473: rgw: Shard bucket index objects to improve single bucket PUT throughput
- Here is the first patch - https://github.com/ceph/ceph/pull/2187
- 09:16 PM Bug #7521 (Won't Fix): Add more events (hold object context) to OpTracker to better analyze perfo...
- With more understanding of the tracker, I found that the issue being tracked by this bug can actually be achieved by the c...
- 09:14 PM Bug #7710 (Resolved): Multiple rados bench instance will overwrite the metadata object
- 09:10 PM Documentation #6142: Ceph needs more than 32k pids
- John, not sure where this should go in the doc structure...
- 06:20 PM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- Loic, we had a disk failure, and possibly the suite failed due to that (guessing); I re-started it http://pulpito.front.sepia...
- 04:11 PM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- Waiting for "Shipping apache config":https://github.com/ceph/ceph-qa-suite/blob/master/tasks/rgw.py#L82 with...
- 04:04 PM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- "running a suite using the new VPS.yaml":http://pulpito.ceph.com/loic-2014-08-14_01:02:11-upgrade:firefly-x:stress-sp...
- 03:47 PM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- "fix indentation of rgw override":https://github.com/ceph/ceph-qa-suite/pull/85
- 03:35 PM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- To confirm there is a large delay requiring a large idle_timeout:...
- 03:33 PM rgw Bug #8988 (In Progress): AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- 04:30 PM Bug #9109 (New): ceph CLI: Help is missing -k keyring option
- The ceph command line should provide a -k keyring argument. "ceph --help" does not appear to list the -k option for t...
- 04:28 PM Bug #9087 (Need More Info): ceph_test_rados_list_parallel hang
- 02:21 PM Bug #9087: ceph_test_rados_list_parallel hang
- added some debugging.
- 12:47 PM Bug #9087: ceph_test_rados_list_parallel hang
- Looking
- 04:22 PM Bug #9053: mon/Paxos.cc: 628: FAILED assert(begin->last_committed == last_committed)
- Paxos::handle_last() bug.
the peon:... - 04:17 PM Bug #9053: mon/Paxos.cc: 628: FAILED assert(begin->last_committed == last_committed)
- 03:35 PM CephFS Bug #8964 (Resolved): kcephfs: client does not resend requests on mds restart
- 03:13 PM CephFS Bug #8725 (Fix Under Review): mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic...
- https://github.com/ceph/ceph/pull/2254
- 02:46 PM Cleanup #9106: ceph-authtool: Modifying user without --gen-key overwrites the key
- Wasn't able to reproduce this after retrying. Maybe just a usage issue.
- 02:24 PM Cleanup #9106 (Resolved): ceph-authtool: Modifying user without --gen-key overwrites the key
- If you are trying to modify a user's caps/permissions using ceph-authtool, and the user has an existing key, specifyi...
- 02:37 PM RADOS Feature #9108 (New): ceph auth get: Get multiple users
- The "ceph auth get <user>" command with the -o option is an ideal way to create a keyring for an individual user. How...
- 02:37 PM Fix #8914: osd crashed at assert ReplicatedBackend::build_push_op
- Hmm, most likely a bug in repair. We should start by creating a teuthology task which reproduces the bug. Once we h...
- 02:27 PM RADOS Feature #9107 (New): ceph-authtool: Delete a user.
- Currently, there is no corresponding "delete" feature that allows a user to delete a user from a keyring. We should h...
- 02:25 PM Feature #8389 (Resolved): osd: clean up old ec objects more aggressively
- 02:25 PM Feature #8480 (Resolved): modify scrub to detect/repair obsolete rollback objects
- 02:15 PM CephFS Bug #9105 (New): ~ObjectCacher behaves poorly on EBLACKLISTED
In ceph master 78dc4df
http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-11_23:00:01-rbd-master-testing-bas...
- 01:59 PM devops Feature #9103: create a (generic) webservice to handle Sphinx documentation versions
The calamari docs already include a version (albeit a rather verbose one including the git hash). I guess with a l...- 01:06 PM devops Feature #9103 (Resolved): create a (generic) webservice to handle Sphinx documentation versions
- None of our docs allow a user to:
* Have a visual cue of what version of the docs they are seeing.
* be warned ...
- 01:44 PM CephFS Bug #8962: kcephfs: client does not release revoked cap
- ...
- 01:19 PM CephFS Bug #8962: kcephfs: client does not release revoked cap
- ...
- 01:39 PM CephFS Bug #9101: multimds: unlinked file is not pruned from replica mds caches
- looks like the problem is that another mds has the inode in its cache and isn't trimming it (or being asked to trim i...
- 01:13 PM CephFS Bug #9101 (Fix Under Review): multimds: unlinked file is not pruned from replica mds caches
- https://github.com/ceph/ceph/pull/2250
- 11:36 AM CephFS Bug #9101: multimds: unlinked file is not pruned from replica mds caches
- Here is the debug data when using a ceph-fuse client.
We did reproduce the problem.
- 11:15 AM CephFS Bug #9101 (New): multimds: unlinked file is not pruned from replica mds caches
- as a result, deleted files stay pinned for a long time and space does not get freed.
- 01:35 PM Bug #9055 (Resolved): LibRadosTwoPoolsPP.HitSetWrite (and others) fail on remove of whiteout
- 01:30 PM Bug #9052 (Resolved): ceph-mon crashes with *** Caught signal (Floating point exception) **
- 12:38 PM CephFS Feature #9029 (Resolved): min/max uid for snapshot creation
- 11:59 AM Bug #9102 (Resolved): ceph-disk has undefined variables
- We fail to track them because the build doesn't yell at us; in the meantime, those should be fixed....
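As a hedged sketch of how the build could be made to yell, assuming pyflakes is installed (the tool choice here is an illustration, not necessarily what the fix uses):
$ pyflakes src/ceph-disk    # reports undefined names in the script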
- 10:46 AM Bug #9096 (Resolved): OSD::require_same_peer_instance fails to acquire lock
- 10:23 AM Bug #9096 (Fix Under Review): OSD::require_same_peer_instance fails to acquire lock
- https://github.com/ceph/ceph/pull/2249
- 03:38 AM Bug #9096: OSD::require_same_peer_instance fails to acquire lock
- It is the cause of http://tracker.ceph.com/issues/9074
- 03:37 AM Bug #9096 (Resolved): OSD::require_same_peer_instance fails to acquire lock
- It can be reproduced by running a few times (less than 5) *qa/workunits/cephtool/test.sh -t mon_osd*. It will eventua...
- 10:33 AM Bug #9082 (Resolved): Ceph Firefly 0.80.5 : PG has invalid (post-split) stats; must scrub before ...
- 09:11 AM Bug #9082: Ceph Firefly 0.80.5 : PG has invalid (post-split) stats; must scrub before tier agent ...
- i've pushed wip-9082-firefly... can you please try this and see if it avoids the crash? i was looking for a divide b...
- 08:34 AM Bug #9082: Ceph Firefly 0.80.5 : PG has invalid (post-split) stats; must scrub before tier agent ...
- Hello Sage
Thanks for your time checking this bug. As requested, I have found some PGs and 3 OSDs which are making...
- Hello Sage
I have found some PGs / OSDs that make agent_choose_mode() unhappy. I am attaching logs of 2 differen...
- 09:22 AM Feature #9097 (New): request for tools/commands to see hits/misses on cache pools
- request for tools/commands to see hits/misses on cache pools
- 07:23 AM Bug #9085 (Resolved): erasure-code: ISA plugin does not load
- The isa plugin "wip-firefly-isa":https://github.com/ceph/ceph/tree/wip-firefly-isa does not have the bug. It was intr...
- 03:39 AM devops Bug #9074 (Duplicate): gitbuilder: make check does not complete, sometimes
- It happens because of http://tracker.ceph.com/issues/9096
- 01:57 AM devops Bug #9074: gitbuilder: make check does not complete, sometimes
- Wrong diagnosis: the error is not from here. It loops while waiting for osds to come back up "a few lines below":htt...
- 01:02 AM devops Bug #9074: gitbuilder: make check does not complete, sometimes
- "test.sh":https://github.com/ceph/ceph/blob/ea731ae14216bb479eff1f86ed6bd4a7cb71fb56/qa/workunits/cephtool/test.sh fa...
- 03:17 AM rbd Bug #9078: Removing an RBD is very slow whenever there is write's in other RBD which also belongs...
- RBDs are created with different order parameters
- 02:00 AM rbd Bug #9078: Removing an RBD is very slow whenever there is write's in other RBD which also belongs...
- The setup is not available, so I'm unable to check "ceph -w"; below is information based on the IO tool (fio).
before rbd remove: io...
- 12:27 AM Bug #9077: Cluster is up in MON node even if Ceph is uninstalled in OSD node
- Mon logs and dmesg logs of mon node are attached
- 12:14 AM rbd Bug #9075: Can't create a version 2 images on RHEL 7
- Ok will do :).
08/12/2014
- 10:51 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- I can't reproduce.
From your messages, I can't find any error info.
Or am I missing something?
- 10:28 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Yeah, I'm using that command.
Sorry - messed up the commit hash: 4eb18dd487da4cb621dcbecfc475fc0871b356ac
- 10:23 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Are you using this command "ceph-osd --id 0 --mkjournal --mkfs --osd-data /data1/cephdata --osd-journal /dev/sdc1"?
...
- 10:10 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Building wip-9073. Hmmm still getting the invalid argument error and osd down. I'm guessing this means there are two ...
- 09:01 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Very quick work! Will test...
- 08:47 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Mark, I've pushed this as wip-9073.. can you please test?
Thanks, Jianpeng! Sorry I missed the pull request earlier! - 08:36 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Yes, I already found this bug. If the journal uses aio mode, the bug occurs.
The https://github.com/ceph/ceph/pull/2185 c...
- 08:28 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- When you say that reverting fixes it, do you mean that it allows an OSD that was erroring out on start to then start,...
- 06:31 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- ...or maybe the ::open()
- 06:14 PM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- On a different machine instead of a hang I am reliably getting:
2014-08-13 12:50:28.253439 7ffc701bb8c0 -1 ** ERR...
- 01:40 AM Bug #9073: OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- A correction - the 'stuck on a mutex' comment is completely wrong - sorry - I'd attached strace to the ceph-osd proces...
- 09:03 PM Feature #8560 (Pending Backport): mon: instrument paxos
- 06:27 PM Bug #8886: Miss some folders in PG's folder
- I see. Thank you for your reply~
- 01:43 PM Bug #8886 (Closed): Miss some folders in PG's folder
- ./default.4281.322\u\ushadow\u.Ndfi3nAmRHjph\uXyzjJQutltgGi1Dkd\u1__head_17F630A2__1b_ffffffffffffffff_7
appears t...
- 06:18 PM Bug #9067 (Resolved): (wip-objecter) Objecter assertion in SIGINT handler
- ...
- 04:43 PM Bug #8894: osd/ReplicatedPG.cc: 9281: FAILED assert(object_contexts.empty())
- 04:20 PM Bug #8894 (Resolved): osd/ReplicatedPG.cc: 9281: FAILED assert(object_contexts.empty())
- 12:19 PM Bug #8894: osd/ReplicatedPG.cc: 9281: FAILED assert(object_contexts.empty())
- 12:19 PM Bug #8894: osd/ReplicatedPG.cc: 9281: FAILED assert(object_contexts.empty())
- wip-9054
- 11:25 AM Bug #8894: osd/ReplicatedPG.cc: 9281: FAILED assert(object_contexts.empty())
- I think it's the C_Copyfrom which we gave the objecter in _copy_some. It's got a CopyOpRef.
- 04:34 PM Bug #9054: ceph_test_rados: FAILED assert(!old_value.deleted())
- This sounds right to me!
- 03:58 PM Bug #9082 (Need More Info): Ceph Firefly 0.80.5 : PG has invalid (post-split) stats; must scrub b...
- 10:57 AM Bug #9082: Ceph Firefly 0.80.5 : PG has invalid (post-split) stats; must scrub before tier agent ...
- I have injected debug osd 20 into one OSD, and then tried to initiate rados bench on an EC pool which is tiered with c...
- 09:31 AM Bug #9082: Ceph Firefly 0.80.5 : PG has invalid (post-split) stats; must scrub before tier agent ...
- can you reproduce this with debug osd = 20 and attach the log? thanks!
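One way to do that on a running OSD, as a sketch (osd.0 is just an example id):
$ ceph tell osd.0 injectargs '--debug-osd 20'    # raise the log level live
or set it in ceph.conf under [osd] and restart the daemon:
[osd]
debug osd = 20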
- 08:27 AM Bug #9082: Ceph Firefly 0.80.5 : PG has invalid (post-split) stats; must scrub before tier agent ...
I have sent one email to the ceph mailing list today, which is related to a problem with a Ceph pool. ...
- 07:59 AM Bug #9082 (Resolved): Ceph Firefly 0.80.5 : PG has invalid (post-split) stats; must scrub before ...
- Hello
Ceph version : 0.80.5
Centos 6.5
Features in use : erasure coding and cache tiering
Few hours back m...
- 03:48 PM Bug #9064 (Resolved): RadosModel assertion failure
- 03:48 PM Bug #9064 (Pending Backport): RadosModel assertion failure
- 03:26 PM Bug #9064: RadosModel assertion failure
- 03:26 PM Bug #9064: RadosModel assertion failure
- wip-9064
- 03:25 PM Bug #9064: RadosModel assertion failure
- Got it: 0ed3adc1e0a74bf9548d1d956aece11f019afee0
We're redirecting RW ordered reads due to the second read promote...
- 02:00 PM Bug #9064: RadosModel assertion failure
I've now seen this in a case where the client wasn't in the process of handling a new OSD map (but the server was),...- 05:17 AM Bug #9064: RadosModel assertion failure
- This just reproduced on master 78dc4df, so looks like it's not wip-objecter specific.
- 03:24 PM Messengers Bug #8880 (Resolved): msg/Pipe.cc: 1538: FAILED assert(0 == "old msgs despite reconnect_seq featu...
- 03:18 PM Bug #8860 (Resolved): ceph-disk issues with custom cluster name
- 12:21 PM Bug #8860 (Pending Backport): ceph-disk issues with custom cluster name
- 03:16 PM Bug #8625 (Resolved): EC pool - OSD creates an empty file for op with 'create 0~0, writefull 0~xx...
- 03:11 PM rgw Bug #8539 (Resolved): civetweb backend responds with a body when a HEAD request yields an error
- 03:02 PM Bug #8982 (Resolved): cache pool osds crashing when data is evicting to underlying storage pool
- 03:02 PM Bug #8714 (Resolved): we do not block old clients from breaking cache pools
- 03:01 PM Bug #8944 (Resolved): Ceph daemon bad asok used in connection with cluster
- 02:59 PM Bug #9080 (Resolved): LogClient: sends dup messages, misses some
- 01:15 PM Bug #9080 (Pending Backport): LogClient: sends dup messages, misses some
- 07:02 AM Bug #9080 (Resolved): LogClient: sends dup messages, misses some
- noticed that 'ceph -s' wouldn't show the most recent log message. tracing things, it turns out that it was alw...
- 02:58 PM Bug #9022 (Resolved): Potential lock leaks in RadosClient
- 02:57 PM Bug #7999 (Resolved): osd: pgs share info that hasn't been persisted
- 02:57 PM rgw Bug #8169 (Resolved): rgw: swift user manifest does not compute etag
- 02:56 PM rgw Bug #8269 (Resolved): rgw: corrupted multipart object
- 02:56 PM Bug #8438 (Resolved): erasure code: object are not cleanup
- 02:56 PM rgw Bug #8442 (Resolved): rgw: does not detect/adapt to erasure pool stripe size
- 02:56 PM rgw Bug #8586 (Resolved): Missing Swift API Header causes RadosGW to segfault
- 02:55 PM rbd Bug #8912 (Resolved): librbd segfaults when creating new image (rbd-ephemeral-clone-stable-icehouse)
- 12:38 PM rbd Bug #8912 (Pending Backport): librbd segfaults when creating new image (rbd-ephemeral-clone-stabl...
- 02:54 PM Bug #8670 (Resolved): Cache tiering parameters can not be displayed for a pool
- 02:48 PM Bug #8696 (Resolved): mon: 'osd pool set' must take into account pool's nature when setting some ...
- 02:48 PM Bug #8701 (Resolved): osd: scrub found obsolete rollback obj
- 02:47 PM rgw Bug #8702 (Resolved): RadosGW incorrectly converting + to space in URLs
- 02:46 PM Bug #8733 (Resolved): OSD crashed at void ECBackend::handle_sub_read
- 02:39 PM Bug #8882 (Resolved): osd: osd tier remove ... leaves incomplete clones behind, confusing scrub
- 02:39 PM Bug #8889 (Resolved): osd/ReplicatedPG.cc: 5162: FAILED assert(got)
- 02:38 PM rbd Bug #8920 (Resolved): rbd/singleton/{all/formatted-output.yaml} fails on trusty due to whitespace
- 02:38 PM rgw Bug #8928 (Resolved): rgw: bad object created if stripe size is not a multiple of chunk size
- 02:38 PM Bug #8931 (Resolved): failed write reply order from ceph_test_rados
- 02:37 PM rgw Bug #8937 (Resolved): rgw: broken large(-ish) objects
- 02:37 PM Bug #8943 (Resolved): "ceph df" cannot show pool available space correctly
- 02:37 PM Bug #8969 (Resolved): PerfCounters.SinglePerfCounters failure on i386
- 02:37 PM rgw Bug #8972 (Resolved): rgw: bucket index log wrong object name in multipart completion
- 02:34 PM Bug #9085 (Pending Backport): erasure-code: ISA plugin does not load
- 09:46 AM Bug #9085 (Fix Under Review): erasure-code: ISA plugin does not load
- "need review":https://github.com/ceph/ceph/pull/2245
- 09:20 AM Bug #9085 (Resolved): erasure-code: ISA plugin does not load
- Because the plugin was not compiled with ErasureCode.cc
- 02:07 PM devops Bug #8160 (Duplicate): multipath-tools does not co-exist with ceph
- If/when we implement multipath support in ceph-deploy, this should be resolved.
- 01:43 PM rgw Bug #9089 (Resolved): rgw: copy_obj_data() does not stripe target object
- copy_obj_data(), as it is now, is reminiscent of a very old architecture. It should be modified to create a striped o...
- 01:36 PM Bug #8591 (Resolved): ceph-disk incorrectly colocates journal when using dm-crypt
- wip-ceph-disk
- 01:35 PM Bug #8922: ceph-deploy mon create fails to create additional monitoring nodes.
- does 'hostname' on those machines return the same string, or does it include a domain name, or something different?
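A quick way to compare what each machine reports, using plain hostname(1) (nothing ceph-deploy-specific):
$ hostname      # short host name
$ hostname -f   # fully-qualified name, if DNS/hosts are configured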
- 01:34 PM Bug #8985: "[WRN] map e9 wrongly marked me down" in upgrade:dumpling-x-firefly---basic-vps suite
- change the vps.yaml timeout to 90 seconds instead of 40.. these should go away then
- 01:33 PM Bug #8986 (Duplicate): "[WRN] map e62 wrongly marked me down" in upgrade:dumpling-x-firefly---bas...
- 01:33 PM Bug #9012 (Duplicate): "[WRN] map e277 wrongly marked me down" in upgrade:dumpling-x-firefly---ba...
- 01:32 PM Bug #9011 (Duplicate): osd memory leaks on next
- #9023
- 01:27 PM devops Bug #9061 (Resolved): dumpling to firefly upgrade on RH6 restarts the daemons
- 01:26 PM Bug #8974 (Need More Info): osd crashed with merge_log assert due to removal of osds
- 01:25 PM Bug #8974: osd crashed with merge_log assert due to removal of osds
- We can probably make some progress if you reproduce with
debug ms = 1
debug osd = 20
debug filestore = 20
on ...
- 01:14 PM Bug #8505 (Resolved): OSD osd/OSD.cc: 6222: FAILED assert(p->second.empty())
- 01:13 PM Bug #8691 (Resolved): osd: PG::_lock, OSD::pg_map_lock lock cycle
- 01:10 PM Bug #8939 (Duplicate): stalled LibRadosTwoPoolsPP.TryFlushReadRace; client failed to reconnect?
- #8891
- 01:09 PM Bug #8940 (Duplicate): 3.22s1 shard 0(2) missing ad166f62/benchmark_data_plana57_30491_object1036...
- 01:06 PM Bug #9069 (Resolved): rgw tests reported as failed in teuthology-2014-08-11_10:35:04-upgrade:dump...
- 12:43 PM rgw Bug #8784: rgw: completion leak
- Note that all the failures are at the copy object across regions path. I did find a missing cleanup at the error hand...
- 10:53 AM Bug #9058: rest-api: long-running process may fail 'tell osd...' due to stale osdmap
- ubuntu@teuthology:/a/teuthology-2014-08-10_02:30:01-rados-next-testing-basic-plana/412468
- 10:08 AM Bug #9087 (Can't reproduce): ceph_test_rados_list_parallel hang
- ...
- 09:09 AM rbd Bug #6631 (Need More Info): disabling writethrough until flush appears to disable RBD cache
- Amit Vijairania wrote:
> More repetition of tests..
>
> // IOPS for Sequential 4KB Write _with_ "rbd cache writet...
- 09:07 AM rbd Bug #9078 (Need More Info): Removing an RBD is very slow whenever there is write's in other RBD w...
- it sounds like the cluster is just under heavy load. can you confirm how many ops ceph -w shows before and during th...
- 05:09 AM rbd Bug #9078 (Rejected): Removing an RBD is very slow whenever there is write's in other RBD which a...
- Configuration:
3 nodes with mon and 3 nodes with OSDs connected via enclosure/JBOD, 15 OSDs in total
Steps followed:
...
- 09:07 AM Feature #9083 (Closed): Standalone script to generate Ceph keys
- Goal: To allow 3rd party products which will be acting as Ceph clients to be able to install & configure all Ceph-cli...
- 09:04 AM Bug #9077 (Need More Info): Cluster is up in MON node even if Ceph is uninstalled in OSD node
- can you turn up mon logging (if it isn't up already) and attach the log from the leader? these should get marked dow...
- 04:49 AM Bug #9077 (Can't reproduce): Cluster is up in MON node even if Ceph is uninstalled in OSD node
- Configuration:
1 mon and 1 OSD node, 7 OSDs in total
Steps followed:
1. Make Cluster up in single node and e...
- 09:00 AM rbd Bug #8845 (Resolved): Flattening Clones of clone, results in command failure
- 09:00 AM rbd Bug #9075 (Need More Info): Can't create a version 2 images on RHEL 7
- can you retry with the ceph.com package? the 0.81 from fedora is all kinds of busted.
- 02:45 AM rbd Bug #9075 (Resolved): Can't create a version 2 images on RHEL 7
- Hi,
I can't create version 2 images; version 1 works, though.
# rbd create -s 10240 --image-format 2 lesebb
20...
- 08:56 AM Bug #8595 (In Progress): osd: client op blocks until backfill starts (dumpling)
- with this patch, I see filestore tripping over ENOENT on clone:
ubuntu@teuthology:/a/teuthology-2014-08-11_19:00:0...
- 07:35 AM rgw Bug #9002: Creating swift key with --gen-secret in separate step from subuser creation fails
- have met this on Wheezy and Ubuntu with Ceph 0.80.5 too.
it can be successful when using:
radosgw-admin user create --su...
- 07:31 AM CephFS Bug #9056: fuse kmod + ceph-fuse triggers "BUG: sleeping function called from invalid context"
- ...
- 06:51 AM CephFS Bug #9056 (Resolved): fuse kmod + ceph-fuse triggers "BUG: sleeping function called from invalid ...
- 05:10 AM CephFS Bug #9056: fuse kmod + ceph-fuse triggers "BUG: sleeping function called from invalid context"
- This is supposed to be fixed upstream in v3.16-rc6 by commit c55a01d360af, will close this when we've seen a clean fs...
- 07:20 AM Bug #9044: erasure-code: use ruleset instead of ruleid
- "backport to firefly":https://github.com/ceph/ceph/pull/2244
- 05:58 AM Bug #9044 (Pending Backport): erasure-code: use ruleset instead of ruleid
- 05:57 AM Bug #9044 (Resolved): erasure-code: use ruleset instead of ruleid
- 05:55 AM Bug #9044: erasure-code: use ruleset instead of ruleid
- Works. The problems of this run are
* "unrelated MDS decode bug":http://pulpito.ceph.com/loic-2014-08-12_10:00:07-... - 12:58 AM Bug #9044: erasure-code: use ruleset instead of ruleid
- "scheduled upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-12_10:00:07-upgrade:firefly-x:stress-...
- 06:56 AM CephFS Bug #8648: Standby MDS leaks memory over time
- Any chance you can run one of these in standby under massif for a while? that will tell us what is leaking!
- 06:55 AM CephFS Bug #8651 (Won't Fix): crashing mds in an active-active mds setup
- this MDS got blacklisted. there is an open issue somewhere to make the shutdown more friendly, but the behavior is ...
- 06:52 AM Bug #9023: valgrind failures in OSD
- The leaks in the init stuff seem likely also to be present on master
- 06:50 AM CephFS Bug #8725: mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic-plana
- we probably have to do a reencoding trick like we do in MOSDMap?
- 06:48 AM CephFS Bug #8876 (Resolved): kcephfs: hang on read of length 0
- 06:22 AM Bug #9079 (Resolved): osd: bad learned_addr during send_boot
- ...
- 06:10 AM Bug #8520: osd: segv in PushOp::print()
- ...
- 03:27 AM rbd Bug #8385: RBD / QEMU Crash: Invalid fastbin entry (free)
- Any interest in a lookalike bug from Cuttlefish?
/lib/x86_64-linux-gnu/libc.so.6(+0x7e566)[0x7f7cd15ad566]
/usr/...
- 02:55 AM rbd Bug #9076 (Resolved): Can't completely remove a version 1 image on RHEL 7
- I can create a version 1 image; however, the deletion is not complete.
# rbd create -s 10240 --image-format 1 leseb
...
- 12:54 AM devops Bug #9074: gitbuilder: make check does not complete, sometimes
- "re-run the build to check if it fails always or sometimes":http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-tarball-...
- 12:52 AM devops Bug #9074 (Duplicate): gitbuilder: make check does not complete, sometimes
- It looks like the i386 build fails because a timeout interrupts it before it gets a chance to complete.
It could be t...
08/11/2014
- 09:15 PM Bug #9073 (Resolved): OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- Using a src build (and the packages built from it) on Ubuntu 14.04 x86_64. Ceph version is 0.83-399-gf77449c.
In ...
- 08:53 PM rbd Bug #9071 (Duplicate): mkfs.ext4 stuck in D state on RBD with kernel client
- This is a bug in 3.15; it is not present in 3.14. The fix will make it into the next stable 3.15 release soon.
- 07:32 PM rbd Bug #9071: mkfs.ext4 stuck in D state on RBD with kernel client
- Please, mark this issue as duplicate of http://tracker.ceph.com/issues/8818
- 06:06 PM rbd Bug #9071: mkfs.ext4 stuck in D state on RBD with kernel client
- Reproducible on all my ceph hosts (all with the same kernel), with any image format (1 or 2). But only with mkfs.ext4...
- 05:47 PM rbd Bug #9071 (Duplicate): mkfs.ext4 stuck in D state on RBD with kernel client
- I tried to create ext4 on newly created and mapped RBD image, but mkfs.ext4 stuck:
# mkfs.ext4 /dev/rbd/docker.rbd...
- 06:15 PM Documentation #8955 (Resolved): doc refers to [default] section, don't think it exists
- 's/[default]/[global]/'
- 06:10 PM Documentation #8955 (In Progress): doc refers to [default] section, don't think it exists
- 06:05 PM devops Bug #8734 (Resolved): EPEL / Ceph.com package priority issues
- I added priority=2 to the get packages document example for ceph.repo. I also added an install yum-priorities series o...
- 05:56 PM devops Bug #8734 (In Progress): EPEL / Ceph.com package priority issues
- 05:51 PM Bug #9072: error setting 'mon_pg_warn_min_objects' to '10K': (22) Invalid argument
- ubuntu@teuthology:/a/sage-2014-08-10_18:40:12-rados-firefly-next-distro-basic-multi/414556
- 05:50 PM Bug #9072 (Resolved): error setting 'mon_pg_warn_min_objects' to '10K': (22) Invalid argument
- ...
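A workaround sketch, on the assumption that only the SI-suffix parsing is at fault here: pass the value as a plain integer (mon.a is a placeholder):
$ ceph tell mon.a injectargs '--mon_pg_warn_min_objects 10000'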
- 05:25 PM Bug #9069: rgw tests reported as failed in teuthology-2014-08-11_10:35:04-upgrade:dumpling:rgw-du...
- oh.. it's not running as root.. or with daemon-helper.
- 05:24 PM Bug #9069: rgw tests reported as failed in teuthology-2014-08-11_10:35:04-upgrade:dumpling:rgw-du...
- 7585 ? Sl 0:05 radosgw -n client.0 -k /etc/ceph/ceph.client.0.keyring --rgw-socket-path /home/ubuntu/ceph...
- 03:57 PM Bug #9069 (Resolved): rgw tests reported as failed in teuthology-2014-08-11_10:35:04-upgrade:dump...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-11_12:05:02-upgrade:dumpling-dumpling---basic-vps/...
- 04:58 PM rbd Bug #8912 (Fix Under Review): librbd segfaults when creating new image (rbd-ephemeral-clone-stabl...
- https://github.com/ceph/ceph/pull/2239
- 01:39 PM rbd Bug #8912: librbd segfaults when creating new image (rbd-ephemeral-clone-stable-icehouse)
- Looks like it was a race condition in a previously little-used error path.
- 01:04 PM rbd Bug #8912 (In Progress): librbd segfaults when creating new image (rbd-ephemeral-clone-stable-ice...
- Excellent report, your reproducer causes the same crash for me.
- 04:14 PM Bug #9044: erasure-code: use ruleset instead of ruleid
- gitbuilder is running
- 03:32 PM Bug #9054: ceph_test_rados: FAILED assert(!old_value.deleted())
- wip-9054
- 03:07 PM Bug #9054: ceph_test_rados: FAILED assert(!old_value.deleted())
- When we go to flush clone 22, all we know is that 22 is dirty, has snaps
[21], and 4 is clean. As part of fl...
- 02:24 PM Bug #9054: ceph_test_rados: FAILED assert(!old_value.deleted())
- Ok, we start with the following configuration in the cache (all dirty):
30:[29,21,20,15,10,4]:[22(21), 15(15,10), ...
- 12:45 PM Bug #9054: ceph_test_rados: FAILED assert(!old_value.deleted())
- Actually, looks like this might already be handled correctly, re-consulting the log.
- 12:00 PM Bug #9054: ceph_test_rados: FAILED assert(!old_value.deleted())
- Thinking
- 11:52 AM Bug #9054: ceph_test_rados: FAILED assert(!old_value.deleted())
- Hmm, I think the bug is like this:
Normally, if we get the following op sequence:
- write 1:[]
- delete 10:[3] (...
- 03:16 PM Bug #9040: clients can SEGV during package upgrade
- I see no segmentation errors in the latest run: /a/teuthology-2014-08-11_12:05:02-upgrade:dumpling-dumpling---basic-v...
- 12:34 PM rgw Bug #8539: civetweb backend responds with a body when a HEAD request yields an error
- Merged, commit:0a2b4c25541bbd15776d3d35986518e37166910f
- 12:34 PM rgw Bug #8539 (Pending Backport): civetweb backend responds with a body when a HEAD request yields an...
- 12:24 PM Bug #9064: RadosModel assertion failure
The bug is happening when a new OSD map is received in the middle of the series of transactions. The read transact...
- 11:40 AM Bug #9064: RadosModel assertion failure
- Got an even more specific backtrace ...
- 09:57 AM Bug #9064: RadosModel assertion failure
- trying to reproduce locally with objecter logging turned up and ``ms inject socket failures`` enabled as it is in the...
- 09:29 AM Bug #9064: RadosModel assertion failure
- I understand this a little better now: the operations in this WriteOp are 1,2 (writes), 4 (setxattr), 5 (read). So t...
- 07:23 AM Bug #9064: RadosModel assertion failure
- http://qa-proxy.ceph.com/teuthology/john-2014-08-10_02:14:59-rados-wip-mds-contexts-testing-basic-plana/411119/teutho...
- 07:22 AM Bug #9064 (Resolved): RadosModel assertion failure
http://qa-proxy.ceph.com/teuthology/john-2014-08-10_02:14:59-rados-wip-mds-contexts-testing-basic-plana/411119/teut...
- 10:41 AM Bug #9057 (Resolved): mark_down from fast dispatch can deadlock
- 09:57 AM rgw Subtask #9068 (Closed): rgw: add rgw setup to vstart
- As part of the development documentation we need to update vstart to create a RadosGW development environment.
- 09:53 AM Bug #9067 (Resolved): (wip-objecter) Objecter assertion in SIGINT handler
@ wip-mds-contexts 2550fc51f30a8a1e581dd9a90511732a3b70ad2a
When I start a "ceph status" while no mon is running...- 09:01 AM devops Bug #9066 (Rejected): Need ceph-deploy to be able to run to JUST generate ceph.conf and keyring w...
- Mirror of issue: https://bugzilla.redhat.com/show_bug.cgi?id=1127852
- 08:37 AM Bug #9065 (Resolved): LibRados* tests failed in upgrade:dumpling-x-firefly---basic-vps
- This should be fixed by https://github.com/ceph/ceph/pull/2236 (in review)
Logs are in http://qa-proxy.ceph.com/te...
- 08:33 AM devops Bug #9032 (Rejected): ceph-deploy over proxy
- The `--gpg-url` is only valid if you are pointing to a custom repo.
What you need to do is create a custom repo se...
- 08:28 AM Feature #8580: Decrease disk thread's IO priority and/or make it configurable
- Hi,
The backport to dumpling is missing the commit which provides the new configurable: https://github.com/ceph/ce...
- 05:04 AM Bug #9062: Mon segfault in waitlist_or_zap_client
- Note that this was wip-mds-clients which doesn't have any messenger changes and doesn't have any mon changes other th...
- 05:01 AM Bug #9062 (Resolved): Mon segfault in waitlist_or_zap_client
http://pulpito.front.sepia.ceph.com/john-2014-08-10_02:14:59-rados-wip-mds-contexts-testing-basic-plana/411054/
...
- 04:37 AM Bug #9023: valgrind failures in OSD
Haven't seen the "new Session" one since rebasing on master, so I'm optimistic that it was the same thing as the le...
- 04:09 AM CephFS Bug #8878 (In Progress): mds lock cycle (wip-objecter)
- I think all these are OK now in wip-mds-contexts: remaining failures on that branch are all outside MDS.
- 04:09 AM Bug #9009 (Resolved): (wip-objecter) ObjectCacher assert in fs client
- This is all good now in wip-mds-contexts (http://pulpito.ceph.com/john-2014-08-09_14:56:53-fs-wip-mds-contexts-testin...
08/10/2014
- 11:43 PM devops Bug #9061 (Resolved): dumpling to firefly upgrade on RH6 restarts the daemons
- Hi,
When I upgrade the RPMs on a RH6 server from 0.67.9 to 0.80.5, the daemons are (cond)restarted. I believe these ...
- 07:20 PM Linux kernel client Bug #8806: libceph: must use new tid when watch is resent
- meanwhile, the MWatchNotify message now has a return value encoded at the end (s32) when header.version >= 0. See wi...
- 07:19 PM Linux kernel client Bug #8806: libceph: must use new tid when watch is resent
- the bug is with the kernel client: it needs to use a new tid when resending the watch. this was partially fixed on t...
- 05:04 PM Bug #9057 (Fix Under Review): mark_down from fast dispatch can deadlock
- https://github.com/ceph/ceph/pull/2238
- 10:45 AM Bug #9057: mark_down from fast dispatch can deadlock
- ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-08-09_14:13:44-rados-next-testing-basic-multi/410713
3 (!...
- 08:41 AM Bug #9057 (Resolved): mark_down from fast dispatch can deadlock
- ...
- 04:13 PM Feature #8639 (In Progress): mon: dispatch messages while blocked waiting for IO
- 03:45 PM Bug #8620: rest/test.py occasional failure (dumpling)
- ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-08-10_13:22:17-rados-dumpling-distro-basic-multi/413788
- 02:07 PM Feature #8560 (Fix Under Review): mon: instrument paxos
- 12:51 PM rgw Bug #8988 (Fix Under Review): AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- Two consecutive runs with the increased timeout do not show the bug ("one":http://pulpito.ceph.com/loic-2014-08-10_15:...
- 02:03 AM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- In a few tickets it is suggested that this may be an idle timeout problem. I "rescheduled a suite":http://pulpito.cep...
- 01:31 AM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- In the attached file, each part separated with *-----------------------------* is the output between the last success...
- 01:09 AM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- The errors for each failure are different and suggest the tests are failing for an independent reason such as the cl...
- 01:03 AM rgw Bug #8988: AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- * http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping-testing-ba...
- 12:46 PM Bug #9055 (Fix Under Review): LibRadosTwoPoolsPP.HitSetWrite (and others) fail on remove of whiteout
- https://github.com/ceph/ceph/pull/2236
- 11:05 AM Feature #9059 (Resolved): osd: store opportunistic whole-object checksum
- when we deep scrub, we have whole-object checksums that cover data and omap. store a copy in object_info_t, along ...
- 10:52 AM Bug #8935: operations not idempotent when enabling cache
- sage-2014-08-09_14:13:44-rados-next-testing-basic-multi/410527 and 410528
- 10:51 AM Bug #9058 (Can't reproduce): rest-api: long-running process may fail 'tell osd...' due to stale o...
- sage-2014-08-09_14:13:44-rados-next-testing-basic-multi/410524
- 10:48 AM Bug #8894: osd/ReplicatedPG.cc: 9281: FAILED assert(object_contexts.empty())
- ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-08-09_14:13:44-rados-next-testing-basic-multi/410806
alwa...
- 02:16 AM CephFS Bug #8725: mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic-plana
- "same error":http://pulpito.ceph.com/loic-2014-08-10_09:59:49-upgrade:firefly-x:stress-split-wip-9025-chunk-remapping...
- 12:53 AM CephFS Bug #8725: mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic-plana
- Another "similar crash":http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-x:stress-split-wip-9025-chun...
- 12:39 AM CephFS Bug #8725: mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic-plana
- And the same trace at "upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-08_12:13:20-upgrade:firef...
- 12:33 AM CephFS Bug #8725: mds crashed in upgrade:dumpling-x:stress-split-master-testing-basic-plana
- Looks like a similar problem at "upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-08_12:13:20-upg...
- 01:04 AM Feature #9025: erasure-code: chunk remapping
- The upgrade suite from firefly had one error related to an independent "MDS problem":http://pulpito.ceph.com/loic-201...
- 12:49 AM Feature #8496 (Resolved): erasure-code: ErasureCode base class
- 12:41 AM Feature #8496: erasure-code: ErasureCode base class
- The "upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-08_12:13:20-upgrade:firefly-x:stress-split-...
- 12:16 AM Bug #8978: ceph ping not working as expected
- I'm experiencing the same (on newly installed ceph-cluster via Ubuntu server 14.04.1):
ceph status
cluster b6...
08/09/2014
- 10:55 PM rbd Bug #8000: SLAB: Unable to allocate memory on node 0
- Unfortunately converting RBD to image format 2 did not fix it. User returned after being away for a week and her syst...
- 05:50 PM CephFS Bug #9056: fuse kmod + ceph-fuse triggers "BUG: sleeping function called from invalid context"
http://pulpito.front.sepia.ceph.com/john-2014-08-09_14:56:53-fs-wip-mds-contexts-testing-basic-plana/409236/
http:...
- 05:48 PM CephFS Bug #9056 (Resolved): fuse kmod + ceph-fuse triggers "BUG: sleeping function called from invalid ...
kernel 5f740d7e1531099b888410e6bab13f68da9b1a4d
wip-mds-contexts (aka wip-objecter) 7be59771bff09e2b46b5467627cb...
- 12:53 PM Bug #9055 (Resolved): LibRadosTwoPoolsPP.HitSetWrite (and others) fail on remove of whiteout
- 2014-08-09T09:03:14.670 INFO:tasks.workunit.client.0.plana70.stdout:test/librados/TestCase.cc:93: Failure
2014-08-09...
- 12:26 PM Bug #9054: ceph_test_rados: FAILED assert(!old_value.deleted())
- 2014-08-08 10:55:12.312751 7f1237847700 10 osd.0 pg_epoch: 462 pg[2.1( v 462'2839 (0'0,462'2839] local-les=422 n=53 e...
- 10:04 AM Bug #9054: ceph_test_rados: FAILED assert(!old_value.deleted())
- almost there. on osd.0, we finish trimming 14a here:
2014-08-08 10:55:12.311901 7f1237847700 10 osd.0 pg_epoch: 4...
- 11:43 AM Bug #8894: osd/ReplicatedPG.cc: 9281: FAILED assert(object_contexts.empty())
- ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-08-08_22:30:19-rados-wip-sage-testing-testing-basic-burnupi/...
- 01:39 AM Bug #9044 (Fix Under Review): erasure-code: use ruleset instead of ruleid
- "associated pull request":https://github.com/ceph/ceph/pull/2232
08/08/2014
- 11:00 PM Bug #9054 (Resolved): ceph_test_rados: FAILED assert(!old_value.deleted())
- ubuntu@teuthology:/a/teuthology-2014-08-06_02:30:01-rados-next-testing-basic-plana/403383...
- 10:58 PM Bug #8997: ceph_test_rados_watch_notify hangs
- ubuntu@teuthology:/a/teuthology-2014-08-06_02:30:01-rados-next-testing-basic-plana/402968
- 10:55 AM Bug #8997: ceph_test_rados_watch_notify hangs
- ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-08-06_02:30:01-rados-next-testing-basic-plana/402968
- 10:54 PM Bug #9053 (Resolved): mon/Paxos.cc: 628: FAILED assert(begin->last_committed == last_committed)
- ubuntu@teuthology:/a/teuthology-2014-08-06_02:30:01-rados-next-testing-basic-plana/402965
description: rados/monthra...
- 07:36 PM Bug #9052: ceph-mon crashes with *** Caught signal (Floating point exception) **
- With no OSDs in the cluster, the calculations for @pgs_per_osd@ can divide by zero (integer, but that still causes th...
- 07:29 PM Bug #9052 (Resolved): ceph-mon crashes with *** Caught signal (Floating point exception) **
- I've found that I can crash ceph-mon by attempting to change pool values (such as pg_num) before adding OSDs to the c...
- 06:59 PM rgw Documentation #9051 (Closed): Document rgw_defer_to_bucket_acls option
- It appears that the only documentation right now is the commit message of 1d7c2041.
- 06:16 PM Bug #7576: osd: large skew in pg epochs (dumpling)
- ..and when we do, include commit:a52a855f6c92b03dd84cd0cc1759084f070a98c2 !!
- 06:16 PM Bug #7576 (Pending Backport): osd: large skew in pg epochs (dumpling)
- still want to backport this to firefly ...
- 06:04 PM rgw Bug #8621: civetweb frontend fails authentication if URL has special chars
- tested wip-8621 by executing s3tests, there are still a few failures,
logs are copied to ubuntu@mira042.front.sepi...
- 04:42 PM Fix #4205: librados: Improve Watch-notify semantics
- http://pad.ceph.com/p/watch-notify
- 03:55 PM devops Feature #9050 (Rejected): Calamari builds for ceph.com
- 03:24 PM devops Feature #6310 (Closed): Get Dumpling into CentOS Ceph repo
- 10:31 AM Bug #9046 (Resolved): Limiting the pool object quota stops the IO, however IO does not restart if...
- Issue Title: Limiting the pool object quota stops the IO, however IO does not restart if we reset the pool object quot...
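For context, a minimal sketch of the quota cycle being described; the pool name and limit are placeholders:
$ ceph osd pool set-quota mypool max_objects 1000    # IO stops once the quota is exceeded
$ ceph osd pool set-quota mypool max_objects 0       # 0 removes the quota; IO is expected to resume, which this ticket reports does not happen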
- 09:37 AM Bug #9040: clients can SEGV during package upgrade
- 09:03 AM Bug #9023: valgrind failures in OSD
- Another `new Session` at OSD.cc:3704
http://qa-proxy.ceph.com/teuthology/john-2014-08-07_18:44:20-fs-wip-mds-context...
- 06:43 AM Bug #9044 (Resolved): erasure-code: use ruleset instead of ruleid
- When "ruleset is looked up by name":https://github.com/ceph/ceph/blob/firefly/src/mon/OSDMonitor.cc#L2928 when creati...
- 03:15 AM Feature #9025: erasure-code: chunk remapping
- "requeued, for ubuntu 14.04 to get quicker results":http://pulpito.ceph.com/loic-2014-08-08_12:17:30-upgrade:firefly-...
- 03:13 AM Feature #8496: erasure-code: ErasureCode base class
- "requeued, for ubuntu 14.04 to get quicker results":http://pulpito.ceph.com/loic-2014-08-08_12:13:20-upgrade:firefly-...
- 02:43 AM rgw Bug #9043 (Duplicate): rgw:Cannot add object to Ceph using Openstack Dashboard(Horizon) in firefly
- Uploading a new object fails with message "Error: Unable to upload object".
While adding an object using Horizon w...
08/07/2014
- 03:56 PM Feature #8276: ceph-filestore-dump import-rados -p <pool> <archive>
- Implemented syntax:
ceph_objectstore_tool import-rados pool [import_file|-]
Import into the specified pool on r...
- 03:54 PM Bug #8396 (Resolved): osd: message delayed in Session misdirected after split
- 03:39 PM Bug #8625 (Pending Backport): EC pool - OSD creates an empty file for op with 'create 0~0, writef...
- 02:34 PM Bug #9040: clients can SEGV during package upgrade
- https://github.com/ceph/ceph-qa-suite/pull/77 seemed to fix this.
Testing now.
- 01:56 PM Bug #9040 (Won't Fix): clients can SEGV during package upgrade
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-06_16:30:35-upgrade:dumpling-dumpling---basic-vps/...
- 12:37 PM rgw Bug #9039: Using COPY on radosgw to copy object from one bucket to another that's in another pool...
- Well, I think a data copy is the right thing to do. If I put a bucket in a different pool, it is because they're configured dif...
- 10:40 AM rgw Bug #9039: Using COPY on radosgw to copy object from one bucket to another that's in another pool...
- The problem is that it is implicitly assumed with the new manifest that the tail is going to reside in the same pool ...
- 10:07 AM rgw Bug #9039: Using COPY on radosgw to copy object from one bucket to another that's in another pool...
- Really? I didn't see anything in the code that checked whether the destination bucket was in the same pool or not an...
- 09:59 AM rgw Bug #9039: Using COPY on radosgw to copy object from one bucket to another that's in another pool...
- That sounds like an issue with the new (firefly) manifest.
- 07:21 AM rgw Bug #9039 (Resolved): Using COPY on radosgw to copy object from one bucket to another that's in a...
- Currently if you copy an object from a bucket to another one which is in another rados pool, things will just break. ...
- 09:34 AM Bug #9035 (Closed): ceph cluster is using more space than actual data after replication
- the 'used' value is simply the sum of the statfs(2) results on all the OSDs. you can see this by doing a df on the osd volumes,...
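For example, with the default OSD data path (adjust to your layout):
$ df -h /var/lib/ceph/osd/ceph-*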
- 02:24 AM Bug #9035 (Closed): ceph cluster is using more space than actual data after replication
- Ceph cluster is using more space than estimated space to store data after replication.
Total cluster capacity is 5...
- 07:52 AM rgw Bug #9037 (Duplicate): civetweb: error HEAD responses return body
- 07:40 AM rgw Bug #9037: civetweb: error HEAD responses return body
- Ah, sorry, somehow managed to miss it when I looked through the issue list. Please close this then.
- 07:34 AM rgw Bug #9037: civetweb: error HEAD responses return body
- See #8539
- 02:59 AM rgw Bug #9037 (Duplicate): civetweb: error HEAD responses return body
- 0.80.5 radosgw with civetweb frontend returns body data when sending an error response to a HEAD request. This breaks...
- 06:41 AM CephFS Feature #9029: min/max uid for snapshot creation
- 06:00 AM Bug #4254: osd: failure to recover before timeout on rados bench and thrashing; negative stats
- I am seeing this issue again on v0.80.4. I stopped 3 osd processes and marked them as out to trigger data migration (...
- 03:08 AM Feature #8496: erasure-code: ErasureCode base class
- "requeued on vps because plana are very busy":http://pulpito.ceph.com/loic-2014-08-07_12:09:48-upgrade:firefly-x:stre...
- 03:06 AM Feature #9025: erasure-code: chunk remapping
- "queued the suite on vps because plana are very busy":http://pulpito.ceph.com/loic-2014-08-07_12:06:56-upgrade:firefl...
- 12:54 AM Feature #9025: erasure-code: chunk remapping
- "upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-07_09:56:17-upgrade:firefly-x:stress-split-wip-...
- 01:23 AM Feature #9034 (New): erasure-code: better LRC strategy
- The current LRC recovery strategy does not take advantage of all possibilities and may fail to discover a scenario th...
- 01:17 AM Feature #9033 (Resolved): erasure-code: simplified LRC
- Add implicit parity and simplified LRC as "described by Andreas":https://www.mail-archive.com/ceph-devel@vger.kernel....
08/06/2014
- 06:40 PM Bug #9022 (Pending Backport): Potential lock leaks in RadosClient
- 02:58 AM Bug #9022: Potential lock leaks in RadosClient
- Pull request on the way.
- 02:58 AM Bug #9022 (Resolved): Potential lock leaks in RadosClient
- While going through RadosClient, I identified a couple of interfaces, librados::RadosClient::lookup_pool() and librados::R...
- 03:39 PM Feature #9031: List RADOS namespaces and list all objects in all namespaces
A way to implement this is to enhance the pg_ls_repsonse_t to include the namespace (or change object_t to hobject_...- 02:30 PM Feature #9031 (Resolved): List RADOS namespaces and list all objects in all namespaces
- We can currently create namespaces, but cannot easily view those that have been created. A method of listing namespac...
- 03:23 PM devops Bug #9032 (Rejected): ceph-deploy over proxy
- I have my servers working behind a proxy. When I run the ceph-deploy install command I get an error:
[ceph01][INFO ...
- 02:05 PM Feature #9030 (Fix Under Review): mon: quickly identify 'problem' osds
- 02:05 PM Feature #9030 (Resolved): mon: quickly identify 'problem' osds
- 12:55 PM Bug #8860 (Fix Under Review): ceph-disk issues with custom cluster name
- PR opened https://github.com/ceph/ceph/pull/2216
- 12:25 PM CephFS Feature #9029 (Resolved): min/max uid for snapshot creation
- On shared systems like shared hosting it might be useful to prevent regular users from creating snapshots on CephFS.
...
- 12:20 PM rgw Feature #6747: PowerDNS backend for RGW bucket directing
- 11:06 AM rbd Bug #8845 (Pending Backport): Flattening Clones of clone, results in command failure
- 09:41 AM Bug #9019 (Resolved): Makefile.am: error: required file './README' not found
- fixed it up with a symlink.. other solutions seemed more annoying :(
- 08:39 AM Linux kernel client Bug #8818 (Resolved): IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- OK, thanks everybody....
- 08:09 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- I switched to the good kernel (3.16.0-ceph-00037-g0532581) yesterday and re-ran my scripts overnight. The scripts co...
- 08:39 AM Linux kernel client Bug #8464 (Resolved): krbd: deadlock
- OK, thanks everybody....
- 08:06 AM Feature #8496: erasure-code: ErasureCode base class
- "scheduled upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-06_17:07:04-upgrade:firefly-x:stress-...
- 06:22 AM Feature #8496: erasure-code: ErasureCode base class
- The test "only had one job":http://pulpito.ceph.com/loic-2014-08-05_13:45:56-upgrade:firefly-x:stress-split-wip-8496-...
- 07:12 AM Feature #9025 (Fix Under Review): erasure-code: chunk remapping
- "need review":https://github.com/ceph/ceph/pull/2213
- 06:28 AM Feature #9025 (Resolved): erasure-code: chunk remapping
- Interpret the *mapping* parameter and remap the chunks accordingly. For instance mapping=_DD means the data chunks ar...
- 07:11 AM CephFS Feature #9026 (Resolved): client: vxattr support for rctime, rsize, etc.
- 05:44 AM Bug #9023 (Can't reproduce): valgrind failures in OSD
osd.2 from OSD.cc:462 (SafeTimer::init, pthread_create)
http://pulpito.front.sepia.ceph.com/john-2014-08-01_11:0...
08/05/2014
- 11:22 PM Feature #9021 (Resolved): librbd: shared flag, object map
- we need to consider making a tradeoff between multi-client support and single-client support for librbd. In practice...
- 10:43 PM Bug #8797: "ceph status" do not exit with python_2.7.8
- For the moment the Python maintainer in Debian kindly fixed this issue for us by adding a patch to revert the problematic change ...
- 07:34 PM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- The 00036 "bad" kernel started showing the problem in the /var/log/kern.log file within minutes of starting my test s...
- 12:49 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- Eric, Greg,
The fix on top of 3.16 + testing is in wip-request-fn.
http://gitbuilder.ceph.com/kernel-deb-precis...
- 06:16 PM Bug #9019 (Resolved): Makefile.am: error: required file './README' not found
- Commit a923e2c9eb16823fa484c renamed README to README.md to render in markdown. After that, I can't generate Makefil...
- 06:15 PM Bug #9008: Objecter: pg listing can deadlock when throttling is in use
- > I'm guessing the request is hung on the OSD side of things...
Thanks Sage. Sadly, after restarting the radosgw daemon, ...
- 08:28 AM Bug #9008 (Need More Info): Objecter: pg listing can deadlock when throttling is in use
- 08:28 AM Bug #9008: Objecter: pg listing can deadlock when throttling is in use
- please query the admin socket for the process like so:
ceph daemon /var/run/ceph/ceph-client.*.asok objecter_requ...
- 02:44 AM Bug #9008 (Resolved): Objecter: pg listing can deadlock when throttling is in use
- In our Ceph cluster (with radosgw), we found that occasionally the processing threads hang forever and eventually ha...
- 02:24 PM Bug #9018 (Resolved): "LibRadosTwoPoolsPP*" failed in upgrade:dumpling-x-firefly---basic-vps
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-05_09:22:33-upgrade:dumpling-x-firefly---basic-vps...
- 02:13 PM devops Feature #8868: Update Fedora to 0.80.5 packages with ceph-common
- So, there's a PR open now for some restructuring of the .spec file that we need to get in soon to make this more sane...
- 01:21 PM Fix #6278 (Resolved): osd: throttle snap trimming
- 01:20 PM devops Fix #9017 (Rejected): [paddles] implement validation across all controller methods
- paddles has a lot of boilerplate in controllers that look like:...
- 01:15 PM Feature #9015 (Resolved): msgr refactoring to support xio work
- 01:09 PM Feature #9015 (Resolved): msgr refactoring to support xio work
- 01:14 PM Fix #8905 (In Progress): msgr: encode osd epoch in nonce to avoid misc OSD reconnect races
- 01:10 PM Feature #7516 (Fix Under Review): mon: reweight-by-pg
- 01:06 PM Feature #7238 (In Progress): erasure code : implement LRC plugin
- 12:55 PM Bug #8083: erasure-code: fix static code analysis errors found in gf-complete
- 12:28 PM Documentation #8875 (Resolved): `ceph-deploy new` needs to be called for every node, not just the...
- PR https://github.com/ceph/ceph/pull/2206
and merged commit e6935dd into master
- 09:37 AM Documentation #8875 (In Progress): `ceph-deploy new` needs to be called for every node, not just ...
- I noted the problem in the docs and will fix that shortly.
You are right, you need to run `ceph-deploy new {NODES}...
- 11:19 AM Bug #9011: osd memory leaks on next
- gonna see if this happens on plana too
- 11:13 AM Bug #9011: osd memory leaks on next
- these look like static std::strings. and some other weird leaks that don't make sense...
- 08:00 AM Bug #9011 (Duplicate): osd memory leaks on next
- ubuntu@teuthology:/a/sage-2014-08-04_11:34:19-rgw-next-testing-basic-vps/397606
need to clean these up
- 09:26 AM rgw Feature #9013 (Resolved): rgw: set civetweb as a default frontend
- Should add civetweb to the default frontends.
- 09:13 AM Messengers Bug #8880 (Pending Backport): msg/Pipe.cc: 1538: FAILED assert(0 == "old msgs despite reconnect_s...
- 09:11 AM Bug #9012 (Duplicate): "[WRN] map e277 wrongly marked me down" in upgrade:dumpling-x-firefly---ba...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-08-04_14:18:17-upgrade:dumpling-x-firefly---basic-vps...
- 09:05 AM rgw Feature #8218 (In Progress): rgw: object versioning manifest changes
- 09:05 AM rgw Feature #8217 (In Progress): rgw: object versioning object overwrite / delete changes
- 09:05 AM rgw Feature #8216 (In Progress): rgw: object versioning objclass support
- 09:05 AM rgw Feature #8473 (In Progress): rgw: Shard bucket index objects to improve single bucket PUT throughput
- 08:54 AM rbd Bug #8845 (Fix Under Review): Flattening Clones of clone, results in command failure
- https://github.com/ceph/ceph/pull/2205
- 08:52 AM Fix #8914: osd crashed at assert ReplicatedBackend::build_push_op
- btw, the steps to reproduce this issue are mentioned by Sahana above & it can be reproduced on a single node too.
...
- 08:47 AM Fix #8914: osd crashed at assert ReplicatedBackend::build_push_op
- Hi Greg,
No, I did not intend to add any comments.
The reason I thought we should assert is so that we can serv...
- 08:25 AM Bug #9007 (Duplicate): Ceph Firefly 0.80.4 : Unable to get some pool values
- You're right. This is fixed in master and backported to firefly-next; it will be in the next firefly point release.
- 01:50 AM Bug #9007 (Duplicate): Ceph Firefly 0.80.4 : Unable to get some pool values
- h1. Hello Developers
I am curious to know if there is something missing from the code for Ceph pool values.
As...
- 07:56 AM rgw Bug #8676: md5sum check failed during readwrite.py
- ubuntu@teuthology:/a/sage-2014-08-04_11:34:19-rgw-next-testing-basic-vps/397522
- 04:46 AM Feature #8496: erasure-code: ErasureCode base class
- "upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-05_13:45:56-upgrade:firefly-x:stress-split-wip-...
- 12:48 AM Feature #8496 (Fix Under Review): erasure-code: ErasureCode base class
- "pull request":https://github.com/ceph/ceph/pull/2201
- 04:22 AM Bug #9009 (In Progress): (wip-objecter) ObjectCacher assert in fs client
- OK, no big deal, just that there are contexts in the Client, like the MDS, which need updating to take client_lock wh...
- 03:49 AM Bug #9009 (Resolved): (wip-objecter) ObjectCacher assert in fs client
From branch wip-mds-contexts, which is a derivative of wip-objecter.
http://qa-proxy.ceph.com/teuthology/john-20...
- 03:21 AM rgw Feature #8911: RGW doesn't return 'x-timestamp' in header which is used by 'View Details' of Open...
- It also does not return the "Content-Type" header. Swift does return this header as well. So I would love to see r...
- 12:45 AM rgw Documentation #9003: rgw: document development setup for rgw
- Much needed. Great!
08/04/2014
- 11:33 PM Feature #8496 (In Progress): erasure-code: ErasureCode base class
- Because it needs work to adapt the isa plugin, it deserves a separate patch. Otherwise it mixes two unrelated topics.
- 05:12 AM Feature #8496 (Rejected): erasure-code: ErasureCode base class
- It is part of a "larger pull request":https://github.com/ceph/ceph/pull/1911
- 11:21 PM Bug #8736: thrash and scrub combination lead to error
- http://pulpito.ceph.com/loic-2014-08-04_15:06:02-upgrade:firefly-x:stress-split-wip-8475-testing-basic-plana/396887/
...
- 11:02 PM Feature #8475 (Resolved): erasure-code: oversized objects when using the Cauchy technique
- 06:05 AM Feature #8475: erasure-code: oversized objects when using the Cauchy technique
- "scheduled upgrade:firefly-x:stress-split":http://pulpito.ceph.com/loic-2014-08-04_15:06:02-upgrade:firefly-x:stress-...
- 02:07 AM Feature #8475: erasure-code: oversized objects when using the Cauchy technique
- "Rebased and repushed":https://github.com/ceph/ceph/pull/1890 , running gitbuilder
- 08:00 PM rgw Feature #3454: Support temp URLs for Swift API
- This should be documented somewhere too, at least in the table at http://ceph.com/docs/master/radosgw/swift/
- 03:09 PM Bug #8998 (Pending Backport): osd: SEGV in OSD::heartbeat()
- 03:00 PM Bug #8998 (Fix Under Review): osd: SEGV in OSD::heartbeat()
- https://github.com/ceph/ceph/pull/2198
- 09:14 AM Bug #8998: osd: SEGV in OSD::heartbeat()
- ubuntu@teuthology:/a/teuthology-2014-08-03_02:30:01-rados-next-testing-basic-plana/394893
- 02:18 PM rgw Feature #9004 (New): rgw: multi-site: multi-master
- As a user, I want to be able to write to any available RGW and have that file available on other RGWs for read and wr...
- 02:06 PM Bug #8891 (Resolved): rados bench hang during thrashing
- 09:17 AM Bug #8891 (Fix Under Review): rados bench hang during thrashing
- 01:53 PM rgw Documentation #9003: rgw: document development setup for rgw
- While we're at it, beefing up the rgw support in vstart.sh would be great. Right now you can pass RGW=1 and it will ...
- 01:49 PM rgw Documentation #9003 (Closed): rgw: document development setup for rgw
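In the meantime, a rough sketch of the vstart.sh path mentioned above, run from a built source tree (the exact flags are assumptions and vary by version):
cd src
RGW=1 ./vstart.sh -n -d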
- 11:20 AM rgw Bug #9002 (Duplicate): Creating swift key with --gen-secret in separate step from subuser creatio...
- Customer reported on CentOS with Ceph v0.80.4
Steps to reproduce:
radosgw-admin user create --uid=testuser1 --dis...
- 11:00 AM rgw Bug #9001 (Won't Fix): Starting gateway with radosgw init script fails to create socket
- Ceph Version: v0.80.4
Distro: CentOS
Customer reported, unable to reproduce.
/var/run/ceph directory owned by ...
- 09:16 AM Bug #7986: 3.1s0 scrub stat mismatch, got 2041/2044 objects, 0/0 clones, 2041/2044 dirty, 0/0
- ubuntu@teuthology:/a/teuthology-2014-08-03_02:30:01-rados-next-testing-basic-plana/395219
- 07:07 AM Linux kernel client Bug #8979: GPF kernel panics - auth?
- pushed wip-8979, which removes the fixed buffer size. But we still need to make things not crash when the auth reply...
- 06:57 AM Linux kernel client Bug #8979: GPF kernel panics - auth?
- yeah:
#define TEMP_TICKET_BUF_LEN 256
- 06:48 AM Linux kernel client Bug #8979: GPF kernel panics - auth?
- ...
- 06:36 AM Documentation #8875: `ceph-deploy new` needs to be called for every node, not just the admin one
- I was able to complete install.
The first step above granted sudo rights on each node.
The way I was able to get it...
- 05:56 AM Documentation #8875: `ceph-deploy new` needs to be called for every node, not just the admin one
- You still need a user that can call sudo without a password prompt on remote nodes.
And it looks like you only pas...
- 05:47 AM devops Bug #8893 (Resolved): ceph-deploy install command on centos 6.5 reports exception
- merged commit eb9ea33 into ceph:master
- 01:47 AM Bug #8601 (Resolved): erasure-code: default profile does not exist after upgrade
08/03/2014
- 09:48 PM rgw Bug #8864: radosgw help doesn't seem to display some debug options
- I pushed a couple of commits to fix most of undocumented options in man pages & help for #8112. Can you let me know w...
- 09:35 PM rbd Bug #8000: SLAB: Unable to allocate memory on node 0
- Finally I've isolated the issue.
Something was wrong with a particular RBD image (format 1) that was created on Ceph...
- 09:11 PM CephFS Bug #8962: kcephfs: client does not release revoked cap
- another similar hang:...
- 06:27 PM Bug #8891: rados bench hang during thrashing
- I think this was the same reaper vs fast dispatch issue that I tracked down in wip-msgr.
- 02:48 PM devops Bug #8330: repodata on rpm repos do not list latest ceph-deploy (1.5.2)
- Agreed, this is fixed. Current repodata works perfectly with all packages showing correctly (on the same host btw, I'...
- 08:40 AM rgw Bug #8784: rgw: completion leak
- ubuntu@teuthology:/a/teuthology-2014-08-01_23:02:01-rgw-master-testing-basic-plana/394054
- 08:39 AM Bug #8996 (Resolved): "Segmentation fault" in upgrade:dumpling-x-firefly---basic-vps suite
- botched (double) backport, fixed by commit:4e03d5b512c8d2f7fa51dda95c6132e676529f9b
08/02/2014
- 05:01 PM Bug #8998 (Resolved): osd: SEGV in OSD::heartbeat()
- ...
- 04:58 PM Bug #8997 (Can't reproduce): ceph_test_rados_watch_notify hangs
- ...
- 04:55 PM Bug #8996 (Resolved): "Segmentation fault" in upgrade:dumpling-x-firefly---basic-vps suite
- There are lots of these errors in:
http://pulpito.front.sepia.ceph.com/teuthology-2014-08-02_08:50:33-upgrade:dumpli...
- 04:31 PM Messengers Bug #8880: msg/Pipe.cc: 1538: FAILED assert(0 == "old msgs despite reconnect_seq feature")
- ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-08-01_02:32:01-rados-master-testing-basic-plana/392461
- 08:14 AM Bug #8396: osd: message delayed in Session misdirected after split
- ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-08-01_02:32:01-rados-master-testing-basic-plana/392256
- 08:07 AM Bug #6003: journal Unable to read past sequence 406 ...
- ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-08-01_02:32:01-rados-master-testing-basic-plana/392342...
08/01/2014
- 09:25 PM Bug #8776 (Won't Fix): osd: runaway memory on dumpling
- this is a result of a very large omap object and us building a transaction to delete the keys. the problem is the bi...
- 09:57 AM Bug #8776: osd: runaway memory on dumpling
- Argh, it's building up a leveldb operation to atomically remove all of the keys associated with the object. I *think...
- 06:26 PM Bug #8930 (Resolved): osd: test unable to produce unfound objects
- 04:07 PM Bug #8930 (Fix Under Review): osd: test unable to produce unfound objects
- 09:41 AM Bug #8930: osd: test unable to produce unfound objects
- 03:56 PM devops Bug #8849 (Resolved): rpm restarts daemons on upgrade
- already backported, commit:e75dd2e4b7adb65c2de84e633efcd6c19a6e457b and ^
- 03:55 PM Bug #8728 (Resolved): rest/test.py osd create not idempotent
- 03:54 PM Bug #8670: Cache tiering parameters can not be displayed for a pool
- Non-trivial to backport; we'd need to get all the rados test refactoring, too!
- 03:51 PM CephFS Bug #8622 (Resolved): erasure-code: rados command does not enforce alignement constraints
- commit:7a58da53ebfcaaf385c21403b654d1d2f1508e1a
- 03:48 PM Bug #6789 (Resolved): cannot remove the leader when there only are two monitors
- 03:39 PM Bug #8944 (Pending Backport): Ceph daemon bad asok used in connection with cluster
- 03:37 PM Bug #8714 (Pending Backport): we do not block old clients from breaking cache pools
- 03:35 PM Feature #8674 (Pending Backport): osd: cache tier: avoid promotion on first read
- commit:79d1aff1821bc9f21477636df4d0d4e57f2cd008
- 03:32 PM rgw Bug #8937 (Pending Backport): rgw: broken large(-ish) objects
- 03:05 PM Documentation #8995 (Resolved): Preflight Checklist Clarifications
- There are several small clarifications that can be made to the Ceph Preflight Checklist to help new users try out Cep...
- 02:44 PM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- No need to do that just yet. I now fully understand the problem and am working on a proper fix that I'd like you to tes...
- 02:37 PM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- I have done some testing and I am seeing the same thing as Eric. With the deadlock-bad kernel I hit the deadlock iss...
- 02:06 PM Bug #8625: EC pool - OSD creates an empty file for op with 'create 0~0, writefull 0~xxx, setxattr...
- Making it not an rgw bug.
- 02:06 PM Bug #8625: EC pool - OSD creates an empty file for op with 'create 0~0, writefull 0~xxx, setxattr...
- wip-8625, versioning should never be necessary after a create (it will be necessary before the create if the object a...
- 09:53 AM Bug #8625: EC pool - OSD creates an empty file for op with 'create 0~0, writefull 0~xxx, setxattr...
- It's the create 0~0 followed by a writefull. Arguably, we still shouldn't version the object, I'll take a look.
- 01:02 PM Fix #8993 (Closed): osd_pool_default_pgp_num woes
- When setting osd_pool_default_pgp_num and not osd_pool_default_pg_num you can create pools with more pgp than pg.
... - 12:57 PM devops Bug #8893 (Fix Under Review): ceph-deploy install command on centos 6.5 reports exception
- PR opened https://github.com/ceph/ceph-deploy/pull/226
- 06:51 AM devops Bug #8893 (In Progress): ceph-deploy install command on centos 6.5 reports exception
- 09:15 AM rbd Bug #8416 (Closed): Client Crash when try to map a volume (ubuntu)
- OK, I'm going to assume this was indeed the missing features handling bug. I looked into it, it was introduced in 3....
- 08:23 AM Bug #8989 (Rejected): Failed running iogen.sh in upgrade:firefly-firefly-testing-basic-vps suite
- It was a test misconfiguration. When we added a new client to run a workload on, we had to be more specific about on ...
- 07:10 AM Bug #8717 (Resolved): teuthology: valgrind leak checks broken for osd (at least)
- 05:57 AM Bug #8601: erasure-code: default profile does not exist after upgrade
- ...
- 02:23 AM Feature #8992 (New): Uniqueness between two or more CRUSH ruleset choose statements
- Assuming that ceph-node1 is in default root, when we define and assign following crush rule:...
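For context, a sketch of the general shape of such a rule; the actual rule from the report is elided above, and all names and numbers here are illustrative only:
rule example {
    ruleset 1
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}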
- 01:44 AM Bug #8641: Cache tiering agent cannot flush or evict objects during the benchmark
- In my opinion the problem also affects cache_min_evict_age, cache_min_flush_age, and others. It's impossible to force ceph c...
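For reference, these thresholds are set per pool; a minimal sketch (the pool name and values are illustrative):
ceph osd pool set cache cache_min_flush_age 60
ceph osd pool set cache cache_min_evict_age 300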
- 12:40 AM CephFS Bug #8962: kcephfs: client does not release revoked cap
- ...
07/31/2014
- 09:04 PM rgw Bug #8972 (Pending Backport): rgw: bucket index log wrong object name in multipart completion
- 09:31 AM rgw Bug #8972 (Fix Under Review): rgw: bucket index log wrong object name in multipart completion
- 08:54 PM CephFS Bug #8962: kcephfs: client does not release revoked cap
- Zheng Yan wrote:
> Sage Weil wrote:
> > Zheng Yan wrote:
> > > no clue what happened. please dump the mds cache wh...
- 07:32 PM CephFS Bug #8962: kcephfs: client does not release revoked cap
- Sage Weil wrote:
> Zheng Yan wrote:
> > no clue what happened. please dump the mds cache when it happens next time
...
- 10:11 AM CephFS Bug #8962: kcephfs: client does not release revoked cap
- Zheng Yan wrote:
> no clue what happened. please dump the mds cache when it happens next time
We have a dump, act...
- 08:48 PM rgw Bug #8991 (Resolved): rgw: RGWRados::list_bi_log_entries() doesn't clear list
- ...
- 03:52 PM Bug #8977: osd: didn't discard sub_op_reply from previous interval?
- Added some debugging to dump the OpWQ queue information if there are stale ops, running in loop.
- 12:53 PM Bug #8977: osd: didn't discard sub_op_reply from previous interval?
- 2014-07-30 10:40:58.317063 7fc2164da700 0 log [WRN] : slow request 960.196157 seconds old, received at 2014-07-30 10...
- 02:35 PM Bug #8989 (Rejected): Failed running iogen.sh in upgrade:firefly-firefly-testing-basic-vps suite
- The majority of failures in this run are related to this: http://pulpito.front.sepia.ceph.com/teuthology-2014-07-30_12:...
- 12:52 PM Feature #131 (In Progress): bring wireshark plugin up to date
- 12:51 PM Documentation #7 (Resolved): Document Monitor Commands
- ceph -h
- 11:29 AM rgw Bug #8988 (Resolved): AssertionError(s) in upgrade:firefly-x:stress-split-next---basic-plana
- "Related issue":http://tracker.ceph.com/issues/9100
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-201...
- 11:25 AM Bug #8982 (Pending Backport): cache pool osds crashing when data is evicting to underlying storag...
- 11:14 AM Bug #8982 (Fix Under Review): cache pool osds crashing when data is evicting to underlying storag...
- 08:47 AM Bug #8982 (In Progress): cache pool osds crashing when data is evicting to underlying storage pool
- 07:36 AM Bug #8982 (Resolved): cache pool osds crashing when data is evicting to underlying storage pool
- We have an erasure-coded pool 'ecdata' and a replicated (size=3) pool 'cache' acting as a writeback cache on top of it.
When...
- 11:17 AM Bug #8969 (Pending Backport): PerfCounters.SinglePerfCounters failure on i386
- 09:48 AM rgw Feature #8987 (New): rgw: data sync for multipart upload
- 09:46 AM Bug #8986 (Duplicate): "[WRN] map e62 wrongly marked me down" in upgrade:dumpling-x-firefly---bas...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-07-30_13:00:44-upgrade:dumpling-x-firefly---basic-vps...
- 09:43 AM Bug #8985: "[WRN] map e9 wrongly marked me down" in upgrade:dumpling-x-firefly---basic-vps suite
- ...
- 09:42 AM Bug #8985 (Resolved): "[WRN] map e9 wrongly marked me down" in upgrade:dumpling-x-firefly---basic...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-07-30_13:00:44-upgrade:dumpling-x-firefly---basic-vps...
- 09:35 AM Bug #8970 (Won't Fix): Injectargs - inconsistent parsing of bool values
- these will also work:
--my-boolean-option=0
--my-boolean-option=false
but you're right, the others won't, be...
- 09:33 AM Feature #8973: Add support for collecting usage information by namespace
- We decided not to do this when designing namespaces because we wanted namespaces to scale independently of the size o...
- 08:49 AM Bug #8947 (Duplicate): Writing rados objects with max objects set for cache pool crashed osd
- Oh, I see it now. This is a dup of #8982.
- 08:29 AM RADOS Support #8600: MON crashes on new crushmap injection
- In addition to the choose vs. chooseleaf issue that Joao is mentioning here, we have also seen problems when min_size...
- 08:13 AM Bug #8966: ceph.conf "osd pool default size = 2" not working
- Then the documentation (http://ceph.com/docs/master/start/quick-ceph-deploy/) on point 2 should be updated....
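For reference, the form that works is a [global] entry in ceph.conf, e.g.:
[global]
osd pool default size = 2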
- 07:58 AM RADOS Bug #8984 (Won't Fix): creating erasure-code pool when not having a root item default
- When creating an EC pool:
> ceph osd pool create poolio 128 128 erasure profile15
It returns
> Error ENOENT: root ...
- 07:46 AM Bug #8983 (Resolved): rados bench -b option does not take orders of magnitude (k,M,..) but also d...
- When running this:
> rados -p <pool> bench 1000 write -t 10 -b 4M
It runs with -b 4 instead of the expected
> rados -...
- 06:04 AM Bug #8601: erasure-code: default profile does not exist after upgrade
- Apparently having an EC pool is still sufficient to prevent kernel clients from mounting, so I don't think we can bac...
- 05:52 AM Bug #8601: erasure-code: default profile does not exist after upgrade
- "firefly backport":https://github.com/ceph/ceph/pull/2178
- 05:16 AM Bug #8601 (Pending Backport): erasure-code: default profile does not exist after upgrade
- 02:53 AM Linux kernel client Bug #8979 (Resolved): GPF kernel panics - auth?
- From James Eckersall, "GPF kernel panics" on ceph-users.
I've had a fun time with ceph this week.
We have a clust...
07/30/2014
- 10:59 PM CephFS Bug #8962: kcephfs: client does not release revoked cap
- no clue what happened. please dump the mds cache when it happens next time
- 07:01 AM CephFS Bug #8962: kcephfs: client does not release revoked cap
- and the code that did it is in teuthology.git/teuthology/misc.py:...
- 07:00 AM CephFS Bug #8962: kcephfs: client does not release revoked cap
- here is the final state of the directory:...
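For the cache dump requested above, one invocation from this era (treat the exact command form as an assumption; it varies by version, and the output path is illustrative):
ceph mds tell 0 dumpcache /tmp/mds-cache.txt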
- 10:25 PM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- The deadlock-bad kernel showed the error after a few minutes of running multiple dd writes to rbd device. Here is one...
- 11:33 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- All,
Can you try and confirm that deadlock-bad fails and deadlock-good works for you?
deadlock-bad:
http://g...
- 05:18 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- Update: At this point I'm almost certain this is not an rbd/ceph problem. Trying to track down the exact culprit.
- 04:59 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- I can reproduce this with 100% certainty now on Trusty, 3.15.6-031506-generic.
Running:
bonnie++ -n 512
agai...
- 09:57 PM Bug #8752 (New): firefly: scrub/repair stat mismatch
- This problem manifests only on caching pools.
I have two EC pools with the following settings:...
- 09:44 PM Bug #8229 (Closed): 0.80~rc1: OSD crash (domino effect)
- Closing: nothing left to track here; did not have this problem with 0.80.4.
- 09:42 PM Bug #8978 (Can't reproduce): ceph ping not working as expected
- Reading the doc: http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-mon/
I came across command: cep...
- 09:26 PM Bug #8977 (Can't reproduce): osd: didn't discard sub_op_reply from previous interval?
- /a/teuthology-2014-07-29_02:30:02-rados-firefly-distro-basic-plana/384397
an op gets stuck in limbo because we are...
- 08:54 PM rgw Bug #8586 (Pending Backport): Missing Swift API Header causes RadosGW to segfault
- 05:57 PM devops Bug #8976: httpd on RHEL7 (RHEL repo) incompatible with mod_fastcgi (ceph repo)
- Also, when trying to enable the httpd ceph pkg with systemctl:
systemctl enable httpd
httpd.service is not a nat...
- 05:22 PM devops Bug #8976 (Resolved): httpd on RHEL7 (RHEL repo) incompatible with mod_fastcgi (ceph repo)
- On a RHEL7 system
yum install httpd mod_fastcgi
systemctl start httpd
Apache fails to start with the followin...
- 05:12 PM Bug #8947 (Need More Info): Writing rados objects with max objects set for cache pool crashed osd
- can you attach the complete logs? all three osds claim to have hit an assert, but the assert message isn't in the lo...
- 04:59 PM rbd Bug #8920 (Pending Backport): rbd/singleton/{all/formatted-output.yaml} fails on trusty due to wh...
- 01:43 PM rbd Bug #8920 (Fix Under Review): rbd/singleton/{all/formatted-output.yaml} fails on trusty due to wh...
- 04:36 PM Bug #8776: osd: runaway memory on dumpling
- it's all here:...
- 02:49 PM Bug #8969 (Fix Under Review): PerfCounters.SinglePerfCounters failure on i386
- 10:31 AM Bug #8969 (Resolved): PerfCounters.SinglePerfCounters failure on i386
- [ RUN ] PerfCounters.SinglePerfCounters
test/perf_counters.cc:111: Failure
Value of: msg
Actual: "{"test_perfcount...
- 02:29 PM Bug #8628 (Resolved): Bad ceph_osd_op.extent union access in ReplicatedPG::do_osd_ops
- commit:58212b1245373b6f015cbff11844d33a900bf3cb
- 02:19 PM Bug #8628 (Rejected): Bad ceph_osd_op.extent union access in ReplicatedPG::do_osd_ops
- ceph_osd_op_uses_extent(op.op) guards the references to the extent view of the union
- 02:13 PM Bug #8717: teuthology: valgrind leak checks broken for osd (at least)
- 02:12 PM Bug #8717 (Resolved): teuthology: valgrind leak checks broken for osd (at least)
- 02:12 PM Bug #8777 (Can't reproduce): osd/PGLog.h: 88: FAILED assert(rollback_info_trimmed_to_riter == log...
- 02:11 PM Bug #8595: osd: client op blocks until backfill starts (dumpling)
- 02:02 PM Bug #8595 (In Progress): osd: client op blocks until backfill starts (dumpling)
- 01:59 PM Bug #8714 (Fix Under Review): we do not block old clients from breaking cache pools
- https://github.com/ceph/ceph/pull/2172
- 01:46 PM Bug #8974 (Can't reproduce): osd crashed with merge_log assert due to removal of osds
- I also got the same asserts in one of the osds when I removed one osd from each node in a ceph cluster of 3 osd nodes ( 5 ...
- 01:31 PM devops Bug #8850: ceph-deploy tests fail during tar due to file changed; incomplete shutdown?
- 01:31 PM devops Bug #8850: ceph-deploy tests fail during tar due to file changed; incomplete shutdown?
- an initial take on getting more information on what is going on:
https://github.com/ceph/teuthology/pull/302/files
- 12:47 PM devops Bug #8850: ceph-deploy tests fail during tar due to file changed; incomplete shutdown?
- I initially thought that the ceph daemon was still running but according to upstart docs, this output:...
- 11:53 AM Feature #8973 (New): Add support for collecting usage information by namespace
- 11:53 AM Feature #8973 (New): Add support for collecting usage information by namespace
- As of now there is no simple way to determine how much data is being used by a particular namespace. Customers curren...
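For reference, about the closest primitive available today is listing a namespace and summing object sizes client-side; a sketch (the pool and namespace names are illustrative, and the namespace flag assumes a recent rados CLI):
rados -p mypool -N ns1 ls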
- 11:36 AM rgw Bug #8972 (Resolved): rgw: bucket index log wrong object name in multipart completion
- When completing a multipart upload operation, when removing the parts from the index the entries that are logged in t...
- 11:27 AM rgw Bug #8971 (Duplicate): rgw: s3 test failures with civetweb
- 11:27 AM rgw Bug #8971 (Duplicate): rgw: s3 test failures with civetweb
- teuthology logs are copied to ubuntu@mira023.front.sepia.ceph.com:/home/ubuntu/civetweb_s3
config.yaml:...
- 10:35 AM Bug #8970 (Won't Fix): Injectargs - inconsistent parsing of bool values
- Hi all,
ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74) on Ubuntu 14.04 LTS
This is how I am able ...
- 10:19 AM Feature #8960 (Fix Under Review): filestore: store backend type persistently
- https://github.com/ceph/ceph/pull/2163
- 10:17 AM Bug #8601: erasure-code: default profile does not exist after upgrade
- "rebased and repushed":https://github.com/ceph/ceph/pull/1990
- 09:37 AM Bug #8966 (Closed): ceph.conf "osd pool default size = 2" not working
- the config option needs to go in the [global] section, not [default] (which is never used for anything)
- 04:31 AM Bug #8966: ceph.conf "osd pool default size = 2" not working
- I noticed the failure with the command "ceph osd dump": there the pools always had size 3 (the default).
- 04:29 AM Bug #8966 (Closed): ceph.conf "osd pool default size = 2" not working
- Version
ceph-deploy: 1.5.9
ceph 0.80.5
Ceph.config:...
- 09:03 AM Documentation #8875: `ceph-deploy new` needs to be called for every node, not just the admin one
- It appears I was able to get further this time; the steps are below.
The key difference is, when I did ceph-deploy new I...
- 06:20 AM Documentation #8875: `ceph-deploy new` needs to be called for every node, not just the admin one
- Hi Alfredo,
Nodes were cleaned out, will re-run install today and get you the log files.
In the mean time, it appea... - 06:17 AM Bug #8922: ceph-deploy mon create fails to create additional monitoring nodes.
- ceph-deploy new cwtcph001
ceph-deploy install cwtcph001 cwtcph002 cwtcph003
ceph-deploy mon create cwtcph001 cwtcph...
- 05:32 AM rbd Bug #8000: SLAB: Unable to allocate memory on node 0
- Ilya Dryomov wrote:
> What do you mean by "I can't explain why only one machine is affected" above? Do you have oth...
- 12:27 AM rbd Bug #8000: SLAB: Unable to allocate memory on node 0
- What do you mean by "I can't explain why only one machine is affected" above? Do you have other similar boxes/setups...
- 02:01 AM rgw Bug #8383: Upload part of one object passed with incorrect upload id or incorrect object id in S3...
- Hi Sage,
Sure!
I use the S3 API to do this test....
- 01:28 AM CephFS Bug #8961 (Won't Fix): du [directory] vs du -b [directory] size doubles
- cephfs tracks recursive directory stats. A directory's size is space used by files underneath the directory. If you d...
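Those recursive stats are also exposed as virtual xattrs, which tends to be less confusing than du; a sketch assuming a kernel-client mount at /mnt/cephfs (paths are illustrative):
getfattr -n ceph.dir.rbytes /mnt/cephfs/somedir
getfattr -n ceph.dir.rfiles /mnt/cephfs/somedir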
07/29/2014
- 09:41 PM rbd Bug #8000: SLAB: Unable to allocate memory on node 0
- This problem remains very painful... The average frequency is one crash per day. Less than 24 hours ago I had two c...
- 09:38 PM Bug #8863: osd: second reservation rejection -> crash
- I used this command to reimport the crushmap, but the osd still crashes
- 01:19 PM Bug #8863: osd: second reservation rejection -> crash
- try this:
ceph osd getcrushmap -o cm
ceph osd setcrushmap -i cm
and then see if you can reproduce it after t...
- 03:41 AM Bug #8863: osd: second reservation rejection -> crash
- The osd rejected the other osd's backfill request twice, probably because the space is full; then the requesting osd crashed
- 03:27 AM Bug #8863: osd: second reservation rejection -> crash
- *scenario:*
1. 3-replica
2. space is nearly full (some osds >96%)
We guess the reason is osd continuously receivi...
- 07:52 PM Bug #8886: Miss some folders in PG's folder
- Hi Samuel,
First, let me correct my earlier statement "it should be stored in the DIR_3 at third level"; actually it is missing the DIR_...
- 01:43 PM Bug #8886: Miss some folders in PG's folder
- Can you add a find . on that pg directory? Also, does this happen reliably? Also, on what version did you reproduce...
- 07:30 PM CephFS Bug #8962: kcephfs: client does not release revoked cap
- I saw a similar hang a few weeks ago. In that case, all OSDs were down and the MDS couldn't submit the log event.
- 03:05 PM CephFS Bug #8962 (Resolved): kcephfs: client does not release revoked cap
- several instances where the mds tries to revoke a cap (Ls and Fs have been observed so far) and the client doesn't re...
- 07:18 PM CephFS Bug #8964: kcephfs: client does not resend requests on mds restart
- probably fixed by https://github.com/ceph/ceph-client/commit/967166011221589288348b893720d358150176b9
- 05:40 PM CephFS Bug #8964: kcephfs: client does not resend requests on mds restart
- mds log and the client kern.log with debug cranked up:...
- 05:39 PM CephFS Bug #8964 (Resolved): kcephfs: client does not resend requests on mds restart
- I have a bunch of hung requests,...
- 06:47 PM Feature #8965 (New): Improve threading for ObjectCacher
- The ObjectCacher currently uses a single global lock for all state. Break this down to improve multithread performanc...
- 03:55 PM Feature #8960: filestore: store backend type persistently
- 10:27 AM Feature #8960 (Resolved): filestore: store backend type persistently
- 03:32 PM rgw Bug #8586 (Fix Under Review): Missing Swift API Header causes RadosGW to segfault
- 03:06 PM RADOS Bug #8963 (Resolved): erasure coding crush ruleset breaks rbd kernel clients on non-ec pools on Ub...
- On a fresh install using ceph-deploy on Ubuntu 14.04 creating any erasure coded pool breaks rbd clients on linux 3.13...
- 03:02 PM Bug #8726 (Resolved): (firefly command on dumpling issue?) Error "'adjust-ulimits ceph-coverage /...
- commit:fcc0b2451b47793a64fc4cd4675fef667a4a5b45 in ceph-qa-suite.git
- 02:31 PM Bug #8628: Bad ceph_osd_op.extent union access in ReplicatedPG::do_osd_ops
- This was fixed in 58212b1.
- 02:28 PM devops Bug #6091 (Won't Fix): centos build should use redhat-rpm-config for debuginfo packages
- 02:28 PM devops Bug #5819 (Won't Fix): redhat-rpm-config package needed for debuginfo packages
- 02:26 PM devops Bug #7181 (Rejected): debian 7 wheezy init.d script will not start OSDs not corresponding to a mo...
- touch /var/lib/ceph/osd/*/sysvinit
- 02:26 PM devops Bug #6937 (Resolved): udev: OSD using dmcrypt aren't automatically started
- 02:25 PM devops Bug #6453 (Won't Fix): libapache2-mod-fastcgi Packages for Debian Squeeze have incorrect dependen...
- 02:25 PM devops Bug #6158: selective sync of ceph precise dependencies from havana cloud archive
- Note: Talk to neil about this.
- 02:22 PM devops Bug #8602 (Rejected): ceph fedora package is missing erasure code libraries
- redoing (redid?) these packages
- 02:22 PM Bug #8711 (Resolved): Error "ceph --format=json-pretty osd lspools" is "unrecognized command" in ...
Oops, this should have been closed already...
- 01:51 PM Bug #8711: Error "ceph --format=json-pretty osd lspools" is "unrecognized command" in cuttlefish
- Probably best to change the test to cope?
- 02:21 PM devops Bug #7598 (Can't reproduce): ceph-disk-activate error with ceph-deploy
- 02:19 PM devops Bug #8581 (Can't reproduce): DNS issues when resolving hosts
- 02:17 PM devops Bug #8734: EPEL / Ceph.com package priority issues
- ceph-deploy sets the priority; other users will need to do so themselves.
perhaps that can be mentioned in the doc... - 02:15 PM devops Bug #5283 (Won't Fix): Ceph-deploy can't handle /dev/disk/by-* device paths
- 02:06 PM devops Bug #7627 (Resolved): ceph-disk: does not start daemons properly under systemd
- commit:3e0d9800767018625f0e7d797c812aa44c426dab
- 02:01 PM Documentation #8875: `ceph-deploy new` needs to be called for every node, not just the admin one
- Can you paste the whole output of ceph-deploy?
- 01:58 PM Bug #6141 (Can't reproduce): OSDs crash on recovery
- 01:52 PM Bug #8673 (Resolved): s3tests.functional.test_s3.test_multipart_upload failed in teuthology-2014-...
- 01:50 PM Bug #8654 (Resolved): Parsing /etc/lsb-release for OSD metadata is not portable
- 01:49 PM Bug #8644 (Rejected): 624ae21833 breaks ceph-disk
- 01:48 PM Bug #8852 (Won't Fix): submodules not checking out the right branch, jerasure does not compile
- workaround is to remove the dir then rerun the submodule command. we blame git!
- 01:47 PM Bug #8801 (Can't reproduce): Ceph monitors do not start after server restart
- from the logs the ceph-mon process was never started... I would look in your /var/log/upstart logs?
- 01:37 PM Bug #8943 (Pending Backport): "ceph df" cannot show pool available space correctly
- commit:04d0526718ccfc220b4fe0c9046ac58899d9dafc
- 01:34 PM Bug #8495 (Duplicate): osd: bad state machine event on backfill request
- 01:29 PM Bug #8694 (Duplicate): OSD crashed (assertion failure) at FileStore::_collection_move_rename
- #8733
- 01:28 PM rgw Bug #8676: md5sum check failed during readwrite.py
- I don't see anything wrong in the logs other than this:...
- 01:27 PM Bug #8753: PG::activate assert failed when recover finished
- Has this happened since?
- 01:26 PM Bug #8865: ceph osd setmaxosd doesn't check if osds exist
- agreed
- 01:26 PM Bug #8752 (Can't reproduce): firefly: scrub/repair stat mismatch
- 01:25 PM Bug #8752 (Resolved): firefly: scrub/repair stat mismatch
- 01:06 PM CephFS Bug #8961 (Won't Fix): du [directory] vs du -b [directory] size doubles
- Under cephfs using the kernel client, du -b shows an incorrect size.
I've also found that du --apparent-size shows...
- 01:04 PM Bug #8717 (In Progress): teuthology: valgrind leak checks broken for osd (at least)
- 01:03 PM Bug #8717 (Resolved): teuthology: valgrind leak checks broken for osd (at least)
- 01:03 PM Bug #8926 (Resolved): osd: invalid Message* deref in C_SendMap
- 01:03 PM Bug #8924 (Resolved): osd: leaking local_connection under valgrind
- 12:59 PM Messengers Bug #8880: msg/Pipe.cc: 1538: FAILED assert(0 == "old msgs despite reconnect_seq feature")
- 10:42 AM rgw Bug #8632 (Resolved): rgw: bucket listing with delimiter doesn't scale well
- backported to dumpling commit:9604425b86f5839a109faa1f396b0d114e9b9391
- 09:36 AM rgw Bug #8632 (Pending Backport): rgw: bucket listing with delimiter doesn't scale well
- in firefly, not dumpling yet
- 10:31 AM rgw Bug #8846 (Resolved): radosgw on 0.80.4 crashes when doing a multi-part upload
- 10:11 AM Bug #8532 (Can't reproduce): 0.80.1: OSD crash (domino effect), same as BUG #8229
- Let us know if anything interesting comes up.
- 10:10 AM Bug #8229: 0.80~rc1: OSD crash (domino effect)
- This bug described a whole bunch of unrelated problems; can you open a fresh bug?
- 10:01 AM Bug #8959: osd crashed in upgrade:dumpling-x-firefly---basic-vps suite
- this sounds a bit like a problem we had a while back with hung IOs from the VMs?
- 08:40 AM Bug #8959: osd crashed in upgrade:dumpling-x-firefly---basic-vps suite
- Seems to be the same crash in other tests; logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-07-28_11:48:15...
- 08:36 AM Bug #8959 (Can't reproduce): osd crashed in upgrade:dumpling-x-firefly---basic-vps suite
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-07-28_11:48:15-upgrade:dumpling-x-firefly---basic-vps...
- 09:41 AM CephFS Bug #8574: teuthology: NFS mounts on trusty are failing
- I'm not sure if this is a different issue or a different system:...
- 09:40 AM devops Support #8861: Deploying additional monitors fails.
- I am also seeing this error when trying to add a new monitor. Same version of Ubuntu and Ceph.
- 09:38 AM rgw Bug #8735 (Can't reproduce): TestAccountNoContainers fail in Firefly upgrade:firefly-x:stress-split
- 09:38 AM rgw Bug #8766: multipart minimum size error should be EntityTooSmall
- 09:37 AM rgw Bug #8848 (Resolved): "adjust-ulimits: command not found" in upgrade:firefly-firefly-testing-basi...
- 09:37 AM rgw Bug #8847 (Can't reproduce): "Error initializing cluster client" in upgrade:firefly-firefly-testi...
- 09:34 AM Bug #8921 (Won't Fix): ceph pg dump <{summary|sum|delta|pools|osds|pgs|pgs_brief}> only work corr...
- 09:33 AM rgw Bug #8864: radosgw help doesn't seem to display some debug options
- there are others that we could add
- 09:32 AM rgw Bug #8864 (Resolved): radosgw help doesn't seem to display some debug options
- 09:32 AM rgw Bug #6911 (Won't Fix): rgw test failure on the arm set up
- 09:31 AM rgw Bug #8111 (Need More Info): /etc/init.d/ceph-radosgw for RHEL needs QA
- isn't it /etc/init.d/radosgw?
- 09:30 AM rgw Bug #8383 (Need More Info): Upload part of one object passed with incorrect upload id or incorrec...
- Can you provide more detailed steps to reproduce? Ideally, a new test in s3-tests.... :)
- 09:29 AM rgw Bug #7799 (Can't reproduce): Errors in upgrade:dumpling-x:stress-split-firefly---basic-plana suite
- 09:25 AM rgw Bug #8311 (Resolved): No pool name error in ubuntu-2014-05-06_21:02:54-upgrade:dumpling-dumpling-...
- 09:25 AM rgw Bug #8784: rgw: completion leak
- 09:23 AM rbd Bug #6695 (Won't Fix): Upgrade rbd failure in nightly tests. (mkdir --p ..)
- 09:22 AM rbd Bug #5480 (Can't reproduce): libceph: unexpected old state in con_sock_state_change
- 09:21 AM rbd Bug #8845: Flattening Clones of clone, results in command failure
- fsx is now able to catch this one.
- 09:19 AM rbd Bug #8845: Flattening Clones of clone, results in command failure
- 09:15 AM rbd Bug #8845: Flattening Clones of clone, results in command failure
- 09:21 AM rbd Bug #7693: virsh domblkinfo fails with 'Bad file descriptor'
- https://bugzilla.redhat.com/show_bug.cgi?id=1124508
- 09:17 AM rbd Bug #7620 (Can't reproduce): BUG: soft lockup - CPU#0 stuck for 23s!
- 09:15 AM Linux kernel client Bug #8568 (New): libceph: kernel BUG at net/ceph/osd_client.c:885
- 09:10 AM Linux kernel client Bug #8568: libceph: kernel BUG at net/ceph/osd_client.c:885
- 09:14 AM rbd Bug #8709: stale size reported by ioctl(BLKGETSIZE64) after librbd_resize() returns
- The problem has been traced to http://tracker.ceph.com/issues/8806. Keeping this around to re-test after it gets fixed.
- 09:11 AM Bug #8439 (Won't Fix): ceph-osd crashing often
- see 0.80.x
- 09:10 AM Bug #8445 (Won't Fix): osd not starting anymore
- 0.78 had lots of issues; see 0.80.x
- 09:01 AM rbd Bug #8318 (Can't reproduce): "rbd: create error" in upgrade:dumpling-dumpling-testing-basic-plana...
- 09:01 AM rbd Bug #8715 (Can't reproduce): "ceph_test_librbd_fsx: invalid option -- 'h'" error in teuthology-20...
- 06:57 AM CephFS Feature #7759 (Resolved): journal-tool: roll in resetter/dumper from MDS
- ...
- 06:56 AM CephFS Feature #7761 (Resolved): journal-tool: forwards-search through corrupt regions
- ...
- 06:55 AM CephFS Feature #7763: journal-tool: import
- ...
- 06:54 AM CephFS Feature #7763 (Resolved): journal-tool: import
- This was done when undump was merged into cephfs-journal-tool
- 06:51 AM CephFS Bug #8773 (Resolved): failing cephfs set_layout tests
- The test is retired and the unsafe behaviour (data pool defaulting to 0) is disabled in master.
- 06:07 AM CephFS Bug #8811 (Resolved): Journal corruption during upgrade to 0.82 with standby-replay daemons
- This got fixed 11 days ago, but was never marked closed. Merged in commit:b9463e3497cc1f2a1bab0838430a4402d8c88af0
- 05:59 AM Bug #8932 (Resolved): rados api test hang on HitSetWrite
- Merged to master in commit:37eba045ec78f2ea8f9000c6b158e20808d29fb2
- 05:56 AM Bug #8931 (Pending Backport): failed write reply order from ceph_test_rados
- Merged to master in commit:050ac87530c2637f097e07b5373115721303f07c
07/28/2014
- 10:47 PM Bug #8944: Ceph daemon bad asok used in connection with cluster
- wip-8944 created, but the gitbuilders are having enough problems that I'm not submitting a PR yet
- 02:11 PM Bug #8944 (Fix Under Review): Ceph daemon bad asok used in connection with cluster
- Adding the global args to the invocation of ceph-conf seems to resolve this.
- 12:41 PM Bug #8944: Ceph daemon bad asok used in connection with cluster
- oh....because --cluster on the cli ... yeah.
- 12:40 PM Bug #8944: Ceph daemon bad asok used in connection with cluster
- ceph uses ceph-conf --show-config-value admin_socket -n <name> and believes it; wonder why that's not working?
- 09:58 AM Bug #8944: Ceph daemon bad asok used in connection with cluster
- 05:01 AM Bug #8944 (Resolved): Ceph daemon bad asok used in connection with cluster
- Using @ceph --cluster clustername daemon mon.host1 config@ causes ...
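For reference, the lookup described in the comments above would take roughly this form; whether ceph-conf actually honors --cluster here is exactly what is in question:
ceph-conf --cluster clustername --show-config-value admin_socket -n mon.host1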
- 10:46 PM Bug #8947: Writing rados objects with max objects set for cache pool crashed osd
- Uploading crash dump
- 01:45 PM Bug #8947: Writing rados objects with max objects set for cache pool crashed osd
- Could not reproduce using vstart.sh on current master branch. I never saw a crash or bug report with that stack trace.
- 10:08 AM Bug #8947: Writing rados objects with max objects set for cache pool crashed osd
- I don't remember the details, but we were previously crashing with a 10-object limit anyway due to hit sets and such....
- 08:16 AM Bug #8947: Writing rados objects with max objects set for cache pool crashed osd
- Test configuration:
No of osd nodes: 3
No of osd's : 4
No of monitors: 2
Kernel versions: 3.13.0-24-generic
No o...
- 08:15 AM Bug #8947 (Duplicate): Writing rados objects with max objects set for cache pool crashed osd
- Setting the target_max_objects parameter and writing rados objects onto the cache pool crashed an osd.
History of operations o...
- 06:41 PM Messengers Bug #8880 (Fix Under Review): msg/Pipe.cc: 1538: FAILED assert(0 == "old msgs despite reconnect_s...
- New patches to split up the code more, as requested. :)
- 10:56 AM Messengers Bug #8880 (In Progress): msg/Pipe.cc: 1538: FAILED assert(0 == "old msgs despite reconnect_seq fe...
- 02:12 PM rgw Bug #8937 (Fix Under Review): rgw: broken large(-ish) objects
- 02:10 PM rgw Feature #7774 (Resolved): rgw: cache decoded user and bucket info
- This one was merged a while ago, in commit:82c547952dc9e7a3e9fab1264f5fdd903ab6973e.
- 01:07 PM Bug #8941 (Can't reproduce): DaemonConfig.SubstitutionLoop unit test goes haywire
- Never mind; the most recent occurrence was in February, so ignoring this.
- 01:02 PM rgw Feature #8956 (Resolved): rgw: support bucket notification
- 11:32 AM Documentation #8955: doc refers to [default] section, don't think it exists
- http://ceph.com/docs/master/start/quick-ceph-deploy/#create-a-cluster refers to the [default] section in the ceph.con...
- 11:31 AM Documentation #8955 (Resolved): doc refers to [default] section, don't think it exists
- 09:21 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- I'm pretty sure it's the disabled lockdep that affects this. Our testing kernel is built with lockdep enabled, Ubunt...
- 08:50 AM Linux kernel client Bug #8818: IO Hang on raw rbd device - Workqueue: ceph-msgr con_work [libceph]
- Hi Ilya,
I can reliably reproduce the error when running this generic kernel with no changes:
http://kernel.ubu... - 08:39 AM Bug #8935: operations not idempotent when enabling cache
- I think you're right that a per-object log would be needed to solve this problem — and I think that means we shouldn'...
- 08:02 AM rgw Feature #8945 (Resolved): rgw: support swift /info api
- 06:55 AM Bug #8938 (Resolved): OSD memory leak seen with fs-master-testing-basic/kernel_untar_build.sh
- This was fixed at about the same time:...
- 06:42 AM CephFS Feature #7810 (In Progress): libcephfs: add a test that freezes + unfreezes a client, and then ve...
- 05:27 AM Bug #8895: ceph osd pool stats (displayed incorrect values)
- Negative & undefined values in the object counts:
*-5/0 objects degraded (-inf%)*
*-32/12 objects degraded (-266...
- 03:06 AM rgw Bug #8864: radosgw help doesn't seem to display some debug options
- This should be closed with #8112
- 02:48 AM Bug #8943 (Resolved): "ceph df" cannot show pool available space correctly
- Currently, when a user has 2 pools with different rulesets and different roots, basically they will use differen...
- 12:37 AM Bug #8863: osd: second reservation rejection -> crash
- Last week we created a new cluster (all components use v0.80.4), continuously writing data until space was full, the...
07/27/2014
- 11:45 PM Bug #8942 (Resolved): Bad JSON output in ceph osd tree
- Hi,
The JSON output for @ceph osd tree@ has a bad format for the stray array: every osd is printed in the same array element....
- 10:41 PM Bug #8941 (Can't reproduce): DaemonConfig.SubstitutionLoop unit test goes haywire
- ...
- 10:31 PM Bug #8822: osd: hang on shutdown, spinlocks
- saw this again, ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-07-27_02:30:01-rados-next-testing-basi...
- 10:28 PM Bug #8396: osd: message delayed in Session misdirected after split
- very likely another instance, but I didn't look closely....
- 10:20 PM Bug #8940 (Duplicate): 3.22s1 shard 0(2) missing ad166f62/benchmark_data_plana57_30491_object1036...
- ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-07-27_02:30:01-rados-next-testing-basic-plana/380335
... - 09:47 PM Bug #6003: journal Unable to read past sequence 406 ...
- ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-07-27_02:30:01-rados-next-testing-basic-plana/380261
... - 02:32 PM Bug #8758: PGs get stuck in “replay”, but drop it upon osd restarts
- As for the issue of losing replay states upon member osd restarts... Could the fix be as simple as not setting inter...
- 01:44 PM Bug #8758: PGs get stuck in “replay”, but drop it upon osd restarts
- Here's a patch that addresses the “stuck in replay” problem (but not the “replay is dropped after osd re-peering” one).
- 11:21 AM Bug #8863 (Need More Info): osd: second reservation rejection -> crash
- 11:20 AM Bug #8922 (Need More Info): ceph-deploy mon create fails to create additional monitoring nodes.
- It sounds like the monitor names don't match the host names or something similar. Can you post the full sequence of ...
07/26/2014
- 10:14 PM Bug #8939 (In Progress): stalled LibRadosTwoPoolsPP.TryFlushReadRace; client failed to reconnect?
- 10:10 PM Bug #8939: stalled LibRadosTwoPoolsPP.TryFlushReadRace; client failed to reconnect?
- ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-07-25_22:40:14-rados-wip-sage-testing-testing-basic-plana/37...
- 10:05 PM Bug #8939 (Duplicate): stalled LibRadosTwoPoolsPP.TryFlushReadRace; client failed to reconnect?
- it appears the OSD was behaving properly, but things stalled because one of the stat replies got...
- 02:06 PM Bug #8938 (Resolved): OSD memory leak seen with fs-master-testing-basic/kernel_untar_build.sh
http://pulpito.front.sepia.ceph.com/teuthology-2014-07-25_23:04:01-fs-master-testing-basic-plana/378947/
Initial...