Activity
From 11/10/2014 to 12/09/2014
12/09/2014
- 10:24 PM CephFS Bug #10288: ceph fs ls fails to list newly created fs
- This is probably going to be something obvious in the MDSMonitor.
- 09:38 PM CephFS Bug #10288 (Resolved): ceph fs ls fails to list newly created fs
- Hi!
After upgrading from .6 to .8 (giant current from ceph ubuntu packages), I wanted to play with CephFS. I foll...
- 10:24 PM Bug #10287: ceph v0.80.7 ceph-mon --mkfs crash
- After changing to leveldb 1.12, everything works fine. Please close it.
- 08:25 PM Bug #10287: ceph v0.80.7 ceph-mon --mkfs crash
- ceph.conf file...
- 08:24 PM Bug #10287 (Resolved): ceph v0.80.7 ceph-mon --mkfs crash
- Running ceph version v0.80.7 on a newly installed "CentOS Linux release 7.0.1406 (Core)" machine; the rpm was built on the same OS platf...
- 06:14 PM CephFS Feature #1398: qa: multiclient file io test
- Currently I am testing with the following yaml file....
- 04:35 PM Feature #10198: PG removal occupy the disk thread several hours
- In that case, two things:
1) move scrubbing into the OpWQ (I'm working on that one)
2) restructure pg removal to on...
- 03:23 PM Bug #10281 (Fix Under Review): firefly: make check fails on fedora 20
- https://github.com/ceph/ceph/pull/3128
- 02:45 PM Bug #10281 (Resolved): firefly: make check fails on fedora 20
- http://paste.ubuntu.com/9447409/
- 03:16 PM Bug #10282 (Resolved): gf-complete: missing .gitignore entry for .dirstamp
- upstream Greg's fix https://github.com/ceph/gf-complete/pull/2 :
* -https://github.com/ceph/gf-complete/pull/3-
* ...
- 01:49 PM CephFS Bug #10248: messenger: failed Pipe::connect::assert(m) in Hadoop client
- Hmm, the client only calls _closed_mds_session if:
1) it gets back a session close
2) the session goes stale
2a)...
- 01:11 PM Feature #7862: allow backfill/recovery while below min_size
- 01:10 PM Feature #8635 (In Progress): add scrub, snap trimming, should be items in the OpWQ with cost/prio...
- 01:06 PM Feature #7861: osd: allow writes on degraded objects
- 01:06 PM Feature #9781 (In Progress): ceph_objectstore_tool: On import handle splits
- 01:05 PM Feature #9780: ceph_objectstore_tool: Add OSDMap information to pg export
- 12:05 PM Fix #9566: osd: prioritize recovery of OSDs with most work to do
- 11:32 AM Bug #9503: Dumpling: removing many snapshots in a short time makes OSDs go berserk
- Sam, is it correct to assume that this was fixed for dumpling in commit:1be9476afb9f715502a14749dd44e08371535b54, and...
- 06:49 AM rgw Bug #10268: s3tests.functional.test_s3.test_bucket_create_exists fails with 'S3CreateError not ra...
- ubuntu@teuthology:/a/teuthology-2014-12-08_02:35:02-smoke-master-distro-basic-multi/642112
- 02:43 AM Bug #10272: objects misplaced after reweight
- Of course... thanks for explaining
- 01:05 AM Bug #10225: keyvaluestore: OSDs do not start after few weeks of downtime (osd init failed / unabl...
- So there is no OSD superblock issue, only the EC+KV problem that #9978 mentioned?
- 12:36 AM Bug #10225: keyvaluestore: OSDs do not start after few weeks of downtime (osd init failed / unabl...
- This issue has something to do with downtime. On KV OSDs I've checked 'superblock' files and found that they are OK ...
- 12:39 AM Bug #9978: keyvaluestore: void ECBackend::handle_sub_read
- Can this bug get a little attention please? It has a profound effect on my crippled cluster and I'm talking about files...
12/08/2014
- 11:12 PM CephFS Bug #10277 (Fix Under Review): ceph-fuse: Consistent pjd failure in getcwd
- 02:58 PM CephFS Bug #10277 (Resolved): ceph-fuse: Consistent pjd failure in getcwd
- "job-working-directory: error retrieving current directory: getcwd: cannot access parent directories: No such file or...
- 09:39 PM Bug #10010 (Fix Under Review): ceph_osd.cc calls global_init_shutdown_stderr even when running wi...
- Seems pretty simple; just check g_conf->daemonize, and don't close if not set.
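A minimal sketch of that check (hypothetical helper; the real change would gate the existing global_init_shutdown_stderr() call on g_conf->daemonize):

```cpp
#include <cassert>

// Illustrative stand-in for the daemon config; only the daemonize flag matters here.
struct Config {
  bool daemonize;
};

// Decide whether stderr should be shut down at startup. When running in the
// foreground (-f/-d, i.e. daemonize not set), stderr must stay open so the
// operator keeps seeing output; only a daemonized process should close it.
// Returns true when stderr would be shut down.
bool should_shutdown_stderr(const Config& conf) {
  return conf.daemonize;
}
```

The sketch only models the decision; in the actual fix the call that closes stderr would simply be skipped when the flag is unset.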
- 09:08 PM Bug #10242: FAILED assert(backfill_targets.empty() || backfill_targets == want_backfill)
- This is from an online production environment; it never happened again, and I cannot reproduce it in my test/staging ...
- 02:53 PM Bug #10242: FAILED assert(backfill_targets.empty() || backfill_targets == want_backfill)
- Mmm, that assert is essentially saying that choose_acting is only called in two situations:
1) On a new interval. I...
- 08:56 PM Bug #10171 (Fix Under Review): DBObjectMap: ghobject_t header key excludes hash for EC pools
- 08:55 PM Bug #10272 (Rejected): objects misplaced after reweight
- The problem is the (post-crush) reweights: you're rejecting almost all osds with 80% probability. Eventually crush will...
- 02:54 PM Bug #10272: objects misplaced after reweight
- This is a problem with the crush rule. Crush retried a bunch of times, but was unable to get 3 replicas for that pg.
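The retry arithmetic behind this can be checked numerically: assuming each CRUSH draw passes the reweight test with probability 0.2 and the number of attempts is bounded (50 here, purely illustrative), the chance of ending up with fewer than 3 acceptable OSDs is small but nonzero:

```cpp
#include <cmath>

// Probability mass of exactly k successes in n Bernoulli(p) trials.
double binom_pmf(int n, int k, double p) {
  double c = 1.0;
  for (int i = 0; i < k; ++i)
    c = c * (n - i) / (i + 1);  // running binomial coefficient C(n, k)
  return c * std::pow(p, k) * std::pow(1.0 - p, n - k);
}

// P(fewer than k successes in n trials): the chance that n bounded retries
// fail to find k OSDs when each candidate survives the reweight test with
// probability p.
double prob_too_few_replicas(int k, int n, double p) {
  double s = 0.0;
  for (int i = 0; i < k; ++i)
    s += binom_pmf(n, i, p);
  return s;
}
```

With p = 0.2 and n = 50 this comes out near 0.13%, so across thousands of PGs a handful fail to get 3 replicas and show up as misplaced, matching the behavior described above.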
- 10:35 AM Bug #10272 (Rejected): objects misplaced after reweight
- Steps to reproduce, after compiling from sources:...
- 04:13 PM Bug #10018: OSD assertion failure if the hinfo_key xattr is not there (corrupted?) during scrubbing
- It's waiting on https://github.com/ceph/ceph-qa-suite/pull/250
- 03:30 PM Bug #10018 (In Progress): OSD assertion failure if the hinfo_key xattr is not there (corrupted?) ...
- Loic: are the tests for this in the regression suite yet?
- 03:23 PM Bug #10018 (Resolved): OSD assertion failure if the hinfo_key xattr is not there (corrupted?) dur...
- 03:33 PM rbd Bug #10180 (Resolved): qemu tests crash host kernel
- 03:26 PM Bug #10262: osd/osd_types.h: 2944: FAILED assert(rwstate.empty())
- Unfortunately I doubt it. From what I have read, cranking up the logs so much would extremely quickly eat up availabl...
- 03:20 PM Bug #10262: osd/osd_types.h: 2944: FAILED assert(rwstate.empty())
- Can you reproduce? The logs don't have much information, I need it reproduced with
debug osd = 20
debug filestor...
- 09:48 AM Bug #10262: osd/osd_types.h: 2944: FAILED assert(rwstate.empty())
- Replicated Pool. No cache tiering.
- 09:47 AM Bug #10262: osd/osd_types.h: 2944: FAILED assert(rwstate.empty())
- is this an erasure or replicated pool? are you using cache tiering?
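The requested debug levels, as a ceph.conf fragment (the truncated comment above names only `debug osd = 20` and the start of a filestore option; the full set below is the conventional triage set and partly an assumption):

```ini
[osd]
debug osd = 20
debug filestore = 20
debug ms = 1
```

Note these levels generate a lot of log data quickly, which matches the reporter's disk-space concern.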
- 03:25 PM Bug #9503 (Resolved): Dumpling: removing many snapshots in a short time makes OSDs go berserk
- 03:03 PM CephFS Bug #10263 (Resolved): [ERR] bad backtrace on dir ino 600
- They're all happy now, merged everything in.
- 12:36 PM CephFS Bug #10263: [ERR] bad backtrace on dir ino 600
- Merged in the patch for Giant as of commit:247a6fac54854e92a7df0e651e248a262d3efa05.
The others are a little unhap...
- 02:05 PM CephFS Bug #10248: messenger: failed Pipe::connect::assert(m) in Hadoop client
- the new assert for wip-10057 would trigger this.
this looks like a corner case in the session close + reopen seque...
- 11:59 AM Bug #10250: PG stuck incomplete after interrupted backfill.
- Awesome, looks like it worked, it started backfilling right away and now my vms are unfrozen. Thanks a lot!
- 11:30 AM Bug #10250: PG stuck incomplete after interrupted backfill.
- install wip-last_epoch_started, set osd_find_best_info_ignore_history_les = true in ceph.conf, and restart the primar...
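As a config sketch, the suggested workaround (option name taken from the comment; the advice to remove it again after recovery is my assumption, since it relaxes a peering safety check):

```ini
[osd]
# Temporary workaround: let peering pick the best info while ignoring
# history.last_epoch_started. Remove once the PG has recovered.
osd_find_best_info_ignore_history_les = true
```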
- 10:59 AM Bug #10250: PG stuck incomplete after interrupted backfill.
- ...
- 10:21 AM Bug #10250: PG stuck incomplete after interrupted backfill.
- I actually had already marked 11 as lost a few days ago. Just this morning I re-activated the disk and it came up as ...
- 08:52 AM Bug #10250: PG stuck incomplete after interrupted backfill.
- Can you try 'ceph osd lost 11' ? (I take it osd.11 is the one that you wiped and removed?)
If you can capture the...
- 06:48 AM Bug #10250: PG stuck incomplete after interrupted backfill.
- I found more info in the log relating to this pg. It looks like it's kicking it, but still hanging requests. At this ...
- 11:48 AM Bug #8935: operations not idempotent when enabling cache
- ubuntu@teuthology:/a/samuelj-2014-12-05_23:56:18-rados-wip-sam-firefly-testing-wip-testing-vanilla-fixes-basic-multi/...
- 11:47 AM Bug #8797: "ceph status" do not exit with python_2.7.8
- I think the right fix for this is to remove Rados.__del__. I'll come up with a pull request unless you want to, Joe.
- 10:33 AM rgw Bug #10066 (In Progress): rgw: failed md5sum on s3tests-test-readwrite
- 10:19 AM Bug #10241 (Need More Info): Incorrect OSD mapping with EC 6+2 setup in Giant
- need osdmap or crushmap that triggers the failed mapping
- 09:47 AM Bug #10258 (Duplicate): ceph health reporting blocked op indefinitely
- #10259
- 09:45 AM rgw Bug #10271 (Resolved): Radosgw urlencode
- When performing a multipart upload using AWS SDK JS v.2.0.29 I can see that the uploadId always starts with "2/" whic...
- 09:28 AM devops Bug #10266: Can't kill runs on magna002 "AuthenticationException: Authentication failed."
- What even caused this? I can't of course log in to the machines to check it out.
- 09:12 AM rbd Bug #10270 (Resolved): "[ FAILED ] LibRBD.ListChildren" in upgrade:firefly-x-giant-distro-basic...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-12-07_18:13:01-upgrade:firefly-x-giant-distro-basic-m...
- 08:40 AM Feature #10192 (Resolved): ceph_objectstore_tool object lookup
- 08:18 AM Bug #10215 (Resolved): vstart_wrapper.sh kills daemons that do not belong to it
- 06:56 AM devops Bug #10200: tgtd error: undefined symbol: rbd_discard
- tgt_1.0.38-48.bf6981.precise.ceph_amd64.deb from ceph-extras depends on rbd_discard, which is not available in stock ...
- 05:43 AM Bug #9916: osd: crash in check_ops_in_flight
- Sorry for the repetition of #5 and #6; it was caused by a network problem.
- 05:41 AM Bug #9916: osd: crash in check_ops_in_flight
- In file osd/OSD.cc:
OSD::_dispatch(Message *m) method:...
- 05:01 AM Fix #9566: osd: prioritize recovery of OSDs with most work to do
- ...
- 02:50 AM Fix #9566 (In Progress): osd: prioritize recovery of OSDs with most work to do
- 04:30 AM Linux kernel client Feature #5109: libceph: implement message signatures
- 04:04 AM Messengers Feature #10029: Retry binding on IPv6 address if not available
- Logs I'm seeing on a monitor when it boots:...
- 02:49 AM Cleanup #10253 (Closed): gf-complete dead code
- False positive according to Kevin & Janne.
- 02:03 AM Bug #9485: Monitor crash due to wrong crush rule set
- But you understand that when CRUSH cannot find enough racks using the indep mode, things go wrong and the wrong rule...
- 01:41 AM Bug #9485: Monitor crash due to wrong crush rule set
- Although I've marked the issue as verified, I did not actually get to reproduce it. I meant to a number of times usin...
12/07/2014
- 07:46 PM Feature #10193 (Fix Under Review): Perf counter for WBThrottle
- https://github.com/ceph/ceph/pull/3111
- 06:40 PM Bug #9485: Monitor crash due to wrong crush rule set
- Hi sage:
According to my test earlier, crushtool may not be able to make it crash. I remember that crushtool will ...
- 10:11 AM Bug #9485: Monitor crash due to wrong crush rule set
- Just to offer some debriefing on the issue.
After installing the patch, I managed to get the monitor up and runnin...
- 05:50 PM CephFS Bug #10263 (Fix Under Review): [ERR] bad backtrace on dir ino 600
- It's introduced by the 'verify backtrace on fetching dirfrag' patch. Stray directories of the old fs have no backtrace, th...
- 10:06 AM rgw Bug #10268 (Resolved): s3tests.functional.test_s3.test_bucket_create_exists fails with 'S3CreateE...
- ...
- 08:44 AM devops Bug #10266 (Resolved): Can't kill runs on magna002 "AuthenticationException: Authentication failed."
- ...
- 05:47 AM devops Bug #10148: Giant/Wheezy SysV: /etc/init.d/ceph -a start shifts crushmap to executing host
- Duplicate of #9407
- 05:00 AM Fix #10264 (Fix Under Review): docker-test-helper fails on detached head
- https://github.com/ceph/ceph/pull/3104
- 03:19 AM Fix #10264 (Resolved): docker-test-helper fails on detached head
- If the working tree is on a detached head (i.e. the commit may be unreachable from any git refs), docker-test-he...
- 04:59 AM Documentation #10265: building from source should be a one-liner
- https://github.com/ceph/ceph/pull/3104
- 04:46 AM Documentation #10265 (Resolved): building from source should be a one-liner
- Building Ceph from sources is documented as multiple steps although it could be a one-liner grouping...
- 12:31 AM Feature #9888 (Resolved): AsyncMessenger: Async event threads can be shared by all AsyncMessengers
12/06/2014
- 06:42 PM Bug #10125 (Resolved): radosgw is being started as root not apache with systemd
- 05:36 PM CephFS Bug #10263 (Resolved): [ERR] bad backtrace on dir ino 600
- ubuntu@teuthology:/a/sage-bug-10171-base/639742
and the other runs in this set. It's an upgrade test:...
- 05:34 PM Bug #9485: Monitor crash due to wrong crush rule set
- I've fixed Panayiotis's issue, but it is different than the original bug.
Dong Lei, I've tried to reproduce this b...
- 12:22 PM Bug #9485: Monitor crash due to wrong crush rule set
- https://github.com/ceph/ceph/commit/wip-9485
- 11:17 AM Bug #9485: Monitor crash due to wrong crush rule set
- for the attached link, this is the result of the command (crushtool, as compiled from git tree with --with-debug --...
- 11:16 AM Bug #9485: Monitor crash due to wrong crush rule set
- for the attached link, this is the result of the command (crushtool, as supplied by debian packages -- 0.80)
htt...
- 11:09 AM Bug #9485: Monitor crash due to wrong crush rule set
- This is the crashing crushmap
https://www.dropbox.com/s/gbusu8jf2ku6k62/crushmap.orig?dl=0
- 05:28 PM rbd Bug #10180: qemu tests crash host kernel
- For the kernel fixes see https://github.com/ceph/teuthology/pull/380
Rbd suite run - http://pulpito.front.sepia.ce...
- 11:08 AM Bug #10063 (Resolved): ceph_objectstore_tool does not support getting attributes for erasure code...
- 11:02 AM Feature #9420 (Resolved): erasure-code: tools and archive to check for non regression of encoding
- 06:26 AM RADOS Feature #6114: Complete python binding interfaces for librados
- * lock support https://github.com/ceph/ceph/pull/3099
- 06:04 AM Bug #10262 (Resolved): osd/osd_types.h: 2944: FAILED assert(rwstate.empty())
- During the night of 2014-12-06 our cluster (4 nodes, 12x4TB spinning disks, Firefly 0.80.7.1 on Ubuntu 14.04.1) suffe...
12/05/2014
- 09:56 PM Bug #10225: keyvaluestore: OSDs do not start after few weeks of downtime (osd init failed / unabl...
- Sage Weil wrote:
> Just a reminder that the "_dev" in "keyvaluestore_dev" means "experimental! danger! danger!". Th...
- 07:01 AM Bug #10225: keyvaluestore: OSDs do not start after few weeks of downtime (osd init failed / unabl...
- Just a reminder that the "_dev" in "keyvaluestore_dev" means "experimental! danger! danger!". This code is not well-...
- 05:48 PM CephFS Feature #1398: qa: multiclient file io test
- The problem, I believe, is that we need to install ceph and make sure that we have some mount points before we run the ...
- 05:11 PM Bug #9485: Monitor crash due to wrong crush rule set
- Hello, I can verify that I am facing the same problem.
After trying to edit the crushmap in order to separate grou...
- 03:26 PM Bug #10259 (Resolved): mon health stuck with phantom hung requests
- commit:1ac17c0a662e6079c2c57edde2b4dc947f547f57
(03:22:47 PM) sjust: sage: osd_stat_t
(03:22:59 PM) sjust: does n...
- 02:36 PM Bug #10258 (Duplicate): ceph health reporting blocked op indefinitely
- On the performance test cluster, when creating an EC pool, ceph health reports that an op is blocked many hours after...
- 12:08 PM Bug #10257 (Resolved): Ceph df doesn't report MAX AVAIL correctly when using rulesets and OSD in ...
- In our setup we have two rulesets, one for SSDs and another one for HDDs. Ceph df normally reports the MAX AVAIL spac...
- 11:58 AM devops Bug #10252 (Resolved): apt-mirror having issues
- This was being caused by a proxy issue (needed another reload) which is used to access apt-mirror from the redhat net...
- 09:37 AM devops Bug #10252: apt-mirror having issues
- From magna002 curl sees the same:...
- 08:25 AM devops Bug #10252 (Resolved): apt-mirror having issues
- That prevents installation on some machines:...
- 10:16 AM Feature #10231 (Resolved): gperftools headers have moved
- 09:32 AM Bug #9785 (Fix Under Review): /etc/ceph/dmcrypt-keys and key contents are created world-readable
- * giant backport https://github.com/ceph/ceph/pull/3095
* firefly backport https://github.com/ceph/ceph/pull/3096
- 08:38 AM Bug #9785 (Pending Backport): /etc/ceph/dmcrypt-keys and key contents are created world-readable
- 09:25 AM Feature #10254 (New): mon,osd: long-term non-clean PGs prevent osdmap trimming
- sometimes clusters have pgs that are degraded for long periods of time. this forces the mon to retain lots of old os...
- 08:58 AM Cleanup #10253 (Closed): gf-complete dead code
- ...
- 08:12 AM rgw Bug #10251 (Resolved): "Segmentation fault" (radosgw()) in upgrade:dumpling-firefly-x:parallel-ne...
- Run http://pulpito.ceph.com/teuthology-2014-12-04_17:15:01-upgrade:dumpling-firefly-x:parallel-next-distro-basic-vps/...
- 07:42 AM Bug #9844: "initiating reconnect" (log) race; crash of multiple OSDs (domino effect)
- I have had this same issue on my cluster as well.
My cluster originally had 4 nodes, with 7 osds on each node, 28 ...
- 06:37 AM Bug #10250: PG stuck incomplete after interrupted backfill.
- Oh I forgot to mention, min_size is 1 on this pool.
- 05:57 AM Bug #10250 (Closed): PG stuck incomplete after interrupted backfill.
- Ceph version: 0.87
OS: Ubuntu 14.04
Cluster: 3x osd nodes with ~24 osds each
Issue: I had a pool accidentally se...
- 03:55 AM Bug #9916: osd: crash in check_ops_in_flight
- Sage Weil wrote:
> how is the OSDOp being formed? this looks like a bug on the client side to me. the attr ops sho...
- 03:06 AM Bug #10246 (Resolved): add el7 to the list of supported version for centos in ceph-deploy install...
- 01:15 AM rbd Feature #10226: Add pool quota reporting for Libvirt and other clients
- Assigning this one to me.
Need to figure out a way to fetch the pool's quota from the cluster first.
- 01:12 AM Fix #9566 (Fix Under Review): osd: prioritize recovery of OSDs with most work to do
- 01:10 AM Bug #10018 (Fix Under Review): OSD assertion failure if the hinfo_key xattr is not there (corrupt...
12/04/2014
- 07:36 PM CephFS Bug #10229 (Resolved): Filer: lock inversion with Objecter
- 06:06 PM Bug #10117: OSD crashes if xattr "_" is absent for the file when doing backfill scanning (Replica...
- I'm sorry that I forgot to use the correct format before, so some characters were omitted:...
- 06:04 PM Bug #10117: OSD crashes if xattr "_" is absent for the file when doing backfill scanning (Replica...
- Wenjun Huang wrote:
> Sorry for my carelessness, I meant:
> @if (r == -ENOENT || r == -ENOATTR)
> continue;@
...
- 05:26 PM CephFS Feature #1398: qa: multiclient file io test
- ...
- 03:49 PM devops Feature #10046 (In Progress): run make check on every pull request
- https://github.com/ceph/ceph-build/pull/35
- 01:58 PM Bug #10063 (Fix Under Review): ceph_objectstore_tool does not support getting attributes for eras...
- * firefly backport https://github.com/ceph/ceph/pull/3089
* giant backport https://github.com/ceph/ceph/pull/3088
- 01:29 PM Bug #10125 (Fix Under Review): radosgw is being started as root not apache with systemd
- 12:07 PM Bug #10125: radosgw is being started as root not apache with systemd
* giant backport https://github.com/ceph/ceph/pull/3085
* firefly backport https://github.com/ceph/ceph/pull/3086
- 10:03 AM Bug #10125 (Pending Backport): radosgw is being started as root not apache with systemd
* firefly backport https://github.com/ceph/ceph/pull/3086 - 10:03 AM Bug #10125 (Pending Backport): radosgw is being started as root not apache with systemd
- 09:58 AM Bug #10125 (Fix Under Review): radosgw is being started as root not apache with systemd
- ...
- 04:45 AM Bug #10125: radosgw is being started as root not apache with systemd
- I would like to test it manually but I don't know how to get a centos7 VPS. Help ?
- 01:23 PM Fix #9566 (In Progress): osd: prioritize recovery of OSDs with most work to do
- 01:22 PM Bug #9785 (Fix Under Review): /etc/ceph/dmcrypt-keys and key contents are created world-readable
- https://github.com/ceph/ceph/pull/3087
- 12:44 PM Bug #10246 (Fix Under Review): add el7 to the list of supported version for centos in ceph-deploy...
- 08:06 AM Bug #10246: add el7 to the list of supported version for centos in ceph-deploy installation instr...
- https://github.com/ceph/ceph/pull/3081
- 08:00 AM Bug #10246 (Resolved): add el7 to the list of supported version for centos in ceph-deploy install...
- http://ceph.com/docs/master/start/quick-start-preflight/#red-hat-package-manager-rpm
- 11:31 AM rbd Bug #10030 (Resolved): Crash when attempting to open non-existent parent image
- 10:37 AM CephFS Bug #10248 (New): messenger: failed Pipe::connect::assert(m) in Hadoop client
- We have logs and a core dump from the QA run: http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-30_23:12:01-hado...
- 10:05 AM Bug #10211 (Fix Under Review): gf-complete exit(1) because of misaligned structure
- giant backport https://github.com/ceph/ceph/pull/3083
- 10:00 AM Bug #10211 (Pending Backport): gf-complete exit(1) because of misaligned structure
- 09:25 AM Feature #9728 (Resolved): erasure-code: jerasure support for NEON
- 09:24 AM Documentation #10247 (Resolved): Alpha sort os-recommendations
- 08:14 AM Documentation #10247 (Resolved): Alpha sort os-recommendations
- to remove implied bias.
- 09:24 AM Bug #10209 (Resolved): osd/OSD.cc: 5410: FAILED assert(session)
- 09:11 AM Documentation #10245: RPM quick start for RHEL should explain where to get tcmalloc & python-flask
- CentOS7 needs...
- 07:55 AM Documentation #10245: RPM quick start for RHEL should explain where to get tcmalloc & python-flask
- For RHEL, we should cover the basics of "Check if you're subscribed to Red Hat with @subscription-manager@ ?" and "ru...
- 07:53 AM Documentation #10245: RPM quick start for RHEL should explain where to get tcmalloc & python-flask
- Bumping the priority so that it gets triaged
- 07:47 AM Documentation #10245 (Resolved): RPM quick start for RHEL should explain where to get tcmalloc & ...
- http://ceph.com/docs/master/start/quick-start-preflight/#red-hat-package-manager-rpm...
- 08:59 AM rgw Bug #10066: rgw: failed md5sum on s3tests-test-readwrite
- That's the conf file I'm using for reproduction:...
- 07:12 AM Fix #10244 (New): double resource for setting up ceph-deploy
- And one of them is missing instructions on installing via RPM.
This causes issues with users who find the incompl...
- 06:56 AM Linux kernel client Bug #4553 (Can't reproduce): kclient: lockdep report, crash involving ceph fs and libceph
- 06:36 AM rgw Bug #10243: civetweb is hitting a limit (number of threads 1024)
- The problem is in [1] as:
#define MAX_WORKER_THREADS 1024
I changed this to "MAX_WORKER_THREADS 20480" and it worke...
- 04:00 AM rgw Bug #10243 (Resolved): civetweb is hitting a limit (number of threads 1024)
- When setting "rgw thread pool size" to a number higher than 1024 and enabling civetweb 'rgw_frontends="civetweb port=...
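A sketch of the cap described above (only the 1024 constant comes from the report; the helper and the reject-rather-than-clamp behavior are assumptions):

```cpp
#include <cassert>

// civetweb caps its worker pool at a compile-time constant, so no runtime
// setting can exceed it without recompiling. The constant matches the one
// named in the report; modelling the over-limit case as a refusal (-1) is
// an assumption for illustration.
#define MAX_WORKER_THREADS 1024

int effective_worker_threads(int requested) {
  if (requested <= 0 || requested > MAX_WORKER_THREADS)
    return -1;  // configuration refused; raise the constant and rebuild to go higher
  return requested;
}
```

This is why bumping the define to 20480 and rebuilding, as described in the comment, made the larger thread pool work.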
- 06:30 AM rbd Bug #10123 (Resolved): "Segmentation fault" in upgrade:dumpling-x-firefly-distro-basic-vps run
- 06:21 AM Bug #10067 (Can't reproduce): ::posix_memalign abort ceph::buffer::create_page_aligned in 0.80.7
- 06:04 AM Bug #10067: ::posix_memalign abort ceph::buffer::create_page_aligned in 0.80.7
- The ...
- 05:37 AM Bug #10067: ::posix_memalign abort ceph::buffer::create_page_aligned in 0.80.7
- ...
- 05:21 AM Bug #10067: ::posix_memalign abort ceph::buffer::create_page_aligned in 0.80.7
- It's a stress split test, therefore no erasure code is involved after upgrading to firefly.
- 04:54 AM Bug #10042 (Duplicate): OSD crash doing object recovery with EC pool
- http://tracker.ceph.com/issues/8588
- 04:48 AM Bug #10065 (Duplicate): hung ec-lost-unfound.yaml, failed of osd.{0,2,3}
- http://tracker.ceph.com/issues/10211
- 03:51 AM Bug #8588 (In Progress): In the erasure-coded pool, primary OSD will crash at decoding if any dat...
- 03:42 AM Bug #10017 (Fix Under Review): OSD wrongly marks object as unfound if only the primary is corrupt...
- https://github.com/ceph/ceph/pull/3034 ready for review
- 12:33 AM Bug #10242: FAILED assert(backfill_targets.empty() || backfill_targets == want_backfill)
- Since the osd went down, please raise the priority and severity.
- 12:19 AM Bug #10242 (Can't reproduce): FAILED assert(backfill_targets.empty() || backfill_targets == want_...
- version: 0.80.7
config:
osd max backfills = 1
osd recovery max active = 1
One osd went down after hitting...
12/03/2014
- 09:31 PM CephFS Bug #10229: Filer: lock inversion with Objecter
- 10:40 AM CephFS Bug #10229 (Resolved): Filer: lock inversion with Objecter
- Saw this on a next test (http://qa-proxy.ceph.com/teuthology/sage-2014-12-01_11:11:17-fs-next-distro-basic-multi/6289...
- 09:02 PM Bug #10241 (Resolved): Incorrect OSD mapping with EC 6+2 setup in Giant
- Hit this on the performance test cluster during nightly giant testing. Notice the very incorrect mapping....
- 07:11 PM Feature #10198: PG removal occupy the disk thread several hours
- Samuel Just wrote:
> Radosgw creates unbounded size index objects, will change eventually.
Hi Sam,
It is more like...
- 05:56 PM Bug #10117: OSD crashes if xattr "_" is absent for the file when doing backfill scanning (Replica...
- Sorry for my carelessness, I meant:
@if (r == -ENOENT || r == -ENOATTR)
continue;@
- 04:10 AM Bug #10117: OSD crashes if xattr "_" is absent for the file when doing backfill scanning (Replica...
- Samuel Just wrote:
> Right, deleting the object portion is what I was talking about. I think that's the right w...
- 05:55 PM CephFS Feature #1398: qa: multiclient file io test
- Note to self:
Try: rbd import to create an image name, rbd resize the image, make sure reads return EOF at right...
- 05:43 PM Bug #10176: Segmentation fault in upgrade:firefly:singleton-firefly-distro-basic-vps run
- See PR - https://github.com/ceph/ceph-qa-suite/pull/253
- 04:24 PM rgw Bug #10195: s3 java jdk conn.getobject(...) (get s3 object) method fails with latest version of a...
- Both test runs used the exact same code, same endpoint, same authentication keys, and same GET request. The only thin...
- 12:52 PM Feature #10231: gperftools headers have moved
- Work-in-progress pushed to https://github.com/ceph/ceph/tree/wip-10231-gperftools-location, submitted for review at h...
- 12:44 PM Feature #10231 (Resolved): gperftools headers have moved
- The @google/@ headers location has been deprecated as of gperftools 2.0. As of gperftools 2.2rc, the @google/@ header...
- 10:18 AM Bug #10017: OSD wrongly marks object as unfound if only the primary is corrupted for EC pool
- ...
- 10:10 AM Bug #10017: OSD wrongly marks object as unfound if only the primary is corrupted for EC pool
- Here is a tentative approach. The idea is to accumulate authoritative peers instead of just keeping the last one. For...
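The accumulate-instead-of-overwrite idea might be sketched like this (types and names are illustrative, not the actual scrub/PGBackend structures):

```cpp
#include <map>
#include <string>
#include <vector>

// Illustrative types only; the real code deals in pg_shard_t / ScrubMap.
struct Shard { int osd; };
using ObjectName = std::string;

// Before: one authoritative shard per object (last one wins), so a corrupted
// primary could shadow the surviving good copies and the object looked unfound.
// After: keep every shard that passed the check, so repair can pull from any.
struct AuthoritativeSet {
  std::map<ObjectName, std::vector<Shard>> good;

  void record(const ObjectName& oid, Shard s) {
    good[oid].push_back(s);  // accumulate, never overwrite
  }

  size_t candidates(const ObjectName& oid) const {
    auto it = good.find(oid);
    return it == good.end() ? 0 : it->second.size();
  }
};
```

With several candidates per object, losing (or corrupting) the primary shard no longer leaves the object without a recovery source.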
- 08:56 AM Bug #10017: OSD wrongly marks object as unfound if only the primary is corrupted for EC pool
- Exploring two options:
* changing PG::scrub_compare_maps to collect all shards for a given missing object so tha...
- 08:49 AM Bug #10017: OSD wrongly marks object as unfound if only the primary is corrupted for EC pool
- When the primary shard is lost in k=2, m=2, the PG has an unfound object (because, as explained in the description) t...
- 08:45 AM rgw Bug #10227 (Duplicate): "Segmentation fault" radosgw() in smoke-master-distro-basic-multi suite
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-30_02:35:03-smoke-master-distro-basic-multi/626596...
- 08:35 AM rbd Feature #10226 (New): Add pool quota reporting for Libvirt and other clients
- Currently when adding a Ceph RBD pool into Libvirt it will set the pool size as the maximum capacity of the entire cl...
- 08:20 AM Bug #10178: mon rejects peer during election based on OSD_SET_ALLOC_HINT feature?
- The initial run for this report passed - http://pulpito.front.sepia.ceph.com/teuthology-2014-12-02_17:00:03-upgrade:f...
- 06:56 AM CephFS Fix #10135 (Resolved): OSDMonitor: allow adding cache pools to cephfs pools already in use
- 26e8cf174b8e76b4282ce9d9c1af6ff12f5565a9
- 05:25 AM Bug #10211: gf-complete exit(1) because of misaligned structure
- For the record, strace on the process shows:...
- 05:16 AM CephFS Bug #10164 (Fix Under Review): Dirfrag objects for deleted dir not purged until MDS restart
- https://github.com/ceph/ceph/pull/3071
- 02:33 AM Bug #10202 (Can't reproduce): ceph_objectstore_tool.py : OSD has the store locked
- Thanks for trying. Since ARMv8-based machines and a matching Ubuntu distribution are not out yet, let's close this.
12/02/2014
- 11:20 PM Bug #9978: keyvaluestore: void ECBackend::handle_sub_read
- I reported KV OSD init problem as #10225.
- 11:18 PM Bug #10225: keyvaluestore: OSDs do not start after few weeks of downtime (osd init failed / unabl...
- Oh yeah, another thing: some filestore-based OSDs crash at the end of the boot sequence so I didn't bother to start them ...
- 11:15 PM Bug #10225: keyvaluestore: OSDs do not start after few weeks of downtime (osd init failed / unabl...
- 11:15 PM Bug #10225: keyvaluestore: OSDs do not start after few weeks of downtime (osd init failed / unabl...
- I didn't do anything to Ceph because the cluster was down, so I didn't even start those OSDs. No upgrades to Ceph were depl...
- 11:01 PM Bug #10225: keyvaluestore: OSDs do not start after few weeks of downtime (osd init failed / unabl...
- 11:01 PM Bug #10225: keyvaluestore: OSDs do not start after few weeks of downtime (osd init failed / unabl...
- Hmm, I'm not sure why this happens. It seems keyvaluestore lost the "osd_superblock"?
Did you upgrade ceph?
- 10:45 PM Bug #10225 (Closed): keyvaluestore: OSDs do not start after few weeks of downtime (osd init faile...
- On "Giant" I've created seven KV OSDs (on 4 or 5 different hosts) before cluster went down due to cascade of OSD cras...
- 10:54 PM Bug #10202: ceph_objectstore_tool.py : OSD has the store locked
On a virtual machine.
$ cat /etc/issue
Ubuntu 14.04 LTS \n \l
$ uname -a
Linux ubuntu 3.13.0-24-generic #46-U...
- 03:09 PM Bug #10202: ceph_objectstore_tool.py : OSD has the store locked
- It is ubuntu 14.04 on ARMv8
- 12:57 PM Bug #10202: ceph_objectstore_tool.py : OSD has the store locked
Could this be a platform-specific bug using init-ceph to kill daemons? Do we know what platform they are running on?
- 09:32 PM Messengers Bug #10080: Pipe::connect() causes osd crash when osd reconnects to its peer
- https://github.com/ceph/ceph/pull/3070
- 03:41 PM Bug #10153: Rados.shutdown() dies with Illegal instruction (core dumped)
- This is not specific to rados.py, of course.
- 01:15 PM Bug #10153: Rados.shutdown() dies with Illegal instruction (core dumped)
- This was fixed by the application of commit:92615ea and commit:cf2104d in master. Please backport to firefly.
- 01:47 PM Bug #9487 (Resolved): dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim...
- 01:43 PM Bug #9891 (Rejected): "Assertion: os/DBObjectMap.cc: 1214: FAILED assert(0)" in upgrade:firefly-x...
- 01:40 PM Bug #10085 (Resolved): dirty exit ("Illegal instruction") on pthread_rwlock_unlock()
- 01:39 PM Bug #10157: PGLog::(read|write)_log don't write out rollback_info_trimmed_to
- 01:37 PM Bug #9459 (Can't reproduce): osd: blocked request
- 01:36 PM Bug #8595 (Resolved): osd: client op blocks until backfill starts (dumpling)
- 01:35 PM Bug #10126 (Rejected): "Exception: timed out waiting for admin_socket" in upgrade:giant-x-next-di...
The second time this was reproduced, a misc.log was available. The load average was 17 on the node with osd.13, which must n...
- 01:34 PM Bug #9806 (Pending Backport): Objecter: resend linger ops on split
- 01:34 PM Bug #9806 (Resolved): Objecter: resend linger ops on split
- 01:34 PM Bug #8885 (Can't reproduce): SIGABRT in TrackedOp::dump() via dump_ops_in_flight()
- 01:33 PM Bug #9731 (Can't reproduce): Ceph 0.80.6 OSD crashes
- 01:32 PM Messengers Bug #9898 (Pending Backport): osd: fast dispatch deadlock in mark_down (giant)
- 01:32 PM Messengers Bug #9898 (Resolved): osd: fast dispatch deadlock in mark_down (giant)
- 01:30 PM Bug #10058 (Can't reproduce): next stuck in recovery, no progress
- 01:28 PM Bug #10105 (Can't reproduce): crash in PG::peek_map_epoch on upgrade from 0.80.4 to 0.80.7
- 01:27 PM Bug #10138 (Need More Info): osd: crash in SnapSet::from_snap_set
- 01:19 PM Bug #8797: "ceph status" do not exit with python_2.7.8
- The SIGILL was cured in master with the application of 92615ea and cf2104d. I've tested backporting these to firefly ...
- 01:18 PM Bug #10209: osd/OSD.cc: 5410: FAILED assert(session)
- 01:18 PM Bug #10178 (Resolved): mon rejects peer during election based on OSD_SET_ALLOC_HINT feature?
- 01:17 PM Bug #9939 (Resolved): "giant" no longer log scrub errors
- commit:d392f44891a064e08f28244673c43a869e1f6014
- 01:14 PM Bug #10109 (Duplicate): "LibRadosTwoPoolsECPP.PromoteSnap" test failed in upgrade:dumpling-firefl...
- 01:13 PM Bug #10113 (Duplicate): --log-to-stderr with -f/-d sends a lot of things to logfile
- #9180
- 01:13 PM Bug #10124 (Rejected): monitor receives bus error signal
- leveldb bug
- 01:11 PM Bug #10146 (In Progress): ceph-disk: sometimes the journal symlink is not created
- Still open, needs tests.
- 01:10 PM Bug #10146 (Resolved): ceph-disk: sometimes the journal symlink is not created
- 01:10 PM Bug #10118 (Need More Info): messenger drops messages between osds
- 01:08 PM Bug #10117: OSD crashes if xattr "_" is absent for the file when doing backfill scanning (Replica...
- Right, deleting the object is the portion I was talking about. I think that's the right way.
- 01:07 PM Bug #10173 (Resolved): autogen.sh will fail if submodule URL changes
- 01:06 PM Bug #10175 (Resolved): deps.deb.txt is obsolete
- 01:06 PM Feature #10198: PG removal occupy the disk thread several hours
- Radosgw creates unbounded-size index objects; this will change eventually.
- 12:41 PM rgw Bug #10015 (Fix Under Review): rgw sync agent: 403 when syncing object that has tilde in its name
- PR opened https://github.com/ceph/radosgw-agent/pull/12
- 11:37 AM rbd Bug #10122: "LibRBD.TestClone" FAILED in upgrade:dumpling-x-firefly-distro-basic-vps run
- Josh, can you take a look? Not sure if the project has to change to rbd or not tho.
- 10:57 AM Bug #9788: "Assertion: common/HeartbeatMap.cc: 79" placeholder for "hit suicide timeout" issues
- One more in run http://pulpito.ceph.com/teuthology-2014-12-01_18:18:01-upgrade:firefly-x-giant-distro-basic-vps/
L...
- 10:56 AM rgw Bug #10066: rgw: failed md5sum on s3tests-test-readwrite
- test fails, and can be reproduced with the specific random seeds, with a specific object size (at the larger side of ...
- 10:53 AM rgw Bug #10221 (Resolved): Crash in "radosgw-admin" in upgrade:firefly:singleton-firefly-distro-basic...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-12-01_17:05:01-upgrade:firefly:singleton-firefly-dist...
- 10:27 AM rgw Bug #10062: s3-test failures using keystone authentication
- Hi Yehuda, Sage
the patch addressed only the first 5 or so failures as mentioned.
The post_object* tests were s...
- 09:38 AM rgw Bug #10062 (Resolved): s3-test failures using keystone authentication
- 09:38 AM rgw Bug #10062: s3-test failures using keystone authentication
- Fix merged into master.
- 10:12 AM rbd Bug #9513 (Resolved): rbd_cache=true default setting is degading librbd performance ~10X in Giant
- Reverted in master commit:b808cdfaa8823f0747f78938f3ed9a7a75e9bed1
Reverted in Giant commit:3b1eafcabb6139133b5ff0bd...
- 10:11 AM Linux kernel client Bug #9896: krbd: EPERM from map-snapshot-io.sh
- /a/teuthology-2014-10-24_23:06:01-krbd-giant-testing-basic-multi/570830...
- 10:09 AM rbd Bug #9936 (Resolved): Exporting images larger than 2GB fails
- 10:09 AM rbd Bug #9936: Exporting images larger than 2GB fails
- Master: commit:4b87a81c86db06f6fe2bee440c65fc05cd4c23ce
Giant: commit:65c565701eb6851f4ed4d2dbc1c7136dfaad6bcb
- 09:50 AM rgw Bug #10162 (Duplicate): s3tests-test-readwrite failure
- A duplicate of #10082
- 09:34 AM rgw Bug #10188: Can not create new rgw user when specifying an email already assigned to a user
- That's by design. Users cannot share the same email, as S3 permissions can be granted by email address, so email need...
- 09:32 AM rgw Bug #10195 (Need More Info): s3 java jdk conn.getobject(...) (get s3 object) method fails with la...
- Can you provide rgw log (debug rgw = 20), for such a failed request?
- 09:29 AM rgw Bug #10106: rgw acl response should start with <?xml version="1.0" ?>
- Note that Amazon's API definition does not specify this:
http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectGE...
- 09:28 AM rbd Bug #10116 (Need More Info): Ceph vm guest disk lockup when using fio
- Warren, can you provide stack backtraces from when you encountered the issue and a list of any OSD ops in flight?
- 09:24 AM rgw Bug #10108 (Duplicate): s3tests fail in upgrade:dumpling-firefly-x:parallel-next-distro-basic-mul...
- Duplicate of #10082
- 09:24 AM rbd Bug #9078 (Rejected): Removing an RBD is very slow whenever there is write's in other RBD which a...
- 09:23 AM rbd Bug #9742 (Resolved): `rbd map lun` fails with: (2) No such file or directory on kernel 3.14.14 w...
- 09:22 AM rgw Bug #10121 (Duplicate): "test.functional.tests.TestAccountUTF8" error in upgrade:dumpling-x-firef...
- Duplicate of #10082
- 09:20 AM rbd Bug #8329 (Won't Fix): qemu-img rpm provided breaks snapshotting functionality on centos
- 09:18 AM rgw Bug #9886 (Resolved): rgw: apache 2.4 does not send http status reason string
- 09:11 AM Bug #10125: radosgw is being started as root not apache with systemd
- https://github.com/ceph/ceph/pull/3059
- 09:11 AM rgw Bug #10145 (Can't reproduce): rgw swift functional test: testChunkedPut (test.functional.tests.Te...
- 09:11 AM rgw Bug #10144 (Can't reproduce): rgw swift functional test: testChunkedPut (test.functional.tests.Te...
- 09:08 AM rgw Bug #9899 (Resolved): Error "coverage ceph osd pool get '' pg_num" in upgrade:dumpling-dumpling-d...
- 09:05 AM Bug #10220 (Resolved): "mon/Paxos.cc: 1033: FAILED assert(mon->is_leader())" in upgrade:dumpling-...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-12-01_18:25:01-upgrade:dumpling-firefly-x:stress-spli...
- 09:04 AM rgw Bug #10219 (Resolved): s3-tests failing to clone
- http://pulpito.ceph.com/sage-2014-12-01_11:07:39-rgw-next-distro-basic-multi
- 08:01 AM devops Bug #10218 (Rejected): "Gem::DependencyError" error in upgrade:firefly-x-next-distro-basic-vps run
- In run http://pulpito.front.sepia.ceph.com/teuthology-2014-12-01_17:18:01-upgrade:firefly-x-next-distro-basic-vps/
...
- 06:57 AM CephFS Bug #9997 (Resolved): test_client_pin case is failing
- Merged to next (https://github.com/ceph/ceph/pull/3056)
- 06:53 AM CephFS Bug #10217 (Resolved): old fuse should warn on flock
- This works in master.
- 06:19 AM CephFS Bug #10217: old fuse should warn on flock
- Yes, we need a recent version of ceph-fuse and the MDS. Old versions do not support interrupting flock.
- 03:37 AM CephFS Bug #10217 (Resolved): old fuse should warn on flock
Test failure: test_filelock (tasks.mds_client_recovery.TestClientRecovery):
http://pulpito.front.sepia.ceph.com/sa...
- 05:40 AM devops Tasks #8366: Update ceph.com/docs to default to the latest major release (0.80)
- 03:46 AM CephFS Fix #10135 (Fix Under Review): OSDMonitor: allow adding cache pools to cephfs pools already in use
- giant backport PR: https://github.com/ceph/ceph/pull/3055
- 03:35 AM CephFS Bug #10151 (Resolved): mds client cache pressure health warning oscillates on/off
- The version on next has a pass on client-limits (the one that exercises health): http://pulpito.front.sepia.ceph.com/...
- 12:14 AM Bug #10017: OSD wrongly marks object as unfound if only the primary is corrupted for EC pool
- Fixed #10211 that showed up while experimenting
12/01/2014
- 11:27 PM Bug #10216 (Resolved): gf-complete and jerasure call exit(1)
- On error, gf-complete and jerasure call exit(1) instead of assert. This causes the OSD to disappear instead of display...
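The failure mode described in this entry can be illustrated outside of Ceph (plain Python, not gf-complete/jerasure code): a process that calls exit(1) dies without leaving anything for the supervisor to log, while a failed assertion at least prints a diagnosable trace.

```python
import subprocess
import sys

# A worker that bails out with exit(1) leaves nothing on stderr --
# the process simply "disappears" from the supervisor's point of view.
silent = subprocess.run([sys.executable, "-c", "import sys; sys.exit(1)"],
                        capture_output=True, text=True)

# A failed assertion also kills the worker, but it prints a traceback
# that pinpoints the failing condition.
noisy = subprocess.run([sys.executable, "-c", "assert False, 'misaligned'"],
                       capture_output=True, text=True)

print(silent.returncode, silent.stderr.strip() == "")      # 1 True
print(noisy.returncode, "AssertionError" in noisy.stderr)  # 1 True
```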
- 11:06 PM Bug #10211 (Fix Under Review): gf-complete exit(1) because of misaligned structure
- 04:26 PM Bug #10211 (In Progress): gf-complete exit(1) because of misaligned structure
- https://github.com/ceph/ceph/pull/3049
- 01:00 PM Bug #10211: gf-complete exit(1) because of misaligned structure
- https://github.com/ceph/gf-complete/pull/1
- 01:00 PM Bug #10211 (Resolved): gf-complete exit(1) because of misaligned structure
- Steps to reproduce (this is fragile because it depends on the version of the allocator):
* rm -fr dev out ; mkdir...
- 10:42 PM Bug #10215 (Fix Under Review): vstart_wrapper.sh kills daemons that do not belong to it
- https://github.com/ceph/ceph/pull/3054
- 10:28 PM Bug #10215 (Resolved): vstart_wrapper.sh kills daemons that do not belong to it
- When vstart_wrapper.sh "calls vstart.sh":https://github.com/ceph/ceph/blob/master/src/test/vstart_wrapper.sh#L32 it u...
- 06:05 PM CephFS Fix #10135 (Pending Backport): OSDMonitor: allow adding cache pools to cephfs pools already in use
- merged to next in commit:25fc21b837ba74bab2f6bc921c78fb3c43993cf5
This also should go into giant (I think Firefly ...
- 05:58 PM CephFS Bug #10011 (Resolved): Journaler: failed on shutdown or EBLACKLISTED
- giant commit:65f6814847fe8644f5d77a9021fbf13043b76dbe
- 06:37 AM CephFS Bug #10011 (Fix Under Review): Journaler: failed on shutdown or EBLACKLISTED
- Haven't seen any failures around this, let's backport to giant: https://github.com/ceph/ceph/pull/3047
- 05:47 PM Messengers Bug #10080: Pipe::connect() cause osd crash when osd reconnect to its peer
- I'll turn that fpaste into a real patch and get Sam or somebody to put it in some testing so we should at least see i...
- 05:37 PM Messengers Bug #10080: Pipe::connect() cause osd crash when osd reconnect to its peer
- If a connection gets marked down, we *cannot* reconnect to that endpoint again; it needs to recycle itself to a new e...
- 05:29 PM Bug #10213 (Resolved): Some inappropriate consts
- Thanks!
- 05:21 PM Bug #10213 (Fix Under Review): Some inappropriate consts
- My apologies for not being careful enough on this review. https://github.com/ceph/ceph/pull/3050
- 02:15 PM Bug #10213 (Resolved): Some inappropriate consts
- https://github.com/ceph/ceph/pull/3011/files
https://github.com/ceph/ceph/pull/3037/files
These added some consts...
- 05:16 PM Bug #10214 (Resolved): crush: straw buckets do not have expected/desired properties
- two issues:
- straw scaling factors calculated for straw buckets do not produce the correct distribution when the...
- 12:18 PM rgw Bug #10195: s3 java jdk conn.getobject(...) (get s3 object) method fails with latest version of a...
- typo... in bug title/description anywhere you see "jdk" I meant "sdk"... it was late before thanksgiving when I compo...
- 10:28 AM Bug #10178: mon rejects peer during election based on OSD_SET_ALLOC_HINT feature?
- Modified vps.yaml:...
- 06:35 AM Bug #10178: mon rejects peer during election based on OSD_SET_ALLOC_HINT feature?
- mon_lease_ack_timeout: 25...
- 09:30 AM Bug #10209 (Fix Under Review): osd/OSD.cc: 5410: FAILED assert(session)
- https://github.com/ceph/ceph/pull/3048
- 09:23 AM Bug #10209 (In Progress): osd/OSD.cc: 5410: FAILED assert(session)
- 08:50 AM Bug #10209 (Resolved): osd/OSD.cc: 5410: FAILED assert(session)
- This was with wip-sam-testing, but I don't think it's related to any of the patches.
ubuntu@teuthology:/a/samuelj-...
- 09:22 AM Bug #9921 (Pending Backport): msgr/osd/pg dead lock giant
- 09:04 AM Bug #9321 (Resolved): pgmap updates from OSDMap can be delayed indefinitely
- 08:53 AM Bug #10210 (Closed): "Caught signal" in upgrade:dumpling-x-firefly-distro-basic-vps run
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-29_19:13:03-upgrade:dumpling-x-firefly-distro-basi...
- 06:59 AM CephFS Bug #10164 (In Progress): Dirfrag objects for deleted dir not purged until MDS restart
- Zheng: assigning to you since you mentioned you were working on it
- 06:34 AM CephFS Bug #9997 (Fix Under Review): test_client_pin case is failing
- https://github.com/ceph/ceph/pull/3045
- 04:42 AM CephFS Bug #9994: ceph-qa-suite: nfs mount timeouts
- http://pulpito.ceph.com/teuthology-2014-11-23_23:10:01-knfs-next-testing-basic-multi/617093/
http://pulpito.ceph.com...
- 04:20 AM CephFS Feature #9881 (Resolved): mds: admin command to flush the mds journal
- Merged to master (forgot the Fixes:, doh)...
- 01:57 AM RADOS Bug #9523: Both op threads and dispatcher threads could be stuck at acquiring the budget of files...
- > I am wondering if it makes sense to add a new parameter named *should_take_filestore_budget* to dispatch_context an...
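The budget contention discussed in this ticket can be sketched with a toy model (plain Python threads and a semaphore, not Ceph's actual Throttle or dispatch_context code): once in-flight ops hold the whole filestore budget, a dispatcher drawing from the same budget can no longer make progress.

```python
import threading

budget = threading.Semaphore(2)          # toy "filestore budget" of two units
started = threading.Barrier(3)           # two op threads + the main thread

def op_thread(done):
    budget.acquire()                     # each in-flight op holds one unit
    started.wait()                       # report that the unit is held
    done.wait()                          # simulate a long-running filestore op
    budget.release()

finish = [threading.Event() for _ in range(2)]
workers = [threading.Thread(target=op_thread, args=(e,)) for e in finish]
for w in workers:
    w.start()
started.wait()                           # both budget units are now held

# A dispatcher drawing from the same budget would block here; poll instead
# of blocking so the sketch terminates.
blocked = not budget.acquire(blocking=False)
print("dispatcher would block:", blocked)    # True

for e in finish:
    e.set()
for w in workers:
    w.join()
```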
11/30/2014
- 11:35 PM RADOS Bug #9523: Both op threads and dispatcher threads could be stuck at acquiring the budget of files...
- There seem to be two problems here:
# Dispatcher thread hang due to filestore throttling
# Op thread hang due to filesto...
- 01:35 PM Linux kernel client Bug #10208: libceph: intermittent hangs under memory pressure
- The kern.log is attached, with the data captured shortly after running the following command:
time dd if=/dev/zero of=4G00...
- 11:15 AM devops Bug #9665 (Resolved): ceph-disk zap should call partprobe
- 05:00 AM Bug #9978: keyvaluestore: void ECBackend::handle_sub_read
- Haomai Wang wrote:
> A related bug is fixed. But I'm not fully sure whether fix this problem.
I'm not sure which...
11/29/2014
- 11:45 PM Linux kernel client Bug #10208 (Resolved): libceph: intermittent hangs under memory pressure
11/28/2014
- 01:01 PM Documentation #10207 (Resolved): documentation: auth service required needs clarification
- In http://ceph.com/docs/master/rados/configuration/auth-config-ref/#configuration-settings the distinction between *a...
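For context, the settings that page distinguishes are the three cephx requirement options; the fragment below is only an illustrative "enable cephx everywhere" example, not text from the ticket.

```ini
[global]
auth cluster required = cephx   # daemons must authenticate with each other
auth service required = cephx   # clients must authenticate to use daemon services
auth client required = cephx    # clients require daemons to authenticate back
```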
- 12:49 PM Documentation #10206 (Resolved): documentation: Network Configuration Reference duplicate string
- The string *You may configure this range at your discretion.* shows twice in http://ceph.com/docs/master/rados/config...
- 12:41 PM Documentation #10205 (Resolved): documentation: reference to ceph-deploy should be a link
- In http://ceph.com/docs/master/rados/configuration/ceph-conf/#running-multiple-clusters the phrase *See ceph-deploy n...
- 12:36 PM Documentation #10204 (Resolved): documentation: mon should be listed before osd
- When deploying a Ceph cluster, the mon must be run first. In the list shown at http://ceph.com/docs/master/rados/conf...
- 12:28 PM Documentation #10203 (Resolved): documentation: explain the term MON
- http://ceph.com/docs/master/rados/ should read *and a Ceph Monitor (MON) maintains* instead of *and a Ceph Monitor maintains*...
- 12:08 PM Feature #9815 (Resolved): run make check in parallel
- 12:07 PM Bug #10201 (Fix Under Review): tests must use ceph_objectstore_tool
- https://github.com/ceph/ceph/pull/3033
- 07:55 AM Bug #10201 (Resolved): tests must use ceph_objectstore_tool
- Tests such as "osd-scrub-repair":https://github.com/ceph/ceph/blob/giant/src/test/osd/osd-scrub-repair.sh#L64 must no...
- 12:00 PM Feature #9403: Make rados import/export fully functional and re-enable
Created wip-9403 to preserve a change needed to make this feature work. Also, the existing code sort of supports x...
- 10:20 AM Bug #10197: arch detection on armv8 must check asimd
- Janne, I would very much appreciate a run of src/unittest_arch on ARMv8 if you can spare the time ( the branch is htt...
- 10:17 AM Bug #10197 (Fix Under Review): arch detection on armv8 must check asimd
- https://github.com/ceph/ceph/pull/3035
- 09:44 AM Bug #10202 (Can't reproduce): ceph_objecstore_tool.py : OSD has the store locked
- Janne Grunau (jannau irc) can reproduce this reliably. I suspect it is because it does not wait long enough for the o...
- 09:25 AM Bug #10017: OSD wrongly marks object as unfound if only the primary is corrupted for EC pool
- https://github.com/ceph/ceph/pull/3034
- 06:12 AM Bug #10017: OSD wrongly marks object as unfound if only the primary is corrupted for EC pool
- The same problem shows up when two OSDs are missing (k=2, m=2).
- 08:59 AM Bug #10199 (Fix Under Review): ceph --format xml daemon {daemon}.{id} config get is not valid XML
- https://github.com/ceph/ceph/pull/3031
- 03:03 AM Bug #10199 (Resolved): ceph --format xml daemon {daemon}.{id} config get is not valid XML
- ...
- 05:11 AM devops Bug #10200 (Rejected): tgtd error: undefined symbol: rbd_discard
- Saw this error in syslog from an unrelated test. Presumably this is a bug somewhere?
/a/teuthology-2014-11-23_23:...
- 12:58 AM Feature #10198: PG removal occupy the disk thread several hours
- ...
- 12:56 AM Feature #10198 (Resolved): PG removal occupy the disk thread several hours
- We found an issue after we enable scrubbing/deep-scrubbing when doing recovering. The phenomenon is that all the rado...
- 12:57 AM Messengers Bug #10080: Pipe::connect() cause osd crash when osd reconnect to its peer
- Pull request - https://github.com/ceph/ceph/pull/3029
11/27/2014
- 06:27 PM Bug #10163: rados bench parameter -b producing wrong values when different blocksize used in writes
- From "rados --help":
-b op_size
set the size of write ops for put or benchmarking
We can see -b only for w...
- 05:12 PM Messengers Bug #10080: Pipe::connect() cause osd crash when osd reconnect to its peer
- Guang Yang wrote:
> Update more logs from the *crashed* OSD:
> [...]
>
> It seems that the peer OSD was marked d...
- 03:41 AM Messengers Bug #10080: Pipe::connect() cause osd crash when osd reconnect to its peer
- Update more logs from the *crashed* OSD:...
- 01:16 AM Messengers Bug #10080: Pipe::connect() cause osd crash when osd reconnect to its peer
- AFAIR, if the server rebinds and the client tries to connect, the client won't get the CEPH_MSGR_TAG_SEQ tag from the server because no repla...
- 12:02 AM Messengers Bug #10080: Pipe::connect() cause osd crash when osd reconnect to its peer
- > The next question is, why B's in_seq is a very large number even after rebinding?
After a deeper dive, I think th...
- 02:33 PM Bug #10197 (Resolved): arch detection on armv8 must check asimd
- instead of neon in https://github.com/ceph/ceph/blob/master/src/test/test_arch.cc...
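The distinction can be shown with a small illustrative parser (a hypothetical helper, not Ceph's actual detection code): ARMv7 kernels advertise neon in the /proc/cpuinfo Features line, while ARMv8/AArch64 kernels advertise asimd, so probing only for neon always fails on ARMv8.

```python
def has_simd(cpuinfo_text):
    """Return True if the cpuinfo Features line advertises NEON/ASIMD."""
    for line in cpuinfo_text.splitlines():
        if line.lower().startswith("features"):
            flags = line.split(":", 1)[1].split()
            # ARMv7 spells the capability "neon"; ARMv8 spells it "asimd".
            return "neon" in flags or "asimd" in flags
    return False

armv7 = "Features\t: swp half thumb fastmult vfp edsp neon vfpv3\n"
armv8 = "Features\t: fp asimd evtstrm aes pmull sha1 sha2 crc32\n"
print(has_simd(armv7), has_simd(armv8))  # True True
```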
- 09:17 AM Bug #10178: mon rejects peer during election based on OSD_SET_ALLOC_HINT feature?
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-26_09:31:02-upgrade:giant-x-next-distro-basic-vps/...
- 06:04 AM Bug #10196: [RHEL7] Modification of ~/.ssh/config in admin node restricts ssh to other nodes.
- You are welcome! :)
- 05:50 AM Bug #10196 (Rejected): [RHEL7] Modification of ~/.ssh/config in admin node restricts ssh to other...
- Thanks for the update :-)
- 03:15 AM Bug #10196: [RHEL7] Modification of ~/.ssh/config in admin node restricts ssh to other nodes.
- The output of ssh in verbose mode just showed the error "Bad owner or permissions in /home/ceph/.ssh/config". So, the...
- 02:06 AM Bug #10196: [RHEL7] Modification of ~/.ssh/config in admin node restricts ssh to other nodes.
- See also https://github.com/ceph/ceph/pull/3007
- 01:59 AM Bug #10196: [RHEL7] Modification of ~/.ssh/config in admin node restricts ssh to other nodes.
- Could you please attach the output of ssh in verbose mode when trying to connect to the remote host ? That will show ...
- 01:19 AM Bug #10196 (Rejected): [RHEL7] Modification of ~/.ssh/config in admin node restricts ssh to other...
- While setting up Ceph cluster using RHEL7 VMs, I found that modifying ~/.ssh/config file in admin node with details i...
11/26/2014
- 11:34 PM Messengers Bug #10080: Pipe::connect() cause osd crash when osd reconnect to its peer
- Adding some of the peer's logs to prove the two-way connect:...
- 11:14 PM Feature #9420 (Fix Under Review): erasure-code: tools and archive to check for non regression of ...
- the "non regression tests":https://github.com/ceph/ceph/blob/master/qa/workunits/erasure-code/encode-decode-non-regre...
- 10:26 PM CephFS Bug #9997: test_client_pin case is failing
- For 3.18+ kernels, I think we can iterate all the dir inodes and invalidate the dentries one by one.
- 12:19 AM CephFS Bug #9997: test_client_pin case is failing
- Yes, I think it is caused by the d_invalidate change. In the 3.18-rc kernel, d_invalidate() unhashes the dentry regardless of whether the...
- 07:49 PM Feature #9951: librados, osd: per-object scrub operation
- Hi sage:
I'm interested in this feature (from that I'll know the process of scrub). Is there somebody who already did...
- 06:16 PM Bug #9978: keyvaluestore: void ECBackend::handle_sub_read
- Hi Dmitry,
A related bug is fixed, but I'm not fully sure whether it fixes this problem. So could you give a crash keyv...
- 05:01 PM Bug #9978: keyvaluestore: void ECBackend::handle_sub_read
- It's been another week -- is there any chance to get this fixed please?
- 03:56 PM rgw Bug #10195 (Closed): s3 java jdk conn.getobject(...) (get s3 object) method fails with latest ver...
- For instance,
in the Java example ( http://docs.ceph.com/docs/master/radosgw/s3/java/ )
The example: ...
- 03:09 PM rgw Bug #10194 (Resolved): rgw: fcgi connections are not closed when using mod-proxy-fcgi
- 09:06 AM Feature #10192 (Fix Under Review): ceph_objectstore_tool object lookup
- https://github.com/ceph/ceph/pull/3020
- 05:04 AM Feature #10192 (Resolved): ceph_objectstore_tool object lookup
- It would be convenient for test purposes to have...
- 07:22 AM Feature #10193 (Resolved): Perf counter for WBThrottle
- Since the sync thread causes an unstable iops and latency performance curve, we may want WBThrottle to do more (or moderate...
- 04:40 AM Bug #10117: OSD crashes if xattr "_" is absent for the file when doing backfill scanning (Replica...
- Wenjun Huang wrote:
> Samuel Just wrote:
> > This should probably be a feature request for the backlog. We need a ...
- 02:18 AM devops Bug #9665: ceph-disk zap should call partprobe
- * firefly backport https://github.com/ceph/ceph/pull/3014
* dumpling backport https://github.com/ceph/ceph/pull/3015
11/25/2014
- 04:07 PM rgw Bug #8233: Installation & Documentation broken for Ubuntu Trusty 14.04 - rgw
- The 100-Continue stuff was all in fastcgi, not httpd. So you can use Ubuntu's httpd 2.4 if you want. Here's the patch ...
- 03:09 PM rgw Documentation #10142 (Resolved): Update S3 compatibility table to reflect bucket location support
- 03:06 PM rgw Feature #10191 (Resolved): rgw: object versioning multi-zone support
- 03:05 PM rgw Feature #8216 (Fix Under Review): rgw: object versioning objclass support
- 03:05 PM rgw Feature #8218 (Fix Under Review): rgw: object versioning manifest changes
- 03:05 PM rgw Feature #8217 (Fix Under Review): rgw: object versioning object overwrite / delete changes
- 03:03 PM Bug #10017: OSD wrongly marks object as unfound if only the primary is corrupted for EC pool
- I think https://github.com/ceph/ceph-qa-suite/pull/250 reproduces the problem reliably. ...
- 02:47 PM Bug #10017 (In Progress): OSD wrongly marks object as unfound if only the primary is corrupted fo...
- 03:03 PM Bug #10126 (New): "Exception: timed out waiting for admin_socket" in upgrade:giant-x-next-distro-...
- 03:02 PM Bug #10126: "Exception: timed out waiting for admin_socket" in upgrade:giant-x-next-distro-basic-...
- Still an issue, not sure what to make of it.
Run http://pulpito.ceph.com/teuthology-2014-11-24_17:05:01-upgrade:gian...
- 01:23 PM Bug #9788 (New): "Assertion: common/HeartbeatMap.cc: 79" placeholder for "hit suicide timeout" is...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-24_17:18:03-upgrade:firefly-x-next-distro-basic-vp...
- 12:13 PM Documentation #6465: admin/build-doc should have some kind of build check for broken links
- No one else agrees this is urgent, so, dropping pri
- 10:52 AM rgw Bug #10188 (Won't Fix): Can not create new rgw user when specifying an email already assigned to ...
- The error can be seen in the output of the commands below. In the first command we create a user "jj1" with an email ...
- 10:25 AM Bug #10018: OSD assertion failure if the hinfo_key xattr is not there (corrupted?) during scrubbing
- * firefly backport https://github.com/ceph/ceph/pull/3009
* giant backport https://github.com/ceph/ceph/pull/3010
- 10:03 AM Bug #10018 (Pending Backport): OSD assertion failure if the hinfo_key xattr is not there (corrupt...
- 09:27 AM CephFS Fix #10135 (Fix Under Review): OSDMonitor: allow adding cache pools to cephfs pools already in use
- https://github.com/ceph/ceph/pull/3008
- 08:44 AM Bug #10178: mon rejects peer during election based on OSD_SET_ALLOC_HINT feature?
- Committed vps.yaml on master, giant and next
Fixed the syntax from
@mon lease = 15@
to
@mon lease: 15@
- 06:55 AM Bug #9487 (Pending Backport): dumpling: snaptrimmer causes slow requests while backfilling. osd_s...
- oops, still need firefly
- 06:19 AM devops Bug #9665 (Fix Under Review): ceph-disk zap should call partprobe
- 06:19 AM devops Bug #9665: ceph-disk zap should call partprobe
- * giant backport https://github.com/ceph/ceph/pull/3005
- 06:17 AM Bug #10183: OSD dispatcher thread hangs several seconds due to contention for osd_lock
- https://github.com/ceph/ceph/pull/3004
- 05:51 AM Bug #9073 (Resolved): OSD with device/partition journals down after fresh deploy or upgrade to 0.83
- 05:46 AM Feature #9728: erasure-code: jerasure support for NEON
- 05:13 AM Bug #10185 (Resolved): neon runtime detection is always false
- 02:01 AM Bug #10185 (Fix Under Review): neon runtime detection is always false
- https://github.com/ceph/ceph/pull/3003
- 04:42 AM CephFS Bug #9997: test_client_pin case is failing
- After much head scratching and log examination, this appears to be a kernel regression (assuming our behaviour was va...
- 04:00 AM Bug #10117: OSD crashes if xattr "_" is absent for the file when doing backfill scanning (Replica...
- Samuel Just wrote:
> This should probably be a feature request for the backlog. We need a test reproducing it and s...
11/24/2014
- 11:31 PM Bug #10185 (Resolved): neon runtime detection is always false
- The neon CPU feature detection function should test if the number of elements returned is 1 "instead of the size of t...
- 10:21 PM Bug #10166 (Fix Under Review): fiemap or FileStore::do_sparse_copy_range bug: osd/ReplicatedPG.cc...
- https://github.com/ceph/ceph/pull/3000
I'm not fully sure that this issue is caused by the inconsistent size. Maybe...
- 12:14 PM Bug #10166: fiemap or FileStore::do_sparse_copy_range bug: osd/ReplicatedPG.cc: 8706: FAILED asse...
- Hmm, this might be as simple as truncating out to the full copy_range size.
- 12:11 PM Bug #10166: fiemap or FileStore::do_sparse_copy_range bug: osd/ReplicatedPG.cc: 8706: FAILED asse...
- Going back to the full log, it appears to be related to _do_sparse_copy_range and therefore fiemap:
2014-11-23 19:...
- 11:38 AM Bug #10166: fiemap or FileStore::do_sparse_copy_range bug: osd/ReplicatedPG.cc: 8706: FAILED asse...
- /a/teuthology-2014-11-23_18:13:01-upgrade:firefly-x-giant-distro-basic-multi/615700
2014-11-23 19:18:14.224698 7f4...
- 07:56 PM rgw Documentation #10184 (Resolved): rgw: document swift temp url functionality
- Swift-temp-url functionality seems to have been merged from v0.78 or so. (Feature #3454) This needs to be documented....
- 06:41 PM Bug #10183 (Resolved): OSD dispatcher thread hangs several seconds due to contention for osd_lock
- Recently, when investigating long tail latency during backfilling/recovering, I found that on some OSDs the dispatcher...
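The stall being described can be modeled with a toy example (plain Python threads, not OSD code): while one thread holds a coarse global lock for a long critical section, the dispatcher cannot enter at all until it is released.

```python
import threading
import time

osd_lock = threading.Lock()              # stand-in for the coarse osd_lock
order = []
holder_in = threading.Event()

def long_holder():
    with osd_lock:
        order.append("holder in")
        holder_in.set()
        time.sleep(0.2)                  # long critical section, e.g. map processing
        order.append("holder out")

def dispatcher():
    holder_in.wait()                     # ensure the holder grabbed the lock first
    with osd_lock:                       # stalls until the holder releases
        order.append("dispatch")

t1 = threading.Thread(target=long_holder)
t2 = threading.Thread(target=dispatcher)
t1.start(); t2.start()
t1.join(); t2.join()
print(order)  # ['holder in', 'holder out', 'dispatch']
```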
- 06:21 PM Bug #10178: mon rejects peer during election based on OSD_SET_ALLOC_HINT feature?
- Sage, OK
I changed vps.yaml on teuthology to:...
- Yuri, new plan: let's just add 'mon lease = 15' to vps.yaml and see if this comes up again.
- 12:04 PM Bug #10178: mon rejects peer during election based on OSD_SET_ALLOC_HINT feature?
- on giant fix - https://github.com/ceph/ceph-qa-suite/pull/252
- 11:24 AM Bug #10178: mon rejects peer during election based on OSD_SET_ALLOC_HINT feature?
- RE: https://github.com/ceph/ceph-qa-suite/pull/251...
- 11:21 AM Bug #10178: mon rejects peer during election based on OSD_SET_ALLOC_HINT feature?
- ok, new plan. instead of changing mon behavior, make the tests more resilient.
if we upgrade all mons, then resta...
- 11:02 AM Bug #10178: mon rejects peer during election based on OSD_SET_ALLOC_HINT feature?
- https://github.com/ceph/ceph/pull/2999
- 11:00 AM Bug #10178 (Fix Under Review): mon rejects peer during election based on OSD_SET_ALLOC_HINT feature?
- 10:46 AM Bug #10178: mon rejects peer during election based on OSD_SET_ALLOC_HINT feature?
- ...
- 10:34 AM Bug #10178: mon rejects peer during election based on OSD_SET_ALLOC_HINT feature?
- ...
- 10:29 AM Bug #10178 (Resolved): mon rejects peer during election based on OSD_SET_ALLOC_HINT feature?
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-22_17:00:03-upgrade:firefly:newer-firefly-distro-b...
- 05:19 PM CephFS Bug #10151 (Pending Backport): mds client cache pressure health warning oscillates on/off
- Merged to master as of commit:aa4d1478647ce416e9cf4e8fcd32411230639f40. I like to let things go through testing befor...
- 09:20 AM CephFS Bug #10151: mds client cache pressure health warning oscillates on/off
- Opened PR against master instead of next by mistake. Next PR is https://github.com/ceph/ceph/pull/2996
- 03:16 AM CephFS Bug #10151 (Fix Under Review): mds client cache pressure health warning oscillates on/off
- master: https://github.com/ceph/ceph/pull/2989
giant: https://github.com/ceph/ceph/pull/2990
- 12:11 PM Bug #10165 (Duplicate): ceph_test_rados got short read
- 12:10 PM Bug #10165: ceph_test_rados got short read
- I think this also is due to enabling fiemap in the nightlies, teuthology commit:
0f97481ce44e0487ac6cffa051a05590f...
- 10:34 AM Bug #10165: ceph_test_rados got short read
- Repeated issue in run http://pulpito.ceph.com/teuthology-2014-11-22_17:05:01-upgrade:giant-x-next-distro-basic-multi/...
- 08:36 AM Bug #10165: ceph_test_rados got short read
- Same problem in run http://pulpito.front.sepia.ceph.com/teuthology-2014-11-23_09:54:59-powercycle-giant-distro-basic-...
- 11:28 AM rbd Bug #10180 (Resolved): qemu tests crash host kernel
- ...
- 11:16 AM Bug #10176: Segmentation fault in upgrade:firefly:singleton-firefly-distro-basic-vps run
- I think we should remove import_export.sh from these tag-based upgrades, where we're hitting issues that are fixed la...
- 09:12 AM Bug #10176 (Resolved): Segmentation fault in upgrade:firefly:singleton-firefly-distro-basic-vps run
- Log are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-22_17:05:01-upgrade:firefly:singleton-firefly-distr...
- 11:04 AM rbd Bug #10123 (Pending Backport): "Segmentation fault" in upgrade:dumpling-x-firefly-distro-basic-vp...
- 10:49 AM Bug #8204: "timed out waiting for admin_socket to appear after osd.5 restart" in upgrade:dumpling...
- Same issue in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-21_17:18:01-upgrade:firefly-x-next-distro-basic-...
- 10:18 AM Bug #9487 (Resolved): dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim...
- 10:17 AM Bug #9113 (Resolved): osd: snap trimming eats memory, linearly
- 09:47 AM Bug #10097: failed: mon_thrash
- Same issue in run http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-22_17:15:01-upgrade:giant-giant-distro-basic...
- 09:41 AM rgw Bug #10177 (Can't reproduce): test_multipart_upload failed in upgrade:dumpling-firefly-x:parallel...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-22_18:15:02-upgrade:dumpling-firefly-x:parallel-gi...
- 09:25 AM Bug #9920: admin socket check hang, osd appears fine
- Same issue in run http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-22_17:18:02-upgrade:firefly-x-next-distro-ba...
- 09:19 AM Bug #10018: OSD assertion failure if the hinfo_key xattr is not there (corrupted?) during scrubbing
- Same issue on next in run http://pulpito.ceph.com/teuthology-2014-11-22_17:18:02-upgrade:firefly-x-next-distro-basic-...
- 09:07 AM CephFS Bug #9997 (In Progress): test_client_pin case is failing
- 07:40 AM devops Feature #10046: run make check on every pull request
- http://tracker.ceph.com/issues/10175 will make it possible to rely on the content of deps.deb.txt to install the requ...
- 07:33 AM Feature #9817 (Resolved): display X.XX deep-scrub starts
- 06:51 AM Bug #10175 (Fix Under Review): deps.deb.txt is obsolete
- https://github.com/ceph/ceph/pull/2994
- 05:18 AM Bug #10175 (Resolved): deps.deb.txt is obsolete
- It is not consistently maintained because it is not tested
- 04:30 AM Feature #9728 (In Progress): erasure-code: jerasure support for NEON
- 04:29 AM Bug #10017: OSD wrongly marks object as unfound if only the primary is corrupted for EC pool
- 03:41 AM Bug #10173 (Fix Under Review): autogen.sh will fail if submodule URL changes
- https://github.com/ceph/ceph/pull/2992
- 03:30 AM Bug #10173 (Resolved): autogen.sh will fail if submodule URL changes
- After an initial "git submodule update":https://github.com/ceph/ceph/blob/master/autogen.sh#L32, if the URL of a subm...
- 02:29 AM Linux kernel client Bug #10141: rbd_img_obj_request_fill+0x81/0x200
- For the record, since the stack trace doesn't explain anything, this was the following BUG_ON in osd_req_op_extent_in...
- 02:21 AM Linux kernel client Bug #10141 (Resolved): rbd_img_obj_request_fill+0x81/0x200
- Fixed with "rbd: don't treat CEPH_OSD_OP_DELETE as extent op". I rebased it into testing before "libceph: add CREATE...
- 12:59 AM Messengers Bug #10080: Pipe::connect() cause osd crash when osd reconnect to its peer
- I am wondering if the following race occurred:
Let us assume A and B are two OSDs having the connection (pipe) bet...
11/23/2014
- 11:32 PM Bug #10017 (In Progress): OSD wrongly marks object as unfound if only the primary is corrupted fo...
- 11:29 PM Bug #10018 (Fix Under Review): OSD assertion failure if the hinfo_key xattr is not there (corrupt...
- 11:08 PM Feature #10172 (Resolved): AsyncMessenger: Bind async thread to special cpu core
- Now, 1-2 async op threads can fully meet an OSD's network demand with an SSD backend. So maybe we can bind limited thread ...
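The binding itself is straightforward on Linux. A minimal sketch of the idea (illustrative names only, not AsyncMessenger code), using the stdlib affinity call:

```python
# Hedged sketch: pin a worker thread to a fixed CPU core.
# os.sched_setaffinity is Linux-only; passing pid 0 applies the
# affinity to the calling thread.
import os
import threading

def worker(core_id, out):
    os.sched_setaffinity(0, {core_id})        # pin this thread to core_id
    out["affinity"] = os.sched_getaffinity(0) # read back the effective set

result = {}
t = threading.Thread(target=worker, args=(0, result))
t.start()
t.join()
print(result["affinity"])
```

A real implementation would presumably pick cores from a configurable list rather than hard-coding core 0.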
- 10:56 PM rgw Bug #10145: rgw swift functional test: testChunkedPut (test.functional.tests.TestFileUTF8)
- I am using Ubuntu 14.04
And these are my apache and fastcgi versions:...
- 08:31 PM Bug #10166: fiemap or FileStore::do_sparse_copy_range bug: osd/ReplicatedPG.cc: 8706: FAILED asse...
- ubuntu@teuthology:/a/teuthology-2014-11-23_18:13:01-upgrade:firefly-x-giant-distro-basic-multi/615700
- 08:30 PM CephFS Bug #9997: test_client_pin case is failing
- http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-16_23:04:01-fs-next-testing-basic-multi/603971/
- 06:53 PM Bug #9998 (Fix Under Review): Replaced OSD weight below 0
- https://github.com/ceph/ceph/pull/2986
11/22/2014
- 09:55 PM Bug #10171 (Resolved): DBObjectMap: ghobject_t header key excludes hash for EC pools
- ...
- 09:16 PM rbd Bug #9771 (Won't Fix): Segmentation fault after upgrade v0.80.5 -> v0.80.6
- 09:12 PM rbd Feature #6228 (Resolved): image name metavariable
- 11:55 AM Bug #10085 (In Progress): dirty exit ("Illegal instruction") on pthread_rwlock_unlock()
- It looks to me like this is a result of our naughty rwlock handling: https://github.com/ceph/ceph/pull/2937
There'...
- 10:13 AM Bug #9998: Replaced OSD weight below 0
- I've just reproduced this on my test cluster. I'm using Ceph v0.87 (@c51c8f9d80fa4e0168aa52685b8de40e42758578@) with ...
- 09:21 AM Bug #9998: Replaced OSD weight below 0
- I'm still having trouble reproducing this. :( Maybe you can attach a copy of an osdmap just prior to adding the osd?...
- 04:04 AM Bug #9998: Replaced OSD weight below 0
- Dan van der Ster wrote:
> In our case we sometimes get -3.052e-05 as the first weight of a new osd that has been add...
- 03:58 AM Bug #9998: Replaced OSD weight below 0
- In our case we sometimes get -3.052e-05 as the first weight of a new osd that has been added to the crush map by the ...
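One observation about the number itself (an inference from the value, not something stated in the ticket): -3.052e-05 is exactly -2/65536, the granularity of CRUSH's 16.16 fixed-point weight encoding, which would be consistent with a tiny negative raw value sneaking into the map.

```python
# CRUSH stores weights in 16.16 fixed point: 1.0 encodes as 0x10000.
# A raw value of -2 decodes to exactly the weight seen in this report.
FIXED_ONE = 0x10000  # 65536

def to_fixed(weight):
    return int(round(weight * FIXED_ONE))

def from_fixed(raw):
    return raw / FIXED_ONE

print(from_fixed(-2))   # -3.0517578125e-05, i.e. -3.052e-05 rounded
print(to_fixed(1.0))    # 65536
```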
- 09:17 AM rgw Bug #10103 (Pending Backport): swift tests failing
11/21/2014
- 11:53 PM Bug #9998: Replaced OSD weight below 0
- I'm changing weights manually by editing CRUSH map.
- 05:48 PM Bug #9998: Replaced OSD weight below 0
- wip-9998 has 2 fixes, but I'm convinced they are the same bug...
- 05:39 PM Bug #9998 (Need More Info): Replaced OSD weight below 0
- Can you clarify how you are doing this?
"change host weight while not changing OSD weights (i.e. sum(weight(osd)...
- 09:22 PM Bug #10052 (Pending Backport): LibRadosTwoPools[EC]PP.PromoteSnap failure
- 07:52 PM Messengers Bug #10022 (Resolved): AsyncMessenger: Wrong newly_acked_seq when replacing existing connection
- 05:31 PM Bug #10004: ceph osd find does not correctly report crush locations
- What version is this? I can't reproduce it on giant.
- 05:28 PM Bug #10165: ceph_test_rados got short read
- 2014-11-20T21:38:21.790 INFO:tasks.rados.rados.0.vpm200.stdout:only read 3829760 out of size 3832037
- 07:51 AM Bug #10165 (Duplicate): ceph_test_rados got short read
- Runs:
http://pulpito.front.sepia.ceph.com/teuthology-2014-11-20_17:05:01-upgrade:giant-x-next-distro-basic-vps/
Job...
- 04:23 PM Bug #10168: dumpling: Assertion: osd/ReplicatedPG.cc: 7573: FAILED assert(!pg_log.get_log().empty())
- that was in suite:upgrade:dumpling-x
- 04:18 PM Bug #10168 (Resolved): dumpling: Assertion: osd/ReplicatedPG.cc: 7573: FAILED assert(!pg_log.get_...
- 03:00 PM Bug #10168: dumpling: Assertion: osd/ReplicatedPG.cc: 7573: FAILED assert(!pg_log.get_log().empty())
- 03:00 PM Bug #10168: dumpling: Assertion: osd/ReplicatedPG.cc: 7573: FAILED assert(!pg_log.get_log().empty())
- have branch, testing build on wip-sam-dumpling-testing. Caused by the backport, 03c5344f74991ec351cdc8a55f6495d49647...
- 02:50 PM Bug #10168 (Resolved): dumpling: Assertion: osd/ReplicatedPG.cc: 7573: FAILED assert(!pg_log.get_...
- Assertion: osd/ReplicatedPG.cc: 7573: FAILED assert(!pg_log.get_log().empty())
ceph version 0.67.11-42-g103c6a0 (103...
- 03:01 PM Bug #10167 (Duplicate): osd/ReplicatedPG.cc: 7573: FAILED assert(!pg_log.get_log().empty()) in up...
- 01:42 PM Bug #10167 (Duplicate): osd/ReplicatedPG.cc: 7573: FAILED assert(!pg_log.get_log().empty()) in up...
- ubuntu@teuthology:/a/teuthology-2014-11-20_19:13:02-upgrade:dumpling-x-firefly-distro-basic-vps/611779...
- 02:30 PM rgw Bug #9206: rgw: cross rgw message headers filtered by apache 2.4
- Because this came up on ceph-users recently: this is fixed in master with this commit:...
- 12:34 PM CephFS Bug #9674 (Resolved): nightly failed multiple_rsync.sh
- I haven't seen this fail since then, hurray.
- 11:30 AM Bug #10163: rados bench parameter -b producing wrong values when different blocksize used in writes
- We should at least fix up the output to warn about this and not lie, even if we don't respect the requested block siz...
- 01:13 AM Bug #10163 (Resolved): rados bench parameter -b producing wrong values when different blocksize u...
- The -b (blocksize) parameter used in rados bench produces wrong measurements if a preceding rados bench write w...
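The skew can be illustrated with a back-of-the-envelope sketch (hypothetical numbers and a deliberately simplified model of the reporting, not rados bench's actual code): if bandwidth is derived from the requested block size while the objects were actually written earlier with a different size, the figure is off by the ratio of the two sizes.

```python
# Simplified model of the -b reporting skew: bandwidth computed from
# the requested block size vs. the size the objects were actually
# written with in an earlier run. All numbers are illustrative.

def reported_bandwidth(ops_per_sec, requested_block):
    # What a naive report would print: ops * requested size.
    return ops_per_sec * requested_block

def actual_bandwidth(ops_per_sec, actual_block):
    # What really moved: ops * the size used on disk.
    return ops_per_sec * actual_block

ops = 100
requested = 4 * 1024 * 1024   # second run asks for 4M blocks
actual = 64 * 1024            # objects were written earlier with 64K

print(reported_bandwidth(ops, requested))  # 419430400 bytes/s claimed
print(actual_bandwidth(ops, actual))       # 6553600 bytes/s in reality
```

This is exactly why warning in the output (as suggested above) would help: the two figures differ by 64x here.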
- 10:19 AM rgw Bug #10162: s3tests-test-readwrite failure
- This appears to be happening on the overnight tests: See #10108
- 10:00 AM Bug #6003: journal Unable to read past sequence 406 ...
- ubuntu@teuthology:/a/sage-2014-11-20_17:03:30-rados:thrash-wip-watch-notify-distro-basic-multi/611427
- 09:10 AM rbd Bug #10122 (New): "LibRBD.TestClone" FAILED in upgrade:dumpling-x-firefly-distro-basic-vps run
- 06:06 AM rbd Bug #10122: "LibRBD.TestClone" FAILED in upgrade:dumpling-x-firefly-distro-basic-vps run
- The explanation is that there was a race condition between deleting a pool and unprotecting the snapshot. When unpro...
- 05:06 AM rbd Bug #10122 (In Progress): "LibRBD.TestClone" FAILED in upgrade:dumpling-x-firefly-distro-basic-vp...
- 08:54 AM Linux kernel client Bug #10141 (In Progress): rbd_img_obj_request_fill+0x81/0x200
- 08:23 AM devops Fix #5900: Create a Python package for ceph Python bindings
- Discussed in IRC today with Alfredo and others: We're going to keep the pyceph modules as individual Python packages ...
- 08:03 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- Today I restarted every mon and osd on the test cluster (again) and confirmed it is all running 0.67.11-4-g496e561. N...
- 07:55 AM Bug #10166 (Resolved): fiemap or FileStore::do_sparse_copy_range bug: osd/ReplicatedPG.cc: 8706: ...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-20_17:05:01-upgrade:giant-x-next-distro-basic-vps/...
- 07:10 AM Documentation #9867: PGs per OSD documentation needs clarification
- It should also be noted that the PG per Pool distribution should be directly proportional to the overall distribution...
- 06:46 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- I've been seeing similar issues using Ubuntu 14.04 as a guest VM. RT throttling occurs, so I tried disabling all the ...
- 06:39 AM CephFS Bug #10151 (In Progress): mds client cache pressure health warning oscillates on/off
- Reproduced this locally by just allowing 3 mons in a vstart cluster and following the procedure from the mds_client_l...
- 12:50 AM CephFS Bug #10151: mds client cache pressure health warning oscillates on/off
- Yes -- the leader is reporting the health warning but the peons are not.
The warning is "Client 2922132 failing to...
- 06:34 AM CephFS Fix #10135: OSDMonitor: allow adding cache pools to cephfs pools already in use
- Yeah, we didn't think about this first time around because the focus was on cache tiers to EC pools, but it would mak...
- 03:58 AM CephFS Bug #10164: Dirfrag objects for deleted dir not purged until MDS restart
- Alternatively less contrived way to see the issue: just do a loop of "cp -r /etc . ; rm -rf ./etc" in a filesystem mo...
- 03:14 AM CephFS Bug #10164 (Resolved): Dirfrag objects for deleted dir not purged until MDS restart
Seen while playing with the #9881 flush functionality: the dirfrag objects for deleted directories are never cleane...
- 01:24 AM Bug #10085: dirty exit ("Illegal instruction") on pthread_rwlock_unlock()
- PPS Bug info: https://sourceware.org/bugzilla/show_bug.cgi?id=17561
My patch posted here https://bugs.gentoo.org/sho...
11/20/2014
- 11:20 PM Bug #10119 (Resolved): 0.88 EC+ KV OSDs crashing
- 09:50 PM Bug #10085: dirty exit ("Illegal instruction") on pthread_rwlock_unlock()
- PS: The bug is filed at https://bugs.gentoo.org/show_bug.cgi?id=529076, but I think it was a near-"vanilla" case.
- 09:41 PM Bug #10085: dirty exit ("Illegal instruction") on pthread_rwlock_unlock()
- "Own packages"? It is Gentoo.
- 06:35 PM rbd Bug #10123 (Fix Under Review): "Segmentation fault" in upgrade:dumpling-x-firefly-distro-basic-vp...
- 05:46 PM rgw Bug #10162 (Duplicate): s3tests-test-readwrite failure
- Running teuthology using the following yaml file failed:...
- 04:42 PM Bug #10018: OSD assertion failure if the hinfo_key xattr is not there (corrupted?) during scrubbing
- https://github.com/ceph/ceph-qa-suite/pull/250 teuthology tests
- 02:44 PM Bug #10018 (In Progress): OSD assertion failure if the hinfo_key xattr is not there (corrupted?) ...
- 03:47 PM Bug #10028 (Duplicate): ec_lost_unfound failing on giant
- #10065
- 02:42 PM Bug #10028: ec_lost_unfound failing on giant
- 03:35 PM rgw Feature #10159 (New): rgw: sync agent support for object versioning
- 03:35 PM rgw Feature #10158 (Closed): rgw: sync agent support for bucket sharding
- 03:17 PM Bug #10157 (Resolved): PGLog::(read|write)_log don't write out rollback_info_trimmed_to
- In practice, this means that replicated pgs will scan their log on the first operations after boot needlessly. EC pg...
- 02:51 PM Bug #10150: osd/ReplicatedPG.cc: 10853: FAILED assert(r >= 0) (in _scan_range)
- 02:46 PM Bug #10150: osd/ReplicatedPG.cc: 10853: FAILED assert(r >= 0) (in _scan_range)
- Hah, that assert isn't valid. The object in question might be in the process of being removed *if* it is at the star...
- 08:34 AM Bug #10150 (Resolved): osd/ReplicatedPG.cc: 10853: FAILED assert(r >= 0) (in _scan_range)
- ...
- 02:43 PM Bug #9810: dout_emergency is silenced in ceph-osd
- 02:42 PM Bug #10017: OSD wrongly marks object as unfound if only the primary is corrupted for EC pool
- 02:42 PM Feature #9728: erasure-code: jerasure support for NEON
- 02:42 PM Bug #9485: Monitor crash due to wrong crush rule set
- 02:42 PM Bug #8741: osd: ec plugin leak
- 02:41 PM Bug #10065: hung ec-lost-unfound.yaml, failed of osd.{0,2,3}
- 08:43 AM Bug #10065: hung ec-lost-unfound.yaml, failed of osd.{0,2,3}
- ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-11-17_02:32:01-rados-giant-distro-basic-multi/604495
...
- 02:41 PM Bug #9785: /etc/ceph/dmcrypt-keys and key contents are created world-readable
- 02:34 PM Bug #8011: osd/ReplicatedPG.cc: 5244: FAILED assert(soid < scrubber.start || soid >= scrubber.end)
- 02:20 PM Bug #8011: osd/ReplicatedPG.cc: 5244: FAILED assert(soid < scrubber.start || soid >= scrubber.end)
- Urgh, non-blocking flushes do not cause scrub to pause. I think the simplest solution is to fail a non-blocking scru...
- 08:38 AM Bug #8011: osd/ReplicatedPG.cc: 5244: FAILED assert(soid < scrubber.start || soid >= scrubber.end)
- this popped up again: ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-11-17_02:32:01-rados-giant-distr...
- 02:29 PM Feature #10156 (Rejected): Backport cluster_fingerprint feature to Dumpling
- Sometimes the subject says it all.
- 01:18 PM rbd Feature #10154 (Resolved): librbd: use early snapshot context for copyup operations so snapshots ...
- If we send the copyup operation with a snapshot context with an empty list of snap ids and a snap seq before the earl...
- 12:09 PM rgw Bug #10015 (In Progress): rgw sync agent: 403 when syncing object that has tilde in its name
- Confirmed that requests is doing the unquoting of a quoted url with the tilde char:...
- 12:09 PM Bug #10153 (Resolved): Rados.shutdown() dies with Illegal instruction (core dumped)
- In rados.py, Rados.shutdown() produces "Illegal instruction (core dumped)" when called.
To test, try applying the ... - 11:52 AM devops Bug #10152: drop tiobench references
- This involves deleting the associated Jenkins task as well: http://jenkins.ceph.com/job/tiobench
- 11:50 AM devops Bug #10152 (Rejected): drop tiobench references
- Both Fedora and Debian have dropped their tiobench packages from their distros because tiobench failed to build from ...
- 11:46 AM rbd Bug #10149 (Duplicate): Giant: data corruption with console rbd export
- This is a duplicate of #9936 and will be backported to Giant soon. In the meantime, as a workaround you can export t...
- 07:42 AM rbd Bug #10149 (Duplicate): Giant: data corruption with console rbd export
- - borrow large non-zeroed image, ten gig in my case
- upload it via cli, format 2 is set
- try to download the imag...
- 11:46 AM devops Bug #9793: Fedora 20 ceph-extras Repo missing
- Both Fedora and Debian have dropped their tiobench packages from their distros because tiobench failed to build from ...
- 11:41 AM Bug #8797: "ceph status" do not exit with python_2.7.8
- In order to get the exit code, I tried this:...
- 10:19 AM Bug #9921: msgr/osd/pg dead lock giant
- passed a smoke test, ready for rados run
- 10:08 AM Bug #8978: ceph ping not working as expected
- Can you give me an example of how you are running the ceph ping command and the output you are seeing? I just tested ...
- 09:59 AM rgw Bug #10145: rgw swift functional test: testChunkedPut (test.functional.tests.TestFileUTF8)
- Chunked put can fail if running the wrong fastcgi module.
- 09:59 AM CephFS Bug #10151 (Resolved): mds client cache pressure health warning oscillates on/off
- seeing this on lab cluster. not sure if it is a problem in the mds health reporting or the mon, but it goes on and o...
- 07:30 AM devops Bug #10148 (Rejected): Giant/Wheezy SysV: /etc/init.d/ceph -a start shifts crushmap to executing ...
- Got:...
- 07:17 AM Messengers Feature #10147 (Resolved): Add unittest for Messenger
- 06:50 AM Bug #10146: ceph-disk: sometimes the journal symlink is not created
- I've pushed the alternative fix in the same pull req.
- 02:15 AM Bug #10146 (In Progress): ceph-disk: sometimes the journal symlink is not created
- I like the idea of not changing the uuid
- 01:35 AM Bug #10146 (Resolved): ceph-disk: sometimes the journal symlink is not created
- Hi,
We observed in practise that sometimes the journal symlink is not created during a ceph-disk prepare run.
En...
11/19/2014
- 10:29 PM Linux kernel client Feature #9906: Inline data support
- 09:43 PM rgw Bug #10145 (Can't reproduce): rgw swift functional test: testChunkedPut (test.functional.tests.Te...
- ...
- 09:36 PM rgw Bug #10144 (Can't reproduce): rgw swift functional test: testChunkedPut (test.functional.tests.Te...
- ...
- 06:20 PM Bug #8797: "ceph status" do not exit with python_2.7.8
- This works around the problem, while also destroying the exit code from the ceph program, so if you rely on that, thi...
- 05:32 PM CephFS Bug #10131 (Resolved): kclient: dentry still in use on umount
- 02:20 PM rgw Bug #10099 (Duplicate): radosgw-agent - error geting op state: list index out of range
- 12:37 PM rgw Bug #10102 (Fix Under Review): sync agent: does not handle gracefully transient errors
- PR opened: https://github.com/ceph/radosgw-agent/pull/11
- 12:21 PM Bug #10138: osd: crash in SnapSet::from_snap_set
- Whoops! I misread the version. Do you have a core file?
- 12:09 PM Bug #10138: osd: crash in SnapSet::from_snap_set
- Sage Weil wrote:
> Please try using the latest firefly release. 0.87 is comparatively old.. many bugs were fixed in... - 10:45 AM Bug #10138 (Rejected): osd: crash in SnapSet::from_snap_set
- Please try using the latest firefly release. 0.87 is comparatively old.. many bugs were fixed in 0.80.5 and again in...
- 04:48 AM Bug #10138: osd: crash in SnapSet::from_snap_set
- Sorry for the formatting fail. The crash log was supposed to look like this:...
- 04:46 AM Bug #10138 (Can't reproduce): osd: crash in SnapSet::from_snap_set
- We are running ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578) in a 3-server setup with 18 OSDs (4 HDDs ...
- 12:07 PM Bug #10128 (Resolved): ceph_objectstore_tool --op export to stdout broken
- Backport to giant as 6cb9a2499cac2645e2cc6903ab29dfd95aac26c7
- 12:05 PM rgw Documentation #10142 (Resolved): Update S3 compatibility table to reflect bucket location support
- Table at http://ceph.com/docs/master/radosgw/s3/
- 12:05 PM Bug #9439 (Resolved): pg_op_must_wait() not checking FILTER variants
- 05:18 AM Bug #9439: pg_op_must_wait() not checking FILTER variants
- https://github.com/ceph/ceph/pull/2962
- 12:05 PM Bug #10077 (Resolved): ceph_objectstore_tool: sets SHARDS feature on export it doesn't need to
- 11:13 AM Linux kernel client Bug #10141 (Resolved): rbd_img_obj_request_fill+0x81/0x200
- ...
- 09:43 AM rgw Feature #9932 (Fix Under Review): rgw: map swift X-Storage-Policy header to rgw pools
- 05:50 AM rbd Bug #10139 (New): librbd cpu usage 4x higher than krbd
- librbd cpu usage is quite huge currently, around 4-5x higher than krbd.
(Tested with fio+krbd vs fio+librbd, rando...
11/18/2014
- 11:51 PM CephFS Bug #10131: kclient: dentry still in use on umount
- it's a VFS bug. fixed by...
- 11:04 PM CephFS Bug #10131 (In Progress): kclient: dentry still in use on umount
- 09:20 AM CephFS Bug #10131 (Resolved): kclient: dentry still in use on umount
- ...
- 11:31 PM Bug #9978: keyvaluestore: void ECBackend::handle_sub_read
- Here are two new logs -- only filestore OSDs are up, all KV OSD are down.
- 11:07 PM Bug #10119 (Fix Under Review): 0.88 EC+ KV OSDs crashing
- https://github.com/ceph/ceph/pull/2966
- 08:06 AM Bug #10119: 0.88 EC+ KV OSDs crashing
- Thank you, I have started a 3-OSD keyvaluestore cluster to run benchmarks and try to trigger the crash
- 02:53 AM Bug #10119: 0.88 EC+ KV OSDs crashing
- I added the debug_keyvaluestore logging, and restarted them. The osds starting to crash immediately again, but there ...
- 05:55 PM devops Bug #9665 (Pending Backport): ceph-disk zap should call partprobe
- let's wait a week or two before backporting
- 05:54 PM devops Bug #9665 (Resolved): ceph-disk zap should call partprobe
- 07:05 AM devops Bug #9665: ceph-disk zap should call partprobe
- ...
- 05:44 PM Bug #10114 (Resolved): assembly files need annotation to assert that stack should not be executable
- https://github.com/ceph/ceph/pull/2961
- 05:40 PM Bug #10114: assembly files need annotation to assert that stack should not be executable
- https://github.com/ceph/ceph/pull/2963
- 01:16 PM Bug #10114: assembly files need annotation to assert that stack should not be executable
- https://github.com/ceph/ceph/pull/2946
- 01:15 PM Bug #10114: assembly files need annotation to assert that stack should not be executable
- Looks like it's merged, does this need to be backported?
- 05:04 PM Bug #10118: messenger drops messages between osds
- Samuel Just wrote:
> If you can reproduce with logs, that would help. The repops are supposed to complete in strict...
- 01:12 PM Bug #10118: messenger drops messages between osds
- If you can reproduce with logs, that would help. The repops are supposed to complete in strict order, this could be ...
- 04:29 PM devops Bug #9793: Fedora 20 ceph-extras Repo missing
- This must've been an oversight when we started shipping Fedora 20 packages. It led me to dig into why we're still shi...
- 03:40 PM CephFS Fix #10135 (Resolved): OSDMonitor: allow adding cache pools to cephfs pools already in use
- Right now we disallow this with _check_remove_tier(), I believe because we were worried about coordinating the switch...
- 03:12 PM rgw Bug #10103: swift tests failing
- Seem to be the same problem in run http://pulpito.front.sepia.ceph.com/teuthology-2014-11-17_17:15:01-upgrade:dumplin...
- 03:11 PM Bug #10107 (Duplicate): Coredump in upgrade:giant-x-next-distro-basic-multi run
- I think this is caused by the same thing as 10059, duplicate marking.
- 03:10 PM Bug #7996: 0.78: OSD is not suspend-friendly (unresponsive cluster on OSD crash)
- "Won't fix" should normally be accompanied by an explanation...
- 01:53 PM Bug #7996 (Won't Fix): 0.78: OSD is not suspend-friendly (unresponsive cluster on OSD crash)
- 03:06 PM Bug #9788 (Rejected): "Assertion: common/HeartbeatMap.cc: 79" placeholder for "hit suicide timeou...
- I think this one is the giant messenger deadlock, #9921, updated 9921, closing this ticket again.
- 03:05 PM Bug #9921: msgr/osd/pg dead lock giant
- I think this is another instance:
ubuntu@teuthology:/a/teuthology-2014-11-13_17:33:44-upgrade:giant-x-next-distro-...
- 02:37 PM CephFS Feature #1398: qa: multiclient file io test
- Answering my own question: Item 2 above. It looks like this can all be done from python.
- 02:21 PM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- As per jdillaman's suggestion on IRC, I have backed off to the PVE 2.6.32-34-pve kernel from 3.10.0-5-pve and can no ...
- 12:49 PM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- second batch dump from same locked process as requested
- 12:43 PM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- and a 3rd for good measure
- 12:41 PM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- another attempt and trace
- 12:30 PM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- Attached gdb output with libc and qemu debug symbols.
- 09:31 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- Greg, incidentally several of the attached backtraces show the Pipe reader thread waiting on the pipe lock:...
- 09:03 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- Jason, exactly what information is making you think the Pipe is hung waiting on a lock? And what version is in use ri...
- 07:48 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- Logs show that the pipe reader to osd.0 is hung waiting for the pipe lock. The last message from that thread is:
<p...
- 05:56 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- Thanks, I'll start reviewing these this morning.
- 05:27 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- Attached is the blktrace of the latest lockup.
Then the qemu output exceeded your max file size (by a couple of KB),...
- 04:12 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- Sure, I'll do that this morning first. Then I found the repo that proxmox is using to build qemu-kvm, so I'll rebuil...
- 02:01 PM Bug #10104 (Fix Under Review): rados.py: wait_for_* don't wait; should have poll, wait, and wait+...
- 01:57 PM Bug #8978 (Can't reproduce): ceph ping not working as expected
- 01:52 PM Bug #9369 (Can't reproduce): init: ceph-osd (...) main process (...) killed by ABRT signal
- 01:50 PM Bug #9439 (Fix Under Review): pg_op_must_wait() not checking FILTER variants
- 01:48 PM Bug #9438 (Resolved): librados API generated doc broken
- 01:45 PM Bug #9738: rados cli: objects not present in a snapshot are listed anyway
- Ugh, need to look at the object info to do this right. Should either fix or change docs or remove.
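A sketch of what "look at the object info" could mean here (hypothetical data shapes, not the rados CLI's actual structures): only list an object under a snapshot if its recorded snap set says it existed at that snapshot.

```python
# Hypothetical filter for snapshot listings: each object carries the
# set of snap ids in which it exists; listing a snapshot should only
# show objects whose snap set contains that id.
def list_objects_at_snap(objects, snap_id):
    # objects: name -> snap ids in which the object exists
    return [name for name, snaps in objects.items() if snap_id in snaps]

objs = {"a": [1, 2, 3], "b": [3], "c": [1]}
print(list_objects_at_snap(objs, 1))  # ['a', 'c'] -- 'b' did not exist yet
print(list_objects_at_snap(objs, 3))  # ['a', 'b'] -- 'c' was gone by then
```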
- 01:45 PM Feature #9720: erasure-code: non regression should test jerasure variants
- 01:43 PM Bug #9748 (Rejected): Dead jobs in upgrade:dumpling-x-firefly-distro-basic-multi run
- Probably some kind of networking issue, closing until we get more intel.
- 01:42 PM Bug #9784: All tools should be named consistently and argument parsing should be better
- Any tool that a user uses should have -'s. The source files should always use _'s. Just the final executable uses -...
- 01:41 PM Bug #9784: All tools should be named consistently and argument parsing should be better
- ceph-objectstore-tool. Generally, dashes for things people actually use, underscores for tests.
- 01:42 PM Bug #9751 (Rejected): ceph tell osd.6 version hangs
- 01:38 PM Bug #9801: ceph 0.80.7 build rpm packages in centos 7 error
- mkcephfs is removed post-firefly anyway, ignore the warning
- 01:38 PM Bug #9801 (Won't Fix): ceph 0.80.7 build rpm packages in centos 7 error
- 01:37 PM Feature #7104 (New): rest-api: support commands requiring 'w' cap without 'rw' cap
- the 'mds set' command is 'rw'. confused what the bug is... pls reopen and clarify if this is still an issue
- 01:37 PM Feature #7104 (Rejected): rest-api: support commands requiring 'w' cap without 'rw' cap
- 01:37 PM Bug #9818 (Resolved): ENXIO qa/workunits/cephtool/test.sh:test_osd_bench
- did not happen for a long time, looks like it's stable at last
- 01:36 PM Bug #10132 (Resolved): osd: tries to set ioprio when the config option is blank
- Saw this in a log:...
- 01:35 PM Bug #9761 (Rejected): ceph-osd: segfault at 654c30 ip 00007f00dc5f1f07 sp 00007f00c5642e00 error ...
- 01:33 PM Bug #9941 (Rejected): rados command line crashes when trying to copy pool snapshot
- The correct answer here will be to deprecate this command. We are talking about a more sophisticated import/export t...
- 01:32 PM Bug #9971 (Rejected): OSD crashes again after restarting due to op thread time out at writing pg ...
- Sounds like the disk is too slow for the timeouts, you'll have to increase them.
- 01:27 PM Bug #10126 (Rejected): "Exception: timed out waiting for admin_socket" in upgrade:giant-x-next-di...
- I suspect this is slowness in VM machines. There are no core files and nothing I could see of interest in osd.10 log...
- 01:25 PM Bug #10008 (Resolved): "obsolete rollback obj" error in upgrade:firefly-x-giant-distro-basic-vps run
- 01:21 PM Bug #10013 (Rejected): "Segmentation fault" in upgrade:dumpling-x-firefly-distro-basic-vps run
- upgrading the libraries -> crash
- 01:20 PM Bug #10067: ::posix_memalign abort ceph::buffer::create_page_aligned in 0.80.7
- client.admin.* is normal. Crash probably is not.
- 01:20 PM Bug #10069 (Rejected): SyncEntryTimeout::finish() timeout
- probably a slow vm
- 01:19 PM rgw Bug #10102: sync agent: does not handle gracefully transient errors
- Updated the description, and the RGW is not returning a 400 but a 500. The agent should probably get updated to under...
- 07:12 AM rgw Bug #10102 (In Progress): sync agent: does not handle gracefully transient errors
- 01:19 PM Bug #10085 (Rejected): dirty exit ("Illegal instruction") on pthread_rwlock_unlock()
- whatever the platform is, you'll have to build your own packages, I guess.
- 01:14 PM Bug #10117: OSD crashes if xattr "_" is absent for the file when doing backfill scanning (Replica...
- This should probably be a feature request for the backlog. We need a test reproducing it and some code to tolerate i...
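A sketch of "some code to tolerate it" (illustrative names and shapes only, not the OSD's actual scan path): when the metadata xattr is missing during a backfill scan, record the object as missing/corrupt for later repair instead of asserting.

```python
# Hypothetical tolerant scan: objects with a missing "_" xattr are
# flagged rather than crashing the OSD with a failed assert.
def scan_objects(objects, read_xattr):
    scanned, missing = [], []
    for obj in objects:
        info = read_xattr(obj, "_")
        if info is None:
            missing.append(obj)   # flag for repair instead of asserting
            continue
        scanned.append((obj, info))
    return scanned, missing

xattrs = {"a": b"info-a", "c": b"info-c"}   # "b" lost its xattr
scanned, missing = scan_objects(["a", "b", "c"],
                                lambda o, k: xattrs.get(o))
print(missing)  # ['b']
```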
- 01:07 PM Bug #10129 (Pending Backport): Bad locking in the trunc method of libradosstriper
- 03:03 AM Bug #10129: Bad locking in the trunc method of libradosstriper
- giant backport https://github.com/ceph/ceph/pull/2954
- 02:49 AM Bug #10129: Bad locking in the trunc method of libradosstriper
- https://github.com/ceph/ceph/pull/2951 testing
- 02:44 AM Bug #10129: Bad locking in the trunc method of libradosstriper
- 02:22 AM Bug #10129 (Resolved): Bad locking in the trunc method of libradosstriper
- A catch badly positioned makes the locking void and can lead to race conditions.
- 01:06 PM Bug #9970 (Resolved): document erasure coded pool simple operations
- 01:00 PM Bug #10077: ceph_objectstore_tool: sets SHARDS feature on export it doesn't need to
- 12:59 PM Feature #10064 (Resolved): add ceph_objectstore_tool tests to make check
- 10:25 AM Bug #10128 (Pending Backport): ceph_objectstore_tool --op export to stdout broken
- 01:44 AM Bug #10128 (Resolved): ceph_objectstore_tool --op export to stdout broken
- https://github.com/ceph/ceph/pull/2950
- 09:07 AM rbd Bug #10123: "Segmentation fault" in upgrade:dumpling-x-firefly-distro-basic-vps run
- LibRBD.ListChildren was the last logged test
- 07:53 AM devops Feature #10046: run make check on every pull request
- https://github.com/ceph/ceph/pull/2956
- 06:37 AM devops Bug #8896 (Rejected): missing i386 packages for Trusty
- we no longer build i386 and I don't think there are plans to add them back.
- 01:49 AM Feature #9943: osd: mark pg and use replica on EIO from client read
- Submitted pull request:
https://github.com/ceph/ceph/pull/2952
11/17/2014
- 11:04 PM Bug #10128: ceph_objectstore_tool --op export to stdout broken
- It would be nice to fix the unit test to use all variants of export and import.
- 10:59 PM Bug #10128 (Resolved): ceph_objectstore_tool --op export to stdout broken
The change a2bd2aa7 broke --op export to stdout. It is writing text using out.
I want to backport this fix to g...
- 07:45 PM Bug #9978: keyvaluestore: void ECBackend::handle_sub_read
- Yep, if you're free you can paste crash logs with debug_keyvaluestore=20/20
- 02:51 PM Bug #9978: keyvaluestore: void ECBackend::handle_sub_read
- Do you need any additional "debug_keyvaluestore=20/20" logs? It's been another week... Is there any progress? Any hop...
- 07:26 PM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- I can see four outstanding read requests in the last set of logs that you provided. Any chance you can re-run the sa...
- 02:26 PM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- Using krbd instead of librbd with qemu doesn't hang, however, in the guest with dd, the total sequential performance ...
- 11:05 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- blktrace and qemu log attached as requested. I could not gracefully kill blktrace as the vm hardlocked so hopefully ...
- 10:33 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- None of the Ceph threads in the provided backtraces appeared to be deadlocked. It's possible an IO completion is bein...
- 09:40 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- logs from 3 runs back-to-back, forcibly killing the vm and restarting it between each attempt
- 09:28 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- Brad, it would be helpful to see a few back-to-back GDB backtraces. In the full backtrace above, all blocked threads...
- 09:22 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- CPU usage is 0 when the lockup occurs, so I don't think it is due to excess CPU usage.
I can definitely try those ...
- 08:35 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- alexandre derumier wrote:
> Hi,
>
> >>kernel:BUG: soft lockup - CPU#0 stuck for 23s!
>
> by default share 1thr...
- 08:31 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- Hi,
>>kernel:BUG: soft lockup - CPU#0 stuck for 23s!
by default it shares one thread for many things (clock, io access,...
- 08:28 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- During lockup:...
- 07:10 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- I should also mention I am brad_mssw in the #ceph IRC channel on oftc if there are any suggestions or things to try.
- 05:47 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- Realized I was missing the librados debug symbols, here it is again, and also backtraced all threads:...
- 05:36 AM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- What is more interesting to me is that if I break into it with GDB when it is hung, then tell it to continue, I get notifi...
- 05:55 PM devops Bug #8896: missing i386 packages for Trusty
- Would it be possible to have this fixed soon? I'm running into the problem since I can't upgrade 1 of my 3 servers ...
- 03:43 PM Bug #10096 (Resolved): ceph-disk prepare fails to unmount temp file successfully
- 02:58 PM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- Oddly, I'm able to reproduce it easily on v0.67.11, but not wip-9113-9487-dumpling (496e561d81f2dd1bf92d588fc3afc2431...
- 02:49 PM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- This test cluster is currently running 0.67.11-4-g496e561, mons and osds.
On our prod cluster we still run ceph-0....
- 10:53 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- All other osds are running that branch, right? Also, which sha1 was it which you thought was working (the branches h...
- 03:17 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- Well the PG isn't empty -- I've been writing a bunch of data to it using rados bench. Basically, I'm having trouble g...
- 02:55 PM Bug #10126 (Rejected): "Exception: timed out waiting for admin_socket" in upgrade:giant-x-next-di...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-17_08:56:42-upgrade:giant-x-next-distro-basic-vps/...
- 02:00 PM Bug #10125 (Resolved): radosgw is being started as root not apache with systemd
- On RHEL 7 when radosgw is started with systemd it runs as root not apache which causes problems with the s3gw.fcgi is...
- 12:57 PM Bug #10124: monitor receives bus error signal
- if reproducible:
<joao> 'mon_debug_dump_transactions = true' and 'mon_debug_dump_location = /path'
- 12:37 PM Bug #10124: monitor receives bus error signal
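Joao's suggestion above corresponds to a ceph.conf fragment along these lines (the dump path here is a placeholder, not one taken from the ticket):

```ini
[mon]
    # dump monitor transactions so the corruption can be inspected post-mortem
    mon debug dump transactions = true
    mon debug dump location = /var/log/ceph/mon-tx-dump   ; placeholder path
```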
- Oh, it looks like this has been reported by Joao before to leveldb list: https://groups.google.com/forum/#!topic/leve...
- 12:01 PM Bug #10124 (Rejected): monitor receives bus error signal
- This happened in the latest giant version. A bus error usually suggests a hardware fault, but the issue suspiciou...
- 10:15 AM Bug #10119: 0.88 EC+ KV OSDs crashing
- Hmm, it's strange because I already fixed this bug previously. Maybe it's another?
Could you run crashed OSD agai...
- 06:53 AM Bug #10119 (Resolved): 0.88 EC+ KV OSDs crashing
- Hi,
I am further testing the EC+ KV setup, and the OSDs were crashing again, so I updated ticket #9727.
But after ...
- 10:02 AM Bug #10093 (Resolved): ceph-monstore-tool: FAILED assert(!is_open)
- 09:49 AM rbd Bug #10123 (Resolved): "Segmentation fault" in upgrade:dumpling-x-firefly-distro-basic-vps run
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-16_19:13:03-upgrade:dumpling-x-firefly-distro-basi...
- 09:48 AM rbd Bug #10122 (Resolved): "LibRBD.TestClone" FAILED in upgrade:dumpling-x-firefly-distro-basic-vps run
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-16_19:13:03-upgrade:dumpling-x-firefly-distro-basi...
- 09:39 AM rgw Bug #10121 (Duplicate): "test.functional.tests.TestAccountUTF8" error in upgrade:dumpling-x-firef...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-16_19:13:03-upgrade:dumpling-x-firefly-distro-basi...
- 09:02 AM Bug #10063 (Pending Backport): ceph_objectstore_tool does not support getting attributes for eras...
- 08:05 AM Bug #9913 (Pending Backport): mon: audit log entries for forwarded requests lack info
- 07:38 AM Bug #9913 (Fix Under Review): mon: audit log entries for forwarded requests lack info
- https://github.com/ceph/ceph/pull/2944
- 08:02 AM devops Bug #10120 (New): "Assertion: os/FileStore.cc" in upgrade:firefly-x-next-distro-basic-multi run
- Sandon, that was mira034:...
- 07:53 AM devops Bug #10120 (Rejected): "Assertion: os/FileStore.cc" in upgrade:firefly-x-next-distro-basic-multi run
- That fail_eio assert means we got EIO back from the fs, which means there is a bad disk.
- 07:34 AM devops Bug #10120 (Rejected): "Assertion: os/FileStore.cc" in upgrade:firefly-x-next-distro-basic-multi run
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-15_17:13:01-upgrade:firefly-x-next-distro-basic-mu...
- 07:52 AM rgw Bug #10103: swift tests failing
- Also in runs:
http://pulpito.front.sepia.ceph.com/teuthology-2014-11-14_02:35:01-smoke-master-distro-basic-multi/
...
- 02:40 AM Bug #10118 (Can't reproduce): messenger drops messages between osds
- Log snippets before the daemon crash:...
- 01:53 AM Bug #9285: osd: promoted object can get evicted before promotion completes
- We probably have this issue on our ceph cluster (0.80.7 on commodity PC hardware + 10G ethernet), and this is blo...
11/16/2014
- 09:41 PM Bug #10117 (Won't Fix): OSD crashes if xattr "_" is absent for the file when doing backfill scann...
- We observed an OSD crash pattern caused by the xattr "_" being absent for the file (on the filesystem), which results in an a...
- 06:20 PM rbd Bug #10116: Ceph vm guest disk lockup when using fio
- I wonder if this isn't the issue from #9854.
fio gets through writing the test files, and the lock occurs during t...
- 02:31 PM rbd Bug #10116 (Closed): Ceph vm guest disk lockup when using fio
- When running a disk benchmark within a guest, I'm getting a disk lockup that doesn't ever appear to resolve itself. ...
- 02:09 AM Bug #10059: osd/ECBackend.cc: 876: FAILED assert(0)
- Samuel Just wrote:
> This is almost certainly unrelated to those two bugs. This is a specific edge case in divergen...
11/14/2014
- 10:46 PM Bug #10115: mon not running. osd is dead
- my ceph version is 0.80.1. I installed it on ubuntu 12.04.4.
uname -a : Linux controller 3.11.0-26-generic #45~preci...
- 10:22 PM Bug #10115: mon not running. osd is dead
- this is the log file on one of my ceph node.
- 10:10 PM Bug #10115 (Can't reproduce): mon not running. osd is dead
- my ceph didn't have cephx configured. I solved one problem before as described in this issue: http://tracker.ceph.com/issues/8851.
...
- 06:05 PM Bug #10114 (Fix Under Review): assembly files need annotation to assert that stack should not be ...
seemingly working workaround in wip-execstack
- 05:58 PM Bug #10114: assembly files need annotation to assert that stack should not be executable
References:
https://bugzilla.redhat.com/show_bug.cgi?id=1118504 the original bug that noticed the problem on Fe...
- 05:30 PM Bug #10114 (Resolved): assembly files need annotation to assert that stack should not be executable
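For reference, the usual GNU-toolchain fix for this class of bug is to append a non-executable-stack note to each assembly file so the linker does not mark the process stack executable. This is a sketch of the general approach (exact preprocessor guards vary per file), not necessarily the content of the wip-execstack branch:

```asm
/* Tell the GNU linker this object does not need an executable stack.
   Without this note, hand-written .S files default to PT_GNU_STACK RWX. */
#if defined(__linux__) && defined(__ELF__)
.section .note.GNU-stack,"",%progbits
#endif
```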
- 05:10 PM Bug #10113: --log-to-stderr with -f/-d sends a lot of things to logfile
- on a vstart cluster with 3 osds, if I stop osd.2 and restart like:
./ceph-osd -i 2 -c ./ceph.conf --log-to-stderr ...
- 05:10 PM Bug #10113 (Duplicate): --log-to-stderr with -f/-d sends a lot of things to logfile
- 03:45 PM Bug #10059: osd/ECBackend.cc: 876: FAILED assert(0)
- 03:12 PM Bug #10059: osd/ECBackend.cc: 876: FAILED assert(0)
- This is almost certainly unrelated to those two bugs. This is a specific edge case in divergent write recovery.
- 11:43 AM devops Cleanup #7722 (Resolved): Make /admin/build-doc distro independent
- 11:41 AM devops Cleanup #7722: Make /admin/build-doc distro independent
- Updated the procedure doc with all dependencies.
- 11:43 AM Bug #9788 (New): "Assertion: common/HeartbeatMap.cc: 79" placeholder for "hit suicide timeout" is...
- Logs are in http://pulpito.front.sepia.ceph.com/teuthology-2014-11-13_17:33:44-upgrade:giant-x-next-distro-basic-vps/...
- 10:22 AM Cleanup #10110 (New): librados: mark old objects_begin interface deprecated
- There is some minor refactoring needed since the new methods call the old ones when ns == "". The fix is probably to...
- 10:18 AM devops Tasks #8366: Update ceph.com/docs to default to the latest major release (0.80)
- Can we update it to the latest major release with the backports--e.g., v0.80.7? I finally have someone to help with t...
- 10:12 AM Bug #9487: dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not ...
- I think that's an annoying special case for snaps purged on an empty pg. Both the old primary which did the trim and...
- 08:09 AM Bug #10107: Coredump in upgrade:giant-x-next-distro-basic-multi run
- ...
- 07:40 AM Bug #10107 (Duplicate): Coredump in upgrade:giant-x-next-distro-basic-multi run
- (Maybe related to #8733)
Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-13_17:04:11-upgrade:gi... - 08:03 AM Bug #10109 (Duplicate): "LibRadosTwoPoolsECPP.PromoteSnap" test failed in upgrade:dumpling-firefl...
- 3 tests failed in run http://pulpito.front.sepia.ceph.com/teuthology-2014-11-13_17:15:02-upgrade:dumpling-firefly-x:p...
- 07:55 AM rgw Bug #10108 (Duplicate): s3tests fail in upgrade:dumpling-firefly-x:parallel-next-distro-basic-mul...
- All tests failed in run http://pulpito.front.sepia.ceph.com/teuthology-2014-11-13_17:10:02-upgrade:dumpling-firefly-x...
- 07:47 AM Bug #10105: crash in PG::peek_map_epoch on upgrade from 0.80.4 to 0.80.7
- the upgrade from 0.80.1 to 0.80.7 case was a bad disk.
- 07:32 AM Bug #9727: 0.86 EC+ KV OSDs crashing
- Hi,
I tried this again on the new 0.88 release.
After about 30 minutes of testing, the EC-KV OSDs started crashin...
- 04:51 AM Messengers Feature #10029: Retry binding on IPv6 address if not available
- I started playing with this a bit (no commits yet), I simply loop in SimpleMessenger's Accepter.cc and retry to bind ...
- 03:26 AM Feature #9979 (In Progress): osd: cache: proxy reads (instead of redirect)
- https://github.com/ceph/ceph/pull/2927
- 02:17 AM rgw Bug #10106 (Resolved): rgw acl response should start with <?xml version="1.0" ?>
- I encountered some surprising behaviour when playing with radosgw and s3cmd.
You can probably make a convincing case...
- 02:10 AM Bug #10018: OSD assertion failure if the hinfo_key xattr is not there (corrupted?) during scrubbing
11/13/2014
- 10:32 PM Bug #10052 (Fix Under Review): LibRadosTwoPools[EC]PP.PromoteSnap failure
- https://github.com/ceph/ceph/pull/2926
- 10:19 PM Bug #10052 (In Progress): LibRadosTwoPools[EC]PP.PromoteSnap failure
- // read baz
  {
    bufferlist bl;
    ASSERT_EQ(-ENOENT, ioctx.read("baz", bl, 1, 0));
  }
I think this usu...
- 05:44 PM Bug #10052: LibRadosTwoPools[EC]PP.PromoteSnap failure
- ubuntu@teuthology:/a/sage-2014-11-12_13:30:37-smoke-wip-warn-max-pg-distro-basic-multi/598501
- 08:49 PM Bug #10105 (Can't reproduce): crash in PG::peek_map_epoch on upgrade from 0.80.4 to 0.80.7
- ...
- 05:48 PM Bug #10104 (Resolved): rados.py: wait_for_* don't wait; should have poll, wait, and wait+cb versions
- Completion.wait_for_{safe, complete} are using the poll functions "is_{safe,complete}"; the comments indicate that's ...
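To illustrate the distinction the ticket draws, here is a sketch in plain Python (illustrative only, not the rados.py API) of the three access styles requested: a non-blocking poll, a true blocking wait, and a callback variant, all on one completion object:

```python
import threading

class Completion:
    """Illustrative only, not rados.py: one async-completion object
    exposing poll, blocking-wait, and callback access styles."""

    def __init__(self):
        self._done = threading.Event()
        self._callbacks = []

    def is_complete(self):
        # poll: never blocks, just reports the current state
        return self._done.is_set()

    def wait_for_complete(self, timeout=None):
        # wait: truly blocks until the operation finishes (or times out)
        return self._done.wait(timeout)

    def add_callback(self, cb):
        # wait+cb: run cb on completion, immediately if already done
        # (a real implementation would lock around the callback list)
        if self._done.is_set():
            cb(self)
        else:
            self._callbacks.append(cb)

    def _complete(self):
        # invoked by whatever finishes the operation
        self._done.set()
        for cb in self._callbacks:
            cb(self)
```

The bug is that a wait built on top of a poll spins or returns early; a real wait needs a primitive like the event above.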
- 05:47 PM rgw Bug #10103 (Resolved): swift tests failing
- ubuntu@teuthology:/a/dzafman-2014-11-13_10:42:58-rgw-wip-10082-testing-basic-multi$ teuthology-ls . | grep FAIL
5996...
- 05:02 PM Bug #10059: osd/ECBackend.cc: 876: FAILED assert(0)
- Any progress?
- 04:36 PM rgw Bug #10082 (Resolved): Segmentation fault in upgrade:dumpling-firefly-x:parallel-next-distro-basi...
- 04:28 PM Feature #10064 (Fix Under Review): add ceph_objectstore_tool tests to make check
- https://github.com/ceph/ceph/pull/2915
- 04:28 PM Bug #10063 (Fix Under Review): ceph_objectstore_tool does not support getting attributes for eras...
- https://github.com/ceph/ceph/pull/2915
- 03:48 PM rgw Bug #10102 (Resolved): sync agent: does not handle gracefully transient errors
- on a copy operation, rgw sent back 400 and the sync agent got stuck in the following loop:...
- 12:58 PM rgw Bug #9587 (Pending Backport): ceph-radosgw sysvinit script on EL6 cannot set ulimit
- 12:25 PM rgw Bug #10099 (Duplicate): radosgw-agent - error geting op state: list index out of range
- radosgw-agent logs the following, and objects are not synced to the secondary gateway.
INFO:urllib3.connectionpool...
- 12:25 PM Bug #10096: ceph-disk prepare fails to unmount temp file successfully
- Notes:
- Issuing a short delay before 'umount' fixes the issue - this is a terrible workaround
- Issuing 'sync' b...
- 07:52 AM Bug #10096 (Resolved): ceph-disk prepare fails to unmount temp file successfully
- I have been testing on a virtual machine for ease of testing, and 'ceph-disk prepare' kept forwarding an error from '...
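The sync-plus-delay workaround described in the notes above could be sketched like this; the helper and its injectable command runner are hypothetical, for illustrating the retry logic, and are not the actual ceph-disk code:

```python
import subprocess
import time

def umount_with_retry(path, attempts=5, delay=0.5, run=subprocess.call):
    """Hypothetical sketch of the workaround: sync first, then retry
    umount a few times with a short delay, since the mount can stay
    briefly busy right after data is written to it. `run` is injectable
    so the retry logic can be exercised without a real mount."""
    run(["sync"])
    for _ in range(attempts):
        if run(["umount", path]) == 0:
            return True
        time.sleep(delay)
    return False
```

As the reporter notes, a fixed delay is a terrible workaround; bounding it with a retry loop at least keeps the failure mode explicit.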
- 11:07 AM Bug #10095 (Resolved): (crush_bucket_adjust_item_weight()+0) [0x7d1540] crash
- 11:02 AM Bug #10095 (Fix Under Review): (crush_bucket_adjust_item_weight()+0) [0x7d1540] crash
- https://github.com/ceph/ceph/pull/2920
- 07:37 AM Bug #10095 (Resolved): (crush_bucket_adjust_item_weight()+0) [0x7d1540] crash
- ubuntu@teuthology:/a/samuelj-2014-11-11_22:08:30-rados-wip-sam-testing-wip-testing-vanilla-fixes-basic-multi/597458
... - 10:36 AM Bug #9835 (Resolved): osd: bug in misdirected op checks (firefly)
- 10:25 AM Messengers Feature #10079 (Resolved): AsyncMessenger: Support select for other OS
- 09:49 AM Feature #10098 (Resolved): wanted: command to clear 'incomplete' PGs
- Hello,
Please create a command that would clear 'incomplete' PGs.
Perhaps ceph pg force_create_pg could be extend...
- 08:32 AM rbd Bug #9854 (Pending Backport): librbd: reads contending for cache space can cause livelock
- 08:28 AM Bug #10097 (Resolved): failed: mon_thrash
- debian 7.0
logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-12_17:15:01-upgrade:giant-giant-dist...
- 07:17 AM Support #10024: Cluster unreachable after restart
- Hi,
Have I missed anything?
Did I do something wrong?
Because I didn't get any answer after more than 1 week.
Thank...
- 06:59 AM Cleanup #10094 (New): Create new git repo for json_spirit
- json_spirit is currently part of the ceph code tree, but it's external code. There has also been no update in a long...
- 06:58 AM CephFS Bug #10092 (Resolved): multiple_rsync.sh + ceph-fuse timing out on firefly
- greg is right, these time out semi-regularly. increased the timeout on master, giant, firefly.
- 06:38 AM Bug #10093 (Fix Under Review): ceph-monstore-tool: FAILED assert(!is_open)
- https://github.com/ceph/ceph/pull/2914
- 06:35 AM Bug #10093 (Resolved): ceph-monstore-tool: FAILED assert(!is_open)
- Using a vstart cluster + stop.sh:...
- 04:17 AM Bug #9916: osd: crash in check_ops_in_flight
- Hi Yehuda,
After taking a look at the rgw code, I failed to find which (http) request would need CEPH_OSD_OP_SRC_CMP...
- 12:14 AM Feature #9943 (In Progress): osd: mark pg and use replica on EIO from client read
- Currently the OSD checks the PG map, gets only k items, and sends sub-read requests. So if one read fails, it asserts and core du...
11/12/2014
- 09:21 PM Bug #10077: ceph_objectstore_tool: sets SHARDS feature on export it doesn't need to
- How do we tell the difference between (2) and (3)? In both cases, ceph_objectstore_tool will see there is no SHARDS ...
- 09:06 PM Bug #10077: ceph_objectstore_tool: sets SHARDS feature on export it doesn't need to
I see from the code that there are a couple of scenarios that need to be handled or at least documented:
1. Expo...
- 08:59 PM CephFS Bug #10092 (Resolved): multiple_rsync.sh + ceph-fuse timing out on firefly
- teuthology-2014-11-11_23:04:01-fs-firefly-distro-basic-multi/598145
teuthology-2014-11-11_23:04:01-fs-firefly-distro...
- 08:25 PM Bug #8588: In the erasure-coded pool, primary OSD will crash at decoding if any data chunk's size...
- Wei is working on this along with http://tracker.ceph.com/issues/9943 .
- 06:52 PM Messengers Bug #10080: Pipe::connect() cause osd crash when osd reconnect to its peer
- Greg Farnum wrote:
> What version are you running? This looks like one of a couple of bugs that have been resolved i...
- 10:47 AM Messengers Bug #10080: Pipe::connect() cause osd crash when osd reconnect to its peer
- What version are you running? This looks like one of a couple of bugs that have been resolved in the latest point rel...
- 04:26 AM Messengers Bug #10080: Pipe::connect() cause osd crash when osd reconnect to its peer
- And the peer OSD's log is as below:...
- 03:40 AM Messengers Bug #10080 (Resolved): Pipe::connect() cause osd crash when osd reconnect to its peer
- When our cluster load is heavy, the osd sometimes crashes. The critical log is as below:
-278> 2014-08-20 11:04:28...
- 05:15 PM rbd Bug #9771: Segmentation fault after upgrade v0.80.5 -> v0.80.6
- 05:13 PM rbd Bug #9771: Segmentation fault after upgrade v0.80.5 -> v0.80.6
- Commit b75f85a2 added new elements to the _Thread_ class, breaking ABI. In this (and several other upgrade tests fro...
- 05:08 PM Feature #9957: librados: add fadvise op
- See the pull request: https://github.com/ceph/ceph/pull/2905
- 04:09 PM rgw Bug #10090 (Resolved): ceph_objectstore_tool import broken
- 03:27 PM rgw Bug #10090 (Fix Under Review): ceph_objectstore_tool import broken
- 02:15 PM rgw Bug #10090 (Resolved): ceph_objectstore_tool import broken
The tool can't import because it finds that the recently removed collection still exists.
It may be because fini...
- 12:37 PM rbd Bug #10002 (Resolved): Errors during import_export test in upgrade:firefly-x-next-distro-basic-vp...
- commit:e94d3c11edb9c9cbcf108463fdff8404df79be33
- 11:38 AM Bug #10083 (Resolved): cephtool/test.sh: osd create w/o uuid test is noisy
- 10:09 AM Bug #10083: cephtool/test.sh: osd create w/o uuid test is noisy
- Verified to work with...
- 09:53 AM Bug #10083 (Fix Under Review): cephtool/test.sh: osd create w/o uuid test is noisy
- https://github.com/ceph/ceph/pull/2902
- 09:29 AM Bug #10083 (Resolved): cephtool/test.sh: osd create w/o uuid test is noisy
- ...
- 10:56 AM Bug #10085 (Resolved): dirty exit ("Illegal instruction") on pthread_rwlock_unlock()
- After upgrading to glibc 2.20, the "ceph" & "rbd" commands exit with an "Illegal instruction" message and a non-zero exit cod...
- 10:00 AM Feature #9598 (Pending Backport): re-enable Objecter fast dispatch
- sage-2014-11-11_08:26:01-rados-wip-sage-testing-distro-basic-multi
- 08:42 AM Bug #9702: "MaxWhileTries: 'wait_until_healthy'reached maximum tries" in upgrade:firefly-x-giant-...
- Same issue http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-11_17:03:01-upgrade:firefly:older-firefly-distro-ba...
- 08:29 AM rgw Bug #10082 (Resolved): Segmentation fault in upgrade:dumpling-firefly-x:parallel-next-distro-basi...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-11_17:10:01-upgrade:dumpling-firefly-x:parallel-ne...
- 06:53 AM rbd Feature #2467 (Resolved): qemu: implement bdrv_invalidate_cache
- Merged upstream: http://git.qemu.org/?p=qemu.git;a=commitdiff;h=be21788495fdc8251b04dd4bfd0cdce95c49d75b
- 01:23 AM Messengers Feature #10079 (Resolved): AsyncMessenger: Support select for other OS
- AsyncMessenger already supports epoll and kqueue, but for other legacy OSes or Windows we need to use select for the wo...
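As a sketch of the mechanism being proposed (in plain Python, not the AsyncMessenger C++ code), a select()-based readiness check looks like this; select(2) is the lowest common denominator available on essentially every platform:

```python
import select
import socket

def wait_readable(sock, timeout=1.0):
    """Minimal select()-based readiness check: the portable fallback for
    platforms without epoll or kqueue. Illustration only, not the
    AsyncMessenger implementation."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return sock in readable
```

A worker event loop built on this would pass its wakeup pipe and all connection sockets to the same select() call, at the cost of select's O(n) scan and FD_SETSIZE limit versus epoll/kqueue.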
11/11/2014
- 06:17 PM rbd Bug #10002 (Fix Under Review): Errors during import_export test in upgrade:firefly-x-next-distro-...
- https://github.com/ceph/ceph/pull/2899
- 08:23 AM rbd Bug #10002: Errors during import_export test in upgrade:firefly-x-next-distro-basic-vps run
- Same issue in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-10_17:15:02-upgrade:dumpling-firefly-x:parallel-...
- 08:17 AM rbd Bug #10002: Errors during import_export test in upgrade:firefly-x-next-distro-basic-vps run
- Seems similar issue in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-10_17:05:02-upgrade:firefly:singleton-f...
- 05:20 PM Bug #10052: LibRadosTwoPools[EC]PP.PromoteSnap failure
- ubuntu@teuthology:/a/sage-2014-11-11_14:57:42-smoke-wip-warn-max-pg-distro-basic-multi/596722
- 02:59 PM CephFS Bug #8090: multimds: mds crash in check_rstats
- ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2014-11-10_23:18:02-multimds-giant-testing-basic-multi/595393
- 02:54 PM Bug #10077 (Resolved): ceph_objectstore_tool: sets SHARDS feature on export it doesn't need to
- user on 0.87 exported a replicated pg and couldn't import it because the shards feature wasn't set on the osd.
w... - 02:14 PM rgw Feature #9933: rgw: implement S3 RR (reduced redundancy) API
- Hmm, was looking just now at the S3 api, and it seems that you can set RR per object, not per bucket. This complicate...
- 11:01 AM Bug #10069 (Rejected): SyncEntryTimeout::finish() timeout
The ceph_objectstore_tool aborted in FileStore code.
On my wip-9780 branch which is rebased on current master ru...
- 10:31 AM devops Bug #10049: "Failed to fetch package" "rhel7_0-x86_64-basic"
- Replying to my own post for posterity:
I figured out why those Git hashes don't align. It's a bug in log.cgi. Appare...
- Looks fixed
- 09:53 AM Bug #10067 (Can't reproduce): ::posix_memalign abort ceph::buffer::create_page_aligned in 0.80.7
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-10_19:13:02-upgrade:dumpling-x-firefly-distro-basi...
- 09:01 AM rgw Feature #9013 (Resolved): rgw: set civetweb as a default frontend
- 08:48 AM rgw Bug #10066: rgw: failed md5sum on s3tests-test-readwrite
- Same problem in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-10_18:11:17-upgrade:firefly:newer-firefly-dist...
- 07:22 AM rgw Bug #10066 (Resolved): rgw: failed md5sum on s3tests-test-readwrite
- ...
- 08:16 AM Bug #9702: "MaxWhileTries: 'wait_until_healthy'reached maximum tries" in upgrade:firefly-x-giant-...
- Same issues in run http://pulpito.front.sepia.ceph.com/teuthology-2014-11-10_17:18:01-upgrade:firefly-x-next-distro-b...
- 08:02 AM Bug #10016 (Resolved): "Segmentation fault" in upgrade:giant-giant-distro-basic-multi run
- tests passed.
- 07:25 AM rgw Bug #9917 (Won't Fix): RADOSGW: Not able to create Swift objects with erasure coded pool
- 03:51 AM rgw Bug #9917: RADOSGW: Not able to create Swift objects with erasure coded pool
- OK, I was not aware of this; seems sane behaviour to me.
- 07:21 AM rgw Bug #10062: s3-test failures using keystone authentication
- It looks like a few of them, e.g. the date ones, occur because radosgw doesn't check the date head...
- 05:02 AM rgw Bug #10062 (Resolved): s3-test failures using keystone authentication
- * "rgw: check for timestamp for s3 keystone auth":https://github.com/ceph/ceph/pull/2993
* "wip: rgw: check keystone...
- 07:20 AM Bug #10065 (Duplicate): hung ec-lost-unfound.yaml, failed of osd.{0,2,3}
- this pattern keeps popping up:...
- 07:16 AM Bug #7995: osd shutdown: ./common/shared_cache.hpp: 93: FAILED assert(weak_refs.empty())
- ubuntu@teuthology:/a/teuthology-2014-11-10_02:32:01-rados-giant-distro-basic-multi/594038
- 06:40 AM Feature #10064 (Resolved): add ceph_objectstore_tool tests to make check
- The "ceph_objectstore_tool.py":https://github.com/ceph/ceph/blob/giant/src/test/ceph_objectstore_tool.py tests can be...
- 06:35 AM Bug #10063: ceph_objectstore_tool does not support getting attributes for erasure coded objects
- ...
- 06:33 AM Bug #10063 (Resolved): ceph_objectstore_tool does not support getting attributes for erasure code...
- ...
- 04:37 AM Bug #9554: "FAILED assert(0 == "hit suicide timeout")" in upgrade:firefly-firefly-testing-basic-v...
- Yes it reproduced in giant too.
11/10/2014
- 11:46 PM CephFS Bug #10041: ceph-fuse: never exit when no MDS server is available
- Just wanted to add that the lack of a timeout causes havoc all over the place... Autofs, backup scripts mounting CephFS on d...
- 04:05 PM CephFS Bug #10041: ceph-fuse: never exit when no MDS server is available
- Although it terminates on "Ctrl+C", a timeout would be _very_ useful because it would prevent the system from hanging on b...
- 11:11 AM CephFS Bug #10041: ceph-fuse: never exit when no MDS server is available
- Was it blocking in the foreground? Did SIGINT (i.e. control-C) work on it?
We can add a configurable timeout but I...
- 01:07 AM CephFS Bug #10041 (Resolved): ceph-fuse: never exit when no MDS server is available
- I'm attempting to mount CephFS using the Fuse client (i.e. _ceph-fuse_), which does not exit if all MDS servers are down (I ...
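If the configurable timeout discussed in this ticket were added, it could be expressed as a ceph.conf fragment along these lines; the option name below is invented purely for illustration:

```ini
[client]
    ; hypothetical option illustrating the proposed behaviour:
    ; give up and exit after this many seconds without a reachable MDS
    client mount timeout = 60
```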
- 10:57 PM CephFS Bug #10061 (New): uclient: MDS: output cap data in messages
- MClientCaps messages don't dump the caps they're updating, and generally neither does anything else. We need to optio...
- 10:55 PM CephFS Feature #10060 (New): uclient: warn about stuck cap flushes
- It can be hard to diagnose issues that involve cap state. To help with that, the client should keep track of its cap ...
- 10:40 PM CephFS Bug #9977 (Resolved): cephfs-journal-tool falsely reports invalid start_ptr
- In next branch as commit:65c33503c83ff8d88781c5c3ae81d88d84c8b3e4 and in giant as commit:fc5354dec55248724f8f6b795e3a...
- 09:36 PM CephFS Bug #9341: MDS: very slow rejoin
- Thanks.
- 09:27 PM CephFS Bug #9341 (Resolved): MDS: very slow rejoin
- This is backported to giant as of commit:97e423f52155e2902bf265bac0b1b9ed137f8aa0. The test for it also got backporte...
- 09:26 PM CephFS Bug #9800 (Resolved): client-limits test is not passing
- Backported in commit:387efc5fe1fb148ec135a6d8585a3b8f8d97dbf8
- 06:15 PM Bug #10042: OSD crash doing object recovery with EC pool
- I'm not sure either, investigating.
- 05:15 PM Bug #10042: OSD crash doing object recovery with EC pool
- Hi Loic,
I am still a little bit confused in terms of what happened behind the crash (and what is the relation betwe...
- 05:30 AM Bug #10042: OSD crash doing object recovery with EC pool
- 03:49 AM Bug #10042 (Duplicate): OSD crash doing object recovery with EC pool
- We observed one OSD crash with the following assertion failure:...
- 06:10 PM rbd Bug #10045 (Resolved): common/Cond.h: 52: FAILED assert(mutex.is_locked()) in close_image()
- 06:45 AM rbd Bug #10045 (Resolved): common/Cond.h: 52: FAILED assert(mutex.is_locked()) in close_image()
- ...
- 05:44 PM Bug #9921: msgr/osd/pg dead lock giant
- Giving Sage this ticket since he took the PR.
- 05:35 PM Bug #10016: "Segmentation fault" in upgrade:giant-giant-distro-basic-multi run
- testing this PR https://github.com/ceph/ceph-qa-suite/pull/233
http://pulpito.front.sepia.ceph.com/teuthology-2014...
- 03:06 PM Bug #10016: "Segmentation fault" in upgrade:giant-giant-distro-basic-multi run
- install.upgrade:
      all:
        branch: giant
  is upgrading all roles
- 02:29 PM Bug #10016: "Segmentation fault" in upgrade:giant-giant-distro-basic-multi run
- Still failed - http://pulpito.front.sepia.ceph.com/teuthology-2014-11-10_10:56:16-upgrade:giant-giant-distro-basic-mu...
- 10:48 AM Bug #10016: "Segmentation fault" in upgrade:giant-giant-distro-basic-multi run
- Moved client.0 to a separate node, testing now
https://github.com/ceph/ceph-qa-suite/pull/232
- 09:57 AM Bug #10016: "Segmentation fault" in upgrade:giant-giant-distro-basic-multi run
- ...
- 05:20 PM CephFS Bug #10025 (Resolved): Journal undump causes MDS to crash when start pos is not on object boundary
- Merged into next in commit:69be8e9b30c18e47c17ff7dafc4ac8fbe00d48e7, and the appropriate backport bits were merged la...
- 04:34 PM rgw Feature #9359 (Resolved): rgw: Export user stats in get-user-info Adminops API
- 04:21 PM rgw Bug #9907 (Pending Backport): radosgw-admin: can't disable max_size quota
- 04:13 PM rgw Feature #8911 (Pending Backport): RGW doesn't return 'x-timestamp' in header which is used by 'Vi...
- 04:09 PM Bug #10059: osd/ECBackend.cc: 876: FAILED assert(0)
- This bug makes me cry as it is the reason for my cluster to be _completely down_ for over 10 days now... Duplicate ad...
- 03:20 PM Bug #10059 (Resolved): osd/ECBackend.cc: 876: FAILED assert(0)
- -1> 2014-11-09 14:13:01.334410 7f8b93c8b700 10 filestore(/var/lib/ceph/osd/ceph-3) FileStore::read(1.1ds0_head/78...
- 03:59 PM devops Bug #10049: "Failed to fetch package" "rhel7_0-x86_64-basic"
- When I look at the log for http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-rpm-rhel7-amd64-basic/log.cgi?log=6977d02...
- 03:29 PM devops Bug #10049: "Failed to fetch package" "rhel7_0-x86_64-basic"
- Disk space looks ok to me:...
- 10:28 AM devops Bug #10049: "Failed to fetch package" "rhel7_0-x86_64-basic"
- From http://gitbuilder.sepia.ceph.com/gitbuilder-ceph-rpm-rhel7-amd64-basic/log.cgi?log=6977d02f0d31c453cdf554a8f1796...
- 10:03 AM devops Bug #10049: "Failed to fetch package" "rhel7_0-x86_64-basic"
- Needs a link:
http://pulpito.front.sepia.ceph.com/teuthology-2014-11-09_17:18:01-upgrade:firefly-x-next-distro-basic...
- 09:12 AM devops Bug #10049 (Resolved): "Failed to fetch package" "rhel7_0-x86_64-basic"
- Seems widespread on the next run using rhel 7.
Run teuthology-2014-11-09_17:18:01-upgrade:firefly-x-next-distro-basic-...
- 03:40 PM Bug #10057 (In Progress): msgr: skipped message on peer reconnect
- ...
- 01:42 PM Bug #10057 (Can't reproduce): msgr: skipped message on peer reconnect
- ubuntu@teuthology:/a/teuthology-2014-11-09_23:06:01-krbd-next-testing-basic-multi/593102...
- 03:36 PM Feature #9420: erasure-code: tools and archive to check for non regression of encoding
- the backport is needed to generate the content of https://github.com/ceph/ceph-erasure-code-corpus/tree/master/v0.80....
- 03:32 PM Feature #9420 (Pending Backport): erasure-code: tools and archive to check for non regression of ...
- 02:57 PM Feature #9420 (Resolved): erasure-code: tools and archive to check for non regression of encoding
- I don't think this needs to be backported.
- 03:06 PM Bug #10058 (Can't reproduce): next stuck in recovery, no progress
- /a/sage-2014-11-09_07:49:57-rados-next-testing-basic-multi/591906
/a/sage-2014-11-09_07:49:57-rados-next-testing-bas...
- 02:59 PM Bug #9986 (Pending Backport): objecter: map epoch skipping broken
- 02:56 PM Feature #9262 (Resolved): Additional namespace issues
- 02:55 PM Feature #9031 (Resolved): List RADOS namespaces and list all objects in all namespaces
- 02:53 PM Bug #6756 (Pending Backport): journal full hang on startup
- 02:51 PM Bug #9852 (Resolved): mon: monitor asserts on 'ceph mds add_data_pool X' if X is an ID that DNE
- 02:49 PM Bug #9987 (Pending Backport): mon: min_last_epoch_complete tracking broken
- 02:12 PM Bug #10053 (Resolved): ./ceph tell osd.0 injectargs --no-osd_debug_op_order failure
- 11:18 AM Bug #10053 (In Progress): ./ceph tell osd.0 injectargs --no-osd_debug_op_order failure
- ubuntu@teuthology:/a/sage-2014-11-09_07:49:57-rados-next-testing-basic-multi$ teuthology-ls . | grep FAIL
591648 FAI...
- 11:14 AM Bug #10053 (Resolved): ./ceph tell osd.0 injectargs --no-osd_debug_op_order failure
- ubuntu@teuthology:/a/samuelj-2014-11-07_21:48:36-rados-wip-sam-testing-wip-testing-vanilla-fixes-basic-multi/590242
... - 01:40 PM Bug #10018: OSD assertion failure if the hinfo_key xattr is not there (corrupted?) during scrubbing
- * how to use ceph_objectstore_tool https://github.com/ceph/ceph-qa-suite/blob/giant/tasks/ceph_objectstore_tool.py
*...
- 06:20 AM Bug #10018: OSD assertion failure if the hinfo_key xattr is not there (corrupted?) during scrubbing
- The tests should use the same as #9887 which requires https://github.com/ceph/ceph-qa-suite/compare/wip-dzaddscrub
- 01:27 PM Feature #10056 (New): Object metadata mismatch detection and handling
- Possible things we may want to address:
- clone vs head snapshot metadata mismatches
- object metadata vs ondis...
- 01:23 PM Feature #10055 (New): PG metadata corruption detection and handling
- Possible problems we might want to handle:
- missing pg info
- missing pg epoch
- missing pg log
Correct ...
- 01:21 PM Feature #10054 (New): OSD level metadata mismatch handling
- Meta feature for detecting and handling OSD metadata.
Possible directions:
- full osdmap vs incremental mismatch?
- 11:57 AM devops Feature #10046: run make check on every pull request
- Removing myself and clarifying the scope. I would be happy to help with the implementation but I'm not equipped to ta...
- 07:48 AM devops Feature #10046 (Resolved): run make check on every pull request
- And report back on the success / failure, with the logs attached for debugging. The suggested approach is to define a...
- 11:24 AM CephFS Bug #9997: test_client_pin case is failing
- http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-09_23:04:01-fs-next-testing-basic-multi/593068/
- 11:23 AM CephFS Bug #6613: samba is crashing in teuthology
- Still happening: http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-09_23:14:01-samba-next-testing-basic-multi/59...
- 11:13 AM Bug #10052 (Resolved): LibRadosTwoPools[EC]PP.PromoteSnap failure
- ubuntu@teuthology:/a/samuelj-2014-11-07_21:48:36-rados-wip-sam-testing-wip-testing-vanilla-fixes-basic-multi/590439
... - 09:53 AM rbd Bug #10026 (Duplicate): "Assertion: common/Cond.h" in rbd-master-testing-basic-multi run
- #10045
- 09:52 AM Bug #10033 (Won't Fix): ceph pg <pg> query hangs when OSD down, EC PG
- In this case the osd seems to be up (the pg state isn't 'stale'), so this is expected behavior (the osd hasn't respon...
- 09:51 AM rbd Bug #10051 (Won't Fix): kernel-mounted RBD image may block shutdown
- init-rbdmap fails to unmap an RBD image when the latter is still in use.
As a consequence, system shutdown hangs dead w...
- 09:46 AM rgw Bug #9899 (Fix Under Review): Error "coverage ceph osd pool get '' pg_num" in upgrade:dumpling-du...
- Per Sage - removed mon_thrash tests from the rgw/ section, https://github.com/ceph/ceph-qa-suite/pull/230
- 09:30 AM rgw Bug #9899: Error "coverage ceph osd pool get '' pg_num" in upgrade:dumpling-dumpling-distro-basic...
- This bug was fixed in 0.80.3 or 0.80.4. I think we need to make the 'older' tests skip the mon_thrash tests.
- 09:23 AM rgw Bug #9899: Error "coverage ceph osd pool get '' pg_num" in upgrade:dumpling-dumpling-distro-basic...
- Same issue in run http://pulpito.front.sepia.ceph.com/teuthology-2014-11-09_10:00:02-upgrade:dumpling-dumpling-distro...
- 09:19 AM devops Bug #10050 (Rejected): "Segmentation fault" (radosgw-admin) in upgrade:firefly:singleton-firefly-...
- Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-09_17:05:02-upgrade:firefly:singleton-firefly-dist...
- 09:05 AM Bug #10013: "Segmentation fault" in upgrade:dumpling-x-firefly-distro-basic-vps run
- Same issue in http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-09_19:13:01-upgrade:dumpling-x-firefly-distro-ba...
- 08:43 AM Bug #9913: mon: audit log entires for forwarded requests lack info
- session is with the monitor that forwarded the request. there's no auth handler for the session as it is a monitor. ...
- 08:41 AM rbd Bug #10030 (Pending Backport): Crash when attempting to open non-existent parent image
- 08:40 AM Bug #9702: "MaxWhileTries: 'wait_until_healthy'reached maximum tries" in upgrade:firefly-x-giant-...
- Same issue in job http://qa-proxy.ceph.com/teuthology/teuthology-2014-11-09_18:13:01-upgrade:firefly-x-giant-distro-b...
- 08:24 AM Bug #9864 (Can't reproduce): osd doesn't report new stats for 3 hours when running test LibCephFS...
- Not enough info to tell why the client test hung. Let's see if it happens again!
- 08:08 AM Bug #9864: osd doesn't report new stats for 3 hours when running test LibCephFS.MulticlientSimple
- Looking into the osd logs shows that the osds don't report new stats for the ~3 hours because no pgs are updated in tha...
- 07:47 AM Bug #9864: osd doesn't report new stats for 3 hours when running test LibCephFS.MulticlientSimple
- 07:44 AM Bug #9864: osd doesn't report new stats for 3 hours when running test LibCephFS.MulticlientSimple
- Not so weird after all.
Log shows that last log is created because we had some stats to report:... - 07:30 AM Bug #9864: osd doesn't report new stats for 3 hours when running test LibCephFS.MulticlientSimple
- this is not the monitor taking 2 hours to commit. The log snippets above refer to two different proposals: the first...
- 06:08 AM Feature #10044 (New): ECUtil::HashInfoRef should have a NONE value
- So that "ECBackend::get_hash_info":https://github.com/ceph/ceph/blob/giant/src/osd/ECBackend.cc#L1435 can return it i...
- 05:10 AM Bug #10040 (Rejected): install ceph packages broken for firefly
- The problem here is that the machine needs to be properly cleaned up from newer Ceph packages.
It is always proble... - 04:13 AM Bug #8588: In the erasure-coded pool, primary OSD will crash at decoding if any data chunk's size...
- Hi Sam,
Any suggestions on how to fix this issue?
One potential solution is to validate the digest for ea...