Activity
From 08/03/2017 to 09/01/2017
09/01/2017
- 09:25 PM Bug #21218 (Resolved): thrash-eio + bluestore (hangs with unfound objects or read_log_and_missing...
- ...
- 02:02 PM Bug #21203: build_initial_pg_history doesn't update up/acting/etc
- https://github.com/ceph/ceph/pull/17423
- 01:30 PM Backport #20512 (Rejected): kraken: cache tier osd memory high memory consumption
- Kraken is EOL.
- 06:53 AM Bug #21211: 12.2.0,cephfs(meta replica 2, data ec 2+1),ceph-osd coredump
- 12.2.0
create cephfs
meta pool: model : replica 2
data pool: model : ec 2+1
ceph-osd coredump after r...
- 06:49 AM Bug #21211 (Need More Info): 12.2.0,cephfs(meta replica 2, data ec 2+1),ceph-osd coredump
- ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
1: (()+0xa23b21) [0x7fe4a148bb21]
2...
08/31/2017
- 09:38 PM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
- Hi
My institute has a large cluster running Kraken 11.2.1-0 and using EC 8+3, and we believe we have run into this bug...
- 08:45 PM Bug #21121 (Pending Backport): test_health_warnings.sh can fail
- 08:44 PM Bug #21207 (Fix Under Review): bluestore: async deferred_try_submit deadlock
- https://github.com/ceph/ceph/pull/17409
- 08:39 PM Bug #21207 (Resolved): bluestore: async deferred_try_submit deadlock
- In deferred_aio_finish we may need to requeue pending deferred via a finisher. Currently we reuse finishers[0], but ...
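The fix direction described above can be illustrated with a small sketch (illustrative only, not Ceph code): a completion that blocks on work queued behind itself on its own single-threaded finisher would deadlock, while routing the requeued work to a dedicated finisher completes.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative sketch, not Ceph code: reusing the finisher thread that is
# running the completion (the "finishers[0]" pattern) deadlocks if the
# completion blocks on work queued behind itself. A dedicated finisher for
# requeued deferred work avoids that.
io_finisher = ThreadPoolExecutor(max_workers=1)        # runs aio completions
deferred_finisher = ThreadPoolExecutor(max_workers=1)  # dedicated requeue path

def deferred_submit():
    return "submitted"

def aio_finish():
    # Requeue via the dedicated finisher; submitting this to io_finisher
    # and waiting on it here would never complete.
    return deferred_finisher.submit(deferred_submit).result()

result = io_finisher.submit(aio_finish).result()  # "submitted"
```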
- 06:56 PM Bug #21206 (Fix Under Review): thrashosds read error injection doesn't take live_osds into account
- https://github.com/ceph/ceph/pull/17406
- 06:54 PM Bug #21206 (Resolved): thrashosds read error injection doesn't take live_osds into account
- ...
- 04:31 PM Bug #20981: ./run_seed_to_range.sh errored out
- David, the first dead job to appear was http://pulpito.ceph.com/smithfarm-2017-08-21_19:38:42-rados-wip-jewel-backpor...
- 03:21 AM Bug #20981: ./run_seed_to_range.sh errored out
- Are we sure it isn't http://tracker.ceph.com/issues/20613#note-24 ? Because the dead runs here http://pulpito.ceph.c...
- 03:14 AM Bug #20981: ./run_seed_to_range.sh errored out
- I looked at https://github.com/ceph/ceph/pull/15050 and don't see anything that would cause this issue.
- 04:11 PM Bug #21204 (Resolved): DNS SRV default service name not used anymore
- Hi,
I am in the process of upgrading from Kraken to Luminous.
I am using DNS SRV records to lookup MON servers.
...
- 01:56 PM Bug #19605 (Pending Backport): OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front(...
- 01:52 PM Bug #21203 (Resolved): build_initial_pg_history doesn't update up/acting/etc
- The loop doesn't update up/acting/etc values, which means the result is incorrect when there are multiple intervals s...
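As a sketch of the fix (hypothetical names, not the actual OSD code), the interval scan has to refresh the up/acting mapping from every epoch's map rather than reuse values computed once before the loop:

```python
class FakeOSDMap:
    """Toy stand-in for an OSDMap; names are hypothetical."""
    def __init__(self, up, acting):
        self._up, self._acting = up, acting

    def pg_to_up_acting(self, pgid):
        return self._up, self._acting

def build_initial_history(maps, pgid):
    # The reported bug pattern: up/acting computed once before the loop and
    # never refreshed, so only the first interval's values were seen. The
    # fix is to recompute them from each epoch's map.
    intervals = []
    prev = None
    for epoch in sorted(maps):
        up, acting = maps[epoch].pg_to_up_acting(pgid)  # refresh every epoch
        if prev is not None and (up, acting) != prev:
            intervals.append((epoch, up, acting))       # interval boundary
        prev = (up, acting)
    return intervals
```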
- 11:00 AM Bug #20933: All mon nodes down when i use ceph-disk prepare a new osd.
- I think I've hit the similar issue. Occured with 12.1.2 when tried to add host / osd (ceph-deploy osd prepare --dmcry...
- 09:27 AM Feature #21198 (New): Monitors don't handle incomplete network splits
- When the network between two monitors (the minimum rank and the maximum rank) is disconnected, the node with the maximum rank always k...
- 03:30 AM Bug #21194 (New): mon clock skew test is fragile
- The original observed problem is that it failed to detect clock skew in run /a/sage-2017-08-27_02:16:57-rados-wip-sa...
08/30/2017
- 06:10 PM Bug #20981: ./run_seed_to_range.sh errored out
- My money is on https://github.com/ceph/ceph/pull/15050
- 06:07 PM Bug #20981: ./run_seed_to_range.sh errored out
- David, the jewel failure started occurring in the integration branch that included the following PRs: http://tracker....
- 04:56 PM Bug #20981: ./run_seed_to_range.sh errored out
- I reverted https://github.com/ceph/ceph/pull/15947 to see if that would fix it and it did NOT.
- 03:57 PM Backport #21182 (Resolved): luminous: 'osd crush rule rename' not idempotent
- https://github.com/ceph/ceph/pull/17481
- 03:28 PM Bug #21180 (Resolved): Bluestore throttler causes down OSD
- Writing large amount of data to EC RBD pool via NBD causes down OSDs, PGs and drop in traffic due to unhealthy cluste...
- 02:12 PM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- To clarify then: I have not tested this with a replicated cephfs data pool. Only tested with ec data pool as per my 4...
- 01:19 PM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- Martin: just to confirm, you were seeing this crash while you had EC pools involved, and when you do not have any EC ...
- 11:25 AM Bug #21174: OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_update)
- ...
- 06:12 AM Bug #21174 (Rejected): OSD crash: 903: FAILED assert(objiter->second->version > last_divergent_up...
- I've setup a cephfs erasure coded pool on a small cluster consisting of 5 bluestore OSDs.
The pools were created as ...
- 01:33 PM Bug #20871 (Fix Under Review): core dump when bluefs's mkdir returns -EEXIST
- 01:33 PM Bug #20871: core dump when bluefs's mkdir returns -EEXIST
- https://github.com/ceph/ceph/pull/17357
08/29/2017
- 09:43 PM Bug #21171 (Fix Under Review): bluestore: aio submission deadlock
- https://github.com/ceph/ceph/pull/17352
- 02:47 PM Bug #21171 (Resolved): bluestore: aio submission deadlock
- - thread a holds deferred_submit_lock, blocks on aio submission (queue is full)
- thread b holds deferred_lock, bloc...
- 08:58 PM Bug #21162 (Pending Backport): 'osd crush rule rename' not idempotent
- 10:46 AM Bug #21162 (Fix Under Review): 'osd crush rule rename' not idempotent
- https://github.com/ceph/ceph/pull/17329
- 07:38 PM Documentation #20486 (Resolved): Document how to use bluestore compression
- 04:11 PM Bug #21143: bad RESETSESSION between OSDs?
- Haomai Wang wrote:
> https://github.com/ceph/ceph/pull/16009
>
> this pr gives a brief about reason. it's really ...
- 03:08 PM Bug #21092: OSD sporadically starts reading at 100% of ssd bandwidth
- Seems that this is a side effect of too small a value for bluestore_cache_size.
We set it to 50M to reduce osd memory consump...
- 07:36 AM Bug #20981: ./run_seed_to_range.sh errored out
- This is occurring in the current jewel branch now too:
https://github.com/ceph/ceph/pull/17317#issuecomment-325580432
- 07:06 AM Backport #16239 (Resolved): 'ceph tell osd.0 flush_pg_stats' fails in rados qa run
- h3. description...
- 03:00 AM Bug #21165 (Can't reproduce): 2 pgs stuck in unknown during thrashing
- ...
08/28/2017
- 10:20 PM Bug #21162 (Resolved): 'osd crush rule rename' not idempotent
- ...
- 06:15 PM Backport #21150: jewel: tests: btrfs copy_clone returns errno 95 (Operation not supported)
- Is this causing job failures? I'm having trouble finding anything indicating this would be fatal without an actual I...
- 08:06 AM Backport #21150 (Resolved): jewel: tests: btrfs copy_clone returns errno 95 (Operation not suppor...
- https://github.com/ceph/ceph/pull/18165
- 01:55 AM Bug #21016 (Resolved): CRUSH crash on bad memory handling
- 01:54 AM Backport #21106 (Resolved): luminous: CRUSH crash on bad memory handling
- https://github.com/ceph/ceph/pull/17214
08/27/2017
- 05:59 PM Bug #21147 (Resolved): Manager daemon x is unresponsive. No standby daemons available
- /a/sage-2017-08-26_20:38:41-rados-luminous-distro-basic-smithi/1567938
The last time I looked this appeared to be ...
- 04:04 PM Bug #20924: osd: leaked Session on osd.7
- /a/sage-2017-08-26_20:38:41-rados-luminous-distro-basic-smithi/1568055
- 04:30 AM Bug #21143: bad RESETSESSION between OSDs?
- https://github.com/ceph/ceph/pull/16009
this pr gives a brief about reason. it's really rare, so I don't do it imm...
- 02:16 AM Backport #21076 (Resolved): luminous: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools()....
- 02:15 AM Backport #21095 (Resolved): osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_delete()
08/26/2017
- 06:14 PM Bug #20785 (Resolved): osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(pgid.pool...
- 06:13 PM Bug #20913 (Resolved): osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_delete()
- 06:08 PM Bug #21144 (Resolved): daemon-helper: command crashed with signal 1
- ...
- 05:56 PM Bug #21143 (Duplicate): bad RESETSESSION between OSDs?
- osd.5...
- 12:11 PM Bug #21142: OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
- I uploaded more logs and info files with ceph-post-file
f27fb8a5-baae-4f04-8353-d3b2b314c61a
- 11:56 AM Bug #21142 (Won't Fix): OSD crashes when loading pgs with "FAILED assert(interval.last > last)"
- after upgrading to luminous 12.1.4 rc
we saw several osds crashing with below logs.
the cluster was unhealthy when ...
- 01:06 AM Bug #20981: ./run_seed_to_range.sh errored out
- Stack trace from core dump doesn't include a stack with _inject_failure() in it.
For core dump in /a/kchai-2017-08...
08/25/2017
- 08:02 PM Backport #21133 (Resolved): luminous: osd/PrimaryLogPG: sparse read won't trigger repair correctly
- https://github.com/ceph/ceph/pull/17475
- 08:02 PM Backport #21132 (Resolved): luminous: qa/standalone/scrub/osd-scrub-repair.sh timeout
- https://github.com/ceph/ceph/pull/17264
- 07:46 PM Bug #21127: qa/standalone/scrub/osd-scrub-repair.sh timeout
- https://github.com/ceph/ceph/pull/17264
- 07:44 PM Bug #21127 (Pending Backport): qa/standalone/scrub/osd-scrub-repair.sh timeout
- 03:01 PM Bug #21127: qa/standalone/scrub/osd-scrub-repair.sh timeout
- We need to backport fe81b7e3a5034ce855303f93f3e413f3f2dc74a8 and this change together to luminous.
- 02:59 PM Bug #21127: qa/standalone/scrub/osd-scrub-repair.sh timeout
- Caused by:
commit fe81b7e3a5034ce855303f93f3e413f3f2dc74a8
Author: huanwen ren <ren.huanwen@zte.com.cn>
Date: ...
- 01:46 PM Bug #21127 (Fix Under Review): qa/standalone/scrub/osd-scrub-repair.sh timeout
- https://github.com/ceph/ceph/pull/17258
- 01:44 PM Bug #21127 (Resolved): qa/standalone/scrub/osd-scrub-repair.sh timeout
- ...
- 03:44 PM Bug #21130 (Can't reproduce): "FAILED assert(bh->last_write_tid > tid)" in powercycle-master-test...
- Run: http://pulpito.ceph.com/yuriw-2017-08-24_22:38:48-powercycle-master-testing-basic-smithi/
Job: 1560682
Logs: h...
- 03:34 PM Backport #20781 (Fix Under Review): kraken: ceph-osd: PGs getting stuck in scrub state, stalling RBD
- 03:33 PM Backport #20781: kraken: ceph-osd: PGs getting stuck in scrub state, stalling RBD
- https://github.com/ceph/ceph/pull/17261
- 03:22 PM Backport #20780 (Fix Under Review): jewel: ceph-osd: PGs getting stuck in scrub state, stalling RBD
- 03:09 PM Bug #21123 (Pending Backport): osd/PrimaryLogPG: sparse read won't trigger repair correctly
- 03:08 PM Bug #21129 (New): 'ceph -s' hang
- ...
- 12:11 PM Backport #21076 (In Progress): luminous: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools...
- https://github.com/ceph/ceph/pull/17257
- 10:28 AM Bug #21092: OSD sporadically starts reading at 100% of ssd bandwidth
- Another stack trace that leads to pread same size and same offset:...
- 09:20 AM Bug #21092: OSD sporadically starts reading at 100% of ssd bandwidth
- Stacktrace of thread performing reads of 2445312 bytes from offset 96117329920 ...
- 10:19 AM Bug #20188 (New): filestore: os/filestore/FileStore.h: 357: FAILED assert(q.empty()) from ceph_te...
- /a//kchai-2017-08-25_08:38:31-rados-wip-kefu-testing-distro-basic-smithi/1561884...
- 06:35 AM Bug #20785 (Fix Under Review): osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(p...
- /a//joshd-2017-08-25_00:03:46-rados-wip-dup-perf-distro-basic-smithi/1560728/ mon.c
- 02:40 AM Backport #21095: osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_delete()
- should backport https://github.com/ceph/ceph/pull/17246 also.
- 02:38 AM Bug #20913: osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_delete()
- https://github.com/ceph/ceph/pull/17246
- 02:09 AM Bug #20876: BADAUTHORIZER on mgr, hung ceph tell mon.*
- /a/sage-2017-08-24_17:38:40-rados-wip-sage-testing2-luminous-20170824a-distro-basic-smithi/1560473
08/24/2017
- 11:57 PM Bug #21123 (Resolved): osd/PrimaryLogPG: sparse read won't trigger repair correctly
- master PR: https://github.com/ceph/ceph/pull/17221
- 09:59 PM Bug #21121 (Fix Under Review): test_health_warnings.sh can fail
- https://github.com/ceph/ceph/pull/17244
- 09:55 PM Bug #21121: test_health_warnings.sh can fail
- I believe the fix is to subscribe to osdmaps when in the waiting for healthy state. if we are unhealthy because we a...
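A toy model of the proposed fix (all names hypothetical): while in the waiting-for-healthy state, keep renewing the osdmap subscription so newer maps keep arriving.

```python
class FakeMonClient:
    """Stand-in for the monitor client; purely illustrative."""
    def __init__(self):
        self.subscriptions = []
        self.epoch = 10

    def subscribe(self, what):
        self.subscriptions.append(what)
        self.epoch += 1  # pretend a newer map arrives in response

def waiting_for_healthy_tick(mon):
    # The fix sketched above: without renewing the subscription the OSD can
    # sit on a stale map and never notice it was marked down (or back up).
    mon.subscribe("osdmap")
    return mon.epoch

mon = FakeMonClient()
epoch = waiting_for_healthy_tick(mon)  # epoch advances past 10
```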
- 09:54 PM Bug #21121 (Resolved): test_health_warnings.sh can fail
- - test_mark_all_but_last_osds_down marks all but one osd down
- clears noup
- osd.1 fails the is_healthy check beca...
- 07:25 PM Bug #20770: test_pidfile.sh test is failing 2 places
- This problem still hasn't been solved. The test is disabled, so moving back to verified.
- 07:23 PM Bug #20770 (Resolved): test_pidfile.sh test is failing 2 places
- luminous backport rejected because the test continued to fail
- 07:22 PM Bug #20975 (Resolved): test_pidfile.sh is flaky
- luminous backport: https://github.com/ceph/ceph/pull/17241
- 05:50 PM Feature #18206: osd: osd_scrub_during_recovery only considers primary, not replicas
- Nathan Cutler wrote:
> @Vikhyat, I think Abhi just created the luminous backport tracker manually. The jewel one wil...
- 05:22 PM Feature #18206: osd: osd_scrub_during_recovery only considers primary, not replicas
- @Vikhyat, I think Abhi just created the luminous backport tracker manually. The jewel one will be created automagical...
- 04:36 PM Feature #18206: osd: osd_scrub_during_recovery only considers primary, not replicas
- Thanks Nathan. I think there was some issue and it did not create a tracker for the jewel backport, so I removed luminous so it can ...
- 03:56 PM Feature #18206: osd: osd_scrub_during_recovery only considers primary, not replicas
- Verified that both commits from https://github.com/ceph/ceph/pull/17039 were cherry-picked to luminous.
- 05:25 PM Backport #21117 (Resolved): jewel: osd: osd_scrub_during_recovery only considers primary, not rep...
- https://github.com/ceph/ceph/pull/17815
- 05:23 PM Bug #21092: OSD sporadically starts reading at 100% of ssd bandwidth
- 59.log more obviously shows the issue with repeating part:...
- 10:10 AM Bug #21092 (New): OSD sporadically starts reading at 100% of ssd bandwidth
- luminous v12.1.4
bluestore
Periodically (10 mins) some osd starts reading ssd disk at maximum available speed (45...
- 05:22 PM Backport #21106 (Resolved): luminous: CRUSH crash on bad memory handling
- 03:54 PM Bug #21096 (New): osd-scrub-repair.sh:381: unfound_erasure_coded: return 1
- ...
- 03:31 PM Backport #21095 (In Progress): osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_del...
- https://github.com/ceph/ceph/pull/17233
- 03:30 PM Backport #21095 (Resolved): osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_delete()
- ...
- 03:14 PM Bug #20913 (Pending Backport): osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_del...
- 08:12 AM Bug #19605 (Fix Under Review): OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front(...
- https://github.com/ceph/ceph/pull/17217
- 06:58 AM Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
- although all ops in repop_queue are canceled upon pg reset (change), and pg discards messages from down OSDs accordin...
- 03:46 AM Bug #20785 (Resolved): osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(pgid.pool...
- 03:46 AM Backport #21090 (Resolved): osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(pgid...
- https://github.com/ceph/ceph/pull/17191
- 03:44 AM Feature #20956 (Resolved): Include front/back interface names in OSD metadata
- 03:36 AM Bug #20970 (Resolved): bug in function reweight_by_utilization
- 03:13 AM Feature #21073: mgr: ceph/rgw: show hostnames and ports in ceph -s status output
- ...
- 03:04 AM Backport #21076 (Resolved): luminous: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools()....
- 03:03 AM Backport #21048 (Resolved): luminous: Include front/back interface names in OSD metadata
- 03:02 AM Backport #21077 (Resolved): luminous: osd: osd_scrub_during_recovery only considers primary, not ...
- 03:02 AM Backport #21079 (Resolved): bug in function reweight_by_utilization
- 12:30 AM Bug #21016 (Pending Backport): CRUSH crash on bad memory handling
08/23/2017
- 11:02 PM Bug #20730: need new OSD_SKEWED_USAGE implementation
- I've created 2 pull requests for Jewel and Kraken to disable this now.
Jewel: https://github.com/ceph/ceph/pull/172...
- 08:52 PM Bug #14115: crypto: race in nss init
- Still seeing this in Jewel 10.2.7, Ubuntu 16.04.2 running an application using ceph under Apache:...
- 06:33 PM Bug #21016: CRUSH crash on bad memory handling
- 05:27 PM Bug #18209 (Resolved): src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size())
- 05:00 PM Backport #20965 (Resolved): luminous: src/common/LogClient.cc: 310: FAILED assert(num_unsent <= l...
- 01:46 PM Backport #20965 (In Progress): luminous: src/common/LogClient.cc: 310: FAILED assert(num_unsent <...
- 04:09 PM Feature #21084 (Resolved): auth: add osd auth caps based on pool metadata
- Add pool-metadata based auth caps. The initial use case is CephFS; if pools are tagged based on filesystem, then auth...
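The idea can be sketched as a predicate over pool application tags (illustrative structure only; the actual cap grammar lives in the Ceph auth code):

```python
def cap_allows(cap, pool_app_tags):
    """Illustrative only: a cap scoped to an application tag matches any
    pool whose application metadata carries that key/value, so pools added
    to the filesystem later are covered without re-issuing caps."""
    tags = pool_app_tags.get(cap["app"], {})
    return tags.get(cap["key"]) == cap["value"]

# Hypothetical example: a pool tagged for filesystem "fs_a" under the
# "cephfs" application, and a client cap restricted to that tag.
pool_tags = {"cephfs": {"data": "fs_a"}}
client_cap = {"app": "cephfs", "key": "data", "value": "fs_a"}
```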
- 01:48 PM Backport #21079 (In Progress): bug in function reweight_by_utilization
- 01:47 PM Backport #21079 (Resolved): bug in function reweight_by_utilization
- https://github.com/ceph/ceph/pull/17198
- 01:37 PM Backport #21051 (In Progress): luminous: Improve size scrub error handling and ignore system attr...
- 01:30 PM Backport #21077 (In Progress): luminous: osd: osd_scrub_during_recovery only considers primary, n...
- 01:27 PM Backport #21077 (Resolved): luminous: osd: osd_scrub_during_recovery only considers primary, not ...
- https://github.com/ceph/ceph/pull/17195
- 01:26 PM Backport #21048 (In Progress): luminous: Include front/back interface names in OSD metadata
- 01:02 PM Backport #21076 (In Progress): luminous: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools...
- https://github.com/ceph/ceph/pull/17191
- 12:59 PM Backport #21076 (Resolved): luminous: osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools()....
- https://github.com/ceph/ceph/pull/17191
- 10:20 AM Bug #16553: Removing Writeback Cache Tier Does not clean up Incomplete_Clones
- It looks like I hit the same issue on 10.2.9.
- 08:37 AM Bug #20913 (Fix Under Review): osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_del...
- https://github.com/ceph/ceph/pull/17183
- 08:23 AM Feature #21073 (Resolved): mgr: ceph/rgw: show hostnames and ports in ceph -s status output
- Similar to the way we do mds and mgr statuses, we could display the rgw endpoints in ceph status as well, the informa...
- 05:26 AM Bug #20909: Error ETIMEDOUT: crush test failed with -110: timed out during smoke test (5 seconds)
- see also https://github.com/ceph/ceph/pull/17179
08/22/2017
- 11:30 PM Bug #20909 (Fix Under Review): Error ETIMEDOUT: crush test failed with -110: timed out during smo...
- https://github.com/ceph/ceph/pull/17169
- 04:43 PM Bug #20770: test_pidfile.sh test is failing 2 places
- Another change is needed too. I've requested that in the pull request.
https://github.com/ceph/ceph/pull/17052 sh...
- 04:21 PM Bug #20770: test_pidfile.sh test is failing 2 places
- David Zafman wrote:
> To backport all the test-pidfile.sh cherry-pick 4 pull requests using the sha1s in this order:...
- 04:26 PM Bug #20981: ./run_seed_to_range.sh errored out
- See also here =>
http://qa-proxy.ceph.com/teuthology/yuriw-2017-08-22_14:54:54-rados-wip-yuri-testing_2017_8_22-di...
- 03:18 PM Bug #20981: ./run_seed_to_range.sh errored out
- David, can you take a look? This seems to be showing up pretty consistently in rados runs.
- 04:23 PM Bug #20975 (Duplicate): test_pidfile.sh is flaky
- 03:39 PM Bug #20785 (Pending Backport): osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(p...
- 02:50 PM Feature #18206 (Pending Backport): osd: osd_scrub_during_recovery only considers primary, not rep...
- 01:18 PM Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
- ...
- 01:17 PM Bug #21016 (Fix Under Review): CRUSH crash on bad memory handling
- 06:09 AM Bug #20970 (Pending Backport): bug in function reweight_by_utilization
08/21/2017
- 11:53 PM Bug #15741: librados get_last_version() doesn't return correct result after aio completion
- This bug still exists.
- 10:34 PM Bug #19487 (Closed): "GLOBAL %RAW USED" of "ceph df" is not consistent with check_full_status
- Reopen this if the issue hasn't been fixed in the latest code, with the understanding that each OSD has its own fullness d...
- 04:14 PM Backport #21051 (Resolved): luminous: Improve size scrub error handling and ignore system attrs i...
- https://github.com/ceph/ceph/pull/17196
- 04:13 PM Backport #21048 (Resolved): luminous: Include front/back interface names in OSD metadata
- https://github.com/ceph/ceph/pull/17193
- 03:54 PM Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
- # osd.1 sent failure report of osd.0
# osd.1 sent repop 5386 to osd.0
# mon.a marked osd.0 down in osdmap.27
# osd...
- 03:52 PM Bug #17138 (Resolved): crush: inconsistent ruleset/ruled_id are difficult to figure out
- 07:44 AM Bug #20981: ./run_seed_to_range.sh errored out
- /a//kchai-2017-08-21_01:51:35-rados-master-distro-basic-smithi/1545907/teuthology.log has debug heartbeatmap = 20.
<...
- 03:57 AM Bug #20896: export_diff relies on clone_overlap, which is lost when cache tier is enabled
- Hi, everyone.
I've found that the reason that clone overlap modifications should pass "is_present_clone" condition...
- 02:04 AM Bug #20909: Error ETIMEDOUT: crush test failed with -110: timed out during smoke test (5 seconds)
- /a//kchai-2017-08-20_09:42:12-rados-wip-kefu-testing-distro-basic-mira/1545387/
08/18/2017
- 11:20 PM Bug #20770: test_pidfile.sh test is failing 2 places
To backport all the test-pidfile.sh fixes, cherry-pick 4 pull requests using the sha1s in this order:
https://github.co...
- 11:08 PM Feature #18206: osd: osd_scrub_during_recovery only considers primary, not replicas
- https://github.com/ceph/ceph/pull/17039
- 09:34 AM Bug #20981: ./run_seed_to_range.sh errored out
- /a/kchai-2017-08-18_03:03:28-rados-master-distro-basic-mira/1537335...
- 03:12 AM Bug #20243 (Pending Backport): Improve size scrub error handling and ignore system attrs in xattr...
- https://github.com/ceph/ceph/pull/16407
08/17/2017
- 09:47 PM Bug #20332 (Won't Fix): rados bench seq option doesn't work
- 06:01 PM Feature #18206 (Fix Under Review): osd: osd_scrub_during_recovery only considers primary, not rep...
- 02:55 PM Bug #20970 (Fix Under Review): bug in function reweight_by_utilization
- 11:19 AM Bug #20970: bug in funciton reweight_by_utilization
- https://github.com/ceph/ceph/pull/17064
- 12:14 PM Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
- excerpt of osd.0.log...
- 11:31 AM Bug #20785 (Fix Under Review): osd/osd_types.cc: 3574: FAILED assert(lastmap->get_pools().count(p...
- https://github.com/ceph/ceph/pull/17065
- 07:52 AM Bug #21016: CRUSH crash on bad memory handling
- I believe this should be fixed by https://github.com/ceph/ceph/pull/17014/commits/6252068ec08c66513e5394188b786978236...
08/16/2017
- 10:34 PM Bug #21016: CRUSH crash on bad memory handling
- ...and this was also responsible for at least a couple failures that got detected as such.
- 10:15 PM Bug #21016 (Resolved): CRUSH crash on bad memory handling
- ...
- 12:04 PM Feature #18206: osd: osd_scrub_during_recovery only considers primary, not replicas
- david, i just read your inquiry over IRC. what would you want me to review for this ticket? do we have a PR for it al...
- 01:48 AM Bug #21005 (New): mon: mon_osd_down_out interval can prompt osdmap creation when nothing is happe...
- I saw a cluster where we had the whole gamut of no* flags set in an attempt to stop it creating maps.
Unfortunatel...
08/15/2017
- 03:40 PM Bug #20416: "FAILED assert(osdmap->test_flag((1<<15)))" (sortbitwise) on upgraded cluster
- Hello,
sorry for the delay
Yes, it appears under flags.... - 01:22 AM Bug #20770 (Pending Backport): test_pidfile.sh test is failing 2 places
08/14/2017
- 10:14 PM Feature #18206 (In Progress): osd: osd_scrub_during_recovery only considers primary, not replicas
- 09:00 PM Bug #20999 (New): rados python library does not document omap API
- The omap API can be fairly important for RADOS applications but it is not documented in the expected location http://...
- 08:32 PM Backport #17445: jewel: list-snap cache tier missing promotion logic (was: rbd cli segfault when ...
- Note: bug is not present in master, as demonstrated by https://github.com/ceph/ceph/pull/17017
- 08:31 PM Backport #17445 (In Progress): jewel: list-snap cache tier missing promotion logic (was: rbd cli ...
- h3. description
In our ceph cluster some rbd images (created by openstack) make rbd segfault. This is on a ubuntu 1...
- 10:48 AM Backport #17445: jewel: list-snap cache tier missing promotion logic (was: rbd cli segfault when ...
- The pull request https://github.com/ceph/ceph/pull/17017
- 10:46 AM Backport #17445: jewel: list-snap cache tier missing promotion logic (was: rbd cli segfault when ...
- Hi, everyone.
I've just added a new list-snaps test, #17017, which can test whether this problem exists in master br...
- 07:40 PM Bug #20770 (Fix Under Review): test_pidfile.sh test is failing 2 places
- 01:55 PM Bug #20985 (Resolved): PG which marks divergent_priors causes crash on startup
- Several other confirmations and a healthy test run later, all merged!
08/13/2017
- 07:20 PM Feature #14527: Lookup monitors through DNS
- The recent code doesn't support IPv6, apparently. Maybe we can choose among ns_t_a and ns_t_aaaa according to conf->m...
- 07:01 PM Bug #20939 (Resolved): crush weight-set + rm-device-class segv
- 06:59 PM Bug #20876: BADAUTHORIZER on mgr, hung ceph tell mon.*
- /a/sage-2017-08-12_21:09:40-rados-wip-sage-testing-20170812a-distro-basic-smithi/1518429...
- 09:17 AM Bug #20985: PG which marks divergent_priors causes crash on startup
- Stephan Hohn wrote:
> I can confirm that this build worked on my test cluster. It's back to HEALTH_OK and all OSDs a...
- 09:17 AM Bug #20985: PG which marks divergent_priors causes crash on startup
- I can confirm that this build worked on my test cluster. It's back to HEALTH_OK and all OSDs are up.
08/12/2017
- 06:08 PM Bug #20910: spurious MON_DOWN, apparently slow/laggy mon
- /a/sage-2017-08-11_21:54:20-rados-luminous-distro-basic-smithi/1512264
I'm going to whitelist this on luminous bra...
- 05:31 PM Bug #20985: PG which marks divergent_priors causes crash on startup
- If anyone wants to validate that the fix packages at https://shaman.ceph.com/repos/ceph/wip-20985-divergent-handling-...
- 09:19 AM Bug #20985: PG which marks divergent_priors causes crash on startup
- Facing the same issue upgrading from jewel 10.2.9 -> luminous 12.1.3 (RC)
- 02:55 AM Bug #20923 (Resolved): ceph-12.1.1/src/os/bluestore/BlueStore.cc: 2630: FAILED assert(last >= start)
- 02:35 AM Bug #20983 (Resolved): bluestore: failure to dirty src onode on clone with 1-byte logical extent
08/11/2017
- 10:49 PM Bug #20986 (Can't reproduce): segv in crush_destroy_bucket_straw2 on rados/standalone/misc.yaml
- ...
- 10:45 PM Bug #20909: Error ETIMEDOUT: crush test failed with -110: timed out during smoke test (5 seconds)
- ...
- 10:43 PM Bug #20985: PG which marks divergent_priors causes crash on startup
- Luminous at https://github.com/ceph/ceph/pull/17001
- 10:20 PM Bug #20985: PG which marks divergent_priors causes crash on startup
- https://github.com/ceph/ceph/pull/17000
Still compiling, testing, etc
- 10:16 PM Bug #20985 (Resolved): PG which marks divergent_priors causes crash on startup
- This was noticed in the course of somebody upgrading from 12.1.1 to 12.1.2:...
- 10:14 PM Bug #20910: spurious MON_DOWN, apparently slow/laggy mon
- /a/sage-2017-08-11_17:22:37-rados-wip-sage-testing-20170811a-distro-basic-smithi/1511996
- 10:12 PM Bug #20959: cephfs application metadata not set by ceph.py
- https://github.com/ceph/ceph/pull/16954
- 02:29 AM Bug #20959 (Resolved): cephfs application metadata not set by ceph.py
- 05:36 PM Bug #20770: test_pidfile.sh test is failing 2 places
- 05:34 AM Bug #20770 (In Progress): test_pidfile.sh test is failing 2 places
- 04:46 PM Bug #20983: bluestore: failure to dirty src onode on clone with 1-byte logical extent
- https://github.com/ceph/ceph/pull/16994
- 04:45 PM Bug #20983 (Resolved): bluestore: failure to dirty src onode on clone with 1-byte logical extent
- symptom is...
- 04:27 PM Bug #20981: ./run_seed_to_range.sh errored out
- Super weird.. looks like a race between heartbeat timeout and a failure injection maybe?...
- 01:26 PM Bug #20981 (Can't reproduce): ./run_seed_to_range.sh errored out
- ...
- 01:00 PM Bug #20974 (Fix Under Review): osd/PG.cc: 3377: FAILED assert(r == 0) (update_snap_map remove fails)
- https://github.com/ceph/ceph/pull/16982
08/10/2017
- 07:59 PM Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
- Yes, but osd.0 doing that is very incorrect. We've had some problems in this area before with marking stuff down not ...
- 10:20 AM Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
- greg, osd.0 failed to send the reply of tid 5386 over the wire because it was disconnected. but it managed to send th...
- 07:41 PM Bug #20975: test_pidfile.sh is flaky
- https://github.com/ceph/ceph/pull/16977
- 07:41 PM Bug #20975 (Resolved): test_pidfile.sh is flaky
- fails regularly on make check. disabling it for now.
- 04:41 PM Bug #20939: crush weight-set + rm-device-class segv
- 04:15 PM Feature #20956 (Pending Backport): Include front/back interface names in OSD metadata
- 04:12 PM Bug #20949 (Resolved): mon: quorum incorrectly believes mon has kraken (not jewel) features
- 03:49 PM Bug #20896: export_diff relies on clone_overlap, which is lost when cache tier is enabled
- Moving this back to RADOS -- changing librbd to force a full object diff if an object exists in the cache tier seems ...
- 02:16 PM Bug #20974 (Can't reproduce): osd/PG.cc: 3377: FAILED assert(r == 0) (update_snap_map remove fails)
- ...
- 01:33 PM Bug #20958 (Resolved): missing set lost during upgrade
- also backported
- 01:23 PM Bug #20973 (Can't reproduce): src/osdc/ Objecter.cc: 3106: FAILED assert(check_latest_map_ops.fin...
- ...
- 07:04 AM Bug #20970 (Resolved): bug in function reweight_by_utilization
- There is one bug in function OSDMonitor::reweight_by_utilization ...
08/09/2017
- 09:34 PM Bug #20798 (Need More Info): LibRadosLockECPP.LockExclusiveDurPP gets EEXIST
- Logs from the ClsLock unittest clearly show that there is a race in the test and it tries to take the lock again befo...
- 09:15 PM Bug #20959 (In Progress): cephfs application metadata not set by ceph.py
- So far I've identified three problems in the source:
1) we don't check that we're in luminous mode before the MDS se...
- 07:57 PM Bug #20959: cephfs application metadata not set by ceph.py
- As I reported in #20891 I am seeing this on fresh luminous clusters.
- 07:56 PM Bug #20959: cephfs application metadata not set by ceph.py
- Okay, unlike the previous log I looked at, the "fs new" command is clearly *not* triggering a new osd map commit. We ...
- 07:53 PM Bug #20959: cephfs application metadata not set by ceph.py
- Hmm, this still doesn't make sense. The cluster started out as luminous and so the maps would always have the luminou...
- 04:19 PM Bug #20959: cephfs application metadata not set by ceph.py
- The bug I hit before was doing the right checks on encoding, *but* the pending_inc was applied to the in-memory mon c...
- 03:29 PM Bug #20959: cephfs application metdata not set by ceph.py
- We're encoding with the quorum features, though, so I don't think that could actually cause a problem, Maybe though.
- 03:23 PM Bug #20959: cephfs application metdata not set by ceph.py
- Sage was right, the MDSMonitor unconditionally calls do_application_enable() and that unconditionally sets applicatio...
- 03:06 PM Bug #20959 (Resolved): cephfs application metdata not set by ceph.py
- "2017-08-09 06:52:11.115593 mon.a mon.0 172.21.15.12:6789/0 154 : cluster [WRN] Health check failed: application not ...
- 07:54 PM Bug #20920 (Resolved): pg dump fails during point-to-point upgrade
- 07:26 PM Bug #20920: pg dump fails during point-to-point upgrade
- https://github.com/ceph/ceph/pull/16871
- 07:54 PM Backport #20963 (Resolved): luminous: pg dump fails during point-to-point upgrade
- Manually cherry-picked to luminous ahead of the 12.2.0 release.
- 06:32 PM Backport #20963 (Resolved): luminous: pg dump fails during point-to-point upgrade
- 07:33 PM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
- I'm not really sure how we could reasonably handle this scenario on the Ceph side. Seems like we should adjust the te...
- 07:06 PM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
- meanwhile on osd.2, start is...
- 06:46 PM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
- second write to the object sets uv482...
- 06:09 PM Bug #20960 (Can't reproduce): ceph_test_rados: mismatched version (due to pg import/export)
- ...
- 07:20 PM Bug #20947 (Resolved): OSD and mon scrub cluster log messages are too verbose
- 09:48 AM Bug #20947 (Pending Backport): OSD and mon scrub cluster log messages are too verbose
- 07:20 PM Backport #20961 (Resolved): luminous: OSD and mon scrub cluster log messages are too verbose
- Manually cherry-picked to luminous branch.
- 06:32 PM Backport #20961 (Resolved): luminous: OSD and mon scrub cluster log messages are too verbose
- 06:34 PM Backport #20965 (Resolved): luminous: src/common/LogClient.cc: 310: FAILED assert(num_unsent <= l...
- https://github.com/ceph/ceph/pull/17197
- 06:19 PM Bug #20958: missing set lost during upgrade
- 06:14 PM Bug #20958: missing set lost during upgrade
- 05:47 PM Bug #20958: missing set lost during upgrade
- 04:17 PM Bug #20958: missing set lost during upgrade
- It looks like a bug in the jewel->luminous conversion:
* jewel doesn't save the missing set
* luminous detects th...
- 02:12 PM Bug #20958: missing set lost during upgrade
- osd.3 sends empty missing to primary at...
- 01:50 PM Bug #20958 (Resolved): missing set lost during upgrade
- pg 4.3...
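The jewel->luminous conversion issue above can be sketched as follows: an OSD whose missing set was never persisted can reconstruct it by walking the PG log and comparing each logged object version against what is actually on disk. This is illustrative Python under assumed data shapes, not the actual C++ conversion code.

```python
def rebuild_missing(log_entries, local_versions):
    """Rebuild a PG 'missing' set from the PG log (illustrative sketch).

    log_entries:    list of (object_id, version), in increasing version order
    local_versions: dict object_id -> version actually present on disk

    An object is 'missing' when the log says it should be at a newer
    version than the local copy (or the local copy is absent entirely).
    """
    missing = {}
    for obj, ver in log_entries:
        have = local_versions.get(obj)
        if have is None or have < ver:
            missing[obj] = ver  # need: the latest logged version
        else:
            # Local copy is current; drop any stale missing entry.
            missing.pop(obj, None)
    return missing
```

The bug was that the upgrade path lost this information instead of rebuilding it, so objects the primary needed to recover were silently dropped from the set.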
- 05:46 PM Bug #18209 (Pending Backport): src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queu...
- 12:00 PM Bug #20888 (Fix Under Review): "Health check update" log spam
- https://github.com/ceph/ceph/pull/16942
- 11:54 AM Feature #20956: Include front/back interface names in OSD metadata
- https://github.com/ceph/ceph/pull/16941
- 11:52 AM Feature #20956 (Resolved): Include front/back interface names in OSD metadata
- This information is needed by anyone who has a TSDB/dashboard that wants to correlate their NIC statistics with the u...
- 05:28 AM Bug #20952 (Can't reproduce): Glitchy monitor quorum causes spurious test failure
qa/standalone/mon/misc.sh failed in TEST_mon_features()
http://qa-proxy.ceph.com/teuthology/dzafman-2017-08-08_1...
- 02:34 AM Bug #20925 (Resolved): bluestore: bad csum during fsck
08/08/2017
- 10:43 PM Bug #20949 (Resolved): mon: quorum incorrectly believes mon has kraken (not jewel) features
- mon.2 is the last mon to restart:...
- 10:13 PM Bug #20923 (Fix Under Review): ceph-12.1.1/src/os/bluestore/BlueStore.cc: 2630: FAILED assert(las...
- https://github.com/ceph/ceph/pull/16924
- 09:10 PM Bug #20863 (Duplicate): CRC error does not mark PG as inconsistent or queue for repair
- 06:37 PM Bug #20863: CRC error does not mark PG as inconsistent or queue for repair
- This will be available in Luminous, see http://tracker.ceph.com/issues/19657
- 06:57 PM Bug #20947: OSD and mon scrub cluster log messages are too verbose
- https://github.com/ceph/ceph/pull/16916
- 06:56 PM Bug #20947 (Resolved): OSD and mon scrub cluster log messages are too verbose
- ...
- 06:43 PM Bug #20875 (Duplicate): mon segv during shutdown
- 06:16 PM Bug #20645: bluefs wal failed to allocate (assert(0 == "allocate failed... wtf"))
- 06:00 PM Bug #20944 (Fix Under Review): OSD metadata 'backend_filestore_dev_node' is "unknown" even for si...
- https://github.com/ceph/ceph/pull/16913
- 01:17 PM Bug #20944: OSD metadata 'backend_filestore_dev_node' is "unknown" even for simple deployment
- Should have also said: bluestore was populating its bluestore_bdev_dev_node correctly on the same server and drive --...
- 01:16 PM Bug #20944 (Resolved): OSD metadata 'backend_filestore_dev_node' is "unknown" even for simple dep...
OSD created using ceph-deploy "ceph-deploy osd create --filestore", metadata after starting up is:...
- 03:41 PM Bug #19881 (Can't reproduce): ceph-osd: pg_update_log_missing(1.20 epoch 66/11 rep_tid 1493 entri...
- 03:39 PM Bug #20116 (Can't reproduce): osds abort on shutdown with assert(ceph/src/osd/OSD.cc: 4324: FAILE...
- 03:39 PM Bug #20188 (Can't reproduce): filestore: os/filestore/FileStore.h: 357: FAILED assert(q.empty()) ...
- 03:39 PM Bug #15653: crush: low weight devices get too many objects for num_rep > 1
- 03:35 PM Bug #20543: osd/PGLog.h: 1257: FAILED assert(0 == "invalid missing set entry found") in PGLog::re...
- Probably the incorrectly-assessed "out-of-order" op numbers.
- 03:35 PM Bug #20543 (Can't reproduce): osd/PGLog.h: 1257: FAILED assert(0 == "invalid missing set entry fo...
- 03:33 PM Bug #20626 (Can't reproduce): failed to become clean before timeout expired, pgs stuck unknown
- 01:58 PM Bug #20925: bluestore: bad csum during fsck
- https://github.com/ceph/ceph/pull/16900
- 01:19 PM Bug #20925: bluestore: bad csum during fsck
- Deferred writes are completing out of order. This is fallout from ca32d575eb2673737198a63643d5d1923151eba3.
08/07/2017
- 10:43 PM Bug #20919 (Fix Under Review): osd: replica read can trigger cache promotion
- https://github.com/ceph/ceph/pull/16884
- 10:32 PM Bug #20939 (Fix Under Review): crush weight-set + rm-device-class segv
- https://github.com/ceph/ceph/pull/16883
- 08:49 PM Bug #20939 (Resolved): crush weight-set + rm-device-class segv
- Although that is probably just one of many problems; weight-set and device classes don't play well together.
- 07:49 PM Bug #20920 (Pending Backport): pg dump fails during point-to-point upgrade
- 07:02 PM Bug #20933 (Closed): All mon nodes down when I use ceph-disk to prepare a new OSD.
- Sage thinks this has been fixed ("[12:02:12] <sage> oh, it was a problem with the reusing osd ids"). Please update t...
- 07:00 PM Bug #20933: All mon nodes down when I use ceph-disk to prepare a new OSD.
- Apparently this is the result of a typo: https://www.spinics.net/lists/ceph-users/msg37317.html
But I'm not sure t...
- 09:07 AM Bug #20933 (Closed): All mon nodes down when I use ceph-disk to prepare a new OSD.
- ceph version 12.1.0 (262617c9f16c55e863693258061c5b25dea5b086) luminous (dev)
when "ceph-disk prepare --bluestore ...
- 04:51 PM Bug #20923: ceph-12.1.1/src/os/bluestore/BlueStore.cc: 2630: FAILED assert(last >= start)
- Sage Weil wrote:
> [...]
> This object is larger than 32bits (4gb), which bluestore does not allow/support. Why ar...
- 04:36 PM Bug #20923: ceph-12.1.1/src/os/bluestore/BlueStore.cc: 2630: FAILED assert(last >= start)
- ...
- 01:44 PM Bug #20923: ceph-12.1.1/src/os/bluestore/BlueStore.cc: 2630: FAILED assert(last >= start)
- Sage Weil wrote:
> can you reproduce with debug bluestore = 1/30 and attach the resulting log?
Here it comes (obj...
- 01:21 AM Bug #20923 (Need More Info): ceph-12.1.1/src/os/bluestore/BlueStore.cc: 2630: FAILED assert(last ...
- can you reproduce with debug bluestore = 1/30 and attach the resulting log?
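For anyone reproducing this: in Ceph's "debug <subsystem> = a/b" syntax, the first number is the log-file level and the second is the in-memory level that gets dumped on a crash or assert, so "1/30" keeps the log file quiet while capturing detailed history for the crash dump. A minimal ceph.conf fragment:

```ini
[osd]
# log-file level 1, in-memory (crash-dump) level 30
debug bluestore = 1/30
```

The setting can also generally be changed at runtime on a live OSD, e.g. with `ceph tell osd.<id> injectargs '--debug-bluestore 1/30'`.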
- 03:19 PM Bug #20922: misdirected op with localize_reads set
- Well, the issue is not immediately apparent, but _calc_target() is pretty complicated and we're feeding in a not-tota...
- 02:28 PM Bug #20475 (Resolved): EPERM: cannot set require_min_compat_client to luminous: 6 connected clien...
- 02:27 PM Backport #20639 (Resolved): jewel: EPERM: cannot set require_min_compat_client to luminous: 6 con...
- 08:22 AM Tasks #20932 (New): run rocksdb's env_test with our BlueRocksEnv
- 07:41 AM Backport #20930 (Rejected): kraken: assert(i->prior_version == last) when a MODIFY entry follows ...
- 01:16 AM Bug #20133: EnvLibradosMutipoolTest.DBBulkLoadKeysInRandomOrder hangs on rocksdb+librados
- /a/sage-2017-08-06_16:51:13-rados-wip-sage-testing2-20170806a-distro-basic-smithi/1490528
08/06/2017
- 07:08 PM Bug #19191 (Resolved): osd/ReplicatedBackend.cc: 1109: FAILED assert(!parent->get_log().get_missi...
- 07:06 PM Bug #20925 (Resolved): bluestore: bad csum during fsck
- ...
- 07:05 PM Bug #20924 (Resolved): osd: leaked Session on osd.7
- ...
- 07:03 PM Bug #20910: spurious MON_DOWN, apparently slow/laggy mon
- /a/sage-2017-08-06_13:59:55-rados-wip-sage-testing-20170805a-distro-basic-smithi/1490103
seeing a lot of these. - 09:36 AM Bug #20923 (Resolved): ceph-12.1.1/src/os/bluestore/BlueStore.cc: 2630: FAILED assert(last >= start)
- Running 12.1.1 RC1 OSDs, currently doing inline migration to BlueStore (ceph osd destroy procedure). Getting these a...
08/05/2017
- 06:23 PM Bug #20922 (New): misdirected op with localize_reads set
- ...
- 05:47 PM Bug #20770: test_pidfile.sh test is failing 2 places
- This is still failing sometimes in TEST_without_pidfile() even after adding a sleep 1.
- 03:32 PM Bug #20896: export_diff relies on clone_overlap, which is lost when cache tier is enabled
- I did another test: I did some writes to an object "rbd_data.1ebc6238e1f29.0000000000000000" to raise its "HEAD" obje...
- 03:34 AM Bug #20874: osd/PGLog.h: 1386: FAILED assert(miter == missing.get_items().end() || (miter->second...
- This may be a bluestore bug - the log is so large from bluestore debugging that I haven't had time to properly read i...
- 02:32 AM Bug #20843 (Pending Backport): assert(i->prior_version == last) when a MODIFY entry follows an ER...
- Backport only needed for kraken, jewel does not have error log entries.
- 12:03 AM Bug #20920: pg dump fails during point-to-point upgrade
- Do we have a "legacy" command map that matches the pre-luminous ones? I think we just need to use that for the comman...
08/04/2017
- 10:25 PM Bug #20920 (Resolved): pg dump fails during point-to-point upgrade
- Command failed on smithi021 with status 22: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage...
- 09:03 PM Bug #20919: osd: replica read can trigger cache promotion
- a replica was servicing a read and tried to do a cache promotion:...
- 08:53 PM Bug #20919 (Resolved): osd: replica read can trigger cache promotion
- ...
- 07:23 PM Bug #20561 (Can't reproduce): bluestore: segv in _deferred_submit_unlock from deferred_try_submit...
- 06:20 PM Bug #20904 (Resolved): cluster [ERR] 2.e shard 2 missing 2:70b3bf12:::existing_4:head on lost-unf...
- 06:40 AM Bug #20904 (Fix Under Review): cluster [ERR] 2.e shard 2 missing 2:70b3bf12:::existing_4:head on ...
- https://github.com/ceph/ceph/pull/16809
- 12:40 AM Bug #20904 (In Progress): cluster [ERR] 2.e shard 2 missing 2:70b3bf12:::existing_4:head on lost-...
- Think I found the problem, testing a fix.
- 06:17 PM Bug #20913 (Resolved): osd: leak from osd/PGBackend.cc:136 PGBackend::handle_recovery_delete()
- ...
- 06:00 PM Bug #18209 (Fix Under Review): src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queu...
- https://github.com/ceph/ceph/pull/16828
- 03:56 PM Bug #18209: src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size())
- /a/sage-2017-08-04_13:49:55-rbd:singleton-bluestore-wip-sage-testing2-20170803b-distro-basic-mira/1482623...
- 04:04 PM Bug #20295 (Resolved): bluestore: Timeout in tp_osd_tp threads when running RBD bench in EC pool ...
- 01:59 PM Bug #20910 (Resolved): spurious MON_DOWN, apparently slow/laggy mon
- mon shows very slow progress for ~10 seconds, failing to send lease renewals etc, and triggering an election...
- 01:50 PM Bug #20845 (Resolved): Error ENOENT: cannot link item id -16 name 'host2' to location {root=bar}
- 01:46 PM Bug #20909 (Can't reproduce): Error ETIMEDOUT: crush test failed with -110: timed out during smok...
- ...
- 01:37 PM Bug #20908 (Resolved): qa/standalone/misc failure in TEST_mon_features
- ...
- 01:35 PM Bug #20133: EnvLibradosMutipoolTest.DBBulkLoadKeysInRandomOrder hangs on rocksdb+librados
- /a/sage-2017-08-04_05:23:06-rados-wip-sage-testing-20170803-distro-basic-smithi/1481973
- 08:41 AM Bug #20227: os/bluestore/BlueStore.cc: 2617: FAILED assert(0 == "can't mark unloaded shard dirty")
- Hit the same assert in http://qa-proxy.ceph.com/teuthology/joshd-2017-08-04_06:16:52-rados-wip-20904-distro-basic-smi...
- 07:15 AM Bug #20896: export_diff relies on clone_overlap, which is lost when cache tier is enabled
- I mean I think it's the "is_present_clone" condition check that
prevents the clone overlap from recording the client write...
- 04:54 AM Bug #20896: export_diff relies on clone_overlap, which is lost when cache tier is enabled
- Hi, grep :-)
I finally got what you mean in https://github.com/ceph/ceph/pull/16790.
I agree with you in that "...
- 12:58 AM Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
- osd.1 in the posted log has pg 1.4 in epoch 26 from the time it first dequeues those operations right up until it cra...
08/03/2017
- 11:52 PM Bug #20896: export_diff relies on clone_overlap, which is lost when cache tier is enabled
- from irc:
<joshd>:
> I'd suggest making rbd diff conservative when it's used with cache pools (if necessary, repo...
- 11:40 PM Bug #20896: export_diff relies on clone_overlap, which is lost when cache tier is enabled
- > the reason we are submitting the PR is that, when we do export-diff to an rbd image in a pool with a cache tier poo...
- 11:31 PM Bug #20896: export_diff relies on clone_overlap, which is lost when cache tier is enabled
- The reason we are submitting the PR is that, when we do export-diff to an rbd image in a pool with a cache tier pool,...
- 03:00 PM Bug #20896: export_diff relies on clone_overlap, which is lost when cache tier is enabled
- I submitted a pr for this: https://github.com/ceph/ceph/pull/16790
- 02:46 PM Bug #20896 (New): export_diff relies on clone_overlap, which is lost when cache tier is enabled
- Recently we found that, under some circumstances, the cache tier "HEAD" object's clone_overlap can lose some O...
- 11:44 PM Bug #20798 (In Progress): LibRadosLockECPP.LockExclusiveDurPP gets EEXIST
- 08:47 PM Bug #20798: LibRadosLockECPP.LockExclusiveDurPP gets EEXIST
- ...
- 11:28 PM Bug #20871 (In Progress): core dump when bluefs's mkdir returns -EEXIST
- 02:42 PM Bug #20871: core dump when bluefs's mkdir returns -EEXIST
- https://github.com/ceph/ceph/pull/16745/commits/6bb89702c1cae44558480f72c2723f564308f822
- 06:57 PM Bug #20904 (Resolved): cluster [ERR] 2.e shard 2 missing 2:70b3bf12:::existing_4:head on lost-unf...
- ...
- 06:22 PM Bug #20810 (Resolved): fsck finish with 29 errors in 47.732275 seconds
- 06:22 PM Bug #20844 (Resolved): peering_blocked_by_history_les_bound on workloads/ec-snaps-few-objects-ove...
- 02:49 PM Bug #20844 (Fix Under Review): peering_blocked_by_history_les_bound on workloads/ec-snaps-few-obj...
- https://github.com/ceph/ceph/pull/16789
- 01:51 PM Bug #20844: peering_blocked_by_history_les_bound on workloads/ec-snaps-few-objects-overwrites.yaml
- This appears to be a test problem:
- the thrashosds task has 'chance_test_map_discontinuity: 0.5', which will mark an o...
- 09:59 AM Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
- mon.a.log...
- 09:42 AM Bug #19605: OSD crash: PrimaryLogPG.cc: 8396: FAILED assert(repop_queue.front() == repop)
- ...
- 09:05 AM Documentation #20894 (Resolved): rados manpage does not document "cleanup"
- A user writes:...
- 02:46 AM Bug #20295: bluestore: Timeout in tp_osd_tp threads when running RBD bench in EC pool w/ overwrites
- https://github.com/ceph/ceph/pull/16769