Activity
From 02/15/2021 to 03/16/2021
03/16/2021
- 08:07 PM Support #49847 (Closed): OSD Fails to init after upgrading to octopus: _deferred_replay failed to...
- An OSD fails to start after upgrading from mimic 13.2.2 to octopus 15.2.9.
It seems like bluestore first fails at...
- 03:45 PM Bug #49832 (New): Segmentation fault: in thread_name:ms_dispatch
- ...
- 03:22 PM Bug #49781: unittest_mempool.check_shard_select failed
- The test condition should not be too strict because there really is no way to predict the result. It is however good ...
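A quick, hedged way to gauge how often the probabilistic condition trips, assuming a standard cmake build tree (the binary path and gtest filter are assumptions, not taken from the report):
  # Re-run the shard-selection test repeatedly to estimate its failure rate.
  cd build
  for i in $(seq 1 100); do
    ./bin/unittest_mempool --gtest_filter='*check_shard_select*' \
      || echo "failed on run $i"
  done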
- 12:56 PM Bug #49781: unittest_mempool.check_shard_select failed
- Using "pthread_self for sharding":https://github.com/ceph/ceph/blob/master/src/include/mempool.h#L261-L262 is not gre...
- 11:25 AM Bug #49781 (In Progress): unittest_mempool.check_shard_select failed
- 08:15 AM Bug #49697: prime pg temp: unexpected optimization
- ping
- 08:14 AM Bug #49787 (Resolved): test_envlibrados_for_rocksdb.sh fails on master
- 06:28 AM Backport #49682 (In Progress): nautilus: OSD: shutdown of a OSD Host causes slow requests
03/15/2021
- 10:42 PM Bug #46978 (Resolved): OSD: shutdown of a OSD Host causes slow requests
- 10:42 PM Backport #49683 (Resolved): pacific: OSD: shutdown of a OSD Host causes slow requests
- 10:41 PM Backport #49774 (Resolved): pacific: Get more parallel scrubs within osd_max_scrubs limits
- 09:56 PM Backport #49402 (In Progress): octopus: rados: Health check failed: 1/3 mons down, quorum a,c (MO...
- 09:55 PM Backport #49401 (In Progress): pacific: rados: Health check failed: 1/3 mons down, quorum a,c (MO...
- 08:15 PM Backport #49817 (Resolved): pacific: mon: promote_standby does not update available_modules
- https://github.com/ceph/ceph/pull/40132
- 08:15 PM Backport #49816 (Resolved): octopus: mon: promote_standby does not update available_modules
- https://github.com/ceph/ceph/pull/40757
- 08:11 PM Bug #49778 (Pending Backport): mon: promote_standby does not update available_modules
- 05:26 PM Bug #49810 (Need More Info): rados/singleton: with msgr-failures/none MON_DOWN due to haven't for...
- ...
- 05:16 PM Bug #49809 (Need More Info): 1 out of 3 mon crashed in MonitorDBStore::get_synchronizer
- We experienced a single mon crash (out of 3 mons) - We observed no other issues on the machine or the cluster.
I a...
- 03:02 PM Bug #48793 (Resolved): out of order op
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:02 PM Bug #48990 (Resolved): rados/dashboard: Health check failed: Telemetry requires re-opt-in (TELEME...
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 10:38 AM Bug #49781: unittest_mempool.check_shard_select failed
- master also...
- 09:38 AM Bug #49779 (Resolved): standalone: osd-recovery-scrub.sh: Recovery never started
- 09:22 AM Bug #49758 (Resolved): messages/MOSDPGNotify.h: virtual void MOSDPGNotify::encode_payload(uint64_...
- 09:10 AM Backport #49796 (Resolved): pacific: pool application metadata not propagated to the cache tier
- https://github.com/ceph/ceph/pull/40119
- 09:10 AM Backport #49795 (Resolved): octopus: pool application metadata not propagated to the cache tier
- https://github.com/ceph/ceph/pull/40274
- 09:09 AM Bug #49788 (Pending Backport): pool application metadata not propagated to the cache tier
- 01:39 AM Bug #49696: all mons crash suddenly and cann't restart unless close cephx
- Neha Ojha wrote:
> can you share a coredump from the monitor, if the issue is still reproducible?
I'm afraid not....
03/14/2021
- 11:52 AM Bug #49781: unittest_mempool.check_shard_select failed
- https://github.com/ceph/ceph/pull/39978#discussion_r593341155
- 06:14 AM Feature #49789: common/TrackedOp: add op priority for TrackedOp
- PR:https://github.com/ceph/ceph/pull/40060
- 06:12 AM Feature #49789 (Fix Under Review): common/TrackedOp: add op priority for TrackedOp
- Now, we cannot know a request's priority from ceph daemon /var/run/ceph/ceph-osd.x.asok dump_historic_ops (example below)
if this comma...
- 04:17 AM Bug #49779 (Fix Under Review): standalone: osd-recovery-scrub.sh: Recovery never started
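For Feature #49789 above: the admin-socket query it refers to would look like this (socket path assumed), with the op's priority being the field the feature proposes to expose:
  # Dump recently completed and slow ops from a running OSD's admin socket.
  ceph daemon /var/run/ceph/ceph-osd.0.asok dump_historic_ops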
03/13/2021
- 04:35 PM Bug #49788 (Fix Under Review): pool application metadata not propagated to the cache tier
- 04:27 PM Bug #49788 (Resolved): pool application metadata not propagated to the cache tier
- if you have a base pool with application metadata, that application is not propagated to the cache tier.
This is a...
- 09:03 AM Bug #49787 (Resolved): test_envlibrados_for_rocksdb.sh fails on master
- ...
- 08:27 AM Bug #49781: unittest_mempool.check_shard_select failed
- It happened 5 days ago at https://github.com/ceph/ceph/pull/39883#issuecomment-791944956 and is related to https://gi...
- 03:33 AM Bug #49781 (Resolved): unittest_mempool.check_shard_select failed
- This test is probabilistic. Recording this to see whether we find it failing more frequently.
From https://jenkins.ceph...
03/12/2021
- 09:36 PM Bug #49696 (Need More Info): all mons crash suddenly and cann't restart unless close cephx
- can you share a coredump from the monitor, if the issue is still reproducible?
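For reporters on systemd hosts, one possible way to capture the requested coredump, assuming systemd-coredump is in use:
  # List recorded monitor crashes, then export the most recent core to a file.
  coredumpctl list ceph-mon
  coredumpctl dump ceph-mon -o ceph-mon.core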
- 09:31 PM Bug #49734 (Closed): [OSD]ceph osd crashes and prints Segmentation fault
- Luminous is EOL, please re-open if you see the same issue in later releases.
- 09:00 PM Backport #49775 (In Progress): nautilus: Get more parallel scrubs within osd_max_scrubs limits
- 06:20 PM Backport #49775 (Rejected): nautilus: Get more parallel scrubs within osd_max_scrubs limits
- https://github.com/ceph/ceph/pull/40142
- 08:58 PM Bug #49779 (Resolved): standalone: osd-recovery-scrub.sh: Recovery never started
In master and pacific, the TEST_recovery_scrub_2 subtest in qa/standalone/scrub/osd-recovery-scrub.sh has an interm...
- 08:55 PM Backport #49776 (In Progress): octopus: Get more parallel scrubs within osd_max_scrubs limits
- 06:20 PM Backport #49776 (Rejected): octopus: Get more parallel scrubs within osd_max_scrubs limits
- https://github.com/ceph/ceph/pull/40088
- 08:52 PM Backport #49774 (In Progress): pacific: Get more parallel scrubs within osd_max_scrubs limits
- 06:20 PM Backport #49774 (Resolved): pacific: Get more parallel scrubs within osd_max_scrubs limits
- https://github.com/ceph/ceph/pull/40077
- 08:03 PM Bug #49778: mon: promote_standby does not update available_modules
- I think we probably also need a workaround so that we can upgrade from old ceph versions that have this bug...
- 08:00 PM Bug #49778 (Resolved): mon: promote_standby does not update available_modules
- originally observed during upgrade from <15.2.5 via cephadm: the cephadm migration runs immediately after upgrade and...
- 07:46 PM Bug #45423: api_tier_pp: [ FAILED ] LibRadosTwoPoolsPP.HitSetWrite
- ...
- 06:53 PM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
- ...
- 06:29 PM Bug #49777 (Resolved): test_pool_min_size: 'check for active or peered' reached maximum tries (5)...
- ...
- 06:20 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
- ...
- 06:19 PM Bug #48843 (Pending Backport): Get more parallel scrubs within osd_max_scrubs limits
- 05:12 PM Bug #47181: "sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120...
- /a/yuriw-2021-03-11_19:01:40-rados-octopus-distro-basic-smithi/5956578/
- 01:59 PM Bug #48959: Primary OSD crash caused corrupted object and further crashes during backfill after s...
- We just ran into this again and had to remove the object to allow the PG to finish backfilling. The similarities betw...
- 01:38 PM Bug #49409: osd run into dead loop and tell slow request when rollback snap with using cache tier
- reopening this ticket, as its fix (https://github.com/ceph/ceph/pull/39593) was reverted as the fix of #49726
- 01:38 PM Bug #49409 (New): osd run into dead loop and tell slow request when rollback snap with using cach...
- 01:37 PM Bug #49726 (Resolved): src/test/osd/RadosModel.h: FAILED ceph_assert(!version || comp->get_versio...
- 07:29 AM Bug #49726: src/test/osd/RadosModel.h: FAILED ceph_assert(!version || comp->get_version64() == ve...
- created https://github.com/ceph/ceph/pull/40057 as an intermediate fix.
- 12:27 PM Bug #49427 (Resolved): FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_...
- 11:50 AM Bug #48505: osdmaptool crush
- hanguang liu wrote:
> when osd map contains CRUSH_ITEM_NONE osd when i run:
> _./osdmaptool ./hkc4 --test-map-pgs-...
- 11:44 AM Bug #48505: osdmaptool crush
- hanguang liu wrote:
> when osd map contains CRUSH_ITEM_NONE osd when i run:
> _./osdmaptool ./hkc4 --test-map-pgs-...
- 07:26 AM Bug #49758 (Fix Under Review): messages/MOSDPGNotify.h: virtual void MOSDPGNotify::encode_payload...
- 05:37 AM Bug #49754: osd/OSD.cc: ceph_abort_msg("abort() called") during OSD::shutdown()
- ...
03/11/2021
- 11:03 PM Bug #49726: src/test/osd/RadosModel.h: FAILED ceph_assert(!version || comp->get_version64() == ve...
- https://github.com/ceph/ceph/pull/39593#issuecomment-792503213 this is where it first showed up, most likely this PR ...
- 02:03 AM Bug #49726: src/test/osd/RadosModel.h: FAILED ceph_assert(!version || comp->get_version64() == ve...
- /a/kchai-2021-03-09_12:22:01-rados-wip-kefu-testing-2021-03-09-1847-distro-basic-smithi/5949457
/a/ideepika-2021-03-...
- 01:56 AM Bug #49726 (Resolved): src/test/osd/RadosModel.h: FAILED ceph_assert(!version || comp->get_versio...
- ...
- 08:19 PM Backport #49054 (Resolved): pacific: pick_a_shard() always select shard 0
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39977
m...
- 06:40 PM Backport #49054: pacific: pick_a_shard() always select shard 0
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/39977
merged
- 08:17 PM Backport #49670: pacific: make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p foo be...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39963
m...
- 08:11 PM Backport #49565: pacific: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39844
m...
- 08:08 PM Backport #49397 (Resolved): octopus: rados/dashboard: Health check failed: Telemetry requires re-...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39704
m...
- 03:59 PM Backport #49397: octopus: rados/dashboard: Health check failed: Telemetry requires re-opt-in (TEL...
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/39704
merged
- 06:56 PM Bug #49758 (Resolved): messages/MOSDPGNotify.h: virtual void MOSDPGNotify::encode_payload(uint64_...
- ...
- 06:45 PM Bug #49754 (New): osd/OSD.cc: ceph_abort_msg("abort() called") during OSD::shutdown()
- ...
- 06:04 PM Bug #47838: mon/test_mon_osdmap_prune.sh: first_pinned != trim_to
- /a/yuriw-2021-03-10_21:08:51-rados-wip-yuri8-testing-2021-03-10-0901-pacific-distro-basic-smithi/5954442 - similar
- 01:31 PM Bug #47380: mon: slow ops due to osd_failure
- an alternative fix: https://github.com/ceph/ceph/pull/40033
- 07:11 AM Bug #49734 (Closed): [OSD]ceph osd crashes and prints Segmentation fault
- This error occurred on Mar 6th; osd.37 was down and out with the below log info (ceph-osd.37.log-20210306):
2021-03-...
- 07:07 AM Backport #49533 (In Progress): octopus: osd ok-to-stop too conservative
- 03:30 AM Backport #49730 (Resolved): octopus: debian ceph-common package post-inst clobbers ownership of c...
- https://github.com/ceph/ceph/pull/40275
- 03:30 AM Bug #49727: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
Note that instead of a delay you can tell the OSDs to flush their pg stats (example below). I wonder if that flushes to the mon and...
- 03:16 AM Bug #49727 (Resolved): lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
This has been seen in cases where all of pool 1's PGs are scrubbed and none of pool 2's. I suggest that this is beca...
- 03:30 AM Backport #49729 (Resolved): nautilus: debian ceph-common package post-inst clobbers ownership of ...
- https://github.com/ceph/ceph/pull/40698
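The flush suggested in the #49727 comment above would look roughly like this (the glob form is how the qa helpers typically address all OSDs):
  # Ask every OSD to push its pg stats immediately instead of sleeping.
  ceph tell 'osd.*' flush_pg_stats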
- 03:30 AM Backport #49728 (Resolved): pacific: debian ceph-common package post-inst clobbers ownership of c...
- https://github.com/ceph/ceph/pull/40248
- 03:26 AM Backport #49145 (Resolved): pacific: out of order op
- 03:25 AM Bug #49677 (Pending Backport): debian ceph-common package post-inst clobbers ownership of cephadm...
03/10/2021
- 10:41 PM Backport #49682: nautilus: OSD: shutdown of a OSD Host causes slow requests
- please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/40014
ceph-backport.sh versi...
- 10:40 PM Backport #49681: octopus: OSD: shutdown of a OSD Host causes slow requests
- please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/40013
ceph-backport.sh versi...
- 04:21 PM Bug #23117: PGs stuck in "activating" after osd_max_pg_per_osd_hard_ratio has been exceeded once
- I am aware of one place where we do log the withholding of pg creation: the following log message in the OSD logs.
https://...
- 01:08 PM Bug #46978: OSD: shutdown of a OSD Host causes slow requests
- Hey Konstantin and Loïc,
Understood; thanks!
- 07:57 AM Bug #46978: OSD: shutdown of a OSD Host causes slow requests
- Hi Mauricio,
You are welcome to join the Stable Release team on IRC at #ceph-backports to discuss and resolve the...
- 06:47 AM Bug #46978: OSD: shutdown of a OSD Host causes slow requests
- Mauricio, just make a backport PR at GitHub, we'll attach it to tracker later.
- 08:54 AM Bug #49697 (Resolved): prime pg temp: unexpected optimization
- I encountered a problem when splitting pgs that eventually causes pgs
to become inactive.
I think the root reas...
- 07:40 AM Bug #49696 (Need More Info): all mons crash suddenly and cann't restart unless close cephx
- crash info
{
"os_version_id": "7",
"utsname_release": "4.14.0jsdx_kernel",
"os_name": "CentOS Linux... - 02:13 AM Backport #49533 (Rejected): octopus: osd ok-to-stop too conservative
- Per Sage
> I'm not sure if this is worth backporting. The primary benefit is faster upgrades, and it's the target ...
- 01:24 AM Bug #47419 (Resolved): make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p foo bench...
- 01:24 AM Backport #49670 (Resolved): pacific: make check: src/test/smoke.sh: TEST_multimon: timeout 8 rado...
- 12:02 AM Backport #49565 (Resolved): pacific: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
03/09/2021
- 11:58 PM Backport #49053 (In Progress): octopus: pick_a_shard() always select shard 0
- 11:58 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- https://github.com/ceph/ceph/pull/39844 merged
- 11:57 PM Backport #49054 (In Progress): pacific: pick_a_shard() always select shard 0
- 11:13 PM Backport #49691 (Rejected): pacific: ceph_assert(is_primary()) in PG::scrub()
- 11:10 PM Backport #49691 (Rejected): pacific: ceph_assert(is_primary()) in PG::scrub()
- 11:13 PM Bug #48712 (Resolved): ceph_assert(is_primary()) in PG::scrub()
- 11:09 PM Bug #48712 (Pending Backport): ceph_assert(is_primary()) in PG::scrub()
- 11:09 PM Bug #48712 (Resolved): ceph_assert(is_primary()) in PG::scrub()
- 11:12 PM Backport #49377 (In Progress): pacific: building libcrc32
- 10:55 PM Backport #48985 (In Progress): octopus: ceph osd df tree reporting incorrect SIZE value for rack ...
- 10:26 PM Bug #49689 (Resolved): osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch...
- ...
- 10:23 PM Bug #36304: FAILED ceph_assert(p != pg_slots.end()) in OSDShard::register_and_wake_split_child(PG*)
- /a/yuriw-2021-03-08_21:03:18-rados-wip-yuri5-testing-2021-03-08-1049-pacific-distro-basic-smithi/5947439
- 10:21 PM Bug #49688 (Can't reproduce): FAILED ceph_assert(is_primary()) in submit_log_entries during Promo...
- ...
- 09:43 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
- Samuel Just wrote:
> I'm...not sure what that if block is supposed to do. It was introduced as part of the initial ...
- 03:21 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
- I'm...not sure what that if block is supposed to do. It was introduced as part of the initial overwrites patch seque...
- 09:31 PM Backport #49670 (In Progress): pacific: make check: src/test/smoke.sh: TEST_multimon: timeout 8 r...
- https://github.com/ceph/ceph/pull/39963
- 03:45 PM Backport #49670 (Resolved): pacific: make check: src/test/smoke.sh: TEST_multimon: timeout 8 rado...
- https://github.com/ceph/ceph/pull/39963
- 07:46 PM Backport #49683: pacific: OSD: shutdown of a OSD Host causes slow requests
- please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/39957
ceph-backport.sh versi... - 07:35 PM Backport #49683 (Resolved): pacific: OSD: shutdown of a OSD Host causes slow requests
- https://github.com/ceph/ceph/pull/39957
- 07:40 PM Bug #46978: OSD: shutdown of a OSD Host causes slow requests
- Igor, thanks.
I'd like to / can work on submitting the backport PRs, if that's OK.
In the future, if I want to ...
- 07:33 PM Bug #46978 (Pending Backport): OSD: shutdown of a OSD Host causes slow requests
- 07:25 PM Bug #46978: OSD: shutdown of a OSD Host causes slow requests
- The master PR has been merged.
Can someone update Status to Pending Backport, please?
Thanks!
- 07:35 PM Backport #49682 (Resolved): nautilus: OSD: shutdown of a OSD Host causes slow requests
- https://github.com/ceph/ceph/pull/40014
- 07:35 PM Backport #49681 (Resolved): octopus: OSD: shutdown of a OSD Host causes slow requests
- https://github.com/ceph/ceph/pull/40013
- 05:57 PM Bug #49677 (Fix Under Review): debian ceph-common package post-inst clobbers ownership of cephadm...
- 05:54 PM Bug #49677 (Resolved): debian ceph-common package post-inst clobbers ownership of cephadm log dirs
- the debian/ubuntu ceph uid is different from the rhel/centos one used by the container. the postinst does a chown -R...
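A hypothetical illustration of the failure mode (the path and target are assumptions, not taken from the actual postinst):
  # A recursive chown in the maintainer script resets everything under the
  # log dir to the distro's ceph uid, clobbering cephadm-created dirs owned
  # by the container's ceph uid (167 on rhel/centos).
  chown -R ceph:ceph /var/log/ceph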
- 04:45 PM Backport #47364 (Resolved): luminous: pgs inconsistent, union_shard_errors=missing
- 03:43 PM Bug #47419 (Pending Backport): make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p f...
- https://jenkins.ceph.com/job/ceph-pull-requests/70801/consoleFull#10356408840526d21-3511-427d-909c-dd086c0d1034 - thi...
- 08:32 AM Bug #48786 (Resolved): api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapRefcount2...
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 08:32 AM Bug #48984 (Resolved): lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 06:30 AM Backport #49642: pacific: Disable and re-enable clog_to_monitors could trigger assertion
- https://github.com/ceph/ceph/pull/39938
- 04:11 AM Backport #49641: octopus: Disable and re-enable clog_to_monitors could trigger assertion
- https://github.com/ceph/ceph/pull/39935
03/08/2021
- 05:16 PM Backport #49482: pacific: api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapRefcou...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39773
m...
- 05:14 PM Backport #49532: pacific: osd ok-to-stop too conservative
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39737
m...
- 05:07 PM Backport #49529 (In Progress): nautilus: "ceph osd crush set|reweight-subtree" commands do not se...
- 05:06 PM Backport #49530 (In Progress): octopus: "ceph osd crush set|reweight-subtree" commands do not set...
- 05:05 PM Backport #49528 (Resolved): pacific: "ceph osd crush set|reweight-subtree" commands do not set we...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39736
m...
- 05:02 PM Backport #49526: pacific: mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound to class '...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39735
m...
- 05:01 PM Backport #49404: pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39597
m...
- 04:59 PM Backport #49404: pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
- https://github.com/ceph/ceph/pull/39796
https://github.com/ceph/ceph/pull/39597
(double whammy)
- 01:41 PM Backport #49640: nautilus: Disable and re-enable clog_to_monitors could trigger assertion
- https://github.com/ceph/ceph/pull/39912
- 11:44 AM Bug #49409 (Pending Backport): osd run into dead loop and tell slow request when rollback snap wi...
03/07/2021
- 10:02 PM Backport #49377: pacific: building libcrc32
- please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/39902
ceph-backport.sh versi... - 03:58 PM Backport #49482 (Resolved): pacific: api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/Manifes...
- 03:55 PM Backport #49642 (Resolved): pacific: Disable and re-enable clog_to_monitors could trigger assertion
- https://github.com/ceph/ceph/pull/40247
- 03:55 PM Backport #49641 (Resolved): octopus: Disable and re-enable clog_to_monitors could trigger assertion
- https://github.com/ceph/ceph/pull/39935
- 03:55 PM Backport #49640 (Resolved): nautilus: Disable and re-enable clog_to_monitors could trigger assertion
- https://github.com/ceph/ceph/pull/39912
- 03:54 PM Bug #48946 (Pending Backport): Disable and re-enable clog_to_monitors could trigger assertion
03/06/2021
- 02:58 PM Backport #49533 (In Progress): octopus: osd ok-to-stop too conservative
- https://github.com/ceph/ceph/pull/39887
- 02:43 PM Backport #49073 (Resolved): nautilus: crash in Objecter and CRUSH map lookup
- 01:16 AM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
- This is where we sent the subops...
03/05/2021
- 11:10 PM Bug #49524: ceph_test_rados_delete_pools_parallel didn't start
- https://tracker.ceph.com/issues/45946 looks very similar
- 11:04 PM Bug #49525: found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10121515-14:e4 snaps missi...
- Ronen, can you check if this is caused by a race between scrub and snap removal.
- 10:53 PM Bug #49403 (Duplicate): Caught signal (aborted) on mgrmap epoch 1 during librados init (rados-str...
- 07:15 PM Bug #48298: hitting mon_max_pg_per_osd right after creating OSD, then decreases slowly
- Another observation: I have nobackfill set, and I'm currently adding 8 new OSDs.
The first of the newly added OSDs...
- 05:15 PM Backport #49482: pacific: api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapRefcou...
- Myoungwon Oh wrote:
> https://github.com/ceph/ceph/pull/39773
merged
- 02:39 AM Bug #47419 (Resolved): make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p foo bench...
- Hopefully
- 01:49 AM Backport #49565 (In Progress): pacific: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- https://github.com/ceph/ceph/pull/39844
03/04/2021
- 11:34 PM Bug #47419 (Fix Under Review): make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p f...
- 11:34 PM Bug #47419 (Duplicate): make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p foo benc...
- 04:33 PM Bug #47419: make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p foo bench 4 write -b...
- https://jenkins.ceph.com/job/ceph-pull-requests/70513/consoleFull#10356408840526d21-3511-427d-909c-dd086c0d1034
- 11:21 PM Bug #49614 (Duplicate): src/test/smoke.sh:56: TEST_multimon: timeout 8 rados -p foo bench 4 writ...
- 11:11 PM Bug #49614: src/test/smoke.sh:56: TEST_multimon: timeout 8 rados -p foo bench 4 write -b 4096 --...
- https://jenkins.ceph.com/job/ceph-pull-requests/70513/consoleFull#-1656021838e840cee4-f4a4-4183-81dd-42855615f2c1
- 10:58 PM Bug #49614 (Duplicate): src/test/smoke.sh:56: TEST_multimon: timeout 8 rados -p foo bench 4 writ...
- ...
- 09:14 PM Bug #44631: ceph pg dump error code 124
- /ceph/teuthology-archive/pdonnell-2021-03-04_03:51:01-fs-wip-pdonnell-testing-20210303.195715-distro-basic-smithi/593...
- 05:39 PM Bug #44631: ceph pg dump error code 124
- /a/yuriw-2021-03-02_20:59:34-rados-wip-yuri7-testing-2021-03-02-1118-nautilus-distro-basic-smithi/5928174
- 09:08 PM Backport #49532 (Resolved): pacific: osd ok-to-stop too conservative
- 06:47 PM Backport #49404 (Resolved): pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
- 06:47 PM Backport #49526 (Resolved): pacific: mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound...
- 06:44 PM Bug #45423: api_tier_pp: [ FAILED ] LibRadosTwoPoolsPP.HitSetWrite
- /a/sage-2021-03-03_16:41:22-rados-wip-sage2-testing-2021-03-03-0744-pacific-distro-basic-smithi/5930113
- 04:48 PM Bug #23117: PGs stuck in "activating" after osd_max_pg_per_osd_hard_ratio has been exceeded once
- We also hit this issue last week on Ceph Version 12.2.11.
Cluster configured with a replication factor of 3, issu...
- 01:21 PM Backport #48987: nautilus: ceph osd df tree reporting incorrect SIZE value for rack having an emp...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39126
m...
03/03/2021
- 10:14 PM Bug #49104: crush weirdness: degraded PGs not marked as such, and choose_total_tries = 50 is too ...
- Thanks for the analysis Neha.
Something that perhaps wasn't clear in comment 2 -- in each case where I print the `...
- 06:48 PM Bug #49104 (Triaged): crush weirdness: degraded PGs not marked as such, and choose_total_tries = ...
- Thanks for the detailed logs!
Firstly, the pg dump output can sometimes be a little laggy, so I am basing my asses...
- 09:53 PM Backport #48987 (Resolved): nautilus: ceph osd df tree reporting incorrect SIZE value for rack ha...
- 04:05 PM Backport #48987: nautilus: ceph osd df tree reporting incorrect SIZE value for rack having an emp...
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/39126
merged
- 08:47 PM Bug #20909: Error ETIMEDOUT: crush test failed with -110: timed out during smoke test (5 seconds)
- not seen in octopus and pacific so far, but pops up sometimes in nautilus:...
- 08:39 PM Bug #49591 (New): no active mgr (MGR_DOWN)" in cluster log
- seen in nautilus...
- 03:37 PM Bug #49584: Ceph OSD, MDS, MGR daemon does not _only_ bind to specified address when configured t...
- After removing the specific public_addr and restarting the MDSes the situation returns to normal and the cluster reco...
- 03:22 PM Bug #49584 (New): Ceph OSD, MDS, MGR daemon does not _only_ bind to specified address when config...
- Documentation (https://docs.ceph.com/en/octopus/rados/configuration/network-config-ref/#ceph-daemons) states the foll...
- 11:32 AM Backport #49055 (Resolved): nautilus: pick_a_shard() always select shard 0
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39651
m...
- 09:40 AM Bug #40029: ceph-mon: Caught signal (Aborted) in (CrushWrapper::update_choose_args(CephContext*)+...
- Florian Haas wrote:
> With thanks to Paul Emmerich in https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/threa...
- 05:31 AM Bug #48417: unfound EC objects in sepia's LRC after upgrade
- https://tracker.ceph.com/issues/48613#note-13
- 12:28 AM Backport #49404 (In Progress): pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
03/02/2021
- 08:21 PM Bug #37808 (New): osd: osdmap cache weak_refs assert during shutdown
- /ceph/teuthology-archive/pdonnell-2021-03-02_17:29:53-fs:verify-wip-pdonnell-testing-20210301.234318-distro-basic-smi...
- 05:27 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
- personal ref dir: all greps reside in **/home/ideepika/pg[3.1as0.log** on the teuthology server
job: /a/teuthology-2021...
- 05:24 PM Bug #49572 (Duplicate): MON_DOWN: mon.c fails to join quorum after un-blacklisting mon.a
- This is the same as https://tracker.ceph.com/issues/47654...
- 04:58 PM Bug #49572 (Duplicate): MON_DOWN: mon.c fails to join quorum after un-blacklisting mon.a
- /a/sage-2021-03-01_20:24:37-rados-wip-sage-testing-2021-03-01-1118-distro-basic-smithi/5924612
it looks like the s...
- 04:38 AM Backport #49482: pacific: api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapRefcou...
- https://github.com/ceph/ceph/pull/39773
03/01/2021
- 11:25 PM Bug #49409 (Fix Under Review): osd run into dead loop and tell slow request when rollback snap wi...
- 10:16 PM Backport #49567 (Resolved): nautilus: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- https://github.com/ceph/ceph/pull/40697
- 10:15 PM Backport #49566 (Resolved): octopus: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- https://github.com/ceph/ceph/pull/40756
- 10:15 PM Backport #49565 (Resolved): pacific: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- https://github.com/ceph/ceph/pull/39844
- 10:10 PM Bug #47719 (Pending Backport): api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- 05:15 PM Backport #49055: nautilus: pick_a_shard() always select shard 0
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/39651
merged - 03:43 AM Bug #49543 (New): scrub a pool which size is 1 but found stat mismatch on objects and bytes
the pg has only one primary osd:...
02/28/2021
- 11:55 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- /a/sage-2021-02-28_18:35:15-rados-wip-sage-testing-2021-02-28-1217-distro-basic-smithi/5921418
- 11:52 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
- I hit another instance of this here: /a/sage-2021-02-28_18:35:15-rados-wip-sage-testing-2021-02-28-1217-distro-basic-...
- 09:20 PM Bug #46318: mon_recovery: quorum_status times out
- same symptom... cli command fails to contact mon
/a/sage-2021-02-28_18:35:15-rados-wip-sage-testing-2021-02-28-121...
- 09:37 AM Bug #40029: ceph-mon: Caught signal (Aborted) in (CrushWrapper::update_choose_args(CephContext*)+...
- Florian Haas wrote:
> With thanks to Paul Emmerich in https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/threa...
02/27/2021
- 08:59 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- /a/sage-2021-02-27_17:50:29-rados-wip-sage2-testing-2021-02-27-0921-pacific-distro-basic-smithi/5919090
- 03:20 PM Backport #49533 (Resolved): octopus: osd ok-to-stop too conservative
- https://github.com/ceph/ceph/pull/39887
- 03:20 PM Backport #49532 (Resolved): pacific: osd ok-to-stop too conservative
- https://github.com/ceph/ceph/pull/39737
- 03:20 PM Backport #49531 (Resolved): nautilus: osd ok-to-stop too conservative
- https://github.com/ceph/ceph/pull/40676
- 03:16 PM Bug #49392 (Pending Backport): osd ok-to-stop too conservative
- pacific backport: https://github.com/ceph/ceph/pull/39737
- 03:16 PM Backport #49530 (Resolved): octopus: "ceph osd crush set|reweight-subtree" commands do not set we...
- https://github.com/ceph/ceph/pull/39919
- 03:16 PM Backport #49529 (Resolved): nautilus: "ceph osd crush set|reweight-subtree" commands do not set w...
- https://github.com/ceph/ceph/pull/39920
- 03:15 PM Backport #49528 (Resolved): pacific: "ceph osd crush set|reweight-subtree" commands do not set we...
- https://github.com/ceph/ceph/pull/39736
- 03:15 PM Backport #49527 (Resolved): octopus: mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound...
- https://github.com/ceph/ceph/pull/40276
- 03:15 PM Backport #49526 (Resolved): pacific: mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound...
- https://github.com/ceph/ceph/pull/39735
- 03:14 PM Bug #48065 (Pending Backport): "ceph osd crush set|reweight-subtree" commands do not set weight o...
- pacific backport: https://github.com/ceph/ceph/pull/39736
- 03:13 PM Bug #49212 (Pending Backport): mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound to cl...
- backport for pacific: https://github.com/ceph/ceph/pull/39735
- 02:40 PM Bug #49525 (Resolved): found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10121515-14:e4 ...
- ...
- 02:36 PM Bug #48997: rados/singleton/all/recovery-preemption: defer backfill|defer recovery not found in logs
- /a/sage-2021-02-26_22:19:00-rados-wip-sage-testing-2021-02-26-1412-distro-basic-smithi/5916984
- 02:26 PM Bug #49521: build failure on centos-8, bad/incorrect use of #ifdef/#elif
- N.B. fedora-33 and later have sigdescr_np(); ditto for rhel-9.
Also strsignal(3) is not *MT-SAFE*. (It's also not ...
- 02:26 PM Bug #43584: MON_DOWN during mon_join process
- /a/sage-2021-02-26_22:19:00-rados-wip-sage-testing-2021-02-26-1412-distro-basic-smithi/5917141
I think this is a g...
- 02:17 PM Bug #49524 (Resolved): ceph_test_rados_delete_pools_parallel didn't start
- ...
- 02:11 PM Bug #49523 (New): rebuild-mondb doesn't populate mgr commands -> pg dump EINVAL
- ...
02/26/2021
- 09:37 PM Bug #49521 (New): build failure on centos-8, bad/incorrect use of #ifdef/#elif
- building 15.2.9 for CentOS Storage SIG el8 I hit this compile error:
cmake ... -DWITH_REENTRANT_STRSIGNAL=ON ......
- 03:30 PM Bug #44286: Cache tiering shows unfound objects after OSD reboots
- We even hit that bug twice today by rebooting two of our cache servers.
What's interesting is that only hit_set ob...
- 01:53 PM Feature #49505 (New): Warn about extremely anomalous commit_latencies
- In an EC cluster with ~500 hdd osds, we suffered a drop in write performance from 30GiB/s down to 3GiB/s due to one si...
02/25/2021
- 10:24 PM Bug #49461 (Duplicate): rados/upgrade/pacific-x/parallel: upgrade incomplete
- 05:19 PM Backport #49397 (In Progress): octopus: rados/dashboard: Health check failed: Telemetry requires ...
- 05:18 PM Backport #49398 (Resolved): pacific: rados/dashboard: Health check failed: Telemetry requires re-...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39484
m...
- 05:18 PM Backport #49398 (In Progress): pacific: rados/dashboard: Health check failed: Telemetry requires ...
- 03:00 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- /a/sage-2021-02-24_21:26:56-rados-wip-sage-testing-2021-02-24-1457-distro-basic-smithi/5912284
- 02:47 PM Bug #49487 (Fix Under Review): osd:scrub skip some pg
- 08:27 AM Bug #49487 (Resolved): osd:scrub skip some pg
- ENV:1 mon,1 mgr,1 osd
create a pool with 8 pg, change the value of osd_scrub_min_interval to trigger reschedule
...
- 02:37 PM Bug #46318 (Need More Info): mon_recovery: quorum_status times out
- having trouble reproducing (after about 150 jobs). adding increased debugging to master with https://github.com/ceph...
- 01:29 PM Backport #49134: pacific: test_envlibrados_for_rocksdb.sh: EnvLibradosMutipoolTest.DBBulkLoadKeys...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39264
m...
- 09:23 AM Support #49489 (New): Getting Long heartbeat and slow requests on ceph luminous 12.2.13
- 1. Current environment is integrated with ceph and openstack
2. It has NVME and SSD disks only
3. We have created fo...
- 07:09 AM Bug #49427 (In Progress): FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing()....
- https://github.com/ceph/ceph/pull/39670
- 06:30 AM Backport #49482 (Resolved): pacific: api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/Manifes...
- https://github.com/ceph/ceph/pull/39773
- 06:29 AM Bug #47024 (Duplicate): rados/test.sh: api_tier_pp LibRadosTwoPoolsPP.ManifestSnapRefcount failed
- 06:29 AM Bug #48915 (Duplicate): api_tier_pp: LibRadosTwoPoolsPP.ManifestFlushDupCount failed
- 06:28 AM Bug #48786 (Pending Backport): api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapR...
- 05:29 AM Bug #49460 (Resolved): qa/workunits/cephtool/test.sh: test_mon_osd_create_destroy fails
- 03:58 AM Bug #49468 (Resolved): rados: "Command crashed: 'rados -p cephfs_metadata rmxattr 10000000000.000...
- 02:58 AM Bug #49468: rados: "Command crashed: 'rados -p cephfs_metadata rmxattr 10000000000.00000000 parent'"
- Add more failures.
- 02:42 AM Bug #49468 (Resolved): rados: "Command crashed: 'rados -p cephfs_metadata rmxattr 10000000000.000...
- ...
- 12:04 AM Bug #49463: qa/standalone/misc/rados-striper.sh: Caught signal in thread_name:rados
- rados/singleton/{all/radostool mon_election/classic msgr-failures/many msgr/async-v2only objectstore/bluestore-comp-z...
02/24/2021
- 10:07 PM Bug #49463 (Can't reproduce): qa/standalone/misc/rados-striper.sh: Caught signal in thread_name:r...
- ...
- 09:34 PM Bug #49461 (Duplicate): rados/upgrade/pacific-x/parallel: upgrade incomplete
- ...
- 09:26 PM Bug #49460 (Fix Under Review): qa/workunits/cephtool/test.sh: test_mon_osd_create_destroy fails
- 08:58 PM Bug #49460 (Resolved): qa/workunits/cephtool/test.sh: test_mon_osd_create_destroy fails
- ...
- 09:00 PM Bug #49212 (Fix Under Review): mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound to cl...
- earlier,...
- 08:48 PM Bug #49212 (In Progress): mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound to class '...
- ...
- 10:45 AM Bug #49353: Random OSDs being marked as down even when there is very less activity on the cluster...
- Nokia ceph-users wrote:
> Do you suspect that this is something relevant to 14.2.2 and could be solved with a higher...
- 04:55 AM Bug #49353: Random OSDs being marked as down even when there is very less activity on the cluster...
- Do you suspect that this is something relevant to 14.2.2 and could be solved with a higher version?
- 09:16 AM Bug #49448 (New): If OSD types are changed, pools rules can become unresolvable without providing...
- When some OSDs in a cluster are of a specific type, such as hdd_aes, and the type is used in a rule, if the type of s...
- 05:07 AM Bug #49428 (Triaged): ceph_test_rados_api_snapshots fails with "rados_mon_command osd pool create...
- 04:50 AM Bug #49428: ceph_test_rados_api_snapshots fails with "rados_mon_command osd pool create failed wi...
- TLDR skip to ********* MON.A ************** below.
So this looks like a race. The calls seem to be serialized in t...
- 12:48 AM Bug #49428: ceph_test_rados_api_snapshots fails with "rados_mon_command osd pool create failed wi...
- Here's the error from the mon log....
- 05:06 AM Bug #47719 (In Progress): api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- 12:50 AM Bug #49427: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soi...
- Most likely, the problem is that the object being dirtied is present, but the prior clone is missing pending recovery.
02/23/2021
- 11:52 PM Bug #49427: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soi...
- dec_refcount_by_dirty is related to tiering/dedup which got added fairly recently in https://github.com/ceph/ceph/pu...
- 10:36 PM Bug #49427: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soi...
- /a/bhubbard-2021-02-23_02:25:14-rados-master-distro-basic-smithi/5905669
- 01:44 AM Bug #49427 (Resolved): FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_...
- /a/bhubbard-2021-02-22_23:51:15-rados-master-distro-basic-smithi/5904732
rados/verify/{centos_latest ceph clusters...
- 11:15 PM Bug #49403: Caught signal (aborted) on mgrmap epoch 1 during librados init (rados-striper)
- /a/sage-2021-02-23_06:29:23-rados-wip-sage-testing-2021-02-22-2228-distro-basic-smithi/5906245
- 09:06 PM Backport #49055 (In Progress): nautilus: pick_a_shard() always select shard 0
- 03:40 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
- ...
- 12:23 PM Bug #49353: Random OSDs being marked as down even when there is very less activity on the cluster...
- Nokia ceph-users wrote:
> Hi , Another occurrence
>
> _2021-02-22 09:19:43.010071 mon.cn1 (mon.0) 267937 : cluste...
- 12:23 PM Bug #49353: Random OSDs being marked as down even when there is very less activity on the cluster...
- Hi , Another occurrence
_2021-02-22 09:19:43.010071 mon.cn1 (mon.0) 267937 : cluster [INF] osd.146 marked down aft...
- 04:34 AM Bug #48065: "ceph osd crush set|reweight-subtree" commands do not set weight on device class subtree
- Sage Weil wrote:
> BTW Mykola I would suggest using 'ceph osd crush reweight osd.N' (which works fine already) inste...
- 02:02 AM Bug #49428 (Duplicate): ceph_test_rados_api_snapshots fails with "rados_mon_command osd pool crea...
- /a/bhubbard-2021-02-22_23:51:15-rados-master-distro-basic-smithi/5904720...
- 12:42 AM Bug #49069 (Resolved): mds crashes on v15.2.8 -> master upgrade decoding MMgrConfigure
02/22/2021
- 08:22 PM Bug #48065: "ceph osd crush set|reweight-subtree" commands do not set weight on device class subtree
- BTW Mykola I would suggest using 'ceph osd crush reweight osd.N' (which works fine already) instead of the 'ceph osd ...
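For reference, the suggested working command (osd id and weight are placeholders):
  # Set the CRUSH weight of a single osd item directly.
  ceph osd crush reweight osd.5 3.64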
- 08:22 PM Bug #48065 (Fix Under Review): "ceph osd crush set|reweight-subtree" commands do not set weight o...
- 07:37 PM Bug #46318 (In Progress): mon_recovery: quorum_status times out
- Neha Ojha wrote:
> We are still seeing these.
>
> /a/teuthology-2021-01-18_07:01:01-rados-master-distro-basic-smi...
- 09:50 AM Bug #49409 (New): osd run into dead loop and tell slow request when rollback snap with using cach...
- 08:59 AM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- If we are happy with https://github.com/ceph/ceph/pull/39601 in theory perhaps we need to extend it to cover the othe...
- 06:31 AM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- Hey Sage, I think you meant /a/sage-2021-02-20_16:46:42-rados-wip-sage2-testing-2021-02-20-0942-distro-basic-smithi/5...
- 02:17 AM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- First, the relevant test code from src/test/librados/watch_notify.cc....
02/21/2021
- 04:50 PM Backport #49404 (Resolved): pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
- https://github.com/ceph/ceph/pull/39597
- 04:48 PM Bug #48984 (Pending Backport): lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
- 04:46 PM Bug #49403 (Duplicate): Caught signal (aborted) on mgrmap epoch 1 during librados init (rados-str...
- ...
- 04:42 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- /a/sage-2021-02-20_16:46:42-rados-wip-sage2-testing-2021-02-20-0942-distro-basic-smithi/5899129
- 11:48 AM Bug #48998: Scrubbing terminated -- not all pgs were active and clean
- rados/singleton/{all/lost-unfound-delete mon_election/classic msgr-failures/none msgr/async-v1only objectstore/bluest...
- 03:35 AM Backport #49402 (Resolved): octopus: rados: Health check failed: 1/3 mons down, quorum a,c (MON_D...
- https://github.com/ceph/ceph/pull/40138
- 03:35 AM Backport #49401 (Resolved): pacific: rados: Health check failed: 1/3 mons down, quorum a,c (MON_D...
- https://github.com/ceph/ceph/pull/40137
- 03:32 AM Bug #45441 (Pending Backport): rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" ...
02/20/2021
- 07:41 PM Bug #48386 (Resolved): Paxos::restart() and Paxos::shutdown() can race leading to use-after-free ...
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 12:33 PM Bug #49395 (Resolved): ceph-test rpm missing gtest dependencies
- 03:47 AM Bug #49395 (Fix Under Review): ceph-test rpm missing gtest dependencies
02/19/2021
- 11:59 PM Bug #45647: "ceph --cluster ceph --log-early osd last-stat-seq osd.0" times out due to msgr-failu...
- /a/teuthology-2021-02-17_03:31:03-rados-pacific-distro-basic-smithi/5889472
- 11:30 PM Backport #49398 (Resolved): pacific: rados/dashboard: Health check failed: Telemetry requires re-...
- https://github.com/ceph/ceph/pull/39484
- 11:30 PM Backport #49397 (Resolved): octopus: rados/dashboard: Health check failed: Telemetry requires re-...
- https://github.com/ceph/ceph/pull/39704
- 11:29 PM Bug #49212 (Duplicate): mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound to class 'ss...
- 11:25 PM Bug #48990 (Pending Backport): rados/dashboard: Health check failed: Telemetry requires re-opt-in...
- 11:24 PM Bug #48990: rados/dashboard: Health check failed: Telemetry requires re-opt-in (TELEMETRY_CHANGED...
- pacific backport merged: https://github.com/ceph/ceph/pull/39484
- 11:24 PM Bug #40809: qa: "Failed to send signal 1: None" in rados
- Deepika Upadhyay wrote:
> this happens due to dispatch delay.
> Testing with increased values for a test case can ...
- 07:45 AM Bug #40809: qa: "Failed to send signal 1: None" in rados
- this happens due to dispatch delay.
Testing with increased values for a test case can lead to this failure:
/ceph/...
- 11:05 PM Bug #44945: Mon High CPU usage when another mon syncing from it
- Wout van Heeswijk wrote:
> I think this might be related to #42830. If so it may be resolved with Ceph Nautilus 14.2...
- 11:04 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- Will do.
- 10:54 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- Brad, can you please take look at this one?
- 09:25 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- /a/teuthology-2021-02-17_03:31:03-rados-pacific-distro-basic-smithi/5889235
- 10:49 PM Bug #49359: osd: warning: unused variable
- f9f9270d75d3bc6383604addefc2386318ecfc8b was done to fix another warning, definitely not high priority :)
- 10:47 PM Bug #45441 (Fix Under Review): rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" ...
- 10:44 PM Bug #39039 (Duplicate): mon connection reset, command not resent
- let's track this at #45647
- 10:33 PM Bug #47003 (Duplicate): ceph_test_rados test error. Reponses out of order due to the connection d...
- 10:29 PM Feature #39339: prioritize backfill of metadata pools, automatically
- I think this tracker can be marked resolved since pull request 29181 merged.
- 10:26 PM Bug #48468 (Need More Info): ceph-osd crash before being up again
- Hi Clement,
Can you reproduce this with logs?...
- 10:19 PM Bug #49393 (Need More Info): Segmentation fault in ceph::logging::Log::entry()
- 09:11 PM Bug #49393 (Can't reproduce): Segmentation fault in ceph::logging::Log::entry()
- ...
- 10:16 PM Bug #49395 (Resolved): ceph-test rpm missing gtest dependencies
- ...
- 09:28 PM Bug #48841 (Fix Under Review): test_turn_off_module: wait_until_equal timed out
- 08:09 PM Bug #49392 (Resolved): osd ok-to-stop too conservative
- Currently 'osd ok-to-stop' is too conservative: if the pg is degraded, and is touched by an osd we might stop, it alw...
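For context, a sample invocation of the check being discussed (osd id assumed):
  # Ask the mons whether stopping osd.1 would leave any PG unable to serve I/O.
  ceph osd ok-to-stop 1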
- 04:43 PM Backport #49320 (In Progress): octopus: thrash_cache_writeback_proxy_none: FAILED ceph_assert(ver...
- https://github.com/ceph/ceph/pull/39578
- 02:51 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
- investigating the 2 unfound objects, `when all_unfound_are_queried_or_lost all of
might_have_unfound` all participat...
- 01:53 PM Bug #49104: crush weirdness: degraded PGs not marked as such, and choose_total_tries = 50 is too ...
- Neha Ojha wrote:
> Regarding Problem A, will it be possible for you to share osd logs with debug_osd=20 to demonstra...
- 01:26 PM Backport #49377 (Resolved): pacific: building libcrc32
- https://github.com/ceph/ceph/pull/39902
- 10:31 AM Bug #49231: MONs unresponsive over extended periods of time
- I think I found the reason for this behaviour. I managed to pull extended logs during an incident and saw that the MO...
- 01:40 AM Bug #48984 (Fix Under Review): lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
02/18/2021
- 05:20 PM Bug #49359: osd: warning: unused variable
- https://stackoverflow.com/a/50176479
- 05:17 PM Bug #49359 (New): osd: warning: unused variable
- ...
- 03:30 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
- ...
- 03:21 PM Bug #49259 (Resolved): test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
- 03:21 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
- turned out to be caused by https://github.com/ceph/ceph/pull/39530
- 10:16 AM Bug #49353 (Need More Info): Random OSDs being marked as down even when there is very less activi...
- osd.149 went down at 03:25:26
2021-01-14 03:25:25.974634 mon.cn1 (mon.0) 384654 : cluster [INF] osd.149 marked down ...
- 09:51 AM Bug #49353 (Need More Info): Random OSDs being marked as down even when there is very less activi...
- Hi,
We recently see some random OSDs being marked as down status with the below message on one of our Nautilus cl...
- 08:06 AM Backport #48495 (Resolved): nautilus: Paxos::restart() and Paxos::shutdown() can race leading to ...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39160
m...
02/17/2021
- 08:22 PM Bug #48984 (In Progress): lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
- 08:18 PM Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
Proposed fix in https://github.com/ceph/ceph/pull/39535
Needs extensive testing
- 05:41 PM Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
If a requested scrub runs into a rejected remote reservation, the m_planned_scrub is already reset. This means tha...
- 07:51 PM Bug #48990: rados/dashboard: Health check failed: Telemetry requires re-opt-in (TELEMETRY_CHANGED...
- https://github.com/ceph/ceph/pull/39484 merged
- 04:29 PM Backport #49073: nautilus: crash in Objecter and CRUSH map lookup
- Kefu Chai wrote:
> https://github.com/ceph/ceph/pull/39197
merged
- 04:29 PM Backport #48495: nautilus: Paxos::restart() and Paxos::shutdown() can race leading to use-after-f...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/39160
merged
- 10:20 AM Backport #49320 (Resolved): octopus: thrash_cache_writeback_proxy_none: FAILED ceph_assert(versio...
- https://github.com/ceph/ceph/pull/39578
- 10:15 AM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
- http://qa-proxy.ceph.com/teuthology/yuriw-2021-02-16_16:01:09-rados-wip-yuri-testing-2021-02-08-1109-octopus-distro-b...
02/16/2021
- 10:51 PM Bug #49259 (Need More Info): test_rados_api tests timeout with cephadm (plus extremely large OSD ...
- 09:30 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
- From IRC:...
- 06:00 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
- Sebastian Wagner wrote:
> sage: this is related to thrashing and only happens within cephadm. non-cephadm is not aff...
- 08:47 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
- Argh! So it does, my bad. Please ignore comment 22 for now.
- 08:44 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
- Deepika Upadhyay wrote:
> /ceph/teuthology-archive/yuriw-2021-02-15_20:25:26-rados-wip-yuri3-testing-2021-02-15-1020...
- 07:51 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
- Deepika Upadhyay wrote:
> /ceph/teuthology-archive/yuriw-2021-02-15_20:25:26-rados-wip-yuri3-testing-2021-02-15-1020...
- 07:14 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
- -/ceph/teuthology-archive/yuriw-2021-02-15_20:25:26-rados-wip-yuri3-testing-2021-02-15-1020-nautilus-distro-basic-gib...
- 03:28 PM Bug #49303: FTBFS due to cmake's inability to find std::filesystem on a CentOS8 on aarch64
- Deepika, -I don't understand why or how the "workaround" addresses the issue here. probably you could file a PR based...
- 10:58 AM Bug #49303: FTBFS due to cmake's inability to find std::filesystem on a CentOS8 on aarch64
- hey Kefu! Should we use this workaround while the real bug is being fixed?...
- 08:39 AM Bug #49303: FTBFS due to cmake's inability to find std::filesystem on a CentOS8 on aarch64
- created https://github.com/ceph/ceph/pull/39491 in hope to work around this.
- 04:49 AM Bug #49303: FTBFS due to cmake's inability to find std::filesystem on a CentOS8 on aarch64
- filed https://bugzilla.redhat.com/show_bug.cgi?id=1929043
- 04:48 AM Bug #49303 (In Progress): FTBFS due to cmake's inability to find std::filesystem on a CentOS8 on ...
- ...
- 01:23 PM Bug #49190: LibRadosMiscConnectFailure_ConnectFailure_Test: FAILED ceph_assert(p != obs_call_gate...
- Bug #40868 is not related
02/15/2021
- 10:35 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
- ...
- 03:40 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
- sage: this is related to thrashing and only happens within cephadm. non-cephadm is not affected
- 04:22 AM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
- Managed to reproduce this with some manageable large osd logs.
On the first osd, just before the slow ops begin we...