Project

General

Profile

Activity

From 02/15/2021 to 03/16/2021

03/16/2021

08:07 PM Support #49847 (Closed): OSD Fails to init after upgrading to octopus: _deferred_replay failed to...
An OSD fails to start after upgrading from mimic 13.2.2 to octopus 15.2.9.
It seems that bluestore first fails at...
Eetu Lampsijärvi
03:45 PM Bug #49832 (New): Segmentation fault: in thread_name:ms_dispatch
... Deepika Upadhyay
03:22 PM Bug #49781: unittest_mempool.check_shard_select failed
The test condition should not be too strict because there really is no way to predict the result. It is however good ... Loïc Dachary
12:56 PM Bug #49781: unittest_mempool.check_shard_select failed
Using "pthread_self for sharding":https://github.com/ceph/ceph/blob/master/src/include/mempool.h#L261-L262 is not gre... Loïc Dachary
11:25 AM Bug #49781 (In Progress): unittest_mempool.check_shard_select failed
Loïc Dachary
08:15 AM Bug #49697: prime pg temp: unexpected optimization
ping fan chen
08:14 AM Bug #49787 (Resolved): test_envlibrados_for_rocksdb.sh fails on master
Kefu Chai
06:28 AM Backport #49682 (In Progress): nautilus: OSD: shutdown of a OSD Host causes slow requests
Konstantin Shalygin

03/15/2021

10:42 PM Bug #46978 (Resolved): OSD: shutdown of a OSD Host causes slow requests
Sage Weil
10:42 PM Backport #49683 (Resolved): pacific: OSD: shutdown of a OSD Host causes slow requests
Sage Weil
10:41 PM Backport #49774 (Resolved): pacific: Get more parallel scrubs within osd_max_scrubs limits
Sage Weil
09:56 PM Backport #49402 (In Progress): octopus: rados: Health check failed: 1/3 mons down, quorum a,c (MO...
Neha Ojha
09:55 PM Backport #49401 (In Progress): pacific: rados: Health check failed: 1/3 mons down, quorum a,c (MO...
Neha Ojha
08:15 PM Backport #49817 (Resolved): pacific: mon: promote_standby does not update available_modules
https://github.com/ceph/ceph/pull/40132 Backport Bot
08:15 PM Backport #49816 (Resolved): octopus: mon: promote_standby does not update available_modules
https://github.com/ceph/ceph/pull/40757 Backport Bot
08:11 PM Bug #49778 (Pending Backport): mon: promote_standby does not update available_modules
Sage Weil
05:26 PM Bug #49810 (Need More Info): rados/singleton: with msgr-failures/none MON_DOWN due to haven't for...
... Neha Ojha
05:16 PM Bug #49809 (Need More Info): 1 out of 3 mon crashed in MonitorDBStore::get_synchronizer
We experienced a single mon crash (out of 3 mons); we observed no other issues on the machine or the cluster.
I a...
Christian Rohmann
03:02 PM Bug #48793 (Resolved): out of order op
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
03:02 PM Bug #48990 (Resolved): rados/dashboard: Health check failed: Telemetry requires re-opt-in (TELEME...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
10:38 AM Bug #49781: unittest_mempool.check_shard_select failed
master also... Kefu Chai
09:38 AM Bug #49779 (Resolved): standalone: osd-recovery-scrub.sh: Recovery never started
Kefu Chai
09:22 AM Bug #49758 (Resolved): messages/MOSDPGNotify.h: virtual void MOSDPGNotify::encode_payload(uint64_...
Kefu Chai
09:10 AM Backport #49796 (Resolved): pacific: pool application metadata not propagated to the cache tier
https://github.com/ceph/ceph/pull/40119 Backport Bot
09:10 AM Backport #49795 (Resolved): octopus: pool application metadata not propagated to the cache tier
https://github.com/ceph/ceph/pull/40274 Backport Bot
09:09 AM Bug #49788 (Pending Backport): pool application metadata not propagated to the cache tier
Kefu Chai
01:39 AM Bug #49696: all mons crash suddenly and can't restart unless cephx is disabled
Neha Ojha wrote:
> can you share a coredump from the monitor, if the issue is still reproducible?
I'm afraid not....
wencong wan

03/14/2021

11:52 AM Bug #49781: unittest_mempool.check_shard_select failed
https://github.com/ceph/ceph/pull/39978#discussion_r593341155 singuliere _
06:14 AM Feature #49789: common/TrackedOp: add op priority for TrackedOp
PR:https://github.com/ceph/ceph/pull/40060 yite gu
06:12 AM Feature #49789 (Fix Under Review): common/TrackedOp: add op priority for TrackedOp
Currently, we cannot see a request's priority via ceph daemon /var/run/ceph/ceph-osd.x.asok dump_historic_ops
if this comma...
yite gu
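The admin-socket query this feature request refers to can be run as follows (a sketch; osd.0 is an illustrative daemon name, and the exact output fields vary by release):

```shell
# Dump the recent/slow ops an OSD has tracked (osd.0 is illustrative).
ceph daemon /var/run/ceph/ceph-osd.0.asok dump_historic_ops

# Each op entry includes a description, an initiated_at timestamp, a
# duration, and an event timeline; as this feature request notes, the
# op's priority is not part of that output today.
```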
04:17 AM Bug #49779 (Fix Under Review): standalone: osd-recovery-scrub.sh: Recovery never started
Kefu Chai

03/13/2021

04:35 PM Bug #49788 (Fix Under Review): pool application metadata not propagated to the cache tier
Sage Weil
04:27 PM Bug #49788 (Resolved): pool application metadata not propagated to the cache tier
if you have a base pool with application metadata, that application is not propagated to the cache tier.
This is a...
Sage Weil
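A minimal reproduction of the propagation gap might look like this (a sketch; the pool names and the rbd application are illustrative, and the final command is the manual workaround, not the fix):

```shell
# Create a base pool with application metadata and attach a cache tier.
ceph osd pool create base 8
ceph osd pool create cache 8
ceph osd pool application enable base rbd
ceph osd tier add base cache
ceph osd tier cache-mode cache writeback

# Before the fix, 'cache' does not inherit the 'rbd' application from
# 'base'; setting it explicitly works around the resulting warning:
ceph osd pool application enable cache rbd
```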
09:03 AM Bug #49787 (Resolved): test_envlibrados_for_rocksdb.sh fails on master
... Kefu Chai
08:27 AM Bug #49781: unittest_mempool.check_shard_select failed
It happened 5 days ago at https://github.com/ceph/ceph/pull/39883#issuecomment-791944956 and is related to https://gi... Loïc Dachary
03:33 AM Bug #49781 (Resolved): unittest_mempool.check_shard_select failed
This test is probabilistic; recording it here to see whether we find it failing more frequently.
From https://jenkins.ceph...
Josh Durgin

03/12/2021

09:36 PM Bug #49696 (Need More Info): all mons crash suddenly and can't restart unless cephx is disabled
can you share a coredump from the monitor, if the issue is still reproducible? Neha Ojha
09:31 PM Bug #49734 (Closed): [OSD]ceph osd crashes and prints Segmentation fault
Luminous is EOL; please re-open if you see the same issue in later releases. Neha Ojha
09:00 PM Backport #49775 (In Progress): nautilus: Get more parallel scrubs within osd_max_scrubs limits
David Zafman
06:20 PM Backport #49775 (Rejected): nautilus: Get more parallel scrubs within osd_max_scrubs limits
https://github.com/ceph/ceph/pull/40142 Backport Bot
08:58 PM Bug #49779 (Resolved): standalone: osd-recovery-scrub.sh: Recovery never started

In master and pacific, the TEST_recovery_scrub_2 subtest in qa/standalone/scrub/osd-recovery-scrub.sh has an interm...
David Zafman
08:55 PM Backport #49776 (In Progress): octopus: Get more parallel scrubs within osd_max_scrubs limits
David Zafman
06:20 PM Backport #49776 (Rejected): octopus: Get more parallel scrubs within osd_max_scrubs limits
https://github.com/ceph/ceph/pull/40088 Backport Bot
08:52 PM Backport #49774 (In Progress): pacific: Get more parallel scrubs within osd_max_scrubs limits
David Zafman
06:20 PM Backport #49774 (Resolved): pacific: Get more parallel scrubs within osd_max_scrubs limits
https://github.com/ceph/ceph/pull/40077 Backport Bot
08:03 PM Bug #49778: mon: promote_standby does not update available_modules
I think we probably also need a workaround so that we can upgrade from old ceph versions that have this bug... Sage Weil
08:00 PM Bug #49778 (Resolved): mon: promote_standby does not update available_modules
originally observed during upgrade from <15.2.5 via cephadm: the cephadm migration runs immediately after upgrade and... Sage Weil
07:46 PM Bug #45423: api_tier_pp: [ FAILED ] LibRadosTwoPoolsPP.HitSetWrite
... Deepika Upadhyay
06:53 PM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
... Deepika Upadhyay
06:29 PM Bug #49777 (Resolved): test_pool_min_size: 'check for active or peered' reached maximum tries (5)...
... Deepika Upadhyay
06:20 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
... Deepika Upadhyay
06:19 PM Bug #48843 (Pending Backport): Get more parallel scrubs within osd_max_scrubs limits
David Zafman
05:12 PM Bug #47181: "sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120...
/a/yuriw-2021-03-11_19:01:40-rados-octopus-distro-basic-smithi/5956578/ Neha Ojha
01:59 PM Bug #48959: Primary OSD crash caused corrupted object and further crashes during backfill after s...
We just ran into this again and had to remove the object to allow the PG to finish backfilling. The similarities betw... Tom Byrne
01:38 PM Bug #49409: osd run into dead loop and tell slow request when rollback snap with using cache tier
reopening this ticket, as its fix (https://github.com/ceph/ceph/pull/39593) was reverted as the fix of #49726 Kefu Chai
01:38 PM Bug #49409 (New): osd run into dead loop and tell slow request when rollback snap with using cach...
Kefu Chai
01:37 PM Bug #49726 (Resolved): src/test/osd/RadosModel.h: FAILED ceph_assert(!version || comp->get_versio...
Kefu Chai
07:29 AM Bug #49726: src/test/osd/RadosModel.h: FAILED ceph_assert(!version || comp->get_version64() == ve...
created https://github.com/ceph/ceph/pull/40057 as an intermediate fix. Kefu Chai
12:27 PM Bug #49427 (Resolved): FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_...
Kefu Chai
11:50 AM Bug #48505: osdmaptool crush
hanguang liu wrote:
> when osd map contains CRUSH_ITEM_NONE osd when i run:
> _./osdmaptool ./hkc4 --test-map-pgs-...
hg liu
11:44 AM Bug #48505: osdmaptool crush
hanguang liu wrote:
> when osd map contains CRUSH_ITEM_NONE osd when i run:
> _./osdmaptool ./hkc4 --test-map-pgs-...
hg liu
07:26 AM Bug #49758 (Fix Under Review): messages/MOSDPGNotify.h: virtual void MOSDPGNotify::encode_payload...
Kefu Chai
05:37 AM Bug #49754: osd/OSD.cc: ceph_abort_msg("abort() called") during OSD::shutdown()
... Kefu Chai

03/11/2021

11:03 PM Bug #49726: src/test/osd/RadosModel.h: FAILED ceph_assert(!version || comp->get_version64() == ve...
https://github.com/ceph/ceph/pull/39593#issuecomment-792503213 this is where it first showed up, most likely this PR ... Neha Ojha
02:03 AM Bug #49726: src/test/osd/RadosModel.h: FAILED ceph_assert(!version || comp->get_version64() == ve...
/a/kchai-2021-03-09_12:22:01-rados-wip-kefu-testing-2021-03-09-1847-distro-basic-smithi/5949457
/a/ideepika-2021-03-...
Neha Ojha
01:56 AM Bug #49726 (Resolved): src/test/osd/RadosModel.h: FAILED ceph_assert(!version || comp->get_versio...
... Neha Ojha
08:19 PM Backport #49054 (Resolved): pacific: pick_a_shard() always select shard 0
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39977
m...
Nathan Cutler
06:40 PM Backport #49054: pacific: pick_a_shard() always select shard 0
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/39977
merged
Yuri Weinstein
08:17 PM Backport #49670: pacific: make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p foo be...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39963
m...
Nathan Cutler
08:11 PM Backport #49565: pacific: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39844
m...
Nathan Cutler
08:08 PM Backport #49397 (Resolved): octopus: rados/dashboard: Health check failed: Telemetry requires re-...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39704
m...
Nathan Cutler
03:59 PM Backport #49397: octopus: rados/dashboard: Health check failed: Telemetry requires re-opt-in (TEL...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/39704
merged
Yuri Weinstein
06:56 PM Bug #49758 (Resolved): messages/MOSDPGNotify.h: virtual void MOSDPGNotify::encode_payload(uint64_...
... Neha Ojha
06:45 PM Bug #49754 (New): osd/OSD.cc: ceph_abort_msg("abort() called") during OSD::shutdown()
... Neha Ojha
06:04 PM Bug #47838: mon/test_mon_osdmap_prune.sh: first_pinned != trim_to
/a/yuriw-2021-03-10_21:08:51-rados-wip-yuri8-testing-2021-03-10-0901-pacific-distro-basic-smithi/5954442 - similar Neha Ojha
01:31 PM Bug #47380: mon: slow ops due to osd_failure
an alternative fix: https://github.com/ceph/ceph/pull/40033 Kefu Chai
07:11 AM Bug #49734 (Closed): [OSD]ceph osd crashes and prints Segmentation fault
This error occurred on Mar 6th; osd.37 was down and out with the below log info (ceph-osd.37.log-20210306):
2021-03-...
文军 丁
07:07 AM Backport #49533 (In Progress): octopus: osd ok-to-stop too conservative
Kefu Chai
03:30 AM Backport #49730 (Resolved): octopus: debian ceph-common package post-inst clobbers ownership of c...
https://github.com/ceph/ceph/pull/40275 Backport Bot
03:30 AM Bug #49727: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs

Note that instead of a delay, you can tell the OSDs to flush their pg stats. I wonder if that flushes to the mon and...
David Zafman
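The flush David mentions can be issued directly, avoiding a sleep for the periodic stats report (a sketch; whether the propagation onward to the mon is also synchronous is exactly the open question above):

```shell
# Ask every OSD to push its PG stats immediately instead of waiting
# for the periodic report.
ceph tell osd.* flush_pg_stats
```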
03:16 AM Bug #49727 (Resolved): lazy_omap_stats_test: "ceph osd deep-scrub all" hangs

This has been seen in cases where all of pool 1's PGs are scrubbed and none of pool 2's. I suggest that this is beca...
David Zafman
03:30 AM Backport #49729 (Resolved): nautilus: debian ceph-common package post-inst clobbers ownership of ...
https://github.com/ceph/ceph/pull/40698 Backport Bot
03:30 AM Backport #49728 (Resolved): pacific: debian ceph-common package post-inst clobbers ownership of c...
https://github.com/ceph/ceph/pull/40248 Backport Bot
03:26 AM Backport #49145 (Resolved): pacific: out of order op
Kefu Chai
03:25 AM Bug #49677 (Pending Backport): debian ceph-common package post-inst clobbers ownership of cephadm...
Kefu Chai

03/10/2021

10:41 PM Backport #49682: nautilus: OSD: shutdown of a OSD Host causes slow requests
please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/40014
ceph-backport.sh versi...
Mauricio Oliveira
10:40 PM Backport #49681: octopus: OSD: shutdown of a OSD Host causes slow requests
please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/40013
ceph-backport.sh versi...
Mauricio Oliveira
04:21 PM Bug #23117: PGs stuck in "activating" after osd_max_pg_per_osd_hard_ratio has been exceeded once
I am aware of one place where we do log withholding PG creation: the following log message in the OSD logs.
https://...
Vikhyat Umrao
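To check how close a cluster is to the limit discussed here, the two options and the per-OSD PG counts can be inspected (a sketch; `ceph config get` requires Mimic or later):

```shell
# The hard limit is mon_max_pg_per_osd * osd_max_pg_per_osd_hard_ratio.
ceph config get mon mon_max_pg_per_osd
ceph config get osd osd_max_pg_per_osd_hard_ratio

# 'ceph osd df' includes a PGS column per OSD to compare against that product.
ceph osd df
```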
01:08 PM Bug #46978: OSD: shutdown of a OSD Host causes slow requests
Hey Konstantin and Loïc,
Understood; thanks!
Mauricio Oliveira
07:57 AM Bug #46978: OSD: shutdown of a OSD Host causes slow requests
Hi Mauricio,
You are welcome to join the Stable Release team on IRC at #ceph-backports to discuss and resolve the...
Loïc Dachary
06:47 AM Bug #46978: OSD: shutdown of a OSD Host causes slow requests
Mauricio, just make a backport PR at GitHub, we'll attach it to tracker later. Konstantin Shalygin
08:54 AM Bug #49697 (Resolved): prime pg temp: unexpected optimization
I encountered a problem when splitting PGs that eventually causes a PG
to become inactive.
I think the root reas...
fan chen
07:40 AM Bug #49696 (Need More Info): all mons crash suddenly and can't restart unless cephx is disabled
crash info
{
"os_version_id": "7",
"utsname_release": "4.14.0jsdx_kernel",
"os_name": "CentOS Linux...
wencong wan
02:13 AM Backport #49533 (Rejected): octopus: osd ok-to-stop too conservative
Per Sage
> I'm not sure if this is worth backporting. The primary benefit is faster upgrades, and it's the target ...
Kefu Chai
01:24 AM Bug #47419 (Resolved): make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p foo bench...
Neha Ojha
01:24 AM Backport #49670 (Resolved): pacific: make check: src/test/smoke.sh: TEST_multimon: timeout 8 rado...
Neha Ojha
12:02 AM Backport #49565 (Resolved): pacific: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
singuliere _

03/09/2021

11:58 PM Backport #49053 (In Progress): octopus: pick_a_shard() always select shard 0
singuliere _
11:58 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
https://github.com/ceph/ceph/pull/39844 merged Yuri Weinstein
11:57 PM Backport #49054 (In Progress): pacific: pick_a_shard() always select shard 0
singuliere _
11:13 PM Backport #49691 (Rejected): pacific: ceph_assert(is_primary()) in PG::scrub()
David Zafman
11:10 PM Backport #49691 (Rejected): pacific: ceph_assert(is_primary()) in PG::scrub()
Backport Bot
11:13 PM Bug #48712 (Resolved): ceph_assert(is_primary()) in PG::scrub()
David Zafman
11:09 PM Bug #48712 (Pending Backport): ceph_assert(is_primary()) in PG::scrub()
David Zafman
11:09 PM Bug #48712 (Resolved): ceph_assert(is_primary()) in PG::scrub()
David Zafman
11:12 PM Backport #49377 (In Progress): pacific: building libcrc32
singuliere _
10:55 PM Backport #48985 (In Progress): octopus: ceph osd df tree reporting incorrect SIZE value for rack ...
Brad Hubbard
10:26 PM Bug #49689 (Resolved): osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch...
... Neha Ojha
10:23 PM Bug #36304: FAILED ceph_assert(p != pg_slots.end()) in OSDShard::register_and_wake_split_child(PG*)
/a/yuriw-2021-03-08_21:03:18-rados-wip-yuri5-testing-2021-03-08-1049-pacific-distro-basic-smithi/5947439 Neha Ojha
10:21 PM Bug #49688 (Can't reproduce): FAILED ceph_assert(is_primary()) in submit_log_entries during Promo...
... Neha Ojha
09:43 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
Samuel Just wrote:
> I'm...not sure what that if block is supposed to do. It was introduced as part of the initial ...
Neha Ojha
03:21 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
I'm...not sure what that if block is supposed to do. It was introduced as part of the initial overwrites patch seque... Samuel Just
09:31 PM Backport #49670 (In Progress): pacific: make check: src/test/smoke.sh: TEST_multimon: timeout 8 r...
https://github.com/ceph/ceph/pull/39963 Neha Ojha
03:45 PM Backport #49670 (Resolved): pacific: make check: src/test/smoke.sh: TEST_multimon: timeout 8 rado...
https://github.com/ceph/ceph/pull/39963 Backport Bot
07:46 PM Backport #49683: pacific: OSD: shutdown of a OSD Host causes slow requests
please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/39957
ceph-backport.sh versi...
Mauricio Oliveira
07:35 PM Backport #49683 (Resolved): pacific: OSD: shutdown of a OSD Host causes slow requests
https://github.com/ceph/ceph/pull/39957 Backport Bot
07:40 PM Bug #46978: OSD: shutdown of a OSD Host causes slow requests
Igor, thanks.
I'd like to / can work on submitting the backport PRs, if that's OK.
In the future, if I want to ...
Mauricio Oliveira
07:33 PM Bug #46978 (Pending Backport): OSD: shutdown of a OSD Host causes slow requests
Igor Fedotov
07:25 PM Bug #46978: OSD: shutdown of a OSD Host causes slow requests
The master PR has been merged.
Can someone update Status to Pending Backport, please?
Thanks!
Mauricio Oliveira
07:35 PM Backport #49682 (Resolved): nautilus: OSD: shutdown of a OSD Host causes slow requests
https://github.com/ceph/ceph/pull/40014 Backport Bot
07:35 PM Backport #49681 (Resolved): octopus: OSD: shutdown of a OSD Host causes slow requests
https://github.com/ceph/ceph/pull/40013 Backport Bot
05:57 PM Bug #49677 (Fix Under Review): debian ceph-common package post-inst clobbers ownership of cephadm...
Sage Weil
05:54 PM Bug #49677 (Resolved): debian ceph-common package post-inst clobbers ownership of cephadm log dirs
The Debian/Ubuntu ceph uid is different from the RHEL/CentOS one used by the container. The postinst does a chown -R...
04:45 PM Backport #47364 (Resolved): luminous: pgs inconsistent, union_shard_errors=missing
Nathan Cutler
03:43 PM Bug #47419 (Pending Backport): make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p f...
https://jenkins.ceph.com/job/ceph-pull-requests/70801/consoleFull#10356408840526d21-3511-427d-909c-dd086c0d1034 - thi... Neha Ojha
08:32 AM Bug #48786 (Resolved): api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapRefcount2...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
08:32 AM Bug #48984 (Resolved): lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
06:30 AM Backport #49642: pacific: Disable and re-enable clog_to_monitors could trigger assertion
https://github.com/ceph/ceph/pull/39938 gerald yang
04:11 AM Backport #49641: octopus: Disable and re-enable clog_to_monitors could trigger assertion
https://github.com/ceph/ceph/pull/39935 gerald yang

03/08/2021

05:16 PM Backport #49482: pacific: api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapRefcou...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39773
m...
Nathan Cutler
05:14 PM Backport #49532: pacific: osd ok-to-stop too conservative
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39737
m...
Nathan Cutler
05:07 PM Backport #49529 (In Progress): nautilus: "ceph osd crush set|reweight-subtree" commands do not se...
Nathan Cutler
05:06 PM Backport #49530 (In Progress): octopus: "ceph osd crush set|reweight-subtree" commands do not set...
Nathan Cutler
05:05 PM Backport #49528 (Resolved): pacific: "ceph osd crush set|reweight-subtree" commands do not set we...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39736
m...
Nathan Cutler
05:02 PM Backport #49526: pacific: mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound to class '...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39735
m...
Nathan Cutler
05:01 PM Backport #49404: pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39597
m...
Nathan Cutler
04:59 PM Backport #49404: pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
https://github.com/ceph/ceph/pull/39796
https://github.com/ceph/ceph/pull/39597
(double whammy)
Nathan Cutler
01:41 PM Backport #49640: nautilus: Disable and re-enable clog_to_monitors could trigger assertion
https://github.com/ceph/ceph/pull/39912 gerald yang
11:44 AM Bug #49409 (Pending Backport): osd run into dead loop and tell slow request when rollback snap wi...
Kefu Chai

03/07/2021

10:02 PM Backport #49377: pacific: building libcrc32
please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/39902
ceph-backport.sh versi...
singuliere _
03:58 PM Backport #49482 (Resolved): pacific: api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/Manifes...
Loïc Dachary
03:55 PM Backport #49642 (Resolved): pacific: Disable and re-enable clog_to_monitors could trigger assertion
https://github.com/ceph/ceph/pull/40247 Backport Bot
03:55 PM Backport #49641 (Resolved): octopus: Disable and re-enable clog_to_monitors could trigger assertion
https://github.com/ceph/ceph/pull/39935 Backport Bot
03:55 PM Backport #49640 (Resolved): nautilus: Disable and re-enable clog_to_monitors could trigger assertion
https://github.com/ceph/ceph/pull/39912 Backport Bot
03:54 PM Bug #48946 (Pending Backport): Disable and re-enable clog_to_monitors could trigger assertion
Kefu Chai

03/06/2021

02:58 PM Backport #49533 (In Progress): octopus: osd ok-to-stop too conservative
https://github.com/ceph/ceph/pull/39887 Kefu Chai
02:43 PM Backport #49073 (Resolved): nautilus: crash in Objecter and CRUSH map lookup
Kefu Chai
01:16 AM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
This is where we sent the subops... Neha Ojha

03/05/2021

11:10 PM Bug #49524: ceph_test_rados_delete_pools_parallel didn't start
https://tracker.ceph.com/issues/45946 looks very similar Neha Ojha
11:04 PM Bug #49525: found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10121515-14:e4 snaps missi...
Ronen, can you check whether this is caused by a race between scrub and snap removal? Neha Ojha
10:53 PM Bug #49403 (Duplicate): Caught signal (aborted) on mgrmap epoch 1 during librados init (rados-str...
Neha Ojha
07:15 PM Bug #48298: hitting mon_max_pg_per_osd right after creating OSD, then decreases slowly
Another observation: I have nobackfill set, and I'm currently adding 8 new OSDs.
The first of the newly added OSDs...
Jonas Jelten
05:15 PM Backport #49482: pacific: api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapRefcou...
Myoungwon Oh wrote:
> https://github.com/ceph/ceph/pull/39773
merged
Yuri Weinstein
02:39 AM Bug #47419 (Resolved): make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p foo bench...
Hopefully Neha Ojha
01:49 AM Backport #49565 (In Progress): pacific: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
https://github.com/ceph/ceph/pull/39844 Neha Ojha

03/04/2021

11:34 PM Bug #47419 (Fix Under Review): make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p f...
Sage Weil
11:34 PM Bug #47419 (Duplicate): make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p foo benc...
Sage Weil
04:33 PM Bug #47419: make check: src/test/smoke.sh: TEST_multimon: timeout 8 rados -p foo bench 4 write -b...
https://jenkins.ceph.com/job/ceph-pull-requests/70513/consoleFull#10356408840526d21-3511-427d-909c-dd086c0d1034 Neha Ojha
11:21 PM Bug #49614 (Duplicate): src/test/smoke.sh:56: TEST_multimon: timeout 8 rados -p foo bench 4 writ...
Neha Ojha
11:11 PM Bug #49614: src/test/smoke.sh:56: TEST_multimon: timeout 8 rados -p foo bench 4 write -b 4096 --...
https://jenkins.ceph.com/job/ceph-pull-requests/70513/consoleFull#-1656021838e840cee4-f4a4-4183-81dd-42855615f2c1 Sage Weil
10:58 PM Bug #49614 (Duplicate): src/test/smoke.sh:56: TEST_multimon: timeout 8 rados -p foo bench 4 writ...
... Sage Weil
09:14 PM Bug #44631: ceph pg dump error code 124
/ceph/teuthology-archive/pdonnell-2021-03-04_03:51:01-fs-wip-pdonnell-testing-20210303.195715-distro-basic-smithi/593... Patrick Donnelly
05:39 PM Bug #44631: ceph pg dump error code 124
/a/yuriw-2021-03-02_20:59:34-rados-wip-yuri7-testing-2021-03-02-1118-nautilus-distro-basic-smithi/5928174 Neha Ojha
09:08 PM Backport #49532 (Resolved): pacific: osd ok-to-stop too conservative
Sage Weil
06:47 PM Backport #49404 (Resolved): pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
Sage Weil
06:47 PM Backport #49526 (Resolved): pacific: mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound...
Sage Weil
06:44 PM Bug #45423: api_tier_pp: [ FAILED ] LibRadosTwoPoolsPP.HitSetWrite
/a/sage-2021-03-03_16:41:22-rados-wip-sage2-testing-2021-03-03-0744-pacific-distro-basic-smithi/5930113
Sage Weil
04:48 PM Bug #23117: PGs stuck in "activating" after osd_max_pg_per_osd_hard_ratio has been exceeded once
We also hit this issue last week on Ceph version 12.2.11.
The cluster is configured with a replication factor of 3; issu...
Ross Martyn
01:21 PM Backport #48987: nautilus: ceph osd df tree reporting incorrect SIZE value for rack having an emp...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39126
m...
Nathan Cutler

03/03/2021

10:14 PM Bug #49104: crush weirdness: degraded PGs not marked as such, and choose_total_tries = 50 is too ...
Thanks for the analysis Neha.
Something that perhaps wasn't clear in comment 2: in each case where I print the `...
Dan van der Ster
06:48 PM Bug #49104 (Triaged): crush weirdness: degraded PGs not marked as such, and choose_total_tries = ...
Thanks for the detailed logs!
Firstly, the pg dump output can sometimes be a little laggy, so I am basing my asses...
Neha Ojha
09:53 PM Backport #48987 (Resolved): nautilus: ceph osd df tree reporting incorrect SIZE value for rack ha...
Brad Hubbard
04:05 PM Backport #48987: nautilus: ceph osd df tree reporting incorrect SIZE value for rack having an emp...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/39126
merged
Yuri Weinstein
08:47 PM Bug #20909: Error ETIMEDOUT: crush test failed with -110: timed out during smoke test (5 seconds)
not seen in octopus and pacific so far, but pops sometimes in nautilus:... Deepika Upadhyay
08:39 PM Bug #49591 (New): no active mgr (MGR_DOWN)" in cluster log
seen in nautilus... Deepika Upadhyay
03:37 PM Bug #49584: Ceph OSD, MDS, MGR daemon does not _only_ bind to specified address when configured t...
After removing the specific public_addr and restarting the MDSes, the situation returns to normal and the cluster reco... Stefan Kooman
03:22 PM Bug #49584 (New): Ceph OSD, MDS, MGR daemon does not _only_ bind to specified address when config...
Documentation (https://docs.ceph.com/en/octopus/rados/configuration/network-config-ref/#ceph-daemons) states the foll... Stefan Kooman
11:32 AM Backport #49055 (Resolved): nautilus: pick_a_shard() always select shard 0
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39651
m...
Nathan Cutler
09:40 AM Bug #40029: ceph-mon: Caught signal (Aborted) in (CrushWrapper::update_choose_args(CephContext*)+...
Florian Haas wrote:
> With thanks to Paul Emmerich in https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/threa...
Norman Shen
05:31 AM Bug #48417: unfound EC objects in sepia's LRC after upgrade
https://tracker.ceph.com/issues/48613#note-13 Deepika Upadhyay
12:28 AM Backport #49404 (In Progress): pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
David Zafman

03/02/2021

08:21 PM Bug #37808 (New): osd: osdmap cache weak_refs assert during shutdown
/ceph/teuthology-archive/pdonnell-2021-03-02_17:29:53-fs:verify-wip-pdonnell-testing-20210301.234318-distro-basic-smi... Patrick Donnelly
05:27 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
personal ref dir: all greps reside in **/home/ideepika/pg[3.1as0.log** on the teuthology server
job: /a/teuthology-2021...
Deepika Upadhyay
05:24 PM Bug #49572 (Duplicate): MON_DOWN: mon.c fails to join quorum after un-blacklisting mon.a
This is the same as https://tracker.ceph.com/issues/47654... Neha Ojha
04:58 PM Bug #49572 (Duplicate): MON_DOWN: mon.c fails to join quorum after un-blacklisting mon.a
/a/sage-2021-03-01_20:24:37-rados-wip-sage-testing-2021-03-01-1118-distro-basic-smithi/5924612
it looks like the s...
Sage Weil
04:38 AM Backport #49482: pacific: api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapRefcou...
https://github.com/ceph/ceph/pull/39773 Myoungwon Oh

03/01/2021

11:25 PM Bug #49409 (Fix Under Review): osd run into dead loop and tell slow request when rollback snap wi...
Neha Ojha
10:16 PM Backport #49567 (Resolved): nautilus: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
https://github.com/ceph/ceph/pull/40697 Backport Bot
10:15 PM Backport #49566 (Resolved): octopus: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
https://github.com/ceph/ceph/pull/40756 Backport Bot
10:15 PM Backport #49565 (Resolved): pacific: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
https://github.com/ceph/ceph/pull/39844 Backport Bot
10:10 PM Bug #47719 (Pending Backport): api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
Brad Hubbard
05:15 PM Backport #49055: nautilus: pick_a_shard() always select shard 0
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/39651
merged
Yuri Weinstein
03:43 AM Bug #49543 (New): scrub a pool which size is 1 but found stat mismatch on objects and bytes

the pg has only one primary osd:...
Liu Lan

02/28/2021

11:55 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
/a/sage-2021-02-28_18:35:15-rados-wip-sage-testing-2021-02-28-1217-distro-basic-smithi/5921418
Sage Weil
11:52 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
I hit another instance of this here: /a/sage-2021-02-28_18:35:15-rados-wip-sage-testing-2021-02-28-1217-distro-basic-... Sage Weil
09:20 PM Bug #46318: mon_recovery: quorum_status times out
same symptom... cli command fails to contact mon
/a/sage-2021-02-28_18:35:15-rados-wip-sage-testing-2021-02-28-121...
Sage Weil
09:37 AM Bug #40029: ceph-mon: Caught signal (Aborted) in (CrushWrapper::update_choose_args(CephContext*)+...
Florian Haas wrote:
> With thanks to Paul Emmerich in https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/threa...
Norman Shen

02/27/2021

08:59 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
/a/sage-2021-02-27_17:50:29-rados-wip-sage2-testing-2021-02-27-0921-pacific-distro-basic-smithi/5919090
Sage Weil
03:20 PM Backport #49533 (Resolved): octopus: osd ok-to-stop too conservative
https://github.com/ceph/ceph/pull/39887 Backport Bot
03:20 PM Backport #49532 (Resolved): pacific: osd ok-to-stop too conservative
https://github.com/ceph/ceph/pull/39737 Backport Bot
03:20 PM Backport #49531 (Resolved): nautilus: osd ok-to-stop too conservative
https://github.com/ceph/ceph/pull/40676 Backport Bot
03:16 PM Bug #49392 (Pending Backport): osd ok-to-stop too conservative
pacific backport: https://github.com/ceph/ceph/pull/39737
Sage Weil
03:16 PM Backport #49530 (Resolved): octopus: "ceph osd crush set|reweight-subtree" commands do not set we...
https://github.com/ceph/ceph/pull/39919 Backport Bot
03:16 PM Backport #49529 (Resolved): nautilus: "ceph osd crush set|reweight-subtree" commands do not set w...
https://github.com/ceph/ceph/pull/39920 Backport Bot
03:15 PM Backport #49528 (Resolved): pacific: "ceph osd crush set|reweight-subtree" commands do not set we...
https://github.com/ceph/ceph/pull/39736 Backport Bot
03:15 PM Backport #49527 (Resolved): octopus: mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound...
https://github.com/ceph/ceph/pull/40276 Backport Bot
03:15 PM Backport #49526 (Resolved): pacific: mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound...
https://github.com/ceph/ceph/pull/39735 Backport Bot
03:14 PM Bug #48065 (Pending Backport): "ceph osd crush set|reweight-subtree" commands do not set weight o...
pacific backport: https://github.com/ceph/ceph/pull/39736
Sage Weil
03:13 PM Bug #49212 (Pending Backport): mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound to cl...
backport for pacific: https://github.com/ceph/ceph/pull/39735
Sage Weil
02:40 PM Bug #49525 (Resolved): found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10121515-14:e4 ...
... Sage Weil
02:36 PM Bug #48997: rados/singleton/all/recovery-preemption: defer backfill|defer recovery not found in logs
/a/sage-2021-02-26_22:19:00-rados-wip-sage-testing-2021-02-26-1412-distro-basic-smithi/5916984
Sage Weil
02:26 PM Bug #49521: build failure on centos-8, bad/incorrect use of #ifdef/#elif
N.B. fedora-33 and later have sigdescr_np(); ditto for rhel-9.
Also strsignal(3) is not *MT-SAFE*. (It's also not ...
Kaleb KEITHLEY
02:26 PM Bug #43584: MON_DOWN during mon_join process
/a/sage-2021-02-26_22:19:00-rados-wip-sage-testing-2021-02-26-1412-distro-basic-smithi/5917141
I think this is a g...
Sage Weil
02:17 PM Bug #49524 (Resolved): ceph_test_rados_delete_pools_parallel didn't start
... Sage Weil
02:11 PM Bug #49523 (New): rebuild-mondb doesn't populate mgr commands -> pg dump EINVAL
... Sage Weil

02/26/2021

09:37 PM Bug #49521 (New): build failure on centos-8, bad/incorrect use of #ifdef/#elif
building 15.2.9 for CentOS Storage SIG el8 I hit this compile error:
cmake ... -DWITH_REENTRANT_STRSIGNAL=ON ......
Kaleb KEITHLEY
03:30 PM Bug #44286: Cache tiering shows unfound objects after OSD reboots
We even hit that bug twice today by rebooting two of our cache servers.
What's interesting is that only hit_set ob...
Jan-Philipp Litza
01:53 PM Feature #49505 (New): Warn about extremely anomalous commit_latencies
In a EC cluster with ~500 hdd osds, we suffered a drop in write performance from 30GiB/s down to 3GiB/s due to one si... Dan van der Ster

02/25/2021

10:24 PM Bug #49461 (Duplicate): rados/upgrade/pacific-x/parallel: upgrade incomplete
Neha Ojha
05:19 PM Backport #49397 (In Progress): octopus: rados/dashboard: Health check failed: Telemetry requires ...
Nathan Cutler
05:18 PM Backport #49398 (Resolved): pacific: rados/dashboard: Health check failed: Telemetry requires re-...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39484
m...
Nathan Cutler
05:18 PM Backport #49398 (In Progress): pacific: rados/dashboard: Health check failed: Telemetry requires ...
Nathan Cutler
03:00 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
/a/sage-2021-02-24_21:26:56-rados-wip-sage-testing-2021-02-24-1457-distro-basic-smithi/5912284
Sage Weil
02:47 PM Bug #49487 (Fix Under Review): osd:scrub skip some pg
Kefu Chai
08:27 AM Bug #49487 (Resolved): osd:scrub skip some pg
ENV: 1 mon, 1 mgr, 1 osd
create a pool with 8 pg, change the value of osd_scrub_min_interval to trigger reschedule
...
wencong wan
02:37 PM Bug #46318 (Need More Info): mon_recovery: quorum_status times out
having trouble reproducing (after about 150 jobs). adding increased debugging to master with https://github.com/ceph... Sage Weil
01:29 PM Backport #49134: pacific: test_envlibrados_for_rocksdb.sh: EnvLibradosMutipoolTest.DBBulkLoadKeys...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39264
m...
Nathan Cutler
09:23 AM Support #49489 (New): Getting Long heartbeat and slow requests on ceph luminous 12.2.13
1. Current environment is integrated with ceph and openstack
2. It has NVME and SSD disks only
3. We have create fo...
ceph ceph
07:09 AM Bug #49427 (In Progress): FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing()....
https://github.com/ceph/ceph/pull/39670 Myoungwon Oh
06:30 AM Backport #49482 (Resolved): pacific: api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/Manifes...
https://github.com/ceph/ceph/pull/39773 Backport Bot
06:29 AM Bug #47024 (Duplicate): rados/test.sh: api_tier_pp LibRadosTwoPoolsPP.ManifestSnapRefcount failed
Kefu Chai
06:29 AM Bug #48915 (Duplicate): api_tier_pp: LibRadosTwoPoolsPP.ManifestFlushDupCount failed
Kefu Chai
06:28 AM Bug #48786 (Pending Backport): api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapR...
Kefu Chai
05:29 AM Bug #49460 (Resolved): qa/workunits/cephtool/test.sh: test_mon_osd_create_destroy fails
Kefu Chai
03:58 AM Bug #49468 (Resolved): rados: "Command crashed: 'rados -p cephfs_metadata rmxattr 10000000000.000...
Kefu Chai
02:58 AM Bug #49468: rados: "Command crashed: 'rados -p cephfs_metadata rmxattr 10000000000.00000000 parent'"
Add more failures. Patrick Donnelly
02:42 AM Bug #49468 (Resolved): rados: "Command crashed: 'rados -p cephfs_metadata rmxattr 10000000000.000...
... Patrick Donnelly
12:04 AM Bug #49463: qa/standalone/misc/rados-striper.sh: Caught signal in thread_name:rados
rados/singleton/{all/radostool mon_election/classic msgr-failures/many msgr/async-v2only objectstore/bluestore-comp-z... Neha Ojha

02/24/2021

10:07 PM Bug #49463 (Can't reproduce): qa/standalone/misc/rados-striper.sh: Caught signal in thread_name:r...
... Neha Ojha
09:34 PM Bug #49461 (Duplicate): rados/upgrade/pacific-x/parallel: upgrade incomplete
... Neha Ojha
09:26 PM Bug #49460 (Fix Under Review): qa/workunits/cephtool/test.sh: test_mon_osd_create_destroy fails
Neha Ojha
08:58 PM Bug #49460 (Resolved): qa/workunits/cephtool/test.sh: test_mon_osd_create_destroy fails
... Neha Ojha
09:00 PM Bug #49212 (Fix Under Review): mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound to cl...
earlier,... Sage Weil
08:48 PM Bug #49212 (In Progress): mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound to class '...
... Sage Weil
10:45 AM Bug #49353: Random OSDs being marked as down even when there is very less activity on the cluster...
Nokia ceph-users wrote:
> Do you suspect that this is something relevant to 14.2.2 and could be solved with a higher...
Igor Fedotov
04:55 AM Bug #49353: Random OSDs being marked as down even when there is very less activity on the cluster...
Do you suspect that this is something relevant to 14.2.2 and could be solved with a higher version? Nokia ceph-users
09:16 AM Bug #49448 (New): If OSD types are changed, pools rules can become unresolvable without providing...
When some OSDs in a cluster are of a specific type, such as hdd_aes, and the type is used in a rule, if the type of s... linzhou zhou
05:07 AM Bug #49428 (Triaged): ceph_test_rados_api_snapshots fails with "rados_mon_command osd pool create...
Brad Hubbard
04:50 AM Bug #49428: ceph_test_rados_api_snapshots fails with "rados_mon_command osd pool create failed wi...
TLDR skip to ********* MON.A ************** below.
So this looks like a race. The calls seem to be serialized in t...
Brad Hubbard
12:48 AM Bug #49428: ceph_test_rados_api_snapshots fails with "rados_mon_command osd pool create failed wi...
Here's the error from the mon log.... Brad Hubbard
05:06 AM Bug #47719 (In Progress): api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
Brad Hubbard
12:50 AM Bug #49427: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soi...
Most likely, the problem is that the object being dirtied is present, but the prior clone is missing pending recovery. Samuel Just

02/23/2021

11:52 PM Bug #49427: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soi...
dec_refcount_by_dirty is related to tiering/dedup which got added fairly recently in https://github.com/ceph/ceph/pu... Neha Ojha
10:36 PM Bug #49427: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soi...
/a/bhubbard-2021-02-23_02:25:14-rados-master-distro-basic-smithi/5905669 Brad Hubbard
01:44 AM Bug #49427 (Resolved): FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_...
/a/bhubbard-2021-02-22_23:51:15-rados-master-distro-basic-smithi/5904732
rados/verify/{centos_latest ceph clusters...
Brad Hubbard
11:15 PM Bug #49403: Caught signal (aborted) on mgrmap epoch 1 during librados init (rados-striper)
/a/sage-2021-02-23_06:29:23-rados-wip-sage-testing-2021-02-22-2228-distro-basic-smithi/5906245
Sage Weil
09:06 PM Backport #49055 (In Progress): nautilus: pick_a_shard() always select shard 0
Nathan Cutler
03:40 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
... Deepika Upadhyay
12:23 PM Bug #49353: Random OSDs being marked as down even when there is very less activity on the cluster...
Nokia ceph-users wrote:
> Hi , Another occurrence
>
> _2021-02-22 09:19:43.010071 mon.cn1 (mon.0) 267937 : cluste...
Nokia ceph-users
12:23 PM Bug #49353: Random OSDs being marked as down even when there is very less activity on the cluster...
Hi , Another occurrence
_2021-02-22 09:19:43.010071 mon.cn1 (mon.0) 267937 : cluster [INF] osd.146 marked down aft...
Nokia ceph-users
04:34 AM Bug #48065: "ceph osd crush set|reweight-subtree" commands do not set weight on device class subtree
Sage Weil wrote:
> BTW Mykola I would suggest using 'ceph osd crush reweight osd.N' (which works fine already) inste...
Mykola Golub
02:02 AM Bug #49428 (Duplicate): ceph_test_rados_api_snapshots fails with "rados_mon_command osd pool crea...
/a/bhubbard-2021-02-22_23:51:15-rados-master-distro-basic-smithi/5904720... Brad Hubbard
12:42 AM Bug #49069 (Resolved): mds crashes on v15.2.8 -> master upgrade decoding MMgrConfigure
Sage Weil

02/22/2021

08:22 PM Bug #48065: "ceph osd crush set|reweight-subtree" commands do not set weight on device class subtree
BTW Mykola I would suggest using 'ceph osd crush reweight osd.N' (which works fine already) instead of the 'ceph osd ... Sage Weil
08:22 PM Bug #48065 (Fix Under Review): "ceph osd crush set|reweight-subtree" commands do not set weight o...
Sage Weil
07:37 PM Bug #46318 (In Progress): mon_recovery: quorum_status times out
Neha Ojha wrote:
> We are still seeing these.
>
> /a/teuthology-2021-01-18_07:01:01-rados-master-distro-basic-smi...
Sage Weil
09:50 AM Bug #49409 (New): osd run into dead loop and tell slow request when rollback snap with using cach...
xin mycho
08:59 AM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
If we are happy with https://github.com/ceph/ceph/pull/39601 in theory perhaps we need to extend it to cover the othe... Brad Hubbard
06:31 AM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
Hey Sage, I think you meant /a/sage-2021-02-20_16:46:42-rados-wip-sage2-testing-2021-02-20-0942-distro-basic-smithi/5... Brad Hubbard
02:17 AM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
First, the relevant test code from src/test/librados/watch_notify.cc.... Brad Hubbard

02/21/2021

04:50 PM Backport #49404 (Resolved): pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
https://github.com/ceph/ceph/pull/39597
Backport Bot
04:48 PM Bug #48984 (Pending Backport): lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
Sage Weil
04:46 PM Bug #49403 (Duplicate): Caught signal (aborted) on mgrmap epoch 1 during librados init (rados-str...
... Sage Weil
04:42 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
/a/sage-2021-02-20_16:46:42-rados-wip-sage2-testing-2021-02-20-0942-distro-basic-smithi/5899129
Sage Weil
11:48 AM Bug #48998: Scrubbing terminated -- not all pgs were active and clean
rados/singleton/{all/lost-unfound-delete mon_election/classic msgr-failures/none msgr/async-v1only objectstore/bluest... Kefu Chai
03:35 AM Backport #49402 (Resolved): octopus: rados: Health check failed: 1/3 mons down, quorum a,c (MON_D...
https://github.com/ceph/ceph/pull/40138 Backport Bot
03:35 AM Backport #49401 (Resolved): pacific: rados: Health check failed: 1/3 mons down, quorum a,c (MON_D...
https://github.com/ceph/ceph/pull/40137 Backport Bot
03:32 AM Bug #45441 (Pending Backport): rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" ...
Kefu Chai

02/20/2021

07:41 PM Bug #48386 (Resolved): Paxos::restart() and Paxos::shutdown() can race leading to use-after-free ...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
12:33 PM Bug #49395 (Resolved): ceph-test rpm missing gtest dependencies
Kefu Chai
03:47 AM Bug #49395 (Fix Under Review): ceph-test rpm missing gtest dependencies
Patrick Donnelly

02/19/2021

11:59 PM Bug #45647: "ceph --cluster ceph --log-early osd last-stat-seq osd.0" times out due to msgr-failu...
/a/teuthology-2021-02-17_03:31:03-rados-pacific-distro-basic-smithi/5889472 Neha Ojha
11:30 PM Backport #49398 (Resolved): pacific: rados/dashboard: Health check failed: Telemetry requires re-...
https://github.com/ceph/ceph/pull/39484 Backport Bot
11:30 PM Backport #49397 (Resolved): octopus: rados/dashboard: Health check failed: Telemetry requires re-...
https://github.com/ceph/ceph/pull/39704 Backport Bot
11:29 PM Bug #49212 (Duplicate): mon/crush_ops.sh fails: Error EBUSY: osd.1 has already bound to class 'ss...
Neha Ojha
11:25 PM Bug #48990 (Pending Backport): rados/dashboard: Health check failed: Telemetry requires re-opt-in...
Neha Ojha
11:24 PM Bug #48990: rados/dashboard: Health check failed: Telemetry requires re-opt-in (TELEMETRY_CHANGED...
pacific backport merged: https://github.com/ceph/ceph/pull/39484 Josh Durgin
11:24 PM Bug #40809: qa: "Failed to send signal 1: None" in rados
Deepika Upadhyay wrote:
> this happens due to dispatch delay.
> Testing with increased values for a test case can ...
Neha Ojha
07:45 AM Bug #40809: qa: "Failed to send signal 1: None" in rados
this happens due to dispatch delay.
Testing with increased values for a test case can lead to this failure:
/ceph/...
Deepika Upadhyay
11:05 PM Bug #44945: Mon High CPU usage when another mon syncing from it
Wout van Heeswijk wrote:
> I think this might be related to #42830. If so it may be resolved with Ceph Nautilus 14.2...
Neha Ojha
11:04 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
Will do. Brad Hubbard
10:54 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
Brad, can you please take look at this one? Neha Ojha
09:25 PM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
/a/teuthology-2021-02-17_03:31:03-rados-pacific-distro-basic-smithi/5889235 Neha Ojha
10:49 PM Bug #49359: osd: warning: unused variable
f9f9270d75d3bc6383604addefc2386318ecfc8b was done to fix another warning, definitely not high priority :) Neha Ojha
10:47 PM Bug #45441 (Fix Under Review): rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" ...
Sage Weil
10:44 PM Bug #39039 (Duplicate): mon connection reset, command not resent
let's track this at #45647 Sage Weil
10:33 PM Bug #47003 (Duplicate): ceph_test_rados test error. Reponses out of order due to the connection d...
Neha Ojha
10:29 PM Feature #39339: prioritize backfill of metadata pools, automatically
I think this tracker can be marked resolved since pull request 29181 merged. David Zafman
10:26 PM Bug #48468 (Need More Info): ceph-osd crash before being up again
Hi Clement,
Can you reproduce this with logs?...
Sage Weil
10:19 PM Bug #49393 (Need More Info): Segmentation fault in ceph::logging::Log::entry()
Sage Weil
09:11 PM Bug #49393 (Can't reproduce): Segmentation fault in ceph::logging::Log::entry()
... Neha Ojha
10:16 PM Bug #49395 (Resolved): ceph-test rpm missing gtest dependencies
... Sage Weil
09:28 PM Bug #48841 (Fix Under Review): test_turn_off_module: wait_until_equal timed out
Neha Ojha
08:09 PM Bug #49392 (Resolved): osd ok-to-stop too conservative
Currently 'osd ok-to-stop' is too conservative: if the pg is degraded, and is touched by an osd we might stop, it alw... Sage Weil
04:43 PM Backport #49320 (In Progress): octopus: thrash_cache_writeback_proxy_none: FAILED ceph_assert(ver...
https://github.com/ceph/ceph/pull/39578 Neha Ojha
02:51 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
investigating the 2 unfound objects, `when all_unfound_are_queried_or_lost all of
might_have_unfound` all participat...
Deepika Upadhyay
01:53 PM Bug #49104: crush weirdness: degraded PGs not marked as such, and choose_total_tries = 50 is too ...
Neha Ojha wrote:
> Regarding Problem A, will it be possible for you to share osd logs with debug_osd=20 to demonstra...
Dan van der Ster
01:26 PM Backport #49377 (Resolved): pacific: building libcrc32
https://github.com/ceph/ceph/pull/39902 Backport Bot
10:31 AM Bug #49231: MONs unresponsive over extended periods of time
I think I found the reason for this behaviour. I managed to pull extended logs during an incident and saw that the MO... Frank Schilder
01:40 AM Bug #48984 (Fix Under Review): lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
Neha Ojha

02/18/2021

05:20 PM Bug #49359: osd: warning: unused variable
https://stackoverflow.com/a/50176479 Patrick Donnelly
05:17 PM Bug #49359 (New): osd: warning: unused variable
... Patrick Donnelly
03:30 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
... Sebastian Wagner
03:21 PM Bug #49259 (Resolved): test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
Sebastian Wagner
03:21 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
turned out to be caused by https://github.com/ceph/ceph/pull/39530 Sebastian Wagner
10:16 AM Bug #49353 (Need More Info): Random OSDs being marked as down even when there is very less activi...
osd.149 went down at 03:25:26
2021-01-14 03:25:25.974634 mon.cn1 (mon.0) 384654 : cluster [INF] osd.149 marked down ...
Igor Fedotov
09:51 AM Bug #49353 (Need More Info): Random OSDs being marked as down even when there is very less activi...
Hi,
We recently see some random OSDs being marked as down status with the below message on one of our Nautilus cl...
Nokia ceph-users
08:06 AM Backport #48495 (Resolved): nautilus: Paxos::restart() and Paxos::shutdown() can race leading to ...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/39160
m...
Nathan Cutler

02/17/2021

08:22 PM Bug #48984 (In Progress): lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
David Zafman
08:18 PM Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs

Proposed fix in https://github.com/ceph/ceph/pull/39535
Needs extensive testing
David Zafman
05:41 PM Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs

If a requested scrub runs into a rejected remote reservation, the m_planned_scrub is already reset. This means tha...
David Zafman
07:51 PM Bug #48990: rados/dashboard: Health check failed: Telemetry requires re-opt-in (TELEMETRY_CHANGED...
https://github.com/ceph/ceph/pull/39484 merged Yuri Weinstein
04:29 PM Backport #49073: nautilus: crash in Objecter and CRUSH map lookup
Kefu Chai wrote:
> https://github.com/ceph/ceph/pull/39197
merged
Yuri Weinstein
04:29 PM Backport #48495: nautilus: Paxos::restart() and Paxos::shutdown() can race leading to use-after-f...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/39160
merged
Yuri Weinstein
10:20 AM Backport #49320 (Resolved): octopus: thrash_cache_writeback_proxy_none: FAILED ceph_assert(versio...
https://github.com/ceph/ceph/pull/39578 Backport Bot
10:15 AM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
http://qa-proxy.ceph.com/teuthology/yuriw-2021-02-16_16:01:09-rados-wip-yuri-testing-2021-02-08-1109-octopus-distro-b... Deepika Upadhyay

02/16/2021

10:51 PM Bug #49259 (Need More Info): test_rados_api tests timeout with cephadm (plus extremely large OSD ...
Brad Hubbard
09:30 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
From IRC:... Neha Ojha
06:00 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
Sebastian Wagner wrote:
> sage: this is related to thrashing and only happens within cephadm. non-cephadm is not aff...
Neha Ojha
08:47 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
Argh! So it does, my bad. Please ignore comment 22 for now. Brad Hubbard
08:44 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
Deepika Upadhyay wrote:
> /ceph/teuthology-archive/yuriw-2021-02-15_20:25:26-rados-wip-yuri3-testing-2021-02-15-1020...
Neha Ojha
07:51 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
Deepika Upadhyay wrote:
> /ceph/teuthology-archive/yuriw-2021-02-15_20:25:26-rados-wip-yuri3-testing-2021-02-15-1020...
Brad Hubbard
07:14 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
-/ceph/teuthology-archive/yuriw-2021-02-15_20:25:26-rados-wip-yuri3-testing-2021-02-15-1020-nautilus-distro-basic-gib... Deepika Upadhyay
03:28 PM Bug #49303: FTBFS due to cmake's inability to find std::filesystem on a CentOS8 on aarch64
Deepika, I don't understand why or how the "workaround" addresses the issue here. Probably you could file a PR based... Kefu Chai
10:58 AM Bug #49303: FTBFS due to cmake's inability to find std::filesystem on a CentOS8 on aarch64
hey Kefu! Should we use this workaround while the real bug is being fixed?... Deepika Upadhyay
08:39 AM Bug #49303: FTBFS due to cmake's inability to find std::filesystem on a CentOS8 on aarch64
created https://github.com/ceph/ceph/pull/39491 in hope to work around this. Kefu Chai
04:49 AM Bug #49303: FTBFS due to cmake's inability to find std::filesystem on a CentOS8 on aarch64
filed https://bugzilla.redhat.com/show_bug.cgi?id=1929043 Kefu Chai
04:48 AM Bug #49303 (In Progress): FTBFS due to cmake's inability to find std::filesystem on a CentOS8 on ...
... Kefu Chai
01:23 PM Bug #49190: LibRadosMiscConnectFailure_ConnectFailure_Test: FAILED ceph_assert(p != obs_call_gate...
Bug #40868 is not related Jos Collin

02/15/2021

10:35 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
... Brad Hubbard
03:40 PM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
sage: this is related to thrashing and only happens within cephadm. non-cephadm is not affected Sebastian Wagner
04:22 AM Bug #49259: test_rados_api tests timeout with cephadm (plus extremely large OSD logs)
Managed to reproduce this with some large but manageable osd logs.
On the first osd, just before the slow ops begin we...
Brad Hubbard
 
