Activity

From 04/12/2023 to 05/11/2023

05/11/2023

08:39 PM Bug #56849: crash: void PaxosService::propose_pending(): assert(have_pending)
Since this issue is marked as "Duplicate" it needs to specify what issue it duplicates in the "Related Issues" field. Yaarit Hatuka
08:12 PM Bug #56371: crash: MOSDPGLog::encode_payload(unsigned long)
Since this issue is marked as "Duplicate" it needs to specify what issue it duplicates in the "Related Issues" field. Yaarit Hatuka
08:07 PM Bug #56847: crash: void PaxosService::propose_pending(): assert(have_pending)
Since this issue is marked as "Duplicate" it needs to specify what issue it duplicates in the "Related Issues" field. Yaarit Hatuka
08:07 PM Bug #56848: crash: void PaxosService::propose_pending(): assert(have_pending)
Since this issue is marked as "Duplicate" it needs to specify what issue it duplicates in the "Related Issues" field. Yaarit Hatuka
11:51 AM Feature #59727: The libradosstriper interface provides an optional parameter to avoid shared lock...
pull request: https://github.com/ceph/ceph/pull/51443 Snow Si
08:36 AM Feature #59727 (New): The libradosstriper interface provides an optional parameter to avoid share...

The flow of the read operation of the current libradosstriper interface:
1. Lock (shared lock)
2. to read
3. Unl...
Snow Si
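
The read flow described in Feature #59727 can be sketched as follows. This is a toy model, not the libradosstriper API: the class `StripedObject`, the `skip_lock` parameter, and the `lock_acquisitions` counter are hypothetical names invented for illustration, and step 3 is assumed to be an unlock because the excerpt above is truncated at that point. A plain mutex stands in for the shared lock.

```python
import threading
from contextlib import nullcontext


class StripedObject:
    """Toy stand-in for a striped object (hypothetical, not the real API)."""

    def __init__(self, data: bytes):
        self._data = data
        # A plain mutex is used here for simplicity; the real interface
        # is described as taking a shared lock.
        self._lock = threading.Lock()
        self.lock_acquisitions = 0  # counts how often the lock path was taken

    def _locked(self):
        # Record the acquisition and hand back the lock as a context manager.
        self.lock_acquisitions += 1
        return self._lock

    def read(self, offset: int, length: int, skip_lock: bool = False) -> bytes:
        # Step 1: lock, unless the caller opted out via the optional parameter.
        guard = nullcontext() if skip_lock else self._locked()
        with guard:
            # Step 2: read the requested range.
            return self._data[offset:offset + length]
        # Step 3 (assumed): the lock is released when the with-block exits.


obj = StripedObject(b"hello world")
default_read = obj.read(0, 5)                   # lock, read, unlock
unlocked_read = obj.read(6, 5, skip_lock=True)  # read only
```

Whether skipping the lock is safe depends on whether anything can modify the object concurrently; the sketch does not address that.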
09:03 AM Bug #56707: pglog growing unbounded on EC with copy by ref
王子敬 wang wrote:
> 王子敬 wang wrote:
> > I have also experienced this situation here
> >
> > - Create 30 objects in...
Nitzan Mordechai
08:48 AM Bug #56707: pglog growing unbounded on EC with copy by ref
王子敬 wang wrote:
> I have also experienced this situation here
>
> - Create 30 objects in bucket1 using put
> - ...
王子敬 wang
03:32 AM Bug #56707: pglog growing unbounded on EC with copy by ref
I have also experienced this situation here
- Create 30 objects in bucket1 using put
- Using 30 objects as the s...
王子敬 wang
07:50 AM Bug #52624: qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
https://pulpito.ceph.com/yuriw-2023-05-09_19:39:46-fs-wip-yuri4-testing-2023-05-08-0846-pacific-distro-default-smithi... Kotresh Hiremath Ravishankar
12:08 AM Bug #59049: WaitReplicas::react(const DigestUpdate&): Unexpected DigestUpdate event
It's not obvious to me from the above why this started popping up in the last few weeks -- have you been able to iden... Samuel Just

05/10/2023

11:25 PM Bug #59192: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enable...
/a/yuriw-2023-04-26_01:16:19-rados-wip-yuri11-testing-2023-04-25-1605-pacific-distro-default-smithi/7253751
Sure R...
Laura Flores
08:54 PM Bug #59656: pg_upmap_primary timeout
Hi Kevin,
I am working to reproduce this issue on my end, but I also have some tricks you can try to generate OSD ...
Laura Flores
04:21 PM Bug #49888: rados/singleton: radosbench.py: teuthology.exceptions.MaxWhileTries: reached maximum ...
/a/yuriw-2023-04-26_01:16:19-rados-wip-yuri11-testing-2023-04-25-1605-pacific-distro-default-smithi/7253869
/a/yuriw...
Laura Flores
04:20 PM Backport #59715 (In Progress): pacific: mon: race condition between `mgr fail` and MgrMonitor::pr...
Patrick Donnelly
04:19 PM Backport #59715 (Resolved): pacific: mon: race condition between `mgr fail` and MgrMonitor::prepa...
https://github.com/ceph/ceph/pull/50980 Patrick Donnelly
03:23 PM Bug #59049 (Fix Under Review): WaitReplicas::react(const DigestUpdate&): Unexpected DigestUpdate ...
Neha Ojha
02:03 PM Bug #59291: pg_pool_t version compatibility issue
So Neha and I have discussed this and we were looking into a solution where anything 31 and above would have to encod... Kamoltat (Junior) Sirivadhna

05/09/2023

06:28 PM Bug #54511 (Resolved): test_pool_min_size: AssertionError: not clean before minsize thrashing starts
Kamoltat (Junior) Sirivadhna
06:28 PM Backport #57020 (Resolved): pacific: test_pool_min_size: AssertionError: not clean before minsize...
Merged a long time ago. Kamoltat (Junior) Sirivadhna
06:21 PM Bug #56151 (Resolved): mgr/DaemonServer:: adjust_pgs gap > max_pg_num_change should be gap >= max...
Kamoltat (Junior) Sirivadhna
06:18 PM Backport #59179 (Resolved): pacific: [pg-autoscaler][mgr] does not throw warn to increase PG coun...
Kamoltat (Junior) Sirivadhna
06:18 PM Backport #59702 (Resolved): reef: mon: FAILED ceph_assert(osdmon()->is_writeable())
Backport Bot
06:18 PM Backport #59701 (Resolved): quincy: mon: FAILED ceph_assert(osdmon()->is_writeable())
Backport Bot
06:18 PM Backport #59700 (New): pacific: mon: FAILED ceph_assert(osdmon()->is_writeable())
Backport Bot
06:10 PM Bug #57017: mon-stretched_cluster: degraded stretched mode lead to Monitor crash
quincy backport: https://github.com/ceph/ceph/pull/51413
pacific backport: https://github.com/ceph/ceph/pull/51414
Kamoltat (Junior) Sirivadhna
06:08 PM Bug #59271: mon: FAILED ceph_assert(osdmon()->is_writeable())
reef: https://github.com/ceph/ceph/pull/51409
quincy: https://github.com/ceph/ceph/pull/51413
pacific: https://gith...
Kamoltat (Junior) Sirivadhna
06:08 PM Bug #59271 (Pending Backport): mon: FAILED ceph_assert(osdmon()->is_writeable())
Kamoltat (Junior) Sirivadhna
07:13 AM Bug #55009: Scrubbing exits due to error reading object head
piotr@stackhpc.com, Mark Holliman:
As a temporary step, I'd suggest increasing the osd_max_scrubs configuration para...
Ronen Friedman
05:02 AM Bug #50371: Segmentation fault (core dumped) ceph_test_rados_api_watch_notify_pp
Looking at the kern.log.gz file to get some hints this time.... Brad Hubbard
03:12 AM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
Radoslaw Zarzynski wrote:
> Let's check whether this reproduces in Reef too. If so, then... there is no OMAP without...
Brad Hubbard
02:40 AM Bug #59510: osd crash
Thanks for your response. If I use SSDs for the data pool, do I need to add an NVMe device as the DB? can zhu

05/08/2023

09:18 PM Bug #49888: rados/singleton: radosbench.py: teuthology.exceptions.MaxWhileTries: reached maximum ...
/a/yuriw-2023-04-24_22:54:45-rados-wip-yuri7-testing-2023-04-19-1343-distro-default-smithi/7250551
Something inter...
Laura Flores
09:05 PM Bug #59656: pg_upmap_primary timeout
Thank you Kevin! I appreciate it. Laura Flores
08:27 PM Bug #59656: pg_upmap_primary timeout
this is the map on which I tried to apply the osdmaptool Kevin NGUETCHOUANG
08:19 PM Bug #59656: pg_upmap_primary timeout
Thanks Kevin. Your osdmap will also be helpful whenever you get a chance.
I will need some time to evaluate what's h...
Laura Flores
08:08 PM Bug #59656: pg_upmap_primary timeout
These are the only logs I have. I really don't think they contain any valuable information, but maybe I'm wrong. Kevin NGUETCHOUANG
08:02 PM Bug #59656: pg_upmap_primary timeout
It's difficult to say without the logs. Even if there are no errors explicitly presenting themselves, something off a... Laura Flores
07:59 PM Bug #59656: pg_upmap_primary timeout
OK, for now I have no errors. I will send it when I face the "not acting set" error again.
Can you please look ...
Kevin NGUETCHOUANG
07:55 PM Bug #59656: pg_upmap_primary timeout
Kevin NGUETCHOUANG wrote:
> How can I get the OSD logs ?
All ceph logs are available by default under "/var/log/c...
Laura Flores
07:43 PM Bug #59656: pg_upmap_primary timeout
How can I get the OSD logs ? Kevin NGUETCHOUANG
07:36 PM Bug #59656: pg_upmap_primary timeout
Thanks Kevin, "Error EINVAL: osd.* is not in acting set for pg <pgid>" helps, as it points me to the area of the code ... Laura Flores
07:28 PM Bug #59656: pg_upmap_primary timeout
Hello Laura, thanks for answering me.
1. I'm using the reef version. (v18)
2. this is where the problem begins, I d...
Kevin NGUETCHOUANG
06:57 PM Bug #59656: pg_upmap_primary timeout
Hello Kevin, thanks for reporting this issue.
A few questions:
1. What is the version of your cluster?
2. In wha...
Laura Flores
05:22 PM Bug #59656: pg_upmap_primary timeout
This is a new Reef feature added in https://github.com/ceph/ceph/pull/49178.
CCing Laura who was involved here.
Radoslaw Zarzynski
07:15 PM Bug #53575: Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64
The pacific backport has been approved and is just awaiting testing: https://github.com/ceph/ceph/pull/49521 Laura Flores
07:14 PM Bug #53575: Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64
Radoslaw Zarzynski wrote:
> Sounds like a missed backport. Please correct me if I'm wrong.
That's my understandin...
Laura Flores
05:28 PM Bug #53575: Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64
Sounds like a missed backport. Please correct me if I'm wrong. Radoslaw Zarzynski
06:37 PM Bug #50371: Segmentation fault (core dumped) ceph_test_rados_api_watch_notify_pp
Thanks Brad! Let me know how I can help.
I found another instance in Pacific:
/a/yuriw-2023-05-06_14:41:44-rados-...
Laura Flores
06:19 PM Bug #59504: 17.2.6: build fails with fmt 9.1.0
Radoslaw Zarzynski wrote:
> I recall there was a bunch of libfmt-related fixes in main. Perhaps we missed backporting...
Tomasz Kloczko
05:47 PM Bug #59504 (Need More Info): 17.2.6: build fails with fmt 9.1.0
I recall there was a bunch of libfmt-related fixes in main. Perhaps we missed backporting some of them. Could by any c... Radoslaw Zarzynski
05:54 PM Bug #53751: "N monitors have not enabled msgr2" is always shown for new clusters
This might be a doc issue, but I'm not sure. Bumping for deep bug scrub. Radoslaw Zarzynski
05:51 PM Bug #59510: osd crash
Increasing the timeout could obviously help in the short term but won't deal with the underlying problem. Igor's idea ... Radoslaw Zarzynski
05:48 PM Bug #59080: mclock-config.sh: TEST_profile_disallow_builtin_params_modify fails when $res == $opt...
Will be merged as a part of the big mClock PR. Radoslaw Zarzynski
05:39 PM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
Let's check whether this reproduces in Reef too. If so, then... there is no OMAP without RocksDB and we upgraded it r... Radoslaw Zarzynski
05:34 PM Bug #59057: rados/test_envlibrados_for_rocksdb.sh: No rule to make target 'rocksdb_env_librados_t...
High as it's a new thing in Reef. Radoslaw Zarzynski
05:33 PM Bug #59057: rados/test_envlibrados_for_rocksdb.sh: No rule to make target 'rocksdb_env_librados_t...
Laura: the occurrence from February was actually on a branch with the rocksdb bump. See: https://github.com/ceph/ce... Radoslaw Zarzynski
05:25 PM Bug #55009: Scrubbing exits due to error reading object head
Sounds like the entire scrubbing could get blocked. Radoslaw Zarzynski
07:50 AM Bug #55009: Scrubbing exits due to error reading object head
In case it's useful, more detailed log (with debug level 10) from the same environment mentioned by Mark:... Piotr Parczewski
05:19 PM Bug #59670: Ceph status shows PG recovering when norecover flag is set
Did the PG ultimately go into the proper state? Asking to rule out a race condition in _just the reporting_ via ceph-mgr. Radoslaw Zarzynski
02:07 PM Bug #59670 (New): Ceph status shows PG recovering when norecover flag is set
On the Gibba cluster, we observed that ceph -s was showing one PG in the recovering state after the norecover flag was set
...
Aishwarya Mathuria
05:17 PM Bug #59049: WaitReplicas::react(const DigestUpdate&): Unexpected DigestUpdate event
Raising to urgent so that we don't lose sight of it for Reef. Radoslaw Zarzynski
02:39 PM Bug #59049: WaitReplicas::react(const DigestUpdate&): Unexpected DigestUpdate event
This is an actual bug in the scrub code:
Working with Nitzan, here is what we've found out:
(based on logs from...
Ronen Friedman
05:13 PM Bug #58893: test_map_discontinuity: AssertionError: wait_for_clean: failed before timeout expired
/a/yuriw-2023-05-06_14:41:44-rados-pacific-release-distro-default-smithi/7264242... Laura Flores
05:10 PM Backport #59677 (New): quincy: osd:tick checking mon for new map
Backport Bot
05:10 PM Backport #59676 (New): reef: osd:tick checking mon for new map
Backport Bot
05:10 PM Backport #59675 (New): pacific: osd:tick checking mon for new map
Backport Bot
05:08 PM Bug #57977 (Pending Backport): osd:tick checking mon for new map
Radoslaw Zarzynski
05:07 PM Bug #59192: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enable...
Laura, would you mind taking a look? Definitely not an urgent thing. Radoslaw Zarzynski
03:56 PM Bug #59192: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enable...
/a/yuriw-2023-05-06_14:41:44-rados-pacific-release-distro-default-smithi/7264188 Laura Flores
03:44 PM Bug #48965: qa/standalone/osd/osd-force-create-pg.sh: TEST_reuse_id: return 1
/a/yuriw-2023-05-06_14:41:44-rados-pacific-release-distro-default-smithi/7264602 Laura Flores

05/05/2023

07:40 PM Bug #59599: osd: cls_refcount unit test failures during upgrade sequence
See also here:
http://qa-proxy.ceph.com/teuthology/teuthology-2023-05-05_14:23:01-upgrade:pacific-x-quincy-distro...
Yuri Weinstein
06:59 AM Bug #59656 (Need More Info): pg_upmap_primary timeout
Hello,
I created a ceph cluster using cephadm with ceph version reef 10 nodes, 3 mon nodes and 8 osd nodes. On top o...
Kevin NGUETCHOUANG

05/04/2023

05:57 PM Bug #59192: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enable...
/a/yuriw-2023-04-25_18:56:08-rados-wip-yuri5-testing-2023-04-25-0837-pacific-distro-default-smithi/7252745 Laura Flores
03:05 PM Bug #53575: Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64
Thanks Nitzan! Laura Flores
05:56 AM Bug #53575 (In Progress): Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64
Nitzan Mordechai
05:55 AM Bug #53575: Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64
Laura, the original PR was for quincy, and the related tracker https://tracker.ceph.com/issues/57618 already has bac... Nitzan Mordechai
07:54 AM Backport #59627 (Resolved): quincy: rados/test.sh hang and pkilled (LibRadosWatchNotifyEC.WatchNo...
Already in quincy Nitzan Mordechai
05:43 AM Backport #59628 (In Progress): pacific: rados/test.sh hang and pkilled (LibRadosWatchNotifyEC.Wat...
Nitzan Mordechai
01:51 AM Bug #50371: Segmentation fault (core dumped) ceph_test_rados_api_watch_notify_pp
Laura Flores wrote:
> /a/yuriw-2023-04-25_21:30:50-rados-wip-yuri3-testing-2023-04-25-1147-distro-default-smithi/725...
Brad Hubbard

05/03/2023

09:37 PM Backport #59637 (New): reef: scrub/osd-scrub-dump.sh: TEST_recover_unexpected fails from "ERROR: ...
Backport Bot
09:35 PM Bug #58797 (Pending Backport): scrub/osd-scrub-dump.sh: TEST_recover_unexpected fails from "ERROR...
/a/yuriw-2023-04-27_14:24:15-rados-wip-yuri6-testing-2023-04-26-1247-reef-distro-default-smithi/7255773 Laura Flores
09:26 PM Bug #59049: WaitReplicas::react(const DigestUpdate&): Unexpected DigestUpdate event
/a/yuriw-2023-04-27_14:24:15-rados-wip-yuri6-testing-2023-04-26-1247-reef-distro-default-smithi/7255789
/a/yuriw-202...
Laura Flores
06:35 PM Bug #59049: WaitReplicas::react(const DigestUpdate&): Unexpected DigestUpdate event
/a/yuriw-2023-04-25_21:30:50-rados-wip-yuri3-testing-2023-04-25-1147-distro-default-smithi/7253199
/a/yuriw-2023-04-...
Laura Flores
07:00 PM Bug #59057: rados/test_envlibrados_for_rocksdb.sh: No rule to make target 'rocksdb_env_librados_t...
/a/yuriw-2023-04-25_21:30:50-rados-wip-yuri3-testing-2023-04-25-1147-distro-default-smithi/7253386 Laura Flores
06:57 PM Bug #50371 (New): Segmentation fault (core dumped) ceph_test_rados_api_watch_notify_pp
/a/yuriw-2023-04-25_21:30:50-rados-wip-yuri3-testing-2023-04-25-1147-distro-default-smithi/7253544... Laura Flores
06:54 PM Backport #59628 (Resolved): pacific: rados/test.sh hang and pkilled (LibRadosWatchNotifyEC.WatchN...
https://github.com/ceph/ceph/pull/51341 Backport Bot
06:54 PM Backport #59627 (Resolved): quincy: rados/test.sh hang and pkilled (LibRadosWatchNotifyEC.WatchNo...
Backport Bot
06:51 PM Bug #57618 (Pending Backport): rados/test.sh hang and pkilled (LibRadosWatchNotifyEC.WatchNotify)
Laura Flores
06:39 PM Bug #59192: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enable...
/a/yuriw-2023-04-25_21:30:50-rados-wip-yuri3-testing-2023-04-25-1147-distro-default-smithi/7253406 Laura Flores
02:43 PM Bug #55009: Scrubbing exits due to error reading object head
Ronen, assigning to you in case you have any ideas. Laura Flores
02:38 PM Bug #55009: Scrubbing exits due to error reading object head
I'm seeing what at least looks similar to this bug on a cluster running: ceph version 16.2.10
About a week ago we ...
Mark Holliman

05/02/2023

11:02 PM Bug #59057: rados/test_envlibrados_for_rocksdb.sh: No rule to make target 'rocksdb_env_librados_t...
/a/lflores-2023-04-28_19:31:46-rados-wip-yuri10-testing-2023-04-18-0735-reef-distro-default-smithi/7257792 Laura Flores
11:48 AM Bug #59057: rados/test_envlibrados_for_rocksdb.sh: No rule to make target 'rocksdb_env_librados_t...
/a/sseshasa-2023-05-02_03:12:27-rados-wip-sseshasa3-testing-2023-05-01-2154-distro-default-smithi/7260279 Sridhar Seshasayee
10:57 PM Bug #59049: WaitReplicas::react(const DigestUpdate&): Unexpected DigestUpdate event
/a/lflores-2023-04-28_19:31:46-rados-wip-yuri10-testing-2023-04-18-0735-reef-distro-default-smithi/7257789
/a/yuriw-...
Laura Flores
05:02 AM Bug #59049: WaitReplicas::react(const DigestUpdate&): Unexpected DigestUpdate event
Neha Ojha wrote:
> Nitzan Mordechai wrote:
> > According to PR https://github.com/ceph/ceph/pull/44050 we can ignor...
Nitzan Mordechai
07:33 PM Bug #59564 (Fix Under Review): Connection scores not populated properly on monitors post installa...
Kamoltat (Junior) Sirivadhna
05:53 PM Bug #59192: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enable...
still failing consistently in the rgw suite
on main: https://pulpito.ceph.com/cbodley-2023-04-26_00:39:50-rgw-wip-cb...
Casey Bodley
02:53 PM Bug #56896: crash: int OSD::shutdown(): assert(end_time - start_time_func < cct->_conf->osd_fast_...
Looking at the OSD code I don't see much sense behind this assertion and the relevant timeout parameter.
Shouldn't w...
Igor Fedotov
12:01 PM Bug #59196: ceph_test_lazy_omap_stats segfault while waiting for active+clean
/a/sseshasa-2023-05-02_03:12:27-rados-wip-sseshasa3-testing-2023-05-01-2154-distro-default-smithi/7260300... Sridhar Seshasayee
11:27 AM Bug #59599: osd: cls_refcount unit test failures during upgrade sequence
/a/sseshasa-2023-05-01_18:57:15-rados-wip-sseshasa2-testing-2023-05-01-2153-quincy-distro-default-smithi/7259884 Sridhar Seshasayee
11:26 AM Bug #59599 (Resolved): osd: cls_refcount unit test failures during upgrade sequence
/a/sseshasa-2023-05-01_18:57:15-rados-wip-sseshasa2-testing-2023-05-01-2153-quincy-distro-default-smithi/7259891
H...
Sridhar Seshasayee
08:16 AM Bug #59333: PgScrubber: timeout on reserving replicas
/a/sseshasa-2023-05-02_03:09:13-rados-wip-sseshasa-testing-2023-05-01-2145-distro-default-smithi/7260258 Sridhar Seshasayee

05/01/2023

10:38 PM Bug #58289: "AssertionError: wait_for_recovery: failed before timeout expired" from down pg in pa...
/a/yuriw-2023-04-25_14:52:56-upgrade:pacific-p2p-pacific-release-distro-default-smithi/7252143 Laura Flores
08:13 PM Bug #53575 (New): Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64
Laura Flores
08:11 PM Bug #53575: Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64
/a/yuriw-2023-04-25_14:15:40-rados-pacific-release-distro-default-smithi/7251534 Laura Flores
07:43 PM Bug #59192: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enable...
/a/yuriw-2023-04-24_23:35:26-smoke-pacific-release-distro-default-smithi/7250661 Laura Flores
07:03 PM Bug #59049: WaitReplicas::react(const DigestUpdate&): Unexpected DigestUpdate event
Nitzan Mordechai wrote:
> According to PR https://github.com/ceph/ceph/pull/44050 we can ignore that warning, I'll a...
Neha Ojha

04/29/2023

10:34 PM Support #59587 (New): ipv4 public_network + ipv6 cluster_network = osd: unable to find any IPv6 o...
Hi,
I have an ipv4 only public network.
I created an ipv6 only cluster network (full mesh - 3 nodes - OSPF)
In...
Pivert Dubuisson

04/28/2023

09:58 PM Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
Radoslaw Zarzynski wrote:
> Hi Laura. Any luck with verifying the hypothesis from comment #17?
I ran thi...
Laura Flores
09:53 PM Bug #49525: found snap mapper error on pg 3.2s1 oid 3:4abe9991:::smithi10121515-14:e4 snaps missi...
/a/yuriw-2023-04-25_14:15:40-rados-pacific-release-distro-default-smithi/7251426 Laura Flores
09:20 PM Bug #51729: Upmap verification fails for multi-level crush rule
I've landed on a potential fix for this problem. After evaluating the examples everyone provided and checking the ver... Laura Flores
08:15 PM Bug #59192: cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an application enable...
/a/yuriw-2023-04-25_14:15:40-rados-pacific-release-distro-default-smithi/7251186 Laura Flores
02:18 PM Backport #55541 (In Progress): pacific: should use TCMalloc for better performance
Ponnuvel P
12:08 PM Bug #59583 (New): osd: Higher client latency observed with mclock 'high_client_ops' profile durin...
Recovery/backfill testing was performed with OSDs on SSDs and with Erasure Coded backend. Tests with 'high_client_ops... Sridhar Seshasayee
10:08 AM Feature #42321: Add a new mode to balance pg layout by primary osds
!ceph_osd_df.png!
Hi, rosinL. I have used the "balance pg layout by primary osds" function that you submitted. In a ...
linhuai deng

04/27/2023

02:12 PM Bug #59504: 17.2.6: build fails with fmt 9.1.0
Redirecting to general RADOS. Adam Kupczyk
01:16 PM Backport #52841 (In Progress): pacific: shard-threads cannot wakeup bug
Konstantin Shalygin
01:15 PM Backport #53166 (In Progress): pacific: api_watch_notify: LibRadosWatchNotify.Watch3Timeout failed
Konstantin Shalygin
01:13 PM Backport #53167 (Rejected): octopus: api_watch_notify: LibRadosWatchNotify.Watch3Timeout failed
Octopus is EOL Konstantin Shalygin
01:13 PM Bug #52739 (Resolved): msg/async/ProtocalV2: recv_stamp of a message is set to a wrong value
Konstantin Shalygin
01:13 PM Backport #52842 (Rejected): octopus: msg/async/ProtocalV2: recv_stamp of a message is set to a wr...
Octopus is EOL Konstantin Shalygin
01:13 PM Backport #52840 (Rejected): octopus: shard-threads cannot wakeup bug
Octopus is EOL Konstantin Shalygin
12:30 PM Backport #52307 (In Progress): pacific: doc: clarify use of `rados rm` command
Konstantin Shalygin
12:30 PM Backport #52306 (Rejected): octopus: doc: clarify use of `rados rm` command
Octopus is EOL Konstantin Shalygin
12:29 PM Backport #52557 (In Progress): pacific: pybind: rados.RadosStateError raised when closed watch ob...
Konstantin Shalygin
12:28 PM Backport #52556 (Rejected): octopus: pybind: rados.RadosStateError raised when closed watch objec...
Octopus is EOL Konstantin Shalygin
12:27 PM Backport #52596 (Rejected): octopus: make bufferlist::c_str() skip rebuild when it isn't necessary
Octopus is EOL Konstantin Shalygin
12:26 PM Backport #51525 (Rejected): octopus: osd: Delay sending info to new backfill peer resetting last_...
Octopus is EOL Konstantin Shalygin
12:26 PM Bug #50441 (Rejected): cephadm bootstrap on arm64 fails to start ceph/ceph-grafana service
Konstantin Shalygin
12:26 PM Backport #51551 (Rejected): octopus: cephadm bootstrap on arm64 fails to start ceph/ceph-grafana ...
Octopus is EOL Konstantin Shalygin
12:26 PM Bug #50393 (Resolved): CommandCrashedError: Command crashed: 'mkdir -p -- /home/ubuntu/cephtest/m...
Konstantin Shalygin
12:25 PM Backport #51741 (Rejected): octopus: CommandCrashedError: Command crashed: 'mkdir -p -- /home/ubu...
Octopus is EOL Konstantin Shalygin
12:23 PM Backport #56604 (In Progress): pacific: ceph report missing osdmap_clean_epochs if answered by peon
Konstantin Shalygin
12:23 PM Backport #56603 (Rejected): octopus: ceph report missing osdmap_clean_epochs if answered by peon
Octopus is EOL Konstantin Shalygin
12:22 PM Bug #48899 (Resolved): api_list: LibRadosList.EnumerateObjects and LibRadosList.EnumerateObjectsS...
Konstantin Shalygin
12:22 PM Backport #55581 (Rejected): octopus: api_list: LibRadosList.EnumerateObjects and LibRadosList.Enu...
Octopus is EOL Konstantin Shalygin
12:22 PM Backport #55066 (Rejected): pacific: osd_fast_shutdown_notify_mon option should be true by default
Duplicate? Konstantin Shalygin
12:21 PM Backport #55067 (Rejected): octopus: osd_fast_shutdown_notify_mon option should be true by default
Octopus is EOL Konstantin Shalygin
11:11 AM Bug #59080 (Fix Under Review): mclock-config.sh: TEST_profile_disallow_builtin_params_modify fail...
The test script issue is related to timing of a check once a change to mon DB is made. Any changes to the mon DB conf... Sridhar Seshasayee
10:50 AM Backport #52892 (In Progress): pacific: ceph-kvstore-tool repair segmentfault without bluestore-kv
Konstantin Shalygin
10:49 AM Backport #52893 (Rejected): octopus: ceph-kvstore-tool repair segmentfault without bluestore-kv
Octopus is EOL Konstantin Shalygin
09:12 AM Bug #48843 (Resolved): Get more parallel scrubs within osd_max_scrubs limits
Konstantin Shalygin
09:11 AM Backport #49776 (Rejected): octopus: Get more parallel scrubs within osd_max_scrubs limits
Octopus is EOL Konstantin Shalygin
09:11 AM Backport #52839 (In Progress): pacific: rados: build minimally when "WITH_MGR" is off
Konstantin Shalygin
09:10 AM Backport #52791 (In Progress): pacific: common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_...
Konstantin Shalygin
09:10 AM Backport #52838 (Rejected): octopus: rados: build minimally when "WITH_MGR" is off
Octopus is EOL Konstantin Shalygin
09:09 AM Backport #52792 (Rejected): octopus: common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_fli...
Octopus is EOL Konstantin Shalygin
09:09 AM Bug #48959 (Resolved): Primary OSD crash caused corrupted object and further crashes during backf...
Konstantin Shalygin
09:09 AM Backport #52937 (Rejected): octopus: Primary OSD crash caused corrupted object and further crashe...
Octopus is EOL Konstantin Shalygin
09:07 AM Bug #45868 (Resolved): rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
Konstantin Shalygin
09:07 AM Backport #55768 (Resolved): pacific: rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
Konstantin Shalygin
09:06 AM Backport #55767 (Rejected): octopus: rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
Octopus is EOL Konstantin Shalygin
09:06 AM Bug #53506 (Closed): mon: frequent cpu_tp had timed out messages
Konstantin Shalygin
09:04 AM Backport #53719 (Resolved): octopus: mon: frequent cpu_tp had timed out messages
Konstantin Shalygin
02:37 AM Bug #59510: osd crash
The index pool is made of SSDs and the data pool is made of HDDs. The crash message comes from an HDD. Is there a way to avoid t... can zhu

04/26/2023

07:59 PM Bug #53751: "N monitors have not enabled msgr2" is always shown for new clusters
Hi Radoslaw, before that, a quick thing for your consideration I just found:
Running monmaptool is step 13 in http...
Niklas Hambuechen
06:05 PM Bug #59564 (Pending Backport): Connection scores not populated properly on monitors post installa...
... Kamoltat (Junior) Sirivadhna
04:16 PM Bug #47838: mon/test_mon_osdmap_prune.sh: first_pinned != trim_to
RCA by Aishwarya: https://gist.github.com/amathuria/26f5e9ecfc3f04a70c9795039fdf0c35?permalink_comment_id=4549186#gis... Radoslaw Zarzynski
12:14 PM Bug #59510: osd crash
You might also want to compact this OSD's DB using ceph-kvstore-tool. Chances are that the timeout is caused by ... Igor Fedotov
07:04 AM Bug #59510: osd crash
like this?
*[6880136.695917] tp_osd_tp[6383]: segfault at 0 ip 00007ff38f003573 sp 00007ff36ba8a240 error 4 in libt...
can zhu
11:50 AM Backport #59456 (In Progress): quincy: Monitors do not permit OSD to join after upgrading to Quincy
Konstantin Shalygin
11:49 AM Backport #59455 (In Progress): pacific: Monitors do not permit OSD to join after upgrading to Quincy
Konstantin Shalygin
07:00 AM Bug #57977: osd:tick checking mon for new map
Radoslaw Zarzynski wrote:
> Yup, the patch does exactly that – it ensures that a random nonce is always used.
I h...
yite gu
01:36 AM Bug #59532: quincy: cephadm.upgrade from 16.2.4 (related?) stuck with one OSD upgraded
Radoslaw Zarzynski wrote:
> Hi Patrick!
> How reproducible is this? Is it constant or perhaps it happened just once...
Patrick Donnelly

04/25/2023

06:13 PM Bug #56393: failed to complete snap trimming before timeout
Bump up. Radoslaw Zarzynski
06:12 PM Bug #59049 (In Progress): WaitReplicas::react(const DigestUpdate&): Unexpected DigestUpdate event
Radoslaw Zarzynski
06:11 PM Bug #59510 (Need More Info): osd crash
It looks like the scan-for-backfill operation was taking a long time and triggered the thread heartbeat. This could be even ... Radoslaw Zarzynski
06:08 PM Bug #59531: quincy: "OSD bench result of 228617.361065 IOPS exceeded the threshold limit of 500.0...
Hi Aishwarya! What do you think about Patrick's question: "Should we (fs suite) be setting a config to mute this WRN... Radoslaw Zarzynski
12:25 AM Bug #59531 (Pending Backport): quincy: "OSD bench result of 228617.361065 IOPS exceeded the thres...
/ceph/teuthology-archive/pdonnell-2023-04-24_17:17:44-fs-wip-pdonnell-testing-20230420.183701-quincy-distro-default-s... Patrick Donnelly
06:05 PM Bug #53751: "N monitors have not enabled msgr2" is always shown for new clusters
Hello Niklas!
Thanks for getting back to it! Could you please collect the monitor's logs with @debug_ms=20@ and @debug...
Radoslaw Zarzynski
01:44 AM Bug #53751: "N monitors have not enabled msgr2" is always shown for new clusters
The fundamental issue here seems to be that in my newly deployed test cluster, nothing listens on port 3300 even thou... Niklas Hambuechen
05:56 PM Bug #59333: PgScrubber: timeout on reserving replicas
bump up Radoslaw Zarzynski
03:46 PM Bug #59333: PgScrubber: timeout on reserving replicas
See the same on pacific 16.2.13 RC
http://qa-proxy.ceph.com/teuthology/yuriw-2023-04-25_14:15:06-smoke-pacific-rel...
Yuri Weinstein
05:46 PM Bug #57977: osd:tick checking mon for new map
Yup, the patch does exactly that – it ensures that a random nonce is always used. Radoslaw Zarzynski
05:42 PM Bug #59532 (Need More Info): quincy: cephadm.upgrade from 16.2.4 (related?) stuck with one OSD up...
Hi Patrick!
How reproducible is this? Is it constant or perhaps it happened just once? I'm asking because of the rec...
Radoslaw Zarzynski
12:34 AM Bug #59532 (Closed): quincy: cephadm.upgrade from 16.2.4 (related?) stuck with one OSD upgraded
... Patrick Donnelly
10:17 AM Backport #59538 (Rejected): pacific: osd/scrub: verify SnapMapper consistency not backported
Backport Bot
10:17 AM Backport #59537 (Resolved): quincy: osd/scrub: verify SnapMapper consistency not backported
https://github.com/ceph/ceph/pull/52182 Backport Bot
10:12 AM Bug #59478: osd/scrub: verify SnapMapper consistency not backported
@Wout, the bot should create backport tickets soon Konstantin Shalygin
10:11 AM Bug #59478 (Pending Backport): osd/scrub: verify SnapMapper consistency not backported
Konstantin Shalygin
10:04 AM Bug #56147: snapshots will not be deleted after upgrade from nautilus to pacific
Matan Breizman wrote:
> > For already-converted clusters: Separate PR will be issued to remove/update the malformed ...
Wout van Heeswijk

04/24/2023

10:46 PM Backport #59179: pacific: [pg-autoscaler][mgr] does not throw warn to increase PG count on pools ...
Kamoltat (Junior) Sirivadhna wrote:
> https://github.com/ceph/ceph/pull/50694
merged
Yuri Weinstein

04/23/2023

02:52 AM Bug #59510 (Need More Info): osd crash
... can zhu

04/21/2023

04:11 PM Bug #51729: Upmap verification fails for multi-level crush rule
Hi Chris, this issue was actually discussed at Cephalocon. Looking at the verify_upmap code, it seems that we may nee... Laura Flores
04:06 PM Bug #51729: Upmap verification fails for multi-level crush rule
Laura Flores wrote:
> Hi Chris, yes, I will post another update soon with my findings.
Pinging for updates....
Chris Durham
06:24 AM Bug #59478: osd/scrub: verify SnapMapper consistency not backported
I think the backport should go to at least pacific and quincy Wout van Heeswijk

04/20/2023

09:24 PM Bug #59504: 17.2.6: build fails with fmt 9.1.0
I found that this issue can be fixed by adding -DFMT_DEPRECATED_OSTREAM to CXXFLAGS.
So after running `CXXFLAGS=-DFMT_DEP...
Tomasz Kloczko
07:35 PM Bug #59504 (Need More Info): 17.2.6: build fails with fmt 9.1.0
fmt 9.1.0
cmake settings...
Tomasz Kloczko
03:44 AM Bug #57977: osd:tick checking mon for new map
There are two conditions that can cause this problem:
1. The OSDmap version held by the MON is the same as the OSD's...
yite gu

04/19/2023

12:02 PM Bug #57977: osd:tick checking mon for new map
Radoslaw Zarzynski wrote:
> My understanding:
>
> 0. The OSD (as a process) went down BUT it was up **in the OSDMa...
yite gu

04/18/2023

10:34 AM Bug #59478 (Closed): osd/scrub: verify SnapMapper consistency not backported
We have a case where a cluster is suffering from malformed snapmapper keys due to bug https://tracker.ceph.com/issues... Wout van Heeswijk

04/17/2023

08:58 AM Bug #59049: WaitReplicas::react(const DigestUpdate&): Unexpected DigestUpdate event
According to PR https://github.com/ceph/ceph/pull/44050 we can ignore that warning. I'll add it to the ignore log list. Nitzan Mordechai

04/16/2023

09:05 AM Backport #59456 (Resolved): quincy: Monitors do not permit OSD to join after upgrading to Quincy
https://github.com/ceph/ceph/pull/51102 Backport Bot
09:05 AM Backport #59455 (Resolved): pacific: Monitors do not permit OSD to join after upgrading to Quincy
https://github.com/ceph/ceph/pull/51382 Backport Bot
09:02 AM Bug #58156 (Pending Backport): Monitors do not permit OSD to join after upgrading to Quincy
Konstantin Shalygin

04/15/2023

06:14 AM Bug #57977: osd:tick checking mon for new map
Radoslaw Zarzynski wrote:
> My understanding:
>
> 0. The OSD (as a process) went down BUT it was up **in the OSDMa...
yite gu

04/13/2023

06:25 PM Bug #57977: osd:tick checking mon for new map
My understanding:
0. The OSD (as a process) went down BUT it was up **in the OSDMap** -- these are 2 different thin...
Radoslaw Zarzynski
06:15 PM Bug #59333: PgScrubber: timeout on reserving replicas
Assigning for screening to determine whether this is a real problem or not (a testing issue?).
If it is, we could reassign even...
Radoslaw Zarzynski
06:10 PM Bug #59049: WaitReplicas::react(const DigestUpdate&): Unexpected DigestUpdate event
Not a high priority; good opportunity to learn. Radoslaw Zarzynski
06:06 PM Bug #56393: failed to complete snap trimming before timeout
Bump up. Radoslaw Zarzynski
06:02 PM Bug #49810 (Need More Info): rados/singleton: with msgr-failures/none MON_DOWN due to haven't for...
There was re-occurrence recorded over 2 years, so would need to wait for one to get logs. Radoslaw Zarzynski
05:53 PM Bug #59192 (In Progress): cls/test_cls_sdk.sh: Health check failed: 1 pool(s) do not have an appl...
Radoslaw Zarzynski
 
